Neural Network (3) : Hopfield Net
Binary Hopfield net using Hebbian learning
Let us study the Hopfield net starting from a simple case. A Hopfield net is a fully connected feedback network. A feedback network is a network that is not a feedforward network: in a feedforward network all connections are directed, while all the connections in our example will be bidirectional. This symmetry of the weights is an important property of the Hopfield net.
Hopfield nets can act as associative memories, and they can also be used to solve optimization problems. In this post, we discuss how they behave as associative memories.
from __future__ import division
import numpy as np
import random
import matplotlib.pyplot as plt
%matplotlib inline
plt.style.use('ggplot')
Let us load the data1 and transform it into an array.
ch = open('DJCM.txt','r')
for i in ch:
    print i
1111001001010010100101110
1111100010000101001011100
0111110000100001000001111
1000111011101011000110001
0101010101010101010101010
0111010000011100000101110
Each line consists of 25 binary values (1, 0), and each of them gives a pattern in a $5 \times 5$ box; thus we have 6 patterns. Can you tell how many neurons there are? We will visualize the network later: the neurons appear as nodes of a graph, and the weights of the connections between neurons appear as edges between the nodes.
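If you want to run this yourself and do not yet have the file, you can recreate DJCM.txt from the six printed lines above, as the footnote suggests; this is just a convenience snippet.
# Recreate DJCM.txt from the six 25-bit lines printed above.
patterns = ['1111001001010010100101110',
            '1111100010000101001011100',
            '0111110000100001000001111',
            '1000111011101011000110001',
            '0101010101010101010101010',
            '0111010000011100000101110']
with open('DJCM.txt', 'w') as f:
    f.write('\n'.join(patterns))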
X = np.zeros((6,25))
ch = open('DJCM.txt','r')
lines = []                      # avoid shadowing the builtin name 'list'
for i in ch:
    lines.append(i)
for i in range(len(lines)):
    for j in range(25):
        X[i][j] = int(lines[i][j])
Let us change the binary states from (1, 0) to (1, -1).
X = 2*(X-0.5)
The patterns are encoded in $X$. Let us display them in $5 \times 5$ boxes.
def display(X):
    plt.imshow(X.reshape((5,5)))
fig = plt.figure(figsize=(15,4))
for i in range(6):
    ax = fig.add_subplot(1, 6, i+1)
    display(X[i])
We store the first 4 patterns as memories in a weight matrix. By the Hebb rule,
$$ w_{ij} = \eta \sum_n x^{(n)}_i x^{(n)}_j , $$
I set $\eta = 1$. To update the neurons synchronously, we first compute the activations
$$ a_i = \sum_j w_{ij} x_j $$
then update their states simultaneously to
$$ x_i = \Theta( a_i ) . $$
$\Theta$ is a threshold activation function, but I will replace it with $\tanh(a_i)$ because it behaves almost the same as the threshold function in this example: the hyperbolic tangent gives values very close to $\pm 1$ and produces the same or very similar displays. Above all, it is more convenient to code for our tests.
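For concreteness, here is a minimal sketch of the two update rules side by side; the helper names theta_update and tanh_update are my own, and the rest of the post simply inlines the tanh version.
# Hard threshold update: states jump straight to +1 or -1.
def theta_update(w, x):
    return np.where(np.dot(w, x) >= 0, 1.0, -1.0)

# Smooth update used below: outputs are very close to +1 or -1.
def tanh_update(w, x):
    return np.tanh(np.dot(w, x))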
w = np.zeros((25,25))
for i in range(25):
    for j in range(25):
        for k in range(4):
            w[i,j] += X[k][i]*X[k][j]
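As a side note, the same Hebb sum over the first 4 patterns can be written as a single matrix product; this is only a sketch for checking the triple loop above.
# Vectorized form of the Hebb rule; should equal the w built by the loops.
w_vec = np.dot(X[:4].T, X[:4])
np.allclose(w, w_vec)   # expected: True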
Recovery of the memories
I intentionally corrupt the pattern a bit. It is recovered in just one iteration.
fig = plt.figure(figsize=(8,4))
fig.add_subplot(1, 2, 1)
testX = X[0].copy()     # copy so the stored pattern itself is not corrupted
testX[23] = -testX[23]
display(testX)
fig.add_subplot(1, 2, 2)
a = np.dot(w,testX)
x = np.tanh(a)
display(x)
Reaction to an unknown memory.
The net recovers it only to a deformed memory.
fig = plt.figure(figsize=(8,4))
fig.add_subplot(1, 3, 1)
display(X[5])
fig.add_subplot(1, 3, 2)
a = np.dot(w,X[5])
x = np.tanh(a)
display(x)
fig.add_subplot(1, 3, 3)
a = np.dot(w,np.dot(w,np.dot(w,np.dot(w,X[5]))))
x = np.tanh(a)
display(x)
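As a side remark, the nested np.dot above multiplies by $w$ four times and applies $\tanh$ only once at the end. A sketch that follows the stated update rule more literally applies the nonlinearity at every step; in this example the outputs are already so close to $\pm 1$ that the displays should come out much the same.
# Iterate the synchronous update, applying tanh at every step.
x = X[5].copy()
for _ in range(4):
    x = np.tanh(np.dot(w, x))
display(x)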
Oops! My brain is damaged, but I can still remember
We zero out 150 of the 625 entries of the weight matrix (6 out of every 25, a 24 percent loss). Let us see if it still remembers the 4 memorized patterns.
dw = w.copy()     # damage a copy so the original weight matrix is preserved
for i in range(15):
    for j in range(10):
        dw[i+3,j+5] = 0
fig = plt.figure(figsize=(1,4))
for i in range(4):
    ax = fig.add_subplot(1, 4, i+1)
    a = np.dot(dw,X[i])
    x = np.tanh(a)
    display(x)
Amazing! Isn’t it?
Greedy mommy made me remember too many similar things
What if I try to make it memorize all 6 patterns?
wo = np.zeros((25,25))
for i in range(25):
    for j in range(25):
        for k in range(6):
            wo[i,j] += X[k][i]*X[k][j]
fig = plt.figure(figsize=(8,4))
fig.add_subplot(1, 4, 1)
display(X[0])
fig.add_subplot(1, 4, 2)
a = np.dot(wo,X[0])
x = np.tanh(a)
display(x)
fig.add_subplot(1, 4, 3)
a = np.dot(wo,np.dot(wo,X[0]))
x = np.tanh(a)
display(x)
fig.add_subplot(1, 4, 4)
a = np.dot(wo,np.dot(wo,np.dot(wo,np.dot(wo,X[0]))))
x = np.tanh(a)
display(x)
After the first iteration it seems to remember well, but it soon gets confused and converges to the wrong answer. This problem can be fixed by a better learning algorithm that replaces Hebbian learning. For example, you can upgrade the Hebb rule using the gradient descent we studied for the perceptron. We will have a chance to discuss learning further when we study the Hopfield net for optimization or the Boltzmann machine.
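As a rough illustration only, and not the algorithm we will use later, a gradient-style upgrade could treat each neuron as a little perceptron that must reproduce its own state from the current activations, nudging the weights to reduce the error while keeping them symmetric. The learning rate and the number of epochs below are arbitrary choices of mine.
# Sketch of a gradient-style correction of the Hebb rule for all 6 patterns.
eta = 0.02
w_gd = np.zeros((25, 25))
for epoch in range(200):
    for n in range(6):
        x = X[n]
        a = np.dot(w_gd, x)            # activations
        e = x - np.tanh(a)             # gap between target state and output
        w_gd += eta * np.outer(e, x)   # nudge the weights to shrink the gap
    w_gd = 0.5 * (w_gd + w_gd.T)       # keep the weight matrix symmetric
You can test it by feeding the patterns through w_gd exactly as we did with wo above.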
Visualize Hopfield net
Using the networkx library, we can visualize our network.
The Hopfield net that memorized 4 patterns
import networkx as nx
G = nx.Graph()
G.add_nodes_from(range(25))
for i in range(25):
    for j in range(25):
        G.add_edge(i,j, weight = w[i,j])   # w was already built by the Hebb rule above
nx.draw(G, with_labels =True, font_weight = 'bold')
plt.show()
Apparently, we don’t need to draw the connections with zero weight.
G = nx.Graph()
G.add_nodes_from(range(25))
for i in range(25):
    for j in range(25):
        if w[i,j] != 0:
            G.add_edge(i,j, weight = w[i,j])
nx.draw(G, with_labels =True, font_weight = 'bold')
plt.show()
Damaged Hopfield net and visualization of the weights.
The damaged Hopfield net has a reduced network, but even with the 24 percent loss the network still seems firm. Besides, we can visualize the weights between the nodes.
G = nx.Graph()
G.add_nodes_from(range(25))
for i in range(25):
    for j in range(25):
        if dw[i,j] != 0:                       # use the damaged weights
            G.add_edge(i,j, weight = dw[i,j])
pos=nx.spring_layout(G)
labels = nx.get_edge_attributes(G,'weight')
nx.draw_networkx_edge_labels(G,pos,edge_labels=labels)
nx.draw(G, pos, with_labels =True, font_weight = 'bold')
plt.show()
It obviously looks like a feedback network.
The other magic of deep learning
In the first post on neural networks, I commented on the magical power of randomness. Interestingly, that magic was discussed in the meetup for Advanced Reading Data Science. The magic we have seen in the associative memory of the neural network also came up at the meetup this week. What is nice for me is that their recent interests lie in deep learning. Next week the topic is deep reinforcement learning, and I will also post about deep Q-learning neural networks soon.
I was recommended to join the meetup last week. This meetup, Learn Data Science by Doing Kaggle competitions, was also better than I expected. I thought they would teach some practical library such as pandas or scikit-learn. However, the meetup was more advanced than I thought: it was a meeting to discuss the core techniques and relevant research papers applied in the competition.
The main technique was U-Net. Experienced doctors look at an ultrasound film of the neck and find irregularities such as a tumor, and our goal is to make the machine learn the skill of those experienced doctors. To get an idea of the technique, you can read this research paper and its references, or just visit this website and watch the video on the page.
I will not go into the details of the research paper. Instead, what I want to emphasize is that making the data coarser is the key to the technique. When we human beings read a picture, we do not need all of the data: the amount of data a human uses to read the ultrasound film must be very small compared to the size of the given data. We need to throw out unnecessary information to develop intuition. The network structure of the data makes it possible to keep the key structure of the data, and it also improves the speed of the process dramatically.
-
If readers want to test it by themselves, save the result of the print as a txt file. ↩︎