**Keywords:** Hebbian Learning, Supervised Hebbian Learning

The whole Hebbian learning theory:

- Idea: https://brainbomb.org/Artificial-Intelligence/Neural-Networks/Neural-Network-Design-Supervised-Hebbian-Learning/
- Analysis: https://brainbomb.org/Artificial-Intelligence/Neural-Networks/Neural-Network-Design-Supervised-Hebbian-Learning-P2/
- Application: https://brainbomb.org/Artificial-Intelligence/Neural-Networks/Neural-Network-Design-Supervised-Hebbian-Learning-P3/

## Application of Hebb Learning^{1}

An application is presented here. We have three input/output pairs: $5\times 6$ images that contain only white and black pixels.

We then read each image and map the white and black pixels into $\{1,-1\}$, so the 'zero' image becomes the matrix:

$$
\begin{aligned}
\{&\\
&-1,1,1,1,-1,\\
&1,-1,-1,-1,1,\\
&1,-1,-1,-1,1,\\
&1,-1,-1,-1,1,\\
&1,-1,-1,-1,1,\\
&-1,1,1,1,-1\\
\}&
\end{aligned}
$$
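This pixel-to-value mapping can be sketched in a few lines of NumPy; the `zero_img` array below is a hand-typed stand-in for the actual image file:

```python
import numpy as np

# hand-typed stand-in for the 'zero' image: 1 = white pixel, 0 = black pixel
zero_img = np.array([
    [0, 1, 1, 1, 0],
    [1, 0, 0, 0, 1],
    [1, 0, 0, 0, 1],
    [1, 0, 0, 0, 1],
    [1, 0, 0, 0, 1],
    [0, 1, 1, 1, 0],
])

# map pixel values into {1, -1} and flatten into a prototype vector
p = np.where(zero_img > 0, 1, -1).reshape(-1)
print(p[:5])  # first row of the matrix above: [-1  1  1  1 -1]
```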

We use the inputs themselves as the targets, so the network architecture becomes (the transfer function is the symmetrical hard limit):

Following the algorithm summarized above, we get the code (imports and the `symmetrical_hard_limit` helper are included here so the snippet is self-contained):

```python
# part of the code
import os

import cv2
import numpy as np


def symmetrical_hard_limit(x):
    """symmetrical hard limit transfer function: 1 for x >= 0, -1 otherwise"""
    return np.where(x >= 0, 1, -1)


class HebbLearning():
    def __init__(self, training_data_path='./data/train', gamma=0, alpha=0.5):
        """
        initialization
        :param training_data_path: the path of the training set and its labels;
            they should be organized as pictures in '.png' form
        :param gamma: the punishment (decay) coefficient
        :param alpha: learning rate
        """
        self.gamma = gamma
        self.alpha = alpha
        x = self.load_data(training_data_path)
        self.X = np.array(x)
        self.label = np.array(x)  # the inputs are used as the targets
        self.weights = np.zeros((np.shape(x)[1], np.shape(x)[1]))
        self.test_data = []

    def load_data(self, data_path):
        """
        load image data and convert it into matrix form
        :param data_path: the path of the data
        :return: list of normalized {1, -1} vectors
        """
        name_list = os.listdir(data_path)
        X = []
        for file_name in name_list:
            data = cv2.imread(os.path.join(data_path, file_name), 0)
            if data is None:
                continue
            data = data.reshape(1, -1)[0].astype(np.float64)
            # map white pixels to 1 and black pixels to -1
            for i in range(len(data)):
                if data[i] > 0:
                    data[i] = 1
                else:
                    data[i] = -1
            data = data / np.linalg.norm(data)
            X.append(data)
        return X

    def process(self):
        """compute the weights using the Hebb learning rule with decay"""
        for x, label in zip(self.X, self.label):
            self.weights = self.weights \
                + self.alpha * np.dot(label.reshape(-1, 1), x.reshape(1, -1)) \
                - self.gamma * self.weights

    def test(self, input_path='./data/test'):
        """
        test a given input using the linear associator
        :param input_path: test data should be organized as pictures whose
            names are their labels
        :return: the output patterns after the hard limit
        """
        self.test_data = self.load_data(input_path)
        labels_test = []
        for x in self.test_data:
            output_origin = np.dot(self.weights, x.reshape(-1, 1))
            labels_test.append(symmetrical_hard_limit(output_origin))
        return np.array(labels_test)
```
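To see the rule working without any image files, here is a self-contained sketch on two tiny hand-made prototypes (the patterns are assumptions for illustration, not the digit images above):

```python
import numpy as np

# two orthogonal toy prototypes, normalized the same way load_data does
prototypes = [np.array([1., -1., 1., -1.]),
              np.array([1., 1., -1., -1.])]
prototypes = [p / np.linalg.norm(p) for p in prototypes]

# Hebb rule with the inputs as targets (autoassociative, gamma = 0)
alpha = 1.0
W = np.zeros((4, 4))
for p in prototypes:
    W = W + alpha * np.outer(p, p)

# recall: hard-limit the linear associator output
recalled = np.where(W @ prototypes[0] >= 0, 1, -1)
print(recalled)  # the sign pattern of the first prototype: [ 1 -1  1 -1]
```

Because the toy prototypes are orthonormal, each one is recovered exactly; with correlated prototypes the recall would be noisy, which is exactly the shortcoming discussed in the variations below.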

The whole project can be found at: https://github.com/Tony-Tan/NeuronNetworks/tree/master/supervised_Hebb_learning

If you find it helpful, please star it!

The algorithm gives the following result (left: input; right: output):

It behaves like an associative memory.

## Some Variations of Hebb Learning

Derived rules of Hebb learning have been developed to overcome shortcomings of the basic algorithm, such as:

The elements of $\boldsymbol{W}$ grow without bound as more prototypes are presented.

To overcome this problem, several ideas come to mind:

- A learning rate $\alpha$ can be used to slow down this growth
- Adding a decay term, so the learning rule is changed into a smooth filter: $\boldsymbol{W}^{\text{new}}=\boldsymbol{W}^{\text{old}}+\alpha\boldsymbol{t}_q\boldsymbol{p}_q^T-\gamma\boldsymbol{W}^{\text{old}}$ which can also be written as $\boldsymbol{W}^{\text{new}}=(1-\gamma)\boldsymbol{W}^{\text{old}}+\alpha\boldsymbol{t}_q\boldsymbol{p}_q^T$ where $0<\gamma<1$
- Using the residual between output and target, multiplied by the input, as the increment of the weights: $\boldsymbol{W}^{\text{new}}=\boldsymbol{W}^{\text{old}}+\alpha(\boldsymbol{t}_q-\boldsymbol{a}_q)\boldsymbol{p}_q^T$

In the second idea, when $\gamma\to 1$ the algorithm quickly forgets the old weights, while when $\gamma\to 0$ it reduces to the standard form. This idea of filtering is widely used in later algorithms.
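A quick numerical check confirms that the two forms of the decay rule above are the same update; the weights and the prototype pair below are made up for illustration:

```python
import numpy as np

# made-up old weights and one prototype/target pair
W_old = np.ones((2, 2))
p = np.array([[1.0], [-1.0]])
t = np.array([[1.0], [1.0]])
alpha, gamma = 0.5, 0.9

# form 1: W_new = W_old + alpha * t p^T - gamma * W_old
form1 = W_old + alpha * t @ p.T - gamma * W_old
# form 2: W_new = (1 - gamma) * W_old + alpha * t p^T
form2 = (1 - gamma) * W_old + alpha * t @ p.T

print(np.allclose(form1, form2))  # True
```

With $\gamma=0.9$ the old weights are scaled down to $10\%$ in a single step, which shows the fast forgetting described above.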

The third method, also known as the Widrow-Hoff algorithm, minimizes the mean square error (equivalently, the sum of squared errors). It has the additional advantage of updating the weights step by step, whenever a prototype is presented, so it can quickly adapt to a changing environment, which some other algorithms cannot do.
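The step-by-step nature of the Widrow-Hoff update can be sketched on a single made-up prototype, assuming a linear output $\boldsymbol{a}=\boldsymbol{W}\boldsymbol{p}$:

```python
import numpy as np

# made-up prototype/target pair, assuming a linear output a = W p
p = np.array([[1.0], [-1.0]])
t = np.array([[1.0], [1.0]])
alpha = 0.1

W = np.zeros((2, 2))
for _ in range(100):
    a = W @ p                      # current output
    W = W + alpha * (t - a) @ p.T  # update with the residual (t - a)

# the residual shrinks by a factor (1 - alpha * p^T p) = 0.8 each step,
# so the output converges to the target
print(np.allclose(W @ p, t))  # True
```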