Keywords: neuron model, network architecture

Multiple-Input Neuron [1]

After the insight gained from the single-input neuron, we can easily build a more complex and powerful model, the multiple-input neuron, whose structure is closer to a biological nerve cell than the single-input neuron's:

Then we can describe the neuron with a mathematical expression, in which a summation operation combines the weighted inputs, as follows:

$$
a=w_{1,1}\cdot p_1+w_{1,2}\cdot p_2+\dots+ w_{1,R}\cdot p_R+b\tag{13}
$$

The weight $w$ carries two subscripts, which may seem unnecessary here because the first subscript never varies. In the long run, however, it is better to keep it, since it labels the neuron: $w_{1,2}$ represents the weight of the second synapse belonging to the first neuron. In general, when we have $k$ neurons, the $m$th synapse weight of the $n$th neuron is $w_{n,m}$.

Let’s go back to equation (13). It can be rewritten as:

$$
n=W\boldsymbol{p}+b\tag{14}
$$

where:

  • $W$ is a matrix with a single row, containing the weights
  • $\boldsymbol{p}$ is the vector of inputs
  • $b$ is a scalar representing the bias
  • $n$ is the result of the cell-body operation,

then the output is:

$$
a=f(W\boldsymbol{p}+b)\tag{15}
$$
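To make this concrete, here is a minimal NumPy sketch of equations (13) and (15); the weights, bias, and the choice of $\tanh$ as the transfer function $f$ are illustrative assumptions, not values from the text:

```python
import numpy as np

def neuron(p, w, b, f=np.tanh):
    """Multiple-input neuron, equation (15): a = f(W p + b)."""
    n = np.dot(w, p) + b   # cell-body summation, equation (13)
    return f(n)

p = np.array([1.0, -2.0, 0.5])   # R = 3 inputs
w = np.array([0.2, 0.4, -0.1])   # the single row of weights w_{1,1..3}
b = 0.3                          # scalar bias
print(neuron(p, w, b))
```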

A diagram is a very powerful tool for expressing a neuron or a network because it is good at showing the topological structure of the network. For further work, an abbreviated notation was designed. For the multiple-input neuron, we have:

A feature of this notation is that the dimensions of each variable are labeled, and the input dimension $R$ is decided by the designer.

Network Architecture

A single neuron is not sufficient for most practical tasks, even when it has multiple inputs.

A Layer of Neurons

To perform a more complicated function, we need more than one neuron, so we construct a network that contains a layer of neurons:

In this model, we have an $R$-dimensional input and $S$ neurons, so we get:

$$
a_i=f_i\left(\sum_{j=1}^{R}w_{i,j}\cdot p_j+b_i\right)\tag{16}
$$

This is the output of the $i$th neuron in the network, and we can rewrite the whole network in matrix form:

$$
\boldsymbol{a}=\boldsymbol{f}(W\boldsymbol{p}+\boldsymbol{b})\tag{17}
$$

where

  • $W$ is the weight matrix $\begin{bmatrix}w_{1,1}&\cdots&w_{1,R}\\ \vdots&&\vdots\\w_{S,1}&\cdots&w_{S,R}\end{bmatrix}$, where $w_{i,j}$ is the $j$th weight of the $i$th neuron
  • $\boldsymbol{p}$ is the vector of input $\begin{bmatrix}p_1\\ \vdots\\p_R\end{bmatrix}$
  • $\boldsymbol{a}$ is the vector of output $\begin{bmatrix}a_1\\ \vdots\\a_S\end{bmatrix}$
  • $\boldsymbol{f}$ is the vector of transfer functions $\begin{bmatrix}f_1\\ \vdots\\f_S\end{bmatrix}$ where each $f_i$ can be different.
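As a sketch, equation (17) takes only a few lines of NumPy. Here a single transfer function is applied element-wise to all $S$ neurons, and the random weights and $\tanh$ are assumptions made for illustration:

```python
import numpy as np

def layer(p, W, b, f=np.tanh):
    """A layer of S neurons, equation (17): a = f(W p + b)."""
    return f(W @ p + b)          # f applied element-wise

rng = np.random.default_rng(0)
R, S = 3, 4                      # 3 inputs, 4 neurons (arbitrary sizes)
W = rng.standard_normal((S, R))  # W[i, j] holds w_{i+1, j+1}
b = rng.standard_normal(S)
p = np.array([1.0, -2.0, 0.5])
print(layer(p, W, b))            # output vector a of length S
```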

This network is much more powerful than a single neuron, yet the two share a very similar abbreviated notation:

The only distinction is the dimension of each variable.

Multiple Layers of Neurons

The next step in extending a single-layer network is to use multiple layers:

Its final output is:

$$
\boldsymbol{a}=\boldsymbol{f}^3(W^3\boldsymbol{f}^2(W^2\boldsymbol{f}^1(W^1\boldsymbol{p}+\boldsymbol{b}^1)+\boldsymbol{b}^2)+\boldsymbol{b}^3)\tag{18}
$$

The number at the top right of each variable is the layer number; for example, $w^1_{2,3}$ is the weight of the $3$rd synapse of the $2$nd neuron in the $1$st layer.
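A hedged sketch of the three-layer pass in equation (18) follows; the layer sizes, random parameters, and transfer functions are arbitrary assumptions:

```python
import numpy as np

def feedforward(p, params, fs):
    """Three-layer pass, equation (18): each layer's output feeds the next."""
    a = p
    for (W, b), f in zip(params, fs):
        a = f(W @ a + b)
    return a

rng = np.random.default_rng(1)
sizes = [3, 5, 4, 2]             # R=3, then S^1=5, S^2=4, S^3=2 (arbitrary)
params = [(rng.standard_normal((m, n)), rng.standard_normal(m))
          for n, m in zip(sizes, sizes[1:])]
fs = [np.tanh, np.tanh, lambda n: n]   # a linear output layer, an assumption
print(feedforward(np.array([1.0, -2.0, 0.5]), params, fs))
```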

Each layer also has its own name. For instance, the first layer, whose input is the external input, is called the input layer; the layer whose output is the external output is called the output layer; the other layers are called hidden layers. The abbreviated notation is:

The multiple-layer model is powerful, but it is hard to design because the number of layers and the number of neurons in each layer are both arbitrary, so choosing them becomes an experimental task. The sizes of the input and output layers, however, are fixed: they are determined by the task at hand. Transfer functions are also arbitrary, and each neuron may have its own transfer function, different from every other neuron in the network. The bias can be omitted, but then a network whose transfer functions satisfy $f(0)=0$ will always output $\boldsymbol{0}$ when the input is $\boldsymbol{0}$, which may not make sense in some tasks; so the bias plays an important part in the $\boldsymbol{0}$-input situation, while for other inputs it seems less important.

Recurrent Networks

A neuron’s output may also be connected back to its own input. This means that an input presented some time steps ago returns to the neuron again. The network then acts somewhat like:

$$
\boldsymbol{a}=\boldsymbol{f}(W\boldsymbol{f}(W\boldsymbol{p}+\boldsymbol{b})+\boldsymbol{b})\tag{19}
$$

To illustrate the procedure, we introduce the delay block,

whose output is the input delayed by one time unit:

$$
a(t)=u(t-1)\tag{20}
$$

and the block is initialized with $a(0)$.

Another useful operation for recurrent networks is the integrator:

whose output is:

$$
a(t)=\int^t_0 u(\tau)\,d\tau +a(0)\tag{21}
$$
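Here is a small sketch of both building blocks in discrete time; treating time as integer steps and approximating the integral with a running sum are our simplifications, and the signal values are illustrative:

```python
import numpy as np

def delay(u, a0):
    """Delay block, equation (20): a(t) = u(t-1), with a(0) = a0."""
    return np.concatenate(([a0], u[:-1]))

def integrator(u, a0, dt=1.0):
    """Integrator block, equation (21), approximated by a running
    (left Riemann) sum of the input signal."""
    return a0 + np.concatenate(([0.0], np.cumsum(u[:-1]) * dt))

u = np.array([1.0, 2.0, 3.0, 4.0])
print(delay(u, a0=0.0))        # [0. 1. 2. 3.]
print(integrator(u, a0=0.0))   # [0. 1. 3. 6.]
```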

A recurrent network is a network containing a feedback connection: some neuron’s output connects back to its input through some path. Such networks are harder to analyze than the feedforward networks discussed above, so here we only list some basic concepts; more details will be explored in the following posts. A recurrent network is more powerful than a feedforward network because it exhibits temporal behavior, a fundamental property of the biological brain. A typical recurrent network is:

where:

$$
\boldsymbol{a}(0)=\boldsymbol{p}\\
\boldsymbol{a}(t+1)=\boldsymbol{f}(W\boldsymbol{a}(t)+\boldsymbol{b})\tag{22}
$$
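As a sketch, equation (22) can be simulated by iterating the layer and feeding each output back in; $W$, $\boldsymbol{b}$, $\tanh$, and the step count below are illustrative assumptions:

```python
import numpy as np

def run_recurrent(p, W, b, f, steps):
    """Equation (22): a(0) = p, then a(t+1) = f(W a(t) + b)."""
    a = p
    outputs = [a]
    for _ in range(steps):
        a = f(W @ a + b)         # the output is fed back as the next input
        outputs.append(a)
    return outputs

rng = np.random.default_rng(2)
S = 3
W = 0.5 * rng.standard_normal((S, S))  # modest weights keep the iteration bounded
b = rng.standard_normal(S)
p = np.array([1.0, -1.0, 0.5])
for t, a in enumerate(run_recurrent(p, W, b, np.tanh, steps=3)):
    print(f"a({t}) = {a}")
```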

References


[1] Demuth, H.B., Beale, M.H., De Jess, O. and Hagan, M.T., 2014. Neural Network Design. Martin Hagan.
Last modified: March 24, 2020