
John Erling Blad edited this page Feb 1, 2022 · 1 revision

Knowledge network

This is not a complete description of the neuron model used inside the knowledge-representing network for families of HERA, but it should be sufficient for a scientist with some basic knowledge of artificial neural networks and the artificial neurons they use.


A very short explanation of what kind of neural network this is: columns of autoencoders for each layer are evenly spaced over a neocortex, and the autoencoders in each layer form residual neural networks. There isnʼt a single autoencoder for each layer; several neurons join together to make the composite encoder. They also represent binary values as sets of active neurons, not as continuous values. It is something of a misnomer to say a column forms one autoencoder inside each layer, as there are several, with different types of connections inside a column, but as a simplified description it holds.

Some layers have internal and external mixins, which play the role of hardcoded routing in more traditional deep learning networks; in a biological neocortex this is done by corticocortical and intracortical connections. The corticocortical connections are fast internal mixins that do not need fast adaptations, and reside in layers II and III. The intracortical connections, however, run between areas that do need such fast adaptations, and reside in layers IV and V.

A neocortex has an assoc that creates feedback on associations from activity in each column, letting other columns act on the association. This can be interpreted as manipulating the probability density function implemented by the columns. When something is thoroughly learned, it may automate the action, short-circuiting an otherwise heavy operation.

There is also a memory stack for the neocortex, which keeps a trace of what the neocortex is doing at any given moment. This could create multi-level (or multi-node) addresses.

There are also networks for creating expectations and choosing the winner.

Neuron

The model tries to avoid calculations in the time domain, typically by only considering the state of the axon hillock, not any delayed spikes down the axon, effectively collapsing time-space into space only. It is assumed that adjusting for time effectively amounts to adjusting for variances in processing speed between neurons and for differences in paths, and with digital processing much of this will disappear.

In some cases this is clearly an error, but in most cases it seems to be correct. One place where it is clearly wrong is audio processing: there, autocorrelation is used as a proxy for frequency analysis, and neglecting the time domain is clearly an error.

Activation at the soma is a sum over the activations of synapses in the dendritic tree, where the synapses are assumed to be packed densely enough to be represented by weights along vectors, one vector per dendrite. Those can be rearranged as a single vector. In many cases this weight-vector representation can be counterproductive, and the weights are better represented as a matrix overlaid on some portion of shared space.
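As a minimal sketch of the two representations described above (all names and values hypothetical), per-dendrite weight vectors can be flattened into a single vector, or laid out as a matrix over a shared two-dimensional space:

```python
# Per-dendrite weight vectors, one list per dendrite in the tree
dendrites = [[0.2, 0.5], [0.1], [0.4, 0.3, 0.6]]

# Rearranged as a single flat weight vector
flat = [w for dendrite in dendrites for w in dendrite]

# Alternatively, a weight matrix overlaid on a shared 2-D space,
# where each (row, col) position addresses a patch of that space
matrix = [[0.0, 0.2],
          [0.5, 0.0]]
```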

The activation is a sum, not a product, as the input is bits: either a bit is set and the weight is added to the sum, or it is not set and a zero is added. This summation can be done extremely fast, even in simplified streaming cores, without ever using any fancy tensor cores. The sum is compared to a preset value, and if it goes above, the output is activated and the neuron spikes. This is the simple forward path.
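The forward path above can be sketched in a few lines (function and parameter names are hypothetical, not from the original):

```python
def forward(weights, active_inputs, threshold):
    """Simple forward path: sum the weights of active (bit-set) synapses
    and compare against a preset threshold.

    weights       -- one weight per synapse
    active_inputs -- one bool per synapse, the binary input
    threshold     -- the preset value the sum is compared to
    """
    total = sum(w for w, bit in zip(weights, active_inputs) if bit)
    return total > threshold  # True means the neuron spikes
```

Because the inputs are bits, there is no multiplication at all, only conditional addition, which is what makes the summation cheap on simple cores.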

More goes on, though. If there is input on a synapse (which might have a weight of zero), then a delta weight is increased. This delta weight is temporarily added to the synapse's own weight during the summation, but the increase is only kept if the overall activation leads to the generation of a spike. If not, the delta weight is held at its previous value. Thus, the synapse temporarily learns and adapts its weight.
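One way to read this mechanism, sketched below with hypothetical names and a hypothetical increment size: each active synapse contributes its weight plus its delta plus a candidate increment during summation, and the increment is committed only when the neuron actually spikes:

```python
def step(weights, deltas, active, threshold, delta_inc=0.1):
    """One forward step with temporary delta-weight learning.

    Active synapses contribute weight + delta + a candidate increment.
    The increment is kept only if the summation produces a spike;
    otherwise the deltas stay at their previous values.
    """
    total = sum(w + d + delta_inc
                for w, d, bit in zip(weights, deltas, active) if bit)
    spiked = total > threshold
    if spiked:
        # commit the increment on every synapse that saw input
        deltas = [d + delta_inc if bit else d
                  for d, bit in zip(deltas, active)]
    return spiked, deltas
```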

Some artificial neurons take this a bit further: they also decrease the delta weight if the neuron is not included in an expected outcome. This is negative learning.
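Negative learning could be sketched as a separate update (names and decrement size are hypothetical): when the neuron is not part of the expected outcome, the deltas of the synapses that saw input are decreased instead of increased:

```python
def negative_learning(deltas, active, in_expected_outcome, delta_dec=0.1):
    """Decrease delta weights on active synapses when the neuron
    was not included in the expected outcome (negative learning)."""
    if in_expected_outcome:
        return deltas  # no penalty when the neuron was expected
    return [d - delta_dec if bit else d
            for d, bit in zip(deltas, active)]
```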

The delta weight is kept over some time, slowly increasing (or decreasing) in value due to learning. Later, a sluggish decrement operation counts all delta weights toward zero and adds the remaining delta weight to the synapse's own weight. The net effect is that when activation happens several times inside the time window of the decrement operation, the learning becomes permanent.
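One possible reading of the sluggish decrement, sketched with hypothetical names and a hypothetical decay step: each delta is counted toward zero by a fixed amount, and whatever remains is folded into the permanent weight, so only deltas that were reinforced several times inside the window leave a lasting trace:

```python
def consolidate(weights, deltas, decay=0.25):
    """Sluggish decrement: count each delta toward zero by `decay`,
    then add the remainder to the synapse's own (permanent) weight."""
    new_weights, new_deltas = [], []
    for w, d in zip(weights, deltas):
        # shrink the delta toward zero by at most `decay`
        remainder = max(d - decay, 0.0) if d >= 0 else min(d + decay, 0.0)
        new_weights.append(w + remainder)  # remainder becomes permanent
        new_deltas.append(0.0)
    return new_weights, new_deltas
```

A delta smaller than the decay step is wiped out entirely, which matches the idea that a single activation inside the time window is not enough to make the learning permanent.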
