CapoCaccia


Convolutional Neural Networks have been shown to successfully classify and localize objects within static images. Furthermore, the learned features show a strong resemblance to the receptive fields of neurons found in the early visual cortex. It is not yet clear how one could build, from this kind of feature representation, models that account for higher-level concepts or combinations of them, especially in the context of dynamically changing scenes.
The tremendous amount of data needed to train conventional artificial neural networks using error back-propagation raises the question of whether unsupervised training with continuous learning might be better suited, especially if the number of classes is not known in advance (unlabeled data).
Unsupervised training in a continuous fashion has the additional advantage that the network might be able to account for new classes that were not present from the beginning.
In this discussion group I would like to discuss one possible model for this: Spiking Self-Organizing Maps [1].

Questions I would like to discuss:

Is the spiking SOM able to cluster the presented input (see below) in accordance with human intuition?
In what way does the SOM transform the data?
How can communication between different spiking SOMs be coordinated?
How can the clustering capabilities be analyzed (e.g. t-SNE, PCA)?
What could be potential applications?
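
One way to approach the PCA part of the analysis question above is to project high-dimensional vectors (for example a map's learned weight vectors, or the input features) onto their first two principal components and inspect the resulting 2-D scatter. The following is a minimal sketch in plain Python, assuming small data. The function names (`pca_2d`, `top_component`) and the toy data are hypothetical and not taken from [1] or [6]. It uses power iteration with deflation, which is only one of several ways to compute PCA; for inputs with tens of thousands of features a numerical library would be the more natural choice. t-SNE is not sketched here.

```python
import math
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def center(data):
    """Subtract the per-dimension mean from every sample."""
    n, dim = len(data), len(data[0])
    mean = [sum(row[j] for row in data) / n for j in range(dim)]
    return [[row[j] - mean[j] for j in range(dim)] for row in data]

def covariance(centered):
    """Sample covariance matrix of already-centered data."""
    n, dim = len(centered), len(centered[0])
    return [[sum(row[i] * row[j] for row in centered) / (n - 1)
             for j in range(dim)] for i in range(dim)]

def top_component(matrix, iters=200, seed=0):
    """Leading eigenvector and eigenvalue of a symmetric matrix (power iteration)."""
    rng = random.Random(seed)
    v = [rng.random() + 0.1 for _ in matrix]
    for _ in range(iters):
        w = [dot(row, v) for row in matrix]
        norm = math.sqrt(dot(w, w))
        v = [x / norm for x in w]
    eigenvalue = dot(v, [dot(row, v) for row in matrix])
    return v, eigenvalue

def pca_2d(data):
    """Project samples onto their first two principal components."""
    centered = center(data)
    cov = covariance(centered)
    pc1, var1 = top_component(cov)
    dim = len(cov)
    # Deflation: subtract the variance explained by pc1, then find the next axis.
    deflated = [[cov[i][j] - var1 * pc1[i] * pc1[j] for j in range(dim)]
                for i in range(dim)]
    pc2, _ = top_component(deflated, seed=1)
    projected = [(dot(row, pc1), dot(row, pc2)) for row in centered]
    return projected, (pc1, pc2)

# Hypothetical toy data: two clusters of 3-D "feature vectors".
rng = random.Random(42)
cluster_a = [[rng.gauss(0.0, 0.1), rng.gauss(0.0, 0.1), rng.gauss(0.0, 0.5)]
             for _ in range(20)]
cluster_b = [[rng.gauss(3.0, 0.1), rng.gauss(3.0, 0.1), rng.gauss(0.0, 0.5)]
             for _ in range(20)]
projected, (pc1, pc2) = pca_2d(cluster_a + cluster_b)
for label, (x, y) in zip("A" * 20 + "B" * 20, projected):
    print(label, round(x, 2), round(y, 2))
```

In this toy case the two clusters should end up on opposite sides of the first principal axis. Whether a projection of a trained map looks similarly clean on real data is exactly the open question.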

No timetable published yet.

Conventional artificial neural networks, such as Convolutional Neural Networks (e.g. [3]) or Recurrent Neural Networks [4], need a tremendous amount of labeled data in order to classify a given dataset with convincing accuracy. Once these networks are trained, the weights are no longer updated and the network's parameters stay fixed. Furthermore, one needs to specify the number of classes a priori.
In this discussion group I would like to discuss the paper by Rumbell and colleagues [1] about spiking self-organizing maps (SOM). The authors extend the original self-organizing map proposed in [2] to use a more biologically inspired neuron model, namely leaky integrate-and-fire neurons. Rumbell and colleagues showed that their spiking self-organizing map is capable of clustering the Iris dataset, a very small dataset consisting of 3 classes (different flowers) and 150 samples in total.
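
For reference, the classic (non-spiking) self-organizing map of [2], which [1] extends, can be sketched in a few lines. This is a minimal illustration and not the spiking model of [1]. The grid size, the learning-rate and neighborhood schedules, and the toy data (three 2-D clusters standing in for a small three-class dataset such as Iris) are all assumptions chosen for readability.

```python
import math
import random

def best_matching_unit(weights, x):
    """Grid coordinate of the node whose weight vector is closest to x."""
    return min(weights,
               key=lambda n: sum((wi - xi) ** 2 for wi, xi in zip(weights[n], x)))

def train_som(data, rows=4, cols=4, epochs=30, lr0=0.5, sigma0=2.0, seed=0):
    """Train a classic Kohonen SOM; returns {(row, col): weight vector}."""
    rng = random.Random(seed)
    dim = len(data[0])
    # Random initial weights in [0, 1); inputs are assumed to lie in that range.
    weights = {(r, c): [rng.random() for _ in range(dim)]
               for r in range(rows) for c in range(cols)}
    steps = epochs * len(data)
    t = 0
    for _ in range(epochs):
        order = list(data)
        rng.shuffle(order)
        for x in order:
            frac = t / steps
            lr = lr0 * (1.0 - frac)                  # linearly decaying learning rate
            sigma = max(sigma0 * (1.0 - frac), 0.3)  # shrinking neighborhood width
            bmu = best_matching_unit(weights, x)
            for node, w in weights.items():
                d2 = (node[0] - bmu[0]) ** 2 + (node[1] - bmu[1]) ** 2
                h = math.exp(-d2 / (2.0 * sigma ** 2))  # Gaussian neighborhood
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])       # pull node toward sample
            t += 1
    return weights

def quantization_error(weights, data):
    """Mean Euclidean distance between each sample and its best-matching unit."""
    total = 0.0
    for x in data:
        w = weights[best_matching_unit(weights, x)]
        total += math.sqrt(sum((wi - xi) ** 2 for wi, xi in zip(w, x)))
    return total / len(data)

def toy_clusters(n_per_class=30, seed=1):
    """Hypothetical stand-in data: three Gaussian clusters in the unit square."""
    rng = random.Random(seed)
    centers = [(0.2, 0.2), (0.8, 0.3), (0.5, 0.8)]
    return [[rng.gauss(cx, 0.03), rng.gauss(cy, 0.03)]
            for cx, cy in centers for _ in range(n_per_class)]

data = toy_clusters()
untrained = train_som(data, epochs=0)  # same seed, so same initial weights
trained = train_som(data)
print("quantization error before:", round(quantization_error(untrained, data), 3))
print("quantization error after: ", round(quantization_error(trained, data), 3))
```

Note that this version needs an explicit, decaying learning schedule and therefore a fixed training duration, which is one of the points where a continuously learning spiking variant differs.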
Even though [1] demonstrated the functionality and computational capabilities of spiking SOMs only on a toy example, the model has very interesting properties.
Figure 1 (spiking SOM) shows the overall structure of a single spiking SOM. The input vector projects to a two-dimensional neuron array u. u itself is connected to an inhibitory population via excitatory and inhibitory synapses with different time constants. The inhibitory population projects back to u through inhibitory synapses. This connectivity profile leads to oscillatory activity within u if u is continuously stimulated. Furthermore, u is connected to the neuron population v in a feed-forward manner. Early firing in v determines the location of output activity in v through lateral (neighborhood) connections.
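
The spiking SOM is built from leaky integrate-and-fire neurons, the neuron model named above. As a minimal sketch of that building block only, the following simulates a single LIF neuron under constant stimulation using forward Euler integration. All parameter values and the dimensionless units are illustrative assumptions, not those of [1]. The sketch omits the inhibitory loop, synapses and learning, so it shows the regular firing of one neuron rather than the network oscillation described above.

```python
def simulate_lif(input_current, t_stop=0.2, dt=1e-4, tau_m=0.02,
                 v_rest=0.0, v_thresh=1.0, v_reset=0.0, t_ref=0.002):
    """Forward-Euler LIF neuron: tau_m * dV/dt = -(V - v_rest) + I.

    Returns the list of spike times in seconds. Voltage and current are
    dimensionless here (an assumption made for simplicity).
    """
    v = v_rest
    refractory_left = 0.0
    spikes = []
    n_steps = int(round(t_stop / dt))
    for step in range(n_steps):
        t = step * dt
        if refractory_left > 0.0:
            # Membrane is clamped at v_reset during the refractory period.
            refractory_left -= dt
            continue
        # Leak toward v_rest plus constant input drive.
        v += dt / tau_m * (-(v - v_rest) + input_current)
        if v >= v_thresh:
            spikes.append(t)
            v = v_reset
            refractory_left = t_ref
    return spikes

for current in (0.5, 2.0, 3.0):
    print("I =", current, "->", len(simulate_lif(current)), "spikes in 0.2 s")
```

With these assumed parameters an input below threshold never elicits a spike, while stronger constant input produces regular firing at a higher rate.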
Figure 2 illustrates how the map self-organizes from a random initialization to a structure with approximately equal distances between nodes. Although the network 'only' learned a task that is not challenging from a machine learning perspective, the authors speculate that it might be possible to combine multiple spiking SOMs in order to increase the computational capabilities and the complexity of the data that can be represented. This raises a problem or, strictly speaking, a challenge:
Each spiking SOM intrinsically modulates its own activity through its inhibitory population (oscillations). In order to coordinate communication between the maps, and to coordinate how higher-level concepts are formed, the different maps need to be synchronized.
In this context I would like to discuss, if we get this far, the Communication Through Coherence (CTC) hypothesis by Pascal Fries [5].

In order to probe the spiking SOM on a more challenging and relevant task, I would like to analyze in detail the model proposed by Rumbell and discuss with you what we could learn from this approach for training networks to cluster a given dataset. The dataset I am thinking of consists of videos from a driving robot/car in different environments. The images are preprocessed by a conventional CNN, so the input to the SOM consists of the activations of different features (~30k features), which change over time (due to the video). How does the network behave in the presence of such an input? How do temporal correlations in the input affect the map formation? Is this model able to grasp higher-level features, which also include a temporal component, and cluster them in a reasonable manner?

Please read references [1] and [2] in particular!

References

[1] Rumbell, Timothy, Susan L. Denham, and Thomas Wennekers. "A spiking self-organizing map combining STDP, oscillations, and continuous learning." IEEE Transactions on Neural Networks and Learning Systems 25.5 (2014): 894-907.
[2] Kohonen, Teuvo. "The self-organizing map." Neurocomputing 21.1 (1998): 1-6.
[3] Simonyan, Karen, and Andrew Zisserman. "Very deep convolutional networks for large-scale image recognition." arXiv preprint arXiv:1409.1556 (2014).
[4] Hochreiter, Sepp, and Jürgen Schmidhuber. "Long short-term memory." Neural Computation 9.8 (1997): 1735-1780.
[5] Fries, Pascal. "Neuronal gamma-band synchronization as a fundamental process in cortical computation." Annual Review of Neuroscience 32 (2009): 209-224.
[6] Code available at https://code.ini.uzh.ch/mmilde/NCSBrian2CNet.git (request at mmilde@ini.uzh.ch)

Fig. 1) Basic building block of spiking SOM

Fig. 2) Learning in spiking SOM (initialization to learned organization)

Leader

Moritz Milde

Members

Dongchen Liang

Moritz Milde

Timoleon Moraitis

Sahana Prasanna

Matteo Ragni

Fredrik Sandin

Jacopo Tani

Bernhard Vogginger

Yexin Yan

Spiking self-organizing maps for deep learning
