COMPARISON OF SUPERVISED LEARNING METHODS FOR SPIKE TIME CODING IN SPIKING NEURAL NETWORKS

ANDRZEJ KASIŃSKI, FILIP PONULAK

Abstract. In this review we focus on supervised learning methods for spike time coding in Spiking Neural Networks (SNN). This study is motivated by recent experimental results on information coding in biological neural systems, which suggest that the precise timing of individual spikes may be essential for efficient computation in the brain. We pose a fundamental question: what paradigms of neural temporal coding can be implemented with the recent learning methods? In order to answer this question we discuss various approaches to the considered learning task. We briefly describe the particular learning algorithms and report the results of experiments. Finally, we discuss the properties, assumptions and limitations of each method. We complete this review with a comprehensive list of pointers to the literature.

1. Introduction

For many years a common belief was that the essential information in neurons is encoded in their firing rates. However, recent neurophysiological results suggest that efficient processing of information in neural systems can also be founded on the precise timing of action potentials (spikes) ([1,2,3]). In the barn owl auditory system, neurons detecting coincidence receive volleys of precisely timed spikes from both ears ([4,5]). Under the influence of a common oscillatory drive in the rat hippocampus, the strength of a constant stimulus is coded in the relative timing of neuronal action potentials ([6]). In humans, the precise timing of first spikes in tactile afferents encodes touch signals at the finger tips ([1]). Time codes have also been suggested for rapid visual processing ([1]).

A precise temporal coding paradigm is also required in some artificial control systems. Examples are neuroprosthetic systems, which aim at producing functionally useful movements of paralysed limbs by exciting muscles or nerves with sequences of short electrical
impulses ([7]). Precise relative timing of impulses is critical for generating the desired, smooth movement trajectories.

In addition to the aforementioned examples, it has been theoretically demonstrated that the temporal neural code is very efficient whenever fast processing of information is required ([8]).

All these arguments provide strong motivation for investigating the computational properties of systems that compute with precisely timed spikes.

It is generally agreed that artificial Spiking Neural Networks (SNN) ([4,9,10]) are capable of exploiting time as a resource for coding and computation in a much more sophisticated manner than typical neural computational models ([11,12]). SNNs appear to be an interesting tool for investigating temporal neural coding and for exploiting its computational potential. Although significant progress has already been made in recognizing information codes that can be beneficial for computation in SNN ([4,9,11,13]), it is still an open problem to determine efficient neural learning mechanisms that enable the implementation of these particular time-coding schemes.

Unsupervised spike-based learning methods, such as LTP, LTD and STDP, have already been widely investigated and described in the literature ([14,15,16,17,18,19,20]). However, the unsupervised approach is not suitable for learning tasks that require an explicit goal definition.

In this article we focus on supervised learning methods for precise spike timing in SNN. The goal of our study is to determine what paradigms of neural information coding can be implemented with the recent approaches.

Date: 27.10.2005.
Key words and phrases. Supervised Learning, Spiking Neural Networks, Time Coding, Temporal Sequences of Spikes.
The work was partially supported by the State Committee for Scientific Research, project 1445/T11/2004/27.

First, we present the supervised learning methods for spike timing that are known from the literature. We classify these methods into more general
groups representing particular learning approaches and briefly describe each of the learning algorithms. Finally, we summarize the main facts about the learning approaches and discuss their properties.

2. Review of Learning Methods

In this section we present some representative methods for supervised learning in SNN. For all these methods the common goal of learning can be stated as follows: given a sequence of input spike trains $S^{in}(t)$ and a sequence of target output spikes $S^{d}(t)$, find a vector of synaptic weights $w$ such that the outputs of the learning neurons $S^{out}(t)$ are close to $S^{d}(t)$.

2.1. Methods based on gradient evaluation. Learning in traditional artificial neural networks (ANN) is usually performed by gradient descent techniques ([21]). However, explicit evaluation of the gradient in SNN is infeasible due to the discontinuous-in-time nature of spiking neurons. Indirect approaches or special simplifications must be assumed to deal with this problem.

In ([22,23]) Bohte and colleagues presented one such approach. Their method, called SpikeProp, is analogous to the backpropagation algorithm ([24]) known from traditional Artificial Neural Networks.

The target of SpikeProp is to learn a set of desired firing times, denoted $t^{d}_{j}$, at the postsynaptic neurons $j \in J$ for a given set of input patterns $S^{in}(t)$. Each neuron in a simulated network is allowed to fire only once during a single simulation cycle.

The learning method is based on an explicit evaluation of the gradient of $E = \frac{1}{2} \sum_{j} (t^{d}_{j} - t^{out}_{j})^{2}$ with respect to the weights of each synaptic input to $j$ (where $t^{out}_{j}$ is the actual firing time of neuron $j$). To overcome the discontinuous nature of spiking neurons, the authors approximated the thresholding function. Namely, it was assumed that for a small region around $t = t^{out}_{j}$, the function $V^{m}_{j}(t)$, denoting the membrane potential of $j$, could be linearly approximated. On this assumption, error-backpropagation equations were derived for a fully connected feedforward network with hidden
layers.

The SpikeProp algorithm has been re-investigated in ([25,26,27,28]). It was found that the weight initialization is a critical factor for good performance of the learning rule. In ([25]) the weights were initialized with values that led the network to successful training in a similar number of iterations as in ([22]), but with large learning rates, although Bohte argued that the approximation of the threshold function implies that only small learning rates can be used ([23]). Other experiments of Moore ([25]) also provided evidence that negative weights could be allowed and still led to successful convergence, which contradicted the conclusions of Bohte. Xin and Embrechts ([27]) proposed a modification of the learning algorithm that includes a momentum term in the weight update equation. It has been demonstrated that this modification significantly sped up the convergence of SpikeProp. In ([26]) additional learning rules were introduced for the synaptic delays, time constants and neuron thresholds. This resulted in smaller network topologies and also in faster algorithm convergence. Finally, Tiňo and Mills ([28]) extended SpikeProp to recurrent network topologies, to account for the temporal dependencies in the input stream. Neither the original SpikeProp method nor any of the proposed modifications enables learning of patterns composed of more than one spike per neuron.

Properties of the SpikeProp method were demonstrated in a set of classification experiments. These included the standard and the interpolated XOR problem ([13]). The SpikeProp authors encoded the input and output values by time delays, associating the analog values with the corresponding "earlier" or "later" firing times. In the interpolated XOR experiment the network could learn the presented input with an accuracy of the order of the algorithm integration time-step.

The classification abilities of SpikeProp were also tested on a number of common benchmark datasets (the Iris dataset, the
Wisconsin breast-cancer dataset and the Statlog Landsat dataset). For these problems the accuracy of the SNN trained with SpikeProp was comparable to that of a sigmoidal neural network. Moreover, in experiments on the real-world datasets, the SpikeProp algorithm always converged, whereas the compared ANN algorithms, such as the Levenberg–Marquardt algorithm, occasionally failed.

The main drawback of the SpikeProp method is that there is no mechanism to "prop-up" the synaptic weights once the postsynaptic neuron no longer fires for any input pattern. Moreover, in the SpikeProp approach only the first spike produced by a neuron is relevant and the rest of the time course of the neuron is ignored. Once a neuron has fired a single spike, it is not allowed to fire again. For this reason the method cannot learn patterns consisting of multiple spikes. Thus it is suitable only for implementing the 'time-to-first-spike' coding scheme ([1]).

2.2. Statistical methods. In ([29,30]), the authors proposed to derive a supervised spike-based learning algorithm starting from statistical learning criteria. Their method is based on the approach proposed by Barber. However, in ([31]) the author considered supervised learning for neurons operating on a discrete time scale. Pfister and colleagues extended this study to the continuous case.

The fundamental hypothesis in ([29]) and ([30]) is that the instantaneous firing rate of the postsynaptic neuron $j$ is determined by a point process with time-dependent stochastic intensity $\rho_{j}(t) = g(V^{m}_{j}(t))$ that depends nonlinearly upon the membrane potential $V^{m}_{j}(t)$. The firing rate $\rho_{j}(t)$ is known as the escape rate ([4]).

The goal of the considered learning rule is to optimise the weights $w_{j}$ in order to maximise the likelihood of obtaining the postsynaptic firing times $S^{out}_{j}(t) = S^{d}_{j}(t)$, given the firing rate $\rho_{j}(t)$. The optimisation is performed via gradient ascent on the likelihood of the postsynaptic firing for one or several desired firing times.

The advantage of the discussed
probabilistic approach is that it allows one to describe explicitly the likelihood $P_{j}\left(S^{out}_{j}(t) \mid S^{in}_{j}(t)\right)$ of emitting a spike train $S^{out}_{j}(t)$ for a given input $S^{in}_{j}(t)$. Moreover, since this likelihood is a smooth function of its parameters, it is straightforward to differentiate it with respect to the synaptic efficacies $w_{j}$.

On the basis of this remark the authors proposed a rule for the synaptic weight modifications that can be described by a two-phase learning window similar to that of Spike-Timing Dependent Plasticity (STDP) ([19,20]). The authors demonstrated that the shape of the learning window was strongly influenced by the constraints imposed by the different scenarios of the optimization procedure.

The described learning rule applies to all synaptic inputs of the learning neuron. It is also assumed that the postsynaptic neuron $j$ receives an additional 'teaching' input $I(t)$ that could arise either from a second group of neurons or from intracellular current injection. The role of $I(t)$ is to increase the probability that the neuron fires at or close to the desired firing time $t^{d}_{j}$. In this context the learning mechanism can also be viewed as a probabilistic version of the spike-based Supervised-Hebbian learning (described in section 2.6).

In ([30]) the authors present a set of experiments which differ in the stimulation mode and the specific tasks of the learning neuron. The algorithm is applied to the spike response model (SRM) with escape noise as a generative model of the neuron ([4]). The authors consider different scenarios of the experiments:
• different sources of the 'teaching' signal (the signal is given by a supervisor as a train of spikes or as a strong current pulse of short duration);
• allowing (or not) other postsynaptic spikes to be generated spontaneously;
• implementing a temporal coding scheme where the postsynaptic neuron responds to one of the presynaptic spike patterns with a desired output spike train containing several spikes while staying inactive for the other presynaptic spike patterns.

The
experiments demonstrate the ability of the learning method to precisely set the time of single firings at the neuron output. However, since in all experiments a desired postsynaptic spike train consisted of at most 2 spikes, it is hard to estimate the potential practical suitability of the proposed method for learning complex spike trains consisting of dozens of spikes.

2.3. Linear algebra methods. Carnell and Richardson proposed to apply the apparatus of linear algebra to the task of spike-time learning ([32]). The authors begin with definitions of the inner product, orthogonality and projection operations for time series of spikes. They also introduce a specific metric (norm) as a measure of the difference between two given time series. On the basis of these definitions the authors formulate two algorithms for the approximation of the target pattern $S^{d}(t)$, given a set of input patterns $S^{in}(t)$ and a set of adjustable synaptic weights $w$:
(1) Gram-Schmidt solution: the Gram-Schmidt process ([33,34]) is used to find an orthogonal basis for the subspace spanned by a set of input time series $S^{in}(t)$. Having the orthogonal basis, the best approximation in the subspace to any given element of $S^{d}(t)$ can be found.
(2) Iterative solution: the projection of the error $E$ onto the direction of the time series $S^{in}_{i}$ is evaluated, with $i$ randomly chosen in each iteration. The error is defined as the difference between the target and the actual time series, $E = S^{d}(t) - S^{out}(t)$. The algorithm is iterated until $\mathrm{norm}(E)$ is sufficiently small.

The authors demonstrated in a set of experiments that the iterative algorithm is able to approximate the target time series of spikes. The experiments were performed with the Liquid State Machine (LSM) network architecture ([35,36]) and LIF neurons ([4]). Only an output neuron was subjected to learning.

The approximated spike trains consisted of 10 spikes (spanned within a 1 second interval). In the successful training case, an input vector $S^{in}(t)$ was generated by 500 neurons.
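The iterative solution (2) described above can be sketched in a few lines. The following is only a minimal illustration under simplifying assumptions that are not taken from ([32]): spike trains are turned into time series by filtering with an exponential kernel (the hypothetical helper `spikes_to_series`), the ordinary Euclidean inner product is used, and the output is a linear weighted sum of the input series rather than the spike train of a LIF neuron.

```python
import numpy as np

def spikes_to_series(spike_times, t_grid, tau=0.01):
    """Assumed representation: each spike adds a decaying exponential kernel."""
    series = np.zeros_like(t_grid)
    for ts in spike_times:
        mask = t_grid >= ts
        series[mask] += np.exp(-(t_grid[mask] - ts) / tau)
    return series

def iterative_projection(X, d, n_iter=2000, tol=1e-6, seed=0):
    """Approximate target d by a weighted sum of the rows of X.

    In each iteration the current error is projected onto one randomly
    chosen input series and the corresponding weight is corrected.
    """
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[0])
    err = d - w @ X                      # E = S_d - S_out
    for _ in range(n_iter):
        if np.linalg.norm(err) < tol:    # stop when norm(E) is small enough
            break
        i = rng.integers(X.shape[0])     # random input index
        xi = X[i]
        denom = xi @ xi
        if denom == 0.0:                 # silent input carries no direction
            continue
        step = (err @ xi) / denom        # projection coefficient of E on x_i
        w[i] += step
        err = err - step * xi            # error is now orthogonal to x_i
    return w, err

# Toy data (hypothetical sizes): 20 inputs and a target, 10 spikes each, 1 s.
t_grid = np.linspace(0.0, 1.0, 1000)
data_rng = np.random.default_rng(1)
X = np.stack([spikes_to_series(np.sort(data_rng.uniform(0.0, 1.0, 10)), t_grid)
              for _ in range(20)])
d = spikes_to_series(np.sort(data_rng.uniform(0.0, 1.0, 10)), t_grid)
w, err = iterative_projection(X, d)
print("initial error norm:", np.linalg.norm(d))
print("final error norm:  ", np.linalg.norm(err))
```

Each step removes from the error its component along the chosen input series, so the error norm cannot increase from one iteration to the next. How small it eventually becomes depends on how well the inputs span the target.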
A good approximation of $S^{d}(t)$ was obtained after about 600 iterations.

The presented results revealed that the ability of the method to produce the desired target patterns is strongly influenced by the number and variability of spikes in $S^{in}(t)$, and that the quality of approximation increased for longer sequences of spikes. This is a common conclusion for all LSM systems.

As a final remark, we state that the presented algorithm ([32]) is one of only a few algorithms that enable learning of patterns consisting of multiple spikes. However, the algorithm updates weights in a batch mode and for this reason it is not suitable for online learning. In some applications this can be considered a drawback.

2.4. Evolutionary methods. In ([37]), the authors investigate the viability of evolutionary strategies (ES) for supervised learning in spiking neural networks. The use of an evolutionary strategy is motivated by the ability of ES to work on real numbers without complex binary encoding schemes. ES have proved to be well suited for solving continuous optimisation problems ([38]).

Unlike in genetic algorithms, the primary search operator in ES is mutation. A number of different mutation operators have been proposed. The traditional mutation operator adds to the alleles of the genes in the population a random value generated according to a Gaussian distribution. Other mutation operators include the use of the Cauchy distribution. The use of the Cauchy distribution allows exploration of the search space by making large mutations and helps to prevent premature convergence. On the other hand, the use of Gaussian mutation allows exploitation of the best solutions found in a local search.

In this algorithm not only the synaptic strengths but also the synaptic delays are adjustable parameters. The spiking network is mapped to a vector of real values, which consists of the weights and delays of the synapses. A set of such vectors (individuals) forms the population evolving according to the ES. The population
is expected to converge to a globally optimal network, tuned to the particular input patterns.

The learning properties of the algorithm were tested in a set of classification tasks with the XOR and Iris benchmark datasets. SRM neuron models and feed-forward, fully connected spiking networks were used. Similarly to ([22]), the analog values were mapped into firing delays. The authors reported results comparable to those obtained with known classification algorithms (BP, LM, SpikeProp).

One limitation of the algorithm arises from the fact that each neuron is allowed to generate at most a single spike during the simulation time. Therefore the method is not suitable for learning patterns consisting of multiple spikes. Another disadvantage, common to all evolutionary algorithms, is that computation with this approach is very time consuming.

2.5. Learning in Synfire Chains. Human learning often involves relating two signals separated in time, or linking a signal, an action and a subsequent effect into a causal relationship. These events are often separated in time, but humans can nonetheless link them, which allows them to accurately predict the right moment for a particular action. Synfire chains (SFC) are considered a possible mechanism for representing such relations between delayed events. An SFC is a feedforward multi-layered architecture (a chain), in which spiking activity can propagate in a synchronous wave of neuronal firing (a pulse packet) from one layer of the chain to the successive ones ([39]).
Each step in the SFC requires a pool of neurons whose firings simultaneously raise the potential of the next pool of neurons to the firing level. In this mechanism each cell of the chain fires only once.

In ([40]), a specific neural architecture, INFERNET, is introduced. The architecture is an instance of the SFC. Its structure is organized into clusters of nodes called subnets. Each subnet is fully connected. Some subnet nodes have connections to external subnet nodes. The nodes are represented here by a simple model similar to SRM ([4]).

The learning task is to reproduce the temporal relation between two successive inputs (the first one presented to the first layer of the SFC and the latter one, considered as the 'teaching' signal, given to the last layer). Thus the task is to find a link between the firing input nodes and the firing target nodes with a target time delay.

Two successive inputs can be separated by several tenths of a second, and a single connection alone cannot be responsible for such long delays. Therefore a long chain of successive pools of node firings might be required.

In the reported approach the particular synaptic connections are modified by a rule similar to STDP, but with an additional non-Hebbian term. In our opinion this implies that the synaptic weights between the particular neurons must be strong enough to ensure that the wave of excitation will eventually reach the output subnet. This is the necessary condition which guarantees that the Hebbian rules will be activated.

In ([40]), the author discussed experiments in which two inputs are presented, one (the probe) at time 0 ms and one (the target) some time later. The task for the network was to correctly reproduce the temporal association between these two inputs and therefore build an SFC between them. Once trained, the network was able to trigger this synfire chain whenever the first input was presented.

In this task the author reported some difficulties. The algorithm could correctly reinforce a
connection that led to the probe node firing at the right time, but could not in general prevent the target nodes from firing earlier if some other 'inter-nodes' had fired several times before. Indeed, a careful analysis of the learning equations confirms that there is no rule for avoiding spurious firing.

We conclude that the learning method under consideration represents an interesting approach to the spike-time learning problem in SNN. In this method it is assumed that the time of postsynaptic neuron firing depends mostly on the signal propagation delay in the presynaptic neurons. The 'time-weight' dependence is neglected. The author focuses on modifying the topology of the network to obtain the desired delay between the signal delivered to the network input and the signal generated at the network output. However, with this approach the objective function (the desired time delay) is not a continuous function of the parameters (synaptic weights) of the optimization algorithm. For this reason the algorithm can be considered a discrete optimization technique. The precision attainable with this approach takes values not from a continuous domain but from a finite set of possible solutions (since the global delay is a combination of fixed component delays, which constitute a finite set). The quality of approximation depends in general on the number and diversity of connection delays.

Another limitation of the method is, again, that it can learn only single firing times and thus can be applied only to the 'time-to-first-spike' coding scheme.

The author claims that the method makes it possible to learn many synfire chains sequentially. This property would be very interesting in the context of real-life applications. Unfortunately, the cited article does not describe how this multi-learning can be achieved.

2.6. Spike-based supervised-Hebbian learning. In this subsection we discuss methods that represent the so-called Supervised-Hebbian Learning (SHL) approach. In this approach Hebbian processes [41] are supervised by an
additional 'teaching' signal that forces the postsynaptic neuron to fire at the target times. The 'teaching' signal can be transmitted to the neuron in the form of synaptic currents or as intracellularly injected currents.

Ruf and Schmitt [42] proposed one of the first spike-based methods similar to the SHL approach. In their first attempt, they defined the learning rule for monosynaptic excitation. The learning process was based on three spikes (two presynaptic and one postsynaptic) generated during each learning cycle. The first presynaptic spike, at time $t^{in}_{1}$, was considered an input signal, whereas the second presynaptic spike, at $t^{in}_{2} = t^{d}$, pointed to the target firing time for the postsynaptic neuron. The learning rule reads: $\Delta w = \eta (t^{out} - t^{d})$, where $\eta > 0$ is the learning rate and $t^{out}$ is the actual time of the postsynaptic spike. This learning rule was applied after every learning cycle. It is easy to demonstrate that under certain conditions $t^{out}$ converges to $t^{d}$.

With this method it was possible to train only a single synaptic input, whereas neurons usually receive their inputs from several presynaptic neurons. The corresponding synaptic weights could still be learned in the way described above if the weights were learned sequentially (a single synapse per learning cycle). This is, however, a very inefficient approach. As a solution to this problem the authors proposed a parallel algorithm. Surprisingly, although this algorithm is considered an extension of the monosynaptic rule, it does not aim at achieving the desired timing of the postsynaptic neuron. Instead, the goal is to modify the synaptic weights to approach some target weight vector $w^{d}$ given by the difference between pre- and postsynaptic firing times, that is $w^{d}_{i} = (t^{d} - t^{in}_{i})$ for any presynaptic neuron $i$. The authors claim that such an approach can be useful in temporal pattern analysis in SNN; however, no details are given to explain it.

A thorough analysis of Supervised-Hebbian learning in the
context of spiking neurons was performed by Legenstein, Naeger and Maass ([43]). The learning method considered by the authors implements the STDP process with supervision realised by extra input currents injected into the learning neuron. These currents forced the learning neuron to fire at the target points in time and prevented it from firing at other times.

The authors investigated the suitability of this approach for learning any given transformation of input to output spiking sequences. It is well known that the common version of STDP always produces a bimodal distribution of weights, where each weight assumes either its minimal or its maximal possible value. Therefore, in this article the authors considered mostly target transformations that could be implemented with such a bimodal distribution of weights.

The authors reported a set of experiments in which they considered different options of uncorrelated and correlated inputs with a pure and a noisy teacher signal. The learning algorithm was also tested with a multiplicative variation of STDP ([44]). In contrast to standard STDP, this modified rule produced intermediate stable weight values. However, the authors reported that learning with this modified version of STDP was highly sensitive to input signal distributions.

In all experiments LIF neuron models and dynamic synapse models were used ([45,46]).
However, synaptic plasticity was considered only for the excitatory connections.

The results reported in ([43]) demonstrated that the learning algorithm was able to approximate the given target transformations quite well. These positive results were achieved not only for the case where the synaptic weights were the adjustable parameters, but also for a more realistic interpretation suggested by experimental results, where STDP modulated the initial release probability of dynamic synapses ([46]).

Legenstein and colleagues proved that the method converges on average for arbitrary uncorrelated Poisson input spike trains. On the other hand, the authors demonstrated that convergence cannot be guaranteed in the general case.

The authors reported the following drawback of the considered algorithm: since the teacher currents suppress all undesired firings during training, the only correlations of pre- and postsynaptic activities occur around the target firing times. At other times there is no correlation and thus no mechanism to weaken those synaptic weights that lead the neuron to fire at undesired times during the testing phase.

Another reported problem is common to all Supervised-Hebbian approaches: synapses continue to change their parameters even if the neuron already fires exactly at the desired times. Thus stable solutions can be achieved only by applying some additional constraints or extra learning rules to the original SHL.

Despite these problems, the presented approach shows a high ability to implement the precise spike timing coding scheme. Moreover, this is the first method, among those presented so far in this article, that enables learning of target transformations from input to output spike trains.
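As an illustration of the monosynaptic rule of Ruf and Schmitt discussed earlier in this subsection, $\Delta w = \eta (t^{out} - t^{d})$, the sketch below applies the rule once per learning cycle. The neuron is replaced by a toy latency model that is purely an assumption made for this example: the postsynaptic spike occurs `k / w` time units after the input spike, so that a stronger excitatory synapse produces an earlier spike. The function names and all parameter values are hypothetical.

```python
def firing_time(w, t_in=0.0, k=10.0):
    """Toy latency model (assumption): a stronger synapse fires the neuron earlier."""
    return t_in + k / w

def train_monosynaptic(w0, t_d, eta=0.05, n_cycles=200):
    """Apply delta_w = eta * (t_out - t_d) after every learning cycle."""
    w = w0
    history = []
    for _ in range(n_cycles):
        t_out = firing_time(w)       # actual postsynaptic firing time
        history.append(t_out)
        w += eta * (t_out - t_d)     # late spike -> strengthen, early -> weaken
    return w, history

w_final, t_history = train_monosynaptic(w0=1.0, t_d=5.0)
print("first firing time:", t_history[0])
print("last firing time: ", t_history[-1])
print("final weight:     ", w_final)
```

With these values the firing time starts at 10 and approaches the target of 5 as the weight approaches 2. Convergence here relies on the learning rate being small relative to the slope of the latency model, which is one example of the "certain conditions" mentioned in the text.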

2.7. ReSuMe - Remote Supervision. We have seen in section 2.6 that the supervised-Hebbian approach demonstrated interesting learning properties. With this approach it was feasible not only to learn the desired sequences of spikes, but also to reconstruct the target input-output transformations. Moreover, this approach inherited interesting properties of the traditional Hebbian paradigm: it is local in time and space, simple and thus suitable for online processing. On the other hand, it was demonstrated that SHL displays several serious disadvantages that may cause problems when more complex learning tasks are considered.

Figure 1. Mechanisms underlying ReSuMe learning: (A) Remote supervision concept. The target spike train, transmitted via neuron $n^{d}_{j}(i)$, is not directly delivered to the learning neuron $n^{out}_{i}$. However, it determines (illustrated by the dotted line) the changes of the synaptic efficacy $w_{ki}$ between a presynaptic neuron $n^{in}_{k}(i)$ and $n^{out}_{i}$. (B, C) Learning windows. Changes of the synaptic efficacy $w_{ki}$ are triggered by target or postsynaptic action potentials. The amplitude of change is determined by the functions $W^{d}(s^{d})$ and $W^{out}(s^{out})$, called learning windows.

Here we discuss ReSuMe, the Remote Supervised Method proposed in ([47]). It is argued that the method possesses the interesting properties of the SHL approach while avoiding its drawbacks.

The goal of ReSuMe learning is to impose on a neural network the desired input-output properties, that is, to produce the desired spike trains in response to the given input sequences.

ReSuMe takes advantage of Hebbian (correlation) processes and integrates them with a novel concept of remote supervision. The name 'remote supervision' comes from the fact that the target signals are not directly delivered to the learning neurons (as is the case in SHL); however, they still co-determine the changes of the synaptic efficacies in the connections terminating at the learning neurons. This is
schematically illustrated in Fig. 1.A.

In more detail, a synaptic efficacy $w_{ki}$ between any given presynaptic neuron $n^{in}_{k}(i)$ and a corresponding postsynaptic neuron $n^{out}_{i}$ is modified according to two rules. The first rule depends on the correlation between the firing times of $n^{in}_{k}(i)$ and $n^{out}_{i}$. The second rule is determined by the correlation between the firing times of $n^{in}_{k}(i)$ and $n^{d}_{j}(i)$. By $n^{d}_{j}(i)$ we denote a 'teacher' neuron delivering the target signal for $n^{out}_{i}$. For the excitatory synapses these two rules have forms similar to STDP and anti-STDP and are described by the learning windows $W^{d}(s^{d})$ and $W^{out}(s^{out})$ (Fig. 1.B and 1.C). The parameters $s^{d}$ and $s^{out}$ denote the time delays $(t^{d}_{j} - t^{in}_{k})$ and $(t^{out}_{i} - t^{in}_{k})$, respectively. For the inhibitory synapses the learning windows differ from $W^{d}(s^{d})$ and $W^{out}(s^{out})$ only in sign. The balance of the learning rules, defined for each synapse, leads to the optimal weight values required for obtaining the desired timing of spikes at the learning neurons.

The ReSuMe method is biologically plausible, since it is based on Hebbian-like processes. The remote supervision concept applied in ReSuMe can also be biologically justified on the basis of heterosynaptic plasticity, a phenomenon recently observed in neurophysiological experiments ([14,17,48]).

The high learning ability of ReSuMe has been confirmed through extensive simulation experiments ([47,49,50]). Here we present the results of an experiment discussed in ([47]), where ReSuMe was used to train the network to produce the desired sequence of spikes $S^{d}(t)$ in response to the given, specified input spike train $S^{in}(t)$ (Fig. 2). ReSuMe was applied to an LSM network consisting of 800 LIF neurons. Both the $S^{in}(t)$ and $S^{d}(t)$ signals were generated randomly over a time interval of 400 ms (Fig. 2.A, C). A single learning neuron was trained over 100 learning sessions. An
