Often patterns in the real world consist of many variables, but we are not interested in all of these
– there may, though, be certain features that we hope to detect or classify – eg. when examining our genes, we are all unique (so we would each represent a unique point in multidimensional “gene-space”), but we can mostly be classified into “races” based upon features like skin, hair and eye colour, etc.
– the identification of groups or sections of the input data which are “similar” according to some criterion (usually distance in input space)
The pattern set (input vectors) is presented to the network to determine the decision boundaries required to identify possible clusters
Weights should be initialised to random values (see page 72 of the book, or the Word6 file). During learning, the weights gradually move towards the centres of the clusters
Counterpropagation Networks
To overcome the problem of linearly nonseparable data, counterpropagation networks were proposed by Hecht-Nielsen (1987). They can also be used for associative mappings, data compression and classification. The hidden layer is a winner-take-all layer: the winning (hidden) neuron outputs a value of 1 (the rest output zero). The output layer uses a discrete bipolar activation function f()
Assumes that the number of clusters is known (say p). The architecture is a feedforward (single-layer) neural network with continuous outputs:
– number of inputs = dimension of input patterns
– number of outputs = p
The output layer's learning rule is supervised, though:
– outstar learning rule (don't worry about the details)
– just understand that the counterpropagation network combines unsupervised and supervised learning, enabling clusters in linearly nonseparable data to be detected (see the sketch below)
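To make that combination concrete, here is a minimal forward-only counterpropagation sketch in Python/NumPy. This is my own illustration, not the lecture's code: the class name, hyperparameters and the XOR example are all assumptions. The hidden layer learns by winner-take-all, the output layer by the outstar rule:

    import numpy as np

    class CounterpropSketch:
        """Forward-only counterpropagation sketch: a winner-take-all
        (Kohonen) hidden layer trained without supervision, plus an
        outstar output layer trained on the targets."""

        def __init__(self, n_in, n_hidden, n_out, seed=0):
            rng = np.random.default_rng(seed)
            self.W = rng.random((n_hidden, n_in))  # input -> hidden weights (random init)
            self.V = np.zeros((n_hidden, n_out))   # hidden -> output (outstar) weights

        def winner(self, x):
            # the winning hidden neuron has the weight vector closest to x
            return int(np.argmin(np.linalg.norm(self.W - x, axis=1)))

        def train(self, X, Y, alpha=0.1, beta=0.1, epochs=100):
            for _ in range(epochs):
                for x, y in zip(X, Y):
                    m = self.winner(x)
                    self.W[m] += alpha * (x - self.W[m])  # unsupervised winner-take-all step
                    self.V[m] += beta * (y - self.V[m])   # supervised outstar step
            return self

        def predict(self, x):
            # the winner outputs 1, the rest 0, so the network output is the
            # winner's outstar weights passed through a discrete bipolar f()
            return np.sign(self.V[self.winner(x)])

    # XOR is linearly nonseparable, yet the network maps it correctly once
    # each input pattern captures its own hidden neuron
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
    Y = np.array([[-1], [1], [1], [-1]], dtype=float)  # bipolar targets
    net = CounterpropSketch(2, 4, 1).train(X, Y)
    print([int(net.predict(x)[0]) for x in X])  # [-1, 1, 1, -1]

With purely random initial weights there is no guarantee that every hidden neuron wins some pattern; practical implementations often add a “conscience” mechanism or seed the weights from training samples.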
K-Means (K=3)
[Figure: K-means with K=3 – the centres start at random positions, then the algorithm alternates between assigning data to the closest cluster and moving each centre to the centroid of its cluster, until the centres stabilise]
Neural Network Approaches
One important measure of similarity is distance
– the Euclidean distance between two points x and y is defined as:
||x − y|| = √( (x1 − y1)² + (x2 − y2)² )    (based on Pythagoras' Theorem)
– In the N-dimensional plane:
||x − y|| = √( ∑i=1..N (xi − yi)² )
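As a quick illustration (my own addition, not from the lecture), the N-dimensional formula translates directly into NumPy:

    import numpy as np

    def euclidean_distance(x, y):
        # ||x - y|| = sqrt(sum_i (x_i - y_i)^2)
        x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
        return float(np.sqrt(np.sum((x - y) ** 2)))

    print(euclidean_distance([1, 2, 3], [4, 6, 3]))  # 5.0 (a 3-4-5 triangle in the first two dimensions)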
Outputs are calculated, and the output neuron with the largest value is declared the “winner” (say neuron m, for max)
The weights connecting each input to this winning neuron are then updated according to ∆wmi = α(xi − wmi), where α is the learning rate
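A minimal sketch of this learning loop, assuming NumPy; the function name and toy data are my own, and the winner here is chosen as the neuron whose weight vector is closest to the input, one common formulation:

    import numpy as np

    def wta_step(W, x, alpha=0.1):
        """One winner-take-all step: the neuron whose weight vector is
        closest to x wins, and only its weights move towards x."""
        m = int(np.argmin(np.linalg.norm(W - x, axis=1)))  # winning neuron
        W[m] += alpha * (x - W[m])                         # delta w_mi = alpha (x_i - w_mi)
        return m

    # two output neurons, weights initialised to random values
    rng = np.random.default_rng(0)
    W = rng.random((2, 2))
    patterns = np.array([[0.0, 0.0], [0.1, 0.0], [0.9, 1.0], [1.0, 1.0]])
    for _ in range(50):
        for x in patterns:
            wta_step(W, x)
    print(W)  # each row has drifted towards the centre of one cluster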
Sometimes, though, the relationships cannot be measured by distance (clustering). We need to be able to extract similar features from the input patterns instead
Feature Mapping
– nb. no learning occurs if w = x
– the learning keeps pushing each weight toward the input pattern which fires its neuron
Because of the single-layer architecture, though, it cannot cluster linearly nonseparable data (for the same reasons as the Perceptron)
Clustering based on distance
– Winner-Take-All networks
– Counterpropagation networks
Feature extraction methods
– self-organising feature maps
Winner-Take-All Learning
Traditional Clustering
Clustering serves several purposes
– allows us to find unusual patterns in large data sets
– preprocessing data to remove outliers
– provides an alternative view of the data by allowing natural data structures and divisions to form
– supervised learning can also be applied to the data within a cluster, so that clustering effectively reduces the size of the training set
No information is available from a teacher about the desired network response, so the similarity measure is used
These networks can learn by examining the inputs against some “similarity” measure. An important application in data processing is clustering
Self-Organising neural networks let the data discover similarities among themselves
K-Means Algorithm
– choose the number of clusters K, and their centres (randomly)
– assign all data points to their closest cluster
– recalculate the centre of each cluster as the centroid of all data points in that cluster
– repeat until the cluster centres have stabilised (see the sketch below)
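A compact NumPy version of these steps (my own sketch, not the lecture's code; the initial centres are drawn from the data points):

    import numpy as np

    def k_means(X, k, max_iters=100, seed=0):
        rng = np.random.default_rng(seed)
        centres = X[rng.choice(len(X), size=k, replace=False)]  # random initial centres
        for _ in range(max_iters):
            # assign each data point to its closest centre (Euclidean distance)
            labels = np.argmin(np.linalg.norm(X[:, None] - centres[None, :], axis=2), axis=1)
            # move each centre to the centroid of its assigned points
            new_centres = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                                    else centres[j] for j in range(k)])
            if np.allclose(new_centres, centres):  # centres have stabilised
                break
            centres = new_centres
        return centres, labels

    # toy run: 60 two-dimensional points in three loose groups
    rng = np.random.default_rng(1)
    X = np.vstack([rng.normal(c, 0.1, (20, 2)) for c in [(0, 0), (3, 0), (0, 3)]])
    centres, labels = k_means(X, k=3)
    print(centres)  # at convergence, each centre is the centroid of its cluster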
Clustering Algorithms
K-means is a statistical technique which tries to find K typical (or average) points in the data – ie. it forces K clusters
– each data point belongs to the cluster to which it is closest (in distance)
BUS3650 / BUS4650 / BUS5650
Business Applications of Neural Networks
Lecture 8
© Dr. Kate A. Smith
This Lecture
Self-Organisation
The process of discovering relationships between inputs without any feedback from the environment during learning
– ie. unsupervised learning
The relationships discovered will be translated into outputs. We can use such networks to discover patterns, features, regularities or categories without a “teacher” (eg. without even knowing what the initial categories were)
– all other weights are unchanged
– the winning neuron takes all the learning, hence “winner-take-all”
It isn't necessary to calculate the actual output, since the winning neuron will also have the largest net input: just calculate the net inputs and take the largest as the winner. This is also called competitive learning
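One caveat worth making explicit (my addition, not in the lecture): “largest net input” also agrees with “closest weight vector” when the weight vectors are normalised, since ||w − x||² = 1 + ||x||² − 2w·x for unit-length w. A quick NumPy check:

    import numpy as np

    rng = np.random.default_rng(1)
    x = rng.random(4)
    W = rng.random((5, 4))
    W /= np.linalg.norm(W, axis=1, keepdims=True)  # normalise each weight vector

    # largest net input and smallest distance name the same winner
    assert np.argmax(W @ x) == np.argmin(np.linalg.norm(W - x, axis=1))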
– Clustering
– Winner-Take-All learning
– Counterpropagation
– Kohonen's Feature Map
– Learning Vector Quantisation
– Applications
What is Self-Organisation?