Neural networks:
Idea:
There could be many more than 2 features, i.e. x1, x2, x3, ..., xn.
We would then need ϴ0, ϴ1, ..., ϴv, where 'v' is the number of
terms in the polynomial hypothesis.
Think about letting a computer recognize a picture of a car.
A 50 x 50 pixel image -> 2500 pixels
x = [
pixel 1 intensity (0-255)
pixel 2 intensity (0-255)
...
pixel 2500 intensity (0-255)]
So, if we still want to include all the quadratic combinations of these pixels,
i.e. every product xi * xj, there would be about 2500^2 / 2 ≈ 3 million features. How's that?!
Way too large for the previous learning algorithms.
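A quick sketch of where that count comes from (a minimal Python example, not part of the original notes):

n = 50 * 50                    # 2500 raw pixel-intensity features
quadratic = n * (n + 1) // 2   # all products xi * xj with i <= j
print(quadratic)               # 3,126,250 -> roughly 3 million features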
Here's where neural networks come in...
-------
NN:
Algorithms that try to mimic the brain.
Their popularity diminished in the late 90's; however, thanks to distributed/powerful computing,
NNs have come alive again.
Auditory cortex:
Normally used for hearing. However, if we cut off the hearing input and reroute
the signal from the eyes into it, the auditory cortex will start learning to see.
Idea:
Plug any sensor into the brain, and the brain will start to learn from it.
---
NN model representation I:
Neuron:
input wire: Dendrite
output wire: Axon
-----
Artificial neuron design:
Neuron model: logistic unit
Input:
x0(bias unit, always equal to 1), x1, x2, x3 (as features)
processing:
1 neuron
output:
h(x)
Sigmoid (logistic) activation function.
i.e:
g(z) = 1 / ( 1+e^(-z) )
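A minimal sketch of one such logistic unit (the feature values and weights here are made up for illustration):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# one neuron: weighted sum of the inputs, then the sigmoid activation
x = np.array([1.0, 0.5, -1.2, 3.0])       # x0 = 1 is the bias unit
theta = np.array([0.1, 2.0, -0.5, 0.3])   # hypothetical weights
h = sigmoid(theta @ x)                    # h(x) = g(theta' x)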
----
Now, let's talk about network of neurons.
input: (layer 1)
x0(bias unit, always equal to 1), x1, x2, x3 (as features)
processing: (layer 2, aka hidden layer)
multiple neurons (plus a bias unit a0^(2), always equal to 1)
output: (layer 3)
h(x)
---
So,
ai^(j) = "activation" of unit i in layer j
ϴ^(j) = matrix of weights controlling function mapping from
layer j to layer j + 1
---
Vectorize the computation:
a1^(2) = g( z1^(2) )
where z1^(2) = ϴ10^(1)x0 + ϴ11^(1)x1 + ϴ12^(1)x2 + ϴ13^(1)x3
In vector form:
z^(2) = ϴ^(1)x and a^(2) = g( z^(2) )
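A sketch of this vectorized forward propagation for one hidden layer (the weight matrices are random placeholders, just to show the shapes):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([1.0, 0.5, -1.2, 3.0])   # [x0 = 1, x1, x2, x3]
Theta1 = np.random.randn(3, 4)        # layer 1 (3 features + bias) -> layer 2 (3 units)
Theta2 = np.random.randn(1, 4)        # layer 2 (3 units + bias) -> output

z2 = Theta1 @ x                       # z^(2) = ϴ^(1) x
a2 = sigmoid(z2)                      # a^(2) = g( z^(2) )
a2 = np.insert(a2, 0, 1.0)            # add the bias unit a0^(2) = 1
h = sigmoid(Theta2 @ a2)              # h(x) = g( ϴ^(2) a^(2) )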
----
Neural network learning its own features:
Rather than taking the raw sensor input from layer 1,
i.e. x1, x2, x3,
the output unit uses a1, a2, a3 (the hidden layer activations) as its new inputs.
-----
Ok, let's unmask layer 1.
The inputs a1^(2), a2^(2), a3^(2) that the output unit sees are not hand-designed features;
they are themselves computed from x1, x2, x3 using the learned weights ϴ^(1).
So the output layer is just logistic regression, but on features the network learned itself, neat!
-----
Neural networks can be composed into different architectures,
i.e. with multiple layers; but still,
there is only 1 input layer and 1 output layer. All the others are hidden layers.
-----
Examples:
Non-linear classification example: XOR/XNOR
And Function:
Or Function:
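A sketch of single-neuron AND and OR units (these particular weight values are a common choice, assumed here rather than given in the notes above):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def unit(theta, x1, x2):
    return sigmoid(theta @ np.array([1.0, x1, x2]))   # prepend the bias x0 = 1

theta_and = np.array([-30.0, 20.0, 20.0])   # outputs ~1 only when x1 = x2 = 1
theta_or  = np.array([-10.0, 20.0, 20.0])   # outputs ~1 when x1 = 1 or x2 = 1

for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x1, x2, round(unit(theta_and, x1, x2)), round(unit(theta_or, x1, x2)))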
-------
ϴ is also called 'weights'
Negation:
---
Pipeline them together!
(Not x1) And (Not x2)
---
x1 XNOR x2
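A sketch of the full composition: hidden units for x1 AND x2 and (NOT x1) AND (NOT x2), then OR on top, gives x1 XNOR x2 (the weights are the same assumed values as above, plus [10, -20, -20] for the NOT-AND-NOT unit):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

Theta1 = np.array([[-30.0,  20.0,  20.0],    # a1^(2): x1 AND x2
                   [ 10.0, -20.0, -20.0]])   # a2^(2): (NOT x1) AND (NOT x2)
Theta2 = np.array([-10.0, 20.0, 20.0])       # output: a1^(2) OR a2^(2)

def xnor(x1, x2):
    a2 = sigmoid(Theta1 @ np.array([1.0, x1, x2]))   # hidden layer
    return sigmoid(Theta2 @ np.insert(a2, 0, 1.0))   # output layer

for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x1, x2, round(xnor(x1, x2)))   # 1, 0, 0, 1 -> XNOR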
---
NN representation of Multi-class classification:
e.g. the digits 0-9: there are 10 categories.
example:
Pedestrian
Car
Motorcycle
Truck
Here h(x) ∈ R^4.
For Pedestrian: [1; 0; 0; 0]
For Car:        [0; 1; 0; 0]
For Motorcycle: [0; 0; 1; 0]
For Truck:      [0; 0; 0; 1]
Re-representing y ∈ {1,2,3,4} so that
y^(i) is one of [1;0;0;0] ... [0;0;0;1]
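A small sketch of this re-representation (the helper name to_one_hot is made up for illustration):

import numpy as np

def to_one_hot(label, num_classes=4):
    # turn a class label in 1..num_classes into a vector like [0; 1; 0; 0]
    y = np.zeros(num_classes)
    y[label - 1] = 1.0
    return y

print(to_one_hot(1))   # Pedestrian  -> [1. 0. 0. 0.]
print(to_one_hot(3))   # Motorcycle  -> [0. 0. 1. 0.]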
Non-linear hypotheses.