Make gradient descent find the global minimum faster.
Idea:
Make sure features are on a similar scale.
e.g.
unify the feature scale to 0 <= x <= 1
i.e.
if a feature's scale is 0 ~ 2000 (max),
divide it by 2000 (the range, meaning max - min); then we end up with values
between 0 and 1.
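The divide-by-range step above can be sketched in NumPy; the feature values here are made up for illustration:

```python
import numpy as np

# Hypothetical feature column on a 0 ~ 2000 scale (e.g. house sizes).
x = np.array([0.0, 500.0, 1200.0, 2000.0])

# Divide by the range (max - min) so every value lands in [0, 1].
x_scaled = x / (x.max() - x.min())
print(x_scaled)  # [0.   0.25 0.6  1.  ]
```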
This makes the contours of the cost function closer to circles, which makes
finding the global minimum much easier.
Thus, we want every feature in an *approximately*
-1 <= Xi <= 1 range.
X0 is 1 by definition.
Mean normalization:
i.e.
subtract the mean of that feature from each input value.
=> X1 = (input - mean) / 2000 (the range of the scale, OR the standard deviation)
This brings the values closer to 0.