So the videos (and PDF files) are organized around processing one training example at a time.
The course uses column vectors (in most cases), so h (a scalar for one training example) is theta' * x.
Lower-case x typically indicates a single training example.
That is, each training example x is one row, and stacking the rows gives:

    [ x x x ]
    [ x x x ] * theta = h    (each row "x x x" is one training example)
    [ x x x ]
Always use X as the matrix of all training examples,
with each example as a row and the features as columns.
m = number of training examples
n = number of features
A single column of X holds one feature across all the examples:

    [ x ]
    [ x ]    (three examples of one feature)
    [ x ]
h (the vector of hypothesis values for the entire training set) is X * theta, with dimensions (m x 1).
Throughout this course, dimensional analysis is your friend.
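A quick NumPy sketch of that dimension check (Python here rather than the course's Octave; the numbers are made up purely for illustration):

    import numpy as np

    m, n = 3, 2                      # 3 training examples, 2 features
    X = np.array([[1.0, 2.0],        # each row is one training example
                  [1.0, 3.0],
                  [1.0, 4.0]])       # shape (m, n)
    theta = np.array([[0.5],
                      [2.0]])        # column vector, shape (n, 1)

    # One training example at a time: x is a column vector (n, 1),
    # so h = theta' * x is a 1x1 result (a scalar).
    x = X[0].reshape(n, 1)
    h_single = theta.T @ x           # shape (1, 1)

    # Whole training set at once: h = X * theta has shape (m, 1).
    h_all = X @ theta

    print(h_single.shape, h_all.shape)   # (1, 1) (3, 1)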
--
Supervised learning:
We give the algorithm a data set that already contains the correct answers (labels).
The algorithm learns from this labeled data how to produce the right output for new inputs.
--Regression: Predict a continuous-valued output.
--Classification: Predict a discrete-valued output.
Unsupervised learning (e.g. clustering):
The data set has no labels; the algorithm just finds some structure in the data.
-- Example: the cocktail party problem (separating mixed audio sources).
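A rough sketch of the difference in code (Python with NumPy and scikit-learn assumed available; the data is made up):

    import numpy as np
    from sklearn.cluster import KMeans

    # Supervised: every input x comes with the "correct" answer y.
    X = np.array([[1.0], [2.0], [3.0], [4.0]])
    y = np.array([2.1, 3.9, 6.2, 8.1])            # continuous labels -> regression
    slope, intercept = np.polyfit(X.ravel(), y, 1)  # fit a line to the (x, y) pairs

    # Unsupervised: only X is given, no labels; the algorithm finds structure,
    # e.g. grouping the points into 2 clusters.
    clusters = KMeans(n_clusters=2, n_init=10).fit_predict(X)
    print(slope, intercept, clusters)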
------------------
Training set:
m = number of training examples
x's = input variable / features
y's = output variable / target variable
(x, y) - one training example
(x^(i), y^(i)) - the i-th training example
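In code the same notation might look like this (NumPy sketch with made-up data; Python indexes from 0 while the notes count from 1):

    import numpy as np

    # m = 3 training examples, each with one feature x and a target y.
    x = np.array([1.0, 2.0, 3.0])
    y = np.array([2.0, 4.0, 6.0])
    m = len(x)                   # number of training examples
    i = 1                        # pick the i-th example
    x_i, y_i = x[i], y[i]        # the training example (x^(i), y^(i))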
Assignment:
a := 1
Truth assertion:
a = a
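In most programming languages the distinction shows up as = versus ==; a small Python sketch:

    # Assignment: the notes write a := 1, code writes a = 1.
    a = 1
    # Truth assertion: the notes write a = a; in Python the comparison is ==.
    print(a == a)        # True, and a is unchanged
    print(a + 1 == a)    # False: a := a + 1 makes sense as an assignment,
                         # but "a = a + 1" can never hold as an assertion.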
alpha: learning rate
derivative term: the partial derivative of the cost function with respect to theta_j
Batch gradient descent:
Each step of gradient descent uses all the training examples.
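One possible implementation as a NumPy sketch (the function name, data, alpha, and iteration count are made up for illustration; the course exercises use Octave):

    import numpy as np

    def batch_gradient_descent(X, y, alpha=0.01, num_iters=1000):
        # X: (m, n) matrix of training examples (rows) and features (columns).
        # y: (m, 1) column vector of targets.
        # Every iteration uses *all* m training examples.
        m, n = X.shape
        theta = np.zeros((n, 1))
        for _ in range(num_iters):
            h = X @ theta                  # hypothesis for every example, (m, 1)
            grad = (X.T @ (h - y)) / m     # partial derivatives, (n, 1)
            theta = theta - alpha * grad   # simultaneous update of all theta_j
        return theta

    # Made-up data: y is roughly 1 + 2*x, with a column of ones for the intercept term.
    X = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
    y = np.array([[3.0], [5.0], [7.0]])
    print(batch_gradient_descent(X, y, alpha=0.1, num_iters=2000))  # close to [1, 2]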