Types of Machine Learning
- Supervised learning - with labels
- Unsupervised learning - no labels
- Reinforcement learning - know the objective, but not how to achieve it
Thumbtack Question
- H : the thumbtack lands head up
- T : the thumbtack lands tail up
The binomial distribution (repeated Bernoulli experiments) is the discrete probability distribution of the number of successes in a sequence of n independent yes/no experiments, where each success occurs with probability θ
Flips are i.i.d. (independent and identically distributed random variables, i.i.d. condition)
- Independent events
- Identically distributed according to the binomial distribution
P(H) = θ / P(T) = 1-θ
P(HHTHT) = θθ (1-θ) θ (1-θ) = θ^3 (1-θ)^2
In the binomial pmf, n and p are given as parameters, and the probability is computed as k varies
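A minimal sketch of this, assuming SciPy is available (`scipy.stats.binom` is a standard call; the specific n and p below are illustrative, not from the post):

```python
# Binomial pmf: n and p are fixed parameters; we vary k.
from scipy.stats import binom

n, p = 5, 0.6  # illustrative values: 5 flips, theta = 0.6
for k in range(n + 1):
    # P(K = k) = C(n, k) * p**k * (1 - p)**(n - k)
    print(k, binom.pmf(k, n, p))
```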
Maximum Likelihood Estimation
Data : we have observed a sequence of flips D containing a_H heads and a_T tails
Our hypothesis : the thumbtack results follow the binomial distribution with parameter θ
How do we make our hypothesis stronger? By finding the best candidate for θ. What condition makes θ most plausible?
One candidate is the Maximum Likelihood Estimation (MLE) of θ : pick the θ that maximizes the probability of the observed data, θ̂ = argmax_θ P(D|θ)
MLE Calculation
Then this is a maximization problem, so take the derivative and set it to zero
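A sketch of that calculation: the log is monotone, so maximizing the log-likelihood gives the same θ as maximizing the likelihood itself.

```latex
\hat{\theta} = \arg\max_{\theta} P(D \mid \theta)
             = \arg\max_{\theta}\; \theta^{a_H}(1-\theta)^{a_T}
             = \arg\max_{\theta}\; \big( a_H \ln\theta + a_T \ln(1-\theta) \big)

\frac{d}{d\theta}\big( a_H \ln\theta + a_T \ln(1-\theta) \big)
  = \frac{a_H}{\theta} - \frac{a_T}{1-\theta} = 0
\quad\Longrightarrow\quad
\hat{\theta} = \frac{a_H}{a_H + a_T}
```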
Reference : wikipedia, logarithm (the log is monotone, so it preserves the maximizer)
Simple Error Bound
Let θ* be the true parameter of the thumbtack flipping. For any error ε > 0, we have a simple upper bound on the probability, provided by Hoeffding's inequality :
P(|θ̂ − θ*| ≥ ε) ≤ 2e^(−2Nε^2)
Can you calculate the required number of trials N to obtain ε = 0.1 in all but 0.01% of cases? (Probably Approximately Correct, PAC)
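A worked version of that calculation, taking "0.01% case" to mean failure probability δ = 0.0001 (an assumption about the intended reading):

```latex
2e^{-2N\varepsilon^2} \le \delta
\;\Longrightarrow\;
N \ge \frac{\ln(2/\delta)}{2\varepsilon^2}
   = \frac{\ln(2/0.0001)}{2\cdot(0.1)^2}
   \approx \frac{9.90}{0.02} \approx 495.2
```

So about 496 trials suffice.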
(In the PAC reading, the probability bound δ is the "probably" part and the error ε is the "approximately" part.)
Everything up to here is the approximation from the MLE viewpoint
Incorporating Prior Knowledge
More formulas, from the Bayesian viewpoint : P(θ|D) = P(D|θ)P(θ) / P(D), i.e. posterior ∝ likelihood × prior
Why not use the Beta distribution for the prior P(θ)? Its support is [0,1], which matches the range of a probability
Source : Other References 1)
Maximum a Posteriori Estimation
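A sketch of the standard conjugate-prior algebra, assuming a Beta(α, β) prior on θ (α and β are hyperparameters encoding the prior knowledge):

```latex
P(\theta \mid D) \;\propto\; P(D \mid \theta)\,P(\theta)
  \;\propto\; \theta^{a_H}(1-\theta)^{a_T}\cdot\theta^{\alpha-1}(1-\theta)^{\beta-1}
  \;=\; \theta^{a_H+\alpha-1}(1-\theta)^{a_T+\beta-1}

\hat{\theta}_{MAP} = \frac{a_H+\alpha-1}{a_H+\alpha+a_T+\beta-2}
```

With many observations the MAP estimate converges to the MLE; with little data the prior dominates.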
Probability
Conditional Probability
The conditional probability of A given B
Nice to see that we can switch the condition and the target event (Bayes' theorem)
Nice to see that we can recover the target event by summing the conditional probabilities weighted by the priors (law of total probability)
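Written out, the three statements above (definition, Bayes' theorem, law of total probability, where the B_n partition the sample space):

```latex
P(A \mid B) = \frac{P(A \cap B)}{P(B)}
\qquad
P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}
\qquad
P(A) = \sum_{n} P(A \mid B_n)\,P(B_n)
```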
Probability Distribution
- it assigns a probability to a subset of the potential events of a random trial, experiment, survey, etc.
A function mapping an event to a probability :: because we call it a probability, it should keep the probability axioms (non-negative, total mass 1, countably additive)
Normal Distribution
Beta Distribution
Its support is a closed interval
- Continuous numerical value
- [0,1]
- Very nice characteristic :: the support matches the range of a probability
Binomial Distribution
Simplest distribution for discrete values
- Bernoulli trial, yes or no / 0 or 1 / selection, switch...
Multinomial Distribution
The generalization of the binomial distribution
- Choose A,B,C,D... Z / Word selection, cluster selection...
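For reference, the standard density/mass functions of the four distributions above:

```latex
\text{Normal:}\quad f(x;\mu,\sigma^2) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}

\text{Beta:}\quad f(\theta;\alpha,\beta) = \frac{\theta^{\alpha-1}(1-\theta)^{\beta-1}}{B(\alpha,\beta)},
\qquad B(\alpha,\beta) = \frac{\Gamma(\alpha)\,\Gamma(\beta)}{\Gamma(\alpha+\beta)}

\text{Binomial:}\quad P(K=k) = \binom{n}{k}\,\theta^{k}(1-\theta)^{n-k}

\text{Multinomial:}\quad P(x_1,\dots,x_k) = \frac{n!}{x_1!\cdots x_k!}\,\prod_{i=1}^{k} p_i^{x_i}
```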
Basic References:
1) kooc.kaist.ac.kr
2) Bishop, Pattern Recognition and Machine Learning
3) http://norman3.github.io/prml/
Other References:
1) https://datascienceschool.net/view-notebook/70a372b9c14a4e8d9d49737f0b5a3c97/