
😄 fun fact: despite its name, logistic regression is actually a method for classification, not regression

Assume there is a hyperplane in $\mathbb{R}^d$ parameterized by $W$

$$
\begin{aligned}
P(Y = 1 \mid x, W) &= \phi(W^T x) \\
P(Y = 0 \mid x, W) &= 1 - \phi(W^T x) \\[12pt]
&\because \phi(a) = \frac{1}{1+e^{-a}}
\end{aligned}
$$
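The model above can be sketched in a few lines of plain Python. The function names (`sigmoid`, `p_y1`) are illustrative, not from any particular library:

```python
import math

def sigmoid(a: float) -> float:
    """Logistic function: phi(a) = 1 / (1 + e^{-a})."""
    return 1.0 / (1.0 + math.exp(-a))

def p_y1(w: list[float], x: list[float]) -> float:
    """P(Y = 1 | x, W) = phi(W^T x) for a single example."""
    a = sum(wi * xi for wi, xi in zip(w, x))  # inner product W^T x
    return sigmoid(a)
```

Note that `sigmoid(0.0)` returns `0.5`: a point exactly on the hyperplane is equally likely to belong to either class.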

maximum likelihood

Using the identity $1 - \phi(a) = \phi(-a)$, and labels $y^i \in \{-1, +1\}$ so that $P(y^i \mid x^i, W) = \phi(y^i W^T x^i)$:

$$
\begin{aligned}
W^{\text{ML}} &= \argmax_{W} \prod P(x^i, y^i \mid W) \\
&= \argmax_{W} \prod \frac{P(x^i, y^i, W)}{P(W)} \\
&= \argmax_{W} \prod P(y^i \mid x^i, W)\, P(x^i) \\
&= \argmax_{W} \Big[ \prod P(x^i) \Big] \Big[ \prod P(y^i \mid x^i, W) \Big] \\
&= \argmax_{W} \sum_{i=1}^{n} \log \big( \phi(y^i W^T x^i) \big)
\end{aligned}
$$

(The factor $\prod P(x^i)$ does not depend on $W$, so it drops out of the $\argmax$.)
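The final objective can be computed directly. A minimal sketch, assuming labels in $\{-1, +1\}$ and an illustrative function name `log_likelihood`:

```python
import math

def sigmoid(a: float) -> float:
    """phi(a) = 1 / (1 + e^{-a})."""
    return 1.0 / (1.0 + math.exp(-a))

def log_likelihood(W: list[float], X: list[list[float]], y: list[int]) -> float:
    """sum_i log phi(y^i * W^T x^i), for labels y^i in {-1, +1}.

    Relies on 1 - phi(a) = phi(-a), so one expression covers both classes.
    """
    total = 0.0
    for xi, yi in zip(X, y):
        a = sum(w * x for w, x in zip(W, xi))  # W^T x^i
        total += math.log(sigmoid(yi * a))
    return total
```

Maximizing this sum over $W$ (e.g. by gradient ascent) yields $W^{\text{ML}}$; the likelihood is concave in $W$, so there are no spurious local maxima.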

equivalent form

maximize the following:

$$
\sum_{i=1}^{n} \left( y^i \log p^i + (1 - y^i) \log (1 - p^i) \right)
$$

where $y^i \in \{0, 1\}$ and $p^i = \phi(W^T x^i)$
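The two forms of the objective agree term by term, which is easy to check numerically. A sketch with illustrative names (`bce_form` for the $\{0,1\}$-label version, `pm1_form` for the $\{-1,+1\}$-label version):

```python
import math

def sigmoid(a: float) -> float:
    return 1.0 / (1.0 + math.exp(-a))

def bce_form(a: float, y01: int) -> float:
    """y log p + (1 - y) log(1 - p), with p = phi(a) and y in {0, 1}."""
    p = sigmoid(a)
    return y01 * math.log(p) + (1 - y01) * math.log(1.0 - p)

def pm1_form(a: float, y01: int) -> float:
    """log phi(y a), after mapping y in {0, 1} to y in {-1, +1}."""
    y = 2 * y01 - 1
    return math.log(sigmoid(y * a))
```

For $y = 0$ the equivalence is exactly the identity $1 - \phi(a) = \phi(-a)$ from above.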

softmax

$$
\text{softmax}(y)_i = \frac{e^{y_i}}{\sum_{j} e^{y_j}}
$$

where $y \in \mathbb{R}^k$
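A direct implementation of the formula, with one common refinement: subtracting $\max_j y_j$ before exponentiating leaves the result unchanged (numerator and denominator are scaled by the same factor) but avoids overflow for large inputs:

```python
import math

def softmax(y: list[float]) -> list[float]:
    """softmax(y)_i = e^{y_i} / sum_j e^{y_j}, shifted by max(y) for stability."""
    m = max(y)
    exps = [math.exp(v - m) for v in y]
    total = sum(exps)
    return [e / total for e in exps]
```

The output is a probability vector: all entries are positive and sum to 1, and larger inputs get larger probabilities.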


cross entropy
