
Error Gradient for a Sigmoid Unit



\begin{eqnarray*}
\frac{\partial E}{\partial w_{i}} & = & \frac{\partial}{\partial w_{i}}\ \frac{1}{2}\sum_{d \in D}(t_{d} - o_{d})^{2} \\
 & = & \frac{1}{2}\sum_{d \in D}\frac{\partial}{\partial w_{i}}(t_{d} - o_{d})^{2} \\
 & = & \frac{1}{2}\sum_{d \in D} 2 (t_{d} - o_{d}) \frac{\partial}{\partial w_{i}}(t_{d} - o_{d}) \\
 & = & \sum_{d \in D} (t_{d} - o_{d}) \left( - \frac{\partial o_{d}}{\partial w_{i}}\right) \\
 & = & - \sum_{d \in D} (t_{d} - o_{d})\ \frac{\partial o_{d}}{\partial net_{d}}\ \frac{\partial net_{d}}{\partial w_{i}}
\end{eqnarray*}

But we know that, for a sigmoid unit with $o_{d} = \sigma(net_{d})$ and $net_{d} = \vec{w} \cdot \vec{x}_{d}$:

\begin{displaymath}\frac{\partial o_{d}}{\partial net_{d}} = \frac{\partial
\sigma(net_{d})}{\partial net_{d}} = o_{d}(1 - o_{d}) \end{displaymath}


\begin{displaymath}\frac{\partial net_{d}}{\partial w_{i}} = \frac{\partial (\vec{w} \cdot
\vec{x}_{d})}{\partial w_{i}} = x_{i,d} \end{displaymath}
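
The two identities above can be checked numerically. The sketch below assumes that $\sigma$ is the logistic function $1/(1 + e^{-net})$ (the usual choice for a sigmoid unit, though the form is not spelled out here) and compares each closed form against a central finite difference. All function names, weights, and inputs are hypothetical, chosen only for illustration.

```python
import math


def sigmoid(net):
    """Logistic function, assumed here as the sigmoid: 1 / (1 + exp(-net))."""
    return 1.0 / (1.0 + math.exp(-net))


def sigmoid_derivative(net):
    """Closed form o * (1 - o), where o = sigmoid(net)."""
    o = sigmoid(net)
    return o * (1.0 - o)


def net_input(w, x):
    """Dot product of the weight vector w and the input vector x."""
    return sum(wi * xi for wi, xi in zip(w, x))


def central_difference(f, x, h=1e-6):
    """Numerical estimate of df/dx at x."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


# Check d sigma / d net = o (1 - o) at a few arbitrary points.
for net in (-2.0, 0.0, 0.5, 3.0):
    numeric = central_difference(sigmoid, net)
    assert abs(sigmoid_derivative(net) - numeric) < 1e-8

# Check d net / d w_i = x_i for made-up weights and inputs.
w = [0.5, -1.0, 2.0]
x = [1.0, 3.0, -2.0]
for i in range(len(w)):
    def net_as_function_of_wi(wi, i=i):
        perturbed = list(w)
        perturbed[i] = wi
        return net_input(perturbed, x)

    numeric = central_difference(net_as_function_of_wi, w[i])
    assert abs(numeric - x[i]) < 1e-6
```

Because $net_{d}$ is linear in each weight, the finite difference for the second identity is exact up to rounding error.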

So:

\begin{displaymath}\frac{\partial E}{\partial w_{i}} = - \sum_{d \in D} (t_{d} - o_{d})\
o_{d}(1-o_{d})\ x_{i,d} \end{displaymath}
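
As a rough illustration, the final expression can be coded directly and compared against a finite-difference estimate of $E$. This is a minimal sketch in plain Python, again assuming the logistic function for $\sigma$. The function names, the toy training set, and the weight vector are made up for the example and are not taken from the derivation.

```python
import math


def sigmoid(net):
    """Logistic function, assumed here as the sigmoid."""
    return 1.0 / (1.0 + math.exp(-net))


def output(w, x):
    """o_d = sigma(w . x_d)."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)))


def error(w, data):
    """E(w) = 1/2 * sum over d of (t_d - o_d)^2."""
    return 0.5 * sum((t - output(w, x)) ** 2 for x, t in data)


def error_gradient(w, data):
    """dE/dw_i = - sum over d of (t_d - o_d) * o_d * (1 - o_d) * x_{i,d}."""
    grad = [0.0] * len(w)
    for x, t in data:
        o = output(w, x)
        delta = (t - o) * o * (1.0 - o)
        for i, xi in enumerate(x):
            grad[i] -= delta * xi
    return grad


def numerical_gradient(w, data, h=1e-6):
    """Central-difference estimate of dE/dw_i, used only as a cross-check."""
    grad = []
    for i in range(len(w)):
        w_plus = list(w)
        w_minus = list(w)
        w_plus[i] += h
        w_minus[i] -= h
        grad.append((error(w_plus, data) - error(w_minus, data)) / (2.0 * h))
    return grad


# Hypothetical toy training set: (input vector x_d, target t_d).
# The first component of each x_d is a constant 1.0 acting as a bias input.
data = [
    ([1.0, 0.0, 0.0], 0.0),
    ([1.0, 0.0, 1.0], 1.0),
    ([1.0, 1.0, 0.0], 1.0),
    ([1.0, 1.0, 1.0], 0.0),
]
w = [0.1, -0.2, 0.3]  # arbitrary starting weights

analytic = error_gradient(w, data)
numeric = numerical_gradient(w, data)
for a, n in zip(analytic, numeric):
    assert abs(a - n) < 1e-7
```

A gradient descent step would then move each weight against this gradient, for example `w[i] -= eta * analytic[i]` for some small learning rate `eta` (a hypothetical name).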