## Saturday, 28 February 2015

### Review 5.6: Learn from User Clicks - Back Propagation


Here is what is interesting. The following picture shows how the back propagation algorithm works (the calculation in the picture is meaningless; it is only there to illustrate the idea of back propagation, so ignore it). The basic idea of back propagation is that the strength of the links between nodes in different layers changes according to the user's feedback (the difference between the expected output and the real output).
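That idea can be summed up in one line (a sketch with my own names, not code from the book): each link strength is nudged by the learning rate times the downstream node's error signal (its delta) times the upstream node's activation.

```scala
// Generic delta-rule update for a single link (names are my own, for illustration):
// the nudge grows with the downstream error signal (delta) and with how active
// the upstream node was when the error occurred.
def nudge(strength: Double, rate: Double, delta: Double, upstreamActivation: Double): Double =
  strength + rate * delta * upstreamActivation
```

For example, `nudge(0.4, 0.5, 0.2, 1.0)` returns 0.5: a positive delta strengthens the link, while a delta of zero leaves it unchanged.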

Here is how the back propagation algorithm is implemented.
```scala
def backPropagation( targets:List[Double], howMuchToChange:Double = 0.5 ) {
  val outDelta = {
    for ( i ← 0 until linkids.length ) yield {
      // The difference between the expected output and the real output
      val error = targets(i) - ao(i)

      // Why multiply by the slope of the output? When the output is larger, the links
      // between the hidden layer and that output node are on average stronger,
      // so we need to change their strength more carefully where the slope is flat.
      dtanh(ao(i)) * error
    }
  }.toList

  val hiddenDelta = {
    for ( i ← 0 until hiddenids.length ) yield {
      val error = (for ( j ← 0 until linkids.length ) yield so(i)(j) * outDelta(j)).toList.reduceLeft( _ + _ )
      dtanh(ah(i)) * error
    }
  }.toList

  for ( i ← 0 until hiddenids.length ) {
    for ( j ← 0 until linkids.length ) {
      val change = outDelta(j) * ah(i)
      so(i)(j) += change * howMuchToChange
    }
  }

  for ( i ← 0 until wordids.length ) {
    for ( j ← 0 until hiddenids.length ) {
      val change = hiddenDelta(j) * ai(i)
      si(i)(j) += change * howMuchToChange
    }
  }
}

feedForword()

// 1.0 for the link the user actually clicked, 0.0 for the others
// (the condition below is reconstructed; the original line was incomplete)
val targets = {
  for ( i ← 0 until linkids.length ) yield {
    if ( linkids(i) == selectedLink ) 1.0 else 0.0
  }
}.toList
backPropagation(targets)
// Save to database
saveNetwork()
} // closes the enclosing train method, whose header is not shown in this snippet

// Slope of tanh
// When y approaches 1, the slope approaches 0 (flat)
// When y approaches 0, the slope approaches 1 (a 45-degree slope)
def dtanh( y:Double ) = 1 - y * y
```
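A quick numeric check of the slope comments on `dtanh` (my own illustration): when a node's output y is near 0 the slope is near 1, so the delta keeps almost its full size, and when y is near 1 the slope is near 0, so the delta is damped sharply.

```scala
// Slope of tanh, same helper as in the code above
def dtanh(y: Double) = 1 - y * y

// Near y = 0 the curve is steep, so updates pass through at nearly full size;
// near y = 1 the curve is flat, so updates are scaled down sharply.
println(dtanh(0.0)) // 1.0
println(dtanh(0.5)) // 0.75
println(dtanh(0.9)) // 0.19 (approximately)
```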

The code performs the following steps:

For each node in the output layer:
1. Calculate the difference between the node's current output and the expected value for this node.
2. Use dtanh to determine how much the node's total input has to change (the output delta). (Why use dtanh? When the link is strong, we need to change its strength more carefully, and the curve of dtanh does exactly that.)
3. Change the strength of every incoming link in proportion to the delta, the upstream node's activation, and the learning rate.

For each node in the hidden layer:
1. Calculate the node's error by summing, over every node in the output layer, the strength of the link to that node multiplied by its output delta.
2. Use dtanh to determine how much the node's total input has to change.
3. Change the strength of every incoming link in proportion to the delta, the upstream node's activation, and the learning rate.

Note that before running back propagation it is necessary to run feed forward, so that the output of each node is known.
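The steps above, including the feed-forward pass that must come first, can be sketched on a minimal 1-1-1 network (one input, one hidden, one output node, tanh activations; all names and numbers here are my own illustration, not the book's code):

```scala
object BackPropSketch {
  def dtanh(y: Double): Double = 1 - y * y

  // One feed-forward pass followed by one back-propagation update.
  // Returns the updated link strengths (newSi, newSo).
  def step(si: Double, so: Double, ai: Double, target: Double,
           rate: Double = 0.5): (Double, Double) = {
    // Feed forward: each node outputs tanh of its weighted input.
    val ah = math.tanh(ai * si) // hidden activation
    val ao = math.tanh(ah * so) // output activation

    // Output delta: slope of the output times the error.
    val outDelta = dtanh(ao) * (target - ao)
    // Hidden delta: slope of the hidden output times the back-propagated error.
    val hiddenDelta = dtanh(ah) * (so * outDelta)

    // Change each link in proportion to its delta and upstream activation.
    (si + rate * hiddenDelta * ai, so + rate * outDelta * ah)
  }
}
```

Starting from, say, si = 0.2 and so = 0.4 with input 1.0 and target 1.0, repeated calls to `step` move the network's output steadily toward the target.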
```scala
def main(args:Array[String]) {
  createTables()

  val players = 101
  val football = 102
  val NBA = 103

  val NBAPlayerIntroduction = 201
  val FootballSkills = 202
  val VideoGames = 203

  for ( i ← 0 until 30 ) {
    train(List(players, NBA), List(NBAPlayerIntroduction, FootballSkills, VideoGames), NBAPlayerIntroduction)
    train(List(football, NBA), List(NBAPlayerIntroduction, FootballSkills, VideoGames), FootballSkills)
    train(List(players), List(NBAPlayerIntroduction, FootballSkills, VideoGames), VideoGames)
  }

  setupNetwork(List(players, NBA), List(NBAPlayerIntroduction, FootballSkills, VideoGames))
  feedForword()
  println(ao)

  setupNetwork(List(football, NBA), List(NBAPlayerIntroduction, FootballSkills, VideoGames))
  feedForword()
  println(ao)

  setupNetwork(List(NBA), List(NBAPlayerIntroduction, FootballSkills, VideoGames))
  feedForword()
  println(ao)
}
```
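Each `println(ao)` prints the output activations in the same order as the link list passed to `setupNetwork`. A small helper (hypothetical, not part of the original code) makes the ranking explicit:

```scala
// Hypothetical helper: pair each link id with its output score, best first.
def ranked(linkIds: List[Int], scores: List[Double]): List[(Int, Double)] =
  linkIds.zip(scores).sortBy(-_._2)
```

For instance, `ranked(List(201, 202, 203), List(0.1, 0.8, 0.3))` puts link 202 first.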

One of the magical things about the neural network algorithm, especially its hidden layer, is that it can make a reasonable guess even for an input query it has never seen before.

We can see that the query "NBA" was never seen by itself before, yet the network gives the "NBAPlayerIntroduction" link a much better score than the "FootballSkills" link, even though "NBA" was associated just as often with "football" as it was with "players". The neural network has not only learned which links are associated with which queries, it has also learned which words are important in a particular query. (The reason for this: the query "players" on its own was always trained to output "VideoGames", so the network treats "players" as contributing less to the "NBAPlayerIntroduction" link, and "NBA" therefore becomes the more important word.)