Key idea: just store all training examples
Nearest neighbor:
- Given query instance x_q, first locate the nearest training example x_n, then estimate f̂(x_q) ← f(x_n)
k-Nearest neighbor:
- Given x_q, take a vote among its k nearest neighbors (if the target function is discrete-valued)
- Take the mean of the f values of its k nearest neighbors (if real-valued): f̂(x_q) ← (1/k) Σ_{i=1}^{k} f(x_i)
(a short code sketch follows the advantages/disadvantages below)
Advantages:
- Training is very fast (just store the examples)
- Can learn complex target functions
- Does not lose information
Disadvantages:
- Slow at query time (all computation is deferred to classification)
- Easily fooled by irrelevant attributes
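A minimal sketch of the basic k-nearest-neighbor classifier described above, assuming numeric feature vectors and Euclidean distance (the function and data names are illustrative, not from these notes):

```python
import numpy as np
from collections import Counter

def knn_classify(query, examples, labels, k=3):
    """Classify `query` by majority vote among its k nearest training examples."""
    distances = np.linalg.norm(examples - query, axis=1)  # Euclidean distance to every stored example
    nearest = np.argsort(distances)[:k]                   # indices of the k closest examples
    votes = Counter(labels[i] for i in nearest)           # tally the neighbors' class labels
    return votes.most_common(1)[0][0]

# Toy usage: two small clusters in the plane
X = np.array([[1.0, 1.0], [1.2, 0.8], [5.0, 5.0], [5.3, 4.9]])
y = ["neg", "neg", "pos", "pos"]
print(knn_classify(np.array([4.8, 5.1]), X, y, k=3))  # majority of the 3 nearest -> "pos"
```

Note that training is trivial here (the examples are simply stored); all of the work happens at query time, which is exactly the trade-off listed above.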
Consider p(x), the probability that instance x will be labeled 1 (positive) versus 0 (negative).
k-Nearest neighbor:
- As the number of training examples → ∞ and k grows large, k-NN approaches the Bayes optimal classifier (predict 1 if p(x) > 0.5, else 0)
Might want to weight nearer neighbors more heavily:
f̂(x_q) ← (Σ_{i=1}^{k} w_i f(x_i)) / (Σ_{i=1}^{k} w_i)
where
w_i ≡ 1 / d(x_q, x_i)²
and d(x_q, x_i) is the distance between x_q and x_i.
Note that now it makes sense to use all training examples instead of just k (Shepard's method).
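A minimal sketch of distance-weighted prediction under the 1/d² weighting above, assuming a real-valued target and Euclidean distance; passing k=None uses every training example, in the spirit of Shepard's method (names and parameters are illustrative):

```python
import numpy as np

def distance_weighted_predict(query, examples, targets, k=None):
    """Weighted average of neighbor targets with w_i = 1 / d(x_q, x_i)^2."""
    distances = np.linalg.norm(examples - query, axis=1)
    if np.any(distances == 0):                 # query coincides with a training example
        return targets[np.argmin(distances)]
    weights = 1.0 / distances**2               # nearer neighbors weigh more heavily
    if k is not None:                          # optionally restrict to the k nearest
        keep = np.argsort(distances)[:k]
        weights, targets = weights[keep], targets[keep]
    return np.sum(weights * targets) / np.sum(weights)

X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([0.0, 1.0, 4.0, 9.0])
print(distance_weighted_predict(np.array([1.5]), X, y))       # all examples (Shepard-style)
print(distance_weighted_predict(np.array([1.5]), X, y, k=2))  # only the 2 nearest
```

For a discrete-valued target, the same weights would go into a weighted vote rather than a weighted average.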
Day | Outlook | Temperature | Humidity | Wind | PlayTennis |
D1 | Sunny | 88 | High | Weak | No (4) |
D2 | Sunny | 80 | High | Strong | No (2) |
D3 | Overcast | 92 | High | Weak | Yes (8) |
D4 | Rain | 72 | High | Weak | Yes (6) |
D5 | Rain | 51 | Normal | Weak | Yes (6) |
D6 | Rain | 55 | Normal | Strong | No (2) |
D7 | Overcast | 60 | Normal | Strong | Yes (10) |
D8 | Sunny | 75 | High | Weak | No (9) |
D9 | Sunny | 48 | Normal | Weak | Yes (7) |
D10 | Rain | 68 | Normal | Weak | Yes (6) |
D11 | Sunny | 78 | Normal | Strong | Yes (7) |
D12 | Overcast | 77 | High | Strong | Yes (8) |
D13 | Overcast | 95 | Normal | Weak | Yes (8) |
D14 | Rain | 68 | High | Strong | No (4) |
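As an illustration only (not part of the notes), here is 3-NN classification of a new day against the table above, using an ad hoc mixed distance: a 0/1 mismatch for each categorical attribute plus a range-normalized temperature difference. The parenthesized numbers in the PlayTennis column are left unused here.

```python
from collections import Counter

# (Outlook, Temperature, Humidity, Wind, PlayTennis) rows D1-D14 from the table above
data = [
    ("Sunny",    88, "High",   "Weak",   "No"),
    ("Sunny",    80, "High",   "Strong", "No"),
    ("Overcast", 92, "High",   "Weak",   "Yes"),
    ("Rain",     72, "High",   "Weak",   "Yes"),
    ("Rain",     51, "Normal", "Weak",   "Yes"),
    ("Rain",     55, "Normal", "Strong", "No"),
    ("Overcast", 60, "Normal", "Strong", "Yes"),
    ("Sunny",    75, "High",   "Weak",   "No"),
    ("Sunny",    48, "Normal", "Weak",   "Yes"),
    ("Rain",     68, "Normal", "Weak",   "Yes"),
    ("Sunny",    78, "Normal", "Strong", "Yes"),
    ("Overcast", 77, "High",   "Strong", "Yes"),
    ("Overcast", 95, "Normal", "Weak",   "Yes"),
    ("Rain",     68, "High",   "Strong", "No"),
]

temps = [row[1] for row in data]
t_min, t_max = min(temps), max(temps)

def dist(a, b):
    """0/1 mismatch on Outlook, Humidity, Wind plus a normalized temperature gap."""
    d = sum(a[i] != b[i] for i in (0, 2, 3))
    d += abs(a[1] - b[1]) / (t_max - t_min)
    return d

def classify(query, k=3):
    """Majority PlayTennis label among the k days nearest to `query`."""
    neighbors = sorted(data, key=lambda row: dist(query, row))[:k]
    return Counter(row[4] for row in neighbors).most_common(1)[0][0]

# Predict for a hypothetical new day
print(classify(("Sunny", 66, "Normal", "Weak")))
```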
Imagine instances described by 20 attributes, but only 2 are relevant to the target function.
Curse of dimensionality: nearest neighbor is easily misled when X is high-dimensional.
One approach:
- Stretch the j-th axis by weight z_j, where z_1, ..., z_n are chosen to minimize prediction error
- Use cross-validation to choose the weights z_1, ..., z_n automatically
- Note that setting z_j to zero eliminates that dimension altogether
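A minimal sketch of the axis-stretching idea, assuming the weights z_j are chosen by a small grid search that minimizes leave-one-out 1-NN error (the data, grid, and names are illustrative assumptions; a weight of 0 drops that dimension):

```python
import numpy as np
from itertools import product

def loocv_error(X, y, z):
    """Leave-one-out 1-NN error after stretching the j-th axis by z[j]."""
    Xz = X * z                        # stretch each axis by its weight
    errors = 0
    for i in range(len(X)):
        d = np.linalg.norm(Xz - Xz[i], axis=1)
        d[i] = np.inf                 # exclude the held-out point itself
        errors += y[np.argmin(d)] != y[i]
    return errors / len(X)

# Toy data: attribute 0 determines the label, attribute 1 is pure noise.
rng = np.random.default_rng(0)
X = rng.uniform(size=(60, 2))
y = (X[:, 0] > 0.5).astype(int)

# Grid-search the axis weights; weight 0 eliminates a dimension altogether.
best = min(product([0.0, 0.5, 1.0], repeat=2),
           key=lambda z: loocv_error(X, y, np.array(z)))
print("chosen axis weights:", best)   # the noisy axis typically ends up with weight 0
```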