## Sunday, September 10, 2017

### Horse Racing Puzzle

Question: There are 25 horses available.  We want to select the fastest three horses out of this 25 but are allowed to do horse-racing with 5 horses at a time only.  How many minimum numbers of horse races we need to do to get the best 3?

If you answer 11, it might not be wrong, but that's not optimal.  The key here is "can we quantify the speed of each horse doing the race?".  Say we do 5-horses racing, can we note the individual speed of each horse?  If we can, then the life is beautiful.

Here is the steps (if can measure the speed of each horse):

1. Do 5-group races (5-horse for each group).  We record the speed of top-3 (say a,b,c,d,e correspond to race group, and 1, 2, and 3 correspond to the speed ranking, then we get S[a][1], S[a][2], S[a][3], S[b][1], S[b][2], S[b][3], ..., S[e][1], S[e][2], S[e][3]).  (Note: S[group][rank] := speed value for group and rank).
2. Sort the table (We can flatten the 2-dimension to 1-dimension). For example S[a][1] = T[1], S[a][2]=T[2], ...., so on.
3. Pick the top-3.  We get the top-3 fastest horses.
4. Total race = 5
What about if don't have the ability to measure the speed (we know the top-3 just from seeing the race)?
1. Do 5-group races (say Group A, ..., E), each with 5 horses (total = 5 races)
2. For each group, we position the top-3 winners. For example, for group A: A1>A2>A3, B1>B2>B3 so on. (A1, A2, ..., A5 are arbitraries)
3. As we are interested in the top-3 only, we should remove the Rank 4 - rank 5 horses (e.g, eliminate all horses in rank 4 and 5.  Thus A4,A5, B4, B5, C4,C5,D4,D5,E4,E5 are eliminated)
4. We don't know whether A3>B3, or A3>C3, or B3>E3.  The same thing for A2 vs B2 so on.
5. We do another race (6th race): A1, B1, C1, D1, E1.  If we get the winner: E1>C1>D1>B1>A1 (E1 the 1st winner, C1 the 2nd, D1 the 3rd) so we can eliminate B1 and A1 (the 4th and 5th).
6. Because we eliminate A1 (the 1st winner in Group A), A2 and A3 (2nd and 3rd winners of group A) are automatically eliminated as well.
7. Because we eliminate the top winner in group B (B1), there is no chance B2 and B3 gonna be the winners so they would be eliminated as well.
8. 7th race: we do race C1 against D1 against E1.  Say the outcome is E1>C1>D1.  This way, we are sure E1 would be the fastest of all, C1 would be the runner-up and D1 is in the third place.  What about C2, C3, D2, D3, E2, E3? There is no chance they would win (Hey, if the best in each group [rank 1] couldn't win the top, how the 2nd and 3rd place would win?)
The original table (after 5 races), we rank the winners of each group as follows:
Group A Group B Group C Group D Group E
Rank 1 A1 B1 C1 D1 E1
Rank 2 A2 B2 C2 D2 E2
Rank 3 A3 B3 C3 D3 E3
Rank 4 A4 B4 C4 D4 E5
Rank 5 A5 B5 C5 D5 E5

After eliminating 4th and 5th of each group:
Group A Group B Group C Group D Group E
Rank 1 A1 B1 C1 D1 E1
Rank 2 A2 B2 C2 D2 E2
Rank 3 A3 B3 C3 D3 E3
Rank 4

Rank 5

After 6th race (A1,B1,C1,D1,E1), the winners are E1>C1>D1>B1>A1.  We eliminate A1, B1 (4th and 5th winners of 6th race)
Group A Group B Group C Group D Group E
Rank 1
C1 D1 E1
Rank 2 A2 B2 C2 D2 E2
Rank 3 A3 B3 C3 D3 E3
Rank 4

Rank 5

Because the best of group A (A1) and the best of group B (B1) are eliminated (losers), there is no chance for 2nd and 3rd winner of group A (A2, A3)  and 2nd and 3rd winner of B (B2, B3) would win. We eliminate A1, A2, B1, and B2:

Group A Group B Group C Group D Group E
Rank 1
C1 D1 E1
Rank 2
C2 D2 E2
Rank 3

C3 D3 E3
Rank 4

Rank 5

We've got E1>C1>D1.  If the best of group D (D1) only sits in the 3rd place, the worse two in group D (D2 and D3) would definitely never have a chance to win.  We eliminate D2, D3.

C1 could only sit in the 2nd place, C2 is still possible to sit in the 3rd place, but the 3rd place in group C (C3) would not get a chance to sit in the last 3rd place either. Eliminate C3.
Group A Group B Group C Group D Group E
Rank 1
C1 D1 E1
Rank 2
C2
E2
Rank 3

E3
Rank 4

Rank 5

We already know, E1 is the best of all, so there is no point to do a race with the horse again.  Now we have E2, E3, C1, C2, D1 to race against each other.

7th race: Say we get E2>C1>C2.  We could eliminate E3, D1 then.  Now we have E1>E2>C1>C2.  We eliminate the 4th (C2).

We get the top-3 (E1>E2>C1) !!!

Group A Group B Group C Group D Group E
Rank 1
C1
E1
Rank 2

E2
Rank 3

Rank 4

Rank 5

NOTE:
After second thought, the 4th and 5th elimination in the beginning actually assumes these bottom two are the lowest of all.  This may be dangerous, as we split the race into groups, each group is independent each other.

For example, after 5 races performed, the speed of horses (in Miles/hour and ordered according to the rank) in
group A:  [20.0, 19.0, 18.0, 17.0, 16.0]
group B:  [25.0, 24.8, 24.6, 24.4, 24.2]

It is obvious, B4 and B5 (the bottom two) are still faster than even the fastest in group A.  If we eliminate indiscriminately against all 4th and 5th winners, B4 and B5 are got rid of, while the actual losers (all horses in group A) are picked up (at least A1, A2, A3 are selected, where they shouldn't be).  Now you see the problem?

It is similar to say, the top-3 smartest men among 5-idiots better than 4th and 5th place of smartest men among 5-smart men.

I dunno why the explanations for the issue on the Internet fall into the same fallacy (by the way, this question is one of Google's interview question, I believe).