I love these kinds of things, and I hadn't heard this one before, so thank you King Damager.
I'm going to try to explain the value of the teacher saying "at least one hat is red" using an example. It might take some time, so hopefully someone will have explained it to everyone's satisfaction in a more succinct manner in the time it takes me to write this.
Everyone seems to accept that the information is important when there are 2 or fewer red hats, but that it breaks down once there are 3 red hats, since every child can then already see that "there must be at least 1 red hat". So I will use the simplest such example with both red and black hats: 4 kids, 3 with red hats, 1 with a black hat.
Let's number them 1-4, with 1-3 wearing a red hat, and number 4 wearing a black hat.
They have no information other than what they can see and their knowledge of what each of the others can see.
They are then asked in turn, 1 to 4, what hat they are wearing.
I think again everyone agrees that kid 1 and kid 2 will both say "I don't know".
We must now step into kid 3's head. Kid 3 thinks "I may be wearing a red hat, or I may be wearing a black hat". Kid 3 can see that kids 1 and 2 are wearing red hats and that kid 4 is wearing a black hat.
Kid 3 thinks to himself, "what was kid 2 thinking when he said 'I don't know'?". We must now step into kid 2's head with only kid 3's knowledge (i.e. not knowing if kid 3 has a red or black hat).
With that restriction, kid 2 can be seeing one of two things:
A) He can see kid 1 wearing a red hat (and kid 1 has already said "I don't know"), kid 3 wearing a red hat and kid 4 wearing a black hat OR
B) He can see kid 1 wearing a red hat (and kid 1 has already said "I don't know"), kid 3 wearing a black hat and kid 4 wearing a black hat.
In scenario A, he sees someone (kid 3) wearing a red hat who hasn't spoken yet and again, I think everyone agrees that in this case he must say "I don't know".
In scenario B, only kid 1 is wearing a red hat and kid 1 has said "I don't know", BUT AT THIS POINT HE CAN'T JUST SAY HE HAS A RED HAT ON, HE MUST STEP INSIDE KID 1'S HEAD (sorry for the caps but I feel this point needs emphasis).
So, still in scenario B, kid 2 thinks to himself "what was kid 1 thinking when he said 'I don't know'?".
In scenario B, from kid 2's perspective, kid 1 can be seeing one of 2 things:
B1) He can see kid 2 wearing a red hat, kid 3 wearing a black hat and kid 4 wearing a black hat OR
B2) He can see kid 2 wearing a black hat, kid 3 wearing a black hat and kid 4 wearing a black hat
In scenario B1, kid 1 sees kid 2 wearing a red hat and kid 2 hasn't spoken yet, so he must say "I don't know".
In scenario B2, kid 1 sees only black hats (and here's the kicker) BUT CAN'T MAKE ANY ASSUMPTION ABOUT THEIR OWN HAT, BECAUSE THERE MIGHT ONLY BE BLACK HATS. Therefore he still says "I don't know".
So in either scenario B1 or B2, kid 1 has to say "I don't know". Then again, in either scenario A or B, kid 2 will say "I don't know". So by the time we get to kid 3, he thinks "whatever hat I'm wearing, kid 2 would have said 'I don't know', so I have no new information. My hat might still be red or black, so I have to say 'I don't know'". I hope we are all in agreement on that.
Now let's step back to scenarios B1 and B2, but in this case the teacher has said "at least one hat is red".
In scenario B1, nothing changes, kid 1 sees kid 2 wearing a red hat and kid 2 hasn't spoken yet, so he must say "I don't know".
But in scenario B2, where he sees only black hats, he knows his must be red, so he says "red". Now, as he didn't say "red", kid 2 knows that kid 1 did not see only black hats. So given that kid 1 said "I don't know", it can't have been scenario B2, and kid 2 (and kid 3) can rule out scenario B2. But had it been scenario B1, kid 2 would have been in a situation where he sees kids 3 and 4 with black hats and kid 1 with a red hat, and kid 1 has said "I don't know", so kid 2 would have said "red". But kid 3 knows that kid 2 said "I don't know", so it can't have been scenario B1 or B2, and therefore kid 2 can't have seen scenario B.
So kid 3 now knows that kid 2 saw scenario A) kid 1 wearing a red hat, kid 3 wearing a red hat and kid 4 wearing a black hat.
Therefore, kid 3 knows they are wearing a red hat.
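If you'd rather check this by brute force than by stepping into heads, here is a rough Python sketch of the same reasoning. It is my own illustration, not part of the original puzzle: the names `simulate` and `answer` are made up, and it assumes every kid is a perfect logician, every answer is heard by everyone, and a kid only names a colour when he is certain. It keeps the set of hat arrangements that everyone still considers possible and shrinks that set after each answer.

```python
from itertools import product

def simulate(actual, teacher_announces):
    """Return each kid's answer in turn: 'R', 'B' or '?' (I don't know)."""
    n = len(actual)
    # Every arrangement of hats that is still possible as far as everyone knows.
    worlds = set(product("RB", repeat=n))
    if teacher_announces:
        # "At least one hat is red" rules out the all-black arrangement.
        worlds.discard(tuple("B" * n))

    def answer(kid, world, possible):
        # Colours the kid's own hat could be, given the hats he sees in `world`.
        options = {w[kid] for w in possible
                   if all(w[j] == world[j] for j in range(n) if j != kid)}
        return options.pop() if len(options) == 1 else "?"

    answers = []
    for kid in range(n):
        said = answer(kid, actual, worlds)
        answers.append(said)
        # Everyone hears the answer, so drop every arrangement in which
        # this kid would have said something different.
        worlds = {w for w in worlds if answer(kid, w, worlds) == said}
    return answers

print(simulate(tuple("RRRB"), teacher_announces=True))   # ['?', '?', 'R', 'B']
print(simulate(tuple("RRRB"), teacher_announces=False))  # ['?', '?', '?', '?']
```

With the announcement, kids 1 and 2 say "I don't know" and kid 3 says red, as argued above. Without it, nobody ever learns anything, because the all-black arrangement is never ruled out and so no "I don't know" removes any possibility.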
That's pretty convoluted and might take a few reads, but I think it holds. Also, there might be people thinking "well ok, maybe that holds for 3 red hats, but not for 4 or 5 or 17!". What I've tried to point out is that for the final kid's logic to hold up (be it kid 3 or kid 253), he must consider every possibility for every previous kid, up to and including the possibility that the first kid saw all black hats and needed to know that there is at least one red hat to reach the "red" conclusion.
If you're still unconvinced for cases higher than 3, just try from this starting point and follow my logic from above. 5 kids, 4 red, 1 black. Kid 4 thinks about what kid 3 was thinking when he said "I don't know". Kid 3 could be seeing:
A) Kid 1 red, kid 2 red, kid 4 red, kid 5 black
B) Kid 1 red, kid 2 red, kid 4 black, kid 5 black
Hope that helps, apologies if I'm talking rubbish!