Fornowagain produced the following formula for wattage scaling with frequency/voltage a while back. It's approximate but seems valid.
Intel publish a TDP value, in Watts. This is a worst case scenario for heat generation for the benefit of people designing cooling systems. Assuming that all the power going into a chip is lost as heat (reasonable), this value gives you the worst case power consumption. It's 95W for the Q6600 G0 stepping.
Power consumption scales linearly with frequency. This is intuitively reasonable: if you double the frequency it solves your equation in half the time, but it's still doing the same number of operations, so the total energy used should be constant. Power is energy / time, hence doubling frequency doubles power consumption.
Voltage increases exert a greater effect. Higher voltage means more energy is pushed into the CPU, so more heat is generated directly. As a second effect, more current flows through and this also generates more heat. For constant resistance, power is proportional to voltage squared. A CPU isn't ohmic, but let's assume it's close enough for engineering purposes.
This yields, as quoted by Fornowagain,
Power ~= TDP x (Voltage / stock voltage)^2 x (Frequency / stock frequency)
**** end maths ****
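For anyone who'd rather poke at the numbers than work them by hand, here's a minimal sketch of that estimate in Python. The function and argument names are mine, not anything official; the formula itself is just the one quoted above.

**** python sketch ****
# Minimal sketch of the scaling estimate quoted above. Names are illustrative.
def estimated_power(tdp_w, stock_volts, stock_mhz, volts, mhz):
    """Estimate power draw by scaling TDP with (V/V_stock)^2 * (f/f_stock)."""
    return tdp_w * (volts / stock_volts) ** 2 * (mhz / stock_mhz)
**** end sketch ****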
A couple of results follow from this. It's possible to overclock without using more power, but only if you undervolt at the same time. It also offers a reason why lower VID chips can run hotter.
Overclockers tend to take the maximum voltage from the stock range as the "safe limit". In the case of the Q6600 under discussion, this is 1.5V. Assuming that all processors, whatever their VID, have the same TDP value is probably naive, but it's the best option available.
I believe Q6600s tend to top out around 3.8GHz so I'll use that for my example. A very low VID would be 1.1V, a "bad" one 1.4V.
Power(1.1V VID) ~= 95*(1.5/1.1)^2*(3.8/2.4) = 280W
Power(1.4V VID) ~= 95*(1.5/1.4)^2*(3.8/2.4) = 173W
So here, the high VID chip is estimated to use about 100W less than the low VID one.
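For what it's worth, both figures drop straight out of the sketch function above (again, estimated_power is just my illustrative name):

**** python sketch ****
# Q6600 G0: TDP 95W, stock 2.4GHz; both chips pushed to 1.5V and 3.8GHz.
low_vid  = estimated_power(95, stock_volts=1.1, stock_mhz=2400, volts=1.5, mhz=3800)
high_vid = estimated_power(95, stock_volts=1.4, stock_mhz=2400, volts=1.5, mhz=3800)
print(round(low_vid), round(high_vid))   # roughly 280 and 173
**** end sketch ****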
I personally believe the issue here is people taking Intel's maximum VID, or the community's advice (which hopefully takes temperature into account), as the safe voltage. As such, a low VID chip has a greater voltage increase available before hitting the limit, and will tend to clock further as a result. There's precious little sense behind this though, as there's no reason to believe that a voltage which is safe at stock remains safe when running at a higher frequency and indeed a higher temperature. There's no reason to believe processors of a given batch are identical to each other either.
**** sidenote ****
May I suggest an alternative method, which I came across a couple of years ago when looking into the theory behind overclocking. It's to be combined with raising voltages only as required, and spending time minimising all those that one can, which is good practice anyway. In general I'm considering vcore here, though it wouldn't be unreasonable to do it with QPI/analogues as well.
1/ Determine the lowest voltage the processor requires to run stably, under your chosen cooling system, at stock speed (or less if you want to be really thorough)
2/ Write down the frequency and voltage(s)
3/ Increase frequency a bit, test as normal to determine the new required voltage(s)
4/ Write these down too
5/ Continue in this fashion, plotting the numbers as you go.
The relation is likely to be linear for quite a while, then hit diminishing returns where you need greater and greater voltages for the extra few MHz. The "safe" voltages for your system can be taken as the values around the linear-to-curved transition. Pick an aesthetically pleasing clock speed around there and be content.
If you don't do the undervolting bit you'll lose a good part of the linear relationship to a flat line, which will make it difficult to see where the linear behaviour stops. Likewise, the smaller the frequency steps the better the graph you'll produce, but the longer it'll take.
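If you'd like the plotting and knee-spotting done for you, something along these lines works. The readings below are made up purely for illustration, and the 20mV tolerance is an arbitrary choice, so take it as a sketch rather than a recipe.

**** python sketch ****
# Rough sketch of the plot-and-look approach: fit a straight line through the
# first few (frequency, vcore) readings and flag where later points start to
# pull away from it. The data here is invented purely for illustration.
import numpy as np
import matplotlib.pyplot as plt

readings = [  # (MHz, required vcore) - hypothetical numbers
    (2400, 1.10), (2600, 1.14), (2800, 1.18), (3000, 1.22),
    (3200, 1.26), (3400, 1.31), (3600, 1.41), (3800, 1.55),
]
mhz = np.array([m for m, _ in readings], dtype=float)
vcore = np.array([v for _, v in readings])

# Line through the first four points, where behaviour should still be linear.
slope, intercept = np.polyfit(mhz[:4], vcore[:4], 1)
predicted = slope * mhz + intercept

# Points needing noticeably more voltage than the line predicts are past the knee.
past_knee = vcore - predicted > 0.02   # 20mV tolerance, pick to taste
print("Diminishing returns from about", mhz[past_knee][0], "MHz")

plt.plot(mhz, vcore, "o-", label="measured")
plt.plot(mhz, predicted, "--", label="linear fit (first points)")
plt.xlabel("Frequency (MHz)"); plt.ylabel("Vcore (V)"); plt.legend(); plt.show()
**** end sketch ****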
edit: Forgot this was about temperature, not wattage. Heatsinks tend to develop a temperature gradient proportional to the wattage moving through them, with one end fixed near room temperature. Increasing room temperature by X degrees tends to increase processor temperature by X degrees as well. To decrease the difference between room/water temperature and processor temperature you need to decrease the processor wattage or improve the heatsink (say with a better fan). It's possible to estimate heatsink/radiator performance, but anything more complex than a constant K/W gets messy quite fast. Voltage/clock directly affect wattage; calculating temperature from known wattage is rather more difficult than the above.
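To put a rough number on that, here's the constant-K/W approximation in the same sketch style. The 0.15 K/W figure is invented for illustration, not a spec for any real cooler, and the function name is again just mine.

**** python sketch ****
# Crude temperature estimate using a constant thermal resistance (K per watt).
# The 0.15 K/W figure is made up for illustration, not a real cooler spec.
def estimated_cpu_temp(power_w, ambient_c, thermal_resistance_k_per_w=0.15):
    """CPU temp ~= ambient + power * K/W of the whole heatsink/fan stack."""
    return ambient_c + power_w * thermal_resistance_k_per_w

print(estimated_cpu_temp(95, 22))    # stock-ish:   ~36C
print(estimated_cpu_temp(173, 22))   # high-VID OC: ~48C
print(estimated_cpu_temp(280, 22))   # low-VID OC:  ~64C
**** end sketch ****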