Hey, I've built a particle engine in OpenGL but really need a fast square root function. The one in math.h is rubbish.
Cheers.
Joe
#include <stdint.h>
#include <string.h>

float InvSqrt(float x)
{
    float xhalf = 0.5f * x;
    int32_t i;
    memcpy(&i, &x, sizeof i);       // get bits for floating value (pointer casts here break strict aliasing)
    i = 0x5f375a86 - (i >> 1);      // gives initial guess y0
    memcpy(&x, &i, sizeof x);       // convert bits back to float
    x = x * (1.5f - xhalf * x * x); // Newton step, repeating increases accuracy
    return x;
}
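The comment on the Newton step says repeating it increases accuracy. A minimal sketch of that idea, assuming a hypothetical InvSqrtSteps variant with a steps parameter (neither name is from the post above):

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical variant of InvSqrt: `steps` is the number of Newton steps. */
float InvSqrtSteps(float x, int steps)
{
    float xhalf = 0.5f * x;
    int32_t i;
    memcpy(&i, &x, sizeof i);           /* get bits for floating value */
    i = 0x5f375a86 - (i >> 1);          /* initial guess y0 */
    memcpy(&x, &i, sizeof x);           /* convert bits back to float */
    for (int n = 0; n < steps; ++n)
        x = x * (1.5f - xhalf * x * x); /* one Newton step per pass */
    return x;
}
```

Each extra step costs three multiplies and a subtract, so it is probably worth measuring whether a second step still beats the alternative you are comparing against.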
Don't do that - divides are slow! Instead multiply by the original number: x * (1/sqrt(x)) = sqrt(x). For most graphics work the reciprocal sqrt is what you want anyhow.
Una said: It depends how accurate you want it to be?
Newton-Raphson approximation.
This is a hell of a fast way to do it - just take 1/the result for the sqrt.
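A rough sketch of the multiply-instead-of-divide suggestion, x * (1/sqrt(x)) = sqrt(x). The names inv_sqrt_approx and fast_sqrt are made up for illustration, and clamping zero and negative input to 0 is an assumption, not something from the thread:

```c
#include <stdint.h>
#include <string.h>

/* Same approximation as InvSqrt in the thread, using memcpy for the bit copy. */
static float inv_sqrt_approx(float x)
{
    float xhalf = 0.5f * x;
    int32_t i;
    memcpy(&i, &x, sizeof i);
    i = 0x5f375a86 - (i >> 1);
    memcpy(&x, &i, sizeof x);
    return x * (1.5f - xhalf * x * x);
}

/* Hypothetical helper: sqrt(x) computed as x * (1/sqrt(x)), with no divide. */
float fast_sqrt(float x)
{
    if (x <= 0.0f)
        return 0.0f;   /* assumption: clamp zero and negative input to 0 */
    return x * inv_sqrt_approx(x);
}
```

The result inherits the relative error of the inverse square root, so fast_sqrt(4.0f) comes out slightly below 2 rather than exactly 2.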
I think even SSE had support for doing fast approximate invsqrt(x); not sure what the best approach is these days though.
You could possibly make it quicker if you drop to assembly language and make better use of SSE2 registers...
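For the SSE route, here is a small sketch using compiler intrinsics rather than hand-written assembly. The wrapper name rsqrt_sse is hypothetical; the intrinsics come from xmmintrin.h, and the non-SSE fallback branch is only an assumption so the snippet builds on other targets:

```c
#if defined(__SSE__) || defined(_M_X64)
#include <xmmintrin.h>
#else
#include <math.h>
#endif

/* Hypothetical wrapper: approximate 1/sqrt(x) for a single float. */
float rsqrt_sse(float x)
{
#if defined(__SSE__) || defined(_M_X64)
    /* rsqrtss: hardware estimate, good to roughly 12 bits */
    return _mm_cvtss_f32(_mm_rsqrt_ss(_mm_set_ss(x)));
#else
    return 1.0f / sqrtf(x);   /* fallback for targets without SSE */
#endif
}
```

For a particle engine, the packed form _mm_rsqrt_ps handles four floats per instruction, which is likely where most of the gain would be. If the estimate is too coarse, one Newton step on top of it should tighten it.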
DaveF said: Don't do that - divides are slow! Instead multiply by the original number: x * (1/sqrt(x)) = sqrt(x).
Do you mind explaining the equations you're using that leave you needing a square root (rather than the inverse square root)?
yer_averagejoe said: Tried the Newton-Raphson method and the reciprocal method and decided to go with Newton-Raphson as it worked better.
When I tried using x*(invsqrt(x)) instead of dividing 1 by invsqrt(x), I got some very strange results.
That would be good, because you're wasting a lot of cycles doing two unneeded divides.
yer_averagejoe said: Well, I've written the algorithm right now in a step-by-step manner (I will change this of course, it just makes it easier to read while coding), so it does in fact use the inverse square root, but the way I have it right now it takes the square root and then divides, etc. So I can get around the problem of dividing by 1 when it comes to compressing the code.
Note that invsqrt(ai^2 + aj^2 + ak^2) * invsqrt(bi^2 + bj^2 + bk^2) = invsqrt[(ai^2 + aj^2 + ak^2)(bi^2 + bj^2 + bk^2)]
so therefore the cos(angle) is given by (a.b) * invsqrt(ai^2 + aj^2 + ak^2) * invsqrt(bi^2 + bj^2 + bk^2)
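The identity above means cos(angle) needs only one inverse square root and no divides. A sketch under some assumptions: the Vec3 type and the names dot, inv_sqrt_approx and cos_angle are invented here, and the inverse square root is the same approximation as InvSqrt earlier in the thread:

```c
#include <stdint.h>
#include <string.h>

typedef struct { float i, j, k; } Vec3;   /* hypothetical vector type */

/* Same approximation as InvSqrt in the thread, using memcpy for the bit copy. */
static float inv_sqrt_approx(float x)
{
    float xhalf = 0.5f * x;
    int32_t n;
    memcpy(&n, &x, sizeof n);
    n = 0x5f375a86 - (n >> 1);
    memcpy(&x, &n, sizeof x);
    return x * (1.5f - xhalf * x * x);
}

static float dot(Vec3 a, Vec3 b)
{
    return a.i * b.i + a.j * b.j + a.k * b.k;
}

/* cos(angle) = (a.b) * invsqrt[(a.a)(b.b)]: one invsqrt, no divides. */
float cos_angle(Vec3 a, Vec3 b)
{
    return dot(a, b) * inv_sqrt_approx(dot(a, a) * dot(b, b));
}
```

One thing to watch: the product of the two squared lengths can overflow a float for very long vectors, so this form probably suits vectors of moderate magnitude best.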