CUDA code crunching: Nvidia vs ATi cores benchmark

igrargpu is a utility that uses both CPU cores and GPU cores, on cards up to the 5870, to break codes.
It's not a fancy graphics benchmark, just a pure-numbers compute speed test.






Description

This software uses ATI RV670/770/870 and nVidia "CUDA" video cards to recover passwords for RAR v3.x archives. Recovery speed on an HD4850 is about 20 times better than a single core of a Q6600 @ 2.4GHz (comparing only optimized SSE2 code running on one core; currently only two programs are heavily optimized for RAR 3.x recovery -- crark & ARCHPR). Against non-optimized CPU versions it can be [place your favorite number here, like 100x or 200x].

Performance on nVidia cards is slower than on ATI ones (because of the nVidia GPU architecture). So an 8600 GT is about 2x faster than a single core of a Q6600 @ 2.4GHz, and a GTX 260 w/ 192SP is about 12 times faster than a single core of a Q6600 @ 2.4GHz.

Plain numbers (for RAR passwords with length == 4) are:
~168 passwords per second on single core of Q6600 @ 2.4Ghz (crark's result)
~325 passwords per second on 8600 GT
~3350 passwords per second on ATI HD4850
~2075 passwords per second on GTX260/192SP
Note that password recovery speed is not constant for a RAR archive; it depends on password length (i.e. shorter passwords are checked faster than longer ones).
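
To put those rates into perspective, here's a rough back-of-the-envelope sketch (plain host-side code, nothing GPU-specific) of worst-case search times over an a-z charset at the speeds quoted above. The loop and totals are my own illustration, not something from the readme:

Code:
#include <stdio.h>

int main(void)
{
    /* passwords per second, as quoted above */
    const double rates[] = { 168.0, 325.0, 2075.0, 3350.0 };
    const char  *names[] = { "Q6600 single core", "8600 GT", "GTX260/192SP", "HD4850" };

    for (int len = 4; len <= 6; ++len) {
        /* candidates of length 1..len over a 26-letter charset */
        double total = 0.0, pw = 1.0;
        for (int i = 1; i <= len; ++i) { pw *= 26.0; total += pw; }

        printf("up to %d chars: %.0f candidates\n", len, total);
        for (int j = 0; j < 4; ++j)
            printf("  %-18s worst case ~%.1f hours\n",
                   names[j], total / rates[j] / 3600.0);
    }
    return 0;
}

For example, there are 26^4 = 456,976 four-letter candidates, which the HD4850 at ~3,350/s chews through in a couple of minutes, while each extra character multiplies the work by 26.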



For this test, just unrar everything to a directory and drag and drop the example encrypted RAR onto igrargpu.exe.


The RAR will give a password error; that's the whole point. You break the password with brute-force processing power, so just unrar everything but that passworded file and then drag and drop it onto igrargpu to get your performance figure.

My result above is 3,750 passwords per second with two CPU cores and a 4870, but I'm sure someone can do better.

Try to take the screenshot once it's on a 5-letter password or longer to get an accurate result; it shouldn't take long on any modern setup.


http://rapidshare.com/files/371199338/_igrargpu_v05.zip.html
 
Interesting how ATi is quicker than the equivalent NV... usually Nvidia has the edge in number-crunching things like this.

Must be the way the software is written? Perhaps it takes advantage of ATi's different structure?
 
Interesting how ATi is quicker than the equivalent NV... usually Nvidia has the edge in number-crunching things like this.

Must be the way the software is written? Perhaps it takes advantage of ATi's different structure?

Most likely optimised for VLIW.

http://en.wikipedia.org/wiki/Radeon_R600 said:
The new unified shader functionality is based upon a Very long instruction word (VLIW) architecture in which the core executes operations in parallel.

That's taken from the R600 article but I believe the Unified Shader Architecture ATI uses has not changed since (apart from getting faster, obviously).
 
******************************************************************
*** RAR GPU Password Recovery v0.5 ***
*** For ATI RV670/770/870 cards and nVidia 'CUDA' ones (G80+) ***
*** (c) 2009 Ivan Golubev, http://golubev.com ***
*** see "readme.htm" for more details ***
******************************************************************

Opened [C:\Users\Desktop\New folder\test.rar]
Found 1 compatible CAL device(s).
CAL device #0 [RV870]: GPU clock = 765.000 Mhz, SP = 1440.
Blocksize = 128 x 64
Starting brute-force attack, Charset Len = 26, Min passlen = 1, Max passle
Charset [abcdefghijklmnopqrstuvwxyz]
Starting from [a]
CURPWD: adsyp DONE: 00.55% ETA: 26m 53s AVRSPD: 7324.0 CURSPD: 6644.0 @ 5
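
For anyone wondering how to read that progress line: AVRSPD is the average passwords per second, DONE is the fraction of the keyspace covered, and the ETA presumably just comes from the remaining work divided by the speed. Here's a small sketch of that arithmetic, assuming the run above is a-z with a max length of 5 (the max-length field is cut off in the paste, so that part is a guess):

Code:
#include <stdio.h>

int main(void)
{
    /* a-z candidates of length 1..5 (Charset Len = 26 per the output above;
       max length 5 is an assumption -- that field is truncated in the paste) */
    double total = 0.0, pw = 1.0;
    for (int i = 1; i <= 5; ++i) { pw *= 26.0; total += pw; }

    const double done   = 0.0055;    /* DONE: 00.55% */
    const double avrspd = 7324.0;    /* AVRSPD from the progress line */

    double remaining = total * (1.0 - done);
    printf("keyspace %.0f, remaining %.0f, ETA ~%.0f min\n",
           total, remaining, remaining / avrspd / 60.0);
    return 0;
}

That comes out at roughly 28 minutes, which is in the same ballpark as the 26m 53s the tool reported.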

Spec as sig
 
The GPU sure does get hot; it also lags the hell out of the Windows 7 UI.

Lag isn't the same as slow :p.

I tried it before, and when it got to 5 characters it was doing around 3,700/s.

That's with a Q6700 currently at stock and a 4850@700/993.
 
Getting around 8k/s at 5 chars with a 5850 (OC'd to 900/1150) and an X4 965 at stock... not too bad :) Only using 1 CPU core though :(
 
It probably runs faster on the ATi hardware because it's using CAL, which is basically a thin abstraction over direct hardware calls, whereas on Nvidia hardware it's using C for CUDA, which is a much higher-level (much more abstracted) programming language. Or it could be that AMD hardware handles non-floating-point data much better than Nvidia hardware (this should be working with character data).

Edit: To follow that up, it's making my stock 4870 hit almost 50% fan speed, which is quite loud (got nothing on the 480 though ;)), and my system isn't as responsive as it could be. I've got an AVRSPD of about 3700 at the moment.
 
This is the thing: once AMD can cheaply produce cards, is fully up and running at GloFo with as much capacity as it wants, and actually goes after the GPGPU market, it could pretty much kill it.

It's not always easy to code for AMD's architecture, but there's huge potential for a massive number of GPGPU-style apps to get significantly higher efficiency than in gaming, which is so unpredictable that keeping the entire shader array filled with work is very difficult.

In gaming AMD is only averaging around 60-70% GPU utilisation; even at an apparent GPU load of 100% it isn't filling every single shader on the GPU (that's basically impossible), and it's very hard to get above 70% on average. That's why AMD's theoretical performance numbers utterly blow Nvidia's out of the water: in any GPGPU app that can utilise the AMD architecture fully, Nvidia has no chance of matching it on performance.

Thing is, right now AMD is literally selling everything it can make at TSMC; it can't get more production and can't sell more cards, and it's pointless going after a market you can't supply cards for. AMD has always had a lower allocation than Nvidia because Nvidia is the preferred long-term partner, and more recently TSMC favours Nvidia because it knows AMD is off to GloFo very shortly and it wants to keep Nvidia happy.
 
It probably runs faster on the ATi hardware because it's using CAL, which is basically a thin abstraction over direct hardware calls, whereas on Nvidia hardware it's using C for CUDA, which is a much higher-level (much more abstracted) programming language. Or it could be that AMD hardware handles non-floating-point data much better than Nvidia hardware (this should be working with character data).

It could also be that this software is able to fully load all the shaders on ATI hardware, which means having 1600 shaders available ends up a lot faster than Nvidia's lower shader count.
 
Brute-force password crunching is pretty simple: no floating point, very little complexity. It plays to ATI's hardware strengths, but it doesn't give you any idea how either GPU would handle more complex maths crunching.
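
Right, it's pure integer work. As a rough illustration (my own sketch, not igrargpu's actual kernel), the candidate-generation side of a brute forcer is just each thread doing a base-26 decode of its global index; all the real time would go into the RAR key-derivation check that follows:

Code:
// Hypothetical CUDA sketch of brute-force candidate generation -- integer ops only.
#include <cstdio>
#include <cuda_runtime.h>

#define PASSLEN 4
#define CHARSET_LEN 26

__global__ void gen_candidates(char *out, unsigned int total)
{
    unsigned int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx >= total) return;

    unsigned int n = idx;
    char pwd[PASSLEN];
    for (int i = PASSLEN - 1; i >= 0; --i) {   // base-26 decode of the thread index
        pwd[i] = 'a' + (n % CHARSET_LEN);
        n /= CHARSET_LEN;
    }
    // A real cracker would now run the RAR 3.x key derivation (lots of SHA-1
    // rounds) plus the AES check on pwd[] -- that's where the time actually goes.
    for (int i = 0; i < PASSLEN; ++i)
        out[idx * PASSLEN + i] = pwd[i];
}

int main()
{
    const unsigned int total = 26u * 26u * 26u * 26u;   // 26^4 = 456,976 candidates
    char *d_out = NULL;
    cudaMalloc(&d_out, total * PASSLEN);
    gen_candidates<<<(total + 255) / 256, 256>>>(d_out, total);
    cudaDeviceSynchronize();

    // Pull back the last candidate just to show the decode works (prints "zzzz").
    char last[PASSLEN + 1] = { 0 };
    cudaMemcpy(last, d_out + (total - 1u) * PASSLEN, PASSLEN, cudaMemcpyDeviceToHost);
    printf("last candidate: %s\n", last);
    cudaFree(d_out);
    return 0;
}

Every thread works on an independent candidate with no shared state, which is exactly the kind of workload that scales cleanly with shader count.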
 
It would be more interesting to see some mixed tests: a bit of physics simulation, AI pathfinding, etc. I could probably whip something up with CUDA, but I've no idea where to start for ATI cards and not much OpenCL experience.
 
How do you get this running? I find it confusing. In layman's terms please: I can get cmd running and bring up RAR GPU, but I'm at a loss as to what to type in.
 
Drag the RAR file onto the executable.

OK, that was worrying: my temp got to 67 degrees and it said the fan was at 99%, yet the fan didn't even speed up. Hmm.

Thanks btw, worked like a charm; handy for those RARs I can't open. ;)

Card is a 5970 on 10.2
 