Seti@Home optimized science apps and information
 
Author Topic: GPU crunching question  (Read 39833 times)
Freddy
Knave
« Reply #45 on: 21 Feb 2007, 02:20:46 am »

Tested with an 8800 GTS 640 MB (memory and GPU clocks left at stock).

min_n = 4
max_n = 4
RapidMind FFT Benchmark
-----------------------------------------------
Length: 16 = 2^4
Warming up...
Run timings, to and from host (in us):
10095.2 8976.7 9132.39 8718.98 8906.92
8904.71 8715.21 8833.48 8783.14 8836.1
8674.97 8913.12 8764.64 8645.37 8741.8
8818.75 9024.37 8807.76 8826.81 8911.87
9002.08 9067.97 8945.69 8910.78 8722.34
8785.37 8814.4 8836.28 8834.39 8795.27
8778.69 8968.62 8747 8943.26 9291.43
8890.32 8932.17 8860.98 8739.06 8734.42
8871.18 8755.89 8868.9 9068.03 8763.38
9002.55 8814.57 8864.37 8823.38 8856.53
8831.87 8614.2 8851.8 8697.95 8952.61
8711.42 8683.05 8912.46 8763.43 8755.46
8718.52 9060.99 8932.78 8812.21 8834.16
8825.66 8653.1 8801.54 8859.38 8665.22
8906.53 8957.47 8860.75 8777.11 8759.25
8845.62 9030.77 8915.02 8858.34 8676.31
8819.07 9009.46 8837.26 8762.6 8834.04
7046.69 8719.74 8610.55 8890.17 8839.04
9646.3 8775.46 8739.86 8720.51 9064.7
8947.07 8705.96 8704.77 8867.14 8880.16
Average execution time: 8842.67us
Normalized execution time (T/N): 552.667us/sample
Normalized by complexity (T/N lg N): 138.167
Mflops (5 N lg N/T): 0.0361882
Average execution time: 8842.67us
Minimum execution time: 7046.69us
Normalized average execution time (T/N): 552.667us/sample
Normalized minimum execution time (T/N): 440.418us/sample
Average time normalized by complexity (T/N lg N): 138.167
Minimum time normalized by complexity (T/N lg N): 110.105
Average Mflops (5 N lg N/T): 0.0361882
Peak Mflops (5 N lg N/T): 0.0454114
---
Warming up...
Run timings, GPU-local (in us):
8263.18 8381.39 8462.2 8356.22 8373.54
8503.47 8716.67 8385.77 8394.17 8419.64
8659.13 8294.88 8407.95 8567.22 8493.25
8384.13 8477.74 8508.42 8552.66 8398.76
8761.34 8573.63 8430.25 8437 8615.68
8464.32 8483.02 8540.84 8564.65 8566.38
8503.04 8614.77 8437.5 8545.99 8401.69
8442.15 8832.88 8638.04 8456.14 8492.51
8693.16 8371.29 8350.92 8427.35 8414.12
8851.89 8438.03 8443.12 8503.04 8665.21
8719.99 8375.58 8501.07 8526.01 8325.1
8614.5 8433.29 8432.5 8532.22 8529.62
8481.02 8251.49 8543.71 8523.21 8422.35
8640.62 8603.52 8661.46 8479.36 8548.6
8649.6 8542.74 8373.39 8379.29 8413.56
8598.13 8549.43 8460.99 8544.15 8515.79
8576.4 8485.85 8558.77 8380.95 8520.18
8764.88 8403.96 8483.77 8752.86 7361.6
8661.36 8332.67 8480.45 8310.8 8649.39
8708.75 8560.87 8488.33 8491.4 8473.15
Average execution time: 8495.79us
Minimum execution time: 7361.6us
Normalized average execution time (T/N): 530.987us/sample
Normalized minimum execution time (T/N): 460.1us/sample
Average time normalized by complexity (T/N lg N): 132.747
Minimum time normalized by complexity (T/N lg N): 115.025
BenchFFT average Mflops (5 N lg N/T): 0.0376657
BenchFFT peak Mflops (5 N lg N/T): 0.0434688
Residuals (compare with inverse):
  Average absolute: 1.26059e-008
  Maximum absolute: 5.96046e-008
  Average relative: -1.#IND
  Maximum relative: 1.#INF
-----------------------------------------------


RapidMind 2D FFT Benchmark
===============================================
Size: 256 x 256 = 2^8 x 2^8
Radix: 4 = 2^2
Total number of floating point operations: 5.24288e+006

Run timings, to and from host (in ms):

Average execution time: 13.7757ms
Overall average execution time: 13.7762ms
Minimum execution time: 13.2051ms
Average Mflops: 380.589
Peak Mflops: 397.035

Run timings, GPU-local (in ms):

Average execution time: 12.1273ms
Overall average execution time: 12.1279ms
Minimum execution time: 11.7326ms
Average Mflops: 432.32
Peak Mflops: 446.865


Both tests end with a memory read error.
The OS is Windows XP Pro 32-bit; .NET 2.0 is not installed.

I'll look into the errors later, once work is over...
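
For reference, the derived figures in the fft.exe output follow directly from the average time T and the transform length N. Below is a minimal Python sketch that reproduces them from the averages printed above. The function name `normalized_times` is made up for illustration and is not part of the RapidMind benchmark.

```python
import math

def normalized_times(t_us, n):
    """Return (T/N, T/(N lg N)) for a length-n FFT that took t_us microseconds."""
    per_sample = t_us / n
    per_complexity = t_us / (n * math.log2(n))
    return per_sample, per_complexity

# Average times reported above for the length-16 transform.
host = normalized_times(8842.67, 16)    # to and from host
local = normalized_times(8495.79, 16)   # GPU-local

print(host)
print(local)
```

The results agree with the "T/N" and "T/N lg N" lines of the log to the printed precision.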
Devaster
Code Wizard
Knight Templar
« Reply #46 on: 21 Feb 2007, 05:36:35 am »

For the G80, a CUDA version would be better. I can look on my home computer for some apps by Hans Dorn; he built some test apps based on CUDA...

WR-HW95
Knave
« Reply #47 on: 22 Feb 2007, 09:42:06 am »

With 8800GTX @ 612/975

Code:
C:\Release-vc8>fft.exe
min_n = 4
max_n = 4
RapidMind FFT Benchmark
-----------------------------------------------
Length: 16 = 2^4
Warming up...
Run timings, to and from host (in us):
11561.3 10482.5 8229.39 12829.6 8740.71
9539.26 9745.74 10875.1 11149.2 9760.27
12356 8845.49 11541.2 8558.26 9808.89
9916.74 9238.06 9773.12 8477.23 7909.47
11607.7 10333.6 7918.13 11377.5 7920.09
10473.6 8454.32 9801.9 10972.9 10767
9267.11 11145.3 9876.5 9839.62 13427.2
8664.71 10973.7 11119.3 9176.86 9062.31
9811.68 8923.72 7202.85 9036.6 9994.13
8747.42 10002.8 10443.1 9761.39 9866.44
10177.1 10808.3 8371.89 10052 9621.96
10266 11904.4 9640.12 9375.24 8899.69
9294.78 10726.2 6828.72 12483.1 9911.99
12466.6 8385.58 7925.68 10416.3 9766.97
9917.02 11196.4 9642.64 10324.1 11035.8
9518.3 8512.15 10829 9727.86 12404.3
10707.5 10192.5 10868.4 7899.13 9340.32
8048.62 7750.77 11226.9 8889.35 9273.54
7777.87 7842.69 7471.92 8830.4 10697.4
11466.3 8701.59 8419.39 7942.44 9761.11
Average execution time: 9788.45us
Normalized execution time (T/N): 611.778us/sample
Normalized by complexity (T/N lg N): 152.945
Mflops (5 N lg N/T): 0.0326916
Average execution time: 9788.45us
Minimum execution time: 6828.72us
Normalized average execution time (T/N): 611.778us/sample
Normalized minimum execution time (T/N): 426.795us/sample
Average time normalized by complexity (T/N lg N): 152.945
Minimum time normalized by complexity (T/N lg N): 106.699
Average Mflops (5 N lg N/T): 0.0326916
Peak Mflops (5 N lg N/T): 0.0468609
---
Warming up...
Run timings, GPU-local (in us):
10815.9 11730.4 7816.99 7627.83 9804.42
9321.6 9801.34 9725.06 7585.92 9003.07
9982.68 6766.42 10917.9 8505.45 7894.38
10349.5 8926.79 11731.8 7668.62 8905.56
11206.2 9771.44 11598.2 8679.8 9933.78
9116.51 8855.83 9696 9815.87 8695.17
12109.5 9716.4 8787.65 8662.48 8444.54
7717.24 8718.36 9792.96 10747.7 9169.6
11555.5 8955.85 9709.7 6659.12 10377.2
9286.95 10160.9 11761.7 8587.87 12249.8
8761.67 10833.5 9495.95 7892.71 9270.47
9678.68 10709.1 9684.55 7819.5 10225.5
8822.58 12600.2 8660.8 8996.09 11010.3
6783.74 10320.5 10069.9 9703.83 10450.1
7650.74 10810.8 10639.8 9755.24 11815.3
8054.21 7740.15 10277.5 10128.5 10209.3
6895.78 7671.42 9653.26 9822.86 12298.4
10547.4 7820.62 7712.77 6761.39 8859.18
7419.95 8623.08 7702.71 8842.41 9383.91
9820.06 7636.21 8563.29 9718.36 8473.6
Average execution time: 9385.19us
Minimum execution time: 6659.12us
Normalized average execution time (T/N): 586.574us/sample
Normalized minimum execution time (T/N): 416.195us/sample
Average time normalized by complexity (T/N lg N): 146.644
Minimum time normalized by complexity (T/N lg N): 104.049
BenchFFT average Mflops (5 N lg N/T): 0.0340963
BenchFFT peak Mflops (5 N lg N/T): 0.0480544
Residuals (compare with inverse):
  Average absolute: 1.26059e-008
  Maximum absolute: 5.96046e-008
  Average relative: -1.#IND
  Maximum relative: 1.#INF
-----------------------------------------------

Code:
C:\Release-vc8>fft2d.exe
RapidMind 2D FFT Benchmark
===============================================
Size: 256 x 256 = 2^8 x 2^8
Radix: 4 = 2^2
Total number of floating point operations: 5.24288e+006

Run timings, to and from host (in ms):

Average execution time: 15.6239ms
Overall average execution time: 15.6285ms
Minimum execution time: 13.4389ms
Average Mflops: 335.568
Peak Mflops: 390.126

Run timings, GPU-local (in ms):

Average execution time: 13.8474ms
Overall average execution time: 13.851ms
Minimum execution time: 10.7656ms
Average Mflops: 378.619
Peak Mflops: 487.004

It looks like this depends quite a bit on CPU speed too... The runs above were made with 2x Rosetta running on a 3.05 GHz Opteron 175.

I suspended BOINC and ran fft2d again.

Code:
C:\Release-vc8>fft2d.exe
RapidMind 2D FFT Benchmark
===============================================
Size: 256 x 256 = 2^8 x 2^8
Radix: 4 = 2^2
Total number of floating point operations: 5.24288e+006

Run timings, to and from host (in ms):

Average execution time: 14.0743ms
Overall average execution time: 14.0783ms
Minimum execution time: 13.1137ms
Average Mflops: 372.515
Peak Mflops: 399.801

Run timings, GPU-local (in ms):

Average execution time: 12.3266ms
Overall average execution time: 12.3304ms
Minimum execution time: 10.2948ms
Average Mflops: 425.332
Peak Mflops: 509.276
« Last Edit: 22 Feb 2007, 09:47:17 am by WR-HW95 »
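
To put a number on the CPU-load effect, here is a small Python sketch comparing the average fft2d times with BOINC running against the run with BOINC suspended. It uses only the averages from the logs above; the helper name `slowdown_percent` is made up for illustration.

```python
def slowdown_percent(loaded_ms, idle_ms):
    """How much longer the loaded run took, as a percentage of the idle run."""
    return (loaded_ms / idle_ms - 1.0) * 100.0

# Average execution times from the two fft2d.exe runs above.
host = slowdown_percent(15.6239, 14.0743)    # to and from host
local = slowdown_percent(13.8474, 12.3266)   # GPU-local

print(f"host: {host:.1f}%  GPU-local: {local:.1f}%")
```

That works out to roughly 11% and 12% longer with the CPU busy. It is a single pair of runs, so treat it as an indication rather than a measurement.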
pepperammi
Pre-Release Tester
Knight o' the round Table
« Reply #48 on: 22 Feb 2007, 07:57:55 pm »

Quote from: Devaster
For the G80, a CUDA version would be better. I can look on my home computer for some apps by Hans Dorn; he built some test apps based on CUDA...

I hear the 8900 series will have 25% more shaders or so and still use the G80 chip. Apparently they were there all along. Would that mean anything for all this?
I wonder if it will be possible to unlock them, as I think was possible on some older ATI cards at some point?
Devaster
Code Wizard
Knight Templar
« Reply #49 on: 24 Feb 2007, 06:24:22 pm »

As I wrote, for older cards BrookGPU or RapidMind is the better choice...
For new cards, CUDA (NVIDIA) or CTM (ATI) is better.

Devaster
Code Wizard
Knight Templar
« Reply #50 on: 24 Feb 2007, 06:32:34 pm »

As far as I can see in the RapidMind FFT source, the algorithm processes two complex values per pass (à la the RGBA texture format). This format is extremely efficient in vertex/pixel shaders and for memory transfers between the shaders and GPU memory...
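
As a rough illustration of the RGBA idea, the Python sketch below packs two complex samples into each four-float texel (R, G = first sample, B, A = second sample). This is not the RapidMind code: the pairing of adjacent samples and the names `pack_rgba` / `unpack_rgba` are assumptions for the example, and the real layout in the RapidMind source may differ.

```python
def pack_rgba(values):
    """Pack complex samples two at a time into (R, G, B, A) float tuples."""
    if len(values) % 2:
        raise ValueError("need an even number of complex samples")
    return [
        (a.real, a.imag, b.real, b.imag)
        for a, b in zip(values[0::2], values[1::2])
    ]

def unpack_rgba(texels):
    """Inverse of pack_rgba: recover the flat list of complex samples."""
    out = []
    for r, g, b, a in texels:
        out.append(complex(r, g))
        out.append(complex(b, a))
    return out

samples = [1 + 2j, 3 - 1j, 0.5j, -2 + 0j]
texels = pack_rgba(samples)   # half as many texels as samples
print(texels)
```

With this layout every texel fetch brings in two complex values, so a pass over N samples touches only N/2 texels.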

Devaster
Code Wizard
Knight Templar
« Reply #51 on: 24 Feb 2007, 06:40:42 pm »

Off topic: Code Wizard, cool :)

My name is yellow :o
« Last Edit: 24 Feb 2007, 07:04:50 pm by Devaster »

Simon
Ni!
Lord o' the Board
Knight who says 'Ni!'
« Reply #52 on: 24 Feb 2007, 06:47:41 pm »

:D
I thought so, too. Keep up the good work!
Devaster
Code Wizard
Knight Templar
« Reply #53 on: 24 Feb 2007, 07:03:44 pm »

Maybe I have a good idea: modify the BOINC manager to use the GPU as an additional core...
If you have a usable GPU, you can run another instance of SETI on it...

There would be a small performance hit (about 10 percent in my tests).

pepperammi
Pre-Release Tester
Knight o' the round Table
« Reply #54 on: 24 Feb 2007, 09:17:24 pm »

I was reading an article the other day saying that the G80 is more like an x86 processor than what we normally think of as a GPU.
http://news.softpedia.com/news/G80-Is-Actually-a-CPU-44724.shtml
Gecko_R7
Global Moderator
Knight Templar
« Reply #55 on: 25 Feb 2007, 12:45:09 am »

Devaster: So, if a person were running S@H on a C2D and had a graphics card, BOINC would recognize the GPU as a third processor and manage the GPU's own client? Well, even if the GPU lost 10% performance, being able to run the CPU clients simultaneously looks like quite a gain in aggregate compared with GPU-only crunching at 100%.

This sounds pretty darn cool! :D
Good luck!
Alex Kan
Code Wizard
Knight o' the Realm
« Reply #56 on: 25 Feb 2007, 05:12:43 pm »

Devaster: Neither of the data points you've picked for fft.exe is representative of SETI's FFT workload: SETI doesn't do two-dimensional FFTs, and it spends much more time doing FFTs with lengths between 16K and 128K than it does any other lengths.

Also, if you're using the standard MFLOPS = 5 N log2(N) / (1000 * time in ms) metric for FFT performance, those figures strike me as a bit on the low side. A lot of those speeds seem no faster than (or worse, slower than) doing the same computations on the CPU with tuned libraries. Does RapidMind provide built-in functionality for computing FFTs?
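
To make the metric concrete, here is a short Python sketch of the formula quoted above, checked against two of the averages posted earlier in the thread. The function name `fft_mflops` is made up for illustration. The 256 x 256 transform is treated as N = 65536 points, which gives 5 * 65536 * 16 = 5.24288e6 flops, the same count the fft2d benchmark prints.

```python
import math

def fft_mflops(n, time_ms):
    """Nominal FFT speed: 5 N log2(N) flops divided by the elapsed time."""
    return 5 * n * math.log2(n) / (1000 * time_ms)

# Length-16 run from fft.exe: 8842.67 us average, to and from host.
small = fft_mflops(16, 8842.67 / 1000)

# 256 x 256 run from fft2d.exe: 13.7757 ms average, to and from host.
big = fft_mflops(256 * 256, 13.7757)

print(f"{small:.7f} {big:.3f}")

# Nominal flop counts for the lengths SETI spends most of its time on (16K to 128K).
for k in range(14, 18):
    n = 2 ** k
    print(n, 5 * n * k)
```

Both values match the "Mflops" lines in the posted logs, so the benchmarks appear to use this same convention. The last loop only lists nominal flop counts for the 16K to 128K lengths; no timings for those lengths have been posted here.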
Devaster
Code Wizard
Knight Templar
« Reply #57 on: 25 Feb 2007, 05:49:04 pm »

From my side: what matters to me is not whether the FFT on the GPU is faster, but that you are using additional compute power for crunching...

pepperammi
Pre-Release Tester
Knight o' the round Table
« Reply #58 on: 27 Feb 2007, 02:58:39 pm »

Is this article at all useful or interesting? It's a bit over my head, to be honest ;)
http://arstechnica.com/news.ars/post/20070227-8931.html
Devaster
Code Wizard
Knight Templar
« Reply #59 on: 28 Feb 2007, 09:47:24 am »

Tomorrow afternoon I will post the first test of SETI on the GPU here: Nagas FFT, power spectrum, and data chirping in RapidMind. Stay tuned and be patient!