Topic: GPU client (Read 30587 times)
|
|
Devaster
|
You can use the KNAbench system for speed comparison ...
|
|
|
|
|
Logged
|
|
|
|
|
TheMule
|
Ok, not what I expected. Using KNAbench and work unit 1, I got:
226 sec - setiathome_6.01_windows_intelx86
203 sec - setiathome_5.27_windows_intelx86
About 23 sec slower. Is it due to the FFT messages on the screen? Data follows:
setiathome_5.27_windows_intelx86.exe -nographics / testWU-1.wu :
Started at : 13:53:37
Ended at : 13:57:00
Elapsed time: 203 seconds
[ stderr ]
Can't set up shared mem: -1
Will run in standalone mode.
setiathome_enhanced 5.27 DevC++/MinGW
Work Unit Info:
...............
WU true angle range is : 0.604884
Optimal function choices:
-----------------------------------------------------
name
-----------------------------------------------------
v_BaseLineSmooth (no other)
v_vGetPowerSpectrumUnrolled 0.00006 0.00000
sse3_ChirpData_ak 0.00899 0.00000
v_vTranspose4 0.00143 0.00000
AK SSE folding 0.00076 0.00000
Flopcounter: 637401180238.359500
Spike count: 0
Pulse count: 0
Triplet count: 0
Gaussian count: 0
[ /stderr ]

setiathome_6.01_windows_intelx86.exe -nographics / testWU-1.wu :
Started at : 13:44:56
Ended at : 13:48:42
Elapsed time: 226 seconds
[ stderr ]
Device name: GeForce 8800 GTS 512
Device version: 1.1
Total global memory (MB): 512
Number of multiprocessors : 16
Number of cores :128
Shared memory per block (kB): 16
Registers per block: 8192
Warp size: 32
Max threads per block: 512
Shaders clock rate (MHz): 1674
Concurrent copy and execution: No
Can't set up shared mem: -1
Will run in standalone mode.
setiathome_enhanced 6.01 Visual Studio/Microsoft C++
libboinc: 6.3.4
Work Unit Info:
...............
WU true angle range is : 0.604884
Flopcounter: 627299330081.366820
Spike count: 0
Pulse count: 0
Triplet count: 0
Gaussian count: 0
called boinc_finish
[ /stderr ]
------------
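As a quick sanity check on the two runs above, the elapsed times can be recomputed from the start and end timestamps in the logs. This is only a sketch: the helper name `elapsed_seconds` is made up for illustration and is not part of KNAbench, and it assumes both timestamps fall on the same day.

```python
from datetime import datetime


def elapsed_seconds(start: str, end: str) -> int:
    """Whole seconds between two same-day HH:MM:SS timestamps (hypothetical helper)."""
    fmt = "%H:%M:%S"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds())


# Timestamps copied from the two log dumps above.
cpu_5_27 = elapsed_seconds("13:53:37", "13:57:00")
gpu_6_01 = elapsed_seconds("13:44:56", "13:48:42")

# Relative slowdown of the 6.01 run against the 5.27 run.
slowdown_pct = 100.0 * (gpu_6_01 - cpu_5_27) / cpu_5_27

print(cpu_5_27, gpu_6_01, gpu_6_01 - cpu_5_27, round(slowdown_pct, 1))
# prints: 203 226 23 11.3
```

So the 23 second gap corresponds to the 6.01 build being roughly 11% slower on this work unit.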
|
|
|
|
|
Logged
|
|
|
|
|
Devaster
|
okay : new code - now 64-bit ...
as previous 32-bit build ....
compiled with VS2008+VS2005 under Windows Server 2008 x64
small test :
============
setiathome_6.00S08_windows_intelx86.exe -verb -nog / testWU-4.wu :
Started at : 20:57:18.970
Ended at : 21:00:46.190
207.126 secs Elapsed
199.109 secs CPU time
[ stderr ]
Can't set up shared mem: -1
Will run in standalone mode.
setiathome_enhanced 6.00S08 DevC++/MinGW
libboinc: 6.1.6
DataIn=0x32b00c0, ChirpedData=0x2aa0040
Work Unit Info:
...............
WU true angle range is : 1.279649
Optimal function choices:
-----------------------------------------------------
name timing error
-----------------------------------------------------
v_BaseLineSmooth (no other)

v_GetPowerSpectrum 0.00079 0.00000 test
v_vGetPowerSpectrum 0.00073 0.00000 test
v_vGetPowerSpectrum2 0.00075 0.00000 test
v_vGetPowerSpectrumUnrolled 0.00076 0.00000 test
v_vGetPowerSpectrumUnrolled2 0.00075 0.00000 test
v_vGetPowerSpectrum 0.00073 0.00000 choice

v_ChirpData 0.03327 0.00000 test
fpu_ChirpData 0.04556 0.00000 test
v_vChirpData_x86_64 0.24693 0.00002 test
sse1_ChirpData_ak 0.03216 0.00000 test
sse2_ChirpData_ak 0.03455 0.00000 test
sse3_ChirpData_ak 0.02924 0.00000 test
sse3_ChirpData_ak 0.02924 0.00000 choice

v_Transpose 0.04322 0.00000 test
v_Transpose2 0.02599 0.00000 test
v_Transpose4 0.01550 0.00000 test
v_Transpose8 0.02781 0.00000 test
v_pfTranspose2 0.02539 0.00000 test
v_pfTranspose4 0.01571 0.00000 test
v_pfTranspose8 0.02681 0.00000 test
v_vTranspose4 0.01173 0.00000 test
v_vTranspose4np 0.01197 0.00000 test
v_vTranspose4ntw 0.01090 0.00000 test
v_vTranspose4x8ntw 0.00758 0.00000 test
v_vTranspose4x16ntw 0.00580 0.00000 test
v_vpfTranspose8x4ntw 0.01072 0.00000 test
v_vTranspose4x16ntw 0.00580 0.00000 choice

FPU opt folding 0.00423 0.00000 test
AK SSE folding 0.00220 0.00000 test
BH SSE folding 0.00201 0.00000 test
BH SSE folding 0.00201 0.00000 choice

Flopcounter: 243285924139.522000
Spike count: 0
Pulse count: 0
Triplet count: 0
Gaussian count: 0
called boinc_finish
[ /stderr ]
------------
setiathome_6.01_windows_intelx64.exe -verb -st / testWU-4.wu :
Started at : 21:00:46.346
Ended at : 21:03:02.643
136.219 secs Elapsed
128.750 secs CPU time
Speedup : 35.34%
Ratio : 1.55 x
Result : Strongly similar, Q= 99.99%
[ stderr ]
Device name: GeForce 9600 GT
Device version: 1.1
Total global memory (MB): 512
Number of multiprocessors : 8
Number of cores :64
Shared memory per block (kB): 16
Registers per block: 8192
Warp size: 32
Max threads per block: 512
Shaders clock rate (MHz): 1625
Concurrent copy and execution: No
Can't set up shared mem: -1
Will run in standalone mode.
setiathome_enhanced 6.01 Visual Studio/Microsoft C++
libboinc: 6.3.5
Work Unit Info:
...............
WU true angle range is : 1.279649
Flopcounter: 238022320153.522060
Spike count: 0
Pulse count: 0
Triplet count: 0
Gaussian count: 0
called boinc_finish
[ /stderr ]
------------
Quick timetable
WU : testWU-4.wu
setiathome_6.00S08_windows_intelx86.exe : 199.109 secs CPU
setiathome_6.01_windows_intelx64.exe : 128.750 secs CPU
Speedup : 35.34%
Ratio : 1.55 x
------------
CPU:
Number of processors 1
Number of cores 1 (max 1)
Specification AMD Athlon(tm) 64 Processor 3000+
Codename Venice
Core Speed 1005.3 MHz (5.0 x 201.1 MHz)
Core Stepping DH-E6
Technology 90 nm
Stock frequency 1800 MHz
------------
Chipset:
Northbridge NVIDIA nForce4 rev. A3
Southbridge NVIDIA nForce4 MCP rev. A3
------------
RAM:
Memory Type DDR
Memory Size 2048 MBytes
Memory Frequency 201.1 MHz (CPU/5)
Max bandwidth PC3200 (200 MHz)
CAS# 3.0
RAS# to CAS# 3
RAS# Precharge 3
Cycle Time (tRAS) 8
DRAM Idle Timer 16
------------
OS:
Windows Version Microsoft Windows Vista (6.0) Enterprise Edition (Full) Service Pack 1 (Build 6001)
============
The app was running at 100 percent almost all the time - MS has done a very good job with Server 2008 performance ....
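For anyone comparing their own runs, the "Speedup" and "Ratio" lines in the quick timetable can be reproduced from the two CPU times. The formulas below are an assumption about how the benchmark derives them (they are not taken from its source), but they do give the reported 35.34% and 1.55 x for this test. The function names are made up for illustration.

```python
def speedup_percent(baseline_s: float, new_s: float) -> float:
    """Time saved relative to the baseline run, in percent (assumed definition)."""
    return 100.0 * (baseline_s - new_s) / baseline_s


def ratio(baseline_s: float, new_s: float) -> float:
    """How many times faster the new run is than the baseline (assumed definition)."""
    return baseline_s / new_s


# CPU times from the quick timetable above (testWU-4.wu).
cpu_app_s = 199.109   # setiathome_6.00S08_windows_intelx86.exe
gpu_app_s = 128.750   # setiathome_6.01_windows_intelx64.exe

print(f"Speedup : {speedup_percent(cpu_app_s, gpu_app_s):.2f}%")
print(f"Ratio   : {ratio(cpu_app_s, gpu_app_s):.2f} x")
```

Note that both figures here are based on CPU time, not on the elapsed wall-clock time that is also listed for each run.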
|
|
|
|
Logged
|
|
|
|
|
Morten
|
Hi,
Tested x64-version and got this:
==================
Device name: Device Emulation (CPU)
Device version: 9999.9999
Total global memory (MB): 4095
Number of multiprocessors : 16
Number of cores :128
Shared memory per block (kB): 16
Registers per block: 8192
Warp size: 1
Max threads per block: 512
Shaders clock rate (MHz): 1350
Concurrent copy and execution: No
Can't set up shared mem: -1
Will run in standalone mode.
GPU memory allocation error (source buffer)
...
==================
I'm running the CUDA display driver NVIDIADisplayWinVista64(177_35)Int.exe on a GeForce 8800 GT
Morten
|
|
|
|
|
Logged
|
|
|
|
|
Devaster
|
Does anyone else have the same problem?
Try using the latest drivers ....
|
|
|
|
|
Logged
|
|
|
|
|
Morten
|
I found the cause of the problem:
I was connected to the machine using RDP/Terminal Services ("mstsc /v:computer /console"). In this session Nvidia is not available.
After testing this I have some questions/comments:
1: When running the executable it's using 100% CPU - shouldn't the CPU utilization be close to zero and the GPU be utilized to the max? As it is now it has no practical use, as I give away my CPU in order to utilize the GPU.
2: How do I install and run it in combination with BOINC? What is your roadmap/intention on this?
3: With the Terminal Services issue mentioned, it appears the only way to run interactively is being logged on locally/physically.
3a: The best way to run is as a service - do you have any suggestions/plans on how to facilitate a service installation, or should we just use sc.exe?
Morten
|
|
|
|
Logged
|
|
|
|
|
Devaster
|
1. For now it is not using streams, and only 10 % of the code has been ported to the GPU ...
2. This code is only a technology preview, so I don't know .....
3. I don't know of any workaround for Terminal Services ... sorry
4. Running as a service is managed by the BOINC core, not by the computing app ....
|
|
|
|
|
Logged
|
|
|
|
|
Morten
|
Hi,
Thanks for clearing that up.
Do you reckon it's realistic to port 100% to the GPU? Do you have an idea how much you will be able to port, and when? I think this is such an excellent idea and am really hoping you'll be able to pull it off!
M
|
|
|
|
|
Logged
|
|
|
|
|
cbuchner1
|
FFT and power spectrum on GPU
Are you making use of CUFFT's batching feature? If you do, you can basically run multiple FFTs with one CUDA call, which can save some API and kernel launch overhead.
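To make the batching idea concrete: in CUFFT the batch count is an argument when the 1D plan is created (`cufftPlan1d` takes the transform size and a batch count), and the input is one contiguous buffer holding `batch` signals of length `nx` back to back. The sketch below only models that data layout on the CPU in plain Python, with a naive DFT standing in for the GPU transform. The names `dft` and `batched_dft` are made up for illustration; the real library does all transforms in a single call rather than a Python loop, which is where the saved launch overhead comes from.

```python
import cmath


def dft(signal):
    """Naive O(n^2) forward DFT of one signal (CPU stand-in for a GPU FFT)."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def batched_dft(buffer, nx, batch):
    """Transform `batch` signals of length `nx` stored back to back in one buffer."""
    if len(buffer) != nx * batch:
        raise ValueError("buffer must hold exactly nx * batch samples")
    out = []
    for b in range(batch):
        # Signal b occupies the contiguous slice [b * nx, (b + 1) * nx).
        out.extend(dft(buffer[b * nx:(b + 1) * nx]))
    return out


# Two length-4 signals in one buffer: an impulse, then a constant.
spectra = batched_dft([1, 0, 0, 0, 1, 1, 1, 1], nx=4, batch=2)
print([round(abs(x), 6) for x in spectra])
```

The impulse transforms to a flat spectrum and the constant to a single non-zero bin, so the output makes it easy to see where one signal ends and the next begins in the shared buffer.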
|
|
|
|
|
Logged
|
|
|
|
|
Devaster
|
Yes, I used CUFFT batch mode ....
|
|
|
|
|
Logged
|
|
|
|
|
BerndBrot
|
okay : new code - now 64-bit ...
as previous 32-bit build ....
compiled with VS2008+VS2005 under Windows Server 2008 x64
small test :
The app was running at 100 percent almost all the time - MS has done a very good job with Server 2008 performance ....
How to install the test app?
|
|
|
|
|
Logged
|
|
|
|
|
Archangel999
|
all working fine for wu1
with the x64 6.01 app ----- 127 sec
with ak v8 SSSE3.1 ----- 69 sec
with ak v8 SSE4.1 ----- 45 sec
Best Regards
D.Draganov
Nvidia GeForce 8800GTX 768Mb
Core Duo E8500 @ 4.17
Windows x64 XP Pro
Just wondering if the GPU is at 100% load  myhahahaha and when it will be recognized as another processor, not a co-processor
Device name: GeForce 8800 GTX
Device version: 1.0
Total global memory (MB): 767
Number of multiprocessors : 16
Number of cores :128
Shared memory per block (kB): 16
Registers per block: 8192
Warp size: 32
Max threads per block: 512
Shaders clock rate (MHz): 1350
Concurrent copy and execution: No
Can't set up shared mem: -1
Will run in standalone mode.
setiathome_enhanced 6.01 Visual Studio/Microsoft C++
libboinc: 6.3.5
|
|
|
|
« Last Edit: 24 Aug 2008, 03:19:32 pm by Archangel999 »
|
Logged
|
Honda - The Power Of Dreams
|
|
|
|
Raistmer
|
What do you mean? What is your GPU timing?
|
|
|
|
|
Logged
|
|
|
|
|
Archangel999
|
What do you mean? What is your GPU timing?
All stock: engine 576, shader 1350, memory 1800, if I understand what you are asking
|
|
|
|
|
Logged
|
Honda - The Power Of Dreams
|
|
|
|
Raistmer
|
Ah, no, I asked how long it takes to run the GPU version of the SETI client on your host. You wrote:
with the x64 6.01 app ----- 127 sec
with ak v8 SSSE3.1 ----- 69 sec
with ak v8 SSE4.1 ----- 45 sec
Are these numbers GPU-app run times? Which GPU app version did you use?
|
|
|
|
|
Logged
|
|
|
|
|