Topic: AP6 r1363 for GPU (Read 4322 times)
Raistmer
Working Code Wizard
Volunteer Developer
Knight who says 'Ni!'
   
Offline
Posts: 11043
New switch added: -initial_ffa_sleep N M, where N is the number of ms to sleep in the short PC-FFA and M is the number of ms to sleep in the large FFA. This sleep occurs before the event-polling loop used in the -use_sleep case (and this sleep is independent of the -use_sleep switch).
Recommended usage:
1) Do a test run with the -v 2 -use_sleep options.
2) Look in stderr.txt for the usual sleep times for the short and large FFA (they will differ considerably).
3) Enter those usual values (or those values minus 1 ms) as the parameters of this new switch.
4) An additional run with this switch plus -v 2 -use_sleep can be done to check whether the sleep-loop times are now much smaller (1-2 ms). Then -use_sleep can be omitted altogether.
Take care: this switch requires exactly 2 parameters (2 integers separated by a space), not 1.
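Step 3 and the two-parameter rule can be sketched in a few lines of Python. This is only an illustration, not part of the AP application: the function name and the margin_ms parameter are hypothetical, and clamping at 0 is an assumption (the post does not say which values the switch accepts).

```python
def initial_ffa_sleep_switch(short_ms, large_ms, margin_ms=1):
    """Build the '-initial_ffa_sleep N M' text from two observed sleep times.

    short_ms / large_ms: the usual sleep times seen in stderr.txt for the
    short PC-FFA and the large FFA during a -v 2 -use_sleep test run.
    margin_ms: hypothetical helper parameter; 1 mirrors the "minus 1 ms" hint.
    """
    values = []
    for observed in (short_ms, large_ms):
        # The switch takes exactly two integers, so reject anything else.
        if isinstance(observed, bool) or not isinstance(observed, int):
            raise TypeError("both sleep times must be integers")
        # Assumption: never emit a negative sleep time.
        values.append(max(observed - margin_ms, 0))
    return "-initial_ffa_sleep {} {}".format(*values)


# Made-up observed times of 27 ms and 96 ms, minus the 1 ms margin.
print(initial_ffa_sleep_switch(27, 96))
```

The returned text is what would be appended to the existing command-line parameters.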
Raistmer
Bench for GTX 260 + Core2Duo 6420 (Conroe), CPU idle, OS Windows Server 2003 x64, driver 263.06 is attached; it shows the dependence on the unroll param. More will come later.
Fredericx51
Knight o' The Round Table
 
Offline
Posts: 207
Knight Who Says Ni N!
Quote from Raistmer: "Bench for GTX 260 + Core2Duo 6420 (Conroe), CPU idle, OS Windows Server 2003 x64, driver 263.06 is attached; it shows the dependence on the unroll param. More will come later."
Is this new switch for all devices/GPUs (NVIDIA and AMD/ATI), or NVIDIA only?
Raistmer
It's for all GPU AP builds. [But how helpful it will be for a particular vendor/device/driver config needs to be tested in each particular case.]
skildude
Works very well on my 7970. Many non-zeroed WUs complete in less than an hour.
No errors to report.
Fredericx51
Quote from Raistmer: "It's for all GPU AP builds. [But how helpful it will be for a particular vendor/device/driver config needs to be tested in each particular case.]"
Raistmer, do you have any AMD/ATI 5000/6000/7000-series GPUs? You do most of the coding and testing (along with, IIRC, Jason Gee, Richard Haselgrove and someone I've forgotten) and you put a lot of time into this project, so IMHO you should have the necessary equipment. If not, I think you'll have to get one, and I'm willing to pay for one, or for part of one. Just PM me. (I also have an HD4850 and an HD5770 lying around unused at the moment, because my Vista rig has strange failures; they could be PSU-related, since it's only 350 W.) Hope you don't mind my asking. Fredericx51.
Raistmer
Quote from Fredericx51: "Raistmer, do you have any AMD/ATI 5000/6000/7000-series GPUs?"
Currently I have an HD6950 installed in one host, bought with donations from SETI project members, and a GTX 260, also donated and sent to me by Mike, installed in another host. I also own an HD4870, GSO9600, GT9500 and 8600 (or 8500?), but they are not installed. I bought a PCI->PCI-e adaptor on eBay, tested it on an AMD64 host, and will perhaps install it plus some of these cards in another AMD64 host, a Winchester-based one. But I currently develop on a C-60 based netbook, so most of the debugging and testing happens there (it's AMD's APU: CPU + OpenCL-capable GPU on a single chip). All other architectures are covered by our excellent alpha testers.
Morten
Hi Raistmer,
What would be the correct setting based on these values?
In FFA -2048 before main loop buffer freeing
Awaited 40 ms for completion PC_inner_ffa result is: 0
Awaited 27 ms for completion PC_inner_ffa result is: 0
Awaited 27 ms for completion PC_inner_ffa result is: 0
Awaited 26 ms for completion PC_inner_ffa result is: 0
Before FFA buffer release, end of FFA -2048
In FFA 2048 before main loop buffer freeing
Awaited 38 ms for completion PC_inner_ffa result is: 0
Awaited 28 ms for completion PC_inner_ffa result is: 0
Awaited 27 ms for completion PC_inner_ffa result is: 0
Awaited 26 ms for completion PC_inner_ffa result is: 0
Before FFA buffer release, end of FFA 2048
In FFA -2064 before main loop buffer freeing
Awaited 40 ms for completion PC_inner_ffa result is: 0
Awaited 28 ms for completion PC_inner_ffa result is: 0
Awaited 27 ms for completion PC_inner_ffa result is: 0
Awaited 26 ms for completion PC_inner_ffa result is: 0
Before FFA buffer release, end of FFA -2064
In FFA 2064 before main loop buffer freeing
-initial_ffa_sleep 26 -2064? Or -initial_ffa_sleep 40 2048?
As the crunching of the task progresses, these values increase: the negative and positive numbers get larger, and so does the "Awaited xx ms" value.
At 50% crunched it's like this:
Before FFA buffer release, end of FFA -8448
In FFA 8448 before main loop buffer freeing
Awaited 120 ms for completion PC_inner_ffa result is: 0
Awaited 108 ms for completion PC_inner_ffa result is: 0
Awaited 106 ms for completion PC_inner_ffa result is: 0
Awaited 104 ms for completion PC_inner_ffa result is: 0
Awaited 103 ms for completion PC_inner_ffa result is: 0
Awaited 100 ms for completion PC_inner_ffa result is: 0
Awaited 96 ms for completion PC_inner_ffa result is: 0
Awaited 95 ms for completion PC_inner_ffa result is: 0
Awaited 95 ms for completion PC_inner_ffa result is: 0
Awaited 92 ms for completion PC_inner_ffa result is: 0
Awaited 89 ms for completion PC_inner_ffa result is: 0
Awaited 88 ms for completion PC_inner_ffa result is: 0
Awaited 88 ms for completion PC_inner_ffa result is: 0
Awaited 87 ms for completion PC_inner_ffa result is: 0
Awaited 56 ms for completion PC_inner_ffa result is: 0
Before FFA buffer release, end of FFA 8448
Raistmer
You can try -initial_ffa_sleep 26 95 then, and see whether it saves any CPU time and how much it increases elapsed time.
EDIT: the positive/negative number that increases over time is the DM value and should be ignored for this purpose; it's not a time count.
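Reading the "Awaited" times out of stderr.txt by hand gets tedious, so here is a minimal Python sketch that groups them per FFA block and keeps the DM value separate from the times. It assumes the log text looks like the excerpt Morten posted; the function names are made up for illustration, and deciding which blocks count as "short PC-FFA" or "large FFA" is still left to the reader, since the log itself does not label them.

```python
import re
from statistics import median

# One pattern with two alternatives, so block headers and waits are
# matched in the order they appear in the log.
TOKEN = re.compile(
    r"In FFA (-?\d+) before main loop|Awaited (\d+) ms for completion"
)


def awaited_by_block(log_text):
    """Return [(dm_value, [awaited_ms, ...]), ...] in the order logged."""
    blocks = []
    for match in TOKEN.finditer(log_text):
        dm_value, awaited_ms = match.groups()
        if dm_value is not None:
            blocks.append((int(dm_value), []))
        elif blocks:  # ignore waits logged before the first block header
            blocks[-1][1].append(int(awaited_ms))
    return blocks


def summarize(blocks):
    """Return (dm_value, min_ms, median_ms) for each block that has waits."""
    return [(dm, min(waits), median(waits)) for dm, waits in blocks if waits]


# First block of the log excerpt quoted in the thread.
SAMPLE = (
    "In FFA -2048 before main loop buffer freeing "
    "Awaited 40 ms for completion PC_inner_ffa result is: 0 "
    "Awaited 27 ms for completion PC_inner_ffa result is: 0 "
    "Awaited 27 ms for completion PC_inner_ffa result is: 0 "
    "Awaited 26 ms for completion PC_inner_ffa result is: 0 "
    "Before FFA buffer release, end of FFA -2048"
)

for dm, low, typical in summarize(awaited_by_block(SAMPLE)):
    print(f"DM {dm}: min {low} ms, median {typical} ms")
```

Minimum and median are just one way to describe the "usual" values; the thread does not say how 26 and 95 were picked.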
« Last Edit: 02 Aug 2012, 03:04:38 pm by Raistmer »
Zeus Fab3r
Squire
Offline
Posts: 41
Hi Raistmer and everybody else,
I've decided to get back to AP crunching on my GTX260 in the hope of bypassing the current server issues (and limits), so I have a few questions:
- Is r1363 the latest release?
- Do I need the -initial_ffa_sleep N M switch to run this app?
- Can I use my old cmdline params? <cmdline>-ffa_block 6144 -ffa_block_fetch 1536 -unroll 10 -instances_per_device 1 -no_cpu_lock</cmdline> (Edit: I just saw in Raistmer's r1316 opening post that I don't need the -no_cpu_lock switch.)
- In the above-mentioned post there is an app_info section for ATI GPUs in which I couldn't find the file_info and file_ref parts for the AstroPulse_Kernels_r1363.cl file. I used to have these when I was running r521, so are they obsolete, or just not needed in an ATI setup?
- Can I stay with the 266.58 drivers? (I don't like high CPU usage, because I crunch AP WUs on all four cores.)
Thanks in advance
« Last Edit: 18 Nov 2012, 04:30:52 pm by Zeus Fab3r »
Mike
Alpha Tester
Knight who says 'Ni!'
 
Offline
Posts: 1105
It's the latest official release, yes. And you can still use your old cmdline params.
Zeus Fab3r
Is this OK?
<app>
    <name>astropulse_v6</name>
</app>
<file_info>
    <name>AP6_win_x86_SSE2_OpenCL_NV_r1363.exe</name>
    <executable/>
</file_info>
<file_info>
    <name>AstroPulse_Kernels_r1363.cl</name>
    <executable/>
</file_info>
<app_version>
    <app_name>astropulse_v6</app_name>
    <version_num>604</version_num>
    <avg_ncpus>0.04</avg_ncpus>
    <max_ncpus>0.20</max_ncpus>
    <plan_class>cuda</plan_class>
    <flops>475000000000</flops>
    <cmdline>-ffa_block 6144 -ffa_block_fetch 1536 -unroll 10 -instances_per_device 1</cmdline>
    <file_ref>
        <file_name>AP6_win_x86_SSE2_OpenCL_NV_r1363.exe</file_name>
        <main_program/>
    </file_ref>
    <file_ref>
        <file_name>AstroPulse_Kernels_r1363.cl</file_name>
        <copy_file/>
    </file_ref>
    <coproc>
        <type>CUDA</type>
        <count>1</count>
    </coproc>
</app_version>
Mike
It should work. But you don't need to mention the .cl file any longer. Also, -instances_per_device 1 is unnecessary; count 1 is enough now.
Mine looks like this:
<app>
    <name>astropulse_v6</name>
</app>
<file_info>
    <name>AP6_win_x86_SSE2_OpenCL_ATI_r1363.exe</name>
    <executable/>
</file_info>
<file_info>
    <name>ap_cmdline.txt</name>
</file_info>
<app_version>
    <app_name>astropulse_v6</app_name>
    <version_num>601</version_num>
    <avg_ncpus>0.04</avg_ncpus>
    <max_ncpus>0.2</max_ncpus>
    <plan_class>ati13ati</plan_class>
    <coproc>
        <type>ATI</type>
        <count>0.5</count>
    </coproc>
    <file_ref>
        <file_name>AP6_win_x86_SSE2_OpenCL_ATI_r1363.exe</file_name>
        <main_program/>
    </file_ref>
    <file_ref>
        <file_name>ap_cmdline.txt</file_name>
    </file_ref>
</app_version>
ap_cmdline.txt contains the params.
Mike
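When hand-editing an app_info fragment like the two above, a quick consistency check can catch a misspelled file name. The following Python sketch assumes that each file_ref is meant to name a file declared in a file_info, as both fragments in this thread do; the function name is hypothetical, and the sample is a shortened copy of Mike's fragment. It says nothing about what the BOINC client itself validates.

```python
import xml.etree.ElementTree as ET


def undeclared_file_refs(app_info_fragment):
    """List file_ref names that have no file_info entry with the same name."""
    # The fragments quoted in the thread have no single root element,
    # so wrap them before parsing.
    root = ET.fromstring("<app_info>" + app_info_fragment + "</app_info>")
    declared = {
        info.findtext("name", "").strip() for info in root.iter("file_info")
    }
    referenced = [
        ref.findtext("file_name", "").strip() for ref in root.iter("file_ref")
    ]
    return [name for name in referenced if name not in declared]


# Shortened copy of the ATI fragment posted above.
SAMPLE = """
<file_info>
    <name>AP6_win_x86_SSE2_OpenCL_ATI_r1363.exe</name>
    <executable/>
</file_info>
<file_info>
    <name>ap_cmdline.txt</name>
</file_info>
<app_version>
    <file_ref>
        <file_name>AP6_win_x86_SSE2_OpenCL_ATI_r1363.exe</file_name>
        <main_program/>
    </file_ref>
    <file_ref>
        <file_name>ap_cmdline.txt</file_name>
    </file_ref>
</app_version>
"""

print(undeclared_file_refs(SAMPLE))
```

An empty list means every referenced file is also declared.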
Zeus Fab3r
It's working! Thanks. I'd still like to know about the -initial_ffa_sleep N M switch and which driver is recommended for my old GTX260.
Edit: Here is my first result with the new app. Should I be worried about the infos and warnings about opening some binary kernel files? I've noticed that those files were created in my data folder.
« Last Edit: 18 Nov 2012, 08:04:15 pm by Zeus Fab3r »
Raistmer
If they were created, there's no reason to worry. -initial_ffa_sleep N M is an experimental switch, provided in case someone finds it useful for their own host. The recommended driver (for the OpenCL NV app) is 263.06.