Topic: AP6 for NV & ATi GPUs r1316 released (Read 4032 times)
Raistmer
Working Code Wizard
Volunteer Developer
Knight who says 'Ni!'
   
Online
Posts: 11227
Here is a replacement for the r521, r555 and r560 GPU builds of AstroPulse that were used before. These new builds offer a substantial (in many cases) speed increase and (in the case of the NV build) bug fixes that will result in fewer invalid results.
On a low-end HD6450 plugged into a PCI slot, r1316 can consume too much CPU, so ATi r1305 is provided for such hosts. On other hosts it is better to use ATi r1316 because it gives an advantage in both CPU and GPU times over r1305 (and older).
It has been a long time since the last GPU AP release, so there are many changes in command line params and app behavior:
First of all, the defaults have been changed to work on the slowest known GPUs, so they almost certainly will not use your GPU at its maximum. Use command line params to tune the app to your GPU.
The second big change: there is an ap_cmdline.txt file that can be used to add command line parameters to the app. Put params there as you would put them in the corresponding tag in app_info. The app_info tag is supported too, so use whichever way is more convenient for you.
GPUlock and CPUlock are disabled by default, so the -no_cpu_lock and -no_gpu_lock params are deprecated. One can use -cpu_lock and -gpu_lock instead to enable these features. On hosts with a BOINC that supports OpenCL, the app will use the device supplied by BOINC. With older BOINC versions the app's own enumeration ability will be used.
The -instances_per_device param is still supported but not required for running multiple instances of the app. One should set the <count> tag in app_info to get multiple instances running.
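As a rough illustration of the <count> approach: as far as I understand BOINC's anonymous platform, the <count> value inside <coproc> is the fraction of one GPU that each task reserves, so N simultaneous tasks per GPU correspond to a count of about 1/N. Below is a minimal Python sketch under that assumption. The helper name coproc_count is made up for this example and is not part of BOINC or the app; the value is rounded down so that N tasks never add up to more than one GPU.

```python
def coproc_count(instances_per_gpu: int) -> str:
    """Return a <count> value for running N tasks per GPU.

    Assumption: BOINC treats <count> as the fraction of a GPU per task.
    """
    if not 1 <= instances_per_gpu <= 100:
        raise ValueError("instances_per_gpu must be between 1 and 100")
    # Round down to two decimals so N * count never exceeds 1.0.
    hundredths = 100 // instances_per_gpu
    return f"{hundredths / 100:g}"


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(f"{n} instance(s) per GPU -> <count>{coproc_count(n)}</count>")
```

Check in the BOINC manager that the expected number of tasks actually starts before relying on the value.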
The -sbs param is supported but will only issue a warning if a single block allocation would be bigger than the supplied value. The needed amount of memory will still be allocated. The app's memory requirements depend on the -unroll N and -ffa_block N params.
Other params like -hp, -ffa_block N, -ffa_block_fetch N and -unroll N work as before.
Please report any issues you notice here or in the corresponding threads on the SETI forums.
I would like to thank the Lunatics crew, especially our alpha testers arkayn, Claggy and Mike, and the beta testers from the SETI beta site, for invaluable help in debugging and tuning these new releases.
« Last Edit: 06 Jul 2012, 02:29:33 pm by Raistmer »
Raistmer
Here is an example of a possible app_info section:
<app>
    <name>astropulse_v6</name>
</app>
<file_info>
    <name>AP6_win_x86_SSE2_OpenCL_ATI_r1316.exe</name>
    <executable/>
</file_info>
<app_version>
    <app_name>astropulse_v6</app_name>
    <version_num>604</version_num>
    <avg_ncpus>0.04</avg_ncpus>
    <max_ncpus>0.2</max_ncpus>
    <plan_class>ati13ati</plan_class>
    <cmdline></cmdline>
    <coproc>
        <type>ATI</type>
        <count>1</count>
    </coproc>
    <file_ref>
        <file_name>AP6_win_x86_SSE2_OpenCL_ATI_r1316.exe</file_name>
        <main_program/>
    </file_ref>
    <flops>30987654321</flops>
</app_version>
As usual, installation of this app requires advanced skills and an understanding of the anonymous platform mechanism provided with BOINC. If you are unsure, ask for help on the SETI boards or wait for the next Lunatics installer release.
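A common slip when editing an app_info section by hand is a file name in <file_ref> that does not match any <file_info>, or an <app_name> that does not match any <app>. The sketch below is one possible sanity check in Python, not an official BOINC tool. The function name check_app_info_section and the shortened SAMPLE text are illustrative assumptions, and it only checks these two cross-references, not whether BOINC will accept the file.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the section above, used only to exercise the check.
SAMPLE = """
<app><name>astropulse_v6</name></app>
<file_info><name>AP6_win_x86_SSE2_OpenCL_ATI_r1316.exe</name><executable/></file_info>
<app_version>
  <app_name>astropulse_v6</app_name>
  <coproc><type>ATI</type><count>1</count></coproc>
  <file_ref><file_name>AP6_win_x86_SSE2_OpenCL_ATI_r1316.exe</file_name><main_program/></file_ref>
</app_version>
"""


def check_app_info_section(section: str) -> list:
    """Return a list of human-readable problems found in the section."""
    # The section has several top-level elements, so wrap it in one root.
    root = ET.fromstring("<app_info>" + section + "</app_info>")
    apps = {e.findtext("name") for e in root.findall("app")}
    files = {e.findtext("name") for e in root.findall("file_info")}
    problems = []
    for version in root.findall("app_version"):
        app_name = version.findtext("app_name")
        if app_name not in apps:
            problems.append(f"app_name {app_name!r} has no matching <app>")
        for ref in version.findall("file_ref"):
            file_name = ref.findtext("file_name")
            if file_name not in files:
                problems.append(f"file {file_name!r} has no matching <file_info>")
    return problems


if __name__ == "__main__":
    print(check_app_info_section(SAMPLE) or "no cross-reference problems found")
```

If your file already has an <app_info> root element, pass only its contents, since the sketch adds its own wrapper.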
« Last Edit: 06 Jul 2012, 08:28:12 am by Raistmer »
Urs Echternacht
Volunteer Developer
Knight who says 'Ni!'
   
Offline
Posts: 3064
++
The low-end GPU with increased CPU times was a Radeon HD6450 in a PCI slot!
_\|/_ U r s
Fredericx51
Knight o' The Round Table
 
Offline
Posts: 207
Knight Who Says Ni N!
Installed the AstroPulse app r1316 and all is looking good, even for the AP task that was running under r555 and had stopped at 33% when I changed versions. Oh well, SETI went offline as maintenance started. I wanted to link to this host, and here is the host. One task was done 32% with r555 and the rest with r1316; the second was done with r1316.
« Last Edit: 10 Jul 2012, 07:09:04 pm by Fredericx51 »
Claggy
Alpha Tester
Knight who says 'Ni!'
 
Offline
Posts: 2554
I recently ran a few benches of NV r1316 on my 9800GTX+ with different drivers, to see whether the Cuda slowdown (on legacy GPUs) seen with the Cuda 5 preview drivers was also happening with NV OpenCL. The driver sync changes in later drivers (as opposed to the 26x.xx drivers) have resulted in a speedup (subject to an unused core being available to feed the app), and there wasn't a noticeable slowdown on the Cuda 5 preview drivers:
266.58:
Quick timetable

WU : #ap_genwis.dat
astropulse_6.01_windows_intelx86.exe -verbose :
  Elapsed 3.886 secs
  CPU 1.732 secs
AP6_win_x86_SSE2_OpenCL_NV_r1316.exe :
  Elapsed 42.790 secs, speedup: -1001.13% ratio: 0.09x
  CPU 39.375 secs, speedup: -2173.38% ratio: 0.04x

WU : ap_18se08aa_B6_P1_00046_1LC25.wu
astropulse_6.01_windows_intelx86.exe -verbose :
  Elapsed 446.013 secs
  CPU 459.610 secs
AP6_win_x86_SSE2_OpenCL_NV_r1316.exe :
  Elapsed 137.619 secs, speedup: 69.14% ratio: 3.24x
  CPU 9.797 secs, speedup: 97.87% ratio: 46.91x

WU : JasonShort_v5.wu
astropulse_6.01_windows_intelx86.exe -verbose :
  Elapsed 894.734 secs
  CPU 875.290 secs
AP6_win_x86_SSE2_OpenCL_NV_r1316.exe :
  Elapsed 75.794 secs, speedup: 91.53% ratio: 11.80x
  CPU 7.441 secs, speedup: 99.15% ratio: 117.63x

WU : short_ap_21oc08ab_B2_P0_00081_20081130_08605_v5.wu
astropulse_6.01_windows_intelx86.exe -verbose :
  Elapsed 464.060 secs
  CPU 448.019 secs
AP6_win_x86_SSE2_OpenCL_NV_r1316.exe :
  Elapsed 53.152 secs, speedup: 88.55% ratio: 8.73x
  CPU 17.254 secs, speedup: 96.15% ratio: 25.97x

WU : sigind_v5.wu
astropulse_6.01_windows_intelx86.exe -verbose :
  Elapsed 941.842 secs
  CPU 905.196 secs
AP6_win_x86_SSE2_OpenCL_NV_r1316.exe :
  Elapsed 332.466 secs, speedup: 64.70% ratio: 2.83x
  CPU 75.957 secs, speedup: 91.61% ratio: 11.92x

WU : single_pulses.wu
astropulse_6.01_windows_intelx86.exe -verbose :
  Elapsed 448.227 secs
  CPU 430.812 secs
AP6_win_x86_SSE2_OpenCL_NV_r1316.exe :
  Elapsed 44.384 secs, speedup: 90.10% ratio: 10.10x
  CPU 7.316 secs, speedup: 98.30% ratio: 58.89x

301.42:
Quick timetable

WU : #ap_genwis.dat
astropulse_6.01_windows_intelx86.exe -verbose :
  Elapsed 3.731 secs
  CPU 1.732 secs
AP6_win_x86_SSE2_OpenCL_NV_r1316.exe :
  Elapsed 52.223 secs, speedup: -1299.71% ratio: 0.07x
  CPU 41.028 secs, speedup: -2268.82% ratio: 0.04x

WU : ap_18se08aa_B6_P1_00046_1LC25.wu
astropulse_6.01_windows_intelx86.exe -verbose :
  Elapsed 446.013 secs
  CPU 459.610 secs
AP6_win_x86_SSE2_OpenCL_NV_r1316.exe :
  Elapsed 129.331 secs, speedup: 71.00% ratio: 3.45x
  CPU 126.813 secs, speedup: 72.41% ratio: 3.62x

WU : JasonShort_v5.wu
astropulse_6.01_windows_intelx86.exe -verbose :
  Elapsed 894.734 secs
  CPU 875.290 secs
AP6_win_x86_SSE2_OpenCL_NV_r1316.exe :
  Elapsed 60.810 secs, speedup: 93.20% ratio: 14.71x
  CPU 58.376 secs, speedup: 93.33% ratio: 14.99x

WU : short_ap_21oc08ab_B2_P0_00081_20081130_08605_v5.wu
astropulse_6.01_windows_intelx86.exe -verbose :
  Elapsed 464.060 secs
  CPU 448.019 secs
AP6_win_x86_SSE2_OpenCL_NV_r1316.exe :
  Elapsed 42.138 secs, speedup: 90.92% ratio: 11.01x
  CPU 39.453 secs, speedup: 91.19% ratio: 11.36x

WU : sigind_v5.wu
astropulse_6.01_windows_intelx86.exe -verbose :
  Elapsed 941.842 secs
  CPU 905.196 secs
AP6_win_x86_SSE2_OpenCL_NV_r1316.exe :
  Elapsed 297.572 secs, speedup: 68.41% ratio: 3.17x
  CPU 292.689 secs, speedup: 67.67% ratio: 3.09x

WU : single_pulses.wu
astropulse_6.01_windows_intelx86.exe -verbose :
  Elapsed 448.227 secs
  CPU 430.812 secs
AP6_win_x86_SSE2_OpenCL_NV_r1316.exe :
  Elapsed 33.530 secs, speedup: 92.52% ratio: 13.37x
  CPU 30.904 secs, speedup: 92.83% ratio: 13.94x
306.02:
Quick timetable

WU : #ap_genwis.dat
astropulse_6.01_windows_intelx86.exe -verbose :
  Elapsed 3.659 secs
  CPU 1.576 secs
AP6_win_x86_SSE2_OpenCL_NV_r1316.exe :
  Elapsed 44.170 secs, speedup: -1107.16% ratio: 0.08x
  CPU 39.406 secs, speedup: -2400.38% ratio: 0.04x

WU : ap_18se08aa_B6_P1_00046_1LC25.wu
astropulse_6.01_windows_intelx86.exe -verbose :
  Elapsed 446.013 secs
  CPU 459.610 secs
AP6_win_x86_SSE2_OpenCL_NV_r1316.exe :
  Elapsed 129.749 secs, speedup: 70.91% ratio: 3.44x
  CPU 125.347 secs, speedup: 72.73% ratio: 3.67x

WU : JasonShort_v5.wu
astropulse_6.01_windows_intelx86.exe -verbose :
  Elapsed 894.734 secs
  CPU 875.290 secs
AP6_win_x86_SSE2_OpenCL_NV_r1316.exe :
  Elapsed 61.220 secs, speedup: 93.16% ratio: 14.62x
  CPU 58.360 secs, speedup: 93.33% ratio: 15.00x

WU : short_ap_21oc08ab_B2_P0_00081_20081130_08605_v5.wu
astropulse_6.01_windows_intelx86.exe -verbose :
  Elapsed 464.060 secs
  CPU 448.019 secs
AP6_win_x86_SSE2_OpenCL_NV_r1316.exe :
  Elapsed 42.198 secs, speedup: 90.91% ratio: 11.00x
  CPU 39.250 secs, speedup: 91.24% ratio: 11.41x

WU : sigind_v5.wu
astropulse_6.01_windows_intelx86.exe -verbose :
  Elapsed 941.842 secs
  CPU 905.196 secs
AP6_win_x86_SSE2_OpenCL_NV_r1316.exe :
  Elapsed 300.304 secs, speedup: 68.12% ratio: 3.14x
  CPU 293.765 secs, speedup: 67.55% ratio: 3.08x

WU : single_pulses.wu
astropulse_6.01_windows_intelx86.exe -verbose :
  Elapsed 448.227 secs
  CPU 430.812 secs
AP6_win_x86_SSE2_OpenCL_NV_r1316.exe :
  Elapsed 33.899 secs, speedup: 92.44% ratio: 13.22x
  CPU 31.029 secs, speedup: 92.80% ratio: 13.88x
306.23:
Quick timetable

WU : #ap_genwis.dat
astropulse_6.01_windows_intelx86.exe -verbose :
  Elapsed 3.823 secs
  CPU 1.778 secs
AP6_win_x86_SSE2_OpenCL_NV_r1316.exe :
  Elapsed 45.188 secs, speedup: -1082.00% ratio: 0.08x
  CPU 40.857 secs, speedup: -2197.92% ratio: 0.04x

WU : ap_18se08aa_B6_P1_00046_1LC25.wu
astropulse_6.01_windows_intelx86.exe -verbose :
  Elapsed 446.013 secs
  CPU 459.610 secs
AP6_win_x86_SSE2_OpenCL_NV_r1316.exe :
  Elapsed 129.390 secs, speedup: 70.99% ratio: 3.45x
  CPU 130.947 secs, speedup: 71.51% ratio: 3.51x

WU : JasonShort_v5.wu
astropulse_6.01_windows_intelx86.exe -verbose :
  Elapsed 894.734 secs
  CPU 875.290 secs
AP6_win_x86_SSE2_OpenCL_NV_r1316.exe :
  Elapsed 60.678 secs, speedup: 93.22% ratio: 14.75x
  CPU 59.904 secs, speedup: 93.16% ratio: 14.61x

WU : short_ap_21oc08ab_B2_P0_00081_20081130_08605_v5.wu
astropulse_6.01_windows_intelx86.exe -verbose :
  Elapsed 464.060 secs
  CPU 448.019 secs
AP6_win_x86_SSE2_OpenCL_NV_r1316.exe :
  Elapsed 42.019 secs, speedup: 90.95% ratio: 11.04x
  CPU 41.496 secs, speedup: 90.74% ratio: 10.80x

WU : sigind_v5.wu
astropulse_6.01_windows_intelx86.exe -verbose :
  Elapsed 941.842 secs
  CPU 905.196 secs
AP6_win_x86_SSE2_OpenCL_NV_r1316.exe :
  Elapsed 298.223 secs, speedup: 68.34% ratio: 3.16x
  CPU 301.768 secs, speedup: 66.66% ratio: 3.00x

WU : single_pulses.wu
astropulse_6.01_windows_intelx86.exe -verbose :
  Elapsed 448.227 secs
  CPU 430.812 secs
AP6_win_x86_SSE2_OpenCL_NV_r1316.exe :
  Elapsed 33.425 secs, speedup: 92.54% ratio: 13.41x
  CPU 31.621 secs, speedup: 92.66% ratio: 13.62x
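For anyone reading the timetables: the "speedup" and "ratio" columns appear to be derived from the reference app's time and the tested app's time as speedup = (1 - tested / reference) * 100 and ratio = reference / tested. That is inferred from the numbers posted here, not taken from the bench tool's source. A small Python sketch of the arithmetic (the function name speedup_and_ratio is made up for this example):

```python
def speedup_and_ratio(reference_secs: float, tested_secs: float):
    """Return (speedup in percent, ratio) of the tested app vs. the reference.

    Formulas inferred from the posted timetables, not from the bench tool.
    """
    if reference_secs <= 0 or tested_secs <= 0:
        raise ValueError("times must be positive")
    speedup_pct = (1 - tested_secs / reference_secs) * 100
    ratio = reference_secs / tested_secs
    return speedup_pct, ratio


if __name__ == "__main__":
    # Elapsed times for ap_18se08aa_B6_P1_00046_1LC25.wu on driver 266.58.
    pct, ratio = speedup_and_ratio(446.013, 137.619)
    print(f"speedup: {pct:.2f}% ratio: {ratio:.2f}x")
```

A negative speedup, as in the #ap_genwis.dat rows, simply means the tested app took longer than the reference on that input.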
Claggy