Author Topic: AP6 for NV & ATi GPUs r1316 released  (Read 5368 times)
Raistmer
Working Code Wizard
Volunteer Developer
AP6 for NV & ATi GPUs r1316 released
« on: 06 Jul 2012, 08:20:45 am »

Here is a replacement for the r521, r555 and r560 GPU builds of AstroPulse that were used before.
These new builds offer a substantial speed increase in many cases and, for the NV build, bug fixes that should result in fewer invalid results.

On a low-end GPU such as an HD6450 plugged into a PCI slot, r1316 can consume too much CPU, so ATi r1305 is provided for such hosts.
On other hosts it is better to use ATi r1316, because it improves both CPU and GPU times over r1305 (and older builds).

A long time has passed since the last GPU AP release, so there are many changes in command line params and app behavior:

First of all, the defaults have been changed to work on the slowest known GPUs, so they almost certainly will not use your GPU at its maximum. Use command line params to tune the app for your GPU.

The second big change: there is now an ap_cmdline.txt file that can be used to pass command line parameters to the app.
Put params there as you would put them in the corresponding tag in app_info. The app_info tag is still supported too, so use whichever way is more convenient for you.

GPUlock and CPUlock are disabled by default, so the -no_cpu_lock and -no_gpu_lock params are deprecated.
Use -cpu_lock and -gpu_lock instead to enable these features.
On hosts where BOINC supports OpenCL, the app will use the device supplied by BOINC. With older BOINC versions the app's own enumeration will be used.

The -instances_per_device param is still supported but is not required for running multiple instances of the app.
Set the <count> tag in app_info to get multiple instances running.

The -sbs param is supported, but it will only issue a warning if a single block allocation would be bigger than the supplied value. The needed amount of memory will still be allocated. The app's memory requirements depend on the -unroll N and -ffa_block N params.

Other params like -hp, -ffa_block N, -ffa_block_fetch N, -unroll N work as before.
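
As an illustration of the switches described above, here is a minimal Python sketch that joins them into one line and writes it to ap_cmdline.txt. The helper names, the target directory and all numeric values are assumptions made for this example; the post does not say where the file has to live or which values suit a given GPU.

```python
from pathlib import Path


def build_ap_cmdline(unroll=None, ffa_block=None, ffa_block_fetch=None,
                     hp=False, cpu_lock=False, gpu_lock=False):
    """Join AstroPulse tuning switches into one command-line string.

    Only switches named in the release post are emitted. The helper itself
    is hypothetical and not part of the app.
    """
    parts = []
    # Switches that take a numeric argument (-name N).
    for flag, value in (("-unroll", unroll),
                        ("-ffa_block", ffa_block),
                        ("-ffa_block_fetch", ffa_block_fetch)):
        if value is not None:
            parts += [flag, str(int(value))]
    # Plain on/off switches; both locks stay off unless requested,
    # matching the new defaults.
    for flag, enabled in (("-hp", hp),
                          ("-cpu_lock", cpu_lock),
                          ("-gpu_lock", gpu_lock)):
        if enabled:
            parts.append(flag)
    return " ".join(parts)


def write_ap_cmdline(directory, **options):
    """Write the generated line to ap_cmdline.txt inside `directory`.

    The directory is an assumption of this sketch, not taken from the post.
    """
    path = Path(directory) / "ap_cmdline.txt"
    path.write_text(build_ap_cmdline(**options), encoding="ascii")
    return path


if __name__ == "__main__":
    # Placeholder numbers only, not tuning advice.
    print(build_ap_cmdline(unroll=4, ffa_block=2048,
                           ffa_block_fetch=1024, hp=True))
```

The same string could instead go into the cmdline tag of app_info, since both routes are supported.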

Please report any issues you notice here or in the corresponding threads on the SETI forums.

I would like to thank the Lunatics crew, especially our alpha testers arkayn, Claggy and Mike, and the beta testers from the SETI beta site, for invaluable help in debugging and tuning these new releases.

* AP6_win_x86_SSE2_OpenCL_ATI_r1316.7z (1440.42 KB - downloaded 240 times.)
* AP6_win_x86_SSE2_OpenCL_NV_r1316.7z (1439.04 KB - downloaded 283 times.)
* AP6_win_x86_SSE2_OpenCL_ATI_r1305.7z (1429.63 KB - downloaded 112 times.)
« Last Edit: 06 Jul 2012, 02:29:33 pm by Raistmer » Logged
Raistmer
Re: AP6 for NV & ATi GPUs r1316 released
« Reply #1 on: 06 Jul 2012, 08:24:58 am »

Here is an example of a possible app_info section:



<app>
   <name>astropulse_v6</name>
</app>
<file_info>
   <name>AP6_win_x86_SSE2_OpenCL_ATI_r1316.exe</name>
   <executable/>
</file_info>
<app_version>
   <app_name>astropulse_v6</app_name>
   <version_num>604</version_num>
   <avg_ncpus>0.04</avg_ncpus>
   <max_ncpus>0.2</max_ncpus>
   <plan_class>ati13ati</plan_class>
   <cmdline></cmdline>
   <coproc>
      <type>ATI</type>
      <count>1</count>
   </coproc>
   <file_ref>
      <file_name>AP6_win_x86_SSE2_OpenCL_ATI_r1316.exe</file_name>
      <main_program/>
   </file_ref>
   <flops>30987654321</flops>
</app_version>
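
Since multiple instances are now requested through the <count> tag, a fractional value is the usual BOINC way to let several tasks share one GPU (for example 0.5 for two tasks). The following Python sketch shows one way to change it with the standard library. The function name is hypothetical, and it assumes the section above sits inside a complete app_info.xml with an <app_info> root element.

```python
import xml.etree.ElementTree as ET


def set_instances_per_gpu(app_info_xml, app_name, instances):
    """Return app_info XML with <coproc><count> set to 1/instances.

    Assumes `app_info_xml` is a whole document with one root element.
    Only <app_version> entries whose <app_name> matches are touched.
    """
    if instances < 1:
        raise ValueError("instances must be at least 1")
    root = ET.fromstring(app_info_xml)
    count_text = f"{1.0 / instances:g}"  # 2 instances -> "0.5"
    for version in root.iter("app_version"):
        name = (version.findtext("app_name") or "").strip()
        if name != app_name:
            continue
        for node in version.findall("coproc/count"):
            node.text = count_text
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    sample = """<app_info>
<app_version>
   <app_name>astropulse_v6</app_name>
   <coproc>
      <type>ATI</type>
      <count>1</count>
   </coproc>
</app_version>
</app_info>"""
    print(set_instances_per_gpu(sample, "astropulse_v6", 2))
```

BOINC normally reads app_info.xml at startup, so an edited file would take effect only after the client is restarted.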


As usual, installing this app requires advanced skills and an understanding of the anonymous platform mechanism provided by BOINC. If you are unsure, ask for help on the SETI boards or wait for the next Lunatics installer release.
« Last Edit: 06 Jul 2012, 08:28:12 am by Raistmer » Logged
Urs Echternacht
Volunteer Developer
Re: AP6 for NV & ATi GPUs r1316 released
« Reply #2 on: 06 Jul 2012, 02:08:20 pm »

The low-end GPU with increased CPU times was a Radeon HD6450 in a PCI slot!
Logged

_\|/_
U r s
Raistmer
Re: AP6 for NV & ATi GPUs r1316 released
« Reply #3 on: 09 Jul 2012, 03:03:18 am »

I recently made two posts about the current state of OpenCL driver support on both vendors' forums:
http://devgurus.amd.com/thread/159432
http://developer.nvidia.com/devforum/discussion/10636/feature-request-to-add-synchronization-mode-tuning-via-nv-specific-opencl-extension

If you have something to say on the topic, or can explain why this is important for users, please post in the corresponding threads.
Logged
Fredericx51
Re: AP6 for NV & ATi GPUs r1316 released
« Reply #4 on: 10 Jul 2012, 12:06:37 pm »

I installed the AstroPulse app rev. 1316 and all is looking good, even for the AP task that was running on the 555 version and was stopped at 33% when I changed versions.
Oh well, SETI went offline as maintenance started. I wanted to link to this host.
And here is the host.

One task was done 32% with rev. 555 and the rest with rev. 1316; the second was done entirely with rev. 1316.

« Last Edit: 10 Jul 2012, 07:09:04 pm by Fredericx51 » Logged
Claggy
Alpha Tester
Re: AP6 for NV & ATi GPUs r1316 released
« Reply #5 on: 15 Sep 2012, 11:38:33 am »

I recently ran a few benches of NV r1316 on my 9800GTX+ with different drivers, to see whether the Cuda slowdown (on legacy GPUs) with the Cuda 5 preview drivers was also happening with NV OpenCL.
The driver sync changes in later drivers (as opposed to the 26x.xx drivers) have resulted in a speedup (subject to an unused core being available to feed the app),
and there wasn't a noticeable slowdown on the Cuda 5 preview drivers:

266.58:
Quick timetable
 
WU : #ap_genwis.dat
astropulse_6.01_windows_intelx86.exe -verbose :
  Elapsed 3.886 secs
      CPU 1.732 secs
AP6_win_x86_SSE2_OpenCL_NV_r1316.exe :
  Elapsed 42.790 secs, speedup: -1001.13%  ratio: 0.09x
      CPU 39.375 secs, speedup: -2173.38%  ratio: 0.04x
 
WU : ap_18se08aa_B6_P1_00046_1LC25.wu
astropulse_6.01_windows_intelx86.exe -verbose :
  Elapsed 446.013 secs
      CPU 459.610 secs
AP6_win_x86_SSE2_OpenCL_NV_r1316.exe :
  Elapsed 137.619 secs, speedup: 69.14%  ratio: 3.24x
      CPU 9.797 secs, speedup: 97.87%  ratio: 46.91x
 
WU : JasonShort_v5.wu
astropulse_6.01_windows_intelx86.exe -verbose :
  Elapsed 894.734 secs
      CPU 875.290 secs
AP6_win_x86_SSE2_OpenCL_NV_r1316.exe :
  Elapsed 75.794 secs, speedup: 91.53%  ratio: 11.80x
      CPU 7.441 secs, speedup: 99.15%  ratio: 117.63x
 
WU : short_ap_21oc08ab_B2_P0_00081_20081130_08605_v5.wu
astropulse_6.01_windows_intelx86.exe -verbose :
  Elapsed 464.060 secs
      CPU 448.019 secs
AP6_win_x86_SSE2_OpenCL_NV_r1316.exe :
  Elapsed 53.152 secs, speedup: 88.55%  ratio: 8.73x
      CPU 17.254 secs, speedup: 96.15%  ratio: 25.97x
 
WU : sigind_v5.wu
astropulse_6.01_windows_intelx86.exe -verbose :
  Elapsed 941.842 secs
      CPU 905.196 secs
AP6_win_x86_SSE2_OpenCL_NV_r1316.exe :
  Elapsed 332.466 secs, speedup: 64.70%  ratio: 2.83x
      CPU 75.957 secs, speedup: 91.61%  ratio: 11.92x
 
WU : single_pulses.wu
astropulse_6.01_windows_intelx86.exe -verbose :
  Elapsed 448.227 secs
      CPU 430.812 secs
AP6_win_x86_SSE2_OpenCL_NV_r1316.exe :
  Elapsed 44.384 secs, speedup: 90.10%  ratio: 10.10x
      CPU 7.316 secs, speedup: 98.30%  ratio: 58.89x
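
For readers unsure how to read the "speedup" and "ratio" columns: the figures in the table above appear consistent with speedup = (1 - new/reference) * 100 and ratio = reference/new, where the reference is the stock astropulse_6.01 time. This is inferred from the numbers, not from the bench tool's source, and the function names below are made up for this sketch.

```python
def speedup_percent(reference_secs, candidate_secs):
    """Percentage of the reference time saved by the candidate build.

    Negative when the candidate is slower than the reference.
    """
    return (1.0 - candidate_secs / reference_secs) * 100.0


def ratio(reference_secs, candidate_secs):
    """How many times faster the candidate is than the reference."""
    return reference_secs / candidate_secs


if __name__ == "__main__":
    # Elapsed times for ap_18se08aa_B6_P1_00046_1LC25.wu on driver 266.58;
    # the table lists 69.14% and 3.24x for this pair.
    ref, new = 446.013, 137.619
    print(f"speedup: {speedup_percent(ref, new):.2f}%  ratio: {ratio(ref, new):.2f}x")
```

The large negative speedups on #ap_genwis.dat follow from the same formula: the reference run takes under 4 seconds there, while the GPU build takes over 40.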
 
301.42:
Quick timetable
 
WU : #ap_genwis.dat
astropulse_6.01_windows_intelx86.exe -verbose :
  Elapsed 3.731 secs
      CPU 1.732 secs
AP6_win_x86_SSE2_OpenCL_NV_r1316.exe :
  Elapsed 52.223 secs, speedup: -1299.71%  ratio: 0.07x
      CPU 41.028 secs, speedup: -2268.82%  ratio: 0.04x
 
WU : ap_18se08aa_B6_P1_00046_1LC25.wu
astropulse_6.01_windows_intelx86.exe -verbose :
  Elapsed 446.013 secs
      CPU 459.610 secs
AP6_win_x86_SSE2_OpenCL_NV_r1316.exe :
  Elapsed 129.331 secs, speedup: 71.00%  ratio: 3.45x
      CPU 126.813 secs, speedup: 72.41%  ratio: 3.62x
 
WU : JasonShort_v5.wu
astropulse_6.01_windows_intelx86.exe -verbose :
  Elapsed 894.734 secs
      CPU 875.290 secs
AP6_win_x86_SSE2_OpenCL_NV_r1316.exe :
  Elapsed 60.810 secs, speedup: 93.20%  ratio: 14.71x
      CPU 58.376 secs, speedup: 93.33%  ratio: 14.99x
 
WU : short_ap_21oc08ab_B2_P0_00081_20081130_08605_v5.wu
astropulse_6.01_windows_intelx86.exe -verbose :
  Elapsed 464.060 secs
      CPU 448.019 secs
AP6_win_x86_SSE2_OpenCL_NV_r1316.exe :
  Elapsed 42.138 secs, speedup: 90.92%  ratio: 11.01x
      CPU 39.453 secs, speedup: 91.19%  ratio: 11.36x
 
WU : sigind_v5.wu
astropulse_6.01_windows_intelx86.exe -verbose :
  Elapsed 941.842 secs
      CPU 905.196 secs
AP6_win_x86_SSE2_OpenCL_NV_r1316.exe :
  Elapsed 297.572 secs, speedup: 68.41%  ratio: 3.17x
      CPU 292.689 secs, speedup: 67.67%  ratio: 3.09x
 
WU : single_pulses.wu
astropulse_6.01_windows_intelx86.exe -verbose :
  Elapsed 448.227 secs
      CPU 430.812 secs
AP6_win_x86_SSE2_OpenCL_NV_r1316.exe :
  Elapsed 33.530 secs, speedup: 92.52%  ratio: 13.37x
      CPU 30.904 secs, speedup: 92.83%  ratio: 13.94x

306.02:
Quick timetable
 
WU : #ap_genwis.dat
astropulse_6.01_windows_intelx86.exe -verbose :
  Elapsed 3.659 secs
      CPU 1.576 secs
AP6_win_x86_SSE2_OpenCL_NV_r1316.exe :
  Elapsed 44.170 secs, speedup: -1107.16%  ratio: 0.08x
      CPU 39.406 secs, speedup: -2400.38%  ratio: 0.04x
 
WU : ap_18se08aa_B6_P1_00046_1LC25.wu
astropulse_6.01_windows_intelx86.exe -verbose :
  Elapsed 446.013 secs
      CPU 459.610 secs
AP6_win_x86_SSE2_OpenCL_NV_r1316.exe :
  Elapsed 129.749 secs, speedup: 70.91%  ratio: 3.44x
      CPU 125.347 secs, speedup: 72.73%  ratio: 3.67x
 
WU : JasonShort_v5.wu
astropulse_6.01_windows_intelx86.exe -verbose :
  Elapsed 894.734 secs
      CPU 875.290 secs
AP6_win_x86_SSE2_OpenCL_NV_r1316.exe :
  Elapsed 61.220 secs, speedup: 93.16%  ratio: 14.62x
      CPU 58.360 secs, speedup: 93.33%  ratio: 15.00x
 
WU : short_ap_21oc08ab_B2_P0_00081_20081130_08605_v5.wu
astropulse_6.01_windows_intelx86.exe -verbose :
  Elapsed 464.060 secs
      CPU 448.019 secs
AP6_win_x86_SSE2_OpenCL_NV_r1316.exe :
  Elapsed 42.198 secs, speedup: 90.91%  ratio: 11.00x
      CPU 39.250 secs, speedup: 91.24%  ratio: 11.41x
 
WU : sigind_v5.wu
astropulse_6.01_windows_intelx86.exe -verbose :
  Elapsed 941.842 secs
      CPU 905.196 secs
AP6_win_x86_SSE2_OpenCL_NV_r1316.exe :
  Elapsed 300.304 secs, speedup: 68.12%  ratio: 3.14x
      CPU 293.765 secs, speedup: 67.55%  ratio: 3.08x
 
WU : single_pulses.wu
astropulse_6.01_windows_intelx86.exe -verbose :
  Elapsed 448.227 secs
      CPU 430.812 secs
AP6_win_x86_SSE2_OpenCL_NV_r1316.exe :
  Elapsed 33.899 secs, speedup: 92.44%  ratio: 13.22x
      CPU 31.029 secs, speedup: 92.80%  ratio: 13.88x

306.23:
Quick timetable
 
WU : #ap_genwis.dat
astropulse_6.01_windows_intelx86.exe -verbose :
  Elapsed 3.823 secs
      CPU 1.778 secs
AP6_win_x86_SSE2_OpenCL_NV_r1316.exe :
  Elapsed 45.188 secs, speedup: -1082.00%  ratio: 0.08x
      CPU 40.857 secs, speedup: -2197.92%  ratio: 0.04x
 
WU : ap_18se08aa_B6_P1_00046_1LC25.wu
astropulse_6.01_windows_intelx86.exe -verbose :
  Elapsed 446.013 secs
      CPU 459.610 secs
AP6_win_x86_SSE2_OpenCL_NV_r1316.exe :
  Elapsed 129.390 secs, speedup: 70.99%  ratio: 3.45x
      CPU 130.947 secs, speedup: 71.51%  ratio: 3.51x
 
WU : JasonShort_v5.wu
astropulse_6.01_windows_intelx86.exe -verbose :
  Elapsed 894.734 secs
      CPU 875.290 secs
AP6_win_x86_SSE2_OpenCL_NV_r1316.exe :
  Elapsed 60.678 secs, speedup: 93.22%  ratio: 14.75x
      CPU 59.904 secs, speedup: 93.16%  ratio: 14.61x
 
WU : short_ap_21oc08ab_B2_P0_00081_20081130_08605_v5.wu
astropulse_6.01_windows_intelx86.exe -verbose :
  Elapsed 464.060 secs
      CPU 448.019 secs
AP6_win_x86_SSE2_OpenCL_NV_r1316.exe :
  Elapsed 42.019 secs, speedup: 90.95%  ratio: 11.04x
      CPU 41.496 secs, speedup: 90.74%  ratio: 10.80x
 
WU : sigind_v5.wu
astropulse_6.01_windows_intelx86.exe -verbose :
  Elapsed 941.842 secs
      CPU 905.196 secs
AP6_win_x86_SSE2_OpenCL_NV_r1316.exe :
  Elapsed 298.223 secs, speedup: 68.34%  ratio: 3.16x
      CPU 301.768 secs, speedup: 66.66%  ratio: 3.00x
 
WU : single_pulses.wu
astropulse_6.01_windows_intelx86.exe -verbose :
  Elapsed 448.227 secs
      CPU 430.812 secs
AP6_win_x86_SSE2_OpenCL_NV_r1316.exe :
  Elapsed 33.425 secs, speedup: 92.54%  ratio: 13.41x
      CPU 31.621 secs, speedup: 92.66%  ratio: 13.62x
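
One way to see the effect of the driver sync change is to compare CPU time against elapsed time per driver. The Python sketch below does this for a single work unit, with the timings copied from the tables above; the function names and the choice of work unit are just for illustration.

```python
# (elapsed_secs, cpu_secs) for AP6_win_x86_SSE2_OpenCL_NV_r1316.exe on
# ap_18se08aa_B6_P1_00046_1LC25.wu, copied from the tables above.
RUNS = {
    "266.58": (137.619, 9.797),
    "301.42": (129.331, 126.813),
    "306.02": (129.749, 125.347),
    "306.23": (129.390, 130.947),
}


def cpu_share_percent(elapsed_secs, cpu_secs):
    """CPU time as a percentage of wall-clock time for one run."""
    return cpu_secs / elapsed_secs * 100.0


def summarize(runs):
    """Map each driver version to its CPU share, rounded to one decimal."""
    return {driver: round(cpu_share_percent(elapsed, cpu), 1)
            for driver, (elapsed, cpu) in runs.items()}


if __name__ == "__main__":
    for driver, share in summarize(RUNS).items():
        print(f"driver {driver}: CPU time is {share}% of elapsed time")
```

On these numbers the app needs only a small fraction of a core with 266.58 but close to a full core with the three later drivers, while elapsed time drops from about 138 to about 129 seconds. That matches the remark that the speedup depends on an unused core being available to feed the app.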

Claggy

* P5N-E-SLI-20120902-1327-benchAP.7z (4.35 KB - downloaded 63 times.)
* P5N-E-SLI-20120902-1520-benchAP.7z (4.16 KB - downloaded 67 times.)
* P5N-E-SLI-20120902-1715-benchAP.7z (4.16 KB - downloaded 60 times.)
* P5N-E-SLI-20120915-1615-benchAP.7z (4.17 KB - downloaded 64 times.)
Logged