Seti@Home optimized science apps and information
 
Topic: AVX Optimized App Development (Read 33532 times)
arkayn
« Reply #60 on: 12 May 2011, 09:40:30 pm »

From the Q8200

* stderr.txt (2.25 KB - downloaded 89 times.)

Fredericx51
« Reply #61 on: 13 May 2011, 07:14:23 am »

And from an i7-2600, without and with BOINC (6.10.60) running.


* stderr-j39.txt (5.73 KB - downloaded 85 times.)
Claggy
« Reply #62 on: 13 May 2011, 07:23:51 am »

Here's my E8500's J39 run. (5 runs with Boinc and apps running, 5 runs with Boinc and apps shut down)

Claggy

* stderr.7z (1.64 KB - downloaded 75 times.)
Josef W. Segur
« Reply #63 on: 14 May 2011, 08:59:44 pm »

So here's J40. I modified the "_b" AVX folding, but expect it will probably still be slower than the "_a" version. For the 4-float SIMD folding it was better to use non-SIMD code for the very shortest cases; with AVX it looks like it may be better just to handle all sizes as 8 floats with masking at the end. Anyhow, I reduced my guess about how small is too small to be efficient on AVX.
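
For readers unfamiliar with the "8 floats with masking at the end" idea, here is a minimal sketch in C. It is not the J40 code: the function name fold2, the mask table, and the choice of a simple two-way fold (adding two segments element by element) are assumptions made for illustration. When compiled with AVX enabled it processes full groups of 8 floats and then uses a masked load/store for the remainder; otherwise it falls back to a plain loop.

```c
#include <assert.h>
#include <stddef.h>
#ifdef __AVX__
#include <immintrin.h>
#endif

/* Hypothetical two-way fold: dst[i] = a[i] + b[i] for i in [0, n). */
void fold2(float *dst, const float *a, const float *b, size_t n)
{
    assert(dst != NULL && a != NULL && b != NULL);
#ifdef __AVX__
    /* Eight all-ones lanes followed by eight zero lanes. Loading 8 ints
       starting at index (8 - rem) gives a mask whose first rem lanes are
       enabled and whose remaining lanes are disabled. */
    static const int mask_table[16] = {
        -1, -1, -1, -1, -1, -1, -1, -1, 0, 0, 0, 0, 0, 0, 0, 0
    };
    size_t i = 0;

    /* Full groups of 8 floats. */
    for (; i + 8 <= n; i += 8) {
        __m256 va = _mm256_loadu_ps(a + i);
        __m256 vb = _mm256_loadu_ps(b + i);
        _mm256_storeu_ps(dst + i, _mm256_add_ps(va, vb));
    }

    /* Tail of 1..7 floats: still one 8-wide operation, but masked so that
       nothing past the end of the arrays is read or written. */
    size_t rem = n - i;
    if (rem != 0) {
        __m256i m = _mm256_loadu_si256((const __m256i *)(mask_table + 8 - rem));
        __m256 va = _mm256_maskload_ps(a + i, m);
        __m256 vb = _mm256_maskload_ps(b + i, m);
        _mm256_maskstore_ps(dst + i, m, _mm256_add_ps(va, vb));
    }
#else
    /* Scalar fallback for builds without AVX. */
    for (size_t i = 0; i < n; ++i)
        dst[i] = a[i] + b[i];
#endif
}
```

With this layout even a fold shorter than 8 floats goes through the masked path alone, which is one way to read "handle all sizes as 8 floats". Whether that beats a scalar loop for tiny sizes is exactly the kind of thing the test runs in this thread are meant to measure.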

Also added SSE3 and SSE1 modified chirping based on AKv8. There are two variants for SSE1: one uses the Estrin method for the polynomials, the other Horner. Estrin has one fewer instruction but Horner needs fewer registers. On my Pentium-M it's a wash; either one may be marginally faster for a single run. But perhaps on even older systems where SSE1 is the best capability it may make a difference, or perhaps some newer systems will also react in surprising ways.
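
The two evaluation schemes named above can be sketched for a cubic as follows. This is scalar C for illustration only: the function names are made up, the coefficients are supplied by the caller and are not the AKv8 chirp constants, and the instruction and register counts quoted in the post apply to the actual SSE1 chirp code, not to this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Horner: one long dependency chain, few temporaries.
   Evaluates c[0] + c[1]*x + c[2]*x^2 + c[3]*x^3. */
float poly_horner(const float c[4], float x)
{
    assert(c != NULL);
    return ((c[3] * x + c[2]) * x + c[1]) * x + c[0];
}

/* Estrin: splits the polynomial into independent halves that can be
   computed in parallel, at the cost of keeping x*x and both partial
   sums live at the same time. */
float poly_estrin(const float c[4], float x)
{
    assert(c != NULL);
    float x2 = x * x;
    float lo = c[0] + c[1] * x;   /* independent of hi */
    float hi = c[2] + c[3] * x;   /* independent of lo */
    return lo + hi * x2;
}
```

Both functions compute the same polynomial; rounding can differ slightly for general inputs because the operations are grouped differently.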

I've left the AVX chirping unchanged. Of the 6 tests on AVX-capable systems, a was chosen twice, b twice, and c twice. The largest difference between the slowest and fastest AVX version in any one test was about 12%, so it's worth gathering more data.
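
As a rough illustration of how "chosen" and "about 12%" can be read, here is a small sketch in C that picks the fastest of several timed variants and reports the spread between slowest and fastest. The function names are hypothetical, and expressing the spread relative to the fastest time is an assumption; the post does not say how the percentage was computed.

```c
#include <assert.h>
#include <stddef.h>

/* Index of the smallest time, i.e. the variant a chooser would select. */
size_t fastest_index(const double *secs, size_t n)
{
    assert(secs != NULL && n > 0);
    size_t best = 0;
    for (size_t i = 1; i < n; ++i) {
        if (secs[i] < secs[best])
            best = i;
    }
    return best;
}

/* Spread between slowest and fastest time, as a percentage of the
   fastest time (assumed definition). Times must be positive. */
double spread_percent(const double *secs, size_t n)
{
    assert(secs != NULL && n > 0);
    double lo = secs[0], hi = secs[0];
    for (size_t i = 1; i < n; ++i) {
        if (secs[i] < lo) lo = secs[i];
        if (secs[i] > hi) hi = secs[i];
    }
    assert(lo > 0.0);
    return (hi - lo) / lo * 100.0;
}
```

With only a handful of runs per machine, differences of this size can come from run-to-run noise as easily as from the code, which is why each variant winning twice out of six is not yet conclusive.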
                                                                                                        Joe
Edit: Attachment deleted, newer version in later post.
« Last Edit: 17 May 2011, 09:21:40 pm by Josef W. Segur »
arkayn
« Reply #64 on: 14 May 2011, 10:09:06 pm »

First up the Q8200

Then the X4 630

* Q8200stderr.txt (2.42 KB - downloaded 77 times.)
* X4630stderr.txt (2.42 KB - downloaded 81 times.)
« Last Edit: 14 May 2011, 10:12:34 pm by arkayn »

Claggy
« Reply #65 on: 15 May 2011, 05:21:03 am »

Here's the J40 run on my E8500 @ 4.14GHz (5 runs with Boinc and apps running, 5 runs with Boinc and apps shut down)

Edit: added Atom N450 run (5 runs with Boinc and one v7 r246 app and one AP r468 app running, and 5 runs with Boinc and apps shut down)

Edit 2: added C2D T8100 run (5 runs with Boinc and one v7 r246 app, one AP r409 app and one Collatz Cuda app running, and 5 runs with Boinc and apps shut down)

Edit 3: Dug my old XP3200 out of its box and connected it up, and did a run with J40 (just 5 runs with Boinc shut down)

Claggy

* E8500_stderr.7z (1.76 KB - downloaded 74 times.)
* Atom_N450_stderr.7z (2.02 KB - downloaded 68 times.)
* T8100_stderr.7z (1.83 KB - downloaded 76 times.)
* XP3200_stderr.7z (1.04 KB - downloaded 65 times.)
« Last Edit: 15 May 2011, 09:27:09 am by Claggy »
Fredericx51
« Reply #66 on: 15 May 2011, 06:20:57 am »

And 2 J40 runs with BOINC (6.10.60) doing 12 MB WUs, and 2 runs with BOINC sleeping.
CPU = i7-2600 at stock frequency (3.4 GHz).




* stderr.rar (1.17 KB - downloaded 88 times.)
« Last Edit: 15 May 2011, 06:25:40 am by Fredericx51 »
Fredericx51
« Reply #67 on: 15 May 2011, 05:45:44 pm »

Are more tests required from the i7-2600, or from any CPU supporting AVX?

I'd be happy to test your latest fsj40 a couple more times, if its output is useful for
your eventual 'build', or as part of the coders' information.

I'm going to download the entire AVX build and C++ compiler suite (IPP+???). Still reading a lot of PDF files
to get some useful info; very time-consuming, though.

Will return soon ;D

Fredericx51
« Reply #68 on: 16 May 2011, 11:14:40 am »

Did a few more runs with FTST-J40: 5 with BOINC paused, leaving the app in memory, 5 with BOINC shut down, and 5 with
BOINC running 8 MB WUs on the CPU (i7-2600) and 4 on the GPUs (HD5870s). (Firefox's history and cache data were flushed.)
OS = Win7 64-bit, 8 GB DDR3 1333 MHz, everything at stock settings. BOINC 6.10.60, 64-bit.
The 46 KB of text, compressed as RAR for instance, needs not even 4 KB! (Well, all the text is in fact the same for every run.)

Hope it is useful. If more or other extended AVX tests are needed, please ask ::)





* stderr_ftst-j40.rar (2.64 KB - downloaded 86 times.)
« Last Edit: 16 May 2011, 11:58:57 am by Fredericx51 »
Josef W. Segur
« Reply #69 on: 16 May 2011, 03:56:57 pm »

Thanks for the additional runs. More data is definitely useful when there are so many things which can affect individual runs. Whether I can recognize what's significant is doubtful, but it ought to limit my really bad guesses.

I do have a few more things in mind to try, but don't know when I'll be able to actually code them.
                                                                                                 Joe
Fredericx51
« Reply #70 on: 17 May 2011, 01:07:14 pm »

Well, I am (almost) daily looking at something interesting and/or new.
AVX happens to be one of them ::)

I hope you'll be able to work out some useful AVX configuration for Gauss Fit and others like triplets, pulses and spikes.

If it's useful, I can run all 3 versions a few times; just ask, because the info might be needed(?)

And the outage at SETI@Home began while I was posting :-\

Wishing all a pleasant day, Fred.
B.t.w., when doing 2 on 1 GPU, screen lag is quite heavy; sometimes all motion stops and the screen is no longer refreshed.
Is there something to change using the cmd-line parameters? (SETI Beta rev177, or newer, will have a look!)

Oh boy, there might be some double attachments(?) [Edit by Miep - I took the second instance of stderrftst_V7_J37_W32.rar out]

* stderr.txt (32.52 KB - downloaded 93 times.)
* stderrftst_V7_J37_W32.rar (3.37 KB - downloaded 78 times.)
* stderr_ftst_Win32_J40.rar (2.03 KB - downloaded 68 times.)
« Last Edit: 17 May 2011, 02:43:05 pm by Miep »
Miep
« Reply #71 on: 17 May 2011, 02:45:52 pm »

B.t.w., when doing 2 on 1 GPU, screen lag is quite heavy; sometimes all motion stops and the screen is no longer refreshed.
Is there something to change using the cmd-line parameters? (SETI Beta rev177, or newer, will have a look!)

With OpenCL MB, increase -period_iteration_num if it's laggy or the driver restarts.
With AP, decrease -unroll and the block sizes.
I'd point you to my main post on that topic, but that's a tiny bit tricky during maintenance :)
Josef W. Segur
« Reply #72 on: 17 May 2011, 09:19:50 pm »

Here's another test version. I've dropped the JS_AVX_b folding because that approach was a clear failure, but added JS_AVX_c folding, which may do better. For the non-AVX side I did some minor cleanup, but don't expect any noticeable difference in results unless I made a typo or something.
                                                                                                  Joe
Edit: Attachment deleted, newer version in later post.
« Last Edit: 21 May 2011, 02:14:10 pm by Josef W. Segur »
Miep
« Reply #73 on: 18 May 2011, 08:38:13 am »

Once with BOINC running, once without.

* J43_stderr.7z (0.8 KB - downloaded 63 times.)
Claggy
« Reply #74 on: 18 May 2011, 09:34:50 am »

Here's the J43 run on my E8500 (5 runs with Boinc and apps running, 5 runs with Boinc and apps shut down)

Claggy

* E8500_J43_stderr.7z (1.81 KB - downloaded 67 times.)