Seti@Home optimized science apps and information
 
23 Nov 2008, 10:12:20 am

 
 
Seti@Home optimized science apps and information  |  Optimized Seti@Home apps  |  Windows  |  Topic: Current Profile Analysis and points to optimze
Author Topic: Current Profile Analysis and points to optimze  (Read 4044 times)
BenHer
Re: Current Profile Analysis and points to optimze
« Reply #15 on: 15 Aug 2006, 04:36:21 pm »

Hmm...just checked out the older version of the optimized Mac seti source code by Alex Kan & Rick Berry from their website http://writhe.org.uk/seti@home/
Note: the latest modified file was 9-15-2005, so it was pre-"enhanced", I'm guessing...

They not only optimized existing functions, they cleaned up formatting, added documentation, rewrote entire sections and changed the way computations were performed (chirping)...so apparently they have reviewed some of the math. *rolls eyes*

They also commented many undocumented routines inside the source, so they seem to have worked through what Eric K. et al were trying to achieve with many of their functions.

Regarding an earlier question..."can some students be tasked with reviewing the math..."  Alex is apparently a U.C. Berkeley engineering student.
Josef W. Segur
Re: Current Profile Analysis and points to optimze
« Reply #16 on: 19 Aug 2006, 04:38:23 pm »

Quote
Hmm...just checked out the older version of the optimized Mac seti source code by Alex Kan & Rick Berry from their website http://writhe.org.uk/seti@home/
Note: the latest modified file was 9-15-2005, so it was pre-"enhanced", I'm guessing...

They not only optimized existing functions, they cleaned up formatting, added documentation, rewrote entire sections and changed the way computations were performed (chirping)...so apparently they have reviewed some of the math. *rolls eyes*

They also commented many undocumented routines inside the source, so they seem to have worked through what Eric K. et al were trying to achieve with many of their functions.

I was impressed, too. Later source can be found at http://tbp.berkeley.edu/~alexkan/seti/. I'm wondering if I can restate some of the vectorized routines from the 6.1 source to compile with DevC++/MinGW. If I can get up to speed soon enough, I'll try to get at least some x86 SIMD variants into 5.17+. OTOH, you could probably do that much more efficiently than I...

Quote
Regarding an earlier question..."can some students be tasked with reviewing the math..."  Alex is apparently a U.C. Berkeley engineering student.

Graduate, now. I was reading the MacNN forum posts related to those optimized S@H apps; that was also quite interesting.
                                                                       Joe
Josef W. Segur
Re: Current Profile Analysis and points to optimze
« Reply #17 on: 19 Aug 2006, 04:49:35 pm »

Quote
Figured out how to tell ICC to super optimize v_getPowerSpectrum...hand coding could hardly improve on it.

Is that ippsPowerSpectr_32fc() ?
                                                                     Joe
Chboss
Re: Current Profile Analysis and points to optimze
« Reply #18 on: 19 Aug 2006, 05:15:40 pm »

Yes, Alex's Mac client is impressive....

MacMini G4 1.25GHz  RAC 219
Athlon XP 2600+ (Linux) RAC 212

If some of their improvements can be brought over to the Linux version it would certainly be helpful.


BenHer
Re: Current Profile Analysis and points to optimze
« Reply #19 on: 20 Aug 2006, 01:43:04 am »

I've gotten about a 20% improvement so far vs. Simon's SSE3 Athlon exe.

SIMD is only a part of it...many of the bottlenecks come down to simple programming optimization.

First identify what is slow, second identify why, then fix it. Several bottlenecks have been float/int conversions that aren't needed...others were if-thens inside of loops (a big no-no)...another was an 'abs( )' inside a loop...big speed-up from fixing that.
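As a rough illustration of two of the fixes described above (this is not BenHer's actual code; the function names, parameters and the weighting formula are hypothetical), the sketch below shows a loop with a loop-invariant if-then and a per-iteration int-to-float conversion, followed by a version with both removed. The 'abs( )' case is not shown, since the post does not say how it was fixed.

```c
#include <stddef.h>

/* Before: the test on `invert` never changes inside the loop, and the
   loop counter is converted from integer to float on every iteration. */
float ramp_sum_slow(const float *data, size_t n, int invert, float step)
{
    float sum = 0.0f;
    for (size_t i = 0; i < n; i++) {
        float w = (float)i * step;   /* int -> float conversion each pass */
        if (invert)                  /* loop-invariant branch inside the loop */
            sum -= data[i] * w;
        else
            sum += data[i] * w;
    }
    return sum;
}

/* After: the weight is carried as a running float, and the branch is
   applied once, after the loop, instead of n times inside it. */
float ramp_sum_fast(const float *data, size_t n, int invert, float step)
{
    float sum = 0.0f;
    float w = 0.0f;                  /* running weight, no conversion needed */
    for (size_t i = 0; i < n; i++) {
        sum += data[i] * w;
        w += step;
    }
    return invert ? -sum : sum;      /* branch hoisted out of the loop */
}
```

Note that accumulating `w` by repeated addition can round differently from `(float)i * step` on long loops, so a change like this has to be verified against reference results, which is what the test WU comparison is for.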

I've also incorporated Alex's re-ordered power spectrum table from 5.17, but without using another table...it's all inside of the original power spectrum table.

Have to verify it all vs. the test WUs now...am only testing against short WU 2 vs. release-515 for general development.  WU2 verifies strongly...time on my Athlon 64 3800 X2 - using only core #2 -   537 seconds

In my latest...find_pulse (and its new sub-functions) uses 19.02% of WU time...and Intel's FFT uses 17.92%...the cache misses for PoT functions are down to 15.7%.

Might be able to squeeze another 5-10% out...harder now though.

Quote
Is that ippsPowerSpectr_32fc() ?    - Joe
No...I just let the Intel compiler vectorize the loop, but I gave it better hints that it could be vectorized.


Simon,
Suggest you check out the program AutoIt3 at http://www.autoitscript.com/autoit3/  for automating the testing...I'm going to write a short one myself...time seconds...etc.

Simon
Re: Current Profile Analysis and points to optimze
« Reply #20 on: 20 Aug 2006, 09:14:52 am »

Hi Ben,

Auto-It is pretty impressive stuff. Even more so, the 20% you said you got out of the 5.15 sources :) Any chance of getting an archive of your changes or a full source snapshot anytime soon? If I seem eager, I am ;)

Also, does that 20% translate to Intel systems too, or is it AMD-only?

About telling ICC to vectorize things - are you doing that with "#pragma vector aligned" or "#pragma vector always"?

Regards,
Simon.
« Last Edit: 20 Aug 2006, 09:17:29 am by Simon »
BenHer
Re: Current Profile Analysis and points to optimze
« Reply #21 on: 20 Aug 2006, 01:52:28 pm »

Simon,

I use this code to tell it which pointers point to aligned buffers (in powerspectrum it's both):
Code:
#ifdef __INTEL_COMPILER
#define ALIGNED_YES( buffer_ ) __assume_aligned( buffer_, SIMD_ALIGN );
#else
#define ALIGNED_YES( buffer_ )
#endif
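
A minimal sketch of how a macro like this might be used in a power-spectrum style loop. This is an assumption-laden illustration, not the actual v_GetPowerSpectrum source: the function name `power_spectrum`, the interleaved (re, im) layout, and the value 16 for `SIMD_ALIGN` are all hypothetical. On compilers other than ICC the macro expands to nothing, so the loop still compiles and runs, just without the alignment hint.

```c
#include <stddef.h>

#define SIMD_ALIGN 16   /* assumed value: 16-byte alignment as needed for SSE */

#ifdef __INTEL_COMPILER
#define ALIGNED_YES( buffer_ ) __assume_aligned( buffer_, SIMD_ALIGN );
#else
#define ALIGNED_YES( buffer_ )
#endif

/* Hypothetical power-spectrum loop. `freq` holds n interleaved (re, im)
   pairs; `power` receives n values of re*re + im*im. The caller must
   guarantee that both buffers really are SIMD_ALIGN-byte aligned, since
   the hint is a promise to the compiler, not a runtime check. */
void power_spectrum(const float *restrict freq, float *restrict power, size_t n)
{
    /* No trailing semicolons here: the macro as posted already supplies one. */
    ALIGNED_YES(freq)
    ALIGNED_YES(power)

    for (size_t i = 0; i < n; i++) {
        float re = freq[2 * i];
        float im = freq[2 * i + 1];
        power[i] = re * re + im * im;
    }
}
```

The `restrict` qualifiers are an extra assumption on top of the post: they tell the compiler the two buffers do not overlap, which is another common obstacle to auto-vectorization. The ICC pragmas Simon asks about (`#pragma vector aligned`, `#pragma vector always`) are placed directly before a loop instead; the macro approach states the alignment per pointer.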
Josef W. Segur
Re: Current Profile Analysis and points to optimze
« Reply #22 on: 21 Aug 2006, 12:04:12 am »

For approximate comparison, I built 5.17 on DevC++/MinGW with profiling enabled. I had to drop optimization to -O2 because the profiling code won't work with -fomit-frame-pointer. So FWIW here are some values from running WU2 with chirp limits 10 and 25, which took about 3 hours 41 minutes on my 1.4 GHz Pentium M:

37.90% find_pulse()
11.09% v_Transpose4()
6.04% v_ChirpData()
5.28% CalcTrigArray()
5.24% GaussFit()
5.22% f_GetChiSq()
4.71% f_GetTrueMean()
3.61% FindSpikes()
3.29% f_GetPeak()
2.57% lcgf()
2.51% find_triplets()
2.36% v_GetPowerSpectrum()
1.95% float_to_uchar()
1.62% t_funct()
1.53% GetFixedPoT()
1.27% analyze_pot()
                                                                   Joe