Topic: Current Profile Analysis and points to optimize
|
|
BenHer
|
Hmm... I just checked out the older version of the SETI source optimized for the Mac by Alex Kan & Rick Berry, from their website http://writhe.org.uk/seti@home/. Note: the latest modified file is dated 9-15-2005, so I'm guessing it predates "enhanced"...

They not only optimized existing functions; they also cleaned up formatting, added documentation, rewrote entire sections and changed the way computations were performed (chirping)... so apparently they have reviewed some of the math. They also commented many undocumented routines inside the source, so they seem to have worked through what Eric K. et al. were trying to achieve with many of their functions.

Regarding an earlier question ("can some students be tasked with reviewing the math..."): Alex is apparently a U.C. Berkeley engineering student.
|
Josef W. Segur
|
Quote from BenHer: "Hmm... I just checked out the older version of the SETI source optimized for the Mac by Alex Kan & Rick Berry, from their website http://writhe.org.uk/seti@home/. Note: the latest modified file is dated 9-15-2005, so I'm guessing it predates "enhanced"... They not only optimized existing functions; they also cleaned up formatting, added documentation, rewrote entire sections and changed the way computations were performed (chirping)... so apparently they have reviewed some of the math. They also commented many undocumented routines inside the source, so they seem to have worked through what Eric K. et al. were trying to achieve with many of their functions."

I was impressed, too. Later source can be found at http://tbp.berkeley.edu/~alexkan/seti/. I'm wondering if I can restate some of the vectorized routines from the 6.1 source so that they compile with DevC++/MinGW. If I can get up to speed soon enough, I'll try to get at least some x86 SIMD variants into 5.17+. OTOH, you could probably do that much more efficiently than I could...

Quote from BenHer: "Regarding an earlier question ("can some students be tasked with reviewing the math..."): Alex is apparently a U.C. Berkeley engineering student."

Graduate, now. I was reading the Macnn forum posts related to those optimized S@H apps; that was also quite interesting.

Joe
|
Josef W. Segur
|
Quote: "Figured out how to tell ICC to super optimize v_getPowerSpectrum... hand coding could hardly improve on it."

Is that ippsPowerSpectr_32fc()?

Joe
|
Chboss
|
Yes, Alex's Mac client is impressive....
MacMini G4 1.25GHz: RAC 219
Athlon XP 2600+ (Linux): RAC 212
If some of their improvements can be brought over to the Linux version it would certainly be helpful.
|
BenHer
|
I've gotten about a 20% improvement so far vs. Simon's SSE3 Athlon exe. SIMD is only a part of it... many of the bottlenecks come down to simple programming optimization: first identify what is slow, second identify why, then fix it. Several were float/int conversions that aren't needed; others were if-thens inside of loops (a big no-no); another was an 'abs( )' inside a loop, and fixing that gave a big speed-up.

I've also incorporated Alex's re-ordered power spectrum table from 5.17, but without using another table... it's all inside of the original powerspectrum table.

I have to verify it all vs. the test WUs now... so far I am only testing against short WU 2 vs. release-515 for general development. WU2 verifies strongly. Time on my Athlon 64 3800 X2, using only core #2: 537 seconds.

In my latest build, find_pulse (and its new sub-functions) uses 19.02% of WU time and Intel's FFT uses 17.92%... the cache misses for the PoT functions are down to 15.7%. I might be able to squeeze another 5-10% out... harder now, though.

Quote from Joe: "Is that ippsPowerSpectr_32fc() ?"

No... I just let the Intel compiler vectorize the loop, but I gave it better hints that it could be vectorized.

Simon, I suggest you check out the program AutoIt3 at http://www.autoitscript.com/autoit3/ for automating the testing... I'm going to write a short script myself... time in seconds, etc.
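The loop-level fixes listed in the post above (an if-then and an abs() inside a hot loop) might look roughly like the following. This is only a sketch under stated assumptions: the function names, the `apply_scale` flag, the `magnitude()` helper and the computation itself are hypothetical and are not taken from the SETI@home source.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the abs() call mentioned in the post. */
static float magnitude(float v)
{
    return v < 0.0f ? -v : v;
}

/* Before: a loop-invariant test and a redundant magnitude() call run on
   every iteration. */
float sum_sq_slow(const float *x, size_t n, int apply_scale, float scale)
{
    assert(x != NULL);
    float sum = 0.0f;
    for (size_t i = 0; i < n; i++) {
        float v = magnitude(x[i]);   /* redundant: v is squared below */
        if (apply_scale)             /* same outcome on every pass */
            v *= scale;
        sum += v * v;
    }
    return sum;
}

/* After: the decision is made once outside the loop, and the magnitude()
   call is dropped because squaring discards the sign anyway. */
float sum_sq_fast(const float *x, size_t n, int apply_scale, float scale)
{
    assert(x != NULL);
    const float k = apply_scale ? scale : 1.0f;  /* hoisted decision */
    float sum = 0.0f;
    for (size_t i = 0; i < n; i++) {
        const float v = x[i] * k;
        sum += v * v;
    }
    return sum;
}
```

Both versions return the same sum; the second simply leaves the compiler a branch-free loop body, which is also easier for it to vectorize.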
|
Simon
|
Hi Ben,

AutoIt is pretty impressive stuff. Even more so, the 20% you said you got out of the 5.15 sources. Any chance of getting an archive of your changes or a full source snapshot anytime soon? If I seem eager, I am. Also, does that 20% translate to Intel systems too, or is it AMD-only?

About telling ICC to vectorize things: are you doing that with "#pragma vector aligned" or "#pragma vector always"?

Regards, Simon.
|
« Last Edit: 20 Aug 2006, 09:17:29 am by Simon »
|
BenHer
|
Simon,

I use this code to tell it which pointers point to aligned buffers (in powerspectrum it's both):

    #ifdef __INTEL_COMPILER
    #define ALIGNED_YES( buffer_ ) __assume_aligned( buffer_, SIMD_ALIGN );
    #else
    #define ALIGNED_YES( buffer_ )
    #endif
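For illustration, the macro above could be applied to a power-spectrum style loop as sketched below. The loop body, the function name `power_spectrum_sketch`, and the value 16 for `SIMD_ALIGN` are assumptions made for this example; the post shows neither the actual v_GetPowerSpectrum() code nor the definition of `SIMD_ALIGN`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SIMD_ALIGN 16   /* assumed value; not given in the post */

#ifdef __INTEL_COMPILER
#define ALIGNED_YES( buffer_ ) __assume_aligned( buffer_, SIMD_ALIGN );
#else
#define ALIGNED_YES( buffer_ )
#endif

/* Hypothetical loop: power[i] = re^2 + im^2 for interleaved complex input.
   Both pointers are declared aligned, matching "in powerspectrum it's both". */
void power_spectrum_sketch(const float *freq, float *power, size_t n)
{
    /* Debug-build check that the alignment promise actually holds. */
    assert(((uintptr_t)freq  % SIMD_ALIGN) == 0);
    assert(((uintptr_t)power % SIMD_ALIGN) == 0);

    /* The macro carries its own semicolon, so none is written here. */
    ALIGNED_YES( freq )
    ALIGNED_YES( power )

    for (size_t i = 0; i < n; i++) {
        const float re = freq[2 * i];
        const float im = freq[2 * i + 1];
        power[i] = re * re + im * im;
    }
}
```

On compilers other than ICC the macro expands to nothing, so the function still builds and behaves the same. The hint is a promise, not a request: the caller has to allocate both buffers on a `SIMD_ALIGN` boundary, otherwise vectorized code that relies on the hint may fault.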
|
Josef W. Segur
|
For approximate comparison, I built 5.17 on DevC++/MinGW with profiling enabled. I had to drop optimization to -O2 because the profiling code won't work with -fomit-frame-pointer. So, FWIW, here are some values from running WU2 with chirp limits 10 and 25, which took about 3 hours 41 minutes on my 1.4 GHz Pentium M:
37.90%  find_pulse()
11.09%  v_Transpose4()
 6.04%  v_ChirpData()
 5.28%  CalcTrigArray()
 5.24%  GaussFit()
 5.22%  f_GetChiSq()
 4.71%  f_GetTrueMean()
 3.61%  FindSpikes()
 3.29%  f_GetPeak()
 2.57%  lcgf()
 2.51%  find_triplets()
 2.36%  v_GetPowerSpectrum()
 1.95%  float_to_uchar()
 1.62%  t_funct()
 1.53%  GetFixedPoT()
 1.27%  analyze_pot()

Joe
|