Topic: optimized sources (Read 37352 times)
|
|
_heinz
|
Hi Simon,
after studying the sources, I found that chirpfft.cpp in the client deserves my attention. I reduced the code in CalcTrigArray by using an extern function, FillTrigArray, that I wrote, and in TrigArrayInit.ptt I added some hints for the compiler. That should improve the speed. Next will be analyse.cpp. I will then go through all the other sources to find things to make shorter and more efficient, but it will take a little time to finish this.

Who compiles the sources? Should I do that, or should I send the sources back to you, Simon? So far I do not have the complete environment at home to build a new client. I have the Microsoft C Compiler version 4.00 and the CodeView debugger version 1.0, which let me write short programs to check that my new code is fine. Sure, I can download all the necessary new components to install a new environment, but that only works for a month, which is a bit of a pity. Otherwise I think I would have to invest over 600 dollars to get it for permanent use. Does anybody have a good idea what to do?

Best regards,
seti_britta
|
Josef W. Segur
|
I've shifted this to the Windows side since that matches the compiler and what you are running.

What you might do is just attach your changed source files to a post here. I'm definitely interested: one of my hosts has a Pentium-MMX CPU, so it can't use the vectorized chirp functions. And if you've improved the TrigArray approach enough, it might turn out to be faster than those vectorized versions even on systems with SSE, etc. Any further optimizations will also be welcome.

Simon has the full build system with the Intel compiler and Intel Performance Primitives, but I've been doing test GCC builds for Windows with DevC++/MinGW (as Eric Korpela uses for the stock Windows applications). If your changes can be built this way, I'll probably try.

Joe
|
BenHer
|
Britta,
Regarding your other questions.
1. Final releases are compiled with Intel's C++ Compiler v9.x. There is a free version of this for Linux, and the Windows version is available as a 30-day demo install.
2. Changes that compile with Microsoft Visual C++ 2003 or 2005 should almost always compile with the Intel compiler as well.
3. We (the programmers) usually make a change, compile a candidate executable with that change, and then test it by crunching one of 7 available test work units (WUs). These WUs are modified so that they run in about 1/15th of the run time of a regular WU but still exercise all parts of the SETI code.
4. Once the test is complete (we also time the test and compare the time to the latest release), we verify that it produced the correct output file (result) by using rescmp, a result comparison utility. If that works (and the time is faster), we then post the changed source file(s) along with the new executable in a posting to one of these threads for the rest of the development/testing group to try out and validate.
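
The accept/reject decision in steps 3 and 4 can be sketched roughly as follows. This is a hypothetical illustration and not the actual rescmp logic: the function names, the tolerance value, and the idea of treating a result as a plain list of numbers are all assumptions made for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a result comparison: two results "match" when they
// have the same length and every value agrees within a relative tolerance.
// The real rescmp utility may use entirely different criteria.
bool results_match(const std::vector<double>& reference,
                   const std::vector<double>& candidate,
                   double rel_tol = 1e-6) {
    assert(rel_tol >= 0.0);
    if (reference.size() != candidate.size()) return false;
    for (std::size_t i = 0; i < reference.size(); ++i) {
        // Scale the tolerance by the larger magnitude, but never below 1.0,
        // so values near zero are compared with an absolute tolerance.
        const double scale = std::max(std::fabs(reference[i]), std::fabs(candidate[i]));
        const double diff = std::fabs(reference[i] - candidate[i]);
        if (diff > rel_tol * std::max(scale, 1.0)) return false;
    }
    return true;
}

// A candidate build is worth posting only if its output is correct AND it ran
// faster than the latest release on the same test WU.
bool worth_posting(bool output_matches, double candidate_seconds, double release_seconds) {
    return output_matches && candidate_seconds < release_seconds;
}
```

A caller would run the release and the candidate on the same test WU, feed both outputs to `results_match`, and pass the two wall-clock times to `worth_posting`.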
|
_heinz
|
Joe, I'm now working on analyzeFuncs.cpp. The important part, chirpfft.cpp, is now done. Feel free to give some hints and comments. Don't try to compile it on its own; some variables are defined outside of it. All modifications are marked with "seti_britta:", so you can easily find them by searching. seti_britta
|
Crunch3r
Porting Team
Knight Templar
Offline
Posts: 402
|
Hello Seti_britta, since I've seen that you joined Seti.Germany, I assume I can write this one in German... (if I'm wrong, please correct me ;-)
-------------------------------------------------------------------------------------------------------------------------------------------------
OK... the functions (log etc.) should be implemented with the ones from Intel ICC/IPP or MKL (log with libimf, i.e. mathimf.h). We have the necessary licenses for that... (for testing, Intel also offers them as a 30-day demo).

What I would personally be interested in is an implementation of the powerspectrum and the transpose functions via Intel MKL, or powerspectrum via Intel IPP. Can you do that?

P.S. Are you familiar with Linux, or only Windows?
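
As a baseline for the power spectrum and transpose request, plain scalar versions are sketched below. This is neither the SETI client code nor an IPP or MKL call; the function names, the use of `std::complex<float>`, and the row-major layout are assumptions for illustration. A library-backed version would be expected to produce the same values within floating-point tolerance.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

// Power of each complex sample: re^2 + im^2 (hypothetical reference version).
std::vector<float> power_spectrum(const std::vector<std::complex<float>>& freq_data) {
    std::vector<float> power(freq_data.size());
    for (std::size_t i = 0; i < freq_data.size(); ++i) {
        const float re = freq_data[i].real();
        const float im = freq_data[i].imag();
        power[i] = re * re + im * im;
    }
    return power;
}

// Transposes a rows x cols row-major matrix into a cols x rows row-major matrix.
std::vector<float> transpose(const std::vector<float>& in, std::size_t rows, std::size_t cols) {
    assert(in.size() == rows * cols);
    std::vector<float> out(in.size());
    for (std::size_t r = 0; r < rows; ++r) {
        for (std::size_t c = 0; c < cols; ++c) {
            out[c * rows + r] = in[r * cols + c];
        }
    }
    return out;
}
```

Simple references like these are mainly useful as a correctness check when swapping in an optimized library routine.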
« Last Edit: 16 Mar 2007, 12:53:32 am by Crunch3r »
|
I want to share something with you: The three little sentences that will get you through life. Number 1: Cover for me. Number 2: Oh, good idea, Boss! Number 3: It was like that when I got here.
Homer Simpson
|
_heinz
|
hello Joe, hello Crunch3r,
at the moment I'm very busy with analyzeFuncs.cpp, reducing code and making some optimizations in the sources. After that I will look at what to do with powerspectrum and transpose. How you know analyzeFuncs is a fat thing, not easy to understand what is going on in the code. Therefore I divided it into logical parts whose function is easy to understand. This gives me a better overview and the possibility to reduce code and make other logical changes. Now I'm ready to show the first result of my studies, the new program structure of seti_analyze. Hints and suggestions are welcome.
For Crunch3r: I know Linux too and have already installed a web server with Apache and PHP, but at the moment I still have some old Windows and Mac boxes and a P4 with xph; Linux is not installed. In English so that everyone else can read along :-)
seti_britta
See the attached file: the new structure of seti_analyze (for now meant for understanding, documentation, and discussion).
|
BenHer
|
Quote: "How you know analyzeFuncs is a fat thing,"
Britta,
We know because we have compiled and then run the SETI executable under the control of a "profiling" program. After crunching an entire WU, we know that aa% of the time was spent within function abc, bb% of the time was spent within function xyz, and so on for all functions in the program. The ones that use the most time get the most of our optimization thinking and programming attempts.
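
To make the idea concrete, here is a small sketch of turning per-function timings into the percentage shares a profiler reports, sorted so the most expensive functions come first. The struct, the function name, and any numbers fed into it are invented for illustration and are not taken from an actual profile of the SETI client.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

struct ProfileEntry {
    std::string function;  // function name as reported by the profiler
    double seconds;        // time attributed to that function
    double percent;        // share of total run time, filled in by rank_hotspots
};

// Computes each function's share of the total time and sorts hottest first.
std::vector<ProfileEntry> rank_hotspots(std::vector<ProfileEntry> entries) {
    double total = 0.0;
    for (const ProfileEntry& e : entries) {
        assert(e.seconds >= 0.0);
        total += e.seconds;
    }
    for (ProfileEntry& e : entries) {
        e.percent = total > 0.0 ? 100.0 * e.seconds / total : 0.0;
    }
    std::sort(entries.begin(), entries.end(),
              [](const ProfileEntry& a, const ProfileEntry& b) {
                  return a.seconds > b.seconds;
              });
    return entries;
}
```

The entries at the front of the returned list are the ones where optimization effort is most likely to pay off.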
|
_heinz
|
The first news:
I now use an enhanced timer that counts in timer ticks, with some test code to test the new function CalcAng. The function writes cyclically into a 1000-element double vector, and this runs in a loop of 10,000, so the function is called 10 million times. I was surprised by the result. I tried this with 2 slightly different functions; here you see the result:

-------------------
Timer Frequency in:
Hz = 3579545
MHz = 3.57955
GHz = 0.00358
Start Time = 743223057648 Ticks
Stop Time = 743224081856 Ticks
Duration in Ticks = 1024208
Duration in seconds = 0.2861279855401
--------------------------------------
Timer Frequency in:
Hz = 3579545
MHz = 3.57955
GHz = 0.00358
Start Time = 743224082065 Ticks
Stop Time = 743225105999 Ticks
Duration in Ticks = 1023934
Duration in seconds = 0.2860514394986
--------------------------------------
P1 = 1024208
P2 = 1023934
dif = 274
Result: P2 is faster than P1

The second news:
I have set up MS Visual Studio 2005 Express, updated with the Windows Server 2003 Platform SDK, and am using this environment to compile the SETI sources. I will now go on with further optimization of the sources.
seti_britta
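
The measurement described in this post can be sketched like this. The post does not show its test code, so everything here is an assumption: `ticks_to_seconds` and `time_calls` are illustrative names, and `std::chrono::steady_clock` stands in for whatever Windows timer produced the tick counts. The conversion itself is just ticks divided by timer frequency, which reproduces the first duration in the log: 1024208 / 3579545 Hz is about 0.28613 s.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Converts a tick count to seconds for a timer running at frequency_hz.
double ticks_to_seconds(long long ticks, long long frequency_hz) {
    assert(frequency_hz > 0);
    return static_cast<double>(ticks) / static_cast<double>(frequency_hz);
}

// Calls fn once for every slot of `out`, repeated `rounds` times, storing each
// return value in the vector, and returns the elapsed time in seconds.
template <typename Fn>
double time_calls(Fn fn, int rounds, std::vector<double>& out) {
    const auto start = std::chrono::steady_clock::now();
    for (int r = 0; r < rounds; ++r) {
        for (std::size_t i = 0; i < out.size(); ++i) {
            out[i] = fn(static_cast<double>(i));
        }
    }
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

With a 1000-element vector and `rounds = 10000`, `fn` is called 10 million times, as in the post. The two logged runs differ by 274 ticks out of roughly a million (under 0.03%), so repeating each measurement several times would help confirm which variant is really faster.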
|
_heinz
|
- imported seti_boinc from Visual Studio 2003 to Visual Studio 2005 Express Edition
- can now compile and get object modules
- compiled analyzeFuncs

------ Build started: Project: seti_boinc, Configuration: Debug Win32 ------
Compiling...
analyzeFuncs.cpp
....some warnings
The build log was saved at "file://c:\boincstuff\kwsn-seti_boinc_1.3\seti_boinc\client\win_build\Debug\BuildLog.htm"
seti_boinc - 0 error(s), 13 warning(s)
========== Build: 1 succeeded, 0 failed, 0 up-to-date, 0 skipped ==========

Now I can check all my changes for compiler errors.

@Simon: so far I have not installed IPP and MKL, but once I do, it should be possible to compile an optimized client. I hope I did not forget anything. Simon, what do you think about it?
|
Simon
|
Hi Britta, for optimal results, you should use ICC and IPP. MKL is not necessary unless you modified the sources to use the FFTW wrapper that MKL provides. Go for it. Regards, Simon.
|
Gecko_R7
|
Hi Simon,
Are you planning to play w/ and compare the new MKL 9.1 Beta? Do you think it has caught up with or surpassed the speed of IPP?
|
Josef W. Segur
|
It would be interesting to know if they're products of separate teams within Intel which compete, or basically the same code under the hood with a different interface and focus. My assumption has been the latter, in which case whichever one has the most recent release should be "better" in some sense. But note that "better" does not always mean "faster".

Joe
|
Gecko_R7
|
I think I may have something close to an apples-to-apples comparison of IPP vs. MKL 9.0.
XEON 3.0 w/ IPP vs. MKL 8.1 in the first graph. XEON 3.0 w/ new MKL 9.0 in the second.
Looks pretty close, w/ the new MKL 9.0 being slightly quicker than IPP in the 16K to 132K range... if this is truly a level comparison.
At 16K, IPP = 12.5 Gflops vs. about 13.5 Gflops for MKL 9.0, or about 8% quicker.
At 132K, IPP = 11.5 Gflops vs. about 12.25 Gflops for MKL 9.0, or about 6% quicker.
I'd assume there are "other" improvements in 9.x w/ further optimization relevance as well? Would the added trigonometric and other complex data support in the 9.0 VML also be worth a closer look?
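
The percentages in this post follow directly from the Gflops figures read off the graphs. A minimal sketch of the arithmetic, with `percent_faster` being a hypothetical helper name:

```cpp
#include <cassert>

// Relative gain of `candidate` over `baseline`, expressed in percent.
double percent_faster(double baseline_gflops, double candidate_gflops) {
    assert(baseline_gflops > 0.0);
    return 100.0 * (candidate_gflops - baseline_gflops) / baseline_gflops;
}
```

For the 16K point, `percent_faster(12.5, 13.5)` gives 8%; for the 132K point, `percent_faster(11.5, 12.25)` gives roughly 6.5%, in line with the rounded figures quoted.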
« Last Edit: 01 Apr 2007, 03:18:26 pm by Gecko_R7 »
|
msattler
|
Does this mean I might have some newly compiled apps to test soon?
|
Crunch3r
|
Hi, MKL 9.0 is way faster than 8.0 and is equal to or, in some cases depending on the AR, faster than IPP. Some weeks ago I built an app from the stock source and compared it to an old 5.12, and it was faster. Regarding the trigonometric stuff, IMHO it is worth looking into! But it depends on Ben and Joe whether they would like to have a closer look at it.
« Last Edit: 01 Apr 2007, 02:31:21 pm by Crunch3r »