Topic: Bug report science function (Read 781 times)
|
|
nanobyte
|
I am under the impression that the compiler has generated code that effectively ignores results in the chirp function.
Version: Windows optimized S@H Enhanced application by Alex Kan. Version info: SSSE3x (Intel, Core 2-optimized v8-nographics) V5.13 by Alex Kan, SSSE3x Win32 Build 41, ported by Jason G, Raistmer, JDWhale.
I used the PE Explorer disassembler and Intel VTune to analyze the assembly. VTune names this routine 'v_vChirpData'.
When disassembled, the function at 401b0c has a section where the following instruction sequence is used.
At address 401c77:
    movaps xmm0,xmm4
    addpd  xmm0,xmm3
    subpd  xmm0,xmm4
    subpd  xmm3,xmm0
In this sequence, xmm3 is always zero.
Based on a similar sequence, several other registers are eventually zeroed as well. This undermines the logic in the entire routine.
Could you please verify this against the source code?
nanobyte.
|
Jason G
|
Hi there,

That stretch of code is in the intrinsics portion of the function and is part of the code that just grabs the 'sign bit', so the result should be +/- zero. I believe this is a necessary and intentional part of the rounding needed for the chirp's angle reduction to within the range -0.5 to 0.5. This is one of the few places in the code where extreme precision is required (so double precision is used), and a 'bug' as such would probably manifest with horrible consequences.

If I am correct, then what the compiler has done there is replicate the source faithfully (from the intrinsics); sequences of adds and subs are generally faster in critical loops than other possible methods.

I hope that helps.
Thanks, Jason
[See Alex's much better answer]
|
« Last Edit: 11 Jul 2008, 01:31:52 am by Jason G »
|
Alex Kan
|
Quote from: nanobyte
When disassembled, the function at 401b0c has a section where the following instruction sequence is used.
At address 401c77:
    movaps xmm0,xmm4
    addpd  xmm0,xmm3
    subpd  xmm0,xmm4
    subpd  xmm3,xmm0
In this sequence, xmm3 is always zero.
Based on a similar sequence, several other registers are eventually zeroed as well. This undermines the logic in the entire routine.
Could you please verify this against the source code?

This is the intended behavior. Your observation would be true if all floating-point operations were carried out with infinite precision. However, since each operation rounds its result to a fixed precision, floating-point addition and subtraction are not actually associative.

Despite its outward appearance, that instruction sequence does not set xmm3 to zero; it generates the fractional part of the value originally contained in xmm3. Specifically, the first three instructions round xmm3 to the nearest integer value using magic numbers (chosen to push all the fractional bits off the end of the mantissa) and leave the rounded value in xmm0. Subtracting xmm0 from xmm3 then yields the fractional part.

Overzealous compiler optimization based on arithmetic associativity breaks techniques that rely on floating-point rounding behavior, such as Kahan summation and the code above. Fortunately, the Intel compiler has not done anything of the sort here.
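
For anyone who wants to try the effect described above, here is a minimal Python sketch of one lane of that four-instruction sequence. It is an illustration only, not the application's source. Python floats are IEEE 754 doubles, so each `+` and `-` rounds the same way as one lane of addpd/subpd. The constant 1.5 * 2**52 is an assumption (the thread does not say what xmm4 actually holds), and the function names are made up for this example.

```python
# Assumed magic constant: 1.5 * 2**52. The thread does not state the value
# held in xmm4; this is a common choice for doubles with |x| < 2**51.
MAGIC = 1.5 * 2.0**52


def round_via_magic(x):
    """Round x to the nearest integer using only an add and a subtract."""
    t = MAGIC + x      # addpd xmm0,xmm3: fractional bits of x are rounded away
    return t - MAGIC   # subpd xmm0,xmm4: exact, leaves the rounded integer


def signed_fraction(x):
    """x minus its nearest integer, so the result lies in [-0.5, 0.5]."""
    return x - round_via_magic(x)   # subpd xmm3,xmm0


def algebraic_view(x):
    """The same expression after reassociation, x - ((MAGIC - MAGIC) + x)."""
    return x - ((MAGIC - MAGIC) + x)


if __name__ == "__main__":
    for x in (2.75, 10.25, -1.25):
        print(x, round_via_magic(x), signed_fraction(x), algebraic_view(x))
```

With these inputs the signed fraction comes out as -0.25, 0.25 and -0.25, while the reassociated form gives 0.0 every time, which is the result one would expect from reading the assembly as exact arithmetic.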
|
« Last Edit: 11 Jul 2008, 02:40:41 am by Alex Kan »
|
nanobyte
|
Clear explanation, thank you very much. I am relieved that this behaviour was given thought. You did an excellent job on the code.
best regards, nanobyte