Tuts 4 You

Speed optimization question comparing



And that behavior is perfectly normal and expected. You really should not try to be smarter than the compiler. You aren't. 99% of programmers aren't. So just let the compiler do its job.

There is so much more to optimize in your code - in the logic and algorithm. It's much wiser to spend your time on that.

21 hours ago, kao said:

There is so much more to optimize in your code - in the logic and algorithm.

Noticed it.
For example:


    dif1 = Ban[1]-Ban[0];
    dif1_div = rightrotate(dif1, s[0]);
    F1 = (Cn xor (Bn or (not Dn)))+An;
    Mn[0] = dif1_div-F1;

    lr_val1_bak = leftrotate(F1+Mn[0], s[0]);
    Ban1Test = Ban[0]+lr_val1_bak;

    if (Ban[1]!=Ban1Test)
        //printf("Invalid Ban[1] value!!!!\n");
        goto StartOfSearch;

    lr_val1_bak = leftrotate(F1+Mn[0], s[0]);
    // anyway Mn[0] = dif1_div-F1, so the F1 terms cancel:
    lr_val1_bak = leftrotate(F1+dif1_div-F1, s[0]);

    lr_val1_bak = leftrotate(dif1_div, s[0]);

    dif1 = Ban[1]-Ban[0];
    dif1_div = rightrotate(dif1, s[0]);

    lr_val1_bak = leftrotate(rightrotate(dif1, s[0]), s[0]);
    // this simplifies once again, because the two rotations cancel:
    lr_val1_bak = dif1 = Ban[1]-Ban[0];

    // While the final test:
    Ban1Test = Ban[0]+lr_val1_bak = Ban[0]+Ban[1]-Ban[0];
    Ban1Test = Ban[1];

    // The only conclusion is that it was a useless check!
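The derivation above can be checked mechanically. Here is a minimal sketch, assuming all values are 32-bit unsigned words and that leftrotate/rightrotate are plain bit rotations (the post does not show their definitions). The name ban1_check_passes and the lowercase parameter names are made up for illustration.

```c
#include <stdint.h>

/* Rotate a 32-bit word left by s bits; masking keeps s = 0 well defined. */
static uint32_t leftrotate(uint32_t x, unsigned s)
{
    s &= 31u;
    return (x << s) | (x >> ((32u - s) & 31u));
}

/* Rotate a 32-bit word right by s bits. */
static uint32_t rightrotate(uint32_t x, unsigned s)
{
    s &= 31u;
    return (x >> s) | (x << ((32u - s) & 31u));
}

/* Recomputes Ban1Test the same way as the posted code and returns 1 when it
   equals ban1. an, bn, cn, dn stand in for An, Bn, Cn, Dn (hypothetical names). */
static int ban1_check_passes(uint32_t ban0, uint32_t ban1,
                             uint32_t an, uint32_t bn,
                             uint32_t cn, uint32_t dn, unsigned s0)
{
    uint32_t dif1     = ban1 - ban0;
    uint32_t dif1_div = rightrotate(dif1, s0);
    uint32_t f1       = (cn ^ (bn | ~dn)) + an;
    uint32_t m0       = dif1_div - f1;            /* Mn[0] */
    uint32_t ban1test = ban0 + leftrotate(f1 + m0, s0);
    return ban1test == ban1;
}
```

Because unsigned arithmetic wraps modulo 2^32, f1 + m0 is always dif1_div again, so the function returns 1 for every input and the goto can never be taken.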




The only solution was to patch the compiler:


Target: Pelles C for Windows Version 8.00.60 x86
Symptom: the register keyword is ignored whenever any optimisation option other than None is enabled
Location: PellesC\Bin\pocc.exe
process arguments:
-std:C99 -Tx86-coff -Zi -Ot -Ob1 -fp:fast -W1 -Gd -Ze "D:\MD5PrimeBrute\simple.c" -Fo"D:\MD5PrimeBrute\output\main.obj"

004D6EB4   .  68 D05A5D00                   PUSH 5D5AD0                              ;  UNICODE "Os"
004D6EB9   .  57                            PUSH EDI
004D6EBA   .  E8 B1F50700                   CALL 00556470                            ;  pocc.00556470
004D6EBF   .  83C4 08                       ADD ESP,8
004D6EC2   .  85C0                          TEST EAX,EAX
004D6EC4   .  74 12                         JE SHORT 004D6ED8                        ;  pocc.004D6ED8

004D6ED8   > \C705 D8F65F00 01000000        MOV DWORD PTR DS:[5FF6D8],1

004D6EE7   > \68 C45A5D00                   PUSH 5D5AC4                              ;  UNICODE "Ot"
004D6EEC   .  57                            PUSH EDI
004D6EED   .  E8 7EF50700                   CALL 00556470                            ;  pocc.00556470
004D6EF2   .  83C4 08                       ADD ESP,8
004D6EF5   .  85C0                          TEST EAX,EAX
004D6EF7   .  74 12                         JE SHORT 004D6F0B                        ;  pocc.004D6F0B

004D6F0B   > \C705 D8F65F00 02000000        MOV DWORD PTR DS:[5FF6D8],2
004D6F15   .  E9 3E030000                   JMP 004D7258                             ;  pocc.004D7258

2 = Ot
1 = Os
0 = no optimisations
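
Read as C, the two listings above look roughly like the sketch below. This is only a guess at the shape of the logic: it assumes the routine at 00556470 behaves like wcscmp (returns 0 when the strings match), which the listing suggests but does not prove. The names parse_opt_level, OPT_NONE, OPT_SIZE and OPT_SPEED are invented for illustration.

```c
#include <wchar.h>

/* Values written to the flag at 5FF6D8 according to the listing. */
enum { OPT_NONE = 0, OPT_SIZE = 1, OPT_SPEED = 2 };

/* Returns the optimisation level selected by one option string,
   or `current` unchanged when the option is neither "Os" nor "Ot". */
static int parse_opt_level(const wchar_t *arg, int current)
{
    if (wcscmp(arg, L"Os") == 0)   /* TEST EAX,EAX / JE -> store 1 */
        return OPT_SIZE;
    if (wcscmp(arg, L"Ot") == 0)   /* TEST EAX,EAX / JE -> store 2 */
        return OPT_SPEED;
    return current;
}
```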

This is the check code:
004E0290  /$  53                   PUSH EBX
004E0291  |.  8B5C24 08            MOV EBX,DWORD PTR SS:[ESP+8]
004E0295  |.  833D 38F75F00 00     CMP DWORD PTR DS:[5FF738],0
004E029C  |.  0F8F BC000000        JG 004E035E                              ;  pocc.004E035E
004E02A2  |.  837B 2C 00           CMP DWORD PTR DS:[EBX+2C],0
004E02A6  |.  74 75                JE SHORT 004E031D                        ;  pocc.004E031D
004E02A8  |.  8B03                 MOV EAX,DWORD PTR DS:[EBX]
004E02AA  |.  F640 18 08           TEST BYTE PTR DS:[EAX+18],8
004E02AE  |.  75 6D                JNZ SHORT 004E031D                       ;  pocc.004E031D

Change the byte at 004E02A6 from 74 (JE) to EB (unconditional short JMP).
CMP DWORD PTR DS:[EBX+2C],0 - compares the optimisation options with 0
004E031D is the "good boy" branch!
The 8 tested at 004E02AA means naked functions, so it doesn't matter here!

0055347D  |> \A1 7CAC6C00              MOV EAX,DWORD PTR DS:[6CAC7C]
00553482  |.  8338 00                  CMP DWORD PTR DS:[EAX],0
00553485  |.  75 0C                    JNZ SHORT 00553493                       ;  pocc.00553493
00553487  |.  C705 9CAC6C00 00000000   MOV DWORD PTR DS:[6CAC9C],0
00553491  |.  EB 09                    JMP SHORT 0055349C                       ;  pocc.0055349C

The JNZ at 00553485 should NOT be taken!
So replace its two bytes (75 0C) with two NOPs: 90 90

After these patches everything works as it should: the compiler honours the register keyword and still applies its automatic optimisations.
With registers the run time is 13-14 seconds instead of 20 seconds.
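
For anyone who wants to script the two byte patches instead of doing them by hand in a debugger, here is a minimal sketch. It works on a buffer holding the file contents and checks the original bytes before overwriting, so a different pocc.exe build is left untouched. The helper name patch_bytes is made up, and the file offsets that correspond to 004E02A6 and 00553485 are not given in this post, so the caller has to work them out from the section table.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Overwrite len bytes at offset in image, but only if the bytes currently
   there equal expected. Returns 0 on success, -1 on mismatch or bad range. */
static int patch_bytes(uint8_t *image, size_t image_size, size_t offset,
                       const uint8_t *expected, const uint8_t *replacement,
                       size_t len)
{
    if (offset > image_size || len > image_size - offset)
        return -1;                        /* patch would run past the buffer */
    if (memcmp(image + offset, expected, len) != 0)
        return -1;                        /* not the bytes shown in the listing */
    memcpy(image + offset, replacement, len);
    return 0;
}

/* The two patches from the post, as original/replacement byte pairs. */
static const uint8_t JE_75[]   = { 0x74, 0x75 };  /* 004E02A6: JE SHORT 004E031D  */
static const uint8_t JMP_75[]  = { 0xEB, 0x75 };  /* becomes   JMP SHORT 004E031D */
static const uint8_t JNZ_0C[]  = { 0x75, 0x0C };  /* 00553485: JNZ SHORT 00553493 */
static const uint8_t NOP_NOP[] = { 0x90, 0x90 };  /* becomes   NOP / NOP          */
```

A caller would read pocc.exe into memory, call patch_bytes once per patch with the right file offset, and write the buffer back only if both calls return 0.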


