Tuts 4 You
leftrotate in SSE

CodeExplorer

Here are two different ways of doing a left rotate with SSE2 intrinsics:

Quote

#define ROTATE_LEFT_SSE(x, n) do { __m128i tmp = _mm_srli_epi32((x), 32 - (n)); (x) = _mm_slli_epi32((x), (n)); (x) = _mm_or_si128((x), tmp); } while (0)

versus:

Quote

#define ROTATE_LEFT_SSE(x, n) do { (x) = _mm_or_si128(_mm_slli_epi32((x), (n)), _mm_srli_epi32((x), 32 - (n))); } while (0)

Which one would you choose? The second one, without the temporary variable?
The speed difference seems to be small, and it may just be an error in my timing measurement.
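
Before comparing speed, it may help to confirm that both forms compute the same thing. The following is a minimal sketch, not a definitive implementation: it rewrites the two macros as inline functions and checks them against a scalar rotate. The names `rotl_epi32_tmp`, `rotl_epi32_expr`, `rotl32` and `rotl_variants_agree` are hypothetical and do not come from the original post; the rotate count is assumed to be in the range 1..31.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Variant 1: explicit temporary, mirroring the first macro. */
static inline __m128i rotl_epi32_tmp(__m128i x, int n)
{
    __m128i tmp = _mm_srli_epi32(x, 32 - n);
    x = _mm_slli_epi32(x, n);
    return _mm_or_si128(x, tmp);
}

/* Variant 2: single expression, mirroring the second macro. */
static inline __m128i rotl_epi32_expr(__m128i x, int n)
{
    return _mm_or_si128(_mm_slli_epi32(x, n), _mm_srli_epi32(x, 32 - n));
}

/* Scalar reference rotate; assumes 1 <= n <= 31. */
static inline uint32_t rotl32(uint32_t v, int n)
{
    return (v << n) | (v >> (32 - n));
}

/* Returns 1 if both SSE variants match the scalar rotate on all four lanes. */
static int rotl_variants_agree(const uint32_t in[4], int n)
{
    assert(n >= 1 && n <= 31);  /* assumption of this sketch */

    __m128i x = _mm_loadu_si128((const __m128i *)in);
    uint32_t a[4], b[4];
    _mm_storeu_si128((__m128i *)a, rotl_epi32_tmp(x, n));
    _mm_storeu_si128((__m128i *)b, rotl_epi32_expr(x, n));

    for (int i = 0; i < 4; i++) {
        if (a[i] != b[i] || a[i] != rotl32(in[i], n))
            return 0;
    }
    return 1;
}
```

Calling `rotl_variants_agree` on a few test vectors and rotate counts should return 1 for each. Inline functions also avoid the usual macro pitfalls, such as an argument being evaluated more than once.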
 

atom0s

Depending on compiler optimizations, the 2nd one would probably be the slightly faster of the two. Assuming the compiler doesn't introduce temporaries of its own, it should technically take fewer clock cycles than the first. However, with optimizations enabled, the compiler may well turn the first one into the same code as the 2nd.

At this point you're basically weighing readability against a minimal optimization. If your goal is speed, test both, see which yields the better results, and go with that. But with modern compilers I'd assume both will perform very similarly when optimizations are turned on.
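
A rough way to test both is to time a long chain of dependent rotations for each variant. The sketch below is only an illustration under stated assumptions: the `_A`/`_B` macro suffixes, the `time_variant_a`/`time_variant_b` functions, the rotate count of 7 and the added constant are all hypothetical choices, and `clock()` is a coarse timer, so small differences should be treated as noise.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>
#include <time.h>

/* The two variants from the first post; the _A/_B suffixes are hypothetical. */
#define ROTATE_LEFT_SSE_A(x, n) do { \
        __m128i tmp_ = _mm_srli_epi32((x), 32 - (n)); \
        (x) = _mm_slli_epi32((x), (n)); \
        (x) = _mm_or_si128((x), tmp_); \
    } while (0)

#define ROTATE_LEFT_SSE_B(x, n) do { \
        (x) = _mm_or_si128(_mm_slli_epi32((x), (n)), \
                           _mm_srli_epi32((x), 32 - (n))); \
    } while (0)

/* Runs `iters` dependent rotations with variant A and returns elapsed CPU
 * seconds. Lane 0 of the result is written to *out so the loop has an
 * observable effect and is not removed as dead code. */
static double time_variant_a(uint32_t seed, long iters, uint32_t *out)
{
    assert(iters > 0);
    __m128i x = _mm_set1_epi32((int)seed);
    const __m128i one = _mm_set1_epi32(1);
    clock_t t0 = clock();
    for (long i = 0; i < iters; i++) {
        ROTATE_LEFT_SSE_A(x, 7);
        x = _mm_add_epi32(x, one);  /* mix in an add between rotates */
    }
    clock_t t1 = clock();
    *out = (uint32_t)_mm_cvtsi128_si32(x);
    return (double)(t1 - t0) / CLOCKS_PER_SEC;
}

/* Same loop using variant B. */
static double time_variant_b(uint32_t seed, long iters, uint32_t *out)
{
    assert(iters > 0);
    __m128i x = _mm_set1_epi32((int)seed);
    const __m128i one = _mm_set1_epi32(1);
    clock_t t0 = clock();
    for (long i = 0; i < iters; i++) {
        ROTATE_LEFT_SSE_B(x, 7);
        x = _mm_add_epi32(x, one);  /* mix in an add between rotates */
    }
    clock_t t1 = clock();
    *out = (uint32_t)_mm_cvtsi128_si32(x);
    return (double)(t1 - t0) / CLOCKS_PER_SEC;
}
```

Run each function several times with a large `iters` value and compare the best times; the two `*out` values should be identical for the same seed and iteration count. Comparing the generated assembly of an optimized build is another way to see whether the compiler emits the same instructions for both.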
