For some reason, this blog gets a ridiculous number of Google hits for the string “SSE2 bit shift”. I didn’t post verbatim code for this, just cpp hints on working around the silly requirement that counts for the SSE2 shift operators be constant arguments. I’m really curious whether people are coming here and not finding what they need. Drop me a comment if that was you.

And since it’s Christmas, here’s the Perl code that generates the C code for a left shift with a variable bit count. The fastest code has a 129-way switch statement (cases 0 through 127, plus a default for anything larger). The Perl code generates sequences of 1, 2, 5 or 6 register instructions, depending on the shift count. The following are the kinds of cases it generates:

case 0: break;
case 1: x = _mm_or_si128(_mm_slli_epi64(x, 1),
_mm_srli_epi64(_mm_slli_si128(x, 8), 64-1));
break;
...
case 8: x = _mm_slli_si128(x, 1); break; // multiples of 8
case 9: x = _mm_or_si128(_mm_slli_epi64(_mm_slli_si128(x, 9/8), 9%8),
_mm_srli_epi64(_mm_slli_si128(x, 8+9/8), 64-9%8));
break;
...
case 65: x = _mm_slli_epi64(_mm_slli_si128(x, 65/8), 65%8); break;
...
default: x = _mm_setzero_si128(); break;
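
To see why the longer cases need two terms, here is a minimal sketch of the `case 9` expression taken apart and checked against plain 64-bit arithmetic. The helper names `shl128_by_9` and `shl9_matches_scalar` are made up for this illustration and are not part of the generated code; the sketch assumes an x86 compiler with SSE2 enabled.

```c
#include <assert.h>     /* assert(), for callers that want to check the result */
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Hypothetical helper: the "case 9" expression, spelled out step by step. */
static __m128i shl128_by_9(__m128i x)
{
    /* Shift the whole register left by 9/8 = 1 byte, then each 64-bit lane
       by 9%8 = 1 bit. The lane shift drops the bit that should carry from
       the low lane into the high lane. */
    __m128i shifted = _mm_slli_epi64(_mm_slli_si128(x, 9 / 8), 9 % 8);

    /* Recover that bit: the byte shift by 8 + 9/8 moves the low lane into
       the high lane (already shifted by one byte), and the right shift by
       64 - 9%8 leaves only the carried bit at the bottom of the high lane. */
    __m128i carry = _mm_srli_epi64(_mm_slli_si128(x, 8 + 9 / 8), 64 - 9 % 8);

    return _mm_or_si128(shifted, carry);
}

/* Hypothetical check: compare against plain 64-bit arithmetic.
   Returns 1 when both halves agree. */
static int shl9_matches_scalar(uint64_t hi, uint64_t lo)
{
    uint64_t out[2];  /* out[0] = low half, out[1] = high half */
    __m128i x = _mm_set_epi64x((long long)hi, (long long)lo);

    _mm_storeu_si128((__m128i *)out, shl128_by_9(x));
    return out[0] == (lo << 9) && out[1] == ((hi << 9) | (lo >> 55));
}
```

Calling `shl9_matches_scalar(0, 0x0080000000000000ULL)` exercises exactly the bit that crosses from the low half into the high half, which is the bit the second term exists to carry.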

… and the Perl code is:

print <<'__HEAD';
#include <emmintrin.h>
#define shl _mm_slli_epi64
#define shr _mm_srli_epi64
#define SHL _mm_slli_si128
#define C1(n) x = SHL(x, n/8)
#define C2(n) x = shl(SHL(x, n/8), n%8)
#define C5(n) x = _mm_or_si128(shl(x, n%8), shr(SHL(x, 8), 64-n%8))
#define C6(n) x = _mm_or_si128(shl(SHL(x, n/8), n%8), shr(SHL(x, 8+n/8), 64-n%8))
__m128i xm_shl(__m128i x, unsigned nbits)
{
switch (nbits) {
case 0: break;
__HEAD
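# Pick a macro per shift count: C5 for 1..7 (bit shift plus carry),
# C1 for multiples of 8 (byte shift only), C6 for other counts below 64
# (byte shift, bit shift, plus carry), C2 for other counts above 64
# (the low half lands entirely in the high half, so no carry term).
print "\tcase $_: C".($_<8 ? 5 : $_%8 ? $_<64 ? 6 : 2 : 1)."($_); break;\n"
for 1..127;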
print <<'__FOOT';
default: x = _mm_setzero_si128();
}
return x;
}
__FOOT
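
If you want to sanity-check the generated function, something like the following sketch might help. It compares a shift function against a scalar reference for every count from 0 to 129. Everything here (`shl_fn`, `ref_shl128`, `count_mismatches`, `lanes_only_shl`) is a hypothetical name invented for this example; only the `xm_shl(x, nbits)` signature comes from the generated code. `lanes_only_shl` is a deliberately wrong stand-in that shifts each 64-bit lane separately, included to show that the harness notices missing carries.

```c
#include <assert.h>     /* assert(), for callers that want to check the result */
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Same signature as the generated xm_shl(x, nbits). */
typedef __m128i (*shl_fn)(__m128i, unsigned);

/* Hypothetical scalar reference. v[0] is the low 64 bits, v[1] the high. */
static void ref_shl128(uint64_t v[2], unsigned nbits)
{
    if (nbits == 0)
        return;
    if (nbits >= 128) {
        v[0] = v[1] = 0;
    } else if (nbits >= 64) {
        v[1] = v[0] << (nbits - 64);
        v[0] = 0;
    } else {
        v[1] = (v[1] << nbits) | (v[0] >> (64 - nbits));
        v[0] <<= nbits;
    }
}

/* Hypothetical harness: run fn for every count from 0 to 129 and return
   how many results differ from the scalar reference. */
static unsigned count_mismatches(shl_fn fn, uint64_t hi, uint64_t lo)
{
    unsigned nbits, bad = 0;

    for (nbits = 0; nbits <= 129; nbits++) {
        uint64_t want[2], got[2];
        __m128i x = _mm_set_epi64x((long long)hi, (long long)lo);

        want[0] = lo;
        want[1] = hi;
        ref_shl128(want, nbits);
        _mm_storeu_si128((__m128i *)got, fn(x, nbits));
        bad += (got[0] != want[0] || got[1] != want[1]);
    }
    return bad;
}

/* Deliberately wrong stand-in: shifts each 64-bit lane on its own, so no
   bits ever cross from the low lane into the high lane. */
static __m128i lanes_only_shl(__m128i x, unsigned nbits)
{
    return _mm_sll_epi64(x, _mm_cvtsi32_si128((int)nbits));
}
```

With the generated file compiled in, `count_mismatches(xm_shl, hi, lo)` should come back as 0 for any input, while `count_mismatches(lanes_only_shl, 0, 1)` comes back nonzero because the lone low bit never reaches the high half.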

Good luck, and Happy New Year’s!

## About mischasan

I've had the privilege to work in a field where abstract thinking has concrete value. That applies at the macro level --- optimizing actions on terabyte databases --- or the micro level --- fast parallel string searches in memory. You can find my documents on production-system radix sort (NOT just for academics!) and some neat little tricks for developers, on my blog https://mischasan.wordpress.com
My e-mail sig (since 1976):
Engineers think equations approximate reality.
Physicists think reality approximates the equations.
Mathematicians never make the connection.