Tag Archives: SSE

The “C” preprocessor: not as cryptic as you’d think

The C preprocessor is a modest macro-expansion language (check out “m4” if you want to see an immodest one). Basic symbols and function-macros are convenient for giving meaningful names to constants and tiny function calls, with the rewarding feeling that …
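As a small illustration of "meaningful names for constants and tiny function calls", here is a minimal sketch. The names `BLOCK_BYTES`, `ROUND_UP`, `NBLOCKS` and `blocks_needed` are hypothetical, invented for this example; they do not come from the post.

```c
#include <stddef.h>

/* Hypothetical names, for illustration only. */
#define BLOCK_BYTES 16                                  /* a named constant */
#define ROUND_UP(n, m) ((((n) + (m) - 1) / (m)) * (m))  /* a function-like macro */
#define NBLOCKS(len) (ROUND_UP((len), BLOCK_BYTES) / BLOCK_BYTES)

/* Number of 16-byte blocks needed to hold len bytes. */
static size_t blocks_needed(size_t len)
{
    return NBLOCKS(len);
}
```

Every macro argument is parenthesized in the expansion, so a call such as `ROUND_UP(a + b, 16)` still groups the way a function call would.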

Posted in bit shift, preprocessor | 6 Comments

SSE2 bit trick: ffs/fls for XMM registers

For the full “C” code that uses this idea for an arbitrary-length byte vector, see this later blog post. In a discussion about all the wonderful uses of the combination movemask(pcmpxx(a,b)), it occurred to me that this gives you a …
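The excerpt is cut off before it states the trick, so what follows is only one plausible reading of the title: a sketch that uses movemask(pcmpeqb(x, 0)) to locate the first or last nonzero byte of an XMM register, then finishes with a bit scan inside that byte. The function names are hypothetical, and `__builtin_ctz` / `__builtin_clz` are GCC/Clang builtins (an assumption about the compiler), not part of SSE2.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Bit i of the result is set when byte i of x is nonzero. */
static unsigned nonzero_bytes(__m128i x)
{
    unsigned zeros =
        (unsigned)_mm_movemask_epi8(_mm_cmpeq_epi8(x, _mm_setzero_si128()));
    return ~zeros & 0xFFFFu;
}

/* 1-based index of the lowest set bit in the 128-bit value; 0 if none. */
static int xmm_ffs(__m128i x)
{
    unsigned mask = nonzero_bytes(x);
    uint8_t b[16];
    int i;

    if (mask == 0)
        return 0;
    _mm_storeu_si128((__m128i *)b, x);   /* byte 0 is the least significant */
    i = __builtin_ctz(mask);             /* lowest nonzero byte */
    return 8 * i + __builtin_ctz(b[i]) + 1;
}

/* 1-based index of the highest set bit in the 128-bit value; 0 if none. */
static int xmm_fls(__m128i x)
{
    unsigned mask = nonzero_bytes(x);
    uint8_t b[16];
    int i;

    if (mask == 0)
        return 0;
    _mm_storeu_si128((__m128i *)b, x);
    i = 31 - __builtin_clz(mask);        /* highest nonzero byte */
    return 8 * i + (31 - __builtin_clz(b[i])) + 1;
}
```

Bit numbering here follows the `ffs` convention (the least significant bit is 1), treating the register as a single little-endian 128-bit integer.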

Posted in Uncategorized

Convergence: SSE2 and strstr

The original improved strstr routine split the problem up based on the pattern length: 2, 3 and 4+ bytes were separate cases. How about reimplementing the 2- and 3-byte cases using SSE2 functions? The main change is to compare each …
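The excerpt stops before saying what gets compared, so the following is a guess at the general shape of a 2-byte case, not the post's routine: compare each 16-byte block against both pattern bytes, shift the first-byte mask up by one position, and AND it with the second-byte mask. The name `sse2_strstr2` is hypothetical, the pattern is assumed to be exactly two non-null bytes, and `__builtin_ctz` assumes GCC or Clang. The aligned loads read a few bytes outside the string within the same 16-byte block, which works on x86 in practice but is outside what the C standard guarantees.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stddef.h>
#include <stdint.h>

/* Find the 2-byte pattern pat in the null-terminated string s (sketch). */
static const char *sse2_strstr2(const char *s, const char *pat)
{
    const __m128i zero = _mm_setzero_si128();
    const __m128i p0 = _mm_set1_epi8(pat[0]);
    const __m128i p1 = _mm_set1_epi8(pat[1]);
    unsigned skip = (unsigned)((uintptr_t)s & 15);
    const char *p = s - skip;                    /* 16-byte aligned block start */
    unsigned valid = (0xFFFFu << skip) & 0xFFFFu; /* ignore bytes before s */
    unsigned carry = 0;   /* 1 if the previous block ended with pat[0] */

    for (;;) {
        __m128i v = _mm_load_si128((const __m128i *)p);
        unsigned m0 = (unsigned)_mm_movemask_epi8(_mm_cmpeq_epi8(v, p0)) & valid;
        unsigned m1 = (unsigned)_mm_movemask_epi8(_mm_cmpeq_epi8(v, p1)) & valid;
        unsigned mz = (unsigned)_mm_movemask_epi8(_mm_cmpeq_epi8(v, zero)) & valid;
        /* Bit i set: byte i-1 equals pat[0] and byte i equals pat[1]. */
        unsigned hit = ((m0 << 1) | carry) & m1;

        if (mz)
            hit &= (mz & (0u - mz)) - 1u;  /* keep only bits before the terminator */
        if (hit)
            return p + __builtin_ctz(hit) - 1;
        if (mz)
            return NULL;
        carry = (m0 >> 15) & 1u;
        p += 16;
        valid = 0xFFFFu;
    }
}
```

A 3-byte case could follow the same pattern with a third mask and a two-bit carry, though that extension is not shown here.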

Posted in algorithm | 9 Comments

What the !@# is SSE2 good for: char search in long strings

You don’t need SSE4.2 to do some neat string operations with XMM registers. Case in point: using 16-byte parallelism to search for a character in a null-terminated string (aka strchr). Smart implementations of strchr don’t simply test each byte …
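A minimal sketch of the 16-bytes-at-a-time idea, under stated assumptions: this is not the post's code, the name `sse2_strchr` is hypothetical, and `__builtin_ctz` assumes GCC or Clang. It loads aligned 16-byte blocks, so it may read a few bytes before the start and past the terminator within the same block; that is a common trick on x86 but outside what the C standard guarantees.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stddef.h>
#include <stdint.h>

/* Sketch of strchr using 16-byte compares. */
static const char *sse2_strchr(const char *s, int c)
{
    const __m128i zero = _mm_setzero_si128();
    const __m128i want = _mm_set1_epi8((char)c);
    unsigned skip = (unsigned)((uintptr_t)s & 15);
    const char *p = s - skip;                 /* 16-byte aligned block start */
    __m128i v = _mm_load_si128((const __m128i *)p);
    /* Bit i set: byte i is either the wanted character or the terminator. */
    unsigned hit = (unsigned)_mm_movemask_epi8(
        _mm_or_si128(_mm_cmpeq_epi8(v, want), _mm_cmpeq_epi8(v, zero)));

    hit &= 0xFFFFu << skip;                   /* ignore bytes before s */
    while (hit == 0) {
        p += 16;
        v = _mm_load_si128((const __m128i *)p);
        hit = (unsigned)_mm_movemask_epi8(
            _mm_or_si128(_mm_cmpeq_epi8(v, want), _mm_cmpeq_epi8(v, zero)));
    }
    p += __builtin_ctz(hit);                  /* first interesting byte */
    return (*p == (char)c) ? p : NULL;        /* match, or terminator reached */
}
```

Like the standard `strchr`, searching for `'\0'` returns a pointer to the terminator.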

Posted in Uncategorized | 10 Comments