Category Archives: ffs

“Unusual uses of SSE2” posted to github

In this month’s frenzy of putting source code out there in a usable form, I’ve posted source to github for the SSE2 implementations of string search, BNDM search, sorting [16] doubles, and bit-matrix transpose; plus some convenience tools for SSE2. … Continue reading

Posted in bit, bit matrix transpose, bit shift, ffs, SSE2, string search, Uncategorized | 2 Comments

SSE2 beats SSE4.2 in memcmp?

At the moment I haven’t any box where I can test the latest GCC compilers and SSE4.2 support (pcmpestri etc.). So far, the following beats gcc 4.4 with -march=corei7 -msse4.2 (okay, perhaps that’s redundant :-). But gcc generates “repz cmpsb” … Continue reading

Posted in ffs, SSE2, SSE4.2, string search | 2 Comments

The Generic SSE2 Loop

In response to a couple of comments on my post about find-first-bit-set in SSE2 registers, amounting to “what use is a routine that only does 16-byte bitvecs”, I thought I’d post the canonic, generic loop through memory using SSE2 ops. … Continue reading

Posted in ffs, SSE2, Uncategorized | 5 Comments