Category Archives: algorithm

Why is Quicksort that quick?

Quicksort has a reputation for speed. In school, you learn that it is O(n log n) … the optimum for all single-threaded, comparison-based sorts. In fact, when combined with median-of-three pivot selection, tiny-partition insertion-sort, and special equal-keys handling (qv Top … Continue reading

Posted in algorithm, sort, Uncategorized | Leave a comment

Bug fix for ACISM

Several people have used the Aho-Corasick interleaved state-transition matrix code, only to find it didn’t match all strings (!) The bug was in the machine construction: the “prune backlinks” step in acism_create is (as-is) broken. I commented it out, and … Continue reading

Posted in aho-corasick, algorithm, string search | Leave a comment

SSE2 bit matrix transpose special case … 8 x 256 … for Marek

It turns out that Marek’s Idea of the Day: Bitsliced SipHash used my SSE2 bit-matrix transpose┬ároutine, but it wasn’t fast enough. This is normally the case for SSE2: the more specific the problem, the better the code can be. I … Continue reading

Posted in algorithm, bit, bit shift, SSE2 | Tagged , , | Leave a comment

SSE2 and BNDM string search

For the past few weeks, I’ve been testing and experimenting with the Railgun string search function written by Sanmayce. Railgun is really a “memmem” function, where the target length is known in advance; and the cost of compiling the pattern … Continue reading

Posted in algorithm, SSE2, string search | Tagged , , , , , , | 8 Comments

Update on bitonic SSE2 sort of 16 doubles

For the complete source code for both sorting and ranking functions using SSE2, check out ssesort.c in this github repo I originally used asm to generate the bitonic sorter. After doing a little more testing, I found that gcc 4.4 … Continue reading

Posted in algorithm | Tagged , , , | 5 Comments

Okay, one more poke at SSE2: sorting doubles

Follow-up: source code is on github: ssesort.c. Old-school (pre-CUDA) non-graphic programming of GPU’s dusted off a bunch of classic algorithms that did little or no branching, and no data sharing between processors, but allowed massive parallelism. One of those algorithms … Continue reading

Posted in algorithm | Tagged , | 1 Comment

Radix sort

About a year ago, I decided to take a serious look at radix sort — one of those algorithms that everyone covers in school and then forgets, because everyone knows that it only works for some narrow special cases with … Continue reading

Posted in algorithm, Uncategorized | Tagged , , | Leave a comment