First thing first: we accidentally sabotaged Data.Text. Its variant used tuples for the outer loop state instead of the data type with strict fields — quite a disadvantage. It now performs a bit better than String. Please accept our apologies for screaming that that sky was falling down.
I’ve posted a prettied-up slice of our benchmarking suite on my university personal page. It’s the main algorithm from the data types post for ByteString, UArray Int Char, String, Seq Char, and Data.Text.
We also include a ByteString version that has some further optimizations we came up with after the fact. Please try to best it! Enjoy!
SPECIALISE pragma vs copy-and-paste
Turns out the slight overhead of SPECIALISE we were seeing was most likely incidental and just happened to look systematic. We had decided to use the pragma in this code release, since it’s just so much cleaner that way. That gave Text a significant performance boost and that’s how we realized we had copied from the wrong variant when writing the Text variant. This is exactly why we were so concerned that the SPECIALISE pragma somehow had some overhead: copy-and-pasting is just begging for bugs! … so ashamed …
Here’s what we get when we run this code as released (with SPECIALISE pragmas) and also with manual specialization via copy-and-paste. Note that use of the SPECIALISE pragma shows no overhead.
| variant | average time (s) | slow down factor | standard deviation | SNR |
| with the SPECIALISE pragma | ||||
spec |
0.857 | 4.47 | 0.00238 | 360.83 |
ByteString |
0.237 | 1.23 | 0.00217 | 109.28 |
UArray Int Char |
0.345 | 1.80 | 0.00124 | 278.67 |
String |
0.505 | 2.63 | 0.00196 | 258.11 |
Seq Char |
0.841 | 4.38 | 0.00247 | 340.12 |
Text |
0.487 | 2.54 | 0.00220 | 221.25 |
optimized ByteString |
0.192 | 1.00 | 0.00082 | 234.68 |
| with manual specialization via copy-and-paste-and-annotate | ||||
spec |
0.857 | 4.46 | 0.00359 | 238.89 |
ByteString |
0.236 | 1.23 | 0.00122 | 192.89 |
UArray Int Char |
0.345 | 1.80 | 0.00129 | 266.63 |
String |
0.505 | 2.63 | 0.00184 | 274.63 |
Seq Char |
0.841 | 4.38 | 0.00240 | 350.24 |
Text |
0.487 | 2.53 | 0.00206 | 236.75 |
optimized ByteString |
0.192 | 1.00 | 0.00107 | 180.06 |
(I just ran the SPECIALISEd version again in the background while editing this post and String slightly outperformed Text. This nondeterminism is driving me nuts.)
edit-distance
Also, check out Max Bolingbroke’s edit-distance library. That’s the only “related work” we found on hackage, and from his source code comments it looks like he’s waaay better at optimizing than we are.
That last post we promised is still in the works… thanks for your patience. Due dates, due dates, due dates!
I was hoping to post my results today, but it turns out that I misinterpreted the description of transpositions in the Wikipedia article. Besides that, I’m still 10% slower than your best version.