I’m following the Stanford Algorithms MOOC and trying to implement the problems using Haskell. Many of the algorithms require quite a bit of data juggling, and my pure solutions are running way slower than benchmarks people are quoting for imperative languages. This is due to expensive copying and appending.
So I wanted to explore mutable data structures, which took me into the world of Monads. This page benefits from the advice I received to this and other Stack Overflow posts.
QuickSort is an archetypal example of an algorithm using mutability, and will form the basis of the following examples.
ST Monad with mutable array
Consider the following:
Starting from main, note how we wrap the main instructions in runST
or runSTArray
. runST
is the more generally useful code, and it needs to be provided with a ‘pure’ value to work. So I freeze the mutable state (in this case an Array) and return, and print, that. runSTArray
is special syntax which has the same effect for mutable arrays.
Qsort itself takes the relevant mutable array, undertakes a succession of recursive partitions of the data, and ultimately stops. It does not return anything, but of course leaves the state considerably changed, hence the return type of State s ()
.
Note how foldM
works exactly as foldl
, but that it returns a Monad, so I extract the value from it for later use.
IO Monad
Another approach is to use the IO Monad, again with an Array as an example. This is simpler to write and, in particular, avoids the complexity of understanding runST. I believe ST provides more checks of some sort, so that might make it preferable.
Mutable Vector
Vectors are normally to be preferred over Arrays. The resulting code is unsurprisingly similar (Haskell is logical if sometimes bewildering!). The main issues I faced were
- finding the way to create the mutable vector in the first place, although I suspect that that was mostly due to;
- adding the type signature for qsort.
Here is my code
IO Vector
TBC
How quick are the various solutions?
Simplest things is to just do ‘:set +s’ in ghci, and then you can see the execution time of anything you run, along with memory usage.
Immutable Vector 8.3 seconds STVector - 0.84 seconds