@hackage halide-haskell0.0.2.0

Haskell bindings to Halide

halide-haskell

This project is still under heavy development and might not be production-ready
We encourage experimenting with it and reporting any issues you run into via Github issues.


Tutorials | Documentation | Showcases

license build Hackage

Halide is a programming language designed to make it easier to write high-performance image and array processing code on modern machines. Rather than being a standalone programming language, Halide is embedded in C++. This means you write C++ code that builds an in-memory representation of a Halide pipeline using Halide's C++ API. You can then compile this representation to an object file, or JIT-compile it and run it in the same process.

This package provides Haskell bindings that allow to write Halide embedded in Haskell without C++ 😋.

🚀 Example usage

As a simple example, here's how you could implement array addition with halide-haskell:

{-# LANGUAGE AllowAmbiguousTypes, DataKinds, OverloadedStrings, ViewPatterns #-}
import Language.Halide

-- The algorithm
mkArrayPlus = compile $ \(buffer "a" -> a) (buffer "b" -> b) -> do
  -- Create an index variable
  i <- mkVar "i"
  -- Define the resulting function. We call it "out".
  -- In pseudocode it's equivalent to the following: out[i] = a[i] + b[i]
  out <- define "out" i $ a ! i + b ! i
  -- Perform a fancy optimization and use SIMD: we split the loop over i into
  -- an inner and an outer loop and then vectorize the inner loop
  inner <- mkVar "inner"
  split TailAuto i (i, inner) 4 out >>= vectorize inner

-- Example usage of our Halide pipeline
main :: IO ()
main = do
  let a, b :: [Float]
      a = [1, 2, 3, 4, 5]
      b = [6, 7, 8, 9, 10]
  -- Compile the code
  arrayPlus <- mkArrayPlus
  -- We tell Halide to treat our list as a one-dimensional buffer
  withHalideBuffer @1 @Float a $ \a' ->
    withHalideBuffer b $ \b' ->
      -- allocate a temporary buffer for the output
      allocaCpuBuffer [length a] $ \out' -> do
        -- execute the kernel -- it is a normal function call!
        arrayPlus a' b' out'
        -- print the result
        print =<< peekToList out'

For more examples, have a look at the tutorials.

💻 Installing

Currently, the easiest way to install the library is using Nix. It is not a fundamental limitation, because the library itself is just a normal Cabal-based Haskell project, but installing & patching (not all our bug fixes have been upstreamed yet) the system dependencies is just too much work without Nix.

So, once you have Nix installed, you can add halide-haskell to your flake inputs like this project demonstrates and then include it in your build-depends section in the Cabal file.

If you just want to try building the library, type

nix build

and to run an example, try

nix run
nix run .#ghc927-intel-ocl.halide-haskell # for Intel OpenCL support
nix run .#ghc927-cuda.halide-haskell      # for CUDA support
nix run .#ghc944.halide-haskell           # to build with GHC 9.4.4 instead

(for OpenCL and CUDA, you may need to set NIXPKGS_ALLOW_UNFREE=1)

🤩 Motivation

The availability of Deep Learning frameworks such as PyTorch or JAX has revolutionized array processing, independently of whether one works on Machine Learning tasks or other numerical algorithms. The ecosystem in Haskell has been catching up as well, and there are now multiple good array libraries (hmatrix, massiv, Accelerate, arrayfire-haskell, Hasktorch, are all high-quality libraries). To accommodate multiple domains, such libraries have to support hundreds if not thousands of operations (e.g. there are more than 3.5 thousand of so called “native” functions in PyTorch), and this count does not include specializations for different device and data types).

To overcome this difficulty, we propose to build a common extension mechanism for Haskell array libraries. The mechanism is based on embedding the Halide language into Haskell that allows to just-in-time (JIT) compile computational kernels for various hardware.

🤨 Why not Accelerate?

One might wonder, why write another package instead of relying on Accelerate for the JIT compilation of the kernels. Accelerate is a Haskell eDSL (embedded Domain Specific Language) for collective operations on dense multi-dimensional arrays. It relies on LLVM to JIT compile the computational kernels for the target architecture including multicore CPUs and GPUs. Users have to rely on Accelerate to generate high-performance kernels and have no way to force some low-level optimizations. For example, Trevor L. McDonell et al. explain that the reason why hand-written CUDA implementation of the N-body problem outperforms Accelerate is the use of on-chip shared memory. Another example would be the matrix-matrix product where achieving maximal performance requires writing no fewer than six nested loops instead of the naive three (ACM Trans. Math. Softw. 34, 3, Article 12 (May 2008)). Accelerate has no way of knowing that such optimizations have to be applied and cannot perform them automatically, and this is precisely the gap that we are trying to fill by embedding Halide into Haskell.

Halide is a C++ eDSL for high-performance image and array processing. Its core idea is to decouple the algorithm (i.e. what is computed) from the schedule (i.e. where and when it is computed). The eDSL allows to quickly prototype and test the algorithm and then move on to the optimization. Optimizations such as fusion, tiling, parallelism and vectorization can be freely explored without the risk of breaking the original algorithm definition. Schedulers can also be generated automatically by advanced optimization algorithms

Halide provides a lower level interface than Accelerate and thus does not aim to replace it. Instead, Halide can be used to extend Accelerate, and later on one might even think about using Halide as a backend for Accelerate.

🔨 Contributing

Currently, the best way to get started is to use Nix:

nix develop

This will drop you into a shell with all the necessary tools to build the code such that you can do

cabal build

and

cabal test