Changelog of @hackage/accelerate-llvm-ptx 1.3.0.0

Change Log

Notable changes to the project will be documented in this file.

The format is based on Keep a Changelog, and the project adheres to the Haskell Package Versioning Policy (PVP).

1.3.0.0 - 2018-08-27

Changed

  • Code generation improvements for stencil operations

Fixed

  • Segmented folds crash or give inconsistent results (accelerate#423)
  • Synchronisation problems on SM7+ (#436)

Contributors

Special thanks to those who contributed patches as part of this release:

  • Trevor L. McDonell (@tmcdonell)
  • Josh Meredith (@JoshMeredith)
  • Ivo Gabe de Wolff (@ivogabe)
  • Lars van den Haak (@sakehl)

1.2.0.0 - 2018-04-03

Changed

  • run variants that do not take an explicit execution context now execute on the first available device in an exclusive fashion. Multi-GPU systems can specify the default set of GPUs to use by setting the environment variable ACCELERATE_LLVM_PTX_DEVICES to a list of device ordinals.

Added

  • support for half-precision floats
  • support for struct-of-array-of-struct representations
  • support for the 64-bit atomic-add instruction in forward permutations (#363)
  • support for LLVM-6.0
  • support for GHC-8.4

Contributors

Special thanks to those who contributed patches as part of this release:

  • Trevor L. McDonell (@tmcdonell)
  • Moritz Kiefer (@cocreature)

1.1.0.1 - 2018-01-08

Fixed

  • add support for building with CUDA-9.x

1.1.0.0 - 2017-09-21

Added

  • support for GHC-8.2
  • caching of compilation results (accelerate-llvm#17)
  • support for ahead-of-time compilation (runQ and runQAsync)

Changed

  • generalise run1* to polyvariadic runN*

Fixed

  • synchronisation bug in multidimensional reduction

1.0.0.1 - 2017-05-25

Fixed

  • device kernel image is invalid (#386)

1.0.0.0 - 2017-03-31

  • initial release