@hackage http-conduit-downloader 1.0.1

HTTP downloader tailored for web-crawler needs.

HTTP/HTTPS downloader built on top of http-conduit and used in the http://bazqux.com crawler.

  • Handles all possible http-conduit exceptions and returns human-readable error messages.

  • Handles some web server bugs (no persistent connections on HTTP/1.1, returning deflate data instead of gzip).

  • Ignores invalid SSL certificates.

  • Receives data in 32k blocks internally to reduce memory fragmentation on many parallel downloads.

  • Download timeout.

  • Total download size limit.

  • Returns HTTP headers for use in subsequent redownloads and handles 'Not Modified' results.

  • Can be used with an external DNS resolver (hsdns-cache, for example).
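
The block-wise receiving and the total download size limit listed above can be illustrated with a small standalone sketch. It is not the package's implementation and does not use its API: the names Fetch, readLimited, sourceFrom and describe are hypothetical, and only the base and bytestring packages are assumed. The body is pulled in blocks of at most 32 KiB and the transfer is abandoned as soon as the running total exceeds the limit.

```haskell
import qualified Data.ByteString as B
import Data.IORef (newIORef, atomicModifyIORef')

-- | Hypothetical result type for this sketch (not the package's own type).
data Fetch
  = Complete B.ByteString  -- whole body, within the limit
  | TooLarge Int           -- bytes seen when the limit was exceeded
  deriving (Eq, Show)

-- | Block size assumed by the sketch: 32 KiB, as in the feature list.
blockSize :: Int
blockSize = 32 * 1024

-- | Pull blocks from 'next' until it yields an empty block (end of input)
-- or the running total exceeds 'limit'.
readLimited :: Int -> IO B.ByteString -> IO Fetch
readLimited limit next = go 0 []
  where
    go total acc = do
      block <- next
      let total' = total + B.length block
      if B.null block
        then pure (Complete (B.concat (reverse acc)))
        else if total' > limit
          then pure (TooLarge total')
          else go total' (block : acc)

-- | Stand-in for a network connection: serves an in-memory body
-- in blocks of at most 'blockSize' bytes, then empty blocks.
sourceFrom :: B.ByteString -> IO (IO B.ByteString)
sourceFrom body = do
  ref <- newIORef body
  pure $ atomicModifyIORef' ref $ \rest ->
    let (block, rest') = B.splitAt blockSize rest
    in (rest', block)

-- | Short summary, so large bodies are not printed in full.
describe :: Fetch -> String
describe (Complete b) = "complete, " ++ show (B.length b) ++ " bytes"
describe (TooLarge n) = "aborted after " ++ show n ++ " bytes"

main :: IO ()
main = do
  let body = B.replicate 100000 120          -- 100000 bytes of 'x'
  s1 <- sourceFrom body
  readLimited 200000 s1 >>= putStrLn . describe  -- complete, 100000 bytes
  s2 <- sourceFrom body
  readLimited 50000 s2 >>= putStrLn . describe   -- aborted after 65536 bytes
```

In the second call the limit is only checked at block boundaries, so the transfer stops after two full blocks (65536 bytes) and not at exactly 50000 bytes. A real downloader might also combine such a check with a timeout around the whole loop.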