@hackage little-earley0.2.0.0

Simple implementation of Earley parsing

A little Earley parser

Parser for context-free grammars.

Example

The following module defines a grammar for arithmetic expressions (like 1+2*3). It defines:

  • A type of non-terminal symbols Ns.
  • A type of terminal symbols Ts.
  • The production rules of the grammar arithRules.
  • A matching function, which interprets terminal symbols Ts as sets of lexemes, of type Char here.
  • A grammar arithG packaging up all of the above.
import Data.Char (isDigit)
import Little.Earley

data Ns = SUM | PRODUCT | FACTOR | NUMBER deriving (Eq, Ord, Enum, Bounded, Show)
data Ts = Digit | OneOf [Char] deriving (Eq, Ord, Show)

arithRules :: Ns -> [Rule Ns Ts]
arithRules n = case n of
  SUM ->
    [ [ N PRODUCT ]
    , [ N SUM, T (OneOf ['+', '-']), N PRODUCT ] ]
  PRODUCT ->
    [ [ N FACTOR ]
    , [ N PRODUCT, T (OneOf ['*', '/']), N FACTOR ] ]
  FACTOR ->
    [ [ N NUMBER ]
    , [ T (OneOf ['(']), N SUM, T (OneOf [')']) ] ]
  NUMBER ->
    [ [ T Digit ]
    , [ T Digit, N NUMBER ] ]

matchTs :: Ts -> Char -> Bool
matchTs Digit = isDigit
matchTs (OneOf s) = (`elem` s)

arithG :: Grammar Ns Ts Char
arithG = mkGrammar arithRules matchTs

Load that file in GHCi:

ghci -package little-earley example.hs

Parse a string using that grammar:

> pparse arithG SUM "1+2*3"

Output:

     +-----+--SUM #1---+
     |     |           |
  SUM #0   |      +PRODUCT #1-+
     |     |      |     |     |
PRODUCT #0 | PRODUCT #0 |     |
     |     |      |     |     |
 FACTOR #0 |  FACTOR #0 | FACTOR #0
     |     |      |     |     |
 NUMBER #0 |  NUMBER #0 | NUMBER #0
     |     |      |     |     |
-----------------------------------
     1     +      2     *     3

The module Little.Earley.Examples contains more examples of grammars using this library.

  • This Earley parser implementation is based on Loup Vaillant's tutorial.
  • See also the Earley library also in Haskell, with much fancier types to encode arbitrary semantic actions instead of clumsy trees.