Programming Languages

Parsing Markdown with First-Class Derivatives

Assigned to Johannes Hartmann.

Markdown is a lightweight and widely used format for structuring plain-text documents. The layout rules that structure the text make Markdown easy for humans to read, but they also make it difficult to write a grammar or a parser for it.

Many Markdown dialects and extensions exist, each with a hand-tailored parser implementation. In general, however, the dialects do not compose, and the parsers are not easily extensible with other layout-sensitive features such as ASCII tables.

Parsing with first-class derivatives is a new approach to developing parsers that allows layout-sensitive features to be expressed modularly as parser combinators, facilitating reuse and composition of these features.
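
As a rough illustration of the idea, the following Python sketch builds recognizers from derivatives and then defines one layout-sensitive combinator on top of them. It is a minimal sketch under stated assumptions: all names (Parser, derive, Indented, INDENT, and so on) are hypothetical and chosen for this example only, they are not the API of the provided library, and the sketch only recognizes input rather than building a syntax tree.

```python
from dataclasses import dataclass

INDENT = "  "  # assumption for this sketch: blocks are indented by two spaces


class Parser:
    def nullable(self) -> bool:             # does the parser accept the empty string?
        raise NotImplementedError

    def derive(self, ch: str) -> "Parser":  # the parser that remains after reading ch
        raise NotImplementedError

    def accepts(self, text: str) -> bool:
        p = self
        for ch in text:
            p = p.derive(ch)
        return p.nullable()


@dataclass(frozen=True)
class Fail(Parser):
    def nullable(self): return False
    def derive(self, ch): return self


@dataclass(frozen=True)
class Eps(Parser):
    def nullable(self): return True
    def derive(self, ch): return Fail()


@dataclass(frozen=True)
class Char(Parser):
    c: str
    def nullable(self): return False
    def derive(self, ch): return Eps() if ch == self.c else Fail()


@dataclass(frozen=True)
class Alt(Parser):
    left: Parser
    right: Parser
    def nullable(self): return self.left.nullable() or self.right.nullable()
    def derive(self, ch): return Alt(self.left.derive(ch), self.right.derive(ch))


@dataclass(frozen=True)
class Seq(Parser):
    first: Parser
    second: Parser
    def nullable(self): return self.first.nullable() and self.second.nullable()
    def derive(self, ch):
        rest = Seq(self.first.derive(ch), self.second)
        # If `first` can be empty, ch may already belong to `second`.
        return Alt(rest, self.second.derive(ch)) if self.first.nullable() else rest


@dataclass(frozen=True)
class Star(Parser):
    inner: Parser
    def nullable(self): return True
    def derive(self, ch): return Seq(self.inner.derive(ch), self)


@dataclass(frozen=True)
class Indented(Parser):
    """Layout-sensitive combinator: strips INDENT from every line and
    forwards only the remaining characters to `body` via derive."""
    body: Parser
    pending: str = INDENT  # indentation still expected on the current line

    def nullable(self):
        # Input may not end in the middle of an indentation prefix.
        return len(self.pending) in (0, len(INDENT)) and self.body.nullable()

    def derive(self, ch):
        if self.pending:  # still consuming the indentation prefix
            return Indented(self.body, self.pending[1:]) if ch == self.pending[0] else Fail()
        # Past the prefix: feed ch to the body; a newline starts a new prefix.
        return Indented(self.body.derive(ch), INDENT if ch == "\n" else "")


# Body parser that knows nothing about indentation: lines of 'a's.
line = Seq(Star(Char("a")), Char("\n"))
body = Star(line)
block = Indented(body)
```

Here `body` is written without any knowledge of indentation, and `Indented` reuses it unchanged by deciding which characters to pass on through `derive`. For example, `block.accepts("  aa\n  a\n")` holds while `block.accepts("  aa\na\n")` does not, because the second line lacks the expected indentation.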

The goal of this Bachelor's thesis is to investigate the modularity benefits of first-class derivatives when applied to Markdown parsing. The thesis sets out to answer the following questions: Can Markdown parsing be implemented using first-class derivatives? Do first-class derivatives offer modularity benefits? How easily can the resulting parser be extended with other layout-sensitive features (such as ASCII tables)?

To investigate these questions, new combinators should be developed, if possible, that allow a modular definition of a Markdown parser. If defining such combinators is not possible or not advantageous, the thesis should explain why this is the case and propose alternatives.

For evaluation, the combinators for Markdown parsing should be implemented as a prototype, based on a provided library for parsing with first-class derivatives.

Contact

Jonathan Brachthäuser