diff --git a/admin/extending-remarkable.md b/admin/extending-remarkable.md new file mode 100644 index 0000000000..4d37834464 --- /dev/null +++ b/admin/extending-remarkable.md @@ -0,0 +1,109 @@ +Docusaurus uses [Remarkable](https://github.com/jonschlinkert/remarkable) to convert plain markdown text into HTML. This document covers how one may extend Remarkable to provide custom functionality. While the document focuses on extending Remarkable in implementation, the theory should apply in general to any markdown parser. + +## Why extend Remarkable? + +Users of GitHub Pages have come to expect certain features provided by GitHub Flavored Markdown. One such example would be heading anchors, where every sub-header has an associated anchor that matches the heading text. This makes it possible to link to a specific section in a document by passing a fragment that matches the heading. For example, to link to this very section, you may create a link like so: + +``` + [Link to this section](#why-extend-remarkable) +``` + +## A Brief Overview of How A Markdown Parser/Renderer Works + +This is a summary of the basic concepts you'll need to understand in order to extend Remarkable, based on the [Remarkable docs](https://github.com/jonschlinkert/remarkable/tree/master/docs) as well as our own experience extending Remarkable to support GFM-style heading anchors. + +As the heading here implies, there's two main parts to how a markdown parser works: the parsing phase, and the rendering phase. During the parsing phase, a plain markdown document is parsed into a set of tokens that describe its structure. These tokens are then used by the renderer to output the actual HTML contents. + +### Parsing Markdown into Tokens + +Let's talk a bit more about what is done as part of the parsing stage. The result of this stage is a tree made up of tokens. There's three types of tokens: inline, block, and core. + +#### Inline tokens + +Inline tokens are simple tokens that have text as a child. They are leaf nodes, and do not support having additional tokens within. An example of this might be `_emphasized text_`, which might be represented as a token of type `em` with contents of `emphasized text`. + +#### Block tokens + +A block token is a bit more complex. It may wrap one or more tokens, and can span more than one line of text. An example of this is the heading token: + +``` +### Hi there +``` + +The plain markdown text above would be parsed into three tokens: + +- `heading_open`: Marks the begining of the heading. May have additional props, such as `hLevel: 3` (heading level) in this case. +- `text`: Plain text token, with a value of "Hi there". +- `heading_close`: Marks the end of the heading. In this case, it would also have a `hLevel: 3` prop. + +This is a basic example, because it contains a simple `text` token within the opening and closing tags. A common block encountered in markdown is the paragraph, which might be tokenized into a series of tokens such as `paragraph_open`, one or more `text` tokens, `link` tokens (if links are present within the text, for example), and, eventually, a `paragraph_close` token. + +#### Core tokens + +These are outside of the initial scope of this article for now. Core tokens may be [reference-style links](https://github.github.com/gfm/#link-reference-definitions), which can appear anywhere in a markdown document. + +### Rendering Tokens into HTML + +After we have parsed everything into tokens, we go to the rendering phase. This is where we convert our `heading_open`, `text`, and `heading_close` tokens from earlier into `