Zettelmarkup: General Principles

Any document can be thought of as a sequence of paragraphs and other block-structured elements (“blocks”), such as headings, lists, quotations, and code blocks. Some of these blocks can contain other blocks; for example, lists may contain other lists or paragraphs. Other blocks contain inline-structured elements (“inlines”), such as text, links, emphasized text, and images.

With the exception of lists and tables, the markup for blocks always begins at the first position of a line with three or more identical characters. List blocks also begin at the first position of a line, but need one or more identical characters, followed by a space character. Table blocks begin at the first position of a line with the character “|”. Non-list blocks are either fully specified on that line, or they span multiple lines and are delimited by the same three or more characters. Whether a block is specified on one line or on at least two lines depends on the kind of block.

If a line does not begin with an explicit block element, the line is treated as an (implicit) paragraph block element that contains inline elements. This paragraph ends when a block element is detected at the beginning of a following line or when an empty line occurs. Some blocks may also contain inline elements, e.g. a heading.
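The line-start rules above can be sketched as a small classifier. This is only an illustration, not the reference implementation: the function name classify_line is hypothetical, the set of list characters (*, # and >) is an assumption that the text above does not spell out, and restricting block markers to non-alphanumeric characters is likewise an assumption made so that ordinary words are not mistaken for markup.

```python
import re

# Assumption: list items start with one or more identical characters from
# this small set, followed by a space. The set itself is a guess.
_LIST_START = re.compile(r"^([*#>])\1* ")

# Assumption: block markers are three or more identical characters that are
# neither whitespace nor ASCII letters/digits.
_BLOCK_START = re.compile(r"^([^\sA-Za-z0-9])\1{2,}")


def classify_line(line: str) -> str:
    """Roughly classify one line by how it begins (illustrative sketch)."""
    if line.strip() == "":
        return "empty"        # an empty line ends an implicit paragraph
    if line.startswith("|"):
        return "table"        # table blocks begin with "|"
    if _LIST_START.match(line):
        return "list"         # identical characters plus a space
    if _BLOCK_START.match(line):
        return "block"        # three or more identical characters
    return "paragraph"        # no explicit block element: implicit paragraph
```

Lists are checked before other blocks in this sketch, so a line that starts with three list characters and a space counts as a list item of depth 3 rather than as another kind of block.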

Inline elements mostly begin with two non-space, often identical characters. With some exceptions, two identical non-space characters begin a formatting range that is ended by the same two characters.
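As a rough illustration of the formatting-range rule, the following sketch finds ranges that open with two identical characters and close with the same two. The function name find_ranges is hypothetical, and the restriction to non-alphanumeric marker characters is an assumption; a real parser would also have to handle nesting, escapes, and the exceptions to the rule, none of which this regular expression attempts.

```python
import re

# Two identical marker characters, the shortest possible content, then the
# same two characters again. Assumption: markers are never letters or digits.
_RANGE = re.compile(r"([^\sA-Za-z0-9])\1(.+?)\1\1")


def find_ranges(text: str) -> list[tuple[str, str]]:
    """Return (marker, content) pairs for simple formatting ranges."""
    return [(m.group(1) * 2, m.group(2)) for m in _RANGE.finditer(text)]
```

For example, find_ranges("a **b** c") yields [("**", "b")]. Bracketed forms such as [[...]] are not matched, because their closing characters differ from their opening ones.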

Exceptions are: links, images, edits, comments, and both the “en-dash” and the “horizontal ellipsis”. A link is given with [[...]], an image with {{...}}, and an edit formatting with ((...)). An inline comment, beginning with the sequence %%, always ends at the end of the line where it begins. The “en-dash” (“–”) is specified as --, the “horizontal ellipsis” (“…”) as ... (see note 1).

Some inline elements do not follow the rule of two identical characters, notably footnotes, citation keys, and local marks. These elements begin with one opening square bracket (“[”), use a character to specify the kind of inline, typically allow some content to be specified, and end with one closing square bracket (“]”).

One inline element that does not begin with two characters is the “entity”. It allows any Unicode character to be specified. The specification of that character is put between an ampersand character and a semicolon: &...;. For example, an “en-dash” could also be specified as &ndash;.
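A minimal sketch of entity resolution follows. It assumes that Zettelmarkup entities use the same names and numeric forms as HTML character references, which the text above does not state explicitly, and it borrows Python's standard html.unescape for the lookup. The function name decode_entities is hypothetical.

```python
import html
import re

# An entity is whatever sits between "&" and ";", with no whitespace inside.
_ENTITY = re.compile(r"&[^;\s&]+;")


def decode_entities(text: str) -> str:
    """Replace each &...; sequence with the character it names, if known."""
    # Assumption: entity names follow HTML character references, so
    # html.unescape can resolve them; unknown names are left unchanged.
    return _ENTITY.sub(lambda m: html.unescape(m.group(0)), text)
```

Requiring the closing semicolon in the pattern keeps a bare ampersand, as in "AT&T", untouched.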

The backslash character (“\”) possibly gives the next character a special meaning. This allows some remaining ambiguities to be resolved. For example, a list of depth 2 begins a line with ** Item 2.2. An inline element that strongly emphasizes some text beginning with a space is specified as ** Text**. To force the inline element formatting at the beginning of a line, **\ Text** should be specified instead.

Many block and inline elements can be refined by additional attributes. Attributes roughly resemble HTML attributes and are put near the corresponding elements by using the syntax {...}. One example is making space characters visible inside an inline literal element: 1␣+␣2␣=␣3 was specified by using the default attribute: ``1 + 2 = 3``{-}.
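The visible-space example can be imitated with a short sketch. It handles only inline literals delimited by two backticks and only the default attribute “-”; the function name render_literals is hypothetical, and treating the attribute text as a whitespace-separated list is an assumption made for illustration, not a description of the full attribute syntax.

```python
import re

# An inline literal, optionally followed directly by an attribute block {...}.
_LITERAL = re.compile(r"``(.+?)``(?:\{([^}]*)\})?")


def render_literals(text: str) -> str:
    """Render inline literals as plain text, honouring the "-" attribute."""
    def replace(match: re.Match) -> str:
        content, attrs = match.group(1), match.group(2)
        # Assumption: attributes are separated by whitespace inside the braces.
        if attrs is not None and "-" in attrs.split():
            content = content.replace(" ", "\u2423")  # U+2423 OPEN BOX
        return content

    return _LITERAL.sub(replace, text)
```

Under these assumptions, the literal from the example above, written with the {-} attribute, renders as 1␣+␣2␣=␣3, while the same literal without the attribute keeps its ordinary spaces.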

To summarize:

These principles make automatically recognizing Zettelmarkup a (relatively) easy task. By looking at the reference implementation, a moderately skilled software developer should be able to create appropriate software in a different programming language.

  1. If put at the end of non-space text.