Markdown Is a Mess

An analysis of why Markdown’s architectural limitations and fragile syntax make it unsuitable for complex or feature-rich documents.

Introduction

Markdown is a lightweight markup language for rich-text documents. It facilitates writing and editing documents in plain text via markup for document structure and text formatting, such as headings, paragraphs, lists, emphasized text, and hyperlinks.

Markdown’s simple syntax makes it easy to learn and more suitable for text authoring than HTML, which is verbose (start and end tags) and fragile (omitted or misspelled tags can affect rendering of the document). Markdown documents are often rendered to HTML for display, e.g., on websites, in documentation, and even in books.

While Markdown is well suited for smaller documents, such as README files, it has severe limitations for authoring books and feature-rich documents. The design of Markdown is flawed for several reasons.

Many markup elements provided by Markdown are ambiguous and lack expressiveness. There are often alternative ways to achieve the same structure or semantics in the document (e.g., both _emphasized_ and *emphasized* are rendered to the HTML fragment <em>emphasized</em>).

Markdown documents can contain a mix of Markdown and HTML, but combining the two languages comes with the cost of added complexity, maintenance burden, and reduced clarity, especially when HTML is used or even required to circumvent Markdown’s limitations.

While Markdown excels at quick and simple plain-text formatting, scaling it for sophisticated publishing reveals fundamental architectural flaws. The following sections dissect these critical limitations, analyzing how the language falls short in structural integrity, core document maintenance, and robust extensibility.

Inconsistent Mixing of Markdown and HTML

HTML tags are not interpreted inside block or inline code, and Markdown does not provide a native way to achieve this nesting. In the sample below, the HTML b element is not interpreted, and replacing it with Markdown’s native strong formatting (**…**) does not work either, as text enclosed in backticks is printed verbatim:

URIs such as `https://<b>www</b>.example.org/` include the subdomain `www`.

Instead of using Markdown’s backticks to mark up monospace text or code, the HTML code element is required to achieve the desired nesting:

URIs such as <code>https://<b>www</b>.example.org/</code> include the subdomain `www`.

This issue also applies to fenced code blocks. For example, it is not possible to highlight parts of the code using the HTML mark element.
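The verbatim treatment of code spans can be illustrated with a toy inline renderer. This is only a sketch under simplifying assumptions, not how any real Markdown parser is implemented: `render_inline` is a hypothetical helper that understands nothing but single-backtick code spans and `**strong**` text.

```python
import html
import re

# Toy inline renderer (hypothetical helper, not a real Markdown parser).
# It handles only single-backtick code spans and **strong** text.
CODE_SPAN = re.compile(r"`([^`]+)`")
STRONG = re.compile(r"\*\*(.+?)\*\*")


def render_inline(text: str) -> str:
    parts = []
    pos = 0
    for match in CODE_SPAN.finditer(text):
        # Outside code spans: inline markup is interpreted, raw HTML passes through.
        parts.append(STRONG.sub(r"<strong>\1</strong>", text[pos:match.start()]))
        # Inside a code span: the content is escaped and emitted verbatim.
        parts.append("<code>" + html.escape(match.group(1)) + "</code>")
        pos = match.end()
    parts.append(STRONG.sub(r"<strong>\1</strong>", text[pos:]))
    return "".join(parts)


print(render_inline("`https://**www**.example.org/` has the **subdomain** `www`."))
# prints: <code>https://**www**.example.org/</code> has the <strong>subdomain</strong> <code>www</code>.
```

Because the content of a code span is escaped before anything else looks at it, neither `**www**` nor `<b>www</b>` can survive as formatting inside the backticks in this sketch.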

The converse is also true. In the sample below, the Markdown strong formatting is ignored because it is embedded in an HTML element:

<p>This is a **very** important word.</p>

Rendered HTML:

<p>This is a **very** important word.</p>

However, in the markup below, the strong text will actually be rendered, even though it is embedded in an HTML element:

[<cite>This is a **very** important website</cite>](https://example.org/)

Rendered HTML:

<a href="https://example.org/"><cite>This is a <strong>very</strong> important website</cite></a>

No Native Support for Comments

Although Markdown is intended for writing documents, it lacks a dedicated way to comment out sections to prevent them from being rendered to HTML.

HTML comments (<!-- Comment -->) can be problematic because some Markdown renderers retain them in the output. Alternative syntax hacks, such as certain invalid hyperlink markup, are limited to single-line comments.
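One possible workaround is to strip HTML comments from the Markdown source before it reaches the renderer. The sketch below assumes such a preprocessing step exists in the toolchain; `strip_html_comments` is a hypothetical helper, and its regular expression is deliberately naive.

```python
import re

# Matches <!-- ... --> comments, including ones that span several lines.
HTML_COMMENT = re.compile(r"<!--.*?-->", re.DOTALL)


def strip_html_comments(markdown_source: str) -> str:
    """Remove HTML comments so that a renderer cannot copy them into its output."""
    return HTML_COMMENT.sub("", markdown_source)


source = "Visible text.\n<!-- Draft note:\nrewrite this section -->\nMore visible text.\n"
print(strip_html_comments(source))
```

Note that this naive approach would also delete comments that appear inside code samples, so it trades one problem for another.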

No Proper Nesting of Formats

Markdown provides rather limited support for nesting inline and block-level elements. Consider this sample HTML document:

<p>Paragraph before.</p>
<figure>
  <p><img src="source.png" alt="Sample image"></p>
  <figcaption>
    <p>Caption <em>first</em> paragraph.</p>
    <p>Caption <em>second</em> paragraph.</p>
  </figcaption>
</figure>
<p>Paragraph after.</p>

The Markdown required to recreate the HTML code above looks like this:

Paragraph before.

<figure>

![Sample image](source.png)

<figcaption>

Caption _first_ paragraph.

Caption _second_ paragraph.

</figcaption>
</figure>

Paragraph after.

Unfortunately, indentation can easily alter semantics in Markdown. For example, indenting the paragraphs inside figcaption by four spaces turns them into a code block, while omitting the blank line before the first caption paragraph prevents it from rendering as a paragraph.
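The two pitfalls described above can be caught with a small lint pass. The following is a rough sketch based on assumed heuristics, not on the full parsing rules of any Markdown implementation; `find_fragile_lines` and its warning texts are hypothetical.

```python
import re

# A line that consists of a single HTML tag, such as <figure> or </figcaption>.
HTML_TAG_LINE = re.compile(r"^\s*</?[A-Za-z][^>]*>\s*$")


def find_fragile_lines(markdown_source: str) -> list[tuple[int, str]]:
    """Return (line number, warning) pairs for lines that may render unexpectedly."""
    warnings = []
    previous = ""
    for number, line in enumerate(markdown_source.splitlines(), start=1):
        if line.strip() and not HTML_TAG_LINE.match(line):
            if previous.strip() == "" and line.startswith("    "):
                warnings.append((number, "indented by 4+ spaces: may become a code block"))
            elif HTML_TAG_LINE.match(previous):
                warnings.append((number, "no blank line after HTML tag: may not be parsed as Markdown"))
        previous = line
    return warnings


sample = (
    "<figcaption>\n"
    "Caption _first_ paragraph.\n"
    "\n"
    "    Caption _second_ paragraph.\n"
    "\n"
    "</figcaption>"
)
for number, warning in find_fragile_lines(sample):
    print(number, warning)
```

For the sample, the sketch flags line 2 (text directly after the opening tag) and line 4 (a paragraph indented by four spaces).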

No Standardized Extensibility

Markdown does not provide a standardized way to add custom block or inline elements. For example, consider the following HTML snippet:

<p>Paragraph before.</p>
<aside class="warning">
  <p>Warning first paragraph.</p>
  <p>Warning second paragraph.</p>
</aside>
<p>Paragraph after.</p>

This HTML could be translated into the following Markdown snippet, which is fragile and difficult to read. It relies heavily on blank lines and fails to express the nested structure in a clear, visually appealing way:

Paragraph before.

<aside class="warning">

Warning first paragraph.

Warning second paragraph.

</aside>

Paragraph after.

Some Markdown parsers can be extended with plugins that provide specialized syntax for defining custom blocks:

Paragraph before.

:::warning
Warning first paragraph.

Warning second paragraph.
:::

Paragraph after.

However, this approach is highly limited and fails for structures containing multiple nested block elements, such as the figure and figcaption example in the section No Proper Nesting of Formats.
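To make the plugin idea concrete, the sketch below shows how a `:::name` container might be expanded by a preprocessor into the HTML wrapper used earlier in this section. It is a simplified assumption, not the behavior of any particular plugin: `expand_containers` is a hypothetical helper that tracks only one open container at a time.

```python
import re

# Hypothetical, simplified container syntax: ":::name" opens, ":::" closes.
OPEN = re.compile(r"^:::(\w+)$")


def expand_containers(markdown_source: str) -> str:
    out = []
    inside = False
    for line in markdown_source.splitlines():
        opener = OPEN.match(line)
        if not inside and opener:
            # A blank line after the tag lets the content be parsed as Markdown.
            out += [f'<aside class="{opener.group(1)}">', ""]
            inside = True
        elif inside and line == ":::":
            out += ["", "</aside>"]
            inside = False
        else:
            # Everything else, including a nested ":::name" line, is copied as-is.
            out.append(line)
    return "\n".join(out)


print(expand_containers(":::warning\nWarning first paragraph.\n:::"))
```

Because the sketch keeps a single `inside` flag, a container opened inside another one is passed through as plain text, and the first closing `:::` ends the outer container. Supporting real nesting would require a stack and a way to tell inner from outer delimiters.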

Conclusion

Ultimately, Markdown remains a highly capable tool for basic note-taking and simple web documentation, but it was never intended to handle the demands of high-quality publishing. Its ambiguous syntax, lack of structural nesting, and inconsistent handling of HTML snippets show that forcing Markdown into feature-rich document authoring creates more problems than it solves.

Since Markdown’s inception, more robust alternatives have emerged to address these shortcomings, often drawing inspiration from Markdown itself. Semantic markup languages like AsciiDoc provide the precision required for complex book production, while frameworks like MDX have proven themselves for interactive documentation.