πŸ‘©β€πŸ’» chrismanbrown.gitlab.io

table markup ranked worst to best

spoiler it is all bad!

2025-02-03

CONTENTS

  1. INTRODUCTION
  2. MARKDOWN
  3. DJOT
  4. COMMONMARK
  5. HTML
  6. SCDOC
  7. TBL
  8. TYPST
  9. CONCLUSION

INTRODUCTION

Hey! Have you ever tried to write a table in plain text markup? It sucks!

Here are the top options for tabular data markup ranked from worst to best.

Did I miss one? Did I rank your favorite incorrectly? Let me know in the comments!

I’m not going to review latex because I don’t know latex and have never used latex and I’m not gonna and you can’t make me.

MARKDOWN

https://pandoc.org/chunkedhtml-demo/8.9-tables.html

To be fair I use markdown and markdown tables all the time. But also markdown tables are an abomination! Because markdown itself is an abomination. It suffers from xkcd #927 disease: there is an unchecked proliferation of standards and implementations.

In reality, the best standard you can really hope for is pandoc.

So here are some pandoc tables.

The simple_table extension with the table_captions extension.

  Right     Left     Center     Default
-------     ------ ----------   -------
     12     12        12            12
    123     123       123          123
      1     1          1             1

Table:  Demonstration of simple table syntax.

The headless simple table:

-------     ------ ----------   -------
     12     12        12             12
    123     123       123           123
      1     1          1              1
-------     ------ ----------   -------

multiline_tables is a separate extension and is kind of required if you have tables of long text:

-------------------------------------------------------------
 Centered   Default           Right Left
  Header    Aligned         Aligned Aligned
----------- ------- --------------- -------------------------
   First    row                12.0 Example of a row that
                                    spans multiple lines.

  Second    row                 5.0 Here's another one. Note
                                    the blank line between
                                    rows.
-------------------------------------------------------------

It can also omit the header:

----------- ------- --------------- -------------------------
   First    row                12.0 Example of a row that
                                    spans multiple lines.

  Second    row                 5.0 Here's another one. Note
                                    the blank line between
                                    rows.
----------- ------- --------------- -------------------------

The grid_tables extension:

+---------------+---------------+--------------------+
| Fruit         | Price         | Advantages         |
+===============+===============+====================+
| Bananas       | $1.34         | - built-in wrapper |
|               |               | - bright color     |
+---------------+---------------+--------------------+
| Oranges       | $2.10         | - cures scurvy     |
|               |               | - tasty            |
+---------------+---------------+--------------------+

And the pipe_tables extension:

| Right | Left | Default | Center |
|------:|:-----|---------|:------:|
|   12  |  12  |    12   |    12  |
|  123  |  123 |   123   |   123  |
|    1  |    1 |     1   |     1  |

So yeah! You’ve got options!

In my experience these extensions are already active, but if they’re not working for you, you might need to specify them explicitly:

pandoc document.md -f markdown+grid_tables -t html -o page.html

My problem with markdown tables is how incredibly fussy they are. Editing or inserting data often requires tedious repairing of your cell content and alignment. Unless you’re a monster who just leaves ugly tables in your markup:

fruit| price
-----|-----:
apple|2.05
pear|1.37
orange|3.09

In practice it’s not too too much of a big deal if you have a decent editor with decent markdown / table plugins such as vimwiki or org-mode. But at that point you’re just adding features to your editor to make an awful editing experience feel tolerable and the whole thing just feels icky.
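
If you’d rather script the repair than install a plugin, here is a rough sketch of what those plugins do: split each line on pipes, measure the widest cell per column, and pad everything back out. The function name align_pipe_table is made up for this example, and it assumes the second line is the dashes-and-colons row and that no cell contains an escaped pipe.

```python
def align_pipe_table(text: str) -> str:
    """Re-pad a pipe table so every column lines up.

    Assumptions (mine, for illustration): line two is the delimiter row,
    and no cell contains a literal or escaped "|".
    """
    rows = [
        [cell.strip() for cell in line.strip().strip("|").split("|")]
        for line in text.strip().splitlines()
    ]
    ncols = max(len(row) for row in rows)
    rows = [row + [""] * (ncols - len(row)) for row in rows]
    header, spec, *body = rows  # spec holds the dashes and colons

    # Column width: widest header or body cell, but never less than 3.
    widths = [max([3] + [len(r[i]) for r in [header, *body]]) for i in range(ncols)]

    def pad(cell: str, i: int) -> str:
        left, right = spec[i].startswith(":"), spec[i].endswith(":")
        if left and right:
            return cell.center(widths[i])
        if right:
            return cell.rjust(widths[i])
        return cell.ljust(widths[i])

    def rule(i: int) -> str:
        # Rebuild the delimiter cell at the new width, keeping its colons.
        left, right = spec[i].startswith(":"), spec[i].endswith(":")
        dashes = "-" * (widths[i] + 2 - left - right)
        return (":" if left else "") + dashes + (":" if right else "")

    def render(row: list) -> str:
        return "| " + " | ".join(pad(c, i) for i, c in enumerate(row)) + " |"

    lines = [render(header), "|" + "|".join(rule(i) for i in range(ncols)) + "|"]
    lines += [render(row) for row in body]
    return "\n".join(lines)
```

Feed it the monster table above and you get the same data back with every pipe lined up. It still feels icky, but at least it is only about thirty lines of icky.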

I personally cannot abide by any of these markups except for the simple_table format. It is the least offensive, and the easiest to update and maintain.

Ultimately my recommendation is this: don’t write or store or maintain your tabular data in markdown. Put it in a tabular format like csv or tsv.

For example if you keep your data as tab-separated fields:

fruit   price
apple   2.05
pear    1.37
orange  3.09

Then you can just pandoc it into markdown on the fly if you want to:

➜ pandoc -f tsv -t markdown fruit.tsv
  fruit    price
  -------- -------
  apple    2.05
  pear     1.37
  orange   3.09

If you want to include this table in your document, you can use the one true templating language m4 to preprocess your text before pandoc-ing it into html or whatever.

Can you believe the cost of fruit these days??

syscmd(`pandoc -f tsv -t markdown fruit.tsv')

Completely unhinged!

And then when processing your text:

m4 document.md | pandoc -f markdown -t html -o page.html
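
If m4 is not your idea of a good time, the same preprocessing step can be sketched in Python. This is not what m4 or pandoc do internally; it is a stand-in under some loud assumptions: the @table(...) marker is a syntax invented for this example, the names expand_tables and tsv_to_simple_table are made up, and the tsv is assumed to be rectangular (every row has the same number of fields).

```python
import re

# Hypothetical marker, invented for this sketch (not m4 or pandoc syntax):
# a line consisting of @table(fruit.tsv) is replaced by a table.
MARKER = re.compile(r"^@table\((?P<path>[^)]+)\)[ \t]*$", re.MULTILINE)


def tsv_to_simple_table(tsv: str) -> str:
    """Render tsv text as a left-aligned table in the simple_table style."""
    rows = [line.split("\t") for line in tsv.splitlines() if line]
    # Assumes every row has as many fields as the header row.
    widths = [max(len(row[i]) for row in rows) for i in range(len(rows[0]))]

    def fmt(cells):
        return " ".join(c.ljust(w) for c, w in zip(cells, widths)).rstrip()

    rule = " ".join("-" * w for w in widths)
    return "\n".join([fmt(rows[0]), rule, *map(fmt, rows[1:])])


def expand_tables(document: str, read_text) -> str:
    """Replace each marker line with the table built from the named file.

    read_text is any callable that maps a path to that file's contents.
    """
    return MARKER.sub(lambda m: tsv_to_simple_table(read_text(m["path"])), document)
```

Calling expand_tables(text, lambda p: pathlib.Path(p).read_text()) and piping the result into pandoc plays the role of the m4 step. Passing the reader in as an argument keeps the function easy to try out without touching the disk.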

DJOT

https://djot.net/syntax.html#pipe-table

Djot is a next generation markup created by John MacFarlane, who created pandoc and commonmark (see below). So, you know. He’s no slouch.

Its design goal seems chiefly to be an opinionated kind of markdown that is simple and as easy to parse as possible.

Consequently it supports a single table markup: the pipe table:

| a  |  b |
|----|:--:|
| 1  | 2  |
|:---|---:|
| 3  | 4  |

Because djot is pedantic, you cannot omit the leading and trailing pipes like you can in pandoc pipe tables.

I still recommend storing your tabular data in a separate file and then pre-processing it in. Because editing a pipe table still sucks.

COMMONMARK

https://spec.commonmark.org/0.31.2/

Commonmark is a project that publishes an actual (and the first?) spec for markdown.

It does not support tables!

For commonmark, the responsibility for implementing tables falls firmly to extensions.

So if you wanna write tables in plain commonmark with no extensions, you just gotta write raw html.

Which brings us to…

HTML

https://developer.mozilla.org/en-US/docs/Learn_web_development/Core/Structuring_content/HTML_table_basics

Despite being highly allergic to html/xml, I have to confess that html tables are FINE.

In fact, html is about 300% easier and more pleasant to edit and maintain than markdown/commonmark/djot because markdown tables are presentation and html tables are structure. Alignment and whitespace don’t matter in html. This makes it much better.

Add some css into the mix and baby you got a stew going.

You can still keep your data separate because pandoc can convert csv/tsv directly to html. But you can’t store information like rowspan or colspan in csv.
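
To make the structure-versus-presentation point concrete, here is a minimal sketch of turning tsv into an html table by hand. It is not pandoc’s output or pandoc’s method, just an illustration; the function name tsv_to_html is made up, and the first line of the tsv is assumed to be the header.

```python
from html import escape


def tsv_to_html(tsv: str) -> str:
    """Wrap tsv rows in table markup. Assumes line one is the header."""
    header, *body = [line.split("\t") for line in tsv.splitlines() if line]

    def row(cells, tag):
        # escape() keeps characters like & and < from breaking the markup.
        return "<tr>" + "".join(f"<{tag}>{escape(c)}</{tag}>" for c in cells) + "</tr>"

    lines = ["<table>", "<thead>", row(header, "th"), "</thead>", "<tbody>"]
    lines += [row(r, "td") for r in body]
    lines += ["</tbody>", "</table>"]
    return "\n".join(lines)
```

Notice that nothing in there measures a column or pads a cell. That is the whole appeal. It also shows the limitation: a flat list of fields has nowhere to put a rowspan or colspan.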

SCDOC

https://git.sr.ht/~sircmpwn/scdoc/tree/master/item/scdoc.5.scd#L116

scdoc is a markup that resembles markdown (except for its table markup; read on) and outputs manpages. Consequently it is meant to be a more simple and approachable replacement for mdoc and groff.

What scdoc has is a novel table syntax: one cell per line, preceded by a control character or two that starts a new row or a new column and also specifies text alignment within the cell.

This little bit of scdoc:

cat<<EOF | scdoc | groff -t -Tutf8
tables(7)

[- Fruit
:- Color
:- Taste
|[ Banana
:[ Yellow
:[ Mushy
|[ Apple
:[ Green
:[ Tart

EOF

Creates this little table:

┌────────┬────────┬───────┐
│ Fruit  │ Color  │ Taste │
├────────┼────────┼───────┤
│ Banana │ Yellow │ Mushy │
├────────┼────────┼───────┤
│ Apple  │ Green  │ Tart  │
└────────┴────────┴───────┘

This is kind of neat! I’m sure you can quite easily programmatically spit out all of your tabular data one bit at a time and prepend a character or two to each piece.
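
Here is roughly what that could look like. The control characters are inferred from the example above and from my reading of scdoc(5), so treat them as an assumption: a leading [ on the very first cell draws borders around every cell, | starts a new row, : starts a new column, and the second character sets alignment ([ for left, - for center). The function name rows_to_scdoc is made up, and cells containing newlines are not handled.

```python
def rows_to_scdoc(rows, align="["):
    """Emit one scdoc table line per cell.

    Assumptions (from the example above, not verified against every
    scdoc version): "[" first = bordered table, "|" = new row,
    ":" = new column, "-" = centered, "[" = left aligned.
    """
    lines = []
    for r, row in enumerate(rows):
        for c, cell in enumerate(row):
            if r == 0 and c == 0:
                lead = "["  # first character of the table picks the border style
            elif c == 0:
                lead = "|"  # start a new row
            else:
                lead = ":"  # start a new column
            cell_align = "-" if r == 0 else align  # center the header row
            lines.append(f"{lead}{cell_align} {cell}")
    return "\n".join(lines)
```

Given the fruit, color, and taste rows as a list of lists, this reproduces the nine table lines from the heredoc above, ready to paste under a manpage header and pipe into scdoc.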

It outputs roff so you can basically only target plain text, tty, or pdf.

TBL

https://www.man7.org/linux/man-pages/man1/tbl.1.html

tbl is a pre-processor that reads text and outputs roff. Kind of like scdoc. Except tbl is like 30 years older than scdoc and has a lot more syntax.

tbl + groff is fantastic because it has a terse syntax and is pretty ubiquitous. Chances are that it, like m4, is already installed on your computer. And you may not even know it! Except now you do because I just told you.

Here’s a little tbl and its output:

.TS
box center tab(#);
Cb Cb
L  L.
Ability#Application
Strength#crushes a tomato
Dexterity#dodges a thrown tomato
Constitution#eats a month-old tomato without becoming ill
Intelligence#knows that a tomato is a fruit
Wisdom#chooses \f[I]not\f[] to put tomato in a fruit salad
Charisma#sells tomato-based fruit salads to hypercarnivores
.TE
┌───────────────────────────────────────────────────────────────────┐
│   Ability                         Application                     │
│ Strength       crushes a tomato                                   │
│ Dexterity      dodges a thrown tomato                             │
│ Constitution   eats a month‐old tomato without becoming ill       │
│ Intelligence   knows that a tomato is a fruit                     │
│ Wisdom         chooses not to put tomato in a fruit salad         │
│ Charisma       sells tomato‐based fruit salads to hypercarnivores │
└───────────────────────────────────────────────────────────────────┘

It starts with some options concluded with a semicolon: draw a box around the table, center the table, and use # as the field delimiter. Next come the format lines, concluded with a period. Each format line describes one row of the table, and the last one applies to every remaining row: centered bold for both columns of the header row, then left aligned for both columns of everything after it. And then the table content.
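
That three-part layout (options, format lines, data) is regular enough to generate. A minimal sketch, assuming a bold centered header row and left-aligned body rows like the example above; the function name rows_to_tbl and its parameters are made up for illustration.

```python
def rows_to_tbl(rows, delimiter="#", options=("box", "center")):
    """Build a .TS/.TE block: options line, two format lines, then data.

    Sketch only: the header is centered bold (Cb), the body is left
    aligned (L), and cells must not contain the delimiter.
    """
    header, *body = rows
    ncols = len(header)
    for row in rows:
        for cell in row:
            if delimiter in cell:
                raise ValueError(f"cell contains the delimiter {delimiter!r}: {cell!r}")

    lines = [
        ".TS",
        " ".join([*options, f"tab({delimiter})"]) + ";",  # options end with ;
        " ".join(["Cb"] * ncols),                         # format for row one
        " ".join(["L"] * ncols) + ".",                    # format for the rest, ends with .
    ]
    lines += [delimiter.join(row) for row in [header, *body]]
    lines.append(".TE")
    return "\n".join(lines)
```

The delimiter check is there because tbl has no quoting: pick a character that never shows up in your data.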

The good news is that you can still keep your data separated if you want to:

.TS
tab(,);
cb cb
-- --
l l.
include(`fruit.csv')
.TE
m4 document | groff -t -Tutf8

 fruit    price
 ───────────────
 apple    2.05
 pear     1.37
 orange   3.09

The downside is that groff is atrocious at creating html. In fact tables are usually rendered as images and then embedded in the document! So don’t go this route unless you are targeting pdf or plain text.

TYPST

https://typst.app/docs/guides/table-guide/

Typst is a next generation typesetting markup. It has an editor that I don’t use that poises it to compete with stuff like Google Docs or Word. But its real killer feature is its syntax and compiler, which make it compete with latex and groff. It is very ergonomic and has a lot of niceties like a grid based layout that will feel very familiar if you know css grid.

Anyway, tables!

Here’s the killer bit about typst tables. You just call the table function, define the number of columns, pass in a few other optional configs, and then just list all your data as a flat list of args. It knows how many columns there are so, unlike every other table formatting markup ever, you don’t have to tell it when to start a new row.

#table(
  columns: 2,
  "fruit","price","apple","2.05","pear","1.37","orange","3.09"
)
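
Because the arguments are just a flat list, generating that call from separately stored data is close to trivial. A minimal sketch, with made-up function names, assuming every cell is plain text passed as a quoted string (backslashes and double quotes are escaped):

```python
def typst_string(value: str) -> str:
    """Quote a cell as a typst string literal, escaping \\ and "."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def rows_to_typst_table(rows) -> str:
    """Emit a #table(...) call; the column count comes from the first row."""
    ncols = len(rows[0])
    lines = ["#table(", f"  columns: {ncols},"]
    for row in rows:
        # One source line per row is only for human readers;
        # typst still sees a single flat argument list.
        lines.append("  " + ", ".join(typst_string(c) for c in row) + ",")
    lines.append(")")
    return "\n".join(lines)
```

The line breaks in the output carry no meaning. Delete them all and you get the same table, which is exactly the point.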

Downsides: typst basically only targets pdf. html output is experimental.

CONCLUSION

You have just read some of the ways that you can represent tabular data in markup.

guess what i just ignored all of my own advice and just wrote an ugly as sin markdown pipe table because sometimes you just want a table and don’t wanna mess around: that’s the difference between theory and practice

MARKUP|RANK|OUTPUT
------|----|------
markdown|1|text,html,pdf
djot|2|text,html,pdf
commonmark|0|text,html,pdf
html|3|html?
scdoc|4|roff,text,pdf
tbl|4|roff,text,pdf
typst|4|pdf

My recommendation remains the same for nearly all of them: don’t represent your tabular data in markup. Use csv or tsv or even sqlite or something instead. And then include your data in your document as you process it with something like m4. This makes your data much easier to maintain and edit without having to fuss with presentation or formatting. We call this “separation of concerns.”

If the csv format is holding you back from doing this, I don’t blame you. csv is tricky because of all of the character escaping you sometimes have to do. tsv solves nearly all of these problems though just by using tabs instead of commas.
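
A quick way to see the difference, using Python’s csv module with made-up sample data: the same rows written once with commas and once with tabs. A field that contains a comma has to be quoted in csv but passes through untouched in tsv.

```python
import csv
import io


def dump(rows, delimiter):
    """Write rows with the csv module and return the text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=delimiter, lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


# Sample data invented for this sketch.
rows = [["fruit", "note"], ["tomato", "a fruit, technically"]]

as_csv = dump(rows, ",")   # the note gets wrapped in double quotes
as_tsv = dump(rows, "\t")  # the note is written as-is
```

The “nearly” matters: a field with a literal tab or newline in it still needs care in tsv. That comes up a lot less often than a comma does.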

If the editability of c/tsv is holding you back then I recommend you try something called a “spreadsheet.” Or visidata. That’s what I use to view and edit nearly all of my tabular data these days: csv, tsv, rec, jsonl, even sqlite.

https://www.visidata.org/docs/

My other recommendation is to flatten your data. Some tables don’t need to be tables. They can just be lists! Or even paragraphs made up of sentences and words. The secret context of this post though that I forgot to mention at the beginning is that I spent between “some” and “a lot” of time writing tabletop roleplaying games (targeting html and pdf) and a lot of content in this medium really is tabular. No real way around it. And for that kind of content, again, it’s easiest for me to keep it in a separate database.