title | date | description |
---|---|---|
Email Formatting Is Harder Than It Looks | 2025-07-14 | A detailed overview of plain-text email formatting, what makes it deceptively hard, and how I wrote mailfmt. |
*[UTF-8]: Unicode Transformation Format – 8 bit. Text encoding standard.
*[plain–text]: Content representing only readable characters, and whitespace characters that affect the arrangement of the text.
[TOC]
Plain text email
As I've mentioned before, I like using Kakoune for reading & writing emails. Of course, Kakoune is a source code editor, not a rich text editor. It operates on UTF-8 plain–text --- which means that the emails I write need to be in plain text, too.
As it turns out, plain-text email (which predates HTML by decades[^1]) hasn't
really left a "legacy" so much as it hasn't actually gone anywhere. Many
developers swear by it; some are even so committed as to automatically filter
`text/html` mail as spam. If you want to learn more, you can begin by reading
Drew DeVault's post, useplaintext.email, and sourcehut's mailing list
etiquette guide.
As I went down the `text/plain` path, I quickly learned that I needed an email
formatter. Why? Plain text is like source code. You can't rely on the
recipient's mail client to render it in a certain way --- you have to assume
that what you see is exactly what they get.
On one hand, this isn't really a problem --- the whole point of plain text is not having to bother with formatting, right? There is, however, a crucial catch: line wrapping.
The wrapping problem
Since we (humanity) have been writing text, we've been wrapping it. Pages, after all, have finite width. At some point, an ongoing sentence needs to continue on the line below it. This is called wrapping. In digital text, there are two kinds of wrapping: soft and hard. The former is much more common, and we often take it for granted.
Hard-wrapped text is the simplest: the line breaks are directly part of the
source. If you're writing a sentence that's getting too long, you simply press
`<ret>` to begin a new line. The author is responsible for all line breaks.
This guarantees that (assuming the renderer doesn't reflow text) the output
will always look exactly how it does in the editor.
Soft-wrapped text has line breaks inserted by the renderer --- they're not present in the source file. It's incredibly convenient! As the writer, we don't need to worry at all about line breaks; only paragraph breaks. We can trust that the text will be wrapped properly whenever it's viewed.
Now... remember how I just said that, in the context of plain text email, we can't make any assumptions about how the text will be rendered? This applies to wrapping, too. Some mail clients may wrap text, but not all of them. This essentially consigns us to hard-wrapping our emails.
The problem? It's inconvenient! Imagine you edit a paragraph, and remove a sentence. Well, now that entire paragraph's spacing is messed up, and you need to manually reflow it and fix the line breaks. Yuck!
The Markdown complication
Standard tools
At this point, some of you may be screaming: "but what about `fmt` and
`fold`?" There exist utilities meant to solve this specific problem, included
in most Linux distributions out-of-the-box! Well, you would be right. Sort of.
It's true that we already have excellent, composable commands for wrapping and
paragraph formatting. A simple `#!fish fmt email.txt >wrapped.txt` is enough
to cover many cases (write to a new file: redirecting back onto `email.txt`
would truncate it before it is read). However, there's a problem: these tools
are markup agnostic.
Why is that a problem when I literally just said we don't care about markup? Well, there are some markup formats that are delightfully readable even in plain–text. Consider the following unordered list in HTML (HyperText Markup Language):
<ul>
<li>Foobar</li>
<li>Barfoo</li>
</ul>
See, machines can read this no problem... but people? We struggle. Now, consider the exact same list expressed in Markdown:
- Foobar
- Barfoo
Isn't that so much nicer? As it turns out, Markdown isn't only meant to make writing HTML easier --- it's also a great way to enhance the semantics of plain text.
This is where we run up against issues with `fmt` & company: because they're
not aware of Markdown syntax, they have a tendency to break it. Consider
the unordered list example from before:
$ cat list.md | fmt
- Foobar - Barfoo
The tool has no idea this is meant to be a list. It just treats whitespace-separated tokens as words and reflows paragraphs accordingly.
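The same failure is easy to reproduce with Python's `textwrap` module, which is just as unaware of Markdown. This is only an illustration of markup-agnostic reflow, not a description of how `fmt` works internally:

```python
import textwrap

markdown_list = "- Foobar\n- Barfoo"

# fill() treats the whole input as a single paragraph: newlines are
# replaced by spaces and the words are re-wrapped to the target width.
reflowed = textwrap.fill(markdown_list, width=72)
print(reflowed)  # - Foobar - Barfoo
```

Both list items fit comfortably within 72 columns, so they end up joined on one line.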
Markdown formatters
My immediate next thought was to try an actual Markdown formatter. Not only do they handle wrapping & reflow, they also won't break the markup. I gave it a shot, and to my horror, I found that they have the opposite problem: they preserve markup, but they break signature blocks, sign-offs, and headers!
Writing mailfmt
I eventually wrote `mailfmt` to fill the niche of email formatting. It
provides consistent paragraph spacing, hard-wrapping and paragraph reflow,
while preserving Markdown syntax, email headers, quotes, sign-offs, and
signature blocks. Additionally, the wrapped output can be made safe for
passing to a Markdown parser. This is useful if you want to build an HTML
email from plain-text.
`mailfmt` is open-source under the ISC license, and is available on PyPI for
installation with tools like `pipx` and `uv`. The source code is available on
sourcehut at git.ficd.sh/ficd/mailfmt.
I wrote this tool primarily for myself. It's served me very well over the past
few months. `mailfmt` could be helpful for anyone who prefers writing email in
plain-text using text editors like Kakoune, Helix, and Vim. It can format via
`stdin`/`stdout` and read/write files, making `mailfmt` easy to configure as a
formatter for the `mail` filetype in your editor.
My requirements
- A way to consistently format my outgoing emails in my text editor.
- Paragraph reflow and automatic line wrapping.
- Ability to use Markdown syntax:
    - Without it being broken by reflow & wrap.
    - While looking good and retaining the same semantics in both rendered
      and plain-text form, which is ideal for `multipart` emails.
- Ensure proper formatting of signature blocks.
- Preserve formatting of sign-offs.
Wrap & reflow
It turns out that the most important part was also the easiest to implement.
Python's standard library includes `textwrap`, which literally just does it
for you. So the real challenge becomes figuring out what to wrap, versus what
to ignore.
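As a rough idea of how little code the wrapping itself needs, here is a minimal sketch of paragraph reflow built on `textwrap.fill`. The `reflow` helper and the sample text are made up for illustration and are not taken from `mailfmt`'s source:

```python
import textwrap


def reflow(text: str, width: int = 72) -> str:
    """Re-wrap each blank-line-separated paragraph to `width` columns."""
    paragraphs = text.split("\n\n")
    # fill() joins the lines of a paragraph and breaks them again at `width`.
    wrapped = [textwrap.fill(p, width=width) for p in paragraphs]
    return "\n\n".join(wrapped)


draft = (
    "This line was edited and is now short.\n"
    "The rest of the paragraph still has its old line breaks, "
    "so the spacing looks uneven.\n"
    "\n"
    "A second paragraph stays separate."
)
print(reflow(draft, width=40))
```

Paragraph breaks survive because each paragraph is filled on its own; everything inside a paragraph is treated as reflowable prose, which is exactly why the interesting work lies in deciding what not to pass through it.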
Preserving Markdown
Getting my tool to preserve Markdown was fairly straightforward. I'm not building a Markdown formatter, I'm building a formatter that doesn't break Markdown. In other words, I don't need to parse Markdown syntax; just recognize it, and ignore it.
`mailfmt`'s approach is simple: detect when a line matches a known pattern of
Markdown block element syntax, such as leading `#` for headings, `-` for
lists, etc. If so, leave the line untouched. Similarly, don't format anything
inside fenced code blocks.
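A minimal sketch of this "recognize, don't parse" idea might look like the following. The exact patterns `mailfmt` uses are not listed here, so `BLOCK_START`, `FENCE`, and `classify` are hypothetical names with an assumed, deliberately small set of rules:

```python
import re

# Assumed patterns for lines that begin a Markdown block element.
BLOCK_START = re.compile(
    r"""^\s*(?:
        \#{1,6}\s     # heading
      | [-*+]\s       # unordered list item
      | \d+[.)]\s     # ordered list item
      | >             # block quote
    )""",
    re.VERBOSE,
)
# A fence is a run of three or more backticks or tildes.
FENCE = re.compile(r"^\s*(?:`{3,}|~{3,})")


def classify(lines):
    """Yield (line, protected) pairs; protected lines must not be reflowed."""
    in_fence = False
    for line in lines:
        if FENCE.match(line):
            in_fence = not in_fence  # opening or closing fence
            yield line, True
        elif in_fence or BLOCK_START.match(line):
            yield line, True
        else:
            yield line, False


sample = ["# Title", "Some prose here.", "- item", "~~~", "x = 1", "~~~", "More prose."]
for line, protected in classify(sample):
    print("keep " if protected else "wrap ", line)
```

Only the two prose lines are marked for wrapping; the heading, the list item, and everything between the fences pass through unchanged.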
Sign-offs
Consider the following sign-off:
Best wishes,
Daniel
A Markdown formatter considers this to be one paragraph, and reflows it accordingly, causing it to lose semantic meaning:
Best wishes, Daniel
Within the confines of Markdown, I counted three ways of dealing with the problem:
- Put an empty line between the two parts:
Best wishes,

Daniel
However, this empty line looks a tad awkward when viewed in plain–text.
- Put a backslash after the intentional line break:
Best wishes, \
Daniel
Again, this looks bad when the Markdown isn't rendered.
- Put two spaces after the intentional line break (`•` = space):
Best•wishes,••
Daniel
This syntax is ambiguous, easy to forget, and not supported by editors that trim trailing whitespace.
`mailfmt` detects sign-offs using a very simple heuristic. First, we check if
a line has 5 or fewer words and ends with a comma. If we find such a line,
we check the next line. If it has 5 or fewer words that all begin with an
uppercase letter, then we assume these two lines are a sign-off, and we
don't reflow or wrap them. The heuristic supports a very simple pattern:
A courteous salutation,
Prefix. First Middle Last, Suffix
For instance:
Sincerely,
Rev. John Apple Smith, PHD.
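The heuristic described above is small enough to sketch in a few lines. This is an illustrative reimplementation based only on the description in this post, with `is_signoff` as a made-up name; the real code may differ in its details:

```python
def is_signoff(first: str, second: str) -> bool:
    """Return True if two consecutive lines look like a sign-off."""
    first_words = first.split()
    second_words = second.split()
    if not first_words or not second_words:
        return False
    # Line 1: at most 5 words, ending with a comma.
    salutation_ok = len(first_words) <= 5 and first.rstrip().endswith(",")
    # Line 2: at most 5 words, each starting with an uppercase letter.
    name_ok = len(second_words) <= 5 and all(w[0].isupper() for w in second_words)
    return salutation_ok and name_ok


print(is_signoff("Sincerely,", "Rev. John Apple Smith, PHD."))         # True
print(is_signoff("Best wishes,", "Daniel"))                            # True
print(is_signoff("After some thought,", "we decided to wrap it all"))  # False
```

The last case fails because the second line contains lowercase words, so it is treated as ordinary prose and reflowed.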
Signature blocks
The standard for signature blocks is as follows:
- Begins with two `-` characters followed by a single space, then a newline.
- Everything that follows until the EOF is part of the signature.
*[EOF]: End of file.
Here's an example (note the • = space):
--•
Daniel
Software•Developer,•Company
email@website.com
As with sign-offs, such a signature block gets mangled by other formatters.
Furthermore, the single space after the `--` token is important: if it's
missing, some clients won't recognize it as a valid signature.
`mailfmt` detects when a line's only content is `--`. It adds the required
trailing space if it's missing, and it treats the rest of the file as part of
the signature, leaving it completely untouched.
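Sketched in Python, that behaviour could look like this. `split_signature` is a hypothetical helper, and it assumes that the first matching line starts the signature and that only trailing whitespace is ignored when matching the delimiter:

```python
def split_signature(lines):
    """Split lines into (body, signature), normalizing the delimiter to '-- '."""
    for i, line in enumerate(lines):
        if line.rstrip() == "--":
            # Restore the trailing space and keep the rest verbatim.
            return lines[:i], ["-- "] + lines[i + 1:]
    return lines, []


email = ["Hello,", "", "--", "Daniel", "Software Developer, Company"]
body, signature = split_signature(email)
print(body)       # ['Hello,', '']
print(signature)  # ['-- ', 'Daniel', 'Software Developer, Company']
```

Only the body would then be handed to the wrapping logic; the signature lines are appended back unchanged.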
Headers
Raw emails contain many headers. Even if you're reading/writing in
plain–text, it's likely that your client strips these. However, in some
cases, you may want to insert a header or two manually. Luckily, headers are
easily matched by regex, so `mailfmt` can ignore them without any issues.
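For illustration, a header matcher can be as short as the sketch below. The pattern is an assumption rather than `mailfmt`'s actual regex: it accepts a field name made of printable ASCII characters other than the colon, followed by a colon and then whitespace or the end of the line:

```python
import re

# Field name: printable ASCII except ':' (and no spaces), then a colon.
# Requiring whitespace or end-of-line after the colon is an extra
# assumption that keeps bare URLs from matching.
HEADER = re.compile(r"^[!-9;-~]+:(?:\s|$)")


def is_header(line: str) -> bool:
    return HEADER.match(line) is not None


print(is_header("Subject: Hello"))                 # True
print(is_header("Reply-To: someone@example.com"))  # True
print(is_header("This is a sentence."))            # False
print(is_header("https://example.com"))            # False
```

A prose line such as `Note: remember the attachment` also matches this pattern, so a stricter formatter might only look for headers at the top of the message.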
Consistent multipart emails
Something you may want to do is generate a `multipart/alternative` email. This
means that both an HTML and a plain-text representation of the same email are
included in the message, leaving it up to the reader's client to pick which
one to display.
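To make the structure concrete, here is how such a message can be assembled with Python's standard `email` package. The addresses and contents are placeholders, and this is not part of `mailfmt`; it only shows the two alternative parts sitting side by side:

```python
from email.message import EmailMessage

plain = "Hi,\n\nThis is the plain-text part.\n\nBest wishes,\nDaniel\n"
html = "<p>Hi,</p><p>This is the HTML part.</p><p>Best wishes,<br>Daniel</p>"

msg = EmailMessage()
msg["Subject"] = "Example"
msg["From"] = "sender@example.com"
msg["To"] = "recipient@example.com"

msg.set_content(plain)                     # becomes the text/plain part
msg.add_alternative(html, subtype="html")  # turns msg into multipart/alternative

print(msg.get_content_type())  # multipart/alternative
for part in msg.iter_parts():
    print(part.get_content_type())
```

The plain-text part comes first and the HTML part second; clients that can render HTML typically prefer the last alternative they understand.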
The plain-text email must be able to stand on its own, and should also render to decent-looking HTML. Essentially, you want to write your email in plain-text once, ensuring it has proper formatting, and then use a command to generate an HTML email from it.
For this, `mailfmt` provides the `--markdown-safe` flag, which appends
backslashes to the formatted output, making it safe for Markdown parsing
without messing up the line breaks after sign-offs and signature blocks.
Note that the only thing this does is output Markdown with hard line breaks. It's the user's responsibility to write the pipeline for generating the email file. For example, I use the following in aerc to generate an HTML multipart email whenever I want:
[multipart-converters]
text/html=mailfmt --markdown-safe | pandoc -f markdown -t html --standalone
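The hard-break idea itself can be sketched in a few lines of Python. Which lines `mailfmt` actually marks is not spelled out here, so this hypothetical `add_hard_breaks` helper simply assumes that every line break inside a block should survive Markdown rendering:

```python
def add_hard_breaks(lines):
    """Append a backslash to each non-empty line followed by a non-empty line."""
    out = []
    for i, line in enumerate(lines):
        following = lines[i + 1] if i + 1 < len(lines) else ""
        if line.strip() and following.strip():
            out.append(line + "\\")  # Markdown hard line break
        else:
            out.append(line)         # last line of a block, or a blank line
    return out


signoff = ["Best wishes,", "Daniel", "", "Next paragraph."]
print("\n".join(add_hard_breaks(signoff)))
```

A Markdown parser then renders `Best wishes,\` followed by `Daniel` with a line break between them instead of merging the two lines into one.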
Conclusion
If you've made it this far, thanks for sticking with me and reading to the end!
Even if you don't plan to write plain–text email or use `mailfmt` at all, I
hope you learned something interesting.
[^1]: The first email was sent in 1971; HTML was specified in 1990.