---
title: Email Formatting Is Harder Than It Looks
date: 2025-07-14
---

*[UTF-8]: Unicode Transformation Format – 8 bit. Text encoding standard.
*[plain–text]: Content representing only readable characters, and whitespace characters that affect the arrangement of the text.

[Kakoune]: https://kakoune.org

[^html]: The first email was sent in 1971 --- HTML was specified in 1990.

[TOC]

## Plain text email

As I've [mentioned before](./email-in-kakoune.md), I like using [Kakoune] for reading & writing emails. Of course, Kakoune is a source code editor, not a _rich text_ editor. It operates on UTF-8 _plain–text_ --- which means that the emails I write need to be in plain text, too.

As it turns out, plain-text email (which predates HTML by decades[^html]) hasn't left a "legacy" so much as it _hasn't actually gone anywhere_. Many developers swear by it; some are even so committed as to automatically filter `text/html` mail as spam. If you want to learn more, you can begin by reading [Drew DeVault's post](https://drewdevault.com/2016/04/11/Please-use-text-plain-for-emails.html), [useplaintext.email](https://useplaintext.email/#why-plaintext), and sourcehut's [mailing list etiquette](https://man.sr.ht/lists.sr.ht/etiquette.md) guide.

As I went down the `text/plain` path, I quickly learned that I needed an **email formatter**. Why? Plain text is like source code. You can't rely on the recipient's mail client to render it in a certain way --- you have to assume that what you see is _exactly_ what _they_ get.

On one hand, this isn't really a problem --- the whole point of plain text is _not_ having to bother with formatting, right? There is, however, a crucial catch: **line wrapping**.

## The wrapping problem

For as long as we (humanity) have been _writing_ text, we've been _wrapping_ it. Pages, after all, have finite width. At some point, an ongoing sentence needs to continue on the line below it. This is called _wrapping_.
In digital text, there are two kinds of wrapping: **soft** and **hard**. The former is much more common, and we often take it for granted.

**Hard-wrapped text** is the simplest: the line breaks are directly part of the source. If you're writing a sentence that's getting too long, you simply press `Enter` to begin a new line. The author is responsible for all line breaks. This guarantees that (assuming the renderer doesn't reflow text) the output will always look _exactly_ how it does in the editor.

**Soft-wrapped text** has line breaks inserted by the _renderer_ --- they're _not_ present in the source file. It's incredibly convenient! As writers, we don't need to worry at all about line breaks; only paragraph breaks. We can trust that the text _will_ be wrapped properly whenever it's viewed.

Now... remember how I just said that, in the context of plain text email, we can't make _any_ assumptions about how the text will be rendered? This applies to wrapping, too. _Some_ mail clients may wrap text, **but not all of them**. This essentially consigns us to hard-wrapping our emails.

The problem? _It's inconvenient!_ Imagine you edit a paragraph and remove a sentence. Well, now that entire paragraph's spacing is messed up, and you need to manually reflow it and fix the line breaks. Yuck!

## The Markdown complication

### Standard tools

At this point, some of you may be screaming: _"but what about `fmt` and `fold`?"_ There exist utilities meant to solve this specific problem, included in most Linux distributions out of the box!

Well, you would be right. _Sort of_. It's true that we already have excellent, composable commands for wrapping and paragraph formatting. A simple `#!fish cat email.txt | fmt >wrapped.txt` is enough to cover many cases. (Write to a _different_ file: redirecting back into `email.txt` would truncate it before `cat` gets to read it.)

However, there's a problem: **these tools are markup agnostic**. Why is that a problem when I literally [just](#plain-text-email) said we don't care about markup?
Well, there are _some_ markup formats that are delightfully readable even in plain--text. Consider the following _unordered list_ in HTML (HyperText **Markup** Language):

```html
<ul>
  <li>Foobar</li>
  <li>Barfoo</li>
</ul>
```

See, machines can read this no problem... but people? We struggle. Now, consider the exact same list expressed in [Markdown](https://en.wikipedia.org/wiki/Markdown):

```markdown
- Foobar
- Barfoo
```

Isn't that so much nicer? As it turns out, Markdown isn't only meant to make writing HTML easier --- it's also a great way to enhance the _semantics_ of plain text.

**This** is where we run up against issues with `fmt` & company: because they're not _aware_ of Markdown syntax, they have a tendency to **break** it. Consider the unordered list example from before:

```console
$ cat list.md | fmt
- Foobar - Barfoo
```

The tool has _no idea_ this is meant to be a list. It just treats whitespace-separated tokens as words and reflows paragraphs accordingly.

### Markdown formatters

My immediate next thought was to try an actual Markdown formatter. Not only do they _also_ handle wrapping & reflow, they won't break the markup. I gave it a shot, and to my horror, I found that they have the _opposite_ problem: they preserve markup, but they break [signature blocks](#signature-blocks), [sign-offs](#sign-offs), and [headers](#headers)!

## Writing `mailfmt`

I eventually wrote [`mailfmt`](https://git.ficd.sh/ficd/mailfmt) to fill the niche of email formatting. It provides consistent paragraph spacing, hard-wrapping, and paragraph reflow, while preserving Markdown syntax, email headers, quotes, sign-offs, and signature blocks. Additionally, the wrapped output can be made safe for passing to a Markdown parser. This is useful if you want to build an HTML email from plain-text.

`mailfmt` is open-source under the ISC license, and is available on [PyPI](https://pypi.org/project/mailfmt/) for installation with tools like `pipx` and `uv`.
The source code is available on sourcehut at [git.ficd.sh/ficd/mailfmt](https://git.ficd.sh/ficd/mailfmt).

I wrote this tool primarily for myself. It's served me very well over the past few months. `mailfmt` could be helpful for anyone who prefers writing email in plain-text using text editors like Kakoune, Helix, and Vim. It can format via `stdin`/`stdout` and read/write files, making `mailfmt` easy to configure as a formatter for the `mail` filetype in your editor.

### My requirements

- A way to consistently format my outgoing emails in my text editor.
- Paragraph reflow and automatic line wrapping.
- Ability to use Markdown syntax:
    - Without it being broken by reflow & wrap.
    - While looking good and retaining the same semantics in _both_ rendered **and** plain-text form (ideal for `multipart` emails).
- _Ensure_ proper formatting of [signature blocks](#signature-blocks).
- _Preserve_ formatting of [sign-offs](#sign-offs).

### Wrap & reflow

It turns out that the most important part was also the easiest to implement. Python's standard library includes [`textwrap`](https://docs.python.org/3/library/textwrap.html), which _literally_ just does it for you. So the _real_ challenge becomes figuring out _what to wrap_, versus **what to ignore**.

### Preserving Markdown

Getting my tool to preserve Markdown was fairly straightforward. I'm not building a _Markdown formatter_, I'm building _a formatter that doesn't break Markdown_. In other words, I don't need to _parse_ Markdown syntax; just recognize it, **and ignore it**.

`mailfmt`'s approach is simple: detect when a line matches a known pattern of Markdown block element syntax, such as leading `#` for headings, `-` for lists, etc. If so, **leave the line untouched**. Similarly, **don't format anything inside fenced code blocks**.

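
The "recognize it and ignore it" idea can be sketched in a few lines on top of `textwrap`. This is only an illustration under my own assumptions: the `reflow` function, the two regex patterns, and the default width are hypothetical and are not taken from `mailfmt`'s source.

```python
import re
import textwrap

# Hypothetical patterns for this sketch; mailfmt's real rules may differ.
FENCE = re.compile(r"^\s*(`{3,}|~{3,})")                   # code fence open/close
BLOCK = re.compile(r"^\s*(#{1,6}\s|[-*+]\s|\d+[.)]\s|>)")  # heading, list, quote


def reflow(text: str, width: int = 74) -> str:
    """Hard-wrap plain paragraphs; pass Markdown-looking lines through as-is."""
    out: list[str] = []
    paragraph: list[str] = []
    in_fence = False

    def flush() -> None:
        # Join the buffered lines into one paragraph, then hard-wrap it.
        if paragraph:
            out.extend(
                textwrap.wrap(
                    " ".join(paragraph),
                    width=width,
                    break_long_words=False,  # keep long URLs in one piece
                    break_on_hyphens=False,
                )
            )
            paragraph.clear()

    for line in text.splitlines():
        if FENCE.match(line):
            # A fence ends any open paragraph and toggles "verbatim" mode.
            flush()
            in_fence = not in_fence
            out.append(line)
        elif in_fence or BLOCK.match(line) or not line.strip():
            # Code, Markdown block syntax, and blank lines are left untouched.
            flush()
            out.append(line)
        else:
            paragraph.append(line.strip())
    flush()
    return "\n".join(out)
```

Everything that isn't recognized as Markdown is buffered and wrapped as a paragraph, so the sketch never needs to parse Markdown, only to spot it. A real formatter would need more cases than this (continuation lines of list items, for instance).
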
### Sign-offs

Consider the following sign-off:

```
Best wishes,
Daniel
```

A Markdown formatter considers this to be one paragraph and reflows it accordingly, causing it to lose its semantic meaning:

```
Best wishes, Daniel
```

Within the confines of Markdown, I counted three ways of dealing with the problem:

1. Put an empty line between the two parts:

    ```
    Best wishes,

    Daniel
    ```

    > However, this empty line looks a tad awkward when viewed in plain--text.

2. Put a backslash after the intentional line break:

    ```
    Best wishes, \
    Daniel
    ```

    > Again, this looks bad when the Markdown isn't rendered.

3. Put two spaces after the intentional line break (`•` = space):

    ```
    Best•wishes,••
    Daniel
    ```

    > This syntax is **ambiguous, easy to forget**, and **not supported by editors that trim trailing whitespace.**

`mailfmt` detects sign-offs using a very simple heuristic. First, we check if a line has _5 or fewer_ words and **ends with a comma**. If we find such a line, we check the _next_ line. If it has 5 or fewer words **that all begin with an uppercase letter**, then we assume these two lines are a _sign-off_, and we don't reflow or wrap them. The heuristic matches a very simple pattern:

```
A courteous salutation,
First Middle Last Name
```

### Signature blocks

The [standard](https://en.wikipedia.org/wiki/Signature_block#Standard_delimiter) for signature blocks is as follows:

1. Begins with two `-` characters followed by a single space, then a newline.
2. Everything that follows until the EOF is part of the signature.

*[EOF]: End of file.

Here's an example (`•` = space):

```
--•
Daniel
Software•Developer,•Company
email@website.com
```

As with sign-offs, such a signature block gets mangled by other formatters. Furthermore, the single space after the `--` token is important: if it's missing, some clients won't recognize it as a valid signature. `mailfmt` detects when a line's _only_ content is `--`.
It adds the required trailing space if it's missing, and it treats the rest of the file as part of the signature, leaving it completely untouched.

## Headers

Raw emails contain many [headers](https://en.wikipedia.org/wiki/Email#Message_header). Even if you're reading/writing in plain--text, it's likely that your client strips these. However, in some cases, you may want to insert a header or two manually. Luckily, headers are easily matched by [regex](https://en.wikipedia.org/wiki/Regular_expression), so `mailfmt` can ignore them without any issues.

## Consistent multipart emails

Something you may want to do is generate a `multipart/alternative` email. This means that _both_ an HTML **and** plain-text representation of the _same_ email are included in the file, leaving it up to the reader's client to pick which one to display. The plain-text email **must** be able to stand on its own, and should _also_ render to decent-looking HTML.

Essentially, you want to write your email in plain-text once, ensuring it has proper formatting, and then use a command to generate an HTML email from it. For this, `mailfmt` provides the `--markdown-safe` flag, which appends backslashes to the formatted output, making it safe for Markdown parsing without messing up the line breaks after sign-offs and signature blocks.

Note that the **only** thing this does is output Markdown with hard line breaks. It's the user's responsibility to write the pipeline for generating the email file. For example, I use the following in [aerc](https://aerc-mail.org/) to generate an HTML multipart email whenever I want:

```ini
[multipart-converters]
text/html=mailfmt --markdown-safe | pandoc -f markdown -t html --standalone
```

## Conclusion

If you've made it this far, thanks for sticking with me and reading to the end! Even if you don't plan to write plain--text email or use `mailfmt` at all, I hope you learned something interesting.