update email formatting post draft
This commit is contained in:
parent
ce89015228
commit
b82ab92586
1 changed files with 165 additions and 66 deletions
|
@ -1,10 +1,12 @@
|
||||||
---
|
---
|
||||||
title: Email Formatting Is Harder Than It Looks
|
title: Email Formatting Is Harder Than It Looks
|
||||||
date: 2025-07-13
|
date: 2025-07-14
|
||||||
draft: true
|
draft: true
|
||||||
---
|
---
|
||||||
|
|
||||||
*[UTF-8]: Unicode Transformation Format - 8 bit.
|
*[UTF-8]: Unicode Transformation Format – 8 bit. Text encoding standard.
|
||||||
|
|
||||||
|
*[plain–text]: Content representing only readable characters, and whitespace characters that affect the arrangement of the text.
|
||||||
|
|
||||||
[Kakoune]: https://kakoune.org
|
[Kakoune]: https://kakoune.org
|
||||||
|
|
||||||
|
@ -12,10 +14,12 @@ draft: true
|
||||||
|
|
||||||
[TOC]
|
[TOC]
|
||||||
|
|
||||||
|
## Plain text email
|
||||||
|
|
||||||
As I've [mentioned before](./email-in-kakoune.md), I like using [Kakoune] for
|
As I've [mentioned before](./email-in-kakoune.md), I like using [Kakoune] for
|
||||||
reading & writing emails. Of course, Kakoune is a text editor, not a _rich text_
|
reading & writing emails. Of course, Kakoune is a source code editor, not a _rich
|
||||||
editor. It operates on UTF-8 _plain text_ --- which means that the emails I
|
text_ editor. It operates on UTF-8 _plain–text_ --- which means that the emails
|
||||||
write need to be in plain text, too.
|
I write need to be in plain text, too.
|
||||||
|
|
||||||
As it turns out, plain-text email (which predates HTML by decades[^html]) hasn't
|
As it turns out, plain-text email (which predates HTML by decades[^html]) hasn't
|
||||||
really left a "legacy" so much as it _hasn't actually gone anywhere_. Many
|
really left a "legacy" so much as it _hasn't actually gone anywhere_. Many
|
||||||
|
@ -26,72 +30,151 @@ developers swear by it; some are even so committed as to automatically filter
|
||||||
[mailing list etiquette](https://man.sr.ht/lists.sr.ht/etiquette.md) guide.
|
[mailing list etiquette](https://man.sr.ht/lists.sr.ht/etiquette.md) guide.
|
||||||
|
|
||||||
As I went down the `text/plain` path, I quickly learned that I needed an **email
|
As I went down the `text/plain` path, I quickly learned that I needed an **email
|
||||||
formatter**. Plain text is like source code. You can't rely on the recipient's
|
formatter**. Why? Plain text is like source code. You can't rely on the
|
||||||
mail client to render it in a certain way --- most often, what you see is
|
recipient's mail client to render it in a certain way --- you have to assume
|
||||||
_exactly_ what they get.
|
that what you see is _exactly_ what _they_ get.
|
||||||
|
|
||||||
I eventually wrote [`mailfmt`](https://git.ficd.sh/ficd/mailfmt) to fill this
|
On one hand, this isn't really a problem --- the whole point of plain text is
|
||||||
niche. It provides consistent paragraph spacing, hard-wrapping and paragraph
|
_not_ having to bother with formatting, right? There is, however, a crucial
|
||||||
reflow, while preserving Markdown syntax, email headers, quotes, sign-offs, and
|
catch: **line wrapping**.
|
||||||
signature blocks. Additionally, the wrapped output can be made safe for passing
|
|
||||||
to a Markdown parser. This is useful if you want to build an HTML email from
|
## The wrapping problem
|
||||||
plain-text.
|
|
||||||
|
For as long as we (humanity) have been _writing_ text, we've been _wrapping_ it. Pages,
|
||||||
|
after all, have finite width. At some point, an ongoing sentence needs to
|
||||||
|
continue on the line below it. This is called _wrapping_. In digital text, there
|
||||||
|
are two kinds of wrapping: **soft** and **hard**. The former is much more
|
||||||
|
common, and we often take it for granted.
|
||||||
|
|
||||||
|
**Hard-wrapped text** is the simplest: the line breaks are directly part of the
|
||||||
|
source. If you're writing a sentence that's getting too long, you simply press
|
||||||
|
`<ret>` to begin a new line. The author is responsible for all line breaks. This
|
||||||
|
guarantees that (assuming the renderer doesn't reflow text) the output will
|
||||||
|
always look _exactly_ how it does in the editor.
|
||||||
|
|
||||||
|
**Soft-wrapped text** has line breaks inserted by the _renderer_ --- they're
|
||||||
|
_not_ present in the source file. It's incredibly convenient! As writers, we
|
||||||
|
don't need to worry at all about line breaks; only paragraph breaks. We can
|
||||||
|
trust that the text _will_ be wrapped properly whenever it's viewed.
|
||||||
|
|
||||||
|
Now... remember how I just said that, in the context of plain text email, we
|
||||||
|
can't make _any_ assumptions about how the text will be rendered? This applies
|
||||||
|
to wrapping, too. _Some_ mail clients may wrap text, **but not all of them**.
|
||||||
|
This essentially consigns us to hard-wrapping our emails.
|
||||||
|
|
||||||
|
The problem? _It's inconvenient!_ Imagine you edit a paragraph, and remove a
|
||||||
|
sentence. Well, now that entire paragraph's spacing is messed up, and you need
|
||||||
|
to manually reflow it and fix the line breaks. Yuck!
|
||||||
|
|
||||||
|
## The Markdown complication
|
||||||
|
|
||||||
|
### Standard tools
|
||||||
|
|
||||||
|
At this point, some of you may be screaming: _"but what about `fmt` and
|
||||||
|
`fold`?"_ There exist utilities meant to solve this specific problem, included
|
||||||
|
in most Linux distributions out-of-the-box! Well, you would be right. _Sort of_.
|
||||||
|
|
||||||
|
It's true that we already have excellent, composable commands for wrapping and
|
||||||
|
paragraph formatting. A simple `#!fish fmt email.txt >wrapped.txt` is enough
|
||||||
|
to cover many cases. However, there's a problem: **these tools are markup
|
||||||
|
agnostic**.
|
||||||
|
|
||||||
|
Why is that a problem when I literally [just](#plain-text-email) said we don't
|
||||||
|
care about markup? Well, there are _some_ markup formats that are delightfully
|
||||||
|
readable even in plain--text. Consider the following _unordered list_ in HTML
|
||||||
|
(HyperText **Markup** Language):
|
||||||
|
|
||||||
|
```html
|
||||||
|
<ul>
|
||||||
|
<li>Foobar</li>
|
||||||
|
<li>Barfoo</li>
|
||||||
|
</ul>
|
||||||
|
```
|
||||||
|
|
||||||
|
See, machines can read this no problem... but people? We struggle. Now, consider
|
||||||
|
the exact same list expressed in [Markdown](https://en.wikipedia.org/wiki/Markdown):
|
||||||
|
|
||||||
|
```markdown
|
||||||
|
- Foobar
|
||||||
|
- Barfoo
|
||||||
|
```
|
||||||
|
|
||||||
|
Isn't that so much nicer? As it turns out, markup isn't only meant to make
|
||||||
|
writing HTML easier --- it's also a great way to enhance the _semantics_ of
|
||||||
|
plain text.
|
||||||
|
|
||||||
|
**This** is where we run up against issues with `fmt` & company: because they're
|
||||||
|
not _aware_ of Markdown syntax, they have a tendency to **break** it. Consider
|
||||||
|
the unordered list example from before:
|
||||||
|
|
||||||
|
```console
|
||||||
|
$ cat list.md | fmt
|
||||||
|
- Foobar - Barfoo
|
||||||
|
```
|
||||||
|
|
||||||
|
The tool has _no idea_ this is meant to be a list. It just treats
|
||||||
|
whitespace-separated tokens as words and reflows paragraphs accordingly.
|
||||||
|
|
||||||
|
### Markdown formatters
|
||||||
|
|
||||||
|
My immediate next thought was to try an actual Markdown formatter. Not only do
|
||||||
|
they _also_ handle wrapping & reflow, they won't break the markup. I gave it a
|
||||||
|
shot, and to my horror, I found that they have the _opposite_ problem: they
|
||||||
|
preserve markup, but they break [signature blocks](#signature-blocks),
|
||||||
|
[sign-offs](#sign-offs), and [headers](#headers)!
|
||||||
|
|
||||||
|
## Writing `mailfmt`
|
||||||
|
|
||||||
|
I eventually wrote [`mailfmt`](https://git.ficd.sh/ficd/mailfmt) to fill the
|
||||||
|
niche of email formatting. It provides consistent paragraph spacing,
|
||||||
|
hard-wrapping and paragraph reflow, while preserving Markdown syntax, email
|
||||||
|
headers, quotes, sign-offs, and signature blocks. Additionally, the wrapped
|
||||||
|
output can be made safe for passing to a Markdown parser. This is useful if you
|
||||||
|
want to build an HTML email from plain-text.
|
||||||
|
|
||||||
`mailfmt` is open-source under the ISC license, and is available on
|
`mailfmt` is open-source under the ISC license, and is available on
|
||||||
[PyPI](https://pypi.org/project/mailfmt/) for installation with tools like
|
[PyPI](https://pypi.org/project/mailfmt/) for installation with tools like
|
||||||
`pipx` and `uv`. The source code is available on sourcehut at
|
`pipx` and `uv`. The source code is available on sourcehut at
|
||||||
[git.ficd.sh/ficd/mailfmt](https://git.ficd.sh/ficd/mailfmt).
|
[git.ficd.sh/ficd/mailfmt](https://git.ficd.sh/ficd/mailfmt).
|
||||||
|
|
||||||
## Target Audience
|
|
||||||
|
|
||||||
I wrote this tool primarily for myself. It's served me very well over the past
|
I wrote this tool primarily for myself. It's served me very well over the past
|
||||||
few months. `mailfmt` could be helpful for anyone who prefers writing email in
|
few months. `mailfmt` could be helpful for anyone who prefers writing email in
|
||||||
plain-text using text editors like Kakoune, Helix, and Vim. It can format via
|
plain-text using text editors like Kakoune, Helix, and Vim. It can format via
|
||||||
`stdin`/`stdout` and read/write files, making `mailfmt` easy to configure as a
|
`stdin`/`stdout` and read/write files, making `mailfmt` easy to configure as a
|
||||||
formatter for the `mail` filetype in your editor.
|
formatter for the `mail` filetype in your editor.
|
||||||
|
|
||||||
I'm including a very lengthy explanation of exactly why I built this tool. You
|
### My requirements
|
||||||
may think it's overkill for such a small program — but I like to be crystal
|
|
||||||
clear about justifying my work. It reads like blog post rather than the
|
|
||||||
emoji-filled `README`/marketing style we're accustomed to seeing on this
|
|
||||||
platform. I've put a lot of thought into this, and I want to share my work. I
|
|
||||||
hope you enjoy reading about my thought process.
|
|
||||||
|
|
||||||
## Why I Built It (Comparison)
|
|
||||||
|
|
||||||
Unsurprisingly, it all started with a specific problem I was having composing
|
|
||||||
emails in plain-text format in my preferred text editor. As I searched for a
|
|
||||||
solution, I couldn't find anything that met all my needs, so I wrote it myself.
|
|
||||||
|
|
||||||
Here's what I wanted:
|
|
||||||
|
|
||||||
- A way to consistently format my outgoing emails in my text editor.
|
- A way to consistently format my outgoing emails in my text editor.
|
||||||
- Paragraph reflow and automatic line wrapping.
|
- Paragraph reflow and automatic line wrapping.
|
||||||
- Not all plain-text clients are capable of line-wrap. In some contexts, such
|
- Ability to use Markdown syntax:
|
||||||
as mailing lists, the author is expected to wrap the text themselves.
|
|
||||||
- Inline Markdown syntax `can _still_ look great, **even** in plain-text!` Thus,
|
|
||||||
I wanted to use it:
|
|
||||||
- Without it being broken by reflow & wrap.
|
- Without it being broken by reflow & wrap.
|
||||||
- While looking good and retaining the same semantics in _both_ rendered
|
- While looking good and retaining the same semantics in _both_ rendered
|
||||||
**and** plain-text form — ideal for `multipart` emails.
|
**and** plain-text form — ideal for `multipart` emails.
|
||||||
- Ensure signature block is formatted properly.
|
- _Ensure_ proper formatting of [signature blocks](#signature-blocks).
|
||||||
- The single space after `--` and before the newline **must** be included.
|
- _Preserve_ formatting of [sign-offs](#sign-offs).
|
||||||
|
|
||||||
### `fmt` and Markdown Formatters Don't Work For Email
|
### Wrap & reflow
|
||||||
|
|
||||||
The `fmt` utility provides great wrapping and reflow capabilities — I use it all
|
It turns out that the most important part was also the easiest to implement.
|
||||||
the time while writing LaTeX. However, it's syntax agnostic, and breaks
|
Python's standard library includes
|
||||||
Markdown. For example, it completely mangles fenced code blocks. I figured: hey,
|
[`textwrap`](https://docs.python.org/3/library/textwrap.html), which _literally_
|
||||||
why not just use a Markdown formatter? It supports Markdown (obviously), _and_
|
just does it for you. So the _real_ challenge becomes figuring out _what to
|
||||||
can reflow & wrap text! Here's the problem: it turns out treating your
|
wrap_, versus **what to ignore**.
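To give a sense of how little code the wrapping itself needs, here's a minimal
sketch built on `textwrap.fill`. The `reflow_paragraph` helper and the
74-column default are assumptions made for this example; they aren't taken
from `mailfmt`'s source.

```python
import textwrap


def reflow_paragraph(paragraph: str, width: int = 74) -> str:
    """Join a paragraph's lines and hard-wrap them at `width` columns."""
    # split() with no arguments collapses every run of whitespace,
    # including the old line breaks, so the paragraph is wrapped afresh.
    words = paragraph.split()
    return textwrap.fill(
        " ".join(words),
        width=width,
        break_long_words=False,  # never cut a long URL in half
        break_on_hyphens=False,  # keep words like "plain-text" intact
    )
```

Calling `reflow_paragraph(text, width=72)` on a paragraph with ragged lines
returns the same words with fresh line breaks, which is all "reflow" really
means here.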
|
||||||
**entire** email as a Markdown document isn't ideal.
|
|
||||||
|
### Preserving Markdown
|
||||||
|
|
||||||
|
Getting my tool to preserve Markdown was fairly straightforward. I'm not building
|
||||||
|
a _Markdown formatter_, I'm building _a formatter that doesn't break Markdown_.
|
||||||
|
In other words, I don't need to _parse_ Markdown syntax; just recognize it,
|
||||||
|
**and ignore it**.
|
||||||
|
|
||||||
`mailfmt`'s approach is simple: detect when a line matches a known pattern of
|
`mailfmt`'s approach is simple: detect when a line matches a known pattern of
|
||||||
Markdown block element syntax, such as leading `#` for headings, `-` for lists,
|
Markdown block element syntax, such as leading `#` for headings, `-` for lists,
|
||||||
etc. If so, **leave the line untouched**. Similarly, **don't format anything
|
etc. If so, **leave the line untouched**. Similarly, **don't format anything
|
||||||
inside fenced code blocks**.
|
inside fenced code blocks**.
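The "recognize it and ignore it" idea can be sketched roughly as follows. The
patterns and the `classify_lines` helper are illustrative assumptions, not
`mailfmt`'s actual rules; they only cover a handful of common block elements.

```python
import re

# Hypothetical patterns for lines to leave alone: headings, list items,
# block quotes, and table rows.
BLOCK_SYNTAX = re.compile(r"^\s*(#{1,6}\s|[-*+]\s|\d+[.)]\s|>|\|)")
# Opening or closing fence of a code block (backticks or tildes).
FENCE = re.compile(r"^\s*(`{3,}|~{3,})")


def classify_lines(lines: list[str]) -> list[tuple[str, bool]]:
    """Pair each line with True if it may be reflowed, False if not."""
    result = []
    in_fence = False
    for line in lines:
        if FENCE.match(line):
            # Fence markers toggle the state and are never reflowed.
            in_fence = not in_fence
            result.append((line, False))
        elif in_fence or not line.strip() or BLOCK_SYNTAX.match(line):
            # Code, blank lines, and block syntax pass through untouched.
            result.append((line, False))
        else:
            result.append((line, True))
    return result
```

Only runs of consecutive lines flagged `True` would then be handed to the
wrapping step; everything else is emitted exactly as written.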
|
||||||
|
|
||||||
#### Sign-Offs
|
### Sign-offs
|
||||||
|
|
||||||
Consider the following sign-off:
|
Consider the following sign-off:
|
||||||
|
|
||||||
|
@ -118,7 +201,7 @@ Best wishes,
|
||||||
Daniel
|
Daniel
|
||||||
```
|
```
|
||||||
|
|
||||||
> However, this empty line looks _awkward_ when viewed in plain-text.
|
> However, this empty line looks a tad awkward when viewed in plain--text.
|
||||||
|
|
||||||
2. Put a backslash after the intentional line break:
|
2. Put a backslash after the intentional line break:
|
||||||
|
|
||||||
|
@ -129,7 +212,7 @@ Daniel
|
||||||
|
|
||||||
> Again, this looks bad when the Markdown isn't rendered.
|
> Again, this looks bad when the Markdown isn't rendered.
|
||||||
|
|
||||||
3. Put two spaces after the intentional line break (• = space):
|
3. Put two spaces after the intentional line break (`•` = space):
|
||||||
|
|
||||||
```
|
```
|
||||||
Best•wishes,••
|
Best•wishes,••
|
||||||
|
@ -146,17 +229,20 @@ uppercase letter**, then we assume these two lines are a _sign-off_, and we
|
||||||
don't reflow or wrap them. The heuristic matches a very simple pattern:
|
don't reflow or wrap them. The heuristic matches a very simple pattern:
|
||||||
|
|
||||||
```
|
```
|
||||||
A courteous greeting,
|
A courteous salutation,
|
||||||
First Middle Last Name
|
First Middle Last Name
|
||||||
```
|
```
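A heuristic along these lines could be written as a pair of regular
expressions. This is only a sketch: the length limit, the word count, and the
`is_sign_off` helper are assumptions for illustration, and `mailfmt`'s real
rule may differ.

```python
import re

# Assumed shape of a sign-off: a short first line ending in a comma,
# followed by a line made up of one to five capitalized words.
CLOSING = re.compile(r"^[A-Z][^\n]{0,40},$")
NAME = re.compile(r"^[A-Z][\w'.-]*(?: [A-Z][\w'.-]*){0,4}$")


def is_sign_off(first: str, second: str) -> bool:
    """Return True if the two lines look like a closing and a name."""
    return bool(CLOSING.match(first.strip()) and NAME.match(second.strip()))
```

With these patterns, `is_sign_off("Best wishes,", "Daniel")` is true, while a
sentence that merely ends in a comma and continues in lowercase on the next
line is not treated as a sign-off.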
|
||||||
|
|
||||||
#### Signature Block
|
### Signature blocks
|
||||||
|
|
||||||
The convention for signature blocks is as follows:
|
The [standard](https://en.wikipedia.org/wiki/Signature_block#Standard_delimiter)
|
||||||
|
for signature blocks is as follows:
|
||||||
|
|
||||||
1. Begins with two `-` characters followed by a single space, then a newline.
|
1. Begins with two `-` characters followed by a single space, then a newline.
|
||||||
2. Everything that follows until the EOF is part of the signature.
|
2. Everything that follows until the EOF is part of the signature.
|
||||||
|
|
||||||
|
*[EOF]: End of file.
|
||||||
|
|
||||||
Here's an example (note the • = space):
|
Here's an example (note the • = space):
|
||||||
|
|
||||||
```
|
```
|
||||||
|
@ -167,31 +253,44 @@ Software•Developer,•Company
|
||||||
email@website.com
|
email@website.com
|
||||||
```
|
```
|
||||||
|
|
||||||
As with sign-offs, such a signature block gets mangled by Markdown formatters.
|
As with sign-offs, such a signature block gets mangled by other formatters.
|
||||||
Furthermore, the single space after the `--` token is important: if it's
|
Furthermore, the single space after the `--` token is important: if it's
|
||||||
missing, some clients won't recognize it is a valid signature — our formatter
|
missing, some clients won't recognize it as a valid signature.
|
||||||
should address this too.
|
|
||||||
|
|
||||||
`mailfmt` detects when a line's _only_ content is `--`. It adds the required
|
`mailfmt` detects when a line's _only_ content is `--`. It adds the required
|
||||||
trailing space if it's missing, and it treats the rest of the input as part of
|
trailing space if it's missing, and it treats the rest of the file as part of
|
||||||
the signature, leaving it completely untouched.
|
the signature, leaving it completely untouched.
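The behaviour described above can be sketched in a few lines. The
`split_signature` helper is hypothetical, and it assumes the first line
containing only `--` starts the signature.

```python
SIGNATURE_DELIMITER = "-- "  # two hyphens plus the required trailing space


def split_signature(lines: list[str]) -> tuple[list[str], list[str]]:
    """Split a message into (body, signature) at the delimiter line."""
    for index, line in enumerate(lines):
        # Accept "--" with or without trailing whitespace, nothing else.
        if line.strip() == "--":
            # Normalize the delimiter; keep the rest of the file verbatim.
            signature = [SIGNATURE_DELIMITER] + lines[index + 1:]
            return lines[:index], signature
    return lines, []
```

Only the body half would go through wrapping and reflow; the signature half
is written back out unchanged, apart from the repaired delimiter.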
|
||||||
|
|
||||||
### Consistent Multipart Emails
|
## Headers
|
||||||
|
|
||||||
Something you may want to do is generate a `multipart` email. This means that
|
Raw emails contain many
|
||||||
_both_ an HTML **and** plain-text representation of the _same_ email are
|
[headers](https://en.wikipedia.org/wiki/Email#Message_header). Even if you're
|
||||||
|
reading/writing in plain--text, it's likely that your client strips these.
|
||||||
|
However, in some cases, you may want to insert a header or two manually.
|
||||||
|
Luckily, headers are easily matched by
|
||||||
|
[regex](https://en.wikipedia.org/wiki/Regular_expression), so `mailfmt` can
|
||||||
|
ignore them without any issues.
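As a rough sketch of what such a match might look like: the pattern below
accepts a field name, a colon, and a value, and the `count_header_lines`
helper is a hypothetical name. It also assumes headers only appear at the very
top of the message, which `mailfmt` may or may not require.

```python
import re

# A field name, a colon, then an optional value. Letters, digits, and
# hyphens cover common names such as "Subject" or "In-Reply-To".
HEADER = re.compile(r"^[A-Za-z][A-Za-z0-9-]*:(\s.*)?$")


def count_header_lines(lines: list[str]) -> int:
    """Return how many leading lines form the header block."""
    count = 0
    for line in lines:
        # A folded header continues on lines that start with whitespace.
        is_continuation = count > 0 and line[:1] in (" ", "\t")
        if HEADER.match(line) or is_continuation:
            count += 1
        else:
            break
    return count
```

Restricting the check to the top of the message avoids mistaking body text
like `Note: see below` for a header.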
|
||||||
|
|
||||||
|
## Consistent multipart emails
|
||||||
|
|
||||||
|
Something you may want to do is generate a `multipart/alternative` email. This means
|
||||||
|
that _both_ an HTML **and** plain-text representation of the _same_ email are
|
||||||
included in the file — leaving it up to the reader's client to pick which one to
|
included in the file — leaving it up to the reader's client to pick which one to
|
||||||
display.
|
display.
|
||||||
|
|
||||||
The plain-text email **must** be able to stand on its own, and _also_ render to
|
The plain-text email **must** be able to stand on its own, and should _also_
|
||||||
decent-looking HTML. Essentially, you want to write your email in plain-text
|
render to decent-looking HTML. Essentially, you want to write your email in
|
||||||
once, ensuring it has proper formatting, and then use a command to generate an
|
plain-text once, ensuring it has proper formatting, and then use a command to
|
||||||
HTML email from it. For this, `mailfmt` provides the `--markdown-safe` flag,
|
generate an HTML email from it.
|
||||||
which appends backslashes to the formatted output, making it safe for Markdown
|
|
||||||
parsing without messing up the line breaks after sign-offs and signature blocks.
|
|
||||||
|
|
||||||
For example, I use the following in [aerc](https://aerc-mail.org/) to generate
|
For this, `mailfmt` provides the `--markdown-safe` flag, which appends
|
||||||
an HTML multipart email whenever I want:
|
backslashes to the formatted output, making it safe for Markdown parsing without
|
||||||
|
messing up the line breaks after sign-offs and signature blocks.
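In Markdown, a backslash at the end of a line forces a hard line break. One
plausible reading of the flag, sketched below, is that lines whose breaks must
survive get a trailing backslash when more text follows. The `add_hard_breaks`
helper and its `preserved` argument are assumptions for illustration, not
`mailfmt`'s real interface.

```python
def add_hard_breaks(lines: list[str], preserved: set[int]) -> list[str]:
    """Append a backslash to preserved lines that are followed by text."""
    result = []
    for index, line in enumerate(lines):
        has_next = index + 1 < len(lines) and lines[index + 1].strip() != ""
        if index in preserved and line.strip() and has_next:
            # A trailing backslash is a Markdown hard line break.
            result.append(line + "\\")
        else:
            result.append(line)
    return result
```

Given a sign-off on lines 2 and 3, `add_hard_breaks(lines, {2, 3})` marks only
the first of the two, since the last line has nothing after it to break from.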
|
||||||
|
|
||||||
|
Note that the **only** thing this does is output Markdown with hard line breaks.
|
||||||
|
It's the user's responsibility to write the pipeline for generating the email
|
||||||
|
file. For example, I use the following in [aerc](https://aerc-mail.org/) to
|
||||||
|
generate an HTML multipart email whenever I want:
|
||||||
|
|
||||||
```ini
|
```ini
|
||||||
[multipart-converters]
|
[multipart-converters]
|
||||||
|
@ -201,5 +300,5 @@ text/html=mailfmt --markdown-safe | pandoc -f markdown -t html --standalone
|
||||||
## Conclusion
|
## Conclusion
|
||||||
|
|
||||||
If you've made it this far, thanks for sticking with me and reading to the end!
|
If you've made it this far, thanks for sticking with me and reading to the end!
|
||||||
Even if you don't plan to write plain-text email or use `mailfmt` at all, I hope
|
Even if you don't plan to write plain--text email or use `mailfmt` at all, I
|
||||||
you learned something interesting.
|
hope you learned something interesting.
|
||||||
|
|