added markdown safety

Daniel Fichtinger 2025-06-03 14:09:58 -04:00
parent d5551d2269
commit 804e5a22eb
2 changed files with 164 additions and 8 deletions

147
README.md

@@ -1,11 +1,26 @@
# Mail Format
<h1>Mail Format</h1>
`mailfmt` is a simple plain text email formatter. It's designed to ensure
consistent paragraph spacing while preserving markdown syntax, email headers,
sign-offs, and signature blocks.
By default, this script accepts its input on `stdin` and prints to `stdout`.
This makes it well suited for use with an editor like Helix. It has no
dependencies besides the standard Python interpreter, and was written and tested
against Python 3.13.2.
This makes it well suited for use as a formatter with a text editor like Kakoune
or Helix. It has no dependencies besides the standard Python interpreter, and
was written and tested against Python 3.13.3.
**Features:**
<!--toc:start-->
- [Features](#features)
- [Usage](#usage)
- [Output Example](#output-example)
- [Markdown Safety](#markdown-safety)
- [Aerc Integration](#aerc-integration)
- [Contributing](#contributing)
<!--toc:end-->
## Features
- Wraps emails at specified columns.
- Automatically reflows paragraphs.
@@ -18,12 +33,16 @@ against Python 3.13.2.
- Markdown-style code blocks.
- Usenet-style signature block at EOF.
- Sign-offs.
- If specified, output can be made safe for passing to a Markdown renderer.
- Use case: piping the output to `pandoc` to write a `text/html` message. See
[Markdown Safety](#markdown-safety).
**Usage:**
## Usage
```
usage: format.py [-h] [-w WIDTH] [-b] [--no-replace-whitespace] [--no-reflow]
[--no-signoff] [--no-signature] [--no-squash] [-i INPUT] [-o OUTPUT]
usage: mailfmt [-h] [-w WIDTH] [-b] [--no-replace-whitespace] [--no-reflow]
[--no-signoff] [--no-signature] [--no-squash] [-m] [-i INPUT]
[-o OUTPUT]
Formatter for plain text email.
"--no-*" options are NOT passed by default.
@@ -39,6 +58,118 @@ options:
--no-signoff Don't preserve signoff line breaks.
--no-signature Don't preserve signature block.
--no-squash Don't squash consecutive paragraph breaks.
-m, --markdown-safe Output format safe for Markdown rendering.
-i, --input INPUT Input file. (default: STDIN)
-o, --output OUTPUT Output file. (default: STDOUT)
Author : Daniel Fichtinger
License: ISC
Contact: daniel@ficd.ca
```
## Output Example
Before:
```
Hey,
This is a really long paragraph with lots of words in it. However, my text editor uses soft-wrapping, so it ends up looking horrible when viewed without wrapping! Additionally,
if I manually add some line breaks, things start to look _super_ janky!
I can't just pipe this to `fmt` because it may break my beautiful
markdown
syntax. Markdown formatters are also problematic because they mess up
my signoff and signature blocks! What should I do?
Best wishes,
Daniel
--
Daniel
sr.ht/~ficd
daniel@ficd.ca
```
After:
```
Hey,
This is a really long paragraph with lots of words in it. However, my text
editor uses soft-wrapping, so it ends up looking horrible when viewed
without wrapping! Additionally, if I manually add some line breaks, things
start to look _super_ janky!
I can't just pipe this to `fmt` because it may break my beautiful markdown
syntax. Markdown formatters are also problematic because they mess up my
signoff and signature blocks! What should I do?
Best wishes,
Daniel
--
Daniel
sr.ht/~ficd
daniel@ficd.ca
```
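The reflow step shown in the example above can be approximated with Python's standard `textwrap` module. This is a minimal sketch, not necessarily how `mailfmt` implements it: the helper name `reflow` is hypothetical, and the option values (width 74, long words left unbroken, whitespace replaced) are borrowed from the defaults visible in the script's source.

```python
import textwrap


def reflow(paragraph_lines, width=74):
    """Join the lines of one paragraph and re-wrap them at `width` columns.

    Hypothetical helper for illustration; assumes the caller has already
    split the message into paragraphs and filtered out preserved blocks.
    """
    # Collapse manual line breaks so the paragraph is wrapped as one unit.
    text = " ".join(line.strip() for line in paragraph_lines)
    return textwrap.fill(
        text,
        width=width,
        break_long_words=False,   # never split a long word (e.g. a URL)
        replace_whitespace=True,  # treat tabs/newlines as plain spaces
    )
```

Calling `reflow(["a janky", "paragraph"])` returns the single line `a janky paragraph`; longer input is wrapped so that no line exceeds `width` unless a single word is longer than that.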
## Markdown Safety
In some cases, you may want to generate an HTML email. Ideally, you'd want the
HTML to be generated directly from the plain text message, and for _both_
versions to be legible and have the same semantics.
Although `mailfmt` was written with Markdown markup in mind, its intended output
is still the `text/plain` format. If you pass its output directly to a Markdown
renderer, line breaks in sign-offs and the signature block won't be preserved.
If you invoke `mailfmt --markdown-safe`, then `\` characters will be appended to
mark line breaks that would otherwise be squashed, making the output suitable
for conversion into HTML. Here's an example of one such pipeline:
```bash
cat message.txt | mailfmt --markdown-safe | pandoc -f markdown -t html \
  --standalone > message.html
```
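If you would rather drive the same pipeline from Python than from a shell, something like the following could work. It is a sketch under assumptions: `build_pipeline` and `render_html` are hypothetical names, and both `mailfmt` and `pandoc` are assumed to be on your `PATH`. Only flags that appear in this README are used.

```python
import subprocess


def build_pipeline(width=None):
    """Return the argv lists for the two stages: mailfmt, then pandoc."""
    mailfmt_cmd = ["mailfmt", "--markdown-safe"]
    if width is not None:
        mailfmt_cmd += ["-w", str(width)]
    pandoc_cmd = ["pandoc", "-f", "markdown", "-t", "html", "--standalone"]
    return mailfmt_cmd, pandoc_cmd


def render_html(plain_text, width=None):
    """Pipe plain text through mailfmt and pandoc, returning the HTML."""
    mailfmt_cmd, pandoc_cmd = build_pipeline(width)
    # check=True raises CalledProcessError if either tool exits non-zero.
    formatted = subprocess.run(
        mailfmt_cmd, input=plain_text, capture_output=True, text=True, check=True
    ).stdout
    return subprocess.run(
        pandoc_cmd, input=formatted, capture_output=True, text=True, check=True
    ).stdout
```

`render_html(open("message.txt").read())` would then return the same document that the shell pipeline writes to `message.html`.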
Here's the earlier example message with Markdown-safe output:
```
Hey,
This is a really long paragraph with lots of words in it. However, my text
editor uses soft-wrapping, so it ends up looking horrible when viewed
without wrapping! Additionally, if I manually add some line breaks, things
start to look _super_ janky!
I can't just pipe this to `fmt` because it may break my beautiful markdown
syntax. Markdown formatters are also problematic because they mess up my
signoff and signature blocks! What should I do?
Best wishes, \
Daniel \
-- \
Daniel \
sr.ht/~ficd \
daniel@ficd.ca \
```
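A trailing backslash is treated as a hard line break by Pandoc's Markdown reader and by CommonMark, which is why the marked lines survive rendering. The idea can be sketched roughly as below. This is a simplified illustration, not the script's actual logic: `mark_hard_breaks` is a hypothetical name, and it only handles the signature block (everything from the `--` delimiter onward), leaving sign-off detection out.

```python
def mark_hard_breaks(lines):
    """Append a Markdown hard-break marker to each signature-block line.

    Simplified sketch: lines before the "--" delimiter pass through
    unchanged, and empty lines are never marked.
    """
    out = []
    in_signature = False
    for line in lines:
        if line.rstrip() == "--":
            in_signature = True
        if in_signature and line:
            # A space followed by a backslash at end of line is a hard break.
            line = line + " \\"
        out.append(line)
    return out
```

For example, `mark_hard_breaks(["Hi", "", "--", "Daniel"])` leaves the first two lines alone and turns the last two into `-- \` and `Daniel \`.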
## Aerc Integration
For integration with `aerc`, consider adding the following to your `aerc.conf`:
```ini
[multipart-converters]
text/html=mailfmt --markdown-safe | pandoc -f markdown -t html --standalone
```
When you're done writing your email, you can call the `:multipart text/html`
command to generate a `multipart/alternative` message which includes _both_ your
original `text/plain` _and_ the newly generated `text/html` content.
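To make the resulting structure concrete, here is how a `multipart/alternative` message with both parts can be built using Python's standard `email` package. This only illustrates the message layout; it is not how `aerc` builds the message internally, and `build_alternative` is a hypothetical helper name.

```python
from email.message import EmailMessage


def build_alternative(plain, html):
    """Return a message carrying the same content as text/plain and text/html."""
    msg = EmailMessage()
    # The first part is the plain text body.
    msg.set_content(plain)
    # Adding an alternative converts the message to multipart/alternative
    # and appends the HTML version after the plain text one.
    msg.add_alternative(html, subtype="html")
    return msg
```

Mail clients that can render HTML typically show the last alternative they support, while text-only clients fall back to the `text/plain` part.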
## Contributing
Please send patches, requests, and concerns to my
[public inbox](https://lists.sr.ht/~ficd/public-inbox).


@@ -1,5 +1,12 @@
#!/bin/env python
# TODO: generate an HTML version from the markdown syntax
# while preserving the signoff and signature block
# How to do this:
# - Go through and do regular formatting, but add line breaks
# - at the end of preserved lines. Then pass to md -> html converter.
# Most simply, there should be an option to output "markdown safe" text.
# Simple text-wrapping script for email.
# Preserves code blocks, quotes, and signature.
# Automatically joins and re-wraps paragraphs to
@@ -25,9 +32,15 @@ reflow = True
width = 74
break_long_words = False
replace_whitespace = True
markdown_safe = False
in_signoff = False
in_signature = False
def pprint(string: str):
    if markdown_safe and (in_signoff or in_signature) and string:
        string += " \\"
    if not squash:
        print(string, file=out_stream)
    else:
@@ -136,6 +149,13 @@ Contact: daniel@ficd.ca
    help="Don't squash consecutive paragraph breaks.",
    action="store_false",
)
parser.add_argument(
    "-m",
    "--markdown-safe",
    required=False,
    help="Output format safe for Markdown rendering.",
    action="store_true",
)
parser.add_argument(
    "-i",
    "--input",
@@ -160,6 +180,7 @@ Contact: daniel@ficd.ca
squash = args.no_squash
replace_whitespace = args.no_replace_whitespace
break_long_words = args.break_long_words
markdown_safe = args.markdown_safe
if args.input == "STDIN":
    reader = sys.stdin
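The assignments above rely on how `argparse` derives attribute names and defaults: `--markdown-safe` becomes `args.markdown_safe`, `store_true` flags default to `False`, and `store_false` flags such as `--no-squash` default to `True`. The following standalone sketch shows that behaviour; the `parse` wrapper is hypothetical and only two of the script's options are reproduced.

```python
import argparse


def parse(argv):
    """Parse a reduced, illustrative subset of the script's options."""
    parser = argparse.ArgumentParser(description="Formatter for plain text email.")
    parser.add_argument(
        "--no-squash",
        help="Don't squash consecutive paragraph breaks.",
        action="store_false",  # args.no_squash is True unless the flag is given
    )
    parser.add_argument(
        "-m",
        "--markdown-safe",
        help="Output format safe for Markdown rendering.",
        action="store_true",  # args.markdown_safe is False unless the flag is given
    )
    return parser.parse_args(argv)
```

This explains why `squash = args.no_squash` reads naturally despite the name: with no flags, `args.no_squash` is `True`, so squashing stays enabled.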
@@ -174,16 +195,19 @@ Contact: daniel@ficd.ca
if should_check_signoff:
    is_signoff = check_signoff(line)
    if is_signoff:
        in_signoff = True
        if not signoff_cache:
            signoff_cache = line
        else:
            pprint(signoff_cache)
            pprint(line)
            in_signoff = False
            signoff_cache = ""
        continue
    elif not is_signoff and signoff_cache:
        paragraph.append(signoff_cache)
        signoff_cache = ""
        in_signoff = False
if line.startswith("```"):
    flush_paragraph()
    skipping = not skipping
@@ -191,6 +215,7 @@ Contact: daniel@ficd.ca
elif should_check_signature and line == "--":
    flush_paragraph()
    skipping = True
    in_signature = True
    pprint("-- ")
elif not line or re.match(
    r"^(\s+|-\s+|\+\s+|\*\s+|>\s*|#\s+|From:|To:|Cc:|Bcc:|Subject:|Reply-To:|In-Reply-To:|References:|Date:|Message-Id:|User-Agent:)",