From a08b48cc51cf94c8975e1467f6304dac79ea8c36 Mon Sep 17 00:00:00 2001
From: Daniel Fichtinger <daniel@ficd.ca>
Date: Fri, 4 Jul 2025 14:38:25 -0400
Subject: [PATCH] wrote kakoune lexer blog post

---
 ...mplementing-kakoune-syntax-highlighting.md | 104 ++++++++++++++++++
 content/blog/pygments-kakoune-lexer.md        |   7 --
 2 files changed, 104 insertions(+), 7 deletions(-)
 create mode 100644 content/blog/implementing-kakoune-syntax-highlighting.md
 delete mode 100644 content/blog/pygments-kakoune-lexer.md

diff --git a/content/blog/implementing-kakoune-syntax-highlighting.md b/content/blog/implementing-kakoune-syntax-highlighting.md
new file mode 100644
index 0000000..656869b
--- /dev/null
+++ b/content/blog/implementing-kakoune-syntax-highlighting.md
@@ -0,0 +1,104 @@
+---
+title: Implementing Kakoune Syntax Highlighting In Pygments
+date: July 04, 2025
+---
+
+As a programmer, one thing I care about a _lot_ is syntax highlighting. In fact,
+the main reason I created [Ashen] was to have more control over it. In my view,
+if you're going to spend all day looking at text, _it should at least look
+pleasant_. This naturally carries over to blogging as well.
+
+Over the past few months, I've become obsessed with [Kakoune]. I've been
+customizing it extensively, writing plugins, contributing to the wiki, and
+participating in its small (but _incredibly_ active and welcoming) community.
+And, well, when I get _this_ into something, I want to write about it!
+
+However, here's the problem: Kakoune doesn't have many users. It has around 10k
+stars on GitHub, while [Helix], a project that was directly inspired by it, has
+over 38 thousand. I don't mind that Kakoune is "unpopular". I enjoy the smaller,
+tighter-knit community — but I'd be lying if I said it wasn't inconvenient at
+times.
+
+One such time is _getting Kakoune syntax highlighting on my blog_. Most SSG
+setups (including [Zona], my home-brewed project) rely on external libraries to
+provide code highlighting. For example, this website uses [Pygments], which is a
+mature Python library. Now, Pygments boasts support for "a wide range of 597
+languages and other text formats".
+
+**Kakoune is _not_ among them**, which means that if I wanted Kakoune
+highlighting, I'd have to do it myself. Perhaps unsurprisingly, Kakoune provides
+highlighting for its own syntax. Helpfully, this highlighting is _itself_
+implemented as Kakoune commands (sometimes referred to as Kakscript). Why is
+this helpful? Because Kakoune highlighters are defined with regular expressions,
+which saves us some mental work if we want to port them to another platform.
+
+There's a caveat, however: the regex engine must be capable of recursion. This
+is thanks to the weirdness that is Kakoune's shell blocks, and how they interact
+with balanced delimiters.
+
+Without getting too detailed, Kakoune's balanced strings are...
+[complicated](https://github.com/mawww/kakoune/blob/master/doc/pages/command-parsing.asciidoc).
+This wouldn't normally be a problem, because strings that aren't wrapped in
+double/single quotes aren't highlighted anyway. However, that's not true for
+shell blocks: the contents of `%sh{...}` should be highlighted as POSIX shell
+script.
+
+The problem? The `%sh` delimiter can be _anything_. Literally. Kakoune's
+standard RC **itself** uses `%§` as a delimiter. This means that the following
+two snippets are parsed exactly the same way:
+
+```kak
+evaluate-commands %sh{
+  printf '%s\n' "%sh{ echo 'hi' }"
+}
+```
+
+```kak
+evaluate-commands %sh∴
+  printf '%s\n' "%sh{ echo 'hi' }"
+∴
+```
+
+All of this makes implementing a true Kakoune lexer for a library like Pygments,
+which doesn't natively support recursive regex, a non-trivial task. To be
+honest, I barely understand how it's done in Kakoune in the first place.
+
+Luckily, a friend of mine pointed out something very interesting the other day
+when he sent me a Kakoune snippet over Discord _with highlighting._ It didn't
+look great, but it was actually highlighted! _In **Discord**!_
+
+As it turns out, all he did was mark the code block as `sh` instead of `kak`.
+Kakoune's _actual_ syntax (the parts outside balanced `%sh` strings) is
+_visually_ very similar to POSIX `sh`. After this realization, implementing a
+Kakoune lexer was a much more straightforward task: all I had to do was extend
+the existing Bash lexer and add some keywords!
+
+Of course, the result isn't _perfect_. The lexer can't tell the difference
+between the inside and outside of `%sh` strings: shell keywords are highlighted
+at the root level of the code, and Kakoune keywords are highlighted inside shell
+blocks. The _correct_ approach would be to detect balanced `%sh` strings
+properly and delegate their contents to the Bash lexer. The following snippet
+(at the time of writing) is **not** highlighted correctly:
+
+```kak
+set buffer filetype kak
+evaluate-commands %sh{
+  echo define-command is-kak %< info -title is-kak 'Not Kak!' >
+}
+```
+
+Properly detecting these strings isn't straightforward with Pygments'
+`RegexLexer`. I'd likely need to subclass the base lexer and implement my own
+token scanning. Is it possible? Absolutely. Do I want to do it? **Absolutely not**.
+
+For now, please enjoy the janky _but functional_ Kakoune syntax highlighting I
+created. The plugin is also available as the `pygments-kakoune` package on
+[sr.ht](https://git.sr.ht/~ficd/pygments-kakoune) and
+[PyPI](https://pypi.org/project/pygments-kakoune/) if you want to use it in your
+own projects.
+
+[Zona]: https://git.sr.ht/~ficd/zona
+[Ashen]: https://sr.ht/~ficd/ashen
+[Kakoune]: https://kakoune.org
+[Helix]: https://github.com/helix-editor/helix
+[Pygments]: https://pygments.org/
diff --git a/content/blog/pygments-kakoune-lexer.md b/content/blog/pygments-kakoune-lexer.md
deleted file mode 100644
index a274d19..0000000
--- a/content/blog/pygments-kakoune-lexer.md
+++ /dev/null
@@ -1,7 +0,0 @@
----
-title: Implementing Kakoune Syntax In Pygments
-date: July 04, 2025
-draft: true
----
-
-Some content goes here.