A Consistent Structural Editing Interface

2023-02-04

emacs, treesitter

Emacs 29 is getting native Tree-Sitter support, and the buzz is hard to miss.

Tree-Sitter maintains and provides a concrete parse tree of the buffer that you can query, but that’s as far as it goes. Acting on this information to provide context-aware navigation and editing tools is left to package authors, who have picked up this baton and started running with it! In the last few months we’ve had structural editing packages popping up all over the place.

Lisp modes in Emacs already have pretty good support for structural editing as the syntax is particularly simple: the code is its own parse tree (notwithstanding some Lisp reader shenanigans). But for every other language this is just the beginning. It’s going to take some work, but tree-sitter information is rich enough that over the next year or two of collective experimentation, I expect we’re going to see new editing paradigms emerge that could be to paredit what paredit is to naïve Lisp editing.

Considering my penchant for fussing over composability, I thought this might be a good time to make a short note about the UIs of structural editing packages.

Comparing structural editing packages usually comes down to a question of keybindings: Are they accessible? Mnemonic? Tedious? Modal? Chainable? Indeed, this is the subject of an extended example below. But I’m more interested in how composable structural editing packages are at a more fundamental level: Is it a megalithic suite with a rigid user-facing interface, or a bag of tools you can pick from and combine with other Emacs features?¹ Of course, every piece of elisp has a bag-of-tools nature if you’re not averse to some monkey patching, but brittle fixes crumble over time in unpleasant ways. Moreover, it’s not always clear which approach is better.²

Here’s an incomplete and evolving picture of structural editing in Emacs:

[Chart: an incomplete picture of structural editing in Emacs]

Dashed lines: Support is being worked on or can be added with some configuration.

It follows from the thesis of this write-up that this chart is out of date as you peruse it now. But it also elides many details for reasons of readability:

Most packages offer the obvious, modeless editing interface of calling individual commands directly, for instance with a keybinding or with M-x.
Every structural editing package in the above list that is not made for evil-mode has an adapter that allows it to be used with evil-mode (evil-paredit, evil-smartparens, …). I included lispyville as an example.
Also not on the list: single-purpose structure editing commands like expand-region or embrace.
Emacs’ ppss (parse partial s-expression) parser as used here is a catchall for parsing the old way, through a combination of regexp searching and using syntax-ppss.

Parsers, Editing Suites and User Interfaces

The Parser: Emacs’ homegrown ppss (parse partial s-expression) parser and its regexp-scan based cousins are workhorses that are available across many major modes. But if you’ve ever tried the mark-defun or forward-sexp commands in a non-Lisp buffer you’ll have noticed how inconsistent they can be. Until now, structural editing in non-Lisp modes has had to rely on this mix of ppss and bespoke regexp scans of the buffer. The addition of tree-sitter-based parsing is going to be a massive upgrade in consistency.

Structural Editing packages: Along with old favorites like paredit and lispy, we now have the fledgling entries I included in the chart: puni, combobulate, ts-movement and so on. Not all of these are tree-sitter based, although I expect they will all eventually support it, or be supplanted in usage by packages that do.

The Editing Interface: As provided or supported by the structural editing package. A distinction that most packages don’t (and often shouldn’t) make, but is the focus of my attention here:

Packages like lispy and symex are end-to-end solutions, with a defined vision of the best way for users to interact with them. This is expressed through both the kinds of verbs they afford, with editing actions like convolute, flow and stringify, and how these are made available and invoked – contextually based on the cursor position with lispy, and through a special editing mode with symex. These are customizable in the sense that you can move the keybindings around, but not change the editing paradigm or interface itself. It’s our way, the highway, or defadvice away. The advantage is that these are plug and play solutions (or plug, practice, practice, practice and play), and very capable out of the box. You must adapt to them, but they’re fast and intuitive afterwards.

Packages like smartparens and paredit are more restrained in their preferences: They offer a set of verbs that you can bind to keys as you please, with more levers to tweak their behavior.³ These packages also tend to be more singular in their focus: they don’t ship with commands to view docstrings inline or interact with the elisp debugger, like lispy does. They’re not all-encompassing editing “suites”, just participants in your editing experience. The advantage is malleability: they can adapt to you, and you pick only what you need. There are no nasty surprises along the way. They can change as your editing needs do.

To make my point about this difference in their design philosophies, the rest of this write-up is an extended example of the malleability gap. We try to create a uniform and consistent interface for structural editing irrespective of the package being used.

Example: A lispy-lite without lispy

One of the advantages of a specialized mode for structural edits is that the keybindings become much simpler – no chords needed – and editing is smoother and faster. Indeed, this is touted as the main feature of lispy:

This package reimagines Paredit - a popular method to navigate and edit LISP code in Emacs. The killer-feature are the short bindings […] Most of more than 100 interactive commands that lispy provides are bound to a-z and A-Z in lispy-mode.

100 interactive commands! In my experience you only need ten or fifteen interactive commands for most structural edits. The other commands will be needed so rarely that you’re likely to have forgotten they exist, and the corresponding keybindings will get in the way when you mean to do something else.

With Emacs’ repeat-mode and a smattering of helpers from a basic structural editing package like smartparens, we can set up a lispy-lite no-chords keymap that is tailored to our needs.

Here’s an example of using a (mostly) one-key interface that uses smartparens and repeat-mode:

Play by play

The command-log window on the right shows the keys that are pressed and the commands they activate. Notice that once a structural editing command in the structural-edit-map (see below) is called with its full keybinding – typically C-M-key – we can continue to use single keys to move around and edit code syntactically.

Regular Emacs commands like isearch or character-based navigation are available at all times, and I use a few above, including upcase-dwim. Calling such a command deactivates structural-edit-map, however, and I have to use a “full keybinding” again to reactivate it.

All structural editing commands of relevance are bound to single keys (like n and p) in a single keymap, and using any of these commands activates this transient keymap. Until we use a non-structural editing command again, we can continue to navigate and edit structurally with the single keys.

If necessary, this can be augmented with a prompter like repeat-help, giving us optional (or on-demand) hydra-like hints:

Defining a structural editing keymap

Here’s structural-edit-map and the instruction to use this with repeat-mode.

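;; Enable repeat-mode (built into Emacs 28 and later).
(repeat-mode 1)

(defvar structural-edit-map
  (let ((map (make-sparse-keymap)))
    (pcase-dolist (`(,k . ,f)
                   '(;; Built-in sexp navigation and killing
                     ("u" . backward-up-list)
                     ("f" . forward-sexp)
                     ("b" . backward-sexp)
                     ("d" . down-list)
                     ("k" . kill-sexp)
                     ;; smartparens commands (the sp- prefix)
                     ("n" . sp-next-sexp)
                     ("p" . sp-previous-sexp)
                     ("K" . sp-kill-hybrid-sexp)
                     ("]" . sp-forward-slurp-sexp)
                     ("[" . sp-backward-slurp-sexp)
                     ("}" . sp-forward-barf-sexp)
                     ("{" . sp-backward-barf-sexp)
                     ("C" . sp-convolute-sexp)
                     ("J" . sp-join-sexp)
                     ("S" . sp-split-sexp)
                     ("R" . sp-raise-sexp)
                     ;; Other built-in commands that are handy mid-edit
                     ("\\" . indent-region)
                     ("/" . undo)
                     ("t" . transpose-sexps)
                     ("x" . eval-defun)))
      (define-key map (kbd k) f))
    map)
  "Keymap of single-key structural editing commands, used as a repeat map.")

;; Mark every command bound in the map as repeatable with that same map:
;; after any of them runs, repeat-mode keeps the map active.
(map-keymap
 (lambda (_ cmd)
   (put cmd 'repeat-map 'structural-edit-map))
 structural-edit-map)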

To understand how this works, check out the repeating commands section of the Emacs manual, or my write-up on repeat-mode.
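Spelled out for a single command, the mechanism is small enough to sketch here. The snippet below is an illustration rather than part of my configuration: the "m" key and the choice of mark-sexp are hypothetical additions, picked only because mark-sexp ships with Emacs.

```elisp
;; No-op if structural-edit-map is already defined as above; included
;; only so that this snippet can be evaluated on its own.
(defvar structural-edit-map (make-sparse-keymap))

;; Hypothetical extra binding, not part of the map above.
(define-key structural-edit-map (kbd "m") #'mark-sexp)

;; repeat-mode looks up this symbol property after each command runs.
;; If it names a keymap, that keymap becomes active for the next key.
(put 'mark-sexp 'repeat-map 'structural-edit-map)

;; Inspecting the property shows which map a command will activate.
(get 'mark-sexp 'repeat-map) ; evaluates to the symbol structural-edit-map
```

The map-keymap call above does exactly this for every command bound in the map, which is why any one of them can serve as the entry point.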

Seamless activation

The other advantage of the repeat-map bindings is that the activation and deactivation of the transient keymap (structural-edit-map above) requires no action on our part. As in the demo, we can seamlessly mix regular editing commands (standard cursor navigation, inserting text, isearch) with single-key structural navigation and editing commands (up/down-list, convolute-sexp, slurp/barf and so on). You don’t have to explicitly switch modes – typing any key not in this map will deactivate the transient keymap and let you get on with regular editing.

Only what we need

We can start small and add to structural-edit-map incrementally. For example, suppose we find ourselves needing a command to duplicate s-expressions. We write a duplicate-sexp command (or just find sp-clone-sexp in the smartparens library) and add it to our one-key repeat-map repertoire. Each command we add has a multiplicative effect because of…

Composable features

Every command we add composes with everything already in our toolbelt. Here’s an example of moving a value into a variable where the newly added duplicate-sexp feeds into an Avy action that exchanges expressions:

The cost

There is, of course, a cost to pay for this convenience, and it’s the cost of all modal interaction: it’s the user’s responsibility to remember if the repeat-map is active. There are the usual band-aids, like an indicator in the mode-line. But realistically it comes down to two options:

Parsimony: Populate structural-edit-map with as few commands as you can get away with, thus minimizing the chance that a regular editing command (such as typing in some text) will cause a distressing series of structural edits that you didn’t intend.
Maximalism: Populate structural-edit-map with everything you think you might use, and explicitly deactivate the keymap with C-g (keyboard-quit) when you’re done with a round of structural edits. This effectively makes the method completely modal like evil-mode or symex.

Personally I prefer the first option, but the second is just as viable.
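For the mode-line indicator mentioned above, repeat-mode ships with a ready-made option. This is a minimal sketch, assuming Emacs 28 or later. Note that the indicator appears whenever any repeat map is active, not only structural-edit-map.

```elisp
;; repeat.el is built into Emacs 28 and later.
(require 'repeat)

;; Show an indicator in the mode line, instead of a message in the echo
;; area, for as long as a repeat map is active.
(setq repeat-echo-function #'repeat-echo-mode-line)

(repeat-mode 1)
```

With the parsimonious approach this is mostly a reassurance. With the maximalist one it is closer to a necessity.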

Minimum Viable Consistency

I used smartparens features in the structural-edit-map example above, but there isn’t anything special about smartparens here. repeat-mode is a generic Emacs feature.

As tree-sitter support matures, more packages are created and structural editing becomes available across many (non-Lisp) major modes in Emacs, we won’t have to contend with selecting one and learning and unlearning the plethora of editing interfaces that will accompany them (compare, for instance, the symex and lispy keybindings. Phew!).

Assuming that

they are written as simple packages and not editing “suites”, and
structural editing has a reasonably consistent set of universal verbs (up- / down-tree, transpose-nodes etc),

they can be conveniently folded into structural-edit-map, and our interface will remain consistent across modes, structural editing packages and time!
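As a sketch of what that folding might look like, here are some of the same keys pointed at a different backend, puni in this case. The command names are taken from the puni documentation as I remember it, so treat them as assumptions and check them against your installed version with C-h f. Evaluating this replaces the smartparens bindings on those keys.

```elisp
;; No-op if structural-edit-map is already defined; included only so
;; that this snippet can be evaluated on its own.
(defvar structural-edit-map (make-sparse-keymap))

;; Assumes the puni package is installed and that these command names
;; match the installed version.
(pcase-dolist (`(,key . ,cmd)
               '(("]" . puni-slurp-forward)
                 ("[" . puni-slurp-backward)
                 ("}" . puni-barf-forward)
                 ("{" . puni-barf-backward)
                 ("C" . puni-convolute)
                 ("S" . puni-split)
                 ("R" . puni-raise)))
  ;; Same keys as before, different package underneath.
  (define-key structural-edit-map (kbd key) cmd)
  ;; Make each new command keep the map active under repeat-mode.
  (put cmd 'repeat-map 'structural-edit-map))
```

The keys and the muscle memory stay put, and only the right-hand column changes.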

Even though lispy and symex are not intended to be used this way, there’s nothing stopping us from applying some elbow grease and appropriating the fancy verbs they provide into structural-edit-map. But the hackier the solution, the more edge cases and frustration we’ll need to contend with.

More broadly, there’s room in the emerging structural editing space for both end-to-end solutions and composable pieces. The former kind is flashier and more usable out of the box, but the latter is invaluable to users who move slower and try to minimize changes to their editing environment. This is a plea to package authors: when you begin designing a structural editing library for Emacs, please consider which kind suits you better!

¹ The Consult/Vertico/Embark ecosystem is a great example of the latter.
² The package that handles everything in a rigid way is also going to be more stable, an aspect for which my appreciation grows exponentially as deadlines begin to loom.
³ smartparens lets you define custom syntactic pairs, for instance.