Bridging Islands in Emacs: re-builder and query-replace-regexp
One of the problems with Emacs, especially out of the box, is that its constituents don’t communicate with each other as comprehensively as they ought to. This is expected given the bazaar nature of its development: it’s an amalgamation of elisp libraries written by different contributors over decades, few of whom were aware of many of Emacs’ existing capabilities they could reuse or plug into. I covered a couple of examples of this deficiency in my series on Batteries included with Emacs (such as the pulse and view libraries).
The experience of using most software teaches us to ignore such shortcomings and annoyances until we don’t even notice them any more. When using Emacs, however, it’s possible to glance occasionally at a lacuna between its islands that’s begging for a bridge. In stark contrast to most software, building arbitrary bridges in elisp is very easy.
The Islands: re-builder and query-replace-regexp
One such example is a simple connection between the built-in regexp-builder (re-builder) library and the *-replace-regexp functions. re-builder lets you build a regular expression (henceforth “regexp”) with interactive feedback. Text in the main buffer matching the regular expression is highlighted, which is very helpful for catching and correcting errors:
I’m reasonably familiar with Emacs’ flavor of regular expressions, idiosyncratic as it is (compared to PCRE), and thus often forget that re-builder even exists. On the other hand, query-replace-regexp (henceforth qrr), an everyday tool which does what it says on the tin, does not show matches as you construct the expression to be replaced:[1]
Building regexps and replacing constructed regexps: these two commands do complementary things and deserve to work together. As things stand now, you’d have to
- Open re-builder and construct your regexp.
- Copy it to the kill ring (C-c C-w by default).
- Quit re-builder (C-c C-q by default).
- Run query-replace-regexp (C-M-% by default).
- Paste the kill (C-y).
- Fix newlines and backslashes (if using the “read” interface to re-builder, more on this below).
- Type in the replacement string and press RET to begin the replacements.
Connecting commands through the kill-ring (or clipboard) should be the last resort, not a go-to strategy!
(A)bridged functions
Here they are working in concert:
The idea is this: re-builder and qrr are now effectively the same command. Bring up re-builder using a keybinding (preferably your keybinding for the latter) and build your regexp interactively. Press RET and re-builder exits, with its contents as the input to the replacement command.
This is exactly as many keystrokes as running qrr or re-builder for their individual purposes, but now you can use both fully, go from the latter to the former, and you have to remember one fewer command or keybinding!
A Combinatorial Expansion: rx input to query-replace-regexp
re-builder has many more advantages than just interactive feedback, and now they all carry over to qrr:
- You construct the regexp in a regular buffer, giving you access to the full suite of Emacs commands for editing, including your own editing customizations.
- You can navigate between matches interactively (like with isearch), and thus choose where in the buffer the replacement should begin. This even doubles as a replacement for isearch-forward-regexp and isearch-backward-regexp, although these can be configured to preview matches, so re-builder is less crucial there.
- You can switch between different modes of regexp entry, including the powerful rx forms that qrr does not allow:
Pressing “RET” will run the replacement on the appropriate condensed version of this string:
To switch re-builder to rx mode, invoke reb-change-syntax (bound to C-c C-i by default). This is persistent, so you only need to do it once. Note that this is a regular lisp buffer, so you have access to all your lisp editing tools: smartparens/lispy, autocomplete etc.
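If you would rather have rx syntax selected from your init file, something like the following may work. It is a minimal sketch, not part of the bridge itself: it assumes the user option `reb-re-syntax` accepts the symbol `rx` in your Emacs version, and `my/example-rx-regexp` is a hypothetical name used only to show how an rx form condenses to an ordinary regexp string.

```elisp
(require 're-builder)
(require 'rx)

;; Assumption: you want rx syntax every time re-builder starts,
;; instead of switching with `reb-change-syntax' by hand.
(setq reb-re-syntax 'rx)

;; Hypothetical example: the kind of form you would type (quoted) into
;; the re-builder buffer in rx mode, condensed into a regexp string.
(defvar my/example-rx-regexp
  (rx-to-string '(seq bow (or "foo" "bar") (one-or-more digit)) t)
  "Regexp string built from an rx form, for illustration only.")
```

Evaluating `my/example-rx-regexp` shows the condensed string that a command like qrr would receive.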
re-builder actually has a third “read” interface, where you quote regexps like you would in a string in Lisp code. This is useful to test regexps that you plan to place in code.
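To make the difference concrete, here is a small sketch of what “read” syntax means in practice. The variable name `my/word-foo-regexp` is hypothetical. In the plain string interface you would type `\bfoo\b`; in the read interface you type it as it must appear inside a Lisp string literal, with every backslash doubled.

```elisp
;; The regexp as it appears in Lisp source (and in re-builder's
;; "read" syntax): backslashes are doubled inside the string literal.
(defvar my/word-foo-regexp "\\bfoo\\b"
  "Hypothetical example: match the whole word \"foo\".")

;; Returns the index where the match starts, or nil if there is none.
(string-match-p my/word-foo-regexp "a foo b")
```

A regexp that passes in the read interface can therefore be pasted into code unchanged.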
Thus we now have a route that threads more islands that were originally disjoint: rx, your Lisp editing suite, re-builder and qrr. This is perhaps the lesson here:
Connecting existing libraries in Emacs leads not to a linear growth in its features, but to a combinatorial expansion of its capabilities. This can be significantly more bang for your buck than writing things from scratch, and it will help minimize your cognitive load as the things you already know work in more contexts.
The Actual Bridge
Finally, here’s the elisp that forms the bridge between the two commands:
(defvar my/re-builder-positions nil
  "Store point and region bounds before calling re-builder")
(advice-add 're-builder
            :before
            (defun my/re-builder-save-state (&rest _)
              "Save into `my/re-builder-positions' the point and region
positions before calling `re-builder'."
              (setq my/re-builder-positions
                    (cons (point)
                          (when (region-active-p)
                            (list (region-beginning)
                                  (region-end)))))))
(defun reb-replace-regexp (&optional delimited)
  "Run `query-replace-regexp' with the contents of re-builder. With
non-nil optional argument DELIMITED, only replace matches
surrounded by word boundaries."
  (interactive "P")
  (reb-update-regexp)
  (let* ((re (reb-target-value 'reb-regexp))
         (replacement (query-replace-read-to
                       re
                       (concat "Query replace"
                               (if current-prefix-arg
                                   (if (eq current-prefix-arg '-) " backward" " word")
                                 "")
                               " regexp"
                               (if (with-selected-window reb-target-window
                                     (region-active-p)) " in region" ""))
                       t))
         (pnt (car my/re-builder-positions))
         (beg (cadr my/re-builder-positions))
         (end (caddr my/re-builder-positions)))
    (with-selected-window reb-target-window
      (goto-char pnt) ; replace with (goto-char (match-beginning 0)) if you want
                      ; to control where in the buffer the replacement starts
                      ; with re-builder
      (setq my/re-builder-positions nil)
      (reb-quit)
      (query-replace-regexp re replacement delimited beg end))))
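The function above relies on `reb-target-value`, which may not be defined in older Emacs releases. If you get a void-function error, a small shim along these lines might help. This is a sketch under the assumption that your copy of re-builder defines the variable `reb-target-buffer` and keeps `reb-regexp` buffer-local in the target buffer; it does nothing when the function already exists.

```elisp
(require 're-builder)

;; Assumption: only needed on Emacs versions whose re-builder lacks
;; `reb-target-value'.  The guard makes this a no-op otherwise.
(unless (fboundp 'reb-target-value)
  (defun reb-target-value (symbol)
    "Return the value of SYMBOL in the re-builder target buffer."
    (buffer-local-value symbol reb-target-buffer)))
```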
Additionally, I bind this new replace-regexp function (reb-replace-regexp) to RET in the re-builder buffer, and replace qrr entirely with just re-builder:
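;; The re-builder keymaps only exist once the library is loaded,
;; so defer these bindings until then.
(with-eval-after-load 're-builder
  (define-key reb-mode-map (kbd "RET") #'reb-replace-regexp)
  (define-key reb-lisp-mode-map (kbd "RET") #'reb-replace-regexp))
(global-set-key (kbd "C-M-%") #'re-builder)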
Very briefly, the code works as follows:
- Save the region and point positions into my/re-builder-positions before invoking re-builder, since these are otherwise lost. This is done by advising the function.
- When you press RET, quit re-builder and call qrr with the built regexp and the saved point and region information.
Lastly, if you want to insert a newline in the regexp-builder buffer you can now use C-q C-j. Entering literal newlines in a regexp definition is rare enough that dedicating RET to the much more useful qrr is a no-brainer.
[1] Yes, visual-regexp exists. But piling on another thousand lines of code here would be like bringing in a mountain of dirt to create a new self-contained island when the existing ones are lacking but a few connecting strings, and have the opportunity to form a denser network of interactions.