regexp
.
The emacswiki.org
page Regular Expressions gives a nice summary of the ins and outs of Elisp regular expressions.
Outside of Elisp, I would like "rx" even better as a short abbreviation.
In Elisp however, rx
indicates an alternative representation of regular expressions, using s-expressions instead of strings. So if your Elisp regular expressions are in string form, I would suggest "regex" or perhaps "regexp".
For the plural, Emacs uses "regexps", I use "regexs", and a few use "regexen" (à la oxen, children, and of course emacsen).
"\n"
, denotes the string holding the newline character, but '\n
' denotes the length 2 string
consisting of \
followed by n
.
Note this is completely different than the use of the single quote character, as in 'y
in Elisp code,
which is shorthand for (quote y)
and serves to protect literal symbols from evaluation.
As in:
(set 'x 17)  ; x ← 17
(set 'x 'y)  ; x ← 'y
(set x 17)   ; y ← 17
Although somewhat string-like, symbol names are a different type of thing and not normally interpreted as regular expressions.
\n
' below are just my notation for explanation, not Elisp code!
Fortunately, Elisp also provides tools to ameliorate these problems. Below I elaborate and try to give some helpful hints which make using regular expressions in Elisp okay.
For example:
;; Match "cat" or "dog" and save the match for future reference
PCRE   (cat|dog)
Elisp  \(cat\|dog\)

;; Match (defun) or (defmacro)
;; One way: both alternatives include parens to be matched (grouping not involved).
PCRE   \(defun\)|\(defmacro\)   ;; '(', ')' are special unless escaped.
Elisp  (defun)\|(defmacro)      ;; '|' needs to be escaped to be special.

# Another way: place the alternatives inside a group.
PCRE   \((defun|defmacro)\)     ;; unescaped parens ( ) indicate a group
Elisp  (\(defun\|defmacro\))    ;; escaped parens \( \) indicate a group

# Third way. A match saves "un" or "macro" in group 1.
PCRE   \(def(un|macro)\)
Elisp  (def\(un\|macro\))

# The same, except it does not save the group.
PCRE   \(def(?:un|macro)\)
Elisp  (def\(?:un\|macro\))

Not surprisingly perhaps, the Elisp regex syntax is optimized for matching literal parens. This can be nice when looking for patterns in lisp code, but overall Elisp regular expressions tend to require more escaping with '\' than PCRE's do.
And unfortunately the backslash character '\
' also has a special meaning in string literals.
For example, "\n"
indicates the newline character (ascii code 10).
So to represent the length 2 string consisting of the characters '\
' and 'n
',
one needs to use an additional backslash: "\\n"
.
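The difference is easy to check directly. A minimal sketch, meant to be evaluated one form at a time with M-: or in ielm:

```elisp
;; The reader turns "\n" into a single newline character,
;; while "\\n" is a backslash followed by the letter n.
(length "\n")            ; 1
(length "\\n")           ; 2
(string-to-list "\\n")   ; (92 110), the codes of \ and n
```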
Elisp literal regular expressions are string literals interpreted after string backslash substitution. So for example,
;; the following two are equivalent
(string-match "(cat|dog)" var)        ; string evaluates to: (cat|dog)
(string-match "\(cat\|dog\)" var)     ; also evaluates to: (cat|dog)
;; Both match only the length 9 string '(cat|dog)'

;; To match cat or dog and save the match use \\
(string-match "\\(cat\\|dog\\)" var)  ; string evaluates to: \(cat\|dog\)
\\\\
;; Trying to use regex to search for '\' followed by a letter in [a-z]
;; e.g. strings such as: '\a', '\b', ... '\z'.
;;                 string literal     regex      matches
(re-search-forward "[a-z]")       ;   [a-z]      any 1 char in a-z
(re-search-forward "\[a-z]")      ;   [a-z]      any 1 char in a-z
(re-search-forward "\\[a-z]")     ;   \[a-z]     string '[a-z]'
(re-search-forward "\\\[a-z]")    ;   \[a-z]     string '[a-z]'
(re-search-forward "\\\\[a-z]")   ;   \\[a-z]    strings '\a', '\b', ..., '\z'  voilà
Note the perhaps surprising equivalences:
(string= "\\[a-z]" "\\\[a-z]")  ; t   both are read as: '\[a-z]'
(string= "\\[" "\\\[")          ; t   both are read as: '\['
So when using a string literal in Elisp code, to indicate matching a backslash character '
\
',
one needs to use "\\\\"
.
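As a small illustration of the four-backslash rule, here is a sketch using string-match-p on strings rather than a buffer search:

```elisp
;; "\\\\" is read as the 2-character string \\ which, as a regex,
;; matches one literal backslash.
(length "\\\\")                        ; 2
(string-match-p "\\\\" "a\\b")         ; 1, the backslash sits at index 1
(string-match-p "\\\\[a-z]" "x \\z")   ; 2, backslash followed by z
```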
re-search-forward
in an Elisp code literal string,
one needs to use "\\\\"
to obtain a regular expression matching '\
'.
However, re-search-forward
can also be invoked as a command when bound to a key or
via execute-extended-command (by default bound to M-x),
in which case it prompts the user to enter a regular expression in the minibuffer.
It turns out that the text typed into the minibuffer is passed to re-search-forward
without performing string literal backslash escaping.
So even though "|" and "\|" are read the same in Elisp source code:
(string= "|" "\|")  ; --> t
When pressing M-x and typing re-search-forward to invoke it as a command we have
#USER ENTERS   Matches at
|              next '|' in buffer
\|             anywhere
\\|            next '\|' in buffer
The regex '\|' means "empty regex OR empty regex", so it matches anywhere.
For an example involving searching for newlines, compare:
# When invoked as a plain function using M-:
#USER ENTERS                   string read as   Cursor moves to
(re-search-forward "n")        n                n
(re-search-forward "\n")       ␤                ␤
(re-search-forward "\\n")      \n               n
(re-search-forward "\\\n")     \␤               ␤
(re-search-forward "\\\\n")    \\n              \n
#
# When invoked as a command
# after pressing M-x and entering re-search-forward
#USER ENTERS   Cursor moves to
n              n
\n             n
\\n            \n
\\\n           \n
\\\\n          \\n
Where again I use the Unicode character ␤ as a way to denote the newline character (ascii code 10).
The difference between the way the argument to re-search-forward
looks as a literal string in code
versus the string read from the minibuffer (via the command re-search-forward,
ultimately by the builtin function read-from-minibuffer
),
is that the Elisp reader performs backslash substitution on string literals, but read-from-minibuffer
does not.
So entering \n
yields the length two string '\n
' instead of a newline character.
To search for a newline, just enter a newline.
Pressing the return
key won't work because it submits what you have entered so far.
Under default keybindings, the straightforward way to do this is to press C-q C-j
(when entering a regex in the minibuffer), which should insert a newline in the minibuffer.
This is all a casual emacs user would need to know. But you are on the road to becoming an intense emacs user!
The details are that C-q
invokes the command quoted-insert which uses the builtin function
read-char
which interprets C-j
as the number 10
and 10 is the ascii (and utf8) code for the newline character
(Consulting an ascii table, one can see that "j" relates to 10 because
j is the 10th letter of the Roman alphabet).
Another way to insert a newline into the minibuffer is to press
C-o
to invoke the command open-line.
A general principle here is that the minibuffer can be edited more or less like any other buffer, the main limitation being that only commands bound to keys can be invoked conveniently (one cannot simply use execute-extended-command or eval-expression when already in the minibuffer).
[[:space:]]
(re-search-forward "[:space:]")         ; Like PCRE [:aceps]
(re-search-forward "[[:space:]]")       ; Looks for white space. Probably what you want.
(re-search-forward "[e[:space:]]fun")   ; Looks for "efun" or "fun" preceded by white space.
(re-search-forward "[^[:space:]]")      ; Looks for anything but white space.
Here I have organized the anchors described in the Elisp info pages in tabular form:
Anchor  Matches                              Usage Restrictions
\=      at point                             Buffer only
^       after BEG or ␤                       Start of clause
\`      after BEG
$       before END or ␤                      End of clause
\'      before END
\b      at wordBEG, wordEND, BEG, END
\B      not at wordBEG, wordEND, BEG, END
\<      before wordBEG
\>      after wordEND
\_<     before symbolBEG
\_>     after symbolEND
Where BEG and END denote the beginning or end of the target (string or buffer) being matched against; and again I use ␤ to denote the newline character 10.
The "Start of clause" restriction on ^
indicates that the anchor can only be used at the beginning of a regular expression, an alternative, or a group.
In other words, as an anchor, ^
can only be preceded by: \| or \( or \(?:.
Note however that in regular expressions (Elisp or PCRE) the character ^
does double duty.
In addition to serving as a "start of clause" anchor, ^
also plays the role of negating a set of characters
in a character alternative (a bracket expression).
  "^[a-z]"   Matches a single character in {a,b,...,z}, after a newline, or at the beginning of the target.
  "[^a-z]"   Matches a single character *not* in {a,b,...,z} (anywhere)
The "End of clause" restriction on $
indicates that the anchor can only be used at the end of a regular expression, an alternative, or a group.
In other words, the anchor $
can only be followed by: \) or \|.
\=
only makes sense when matching against a buffer.
For example:
(re-search-forward "[[:space:]]+")     ; Skips to end of the next patch of space characters.
(re-search-forward "\\=[[:space:]]+")  ; Skips past any space characters immediately after point.
Note that
\=
has a completely different special meaning in documentation strings.
Make a habit of conscientiously selecting between \` vs ^
and \' vs $ instead of just lazily using ^
and $
all the time because they are more familiar.
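The difference between the two pairs of anchors shows up as soon as the target contains a newline. A small sketch with a two-line string:

```elisp
;; ^ and $ also match next to newlines inside the target;
;; \` and \' match only at the very beginning and end.
(string-match-p "^cat"   "dog\ncat")   ; 4   (just after the newline)
(string-match-p "\\`cat" "dog\ncat")   ; nil (cat is not at the start)
(string-match-p "dog$"   "dog\ncat")   ; 0   (dog ends just before the newline)
(string-match-p "dog\\'" "dog\ncat")   ; nil (dog is not at the end)
```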
Note that the \`
anchor doesn't do anything useful when combined with the start-idx
argument of string-match
.
You might think that passing a start-idx
to string-match
would act as if the text
argument
string starts at position start-idx
, but that is not how it works.
;; Imagine you only want matches starting at position 4
(string-match-p "cat" "the cat in the hat"  )      --> 4
(string-match-p "cat" "the cat in the hat" 4)      --> 4    Yes we want "cat" at pos 4
(string-match-p "cat" "the hat in the cat" 4)      --> 15   But here one at 15 also matches

;; Unsuccessful attempts to match only at position 4
(string-match-p "\\`cat" "the cat in the hat" 4)   --> nil  Bad. Does not match, because 4 is not the start of the string.
(string-match-p "^cat" "the cat in the hat" 4)     --> nil  Bad. Does not match for the same reason.
(string-match-p "^cat" "the\ncat in the hat" 4)    --> 4    Different; matches due to the newline.
Instead use
substring
.
(defun regex/match-at? (regex text pos)
  "Does REGEX match TEXT at position POS?"
  (string-match-p (concat "\\`" regex) (substring text pos)))

(regex/match-at? "cat" "the cat in the hat" 4)   --> 0    OK, a true value.
(regex/match-at? "cat" "the cat in the hat" 2)   --> nil  false, as desired.
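An alternative that avoids copying the string is to let string-match-p search from POS and then check that the match really starts there. This is only a sketch: the name my/match-at-p is hypothetical, and unlike the substring approach the anchors \` and ^ keep referring to the start of the whole string.

```elisp
(defun my/match-at-p (regex text pos)
  "Hypothetical helper: non-nil if REGEX matches TEXT starting exactly at POS."
  ;; The search returns the leftmost match at or after POS, so a match
  ;; beginning at POS, if there is one, is the one that gets reported.
  (eql pos (string-match-p regex text pos)))

(my/match-at-p "cat" "the cat in the hat" 4)   ; t
(my/match-at-p "cat" "the hat in the cat" 4)   ; nil, the only match starts at 15
(my/match-at-p "cat" "the cat in the hat" 2)   ; nil
```

One trade-off: when nothing matches at POS, the search still scans the rest of the string before giving up.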
Another important detail to be mindful of is that, in Elisp regexs, the special character
.
matches any single character except a newline.
(In PCRE regexs .
usually excludes the newline character, but not always. The PCRE construct \N
is more reliably like the Elisp .
special character.)
This raises the question of how to match any single character in Elisp regex.
One might be tempted to try '[.\n]' or similar,
but that won't work because '.' has no special meaning inside a character class.
Some options include:
STRING LITERAL     REGEX         Matches one of
"[^z-a]"           [^z-a]        anything. The range z-a is empty, so its negation includes all characters. This slick way is used by rx.
"\\(.\\|\n\\)"     \(.\|␤\)      anything (and captures it as a group)
"\\(?:.\\|\n\\)"   \(?:.\|␤\)    anything (without capturing)
"."                .             anything except a newline character
"[[:print:]]"      [[:print:]]   Most chars, but not chars with ascii code below 32, e.g. line feed ^J and form feed ^L
The s-expression based rx
however does provide a symbol anything
to match any single character,
and separately the symbol not-newline
or nonl
to match anything but the newline.
(string-match-p (rx "cat" anything "dog") "catdog")     --> nil. no char between cat and dog
(string-match-p (rx "cat" anything "dog") "cat:dog")    --> 0. (matches at position 0)
(string-match-p (rx "cat" anything "dog") "cat\ndog")   --> 0. (matches at position 0)
(string-match-p (rx "cat" nonl "dog") "cat:dog")        --> 0. (matches at position 0)
(string-match-p (rx "cat" nonl "dog") "cat\ndog")       --> nil. char between cat and dog is a newline.
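Since rx is a macro that expands to an ordinary regex string, its results can be used anywhere a string-form regex is expected. A brief sketch, using only standard rx symbols (nonl, bos, eos, or):

```elisp
(require 'rx)

(rx "cat" nonl "dog")                             ; "cat.dog"
(string-match-p (rx bos "cat" eos) "cat")         ; 0, bos and eos correspond to \` and \'
(string-match-p (rx bos "cat" eos) "a cat")       ; nil
(string-match-p (rx (or "cat" "dog")) "hotdog")   ; 3
```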
\b
is an anchor matching the empty string at word boundaries.
It is frequently useful when searching through text.
For example, to find the word rust, enter \brust\b
at the prompt of the occur command, or in code:
(re-search-forward "\\brust\\b")  ; Search for word rust
Moreover, you should think about whether you want \b or the pair \<, \>.
The pair obviously differs from \b, in that \< matches at the beginning of words
and \> at the end; while \b does both.
More subtly, they also differ in how they treat the beginning and end of their target string or buffer.
;; search for word CAT                          returns
(string-match-p "\\bCAT\\b" "CAT")              ; 0
(string-match-p "\\<CAT\\>" "CAT")              ; 0
(string-match-p "\\bCAT\\b" " CAT ")            ; 1
(string-match-p "\\<CAT\\>" " CAT ")            ; 1
;; search for word CAT in between 2 words
(string-match-p "\\b CAT \\b" "Dog CAT Pig")    ; 3
(string-match-p "\\> CAT \\<" "Dog CAT Pig")    ; 3
(string-match-p "\\b CAT \\b" " CAT ")          ; 0
(string-match-p "\\> CAT \\<" " CAT ")          ; nil
The last case differs because the pair \<, \> treat the target ends (start or finish) as word boundaries only if the ultimate (first or final) character is a word character.
The variable reb-re-syntax
controls how the text typed into the *RE-Builder*
buffer is interpreted.
;; To see the regex as in an Elisp string literal, e.g. "\\\\" to match \
(setq reb-re-syntax 'read)    ; enter \\\\ to match \
;; To see the same but after string backslash substitution, e.g. "\\" to match \
(setq reb-re-syntax 'string)  ; enter \\ to match \
You can also use reb-change-syntax {
C-c C-i
} to switch between those values.
To learn more use describe-keymap to see what commands re-builder
provides.
As an aside, many modes like this bind their commands to C-c C-somekey
(because emacs recommends that).
However I find I often accidentally type C-c somekey
instead.
So I often add key bindings without the second control key press.
For example:
(bind-keys
 :map reb-mode-map
 ("\C-c b" . reb-change-target-buffer)
 ("\C-c c" . reb-toggle-case)
 ("\C-c e" . reb-enter-subexp-mode)
 ("\C-c i" . reb-change-syntax)
 ("\C-c q" . reb-quit)
 ("\C-c r" . reb-prev-match)
 ("\C-c s" . reb-next-match)
 ("\C-c u" . reb-force-update)
 ("\C-c w" . reb-copy))
This is a special case of regex matching and in fact
the typical way to do plain string matching in Elisp is to use
the Elisp regex matching machinery.
For example string-match-p
does regex matching, not just string matching.
But often what you want is plain string matching.
The two functions regexp-quote
, regexp-opt
are helpful in this case.
;; Obtain a regex string matching '\back',
;; written as an Elisp literal string "\\back"
(string-match-p "\\back" "red ack")                 ; --> 4. Oops, matches 'ack' after a word boundary
(string-match-p (regexp-quote "\\back") "red ack")  ; --> nil. No '\back' in 'red ack'
(string-match-p (regexp-quote "\\back") "\\back")   ; --> 0. matches '\back'

;; Obtain a regex string matching either '\bad' or '\boy'
(regexp-opt '("\\bad" "\\boy"))    ; --> "\\(?:\\\\b\\(?:ad\\|oy\\)\\)"
;; Same, except the regex also captures the match.
(regexp-opt '("\\bad" "\\boy") t)  ; --> "\\(\\\\b\\(?:ad\\|oy\\)\\)"
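The same idea can be wrapped in a small helper for plain substring tests. This is a sketch only; my/string-contains-p is a hypothetical name, and the result still depends on case-fold-search like any other regex match.

```elisp
(defun my/string-contains-p (needle haystack)
  "Hypothetical helper: index of the literal string NEEDLE in HAYSTACK, or nil."
  (string-match-p (regexp-quote needle) haystack))

(regexp-quote "a.b")                    ; "a\\.b"
(my/string-contains-p "a.b" "xxa.b")    ; 2
(my/string-contains-p "a.b" "xxaxb")    ; nil, the dot is matched literally
(my/string-contains-p "[x]" "a[x]")     ; 1
```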
pcre2el
, available from elpa, provides the functions
rxt-pcre-to-elisp
and rxt-elisp-to-pcre
to translate back and forth
between Elisp and Perl regexs.
; Perl --> Elisp
(rxt-pcre-to-elisp "(cat|dog)")      ; --> "\\(\\(?:cat\\|dog\\)\\)"
;; Not sure why it does not return "\\(cat\\|dog\\)"
; Elisp --> Perl
(rxt-elisp-to-pcre "\(cat\|dog\)")   ; --> "\\(cat\\|dog\\)"
Quite useful for folks more familiar with PCRE.
rx
notationrx
.
For example:
(rxt-pcre-to-rx "([cC]at|[dD]og)")
;; returns list (submatch (or (seq (any 67 99) "at") (seq (any 68 100) "og")))
;; Note in Elisp characters are integers
;; 67,99; 68,100 are the ascii codes for C,c,D,d
The rx
form is often much easier to read than the string form, even for people experienced using regular expressions in string form.
I find the command rxt-explain
extremely useful for checking regular expressions in Elisp code.
For example, to check this (mistaken) code:
(looking-at "\\\\begin\\(\\s*{[^}\n]*}\\)"▮))
Invoking
rxt-explain
at ▮ above, gives the message: rxt-parse-atom/el: Invalid regexp: "Invalid syntax class `\\\\s '"
The mistake was assuming that '\s'
would match whitespace; correct for PCRE, but wrong for Elisp.
Editing the code to fix that yields:
(looking-at "\\\\begin\\([[:space:]]*{[^}\n]*}\\)"▮ ))
Invoking
rxt-explain
at ▮ now pops up a buffer "* Regexp Explain *"
with the following contents:
\\begin\([[:space:]]*{[^} ]*}\)
(seq "\\begin" (submatch (zero-or-more space) "{" (zero-or-more (not (any ?\n ?}))) "}"))
Which is what I wanted. Note the string literal
"\\begin"
is read as the string '\begin'
as intended
(with just one backslash "\begin"
would be read as the backspace character ?\b
followed by 'egin'
).
To get help like this from rxt-explain
, the buffer should be in the emacs-lisp-mode
or lisp-interaction-mode
major mode
(which should be the case for filenames ending in .el
) and `point' placed inside or near the string.
Actually rxt-explain
just dispatches to rxt-explain-elisp
or rxt-explain-pcre
.
So, in any buffer, if you want an Elisp string-form regex explained, but rxt-explain
asks for a PCRE regex:
, you can simply call rxt-explain-elisp
directly.
A final note regarding rxt-explain
, is that it displays the plain space characters (ascii code 32) as \s
.
(looking-at-p "a cat"▮)
Pops up:
a\scat
"a\scat"
The \s
here is not a regular expression construct, but rather an alternative way to represent a literal space character in Elisp code.
For example:
;;       --> <-- plain space character
(= 32 ?\s ?\ )              ;; true
(string= "a cat" "a\scat")  ;; true
I do not know why rxt-explain
displays space characters in this manner, but it may be useful to distinguish between
plain space characters and other space characters defined in Unicode.
For example
(looking-at-p "2 plain spaces(  ) one fill-width-space(　)"▮)
Pops up:
2\splain\spaces(\s)\sone\sfill-width-space(　)
"2\splain\spaces(\s)\one\sfill-width-space(　)"
Allowing one to distinguish between two plain spaces and one "full-width" space character (Unicode code point #x3000), which otherwise might look the same. One puzzle remains. Why do the two plain spaces inside the parens produce only a single
\s
in the rxt-explain
output? In other words why not:
2\splain\spaces(\s\s)\sone\sfill-width-space(　)
"2\splain\spaces(\s\s)\sone\sfill-width-space(　)"
I have no idea why. Perhaps it is a bug in this version (emacs 28.2) of
pcre2el.el?
In any case, rxt-explain
is still really useful for checking regular expressions.
I have not yet explored other uses of the rx
representation of regular expressions,
but I expect it should be much more "lisp friendly" than the string representation;
and therefore probably easier to work with when writing Elisp code which modifies or generates regexs.
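To illustrate what "lisp friendly" could look like, here is a minimal sketch that builds a regex from a list of words with rx-to-string, the function counterpart of the rx macro. The helper name my/words-regex is hypothetical.

```elisp
(require 'rx)

(defun my/words-regex (words)
  "Hypothetical helper: regex string matching any of WORDS as a whole word."
  ;; The rx form is an ordinary list, so it can be assembled with backquote.
  (rx-to-string `(seq word-start (or ,@words) word-end)))

(let ((re (my/words-regex '("cat" "dog"))))
  (list (string-match-p re "hot dog")    ; 4
        (string-match-p re "hotdog")))   ; nil, dog does not start a word here
```

Splicing strings into a list sidesteps the backslash bookkeeping that concatenating string-form regexs would require.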
The following example and discussion is similar to a nice
blog article written by Protesilaos Stavrou.
In the past I used caps for HTML tags, as in <TT>
and </B>
instead of <tt>
and </b>
. I think the caps make tags easier to spot; but the convention seems to be to use lower case.
So let's say I decide to conform and convert my upper case tags to lower case. One way to do this is to use the code evaluating feature of query-replace-regexp.
Place the cursor at the top of an HTML file, invoke query-replace-regexp entering:
Query replace regexp </?[A-Z]+>
Query replace regexp with \,(downcase \&)
The
\,
construct allows us to evaluate arbitrary code to obtain the replacement text.
\& denotes the matching string; if our regex had groups, we could access them with \1, \2 etc.

case-fold-search on query-replace-regexp

case-fold-search affects the behavior of query-replace-regexp.
With case-fold-search on (set to t), I invoked query-replace-regexp like this:
Query replace regexp </?[a-z]+>
Query replace regexp with \,(downcase \&)
Producing the behavior that the regex matched where I intended it to, but the replaced text was still in upper case!
For example, when it matched <TT> and I pressed the y key to do the replacement, the replacement was still <TT>.
What happened?
Notice my regex mistakenly included [a-z]
instead of [A-Z]
.
With case-fold-search
set to nil
, the regex would simply have failed to match "<TT>
".
When case-fold-search
is set to a true value it will match "<TT>
";
but will remember the match really was in upper case and therefore convert the replacement text to upper case.
This behavior is often useful, but not in this particular situation.
The remedy? The simplest advice is to be aware of this behavior and make sure your regexs are correct;
but you could also define a variant of query-replace-regexp which temporarily turns off case folding:
(defun query-replace-regexp/fold-not ()
  "Call `query-replace-regexp' with case folding suppressed."
  (interactive)
  (let (case-fold-search)
    (call-interactively #'query-replace-regexp)))
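The matching half of this behavior can be reproduced outside of query-replace-regexp. A small sketch with string-match-p, binding case-fold-search explicitly:

```elisp
;; case-fold-search decides whether [a-z] may match upper case text.
(let ((case-fold-search t))
  (string-match-p "</?[a-z]+>" "<TT>"))   ; 0, matches despite the case
(let ((case-fold-search nil))
  (string-match-p "</?[a-z]+>" "<TT>"))   ; nil
(let ((case-fold-search nil))
  (string-match-p "</?[A-Z]+>" "</B>"))   ; 0
```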
The example above used downcase; but other code could be used (including
for the purposes of side effects).
For example, one might want to systematically adjust some numbers in some text.
Suppose you are editing some TikZ source code and want to shift some objects one centimeter
to the right. You might do this by adding one to every occurrence of xshift=NUMBER
.
query-replace-regexp is suitable for this sort of task.
Buffer contents before
\begin{scope}[xshift=7cm, yshift=15cm]...
\begin{scope}[xshift=1cm, yshift=12cm]...
\begin{scope}[xshift=13cm,yshift=9cm]...
Then execute query-replace-regexp
Query replace regexp \(xshift=\)\([0-9]+\)
Query replace regexp with \1\,(1+ (string-to-number \2))
Buffer contents after
\begin{scope}[xshift=8cm, yshift=15cm]...
\begin{scope}[xshift=2cm, yshift=12cm]...
\begin{scope}[xshift=14cm,yshift=9cm]...
When matching the text "xshift=13", in the replacement "xshift=" is substituted for
\1
and
(1+ (string-to-number \2))
evaluates to (1+ (string-to-number "13"))
.
\,(...) need not be a string
Although we had to convert the string "13" to a number before passing it to the function 1+,
we do not need to convert the number returned by 1+
to a string.
This type flexibility on the part of query-replace-regexp is convenient (albeit somewhat surprising);
any insertable type seems to work.
Try for example:
Query replace regexp pig
Query replace regexp with \,(make-vector (length \&) 'hund)
query-replace mechanism
query-replace is performed by the function perform-replace,
which can be conveniently used to define your own commands behaving in a similar way.
For example to swap occurrences of two strings, one can define a command like this.
(defun ph/query-swap-string-occurrences (s1 s2)
  "Query replace occurrences of string S1 with S2, and string S2 with S1."
  (interactive
   (list
    (read-string "Replace string: ")
    (read-string "with string: ")))
  (perform-replace
   (regexp-opt (list s1 s2))   ; FROM-STRING
   (list (lambda (_ _)
           (if (looking-back-p (regexp-quote s1)) s2 s1)))  ; REPLACEMENT
   t     ; QUERY? Yes, ask for confirmation
   t     ; REGEXP? Yes, we search with a regex matching S1 or S2
   nil   ; DELIMITED?
   nil   ; REPEAT-COUNT
   nil   ; MAP
   (if (region-active-p) (region-beginning) (point))   ; START
   (if (region-active-p) (region-end)       (point))   ; END
   ))
Where
looking-back-p
described below is a match data clean version of looking-back
.
With its numerous parameters, the call to perform-replace
may seem a bit intimidating to new (and not so new) Elisp programmers. In particular the second argument (named REPLACEMENTS
) involving the lambda expression with two dummy arguments; the code above uses _
for both of them. Indeed, although the docstring of perform-replace
is perfectly accurate, I must confess it took some trial and error for me to get this to work.
For example, the function match-data
returns the information stored in the match data.
(progn
  ;;                                      0123456789012345
  (string-match "\\(pill\\).*\\(pill\\)" "caterpillar pill")
  (match-data)
  ) ; returns: (5 16 5 9 12 16)
;;  [span) of:  all  \1  \2
Keep in mind the fact that the match data only stores positions. For a general explanation, I refer readers to the relevant Elisp info pages.
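Because only positions are stored, the accessor functions are essentially indexing into that list, and match-string needs to be handed the original string to cut the text out of. A short sketch using the same example:

```elisp
(let ((text "caterpillar pill"))
  (save-match-data
    (when (string-match "\\(pill\\).*\\(pill\\)" text)
      (list (match-beginning 0) (match-end 0)   ; 5 16, the whole match
            (match-beginning 2) (match-end 2)   ; 12 16, group 2
            (match-string 1 text)))))           ; "pill", cut from TEXT
;; returns (5 16 12 16 "pill")
```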
Here I just mention a caveat or two about using string-match
followed by match-string
(directly or indirectly).
No check is made that the strings are the same
;; Matched "cat", but returns "dog"
(progn
  (string-match "cat" "acat")
  (match-string 0 "adog"))

;; Matched "cat", but returns whatever is in position 2..4 of the buffer.
(progn
  (string-match "cat" "acat")
  (match-string 0))  ; Logical error. Elisp quietly returns part of the buffer.
ielm
instead...)
(string-match "cat" "acat")▮   Invoke eval-last-sexp returns 1
(match-string 0 "acat")▮       Invoke eval-last-sexp KABOOM! (args-out-of-range "acat" 72499 72543)
Where ▮ indicates the positions at which I invoked eval-last-sexp.
(progn
  (string-match "cat" "acat")
  (match-string 0 "acat")
  )▮   Invoke eval-last-sexp returns "cat"
As expected.
match-string
directly follows the call to string-match
,
... or does it?match-string
.
Prefer string-match-p over string-match
string-match-p preserves the match data, so use it if you only need to know whether or not there is a match.
Likewise prefer looking-at-p over looking-at and looking-back-p over looking-back.
Except looking-back-p
is excluded from Elisp because looking back can be very slow, depending on how it is used.
I think the more useful approach would be to let the programmers decide, so I myself trivially defined looking-back-p
as:
(defun looking-back-p (regexp)
  "Same as `looking-back' except this function does not change the match data."
  (let ((inhibit-changing-match-data t))
    (looking-back regexp)))
(Also available on gitlab)
string-match
(or looking-at
, etc.) is a separate operation from subsequent
access to the matches via match-string
. This opens the door for the match
data being overwritten between calls, or the string argument used with match-string
to differ
from the one used for matching, as in the "acat" and "adog" example
or be missing entirely (defaulting to extracting text from the buffer).
Generally speaking a safer approach is to use functions which do matching and capturing in a single step, and politely restore the match data.
Such as s-match
from the s.el
library:
(progn
  (insert "\n")
  (string-match "cat" "acat")
  (insert "s-match: " (car (s-match "web" "webster")) "\n")
  (insert "outside: " (match-string 0 "acat"))
  )▮   invoking eval-last-sexp inserts:
s-match: web
outside: cat
Note the
car
, since s-match
returns a list.
save-match-data
to clean up after yourselfsave-match-data
scope:
For example a regex/match?
function
like this one provides a simple matching function which restores the match data to its original state.
(defun regex/match? (regex text &optional drop-props?)
  "If REGEX matches TEXT, return the first match. Otherwise nil.
Drop text properties when DROP-PROPS? is true.
Restores match data.
See also `s-match'"
  (save-match-data
    (when (string-match regex text)
      (if drop-props?
          (match-string-no-properties 0 text)
        (match-string 0 text)))))
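The effect of save-match-data can be seen by calling such a function between a string-match and its match-string. The sketch below uses a stripped-down, hypothetical helper my/first-match so that it is self-contained:

```elisp
(defun my/first-match (regex text)
  "Hypothetical helper: first match of REGEX in TEXT, leaving match data alone."
  (save-match-data
    (when (string-match regex text)
      (match-string 0 text))))

(progn
  (string-match "cat" "acat")               ; sets the match data
  (list (my/first-match "web" "webster")    ; "web"
        (match-string 0 "acat")))           ; "cat", outer match data intact
;; returns ("web" "cat")
```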