Vim’s substitution command is a powerful way to make changes to text files.
Besides finding and replacing text using regular expressions, substitutions can
call out to external programs for more complicated replacements. By calling the
date utility from a substitution, Vim can convert all dates in a file to a
different format and replace them all at once.
The input file is an HTML page with a list of articles. Each article includes
a <time> tag with a value and a datetime attribute to show the publication
date.
We need to convert the dates’ values to a friendlier format that includes the full month name (“September 19, 2017”), while keeping the datetime attributes in their current format.
The input file has more than forty articles, so replacing them all by hand would be a lot of error-prone work. Instead, we write a substitution that finds all dates in the file and replaces them with a reformatted value.
The first step in replacing the dates is finding where they are in the input file and making sure not to match the ones in the datetime attributes.
To find all dates in the file, we could use ....-..-.. as our search pattern
(press Esc, then type /....-..-..) to match the date format. However, this
pattern’s results will include all matching dates in the file, including the
ones in the <time> tags’ datetime attributes.
<time datetime="2017-09-19">2017-09-19</time>
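To see the problem outside Vim, here is a small shell sketch that runs the same pattern over a single sample line with grep. The find_dates helper and the sample line are illustrative assumptions, and -o is an extension found in GNU and BSD grep rather than a POSIX option.

```shell
# Sample line in the same shape as the article's <time> tags.
line='<time datetime="2017-09-19">2017-09-19</time>'

# Print every non-overlapping match of the pattern, one per line.
# find_dates is a hypothetical helper name used only for this sketch.
find_dates() {
  printf '%s\n' "$1" | grep -o '....-..-..'
}

find_dates "$line"
```

The date is printed twice, once for the datetime attribute and once for the value, which is the over-matching we want to avoid.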
In the input file, all <time> values are immediately followed by the
less-than sign from the closing </time> tag. To keep the dates in the
datetime attributes out of the results, we could include the less-than
sign in the search pattern and make sure to add it back when replacing.
....-..-..<
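As a rough analogue outside Vim, the "match the less-than sign and put it back" idea can be sketched with sed. This is only an illustration: the replace_value helper, the sample line and the DATE placeholder are assumptions, and sed's syntax is not Vim's.

```shell
# Sample line in the same shape as the article's <time> tags.
line='<time datetime="2017-09-19">2017-09-19</time>'

# Match the date together with the "<" that follows it, then write the "<"
# back in the replacement so the closing tag stays intact.
# replace_value is a hypothetical helper name used only for this sketch.
replace_value() {
  printf '%s\n' "$1" | sed 's/....-..-..</DATE</'
}

replace_value "$line"
```

Only the value is replaced, because the date in the attribute is followed by a double quote instead of a less-than sign. The cost is having to repeat the < in the replacement.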
However, Vim supports setting the start and end of the match in the search
pattern using the \zs and \ze pattern atoms. By prefixing the < in our
search pattern with \ze, the pattern finds all dates followed by a less-than
sign, but doesn’t include it in the match, meaning it won’t be replaced.
....-..-..\ze<
We need the month’s full name in the date replacements, so we can’t reorder the input value (“2017-09-19”) to get the result we want. Instead, we need to call out to an external utility that knows month names and can convert between date formats.
We reformat each match of our search pattern to our desired format (“September
19, 2017”) with the date utility. We use "%Y-%m-%d" as the input format
to match the results from the search pattern. The output format is
"%B %d, %Y" to produce the month’s full name, the zero-padded day of the
month, a comma and the year.
With these formats the date utility reformats 1991-11-02 to
November 02, 1991.
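$ date -jf "%Y-%m-%d" "1991-11-02" +"%B %d, %Y"
November 02, 1991

-j: tells date not to try to set the system date, so it only parses and prints the given date. The -j and -f flags belong to the BSD date that ships with macOS; GNU date uses different options.
-f "%Y-%m-%d": sets the input format to "%Y-%m-%d" to match the input date (1991-11-02).
"1991-11-02": the input date, parsed using the format passed to -f.
+"%B %d, %Y": the output format, which produces November 02, 1991.

We know how to find all dates in the file, and how to convert a date to another
format from the command line. To replace all found dates with a reformatted
version from the date utility, we need to run an expression from a
substitution.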
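The date invocation in this article relies on the BSD flags -j and -f. On systems with GNU date those flags are not available, and the -d option parses the input date instead. The following is a minimal sketch of a wrapper that tries the BSD form first and falls back to the GNU form. The reformat_date name is hypothetical, and LC_ALL=C is set on the assumption that English month names are wanted.

```shell
# Convert YYYY-MM-DD to "Month DD, YYYY" with whichever date flavor is installed.
# reformat_date is a hypothetical helper name, not something from the article.
reformat_date() {
  # BSD/macOS date: -j means "don't set the clock", -f gives the input format.
  if LC_ALL=C date -j -f '%Y-%m-%d' "$1" +'%B %d, %Y' 2>/dev/null; then
    return 0
  fi
  # GNU date: -d parses the date string directly.
  LC_ALL=C date -d "$1" +'%B %d, %Y'
}

reformat_date 1991-11-02
```

Whichever branch runs, the rest of the article only depends on the command printing the reformatted date followed by a newline.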
Using the search pattern we prepared earlier, we can find and replace all date values from the input file with a substitution. For example, we could overwrite all dates with a hardcoded value:
:%s/....-..-..\ze</November 2, 1991/gc
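Here, ....-..-.. (followed by \ze<) is the search pattern that finds the date values, and November 2, 1991 is the hardcoded substitute string.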
Instead of inserting a hardcoded substitute string, we need to run an expression for each match to get its replacement.
When a substitute string starts with \=, Vim evaluates it as an expression.
We can call Vim’s built-in functions from an expression. To replace all numbers
in a file with the number of the line they’re on, we use the line() function
from an expression in the substitute string.
:%s/\d\+/\=line('.')/gc

\d\+: matches every number in the file. \d matches a single digit, and multi-digit numbers (42, 785, 14281) are matched as one number by using \+.
\=line('.'): uses an expression (\=) to call the line() function. Passing '.' as the function’s argument returns the line number of the current cursor position, which is used to replace the match.

Vim provides the system() function to call out to an external command and use
the result as the replacement string. To replace all numbers in a file with a
random number, we call echo -n $RANDOM with the system() function.
:%s/\d\+/\=system('echo -n $RANDOM')/gc
We use the system() function from an expression (\=) to call out to the
date utility. Sticking with hardcoded dates for now, we can use the utility
to convert a date’s format from “1991-11-02” to “November 02, 1991” before
inserting it into the file:
:%s/....-..-..\ze</\=system('date -jf "%Y-%m-%d" "1991-11-02" +"%B %d, %Y"')/gc
....-..-..\ze<: the search pattern, which finds all date values that are followed by a less-than sign.
\=system('date …'): uses an expression to call the system() function, which executes an external command and returns its output as the substitute string.
'date -jf "%Y-%m-%d" "1991-11-02" +"%B %d, %Y"': the command to run, with a hardcoded date ("1991-11-02") as its input date argument. This date matches the format of the search pattern’s matches.

This substitution produces a newline before the closing </time> tag, because
the date utility appends one to its output. We’ll remove these later
while discussing nested substitutions.
The input date is still hardcoded (“1991-11-02”), so this substitution will overwrite all date values in the file with a date in 1991. To put the matched date values back in the file, we need to pass them to the date command.
Vim’s submatch() function returns matches from our pattern. If we call it
with 0 as its argument, it will return the whole match instead of a submatch.
To wrap each occurrence of “October” in brackets, we use [\0] as the
substitute string.
:%s/October/[\0]/gc
In an expression, submatches can be included using the submatch() function.
:%s/October/\='['.submatch(0).']'/gc
To pass the matched date to the call to date in our expression, we need to
break out of the string passed to the system() function and replace the
hardcoded date with a call to submatch(0) to insert the whole match.
:%s/....-..-..\ze</\=system('date -jf "%Y-%m-%d" "'.submatch(0).'" +"%B %d, %Y"')/gc
Running this substitution will convert all <time> values in the input file to
our desired format, but it puts a newline before the closing </time> tag.
A newline is appended to the result of the date command, which ends up in the
file after running the substitution. Since there’s no way to get the date
command to omit the newline, we need to take it out ourselves.
We can run a second substitution to remove the newlines after running the first one:
:%s/\n<\/time>/<\/time>/g
To keep the original substitution from adding newlines in the first place, we
can pipe the result from the date command to tr to remove the newline with
the -d argument:
:%s/....-..-..\ze</\=system('date -jf "%Y-%m-%d" "'.submatch(0).'" +"%B %d, %Y" | tr -d "\n"')/gc
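The trailing newline is easy to miss at a shell prompt, because command substitution ($(...)) strips it, while Vim’s system() returns the output as is. A small sketch that makes it visible by counting bytes follows. The fake_date helper is a hypothetical stand-in for the date utility, so the example doesn’t depend on which date flavor is installed.

```shell
# Stand-in for the date utility: prints a fixed date followed by a newline.
# fake_date is a hypothetical helper used only for this sketch.
fake_date() {
  printf 'November 02, 1991\n'
}

# Byte count of the raw output: the 17 characters of the date plus the newline.
fake_date | wc -c

# Byte count after tr -d deletes every newline character from the stream.
fake_date | tr -d '\n' | wc -c
```

The second count is one lower than the first, which is the newline that would otherwise end up in front of the closing </time> tag.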
Another option is to take the newline out with a nested substitution.
substitute() function
Vim’s substitute() function replaces strings and can be run from an
expression in a substitution. Nested substitutions are useful for transforming
the result of another function.
The substitute() function works like the substitute command (:s[ubstitute])
in Vim’s command line and takes similar arguments. The first argument is the
input string, then the search pattern, the substitute string, and finally the
flags, which can be an empty string. substitute("input", "find", "replace", "g")
does to its input string what running :%s/find/replace/g does to a file.
:echom substitute("October 02, 1991", "October", "November", "")

"October 02, 1991": the input string.
"October": the search pattern.
"November": the substitute string.
"": the flags. We omit the g flag because we’re sure there’s only one match in the input string, so it isn’t necessary.

If an external command called from a substitution returns a trailing newline
(like echo would without the -n flag), we can use the substitute()
function to take it out before the match is replaced.
:%s/October/\=substitute(system('echo "November"'), "\n", "", "")/gc
We wrap the call to the date utility in a nested substitution using the
substitute() function. It takes the result, matches the newline ("\n") and
replaces it with an empty string.
:%s/....-..-..\ze</\=substitute(system('date -jf "%Y-%m-%d" "'.submatch(0).'" +"%B %d, %Y"'), "\n", "", "")/gc
Now our substitution will turn the <time> values into the correct format,
without adding that extra newline.
Thanks to Wouter Vos and Rico Sta. Cruz for feedback on the substitution command
styling, u/Vurpius for suggesting \ze, and Ben Sinclair for suggesting piping
through tr.