Find, convert and replace dates with Vim substitutions

Vim’s substitution command is a powerful way to make changes to text files. Besides finding and replacing text using regular expressions, substitutions can call out to external programs for more complicated replacements. By using the date utility from a substitution, Vim can convert all dates in a file to a different format and replace them all at once.

The input file
<article>
  <h1>Keeping open source projects maintainable</h1>
  <time datetime="2017-09-19">2017-09-19</time>
</article>

<article>
  <h1>Property-based testing in Elixir using PropEr</h1>
  <time datetime="2017-08-22">2017-08-22</time>
</article>

<article>
  <h1>git is not a git command</h1>
  <time datetime="2015-10-01">2015-10-01</time>
</article>

...

The input file is an HTML page with a list of articles. Each article includes a <time> tag with a value and a datetime attribute to show the publication date. We need to convert the dates’ values to a friendlier format that includes the full month name (“September 19, 2017”), while keeping the datetime attributes in their current format.

The result: articles with reformatted dates
<article>
  <h1>Keeping open source projects maintainable</h1>
  <time datetime="2017-09-19">September 19, 2017</time>
</article>

<article>
  <h1>Property-based testing in Elixir using PropEr</h1>
  <time datetime="2017-08-22">August 22, 2017</time>
</article>

<article>
  <h1>git is not a git command</h1>
  <time datetime="2015-10-01">October 01, 2015</time>
</article>
...

The input file has more than forty articles, so reformatting all the dates by hand would be a lot of error-prone work. Instead, we write a substitution that finds all dates in the file and replaces them with a reformatted value.

Finding the dates

The first step in replacing the dates is finding where they are in the input file and making sure not to match the ones in the datetime attributes.

To find all dates in the file, we could use ....-..-.. as our search pattern (press Esc, then type /....-..-..) to match the date format. However, this pattern matches every date in the file, including the ones in the <time> tags’ datetime attributes.

<time datetime="2017-09-19">2017-09-19</time>

In the input file, all <time> values are immediately followed by the less-than sign from the closing </time> tag. To keep the dates in the datetime attributes out of the results, we include the less-than sign in the search pattern.

....-..-..<
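The difference between the two patterns can be checked outside Vim as well. The following is a minimal sketch, assuming a POSIX shell and a grep that supports the -o flag; the variable names are only illustrative and not part of the original workflow.

```shell
# One line from the input file, used as sample data.
line='  <time datetime="2017-09-19">2017-09-19</time>'

# Without the trailing "<", both the attribute and the tag value match.
loose_matches=$(printf '%s\n' "$line" | grep -o '....-..-..')

# With the trailing "<", only the tag value matches.
strict_matches=$(printf '%s\n' "$line" | grep -o '....-..-..<')

printf 'loose:\n%s\n' "$loose_matches"
printf 'strict:\n%s\n' "$strict_matches"
```

The loose pattern prints the date twice, while the strict pattern prints it once, with the less-than sign attached.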

Reformatting dates from the command line

We need the month’s full name in the date replacements, so we can’t reorder the input value (“2017-09-19”) to get the result we want. Instead, we need to call out to an external utility that knows month names and can convert between date formats.

We reformat each match of our search pattern to our desired format (“September 19, 2017”) with the date utility. Because we include the trailing less-than sign in the search pattern, the dates we use for the conversion will end with one too (“2017-09-19<”). We append the less-than sign to the input format to match the results from the search pattern ("%Y-%m-%d<").

The output format is "%B %d, %Y<" to produce the month’s full name, the zero-padded day of the month, a comma and the year, followed by the less-than sign (September 19, 2017<) to keep the HTML intact.

With these formats the date utility reformats 1991-11-02< to November 02, 1991<.

$ date -jf "%Y-%m-%d<" "1991-11-02<" +"%B %d, %Y<"
November 02, 1991<
-j

Don’t try to set the system date

-f "%Y-%m-%d<"

Use the passed input format instead of the default. In this case "%Y-%m-%d<" to match the input format (1991-11-02<).

"1991-11-02<"

An example date to be parsed using the input format passed to -f.

+"%B %d, %Y<"

The output format, which produces November 02, 1991<.
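The -j and -f flags used above belong to the BSD version of date, such as the one on macOS. GNU date parses its input with -d instead. The sketch below wraps both variants in a helper; reformat_date is a hypothetical name introduced here for illustration, and forcing LC_ALL=C to get English month names is our assumption, not something the original command does.

```shell
# reformat_date is a hypothetical helper, not part of the article's workflow.
# It takes a date with a trailing "<" (like "1991-11-02<") and prints it
# in the "%B %d, %Y<" output format.
reformat_date() {
  # Try the BSD/macOS form first; if that fails, fall back to GNU date's
  # -d option, which does not accept the trailing "<", so it is stripped.
  LC_ALL=C date -jf "%Y-%m-%d<" "$1" +"%B %d, %Y<" 2>/dev/null ||
    LC_ALL=C date -d "${1%<}" +"%B %d, %Y<"
}

reformat_date "1991-11-02<"
```

On either variant this should print November 02, 1991<, followed by a newline.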

Calling out to external utilities from substitutions

We know how to find all dates in the file, and how to convert a date to another format from the command line. To replace all found dates with a reformatted version from the date utility, we need to run an expression from a substitution.

Using the search pattern we prepared earlier, we can find and replace all date values from the input file with a substitution. For example, we could overwrite all dates with a hardcoded value:

:%s/....-..-..</November 2, 1991</gc
....-..-..<

The search pattern to find all dates in the file.

November 2, 1991<

The literal substitute string to replace the dates with a hardcoded one.

Instead of inserting a hardcoded substitute string, we need to run an expression for each match to get its replacement.

We use the system() function from an expression (\=) to call out to the date utility. Sticking with hardcoded dates for now, we can use the utility to convert a date’s format from “1991-11-02” to “November 02, 1991” before inserting it into the file:

:%s/....-..-..</\=system('date -jf "%Y-%m-%d<" "1991-11-02<" +"%B %d, %Y<"')/gc
....-..-..<

The search pattern to find all dates in the file.

\=system('date …​')

An expression that uses the system() function to execute an external command and returns its output as the substitute string.

'date -jf "%Y-%m-%d<" "1991-11-02<" +"%B %d, %Y<"'

The date command as a string, with a hardcoded date ("1991-11-02<") as its input date argument. This date matches the format of the search pattern’s matches.

⚠️
This substitution produces a newline in the <time> tag, because the date utility appends one to its output. We’ll remove these later while discussing nested substitutions.

The replacement value is still hardcoded (“1991-11-02”), so this substitution will overwrite all date values in the file with a date in 1991. To put the matched date values back in the file, we need to pass them to the date command.

To pass the matched date to the call to date in our expression, we need to break out of the string passed to the system() function and replace the hardcoded date with a call to submatch(0) to insert the whole match.

:%s/....-..-..</\=system('date -jf "%Y-%m-%d<" "'.submatch(0).'" +"%B %d, %Y<"')/gc
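To see what the concatenation hands to the shell, the same string can be assembled by hand. This is only a sketch: the match variable stands in for the value submatch(0) would return, and cmd is an illustrative name.

```shell
# Stand-in for the text that submatch(0) returns for one match.
match='2017-09-19<'

# Close the single-quoted string, splice in the match, and reopen it,
# mirroring the '...' . submatch(0) . '...' concatenation in Vim.
cmd='date -jf "%Y-%m-%d<" "'"$match"'" +"%B %d, %Y<"'

# Print the assembled command instead of running it.
printf '%s\n' "$cmd"
```

The printed line is the hardcoded command from before, with the matched date in place of 1991-11-02<.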

Running this substitution will turn all <time> tags from the input file to our desired format, but it breaks the closing </time> tag with an extra newline.

The current result, with an added newline that breaks the closing </time> tag
<article>
  <h1>Keeping open source projects maintainable</h1>
  <time datetime="2017-09-19">September 19, 2017<
/time>
</article>

...

Nested substitutions

A newline is appended to the result of the date command, which ends up in the file after running the substitution. Since there’s no way to get the date command to omit the newline, we need to take it out ourselves.
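The extra byte is easy to see from the shell. In the sketch below, printf stands in for the date utility so the example does not depend on which date implementation is installed; output_with_newline is a hypothetical helper name.

```shell
# printf stands in for the date utility here; like date, it ends its
# output with a newline.
output_with_newline() {
  printf 'September 19, 2017<\n'
}

# Count the bytes before and after deleting the newline with tr.
bytes_before=$(output_with_newline | wc -c | tr -d ' ')
bytes_after=$(output_with_newline | tr -d '\n' | wc -c | tr -d ' ')

printf 'before: %s bytes, after: %s bytes\n' "$bytes_before" "$bytes_after"
```

The counts differ by exactly one byte, which is the newline that ends up inside the <time> tag.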

We can run a second substitution to remove them after running the first one:

:%s/<\n\/time>/<\/time>/g

Another option is to use a nested substitution to keep the original substitution from adding newlines in the first place.

We wrap the call to the date utility in a nested substitution using the substitute() function. It takes the result, matches the newline ("\n") and replaces it with an empty string.

:%s/....-..-..</\=substitute(system('date -jf "%Y-%m-%d<" "'.submatch(0).'" +"%B %d, %Y<"'), "\n", "", "")/gc

Now our substitution will turn the <time> tags into the correct format, without adding that extra newline.

The result
<article>
  <h1>Keeping open source projects maintainable</h1>
  <time datetime="2017-09-19">September 19, 2017</time>
</article>

<article>
  <h1>Property-based testing in Elixir using PropEr</h1>
  <time datetime="2017-08-22">August 22, 2017</time>
</article>

<article>
  <h1>git is not a git command</h1>
  <time datetime="2015-10-01">October 01, 2015</time>
</article>

...
