For an HTML file with an internal CSS stylesheet, I’d like to remove all unused selectors and minify the result.
<html>
  <head>
    <style>
      body{
        font: 22px/1.6 system-ui, sans-serif;
        margin: auto;
        max-width: 35em;
        padding: 0 1em;
      }

      img, video{
        max-width: 100%;
        height: auto;
      }
    </style>
  </head>
  <body>
    <h1>Hello world!</h1>
  </body>
</html>
This output example purges the unused img and video selectors and minifies the stylesheet by stripping whitespace and newlines:
<html>
  <head>
    <style>body{font:22px/1.6 system-ui,sans-serif;margin:auto;max-width:35em;padding:0 1em}</style>
  </head>
  <body>
    <h1>Hello world!</h1>
  </body>
</html>
The most popular tools to do this seem to be PurgeCSS for purging and cssnano for minifying, which are both PostCSS plugins.
We can’t use their command-line interfaces, because both expect the CSS and HTML to be in separate files, and neither has an option to take the internal stylesheet out of a page. Instead, we’ll write a Node.js program that takes an HTML file as input and prints it back out after purging and minifying its internal stylesheets.
The program takes an HTML page through standard input with fs.readFileSync("/dev/stdin"), which it then passes to a function named crush:
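#! /usr/bin/env node

const fs = require("fs");
const crush = require("..");

// Read the whole HTML document from standard input
let input = fs.readFileSync("/dev/stdin").toString();

// Purge and minify the internal stylesheets, then print the result
crush.crush(input).then(console.log);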
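The /dev/stdin path is a Unix convention and does not exist on Windows. If portability matters, one possible alternative is to read from the process.stdin stream instead. The sketch below is not part of crush: the readAll helper is a hypothetical name, and an in-memory stream stands in for process.stdin so the example runs on its own.

```javascript
const { Readable } = require("stream");

// Hypothetical helper: collect every chunk from a readable stream and decode
// once at the end, so multi-byte characters split across chunks stay intact.
async function readAll(stream) {
  const chunks = [];
  for await (const chunk of stream) {
    chunks.push(Buffer.from(chunk));
  }
  return Buffer.concat(chunks).toString("utf8");
}

// Stand-in for process.stdin, used here only to keep the sketch self-contained
const fakeStdin = Readable.from(["<html>", "<body>", "</body></html>"]);

readAll(fakeStdin).then(console.log);
```

In the command-line script, the same helper would be called as readAll(process.stdin), with the result passed on to crush.crush.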
The crush function takes the input HTML string and finds each internal stylesheet with a regular expression that captures anything between <style> tags.
For each stylesheet, it passes the match from the regular expression, along with the whole input file, to a function called process. That function returns a promise; once all the promises are fulfilled, their results are placed back into the HTML file, replacing the original <style> tags:
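const regex = /<style>(.*?)<\/style>/gs;

exports.crush = async function (input) {
  // Start processing every <style> block in the document
  let promises = [...input.matchAll(regex)].map((match) => {
    return process(match, input);
  });

  // Wait until every stylesheet has been purged and minified
  let replacements = await Promise.all(promises);

  // Swap each original <style> block for its processed version, in order
  return input.replace(regex, () => replacements.shift());
};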
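To see how the regular expression and the shift-based replacement fit together, here is a small standalone sketch. It uses the same regex, but swaps the real processing step for a hypothetical stand-in that uppercases each block, so it runs without PostCSS. It also shows why the s flag matters: without it, the dot does not match newlines and multi-line stylesheets are skipped.

```javascript
const regex = /<style>(.*?)<\/style>/gs;

const html = `<style>
  a { color: red }
</style>
<p>text</p>
<style>b { color: blue }</style>`;

// matchAll yields one match per <style> block; match[1] is the stylesheet body
const bodies = [...html.matchAll(regex)].map((match) => match[1].trim());
console.log(bodies); // [ 'a { color: red }', 'b { color: blue }' ]

// Without the s flag, only the single-line block is found
const singleLineOnly = [...html.matchAll(/<style>(.*?)<\/style>/g)];
console.log(singleLineOnly.length); // 1

// Hypothetical stand-in for the real processing step: uppercase each block
const replacements = [...html.matchAll(regex)].map((match) =>
  match[0].toUpperCase()
);

// replace visits the matches in document order, so shift() pairs each
// original block with its replacement
const output = html.replace(regex, () => replacements.shift());
console.log(output);
```

Note that the pattern only matches a bare <style> tag; a tag with attributes, such as <style type="text/css">, would not be picked up.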
The process function runs the extracted stylesheet through PostCSS, initialized with the PurgeCSS[1] and cssnano plugins.
When the promise is fulfilled, the result, which is the purged and minified stylesheet, is placed back into the <style> tag:
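function process(match, html) {
  return postcss([
    // Check selectors against the page with this stylesheet removed
    purgecss({ content: [{ raw: html.replace(match[1], "") }] }),
    cssnano(),
  ])
    .process(match[1])
    .then((result) => {
      // Put the purged and minified CSS back inside the <style> tag
      return match[0].replace(match[1], result.css);
    });
}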
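One edge case worth knowing about when putting the CSS back with String.prototype.replace: if the replacement is passed as a string, sequences such as $$ and $& are treated as special patterns. A stylesheet containing them could come out altered. The sketch below illustrates this with hand-written values (the "minified" string is an assumption for illustration, not actual cssnano output) and shows that passing a function as the replacement inserts the text verbatim.

```javascript
// Hand-written example values; the minified string is hypothetical
const styleTag = '<style>.price::before { content: "$$" }</style>';
const body = '.price::before { content: "$$" }';
const minified = '.price:before{content:"$$"}';

// String replacement: "$$" is an escape sequence for a single "$"
const viaString = styleTag.replace(body, minified);
console.log(viaString); // <style>.price:before{content:"$"}</style>

// Function replacement: the return value is inserted as-is
const viaFunction = styleTag.replace(body, () => minified);
console.log(viaFunction); // <style>.price:before{content:"$$"}</style>
```

Stylesheets rarely contain these sequences, so this is a possible hardening step rather than a fix the program requires.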
In the end, the index.js file looks like this:
const postcss = require("postcss");
const cssnano = require("cssnano");
const purgecss = require("@fullhuman/postcss-purgecss");

const regex = /<style>(.*?)<\/style>/gs;

exports.crush = async function (input) {
  let promises = [...input.matchAll(regex)].map((match) => {
    return process(match, input);
  });

  let replacements = await Promise.all(promises);

  return input.replace(regex, () => replacements.shift());
};

function process(match, html) {
  return postcss([
    purgecss({ content: [{ raw: html.replace(match[1], "") }] }),
    cssnano(),
  ])
    .process(match[1])
    .then((result) => {
      return match[0].replace(match[1], result.css);
    });
}
The program now takes an HTML document as a string through standard input, then purges and minifies the internal stylesheet[3]:
cat input.html | ./bin/crush.js
<html>
  <head>
    <style>body{font:22px/1.6 system-ui,sans-serif;margin:auto;max-width:35em;padding:0 1em}</style>
  </head>
  <body>
    <h1>Hello world!</h1>
  </body>
</html>
[1] When passing the stylesheet through PurgeCSS, the HTML page to check against is passed via the content option:

purgecss({content: [{raw: html.replace(match[1], "")}]})

We need to make sure the stylesheet itself isn’t included in content, as that would prevent PurgeCSS from removing unused selectors.

As an example, consider this input HTML file, which has styling for a <div> that isn’t there:

<style>div { color: red }</style>

PurgeCSS should purge that CSS selector, because there are no <div> tags on the page. However, if we pass the input file as content as-is, PurgeCSS will see “div” in the stylesheet and assume there’s a <div> tag on the page.[2]

[2] Upon closer inspection, PurgeCSS will also recognise the following document as having a <div> tag, because the word “div” is mentioned in another tag:

<h1>An article about the div tag</h1>
[3] Crush’s source code and tests are on GitHub, and it can be installed straight from the Git repository with npm:
npm install jeffkreeftmeijer/crush
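The false positives described in the first two footnotes can be reproduced without PurgeCSS. The sketch below assumes the content is scanned as a bag of words rather than parsed as HTML, which is consistent with the behaviour the footnotes describe. The extractWords function is a simplified, hypothetical stand-in, not PurgeCSS’s actual code.

```javascript
// Hypothetical word-based extractor: every run of letters, digits,
// underscores, or hyphens counts as a possible selector name
function extractWords(content) {
  return new Set(content.match(/[A-Za-z0-9_-]+/g) || []);
}

const stylesheet = "div { color: red }";
const page = `<style>${stylesheet}</style>\n<h1>Hello world!</h1>`;

// With the stylesheet left in, "div" looks like it is used on the page
const withStylesheet = extractWords(page).has("div");
console.log(withStylesheet); // true

// With the stylesheet stripped out first, "div" is no longer found
const withoutStylesheet = extractWords(page.replace(stylesheet, "")).has("div");
console.log(withoutStylesheet); // false

// A plain mention of the word in text is still enough to keep the selector
const mentionedInText = extractWords(
  "<h1>An article about the div tag</h1>"
).has("div");
console.log(mentionedInText); // true
```

Under this assumption, stripping the stylesheet from the content removes the first kind of false positive, while the second kind remains.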