Purge and minify internal CSS stylesheets with Node.js

For an HTML file with an internal CSS stylesheet, I’d like to remove all unused selectors and minify the result.

<html>
  <head>
    <style>
      body{
        font: 22px/1.6 system-ui, sans-serif;
        margin: auto;
        max-width: 35em;
        padding: 0 1em;
      }

      img, video{
        max-width: 100%;
        height: auto;
      }
    </style>
  </head>
  <body>
    <h1>Hello world!</h1>
  </body>
</html>

For this page, the expected output drops the unused img and video selectors and minifies the remaining stylesheet by stripping whitespace and newlines:

<html>
  <head>
    <style>body{font:22px/1.6 system-ui,sans-serif;margin:auto;max-width:35em;padding:0 1em}</style>
  </head>
  <body>
    <h1>Hello world!</h1>
  </body>
</html>

The most popular tools to do this seem to be PurgeCSS for purging and cssnano for minifying, which are both PostCSS plugins.

Their command-line interfaces won’t work here: both expect the CSS and HTML to be in separate files, and neither has an option to extract the internal stylesheet from a page. Instead, we’ll write a Node.js program that takes an HTML file as input and prints it back out after purging and minifying its internal stylesheets.

The program takes an HTML page through standard input with fs.readFileSync("/dev/stdin"), which it then passes to a function named crush:

#! /usr/bin/env node
const fs = require("fs");
const crush = require("..");
let input = fs.readFileSync("/dev/stdin").toString();

crush.crush(input).then(console.log);
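Reading from /dev/stdin only works on systems where that path exists. As a minimal sketch of an alternative, assuming nothing beyond Node's built-in stream module, the input can be collected from any readable stream. The readAll helper and the fakeStdin stand-in are hypothetical names for illustration and aren't part of crush:

```javascript
const { Readable } = require("stream");

// Hypothetical helper: gather every chunk of a readable stream into one string.
async function readAll(stream) {
  const chunks = [];
  for await (const chunk of stream) {
    // Chunks may arrive as strings or Buffers; normalize to Buffers.
    chunks.push(Buffer.from(chunk));
  }
  return Buffer.concat(chunks).toString("utf8");
}

// Stand-in for process.stdin so the sketch runs without piped input.
const fakeStdin = Readable.from(["<html>", "</html>"]);
readAll(fakeStdin).then(console.log);
```

In bin/crush.js, passing process.stdin to readAll would take the place of the fs.readFileSync call.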

The crush function takes the input HTML string and finds each stylesheet with a regular expression that captures anything within <style> tags. For each stylesheet, it passes the match from the regular expression along with the whole input file to a function called process. Once all the promises it returns have resolved, their results replace the original <style> tags in the HTML file:

const regex = /<style>(.*?)<\/style>/gs;

exports.crush = async function (input) {
  let promises = [...input.matchAll(regex)].map((match) => {
    return process(match, input);
  });
  let replacements = await Promise.all(promises);

  return input.replace(regex, () => replacements.shift());
};
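To see how the matches and their replacements line up, here's a small sketch of the same pattern with the PostCSS step swapped for a stand-in. The fakeProcess and crushSketch names are made up for illustration, and fakeProcess only strips whitespace instead of purging or minifying:

```javascript
const regex = /<style>(.*?)<\/style>/gs;

// Stand-in for the real process function: resolves to the <style> tag
// with all whitespace removed from the stylesheet.
function fakeProcess(match) {
  return Promise.resolve(`<style>${match[1].replace(/\s+/g, "")}</style>`);
}

async function crushSketch(input) {
  // match[0] is the whole tag, match[1] is the stylesheet inside it.
  const promises = [...input.matchAll(regex)].map((match) => fakeProcess(match));
  // Promise.all keeps the results in the order the matches were found.
  const replacements = await Promise.all(promises);
  // The second pass visits the same matches in the same order,
  // so each tag receives its own replacement.
  return input.replace(regex, () => replacements.shift());
}

crushSketch("<style>a { b: c }</style><p></p><style>\n d { e: f }\n</style>")
  .then(console.log);
```

Because the regular expression only matches a bare <style> tag, a tag with attributes would pass through untouched in both the sketch and the original.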

The process function runs the extracted stylesheet through PostCSS, initialized with the PurgeCSS[1] and cssnano plugins. When the promise is fulfilled, the result, which is the purged and minified stylesheet, is placed back into the <style> tag:

function process(match, html) {
  return postcss([
    purgecss({ content: [{ raw: html.replace(match[1], "") }] }),
    cssnano(),
  ])
    .process(match[1])
    .then((result) => {
      return match[0].replace(match[1], result.css);
    });
}

In the end, the index.js file looks like this:

const postcss = require("postcss");
const cssnano = require("cssnano");
const purgecss = require("@fullhuman/postcss-purgecss");
const regex = /<style>(.*?)<\/style>/gs;

exports.crush = async function (input) {
  let promises = [...input.matchAll(regex)].map((match) => {
    return process(match, input);
  });
  let replacements = await Promise.all(promises);

  return input.replace(regex, () => replacements.shift());
};

function process(match, html) {
  return postcss([
    purgecss({ content: [{ raw: html.replace(match[1], "") }] }),
    cssnano(),
  ])
    .process(match[1])
    .then((result) => {
      return match[0].replace(match[1], result.css);
    });
}

The program now takes an HTML document as a string through standard input, then purges and minifies the internal stylesheet[3]:

cat input.html | ./bin/crush.js
<html>
  <head>
    <style>body{font:22px/1.6 system-ui,sans-serif;margin:auto;max-width:35em;padding:0 1em}</style>
  </head>
  <body>
    <h1>Hello world!</h1>
  </body>
</html>

  1. When passing the stylesheet through PurgeCSS, the HTML page to check against is passed via the content option:

    purgecss({content: [{raw: html.replace(match[1], "")}]})
    

    We need to make sure the stylesheet itself isn’t included in content, as that would prevent PurgeCSS from removing unused selectors.

    As an example, consider this input HTML file, which has styling for a <div> which isn’t there:

    <style>div { color: red }</style>
    

    PurgeCSS should purge that CSS selector, because there are no <div> tags on the page. However, if we pass the input file as content unchanged, PurgeCSS will see “div” in the stylesheet and assume there’s a <div> tag on the page.[2]

  2. On closer inspection, PurgeCSS also treats the following document as containing a <div> tag, because the word “div” appears in the text of another tag:

    <h1>An article about the div tag</h1>
    
  3. Crush’s source code and tests are on GitHub, and it can be installed directly from the repository with npm:

    npm install jeffkreeftmeijer/crush
    
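The matching behaviour described in footnotes 1 and 2 can be approximated without PurgeCSS. This sketch assumes the content is split into word-like tokens and a selector counts as used when its name appears among them. The extractTokens and isUsed helpers and the regular expression are illustrative assumptions, not PurgeCSS's actual code:

```javascript
// Assumed, simplified extraction: every run of word-like characters is a token.
function extractTokens(content) {
  return new Set(content.match(/[A-Za-z0-9_-]+/g) || []);
}

// A selector name counts as "used" when it appears as a token in the content.
function isUsed(selectorName, content) {
  return extractTokens(content).has(selectorName);
}

const page = "<style>div { color: red }</style><h1>Hello world!</h1>";

// The stylesheet itself mentions "div", so the selector looks used.
console.log(isUsed("div", page)); // true

// With the stylesheet removed from the content, it no longer does.
console.log(isUsed("div", page.replace("div { color: red }", ""))); // false

// A mention in ordinary text is enough to keep the selector.
console.log(isUsed("div", "<h1>An article about the div tag</h1>")); // true
```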