Jeff Kreeftmeijer

Introducing Tapir: Simple search for static sites

2011-05-09

Note: I stopped working at 80beans, so I can’t help you with any issues anymore. You can always send an e-mail to Tapir’s support e-mail address if you need help, though.

Static site generators like Jekyll allow you to generate your whole website or blog as static HTML files and put it online without having to use a database or even building an actual application.

While being static has a lot of advantages, it can be quite a pain in some situations. Since you don’t have a database, you can’t build a commenting system so your readers can comment on what you write, for example. Some other things are analytics and contact forms. Luckily, there are really nice solutions — like Disqus, Google analytics and Wufoo — that allow you to implement these features using a simple javascript snippet.

A feature I really wanted in my website was search, since the archive grew quite a bit and it became impossible to find an article without knowing its title, so I started looking for a solution I could use.

After asking around, it turned out Google’s custom search is still the most used out there. The problem I ran into was that it takes you off the website you were searching on and takes you to a Google results page (complete with slightly irrelevant ads).

I wanted a more elegant solution, with a nice JSON API that would allow me to build a search feature on my website using only javascript, but I found out that most of the existing search engines didn’t have an API, made me put their logo on my website or gave me a request limit.

After discussing this at 80beans, we decided to get some beers, order dinner and throw something together ourselves. After a couple of hours, we had a working prototype and we asked @ivanasetiawan to draw us an awesome Tapir and create a nice design for the website. We took a week to tweak and test the application before we released it into the wild.

Tapir, go!

Tapir is a simple application that indexes your RSS feed and uses Tire (which is powered by Elasticsearch, which is powered by Lucene) to index and search it. It gives you a straightforward JSON API that returns the results.

After signing up and getting your token, you’ll be able to use this API to search your site. For example, here’s one of my results when searching for “Capybara”:

[
  {
      "title":"Capybara ate Swinger",
      "published_on":"2011-02-07T05:00:00Z",
      "content": [the full article content],
      "link":"http://jeffkreeftmeijer.com/2011/capybara-ate-swinger",
      "summary":"Remember Swinger, the Capybara RSpec driver swapper? Capybara can now swap drivers out of the box.",
      "_score":61.15513
    }
  ]

Tapir will check your feed every fifteen minutes to see if there’s anything new to index, but of course you can also ping it to update your search results immediately.

Using that, you can build your own search feature, which requests its results from Tapir’s API. If you want to give it a go, check out the one I implemented.

What’s next?

Over the last week we’ve been busy tweaking the search results and making the application more usable. We have a couple of plans to make Tapir more awesome, but we definitely need your help. If you start using Tapir, have a great idea or if you run into any issues, @tapirgo would be glad to hear about it or try help you out.

While we built Tapir to search static sites, you can use it for anything that has an RSS feed, of course. @thedjinn created a news search that searches multiple feeds, like CNN and the Dutch NOS. If you end up doing something awesome we hadn’t thought about before, be sure to let us know.

Now, what do you think of Tapir?