A walk through blog technologies

As with any programming project, I came across challenges and new technologies while writing the new blog website. This post describes my more interesting findings. Hopefully it will be helpful to anyone else thinking of building a static blog.

Build Process

I really wanted to try Grunt or Gulp in the project, as I thought an automated build process would speed things up, and I would need to minify code for production anyway. I had never used either tool, but I decided to go with Gulp, as I liked its use of pipe() and streaming. Alongside the standard array of minifiers, I’m using gulp-gh-pages to deploy the site to GitHub Pages.

Searching blog posts

Because Jekyll websites are static, there can be no back-end search ability. The only option is to index all blog posts and search them in the browser using JavaScript. To do this, I found lunr.js, a full-text search engine.

The idea is that you have a JSON file containing the information you wish to search (like post titles) and any metadata you need (like post URLs). You then initialize lunr.js with the JSON and the attributes you want indexed. After that, it’s as simple as running index.search('foo'), which returns a list of matched blog posts. I index the post title, tags, and date in human-readable format (so you can search for something like ‘tuesday’ to find all posts written on a Tuesday).

Jekyll can generate the JSON file to be statically served, but we can take this a step further. Lunr.js also allows you to export the compiled index and load it back in. This means the index can be compiled once at build time and served to clients to load into lunr.js, which speeds up the search initialisation. To have the index compiled at build time, I modified the gulp-lunr plugin to suit the fields I’ll be searching.
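The build-once, load-many idea can be sketched without lunr.js itself. The following is a toy inverted index, not lunr.js's API or implementation; the function names (buildIndex, search, readableDate), the sample posts, and the choice of fields are all made up for illustration. It builds an index from post JSON, serialises it to a string as a build step might, then loads it back and searches it.

```javascript
// Toy stand-in for a full-text index. This is NOT lunr.js's API.
// The sample posts below are hypothetical.
const posts = [
  { url: '/2015/12/hello-world', title: 'Hello World', tags: ['meta'], date: '2015-12-01' },
  { url: '/2015/12/gulp-notes', title: 'Notes on Gulp', tags: ['build', 'gulp'], date: '2015-12-09' },
];

// Human-readable date, so that a query like "tuesday" can match a post.
function readableDate(isoDate) {
  return new Date(isoDate + 'T00:00:00Z').toLocaleDateString('en-US', {
    weekday: 'long', day: 'numeric', month: 'long', year: 'numeric', timeZone: 'UTC',
  });
}

// Lowercase the text and split it into alphanumeric tokens.
function tokenize(text) {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
}

// "Build time": map each token to the URLs of the posts that contain it.
function buildIndex(docs) {
  const index = new Map();
  for (const doc of docs) {
    const text = [doc.title, doc.tags.join(' '), readableDate(doc.date)].join(' ');
    for (const token of new Set(tokenize(text))) {
      if (!index.has(token)) index.set(token, []);
      index.get(token).push(doc.url);
    }
  }
  return index;
}

// Return the URLs of posts that contain every token in the query.
function search(index, query) {
  const lists = tokenize(query).map((token) => index.get(token) || []);
  if (lists.length === 0) return [];
  return lists.reduce((acc, list) => acc.filter((url) => list.includes(url)));
}

// Serialise once at build time, then load the result on the client.
const serialised = JSON.stringify([...buildIndex(posts)]);
const loaded = new Map(JSON.parse(serialised));

console.log(search(loaded, 'tuesday'));
```

The client only pays for JSON.parse and a Map constructor, rather than tokenizing every post again on each page load.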

You can also search the blog directly from the URL. Any parameters will be passed as a search query, e.g. /search?pascalDE.
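Reading a bare query string like this takes only a few lines. This is a minimal sketch, assuming the whole string after the ? is the search term; queryFromUrl is a hypothetical helper, not the blog's actual code.

```javascript
// Hypothetical helper: treat everything after "?" as the search query.
function queryFromUrl(href) {
  // The base URL is only there so relative paths like "/search?foo" parse.
  const { search } = new URL(href, 'http://localhost');
  // Drop the leading "?", turn "+" into spaces, and decode %-escapes.
  return decodeURIComponent(search.slice(1).replace(/\+/g, ' '));
}

console.log(queryFromUrl('/search?pascalDE'));
console.log(queryFromUrl('/search?hello%20world'));
```

In the browser the same function could be called with window.location.href and its result passed to the search index.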

Tagging with hash maps

Originally I was going to use lunr.js to find all posts with a certain tag and display them on a page. I realised this would be unnecessary because tags are always fixed strings - they need to match exactly to map a blog post to a tag.

A /tagged page could simply be done by having a JSON file with two hash maps. One hash would use tags as keys with the value being a list of blog post slugs (the title formatted for a URL). The second would have the blog post slugs as keys and their metadata object (link, date etc.) as values. The post data is kept separate to avoid repetition.

This makes finding tagged posts really quick. The page will look up the tag from the URL parameters, then look up each of the blog posts and display them on-screen, for example: /tagged?gaming.
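As a rough sketch of the two-map layout, assuming a JSON shape like the one below (the key names tags and posts, the sample slugs, and the postsTagged helper are all hypothetical):

```javascript
// Hypothetical shape of the tags JSON file: one map keyed by tag, one by slug.
const tagData = {
  tags: {
    gaming: ['first-post', 'second-post'],
    travel: ['second-post'],
  },
  posts: {
    'first-post': { title: 'First Post', url: '/2015/11/first-post/', date: '2015-11-02' },
    'second-post': { title: 'Second Post', url: '/2015/12/second-post/', date: '2015-12-05' },
  },
};

// Look up the tag's slugs, then resolve each slug to its metadata.
function postsTagged(data, tag) {
  const known = Object.prototype.hasOwnProperty.call(data.tags, tag);
  const slugs = known ? data.tags[tag] : [];
  return slugs.map((slug) => data.posts[slug]);
}

console.log(postsTagged(tagData, 'gaming').map((post) => post.title));
```

Both steps are plain key lookups, so no searching is involved, and each post's metadata is stored once no matter how many tags point at it.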

I thought that an index of every blog post would carry a large file size, but currently it is around 94 KB, so its download footprint is about the same as a small image.

Syntax highlighting code

I use the popular highlight.js library to syntax highlight code client-side. Here’s an example:

// Hello1.cs
public class Hello1
{
   public static void Main()
   {
      System.Console.WriteLine("Hello, World!");
   }
}

Image galleries

One of the most challenging tasks was replicating the WordPress image galleries. I quite like the layout WordPress gives for the Mosaic feature of its galleries. Unfortunately I couldn’t find anything as good as that online, and my attempts at making my own didn’t turn out well either.

I came across two libraries in my search. Photoset-grid allows you to specify rows of images and it will automatically lay them out for you client-side. For example, if you give it the sequence ‘121’ with 4 images, it will put the 1st on a row all to itself, the next two side-by-side on the next row, and the last image full size on its own row like the first.
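The grouping behind a layout string can be sketched in a few lines. This only illustrates the idea of splitting images into rows; it is not photoset-grid's code, and layoutRows is a hypothetical name.

```javascript
// Split a list of images into rows according to a layout string,
// where each digit is the number of images on that row.
function layoutRows(layout, images) {
  const rows = [];
  let next = 0;
  for (const digit of layout) {
    const count = Number(digit);
    rows.push(images.slice(next, next + count));
    next += count;
  }
  return rows;
}

console.log(layoutRows('121', ['a.jpg', 'b.jpg', 'c.jpg', 'd.jpg']));
```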

I really liked the neat zooming images on Medium, and I wanted something like it here. I decided to go with photoswipe, a lightbox library. A cool feature is how well it handles touch input, letting you swipe between images, pinch to zoom, and drag to pan.

To combine these libraries together, I created my own Liquid tag (Liquid being the templating language used by Jekyll) that would output the HTML in the format both libraries needed. Here is an example of galleries being used in a post, and here is an example of the Liquid block:

{% photoset 12%}
-   url: 2015/12/20151205_222237769_ios.jpg
    width: 1340
    height: 1005
-   url: 2015/12/20151205_225627754_ios.jpg
    width: 1340
    height: 1005
-   url: 2015/12/20151205_205912898_ios.jpg
    width: 1340
    height: 1005
{% endphotoset %}

Importing WordPress posts

Jekyll has its own importer from WordPress, but it outputs to HTML. I wanted to use Markdown to be consistent with future posts and to make it easier to make changes (and fix formatting issues from the import). The Jekyll docs listed exitwp, which would do this. I also had to implement linebreaks_wp.py from this pull request to handle WordPress’s weird newlines. Obviously I wanted to automate as much as possible, so I also wrote some Ruby scripts to convert galleries, captions, and code blocks.

Drafting and publishing posts

I also wrote my first Node CLI app to perform tasks like creating and publishing drafts. You can find the source code on GitHub if you want to use it yourself. Creating a draft is as easy as:

$ post draft "Hello World!"

This will create a file in Jekyll’s _drafts folder called hello-world.md with basic front matter. Publishing the post will move it to _posts and add the current date and time to the front matter.
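The two steps might look something like the sketch below. The helper names (slugify, draftFile, publish), the layout: post line, and the exact date format are assumptions; the real CLI may do all of this differently. The date-prefixed filename follows Jekyll's naming convention for files in _posts.

```javascript
// Hypothetical helpers; the real CLI may slugify and format differently.
function slugify(title) {
  return title
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '-') // runs of other characters become one hyphen
    .replace(/^-+|-+$/g, '');    // trim hyphens from both ends
}

// Build the draft's path and basic front matter (Jekyll reads drafts from _drafts).
function draftFile(title) {
  const contents = ['---', 'layout: post', `title: ${JSON.stringify(title)}`, '---', ''].join('\n');
  return { path: `_drafts/${slugify(title)}.md`, contents };
}

// Move the draft to _posts and add the date before the closing "---".
// This sketch assumes the draft holds front matter only, with no body yet.
function publish(draft, now) {
  const stamp = now.toISOString();
  const name = draft.path.replace(/^_drafts\//, '');
  return {
    path: `_posts/${stamp.slice(0, 10)}-${name}`,
    contents: draft.contents.replace(/\n---\n$/, `\ndate: ${stamp}\n---\n`),
  };
}

const draft = draftFile('Hello World!');
console.log(draft.path);
console.log(publish(draft, new Date()).path);
```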