First Crack Release Notes, March 2020
In last month’s release notes, I talked about First Crack’s rewrite: the things I set out to accomplish, the changes I made, and their performance costs. Although a simple fix later slashed First Crack’s runtime, I waited to post the code until I could talk about a few things here.
After a snarky commenter mocked my hands-on approach to concurrency in Sequential Execution, Multiprocessing, and Multithreading IO-Bound Tasks in Python, I decided to rewrite First Crack the “right” way. By submitting jobs to a processor pool with `concurrent.futures` in real time rather than in batches with `multiprocessing`, this new approach should have outperformed my hacky one. It did not; in fact, it made First Crack 50% slower. Ouch. I stuck with it, though, because this was the “obvious right way”. I gave tuning a try next.
A few discouraging tries later, I had done no better. Restructuring the code to take advantage of concurrency should not have caused this, which left one culprit. A key sentence in the `concurrent.futures` documentation supported my suspicion: “The concurrent.futures module provides a high-level interface for asynchronously executing callables.” Given that each layer of abstraction costs some performance (a general critique of Python as a whole, compared to low-level C-based languages), I theorized that this library must sit somewhere above, and thus run slower than, Python’s `multiprocessing` library. A simple swap of `concurrent.futures.Executor.submit(job)` for `multiprocessing.Pool.apply_async(job)` proved it: First Crack’s runtime plummeted from around 1.2 seconds to between 0.4 and 0.6.
I must emphasize this: a two- to three-fold speedup required no meaningful changes; for the most part, I just replaced calls to `concurrent.futures` with calls to `multiprocessing`.
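The swap at the heart of this fix can be sketched as follows. This is a minimal, hypothetical example rather than First Crack’s actual code: `build_page` is a stand-in for rendering one page, and the job sizes are arbitrary.

```python
# Minimal sketch of the swap described above, assuming a CPU-bound job.
# build_page and the job list are illustrative stand-ins, not First
# Crack's actual code.
import concurrent.futures
import multiprocessing

def build_page(n):
    # Stand-in for rendering one page: a small CPU-bound task.
    return sum(i * i for i in range(n))

def run_futures(jobs):
    # The "obvious right way": concurrent.futures' high-level
    # interface, which wraps a multiprocessing pool underneath.
    with concurrent.futures.ProcessPoolExecutor() as pool:
        futures = [pool.submit(build_page, n) for n in jobs]
        return [f.result() for f in futures]

def run_pool(jobs):
    # The faster path: dispatching straight to multiprocessing.Pool.
    with multiprocessing.Pool() as pool:
        results = [pool.apply_async(build_page, (n,)) for n in jobs]
        return [r.get() for r in results]

if __name__ == "__main__":
    jobs = [1_000] * 4
    # Both produce identical results; only the dispatch overhead differs.
    print(run_futures(jobs) == run_pool(jobs))
```

Both functions fan the same work out to worker processes; the difference is purely how many layers sit between the call and the pool.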
For those curious as to why I saw such a drastic difference, I suggest checking out CloudFlare’s excellent writeup on speeding up Linux disk encryption, or at the very least the section on digging into the source code. In short, whenever you have a wrapper on a wrapper — or a queue for a queue — performance suffers. Avoid those situations as much as possible.
This experience reinforced two important lessons:
- Knowledge and certainty are inversely proportional. The better you understand something, the less certain of your understanding you become. This explains why the commenter made such a strong case that I had done all the wrong things, and why I fell for it. After all, what do I know? I’m just some self-taught programmer who believes in premature optimization and writing performant Python.
- Just because a language is not optimized for performance does not mean you should not optimize your code for performance. I doubt your Python script will ever beat a compiled C binary, but you should still do everything you can to close that gap — and there is a lot you can do to close that gap in Python. This language does not force you to do those things, which makes it a joy to use, but do not forget that you can.
Aside from proving someone wrong, I added a line to automatically open a web browser when previewing your website with `make preview` or `make public`. Python’s standard library module `webbrowser` made this easy, with `open_new_tab("http://localhost:8000")`.
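For illustration, here is a minimal sketch of how that one-liner might slot into a preview command. The server setup, port, and `open_browser` flag are my own assumptions; only the `webbrowser.open_new_tab` call comes from the post.

```python
# Hypothetical sketch: serve the built site locally and open a browser
# tab pointing at it, roughly what a preview command would trigger.
# The server setup and open_browser flag are illustrative assumptions.
import http.server
import threading
import webbrowser

def preview(port=8000, open_browser=True):
    # Serve the current directory (the built site) on localhost.
    server = http.server.ThreadingHTTPServer(
        ("localhost", port), http.server.SimpleHTTPRequestHandler
    )
    threading.Thread(target=server.serve_forever, daemon=True).start()
    if open_browser:
        # The one-liner from the post: pop open a new browser tab.
        webbrowser.open_new_tab(f"http://localhost:{port}")
    return server
```

Returning the server lets the caller shut the preview down cleanly with `server.shutdown()` when finished.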
I also experimented with microformats this month. They look simple, but I could not get the validator to pull the proper metadata out of my markup. Getting microformats right may require some structural changes that, although minor, I have little incentive or desire to take on right now.
Feature Roadmap
Along with general maintenance and my constant pursuit of optimization, I still want to get these things done.
Release Markdown Parser
I want to release my Markdown parser as its own project. At present, it implements the subset of the spec I use on a regular basis and handles files one line at a time. I fixed a few bugs during the rewrite, but I still have others to work out. At the least, I want to go public with greater coverage of the spec, and with the ability to handle multi-line strings and entire files at once. My main goal is to design a performant Markdown parser and then write an efficient implementation of it; several people have already done interesting work in this space.
Publish Implementation of Markdown Spec
Along with the release of my Markdown parser, I will need to outline the peculiarities of my implementation. Parity with John Gruber’s spec would make sense, as would something like GitHub Flavored Markdown, which has much more detailed documentation, so I may go that route; if not, I will need to produce my own documentation. This would cover weird edge cases for the most part, but it would also give those who use my engine some sort of explanation for why their article looks weird. In brief, my argument against going with a standard comes down to the fact that I have little use for most of those features and edge cases. Once this becomes its own project that others may use, though, this argument gets shakier. I will have to spend some time thinking about this before I move forward.
Improve Documentation
A few of the ways I think I can improve the README in particular:
- Re-create usage GIFs. I had a few neat GIFs that showed off First Crack’s simple install process and easy use case, but I will have to re-create those after the rewrite.
- Performance graphs comparing First Crack’s back-end against other, similar engines. First Crack builds a website of over one thousand pages in less than two seconds, and I want to highlight that.
- Performance graphs of the web pages First Crack builds versus the pages common content management systems build.
- Screenshots. This site is a live demo of the engine, but I like the idea of having a few pictures in there, too.
As always, I look forward to the work ahead.