Speed is situational: two websites, two orders of magnitude
How do you make your application fast? It’s very hard to say in the abstract, because “fast” has no universal meaning—what is slow in one case might be fast in another. And this is a general truth about performance optimization.
You need to understand your goals and situation, and use them to derive your performance goals. Otherwise you’ll find yourself following random advice from someone who might have very different goals than you do.
To demonstrate what I mean, I’m going to share two performance optimization stories. Both involve websites, but with very different goals and users, and therefore very different performance goals.
Website #1: the one you’re reading right now
I have a website, the website you’re currently reading. I want it to be fast. What does “fast” mean for a static website hosting articles on the Internet?
Many of the readers of this website—including you, perhaps—find out about articles elsewhere (Twitter, link-sharing sites), and click through because the title sounds vaguely interesting. That’s a fairly low level of motivation.
And many readers are on smartphones, which intermittently have Internet connections with high latency and low bandwidth. Slow connections are a problem: if it’s taking too long to load a web page, you’re more likely to hit the back button and deprive yourself of my highly educational and ever-so-practical content.
The goal, then, is to optimize the reader’s perception of speed under worst-case conditions: a low-bandwidth, high-latency network connection. The hope is that this will increase user retention, especially among less motivated readers.
Speeding things up
To make a website load faster, one thing to address is bandwidth. So when I did an optimization pass, I tried to get rid of unnecessary bandwidth usage:
- Instead of using the FontAwesome icon font—30kb or so—just for three tiny icons, I’m using a subset with just the icons I want.
- The static web hosting service compresses files, uses a CDN, etc.
This is all useful, but it’s actually not enough to make the page appear quickly. Not being a frontend developer, I did some research, and I learned a bit more about how browsers render HTML.
It turns out that if you naively use a web font in your CSS, e.g. by using the default CSS suggested by Google Fonts, the page won’t render any of the text until the web font is downloaded. The text will sit there, invisible—and given network latency on a bad smartphone connection, this can take quite a while.
So I switched to the FOUT (flash of unstyled text) technique for rendering fonts. This means the page renders immediately after the main HTML and CSS are downloaded. Initially the text renders in the browser’s default font, and then it re-renders once the web font has loaded.
The end result of all my optimizations was a reduction in perceived loading time by hundreds of milliseconds, and even more on a low-bandwidth connection. As far as I can tell, on a fast Internet connection in the US, initial rendering happens after 200ms, and the bandwidth used is about 150kb-250kb for the initial load (a number that could be improved further).
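To see why both bandwidth and latency matter, here is a rough back-of-envelope sketch. It is not a measurement of this site: the function name, the connection speeds, and the number of round trips are all hypothetical values chosen for illustration, and the model ignores real-world effects like TCP slow start, caching, and parallel downloads.

```python
def estimated_load_seconds(size_kb, bandwidth_kb_per_s, round_trips, rtt_s):
    """Very rough model: time spent waiting on round trips plus transfer time."""
    return round_trips * rtt_s + size_kb / bandwidth_kb_per_s


PAGE_KB = 200      # within the 150kb-250kb range of the initial load
ROUND_TRIPS = 2    # assumption: e.g. one request for HTML, one for CSS

# Hypothetical bad smartphone connection: 400ms round trip, 50 KB/s.
slow = estimated_load_seconds(PAGE_KB, bandwidth_kb_per_s=50,
                              round_trips=ROUND_TRIPS, rtt_s=0.4)

# Hypothetical fast connection: 20ms round trip, 5000 KB/s.
fast = estimated_load_seconds(PAGE_KB, bandwidth_kb_per_s=5000,
                              round_trips=ROUND_TRIPS, rtt_s=0.02)

print(f"slow connection: {slow:.2f}s, fast connection: {fast:.2f}s")
```

Under these assumed numbers, the slow connection is dominated by transfer time and by each extra round trip, which is why trimming kilobytes and not blocking rendering on a font download both help.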
Website #2: a highly interactive internal application
This website had a different goal and situation, and therefore performance optimization had different goals. In particular, this was an internal web UI used by a company’s employees to validate the results of a heavyweight computational process, by eyeballing the resulting images.
This UI would be used:
- By 5-10 people, no more.
- Maybe once a week each.
- Over a fast network connection.
And at the start of the optimization process, rendering was taking so long that users would get bored and wander off to get some coffee.
The goal, then, was to get render time on the page (which downloaded many megabytes of images) down to less than a minute. Compare that to the previous website, which involved shaving off milliseconds and kilobytes!
Speeding things up
The vast majority of performance issues in this case had nothing to do with frontend bandwidth, frontend latency, or browser oddities, and everything to do with backend calculation.
Some of the performance optimizations we did were:
- Ripping out some O(N³) operations and replacing them with a dictionary lookup.
- Caching intermediate calculated results on the webserver, to speed up repeat views—not scalable, but remember we only had 5-10 users.
- Switching to a less accurate but much faster variant of one of the algorithms, because for visual inspection purposes the more accurate algorithm didn’t add much.
- Switching to a compressed data storage format to reduce load time from the cloud blob storage service.
And so on.
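As an illustration of the first two items, here is a minimal Python sketch. It is not the application's actual code: the record-matching task, the function names, and the stand-in calculation are all hypothetical, and the sketch assumes each id appears at most once per list.

```python
from functools import lru_cache


def match_naive(a, b, c):
    """O(N^3): compare every combination of records from three lists."""
    matches = []
    for id_a, val_a in a:
        for id_b, val_b in b:
            for id_c, val_c in c:
                if id_a == id_b == id_c:
                    matches.append((id_a, val_a, val_b, val_c))
    return matches


def match_indexed(a, b, c):
    """O(N): index two lists by id, then make one pass using dictionary lookups."""
    b_by_id = dict(b)
    c_by_id = dict(c)
    return [
        (id_a, val_a, b_by_id[id_a], c_by_id[id_a])
        for id_a, val_a in a
        if id_a in b_by_id and id_a in c_by_id
    ]


@lru_cache(maxsize=32)
def intermediate_result(job_id: int) -> int:
    """Stand-in for a slow backend calculation; repeat calls hit the cache."""
    return sum(i * i for i in range(job_id * 1000))


a = [(1, "a1"), (2, "a2"), (3, "a3")]
b = [(3, "b3"), (1, "b1")]
c = [(1, "c1"), (3, "c3"), (4, "c4")]
assert match_naive(a, b, c) == match_indexed(a, b, c)

intermediate_result(5)   # computed
intermediate_result(5)   # served from the in-process cache
print(intermediate_result.cache_info())
```

An in-process cache like `lru_cache` lives in a single server process's memory, which is one reason this kind of caching does not scale, but with only 5-10 users that tradeoff can be acceptable.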
At the end of this process we had a web application that was still 50×-100× slower than the website you’re currently visiting—and while that was not ideal, it was a lot better than before. And it was sufficiently improved that we then moved on to more important work.
Speed is situational: a comparison
Let’s compare the two websites:
| | Website #1 | Website #2 |
|---|---|---|
| Usage | Static pages | Highly interactive |
| Network | Very slow to very fast | Very fast |
| Bottlenecks | Browser rendering algorithm, frontend network latency | Backend algorithms, backend bandwidth |
| Load time | 0.2-1 seconds | 30-60 seconds (at least!) |
Both of these are websites, but the goals, bottlenecks, user needs, optimizations, and final performance characteristics are very different. And that’s OK!
To return to your application: you can’t just “make it go fast”. Rather, you need to work backwards from your goals and situation and figure out what performance means in your particular context.