Enter First Contentful Paint (FCP) — the moment when the browser renders the first visible content on the screen. Instead of staring at a blank page, users see the initial elements of your site, providing a visual cue and confirmation that something is happening. This moment is key in shaping their perception of speed and responsiveness.
Spotting the issue
For this project, we looked at the 80th percentile instead of Google's 75th percentile. While the website itself had fairly clean HTML source code, RUMvision data showed that FCP was typically above 2 seconds (2247ms) on mobile devices.
This was higher than Google's recommended threshold of 1.8 seconds. And given the audience (mainly the Netherlands at the time of writing), an FCP below 1.8 seconds should certainly be achievable.
From here, there are two ways to improve FCP:
- Reduce the Time to First Byte (TTFB);
- Reduce the time spent by the browser between receiving that first byte and reaching the HTML's `<body>` to start rendering the first pixel.
FCP delta
For the latter, we could look at the `firstByteToFCP` breakdown that the official web-vitals library provides. RUMvision is built on top of Google's library, after all.
Looking at just the delta, we saw that more than 1.6 seconds was spent there. Even without improving TTFB, focusing solely on this sub-part can significantly enhance overall FCP.
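As a sketch of how to observe this delta yourself, the snippet below uses the attribution build of Google's web-vitals library (loaded here from unpkg for illustration; a bundler works just as well). The `firstByteToFCP` field is part of the library's FCP attribution object:

```html
<script type="module">
  // The attribution build exposes a breakdown of each metric
  import {onFCP} from 'https://unpkg.com/web-vitals@4/dist/web-vitals.attribution.js?module';

  onFCP(({value, attribution}) => {
    // Total FCP = time to first byte + time from first byte to first paint
    console.log('FCP:', value);
    console.log('TTFB:', attribution.timeToFirstByte);
    console.log('first byte to FCP (the delta):', attribution.firstByteToFCP);
  });
</script>
```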
Pageviewtype
Data and findings become more interesting when segmenting the data. When looking at the different pageviewtypes, we saw the following:
| Pageviewtype | FCP delta (ms) |
|---|---|
| overall | 1636 |
| unique | 1995 |
| returning | 1327 |
| successive | 1076 |
In other words, there's a significant gap between first cold pageloads within a new session and successive pageviews, where resources are likely cached already: a gap of almost a second (roughly 900ms). Put differently, the perceived performance for real visitors on mobile devices is 85% worse on first pageloads (1995ms vs 1076ms) compared to successive ones. That's a clear signal as to where to start investigating: a render-blocking resource is regressing UX significantly more during a first pagehit.
Waterfall investigation
In these cases, I think a WebPageTest waterfall is easier to scan than a waterfall in Chrome DevTools. So I went to WebPageTest and started a run. Below is the outcome:
It was not surprising, and actually a relief, to see this waterfall, as I immediately knew we could achieve huge wins here with a few changes. This is what the waterfall tells us:
- in this synthetic test, FCP happens just after the 6.0-second mark;
- this is caused by many JS tasks between the 3.7 and 5.7-second marks (`defer`red and `fetchpriority="low"` scripts, and scripts listening to the `DOMContentLoaded` event);
- in between, rendering tasks are happening, probably the result of changed classes forcing the browser to re-render elements.
But there's a more interesting part:
Multiple roundtrips
From the moment the first bytes were received (the 0.9-second mark), there isn't a lot of work happening on the browser's main thread. Something sitting in between is pushing back both `DOMContentLoaded` and FCP. That's where our win is:
- according to the cross symbol on the right, request number 8 is render-blocking. However, it isn't discovered by the browser right away;
- scanning vertically upwards, we can spot that this request (a CSS file on p.typekit.net) only starts to download somewhere after request number 5 has finished downloading. But the browser was busy executing JS, so it couldn't yet parse that CSS from use.typekit.net;
- and even that CSS file isn't discovered by the browser right away. Apparently, it depends on yet another file. And it does: the CSS from use.typekit.net only starts to download around the 1.6-second mark, around the time request number 2 (also a stylesheet) finished downloading.
A seasoned pagespeed consultant (or a developer who reads waterfalls from time to time) might immediately recognize what is happening here: `@import` is used from within a render-blocking file. And not just once, but twice, causing three round trips and extending the duration of a blank screen.
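Schematically, the discovery chain looked like this (the local stylesheet name is hypothetical, and the exact p.typekit.net path isn't named in the waterfall, so it's elided):

```html
<!-- index.html: the preload scanner only sees this first stylesheet (request 2) -->
<link rel="stylesheet" href="/css/theme.css">

<!--
  /css/theme.css then contains:

    @import url("https://use.typekit.net/xxxxxxx.css");  /* round trip 2 */

  and that Typekit stylesheet in turn contains:

    @import url("https://p.typekit.net/…");              /* round trip 3 */

  Each @import is only discovered after the file containing it has been
  downloaded and parsed, so neither file is visible to the preload scanner.
-->
```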
The impact of @import
As CSS files are typically cached by the browser, we can expect a bigger bottleneck during cold pageloads, while the impact on UX across follow-up pagehits shouldn't be as big. But when files are not cached, the impact will be substantial.
Even when a browser is stuck on parser-blocking resources, the browser's preload scanner can help us out and at least detect and download resources in time. But if we hide render-blocking resources from the preload scanner by putting them in files that are yet to be downloaded, the browser won't know about these resources in time.
And that's exactly what we end up doing when using the `@import` CSS rule.
It turns out that nearly one in five websites is potentially and unknowingly sabotaging its own performance with `@import`. I wrote a follow-up article in the 2024 edition of the yearly web performance calendar with `@import` usage statistics.
Fixing the @import issue
The fix seems easy: swap out all `@import` rules for regular `<link>` elements. But that's a bit more difficult when dealing with Typekit, Adobe's hosted web font service, as `@import` is one of their suggested methods of embedding Typekit fonts.
Moreover, to track usage, Typekit does yet another `@import`, to a hostname called p.typekit.net. The CSS that is downloaded there is just an empty file containing only opening and closing CSS comment symbols: `/* */`
So, our hands are tied. To a certain degree at least. A merchant could choose to go with a different implementation strategy:
- going with a different font, so one can either self-host the web fonts or choose a hosting service that comes with a significantly smaller impact;
- using Typekit's JavaScript equivalent
Typekit JS equivalent
However, the JS equivalent for embedding Typekit should not be considered a better alternative for multiple reasons:
- it costs more time to implement, as it involves JS instead of a simple stylesheet, while the JS equivalent will just end up downloading the same initial stylesheet served from use.typekit.net (request number 5, and then again 8);
- you only end up introducing more JS, while JS tends to be the bigger web performance bottleneck nowadays due to the constraints of the devices our visitors are using;
- as a result, there will be at least a similar delay towards font delivery.
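For context, Typekit's classic JavaScript embed looks roughly like the sketch below (kit ID redacted as xxxxxxx; treat this as an approximation of Adobe's embed code rather than a verbatim copy). The kit JS ends up injecting the very same use.typekit.net stylesheet:

```html
<!-- Approximate Typekit JS embed: an extra script to download and execute,
     which then injects the same use.typekit.net stylesheet anyway -->
<script src="https://use.typekit.net/xxxxxxx.js"></script>
<script>
  try { Typekit.load({ async: true }); } catch (e) {}
</script>
```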
In other words, this option should not be considered and a different approach is needed to mitigate the impact of using a licensed Adobe font.
The HTML fix
Luckily, the fix here only involves HTML, meaning it is easy to implement. We now know that `@import` basically hides the imported CSS until the file containing this at-rule has itself been downloaded and parsed. We should serve this information to the browser as soon as possible, allowing it to discover and download the file in advance.
We can achieve this by using resource hints, preload and preconnect in particular. The HTML then results in the following:

```html
<!-- Allow the preload scanner to detect the main file in time,
     but prevent it from being render-blocking -->
<link rel="preload" href="https://use.typekit.net/xxxxxxx.css" as="style">
<link href="https://use.typekit.net/xxxxxxx.css" rel="stylesheet">

<!-- Fonts will be downloaded from use.typekit.net as well,
     so preconnect (with crossorigin, as we're dealing with font files) -->
<link rel="preconnect" href="https://use.typekit.net" crossorigin>

<!-- The CSS will download yet another CSS file from p.typekit.net,
     so preconnect (but without crossorigin) -->
<link rel="preconnect" href="https://p.typekit.net">
```
Asyncing the stylesheet
Alternatively, if your custom fonts aren't used for above-the-fold or critical text, you could decide to load your Typekit files in an asynchronous way. This would result in the following:
```html
<link href="https://use.typekit.net/xxxxxxx.css" rel="stylesheet"
      media="print" onload="this.media='all'">
```
That way, both the initial (and externally hosted) CSS and the imported CSS from within this initial CSS won't be render blocking anymore.
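One caveat worth noting, and not part of the original fix: the `media="print"` trick relies on JavaScript (the `onload` handler) to swap the media attribute. A common precaution is a `<noscript>` fallback so the fonts still load for visitors without JavaScript:

```html
<!-- Fallback for non-JS visitors: render-blocking there, but at least the
     Typekit CSS still loads. Kit ID (xxxxxxx) as in the snippets above. -->
<noscript>
  <link href="https://use.typekit.net/xxxxxxx.css" rel="stylesheet">
</noscript>
```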
The FCP gains without @import
FCP delta wins
FCP delta improvements were as follows:
In other words: the overall FCP delta dropped from 1636ms to 1026ms, an improvement of roughly 37%. This was mainly driven by the win across unique pageviews, where the FCP delta for cold pageloads dropped from 1995ms to 1117ms (a 44% improvement).
Overall FCP wins
By shaving off more than 900ms, this resulted in an overall FCP improvement as well. In other words, real users would certainly notice the difference. The FCP data confirms this:
Overall FCP dropped from 2247ms to 1611ms, mainly driven by the drop from 2782ms to 1872ms across unique pagehits.