How to Improve Core Web Vitals with Varnish

In this session, Thijs Feryn (Technical Evangelist, Varnish Software) explains how to improve Core Web Vitals using Varnish caching technology. He breaks down how caching at the edge reduces server load, improves Time to First Byte (TTFB), and ultimately speeds up perceived page load. The session focuses on how Varnish works as a reverse proxy cache and how it helps deliver faster, more stable web experiences at scale.

Hey everyone, you were just listening to Marco Nikolic on how you can fix 80% of the problems of your WordPress website. The next session is by Thijs Feryn. He is an evangelist and engineer at Varnish Software, and an expert in website performance and content delivery. The topic is how you can improve your Core Web Vitals using Varnish. So stay tuned and watch this session. Hello ladies and gentlemen, my name is Thijs. Thank you for joining me here today. I really appreciate it. A big shout out also to the folks at Cloudways, and more specifically Denish, for inviting me to talk about how to improve your Core Web Vitals with Varnish, which is just another way of saying: how do we make your website or application a lot faster by using Varnish caching technology? As a matter of fact, ladies and gentlemen, I've been traveling the world and doing a lot of outreach for the last decade and a half, telling people how much I think slow websites suck. That's my core message to you today, and I hope this pretty strong message, I must admit, resonates as much with you as it does with me. But first things first. Hi, my name is Thijs. T-H-I-J-S, that's how you spell it. I'm a Dutch-speaking Belgian, and I'm the tech evangelist at Varnish Software. For those of you who don't know Varnish Software: we're a European company with a global presence, of course, that builds software-defined web acceleration and content delivery solutions, with a very strong emphasis on software-defined. That's not the only thing we do, but that's our core business. And you can host that software anywhere you want. You have freedom of architecture, freedom of infrastructure, and you can run it on your own bare-metal hardware, in a colo space, or with a hosting company on a VM or a VPS.
You can run it in the cloud, you can deploy it via Docker, or you can take those Docker containers and orchestrate them in cloud-native platforms such as Kubernetes. Of course, we'll be talking about websites and web applications, but HTTP is a lot broader than that, and we support different application domains. As HTTP evolved, so have we, and we're adapting to these use cases. What started out as accelerating basic websites and applications evolved towards acceleration of APIs, acceleration of video streaming platforms, and nowadays it's also happening behind the scenes for organizations: acceleration of the software supply chain, of artifacts as we call them, to speed up deployments and CI/CD pipelines. By caching Docker images, npm packages, Python libraries, Go modules, whatever is part of your software delivery process, we can accelerate it. And to a larger extent, whatever is being transferred as a file or an object over HTTP, we can accelerate. I'd go so far as to say that we accelerate all the things. However, today we're going to narrow the scope and focus primarily on websites and web applications. And we do that as an organization. I mean, as a representative of Varnish Software, I'll give you some advice. However, we're also the company behind a very popular open-source project called Varnish Cache. Go to varnish.org if you want to learn more about it. It's a very popular piece of technology that's used by a couple million active websites. And as the stewards of this open-source project and the company behind it, we're making some exciting releases this week. We have a new release planned with a lot of extra capabilities and features, so go to varnish.org to check it out. However, we have noticed that this "just delivering software to people" approach doesn't cover the full spectrum.
That's why we launched our own CDN, which is nothing more than a geographically distributed SaaS version of our software, because we noticed that a lot more organizations don't host their own infrastructure anymore. They work in the cloud or with a PaaS provider, or want things in SaaS form, and that's what this is. Finally, you can use all the nice things Varnish has to offer in a nice SaaS form. And as we're a European company, we're not subject to the US Cloud Act, which allows us to give you data sovereignty, which is quite relevant in this day and age. We do have a global presence, however, and we're also looking to expand to other areas of the world without compromising on that data sovereignty. Go to varnish-cdn.com if you want to play around with it, because we have a free developer tier. And if you want to try the paid tier, ladies and gentlemen, just reach out to me. I'll give you some free credits, like just handing out some candy, right? You'll find me, I don't need to say where. Just go to some social media, or if you know my email address, just find me. I'll give you some free credits to play around with. So that's the technology we're now going to apply in the context of Core Web Vitals. So what are Core Web Vitals? Well, let's look at web vitals first. They are the vital signals of a web application that let you check its health, its efficiency, and its stability. It's just like a human being, who also has vitals, right? Key vitals like pulse, blood pressure, heart rate, all that type of stuff. Now, in the world of the web, these vitals are about stability, but also primarily about loading speed. How fast can we deliver? But not just that, because delivering bytes from A to B, which we do really well, is not the whole game; it goes beyond that.
It's also about the perceived loading speed, which is slightly more complicated, because the browser needs to load your main page, and from within the main page it figures out what hypermedia it needs: web fonts, CSS, JavaScript, images, videos, and other resources. The way the browser renders and visually paints those has a big impact on your Core Web Vitals. That's what we're trying to improve: perceived loading speed, but also perceived interactivity, because with all this hypermedia and all these JavaScript frameworks and browser capabilities, there are interactions, like single-page applications and rich media applications, and the way you interact across the different parts will also impact your web vitals. And those web vitals have a set of metrics, a set of signals you can measure. There's a whole range; I'm not going to go into all of them, but I do want to highlight the top one, because that's our specialty. We are experts in reducing your time to first byte. Let me quickly explain what time to first byte is. It is the time between the client sending the request, the server receiving the request, and the server producing the first byte of the response. Anything that creates a long time to first byte could be an indication of server problems. And then you have a whole range of other ones. But Google actually keeps a short list called the Core Web Vitals, which is part of the title of course, and those are the three they find most important. We have Largest Contentful Paint (LCP), we have Cumulative Layout Shift (CLS), and we have Interaction to Next Paint (INP). And the list sometimes changes; INP, for example, is a fairly recent addition. So what is Largest Contentful Paint? It is the time to render the largest visible block on a website. That could be an image, a video, or a text block. It includes the time to first byte, and it also includes any type of render blocking caused by CSS, JavaScript, browser capabilities, what have you.
The goal, for you to have good web vitals, is to keep the Largest Contentful Paint, the loading time of that largest block, below 2.5 seconds. And yes, Varnish will contribute to that. The second one is Interaction to Next Paint, meaning: what is the page's overall responsiveness when it comes to clicking, tapping on a touchscreen, or keyboard interactions? The final value of this INP metric is the longest interaction of them all. It's driven by CSS, JavaScript, and browser controls, and the goal is to keep it below 200 milliseconds. Those first two are primarily tied to performance. The next one is a bit different. Cumulative Layout Shift is a measurement of the visual stability of your website. As a browser loads your main page, it takes hints from hypermedia tags on what type of media to load, what type of CSS to load, what type of images to load, and that impacts how the browser estimates what screen real estate it needs to render them. Sometimes it gets it wrong, and when the content actually loads, on-screen movement happens. CLS tries to capture that. This is not really related to performance, but what helps is adding dimension attributes such as height and width to any type of media that you load. Keep CLS below 0.1 to be considered stable. Now, how do you measure these things? Lighthouse is a good start. You've probably heard of it. It's a tool that's part of your Google Chrome, part of the developer tools, and you can trigger it yourself. When I trigger it for varnish.org, I get a 98% score. Not perfect, but still very, very good. And if you look at the metrics, my Largest Contentful Paint is 1.1 seconds, well below the 2.5 seconds. Could we do a better job? We probably could, but this is a pretty good score. Now, the next part of the presentation is figuring out how Varnish is connected to all of that. How can Varnish improve those Core Web Vitals? Well, let's start at the beginning and explain what life looks like without a cache.
You have a direct interaction pattern between the user and the server. The user sends requests to the server, and the server has to reply as fast as possible. But as concurrency increases, servers might end up under pressure, crumble under that pressure, and expose additional latency or suffer complete outages. The more people come in at the same time, the higher the pressure; it's also related to the type of workloads being executed, some of which are more resource-intensive and more sensitive to latency. By adding Varnish in front of your server, as what we call a reverse caching proxy, you take away a lot of that pressure, which is an improvement in quality of service. The user no longer interacts directly with the server but connects to Varnish, and Varnish can serve responses that it cached from the server as quickly as possible. When content has expired, or is not in the cache at all, the server is checked by Varnish every now and then; hence the dotted line. We don't always connect to the server, only when content expires or when content is not in the cache. And that setup can handle a lot of requests and make your site a lot faster. It's also a quality-of-experience improvement, because the user gets consistent performance under high pressure. And by positioning the cache a bit closer to the end user, latency might drop as well, because the physical distance is smaller and the network latency can be lower. So how does that tie into the Core Web Vitals? Well, we're not going to talk about Cumulative Layout Shift. We'll talk about the things associated with performance: loading things faster, making it fast, making it scale. That's what Varnish does, and your Lighthouse score, your web vitals, will be much better for it. Now, what makes our technology so uniquely qualified to do the job? That's a legitimate question you may have, and of course, I have the answers to it.
Let's start at the beginning, with the architecture and the conception of Varnish. Varnish is a piece of technology built to cache and to deliver results at incredible performance and at incredible scale. So it's fast, stable, and efficient. And we can throw around some numbers if you want. Intel benchmarked us at 1.7 terabits per second on a single Varnish server, and that's a standard Varnish install without hardware acceleration on stock hardware. Of course, it's a very big and really expensive SKU hosted in a lab, but that's a good number. In reality, you will probably never reach that number, but let's just say that throughput is not an issue. For smaller, less network-intensive workloads, regular objects, we can really increase concurrency: for websites, web pages, images, we can easily do more than a million requests per second if your hardware allows it and if your Varnish is properly tuned. And from a latency perspective, we can deliver content from within the caching engine in under a millisecond. Of course, the network is the network, but we don't add any additional latency. So we're fast, we're stable, we're efficient. We also adhere to HTTP's caching best practices. There are conventional headers out there that are interpreted by browsers, servers, and caches, and that you can emit. You can emit a Cache-Control header and say that Varnish, in this case, has to cache for 100 seconds. If you don't want Varnish to cache specific pages, maybe because they're so personalized, you can issue a header like private, no-cache, no-store, and Varnish will understand it. You can even add more specialized directives like stale-while-revalidate=500, which tells Varnish that when the cache has expired and the user needs new data, instead of letting the user wait for the server to respond, it can serve stale data for up to 500 seconds beyond the expiration of the object while it does an asynchronous fetch. So that's all pretty nice.
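To make those headers concrete, here is the kind of Cache-Control output a backend could emit for the scenarios just described (the exact values are illustrative):

```http
Cache-Control: public, s-maxage=100

Cache-Control: private, no-cache, no-store

Cache-Control: public, s-maxage=100, stale-while-revalidate=500
```

The first tells shared caches like Varnish to cache for 100 seconds, the second keeps a personalized page out of the cache entirely, and the third allows stale content to be served for up to 500 seconds past expiry while a fresh copy is fetched in the background.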
There's a lot more to say about that, but very little time to go into detail. What I do also want to mention is that we have a built-in content streaming and request collapsing engine. Content streaming is not tied to video streaming; it has nothing to do with that. But it has everything to do with us not needing to buffer all the content from the server before delivering it. As chunks of data come in, we add them to the cache and forward them to the user without adding any extra latency. And when a lot of requests end up on Varnish, we can do request collapsing, meaning that all these users, maybe thousands of them, coming for the same uncached content, will have their requests collapsed into a single backend request, avoiding the thundering herd problem. That is great efficiency and will lead to a lot less pressure on your systems, even for content that is not yet stored in the cache. Another thing that's important is real-time content invalidation. I often use the quote that there is one thing more dangerous than not caching enough, and that's caching for too long. Imagine a news website wanting to show breaking news that's happening in the world. Well, breaking news won't be breaking if a stale object is stuck in the cache. So having the ability to tell Varnish, often from a CMS, to invalidate a specific piece of updated content goes a long way. Typically, users trigger those changes through CMS edits, and these CMSs can connect with Varnish and instruct it to remove that content. That usually happens through integrations, plugins, and modules for the well-known frameworks. They're out there for WordPress, Drupal, and also primarily Magento, which really needs them. But there are many, many more that can be built and that live out there in the wild, in the open source.
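As an illustration, a CMS plugin typically invalidates an updated article by sending an HTTP request like the one below to Varnish. The URL and host are hypothetical, and the PURGE endpoint only works if your VCL enables it (an example of that follows later):

```http
PURGE /news/breaking-story/ HTTP/1.1
Host: www.example.com
```

Varnish then drops the cached object for that URL, so the next visitor triggers a fresh fetch from the origin.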
And the final piece of the puzzle is the fact that we have our own programming language with associated modules: VCL, the Varnish Configuration Language, a domain-specific language that allows you to extend the behavior of the cache, do request handling, do request routing, do response manipulation if you will, do backend selection, control all aspects of the cache, and ultimately also do decision-making on the edge, as we call it. VCL is a topic on its own. I'm not going to discuss it in great detail now, but I am willing to hint at some things and scratch the surface. The VCL language is not a top-down language. It's a language built to extend existing behavior. And you can see in the flow chart how Varnish behaves. By hooking into the different stages of the execution flow, you can trigger hits, misses, and cache bypasses, and influence all aspects of that. Here are a couple of examples. Here's some VCL where the backend, the server, is hosted on the same machine on port 8080, and where Varnish, by hooking into the receiving logic, examines the URL. If the URL starts with /admin, we bypass the cache, because the underlying thought is that admin pages are highly personalized and cannot be stored in the cache. Here's another aspect. By hooking into vcl_backend_response, we extend the behavior of a response coming back from the server before it is stored in the cache. If there is a Cache-Control header that does not contain an s-maxage value, we figure out our own Cache-Control value, our own TTL. And we do that for content types that start with image. So for all images that don't have an s-maxage value in the Cache-Control header, we decide to override the TTL of the cache to 60 seconds. And it goes on and on and on. Here's some logic using an access control list to give limited access to the invalidation endpoint.
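The three patterns just described can be sketched in VCL roughly as follows. This is a minimal illustrative sketch, not the exact configuration from the talk; the ACL addresses in particular are assumptions:

```vcl
vcl 4.1;

# The origin server runs on the same machine, on port 8080
backend default {
    .host = "127.0.0.1";
    .port = "8080";
}

# Only these addresses may invalidate cached objects (illustrative values)
acl purgers {
    "127.0.0.1";
    "192.168.0.0"/24;
}

sub vcl_recv {
    # Admin pages are highly personalized: bypass the cache
    if (req.url ~ "^/admin") {
        return (pass);
    }
    # Limited access to the invalidation endpoint via the ACL
    if (req.method == "PURGE") {
        if (client.ip !~ purgers) {
            return (synth(405, "Purging not allowed"));
        }
        return (purge);
    }
}

sub vcl_backend_response {
    # Images without an s-maxage directive get our own 60-second TTL
    if (beresp.http.Content-Type ~ "^image/" &&
        beresp.http.Cache-Control !~ "s-maxage") {
        set beresp.ttl = 60s;
    }
}
```

Each subroutine extends one stage of the flow chart: vcl_recv runs when the request arrives, and vcl_backend_response runs before the origin's response is stored in the cache.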
Now, that's typically the kind of VCL that WordPress, Drupal, and Magento modules use to interact with the cache and to allow you to remove objects from it. And it goes on and on and on. Here's a more advanced example that comes from WordPress, where we communicate with example.com, which is hypothetically a WordPress site, and where we remove cookies that we don't need: by stripping off all the tracking cookies and keeping the non-tracking ones, we massively improve the hit rate, while still bypassing well-known parts that aren't supposed to be cached. And that is the basics, ladies and gentlemen. But in the last couple of minutes that remain, the last three or four minutes, I want to show you what you can do when you want to go beyond the basics, because our latest distribution of Varnish Cache ships extra capabilities, and that is separate from the enterprise product, which has other modules. I just wanted to hint at some extra capabilities we've packaged with the latest release of the open-source version: varnish modules to sanitize Accept headers, for one. Just have a look at the list: accessing and modifying request and response bodies, string manipulation, TCP tuning, rate limiting, tag-based cache invalidation, encryption and hashing functions, maybe to do an HMAC signature or whatever you want, direct access to the file system to serve objects from a file server rather than fetching them from an origin, geolocation using the MaxMind database, query string manipulation, JSON parsing, Redis support, dynamic backends, an HTTP client, UUID generation; it goes on and on and on. We even allow you to parse and execute scripts in languages like Lua and ECMAScript. That's quite the package. And yes, all of that can be used to improve the performance of your site. But it can also be used to perform logic on the edge, as we call it. So, now that I've potentially influenced you and made you decide to try it out, how can you get started?
Well, deploying Varnish happens in different ways. I'm going to show you how to do it for open source. You can even deploy our enterprise products and get a trial license, but here's how you do it for open source. If you want to deploy Varnish Cache to a Docker container, for example, you can pull our varnish:latest container, run it locally or wherever you want, and use specific environment variables to set behavior, like VARNISH_SIZE to set the size of the cache, or the backend host and backend port variables to set the location of the server. That's in case you don't want to work with VCL directly. If you do want to work with VCL directly, you can mount your VCL straight into the container. That's individual Docker. If you want to orchestrate that on a Kubernetes cluster, we have an official Helm chart for that, and you can pull it from docker.io: it's varnish/varnish-cache. The same environment variables apply. Of course, if you want to mount VCL, you can load VCL files as ConfigMaps in Kubernetes, reference them in the values.yaml file, mount them into /etc/varnish/default.vcl, and run that as part of the Helm chart. If you're more of a traditionalist and you want to install it on bare-metal servers or on a VM using packages, that's possible too. We have them for APT and YUM systems, meaning Debian, Ubuntu, Red Hat, AlmaLinux, CentOS, Rocky, you name it. We have the packages to take full advantage of the underlying hardware. And that's pretty much all the time I have for you today. Thank you for checking it out. I'd like to give you some closing references. If you want to try out the open-source project, go to varnish.org. If you want to see what we can offer you and how we can help you with more advanced use cases, go to varnish-software.com. If you want to try Varnish in SaaS form, go to varnish-cdn.com. And remember, on Varnish CDN we have a free developer tier.
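To recap the Docker option described above, a minimal sketch might look like this. It assumes the official varnish image on Docker Hub and a default.vcl in the current directory; environment variable support can vary between image versions:

```shell
# Run the official image with a 256 MB cache and a locally authored VCL file
docker run -d --name varnish \
  -p 80:80 \
  -e VARNISH_SIZE=256M \
  -v "$(pwd)/default.vcl:/etc/varnish/default.vcl:ro" \
  varnish:latest
```

Mounting the VCL file is optional; without it, the image falls back to its built-in default configuration.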
And if you find me online, just send me an email or find me on LinkedIn or any place else, and ask me for free credits. Thank you, ladies and gentlemen. It's been an absolute pleasure, and I hope you enjoy the rest of the event. That was Thijs, explaining how you can improve your Core Web Vitals using Varnish. Stay tuned: we are coming up with a panel discussion, a live performance teardown, fixing a slow site in real time. A reminder for all the viewers: we have lined up two of the very best activities for you, so it is your chance to participate and win prizes. The leaderboard is also active. Feel free to chat and share your questions; we are here to answer them. We'll be right back with the panel discussion, so stay tuned.