How to improve Amazon CloudFront cache hit ratio
In the fast-paced world of online content delivery, optimizing the performance of your websites is crucial. Leveraging the power of edge caching is a key strategy in achieving this goal. In this article, we’ll delve into six strategies to optimize edge cache for Single-Page Applications (SPA) and Server-Side Rendered (SSR) applications.
TL;DR
We can avoid cache misses caused by inconsistent query parameters by whitelisting only the essential parameters, then validating and normalizing them.
These methods can be complemented by a stale-while-revalidate strategy, client-side revalidation and versioning, which minimize the cache invalidations that cause cache misses.
1. Whitelist essential parameters
Query parameters affect the efficiency of any cache greatly. This is true for the edge cache as well. Amazon CloudFront includes options to either not include query string parameters in the cache key at all, include all parameters or include only selected parameters.
To increase cache hits, we should include in the cache key only the query string parameters that change the content served from the path.
For example, assume you have an image of a ball on your website, and the query parameters for this image are size and color.
/images/ball.jpg?size=large&color=red
While the application may need size for other purposes, such as business intelligence, you may not have a separate image for each size.
In such a scenario, you should include only the image path and the color parameter in the cache key. Even if size differs between requests, CloudFront will then serve the image of the right color from the edge cache.
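As a rough illustration of the effect, the sketch below derives a cache key from a URL by keeping only whitelisted parameters. CloudFront builds the real cache key itself from the cache policy; cacheKeyFor and ALLOWED_PARAMS are hypothetical names used only for this example.

```javascript
// Illustration only: CloudFront derives the real cache key from the cache policy.
// cacheKeyFor and ALLOWED_PARAMS are hypothetical names.
var ALLOWED_PARAMS = ['color'];

function cacheKeyFor(url, allowedParams) {
    // The base URL is a placeholder; only the path and query string matter here.
    var parsed = new URL(url, 'https://example.com');
    var kept = [];
    allowedParams.slice().sort().forEach(function (name) {
        parsed.searchParams.getAll(name).forEach(function (value) {
            kept.push(name + '=' + value);
        });
    });
    return parsed.pathname + (kept.length ? '?' + kept.join('&') : '');
}

// Both requests map to the same key, so the second one can be a cache hit.
console.log(cacheKeyFor('/images/ball.jpg?size=large&color=red', ALLOWED_PARAMS));
console.log(cacheKeyFor('/images/ball.jpg?color=red&size=small', ALLOWED_PARAMS));
```

Both calls print /images/ball.jpg?color=red, because size never reaches the key.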
2. Validate parameters
Now that we have dropped all unnecessary parameters from the cache key, we should think about parameter validation. It involves ensuring that the parameters used in your URLs are consistent and properly validated before the request reaches the cache. Let’s continue with the example of the ball image.
Let’s assume the only valid values of color are red, green and yellow, and that color is a single-value query string parameter. Here’s how we would validate the color value using a CloudFront function that is triggered on the viewer request event, so that validation happens before the edge cache is checked.
Note: CloudFront Functions are lightweight, CloudFront-specific JavaScript functions (the original runtime is ES 5.1 compliant with some newer features) that can run on either viewer request or viewer response events.
// Parameter validation function
var validateParams = (function() {
    // Allowed values for each validated query string parameter
    var validParams = [{
        name: 'color',
        values: ['red', 'green', 'yellow']
    }];
    return function(parameters) {
        for (var i = 0; i < validParams.length; i++) {
            var name = validParams[i].name;
            var values = validParams[i].values;
            // A missing parameter is undefined and is therefore rejected as well
            var isValid = values.indexOf(parameters[name]) !== -1;
            if (!isValid) return isValid;
        }
        return true;
    };
})();

function handler(event) {
    // Flatten { name: { value: '...' } } into { name: '...' }
    var parameters = Object.entries(event.request.querystring).reduce(function(acc, current) {
        acc[current[0]] = current[1].value;
        return acc;
    }, {});
    var isValid = validateParams(parameters);
    // CloudFront Functions responses use statusCode and { value: ... } headers
    return isValid ? event.request : {
        statusCode: 400,
        statusDescription: 'Bad Request',
        headers: {
            'content-type': { value: 'text/plain' }
        },
        body: {
            encoding: 'text',
            data: 'Invalid color value. Valid colors are red, green, and yellow.'
        }
    };
}
By implementing parameter validation, you ensure that only valid requests reach the cache, preventing unnecessary cache variations and load on the origin server.
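To try this kind of check outside CloudFront, it helps to know the shape of the viewer request event. The sketch below assumes a minimal event carrying only the fields used here (the real event has more), and makeEvent and isValidColor are hypothetical names for this illustration.

```javascript
// Hypothetical helper: builds a minimal viewer request event for local experiments.
function makeEvent(querystring) {
    return {
        request: { method: 'GET', uri: '/images/ball.jpg', querystring: querystring, headers: {}, cookies: {} }
    };
}

var VALID_COLORS = ['red', 'green', 'yellow'];

// Compact stand-in for the color check: true only for a known color.
function isValidColor(event) {
    var color = event.request.querystring.color;
    return color !== undefined && VALID_COLORS.indexOf(color.value) !== -1;
}

console.log(isValidColor(makeEvent({ color: { value: 'red' } })));    // true
console.log(isValidColor(makeEvent({ color: { value: 'purple' } }))); // false
console.log(isValidColor(makeEvent({})));                             // false
```

Each query string parameter arrives as an object with a value field, which is why the check reads color.value rather than color.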
3. Normalize parameters
Parameter normalization involves standardizing the format of parameters to reduce cache redundancy. Equivalent URLs must result in the same cache key.
Parameter normalization includes but is not limited to ordering parameters, normalizing the case, and filling in default values.
function handler(event) {
    // Default values map
    var defaultValues = {
        color: 'blue',
        size: 'medium'
    };
    var normalizedQs = [];
    // 1. Normalize the case of parameter keys and values
    // NOTE: Lowercasing parameter values can result in loss of information
    for (var key in event.request.querystring) {
        if (event.request.querystring[key].multiValue) {
            event.request.querystring[key].multiValue.forEach((mv) => {
                normalizedQs.push(key.toLowerCase() + "=" + mv.value.toLowerCase());
            });
        } else {
            normalizedQs.push(key.toLowerCase() + "=" + event.request.querystring[key].value.toLowerCase());
        }
    }
    // 2. Fill in missing parameters from the default values map
    for (var defaultKey in defaultValues) {
        // Match on "name=" so a parameter such as "colorful" does not suppress the default for "color"
        var prefix = defaultKey.toLowerCase() + "=";
        if (!normalizedQs.some(qs => qs.startsWith(prefix))) {
            normalizedQs.push(prefix + defaultValues[defaultKey].toLowerCase());
        }
    }
    // 3. Sort the query string parameters alphabetically
    event.request.querystring = normalizedQs.sort().join('&');
    return event.request;
}
Normalizing parameters enhances cache efficiency by reducing unnecessary variations, leading to improved hit rates. Standardizing parameters also simplifies logic in downstream applications as they do not need to care about the case differences.
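To see why normalization helps, the sketch below applies the same three steps to plain query strings so it can run locally in Node.js. It uses URLSearchParams, which is an assumption about the local environment and not something to rely on inside CloudFront Functions; normalizeQueryString and DEFAULTS are hypothetical names.

```javascript
// Local illustration of lowercasing, default filling and sorting.
// normalizeQueryString and DEFAULTS are hypothetical names.
var DEFAULTS = { color: 'blue', size: 'medium' };

function normalizeQueryString(queryString, defaults) {
    var pairs = [];
    var seen = Object.create(null);
    new URLSearchParams(queryString).forEach(function (value, key) {
        var name = key.toLowerCase();
        seen[name] = true;
        pairs.push(name + '=' + value.toLowerCase());
    });
    // Add a default only when the parameter is absent
    Object.keys(defaults).forEach(function (name) {
        if (!seen[name]) pairs.push(name + '=' + defaults[name]);
    });
    return pairs.sort().join('&');
}

console.log(normalizeQueryString('Size=Large&COLOR=Red', DEFAULTS)); // color=red&size=large
console.log(normalizeQueryString('color=red&size=large', DEFAULTS)); // color=red&size=large
console.log(normalizeQueryString('color=red', DEFAULTS));            // color=red&size=medium
```

The first two requests differ only in case and order, yet they end up with the same query string and therefore the same cache key.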
Note: Concepts of parameter whitelisting, validation and normalization can be extended to headers and cookies if they are included in the cache key in CloudFront.
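As one possible example for headers, the sketch below reduces the Accept-Language header to a small set of values before the cache lookup. It assumes Accept-Language is part of the cache key, and the supported language list and default are hypothetical. It also simplifies by taking the first listed language instead of honoring quality values.

```javascript
// Hypothetical values: adjust to the languages the site actually serves.
var SUPPORTED_LANGUAGES = ['en', 'de', 'fr'];
var DEFAULT_LANGUAGE = 'en';

function handler(event) {
    var request = event.request;
    var header = request.headers['accept-language'];
    var raw = header ? header.value : '';
    // "de-CH,de;q=0.9,en;q=0.8" -> "de" (first tag, without region or quality value)
    var first = raw.split(',')[0].split(';')[0].trim().toLowerCase().split('-')[0];
    var language = SUPPORTED_LANGUAGES.indexOf(first) !== -1 ? first : DEFAULT_LANGUAGE;
    // Collapse the many possible header values into one of a few cacheable ones
    request.headers['accept-language'] = { value: language };
    return request;
}
```

With this in place, the number of cache variations per path is bounded by the number of supported languages instead of by whatever browsers happen to send.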
4. Use Stale-While-Revalidate
Stale-While-Revalidate is a caching strategy that allows serving stale content from the cache while simultaneously fetching a fresh copy in the background.
This ensures minimal disruption for users, faster page loads and reduced load on the origin and downstream components.
Let’s implement this in an Express application.
const express = require('express');
const app = express();

// Middleware to set cache control headers
app.use((req, res, next) => {
    // Set cache control headers for stale-while-revalidate
    res.setHeader('Cache-Control', 'public, max-age=3600, stale-while-revalidate=600');
    // Other middleware or routes go here
    next();
});
In the above example, the content stays fresh for an hour. For up to 10 more minutes after that, CloudFront can keep serving the stale copy while it revalidates its cache in the background.
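The timeline implied by that header can be sketched as a small function that classifies a cached object by its age. This is only an illustration of the arithmetic, not how CloudFront is implemented; parseCacheControl and cacheState are hypothetical names.

```javascript
// Parses "public, max-age=3600, stale-while-revalidate=600" into a directive map.
function parseCacheControl(header) {
    var directives = {};
    header.split(',').forEach(function (part) {
        var pieces = part.trim().split('=');
        directives[pieces[0].toLowerCase()] = pieces.length > 1 ? Number(pieces[1]) : true;
    });
    return directives;
}

// Classifies a cached object by its age in seconds.
function cacheState(ageSeconds, cacheControl) {
    var directives = parseCacheControl(cacheControl);
    var maxAge = directives['max-age'] || 0;
    var staleWindow = directives['stale-while-revalidate'] || 0;
    if (ageSeconds < maxAge) return 'fresh';
    if (ageSeconds < maxAge + staleWindow) return 'stale-while-revalidate';
    return 'expired';
}

var header = 'public, max-age=3600, stale-while-revalidate=600';
console.log(cacheState(1800, header)); // fresh
console.log(cacheState(3900, header)); // stale-while-revalidate
console.log(cacheState(4300, header)); // expired
```

Only requests that arrive in the expired state have to wait for the origin; the stale window is what turns would-be misses into hits.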
5. Revalidate on client side
For SSR applications, client-side revalidations can be a powerful tool. This involves using client-side JavaScript to check for updates and trigger a revalidation when needed while serving stale content from the edge cache.
For example, imagine you have a dynamic navigation menu, and the navigation menu is included in all the pages. An update in the navigation menu will require invalidating the cache of all the pages. In a service like CloudFront where invalidations are costly, you could instead use client-side revalidation to update the navigation menu after the initial render. This does not mean the menu will not be available on the SSR page, but it will be updated on the client side if the menu has been updated.
A cached API path complements this solution: it returns the updated menu, or a 304 response if the menu has not changed.
Libraries such as SWR (Stale-While-Revalidate) make client-side revalidation easier in SSR applications by fetching fresh data as needed, which keeps the page up to date.
The advantages of client-side revalidation can be negated by the number of API calls and the cost of the data CloudFront transfers for the cached API path. Therefore, it’s important to keep the content returned by a single request to the API path small.
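A minimal sketch of such a revalidation call, without any library, could look like the following. It assumes the API returns an ETag header and honors If-None-Match; the /api/menu path and the revalidateMenu name are hypothetical, and the fetch function is passed in so the logic can be tried without a network.

```javascript
// cached is { etag, menu } taken from the server-rendered page.
// fetchFn is the fetch implementation to use (for example, the global fetch).
async function revalidateMenu(cached, fetchFn) {
    var headers = cached.etag ? { 'If-None-Match': cached.etag } : {};
    var response = await fetchFn('/api/menu', { headers: headers });
    if (response.status !== 200) {
        // 304 Not Modified (or an error): keep showing the server-rendered menu
        return { changed: false, etag: cached.etag, menu: cached.menu };
    }
    var menu = await response.json();
    return { changed: true, etag: response.headers.get('ETag'), menu: menu };
}

// Possible usage in the browser after the initial render:
// revalidateMenu({ etag: '"v1"', menu: currentMenu }, fetch).then(function (result) {
//     if (result.changed) { /* re-render the navigation with result.menu */ }
// });
```

Because an unchanged menu yields a 304 with no body, most revalidation calls stay small, which keeps the data transfer cost mentioned above in check.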
6. Version dynamic content
For an object that updates frequently, versioning is a better solution than cache invalidation.
For example, if the ball image updates regularly, instead of invalidating /images/ball.jpg we could version the image and modify the client to fetch a specific version. The modified URL would look like /images/ball.jpg?version=1, with version included in the cache key.
Versioning dynamic content ensures backward compatibility, saves the cost of invalidations and gives more insight into traffic and client behavior.
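On the client, building the versioned URL can be as simple as the sketch below. The versionedUrl helper is hypothetical, and where the current version number comes from (a build step, a manifest, an API response) is left open.

```javascript
// Appends a version parameter, respecting an existing query string.
function versionedUrl(path, version) {
    var separator = path.indexOf('?') === -1 ? '?' : '&';
    return path + separator + 'version=' + encodeURIComponent(version);
}

console.log(versionedUrl('/images/ball.jpg', 2));           // /images/ball.jpg?version=2
console.log(versionedUrl('/images/ball.jpg?color=red', 2)); // /images/ball.jpg?color=red&version=2
```

Each new version is a new cache key, so the old object simply ages out of the cache and no invalidation is needed.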
Conclusion
In conclusion, optimizing the edge cache is essential for delivering fast and efficient online content. By implementing parameter whitelisting, validation, normalization, Stale-While-Revalidate, client-side revalidation and versioning, you can significantly improve the CloudFront cache hit ratio and provide a seamless experience for your users.
Whitelisted, validated and normalized parameters reduce cache variations, while Stale-While-Revalidate keeps responses fast even after cached content has expired. Client-side revalidation and versioning complement these strategies by avoiding costly invalidations. Together, they make your applications more resilient and responsive.