Service Level Indicators: Examples and Difference between SLI, SLO and SLA

As a product manager, you need a clear understanding of the Service Level Indicators (SLIs) used to measure the performance of your product. SLIs are quantitative metrics that measure a specific aspect of a service's performance, such as response time or error rate. In this article, we'll explore the differences between SLIs, Service Level Objectives (SLOs), and Service Level Agreements (SLAs), and provide examples of SLIs used by real-world companies.

SLIs vs. SLOs vs. SLAs

SLIs, SLOs, and SLAs are all part of a framework for measuring and managing the performance of a service. However, they have different purposes and levels of granularity.

SLIs are specific metrics that measure a particular aspect of a service's performance, such as response time or error rate. They provide an objective measurement of how well a service is performing, and are often used to track performance over time.

SLOs are targets for specific SLIs that a service must hit to satisfy the needs of users and stakeholders. They are usually expressed as a percentage of time or as a specific value for a metric, for example "99.9% of requests succeed" or "95% of requests complete within 300 ms". SLOs provide a clear goal for the service to strive for and can help to align expectations with capabilities.
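
To make the relationship concrete, here is a minimal Python sketch of an SLI computed as "good events divided by all events" and then compared with an SLO target. The `Request` record, the function names, the 150 ms threshold, and the 99% target are all assumptions made for illustration; they are not taken from any particular monitoring tool.

```python
from dataclasses import dataclass


@dataclass
class Request:
    """One observed request (hypothetical record shape)."""
    latency_ms: float
    ok: bool


def availability_sli(requests):
    """SLI: percentage of requests that succeeded."""
    if not requests:
        raise ValueError("no requests in the measurement window")
    good = sum(1 for r in requests if r.ok)
    return 100.0 * good / len(requests)


def latency_sli(requests, threshold_ms):
    """SLI: percentage of requests answered within threshold_ms."""
    if not requests:
        raise ValueError("no requests in the measurement window")
    fast = sum(1 for r in requests if r.latency_ms <= threshold_ms)
    return 100.0 * fast / len(requests)


def meets_slo(sli_value, slo_target):
    """SLO check: the measured SLI must reach the target."""
    return sli_value >= slo_target


# A tiny, made-up measurement window.
window = [
    Request(120, True),
    Request(90, True),
    Request(450, True),
    Request(80, False),
    Request(200, True),
]

availability = availability_sli(window)   # 4 of 5 requests succeeded
fast_share = latency_sli(window, 150)     # 3 of 5 requests took <= 150 ms
print(availability, meets_slo(availability, 99.0))
print(fast_share, meets_slo(fast_share, 95.0))
```

The SLI is the measured number; the SLO is the threshold it is compared against. An SLA would add a contract on top of such a threshold.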

SLAs are agreements between a service provider and its customers that outline the levels of service that will be provided. SLAs typically include specific targets for SLIs, along with penalties for failing to meet those targets. SLAs help to establish a shared understanding between the service provider and the customer of what is expected, and provide a mechanism for resolving issues if the service falls short.

Examples of SLIs

Real-world companies use SLIs to measure the performance of their services and drive continuous improvement. Here are some examples of SLIs used by popular companies:

Google uses several SLIs to measure the performance of its search engine, including:
  • Latency: the time it takes for a search query to return results
  • Accuracy: the relevance of the search results to the query
  • Coverage: the percentage of web pages indexed by the search engine

Netflix uses SLIs to measure the performance of its streaming service, including:
  • Start-up time: the time it takes for a video to start playing after the user clicks play
  • Buffering rate: the percentage of playback time spent buffering
  • Playback errors: the number of errors encountered during playback

Amazon uses SLIs to measure the performance of its e-commerce service, including:
  • Page load time: the time it takes for a page to load after the user clicks a link
  • Availability: the percentage of time the website is available for users
  • Error rate: the percentage of errors encountered during the checkout process

On one of my teams, we used SLIs to measure the health of SMS delivery:
  • Latency: the time it takes for our service to deliver an SMS to a user (end-to-end)
  • Availability: the percentage of time the SMS API is available for clients to call
  • Error rate: the percentage of API calls that resulted in an error

There is an interesting nuance here: what should we do if our service did its job (sent an SMS) but the user is offline, without network coverage? Does that lower our SLIs, or is it "not our fault"? Long story short, it is very important to be precise about the exact SLI "formula" to ensure it really reflects the health of the service. You can practice this by defining SLIs for 7 real-world services.
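
The nuance above can be sketched in Python by computing the same delivery SLI with two different formulas. The outcome labels and the sample numbers below are hypothetical; the point is only that the choice of what counts in the denominator changes the result.

```python
from collections import Counter

# Hypothetical outcome labels for each SMS we attempted to send.
DELIVERED = "delivered"
FAILED_INTERNAL = "failed_internal"          # our service or provider failed
HANDSET_UNREACHABLE = "handset_unreachable"  # user offline, no coverage


def delivery_sli(outcomes, exclude=()):
    """Percentage of delivered messages among all counted attempts.

    Outcomes listed in `exclude` are left out of the denominator,
    i.e. they are treated as "not our fault".
    """
    counts = Counter(outcomes)
    total = sum(n for outcome, n in counts.items() if outcome not in exclude)
    if total == 0:
        raise ValueError("no countable attempts in the measurement window")
    return 100.0 * counts[DELIVERED] / total


# Made-up sample: 90 delivered, 2 internal failures, 8 unreachable handsets.
outcomes = (
    [DELIVERED] * 90
    + [FAILED_INTERNAL] * 2
    + [HANDSET_UNREACHABLE] * 8
)

strict = delivery_sli(outcomes)
lenient = delivery_sli(outcomes, exclude={HANDSET_UNREACHABLE})
print(round(strict, 2))   # every attempt counts: 90 of 100
print(round(lenient, 2))  # unreachable handsets excluded: 90 of 92
```

Neither formula is "the right one". The strict version is closer to what the end user experiences, while the lenient version is closer to what the team can actually control, so the definition should be agreed on and written down before the numbers are used.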

How can product managers use Service Level Indicators?

Product managers can use SLIs in several ways to improve their products. First, they can use SLIs to identify areas for improvement. By monitoring performance against SLIs, product managers can identify areas where their service is falling short and make changes to address those areas. For example, if the percentage of successful responses to user requests is consistently low, a product manager may need to investigate and address issues that are causing these failures.

Second, product managers can use SLIs to prioritize work. By setting SLIs for different aspects of the service, product managers can determine which areas require the most attention. For example, if the percentage of successful responses to user requests is more important than the average response time, a product manager may prioritize work to improve the success rate over work to improve response time.
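
One simple way to turn this into a prioritization aid is to rank SLIs by how far they fall short of their targets. The sketch below assumes every SLI is expressed as a percentage of good events; the SLI names, measured values, and targets are invented for illustration.

```python
def rank_by_shortfall(measurements):
    """Rank SLIs by how many percentage points they miss their target.

    measurements: {name: (measured_sli, slo_target)}, both in percent.
    Returns (name, shortfall) pairs, largest shortfall first.
    """
    shortfalls = {
        name: max(0.0, target - value)
        for name, (value, target) in measurements.items()
    }
    return sorted(shortfalls.items(), key=lambda item: item[1], reverse=True)


# Hypothetical measurements: (measured SLI, SLO target).
measurements = {
    "successful responses": (97.5, 99.5),
    "responses under 300 ms": (94.0, 95.0),
    "checkout availability": (99.95, 99.9),
}

ranking = rank_by_shortfall(measurements)
for name, shortfall in ranking:
    print(f"{name}: {shortfall:.2f} points below target")
```

A ranking like this is only a starting point: a small shortfall on a business-critical SLI may still deserve attention before a larger shortfall elsewhere.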

Finally, product managers can use SLIs to communicate with users and stakeholders. By setting specific SLIs and monitoring performance against them, product managers can communicate clearly what users can expect from the service. This can help to build trust and confidence in the service, which can lead to increased usage and revenue.

Conclusion

Service Level Indicators are critical tools for product managers to measure and manage the performance of their services. By understanding the differences between SLIs, SLOs, and SLAs, product managers can establish clear goals for their services and align expectations with stakeholders. By monitoring SLIs and making improvements based on them, product managers can continuously improve the performance of their services and deliver value to their users and stakeholders.

Once the "complex" step of defining SLIs (the metrics) and SLOs (their health thresholds) is done, a product manager can delegate the rest to the engineering team and "control" product health via the established indicators. If you want to learn how to do it, work through the 3 dedicated chapters of our practical simulator "Technology for product managers".
