Understanding SLI, SLO, and SLA

Service Level Indicator (SLI):

An SLI is a measurement of how well a service is doing what it is supposed to do. It is something users can "feel" when they use the product. From a user's perspective, for instance, response time, error rate, and availability are all indicators.

Going deeper, here are some classic examples of SLIs, each paired with a sample target:

  • Request latency should be under 330 milliseconds

  • The availability of the server should be 99.9% for a given period

  • Throughput for an e-commerce endpoint, measured for instance as the number of successful purchases per minute

  • The error rate for the service should be below 1%

All of the points above help us measure the service level a system delivers. SLIs are typically defined together by Product Managers, Product Owners, and SREs, so that the resulting objectives are both technical and clear.
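To make the indicators above concrete, here is a minimal sketch of how they could be computed from a batch of request records. The Request type, the function names, and the choice to count only 5xx responses as failures are assumptions made for illustration, not part of any particular monitoring tool.

```python
import math
from dataclasses import dataclass


@dataclass
class Request:
    """Hypothetical record of one served request."""
    latency_ms: float
    status: int  # HTTP status code


def availability(requests):
    """Fraction of requests that did not fail with a server error (5xx)."""
    ok = sum(1 for r in requests if r.status < 500)
    return ok / len(requests)


def error_rate(requests):
    """Fraction of requests that failed with a server error (5xx)."""
    return 1.0 - availability(requests)


def fraction_under(requests, threshold_ms):
    """Fraction of requests served faster than threshold_ms."""
    fast = sum(1 for r in requests if r.latency_ms < threshold_ms)
    return fast / len(requests)


def latency_percentile(requests, pct):
    """Nearest-rank percentile of request latency, in milliseconds."""
    ordered = sorted(r.latency_ms for r in requests)
    rank = max(math.ceil(pct / 100 * len(ordered)), 1)
    return ordered[rank - 1]


# Made-up sample data: three successful requests and one server error.
sample = [
    Request(120, 200),
    Request(310, 200),
    Request(95, 200),
    Request(450, 500),
]

print("availability:", availability(sample))
print("error rate:", error_rate(sample))
print("under 330 ms:", fraction_under(sample, 330))
print("p50 latency:", latency_percentile(sample, 50))
```

In a real system these values would usually come from a metrics pipeline over a rolling window rather than from an in-memory list, but the arithmetic is the same.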

Service Level Objective (SLO):

An SLO, on the other hand, is built around the word "promise." It is the target a service commits to meeting most of the time, and it quantifies the reliability of a product. It matters because it is directly tied to the customer experience.

Some examples that can be associated with SLOs:

  • Response time of 100 milliseconds for all requests

  • System uptime of 99.99%

  • An error rate of less than 0.8%

  • An error budget, meaning the amount of unreliability the SLO still allows (100% minus the target)

SLO targets tend to be aggressive. However, chasing perfection may not be worth the cost. In the end, what matters is that customers are happy. If 99.99% already keeps customers happy and gives them peace of mind, there is no need to aim for a higher value.
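The cost of each extra "nine" is easier to see as an error budget. The sketch below converts an availability target into the minutes of downtime it permits. The 30-day window and the function name are assumptions chosen for illustration.

```python
def error_budget_minutes(slo_percent, window_days=30):
    """Minutes of downtime an availability SLO allows within the window."""
    window_minutes = window_days * 24 * 60
    return window_minutes * (1 - slo_percent / 100)


# Compare two of the targets mentioned in this article.
for target in (99.9, 99.99):
    budget = error_budget_minutes(target)
    print(f"{target}% over 30 days allows about {budget:.2f} minutes of downtime")
```

Over 30 days, 99.9% leaves roughly 43 minutes of budget while 99.99% leaves a little over 4, which is why a stricter target should be justified by what customers actually need.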

In the preface of the book Implementing Service Level Objectives: A Practical Guide to SLIs, SLOs, and Error Budgets[7], the author gives a great example of the idea that "You Don't Have to Be Perfect," which could help you on your journey with SLOs.

Service Level Agreement (SLA):

If a promise like the ones described above is broken, something tangible, such as a value or a price, must be on the table. In other words, an SLA is a contract. Most of the consequences are financial, but the terms can vary, for instance:

  • If uptime falls below 99.9% during Black Friday week, the provider will issue a 40% discount to the customer.

  • Support requests will be responded to within 1 hour.

  • Maintenance will be scheduled outside of business hours.
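The uptime clause above can be expressed as a small rule. This is only a sketch: the function name, the monthly_fee parameter, and the decision to apply the discount to a single billing amount are assumptions, and a real contract would spell out the measurement window and how credits are applied.

```python
def sla_credit(measured_uptime_percent, monthly_fee,
               threshold_percent=99.9, credit_percent=40.0):
    """Return the discount owed when measured uptime falls below the threshold."""
    if measured_uptime_percent < threshold_percent:
        return monthly_fee * credit_percent / 100
    return 0.0


# Hypothetical invoice of 1000 currency units.
print(sla_credit(99.5, 1000.0))   # uptime below 99.9%, so a credit is owed
print(sla_credit(99.95, 1000.0))  # uptime meets the SLA, so no credit
```

Note the link between the layers: the SLI supplies the measured uptime, the SLO is the internal target, and the SLA defines what happens when the contractual threshold is missed.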
