You work for a global organization and run a service with an availability target of 99% with limited engineering resources. For the current calendar month, you noticed that the service has 99.5% availability. You must ensure that your service meets the defined availability goals and can react to business changes, including the upcoming launch of new features. You also need to reduce technical debt while minimizing operational costs. You want to follow Google-recommended practices. What should you do?
A. Add N+1 redundancy to your service by adding additional compute resources to the service.
B. Identify, measure, and eliminate toil by automating repetitive tasks.
C. Define an error budget for your service level availability and minimize the remaining error budget.
D. Allocate available engineers to the feature backlog while you ensure that the service remains within the availability target.
Disclaimer
This is a practice question. There is no guarantee that it will appear in the certification exam.
Answer
B
Explanation
A. Add N+1 redundancy to your service by adding additional compute resources to the service.
B. Identify, measure, and eliminate toil by automating repetitive tasks.
(With a 99% availability target and limited engineering resources, this approach aligns with Google’s Site Reliability Engineering (SRE) principles. Automating manual, repetitive operational work makes the team more efficient, reduces the risk of human error, and frees up engineering time.
Eliminating toil helps the service keep meeting its availability target by reducing the opportunity for manual errors. It also lets engineers spend more time on strategic work: reducing technical debt and responding more quickly to business changes, such as the launch of new features.
Overall, Option B addresses the need for operational efficiency and resource optimization while keeping the service reliable.)
C. Define an error budget for your service level availability and minimize the remaining error budget.
D. Allocate available engineers to the feature backlog while you ensure that the service remains within the availability target.
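The figures in the question can be checked with a little arithmetic. The sketch below is only an illustration: it assumes a 30-day month and treats availability as the fraction of minutes the service was up, and the function names are hypothetical rather than part of any Google tooling.

```python
MINUTES_PER_DAY = 24 * 60


def error_budget_minutes(slo: float, days: int = 30) -> float:
    """Allowed downtime in minutes for an availability target over `days` days.

    Assumption: the month is 30 days long unless told otherwise.
    """
    return (1 - slo) * days * MINUTES_PER_DAY


def budget_consumed_fraction(slo: float, measured: float) -> float:
    """Fraction of the error budget already used, given measured availability."""
    return (1 - measured) / (1 - slo)


# Figures taken from the question: 99% target, 99.5% measured this month.
target = 0.99
measured = 0.995

print(f"Allowed downtime: about {error_budget_minutes(target):.0f} minutes")
print(f"Budget consumed:  about {budget_consumed_fraction(target, measured):.0%}")
```

Under these assumptions the service has used roughly half of its monthly error budget, which is consistent with the question's premise that the availability target is currently being met.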