GCP Services SLAs

Introduction

Google Cloud offers a broad suite of services covering compute, storage, databases, networking, AI, and security. Each service comes with a defined Service Level Agreement (SLA) that commits Google to specific service levels, most often uptime and availability, with durability and latency figures published for some services. These commitments let businesses rely on Google Cloud for mission-critical applications.

SLI (Service Level Indicator)

An SLI is a specific metric that measures the performance of a service. It quantifies one aspect of how the service behaves, such as uptime, latency, or error rate.

  • Example: Imagine you run a website, and you want to measure how fast your pages load. The SLI in this case might be the “percentage of page loads that complete in under 2 seconds.”
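As a rough illustration, the page-load SLI from this example could be computed from raw timing samples as follows. This is a minimal sketch: the function name `page_load_sli` and the sample values are made up for the example, and a real setup would read timings from a monitoring system instead of a hard-coded list.

```python
def page_load_sli(load_times_s, threshold_s=2.0):
    """Return the percentage of page loads that finished in under threshold_s seconds."""
    if not load_times_s:
        raise ValueError("need at least one page-load sample")
    fast = sum(1 for t in load_times_s if t < threshold_s)
    return 100.0 * fast / len(load_times_s)


# Hypothetical sample of page-load times, in seconds.
samples = [0.8, 1.2, 1.9, 2.4, 0.5, 3.1, 1.1, 1.7, 0.9, 1.4]

sli = page_load_sli(samples)
print(f"SLI: {sli:.1f}% of page loads completed in under 2 s")
```

Eight of the ten sample loads finish in under 2 seconds, so the SLI for this sample is 80%.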

SLO (Service Level Objective)

An SLO is a target or goal for an SLI. It defines the acceptable level of service performance. SLOs are internal goals that you strive to meet to keep your service reliable and users satisfied.

  • Example: Continuing with the website example, your SLO could be that “99% of all page loads should complete in under 2 seconds.” This is the target you set based on your SLI.
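To make the link between an SLI and an SLO concrete, the sketch below checks a measured SLI against the 99% target and reports how many more slow page loads the target still tolerates (often called the error budget). The function name `slo_report` and the request counts are assumptions chosen for illustration.

```python
def slo_report(good_events, total_events, slo_percent=99.0):
    """Compare a measured SLI against an SLO and report the remaining error budget."""
    if total_events <= 0:
        raise ValueError("total_events must be positive")
    sli = 100.0 * good_events / total_events
    # The error budget is the number of "bad" events the SLO tolerates.
    allowed_bad = total_events * (100.0 - slo_percent) / 100.0
    actual_bad = total_events - good_events
    return {
        "sli_percent": sli,
        "slo_met": sli >= slo_percent,
        "budget_remaining": allowed_bad - actual_bad,
    }


# Hypothetical month: 100,000 page loads, 99,400 of them under 2 seconds.
report = slo_report(good_events=99_400, total_events=100_000)
print(report)
```

With these numbers the SLO is met, and roughly 400 more slow page loads could occur before the 99% target is missed.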

SLA (Service Level Agreement)

An SLA is a formal contract or agreement between a service provider and a customer. It specifies the level of service that the provider commits to delivering, often including penalties if these levels are not met. SLAs usually include SLOs but also cover other aspects like support response times.

  • Example: If you’re providing a website hosting service, you might have an SLA with your clients that promises “99.9% uptime per month.” If you fail to meet this, you might be required to compensate the client, perhaps with a discount or credit for that month.
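The compensation side of an SLA can be sketched as a lookup from measured uptime to a credit percentage. The tiers below are purely hypothetical and are not taken from any real SLA, including Google Cloud's. They only show the shape such a calculation might take.

```python
# Hypothetical credit schedule: (minimum uptime %, credit as % of the monthly bill).
# These tiers are illustrative only and do not come from any real SLA.
CREDIT_TIERS = [
    (99.9, 0),   # SLA met: no credit owed
    (99.0, 10),
    (95.0, 25),
    (0.0, 50),
]


def service_credit(uptime_percent, monthly_bill):
    """Return the credit owed for a month, based on the first tier the uptime reaches."""
    for min_uptime, credit_percent in CREDIT_TIERS:
        if uptime_percent >= min_uptime:
            return monthly_bill * credit_percent / 100
    raise ValueError("uptime_percent must be between 0 and 100")


print(service_credit(99.95, 200.0))  # SLA met
print(service_credit(99.5, 200.0))   # SLA missed
```

Under these assumed tiers, a month at 99.5% uptime on a 200.00 bill would earn a 10% credit, while a month at 99.95% earns none.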

Real-Life Example

Let’s say you’re using a cloud service to host your website.

  • SLI: You measure the uptime of the service and find that it was up 99.95% of the time over the past month.
  • SLO: Internally, you have set a goal to maintain at least 99.9% uptime each month.
  • SLA: The cloud provider has an SLA with you that guarantees 99.9% uptime. If they fail to meet this, they might offer you a refund or a service credit as compensation.
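The uptime figure in this example can be derived from recorded downtime. The sketch below assumes a 30-day month and a hypothetical 21.6 minutes of downtime, which works out to the 99.95% measured above, and then compares the result with the 99.9% guarantee.

```python
def uptime_percent(downtime_minutes, days_in_month=30):
    """Return uptime as a percentage of the month, given total downtime in minutes."""
    total_minutes = days_in_month * 24 * 60
    if not 0 <= downtime_minutes <= total_minutes:
        raise ValueError("downtime must be between 0 and the length of the month")
    return 100.0 * (total_minutes - downtime_minutes) / total_minutes


SLA_PERCENT = 99.9  # guarantee from the example above

measured = uptime_percent(downtime_minutes=21.6)  # hypothetical downtime figure
sla_met = measured >= SLA_PERCENT
print(f"Measured uptime: {measured:.2f}% (SLA met: {sla_met})")
```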

Summary

  • SLI is the actual measurement (e.g., 99.95% uptime).
  • SLO is the goal you set (e.g., 99.9% uptime).
  • SLA is the formal agreement or guarantee (e.g., 99.9% uptime with penalties for failure).

GCP Services SLAs

| Service/Product | Uptime SLA | Availability SLA | Durability SLA | Latency SLA |
|---|---|---|---|---|
| Compute | | | | |
| Compute Engine | 99.99% | 99.99% | N/A | |
| App Engine (Standard) | 99.95% | 99.95% | | |
| App Engine (Flexible) | 99.95% | 99.95% | | |
| Kubernetes Engine | 99.95% for regional clusters | 99.95% | N/A | |
| Cloud Functions | 99.95% | 99.95% | | |
| Cloud Run | 99.95% | 99.95% | | |
| Bare Metal Solution | 99.9% | 99.9% | | |
| Storage & Databases | | | | |
| Cloud Storage (Multi-Regional) | 99.95% | 99.95% | 99.999999999% (11 9’s) | |
| Cloud Storage (Regional) | 99.9% | 99.9% | 99.999999999% (11 9’s) | |
| Persistent Disk | 99.99% | 99.99% | 99.999999999% (11 9’s) | Sub-millisecond latency |
| Cloud Bigtable | 99.95% | 99.95% | | <10ms read/write (95th percentile) |
| Cloud Spanner | 99.999% (multi-region), 99.99% (regional) | 99.999% (multi-region), 99.99% (regional) | 99.999999999% (11 9’s) | 10ms for reads, 50ms for writes (99th percentile) |
| Cloud SQL | 99.95% | 99.95% | N/A | |
| Cloud Datastore | 99.95% | 99.95% | 99.999999999% (11 9’s) | <10ms (99th percentile) |
| Cloud Firestore | 99.999% (multi-region), 99.99% (regional) | 99.999% (multi-region), 99.99% (regional) | 99.999999999% (11 9’s) | <10ms (99th percentile) |
| Filestore | 99.99% | 99.99% | N/A | |
| Cloud Memorystore | 99.9% (for Redis) | 99.9% | | Sub-millisecond latency |
| Cloud BigQuery | 99.99% | 99.99% | | 3 seconds query execution (median) |
| Cloud Dataflow | 99.9% | 99.9% | | |
| Cloud Data Fusion | 99.9% | 99.9% | | |
| Cloud Composer | 99.9% | 99.9% | | |
| Cloud Data Catalog | 99.9% | 99.9% | | |
| Cloud Dataproc | 99.9% | 99.9% | | |
| Cloud Datalab | 99.9% | 99.9% | | |
| Networking | | | | |
| Cloud Load Balancing | 99.99% | 99.99% | N/A | |
| Cloud CDN | 99.95% | 99.95% | | |
| Cloud DNS | 100% availability | 100% availability | | <10ms (global average) |
| Cloud Interconnect | 99.99% | 99.99% | | 50-100ms (RTT latency) |
| Cloud VPN | 99.9% | 99.9% | | |
| Cloud Armor | 99.99% | 99.99% | N/A | |
| Traffic Director | 99.99% | 99.99% | N/A | |
| AI and Machine Learning | | | | |
| AI Platform (Vertex AI) | 99.9% | 99.9% | | |
| Dialogflow | 99.9% | 99.9% | | |
| AutoML | 99.9% | 99.9% | | |
| Cloud Natural Language | 99.9% | 99.9% | | |
| Cloud Translation API | 99.9% | 99.9% | | |
| Cloud Vision | 99.9% | 99.9% | | |
| Cloud Speech-to-Text | 99.9% | 99.9% | | |
| Cloud Text-to-Speech | 99.9% | 99.9% | | |
| AI Hub | 99.9% | 99.9% | | |
| Operations | | | | |
| Cloud Logging | 99.9% | 99.9% | | |
| Cloud Monitoring | 99.9% | 99.9% | | |
| Cloud Trace | 99.9% | 99.9% | | |
| Cloud Debugger | 99.9% | 99.9% | | |
| Cloud Profiler | 99.9% | 99.9% | | |
| Security | | | | |
| Cloud IAM | 99.95% | 99.95% | | |
| Identity-Aware Proxy | 99.9% | 99.9% | | |
| Cloud Key Management Service (KMS) | 99.95% | 99.95% | | |
| Secret Manager | 99.95% | 99.95% | | |
| Access Transparency | 99.9% | 99.9% | | |
| API Management & Integration | | | | |
| Apigee | 99.95% | 99.95% | | |
| Cloud Endpoints | 99.9% | 99.9% | | |
| Cloud Pub/Sub | 99.95% | 99.95% | | <100ms (publish-to-acknowledge latency) |
| Cloud Tasks | 99.9% | 99.9% | | |
| Cloud Scheduler | 99.9% | 99.9% | | |
| Developer Tools | | | | |
| Cloud Build | 99.95% | 99.95% | | |
| Cloud Source Repositories | 99.9% | 99.9% | | |
| Cloud Code | 99.9% | 99.9% | | |
| Cloud Deployment Manager | 99.9% | 99.9% | | |
| Cloud Container Registry | 99.95% | 99.95% | | |
| Hybrid and Multi-cloud | | | | |
| Anthos | 99.9% for GKE on-prem | 99.9% for GKE on-prem | | |
| Traffic Director | 99.99% | 99.99% | N/A | |

SLAs can change over time. Refer to the Google Cloud SLA documentation for the most up-to-date information.

Notes:

  • Uptime SLA: This indicates the guaranteed uptime of the service. For example, a 99.9% SLA indicates that the service is expected to be available 99.9% of the time.
  • Availability SLA: The availability percentage that Google guarantees for the service, taking both uptime and maintenance factors into account.
  • Durability: Some services, particularly storage services, offer durability guarantees, often expressed in terms of “9’s” (e.g., 11 9’s for Cloud Storage Multi-Regional).
  • Latency: Indicates the general latency metrics, where available.
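An uptime percentage is easier to reason about when converted into the downtime it allows. The sketch below does that conversion for a few of the percentages that appear in the table, assuming a 30-day month. The function name `allowed_downtime_minutes` is made up for this example, and actual SLAs define their own measurement periods and exclusions.

```python
def allowed_downtime_minutes(sla_percent, days=30):
    """Return the downtime, in minutes, that an uptime percentage permits over `days` days."""
    if not 0 <= sla_percent <= 100:
        raise ValueError("sla_percent must be between 0 and 100")
    total_minutes = days * 24 * 60
    return total_minutes * (100.0 - sla_percent) / 100.0


for sla in (99.9, 99.95, 99.99, 99.999):
    minutes = allowed_downtime_minutes(sla)
    print(f"{sla}% uptime allows {minutes:.2f} minutes of downtime per 30 days")
```

For a 99.9% SLA this comes to about 43 minutes of downtime in a 30-day month, and for 99.99% a little over 4 minutes.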
