May 03, 2021

Last minute Google Professional Cloud Architect notes

During my certification exam preparation, I made notes for easy last-minute reference. I referred to them in the last few hours before the exam for a quick revision of key concepts and keyword mappings, which is very important. Since the exam syllabus covers many topics, notes are quite helpful. I’m sharing my notes here.

Keyword Tips

Remember the keyword mappings so that when you encounter specific or related terminology in a question, you can map it to the Google Cloud service offerings listed in the points below.

If an exam question contains any of these keywords, the matching service is likely to appear among the given options as the answer.

  • Cloud Firestore for Firebase
    • Mobile, HTML5, mobile SDKs, non-relational data
  • Cloud Functions
    • Event-driven, lightweight application, single purpose, languages (Node.js, Python, Go, Java, .NET Core, Ruby, and PHP)
  • Google Compute Engine (GCE)
    • Specific OS or kernel, GPUs, HTTPS, licensed software, hybrid or multi-cloud deployment, hardened OS, OS image
  • Managed Instance Group (MIG)
    • Managed, unmanaged
  • Google Kubernetes Engine (GKE)
    • Orchestration, containers, HTTPS, licensed software, hybrid or multi-cloud deployment, hardened OS, OS image, deployment
  • Cloud Run
    • Stateless HTTP, fully managed for running containerized applications
    • Serverless for containerized applications
    • Billed to the nearest 100 milliseconds
    • Stateless containers invoked via web requests or Pub/Sub events
    • Scales to zero
    • Use cases
      • REST API backend, web services, back office administration
      • Lightweight data transformation, scheduled document generation
      • Business workflow with webhooks, migrating Node.js from Heroku to Cloud Run
      • Modernizing .NET applications
    • No background process
    • 15 mins request timeout
  • Google Cloud Storage (GCS)
    • Unstructured data, raw data, drive
  • Bigtable
    • Online Transaction Processing (OLTP), storage autoscales, processing nodes scaled manually, high-throughput, zonal, HBase API, low-latency, need updates, analytics, time series data, NoSQL database, terabytes and petabytes of data, highly scalable, Cassandra
    • Applications – Google Search, Maps and Gmail, financial data, IoT (Internet of Things), fraud detection, recommendation engines
  • BigQuery
    • Analytics, Online Analytical Processing (OLAP), data warehouse for analytics, data store for analytics, historical data queries, analyze petabytes of data, serverless, keep appending data, you can’t update SQL queries, immutable, forecasting, data lake, column-store data warehouse, cached query results free, multi-regional
  • Cloud Spanner
    • Relational data, horizontal scalability, OLTP, managed SQL DB, transactional data, strongly consistent, high availability for multi-region, scaled from 1 to 1000 nodes
  • Cloud SQL
    • Relational data, does not scale beyond a few terabytes, region based
  • Cloud Datastore
    • Non-relational data, managed NoSQL database with indexes, supports SQL like queries, ACID transaction support, no joins or aggregates
  • Cloud Dataflow
    • ETL, managed ETL, data processing, Apache Beam jobs, serverless, portability, data pipelines, streaming analytics for stream and batch processing, zonal, stream MapReduce, fully managed
    • New data processing pipelines, unified streaming and batch processing, no-ops, processing for ML
    • Batch + Streaming processing for analysis (Ab Initio, TIBCO Spotfire, etc.)
  • Cloud Pub/Sub
    • Data ingestion from external or internal sources for further operation, global
  • Cloud Filestore
    • Managed NFS file servers, files
  • Cloud Firestore
    • NoSQL, document storage, collections, documents, multi-regional
  • Cloud Dataproc
    • Image versioning, Hadoop cluster (Apache Pig, Hive, Spark), data processing, DevOps, global, managed Cloud Dataprep, data wrangling, visually explore, clean and prepare data for analysis
    • Existing Hadoop or Spark applications, Machine Learning, Spark ML
    • Data Science Ecosystem, tunable cluster parameters
    • Iterative processing and notebooks
  • App Engine Standard
    • Managed service, heavy application, languages (Python, Java, Node.js, PHP, Ruby, Go, JavaScript, and HTML)
  • App Engine Flexible
    • Fully automated container based applications, supports all languages as containers
    • Go, Java, PHP, Python, .NET, Node.js, Ruby
    • Custom runtime (using custom Docker image or Dockerfile from open source)
  • Data Catalog
    • Metadata management service
  • Cloud Datalab
    • Interactive data exploration, analysis, visualization and ML (Machine Learning), Jupyter Notebook
  • Data Studio
    • Big data, visualization tool for dashboards and reporting
  • Data Transfer Appliance
    • Rackable, high-capacity storage, ingest only, 100TB/480TB, 6GB/sec
  • Storage Transfer Service
    • Destination GCP bucket, source S3 bucket or GCP bucket, HTTP(s) endpoint
  • Load Balancing
    • High performance, scalable traffic, auto-scalability
    • Regional LB – health checks, forwarding rules based on IP, protocols (TCP, UDP)
    • Global LB – multi-region failover for HTTP(s), SSL proxy, TCP proxy, prioritize low-latency
  • Cloud CDN
    • Low-latency, HTTP(s), no custom origin (GCP only)
  • Cloud DNS
    • DNS peering, DNS forwarding, authoritative DNS lookup
    • DNS zone management, automatic scaling
  • Border Gateway Protocol (BGP)
    • Allows routers to exchange information about network topology
    • Routing and reachability (cloud router)
  • Cloud NAT
    • Allows GCP resource to access internet without external IP address
  • VPC (global)
    • Global IPv4 unicast, software defined network (SDN)
    • VPC is global and subnets are regional
    • Configure subnets (each with private IP range), routes, firewalls, VPNs, BGP, etc.
    • Shared across multiple projects in same organization
    • Private SDN in GCP
  • Subnet (regional)
    • Logical spaces that contain resources
  • Route (global)
    • Define “next hop” for the traffic based on destination IP address
  • Cloud Interconnect
    • Connecting external networks to Google’s network
    • Private connections to VPC via cloud VPN or dedicated/partner interconnect
    • 10 Gbps or 100 Gbps
  • Cloud VPN
    • IPSec VPN to connect to VPC via public internet for low-volume data connections
    • For persistent, static connections between gateways
    • 1.5 to 3 Gbps
  • Dedicated Interconnect
    • Direct physical link between VPC and on-prem for high-volume data connections
    • VLAN attachment is private connection to VPC in one region
    • Links are private but not encrypted, can layer your own encryption
    • Connect N x 10G transport circuits
    • For private cloud to Google cloud at Google PoPs (point of presence) with SLAs
    • 100 Gbps speed
  • Partner Interconnect
    • Connectivity from on-premises network to Google Cloud through a supported service provider
    • 50 Mbps to 10 Gbps
  • Direct Peering
    • Private connection between organization and Google for hybrid cloud workloads
  • Carrier Peering
    • Connection through the largest partner network of service providers
  • Cloud Router
    • Dynamic routing (BGP) for hybrid networks linking GCP VPCs to external networks
    • Works with Cloud VPN and dedicated interconnect
  • CDN Interconnect
    • Direct, low-latency connectivity to certain CDN providers, with cheaper egress
  • Google recommendation
    • If more than 60 TB of data must be transferred from on-prem to the cloud, use Transfer Appliance, then use the rehydrator to decrypt the data
  • Cloud Dataprep
    • UI driven data processing, scales on-demand, fully managed, no-ops
    • To explore and clean data, to detect anomalies
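The 60 TB recommendation in the list above comes down to simple arithmetic on data size and link speed. Here is a minimal Python sketch of that calculation. It is not an official Google formula: the function name `transfer_days` and the `efficiency` parameter (the fraction of nominal bandwidth you actually get) are my own assumptions for illustration.

```python
def transfer_days(data_tb: float, link_gbps: float, efficiency: float = 1.0) -> float:
    """Estimate the days needed to move data_tb terabytes over a link_gbps link.

    Assumes decimal units (1 TB = 10**12 bytes, 1 Gbps = 10**9 bits per second).
    efficiency is a hypothetical factor for protocol overhead and link sharing.
    """
    if link_gbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("link_gbps must be > 0 and efficiency must be in (0, 1]")
    bits = data_tb * 1e12 * 8                         # terabytes -> bits
    seconds = bits / (link_gbps * 1e9 * efficiency)   # bits / (bits per second)
    return seconds / 86_400                           # seconds -> days


if __name__ == "__main__":
    # Compare an online transfer of 60 TB at a few link speeds.
    for gbps in (0.1, 1.0, 10.0):
        print(f"60 TB over {gbps} Gbps: {transfer_days(60, gbps):.1f} days")
```

If the estimate runs into weeks at the bandwidth you really have, that is the situation where the notes point to the transfer appliance instead of an online transfer.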

Derivation of answers

Some keywords can be interpreted directly, so you can derive the answer and pick the correct option from the given list.

  • Analyst knows SQL
    • BigQuery
  • On-prem Spark cluster (Apache Hadoop)
    • Cloud Dataproc
  • Provide access to the audit logs to the external auditor
    • Stackdriver Logging + Cloud Storage + Signed URL
  • Web application that scales down to zero
    • App Engine Standard
  • Access to audit logs and platform analytics using SQL
    • Stackdriver Logging + BigQuery
  • Health check is failing
    • Check firewall rules
  • Store backup or archive data
    • GCS Nearline or Coldline
  • How a Compute Engine instance can access BigQuery
    • Access Scope (default service account) or IAM (custom service account)
  • Horizontally scalable transactional DB
    • Cloud Spanner
  • Jenkins + Cloud Build
    • For automating the process of integration and deployment
  • Ansible + Puppet
    • For automating configuration management
    • Coordinating instance configuration
  • Terraform + Cloud Deployment Manager
    • Creating infrastructure
    • Coordinate environment creation
  • Redacting sensitive information
    • Cloud DLP
    • Scans and classifies sensitive data in Cloud Storage, BigQuery, Datastore and a streaming API
    • Classify, mask, tokenize, and transform sensitive data elements
  • Cloud KMS
    • Encryption key management
  • Security key enforcement
    • Multi-factor authentication
  • Binary authorization
    • Only properly validated containers can run in environments
  • Cloud SCC (Security Command Center)
    • Centralizes security information so you can manage it in one place
  • Firewall rules
    • For filtering traffic
    • Applied by instance-level tags or service account(s)
    • Restrictive inbound and permissive outbound
    • For egress there is no implied deny, only an implied allow
    • Every VPC has two implied firewall rules
      • Implied allow egress (allow all outgoing traffic)
      • Implied deny ingress (block all incoming traffic)
    • Firewall rules priority between 0 and 65535
      • 0 – highest, 65535 – lowest
  • Business Agility
    • GCP Marketplace
  • DevOps (Development and IT operations)
    • Reduce organization silos
    • Accept failure as normal
    • Leverage tooling and automation
    • Implement gradual change
    • Measure everything
    • SRE is a prescriptive way of accomplishing the DevOps philosophy
  • SRE (Site Reliability Engineering)
    • Share ownership of production with developers, use same tooling
    • Blameless postmortems, encoding a concept of error budget
    • Eliminate manual work
    • Amount of toil to measure SRE work
  • SLI (Service Level Indicators)
    • Level of service provided
      • Request latency, error rate, system (batch) throughput
      • Availability, yield (durability), failures per request
    • 100% availability is impossible
    • 99th percentile or median
      • e.g. 99th percentile latency of requests received in the past five minutes is less than 300 milliseconds
  • SLO (Service Level Objectives)
    • A target value or range of values for service level measured by SLI
    • Structure
      • SLI <= target, or lower bound <= SLI <= upper bound
    • Binding target for a collection of SLIs
    • 99.9% a year
    • e.g. 95th percentile homepage SLI will succeed 99.9% of the time over the trailing year
  • SLA (Service Level Agreements)
    • Explicit or implicit contract with users that includes consequences of meeting (or missing) SLOs
    • Business agreement between a customer and service provider typically based on SLOs
    • e.g. service credit if 95th percentile homepage SLI succeeds less than 99.5% of the time over the trailing year
  • SLIs drive SLOs, which inform SLAs
  • Strangler Fig Pattern
    • Incrementally migrate a legacy system by gradually replacing specific pieces of functionality with new applications and services
  • Graphite, Prometheus, InfluxDB
    • Store and display time series data
  • WRT (Work Recovery Time)
    • Extra time it takes to recover
  • RTO (Recovery Time Objective)
    • Maximum time within which a failed workload must be recovered
  • RPO (Recovery Point Objective)
    • Maximum amount of data that the organization can afford to lose
    • How often data is backed up or replicated
    • Allowed data loss
  • RTO + WRT = MTD (Maximum Tolerable Downtime)
  • Coordinate deployments
    • Jenkins + Cloud Build
  • Available to all users
    • Canary, Rolling
  • Not rolling back to the old environment
    • Blue-green, Red-black
  • Reimplement data analysis
    • Cloud Dataflow
  • Hadoop/Spark (Jobs)
    • Cloud Dataproc
  • Scalable replacement of data warehouse
    • BigQuery
  • HTTP(s) LB (L7)
    • Cross region
  • Global SSL Proxy (L4)
    • TCP specific
  • TCP without SSL
    • Global TCP Proxy
    • Cross region
    • Specific port
  • Regional LB
    • Traffic on any port (TCP/UDP)
  • Internal LB
    • Traffic within GCP projects
  • VPN tunnel
    • Secure multi-Gbps connection over VPN tunnels
  • When incoming data is continuous, unpredictable amount of data, ETL, Streaming data, batch data, batch computation, continuous computation using streaming
    • Cloud Dataflow
  • Google Cloud Directory Sync (GCDS, free), Active Directory Federation Services (AD FS)
    • For syncing local Microsoft Active Directory users with Google Cloud users
  • GCP Stackdriver error report export
    • Cloud Storage, BigQuery, Pub/Sub, Cloud Logging logs in Bucket, Splunk
  • GCP KMS symmetric key rotation
    • Periodic
    • Automatic
  • GCP KMS asymmetric key rotation
  • Dual-region
    • Asia (Japan), America, Europe
  • GCP services not PCI compliant
    • App Engine – egress traffic (egress firewall rules not supported)
    • Cloud Functions – egress traffic
    • GKE – manual work needed to make it PCI compliant
    • Cloud Storage – need to tokenize PCI data before storing the data
  • GKE network policies work similarly to firewall rules for VPCs
  • Committed use discount
    • You commit up front when creating the resource
  • Sustained use discount
    • You use a resource for a long time and get a discount
  • Load Balancing
    • Global
      • HTTP(s) LB, SSL Proxy, TCP Proxy – 1 million QPS
    • Regional
      • Network LB, Internal LB, Internal HTTP(s) LB
    • Network LB
      • TCP/UDP (L4)
    • External
      • HTTP(s) LB, SSL Proxy LB, TCP Proxy LB, TCP/UDP Network LB
    • Internal
      • TCP/UDP LB, HTTP(s) LB
  • Cloud (Stackdriver) Trace
    • Collects latency data from VMs, containers and App Engine
    • Running applications (Microservices) request latencies
  • BigQuery
    • Partition BigQuery tables by
      • Ingestion time (PARTITIONTIME)
      • Date/Timestamp/DateTime
      • Integer range
  • Resiliency testing strategy
    • Needs chaos testing
  • App Engine Standard can’t use VPN directly, while App Engine Flexible can use it directly
  • Whenever you see that CPU usage can go down to 0.1 or another fractional value
    • GKE
  • GKE
    • Horizontal pod autoscaler
    • Cluster autoscaler
    • Vertical pod autoscaler
  • Direct Attached Storage (DAS)
    • Local SSDs
    • Persistent SSDs
  • Network Attached Storage (NAS)
    • Google Filestore (For GKE and GCE)
  • Cross project VPC communication within same or different organization
    • VPC network peering
  • Secure tunnel connection between on-prem and GCP VPC (IPSec protocol)
    • VPN
  • Dynamic routing, one or more secure tunnel
    • HA VPN
  • Dynamic or static routing, single secure tunnel
    • Classic VPN
  • Max throughput per secure tunnel
    • 3 Gbps
  • Hybrid connectivity, no setup or maintenance costs, high throughput
    • Direct Peering
  • Hybrid connectivity, service provider costs, high throughput
    • Carrier Peering
  • Hybrid connectivity, required GCP, high throughput, >10 Gbps
    • Dedicated Interconnect
  • Hybrid connectivity, required GCP, high throughput, <=10 Gbps
    • Partner Interconnect
  • Strict SLAs
    • Dedicated Interconnect – per month per circuit, per hour per VLAN
    • Partner Interconnect – per hour per VLAN
  • GCP VM needs to access the internet without a public (external) IP
    • Cloud NAT
  • Using AES128 encryption
    • HDD
  • Using AES256 encryption
    • Persistent disks
  • Support of FTP data ingestion
    • App Engine only supports HTTP data ingestion
    • GKE containers support both HTTP and FTP data ingestion
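The firewall notes in the list above (priorities from 0 to 65535, an implied allow egress rule and an implied deny ingress rule) can be sketched as a small rule evaluator. This is only a toy model for revision, not how GCP implements it: the `Rule` class and `decide` function are hypothetical names of mine, and matching is reduced to direction and port, while real rules also match on targets, source ranges and protocols. I am assuming the implied rules sit at the lowest priority (65535) and that a deny wins over an allow at the same priority.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Rule:
    name: str
    direction: str               # "ingress" or "egress"
    action: str                  # "allow" or "deny"
    priority: int                # 0 is highest, 65535 is lowest
    port: Optional[int] = None   # None matches any port (simplification)


# The two implied rules every VPC has, placed at the lowest priority (assumption).
IMPLIED = [
    Rule("implied-allow-egress", "egress", "allow", 65535),
    Rule("implied-deny-ingress", "ingress", "deny", 65535),
]


def decide(rules: Iterable[Rule], direction: str, port: int) -> str:
    """Return "allow" or "deny" for traffic in the given direction on a port."""
    if direction not in ("ingress", "egress"):
        raise ValueError("direction must be 'ingress' or 'egress'")
    candidates = [
        r for r in list(rules) + IMPLIED
        if r.direction == direction and r.port in (None, port)
    ]
    # Lowest priority number wins; on a tie, a deny rule sorts before an allow rule.
    winner = min(candidates, key=lambda r: (r.priority, r.action != "deny"))
    return winner.action


if __name__ == "__main__":
    rules = [
        Rule("allow-https", "ingress", "allow", 1000, port=443),
        Rule("deny-all-ingress", "ingress", "deny", 2000),
    ]
    for port in (443, 22):
        print(port, decide(rules, "ingress", port))
```

With no custom rules at all, the sketch falls back to the implied rules, which is the "restrictive inbound and permissive outbound" default from the notes.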

Important links

You can of course use my notes for last-minute reference, but apart from my notes I also recommend the following links as must-read references for the GCP Professional Cloud Architect certification.

Do let me know how your exam went and whether my notes were helpful. All the best for the exam.