Cloud Logging Architecture in Google Cloud


gcp-logging-architecture

Bird's-Eye Introduction

Cloud Logging architecture consists of the following components: log collection, log routing, log sinks, and log analysis.

Log Collection

These are the places where log data originates.

Log sources can be Google Cloud services, such as Compute Engine, App Engine, and Kubernetes Engine, or your own applications.
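For the application side, log entries can be written directly through the Logging API. Below is a minimal sketch using the google-cloud-logging Python client and Application Default Credentials; the logger name my-app-logger and the payload fields are illustrative:

```python
# pip install google-cloud-logging
from google.cloud import logging

client = logging.Client()  # authenticates via Application Default Credentials

# "my-app-logger" is an arbitrary example name.
logger = client.logger("my-app-logger")

# A plain-text entry and a structured (JSON) entry.
logger.log_text("Application started", severity="INFO")
logger.log_struct({"event": "score_called", "latency_ms": 123}, severity="DEBUG")
```

Structured (JSON) payloads are worth the small extra effort, since the filters and log-based metrics discussed later can key off individual fields.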

Routing

The Log Router is responsible for routing log data to its destination. The Log Router uses a combination of inclusion filters and exclusion filters to determine which log data is routed to each destination.

Log sinks are destinations where log data is stored. Cloud Logging supports a variety of log sinks, including Cloud Logging log buckets, Cloud Pub/Sub topics, BigQuery datasets, and Cloud Storage buckets. Each of these has its own benefits.

Store

Cloud Logging log buckets are storage containers designed specifically for storing log data. Cloud Pub/Sub topics can be used to route log data to other services, such as third-party logging solutions. Cloud Storage buckets provide storage of log data in Cloud Storage, where log entries are stored as JSON files.

BigQuery is a fully-managed, petabyte-scale analytics data warehouse that can be used to store and analyze log data.

Visualize & Analyze

Cloud Logging provides several tools to analyze logs. These include the Logs Explorer, Error Reporting, logs-based metrics, and Log Analytics.

Logs Explorer is optimized for troubleshooting use cases with features like log streaming.
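The same Logging query language that powers the Logs Explorer can also be used programmatically. A small sketch with the Python client; the filter below is an illustrative example:

```python
from google.cloud import logging

client = logging.Client()

# The same query syntax the Logs Explorer uses; this filter is illustrative.
query = 'resource.type="gce_instance" AND severity>=ERROR'

for entry in client.list_entries(filter_=query, max_results=10):
    print(entry.timestamp, entry.severity, entry.payload)
```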

Error Reporting helps users react to critical application errors through automated error grouping and notifications.

Logs-based metrics, dashboards and alerting provide other ways to understand and make logs actionable.

The Log Analytics feature expands the toolset to include ad hoc log analysis capabilities.

gcp-log-routing-and-storage

Logs Routing and Storage

What we call Cloud Logging is actually a collection of components exposed through a centralized logging API. The components include the Log Router, log sinks, and log storage. Entries are passed through the API and fed to the Log Router.

The Log Router is optimized for processing streaming data, reliably buffering it, and sending it to any combination of log storage and sink (export) locations. By default, log entries are fed into one of the default log storage buckets. Exclusion filters can be created to partially or totally prevent this behavior.

Log sinks run in parallel with the default log flow and can be used to direct entries to external locations. Locations can include additional Cloud Logging buckets, Cloud Storage, BigQuery, Pub/Sub, or external projects.

Inclusion and exclusion filters can control exactly which logging entries end up at a particular destination, and which are ignored completely.
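As a concrete example of routing, here is how a project-level sink with an inclusion filter might be created. A sketch with the Python client; the sink name, filter, and BigQuery destination are all illustrative:

```python
from google.cloud import logging

client = logging.Client()

# The filter acts as the inclusion filter: entries it does not match
# are ignored by this sink. All names here are examples.
sink = client.sink(
    "error-logs-to-bigquery",
    filter_="severity>=ERROR",
    destination="bigquery.googleapis.com/projects/my-project/datasets/error_logs",
)

if not sink.exists():
    sink.create()
```

After creation, the sink's writer identity (a service account) still has to be granted write access to the destination, a step that is easy to overlook.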

For each Google Cloud project, Logging automatically creates two log buckets: _Required and _Default, along with corresponding log sinks of the same names.

All logs generated in the project are stored in one of these two locations:

_Required: This bucket holds Admin Activity audit logs, System Event audit logs, and Access Transparency logs, and retains them for 400 days. You aren’t charged for the logs stored in _Required, and the retention period of the logs stored here cannot be modified. You cannot delete or modify this bucket.

_Default: This bucket holds all other ingested logs in a Google Cloud project, except for the logs held in the _Required bucket. Standard Cloud Logging pricing applies to these logs. Log entries held in the _Default bucket are retained for 30 days, unless you apply custom retention rules. You can’t delete this bucket, but you can disable the _Default log sink that routes logs to this bucket.
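The 30-day retention of the _Default bucket can be changed with a custom retention rule. A hedged sketch using the lower-level logging_v2 config client; the project id and the 90-day value are examples:

```python
from google.cloud.logging_v2.services.config_service_v2 import ConfigServiceV2Client

config = ConfigServiceV2Client()

# Bucket path components (project id, location) are illustrative.
name = "projects/my-project/locations/global/buckets/_Default"

config.update_bucket(
    request={
        "name": name,
        "bucket": {"retention_days": 90},  # example: extend retention to 90 days
        "update_mask": {"paths": ["retention_days"]},
    }
)
```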

The Logs Storage page displays a summary of statistics for the logs that your project is receiving, including:

Current total volume: The amount of logs your project has received since the first day of the current month.

Previous month volume: The amount of logs your project received in the last calendar month.

Projected volume by EOM: The estimated amount of logs your project will receive by the end of the current month, based on current usage.

Log Archiving and Exporting

gcp-log-archiving-exporting-1

Here, we are exporting logs through Pub/Sub to Dataflow, and then to BigQuery. Dataflow is an excellent option if you’re looking for real-time log processing at scale. In this example, the Dataflow job could react to real-time issues while streaming the logs into BigQuery for longer-term analysis.
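The first leg of this pipeline is just a log sink whose destination is the Pub/Sub topic the Dataflow job reads from. A sketch with illustrative names:

```python
from google.cloud import logging

client = logging.Client()

# Topic, sink name, and filter are examples; a Dataflow job would
# consume from a subscription attached to this topic.
sink = client.sink(
    "logs-to-pubsub",
    filter_='resource.type="gce_instance"',
    destination="pubsub.googleapis.com/projects/my-project/topics/log-stream",
)
sink.create()
```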

gcp-log-archiving-exporting-2

Here, we are storing logs in Cloud Storage via a log sink, so we can take advantage of Cloud Storage features such as long-term retention, reduced storage costs, and configurable object lifecycles, including automated storage class changes, auto-delete, and guaranteed retention.
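A sketch of this setup, assuming the google-cloud-logging and google-cloud-storage Python clients; the bucket name, sink name, and lifecycle ages are illustrative:

```python
from google.cloud import logging, storage

log_client = logging.Client()
storage_client = storage.Client()

bucket = storage_client.get_bucket("my-log-archive")  # example bucket

# Example lifecycle: move objects to Coldline after 90 days,
# delete them after ~5 years.
bucket.add_lifecycle_set_storage_class_rule("COLDLINE", age=90)
bucket.add_lifecycle_delete_rule(age=1825)
bucket.patch()

sink = log_client.sink(
    "logs-to-archive",
    destination="storage.googleapis.com/my-log-archive",
)
sink.create()
```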

gcp-log-archiving-exporting-3

Here, we have an example organization that wants to integrate the logging data from Google Cloud back into an on-premises Splunk instance. To ingest logs into Splunk, you can either stream them through Pub/Sub to the Splunk Dataflow template or use the Splunk Add-on for Google Cloud.

Pub/Sub is one of the options available for exporting to Splunk, or to other third-party Security Information and Event Management (SIEM) software packages.
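Whatever the SIEM, the integration ultimately consumes messages from a Pub/Sub subscription, where each message body is a JSON-serialized LogEntry. A minimal pull-subscriber sketch; the subscription name is an assumption:

```python
from concurrent.futures import TimeoutError
from google.cloud import pubsub_v1

subscriber = pubsub_v1.SubscriberClient()
# Assumed subscription attached to the topic the log sink publishes to.
subscription = subscriber.subscription_path("my-project", "log-stream-sub")

def callback(message: pubsub_v1.subscriber.message.Message) -> None:
    # Each message is a JSON-serialized LogEntry; forward it to the SIEM here.
    print(message.data.decode("utf-8"))
    message.ack()

future = subscriber.subscribe(subscription, callback=callback)
try:
    future.result(timeout=30)  # block briefly for this sketch
except TimeoutError:
    future.cancel()
```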

Logs Aggregation Levels

gcp-log-agreegation-levels

A common logging need is centralized log aggregation for auditing, retention, or non-repudiation purposes. Aggregated sinks allow for easy exporting of logging entries without a one-to-one setup. There are three available Google Cloud Logging aggregation levels.

Project

A project-level sink exports all the logs for a specific project; a log filter can be specified in the sink definition to include or exclude certain log types.

Folder

A folder-level log sink aggregates logs at the folder level and can include logs from child resources (subfolders and projects).

Organization

For a global view, an organization-level log sink aggregates logs at the organization level and can also include logs from child resources (folders, subfolders, and projects).
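Folder- and organization-level sinks set include_children so that everything beneath the parent is captured. A hedged sketch using the logging_v2 config client; the organization id, sink name, filter, and destination are all examples:

```python
from google.cloud.logging_v2.services.config_service_v2 import ConfigServiceV2Client
from google.cloud.logging_v2.types import LogSink

config = ConfigServiceV2Client()

sink = LogSink(
    name="org-audit-logs",
    destination="bigquery.googleapis.com/projects/central-project/datasets/org_audit",
    filter='logName:"cloudaudit.googleapis.com"',  # audit logs only (example)
    include_children=True,  # pull in logs from all folders and projects below
)

config.create_sink(parent="organizations/123456789012", sink=sink)
```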

Best Practice (Security Analytics)

gcp-logs-best-practice

Security practitioners onboard Google Cloud logs for security analytics. By performing security analytics, you help your organization prevent, detect, and respond to threats like malware, phishing, ransomware, and poorly configured assets.

One of the steps in the security log analytics workflow is to create aggregated sinks and route logs to a single destination, depending on your choice of security analytics tool, such as Log Analytics, BigQuery, Chronicle, or a third-party security information and event management (SIEM) technology.

Logs are aggregated from your organization, including any contained folders, projects, and billing accounts.

Log-based Metrics

gcp-log-based-metrics

Logs-based metrics derive metric data from the content of log entries. For example, metrics can track the number of entries that contain specific messages or extract latency information that is reported in the logs.

These metrics are transformed into time-series data that can be used in Cloud Monitoring charts and alerting policies. There are two types of log-based metrics: system-defined log-based metrics and user-defined log-based metrics.

System-defined log-based metrics are provided by Cloud Logging and can be used by all Google Cloud projects. They are calculated only from logs that have been ingested by Logging; if a log has been explicitly excluded from ingestion by Cloud Logging, it isn’t included in these metrics.

User-defined log-based metrics are created by you to track things in your Google Cloud project that are of particular interest to you. For example, you might create a log-based metric to count the number of log entries that match a given filter.

Log-based metrics suitable use cases

Log-based metrics are suitable when you want to do any of the following:

  • Count the occurrences of a message, like a warning or error, in your logs and receive a notification when the number of occurrences crosses a threshold (a sketch of the alerting half follows this list).
  • Observe trends in your data, like latency values in your logs, and receive a notification if the values change in an unacceptable way.
  • Visualize extracted data: create charts to display the numeric data extracted from your logs.
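For the first use case, the notification half is handled by Cloud Monitoring: an alert policy that watches the log-based metric’s time series. Below is a minimal sketch, assuming the google-cloud-monitoring Python client; the project id and the user-defined metric name error_count are illustrative (user-defined log-based metrics surface under the metric type prefix logging.googleapis.com/user/):

```python
from google.cloud import monitoring_v3

client = monitoring_v3.AlertPolicyServiceClient()

# The metric name "error_count" and project id are illustrative examples.
condition = monitoring_v3.AlertPolicy.Condition(
    display_name="Error-log count above threshold",
    condition_threshold=monitoring_v3.AlertPolicy.Condition.MetricThreshold(
        filter='metric.type="logging.googleapis.com/user/error_count"',
        comparison=monitoring_v3.ComparisonType.COMPARISON_GT,
        threshold_value=100,          # alert when > 100 matching entries...
        duration={"seconds": 300},    # ...sustained over 5 minutes
        aggregations=[
            monitoring_v3.Aggregation(
                alignment_period={"seconds": 300},
                per_series_aligner=monitoring_v3.Aggregation.Aligner.ALIGN_DELTA,
            )
        ],
    ),
)

policy = monitoring_v3.AlertPolicy(
    display_name="High error-log volume",
    combiner=monitoring_v3.AlertPolicy.ConditionCombinerType.AND,
    conditions=[condition],
)

client.create_alert_policy(name="projects/my-project", alert_policy=policy)
```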

Log-based metric types

gcp-types-of-log-based-metrics

Counter metrics

All predefined system log-based metrics are of the counter type, but user-defined metrics can be counter, distribution, or boolean types.

Counter metrics count the number of log entries matching an advanced logs query. So, if we simply wanted to know how many of our “/score called” entries were generated, we could create a counter.
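Continuing the “/score called” example, such a counter can be created with the google-cloud-logging Python client. A minimal sketch; the metric name and filter are taken from the example above:

```python
from google.cloud import logging

client = logging.Client()

# Counts every log entry whose text payload contains "/score called".
metric = client.metric(
    "score_called_count",
    filter_='textPayload:"/score called"',
    description="Number of '/score called' log entries",
)

if not metric.exists():
    metric.create()
```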

Distribution metrics

Distribution metrics record the statistical distribution of the extracted log values in histogram buckets. The extracted values are not recorded individually. Their distribution across the configured buckets is recorded, along with the count, mean, and sum of squared deviations of the values.

Boolean metrics

Boolean metrics record whether a log entry matches a specified filter.

Scope of log-based metrics

gcp-scope-of-log-based-metrics

System-defined log-based metrics apply at the Google Cloud project level. These metrics are calculated by the Log Router and apply to logs only in the Google Cloud project in which they’re received.

User-defined log-based metrics can apply at either the Google Cloud project level or at the level of a specific log bucket.

Project-level metrics are calculated like system-defined log-based metrics; they apply only to logs in the Google Cloud project in which they’re received.

Bucket-scoped metrics apply to logs in the log bucket in which they’re received, regardless of the Google Cloud project in which the log entries originated. With bucket-scoped log-based metrics, you can therefore evaluate logs that are routed from one project to a bucket in another project.

Log Analytics

gcp-log-analytics

Log Analytics gives you the analytical power of BigQuery within the Cloud Logging console and provides you with a new user interface that’s optimized for analyzing your logs.

When you create a bucket and activate analytics on it, Cloud Logging makes the logs data available in both the new Log Analytics interface and BigQuery; you don’t have to route and manage a separate copy of the data in BigQuery.

You can still query and examine the data as usual in Cloud Logging with the Logging query language. The three prominent use cases for Cloud Logging in Log Analytics are troubleshooting, log analytics, and reporting.

With Log Analytics, you can analyze application performance, data access, and network access patterns. The Log Analytics pipeline maps logs to BigQuery tables and writes them to BigQuery. You can then use the same logs data directly from BigQuery to report on aggregated application and business data found in logs.

The logs data in your analytics-enabled buckets differs from logs routed to BigQuery via traditional export in the following ways:

  • Log data in BigQuery is managed by Cloud Logging.
  • BigQuery ingestion and storage costs are included in your Logging costs.
  • Data residency and lifecycle are managed by Cloud Logging.

You can query your logs on Log Analytics-enabled buckets directly in Cloud Logging via the new Log Analytics UI. The Log Analytics UI is optimized for viewing unstructured log data.
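Once a bucket is upgraded for Log Analytics and a BigQuery dataset is linked, the same data can be queried with standard SQL. A sketch via the BigQuery Python client; the project id and dataset name are assumptions, while _AllLogs is the view a linked dataset exposes (the exact path depends on how the link was created):

```python
from google.cloud import bigquery

client = bigquery.Client()

# Count the last day's log volume by severity (names are illustrative).
query = """
    SELECT severity, COUNT(*) AS entry_count
    FROM `my-project.logs_linked._AllLogs`
    WHERE timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY)
    GROUP BY severity
    ORDER BY entry_count DESC
"""

for row in client.query(query).result():
    print(row.severity, row.entry_count)
```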

Log analytics use cases

  • DevOps: A DevOps specialist needs to troubleshoot issues quickly to reduce Mean Time to Repair (MTTR). Log Analytics includes capabilities to count the top requests grouped by response type and severity, which helps engineers diagnose issues.
  • Security: A security analyst is interested in finding all the audit logs associated with a specific user over the past month. Log Analytics helps investigate security-related attacks with queries over large volumes of security data.
  • IT or Network Operations: An IT or network operations engineer is interested in identifying network issues for GKE instances that use VPC and firewall rules. Log Analytics here provides better network insight and management through advanced log aggregation capabilities.

