Cloud Storage

Deploying and managing Google Cloud Storage for object storage

Google Cloud Storage is a globally unified, scalable, and highly durable object storage service for storing and accessing any amount of data. It offers multiple storage classes, strong consistency, and built-in security and lifecycle management features.

Key Features

  • Global Accessibility: Access data from anywhere in the world

  • Scalability: Store and retrieve any amount of data at any time

  • Durability: Designed for 11 nines (99.999999999%) annual durability of stored objects

  • Storage Classes: Standard, Nearline, Coldline, and Archive storage tiers

  • Object Versioning: Maintain history and recover from accidental deletions

  • Object Lifecycle Management: Automatically transition objects to colder storage classes or delete them based on rules

  • Strong Consistency: Read-after-write and list consistency

  • Customer-Managed Encryption Keys (CMEK): Control encryption keys

  • Object Hold and Retention Policies: Enforce compliance requirements

  • VPC Service Controls: Add security perimeter around sensitive data
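To put the durability figure above in concrete terms, the following is a rough back-of-the-envelope sketch. It assumes the 99.999999999% figure can be read as an independent per-object annual survival probability, which is a simplification, and the object count is a hypothetical value chosen for illustration.

```python
from fractions import Fraction

# 99.999999999% annual durability, written as an exact fraction
# so that floating-point rounding does not distort the result.
ANNUAL_DURABILITY = Fraction(99_999_999_999, 100_000_000_000)


def expected_annual_losses(object_count: int) -> Fraction:
    """Expected number of objects lost per year (simplified model)."""
    return object_count * (1 - ANNUAL_DURABILITY)


def years_per_expected_loss(object_count: int) -> Fraction:
    """Average number of years between single-object losses."""
    return 1 / expected_annual_losses(object_count)


if __name__ == "__main__":
    stored_objects = 10_000_000  # hypothetical object count
    print(float(expected_annual_losses(stored_objects)))
    print(int(years_per_expected_loss(stored_objects)))
```

Under this simplified model, storing ten million objects corresponds to an expected loss of roughly one object every ten thousand years.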

Cloud Storage Classes

| Storage Class | Purpose | Minimum Storage Duration | Typical Use Cases |
|---|---|---|---|
| Standard | High-performance, frequent access | None | Website content, active data, mobile apps |
| Nearline | Low-frequency access | 30 days | Data accessed less than once a month |
| Coldline | Very low-frequency access | 90 days | Data accessed less than once a quarter |
| Archive | Data archiving, online backup | 365 days | Long-term archive, disaster recovery |
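The minimum storage durations in the table can be turned into a small selection helper. The sketch below is illustrative only: the rule of picking the coldest class whose minimum duration fits the expected access interval is an assumed heuristic, not an official recommendation, and `recommend_class` and `billable_days` are hypothetical helper names. It also reflects that an object deleted before its minimum storage duration is still billed as if it had been stored for the full duration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageClass:
    name: str
    min_storage_days: int  # 0 means no minimum storage duration


# Minimum durations taken from the storage class table.
STANDARD = StorageClass("STANDARD", 0)
NEARLINE = StorageClass("NEARLINE", 30)
COLDLINE = StorageClass("COLDLINE", 90)
ARCHIVE = StorageClass("ARCHIVE", 365)


def recommend_class(days_between_accesses: float) -> StorageClass:
    """Pick the coldest class whose minimum duration fits the access interval.

    Assumed heuristic: data read less often than a class's minimum
    storage duration is a candidate for that class.
    """
    for storage_class in (ARCHIVE, COLDLINE, NEARLINE):
        if days_between_accesses >= storage_class.min_storage_days:
            return storage_class
    return STANDARD


def billable_days(storage_class: StorageClass, days_stored: int) -> int:
    """Days billed for an object, including any early-deletion shortfall."""
    return max(days_stored, storage_class.min_storage_days)
```

For example, `recommend_class(45)` returns the Nearline entry, and `billable_days(NEARLINE, 10)` returns 30 because the object was removed before the 30-day minimum.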

Deploying Cloud Storage with Terraform

Basic Bucket Creation

Advanced Configuration with Lifecycle Policies

Static Website Hosting Configuration

Managing Cloud Storage with gsutil

Basic Bucket Commands

Object Operations

Access Control

Lifecycle Management
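As a minimal sketch of what a lifecycle configuration might look like, the Python snippet below builds a JSON document in the `rule` / `action` / `condition` shape used by Cloud Storage lifecycle files. The age thresholds and the final deletion age are assumptions chosen to mirror the storage class minimum durations, not values prescribed by this guide, and the helper function names are hypothetical.

```python
import json


def set_storage_class_rule(storage_class: str, age_days: int) -> dict:
    """Rule that moves objects to another class once they reach a given age."""
    return {
        "action": {"type": "SetStorageClass", "storageClass": storage_class},
        "condition": {"age": age_days},
    }


def delete_rule(age_days: int) -> dict:
    """Rule that deletes objects once they reach a given age."""
    return {"action": {"type": "Delete"}, "condition": {"age": age_days}}


def build_lifecycle_config() -> dict:
    """Assemble an example configuration; all thresholds are illustrative."""
    return {
        "rule": [
            set_storage_class_rule("NEARLINE", 30),
            set_storage_class_rule("COLDLINE", 90),
            set_storage_class_rule("ARCHIVE", 365),
            delete_rule(2555),  # hypothetical retention of about seven years
        ]
    }


if __name__ == "__main__":
    # Print the configuration so it can be saved to a file.
    print(json.dumps(build_lifecycle_config(), indent=2))
```

If the output is saved as `lifecycle.json`, it could then be applied with `gsutil lifecycle set lifecycle.json gs://BUCKET_NAME`, where `BUCKET_NAME` is a placeholder for an existing bucket.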

Real-World Example: Multi-Region Data Lake Architecture

This example outlines a data lake architecture built on Cloud Storage, organized into four zones.

Architecture Overview

  1. Landing Zone: Raw data ingestion bucket

  2. Processing Zone: Data transformation and staging

  3. Curated Zone: Processed, high-quality data

  4. Archive Zone: Long-term, cold storage
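The four zones above can be modeled as an ordered pipeline. The sketch below is one possible way to express that in Python; the bucket names, the storage class assigned to each zone, and the helpers `next_zone` and `promoted_uri` are all hypothetical and only illustrate how an object might move from one zone to the next.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Zone:
    name: str
    bucket: str         # hypothetical bucket name
    storage_class: str  # assumed storage class for the zone


# Ordered from ingestion to long-term storage; all values are illustrative.
ZONES = [
    Zone("landing", "example-lake-landing", "STANDARD"),
    Zone("processing", "example-lake-processing", "STANDARD"),
    Zone("curated", "example-lake-curated", "STANDARD"),
    Zone("archive", "example-lake-archive", "ARCHIVE"),
]


def next_zone(current: str) -> Optional[Zone]:
    """Return the zone that follows `current`, or None for the last zone."""
    names = [zone.name for zone in ZONES]
    index = names.index(current)  # raises ValueError for unknown zone names
    return ZONES[index + 1] if index + 1 < len(ZONES) else None


def promoted_uri(current: str, object_path: str) -> Optional[str]:
    """Build the gs:// URI an object would have after moving one zone ahead."""
    target = next_zone(current)
    return f"gs://{target.bucket}/{object_path}" if target else None
```

For instance, `promoted_uri("landing", "sales/2024/01/data.parquet")` yields a URI in the processing bucket, while calling it for the archive zone returns `None` because there is no later zone.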

Terraform Implementation

Data Lifecycle Automation Script

Best Practices

  1. Bucket Naming and Organization

    • Choose globally unique, DNS-compliant names

    • Use consistent naming conventions

    • Organize objects with clear prefix hierarchy

    • Consider regional requirements for data storage

  2. Security

    • Enable uniform bucket-level access

    • Use VPC Service Controls for sensitive data

    • Apply appropriate IAM roles with least privilege

    • Enforce public access prevention

    • Use CMEK for regulated data

    • Enable object holds for compliance

  3. Cost Optimization

    • Choose appropriate storage classes for data access patterns

    • Implement lifecycle policies for automatic transitions

    • Combine many small files into larger composite objects to reduce per-operation charges

    • Monitor usage with Cloud Monitoring

    • Consider requester pays for shared datasets

  4. Performance

    • Store frequently accessed data in regions close to users

    • Use parallel composite uploads for large files

    • Avoid small, frequent operations

    • Use signed URLs for temporary access

    • Implement connection pooling in applications

  5. Data Management

    • Enable object versioning for critical data

    • Configure access logs for audit trails

    • Use object metadata for classification

    • Set up notifications for bucket events

    • Implement retention policies for compliance
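To illustrate the bucket naming practice above, here is a small validator sketch. It covers only a subset of the documented bucket naming rules (allowed characters, length limits, no IP-address form, no `goog` prefix, and no `google` substring), so passing it does not guarantee that a name is accepted or globally unique. The function name `bucket_name_problems` is hypothetical.

```python
import re
from typing import List

# Lowercase letters, digits, dashes, underscores, and dots;
# the name must start and end with a letter or digit.
_ALLOWED = re.compile(r"[a-z0-9][a-z0-9._-]*[a-z0-9]")
_IP_LIKE = re.compile(r"\d{1,3}(\.\d{1,3}){3}")


def bucket_name_problems(name: str) -> List[str]:
    """Return a list of rule violations; an empty list means none were found."""
    problems = []
    if len(name) < 3:
        problems.append("must be at least 3 characters")
    if "." in name:
        # Dotted names may be longer, but each component is still limited.
        if len(name) > 222 or any(len(part) > 63 for part in name.split(".")):
            problems.append("dotted names: max 222 characters, 63 per component")
    elif len(name) > 63:
        problems.append("must be at most 63 characters")
    if not _ALLOWED.fullmatch(name):
        problems.append("invalid characters, or bad first or last character")
    if _IP_LIKE.fullmatch(name):
        problems.append("must not look like an IP address")
    if name.startswith("goog") or "google" in name:
        problems.append("must not start with 'goog' or contain 'google'")
    return problems
```

A name such as `acme-prod-logs-us` produces an empty list, while `Acme_Logs` or `192.168.1.1` each report at least one problem.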

Common Issues and Troubleshooting

Access Denied Errors

  • Verify IAM permissions and roles

  • Check for VPC Service Controls blocking access

  • Ensure service accounts have proper permissions

  • Validate CMEK access for encrypted buckets

  • Check organization policies for restrictions

Performance Issues

  • Review network configuration, including Private Google Access for resources without external IP addresses

  • Ensure proper region selection for proximity to users

  • Monitor request rates and throttling

  • Check object naming patterns for hotspots

  • Optimize upload/download processes
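One common cause of naming hotspots is sequential object names, such as timestamp-prefixed keys, which concentrate requests on a narrow key range. A possible mitigation is to prepend a short hash to each name so that objects spread more evenly. The sketch below assumes SHA-256 and a six-character prefix; both choices, and the function name `hashed_object_name`, are illustrative.

```python
import hashlib


def hashed_object_name(name: str, prefix_length: int = 6) -> str:
    """Prefix an object name with a short, deterministic hash of itself.

    The same input always yields the same output, so readers can
    recompute the full object name from the original name.
    """
    digest = hashlib.sha256(name.encode("utf-8")).hexdigest()
    return f"{digest[:prefix_length]}_{name}"


if __name__ == "__main__":
    # Hypothetical sequential log object names.
    for original in ("2024-05-10-12-00-00.log", "2024-05-10-12-00-01.log"):
        print(hashed_object_name(original))
```

The trade-off is that hashed prefixes make listing objects in chronological order harder, so this approach suits workloads where request rate matters more than ordered listing.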

Cost Management

  • Review storage distribution across classes

  • Check lifecycle policies for effectiveness

  • Look for large noncurrent object versions that are no longer needed

  • Watch for unexpected egress charges

  • Verify requester-pays configuration

Data Management

  • Validate versioning is working as expected

  • Check retention policy effectiveness

  • Monitor object holds and legal holds

  • Verify notification configurations

  • Ensure backups are properly configured
