Transforming Log Analysis with ClickHouse: Cloud CIRCUS's Journey

Transforming Log Analysis with ClickHouse: Cloud CIRCUS's Journey

Introduction

Cloud CIRCUS, a Tokyo-based SaaS company, is on a mission to streamline business marketing across Japan, helping marketers embrace the mantra of "working smarter." One of their innovative tools is BlueMonkey, a CMS platform that enables anyone to build and manage high-quality websites easily. With over 2,000 client websites currently hosted on AWS and powered by CloudFront distributions for swift content delivery, Cloud CIRCUS generates millions of access logs daily. While these logs are crucial for tracking errors and understanding user behavior, querying them with Amazon Athena presented challenges such as performance delays and limited scalability. Consequently, the team needed an efficient solution for analyzing all logs without executing thousands of queries, or enduring high costs.

Infrastructure engineer Qulin Zhu shared insights during the January 2025 ClickHouse Tokyo meetup, highlighting how transitioning to ClickHouse resolved these issues. Initially intended for testing, ClickHouse's capabilities revealed that it was a powerful self-managed analytics pipeline, significantly enhancing speed, ease of automation, and cost-effectiveness. This transition aligned perfectly with the team's focus on learning, discipline, growth, and persistence to foster a more agile data management approach.

Switching from Athena to ClickHouse

Previously, Cloud CIRCUS utilized Athena to analyze CloudFront logs stored in S3. However, as client websites proliferated, performance declined due to millions of new records generated daily by CloudFront distributions. Zhu explained, "Analyzing logs in Athena became burdensome. The excessive volume of records from CloudFront led to time-consuming and costly queries." One significant hurdle was the requirement of a separate table for each CloudFront distribution in Athena. Running queries across all sites resulted in substantial load, sometimes taking hours to incorporate new logs.

The team investigated alternative approaches, contemplating the feasibility of analyzing CloudFront logs in a self-installed ClickHouse environment. As an open-source model noted for its speed and robust aggregation processing, ClickHouse allowed for the aggregation of all log data into a single table, which eliminated the granular query-based pricing model previously faced. Zhu stated, "We believed that addressing speed and cost issues was feasible, partly driven by a desire to explore ClickHouse's performance through the technical validation process." After establishing a proof of concept on EC2 and executing benchmarks, they confidently decided to migrate to ClickHouse.

Beginning with a simple configuration, the team created a single EC2 instance, deploying Amazon Linux 2023 with 4 vCPUs and 32GB of memory to install ClickHouse using official RPM packages. Zhu noted, "Using a small server was intentional, aimed at maintaining an efficient cost-saving approach during our technical validation phase." Hosting both the ClickHouse server and client on the same machine simplified management during early testing stages.


Optimizing the New Log Analysis System

With a solid foundational structure in place, the team focused on schema design, mirroring the table structure used by Athena for CloudFront log ingestion to ensure comparability. Zhu expressed the importance of this consistency: "This mirroring allows for results similar to those produced by Athena queries." Nevertheless, one key detail required adjustment: while Google Athena used integers for HTTP codes, ClickHouse needed a string type to accommodate a value of "000" that appeared in CloudFront logs. Utilizing the MergeTree table engine and creating daily partitions based on host_header and date, they optimized the configuration for aggregation queries.

Once the table setup was complete, the import of logs commenced. Historically, CloudFront logs were directed to S3, categorized into folders by domain. Leveraging ClickHouse's s3 table function allowed them to ingest logs from multiple files across specific date ranges simultaneously, avoiding manual file listings. To streamline import and clean data, they configured ClickHouse to skip header rows at the beginning and treated placeholder dashes as null values. This straightforward setup facilitated rapid initial data loading without issues.

However, scaling remained a pressing challenge. Zhu commented, "Collecting logs from all sites results in enormous data volumes, and inserting them in bulk significantly slows down the processing." To automate this aspect, they constructed a pipeline using Amazon EventBridge and ECS, scheduling batch tasks to scan S3 for listed domains and loading logs into ClickHouse by domain. "This brought us a fully automated solution!" Zhu boasted.

Enhancing Performance and Cost-Efficiency with ClickHouse

Following the implementation of the new pipeline, Cloud CIRCUS witnessed substantial improvements in query performance. During the meetup, Zhu demonstrated an aggregation query for daily access figures, noting it took only 0.043 seconds in ClickHouse, dramatically down from 16 seconds on Athena. Remarkably, ClickHouse efficiently completed a query that scanned 470 million rows within nine seconds compared to Athena's cumbersome method that involved executing over 2,000 queries, which could take several hours.

More than speed, Zhu emphasized ClickHouse's notable convenience over Athena, stating, "I believe ClickHouse offers far superior usability." Along with performance enhancements, cost-effectiveness emerged as a significant advantage. In Athena, Cloud CIRCUS faced data-scan-based charges, leading to unpredictable costs during extensive analyses. Conversely, ClickHouse operated on a consistently priced EC2 instance, amounting to about $300 monthly. While Athena seemed cheaper at around $100, Zhu explained, "In ClickHouse, there are fixed costs per query regardless of volume, whereas Athena charges vary based on query execution. Running additional aggregation queries across all sites easily overshot ClickHouse's costs."

Transitioning to a centralized pipeline for log aggregation into a single table, automating ingestion, and locally executing queries with ClickHouse resulted in enhanced speed, improved manageability, and greater scalability for Cloud CIRCUS. Zhu concluded, "Our migration from Athena to ClickHouse not only resulted in significant time and cost savings, but also eliminated query execution cost concerns, which is immensely beneficial." This transformation facilitates the growth of BlueMonkey, enabling the seamless support of thousands of client websites, while streamlining service operations for marketers.


Conclusion

Cloud CIRCUS's strategic transition to ClickHouse signifies a transformative leap in handling data analytics, shifting from a fragmented system with unpredictable costs to a robust, efficient, and scalable solution. Embracing technological innovations empowers the company to enhance the efficiency of their digital tools while addressing the growing demands of their diverse clientele. For businesses seeking similar gains in data processing and analysis, ClickHouse may provide the key to a future of sustainable growth and success in an increasingly data-driven landscape.

Questions and Answers

Q1: What is ClickHouse and why did Cloud CIRCUS switch to it?
A1: ClickHouse is a powerful open-source analytics engine known for speed and efficiency. Cloud CIRCUS switched to it to enhance log analysis performance, reduce costs, and automate data ingestion processes.

Q2: How did the migration from Athena to ClickHouse affect query performance?
A2: Migration to ClickHouse resulted in drastically improved query performance, with access aggregation queries running in just 0.043 seconds compared to 16 seconds in Athena.

Q3: What challenges did Cloud CIRCUS face with Amazon Athena?
A3: Cloud CIRCUS faced significant performance delays and costs due to the volume of records generated from their CloudFront distributions.

Q4: What steps did Cloud CIRCUS take to optimize their log import process?
A4: They automated the import process using Amazon EventBridge and ECS, scheduling batch tasks to load logs by domain into ClickHouse.

Q5: How does ClickHouse's pricing model compare to Athena's?
A5: ClickHouse has a fixed pricing model based on instance size, while Athena charges based on the amount of data scanned per query, leading to unpredictable costs.

tags:ClickHouse, log analysis, Cloud CIRCUS, data management, SaaS

Comments

Social

Popular posts from this blog

Revolutionizing Developer Productivity with Shopify's AI Tool, Roast

Master JSON Merging: Best Practices and Step-by-Step Guide

Unveiling Garbage Collection: The Unsung Hero of Memory Management