Data Analytics
Describes the concepts of lambda architecture and the actual deployment process, using the example of building a serverless business intelligence system with Amazon Kinesis, S3, Athena, OpenSearch Service, and QuickSight.
Data Pipeline for CDC data from MySQL DB to Amazon OpenSearch Service through Amazon Kinesis using AWS Database Migration Service (DMS).
Data Pipeline for CDC data from MySQL DB to Amazon S3 through Amazon MSK using Amazon MSK Connect (Debezium)
Data Pipeline for CDC data from MySQL DB to Amazon S3 through Amazon MSK Serverless using Amazon MSK Connect (Debezium)
Stream CDC into an Amazon S3 data lake in Apache Iceberg format with AWS Glue Streaming and DMS
Stream CDC into an Amazon S3 data lake in Apache Iceberg format with AWS Glue Streaming using Amazon MSK and MSK Connect (Debezium)
Stream CDC into an Amazon S3 data lake in Apache Iceberg format with AWS Glue Streaming using Amazon MSK Serverless and MSK Connect (Debezium)
Stream CDC into an Amazon S3 data lake in Apache Iceberg format with Amazon Data Firehose and DMS
Streaming data pipeline to continuously load data from an Amazon MSK or MSK Serverless cluster to Amazon S3 using Amazon Kinesis Data Firehose.
This is a collection of CDK projects that show how to load data from streaming services into Amazon Redshift.
Typical use cases of OpenSearch Serverless: search, time-series analysis, Kinesis Firehose integration, securing with VPC
(1) Search
(2) Time-series Log Analysis
(3) Streaming Ingestion through Kinesis Firehose
(4) Securing OpenSearch Serverless with VPC
This web analytics demo shows how to collect web logs with API Gateway and store them in S3 through Amazon Kinesis, and then how to analyze the web logs with Amazon Athena.
Streaming ETL job cases in AWS Glue that integrate Iceberg and create an in-place updatable data lake on Amazon S3
This is a collection of AWS CDK projects that show how to directly ingest streaming data from Amazon Managed Streaming for Apache Kafka (MSK) and MSK Serverless into an Apache Iceberg table in S3 with AWS Glue Streaming.
Streaming ETL job cases in AWS Glue that integrate Delta Lake and create an in-place updatable data lake on Amazon S3
Example of the CQRS (Command Query Responsibility Segregation) pattern using Amazon Athena
This repository provides CDK scripts and sample code that show how to count unique items (e.g., unique visitors) with HyperLogLog in Amazon MemoryDB for Redis. HyperLogLog (HLL) is a probabilistic data structure that estimates the cardinality of a set. As a probabilistic data structure, HyperLogLog trades perfect accuracy for efficient space utilization.
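To make the accuracy-for-space trade-off concrete, here is a minimal pure-Python sketch of the HyperLogLog idea. It is not the code from the repository, which relies on the Redis implementation (Redis exposes HyperLogLog through the PFADD and PFCOUNT commands). The class name, the SHA-1 hash, and the default precision p=12 are assumptions chosen for illustration only.

```python
import hashlib
import math


class HyperLogLog:
    """Illustrative HyperLogLog sketch (hypothetical class, not the Redis implementation)."""

    def __init__(self, p=12):
        # p index bits give m = 2**p registers; the bias constant below assumes m >= 128.
        self.p = p
        self.m = 1 << p
        self.registers = [0] * self.m
        self.alpha = 0.7213 / (1 + 1.079 / self.m)

    def add(self, item):
        # Hash the item to a 64-bit integer (SHA-1 is an arbitrary choice here).
        digest = hashlib.sha1(item.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        # The top p bits select a register.
        idx = h >> (64 - self.p)
        # The remaining bits determine the rank: position of the leftmost 1-bit.
        rest = h & ((1 << (64 - self.p)) - 1)
        rank = (64 - self.p) - rest.bit_length() + 1
        # Each register keeps only the largest rank it has seen.
        self.registers[idx] = max(self.registers[idx], rank)

    def count(self):
        # Raw estimate: normalized harmonic mean of 2**register over all registers.
        z = sum(2.0 ** -r for r in self.registers)
        estimate = self.alpha * self.m * self.m / z
        # Small-range correction: fall back to linear counting while registers are empty.
        zeros = self.registers.count(0)
        if estimate <= 2.5 * self.m and zeros:
            estimate = self.m * math.log(self.m / zeros)
        return int(round(estimate))
```

With p=12 the sketch keeps 4,096 small registers regardless of how many items are added, and the expected relative error is roughly 1.04 / sqrt(4096), or about 1.6%. Adding the same visitor ID repeatedly does not change the estimate, which is what makes the structure suitable for unique-visitor counting.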
This sample project is a real-time image analysis system. When an image is uploaded, the system tags the image using Amazon Rekognition and ingests the tags into Amazon Elasticsearch Service so that the image labels can be analyzed.