Data Analysis

Build Business Intelligence System from Scratch on AWS

Describes the concepts of lambda architecture and the actual deployment process, using the example of building a serverless business intelligence system with Amazon Kinesis, S3, Athena, OpenSearch Service, and QuickSight.

GitHub - aws-samples/aws-analytics-immersion-day: AWS Analytics Immersion Day - Build Business Intelligence System from Scratch
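As a rough illustration of the lambda architecture idea mentioned above, the sketch below merges a precomputed batch view with a small real-time view at query time. It is a toy in-memory model, not code from the workshop: the event shape and the function names (`build_view`, `serve_page_views`) are assumptions made for this example.

```python
from collections import Counter
from typing import Iterable


def build_view(events: Iterable[dict]) -> Counter:
    """Count page views per page; used for both the batch and speed layers."""
    return Counter(event["page"] for event in events)


def serve_page_views(batch_view: Counter, speed_view: Counter, page: str) -> int:
    """Serving layer: combine the batch result with not-yet-batched events."""
    return batch_view[page] + speed_view[page]


# Hypothetical data: events already processed by the batch layer ...
historical_events = [{"page": "/home"}, {"page": "/home"}, {"page": "/cart"}]
# ... and events that arrived after the last batch run (speed layer).
recent_events = [{"page": "/home"}]

batch_view = build_view(historical_events)
speed_view = build_view(recent_events)
home_views = serve_page_views(batch_view, speed_view, "/home")
```

In the workshop's setting, the batch side would roughly correspond to querying data at rest and the speed side to the streaming path; the sketch only shows how the two results are combined.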

Zero-ETL integrations with Amazon Redshift

(1) Aurora MySQL to Amazon Redshift

An Amazon Aurora MySQL zero-ETL integration with Amazon Redshift enables near real-time analytics and machine learning (ML) using Amazon Redshift on petabytes of transactional data from Aurora.

aws-kr-startup-samples/analytics/zero-etl-integrations/aurora-mysql-to-redshift at main · aws-samples/aws-kr-startup-samples · GitHub

(2) Aurora PostgreSQL to Amazon Redshift

An Amazon Aurora PostgreSQL zero-ETL integration with Amazon Redshift enables near real-time analytics and machine learning (ML) using Amazon Redshift on petabytes of transactional data from Aurora.

aws-kr-startup-samples/analytics/zero-etl-integrations/aurora-postgresql-to-redshift at main · aws-samples/aws-kr-startup-samples · GitHub

(3) Amazon RDS MySQL to Amazon Redshift

An Amazon RDS MySQL zero-ETL integration with Amazon Redshift enables near real-time analytics and machine learning (ML) using Amazon Redshift on petabytes of transactional data from RDS.

aws-kr-startup-samples/analytics/zero-etl-integrations/rds-mysql-to-redshift at main · aws-samples/aws-kr-startup-samples · GitHub

CDC (Change Data Capture) Data Pipeline

Data pipeline for CDC data from a MySQL DB to Amazon OpenSearch Service through Amazon Kinesis using AWS Database Migration Service (DMS).

GitHub - aws-samples/aws-dms-cdc-data-pipeline: Data Pipeline for CDC data from MySQL DB to Amazon OpenSearch Service through Amazon Kinesis using AWS Database Migration Service (DMS).

CDC (Change Data Capture) Data Pipeline using Amazon MSK and MSK Connect with Debezium

Data Pipeline for CDC data from MySQL DB to Amazon S3 through Amazon MSK using Amazon MSK Connect (Debezium)

GitHub - aws-samples/aws-msk-cdc-data-pipeline-with-debezium: Data Pipeline for CDC data from MySQL DB to Amazon S3 through Amazon MSK using Amazon MSK Connect (Debezium).

CDC (Change Data Capture) Data Pipeline using Amazon MSK Serverless and MSK Connect with Debezium

Data Pipeline for CDC data from MySQL DB to Amazon S3 through Amazon MSK Serverless using Amazon MSK Connect (Debezium)

GitHub - aws-samples/aws-msk-serverless-cdc-data-pipeline-with-debezium: Data Pipeline for CDC data from MySQL DB to Amazon S3 through Amazon MSK Serverless using Amazon MSK Connect (Debezium).

Transactional Data Lake supporting CDC-based Upsert operation

Stream CDC into an Amazon S3 data lake in Apache Iceberg format with AWS Glue Streaming and DMS

GitHub - aws-samples/transactional-datalake-using-apache-iceberg-on-aws-glue: Stream CDC into an Amazon S3 data lake in Apache Iceberg format with AWS Glue Streaming and DMS
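To make the idea of a CDC-based upsert concrete, here is a minimal in-memory sketch of replaying change events against a table keyed by primary key. It is not the repository's Glue job: the event layout (`op`, `key`, `data`) and the function name `apply_cdc_events` are hypothetical, chosen only to show the insert/update/delete semantics that a transactional table format has to support.

```python
from __future__ import annotations

from typing import Any


def apply_cdc_events(
    table: dict[int, dict[str, Any]], events: list[dict[str, Any]]
) -> dict[int, dict[str, Any]]:
    """Apply ordered change events to `table` in place and return it."""
    for event in events:
        key, op = event["key"], event["op"]
        if op in ("insert", "update"):
            # Upsert: create the row if missing, otherwise merge changed columns.
            table[key] = {**table.get(key, {}), **event["data"]}
        elif op == "delete":
            table.pop(key, None)  # deleting a missing row is a no-op
        else:
            raise ValueError(f"unknown CDC operation: {op!r}")
    return table


# Hypothetical change stream, in commit order.
cdc_events = [
    {"op": "insert", "key": 1, "data": {"name": "widget", "qty": 1}},
    {"op": "update", "key": 1, "data": {"qty": 2}},
    {"op": "insert", "key": 2, "data": {"name": "gadget", "qty": 5}},
    {"op": "delete", "key": 2},
]
products = apply_cdc_events({}, cdc_events)
```

The events must be applied in commit order per key; replaying them out of order could resurrect a deleted row or overwrite a newer value.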

Transactional Data Lake using Amazon MSK and Apache Iceberg on AWS Glue

Stream CDC into an Amazon S3 data lake in Apache Iceberg format with AWS Glue Streaming using Amazon MSK and MSK Connect (Debezium)

GitHub - aws-samples/transactional-datalake-using-amazon-msk-and-apache-iceberg-on-aws-glue: Stream CDC into an Amazon S3 data lake in Apache Iceberg format with AWS Glue Streaming and MSK Connect (Debezium)

Transactional Data Lake using Amazon MSK Serverless and Apache Iceberg on AWS Glue

Stream CDC into an Amazon S3 data lake in Apache Iceberg format with AWS Glue Streaming using Amazon MSK Serverless and MSK Connect (Debezium)

GitHub - aws-samples/transactional-datalake-using-amazon-msk-serverless-and-apache-iceberg-on-aws-glue: Stream CDC into an Amazon S3 data lake in Apache Iceberg format with AWS Glue Streaming and MSK Connect (Debezium)

Transactional Data Lake using Amazon Data Firehose and Apache Iceberg

Stream CDC into an Amazon S3 data lake in Apache Iceberg format with Amazon Data Firehose and DMS

GitHub - aws-samples/transactional-datalake-using-amazon-datafirehose-iceberg: Stream CDC into an Amazon S3 data lake in Apache Iceberg table format with Amazon Data Firehose and DMS

Streaming Data to Amazon S3 Tables using Amazon Data Firehose

This is a CDK Python project to build a fully managed data lake using Amazon Data Firehose and S3 Tables to store and analyze real-time streaming data.

aws-kr-startup-samples/analytics/streaming-data-to-s3tables-with-datafirehose at main · aws-samples/aws-kr-startup-samples · GitHub

Streaming Data Pipeline from Apache Kafka to Amazon S3 using Amazon Kinesis Data Firehose

Streaming data pipeline to continuously load data from an Amazon MSK or MSK Serverless cluster to Amazon S3 using Amazon Kinesis Data Firehose.

GitHub - aws-samples/streaming-data-pipeline-from-kafka-to-s3-using-aws-kinesis-firehose: Streaming data pipeline to continuously load data from an Amazon MSK or MSK Serverless cluster to Amazon S3 using Amazon Kinesis Data Firehose.

Redshift Streaming Ingestion from Kinesis Data Streams, MSK, or MSK Serverless (3 examples)

This is a collection of CDK projects that show how to load data from streaming services into Amazon Redshift.

GitHub - aws-samples/redshift-streaming-ingestion-patterns: This is a collection of CDK projects to show how to load data from streaming services into Amazon Redshift.

OpenSearch Serverless 4 Common Usage Patterns

Typical use cases of OpenSearch Serverless: search, time-series analysis, Kinesis Firehose integration, and securing access with a VPC

(1) Search
(2) Time-series Log Analysis
(3) Streaming Ingestion through Kinesis Firehose
(4) Securing OpenSearch Serverless with VPC

GitHub - aws-samples/opensearch-serverless-common-usage-patterns: Typical use cases of OpenSearch Serverless: search, time-series, Kinesis Firehose integration, securing with VPC

Web Analytics System on AWS (a simplified version of Google Analytics)

Web Log Analytics System with Parquet data format

This web analytics demo shows how to collect web logs with API Gateway and store them in S3 through Amazon Kinesis. It then shows how to analyze the web logs with Amazon Athena.

GitHub - aws-samples/web-analytics-on-aws: This web analytics demo shows how to collect web logs with API Gateway and store them into S3 through Amazon Kinesis. Then this project shows how to analyze web logs with Amazon Athena.

Web Log Analytics System with Apache Iceberg Table

This repository provides CDK scripts and sample code showing how to implement a simple web analytics system.

Web Log Analytics System using API Gateway integrated with Data Firehose with Apache Iceberg table

This repository provides CDK scripts and sample code showing how to implement a simple web analytics system.

AWS Glue Streaming ETL example with Apache Iceberg

Streaming ETL job examples in AWS Glue that integrate Apache Iceberg and create an in-place updatable data lake on Amazon S3

GitHub - aws-samples/aws-glue-streaming-etl-with-apache-iceberg: Streaming ETL job cases in AWS Glue to integrate Iceberg and creating an in-place updatable data lake on Amazon S3

AWS Glue Streaming Ingestion from Kafka to Apache Iceberg table in S3

This is a collection of AWS CDK projects that show how to directly ingest streaming data from Amazon Managed Streaming for Apache Kafka (MSK) and MSK Serverless into an Apache Iceberg table in S3 with AWS Glue Streaming.

GitHub - aws-samples/aws-glue-streaming-ingestion-from-kafka-to-apache-iceberg: This is a collection of AWS CDK projects to show how to directly ingest streaming data from Amazon Managed Streaming for Apache Kafka (MSK) and MSK Serverless into Apache Iceberg table in S3 with AWS Glue Streaming.

AWS Glue Streaming ETL example with Delta Lake

Streaming ETL job examples in AWS Glue that integrate Delta Lake and create an in-place updatable data lake on Amazon S3

GitHub - aws-samples/aws-glue-streaming-etl-with-delta-lake: Streaming ETL job cases in AWS Glue to integrate Delta Lake and creating an in-place updatable data lake on Amazon S3

Building CQRS Pattern using Amazon Athena

Example of the CQRS (Command and Query Responsibility Segregation) pattern using Amazon Athena

GitHub - aws-samples/aws-athena-cqrs-pattern: Example of CQRS (Command and Query Responsibility Segregation) Pattern using Amazon Athena
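The core idea of CQRS, keeping the write path (commands) separate from the read path (queries), can be sketched in a few lines. The example below is a generic in-memory illustration, not the repository's Athena-based design; the class names `OrderCommands` and `SalesQueries` and the order fields are assumptions made for this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class OrderCommands:
    """Command side: validates writes and appends them to an event log."""

    events: list = field(default_factory=list)

    def place_order(self, order_id: str, customer: str, amount: float) -> dict:
        if amount <= 0:
            raise ValueError("amount must be positive")
        event = {"order_id": order_id, "customer": customer, "amount": amount}
        self.events.append(event)
        return event


@dataclass
class SalesQueries:
    """Query side: a denormalized read model built from the events."""

    total_by_customer: dict = field(default_factory=dict)

    def project(self, event: dict) -> None:
        # Fold one write-side event into the read-optimized view.
        customer = event["customer"]
        self.total_by_customer[customer] = (
            self.total_by_customer.get(customer, 0.0) + event["amount"]
        )

    def total_for(self, customer: str) -> float:
        return self.total_by_customer.get(customer, 0.0)


commands = OrderCommands()
queries = SalesQueries()
for order in [("o1", "alice", 30.0), ("o2", "alice", 12.5), ("o3", "bob", 5.0)]:
    queries.project(commands.place_order(*order))
```

Because the read model is derived from the event log, it can be rebuilt or reshaped for new queries without touching the command side.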

Streaming Count Sketches with HyperLogLog in Amazon MemoryDB for Redis

This repository provides CDK scripts and sample code showing how to count unique items (e.g., unique visitors) with HyperLogLog in Amazon MemoryDB for Redis. HyperLogLog (HLL) is a probabilistic data structure that estimates the cardinality of a set, trading perfect accuracy for efficient use of space.

GitHub - aws-samples/streaming-count-sketches-with-hyperloglog-in-amazon-memorydb
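To show how HyperLogLog trades accuracy for space, here is a small pure-Python sketch of the estimator. It is an illustration of the general technique, not the implementation used by Redis or MemoryDB: the class name, the choice of SHA-1 as the hash, and the register count (2^12) are assumptions made for this example.

```python
import hashlib
import math


class HyperLogLog:
    """Minimal HyperLogLog with 2**p registers (illustrative only)."""

    def __init__(self, p: int = 12) -> None:
        self.p = p
        self.m = 1 << p
        self.registers = [0] * self.m
        # Standard bias-correction constant, valid for m >= 128.
        self.alpha = 0.7213 / (1 + 1.079 / self.m)

    def add(self, item: str) -> None:
        # Take 64 bits of a SHA-1 digest as the hash value.
        digest = hashlib.sha1(item.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        index = h >> (64 - self.p)  # first p bits select a register
        rest = h & ((1 << (64 - self.p)) - 1)  # remaining 64 - p bits
        # Rank: 1-based position of the leftmost 1-bit in the remaining bits.
        rank = (64 - self.p) - rest.bit_length() + 1
        self.registers[index] = max(self.registers[index], rank)

    def count(self) -> int:
        harmonic_sum = sum(2.0 ** -r for r in self.registers)
        estimate = self.alpha * self.m * self.m / harmonic_sum
        zeros = self.registers.count(0)
        if estimate <= 2.5 * self.m and zeros:
            # Small-range correction: fall back to linear counting.
            estimate = self.m * math.log(self.m / zeros)
        return round(estimate)


hll = HyperLogLog()
for visit in range(2):  # every visitor shows up twice
    for visitor_id in range(50_000):
        hll.add(f"visitor-{visitor_id}")
unique_visitors = hll.count()
```

With 4,096 registers the expected relative error is roughly 1.04 / sqrt(4096), or about 1.6%, while the sketch stores only one small integer per register regardless of how many items are added. In Redis-compatible stores, the `PFADD` and `PFCOUNT` commands expose the same idea.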

Real-time Image Analysis System

This sample project is a real-time image analysis system. When an image is uploaded, the system labels it using Amazon Rekognition and ingests the labels into Amazon Elasticsearch Service (now Amazon OpenSearch Service) for analysis.

GitHub - aws-samples/aws-realtime-image-analysis: This sample project is a real-time image analysis system. As an image is uploaded, the real-time image analysis system annotates tags on the image using Amazon Rekognition and ingests image tags into Amazon Elasticsearch for analyzing image labels.
