Generative AI

Hosting DeepSeek models on Amazon SageMaker

DeepSeek-R1 Distill Llama 8B using SGLang

This is a CDK Python project to host DeepSeek-R1-Distill-Llama-8B on an Amazon SageMaker Real-time Inference Endpoint. In this example, we'll demonstrate how to adapt the SGLang framework to run on SageMaker AI endpoints. SGLang is a serving framework for large language models that provides state-of-the-art performance, including a fast backend runtime for efficient serving with RadixAttention, extensive model support, and an active open-source community. For more information, refer to https://docs.sglang.ai/index.html and https://github.com/sgl-project/sglang.
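
For orientation, the deployment in CDK projects like this one comes down to three SageMaker resources: a model pointing at a serving container and model artifacts, an endpoint configuration selecting the instance type, and the real-time endpoint itself. Below is a minimal sketch of that shape; the container image URI, model data location, role ARN, and instance type are placeholders, not the values used by the actual sample.

```python
from aws_cdk import Stack, aws_sagemaker as sagemaker
from constructs import Construct


class LlmEndpointStack(Stack):
    """Minimal sketch: SageMaker model -> endpoint config -> real-time endpoint."""

    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)

        # Placeholder values -- the actual sample builds its own serving image
        # (e.g. an SGLang container) and stages its own model artifacts.
        image_uri = "123456789012.dkr.ecr.us-east-1.amazonaws.com/llm-serving:latest"
        model_data_url = "s3://my-bucket/deepseek-r1-distill-llama-8b/model.tar.gz"
        role_arn = "arn:aws:iam::123456789012:role/SageMakerExecutionRole"

        model = sagemaker.CfnModel(
            self, "Model",
            execution_role_arn=role_arn,
            primary_container=sagemaker.CfnModel.ContainerDefinitionProperty(
                image=image_uri,
                model_data_url=model_data_url,
            ),
        )

        endpoint_config = sagemaker.CfnEndpointConfig(
            self, "EndpointConfig",
            production_variants=[
                sagemaker.CfnEndpointConfig.ProductionVariantProperty(
                    model_name=model.attr_model_name,
                    variant_name="AllTraffic",
                    initial_instance_count=1,
                    instance_type="ml.g5.2xlarge",
                    initial_variant_weight=1.0,
                )
            ],
        )

        sagemaker.CfnEndpoint(
            self, "Endpoint",
            endpoint_config_name=endpoint_config.attr_endpoint_config_name,
        )
```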

DeepSeek-R1 Distill Llama 8B

This is a CDK Python project to deploy DeepSeek-R1-Distill-Llama-8B to a SageMaker real-time endpoint with the scale-down-to-zero feature. This project demonstrates how you can scale your SageMaker endpoint down to zero instances during idle periods, eliminating the previous requirement of maintaining at least one running instance.
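
As a rough sketch of how scale-to-zero is wired up (assumptions, not the project's exact code): the endpoint serves the model through an inference component, and Application Auto Scaling is allowed to drive the component's copy count, and with it the backing instances, down to zero. The resource ID, capacity limits, and metric name below are illustrative.

```python
import boto3

autoscaling = boto3.client("application-autoscaling")

# Placeholder: the inference component that the stack attaches to the endpoint.
resource_id = "inference-component/deepseek-r1-distill-llama-8b-ic"

# Allow the component to scale all the way down to zero copies when idle.
autoscaling.register_scalable_target(
    ServiceNamespace="sagemaker",
    ResourceId=resource_id,
    ScalableDimension="sagemaker:inference-component:DesiredCopyCount",
    MinCapacity=0,
    MaxCapacity=2,
)

# Track concurrent requests per copy; when traffic stops, the copy count
# (and the instances behind it) can fall back to zero after the cooldown.
# The predefined metric name is our assumption -- verify it against current docs.
autoscaling.put_scaling_policy(
    PolicyName="scale-to-zero-when-idle",
    ServiceNamespace="sagemaker",
    ResourceId=resource_id,
    ScalableDimension="sagemaker:inference-component:DesiredCopyCount",
    PolicyType="TargetTrackingScaling",
    TargetTrackingScalingPolicyConfiguration={
        "TargetValue": 1.0,
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "SageMakerInferenceComponentConcurrentRequestsPerCopyHighResolution"
        },
        "ScaleInCooldown": 300,
        "ScaleOutCooldown": 60,
    },
)
```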

DeepSeek-R1 Distill Qwen 14B

This is a CDK Python project to host DeepSeek-R1-Distill-Qwen-14B on an Amazon SageMaker Real-time Inference Endpoint. DeepSeek-R1 is one of DeepSeek's first generation of reasoning models, along with DeepSeek-R1-Zero. DeepSeek-R1-Distill-Qwen-14B is fine-tuned from an open-source base model, using samples generated by DeepSeek-R1.

DeepSeek-R1 Distill Qwen 32B

This is a CDK Python project to host DeepSeek-R1-Distill-Qwen-32B on an Amazon SageMaker Real-time Inference Endpoint using SageMaker JumpStart.

DeepSeek-V2 Lite Chat

This is a CDK Python project to host the DeepSeek-V2-Lite Chat model on an Amazon SageMaker Real-time Inference Endpoint using the SageMaker DJL Serving DLC. For more information, refer to DeepSeek: A Strong, Economical, and Efficient Mixture-of-Experts Language Model.

Janus-Pro 7B

This is a CDK Python project to host deepseek-ai/Janus-Pro-7B on an Amazon SageMaker Real-time Inference Endpoint. Janus-Pro is a unified understanding and generation MLLM, which decouples visual encoding for multimodal understanding and generation.

Hosting LG AI EXAONE-Deep models on Amazon SageMaker

EXAONE-Deep 7.8B using SGLang

This is a CDK Python project to host LG AI EXAONE Deep 7.8B on an Amazon SageMaker Real-time Inference Endpoint. EXAONE Deep is a family of models ranging from 2.4B to 32B parameters, developed and released by LG AI Research, which exhibits superior capabilities in various reasoning tasks including math and coding benchmarks. In this example, we'll demonstrate how to adapt the SGLang framework to run on SageMaker AI endpoints. SGLang is a serving framework for large language models that provides state-of-the-art performance, including a fast backend runtime for efficient serving with RadixAttention, extensive model support, and an active open-source community. For more information, refer to https://docs.sglang.ai/index.html and https://github.com/sgl-project/sglang.
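
Once one of these SGLang-backed endpoints is deployed, it is invoked through the regular SageMaker runtime API. The sketch below assumes the container accepts an OpenAI-style chat payload; the endpoint name and payload fields are placeholders rather than the sample's exact contract.

```python
import json

import boto3

runtime = boto3.client("sagemaker-runtime")

# Placeholder endpoint name created by one of the CDK stacks above.
payload = {
    "messages": [
        {"role": "user", "content": "What is the derivative of x^3?"}
    ],
    "max_tokens": 512,
    "temperature": 0.6,
}

response = runtime.invoke_endpoint(
    EndpointName="exaone-deep-7-8b-sglang",  # assumed name
    ContentType="application/json",
    Body=json.dumps(payload),                # assumed OpenAI-style schema
)
print(json.loads(response["Body"].read()))
```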

LLM Observability Tools

Langfuse on AWS

This is an AWS CDK Python project for deploying the Langfuse application using Amazon Elastic Container Registry (ECR) and Amazon Elastic Container Service (ECS). Langfuse is an open-source LLM engineering platform that helps teams collaboratively debug, analyze, and iterate on their LLM applications. Two versions of the deployment are provided (a short usage sketch follows the list below):

(1) Langfuse v3

(2) Langfuse v2
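
After the stack is deployed, applications point the Langfuse SDK at the ECS-hosted instance and record traces around their LLM calls. A minimal sketch, assuming the v2-style Python SDK and placeholder keys and host:

```python
from langfuse import Langfuse

# Placeholder credentials and the URL of the Langfuse deployment on ECS.
langfuse = Langfuse(
    public_key="pk-lf-example",
    secret_key="sk-lf-example",
    host="https://langfuse.example.com",
)

# Record one trace with a nested generation (the real LLM call is omitted here).
trace = langfuse.trace(name="qa-request", input={"question": "What is RAG?"})
generation = trace.generation(name="llm-call", model="example-model",
                              input="What is RAG?")
generation.end(output="RAG retrieves relevant context before generation.")
trace.update(output={"answer": "RAG retrieves relevant context before generation."})

langfuse.flush()  # make sure buffered events are sent before the process exits
```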

RAG (Retrieval Augmented Generation)

With Knowledge Bases for Amazon Bedrock

This project is a Question Answering application with Large Language Models (LLMs) and Knowledge Bases for Amazon Bedrock. An application using the RAG (Retrieval Augmented Generation) approach retrieves information most relevant to the user’s request from the enterprise knowledge base or content, bundles it as context along with the user’s request as a prompt, and then sends it to the LLM to get a GenAI response. In this project, Amazon OpenSearch Serverless is used for a Knowledge Base for Amazon Bedrock.
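
With a Knowledge Base, the retrieve-and-augment steps described above are handled behind a single API call. A minimal boto3 sketch; the knowledge base ID and model ARN are placeholders:

```python
import boto3

client = boto3.client("bedrock-agent-runtime")

# Placeholders: use the knowledge base and model configured by the project.
response = client.retrieve_and_generate(
    input={"text": "What is our refund policy?"},
    retrieveAndGenerateConfiguration={
        "type": "KNOWLEDGE_BASE",
        "knowledgeBaseConfiguration": {
            "knowledgeBaseId": "EXAMPLEKBID",
            "modelArn": "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-haiku-20240307-v1:0",
        },
    },
)

# Generated answer; the response also carries citations to the retrieved chunks.
print(response["output"]["text"])
```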

With Amazon Aurora Postgresql used for a Knowledge Base for Amazon Bedrock

This project is a Question Answering application with Large Language Models (LLMs) and Amazon Aurora Postgresql using pgvector. An application using the RAG (Retrieval Augmented Generation) approach retrieves information most relevant to the user’s request from the enterprise knowledge base or content, bundles it as context along with the user’s request as a prompt, and then sends it to the LLM to get a GenAI response. In this project, Amazon Aurora Postgresql with pgvector is used for a Knowledge Base for Amazon Bedrock.

With LLMs and Amazon Kendra

This project is a Question Answering application with Large Language Models (LLMs) and Amazon Kendra. An application using the RAG (Retrieval Augmented Generation) approach retrieves information most relevant to the user’s request from the enterprise knowledge base or content, bundles it as context along with the user’s request as a prompt, and then sends it to the LLM to get a GenAI response.
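
In code, that RAG loop is roughly: retrieve passages from the Kendra index, bundle them into the prompt, and send the prompt to the LLM. A hedged sketch using the Kendra Retrieve API and a SageMaker-hosted LLM endpoint; the index ID, endpoint name, and request payload format are assumptions, not this project's exact configuration.

```python
import json

import boto3

kendra = boto3.client("kendra")
smr = boto3.client("sagemaker-runtime")

question = "How do I rotate my API keys?"

# 1) Retrieve the most relevant passages from the Kendra index (placeholder ID).
result = kendra.retrieve(
    IndexId="11111111-2222-3333-4444-555555555555",
    QueryText=question,
)
passages = [item["Content"] for item in result["ResultItems"][:3]]

# 2) Bundle the passages as context together with the user's question into a prompt.
prompt = (
    "Answer the question using only the context below.\n\n"
    "Context:\n" + "\n---\n".join(passages) +
    f"\n\nQuestion: {question}\nAnswer:"
)

# 3) Send the prompt to the LLM endpoint (payload schema depends on the serving container).
response = smr.invoke_endpoint(
    EndpointName="my-llm-endpoint",
    ContentType="application/json",
    Body=json.dumps({"inputs": prompt, "parameters": {"max_new_tokens": 256}}),
)
print(json.loads(response["Body"].read()))
```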

With Amazon Bedrock and Kendra

This project is a Question Answering application with Large Language Models (LLMs) on Amazon Bedrock and Amazon Kendra. An application using the RAG (Retrieval Augmented Generation) approach retrieves information most relevant to the user’s request from the enterprise knowledge base or content, bundles it as context along with the user’s request as a prompt, and then sends it to the LLM to get a GenAI response.

With Amazon Bedrock and OpenSearch

With LLMs and Amazon OpenSearch

This project is a Question Answering application with Large Language Models (LLMs) and Amazon OpenSearch Service. An application using the RAG (Retrieval Augmented Generation) approach retrieves information most relevant to the user’s request from the enterprise knowledge base or content, bundles it as context along with the user’s request as a prompt, and then sends it to the LLM to get a GenAI response.
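
With OpenSearch, the retrieval step is a k-NN query over a vector index of document chunks. The sketch below assumes an index with a knn_vector field named embedding and a query vector produced by an embedding model; the host, index, and field names are placeholders.

```python
from opensearchpy import OpenSearch

# Placeholder connection details; the project provisions the domain and auth via CDK.
client = OpenSearch(
    hosts=[{"host": "my-opensearch-domain.example.com", "port": 443}],
    use_ssl=True,
)

query_vector = [0.12, -0.03, 0.88]  # normally the embedding of the user's question

# k-NN search over a knn_vector field; the top chunks become the LLM's context.
response = client.search(
    index="rag-documents",
    body={
        "size": 5,
        "query": {"knn": {"embedding": {"vector": query_vector, "k": 5}}},
        "_source": ["text"],
    },
)
context_chunks = [hit["_source"]["text"] for hit in response["hits"]["hits"]]
```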

With LLMs and Amazon OpenSearch Serverless

Question Answering Generative AI application with Large Language Models (LLMs) and Amazon OpenSearch Serverless.

With Amazon Bedrock and Amazon Aurora Postgresql using pgvector

This project is a Question Answering application with Large Language Models (LLMs) on Amazon Bedrock and Amazon Aurora Postgresql using pgvector. An application using the RAG (Retrieval Augmented Generation) approach retrieves information most relevant to the user’s request from the enterprise knowledge base or content, bundles it as context along with the user’s request as a prompt, and then sends it to the LLM to get a GenAI response. In this project, Amazon Aurora Postgresql with pgvector is used as the knowledge base.

With LLMs and Amazon Aurora Postgresql using pgvector

This project is a Question Answering application with Large Language Models (LLMs) and Amazon Aurora Postgresql using pgvector. An application using the RAG (Retrieval Augmented Generation) approach retrieves information most relevant to the user’s request from the enterprise knowledge base or content, bundles it as context along with the user’s request as a prompt, and then sends it to the LLM to get a GenAI response. In this project, Amazon Aurora Postgresql with pgvector is used as the knowledge base.
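
With pgvector, retrieval is plain SQL: embeddings live in a vector column and the chunks closest to the question's embedding are selected by distance. A minimal sketch, assuming psycopg2 and illustrative table and column names:

```python
import psycopg2

# Placeholder connection settings for the Aurora PostgreSQL cluster.
conn = psycopg2.connect(
    host="my-aurora-cluster.cluster-abc123.us-east-1.rds.amazonaws.com",
    dbname="ragdb", user="postgres", password="example-password",
)

question_embedding = [0.12, -0.03, 0.88]  # normally produced by an embedding model

with conn, conn.cursor() as cur:
    # One-time setup: enable pgvector and create a table of chunks plus embeddings.
    cur.execute("CREATE EXTENSION IF NOT EXISTS vector;")
    cur.execute("""
        CREATE TABLE IF NOT EXISTS documents (
            id bigserial PRIMARY KEY,
            content text,
            embedding vector(3)  -- use your embedding model's dimension here
        );
    """)

    # Retrieval: the <-> operator orders rows by distance to the query vector,
    # so LIMIT 5 returns the five most similar chunks.
    vector_literal = "[" + ",".join(str(x) for x in question_embedding) + "]"
    cur.execute(
        "SELECT content FROM documents ORDER BY embedding <-> %s::vector LIMIT 5;",
        (vector_literal,),
    )
    context_chunks = [row[0] for row in cur.fetchall()]
```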

With Amazon Bedrock and MemoryDB for Redis

Question Answering Generative AI application with Large Language Models (LLMs), Amazon Bedrock, and Amazon MemoryDB for Redis.

With Amazon MemoryDB for Redis and SageMaker

Question Answering Generative AI application with Large Language Models (LLMs) deployed on Amazon SageMaker, and Amazon MemoryDB for Redis as a Vector Database.
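
MemoryDB plays the same role as the other vector stores: chunk embeddings are indexed, and the question's embedding drives a KNN query. A rough sketch with redis-py; the cluster endpoint, index name, and field names are assumptions about this project's schema rather than its actual values.

```python
import numpy as np
import redis
from redis.commands.search.query import Query

# Placeholder MemoryDB cluster endpoint (MemoryDB requires TLS).
r = redis.Redis(
    host="clustercfg.my-memorydb.example.memorydb.us-east-1.amazonaws.com",
    port=6379,
    ssl=True,
)

question_embedding = np.array([0.12, -0.03, 0.88], dtype=np.float32)

# KNN query against a previously created vector index named "doc_idx" whose
# documents carry an "embedding" vector field and a "content" text field.
q = (
    Query("*=>[KNN 5 @embedding $vec AS score]")
    .sort_by("score")
    .return_fields("content", "score")
    .dialect(2)
)
results = r.ft("doc_idx").search(q, query_params={"vec": question_embedding.tobytes()})
context_chunks = [doc.content for doc in results.docs]
```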

With Amazon Bedrock and DocumentDB

Question Answering Generative AI application with Large Language Models (LLMs), Amazon Bedrock, and Amazon DocumentDB (with MongoDB Compatibility).

With Amazon DocumentDB and SageMaker

Question Answering Generative AI application with Large Language Models (LLMs) deployed on Amazon SageMaker, and Amazon DocumentDB (with MongoDB Compatibility) as a Vector Database.

Semantic Vector Search in PostgreSQL using Amazon SageMaker and pgvector

This project is a search solution using pgvector for an online retail store product catalog. We’ll build a search system that lets customers provide an item description to find similar items. For more information, check the blog post Building AI-powered search in PostgreSQL using Amazon SageMaker and pgvector (May 2023).

Sample repositories

aws-kr-startup-samples/machine-learning/sagemaker/deepseek-on-sagemaker/deepseek-r1-distill-llama-8b-sglang at main · aws-samples/aws-kr-startup-samples
aws-kr-startup-samples/machine-learning/sagemaker/scale-to-zero-sagemaker-endpoint at main · aws-samples/aws-kr-startup-samples
aws-kr-startup-samples/machine-learning/sagemaker/deepseek-on-sagemaker/deepseek-r1-distill-qwen-14b at main · aws-samples/aws-kr-startup-samples
aws-kr-startup-samples/machine-learning/sagemaker/deepseek-on-sagemaker/deepseek-r1-distill-qwen-32b at main · aws-samples/aws-kr-startup-samples
aws-kr-startup-samples/machine-learning/sagemaker/deepseek-on-sagemaker/deepseek-v2-lite-chat at main · aws-samples/aws-kr-startup-samples
aws-kr-startup-samples/machine-learning/sagemaker/deepseek-on-sagemaker/janus-pro-7b at main · aws-samples/aws-kr-startup-samples
aws-kr-startup-samples/machine-learning/sagemaker/lgai-exaone-on-sagemaker/exaone-deep-7_8b-sglang at main · aws-samples/aws-kr-startup-samples
aws-samples/deploy-langfuse-on-ecs-with-fargate: Hosting Langfuse on Amazon ECS with Fargate using CDK Python
aws-kr-startup-samples/gen-ai/rag-with-knowledge-bases-for-amazon-bedrock at main · aws-samples/aws-kr-startup-samples (version using generative-ai-cdk constructs)
https://github.com/aws-samples/qa-app-with-rag-using-amazon-bedrock-and-kendra
aws-kr-startup-samples/gen-ai/rag-with-knowledge-bases-for-amazon-bedrock-using-L1-cdk-constructs at main · aws-samples/aws-kr-startup-samples (version using AWS CDK L1 Constructs)
aws-kr-startup-samples/gen-ai/rag-with-knowledge-bases-for-amazon-bedrock-using-aurora-postgresql at main · aws-samples/aws-kr-startup-samples
aws-kr-startup-samples/gen-ai/rag-with-amazon-kendra-and-sagemaker at main · aws-samples/aws-kr-startup-samples
aws-kr-startup-samples/gen-ai/rag-with-amazon-bedrock-and-opensearch at main · aws-samples/aws-kr-startup-samples
aws-samples/rag-with-amazon-opensearch-and-sagemaker: Question Answering Generative AI application with Large Language Models (LLMs) and Amazon OpenSearch Service
https://github.com/aws-samples/rag-with-amazon-opensearch-serverless
aws-kr-startup-samples/gen-ai/rag-with-amazon-bedrock-and-postgresql-using-pgvector at main · aws-samples/aws-kr-startup-samples
aws-samples/rag-with-amazon-postgresql-using-pgvector-and-sagemaker: Question Answering application with Large Language Models (LLMs) and Amazon Postgresql using pgvector
aws-samples/rag-with-amazon-bedrock-and-memorydb: Question Answering Generative AI application with Large Language Models (LLMs), Amazon Bedrock, and Amazon MemoryDB for Redis
aws-kr-startup-samples/gen-ai/rag-with-amazon-memorydb-and-sagemaker at main · aws-samples/aws-kr-startup-samples
aws-samples/rag-with-amazon-bedrock-and-documentdb: Question Answering Generative AI application with Large Language Models (LLMs), Amazon Bedrock, and Amazon DocumentDB (with MongoDB Compatibility)
aws-kr-startup-samples/gen-ai/rag-with-amazon-documentdb-and-sagemaker at main · aws-samples/aws-kr-startup-samples
ksmin23/semantic-vector-search-with-sagemaker-pgvector: A search application using Aurora Postgresql and pgvector for an online retail store product catalog