Serverless Integration for Large Language Model (LLM) using AWS Lambda

About

In this blog, we’ll use a Large Language Model (LLM) to summarize a web page and store the summary in an Amazon DynamoDB table. The web page’s URL serves as both the input source and the key for storing the summary in the table.
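The data design above (URL as the key, summary as the value) can be sketched as a small helper around DynamoDB's `put_item`. The table name `WebSummaries` and attribute names `url`/`summary` are illustrative assumptions, not the blog's actual schema; the client is injected so the shape can be checked without AWS credentials.

```python
def build_summary_item(url: str, summary: str) -> dict:
    """Shape the item exactly as it is passed to put_item (low-level client format)."""
    return {
        "url": {"S": url},          # partition key: the page URL
        "summary": {"S": summary},  # the LLM-generated summary
    }


def store_summary(client, table_name: str, url: str, summary: str) -> None:
    # `client` is a boto3 DynamoDB client in real use; injected here
    # so the function can be exercised with a stub.
    client.put_item(TableName=table_name, Item=build_summary_item(url, summary))
```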

Source code

The link above points to the latest code. For consistency with the blog content, please use the source-code link provided in each implementation step.

Prerequisite

  1. A computer. If using Windows, make sure it is Windows 11 or a recent version of Windows 10 (version 1903 or higher, with build 18362 or higher).
  2. An AWS account. You can create a free account if you don’t have one yet.
  3. Install Docker Desktop via the Download Docker Desktop link.
  4. Install Visual Studio Code and the Dev Containers extension.
  5. Install Git.
  6. Clone this repo and open it in Visual Studio Code (choose “Reopen in Container” when prompted).
  7. Install the AWS CLI and CDK by running ./script/install_tools.sh in Visual Studio Code’s Integrated Terminal.

Architecture

(Architecture diagram)

References

Step by step implementation

1 - Hello World from Ollama
·1257 words·6 mins
This guide walks you through setting up a local AI development environment using Ollama for running large language models and DevContainers in Visual Studio Code for a consistent, isolated setup. It covers installation, configuration, and making REST API calls to Ollama, with specific instructions for Windows users using WSL 2.
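As a taste of the REST calls the post covers, here is a minimal sketch against Ollama's `/api/generate` endpoint on its default local port. The model name is an assumption; `generate` only works against a running Ollama server, while `build_generate_request` can be checked offline.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint


def build_generate_request(model: str, prompt: str) -> dict:
    # Minimal non-streaming request body for Ollama's /api/generate endpoint.
    return {"model": model, "prompt": prompt, "stream": False}


def generate(model: str, prompt: str) -> str:
    # Requires a live local Ollama server; shown for completeness.
    body = json.dumps(build_generate_request(model, prompt)).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```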
2 - Hello World from Amazon Bedrock
·1085 words·6 mins
This guide details how to use Amazon Bedrock and AWS Lambda to deploy and invoke the Llama 3.2 1B model in a serverless environment. It covers enabling model access, creating a Python-based Lambda function, configuring IAM permissions, and testing the setup, with step-by-step instructions for seamless AI integration.
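A minimal sketch of such a Lambda handler, using the Bedrock `converse` API. The model ID is an illustrative guess (confirm the exact ID in the Bedrock console), and the client is injectable so the handler can be exercised without AWS access; inside Lambda the default branch creates the real `bedrock-runtime` client.

```python
import json

# Illustrative model ID for Llama 3.2 1B; verify in the Bedrock console.
MODEL_ID = "meta.llama3-2-1b-instruct-v1:0"


def handler(event, context, client=None):
    if client is None:
        import boto3  # available in the Lambda Python runtime
        client = boto3.client("bedrock-runtime")
    resp = client.converse(
        modelId=MODEL_ID,
        messages=[{"role": "user", "content": [{"text": event["prompt"]}]}],
    )
    text = resp["output"]["message"]["content"][0]["text"]
    return {"statusCode": 200, "body": json.dumps({"completion": text})}
```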
3 - Fetch HTML with a URL
·2418 words·12 mins
This guide explores how to use BeautifulSoup to extract plain text from URLs and build a reusable web scraper for integration into AI-driven applications like AWS Lambda. It includes step-by-step instructions for creating and testing a Python-based scraper with a modular class design.
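The post builds its scraper on BeautifulSoup; as a dependency-free sketch of the same plain-text-extraction idea, here is a minimal extractor using only the standard library's `HTMLParser` (a stand-in, not the post's actual code), skipping `<script>` and `<style>` content.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text, skipping <script> and <style> blocks."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self._chunks.append(data.strip())

    def text(self) -> str:
        return " ".join(self._chunks)


def extract_text(html: str) -> str:
    parser = TextExtractor()
    parser.feed(html)
    return parser.text()
```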
4 - Summarize Web Content using Ollama
·775 words·4 mins
This guide shows how to create a modular Python application for web scraping and AI summarization using BeautifulSoup and Ollama. It covers organizing scraper and chat services into a reusable module, integrating them in a main script, and generating summaries from online content.
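The modular composition described here can be sketched as a thin orchestrator over two injected services. The names (`Summarizer`, `fetch_text`, `complete`) are illustrative assumptions about the post's interfaces; injection lets either the Ollama or Bedrock chat service, or a test stub, be plugged in.

```python
class Summarizer:
    def __init__(self, scraper, chat):
        # Both dependencies are injected: scraper exposes fetch_text(url),
        # chat exposes complete(prompt).
        self.scraper = scraper
        self.chat = chat

    def summarize_url(self, url: str) -> str:
        text = self.scraper.fetch_text(url)  # plain text of the page
        prompt = f"Summarize the following page:\n\n{text}"
        return self.chat.complete(prompt)    # LLM-generated summary
```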
5 - Summarize Web Content using Amazon Bedrock
·1354 words·7 mins
This article guides you through building a modular Python application to leverage Large Language Models (LLMs) with AWS Lambda. Learn to set up AWS CLI with IAM Identity Center, create a Bedrock Chat Service, implement web scraping and summarization, and automate Lambda deployment with a ZIP file. Includes practical code examples and verification steps.
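The ZIP-based deployment step can be automated in a few lines. This is a simplified sketch (real bundles also need any third-party dependencies vendored in): it packages every `.py` file under a source directory with the relative paths Lambda expects.

```python
import pathlib
import zipfile


def build_lambda_zip(source_dir: str, zip_path: str) -> list:
    """Package all .py files under source_dir into a Lambda deployment ZIP."""
    src = pathlib.Path(source_dir)
    names = []
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for f in sorted(src.rglob("*.py")):
            arcname = f.relative_to(src).as_posix()  # path inside the ZIP
            zf.write(f, arcname)
            names.append(arcname)
    return names
```

The resulting file is what you would upload with `aws lambda update-function-code --zip-file fileb://bundle.zip`.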
6 - A Small Step Towards Production Readiness
·1230 words·6 mins
This post guides us through improving Python code quality using Ruff, a fast linter and formatter, and pre-commit for automated checks. It also covers structuring LLM prompts with a Prompt model for scalable AI integrations, including updates to Ollama and Bedrock chat services.
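A structured Prompt model along these lines could look like the following dataclass; the field names (`system`, `user_template`, `inputs`) are an assumption about the post's shape, not its actual code.

```python
from dataclasses import dataclass, field


@dataclass
class Prompt:
    system: str                                 # system instruction for the model
    user_template: str                          # user message with {placeholders}
    inputs: dict = field(default_factory=dict)  # values filled into the template

    def render(self) -> str:
        return self.user_template.format(**self.inputs)

    def to_messages(self) -> list:
        # Chat-style message list usable by both Ollama and Bedrock services.
        return [
            {"role": "system", "content": self.system},
            {"role": "user", "content": self.render()},
        ]
```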
7 - Getting Structural Output from Large Language Model (LLM)
·1952 words·10 mins
Hands-on guide to forcing LLMs to return perfect JSON using Pydantic schemas + few-shot examples + Ollama/Bedrock structured output features, with ready-to-run code for serverless Python apps.
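The post relies on Pydantic schemas; as a dependency-free sketch of the same validate-the-JSON idea, here is a hand-written schema check over the model's raw reply. The field names are an illustrative schema, not the post's actual one.

```python
import json

# Illustrative schema: each field name maps to its required Python type.
REQUIRED_FIELDS = {"title": str, "summary": str, "keywords": list}


def parse_structured_output(raw: str) -> dict:
    """Parse the model's reply and fail loudly if the JSON shape is wrong."""
    data = json.loads(raw)
    for name, typ in REQUIRED_FIELDS.items():
        if not isinstance(data.get(name), typ):
            raise ValueError(f"field {name!r} missing or not {typ.__name__}")
    return data
```

On a `ValueError` the caller can re-prompt the model, which is the retry loop structured-output setups typically need.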