Generative AI Platform — powered by Amazon Bedrock
Proof of concepts [POC] on use cases leveraging Generative AI is on the rise within enterprises. However, the path to production of these use cases has been limited. From system performance to cost of operations to addressing the inherent risks of Gen AI [hallucinations, moderations] is key for promoting these POCs to value driven production solutions. Building these capabilities is time consuming, requires diverse expertise and demands considerable operational overhead thus increasing time to Market.
The motivation of Amazon Bedrock is to provide these capabilities as a Generative AI platform and businesses can focus on delivering value — newer revenue streams or bolstering existing ones.
Let us consider a generative use case based on Retrieval Augmented Generation [RAG]
RAG is probably the most used design pattern. RAG implementations require more than just a Large Language Model. A knowledge base to hold the enterprise data /context, embedding model, a generation model and an orchestration engine to pipe the end-to-end workflow. There are multiple tools, frameworks and models available for developing a POC for a RAG based architecture.
However, the challenge is to take this to production. Considerations include but not limited to
- Cost: RAG demands a larger input to be sent to the model — Prompt + Enterprise Context + sometimes Conversation history.
- Performance: Production grade applications based on the use case demands high throughput, lower latency
- Prompts and Completion guardrails — Eliminating PII or any sensitive information, guard rails around bias, hallucinations, output moderation.
- Scalability: Supporting multiple concurrent users with guaranteed performance, multi-AZ support with limited manual efforts
- Stability of tools: Gen AI space is very dynamic with continuous emergence of new tools, frameworks often lacking backward compatibility
- Integration and Orchestration: Need for seamless integration and orchestration across the end to the end workflow
- Operational excellence: Visibility and control of end-to-end ops with logs, metrics, alarms and cost.
Considering the above requirements, the simple RAG architecture is no longer simple. You need to design capabilities from Caching, prompt engineering, data security, moderation of output completions and much more.
Scaling it to multiple use cases of similar nature requires a Generative AI platform that delivers a balance of Governance and Innovation
- Governance — Standardization of tools, frameworks and reusable architecture patterns
- Common services to address license to operate — Risk mitigation
- No Vendor lock in — to keep up with the pace of Gen AI.
Amazon Bedrock
Amazon Bedrock is a Generative AI platform designed to address enterprise-wide democratization. Amazon Bedrock provides built-in capabilities and extensible capabilities.
- Built in capabilities include Models, Agents, Customization, Prompt management
- Extensible capabilities leverage time tested AWS services from AI Services, Data Services, Integration and Orchestration services etc. Extensible services are accessed using Agents that are powered by AWS Lambda
Bedrock makes these services available as
- No Ops — all built in services are serverless (no requirement to provision or scale) and with extensible services you have option for both serverless and server based.
- No Vendor lock in — Choice of tools will a constant. For example, you can choose the LLM model, knowledge base, orchestration tooling etc.
- Gen AI for all — from no code, low code to high code users, Bedrock provides APIs, SDK and User interface to develop your Gen AI applications
For deep dive information on Amazon Bedrock, go to https://docs.aws.amazon.com/bedrock/latest/userguide/what-is-bedrock.html
Now closing the loop on the above RAG implementation using Amazon Bedrock, a typical implementation would look similar to the below.
Amazon Bedrock is the fastest growing services within AWS with new capabilities added — from models, knowledge bases, prompt management capabilities.
Cannot wait to see what new features are going to be release in 4 week at ReInvent. if you have not registered for the reinvent, you can get registered here — https://reinvent.awsevents.com/