
Amazon Professional AIP-C01 Dumps Full Questions with Free PDF Questions to Pass
100% Updated Amazon AIP-C01 Enterprise PDF Dumps
Amazon AIP-C01 Exam Syllabus Topics:
| Topic | Details |
|---|---|
| Topic 1 |
|
| Topic 2 |
|
| Topic 3 |
|
| Topic 4 |
|
| Topic 5 |
|
NEW QUESTION # 20
A company has a generative AI (GenAI) application that uses Amazon Bedrock to provide real-time responses to customer queries. The company has noticed intermittent failures with API calls to foundation models (FMs) during peak traffic periods.
The company needs a solution to handle transient errors and provide detailed observability into FM performance. The solution must prevent cascading failures during throttling events and provide distributed tracing across service boundaries to identify latency contributors. The solution must also enable correlation of performance issues with specific FM characteristics.
Which solution will meet these requirements?
- A. Configure the AWS SDK with adaptive retry mode. Use AWS CloudTrail distributed tracing to monitor throttling events.
- B. Configure the AWS SDK with standard retry mode and exponential backoff with jitter. Use AWS X- Ray tracing with annotations to identify and filter service components.
- C. Implement a custom retry mechanism with a fixed delay of 1 second between retries. Configure Amazon CloudWatch alarms to monitor the application's error rates and latency metrics.
- D. Implement client-side caching of all FM responses. Add custom logging statements in the application code to record API call durations.
Answer: B
Explanation:
Option B best meets the combined resiliency and observability requirements because it applies AWS- recommended retry behavior for transient throttling and enables true distributed tracing across service boundaries. During peak traffic, intermittent failures are commonly caused by throttling and other transient conditions. The AWS SDK standard retry mode provides exponential backoff with jitter, which reduces synchronized retry storms, prevents cascading failures, and improves overall system stability. Jitter is important because it spreads retry attempts over time, reducing load amplification during throttling events.
For observability, AWS X-Ray provides distributed tracing that follows a request across components such as API Gateway or load balancers, application services, and downstream calls to Amazon Bedrock. X-Ray can identify where latency is being introduced and which downstream call is contributing most to end-to-end response time. This is required to "identify latency contributors" and isolate performance issues under load.
The requirement also states that the company must correlate performance issues with specific FM characteristics. X-Ray annotations are designed for this purpose: the application can annotate traces with the model ID, inference parameters, region, or inference profile used. This enables filtering and analysis (for example, comparing latency or error patterns by model, parameter set, or endpoint configuration) without building a separate telemetry system.
Option A's fixed-delay retries increase synchronized retry behavior and do not provide distributed tracing.
Option C does not prevent cascading failures and cannot provide cross-service tracing. Option D is incorrect because CloudTrail is an audit logging service and does not provide distributed tracing for request latency analysis.
Therefore, Option B provides the correct combination of resilient retries and deep, model-correlated distributed observability for Amazon Bedrock workloads.
NEW QUESTION # 21
A medical company uses Amazon Bedrock to power a clinical documentation summarization system. The system produces inconsistent summaries when handling complex clinical documents. The system performed well on simple clinical documents.
The company needs a solution that diagnoses inconsistencies, compares prompt performance against established metrics, and maintains historical records of prompt versions.
Which solution will meet these requirements?
- A. Create multiple prompt variants by using Prompt management in Amazon Bedrock. Manually test the prompts with simple clinical documents. Deploy the highest performing version by using the Amazon Bedrock console.
- B. Implement version control for prompts in a code repository with a test suite that contains complex clinical documents and quantifiable evaluation metrics. Use an automated testing framework to compare prompt versions and document performance patterns.
- C. Create a custom prompt evaluation flow in Amazon Bedrock Flows that applies the same clinical document inputs to different prompt variants. Use Amazon Comprehend Medical to analyze and score the factual accuracy of each version.
- D. Deploy each new prompt version to separate Amazon Bedrock API endpoints. Split production traffic between the endpoints. Configure Amazon CloudWatch to capture response metrics and user feedback for automatic version selection.
Answer: B
Explanation:
Option B best meets the requirements because it provides systematic diagnosis, measurable comparison, and historical traceability of prompt performance. By placing prompts under version control and testing them against complex clinical documents, the company can consistently reproduce issues, track regressions, and compare prompt behavior using quantifiable metrics such as factual accuracy, completeness, and consistency.
Automated testing ensures scalability and repeatability, while version history preserves prompt evolution over time.
Option A lacks objective metrics and does not address complex documents. Option C focuses on live traffic experimentation but does not inherently diagnose prompt inconsistencies or preserve detailed historical evaluations. Option D adds medical entity analysis but introduces unnecessary service coupling and does not provide robust prompt version history or automated comparative benchmarking. Therefore, Option B is the most complete and disciplined solution.
NEW QUESTION # 22
A company is developing a generative AI (GenAI)-powered customer support application that uses Amazon Bedrock foundation models (FMs). The application must maintain conversational context across multiple interactions with the same user. The application must run clarification workflows to handle ambiguous user queries. The company must store encrypted records of each user conversation to use for personalization. The application must be able to handle thousands of concurrent users while responding to each user quickly.
Which solution will meet these requirements?
- A. Use an AWS Step Functions Express workflow to orchestrate conversation flow. Invoke AWS Lambda functions to run clarification logic. Store conversation history in Amazon RDS and use session IDs as the primary key.
- B. Use AWS Lambda functions to call Amazon Bedrock inference APIs. Use Amazon SQS queues to orchestrate clarification steps. Store conversation history in an Amazon ElastiCache (Redis OSS) cluster. Configure encryption at rest.
- C. Deploy the application by using an Amazon API Gateway REST API to route user requests to an AWS Lambda function to update and retrieve conversation context. Store conversation history in Amazon S3 and configure server-side encryption. Save each interaction as a separate JSON file.
- D. Use an AWS Step Functions Standard workflow to orchestrate clarification workflows. Include Wait for a Callback patterns to manage the workflows. Store conversation history in Amazon DynamoDB.
Purchase on-demand capacity and configure server-side encryption.
Answer: D
Explanation:
Option B is the correct solution because it provides a scalable, durable, and secure architecture for conversational GenAI workloads that require multi-step clarification workflows and persistent memory.
AWS Step Functions Standard workflows are designed for long-running, stateful workflows with high reliability, which is ideal for clarification loops that may require multiple back-and-forth interactions. The Wait for a Callback pattern allows the workflow to pause while awaiting additional user input, making it well- suited for handling ambiguous queries without losing execution state.
Storing conversation history in Amazon DynamoDB enables millisecond-latency reads and writes at massive scale, supporting thousands of concurrent users. DynamoDB's on-demand capacity mode automatically scales with traffic, eliminating capacity planning. Server-side encryption ensures that stored conversation data is encrypted at rest, meeting security and compliance requirements for personalized data.
Option A uses Step Functions Express and Amazon RDS, which is not ideal for long-lived conversational workflows and introduces scaling and connection management challenges. Option C stores conversations as individual S3 objects, which increases latency and complicates context retrieval. Option D relies on Amazon ElastiCache, which is optimized for ephemeral caching rather than durable, auditable conversation history.
Therefore, Option B best balances scalability, performance, durability, and security for a conversational Amazon Bedrock-based customer support application.
NEW QUESTION # 23
A healthcare company is using Amazon Bedrock to develop a real-time patient care AI assistant to respond to queries for separate departments that handle clinical inquiries, insurance verification, appointment scheduling, and insurance claims. The company wants to use a multi-agent architecture.
The company must ensure that the AI assistant is scalable and can onboard new features for patients. The AI assistant must be able to handle thousands of parallel patient interactions. The company must ensure that patients receive appropriate domain-specific responses to queries.
Which solution will meet these requirements?
- A. Isolate data for each agent by using separate knowledge bases. Use IAM filtering to control access to each knowledge base. Deploy a supervisor agent to perform natural language intent classification on patient inquiries. Configure the supervisor agent to route queries to specialized collaborator agents to respond to department-specific queries. Configure each specialized collaborator agent to use Retrieval Augmented Generation (RAG) with the agent's department-specific knowledge base.
- B. Implement multiple independent supervisor agents that run in parallel to respond to patient inquiries for each department. Configure multiple collaborator agents for each supervisor agent. Integrate all agents with the same knowledge base. Use external routing logic to merge responses from multiple supervisor agents.
- C. Create a separate supervisor agent for each department. Configure individual collaborator agents to perform natural language intent classification for each specialty domain within each department.
Integrate each collaborator agent with department-specific knowledge bases only. Implement manual handoff processes between the supervisor agents. - D. Isolate data for each department in separate knowledge bases. Use IAM filtering to control access to each knowledge base. Deploy a single general-purpose agent. Configure multiple action groups within the general-purpose agent to perform specific department functions. Implement rule-based routing logic in the general-purpose agent instructions.
Answer: A
Explanation:
Option A best meets the requirements because it applies an AWS-aligned multi-agent pattern that cleanly separates responsibilities: a supervisor agent performs intent classification and orchestration, while specialized collaborator agents handle domain-specific tasks using the right knowledge sources. This structure is well suited for healthcare workflows where clinical questions, scheduling, and insurance processes require different policies, terminology, and data access boundaries.
The requirement for appropriate domain-specific responses is addressed by routing each user query to a department-focused collaborator agent that is grounded with its own department-specific knowledge base.
Using Retrieval Augmented Generation with the correct knowledge base improves factual alignment and reduces cross-department leakage (for example, avoiding claims content in a clinical answer). It also supports better prompt grounding and more consistent tone and constraints per department.
The requirement to isolate data maps to using separate knowledge bases per agent and enforcing access through IAM controls, ensuring that each agent can retrieve only from the authorized datasets. This is important for minimizing unintended exposure of sensitive or irrelevant departmental data and supports governance and compliance needs.
For scalability and thousands of parallel interactions, this architecture minimizes contention and bottlenecks. Each collaborator agent can scale independently because requests are distributed across multiple agents and multiple retrieval backends. Operationally, onboarding new features is also simpler: the company can add a new collaborator agent (for example, "billing disputes" or "pharmacy refills") with its own knowledge base and policies without redesigning the entire assistant.
Option B introduces unnecessary complexity with multiple supervisors and manual handoffs. Option C overloads a single agent with broad instructions and rule-based routing, which increases prompt complexity and reduces maintainability as features grow. Option D creates high operational complexity and risks inconsistent outputs when merging responses from parallel supervisors, and it weakens data isolation by using a shared knowledge base across agents.
NEW QUESTION # 24
A company is using Amazon Bedrock to develop a customer support AI assistant. The AI assistant must respond to customer questions about their accounts. The AI assistant must not expose personal information in responses. The company must comply with data residency policies by ensuring that all processing occurs within the same AWS Region where each customer is located.
The company wants to evaluate how effective the AI assistant is at preventing the exposure of personal information before the company makes the AI assistant available to customers.
Which solution will meet these requirements?
- A. Configure an Amazon Bedrock guardrail to apply content and topic filters. Set the guardrail to detect mode during development, testing, and production. Disable invocation logging for the Amazon Bedrock model.
- B. Configure a cross-Region Amazon Bedrock guardrail to apply a set of content and word filters. Set the guardrail to detect mode during development and testing. Switch to mask mode for production deployment.
- C. Configure a cross-Region Amazon Bedrock guardrail to apply sensitive information filters. Set the guardrail to detect mode during development and testing. Switch to block mode for production deployment.
- D. Configure an Amazon Bedrock guardrail to apply sensitive information filters. Set the guardrail to mask mode during development and testing. Switch to block mode for production deployment. Deploy a copy of the guardrail to each Region where the company operates.
Answer: D
Explanation:
Option B best meets all stated requirements by correctly combining PII protection, evaluation before launch
, and data residency compliance using Amazon Bedrock Guardrails. Amazon Bedrock guardrails provide native sensitive information filtering that operates inline during model invocation, making them well suited for preventing personal data exposure in customer-facing AI assistants.
The requirement to evaluate how effective the AI assistant is at preventing exposure before release is best addressed by using mask mode during development and testing. Mask mode allows responses to be generated while automatically redacting detected personal information, making it easy for developers and reviewers to see where and how PII would have appeared. This provides concrete validation that the guardrail rules are correctly configured without fully blocking responses, which is ideal for quality assurance and pre- production evaluation.
For production, switching the guardrail to block mode ensures that responses containing personal information are fully prevented from being returned to users. This offers the strongest protection and aligns with compliance expectations for customer account data. Block mode is appropriate once confidence in the guardrail configuration has been established during testing.
The data residency requirement is addressed by deploying a copy of the guardrail in each AWS Region where the application operates. Amazon Bedrock guardrails are Region-specific resources, and using Region- local guardrails ensures that inference, filtering, and enforcement all occur within the same Region as the customer data. This avoids cross-Region processing and helps the company comply with regulatory and contractual data residency policies.
Option A and D incorrectly rely on cross-Region guardrails, which can violate data residency constraints.
Option C focuses on topic filtering rather than sensitive information filtering and keeps detect mode enabled in production, which does not actively prevent PII exposure. Therefore, B is the only option that fully satisfies safety, compliance, and evaluation requirements.
NEW QUESTION # 25
A company upgraded its Amazon Bedrock-powered foundation model (FM) that supports a multilingual customer service assistant. After the upgrade, the assistant exhibited inconsistent behavior across languages.
The assistant began generating different responses in some languages when presented with identical questions.
The company needs a solution to detect and address similar problems for future updates. The evaluation must be completed within 45 minutes for all supported languages. The evaluation must process at least 15,000 test conversations in parallel. The evaluation process must be fully automated and integrated into the CI/CD pipeline. The solution must block deployment if quality thresholds are not met.
Which solution will meet these requirements?
- A. Create a distributed traffic simulation framework that sends translation-heavy workloads to the assistant in multiple languages simultaneously. Use Amazon CloudWatch metrics to monitor latency, concurrency, and throughput. Run simulations before production releases to identify infrastructure bottlenecks.
- B. Deploy the assistant in multiple AWS Regions with Amazon Route 53 latency-based routing and AWS Global Accelerator to improve global performance. Store multilingual conversation logs in Amazon S3.
Perform weekly post-deployment audits to review consistency. - C. Create a pre-processing pipeline that normalizes all incoming messages into a consistent format before sending the messages to the assistant. Apply rule-based checks to flag potential hallucinations in the outputs. Focus evaluation on normalized text to simplify testing across languages.
- D. Set up standardized multilingual test conversations with identical meaning. Run the test conversations in parallel by using Amazon Bedrock model evaluation jobs. Apply similarity and hallucination thresholds. Integrate the process into the CI/CD pipeline to block releases that fail.
Answer: D
Explanation:
Option D is the correct solution because it directly evaluates multilingual output consistency and quality in an automated, scalable, and deployment-gating workflow. Amazon Bedrock model evaluation jobs are designed to run large-scale, repeatable evaluations against defined datasets and to produce quantitative metrics that can be used as objective release criteria.
The core issue is semantic inconsistency across languages for equivalent inputs. The most reliable way to detect this is to create standardized test conversations where each language version expresses the same intent and constraints. Running those tests through the updated model and comparing results with similarity metrics (for example, semantic similarity between expected and actual answers, or between language variants) surfaces regressions that infrastructure testing cannot detect.
Bedrock evaluation jobs support running evaluations at scale and are well suited for processing large datasets quickly. By parallelizing evaluation runs across languages and conversations, the company can meet the 45- minute requirement while executing at least 15,000 conversations. Because the process is standardized, it also allows consistent baseline comparisons across releases.
Applying hallucination thresholds ensures that answers remain grounded and do not introduce fabricated details, which is particularly important when language-specific behavior shifts after a model upgrade.
Integrating evaluation jobs into the CI/CD pipeline enables fully automated execution on every model or configuration update. The pipeline can enforce a hard quality gate that blocks deployment if thresholds are not met, preventing regressions from reaching production.
Option A focuses on performance and infrastructure bottlenecks, not multilingual response quality. Option B is post-deployment and too slow to prevent regressions. Option C normalizes inputs but does not measure multilingual output equivalence or provide robust, quantitative gating.
Therefore, Option D best meets the automation, scale, timing, and deployment-blocking requirements.
NEW QUESTION # 26
A pharmaceutical company is developing a Retrieval Augmented Generation application that uses an Amazon Bedrock knowledge base. The knowledge base uses Amazon OpenSearch Service as a data source for more than 25 million scientific papers. Users report that the application produces inconsistent answers that cite irrelevant sections of papers when queries span methodology, results, and discussion sections of the papers.
The company needs to improve the knowledge base to preserve semantic context across related paragraphs on the scale of the entire corpus of data.
Which solution will meet these requirements?
- A. Configure the knowledge base to use semantic chunking. Use a buffer size of 1 and a breakpoint percentile threshold of 85% to determine chunk boundaries based on content meaning.
- B. Configure the knowledge base not to use chunking. Manually split each document into separate files before ingestion. Apply post-processing reranking during retrieval.
- C. Configure the knowledge base to use hierarchical chunking. Use parent chunks that contain 1,000 tokens and child chunks that contain 200 tokens. Set a 50-token overlap between chunks.
- D. Configure the knowledge base to use fixed-size chunking. Set a 300-token maximum chunk size and a
10% overlap between chunks. Use an appropriate Amazon Bedrock embedding model.
Answer: C
Explanation:
Option B is the best fit because hierarchical chunking is designed to preserve local detail while keeping broader document context available during retrieval, which directly addresses the problem of questions spanning methodology, results, and discussion. In large scientific papers, a single answer often depends on linked paragraphs across adjacent sections. If the knowledge base retrieves only small, isolated chunks, the RAG system can cite text that is semantically close to a query term but not contextually correct, producing inconsistent answers and irrelevant citations.
With hierarchical chunking, the knowledge base creates child chunks that are small enough for high- precision vector similarity matching, such as 200 tokens, which improves the likelihood that the retrieved text is tightly related to the user's query. At the same time, each child chunk is associated with a larger parent chunk, such as 1,000 tokens, which retains the surrounding narrative and section-level context. This structure helps the retrieval pipeline return passages that include the relevant subsection plus the explanatory framing that prevents misinterpretation, which is especially important in scientific writing where methods, results, and discussion are interdependent.
The configured overlap further reduces boundary effects where key statements split across chunks. This improves continuity for paragraphs that bridge sections, such as a results paragraph that references the methodological setup or a discussion paragraph interpreting a specific metric.
Option A can improve consistency slightly, but fixed-size chunking still risks separating related paragraphs and does not provide a built-in mechanism to retrieve broader context linked to precise matches. Option C can create more meaningful boundaries, but it does not guarantee the parent-level context that hierarchical chunking provides at retrieval time. Option D increases operational burden and is not practical at the scale of
25 million
NEW QUESTION # 27
A company is building a legal research AI assistant that uses Amazon Bedrock with an Anthropic Claude foundation model (FM). The AI assistant must retrieve highly relevant case law documents to augment the FM's responses. The AI assistant must identify semantic relationships between legal concepts, specific legal terminology, and citations. The AI assistant must perform quickly and return precise results.
Which solution will meet these requirements?
- A. Configure an Amazon Bedrock knowledge base to use a default vector search configuration. Use Amazon Bedrock to expand queries to improve retrieval for legal documents based on specific terminology and citations.
- B. Enable the Amazon Kendra query suggestion feature for end users. Use Amazon Bedrock to perform post-processing of search results to identify semantic similarity in the documents and to produce precise results.
- C. Use Amazon OpenSearch Service to deploy a hybrid search architecture that combines vector search with keyword search. Apply an Amazon Bedrock reranker model to optimize result relevance.
- D. Use Amazon OpenSearch Service with vector search and Amazon Bedrock Titan Embeddings to index and search legal documents. Use custom AWS Lambda functions to merge results with keyword-based filters that are stored in an Amazon RDS database.
Answer: C
Explanation:
Option B is the correct solution because legal research workloads require both semantic understanding and exact lexical precision, especially for statutes, citations, and domain-specific terminology. A hybrid search architecture directly addresses this need by combining vector similarity search with traditional keyword-based retrieval.
Vector search alone is often insufficient for legal research because exact phrases, citation formats, and jurisdiction-specific terms must be matched precisely. Keyword search ensures high recall and precision for citations and legal terms, while vector search captures deeper semantic relationships between legal concepts, precedents, and arguments. Amazon OpenSearch Service natively supports hybrid search, enabling efficient scoring and ranking without external orchestration.
Applying an Amazon Bedrock reranker model further improves relevance by reordering retrieved documents based on deeper contextual understanding. Reranking is especially valuable in legal research because multiple documents may appear relevant, but only a subset truly addresses the user's legal question. The reranker optimizes final results before they are passed to the Anthropic Claude FM, improving answer accuracy and reducing hallucinations.
Option A relies on default vector search, which does not reliably handle citations and exact terminology.
Option C focuses on query suggestions and post-processing rather than retrieval quality. Option D introduces unnecessary operational complexity by merging results across multiple systems.
Therefore, Option B best meets the requirements for precision, performance, and semantic understanding in a legal research AI assistant.
NEW QUESTION # 28
A financial technology company is using Amazon Bedrock to build an assessment system for the company's customer service AI assistant. The AI assistant must provide financial recommendations that are factually accurate, compliant with financial regulations, and conversationally appropriate. The company needs to combine automated quality evaluations at scale with targeted human reviews of critical interactions.
What solution will meet these requirements?
- A. Configure Amazon CloudWatch to monitor response patterns from the AI assistant. Configure CloudWatch alerts for potential compliance violations. Establish a team of human evaluators to review flagged interactions.
- B. Configure Amazon Bedrock evaluations that use Anthropic Claude Sonnet as a judge model to assess response accuracy and appropriateness. Configure custom Amazon Bedrock guardrails to check responses for compliance with financial policies. Add Amazon Augmented AI (Amazon A2I) human reviews for flagged critical interactions.
- C. Create an Amazon Lex bot to manage customer service interactions. Configure AWS Lambda functions to check responses against a static compliance database. Configure intents that call the Lambda functions. Add an additional intent to collect end-user reviews.
- D. Configure a pipeline in which financial experts manually score all responses for accuracy, compliance, and conversational quality. Use Amazon SageMaker notebooks to analyze results to identify improvement areas.
Answer: B
Explanation:
Option B meets the requirement to combine scalable automated evaluation with targeted human oversight using managed AWS GenAI capabilities. Amazon Bedrock evaluations enable systematic, repeatable quality assessment across large volumes of interactions. Using an LLM-as-a-judge approach with a strong evaluator model such as Anthropic Claude Sonnet allows the company to automatically score outputs for dimensions like factual accuracy, conversational appropriateness, and policy alignment. This directly supports "automated quality evaluations at scale" without building custom scoring models.
However, financial recommendations add higher risk because regulatory compliance requires additional enforcement beyond general quality scoring. Amazon Bedrock guardrails provide a dedicated policy enforcement layer that can block or intervene when responses violate compliance constraints. Guardrails are particularly important for preventing disallowed financial guidance patterns and ensuring consistent behavior across deployments.
The requirement also calls for "targeted human reviews of critical interactions." Amazon Augmented AI (A2I) is a managed human review service that supports routing specific items to human reviewers based on rules or confidence thresholds. In this design, the system can automatically send only high-risk or policy- flagged interactions to qualified financial experts for review, keeping human effort focused where it matters most while maintaining scale.
Option A is not scalable because it requires manual review of all responses. Option C relies on static rules and end-user feedback, which is insufficient for regulatory compliance and factual accuracy assurance. Option D provides monitoring but not structured quality evaluation or policy enforcement.
Therefore, Option B provides the most complete, AWS-aligned solution for scalable evaluation plus human oversight in a regulated financial context.
NEW QUESTION # 29
A healthcare company is using Amazon Bedrock to build a system to help practitioners make clinical decisions. The system must provide treatment recommendations to physicians based only on approved medical documentation and must cite specific sources. The system must not hallucinate or produce factually incorrect information.
Which solution will meet these requirements with the LEAST operational overhead?
- A. Use Amazon Bedrock and Amazon Comprehend Medical to extract medical entities. Implement verification logic against a medical terminology database.
- B. Deploy an Amazon Bedrock Knowledge Base and connect it to approved clinical source documents.
Use the Amazon Bedrock RetrieveAndGenerate API to return citations from the knowledge base. - C. Use an Amazon Bedrock knowledge base with Retrieve API calls and InvokeModel API calls to retrieve approved clinical source documents. Implement verification logic to compare against retrieved sources and to cite sources.
- D. Integrate Amazon Bedrock with Amazon Kendra to retrieve approved documents. Implement custom post-processing to compare generated responses against source documents and to include citations.
Answer: B
Explanation:
Option B is the correct solution because Amazon Bedrock Knowledge Bases with the RetrieveAndGenerate API provide a fully managed Retrieval Augmented Generation (RAG) capability that directly addresses grounding, citation, and hallucination prevention with the least operational overhead.
Amazon Bedrock Knowledge Bases automatically manage document ingestion, chunking, embedding, retrieval, and ranking from approved data sources. When used with the RetrieveAndGenerate API, the model is constrained to generate responses only from retrieved, approved clinical documentation, significantly reducing the risk of hallucinations or unsupported claims. The API also returns explicit source citations, which satisfies regulatory and clinical transparency requirements without requiring custom comparison or validation logic.
This approach aligns with AWS best practices for healthcare GenAI workloads, where correctness and traceability are critical. Because retrieval and generation are tightly integrated, the system avoids multi-step orchestration, custom verification pipelines, or additional compute layers that would increase latency and maintenance burden.
Option A introduces Amazon Kendra and custom post-processing logic, increasing operational complexity.
Option C focuses on entity extraction rather than controlled knowledge grounding and does not guarantee citation or hallucination prevention. Option D requires manual orchestration between retrieval and generation and custom verification logic, which increases development and maintenance effort.
Therefore, Option B delivers accurate, grounded, and cited clinical recommendations with minimal infrastructure and operational overhead.
NEW QUESTION # 30
A company is building a generative AI (GenAI) application that processes financial reports and provides summaries for analysts. The application must run two compute environments. In one environment, AWS Lambda functions must use the Python SDK to analyze reports on demand. In the second environment, Amazon EKS containers must use the JavaScript SDK to batch process multiple reports on a schedule. The application must maintain conversational context throughout multi-turn interactions, use the same foundation model (FM) across environments, and ensure consistent authentication.
Which solution will meet these requirements?
- A. Use the Amazon Bedrock InvokeModel API with a separate authentication method for each environment. Store conversation states in Amazon DynamoDB. Use custom I/O formatting logic for each programming language.
- B. Use the Amazon Bedrock Converse API directly in both environments with a common authentication mechanism that uses IAM roles. Store conversation states in Amazon ElastiCache. Create programming language-specific wrappers for model parameters.
- C. Use the Amazon Bedrock Converse API and IAM roles for authentication. Pass previous messages in the request messages array to maintain conversational context. Use programming language-specific SDKs to establish consistent API interfaces.
- D. Create a centralized Amazon API Gateway REST API endpoint that handles all model interactions by using the InvokeModel API. Store interaction history in application process memory in each Lambda function or EKS container. Use environment variables to configure model parameters.
Answer: C
Explanation:
Option D is the correct solution because the Amazon Bedrock Converse API is purpose-built for multi-turn conversational interactions and is designed to work consistently across SDKs and compute environments. The Converse API standardizes how messages, roles, and context are represented, which ensures consistent behavior whether the application is running in AWS Lambda with Python or in Amazon EKS with JavaScript.
By passing previous messages in the messages array, the application explicitly maintains conversational context across turns without relying on external state stores. This approach is recommended by AWS for conversational GenAI workflows because it avoids state synchronization complexity and ensures deterministic model behavior across environments.
Using IAM roles for authentication provides a single, consistent security model for both Lambda and EKS.
IAM roles integrate natively with AWS SDKs, eliminating the need for custom authentication logic or environment-specific credentials. This aligns with AWS best practices for least privilege and simplifies governance.
Option A introduces inconsistent authentication and custom formatting logic, increasing complexity. Option B unnecessarily introduces ElastiCache for state management, which is not required when using the Converse API correctly. Option C stores state in process memory, which is unsafe and unreliable for serverless and containerized workloads.
Therefore, Option D best satisfies the requirements for conversational consistency, multi-environment support, shared model usage, and consistent authentication with minimal operational overhead.
NEW QUESTION # 31
A company has a recommendation system. The system's applications run on Amazon EC2 instances. The applications make API calls to Amazon Bedrock foundation models (FMs) to analyze customer behavior and generate personalized product recommendations.
The system is experiencing intermittent issues. Some recommendations do not match customer preferences.
The company needs an observability solution to monitor operational metrics and detect patterns of operational performance degradation compared to established baselines. The solution must also generate alerts with correlation data within 10 minutes when FM behavior deviates from expected patterns.
Which solution will meet these requirements?
- A. Use Amazon OpenSearch Service with the Observability plugin. Ingest model metrics and logs by using Amazon Kinesis. Create custom Piped Processing Language (PPL) queries to analyze model behavior patterns. Establish operational dashboards to visualize anomalies in real time.
- B. Implement AWS X-Ray to trace requests through the application components. Enable CloudWatch Logs Insights for error pattern detection. Set up AWS CloudTrail to monitor all API calls to Amazon Bedrock. Create custom dashboards in Amazon QuickSight.
- C. Configure Amazon CloudWatch Container Insights for the application infrastructure. Set up CloudWatch alarms for latency thresholds. Add custom metrics for token counts by using the CloudWatch embedded metric format. Create CloudWatch dashboards to visualize the data.
- D. Enable Amazon CloudWatch Application Insights for the application resources. Create custom metrics for recommendation quality, token usage, and response latency by using the CloudWatch embedded metric format with dimensions for request types and user segments. Configure CloudWatch anomaly detection on the model metrics. Establish log pattern analysis by using CloudWatch Logs Insights.
Answer: D
Explanation:
Option C best satisfies the requirements because it combines application-aware observability, metric baselining, anomaly detection, and correlated alerting using fully managed AWS services with minimal operational overhead. Amazon CloudWatch Application Insights is designed to automatically monitor application health by analyzing metrics, logs, and events across EC2-based workloads. This aligns directly with the need to detect intermittent performance issues and deviations from expected behavior.
By publishing custom metrics using the CloudWatch embedded metric format, the application can track generative AI-specific signals such as recommendation quality indicators, token usage, request volume, and response latency from Amazon Bedrock foundation model calls. Adding dimensions such as request type or user segment enables fine-grained visibility into which workloads or customer groups are impacted when recommendation quality degrades.
A critical requirement is detecting degradation compared to established baselines and generating alerts within
10 minutes. CloudWatch anomaly detection automatically builds statistical models of normal behavior for time-series metrics and flags deviations without requiring manually tuned thresholds. This capability is well suited for monitoring foundation model behavior, which can vary subtly over time. When anomalies are detected, CloudWatch alarms can trigger notifications with contextual metric data quickly, meeting the alerting requirement.
CloudWatch Logs Insights complements the metric-based view by enabling log pattern analysis and correlation. Engineers can query application logs and model response logs to identify recurring error patterns or shifts in output behavior that explain why recommendations no longer align with user preferences.
Application Insights further correlates metrics and logs to surface probable root causes, reducing mean time to resolution.
The other options lack one or more critical elements. Option A focuses on infrastructure-level metrics without baseline anomaly detection. Option B emphasizes tracing and auditing but does not provide automated performance deviation analysis. Option D offers flexibility but requires significantly more development and operational effort than a native CloudWatch-based solution.
NEW QUESTION # 32
A company is using AWS Lambda and REST APIs to build a reasoning agent to automate support workflows.
The system must preserve memory across interactions, share relevant agent state, and support event-driven invocation and synchronous invocation. The system must also enforce access control and session-based permissions.
Which combination of steps provides the MOST scalable solution? (Select TWO.)
- A. Build a custom RAG pipeline by using Amazon Kendra and Amazon Bedrock. Use AWS Lambda to orchestrate tool invocations. Store agent state in Amazon S3.
- B. Use Amazon Bedrock Agents for reasoning and conversation management. Use AWS Step Functions and Amazon SQS for orchestration. Store agent state in Amazon DynamoDB.
- C. Register the Lambda functions and REST APIs as actions by using Amazon API Gateway and Amazon EventBridge. Enable Amazon Bedrock AgentCore to invoke the Lambda functions and REST APIs without custom orchestration code.
- D. Deploy the reasoning logic as a container on Amazon ECS behind API Gateway. Use Amazon Aurora to store memory and identity data.
- E. Use Amazon Bedrock AgentCore to manage memory and session-aware reasoning. Deploy the agent with built-in identity support, event handling, and observability.
Answer: C,E
Explanation:
The combination of Options A and B provides the most scalable and AWS-native architecture for building reasoning agents with persistent memory, session awareness, secure access control, and flexible invocation models.
Amazon Bedrock AgentCore is purpose-built to manage agent memory, session context, and identity-aware reasoning across interactions. It eliminates the need for developers to manually store and retrieve agent state, manage session lifecycles, or implement custom memory layers. AgentCore natively supports both synchronous requests and event-driven execution, making it ideal for support workflow automation.
Option B complements AgentCore by enabling seamless tool invocation. By registering AWS Lambda functions and REST APIs as agent actions through API Gateway and EventBridge, the agent can invoke tools reactively or synchronously without custom orchestration code. EventBridge enables event-driven execution, while API Gateway supports synchronous request-response patterns.
This combination provides built-in security, observability, and scaling, while avoiding the operational burden of managing queues, databases, or custom workflow engines.
Option C introduces unnecessary orchestration complexity. Option D increases infrastructure management and cost. Option E stores agent state in S3, which is not suitable for low-latency, session-based reasoning.
Therefore, A and B together deliver the most scalable, secure, and low-overhead solution for production- grade reasoning agents on AWS.
NEW QUESTION # 33
A company provides a service that helps users from around the world discover new restaurants. The service has 50 million monthly active users. The company wants to implement a semantic search solution across a database that contains 20 million restaurants and 200 million reviews. The company currently stores the data in PostgreSQL.
The solution must support complex natural language queries and return results for at least 95% of queries within 500 ms. The solution must maintain data freshness for restaurant details that update hourly. The solution must also scale cost-effectively during peak usage periods.
Which solution will meet these requirements with the LEAST development effort?
- A. Migrate the restaurant data to Amazon OpenSearch Service. Implement keyword-based search rules that use custom analyzers and relevance tuning to find restaurants based on attributes such as cuisine type, features, and location. Create Amazon API Gateway HTTP API endpoints to transform user queries into structured search parameters.
- B. Migrate the restaurant data to Amazon OpenSearch Service. Use a foundation model (FM) in Amazon Bedrock to generate vector embeddings from restaurant descriptions, reviews, and menu items. When users submit natural language queries, convert the queries to embeddings by using the same FM.
Perform k-nearest neighbors (k-NN) searches to find semantically similar results. - C. Keep the restaurant data in PostgreSQL and implement a pgvector extension. Use a foundation model (FM) in Amazon Bedrock to generate vector embeddings from restaurant data. Store the vector embeddings directly in PostgreSQL. Create an AWS Lambda function to convert natural language queries to vector representations by using the same FM. Configure the Lambda function to perform similarity searches within the database.
- D. Migrate restaurant data to an Amazon Bedrock knowledge base by using a custom ingestion pipeline.Configure the knowledge base to automatically generate embeddings from restaurant information. Use the Amazon Bedrock Retrieve API with built-in vector search capabilities to query the knowledge base directly by using natural language input.
Answer: B
Explanation:
Option B best satisfies the requirements while minimizing development effort by combining managed semantic search capabilities with fully managed foundation models. AWS Generative AI guidance describes semantic search as a vector-based retrieval pattern where both documents and user queries are embedded into a shared vector space. Similarity search (such as k-nearest neighbors) then retrieves results based on meaning rather than exact keywords.
Amazon OpenSearch Service natively supports vector indexing and k-NN search at scale. This makes it well suited for large datasets such as 20 million restaurants and 200 million reviews while still achieving sub- second latency for the majority of queries. Because OpenSearch is a distributed, managed service, it automatically scales during peak traffic periods and provides cost-effective performance compared with building and tuning custom vector search pipelines on relational databases.
Using Amazon Bedrock to generate embeddings significantly reduces development complexity. AWS manages the foundation models, eliminates the need for custom model hosting, and ensures consistency by using the same FM for both document embeddings and query embeddings. This aligns directly with AWS- recommended semantic search architectures and removes the need for model lifecycle management.
Hourly updates to restaurant data can be handled efficiently through incremental re-indexing in OpenSearch without disrupting query performance. This approach cleanly separates transactional data storage from search workloads, which is a best practice in AWS architectures.
Option A does not meet the semantic search requirement because keyword-based search cannot reliably interpret complex natural language intent. Option C introduces scalability and performance risks by running large-scale vector similarity searches inside PostgreSQL, which increases operational complexity. Option D adds unnecessary ingestion and abstraction layers intended for retrieval-augmented generation, not high- throughput semantic search.
Therefore, Option B provides the optimal balance of performance, scalability, data freshness, and minimal development effort using AWS Generative AI services.
NEW QUESTION # 34
An ecommerce company is using Amazon Bedrock to build a generative AI (GenAI) application. The application uses AWS Step Functions to orchestrate a multi-agent workflow to produce detailed product descriptions. The workflow consists of three sequential states: a description generator, a technical specifications validator, and a brand voice consistency checker. Each state produces intermediate reasoning traces and outputs that are passed to the next state. The application uses an Amazon S3 bucket for process storage and to store outputs.
During testing, the company discovers that outputs between Step Functions states frequently exceed the 256 KB quota and cause workflow failures. A GenAI Developer needs to revise the application architecture to efficiently handle the Step Functions 256 KB quota and maintain workflow observability. The revised architecture must preserve the existing multi-agent reasoning and acting (ReAct) pattern.
Which solution will meet these requirements with the LEAST operational overhead?
- A. Configure an Amazon Bedrock integration to use the S3 bucket URI in the input parameters for large outputs. Use the ResultPath and ResultSelector fields to route S3 references between the agent steps while maintaining the sequential validation workflow.
- B. Store intermediate outputs in Amazon DynamoDB. Pass only references between states. Create a Map state that retrieves the complete data from DynamoDB when required for each agent's processing step.
- C. Use AWS Lambda functions to compress outputs to less than 256 KB before each agent state.
Configure each agent task to decompress outputs before processing and to compress results before passing them to the next state. - D. Configure a separate Step Functions state machine to handle each agent's processing. Use Amazon EventBridge to coordinate the execution flow between state machines. Use S3 references for the outputs as event data.
Answer: A
Explanation:
Option B is the best solution because it directly addresses the Step Functions 256 KB state payload quota by externalizing large intermediate artifacts to Amazon S3 and passing only lightweight references (URIs/keys) between states. This is a standard AWS pattern for workflows that produce large intermediate results, and it avoids introducing additional databases, compression logic, or cross-state-machine coordination that increases operational overhead.
In a multi-agent ReAct workflow, intermediate reasoning traces can be verbose and grow quickly as each agent produces chain-of-thought style artifacts, structured outputs, and supporting evidence. Step Functions is designed to orchestrate state transitions and pass JSON payloads, but large payloads should be stored outside the state machine and referenced by pointer values. Using Amazon S3 for intermediate outputs is operationally efficient because the application already uses S3 for storage, and S3 provides durable, low-cost storage with simple access patterns.
ResultPath and ResultSelector allow each state to store or reshape results so that only the required reference fields (such as s3Uri, object key, metadata, trace IDs) are forwarded to subsequent states. This preserves observability because the workflow can still log trace references, correlate steps with S3 objects, and store structured metadata for debugging. It also preserves the sequential validation design, keeping the existing ReAct pattern intact while preventing failures due to oversized payloads.
Option A adds additional services and read/write patterns that increase operational complexity. Option C introduces custom compression/decompression logic that is fragile, adds latency, and complicates troubleshooting. Option D increases orchestration overhead by splitting workflows and coordinating with events, which makes debugging harder and increases failure modes.
Therefore, Option B meets the payload limit requirement while keeping the architecture simple and observable.
NEW QUESTION # 35
An ecommerce company is developing a generative AI (GenAI) solution that uses Amazon Bedrock with Anthropic Claude to recommend products to customers. Customers report that some recommended products are not available for sale or are not relevant. Customers also report long response times for some recommendations.
The company confirms that most customer interactions are unique and that the solution recommends products not present in the product catalog.
Which solution will meet this requirement?
- A. Create an Amazon Bedrock Knowledge Bases and implement Retrieval Augmented Generation (RAG).
Set the PerformanceConfigLatency parameter to optimized. - B. Store product catalog data in Amazon OpenSearch Service. Validate model recommendations against the catalog. Use Amazon DynamoDB for response caching.
- C. Increase grounding within Amazon Bedrock Guardrails. Enable automated reasoning checks. Set up provisioned throughput.
- D. Use prompt engineering to restrict model responses to relevant products. Use streaming inference to reduce perceived latency.
Answer: A
Explanation:
Option C is the correct solution because it directly addresses both correctness and performance issues by grounding the model's responses in authoritative product data using Retrieval Augmented Generation.
Amazon Bedrock Knowledge Bases are designed to connect foundation models to trusted enterprise data sources, ensuring that generated responses are constrained to known, validated content.
By ingesting the product catalog into a knowledge base, the GenAI application retrieves only products that actually exist in the catalog. This prevents hallucinated or unavailable recommendations, which is a common issue when models rely solely on prompt instructions without retrieval grounding. RAG ensures that the model's output is based on retrieved facts rather than learned generalizations.
Setting the PerformanceConfigLatency parameter to optimized enables Bedrock to prioritize lower-latency retrieval and inference paths, improving responsiveness for real-time recommendation scenarios. This directly addresses the reported performance issues without requiring provisioned throughput or caching strategies that are ineffective for mostly unique interactions.
Option A improves safety and latency predictability but does not ensure recommendations are limited to valid products. Option B relies on prompt constraints, which are not sufficient to prevent hallucinations. Option D introduces additional validation and caching layers but increases complexity and does not improve generation relevance.
Therefore, Option C best resolves both relevance and latency challenges using AWS-native, low-maintenance GenAI integration patterns.
NEW QUESTION # 36
A company uses Amazon Bedrock to build a Retrieval Augmented Generation (RAG) system. The RAG system uses an Amazon Bedrock Knowledge Bases that is based on an Amazon S3 bucket as the data source for emergency news video content. The system retrieves transcripts, archived reports, and related documents from the S3 bucket.
The RAG system uses state-of-the-art embedding models and a high-performing retrieval setup. However, users report slow responses and irrelevant results, which cause decreased user satisfaction. The company notices that vector searches are evaluating too many documents across too many content types and over long periods of time.
The company determines that the underlying models will not benefit from additional fine-tuning. The company must improve retrieval accuracy by applying smarter constraints and wants a solution that requires minimal changes to the existing architecture.
Which solution will meet these requirements?
- A. Migrate to Amazon OpenSearch Service. Use vector fields and metadata filters to define the scope of results retrieval.
- B. Migrate to an Amazon Q Business index to perform structured metadata filtering and document categorization during retrieval.
- C. Enhance embeddings by using a domain-adapted model that is specifically trained on emergency news content for improved vector similarity.
- D. Enable metadata-aware filtering within the Amazon Bedrock knowledge base by indexing S3 object metadata.
Answer: D
Explanation:
Option C is the correct solution because it directly addresses the root cause of the problem-overly broad retrieval-while requiring minimal architectural change. Amazon Bedrock Knowledge Bases support metadata-aware filtering, which allows the system to constrain retrieval queries based on indexed metadata such as content type, publication date, source, or category.
By indexing Amazon S3 object metadata, the company can restrict vector searches to relevant subsets of the corpus, such as recent emergency reports, specific content formats, or trusted sources. This significantly reduces the number of documents evaluated during retrieval, which improves both latency and result relevance without changing embedding models or retrieval infrastructure.
This approach aligns with AWS best practices for optimizing RAG systems: when embeddings are already strong, retrieval quality is often improved by narrowing the candidate set rather than increasing model complexity. Metadata filtering reduces noise and ensures that retrieved documents are more contextually aligned with user queries.
Option A requires retraining or adapting embedding models, which the company has already determined will not provide additional benefit. Option B introduces a migration to OpenSearch, which adds operational overhead and deviates from the existing Bedrock knowledge base architecture. Option D requires moving to a different indexing service, increasing complexity and implementation effort.
Therefore, Option C provides the most effective and low-effort solution to improve retrieval accuracy and performance in the existing Amazon Bedrock RAG system.
NEW QUESTION # 37
A company is developing a generative AI (GenAI) application that analyzes customer service calls in real time and generates suggested responses for human customer service agents. The application must process
500,000 concurrent calls during peak hours with less than 200 ms end-to-end latency for each suggestion. The company uses existing architecture to transcribe customer call audio streams. The application must not exceed a predefined monthly compute budget and must maintain auto scaling capabilities.
Which solution will meet these requirements?
- A. Deploy a large, complex reasoning model on Amazon Bedrock. Purchase provisioned throughput and optimize for batch processing.
- B. Deploy a mid-sized language model on an Amazon SageMaker serverless endpoint that is optimized for batch processing.
- C. Deploy a low-latency, real-time optimized model on Amazon Bedrock. Purchase provisioned throughput and set up automatic scaling policies.
- D. Deploy a large language model (LLM) on an Amazon SageMaker real-time endpoint that uses dedicated GPU instances.
Answer: C
Explanation:
Option B is the correct solution because it aligns with AWS guidance for building high-throughput, ultra-low- latency GenAI applications while maintaining predictable costs and automatic scaling. Amazon Bedrock provides access to foundation models that are specifically optimized for real-time inference use cases, including conversational and recommendation-style workloads that require responses within milliseconds.
Low-latency models in Amazon Bedrock are designed to handle very high request rates with minimal per- request overhead. Purchasing provisioned throughput ensures that sufficient model capacity is reserved to handle peak loads, eliminating cold starts and reducing request queuing during traffic surges. This is critical when supporting up to 500,000 concurrent calls with strict latency requirements.
Automatic scaling policies allow the application to dynamically adjust capacity based on demand, ensuring cost efficiency during off-peak hours while maintaining performance during peak usage. This directly supports the requirement to stay within a predefined monthly compute budget.
Option A fails because batch processing and complex reasoning models introduce higher latency and are not suitable for real-time suggestions. Option C introduces significantly higher operational and cost overhead due to dedicated GPU instances and manual scaling responsibilities. Option D is optimized for batch workloads and cannot meet the sub-200 ms latency requirement.
Therefore, Option B provides the best balance of performance, scalability, cost control, and operational simplicity using AWS-native GenAI services.
NEW QUESTION # 38
An enterprise application uses an Amazon Bedrock foundation model (FM) to process and analyze 50 to 200 pages of technical documents. Users are experiencing inconsistent responses and receiving truncated outputs when processing documents that exceed the FM's context window limits.
Which solution will resolve this problem?
- A. Configure fixed-size chunking at 4,000 tokens for each chunk with 20% overlap. Use application-level logic to link multiple chunks sequentially until the FM's maximum context window of 200,000 tokens is reached before making inference calls.
- B. Use hierarchical chunking with parent chunks of 8,000 tokens and child chunks of 2,000 tokens. Use Amazon Bedrock Knowledge Bases built-in retrieval to automatically select relevant parent chunks based on query context. Configure overlap tokens to maintain semantic continuity.
- C. Use semantic chunking with a breakpoint percentile threshold of 95% and a buffer size of 3 sentences.
Use the RetrieveAndGenerate API to dynamically select the most relevant chunks based on embedding similarity scores. - D. Create a pre-processing AWS Lambda function that analyzes document token count by using the FM's tokenizer. Configure the Lambda function to split documents into equal segments that fit within 80% of the context window. Configure the Lambda function to process each segment independently before aggregating the results.
Answer: C
Explanation:
Option C directly addresses the root cause of truncated and inconsistent responses by using AWS- recommended semantic chunking and dynamic retrieval rather than static or sequential chunk processing.
Amazon Bedrock documentation emphasizes that foundation models have fixed context windows and that sending oversized or poorly structured input can lead to truncation, loss of context, and degraded output quality.
Semantic chunking breaks documents based on meaning instead of fixed token counts. By using a breakpoint percentile threshold and sentence buffers, the content remains coherent and semantically complete. This approach reduces the likelihood that important concepts are split across chunks, which is a common cause of inconsistent summarization results.
The RetrieveAndGenerate API is designed specifically to handle large documents that exceed a model's context window. Instead of forcing all content into a single inference call, the API generates embeddings for chunks and dynamically selects only the most relevant chunks based on similarity to the user query. This ensures that the FM receives only high-value context while staying within its context window limits.
Option A is ineffective because chaining chunks sequentially does not align with how FMs process context and risks exceeding context limits or introducing irrelevant information. Option B improves structure but still relies on larger parent chunks, which can lead to inefficiencies when processing very large documents. Option D processes segments independently, which often causes loss of global context and inconsistent summaries.
Therefore, Option C is the most robust, AWS-aligned solution for resolving truncation and consistency issues when processing large technical documents with Amazon Bedrock.
NEW QUESTION # 39
A company is creating a workflow to review customer-facing communications before the company sends the communications. The company uses a pre-defined message template to generate the communications and stores the communications in an Amazon S3 bucket. The workflow needs to capture a specific portion from the template and send it to an Amazon Bedrock model. The workflow must store model responses back to the original S3 bucket.
Which solution will meet these requirements?
- A. Create a flow in Amazon Bedrock Flows. Configure S3 action nodes at the beginning and end of the flow to retrieve and store the communications and the model responses. In the middle of the flow, configure an expression to parse each communication. Configure an agent step to send the parsed input to the model for review.
- B. Create an Amazon Bedrock agent that has an action group. Configure instructions to define how the agent should parse the communications. Configure the action group to retrieve the communications from the S3 bucket, invoke the Amazon Bedrock model, and store the model responses back to the S3 bucket.
- C. Create an AWS Step Functions Express workflow state machine. Use an Amazon S3 integration GetObject step to retrieve the original communications. Use an intrinsic function Pass step to parse the communications and to pass the results to an Amazon Bedrock InvokeModel step. Configure an Amazon S3 integration PutObject step to store the model responses back to the S3 bucket.
- D. Create an Amazon Bedrock agent that has a single action group. Configure three AWS Lambda functions in the action group. Configure the functions to retrieve the communications from the S3 bucket, parse the communications and invoke the Amazon Bedrock model, and store the model responses back to the S3 bucket.
Answer: A
Explanation:
Option A is the correct answer because Amazon Bedrock Flows is purpose-built to orchestrate generative AI workflows that combine data access, deterministic transformations, and model invocation with minimal operational overhead. The requirements explicitly state that the workflow must retrieve content from Amazon S3, extract a specific portion of a predefined template, send that portion to an Amazon Bedrock model, and store the model's response back into the same S3 bucket. Amazon Bedrock Flows natively supports all of these steps.
By configuring S3 action nodes at the beginning and end of the flow, the workflow can retrieve the original communications and persist the reviewed output without custom code. The expression step allows deterministic parsing of a specific portion of the template, which is essential when only part of the message should be reviewed. This avoids relying on generative logic for parsing, which would be less predictable and harder to audit. The agent step is then used specifically for the review task, where the foundation model evaluates or modifies the extracted content.
Option B uses AWS Step Functions, which can achieve similar outcomes but requires more explicit orchestration logic and does not provide GenAI-native constructs such as expressions and agent steps in a single managed experience. Options C and D rely on Amazon Bedrock agents and AWS Lambda functions to handle parsing and data movement, which increases complexity, operational overhead, and maintenance burden.
Because Amazon Bedrock Flows directly integrates S3 actions, parsing expressions, and model review steps in a single managed workflow, Option A best meets the requirements with the least development and operational effort.
NEW QUESTION # 40
A financial services company uses an AI application to process financial documents by using Amazon Bedrock. During business hours, the application handles approximately 10,000 requests each hour, which requires consistent throughput.
The company uses the CreateProvisionedModelThroughput API to purchase provisioned throughput. Amazon CloudWatch metrics show that the provisioned capacity is unused while on-demand requests are being throttled. The company finds the following code in the application:
response = bedrock_runtime.invoke_model(
modelId="anthropic.claude-v2",
body=json.dumps(payload)
)
The company needs the application to use the provisioned throughput and to resolve the throttling issues.
Which solution will meet these requirements?
- A. Increase the number of model units (MUs) in the provisioned throughput configuration.
- B. Add exponential backoff retry logic to handle throttling exceptions during peak hours.
- C. Replace the model ID parameter with the ARN of the provisioned model that the CreateProvisionedModelThroughput API returns.
- D. Modify the application to use the invokeModelWithResponseStream API instead of the invokeModel API.
Answer: C
Explanation:
Option B is the correct solution because Amazon Bedrock provisioned throughput is only used when the application explicitly invokes the provisioned model ARN, not the base foundation model ID. In the provided code, the application is calling the standard model identifier (anthropic.claude-v2), which routes requests to on-demand capacity instead of the purchased provisioned throughput.
When the CreateProvisionedModelThroughput API is used, Amazon Bedrock returns a provisioned model ARN that represents the reserved capacity. Applications must reference this ARN in the modelId parameter when invoking the model. If the base model ID is used instead, Bedrock treats the request as on-demand traffic, which explains why CloudWatch metrics show unused provisioned capacity alongside throttled on- demand requests.
Option A would increase capacity but would not fix the root cause because the application is not using the provisioned resource at all. Option C adds resiliency but does not ensure usage of provisioned throughput and would still incur throttling. Option D changes the response delivery mechanism but does not affect capacity routing.
Therefore, Option B directly resolves the throttling issue by correctly routing traffic to the reserved capacity and ensures that the company benefits from the provisioned throughput it has purchased.
NEW QUESTION # 41
......
Use Valid Exam AIP-C01 by ValidTorrent Books For Free Website: https://troytec.validtorrent.com/AIP-C01-valid-exam-torrent.html