
AWS AI Practitioner Exam Notes


1. Fundamentals of Machine Learning and Artificial Intelligence


  • Artificial intelligence (AI) - AI is a broad field that encompasses the development of intelligent systems capable of performing tasks that typically require human intelligence, such as perception, reasoning, learning, problem-solving, and decision-making.

  • Machine learning (ML) - ML is a branch of AI focused on understanding and building methods that make it possible for machines to learn from data.

  • Deep learning (DL) - Deep learning uses the concept of neurons and synapses, similar to how our brain is wired. For example, Amazon Rekognition is built on deep learning.

  • Generative AI - Generative AI is a subset of deep learning. It can adapt models built using deep learning to a broad range of tasks, often without retraining or fine-tuning.


Machine Learning Fundamentals

Building a machine learning model involves:

  • data collection and preparation,

  • selecting an appropriate algorithm,

  • training the model on the prepared data, and

  • evaluating its performance through testing and iteration.


Types of data

  • Labeled data is a dataset where each instance or example is accompanied by a label or target variable that represents the desired output or classification.

  • Unlabeled data is a dataset where the instances or examples do not have any associated labels or target variables

  • Structured data refers to data that is organized and formatted in a predefined manner, typically in the form of tables or databases with rows and columns.

  • Unstructured data is data that lacks a predefined structure or format, such as text, images, audio, and video.


ML learning process

  • In supervised learning, the algorithms are trained on labeled data. The goal is to learn a mapping function that can predict the output for new, unseen input data.

  • Unsupervised learning refers to algorithms that learn from unlabeled data. The goal is to discover inherent patterns, structures, or relationships within the input data.

  • In reinforcement learning, the machine is given only a performance score as guidance. Feedback is provided in the form of rewards or penalties for its actions.

  • In semi-supervised learning, only a portion of the training data is labeled, and the model learns from both the labeled and unlabeled examples.


Inferencing - The process of using the information that a model has learned to make predictions or decisions

  • Batch inferencing is when the computer takes a large amount of data, such as images or text, and analyzes it all at once to provide a set of results. It suits cases where the speed of the decision-making process is not as crucial as the accuracy.

  • Real-time inferencing is when the computer has to make decisions quickly, in response to new information as it comes in.


Deep Learning Fundamentals

  • The field of deep learning is inspired by the structure and function of the brain. It involves the use of artificial neural networks, which are computational models that are designed to mimic the way the human brain processes information.

  • Neural networks - Neural networks have lots of tiny units called nodes that are connected together. These nodes are organized into layers. The layers include an input layer, one or more hidden layers, and an output layer.

    • Applications - Computer vision, Natural language processing (NLP)
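
The layered structure above can be sketched as a forward pass through a tiny network with one hidden layer. All weights here are made-up values, purely for illustration:

```python
import math

def forward(x, w_hidden, b_hidden, w_out, b_out):
    """One forward pass: input layer -> hidden layer (sigmoid) -> output layer."""
    hidden = [
        1 / (1 + math.exp(-(sum(xi * wi for xi, wi in zip(x, weights)) + b)))
        for weights, b in zip(w_hidden, b_hidden)
    ]
    return sum(hi * wi for hi, wi in zip(hidden, w_out)) + b_out

# Toy network: 2 inputs, 2 hidden nodes, 1 output (illustrative weights).
w_hidden = [[0.5, -0.4], [0.3, 0.8]]   # one weight list per hidden node
b_hidden = [0.1, -0.2]
w_out = [1.0, -1.0]
b_out = 0.05

y = forward([1.0, 2.0], w_hidden, b_hidden, w_out, b_out)
print(round(y, 4))
```

Training a real network consists of adjusting these weights until the outputs match the training labels.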


Generative AI Fundamentals

  • Generative AI is powered by models that are pretrained on internet-scale data, and these models are called foundation models (FMs)

  • Unlabeled data > Pretrain > Foundation model > Adapt > Broad range of general tasks

  • Foundation Model Life Cycle:

    Data selection > Pre-training > Optimization > Evaluation > Deployment > Feedback and continuous improvement

  • Amazon Bedrock provides access to a choice of high-performing FMs from leading AI companies


Large language models - Large language models (LLMs) can be based on a variety of architectures, but the most common architecture in today's state-of-the-art models is the transformer architecture. They can understand and generate human-like text.

  • Tokens are the basic units of text that the model processes. Tokens can be words, phrases, or individual characters like a period.

  • Embeddings are numerical representations of tokens, where each token is assigned a vector (a list of numbers) that captures its meaning and relationships with other tokens. These vectors are learned during the training process
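
As a sketch of the idea: related tokens end up with embedding vectors pointing in similar directions, which can be measured with cosine similarity. The vectors below are made up, not taken from any real model:

```python
import math

# Hypothetical 4-dimensional embeddings for three tokens (illustrative values).
embeddings = {
    "cat": [0.8, 0.1, 0.6, 0.2],
    "dog": [0.7, 0.2, 0.5, 0.3],
    "car": [0.1, 0.9, 0.2, 0.7],
}

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

print(cosine_similarity(embeddings["cat"], embeddings["dog"]))  # high
print(cosine_similarity(embeddings["cat"], embeddings["car"]))  # lower
```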

  • Diffusion models - Diffusion is a deep learning architecture that starts with pure noise or random data. The models gradually add more and more meaningful information to this noise until they end up with a clear and coherent output, like an image or a piece of text.

    • Diffusion models learn through a two-step process of forward diffusion and reverse diffusion.

  • Multimodal models - multimodal models can process and generate multiple modes of data simultaneously - text, image, video etc

  • Generative adversarial networks (GANs) - GANs are a type of generative model that involves two neural networks competing against each other in a zero-sum game framework. The two networks are generator and discriminator.

    • During training, the generator tries to generate data that can fool the discriminator into thinking it's real, while the discriminator tries to correctly classify the real and generated data. This adversarial process continues until the generator produces data that is indistinguishable from the real data.

  • Variational autoencoders (VAEs) - VAEs are a type of generative model that combines ideas from autoencoders (a type of neural network) and variational inference (a technique from Bayesian statistics). In a VAE, the model consists of two parts: Encoder, Decoder


Optimizing model outputs

  • Prompt engineering - Prompts act as instructions for foundation models

    • Instructions: This is a task for the FM to do. It provides a task description or instruction for how the model should perform.

    • Context: This is external information to guide the model.

    • Input data: This is the input for which you want a response.

    • Output indicator: This is the output type or format.
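
These four elements can be assembled into one prompt string. The helper below is hypothetical and only illustrates the structure:

```python
# Hypothetical helper that assembles the four prompt elements into one string.
def build_prompt(instructions, context, input_data, output_indicator):
    return (
        f"{instructions}\n\n"
        f"Context: {context}\n\n"
        f"Input: {input_data}\n\n"
        f"{output_indicator}"
    )

prompt = build_prompt(
    instructions="Summarize the customer review below.",
    context="Reviews are for a wireless headphone product.",
    input_data="Great sound, but the battery dies after two hours.",
    output_indicator="Respond with a single sentence.",
)
print(prompt)
```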

  • Fine-tuning - Fine-tuning is a supervised learning process that involves taking a pre-trained model and training it further on smaller, task-specific datasets.

    • Training on these narrower datasets modifies the model's weights to better align with the task.

    • There are two ways to fine-tune a model: instruction fine-tuning and reinforcement learning from human feedback (RLHF).

  • Retrieval-augmented generation - RAG is a technique that supplies domain-relevant data as context to produce responses based on that data.

    • RAG will not change the weights of the foundation model, whereas fine-tuning will change model weights.

    • Weights, also known as coefficients or parameters, are the strength or amplitude of the connections between neurons in different neural network layers. They determine how much influence the input will have on the output.
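
A minimal sketch of the RAG flow: retrieve the most relevant document, then supply it as context in the prompt. Keyword overlap stands in here for the vector search over embeddings that a real system would use:

```python
# Toy document store (made-up content).
documents = [
    "Our return policy allows refunds within 30 days of purchase.",
    "Shipping is free on orders over 50 dollars.",
    "Support is available by chat from 9am to 5pm.",
]

def retrieve(query, docs):
    """Pick the document sharing the most words with the query."""
    q = set(query.lower().split())
    return max(docs, key=lambda d: len(q & set(d.lower().split())))

query = "What is the return policy for refunds?"
context = retrieve(query, documents)

# The FM's weights stay unchanged; only the prompt is augmented.
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
print(prompt)
```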


AWS AI/ML Services Stack


  • Amazon SageMaker AI - With SageMaker AI, you can build, train, and deploy ML models for any use case with fully managed infrastructure, tools, and workflows.

  • Amazon Comprehend - Amazon Comprehend uses ML and natural language processing (NLP) to help you uncover the insights and relationships in your unstructured data.

  • Amazon Translate is a neural machine translation service that delivers fast, high-quality, and affordable language translation.

  • Amazon Textract is a service that automatically extracts text and data from scanned documents.

  • Amazon Lex is a fully managed AI service to design, build, test, and deploy conversational interfaces into any application using voice and text.

  • Amazon Polly is a service that turns text into lifelike speech. Amazon Polly lets you create applications that talk, so you can build entirely new categories of speech-enabled products.

  • Amazon Transcribe is an automatic speech recognition (ASR) service for automatically converting speech to text.

  • Amazon Rekognition facilitates adding image and video analysis to your applications.

  • Amazon Kendra is an intelligent search service powered by ML. Amazon Kendra reimagines enterprise search for your websites and applications.

  • Amazon Personalize is an ML service that developers can use to create individualized recommendations for customers who use their applications.

  • AWS DeepRacer is a 1/18th scale race car that gives you an interesting and fun way to get started with reinforcement learning (RL).

  • SageMaker JumpStart helps you quickly get started with ML. To facilitate getting started, SageMaker JumpStart provides a set of solutions for the most common use cases, which can be rapidly deployed.

  • Amazon Bedrock is a fully managed service that makes FMs from Amazon and leading AI startups available through an API.

  • Amazon Q can help you get fast, relevant answers to pressing questions, solve problems, generate content, and take actions using the data and expertise found in your company's information repositories, code, and enterprise systems.

  • Amazon Q Developer - Designed to improve developer productivity, Amazon Q Developer provides ML–powered code recommendations to accelerate development of C#, Java, JavaScript, Python, and TypeScript applications.
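
As a hedged sketch of calling a Bedrock FM through the API: the request body below follows the Anthropic Messages format used on Bedrock, but the exact schema and model ID vary by provider, and the `invoke` helper (which requires AWS credentials and Bedrock model access) is illustrative only:

```python
import json

def build_claude_body(prompt, max_tokens=200):
    """Build a request body in the Anthropic Messages format used on Bedrock."""
    return json.dumps({
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": prompt}],
    })

def invoke(prompt, model_id="anthropic.claude-3-haiku-20240307-v1:0"):
    # Requires AWS credentials and model access; not run in this sketch.
    import boto3
    client = boto3.client("bedrock-runtime")
    response = client.invoke_model(modelId=model_id, body=build_claude_body(prompt))
    return json.loads(response["body"].read())

body = json.loads(build_claude_body("Explain foundation models in one sentence."))
print(body["messages"][0]["role"])
```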


Advantages and benefits of AWS AI solutions

  1. Accelerated development and deployment

  2. Scalability and cost optimization

  3. Flexibility and access to models

  4. Integration with AWS tools and services


Cost considerations - Responsiveness and availability, Redundancy and Regional coverage, Performance, Token-based pricing, Provisioned throughput, and Custom models


2. Exploring Artificial Intelligence Use Cases and Applications


  • Computer vision - Computer vision is a field of artificial intelligence that allows computers to interpret and understand digital images and videos

  • Natural language processing - NLP is a branch of artificial intelligence that deals with the interaction between computers and human languages.

  • Intelligent document processing - IDP is an application that extracts and classifies information from unstructured data, generates summaries, and provides actionable insights.

  • Fraud detection refers to the process of identifying and preventing fraudulent activities or unauthorized behavior with a system, process, or transaction.

  • When are AI and ML appropriate solutions?

    • Coding the rules is challenging

    • Scale of the project is challenging

    • You do not need ML if you can determine a target value using simple rules, computations, or predetermined steps


Supervised learning Use cases


This is a popular type of ML because it’s widely applicable. It’s called supervised learning because there needs to be a supervisor: the labeled training data.

  • Supervised learning has two subcategories—classification and regression.

  • Classification is a supervised learning technique used to assign labels or categories to new, unseen data instances based on a trained model.

    • The model is trained on a labeled dataset, where each instance is already assigned to a known class or category.

    • Use cases include the following: Fraud detection, Image classification, Customer retention, Diagnostics

  • Regression is a supervised learning technique used for predicting continuous or numerical values based on one or more input variables.

    • It is used to model the relationship between a dependent variable (the value to be predicted) and one or more independent variables (the features or inputs used for prediction).

    • Use cases include the following: Advertising popularity prediction, Weather forecasting, Market forecasting, Estimating life expectancy, Population growth prediction
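
A minimal regression example: fitting y = ax + b by least squares on a made-up dataset, then predicting a continuous value for an unseen input:

```python
# Toy dataset, roughly y = 2x (illustrative values).
xs = [1, 2, 3, 4, 5]
ys = [2.1, 4.0, 6.2, 7.9, 10.1]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Closed-form least-squares slope and intercept.
a = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
    sum((x - mean_x) ** 2 for x in xs)
b = mean_y - a * mean_x

print(round(a, 3), round(b, 3))   # slope close to 2
print(round(a * 6 + b, 2))        # predict a continuous value for unseen x = 6
```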


Unsupervised learning use cases

  • Two main subcategories of unsupervised learning are clustering and dimensionality reduction.

  • Clustering - This kind of algorithm groups data into different clusters based on similar features or distances between data points to better understand the attributes of a specific cluster.

    • Use cases include the following: Customer segmentation, Targeted marketing, Recommender systems

  • Dimensionality reduction is an unsupervised learning technique used to reduce the number of features or dimensions in a dataset while preserving the most important information or patterns.

    • Use cases include the following: Big data visualization, Meaningful compression, Structure discovery, Feature elicitation


Reinforcement learning use case

  • In reinforcement learning, an agent continuously learns through trial and error as it interacts in an environment.

  • Use cases include autonomous vehicles


Capabilities of Generative AI

  • Features of Gen AI - Adaptability, Responsiveness, Simplicity, Creativity and exploration, Data efficiency, Personalization, Scalability

  • Challenges of Generative AI - Regulatory violations, Social risks, Data security and privacy concerns, Toxicity, Hallucinations, Interpretability, Nondeterminism, Plagiarism and cheating, Disruption of the nature of work

  • Some of the key factors to consider when selecting an appropriate generative AI model include the following: Model types, Performance requirements, Capabilities, Constraints, Compliance, Cost

  • Business Metrics for Generative AI - User satisfaction, Average revenue per user, Cross-domain performance, Conversion rate, Efficiency.


3. Responsible Artificial Intelligence Practices


  • What is responsible AI? Responsible AI refers to practices and principles that ensure that AI systems are transparent and trustworthy while mitigating potential risks and negative outcomes.

  • To operate AI responsibly, companies should proactively ensure the following about their system:

    • It is fully transparent and accountable, with monitoring and oversight mechanisms in place.

    • It is managed by a leadership team accountable for responsible AI strategies.

    • It is developed by teams with expertise in responsible AI principles and practices.

    • It is built following responsible AI guidelines.


Responsible AI Challenges in Traditional AI and Generative AI

  • Biases in AI systems: The number one problem that developers face in AI applications is accuracy.

  • Bias in a model means that the model is missing important features of the dataset, so its representation of the data is too simple.

    • Bias is measured by the difference between the expected predictions of the model and the true values we are trying to predict.

    • When a model has a high bias, it is underfitted. Underfitted means that the model is not capturing enough difference in the features of the data, and therefore, the model performs poorly on the training data.

  • Variance refers to the model's sensitivity to fluctuations or noise in the training data.

    • When you introduce new data to the model, the model's accuracy drops. This is because the new data can have different features that the model is not trained on. This introduces the problem of overfitting.

    • Overfitting is when model performs well on the training data but does not perform well on the evaluation data.

  • Overfitting occurs when a model is too complex, learning noise in training data, resulting in low training error but high test error (low generalization).

  • Underfitting occurs when a model is too simple, missing underlying patterns, resulting in poor performance on both training and test sets.

  • To help overcome bias and variance errors, you can use the following:

    • Cross-validation is a technique for evaluating ML models by training several ML models on subsets of the available input data and evaluating them on the complementary subset of the data. Cross-validation should be used to detect overfitting.

    • Increase data - Add more data samples to increase the learning scope of the model.

    • Regularization is a method that penalizes extreme weight values to help prevent linear models from overfitting training data examples.

    • Simpler models - Use simpler model architectures to help with overfitting. If the model is underfitting, the model might be too simple.

    • Dimension reduction is an unsupervised machine learning algorithm that attempts to reduce the dimensionality (number of features) within a dataset while still retaining as much information as possible.

    • Stop training early - End training early so that the model does not memorize the data.
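
The cross-validation idea above can be sketched as a k-fold split, where each sample serves once in the validation fold and k-1 times in training:

```python
# k-fold cross-validation split sketch (pure Python, toy data).
def k_fold_splits(data, k=5):
    folds = [data[i::k] for i in range(k)]          # k roughly equal folds
    for i in range(k):
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield train, folds[i]                       # (training set, validation fold)

data = list(range(10))
for train, val in k_fold_splits(data, k=5):
    print(len(train), len(val))   # 8 2 on every iteration
```

A model would be trained on each training set and scored on the matching validation fold; widely divergent fold scores are a sign of overfitting.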


Core dimensions of responsible AI

  • The core dimensions of responsible AI include fairness, explainability, privacy and security, robustness, governance, transparency, safety, and controllability.

  • Fairness - With fairness, AI systems promote inclusion, prevent discrimination, uphold responsible values and legal norms, and build trust with society.

  • Explainability refers to the ability of an AI model to clearly explain or provide justification for its internal mechanisms and decisions so that it is understandable to humans.

  • Privacy and security in responsible AI refers to data that is protected from theft and exposure.

    • At the privacy level, individuals control when and if their data can be used.

    • At the security level, the system verifies that no unauthorized systems or unauthorized users will have access to the individual’s data.

  • Transparency communicates information about an AI system so stakeholders can make informed choices about their use of the system.

  • Veracity and robustness in AI refers to the mechanisms to ensure an AI system operates reliably, even with unexpected situations, uncertainty, and errors.

  • Governance is a set of processes that are used to define, implement, and enforce responsible AI practices within an organization.

  • Safety in responsible AI refers to the development of algorithms, models, and systems in such a way that they are responsible, safe, and beneficial for individuals and society as a whole.

  • Controllability in responsible AI refers to the ability to monitor and guide an AI system's behavior to align with human values and intent.


Reviewing Amazon service tools for responsible AI

  • With Model evaluation on Amazon Bedrock, you can evaluate, compare, and select the best foundation model for your use case in just a few clicks.

    • Amazon Bedrock offers a choice of automatic evaluation and human evaluation.

  • SageMaker AI Clarify supports FM evaluation. You can automatically evaluate FMs for your generative AI use case with metrics such as accuracy, robustness, and toxicity to support your responsible AI initiative.

    • SageMaker AI Clarify helps identify potential bias in machine learning models and datasets without the need for extensive coding.

  • With Guardrails for Amazon Bedrock, you can implement safeguards for your generative AI applications based on your use cases and responsible AI policies.

    • Consistent level of AI safety

    • Block undesirable topics

    • Filter harmful content

    • Redact PII to protect user privacy

  • Amazon SageMaker Data Wrangler to balance your data in cases of any imbalances.

    • This addresses the problem where machine learning models might be biased toward the majority class in a skewed dataset.

    • SageMaker Data Wrangler offers three balancing operators: random undersampling, random oversampling, and Synthetic Minority Oversampling Technique (SMOTE) to rebalance data in your unbalanced datasets.
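
A sketch of the simplest of these operators, random oversampling, on a made-up imbalanced dataset. SMOTE would instead synthesize new minority samples by interpolating between existing ones:

```python
import random

# Skewed toy dataset: 5 minority (fraud) rows vs 95 majority (legit) rows.
random.seed(0)
dataset = [("fraud", 1)] * 5 + [("legit", 0)] * 95

minority = [row for row in dataset if row[1] == 1]
majority = [row for row in dataset if row[1] == 0]

# Duplicate minority rows at random until both classes are the same size.
balanced = majority + minority + random.choices(minority, k=len(majority) - len(minority))

print(sum(1 for _, label in balanced if label == 1))  # 95
print(sum(1 for _, label in balanced if label == 0))  # 95
```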

  • Amazon SageMaker Model Monitor monitors the quality of SageMaker AI machine learning models in production.

  • Amazon Augmented AI (Amazon A2I) is a service that helps build the workflows required for human review of ML predictions.

  • Governance improvement: SageMaker AI provides purpose-built governance tools to help you implement AI responsibly

    • Amazon SageMaker Role Manager: With SageMaker Role Manager, administrators can define minimum permissions in minutes.

    • Amazon SageMaker Model Cards: With SageMaker Model Cards, you can capture, retrieve, and share essential model information, such as intended uses, risk ratings, and training details, from conception to deployment.

    • Amazon SageMaker Model Dashboard: With SageMaker Model Dashboard, you can keep your team informed on model behavior in production, all in one place

  • Providing transparency

    • AWS AI Service Cards are a new resource to help you better understand AWS AI services.

    • Provides a single place to find information on the intended use cases and limitations, responsible AI design choices, and deployment and performance optimization best practices for AWS AI services.


Responsible Considerations to Select a Model

  • Remember that you can use Model evaluation on Amazon Bedrock or SageMaker AI Clarify to evaluate models for accuracy, robustness, toxicity, or nuanced content that requires human judgment.

  • Define application use case narrowly - This is important because you can tune your model for that specific use case.

  • Choosing a model based on performance - Consider a model based on performance with test datasets.

    • Model performance varies across a number of factors: Level of customization, Model size, Inference options, Licensing agreements, Context windows, Latency.

  • Choosing a model based on sustainability concerns -

    • Responsible agency considerations for selecting a model: Value alignment, Responsible reasoning skills, Appropriate level of autonomy, Transparency and accountability.

    • Environmental considerations for selecting a model: Energy consumption, Resource utilization, Environmental impact assessment

  • Economic considerations for selecting a model - Economic considerations in responsible AI include the potential benefits and costs of AI technologies and the impact on jobs and the economy.


Responsible Preparation for Datasets

  • Balanced datasets are important for creating responsible AI models that do not unfairly discriminate or exhibit unwanted biases.

  • The main steps of curating data include data preprocessing, data augmentation, and regular auditing.

  • Balance your data for the intended use case.

  • Data augmentation can be used to generate new instances of underrepresented groups.


Transparent and Explainable Models

  • Transparency answers the question HOW, and explainability answers the question WHY. Both aspects are needed to build responsible AI systems because they provide the following:

    • Increased trust

    • Easier to debug and optimize for improvements

    • Better understanding of the data and the model's decision-making process

  • Here are some potential solutions for increasing transparency and explainability in AI systems to help ensure responsible AI development.

    • There are several explainability frameworks available, such as SHapley Additive exPlanations (SHAP), Local Interpretable Model-agnostic Explanations (LIME), and Counterfactual Explanations

    • Transparent documentation - Maintain clear and comprehensive documentation of the AI system's architecture, data sources, training processes, and underlying assumptions.

    • Monitoring and auditing - AI systems should be monitored and audited to ensure that they are functioning as intended and not exhibiting bias or discriminatory behavior.

    • Human oversight and involvement

    • Counterfactual explanations - Provide counterfactual explanations that show how the output would change if certain input features were different to help users understand the model's behavior and reasoning

    • User interface explanations - Design user interfaces that provide clear and understandable explanations of the AI system's outputs, rationale, and limitations to end-users, so they can make informed decisions


  • AWS tools for transparency: AWS AI Service Cards, Amazon SageMaker Model Cards

    • AWS AI Service Cards are transparent documentation that AWS provides concerning the AI tools that they offer to customers to use. They are not customizable.

    • The developer can use SageMaker Model Cards to provide transparency to models that they build and train.

  • AWS tools for explainability: SageMaker AI Clarify, SageMaker Autopilot

  • Interpretability is a feature of model transparency. Interpretability is the degree to which a human can understand the cause of a decision.

  • Higher controllability provides more transparency into the model and allows correcting undesired biases and outputs.

  • Safety and transparency trade-offs, Interpretability trade-offs

  • The following are key principles of human-centered design for explainable AI:

    • Design for amplified decision-making - clarity, simplicity, usability, reflexivity, and accountability.

    • Design for unbiased decision-making - transparency, fairness, and training.

    • Design for human and AI learning - cognitive apprenticeship, personalization, and user-centered design.

  • SageMaker Ground Truth offers the most comprehensive set of human-in-the-loop capabilities for incorporating human feedback across the ML lifecycle to improve model accuracy and relevancy.


4. Developing Machine Learning Solutions


ML development lifecycle - The end-to-end machine learning lifecycle process includes the following phases:

  1. Business goal identification

  2. ML problem framing

  3. Data processing (data collection, data preprocessing, and feature engineering)

  4. Model development (training, tuning, and evaluation)

  5. Model deployment (inference and prediction)

  6. Model monitoring

  7. Model retraining

  8. Define business goals - ML starts with a business objective. Business stakeholders define the value, budget, and success criteria.

  9. ML problem framing - The problem formulation entails articulating the business problem and converting it into a machine learning problem.

  10. Data processing - To train an accurate ML model, developers use data processing to convert data into a usable format. These steps include:

    • Data collection and integration: Ensures the raw data is in one centrally accessible place.

    • Data preprocessing and visualization: Transforms raw data into an understandable format.

    • Feature engineering: The process of creating, transforming, extracting, and selecting variables from data.

  11. Model development - Model development consists of model training, tuning, and evaluation.

  12. Retrain - If the model doesn't meet the business goals, it's necessary to take a second look at the data and features to identify ways to improve the model.

  13. Deployment - The model is deployed so that applications can make predictions and inferences against it.

  14. Monitoring - The model monitoring system ensures the model is maintaining a desired level of performance through early detection and mitigation


Amazon SageMaker AI

  • Amazon SageMaker AI is a fully managed ML service.

  • In a single unified visual interface, you can perform the following tasks:

    • Collect and prepare data.

    • Build and train machine learning models.

    • Deploy the models and monitor the performance of their predictions.

  • Amazon SageMaker Studio is the recommended option to access SageMaker AI. It is a web-based UI that provides access to all SageMaker AI environments and resources.

Collecting, analyzing, and preparing your data

  • Amazon SageMaker Data Wrangler is a low-code no-code (LCNC) tool. It provides an end-to-end solution to import, prepare, transform, featurize, and analyze data by using a web interface.

  • For more advanced users and data preparation at scale, Amazon SageMaker Studio Classic comes with built-in integration of Amazon EMR and AWS Glue interactive sessions to handle large-scale interactive data preparation.

  • Finally, by using the SageMaker Processing API, customers can run scripts and notebooks to process, transform, and analyze datasets.

Managing Features

  • Amazon SageMaker Feature Store helps data scientists, machine learning engineers, and general practitioners to create, share, and manage features for ML development.

Model training and evaluation

  • SageMaker AI provides a training job feature to train and deploy models using built-in algorithms or custom algorithms.

  • LCNC option - with SageMaker Canvas, they can use machine learning to generate predictions without needing to write any code.

  • Amazon SageMaker JumpStart provides pretrained, open source models that customers can use for a wide range of problem types.

Model evaluation

  • Amazon SageMaker Experiments - experiment with multiple combinations of data, algorithms, and parameters, all while observing the impact of incremental changes on model accuracy.

  • Amazon SageMaker Automatic Model Tuning does Hyperparameter tuning by running many jobs with different hyperparameters in combination and measuring each of them by a metric that you choose.
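
The tuning loop can be sketched as a random search: sample hyperparameter combinations, score each with a chosen metric, and keep the best. The scoring function below is a made-up stand-in for training and validating a real model:

```python
import random

random.seed(42)

def validation_score(learning_rate, depth):
    # Hypothetical validation metric that peaks near lr=0.1, depth=6.
    return 1.0 - abs(learning_rate - 0.1) - 0.02 * abs(depth - 6)

best = None
for _ in range(20):
    # Sample one hyperparameter combination from the search space.
    params = {"learning_rate": random.uniform(0.001, 0.5),
              "depth": random.randint(2, 12)}
    score = validation_score(**params)
    if best is None or score > best[0]:
        best = (score, params)

print(best[1])   # best hyperparameters found in 20 trials
```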

Deployment

  • With SageMaker AI, customers can deploy their ML models to make predictions, also known as inference.

Monitoring

  • With Amazon SageMaker Model Monitor, customers can observe the quality of SageMaker ML models in production.


SageMaker AI built-in algorithms

  • Supervised AI learning - Classification & Regression

  • Unsupervised learning - Clustering, Topic Modeling, Embeddings, Anomaly Detection, Dimensionality Reduction

  • Image Processing - Image Classification, Object Detection, Semantic Segmentation

  • Text - Text Classification, Word2Vec, Machine Translation, Topic Modeling

  • Time series and Speech


SageMaker JumpStart - With SageMaker JumpStart, you can deploy, fine-tune, and evaluate pre-trained models from the most popular model hubs.


ML Models Performance Evaluation

  • Evaluation occurs after a model is trained. The data you use is partitioned into three parts: training set (80%), validation set (10%), and test set (10%).

    • The training set is used to train the model.

    • The validation and test sets are the ones that you will use to evaluate the trained model performance.
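
The 80/10/10 partition can be sketched as a shuffle followed by slicing:

```python
import random

# Toy dataset of 100 samples; shuffle once, then slice 80/10/10.
random.seed(7)
data = list(range(100))
random.shuffle(data)

n = len(data)
train = data[: int(0.8 * n)]
validation = data[int(0.8 * n): int(0.9 * n)]
test = data[int(0.9 * n):]

print(len(train), len(validation), len(test))  # 80 10 10
```

Shuffling before slicing matters: it keeps each partition representative of the whole dataset.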

  • Model fit - Model fit is important for understanding the root cause of poor model accuracy.

    • Overfitting is when the model performs well on the training data but does not perform well on the evaluation data. This is because the model memorized the data it has seen and is unable to generalize to unseen examples.

    • Underfitting is when the model performs poorly on the training data. This is because the model is unable to capture the relationship between the input examples (often called X) and the target values (often called Y).

    • The model is balanced when it is not overfit or underfit to the training data.

  • In ML, the ideal algorithm has low bias and can accurately model the true relationship.

  • The ideal algorithm also has low variability, by producing consistent predictions across different datasets.


Classification Metrics - Accuracy, Precision, Recall, F1, AUC-ROC

  • A confusion matrix can help classify why and how a model gets something wrong.

    • True Positive, False Negative, False Positive, True Negative

    • To calculate the model’s accuracy, also known as its score, add up the correct predictions and then divide that number by the total number of predictions.

    • This metric is less effective when there are a lot of true negative cases in your dataset. This is why two other metrics are often used in these situations: precision and recall.

  • Precision - Precision removes the negative predictions from the picture. Precision is the proportion of positive predictions that are actually correct.

  • Recall - Recall is the proportion of actual positive cases that the model correctly identifies as positive. It gives you an idea of how good the algorithm is at detecting, for example, cats.
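
These metrics follow directly from the four confusion-matrix counts; the counts below are made up for a toy cat detector:

```python
# Confusion-matrix counts out of 100 predictions (illustrative values).
tp, fp, fn, tn = 40, 10, 5, 45

accuracy = (tp + tn) / (tp + fp + fn + tn)   # correct predictions / all predictions
precision = tp / (tp + fp)                   # of predicted cats, how many were cats
recall = tp / (tp + fn)                      # of actual cats, how many were found
f1 = 2 * precision * recall / (precision + recall)   # harmonic mean of the two

print(accuracy)            # 0.85
print(precision)           # 0.8
print(round(recall, 3))    # 0.889
print(round(f1, 3))        # 0.842
```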


Regression Metrics - Mean squared error, R squared

  • Mean squared error - Take the difference between each prediction and its actual outcome, square it, and average those squared differences across all examples.

  • R squared - It’s like a percentage, reporting a number from 0 to 1. When R squared is close to 1, it usually indicates that a lot of the variance in the data can be explained by the model itself.


Business metrics

  • Consider developing custom metrics that tune the model directly for the business objectives. One way is to develop a cost function to evaluate the economic impact of the model.

  • By using A/B testing or the canary deployments technique, developers can experiment with two or more variants of a model and help achieve the business goals.
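A business cost function like the one mentioned above might, for example, weight each error type by its dollar impact. The costs here are purely hypothetical:

```python
def economic_impact(fp, fn, cost_per_fp=5.0, cost_per_fn=50.0):
    """Toy cost function: weight each error type by its business cost.

    Hypothetical costs: a false positive wastes a manual review ($5),
    while a missed case (false negative) costs $50.
    """
    return fp * cost_per_fp + fn * cost_per_fn

# A model with more false positives can still be cheaper overall
# if it misses far fewer costly cases.
print(economic_impact(fp=100, fn=5))   # 750.0
print(economic_impact(fp=20, fn=30))   # 1600.0
```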


Model Deployment

  • Model deployment is the integration of the model and its resources into a production environment so that it can be used to create predictions.

  • In a self-hosted API approach, you deploy and host your ML models on your own infrastructure, either on premises or in the cloud.

  • Managed API services are cloud-based services that provide a fully managed environment for deploying and hosting your ML models as APIs. SageMaker AI is an example.

  • SageMaker AI provides the following:

    • Deployment with one click or a single API call

    • Automatic scaling

    • Model hosting services

    • HTTPS endpoints that can host multiple models


Deployment Options

  • Real-time inference is ideal for inference workloads where you have real-time, interactive, and low latency requirements.

  • Use batch transform when you need to get inferences from large datasets and don't need a persistent endpoint.

  • Asynchronous inference is a capability in SageMaker AI that queues incoming requests and processes them asynchronously.

  • On-demand serverless inference is ideal for workloads that have idle periods between traffic spurts and can tolerate cold starts.


Fundamental Concepts of MLOps

  • MLOps refers to the practice of operationalizing and streamlining the end-to-end machine learning lifecycle from model development and deployment to monitoring and maintenance.

  • MLOps combines people, technology, and processes to deliver collaborative ML solutions.


The aim is to use MLOps to do the following:

  • Increase the pace of the model development lifecycle through automation.

  • Improve quality metrics through testing and monitoring.

  • Promote a culture of collaboration between data scientists, data engineers, software engineers, and IT operations.

  • Provide transparency, explainability, auditability, and security of the models by using model governance.

  • Benefits of MLOps - Productivity, Reliability, Repeatability, Auditability, Data and Model Quality

  • Key principles of MLOps - Version control, Automation, CI/CD, Model governance



AWS services for MLOps

  • Prepare data - SageMaker Data Wrangler is a low-code no-code (LCNC) tool that provides an end-to-end solution to import, prepare, transform, featurize, and analyze data by using a web interface.

  • Store features - SageMaker Feature Store helps data scientists, machine learning engineers, and general practitioners to create, share, and manage features for ML development.

  • Train - SageMaker AI provides a training job feature to train models using built-in algorithms or custom algorithms.

  • Experiments - Use SageMaker Experiments to experiment with multiple combinations of data, algorithms, and parameters, all while observing the impact of incremental changes on model accuracy.

  • Processing job - SageMaker AI Processing refers to the capabilities to run data pre-processing and post-processing, feature engineering, and model evaluation tasks

  • Registry - With SageMaker Model Registry you can catalog models, manage model versions, manage the approval status of a model, or deploy models to production.

  • Deployments - With SageMaker AI, you can deploy your ML models to make predictions, also known as inference.

  • Monitor model - With SageMaker Model Monitor, you can monitor the quality of SageMaker AI ML models in production.


5. Developing Generative AI Solutions


  • Capabilities of generative AI

    • Adaptability

    • Responsiveness

    • Simplicity

    • Creativity and exploration

    • Data efficiency

    • Personalization

    • Scalability

  • Challenges of generative AI

    • Regulatory violations

    • Social risks

    • Data security and privacy concerns

    • Toxicity

    • Hallucinations

    • Interpretability

    • Nondeterminism


Generative AI Application Lifecycle

  • Define a use case - This might involve analyzing the application's functionalities, user needs, and business goals to determine where generative AI can add value.

  • Select a foundation model - This decision depends on factors such as the availability of suitable pre-trained models, the complexity of the use case, and the availability of domain-specific data for training.

  • Improve performance - This might involve adapting the model's input and output formats, fine-tuning the model with application-specific data, and implementing any necessary customizations or optimizations.

  • Evaluate results - This might involve testing with various inputs, edge cases, and real-world scenarios, as well as evaluating the quality, coherence, and relevance of the generated content.

  • Deploy the application - Monitoring mechanisms are established to track the performance, usage, and potential issues or biases associated with the generative AI model's outputs.


Defining a Use Case

  • Defining the problem to be solved

  • Gathering relevant requirements

  • Aligning stakeholder expectations


A well-defined business use case typically consists of the following parts:

  • Use case name - A short and descriptive name that identifies the use case

  • Brief description - A high-level summary of the use case's purpose and objective

  • Actors - The entities or stakeholders that interact with the system or process

  • Preconditions - The conditions that must be true before the use case can be initiated

  • Basic flow (main success scenario) - A step-by-step description of the actions and interactions that occur when the use case is completed successfully, from start to finish

  • Alternative flows (extensions) - Additional scenarios or paths that might occur due to exceptional conditions, errors, or alternative user choices

  • Postconditions - The state or conditions that must be true after the successful completion of the use case

  • Business rules - Any business policies, constraints, or regulations that govern the behavior of the system or process within the context of the use case

  • Nonfunctional requirements - Any nonfunctional requirements, such as performance, security, or usability considerations, that are relevant to the use case

  • Assumptions - Any assumptions made about the system, environment, or context that are necessary for the use case to be valid or applicable.

  • Notes or additional information - Any additional notes, explanations, or supplementary information that might be helpful for understanding or implementing the use case


When it comes to resolving business problems using generative AI, there are various metrics and approaches that can be employed.

  • Cost savings, Time savings, Quality improvement, Customer satisfaction, Productivity gains


Approaches:

  • Process automation, Augmented decision-making, Personalization and customization, Creative content generation, Exploratory analysis and innovation


Selecting an FM

  • Pre-trained model selection criteria - Cost, Modality, Latency, Multi-lingual support, Model size, Model complexity, Customization, Input/output length, Responsibility considerations, Deployment and integration.

  • Amazon Titan foundation models are a family of models built by Amazon Web Services (AWS) that are pre-trained on large datasets, which makes them powerful, general-purpose models.


Improving the Performance of an FM

Prompt engineering

  • Prompt engineering is the fastest way to harness the power of large language models (LLMs).

  • Some key aspects of prompt engineering include the following:

    • Design: Crafting clear, unambiguous, and context-rich prompts that effectively communicate the desired task or output to the model

    • Augmentation: Incorporating additional information or constraints into the prompts, such as examples, demonstrations, or task-specific instructions, to guide the model's generation process

    • Tuning: Iteratively refining and adjusting the prompts based on the model's outputs and performance, often through human evaluation or automated metrics

    • Ensembling: Combining multiple prompts or generation strategies to improve the overall quality and robustness of the outputs

    • Mining: Exploring and identifying effective prompts through techniques like prompt searching, prompt generation, or prompt retrieval from large prompt libraries

  • Prompt techniques - Zero-shot prompting, Few-shot prompting, Chain-of-thought (CoT) prompting, Self-consistency, Tree of thoughts (ToT), Retrieval Augmented Generation (RAG) etc

  • RAG is a natural language processing (NLP) technique that combines the capabilities of retrieval systems and generative language models to produce high-quality and informative text outputs.

  • RAG incorporates two main components

    • A retrieval system - This component retrieves relevant information from a large corpus of text data, such as a knowledge base, web pages, or other textual sources

    • A generative language model - The language model takes the input query or context along with the retrieved relevant information and generates an output that combines the retrieved knowledge with its own understanding.

  • RAG has several business applications, including the following:

    • Building intelligent question-answering systems

    • Expanding and enriching existing knowledge bases

    • Generating high-quality content

  • Knowledge bases for Amazon Bedrock provide you the capability of amassing data sources into a repository of information.

  • RAG can use knowledge bases across various domains to provide intelligent and contextual responses, recommendations, or analysis by combining information retrieval and natural language generation capabilities.
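The two RAG components can be sketched as a toy in plain Python. This is a deliberately simplified illustration: real systems retrieve by embedding similarity from a vector store and pass the augmented prompt to an LLM, whereas here retrieval is naive word overlap and the "generation" step only builds the prompt:

```python
def retrieve(query, corpus, k=2):
    """Toy retrieval system: rank documents by word overlap with the query."""
    q_words = set(query.lower().split())
    ranked = sorted(corpus,
                    key=lambda doc: len(q_words & set(doc.lower().split())),
                    reverse=True)
    return ranked[:k]

def build_rag_prompt(query, corpus):
    """Augment the user query with retrieved context before calling an LLM."""
    context = "\n".join(retrieve(query, corpus))
    return f"Use the context to answer.\nContext:\n{context}\n\nQuestion: {query}"

corpus = [
    "SageMaker endpoints support automatic scaling.",
    "Amazon Bedrock knowledge bases store document embeddings.",
    "Batch transform handles large offline datasets.",
]
print(build_rag_prompt("How do knowledge bases store embeddings?", corpus))
```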

Fine-tuning

  • Fine-tuning refers to the process of taking a pre-trained language model and further training it on a specific task or domain-specific dataset

  • There are two ways to fine-tune a model:

    • Instruction fine-tuning uses examples of how the model should respond to a specific instruction. Prompt tuning is a type of instruction fine-tuning.

    • Reinforcement learning from human feedback (RLHF) provides human feedback data, resulting in a model that is better aligned with human preferences.


Creating a foundation model from scratch

  • Creating a model from scratch allows for complete customization and tailoring to the specific problem. But it comes at a significant cost in terms of computational resources, time, and expertise required.

  • When developing a generative AI application, there is often a trade-off between cost and accuracy when deciding whether to use a pre-trained foundation model or pursue a more customized approach.


Automated multi-step tasks with agents

  • Here are some examples of tasks that agents can accomplish - Task coordination, Reporting and logging, Scalability and concurrency, Integration and communication


Evaluating an FM

  • Types of evaluation methods:

    • Human evaluation

    • Benchmark datasets - Some popular benchmark datasets for natural language processing tasks: SuperGLUE, Stanford Question Answering Dataset (SQuAD), Workshop on Machine Translation (WMT)

    • Automated metrics - Perplexity, BLEU score, F1 score

  • Metrics like ROUGE, BLEU, and BERTScore provide an initial assessment of the foundation model's capabilities.

    • Recall-Oriented Understudy for Gisting Evaluation (ROUGE) is a set of metrics used for evaluating automatic summarization and machine translation systems

    • Bilingual Evaluation Understudy (BLEU) is a metric used to evaluate the quality of machine-generated text, particularly in the context of machine translation.

    • BERTScore is a metric that evaluates the semantic similarity between a generated text and one or more reference texts.
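As a rough illustration of how these overlap-based metrics work, here is a simplified ROUGE-1 recall (fraction of reference unigrams that appear in the candidate). Real evaluations would use an established implementation, which also handles n-grams, stemming, and longest-common-subsequence variants:

```python
from collections import Counter

def rouge1_recall(reference, candidate):
    """Simplified ROUGE-1 recall: overlapping unigrams / reference unigrams."""
    ref = Counter(reference.lower().split())
    cand = Counter(candidate.lower().split())
    overlap = sum(min(ref[w], cand[w]) for w in ref)  # clipped unigram matches
    return overlap / sum(ref.values())

ref = "the cat sat on the mat"
cand = "the cat lay on the mat"
print(rouge1_recall(ref, cand))  # 5 of 6 reference words recovered
```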


Deploying the Application

The deployment phase of the generative AI lifecycle ensures that the trained model is successfully integrated into the target environment for practical use.

  • Key considerations - Cost, Regions, Quotas, Security


6. Optimizing Foundation Models

  • Embedding is the process by which text, images, and audio are given numerical representation in a vector space.

  • The core function of vector databases is to compactly store billions of high-dimensional vectors representing words and entities

  • AWS offers the following viable vector database options:

    • Amazon OpenSearch Service (provisioned)

    • Amazon OpenSearch Serverless

    • pgvector extension in Amazon Relational Database Service (Amazon RDS) for PostgreSQL

    • pgvector extension in Amazon Aurora PostgreSQL-Compatible Edition

    • Amazon Kendra
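The core vector-database operation, finding the stored embedding closest to a query embedding, is usually cosine similarity. A minimal sketch with made-up three-dimensional vectors (real embeddings have hundreds or thousands of dimensions):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def nearest(query_vec, store):
    """Return the stored (label, vector) pair most similar to the query."""
    return max(store, key=lambda item: cosine_similarity(query_vec, item[1]))

store = [
    ("refund policy", [0.9, 0.1, 0.0]),
    ("shipping times", [0.1, 0.8, 0.2]),
]
doc, _ = nearest([0.85, 0.15, 0.0], store)
print(doc)  # refund policy
```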

  • RAG is an approach that allows FMs to query knowledge bases to provide accurate and up-to-date answers to customer prompts.

  • Fine-tuning is critical because it helps to do the following: Increase specificity, Improve accuracy, Reduce biases, Boost efficiency.

  • RLHF refers to the improvement of the model by learning from feedback, such as ratings, preferences, demonstrations, helpfulness, or toxicity, provided by humans

  • The following list walks through the key steps in fine-tuning data preparation:

    • Data curation, Labeling, Governance and compliance, Representativeness and bias checking

  • Model Evaluation - Three commonly used metrics for this purpose are Recall-Oriented Understudy for Gisting Evaluation (ROUGE), Bilingual Evaluation Understudy (BLEU), and BERTScore.


7. Security, Compliance, and Governance for AI Solutions

  • Security: Ensure that confidentiality, integrity, and availability are maintained for organizational data and information assets and infrastructure. This function is often called information security or cybersecurity in an organization.

  • Governance: Ensure that an organization can add value and manage risk in the operation of business.

  • Compliance: Ensure normative adherence to requirements across the functions of an organization.


Defense in depth for AI on AWS

  1. Policies, procedures, and awareness - AWS Identity and Access Management Access Analyzer

  2. Threat detection and incident response

    1. AWS services that help with threat detection include: AWS Security Hub, Amazon GuardDuty

    2. AWS services for incident response include the following: AWS Lambda, Amazon EventBridge

  3. Infrastructure protection - Some AWS services and features include the following: AWS Identity and Access Management (IAM), IAM user groups and network access control lists (network ACLs)

  4. Network and edge protection - AWS services that provide network and edge protection include the following: Amazon Virtual Private Cloud (Amazon VPC), AWS WAF

  5. Application protection - AWS offers several services to protect applications. These include the following: AWS Shield, Amazon Cognito, and others

  6. Identity and access management - The fundamental service is AWS Identity and Access Management (IAM).

  7. Data protection

    1. Data at rest: Ensure that all data at rest is encrypted with AWS Key Management Service (AWS KMS), and use Amazon Simple Storage Service (Amazon S3) versioning.

    2. Data in transit: Protect all data in transit between services using AWS Certificate Manager (ACM) and AWS Private Certificate Authority (AWS Private CA). Keep data within virtual private clouds (VPCs) using AWS PrivateLink.


High-level strategy for governance and compliance

  • The following is an example approach for establishing a governance framework.

    • Establish an AI governance board or committee

    • Define roles and responsibilities

    • Implement policies and procedures

AWS compliance - AWS supports 143 security standards and compliance certifications.

  • The following are some specific security standards that might apply to AI systems

    • NIST, ENISA, ISO, SOC, HIPAA, GDPR, PCI DSS etc

  • There are several key ways in which AI standards compliance differs from traditional software and technology requirements. The following are some issues to consider.

    • Complexity, Opacity, Dynamism, Emergent capabilities, Unique risks, Algorithm accountability

  • Regulated workloads: Regulated is a common term used to indicate that a workload might need special consideration, because of some form of compliance that must be achieved. Some example industries are as follows:

    • Financial services

    • Healthcare

    • Aerospace

  • You are operating in a regulated context when you must comply with regulatory frameworks such as HIPAA, GDPR, PCI DSS, and others.

  • The National Institute of Standards and Technology (NIST) 800-53 security controls are generally applicable to US Federal Information Systems

  • You can identify your data governance needs by determining whether such standards exist or apply. Ask the following questions:

    • Do you need to audit this workload?

    • Do you need to archive this data for a period of time?

    • Will the predictions created by your model constitute a record or other special data item?


AWS Services for Governance and Compliance

  • AWS Config provides a detailed view of the configuration of AWS resources in your AWS account. This includes how the resources are related to one another and how they were configured in the past so that you can see how the configurations and relationships change over time.

  • Amazon Inspector - Amazon Inspector is a vulnerability management service that continuously scans your AWS workloads for software vulnerabilities and unintended network exposure.

  • AWS Audit Manager helps you continually audit your AWS usage to streamline how you manage risk and compliance with regulations and industry standards.

  • AWS Artifact provides on-demand downloads of AWS security and compliance documents, such as AWS ISO certifications, PCI reports, and SOC Reports. Also used to perform due diligence of ISVs that sell products on AWS Marketplace.

  • AWS CloudTrail helps you perform operational and risk auditing, governance, and compliance of your AWS account.

  • AWS Trusted Advisor helps you optimize costs, increase performance, improve security and resilience, and operate at scale in the cloud.


Data Governance Strategies

The following are some key data governance strategies that organizations can consider.

  • Data quality and integrity

  • Data protection and privacy

  • Data lifecycle management

  • Responsible AI

  • Governance structures and roles

  • Data sharing and collaboration

The following concepts are all important considerations for the successful management and deployment of AI workloads.

  • Data lifecycles

  • Data logging

  • Data residency

  • Data monitoring

  • Data analysis

  • Data retention


Approaches to governance strategies

Governance strategies - The following are some key approaches to consider:

  • Policies, Review cadence, Review strategies, Transparency standards, Team training requirements

Monitoring an AI system

  • Performance metrics - Monitor the performance of the AI system by tracking metrics, such as the following: Model accuracy, Precision, Recall, F1-score, Latency

  • Infrastructure monitoring, Monitoring for bias and fairness, Monitoring for compliance and responsible AI


Security and Privacy Considerations for AI Systems

  • Security considerations - Threat detection, Vulnerability management, Infrastructure protection, Prompt injection, Data encryption

The OWASP Top 10 for LLMs

  1. Prompt injection: Malicious user inputs that can manipulate the behavior of a language model

  2. Insecure output handling: Failure to properly sanitize or validate model outputs, leading to security vulnerabilities

  3. Training data poisoning: Introducing malicious data into a model's training set, causing it to learn harmful behaviors

  4. Model denial of service: Techniques that exploit vulnerabilities in a model's architecture to disrupt its availability

  5. Supply chain vulnerabilities: Weaknesses in the software, hardware, or services used to build or deploy a model

  6. Sensitive information disclosure: Leakage of sensitive data through model outputs or other unintended channels

  7. Insecure plugin design: Flaws in the design or implementation of optional model components that can be exploited

  8. Excessive agency: Granting a model too much autonomy or capability, leading to unintended and potentially harmful actions

  9. Overreliance: Over-dependence on a model's capabilities, leading to over-trust and failure to properly audit its outputs

  10. Model theft: Unauthorized access or copying of a model's parameters or architecture, allowing for its reuse or misuse


AWS Services and Features for Securing AI Systems

  • Why do you need to secure your AI systems?

    • AI models process sensitive data

    • AI Systems can be vulnerable to adversarial attacks

    • Integration into critical applications and decision-making processes

  • The AWS Shared Responsibility Model

    • Customer: Responsible for security *in* the cloud - customer data, platform, applications, operating systems, client-side and server-side encryption, network traffic protection, and so on

    • AWS: Responsible for security *of* the cloud - compute, storage, database, Regions, and so on

  • Four foundational AWS security services:

    • AWS Security Hub provides customers with a single dashboard to view all security findings, and to create and run automated playbooks.

    • AWS KMS encrypts data and gives customers the choice and control of using AWS managed keys or customer-managed keys to protect their data.

    • Amazon GuardDuty is a threat detection service that monitors for suspicious activity and unauthorized behavior to protect AWS accounts, workloads, and data.

    • AWS Shield Advanced helps protect workloads against Distributed Denial of Service (DDoS) events. AWS Shield Advanced includes AWS WAF and AWS Firewall Manager.


The following services are used to manage user identities and access to resources, identify and protect sensitive data, and guard your AI systems and applications.

  • Identify sensitive data before training models

  • Manage identities and access to AWS services and resources

  • Limit access to your data, models, and outputs

  • Protect data from exfiltration (data theft) and manipulation

  • Protect AI workloads with intelligent threat detection

  • Automate incident response and compliance

  • Defend your generative AI web applications and data


Citing sources and documenting origins

  • Source citation in generative AI refers to the act of properly attributing and acknowledging the sources of the data used to train the model. It is necessary to identify the sources from which the training data was collected, such as the following:

    • Datasets

    • Databases

    • Other sources

  • Documenting data origins in the context of generative AI involves providing detailed information about the provenance, or the place of origin of the data used to train the model. This includes the following:

    • Details about the data collection process

    • The methods used to curate and clean the data

    • Any preprocessing or transformations applied to the data


Tools and techniques

  • Data lineage is a technique used to track the history of data, including its origin, transformation, and movement through different systems.

  • Cataloging involves the systematic organization and documentation of the datasets, models, and other resources used in the development of a generative AI system.

  • Model cards are a standardized format for documenting the key details about an ML model, including its intended use, performance characteristics, and potential limitations.

  • You can use Amazon SageMaker Model Cards to document critical details about your ML models in a single place for streamlined governance and reporting.


Secure data engineering

  • Secure data engineering practices are essential for ensuring the safety and reliability of AI and generative AI systems.

    • assessing the quality of data

    • implementing privacy-enhancing technologies

    • data access control

    • data integrity

  • The AWS Privacy Reference Architecture (AWS PRA) offers a set of guidelines to assist in the design and implementation of privacy-supporting controls within AWS services.


8. Essentials of Prompt Engineering


Elements of a prompt

  • Instructions: This is a task for the large language model to do. It provides a task description or instruction for how the model should perform.

  • Context: This is external information to guide the model.

  • Input data: This is the input for which you want a response.

  • Output indicator: This is the output type or format.
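The four elements above can be assembled into a single prompt string. A minimal sketch (the template text and example values are illustrative, not a prescribed format):

```python
def build_prompt(instructions, context, input_data, output_indicator):
    """Assemble the four prompt elements into one prompt string."""
    return (
        f"{instructions}\n\n"          # the task for the model
        f"Context: {context}\n\n"      # external guiding information
        f"Input: {input_data}\n\n"     # what you want a response about
        f"{output_indicator}"          # desired output type or format
    )

prompt = build_prompt(
    instructions="Classify the sentiment of the review.",
    context="Reviews come from an online bookstore.",
    input_data="The plot was slow but the ending was worth it.",
    output_indicator="Answer with one word: positive, negative, or neutral.",
)
print(prompt)
```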

Negative prompting is used to guide the model away from producing certain types of content or exhibiting specific behaviors.


Inference parameters


When interacting with FMs, you can often configure inference parameters to limit or influence the model response.

Randomness and diversity

  • Temperature: This parameter controls the randomness or creativity of the model's output.

    • A higher temperature makes the output more diverse and unpredictable, and a lower temperature makes it more focused and predictable.

    • Temperature is set between 0 and 1.

  • Top p is a setting that controls the diversity of the text by limiting the number of words that the model can choose from based on their probabilities.

    • Top p is also set on a scale from 0 to 1 (more focused to more diverse)

  • Top k limits the number of words to the top k most probable words, regardless of their percent probabilities.

    • For instance, if top k is set to 50, the model will only consider the 50 most likely words for the next word in the sequence, even if those 50 words only make up a small portion of the total probability distribution.
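How temperature, top k, and top p interact can be sketched over a toy next-token distribution. This is an illustrative simplification of how decoders commonly apply these parameters, not any particular model's implementation:

```python
import math
import random

def sample_next_token(logits, temperature=1.0, top_k=None, top_p=None, seed=0):
    """Apply temperature, then top-k and top-p filtering, then sample one token."""
    # Temperature: lower -> sharper distribution (more predictable),
    # higher -> flatter distribution (more diverse).
    scaled = {tok: logit / temperature for tok, logit in logits.items()}
    # Softmax over the scaled logits, sorted most-probable first.
    z = sum(math.exp(v) for v in scaled.values())
    probs = sorted(((tok, math.exp(v) / z) for tok, v in scaled.items()),
                   key=lambda kv: kv[1], reverse=True)
    if top_k is not None:   # keep only the k most probable tokens
        probs = probs[:top_k]
    if top_p is not None:   # keep the smallest set whose cumulative mass >= top_p
        kept, mass = [], 0.0
        for tok, p in probs:
            kept.append((tok, p))
            mass += p
            if mass >= top_p:
                break
        probs = kept
    total = sum(p for _, p in probs)  # renormalize and sample
    rng = random.Random(seed)
    return rng.choices([t for t, _ in probs],
                       weights=[p / total for _, p in probs])[0]

logits = {"cat": 2.0, "dog": 1.5, "car": 0.5, "tree": 0.1}
print(sample_next_token(logits, temperature=0.7, top_k=2))
```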

Length

Settings that control the maximum length of the generated output and specify the stop sequences that signal the end of the generation process.

  • Maximum length - The maximum length setting determines the maximum number of tokens that the model can generate during the inference process.

  • Stop sequences - Stop sequences are special tokens or sequences of tokens that signal the model to stop generating further output.


Best practices for prompting

  • Be clear and concise.

  • Include context if needed.

  • Use directives for the appropriate response type.

  • Consider the output in the prompt.

  • Start prompts with an interrogation (a direct question).

  • Provide an example response.

  • Break up complex tasks.

  • Experiment and be creative.

  • Use prompt templates.


Prompt Engineering Techniques

Zero-shot prompting is a technique where a user presents a task to a generative model without providing any examples or explicit training for that specific task.

Few-shot prompting is a technique that involves providing a language model with contextual examples to guide its understanding and expected output for a specific task.

Chain-of-thought (CoT) prompting is a technique that divides intricate reasoning tasks into smaller, intermediary steps.

  • To initiate the chain-of-thought reasoning process in a machine learning model, you can use the phrase "Think step by step."
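The three techniques are easiest to see side by side as concrete prompt strings (the review and math examples here are invented for illustration):

```python
# Zero-shot: the task alone, no examples.
zero_shot = "Classify the sentiment: 'I loved this laptop.'"

# Few-shot: a couple of worked examples guide the expected output format.
few_shot = (
    "Classify the sentiment.\n"
    "Review: 'Terrible battery.' -> negative\n"
    "Review: 'Great screen.' -> positive\n"
    "Review: 'I loved this laptop.' ->"
)

# Chain-of-thought: ask the model to reason through intermediate steps.
chain_of_thought = (
    "A bakery sold 12 cakes at $5 each and spent $20 on ingredients. "
    "What was the profit? Think step by step."
)

for name, prompt in [("zero-shot", zero_shot), ("few-shot", few_shot),
                     ("chain-of-thought", chain_of_thought)]:
    print(f"--- {name} ---\n{prompt}\n")
```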


Prompt Misuses and Risks

  • Poisoning refers to the intentional introduction of malicious or biased data into the training dataset of a model.

  • Hijacking and prompt injection refer to the technique of influencing the outputs of generative models by embedding specific instructions within the prompts themselves.

  • Exposure - Exposure refers to the risk of exposing sensitive or confidential information to a generative model during training or inference.

  • Prompt leaking refers to the unintentional disclosure or leakage of the prompts or inputs (regardless of whether these are protected data or not) used within a model.

  • Jailbreaking refers to the practice of modifying or circumventing the constraints and safety measures implemented in a generative model or AI assistant to gain unauthorized access or functionality.


All the best!




© 2024 created by PSHQ
