GenAI Security Engineering

Engineer defenses against prompt injection, jailbreaks, and data exfiltration. Implement PII leakage detection, content safety, and compliance.

10 skill groups · 8 courses · 778 goals · ~342 hrs

Verifiable skill graph

10 skill groups · each becomes a signed node on your graph.

Every lab you pass signs a W3C Verifiable Credential on your public skill graph. Completing the labs in each group below mints one node on that graph — the badge you walk away with is a cryptographic record of what you can ship, not a completion certificate.

Share the URL on your résumé or with a hiring manager. They click; they see the discipline, the labs you passed, and the verification signature. No honor system, no broker.

Threat Modeling & AI Red Teaming

Run the offense program: AI threat modeling (STRIDE-for-LLMs, attack trees), OWASP-LLM-Top-10 / MITRE ATLAS fluency, red-team automation, and adversarial test campaigns — the discipline that frames every other defense.

Prompt Injection (Direct & Indirect) & Jailbreak Defense

Attack and defend the prompt surface: direct and indirect/cross-domain injection (via retrieved or tool content), jailbreak/refusal-bypass, red-teaming content filters, and output-encoding to neutralize downstream injection.

Data Exfiltration & PII Leakage Defense

Stop data getting out under attack: exfiltration via prompt/output/tool side-channels, system-prompt leakage, and PII-leak detection — the adversarial channels, not corpus or in-feature redaction.

Model Privacy & Extraction Attacks

Defend the model itself as an asset: membership-inference, training-data extraction, and model-stealing attacks — and the defenses against them.

Attack Detection & Security Incident Response

Catch the attack and respond to the breach: adversarial anomaly/abuse detection, attack forensics and containment, and post-breach response — not operational on-call or outage IR.

Agentic & MCP Attack Surface

Attack the agent: confused-deputy and tool-poisoning, excessive-agency exploitation, malicious-MCP-server attacks, privilege escalation through tool chains, and sandbox-escape testing.

RAG Poisoning & Retrieval Attacks

Poison and defend the retrieval layer: corpus/embedding poisoning, malicious-document injection, retrieval manipulation, and detection vs prevention — attacking the index, not building it.

AI Supply-Chain, Model-Artifact & Abuse/DoS Defense

Secure the AI supply chain and abuse surface: model provenance/signing, malicious-model-file scanning, AI dependency/SBOM, and abuse/token-flooding DoS — not generic endpoint or org-secrets hardening.

Hosted LLM API Integration

Baseline provider access in security tooling: LLM/embedding SDK calls, auth, and retries.

Python for Security Engineering

Production Python for security tooling: async, typing, parsing, and error handling.

What you'll ship in production

Core responsibilities this discipline prepares you for.

1. Conduct adversarial red-team testing of LLM systems
- Automate red-teaming with Garak for prompt injection, jailbreak, and data extraction probes
- Run multi-turn adversarial campaigns with Meta GOAT and structured vulnerability reporting
- Execute campaigns against realistic GenAI systems, discover attack vectors, and produce actionable reports
2. Implement defense-in-depth guardrails: input validation, output filtering, content safety
- Layer NeMo Guardrails, Llama Guard 4, Prompt Guard 2, and Model Armor into a unified defense stack
- Configure multi-layer input validation, output filtering, and content classification policies
- Measure the safety-vs-helpfulness tradeoff across different defense layer configurations
3. Threat-model GenAI agent systems: analyze attack surfaces across tools, memory, and inter-agent communication
- Analyze MCP security boundaries, memory manipulation vectors, and inter-agent trust relationships
- Map tool access control surfaces and agent communication channel vulnerabilities
- Threat-model a complete multi-agent system, identify attack vectors, and design targeted mitigations
4. Build PII protection: detect, classify, and redact sensitive data in LLM pipelines
- Integrate Presidio for multi-language PII detection with custom entity recognizers
- Implement masking vs. pseudonymization redaction strategies with compliance validation
- Configure PII protection for a RAG pipeline and verify zero sensitive data leakage in outputs
5. Design compliance programs aligned with OWASP LLM Top 10, MITRE ATLAS, EU AI Act
- Map OWASP LLM Top 10 mitigations to specific technical controls and implementation patterns
- Implement MITRE ATLAS threat taxonomy and NIST AI RMF compliance frameworks
- Create compliance mappings for GenAI systems and design repeatable audit procedures
6. Build security monitoring for GenAI systems
- Build security-specific monitoring dashboards with anomalous prompt pattern detection
- Detect data exfiltration attempts, unusual token patterns, and adversarial input signatures
- Monitor a production-like GenAI system and detect simulated attacks in real time
7. Implement incident response for GenAI security events
- Build GenAI-specific incident response playbooks with severity classification and containment procedures
- Design forensic analysis workflows for LLM interactions and post-incident reporting
- Simulate security incidents and practice the full end-to-end response lifecycle
8. Secure GenAI supply chain: model provenance, dependency scanning, container security
- Verify model integrity with provenance checks and scan dependencies for known vulnerabilities
- Design secure CI/CD pipelines with container image scanning and signing for GenAI deployments
- Audit a complete GenAI application supply chain and implement security controls at each stage
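
The PII protection responsibility above contrasts masking with pseudonymization. As a minimal sketch of that difference, assuming a single hypothetical email regex stands in for a real detector such as Presidio (whose API is not used here), masking replaces every hit with one fixed placeholder, while pseudonymization replaces each distinct value with a stable keyed token so records stay linkable:

```python
import hashlib
import hmac
import re

# Hypothetical, deliberately simple email pattern used only for illustration;
# a real pipeline would rely on a dedicated PII detector, not a single regex.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def mask(text: str) -> str:
    """Masking: replace every detected email with the same fixed placeholder."""
    return EMAIL_RE.sub("<EMAIL>", text)


def pseudonymize(text: str, key: bytes) -> str:
    """Pseudonymization: replace each email with a keyed, stable token.

    The same email maps to the same token under the same key, so downstream
    records remain linkable without exposing the original value.
    """
    def _token(match: re.Match) -> str:
        digest = hmac.new(key, match.group(0).lower().encode(), hashlib.sha256)
        return f"<EMAIL_{digest.hexdigest()[:8]}>"

    return EMAIL_RE.sub(_token, text)


def leaks_email(text: str) -> bool:
    """Output check: True if any email-shaped string survives redaction."""
    return EMAIL_RE.search(text) is not None
```

Running `leaks_email` over pipeline outputs is one simple way to approximate the "verify zero sensitive data leakage" step, though it only covers the entity types the detector recognizes.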

Curriculum

8 courses · each builds on previous goals

Course · Goals · Weight

Python Essentials for Agent Builders · 62 goals · 1.3%

Your Dev Environment · 4 goals

Navigate filesystem with terminal
Manage files from command line
Set up VS Code
Configure terminal in VS Code

Python, Git & Package Management · 6 goals

Install and verify Python
Write hello world script
Use Python REPL
Initialize Git repository
Track changes with Git
Install packages with pip

Variables & Basic Types · 5 goals

Create and name variables
Work with strings
Work with numbers
Work with booleans
Format with f-strings

Control Flow · 4 goals

Make decisions with if/elif/else
Iterate with for loops
Repeat with while loops
Control loop execution

Functions · 5 goals

Define and call functions
Use parameters
Return values
Document with docstrings
Understand scope

Modules & Imports · 4 goals

Import standard library
Create custom modules
Understand Python path
Create packages

Lists & Tuples · 5 goals

Create and access lists
Modify lists
Slice lists
Use list comprehensions
Work with tuples

Dictionaries & Sets · 5 goals

Create and access dicts
Modify dictionaries
Iterate over dicts
Work with nested dicts
Use sets

Classes & Dataclasses · 5 goals

Understand class basics
Create dataclasses
Add methods
Use default values
Basic inheritance

Files, JSON & Error Handling · 5 goals

Read and write files
Work with JSON
Use pathlib
Handle exceptions
Create custom exceptions

Basic Testing · 4 goals

Use assert statements
Create test functions
Run pytest
Test classes

Environment Variables & Configuration · 5 goals

Understand environment variables
Use .env files
Load with python-dotenv
Handle missing variables
Organize configuration

Decorators & Context Managers · 5 goals

Understand decorators
Write simple decorators
Use context managers
Write context managers
Combine patterns

LLM Foundations for Agent Builders · 60 goals · 1.6%

Generators & Iterators · 5 goals

Understand iteration
Create generators
Use generator expressions
Build data pipelines
Use itertools

Async Programming Basics · 5 goals

Understand async concepts
Write async functions
Run concurrent operations
Use async context managers
Handle async exceptions

Type Hints & Pydantic · 5 goals

Add basic type hints
Use typing generics
Create Pydantic models
Validate API data
Configure Pydantic

Data Pipelines & Transformations · 5 goals

Build functional pipelines
Work with tabular data
Transform data shapes
Process LLM data formats
Optimize for performance

HTTP Clients & httpx · 5 goals

Make GET requests
Make POST requests
Use async httpx
Handle errors
Use sessions

Your First LLM Call · 5 goals

Set up credentials
Install Gemini SDK
Make first API call
Parse response
Handle API errors

Sampling Parameters & Output Control · 5 goals

Understand temperature
Use top-p sampling
Implement determinism
Control output length
Use structured output

Multi-Provider & Prompt Engineering · 5 goals

Build provider abstraction
Structure conversations
Use few-shot prompting
Implement chain-of-thought
Build prompt templates

Embeddings & Semantic Search · 5 goals

Understand embeddings
Generate embeddings
Calculate similarity
Build simple search
Compare embedding models

RAG Fundamentals · 5 goals

Understand RAG pattern
Chunk documents
Build retrieval pipeline
Compose RAG prompts
Evaluate RAG quality

Cost Awareness & Token Economics · 5 goals

Understand pricing models
Calculate request costs
Compare provider costs
Identify cost drivers
Basic cost optimization

Retry Patterns with Tenacity · 5 goals

Understand retry need
Use tenacity basics
Implement exponential backoff
Handle specific exceptions
Combine with async

Kubernetes Essentials for GenAI · 60 goals · 1%

Containerizing LLM Applications · 6 goals

Write a Python app that calls the Gemini API and returns structured responses
Write a Dockerfile and build a container image for the LLM app
Run the containerized LLM app with environment-based configuration
Use Docker Compose to run the LLM app with supporting services
Tag images with semantic versions and push to a container registry
Debug containers with exec, logs, and inspect

Your Kubernetes Cluster & First LLM Pod · 6 goals

Understand K8s architecture and connect to your vCluster
Deploy the LLM app as your first Kubernetes pod
Organize workloads with namespaces
Use labels and selectors to organize and query resources
Understand pod lifecycle and restart policies
Master kubectl debugging: exec, logs, describe, port-forward

Services & the LLM Chat Backend · 6 goals

Create a ClusterIP service to expose the LLM chat API internally
Deploy a multi-tier LLM chat application
Compare service types: ClusterIP, NodePort, LoadBalancer
Master DNS-based service discovery in Kubernetes
Understand endpoints and traffic routing
Debug service connectivity problems

Deployments, Scaling & Rolling Updates · 6 goals

Create a Deployment for the LLM chat API
Scale LLM app replicas to handle concurrent requests
Perform a rolling update with zero downtime
Roll back a broken deployment
Compare deployment strategies: RollingUpdate vs Recreate
Manage deployment lifecycle with kubectl rollout

Multi-Container Pods: Sidecars & Init Containers · 6 goals

Add an LLM proxy sidecar to the chat API pod
Use init containers for database setup and config loading
Share data between containers via emptyDir volumes
Implement the ambassador pattern for multi-model LLM routing
Add a logging and metrics sidecar to the LLM app
Debug multi-container pods

Resource Management & Cost Optimization · 6 goals

Set resource requests and limits for the LLM chat API
Understand QoS classes and their impact on eviction
Enforce resource defaults with LimitRanges
Cap namespace resource usage with ResourceQuotas
Right-size LLM app containers based on actual usage
Diagnose OOMKilled and CPU throttling issues

Packaging with Helm & Kustomize · 6 goals

Create a Helm chart for the LLM chat application
Parameterize the chart with values.yaml for each environment
Manage Helm release lifecycle: install, upgrade, rollback
Use Kustomize bases and overlays for the LLM app
Use Kustomize patches and generators
Compare Helm vs Kustomize for different deployment scenarios

Networking, Ingress & TLS · 6 goals

Expose the LLM chat API via an Ingress resource
Add TLS to the Ingress for HTTPS access
Isolate services with NetworkPolicies
Configure Ingress annotations for production traffic
Understand K8s networking: pod IPs, CNI, and service routing
Debug networking and connectivity issues

Health Probes, Autoscaling & Self-Healing · 6 goals

Add liveness and readiness probes to the LLM chat API
Configure startup probes for containers with slow initialization
Scale the chat API automatically with HPA based on CPU
Create PodDisruptionBudgets for safe maintenance
Implement health check patterns for LLM-dependent services
Combine autoscaling, probes, and PDBs for a resilient LLM service

RBAC, Security & K8s Troubleshooting · 6 goals

Create RBAC roles for the LLM chat application
Enforce Pod Security Standards
Apply SecurityContext for defense in depth
Debug CrashLoopBackOff and OOMKilled failures
Use kubectl debug and ephemeral containers for live debugging
Troubleshoot LLM-specific issues: timeouts, proxy errors, stale connections

Web APIs for GenAI Engineers · 60 goals · 1.2%

FastAPI Fundamentals · 6 goals

Create a FastAPI application with path operations
Define Pydantic request and response models
Implement dependency injection for shared resources
Build CRUD endpoints with proper HTTP semantics
Configure OpenAPI documentation with examples
Handle errors with custom exception handlers

Async Python for APIs · 6 goals

Convert sync endpoints to async with proper await patterns
Implement background tasks for non-blocking operations
Execute concurrent API calls with asyncio.gather
Manage application lifecycle with lifespan handlers
Build async generators for streaming responses
Control concurrency with semaphores and throttling

Database Integration · 6 goals

Configure SQLAlchemy async engine with connection pooling
Define ORM models with relationships and constraints
Create and manage database migrations with Alembic
Implement repository pattern for data access
Build transactional endpoints with session lifecycle
Implement filtering, sorting, and full-text search

Authentication & Authorization · 6 goals

Implement user registration with password hashing
Build OAuth2 password flow with JWT tokens
Implement API key authentication for services
Enforce role-based access control with permissions
Build token refresh and revocation
Compose multiple auth strategies into dependencies

Real-time Streaming · 6 goals

Build SSE endpoint for streaming LLM responses
Implement WebSocket endpoint with connection lifecycle
Build WebSocket connection manager for broadcasting
Handle backpressure and slow clients
Implement heartbeat and automatic reconnection
Build real-time notification system with Redis pub/sub

Resilience Patterns · 6 goals

Implement rate limiting with Redis sliding window
Build circuit breaker for LLM provider calls
Configure retry logic with tenacity
Isolate critical paths with bulkhead semaphores
Build fallback responses for degraded mode
Combine resilience patterns into middleware stack

API Gateway & Routing · 6 goals

Build reverse proxy with path-based routing
Implement load balancing across backend instances
Transform requests and responses through the gateway
Aggregate responses from multiple backends
Implement service discovery with health checking
Build gateway authentication and request enrichment

Testing & Documentation · 6 goals

Write async endpoint tests with httpx.AsyncClient
Build database fixtures with transaction rollback
Mock external services for deterministic tests
Implement contract tests for API consumers
Measure test coverage and set quality gates
Generate rich OpenAPI documentation with examples

API Versioning & Evolution · 6 goals

Implement URL-based API versioning with routers
Build header-based version negotiation
Manage deprecation with Sunset and Warning headers
Build request and response adapters for version translation
Detect breaking changes automatically
Generate API changelogs from schema diffs

Deployment & Observability · 6 goals

Build production Docker images with multi-stage builds
Deploy to Kubernetes with health check probes
Instrument endpoints with Prometheus metrics
Implement distributed tracing with OpenTelemetry
Build structured logging with correlation IDs
Create Grafana dashboards for API monitoring

Agent Hosted Models · 220 goals · 8.3%

The LLM Client · 7 goals

OpenAI client setup
Anthropic client setup
Google Gemini client setup
Build a unified LLM client interface
Error handling and provider fallback
Async LLM client patterns
Practical use cases — security, parameters, observability

Token Economics · 7 goals

Understand tokenization
Count tokens across providers
Cost forecasting and budgeting
Track LLM API usage in production
Implement budget controls
Optimize tokens
Advanced context engineering

Prompt Caching · 4 goals

Implement Anthropic cache_control
Leverage OpenAI automatic caching
Design cache-friendly prompt architectures
Build cache monitoring systems

The Function Caller · 7 goals

OpenAI function schemas
Anthropic function schemas
Gemini function schemas
Handle tool call responses
Execute tools safely with Pydantic validation
Handle parallel tool calls
Framework integration with LangGraph

The Tool Definer · 7 goals

Write clear tool descriptions for LLMs
Define parameter schemas
Use Pydantic for tool schemas
Implement tool decorators
Handle complex parameter types
Validate tool inputs at runtime
Framework tool patterns — LangGraph, CrewAI, OpenAI, Gemini, Anthropic

The Raw Agent Loop · 7 goals

The core agent while-loop
Manage context as a mutable list
Handle stop sequences
Track iteration limits
Tool execution in the loop
Build a conversation state tracker
Build with LangGraph StateGraph

The Prompt Engineer (Dynamic) · 6 goals

Master Jinja2 templating for prompts
Implement dynamic few-shot example selection
Enforce Chain-of-Thought reasoning
Structure system prompts with a builder pattern
Inject dynamic context into prompts safely
Build prompt versioning and A/B testing

The ReAct Pattern (Manual) · 6 goals

Build the Thought-Action generator
Tool execution and observation injection
Complete ReAct agent implementation
Advanced ReAct patterns — validation, retry, confidence
Optimize ReAct performance
Common ReAct pitfalls and solutions

The Planner Pattern · 7 goals

Plan generation
Step execution
Dynamic replanning
Hierarchical planning
Plan optimization
Monitoring and observability
Practical considerations — strategy selection

The Pydantic Tool · 7 goals

Pydantic fundamentals for tool definitions
Generate JSON Schema from Pydantic models
Input validation with custom validators
Build a Pydantic tool library
Advanced Pydantic patterns
Integrate Pydantic tools with agent frameworks
Common pitfalls and solutions

The Safe Executor (Sandboxing) · 5 goals

Understand code execution risks
Static code analysis
Sandboxed execution
Apply resource limits
Build a complete safe executor

The Web Navigator · 5 goals

Web navigation fundamentals
Web navigation tools — locating elements and forms
Browser automation with Playwright
Session management
Complete web navigator system

The MCP Protocol (Basics) · 4 goals

JSON-RPC 2.0 message format and handler
Transport mechanisms — stdio and HTTP/SSE
Protocol lifecycle — initialization, runtime, shutdown
Capability negotiation

The MCP Server · 6 goals

Create an MCP server with lifecycle management
Define MCP tools
Implement MCP resources
Create prompt templates
Error handling in MCP servers
Composable MCP server architecture

The MCP Client · 6 goals

MCP client architecture and stdio transport
Discover available tools and translate schemas
Proxy tool invocation
Fetch and use MCP resources
Manage MCP server lifecycle
Build multi-server MCP clients

The Tool Router · 5 goals

Tool routing architecture and implementation
Namespace-based routing
Capability-based routing
Fallback chains
Routing performance optimization

Short-Term Memory · 8 goals

Sliding window memory
Token-aware memory management
Message summarization strategies
Memory persistence layers
Memory retrieval optimization
Integrate memory with agents
Memory performance considerations
Non-functional requirements (privacy + safety)

Long-Term Memory (RAG) · 6 goals

Document chunking strategies
Embedding pipelines
Vector database integration
Hybrid search implementation
Retrieval optimization
RAG response generation

Agentic RAG Patterns · 5 goals

Self-reflective RAG
Multi-hop retrieval
Query routing
Adaptive retrieval
Retrieval feedback loops

Semantic Memory · 6 goals

Knowledge extraction pipelines
Entity and relationship extraction
Knowledge graph construction
Memory consolidation
Integrate semantic memory with agents
Build semantic memory with LangGraph

Context Optimizer · 6 goals

Context economics
Dynamic context prioritization
Context compression techniques
Prompt optimization
Context utilization metrics
Complete context optimizer

The State Graph · 5 goals

StateGraph fundamentals — config and lifecycle
Design state schemas with TypedDict
Add nodes to StateGraph
State initialization patterns
Tracing, debugging, validation

The Conditional Edge · 5 goals

Understand conditional edges
Design routing functions
Fan-out and fan-in patterns
Handle unknown routes and errors
Multi-stage routing

The Checkpointer (Time Travel) · 4 goals

Resumable workflows
Inspect, replay, and time-travel
Retention, large state, and performance
Thread management — IDs and namespaces

Human-in-the-Loop · 6 goals

LangGraph interrupt patterns
Approval workflow patterns
Interactive agent conversations
Feedback integration
State management for HITL
Practical use cases — escalation and analytics

The Streaming Agent · 6 goals

Streaming modes in LangGraph
Token streaming from LLMs
Custom events with `astream_events`
Build streaming APIs
Error handling in streams
Backpressure and flow control

The Subgraph (Composition) · 7 goals

Subgraph fundamentals — compile + test in isolation
State schema mapping
Subgraph checkpointers + namespace isolation
Compose subgraphs into a parent
Catch subgraph exceptions and recover
Define subgraph interfaces and build a registry
Build a multi-agent orchestrator

The Supervisor Pattern · 7 goals

Design supervisor architectures
Worker agent specialization
Build the complete supervisor graph
Manage inter-agent communication
Handle failures and edge cases
Implement task aggregation
Build the supervisor pattern with CrewAI

The Hierarchical Pattern · 4 goals

Design hierarchical agent architectures
Implement team-lead agents
Build cross-team coordination
Build the complete hierarchical graph

The Reflector Pattern (Critique) · 6 goals

Design reflection architectures
Implement critic agents
Build the evaluation and convergence system
Build the complete reflection graph
Handle reflection edge cases
Practical use cases for reflection

Input Guardrails · 6 goals

Design layered guardrail architectures
Format and schema validation
Build content filtering systems
Create injection / jailbreak detection
Implement policy-based guardrails
Assemble the complete guardrail system

Output Guardrails · 6 goals

Design output validation architectures
Implement factual validation (hallucination detection)
Build content safety filters
Create PII redaction
Implement policy compliance
Assemble the complete output guardrail system

Prompt Injection Defense · 7 goals

Identify injection vulnerabilities
Detect direct injections
Detect indirect injections
Implement defense layers
Build red-team suites
Implement canary tokens
LangGraph injection defense pipeline
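
As a rough illustration of the canary-token goal in this module, the sketch below plants a random marker in the system prompt and flags any model output that echoes it. The function names and the prompt layout are hypothetical choices for this example, not part of any framework named in the curriculum:

```python
import secrets


def make_canary(prefix: str = "CANARY") -> str:
    """Return a random marker that is very unlikely to occur in normal text."""
    return f"{prefix}-{secrets.token_hex(8)}"


def build_system_prompt(instructions: str, canary: str) -> str:
    """Embed the canary in the system prompt (hypothetical layout)."""
    return f"{instructions}\n[internal marker: {canary}]"


def output_leaks_canary(model_output: str, canary: str) -> bool:
    """True if the model output contains the canary, ignoring case."""
    return canary.lower() in model_output.lower()
```

A hit from `output_leaks_canary` suggests the system prompt was echoed back, which is a useful signal for system-prompt leakage. An exact substring match will miss encoded or paraphrased leaks, so it is best treated as one layer among several.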

Evaluations (Evals) · 6 goals

Design evaluation frameworks
Implement automated evaluation pipelines
Create task-specific metrics
Human evaluation protocols
Regression testing
Set baselines and track progress

Agent Benchmarking · 6 goals

Understand the GAIA benchmark
Implement ToolBench evaluation
Use AgentBench
Design domain-specific benchmarks
Cross-model performance comparison
Build benchmark dashboards

Tracing & Observability · 6 goals

Understand distributed tracing
Add tags and metadata
Context propagation
Build feedback collection
Integrate with Langfuse
Trace visualization

Tool Use Debugging · 6 goals

Tool selection failures and solutions
Argument validation systems
Build tool use dashboards and visualization
Schema mismatch detection
Tool call replay
Interactive tool debugger

GenAI Eval Safety Governance · 114 goals · 18.2%

Evaluation Dataset Curation · 6 goals

Build a stratified evaluation dataset
Implement dataset versioning and snapshots
Detect dataset contamination and leakage
Build automated dataset refresh pipeline
Create dataset cards for documentation
Build a dataset annotation pipeline

Evaluation Observability with Langfuse v3 & OpenTelemetry · 6 goals

Deploy Langfuse v3 on GKE with OpenTelemetry-native instrumentation
Build evaluation dashboards in Langfuse
Track costs and token usage across providers
Manage prompt versions with Langfuse
Compare observability platforms: Langfuse vs Arize Phoenix vs Braintrust
Build automated evaluation alerting

A/B Testing for LLM Systems · 6 goals

Design A/B experiments for prompt variants
Build traffic splitting with consistent assignment
Implement statistical significance testing
Monitor experiments with guardrail metrics
Run a model swap experiment (OpenAI vs Gemini)
Build experiment results dashboard

Evaluation-Driven CI/CD & Continuous Production Monitoring · 6 goals

Build evaluation gates with Promptfoo and DeepEval in CI
Implement progressive evaluation tiers
Implement cost-aware evaluation budgets
Build release validation pipeline
Track evaluation trends across releases
Build continuous production monitoring with async scoring

Cross-Model Evaluation · 6 goals

Build a standardized multi-provider eval harness
Compare structured output compliance across providers
Build cost-performance analysis across providers
Test provider reliability and error handling
Build a model selection decision matrix
Implement model migration playbook

Cost Governance & Token Budgets · 6 goals

Build per-user token usage tracking
Implement budget enforcement and rate limiting
Detect cost anomalies and spending spikes
Build intelligent model routing with LiteLLM gateway and RouteLLM semantic routing
Implement prompt compression and caching
Build cost governance dashboard and chargeback

Prompt Injection Defense · 6 goals

Detect prompt injection with Prompt Guard 2 and custom classifiers
Defend against indirect prompt injection
Prevent system prompt leakage (OWASP LLM07)
Build canary token detection
Deploy LlamaFirewall and Google Model Armor as unified defense orchestrators
Implement output filtering and response safety

Content Safety Filters · 6 goals

Compare guardrail frameworks: Guardrails AI vs NeMo Guardrails 0.20 vs NemoGuard NIMs vs Google Model Armor
Build custom domain-specific safety validators
Integrate Llama Guard 4 and hosted LLM safety APIs
Build multi-layer content safety pipeline
Tune filter sensitivity and manage false positives
Monitor content safety metrics in production

PII Detection & Redaction · 6 goals

Detect PII with Presidio and Google Sensitive Data Protection
Implement reversible PII redaction
Build custom PII recognizers for domain data
Implement PII audit logging for compliance
Scan LLM outputs for PII leakage
Build end-to-end PII protection pipeline

Hallucination Detection · 6 goals

Detect hallucinations with source grounding and Patronus AI Lynx 2.0
Implement citation verification
Build NLI-based faithfulness scoring
Classify hallucination types
Build production hallucination monitoring
Reduce hallucinations with prompt engineering

Adversarial Robustness Testing · 6 goals

Execute manual adversarial attack categories
Automated red teaming with PyRIT, Promptfoo Hydra, and Meta GOAT
Vulnerability scanning with NeMo Auditor and Garak 0.14
Build adversarial CI/CD test suite
Measure defense effectiveness against attacks
Build adversarial robustness dashboard

Agent Safety, MCP Security & Sandboxing · 6 goals

Validate agent tool calls against permission policies
Secure MCP servers and implement agent gateway patterns
Build human approval for high-risk operations
Detect privilege escalation in agent behavior
Build agent audit trail with GCP SCC Agent Engine Threat Detection
Build agent safety evaluation framework
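
To make the first goals in this module more concrete, here is a minimal sketch of validating an agent tool call against a permission policy, with a separate outcome for operations that need human approval. `ToolPolicy`, `check_tool_call`, and the tool names are hypothetical and not taken from any specific agent framework:

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ToolPolicy:
    """Hypothetical per-agent policy: allowed tools and those needing approval."""
    allowed: frozenset[str]
    needs_approval: frozenset[str] = field(default_factory=frozenset)


def check_tool_call(policy: ToolPolicy, tool_name: str) -> str:
    """Return 'deny', 'approve' (human approval required), or 'allow'."""
    if tool_name not in policy.allowed:
        return "deny"  # default-deny anything not explicitly listed
    if tool_name in policy.needs_approval:
        return "approve"
    return "allow"
```

The default-deny branch is the main design choice here: a tool the policy has never heard of, for example one advertised by a newly attached MCP server, is rejected rather than silently permitted.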

Multi-Modal Safety · 6 goals

Detect adversarial image inputs
Build image content safety filters with Llama Guard 4
Defend against cross-modal prompt injection
Implement safe multi-modal processing pipeline
Test vision model hallucination in multi-modal context
Monitor multi-modal safety in production

Vector & Embedding Security · 6 goals

Detect RAG data poisoning attacks
Implement document-level access control for RAG
Verify embedding integrity
Build adversarial embedding defense
Detect data exfiltration via RAG
Build vector store security dashboard

OWASP LLM Top 10 2025 & MITRE ATLAS · 6 goals

Map OWASP LLM Top 10 2025 to implemented defenses
Implement defenses for Unbounded Consumption (LLM10)
Build MITRE ATLAS threat models
Build OWASP compliance testing with Promptfoo presets and Checks by Google
Implement supply chain security (LLM03)
Create OWASP + ATLAS compliance dashboard

EU AI Act Compliance · 6 goals

Classify AI systems under EU AI Act risk categories
Implement the Feb 2025 AI literacy requirements
Build technical documentation for GPAI compliance
Implement risk management system
Build human oversight mechanisms
Track EU AI Act enforcement timeline compliance

Compliance Frameworks · 6 goals

Implement NIST AI RMF Govern and Map functions
Implement NIST AI RMF Measure and Manage functions
Build ISO 42001 AI management system documentation
Create unified governance dashboard with Credo AI Agent Registry
Implement comprehensive audit trail
Build compliance automation and alerting

Red Teaming Methodology · 6 goals

Plan and scope a red team exercise
Execute manual red team techniques
Combine manual, Meta GOAT automated, and Inspect AI red teaming
Build red team findings reports
Track remediation and verify fixes
Build AI safety scorecard and establish red team cadence

Bias, Fairness & Continuous Monitoring · 6 goals

Detect bias in hosted LLM outputs
Implement fairness metrics for LLM applications
Build continuous safety monitoring for production
Detect safety drift over time
Build safety incident response workflow
Generate weekly and monthly safety reports

GenAI Operations · 36 goals · 4.6%

PII Detection Pipeline · 6 goals

Deploy Microsoft Presidio on K8s for runtime PII detection in LLM traffic
Build PII scrubbing middleware that redacts sensitive data before sending to LLM providers
Implement PII detection alerting and audit logging for compliance
Create PII detection tuning workflows to reduce false positives
Implement performance optimization for runtime PII detection
Build operational documentation for runtime PII detection

Injection Monitoring System · 6 goals

Implement multi-layer prompt injection detection with pattern and embedding-based methods
Build real-time injection alerting with severity classification
Create injection attack analysis dashboards for security monitoring
Implement adaptive detection that learns from new attack patterns
Implement performance optimization for prompt injection monitoring
Build operational documentation for prompt injection monitoring

Guardrail Operations Platform · 6 goals

Deploy Guardrails AI and LlamaFirewall on K8s for runtime content validation
Implement hot-reload guardrail configuration without service restarts
Build A/B testing framework for guardrail thresholds to optimize block rates
Build testing and validation for guardrail operations
Implement performance optimization for guardrail operations
Build operational documentation for guardrail operations

Compliance Audit Engine · 6 goals

Implement automated compliance scans for GenAI-specific requirements
Build evidence collection pipelines that gather audit artifacts
Schedule recurring compliance checks with drift detection
Build testing and validation for compliance audit automation
Implement performance optimization for compliance audit automation
Build operational documentation for compliance audit automation

Red Team Automation Platform · 6 goals

Build automated red team attack suites using Promptfoo for systematic security testing
Implement scheduled security testing with regression detection across model changes
Build security posture scoring with trend monitoring and improvement tracking
Build testing and validation for red team operations
Implement performance optimization for red team operations
Build operational documentation for red team operations

AI Incident Response · 6 goals

AI Incident Taxonomy
Automated Detection
AI Incident Runbooks
Post-Incident Review
Escalation Path Design
Incident Response Capstone

AI Security Engineering (120 goals, 63.8%)

Prompt Injection Defense (6 goals)

Build prompt injection classifier using LLM-as-judge via LiteLLM
Implement input sanitization pipeline with NeMo Guardrails
Detect indirect injection in RAG-retrieved documents
Build defense-in-depth with layered guard chain
Deploy injection defense as FastAPI sidecar on GKE
Monitor injection attempts with Prometheus and Grafana

Jailbreak Prevention (6 goals)

Classify jailbreak attempts with pattern and semantic detection
Defend against multi-turn crescendo attacks
Implement response validation and refusal enforcement
Build adaptive jailbreak defense with evolving attack corpus
Integrate jailbreak defense into LiteLLM gateway
Red-team jailbreak defenses with Promptfoo

Output Sanitization Engineering (6 goals)

Build output content scanner for dangerous patterns
Enforce structured output schemas with Guardrails AI
Prevent sensitive data leakage in model outputs
Implement output length and cost controls
Deploy output sanitizer as response middleware on GKE
Test output safety with adversarial generation

Content Safety Pipelines (6 goals)

Build content safety classifier using hosted model APIs
Configure NeMo Guardrails for production safety flows
Deploy Guardrails AI validators for enterprise content policies
Implement async content moderation pipeline
Deploy multi-stage safety pipeline on GKE
Monitor content safety metrics and drift

Multimodal Injection Defense (6 goals)

Detect text-in-image injection attacks
Defend against audio-based prompt injection
Implement cross-modal consistency validation
Build unified multimodal input sanitization pipeline
Deploy multimodal defense on GKE with resource limits
Test multimodal defenses with adversarial corpus

RAG Data Poisoning Defense (6 goals)

Build document integrity validator with hash fingerprints
Implement embedding drift monitor for knowledge base poisoning
Deploy canary documents for tampering detection
Build provenance tracker for RAG data sources
Implement quarantine pipeline for suspicious documents
Deploy RAG defense system on GKE with pgvector

AI Supply Chain Security (6 goals)

Build AI Bill of Materials for model pipelines
Implement model provenance verification with Sigstore
Scan dependencies for vulnerabilities with Trivy
Detect malicious code in model files
Enforce supply chain policies in GKE with Binary Authorization
Monitor supply chain health with continuous auditing

PII Leakage Engineering (6 goals)

Configure Presidio recognizers for LLM-context PII
Build bidirectional PII redaction pipeline
Implement data loss prevention rules for LLM outputs
Integrate PII defense with LiteLLM gateway
Deploy PII defense pipeline on GKE
Monitor PII leakage metrics and compliance

Embedding & Vector Store Security (6 goals)

Implement row-level access control for vector stores
Defend against query injection in vector search
Implement embedding integrity and provenance tracking
Encrypt embeddings at rest and in transit
Deploy secure vector store on GKE with network isolation
Monitor vector store access patterns and anomalies

Agentic AI Security (6 goals)

Implement agent goal alignment validator
Defend against tool misuse and argument injection
Enforce least-privilege for agent tool access
Detect and contain rogue agent behavior
Deploy agent security monitor on GKE
Test agent security with adversarial scenarios

MCP Protocol Security (6 goals)

Detect tool poisoning in MCP tool descriptions
Implement MCP server authentication and authorization
Secure agent-to-agent communication channels
Prevent cross-server data exfiltration
Deploy secure MCP infrastructure on GKE
Audit MCP security with automated testing

GKE Security for AI Workloads (6 goals)

Enforce pod security standards for AI workloads
Configure GKE network policies for AI service isolation
Implement GKE Workload Identity for secure service auth
Deploy runtime threat detection with Falco
Implement container image security and scanning
Monitor GKE security posture continuously

API Security for LLM Endpoints (6 goals)

Implement OAuth2/OIDC authentication for LLM APIs
Build intelligent rate limiting for LLM endpoints
Enforce request validation and size controls
Implement response security controls
Deploy LLM API gateway on GKE with LiteLLM
Penetration test LLM APIs with OWASP ZAP

Secrets & Key Management for AI (6 goals)

Implement centralized secret storage with Vault
Sync secrets to GKE with External Secrets Operator
Implement automated API key rotation
Secure model credentials and inference tokens
Deploy secrets infrastructure on GKE with Workload Identity
Test secrets security and rotation resilience

Threat Modeling for AI Systems (6 goals)

Map OWASP LLM Top 10 2025 to system architecture
Assess OWASP Agentic Top 10 2026 risks
Apply MITRE ATLAS tactics and techniques
Build threat model for hosted LLM architectures
Generate threat mitigation playbooks
Automate threat model maintenance

Automated AI Red Teaming (6 goals)

Configure Promptfoo for comprehensive LLM vulnerability scanning
Run multi-turn adversarial campaigns with PyRIT
Probe model weaknesses with Garak adversarial probes
Test agent security with DeepTeam vulnerability scans
Integrate red teaming into CI/CD pipeline
Generate executive security assessment reports

Security Monitoring for AI (6 goals)

Build security event collection pipeline
Implement attack detection rules engine
Monitor agent behavior for security anomalies
Deploy security monitoring stack on GKE
Integrate threat intelligence feeds
Build security monitoring SLOs and reporting

AI Incident Response (6 goals)

Build incident classification and triage system
Implement agent containment procedures
Build forensic evidence collection pipeline
Implement model rollback and recovery
Deploy incident response automation on GKE
Conduct tabletop exercises and post-incident review

AI Security Compliance Engineering (6 goals)

Map security controls to EU AI Act requirements
Implement NIST AI RMF controls
Build ISO 42001 AI management system controls
Automate compliance evidence collection
Deploy compliance monitoring on GKE
Generate audit-ready compliance reports

Security Engineering Capstone (6 goals)

Design capstone threat model for multi-agent platform
Build integrated defense pipeline
Deploy security infrastructure on GKE
Execute red team assessment against deployed platform
Activate security monitoring and incident response
Generate compliance certification package

What you'll ship in production

Conduct adversarial red-team testing

Implement defense-in-depth guardrails

Threat-model GenAI agent systems

Build PII protection

Design compliance programs

Build security monitoring

Implement incident response

Secure GenAI supply chain