{"title":"GenAI Engineering Role Taxonomy","version":"1.0","publishedAt":"2026-06-14","skillCatalogVersion":7,"license":"CC-BY-4.0","citeAs":"GenBodha. \"The GenAI Engineering Role Taxonomy v1.0.\" genbodha.ai, 2026-06-14. https://genbodha.ai/taxonomy (CC-BY-4.0).","source":"https://genbodha.ai/taxonomy","methodology":"Roles are analyst-curated from live GenAI job descriptions and vetted by practicing engineers; each role's responsibilities map to a skill ladder assessed by graded labs. Disciplines are the role unit; skills are the assessment unit.","counts":{"disciplines":12,"skills":102,"categories":14},"disciplines":[{"id":"genai-app","name":"GenAI Application Engineering","description":"Build production RAG & prompt chain applications, design streaming chat UIs, implement guardrails & evaluation, optimize LLM inference costs, and deploy on Kubernetes.","responsibilities":[{"title":"Design and build production GenAI features","context":"(chatbots, search, summarization) into web applications","learn":["Build streaming chat UIs with FastAPI backends using SSE and WebSocket transports","Wire React frontends to LLM-powered APIs with end-to-end full-stack integration","Deploy complete GenAI applications from prototype to production on Kubernetes"]},{"title":"Implement RAG pipelines","context":"with vector databases for enterprise search and knowledge retrieval","learn":["Build end-to-end RAG: document chunking → embedding generation → pgvector storage → LangGraph retrieval nodes","Validate retrieval accuracy using RAGAS metrics and implement self-verification loops","Benchmark chunking strategies and HNSW/IVFFlat index types against precision-recall tradeoffs"]},{"title":"Optimize LLM inference","context":"for latency, cost, and reliability across multiple providers","learn":["Configure multi-provider routing with LiteLLM gateway including load balancing and failover","Implement semantic caching with Redis + embedding similarity to reduce costs by 40%+","Extract 
structured outputs with Pydantic AI and handle provider-specific error recovery"]},{"title":"Integrate LLM APIs","context":"(OpenAI, Gemini, Anthropic) into existing applications with error handling","learn":["Connect to OpenAI, Anthropic, and Gemini APIs with streaming, function calling, and embeddings","Build FastAPI rate limiting middleware with exponential backoff and retry logic","Navigate provider contract differences across authentication, token limits, and response formats"]},{"title":"Build GenAI agent features","context":"with tool calling, function execution, and human-in-the-loop workflows","learn":["Design LangGraph state machines with structured tool calling and JSON schema validation","Implement MCP tool integration for dynamic tool discovery and execution","Wire interruptible agent workflows with human approval gates and checkpoint persistence"]},{"title":"Evaluate model outputs","context":"using automated metrics and LLM-as-judge for production quality","learn":["Build evaluation pipelines using RAGAS faithfulness/relevance metrics and DeepEval harnesses","Integrate LLM-as-judge scoring into CI/CD gates for automated quality control","Track quality metrics over time with Langfuse dashboards and regression detection"]},{"title":"Deploy and containerize","context":"GenAI applications on Kubernetes with CI/CD","learn":["Containerize FastAPI + LLM applications with multi-stage Docker builds","Deploy to Kubernetes with Helm charts, readiness probes, and Ingress configuration","Automate rollouts with ArgoCD GitOps workflows and Kustomize environment overlays"]}]},{"id":"ai-agent","name":"GenAI Agent Engineering","description":"Build autonomous multi-agent systems with planning, reasoning, tool use, memory, MCP/A2A protocols, safety boundaries, and production evaluation.","responsibilities":[{"title":"Design autonomous GenAI agents","context":"using state machines with tool calling, memory, and planning","learn":["Build LangGraph agents from scratch: 
define graph nodes, conditional edges, state schemas, and checkpointing","Progress from simple ReAct agents → planning agents → multi-step agents with persistent memory","Apply state machine theory to design agent graphs for complex, real-world task scenarios"]},{"title":"Build multi-agent systems","context":"with supervisor/worker hierarchies, delegation, and parallel execution","learn":["Implement supervisor agent patterns that route tasks to specialist worker agents","Construct hierarchical team structures with dynamic agent spawning and swarm coordination","Monitor cross-agent execution with delegation rules and parallel task orchestration"]},{"title":"Implement MCP servers and clients","context":"for standardized tool integration","learn":["Build Model Context Protocol servers that expose REST APIs as discoverable agent tools","Implement MCP clients in LangGraph agents with dynamic tool registration and schema negotiation","Validate tool selection accuracy across diverse query types and measure invocation reliability"]},{"title":"Enable agent-to-agent communication","context":"using A2A protocol for cross-framework interoperability","learn":["Implement A2A v0.3 protocol mechanics: Agent Cards, task lifecycle management, and gRPC transport","Build A2A-compatible agents using Google ADK with capability advertising","Verify cross-framework interoperability between independently built agent systems"]},{"title":"Build production RAG agents","context":"with iterative retrieval, self-verification, and query decomposition","learn":["Add vector search nodes to LangGraph agent graphs with quality-checked retrieval loops","Implement query decomposition for complex multi-part questions with iterative refinement","Benchmark agentic RAG against static RAG pipelines using faithfulness and relevance metrics"]},{"title":"Implement guardrails and safety controls","context":"within agent workflows","learn":["Integrate NeMo Guardrails for content filtering within running agent 
execution loops","Add LlamaFirewall middleware with policy-based tool access control and output filtering","Quantify safety-vs-helpfulness tradeoffs using adversarial test suites and scoring rubrics"]},{"title":"Evaluate agent performance","context":"with trajectory analysis and cost tracking","learn":["Build evaluation harnesses measuring trajectory quality, tool selection accuracy, and task completion","Run agents against standardized test suites and analyze per-task token cost attribution","Track agent quality regressions over time with Langfuse observability dashboards"]},{"title":"Design context engineering","context":"— systematic composition of prompts, memory, tools, and history","learn":["Structure system prompts, conversation memory windows, and tool result formatting strategies","Optimize context window utilization across multi-turn conversations with token budgeting","Measure agent behavior differences across context designs using controlled A/B evaluations"]}]},{"id":"ai-inference","name":"GenAI Inference Engineering","description":"Architect multi-provider LLM gateways, implement semantic caching and batch optimization, monitor provider SLAs, and optimize inference costs.","responsibilities":[{"title":"Design LLM gateway infrastructure","context":"routing requests across providers","learn":["Deploy and configure LiteLLM gateway on Kubernetes with provider routing rules and load balancing","Manage API key rotation, failover policies, and per-provider request distribution","Validate gateway behavior under failover scenarios and measure routing latency overhead"]},{"title":"Optimize request latency","context":"through caching, batching, and streaming","learn":["Implement semantic caching with Redis using embedding similarity for cache key matching","Build request batching strategies and streaming-first response patterns","Benchmark cache hit rates, measure P50/P95 latency improvements, and tune eviction policies"]},{"title":"Implement structured output 
extraction","context":"from LLMs with type safety","learn":["Use Pydantic AI for type-safe LLM interactions with guaranteed schema compliance","Build structured extraction pipelines with Instructor and DSPy for programmatic optimization","Validate extraction accuracy across providers and measure schema conformance rates"]},{"title":"Build cost attribution and FinOps dashboards","context":"tracking token spend","learn":["Track token costs per team, model, and feature using Langfuse cost attribution","Build Grafana dashboards for cost visualization with Prometheus budget alerting","Implement cost optimization through semantic caching, model tiering, and prompt compression"]},{"title":"Monitor inference quality metrics","context":"in production","learn":["Instrument LLM calls with OpenTelemetry spans capturing latency, tokens, and error rates","Set up Logfire for Python-native tracing and Prometheus for P50/P95/P99 latency monitoring","Configure alerting rules that detect latency spikes and diagnose root causes from distributed traces"]},{"title":"Implement intelligent routing","context":"— route queries to model tiers based on complexity","learn":["Build RouteLLM semantic routing with model cascading: cheap models for simple queries, expensive models for complex ones","Configure complexity-based dispatch logic with fallback chains across providers","Demonstrate 60%+ cost savings while maintaining output quality on standardized test datasets"]},{"title":"Manage API rate limits and quotas","context":"across providers","learn":["Build rate limiting middleware in FastAPI with per-endpoint and per-user throttling","Configure LiteLLM quota management with per-team token budgets and key rotation policies","Validate graceful degradation behavior under sustained load with provider quota exhaustion"]},{"title":"Deploy inference services on K8s","context":"with scaling and health checks","learn":["Configure Kubernetes Deployments with readiness/liveness probes tailored for LLM services","Set up 
Horizontal Pod Autoscaler with custom metrics for token throughput scaling","Validate zero-downtime rolling updates under active inference load"]}]},{"id":"ai-platform","name":"GenAI Platform Engineering","description":"Build internal GenAI developer platforms with self-service capabilities, multi-tenancy, RBAC, CI/CD for model/prompt/guardrail pipelines.","responsibilities":[{"title":"Build the internal GenAI platform","context":"enabling developers to deploy LLM applications self-service","learn":["Design platform APIs with golden path templates and self-service provisioning workflows","Build developer portals with pre-approved LLM configurations, guardrails, and monitoring included","Wire end-to-end self-service: from app registration to deployed inference endpoint with observability"]},{"title":"Design multi-tenant infrastructure","context":"with namespace isolation and RBAC","learn":["Implement Kubernetes namespace isolation with RBAC policies and resource quotas per tenant","Automate tenant provisioning with network policies and admission controllers","Validate tenant isolation by enforcing resource limits under concurrent multi-team workloads"]},{"title":"Implement CI/CD pipelines","context":"with GitOps for GenAI applications","learn":["Set up ArgoCD GitOps for declarative deployment from Git push to production rollout","Build GitHub Actions workflows with act for local CI and Helm chart packaging","Wire complete GitOps pipelines with Kustomize overlays for dev/staging/production environments"]},{"title":"Manage data infrastructure","context":"— databases, caches, message queues on K8s","learn":["Deploy PostgreSQL + pgvector, Redis, Kafka, Neo4j, and MinIO as Kubernetes-native services","Configure backup/restore, horizontal scaling, and monitoring for each data component","Benchmark throughput and failover behavior for each infrastructure component under load"]},{"title":"Build autoscaling for GenAI workloads","context":"using event-driven scaling and batch 
job queuing","learn":["Configure KEDA for event-driven pod autoscaling based on queue depth, HTTP rate, and custom metrics","Set up Kueue for Kubernetes-native batch job scheduling with priorities and fair quotas","Validate autoscaling policies under burst GenAI workloads with realistic traffic patterns"]},{"title":"Provision infrastructure-as-code","context":"using K8s-native tooling","learn":["Declare infrastructure as Kubernetes custom resources with Crossplane providers","Manage databases, storage, and networking declaratively through kubectl apply","Verify reconciliation behavior by modifying infrastructure state and observing self-healing"]},{"title":"Implement full-stack observability","context":"across the GenAI platform","learn":["Build unified observability with Prometheus metrics, Grafana dashboards, and OpenTelemetry tracing","Add Logfire for Python application tracing and Langfuse for LLM-specific cost and quality monitoring","Wire a unified observability stack spanning infrastructure, application, and LLM inference layers"]},{"title":"Operate LLM gateways","context":"as platform infrastructure","learn":["Manage LiteLLM gateway operations: API key lifecycle, per-team cost tracking, and provider health","Handle model version migration and zero-downtime provider switching","Operate a production gateway serving multiple internal teams with isolated quotas and routing"]}]},{"id":"forward-deployed","name":"Forward Deployed GenAI Engineering","description":"Rapid-prototype GenAI solutions on customer infrastructure, integrate GenAI with customer data and workflows, scope solutions with delivery methodology.","responsibilities":[{"title":"Embed on-site with clients","context":"to discover GenAI opportunities and scope projects","learn":["Run structured discovery sessions: stakeholder interviews, process mapping, and opportunity scoring","Score automation opportunities by ROI and write scope documents with acceptance criteria","Simulate realistic client 
discovery engagements with estimation and risk assessment exercises"]},{"title":"Build rapid prototypes","context":"that demonstrate GenAI value within weeks","learn":["Go from problem statement to working prototype using LangGraph agent logic and MCP tool integration","Iterate prototypes based on evaluation metrics and present results to stakeholders","Build end-to-end prototypes under time constraints with evaluation-driven iteration cycles"]},{"title":"Integrate GenAI into client data systems","context":"— databases, APIs, and legacy systems","learn":["Connect LLM applications to PostgreSQL, pgvector, Redis, Kafka, MinIO, and REST APIs","Build data ingestion pipelines that feed RAG systems from existing databases and legacy endpoints","Implement common enterprise integration patterns with real database connections and API adapters"]},{"title":"Customize LLM applications","context":"for client-specific domains (healthcare, finance, legal)","learn":["Build domain-specific RAG pipelines with HIPAA-compliant PII detection using Presidio","Construct financial RAG systems with regulatory citation and legal contract analysis pipelines","Validate domain-specific compliance constraints across regulated industry scenarios"]},{"title":"Deploy solutions as packaged Helm charts","context":"clients can operate independently","learn":["Package GenAI applications as self-contained Helm charts with Kustomize overlays per environment","Write operational runbooks and define SLAs with integrated monitoring and alerting","Simulate a complete solution handoff including packaging, documentation, and operational validation"]},{"title":"Build GenAI agent workflows","context":"tailored to client business processes","learn":["Design LangGraph agents with human-in-the-loop approval gates and MCP-based tool integration","Customize agent behavior for different business process requirements and approval hierarchies","Build and deploy domain-specific agents adapted to varied client business 
scenarios"]},{"title":"Manage LLM provider costs","context":"and build FinOps models for client engagements","learn":["Optimize multi-provider costs via LiteLLM routing with cost-per-request modeling","Build ROI estimation frameworks and pricing models for client proposals","Tune provider selection strategies across usage scenarios to hit target cost margins"]},{"title":"Configure enterprise guardrails","context":"to meet client compliance requirements","learn":["Set up NeMo Guardrails for content safety and Presidio for multi-language PII detection","Configure compliance-specific policies aligned with SOC 2, HIPAA, and GDPR requirements","Validate guardrail configurations against adversarial test suites in regulated industry scenarios"]}]},{"id":"llmops","name":"LLMOps Engineering","description":"Monitor hallucination rates and token costs, operate guardrails and eval gates, manage prompt versioning and canary deployments.","responsibilities":[{"title":"Design CI/CD pipelines","context":"for LLM application deployment","learn":["Build ArgoCD GitOps workflows with Helm-based deployments and environment promotion","Implement canary and blue-green rollout strategies with automated quality-based rollback","Wire complete CI/CD pipelines that trigger rollbacks when evaluation metrics degrade"]},{"title":"Monitor LLM systems in production","context":"— latency, errors, costs, quality","learn":["Instrument with OpenTelemetry and Langfuse v3 for OTEL-native distributed tracing","Build Grafana dashboards with Logfire for Python application monitoring and alerting","Set up monitoring stacks that detect anomalies, fire alerts, and enable trace-based root cause analysis"]},{"title":"Manage LLM gateway operations","context":"— key rotation, failover, quota management","learn":["Operate LiteLLM gateway: API key lifecycle management, provider health monitoring, per-team quotas","Handle zero-downtime model version switching with traffic draining and validation","Simulate provider 
outages and quota exhaustion to validate failover and degradation behavior"]},{"title":"Implement FinOps practices","context":"— cost attribution, budgets, and optimization","learn":["Track token costs by team, feature, and model with Prometheus-based budget alerting","Implement cost optimization through semantic caching, model tiering, and prompt compression","Build FinOps dashboards that demonstrate measurable cost reduction across optimization strategies"]},{"title":"Build continuous evaluation pipelines","context":"for production LLM quality","learn":["Run RAGAS and DeepEval evaluation pipelines alongside production traffic as shadow evaluators","Set up Langfuse-based quality tracking with automated quality gates and threshold alerting","Detect quality degradation in real time and trigger automated alerts when scores drop below baselines"]},{"title":"Detect and respond to prompt attacks","context":"and safety incidents in production","learn":["Monitor NeMo Guardrails operationally for prompt injection and jailbreak detection patterns","Classify incident severity and execute structured response workflows with containment procedures","Simulate attack scenarios end-to-end: detection, triage, remediation, and post-incident analysis"]},{"title":"Manage data quality for RAG systems","context":"— freshness, drift, accuracy","learn":["Monitor embedding drift and retrieval accuracy with continuous RAGAS evaluation","Set up automated reindexing triggers and stale content detection pipelines","Build monitoring for live RAG systems that detects quality degradation and triggers reindexing workflows"]},{"title":"Implement capacity planning","context":"— predict demand and right-size deployments","learn":["Forecast token demand using historical usage patterns and run load tests for LLM services","Model SLA capacity requirements and configure KEDA-based autoscaling policies","Run load tests that predict capacity requirements and validate SLA compliance under variable 
traffic"]}]},{"id":"ai-safety","name":"GenAI Safety & Evaluation Engineering","description":"Design automated LLM evaluation pipelines, red-team GenAI systems, build bias detection and fairness benchmarks, implement guardrails.","responsibilities":[{"title":"Build automated evaluation pipelines","context":"to continuously measure LLM output quality","learn":["Design evaluation harnesses with RAGAS, DeepEval, and NeMo Evaluator SDK for multi-metric scoring","Create evaluation datasets with ground-truth annotations and run cross-provider comparisons","Wire CI gates that automatically block deployments when faithfulness or relevance scores degrade"]},{"title":"Conduct red-team exercises","context":"— probe LLMs for vulnerabilities","learn":["Automate adversarial testing with Garak for prompt injection, jailbreak, and data extraction probes","Run multi-turn adversarial campaigns with Meta GOAT and DeepTeam for agent vulnerability testing","Execute red-team campaigns against realistic systems, discover vulnerabilities, and write actionable findings"]},{"title":"Implement production guardrails","context":"— content filters, PII detection, jailbreak prevention","learn":["Configure NeMo Guardrails with Colang policy language, Llama Guard 4, and Prompt Guard 2","Add Presidio for PII detection/redaction and Model Armor for Google-native content safety","Layer multiple defenses, test against comprehensive attack suites, and quantify safety-vs-helpfulness tradeoffs"]},{"title":"Design GenAI governance frameworks","context":"aligned with regulations","learn":["Map EU AI Act risk classification and implement NIST AI RMF control frameworks","Build OWASP LLM Top 10 mitigation strategies mapped to technical controls","Create governance artifacts, conduct risk assessments, and build automated audit trail pipelines"]},{"title":"Evaluate GenAI agent behavior","context":"— trajectory quality, tool selection accuracy","learn":["Build trajectory scoring systems measuring tool selection 
accuracy and task completion quality","Design human preference alignment tests and regression test suites for agent workflows","Evaluate multi-step agent executions to identify failure modes and build targeted regression tests"]},{"title":"Monitor bias, fairness, and hallucination rates","context":"in production","learn":["Detect bias across protected attributes using statistical fairness metrics and disparity analysis","Measure hallucination rates through ground-truth comparison and citation verification","Implement continuous bias scanning, hallucination detection, and alerting for metric drift"]},{"title":"Build safety incident response processes","context":"for deployed GenAI systems","learn":["Design safety monitoring dashboards with severity-based alert routing and escalation paths","Build incident triage workflows with containment procedures and post-incident reporting templates","Simulate safety incidents end-to-end and practice the full detection-to-resolution workflow"]},{"title":"Design LlamaFirewall policies","context":"for agent safety","learn":["Configure LlamaFirewall middleware for controlling agent tool access and output filtering rules","Set up multi-agent safety boundaries with policy-based execution constraints","Validate firewall policies against adversarial scenarios where agents attempt to bypass controls"]}]},{"id":"ai-security","name":"GenAI Security Engineering","description":"Engineer defenses against prompt injection, jailbreaks, and data exfiltration. 
Implement PII leakage detection, content safety, and compliance.","responsibilities":[{"title":"Conduct adversarial red-team testing","context":"of LLM systems","learn":["Automate red-teaming with Garak for prompt injection, jailbreak, and data extraction probes","Run multi-turn adversarial campaigns with Meta GOAT and structured vulnerability reporting","Execute campaigns against realistic GenAI systems, discover attack vectors, and produce actionable reports"]},{"title":"Implement defense-in-depth guardrails","context":"— input validation, output filtering, content safety","learn":["Layer NeMo Guardrails, Llama Guard 4, Prompt Guard 2, and Model Armor into a unified defense stack","Configure multi-layer input validation, output filtering, and content classification policies","Measure the safety-vs-helpfulness tradeoff across different defense layer configurations"]},{"title":"Threat-model GenAI agent systems","context":"— analyze attack surfaces across tools, memory, and inter-agent communication","learn":["Analyze MCP security boundaries, memory manipulation vectors, and inter-agent trust relationships","Map tool access control surfaces and agent communication channel vulnerabilities","Threat-model a complete multi-agent system, identify attack vectors, and design targeted mitigations"]},{"title":"Build PII protection","context":"— detect, classify, and redact sensitive data in LLM pipelines","learn":["Integrate Presidio for multi-language PII detection with custom entity recognizers","Implement masking vs. 
pseudonymization redaction strategies with compliance validation","Configure PII protection for a RAG pipeline and verify zero sensitive data leakage in outputs"]},{"title":"Design compliance programs","context":"aligned with OWASP LLM Top 10, MITRE ATLAS, EU AI Act","learn":["Map OWASP LLM Top 10 mitigations to specific technical controls and implementation patterns","Implement MITRE ATLAS threat taxonomy and NIST AI RMF compliance frameworks","Create compliance mappings for GenAI systems and design repeatable audit procedures"]},{"title":"Build security monitoring","context":"for GenAI systems","learn":["Build security-specific monitoring dashboards with anomalous prompt pattern detection","Detect data exfiltration attempts, unusual token patterns, and adversarial input signatures","Monitor a production-like GenAI system and detect simulated attacks in real time"]},{"title":"Implement incident response","context":"for GenAI security events","learn":["Build GenAI-specific incident response playbooks with severity classification and containment procedures","Design forensic analysis workflows for LLM interactions and post-incident reporting","Simulate security incidents and practice the full end-to-end response lifecycle"]},{"title":"Secure GenAI supply chain","context":"— model provenance, dependency scanning, container security","learn":["Verify model integrity with provenance checks and scan dependencies for known vulnerabilities","Design secure CI/CD pipelines with container image scanning and signing for GenAI deployments","Audit a complete GenAI application supply chain and implement security controls at each stage"]}]},{"id":"solutions-arch","name":"GenAI Solutions Architecture","description":"Design enterprise GenAI reference architectures, create ADRs and technical standards, bridge GenAI with enterprise workflows.","responsibilities":[{"title":"Define enterprise GenAI architecture","context":"with proper documentation and governance","learn":["Write 
Architecture Decision Records (ADRs) for GenAI system design choices with trade-off analysis","Design reference architectures for common enterprise GenAI use cases","Create ADRs, design reference architectures, and present trade-off analyses to stakeholders"]},{"title":"Design scalable RAG systems","context":"at enterprise scale","learn":["Architect full RAG stacks: document processing → embedding pipelines → pgvector → hybrid search with reranking","Design multi-tenant data isolation with embedding pipeline separation and row-level security","Benchmark RAG systems against enterprise-scale document volumes for throughput and accuracy"]},{"title":"Architect multi-agent systems","context":"with MCP mesh and A2A network topology","learn":["Design MCP mesh architecture for distributed tool access across organizational boundaries","Plan A2A agent network topologies with lifecycle governance and communication protocols","Stress-test multi-agent architectures with simulated failure scenarios and cascading fault injection"]},{"title":"Lead PoC development and production rollouts","context":"with model selection and cost estimation","learn":["Compare models across providers with cost-per-request modeling and quality benchmarking","Build prototype evaluation frameworks with production readiness checklists and go/no-go criteria","Evaluate models for specific use cases, build cost projections, and create decision frameworks"]},{"title":"Design GenAI governance architecture","context":"— RBAC, audit trails, and compliance","learn":["Build multi-tenant GenAI governance with role-based access control for models, prompts, and data","Design audit trail architecture with policy-as-code enforcement and compliance reporting","Architect governance for multi-business-unit enterprises and validate regulatory compliance"]},{"title":"Oversee operational architecture","context":"— observability, FinOps, SLA management","learn":["Design full-stack observability architecture spanning metrics, 
logs, traces, and LLM-specific telemetry","Architect FinOps dashboards and incident response workflows with SLA definition and monitoring","Validate operational architecture designs against production SLA targets and failure scenarios"]},{"title":"Integrate GenAI with enterprise data platforms","context":"— pipelines, knowledge graphs, streaming","learn":["Design data architectures integrating PostgreSQL, pgvector, Kafka streaming, Neo4j, Redis, and MinIO","Architect data flows that support multiple GenAI use cases simultaneously across shared infrastructure","Build data architecture designs for multi-use-case enterprise scenarios with isolation and scaling"]},{"title":"Present architecture decisions","context":"with cost/risk analysis to leadership","learn":["Apply ADR methodology with structured trade-off analysis and risk quantification frameworks","Conduct architecture reviews with stakeholders and defend design decisions under scrutiny","Write ADRs, conduct architecture reviews, and present cost/risk arguments for design choices"]}]},{"id":"solutions-delivery","name":"GenAI Solutions & Delivery","description":"Scope GenAI solutions with estimation, risk, and success criteria. 
Orchestrate delivery teams, manage client relationships.","responsibilities":[{"title":"Lead end-to-end GenAI project delivery","context":"from discovery through production handoff","learn":["Run the complete delivery lifecycle: discovery workshops → problem scoping → rapid prototyping → handoff","Drive evaluation-driven iteration with measurable quality gates and knowledge transfer","Walk through each delivery phase with realistic client scenarios including scoping and risk assessment"]},{"title":"Design GenAI architecture","context":"for client engagements","learn":["Apply cell-based AI, MCP mesh, and multi-tenant architecture patterns to client requirements","Write ADR documentation with reference designs and technology evaluation rationale","Create architecture proposals for varied client scenarios and defend design decisions under review"]},{"title":"Build agent-based solutions","context":"for client business processes","learn":["Design LangGraph agents with MCP tool integration and human-in-the-loop approval gates","Customize agent behavior, tool access, and workflow logic for different business process requirements","Build and deploy domain-specific agents adapted to varied client business scenarios"]},{"title":"Customize enterprise LLM deployments","context":"— gateways, RAG, domain adaptation","learn":["Operate LiteLLM gateways with multi-provider management and enterprise RAG stack customization","Adapt LLM deployments for healthcare, finance, and legal verticals with domain-specific constraints","Deliver end-to-end LLM customization for regulated industries with compliance validation"]},{"title":"Manage FinOps","context":"for client GenAI projects","learn":["Build token cost attribution models with budget forecasting and TCO analysis for proposals","Design cost optimization strategies across providers, caching tiers, and model selection","Build cost models, forecast annual spend, and present optimization recommendations to stakeholders"]},{"title":"Scope 
project timelines and team requirements","context":"","learn":["Apply effort estimation frameworks designed for non-deterministic GenAI project delivery","Map team skills, assess technical risks, and develop detailed project proposals","Estimate effort for sample GenAI projects and identify optimal team composition and skill coverage"]},{"title":"Package solutions as deployable artifacts","context":"for client operations teams","learn":["Build Helm charts with operational runbooks, SLA definitions, and integrated monitoring","Create client handoff documentation with deployment guides and escalation procedures","Package a complete GenAI solution and conduct a simulated client handoff with operational validation"]},{"title":"Advise clients on technology roadmaps","context":"with emerging GenAI patterns","learn":["Evaluate emerging patterns: A2A protocol, MCP mesh, cell-based AI, and multi-tenant architectures","Assess industry trends, adoption timelines, and migration strategies for client technology stacks","Build technology roadmap recommendations that balance innovation with operational stability"]}]},{"id":"eng-manager","name":"GenAI Engineering Leader","description":"Hire and build GenAI engineering teams, design team structures for GenAI, set engineering quality frameworks.","responsibilities":[{"title":"Hire and build GenAI engineering teams","context":"","learn":["Define GenAI-specific hiring criteria and design technical interviews for LLM and agent engineering roles","Build skill assessment frameworks and team composition strategies balancing generalist and specialist profiles","Write job descriptions, design interview rubrics, and evaluate candidates against GenAI competency matrices"]},{"title":"Define engineering processes","context":"for GenAI development — eval-driven workflows","learn":["Design GenAI-specific sprint planning with eval-driven development as the core feedback loop","Define evaluation metrics before writing code and measure GenAI team 
velocity with non-deterministic outputs","Build team workflows integrating Langfuse for evaluation tracking and Grafana for velocity metrics"]},{"title":"Manage quality and team performance","context":"for GenAI outputs","learn":["Define GenAI quality metrics and SLA management frameworks for LLM system reliability","Build team performance dashboards using Grafana with latency, quality, and throughput indicators","Construct performance dashboards and define quality standards for GenAI engineering deliverables"]},{"title":"Understand the technical stack","context":"deeply enough to unblock teams","learn":["Learn LLM fundamentals, LangGraph agent engineering patterns, and LiteLLM gateway operations","Monitor production systems with Langfuse and Prometheus to review PRs and debug incidents","Gain sufficient depth to make architecture calls, review designs, and unblock teams on technical decisions"]},{"title":"Operate and budget for GenAI infrastructure","context":"— FinOps and capacity","learn":["Build LLM cost attribution dashboards with capacity planning and budget forecasting models","Manage vendor relationships and optimize spend allocation across multiple LLM providers","Construct FinOps dashboards, set team-level token budgets, and produce monthly cost reports for leadership"]},{"title":"Design organization structure","context":"for GenAI engineering teams","learn":["Apply GenAI team topology patterns including on-call rotation design and knowledge sharing practices","Evaluate embed-vs-centralize tradeoffs for GenAI engineering functions across the organization","Design org structures for different company sizes with clear ownership boundaries and escalation paths"]},{"title":"Drive technical strategy","context":"— evaluate new tools and plan migrations","learn":["Apply technology evaluation frameworks with structured criteria for GenAI tool and platform selection","Build migration planning methodology and strategic roadmaps for technology transitions","Evaluate 
new tools against defined criteria, build migration plans, and present strategy to leadership"]},{"title":"Ensure responsible AI practices","context":"across your team","learn":["Design governance policies and safety review processes for GenAI system development and deployment","Build compliance workflows and team-level responsible AI standards with enforcement mechanisms","Create governance policies and integrate safety review checkpoints into the development lifecycle"]}]},{"id":"data-eng","name":"GenAI Data Engineering","description":"Build RAG data pipelines for ingestion, chunking, embedding, and indexing. Manage vector store operations and embedding model lifecycle.","responsibilities":[{"title":"Build embedding pipelines","context":"— ingest, chunk, embed, and store in vector databases","learn":["Select and benchmark embedding models across OpenAI and Gemini for domain-specific accuracy","Implement chunking strategies (fixed, semantic, recursive) with batch embedding generation","Build complete pipelines processing thousands of documents into pgvector with HNSW indexing"]},{"title":"Design RAG data infrastructure","context":"— hybrid search and reranking","learn":["Build BM25 + semantic hybrid search with LLM-as-reranker patterns using Gemini","Implement semantic caching for throughput optimization and query result deduplication","Construct hybrid search pipelines and benchmark retrieval quality with RAGAS precision-recall metrics"]},{"title":"Build knowledge graph pipelines","context":"using Neo4j","learn":["Extract entities from unstructured text and construct knowledge graphs with relationship typing","Implement GraphRAG patterns and agentic Graph-RAG with MCP tool integration for graph traversal","Build knowledge graphs from document corpora and query them with graph-aware retrieval agents"]},{"title":"Process documents at scale","context":"— parsing, chunking, and quality filtering","learn":["Process multi-format documents with Docling across PDF, HTML, 
and Office formats","Apply intelligent context-preserving chunking and GPU-accelerated curation with NeMo Curator","Build document processing pipelines that handle real-world messy data with quality filtering"]},{"title":"Implement data quality controls","context":"— PII, dedup, compliance filtering","learn":["Integrate Presidio for PII detection with custom entity recognizers and deduplication strategies","Build compliance pipelines with content classification for regulated industries","Construct quality gates that block non-compliant documents from entering the embedding pipeline"]},{"title":"Orchestrate data pipelines","context":"with scheduling and failure recovery","learn":["Use Argo Workflows for Kubernetes-native pipeline orchestration with DVC data versioning","Build quality gates between pipeline stages with dead-letter queues and failure recovery patterns","Wire multi-stage pipelines with automatic retry, checkpoint recovery, and quality validation gates"]},{"title":"Monitor pipeline health","context":"— freshness, quality scores, embedding drift","learn":["Instrument pipeline stages with OpenTelemetry and build Grafana dashboards for freshness and quality","Monitor retrieval quality continuously with RAGAS evaluation and embedding drift detection","Build monitoring for live pipelines that detects data quality degradation and triggers remediation"]},{"title":"Design multi-tenant data isolation","context":"for enterprise RAG","learn":["Build tenant-aware embedding pipelines with pgvector namespace isolation per customer","Implement row-level security for vector search with per-tenant quality monitoring","Verify tenant data isolation under concurrent multi-tenant queries with cross-tenant leakage tests"]}]}],"skillCategories":[{"category":"Agent core","skills":[{"skillId":"agent_memory_systems","name":"Agent Memory Systems","description":"Implements short-term sliding windows, semantic memory, and context optimization for 
agents.","disciplines":["ai-agent-engineer","forward-deployed-ai-engineer","genai-application-engineer","genai-solutions-architect"]},{"skillId":"agent_state_graph_patterns","name":"Agent State-Graph Patterns","description":"Designs agent state-graph pipelines with typed schemas, nodes, edges, conditional routing, compilation, and state-transition debugging.","disciplines":["ai-agent-engineer","forward-deployed-ai-engineer","genai-application-engineer","genai-solutions-architect"]},{"skillId":"agent_tool_design","name":"Agent Tool Design & Validation","description":"Designs typed agent tools with Pydantic schemas, docstring parsing, and runtime validation.","disciplines":["ai-agent-engineer","forward-deployed-ai-engineer","genai-application-engineer","genai-solutions-architect"]},{"skillId":"agentic_rag_knowledge_graphs","name":"Agentic RAG & Knowledge Graphs","description":"Builds Self-RAG, Corrective RAG, and GraphRAG pipelines with adaptive retrieval and entity graphs.","disciplines":["ai-agent-engineer","forward-deployed-ai-engineer","genai-application-engineer","genai-data-engineer","genai-solutions-architect"]},{"skillId":"enterprise_agent_patterns","name":"Enterprise Vertical Agent Patterns","description":"Builds document-processing, triage, and code-review agents with domain-specific tool sets and human handoff points.","disciplines":["ai-agent-engineer","forward-deployed-ai-engineer","genai-solutions-architect","genai-solutions-delivery-lead"]},{"skillId":"langgraph_framework_usage","name":"LangGraph Framework Usage","description":"Builds agents with the LangGraph library: StateGraph, conditional edges, MessagesState, ToolNode integration.","disciplines":["ai-agent-engineer","forward-deployed-ai-engineer","genai-application-engineer","genai-solutions-architect"]},{"skillId":"multi_agent_orchestration","name":"Multi-Agent Orchestration","description":"Builds supervisor, hierarchical, and reflector multi-agent patterns with handoffs and result 
aggregation.","disciplines":["ai-agent-engineer","forward-deployed-ai-engineer","genai-solutions-architect"]},{"skillId":"specialized_multimodal_agents","name":"Multimodal & Computer-Use Agents","description":"Builds vision, voice, computer-use, and code agents with multimodal models and desktop automation.","disciplines":["ai-agent-engineer","forward-deployed-ai-engineer","genai-solutions-architect"]},{"skillId":"agent_react_planning_loops","name":"ReAct & Planning Agent Loops","description":"Builds ReAct agent loops with thought-action-observation, planning, and dynamic replanning.","disciplines":["ai-agent-engineer","forward-deployed-ai-engineer","genai-application-engineer","genai-solutions-architect"]},{"skillId":"web_browsing_agents","name":"Web Browsing Agents","description":"Builds agents that navigate web pages with Playwright, extract structured data, and submit forms.","disciplines":["ai-agent-engineer","forward-deployed-ai-engineer"]}]},{"category":"Agent deployment","skills":[{"skillId":"agent_cost_routing","name":"Agent Cost Control & Model Routing","description":"Tracks per-agent token spend, routes tasks to cost-appropriate models, and enforces budget limits.","disciplines":["ai-agent-engineer","genai-engineering-manager","genai-solutions-architect","genai-solutions-delivery-lead","llmops-engineer"]},{"skillId":"agent_load_testing","name":"Agent Load Testing & Capacity Planning","description":"Runs concurrent-load benchmarks with k6 or Locust, identifies bottlenecks, and plans capacity for production agents.","disciplines":["ai-agent-engineer","ai-platform-engineer","llmops-engineer"]},{"skillId":"agent_observability_tracing","name":"Agent Observability & Tracing","description":"Instruments agents with OpenTelemetry, Langfuse, fleet dashboards, and tool-use debugging.","disciplines":["ai-agent-engineer","ai-platform-engineer","ai-safety-evaluation-engineer","llmops-engineer"]},{"skillId":"agent_release_management","name":"Agent Release 
Management","description":"Manages agent config versions with canary rollout, automated rollback, and config drift detection.","disciplines":["ai-agent-engineer","ai-platform-engineer","genai-solutions-architect","llmops-engineer"]},{"skillId":"agent_production_deployment","name":"Production Agent Deployment","description":"Serves agents via FastAPI on Kubernetes with Postgres/Redis state, horizontal scaling, and CI/CD pipelines.","disciplines":["ai-agent-engineer","ai-platform-engineer","forward-deployed-ai-engineer","genai-solutions-architect","llmops-engineer"]}]},{"category":"Agent infrastructure","skills":[{"skillId":"a2a_agent_networks","name":"A2A Protocol & Agent Networks","description":"Implements Agent-to-Agent protocol for discovery, authentication, and remote task delegation across agent fleets.","disciplines":["ai-agent-engineer","genai-solutions-architect"]},{"skillId":"mcp_protocol_implementation","name":"MCP Protocol Servers & Clients","description":"Builds and consumes MCP servers using JSON-RPC 2.0 over stdio and SSE with tool, resource, and prompt exposure.","disciplines":["ai-agent-engineer","forward-deployed-ai-engineer","genai-application-engineer","genai-solutions-architect"]}]},{"category":"Agent safety","skills":[{"skillId":"agent_evaluation_pipelines","name":"Agent Evaluation & Benchmarking","description":"Builds golden datasets, LLM-as-judge pipelines, trajectory scoring, and CI-gated regression testing for agents.","disciplines":["ai-agent-engineer","ai-safety-evaluation-engineer","genai-solutions-architect","llmops-engineer"]},{"skillId":"agent_safety_guardrails","name":"Agent Safety Guardrails & Injection Defense","description":"Implements input/output guardrails, jailbreak detection, prompt injection defense, and safety boundaries.","disciplines":["ai-agent-engineer","ai-safety-evaluation-engineer","ai-security-engineer","genai-solutions-architect"]},{"skillId":"enterprise_agent_governance","name":"Enterprise Agent Governance & 
Audit","description":"Enforces audit trails, escalation policies, human-in-the-loop checkpoints, and compliance reporting on production agents.","disciplines":["ai-agent-engineer","ai-safety-evaluation-engineer","genai-engineering-manager","genai-solutions-architect","genai-solutions-delivery-lead"]}]},{"category":"Cost & economics","skills":[{"skillId":"cache_strategy_economics","name":"Caching Strategies for Cost","description":"Designs prompt, semantic, and tool-call caches with appropriate TTLs and invalidation, quantifying cost-per-hit and quality impact.","disciplines":["ai-agent-engineer","ai-inference-engineer","forward-deployed-ai-engineer","genai-application-engineer","genai-solutions-architect","llmops-engineer"]},{"skillId":"cost_anomaly_monitoring","name":"Cost Anomaly Monitoring","description":"Instruments cost telemetry per feature/tenant and detects anomalies via baselines or statistical detectors before they become bills.","disciplines":["ai-inference-engineer","genai-application-engineer","genai-solutions-architect","llmops-engineer"]},{"skillId":"model_routing_economics","name":"Cost-Aware Model Routing","description":"Builds cascade and routing strategies that send easy queries to cheap models and hard queries to expensive ones, governed by quality SLOs.","disciplines":["ai-agent-engineer","ai-inference-engineer","forward-deployed-ai-engineer","genai-application-engineer","genai-solutions-architect","llmops-engineer"]},{"skillId":"gpu_capacity_planning","name":"GPU Capacity Planning","description":"Plans GPU capacity using spot/reserved/on-demand mix, autoscaling envelopes, and queue-based load shedding to hit SLO at target cost.","disciplines":["ai-inference-engineer","ai-platform-engineer","genai-application-engineer","genai-solutions-architect","llmops-engineer"]},{"skillId":"llm_cost_modeling","name":"LLM Cost Modeling","description":"Models per-request token economics, p99 cost, and unit economics for LLM features; compares hosted vs. 
self-hosted total cost.","disciplines":["ai-agent-engineer","forward-deployed-ai-engineer","genai-application-engineer","genai-solutions-architect","llmops-engineer"]}]},{"category":"Customization","skills":[{"skillId":"continued_pretraining","name":"Continued Pretraining for Domain Adaptation","description":"Performs domain-adaptive continued pretraining on curated corpora and measures downstream-task improvement vs. base model.","disciplines":["genai-solutions-architect"]},{"skillId":"training_infrastructure","name":"Distributed Training Infrastructure","description":"Configures distributed training with DeepSpeed, FSDP, or accelerate; understands ZeRO stages, gradient checkpointing, and mixed precision.","disciplines":["ai-platform-engineer","genai-solutions-architect"]},{"skillId":"few_shot_in_context_learning","name":"Few-Shot & In-Context Learning Design","description":"Designs few-shot exemplar selection (k, ordering, similarity-based retrieval) and measures in-context learning quality.","disciplines":["ai-agent-engineer","forward-deployed-ai-engineer","genai-application-engineer","genai-data-engineer","genai-solutions-architect"]},{"skillId":"fine_tuning_evaluation","name":"Fine-Tuning Evaluation","description":"Builds evaluation harnesses to compare base vs. 
fine-tuned models on task suites, regression sets, and held-out human preference data.","disciplines":["ai-safety-evaluation-engineer","genai-application-engineer","genai-data-engineer","genai-solutions-architect"]},{"skillId":"preference_optimization_dpo","name":"Preference Optimization (DPO/RLHF)","description":"Aligns models with human or AI preferences using DPO, IPO, KTO, or RLHF/RLAIF pipelines, including reward modeling fundamentals.","disciplines":["genai-application-engineer","genai-data-engineer","genai-solutions-architect"]},{"skillId":"prompt_template_engineering","name":"Production Prompt Template Engineering","description":"Authors versioned production prompts with structured outputs, ablations, and prompt-variant A/B tests under load.","disciplines":["ai-agent-engineer","ai-inference-engineer","forward-deployed-ai-engineer","genai-application-engineer","genai-data-engineer","genai-solutions-architect","llmops-engineer"]},{"skillId":"supervised_fine_tuning","name":"Supervised Fine-Tuning (LoRA/QLoRA)","description":"Fine-tunes open-weight LLMs with LoRA, QLoRA, and full SFT, manages training hyperparameters, and evaluates instruction-following gains.","disciplines":["genai-application-engineer","genai-data-engineer","genai-solutions-architect","llmops-engineer"]},{"skillId":"dataset_curation","name":"Training Dataset Curation","description":"Curates, deduplicates, and decontaminates training datasets; balances domain mixtures and applies quality filters.","disciplines":["genai-application-engineer","genai-data-engineer","genai-solutions-architect"]}]},{"category":"Data engineering","skills":[{"skillId":"chunking_strategies","name":"Chunking Strategies for RAG","description":"Selects chunking strategies (fixed, recursive, semantic, hierarchical, late-chunking) per document class and measures retrieval 
impact.","disciplines":["ai-agent-engineer","forward-deployed-ai-engineer","genai-application-engineer","genai-data-engineer","genai-solutions-architect","llmops-engineer"]},{"skillId":"data_lake_warehouse","name":"Data Lake & Warehouse for AI","description":"Models AI feature and event tables in BigQuery, Snowflake, or open-table formats (Iceberg, Delta) with appropriate partitioning and clustering.","disciplines":["genai-data-engineer","genai-solutions-architect","llmops-engineer"]},{"skillId":"data_pipeline_orchestration","name":"Data Pipeline Orchestration","description":"Designs idempotent batch and incremental pipelines using Airflow, Dagster, or Prefect, with retries, lineage, and SLAs.","disciplines":["ai-platform-engineer","genai-application-engineer","genai-data-engineer","genai-solutions-architect","llmops-engineer"]},{"skillId":"data_quality_validation","name":"Data Quality & Validation","description":"Encodes data contracts, schema checks, drift detection, and quality SLOs using Great Expectations, dbt tests, or equivalent tooling.","disciplines":["genai-application-engineer","genai-data-engineer","genai-solutions-architect","llmops-engineer"]},{"skillId":"document_parsing_extraction","name":"Document Parsing & Extraction","description":"Extracts structured content from PDF, DOCX, HTML, and scanned images using Unstructured, Docling, or comparable tooling, including layout-aware parsing.","disciplines":["ai-agent-engineer","forward-deployed-ai-engineer","genai-application-engineer","genai-data-engineer","genai-solutions-architect","llmops-engineer"]},{"skillId":"hybrid_retrieval_reranking","name":"Hybrid Retrieval & Reranking","description":"Combines lexical (BM25) and dense retrieval, applies cross-encoder rerankers, and tunes retrieval-quality metrics (recall, MRR, 
nDCG).","disciplines":["forward-deployed-ai-engineer","genai-application-engineer","genai-data-engineer","genai-solutions-architect","llmops-engineer"]},{"skillId":"knowledge_graph_construction","name":"Knowledge Graph Construction","description":"Builds knowledge graphs from unstructured corpora — entity extraction, linking, deduplication, and graph schema design for retrieval.","disciplines":["ai-agent-engineer","genai-data-engineer","genai-solutions-architect"]},{"skillId":"pii_data_governance","name":"PII & Data Governance","description":"Detects and redacts PII, enforces data residency and retention policies, and tracks lineage for AI training and inference data.","disciplines":["ai-safety-evaluation-engineer","ai-security-engineer","genai-application-engineer","genai-data-engineer","genai-solutions-architect","llmops-engineer"]},{"skillId":"streaming_data_kafka","name":"Streaming Data with Kafka/Pulsar","description":"Builds event-driven AI pipelines with Kafka or Pulsar — partitioning, consumer groups, exactly-once semantics, and schema evolution.","disciplines":["genai-data-engineer","genai-solutions-architect","llmops-engineer"]},{"skillId":"vector_database_operations","name":"Vector Database Operations","description":"Operates production vector DBs (Pinecone, Weaviate, Qdrant, pgvector) — index tuning, sharding, hybrid filters, and capacity planning.","disciplines":["ai-agent-engineer","forward-deployed-ai-engineer","genai-application-engineer","genai-data-engineer","genai-solutions-architect","llmops-engineer"]}]},{"category":"Evaluation","skills":[{"skillId":"agent_trajectory_evaluation","name":"Agent Trajectory Evaluation","description":"Evaluates end-to-end agent task success — tool-call correctness, intermediate state validation, and trace-based replay 
scoring.","disciplines":["ai-agent-engineer","ai-safety-evaluation-engineer","genai-application-engineer","genai-solutions-architect","llmops-engineer"]},{"skillId":"bias_fairness_testing","name":"Bias, Fairness & Toxicity Testing","description":"Audits models for demographic bias, fairness gaps, and toxicity using accepted suites and reports impact in plain terms.","disciplines":["ai-safety-evaluation-engineer","genai-solutions-architect","llmops-engineer"]},{"skillId":"benchmark_design","name":"Domain Benchmark Design","description":"Designs domain-specific benchmarks with held-out splits, contamination checks, and diverse failure-mode coverage.","disciplines":["ai-safety-evaluation-engineer","genai-application-engineer","genai-solutions-architect","llmops-engineer"]},{"skillId":"factuality_grounding_eval","name":"Factuality & Grounding Evaluation","description":"Quantifies hallucination rate and grounding fidelity for RAG and agent outputs using span-level annotators or reference-based metrics.","disciplines":["ai-agent-engineer","ai-safety-evaluation-engineer","forward-deployed-ai-engineer","genai-application-engineer","genai-solutions-architect","llmops-engineer"]},{"skillId":"llm_eval_harnesses","name":"LLM Evaluation Harnesses","description":"Runs evaluations using lm-evaluation-harness, Inspect, OpenAI Evals, or custom harnesses with reproducible task specs.","disciplines":["ai-agent-engineer","ai-safety-evaluation-engineer","forward-deployed-ai-engineer","genai-application-engineer","genai-solutions-architect","llmops-engineer"]},{"skillId":"regression_testing_llm","name":"LLM Regression Testing in CI","description":"Wires evaluation suites into CI gates with golden-set tracking, drift alerts, and statistically valid regression thresholds.","disciplines":["ai-safety-evaluation-engineer","genai-application-engineer","genai-solutions-architect","llmops-engineer"]},{"skillId":"llm_judge_evaluation","name":"LLM-as-Judge Evaluation","description":"Designs 
LLM-judge rubrics with calibration, debiasing, and inter-judge agreement checks; knows when judges are unreliable.","disciplines":["ai-safety-evaluation-engineer","genai-application-engineer","genai-solutions-architect","llmops-engineer"]},{"skillId":"red_teaming_jailbreaks","name":"Red-Teaming & Jailbreak Testing","description":"Generates adversarial prompts, tests jailbreak resistance, and reports findings with severity and reproduction steps.","disciplines":["ai-safety-evaluation-engineer","ai-security-engineer","genai-solutions-architect"]}]},{"category":"Foundations","skills":[{"skillId":"python_async_programming","name":"Async Python with asyncio","description":"Writes concurrent async/await code with asyncio, gather, semaphores, and async HTTP clients.","disciplines":["ai-agent-engineer","ai-inference-engineer","ai-platform-engineer","forward-deployed-ai-engineer","genai-application-engineer","genai-data-engineer","genai-solutions-architect","llmops-engineer"]},{"skillId":"python_config_secrets","name":"Configuration & Secrets Management","description":"Manages environment variables, .env files, and secrets safely with python-dotenv and decorators.","disciplines":["ai-agent-engineer","ai-inference-engineer","ai-platform-engineer","ai-safety-evaluation-engineer","ai-security-engineer","forward-deployed-ai-engineer","genai-application-engineer","genai-data-engineer","genai-solutions-architect","genai-solutions-delivery-lead","llmops-engineer"]},{"skillId":"python_file_io_errors","name":"File I/O, JSON & Exception Handling","description":"Reads and writes files, parses JSON, and handles errors with try/except and custom 
exceptions.","disciplines":["ai-agent-engineer","ai-inference-engineer","ai-platform-engineer","ai-safety-evaluation-engineer","ai-security-engineer","forward-deployed-ai-engineer","genai-application-engineer","genai-data-engineer","genai-solutions-architect","genai-solutions-delivery-lead","llmops-engineer"]},{"skillId":"python_oop_dataclasses","name":"Python Classes & Dataclasses","description":"Models data with classes, dataclasses, methods, and inheritance for structured Python code.","disciplines":["ai-agent-engineer","ai-inference-engineer","ai-platform-engineer","ai-safety-evaluation-engineer","ai-security-engineer","forward-deployed-ai-engineer","genai-application-engineer","genai-data-engineer","genai-solutions-architect","genai-solutions-delivery-lead","llmops-engineer"]},{"skillId":"python_core_programming","name":"Python Core Programming","description":"Writes Python programs using variables, control flow, functions, modules, and packages.","disciplines":["ai-agent-engineer","ai-inference-engineer","ai-platform-engineer","ai-safety-evaluation-engineer","ai-security-engineer","forward-deployed-ai-engineer","genai-application-engineer","genai-data-engineer","genai-solutions-architect","genai-solutions-delivery-lead","llmops-engineer"]},{"skillId":"python_data_pipelines","name":"Python Data Pipelines with Polars","description":"Builds data transformation pipelines with Polars, generators, and lazy evaluation for tabular data.","disciplines":["ai-agent-engineer","ai-inference-engineer","forward-deployed-ai-engineer","genai-application-engineer","genai-data-engineer","llmops-engineer"]},{"skillId":"python_data_structures","name":"Python Data Structures & Comprehensions","description":"Manipulates lists, tuples, dictionaries, and sets using slicing, iteration, and 
comprehensions.","disciplines":["ai-agent-engineer","ai-inference-engineer","ai-platform-engineer","ai-safety-evaluation-engineer","ai-security-engineer","forward-deployed-ai-engineer","genai-application-engineer","genai-data-engineer","genai-solutions-architect","genai-solutions-delivery-lead","llmops-engineer"]},{"skillId":"python_testing_pytest","name":"Python Testing with pytest","description":"Writes unit and integration tests with pytest fixtures, assertions, and mocking patterns.","disciplines":["ai-agent-engineer","ai-inference-engineer","ai-platform-engineer","ai-safety-evaluation-engineer","ai-security-engineer","forward-deployed-ai-engineer","genai-application-engineer","genai-data-engineer","genai-solutions-architect","genai-solutions-delivery-lead","llmops-engineer"]},{"skillId":"python_pydantic_models","name":"Type Hints & Pydantic Models","description":"Builds typed Python data models with type hints, generics, protocols, and Pydantic validation.","disciplines":["ai-agent-engineer","ai-inference-engineer","ai-platform-engineer","ai-safety-evaluation-engineer","ai-security-engineer","forward-deployed-ai-engineer","genai-application-engineer","genai-data-engineer","genai-solutions-architect","llmops-engineer"]}]},{"category":"Inference optimization","skills":[{"skillId":"inference_batching_serving","name":"Continuous Batching & Inference Serving","description":"Implements continuous and dynamic batching for high-throughput LLM serving using vLLM, TGI, or comparable engines.","disciplines":["ai-platform-engineer","genai-solutions-architect","llmops-engineer"]},{"skillId":"gpu_kernel_basics","name":"GPU Kernel Programming Basics","description":"Reads and authors basic Triton or CUDA kernels for custom ops, understands occupancy and memory coalescing fundamentals.","disciplines":[]},{"skillId":"gpu_memory_management","name":"GPU Memory Management","description":"Profiles CUDA memory, sizes batches to fit available VRAM, handles OOM gracefully, and uses 
gradient checkpointing or offloading for memory-bound workloads.","disciplines":["ai-platform-engineer","llmops-engineer"]},{"skillId":"inference_latency_profiling","name":"Inference Latency Profiling","description":"Profiles p50/p95/p99 token-generation latency, isolates bottlenecks across tokenizer, attention, and decode phases, and reports actionable findings.","disciplines":["ai-inference-engineer","ai-platform-engineer","genai-solutions-architect","llmops-engineer"]},{"skillId":"inference_kv_caching","name":"KV Cache Optimization","description":"Tunes transformer decoder KV cache for throughput and memory; understands prefix caching, paged-attention, and cache eviction strategies.","disciplines":["ai-inference-engineer","ai-platform-engineer","llmops-engineer"]},{"skillId":"model_distillation_pruning","name":"Model Distillation & Pruning","description":"Compresses large models via knowledge distillation and structured/unstructured pruning while preserving target metrics.","disciplines":["genai-solutions-architect"]},{"skillId":"model_quantization","name":"Model Quantization","description":"Applies INT8/INT4/FP8 post-training quantization (GPTQ, AWQ, GGUF, bitsandbytes) and measures quality vs. 
throughput trade-offs.","disciplines":["ai-platform-engineer","genai-solutions-architect","llmops-engineer"]},{"skillId":"model_serving_frameworks","name":"Model Serving Frameworks","description":"Deploys LLMs via vLLM, TGI, TensorRT-LLM, or SGLang with appropriate engine flags, schedulers, and runtime configuration.","disciplines":["ai-platform-engineer","genai-solutions-architect","llmops-engineer"]},{"skillId":"multi_gpu_tensor_parallelism","name":"Multi-GPU Tensor & Pipeline Parallelism","description":"Configures tensor-parallel and pipeline-parallel sharding across multiple GPUs to serve models that exceed single-GPU memory.","disciplines":["ai-platform-engineer","genai-solutions-architect"]},{"skillId":"speculative_decoding","name":"Speculative Decoding","description":"Implements speculative-decoding strategies (draft models, Medusa, lookahead) to reduce decoder latency while preserving output distribution.","disciplines":["genai-solutions-architect"]}]},{"category":"Infrastructure","skills":[{"skillId":"docker_containerization","name":"Docker Containerization for LLM Apps","description":"Writes Dockerfiles with multi-stage builds, manages images, and runs containers with Compose.","disciplines":["ai-agent-engineer","ai-inference-engineer","ai-platform-engineer","ai-security-engineer","forward-deployed-ai-engineer","genai-application-engineer","genai-data-engineer","genai-solutions-architect","llmops-engineer"]},{"skillId":"k8s_helm_kustomize","name":"Helm & Kustomize Packaging","description":"Packages Kubernetes apps with Helm charts and manages environment overlays with Kustomize.","disciplines":["ai-agent-engineer","ai-inference-engineer","ai-platform-engineer","forward-deployed-ai-engineer","genai-application-engineer","llmops-engineer"]},{"skillId":"k8s_health_autoscaling","name":"K8s Health Probes & Autoscaling","description":"Configures liveness, readiness, startup probes, HPA, and PodDisruptionBudgets for resilient 
services.","disciplines":["ai-agent-engineer","ai-inference-engineer","ai-platform-engineer","forward-deployed-ai-engineer","genai-application-engineer","llmops-engineer"]},{"skillId":"k8s_ingress_tls_networkpolicy","name":"Kubernetes Ingress, TLS & NetworkPolicy","description":"Exposes services via Ingress with TLS termination and isolates traffic with NetworkPolicies.","disciplines":["ai-agent-engineer","ai-inference-engineer","ai-platform-engineer","ai-security-engineer","forward-deployed-ai-engineer","genai-application-engineer","llmops-engineer"]},{"skillId":"k8s_security_rbac_troubleshooting","name":"Kubernetes RBAC & Troubleshooting","description":"Applies RBAC, Pod Security Standards, and SecurityContext settings, and debugs CrashLoopBackOff and OOMKilled pods.","disciplines":["ai-agent-engineer","ai-inference-engineer","ai-platform-engineer","ai-security-engineer","forward-deployed-ai-engineer","llmops-engineer"]},{"skillId":"k8s_workloads_pods_services","name":"Kubernetes Workloads, Pods & Services","description":"Deploys pods, services, and Deployments to Kubernetes with rolling updates and DNS-based discovery.","disciplines":["ai-agent-engineer","ai-inference-engineer","ai-platform-engineer","ai-security-engineer","forward-deployed-ai-engineer","genai-application-engineer","genai-solutions-architect","llmops-engineer"]},{"skillId":"sandboxed_code_execution","name":"Sandboxed Agent Code Execution","description":"Isolates agent-generated code in containers with timeouts, cgroup resource limits, and input sanitization to prevent escape.","disciplines":["ai-agent-engineer","ai-security-engineer","ai-safety-evaluation-engineer"]}]},{"category":"LLM core","skills":[{"skillId":"embeddings_semantic_search","name":"Embeddings & Semantic Search","description":"Generates embeddings, computes cosine similarity, and builds semantic search over 
documents.","disciplines":["ai-agent-engineer","ai-inference-engineer","forward-deployed-ai-engineer","genai-application-engineer","genai-data-engineer","genai-solutions-architect","llmops-engineer"]},{"skillId":"langchain_lcel","name":"LangChain & LCEL Runnables","description":"Composes LangChain Runnables with LCEL pipe syntax, streaming, batching, and configurable runtime fields.","disciplines":["ai-agent-engineer","forward-deployed-ai-engineer","genai-application-engineer","genai-solutions-architect","llmops-engineer"]},{"skillId":"llm_api_integration","name":"LLM API Integration","description":"Calls OpenAI, Anthropic, and Gemini APIs with auth, error handling, and response parsing.","disciplines":["ai-agent-engineer","ai-inference-engineer","ai-safety-evaluation-engineer","ai-security-engineer","forward-deployed-ai-engineer","genai-application-engineer","genai-data-engineer","genai-solutions-architect","llmops-engineer"]},{"skillId":"llm_cost_resilience","name":"LLM Cost & Resilience Optimization","description":"Tracks token costs, applies retry with exponential backoff, and tunes prompts for budget.","disciplines":["ai-agent-engineer","ai-inference-engineer","ai-platform-engineer","forward-deployed-ai-engineer","genai-application-engineer","genai-engineering-manager","genai-solutions-architect","genai-solutions-delivery-lead","llmops-engineer"]},{"skillId":"llm_function_calling","name":"LLM Function Calling & Tool Use","description":"Defines tool schemas in JSON Schema and orchestrates multi-turn function calling across providers.","disciplines":["ai-agent-engineer","ai-safety-evaluation-engineer","forward-deployed-ai-engineer","genai-application-engineer","genai-solutions-architect","llmops-engineer"]},{"skillId":"llm_sampling_output_control","name":"LLM Sampling & Structured Output","description":"Controls LLM outputs with temperature, top-p, stop sequences, JSON mode, and structured 
schemas.","disciplines":["ai-agent-engineer","ai-inference-engineer","ai-safety-evaluation-engineer","forward-deployed-ai-engineer","genai-application-engineer","genai-data-engineer","genai-solutions-architect","llmops-engineer"]},{"skillId":"llm_prompt_engineering","name":"Multi-Provider Prompt Engineering","description":"Builds versioned prompts with Jinja2, few-shot examples, and chain-of-thought across providers.","disciplines":["ai-agent-engineer","ai-safety-evaluation-engineer","forward-deployed-ai-engineer","genai-application-engineer","genai-solutions-architect","llmops-engineer"]},{"skillId":"rag_pipeline_fundamentals","name":"RAG Pipeline Fundamentals","description":"Builds retrieval-augmented generation pipelines with chunking, retrieval, and citation.","disciplines":["ai-agent-engineer","ai-safety-evaluation-engineer","forward-deployed-ai-engineer","genai-application-engineer","genai-data-engineer","genai-solutions-architect","llmops-engineer"]},{"skillId":"transformer_architecture_internals","name":"Transformer Architecture Internals","description":"Implements scaled dot-product attention and reasons about KV-cache memory, FFN dimensions, and quantization tradeoffs to choose inference strategies.","disciplines":["ai-agent-engineer","ai-inference-engineer","genai-solutions-architect"]}]},{"category":"Security","skills":[{"skillId":"ai_iam_secrets","name":"AI IAM & Secrets Management","description":"Configures IAM, Workload Identity / IRSA, KMS, and short-lived credentials for AI workloads; rotates and audits secrets.","disciplines":["ai-platform-engineer","ai-security-engineer","genai-solutions-architect","llmops-engineer"]},{"skillId":"compliance_frameworks_ai","name":"Compliance Frameworks for AI","description":"Maps AI systems to SOC 2, ISO 27001, HIPAA, and EU AI Act controls; produces evidence and audit-ready 
documentation.","disciplines":["ai-platform-engineer","ai-security-engineer","genai-solutions-architect","llmops-engineer"]},{"skillId":"model_supply_chain","name":"Model Supply-Chain Security","description":"Verifies model provenance, signed weights, SBOMs, and dependency integrity for open-weight and hosted models.","disciplines":["ai-platform-engineer","ai-security-engineer","genai-solutions-architect","llmops-engineer"]},{"skillId":"output_filtering_dlp","name":"Output Filtering & Data-Loss Prevention","description":"Builds output-side DLP for PII, secrets, and proprietary IP, with deterministic filters layered with model-based classifiers.","disciplines":["ai-platform-engineer","ai-safety-evaluation-engineer","ai-security-engineer","genai-solutions-architect","llmops-engineer"]},{"skillId":"prompt_injection_defense","name":"Prompt Injection Defense","description":"Identifies direct and indirect prompt-injection vectors and implements input filtering, isolation, and least-privilege tool gating.","disciplines":["ai-agent-engineer","ai-platform-engineer","ai-safety-evaluation-engineer","ai-security-engineer","genai-solutions-architect","llmops-engineer"]},{"skillId":"threat_modeling_ai","name":"Threat Modeling for AI Systems","description":"Applies STRIDE / PASTA threat modeling to AI architectures including model, data, and agent-tool boundaries.","disciplines":["ai-platform-engineer","ai-security-engineer","genai-solutions-architect","llmops-engineer"]},{"skillId":"vulnerability_scanning_ai","name":"Vulnerability Scanning for AI Stacks","description":"Runs SCA, SAST, container, and model-asset scanning in CI; triages and remediates findings with appropriate severity gates.","disciplines":["ai-platform-engineer","ai-security-engineer","genai-solutions-architect","llmops-engineer"]},{"skillId":"network_isolation_zero_trust","name":"Zero-Trust Networking for AI","description":"Enforces network isolation, egress allowlists, mTLS, and zero-trust policies for AI 
inference and training workloads.","disciplines":["ai-platform-engineer","ai-security-engineer","genai-solutions-architect","llmops-engineer"]}]},{"category":"Web APIs","skills":[{"skillId":"api_authentication","name":"API Authentication & Authorization","description":"Implements OAuth2, JWT, API keys, and role-based access control on FastAPI endpoints.","disciplines":["ai-agent-engineer","ai-inference-engineer","ai-platform-engineer","ai-security-engineer","forward-deployed-ai-engineer","genai-application-engineer","genai-solutions-architect","llmops-engineer"]},{"skillId":"api_gateway_routing","name":"API Gateway & Routing","description":"Builds reverse-proxy gateways with path routing, load balancing, and response aggregation.","disciplines":["ai-agent-engineer","ai-inference-engineer","ai-platform-engineer","forward-deployed-ai-engineer","genai-application-engineer","llmops-engineer"]},{"skillId":"api_observability","name":"API Observability","description":"Instruments APIs with Prometheus metrics, OpenTelemetry traces, structured logging, and Grafana dashboards.","disciplines":["ai-agent-engineer","ai-inference-engineer","ai-platform-engineer","forward-deployed-ai-engineer","genai-application-engineer","llmops-engineer"]},{"skillId":"api_resilience_patterns","name":"API Resilience Patterns","description":"Applies rate limiting, circuit breakers, retries with backoff, and bulkhead isolation to API services.","disciplines":["ai-agent-engineer","ai-platform-engineer","forward-deployed-ai-engineer","genai-application-engineer","llmops-engineer"]},{"skillId":"api_testing_versioning","name":"API Testing & Versioning","description":"Tests async endpoints with pytest and httpx, and manages API versions with deprecation strategies.","disciplines":["ai-agent-engineer","ai-platform-engineer","forward-deployed-ai-engineer","genai-application-engineer","llmops-engineer"]},{"skillId":"async_database_sqlalchemy","name":"Async Databases with SQLAlchemy & 
Alembic","description":"Models data with async SQLAlchemy ORM, manages migrations with Alembic, and applies the repository pattern.","disciplines":["ai-agent-engineer","ai-platform-engineer","forward-deployed-ai-engineer","genai-application-engineer","genai-data-engineer","llmops-engineer"]},{"skillId":"fastapi_rest_apis","name":"FastAPI REST API Development","description":"Builds production REST APIs with FastAPI using Pydantic validation, dependency injection, and async handlers.","disciplines":["ai-agent-engineer","ai-inference-engineer","ai-platform-engineer","ai-safety-evaluation-engineer","forward-deployed-ai-engineer","genai-application-engineer","genai-data-engineer","genai-solutions-architect","llmops-engineer"]},{"skillId":"realtime_streaming_apis","name":"Real-Time Streaming with SSE & WebSockets","description":"Streams LLM responses with SSE and manages WebSocket connection lifecycles for real-time apps.","disciplines":["ai-agent-engineer","ai-inference-engineer","forward-deployed-ai-engineer","genai-application-engineer","genai-solutions-architect","llmops-engineer"]}]}]}