Gen-AI Testing ยท Agentic QA ยท AI Load Testing ยท LM Studio ยท MCP Agents
| Course Code | AIQA-201 |
| Duration | 12 Weeks (25 Sessions) |
| Schedule | Weekends Only (Saturday & Sunday) |
| Session Duration | 2.5 hours per session |
| Total Contact Hours | 62.5 hours live |
| Self-Study Hours | 60 hours (recommended) |
| Mode | Live Online / Hybrid |
| Level | Intermediate to Expert |
| Prerequisites | Basic QA or IT experience (no prior AI QA needed) |
| Language | English / Hindi |
| Certificate | Professional Certificate in AI QA Engineering & Architecture |
| Batch Size | Maximum 30 students (personalized attention) |
This intensive 12-week programme is designed for QA professionals, IT engineers, and IT managers who want to master the testing of AI and GenAI systems โ from basic LLM output testing all the way to multi-agent QA and AI governance.
Students will use a single real product story across all 25 sessions, applying every technique to the same AI application until they have built a complete, production-grade AI QA Architect portfolio.
QA Engineers wanting to specialise in AI and GenAI systems testing
QA Managers and Test Leads overseeing AI product delivery
Automation Engineers ready to move into AI QA architecture
DevOps Engineers adding AI quality gates to CI/CD pipelines
IT Professionals transitioning into AI QA roles
Software Developers who want to properly test what they build with AI
MLOps Engineers needing quality engineering skills
Anyone preparing for an AI QA Architect role or interview
AI QA landscape, LLM fundamentals, vibe coding with Copilot/Cursor, LM Studio local models, AI-assisted test design
API testing (Bruno/Hurl/Pact/Schemathesis), n8n workflow testing, Playwright E2E vibe-coded, database SQL test suites
LLM output quality, RAG pipeline testing, AI security red-teaming, MCP AI QA agents
k6 performance, AI load testing (TTFT/token cost/RAG under load), chaos engineering, DAST security
Goal-based agent testing, multi-agent systems, advanced n8n orchestration, AI drift monitoring
Test pyramid design, CI/CD quality gates, EU AI Act compliance, fairness testing, capstone presentation
Sessions 1-2 โข Building your AI QA foundation with real-world ShopSmart project
Duration: 2.5 hours
๐งต ShopSmart product manager shares a 1-page brief: "Add an AI chatbot to handle 80% of tier-1 support queries." This is Session 1 of the ShopSmart story.
๐ Assignment: Write a 2-page ShopSmart AI QA Architecture Overview covering: stack diagram, test types per layer, risk assessment
Duration: 2.5 hours
๐งต ShopSmart user story #1: "As a customer, I want to ask 'Where is my order?' and get a tracked answer." First Playwright smoke test needed.
๐ Assignment: 30-test ShopSmart AI-generated Playwright suite + AI Audit column (document what AI got wrong in each test)
Sessions 3-4 โข LM Studio setup, golden datasets, and risk-based testing
Duration: 2.5 hours โข No API Cost, No Internet Required
๐งต ShopSmart QA team has no cloud API budget in the dev environment and cannot send customer order data to cloud providers. They need a local model for test generation and CI evals.
๐ Assignment: LM Studio config + Promptfoo local provider config + 20 ShopSmart test cases (local) + quality comparison table + LocalAI CI Docker setup
Duration: 2.5 hours
๐งต ShopSmart story #2: "I want to know my refund status." 50 edge cases need to be identified and a golden dataset created for all future AI evals.
๐ Assignment: 50-test suite (AI-generated + reviewed) + ShopSmart golden dataset (20 pairs) + risk matrix
Sessions 5-6 โข Understanding GenAI architecture and comprehensive API testing
Duration: 2.5 hours
๐งต ShopSmart developer explains: the chatbot uses RAG to search a 500-document FAQ corpus before answering. QA must understand this to test it effectively.
๐ Assignment: Temperature comparison report + ShopSmart RAG failure map + agent anatomy diagram + 10 agent scenarios
Duration: 2.5 hours
๐งต ShopSmart backend exposes POST /chat, GET /order/{id}, POST /refund โ all need contract tests before the frontend team depends on them.
๐ Assignment: Bruno collection (35 requests) + Hurl smoke + Pact contract + Schemathesis fuzz report
Sessions 7-9 โข Workflow testing, E2E automation, and SQL validation
Duration: 2.5 hours
๐งต ShopSmart uses n8n: webhook (new message) โ AI classification node โ route to refund/tracking/escalate. All 3 routing paths need testing, isolated from the LLM.
๐ Assignment: n8n ShopSmart workflow + 12 test cases + Mockoon config + idempotency evidence
Must include all of the following:
ShopSmart AI QA Architecture Overview (2 pages) โ Session 1
LM Studio quality comparison: local vs cloud test generation scores โ Session 3
50-test suite with AI Audit column โ Session 4
Bruno API collection (35 requests) + Pact contract + Schemathesis fuzz โ Session 6
n8n test suite: 12 test cases + Mockoon failure evidence โ Session 7
Playwright POM: 12 ShopSmart E2E journeys + cross-browser + visual snapshots โ Session 8
SQL test suite: 15 ShopSmart business rules โ Session 9
GitHub Issues: 10 ShopSmart bugs with severity, priority, AI-suggested root cause
Assessment: Pass/Fail โ instructor reviews GitHub repo. CI green badge required for Playwright.
Sessions 10-11 โข Promptfoo, DeepEval, RAGAS, and LangSmith
Duration: 2.5 hours
๐งต ShopSmart QA discovers the chatbot sometimes says "returns are free for 90 days" when the actual policy is 30 days. A systematic evaluation framework is needed.
๐ Assignment: 30-test Promptfoo YAML + hallucination report + CI quality gate
Sessions 12-13 โข Red-teaming, prompt injection, and AI QA agents
Duration: 2.5 hours
๐งต ShopSmart red team challenge: "Can a customer trick the chatbot into revealing other customers' order details?" A full security test is required.
๐ Assignment: AI Security Report: Garak scan + 15 ShopSmart injection tests + IDOR evidence + Detoxify scores
Sessions 14-15 โข k6, Grafana, and the career-differentiating AI load testing session
Duration: 2.5 hours โข THE SESSION ALMOST NO QA COURSE COVERS โ YOUR CAREER DIFFERENTIATOR
๐งต ShopSmart CFO asks: "What will the AI chatbot COST at 5,000 concurrent users on Black Friday โ and will the quality hold up?" Traditional load testing cannot answer this.
| Metric | What It Measures | Target |
|---|---|---|
| TTFT | Time-to-First-Token (streaming start) | < 800ms P95 |
| TGT | Total Generation Time (full response) | < 4s P95 |
| Tokens/sec | LLM processing speed under load | > 20 t/s at 50 VUs |
| Cost/request | Dollar cost per LLM call under concurrency | Track per level |
| RAGAS@Load | RAG quality degradation under concurrency | Faithfulness delta < 0.15 |
๐ Assignment: โก ShopSmart AI Load Report: "Chatbot costs $847/hr at Black Friday peak ยท RAG quality drops 28% at 100 concurrent ยท n8n saturates at 47 concurrent" + Grafana dashboard
Sessions 16-17 โข ToxiProxy, OWASP ZAP, and SOC2 compliance
Must include all of the following:
Everything from Project 1 (updated)
Promptfoo YAML suite: 30 test cases + hallucination report + CI gate โ Session 10
RAGAS eval: all 4 metrics + LangSmith trace + before/after improvement evidence โ Session 11
AI Security Report: Garak scan + 15 injection tests + IDOR evidence โ Session 12
MCP agent demo: ShopSmart issue โ AI test โ Playwright runs โ PR (3-min video) โ Session 13
k6 load: ShopSmart baseline + Black Friday spike + Grafana dashboard โ Session 14
โก AI Load Report: TTFT + token cost + RAG quality under load + n8n saturation โ Session 15
Chaos Report: 6 failure scenarios + ToxiProxy config โ Session 16
Security Report: ZAP + nuclei + IDOR + CI integration โ SOC2-ready pack โ Session 17
Assessment: GitHub repo URL + CI green badge + live Allure + MCP demo video link
Sessions 18-19 โข Goal-based testing and multi-agent systems
Sessions 20-21 โข Orchestrator testing and drift detection
Sessions 22-23 โข Enterprise strategy and full pipeline implementation
Sessions 24-25 โข EU AI Act compliance and final presentation
Duration: 2.5 hours
๐งต STORY COMPLETE: You started with a 1-page product brief in Session 1. You now have a fully tested, monitored, governed, CI/CD-integrated AI QA platform. Time to prove it.
| Component | Weight | Description |
|---|---|---|
| Weekly Assignments | 30% | 24 practical assignments (one per session) |
| Project 1: Foundations Pack | 15% | End of Week 4 submission |
| Project 2: Automated AI QA Framework | 20% | End of Week 8 submission |
| Project 3: Capstone Portfolio | 25% | Session 25 live demo + submission |
| Attendance & Participation | 10% | Live session presence + GitHub activity |