๐Ÿ† PROFESSIONAL CERTIFICATION

AI-Native QA Engineer & Architect Mastery

Gen-AI Testing · Agentic QA · AI Load Testing · LM Studio · MCP Agents

AIQA-201 · 12 Weeks · 25 Sessions · Weekends Only

📋 Course Information

Course Code: AIQA-201
Duration: 12 Weeks (25 Sessions)
Schedule: Weekends Only (Saturday & Sunday)
Session Duration: 2.5 hours per session
Total Contact Hours: 62.5 hours live
Self-Study Hours: 60 hours (recommended)
Mode: Live Online / Hybrid
Level: Intermediate to Expert
Prerequisites: Basic QA or IT experience (no prior AI QA needed)
Language: English / Hindi
Certificate: Professional Certificate in AI QA Engineering & Architecture
Batch Size: Maximum 30 students (personalized attention)

🎯 Course Overview

This intensive 12-week programme is designed for QA professionals, IT engineers, and IT managers who want to master the testing of AI and GenAI systems, from basic LLM output testing all the way to multi-agent QA and AI governance.

Students will use a single real product story across all 25 sessions, applying every technique to the same AI application until they have built a complete, production-grade AI QA Architect portfolio.

👥 Target Audience

  • QA Engineers wanting to specialise in AI and GenAI systems testing
  • QA Managers and Test Leads overseeing AI product delivery
  • Automation Engineers ready to move into AI QA architecture
  • DevOps Engineers adding AI quality gates to CI/CD pipelines
  • IT Professionals transitioning into AI QA roles
  • Software Developers who want to properly test what they build with AI
  • MLOps Engineers needing quality engineering skills
  • Anyone preparing for an AI QA Architect role or interview

📖 Course Structure

Phase 1: AI Mindset + Vibe Coding + Local AI (Weeks 1–2)

AI QA landscape, LLM fundamentals, vibe coding with Copilot/Cursor, LM Studio local models, AI-assisted test design

Phase 2: Modern QA Toolkit + n8n (Weeks 3–4)

API testing (Bruno/Hurl/Pact/Schemathesis), n8n workflow testing, vibe-coded Playwright E2E, database SQL test suites

Phase 3: Testing GenAI Systems (Weeks 5–6)

LLM output quality, RAG pipeline testing, AI security red-teaming, MCP AI QA agents

Phase 4: Performance + AI Load Testing + Security (Weeks 7–8)

k6 performance, AI load testing (TTFT/token cost/RAG under load), chaos engineering, DAST security

Phase 5: Agentic QA + Multi-Agent + Observability (Weeks 9–10)

Goal-based agent testing, multi-agent systems, advanced n8n orchestration, AI drift monitoring

Phase 6: QA Architecture + Governance + Demo Day (Weeks 11–12)

Test pyramid design, CI/CD quality gates, EU AI Act compliance, fairness testing, capstone presentation

WEEK 1: AI Mindset + Vibe Coding + ShopSmart Kickoff

Sessions 1–2 • Building your AI QA foundation with the real-world ShopSmart project

SESSION 1

The AI QA Revolution & ShopSmart Kickoff

Duration: 2.5 hours

🧵 The ShopSmart product manager shares a 1-page brief: "Add an AI chatbot to handle 80% of tier-1 support queries." This is Session 1 of the ShopSmart story.

Topics Covered:
  • The AI QA revolution: how AI changes testing fundamentally
  • Traditional QA vs AI QA: what is different, what stays the same
  • Practical AI stack walkthrough: UI → n8n → LLM → RAG → API → DB
  • First AI test plan: paste the ShopSmart brief into Claude, review the output critically
💭 Think Before You Build:
  • What does it mean to "test" a system that gives a different answer every time?
  • How is testing an LLM different from testing a REST API?
  • What are the 4 layers of the ShopSmart AI stack, and what test type does each need?
🛠️ Hands-on Lab:
  • All 8 services running (shopmart-app, n8n, postgres, mockoon, prometheus, grafana, influxdb, localai)
  • Map the ShopSmart AI stack layers on paper: identify every failure point
  • Generate the first ShopSmart test plan using Claude, then review it and identify gaps
  • Create the GitHub repository: first commit

📝 Assignment: Write a 2-page ShopSmart AI QA Architecture Overview covering: stack diagram, test types per layer, risk assessment

SESSION 2

Vibe Coding for QA: AI-Accelerated Test Engineering

Duration: 2.5 hours

🧵 ShopSmart user story #1: "As a customer, I want to ask 'Where is my order?' and get a tracked answer." The first Playwright smoke test is needed.

Topics Covered:
  • Vibe coding methodology: the comments → code workflow
  • Cursor IDE composer and agent mode: live demo
  • GitHub Copilot: chat, inline, multi-file generation
  • AI-generated code review: how to spot and fix hallucinated selectors
  • Vibe-code ShopSmart Playwright smoke tests: live coding
  • AI Audit technique: documenting what the AI got wrong
🛠️ Hands-on Lab:
  • Vibe-code ShopSmart Playwright smoke tests using a Copilot scaffold in under 10 minutes
  • Review every generated test: find and fix at least 5 errors the AI introduced
  • Cursor agent mode: build a full ShopSmart chat UI POM from a 1-paragraph description
  • Istanbul coverage: measure what % of ShopSmart flows the AI missed

📝 Assignment: 30-test ShopSmart AI-generated Playwright suite + AI Audit column (document what the AI got wrong in each test)

WEEK 2: Local AI Models + AI Test Design

Sessions 3–4 • LM Studio setup, golden datasets, and risk-based testing

SESSION 3

🖥️ LM Studio: Running Local AI Models for QA

Duration: 2.5 hours • No API Cost, No Internet Required

🧵 The ShopSmart QA team has no cloud API budget in the dev environment and cannot send customer order data to cloud providers. They need a local model for test generation and CI evals.

Topics Covered:
  • Why run local AI models? Cost, privacy, offline CI, rate limits
  • LM Studio installation: macOS, Windows, Linux, Apple Silicon M1/M2/M3
  • Downloading models: Mistral 7B, Phi-3.5 Mini, Llama 3.1, Codestral
  • Starting the OpenAI-compatible local server on localhost:1234
  • Connecting QA tools to LM Studio: Promptfoo, Bruno, Cursor IDE, n8n
  • Quality comparison: Mistral 7B vs Claude, scoring ShopSmart test generation
  • LocalAI in Docker for CI without internet

📝 Assignment: LM Studio config + Promptfoo local provider config + 20 ShopSmart test cases (local) + quality comparison table + LocalAI CI Docker setup

SESSION 4

AI-Powered Test Design: Requirements to Test Cases

Duration: 2.5 hours

🧵 ShopSmart story #2: "I want to know my refund status." 50 edge cases need to be identified and a golden dataset created for all future AI evals.

Topics Covered:
  • AI-assisted test design: from PRD to test cases in 3 minutes
  • What makes a test case "good"? The 5-criteria scoring rubric
  • Building a golden dataset: the most important AI QA artefact
  • Risk-based testing for AI systems: how to prioritise when everything can fail
  • Exploratory testing with AI assistance

📝 Assignment: 50-test suite (AI-generated + reviewed) + ShopSmart golden dataset (20 pairs) + risk matrix

WEEK 3: LLM Internals + API Testing

Sessions 5–6 • Understanding GenAI architecture and comprehensive API testing

SESSION 5

GenAI Context: How LLMs, RAG & Agents Work (QA Perspective)

Duration: 2.5 hours

🧵 A ShopSmart developer explains: the chatbot uses RAG to search a 500-document FAQ corpus before answering. QA must understand this to test it effectively.

Topics Covered:
  • LLM mechanics simplified: tokens, temperature, context window
  • Temperature 0 vs 0.7: what changes and what that means for test design
  • RAG architecture: 4 failure points a QA engineer must know
  • Agent anatomy: goals, tools, memory, observation loop (ReAct pattern)
  • MCP (Model Context Protocol): what it is and why it changes QA

📝 Assignment: Temperature comparison report + ShopSmart RAG failure map + agent anatomy diagram + 10 agent scenarios

SESSION 6

API Testing: Bruno, Hurl, Contract & Fuzz

Duration: 2.5 hours

🧵 The ShopSmart backend exposes POST /chat, GET /order/{id}, and POST /refund; all need contract tests before the frontend team depends on them.

Topics Covered:
  • API testing for AI systems: what is different from traditional REST testing
  • Bruno: a git-native offline API client, generating collections from an OpenAPI spec
  • Hurl: CLI HTTP testing, CI-native, scriptable
  • Pact: consumer-driven contract testing for AI APIs
  • Schemathesis: automated OpenAPI fuzz testing
  • Mockoon: API mocking, critical for isolating the n8n AI node

📝 Assignment: Bruno collection (35 requests) + Hurl smoke suite + Pact contract + Schemathesis fuzz report

WEEK 4: n8n Integration + Playwright E2E + Database

Sessions 7–9 • Workflow testing, E2E automation, and SQL validation

SESSION 7

n8n Integration Testing: Workflows, AI Nodes & Failure Injection

Duration: 2.5 hours

🧵 ShopSmart uses n8n: webhook (new message) → AI classification node → route to refund/tracking/escalate. All 3 routing paths need testing, isolated from the LLM.

Topics Covered:
  • n8n architecture for QA: workflows, triggers, AI nodes, webhook handling
  • Testing trigger nodes: webhook validation and schema checking
  • Testing AI nodes: why you must mock the LLM output for routing tests
  • Failure injection with Mockoon: simulating AI node failures
  • Idempotency testing: why duplicate messages must not create duplicate DB records

📝 Assignment: n8n ShopSmart workflow + 12 test cases + Mockoon config + idempotency evidence

📦 PROJECT 1 SUBMISSION: ShopSmart AI QA Foundations Pack (End of Week 4)

Must include all of the following:

✅ ShopSmart AI QA Architecture Overview, 2 pages (Session 1)
✅ LM Studio quality comparison: local vs cloud test generation scores (Session 3)
✅ 50-test suite with AI Audit column (Session 4)
✅ Bruno API collection (35 requests) + Pact contract + Schemathesis fuzz (Session 6)
✅ n8n test suite: 12 test cases + Mockoon failure evidence (Session 7)
✅ Playwright POM: 12 ShopSmart E2E journeys + cross-browser + visual snapshots (Session 8)
✅ SQL test suite: 15 ShopSmart business rules (Session 9)
✅ GitHub Issues: 10 ShopSmart bugs with severity, priority, AI-suggested root cause

Assessment: Pass/Fail. The instructor reviews the GitHub repo; a green CI badge is required for Playwright.

WEEK 5: LLM Output Quality + RAG Testing

Sessions 10–11 • Promptfoo, DeepEval, RAGAS, and LangSmith

SESSION 10

🤖 LLM Output Quality: Promptfoo, DeepEval & Hallucination Detection

Duration: 2.5 hours

🧵 ShopSmart QA discovers the chatbot sometimes says "returns are free for 90 days" when the actual policy is 30 days. A systematic evaluation framework is needed.

Topics Covered:
  • Types of LLM failures: hallucination, refusal, relevance drift, consistency
  • Promptfoo: YAML eval suites, multi-provider, assertions
  • DeepEval: Answer Relevancy and Hallucination metrics
  • SelfCheckGPT: consistency-based hallucination scoring (no ground truth needed)
  • CI quality gate design: what threshold is "good enough" for ShopSmart?

📝 Assignment: 30-test Promptfoo YAML + hallucination report + CI quality gate

WEEK 6: AI Security + MCP Agents

Sessions 12–13 • Red-teaming, prompt injection, and AI QA agents

SESSION 12

🤖 AI Security: Prompt Injection, Garak & ShopSmart Red Team

Duration: 2.5 hours

🧵 ShopSmart red team challenge: "Can a customer trick the chatbot into revealing other customers' order details?" A full security test is required.

Topics Covered:
  • The AI attack surface: how AI systems fail differently from traditional systems
  • Garak: automated LLM red-teaming with 100+ attack probe categories
  • Manual injection patterns: 15 ShopSmart-specific attack types
  • IDOR via prompt injection: testing cross-user data leakage
  • Indirect injection: embedding instructions in uploaded documents
  • Detoxify: toxicity scoring for AI responses

📝 Assignment: AI Security Report: Garak scan + 15 ShopSmart injection tests + IDOR evidence + Detoxify scores

WEEK 7: Performance Testing + ⚡ AI Load Testing

Sessions 14–15 • k6, Grafana, and the career-differentiating AI load testing session

SESSION 15 ⭐

⚡ Load Testing AI Systems: TTFT, Token Cost, RAG Quality & n8n Concurrency

Duration: 2.5 hours • THE SESSION ALMOST NO QA COURSE COVERS: YOUR CAREER DIFFERENTIATOR

🧵 The ShopSmart CFO asks: "What will the AI chatbot COST at 5,000 concurrent users on Black Friday, and will the quality hold up?" Traditional load testing cannot answer this.

New Metrics You Will Learn:
  • TTFT: Time-to-First-Token (streaming start). Target: < 800ms P95
  • TGT: Total Generation Time (full response). Target: < 4s P95
  • Tokens/sec: LLM processing speed under load. Target: > 20 t/s at 50 VUs
  • Cost/request: dollar cost per LLM call under concurrency. Target: track per load level
  • RAGAS@Load: RAG quality degradation under concurrency. Target: faithfulness delta < 0.15

📝 Assignment: ⚡ ShopSmart AI Load Report: "Chatbot costs $847/hr at Black Friday peak · RAG quality drops 28% at 100 concurrent · n8n saturates at 47 concurrent" + Grafana dashboard

WEEK 8: Chaos Engineering + Security Testing

Sessions 16–17 • ToxiProxy, OWASP ZAP, and SOC2 compliance

📦 PROJECT 2 SUBMISSION: ShopSmart Automated AI QA Framework (End of Week 8)

Must include all of the following:

✅ Everything from Project 1 (updated)
✅ Promptfoo YAML suite: 30 test cases + hallucination report + CI gate (Session 10)
✅ RAGAS eval: all 4 metrics + LangSmith trace + before/after improvement evidence (Session 11)
✅ AI Security Report: Garak scan + 15 injection tests + IDOR evidence (Session 12)
✅ MCP agent demo: ShopSmart issue → AI test → Playwright runs → PR (3-min video) (Session 13)
✅ k6 load: ShopSmart baseline + Black Friday spike + Grafana dashboard (Session 14)
✅ ⚡ AI Load Report: TTFT + token cost + RAG quality under load + n8n saturation (Session 15)
✅ Chaos Report: 6 failure scenarios + ToxiProxy config (Session 16)
✅ Security Report: ZAP + nuclei + IDOR + CI integration, SOC2-ready pack (Session 17)

Assessment: GitHub repo URL + green CI badge + live Allure report + MCP demo video link

WEEK 9: Agentic QA – Testing AI That Decides & Acts

Sessions 18–19 • Goal-based testing and multi-agent systems

WEEK 10: Advanced n8n + AI Observability

Sessions 20–21 • Orchestrator testing and drift detection

WEEK 11: QA Architecture + CI/CD Quality Gates

Sessions 22–23 • Enterprise strategy and full pipeline implementation

WEEK 12: AI Governance + Capstone Demo Day

Sessions 24–25 • EU AI Act compliance and final presentation

SESSION 25 🎓

Demo Day: ShopSmart Full Journey Presentation & Certification

Duration: 2.5 hours

🧵 STORY COMPLETE: You started with a 1-page product brief in Session 1. You now have a fully tested, monitored, governed, CI/CD-integrated AI QA platform. Time to prove it.

Demo Day Activities:
  • Portfolio demo preparation: the 10-minute ShopSmart story structure
  • Student presentations: 10 minutes each
  • Peer review and scoring
  • Q&A challenge: "How would you adapt this for a financial services AI product?"
  • Certificate ceremony + career guidance + next steps

📊 Assessment & Grading

Grading Components:

  • Weekly Assignments (30%): 24 practical assignments (one per session; Session 25 is Demo Day)
  • Project 1: Foundations Pack (15%): End of Week 4 submission
  • Project 2: Automated AI QA Framework (20%): End of Week 8 submission
  • Project 3: Capstone Portfolio (25%): Session 25 live demo + submission
  • Attendance & Participation (10%): Live session presence + GitHub activity

Certification Requirements:

  • Minimum 70% overall score
  • All 3 capstone projects submitted
  • 80% attendance (minimum 20 of 25 sessions)
  • Final ShopSmart portfolio on GitHub with a green CI badge
  • Allure report deployed to GitHub Pages