
From Manual to AI-Assisted Testing: GitHub Copilot for QA Engineers 

Release cycles are getting shorter. Codebases are getting bigger. And QA engineers are expected to catch more issues faster than ever. 

At some point, every QA hits the same wall: there’s simply more to test than time allows. Writing every test case manually, investigating failures one by one – it doesn’t scale anymore. 

This is where GitHub Copilot starts to feel less like a tool and more like a teammate. 

Inside Visual Studio Code, Copilot goes beyond autocomplete. With Copilot Chat, it can read your workspace, understand your test files, pick up terminal errors, and let you ask questions in plain English without jumping between tools. 

What makes it especially useful for QA is context. When you point Copilot to a failing test, an error log, or even an API doc, it stops giving generic answers. Instead, it responds like someone who has already gone through your code and understands what you’re trying to do. 

This guide walks you through how to use Copilot as a practical QA co-buddy with real workflows and prompts you can start using immediately. 


Introduction

The way software gets built has shifted. AI has crept into nearly every part of the process – writing code, catching bugs, reviewing changes before they ship. QA has felt that shift too, maybe more than most. 

For years, test automation was a predictable grind: write the test, run it, fix it when it breaks, move on. Useful work, but slow – and it assumed the person writing the tests already had a solid mental map of the codebase. Junior engineers hit that gap early. Senior engineers kept rewriting the same setup code they’d built on five different projects before. 

GitHub Copilot changes how that work feels – but probably not in the way you’d expect. 

The Shifting Role of QA in AI-Assisted Development

The QA engineer’s job has quietly grown. It used to be enough to write a solid regression suite and keep it green. Now teams expect QA to own a much bigger question: what actually needs testing, how deep does that testing need to go, and where does automation genuinely help versus where does it just create more things to maintain? 

That question gets harder as applications get bigger. A modern web app can have hundreds of endpoints, branching user flows across multiple devices, and pipelines pushing changes several times a day. Nobody covers that manually. Nobody builds full automation for every feature on every sprint without eventually running out of road. 

AI-assisted development doesn’t make that volume disappear. It just means you’re no longer facing it without help. 

Why GitHub Copilot is more than a developer tool – it’s a QA force multiplier

Most QA engineers hear “GitHub Copilot” and picture something their developers use. An autocomplete. A function generator. Something that lives in backend files and has nothing to do with test automation. 

That’s worth reconsidering. 

Copilot is not scoped to any role or file type. It works wherever you work – and for QA engineers spending their days inside test files, spec folders, and terminal windows full of failed runs, it has far more to offer than most tutorials ever show. 

Here’s a concrete example. You’re mid-sprint. A test starts failing in CI and you have no idea why. The error is inside a file someone else wrote three months ago. Normally you’d spend 20–30 minutes reading through the code, tracing the function, checking what changed in the last few commits. 

With Copilot Chat, you open the file and type: 

#testFailure #Files login.spec.js 
This test started failing after yesterday's deployment.  
What's the most likely cause? 

Copilot reads the test file, pulls in the failure output, and gives you a targeted answer. It might flag that a selector changed, that an async call isn’t being awaited properly, or that a new redirect was added that the test doesn’t account for. You still verify it. But you are starting from a real diagnosis, not a blank page. 
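To make that concrete, here’s the kind of bug it flags – a hypothetical login.spec.js excerpt (not from any real project) where one missing await lets the assertion race the navigation: 

import { test, expect } from '@playwright/test';

test('login redirects to dashboard', async ({ page }) => {
  await page.goto('/login');
  await page.fill('input[name="username"]', 'Admin');
  await page.fill('input[name="password"]', 'admin123');
  page.click('button[type="submit"]'); // missing await – the assertion below can run mid-navigation
  await expect(page).toHaveURL(/dashboard/);
});

Locally the navigation is usually fast enough to hide the race; on a slower CI runner it surfaces as an intermittent failure. 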

That difference compounds across a full sprint. 

Developers use Copilot to build things faster. QA engineers use it to understand things faster – across parts of a codebase they may have never touched, under time pressure, without slowing the team down to ask questions every time something unfamiliar shows up. 

That is the distinction. And it’s why Copilot is not just a developer tool that happens to work for QA. For test engineers who know how to use the context system, it behaves like a different product entirely. 

What is GitHub Copilot? 

From autocomplete to conversational AI – the evolution of Copilot 

GitHub Copilot’s first version did one thing. You typed, it guessed what came next – based on the function name, the code around it, whatever it could infer from the open file. Most of the time it was right enough to be useful. Sometimes it was confidently wrong. 

That was still limited in a specific way – it only knew what was directly in front of it. The active file, the current line, whatever context happened to be visible in the editor window. 

That version of Copilot is still there. But it’s no longer the main event. 

GitHub Copilot today is a conversational AI assistant built into VS Code. You can talk to it. Ask why a test is failing. Tell it to generate a Page Object Model based on a file you point it at. Ask it to explain what a service does before you write a single test for it. It reads terminal output, understands your project structure, pulls in live documentation from external URLs, and holds a conversation across multiple turns without losing track of what you were working on. 

The shift from autocomplete to chat changes what the tool is actually good for. Autocomplete saves keystrokes. Conversational AI saves thinking time – and for QA engineers working across unfamiliar code, that’s the more valuable thing. 

Copilot Chat vs. inline suggestions – knowing when to use which 

Both modes are active at the same time in VS Code. They serve different purposes and the faster you internalise which one to reach for, the less friction the tool creates. 

Inline suggestions appear as you type, directly inside your test file. Grey ghost text that either completes the line you’re on or suggests the next few lines. Accept with Tab, dismiss with Escape, or just keep typing to ignore it. 

Reach for inline suggestions when: 

  • You are writing a test and want Copilot to complete a pattern you’ve already started 
  • You need a quick assertion and the surrounding code makes the intent obvious 
  • You’re already mid-way through a Page Object and just need the next locator filled in 
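As a quick sketch of what that feels like – in a hypothetical Page Object, you type the method signature and the ghost text proposes the body: 

import { Page } from '@playwright/test';

export class LoginPage {
  constructor(private readonly page: Page) {}

  // You type the signature; the grey ghost text suggests the body below.
  // Tab accepts, Escape dismisses, or keep typing to override.
  async fillPassword(password: string) {
    await this.page.locator('input[name="password"]').fill(password);
  }
}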

Copilot Chat lives in a separate panel. Open it with Ctrl + Alt + I on Windows/Linux or Cmd + Option + I on Mac. Unlike inline suggestions, this is a full conversation – you ask questions, point it at files, paste in error logs, and work through a problem across multiple turns. 

Reach for Copilot Chat when: 

  • You need to understand code before you test it 
  • A test is failing and you don’t know why 
  • You want to generate a full test file, not just complete a line 
  • You’re referencing multiple files or pulling in external documentation 

A practical way to think about it: inline suggestions handle the writing, Copilot Chat handles the thinking. Most productive QA workflows use both – chat to plan and understand, inline to execute. 

Copilot in VS Code – setup, authentication, and enabling chat 

Getting Copilot running in VS Code takes about five minutes. 

Step 1 – Install the extension 

Open VS Code, go to the Extensions panel (Ctrl + Shift + X), and search for GitHub Copilot. Install it. This single extension covers both inline suggestions and Copilot Chat – no separate install needed for chat. 

Step 2 – Sign in with GitHub 

After installation, VS Code will prompt you to sign in with your GitHub account. Follow the browser authentication flow. Copilot requires an active subscription – Individual, Business, or Enterprise. If your organisation runs Business or Enterprise, your admin will have already switched it on for your account. 

Step 3 – Verify inline suggestions are on

Open any JavaScript or TypeScript file and start typing a function. If grey ghost text appears, inline suggestions are working. If nothing shows, check the Copilot icon in the VS Code status bar at the bottom – it shows whether Copilot is active or paused. 

Step 4 – Open Copilot Chat 

Open Copilot Chat using: 

  • Windows/Linux: Ctrl + Alt + I  
  • Mac: Cmd + Option + I  

Or click the Copilot icon in the VS Code sidebar. 
To check if it’s working, try: 

What does this workspace contain?


If Copilot describes your project structure, you’re good to go. 

One setting worth turning on immediately 

Go to VS Code Settings and search for “Copilot workspace index”. Turn on workspace indexing. Without it, project-wide queries return patchy results. With it, Copilot can trace functions across files, locate test coverage, and reason across your whole repo when you use #Codebase in prompts. It makes a noticeable difference. 


Why QA Engineers Should Care About Copilot 

The growing complexity of modern codebases 

A decade ago, testing meant working on a relatively contained system – one repo, one team, one pipeline. 

That’s no longer the reality. 

Today’s applications are distributed systems: 

  • Frontend + API gateway  
  • Multiple microservices  
  • Independent deployments  
  • Third-party integrations  
  • Feature flags and A/B tests  

And each piece can fail in its own way. 

For QA engineers, this means: 

  • More surface area to test  
  • More dependencies to understand  
  • More hidden failure points  

You’re not just testing features – you’re testing interactions across systems. 

Why manual test design alone doesn’t scale 

Manual testing has a natural limit – and it’s not about skill, it’s about volume. 
Writing a good test takes time: 

  • Understanding the code  
  • Identifying meaningful scenarios  
  • Designing maintainable test cases  

Now multiply that across multiple features in a sprint. 

What happens? 

  • Happy paths get covered  
  • Edge cases get postponed  
  • Integration scenarios get skipped  

Over time, coverage quietly weakens until something breaks in production. 
This isn’t a team problem. It’s a scale problem. 
And that’s exactly where Copilot helps. 

AI-assisted QA workflows 

Copilot doesn’t replace QA engineers – it removes the repetitive work so you can focus on thinking. 

1. Understanding unfamiliar code 

Instead of reading files for 20 minutes, ask: 

#Files paymentService.js 
What does this service do and what are the likely failure points I should test? 


You still validate the answer but now you start faster. 

2. Generating test cases 

Instead of starting from scratch: 

  • Point Copilot to your implementation  
  • Ask what should be tested  

It won’t be perfect, but it gets you past the blank page. 

3. Finding missing coverage 

#Files checkout.spec.js #Files checkoutService.js 
What scenarios in the service are not covered by the current test file? 

It often catches: 

  • Missing error paths  
  • Untested branches  
  • Edge cases you overlooked 

4. Debugging failures 

#testFailure #Files cart.spec.js 
This test passed yesterday and is failing now. What changed that could cause this? 


Instead of guessing, you start with a hypothesis. 

Where Copilot fits in the testing lifecycle 

Copilot isn’t limited to one phase – it helps wherever work slows down. 

During requirement analysis 

Ask what scenarios should be tested. You get a starting point instead of a blank page. 

While writing tests 
It speeds up: 

  • Playwright/Cypress scripts  
  • Page Objects  
  • Fixtures and setup  

You focus on decisions, not typing. 

During maintenance 

When tests break: 

  • Compare test + implementation  
  • Ask what’s out of sync  

This saves a lot of time. 

Before PR merges 
Quick coverage checks: 

What is not tested in this file? 

It won’t catch everything, but it catches enough to matter. 

When CI fails 

Paste logs + test file: 
What is causing this failure? 

Even if not perfect, it narrows the problem quickly. 

Understanding Copilot Context – The Brain Behind the Responses 

What “context” means in AI-assisted coding 

Most bad Copilot responses happen for one reason: lack of context. 

Copilot works entirely from what lands in the prompt. It is not pulling from a live database of your project. The prompt includes more than just the words you type – the files that are open, the code you have selected, the conversation history, and anything you explicitly point it at. 

That collection of information is what “context” means in this setting. And the quality of what Copilot gives back is almost entirely a function of how much relevant context it had to work with when it generated the response. 

A prompt with no context gets a generic answer. A prompt with the right file, the right symbol, and the right error output gets something you can actually use. 

That’s the whole game with Copilot. Not clever prompts. 

Better context = better answers. 

Implicit context – what Copilot sees automatically 

Before you type a single # tag, Copilot is already reading several things without you asking it to. 

The active file is always included. Whatever file you have open and in focus in VS Code is part of every Copilot Chat request automatically. If your login.spec.js is open when you ask a question, Copilot has read it before it answers. 

Selected code is picked up too. Highlight a block of code in your editor, then open Copilot Chat – that selection is passed in as context. This is useful for targeting a specific function or assertion without referencing the whole file. 

Conversation history carries forward within the same chat session. If you told Copilot earlier that you are working with Playwright and targeting a login flow, it remembers that through the conversation. Starting a /new session clears this – which is worth knowing when you switch to a different task and don’t want old context bleeding in. 

Editor state – things like the current cursor position and recently opened files – can also influence inline suggestions, though this matters more for code completion than for Chat. 

The practical implication: just having the right file open already improves Copilot’s answers before you add anything explicit. It is a small thing but it adds up across a working day. 

Explicit context – what you feed Copilot intentionally via #-mentions 

Implicit context has limits. It only knows what happens to be open. For anything beyond the active file, you have to be deliberate. 

That is what #-mentions are for. They are how you tell Copilot exactly what to read before it answers. 

The ones QA engineers reach for most often: 

#Files – points Copilot at a specific file. Use this when your question is about something that isn’t currently open. 

#Files authService.js 
What are the main functions I should be writing tests for? 

#Codebase – tells Copilot to search your entire project. Use this when you don’t know where something lives or want a project-wide answer. 

#Codebase 
Where is the payment flow handled and are there existing tests for it? 

#testFailure – pulls in the details of a failing test from the VS Code test panel. Useful the moment something goes red in CI and you need a starting point fast. 

#testFailure 
What's causing this failure and how do I fix it? 

#terminalLastCommand – includes the output from your last terminal command. Use this after a failed test run to give Copilot the actual error output rather than describing it. 

#terminalLastCommand 
What does this error mean and which test is most likely responsible? 

#fetch – pulls in content from an external URL. Use this when you want Copilot to generate code against the current version of a framework’s documentation rather than whatever it learned during training. 

#fetch https://playwright.dev/docs/test-assertions 
Generate assertions for a checkout form validation test using the latest Playwright API. 
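The response typically comes back as assertions you can drop straight into a test – something along these lines (a sketch; the locators and messages are assumptions, not output from a real run): 

await expect(page).toHaveURL(/\/checkout/);
await expect(page.getByRole('alert')).toBeVisible();
await expect(page.getByRole('alert')).toContainText('This field is required');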

Each tag is a deliberate decision to give Copilot more to work with. The more precisely you target the relevant information, the less guessing it does. 

Why better context = more accurate, test-relevant responses

Here is the same question asked two different ways: 

Without context: 

Write a test for the login function. 

Copilot will write something generic. Probably a Playwright test with #username and #password selectors pointing at example.com. Technically valid. Completely useless for your actual project. 

With context: 

#Files loginPage.js #Files auth.spec.js 
Write additional test cases for the login function based on how it is currently implemented. 

Now Copilot has read your actual page object, seen your existing test patterns, and can generate something that fits your project’s structure, uses your real selectors, and follows the conventions already in the file. 

Same question. Completely different output. The only variable is context. 

This is worth internalising early because it changes how you think about prompting. The question you ask matters less than what you give Copilot to read before it answers. 

Workspace indexing – Remote index vs. Local index vs. Basic index 

When you use #Codebase in a prompt, Copilot must search your project somehow. What it finds – and how accurately – depends on which indexing mode is running in the background. 

There are three modes and they behave quite differently. 

The most capable is remote indexing. GitHub builds and maintains an index of your repository on its servers, and when you use #Codebase, Copilot searches that. On larger projects the difference is noticeable – it finds things buried deep in the folder structure that local search misses. Getting remote indexing working requires your repo to be on GitHub with workspace indexing enabled in VS Code. To enable it, open the Command Palette with Ctrl + Shift + P, type “GitHub Copilot: Index workspace” and run it. 

Without a GitHub remote, VS Code builds a local index from the files sitting in your workspace instead. This covers most situations well enough. For a single service or a frontend application with a reasonable number of test files, local indexing handles #Codebase queries without much trouble. 

The fallback below that is basic mode – no real index, just whatever VS Code can piece together from open files and recent activity. Results get thin quickly. If Copilot keeps giving you vague answers to project-wide questions despite you using #Codebase, this is usually the reason. 

To check which mode you are running, glance at the bottom status bar in VS Code. Active workspace indexing shows an icon and its current state. If you see nothing there, open Settings (Ctrl + ,), search for github.copilot.chat.codesearch.enabled, and turn it on. 

For most QA teams it’s worth five minutes to sort this out. Once remote or local indexing is active, #Codebase becomes a genuinely useful tool rather than a coin flip. 

Copilot Chat Context in VS Code – Deep Dive 

The #-mention system: files, folders, symbols, tools, terminal output 

The #-mention system is how you move Copilot from guessing to knowing. 

Without any # tags, Copilot works from whatever is currently open in your editor. That’s fine for simple questions about the active file. The moment your question touches anything outside that – another file, a function somewhere else in the project, a terminal error from a failed run – you need to be explicit. 

Every # tag you add is an instruction: read this before you answer. The more precisely you point Copilot at the relevant material, the less it has to infer, and the more useful the response gets. 

The system covers five broad categories: 

  • Files and folders – point at specific files or entire directories 
  • Symbols – target a function, class, or variable by name 
  • Tools – pull in test failures, terminal output, VS Code problems 
  • External sources – fetch live documentation or GitHub repositories 
  • Editor state – reference selected code or search results 

The sections below cover each one with QA-specific examples.

#Files and #Folders – scoping Copilot to your test directories 

#Files is the most straightforward tag. It tells Copilot to read a specific file before responding. 

#Files login.spec.js 

This test is failing intermittently on CI but passing locally. 

What are the most likely causes? 

#Files authService.js 
What are the main functions in this file and what test cases should I be writing for each? 

You can reference multiple files in a single prompt. This is useful when your question spans an implementation file and its corresponding test: 

#Files checkoutService.js #Files checkout.spec.js 
The service was updated yesterday. Which tests need to change to reflect the new behaviour? 

#Folders works the same way but at directory level. Use it when your question is about a whole module rather than one file. 

#Folders tests/e2e/ 
Are there any tests in this folder that don't have explicit wait strategies? 

Review each file and flag the ones most likely to be flaky. 

#Folders src/api/ 
I need to write integration tests for this API layer. 

What endpoints exist and which ones have no test coverage yet? 

When to use which: Use #Files when you know exactly which file is relevant. Use #Folders when you want Copilot to survey a whole area of the project without you listing every file individually. 

Real-time Example: Playwright_UI framework 

#Files 

#Files pageObjects/loginPo.ts #Files fixtures/fixtures.ts 
The fixture-level login calls loginPage.login() then checks 
getPageTitleFromBreadcrumbText() equals "Dashboard". 
Is the wait strategy between these two calls reliable? 
What could cause this to fail intermittently on CI? 


#Folders 

#Folders pageObjects/ 
Review all Page Objects in this folder. 
Are the locator strategies consistent across files? 
Which ones use XPath and which use CSS selectors? 
Should we standardise to one approach? 

#Codebase – letting Copilot search your full project for relevant code 

#Codebase hands Copilot a search across your entire workspace. You don’t have to know where something lives – you just ask and it finds it. 

#Codebase 
Where is the authentication flow handled and do we have tests covering the token refresh logic? 
#Codebase 
Are there any hardcoded waitForTimeout calls in our test suite? 
List the files and line numbers. 
#Codebase 
We're adding a new payment method. What existing tests would be affected and what new test scenarios should I create? 

This is particularly useful when you join a project mid-sprint and need to understand the test architecture quickly. Rather than spending an hour reading folder structures, one #Codebase prompt can give you a working map of where things live and what’s covered. 

One thing worth knowing: the quality of #Codebase results depends on your workspace indexing mode. If answers feel vague or incomplete, check that workspace indexing is enabled – see the workspace indexing section earlier in this guide. 

Example: 

#Codebase 
We have Page Objects for login, admin, pim, myInfo and recruitment. 
Which Page Object methods currently have no test coverage?  
Group results by file. 

#Symbols / #sym – targeting specific test functions, classes, or methods 

#Symbols lets you reference a specific function, class, or variable by name rather than pointing at an entire file. Useful when you know exactly what you want to ask about. 

#Symbols validatePaymentInput 
What edge cases is this function not handling? 
Write test cases that cover the missing scenarios. 
#Symbols LoginPage 
Review this Page Object class. Are the locators stable? 
Would any of them likely break after a minor UI change? 
#sym processRefund 
Explain what this function does and generate a test that covers the failure path when the refund API returns a 500. 

#sym is just shorthand for #Symbols – both work identically. 
The advantage over #Files is precision. If a file has 400 lines and your question is about one 20-line function, #Symbols keeps the context tight and the response focused. 

Example: 

#Symbols 
#Symbols selectDropDownOption 
This method uses a scroll loop with waitForTimeout(100). 
What are the risks of this approach and how would you rewrite it to be more stable? 
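The answer usually points at Playwright’s built-in auto-waiting. A minimal sketch of the rewrite, with the option selector assumed rather than taken from the real file: 

async selectDropDownOption(optionText: string): Promise<void> {
  const option = this.page.locator('.oxd-select-option', { hasText: optionText });
  await option.scrollIntoViewIfNeeded(); // waits for the element instead of sleeping in a loop
  await option.click();
}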

#sym 

#sym createNewSystemUser 
Review this method in AdminPo. 
Are the expect() assertions inside the method a good practice or should they be moved to the test file? 

#selection – instant analysis of highlighted test code blocks 

#selection captures whatever code you have highlighted in the editor at the time you send the prompt. No file reference needed – just select the block and ask. 

How to use it: 

  1. Highlight the code you want to ask about in your editor 
  2. Open Copilot Chat 
  3. Type your prompt – #selection is included automatically when you have an active highlight, or you can type it explicitly 
#selection 
This assertion keeps failing on the third run. 
What's wrong with it and how should I rewrite it? 
#selection 
Is this Page Object method handling async operations correctly? 
#selection 
Rewrite this test to use data-testid selectors instead of CSS class names. 

When to use it: #selection is the fastest option when you’re in the middle of writing or debugging and you want a quick answer about a specific block without switching to a file reference. It keeps you in flow. 

Example: #selection 

First highlight the entire waitForLoaderToDisappear method in pageObjects/basePo.ts, then type: 

#selection 
This loader wait method uses a loop with waitForTimeout(200). 
Could this miss the loader appearing on fast page transitions? 
Is there a more reliable way to handle this? 
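A typical suggestion replaces the polling loop with state-based waiting – a minimal sketch, assuming the loader has a stable selector: 

async waitForLoaderToDisappear(): Promise<void> {
  const loader = this.page.locator('.oxd-loading-spinner');
  // Resolves whether the loader never appeared or appeared and then vanished –
  // no fixed 200ms interval to miss a fast transition.
  await loader.waitFor({ state: 'hidden', timeout: 10_000 });
}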

#fetch <URL> – pulling live API docs or changelogs into your prompt 

#fetch tells Copilot to retrieve the content of a URL and use it as context. This is how you make sure Copilot is working from the current version of a framework’s documentation rather than whatever it learned during training – which may be months or years out of date. 

#fetch https://playwright.dev/docs/test-assertions 
Generate assertions for a login form test that checks: 
- Successful redirect to /dashboard 
- Error message visibility on wrong credentials 
- Input field validation on empty submission 
#fetch https://docs.cypress.io/api/commands/intercept 
Write a Cypress test that intercepts the POST /api/login request and validates the request payload and response status. 
#fetch https://playwright.dev/docs/api/class-locator 
Our team is migrating from page.$ selectors to the Locator API. 
Show me how to rewrite these three selectors using the current approach. 

This is especially useful when a framework ships a breaking change. Instead of guessing whether Copilot knows about it, fetch the updated docs directly and let it generate against the current API. 
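The before/after usually looks something like this (selectors are hypothetical): 

// Before – one-shot string selector calls:
await page.click('#txtUsername');
await page.fill('#txtUsername', 'Admin');

// After – the Locator API, which auto-waits and retries:
const username = page.locator('#txtUsername');
await username.click();
await username.fill('Admin');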

Example: 

#fetch 
#fetch https://playwright.dev/docs/api/class-locator 
#Files pageObjects/loginPo.ts 
Our loginPo uses string selectors passed to page.click() and page.fill() inside WebActions. 
Rewrite the login() method using the current Playwright Locator API best practices. 

#githubRepo – referencing external repos for consistency checks 

#githubRepo lets Copilot read a public GitHub repository and answer questions about it. Useful for learning from well-maintained open-source test suites or checking how a framework handles something before you implement it yourself. 

#githubRepo microsoft/playwright 
How are test fixtures implemented in this repo? 
Show me a pattern I can adapt for our project. 
#githubRepo cypress-io/cypress-realworld-app 
How is the Page Object pattern used in this project? 
We want to adopt a similar structure for our own test suite. 
#githubRepo microsoft/playwright 
How does this repo handle authentication state across multiple test files? 
We're trying to solve the same problem. 

When to use it: Before building something from scratch, check how a reputable project handles the same problem. It’s faster than reading through a repo manually and often surfaces patterns that aren’t obvious from the documentation alone. 

Example: 

#githubRepo microsoft/playwright 
Our project uses a custom WebActions class with methods like clickElement(), fillInput(), and isElementVisible(). 
How does the official Playwright repo handle these same interactions?  
Are we missing any best practices in our current approach? 

@-mentions – using @workspace, @terminal, @vscode for QA-specific queries 

@-mentions are different from # tags. Where # tags pull in content for Copilot to read, @ participants route your question to a specialised assistant that has domain-specific knowledge or access. 

@workspace – reasons across your entire project structure. Similar to #Codebase but as a participant rather than a context tag. Good for architectural questions. 

@workspace 
How is error handling tested across our application? 
Are there consistent patterns or are different services doing it differently? 

Example: 

@workspace 
Our project has Page Objects for login, admin, pim, myInfo and recruitment in the pageObjects folder. 
How is error handling managed across these Page Objects? 
Are we consistently handling failures or are different 
Page Objects doing it differently? 

@terminal – interprets shell commands and their output. Use this when you need help with test runner commands, debugging CLI output, or setting up scripts. 

@terminal 
How do I run only the failing Playwright tests in headed mode with verbose output and a specific grep filter? 
@terminal 
This npm test command is throwing an error I haven't seen before. 
What does it mean and how do I fix it? 

Example: 

@terminal 
How do I run only the login-related tests in this Playwright project in headed mode with the Allure reporter enabled and slow motion of 500ms? 

@vscode – answers questions about VS Code configuration and settings. Use this for test runner setup, launch configs, and extension settings. 

@vscode 
How do I configure the Playwright test runner so tests run automatically on file save? 
@vscode 
How do I set up a launch.json that runs a specific Playwright test file in debug mode? 

Example: 

@vscode 
How do I configure a launch.json in this Playwright TypeScript project to run a specific test file in debug mode with headed browser and slow motion? 

Vision support – attaching bug screenshots and UI sketches to Copilot chat 

Copilot Chat in VS Code accepts images. You can attach a screenshot directly to a prompt and ask questions about what’s in it. 

To attach an image, click the paperclip icon in the Copilot Chat input, or drag and drop the image file directly into the chat panel. 

Debugging a UI failure: 

Attach a screenshot of the failing state and ask: 

This is the login page in its error state. 
The error message is overlapping the input field. 
What CSS or layout issue is most likely causing this? 

Generating locators from a UI: 

Here is a screenshot of the checkout form. 
What stable selectors would you recommend for each interactive element? 
Prefer data-testid over CSS class names where possible. 
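The rewrite tends to be a one-line swap per selector – for instance (assuming the data-testid attributes exist in the markup or will be added): 

// Before: brittle class-based selector
await page.locator('.checkout-form .btn.btn-primary').click();

// After: survives styling and layout changes
await page.getByTestId('checkout-submit').click();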

Reviewing a design against implementation: 

The first image is the Figma design for the registration form. 
The second is the current implementation. 
What differences do you see that a QA engineer should raise? 

This is genuinely useful for visual regression work and for onboarding onto a new application where you don't yet know the codebase but can see what the UI is supposed to look like. 

Example: 

There are three ways to attach a screenshot: 

Option 1 – Direct paste (fastest): copy any screenshot (Ctrl + C), click inside the Copilot Chat input, and paste (Ctrl + V). It appears as “Pasted Image” and is ready to use immediately. 

Option 2 – Plus icon menu: click the + icon at the bottom left of the chat input, select “Files & Folders…” and browse to your screenshot file. 

Option 3 – Screenshot Window: click the + icon and select “Screenshot Window” to capture any currently open window directly into the chat. 

Once attached, type your prompt alongside it: 

This is our OrangeHRM login page showing the "Invalid credentials" error toast.  
Our loginPo.ts uses this selector to capture it:  
"div.oxd-alert--error p" 
Looking at this screenshot: 
1. Is this selector stable enough for production tests? 
2. Suggest stable selectors for username field, password field, login button and error toast that won't break after a UI update. 
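The kind of answer to expect back is role- and placeholder-based locators that survive class-name churn – a sketch, with the exact locators being assumptions about the OrangeHRM markup: 

const username = page.getByPlaceholder('Username');
const password = page.getByPlaceholder('Password');
const loginButton = page.getByRole('button', { name: 'Login' });
const errorToast = page.getByRole('alert'); // more stable than div.oxd-alert--error p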

Simple Browser integration – live frontend QA context 

VS Code has a built-in browser called Simple Browser. It renders web pages directly inside the editor, and when it’s open, Copilot can use it as context for your questions. 

To open it, use the Command Palette (Ctrl + Shift + P), type “Simple Browser: Show”, and enter your local dev server URL – typically http://localhost:3000 or whichever port your app runs on. 

With Simple Browser open alongside your test file, you can ask Copilot questions that combine both: 

I have the login page open in Simple Browser. 
Looking at the current state of the form, what selectors should I use for the username field, password field, and submit button? 
The checkout flow is open in Simple Browser. 
I'm writing an end-to-end test for it. 
What steps should the test cover based on what you can see? 

This removes the back-and-forth of switching between a browser tab and VS Code when writing tests for something you are actively looking at. The page is right there, Copilot can see it, and the test file is open next to it. 

Example: 

First open Simple Browser in VS Code: 

  • Press Ctrl + Shift + P 
  • Type “Simple Browser: Show” 
  • Enter your OrangeHRM local URL – http://localhost 

With the page rendered inside the editor and your test file open next to it, ask Copilot for recommended selectors for the login form. Because it can see the live page and your loginPo.ts at the same time, the suggestions map directly onto the fields your Page Object already targets. 

Exploring the Awesome Copilot Chat Context Repository 

Purpose and structure of the repository

Most Copilot guides stop after introducing a handful of context tags. There are far more available and the ones that often get skipped are surprisingly useful for QA work. 

That’s where the Awesome Copilot Chat Context repository comes in. You can find it here: github.com/demo/Awesome-Copilot-Chat-Context 

Think of it as a complete reference guide to everything you can feed into Copilot before asking a question. Instead of guessing which tag to use, you get a structured breakdown of all available options, grouped by the kind of context they provide, along with practical examples. 

For QA engineers, this repo comes in handy in two main situations. First, when you’re stuck figuring out which context tag fits the problem you’re solving. Second, when you’re helping teammates get started with Copilot – pointing them to a well-organized resource is far easier than explaining everything from memory. 

The repository organizes tags into clear categories like project and file context, code selection, terminal output, debugging tools, external data, and editor-specific inputs. It’s essentially a map of how to “talk” to Copilot more effectively. 

Complete breakdown of all supported # context tags for QA use 

Copilot offers far more than just #Files and #Codebase. Once you start exploring the full set, you realize how much control you actually have over what Copilot sees and understands. 

Here’s a cleaner way to think about them grouped by what kind of information they provide: 

Project & File Structure 

  • #Files – specific files from your workspace 
  • #Folders – all files within a directory 
  • #Codebase – a broader search across the entire project 

Symbol & Code Selection 

  • #Symbols / #sym – a specific function, class, or variable 
  • #selection – currently highlighted code in the editor 
  • #usage – how a symbol is used across the codebase 

Terminal & Runtime 

  • #terminalLastCommand – output from the last terminal command 
  • #Terminal command output – full terminal session output 

Debugging & Test 

  • #Problems – issues from the VS Code Problems panel 
  • #testFailure – failing test details from the test panel 
  • #findTestFiles – locates test files in a given scope 

External Data 

  • #fetch – content retrieved from a live URL 
  • #searchResults – results from a file or symbol search 
  • #vscodeAPI – relevant VS Code API documentation 

Each tag is basically a way of saying: “Here’s the exact information you should look at before answering.” The better you choose these inputs, the more relevant Copilot’s output becomes. 

Debugging & Test Contexts – #Problems, #testFailure, #findTestFiles 

#Problems 

Pulls in errors and warnings from the VS Code Problems panel – TypeScript errors, ESLint violations, import issues – directly into your prompt. 

#Problems 
There are TypeScript errors appearing in our pageObjects folder. 
Explain each error and suggest the fix for each one. 
#Problems #Files helper/webAction.ts 
The Problems panel is showing type errors in this file. 
What is causing them and how do I resolve them without breaking the existing method signatures? 

Benefit: You don’t have to copy-paste the error message. Copilot reads it directly from the panel and answers against the actual error text. 

Limitation: Only covers what VS Code’s diagnostic tools catch – TypeScript and linting errors. Runtime failures and test assertion errors don’t appear here. Use #testFailure for those. 

Example: 

#Problems #Files helper/webAction.ts 
The Problems panel is showing type errors in this file. 
Explain each error and suggest a fix without breaking the existing method signatures. 

#testFailure 

Pulls in the details of a failing test from the VS Code test panel – test name, error message, assertion output, and stack trace. 

#testFailure #Files fixtures/fixtures.ts 
The fixture-level login is failing on the expect() after getPageTitleFromBreadcrumbText(). 
What is causing this and how do I fix it? 
#testFailure #Files helper/webAction.ts 
This failure is coming from inside selectDropDownOption(). 
Is this a timing issue or a selector problem? 
What is the most reliable fix? 

Benefit: Combines the actual failure output with the source file in one prompt. No copy-pasting stack traces. 

Limitation: Requires tests to be run through VS Code’s built-in test runner panel. If you run via terminal only, use #terminalLastCommand instead. 

Example: 

#testFailure #Files fixtures/fixtures.ts
The fixture-level login is failing on the expect() after getPageTitleFromBreadcrumbText(). 
What is causing this and how do I fix it?

#findTestFiles 

Locates test files within a specified scope or directory. Useful when you want Copilot to work across your test suite without manually listing every file. 

#findTestFiles 
Identify all test files in this project. 
Which Page Objects have test coverage and which ones have no tests written yet?
#findTestFiles #Folders pageObjects/ 
Cross-reference the Page Objects in this folder against the existing test files. 
List every public method that has no test covering it. 

Benefit: Saves time when auditing coverage across a large test suite. 

Limitation: Results depend on workspace indexing quality. Works best with Remote or Local index enabled. 

Example:  
#findTestFiles #Folders pageObjects/ 
Cross-reference the Page Objects in this folder against the existing test files. 
List every public method in loginPo.ts, adminPo.ts and pimPo.ts that has no test covering it. 

Terminal & Runtime Contexts – #terminalLastCommand, #Terminal command output 

#terminalLastCommand 

Includes the output from your most recent terminal command – stdout and stderr – as context. 

#terminalLastCommand 
Three tests failed in this Playwright run. 
Based on the error output, what is the root cause and which files should I check first? 
#terminalLastCommand 
The npm test command threw an error I haven't seen before. 
What does this mean and how do I fix it? 

Benefit: Real error data, not your description of it. Copilot reads the actual output and answers against that. 

Limitation: Captures the last command only. If you run another command after the failure, that output replaces the failure log. Always reference immediately after the failed run. 

Example: 

npx playwright test --reporter=allure-playwright 

Then immediately in Copilot Chat: 
#terminalLastCommand 
Some tests failed in this Playwright run. 
Based on the error output, which Page Object is most likely causing the failure and what should I fix first? 

#Terminal command output 

Similar to #terminalLastCommand but references the broader terminal session output rather than just the last command. Useful when a failure spans multiple commands or when CI log output has been pasted into the terminal. 

#Terminal command output 
Our CI pipeline failed during the Playwright test run. 
I have pasted the full log output into the terminal. 
What is causing these failures and what should I fix first? 

Benefit: Works well for longer failure logs where the root cause isn’t in the last command alone. 

Example: 

npm test 
Then paste the full CI failure log into your VS Code terminal. Then in Copilot Chat: 
#Terminal command output 
Our Playwright test suite failed during the CI run. 
The full log is in the terminal output. 
Which tests failed, what is the common root cause and which file in our pageObjects folder should I check first? 

External Data Contexts – #fetch, #searchResults, #vscodeAPI 

#fetch 

Retrieves content from a live public URL and uses it as context. Covered in depth earlier in this guide – the key use case for QA is pulling current framework documentation before generating tests. 

#fetch https://playwright.dev/docs/test-assertions 
Our loginPo.ts uses getErrorMsgFromToast() to return the error text from "div.oxd-alert--error p". 
Write assertions using the current Playwright API for: 
- Toast message equals "Invalid credentials" 
- Login form visible before submit 
- Page title equals "Dashboard" after successful login 

#searchResults 

Uses results from a VS Code file, symbol, or global text search as context. Run a search in VS Code first, then reference the results in your prompt. 

#searchResults 
I searched for "waitForTimeout" across the project. 
Review every result and suggest a more stable alternative for each one. 
#searchResults 
I searched for "expect(" across all test files. 
Are we using assertions consistently or are there files mixing different assertion styles? 

Benefit: Lets you ask questions about a specific pattern across the whole project without listing every file manually. 

Limitation: The search must be run in VS Code before prompting. Copilot reads whatever results are currently showing in the search panel. 

Example:  

First in VS Code press Ctrl + Shift + F and search for: ‘waitForTimeout’ 

Then in Copilot Chat: 

#searchResults 
I searched for "waitForTimeout" across the project. 
This is used in webAction.ts inside selectDropDownOption and selectDropDownOptionRandomly scroll loops. 
Suggest a more stable alternative for each one that doesn't rely on fixed timing. 

#vscodeAPI 

Includes relevant VS Code API documentation as context. Primarily useful for extension developers – less common for QA day-to-day work, but useful when configuring the test runner or writing custom VS Code tooling for your team. 

#vscodeAPI 
How do I use the VS Code testing API to programmatically run only the failing tests from the last test run? 

Example: 

#vscodeAPI 
How do I configure VS Code to automatically run our Playwright tests in this project whenever a file in the pageObjects folder is saved? 

Slash commands quick reference – /new, /clear, /rename 

Slash commands manage your Copilot Chat sessions. They are not context tags – they control the conversation itself. 

/new – Starts a completely fresh conversation with no prior context. Use this every time you switch to a different task. Without it, context from a previous conversation bleeds into the new one and Copilot’s answers drift. 

/new 

Always run /new when moving from debugging a login test to writing PIM tests. Clean context means sharper answers. 

Example: 

After finishing login test debugging, before starting PIM work: 

/new 

Then immediately: 

#Files pageObjects/pimPo.ts 
#Files dataFactory/pim/employeeData.ts 
I am starting fresh on PIM test coverage. 
What test scenarios should I write for createNewEmployee() based on these files? 

/clear – Clears the current conversation and resets context without starting a new named session. 

/clear 

Use this mid-task when a conversation has gone in the wrong direction and you want to reset without losing the session name. 

/rename – Renames the current chat session. Useful when you run multiple Copilot sessions for different modules and want to keep them organised. 

/rename Login Flow Debugging 
/rename PIM Page Object Review 
/rename AdminPo Coverage Audit 

Why this matters for QA teams: If you work across multiple modules in a day – login, admin, recruitment – keeping sessions named and isolated means Copilot’s context stays relevant to the task at hand. One session per module is a good habit. 


Power combinations – using multiple context tags in a single prompt 

Single context tags are useful. Stacking them is where Copilot becomes genuinely powerful for QA work. 

The principle is simple: give Copilot everything it needs to answer properly in one prompt rather than asking follow-up questions. The more relevant context it has upfront, the more targeted the response. 

Debugging a fixture failure with full context: 

#Files pageObjects/loginPo.ts 
#Files fixtures/fixtures.ts 
#testFailure 
The fixture-level login is failing intermittently on CI. 
The error is coming from either login() or getPageTitleFromBreadcrumbText(). 
What is the most likely cause and how do I fix it?

Coverage audit across a full module: 

#findTestFiles 
#Folders pageObjects/ 
Which public methods across all Page Objects have no corresponding test coverage? 
Group the results by file and prioritise by risk. 

Generating tests from live docs + real code: 

#fetch https://playwright.dev/docs/test-assertions 
#Files pageObjects/adminPo.ts 
#Files dataFactory/admin/systemUserData.ts 
Generate a complete test file for createNewSystemUser(). 
Cover happy path, duplicate username, missing mandatory fields, and invalid employee names. 
Use the current Playwright assertion API. 

Finding and fixing flakiness across the project: 

#Codebase 
#terminalLastCommand 
Three tests are failing intermittently in CI. 
The terminal shows the error output from the last run. 
Search the codebase for waitForTimeout usage and cross-reference with the failing tests. 
What is causing the flakiness and what should I change? 

Full pre-release test review: 

#findTestFiles 
#Folders pageObjects/ 
#fetch https://playwright.dev/docs/best-practices 
Compare our current test structure against 
Playwright's best practices. 
What are the top three improvements we should make before the next release? 

Example: 

#Files pageObjects/loginPo.ts 
#Files fixtures/fixtures.ts 
#testFailure 
#terminalLastCommand 
The fixture-level login is failing intermittently on CI but passing locally. 
The error is coming from either login() or getPageTitleFromBreadcrumbText() in loginPo.ts. 
Cross-reference the test failure details and terminal output to identify the root cause and suggest the fix.

Using GitHub Copilot as a QA Co-Buddy – Core Capabilities

Test case generation from requirements and user stories

Most test case generation starts the same way – someone hands you a ticket, a user story, or a feature description and expects a test plan back. The thinking behind that plan still requires a QA engineer. But the initial list-building, the scenario drafting, the translation from requirement to test – that part Copilot handles well. 

The key is giving it something concrete to read. A vague prompt returns vague test cases. Point it at the actual requirement text and the relevant implementation together, and the output reflects both. 

From a user story: 

The following user story was given to our team: 
"As an HR admin, I want to add a new system user so that they can access OrangeHRM with assigned roles." #Files pageObjects/adminPo.ts 

#Files dataFactory/admin/systemUserData.ts 
Generate test cases for this user story based on how the feature is actually implemented. 
Cover happy path, validation errors, duplicate username, and role assignment scenario.

From ticket requirements directly: 

#Files pageObjects/pimPo.ts 
New requirement: Employee ID must be unique and cannot exceed 10 characters. 
What test cases should I add to cover this requirement based on the current createNewEmployee() implementation? 

Code explanation for testers unfamiliar with the implementation

Joining a project mid-sprint and being handed a 400-line Page Object with no documentation is a common QA experience. Before you can test something, you need to understand it – and reading through unfamiliar TypeScript cold takes time. 

This is one of the most underrated uses of Copilot for QA engineers. Not generating tests, just understanding code fast enough to test it properly. 

#Files fixtures/fixtures.ts 
I have never worked with Playwright fixtures before. 
Explain how this fixtures.ts file works and why the page fixture runs the login flow before every test automatically. 
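For reference, the pattern a fixtures.ts like this usually follows is an override of the built-in page fixture – a minimal sketch, with class names and credentials assumed rather than taken from the real file: 

import { test as base, expect } from '@playwright/test';
import { LoginPage } from '../pageObjects/loginPo';

export const test = base.extend({
  page: async ({ page }, use) => {
    const loginPage = new LoginPage(page);
    await page.goto('/');
    await loginPage.login('Admin', 'admin123'); // runs before every test
    await expect(page.getByText('Dashboard')).toBeVisible();
    await use(page); // the test body executes here, already authenticated
  },
});

export { expect };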

Writing and reviewing automation scripts – Playwright, Cypress, Selenium, pytest 

Copilot generates automation scripts that match your existing project conventions when you give it the right context. The output isn’t always perfect – but it gets you to a working first draft faster than starting from scratch. 

Generating a new Playwright test: 

#Files pageObjects/loginPo.ts 
#Files fixtures/fixtures.ts 
#fetch https://playwright.dev/docs/test-assertions 

Write a Playwright test file for the login flow covering: 
- Valid credentials redirect to Dashboard 
- Invalid credentials show "Invalid credentials" toast 
- Empty username and password show validation errors 
Use our existing fixture setup and LoginPo methods. 
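The generated file typically lands close to this shape – a sketch, since the real output follows whatever is in your loginPo.ts (the toast selector here is the one used elsewhere in this guide): 

import { test, expect } from '@playwright/test';
import { LoginPage } from '../pageObjects/loginPo';

test('invalid credentials show the error toast', async ({ page }) => {
  const loginPage = new LoginPage(page);
  await page.goto('/');
  await loginPage.login('Admin', 'wrong-password');
  await expect(page.locator('div.oxd-alert--error p')).toHaveText('Invalid credentials');
});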

Reviewing an existing script for quality: 

#Files pageObjects/adminPo.ts 
Review the createNewSystemUser() method. 
Are the inline expect() assertions good practice or should they move to the test file? What would you change to make this more maintainable? 

Identifying brittle selectors before they break: 

#Files pageObjects/pimPo.ts 
Review all locators in this Page Object. 
Which ones are CSS class-based and likely to break after a UI update? 
Suggest more stable alternatives for each. 

API testing assistance – generating request payloads, validating responses

OrangeHRM runs on a REST API underneath the UI. Copilot can help generate API-level tests that run faster than E2E tests and catch issues before they surface in the browser. 

Generating an API test for login: 

#fetch https://playwright.dev/docs/api/class-apirequestcontext 
Our loginPo.ts tests login through the UI using input[name='username'] and input[name='password']. 
Write a Playwright API test that hits the OrangeHRM authentication endpoint directly, validates the response status is 200, and checks the token is returned in the response body. 
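What comes back is usually in this shape – a sketch only, since the endpoint path, payload and response contract here are assumptions to verify against your instance: 

import { test, expect } from '@playwright/test';

test('login endpoint returns 200 and a token', async ({ request }) => {
  const response = await request.post('/api/auth/login', {
    data: { username: 'Admin', password: 'admin123' },
  });
  expect(response.status()).toBe(200);
  const body = await response.json();
  expect(body.token).toBeTruthy();
});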

Validating API response structure: 

#Files dataObject/login/login.ts 
#Files dataFactory/login/loginData.ts 
We want to add API-level validation alongside our UI tests. 
Based on the Login interface and LoginData factory, generate an API test that validates the login response structure matches our expected data object. 

Comparing UI and API behaviour: 

#Files pageObjects/adminPo.ts 
The createNewSystemUser() method tests this flow through the UI. 
Write a parallel API test that creates the same system user via the OrangeHRM API and validates the response. 
This will help us identify if failures are UI or API-level. 

Debugging support using #testFailure and #Terminal command output 

When something breaks, the instinct is to start reading. Copilot lets you start with a diagnosis instead. 

The combination of #testFailure and the relevant source file gives Copilot both the error and the code it came from. Most of the time that’s enough to narrow the problem down significantly. 

Fixture-level failure: 

#testFailure 
#Files fixtures/fixtures.ts 
#Files pageObjects/loginPo.ts 
The base fixture is failing before any test runs. 
The error is in the login flow or the Dashboard title assertion. 
What is the most likely cause and how do I fix it?

Flaky test in CI: 

#testFailure 
#terminalLastCommand 
#Files helper/webAction.ts 
This test passes locally but fails intermittently on CI. 
The failure is inside selectDropDownOption(). 
Is this a timing issue or a selector problem? 
What is the fix? 

Unexpected selector failure: 

#testFailure 
#Files pageObjects/adminPo.ts 
The test is failing on populateMandatoryFieldForNewSystemUser(). 
One of the dropdowns is not being found. 
What could cause this and what should I check first? 

Identifying edge cases and boundary conditions from code context 

Edge cases are where most bugs live. They are also what gets skipped when a sprint is running out of time. Copilot can surface them quickly from the implementation itself – you don’t have to think of every scenario from scratch. 

Finding edge cases in a form: 

#Files pageObjects/adminPo.ts 
#Files dataObject/admin/systemUser.ts 
Looking at the Add User form implementation, what edge cases and boundary conditions am I probably not testing? 
List them by field – username, password, employee name, user role, and status.

Boundary conditions in data factory: 

#Files dataFactory/pim/employeeData.ts 
#Files dataObject/pim/employee.ts 
What boundary conditions should I test for the Employee data object? 
Think about field length limits, special characters, numeric employee IDs, and empty optional fields. 

Date picker edge cases: 

#Files pageObjects/basePo.ts 
The selectDate() method handles date picker interactions. 
What edge cases should I test? 
Think about leap years, end of month boundaries, past dates, future dates, and invalid date formats. 

PR review assistance – catching regressions and missing test coverage 

Before a pull request merges, a quick Copilot review takes two minutes and regularly catches things that slip through manual review – especially on busy sprints where nobody has time to read every changed file carefully. 

Pre-merge coverage check: 

#Files pageObjects/pimPo.ts 
This file was modified in the latest PR. 
Two new methods were added: searchForEmployeeUsingId() and isEmployeeListContainsResultAsPerSearchCriteria(). 
Do we have tests covering these methods? 
What test cases should be added before this merges? 

Regression risk assessment: 

#Files helper/webAction.ts 
#Files pageObjects/basePo.ts 
These two files were changed in the latest PR. 
Which existing tests are most likely to break because of these changes? 
What should I run first to check for regressions? 

Full PR test review: 

#Files pageObjects/adminPo.ts 
#Files pageObjects/loginPo.ts 
#Codebase 
Both these Page Objects were updated in the latest PR. 
Review the changes and identify: 
1. Any missing test coverage for new methods 
2. Any existing tests that may now be incorrect 
3. Any regression risks we should check before merging

Writing Effective GitHub Copilot Prompts for QA Engineers – Templates & Best Practices 

Why Prompt Quality Matters 

Two engineers on the same project, using the same tool, can get completely different results from Copilot. The code is identical. The question is similar. But one prompt returns something usable and the other returns something generic that needs rewriting from scratch. 

The difference is almost always the prompt. 

Copilot is not a search engine. It doesn’t retrieve answers from a database – it generates them based on what it was given. A vague prompt with no context produces a response built on assumptions. A specific prompt with the right files attached produces a response built on your actual code. 

For QA engineers this matters more than it does for developers. Developers often ask Copilot to build something new – and a reasonable first draft is good enough to iterate on. QA engineers need accurate answers about existing code – the wrong locator, the wrong assertion pattern, or a missed edge case doesn’t show up until something breaks in production. 

Getting the prompt right is not about being clever. It is about being specific.

Providing proper context – always attach the right # tags 

Before typing a question, ask yourself: what does Copilot need to read to answer this properly? 

That question drives everything. The answer tells you which tags to attach. 

  • If your question is about a specific file: Use #Files and name the file directly. 
  • If your question spans the whole project: Use #Codebase and let workspace indexing find the relevant code. 
  • If a test is failing: Use #testFailure and stack it with the source file using #Files. 
  • If you just ran a test and got errors: Use #terminalLastCommand immediately after the run. 
  • If you want framework-accurate code: Use #fetch with the relevant documentation URL before asking Copilot to generate anything. 
  • If your question is about one specific method: Use #Symbols to keep the context tight rather than loading the entire file. 

A useful habit: before hitting send, check whether Copilot has enough to answer accurately. If the answer is no, add one more tag. The extra ten seconds almost always produces a noticeably better response. 

Prompt templates for test generation, bug investigation, and code review 

These are ready-to-use templates built around your actual project files. Copy, adjust the file names, and use them directly in Copilot Chat. 

Template 1 – Generate tests for a new feature

#Files pageObjects/[PageObjectFile].ts 
#Files dataFactory/[module]/[dataFile].ts 
#fetch https://playwright.dev/docs/test-assertions 
New feature: [describe what was added or changed]
 
Generate a Playwright test file that covers:  
- Happy path 
- Validation errors on mandatory fields 
- Boundary conditions on [specific field] 
- Error handling when [specific failure scenario] 

Use our existing fixture setup and Page Object methods. 
Match the coding style already in the project.
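
For reference, output from this template usually lands somewhere near the sketch below. Every name in it – the import path, the page object class, the helper methods, the locator properties – is a hypothetical stand-in; Copilot will substitute your project's actual equivalents:

import { test, expect } from '@playwright/test';
import { AdminPage } from '../pageObjects/adminPo'; // hypothetical import path

test.describe('Add User form', () => {
  test('creates a user with valid data (happy path)', async ({ page }) => {
    const admin = new AdminPage(page);
    await admin.navigateToAddUser(); // assumed helper method
    await admin.populateMandatoryFieldForNewSystemUser();
    await expect(admin.successToast).toBeVisible(); // assumed locator property
  });

  test('shows validation errors when mandatory fields are empty', async ({ page }) => {
    const admin = new AdminPage(page);
    await admin.navigateToAddUser();
    await admin.submitForm(); // assumed helper method
    await expect(admin.requiredFieldErrors).toHaveCount(4); // one per mandatory field
  });
});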

Template 2 – Investigate a failing test 

#testFailure 
#Files pageObjects/[PageObjectFile].ts 
#terminalLastCommand 
This test is failing [intermittently / consistently] on CI. 
The failure appears to be in [method name or step]. 
Based on the failure details and terminal output: 
1. What is the most likely root cause? 
2. Is this a timing issue, selector problem, or logic error? 
3. What is the recommended fix? 
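
The most common outcome of this template is a timing fix. As a hedged illustration, this is the kind of change Copilot tends to recommend – replacing a fixed wait with Playwright's auto-waiting assertions (the route and selector are stand-ins):

import { test, expect } from '@playwright/test';

test('selects a user role without fixed waits', async ({ page }) => {
  await page.goto('/admin/saveSystemUser'); // hypothetical route

  // Flaky pattern Copilot often flags:
  //   await page.waitForTimeout(3000);
  //   await page.click('.oxd-select-text');

  // Recommended fix: rely on auto-waiting assertions instead
  const roleDropdown = page.locator('.oxd-select-text').first(); // stand-in selector
  await expect(roleDropdown).toBeVisible();
  await roleDropdown.click();
});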

Template 3 – Code review before PR merges 

#Files pageObjects/[ChangedFile].ts 
#Codebase 
This file was modified in the latest PR. 

The following methods were added or changed: 

- [method 1] 
- [method 2]

Review these changes and identify: 
1. Missing test coverage for new methods 
2. Existing tests that may now fail 
3. Regression risks to check before merging 
4. Any locators that look brittle or unstable 

Combining manual QA expertise with AI suggestions 

Copilot generates. QA engineers decide. 

That split matters and it is worth being deliberate about it. There are things Copilot does well and things it cannot do at all – and the engineers who get the most out of it are the ones who know the difference clearly. 

Copilot is good at reading code and generating against it. It spots structural gaps, recognises common patterns, and produces boilerplate quickly. It has seen enough test code across millions of repositories to know what a reasonable test looks like. 

What it doesn’t have is your context. It doesn’t know that the payment flow has a known flaky behaviour on slow networks. It doesn’t know that the date picker behaves differently in certain browsers. It doesn’t know that a particular field has a business rule that isn’t reflected in the code. It doesn’t know what your team tried last quarter and decided against. 

That knowledge lives with you. Copilot’s output is the starting point. Your review is what makes it production-ready. 

A practical workflow that works well: 

  1. Use Copilot to generate the first draft of test cases or a test file 
  2. Read through the output and mark what’s accurate, what’s missing, and what’s wrong 
  3. Ask Copilot to fill the gaps you identified: “You missed the case where the employee name field is left empty – add that scenario” 
  4. Review the final output against your acceptance criteria before committing 

The review step is not optional. It is where the QA expertise applies. 

Do’s and don’ts – when to trust Copilot and when to override it 

Do’s 

  • Do verify every generated selector before running tests. Copilot generates selectors based on what it reads in your Page Object files. If a selector looks unfamiliar or uses a CSS class that seems unstable, check it against the live application before committing. 
  • Do use #fetch when generating framework-specific code. Without it, Copilot may use a deprecated API or an older pattern. Fetching the current docs takes ten seconds and produces noticeably more accurate output. 
  • Do treat generated test files as first drafts. Run them, see what passes and what fails, then ask Copilot to fix the failures. Iterating on a draft is faster than reviewing it line by line before running anything. 
  • Do ask Copilot to explain its own output. If a generated method does something you don’t fully understand, ask: “Explain what this assertion is doing and why”. Understanding the output is part of owning it. 
  • Do use /new between unrelated tasks. Stale context from a previous conversation degrades the quality of answers on a new topic. Fresh session, fresh answers. 

Don’ts 

  • Don’t accept locators without checking them. Copilot will sometimes generate selectors that look reasonable but don’t match the actual DOM. Always verify in the browser or against the Page Object before running. 
  • Don’t skip the review because the output looks clean. Well-formatted code is not the same as correct code. A generated test that runs without errors but doesn’t actually validate the right thing is worse than no test – it gives false confidence. 
  • Don’t paste credentials, PII, or internal API keys into prompts. Copilot Chat sends your prompt to external servers. Anything sensitive in the context – environment files, config files with real credentials, personal data in test fixtures – should not be referenced. Use placeholder values in data factories instead. 
  • Don’t use Copilot’s output as the final word on test strategy. It can suggest scenarios but it doesn’t know your product, your users, or your risk areas. The test strategy decisions – what to automate, what to test manually, what to prioritise – belong with the QA engineer. 
  • Don’t ignore the inline suggestion when it looks wrong. Press Escape and keep typing. Accepting a suggestion you’re not sure about because it looks close enough is how incorrect tests get committed. If you’re not sure, ask Copilot Chat to explain it before accepting. 

Limitations and Considerations for QA Engineers 

AI hallucinations – why Copilot can generate plausible but wrong tests 

The most dangerous output Copilot produces is not obviously broken code. It is code that looks correct, runs without errors, and silently tests the wrong thing. 

Copilot generates based on patterns, not understanding. It has seen enough Playwright tests to fake a valid one. What it can’t do is know your app. A 200 from your login endpoint doesn’t always mean success. The Dashboard breadcrumb sometimes renders before the page data arrives. Your OrangeHRM instance has custom validation rules nobody else has. 

Copilot fills those gaps with guesses – and the guesses are usually plausible. That’s the problem. 

Wrong selectors that look right 

Copilot might generate button.oxd-button--medium for the login button. That selector exists in the DOM. It also matches three other buttons on the same page. The test passes in isolation, then fails randomly in a full suite run because it clicked something else. 

Check every selector against the live app before committing. Your loginPo.ts already defines button[type='submit'] – if Copilot generates something different, question it. 
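
One habit that helps during that check: prefer role-based locators over class-based ones. A minimal sketch of the difference, with an assertion that fails fast if the locator is ambiguous (the route and button name are assumptions):

import { test, expect } from '@playwright/test';

test('login button locator is unambiguous', async ({ page }) => {
  await page.goto('/auth/login'); // hypothetical route

  // 'button.oxd-button--medium' can match several buttons;
  // tying the locator to the accessible role and visible name cannot
  const loginButton = page.getByRole('button', { name: 'Login' });
  await expect(loginButton).toHaveCount(1); // fails immediately if ambiguous
  await loginButton.click();
});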

Assertions that don’t actually validate anything 

await expect(page).toHaveURL('/dashboard'); 

URL changed. That's all this confirms. The dashboard could still be loading, the user's data might not be there, the page could be showing an error. The test goes green either way.

Push Copilot on it: 
#Files pageObjects/loginPo.ts 
The generated test only checks the URL after login. What additional assertions should I add to verify the dashboard actually loaded correctly? 
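
What that typically produces is something in this direction – a minimal sketch, with the locators assumed rather than taken from your real dashboard:

import { test, expect } from '@playwright/test';

test('dashboard actually loads after login', async ({ page }) => {
  await page.goto('/auth/login'); // hypothetical route
  await page.getByPlaceholder('Username').fill('Admin');    // demo credentials only –
  await page.getByPlaceholder('Password').fill('admin123'); // never real secrets
  await page.getByRole('button', { name: 'Login' }).click();

  // URL alone is weak evidence – also assert that real content rendered
  await expect(page).toHaveURL(/dashboard/);
  await expect(page.getByRole('heading', { name: 'Dashboard' })).toBeVisible();
});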

Tests that pass without testing anything 

Occasionally Copilot generates a test that navigates to a page and makes no assertions at all. It runs, it passes, it looks fine in CI. Nothing was actually tested. Before committing any generated test, check that it asserts something meaningful – not just that a page exists. 
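
The pattern to watch for looks deceptively tidy – a sketch (the route is a stand-in):

import { test, expect } from '@playwright/test';

// Runs, passes, verifies nothing – there is no assertion at all
test('employee list page', async ({ page }) => {
  await page.goto('/pim/viewEmployeeList');
});

// The minimum fix: assert something about the page's actual state
test('employee list page shows results', async ({ page }) => {
  await page.goto('/pim/viewEmployeeList');
  await expect(page.getByRole('table')).toBeVisible();
});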

Outdated API usage 

Copilot’s training data has a cutoff. It can generate code using Playwright APIs that were deprecated in a recent release and have no idea that happened. This is why #fetch with the current documentation URL matters – it forces Copilot to work from what’s current rather than what it learned two years ago. 
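
One concrete case: Playwright deprecated page.waitForNavigation() in favour of page.waitForURL(), and Copilot can still emit the older call. A hedged sketch of the swap (the route is an assumption):

import { test } from '@playwright/test';

test('lands on the dashboard after login', async ({ page }) => {
  await page.goto('/auth/login'); // hypothetical route
  await page.getByRole('button', { name: 'Login' }).click();

  // Deprecated pattern Copilot may still generate:
  //   await page.waitForNavigation();

  // Current API – waits for a specific URL instead of any navigation
  await page.waitForURL(/dashboard/);
});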

Security concerns – what not to share as context 

Every prompt you send to Copilot Chat goes to external servers. Whatever you include as context leaves your machine. 

For most QA work that’s fine. Page Objects, test files, helper classes – none of that is sensitive. But QA engineers also touch files that shouldn’t go anywhere outside the org. 

Environment files with real credentials 

Your .env file has database passwords, API keys, and service account credentials sitting in it. Don’t use #Files .env or paste actual environment variable values into a prompt. Instead, describe what the file does and ask Copilot to write code that reads from environment variables – no actual values needed. 
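
A minimal sketch of that approach – the variable names are placeholders; the point is that no secret value ever appears in the code or the prompt:

// config/credentials.ts (hypothetical file) – reads secrets at runtime, never hardcodes them
import * as dotenv from 'dotenv';

dotenv.config(); // loads .env locally; CI injects the same variables as secrets

export const credentials = {
  username: process.env.ADMIN_USERNAME ?? '', // placeholder variable names
  password: process.env.ADMIN_PASSWORD ?? '',
};

if (!credentials.username || !credentials.password) {
  throw new Error('ADMIN_USERNAME and ADMIN_PASSWORD must be set');
}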

Real user data in test fixtures 

If your data factories have real names, real email addresses, or real employee IDs pulled from org systems, clean them up before referencing those files. The project already has @faker-js/faker in package.json – use it. Realistic fake data is the whole point of that library. 
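
A minimal sketch of a faker-backed factory, assuming @faker-js/faker v8+ – adjust the field names to match your actual Employee data object:

import { faker } from '@faker-js/faker';

// Generates realistic but entirely synthetic employee records –
// nothing here traces back to a real person
export function buildEmployee() {
  return {
    firstName: faker.person.firstName(),
    lastName: faker.person.lastName(),
    email: faker.internet.email(),
    employeeId: faker.number.int({ min: 1000, max: 9999 }).toString(),
  };
}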

Internal API endpoint structures 

Laying out your internal API architecture in a prompt – auth flows, internal service URLs, proprietary business logic – exposes that structure externally. Keep API prompts at the pattern level. Ask how to structure a request, not what your actual endpoints do. 
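
A pattern-level version of that prompt might look like this – note that no internal URL or proprietary logic appears anywhere in it:

#fetch https://playwright.dev/docs/api-testing 
Show me how to structure a POST request with a bearer token using Playwright's 
request fixture, and how to assert on the response status and body. 
Use placeholder URLs like https://api.example.com throughout.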

Compliance and regulated data 

If the app touches healthcare records, financial transactions, or anything personally identifiable, watch what test data ends up in prompts. Some regulatory environments flag synthetic data that mirrors real PII – better not to find out the hard way. 

Before hitting send, ask yourself: would this belong in a public GitHub repo? If the answer is no, leave it out of the prompt. 

The validation imperative – Copilot suggestions always need a QA human in the loop 

The appealing version of AI-assisted QA – describe a feature, press a button, get a complete test suite – doesn’t exist yet. 

What exists is a tool that produces a useful first draft, quickly and with enough accuracy to save real time. But a first draft is not a finished test. The gap between the two is where QA expertise lives. 

Every piece of Copilot output going into your test suite should clear three questions before it gets committed: 

Does it actually test what it claims to test? Read the assertions. Check that each one validates something meaningful about the feature’s behaviour – not just that a page loaded or a URL changed. 

Does it use the right selectors for your application? Open the browser, inspect the elements, confirm the selectors Copilot generated match the actual DOM. Your Page Objects already have locators defined – if Copilot generated something different, find out why before trusting it. 

Does it cover the scenarios that matter for your product? Copilot covers structural scenarios well – happy path, empty fields, duplicate entries. It does not know about business rules, known edge cases in your specific application, or the failure modes your team has hit before. Those scenarios come from you. 

Skipping this review because the output looks clean is the fastest way to build a test suite that gives false confidence. A test that runs and passes without validating the right thing is worse than no test – it actively misleads the team about the quality of the software. 

Model limitations – handling domain-specific or legacy codebases

Copilot performs well on standard patterns – Playwright tests, TypeScript classes, REST API interactions. It gets noticeably less accurate the further your codebase drifts from those patterns. 

Domain-specific business logic 

Copilot understands technical implementation but has no inherent knowledge of your domain rules – what makes a valid employee record, what the business consequences of a status change are, or what compliance requirements affect certain fields. Those rules live in your team’s heads, your product documentation, and your business requirements. Copilot cannot infer them from code alone. 

When generating tests for domain-heavy features, tell Copilot the business rules explicitly in plain text before asking it to generate anything. Describe the rule, describe the expected behaviour, then ask for the test. The more domain context you provide upfront, the more relevant the output. 
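
For example – the business rule here is illustrative, not taken from the real project:

#Files pageObjects/pimPo.ts 
Business rule: an employee ID must be unique, numeric, and between 4 and 10 digits. 
Duplicate IDs must be rejected with an inline validation error, not a toast. 
Generate test cases for the Add Employee flow that cover this rule, 
including the duplicate-ID rejection path.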

Legacy code and non-standard patterns 

Custom wait utilities, non-standard base classes, legacy frameworks sitting next to newer ones – Copilot doesn’t pick up on any of that automatically. It defaults to whatever modern pattern fits, which compiles fine but clashes with how the rest of your project is written. 

Show it an example first. Point it at a file that already follows the correct pattern and ask it to match that style. Referencing your actual WebActions or BasePo before generating new code tells Copilot what conventions your project uses – and it will follow them. 
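
In prompt form, that can be as simple as this (the Leave module is an illustrative target):

#Files helper/webAction.ts 
#Files pageObjects/basePo.ts 
Generate a new Page Object for the Leave module. 
Follow the wait utilities, locator style, and method naming conventions 
used in these two files – do not introduce new patterns.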

Large monorepos 

On very large projects where workspace indexing hasn’t fully completed, project-wide queries can return incomplete results. Copilot may miss relevant files, generate code that duplicates existing utilities, or suggest patterns already handled elsewhere. Manually referencing key files directly is more reliable than relying on full codebase search until indexing is confirmed complete.

Ethical use in regulated industries – healthcare, finance, compliance testing 

QA engineers in regulated industries carry extra responsibility when using AI tools. A test suite in healthcare or finance isn’t just a quality signal – it can show up in a compliance audit, factor into a safety decision, or serve as evidence that a regulatory requirement was actually met. 

Using Copilot in these contexts isn’t the problem. But it introduces questions that don’t come up in standard software development. 

AI-generated tests are not automatically audit-ready 

Committing a Copilot-generated test without thorough review won’t meet the documentation standards most regulated environments require. Traceability – proving that a specific test covers a specific requirement – needs a human to own the intent behind the test, not just the code. Copilot can help write the code faster. Someone who actually understands the requirement still has to document why the test exists. 

Validation requirements differ 

FDA regulations, ISO standards, financial compliance frameworks – all of them require documented evidence that tests work as intended. Copilot speeds up the technical side. The review, the sign-off, the traceability record – those still need a human who owns the regulatory responsibility. 

Data handling in test environments 

Regulated industries have strict rules about where data gets processed and by whom. HIPAA, GDPR, PCI-DSS – if any of those apply to your test environment, check your organisation’s policy on AI tool usage before any of those files end up in a prompt. Synthetic data only, and run it by your compliance team if you’re unsure. 

Copilot speeds up the technical work. The review, the documentation, the regulatory calls – those stay with the humans responsible for them. 

Conclusion 

By now, you’ve seen how Copilot can fit into a QA engineer’s day-to-day work – from understanding code and generating tests to debugging failures and reviewing coverage. 

But the real value doesn’t come from using Copilot blindly. It comes from using it deliberately.

When you combine: 

  • the right context  
  • clear, specific prompts  
  • and your own QA judgment  

…you end up with a workflow that is faster, more consistent, and far less exhausting. 

Copilot won’t replace your expertise. It simply takes care of the repetitive parts, so you can focus on what matters: deciding what to test, how deeply to test it, and which risks matter most for your product. 

Used this way, Copilot stops being just a tool. It becomes a reliable co-buddy – always available, quick to respond, and helpful when you need a second perspective. 

The key is straightforward: 
Give it the right context. 
Question what it produces. 
And keep the final decisions in your hands. 

Witness how our meticulous approach and cutting-edge solutions elevate quality and performance to new heights. Begin your journey into software testing excellence – to learn more, see our Tools & Technologies and QA Services pages.

If you would like to learn more about the awesome services we provide, be sure to reach out.

Happy Testing 🙂