The definition problem
Before you can measure progress toward good Copilot adoption, you need to define what good looks like. Most organisations have not done this. They measure seat activation (who has logged into Copilot at all) and sometimes monthly active users (who has opened Copilot in the past 30 days). Neither of these tells you whether Copilot is changing how people work.
Good Copilot usage is not defined by frequency of login. It is defined by the degree to which Copilot has become part of how someone approaches specific categories of work, not as a novelty to try, but as a tool they reach for automatically when a relevant task arises. That definition requires understanding what "automatic" looks like at different stages of a programme.
What week one looks like
At the start of a structured Copilot programme, good week-one usage is deliberate and prompted. A participant who has been given a specific challenge (draft a response to this email thread, summarise this document, create a table of data from this text) will attempt the challenge with varying degrees of success. They are using Copilot because the programme told them to. They are not yet using it spontaneously.
The outputs at week one are often unimpressive. Prompts are poorly constructed. Results require significant editing. The participant may conclude, incorrectly, that Copilot is not very useful. This is expected and normal: it is the first stage of any skill acquisition, and it mirrors what you would expect from any new tool in a real work context. Good week-one performance means attempting the challenge, submitting output, and reflecting on what worked and what did not. Nothing more.
What you are not looking for at week one: spontaneous use, unprompted experimentation, or impressive outputs from complex prompts. Those come later, and expecting them in week one is a reliable way to misread early programme data as evidence of failure.
What week nine looks like
By week nine of a well-run programme, the pattern is qualitatively different. The most meaningful signal is spontaneous use: times when a participant uses Copilot for a real task that was not part of the programme challenge, without being prompted to do so. This is the clearest evidence that a habit has formed.
Specific week-nine behaviours that indicate genuine adoption:
- Using Copilot to summarise a Teams meeting before reading the chat transcript
- Asking Copilot to draft a first version of a document before writing anything manually
- Using Copilot in Excel to analyse data rather than building formulas manually
- Mentioning Copilot prompts in team conversation, sharing what worked on a specific task
- Asking a colleague what prompt they used for a particular output
The last two are especially significant. When Copilot becomes a topic of conversation in settings that have nothing to do with the programme (team meetings, Slack channels, corridor conversations), it has crossed from a tool people use in isolation to a shared capability that the team owns collectively. That transition is what makes adoption durable rather than temporary.
The difference between prompted and spontaneous use
Microsoft's adoption data, published through the Work Trend Index and Viva Insights reports, primarily captures what might be called prompted use: times when a user opens Copilot deliberately within a Microsoft 365 application. This counts as active use in the dashboard. What it does not capture is the broader change in how people approach work: the tasks they no longer spend time on, the first drafts they no longer write from scratch, the meeting prep they no longer do manually.
This means Microsoft's own adoption figures systematically understate the real impact of good Copilot usage. A user who has genuinely changed their approach to writing, analysis, and meeting management appears in the dashboard as just another active user; the productivity gain is reflected in their output, not their click count. Survey data consistently shows that the self-reported time saving (typically 30–60 minutes per day for active, habituated users) is substantially higher than what dashboard metrics would suggest.
Good Copilot usage, properly defined, is only partly visible in any dashboard. The rest shows up in how people describe their working week, what they no longer bother doing manually, what they get done before lunch that used to take all day, what they now volunteer to produce because they know it will not consume them.
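If you do collect per-task data outside the dashboard (for example, from a short weekly check-in), the prompted/spontaneous distinction is straightforward to operationalise. Here is a minimal sketch in Python, assuming a hypothetical record format where each reported use is tagged with whether it was a programme challenge; the field names and sample data are illustrative, not part of any Microsoft API:

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical record for one reported Copilot use. "prompted" means the
# use was a programme challenge; anything else counts as spontaneous.
@dataclass
class UsageEvent:
    participant: str
    task: str
    prompted: bool

def spontaneous_use_rate(events):
    """Per-participant share of reported uses that were spontaneous."""
    counts = defaultdict(lambda: [0, 0])  # participant -> [spontaneous, total]
    for e in events:
        counts[e.participant][0] += not e.prompted  # bool counts as 0/1
        counts[e.participant][1] += 1
    return {p: s / t for p, (s, t) in counts.items()}

events = [
    UsageEvent("alice", "summarised a Teams meeting", prompted=False),
    UsageEvent("alice", "week 3 email challenge", prompted=True),
    UsageEvent("bob", "week 3 email challenge", prompted=True),
]
print(spontaneous_use_rate(events))  # {'alice': 0.5, 'bob': 0.0}
```

A rising spontaneous share across the nine weeks is the habit-formation signal the dashboard cannot show you.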
The complexity progression
Another dimension of good usage that is easy to miss: the complexity of the tasks Copilot is applied to should increase over time. Week-one usage is email drafting and document summarisation: high-frequency, low-risk tasks where the stakes of a mediocre output are low. Week-nine usage should include more complex applications: multi-step prompts, structured data analysis, meeting intelligence, and ideally some experimentation with Copilot agents for repetitive workflows.
Users who are still doing only week-one-level tasks at week nine have not progressed. They have formed a habit, but a shallow one, limited to a narrow range of tasks and unlikely to produce meaningful productivity change. Good adoption at week nine means a participant is applying Copilot to task types they would not have considered in week one, with prompt quality that produces genuinely useful outputs without extensive editing.
This complexity dimension is what separates a structured nine-week programme from either a training day or simple repeated exposure. If you are designing that programme from scratch, our guide on how to run a Microsoft 365 Copilot pilot covers cohort selection, structure, and how to present results. The progression is deliberate: week by week, the challenge type changes, the complexity increases, and the user's mental model of what Copilot can do expands to match. By week nine, they have a realistic, experience-based understanding of Copilot's capabilities that no training session can create.
Setting the target before you start
The most important thing about defining good Copilot usage is doing it before the programme starts, not after it ends. Decide in advance what success looks like: what is your target active adoption rate at week nine? What self-reported confidence score are you aiming for? What time-saving figure would you need to see to confirm that the investment is worthwhile? Our article on Copilot adoption metrics that actually matter sets out exactly which measures to use.
Without pre-defined success criteria, the programme ends and the data can be interpreted however is most convenient. With them, you have an honest assessment of whether the programme worked, and a baseline for the next cohort.
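To make "decide in advance" concrete, the criteria can be written down as data before the programme starts and checked mechanically against the week-nine survey. A minimal Python sketch follows; the thresholds, field names, and scales are all illustrative assumptions, not figures from any real programme:

```python
from dataclasses import dataclass

# Illustrative success criteria, agreed before the programme starts.
# Every threshold here is a placeholder; set your own in advance.
@dataclass
class SuccessCriteria:
    min_active_rate: float = 0.70        # share of cohort active at week nine
    min_confidence_delta: float = 2.0    # e.g. points on a 1-10 self-rating
    min_daily_minutes_saved: float = 30  # self-reported time saving

def assess(cohort, criteria):
    """Compare week-nine cohort averages against the pre-agreed targets.

    `cohort` is a list of per-participant dicts with hypothetical keys:
    active (bool), confidence_before/after (1-10), minutes_saved_per_day.
    """
    n = len(cohort)
    active_rate = sum(p["active"] for p in cohort) / n
    confidence_delta = sum(
        p["confidence_after"] - p["confidence_before"] for p in cohort
    ) / n
    minutes_saved = sum(p["minutes_saved_per_day"] for p in cohort) / n
    return {
        "active_rate": active_rate >= criteria.min_active_rate,
        "confidence_delta": confidence_delta >= criteria.min_confidence_delta,
        "time_saving": minutes_saved >= criteria.min_daily_minutes_saved,
    }

cohort = [
    {"active": True, "confidence_before": 4, "confidence_after": 7,
     "minutes_saved_per_day": 45},
    {"active": False, "confidence_before": 5, "confidence_after": 6,
     "minutes_saved_per_day": 10},
]
print(assess(cohort, SuccessCriteria()))
# {'active_rate': False, 'confidence_delta': True, 'time_saving': False}
```

The point is not the code but the discipline: once the targets exist as fixed values, the week-nine data can only pass or fail against them, and the convenient-interpretation problem disappears.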
The Copilot Bootcamp Kit includes a pre-programme baseline survey and a week-nine completion survey, so you can measure confidence delta, spontaneous use rate, and self-reported time saving across your cohort. Everything you need to show leadership what changed, and by how much.
See the kit