Affinity diagrams don't scale
March 08, 2018
Affinity diagramming is a well-known and appreciated method for making sense of data. But depending on where and how you collect that data, it may not be the best way to deal with it. For usability tests in particular, there is a much better way, and that's what this blog post is about.
What's affinity diagramming?
The method works like this: You invite a number of people (e.g. your fellow user researchers) into a room with a large, empty wall. Then, you...
- Write observations on post-its
- Put the post-its on the wall
- Group similar items
- Name each group
- Vote on the most important groups
What's a usability test?
A usability test starts when a team has designed an interface (a prototype or a real one) and wants to test it with users who haven't seen it before.
In this case, you...
- Invite those users into a lab
- Give them a device that has the interface to be tested
- Ask them to complete a task that serves one of their goals
- Observe what happens when they try to accomplish the task
- Record your observations in some way to make sense of them later
What happens when you use one method with the other
A typical setup is to run the usability test in one room and stream a camera recording to a big screen in another room, where all the user researchers and developers can watch what really happens.
So far, so good. Now for the critical mistake:
People write post-it notes with the observations they make while they watch the usability test. They think that they can sort the post-its later, using affinity diagramming.
Why is that a mistake?
The researcher writes a post-it every time the participant...
- understands the task but cannot complete it within a reasonable amount of time
- understands the goal but has to fiddle with several ways to reach it
- gives up and abandons the task
- completes a task but not the one that was specified
- expresses surprise or delight
- expresses frustration, confusion, or blames themselves for not being able to complete the task
- asserts that something went wrong or does not make sense
- makes a suggestion for a change in the interface
Now, let's say one testing session takes 60 minutes and you make one observation every two minutes. That makes 30 observations per session. With 7 participants, each observed by 6 researchers, that adds up to a whopping 30 × 7 × 6 = 1,260 notes!
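The back-of-the-envelope math above can be sketched in a few lines of Python. The numbers are the illustrative figures from the text, not measurements:

```python
# Rough estimate of how many post-it notes a round of usability testing
# produces. All numbers are the illustrative figures from the text above.
session_minutes = 60
minutes_per_observation = 2
participants = 7
researchers = 6

notes_per_session = session_minutes // minutes_per_observation  # 30
total_notes = notes_per_session * participants * researchers

print(total_notes)  # 1260
```

Even modest per-session numbers multiply quickly once every researcher takes notes on every participant.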
Can you imagine what the wall will look like during affinity diagramming?
Issues created by large affinity diagrams
It's a lot of work to group or cluster more than 1,000 post-it notes. Every researcher has to decipher handwriting, understand each issue, move post-its to different places, or draw large circles around groups of related items.
After a while, fatigue sets in. People will sort an item into a group where it does not really belong, simply because they are too tired to create one more group.
Other people will draw circles that overlap each other, effectively mis-categorizing items.
Some notes will be discarded because someone thinks they are too specific or duplicate another item. Another person might disagree, but by then it may already be too late to intervene.
Some items will be vague or abstract. Instead of a clear description, they carry a few words that are too general or even ambiguous to the other readers. These items will be difficult to categorize later.
In the evening, when the cleaning personnel arrive, they may hoover up a few post-its that have fallen to the floor. Bad luck! That data will be missing from the dataset.
Post-its will need to be transcribed into tables or some other electronic form to prepare them for automated analysis. Transcription requires a separate room with enough space for all the post-its. And transcribing more than 1,000 of them is daunting in itself!
There must be a better way to do this
If you've followed me this far, you might ask: "O.K., that's enough, what do you propose? Where's the solution?"
There is indeed a better way to do this. It works like this:
- Record what happens during the user test, i.e. create photos or videos. These are your "pieces of evidence".
- Upload the evidence into the "Meaning Maker" system, a part of Just Ask Users.
- Look at the evidence, e.g. a photo or a screenshot, and click on it, pointing at an interesting part of the picture.
- Meaning Maker creates an observation, linked to that point in the picture.
- Write a sentence or two about your observation and add one or more #hashtags to it (as on Twitter). Meaning Maker automatically groups observations that share a tag name.
- Researchers: meet for a debriefing! Discuss the observations that were collected in Meaning Maker beforehand, and add any further observations that come up during your discussion.
- Invoke the Meaning Maker theme builder to derive recurring themes from the tagged observations. Themes can be frustrations that the user had or parts of the interface that are not clearly understood.
- Decide which themes have high priority and should therefore be addressed in the upcoming design iteration(s).
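The tag-based grouping step described above can be sketched in a few lines of Python. This is a toy illustration of the idea, not Meaning Maker's actual implementation; the sample observations are made up:

```python
import re
from collections import defaultdict

# Toy observations, each with Twitter-style #hashtags added by a researcher.
observations = [
    "Participant could not find the save button #navigation #frustration",
    "User expected the logo to link back home #navigation",
    "Participant smiled at the confirmation animation #delight",
]

# Group observations by tag, the way Meaning Maker does automatically.
groups = defaultdict(list)
for obs in observations:
    for tag in re.findall(r"#(\w+)", obs):
        groups[tag].append(obs)

for tag, items in sorted(groups.items()):
    print(f"#{tag}: {len(items)} observation(s)")
```

Once observations are grouped by shared tags, recurring themes (such as #navigation appearing again and again) become visible without anyone shuffling post-its.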
Avoid the hassle
You have saved a lot of time and avoided the hassle of affinity diagramming. Instead, you have...
- archived all your observations in a trusted place,
- tagged them with names that make sense, and
- let the system auto-group the observations for you,
- so that recurring themes suddenly become visible.
Themes that you (semi-automatically) identify in your users' behavior will tell your team which design changes must be made. Your users will appreciate what you've done for them when the next version of the interface is released.
P.S. You can get started with Meaning Maker for free today. Sign up here to get your account:
Easy setup • Free trial with one multi-study project