Friday, 23 September 2011

Evaluating usability: what happens in a user testing session?

In my last post I talked about the test plan for assessing the usability of the Pelagios 'graph explorer' for the project's (deep breath) 'non/semi-specialist adults with an interest in the ancient world' audience. Before I get into the details of what happens in a usability test session, I thought I'd introduce you to our design persona, Johanna.
Image credit: @ANDYwithCAMERA
Johanna is 21, and is a third-year History student. She moved from her native Germany to the UK three years ago for university. Her goal is to get a First so she has more options for future academic work, perhaps in the Classics. She's slightly swotty, and is always organised and methodical, but finds that she's easily distracted by Facebook and chat when she's working on the computer. She can often be found having coffee or in the pub with friends, at her part-time job in a clothing store, or in the library (her shared house is often noisy when she's trying to work). She dislikes distractions when she's trying to study, and hates rude customers at work. She likes her bike, RomComs and catching up with friends. Her favourite brands are Facebook, MacBook, Topshop, Spiegel Online and The Body Shop. Her most important personal belongings are her laptop, her mobile phone, and photos of friends and family from Hamburg and college.
Johanna is technically competent, and prefers to learn through trial and error rather than reading manuals or instructions. But she also has limited patience and will give up on interfaces that are too difficult. Johanna is a heavy user of social networks and also uses online research databases and library catalogues.
Johanna has an assignment on inscriptions due in a month. She hates the emphasis on big battles and big men in the subject, and finds inscriptions dry, but has been told they can also convey interesting social history and cultural values. She's not convinced (and she's not sure whether she'll be able to make much of the language of the inscriptions) so she wants to find an ancient place that also has other historical material about it to make the assignment more relevant to her own interests.
To create our persona and design the test tasks, I quizzed Elton on the types of questions people ask when they find out he's a Classicist to get a sense of common (mis)perceptions and interests, and about the types of students he's encountered.

So, onto the usability tests themselves. The time and venue for each test were organised directly with the participant. The only restrictions were that we had to be able to get online and be somewhere it was OK to talk aloud; ideally, we'd also meet somewhere the participant would feel comfortable.

In my last post I mentioned writing and testing some set tasks for the usability test, a short semi-structured interview, and an introductory script. Once the participant had arrived, and was settled with a cup of tea or whatever, I'd introduce myself and explain how I came to be working with the project. I've included the basic introductory script below so you can get a sense of how a test session starts:
Thank you for agreeing to help us test the usability of the current interface for Pelagios.
We'll be using these tests to produce a prioritised list of design and development tasks to improve the Pelagios visualisation for people like you.
The session will take up to an hour and will start with a short interview, then your initial impressions of the site, and finally we'll go through some typical tasks on the site. I'll ask you to 'think aloud' as you use the site - a running stream of thoughts about what you're seeing and how you think it works. I might also ask you questions to clarify or explore interesting things that come up during the session.
I want you to know that you're not being tested! We're testing the interface - anything that goes wrong is almost definitely its fault, not yours! Also, I haven't been involved in the project design, so you don't need to worry about hurting my feelings - be as direct as you like about what you're seeing.
I won't be recording this, but I will be taking notes as we go, and summarising them to pass on to the project team.
You can stop for a break or questions at any time.
Do you have any questions before we begin?
The next phase of the test session was the short interview. Again, I've included the questions below:
  • Demographic data: what is your age, gender, educational level, nationality/cultural background?
  • What websites do you use regularly (on a daily/weekly basis)?
  • What's your favourite website, and why?
  • What websites do you use in your research/daily work?
  • Have you seen sites like [Guardian, Gapminder, etc] that feature interactive visualisations?
  • How would you describe your level of experience with the classics? (e.g. a lot, a little). Do you focus on any particular area?
  • What is your definition of the classics? (Geographical, chronological scope)
Once the interview was over, and any questions that had arisen had been discussed, the test began. The first part of the test covered participants' first impressions: the 'look and feel' of the site, what they thought the site might be about, what content it would include, and what they thought the 'blobs' that are the first view of the graph visualisation represented. I was also observing the kinds of interactions participants tried with the visualisation, whether single or double mouse-clicks, dragging, right-clicking, etc, because I wanted to know how much of the functionality of the site was intuitively discoverable.

The first formal task was: "find all the resources related to Cyrene" [or a place related to their own interests]. I'd note the actions the participants took along with their comments as they 'thought aloud'. Sometimes I'd ask for more information about why they were doing certain things, or remind them to tell me about the options they were considering. I also noted the points where the participant expressed confusion or frustration, or gave up on a task, though I didn't time the tasks or record a quantitative count of errors.

After the task, I'd ask (if it hadn't already come up):
  • What do you think these resources are?
  • How do you think they relate to your actions?
  • What contextual information might you need to make sense of these resources?
These questions were based on the team's review of the site and were aimed at making sure we understood the participant's 'mental model' of the site. If there's a mismatch between the users' mental model and what your site actually does, you need to help users develop a more appropriate mental model.

The second task, "Are there links between [Place 1, Place 2]? If so, what are they and how many are there?" was more open-ended and designed to see how participants managed small result sets on the site. Again, I had questions prepared as prompts in case they hadn't already been answered during the task:
  • What do you think you're looking at here?
  • What does the screen tell you?
  • What do you think the links mean/are?
  • What do you think the movements on the screen mean?
  • How do you interpret the results?
  • How do you think they're selected?
Finally, I asked some questions aimed at giving the project some metrics to measure improvement in the usability of the site after design updates: 'would you use the site again?', 'how likely are you to recommend it to a friend?'. The final questions were: 'what would you suggest as first priority?' and 'any final comments?'.

After running each test, I'd tidy up my notes and summarise the key points for the team so they could prioritise the next items of design or development work. Which leads me onto my next post, which will include some preliminary results...
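For readers who keep their session notes electronically, here is a minimal sketch of one way to turn tidied notes into a prioritised list: tally how many participants ran into each issue and sort the most widespread first. Everything in it is an assumption for illustration only. The `sessions` structure, the participant labels, the issue wording and the `prioritise` function are all hypothetical; they are not the project's actual notes, results or method.

```python
from collections import Counter

# Hypothetical, made-up notes: one entry per test session,
# listing the issues observed for that participant.
sessions = [
    {"participant": "P1", "issues": ["unclear what blobs mean", "double-click not discovered"]},
    {"participant": "P2", "issues": ["unclear what blobs mean"]},
    {"participant": "P3", "issues": ["double-click not discovered",
                                     "unclear what blobs mean",
                                     "gave up on links task"]},
]


def prioritise(sessions):
    """Return (issue, number_of_participants) pairs, most widespread first."""
    counts = Counter()
    for session in sessions:
        # set() so an issue noted twice in one session counts once per participant
        counts.update(set(session["issues"]))
    # Sort by participant count (descending), then alphabetically for a stable order
    return sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))


for issue, n in prioritise(sessions):
    print(f"{n}/{len(sessions)} participants: {issue}")
```

A tally like this only shows how widespread an issue is, not how severe it is, so it would still need to be read alongside the notes on where participants got frustrated or gave up.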


  1. If the design persona is a fictional subject, are participants introduced to the persona, or have participants been selected because they are like Johanna?

  2. Hi Gregory,

    thanks for asking and giving me a chance to clarify...

    The persona and the participants are both representative of the target audience. The test participants are selected to be as close to the target audience as possible, so they might end up being similar to the design persona. The persona isn't used in tests, as it's more of a design tool to help inform decisions made when access to representative users isn't feasible. The test and design processes don't overlap that much, so test participants wouldn't generally know much about any project personas.

    I hope that's helped clear things up!

    Cheers, Mia