Usability Evaluation | part 3

Usability Testing Methods

Do-It-Yourself Usability Testing

If a company does not want to spend the time and money on a moderated or unmoderated usability test, it can also run a do-it-yourself (DIY) usability test. Although the data from this method cannot be compared with the data from its bigger siblings, doing it yourself is still better than doing no testing at all.

The following table lists the advantages and disadvantages of do-it-yourself tests compared to moderated and unmoderated tests (Steve Krug calls them the big honkin’ tests).

source: Rocket Surgery Made Easy, Steve Krug, page 25f

For even more details about this method, I recommend watching Steve Krug’s workshop.

Card Sorting

Another very popular testing method, primarily used for evaluating the information architecture of a system, is card sorting. It helps to organize the content of a site so that it matches the way the actual users think. That is also why it is so important to recruit actual users of the system for card sorting.

Card sorting can be done either in person, on a large table or a magnetic wall, or remotely with dedicated online tools. These tools also let you ask follow-up questions, which adds qualitative data to the quantitative data from the card sort. Either way, card sorting is a very quick usability testing method.

During card sorting, you hand your participants a stack of cards with different topics written on them, and they sort and organize the cards into piles. Depending on the type of card sorting, participants get groups that the creator has already labeled (closed card sorting), label their groups on their own (open card sorting), or can modify the groups given to them by the creator (hybrid card sorting). Card sorting helps you understand the mental models of your users.
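One common way to analyze the results of an open card sort is to count how often two cards end up in the same pile. The following is only a minimal sketch of that idea: the card names, the group labels, and the `co_occurrence` helper are made up for illustration and are not taken from any particular card sorting tool.

```python
from collections import Counter
from itertools import combinations

def co_occurrence(sorts):
    """Count, for every pair of cards, how many participants grouped them together."""
    counts = Counter()
    for groups in sorts:
        for cards in groups.values():
            # Sort the cards so that each pair is always stored in the same order.
            for pair in combinations(sorted(cards), 2):
                counts[pair] += 1
    return counts

# Hypothetical results of an open card sort with three participants.
# Each participant is a dict: group label -> cards placed in that group.
sorts = [
    {"Account": ["Login", "Profile"], "Shop": ["Cart", "Checkout"]},
    {"Me": ["Login", "Profile", "Cart"], "Pay": ["Checkout"]},
    {"User": ["Login", "Profile"], "Buying": ["Cart", "Checkout"]},
]

pairs = co_occurrence(sorts)
for (a, b), n in pairs.most_common():
    print(f"{a} + {b}: {n} of {len(sorts)} participants")
```

Pairs that most participants put together are good candidates for sharing a category in the navigation. Pairs that are rarely grouped together probably belong in different places.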

Click Test

Click tests, often also called first-click tests, help identify navigation problems. Unlike card sorting, a click test is not used to create the information architecture or navigation; it is used to check whether the navigation derived from the card sorting works and helps users accomplish different tasks.

Click tests are run on an image of a sketch, wireframe, or design of a system, so they are easy to set up and quick to run. The participants are asked to click where they think they need to click to complete a given task. The clicks and the time it takes to click are recorded in the background and can be visualized in a heatmap that shows the areas where the most clicks occurred. Since click tests are easy to set up and also work with low-fidelity sketches, you can start doing them early in the design process and repeat them along the way.
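To make the recorded data more concrete, here is a rough sketch of how clicks and click times could be summarized. All values are assumptions for illustration: the click coordinates, the target box of the sign-up button, and the cell size are invented, and real click test tools do this aggregation for you.

```python
from collections import Counter
from statistics import median

# Hypothetical first clicks: x and y in image pixels, plus seconds until the click.
clicks = [
    (620, 40, 2.1),
    (640, 52, 3.4),
    (100, 300, 7.9),
    (630, 45, 2.8),
    (615, 300, 5.0),
]

# Assumed bounding box of the correct target, e.g. a "Sign up" button.
TARGET = (600, 30, 680, 60)  # left, top, right, bottom
CELL = 100  # size of one heatmap cell in pixels

def on_target(x, y, box=TARGET):
    """Return True if the click landed inside the target box."""
    left, top, right, bottom = box
    return left <= x <= right and top <= y <= bottom

def heatmap_bins(points, cell=CELL):
    """Count clicks per grid cell; a plotting tool could color cells by count."""
    return Counter((x // cell, y // cell) for x, y, _ in points)

hits = [c for c in clicks if on_target(c[0], c[1])]
success_rate = len(hits) / len(clicks)
median_time = median(t for _, _, t in clicks)
bins = heatmap_bins(clicks)

print(f"First-click success rate: {success_rate:.0%}")
print(f"Median time to first click: {median_time} s")
print("Clicks per cell:", dict(bins))
```

Cells with many clicks outside the target box point to elements that attract clicks even though they do not help with the task.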

An example of a click test would be to show participants an image of the homepage and ask them to sign up. After they have clicked the element they think will bring them a step closer to the sign-up process, you normally also ask them why they clicked there. These questions help you understand their mental models and give you more qualitative data.

Another important result of a click test is that you can easily find areas that are clicked a lot but are not actually clickable. This helps you with minimizing the number of wasted clicks.

The biggest disadvantage of this method is that it only works for a single screen or a few screens; you cannot run a click test on longer and more complicated task flows. Additionally, because you are using a static image, the test does not simulate the realistic browsing behavior of a normal user, and the results may differ when elements are located below the fold on the live version of the system.

Eye Tracking

Eye tracking is great for identifying elements that help or hurt the user’s attention. It uses additional hardware and software to track the movement of the eye and measure the gaze points while a user is viewing the system. During this process, everything is recorded and can be presented as a heatmap or a gaze plot. The gaze plot shows where the users are actually looking and where they are not looking, as well as the order in which they look at different elements. Heatmaps are primarily used to visualize which elements attract the most attention and which elements are overlooked by the majority of users.
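As a simplified illustration of what sits behind these two visualizations, the sketch below sums up how long each element was looked at (the basis of a heatmap) and the order in which elements were first looked at (the basis of a gaze plot). The fixation data and element names are hypothetical, and it assumes the eye-tracking software has already mapped each gaze point to an element.

```python
from collections import defaultdict

# Hypothetical fixations in viewing order: (element looked at, duration in ms).
fixations = [
    ("hero image", 420),
    ("headline", 310),
    ("hero image", 260),
    ("sign-up button", 180),
    ("headline", 150),
]

def dwell_time(fixations):
    """Total viewing time per element, which a heatmap would encode as color."""
    totals = defaultdict(int)
    for element, duration in fixations:
        totals[element] += duration
    return dict(totals)

def first_look_order(fixations):
    """Order in which elements were looked at for the first time."""
    order = []
    for element, _ in fixations:
        if element not in order:
            order.append(element)
    return order

dwell = dwell_time(fixations)
order = first_look_order(fixations)

print("Dwell time (ms):", dwell)
print("First looked at in this order:", order)
```

In this made-up session, the sign-up button is looked at last and for the shortest time, which is the kind of pattern that hints at an important element getting too little attention.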

The main advantage of eye tracking is that you can identify how much attention every element gets. Not only can you identify important elements that get too little attention, but you can also identify elements that pull the attention away from the more important elements. To also get qualitative data out of the eye-tracking test, it is important that the facilitator asks follow-up questions.

The biggest disadvantage of eye tracking is that you need special hardware and software to actually start collecting data. That’s also the reason why you cannot do an eye-tracking test remotely and have to set up a room with your eye tracking device before the test starts.


Preference Test

Another fast but effective method is the preference test. It helps to determine which of several designs the users like best, why they like it best, and what they like about this version. This method is especially helpful during the early stages of the design process or when the team is not able to decide which version they like best. A preference test can also be done online with a video call and screen sharing, without any special tools.

The biggest disadvantage of this method is that users do not necessarily pick the version with the best usability or performance. Especially when you are presenting them with different high-fidelity designs, they will most likely focus on the aesthetics of the design.

Question Test and 5-Second-Test

The question test and the 5-second-test are similar methods. During both tests, you show the participant an image of the system and then ask questions about it. The biggest difference between the two is that you show the picture for only 5 seconds during the 5-second-test, whereas during the question test participants can look at it as long as they want. The 5-second limit is due to the fact that the vast majority of website visits are less than 10 seconds long and that users make up their mind about the quality of a website within 50 milliseconds.

The questions during these tests are mainly about the layout or content of a site but you can also ask them where they would complete a certain task or action on the page and what they would expect to happen if they click on a certain element.

Since these tests can also be done with only an image of the product, they are also fast to set up and can be easily done online with users from around the world. The biggest disadvantage of these tests is that they cannot solve design issues, they can just point out if there are potential problems somewhere.

Additional Methods

There are many more UX research methods that also help with testing the usability of a system. Additional examples include:

  • Contextual Inquiries, Ride-Alongs, Field Visits, Ethnographic Field Studies
  • Diary Studies
  • Focus Groups
  • Surveys
  • Voice of the Customer (VoC) Tools including feedback forms, questionnaires, and ratings on websites, apps, or app stores
  • Interviews
  • Usability Lab Studies
  • Participatory Design
  • Concept Testing
  • Desirability Studies
  • Clickstream Analysis
  • A/B Testing, Multivariate Testing, Live Testing, Bucket Testing
  • True-Intent Studies
  • … and many more

Usability Benchmarking

Another really important summative testing method (used at the end of the design process) is usability benchmarking. The main goal of this method is not to improve the usability of a system but to measure its current usability, which provides a baseline against which future versions of the system can be compared.

It is a great tool to ensure that the changes you are making move you in the right direction and that you have clear reference points. UX benchmarking involves collecting quantitative data that describes the current user experience. This data could include detailed numbers about the average time users spend on the system, the average time until they make a purchase or complete a certain task, the success rate or conversion rate, the retention rate, and many more.

Once you have done your UX benchmark, you can compare your data against an earlier version of the product, the data from a competitor, an industry standard, or a goal defined by the stakeholders of the product. Even when you just did your first benchmark, you can still compare your data to the competitors or the industry standard.
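A benchmark comparison like this can be sketched in a few lines. The session data, the metric names, and the `benchmark` helper below are assumptions made for illustration only; which metrics you actually track depends on the product.

```python
from statistics import mean

def benchmark(sessions):
    """Summarize task sessions given as (completed, seconds) tuples."""
    # Time on task is averaged over successful sessions only.
    # Note: mean() raises an error if no participant completed the task.
    completed_times = [secs for done, secs in sessions if done]
    return {
        "success_rate": len(completed_times) / len(sessions),
        "mean_time_s": mean(completed_times),
    }

# Hypothetical measurements for the same task on two versions of a product.
baseline_sessions = [(True, 60), (False, 120), (True, 80), (False, 90)]
redesign_sessions = [(True, 50), (True, 40), (False, 100), (True, 60)]

baseline = benchmark(baseline_sessions)
redesign = benchmark(redesign_sessions)

# Positive change in success rate and negative change in time are improvements.
change = {metric: redesign[metric] - baseline[metric] for metric in baseline}

print("Baseline:", baseline)
print("Redesign:", redesign)
print("Change:  ", change)
```

With only four sessions per version, the numbers here are far too small to draw conclusions from; the point is just that the same metrics are collected in the same way for each version so that they can be compared.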

The main advantage of this method is that you can measure the progress you have made after a lot of design iterations and show this data also to stakeholders or clients to prove your good work along the process.


Since most of these methods can also be done remotely, there is currently a big boom in online tools that help companies with usability testing. These tools offer different methods and pricing models and are especially useful for testing with users all over the world. Popular testing services include UserTesting, Validately, UsabilityHub, UXTesting, Userlytics, and many more. Since I prefer to use the help of local companies, I would personally use Userbrain from Graz to do my usability tests.


