Quantitative Methods of Usability Analysis
In his recent article "Three Hypotheses of Human Interface Design", Tantek Çelik comes up with some interesting hypotheses about the usability of computer interfaces, that is, about the cognitive load of getting things done:
- Minimize the number of text fields in your interfaces down to the absolute minimum necessary.
- Minimize the number of clicks / keystrokes / gestures necessary to accomplish actions in your interface.
- Make your interface as responsive as possible — minimize the latency of each and every action a user might take in your interface.
In short: keep it simple. Don’t make me think. Don’t make me wait.
Brilliant! Then, a couple of days later, I finally found the time to read Jef Raskin's book The Humane Interface (2000), in which he demonstrates four methods of quantifying the efficiency of a software interface: Fitts's Law (1954), Hick's Law (1952), Raskin's own measures of efficiency, and the GOMS method, in particular the Keystroke-Level Model (1980).
I hadn't heard of at least half of them before, and neither had my colleagues. Apparently these methods are well established in software UI design but less known among web developers, which is a good reason to blog about them.
Fitts's Law
You have most probably heard of this law. Given the size S of a target (like a button) and the distance D the cursor has to travel, it predicts the average time it takes to hit the target:
time (in msec) = a + b log2 (D / S + 1)
a and b are empirical constants, covering things like the reaction time and the time to click a button; the approximations a = 50 and b = 150 work well in practice. The smaller of the target's width and height is good enough for S, and an average on-screen distance for D. The binary logarithm log2 corresponds to the binary chance of hitting or missing the target. In short: a large, nearby button is easier to hit.
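As a quick sketch (Python, using the approximated constants a = 50 and b = 150 from above; the example sizes and distances are made up for illustration):

```python
import math

def fitts_time(distance, size, a=50, b=150):
    """Average time in msec to hit a target of the given size at the
    given distance (Fitts's Law with approximated constants)."""
    return a + b * math.log2(distance / size + 1)

# Doubling the button size at the same distance saves pointing time:
print(round(fitts_time(distance=400, size=20)))  # 709 msec
print(round(fitts_time(distance=400, size=40)))  # 569 msec
```

Note how the logarithm means that each doubling of the button only saves a constant amount of time; the first doubling buys the most.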
For some practical examples, see Kevin Hale's excellent article, or Microsoft's application of Fitts's Law in Office 2007.
Hick’s Law
Hick’s Law is about the time it takes to make a choice from a number of elements. Be careful when applying this law to menus, because there are other factors like the size, design, sort order, or readability of menu items.
time (in msec) = a + b log2 (n + 1)
For example, ignoring the constants, we can boldly claim that choosing from a menu of eight items takes longer than choosing from one of four, because log2 8 = 3 and log2 4 = 2. In theory, opening a submenu of four items from another menu of four items takes longer than a single choice among eight, because 2 × (log2 4) = 4 and log2 8 = 3. But because human performance sinks on cognitive tasks involving more than 7 ± 2 items, choosing one of eight items could actually take longer.
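The flat-versus-nested comparison can be sketched with the same approximated constants as above (a minimal illustration, not a substitute for testing real menus):

```python
import math

def hick_time(n, a=50, b=150):
    """Average decision time in msec for a choice among n items
    (Hick's Law with approximated constants)."""
    return a + b * math.log2(n + 1)

flat = hick_time(8)                   # one menu of eight items
nested = hick_time(4) + hick_time(4)  # four-item menu plus four-item submenu
print(flat < nested)  # True: in theory, the single broad menu wins
```

With the constants included the gap is even clearer than in the constant-free comparison, because the reaction time a is paid twice for the nested menus.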
Interface efficiency
The information-theoretic efficiency of an application is the minimum amount of necessary information divided by the amount of information actually conveyed. That sounds difficult but is easy. Making a choice means one bit of information: "Push OK or Cancel." One bit of information, requiring one click or keystroke. A keystroke can carry approximately 5 bits, so the efficiency is 1 / 5 = 0.2, or 20%. Thus you know there is room for improvement.
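The calculation is trivial, but writing it down makes the measure concrete (a sketch; the 5-bit figure for a keystroke is the approximation used above):

```python
def information_efficiency(necessary_bits, supplied_bits):
    """Information-theoretic efficiency: minimum necessary information
    divided by the information the interaction actually supplies."""
    return necessary_bits / supplied_bits

# "Push OK or Cancel": a one-bit choice answered with one keystroke,
# which could carry roughly five bits
print(information_efficiency(1, 5))  # 0.2, i.e. 20%
```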
In the following example the user gets no choice at all, so the information efficiency is zero:
Of course it is necessary to inform users that a search has finished or that certain fields in a web form are required, but that can be done unobtrusively, without alert boxes. A yellow fade does not require pushing a button and has 100% efficiency. If the user can only do one thing next, have the computer do it. Besides, hitting "OK" soon becomes habitual and therefore pointless.
Goals, operators, methods, and selection rules (GOMS)
GOMS was described by Stuart Card, Thomas P. Moran, and artificial intelligence pioneer Allen Newell in their influential book The Psychology of Human-Computer Interaction (1983). GOMS, and in particular the simplified Keystroke-Level Model (KLM), offers a simple approach to estimating the duration of tasks on a computer:
- K = 0.28 sec. — key press and release (keyboard)
- P = 1.1 sec. — pointing the mouse to something
- B = 0.1 sec. — button press or release (mouse)
- H = 0.4 sec. — homing, hands movement from mouse to keyboard or reverse
- M = 1.2 sec. — mentally preparing
- W(t) = t sec. — waiting for the system response
Inserting the mental operators (M) is probably the most difficult part, but there are six simple rules for applying them. So in another infamous MS Word example, where the user wants to change a radio button, we get the sequence HMPBPB with a total of 4.0 sec., or HMPBHK with a total of 3.48 sec. if the user hits Return instead of clicking "OK."
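The sums are easy to check mechanically. A minimal KLM calculator using the operator durations listed above (ignoring the six mental-operator placement rules, which have to be applied by hand):

```python
# Approximate KLM operator durations in seconds, as listed above
KLM = {"K": 0.28, "P": 1.1, "B": 0.1, "H": 0.4, "M": 1.2}

def klm_duration(sequence):
    """Sum the durations of a KLM operator sequence like 'HMPBPB'."""
    return sum(KLM[op] for op in sequence)

print(round(klm_duration("HMPBPB"), 2))  # 4.0  (click the radio button, click OK)
print(round(klm_duration("HMPBHK"), 2))  # 3.48 (click the radio button, hit Return)
```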
3.48 seconds is pretty fast, but since there is only one bit of information but two keystrokes we know it could be faster:
The sequence required for this selection is HMPB or HMKK with a duration of 2.8 sec. or 2.16 sec., respectively.
Simpler interfaces increase usability and are faster. Because they are simpler, I don’t have to think so hard. Wait, that sounds familiar … it sounds like … cognitive load!
Let’s take Tantek’s examples where he describes the steps for instant messaging and writing an email. That would be something like:
- HMPMBBM(K × n)K
- HMPMKKKKMKKM(K × n)MKM(K × n)MPB
Even without knowing, and thus ignoring, the number of characters in the message and subject (K × n), the estimated time for writing an IM (5.58 sec.) is significantly shorter than for writing an email (13.06 sec.).
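The email total can be verified with the same kind of calculator (redefined here so the example stands on its own; the variable-length (K × n) typing runs are simply left out):

```python
# Approximate KLM operator durations in seconds (same table as above)
KLM = {"K": 0.28, "P": 1.1, "B": 0.1, "H": 0.4, "M": 1.2}

def klm_duration(sequence):
    """Sum the durations of a KLM operator sequence."""
    return sum(KLM[op] for op in sequence)

# The email sequence from above with the (K × n) runs stripped out:
email = "HMPMKKKKMKKM" + "MKM" + "MPB"
print(round(klm_duration(email), 2))  # 13.06
```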
I'm really sorry, it was a brilliant idea, but I'm afraid KLM-GOMS describes pretty well what Tantek calls the Three Hypotheses of Human Interface Design. Unfortunately, somebody came up with that established method of human-computer interaction independently, 27 years ago.