Towards a Toolkit for Interaction Design
IBM T. J. Watson
The Roving Tribes of Interaction
This volume is concerned with
establishing foundations for interaction design. "Foundations"
strikes me as an ambitious metaphor, suggesting, as it does, a solid base
upon which a single, unified edifice will be erected. And, following the
metaphor a step further, it assumes the existence of a stable, well
organized community with a shared set of values that is ready to embark
upon such a construction project.
I don't believe these assumptions
hold up. To me, the state of interaction design feels more primitive.
Rather than being an organized community, interaction design feels closer
to being composed of a number of roving tribes who occasionally encounter
one another, warily engage, and, finding the engagements stimulating,
remain open to other encounters.
If this is the case, how do
we make progress? I suggest that rather than trying to construct a unified,
coherent account of interaction design, we would do better to take a more
syncretic approach, gathering appropriate concepts and exploring their
interplay without, however, insisting on resolving their tensions and
In this essay I explore these
issues. I begin with a definition, and illustrate my approach to partitioning
the terrain of interaction design using five conceptual "lenses."
In so doing, I cover most of what I see as the theoretical roots of interaction
design. I then turn to the role of theory in interaction design, and suggest
that a good way to begin is to assemble a toolkit of concepts for interaction
design that consists of appropriately sized theoretical constructs.
I define interaction design
Interaction design has to
do with the design of any artifact, be it an object, system, or environment,
whose primary aim is to support either an interaction of a person with
the artifact, or an interaction among people that is mediated
by the artifact.
Although some see interaction
design as particularly concerned with digital systems--either computer
systems or artifacts with embedded computational capabilities--I see no
reason to exclude humbler artifacts. The forces that shape our interactions,
from perceptual and motor processes such as seeing and touching, to social
and cultural phenomena such as imitation and fashion, are agnostic with
respect to whether an artifact contains digital components. Indeed,
much of what we understand about the design of non-digital artifacts--whether
it be how to make a switch with a satisfying 'click,' or how clothing
functions as a means of expressing identity--is applicable, as well,
to digital systems. Finally, as computer systems become increasingly embedded
in our artifacts and environments, and even the most mundane objects are
tagged and tracked by digital systems, our ability to discriminate between
the digital and the non-digital will fade, even should we wish to maintain
The Terrain of Interaction
Figure 1 shows a series of
chess games in Washington Square, in New York City. In the foreground
we see a chessboard, the players rapt in concentration. To one side of
the board a few captured black pieces are gathered together; to the other
is a pair of chess clocks that meter out the players' allotted minutes.
Farther back we see other chess games, each with its circle of spectators.
Still farther back we see passersby, most of whom are oblivious to what
is going on, but a few of whom may be drawn into the circle of spectators,
and then, perhaps, into playing a game or two themselves. And in the far
background we discern trees and buildings, and see that the games are
taking place outdoors in a city square.
Figure 1. Pickup chess games in a park. Photo © 2004 Project for Public
Spaces, Inc. www.pps.org
To me, this picture represents,
in miniature, the terrain of interaction design. As such, I'll use it
to describe how I go about making sense of interaction. As a designer,
I'm continually confronted with new sites and situations, and for each
site I need to come up with a way to see it, to analyze it, to design
for it, and to understand the consequences of what I have designed. I
find that I work best when I orient to the site or situation in which
the interaction takes place; for me the site comes first, and the conceptual
framework and methods and tools come later. As a designer, my principal
challenge is to make sure that I don't get too fixated on a single aspect
of the situation, that I don't get trapped in a particular perspective
or approach. Rather than find a single conceptual framework that fits
the situation, my aim is to stay grounded in the concrete reality
of the site, and to bring a range of conceptual lenses to bear on it.
So let us return to the picture
in Figure 1. We will walk through the image, taking a look through each
of the set of lenses that I bring to bear on the sites with which I engage
as a designer.
I begin, perhaps as a consequence
of my early training, with the mind, envisioning the game in purely cognitive
terms. Playing chess, viewed through this lens, involves a cycle of perception,
cognition and action. This is the domain of cognitive psychologists, such
as Donald Norman (1986), and is concerned with issues such as how people
might go about learning chess, what sorts of errors they might make while
doing so, how players develop strategies, why people find games of this
sort engaging, and so on. This is the lens most often deployed by interaction
designers versed in human-computer interaction, and is of critical import
in the design of screen-based applications.
Moving on, we deploy a new
lens, shifting our focus from minds to bodies and the ways in which we
use our bodies to interact with one another. In the picture we see a number
of bodies: the player in the left foreground, his face rapt in concentration
as he gazes at the board; the spectator in the right foreground, gazing
at the game, his posture suggesting that he has settled down to watch
for a while. In the next game back, a player is reaching to move a piece,
after which he will quickly slap the chess clock to stop his time and
start his opponent's; that game, too, has spectators, though they seem
less intent on the game and more interested in talking with one another.
This is the domain of ethnomethodologists such as Adam Kendon (1990),
sociologists such as Erving Goffman (1963), and anthropologists such as
Edward Hall (1983), who focus on the role of expression, posture, gaze,
gesture and timing in interactions within small groups. This lens is important
for those concerned with designing material artifacts--especially large
artifacts such as control panels, rooms and buildings--as well as those
designing digital systems which support mediated (i.e. disembodied) interaction.
Next we shift our view to
the artifacts in the picture. We see a chessboard arrayed with white and
black pieces; off to one side we see a cluster of captured black pieces,
and off to the other a pair of chess clocks. These artifacts play a variety
of roles, interacting with the views from other lenses. One role of artifacts,
which Norman explores in Things that Make Us Smart
(1993), is to ease the cognitive load: the board and the pattern of pieces
on it serve to preserve the state of the game, enabling players to focus
on planning their next moves. Another role of artifacts is their status
as objects that are manipulated by the participants. While the manipulation
of chess pieces is a relatively simple matter, ethnomethodologists like
David Sudnow demonstrate that the ways in which people physically interact
with objects are incredibly subtle. In his book, Ways of the
Hand, Sudnow (2001) gives an exquisitely detailed account of the process
of learning to improvise jazz on the piano, and the ways in which his hands
(not his mind) learned to traverse the keys. A third role of artifacts
is depicted by Ed Hutchins in Cognition in the Wild (1995),
in which he explores the view that cognition is not just a property of
minds, but can be seen as a global property of systems of people and artifacts.
A fourth role of artifacts is a social one, in that the pair of clocks
substitutes for a human timekeeper. This view is explored by Bruno Latour
(1992), who eloquently makes the case for a sociology of artifacts, suggesting
that it is artifacts which stabilize and extend human interaction patterns.
This lens--with the glimpses it gives of artifacts and their varied roles--is
important for those who design material artifacts, as well as for those
who aim to replace material objects with digital 'equivalents.'
Now we move to a level of
analysis that is not grounded in anything that can be explicitly seen
in our picture. The social lens examines relationships, both among people
and between people and objects, and tries to take notice of the norms
and rules that underlie them. Thus, in our picture, we see not just people,
but people who stand in relationship to one another--players, spectators,
passersby--and who are obeying rules as a consequence. Of course, the
game of chess has a set of rules associated with it, but of more interest
are the unwritten rules being adhered to. Thus, one chess player does
not shout at the other as he ponders his move (something which is permissible
in games like baseball), nor does he, after capturing a piece, toss it
into the dirt beneath the table. There is an unarticulated notion of "proper"
behavior in play, and one that, furthermore, extends beyond the game.
Thus, the onlookers watch quietly and refrain from offering advice (again,
unlike some other games), and one, standing nearby, appears to be waiting
his turn to take on the winner, thus participating in an unarticulated
but mutually understood notion of turn-taking. This is the realm of social
psychology, sociology (Goffman, 1963), ethnomethodology (Heath & Luff,
2000) and anthropology (Whyte, 1988). This lens is essential to any interaction
designer wishing to reflect upon ways in which a newly designed artifact
may disrupt situations in which it is introduced, or the ways in which--as
with a web-based chess game--the digital equivalent of a face to face
interaction may have very different social effects.
The last lens I'll discuss
gives, by far, the broadest view. It is the view of the interaction as
it is situated in its larger context. Here we look not just at the chess
game and its audience, but at its temporal and spatial location. Temporally,
these chess games are a fixture, recurring nearly every day in the same
location--outdoors in a public square. By virtue of its location, passersby,
on their ways to other places, become aware of the game and, over time,
notice that it is a recurring event. Perhaps, another day, when on less
urgent business, one passerby may pause to watch and even to play, thus
helping the game, as an on-going event, to sustain and extend itself.
Even if the game fails to interest most passersby, it still contributes
to the liveliness and interest of the urban space. This lens, looking
at the ways small interactions like the chess game flourish (or not) in
the context of other interactions, is exemplified by the work of urbanists
like Jane Jacobs (1961), urban designers like Kevin Lynch (Banerjee &
Southworth, 1990), architects like Christopher Alexander (Alexander et
al., 1977), and anthropologists like William Whyte (1988). This lens is
crucial for the interaction designer who creates artifacts for use in
public places, and who desires to create self-sustaining interactive systems.
About the Lenses
I do not wish to argue that
these are the five and only five lenses of use to interaction designers;
others may wish to suggest additional lenses, or to partition things up
differently. The main point is that there are multiple perspectives from
which interaction designers can analyze the sites or situations with which
they are confronted, and that designers will fare best when they are able
to pick up one lens, then another, and then a third. It is the ability
to fluidly shift perspective that is, in my opinion, of most value to
The Role of Theory
Now I'd like to turn to the
question of the role of theory in interaction design. As I've said, I
think it's too soon to try to create a unified theory or framework for
interaction design; instead, I suggest that a more productive way to proceed
is to syncretically assemble a toolkit of theoretical constructs and methods,
such that for any of my five lenses (or other lenses to be suggested),
there are a number of theoretical constructs and methods that might be
brought into play.
In my opinion, the key question
is how to select theories, etc., that are likely to be useful. I believe
the problem is one of scale. It is not clear what the proper scale of
theoretical construct is, and often we err by seizing on apparently useful
concepts without sufficiently understanding their contexts. As an example,
consider the notion of "affordance." Affordance, a concept developed
by ecological psychologist J. J. Gibson (1979), is now commonly misused
in interaction design. As initially defined, it was a relational
concept, denoting the possibility of an interaction between an organism
with particular characteristics and an artifact with particular characteristics.
Gibson developed a sophisticated argument--drawing on a number of concepts
ranging from "affordance" to "agent" to "ecology"--that
organisms perceive their environment in terms of affordances. "Affordance,"
as Gibson used it, has little to do with its popular use in interaction
design as a visible indication that something can be done (visibility
has nothing to do with affordances), nor does it make any sense to talk
about an artifact affording something without also specifying the sort
of entity to which the affordance applies. The problem is that "affordance"
has been plucked out of the theoretical framework which gave it its power
and nuance, and used in isolation has become a bit of jargon with little
At the same time, we need
to be cautious about adopting full-fledged theories from other disciplines.
The reason is that theories play multiple roles. At its most basic level,
a theory is a useful simplification, a mechanism for imposing a framework
on the blooming buzzing confusion that is reality. To the extent that
their basic components are understandable and memorable, theories serve
as common frameworks, lingua francas that allow insiders and outsiders to speak to one another
using a common language and shared concepts. Thus biological concepts
such as "disease," "bacteria," "virus,"
"germ," "infection," "antiseptic," and "antibiotic"
provide both specialists and layfolk with a common ground through which
they can understand and discuss basic medical issues. However, theories
play other roles within a discipline. In particular, a theory can serve
as a framework for debate within a discipline and, as a consequence, over
time the theory is articulated and refined in response to the debate, resulting
in a more complex theory, or possibly multiple versions of the theory.
These two roles of theory
stand in tension to one another: the utility of a theory for promoting
debate and further articulation of itself within a field may actually
interfere with its utility in communicating beyond the field. The requirements
for promoting articulation within a field involve supporting the creation
of distinctions and nuances that can serve as the ground upon which positions
can be established, whereas the requirements for communicating beyond
a field require the ability to depict the conceptual framework in a few
bold and broad strokes of the brush. While the ability of a framework
to support the finely detailed nuance is not necessarily at odds with
the ability to also serve as a simplifying framework, it often is.
What this boils down to is
that we need to think carefully about the theoretical constructs we choose
to use in interaction design. We need constructs that are neither so large
that they bring along all the analytical baggage developed in response
to internal disciplinary debate, nor so small that they lose the ability
to provide a useful framework for dealing with complexity that makes them
useful in the first place. In short, we need a conceptual middle ground,
a repertoire of theoretical constructs that are larger than "affordance"
or "breakdown" or "flow", and that are smaller than
"activity theory" or "distributed cognition" or "ethnomethodology".
Towards a Conceptual Toolkit
What sort of theories and
methods belong in a 'toolkit' for interaction designers? What is the right
size or scale of a theory or method? How do we go about finding them?
One possibility is that we
need to take theories developed by other disciplines and simplify them
for our purposes, pruning away the complexity generated for internal disciplinary
purposes. This is something along the lines that Don Norman has suggested
in his proposal for an applied discipline of cognitive engineering (Norman,
1986). Perhaps, just as cognitive engineering could serve as a tool when
applying the "Mind" lens, other theories might be simplified for
use with other lenses. Another candidate--an area of Economics known as
mechanism design that examines the ways in which systems of incentives
are designed to shape large scale group behavior -- is discussed by Picci
Another possibility is that
interaction designers might, by drawing on the work of multiple disciplines,
develop design-oriented theories that are targeted at particular areas
of interaction design. Such design theories would span several lenses,
but by virtue of being targeted at a particular design domain, would retain
some simplicity. For example, over the last several years, my colleagues
and I have been developing the construct of social translucence, which
is an approach to designing systems that support human-human
collaboration (Erickson & Kellogg, 2003). Similarly, Katie Salen and
Eric Zimmerman (2004) have made an impressive attempt to develop a theory
of game design, drawing from a wide range of disciplines.
A third possibility is that
a more radical form of simplification is needed: elsewhere I've proposed
that adapting the notion of pattern languages from architecture (Alexander
et al., 1977) might provide a way of creating a lingua franca
for interaction design (Erickson, 2000a, 2000b) that would foster communication
amongst the diverse constituencies which make it up.
I began this essay by objecting
to the synthetic program of trying to create a unified and coherent foundation
for interaction design. Rather than an organized field with the shared
values necessary for such a project, interaction design feels much closer
to a confederation of nomadic tribes who occasionally come together. Instead
of joining together to construct foundations, we would be better advised
to proceed syncretically by sharing our tools--i.e. theories, concepts
and techniques--and trying to apply them in our own territories. When
we encounter one another again, by virtue of our attempts to use some
of the same tools for different ends, we'll have a bit more common ground,
and a new set of experiences to share.
Alexander, C., Ishikawa, S.,
Silverstein, M., Jacobson, M., Fiksdahl-King, I., & Angel, S. A. (1977).
A pattern language. New York: Oxford University Press.
Banerjee, T. & Southworth, M. (Eds.). (1990). City sense and city design:
Writings and projects of Kevin Lynch. Cambridge, MA: The MIT Press.
Erickson, T. (2000a). Towards
a pattern language for interaction design. In P. Luff, J. Hindmarsh &
C. Heath (Eds.), Workplace studies: Recovering work practice and informing
systems design (pp. 252-261). Cambridge: Cambridge University Press.
Erickson, T. (2000b). Lingua
francas for design: Sacred places and pattern languages. In D. Boyarski
& W. A. Kellogg (Eds.), Proceedings of the ACM Conference on Designing
Interactive Systems (pp. 357-368). New York: ACM Press.
Erickson, T. & Kellogg,
W. A. (2003). Social translucence: Using minimalist visualizations of
social activity to support collective interaction. In K. Höök, D. Benyon,
& A. Munro (Eds.), Designing information spaces: The social navigation
approach (pp. 17-42). London: Springer-Verlag.
Gibson, J. J. (1979). The
ecological approach to visual perception.
Boston: Houghton Mifflin.
Goffman, E. (1963). Behavior
in public places: Notes on the social organization of gatherings.
New York: Macmillan.
Hall, E. T. (1983). The
dance of life: The other dimensions of time.
New York: Anchor Books.
Heath, C. & Luff, P. (2000).
Technology in action. Cambridge: Cambridge University Press.
Hutchins, E. (1995). Cognition
in the wild. Cambridge, MA: The MIT Press.
Jacobs, J. (1961). The death
and life of great American cities.
New York: Random House.
Kendon, A. (1990). Conducting
interaction: Patterns of behavior in focused encounters.
Cambridge: Cambridge University Press.
Latour, B. (1992). Where are
the missing masses: The sociology of a few mundane objects. In W. E. Bijker
& J. Law (Eds.), Shaping technology / building society: Studies
in sociotechnical change (pp. 225-258). Cambridge, MA: MIT Press.
Norman, D. A. (1986). Cognitive
engineering. In D. A. Norman and S. W. Draper (Eds.), User centered
system design: New perspectives on human-computer interaction
(pp. 31-61). Hillsdale, NJ: Lawrence Erlbaum Associates.
Norman, D. A. (1993). Things
that make us smart: Defending human attributes in the age of the machine. Reading, MA: Addison-Wesley.
Salen, K. & Zimmerman,
E. (2004). Rules of play: Game design fundamentals.
Cambridge, MA: The MIT Press.
Sudnow, D. (2001). Ways
of the hand: A rewritten account. Cambridge, MA: The MIT Press.
Whyte, W. H. (1988).
City: Return to the center. New
York: Anchor Books.
Thomas Erickson practices
interaction design and research at IBM's T. J. Watson Research Center
in New York, to which he telecommutes from his home in Minneapolis. His
current work involves studying and designing systems for supporting computer
mediated communication (CMC) in groups and organizations, and his principal
aim is to create systems that can mesh with the social processes that
govern our daily communication practices. Erickson's approach to systems
design is shaped by methods developed in HCI, and theories and representational
techniques drawn from architecture and urban design. His theoretical and
analytical approaches are drawn primarily from rhetoric and sociology.
In addition to CMC, his research interests include virtual communities, pattern
languages, genre theory and interaction design. Over the last two decades
Erickson has published about fifty refereed papers, and has been involved
in the design of over a dozen systems ranging from advanced research prototypes
to commercial products. Prior to joining IBM Research in 1997, he spent
nine years at Apple Research, five years at a startup called Software Products
International, and before that five years studying Cognitive Psychology
at the University of California, San Diego.
in Foundations of Interaction Design. Lawrence
Erlbaum Associates, in press, 2005.