A Measurable Attention Economics

  | NOTICE:
  | 
  | Ignore these notes to self, indented with a vertical bar.

Chapter 1.

  | To add: nanoeconomics, and that attention economics is measuring
  | fast vs. slow thought.

The internet is like a child: it needs human attention to survive. Websites need humans to visit them. Facebook and Google make money by attracting human attention to ads. The nascent fields of human computation, online games, and crowdsourcing exist only by finding clever ways to motivate humans to attend to tasks, and they succeed in proportion to the quantity of attention they recruit. Human attention is the fundamental resource, the fuel, on which websites and many other human-computer systems run.

  | Animated flow-chart diagram goes here.

Human-computer systems are information processors that consume human attention. They attract it with compelling tasks for achieving valuable goals. Flickr created a compelling way to share photos. This attracted human attention to the task of uploading photos. The more photos Flickr attracted, the more viewers of photos it attracted. The more viewers, the more comments photos received. The more comments received, the more engaged the artists felt, and the more photos they uploaded. This was a virtuous cycle of attention—Flickr used attention to recruit more attention. Attention can serve many other purposes as well. Website owners want more attention, for its many uses.

  | I can also give an example for google and/or facebook, and
  | emphasize some of the other purposes that attention can go to.

Every website on the internet is competing for human attention, in a giant market for attention. Producers of attention (like you, my dear attender of this document) forage at their own discretion for high-utility tasks. If an interface is well-designed, you will prefer it. If a system is faster or more reliable, you will prefer it. If a website is socially approved, you will prefer it. For website designers to attract your sustained attention, they must design high-utility tasks and interfaces. Now imagine that we could measure the utility of a task and interface. Then perhaps we could predict how many people would use it (how much attention it would receive) and thus quantify the attention economics flowing through the web. These measures could inform design decisions, business decisions, or even broader scientific questions about human behavior.

  | That is what this dissertation shows is feasible.

That is what this dissertation enables us to do. I have done some initial measurements to demonstrate it. For instance, imagine that you have a website that requires people to fill out many CAPTCHA “type-the-blurry-word” tests. Because the CAPTCHAs are already annoying, you figure it would be OK to put high-paying (but very ugly) animated ads on the page. Why not make some money while they are already annoyed? However, you are not certain that your intuition is correct, and you would like to quantify the utility, or cost, of the ugliness to predict how much the ads will affect your site's traffic. You would like some numbers to help guide your design decisions.

  | This example is not realistic enough.  Maybe you are considering
  | some ads from another company... they want to add some ads but you
  | think they might decrease traffic.
  | For instance, imagine that you want to increase ad revenue on your
  | website, but noticed that the most profitable ads are also the
  | most annoying.  You are worried that the annoyingness will drive away
  | your users.  What increase in revenue would outweigh the decrease
  | in usage?  If you could quantify the utility, or /cost to users/
  | of the ads, you could infer the degree of impact the ads have on
  | your site's traffic, and determine if it is justified by the
  | /increase of revenue/ you will receive.  You would like some
  | numbers to help guide your design decisions.

I ran an experiment to show that you can measure the value of aesthetics in some situations. Below is a standard CAPTCHA “type-the-blurry-word” interface. We paid 1,200 users different amounts of money to use it. We also made a very ugly version. Mouse over the Ugly tab.

We had to pay users 0.6¢ more to complete the ugly version. In other words, the ugliness is worth –0.6¢ per task: it causes a decrease in attention worth 0.6¢ per task. This is the attention utility of that ugliness.

We can measure the attention economics of many manipulations to user interfaces and tasks. In the future, I believe we can quantify much of the internet's attention economics.

Attention Economics Beyond the Internet

However, attention economics is not only an issue on the internet—it exists anywhere in life where attention is a scarce resource. The scarcity of attention puts it within the purview of economics.

Consider this definition of economics:

  "Economics is a science which studies human behaviour as a
   relationship between ends and scarce means which have alternative
   uses."
   - Lionel Robbins (1932)

Economics is the study of how to allocate scarce resources. It has traditionally been applied to resources like money, land, and energy. But today we find that attention—particularly with information technology—is the scarce resource of interest, as Herb Simon famously wrote in 1971:

  "... a wealth of information means a dearth of something else - a
   scarcity of whatever it is that information consumes. What
   information consumes is rather obvious: it consumes the attention
   of its recipients. Hence a wealth of information creates a poverty
   of attention, and a need to allocate that attention efficiently
   among the overabundance of information sources that might consume
   it."
   - Herb Simon (1971)

This scarcity is increasingly visible. For instance, our world is increasingly covered with advertisements, each trying to get our attention. Entertainment options have multiplied, first with radio, then television, then tens of cable channels, then hundreds, and now the infinite variety offered over the internet, all competing for our attention.

New ways of aggregating our collective attention have led to amazing breakthroughs, such as Wikipedia and other crowdsourcing systems. On the other hand, problems such as the ADHD pandemic make attention look like society's new Achilles' heel. Attention is the scarce resource.

  | (cartoon) This is a new world.  Instead of hoarding land, it's
  | about attracting attention.

How well do you control your attention? When you set a goal, do you achieve it? Do you ever procrastinate, mis-allocating your attention away from that which you desire to attend to? How can it be that you believe you want to attend to your homework, but in the moment, you somehow choose to attend to an interesting television show instead? We can answer these questions.

How distractible are you? Do you pay attention to ads? How about loud sounds or sexy pictures? How much would you pay to remove ads from all webpages on the internet? How do you think this compares to what companies pay to place those advertisements on the internet? Attention Economics can now measure these quantities.

What is the value of Google to its users? People already count the number of times users visit it—the amount of attention—but how much value does this attention have? Which is more valuable, Facebook or Google? How much value does Google or Facebook have in users' attention? How much value does users' attention have to Facebook and Google?

For instance, the average college student Facebook user spends more time on Facebook (~30 minutes per day) than on Google Search (~1 minute per day). However, my research has found that users would need to be paid more to live without Google Search than to live without Facebook. Google Search has more value, even though it receives less use.

This dissertation will allow us to address questions like these, by measuring the economics of attention. I will propose a set of methods that allow us to run experiments to measure the attention utility of tasks, goals, and objects of interest.

Attention Economics has been explored before by many writers and academics (e.g., Goldhaber, Lanham, Simon), but none have yet found a way to measure it quantitatively. This dissertation will argue that a measurable attention economics could benefit a variety of other human endeavors, such as the design of user interfaces, the modeling of user behavior on the internet, the understanding and diagnosing of ADHD, the modeling of neuroeconomics, and the analysis of political discourse.

Can we measure it?

Because human attention is a scarce resource, it has an economy. Humans produce attention, creating supply. Websites and tasks consume it, creating demand. Competitive markets determine its allocations. And macro-economic tides of high and low-value attendable tasks ebb and flow. To understand the future of the internet and humanity, we should map the economics of attention.

But Attention Economics is currently invisible. There is no visible exchange of money when we use most websites. We can see attention “changing hands,” but we do not know the price it goes for: the economics of the attention.

And at a more fundamental level, we cannot even apply standard (neoclassical) economics to attention, because whereas economics studies things allocated by the mind—by rational thought—attention is thought itself. Attention is the resource that allocates resources.

Capitalism models how goods can build up over time and accrue value. Attention, on the other hand, does not build up over time—it exists only in the present. In fact, the verb “to attend” literally means “to be present at.”

In traditional Economics, a rational agent is assumed to be able to think infinitely long before making every choice. In Attention Economics, an agent makes an infinite number of choices about how to allocate attention in every infinitesimal subsecond. We measure the value of each choice in dollars and cents.

In this dissertation, Attention Economics is an economics only of the present. To it, value only exists right now. In the present, Attention Economics evaluates one's opportunities for attention. In the present, Attention Economics defines our moment-to-moment stream of attention—the path of exploration we choose through life.

  | This is a deeply buddhist economics.  Forego material things.
  | Value the present moment.  (cartoon?)

Allow me to explain the details. What follows is a model of attention economics. As with traditional economics, this is an idealized model of man that can be used to explain his behaviors.

The Attention Economic Model

In every moment of your life, a set of opportunities occurs to you as possible. As you read this sentence, a related thought may occur in your mind as possible to attend to. As you finish this section, it may occur to you as possible to read the next section. If I toss a baseball to a pro outfielder, it will occur to him as possible to catch it.

The baseball player does not have to think hard about the ball to be aware that he can catch it. The opportunity just occurs to him, almost instantaneously, almost automatically, with fast thought, and very little mental effort. Daniel Kahneman would call this a System 1 result (which I will explain later). The action of reaching towards the ball with his opening gloved hand just occurs to him as a possibility.

                      /   opp. 1  - catch baseball
              __|__  /    opp. 2  - duck underneath baseball
             [ o__o] -    opp. 3  - continue scratching buttocks
              \_==_| \    opp. 4  - ...
                |     \   opp. 5  - ...
                 -E
             Baseball     Attention
             Player       Opportunities

This video illustrates the idea:

(The video was staged in an advertising production by Gillette.)

Each of these possibilities occurs with a degree of appealingness or unappealingness to the player. He finds catching balls, for instance, intrinsically attractive, especially when doing so protects a cute girl from injury. He does not have to think deeply about how much he wants to catch the ball vs. duck under it; the appealingness just occurs to him, as a mostly-automatic function of his mind, in the same way that the possibilities themselves occur. Almost instantly, as soon as he is aware of a ball being thrown in his direction, his mind's processing makes him aware of opportunities to attend to the ball, and the appealingness of the opportunities.

Thus, in every present moment, opportunities for attention occur to us with an inherent degree of appealingness. We have a word for this degree of appealingness—we call it the utility of the attention opportunity, or attention utility for short. We define attention utility as the degree to which a person prefers to attend to a present opportunity for attention.

  | Possible alt definition: relative to the present alternatives

                      /   Utility(catch)     = .7  <-- Best
              __|__  /    Utility(duck)      = .5
             [ o__o] -    Utility(scratch)   = .4
              \_==_| \    Utility(...)       = .2
                |     \   Utility(...)       = .4
                 -E
             Baseball     Attention          Attention
             Player       Opportunities      Utilities

The baseball player chooses to attend to the opportunity with the most attention utility.
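
This choice rule can be sketched in a few lines of Python. The utilities are the illustrative numbers from the diagram above; the dictionary layout and the function name `choose_opportunity` are assumptions made for this sketch, not part of the model's formal definition.

```python
# Illustrative attention utilities, copied from the diagram above.
utilities = {
    "catch baseball": 0.7,
    "duck underneath baseball": 0.5,
    "continue scratching buttocks": 0.4,
}


def choose_opportunity(utilities):
    """Return the opportunity with the highest attention utility."""
    return max(utilities, key=utilities.get)


choice = choose_opportunity(utilities)
print(choice)  # prints "catch baseball"
```

The sketch treats the choice as a simple maximum over whatever opportunities occur in the moment. How those utilities arise in the mind is left to the model in Chapter 2.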

Moment-by-moment, sets of opportunities occur to us, with fast, almost automatic cranial processing. We choose amongst them; we are choosing our stream of attention. Attention utility is an opportunity's appealingness, as computed with fast (System 1) thought, in the present, at the moment of action.

(Note that this is just a model. I am not claiming these constructs exist in any physical sense, even though they have analogs in neural models. And I will give more detail on this model in Chapter 2.)

  | An attention opportunity's /utility/, also occurs to us quickly,
  | almost automatically.  In every moment of your life, a set of very
  | fast, almost automatic processes in your mind are helping to
  | choose every possibility that you attend to.  Our stream of
  | attention is chosen with fast thought.

Measuring Attention Utility

There is something special about this definition of attention utility: it is something we can measure. If we measure it, we can shed light on the invisible attention economy of life.

  | If we use these definitions, we can actually measure the utilities
  | of attending to things.  The streams of values that people use,
  | moment-by-moment, to decide how to be.

Here is how we might measure attention utility. Imagine we strike a deal with the baseball player, and now he knows that every ball he catches will earn him an extra $3. This will increase the appealingness of catching balls in general. And when he sees a particular ball, the utility of catching it will be greater than it would have been without the $3:

                      /   Utility(catch + $3) = .8
              __|__  /    Utility(duck)       = .5
             [ o__o] -    Utility(scratch)    = .4
              \_==_| \    Utility(...)        = .2
                |     \   Utility(...)        = .4
                 -E
             Baseball     Attention          Attention
             Player       Opportunities      Utilities

Money attracts our attention. We can use this fact to measure the attention utility of tasks. Suppose we instead pay the baseball player $20 per minute to scratch his buttocks:

                      /   Utility(catch)         = .7
              __|__  /    Utility(duck)          = .5
             [ o__o] -    Utility(scratch + $20) = .8  <- New best
              \_==_| \    Utility(...)           = .2
                |     \   Utility(...)           = .4
                 -E
             Baseball     Attention              Attention
             Player       Opportunities          Utilities

With enough money, we can change the player's observable behavior: we can make him scratch his buttocks even when a ball is thrown to him. And if we can identify the exact amount necessary to change his behavior, we can quantify the utility difference between scratching his buttocks and catching the ball.
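
One way to find that amount is a bisection search over payments. The sketch below is only an illustration under stated assumptions: `prefers_scratching` is a hypothetical stand-in for observing the player's real behavior, it assumes his choice flips at a single hidden threshold, and the $12.50 threshold is invented for the demonstration.

```python
def find_switch_payment(prefers_paid_task, low, high, tolerance=0.01):
    """Bisect for the smallest payment at which behavior switches.

    Assumes prefers_paid_task(low) is False, prefers_paid_task(high) is
    True, and that the choice flips exactly once between the two.
    """
    if prefers_paid_task(low) or not prefers_paid_task(high):
        raise ValueError("the switch point must lie between low and high")
    while high - low > tolerance:
        middle = (low + high) / 2
        if prefers_paid_task(middle):
            high = middle  # this payment is enough; try a smaller one
        else:
            low = middle   # this payment is too small; try a larger one
    return high


# Hypothetical player. In a real experiment this would be an observation
# of behavior, not a function with a known threshold.
HIDDEN_THRESHOLD = 12.50  # invented value, in dollars per minute


def prefers_scratching(payment_per_minute):
    return payment_per_minute >= HIDDEN_THRESHOLD


estimate = find_switch_payment(prefers_scratching, low=0.0, high=20.0)
print(f"behavior switches at about ${estimate:.2f} per minute")
```

Real behavior is noisy, so a single clean threshold is an idealization; a practical version would repeat each offer many times and estimate the switch point statistically.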

Measuring Attention Utility with Computers

We can take this idea to the extreme with computers that track our attention, methodically reward or penalize us for it, measure the changes in our behavior, and automatically repeat the experiment thousands of times with thousands of people. Computers let us measure aspects of Attention Economics that were not feasible before.

  | Say that this is our primary contribution: doing this shit with
  | computers.  It's what makes this all totally new.  Unlocks new
  | capabilities for us.  Prior preference-elicitation techniques
  | were limited... and we'll explain this more in chapter 4.

For instance, recall the CAPTCHA experiment:

This experiment shows we can quantify aesthetics, in dollars and cents, for any design. We can put numbers to feelings. We can create an economic measurement of the artistic side of life. This is the attention utility of aesthetics, and we can measure it for the first time.
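
As a rough sketch of how such a number could be read off the data, suppose we record what share of offered tasks users complete at each pay level, for each version of the interface. The data below are invented for illustration (they are not the experiment's results), and linear interpolation is an assumption of this sketch rather than the analysis used in the experiment. The function name `interpolate_pay` is likewise hypothetical.

```python
def interpolate_pay(curve, target_rate):
    """Interpolate the pay (in cents) at which a curve reaches target_rate.

    curve is a list of (pay_cents, completion_rate) pairs sorted by pay,
    with completion rates that rise as pay rises.
    """
    for (pay_lo, rate_lo), (pay_hi, rate_hi) in zip(curve, curve[1:]):
        if rate_lo <= target_rate <= rate_hi:
            if rate_hi == rate_lo:
                return pay_lo
            fraction = (target_rate - rate_lo) / (rate_hi - rate_lo)
            return pay_lo + fraction * (pay_hi - pay_lo)
    raise ValueError("target rate is outside the measured range")


# Invented data: share of offered tasks completed at each pay level.
plain_curve = [(1.0, 0.30), (2.0, 0.50), (3.0, 0.70)]
ugly_curve = [(1.0, 0.20), (2.0, 0.40), (3.0, 0.60)]

reference_pay = 2.0
plain_rate = dict(plain_curve)[reference_pay]

# Pay at which the ugly version attracts as much attention as the plain one.
matching_pay = interpolate_pay(ugly_curve, plain_rate)
ugliness_cost = matching_pay - reference_pay
print(f"ugliness costs about {ugliness_cost:.1f} cents per task")
```

The extra pay needed to bring the ugly version back up to the plain version's completion rate is the quantity reported above as the attention utility of the ugliness, with its sign flipped.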

Utility is a summative metric that aggregates all factors affecting a user's choice into a single number:

                               Utility
                                  |    
                          _______/ \_______
                         /                 \
                      Efficiency       Beauty
                      Speed            Clarity
                      Fun              Learnability
                      Satisfaction     Reliability
                      Social Reward    Intuitiveness

This lets us compare different aspects of an interface. We can quantify tradeoffs between efficiency and aesthetics, for instance. We can see how much different factors matter to users on the ultimate scale of utility.

This might feel overly reductionist: quantifying human experience in a single number. After all, the field of economics has been attacked for doing precisely this, using an overly constrictive assumption of rational, goal-driven behavior. However, we are proposing a new type of economics, with new assumptions, and a new language for interpreting data. Our economics avoids many of the problems and limitations of standard economics.

A New Economic Model

      In this world, the scarce resource is not allocated by your mind,
      but is thought itself.

Attention economics has fundamental differences from the standard economic paradigm, which is called Neoclassical Economics. The neoclassical paradigm assumes rational, self-interested behavior to achieve optimal outcomes in life.

  "[Neoclassical Economics'] fundamental assumptions include the following:
   1. People have rational preferences among outcomes.
   2. Individuals maximize utility and firms maximize profits.
   3. People act independently on the basis of full and relevant information."
   - E. Roy Weintraub, in The Concise Encyclopedia of Economics

To understand the distinction between neoclassical and attention economics, consider that the actions we take in life often fail to live up to the goals we declare. For instance, one might think “I should be athletic and fit”—a desirable outcome. Yet when in front of a treadmill, the prospect of swiping through Facebook messages might somehow have a higher attention utility. Similarly, one might decide that they want to be more social, but in a social situation pull out their iPhone. I want to finish this dissertation, but sometimes browsing articles on bitcoin is more appealing. Even though an outcome might provide us a high degree of outcome utility, when it comes to the moment of truth, the necessary actions might occur to us with a different attention utility.

  | Try calling outcomes "material" outcomes, or "commoditized"

These situations illustrate a difference between evaluations of utility in the moment and evaluations of utility on outcomes. The momentary attention utility is psychological. It depends on one's current mood, state, and how opportunities are phrased. It depends not just on what they are, but how they occur.

This is the difference between these two versions of economics. Consider this model of life:

     ...............           ...............           ...............
    .               .         .               .         .               .
    .               .         .               .         .               .
    . How Opportun- . ----->  .   Actions     . ----->  .   Outcomes    .
    . ities Occur   .         .   We Choose   .         .   We Obtain   .
    .               .         .               .         .               .
     ...............           ...............           ...............
                                                                ^
                                                                |
                                                            Utility()
                                                         of economic man
                                                          defined here

Standard economics evaluates outcomes. Man is assumed to choose actions that optimize his utility function, as applied to the outcomes of his actions. Economics focuses on ends over means.

This model does not capture the complexities of real people. Neoclassical economics assumes people can deduce their optimal actions with full rationality. But real people, it turns out, have bounded rationality. Their attention is bounded. They cannot think forever, and they cannot compute the optimal actions, nor the optimal outcome.

To make sense of real data, behavioral economists have incorporated many heuristics and biases into this rational model of man trying to achieve optimal outcomes. This has made the model more complex.

Shifting Focus, like Copernicus

This reminds me of Copernicus' discovery that the earth revolves around the sun, rather than the other way around.

I do not doubt that, in his day, it was appealing to think of the earth as the center of the universe, a stable base with man standing on top of it all. When man looked up at the sky, it must have felt as if the cosmos revolved around him, in perfect concentric circles. That model must have felt close to god.

However, if you start to actually measure, chart, and quantify the planetary bodies from this perspective, their motions appear to be distorted:

  | http://www.sciencewriter.net/ncd/r.htm

Sometimes planets go backwards. Sometimes they swirl around a line. Sometimes there are swirls within the swirls, and sometimes swirls within those. Astronomers had to account for this data with a complex set of retrograde motions and epicycles layered upon epicycles. These complexities only increased as telescopes and charting techniques improved, uncovering more distortions in the idealized orbital lens. Take, for instance, the numerous parameterized gears required for an early Galilean planetary model:

  | William Pearson, in Abraham Rees ed., The Cyclopaedia; or,
  | Universal Dictionary of Arts, Sciences, and Literature, Plates,
  | Vol. iv (London 1820).

Similarly, when we look at real human data from the standard economic perspective, there appear to be many distortions from the idealized rational lens. People seem to make suboptimal decisions. To account for these, behavioral economists have added a growing list of heuristics and biases to our model of man. The theory is that we can consider man to be somewhat rational, if we remember to adjust for the fact that his optimizations are loss-averse, easily primed, emotionally driven, and myopic, to name just a few. In fact, the Wikipedia page for “List of Cognitive Biases” references 169 separate “systematic deviations from a standard of rationality” that we should remember. The standard economic model becomes very complex when confronted with reality.

A New Focal Point

In order to obtain a simpler model of the planets, Copernicus first had to change the lens with which we saw the world: he had to move the center of the universe from the earth to the sun. Only from this perspective could Kepler later formalize the simple elliptical model we use today, which Newton would explain with universal gravitation.

The economic model of rational utility may be in a similar situation. Economic utility also has an implicit focus: on goods, services, and the general outcomes of human action. This may be creating unneeded complexities, and preventing us from discovering new fundamental laws of human nature.

  | Maybe I should add an explicit hint at a possible conclusion:
  |  - If we had a simpler model of economics, we could...
  |  - Economics created social philosophy in ways x,y,z (free market)
  |    ... and perhaps a new, simpler model could...

Neoclassical economics evaluates utility on outcomes:

                               ...............           ...............
                              .               .         .               .
                              .               .         .               .
                              .    Actions    . ----->  .   Outcomes    .
                              .   We Choose   .         .   We Obtain   .
                              .               .         .               .
                               ...............           ...............
                                                                ^
                                                                |
                                                             Utility()
                                                           defined here

As a result, economics becomes focused on outcomes, even though there are other grounds upon which to define human behavior.

Even Behavioral Economics, which proves that people are irrational, does so from a lens focused on outcomes: a typical experiment will prove that subjects have not optimized their outcomes by the end of the experiment.

Rationality is a theory that implies what people should do optimally—it is a theory of what should be. Irrationality, and bounded rationality, are theories that point out the holes in rationality—they are theories of what is not. Science needs a good theory of what is. I believe our attachment to outcome orientation is holding us back from finding a parsimonious theory of what is.

If we are to move the focal point, then where to? Perhaps we should shed light on what vexes us. The neoclassical model fails when our rationality is bounded, when our attention for optimal choice is limited. It fails in our psychology.

I propose a new focal point for economics: inside of the human mind. We will evaluate utility as it occurs to the human. This new utility function is the utility of attention.

     ...............           ...............           ...............
    .               .         .               .         .               .
    .               .         .               .         .               .
    .   Attention   . ----->  .    Actions    . ----->  .   Outcomes    .
    . Opportunities .         .   We Choose   .         .   We Obtain   .
    .     Occur     .         .               .         .               .
     ...............           ...............           ...............
            ^                                                   ^
            |                                                   |
   Utility Moved to Here                                  ...from Here

My model may not predict the production and consumption of traditional goods and services as accurately as standard neoclassical economics, but it will predict where people put their attention.

Unlike with the Copernican shift, I do not advocate switching entirely to this new lens. I believe that neoclassical utility is complementary to attention utility, and that we should use the two together as reference points. Both models are idealized.

Even though the neoclassical paradigm is the dominant standard today, it is in fact only one particular set of agreed-upon assumptions and language, chosen from an evolving history of alternatives that we can revisit at any time. Neoclassical economics took shape in the late 1800s and rose to dominance in the 1900s. There are other ways to theorize about human choice and value. We can make a new economic metatheory. Attention economics allows us to measure something important for the future of humanity: attention.

Neoclassical economics cannot robustly measure attention as a resource, because whereas economics models how resources are allocated by your mind, attention is thought itself. Attention must be allocated to allocate attention. Chapter 2 will explain how attempting to model attention within neoclassical economics results in a homunculus theory, which infinitely recurses inwards upon itself.

My purpose in this dissertation is to derive a measurable economics of attention. When I say economics of attention, I mean that we want to know the utility of attention choices—not just the quantity of choices made. And we want to measure these empirically, not just theorize on their possible existence in the abstract. My goal here is to inspire others to use these measurements in their work.

  | Chapter 3 explains some measurement methods, and how some
  | scientists used them to answer new attention economic questions in
  | other fields.

This would help many fields

  | Warning: This section's language is still rough.

A measurable attention economics would be relevant to many fields. Chapter 4 will go into further details, but let us get a taste for them first—something concrete to flavor the upcoming abstractions of Chapter 2.

  | It sits between them right now.  It has been understood either as:
  |  - Attention without Economics (without quantifiable utility)
  |  - or Economics without Attention

Authors from a variety of fields have written entire books about the importance of attention economics to their fields, but have not measured it. In Business, Davenport and Beck wrote The Attention Economy: Understanding the New Currency of Business. In Rhetoric, Richard Lanham wrote The Economics of Attention: Style and Substance in the Age of Information. In Film and Visual Studies, Jonathan Beller wrote The Cinematic Mode of Production: Attention Economy and the Society of the Spectacle. Similar books have been written in Marketing (e.g., Ferzini) and other fields. A number of scholarly articles have also been written on Attention Economics, such as Goldhaber's The Attention Economy and the Net in First Monday, and Georg Franck's Civic Planning perspective in The Economy of Attention. However, these investigations are qualitative. Each of these works identifies the changing environment of attention economics as having an impact on its field. They raise awareness of attention economics, but still lack a quantitative, measurable way of looking at the Economics of Attention. As Goldhaber notes, “What does it mean then that none of us are professional economists?”

  | These quotes blow.  I should fix them.
  | Let us examine Goldhaber, for instance.  In his 2006 critique of
  | Lanham's work, he expresses a distrust of standard economics, but
  | laments the lack of attempt by his colleagues at trying to derive
  | attention economic versions of core economic phenomena, such as:
  | "minimax principles; satisficing strategies; concepts of
  | efficiency (obtaining the most attention for the least attention
  | put in, for instance); relative, if not absolute, growth (of
  | audiences, say); international and intercultural effects; notions
  | of advanced versus underdeveloped attention economies; etc."  He
  | is curious about the possibility to "mathematize" attention
  | economics: "If attention economics were mathematizable, it would
  | certainly require a rather different mathematics."
  | Goldhaber critiques Lanham's work on its inability to investigate
  | attention economics with rigor:
  |
  |  "Given the emphasis on style, rhetoric and design, one might take
  |   the first word in the title, “economics” as meant only
  |   metaphorically, and thus see the whole work as simply a broad
  |   attempt at artistic and literary criticism. But Lanham seems to
  |   intend that the title be taken literally. It is not a promise he
  |   keeps."
  |
  |   - Michael Goldhaber, 2006.  How (Not) to Study the Attention
  |     Economy: A Review of The Economics of Attention: Style and
  |     Substance in the Age of Information
  | Davenport and Beck's "The Attention Economy" /does/ propose
  | an attention measurement tool called the /AttentionScape/: a
  | self-report survey method, wherein subjects create a list of items
  | they recall attending to recently, rate each item's agreement or
  | disagreement with 6 categories of attention, and then project the
  | 6 categories down to a 2-dimensional space of /back/ vs. /front of
  | mind/ attention and /captive/ vs. /voluntary/ attention.  This
  | measurement tool appears to have garnered interest--the top-rated
  | critical review of the book on Amazon reports: "The most
  | interesting part of the book is the proposed measurement model."
  | [] However, this tool relies on many assumptions--the accuracy of
  | a subject's memory, their honesty, their ability to reflect--and
  | still does not measure the /value/ of attention.  Given the
  | widespread recognition of the importance of attention economics,
  | and the awareness by these authors that there is /something/
  | unknown in the formal mathematizing of its economics, I imagine
  | that a measurable attention economics could be quite illuminating.
  | From a 3-star review by Donald Mitchell "Jesus Loves you!" on
  | Amazon.com entitled "Good Summary of Latest Research and
  | Measurement Model"

The field of Psychology studies the human capacities of attention (e.g., parallel visual processing, multi-tasking), and disorders of attention allocation (e.g., ADHD and procrastination). However, the field has yet to study the Economics of attention allocation. For example, how much money would it take to convince an ADHD child to focus on a task? From an attention economics perspective, people do not simply have short attention spans; they have different response curves to degrees of attention utility. An ADHD person will still pay attention to very important, high-utility opportunities. If we could measure these responses to utility (the economics of attention), we could add a very important dimension to ADHD studies in psychology. At the end of this dissertation I propose how to create an Index of Attention Span that lets us quantify the value curves of attention for different people with varying forms of ADHD. I believe a method and theory for measuring the economics of Attention could help in at least two ways. (1) We could build better diagnostic tools to measure expressions of ADHD with economic fidelity. (2) We could learn and benchmark how to design environments that are better able to capture attention for people with ADHD.

The field of Economics, on the other hand, is hamstrung by an inability to cope with the bounds of attention, which Herb Simon refers to as bounded rationality. Neoclassical economics—the dominant economic paradigm since the mid 1900s—mathematically predicts economies from an assumption that man is rational: that he can think for an unbounded length of time, making full use of information, to determine optimal decisions in allocations of resources under constraints. In other words, economics assumes that man's attention is unlimited. In the upcoming chapters, I will argue that attention economics can be formalized into a complementary perspective to neoclassical rational economics, that understands the limits of attention by making them a fundamental assumption of the theory.

  | XXXX Add that this would be a nanoeconomics, as hal varian has
  | been asking for.

The field of neuroeconomics—combining psychology and economics—would seem to be perfectly situated to begin a study of Attention Economics—the economics of neural processing. But neuroeconomics still needs a measurable model of how people allocate neural attention resources. This neuroeconomics survey article explains:

  "Neuroscience is shot through with familiar economic language
   --delegation, division of labor, constraint, coordination, executive
   function -- but these concepts are not formalized in neuroscience
   as they are in economics. There is no overall theory of how the
   brain allocates resources that are essentially fixed (e.g., blood
   flow and attention).  An “economic model of the brain” could help
   here."
   – Neuroeconomics: How Neuroscience Can Inform Economics
     by Colin Camerer, George Loewenstein, and Drazen Prelec

If we could measure the utility, or priority, of attention objects, then perhaps we could analyze how the brain differentially allocates neural resources to those targets under different situations.

I am personally interested in the effect of attention economics on Politics. Politics are shaped and limited by the bounds of voters' attention. Politicians, for instance, will focus on issues that citizens pay attention to. Natural experiments have shown that people's choices in television programs affect their votes. For instance, when Fox News becomes easier for a town's citizens to watch, more of them vote Republican. People find some topics unappealing to pay attention to, even if they are important. Many television news sources, for instance, dedicate more airtime to celebrity gossip than to foreign affairs. Is this an appropriate allocation of attention? If we could quantitatively measure the utility that viewers perceive in attending to different topics, and run scientific studies manipulating variables in the programs and measuring the results, we could begin to understand why we pay attention to the topics we do, whether we should do something to change it, and what we might do.

  | Social sciences study racism & stereotypes, which are an artifact
  | of limited attention

These are just some of the questions that a Measurable Attention Economics could help us answer. However, as for me personally, I discovered Attention Economics because of its enormous impact on computing.

Attention Economics of Technology

The attention economy has a special relationship with technology. This relationship occurs in three ways:

  1. Technology changes our patterns of attention.
  2. Technology can be evaluated in terms of its Attention Utility.
  3. And finally, technology lets us measure these Attention Utilities.

In other words, even though Technology poses risks to our attention, we can learn how to make it better by measuring its Attention Utilities, and in order to measure them, we need to use Technology.

Indeed, I discovered this need for a measurable attention economics while designing and evaluating technology for human-computer interactions. Chapter 4 will argue that our field of human-computer interaction should place attention economics at the center, and transform itself into an understanding of human-computer systems.

Let us now examine the three interactions between attention economics and technology.

  | XXX Restructure this as "it applies to tech/HCS" and "measured by
  | tech/HCS"

1. Technology changes our patterns of attention

We stay up later with electric lighting, which provides us with more hours in a day to attend with. Cars enable us to live farther from our origins, and as a result, we devote less attention to our families. Telephones, however, decrease the difficulty, or attention cost, of speaking with these now-distant relatives, and let us attend to them more. Facebook expands upon this trend. These changes to the ease of attending to some objects come with a decrease in attention to other objects.

As another example, our national attention is shaped and scoped by the media. Television news makes it easy to attend to the topics it broadcasts, and we attend less to other topics. The media, in turn, choose to broadcast objects that society attends to, creating a feedback loop of attention choice, driven by technology. Technology changes our attention by making objects easier or more appealing to attend to, and, in turn, is designed in response to what we choose to attend to.

  | So you can see that attention is like a currency.  Technology
  | decreases costs of attention, and this changes economic allocation
  | patterns of attention, just like supply and demand in traditional
  | neoclassical economics.  Technology is interacting with a hidden
  | economic system of attention allocation... but since there is no
  | currency placed on attention choices, this is an economic system
  | that we cannot measure.

Technology seems to have a magnifying relationship with ADHD. Many ADHD children have trouble paying attention to textbooks, but not television or video games. Parents, in response, limit “screen time.” Internet services cater to micro-slices of attention: tweets, status updates, and photographs. Modern television programming is flashier than it was. It seems that modern technology has produced a vast market demand for tasks of micro-attention slices.

In these ways, technology shapes the flows of societal attention, alternately magnifying and alleviating attention issues.

  | We can also see this in mechanical turk.  A market explicitly
  | paying people for micro-slices of their attention.  Here we can
  | begin to see the first measurable attention economics.  But turk
  | is a very constrained world.  Only 40,000 people.  What are the
  | attention economics of the rest of the world?  Are they insidious?
  | Like an ADD child?  Or are they actually relatively rational?
  | Could we have a debate about this?  We would need to measure the
  | economics of attention.

2. Technology is EVALUATED by its ability to recruit human attention

Because the internet is fueled by human attention, websites must compete for it, and therefore evaluate themselves on their ability to recruit it. Entire information industries, such as journalism and encyclopedias, are losing to competition by crowds, such as Twitter and Wikipedia, that produce content for free. The websites behind these crowds compete against one another to recruit the human attention necessary to produce their information products. The firms of the internet are these human-computer systems, and they must measure their “financial” bottom line in terms of attention.

As a result, the evaluation criterion for human-computer systems (and, by extension, much of the Human-Computer Interaction field) is now human attention. Website designers measure hits, page views, ad views and clicks. They try to increase the number of posts and comments contributed by users. Facebook reports its number of daily active and monthly active users. These are measures of human attention.

When developing a new feature, a designer will often A/B test it to verify that it increased the conversion rate—a measure of attention. (They especially seek attention to the task of entering a credit card number.) The dominant revenue source for the web is advertising—the explicit transformation of human attention to cash. Thus, even the real financial bottom line for internet companies is derived from production in the attention economy. If web companies had a scientific measure for the degree of utility, and thus motivation that their tasks attract, they could use this to benchmark and evaluate their financial bottom line.

Scientists and historians could also use attention economics to study human-computer systems. Consider that a human-computer system is only feasible if it attracts more attention than it spends. By measuring the costs and values of tasks and user interfaces, I envision that we could measure this feasibility point for different systems. For instance, internet historians have reflected that social networking sites only became feasible once broadband and digital photography reached a threshold of adoption across society, making it easy enough for people to upload and view photographs online. Wikipedia became feasible by making it easy for one to contribute an article, rather than spending years of one's life to establish known expertise in a field and create a trusted relationship with an encyclopedia company. Furthermore, rather than require editors to write complete, flawless articles, Wikipedia allows many people to contribute micro-slices of attention to collaboratively optimize articles. By requiring less attention, Wikipedia can recruit much more attention. In other words, as the cost of attending decreases, the supply of attention increases. This is a traditional economics, but it is not about money. It is about the cost and value of the attention itself. This economics is currently invisible. Measuring the invisible will become a core theme of this dissertation. We need a way to measure these economics of attention.

3. We can use technology to measure attention economics

The rest of this dissertation explains how we can use technology to measure attention economics. Specifically, in Chapter 3 I will describe methods for measuring attention economics with human-computer systems, by instrumenting tasks with economic payments.

In summary, attention economics is on the one hand a product of technology. On the other hand, it can be used for the evaluation of technology. And on the third hand, we can measure attention economics with human-computer systems, and apply the measurements to other fields of inquiry.

Excelsior! (Onwards!)

[Note, I am currently working on revising this description of the thesis argument. Sorry!]

This dissertation thus makes human-computer systems and technology a prime focus; it looks at attention economics with and for human-computer systems.

  | •My thesis is that we should pursue a measurable attention economics.•
  | "We should adapt the economic utility function to a place in the
  | human mind, in the present, to measure attention."
  | •My thesis is that we should measure attention with a utility
  | function in the present moment of the mind.•
  | •My thesis is that we should evaluate utility in the present
  | moment of the mind.•
  | •My thesis is that utility happens in the present moment of the
  | mind.•
  | •My thesis is that attention utility exists in the present moment
  | of the mind.•
  | •My thesis is that attention utility exists in the present moment
  | of awareness.
  | Attention Economics is the values that people place on
  | opportunities for attention in the present.
  | •We should measure values that people place on opportunities for
  | attention in the present.•

This dissertation is organized into three chapters to prove three legs of a top-level thesis.

Top-level thesis: We should measure attention economics.

Subtheses for each chapter:

  1. The economics of attention exists in the present moment of awareness. (The what.)
  2. We can measure it by embedding money amounts into the fast-thought judgement of choices. (The how.)
  3. Attention utility explains many aspects of life. (The why.)

You can think of these subtheses as the what, the how, and the why. I have allocated one of the next three chapters to each of these questions.

To explain what a measurable attention economics might look like, I propose a new model for measurable attention economics and show how it compares to neoclassical economics. To satisfy you that it is possible to measure attention economics, I have run experiments and made prototypes to develop ways to measure attention economics. And to answer the why, I show how a measurable attention economics relates to a set of disparate fields (economics, psychology, design, computer science and philosophy of values), providing a common language, and sketching a set of exemplary experiments that one might run to answer research questions in those fields. I hope these explorations will satisfy you that it is important and possible to measure attention economics.

Note that my thesis is not falsifiable. Neither is neoclassical economics. Although we are accustomed to falsifiable theses in the sciences, the theses of neoclassical and attention economics are metatheory:

   "Neoclassical economics is what is called a metatheory. That is, it
    is a set of implicit rules or understandings for constructing
    satisfactory economic theories. It is a scientific research
    program that generates economic theories. Its fundamental
    assumptions are not open to discussion in that they define the
    shared understandings of those who call themselves neoclassical
    economists, or economists without any adjective."
   – E. Roy Weintraub, in The Concise Encyclopedia of Economics

Science itself is a metatheory—scientists reason about theories in its terms.

Thus, to evaluate this thesis, please do not look for a falsifiable theory, but rather a language and set of understandings for constructing falsifiable theories.

Furthermore, I do not claim that my model, methods, or reasons are the complete and final word on the what, how, and why of attention economics. Neither was Adam Smith's classical depiction of economics the final word—his version did not even include the ideas of utility, nor selfishness, for instance, which we now take as basic principles.

Rather, this dissertation provides the initial foundation of a measurable attention economics. We should continue to refine these models and methods, as we gain experience employing them. One cannot prove a metatheory with pure reason. I instead invite the reader to judge my thesis by examining whether he or she feels that a measurable attention economics is feasible and worth pursuing.

WHAT do we mean by Attention Economics

Chapter 2.

  Life is finite.
  You can only attend to so much before you die.
  Attention is the ultimate human resource.

In order to measure attention economics, we need to know what it is we are measuring. This chapter will go deeper into the question of “what” a measurable attention economics is, expanding on the informal descriptions you just read in the introduction.

  | TRAVIS NOTES:
  |
  | This is a nano-economics.
  |
  | Two main sections in ch 2:
  |   - What is attention (quantities, qualities, and so forth).
  |   - How is it allocated (the cognitive process by which attention is
  |     allocated)
  |     - this is what fast thought is
  |       how attention allocates itself
  |       how people learn attention utility over time
  |       fast thought escalating to slow thought
  |
  | Change the long arguments and logic to assumptions.  Find references
  | if needed, but say "This is an assumption, and it's reasonable for
  | these reasons."
  |
  | Ch 3. Given that this is how it's operating, here are some things we
  | think we can measure.
  |   1. How do we model this?
  |   2. Given that this is some modeling, here's a method that operates
  |      according to that model

Attention economics is a Theory of Human Action, like Lucy Suchman's situated action, and normal (aka “neoclassical”) economics. A theory of action seeks to explain and predict people's behavior in a set of circumstances. HCI depends upon theories of action, such as Fitts' law, to predict people's behavior with computers. Using this knowledge, we can build better user interfaces, develop methods for evaluating them, and better understand people's behavior for its own sake.

But of all theories of action, economics is of a particular type: it is a theory of action based on value. Economics introduces a new abstraction of value, called utility, and defines man in some way as seeking and optimizing for that value, or utility.

Introducing value as a concept is very useful for theories that seek to predict man's choice, or consider man as having a free will of some sort. Consider, on the other hand, Fitts' law and much of Cognitive Psychology (e.g. studies of memory limits, learning, attention limits). These theories characterize man's capabilities more than his values. They help us understand what the machine of man can do, but not what he will choose to do. I believe HCI is in need of a theory of value, and choice—a theory of economics.

Why should HCI understand choice? Because to design for humans, one must grapple with free will. I see free will as a core distinction between man and machine in human-computer systems. You can program computers to do anything you want, but humans, on the other hand, must choose of their own volition to complete any tasks you provide them. Your system will only succeed if users choose your tasks and interfaces. Attention economics is the theory for understanding which of a designer's tasks a person will choose to attend to—and complete—in different circumstances. I call it Attention Economics because, like traditional (“neoclassical”) economics, I believe that the concept of value, or utility, is an excellent device for modeling human choice. Attention Economics also uses utility.

However, attention economics is not completely like traditional economics. It also introduces a new concept into its core assumptions: attention. Whereas traditional economics assumes that there exists a utility, attention economics further assumes that there exists a concept called attention, and that there is a utility of attention. I call this attention utility for short.

  | This is a "Positive Economics."  It says "what is," not "what
  | should be."

The distinction of Attention Economics

Attention utility is different than neoclassical utility. This difference is the core distinction between neoclassical and attention economics:

This changes how they interpret data. For instance, if someone buys an apple instead of a banana:

If a neoclassical grocer were to observe this data, he might conclude “people like apples, I should order more.” An attention-oriented grocer, on the other hand, might conclude “people do not find the bananas attractive; I should improve their display.”

The distinction of attention economics is thus in the interpretation: they are two different lenses with which to see data and explain events. They interpret data differently because they assume different models of the world. Although the models differ, I see them as complementary rather than mutually exclusive. I believe a holistic theory of action should leverage both models of man, and will suggest how to do so in the next chapter.

  | INCLUDE THIS? You can think of these utility functions as having
  | different type signatures.  What does the parameter to U()
  | type-check to?  Is it an outcome, or is it an opportunity in the
  | present?  Ultimately, both describe the same situations in
  | reality, but they use different data structures and modes of
  | processing to interpret them.
  | Whereas neoclassical utility is an evaluation over a state of the
  | world that results from actions -- specifically, a "bundle of
  | goods and services that are consumed" in that state -- attention
  | utility is an evaluation of an opportunity, in the present moment,
  | for a person to attend to, in the present.  Whereas the
  | neoclassical model chooses the optimal set of actions to optimize
  | his outcomes in life, the attention economic model of man chooses
  | the opportunity for attention with optimal utility, in the
  | present, at every moment of his life.
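
One way to make the two lenses concrete is to write the two utility functions with different type signatures. The following is a minimal sketch, not a definitive model: the names Bundle, Opportunity, choose_bundle and choose_opportunity are hypothetical, and the numbers in the grocer example are invented purely for illustration. The neoclassical function takes an outcome (a bundle of goods consumed), whereas the attention function takes an opportunity that is present right now.

```python
from dataclasses import dataclass
from typing import Callable, Mapping, Sequence

# Neoclassical lens: utility is evaluated over an OUTCOME,
# here a bundle of goods consumed (good name -> quantity).
Bundle = Mapping[str, float]
NeoclassicalUtility = Callable[[Bundle], float]


@dataclass(frozen=True)
class Opportunity:
    """Hypothetical type: an object a person could attend to right now."""
    label: str


# Attention lens: utility is evaluated over an OPPORTUNITY in the present.
AttentionUtility = Callable[[Opportunity], float]


def choose_bundle(u: NeoclassicalUtility, feasible: Sequence[Bundle]) -> Bundle:
    """Pick the feasible outcome with the highest utility."""
    return max(feasible, key=u)


def choose_opportunity(u: AttentionUtility, present: Sequence[Opportunity]) -> Opportunity:
    """Pick the currently available opportunity with the highest utility."""
    return max(present, key=u)


# Illustrative utilities with made-up numbers (assumptions, not measurements).
def likes_apples(bundle: Bundle) -> float:
    return 2.0 * bundle.get("apple", 0.0) + 1.0 * bundle.get("banana", 0.0)


def display_appeal(opportunity: Opportunity) -> float:
    appeal = {"apple display": 0.8, "banana display": 0.3}
    return appeal.get(opportunity.label, 0.0)


best_bundle = choose_bundle(likes_apples, [{"apple": 1.0}, {"banana": 1.0}])
best_opportunity = choose_opportunity(
    display_appeal,
    [Opportunity("apple display"), Opportunity("banana display")],
)
```

Both calls explain the same purchase of an apple, but the first attributes it to a preference over outcomes and the second to the appeal of what was in front of the shopper at that moment.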

From Microeconomics to Nanoeconomics

Additionally, I view attention economics as a nanoeconomics, according to Hal Varian's view:

  "Microeconomics offers a black-box description of consumer and
   producer choices.  Nanoeconomics looks inside the black box to
   describe the business processes of firms and behavioral economics
   of individuals."
   —Hal Varian (in personal communication)

I see attention economics as a nanoeconomics in two ways. First, it is a theory for the insides of a person or human-computer system. (I see human-computer systems as organizations of people, with information processes, analogous to a firm with business processes in Varian's view.) Second, it is a theory applied to a smaller scale of choice than traditional microeconomics.

  | I just got this quoted definition, and need to refine it to
  | something more like:
  | 
  |  "there is a
  |   Macroeconomics, studying entire markets, then there's a
  |   Microeconomics, studying the transactions amongst actors
  |   (individuals and firms), and a Nanoeconomics studies the
  |   behaviors within an actor or firm.
  | 
  | ...or else change the following to better fit this definition.

Whereas microeconomics applies to the level of the transactions of goods and services amongst individuals, attention economics applies to the moment-to-moment choices of attention within individuals. Consider that the average American buys or sells fewer than perhaps 100 items per day (food items, gas, etc.) but makes thousands of nanodecisions (choosing to check email, look outside, whether to take another bite of food, etc.) that do not involve transactions. Because these decisions involve no transactions, microeconomics has no data with which to measure them. Attention economics, with the aid of human-computer systems, will measure and model humans at this nanoeconomic level.
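
The gap in scale between transactions and nanodecisions can be checked with back-of-envelope arithmetic. The sketch below rests on assumptions that are not measurements: one choice every 3.6 seconds (a stand-in for "every few seconds"), a 16-hour waking day, and the upper bound of 100 transactions per day mentioned above. The function name nanodecisions_per_day is hypothetical.

```python
SECONDS_PER_HOUR = 3600


def nanodecisions_per_day(seconds_per_choice: float, waking_hours: int) -> int:
    """Estimate the number of attention choices made in one waking day."""
    return round(waking_hours * SECONDS_PER_HOUR / seconds_per_choice)


# Assumptions for illustration only.
SECONDS_PER_CHOICE = 3.6      # "a choice every few seconds"
WAKING_HOURS = 16             # assumed length of a waking day
TRANSACTIONS_PER_DAY = 100    # upper bound on daily purchases and sales

choices_per_day = nanodecisions_per_day(SECONDS_PER_CHOICE, WAKING_HOURS)
choices_per_transaction = choices_per_day / TRANSACTIONS_PER_DAY
```

Under these assumptions a person makes 16,000 nanodecisions per day, or 160 for every transaction, even when the transaction count is taken at its generous upper bound.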

  | At the nanoeconomic level, for instance, I am choosing to leave my
  | office to choose a way to transport myself to a restaurant that I
  | am choosing to have lunch at, and I choose the way that I eat my
  | food--but only once did I purchase anything.  There are perhaps
  | 1,000s of such micro-choices per day: if we model a choice as
  | occurring every few seconds, then there are 1,000 per hour, or
  | 16,000 per waking day of 16 hours.  This is the level at which
  | much of the interesting attention economics happens.
  | For comparison, I only make one to five microeconomic purchasing
  | decisions per day.  These are far fewer decisions.  If we are to
  | build a theory of action from purchasing data, we will have far
  | less data with which to build a model.

The purpose of this chapter is to define the new, unique features of attention economics. First, let us define what we mean by attention as a quantity, how we choose to allocate it, and what the economics of it means.

What is attention?

  | NOTE!  I need to introduce this section as DEFINITIONS.  These are
  | all definitions of terms.  E.g. I am not claiming that psychology
  | works this way, I am just defining terms that we will use next in
  | the MODEL.  If I have made claims in this section, I need to
  | identify them and take them out!

This section defines attention: the substance that our attention utility function evaluates.

Right now, I am writing you a message. You are reading it. You are flowing with my thoughts. I have your attention.

Focused attention is how tasks get done. You must attend to this paper to understand it. An organization of people needs each person giving attention. When we build websites and human-computer systems we want our users' attention. We want them to use our systems. Most websites want more users. We want lots of attention.

Attention is our Processing

Attention is our neurons firing on a task or object. The more neurons, the more attention. The longer they focus on the task, the more attention. We want as many of our neurons as possible. Sometimes we get multiple people attending to a task. That gives us even more neurons.

When we measure attention, we will want to quantify it. We cannot easily count the activity of neurons, so we will approximate attention with the unit of the person-hour: the attention that a single average person can generate, or allocate, in an hour.

Attention has a QUALITY

However, some people, and some neurons, are better than others, for some tasks. Mathematicians are better at math. The visual processing system of the brain is better at visual processing. Some attention is of higher quality than other attention.

We could put a mathematician in the history department, and the brain does re-allocate subsystems for different purposes if it needs to, but doing so is not as effective. Similarly, the configuration of people affects the quality of attention. Some tasks, such as brainstorming, are more effectively done by many people allocating a few minutes each, whereas others, such as writing a paper, are more effective with a few people allocating many minutes each. Even with the same number of person-hours, a group's configuration affects its quality.

When we measure the attention required to do something, we should consider both its quantity and its quality.
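
The distinction between quantity and quality can be illustrated with the person-hour unit. In the sketch below, the Configuration class and the two example groups are hypothetical, with sizes chosen only so that the totals match. Both configurations supply the same quantity of attention, which is exactly why quantity alone cannot tell us which one suits brainstorming and which one suits writing a paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Configuration:
    """Hypothetical description of how a group allocates its attention."""
    people: int
    minutes_each: int

    def person_hours(self) -> float:
        """Total quantity of attention, in person-hours."""
        return self.people * self.minutes_each / 60


# Many people allocating a few minutes each (e.g. brainstorming).
brainstorm = Configuration(people=30, minutes_each=4)

# A few people allocating many minutes each (e.g. writing a paper).
paper = Configuration(people=2, minutes_each=60)

same_quantity = brainstorm.person_hours() == paper.person_hours()
```

Both groups total 2 person-hours, so any difference in their effectiveness on a given task must be recorded as a difference in quality, which this sketch deliberately leaves unmodeled.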

We have a CAPACITY of attention

The quantity and quality of our minds and bodies give us a capacity of attention. We can choose to allocate this capacity to different tasks and roles. We can attend to information tasks and objects, or physical tasks and objects.

When we multi-task, we split our capacity of attention amongst tasks. When we are tired, we have a lower capacity.

A group, organization, or human-computer system also has a capacity for attention. Consider a crowdsourcing disaster relief organization. It would be nice to know what kind of capacity it has in situations of a disaster, and to know how that capacity changes over time, as volunteers become tired, or the situation begins to lose media coverage.

ACTION is a form of attention

Let us generalize the idea of attention beyond cognitive tasks, to any action that we perform, mental or physical. In this dissertation, we will view action and attention as the same.

Attention often employs our physical bodies, and the physical tools around us. Consider how a doctor attends to a wound on your arm. He uses his mind and body. He employs mental attention to coordinate the required motions of his hands and physical instruments—to physically attend to the wound.

Attention consists of both thoughts (mental) and actions (physical). It is difficult to have one form of attention without the other. Consider, could you ever concentrate mental attention without physical attention? Mental effort, at minimum, requires your body to allocate physical blood flow to your brain. You will also likely change your gaze, perhaps looking into the distance for inspiration, or bring out paper to take notes or draw a sketch and seek a new perspective. You might recruit your vocal cords to speak to a friend, so that the two of you, and your physical bodies, can attend together. I cannot imagine a situation of attention that is purely conceptual, with no physical component.

And in the other direction, can you imagine physical attention that did not, in some way, involve your mind? All physical motions are controlled by the nervous system. Even relatively automatic processes, like one's breath and heartbeat, are still attended to by the mind. And our actions affect the mind. Our motions, and our physical environments affect our thoughts; they frame and guide our attention. Mental and physical attention are inextricably linked. Actions and Attention are co-present: where one exists, the other can surely be found. In the rest of this dissertation, I think of Action and Attention as being one and the same—two sides of the same coin.

                          Attention is both:
                -------------------------------------
               |                  |                  |
               |                  |                  |
               |      Action      |     Thought      |
               |    (physical)    |     (mental)     |
               |                  |                  |
                -------------------------------------

We attend TO objects

Just as the doctor attends to a wound, we can generally say that when we attend, we attend to something. We may attend to a task, such as writing a dissertation, or using a user interface in a particular way. However, we can also attend to physical objects, such as beautiful flowers, another person, a problem (“attend to the issue”), or an event (“attend a funeral”). These are examples of the object of attention. The object that we attend to can be a physical object, or a virtual concept.

When we attend to an object, we focus our processing on it. We can attend to a thought or memory, sitting in quiet contemplation. We can focus our attention on a goal, or a task or role, focusing our being on completing the task and attaining the goal. We can attend to physical objects and spaces. “To attend” literally means to be present at, or to concentrate on. One can be present physically, or mentally. One can concentrate one's mental energies, or physical forces. Both are forms of attention.

                   ______
                         \_
          __|__            \__
         [ o__o]                 OBJECT of attention
          \_==_|            __   can be goal, task, or thing
            |             _/
             -E    ______/   

When an object occurs to someone as possible to attend to—when the possibility of attending to it enters their awareness—we call it an attention opportunity. Each attention opportunity corresponds to an object.

Dual-nature of Objects

We can think of the object as an objective, or as a thing without implied values or goals, one that simply enters the senses and spurs thoughts and processes. These two forms often co-occur. For instance, when we ask a doctor to attend to a wound, even though the wound is literally a physical object, the doctor immediately sees it as a problem and invents a goal of cleansing, disinfecting, and dressing it. To the doctor, attending to a physical wound would be impossible without at least considering it as a problem to be solved—an objective.

On the other hand, some things can be attended to without a conscious goal. We may gaze idly into the clouds. However, these thoughts often engender goals—such as trying to see a shape in the cloud, or noticing that it is about to rain and one should move inside. Or we might provide a story for such idle gaze—explaining that we are pursuing the goal of browsing or exploring. So, although things do not always co-occur with clear objectives, they often do, and in the rest of this dissertation we will consider the object of attention to abstractly represent both an objective to achieve and a thing to be present with, physically and/or mentally.

              The object of attention is simultaneously:
                -------------------------------------
               |                  |                  |
               |                  |                  |
               |    Objective     |      Thing       |
               |                  |                  |
               |                  |                  |
                -------------------------------------

A “gas station attendant” attends to the physical gas station and the cars going through it; to the process of filling the cars with gas and accepting payment from customers; with physical and mental actions.

In summary

How is attention ALLOCATED?

   NOTE!  This section is just a MODEL.  I need to make clear in my
   next revision pass that I am not saying that attention works this way
   psychologically--I am just defining a model that lets us interpret
   man's behavior.  This is the alternative to the Homo Economicus
   rational economic model of man as optimizing his utility function
   over consumption of goods and services and time!

Now we have defined attention and what it does. To make an economic theory of attention, we need a model of how man chooses to allocate it, as a scarce resource, ascribing some type of value to it.

As in neoclassical economics, we will define a utility function to express the value of attention. However, unlike neoclassical economics, we cannot then simply use mathematical inference to deduce how a rational being would allocate his attention. The rationality assumption, that one has unbounded thought, or attention, with which to determine the allocation of attention, contradicts our purpose: understanding how man allocates his limited attention.

It takes attention to allocate attention. If you build an economic model of how man allocates attention, then how do you explain how he allocates the attention that allocates the attention? This train of thought—called a homunculus theory in Psychology—leads to infinite regress.

Instead, we propose a model of value that enables continuous choice of attention. Instead of long periods of unbounded thought, interspersed with periodic choices, we view one's choice of attention as a constantly-executing, continuous process.

We are constantly CHOOSING attention

A great gas station attendant does not daydream when there is work to be done. He must fully attend to the task. That is a choice. He might get distracted, and lose attention. We are constantly choosing attention.

Perhaps you are reading this paper because something in your life made you interested in attention economics, and now you are so stoked to find a relevant dissertation on the topic that you will read this entire manuscript in one literary gulp. On the other hand, you might be reading this because you “have” to—for instance you might be on my thesis committee—and see this as a chore. In that case, maybe you will suddenly find yourself distracted by Facebook or Twitter. Or television, or a phone call.

Some tasks are attractive to us to attend to. Some are less attractive. Some are fun, and enjoyable. Some are dreary, drudgerous.

Attention opportunities have UTILITY

The more an object is appealing, exciting, or valuable, the more we focus our attention on it. Economically, we can model this by saying there is an attention utility to the object. In this model, attention utility is the amount of value that we predict that attending to an object will have.

Thus, every attention opportunity occurs simultaneously with a utility value attached.

       ----------------------- --------------------------------
      |   Attendable Object   |   Utility of Attending to It   |
       ----------------------- --------------------------------

The problem this dissertation is trying to solve is that attention utilities are invisible today, because we do not know how to measure them:

       -----------------------   - - - - - - - - - - - - - -
      |   Attendable Object   |   Utility of Attending to It   
       -----------------------   - - - - - - - - - - - - - -

Many choices of attention are semi-automatic

Choices need not be conscious to be called a “choice.” Many of our choices are due to habit or instinct, and could be said to occur preattentively.

If you repeatedly encounter a similar situation, with similar utility values, and make the same attention choice, it may become habit: something that you choose in the future without premeditation. For instance, you may execute a routine every morning, allocating attention sequentially to the same tasks of showering, brushing your teeth, making eggs, and getting dressed. This sequence of tasks may feel as if it is on “auto-pilot”—executing without your active attention. However, it is still an allocation of physical and mental attention—you have just learned how to execute it with a more efficient use of mental resources. Subconscious attention is still attention. A subconscious choice is still a choice.

Many attention choices are instinctive. Our attention is drawn to flashing images and loud noises, for instance. It takes training and focus not to pay attention to them. These instinctive and habitual objects will consequently have a very high attention utility in our model. It is difficult to teach an old dog new tricks.

One last note: Psychology often refers to processing of this sort as preattentive—that is, before attention. However, we think of “attention” in a more general form—to us, all processing is attention, even these rapid, automatic, instinctive forms of processing.

These choices can be said to be FAST THOUGHT

These rapid, automatic, instinctive forms of attention have a name in psychology: fast thought. This is a distinction popularized recently by behavioral economist and Nobel laureate Daniel Kahneman in his book Thinking, Fast and Slow, but it is also supported by a wide array of experimental data, and in fact a variety of other psychologists have proposed similar models in the past.

Kahneman's model of the mind is composed of two systems—a fast thinking system and a slow thinking system, which he calls plainly “System 1” and “System 2.” He notes that many of the mind's processes are computed quickly, almost automatically, and without great effort. These are the processes of the fast thinking system. Other processes are slow, and require greater effort and attention—aspects of the slow thinking system.

For example, most people can decode an audio stream of human voice (i.e., “understand speech”) quickly, with little experience of effort. In fact, it can be difficult not to pay attention to someone's voice; we do it almost automatically. We can also speak, watch TV, and ride a bicycle without mental effort, using the fast system.

On the other hand, some cognitive tasks require logic, reason, planning, and other processes that require a slower, plodding manner of effort. We are not very fast at solving long division problems, or planning a complex holiday vacation. Novice chess players must think hard to deliberately choose a good chess move, within the constraints of rules they are unfamiliar with.

However, with experience and learning, we begin to transition slow-thinking processes to the fast-thinking system of our brain. An expert chess player, for instance, can take a glance at a chess board and immediately, without trying, have an intuition for the strengths and weaknesses of the players' positions. This intuitive reaction is fast thought.

Our mental processing generally consists of some combination of fast and slow thought. Over time, we become better at slow things, and learn aspects of them with fast, automatic thought. When we play chess, we learn fast-thought signals that determine a winning or losing game. In the same way, our fast-thought attention-choosing system learns signals for what is a low or a high-utility attention opportunity.

  | SAVE THIS FOR HCI "WHY" SECTION: Designers call these fast-thought
  | judgments "taste."  Taste is a fast way of determining if
  | something is good or bad.  We develop taste through experience.
  | Expert designers have developed their sense of taste to be a very
  | accurate reflection of underlying utility of the outcomes.  They
  | then learn to trust their taste.  It will accurately signal if a
  | design direction is good or bad, like an expert hunting dog's nose
  | will signal the trail of a game rabbit.

Fast thought chooses our Slow thought

We have a specific way of modeling attention as a combination of fast and slow thought. Fast thought chooses attention, and slow thought is one of the opportunities chosen.

  | Fast thought is fast and effortless.  It is therefore less
  | constrained in allocation.

For instance, let us consider the task of buying a cell phone. Perhaps we are aware that there are Android phones and iPhones, and decided that we want one of the two. Making the optimal decision will require some amount of research. How do we model this decision and purchasing process with attention economics?

Attention economics is a theory of choice in the present moment. And at the beginning of this process, we have a very poor idea of which phone we want. We will know much more in the future, after we do research. In the initial moment, our attention opportunities look like this:

              /   opp. 1  - buy android phone
      __|__  /    opp. 2  - buy iphone
     [ o__o] -    opp. 3  - ...
      \_==_| \    opp. 4  - ...
        |     \   opp. 5  - LET'S THINK MORE ABOUT THIS
         -E
                  Attention
                  Opportunities

We are aware that we could choose without thinking, but the utility of thinking more about the issue is much greater than making a rash decision. So we take actions towards thinking more about the phone.

    o------------->
    |
    Think more

We will continue to think, and research, until we have learned enough that we choose to buy one of the phones, and at that point will start taking actions towards buying the phone.

    o------------->o-------------> ... o--------------------->
    |              |                   |
    Think more     Research more       Go to store.apple.com

This chain of attentive actions is slow thought—it is our attention at work deciding which phone is best for us to purchase. At each step along the process, we made a choice as to how to allocate our attention for the next step. In our attention economic model, that choice is made with fast thought. At some point, our fast thought finds that the utility of further research has diminished below the utility of attending to buying an iPhone, and it chooses to proceed with the purchase.
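
The chain above can be sketched as a short loop. This is only an illustration under assumptions that are not part of the model itself: the function name `decide`, the starting utilities, and the fixed `decay` factor that shrinks the utility of further research at each step are all hypothetical. The sketch also ignores the way research revises the estimated utility of buying.

```python
def decide(buy_utility, research_utility, decay=0.5, max_steps=20):
    """Return the chain of attention choices that ends in a purchase.

    buy_utility:      estimated utility of buying right now
    research_utility: estimated utility of one more step of research
    decay:            assumed factor by which research utility shrinks per step
    """
    chain = []
    for _ in range(max_steps):
        # Fast thought: compare the opportunities in the present moment.
        if research_utility > buy_utility:
            chain.append("research more")   # slow thought is the chosen opportunity
            research_utility *= decay       # assumed diminishing returns to research
        else:
            chain.append("buy")
            break
    return chain


chain = decide(buy_utility=2.0, research_utility=10.0)
print(chain)  # ['research more', 'research more', 'research more', 'buy']
```

Each pass through the loop is one fast-thought choice. The run of "research more" entries, taken together, is the slow thought.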

To summarize, our model has the following features based on the Psychological model of Fast and Slow thought:

Utility is a continuous model of Fast Thinking choice over time

Thus, over time, our fast-thinking mind is constantly evaluating the best opportunities for attention in the newly occurring present moment. These constant evaluations are expressed in our attention utility function.

          o---->o---->o---->o---->o---->o---->o---->
          |     |     |     |     |     |     |
          U(t0) U(t1) U(t2) U(t3) U(t4) U(t5) U(t6)

The utility function changes over time. You can think of the utility function as a state—the state of our current best guess of what we should be attending to. This state changes constantly, as our mood shifts, as we become tired or energized, as we discover new exciting goals, and as we learn from experience how to make better attention decisions.

If one is at work, one's attention utility function will generally have high values for attending to work. If one is in a tired state, one's utility for energy-intensive tasks will be low. If one is in a sad state, one will generally find reflection appealing. If one is in the middle of a task, one will find high utility in staying on the task, and finishing it. The nature of the utility function is to constantly change from moment to moment. It is the purpose of attention economics to measure and model this changing state—the trends in values of the utility function for different objects of attention in different situations. Scientists can hypothesize about its behavior, build models, and test them against measured attention utility data.

We learn to improve our utility function from experience

Over time, we learn from the outcomes of our experiences, good or bad, and feed that knowledge back into our fast-thinking system's information store to influence our future utility functions. This reflection is often a slow thought attention choice, itself.

One way we learn is by associating features of past experiences with high or low utility. These features are similar to tastes and smells. When something tastes or smells good or bad, it does not mean the entity actually is good or bad—but it probably has features in common with prior good or bad experiences.

For instance, if you get sick soon after eating a type of food, the features of that food will occur to you as disgusting in the future. If you become sick from a poisoned apple, you will likely find apples disgusting, even though it was a poison—not the apple—that caused your sickness.

In this way, we learn our taste in the world's objects, such as fashion, food, and user interfaces. Expert designers learn their taste over years of careful, conscious reflection on experience. Taste is our fast-thought judgment of utility.

  | Often we will need to reflect on experiences to learn from them,
  | and this requires slow thought.  If I do poorly on a test at
  | school, it might take slow thought to realize that I did poorly
  | because I was tired, which happened because I stayed up too late
  | studying.  I will need to update my attention utility function so
  | that I remember the value of going to sleep earlier in the
  | future.

We can manipulate utility to measure it

To measure the utility function in attention economics, we will want a measuring stick—something that we can compare utility values against. A good candidate for this is money. Money is like a lowest common denominator for utility. Most people find utility in it. Most people are familiar with a range of its denominations. We are familiar with comparing the values of objects and experiences to amounts of money.

This familiarity is important. If people have an intuitive sense of the value of $10 vs., for instance, $100 or 10¢, then they will be able to judge such values with their fast thought. If we increase or decrease the price of an object by an amount of money, it is then likely that their attention utility function will increase or decrease by a corresponding amount. This is a property we will rely on when measuring utility in Chapter 3.

Psychological experiments give us reason to believe that this property, in fact, holds. Neuroscientific studies show that people's dopamine systems react similarly to money as to sweets and material pleasures. The positive features of obtaining the things you buy with money seem to transfer to, or become embedded in, the attention utility of obtaining the money itself. The more familiar and experienced one is with the features of an object, the better one's System 1 reactions will be at judging its value.

  | Mechanical Turk workers are very familiar with choosing
  | micro-tasks for micro-payments.  They will likely be good at using
  | fast thought to judge microtask value.
  | • If something is real expensive, it will begin to taste unappealing
  |   over time.  You'll start to think you have other ways to get that
  |   done:
  |         "I don't need google.  Fuck that, I'll use bing."

Summary

Let us now be a bit more precise with the attention utility model. You can consider this to be an attention-economic alternative to homo economicus—the neoclassical economic model of man. Like homo economicus, this model is not particularly falsifiable. We can explain most human behavior within it, by choosing particular parameters for utility values. Its purpose is rather as a metatheory—it is a lens for interpreting data and posing new falsifiable theories. It allows us to model man as a constantly-changing attention utility function.

The science of attention economics consists of defining and testing new hypotheses for how that utility function behaves.

At any point in continuous time, there is a state of your attention utility. It reports the predicted value of opportunities you are aware of at that time. Your utility function is evaluated over the possibilities for attention in your awareness.

          o---->o---->o---->o---->o---->o---->o---->
          |     |     |     |     |     |     |
          A(t0) A(t1) A(t2) A(t3) A(t4) A(t5) A(t6)
                     A(t)  =  {(Oi, Ui)}
      Awareness at a Time     is a set of Opportunities.
                              Each Opportunity is an (Object, Utility) tuple

(Although this graphic shows discrete steps, we actually model time as continuous. Opportunities flow in and out of awareness, and their utility values slide up and down on gradients.)
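
As a minimal sketch of the definition A(t) = {(Oi, Ui)}, awareness at one instant can be written as a set of (object, utility) pairs. The class name `Opportunity`, the helper `choose`, and every utility value below are hypothetical and chosen only for illustration. The sketch also assumes that the opportunity with the highest utility is the one attended to, and it uses discrete snapshots where the model treats time as continuous.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Opportunity:
    """One (Object, Utility) tuple from A(t)."""
    obj: str        # the object of attention (an objective and/or a thing)
    utility: float  # predicted value of attending to it (assumed unit: dollars)


def choose(awareness):
    """Fast-thought choice: pick the opportunity with the highest utility.

    Taking the maximum is an assumption of this sketch. It models one
    opportunity at a time, so it does not represent multi-tasking.
    """
    return max(awareness, key=lambda opp: opp.utility)


# A(t0): hypothetical utilities, invented for illustration only.
awareness_t0 = {
    Opportunity("buy android phone", 1.0),
    Opportunity("buy iphone", 1.5),
    Opportunity("think more about this", 4.0),
}

chosen = choose(awareness_t0)
print(chosen.obj)  # think more about this
```

A later snapshot A(t1) would simply be another such set, with opportunities added or removed and utilities revised.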

Your mind has a fast-thinking subsystem that perceives the world and identifies new attention opportunities, placing them into your awareness.

One option for attention is to think more about what to do. When you think more, you reflectively explore and improve your estimates of utility.

I do not claim that this is the final word on defining attention utility, but just that it is reasonable and measurable. The purpose of this is to inspire you to believe that we could define a working model of attention that is measurable. I hope that scientists after me continue to pursue and refine these models.

Also note that even though neoclassical utility is generally ordinal, I consider attention utility to be cardinal. A rationale is provided in the appendix.

Limitations of the model

This model does not represent multi-tasking. It models a person as choosing one attention opportunity at a time.

I have described each attention opportunity as a separate, discrete entity. In reality, attention opportunities may be a continuous distribution of possibility. However, I believe that my analysis could also be extended to a continuous model.

HOW to Measure Attention Economics

Chapter 3.

  | Reason about non-reasons!

The purpose of this chapter is to demonstrate that it is possible to measure attention economics, according to the definitions just given.

To do so, we need to show we can measure attention utility: the continuous state of fast-thought judgment predicting the value of an attention opportunity, in the present, at the moment of action. We want to measure not just the amount of attention that people give to objects, but how much that attention is worth. We will measure attention utility with human-computer systems, either by manipulating an existing system, or recreating a facsimile within a labor market like Mechanical Turk, in what I call a virtual economic laboratory for human-computer systems.

Our method relies on revealed preference theory, which we adapt from neoclassical economics. Revealed preference theory interprets a person's choice, such as buying apples instead of oranges, as preference: preferring apples to oranges. Whereas neoclassical economics infers from a choice that a person prefers obtaining the good or service chosen, we will infer what a person prefers attending to in the moment. For instance, if a person chooses to email a friend instead of posting on his Facebook wall, we can infer that the person prefers emailing his friend.

To reveal the value of a preference—the magnitude of its utility—we will instrument people's computing environments so that their choices of attention also pay or charge them an amount of money. We infer the value of preference from the amount needed to change their behavior.

An Idealized Method

There are many variations of this method. Let me begin by explaining an idealized method that measures utility by instrumenting an existing user's environment. In this method, we will “sweeten or sour the pot” of specific attention opportunities, by specific amounts, and measure the resulting change in user behavior. We will do this by paying or charging the user a small amount of money for attending to a part of the website. For instance, we could charge the user $3 per minute to use Facebook:

              /  Facebook - $3/min fee
      __|__  /   Twitter
     [ o__o] -   Write thesis
      \_==_| \   Write girlfriend
        |     \  Suck thumb
         -E
     Person      Attention
                 Opportunities

By measuring the amount of money required to change user behavior, we will produce a money metric of utility: a measure of utility, in terms of dollars and cents, that describes how much the user values Facebook usage.

Over time, the subject will attend to the task more or less frequently because of the reward or fee. For instance, suppose that a user would normally have visited Facebook three times during a day:

        __ <-- visit #1
       |  |
       |  |                 __  <- visit #3
       |  |     __ <- #2   |  |
       |  |    |  |        |  |
       |  |    |  |        |  |
   ==================================> time
    | 1:00 PM   | 2:00 PM   | 3:00 PM

Each visit has a different utility for the user, represented by the height of that visit's bar. The first visit was worth a lot to the user. The second, a little. And the third, somewhere in between:

        __  <-- a lot of utility
       |  |
       |  |                 __  <-- less utility
       |  |     __ <-least |  |
       |  |    |  |        |  |
       |  |    |  |        |  |
   ==================================> time
    | 1:00 PM   | 2:00 PM   | 3:00 PM

This is the utility of Facebook as compared to the other opportunities that occurred to the user as possible. The zero point is where the user is indifferent between using Facebook and doing something else. We call this “extra amount” the utility surplus of each Facebook attention opportunity.

We want to measure these utility surpluses—the heights of the bars. But how can we measure them? We do not know how to measure utility from someone's mind directly. Instead, we can manipulate his environment. Specifically, we will instrument the object of his attention—interaction with Facebook—with an additional fee.

Let us charge him $3 per minute, prorated down to the millisecond, for all the time that he is interacting with Facebook. We will charge him when Facebook is loaded in an active browser window, when he scrolls, types, or clicks on the page, and—to capture the time he is reading a static article—for a window of 20 seconds after any activity, as long as the page is still visible. To motivate users to participate in this study, they will receive a baseline award each day, from which their fees will be subtracted. We will ensure that the award is greater than the fees they incur.

We will display his current fee at the top of his monitor, with a banner on the menubar displaying the current price: “Facebook: -$3/minute” along with the total he has earned or been charged so far. We assume that, over time, he will internalize that his Facebook usage carries this additional cost, and his fast-thinking utility function state will embed the $3/minute fee into its calculations.
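
The metering rule described above can be sketched as follows. This is an illustration rather than an implemented instrument: the function names, the event timestamps, and the simplification that the page stays visible throughout are all assumptions. Each activity event opens a 20-second billing window, overlapping windows are merged so that no second is billed twice, and the fee is prorated from the $3 per minute rate.

```python
FEE_PER_MINUTE = 3.00    # dollars, from the example in the text
ACTIVITY_WINDOW = 20.0   # seconds billed after each activity event


def billable_seconds(event_times, window=ACTIVITY_WINDOW):
    """Total length of the union of [t, t + window] over all event times."""
    total = 0.0
    covered_until = float("-inf")
    for t in sorted(event_times):
        start = max(t, covered_until)  # skip time that is already billed
        end = t + window
        if end > start:
            total += end - start
            covered_until = end
    return total


def fee(event_times, rate_per_minute=FEE_PER_MINUTE):
    """Fee in dollars, prorated from the per-minute rate."""
    return billable_seconds(event_times) * rate_per_minute / 60.0


# Hypothetical scroll, click, and keypress timestamps, in seconds.
events = [0.0, 5.0, 10.0, 100.0]
print(billable_seconds(events))  # 50.0
print(round(fee(events), 2))     # 2.5
```

A real instrument would also need to stop the clock when the page is hidden, which this sketch leaves out.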

This changes his present attention utilities:

              /  Facebook - $3/min fee  = U(Facebook-$3)
      __|__  /   Twitter                = U(Twitter)
     [ o__o] -   Write thesis           = U(Thesis)
      \_==_| \   Write girlfriend       = U(Girlfriend)
        |     \  Suck thumb             = U(Thumb)
         -E
     Person      Attention                Utilities
                 Opportunities

Given our model of attention economic behavior, there will now be a particular threshold on his behavior. He will only use Facebook if its utility is at least $3-worth more than his other options. We have established a threshold on his Facebook usage at $3:

        __
    -  |- |  -  -  -  -  -  -  -  -  -  $3/min fee
       |  |                 __
       |  |     __         |  |
       |  |    |  |        |  |
   ==================================>  no fee
    | 1:00 PM   | 2:00 PM   | 3:00 PM

We will now only observe a subset of his original Facebook usage.

If we then run the experiment over multiple days or weeks, and vary the fee on different sets of days, we will be able to paint a picture of Facebook use at different utility thresholds:

   -   -     -      -      -     -   -  $5/min fee
        __
    -  |- |  -  -  -  -  -  -  -  -  -  $3/min fee
       |  |                 __
   - - | -|- - -__- - - - -|- |- - - -  $1/min fee
       |  |    |  |        |  |
   ==================================>  no fee
    | 1:00 PM   | 2:00 PM   | 3:00 PM
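
The thresholding in these diagrams can be sketched in a few lines. The surplus values below are hypothetical numbers chosen to resemble the bars in the figures, and the rule that a visit occurs only when its surplus is at least the fee is the assumption taken from the model.

```python
# Hypothetical utility surpluses (dollars per minute) for the three visits.
visits = {"1:00 PM": 3.5, "2:00 PM": 1.2, "3:00 PM": 2.4}


def observed_visits(surpluses, fee):
    """Visits that still happen once a per-minute fee is charged.

    Assumes a visit occurs only if its surplus is at least the fee.
    """
    return [when for when, surplus in surpluses.items() if surplus >= fee]


for fee in (0.0, 1.0, 3.0, 5.0):
    print(fee, observed_visits(visits, fee))
```

Read in the other direction, this is how the measurement works: a visit that survives a $1 fee but disappears at $3 has a surplus somewhere between $1 and $3 per minute.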

We can also charge the user negative amounts—that is, pay him to use Facebook. This will let us peek below the zero line:

   -   -     -      -      -     -   -  $5/min fee
        __
    -  |- |  -  -  -  -  -  -  -  -  -  $3/min fee
       |  |                 __
   - - | -|- - -__- - - - -|- |- - - -  $1/min fee
       |  |    |  |        |  |
   ==================================>  no fee
       |  |    |  |  __    |  |
   - - | -|- - |- |-| -|- -|- | -__- -  $1/min pay
       |  |    |  | |  |   |  | |  |

If we pay him enough, he might just use Facebook all day. But if we pay him a little, he will use it a little more than usual, but eventually become bored and do something else. The more we pay him, the more he will use it.

We will have to be careful to ensure that he does not try to cheat us, for instance by writing a script to scroll the page automatically and game our metrics while he makes and eats a sandwich. We will need to establish some safeguards, such as pay limits, monitoring suspicious behavior, network traffic logs, and establishing trusted relationships with our user-study subjects. We can also record screen captures, and check the screens and activity logs using crowdsourcing.

If we include a large number of people in the study, and vary the price across them, then we can plot the average aggregate amount of usage at each price to obtain a useful economic graph called a labor supply curve:

                        usage
                          ^
                          :     ___________
                          :  --
                          :/
                         /:
                        / :
           __________--   :
       <==================+==================> utility
          fee per minute  |  pay per minute

As you pay people more, you increase their usage. If you charge them, you decrease it. The labor supply curve tells you what supply of “labor” (actually, attention in this case) one can expect to obtain for different amounts of pay. As with traditional labor supply curves, we expect to find diminishing marginal returns with increased pay: the graph should flatten out. Likewise, in the negative direction, each additional increase in the fee should remove less and less usage.
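
Building such a curve from study data amounts to averaging usage at each price. The sketch below assumes one observation per subject, with negative prices standing for fees. The numbers are invented for illustration and are not experimental results, and the name `supply_curve` is hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical observations: (pay per minute in dollars, minutes of usage per day).
# A negative pay is a fee. One row per subject; all numbers are invented.
observations = [
    (-3.0, 2.0), (-3.0, 4.0),
    (-1.0, 10.0), (-1.0, 14.0),
    (0.0, 30.0), (0.0, 34.0),
    (1.0, 50.0), (1.0, 54.0),
    (3.0, 60.0), (3.0, 62.0),
]


def supply_curve(rows):
    """Average usage at each price, sorted from largest fee to largest pay."""
    by_price = defaultdict(list)
    for price, usage in rows:
        by_price[price].append(usage)
    return sorted((price, mean(usages)) for price, usages in by_price.items())


curve = supply_curve(observations)
for price, usage in curve:
    print(f"{price:+.2f} $/min -> {usage:.1f} min/day")
```

With these invented numbers the averages rise from 3 to 61 minutes per day and flatten at both ends, which is the shape the text expects.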

The labor supply curve is the basis for many economic analyses. For instance, suppose Facebook's developers were considering improving an aspect of their user interface, to make it easier to send messages to friends. Before implementing, they could predict the change in usage they expect to obtain from these labor supply curves. First, they would measure the existing labor supply curve for the task, by charging or paying their subject population to attend to it. Second, they would estimate the change in utility (in dollars and cents) that they think they could achieve with an improved message sending feature. Since it makes the task easier, they would expect a decrease in cost, and thus an increase in utility of the task—people would do more of the task for the same fee or pay. Once they have an estimate for the expected change in utility, the labor supply curve will report the change in usage to expect:

                        usage
                          ^
                          :     _:_________
                          :  --  :
                          :/     :
                         /:      :
                        / :      :
           __________--   :      :
       <==================+==================> utility
                          ^      ^
                          |      |
             Existing usage      Increased usage
             (Zero pay)          (Due to increased utility)

This data could be used to help allocate priorities within the company, or for purely scientific inquiry. If Facebook found that a particular feature was both particularly utility-sensitive, and important for their business goals, they might devote more resources to increasing its utility.
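
Reading the predicted usage off a measured curve can be sketched with linear interpolation. The curve points, the function names, and the $0.50 utility gain are hypothetical. Treating a utility gain of x dollars as equivalent to paying x dollars per minute is the assumption taken from the argument above, and interpolating linearly between measured points is an additional simplification.

```python
# Hypothetical measured curve: (pay per minute in dollars, average minutes per day).
CURVE = [(-3.0, 3.0), (-1.0, 12.0), (0.0, 32.0), (1.0, 52.0), (3.0, 61.0)]


def usage_at(curve, price):
    """Linearly interpolate usage at a price; clamp outside the measured range."""
    if price <= curve[0][0]:
        return curve[0][1]
    if price >= curve[-1][0]:
        return curve[-1][1]
    for (p0, u0), (p1, u1) in zip(curve, curve[1:]):
        if p0 <= price <= p1:
            return u0 + (u1 - u0) * (price - p0) / (p1 - p0)


def predicted_gain(curve, utility_gain):
    """Extra usage expected if a redesign adds utility_gain dollars per minute."""
    return usage_at(curve, utility_gain) - usage_at(curve, 0.0)


print(predicted_gain(CURVE, 0.5))  # 10.0
```

Here a redesign judged to be worth 50 cents per minute would be predicted to add 10 minutes of usage per day over the zero-pay baseline.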

The astute reader will have noticed, however, that I have not yet explained how to estimate the amount of utility that the new user interface change is worth—how much the graphed value will shift to the right. There are a few ways to accomplish this. First, one might just guess, using a designer's expert judgment. With practice, an experienced designer might become highly skilled at predicting the amount of money equivalent to various user interface differences. Second, they could compare it with a prior result for a similar change that they have already measured. And if these methods provide too much uncertainty, then they could determine it experimentally, using the virtual economic laboratory method I will describe in the next section.

A Measure of Attention Utility

To verify that this measures attention utility, note two things. First, we are measuring choice of attention in the present moment of action: the choices we observe are actual choices of use. Second, our subjects would presumably be using fast thought to judge them, since we expect the value of this money to become embedded in the subject's attention utility function over time, as they gain experience attending to tasks and watching their rewards rise or fall in association. Over time, we can expect them to reflect on their choices, and perhaps think “oh, sending that email just earned me 40 cents.”

Most neoclassical methods for preference measurement, on the other hand, involve slow thought judgments over outcomes. The typical experiment places subjects in an artificial auction, and asks them to provide bids on a good that they will not experience until a future moment, if at all.

  | *** High-level remarks on this style of economics
  | 
  | This does not give us the same static analysis powers.  We cannot
  | infer the choice tables from first principles.  But we can measure
  | them within some situations.  So we cannot do an equilibrium
  | analysis statically in the same ways.  Choices are very
  | context-dependent.  But we can do other things.

Measuring Cost with a Virtual Economic Laboratory on Mechanical Turk

However, I have not implemented that idealized method. This section describes a method I actually have implemented and experimented with. Unlike the idealized method, this one uses a ready-made user population available to any researcher—Mechanical Turk.

Using Mechanical Turk gives us a different type of experimental setup, which I call a virtual economic laboratory. Rather than implement experiments in the field, with a live website and real users, we can create synthetic situations on Mechanical Turk that attempt to replicate certain phenomena of interest. This lets us run experiments that are not feasible in real life. Scientists at large companies are limited in their range of possible scientific inquiry by the real-world constraints in the products they experiment with. Real websites are complex. It is difficult to generalize results from one website to another. Whereas a chemist would carefully remove or control for confounding reagents in an experiment, a researcher at Google cannot simply remove arbitrary features (e.g. search) from a running website to answer a scientific question. By putting tasks and interfaces on internet labor markets, such as Mechanical Turk, we can create a virtual economic laboratory for experiments. We can pay users to try any task and interface we want. Furthermore, we can pass our code around for other scientists to replicate, or modify and extend our results.

Overview of the technique

This technique measures the utility of a user interface for a task. The basic method is to place the user interface and task on Mechanical Turk, and see how much we have to pay Turkers to complete the task with the user interface.

More specifically, we will quantify the utility difference between a set of interfaces or tasks. We will give the different variations of experimental conditions (different interfaces and tasks) to different Mechanical Turk workers, in a giant A/B test. However, unlike a traditional A/B test, we will also simultaneously vary the price we pay the Turkers. We will measure the percentage of workers who complete a task, and the number of tasks that they complete. (Workers on Mechanical Turk can repeatedly perform iterations of your tasks, if you allow it, until they become bored or distracted.) By then comparing the change in amount of work completed due to the experimental condition with the change due to the difference in pay, we are able to infer the amount of pay that is equivalent to the change in user interface or task.

This technique has similarities to the idealized method described earlier. Rather than run an auction to solicit bids, we simply present each worker with a job at a price and observe how much attention they choose to allocate to it. A user chooses to complete a task at a given price, or not. If they choose to complete the task, we observe how much work they do before boredom or other factors reduce the net utility of the task below their other options, and they stop. We then aggregate this data. Holding task and context constant, a data point in utility space is (interface_id, worker_id, wage_compensation_per_completion, number_of_completions).
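The data point described above could be represented and aggregated roughly as follows. This is a minimal sketch: the class name, field names, the convention that zero completions means "saw the job but declined", and the sample rows are assumptions for illustration, not the Utiliscope's actual schema or data.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    interface_id: str
    worker_id: str
    wage_cents: int    # compensation per completion
    completions: int   # assumed convention: 0 means the worker saw the job but declined


def summarize(observations):
    """Group by (interface, wage) and report acceptance rate and mean completions."""
    groups = defaultdict(list)
    for obs in observations:
        groups[(obs.interface_id, obs.wage_cents)].append(obs.completions)
    summary = {}
    for key, counts in groups.items():
        accepted = sum(1 for c in counts if c > 0)
        summary[key] = {
            "workers": len(counts),
            "acceptance_rate": accepted / len(counts),
            "mean_completions": sum(counts) / len(counts),
        }
    return summary


# Invented example rows, not data from the studies.
sample = [
    Observation("easy", "w1", 1, 4),
    Observation("easy", "w2", 1, 0),
    Observation("hard", "w3", 1, 1),
    Observation("hard", "w4", 1, 0),
]
result = summarize(sample)
print(result[("easy", 1)])
```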

Our approach to measuring the utility of an interface is to determine the compensating wage differential between different tasks and computer interfaces. This economic theory dates back to Adam Smith, who defined it as the additional amount a worker must be paid to convince him to do a job that is unpleasant, risky, or otherwise undesirable []. Our method determines the desirability or undesirability of an interface to achieve a task by observing how many workers choose to use it at different amounts of pay.

We use a between-subjects design, and minimize the explicitness with which workers must reason about their choices. We call jobs “Mystery Tasks,” presenting them as a surprise or a game rather than an explicit auction (detailed later). Workers do not know that their activities are being aggregated to infer utility. This technique is simple, direct, and requires few assumptions. The downside is that it requires a large amount of data, because every completed job provides only one bit of information: whether the user accepted the job, or not. Luckily, obtaining this amount of data is feasible with Mechanical Turk.

Dirty Details: Making it work in Mechanical Turk

To make our auction work, we had to overcome a number of challenges on Mechanical Turk that would bias the results and compromise their integrity as a measure of net utility. In this section, we first overview the software framework implementing the utility methodology, then describe a couple of the specific problems it helps solve, and finally detail a few outstanding problems that need to be addressed in future work.

We have automated most of the experimental method. An experimenter first implements a user interface for a task using HTML, JavaScript, Flash, Java, or any other technology embeddable into webpages. She can define a set of experimental conditions (a pairing of interface and task), and parameterize the interface for each condition, for instance implementing multiple button widths or interaction styles that she would like to test. She then tells our software to run a study with the interface and task, declaring which conditions she would like to run, and how many workers she would like to receive utility judgments from (this determines how much she will spend on Mechanical Turk). Our software automatically creates hundreds to thousands of jobs on Mechanical Turk and randomly assigns workers to the prescribed conditions. Workers complete jobs interacting directly with the experimenter’s web application embedded in an IFrame on the Mechanical Turk website. Our software logs to a database how many workers look at the task, and how many jobs each completes, along with the workers’ interactions, worker IDs, geographic locations, and their randomly-assigned pay and interface conditions. The software then applies micro-econometric analyses to the data, computes a money-metric, and produces graphs to explain the results. It also geolocates IP addresses and tracks local time of day for workers automatically, which makes it possible to observe differences in utility across regions and time of day. We implement our system in the fantastic web2py web programming framework. We call it the Utiliscope.

The Utiliscope framework addresses a number of issues that anyone who wants to use our method for running utility auctions on Mechanical Turk needs to address. We describe four such problems and solutions here.

Implementing multiple conditions.

We want to measure the varying work done per condition, so we need to post multiple conditions of jobs. However, if we post them all at the same time, workers will only consider completing the best, highest-priced job. If, on the other hand, we change the price over time, we have to control for the other factors that change over time, such as worker populations and market conditions, and have to control for the increases in experience and boredom that occur with workers who happen to remain on Mechanical Turk during multiple phases of the conditions. Our solution is to post all conditions to Turk simultaneously, but allow each worker to see only one of the many conditions. Mechanical Turk itself does not support this, but we were able to implement a novel workaround using Mechanical Turk’s bonus and IFrame APIs. We will describe this below.

Selection bias.

The second problem is that the rate of work depends on how many workers find a job in the first place. We want to measure a job’s inherent labor completion rate, but other factors confound the rate. For instance, we were able to get a 12x increase in work rate by modifying a job so that it appeared on the Mechanical Turk front page for workers who do not change Amazon’s default search settings. Our solution is to measure the number of people who look at our jobs, separately from the number who complete them, so that we can measure instead the proportion of those that complete tasks out of those who consider tasks. Unfortunately, Mechanical Turk does not tell us how many people look at our task listings, since they appear in a large list of search results on the Mechanical Turk server.

Luckily, we can solve the multiple conditions and selection bias problems with one technique. Our solution is to post each job as a “Mystery Task,” with a listed pay of $0.00, and a description that tells the worker she must preview the job to see the task and how much it actually pays:

Until a user accepts a Mystery Task, they do not know the type of task or how much it pays. This allows us to log the percentage of users who see the task vs. accept it, and dispatch a different experimental condition and price to each worker.

When the worker opens a Mystery Task, Mechanical Turk opens an IFrame (using the externalQuestion API) to our web server, which tracks that the worker has seen our task. Our framework is then able to initialize an appropriate interface, task, and price that instantiates an experimental condition. If the framework has not yet seen the worker’s ID, the price is randomly assigned, and sticks with them for subsequent jobs. Our jobs pay entirely in bonus, a Mechanical Turk feature that allows employers to pay a worker a discretionary amount beyond the initial payment. The randomization of condition and pay is hidden from the worker, who is not told that there are multiple conditions available behind the scenes. Unfortunately, Mechanical Turk does not provide external webservers with the worker’s ID when she previews the job, only when she accepts it, so we display the Mystery Task for a previewing worker until she accepts the job. This set of techniques removes all condition-specific information from the job description listings, which we cannot control on a per-user basis, and gives us control over every step in a participant’s process of choice.
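The sticky random assignment described above might be sketched as follows. This is not the Utiliscope's code: the class, its method names, and the in-memory dictionary are hypothetical stand-ins for a web handler and database, and the condition lists are borrowed from the Fitts' law study purely as an example.

```python
import random

PRICES_CENTS = [1, 2, 3, 4, 5, 6]
INTERFACES = ["easy", "medium", "hard"]


class ConditionAssigner:
    """Assigns each worker ID a random (interface, price) pair and remembers it."""

    def __init__(self, interfaces, prices, seed=None):
        self._interfaces = list(interfaces)
        self._prices = list(prices)
        self._rng = random.Random(seed)
        self._assigned = {}   # worker_id -> (interface, price)
        self.previews = 0     # anonymous previews (no worker ID is available)
        self.accepts = 0      # distinct workers who accepted at least one job

    def on_preview(self):
        """A preview carries no worker ID, so show only the generic Mystery Task."""
        self.previews += 1
        return "mystery-task"

    def on_accept(self, worker_id):
        """Return the worker's sticky condition, drawing one on first contact."""
        if worker_id not in self._assigned:
            self._assigned[worker_id] = (
                self._rng.choice(self._interfaces),
                self._rng.choice(self._prices),
            )
            self.accepts += 1
        return self._assigned[worker_id]

    def acceptance_rate(self):
        """Rough ratio of distinct accepting workers to logged previews."""
        return self.accepts / self.previews if self.previews else 0.0


assigner = ConditionAssigner(INTERFACES, PRICES_CENTS, seed=0)
assigner.on_preview()
first = assigner.on_accept("WORKER_A")
again = assigner.on_accept("WORKER_A")   # same worker, same condition
print(first, again, assigner.acceptance_rate())
```

Because previews are anonymous, the ratio is only approximate: one worker may preview several times before accepting.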

Lopping off the long tail of job completions.

Some small proportion of workers will continue to complete the same job ad infinitum. However, this does not necessarily yield more interesting data for informing utility analysis (see e.g. later section on survival analysis). Our framework allows the experimenter to set a parameter for the maximum number of jobs (e.g., 50) any single worker can complete. This reduces the cost of the study if the experimenter is not interested in the utility of doing a job more than a certain number of times in a row. It also makes it less worthwhile for a worker to try to cheat by programming a script to parse the webpages and perform tasks automatically.

Temporal Market Price Fluctuations.

The market clearing price on Mechanical Turk can change over time because of an increase in jobs or workers. Labor supply is sensitive to short term fluctuations like time of day and day of week, as well as longer term boom and bust of the Turk economy. This can create problems for analyzing the results of utility studies conducted over a longer period of time, as well as problems comparing studies conducted at different times. To control for this, we can post a baseline control condition (answering CAPTCHAs for 1¢) along with the experimental conditions. Note that the data we present in this paper has not been controlled for in this way.

Case Studies

We applied our method in two case studies, to verify that it detects significant differences in utility that we would expect. The first study is the Fitts’ Law study. Its purpose is to measure the utility of time-on-task, a factor that HCI routinely measures. The second study, on the other hand, measures the utility of two user interface factors that HCI has had a difficult time quantifying because they do not affect efficiency: aesthetics and feedback. As we describe the studies, we will also present a repertoire of techniques that are useful for analyzing and communicating about the data produced by utility measurement.

These case studies demonstrate that utility measurement can apply to a range of factors—both those that we measure today, and those that we do not—and that it replicates our existing design knowledge and achieves statistically significant results.

Utility of Efficiency: Testing Fitts’ Law

To get a baseline understanding of the relationship between utility and HCI’s existing metrics, we studied the utility of efficiency in a Fitts’ law task.

Study design and execution.

We implemented a traditional Fitts’ law task in JavaScript, where the user must click back and forth between rectangles on the left and right sides of the screen:

Subjects had to click on a blue rectangle 60 times to complete a HIT. We created three variations of bar width and the distance it moved: hard (a), medium (b), and easy (c). Each time they clicked on the bar, it moved to the opposite side of the screen (d). The three conditions as [width, distance] were [300px, 700px] for the easy task, [30px, 870px] for the medium, and [3px, 897px] for the hard. We posted 22,190 jobs to Mechanical Turk, recruiting 1,176 distinct workers, at six prices: 1¢ through 6¢. Our software automatically crossed the six prices and three experimental conditions to create eighteen conditions total. Each job required 60 clicks to complete, giving us time-to-click data on 1,331,400 clicks. We set an upper limit of 51 jobs, or 3,060 clicks, per worker. The study took 5 hours 15 minutes to complete, and cost $970.
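For reference, the sketch below computes an index of difficulty for the three target geometries and reproduces the condition and click counts stated above. The text does not say which formulation of Fitts' law was used; the Shannon formulation, log2(D/W + 1), is an assumption here, and the variable names are illustrative.

```python
import math
from itertools import product

# [width, distance] in pixels, as listed in the study description.
TARGETS = {"easy": (300, 700), "medium": (30, 870), "hard": (3, 897)}
PRICES_CENTS = [1, 2, 3, 4, 5, 6]
CLICKS_PER_JOB = 60
JOBS_POSTED = 22190


def index_of_difficulty(width_px, distance_px):
    """Shannon formulation of Fitts' index of difficulty, in bits (an assumption)."""
    return math.log2(distance_px / width_px + 1)


difficulty = {name: index_of_difficulty(w, d) for name, (w, d) in TARGETS.items()}

# Cross every interface condition with every price.
conditions = list(product(TARGETS, PRICES_CENTS))
total_clicks = JOBS_POSTED * CLICKS_PER_JOB

for name, bits in difficulty.items():
    print(f"{name:>6}: {bits:.2f} bits")
print(len(conditions), "crossed conditions;", total_clicks, "clicks in total")
```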

Analyzing the data.

Now that we have a database containing the choices of hundreds of workers completing the Fitts’ law task at different prices, we would like to answer a set of questions. How much use occurs in the different conditions? How much money is this difference in use worth to users? How does utility vary over time; for instance, what is the difference between novice utility and expert utility? Conversely, how quickly do users tire of the task? And how does use vary with user context, such as a user’s geographic location or local time of day? Here we define two analytical techniques that can help make sense of choice data.

Creating a Labor Supply Curve. The labor supply curve predicts the amount of use that a particular interface and degree of incentive (pay) will produce. It is a plot of the number of jobs completed at each price for each condition. We estimate the labor supply curve using a Tobit regression model []. The Tobit model takes into account the upper limit of 51 jobs per worker, known as “censored” data. Our analysis of the Fitts' law study data predicts the following labor supply curves:

You can see the six different prices (1-6¢) along the bottom. These were varied for each of the three experimental conditions: a total of eighteen conditions. By traversing from left to right along a line, we can see the effect of pay on amount of use. By looking at the differences between lines, we can see the effect of interface condition.
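To make the role of censoring concrete, here is a minimal sketch of the log-likelihood that a Tobit model with an upper limit maximizes. It assumes a simple linear latent model (jobs = intercept + slope * price + normal noise) with right-censoring only; the parameter values and data are invented, and an actual analysis would fit the parameters with a numerical optimizer or a statistics package rather than compare two hand-picked guesses.

```python
import math

UPPER_LIMIT = 51  # per-worker job cap from the Fitts' law study


def normal_logpdf(z):
    """Log density of the standard normal at z."""
    return -0.5 * z * z - 0.5 * math.log(2 * math.pi)


def normal_sf(z):
    """Upper tail probability of the standard normal, P(Z > z)."""
    return 0.5 * math.erfc(z / math.sqrt(2))


def tobit_loglik(intercept, slope, sigma, data, upper=UPPER_LIMIT):
    """Log-likelihood of a right-censored linear model over (price, jobs) pairs."""
    total = 0.0
    for price, jobs in data:
        z = (min(jobs, upper) - (intercept + slope * price)) / sigma
        if jobs >= upper:
            # Censored: we only know the latent value was at least the cap.
            total += math.log(max(normal_sf(z), 1e-300))
        else:
            # Uncensored: ordinary normal density of the residual.
            total += normal_logpdf(z) - math.log(sigma)
    return total


# Invented (price in cents, jobs completed) pairs, not study data.
data = [(1, 5), (2, 14), (3, 22), (4, 33), (5, 51), (6, 51)]
good = tobit_loglik(-4.0, 9.5, 4.0, data)   # upward-sloping guess
bad = tobit_loglik(0.0, 0.0, 4.0, data)     # flat guess at zero jobs
print(f"upward-sloping guess: {good:.1f}, flat guess: {bad:.1f}")
```

The key design point is the censored branch: a worker who hit the cap contributes the probability of reaching at least 51 jobs, rather than being treated as if 51 were their true preferred amount of work.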

Computing a Money Metric: the Compensating Wage Differential. We can infer the amount of money an interface variation is worth: the money metric of utility. To derive the money metric, we measure the horizontal distance between two curves, that is, the change in pay that makes them equivalent. For instance, if we hold the number of jobs constant at 3, we can see that it requires the equivalent of 3.8¢ to compensate for the difference between the Hard and Medium difficulties per 60-click job.

In this example, we filtered to workers from the United States (31% of the total data). As expected, workers do more work if they are paid more money. Furthermore, the curves show they prefer the conditions in the order easy > medium > hard. These differences are significant at p = 0.03 from easy to medium, and 0.07 from medium to hard. If we plot the slope of the curves on its own axis, we can see this preference ordering even more clearly:

This graph visualizes the utility of the index of difficulty. Each point is the number of clicks a participant completed before quitting (points jittered to show spread). There is a clear inverse relationship between task difficulty and utility for Fitts’ law tasks. These results are qualitatively the same for Indian workers (44% of the data), but the data show they respond much more strongly to both pay and index of difficulty than Americans. Our data can also be used to quantitatively predict the amount of work, within error bounds, that will be produced given the variables of interface, context, and incentive (pay).
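The money metric described earlier, the horizontal distance between two labor supply curves at a fixed amount of work, can be sketched for the simplest case of straight-line curves. The intercepts and slopes below are hypothetical and are not the fitted values from the study, so the result is not meant to reproduce the 3.8¢ figure.

```python
def price_for_jobs(intercept, slope, jobs):
    """Invert a linear labor supply curve jobs = intercept + slope * price."""
    if slope <= 0:
        raise ValueError("supply curve must slope upward to be inverted")
    return (jobs - intercept) / slope


def compensating_differential(curve_a, curve_b, jobs):
    """Extra pay (cents) condition A needs to elicit the same work as condition B."""
    return price_for_jobs(*curve_a, jobs) - price_for_jobs(*curve_b, jobs)


# Hypothetical (intercept, slope) pairs; not the fitted values from the study.
HARD = (0.5, 0.5)
MEDIUM = (2.0, 0.5)

extra_cents = compensating_differential(HARD, MEDIUM, jobs=3)
print(f"hard needs {extra_cents:.1f} cents more than medium to elicit 3 jobs")
```

With non-parallel or non-linear curves the differential depends on the number of jobs held constant, which is why the text fixes it at a specific value.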

In conclusion, our efficiency case study showed that utility can capture an existing metric: efficiency. Users prefer clicking efficiently-designed targets. Labor supply curves predict the amount of use an interface will get. By calculating a money-metric, we can quantify the magnitude of the utility differences across conditions, and reason and hypothesize about them in terms of the lingua franca of utility: dollars and cents.

Utility of Aesthetics & Feedback: CAPTCHAs

This second study demonstrates that we can measure the utility of two particularly elusive quantities in HCI: aesthetics and feedback. These quantities are elusive because they do not make an interface slower to use, or otherwise affect the user’s actual behavior. They only affect his perception of the interface and his understanding of its internal process.

Study design and execution.

We experimented with two interface variations for the task of answering CAPTCHAs. One interface had a clear, minimalist design, and the other had gaudy colors, small fonts, and a distracting animated GIF advertisement. Both tasks had the same instructions and wording, required 10 CAPTCHAs to be completed per job, and took the same amount of time to complete. The pretty condition implemented an elegant animated countdown reminding the user how many CAPTCHAs they had left, and the ugly condition only told them when they had completed all 10. We posted 15,000 jobs to Mechanical Turk, with one 10-CAPTCHA task per job. Workers were paid either 1 or 2 cents, for a total of 4 conditions. 1,270 workers completed our jobs, and the entire study cost us $388. In this study we did not limit the number of jobs a worker could complete.

Survival Analysis.

Our between-subjects auction method collects a binary choice for a user, over time until he quits. One intuitive way to represent this data is with a survival function []: a function S(t) that represents the probability of a user “surviving” t tasks before quitting. Analysis starts by preprocessing the data to identify how many tasks each worker completed. Then we group data by condition and price, and plot a graph of the percentage of users who continued to use the interface after N jobs. If a line is higher in the survival graph, it means more workers completed more tasks. We compute 95% confidence intervals using Wilson’s estimate [], since survival data is binomial. When the study ends, it artificially stops, or censors the work of some users who might otherwise have completed more tasks. We label those users as censored and account for them statistically using standard survival analysis techniques.

The survival graph for the CAPTCHA experiment is shown above. The confidence intervals for each line are shaded. The survival analysis shows how use changes over time. We can see that all four conditions are spaced roughly equally apart for the first 20 tasks, but for work done at 80 tasks, the top two lines (2¢) and bottom two lines (1¢) converge. This means that price dominates the utility for workers who acquire more experience with the task, and aesthetics is primarily important for those who are inexperienced. Or, those who stick with the task are more resilient to aesthetic quality.
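The survival function and Wilson interval used in this analysis can be sketched with the standard library alone. The task counts below are invented, the function names are illustrative, and the sketch deliberately ignores censored workers; handling them would call for a standard estimator such as Kaplan-Meier.

```python
import math


def survival_curve(completions, max_t):
    """S(t): fraction of workers who completed at least t tasks, for t = 1..max_t."""
    n = len(completions)
    return [sum(1 for c in completions if c >= t) / n for t in range(1, max_t + 1)]


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives about 95%)."""
    if n == 0:
        raise ValueError("need at least one observation")
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


# Invented task counts for one condition, one entry per worker.
counts = [1, 1, 2, 3, 5, 8, 13, 21]
curve = survival_curve(counts, max_t=5)

# Interval around S(4): 4 of the 8 workers completed at least 4 tasks.
low, high = wilson_interval(successes=4, n=8)
print(curve)
print(f"S(4) = {curve[3]:.2f}, 95% interval about ({low:.2f}, {high:.2f})")
```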

We also estimated the effect of aesthetics on labor supply, as we did with the Fitts’ law study. The results show that the effect of aesthetics and feedback is substantial: all else equal, the pretty style of the interface produces 58% more use. This is statistically significant at p = 0.02.

Scope, Limitations, & Future Work

We view our technique as just one point in the space of preference measurement, along with A/B testing and techniques that have not yet been invented. Our technique has two key features. (1) We measure behavior for controlled tasks and interfaces in a labor market, instead of experimenting with real users on a production website like A/B testing. (2) We vary pay, artificially manipulating a user’s motivation, and from it infer a money metric of utility.

These unique aspects separate it from A/B testing. Interfaces to be A/B tested must be developed and tested to production quality and deployed to a real userbase. This constrains applicability in most companies, and indeed A/B testing is often limited to optimizing small variations in an existing UI—i.e., for late-stage designs. Our techniques can be used to obtain quick feedback for early-stage design decisions. Furthermore, A/B testing is adept at answering “what” users will do for a particular website, but not “why.” [] By paying users, we have more discretion to ask them survey questions to elicit “why,” and we also gain the ability to test abstract interfaces and tasks—such as our Fitts’ law interface—to isolate factors and remove confounds of real-world websites. Future researchers can replicate and extend the results, in the same labor market, by copying and re-running the study’s source code. This lets us develop and validate scientific models that generalize across concrete website instances. And by calculating utility in terms of money, we create a language to compare interfaces, tasks, and contexts and create generalizable knowledge. In summary, our techniques fill a niche in early-stage and generalizable interaction studies.

Mechanical Turk & Other Labor Markets

Although it is common to expect that Mechanical Turk workers are somehow “different” from normal computer users, the demographic data actually show workers to be remarkably representative of the Internet population. A substantial portion of workers have bachelor's, master's, and PhD degrees, for instance []. Workers use Turk not just to make money, but also to have fun and spend free time, and avoid boring or distasteful tasks similarly to other Internet users []. We encourage the skeptical reader to investigate this data. In fact, Mechanical Turk’s population is much more diverse and ecologically valid than the small-sample college populations commonly employed in HCI and Psychology research. Our Fitts’ law study recruited workers from 32 countries in five hours. Moreover, the setting of use for Turk workers (e.g. at home, at work, at a cafe, watching TV, on one’s own hardware) is often more naturalistic than a laboratory. Furthermore, we can survey workers for demographic or other information, and store it in a database with their worker-id. This allows us to examine the effect of context in a study, for a variety of personal characteristics, without additional experimental effort.

Yet at a higher level, our techniques are by no means limited to Mechanical Turk. In fact, there are more than ten alternative crowdsourced labor markets in current deployment [], and we expect more to develop in the near future. Each market has different characteristics. Some markets even “pay” users with non-monetary incentives. For instance, the company CrowdFlower deploys micro-tasks through gaming company Zynga, which rewards game players with upgraded “cows” in the game FarmVille in exchange for doing small pieces of work.

Finally, we expect that our techniques could be used without a labor market at all, by finding other ways to recruit users. Facebook or Google could run utility experiments by recruiting their own users through advertisements, and paying them small amounts on PayPal. Researchers could run a custom ad campaign in this way, targeting a subpopulation. Going one step further, our economic methods could in theory be applied within A/B tests themselves, creating utility-augmented A/B tests. For instance, Amazon might offer randomly-selected users the opportunity to discuss a product within an experimental social system in exchange for a few cents of store credit, and thus combine many of the benefits of a traditional A/B test with many of the benefits of economic utility analysis. Indeed, it is important to remember that our existing research techniques are all biased and limited, but with time we have learned how and when to trust them. We believe the crowd enables the future of HCI evaluation.

However, when we assume we can replace a user’s existing goals with money, we run into a number of potential hurdles. First, we cannot measure the value of their existing goals, only the cost (or value) inherent to the process of using the interface itself. This is a significant limitation, and difficult to get around in theory without data from actual use of the real system (e.g. a utility-augmented A/B test). Second, the researcher must be careful to avoid situations where the use of a money incentive adversely affects one’s decision process, as has been recorded in Behavioral Economics []. Third, the researcher’s freedom in defining the labor market worker’s goal with an interface comes with the difficulty of enforcing it. To do so, the researcher might employ quality measurement, which we will describe next.

Quality measurement. Many computer tasks, such as writing articles, blog posts, and authoring presentations, are open-ended. The quality of results is difficult to verify with a computer. We have not yet studied such tasks, because we need to know which tasks were completed successfully so we can determine who to pay. The standard technique on Mechanical Turk is to post the result of tasks back to Mechanical Turk as new “reviewing” tasks, having workers review the work of other workers. This is the basis of a growing body of quality-assurance techniques used on Mechanical Turk []. We hope to implement this technique in our software framework. Moreover, this will enable us to study the relationship between quality and utility. High quality articles are often more difficult to write, and likely to cost more. But we can measure, for instance, whether writing on a topic of personal interest to the worker results in both higher quality and lower cost utility.

  | Note: those guys did some sort of quality measurement

Cheating. Related to quality measurement is preventing cheaters and spammers on Mechanical Turk from abusing our experiments for money. Quality measurement will be critical for open-ended tasks, to prevent workers from submitting garbage results. It is also possible for someone to write a browser script to automate the submission of tasks without doing them himself. Our aesthetics task used CAPTCHAs to guard against automation, and our Fitts’ law task recorded the time of each click, on which we ran simple data analyses to validate that the clicks looked human. Furthermore, we generally set a limit, such as 50 tasks, on the amount of work a worker can do. This way, even a script would make at most a dollar for its author, reducing the incentive to write scripts.

General Elements of Utility Measurement

Now that we have seen two methods, let us break them down and look at some of the key design decisions that can be varied:

We will examine these in turn.

Choice of Metric Scale

Money is not the only metric we can use for utility. Instead of paying people to do tasks, we could give them points, kudos, cookies, bananas, or kisses: any scalar quantity that is (1) easy to value and (2) easy to give away.

Money happens to be a very nice lowest common denominator. Unfortunately, since it is so widespread, it also comes with baggage: introducing money can change the framing of a task. If Facebook started charging everyday users for access, they might grow more suspicious, for instance, than if a virtual reward was given and taken away.

On Mechanical Turk, money works well, because the workers already assume that every task will be paid. In fact, I suspect one might encounter more confounds if they did not pay for a task than if they did.

When you design a method for utility measurement, you should consider the connotations that your metric scale brings to the table, and choose one that is not likely to produce bias.

Choice of Object (Time vs. Task)

To determine when someone is attending, we instrument the object of attention with money fees or rewards. For instance, on Mechanical Turk, the object is an objective: completing our task, and we pay workers to complete each task. However, in the idealized method, the object was a thing: the Facebook website itself. In this situation, we paid workers per second that they spent actively browsing Facebook. Thus, in the objective version we paid per completed task, and in the thing version we paid per time.

These are the two types of payment, per time and per task, that I see as typical for instrumenting objects. It is more natural to pay for objectives per completed task, and for things per unit of time.

  | Recall that attention pursues an /object/--either an objective, or
  | a thing.  We infer that attention is present by either observing
  | (1) that the objective has been achieved, or (2) that the person
  | has spent some amount of time looking at, playing with, observing,
  | or interacting with the thing.  For instance, when someone
  | completes a /task/, we can say that they achieved an objective,
  | and thus must have attended.  In this way, a CAPTCHA is a test of
  | human attention.  We can also observe attention through
  | non-goal-oriented interactions with an object.  If you observe a
  | person watching television for seven minutes, it is likely that
  | they were attending to whatever was on the screen during those 7
  | minutes.
  |
  | Thus, we will measure attention either in /tasks completed/, or
  | amounts of /time/.  When we run a study, we can choose one or the
  | other measure, depending on the appropriateness to the situation.
  |
  | For instance, consider the examples in the introduction.  In one
  | situation, we paid the baseball player /per catch/.  This is
  | measuring attention at the level of a task.  In the other
  | situation, we paid the baseball player /per minute spent
  | scratching his buttocks/.  This is measured in time.

Choice of Setting (Virtual Laboratory vs. In-Situ)

A core distinction between a virtual laboratory and the idealized method is that the virtual laboratory replaces all intrinsic motivation with pay. This has the advantage that it gives the experimenter more control and flexibility, but also carries a couple of downsides.

First, it is less naturalistic and ecologically valid. Second, in a virtual economic laboratory we cannot measure the original value of the task itself, but only the cost of using the interface. For instance, with our Mechanical Turk method it would be impossible to capture value to a father of uploading photographs for his children, because we replace that value with money and throw away the results.

There is a general model behind this. Many of the situations in which we use technology can be described in the same way: we are trying to obtain some valuable goal, out in the world, but in order to do so we must go through some costly process with a user interface. The overall utility is then the value of the goal minus the cost of the process.

          Utility = Value of Goal - Cost of Process

The goal is generally associated with a task in HCI, and the process with a user interface.

          Utility = Value of Task - Cost of User Interface

On Mechanical Turk (and other such Virtual Laboratories), we substitute an artificial goal—obtaining the money we pay people—to motivate people to complete our arbitrary tasks. This gives us flexibility, but prevents us from measuring the goal.

          Utility = [ XX $$$ XX ] - Cost of Process

The upside is that we obtain a cleaner measurement of the cost. Furthermore, if we then measure the net utility separately, on a live site with real users, we will be able to add the Cost of Process back to that Utility to obtain the Value of Task. By combining experimental methods in this way, we can triangulate to determine each component separately.
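As a small worked illustration of this triangulation, the sketch below rearranges the equation above to Value of Task = Utility + Cost of Process. The function name and the numbers are hypothetical, chosen only to show the arithmetic; they are not measurements from our studies.

```python
def value_of_task(net_utility: float, cost_of_process: float) -> float:
    """Rearrange Utility = Value of Task - Cost of Process to solve for the value."""
    return net_utility + cost_of_process


# Hypothetical measurements, in cents per task (not data from our studies).
cost_of_process = 3.0  # would come from the virtual laboratory
net_utility = 1.5      # would come from a live site with real users

value = value_of_task(net_utility, cost_of_process)
print(f"Value of Task = {value:.1f} cents")
```

Both inputs must be expressed on the same metric scale (here, cents per task) for the addition to be meaningful.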

Choice of Price-Elicitation Method

At a higher level, one might consider alternative mechanisms for eliciting prices for choices altogether. Let us first consider some traditional methods, before we draw some more general principles.

Traditional methods

Traditional methods tend to run explicit auctions, e.g. soliciting bids from subjects. Ours is the first to determine preference indirectly via labor supply calculations at manipulated wages. This requires cognition on the part of the subjects, and allows them to learn the price as an embedded feature of the attention utility function.

If we compare our experimental results to other studies, it appears that our approach is superior for measuring attention utility. For instance, Ben-Bassat et al. [] used a traditional second-price auction to determine user preference for interfaces, implemented in a traditional face-to-face laboratory study. Subjects provided an explicit money bid for each interface. However, their method failed to measure a significant difference between interfaces that varied only in aesthetics—something our method finds a robust effect for.

Horton and Chilton [] copied, implemented, and wrote a paper on one of our early “alpha” ideas (without acknowledging that we gave them the idea) for preference elicitation: to measure a worker's reservation wage—the wage below which she will not do a task—by starting with a high wage and incrementally reducing it until the worker quits. However, this style of experiment suffers from strong observer effects: the sequence of declining prices workers are paid has a significant effect on when they decide to stop, and thus changes the “reservation wage” the method calculates. Horton and Chilton’s experiments found a significant effect of this pricing style on the results, but did not find a significant effect of the Fitts’ law index of difficulty condition—something our technique detects.

Our Approach

In comparison to these approaches, our method simplifies the user's choice. As opposed to most auction methods, we do not require users to express their full-resolution bid price, but rather only measure a binary choice at each datapoint—whether they attended to the task or not. As opposed to our decreasing-wage method copied by Horton and Chilton, we do not distract users with a changing price—taking their attention away from the task. We seek to keep the price stable for an extended period of time so that users stop attending to the price rationally, with slow thought, and make choices instead with fast thought. We seek to embed the price manipulation into the task, so that it becomes an implicit sense in the user's taste, at the same level as other instinctive, habitual, fast-thought aspects.
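To make the idea of binary choices at a stable price concrete, here is a minimal sketch of how such choices could be turned into a dollar figure. It is an illustration under my own simplifying assumptions (linear interpolation between tested wages, made-up data, hypothetical function names), not the analysis code used in our studies. For each interface it computes the fraction of datapoints at which workers attended at each fixed wage, and then finds the wage at which the second interface attracts the same attention as the first.

```python
from statistics import mean


def attendance_rates(choices_by_wage):
    """Return (wage, fraction attended) pairs sorted by wage.

    Each value is a list of binary choices: 1 = attended, 0 = did not.
    """
    return sorted((wage, mean(choices)) for wage, choices in choices_by_wage.items())


def wage_for_rate(rates, target):
    """Linearly interpolate the wage at which the attendance rate equals `target`."""
    for (w0, r0), (w1, r1) in zip(rates, rates[1:]):
        if r0 < r1 and r0 <= target <= r1:
            return w0 + (target - r0) * (w1 - w0) / (r1 - r0)
    raise ValueError("target rate is outside the measured range")


# Hypothetical binary choices at fixed wages (cents per task); not real data.
plain = {1.0: [1, 0, 0, 0], 2.0: [1, 1, 0, 0], 3.0: [1, 1, 1, 0]}
annoying = {1.0: [0, 0, 0, 0], 2.0: [1, 0, 0, 0], 3.0: [1, 1, 0, 0], 4.0: [1, 1, 1, 0]}

reference_wage = 2.0
target_rate = dict(attendance_rates(plain))[reference_wage]
matching_wage = wage_for_rate(attendance_rates(annoying), target_rate)

# Extra pay needed for the annoying interface to attract the same attention.
cost_of_annoyance = matching_wage - reference_wage
print(f"Estimated cost of the annoying interface: {cost_of_annoyance:.2f} cents per task")
```

Note that the subject never states a bid: every input is a yes-or-no observation of attention at a wage that stayed fixed while it was being observed.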

I believe that any method for measuring attention utility will involve some loss: it is impossible to embed a measured utility into an object perfectly. Rather, a method should be as transparent as possible. Future scientists might try to take the design further, for instance by visualizing the quantity of money with preattentive features (e.g. size, color) instead of written numbers, to push thought further into the fast lane. However, I suspect that our existing techniques are good enough for many purposes.

Discussion and Evidence of Success

Our definition of attention economics is measurable if we can successfully measure the magnitude of values in the fast-thinking attention utility function. In this chapter, I have proposed a method for doing this, which relies on embedding a price into the user's utility function for an object of attention. If we can successfully manipulate a user's fast-thinking attention utility function for a task, up or down by a prescribed metric amount, then we will have a successful measurement technique.

So then, have we been successful? Although I do not have absolute proof that we have successfully embedded price into the user's fast-thinking attention utility function, we do see evidence that this is the case. Consider that we successfully measured aesthetics. By definition, aesthetics are superficial qualities of perception. They are features that one perceives early in any span of attention; they are the result of fast thought. The fact that our method measured a signal for a purely aesthetic change to a user interface suggests that we have instrumented the fast-thought system for utility judgment.

Recall that the method of Ben-Bassat et al., in which subjects evaluated an aesthetic interface change within a slow-thinking second-price auction, failed to measure a significant effect for aesthetics. This provides further evidence that our method, designed for fast thought, is measuring something distinct from the traditional neoclassical perspective on utility.

Furthermore, our result has been replicated by another scientist. Just this past month, in fact, Daniel G. Goldstein et al. published a report using our Mechanical Turk method for measuring the cost of annoying advertisements. Whereas our aesthetics study included an annoying ad along with other aesthetic features, Goldstein et al. focused their experiment on manipulating only the advertisement. They still found a significant effect, with only the ad. Furthermore, they demonstrated the effect in a more realistically common task—processing repetitive emails. This work provides additional evidence that our method can measure fast-thought utility.

Finally, the result of our Fitts' law study shows that we can embed other qualities into the utility function in the same fashion. This method looks as if it may be generalizable.

  | As a counter-example, I can show the data I am collecting on
  | slow-thought utility analysis, and how divergent and unpredictable
  | it is.

Threats to Validity

Throughout this chapter, I have described some of the many potential distortions that our measurement function may be subject to. For instance, our users may be non-representative, or their behavior may be intrinsically altered by the mere presence of cash rewards. I do not claim that our method is perfect. However, I do believe that these obstacles are surmountable. In this chapter, I hope to have convinced you not that we can build a measurable science of attention economics today, but merely that it appears likely that we will be able to, and that we should pursue one.

If you would like additional motivation, then please continue flowing your attention onwards to the next chapter, where I detail some examples for why we need a measurable attention economics.

WHY Measure Attention Economics

Chapter 4.

We should pursue a measurable attention economics because it would be widely applicable to the world. In this chapter, I will describe some specific exciting ideas for how it could apply. I will focus on applications to human-computer systems, since that is my area of expertise, but I expect broad applicability in other areas of life as well.

Additional Mechanical Turk Studies

To begin, let us consider the ways in which we could extend the concrete study and measurement technique we implemented in the Utiliscope on Mechanical Turk. These ideas are more concrete, and readily doable. In fact, some of them are underway by me and my colleagues, and one of them has already been completed by other researchers studying Electronic Commerce.

Unfarmable CAPTCHAs

Consider our experiment measuring the utility of CAPTCHA user interfaces. There are two ways in which I would be excited to extend it. The first creates a new type of benchmark for CAPTCHAs.

CAPTCHAs are caught in an arms race. Their designers (e.g. Yahoo, Google, reCaptcha) need to make them difficult enough that computers cannot solve them, but users are annoyed when they are too difficult to solve. Meanwhile, sweatshop companies employ people to sit in rooms and solve them. CAPTCHAs thus need to optimize three goals:

  1. High utility for intended users to solve
  2. Difficult for computer programs to solve
  3. Low utility for sweatshop CAPTCHA farms to solve

Perhaps we could produce better CAPTCHAs if we had standard benchmarks against which we could compete. For example, consider this idea for a new type of CAPTCHA: we embed key words or images within funny videos of kittens, in such a way that the user must watch the entire video to determine the message to type in. By re-using the same or similar video clips of kittens repeatedly, we can make the video enjoyable the first time you watch it, but boring to sweatshop workers who must watch it over and over again. This could slow down the rate at which sweatshop workers complete CAPTCHAs from a couple per second to fewer than one per 30 seconds (or however long each video is), thus reducing the viability of the CAPTCHA sweatshop business.
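The effect of that slowdown on a farm's labor cost can be sketched with simple arithmetic. The hourly wage below is a hypothetical placeholder, not a figure from any study, and the function name is my own; only the two solving rates come from the scenario above.

```python
def cost_per_solve(hourly_wage: float, seconds_per_captcha: float) -> float:
    """Labor cost of one solved CAPTCHA for a farm paying `hourly_wage`."""
    solves_per_hour = 3600.0 / seconds_per_captcha
    return hourly_wage / solves_per_hour


HOURLY_WAGE = 1.00  # hypothetical farm wage in dollars; an assumption

text_cost = cost_per_solve(HOURLY_WAGE, seconds_per_captcha=0.5)    # "a couple per second"
video_cost = cost_per_solve(HOURLY_WAGE, seconds_per_captcha=30.0)  # one 30-second video

print(f"text: ${text_cost:.5f}  video: ${video_cost:.5f}  ratio: {video_cost / text_cost:.0f}x")
```

This counts only the time cost. If repeated viewing is also boring, the wage needed to keep workers attending would rise as well, which is the part our method is designed to measure.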

If an academic researcher (such as myself) had this idea, how would he test it? With traditional methods, I would need to have access to a large-scale user population (such as the one at Yahoo or Google) and roll out a live A/B test with real users. Our Mechanical Turk method, on the other hand, provides an alternative. We can benchmark the utility of this new CAPTCHA for both first-time users, and repeat users, and try to make it high attention-utility for the former and low-utility for the latter.

This would allow independent researchers, or small start-up companies, to demonstrate the effectiveness of a new type of user interface. Larger companies could then adopt and refine the basic proven ideas. Additionally, companies could compete on these benchmarks, just as hardware companies compete on industry performance benchmarks. This could encourage a marketplace of innovation for CAPTCHAs, and also for other types of user interfaces.

Advertising Economics

The second way we could extend that study is to take a deeper look at advertising. In fact, Goldstein et al. have done precisely this, extending our method to study the annoyingness of advertisements. In so doing, they were able, for the first time, to compare the economics of advertising among the three parties involved in any advertising attention transaction:

  1. The advertiser
  2. The website owner
  3. The user

Without our method, scientists could only measure the economics between the first two, because the advertiser pays money to the website owner for the ad. However, advertisements also occur at a cost to the user if they are annoying, and this decreases the utility of the website, which poses an additional downside for the website owner. Finally, Goldstein et al. pointed out that annoying ads also pose a potential downside to the advertiser himself, who loses reputation in the eyes of potential customers who view the site. This is a hidden attention economy of advertising.

To understand this economy, Goldstein et al. extended our method in some important ways. First, they incorporated a qualitative survey into the experiment, so that they recorded why users found the ads annoying alongside the quantitatively measured results of how much they found them annoying. Second, they measured the quality of work that occurred in the different conditions, and could determine how quality of work correlated with utility. Third, they used the utility data in a theoretical neoclassical model of the advertising economics. The measured utility values became a parameter for a variable in the model. Thus, the utility data served as a linchpin enabling them to bridge theory to practice. I anticipate these three methodological extensions generalizing to other attention economic situations as well.

  | *** ADD study

Security studies, like michael brooks

Mechanical Turk appears to be particularly applicable to usable security studies. Security interfaces, such as passwords and access control, are generally not a fun part of using a computer. They are usually in the way of a user's goal, rather than a part of it. This makes them annoying for users, but perfect for Utiliscope.

As we've explained before, Mechanical Turk and other virtual economic laboratories are ideal for measuring the cost of an interface, rather than the value of a task's goal. As a consequence, I have seen interest from the usable security community in using these methods for evaluating the cost of security mechanisms.

As an example, I am currently collaborating on a study to evaluate biometric password authentication mechanisms that record and match your unique inter-keystroke timings when typing a phrase. These systems inevitably have some degree of error, and this error lowers their utility. However, no one knows how important the error rate is, or what the key issue is holding these systems back from widespread deployment. We are using Utiliscope to investigate a set of such hypotheses. I have also been in discussion with other usable security researchers about investigating other issues.

Proving Adam Smith's Theory of Compensating Wage Differential

The neoclassical economic theory behind our Mechanical Turk method is the Compensating Wage Differential by Adam Smith—a theory that dates back to the origins of economics, but has not been experimentally validated! The reason is that experimental economists have found it very difficult—near impossible—to obtain an ideal experimental setup that allows them to manipulate the independent variables of the theory.

The Compensating Wage Differential includes five dimensions, such as “the agreeable or disagreeableness of the task” (i.e. is it nice?) and “the constancy or inconstancy of employment” (does it offer job security?), that are presumed to affect the wage that a worker is paid. However, experimental economists do not find themselves in situations where they can easily manipulate these five variables of jobs with real-life employers at physical jobs. Mechanical Turk, on the other hand, is an ideal setup, as we can post jobs and manipulate them in a variety of ways, and collect plenty of data. We are currently developing an experiment that proves that you can measure all five dimensions of Adam Smith's theory on Mechanical Turk.

  | Variations:
  |   - we can do this with a virtual economic laboratory -- mturk. where
  |     we upload controlled tasks.  This is what we prototyped.
  |   - or you could implement it in a live site, particularly using
  |     bitcoins.
  |   - or you could make adjustments to go for a longer-term neoclassic
  |     utility function, and perhaps compare the two.

  | Variation of Implementing on a live site.
  |   - Check out this real site that pays people with bitcoins
  |     So this is real, but let me give you a hypothetical vision for it
  |   - When we pay people to do a task, 
  | 
  |   - We can pay people to bootstrap a site, paying people for the
  |     crappy tasks that make the community better, slowly reducing the
  |     cost as we find bottlenecks
  |   - Or just vary the
  | 
  | Let's compare this to outcome-utility measures
  |   - Have people think and come up with a bid for a long period of time
  |   - We find big differences in WTA vs. WTP
  |   - When you solicit bids, lots of things matter and affect the
  |     outcome, so be careful.  It's possible this also affects us...
  | Proposed initial survey study for data on fb/google per-minute in
  | cents.

Reasoning about Human-Computer Systems

At a higher level, I believe that a measurable attention economics would allow us to make in-depth theoretical predictions of the possibilities and architectures of human-computer systems.

What is Possible? What is Difficult? What is "Performance"?

Consider, for instance, the issue of computational complexity in algorithms. We have a language and set of assumptions for analyzing and discussing an algorithm's performance, with O(n) notation, and models of time and space. What would an equivalent system be like for human-computer systems, which require not just computing power, but also human attention?

I believe such a system would be based on attention economics. A human-computer system runs on human attention. It should be evaluated on its ability to recruit attention, and to use it efficiently and effectively. That is, we should measure the system's inputs and outputs:

                    _____________________________
                   |                             |
   Attention       |    Human-Computer System    |      Product
   • Quantity  ---->                             ---->  • Quantity
   • Quality       |  (recruits human attention  |      • Quality
                   |     to produce results)     |
                   |_____________________________|

Any human-computer system is an information processing system that takes human attention as input and produces some information product result. The product might be a change in representation of information (such as summarizing an article), the production of new information (writing a new article), or the conveyance of information (advertising, education) to a human audience. The effectiveness of a human-computer system can be quantified by the quantity and quality of information product that it produces. The efficiency of a system can be quantified as the Quantity and Quality of the product divided by the Quantity and Quality of the attention required to create it—the bang per buck.
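A minimal sketch of this efficiency ratio follows. The text does not say how quantity and quality should be combined, so the sketch assumes, purely for illustration, that they multiply; the class name, units, and numbers are likewise hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Flow:
    """An input or output of a human-computer system."""
    quantity: float  # e.g. hours of attention, or number of articles produced
    quality: float   # unitless score; 1.0 is a baseline

    def magnitude(self) -> float:
        # ASSUMPTION: quantity and quality combine by multiplication.
        return self.quantity * self.quality


def efficiency(product: Flow, attention: Flow) -> float:
    """Product obtained per unit of attention consumed (the bang per buck)."""
    return product.magnitude() / attention.magnitude()


# Hypothetical system: 10 hours of baseline-quality attention yield 5 articles
# of half the baseline quality.
attention = Flow(quantity=10.0, quality=1.0)
product = Flow(quantity=5.0, quality=0.5)
print(f"efficiency = {efficiency(product, attention):.2f}")
```

Other ways of combining quantity and quality would change the numbers but not the shape of the comparison: two systems can be ranked only once both their inputs and outputs are put on a common scale.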

And in our case, the “buck” is literal; the amount of attention that a system recruits can be measured in the dollar-equivalent that would be required to pay people to do the task. Most websites encourage participation from users without pay, by providing intrinsically motivating tasks. If we want to quantify that intrinsic motivation, we can pay people (e.g. on Mechanical Turk, or a specially-recruited user study sample population) to replicate their efforts, and measure the wage necessary.

Furthermore, we can decompose human-computer systems into smaller parts, by drawing such a box around any subset of the system. Any task can be considered a system in this way: it takes human attention as input, and produces some sort of product as a result. Most human-computer processes are long chains of such tasks. For instance, consider this analogy:

   water     ___________________________________________
   pressure  -> [pipe step 1] -> [pipe step 2] -> ... ->  user goal
   force     ——————\ \————————————————\  \——————————————
                    \ \                \  \

You can envision tasks as giant pipes, through which human attention flows. The attention enters the pipe with a force, similar to water pressure, corresponding to the motivation of the users towards their goals. To get through the pipe, users must complete a sequence of tasks in multiple steps. Each step has some cost, or friction, that decreases the motivating force of attention, and also incurs other types of loss. If we model any of these steps as a subsystem, with its own quantity and quality of inputs and outputs, we could begin to deconstruct some of the complexity of human-computer system design.
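The pipe analogy can be written down as a toy model. The sketch below assumes, for illustration only, that motivation and friction are measured on the same scale (say, cents) and that friction simply adds up across steps; the function names and numbers are hypothetical.

```python
def attention_remaining(initial_motivation: float, step_costs: list[float]) -> list[float]:
    """Motivating force left after each step; it cannot fall below zero."""
    remaining = []
    motivation = initial_motivation
    for cost in step_costs:
        motivation = max(0.0, motivation - cost)
        remaining.append(motivation)
    return remaining


def completion_rate(motivations: list[float], step_costs: list[float]) -> float:
    """Fraction of users whose motivation exceeds the total friction of the pipe."""
    total_cost = sum(step_costs)
    finished = [m for m in motivations if m > total_cost]
    return len(finished) / len(motivations)


# Hypothetical three-step process and four users with different motivation.
step_costs = [1.0, 2.0, 0.5]
print(attention_remaining(5.0, step_costs))
print(completion_rate([1.0, 3.0, 4.0, 5.0], step_costs))
```

In this toy version, lowering the cost of any single step raises the completion rate only for users whose motivation sits just below the total friction, which is one reason to measure each step as its own subsystem.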

(However, let me be clear that I believe many human-computer systems are inherently complex, and cannot be deconstructed in this way.)

  | *** Value of communications
  | *** Thermodynamics theories, converting Attention into Work

Provable Economics of Security

A mindset of such formalization of the attention economics of human-computer systems may allow us to make formal claims about the security of human-computer systems. In other words, we may be able to prove aspects of internet security, economically.

Stefan Savage's research group at UCSD has done fantastic work uncovering the economics of internet security in the last five years. For instance, they have uncovered some of the economics of CAPTCHA farming, and the economics of SPAM email messaging. They recommend that computer security focus not just on static software defenses, but on economic defenses as well, identifying and exploiting the motivational weaknesses of one's adversaries. For example, they found that most SPAM emails were reliant on receiving payment through one of just three (?) financial exchanges, and that putting pressure on those banking institutions was the most effective way to prevent SPAM email.

Economics of CAPTCHA farms

I propose that attention economics has an analogous nature. Consider the aforementioned unfarmable CAPTCHA project. If we could benchmark the differences in difficulty of different types of CAPTCHAs, then we could predict the amount of motivation required to convince people to do them for illicit purposes. Evildoers use a few methods of recruiting attention. Some of them simply pay for it in third-world sweatshops, but many of them find other veins of attention to tap. For instance, there are porn sites that give users free porn if they answer CAPTCHAs, and those CAPTCHAs are then used by the site operator to sign up for new email accounts to send SPAM from. Additionally, some online games will motivate their players to do microtasks, such as filling out CAPTCHAs, in order to achieve in-game upgrades like fancy cows for Farmville.

These sources of attention are not always paid for via the traditional economy—they come from the attention economy. If we want to understand the ability of an attacker to recruit human attention, we will need to know more than the price of a sweatshop worker; we will also need to know about the attention market in video games, and the attention market in pornography.

Provable Internet Democracy

As another example, consider the challenge of soliciting democratic participation with government over the internet. President Obama used an idea submission website (http://opengov.ideascale.com) to solicit ideas for the following topic of government transparency:

   "How can we strengthen our democracy and promote efficiency and
    effectiveness by making government more transparent,
    participatory, and collaborative?"

Citizens voted on the ideas democratically. However, the platform was unfortunately overwhelmed by activity from special interest groups who upvoted ideas completely irrelevant to the initial call. The top ranked ideas for increasing government openness were eventually dominated by duplicate, irrelevant requests to legalize marijuana and force President Obama to release his birth certificate.

This illustrates a general problem: how might we prevent special groups from overwhelming a voting system? How might we achieve a reliable sample of citizen opinion?

One approach is for citizens to vote with their attention. It is easy for a special-interest group to mass-mail their constituents to click a button on a website, but it is more difficult to motivate them to write independent, thoughtful articles. This requires more motivation, and more prior attention utility. If a website were to weight user votes by the attention they put into voting, it might open new possibilities for design of the democratic process.

Furthermore, if we could measure the motivation, or attention utility, required to complete different levels of tasks, and the attention capacity available in different special interest groups, we might be able to prove that certain types of attacks from special interest groups can only succeed to a certain extent with such a system.
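To illustrate what weighting votes by attention might look like, here is a minimal sketch. How attention would actually be measured is left open; the sketch assumes each vote arrives with a number (say, minutes of effort) attached, and the function name, idea labels, and data are hypothetical.

```python
from collections import defaultdict


def tally(votes: list[tuple[str, float]], weighted: bool) -> dict[str, float]:
    """Sum votes per idea; each vote is (idea, attention spent on that vote)."""
    totals: dict[str, float] = defaultdict(float)
    for idea, attention in votes:
        # Unweighted: one vote, one point. Weighted: points equal attention spent.
        totals[idea] += attention if weighted else 1.0
    return dict(totals)


# Hypothetical data: five quick clicks from a mass mailing (0.25 minutes each)
# versus two thoughtful written submissions (3 minutes each).
votes = [("off-topic", 0.25)] * 5 + [("open data", 3.0)] * 2

unweighted = tally(votes, weighted=False)
by_attention = tally(votes, weighted=True)
print("one-click winner:", max(unweighted, key=unweighted.get))
print("attention-weighted winner:", max(by_attention, key=by_attention.get))
```

In this made-up example the mass-mailed idea wins a simple count but loses once votes are weighted, because its supporters would need to supply roughly five times more attention in total to overtake the smaller group.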

  | **** Crowdsourced Judicial System
  | ** Human-Computer Interactions

Because There will be Real Attention Money Soon Enough

A Bitcoin Attention Economy

Example: show that image

it takes crowdsourcing much broader

even if it doesn't succeed, likely that something will change here

as market leader, force others to compete and make it that easy

the economy will BECOME visible

the men who succeed will be those that understand

(Optional) Relating Attention vs. Outcomes

NOTE: I am currently writing this

 - Attention econ is in how opportunities occur vs. outcomes
 - How opportunities occur /defines/ our attention, which defines outcomes
 - The outcomes feed back to our opportunities
 - BASIS IN THE BRAIN
   - We used to think dopamine coded for outcomes
   - Now we see it connects outcome to occurrence
 - IN AND OUT OF SYNC/HARMONY
   - We want them in harmony.  This is represented in buddhism and
     hindu yoga.
   - Examples: lose weight, but don't, etc.  Procrastination.
   - We can measure at either end, and discover the difference between
     the two
     ................           ...............           ................
    .                .         .               .         .                .
    .                .         .               .         .                .
    . How Opportuni- . ----->  .   Attention   . ----->  .    Outcome     .
    . ties Occur     .         .   / Actions   .         .                .
    .                .         .               .         .                .
     ................           ...............           ................
            ^                                                    |
            |                                                    |
             \______  ... and feeds back to the mind     _______/

Conclusions

            Anything you put your mind to, you can do.
            - Anonymous New-Age Hippie

The billions of minds connected to the internet contain the greatest pool of intelligence ever available on Earth. In a very literal way, humanity will do what we put our minds to. We are limited by what we choose to attend to, with every moment of our lives.

Attention economics is the allocation of the Earth's minds. We should be aware of what Earth is attending to. We should reflect on the problems we are solving.

The existing economic model has produced robust financial institutions and markets, but it does not explain everything. There are many important problems on Earth that we do not seem to be attending to. I think we need to inspect our approach to these problems. I think we should study where our attention itself is going, and why it goes wrong. Why do we say we care about an outcome, but in the moment of action, fail to allocate our attention to obtaining it?

As a society, we say that we care about the environment, yet we waste energy and drive polluting cars. We cannot explain this with neoclassical economics. The neoclassical interpretation is that we must want to drive cars and waste energy, based on our behavior, and that we are selfish. But what does this give us? How can we take joint action towards improving our allocations of attention, if our shared language of value assumes that we are always doing exactly what we want?

We need an economics that lets us measure the gap between our voiced objectives and our actions, so that we can create scientific models of the problems in our behavior.

Table of Contents

Introduction

Attention Economics Beyond the Internet

Can we measure it?

The Attention Economic Model

Measuring Attention Utility

Measuring Attention Utility with Human-Computer Systems

A New Economic Model

Shifting Focus, like Copernicus

A New Focal Point

This would help many fields

... FOR and WITH Human-Computer Systems

1. Technology magnifies attention issues in society

2. Technology is EVALUATED by its ability to recruit human attention

3. We can use technology to measure attention economics

Excelsior! (Onwards!)

WHAT do we mean by Attention Economics

The distinction of Attention Economics

From Microeconomics to Nanoeconomics

What is attention?

Attention is our Processing

Attention has a QUALITY

We have a CAPACITY of attention

ACTION is a form of attention

We attend TO objects

Dual-nature of Objects

In summary

How is attention ALLOCATED?

We are constantly CHOOSING attention

Attention opportunities have UTILITY

Many choices of attention are semi-automatic

These choices can be said to be FAST THOUGHT

Fast thought chooses our Slow thought

Utility is a continuous model of Fast Thinking choice over time

We learn to improve our utility function from experience

We can manipulate utility to measure it

Summary

Limitations of the model

HOW to Measure Attention Economics

An Idealized Method

A Measure of Attention Utility

Measuring Cost with a Virtual Economic Laboratory on Mechanical Turk

Overview of the technique

Dirty Details: Making it work in Mechanical Turk

Implementing multiple conditions.

Selection bias.

Lopping off the long tail of job completions.

Temporal Market Price Fluctuations.

Case Studies

Utility of Efficiency: Testing Fitts’ Law

Study design and execution.

Analyzing the data.

Utility of Aesthetics & Feedback: CAPTCHAs

Study design and execution.

Survival Analysis.

Scope, Limitations, & Future Work

Mechanical Turk & Other Labor Markets

General Elements of Utility Measurement

Choice of Metric Scale

Choice of Object (Time vs. Task)

Choice of Setting (Virtual Laboratory vs. In-Situ)

Choice of Price-Elicitation Method

Traditional methods

Our Approach

Discussion and Evidence of Success

Threats to Validity

WHY Measure Attention Economics

Additional Mechanical Turk Studies

Unfarmable CAPTCHAs

Advertising Economics

Security studies, like michael brooks

Proving Adam Smith's Theory of Compensating Wage Differential

Reasoning about Human-Computer Systems

What is Possible? What is Difficult? What is "Performance"?

Provable Economics of Security

Economics of CAPTCHA farms

Provable Internet Democracy

Because There will be Real Attention Money Soon Enough

A Bitcoin Attention Economy

it takes crowdsourcing much broader

even if it doesn't succeed, likely that something will change here

the economy will BECOME visible

the men who succeed will be those that understand

(Optional) Relating Attention vs. Outcomes

Conclusions