What This Book Is About
This book is not about information architecture, although you will find information architecture principles alluded to throughout it. And this book is also not about visual design, although you will find that the backdrop of good visual design is assumed throughout.
This book is about interaction design: specifically, interaction design on the Web. And even more specifically, about rich interaction design on the Web. It is a distillation of best practices, patterns, and principles for creating a rich experience unique to the Web.
By unique I mean that the Web comes with its own context. It is not the desktop. And while over time the lines between desktop and Web blur more and more, there is still a unique aspect to creating rich interactions on the Web. Editing content directly on the page (e.g., In-Page Editing, as we discuss in Chapter 1) borrows heavily from the desktop—but has its own unique flavor when applied to a web page. This book explores these unique rich interactions as a set of design patterns in the context of a few key design principles.
What do we mean by design patterns?
Christopher Alexander coined the term “patterns” in his seminal work A Pattern Language: Towns, Buildings, Construction (Oxford University Press) to catalog common architectural solutions to human activities. He described a pattern as:
…a problem which occurs over and over again in our environment, and then describes the core of the solution to that problem…
Patterns were later applied to software in the book Design Patterns: Elements of Reusable Object-Oriented Software (Addison-Wesley), by the Gang of Four (Erich Gamma, Richard Helm, Ralph Johnson, and John M. Vlissides). A few years later design patterns were extended to the realm of user interface design.
It is the latter form of patterns that we present in this book: interaction design patterns. You will find 75+ patterns illustrating the most common techniques used for rich web interaction. Each design pattern is illustrated by examples from various websites. Since the patterns described are interactive, we use a generous number of figures to explain the concepts. We tease out the nuances for a given solution as well as identify patterns to be avoided (anti-patterns). Best practice sections call out suggestions along the way.
The patterns are presented in the context of six design principles, which form the framework for the book:
Make It Direct
As Alan Cooper states: “Where there is output, let there be input.” This is the principle of direct manipulation. For example, instead of editing content on a separate page, do it directly in context. The chapters for this principle (Chapters 1–3) include patterns for In-Page Editing, Drag and Drop, and Direct Selection.
Keep It Lightweight
While working on a redesign of Yahoo! 360, designer Ericson deJesus used the phrase “light footprint” to describe the need to reduce the effort required to interact with the site. A primary way to create a light footprint is through the use of Contextual Tools. This principle explores the various patterns for these in Chapter 4, Contextual Tools.
Stay on the Page
The page refresh is disruptive to the user’s mental flow. Instead of assuming a page refresh for every action, we can get back to modeling the user’s process and decide intelligently when to keep the user on the page. Ways to overlay information or provide it in the page flow are discussed in Chapters 5 and 6, Overlays and Inlays, respectively. Revealing dynamic content is discussed in Chapter 7, Virtual Pages. In the last chapter of this section, Chapter 8, we discuss Process Flow, where instead of moving from page to page, we can create in-page flows.
Provide an Invitation
Discoverability is one of the primary challenges for rich interaction on the Web. A feature is useless if users don’t discover it. A key way to improve discoverability is to provide invitations, which cue the user to the next level of interaction. This section, comprising Chapters 9 and 10, looks at Static Invitations, those offered statically on the page, and Dynamic Invitations, those that come into play in response to the user.
Use Transitions
Animations, cinematic effects, and various other types of visual transitions can be powerful techniques. We explore engagement and communication in Chapter 11, looking at a set of the most common Transitional Patterns; Chapter 12 is devoted to the Purpose of Transitions. A number of anti-patterns are explored as well.
React Immediately
A responsive interface is an intelligent interface. This principle explores how to create a rich experience through lively responses. In Chapter 13, a set of Lookup Patterns is explored, including Live Search, Live Suggest, Refining Search, and Auto Complete. In Chapter 14, we look at a set of Feedback Patterns, including Live Previews, Progressive Disclosure, Progress Indication, and Periodic Refresh.
* * *
 See works such as Jenifer Tidwell’s Designing Interfaces: Patterns for Effective Interaction Design (O’Reilly) and the pattern library of Martijn van Welie (http://www.welie.com/).
Who Should Read This Book
Designing Web Interfaces is for anyone who specifies, designs, or builds web interfaces.
Web designers will find the principles especially helpful as they form a mental framework, defining a philosophy of designing nuanced rich interactions. They will also find the patterns a welcome addition to their design toolbox, as well as find the hundreds of provided examples a useful reference. And of course the best practices should provide a nice checklist reminder for various interaction idioms.
Product managers will find the patterns and examples to be excellent idea starters as they think through a new business problem. Though this book does not provide programming solutions, web developers will nevertheless appreciate the patterns, as they can be mapped directly into specific code solutions. For everyone involved, the patterns form a vocabulary that can span product management, design, and engineering, which in the end forms the basis for clearer cross-team communication.
You’ll also find that whether you are just starting out or you are a grizzled veteran, the wealth of real-world examples in the context of design principles and patterns will be a benefit to your daily work.
What Comes with This Book
This book has a companion website (http://designingwebinterfaces.com) that serves as an addendum containing updated examples; additional thoughts on the principles, patterns, and best practices; and helpful links to articles and resources on designing web interfaces.
All of the book’s diagrams and figures are available under a Creative Commons license for you to download and use in your own presentations. You’ll find them at Flickr (http://www.flickr.com/photos/designingwebinterfaces/).
Conventions Used in This Book
This book uses the following typographic conventions:
Used for example URLs, names of directories and files, options, and occasionally for emphasis.
Indicates pattern names.
* * *
This indicates a tip, suggestion, or general note.
* * *
You can find all of the figure examples on our companion Flickr site (http://flickr.com/photos/designingwebinterfaces). The figures are available for use in presentations or other derivative works provided you respect the Creative Commons license and provide attribution to this work. An attribution usually includes the title, author, publisher, and ISBN. For example: “Designing Web Interfaces, by Bill Scott and Theresa Neil, Copyright 2009 Bill Scott and Theresa Neil, 978-0-596-51625-3.”
If you feel your use of examples falls outside fair use or the permission given above, feel free to contact us at email@example.com.
We’d Like to Hear from You
We have tested and verified the information in this book to the best of our ability, but you may find that features have changed or that we may have made a mistake or two (shocking and hard to believe). Please let us know about any errors you find, as well as your suggestions for future editions by writing to:
O’Reilly Media, Inc.
1005 Gravenstein Highway North
Sebastopol, CA 95472
800-998-9938 (in the United States or Canada)
707-829-0515 (international or local)
We have a web page for this book where we list examples and any plans for future editions. You can access this information at:
You can also send messages electronically. To be put on the mailing list or request a catalog, send an email to:
To comment on the book, send an email to:
For more information about our books, conferences, Resource Centers, and the O’Reilly Network, see our website at:
Safari Books Online
When you see a Safari® Books Online icon on the cover of your favorite technology book, that means the book is available online through the O’Reilly Network Safari Bookshelf.
Safari offers a solution that’s better than e-books. It’s a virtual library that lets you easily search thousands of top tech books, cut and paste code samples, download chapters, and find quick answers when you need the most accurate, current information. Try it for free at http://safari.oreilly.com.
Acknowledgments
Writing this book was not just the effort of Theresa Neil and myself. There were many direct contributors, and even more who inspired us indirectly.
Most importantly I wish to thank Ruth. You are my wonderful wife of 30 years, my friend, and an amazing mother. Without your patience and support I could not have gotten this book finished.
I am deeply indebted to my editors at O’Reilly. Double kudos go to Mary Treseler, who patiently cajoled Theresa and me into completing this work. You provided valuable feedback early in the process. Thanks to the rest of the team that brought this book to life: Rachel Monaghan, Marlowe Shaeffer, Ron Bilodeau, Colleen Gorman, Adam Witwer, and Robert Romano, to name a few.
Anyone who has written a book also knows that the technical reviewers are your critical test market. Thanks for the helpful praise and constructive criticism from Christian Crumlish, Dan Saffer, Luke Wroblewski, Juhan Sonin, Kevin Arthur, and Alan Baumgarten. Though I could not address every issue, I took each comment seriously and they had a significant impact on the finished product.
I owe a lot to my time at Yahoo!. Thanks to Erin Malone for sending me an email out of the blue, which eventually led to her hiring me at Yahoo!. There I was surrounded by brilliant people and given the opportunity to succeed. To Erin, Matt Leacock, and Chanel Wheeler for founding the Yahoo! Design Pattern Library. Thanks to Larry Tesler and Erin, who gave me the opportunity to lead and evangelize the launch of the public Yahoo! Design Pattern Library. It was in my role as pattern curator that I crystallized much of the thinking contained in this book. A special thanks to the many talented designers and developers who gave me continual feedback and inspired me with their craftsmanship. The YUI team, and in particular Nate Koechley and Eric Miraglia, for the formulation of “Interesting Moments” grids and for the opportunity to tie the patterns to real-world code. My co-evangelists: Douglas Crockford, Iain Lamb, Julien Lecomte, and Adam Platti. My good friend, Darren James, who encouraged me along the way. Thanks to the many talented designers that I got the chance to collaborate with and whose thoughts are found sprinkled throughout this text: Karon Weber, Samantha Tripodi, Ericson deJesus, Micah Laaker, Luke Wroblewski, Tom Chi, Lucas Pettinati, Kevin Cheng, Kathleen Watkins, Kiersten Lammerding, Annette Leong, Lance Nishihira, and many others.
Outside of Yahoo!, my thinking was encouraged and matured by knowing and learning from Dan Saffer (Adaptive Path), Ryan Freitas (Adaptive Path), Aza Raskin (Humanized), Scott Robbins (Humanized), Peter Merholz (Adaptive Path), and David Verba (Adaptive Path). A special debt of gratitude goes to those in the pattern community: to Jenifer Tidwell for pointing the way to patterns, to Martijn van Welie for his excellent pattern library, to James Refell and Luke Wroblewski for their work on patterns at eBay, and to Christian Crumlish, the current pattern curator at Yahoo!, for his clear thinking. Thanks to Jesse James Garrett, for not only giving Ajax a name, but inviting me to the first Ajax Summit and then taking me on tour with him. Teaching in the Designing for Ajax Workshops gave me the confidence to write this book and tested the material in front of a live audience.
And thanks to the many companies and conference coordinators that invited me to speak. Sharing this material with thousands of listeners was invaluable in determining what resonates with most designers and developers. In no particular order (each listed with the company or event that invited me): Jared Spool (UIE), Ben Galbraith and Dion Almaer (Ajaxian/Ajax Experience), Kathryn McKinnon (Adobe), Jeremy Geelan (SysCon), Rashmi Sinha (BayCHI/Slideshare), Aaron Newton (CNET), Brian Kromrey (Yahoo! UED courses), Luke Kowalski (Oracle), Sean Kane (Netflix), Reshma Kumar (Silicon Valley Web Guild), Emmanuel Levi-Valensi (People in Action), Bruno Figueiredo (SHiFT), Matthew Moroz (Avenue A Razorfish), Peter Boersma (SIGCHI.NL), Kit Seeborg (WebVisions), Will Tschumy (Microsoft), Bob Baxley (Yahoo!), Jay Zimmerman (Rich Web Experience), Dave Verba (UX Week). Other conferences and companies that I must thank: Web Builder 2.0, eBig, PayPal, eBay, CSU Hayward, City College San Francisco, Apple, and many others.
My deep appreciation goes to Sabre Airline Solutions, and especially Brad Jensen, who bet on me and gave me a great opportunity to build a UX practice in his organization; and to David Endicott and Damon Hougland, who encouraged me to bring these ideas to the public. And to my whole team there for helping Theresa and me vet these ideas in the wild. Many patterns in this book were born out of designing products there.
Finally, I want to thank Netflix, where I am now happily engaged in one of the best places to work in the world. Thanks for supporting me in this endeavor and for teaching me how to design and build great user experiences.
I would like to gratefully acknowledge the following folks:
Aaron Arlof, who provided the illustrations for this book. They are the perfect representation of the six principles.
Brad Jensen, my vice president at Sabre Airline Solutions, who had me interview Bill in the first place. Without Bill’s mentoring and training I would not be in this field.
Damon Hougland, who helped Bill and me build out the User Experience team at Sabre.
Jo Balderas, who made me learn to code.
Darren James, who taught me how to code.
All of my clients who have participated in many a whiteboard session, enthusiastically learning and exploring the patterns and principles of UI design, especially Steven Smith, Dave Wilby, Suri Bala, Jeff Como, and Seth Alsbury, who allowed me to design their enterprise applications at the beginning of the RIA revolution. A special thanks to my current colleagues: Scott Boms of Wishingline, Paulo Viera, Jessica Douglas, Alan Baumgarten, and Rob Jones.
Most importantly, I wish to thank my husband for his unwavering support, and my parents for their encouragement. And my son, Aaron, for letting me spend so many hours in front of the computer.
Part I. Make It Direct
On a recent trip to Big Sur, California, I took some photos along scenic Highway 1. After uploading my pictures to the online photo service, Flickr, I decided to give one of the photos a descriptive name. Instead of “IMG_6420.jpg”, I thought a more apt name would be “Coastline with Bixby Bridge.”
The traditional way to do this on the Web requires going to the photo’s page and clicking an edit button. Then a separate page for the photo’s title, description, and other information is displayed. Once on the editing page, the photo’s title can be changed. Clicking “Save” saves the changes and returns to the original photo page with the new title displayed. Figure 1 illustrates this flow.
Figure 1. Web applications have typically led the user to a new page to perform editing
In Flickr you can edit the photo title just like this. However, Flickr’s main way to edit photos is much more direct. In Figure 2 you can see that simply clicking on “IMG_6420.jpg” surrounds the title with editing controls. You have entered editing mode directly with a single click.
Editing directly in context is a better user experience because it does not require the user to switch contexts. As an added bonus, making it easier to edit the photo’s title, description, and tags means more metadata gets recorded for each photo, resulting in a better search and browsing experience.
Figure 2. In Flickr, clicking directly on the title allows it to be edited inline
Make It Direct
The very first websites were focused on displaying content and making it easy to navigate to more content. There wasn’t much in the way of interactivity. Early versions of HTML didn’t include input forms for users to submit information. Even after both input and output were standard in websites, the early Web was still primarily a read-only experience punctuated by the occasional user input. This separation was not by design but due to the limits of the technology.
Alan Cooper, in the book About Face 3: The Essentials of Interaction Design, describes this false dichotomy:
…many programs have one place [for] output and another place [for] input, [treating them] as separate processes. The user’s mental model…doesn’t recognize a difference.
Cooper then summarizes this as a simple rule: Allow input wherever you have output. More generally we should make the interface respond directly to the user’s interaction: Make it direct.
To illustrate this principle, we look at some broad patterns of interaction that can be used to make your interface more direct. The next three chapters discuss these patterns:
In-Page Editing
Directly editing content in the page.
Drag and Drop
Moving objects around directly with the mouse.
Direct Selection
Applying actions to directly selected objects.
* * *
 Cooper, Alan et al. About Face 3: The Essentials of Interaction Design (Wiley, 2007), 231.
 This is a restatement of the principle of Direct Manipulation coined by Ben Scheiderman (“Direct manipulation: a step beyond programming languages,” IEEE Computer 16 [August 1983], 57–69).
Chapter 1. In-Page Editing
Content on web pages has traditionally been display-only. If something needs editing, a separate form is presented with a series of input fields and a button to submit the change. Letting the user directly edit content on the page follows the principle of Make It Direct.
This chapter describes a family of design patterns for directly editing content in a web page. There are six patterns that define the most common in-page editing techniques:
Single-Field Inline Edit
Editing a single line of text.
Multi-Field Inline Edit
Editing more complex information.
Overlay Edit
Editing in an overlay panel.
Table Edit
Editing items in a grid.
Group Edit
Changing a group of items directly.
Module Configuration
Configuring settings on a page directly.
The most direct form of In-Page Editing is to edit within the context of the page: we don’t leave the page, and the editing happens directly in the page’s content.
The advantage of Inline Edit is the power of context. It is often necessary for users to continue to see the rest of the information on the page while editing. For example, it is helpful to see the photo while editing the photo’s title, as explained in the next section, Single-Field Inline Edit.
It is also useful when editing an element that is part of a larger set. Disqus, a global comment service, provides inline editing for comments (Figure 1-1). After posting a comment and before anyone replies to the comment, an edit link is provided. The editing occurs within the context of the rest of the comments shown on the page.
* * *
If editing needs the context of the page, perform the editing inline.
* * *
Figure 1-1. Disqus allows comments to be edited inline within the context of other comments
The first two patterns, Single-Field Inline Edit and Multi-Field Inline Edit, describe techniques for bringing direct inline editing into the page.
Single-Field Inline Edit
The simplest type of In-Page Editing is editing a single field of text inline. The editing happens in place instead of in a separate window or on a separate page. Flickr provides a canonical example of Single-Field Inline Edit (Figure 1-2).
Figure 1-2. Flickr provides a straightforward way to edit a photo’s title directly inline
The flow is simple. Click on the title to start editing. When you are done, hit the “Save” button, and the title is saved in place. Flickr was one of the first sites to employ this type of in-page editing. As a testament to its usefulness, the interaction style as first designed has changed little over the last few years.
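Conceptually, the pattern is a tiny state machine: the title toggles between display and edit modes, committing the draft on save and discarding it on cancel. The sketch below is a hypothetical model (none of these names come from Flickr); a real implementation would wire these transitions to DOM events.

```javascript
// Minimal model of the Single-Field Inline Edit lifecycle.
// All names are illustrative; a real implementation would bind
// beginEdit/save/cancel to click, "Save" button, and Esc events.
function createInlineEditor(initialValue) {
  let value = initialValue; // the committed (displayed) value
  let draft = null;         // the in-progress edit, if any
  return {
    mode: () => (draft === null ? "display" : "edit"),
    beginEdit() { draft = value; },          // click on the title
    type(text) { if (draft !== null) draft = text; },
    save() {                                 // "Save" button: commit the draft
      if (draft !== null) { value = draft; draft = null; }
      return value;
    },
    cancel() { draft = null; return value; }, // Esc/cancel: discard the draft
    value: () => value,
  };
}
```

Renaming the Big Sur photo then becomes: begin editing, type the new title, and save; canceling at any point leaves the committed title untouched.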
But there are some challenges to consider.
Just how discoverable is this feature? In this example, there are a number of cues that invite the user to edit. The invitations include:
Showing a tool tip (“Click to edit”)
Highlighting the background of the editable area in yellow
Changing the cursor to an edit cursor (I-beam)
But all of these cues appear only after the user pauses the mouse over the title (a mouse hover). Discoverability therefore depends on the user hovering over the title and then noticing the invitations.
To make the feature more discoverable, invitational cues could be included directly in the page. For example, an “edit” link could be shown along with the title. Clicking the link would trigger editing. By showing the link at all times, the edit feature would be made more discoverable.
But this has to be balanced with how much visual noise the page can accommodate. Each additional link or button makes the page harder to process and can lead to a feature not being utilized due to the sheer volume of features and their hints shown on the page.
* * *
If readability is more important than editability, keep the editing action hidden until the user interacts with the content.
* * *
Yahoo! Photos took this approach for editing titles (Figure 1-3). When showing a group of photos, it would be visually noisy to display edit links beside each title. Instead, the titles are shown without any editing adornment. As the mouse hovers over a photo title, the text background highlights. Clicking on the title reveals an edit box. Clicking outside of the edit field or tabbing to another title automatically saves the change. This approach reduces the visual noise both during invitation and during editing. The result is a visually uncluttered gallery of photos.
Figure 1-3. Editing titles in Yahoo! Photos keeps visual clutter to a minimum; it simply turns on a visible edit area during editing
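This implicit-save behavior can be modeled the same way: moving focus to another title, or clicking outside, commits the edit in progress. All names below are illustrative; a real version would hook the focus and blur events of the gallery's title elements.

```javascript
// Sketch of implicit-save inline editing across a gallery of titles:
// focusing another title (click or tab) commits the one being edited,
// and clicking outside (blur) commits it as well. Hypothetical names.
function createGallery(titles) {
  const values = titles.slice();
  let active = -1;   // index of the title being edited, -1 = none
  let draft = null;
  const commit = () => {
    if (active !== -1) { values[active] = draft; active = -1; draft = null; }
  };
  return {
    focus(i) { commit(); active = i; draft = values[i]; }, // click/tab into a title
    type(text) { if (active !== -1) draft = text; },
    blur: commit,                                          // click outside: save implicitly
    titles: () => values.slice(),
  };
}
```

Note there is no explicit Save button in this model, which is exactly what keeps the gallery visually uncluttered.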
Another concern that arises with inline editing is accessibility. Accessibility affects a wider range of users than you might first consider: assistive technologies help those with physical impairments, medical conditions, problems with sight, and many other conditions.
Assistive technologies generally parse the page’s markup to find content, anchors, alternate titles for images, and other page structure. If the inline edit feature does not contain explicit markup built into the page (such as an explicit, visible edit link), assistive technologies cannot easily discover the inline edit feature.
In a sense, relying on the mouse to discover features will prevent some users from being able to edit inline. As mentioned before, providing an explicit edit link helps with discoverability (as shown previously in Figure 1-1). But as a by-product it also makes the feature more accessible.
* * *
Providing an alternative to inline editing by allowing editing on a separate page can improve accessibility.
* * *
There is a natural tension between direct interaction and a more indirect, easily accessible flow. It is possible to relieve this tension by providing both approaches in the same interface. Flickr actually does this by offering an alternate, separate page for editing (Figure 1-4).
Figure 1-4. Flickr allows you to also edit a photo’s title, description, and tags in a separate page
* * *
 We use the term “design patterns” to denote common solutions to common problems. Design patterns originate from Christopher Alexander’s book A Pattern Language (Oxford University Press). You can read a series of essays from me (Bill) and others on design patterns at http://www.lukew.com/ff/entry.asp?347
 When the Yahoo! Design Pattern Library (http://developer.yahoo.com/ypatterns/) was launched, this pattern was not included in the initial set of patterns due to an internal debate over this issue of discoverability. In fact, one of the reviewers, a senior designer and frequent user of Flickr, had only recently discovered the feature. As a result, we withheld the pattern from the public launch.
 Yahoo! Photos was replaced in 2007 with Flickr.
Multi-Field Inline Edit
In our previous example, a single value was being edited inline. What happens if there are multiple values, or the item being edited is more complex than a string of text and you still would like to edit the values inline?
The pattern Multi-Field Inline Edit describes this approach: editing multiple values inline.
37 Signal’s Backpackit application uses this pattern for editing a note (Figure 1-5). A note consists of a title and its body. For readability, the title is displayed as a header and the body as normal text. During editing, the two values are shown in a form as input text fields with labeled prompts.
Figure 1-5. Backpackit reveals a multi-field form for editing a note’s title and body
In Single-Field Inline Edit the difference between display mode and edit mode can be more easily minimized, making the transition less disruptive. But when editing multiple fields, there is a bigger difference between what is shown in display mode and what is needed to support editing.
Readability versus editability
Readability is a primary concern during display. But the best way to present editing is with the common input form. The user will need some or all of the following:
Help text for user input
Assistive input (e.g., a calendar pop-up or a drop-down selection field)
Editing styles (e.g., edit fields with 3D sunken style)
The edit mode will need to be different in size and layout, as well as in the number and type of components used. This means that moving between modes has the potential to be a disruptive experience.
In our example, the form for editing the note takes up a larger space on the page than when just displaying the note.
Blending display and edit modes
Ideally you would like the two modes to blend together in a seamless manner. Bringing the edit form into the page flow will have an effect on the rest of the page content. One way to smooth out the transition is by a subtle use of animation. Backpackit does this by fading out the display view and fading in the edit view at the same time (see the cross-fade in Figure 1-5).
Another approach is to use the same amount of space for both display and edit modes. In Yahoo! 360, you can set a status message for yourself. Your current status shows up on your 360 home page as a “blast,” represented as a comic book-style word bubble. Visually it looks like a single value, but there are actually three fields to edit: the blast style, the status, and any web page link you want to provide when the user clicks on your blast. Figure 1-6 shows the blast as it appears before editing.
Figure 1-6. Yahoo! 360 shows an “Edit Blast” link to invite editing
Figure 1-7 shows how the blast appears during editing. Notice that the edit form is designed to show both modes (display and editing) in the same visual space.
Figure 1-7. Yahoo! 360 brings the editing into the displayed blast; the difference between display and edit modes is minimized
The size similarity was not an accident. During design there was a concerted effort to make the display mode slightly larger without losing visual integrity, while accommodating the editing tools in the space of the bubble.
If the two modes (display and editing) are in completely separate spaces, the user may lose a sense of what effect the change will have during display. In Yahoo! 360, you can change the type of bubble and immediately see what it will look like. Switching from a “quote” bubble to a “thought” bubble is reflected while still in editing mode (Figure 1-8). This would not be possible if editing happened in a separate edit form.
Figure 1-8. Yahoo! 360 immediately displays the new blast type while still in edit mode
* * *
 What You See Is What You Get: an interface where content displayed during editing appears very similar to the final output.
Overlay Edit
The previous two patterns brought editing inline to the flow of the page. Inline editing keeps the editing in context with the rest of the elements on the page.
Overlay Edit patterns bring the editing form just a layer above the page. The user still does not leave the page, but the editing does not happen directly in the flow of the page; instead, a lightweight pop-up layer (e.g., a dialog) is used for the editing pane.
There are several reasons for choosing Overlay Edit instead of Inline Edit.
Sometimes you can’t fit a complex edit into the flow of the page. If the editing area is just too large, bringing editing inline can shuffle content around on the page, detracting from the overall experience. A noisy transition from display to edit mode is not desirable.
At other times you might choose to interrupt the flow, especially if the information being edited is important in its own right. Overlays give the user a definite editing space. A lightweight overlay does this job nicely.
* * *
An Overlay Edit is a good choice if the editing pane needs dedicated screen space and the context of the page is not important to the editing task.
* * *
Yahoo! Trip Planner is an online place for creating and sharing trip plans. Trips contain itinerary items that can be scheduled; when an item is scheduled, the itinerary shows its dates. Each item can be edited in an overlay (Figure 1-9).
Figure 1-9. Yahoo! Trip Planner provides a complex editor in an overlay for scheduling an itinerary item
“Sun Jun 4 12:00am—Mon Jun 5 12:00am” is easier to read than a format appropriate for editing (Figure 1-10). Using an editor prevents errors when entering the start and end dates for a specific itinerary item.
Figure 1-10. Yahoo! Trip Planner provides an overlay editor for adjusting itinerary times
Since the range of dates is known, Trip Planner uses a series of drop-downs to pick the start and end dates along with the time.
It should be noted that using multiple drop-downs for choosing the hour and minute is not the best experience. Although not in the context of an overlay, a better example of choosing an event time can be found at Upcoming.org (Figure 1-11).
Figure 1-11. Upcoming provides a better experience for choosing time of day
The experience of picking a time from a single list (or typing the time in) is more direct than navigating multiple drop-downs.
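Such a single flat list is also easy to populate programmatically. A small sketch follows; the 30-minute step and the label formatting are assumptions for illustration, not Upcoming's actual implementation.

```javascript
// Generate one flat list of time-of-day options (every 30 minutes),
// as a single list control would present them, instead of separate
// hour/minute/am-pm drop-downs. Step size and formatting are assumed.
function timeOptions(stepMinutes = 30) {
  const options = [];
  for (let m = 0; m < 24 * 60; m += stepMinutes) {
    const h24 = Math.floor(m / 60);
    const h12 = h24 % 12 === 0 ? 12 : h24 % 12;      // 12-hour clock
    const mm = String(m % 60).padStart(2, "0");
    const suffix = h24 < 12 ? "am" : "pm";
    options.push(`${h12}:${mm}${suffix}`);
  }
  return options; // e.g., "12:00am", "12:30am", ... "11:30pm"
}
```

With 48 half-hour entries, the whole day fits in one scrollable, type-ahead-friendly list.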
Why an overlay?
An overlay should be considered when:
The editing module is considerably larger than the display values.
Opening an area on the page for the editing module would be distracting or push important information down the page.
There is concern that the opened information might go partially below the fold. An overlay can be positioned to always be visible in the page.
You want to create a clear editing area for the user.
What you are editing is not frequently edited. Having to click on an edit link, adjust to the pop-up location, perform your edit, and close the dialog is a tedious way to edit a series of items. In such cases, opt to either dedicate a space on the page for each item as it is selected, or allow the editing to occur in context to remove some of the time required to deal with an overlay.
What you are editing is a single entity. If you have a series of items, you should not obscure the other, similar items with an overlay. By allowing the edit to occur in context, you can see the other items’ values while editing.
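One mechanical consequence of the points above is that an overlay, unlike inline content, can be positioned so it never falls below the fold. A minimal clamping sketch, with hypothetical names and a simplified box model:

```javascript
// Position an overlay near its anchor (e.g., the "edit" link) while
// clamping it to the visible viewport, so it cannot extend past the
// fold. Names and the box model here are illustrative assumptions.
function positionOverlay(anchor, overlay, viewport) {
  // Prefer to open just below the anchor, aligned to its left edge.
  let x = anchor.x;
  let y = anchor.y + anchor.height;
  // Clamp so the overlay stays fully inside the viewport.
  x = Math.max(0, Math.min(x, viewport.width - overlay.width));
  y = Math.max(0, Math.min(y, viewport.height - overlay.height));
  return { x, y };
}
```

An anchor near the bottom-right corner simply pushes the overlay up and left until it fits, which is the behavior described in the third bullet.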
* * *
Best Practices for Inline Edit and Overlay Edit
In-Page Editing provides a nice way to change displayed content and observe the change in context. Here are some best practices to consider:
Keep the editing inline for single fields.
Use inline when editing one of many in a set. This keeps the context in view.
Keep the display and editing modes the same size when possible. This will avoid page jitter and reduce distraction when moving between the two modes.
Make the transition between display and editing as smooth as possible.
Use mouse hover invitations to indicate editing when readability is primary.
Avoid using double-click to activate editing.
Place a bracketed “edit” link near the item to be edited if editability is equally important or if the quantity of items that can be edited is small. This is a nice way to separate the link from the displayed text without creating visual distractions.
Show the edit in place when editing one item in a series (to preserve context).
Use an overlay when what is being edited needs careful attention. This removes the likelihood of accidentally changing a critical value.
Do not use multiple overlays for additional fields. If you have a complex edit for a series of elements, use one overlay for all.
When providing an overlay, use the most lightweight style available to reduce the disruptiveness of the context switch between render and editing state.
Use buttons when triggering completion implicitly would be too subtle.
Use explicit buttons for saving and canceling when there is room.
Whenever possible, allow the overlay to be draggable so that obscured content can be revealed as needed.
* * *
* * *
 In the past, separate browser windows were used for secondary windows. Lightweight overlays simply map the secondary content into a floating layer on the page. The resulting overlay feels more lightweight. See Chapter 5.
Editing tables of data is less common in consumer web applications. In enterprise web applications, however, tables reign supreme. The most common request is for the table editing to work like Microsoft Excel, which long ago set the standard for editing data in a grid.
A good example of Table Edit is a Google Docs Spreadsheet (Figure 1-12).
Figure 1-12. Editing a spreadsheet in Google Docs is very similar to editing a spreadsheet in Microsoft Excel
Presentation is the primary consideration when displaying a table of data. Editing is secondary. As a result, the editing scaffolding is hidden and only revealed when it’s clear the user wants to edit a cell.
A single mouse click is required to start editing a cell instead of a mouse hover. This is consistent with keeping the display of the grid uncluttered. Imagine how irritating it would be if every mouse motion revealed an edit box.
* * *
You should generally avoid double-click in web applications. However, when web applications look and behave like desktop applications, double-click can be appropriate.
* * *
Rendering versus editing. Google Spreadsheet displays the edit box slightly larger than the cell. This clearly indicates editability and lets the user know that input is not limited to the size of the cell (the edit box actually dynamically resizes as the user types into it). The only issue to consider is that the larger edit area covers other table cells. However, this works well in this case since editing is explicitly triggered by a mouse click. If activation had occurred on a mouse hover, the edit mode would have interfered with cell-to-cell navigation.
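The activation logic behind this click-to-edit behavior can be sketched independently of any particular grid library. The TypeScript below is a minimal illustration (all names are hypothetical, not Google’s code): a single click moves a cell into edit mode, committing returns it to display, and the editor renders slightly wider than the cell it covers.

```typescript
// Minimal click-to-edit state for one grid cell (illustrative only).
type CellState =
  | { mode: "display"; value: string }
  | { mode: "editing"; value: string; draft: string };

// A single click moves the cell from display to editing.
function clickCell(state: CellState): CellState {
  if (state.mode === "editing") return state;
  return { mode: "editing", value: state.value, draft: state.value };
}

// Commit (e.g., on Enter or blur) returns to display with the new value.
function commitCell(state: CellState): CellState {
  if (state.mode === "display") return state;
  return { mode: "display", value: state.draft };
}

// The editor is rendered slightly wider than the cell, signaling that
// input is not limited to the cell's visible width.
function editorWidth(cellWidth: number, padding = 12): number {
  return cellWidth + padding;
}
```

Because activation is an explicit click rather than a hover, the slightly oversized editor can safely cover neighboring cells without interfering with cell-to-cell navigation.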
* * *
Best Practices for Table Edit
Here are some best practices for Table Edit:
Bias the display toward readability of the table data.
Avoid mouse hover for activating cell editing. It creates a “mouse trap” feeling and makes the interaction noisy.
Activate edit with a single click. While using a double-click may not be totally unexpected (since it looks like an Excel spreadsheet), a single click is easier to perform.
Consider allowing extra space during editing either through a drop-down editor or by slightly enlarging the edit cell.
As much as possible, mimic the normal conventions of cell navigation that users will already be familiar with (e.g., in Microsoft Excel).
* * *
As mentioned before, it is a good idea to keep the differences between the edit mode and the display mode as minimal as possible. In fact, it is a good idea to minimize modes where possible. In honor of this principle, a former manager of mine sported a vanity plate with the phrase “NOMODES”. However, modes cannot be avoided altogether, as they do provide necessary context for completing specific tasks.
If you want to keep the display of items on the page as uncluttered as possible while still supporting editing, consider using a single mechanism to enter a special editing mode: Group Edit.
On the iPhone’s home screen, the icons are normally locked down. However, there is a way to switch into a special Group Edit mode that allows you to rearrange the icons’ positions by drag and drop. You enter the mode by pressing down continuously on an icon until the editing mode is turned on (Figure 1-13).
Figure 1-13. The iPhone has a special mode for rearranging applications on the home page—pressing and holding down on an icon places all the applications in “wiggly mode”
The Apple technique signifies that we have entered a special editing mode. When the icons become “wiggly,” it is not a large intuitive leap to conclude that they have come loose and can be rearranged.
Admittedly, the feature is not very discoverable, but it can be argued that it is straightforward once discovered. Exiting the mode is another matter: pressing the home button deactivates rearranging, when exit really should operate more like a toggle. A better way to exit the “wiggly” mode would be to press and hold down on a wiggly icon, following the idea that you are pushing the icon back into its fixed place. Since deactivation does not use the same mechanism as activation, it is a little hard to figure out how to get back to the normal display mode.
* * *
Activation and deactivation should normally follow the same interaction style. This makes it easy to discover the inverse action. This is a principle we call Symmetry of Interaction.
* * *
Another example of group editing is in the 37 Signals product, Basecamp (Figure 1-14). When sharing files with Basecamp, you can organize them into various categories. The categories are like folders. Clicking on a category link shows all the files in that “folder.” What if you want to delete a category? Or rename it? At the top of the category section there is a single “Edit” link that turns on editing for the whole area.
Figure 1-14. 37 Signals Basecamp provides a way to toggle a set of items into edit mode
Once the Group Edit mode is entered, you can add another category, rename an existing category, or delete empty categories. Notice the “Edit” link toggled to read “Done Editing”. Clicking this link exits the group-editing mode.
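A minimal sketch of this toggle in TypeScript (hypothetical names, not Basecamp’s code): a single piece of state drives both the mode and the link’s label, so activation and deactivation stay symmetric.

```typescript
// Group Edit toggle (illustrative): one control both enters and exits
// the editing mode, keeping activation and deactivation symmetric.
interface GroupEditState {
  editing: boolean;
}

function toggleGroupEdit(state: GroupEditState): GroupEditState {
  return { editing: !state.editing };
}

// The same link doubles as the exit control by changing its label.
function editLinkLabel(state: GroupEditState): string {
  return state.editing ? "Done Editing" : "Edit";
}
```

Because the label is derived from the mode rather than tracked separately, the link and the mode can never fall out of sync.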
* * *
Switching between edit modes should happen instantaneously. There is no point in making the user wait on an animation to finish before he can start editing.
* * *
Discoverability versus readability
The advantage of providing a toggling edit mode is that it keeps the display uncluttered with the editing scaffolding. The disadvantage is that it is less discoverable. This tension between discoverability versus readability is common and must be balanced by the needs of the user.
Symmetry of Interaction
Unlike the iPhone example, you turn off editing in the same manner and location that you switched it on. The “Done Editing” link is in the same spot as the “Edit” link was. Since both are hyperlinks, they have the same interaction style. Interactions should be symmetrical wherever possible.
Popular portal sites like Yahoo! and Google’s interactive home page display specific content modules (e.g., Top Stories).
Module Configuration is a common pattern on these types of sites. Instead of modifying modules on a separate page, the sites provide ways to directly configure the amount and type of content that shows in each module. The My Yahoo! home page provides an “Edit” link that allows for Module Configuration (Figure 1-15).
Figure 1-15. Configuring modules on the My Yahoo! page can be done directly in place
There are some issues to consider when using Module Configuration.
Putting edit links on each module can be visually noisy. An alternative approach is to use the Group Edit pattern (as we saw in Figure 1-14) to place an edit link at the page level that turns on edit links for each module. When the “Done Editing” link is clicked, the links for each module are hidden. Again the trade-off is between visual noise and discoverability.
* * *
Best Practices for Group Edit and Module Configuration
Here are some best practices to keep in mind:
Use an edit toggle when there are a number of items to edit and showing edit scaffolding would make the display visually noisy.
Make activation and deactivation as similar as possible (Symmetry of Interaction). Switching in and out of an editing mode should operate more like a toggle.
Provide inline edit configuration for modules when configuration is an important feature.
Provide a way to turn configuration on/off globally for module configuration when this is secondary to content display.
* * *
Guidelines for Choosing Specific Editing Patterns
In-Page Edit provides a powerful way to make interfaces direct. Here are some general guidelines to think about when choosing an editing pattern:
Whenever you have a single field on the page that needs editing, consider using the Single-Field Inline Edit.
For multiple fields or more complex editing, use the Multi-Field Inline Edit.
If you don’t need inline context while editing, or the editing is something that demands the user’s full attention, use Overlay Edit.
For grid editing, follow the pattern Table Edit.
When dealing with multiple items on a page, Group Edit provides a way to balance between visual noise and discoverability.
When providing direct configuring to modules, use the Module Configuration pattern.
Chapter 2. Drag and Drop
One of the great innovations that the Macintosh brought to the world in 1984 was Drag and Drop. Influenced by the graphical user interface work on Xerox PARC’s Star Information System and subsequent lessons learned from the Apple Lisa, the Macintosh team invented drag and drop as an easy way to move, copy, and delete files on the user’s desktop.
It was quite a while before drag and drop made its way to the Web in any serious application. In 2000, a small startup, HalfBrain, launched a web-based presentation application, BrainMatter. It was written entirely in DHTML and used drag and drop as an integral part of its interface.
Drag and drop showed up again with another small startup, Oddpost, when it launched a web-based mail application (Figure 2-1) that allowed users to drag and drop messages between folders.
Figure 2-1. The Oddpost web mail client performed like a desktop mail application and included drag and drop as a key feature
The biggest hindrance was the difficulty in saving the user’s state after a drag was completed without refreshing the page. It was possible, but the underlying technology was not consistent across all browsers. Now that the technologies underlying Ajax have become widely known and a full complement of browsers support these techniques, Drag and Drop has become a more familiar idiom on the Web.
At first blush, drag and drop seems simple. Just grab an object and drop it somewhere. But, as always, the devil is in the details. There are a number of individual states at which interaction is possible. We call these microstates interesting moments:
How will users know what is draggable?
What does it mean to drag and drop an object?
Where can you drop an object, and where is it not valid to drop an object?
What visual affordance will be used to indicate draggability?
During drag, how will valid and invalid drop targets be signified?
Do you drag the actual object?
Or do you drag just a ghost of the object?
Or is it a thumbnail representation that gets dragged?
What visual feedback should be used during the drag and drop interaction?
What makes it challenging is that there are a lot of events during drag and drop that can be used as opportunities for feedback to the user. Additionally, there are a number of elements on the page that can participate as actors in this feedback loop.
There are at least 15 events available for cueing the user during a drag and drop interaction:
Page Load
Before any interaction occurs, you can pre-signify the availability of drag and drop. For example, you could display a tip on the page to indicate draggability.
Mouse Hover
The mouse pointer hovers over an object that is draggable.
Mouse Down
The user holds down the mouse button on the draggable object.
Drag Initiated
After the mouse drag starts (usually after a small movement threshold, about 3 pixels).
Drag Leaves Original Location
After the drag object is pulled from its location or the object that contains it.
Drag Re-Enters Original Location
When the object re-enters the original location.
Drag Enters Valid Target
Dragging over a valid drop target.
Drag Exits Valid Target
Dragging back out of a valid drop target.
Drag Enters Specific Invalid Target
Dragging over an invalid drop target.
Drag Is Over No Specific Target
Dragging over neither a valid nor an invalid target. Do you treat all areas outside of valid targets as invalid?
Drag Hovers Over Valid Target
User pauses over a valid target without dropping the object. This is usually when a spring-loaded drop target can open up. For example, when you drag over a folder and pause, the folder opens, revealing a new area to drag into.
Drag Hovers Over Invalid Target
User pauses over an invalid target without dropping the object. Do you care? Will you want additional feedback as to why it is not a valid target?
Drop Accepted
Drop occurs over a valid target and the drop has been accepted.
Drop Rejected
Drop occurs over an invalid target and the drop has been rejected. Do you zoom the dropped object back to its original location?
Drop on Parent Container
Is the place where the object was dragged from special? Usually this is not the case, but it may carry special meaning in some contexts.
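The drag-initiation event in the list above is typically gated by a small movement threshold so that ordinary clicks are not mistaken for drags. A minimal sketch of that check (hypothetical helper, TypeScript):

```typescript
// Drag initiation (illustrative): the drag does not start on mouse-down
// alone; it starts once the pointer has moved past a small threshold,
// commonly around 3 pixels.
interface Point {
  x: number;
  y: number;
}

function dragInitiated(down: Point, current: Point, threshold = 3): boolean {
  const dx = current.x - down.x;
  const dy = current.y - down.y;
  return Math.hypot(dx, dy) >= threshold;
}
```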
During each event you can visually manipulate a number of actors. The page elements available include:
Page (e.g., static messaging on the page)
Cursor
Tool Tip
Drag Object (or some portion of the drag object, e.g., title area of a module)
Drag Object’s Parent Container
Drop Target
Interesting Moments Grid
That’s 15 events times 6 actors. That means there are 90 possible interesting moments—each requiring a decision involving an almost unlimited number of style and timing choices.
You can pull all this together into a simple interesting moments grid for Drag and Drop. Figure 2-2 shows an interesting moments grid for My Yahoo!.
* * *
You can use an interesting moments grid to capture any complex interaction.
* * *
The grid is a handy tool for planning out interesting moments during a drag and drop interaction. It serves as a checklist to make sure there are no “holes” in the interaction. Just place the actors along the lefthand side and the moments along the top. In the grid intersections, place the desired behaviors.
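The grid itself can be kept as simple data. A sketch (hypothetical structure, TypeScript) that records a behavior for each (actor, event) pair and reports the unfilled cells, the “holes” the checklist is meant to expose:

```typescript
// An interesting moments grid (illustrative) as a lookup from
// (actor, event) to the desired behavior.
type Grid = Map<string, string>;

function setMoment(grid: Grid, actor: string, event: string, behavior: string): void {
  grid.set(`${actor}|${event}`, behavior);
}

// Any (actor, event) pair without a decided behavior is a "hole"
// in the interaction design.
function holes(grid: Grid, actors: string[], events: string[]): string[] {
  const missing: string[] = [];
  for (const a of actors)
    for (const e of events)
      if (!grid.has(`${a}|${e}`)) missing.push(`${a}|${e}`);
  return missing;
}
```

In practice many cells are deliberately left empty; the value of the grid is that each empty cell is a conscious decision rather than an oversight.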
Figure 2-2. A simplified interesting moments grid for the original My Yahoo! drag and drop design; it provided a way to capture the complexities of drag and drop into a single page
* * *
 HalfBrain also created a full spreadsheet application written in DHTML prior to this. It included many of the features of Microsoft Excel.
 Some of the developers at Oddpost actually came from HalfBrain. Yahoo! later purchased Oddpost’s mail application to form the basis for the current Yahoo! Mail product.
 Bill Scott, author of this book, originally called these interaction events. Eric Miraglia, a former colleague of his at Yahoo!, coined the more colorful term interesting moments.
 A template for the interesting moments grid can be found at http://designingwebinterfaces.com/resources/interestingmomentsgrid.xls.
Purpose of Drag and Drop
Drag and drop can be a powerful idiom if used correctly. Specifically it is useful for:
Drag and Drop Module
Rearranging modules on a page.
Drag and Drop List
Rearranging items in a list.
Drag and Drop Object
Changing relationships between objects.
Drag and Drop Action
Invoking actions on a dropped object.
Drag and Drop Collection
Maintaining collections through drag and drop.
Drag and Drop Module
One of the most useful purposes of drag and drop is to allow the user to directly place objects where she wants them on the page. A typical pattern is Drag and Drop Modules on a page. Netvibes provides a good example of this interaction pattern (Figure 2-3).
Figure 2-3. Netvibes allows modules to be arranged directly via drag and drop; the hole cues what will happen when a module is dropped
Netvibes allows its modules to be rearranged with drag and drop. A number of interesting moments decide the specific interaction style for this site. Figure 2-4 shows the interesting moments grid for Netvibes.
Figure 2-4. Interesting moments grid for Netvibes: there are 20 possible moments of interaction; Netvibes specifically handles 9 of these moments
While dragging, it is important to make it clear what will happen when the user drops the dragged object. There are two common approaches to targeting a drop: a placeholder target and an insertion bar.
Netvibes uses a placeholder (a hole with a dashed outline) as the drop target. The idea (illustrated in Figure 2-5) is to always position the hole in the spot where the drop would occur. When module ① starts dragging, it gets “ripped” out of its spot, and the placeholder target (dashed outline) takes its place. As ① gets dragged to the spot between ③ and ④, the placeholder target jumps to fill in this spot as ④ moves out of the way.
Figure 2-5. A placeholder target always shows where the dragged module will end after the drop; module 1 is being dragged from the upper right to the position between modules 3 and 4
The hole serves as a placeholder and always marks the spot that the dragged module will land when dropped. It also previews what the page will look like (in relation to the other modules) if the drop occurs there. For module drag and drop, the other modules only slide up or down within a vertical column to make room for the dragged module.
One complaint with using placeholder targets is that the page content jumps around a lot during the drag. This makes the interaction noisier and can make it harder to understand what is actually happening. This issue is compounded when modules look similar. The user starts dragging the modules around and quickly gets confused about what just got moved. One way to resolve this is to provide a quick animated transition as the modules move. It is important, however, that any animated transitions not get in the way of the normal interaction. In Chapter 11, we will discuss timing of transitions in detail.
There is a point in Figure 2-5 where the placeholder shifts to a new location. What determines placeholder targeting? In other words, what determines where the user is intending to place the dragged object? The position of the mouse, the boundary of the dragged object, and the boundary of the dragged-over object can all be used to choose the module’s new location.
Boundary-based placement. Since most sites that use placeholder targeting drag the module in its original size, targeting is determined by the boundaries of the dragged object and the boundaries of the dragged-over object. The mouse position is usually ignored because modules are only draggable in the title (a small region). Both Netvibes and iGoogle take the boundary-based approach. But, interestingly, they calculate the position of their placeholders differently.
In Netvibes, the placeholder changes position only after the dragged module’s title bar has moved beyond the dragged-over module’s title bar. In practice, this means if you are moving a small module to be positioned above a large module, you have to move it to the very top of the large module. In Figure 2-6 you have to drag the small “To Do List” module all the way to the top of the “Blog Directory” module before the placeholder changes position.
Figure 2-6. In Netvibes, dragging a small module to be placed above a large module requires dragging a large distance; the “To Do List” has to be dragged to the top of the “Blog Directory” module
In contrast, moving the small module below the large module actually requires less drag distance since you only have to get the title bar of the small module below the title bar of the large module (Figure 2-7).
Figure 2-7. Dragging a small module below a large module requires a smaller drag distance; since the targeting is based on the header of the dragged-over module, the drag distance in this scenario is less than in the previous figure
This approach to boundary-based drop targeting is non-symmetrical in the drag distance when dragging modules up versus dragging modules down (Figure 2-8).
Figure 2-8. The Netvibes approach requires the dragged object’s title to be placed above or below a module before the placement position changes; this results in inconsistent drag distances
A more desirable approach is that taken by iGoogle. Instead of basing the drag on the title bar, iGoogle calculates the placeholder targeting on the dragged-over object’s midpoint. In Figure 2-9, the stock market module is very large (the module just above the moon phase module).
Figure 2-9. When dragging a module downward, iGoogle moves the placeholder when the bottom of the dragged module crosses the midpoint of the object being dragged over; the distance to accomplish a move is less than in the Netvibes approach
With the Netvibes approach, you would have to drag the stock module’s title below the moon phase module’s title. iGoogle instead moves the placeholder when the bottom of the dragged module (the stock module) crosses the midpoint of the dragged-over module (the moon phase module).
What happens when we head the other way? When we drag the stock module up to place it above the moon phase module, iGoogle moves the placeholder when the top of the stock module crosses the midpoint of the moon phase module (Figure 2-10).
Figure 2-10. When dragging a module upward, iGoogle moves the placeholder when the top of the dragged module crosses the midpoint of the object being dragged over; dragging modules up or down requires the same effort, unlike in the Netvibes example
As Figure 2-11 illustrates, when module ① is dragged from the first column to the second column, the placeholder moves above module ③. As module ① is dragged downward, the placeholder moves below ③ and then ④ as the bottom of module ① crosses their midpoints.
Figure 2-11. To create the best drag experience, use the original midpoint location of the module being dragged over to determine where to drop the dragged module: module 1 is being dragged into the position just below module 4
The net result is that the iGoogle approach feels more responsive and requires less mouse movement to position modules. Figure 2-12 shows the interesting moments grid for the iGoogle drag and drop interaction.
Figure 2-12. Interesting moments grid for iGoogle: as in the Netvibes grid, there are 20 possible moments of interaction; iGoogle specifically handles 8 of these moments
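The midpoint rule can be expressed as a pair of small pure functions. The TypeScript below is an illustrative sketch of how iGoogle-style targeting might be computed (not its actual code): dragging down, the placeholder moves below a module once the dragged module’s bottom edge passes that module’s midpoint; dragging up, once the dragged module’s top edge passes it.

```typescript
// Midpoint-based placeholder targeting (illustrative sketch).
interface Box {
  top: number;
  height: number;
}

const midpoint = (b: Box): number => b.top + b.height / 2;

// Dragging downward: the placeholder moves below `over` once the
// dragged module's bottom edge crosses `over`'s vertical midpoint.
function placeBelowWhenDraggingDown(dragged: Box, over: Box): boolean {
  return dragged.top + dragged.height > midpoint(over);
}

// Dragging upward: the placeholder moves above `over` once the
// dragged module's top edge crosses `over`'s vertical midpoint.
function placeAboveWhenDraggingUp(dragged: Box, over: Box): boolean {
  return dragged.top < midpoint(over);
}
```

Note the symmetry: both directions key off the same midpoint, which is why the drag distance is the same whether you move a module up or down.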
Placeholder positioning is a common approach, but it is not the only way to indicate drop targeting. An alternate approach is to keep the page as stable as possible and only move around an insertion target (usually an insertion bar). A previous version of My Yahoo! used the insertion bar approach as the dragged module was moved around (see Figure 2-13).
Figure 2-13. My Yahoo! uses the insertion bar approach
While the module is dragged, the page remains stable. No modules move around. Instead an insertion bar marks where the module will be placed when dropped.
This technique is illustrated in Figure 2-14. When module ① is dragged to the position between ③ and ④, an insertion bar is placed there. This indicates that if ① is dropped, then ④ will slide down to open up the drop spot.
Figure 2-14. Using an insertion bar keeps the page stable during dragging and makes it clear how things get rearranged when the module is dropped
Unlike with the placeholder target, the dragged module ① is usually represented with a slightly transparent version of the module (also known as ghosting). This is the approach shown in Figure 2-13 in an earlier version of My Yahoo!. In the most current version, full-size module dragging has been replaced with a thumbnail representation (the small gray outline being dragged in Figure 2-15). This is somewhat unfortunate since the small gray outline is not very visible.
Figure 2-15. My Yahoo! uses a small gray rectangle to represent the dragged module
As you can see in Figure 2-16, the My Yahoo! page makes different decisions about how drag and drop modules are implemented as compared to Netvibes (Figure 2-4) and iGoogle (Figure 2-12).
Figure 2-16. My Yahoo! uses 15 of the possible 32 moments to interact with the user during drag and drop; the biggest difference between My Yahoo!, Netvibes, and iGoogle is the insertion bar placement—another subtle difference is how drag gets initiated
Dragging the thumbnail around does have other issues. Since the object being dragged is small, it does not intersect a large area. It requires moving the small thumbnail directly to the place it will be dropped. With iGoogle, the complete module is dragged. Since the module will always be larger than the thumbnail, it intersects a drop target with much less movement. The result is a shorter drag distance to accomplish a move.
* * *
Keep in mind that Drag and Drop takes additional mouse dexterity. If possible, shorten the necessary drag distance to target a drop.
* * *
How should the dragged object be represented? Should it be rendered with a slight transparency (ghost)? Or should it be shown fully opaque? Should a thumbnail representation be used instead?
As shown earlier, My Yahoo! uses a small gray rectangle to represent a module (Figure 2-15). Netvibes represents the dragged module in full size as opaque (shown back in Figure 2-3), while iGoogle uses partial transparency (Figure 2-17). The transparency (ghosting) effect communicates that the object being dragged is actually a representation of the dragged object. It also keeps more of the page visible, thus giving a clearer picture of the final result of a drop.
Figure 2-17. On iGoogle the dragged module Top Stories is given transparency to make it easier to see the page and to indicate that we are in a placement mode
Ghosting the module also indicates that the module is in a special mode. It signals that the module has not been positioned; instead, it is in a transitional state.
* * *
For Drag and Drop Modules, use the midpoint of the module being dragged over to control the drop targeting.
* * *
Of the various approaches for Drag and Drop Modules, iGoogle combines the best approaches into a single interface:
Placeholder targeting
Most explicit way to preview the effect of the drop.
Midpoint boundary detection
Requires the least drag effort to move modules around.
Full-size module dragging
Coupled with placeholder targeting and midpoint boundary detection, it means drag distances to complete a move are shorter.
Ghost rendering
Emphasizes the page rather than the dragged object. Keeps the preview clear.
* * *
Best Practices for Drag and Drop Module
Here are some best practices to keep in mind:
Use the placeholder approach when showing a clear preview during drag is important.
Use the insertion bar approach when you want to avoid page jitter.
Use the midpoint of the module being dragged over to determine the drop position.
Use a slightly transparent version of the object being dragged (ghost) instead of an opaque version.
If you drag thumbnail representations, use the insertion bar targeting approach.
* * *
Drag and Drop List
Rearranging lists is very similar to rearranging modules on the page but with the added constraint of being in a single dimension (up/down or left/right). The Drag and Drop List pattern defines interactions for rearranging items in a list.
37 Signals’ Backpackit allows to-do items to be rearranged with Drag and Drop List (Figure 2-18).
Figure 2-18. Backpackit allows to-do lists to be rearranged directly via drag and drop
Backpackit takes a real-time approach to dragging items. Since the list is constrained, this is a natural approach to moving objects around in a list. You immediately see the result of the drag.
This is essentially the same placeholder target approach we discussed earlier for dragging and dropping modules. The difference is that when moving an item in a list, we are constrained to a single dimension. Less feedback is needed. Instead of a “ripped-out” area (represented earlier with a dotted rectangle), a simple hole can be exposed where the object will be placed when dropped.
A good example from the desktop world is Apple’s iPhoto. In a slideshow, you can easily rearrange the order of photos with drag and drop. Dragging the photo left or right causes the other photos to shuffle open a drop spot (Figure 2-19).
Figure 2-19. iPhoto uses cursor position: when the cursor crosses a threshold (the edge of the next photo), a new position is opened up
The difference between iPhoto and Backpackit is that instead of using the dragged photo’s boundary as the trigger for crossing a threshold, iPhoto uses the mouse cursor position. In the top row of Figure 2-19, the user clicked on the right side of the photo. When the cursor crosses into the left edge of the next photo, a new space is opened. In the bottom row, the user clicked on the top left side of the photo. Notice in both cases it is the mouse position that determines when a dragged photo has moved into the space of another photo, not the dragged photo’s boundary.
* * *
In a Drag and Drop List, use the mouse position to control where the item will be dropped.
* * *
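Cursor-position targeting can likewise be reduced to a small pure function. The sketch below (a hypothetical helper, assuming a horizontal row of equal-width photos) derives the insertion slot from the mouse x position alone, ignoring the dragged item’s boundary:

```typescript
// Insertion index from cursor position (illustrative): for a row of
// items each `itemWidth` wide, the drop slot is wherever the cursor
// currently sits, independent of the dragged item's own edges.
function insertionIndex(cursorX: number, itemWidth: number, count: number): number {
  const index = Math.floor(cursorX / itemWidth);
  return Math.max(0, Math.min(count, index)); // clamp to list bounds
}
```

Using the cursor rather than the dragged item’s edges means the behavior is the same no matter where on the item the user started the drag.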
Just as with Drag and Drop Modules, placeholder targeting is not the only game in town. You can also use an insertion bar within a list to indicate where a dropped item will land. Netflix uses an insertion target when movies are dragged to a new location in a user’s movie queue (Figure 2-20).
Figure 2-20. A Netflix queue can be rearranged via drag and drop
The upside to this approach is that the list doesn’t have to shuffle around during drag. The resulting experience is smoother than the Backpackit approach. The downside is that it is not as obvious where the movie is being positioned. The insertion bar appears under the ghosted item. The addition of the brackets on the left and right of the insertion bar is an attempt to make the targeting clearer.
Non-drag and drop alternative
Besides drag and drop, the Netflix queue actually supports two other ways to move objects around:
Edit the row number and then press the “Update DVD Queue” button.
Click the “Move to Top” icon to pop a movie to the top.
Modifying the row number is straightforward. It’s a way to rearrange items without drag and drop. The “Move to Top” button is a little more direct and fairly straightforward (if the user really understands that this icon means “move to top”). Drag and drop is the least discoverable of the three, but it is the most direct, visual way to rearrange the list. Since rearranging the queue is central to the Netflix customer’s satisfaction, it is appropriate to allow multiple ways to do so.
Hinting at drag and drop
When the user clicks the “Move to Top” button, Netflix animates the movie as it moves up. But first, the movie is jerked downward slightly and then spring-loaded to the top (Figure 2-21).
Figure 2-21. When a movie is moved to the top with the “Move to Top” button, the movie jerks down slightly, then springs to the top
The combination of the downward jerk and then the quick animation to the top gives a subtle clue that the object is draggable. This is also an interesting moment to advertise drag and drop. After the move to top completes, a simple tip could appear to invite users to drag and drop. The tip should probably be shown only once, or there should be a way to turn it off. Providing an invitation within a familiar idiom is a good way to lead users to the new idiom.
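The one-time invitation can be gated by a simple persisted flag. A minimal sketch (hypothetical store, TypeScript); in a real page the flag would live in a cookie or local storage:

```typescript
// One-time invitation (illustrative): after the user completes the
// familiar action (e.g., "Move to Top"), show a drag-and-drop tip once.
interface TipStore {
  dragTipShown?: boolean;
}

function shouldShowDragTip(store: TipStore): boolean {
  if (store.dragTipShown) return false;
  store.dragTipShown = true; // persist so the tip never repeats
  return true;
}
```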
* * *
If drag and drop is a secondary way to perform a task, use the completion of the familiar task as an opportunity to invite the user to drag and drop the next time.
* * *
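The one-time tip amounts to a small piece of remembered state. The sketch below is illustrative, not Netflix's implementation; the `TipStore` interface and the `dragTipShown` key are assumptions (in a browser, `localStorage` could back the store):

```typescript
// Show a one-time "you can also drag and drop" tip after the user
// rearranges the list via a familiar control (e.g., "Move to Top").
interface TipStore {
  get(key: string): string | null;
  set(key: string, value: string): void;
}

const TIP_KEY = "dragTipShown"; // hypothetical storage key

// Returns true exactly once per store; the caller shows the tip when true.
function shouldShowDragTip(store: TipStore): boolean {
  if (store.get(TIP_KEY) === "1") return false;
  store.set(TIP_KEY, "1");
  return true;
}
```

Recording the flag as soon as the tip is shown satisfies the "show only once" advice without needing a separate "turn it off" control.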
Drag and drop works well when a list is short or the items are all visible on the page. But when the list is long, drag and drop becomes painful. Providing alternative ways to rearrange is one way to get around this issue. Another is to provide a drag lens while dragging.
A drag lens provides a view into a different part of the list that can serve as a shortcut target. It could be a fixed area that is always visible, or it could be a miniature view of the list that provides more rows for targeting. The lens will be made visible only during dragging. A good example of this is dragging the insertion bar while editing text on the iPhone (Figure 2-22).
Figure 2-22. The iPhone provides a drag magnifier lens that makes it easier to position the cursor
* * *
Best Practices for Drag and Drop List
Here are some best practices to keep in mind:
If possible, drag the items in a list in real time using the placeholder target approach.
Use the mouse position for drag target positioning.
If the goal is speed of dragging or if dragged items are large, consider using the insertion target approach, as rendering an insertion bar is inexpensive compared to dynamically rearranging the list.
Since drag and drop in lists is not easily discoverable, consider providing an alternate way to rearrange the list.
When the user rearranges the list with an alternate method, use that moment for a one-time advertisement for drag and drop.
* * *
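To make the targeting advice concrete, here is one way an insertion point might be derived from the mouse position. This is a sketch that assumes uniform row heights; the names are illustrative:

```typescript
// Map the mouse Y coordinate to an insertion index: the midpoint of each
// row decides whether the insertion bar (or placeholder) goes above or
// below it. The result is clamped to the bounds of the list.
function insertionIndex(
  mouseY: number,
  listTop: number,
  rowHeight: number,
  rowCount: number
): number {
  const raw = Math.round((mouseY - listTop) / rowHeight);
  return Math.max(0, Math.min(rowCount, raw));
}
```

The same index can drive either approach: draw an insertion bar at that slot, or splice a placeholder into the list there.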
Drag and Drop Object
Another common use for drag and drop is to change relationships between objects. This is appropriate when the relationships can be represented visually. Drag and drop as a means of visually manipulating relationships is a powerful tool.
Cogmap is a wiki for organizational charts. Drag and Drop Object is used to rearrange members of the organization (Figure 2-23).
Figure 2-23. Cogmap allows organizational charts to be rearranged on the fly with drag and drop
When object relationships can be clearly represented visually, drag and drop is a natural choice to make these types of changes. Cogmap uses the target insertion approach. This allows the dragging to be nondistracting, since the chart does not have to be disturbed during targeting.
Drag feedback: Highlighting
Bubbl.us, an online mind-mapping tool, simply highlights the node that will be the new parent (Figure 2-24).
Figure 2-24. Bubbl.us provides a visual indication of which node the dropped node will attach itself to
In both cases, immediate preview is avoided since it is difficult to render the relationships in real time without becoming unnecessarily distracting.
Looking outside the world of the Web, the desktop application Mind Manager also uses highlighting to indicate the parent in which insertion will occur. In addition, it provides insertion targeting to give a preview of where the employee will be positioned once dropped (Figure 2-25).
Figure 2-25. Mind Manager is a desktop tool that uses a combination of insertion targeting plus a clear preview of the drop
Drag feedback: Dragged object versus drop target
As we mentioned at the beginning of this chapter, one of the first serious uses for drag and drop was in the Oddpost web mail application. Oddpost was eventually acquired by Yahoo! and is now the Yahoo! Mail application.
Yahoo! Mail uses drag and drop objects for organizing email messages into folders (Figure 2-26).
Figure 2-26. Yahoo! Mail allows messages to be dragged to folders
Instead of signaling that a drop is valid or invalid by changing the visual appearance of the area dragged over, Yahoo! Mail shows validity through the dragged object. When a drop will be invalid (Figure 2-27, left):
The dragged object’s icon becomes a red invalid sign.
If over an invalid folder, the folder is highlighted as well.
When a drop will be valid (Figure 2-27, right):
The dragged object’s icon changes to a green checkmark.
The drop target highlights.
Another approach is to signal both validity and location in the drop target itself. In this case you would highlight the valid drop target when it is dragged over and not highlight the drop target if it is invalid. In Yahoo! Mail’s interaction, the signaling of validity and where it can be dropped are kept separate. This allows a drag to indicate that a target is a drop target, just not valid for the current object being dragged.
Figure 2-27. Yahoo! Mail mistakenly shows a valid indicator instead of an invalid indicator for a message when it is dragged back over the inbox
One odd situation occurs when you first start dragging a message and then later drag it back into the inbox area (Figure 2-27). At first the inbox is shown as an invalid drop area; then it is shown as a valid one. Recall from our discussion of interesting moments that dragging over your “home area” at the start of a drag and dragging back into it later are both events that should be handled during drag and drop. The interface should display the same indicator in both cases.
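One way to model this split between validity (signaled on the dragged object) and targetability (signaled on the folder) is a small decision function. This is an illustrative sketch, not Yahoo! Mail's code; the type names are assumptions:

```typescript
// Validity rides on the dragged object's icon; the folder highlights
// either way, because it is a drop target -- just not necessarily a valid
// one for the object currently being dragged.
type DragKind = "message" | "contact";
type TargetKind = "mailFolder" | "contactsFolder";

interface DragFeedback {
  icon: "valid" | "invalid";   // shown on the dragged object
  highlightTarget: boolean;    // shown on the folder being dragged over
}

function feedbackFor(drag: DragKind, target: TargetKind): DragFeedback {
  const valid =
    (drag === "message" && target === "mailFolder") ||
    (drag === "contact" && target === "contactsFolder");
  return { icon: valid ? "valid" : "invalid", highlightTarget: true };
}
```

Keeping the two signals separate is exactly what lets a contact dragged over the Contacts folder show a green checkmark while a message over the same folder shows the red invalid sign.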
* * *
Feedback during dragging is key to providing a clear Drag and Drop Object interaction.
* * *
Drag feedback: Drag positioning
Another slightly troublesome approach is positioning the dragged object some distance away from the mouse (Figure 2-28). The reason the object is positioned in this manner is to avoid obscuring dragged-over folders. While this may alleviate that problem, it introduces a second problem: when you initiate the drag, the dragged message jumps into the offset position. Instead of conveying that the first message in the list is being dragged, it feels like the second message in the list is being dragged (Figure 2-28, bottom).
Figure 2-28. By offsetting the drag object a large distance from the cursor, the message feels disjointed from the actual object being dragged; in fact, it looks like it is closer to the second message in the list instead of the first message actually being dragged
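Keeping the dragged object in sync with the cursor usually means recording where inside the object the user grabbed it and reapplying that offset on every move, rather than applying a large fixed offset. A minimal sketch (names are illustrative):

```typescript
interface Point { x: number; y: number; }

// On drag start, remember where inside the object the user grabbed it...
function grabOffset(cursor: Point, objectTopLeft: Point): Point {
  return { x: cursor.x - objectTopLeft.x, y: cursor.y - objectTopLeft.y };
}

// ...and on every mouse move, reapply that offset so the object stays
// directly under the cursor instead of jumping to an offset position.
function draggedPosition(cursor: Point, offset: Point): Point {
  return { x: cursor.x - offset.x, y: cursor.y - offset.y };
}
```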
Drag feedback: Drag start
In Yahoo! Mail, message dragging is initiated when the mouse is dragged about four or five pixels (Figure 2-29).
Figure 2-29. Yahoo! Mail requires the user to drag four or five pixels to initiate a drag (notice the cursor is at the top of the “B” and has to be dragged 2/3 of the way down to start the drag); this gives the impression that the message is stuck and not easy to drag. Reducing this value will make messages feel easier to drag
A good rule of thumb on drag initiation comes from the Apple Human Interface Guidelines:
Your application should provide drag feedback as soon as the user drags an item at least three pixels. If a user holds the mouse button down on an object or selected text, it should become draggable immediately and stay draggable as long as the mouse remains down.
It might seem like a small nit, but there is quite a difference between starting a drag after three pixels of movement versus four or five pixels. The larger value makes the object feel hard to pull out of its slot to start dragging. On the flip side, starting a drag with too small a value can cause a drag to initiate accidentally, usually resulting in the interface feeling too finicky.
* * *
Start a drag when the object is dragged three pixels or the mouse is held down for half a second.
* * *
The only part of the Apple guideline that could be quibbled with is whether to start drag mode immediately on mouse down or wait about a half-second to start. Why not initiate the drag immediately? Certain devices, like pen input, are not as precise as mouse input. If you allow an object to be dragged and that object has other controls (like hyperlinks), you will want to allow the user to start a drag even if he clicks down over some element within the object (like a hyperlink). You will also want to allow him to just click the hyperlink and not have a drag accidentally initiate. Moving into drag mode immediately will preclude the ability to disambiguate between a click on an item within the object versus a drag start on the object itself.
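The guideline above (start after about three pixels of movement, or after roughly half a second of holding) reduces to a small predicate. This sketch takes positions and timestamps explicitly so the logic stays testable; in a browser you would feed it `mousedown`/`mousemove` event coordinates and times:

```typescript
const DRAG_PIXELS = 3;    // movement threshold from the Apple guideline
const DRAG_HOLD_MS = 500; // hold-down threshold (about half a second)

interface PressState { x: number; y: number; t: number; }

// True once the cursor has moved at least DRAG_PIXELS from the press
// point, or the button has been held for at least DRAG_HOLD_MS.
function shouldStartDrag(
  press: PressState,     // where and when the mouse went down
  x: number, y: number,  // current cursor position
  t: number              // current time in milliseconds
): boolean {
  const moved = Math.hypot(x - press.x, y - press.y) >= DRAG_PIXELS;
  const held = t - press.t >= DRAG_HOLD_MS;
  return moved || held;
}
```

The hold-time branch is what preserves the ability to disambiguate a click on a hyperlink inside the object from a drag of the object itself.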
* * *
Best Practices for Drag and Drop Object
Here are some best practices to keep in mind:
If objects are represented in a complex visual relationship, use insertion targeting to indicate drop location (minimizes disturbing the page during drag).
For parent/child relationships, highlight the parent as well to indicate drop location.
If possible, reveal drag affordances on mouse hover to indicate draggability.
Initiate drag when the mouse is dragged three pixels or if the mouse is held down for at least half a second.
Position dragged objects directly in sync with the cursor. Offsetting will make the drag feel disjointed.
When hovering over a draggable object, change the cursor to indicate draggability.
* * *
* * *
 For example, while a contact may be dragged into the Contacts folder, a message may not. In either situation, the Contacts folder will highlight. However, the dragged contact will show a green checkmark, while the dragged message will show a red invalid sign.
 See http://tinyurl.com/5aqd4k for the Apple Human Interface Guideline on drag feedback.
Drag and Drop Action
Drag and drop is also useful for invoking an action or actions on a dropped object. The Drag and Drop Action is a common pattern. Its most familiar example is dropping an item in the trash to perform the delete action.
Normally uploading files to a web application includes pressing the upload button and browsing for a photo. This process is repeated for each photo.
When Yahoo! Photos was relaunched in 2006, it included a drag and drop upload feature. It allowed the user to drag photos directly into the upload page. The drop signified the upload action (Figure 2-30).
Figure 2-30. Yahoo! Photos provided a way to upload files directly to the site via drag and drop from the user’s filesystem into the web page
This is not a trivial implementation. But it does clearly illustrate the benefit of drag and drop for operating on a set of files. The traditional model requires each photo to be selected individually for upload. Drag and drop frees you to use whatever browsing method is available on your system and then drop those photos for upload.
Anti-pattern: Artificial Visual Construct
Unfortunately, drag and drop can sometimes drive the design of an interface instead of being an extension of a natural interface. These interactions are almost always doomed, as they are the tail wagging the proverbial dog. Rating movies, books, and music is a common feature found on many sites. But what happens if you try to use drag and drop to rate movies?
In Figure 2-31 you can rate movies by dragging them into three buckets: “Loved It”, “Haven’t Seen It”, or “Loathed It”.
Figure 2-31. Drag and drop recommendations: the hard way to do ratings
While this certainly would work, it is wrong for several reasons:
Not discoverable
Requires additional instructions (“Drag the DVDs into the boxes below”) for the user to know how to rate the movies.
Too much effort
Requires too much user effort for a simple task. The user needs to employ mouse gymnastics to simply rate a movie. Drag and drop involves these discrete steps: target, then drag, then target, and then drop. The user has to carefully pick the movie, drag it to the right bucket, and release.
Too much space
Requires a lot of visual space on the page to support the idiom. Is it worth this amount of screen real estate?
Direct rating systems (thumbs up/down, star ratings, etc.) are a much simpler way to rate a movie than using an Artificial Visual Construct. A set of stars is an intuitive, compact, and simple way to rate a movie (Figure 2-32).
Figure 2-32. Instead of drag and drop, Netflix uses a simple set of stars to rate a movie
You might still be tempted to take this approach if you have a lot of objects to mark as favorites or to set an attribute on. Don’t give in. The screen space this method requires far exceeds that of simpler approaches, such as providing an action button that operates on the selected objects.
* * *
Drag and drop should never be forced. Don’t create an artificial visual construct to support it.
* * *
Natural Visual Construct
Another example of Drag and Drop Action is demonstrated in Google Maps. A route is visually represented on the map with a dark purple line. Dragging an arbitrary route point to a new location changes the route in real time (Figure 2-33).
Figure 2-33. Rerouting in Google Maps is as simple as drag and drop
This is the opposite of the Artificial Visual Construct anti-pattern. The route is a Natural Visual Construct. Since anywhere along the route is draggable, there are a lot of opportunities to discover the rerouting bubble. When the route is being dragged, Google dynamically updates it. The constant feedback forms the basis of a Live Preview (which we will discuss in Chapter 13).
* * *
Best Practices for Drag and Drop Action
Here are some best practices to keep in mind:
Use Drag and Drop Actions sparingly in web interfaces, as they are not as discoverable or expected.
Provide alternate ways to accomplish the action. Use the Drag and Drop Action as a shortcut mechanism.
Don’t use drag and drop for setting simple attributes. Instead use a more direct approach to setting attributes on the object.
Don’t construct an artificial visual representation for the sole purpose of implementing drag and drop. Drag and drop should follow the natural representation of the objects in the interface.
Provide clear invitations on hover to indicate the associated action.
* * *
Drag and Drop Collection
A variation on dragging objects is collecting objects for purchase, bookmarking, or saving into a temporary area. This type of interaction is called Drag and Drop Collection. Drag and drop is a nice way to grab items of interest and save them to a list. The Laszlo shopping cart example illustrates this nicely (Figure 2-34).
Figure 2-34. This Laszlo shopping cart demo uses both drag and drop and a button action to add items to its shopping cart
There are a few issues to consider in this example.
Drag and drop is a natural way to collect items for purchase. It mimics the shopping experience in the real world. Grab an item. Drop it in your basket. This is fast and convenient once you know about the feature. However, as a general rule, you should never rely solely on drag and drop for remembering items.
Parallel, more explicit ways to do the same action should be provided. In this example, Laszlo provides an alternative to dragging items into the cart. Notice the “+ cart” button in Figure 2-34. Clicking this button adds the item to the shopping cart.
* * *
Best Practices for Drag and Drop Collection
Here are some best practices to keep in mind:
Use as an alternate way to collect items (e.g., a shopping cart).
When a drag gets initiated, highlight the valid drop area to hint where drop is available.
Provide visual cues hinting that drag and drop into the collection is available.
* * *
When providing alternates to drag and drop, it is a good idea to hint that dragging is an option. In the Laszlo example, clicking the “+ cart” button causes the shopping cart tray to bump slightly open and then closed again. This points to the physicality of the cart. Using another interaction as a teachable moment to guide the user to richer interactions is a good way to solve discoverability issues.
* * *
Look for opportunities for teachable moments in the interface leading users to advanced features.
* * *
The Challenges of Drag and Drop
As you can see from the discussion in this chapter, Drag and Drop is complex. There are four broad areas where Drag and Drop may be employed: Drag and Drop Module, Drag and Drop List, Drag and Drop Object, and Drag and Drop Action. And in each area, there are a large number of interesting moments that may be handled in numerous ways. Being consistent in visual and interaction styles across all of these moments for all of these types of interactions is a challenge in itself. And keeping the user informed throughout the process with just the right amount of hints requires design finesse. In Chapter 10, we explore some ways to bring this finesse into Drag and Drop.
* * *
General Best Practices for Drag and Drop
Keep page jitter to a minimum while dragging objects.
Initiate dragging if the user presses the mouse down and moves the mouse three pixels, or if she holds the mouse down for at least half a second.
Use drag and drop for performing direct actions as an alternate method to more direct mechanisms in the interface.
Hint at the availability of drag and drop when using alternatives to drag and drop.
Pay attention to all of the interesting moments during drag and drop. Remember, you must keep the user informed throughout the process.
Use Invitations (discussed more in Chapter 9 and Chapter 10) to cue the user that drag and drop is available.
* * *
Chapter 3. Direct Selection
When the Macintosh was introduced, it ushered into the popular mainstream the ability to directly select objects and apply actions to them. Folders and files became first-class citizens. Instead of a command line to delete a file, you simply dragged a file to the trashcan (Figure 3-1).
Figure 3-1. DOS command line for deleting a file versus dragging a file to the trash on the Macintosh
Treating elements in the interface as directly selectable is a clear application of the Make It Direct principle. On the desktop, the most common approach is to initiate a selection by directly clicking on the object itself. We call this selection pattern Object Selection (Figure 3-2).
Figure 3-2. Files can be selected directly on the Macintosh; Object Selection is the most common pattern used in desktop applications
In this chapter we will look at the following types of selection patterns:
Checkbox or control-based selection.
Selection that spans multiple pages.
Direct object selection.
Combination of Toggle Selection and Object Selection.
The most common form of selection on the Web is Toggle Selection. Checkboxes and toggle buttons are the familiar interface for selecting elements on most web pages. An example of this can be seen in Figure 3-3 with Yahoo! Mail Classic.
Figure 3-3. In Yahoo! Mail Classic a mail message can be selected by clicking on the corresponding row’s checkbox
The way to select an individual mail message is through the row’s checkbox. Clicking on the row itself does not select the message. We call this pattern of selection Toggle Selection since toggle-style controls are typically used for selecting items.
* * *
Toggle Selection is the easiest way to allow discontinuous selection.
* * *
Once items have been check-selected, actions can be performed on them. Usually these actions are performed on the selection by clicking on a separate button (e.g., the Delete button). Gmail is a good example of actions in concert with Toggle Selection (Figure 3-4).
Figure 3-4. Gmail uses checkbox selection to operate on messages
Toggle Selection with checkboxes has some nice attributes:
Clear targeting, with no ambiguity about how to select the item or deselect it.
Straightforward discontinuous selection, and no need to know about Shift or Control-key ways to extend a selection. Just click the checkboxes in any order, either in a continuous or discontinuous manner.
Clear indication of what has been selected.
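The model underlying Toggle Selection is simply a set of selected item ids that each checkbox click toggles; discontinuous selection falls out naturally. A minimal sketch (names are illustrative):

```typescript
// Each checkbox click toggles membership; actions like Delete then
// operate on whatever ids are in the set.
class ToggleSelection {
  private selected = new Set<string>();

  toggle(id: string): void {
    if (this.selected.has(id)) this.selected.delete(id);
    else this.selected.add(id);
  }
  isSelected(id: string): boolean { return this.selected.has(id); }
  count(): number { return this.selected.size; }
  items(): string[] { return [...this.selected]; }
}
```

Exposing `count()` also makes it easy to show the clear selection feedback (and disabled-action states) discussed below.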
Scrolling versus paging
The previous examples were with paged lists. But what about a scrolled list? Yahoo! Mail uses a scrolled list to show all of its mail messages (Figure 3-5). While not all messages are visible at a time, the user knows that scrolling through the list retains the currently selected items. Since the user understands that all the messages not visible are still on the same continuous pane, there is no confusion about what an action will operate on—it will affect all selected items in the list. Sometimes the need for clarity of selection will drive the choice between scrolling and paging.
* * *
Toggle Selection is the normal pattern used when content is paged. Actions normally apply only to the selected items on the visible page.
* * *
Figure 3-5. Yahoo! Mail uses a scrolled list for its messages; selection includes what is in the visible part of the list as well as what is scrolled out of view
Making selection explicit
With Yahoo! Bookmarks you can manage your bookmarks by selecting bookmarked pages and then acting on them. The selection model is visually explicit (Figure 3-6).
Figure 3-6. Yahoo! Bookmarks explicitly displays the state of the selection
The advantage of this method is that it is always clear how many items have been selected. Visualizing the underlying selection model is generally a good approach. This direct approach to selection and acting on bookmarks creates a straightforward interface.
One interesting question: what happens when nothing is selected? One approach is to disable any actions that require at least one selected item. Yahoo! Bookmarks takes a different approach. Since buttons on the Web do not follow a standard convention, you often can’t rely on a color change to let you know something is not clickable. Yahoo! Bookmarks chose to make selection very explicit and make it clear when a command is invalid because nothing is selected (“No selection” in Figure 3-6). This is not normally the optimal way to handle errors. Generally, the earlier you can prevent errors, the better the user experience.
Netflix disables the “Update DVD Queue” button when nothing is selected and enables it when a movie gets selected (Figure 3-7).
Figure 3-7. When nothing is selected, Netflix disables the “Update DVD Queue” button to prevent errors early
* * *
Best Practices for Toggle Selection
Here are some best practices to keep in mind:
Use Toggle Selection for selecting elements in a row.
Use Toggle Selection to make it easy to select discontinuous elements.
In a list, highlight the row in addition to the checkbox to make the selection explicit.
When moving from page to page, actions should only operate on the items selected on that page.
If offering a “select all” option, consider providing a way to select all elements across all pages.
Provide clear feedback for the number of elements selected.
If possible, disable unavailable actions when nothing is selected. If you keep the action enabled, you will need additional interface elements to signal that it can’t be completed.
* * *
Collected Selection
Toggle Selection is great for showing a list of items on a single page. But what happens if you want to collect selected items across multiple pages? Collected Selection is a pattern for keeping track of selection as it spans multiple pages.
In Gmail, you can select items as you move from page to page. The selections are remembered for each page. If you select two items on page one, then move to page two and select three items, there are only three items selected. This is because actions only operate on a single page. This makes sense, as users do not normally expect selected items to be remembered across different pages.
Gmail does provide a way to select all items across different pages. When selecting all items on an individual page (with the “All” link), a prompt appears inviting the user to “Select all 2785 conversations in Spam”. Clicking that will select all items across all pages (Figure 3-8). The “Delete Forever” action will operate on all 2785 conversations, not just the 25 selected on the page.
Figure 3-8. Gmail provides a way to select all items across all pages, allowing the user to delete all items in a folder without having to delete all items on each page individually
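A selection model supporting this two-step escalation needs to distinguish “the items checked on this page” from “everything matching the current view.” The sketch below is one possible shape, not Gmail's implementation; the names and the reset-on-new-selection behavior are assumptions:

```typescript
// Actions ask for the scope and operate either on the explicit page
// selection or on everything matching the view.
type Scope =
  | { mode: "page"; ids: string[] }
  | { mode: "all"; total: number };

class PageSelection {
  private ids = new Set<string>();
  private allAcrossPages = false;

  constructor(private readonly totalMatching: number) {}

  selectOnPage(id: string): void {
    this.ids.add(id);
    this.allAcrossPages = false; // touching the page narrows the scope
  }

  // Invoked by the "Select all N conversations" prompt.
  selectAllAcrossPages(): void { this.allAcrossPages = true; }

  scope(): Scope {
    return this.allAcrossPages
      ? { mode: "all", total: this.totalMatching }
      : { mode: "page", ids: [...this.ids] };
  }
}
```

Keeping the widened scope as an explicit mode, rather than enumerating every id, is what makes “Delete Forever” on thousands of conversations cheap to express.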
Keeping the selection visible
The real challenge for multi-page selection is finding a way to show selections gathered across multiple pages. You need a way to collect and show the selection as it is being created. Here is one way that Collected Selection comes into play.
LinkedIn uses Collected Selection to add potential contacts to an invite list (Figure 3-9).
Figure 3-9. LinkedIn provides a holding place for saving selections across multiple pages
The list of potential invitees is shown in a paginated list on the lefthand side. Clicking the checkbox adds them to the invite list. The invite list becomes the place where selected contacts across multiple pages are remembered.
Any name in the invite list can be removed by clicking the “X” button beside it. Once the complete list of invitees is assembled, clicking “Invite selected contacts” sends each selected contact a LinkedIn invitation.
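The invite list itself is just a holding collection that survives page changes: add on check, remove on “X”. A minimal sketch (names are illustrative, not LinkedIn's API):

```typescript
// Contacts checked on any page accumulate here; the list is independent
// of which page is currently displayed.
class InviteList {
  private invitees: string[] = [];

  add(name: string): void {
    if (!this.invitees.includes(name)) this.invitees.push(name);
  }
  remove(name: string): void {
    this.invitees = this.invitees.filter(n => n !== name);
  }
  all(): string[] { return [...this.invitees]; }
}
```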
Collected Selection and actions
When Yahoo! Photos was working its way through an early design of its Photo Gallery (see Figure 3-13, later in this chapter), the plan was to show all photos in a single, continuous scrolling page (we discuss virtual scrolling in Chapter 7). In a long virtual list, the selection model is simple. Photos are shown in a single page and selection is easily understood in the context of this single page.
However, due to performance issues, the design was changed. Instead of a virtual page, photos had to be chunked into pages. In order to support Collected Selection, Yahoo! Photos introduced the concept of a “tray” into the interface (Figure 3-10). On any page, photos can be dragged into the tray. The tray keeps its contents as the user moves from page to page. So, adding a photo from page one and three more from page four would yield four items in the tray. As a nice touch, the tray would make itself visible (by sliding into view) even when the user was scrolled down below the fold.
Figure 3-10. Yahoo! Photos used a “tray” to implement a form of Collected Selection; the confusing aspect was which actions in the menu operated on the tray versus the photos selected on the page
There was a problem with the design, however. In the menu system it was hard to discern whether the user meant to operate on the selection (photos on the page could be selected through an Object Selection model) or on the collected items in the tray. To resolve this ambiguity, the drop-down menus contained two identical sets of commands. The first group of commands in the menu operated on the collected items in the tray. The second set of commands operated on the selected objects. Needless to say, this was confusing since it required the user to be fully aware of these two selection models when initiating a command.
One way to remove this ambiguity would have been to have a single set of commands that operated on either the tray or the photos—depending on which had the focus. This would require a way to select the tray and a way to deselect it (by clicking outside the tray). A possible approach would be to slightly dim the photo gallery when the tray is selected (causing it to clearly have the focus), and do the opposite when the tray is not the focus.
* * *
Best Practices for Collected Selection
Here are some best practices to keep in mind:
If you allow selection across page boundaries, accumulate the selected items (from each page) into a separate area. This makes the selection explicit even when moving from page to page.
Use Collected Selection to blend Toggle Selection and Object Selection in the same interface.
Watch out for ambiguity between items selected with Collected Selection and any items or objects that can be normally selected on the page.