Mozilla Ubiquity (& an alternative to it)
Aza Raskin & co. at Mozilla have released a preview interface for their new experiment, called Ubiquity. The goal is to come up with a Quicksilver-like interface for the browser to aid in completing tasks. This differs from how web interaction works today, which is information-based: a user finds disparate bits of information and combines them manually. With Ubiquity, the user specifies the task and lets the software figure out the information needed to complete it. But can they do it?
To quote the announcement page:
The overall goals of Ubiquity are to explore how best to:
- Empower users to control the web browser with language-based instructions. (With search, users type what they want to find. With Ubiquity, they type what they want to do.)
- Enable on-demand, user-generated mashups with existing open Web APIs. (In other words, allowing everyone–not just Web developers–to remix the Web so it fits their needs, no matter what page they are on, or what they are doing.)
- Use Trust networks and social constructs to balance security with ease of extensibility.
- Extend the browser functionality easily.
The idea is interesting, but I see a few immediate and show-stopping roadblocks. To start, it all hinges on their ability to capture and interpret the intricacies of English (and the other languages they hope to target). Natural Language Processing is one of the most difficult tasks in computer science, if not the most difficult. We have wanted computers to understand us for some time, and our best attempts have produced systems that can, at best, mimic conversation. Systems like Alice (http://alice.pandorabots.com/) do a good job of responding to conversation, but not nearly as good a job of understanding its intent. The Ubiquity team will have to figure this out to progress. From my research experience (http://www.firstmonday.org/issues/issue12_9/argamon/index.html), I know that some patterns can be found in text. But to actually 'interpret' it is a different matter.
Of course, if they restrict the language that can be used, the job gets easier. But that does not work out in the long run. English is a moving target; there is no sound way to pick a proper, all-encompassing wordlist. And a restricted language ends up no different from a particularly verbose programming language (think AppleScript).
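To make the restricted-language concern concrete, here is a minimal sketch in Python. It is not how Ubiquity works; the command names and phrasings are hypothetical, chosen only to show how a fixed vocabulary accepts one wording and rejects anything else.

```python
import re

# Hypothetical restricted grammar: exactly one accepted phrasing per command.
COMMAND_PATTERNS = {
    "map": re.compile(
        r"^map from (?P<origin>.+?) to (?P<destination>.+)$", re.IGNORECASE
    ),
    "email": re.compile(
        r"^email (?P<what>.+?) to (?P<recipient>.+)$", re.IGNORECASE
    ),
}


def parse_command(text):
    """Return (command_name, arguments), or None if the phrasing is not recognized."""
    cleaned = text.strip()
    for name, pattern in COMMAND_PATTERNS.items():
        match = pattern.match(cleaned)
        if match:
            return name, match.groupdict()
    return None


print(parse_command("map from Chicago to Boston"))
print(parse_command("directions to Boston from Chicago"))
```

The first call returns the "map" command with its two arguments; the second returns None even though a person would read both sentences the same way. Every new phrasing needs another pattern, which is how the wordlist turns into a programming language of its own.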
Another problem is the audience for this. Many people would find this feature unnecessary or unfriendly. I might be the type of person to use it, but it is not clear that it would suit most users. Anything from a simple spelling error to a mis-phrasing (swapping the "from" location with the "to" location, or some other context problem) would frustrate and deter average and advanced users alike.
An alternative would give the user a template to fill in with bits of information as they come up with them. Imagine a panel on one side containing task specifiers (like the bookmarks section in most browsers, but more focused on the current page's resources). Selecting an item in this section would open (non-modally) a window providing a template for completing the task.
Selecting a Task
A panel would appear on the left side of the screen with tasks, nested in groups. A search bar would appear when this section has focus, showing individual tasks that match the description. Once a task is selected, it causes a different section to open (as either a floating semi-transparent window on the screen, or another section on the right of the screen) with the fields needed to complete the task.
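The nested task list and its search bar could be sketched roughly as follows. The group and task names are made up for illustration, and the matching rule (every query word must appear somewhere in the task name) is just one assumption about how the search might behave.

```python
# Hypothetical task groups; a real panel would presumably build these from
# whatever open APIs are plugged into the browser.
TASK_GROUPS = {
    "Travel": ["Book a flight", "Map a route"],
    "Social": ["Post an update", "Set up an event"],
}


def search_tasks(query, groups=TASK_GROUPS):
    """Return (group, task) pairs whose task name contains every word of the query."""
    words = query.lower().split()
    return [
        (group, task)
        for group, tasks in groups.items()
        for task in tasks
        if all(word in task.lower() for word in words)
    ]


print(search_tasks("flight"))
print(search_tasks("event set"))
```

Because the words are matched independently, "event set" still finds "Set up an event". The user only has to recognize a task, not remember the exact phrase for it. An empty query returns every task, which matches the panel simply showing its full list.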
Providing Data to the Fields
Imagine that the item "Book a flight" was selected in the task section. The template panel would open and provide areas for the user to enter start and end locations and times, mode of transportation, some sort of ordering preference (cheapest, fastest, etc.) and a field for people to notify (possibly from the address book).
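As a rough sketch, the "Book a flight" template could be modeled as a named set of fields that can be filled in any order. The field keys and the choice of which fields are required are my assumptions for illustration, not part of any existing browser API.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Field:
    label: str
    required: bool = True
    value: Optional[str] = None


@dataclass
class TaskTemplate:
    name: str
    fields: Dict[str, Field] = field(default_factory=dict)

    def fill(self, key: str, value: str) -> None:
        """Store a value for one field; fields may be filled in any order."""
        if key not in self.fields:
            raise KeyError(f"unknown field: {key}")
        self.fields[key].value = value

    def missing(self) -> List[str]:
        """Keys of required fields that are still empty."""
        return [k for k, f in self.fields.items() if f.required and f.value is None]


def book_a_flight() -> TaskTemplate:
    # Which fields are required is an assumption made for this sketch.
    return TaskTemplate(
        name="Book a flight",
        fields={
            "origin": Field("Start location"),
            "destination": Field("End location"),
            "depart_at": Field("Start time"),
            "arrive_by": Field("End time", required=False),
            "mode": Field("Mode of transportation", required=False),
            "order_by": Field("Ordering (cheapest, fastest, etc.)", required=False),
            "notify": Field("People to notify", required=False),
        },
    )


template = book_a_flight()
template.fill("destination", "Boston")  # fill whichever bit is known first
template.fill("origin", "Chicago")
print(template.missing())
```

Here the only required field left is "depart_at". Since each value goes into a labeled slot, the "from" and "to" locations cannot be swapped by a mis-phrasing, and the panel can show what is still missing instead of rejecting the whole request.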
Other examples could be mapping how to get from one place to the next, communicating some information to a social networking site, or setting up an event. All of these would depend on open APIs plugging into the browser, and a settings panel would be available to show what the defaults are currently set to, change them, and adjust other options.
Conclusion
Microsoft tried this out in XP and other products released at that time; the clearest example is the oft-hidden task pane in Explorer. I believe the problem with those menu items is that, as with many things in Windows, there were already many other ways to accomplish the same tasks, carried over from previous versions of Windows. Users did not shift over to the task pane, and it went unused.
With the web, we have the opportunity to introduce these capabilities where none exist (except as unconnected actions on each page). The Ubiquity idea is good, but maybe it can be better. Maybe it can be applicable to more users: not only those who would also like Enso or Quicksilver, but also those who do not wish to type so much or remember language patterns in order to type in phrases.