I have been obsessed with User Interfaces (UI) for as long as I can remember. I remember marveling at the beauty that was Compaq TabWorks while I played “The Incredible Machine” and listened to “Tears For Fears—Greatest Hits” on the family computer.
Don’t judge me. I was listening to “Mad World” way before Donnie Darko and that creepy rabbit. If none of those references landed with you, it’s probably because I’m super old. In the words of George Costanza, “It’s not you, it’s me.”
That’s another super old reference you might not get. You know what—forget all that, let’s move on.
I really got into UI when I bought my own computer. I had joined the Coast Guard and saved a bunch of money during boot camp (when you can’t go shopping—you know—because of push-ups and stuff). I wanted to buy a Chevy Cavalier (sadly, that’s not a joke), but my father encouraged me to invest in a computer instead, so I bought a Compaq from Office Depot that came with Windows 98. Also you can’t buy a Cavalier with 800 bucks.
I spent countless hours changing the themes in Windows 98. I was mesmerized by the way windows overlapped and how the icons and fonts would change; the shapes of buttons and the different colors. The slight drop shadow each window had to layer it in space. Each theme was better than the previous theme!
If only I had known how much better things were going to get. If only I had known, about Windows XP.
Does love at first sight exist? No—don’t be ridiculous. Love is an extremely complex part of the human condition that can only manifest itself over time through long periods of struggling and the dark night of the soul.
“What is love? Baby don’t hurt me. Don’t hurt me. No more.”
—Haddaway, “What Is Love”
But love’s fickle and cruel cousin, Infatuation, does exist and it is almost exclusively available at first sight. I was absolutely infatuated with Windows XP.
The curves on the start menu. The menu animations. I could just look at it for hours. And I did. Shocking fact—I wasn’t exactly in high social demand so I had a great deal of free time to do weird things like stare at an operating system.
For those who remember, Windows XP was extremely customizable. Virtually every part of the operating system could be skinned or themed. This spawned a lot of UI hacking communities and third-party tools like WindowBlinds from the fine folks at Stardock. I see you, Stardock; the north remembers.
I Love UI
I could go on and on about my long, boring and slightly disturbing obsession with UI. Oddly enough, I am not a designer or an artist. I can build a decent UI, but you would not hire me to design your site. Or you would but your name would be “Burke’s Mom.”
I can, however, assemble great UI if I have the building blocks. I’ve been lucky enough to work on some great UI projects in my career, including being part of the Kendo UI project when it first launched. I love buttons, dropdown lists, and dialog windows with over-the-top animation. And I can assemble those parts into an application like Thomas Kinkade. I am the UI assembler of light.
But as a user, one thought has been recurring for me during the past few years: the best user experience is really no user interface at all.
UI is a Necessary Evil
The only reason that a UI even exists is so that users can interact with our systems. It’s a middle-man. It’s an abstracted layer of communication and the conversation is pre-canned. The user and the UI can communicate, but only within the specifically defined boundaries of the interface. And this is how we end up with GLORIOUS UX fails like the one that falsely notified Hawaiian residents this past weekend of an incoming ballistic missile.
This is the screen that set off the ballistic missile alert on Saturday. The operator clicked the PACOM (CDW) State Only link. The drill link is the one that was supposed to be clicked. #Hawaii pic.twitter.com/lDVnqUmyHa
— Honolulu Civil Beat (@CivilBeat) January 16, 2018
We have to anticipate how the user is going to think or react, and everyone is different. Well-designed systems can get us close to intuitive. I am still a fan of skeuomorphic design and “sorry not sorry.” If a 4-year-old can pick up and use an iPad with no instruction, that’s kind of a feat of UX genius.
That said, even a perfect UI would be less than ideal. The ideal is to have no middleman at all. No translation layer. Historically speaking, this hasn’t been possible because we can’t “speak” to computers.
Natural-language processing (NLP) is the field of computing that deals with language interaction between humans and machines. The most recognizable examples of this would be the Amazon Echo, Siri, Cortana or Google. Or “OK Google.” Or whatever the heck you call that thing.
I firmly believe that being able to communicate with an AI via spoken language is a better user interaction than a button—every time. To make this case, I would like to give you three examples of how NLP can completely replace a UI and the result is a far better user experience.
Exhibit A: Hey Siri, Remind Me To…
Siri is not a shining example of “a better user experience,” but one thing that it does fairly well, and the thing I use it for almost every day, is creating reminders.
It is a far better user experience to say “Hey Siri, remind me to email my mom tomorrow morning 9 AM” than it is to do this…
- Open the app
- Tap a new line
- Type out the reminder
- Tap the “i”
- Select the date
- Tap “Done”
No matter how beautiful the Reminders app is, it will never match the UX of just telling Siri to do it.
Now this comes with the disclaimer of, “when it works.” Siri frequently just goes to lunch or cuts me off halfway through, which results in a nonsensical reminder with no due date. When NLP goes wrong, it tends to go WAY wrong. It’s also incredibly annoying, as anyone who has EVER used Siri can attest.
This is a simple example, and one that you might already be aware of or not that impressed with. Fair enough; here’s another: Home Automation.
Exhibit B: Home Automation
I have a bunch of the GE Z-Wave switches installed in my house. I tie them all together with a Vera Controller. If you aren’t big into home automation, just know that the switches connect to the controller and the controller exposes the interface with which to control them, allowing me to turn the lights on and off with my phone.
The Vera app for controlling lights is quite nice. It’s not perfect, but the UX is decent. For instance, if I wanted to turn on the office lights, this is how I would do it using the app.
I said it was “quite nice.” Not perfect. I’m just saying I’ve seen worse.
To be honest though, when I want to turn a light on or off, I don’t want to go hunting and pecking through an app on my phone to do it. That is not awesome. I want the light on and I want it on now. Turning lights on and off via your phone is a step backward in usability when compared to, I don’t know, A LIGHT SWITCH?
What is awesome is telling my Echo to do it.
I can, for any switch in my house, say…
“Alexa, turn on/off the office lights”
Or the bedroom, or the dining room or what have you. Vera has an Alexa skill that allows Alexa to communicate directly with the controller and because Alexa uses NLP, I don’t have to say the phrase exactly right to get it to work. It just works.
Now, there is a slight delay between the time that I finish issuing the command and the time that Alexa responds. I assume this is the latency to go out to the server, execute the skill, call back into my controller, turn off the light, go back out to the skill in the cloud and then back down into my house.
I’m going to be honest and say that I sometimes get irritated that it takes a second or two to turn the lights on. Sure—blah blah blah technical reasons, but I don’t care. I want the lights on and I want them on NOW. Like Veruca Salt.
I also have Nest thermostats which I can control with the Echo and I gotta tell you, being able to adjust your thermostat without even getting out of bed is kind of, well, it’s kind of pathetic now that I’ve said it out loud. Never mind. I never ever do that.
NLP doesn’t have to be limited to the spoken word. It turns out that interfacing with computers via text is STILL better than buttons and sliders.
For that, I give you Exhibit C.
Exhibit C: Digit
Digit is a remarkable little service that I discovered via a Twitter ad. You’ve always wondered who clicks on Twitter ads and now you know.
I wish more people knew about Digit. The basic premise behind the service is that they save money for you automatically each month by running machine learning on your spending habits to figure out where they can save money without sending you into the red.
The most remarkable thing about Digit is that you don’t interface with it via an app. Everything is done via text; and I love it.
Digit texts me every day to give me an update on my bank account balance. This is a nice daily heads up look at my current balance.
If I want to know how much Digit has saved for me, I just ask how much is in my savings. But again, because Digit is using NLP, I can ask it however I like. I can even just use the word “savings” and it still works. It’s almost like I’m interfacing with a real person.
Now if I want to transfer some of that back into savings because I want to buy more Lego and my wife says that Lego are a “want” not a “need” and that we should be saving for our kids’ “college,” I can just ask Digit to transfer some money. Again, I don’t have to know exactly what to say. I can interface with Digit until I get the right result. Even if I screw up mid-transaction, Digit can handle it. This is basically me filling out a form via text without the hell that is “filling out a form.”
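For the curious, “filling out a form via text” is commonly built as slot filling: the bot keeps track of which fields it still needs and keeps the conversation going until they are all filled in. Here is a minimal Python sketch of that idea. To be clear, this is not how Digit works under the hood (nothing here says how it does); `TransferRequest`, `handle_message`, and the reply wording are all hypothetical names made up for illustration, and plain pattern matching stands in for real NLP.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class TransferRequest:
    """Hypothetical 'form' with one slot (amount) plus a confirmation flag."""
    amount: Optional[float] = None
    confirmed: bool = False


def handle_message(request: TransferRequest, message: str) -> str:
    """Update the request from one text message and return the bot's reply."""
    text = message.lower().strip()

    # The user can bail out at any point, even mid-transaction.
    if text in ("cancel", "never mind", "nevermind"):
        request.amount = None
        request.confirmed = False
        return "OK, cancelled."

    # Any number in the message fills (or overwrites) the amount slot.
    match = re.search(r"(\d+(?:\.\d{1,2})?)", text)
    if match:
        request.amount = float(match.group(1))
        request.confirmed = False
        return f"Transfer ${request.amount:.2f}? Reply yes to confirm."

    # No amount yet: ask for the missing slot instead of failing.
    if request.amount is None:
        return "How much would you like to transfer?"

    if text in ("yes", "y", "yep"):
        request.confirmed = True
        return f"Done. Transferring ${request.amount:.2f}."

    # Anything else: repeat the pending question.
    return f"Transfer ${request.amount:.2f}? Reply yes to confirm."
```

Sending “transfer some money”, then “50 bucks”, then “actually make it 75”, then “yes” through `handle_message` with the same `TransferRequest` walks through the whole “form” and ends with a confirmed 75.00 transfer. Changing your mind halfway just overwrites the slot.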
After using Digit via text for so long, I now want to interface with everything via text. Sometimes it’s even better than having to talk out loud, especially if you are in a situation where you can’t just yell something out to a robot, or you can’t be bothered to speak. I have days like that too.
Is UX as We Know it Dead?
No. Emphatically no. NLP is not a substitution for all user interfaces. For instance, I wouldn’t want to text my camera to tell it to take a picture. Or scroll through photos with my voice. It is, however, a new way to think about how we design our user interfaces now that we have this powerful new form of input available.
So, before you design that next form or shopping cart, ask yourself: Do I really even need this UI? There’s a good chance that thanks to NLP and AI/ML, you don’t.
How to Get Started With NLP
NLP is far easier to create and develop than you might think. We’ve come a long way in terms of developer tooling. You can check out the LUIS project from Azure which provides a GUI tool for building and training NLP models.
It’s free and seriously easy.
Here’s a video of me building an AI that can understand when I ask it to turn lights on or off by picking the light state and room location out of an interaction.
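If video isn’t your thing, here is the shape of the problem as a minimal Python sketch: take a sentence, decide whether it is a “lights” request, and pick out the light state and the room. This is not LUIS and it is not a trained model. It is hand-written pattern matching standing in for one, and `parse_light_command`, `ROOMS`, and the `SetLightState` intent name are hypothetical names invented for this example.

```python
import re
from typing import Optional

# Hypothetical room list; a trained model would learn these from labeled examples.
ROOMS = ("office", "bedroom", "dining room", "kitchen")


def parse_light_command(utterance: str) -> Optional[dict]:
    """Return an intent plus its entities, or None if this isn't a lights request."""
    text = utterance.lower()

    # Entity 1: the light state. Word boundaries keep "office" from matching "off".
    state_match = re.search(r"\b(on|off)\b", text)

    # Entity 2: the room, if one of the known rooms is mentioned.
    room = next(
        (r for r in ROOMS if re.search(rf"\b{re.escape(r)}\b", text)),
        None,
    )

    # Intent: only treat it as a lights command if it mentions lights and a state.
    if "light" not in text or state_match is None:
        return None

    return {"intent": "SetLightState", "state": state_match.group(1), "room": room}
```

Both “Alexa, turn on the office lights” and “Could you switch the dining room lights off?” come back with the `SetLightState` intent and the right state and room, while “What is the weather today?” comes back as `None`. A real NLP model would generalize to phrasings that this keyword matching misses, like “kill the lights.”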
“And this is how we end up with GLORIOUS UX fails like the one that falsely notified Hawaiian residents this past weekend of an incoming ballistic missile.” This is incorrect. That was the early speculation, but it turns out the user legitimately thought there was an incoming missile and pressed the button they intended to. I strongly suggest a revision to this article.
I saw that in the news. In all fairness, when I wrote the article that was still the case and it was SUCH a good example. Can you think of a different “glorious ux” fail I can reference?
If you can’t find another “GLORIOUS UX fail,” then you should probably rethink whether anything you wrote makes sense.
UI is a way to communicate, and I find buttons and text much more charming than being forced to talk to a machine and having it talk back (or not, if you are not saying the right things).
That’s almost like asking the phone to replace email. That’s not the case. We are talking about different topics. And no UI is definitely not the best user experience, because even voice commands are a kind of UI, just without telling you what is possible the way a button would, for example.
We have sayings like “a picture says more than 1000 words” and this is still a valid point. Transporting information via sound is one of the worst ways and one of the hardest for humans.
That’s what focusing on human design means: finding the best tool for the case. So yes, you may do things like make restaurant reservations via voice, but on the other hand, you did that a long time ago by phoning and talking to a person, who was more likely to give you worthwhile feedback and could ask questions.
Boils down to choosing the right tool for the job.
There are problems, as you showcased, which can be handled a lot easier with NLP. IMHO most of the time using audio or voice to convey information is way slower than using a visual representation. One example would be voiceover or screenreaders.
I think you’re right in some cases and wrong in others. For the first time, using voice is now faster and the coverage for that is growing. So while it is still slower for some interactions – like the one you referenced, it’s a whole lot faster when I want to say – play “Owl City” on Pandora (yes, I listen to Owl City. I know.) instead of getting out my phone, opening the app, finding the station. There are a lot of other examples such as the ones in this article.
But your point is well taken – we aren’t all the way there yet, and I think for a lot of things, we won’t ever be. We won’t ever not have a UI. We just might not need one anymore depending on the use case.
Hanging out with children while they use Alexa is informative, and I come away from it feeling the exact opposite way that you do: for all the “it just works” propaganda, I see kids having to learn all the same sorts of UI management that GUIs have.
Real-life UX-handling from a 6-year-old:
“Alexa, play Bad Romance”
“Playing sample of Lady Gaga’s Bad Romance”
[Annoyed] “Alexa STOP. Alexa play Bad Romance from Spotify”
“Playing Bad Romance by Lady Gaga on Spotify”
Look at all the crazy UI elements: “Alexa” “Play” “Stop” “Spotify” [Artist name] the concept of “sample” — how are these different from an X to close or the concept of being offline, etc. It is still about learning a system. The “smarter” a system seems to be, most likely the more limiting it is — always assuming you only eat Chinese Food (if you are from China), or that you are straight, or white, or a man — and what those things mean — or that a paid-subscription model is a given.
Yes, voice commands are also UI (in this case the interface consists of a microphone and boxes) and the UX can be terrible (did you mean x? can you repeat the question? Ordering new CD of lady gaga… nooooo alexa nooooo cancel order…)
We definitely still have ground to cover in NLP, but the technology is advancing at a screaming pace. Just think of how much better it is now than, say, 5 years ago.
You’re right though – when NLP fails you as an interface, it’s SUPER frustrating. In many cases, it’s even more frustrating than a traditional interface fail.
I would argue that voice commands are a form of UI, but one that’s sometimes not accompanied by a GUI.
I like using my voice for some things, especially if I am not in a hurry. But I still prefer a good GUI in most cases due to speed and privacy. No one wants to listen to me talking to my phone on the bus, and I don’t want other people to hear what reminders I’m setting, or my bank account balance and so on.
As you said though, in some use cases voice can be useful. I mostly use it when I am at home and writing long articles; it’s much easier to do by voice and then edit small errors later by hand.
By the way, I wrote this comment with Google Voice on my phone. :)
Ha! The last line there is a home run. Tip of the hat.
I’m surprised to not see a reference to Golden Krishna and his ‘no interface’ work here.
I had not heard of Golden Krishna. Will be checking out now though!
Very interesting article. The rise of NLP is certainly a bit of a gamechanger. But, there is still a huge job to evolve the design of Voice UI as well.
That thing about the thermostat, I laughed out loud, for real. Nice writing style.
Thanks, Jennifer! That thermostat is the best decision I ever made. ⭐️⭐️⭐️⭐️⭐️
This article seems to confuse UI with visual user interface. Voice control… is a user interface. So is texting. If you’re going to argue that the best UX is no UI, then you’d want systems that accomplish your desires based on ambient information that requires no user interaction at all.
An example would be that your home automation system knows your location and the time, so when it senses that it’s 5pm and your phone GPS tells it that you’re leaving the location it knows as Work, it should turn on the heat and turn on a set of lights; as you approach the garage, it should open the garage doors; and as you walk to the door, it should unlock. Yes, you have to define where Work is and the other rules upfront, but once you do that, the activity happens with no user interaction. Similarly, instead of telling a voice assistant to turn on the room lights, they should turn on as you cross the threshold based on sensor data and turn off when you leave (presuming you’re the only one there).
Yes – ideally yes. But we’re a long way off from that, so speech is the current breakthrough. I use the term “UI” to mean what it means traditionally, not semantically.
Can I be the person to say “But verbal Input is still a User Interface” and then “No UI would only be possible if We Were the Machines”?
[goes off to write concept album called We Were the Machines]
You can, you just wouldn’t be the first person to say it.
Clearly I should have clarified that. Although clarity is not my forte. Good luck with the album!
This is how some people ended up with a bug in their home. Alexa, please tell Amazon everything we are talking about … Unbelievable. How about just using a switch to turn on the light?
You can still do that. But when you’re on the couch watching re-runs of SNL and you just devoured an entire large pizza all by yourself, “getting up and turning off the lights” will be the hardest thing you do all day. Not that I would know anything about that specific scenario.
Nice informative article. I would argue however that Voice is still an interface. Simply because it is a task a user is required to perform in order to interact with a system. There are however, “interfaceless” systems, autonomous ones. What is better than speaking to your thermostat? If it “knows” you and changes without you even asking for it. That way you are free of interfacing and there is 0 delay!
Of course these kinds of systems are mostly impossible to create, but we see them in parts all day. Take form-fillers for instance, applications which fill in your profile information automatically, relieving you of the troublesome act of manually entering your address and other information.
Google has a system which informs you it is time to get in your car so you are not late for a meeting because there is extra traffic today. So not only does it save you from having to look up and follow traffic information, it even knows you want to be reminded of it despite not actually setting a reminder.
Voice is great, automatic is better!
NLP (when working perfectly) feels so good for repetitive common tasks. But there are two caveats:
1) I feel like it can have a real learning curve for the user and does not offer good UX discoverability. But maybe over time, when these become standard, this issue won’t be around anymore
2) It can be slower than a GUI or a “physical” UI. The light switch is the best example for that. But any one click action feels more efficient for me.
NLP is surely an important part of the future of UI and will be a more efficient tool in many cases. However I think the real UI breaker will be AI-driven one/two button interfaces, where people would only be presented with the right actions at the right moment. It’s basically what we are trying to do beforehand when designing interfaces, but more dynamic and more precise. It has its own issues and it’s not yet easy to implement, but I think it can become the best UI tool.
Thanks for taking the time to write it.
“The only reason that a UI even exists is so that users can interact with our systems. It’s a middle-man.”
Perhaps this can be said about all forms of communication.
“It’s an abstracted layer of communication and the conversation is pre-canned.”
I don’t think that is correct. As you pointed out there is already variation in your interaction with the $ management app. And look at what computers have done to the game of chess. In many areas they are already less pre-canned than humans.
Things evolve. How about an NLP HTML property to add to web UI design?
At first I thought being able to squeeze my phone to launch Google Assistant was weird – but it’s actually amazing! It enables me to interact with the phone without having to touch the screen at all. Props to whoever thought of that one.
“Squeeze my phone” is hands down the best description ever made
Agree with previous commenters – voice is a UI and not a particularly good one for now except for extremely mundane tasks, mostly because of the limitations of AI and NLP. For example, we were experimenting at work with building this office companion tool that would let us do all the boring admin stuff through Slack or over voice instead of doing it through Google Calendar or our HR system; however, we found out that the technology is not mature enough to handle that complexity without spending the serious amount of time it takes to build and to configure. We could build Alexa skills for every action (book a meeting, request holiday etc) but as soon as you tried to bundle those actions into one “product” the whole thing started breaking everywhere, not to mention the many unforced errors coming from the NLP side.
From a UX-side, these types of interfaces are also tricky because:
1. They take control from users and transfer that power to AI, meaning that AI translates what you say/write into an action instead of having you click a button. In at least 10 percent of the cases (at least for us) it translated what we wanted to the wrong actions or couldn’t translate our request at all. So every interaction is a roll of the dice and leads to longer, and often more frustrating, times to perform tasks compared with traditional UIs.
2. Everyone writes and speaks differently. There are as many ways of writing and speaking as there are people. That is why it’s very hard to do NLP-driven interfaces for anything but mundane tasks. It can also lead to exclusion and difficulty among users that are different than whatever your NLP was trained on. For example, a Finnish person with heavily accented English might not be properly understood by current NLPs.
3. Finally, according to the UX research we did internally at least, people really don’t like using voice interfaces outside of their home or other personal spaces. It makes sense. Even when we have a phone call we go away from public spaces and find a private one. So having a form or a shopping cart that is voice-controlled might be okay-ish if you’re at home (even though I doubt the usefulness of it right now) but it’s an absolute no-no if you’re not (“please whisper your credit card number after the tone”). Which, you know, is really bad UX for most eCommerce shops and their customers.
Voice and text interfaces driven by NLP might be interesting, but the technology is not mature enough for any but the most mundane of tasks, and your user might still prefer traditional UIs anyway.
You know that ‘Dragon UI’ screenshot under Windows XP UI description is not from XP at all, right?
Oh snap! Good catch. 100 points for Gryffindor.
Hey there! Former Digit Product Designer here
Some great examples and points.
First – I’m really happy to hear that you enjoy using Digit, and that the conversational UI works well for you.
Second – it might surprise you to learn that for most of Digit’s existence, there hasn’t really been any NLP/AI/ML powering those interactions. Most were hardcoded syntax. Which is actually why one of the last projects I worked on there was a redesign of the app https://blog.digit.co/
I’ll say that for people who are using Digit for the bare minimum (Rainy Day Savings), the interfaceless design worked fairly well. But it was exceedingly difficult to design for new features and functionality when everything we did had to be viewed through that lens. A lot of the time it felt like reinventing the wheel. Even Alexa has a companion “traditional UI” based app. More that could be said, but most of it has been said already. True NLP could make products like Digit work better than they do, but it’s still not a one size fits all type deal imo.
Dammit Trevor! You’re ruining my argument! :)
That’s interesting to hear. I can imagine how frustrating hard coding for all situations would be. That doesn’t sound like that would scale at all. Do you think it would have been better if it was based on something like a bot framework / NLP platform? The primary value prop of NLP is that you can add intents and entities, train a model and then your bot is now that much smarter. The bot still has to know what to do with that information (which would require code), but only the action has to be coded for, not the interface.
Re: “Even Alexa has a companion traditional UI based app”. Yes, and I really prefer not to ever use it. But maybe that’s just me. It seems that right now maybe we are in a sweet spot of a mixture of traditional and conversational UIs providing the best experience.
Yes it almost certainly would have worked better if it were using true NLP. Toward the end of my time there we actually were starting to use an out of the box solution, but it was still a little rough. Again, for simple interactions, it’s usually ok. Problems arose when more complex (or sensitive) interactions were needed. Especially because people expect those things to go smoothly in that context – whereas in a traditional UI context, the expectation isn’t really there at all.
Well, in any event, thank you for pioneering a conversational UI that was near and dear to my heart. It was my first introduction to a service that I would have rather interfaced with via text. I think that’s quite an accomplishment.
Please consider accessibility and those that may not be able to speak (either permanently or temporarily, say, in a library or due to laryngitis), those who may not speak the language of the interface, those who speak with an accent, etc. Providing options is always a good idea. Thanks.
Definitely. This is another reason that traditional interfaces will always exist. Just like the traditional interface can fail the visually impaired, NLP also falls short in addressing all users. I probably should have referenced this in my conclusion as accessibility is a primary concern.
The Incredible Machine and TabWorks, wonderful memories of a simpler time
Yes! What about Lode Runner? Can you feel the nostalgia?
Glorious UX fail = easy. Look for the Apollo 8 flight. Some info in this article: https://www.wired.com/2015/10/margaret-hamilton-nasa-apollo/