This is my 3-year-old daughter Ylva, trying to make one of our computers do what she wants.
Don’t get me wrong: Ylva is perfectly adept at using computers. This one, however, is frustratingly unresponsive. Nothing happens when you try to interact directly with the pictures on the screen. Instead, you have to translate your desires into indirect manipulation of a mouse or touchpad and keyboard. It’s not intuitive, and it’s taking her some time to adjust.
It’s as if I told her to draw a picture, but that she had to do it with oven mitts on.
Life at the cutting edge of 2007
Recently I helped teach a programming class. The adult students all came with computers that had the exact same input/output limitations as the ones my daughter is struggling with above. In fact, 28 out of 29 of them were using some form of Apple MacBook.
I can understand why. From 2006 to 2008, I used an Apple MacBook Pro almost exclusively. It was sleek, beautiful, and nearly everything about it was well-designed. The way the OS smoothly complemented the hardware was a joy.
For various reasons I switched back to using Linux machines for development, augmented with Windows boxes for video editing. But pretty much all of my programmer friends have a MacBook. In the years since, they’ve only gotten more sleek, more beautiful, and more powerful.
A MacBook in 2015 is the best 2007-era computer money can buy.
Ruined by touch
Sometime not long after that MacBook, I bought an early Lenovo convertible netbook for travel use. “Convertible”, meaning you could flip the screen over and use it as a tablet. It was slow, clumsy, and the software support for it (in either Windows or Linux) was terrible.
And yet, despite all the warts, it ruined me forever. From that point on, I would find myself reaching out to jab at some item on a laptop screen with my fingertip, only to stop myself just shy of the glass. Non-touch screen devices started to feel “dumb” and unresponsive, in much the same way that mouse-less machines had started to feel after I graduated from DOS computers.
In the years since that little convertible, our household has filled with touch-screen devices. Our younger children are baffled by the rare screen they find that won’t respond to their touch. Clearly (to them), those devices are broken.
Also in those years, Microsoft went all-in on touch screens and convertible PCs. They’ve kicked off a veritable Cambrian explosion of devices, with every imaginable size, form-factor, and bizarrely improbable screen rotation mechanism. They’re all a little cool, and a little wonky. None of them are quite as polished and honed as a MacBook.
Last year I bought a Windows 8.1 laptop to use as a video editing box. It has a 3200×1800 multitouch display. When I’m sitting out on my deck, with this laptop on my knees, I find myself smoothly switching between touchpad, keyboard, and direct screen interaction. I don’t really think about it. I just do what feels natural. Sometimes it feels natural to mouse around precisely. Other times, it feels right to fling a web page up and then catch it, or jab at a button right on the screen. Or pinch to zoom and scroll on a Google map. Reaching out to scoot a video clip from one track to another, it becomes strange to think of doing it any other way.
Then I switch to my older ThinkPad Linux box, with its traditional non-interactive display, and feel disconcerted and a little annoyed. Why is this screen broken?
The Windows experience is not slick and beautiful, like the Mac experience is. Windows 8 was a bizarre mash-up of desktop and touch UI. UI controls are oddly placed or inexplicably ugly. Sometimes fonts still render weirdly. Scaling for high-DPI displays is still a work in progress. A lot of this has improved in Windows 10, but there’s still work to be done.
But for all that, I have a hard time imagining buying a new laptop that doesn’t have a touch screen. Without touch interaction, machines just feel… half-dead. Un-tactile. Kinda broken.
Microsoft innovation, and other oxymorons
From my observation, in 2015, Microsoft is the only company seriously questioning and innovating on what a personal computing device can be. And when I suggest this out loud, I get pushback from other geeks.
One thing I hear is that Microsoft is trying to make machines that are “all things to all people”.
I’m not much of a prognosticator, but I feel pretty confident about the following prediction: If you traveled 20 years into the future and told someone that a machine that supported keyboard and trackpad and a touch-screen was considered “all things to all people”, they’d bust out laughing. And then they’d wave their hand to command their music to start playing again.
I’m worried about the kind of thinking that says “whoah, whoah, a touch screen? Now you’re just trying to do everything!”
I’ve also been told that my kids shouldn’t be surprised when they try to touch a screen and it doesn’t respond. After all, they’re just trying to use it wrong.
“Trying to use it wrong”. That’s a red flag for me. It’s a phrase that’s often indicative of the deep-seated meme that we just need to do a better job of serving the machines. A lot of programmers believe this, either consciously or unconsciously. A lot of programming languages are built on this idea.
My kids aren’t wrong to try a language of direct, intuitive interaction that they are familiar with. From their point of view, these devices are engineered wrong. In 2015, I’m inclined to agree. In 2035, I’ll be trying to explain the concept of a “non-touch” display to my grandkids, and failing.
I’ve also heard that all these hybrid devices just can’t decide if they are a PC or a tablet or a phone. But I don’t think that’s what’s going on at all. I think Microsoft has embraced the idea that your biggest decision is what size of computer you want. Whatever size that turns out to be, of course you can directly manipulate things on the screen, and of course you can attach a keyboard to it.
The power of plain text
So if you’ve read all this way, you probably think I’m trying to make the point that touch screens are important, or that Microsoft is awesome now, or that Apple is falling behind. But that’s not it at all.
Microsoft is still a mess. Apple is still doing what they do perfectly. They don’t want to embrace new forms of PC until they can do it perfectly. That’s what Apple does. That’s what they should keep doing.
But I am worried that as programmers, we seem to have all decided that the laptop PC circa 2007 is the ideal environment to be developing in. And that the best way to program on it is to simulate a teletype circa 1970.
About those teletypes…
Look, I get it. I truly, truly get it. At my first programming job the first question you asked when someone told you you’d be using a new programming language was “where is the install CD for the IDE?”. We were totally dependent on wizards and GUI tools and it was bad. We were in thrall to the vendors.
And the vendors abused us. They buried important configuration in binary files and XML and do-not-modify magic code sections, and told us not to worry our pretty little heads about it. They sold our managers on UML CASE tools that would write the code for us and render those annoying smelly coders obsolete. (Except for the ones who wrote the code to map the pictures to code but pay no attention to the hacker behind the curtain…)
So I read The Pragmatic Programmer and I fiddled with GNU/Linux at home and I installed Cygwin and Emacs at work. And I learned the zen of the command line. I learned how the constraint of operating on nothing but streams of plain text could be freeing.
At the point I’m writing this, my primary OS has been some form of UNIX for nearly a decade, and I’ve been using Emacs to write my code even longer. I wrote all my books in Org-Mode. My first reflex when I can’t make something work is to pop open a terminal. I once wrote a web application in Bash just for the hell of it.
I get it. Constraints are powerful. Lowest-common-denominator orientation is powerful. Simple, well-understood building blocks are powerful.
Illusions of maximal efficiency
But constraints can also limit our imagination. Moreover, they can make us lazy.
We know how to maximize our productivity in a console window. We know about modal editing and macros and pipe filters and grep. We can study this stuff and feel like UNIX deities, imagining that we have maximized our efficiency at communicating our ideas to the computer. We can point and laugh at people who are still poking around with their mouse pointer when we’ve already banged out a half a dozen commands.
The trouble is, programmers are really good at finding local maxima and thinking they are global maxima.
Just as an example: in the last year I finally learned to touch-type. I’m getting pretty fast.
But lately my right ulnar nerve has been giving me trouble, and sometimes my Kinesis keyboard isn’t enough to fully ameliorate the problem. As a result, the other day I tried out the built-in Windows voice dictation software to do some writing.
And I discovered something surprising: no matter how good I get at touch-typing, there is no way in hell I’ll ever type as fast as I can dictate by voice. It turns out that speaking is a really efficient way to communicate words. Who knew??
But see, I always thought of voice dictation as something for management dorks in stock photos. My background is in a programmer culture that holds up the keyboard as the gold standard of human/computer interaction efficiency. What could possibly be faster?
I also discovered that I can easily and efficiently use GMail by voice in Internet Explorer, but not in Google Chrome. Ironic, that.
Did you hear about the programmer who was tasked with making Formula One real-time data accessible to the blind? He wound up coming up with a way to make that data more accessible to everyone. This kind of thing happens surprisingly often. When people start thinking about solutions using alternate media, they end up discovering ways to improve usability across the board. Those global maxima of efficient interaction turn out to be local maxima built on historical conventions.
Dancing about algorithms
So where do we draw the line? Are we going to start communicating with our computers via interpretive dance?
I dunno. But that’s an interesting question now that I think about it. I wonder what we’d find out if we tried? Maybe our computers could notice when we hunch forward and when we sit back, and react appropriately. More likely, we’d develop a kinesthetic vocabulary of interaction that I can’t even begin to imagine yet.
The laptop/keyboard/terminals/text editor combo that we tend to prefer as programmers has reached its pinnacle in the MacBook. But it’s not sacred. It’s legacy. It’s punch cards and teletypes.
I’m not saying that the future is “graphical programming”. That’s another limiting idea. And a lot of programmers have been legitimately burned by attempts in that direction.
IDEs have shown repeatedly how awkward it is to try to put a graphical face on top of an interaction that is fundamentally about manipulating ASCII text files. And many of us have tried the “boxes and lines” stabs at “graphical programming” systems, and seen how their metaphors fall down and their abstractions fail to scale. A bitmap square to “represent” a C function (written in ASCII text) doesn’t help anyone get anything done faster.
But there is no law that says instructions to the computer have to be in ASCII… or even in the form of text. And boxes and arrows aren’t the only way to encode behavior graphically. The most popular programming system on the planet isn’t BASIC or C; it’s Microsoft Excel. Paul Chiusano makes a convincing case that a spreadsheet model is a logical next step for programming.
And again, the idea of “graphical programming” is still arbitrarily limiting. The future of human/computer interaction is touch, and motion, and vibration, and texture, and voice, and expression, and eye contact, and probably thought as well.
And we fool ourselves if we believe we can effectively develop software for the future of computer-mediated creativity without also physically occupying the future of human-computer interaction.
Leaving the past behind
This world is coming either way. I doubt my daughter is going to stand for a world in which the only way to reprogram her computer is to punch up a virtual simulation of a terminal, complete with a simulated rack of buttons with letter glyphs painted on them.
But we can influence which decade the future arrives in. And we can choose whether it’s the vendors or the hackers who lead the way.
Get a Windows convertible laptop. Get a Chromebook. Try out voice dictation. Fiddle with a Kinect. Pre-order a VR headset.
Try to write a program on your phone. Then imagine what it would take for the experience to not suck. Do not accept “it’s just the wrong medium” as a final answer. Ask why it is impractical to program on a machine that is able to sense your touch, hear and interpret your voice, watch your expressions, feel your movements. And imagine what it would take to make communicating your intent to such a device feel effortless instead.
Feel the pain. Feel the awkwardness of areas that have been neglected by programmers because we disdain them. And at the same time, feel the possibilities.
There are a few efforts attempting to make dictation into a real input method that programmers (or anybody) can use. I’ve been using VoiceCode (http://voicecode.io) for a few months and have been very impressed at how useful it is. It’s not perfect by any means, but it actually works really well. It also has a very steep learning curve to get productive… it feels at the end of the day a lot like learning a natural language.
I haven’t taken a serious look at any other possibilities, but I know there are a couple designed specifically for Emacs.
This video is what inspired me in the first place to look into this: https://www.youtube.com/watch?v=8SkdfdXWYaI
Thanks for those links! I’ve queued them up for investigation.
Amazing article, thanks, Avdi! Just out of curiosity, what Linux distribution are you using for development? It would be great if you could write a post on your development setup.
Read your blog and loved the idea behind it. As an engineer working in telecom, I want to read about others’ journeys in learning more about this vast field.
We are constantly extending our knowledge about the physical world and still don’t know pretty much anything about consciousness.
The problem you described here is a problem of computer-brain interaction – we want to communicate with computers directly (limbs are another medium that we don’t actually need for that) and the next step here is to let computers write programs for themselves to do what we want from them.
for clarity: language is another medium we actually don’t need 🙂
Software’s essential form really is close to text. “Any sufficiently complicated non-text programming system contains an ad hoc, informally-specified, bug-ridden, slow implementation of half of ordinary text.”.
This is because software is essentially constructive: it is about assembling and analysing, connecting and nesting, hierarchies of pieces. And text, also built from layers of parts, reflects that essential structure in a very effective and efficient way.
See http://www.hxa.name/notes/note-hxa7241-20140330T1114Z.html for a brief pondering.
So, there might be some room for surface UI innovation — gesturing a bit, dictating a bit, graphic designing a bit — but underlyingly, the structure seems inevitably quite textual.
Underlyingly it’s all 1’s and 0’s
i think this is so refreshing to hear. I had a brief nerve issue that kept me from typing. This was in late 2009 but I found I could operate at a professional level after about a week of experimentation. Having lots of words auto-paste code like TextMate/Sublime snippets works wonders. I could imagine creating a specially designed regular expression language you could speak easily. With that and a template language that supported snippets I bet programming could be a live performance art.
I assume you’ve seen Gary Bernhardt’s talk on a similar theme: https://www.destroyallsoftware.com/talks/a-whole-new-world
Innovation at fundamental layers of software requires a longer attention span than most people are able to manage. As well as a willingness to be laughed at and not gain acceptance in your own lifetime.
I’m skeptical that algorithms-without-code is just around the corner. It’s been just-around-the-corner for a while. I nevertheless find utterly impossible the “Try to write a program on your phone” challenge. Text in English would almost work if it weren’t for punctuation. So I thought to look backward rather than forward for a remedy:
http://www.halfbakery.com/idea/COBOL-like_20programming_20language_20for_20mobile_20computing
Algorithms without (text) code aren’t “just around the corner and always will be” so much as “seeping around the edges and we’re not paying attention.”
My favorite example is Scratch, and its more-usable-for-real-stuff commercial semi-descendant, Stencyl. There are others, I’m sure, because with block.ly it seems like there would have to be.
In the same way that the biggest bit of real-world programming we ignore at this point is Excel, there are a lot of sources of code that programmers ignore, but do so at their peril. If you don’t think ignoring programming-by-Excel culture has hurt you, keep reading Patrick McKenzie. He’ll keep pointing it out. The short version is that most of business process (which is business DNA, which is code) is conducted there, out of the sight of our attempts at optimizing the wrong thing. Surprisingly often, the “non-code dark areas” of business that screw us up are, in fact, code that we refuse to acknowledge because our biases cause us to miss it.
I wish you would make this into a blog post in its own right 🙂
Maybe we can speculate about what characteristics a global maximum would have, based on the intersection of the problem domain with biology.
Programming involves high bandwidth bidirectional communication between the computer and the programmer.
You can think of this communication as having two half-duplex channels: computer to programmer, and programmer to computer.
In terms of the computer-to-programmer channel, the global maximum is almost certainly some kind of visual language, due to the fact that our brains have more capacity devoted to visual processing than any other sense, and because language is how we manipulate logic and logic is how computers operate.
Written words might not be the best visual language for high bandwidth communication, but all the alternatives I can think of suffer either from a loss of precision or a loss of accuracy.
In terms of the programmer-to-computer channel, hands would seem to clearly be the best suited for an output interface, again based on how much capacity our brains devote to them:
http://www.amareway.org/holisticliving/06/sensory-homunculus-cortical-homunculus-motor-homunculus/
Any mechanism of communication that’s not our hands or our voice is going to be at a distinct disadvantage. This doesn’t mean that other forms of communication won’t work – they just won’t be as efficient ceteris paribus.
Based on what humans are biologically good at, I think the optimal communication strategy between us and computers is to receive information visually, and send it tactilely, possibly augmented at times with vocal and auditory subchannels.
If this is true, then there’s always going to be a usability tradeoff with small form factor devices. A device with less tactile bandwidth will always be at a disadvantage compared to a device with more tactile bandwidth.
Maybe a device that’s constrained in the tactile area can try to leverage other channels, but that doesn’t really help them in relative terms because the larger devices can leverage the same capabilities and obtain the same benefit.
Well, sorta kinda. Programming also involves communication between one programmer and another, and getting this channel right is vital to successful software development. Knuth: “Let us change our traditional attitude to the construction of programs: Instead of imagining that our main task is to instruct a computer what to do, let us concentrate rather on explaining to human beings what we want a computer to do.”
Also it’s not very high bandwidth. Programming is intrinsically complex, so a low-bandwidth channel is perfectly sufficient for all the complexity that one programmer can produce or consume. More like poetry or music than like story-telling.
This is good, sensible advice and a powerful opportunity. Being “dressed in overalls and look[ing] like work”, I expect most of us will miss it 🙂
Great article, can’t agree more. IMO the main problem still lies in the interaction between OS and hardware, for some reason. I wonder what this new https://pixel.google.com will bring us.
I’m also a fan of the “you can use your phone as your computer” approach; sadly the Ubuntu folks who, I think, started it are falling behind. So I also wonder what those new Lumias can bring to us developers.
We’ll definitely see it soon, and our kids won’t stand the old approach)