I hate underscores. They are ugly. They are like the neon orange belt pack of syntax. Dated and unfashionable no matter what era they are found in. As the former owner of a neon orange belt pack, I feel I can speak with authority on this. Aesthetically speaking, underscores are one step shy of “C:\Progra~1”.

Only nerds even know what an underscore is. What’s the first thing I have to teach people who are brand new to programming? What the hell an underscore is, and where to find it on the keyboard. No, not that line. The other line. Hold down shift. There you go.

Underscores require me to hit the Shift key. When writing prose, I only hit Shift once or twice per sentence. In Ruby, I have to hit it every few letters. Awkward. Inefficient.

I’ve been thinking about rebinding the dash key to underscore in Emacs ruby-mode. It’s stupid that I’m even considering such a thing.

Underscores are visually ambiguous. When the font is underlined for some reason, the underscores get muddled up with the underlining. In badly-rendered text boxes it’s not always clear where the underscore leaves off and the bottom border begins.

CamelCase is no better. I still have to hit Shift all the time. Plus it relies on letter case, a peculiarity of the Latin alphabet and a handful of other scripts.

Lisp gets it right. Lisp S-expressions have one basic rule that vastly simplifies the syntax: whitespace separates tokens. Always.

This means that “foo - bar” parses out to the three tokens “foo”, “-”, “bar”. Whereas “foo-bar” is a single token that happens to have a dash in the middle. That’s right, your tokens can have dashes in the middle. Ordinary, easy-to-type, recognizable, visually unambiguous dashes. Lisp functions have names like:

find-file

All you have to give up for this is tightly-packed arithmetic statements, like this:

5*5-3

Instead, you have to give your tokens some breathing room:

5 * 5 - 3

That’s a trade-off I’ll gladly make if it means using dashed identifiers instead of underscores.
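To make the rule concrete, here is a rough sketch in Ruby. The helper name `whitespace_tokens` is made up for illustration, and it only models the whitespace part of the rule; a real Lisp reader also treats parentheses and quotes as delimiters.

```ruby
# Hypothetical helper (not part of Ruby or any Lisp): split source text
# into tokens using whitespace as the only separator.
def whitespace_tokens(source)
  # String#split with no arguments splits on runs of whitespace
  # and ignores leading and trailing whitespace.
  source.split
end

p whitespace_tokens("foo - bar")   # three tokens: "foo", "-", "bar"
p whitespace_tokens("foo-bar")     # one token: the dash is part of the name
p whitespace_tokens("5 * 5 - 3")   # five tokens
p whitespace_tokens("5*5-3")       # one token: no breathing room, no operators
```

Under this rule the dash only means subtraction when it stands alone, which is exactly what frees it up for use inside names.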

I mostly code in Ruby these days, and I love almost every aspect of its very rich syntax. But I hate all the snake_casing. I want a Ruby that lets me use dashes in identifiers. If I never type another underscore, it will be too soon.

UPDATE: The Recursive blog points out that Japanese keyboards may be more comfortable for typing Ruby.

Published by Avdi Grimm

68 Comments

  1. why not just use camel case? thisWouldntMakeYouHappy because you still have to hit shift I guess.

  2. You have to hit shift to type parens. 😉

  3. Of course, COBOL let you use dashes. Earlier versions even helped with casing: any case you wanted as long as it was upper…

  4. Oh, for fuck’s sake.

  5. I always  separate operators and operands using spaces for better readability, so I see no trade-offs here, only wins 🙂

  6. nice rant 🙂

  7. I respectfully disagree.  Let me preface by saying some of why I disagree is simply personal preference.  Here goes: 

    Having to hold down the shift key can’t be that uncommon for someone writing Ruby code (or pretty much any code). Brackets, parentheses, and any number of operators all require pressing the shift key.

    You bring up a valid point about dasherized text. It might be nice to have a bit more flexibility in our syntax, but I definitely believe it’s better than most alternatives.

    Furthermore, if we were to implement a flexible system, one in which both dasherized and snake_case would be tolerated by the interpreter, we would risk splitting the common semantic patterns that have taken since Ruby’s birth to build.  A relatively recent example of this would be the new 1.9 hash syntax.  

    I might have more concerns but it mostly boils down to ‘I like it’.  

    edit formatting

  8. This is the perfect rant. Entertaining, slightly irrational, but reasonable enough to not be dismissed out of hand.

    (Also, the title is bitchin’.)

  9. I wonder if you could create logic in your editor of choice (I assume either emacs or vim) so that if you type a dash as in snake-case it turns it into an underscore for you. Following the same rules of interpretation as your Lisp example. This would solve the problem of having to hold down the shift key but not the core style issue. 

  10. I don’t think it’s stupid to consider remapping keys. I was going to do an analysis of the code I’ve written and try and design a keyboard that works for me when programming. This does a quick analysis: http://patorjk.com/keyboard-layout-analyzer/

  11. snakes_in_my_pants

  12. I agree with you totally but there are cases like variable names etc… what do you think about them ?

  13. Can I point out your cultural bias here?
    Tongue in cheek of course! Imagine us poor Europeans having to deal with a multitude of different keyboard layouts.

    For example, the German layout will conveniently place the underscore right next to the right shift but will force you to press AltGr-Q for @ and AltGr-ß (ß being in place of -) for \, with AltGr being the right Alt button.

    Coding with a German layout is a ballet of jumps and slides across the keyboard 😛

    At the same time, the Swiss layout, having to accommodate three languages (German, French and Italian) goes completely overboard. But I think the one I could never get used to is the Croatian one.

    •  I hear you! I currently have a Latin American keyboard. In Spanish we have only three more symbols than English: ñ, acute accent (á) and diaeresis (ü). Simple, the keyboard designers just had to add two more keys. But what did they do? They moved every fricking symbol, added ¬, and 1/2 and other useless cr*p

  14. Sometimes it is easier if you can use symbols to differentiate tokens instead of whitespace.  The dash is used for the arithmetic minus operator.

    If you could only use whitespace for separating tokens then
    2*(b-a)
    becomes
    2 * ( b - a )
    It is really annoying to have to keep pressing space all the time.

    • If you’re going to press one key a lot, don’t you think that the huge key under both of your thumbs would be a pretty good choice?

  15. God forbid you ever use a { C } like language. All those mustaches/curly-brackets/braces oi! And Lisp with all those ( paren )’s! And certain dialects of SQL that love those [ square ] brackies!

    Ergh, sorry. What I meant to say is that you’ll never get away from shift. Not as long as people still use punctuation of any type.

    Hmmm. You could also get  a DVORAK keyboard. Then shift would feel normal.

    •  I just realized that [ square ] brackets don’t need shift. You should become a MS-SQL DB administrator.

      • In many (most?) keyboard layouts you don’t need shift for square brackets, it’s true. You need Alt+Gr instead.

  16. I’ve often wondered why we can’t use whitespace in identifiers. This can’t be an insurmountable language design problem.

    For example

     if number of animals > 12

    instead of

     if animal_count > 12
     if animalCount > 12
     if animal-count > 12

    Just a thought.

    •  Some languages do allow it, I forget which. They usually have string-like syntax to have no ambiguity as to what is delimited, but I guess you technically wouldn’t need that.

    • There is a character, NON-BREAKING SPACE (U+00A0), which looks like a space but is different from a normal space character (U+0020).  In some languages, you might be able to write “number of animals” with non-breaking spaces and have it work.

      From a brief glance, it looks like this wouldn’t work in Python (ASCII only) or Java (letter-like characters only), but might well work in Ruby 1.9 (not sure) and Lisp (e.g., with SBCL).

      For the languages which don’t support this, there’s no technical reason they couldn’t, of course, but Python and Java are somewhat known for making the programmer do things their way, and invisible characters in identifiers is not their way.  🙂

      • Of course, this doesn’t solve all of his original issues, since then you’d just be typing option-space instead of shift-minus, but it might look prettier.  Especially if you set up your IDE to style them (like with a subtle background color), so that you could tell where one identifier ended and the next began.  Otherwise a line like “if check something list foo bar” could have many possible (human) interpretations, and just lead to more confusion.

        • Thank you for posting this publicly, online. If I ever come across code that uses non-breaking spaces as part of variable names, I will know who to hunt down.

  17. this_is_easier to read than-this-is 

    QED

    P.S. If you use an editor that auto-completes (like Sublime Text) then you rarely have to type more than the first few characters of a variable name.

    • I absolutely agree. In the spirit of “writing code that’s easy for the next developer (who may be yourself in 6mos)”, I think we should be willing to make the sacrifice of an extra shift in exchange for easier readability.  

      Dashes just don’t visually stand out enough for the eye/brain to simply scan an expression and automagically parse it.  They’re like speed-bumps which cause you to slow down and “manually” determine where the token begins and ends.

      Since code is going to be read way more often than it’s written, I’m in favor of underscores vs dashes.  I can massage and rest my weary pinkies (from all that shifting) while I leisurely read through code.

      • Funny, I find the underscore version much more visually jarring. I’m used to seeing dashes in prose; they tie concepts together. Underscores? Not so much.

        • The tying together thing is what makes it hard for me.  My whole reason for wanting a dash/underscore is to separate the words for the reader, yet keep as a unit for the machine.  I guess different people “see” things differently. (BTW, Thanks a lot.  Now I’m going to spend my entire lunch break wondering if my “blue” is the same as your “blue”!)

    • No. No, it really isn’t.

    • Not to me, it isn’t.  (Or rather, in this case it is only because your comment is a variable-width font, which nobody would use for viewing source code.)

      (Or would that have been easier to read if I’d said variable_width font?)

  18. “visually unambiguous dashes”
    Really? Then please identify (by eye) which one below is the Hyphen.

    find−file
    find–file
    find‐file
    find¯file
    find—file
    find⁃file
    find―file

  19. Lispers should just die like lisp

  20. You could always use an en dash:

    #!/usr/bin/env ruby
    # encoding: utf-8

    foo–bar = "Hello World"
    puts foo–bar

  21. This can be trivially fixed in Vim by using a bunch of abbreviations such as “iab a- a_” to “iab Z- Z_” 

  22. […] by Avdi Grimm’s post on underscores in program identifiers, I’ll publicly air my guess about why theBlightOfCamelCaseInfestsOurPrograms […]

  23. object . call ( parameter , other ) do | thing |

  24. Both editors I use regularly (Komodo Edit and SciTE) treat “foo_bar” as one token for things like double-click-selection and auto-completion, and “foo-bar” as three tokens. This is why I have to side with underscore in the debate of http://stackoverflow.com/questions/1686337/hyphens-or-underscores-in-css-and-html-identifiers

  25. Bravo!

    Also, underscores in URI strings appear strikingly awful.

    The presence of an underscore within a URI appears esthetically repulsive.

    Examples:

    http://www.foxhop.net/this_uri_is_ugly

    vs

    http://www.foxhop.net/this-uri-is-pretty

    Python suffers more than Ruby from the underscore plague:

    __init__() # a constructor

    _variable = 'potato' # a conventionally private variable

    Also in Python the dash is NOT allowed in script names…

    We cannot import from module-test.py but we can from module_test.py

    Thank you for posting this, I “laughed-out-loud”.  Yes, I used dashes just now.

  26. Personally I’d much rather use underscores than add spaces to arithmetic expressions.

    Also to this day I still don’t use spaces in my filenames because shells are fucking insane when it comes to handling spaces in paths.

    Also I’m not sure why being a modified key is a bad thing; given that curly brackets and parens are both modified keys on QWERTY.
    That being said: you have a point that they’re ambiguous when being used with an underlined font. To that I must ask: when/where have you ever seen underlined Ruby code? To my knowledge I can’t even apply an underline style in TextMate.

  27. Dvorak layouts make this slightly less painful.

  28. Write yourself a pre-processor… it should be easy enough in this case.

  29. If it bothers you that much, change your keyboard layout.

  30. On an AZERTY keyboard, you don’t have to hit Shift to type an underscore. I agree with you, Lisp gets it right, except that find_file is easier to read than find-file.

  31. In this post, the author says that having the line half the line height above baseline is much better than on the baseline.

    Really, this is considered Hacker News?

  32. Hi Avdi – have a look at pogoscript (http://pogoscript.org/)

  33. glasses-mode does a decent job of correcting camelCase, adapt it for snake_case too?

    I have no sympathy for the typing complaints.

  34. Sounds like you haven’t been programming for very long, son!

  35. I don’t mind the underscore. However, I agree with the extra spacing for breathing room. “5*5-3” is an abomination. The cramped format makes it harder to scan visually.

  36. You’re stupid! 😛

  37. Well, I hate colons. Especially Double::Colons.
    And the & syntax to denote blocks passed as parameters.

    Both syntax elements seem like a special case made a bit too explicit – not as well-rounded as the rest of the language. Probably they look like that because I don’t know the underlying reasons well enough. But they still look a bit “rough”.

    But the underscores? I find they are a good balance between utility, legibility and ease of use.

  38. “Give up” tightly-packed arithmetic statements?  Oh puh-leez.  That’s like asking us to “give up” waiting half a minute for our environment to load, or “give up” inheriting steaming mounds of code.  Those who want to pack things tightly and save characters can go hack Perl.  (Or, seeing as ancient languages like COBOL have already been mentioned, perhaps APL.)

  39. 2 points:
    * You use emacs and you’re complaining about using modifier keys!?
    * You use emacs and you type more than the first few characters of a variable before autocomplete kicks in?

    That’s quite a rant you’ve got, there 🙂

  40. “Lisp gets it right.”

    I love whitespace, but I hate Lisp’s use of ‘-‘… camel case is much preferable imo, years of reading English, C and C++ mean that I naturally parse a-b as ‘a’ ‘minus’ ‘b’.

    I do like the whitespace enforcement though – I always write ‘a - b’ in code when I can. Spacing out tokens helps readability – it’s something I never would have started doing if coding standards hadn’t forced it upon me. 🙂

  41. I’ve often thought about this subtle aspect of our art, and I think it’s highly based on the comfort one feels when typing ‘in the zone’, that is to say, utterly focused on the task at hand and entirely composed. That’s when I really start to notice my annoyance at having to type underscores, because this is not only when I am typing the fastest, but also when my stream of consciousness is most forceful.

    When I’m solving problems and creating new material, I can’t stand any slowdown, even minor issues such as pressing shift too often. So I feel your pain, Mr. Grimm. On the other hand, I think it’s difficult to understand what constitutes a valid preference for the language, because we certainly have strong opinions all around. I would be extremely interested to see some serious academic research on programmer efficiency when using contrasting delimiters. Other factors might turn out to be more clearly influenced by these differences as well, such as code legibility, or more important, compatibility.

    I recently wrote some code just to solve the inconvenience of having to track whether or not my data was using hyphens or underscores for its keys. While merely hacking on a pet project, I was inspired to consider the bigger picture regarding Ruby. Because a Hash can use either a Symbol or String as its key, and an unquoted Symbol literal can’t contain hyphens, there must arise cases in which someone’s data operations are gummed up due to inconsistencies in the processing code. People like to have solid expectations about their data, and with all the data crunching Ruby is used for, I’m sure this issue has seen plenty of coding hours wasted creating work-arounds.

  42. You should check out LiveScript. Love it for this exact reason 🙂

  43. […] the words are are separated from each other by a character that most computer beginners don’t even know exists. And don’t get me started on the use of […]
