How to Make an Open Source Feature Request

When using an open source project, you may find it lacks some important feature you need to work with it effectively. When you are in this situation, you can either request the feature you need and implement it yourself, or just look for another project to use that has something closer to what you need. While most of the time people pick the second option, requesting and implementing the feature yourself can have a lot of benefits.

  • Maintenance work on the feature can be shared by all of its users
  • Working within the project exposes powerful internals that can give you exactly what you are looking for
  • Implementing the feature can give you deep knowledge of the project that can be shared within your organization

If you always choose to look for another project when it lacks a feature, you are missing out on one of the main benefits of using open source software: the fact that you have access to the source and are able to change it. This is an enormous amount of power to have, and the top technology companies take advantage of it. When time and budget allow, it should be considered as an option for your important dependencies.

Making good feature requests is an essential skill to master to be productive with open source projects. As an open source maintainer, I’ve seen a lot of variation in the quality of feature requests to my projects over the years. Making a good feature request is much more difficult than people realize. It’s part creative, part sales, and part technical. But when you get it right, it’s one of the most rewarding experiences I’ve had as a developer.

It may seem intimidating at first, but it’s much easier when you know the rules. In this article, I’d like to share some things I’ve learned about making good feature requests to help you create contributions that are able to get the attention of maintainers so they can be accepted into the project.

Creating an Issue

Once you have something in mind to work on, the first step is always to make sure you have an issue to work from. The most common beginner mistake I see is to start coding right away. This works for simple bugs when you fix something that’s obviously wrong, but any nontrivial feature will require some discussion before it’s ready to be implemented. You want to get people involved in the design process as soon as possible. The amount of discussion you generate with your proposal is a great way to gauge how interested other people are in the feature. Someone who is involved early in the process will have the best understanding of your goals and will provide valuable feedback that will guide your development. Treat anyone who gets involved early in the discussion as a potential user of the feature. Even if they seem adversarial or the feedback is negative, taking the time to respond at all should be taken as a sign of respect and an indication that a conensus is possible.

If the project is active, chances are someone may have already thought of your feature and proposed it on the issue tracker. Spend five to ten minutes searching for your issue with different wordings and see if you can find something similar to avoid creating a duplicate issue. If an issue already exists, read through the discussion carefully because it can save you a lot of time by not duplicating a discussion that has already happened or not trying an implementation strategy that is known to have problems. Check if someone is already working on the feature. If you see that someone is actively working on it, you can still contribute by adding your opinion to the discussion, testing the active branch, and reviewing the code. However, don’t get discouraged if you see someone who claims to be working on the feature who doesn’t have an active branch they are working on. In my experience, about eighty percent of the people who start working on something never actually finish it. Ask if you can pick up the work where they left off and try to credit them the best you can in your work. The best thing to do would be to use their commits directly in their branch, but that’s not always possible so at least give some thanks in your commit message.

Once you have an issue to work from, now it’s time to explain what you want to do. A good feature request always has at minimum these three components:

  1. The use case
  2. The approach
  3. The test

The Use Case

Coming to an agreement on the use case is the most important part of the discussion. If the maintainers agree that the use case is important to support, the rest is just implementation details. If the maintainers believe the use case is not valid, then nothing else is important. Don’t try to sell a beautiful approach for an invalid use case. The three most effective ways to justify a use case are 1) appeal to the project’s mission, 2) appeal to similar features, 3) demonstrate user demand.

The project’s mission is often best expressed in the description which is usually in the form of e.g., “a library to do X”. Justify your use case by explaining how your feature facilitates the user to more effectively accomplish X with this library. A more detailed description of the project mission is usually included in the project overview or the README. You may even find your feature is explicitly requested or blacklisted within the documentation. You can use all of these sources of information to support your case.

Additionally, look for features in the project that are similar to yours. This sort of appeal is extremely efficient because you get to reuse all of the justification that was used to justify the similar feature, which by default is assumed to be valid or otherwise the similar feature would be deprecated. On a related note, keep in mind that this sort of justification is so powerful, that maintainers may be wary of accepting even small features that may expand the scope of the project in unpredictable ways. To accept a certain feature with a certain justification is to implicitly accept all future features proposed within the same scope. If the project doesn’t have the resources to support the whole class of features, this is a good justification by the maintainer to reject the smaller feature even if it doesn’t add a lot of complexity by itself.

Finally, it is important to demonstrate user demand. Most often the user of the feature will be yourself or another project you are involved with. If you can, demonstrate a concrete use case with issues from other projects. Explain how adding the feature will help to fix those issues on the other projects. High demand for a feature means a larger pool of developers who can potentially come fix bugs when things break, as well as more influence within the project space.

The Approach

Discussion of the approach should come after everyone has come to a rough consensus on the nature and validity of the use case. Give a general overview of the changes you will need to make to implement the feature in a few sentences. For example, explain whatever new classes you might need to add or how the existing functions need to be modified. The amount of complexity a maintainer will allow in an approach usually relates to the strength of the use case, with a stronger use case warranting a greater amount of complexity. Changes that break backwards compatibility and need a major version bump need the most justification, so don’t propose these kinds of changes unless it’s clear to everyone why it’s important to do so.

Explaining the approach is an important step because people who know more about the project will often have valuable feedback on what you’ve proposed. What might seem simple to you may not be extensible enough to accomidate future planned work, or there might be unwritten conventions to follow so your code fits better with the project’s style. Knowing these things up front will save you a lot of time in code review.

The Test

Finally, you should include a test with your feature request. These don’t have to be formal tests, just an example snippet that demonstrates what the important part of the API will do when you are done. Including a short test tends to bring about a good discussion of details you might not have thought of like error handling. You need to have a test in mind during development anyway so you might as well post it on the issue. Be honest about any edge cases and bring up any problems you find in the issue as early as possible.

Issue Discussion Etiquite

These sorts of discussions are what give open source projects their reputation for being unfriendly places. Please keep in mind that discussions on open source issue trackers have very little resemblence to what is commonly called “normal human interaction” and a different set of rules tends to apply. They can sometimes resemble a game unto themselves much like poker with lots of posturing and bluffing. You may even sometimes feel like a lawyer in a court room. The most important thing to keep in mind is to always begin your thoughts from a place of respect and do not take things personally. The goal of the discussion is to always move forward towards a consensus. If you sense that no progress is being made, do not repeat your points. Wait a few days for others to chime in with a fresh opinion. Be willing to accept that not every idea you propose can come to a consensus in the project, and have a backup plan such as forking or starting a new project that can accomidate your use case.

Now Start Coding

Congratulations, your feature request was accepted. With all you’ve been through, this might seem like the easy part. It is very rare to have a feature rejected at this phase, but I have regretably seen it happen before. There’s a lot more that can be said after this point such as how to make a good pull request and how to respond to feedback during code review, but I’ll leave those topics to another post.

Messing Around with JavaScript Decorators

I’ve been looking at the new features in ES6 and boy has JavaScript changed a lot in the last few years. The ECMAScript standards team is really doing a great job making the language more comfortable to work with. My favorite features are block scoped variables with let, a better syntax for defining a class (I never really understood what a “prototype” was), built-in support for loading and exporting modules, and native support for promises. Proxy objects seem like they could be a powerful tool as well. My first impression is that JavaScript is starting to look like a more liberal version of Python, which isn’t a bad thing.

One of the features I looked for in ES6 and could not find was support for function decorators. Decorators can be a great thing to have in a language sometimes. When they fit into an api, they really fit in well and I often use them in my Python library code. I was surprised this feature didn’t make it into ES6 because they are used extensively as part of Angular and React and have native support in TypeScript.

The proposal for decorators can be found in this repository with user guide located here. The proposal is currently at stage two which means it will likely be included in the language in the next major update but are not ready to be included in production code yet. I wrote some sample decorators to test out the new features which you can find in my notes repository here and I’d like to explain to you how they work.

Note: I am not an expert JavaScript, front-end, or Node developer so if you see anything I’m doing wrong in these examples, please let me know in the comments or in an email.

Building the Project

Unfortunately, there is no native implementation for decorators in Node (currently version 11) or any browser yet so you must compile your decorator code with a project called Babel. This project requires two additional plugins, @babel/plugin-propsal-decorators and @babel/plugin-proposal-class-properties.

npm install --save-dev @babel/cli \
    @babel/core \
    @babel/plugin-proposal-class-properties \
    @babel/plugin-proposal-decorators

Babel must also be configured with some options in a .babelrc file you can find here.

Now you can compile and run your code with babel like this and it should work correctly:

babel ./index.js -o index-compiled.js && node ./index-compiled.js

Anatomy of a Decorator

A decorator is basically just a function that gets called in the context of a target method or property that is able to change its state somehow. Here is an annotated example of a decorator which does nothing:

function(descriptor) {
  // alter the descriptor to change its properties and return it
  descriptor.finisher = function(klass) {
    // now you get the class it was defined on at the end
  }
  return descriptor;
}
 
class Example {
  @decorator
  decorated() {
    return 'foo';
  }
}

The descriptor that is returned from the decorator is an object that you can mutate to change the target method. The important properties of this object are:

  • kind – whether this is a ‘method’, a ‘field’, or something else
  • key – the name of what is being decorated (in this case, ‘decorated’)
  • descriptor – contains configuration for the property, and the value which you can hook into with custom behavior
  • finisher – add a function to be called after the class is defined for customization of the class

To have your decorator take parameters, use a function that returns a decorator like the one above.

function decorator(options) {
  // options are passed with the decorator method
  return function(descriptor) {
    return descriptor;
  }
}
 
class Example {
  @decorator({foo:'bar'})
  decorated() {
    return 'foo';
  }
}

Example: Log a Warning When Calling a Deprecated Method

function deprecated(descriptor) {
  // save the given function itself and replace it with a wrapped version that
     logs the warning
  let fn = descriptor.descriptor.value;
  descriptor.descriptor.value = function() {
    console.log('this function is deprecated!');
    return fn.apply(this, arguments);
  }
  return descriptor;
}
 
class Example {
  @deprecated
  oldFunc(val) {
    return 'oldFunc: ' + val;
  }
}
 
let ex = new Example();
ex.oldFun('foo')
// > this function is deprecated!
// > oldFunc: foo

Example: Make a Property Readonly

function readonly(descriptor) {
  descriptor.descriptor.writable = false;
  return descriptor;
}
 
class Example {
  @readonly
  x = 4;
}
 
let ex = new Example();
ex.x = 8;
ex.x;
// returns 4. note that you don't get a warning for trying to set a readonly property

Example: Reflect a Class

Given a class, we want to find all the methods that are decorated with a certain decorator. We will use the @property decorator for this. This sort of thing would normally be used for a base class in your API that your user is expected to override.

function property(descriptor) {
  descriptor.finisher = function(klass) {
    klass.properties = klass.properties || [];
    klass.properties.push(descriptor.key);
  };
  return descriptor;
}
 
class Example {
  @property
  someProp = 5;
 
  @property
  anotherProp = 'foo';
 
  static listProperties() {
    return this.properties || [];
  }
}
 
Example.listProperties()
// > [ 'someProp', 'anotherProp']

Conclusion

Decorators open up a lot of possibilities in a language and I am looking forward to their inclusion into JavaScript. I plan to use them in a Node library I am writing right now. However, reflection is a very powerful tool and should not be used without careful consideration. Make sure the decorator pattern actually fits your use case before you decide to use them. Have fun with decorators!

Playerctl at Version 2.0

I’ve spent the last month revisiting an older project of mine called Playerctl. I wrote the first version nearly five years ago over a weekend and have been making small tweaks to it here and there in my free time since then. The idea was to make a command line application to control media players so I could bind a command to keyboard key combinations to have media key functionality. I do mostly everything on the keyboard so I found it distracting to reach over to the mouse, find the media player window, and click on a button whenever I wanted to pause the player or skip a track on the radio. Playerctl works great for this. These lines have been in my i3 config for years.

bindsym $mod+space exec --no-startup-id playerctl play-pause
bindsym $mod+$mod2+space exec --no-startup-id playerctl next

Another goal of the project was to access track metadata for a “now playing” statusline indicator in i3bar, tmux, or whatever. I did a few implementations of this, but never really made a satisfying statusline (more on that later).

Other people seemed to have found it useful too, and to this day it’s the most popular project I’ve published under my name on Github. With users come issues when people find bugs and limitations in the interface for their use case. These discussions with users have guided the development of version 2.0 of Playerctl and I think I’ve addressed everyone’s concerns. The main points that drove development for this version were:

  • Make the command line application easier to use with multiple players running at the same time
  • Make it easier to print metadata and properties in the format you want
  • Make it easier to make a statusline application
  • Make the library more usable

It wasn’t easy to do these things, but in the end, I’m pretty happy with how things came out and I hope everybody enjoys the new features. Version 2.0 was a big effort and represents nearly a rewrite of v0.6.2 and a doubling of the size of the code base to accomidate the new features. In this post, I’d like to go over the changes, share some of the rationale for the design choices I made, and alert you to some breaking changes in the new version.

Player Selection Overhaul

In the old version, you could select players by passing a name or comma-separated list of players to the --player flag but I found this feature rather limiting because the behavior was simply to execute the command on all the players. That makes it not very usable for the play command because you almost never want all your players to start playing something at the same time. My thinking is most people have something like a “main music player” which is good at handling large playlists and then a secondary player they use for movies or other random media on their system. So I changed the default behavior of the --player command to only execute the command on the first player in the list that is running and supports the command. That way people can pick the priority of the players they want to control, and if the command is not supported (such as if you command the player to go to the next track but there is no next track), it will skip the player and go to the next one. You can still get the old behavior by combining this flag with the --all-players flag.

Another requested feature was the ability to ignore players instead of explicitly whitelisting them. This is useful for instance to ignore players that are not really media players like Gwenview. Now you can pass those players with --ignore-player and they will simply be ignored.

Another detail about player selection is that now a player name will match all the instances of the players, which it didn’t do before. So if I pass --player=vlc, all the instances of VLC that are open will be selected.

I think that should cover everybody’s needs the best I can, but there are always going to be edge cases that I won’t be able to address without mind reading abilities.

Format Strings

Before format strings, the way to get multiple properties was normally with multiple calls to the CLI like this:

artist=$(playerctl metadata artist)
title=$(playerctl metadata title)
echo "${artist} - ${title}"

But that’s obviously not a very elegant solution and doesn’t scale very well if you want to print more than a few things. I implemented a few features in this version to address this. One feature is that you can now specify multiple keys and each key you specify will be outputted on its own line.

$ playerctl metadata artist title
> Katy Perry
> California Gurls

Then you can split on the newline and they’ll be in an array in that order. I still wasn’t very happy with this because it’s not very semantic for what people are trying to do with it. What people said they wanted was for a raw playerctl metadata to output JSON they could parse, which I wasn’t willing to do because I don’t want to add the dependency just for that feature, and even then, scripts would then need something like jq to parse the output. What I did do is make the metadata call output a table (instead of the serialized gvariant before) which is great for readability, but I wouldn’t recommend parsing it.

vlc   mpris:trackid             '/org/videolan/vlc/playlist/37'
vlc   xesam:title               Synthestitch
vlc   xesam:artist              Garoad
vlc   xesam:album               VA-11 HALL-A - Second Round
vlc   xesam:tracknumber         34
vlc   vlc:time                  281
vlc   mpris:length              281538480
vlc   xesam:contentCreated      2016
vlc   vlc:length                281538
vlc   vlc:publisher             3

So after deliberating for awhile, I decided to do the hard thing and just go ahead and write my own template language based loosely on jinja2 but with a lot fewer features. The parser was a lot of fun to write and I’m happy with how it came out. Now you can do this:

$ playerctl metadata --format '{{artist}} - {{title}}'
> Garoad - Your Love is a Drug

I even put template helpers into the language for additional formatting you may want to do to make the variables more readable. This is the format string I use for my statusline generator right now:

fmt='{{playerName}}: {{artist}} - {{title}} \
     {{duration(position)}}|{{duration(mpris:length)}}'
playerctl metadata --format ${fmt}
> vlc: Garoad - Dawn Approaches 3:28|4:10

The duration() helper converts the position from time in microseconds to hh:mm:ss format. There are a few others too that are pretty interesting documented in the man page.

Follow Mode

If you wanted to use the CLI to make a statusline before, you basically had to poll. And if there’s one thing that everybody hates to do, it’s polling. It would be better to have a tail -f style flag that blocks and automatically updates when things change. This was the hardest feature to add because there was a lot of functionality lacking in the library, and the CLI was designed to be mostly stateless because it was only supposed to be a one-off. There’s also a lot of edge cases with players starting, exiting, and changing states which is difficult to get right. I put a lot of detail into making sure the most relevant thing is shown on the statusline based on the input. If you have players passed with --player, it will show them in order of player priority based on the last player that has changed. If you pass --all-players, it just shows whichever one changed last. I think the last one is what I prefer.

It even works with the --format arg and will tick if you give it a position variable. Here is the grand finale:

fmt='{{playerName}}: {{artist}} - {{title}} \
     {{duration(position)}}|{{duration(mpris:length)}}'
playerctl metadata --all-players --format ${fmt} --follow

My own personal statusline implementation of this can be seen here in i3-dstatus, another one of my neglected projects that will get my attention next.

Library Improvements

I originally had bigger plans for the library, but didn’t end up doing as much with it. I still think it’s really cool though, and I want to keep supporting it. The problem was there was no way to listen to when players connect and disconnect to control, so you basically had to run your script when you knew your player was running which is not great. I needed to add this feature for the follow command anyway, so I decided to go ahead and externalize it in the form of a new class called the PlayerctlPlayerManager. This is meant to be a singleton which emits events for when players start, and keeps an up-to-date list of player names that are available to control. It can manage the players for you too and alert you when they exit.

Here is an exmaple of the manager in action:

#!/usr/bin/env python3

from gi.repository import Playerctl, GLib

manager = Playerctl.PlayerManager()

def on_play(player, status, manager):
    print('player is playing: {}'.format(player.props.player_name))

def init_player(name):
    # choose if you want to manage the player based on the name
    if name.name in ['vlc', 'cmus']:
        player = Playerctl.Player.new_from_name(name)
        # connect to whatever you want to listen to
        player.connect('playback-status::playing', on_play, manager)
        # add the player to the list of managed players and be notified when it
        # exits
        manager.manage_player(player)

def on_name_appeared(manager, name):
    # a player is available to control
    init_player(name)

def on_player_vanished(manager, player):
    # a player has exited
    print('player has exited: {}'.format(player.props.player_name))

manager.connect('name-appeared', on_name_appeared)
manager.connect('player-vanished', on_player_vanished)

# manage the initial players
for name in manager.props.player_names:
    init_player(name)

main = GLib.MainLoop()
main.run()

I tried to make as few breaking changes to the library, but a few were inevitable. There are also a few deprecations that will affect anyone who based a script of the previous example code. See the library docs for more details.

There are a lot of other little changes, but those are the main ones. Enjoy Playerctl 2.0!

Software Maintenance is Like Golf

I think software maintenance is one of the least understood concepts among engineering managers. By maintenance I mean all the small little tasks developers do to make their code nicer to work with, refactoring, testing, as well as fixing bugs. This part of the development process is difficult to manage for several reasons. For one, developers tend to be quite bad at making a case for why these activities are necessary. Maintenance is essentially a technical task so there is a mismatch in communication between decision makers and developers that is difficult to overcome. Often a developers arguments for refactoring reduce to aesthetic principals of best practices for coding that are difficult to reason about clearly but may still be impactful on the outcome of the project. The decision to spend resources for maintenance activities must involve a degree of trust in developers to spend their time wisely. And my experience tells me that often they don’t, so maybe managers are right to be a little skeptical.

While a little skepticism towards maintenance activities is healthy for managers, I believe this point of view when taken to extremes can cause managers to hold an inaccurate model of how their software is actually being developed. I once had a manager say to me that “refactoring is a bad word” and in my personal experience, tasks related to code quality have been the most difficult to pitch because this bias is so common. In fact, Stripe recently published a paper which identifies these activities as waste with the implication that if developers just wrote their code correctly the first time, we could save 85 billion dollars in lost GDP per year.

While it’s possible to be an effective manager with this simple heuristic, I think there’s a different understanding of the process of developing software that is closer to reality although maybe a bit less intuitive. The uncomfortable fact is that maintenance activities are seemingly unavoidable and sometimes refactoring really is the most impactful thing your developers can be doing for the outcome of the project. While these decisions may rarely make or break the project, having a good understanding of software maintenance is a great way to become a more effective leader and gain respect among your engineers.

The Current Metaphore

Metaphores and language shape the culture of our teams. Just like the ancient Greeks created myths to explain the chaos of the world in human terms, we do the same thing as engineers and managers. The current metaphore of maintenance is understood in the same way as home maintenance. For instance, my kitchen sink clogged up recently so I had to have a plumber come to the house to fix it and he charged me $300. It’s easy to understand this cost as a waste. My sink worked perfectly well, then I called the plumber and the end result is the same as it was before the clog: a working sink. If the plumber would have offered to rearrange the pipes to be more efficient for a cost of an extra $500, I would have respectfully declined.

Depreciation is a real phenominon for code, but it happens for a different reason than plumbing or factory equipment. The reason code requires maintenance is not because it wears out when you use it. Code is just a description of a deterministic logical process. Given the same inputs and state of your hardware, you will get the same sort of result now that you will get 20 years from now. Rather, code depreciates because people change the way they use it over time.

With the home maintenance metaphore, bugs are understood to be clear cut and well defined (we call these regressions: when something used to work and now it doesn’t), but the overwhelming majority of bugs I’ve encountered are not regressions, but rather come from the user using the software in a way that was not expected by the original designer. The user did something that seemed like it should have worked, but then the software did something else entirely. In this case, the line between a bug and a feature is not clear and it’s often not useful to make the distinction at all. For instance, if the user installs my software in an environment I didn’t do any tests for and runs into problems, is it a bug because the software is not working correctly or a feature to add support for the new environment? Whatever we call it, usually it doesn’t affect the discussion so we don’t bother with semantics. Sometimes we just tell the person we don’t want to support their use case and close the bug as wontfix. Windows users should be used to this by now.

Most “maintenance” work is actually feature development in disguise. The cause of most bugs are changes in user expectations. Your developers want to refactor the code base because they are anticipating changes in user expectations and they want to get an early start implementing the features they think you’ll need while they have the problem fresh in their mind. If you understand things this way, you should see that the home maintenance metaphore is limiting for practical decision making about maintenance activities.

The Golf Metaphore

I rather see maintenance not as a wasteful activity, but rather an important part of the development process. To me, a software project is a lot like a round of golf. The object of the game of golf is to get a ball into a hole across a field using clubs with as little effort as possible. Effort is measured by strokes, or how many times you hit the ball.

On the first stroke (the drive), you are very far away from the hole. The objective of this stroke is not necessarily to get the ball in the hole. It could happen, but getting a hole in one is really just a lucky outcome that can’t be attributed to the skill of the golfer. There are a lot of factors the golfer is not thinking about at this point like the exact speed of the wind that may alter the ball’s trajectory by several feet. The best you can hope for is to get close enough to make the rest of your shots easier.

Let’s notice some things about a good drive. First of all, the ball travels about 80% of the distance to the hole during this shot. Second, this shot costs the same as any other: one stroke. If you do a naive calculation of distance per stroke, you will come to the conclusion that your drive is by far your most efficient shot. With this data in hand, a good golf coach might tell his player your drive is your most efficient shot, so just do that every time. Worse yet, if you golf as a team and have a designated driver, you might mistakenly think this is your star player because he moves the ball the farthest. Truly this must be a 10x golfer!

But in reality you can’t drive every time (even if you could, you may never get to the hole). Your next shot requires a different set of skills and even a different set of clubs. The ball will travel much less distance during this shot, but it still counts the same as the shot before: 1 stroke. Finally, you are close enough that it’s time to putt. The putt uses the smallest club and causes the ball to move the shortest distance, but the effort required is the same as the drive.

Software maintenance work is like putting. At this point, small details matter like the contours of the earth and the length of the grass. And while this shot is not nearly as efficient as the drive, it’s the best way to play golf (as long as you don’t make the mistake of doing it too early). Some people even make a game out of just this part: minigolf and it’s pretty fun.

So if you think about a software project like this, you should see that maintenance work is just part of software development just like putting is part of golf. If your code base needs refactoring or you have bugs, it doesn’t mean that anybody made a mistake or wrote “bad code” just like a golfer who doesn’t make a hole in one isn’t a bad golfer. That’s just how golf works.

My Definition of a Hacker

The word hacker can be an charged word that means different things to different people. Usually this word is used among people outside of the software industry to refer to a computer criminal, but there are alternative definitions that predate this that are still in common use today within the industry that don’t have anything to do with criminal behavior (such as the popular forum Hacker News). In this post, I would like to explore the different uses of this word and try to come up with an alternative definition that unifies them all.

Alternative Definitions within the Software Industry

Within the software industry, the term is sometimes used to describe a talented computer programmer who enjoys solving problems with code. The common definition can be used within the industry as well, but this might be seen by some as controversial. In an influential attempt to define the term in 2001, Eric S. Raymond, a legendary self-described “hacker”, made this distinction.

There is another group of people who loudly call themselves hackers, but aren’t. These are people (mainly adolescent males) who get a kick out of breaking into computers and phreaking the phone system. Real hackers call these people ‘crackers’ and want nothing to do with them.

While I agree that the word can and should be used to describe noncriminal behavior, if a colleague calls me in the middle of the night and says to me “hackers just broke into our database”, I’m not going to spend any time correcting his usage. The word cracker seems to have fallen out of style because I’ve never heard anyone use it.

Rather, the word hacker is more often used as a term of endearment among software engineers to denote membership in a particular subculture. Between developers, a hacker is someone who enjoys solving difficult problems in a creative way. Richard Stallman, another legendary hacker, puts it like this:

Hacking means exploring the limits of what is possible, in a spirit of playful cleverness.

Hacker culture values a healthy skepticism of authority and the status quo in search of more effective and more interesting solutions to common problems. You can hear echos of the hacker ethos in the present Silicon Valley ideal of the disruption of established industries. In fact, both Stallman and Raymond were particularly disruptive to the software industry in their time by developing the Free and Open Source Licenses which now form the legal basis for how companies and individuals share common code bases with each other.

There is one more common usage of the word hack in the industry that is not as flattering. It refers to a solution to a problem that is either seen as overly complicated, fragile, or uses a low-level interface to accomplish a high-level task. This usage is somewhat related to the common definition of “a professional who is bad at something” (such as the comedian I saw last night is a hack), but in programming the word is more specific and I think it’s more related to the above two definitions as this article explains well. The word can also be used positively to describe “a clever hack” which is a surprising use of an algorithm or obscure interface to accomplish a task in an unexpected way. This is the sense of the word in the phrase “life hack”.

Putting it all Together

These definitions might seem to be incompatible with each other, but I think they are related in a simple way. Let’s review them and try to find out what they have in common.

  1. Someone who breaks into computer systems
  2. A creative and disruptive programmer
  3. Inappropriate or surprising use of low-level interfaces

I would like to propose a definition that unifies all of these usages.

A hacker is someone who creates an interface where none existed before.

[note: I think this is an original insight, but please let me know in the comments if someone has already put it like this].

Thus, hackers are people who create new interfaces. This definitely describes the pioneers of the early Internet protocols and the creators of the GNU operating system like Richard Stallman who made a Free Unix interface. These people are legendary hackers and they deserve our praise. It also describes people who break into systems and steal data. For instance, there is not an interface on your banking website that lets you take money out of someone else’s account and put it into yours. If you create this interface (hopefully using a surprising low-level approach!) then you are a hacker and you should go to jail.

Let me explain what I mean in an analogy of a house. A common saying is when God closes a door, he opens a window. Entering a house through the window is a hack in all three senses. It is 1) a way to subvert God’s security system of the house by bypassing the lock on the door to gain access, 2) a creative solution to a technical problem which disrupts our notion of home entry, and 3) the abuse of an interface to do something it was not designed to do. I think all hacks have these elements to them. While good hacks are not criminal, there is still a sort of a sense of mischief to them and a spirit of doing things you are not supposed to do. Someone who solves a problem in the way they are “supposed” to do it is definitely not a hacker after all. And of course, to create any new interface requires using a lower level interface by definition.

Don’t get me wrong, I’m not trying to compare the hacker community to criminals. These are two distinct groups of people with very different goals. But maybe these similarities are why the terms got tangled up together in the media.

Command Line vs GUI

An important topic in interface design for end users that is rarely discussed is when to design an application for the command line and when to design it as a GUI. Most computer users have a strong opinion about this. Nontechnical people often seem to hold the belief that the GUI is simply more advanced technology and command line applications are an artifact of a primitive era, while some hardcore adherents to the philosophy of Unix believe that command line applications are superior. I believe that the truth is somewhere in the middle and each paradigm of UI design can be applied appropriately to different use cases. In this blog post, I will give an outline of things to consider when choosing between the two when designing or using an application.

Technical Literacy of the Users

First consider how technically literate your users are. Not everybody who uses a computer is necessarily a computer person and that’s perfectly ok. Technical literacy proceeds much like normal literacy. When people first learn to read, they tend to start off with picture books. The pictures in a picture book give context to what is going on with the story. When words are necessary, the context of the pictures can help the reader to understand words or concepts they are not familiar with. This is a bit like the icons of a GUI which can give some visual context to a command that might not be easy to understand with words alone.

As we learn to read, we tend to prefer novels. Words are capable of expressing certain subtleties that pictures alone cannot. Some complex processes are much better modeled with words than pictures. What we lose with the visual immediacy of the GUI is made up with a greater capacity for complexity. Anyone who has put together furniture with the picture-only instructions from Ikea might know what I mean (or is that just me?).

Takeaway: always use a GUI if your users are nontechnical.

Composability and Automation

The most obvious weakness of a GUI is that they are notoriously difficult to compose and automate. By compose I mean the ability to take the output of one program and use it as the input of another program. Command line applications can be combined and assembled into scripts that can accomplish a task automatically on certain events. This isn’t possible with GUIs because they require user interaction at runtime. Command line applications are like Lego blocks that can be assembled into whatever you need at the time and rearranged when your needs change.

This has a strong influence on the interface design in either paradigm. GUI applications need to have strong and well thought out opinions on how the user will use the application. They live alone at the end of a chain of automation so they must provide everything the user needs to accomplish their task. Thus GUI applications tend to have a much larger scope than command line applications making them more difficult to develop and maintain. This isn’t all bad though because highly opinionated software tends to be easier to use, given you use it for the purpose it is designed for.

Command line applications can be designed with a limited scope to accomplish a general task. This is in line with an important tenant of UNIX design philosophy:

Make each program do one thing well. To do a new job, build afresh rather than complicate old programs by adding new “features”. – Doug McIlroy

This design approach is sometimes called “worse is better”. A design that has less features (“worse”) will be simpler to use and understand (“better”). Less opinionated software has the advantage of being more extensible, that is, able to be used for a purpose that the designers of the software did not anticipate.

Takeaway: prefer the command line when your application might be used as part of a larger workflow.

The Visualness of the Domain

The most obvious weakness of command line applications and text interfaces is that they are not capable of displaying raster or vector data (e.g., images or elevation data) in their most natural visual form. A full-featured web browser or image editor must always be written as a GUI.

In this situation, it’s often a good idea to provide both a command line interface and a GUI. For instance, when I’m doing image processing, I will start with GUI image editing software like Gimp and then when the task becomes more clear, I’ll switch to a command line image editing application like imagemagick which can be automated. Doing a simple task like resizing all the images in a directory is much easier to do on the command line than with a GUI image editor, but determining exactly what is the right size is easier in the GUI.

Takeaway: if your domain is visual, provide both a command line and a GUI application.

Documentation and Discoverability

GUI applications are known for being difficult to formally document. Instructions on how to do things are presented with screen shots often with little circles around the right places to click which the user must repeat themselves to get the right result. Following these instructions for especially complex tasks can start to feel like a scavenger hunt. Seemingly innocuous changes in the UI can cause the documentation to go out of date because the screenshots no longer reflect what the application looks like and updating all the screenshots is tedious. The reality is that documentation for GUIs is rarely helpful and therefore rarely used. Instead, users expect the interface itself to guide them towards their goal. GUIs are great at this because the flow of the application can be laid out visually in a way anyone can understand without reading anything.

This experience isn’t possible with a command line application. Reading (and therefore documentation) is simply required to use the software either by passing a --help flag or reading the installed manual page. However, command line applications can be documented much better because the documentation medium is the same as the command medium (text!). Often times, documentation can simply be evaluated directly by copying and pasting an example command directly from the manual page or your web browser. This makes command line applications easier to support through text based communication like email or instant messaging, because telling someone what to type is easier than telling them where to click when you can’t see their screen.

On a related note, GUIs are better at internationalization. Command line applications and their parameters are almost always in English which not everybody knows, while GUIs can provide an emersive environment in many different languages.

Takeaway: use a GUI if your users don’t like to read.

Portability

The greatest advantage of command line applications is you can run them on systems without a graphical environment. As long as you have a shell on the computer, such as with a remote shell protocol like SSH, you can run the command. This is invaluable when you need to run your application on a remote server or an embedded device. A GUI application can normally only run on a computer with a connected mouse, keyboard, and monitor. Remote desktop protocols exist, but they are heavy on resources and much less reliable than SSH. If your workflow primarily consists of command line applications, it’s possible to work seamlessly on many machines at once which is a big boost to productivity. This is the reason why I prefer to use a lightweight text editor like Vim instead of an IDE.

Takeaway: use the command line if you need to run on servers and embedded devices.

Conclusion

This all might seem obvious, but you’d be surprised at how often I see people choosing the wrong tool for the job. Using both command line applications and GUI applications are an essential part of being a computer power user and knowing when to use each one is an important skill for computer programmers who want to work as efficiently as possible.