Why I Like Improv Comedy

For about the past year, I’ve been going to a lot of shows at a local improv theater in Atlanta called Village Theatre. If you don’t know, improv comedy is a form of theater where all the scenes are improvised, sort of like the show Whose Line Is It Anyway?. I really enjoy going to shows and hanging out there, so I wrote a little bit about why I like it so much.

It’s Spontaneous

The most important aspect of improv is that it’s all made up on the spot. A show starts with a suggestion from the audience, usually a word or phrase given in response to a question from the improvisers. The suggestion serves as a kind of random seed that gets the performers thinking about the same thing. The show starts off with someone telling a story or starting a scene based on a free association of the suggestion. The spontaneous nature of an improv show is what makes it unique.

There’s just something cool about watching people do things live. You are watching unplanned events unfold before your eyes. There is a sense of danger and a feeling that anything can happen. Each new scene may succeed or fail spectacularly, and you get to see the raw, unedited events as they unfold. This makes being part of the audience a social experience. When everyone is sharing the same moment, it connects everyone together. This is the reason sports are broadcast live. Watching a rebroadcast of the Super Bowl is just not the same experience.

Improv comedy takes this experience to the limit. Normal theater is something like a “rebroadcast” of an initial inspiration. The success of the production depends on recreating the moment that inspired the piece. Doing this takes an enormous amount of effort and skill from the performers, and nobody can do it perfectly. The greatest actors in the world cannot reliably recreate the kind of spontaneous behavior we are all capable of in our everyday lives. You can always tell the difference between a speech given extemporaneously and one read verbatim. When you make something up on the spot, you have perfect delivery almost by definition: you did it exactly how you would do it, and that’s indistinguishable from perfect acting.

This affects the content of the shows too. The fact that everything is made up on stage limits the complexity of what can be performed to whatever the group can hold in its head during the performance. You won’t see a perfect three-act play with a twisting plot and deeply written characters. But interestingly, I’ve found there is just as much variety in improv as there is in normal theater, and maybe more. Without an editing process, ideas can be explored that would normally be cut off early. An off-the-cuff remark may become a full-blown bit or recurring gag. Shows are unique because they only exist in that moment. They’re sort of like what friends do when they joke around, which has infinite possibilities for entertainment.

When you watch an improv show, it’s more than just theater. You are watching something truthful actually happen on stage. You may not get to see a complex plot, costumes, or special effects but those things aren’t really the point of the experience anyway.

Audience Participation

When you think of audience participation, you might think of the performers inviting some people on stage to do a dance or sit in a chair or something. That sort of thing happens in improv too. One improviser named Mark Kendall is pretty famous for this. One time he led the whole audience out of the auditorium to go interrupt another show just for fun. But there’s a deeper level of audience participation that is achieved during these shows.

Improv is an extremely simple art form. There aren’t any sets, costumes, or props. The simpler an art form is, the more of yourself you are required to bring to it as an audience member. You have to bring all of these things to life with your imagination during the performance. This is a really cool thing because it allows for lots of creativity. If the scene takes place in a police station, everyone in the audience will have something different in mind. When scenes get crazy, it’s a lot of fun to puzzle over the insanity of the circumstances.

When worlds can be created and destroyed in the imagination of the audience so quickly, a level of abstraction can be achieved that’s very difficult for other dramatic arts to reach. Scenes can take place in dreams or fantasy settings where the audience is given very little to go on. The performers rely on physical comedy and clowning techniques to create a funny “stage picture” to explore. In one show I went to, a performer announced that the stage was now “the dice world” and the rest of them rolled around the stage like dice. In another scene, the performers formed two lines that spoke to each other as if they were two people in a serious conversation, which gave it a feeling of generality, but the concept had just enough of an element of the ridiculous to be comedic. These ideas were very successful with the audience, but if they were fleshed out and put into a film production, they would completely fail. This sort of theater is only possible within the imagination of the audience.

It’s Local

Another great thing about improv comedy is that the people in the shows are just people from your community. They aren’t big-shot millionaires who live in California. They’re the local theater people, hanging out in the lobby, and you can have a drink with them after the show. When I first started going to improv theaters, this was really shocking to me because I was so used to there being a lot of distance between myself and the people on TV. Knowing something about the performers in a show makes me appreciate the performance a lot more because I can relate to it on a more personal level. The theater is also just a really fun place to hang out. The people I’ve met in the improv community are some of the nicest and most supportive people I’ve encountered in my life, and I like to think some of that energy rubs off on me by spending time there.

What’s more, the audience is local too. The performers and the audience share so many experiences living in Atlanta that a lot of material becomes accessible that would be impossible in a national act. When I go to the improv theater, I get to see scenes about my own local culture and the things that affect the people of the city I live in.

Improv theater just feels like a bunch of people from the neighborhood getting together, expressing themselves, and working through their issues. This is the proper role of art in society and something that’s completely lost in other media like film and television. There’s just something about the whole thing that’s really magical to me.

The Limits of Standardization

Without standards we would all die. Without a standard kind of air to breathe, there would be mass suffocation. If there were no standard song to sing at birthdays, these events would be pure chaos. Nobody wants to live in a world without a standard keyboard layout, so that I can type on your keyboard. So stop setting your layout to Dvorak, Ed. That’s really annoying, and even you say it doesn’t make you type faster. Standards improve our lives by allowing us to make assumptions about how to accomplish certain things, and they ultimately form the basis of our work culture. However, it’s important to realize that standards are not free and should be evaluated in terms of their costs and benefits.

Incentives to Standardize

Standards are not always proposed with the best interests of users in mind. Standards are created by people with their own goals and interests, which might not perfectly align with those of the user, so the greater political context has to be taken into account when evaluating a standard. Some companies create standards to sell or license for profit. These standards are designed to be profitable first, and may consequently be useful, but the profit motive may lead to design decisions that aren’t good for the user. When a company creates a set of standards that are mutually interoperable but not compatible with the wider ecosystem of related standards, we call that a walled garden. Creating walled gardens can be very profitable because once you buy in, it can be very hard to get out.

The most prominent walled garden for consumers today is the Apple ecosystem. Instead of using the standard USB charger, Apple created its own charger standard that can only be used with iPhones. This is a great business move for the company for several reasons. For one, instead of paying to license the existing standard, creating their own allows them to license it to other manufacturers at a large premium. For another, once an iPhone user has bought into Apple hardware, it becomes more expensive for them to switch to a phone that uses competing standards, because they already own a lot of accessories that don’t work with any other type of phone. They are essentially locked into Apple, and the locks can be expensive to break. These sorts of lock-ins are bad for consumers and ultimately for markets because they reduce consumer choice and stifle innovation by entrenching the owners of the walled garden in their market position. This is the price we pay as a society to give companies an incentive to create standards in the first place.

At their worst, standards seem to be made intentionally difficult to implement, although I believe this effect is really a natural consequence of there being no incentive to make the standard easy to follow. For instance, the Microsoft Word document format is a standard that can theoretically be implemented legally by other word processing applications. However, alternative word processors like LibreOffice have a very difficult time actually creating an implementation because the standard is not well suited for public use. There’s no incentive for Microsoft to expend the effort to make their document standard accessible to competitors. It would be more efficient to have a common open standard we can all use, especially considering that Word documents are used in the course of some civic functions.

Using a standard controlled by another entity gives them a lot of power over your project. You should only choose standards where you trust that the incentives of the creators of the standard align with the interests of your business.

The Reduction of Minds

Aside from enabling interoperability, another important function of standards is to reduce the number of minds thinking about a problem. Standards provide readymade abstractions like TCP packets or HTTP headers so you don’t have to think about these things if you just want to run a website for your dog grooming service or whatever. The core internet standards are mostly considered a solved problem, and nobody even thinks about any other way it could be done. That’s a great thing when the standard is good enough and the problem is fairly static, like data transfer. But when the problem is dynamic and the future is unclear, and especially when the process is social, premature standardization can lead to a whole host of problems.
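
To make that concrete, here’s a sketch of that dog grooming website using Node’s standard http module. The point isn’t the code itself but everything it leaves out: TCP handshakes, packet framing, and header parsing are all handled by standard layers below it. (The site itself is hypothetical, of course.)

    const http = require('http');

    // HTTP and TCP are solved problems, so we only write the part
    // that is unique to the dog grooming business.
    http.createServer((req, res) => {
      res.writeHead(200, { 'Content-Type': 'text/plain' });
      res.end("Welcome to Spot's Dog Grooming!");
    }).listen(8080);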

Standardization is essentially a controlled automation of thought. When thought is automated, it reduces the amount of consciousness being applied to a problem. Since changing a standard is difficult, the bar is raised for how much benefit an improvement must offer before it will even be proposed. With premature standardization, you lose the ability for many people to come up with competing incremental improvements, which puts a lot of pressure on the creator of the standard to get it right on the first shot. Most often, that is an impossible task. When incremental improvements are delayed, technical debt builds and can ultimately lead to a breaking point where a different standard is needed. And then you have yet another competing standard.

Sometimes it’s better to have a period at the beginning of a project where there are no standards at all and everyone just figures out the problem for themselves. An explicit lack of standardization in some areas is a defining characteristic of the federal system of the United States government. Certain powers are delegated to the states to allow for experimentation with the rules to find out the best way to do things. When processes are decentralized, we can have disagreements in theory, try out a lot of different approaches, and let the outcome of the experiments be our guide to good policy. Many times, having all those extra minds thinking about a problem is not wasteful at all, but the best thing they could be doing to improve the situation.

The best standards arise organically from a diverse group of organizations each with their own experience solving the problem in their own way. Standards that are created by a committee at a single organization sometimes lack the flexibility to be used outside their original context.

Application to the Workplace

As a manager, you are probably a conscientious and orderly person who likes everything to fit into nice little boxes, and it makes you uncomfortable when things get messy. So to make the world a bit more understandable, you begin to make standards. There’s a standard web framework everyone should use, a standard HTTP server library, a standard number of approvals for code changes, and a standard code coverage target. At the end of the day you have a big pile of standards and you feel like you’ve accomplished a lot of work.

The problem is that most of the time, the decision process for coming up with these standards is simply the intuition of a few people, and sometimes those people get things very wrong. It’s easy to implement a standard on a whim in a meeting when none exists, but changing the standard requires justification, which can be very difficult to provide when the original standard was imposed without much justification at all. Sometimes it’s just a matter of one person’s intuition against another’s, and the person with the original thought may not even work for the company anymore. Team standards are like any other knowledge asset: they need continuous maintenance to justify their existence. Rules accumulate and become technical debt, just like code. Before you know it, you have a ton of crufty old rules that people are blindly following and nobody is getting any work done. This is the number one job complaint I hear from my developer friends.

While it might seem counterintuitive, sometimes chaos is not a problem to be solved, but a normal part of the creative process to be embraced. Sometimes the best thing you can do is just sit back, let everyone fight it out, and see what comes of it. The results may surprise you. In the process you’ll be gaining knowledge of all the things that don’t work, which can be just as useful to know as the things that do. Never forget that you have the great advantage of real human beings solving your problems for you. And while the human way of solving problems in groups can be very messy, this should not be treated as an engineering problem, but simply as a natural limitation of the human experience.

Some Thoughts on Testing

The main problem with testing is that nobody can think of a procedure to decide how to test. Maybe that isn’t even possible. What’s worse is that the feedback cycle from bad tests is very long. It might take a year or more before you discover that an important code path has bad tests, when a use case changes and subtle bugs appear or large swaths of the test suite need to be rewritten. Every developer will have a different set of experiences with tests that will shape their attitudes on the matter simply by random chance. It’s easy to see how opposite extremist viewpoints arise under these conditions. A developer who is bitten early in his career by a bad test suite will naturally be opposed to testing efforts. He will test less, which will reinforce the belief, since he will have less opportunity to be exposed to well-written tests. And if you think about it, that’s a perfectly rational way to react to those experiences, since maintaining a bad test suite can definitely be worse than having no tests at all.

I was very fortunate to have been exposed to an excellent test suite on the first major project I worked on as a new developer in open source. The maintainers always insisted on a test for bug fixes and features and I saw for myself how this practice benefited the project.

What is a Test?

A test is a model that is used to approximate user behavior. When the model fits well, we can conclude from a passing test that the user will experience the result defined by the test assertions under the conditions of the test setup. In this way, tests become a precise definition of the intended application behavior, bridging the gap between how the application actually runs and how it ought to run. The setup shows an ideal of usage, and the assertions show value judgments the user can use to set their expectations of defined application behavior.
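
To make that concrete, here’s a minimal sketch using Node’s built-in assert module. The formatPrice function is hypothetical, not from any real project: the setup models how a user would call it, and the assertion records the value judgment about what they should expect.

    const assert = require('assert');
    const { formatPrice } = require('./format'); // hypothetical module under test

    // Setup: a model of user behavior, calling formatPrice
    // with a raw number, the common path for this API.
    const result = formatPrice(1234.5);

    // Assertion: the result the user is entitled to expect.
    assert.strictEqual(result, '$1,234.50');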

In the broader picture of the engineering process, you can imagine a downward flow of work from requirements, to design, to documentation, to testing, to implementation. Each lower level serves the higher level by adding precision at the expense of the ability to judge value. Tests use the documentation as their source of truth and serve its purpose of defining application behavior, but at a level of specificity that cannot be attained in plain English. However, this specificity comes at the cost of the ability to make more general value judgments, since only specific inputs can be tested.

Tests sit just above the implementation, which is by definition completely specific and objective. Implementation code simply runs how it runs; correct behavior can only be determined in the context of the levels above. It is easier to determine whether application code is correct in the context of tests than in the context of documentation, since tests are written in the same language domain as the implementation.

The Benefits of Testing

Since there is no procedure to decide how to write tests, it’s important for anyone who writes or reviews tests to understand their purpose. If the answer is “because my boss said every pull request must have tests”, then in my experience this nearly always leads to a low quality test suite. Testing may be a legitimate business requirement, and it’s a reasonable ask, but to meaningfully deliver it as a feature, your technical lead must be conscious of the testing strategy and able to articulate good practices to less experienced developers. Only include tests that you can reasonably understand bring value to the project, even if this understanding is only intuitive.

There are two different groups of people who benefit from tests, and tests must benefit both of these groups simultaneously.

Benefits to Developers

Most often it is your developers who will be writing their own tests, so it’s important to get them invested in the process. Testing is unusual in that it is seen as a low-class technical activity, yet at the same time it requires an enormous amount of skill to do correctly. If a developer is resentful about needing to write tests, they will write bad tests with that mindset. To make things easier for them, it’s a good idea to come up with a testing strategy while you determine the approach to implementation. All things being equal, you should always prefer the approach that is more testable. Lack of testability is a valid reason to reject an approach.

The main benefit to the writer of the test is that it codifies their intent in the repository. Writing a test sends a clear message to other developers who may modify the code about what it’s supposed to do, so the writer can be confident others won’t break a feature he is relying on for future work. This is especially important in open source projects, where lots of people are making one-off changes and might not recognize an edge case you are relying on for your feature. This sort of communication is done much more efficiently with tests than with comments.

If your team has a culture where testing is a shared responsibility, the benefits are much clearer. When a developer can expect others to write tests for the features they need, they gain the ability to freely modify the code without being as nervous about breaking another developer’s feature. This frees up mental energy for code quality improvements like large-scale refactoring that simply wouldn’t be possible without feedback to guard against unintended side effects. Ideally, any sensible implementation that passes the test suite should be acceptable, which has the effect of reducing the actual code base to just details. It’s much more fun to commit to a code base where there are fewer consequences for mistakes, and much easier to review as well.

Without this expectation, when a regression occurs, the only possible solution is to “be more careful” which is not nearly as actionable as “write a regression test”.
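
For example, a regression test can be as simple as replaying the exact input that triggered the bug and pinning down the correct result. The parseDuration function and the bug here are made up for illustration:

    const assert = require('assert');
    const { parseDuration } = require('./duration'); // hypothetical module

    // Regression test: durations over an hour used to drop the hours
    // component ('1:02:03' was parsed as if it were '2:03').
    const seconds = parseDuration('1:02:03');
    assert.strictEqual(seconds, 3723); // 1*3600 + 2*60 + 3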

Benefits to Users

As a user, the test suite is a good way to evaluate your use of the application. It contains examples of usage you can compare to your own to understand whether you are on the common path. If your usage is different from the tests, you’ll know you are doing something novel and need to exercise some caution with your implementation. Whenever I see some odd behavior in a library I’m using, the first place I look for an explanation is the test suite. If my path is tested and my results are different, it narrows down the possible reasons for the discrepancy to the environment. Knowing this is helpful when reporting bugs on the project. If the path is not tested, I know I’m doing something with the library the authors may not have intended, and I’m on my own to make sure it works. In that case, I know I need to do some work in the library and then add a test for my use case to make sure it remains supported in the future.

You can also use the test suite as documentation for the project. In some ways it’s better than the actual documentation, because if the tests pass you know you are looking at working code, while the documentation may be out of date. Not nearly enough people know to use the test suites of the projects they depend on this way.

The Costs of Testing

Writing tests is a lot of work, but when done correctly, it’s a force multiplier for developer and user productivity, clearly showing design intent and increasing the stability of the code base. The problem is that the stability you gain is forced, and it takes additional effort to relax assumptions when a use case changes.

Just like with any other code, most of your testing effort will go into maintaining an existing test suite, and this should be considered the primary driver of cost. Maintenance costs vary inversely with the stability of the interface: the more stable an interface is, the less it costs to test, which makes it a better target for tests. It doesn’t make any sense to test scaffolding or proof-of-concept code because you’ll end up paying the cost of removing the tests later.

General Testing Principles

In conclusion, here are some basic principles for deciding how to design your test suite.

  • Only include a test if you can justify its value.
  • Limit your tests to code under your control.
  • Write application code with testability in mind.
  • Write tests to augment the documentation.
  • Do not test undefined application behavior.
  • Write the minimum number of tests you need.
  • Write tests whose failure has a meaningful business reason.

The Great Node Mpris Project

I think one of the things that makes me different from other people is that it really bothers me when things don’t work correctly. I feel a compulsion to fix things when I see that they’re broken. As I’ve written about in the past, it’s not glamorous work to be a bug fixer. You don’t get the same credit as the original author. But it’s still important work to do and I find it oddly satisfying to put things back into their intended order.

The Bug

This project started with a bug on my issue tracker for Playerctl that was submitted two years ago. Media players implement a standard protocol on the Linux Desktop called MPRIS which is used for desktop integration. This allows things like the media keys to work, and the desktop to have widgets that allow you to see what song is playing, adjust the volume, and things like that. Playerctl is a utility people use to make their own desktop media player integrations.

When I built the media players affected by the bug and tested them, I found that the bug was in their code and there was nothing I could do on my side to make this work. That makes things a lot more complicated. It’s much more difficult to understand the inner workings of code you didn’t write. And since these are established projects, I would have to communicate clearly what needed to be done and make the fixes in the least intrusive way possible so people would accept them. There is a whole established etiquette for this within the open source community that needs to be followed in situations like these.

The Broken Library

What the broken media players have in common is that they all have a dependency on a library called mpris-service. I was really lucky here because the owner of the library is someone who I have worked with a lot in the past, Simon (emersion), who is an amazingly talented and responsive open source developer. We met in person about a year ago at a hackathon for Sway.

On his issue tracker, I found all the same issues. Only the very basic features of MPRIS were working and everything else was broken. I was surprised that the library had gotten such wide adoption in the state it was in. Three major media players were using it despite all the bugs, and no progress had been made on the issues for years. I decided to make this my responsibility: to help out a friend with a buggy library he didn’t have time to fix (because he’s busy doing other amazing work), for the users of Playerctl, and to improve the Linux Desktop environment.

The Next Broken Library

But it turned out that the bug wasn’t in Simon’s library either. He was using a library for the underlying protocol of MPRIS (called DBus) which simply wasn’t working correctly. It didn’t have support for the data types that are used in MPRIS. Further, both the implementation and the user interface were very bad, because the library uses platform-specific code written in C++, which makes it less portable across systems. This introduced build errors in the media players, which they got around in various hacky ways with their own forks of the library. This definitely needed to be fixed.

The problem, though, was that the DBus library was just not written in a way that could ever support MPRIS. Also, the owner seems to have abandoned the project and is no longer taking submissions for fixes. It was then I realized why this hadn’t been fixed: it was going to take a lot of work.

There was some discussion about using an alternative DBus library called dbus-native, which had gotten some support from the library users. This path seemed promising because this library was much cleaner than the other DBus library and didn’t require compiling platform-specific C++ code. So I set out to make mpris-service work with this new library.

This didn’t work either. While dbus-native has great internal features, the user interface for creating DBus services did not support some very basic features I needed to implement an MPRIS service, and adding them would require a very extensive rewrite of the top layer of the library.

My Very Own DBus Library

Since I knew this was the only way to get this bug fixed, I did this rewrite and submitted a pull request on the dbus-native project. This pull request remained open for a few months before I realized that it would probably not be merged. This is totally understandable because lots of old code depends on this library that could break with my changes, and reviewing the code is a lot of effort that I couldn’t expect someone to do just to help fix my silly Playerctl bug.

So I decided to fork the library with all my changes and release it as a new library called dbus-next. I also fixed a lot of other bugs and added an integration test suite with very good coverage of the new functionality. So now NodeJS finally has a working DBus library. Great.
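
To give a sense of what a working DBus library enables, here’s roughly what client code looks like with dbus-next. Treat this as a sketch of the proxy-object style the library uses, assuming some MPRIS-capable player (the bus name below is just an example) is running on the session bus:

    const dbus = require('dbus-next');

    async function main() {
      // Connect to the session bus, where desktop services like MPRIS live.
      const bus = dbus.sessionBus();

      // Get a proxy for a running player's MPRIS object (player name assumed).
      const obj = await bus.getProxyObject(
        'org.mpris.MediaPlayer2.vlc', '/org/mpris/MediaPlayer2');
      const player = obj.getInterface('org.mpris.MediaPlayer2.Player');

      // Call a method on the remote service as if it were local.
      await player.PlayPause();
    }

    main().catch(console.error);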

After that work was done, I then rewrote mpris-service to use my new library and everything worked great.

Media player implementations

Now that the mpris-service library works, people are starting on MPRIS implementations in media players written in NodeJS, and I’m doing my best to help out.
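
For context, the player side of the equation looks roughly like this with mpris-service. This is a sketch from memory, so treat the exact option and event names as assumptions rather than a reference:

    const Player = require('mpris-service');

    // Register a player on the session bus so desktop widgets and
    // tools like Playerctl can see and control it.
    const player = Player({
      name: 'nodeplayer', // assumed option: suffix for the bus name
      identity: 'Node.js media player',
      supportedInterfaces: ['player']
    });

    // The desktop sends commands over DBus; the app reacts to them.
    player.on('play', () => console.log('play requested'));
    player.on('pause', () => console.log('pause requested'));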

Now it’s possible for all these media players to support Linux Desktop integration. And when that work is done, I can finally close that Playerctl bug on my issue tracker.

Impressions of AlphaStar

Recently I heard that DeepMind has turned its attention towards making a StarCraft II bot in a similar way to how it made AlphaGo, the bot that recently proved capable of playing the game of Go at a very high level. The SC2 bot turned out to be really good as well. It beat two excellent pro players decisively, winning a series of five games against each. SC2 is a game that is very dear to me. I first picked up the original StarCraft in the late 90s when it came out and have at times played at a fairly high level. A lot has been said about the games, and I’d like to add my perspective on the performance of AlphaStar.

How does AlphaStar work?

I’m not an expert in machine learning, and the details are quite dense, so if you want an actual technical explanation, check out their whitepaper if you’re up for it. Otherwise, enjoy my very naive attempt at understanding how this works.

The bot plays programmatically through a headless interface that’s pretty much like what a human would use. The domain of possible actions is pretty daunting considering how many places on the screen there are to click. However, it probably simplifies a lot at higher levels of reasoning. If the idea is “move the Stalker away from the Marines”, the exact angle at which that happens is probably not very important, and you really only have maybe three or four sensible actions in that case. But it still seems like a pretty difficult technical challenge to overcome.

For the higher level gameplay, they broke the game into a few simple challenges.

  • Mine as many minerals as you can in a certain time
  • Build as many units as you can in a certain time
  • Win a battle against enemy units
  • Find things on the map to kill

These things are essentially what you do while you play the game. They created “agents” that do these things with complicated neural networks with a lot of different parameters to tweak and selected the best ones through a process of training. Then I think they glued these things together and ran them all at the same time and basically got something that plays StarCraft. The mining part mines minerals, the building part builds units, the finding part finds enemies, and the winning part wins the battles. Repeat until you win or lose.

This is a really fascinating way to think about the game. It seems so obvious, but in reality humans are thinking about things in a completely different way. Humans start with very high level plans and then think about execution afterwards sort of like a basketball play. I am going to open with this build order, then try to do a heavy Immortal push, and if I see X then I’ll do Y, etc. How does a person come up with a crazy idea like this? Who knows. Machines can’t seem to think this way though.

Whatever high-level plans you think the machine is pursuing just seem to emerge out of the details. It’s sort of like when you see a V formation of birds in the sky. They don’t all get together and decide to fly in that formation. It’s just the easiest thing to do, because flying like that cuts down on wind resistance, and any bird who doesn’t do it won’t be able to keep up. The machine looks at the details of the situation, estimates the probability that each action leads to a win (calculated all the way to the end of the game), and then picks the action with the best chance of winning. You can really see this at work in the bot’s play style.
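
In code, my naive mental model of that decision loop looks something like the sketch below. Everything in it is illustrative; the real agent’s state, action space, and learned value estimates are vastly more complicated:

    // Naive sketch: pick the action with the best estimated chance of
    // winning. estimateWinProbability stands in for a learned value model.
    function chooseAction(state, actions, estimateWinProbability) {
      let best = null;
      let bestP = -Infinity;
      for (const action of actions) {
        const p = estimateWinProbability(state, action);
        if (p > bestP) {
          bestP = p;
          best = action;
        }
      }
      return best;
    }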

How does AlphaStar play?

With such a different approach to the game, AlphaStar naturally has come up with some completely new strategies for playing. A lot has been made in particular about two of its behaviors.

For one, AlphaStar will almost always overbuild workers in the early game. It builds about 20 to a human’s 16. This is definitely the most practical result I’ve seen come out of the project, because it’s something a human being can easily copy and test to see if it works. It actually makes a lot of sense, because it’s an easy way to counter all the early harass Protoss faces that usually picks off about two to four Probes anyway. I expect this to become a new standard on ladder. It would be interesting to see whether the behavior persists in matchups with less early worker harass pressure.

Another strange thing AlphaStar does is not building a wall at the base entrance, which is considered a best practice among human players. This is difficult to interpret, however. The purpose of the wall is partly to address the very human problem of a slow reaction to Adept harass, but also to block an Adept shade from getting into the base at all by building a Pylon for a full block. It would be understandable for the Pylon block strategy not to emerge quickly, because it takes quite a bit of high-level thinking. So I think people will continue doing this. The machine is, after all, not perfect.

There are some other strategies that AlphaStar notably does not use. For instance, it does not use Sentries to block ramps and it doesn’t drop. These might also be a bit too complicated to emerge from the limited time they trained the agents.

AlphaStar does however have a very entertaining play style. It micromanages its units perfectly in every situation, sometimes even at multiple locations at once. At one point, it executed a perfect three-pronged Stalker ambush in the middle of the map. Each group of Stalkers almost seemed to be controlled by separate players. Much of human play optimizes for the limited attention of the person, but a machine has no such restrictions. Each Stalker can move out of the way of fire at exactly the right moment to avoid destruction. Seeing the game played perfectly was truly amazing.

This point did, however, receive some criticism. If AlphaStar is trying to teach us how StarCraft should properly be played and the answer is “just have perfect mechanics”, then that is not very interesting. It’s sort of like how it’s pretty trivial to create a chess AI that can beat a human opponent who has only 500ms on the clock. On every tick, AlphaStar has the human equivalent of hours of pondering for each small move.

While AlphaStar did put on a very impressive show, I still found the play style to be very cold and mechanical. I didn’t feel what was described by people who watched the AlphaGo games, who thought that agent played in a human-like way. AlphaStar did do some really insane stuff. But it seemed to almost completely ignore the unit composition of its opponent, and most of its decisions seemed to be predicated entirely on the assumption of perfect control. For instance, in the game against TLO, it massed Disruptors, a strategy you could not possibly use without perfect control. It’s almost as if it is playing a completely different game than we are, under an entirely different set of constraints that the game is simply not balanced for. The same isn’t true in a game like Go, which does not reward reaction time.

In fact, a lot of StarCraft II mechanics are specifically designed for the fact that humans do have a limited number of things they can focus on. For instance, Queens do not auto-inject Hatcheries precisely because that would make Zerg imbalanced in the early game. Human-scale focus is baked into many of the mechanics of the game.

What can we learn from this?

My primary takeaway is that machines like this just do random dumb shit until they find something that works. I really like what this company is doing though because I think overall these sorts of things can have a positive impact on our culture. They sort of remind me of Boston Dynamics, a company that seems to be in the business of making random cool things for YouTube videos.

I hope this project can influence game designers to make better competitive games. It puts into focus what machines do well versus what humans do well. I think game designers should take a cue from this to maximize game design for human skills by rewarding high-level thinking and creative problem solving over mechanical mastery. Now that computers are better than us at StarCraft II, the challenge should be to create a game that humans will always be better at. There must be some kind of game like that, right? I’m not sure anymore actually.

At least we know if the Zerg are real and they ever invade Earth, we should have a chance to defeat them now.

Write Drunk Edit Sober

The famous quote write drunk, edit sober is often attributed to Ernest Hemingway. When I first heard it, I thought about how I could incorporate the idea into my own creative process and came to the conclusion that it’s a terrible idea. As a creative professional, I need to come up with creative ideas every day. If I relied on alcohol for my creative process, I would very quickly destroy my health. But something about the idea still rings true to me, so I think it’s worth some time to analyze whether there’s anything we can learn from it, not just for writing, but for any creative activity.

Write Drunk

“I hate to advocate drugs, alcohol, violence, or insanity to anyone, but they’ve always worked for me.” – Hunter S. Thompson

The first part is write drunk. The first thing I think of when I hear this is one of my favorite authors, Hunter S. Thompson, and the book Fear and Loathing in Las Vegas. Thompson’s inspiration for writing often came from altered states of consciousness brought on by drugs and alcohol. Obviously you don’t need to go this far to be a creative person. But there is certainly something about the creative process that makes it feel like a different state of mind than normal waking consciousness. Creativity seems to flourish not by direct effort, but by the suppression of some more rational part of our personality that is responsible for inhibition.

The key insight is that the creative state of mind is sort of like being drunk. If we change this to write as if you were drunk, the advice becomes much more practical. Someone who is drunk tends to be bursting with ideas. Most of the ideas are really bad, but there’s a lot to choose from. The drunk person has so many ideas precisely because he doesn’t care whether they’re bad or not. The alcohol suppresses the critical faculty responsible for their immediate evaluation.

We can mimic the creative part of the drunken state by slowing the feedback loop between idea generation and evaluation. An idea that seems bad may actually lead to some valuable path we would never have discovered if we cut off the line of reasoning too early. For instance, in chess, a common tactic is to make a move that intentionally loses a piece, sometimes even an important one like the queen, to gain a positional advantage that will win the game. If we consider the move and then immediately evaluate it, it will seem insane to intentionally lose our queen, and the opportunity to win will be lost.

Edit Sober

Another great insight in this quote is that the creative process happens in two distinct stages. There is the drunk stage, where you freely come up with ideas without judgment, and the sober stage, where you pick one of the ideas and start to flesh it out. These two stages are completely different contexts, and switching between them incurs some overhead cost. Knowing when to switch is an important part of being creative.

With this process, the skill of creativity is to recognize a good idea through a process of selection. You become sort of like a music critic rummaging through recently released albums trying to find something to recommend to your readers. Sometimes a great piece of music won’t just jump out at you. Some of my favorite albums required multiple listenings for me to appreciate them. Many good ideas will challenge you to find their value. These tend to be the best ideas though, because if it were obvious, then everybody would be doing it already.

This selection process must be done sober. One of the problems with actually getting drunk is that drunk people make really bad decisions when it comes to selecting something to act on. I think we’ve all had that experience.

Applications to Programming

Since my craft is computer programming, I’ve thought about this quote in the context of what I do. Write drunk, edit sober works for writing code too. When you first start on a project, none of the best-practice rules are practical. They just get in the way and slow you down. I make big ugly monoliths, write giant functions, hardcode everything, copy and paste big swaths of code around, and all the other stuff they teach you not to do on the first day on the job. I can write code really quickly and efficiently this way because I’m self-taught: this is exactly what I did for the first few years, and nobody told me any differently. I made some really beautiful disasters like this.

These days, nobody ever sees my code in this state, because only after I’ve gotten something basically working do I clean it up and make it into something pretty. After it works and all the details are in place, cleaning things up becomes really easy. The abstractions are neat and pretty because they were made at the last minute by necessity, not up front because of a guess. None of the cruft survives the editing process. Anything I show to anyone has probably been rewritten three or four times, just by drunkenly iterating through bad ideas and then polishing up whatever is left.

I really wish language designers would take my workflow into account by providing me with tools to support both stages of work. I think we can split up programming languages (and frameworks) into those that are drunk and those that are sober.

Drunk Languages

  • JavaScript
  • Python
  • C
  • Scala

Sober Languages

  • Rust
  • Java
  • Go

The problem with this dichotomy is that it’s very hard to write sober code in a drunk language, and very hard to write drunk code in a sober language. This is my main problem with Rust as it is right now. It’s extremely hard to drunkenly iterate with Rust code because it forces you to deal with a bunch of details you aren’t prepared to think about. In Rust, your code won’t even compile unless it’s guaranteed to be memory safe and free of data races, among other things. Once you actually get your code to compile, it tends to be extremely reliable and safe. But your abstractions are going to be weird, because it’s so painful to try new things without slogging through a bunch of details first.

A language like JavaScript has the opposite problem. Writing drunk code is really easy because there aren’t many rules. But cleaning it up afterwards is very hard because your code doesn’t have much enforceable structure. Anything can be anything at any time, which is very liberating in the first stage because it gives you a lot of flexibility, but becomes infuriating when you realize it’s nearly impossible to finish a big project cleanly. I’ve heard a lot of people complain about this when their Node projects ultimately become unmanageable as the structure gets difficult to reason about.

I really wish someone would make a language that did both of these things well.

So I think the things we can learn from this quote are 1) don’t judge your ideas too early and 2) design your APIs for both stages of the development process.

Happy coding and please drink responsibly.

Balance What You Read, Think, and Do

In the past few years, I’ve learned a lot of new things. The software industry changes very quickly, and I need to stay up to date on the current trends and practices to be effective at my job. To make things more difficult for myself, I’ve made an effort to work in as many different fields of technology as I can. I’ve done frontend, databases, embedded systems, graphics, DevOps, and management in tens of programming languages, and along the way I’ve established myself as a capable generalist problem solver in several domains.

If you need to learn a lot of new things quickly, it pays dividends to be mindful of your learning process. Going from a cold start in a new field can be intimidating and stagnating in a field you already know can be a frustrating experience. To help me overcome these challenges, I’ve developed a system for learning to help me make decisions for how to spend my time in the most effective way possible, which I’ll present here. It’s written with software in mind, but I believe it can be applicable to any kind of learning. This system is a work in progress, so feedback is appreciated in the comments.

The three vital activities for learning are reading, thinking, and doing. These activities should be balanced to create a positive feedback loop. Each activity is a force multiplier for the next. The better you read, the better the quality of your thoughts become. Better thoughts will lead to more effective action. And more effective action will lead to better reading.

Learn for a Purpose

Learning for its own sake is a beautiful endeavor that everyone should engage in from time to time. Everyone should know a little bit about things like history, chemistry, or the classics. You don’t need a system like this for that kind of learning. An important part of this system is having some sort of purpose to work towards. If you want to be a novelist, you should be working towards writing a book. If you want to be a well respected software engineer, you should be working towards creating great software. At the end of the day, we are judged by the results we achieve and not our capabilities.

Don’t take results too seriously though. Learning is a long process and not everything you do will have a direct impact on what will ultimately become your greatest achievements. Plan to throw one away. Take risks and expect to produce some really bad stuff while you are learning. Use these experiences to refine your purpose.

The Three Activities

Now I’ll explain the role of the three activities and most importantly how to find a balance between them.

Reading

When asked about his genius, Isaac Newton famously said “If I have seen further it is by standing on the shoulders of Giants.” How to read well is an art unto itself that deserves another article. In this context, reading has an important relationship with the other two activities.

Any piece of writing on its own is simply some arrangement of symbols on a page. What brings the writing to life is the experience a person has while reading it. Every person who reads a passage in a novel gets a different mental image of the setting and the characters because we bring our own experiences to the scene. If the passage takes place in a port, I’ll imagine it’s like a port I’ve been to. If one of the characters acts like one of my friends, I’ll relate to the character like I relate to my friend.

Reading nonfiction is the same way. If I’m trying to build microservices, I’ll have a completely different experience with a book on microservice architecture if I’ve actually tried to build one. If I’ve thought deeply about microservices, I may find the author has put into words exactly what I was thinking but in a more eloquent way. Use the other two activities to provide context and purpose for your reading. Experience on the subject you are reading about deepens your reading experience so you can spend your reading time more efficiently.

Reading Too Much

To be well read in itself is rarely the purpose we are trying to achieve with learning. Read too much and you risk becoming the stereotypical academic locked away in an ivory tower, disconnected from the real world. The character who comes to mind is Chidi from The Good Place. Chidi is a college professor who is an “expert” on the moral philosophy of Immanuel Kant. Chidi is very well read, but is characteristically averse to making actual moral decisions. Without the context of being a moral person, we see through the course of the show that Chidi has not actually gained any understanding of morality despite being well read on the subject, and he ultimately ends up in Hell because of it.

Reading Too Little

Reading too little is a missed opportunity to learn from the experiences of those who have come before you in the field. As a beginner, I sometimes find myself averse to reading because I believe it will stifle my creativity or I just want the experience of figuring things out for myself for fun. As I learn more and become an expert, I tend to think I already know everything there is to know and my mind closes to new ideas. It’s important to fight these tendencies and keep reading on a subject no matter what your skill level is. Reading too little leads to stagnation of your thoughts.

Thinking

The philosopher René Descartes was famous for locking himself in his room, lying in bed all day, and thinking deeply about things. During these bouts of meditation, he came up with the Cartesian coordinate system that we all learn about in high school, and a lot of other influential ideas in philosophy, like “I think, therefore I am”. Similarly, Immanuel Kant would take very long walks every day during which he would think about the great ideas of his moral philosophy. Socrates even went so far as to say that the unexamined life is not worth living.

The purpose of high quality reading is high quality thinking. High quality thinking leads to high quality actions, like in the examples above.

Thinking Too Much

“A person who thinks all the time has nothing to think about except thoughts. So he loses touch with reality, and lives in a world of illusion.” – Alan Watts

Low quality thinking spins around in a circle. At its worst, it becomes existential despair, like the famous opening line of the play Waiting for Godot: “Nothing to be done.” Low quality thoughts simply lead to more thinking, ad infinitum.

The most important skill to develop with your thinking is knowing when to stop. When you have something that looks like a good idea, it’s time to go to the next step and start implementing it. You don’t need to have all the details worked out in advance. Things will become much clearer once you have at least part of your vision out of your head and in the real world. The only way to have your thoughts build on top of each other is to have something in front of you giving you different things to think about.

Thinking Too Little

The risk of thinking too little is doing the wrong thing. Without taking the time to absorb what you read, you may develop a sense of false confidence where you believe you are an expert in a field you really don’t know much about. No matter how much work you put into creating your work, if you start with a bad idea you will not be successful. In business, this may lead to the very common mistake of creating the wrong product for the market. If you feel like you are just spinning your wheels without really going anywhere, you probably need to spend some more time thinking about what you’re doing.

Doing

“Real artists ship.” – Steve Jobs

The purpose of going through this process is to actually create something valuable and that happens in this stage. Now that you’ve thought about what to do and read enough to know how to do it, it’s time to get to work.

Doing Too Much

It may seem counterintuitive, but spending too much time on action directed towards your goal can be unproductive if you aren’t mindful of what you are doing. When playing the piano, there is a big difference between practice and performance. When I learned how to play the piano, I started with scales and exercises. I found these exercises tedious because what I really wanted to play was Mozart. As I got better and learned a few songs, I found that improving at these scales and exercises was the only way I could improve at playing songs. The key insight I learned from this experience is that you don’t get better at playing Mozart by playing Mozart. If you spend too much time on your performance without practicing, you will learn bad habits and it will be harder to get better.

Software is exactly the same way. You have to practice to get better, and this practice is a completely different sort of exercise from what you will do at your job. When you practice, spend your time poring over the code, rewriting it until you get everything perfect. Go very slowly and strive for perfection, like a pianist who slowly plays a passage of Mozart over and over until it is perfect. Then when it is time to write code under the pressure of a deadline, you’ll get a better result much faster. Both practice and performance are essential for effectively getting the most out of what you do.

Doing Too Little

Without having actual experience, you won’t have context to absorb what you read to the fullest extent. Without getting your thoughts out into the real world, you’ll need to keep so much in your head that there won’t be room for anything else. Taking too little action can cause your learning to stagnate just as much as too little reading or thinking. And at the end of the day, it’s time to start performing and working towards what will become your greatest achievements, because that is after all the point of going through all this work.

Conclusion

This system has come about from years of observing my own process of learning and it seems to work for me pretty well. However, I don’t consider it complete and this is my first time writing it down so let me know what you think. There’s a lot here that I’d like to refine in further articles.

How to Make an Open Source Feature Request

When using an open source project, you may find it lacks some important feature you need to work with it effectively. When you are in this situation, you can either request the feature you need and implement it yourself, or just look for another project to use that has something closer to what you need. While most of the time people pick the second option, requesting and implementing the feature yourself can have a lot of benefits.

  • Maintenance work on the feature can be shared by all of its users
  • Working within the project exposes powerful internals that can give you exactly what you are looking for
  • Implementing the feature can give you deep knowledge of the project that can be shared within your organization

If you always choose to look for another project when it lacks a feature, you are missing out on one of the main benefits of using open source software: the fact that you have access to the source and are able to change it. This is an enormous amount of power to have, and the top technology companies take advantage of it. When time and budget allow, it should be considered as an option for your important dependencies.

Making good feature requests is an essential skill to master to be productive with open source projects. As an open source maintainer, I’ve seen a lot of variation in the quality of feature requests to my projects over the years. Making a good feature request is much more difficult than people realize. It’s part creative, part sales, and part technical. But when you get it right, it’s one of the most rewarding experiences I’ve had as a developer.

It may seem intimidating at first, but it’s much easier when you know the rules. In this article, I’d like to share some things I’ve learned about making good feature requests to help you create contributions that are able to get the attention of maintainers so they can be accepted into the project.

Creating an Issue

Once you have something in mind to work on, the first step is always to make sure you have an issue to work from. The most common beginner mistake I see is starting to code right away. This works for simple bugs, when you fix something that’s obviously wrong, but any nontrivial feature will require some discussion before it’s ready to be implemented. You want to get people involved in the design process as soon as possible. The amount of discussion your proposal generates is a great way to gauge how interested other people are in the feature. Someone who is involved early in the process will have the best understanding of your goals and will provide valuable feedback that will guide your development. Treat anyone who gets involved early in the discussion as a potential user of the feature. Even if they seem adversarial or the feedback is negative, the fact that they took the time to respond at all should be taken as a sign of respect and an indication that a consensus is possible.

If the project is active, chances are someone has already thought of your feature and proposed it on the issue tracker. Spend five to ten minutes searching for your issue with different wordings to see if something similar exists, so you can avoid creating a duplicate. If an issue already exists, read through the discussion carefully; it can save you a lot of time by not repeating a discussion that has already happened or trying an implementation strategy that is known to have problems. Check if someone is already working on the feature. If someone is actively working on it, you can still contribute by adding your opinion to the discussion, testing the active branch, and reviewing the code. However, don’t get discouraged if you see someone who claims to be working on the feature but doesn’t have an active branch. In my experience, about eighty percent of the people who start working on something never actually finish it. Ask if you can pick up the work where they left off, and try to credit them the best you can in your work. The best thing to do is use the commits from their branch directly, but that’s not always possible, so at least give some thanks in your commit message.

Once you have an issue to work from, it’s time to explain what you want to do. A good feature request always has at minimum these three components:

  1. The use case
  2. The approach
  3. The test

The Use Case

Coming to an agreement on the use case is the most important part of the discussion. If the maintainers agree that the use case is important to support, the rest is just implementation details. If the maintainers believe the use case is not valid, then nothing else matters. Don’t try to sell a beautiful approach for an invalid use case. The three most effective ways to justify a use case are to 1) appeal to the project’s mission, 2) appeal to similar features, and 3) demonstrate user demand.

The project’s mission is often best expressed in its description, which usually takes a form like “a library to do X”. Justify your use case by explaining how your feature helps the user accomplish X more effectively with this library. A more detailed statement of the project’s mission is usually included in the project overview or the README. You may even find your feature explicitly requested, or explicitly ruled out, in the documentation. You can use all of these sources of information to support your case.

Additionally, look for features in the project that are similar to yours. This sort of appeal is extremely effective because you get to reuse all of the justification behind the similar feature, which is assumed to be valid by default, or else it would be deprecated. On a related note, keep in mind that this sort of justification is so powerful that maintainers may be wary of accepting even small features that could expand the scope of the project in unpredictable ways. To accept a feature with a certain justification is to implicitly accept all future features proposed within the same scope. If the project doesn’t have the resources to support that whole class of features, that is a good reason for a maintainer to reject the smaller feature, even if it doesn’t add a lot of complexity by itself.

Finally, it is important to demonstrate user demand. Most often the user of the feature will be yourself or another project you are involved with. If you can, demonstrate a concrete use case by pointing to issues on other projects, and explain how adding the feature will help fix them. High demand for a feature means a larger pool of developers who can potentially fix bugs when things break, as well as more influence within the project space.

The Approach

Discussion of the approach should come after everyone has reached a rough consensus on the nature and validity of the use case. Give a general overview of the changes you will need to make to implement the feature in a few sentences. For example, explain what new classes you might need to add or how existing functions need to be modified. The amount of complexity a maintainer will allow in an approach usually relates to the strength of the use case, with a stronger use case justifying more complexity. Changes that break backwards compatibility and require a major version bump need the most justification, so don’t propose these kinds of changes unless it’s clear to everyone why it’s important to do so.

Explaining the approach is an important step because people who know more about the project will often have valuable feedback on what you’ve proposed. What seems simple to you may not be extensible enough to accommodate future planned work, or there may be unwritten conventions to follow so your code fits better with the project’s style. Knowing these things up front will save you a lot of time in code review.

The Test

Finally, you should include a test with your feature request. This doesn’t have to be a formal test, just an example snippet that demonstrates what the important part of the API will do when you are done. Including a short test tends to bring about a good discussion of details you might not have thought of, like error handling. You need to have a test in mind during development anyway, so you might as well post it on the issue. Be honest about any edge cases and bring up any problems you find in the issue as early as possible.
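
For example, if you were proposing a hypothetical --json flag for some command line tool (the tool and the flag here are made up for illustration), the test could just be a transcript of the behavior you expect:

$ sometool list --json
> [{"name": "foo"}, {"name": "bar"}]
$ sometool list --json   # what should an empty list print?
> []

Even a snippet this small tends to surface questions worth settling in the issue, like the empty case above.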

Issue Discussion Etiquette

These sorts of discussions are what give open source projects their reputation for being unfriendly places. Please keep in mind that discussions on open source issue trackers bear very little resemblance to what is commonly called “normal human interaction”, and a different set of rules tends to apply. They can sometimes resemble a game unto themselves, much like poker, with lots of posturing and bluffing. You may even sometimes feel like a lawyer in a courtroom. The most important thing to keep in mind is to always begin your thoughts from a place of respect and not take things personally. The goal of the discussion is always to move forward towards a consensus. If you sense that no progress is being made, do not repeat your points. Wait a few days for others to chime in with a fresh opinion. Be willing to accept that not every idea you propose can reach a consensus in the project, and have a backup plan, such as forking or starting a new project that can accommodate your use case.

Now Start Coding

Congratulations, your feature request was accepted. After all you’ve been through, this might seem like the easy part. It is very rare to have a feature rejected at this phase, but I have regrettably seen it happen. There’s a lot more that can be said after this point, such as how to make a good pull request and how to respond to feedback during code review, but I’ll leave those topics for another post.

Messing Around with JavaScript Decorators

I’ve been looking at the new features in ES6 and boy has JavaScript changed a lot in the last few years. The ECMAScript standards team is really doing a great job making the language more comfortable to work with. My favorite features are block-scoped variables with let, a better syntax for defining a class (I never really understood what a “prototype” was), built-in support for loading and exporting modules, and native support for promises. Proxy objects seem like they could be a powerful tool as well. My first impression is that JavaScript is starting to look like a more liberal version of Python, which isn’t a bad thing.

One of the features I looked for in ES6 and could not find was support for function decorators. Decorators can be a great thing to have in a language. When they fit into an API, they fit in really well, and I often use them in my Python library code. I was surprised this feature didn’t make it into ES6, because decorators are used extensively as part of Angular and React and have native support in TypeScript.

The proposal for decorators can be found in this repository, with the user guide located here. The proposal is currently at stage two, which means it will likely be included in the language in the next major update but is not ready to be used in production code yet. I wrote some sample decorators to test out the new features, which you can find in my notes repository here, and I’d like to explain how they work.

Note: I am not an expert JavaScript, front-end, or Node developer, so if you see anything I’m doing wrong in these examples, please let me know in the comments or in an email.

Building the Project

Unfortunately, there is no native implementation of decorators in Node (currently version 11) or any browser yet, so you must compile your decorator code with a project called Babel. This requires two additional plugins, @babel/plugin-proposal-decorators and @babel/plugin-proposal-class-properties.

npm install --save-dev @babel/cli \
    @babel/core \
    @babel/plugin-proposal-class-properties \
    @babel/plugin-proposal-decorators

Babel must also be configured with some options in a .babelrc file you can find here.
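
The options amount to roughly this (a sketch of what I’d expect for the non-legacy proposal mode; check the linked file for the exact settings, and note that the decorators plugin has to be listed before class-properties):

{
  "plugins": [
    ["@babel/plugin-proposal-decorators", { "decoratorsBeforeExport": true }],
    "@babel/plugin-proposal-class-properties"
  ]
}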

Now you can compile and run your code with Babel like this, and it should work correctly:

babel ./index.js -o index-compiled.js && node ./index-compiled.js

Anatomy of a Decorator

A decorator is basically just a function that receives a descriptor of the target method or property and is able to change its state somehow. Here is an annotated example of a decorator that does nothing:

function decorator(descriptor) {
  // alter the descriptor to change its properties and return it
  descriptor.finisher = function(klass) {
    // now you get the class it was defined on at the end
  }
  return descriptor;
}
 
class Example {
  @decorator
  decorated() {
    return 'foo';
  }
}

The descriptor that is returned from the decorator is an object that you can mutate to change the target method. The important properties of this object are:

  • kind – whether this is a ‘method’, a ‘field’, or something else
  • key – the name of what is being decorated (in this case, ‘decorated’)
  • descriptor – contains the configuration for the property, including the value, which you can hook into with custom behavior
  • finisher – a function to be called after the class is defined, for customizing the class

To have your decorator take parameters, use a function that takes the options and returns a decorator like the one above:

function decorator(options) {
  // options are passed with the decorator method
  return function(descriptor) {
    return descriptor;
  }
}
 
class Example {
  @decorator({foo:'bar'})
  decorated() {
    return 'foo';
  }
}

Example: Log a Warning When Calling a Deprecated Method

function deprecated(descriptor) {
  // save the given function itself and replace it with a wrapped version
  // that logs the warning
  let fn = descriptor.descriptor.value;
  descriptor.descriptor.value = function() {
    console.log('this function is deprecated!');
    return fn.apply(this, arguments);
  }
  return descriptor;
}
 
class Example {
  @deprecated
  oldFunc(val) {
    return 'oldFunc: ' + val;
  }
}
 
let ex = new Example();
ex.oldFunc('foo')
// > this function is deprecated!
// > oldFunc: foo

Example: Make a Property Readonly

function readonly(descriptor) {
  descriptor.descriptor.writable = false;
  return descriptor;
}
 
class Example {
  @readonly
  x = 4;
}
 
let ex = new Example();
ex.x = 8;
ex.x;
// returns 4. note that the failed assignment is silent here; in strict mode
// it would throw a TypeError

Example: Reflect a Class

Given a class, we want to find all the properties that are decorated with a certain decorator. We will use the @property decorator for this. This sort of thing would normally be used for a base class in your API that your user is expected to override.

function property(descriptor) {
  descriptor.finisher = function(klass) {
    klass.properties = klass.properties || [];
    klass.properties.push(descriptor.key);
  };
  return descriptor;
}
 
class Example {
  @property
  someProp = 5;
 
  @property
  anotherProp = 'foo';
 
  static listProperties() {
    return this.properties || [];
  }
}
 
Example.listProperties()
// > [ 'someProp', 'anotherProp']

Conclusion

Decorators open up a lot of possibilities in a language and I am looking forward to their inclusion in JavaScript. I plan to use them in a Node library I am writing right now. However, reflection is a very powerful tool and should not be used without careful consideration. Make sure the decorator pattern actually fits your use case before you decide to use it. Have fun with decorators!

Playerctl at Version 2.0

I’ve spent the last month revisiting an older project of mine called Playerctl. I wrote the first version nearly five years ago over a weekend and have been making small tweaks here and there in my free time since then. The idea was to make a command line application to control media players, so I could bind commands to keyboard key combinations and get media key functionality. I do almost everything on the keyboard, so I found it distracting to reach over to the mouse, find the media player window, and click a button whenever I wanted to pause the player or skip a track on the radio. Playerctl works great for this. These lines have been in my i3 config for years.

bindsym $mod+space exec --no-startup-id playerctl play-pause
bindsym $mod+$mod2+space exec --no-startup-id playerctl next

Another goal of the project was to access track metadata for a “now playing” statusline indicator in i3bar, tmux, or whatever. I did a few implementations of this, but never really made a satisfying statusline (more on that later).

Other people seem to have found it useful too, and to this day it’s the most popular project I’ve published under my name on Github. With users come issues, as people find bugs and limitations in the interface for their use cases. These discussions with users have guided the development of version 2.0 of Playerctl, and I think I’ve addressed everyone’s concerns. The main points that drove development for this version were:

  • Make the command line application easier to use with multiple players running at the same time
  • Make it easier to print metadata and properties in the format you want
  • Make it easier to make a statusline application
  • Make the library more usable

It wasn’t easy to do these things, but in the end, I’m pretty happy with how things came out and I hope everybody enjoys the new features. Version 2.0 was a big effort; it represents nearly a rewrite of v0.6.2 and a doubling of the size of the code base to accommodate the new features. In this post, I’d like to go over the changes, share some of the rationale for the design choices I made, and alert you to some breaking changes in the new version.

Player Selection Overhaul

In the old version, you could select players by passing a name or a comma-separated list of names to the --player flag, but I found this feature rather limiting because the behavior was simply to execute the command on all of the players. That makes it not very usable for the play command, because you almost never want all your players to start playing something at the same time. My thinking is most people have something like a “main music player” which is good at handling large playlists, and then a secondary player they use for movies or other random media on their system. So I changed the default behavior of the --player flag to only execute the command on the first player in the list that is running and supports the command. That way people can pick the priority of the players they want to control, and if the command is not supported (such as if you tell the player to go to the next track but there is no next track), it will skip that player and go to the next one. You can still get the old behavior by combining this flag with the --all-players flag.
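
For example, assuming cmus is your main player and vlc is your secondary one, the selection works like this:

# new default: the command goes to the first listed player that is running
# and supports it
playerctl --player=cmus,vlc next

# old behavior: the command goes to every selected player
playerctl --player=cmus,vlc --all-players pause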

Another requested feature was the ability to ignore players instead of explicitly whitelisting them. This is useful, for instance, to ignore players that are not really media players, like Gwenview. Now you can pass those players with --ignore-player and they will simply be ignored.
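
For example (the exact player name may differ on your system; playerctl --list-all shows the names it sees):

playerctl --ignore-player=gwenview play-pause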

Another detail about player selection is that a player name will now match all instances of that player, which it didn’t do before. So if I pass --player=vlc, all the instances of VLC that are open will be selected.
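
Combined with --all-players, that means a command like this will pause every open instance of VLC:

playerctl --player=vlc --all-players pause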

I think that should cover everybody’s needs as best I can, but there are always going to be edge cases that I won’t be able to address without mind-reading abilities.

Format Strings

Before format strings, the way to get multiple properties was normally with multiple calls to the CLI like this:

artist=$(playerctl metadata artist)
title=$(playerctl metadata title)
echo "${artist} - ${title}"

But that’s obviously not a very elegant solution, and it doesn’t scale very well if you want to print more than a few things. I implemented a few features in this version to address this. One feature is that you can now specify multiple keys, and each key you specify will be output on its own line.

$ playerctl metadata artist title
> Katy Perry
> California Gurls
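
Then you can split on the newline and the values will be in an array in that order; for example, a small bash sketch:

# read each line of the output into an array (bash 4+)
mapfile -t meta < <(playerctl metadata artist title)
echo "${meta[0]} - ${meta[1]}"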

I still wasn’t very happy with this, because it’s not very semantic for what people are trying to do with it. What people said they wanted was for a raw playerctl metadata call to output JSON they could parse. I wasn’t willing to do that because I don’t want to add a dependency just for one feature, and even then, scripts would need something like jq to parse the output. What I did do is make the metadata call output a table (instead of the serialized gvariant from before), which is great for readability, but I wouldn’t recommend parsing it.

vlc   mpris:trackid             '/org/videolan/vlc/playlist/37'
vlc   xesam:title               Synthestitch
vlc   xesam:artist              Garoad
vlc   xesam:album               VA-11 HALL-A - Second Round
vlc   xesam:tracknumber         34
vlc   vlc:time                  281
vlc   mpris:length              281538480
vlc   xesam:contentCreated      2016
vlc   vlc:length                281538
vlc   vlc:publisher             3

So after deliberating for a while, I decided to do the hard thing and write my own template language, based loosely on jinja2 but with a lot fewer features. The parser was a lot of fun to write and I’m happy with how it came out. Now you can do this:

$ playerctl metadata --format '{{artist}} - {{title}}'
> Garoad - Your Love is a Drug

I even put template helpers into the language for additional formatting you may want to do to make the variables more readable. This is the format string I use for my statusline generator right now:

fmt='{{playerName}}: {{artist}} - {{title}} {{duration(position)}}|{{duration(mpris:length)}}'
playerctl metadata --format "$fmt"
> vlc: Garoad - Dawn Approaches 3:28|4:10

The duration() helper converts the position from a time in microseconds to hh:mm:ss format. There are a few other interesting ones documented in the man page.

Follow Mode

If you wanted to use the CLI to make a statusline before, you basically had to poll. And if there’s one thing that everybody hates to do, it’s polling. It would be better to have a tail -f style flag that blocks and automatically updates when things change. This was the hardest feature to add, because a lot of functionality was lacking in the library, and the CLI was designed to be mostly stateless since it was only supposed to be a one-off. There are also a lot of edge cases with players starting, exiting, and changing state which are difficult to get right. I put a lot of care into making sure the most relevant thing is shown on the statusline based on the input. If you have players passed with --player, it will show them in order of player priority based on the last player that changed. If you pass --all-players, it just shows whichever one changed last. I think the latter is what I prefer.

It even works with the --format arg and will tick if you give it a position variable. Here is the grand finale:

fmt='{{playerName}}: {{artist}} - {{title}} {{duration(position)}}|{{duration(mpris:length)}}'
playerctl metadata --all-players --format "$fmt" --follow

My own personal statusline implementation of this can be seen here in i3-dstatus, another one of my neglected projects that will get my attention next.

Library Improvements

I originally had bigger plans for the library, but didn’t end up doing as much with it. I still think it’s really cool though, and I want to keep supporting it. The problem was there was no way to listen for players becoming available or going away, so you basically had to run your script when you knew your player was already running, which is not great. I needed to add this feature for the follow command anyway, so I decided to go ahead and expose it in the form of a new class called the PlayerctlPlayerManager. This is meant to be a singleton which emits events when players start, and keeps an up-to-date list of the player names that are available to control. It can manage the players for you too and alert you when they exit.

Here is an example of the manager in action:

#!/usr/bin/env python3

from gi.repository import Playerctl, GLib

manager = Playerctl.PlayerManager()

def on_play(player, status, manager):
    print('player is playing: {}'.format(player.props.player_name))

def init_player(name):
    # choose if you want to manage the player based on the name
    if name.name in ['vlc', 'cmus']:
        player = Playerctl.Player.new_from_name(name)
        # connect to whatever you want to listen to
        player.connect('playback-status::playing', on_play, manager)
        # add the player to the list of managed players and be notified when it
        # exits
        manager.manage_player(player)

def on_name_appeared(manager, name):
    # a player is available to control
    init_player(name)

def on_player_vanished(manager, player):
    # a player has exited
    print('player has exited: {}'.format(player.props.player_name))

manager.connect('name-appeared', on_name_appeared)
manager.connect('player-vanished', on_player_vanished)

# manage the initial players
for name in manager.props.player_names:
    init_player(name)

main = GLib.MainLoop()
main.run()

I tried to make as few breaking changes to the library as possible, but a few were inevitable. There are also a few deprecations that will affect anyone who based a script on the previous example code. See the library docs for more details.

There are a lot of other little changes, but those are the main ones. Enjoy Playerctl 2.0!