Frog and Toad, willpower, and microservice architecture

Frog and Toad eating cookies. Illustration by Arnold Lobel.

One of my favorite Frog and Toad stories goes like this:

One day, Frog and Toad make cookies together. Later, as they are scarfing down cookie after delicious cookie, Frog suggests that if they don’t stop eating cookies, they will get sick. Toad agrees.

So they put the cookies in a box. But then Toad points out the obvious: they could easily open the box and eat the cookies. Frog responds by tying a string around the box. But then Toad objects that they could still cut the string and open the box. So Frog proceeds to fetch a step ladder and put the box on a high shelf. Toad counters that they might take the cookies back down, cut the string, open the box, and eat the cookies.

They realize they are at an impasse. All of the safeguards against eating more cookies are easily bypassed. Frog announces that what they really need is willpower. After explaining what willpower is to Toad, he takes the box down, cuts the string, opens it up, and offers all of the cookies to some nearby birds, who are happy to gobble them down.

Toad laments that now they have no cookies at all. Frog concedes this point, but points out that they now have lots and lots of willpower. Toad replies that Frog is welcome to keep all the willpower, and goes home to bake a cake.

New bikes and new programming languages

There’s a recurring discussion I see in software that reminds me of this story. In a nutshell, it goes like this: “X is causing us lots of problems. Maybe if we use a system that prevents us from doing X, we’ll stop having these problems.”

This argument shows up in a lot of forms. The Java language was sold to the programming establishment partly on these grounds: “programmers keep writing terrible code in C++. If we switch to a language that strictly limits what they can do, they’ll stop writing such terrible code.”

I’ve heard it used to justify a switch to a functional programming language: “we know that mutable state is the root of many evils. If we prevent mutable state from existing, we won’t have those problems.”

Lately I’ve been messing around in Go, and it contains more than a hint of this kind of thinking: “Programmers are bad at handling failure modes. Maybe if we force them to surround every function call with its own if/then/else, they’ll do a better job of thinking about failure.”
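To make that concrete, here is a minimal sketch of the pattern in Go. The `parsePort` helper and its inputs are hypothetical, invented purely for illustration. The first call checks the returned error the conventional way. The second throws the error away with the blank identifier, and the compiler accepts it without complaint.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
)

// parsePort is a hypothetical helper, not taken from any real codebase.
// It converts a string into a TCP port number and reports failures
// through Go's conventional second return value.
func parsePort(s string) (int, error) {
	n, err := strconv.Atoi(s)
	if err != nil {
		return 0, fmt.Errorf("parse port %q: %w", s, err)
	}
	if n < 1 || n > 65535 {
		return 0, errors.New("port out of range")
	}
	return n, nil
}

func main() {
	// The box, the string, and the high shelf: every call gets its own check.
	if port, err := parsePort("8080"); err != nil {
		fmt.Println("error:", err)
	} else {
		fmt.Println("port:", port)
	}

	// Cutting the string: the blank identifier discards the error,
	// so the zero value flows onward as if nothing had gone wrong.
	port, _ := parsePort("not-a-port")
	fmt.Println("port:", port)
}
```

The language makes the failure path visible at every call site, but whether anyone actually thinks about it is still up to the programmer.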

It’s not strictly an organizational phenomenon. People use this strategy at a personal level all the time. “As soon as I get a new bike, then I’ll get into shape.”

And it often works, too. A shiny new tool and a clean break from the old ways can be a great motivation to get out of a rut.

At least, it often works for a while. For the first couple of months, that new bike gets lots of exercise. But then the weather gets cold, and some spokes get bent, and the owner can never seem to remember to take the bike to the shop. And the weeks pass, and then months.

Not coincidentally, when we’re in this honeymoon period we want to talk all about it. The person with the new bike constantly posts their latest stats to Facebook. The team with a new programming language blogs all about how it has revolutionized their work. Our industry is full of experience reports from three or six months in.

But then the Facebook posts and blog posts start to peter out. For some, this means a new, better normal has settled in. But in other cases, it can mean that things aren’t going so well anymore.

The lure of microservices

Most recently I’ve started to see “microservices” treated as if they are one of these “liberating constraints”. By splitting our apps into dozens of tiny programs, we’ll finally enforce modularity. We’ll stop building big balls of mud.

Of course, those services still depend on each other. We’ve just given them (hopefully) well-defined interfaces. And we’ve pushed their interconnections and dependencies out from a realm where our tools could visualize them as tangled webs, into a realm where our tools can give us no insight whatsoever. We’ve moved our diagnostics from tests into distributed log files. And we’ve almost certainly introduced new implied temporal dependencies that didn’t exist before.

There are definitely teams that are succeeding with microservices. There are also teams that are succeeding with monoliths. Facebook is still one single giant codebase. Every tiny tweak bumps the version of the entire app.

Martin Fowler has been observing and documenting the rise of microservices. When we last had him on Ruby Rogues I took the opportunity to ask him about what he had seen so far. He had this to say:

I’m still a little bit unsure about the whole microservices thing… exactly when you should do it is an interesting question, because there are a lot of benefits from having separate services. You can independently deploy the services. You can independently scale the services. You get an enforced modularity which most programming languages’ environments just don’t give you. You can really solidly say what your interfaces are. And you can make damn sure that you’re never passing mutable data across module boundaries, things like that.

But on the other hand, as soon as you move to a distributed design, distribution is a huge complexity booster. And in order to get distribution to perform effectively, you’re probably going to have to go to an asynchronous call mechanism. And asynchrony is a huge complexity booster. And so, you take on quite a price. And the tradeoff between those two things is really quite significant.

And this also plays into the sacrificial architecture idea, because your first, your early attempt, you probably don’t want to go down the microservices route because you don’t really know what your module structure’s going to look like. You’re still trying to figure out what on earth’s the system you’re trying to build from a user experience and from a functional sense. So, you want to be able to rapidly change things within the structure. And only when things begin to solidify is it a good idea to peel out things into separate services.

But this area is still very new. We’re still really getting only the first indications of what is a good and bad practice. So, the best I can do is listen to the people and try to pick up what I can and distill as much signal as I can from all the stuff I’m hearing.


What’s my motivation?

I don’t think microservice architecture is a bad idea. I’ve started using it a little bit myself, writing a separate service to do some periodic processing instead of adding to an existing application. There was something freeing about not worrying about how to control duplication between the two programs. On the other hand, it took roughly 30 seconds of development before I’d introduced a bug by writing a regular expression slightly differently in one program than it was written in the other.
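Here is a rough sketch of how that kind of drift can look, written in Go. The two patterns and the order-ID format are assumptions made up for this illustration; they are not the actual expressions from my programs. Both patterns are "the same" rule as remembered by two different programs, and they agree on the obvious input while quietly disagreeing on another.

```go
package main

import (
	"fmt"
	"regexp"
)

// Hypothetical: both programs are meant to recognize order IDs
// such as "ORD-1234". Each one carries its own copy of the rule.
var (
	appPattern     = regexp.MustCompile(`^ORD-\d+$`)    // main application: any number of digits
	servicePattern = regexp.MustCompile(`^ORD-\d{4}$`)  // separate service: exactly four digits
)

// disagree reports whether the two programs would treat id differently.
func disagree(id string) bool {
	return appPattern.MatchString(id) != servicePattern.MatchString(id)
}

func main() {
	for _, id := range []string{"ORD-1234", "ORD-12345", "ord-1234"} {
		fmt.Printf("%-10s app=%v service=%v disagree=%v\n",
			id, appPattern.MatchString(id), servicePattern.MatchString(id), disagree(id))
	}
}
```

In a single codebase the fix would be one shared constant. Across two services, it takes a shared library, a contract test, or discipline.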

This article is really just about motivations: why we choose new tools, or new architectures. We know that certain bad habits lead to unmaintainable codebases. And yet we keep falling into those habits. And it’s very seductive to think that when we switch over to that new framework or that new language, we’ll kick all those bad old habits at once and ring in a new era of clean coding.

I think it’s important to reflect on why we are really considering a change to our development stack. Is there a chance we’re just looking for a box and a piece of string and a high shelf to keep us from eating those tasty, tasty cookies?

Perhaps what we really need is some willpower.

7 comments

  1. I agree. But I would. The Ruby world is all about handing very sharp tools to people and saying, “well, usually they’ll do the right thing with them. And when they don’t, I don’t generally need to use what they build.”

    Microservices are being touted by places like Netflix right now. Which makes sense — they’re also a big Java shop, for instance.

    Fundamentally, their model is much more based on putting the closed box on a high shelf with a string around it.

    And for a variety of reasons, it works for them.

    You’re a one-man Ruby shop. Your solution (and mine) doesn’t generally look like theirs.

  2. I couldn’t agree more. I’ve recently been asking myself the same questions, so I thought I’d share my experiences.

    At my current job, I recently started to push toward moving to more of a service oriented architecture. My opinions were partially shaped by the last company I was at, which had easily one of the most cohesive SOA platforms around (protobuf for synchronous RPC client/service communication, plus a RabbitMQ layer for asynchronous observation of events by any one of the services within the platform).

    I decided that if we were to go SOA, the first step was to build my own implementation of the async event observation concept for cross-service observation (a distinction many SOAs don’t make or account for, as they typically stick to one method of communication), as I largely needed the first microservice to be reactive. So I did that, with: https://github.com/jasonayre/active_pubsub

    I hoped I could get away with starting with just the async piece, and building the tools I knew I would need in small pieces. As I built out the first service separated from the main application, I quickly realized that I was going to need an RPC layer: the new service was listening for the creation of new users and creating a local data model when that happened, but later on the main application would need to be able to update that remote data model with a different status. (I could have done that using just the reactive rabbit library I wrote, but it would have been super hacky.)

    So I started to look at the options for the RPC layer. Protobuf was the first consideration. It’s been battle tested by my previous company, I’m very experienced with it, and it’s used on a large production scale there. I was, however, haunted by memories of some of the smartest developers I know being completely stumped by RPC servers that suddenly stopped responding, and having to wake up in the middle of the night to debug when one went down. Not to mention the complexity added by maintaining a shared schema which multiple services use as a single point of truth: if you change something in 2 services, you will need to update the protos, recompile them, roll a new gem version, etc.

    I came to the opinion that protobuf would be way too much overhead to implement for a 2-3 person dev team. There were a large number of smaller pieces of that system that made it as cohesive as it was, and I did not want to have to deal with debugging seemingly magic, zeromq powered RPC servers.

    Sure, zeromq/protobuf allows the communication to be as fast as possible, and the tech is super badass, but I began to question: really, what’s the point? The truth is, HTTP is FAR more reliable, discoverable, scalable (nginx), and therefore IMO better suited for RPC communication (at least for my needs). Is the added complexity really worth the tradeoff of zero discoverability of the RPC services (you can’t just open a browser and hit an endpoint to see what the server’s (not) returning)? Furthermore, if we hire new devs that aren’t familiar with SOA, it’s a hell of a lot easier to explain how a service speaking through http + json works, versus protobuf and cryptic zeromq rpc server processes.

    So, I arrived at the opinion that in general HTTP is a better route. But I was then of course totally disappointed by the lack of anything in Ruby that was well written enough for me to have confidence in using it as a very key piece of a platform. (I wanted a high-level HTTP client geared towards consuming HTTP services, something along the lines of ActiveResource; but if you’ve ever used ActiveResource, or looked at the code, you likely know how bad an idea it would be to try to build atop it.)

    This was fairly recent, and very recently I REALLY began the process of honest introspection, asking myself what problems I would really be solving in the first place. The reason I began pushing the idea of microservices is that I was tasked with building as an application (or putting into the existing monolithic application) a bunch of code which effectively has nothing to do with our product or platform (essentially internal tools).

    Although I think the above makes a very compelling argument for separation into a service, at the same time I’ve come to the realization that the main benefit I sought has nothing to do with performance. My concern is mainly separation of the code itself. (Recently I’ve been thinking about tackling that by building a small, higher-level, even more opinionated framework on top of Rails, which would group mini applications with their own controllers, models, and routes within the application, modeled partially after the Magento ecommerce framework.)

    TL;DR

    What I’ve learned/concluded from all this is:

    1) I would take a well built monolith over a poorly built, or not very cohesive, SOA. Even if the individual applications are built well, if you don’t have the tools to make the thing work together, you will begin adding huge levels of complexity in the individual applications to make them communicate, with one-off calls to services or all sorts of other hacky solutions I’ve seen people resort to, such as a shared database between services.

    2) Before attempting to solve something by separating it into a service, it is of utmost importance to ask yourself honestly what problem you actually intend to solve.

    3) Don’t drink too much of the same flavor of koolaid. Honestly, I’ve been pretty obsessed with SOAs since I first started working with one. Not in an “I thought it was all rainbows and butterflies” way, but in an “I think the concept is fascinating” way. And although I’ve worked on one of the most state of the art platforms around, I hated many things about it, including the complexity of many of the pieces and how difficult it was to push up a simple change in the platform.

    I’ve done a ton of experimenting with (theorycrafting) better ways to eliminate that complexity (i.e. https://github.com/jasonayre/cylons), and while it is an incredible concept, it’s much more incredible on paper than in practice, as it’s nothing I’d trust using in production (at least not yet; really I’m roadblocked by the instability of the DCell libraries it depends on). But the point is, as much as I like the idea of SOAs in terms of separation of responsibilities, only recently have I truly begun to realize for myself what the cost of building and maintaining that separation is, and how it converts into complexity. That is because, to some degree, I drank too much of the “man, having a bunch of separate services that can all talk to each other and react to each other is a super badass scalable way to build web applications” flavored koolaid.

    1. Thanks for this, this is a great comment and I love hearing from people who have deep experience with the topic at hand.

      One very minor note: “drinking the kool-aid” as a phrase has a rather nasty history that troubles some folks when they see it used lightly. I note this because a lot of people aren’t familiar with the origin of the phrase.

      But yeah, thanks again for sharing your experiences.

  3. I was at a conference this year and they talked about microservices a lot. But what was interesting in all these talks was the reason the speakers’ organizations employed microservices: scalability. They didn’t extract functionality from the mothership for the sake of interfaces, new languages, or anything like that. They had to do it because the mothership had grown too large for a single server and was hard to scale. So they extracted a component and scaled it across 10-20 machines and it became faster. Things like that.

    1. This use case makes a lot of sense. Regarding microservices as a way to get better performance at the cost of greater complexity seems like a reasonable eyes-wide-open trade-off to make.

  4. I actually think any discussion of whether one architecture or the other is “better” in some global sense is deeply flawed. The only real “law” that I see having a true long-term effect on the quality of a system is Conway’s Law. If your architecture is at odds with the way your organization works, you have to change your architecture, change your organization, or be prepared to have your toes stepped on.

    With that in mind, I see the Microservices Architecture being a neat solution to an organizational problem: the desire to allow many small teams, possibly geographically distributed, to work together towards a common goal while enabling maximal autonomy. Forcing a distributed organization to work within the confines of a single monolith runs the risk of building up an organizational hierarchy that enforces consistency, deployment and operations processes, and workflows.

    I think this is why it’s so difficult to compare people’s experiences, and why things that work in one place don’t necessarily work in another: everything depends on the context in which the architecture is implemented.

    1. I was wondering when someone would bring up Conway’s law 🙂

      I think this can flow in either direction. Microservices can enable different independent teams to accomplish mostly independent tasks without too much interference. On the flip side, in some cases the desire to have microservices may indicate undesirable fragmentation of the organization.
