The Relicans

Thought Leaders & Machetes: Soloing The Corporate Jungle with Mike Perham

Mandy Moore
Single Mom 👩‍👧 🐶😺😺😺😺 Owner/producer: Greater Than Code 💕 #DevRel 🥑 WiT/D&I 👩🏻‍💻 Podcast Production 🎙 #BlackLivesMatter #python 🐍 she/her
・22 min read

Jonan Scheffler interviews Mike Perham of Contributed Systems about his work on Sidekiq: a framework for building and executing background jobs.

Should you find a burning need to share your thoughts or rants about the show, please spray them at devrel@newrelic.com. While you’re going to all the trouble of shipping us some bytes, please consider taking a moment to let us know what you’d like to hear on the show in the future. Despite the all-caps flaming you will receive in response, please know that we are sincerely interested in your feedback; we aim to appease. Follow us on the Twitters: @PolyglotShow.

Jonan Scheffler: Hello and welcome to Polyglot, proudly brought to you by New Relic's developer relations team, The Relicans. Polyglot is about software design. It's about looking beyond languages to the patterns and methods that we as developers use to do our best work. You can join us every week to hear from developers who have stories to share about what has worked for them and may have some opinions about how best to write quality software. We may not always agree, but we are certainly going to have fun, and we will always do our best to level up together. You can find the show notes for this episode and all of The Relicans podcasts on developer.newrelic.com/podcasts. Thank you so much for joining us. Enjoy the show.

Welcome back to Polyglot. I am joined today by my friend, Mike Perham. How are you, Mike?

Mike: Hi, Jonan. Thanks for having me.

Jonan: It is my pleasure. I'm really excited to have you on the show finally. I've known you for a long time, and I am pretty sure this is the first time we've ever recorded a podcast together.

Mike: Well, we had a Twitch stream the other day.

Jonan: Oh, we did that. That was super fun, actually, the Ruby Galaxy one.

Mike: I’ll be a guest on every single one of your shows on every single platform you're on, eventually.

Jonan: This sounds lovely to me. I love it. [chuckles] I like that I finally get to have all of these different opportunities to just hang out with my friends for a living. It's a nice job to have.

Mike: This is what capitalism has brought to us; every conversation between friends has to be monetized somehow, right?

Jonan: Exactly. And while we're here, go to newrelic.com and click sign up. [laughter] We should get some, I don't know, tattoos and Swag up in the mix here, video podcasting. Actually, I'm very fortunate to have this job that I have, and I get to just have conversations with interesting people about technology. And Mike, I think you certainly fall into that category. Your work on Sidekiq is fascinating to me. Maybe tell people a little bit about Sidekiq and what it is that you do here.

Mike: Sidekiq is a framework for building and executing background jobs. And background jobs is a design pattern for scaling any sort of application processing that you or your business may want to do. So I have an open-source project called Sidekiq. And then I have commercial upgrades, which I offer to businesses in the form of Sidekiq Pro and Sidekiq Enterprise. And that is the gist of my business is maintaining and supporting my projects and my customers.

Jonan: The company itself is called Contributed Systems?

Mike: Contributed Systems, that’s right.

Jonan: I think that your project actually represents one of the more successful open source business models that I've ever seen. For an individual developer, I think that you have done a lot of things right over the years. And I wonder if you would have advice for people starting out. I mean, there are a lot of people hacking on these side projects, and they're starting to build a community around them, and maybe they would like to do that full-time; maybe this is the project that they get to turn into their Sidekiq. But getting started down the path where you ended up seems really hard. Did you start out monetizing Sidekiq from the very beginning?

Mike: I started out as an engineer, a developer like I assume most people listening to this podcast. And just as part of my job, I found myself repeating and building the same infrastructure over and over. So I decided, I said to myself, “Why do I continue to build the same infrastructure over and over?” At this point, I've got enough experience to build something better, a next-generation, if you will, version of the same infrastructure. And so that's really what turned into Sidekiq is it was something that was near and dear to my heart. I thought to myself, if I make this next generation thing that's better than what exists, that's going to be valuable. And so any time you're building value, you've got an opportunity possibly to charge money for that, so that's what I did.

And to your point, you're right; I’m one of probably a handful of people who have really made a multi-million dollar business on my own by just pushing forward with this thing and doing it despite what people may say is the happy path. I hear of a lot of open-source infrastructure companies taking venture capital. And that seems to be the more popular route: you build something, and then you take $10 million in funding and you turn it into a $100 million company. And you get some percentage of that. But I decided that I didn't want to do any of that, that I wasn't really interested in the business side. And I just said, “I want to remain solo,” and so that's what I did. And the business section, as it is, I've designed and run day-to-day solo, and I've designed it to run solo. So I have to keep my policies as efficient as possible; for instance, Sidekiq Pro is credit card only. I don't allow people to invoice for it. Even though a lot of enterprises want to invoice, I say, “No. Credit card. Self-service only.” I'm not going to get involved in the purchase process for something that only costs $1000 a year. And that's an example of maximizing my time.

Jonan: You're not down to go through a vendor onboarding process that involves sending consultants to your house to inspect under the floorboards for hidden microphones.

Mike: Lord, Jonan, I've got some stories I could tell about vendor onboarding.

Jonan: [laughs]

Mike: Yeah, you've nailed it on the head. The vendor onboarding process can be a nightmare.

Jonan: It's awful. Yeah.

Mike: I don't want to call out companies that have terrible process because typically, the people that are talking to me it's just a process that has come down from on high, and there's nothing they can do about it. And I have to follow it if I want them to purchase from me. And typically, I'll say, “No, I'm not interested in jumping through your hoops.” If it's like one form and maybe a dozen questions, then I'll fill that out, and it's done. Typically, vendor onboarding is no more than give me your address, give me a couple of points of contact, give me some basic information about the company. And that's it. And that's not too bad. But I've had people that wanted me to send an official letterhead from the State of Oregon proving that I'm a real company.

Jonan: [laughs]

Mike: And I actually started going down that road and got that paperwork. And when I realized that that was the first step in probably what was a 20-step process, I said, “No, I don't care.” And that was a $5,000 a year sale. And I was like, for $5,000 a year, I'm not going to bother, no.

Jonan: It's a huge amount of overhead to operate at that scale. And I get that there have to be vetting procedures for vendor partners, especially people who do work in the government space. And there's this whole accountability trail where everyone wants someone to point the finger at if something goes wrong. Sticking to credit card only certainly, I think, avoids some of the pitfalls, like the huge time suck there. But it also means that you're not spending your life being miserable, doing the things that you don't want to do with your time. You get to own your business and run the business that you want to do. I imagine your board meetings are terribly boring. It's just you talking to yourself.

Mike: [laughs] And you're exactly right. When I'm a business of one person, I can't afford to spend a week, literally 40 hours, onboarding for one customer. That makes no sense. And if it was a customer that was like a hundred years old, you could understand how a company might collect this enormously terrible process over the years, decades. A lot of times, it's a company that's just a couple of years old. They've accepted and taken on this horrible process. And all I can think is you need to fire your CFO. You need to fire your head of accounting because this is horrifying.

Jonan: I feel like it's this kind of legacy cruft that accumulates on an industry scale where you work at some megacorp and learn the best practices for your role, and then you go on, and you work at a startup, and you bring those best practices from Cisco right on over. And it's like, “Well, of course, we have a 45-day onboarding process. You just have to sign this 25-page Master Service Agreement to get started here. You can't ever use the internet except for our personal benefit.”

Mike: To name one name, one of the companies that was like this was General Motors. So they actually contacted me about an enterprise license. And I can understand with GM that's a hundred-year-old company. They're used to dealing with automotive suppliers and million-dollar contracts, and the process by which to get onboarded and then to get paid through them is just this horrifying arcane maze. But you can understand that when I am the exact opposite of the businesses they’re used to dealing with. [laughter] But that's an example where my business is a little non-traditional, and there are just times where it's just not going to work out very well.

Jonan: But honestly, that's what makes your business so amazing. Sidekiq is an extraordinarily good piece of software, and it has been for a really long time.

Mike: Thank you.

Jonan: I have not seen a project live this long and be this good.

Mike: I've put a lot of time into it, that's for sure. I certainly am paid to maintain it, and I maintain it well. I've put as much polish as I can into it to make it as easy to start to use and to make it as bug-free and as high quality in experience as possible.

Jonan: When you were designing it, you talked a little bit about this from a software perspective; some of the choices that you made were kind of interesting at the time. You made a hard turn towards concurrency. I guess this is something that was common. Other tools in this space like Resque and Delayed Job -- I was pretty early on in my career, so I don't have a ton of insight into what sorts of problems those tools had. But I wonder if you could talk about what were the shortcomings of those projects that you resolved with Sidekiq, just kind of on a high-level software architecture perspective.

Mike: Well, those earlier projects that you mentioned are all designed to be single-threaded. They don't execute more than one job in a process at a time. And it turns out that uses a ton of memory. So Ruby on Rails is not exactly known for being light on memory. So when you boot a Rails process, it could easily take 500 megabytes of memory or a gigabyte of memory. And so, if you need to execute 10 or 20 or 100 jobs per second, because of the amount of work you're doing, you've got to spin up 10 or 20 or 100 processes to execute all these jobs at the same time. And when you've got 100 processes, and each process takes a gigabyte of memory, you're talking about 100 gigabytes of memory. And that's really expensive in the cloud where they charge you for every megabyte, for those EC2 Instances or those Heroku Dynos. They've got memory limits, and you've got to pay for more and more Dynos as you use more and more memory. So one of the tactics that Sidekiq chose was to go down the path of being multithreaded, and that way, you could have multiple threads in a single process, all sharing memory. So it's going to be a little more efficient. So if you want to execute 100 jobs, you can execute maybe 5 or 10 jobs in each process, and that'll save you 5X or 10X on memory, and that can be a substantial saving. I think the first month that I released Sidekiq, a person that migrated from – I think it was Resque to Sidekiq, told me I was saving them $3,000 a month.
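
The trade-off Mike describes can be sketched in a few lines of plain Ruby. This is only an illustration using the standard library, not Sidekiq's implementation; `run_jobs` and `memory_gb` are hypothetical names, and the one-gigabyte-per-process figure is the rough number from the conversation.

```ruby
# A minimal sketch of a multi-threaded job runner (standard library only).
# Hypothetical helper names; this is not how Sidekiq itself is written.

# Run every job (any callable) on a fixed pool of threads inside one process.
def run_jobs(jobs, concurrency:)
  queue = Queue.new
  jobs.each { |job| queue << job }
  queue.close                      # pop returns nil once a closed queue is empty

  results = Queue.new              # Queue is thread-safe, so workers can push freely
  workers = Array.new(concurrency) do
    Thread.new do
      while (job = queue.pop)
        results << job.call
      end
    end
  end
  workers.each(&:join)

  Array.new(results.size) { results.pop }
end

# Back-of-the-envelope memory estimate: one process per group of threads.
def memory_gb(parallel_jobs:, threads_per_process:, gb_per_process: 1.0)
  processes = (parallel_jobs.to_f / threads_per_process).ceil
  processes * gb_per_process
end

squares = run_jobs((1..20).map { |n| -> { n * n } }, concurrency: 5)
puts squares.sort.first(3).inspect                            # [1, 4, 9]
puts memory_gb(parallel_jobs: 100, threads_per_process: 1)    # 100.0
puts memory_gb(parallel_jobs: 100, threads_per_process: 10)   # 10.0
```

With one job per process, 100 parallel jobs need roughly 100 GB under these assumptions; with 10 threads per process, the same workload fits in roughly 10 GB.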

Jonan: Wow.

Mike: And that's just by switching to Sidekiq. And that's when I realized, wow, well, if I can capture 10% of that, that's $300 a month. That's a great annual income from one customer. And now all I have to do is get 20, 50, 100 of those customers. And all of a sudden, I've got a good business that can keep me going for years. And that's exactly what happened.

Jonan: That’s brilliant. If I remember correctly, in Ruby at the time, there was a change with the way threads worked. Like, they shifted to green threads. You certainly know more about this than I do. Help me out here. What am I trying to say?

Mike: So I think the migration you're talking about was Ruby 1.8 to Ruby 1.9. And Ruby 1.8 had what's known as green threads. And those are, let's call them, the fake threads. They're under the covers. It's still just one thread, one actual physical thread that's executing on a core, but it's switching. Green threads are also known as fibers. So when you hear Ruby talk about fibers, that's another way of saying green threads. And that's what it turns out, a fiber. You can think of a green thread as a fiber. So with fibers, you have to context switch yourself. The system doesn't automatically context switch. And with Ruby 1.9, they migrated all Ruby threads to be mapped one-to-one to a physical thread. So a thread in Ruby now can run on a distinct core, separate from another Ruby thread. Now it turns out in practice that that's not really true still because Ruby has what's known as the Global VM Lock (GVL). And so that prevents Ruby from actually taking advantage of multiple cores. But in Ruby today, a thread does actually still map onto a real physical thread that's run by an operating system. A thread in Linux maps to a thread within Ruby.
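
As a small illustration of the difference Mike points to, the sketch below contrasts a fiber, which only switches when the code explicitly yields and resumes, with threads, which the runtime schedules on its own. The method names are hypothetical and the example uses only core Ruby classes.

```ruby
# Fibers: the program decides when to switch, via Fiber.yield and resume.
def fiber_steps
  log = []
  fiber = Fiber.new do
    log << "fiber: step 1"
    Fiber.yield                    # hand control back to the caller explicitly
    log << "fiber: step 2"
  end

  fiber.resume                     # runs the fiber until Fiber.yield
  log << "main: between steps"
  fiber.resume                     # continues after Fiber.yield
  log
end

# Threads: no explicit hand-off; the caller only waits for the results.
def thread_values
  threads = Array.new(3) { |i| Thread.new { i * 2 } }
  threads.map(&:value)             # value joins the thread and returns its block's result
end

puts fiber_steps.inspect    # ["fiber: step 1", "main: between steps", "fiber: step 2"]
puts thread_values.inspect  # [0, 2, 4]
```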

Jonan: And now, with the latest release of Ruby with Ractors and things, we are actually able to take advantage of those multiple cores.

Mike: Yeah. Ractors are separate, or Ractors are distinct from the GVL, so they don't have the GVL anymore. That's not quite true. The GVL continues to linger well past when anybody wants it. It's the friend at the party that just won't leave, unfortunately. So Ractors, theoretically, can be distinct from the GVL, but the GVL there's still locks that prevent Ractors from executing completely, distinctly in parallel from each other.

Jonan: And this is, I think, in part because of Ruby's commitment to backwards compatibility that we could actually rip the GVL out. It would just break everyone's code because people are not used to programming in a way that makes sense for that environment, right?

Mike: The issue with the GVL and with Ractors is that Matz is really strong on backwards compatibility, and part of that is C extension gems. C extension gems are, as the name implies, written in C. And they can basically do anything. Like, they can touch any memory they want anywhere. And because of that, it's a lot more Wild West. And so you have to be really careful with the changes you make. You don't want to break C extensions and possibly allow them to crash the Ruby VM.

Jonan: So we're using threads, which are lighter weight than processes; they're operating system-level threads now. This is Ruby 1.9, and that maps directly to the Ruby concept. And were you using those from the beginning, or did you have to make changes to Sidekiq at the time? Sidekiq was around in 1.8, right?

Mike: I don't think so. I think Sidekiq was really 1.9 only from the start. Sidekiq was started in 2012. And I think by 2012, Ruby 1.9 was pretty mature. But Sidekiq has always been multi-threaded from day one. That's been its big difference from Resque. And in doing so, it became a lot more efficient. And so that's where it got its speed increase and its efficiency from.

Jonan: I've been making a case to people over the years. It's been interesting to watch. My career is much shorter than yours, obviously. Like, I met you actually very early on in my time of software, a little over a decade ago probably. And the things that I have seen change– we initially had these monoliths, and then we started talking about microservices and the value of extracting microservices. And then people started building only microservices all the time. And in the first three months, you're developing a greenfield application. And suddenly, you have 100 distinct Rails applications, all built together doing this thing. And then you come back together into the monolith, and there's this kind of like ebb and flow of architecture choices. And I had been advocating to people at the time that the microservices weren't necessarily a terrible idea, but they needed to be used as part of an event-based architecture. You would be talking more about this event happened, and then that kicked off this event and controlling things across services according to those boundaries. And using a background job processor like Sidekiq is actually incredibly valuable in those situations. I think a lot of people use background jobs for email. Like, okay, well, sending emails takes a long time; therefore, we use a background job. But applying it to a broader pattern of developing an application, I actually think there's rather a lot of work that could and should be loaded into background jobs. Would you have advice for people who are trying to tease apart their applications or speed them up generally as to what other workloads they may not expect that they could be handling that way?

Mike: So I think of a background job as a distinct unit of work that your business or your application may want to do. Sending email is certainly a good, simple example of that, but it really can be anything. The tricky thing with background jobs is because they're asynchronous; how do you build larger workflows out of them? It's tough to say, “Execute these thousand things in parallel, and then go to the next step in this larger workflow process.” I have solutions for that, and there are other solutions. But going back to what you were talking about with microservices, I think of this as a hill that I have particular opinions on, and I will die on it. I think the best architecture is a monolith per team. So if you have a team of four or five developers, you all should be working in a monolith. Now, that doesn't mean that you can't have logically separate services that all use that one monolithic repository and codebase. But I don't necessarily think it's advantageous to call out 20 different microservices that have 20 different codebases if all those 20 microservices are all part of the application for one team. I think that's a losing proposition, and I think you're going to spin your wheels with a lot of maintenance that you wouldn't need to do if you just had a monolithic architecture. As a Rails app shows, you can have a Puma service. You can have a Sidekiq service. You can have any number of other services that are all part of a Rails monolith. So yeah, it's not black or white; there are shades of gray here. You can have a monolithic codebase that has distinct services that run out of that same codebase.
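
The workflow problem Mike mentions ("execute these thousand things in parallel, and then go to the next step") can be sketched in-process as below. This is only an illustration under simplifying assumptions: `run_batch` is a hypothetical helper, everything runs in one process with threads, and a real background-job system would have to track completion across processes and restarts, which this sketch does not attempt.

```ruby
# Hypothetical helper: run all jobs in parallel, then call on_complete exactly once
# with the collected results, which is the "next step" of the larger workflow.
def run_batch(jobs, concurrency: 10, on_complete:)
  queue = Queue.new
  jobs.each_with_index { |job, index| queue << [index, job] }
  queue.close                          # pop returns nil once the queue is drained

  results = Array.new(jobs.size)       # preallocated so each job writes only its own slot
  workers = Array.new(concurrency) do
    Thread.new do
      while (item = queue.pop)
        index, job = item
        results[index] = job.call
      end
    end
  end
  workers.each(&:join)                 # the barrier: nothing continues until all jobs finish

  on_complete.call(results)
end

jobs = (1..1000).map { |n| -> { n * 2 } }
total = run_batch(jobs, on_complete: ->(results) { results.sum })
puts total   # 1001000
```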

Jonan: So you have your foreman Procfile with your Sidekiq process in it and your Puma process in it. One of them serves up web traffic, and the other one handles these background jobs, but that allows you to then scale horizontally with your background processing as your application has more needs. You can spin up more processes to do this Sidekiq piece. They don't even necessarily need to live on the same machine. But if you're building a monolith, then kind of, right?

Mike: Yeah. I mean, if everything is written in the same language, you can certainly have it all use the same repository. You can just have one Procfile that starts up, and it could be half a dozen or a dozen different services. I'm thinking of if you're doing a lot with images and you've got an image resizing proxy or if you've got some sort of NGINX proxy, all of these things can be started in one Procfile. And those count as sort of logically distinct services, but they can all be based on code that's all in the same monolithic repository such that the team can all maintain it together.

Jonan: From your perspective, the value of maintaining a monolith is more that everyone on the team has insight into all of the pieces. You're not digging across multiple repos and projects to go and try and find or debug an issue. All of the things that we've built, all this tooling to help us where you're following a request ID from service to service, is unnecessary if you could see it all.

Mike: Yeah. That's part of it. The other part is that if you're all working with the exact same database, if you're working with one master database, one primary, then you can all use the same models, and you don't have to -- I've seen companies that put all their models in a gem and then they have different projects depend on that gem, and that's how they share models, Ruby model code, which seems kind of weird and maybe there are other good reasons for doing that. But that always seemed a bit much.

Jonan: Yeah, I think that I actually have seen that pattern in a couple of companies where I've worked. I think there were a lot of interesting experiments going on with Rails. Rails has always been a community. Ruby and Rails have always been communities where you see people pushing the boundaries of these design patterns and architectures, I think, a little bit -- not that they're necessarily inventing anything new, but they are certainly stretching the edges a little bit. There was this Hexagonal Rails, this way of building Rails apps called Hexagonal.

Mike: There was the Twelve-Factor App architecture. I'm not sure what the Hexagonal Rails is. That vaguely rings a bell, but I don't recall what it was about. Was that the thing where it was about Trailblazer or Hanami and using service layer? I don't recall.

Jonan: It was about hexagons and viewing the middle as a hexagon where you have this app, and then you have a UI, and then you have an integration hanging off of it, and then you have a database hanging off of it. And the app being a hexagon in the middle and then attaching other modules to it and deploying those things independently. But it led to some patterns like that. I don't know that the thing where we're putting all of our models in a gem is an exact mapping to Hexagonal Architecture. But the Twelve-Factor one is a good example of something -- like, if you go and read the Twelve-Factor App, I think they called it a manifesto of the Twelve-Factor App thing. Now, you're like, well, yeah, 9 out of 10 of these are just obvious. A couple of them are maybe also implemented pretty broadly in the industry. But it's nothing groundbreaking in there anymore. And at the time that it came out, it kind of was.

Mike: That's just it. It's not obvious until all of a sudden, everybody says, “Oh yeah, this is obvious. Everybody should do this.” I think you're right. I think they called it the Twelve-Factor because there were like 12 bullets. I'm assuming that the majority of those bullets have now been accepted as a good idea. And certainly, I think I would agree with that. It was something like -- Lord, I don't want to try and conjecture as to what the various bullet points were. I think one of them was like passing stuff via the environment.

Jonan: Exactly.

Mike: I can get behind a lot of that.

Jonan: Storing config in the environment was a huge change for people at the time. There were a lot of people who were hard coding their passwords into their code.

Mike: Not only that, but you're passing in config via config files, which were found on the file system. Now you have to lay down a file system before that app can be used on the machine. And the whole point of a Twelve-Factor App is if you're passing stuff via the environment, the file system can be completely ephemeral.
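
A minimal sketch of what passing config via the environment looks like in Ruby is below. The variable names (`DATABASE_URL`, `REDIS_URL`, `WORKER_CONCURRENCY`) and the `load_config` helper are assumptions made for illustration; the only real API used is `fetch`, which `ENV` and `Hash` both provide.

```ruby
# Hypothetical helper: build the app's settings from a hash-like environment.
# Defaults to the real process environment (ENV); a plain Hash works for testing.
def load_config(env = ENV)
  {
    # Required setting: fetch raises KeyError if the variable is missing.
    database_url: env.fetch("DATABASE_URL"),
    # Optional settings fall back to a default when the variable is not set.
    redis_url: env.fetch("REDIS_URL", "redis://localhost:6379/0"),
    concurrency: Integer(env.fetch("WORKER_CONCURRENCY", "5"))
  }
end

config = load_config({
  "DATABASE_URL" => "postgres://localhost/app_development",
  "WORKER_CONCURRENCY" => "10"
})
puts config[:concurrency]   # 10
puts config[:redis_url]     # redis://localhost:6379/0
```

Because nothing is read from a config file on disk, the same code can boot on a fresh machine or container with an empty, throwaway file system.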

Jonan: Right. And this is what enabled Heroku to exist; it’s that ephemeral file system, right?

Mike: Bingo.

Jonan: And then container architecture building on that and now containers are just the norm. And you can write to the file system if you want, but it's a temp directory. That whole thing gets blown away in a moment's notice. You can't trust it to still be there. Everyone just accepts that that's the way. And I remember so much frustration when I was first starting at Heroku from customers who were like, “What do you mean there's an ephemeral file system? I just uploaded a four-gigabyte file to my production server. I need it back now.” And I'm like, “No, it's not there anymore because your production server was replaced when it started to struggle. We moved it to another Dyno,” that kind of description of architecture, I think, is pretty rarely seen. But I think that if someone were to design a background job system today, it would seem nonsense not to use a multi-threaded approach.

Mike: I certainly agree with that. Back when I started Sidekiq, it was somewhat heretical because the community basically viewed Ruby as not thread-safe. Almost everybody was terrified that their code was not thread-safe and that gems were not thread-safe, and Rails was not thread-safe. And I realized pretty early on that this was going to be probably the biggest speed bump in adoption: trying to convince people that thread safety was less of a concern than you think, that most of your normal Ruby code is thread-safe. And that if there are issues, they will quickly get resolved. All you have to do is open up an issue on GitHub, and the gem maintainer will see, oh yeah, that is a problem, and they'll fix it. And it turns out that's exactly what happened. It took about a year, but basically, every gem got fixed, with very few exceptions. And now the entire ecosystem is thread-safe, and it's no longer an issue.
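
For readers wondering what a thread-safety fix typically looks like, here is a minimal sketch, assuming the usual culprit is shared mutable state. The `Counter` class and `count_in_parallel` helper are hypothetical; `Mutex#synchronize` is the standard Ruby tool for guarding such state.

```ruby
# Shared mutable state guarded by a Mutex, so concurrent updates cannot interleave.
class Counter
  def initialize
    @count = 0
    @lock = Mutex.new
  end

  def increment
    @lock.synchronize { @count += 1 }   # read-modify-write happens as one guarded step
  end

  def count
    @lock.synchronize { @count }
  end
end

# Hypothetical helper: hammer one counter from several threads and report the total.
def count_in_parallel(threads: 8, increments: 1_000)
  counter = Counter.new
  workers = Array.new(threads) do
    Thread.new { increments.times { counter.increment } }
  end
  workers.each(&:join)
  counter.count
end

puts count_in_parallel   # 8000
```

Code that keeps its state in local variables and per-job objects, rather than in shared globals or class-level variables, usually needs no locking at all.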

Jonan: So you would, I think, definitely describe yourself as a thought leader then. You led the thoughts of thousands and thousands of people.

Mike: [chuckles]

Jonan: We should get you a thought leader t-shirt.

Mike: I blazed a trail.

Jonan: Yeah.

Mike: Well, I guess a thought leader is just someone with a machete who's whacking his way through the jungle and telling people, “I'm pretty sure there's stuff on the other side of this jungle.”

Jonan: [laughs]

Mike: “And I hope I don't lead us off a cliff.”

Jonan: I talk a lot about thought leaders in the marketing spheres in which I operate sometimes. And I'm like, wow, you can't just say that with a straight face, though. You can't just talk about thought leaders and influencers. Like, it's a thing, to be sure; there are people who are ahead of the curve with their thoughts. We could also just call those people community leaders or visionaries. But I will get you that thought leader t-shirt. As soon as we're able to get back in person, I’ll take you out to dinner and bring you a thought leader t-shirt.

Mike: I'll wear it with pride, Jonan.

Jonan: [laughs] Mike, thank you so much for coming on the show. Do you have any tips for people who are starting out? I know that I personally very much aspire to live the dream that you are living where I bootstrap this company, and I am not beholden to a board who gets to tell me what my exit strategy is going to be. What advice would you have for people starting out where you were eight years ago?

Mike: My secret sauce is that I got paid to solve the same problem over and over and over. And so I knew how to solve the problem, and I went out and did it, and then I charged people for my solution. I'm incredibly privileged that I had the nights and weekends to do that because this was open source development. So I did it at night. I did it Friday nights. I was coding. I wasn't going out. I wasn't hanging out with friends necessarily as much. My wife put up with it. And once she saw that there was actual money to be made here, then she realized, oh yeah, well, he wasn't just doing something crazy. But I was privileged enough to have a family situation and a wife situation where I could solve this problem. I wish I had a magic wand I could wave. But yeah, solve a problem that you know how to solve and charge people for the solution.

Jonan: That charge people for the solution part is what I'm working on. I make the same case when I spend my Friday nights soldering Raspberry Pis into Yoda dolls. But my wife somehow thinks this is not a viable business. Can you believe that, though?

Mike: [chuckles]

Jonan: I think it's going to take off. It's going to be the next big thing.

Mike: How could that not be a sure thing, Jonan?

Jonan: [chuckles] I know, right? I, too, am a thought leader.

Mike: Are you going to get Yoda as your stock ticker?

Jonan: I should, absolutely. Yodapi.com.

Mike: [laughs]

Jonan: All right. Well, thank you again, and we'll see you back here the next time I get you on one of our many podcasts or live streams.

Mike: Thank you so much, Jonan.

Jonan: I want to remind you all that New Relic and The Relicans are going to be at our upcoming conference, FutureStack, coming up on May 24th. You can stop by therelicans.com/futurestack and read about it. We would love to have you there. I hope you have a wonderful day.

Thank you so much for joining us. We really appreciate it. You can find the show notes for this episode along with all of the rest of The Relicans podcasts on therelicans.com. In fact, most anything The Relicans get up to online will be on that site. You'll also find news there of FutureStack, our upcoming conference here at New Relic. We would love to have you join us. We'll see you next week. Take care.
