This blog is going to be a little different – rather than tell you about all the wonderful ways Replicated is helping software vendors, we’re going to explore what the experience is like for companies that try to build their own DIY software distribution tooling. This outline is based on a SaaS company or traditional on-premises software company that is delivering its app to customer Kubernetes (K8s) environments in the cloud for the first time. Think of it as an alternate history based on a composite of many people’s experiences. We hope you don’t make the same mistakes!
Day 0 - The sales or product team asks engineering simple-sounding questions: “Can we deliver our SaaS application into our customers’ self-hosted Kubernetes environments?” or “Now that we’ve modernized and containerized our application, can we distribute it to customer-managed clusters in the cloud?” Either way, what they are really saying is:
“Our prospects keep asking us to do this, and we’re leaving money on the table every time we say ‘no.’”
Day 1 - “How hard can it be?” The lead engineer spends a couple of weekends hacking on a rough solution, excited to build something new. It seems fairly straightforward to refactor the app to work in any AWS or customer-hosted environment, right? We could use Terraform, maybe.
Day 30 - The field engineers deliver the app to their first customer-hosted K8s cluster running in an AWS Virtual Private Cloud (VPC). The proof-of-concept (POC) installation doesn’t go as smoothly as hoped, but after a couple of escalations to engineering and some patience from the customer, they finally get the app deployed. High fives!
Day 45 - The lead engineer has shipped several updates and changes to the new “on-prem” K8s installer to make it work. A production install is started in a different environment, but it’s not working the same way, and no one is quite sure why. More and more engineering time is being spent on Zoom with the customer, whose frustration is growing steadily. Other modernization, innovation, and backlog work is starting to take priority, and this project is starting to look a lot more complicated than expected. The impact:
The sales team is getting a bit nervous about their account and escalating to management.
Day 60 - The project is no longer fun at all and continues to suck time and people. The Terraform scripts are failing security reviews at some companies. The lead engineer asks their manager to get them off this ASAP because they are burning out. The company doesn’t want to halt the project because product and sales are close to closing this customer. There are a surprising number of on-prem and K8s cluster-based opportunities in the pipeline, and in this economy, the VP of Sales doesn’t want to turn away any revenue. The head of engineering begrudgingly assigns more engineers to work on the on-prem installer project, delaying the schedule for other planned app features and innovations.
Day 180 - A lot has gone on in the last four months. New customers are running the installer, but each one has a slightly different environment and installation requirements. A few examples:
Day 270 - With mixed success, the on-prem K8s install initiative carries on in fits and starts. More issues keep popping up. The install success rate is hovering around 50%: half of the attempted installs end with the customer getting fed up and losing trust. Other customers and prospects keep asking for it, and a number of big accounts are now deployed with it, so it seems impossible to turn back, but the quagmire is getting deeper:
Day 360 - One year in, the engineering team, exasperated and burnt out, holds another all-hands-on-deck meeting to reset and figure out what to do. Everyone dreads doing a rotation on the on-prem installer team; some people actively seek to get off the team. A few veteran engineers sit permanently on the team because they understand that without them, a big source of revenue would be in jeopardy. Engineering and product leadership agree to deemphasize new feature work to give the team up to 50% of their time for three months to invest back into the install tooling. While they’re at it, engineering agrees to spend significant time developing the air gap installer that more and more customers are requesting. The team develops a wishlist of everything they’d want:
Day 390 - The team is making progress, and even the lead engineers who built v1 are engaged again. A few improvements are made, and momentum is building, but there’s still so much to do. The most knowledgeable people are still getting pulled into many support escalations with both currently deployed and new customers.
Day 480 - The three-month sprint has now sprawled out to six months. With half the team still improving the build/test/distribute/support platform for on-prem installs, app feature development is still behind pace. Work on an air gap installer has not even reached the prototype phase. With half the backend team focused on infrastructure-flavored tasks, frontend engineers staffed to work on a SaaS application or other modernization efforts are consistently running out of things to do. What happens next:
Disillusioned and completely burned out, the two engineers who built v1 of the installer and have the deepest knowledge of the project leave to join small startups founded by former colleagues. This sets the team back even further.
Some might read this cautionary tale and conclude that distributing their software to customer-managed on-prem K8s and private cloud environments simply isn’t worth the pain. But 80% of all software spending still goes to applications that aren’t pure SaaS, and most organizations now expect applications to be K8s-friendly. We’re also seeing a growing trend of applications boomeranging back from the cloud for reasons of security, compliance, performance, and cost.
There’s got to be a better way to solve the hard problems outlined above and still increase your addressable market! Ask us how Replicated can help you sell more, install faster, and efficiently support your customers.