When distributing a Helm chart to hundreds of enterprise customers, ISVs need to ensure compatibility with a range of Kubernetes versions, distributions, configurations, add-ons, and entitlements. The ultimate goal is reliable installation, upgrades, and operations of your applications in K8s environments that are representative of what your customers are actually running. Getting this right improves key metrics like time-to-live and install success rate, reduces support calls, and ultimately lifts CSAT and NPS.
But there are some inherent complexities that tend to trip up ISVs when testing for compatibility, reliability, and stability. Problems often surface only when certain combinations of components are running together in a particular customer environment. For example:
Unfortunately, it's not reasonable to expect that an application tested on GKE will run properly in every configuration of Tanzu. Successful ISVs implement a comprehensive testing strategy that combines a matrix of variables to validate each update against a spectrum of possible environments. The most successful go further and mirror customer environments as closely as possible, testing in customer-representative environments before releasing to those customers.
There are 4 dimensions to consider when planning compatibility testing for your K8s app.
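To make the matrix concrete, here is a minimal sketch in Python, assuming four illustrative dimensions drawn from the factors mentioned earlier (K8s version, distribution, add-ons, and entitlements); the specific versions and distributions are placeholders, not a recommended set.

```python
# A minimal sketch of a compatibility matrix. The dimensions and values below
# are illustrative assumptions, not a recommended test set.
from itertools import product

K8S_VERSIONS = ["1.27", "1.28", "1.29"]
DISTRIBUTIONS = ["gke", "eks", "openshift", "tanzu"]
ADD_ONS = ["istio", "none"]
ENTITLEMENTS = ["airgap", "online"]

# Every combination of the four dimensions.
matrix = list(product(K8S_VERSIONS, DISTRIBUTIONS, ADD_ONS, ENTITLEMENTS))
print(f"{len(matrix)} combinations to consider")  # 3 * 4 * 2 * 2 = 48
```

Even with just a handful of values per dimension, the full matrix runs to dozens of environments, and every new value multiplies it again.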
This complexity can quickly become unwieldy, and managing it requires a deliberate strategy for compatibility testing. The first challenge is simply making sure these versions are available to your team on demand for testing. From there, ISVs must determine the right level of testing at different points in the SDLC. This might include:
We'll cover what should go into each level of compatibility testing in another blog soon. The number of possible combinations grows exponentially, but it's usually possible to make educated guesses about common patterns and to validate according to the expected sensitivities of each combination at each level of testing.
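Continuing the sketch above (reusing the same `matrix`), one way to express those educated guesses is a small heuristic that assigns each combination a test level. What counts as "common" or "sensitive" here is purely an assumption for illustration; in practice it would come from your support history and customer telemetry.

```python
# Hedged sketch: run a quick smoke test everywhere, reserve deeper suites for
# combinations that are common among customers or historically sensitive.
# The heuristics are placeholders, not guidance on what to prioritize.
def test_level(k8s_version, distribution, add_on, entitlement):
    common = distribution in {"gke", "eks"} and k8s_version in {"1.28", "1.29"}  # assumed common pattern
    sensitive = entitlement == "airgap" or add_on == "istio"                     # assumed sensitive areas
    if common and sensitive:
        return "full-regression"
    if common or sensitive:
        return "integration"
    return "smoke"

# Map each combination in the matrix to the level of testing it should get.
plan = {combo: test_level(*combo) for combo in matrix}
```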
Of course, deciding which combinations to test at which levels doesn't solve everything. You still need to provision an environment in which to actually run the tests. Today this is often a manual, labor-intensive process spanning cloud services, K8s distros, add-ons, platforms, projects, and resources. We often hear vendors say it becomes too time-consuming to manage, particularly with a high-frequency release cadence: monthly, weekly, daily, or even more often. While there are test automation suites and tools for the code itself, there isn't always an easy answer for provisioning the variety of environments needed.
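To show what on-demand provisioning could look like when driven from that plan, here is a hypothetical sketch. The `provision-cluster` and `run-chart-tests` commands are stand-ins for whatever tooling you use (cloud CLIs, Terraform, or a managed service like the Compatibility Matrix mentioned below); they are not real CLIs, and their flags are assumptions.

```python
# Hypothetical sketch: spin up an ephemeral cluster per planned combination and
# run the chart's tests against it. The commands and flags are stand-ins.
import subprocess

def run_compatibility_tests(plan):
    # `plan` maps (k8s_version, distribution, add_on, entitlement) -> test level,
    # as in the sketch above.
    for (k8s_version, distribution, add_on, entitlement), level in plan.items():
        # Stand-in: create an ephemeral cluster and capture its kubeconfig path.
        kubeconfig = subprocess.run(
            ["provision-cluster", "--distribution", distribution,
             "--version", k8s_version, "--add-on", add_on],
            check=True, capture_output=True, text=True,
        ).stdout.strip()
        # Stand-in: run the chart's test suite at the chosen level against that cluster.
        subprocess.run(
            ["run-chart-tests", "--level", level,
             "--entitlement", entitlement, "--kubeconfig", kubeconfig],
            check=True,
        )
```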
We've been putting a lot of thought into how to solve this problem. If you'd like to hear more, sign up now to join our waitlist.
You can also check out the Compatibility Matrix docs or watch our short Compatibility Matrix intro video from RepliCon Q2.