Today, we’re excited to announce new and improved ways to export and explore data about your customers and their instances.
We’ve heard from customers that they want instance data in the hands of their analysts so that they can combine it with other data, such as CRM records, and build custom analyses and reports. We are now introducing three ways to do that.
In working with the 100+ software vendors who use Replicated to deliver their customer-hosted software, we’ve found that teams working at scale need good data to make the right decisions. While exposing this data in the Vendor Portal via several collection and reporting features was a great start, we found that this wasn’t enough once teams reached 20-30 customers and beyond. These teams use data about their customers’ instances and usage to drive decisions at the product, sales, and strategic levels, and we found that many software vendors wanted to consume this information from more centralized places than the Vendor Portal:
At the core, we wanted to deliver features that would allow analysts, analytics engineers, data engineers, and others to make the most of this data. With new options for CSV Instances Export, JSON Instances Export, and Bulk Event Export, vendors can now review data in the Vendor Portal or export it via APIs or CSVs into any other system. We hope that in making this usage data available, we’ll enable your team to make better decisions about where to focus efforts across product, sales, engineering, and customer success.
While some teams have been using the existing export methods for years, we wanted to address several shortcomings in the current methods:
The new report addresses both of these by adding a number of useful columns and delivering one row per instance, so that you and your team can decide if and how you want to aggregate data for a customer.
Once you have the CSV, you can process it and/or move it into your tool of choice. The example below uses Google Sheets:
Some notable new data points here are:
For a full list of columns with data definitions, see: Export Customer and Instance Data.
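For teams that would rather script than use a spreadsheet, the one-row-per-instance layout also makes per-customer aggregation straightforward. The following is a minimal sketch using only the Python standard library. The column names (`customer_name`, `instance_id`, `app_version`) and the sample rows are assumptions made for illustration, not the actual export schema, so check the header row of your own export before adapting it.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample data. The real export has different (and more) columns.
SAMPLE_CSV = """customer_name,instance_id,app_version
Acme,inst-1,1.2.0
Acme,inst-2,1.3.0
Globex,inst-3,1.3.0
"""


def instances_per_customer(csv_text):
    """Group instance rows by customer, returning {customer: [row, ...]}."""
    grouped = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        grouped[row["customer_name"]].append(row)
    return dict(grouped)


def summarize(grouped):
    """Count instances and collect the distinct versions for each customer."""
    return {
        customer: {
            "instance_count": len(rows),
            # Sorted as plain strings, not as semantic versions.
            "versions": sorted({r["app_version"] for r in rows}),
        }
        for customer, rows in grouped.items()
    }


if __name__ == "__main__":
    summary = summarize(instances_per_customer(SAMPLE_CSV))
    for customer, stats in summary.items():
        print(customer, stats)
```

Because each row is a single instance, the choice of aggregation (counts, distinct versions, or none at all) stays with whoever consumes the file.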
While CSVs provide a simple standard for export, we also know that for some teams, CSV management can be complex. JSON APIs offer benefits like typed data and OpenAPI schemas, and they generally tend to be easier to consume and parse. This is true for both simple scripts and workflow orchestrators like Airflow or Meltano.
We identified multiple issues with the existing JSON methods for exporting data:
To that end, we’re publishing a new endpoint for exporting instance data as JSON. You can see an example request/response below.
JSON export is further documented at Export Customer and Instance Data in our docs site.
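To illustrate the typed-data benefit mentioned above, here is a small sketch of loading a JSON export into typed Python objects. The payload shape, the `instances` key, and every field name are hypothetical and exist only for this example. The documented response format is the one described in Export Customer and Instance Data.

```python
import json
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical payload, invented for illustration. Not the real response shape.
SAMPLE_RESPONSE = """
{
  "instances": [
    {"instance_id": "inst-1", "customer_name": "Acme",
     "app_version": "1.2.0", "is_active": true},
    {"instance_id": "inst-2", "customer_name": "Globex",
     "app_version": null, "is_active": false}
  ]
}
"""


@dataclass
class Instance:
    instance_id: str
    customer_name: str
    app_version: Optional[str]  # may be null if no version has been reported
    is_active: bool


def parse_instances(body: str) -> List[Instance]:
    """Convert a JSON response body into a list of typed Instance records."""
    payload = json.loads(body)
    return [
        Instance(
            instance_id=item["instance_id"],
            customer_name=item["customer_name"],
            app_version=item.get("app_version"),
            is_active=bool(item.get("is_active", False)),
        )
        for item in payload.get("instances", [])
    ]


if __name__ == "__main__":
    for instance in parse_instances(SAMPLE_RESPONSE):
        print(instance)
```

Unlike CSV, booleans and nulls arrive already typed, so no string-to-value conversion step is needed before analysis.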
Knowing the state of every instance is valuable, but we find that analysts and analytics engineers are also trying to answer a lot of questions that revolve around knowing the history of an instance. For example:
By analyzing historical time-series data, vendors can:
While some of these time-series style views could be “hacked” by querying the current state of all instances regularly and snapshotting the changes, we wanted to provide first-class support for understanding the history of instance upgrades.
To that end, we’re publishing a new endpoint that allows for fetching events for all instances and customers.
From this data, any historical or time-series view can be constructed. The data can be filtered by date range, event type, and more.
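As a rough sketch of how a historical view might be rebuilt from events, the example below filters a list of event records by type and date range, then replays version-change events in time order to recover each instance’s version at a given moment. The event fields (`instance_id`, `event_type`, `new_value`, `reported_at`) and the event type names are assumptions made for illustration, not the real event schema.

```python
from collections import Counter
from datetime import datetime, timezone

# Hypothetical event records. Field names and event types are illustrative only.
SAMPLE_EVENTS = [
    {"instance_id": "inst-1", "event_type": "app_version_changed",
     "new_value": "1.2.0", "reported_at": "2024-01-05T10:00:00+00:00"},
    {"instance_id": "inst-2", "event_type": "app_version_changed",
     "new_value": "1.2.0", "reported_at": "2024-01-10T09:30:00+00:00"},
    {"instance_id": "inst-1", "event_type": "app_version_changed",
     "new_value": "1.3.0", "reported_at": "2024-02-01T14:00:00+00:00"},
    {"instance_id": "inst-2", "event_type": "k8s_version_changed",
     "new_value": "1.29", "reported_at": "2024-02-03T08:00:00+00:00"},
]


def filter_events(events, event_type=None, start=None, end=None):
    """Keep events matching the type and falling within [start, end)."""
    selected = []
    for event in events:
        ts = datetime.fromisoformat(event["reported_at"])
        if event_type is not None and event["event_type"] != event_type:
            continue
        if start is not None and ts < start:
            continue
        if end is not None and ts >= end:
            continue
        selected.append(event)
    return selected


def versions_as_of(events, as_of):
    """Replay version changes in time order to get each instance's version."""
    changes = filter_events(events, event_type="app_version_changed", end=as_of)
    changes.sort(key=lambda e: datetime.fromisoformat(e["reported_at"]))
    state = {}
    for event in changes:
        state[event["instance_id"]] = event["new_value"]  # later events win
    return state


if __name__ == "__main__":
    as_of = datetime(2024, 3, 1, tzinfo=timezone.utc)
    # Count how many instances were on each version at that point in time.
    print(Counter(versions_as_of(SAMPLE_EVENTS, as_of).values()))
```

Calling `versions_as_of` for a series of dates yields a version adoption curve without needing to snapshot the current state on a schedule.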
To demonstrate this functionality and what’s possible with it, we’ve open-sourced a few Example Analytics Notebooks showing some analyses that can be done with the Replicated data, including a Kaplan-Meier analysis of time-to-install, an example of using Meltano to load CSV data into a SQL database and query it back out, and some time-series analysis of Kubernetes Version Adoption. We hope this serves as inspiration and guidance as you explore these features!
We’re looking for feedback on this functionality. If you’d like to be a design partner, please log a feature request. While API and CSV exports are available today, we’ll continue improving the integration points that enable you to move this data into the systems where you need it. We’re also actively developing functionality for enabling telemetry collection from air-gapped environments, and will aim to include Data Export in that work. If you’d like to be an alpha tester for air-gapped telemetry, let us know.
Want to learn more about what Replicated does to help vendors distribute software to self-hosted environments? We would love to show you. Click here to schedule a demo.