Crafting a Golden Baseline for Analytics Testing

Trackingplan
Alexandros Andre Chaaraoui
4/3/2024

In a previous article, we discussed how shifting your analytics testing left can help prevent data losses and spot data inconsistencies before they reach production. Essentially, we claimed that analytics testing can and should stand proudly as a first-class citizen in your QA processes, instead of being forgotten in the product development process until a data bug is already causing harm, impacting your data collection and business reports, and possibly calling into question any decision taken based on your data.

That is why, in this article, we are going to delve deeper into the process of crafting a golden baseline for testing your analytics, an approach that not only mitigates the risk of poor data quality and data losses, but also ensures the reliability and accuracy of our analytics reports.


Analytics Testing Demystified: Drawing Parallels to Code Testing

Just as with traditional code, analytics testing applies both unit and integration testing methodologies to validate the functionality and integrity of its building blocks. These building blocks are often explicit invocations in the code, such as track(), identify(), or group(), each an essential component of the analytics tracking process that can be tested in isolation. In other cases they are automated through scripts, pixels, or Google Tag Manager, which makes testing more complicated.
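To make this concrete, here is a minimal sketch of those explicit calls, assuming a Segment-style analytics.js client; the write key, user IDs, event names, and properties below are placeholders, not a real tracking plan:

```typescript
// Illustrative only: assumes Segment's analytics-next client; identifiers
// and properties are placeholders.
import { AnalyticsBrowser } from '@segment/analytics-next';

const analytics = AnalyticsBrowser.load({ writeKey: 'YOUR_WRITE_KEY' });

// identify(): associate subsequent events with a known user
analytics.identify('user_123', { plan: 'pro' });

// group(): attach the user to an account or organization
analytics.group('acme_corp', { industry: 'retail' });

// track(): record an explicit user action together with its properties
analytics.track('Checkout Completed', {
  orderId: 'ord_42',
  revenue: 99.9,
  currency: 'EUR',
});
```

Each of these calls is a small, observable unit: given the same user action, it should fire on the expected page with the expected name and properties, which is exactly what a unit-level analytics test can assert.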

Moreover, as in code testing, we typically treat external dependencies, such as third-party analytics libraries, as invariant elements that are assumed to behave consistently across different testing scenarios, and are therefore simply ignored or mocked.
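As an illustration, a unit test can mock the analytics client and assert only on what our own code would send; the checkout module, file paths, and event names below are hypothetical:

```typescript
// Illustrative only: hypothetical checkout module and analytics wrapper.
// The third-party analytics client is mocked, so the test verifies what we
// would send without depending on the vendor's library or the network.
import { completeCheckout } from './checkout';
import { analytics } from './analytics';

jest.mock('./analytics', () => ({
  analytics: { track: jest.fn() },
}));

test('checkout emits the expected analytics event', async () => {
  await completeCheckout({ orderId: 'ord_42', revenue: 99.9, currency: 'EUR' });

  expect(analytics.track).toHaveBeenCalledWith('Checkout Completed', {
    orderId: 'ord_42',
    revenue: 99.9,
    currency: 'EUR',
  });
});
```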

However, despite these similarities between analytics testing and code testing, which confirm that we're not talking about an engineering mystery, it's also important to mention that analytics testing presents its own set of challenges.

Firstly, event tracking is always the last step of an implementation, so analytics efforts tend to be deferred and treated as an add-on at a later stage. This is what makes it so hard to achieve comprehensive analytics QA coverage and fix bugs before they have the chance to propagate into production environments.

Secondly, the technical specifications guiding analytics testing depend heavily on how the tracked user actions are actually implemented, rather than solely on the initial design specifications. In other words, it is not enough for user actions to be well conceived on paper during the planning phase; what matters is how they are executed within the actual system, across a spectrum of activities that spans coding, integration, deployment, and ongoing maintenance.

Finally, it’s also worth mentioning that analytics testing often involves various stakeholders, each with their own perspectives and priorities. Coordinating these diverse interests and keeping everyone aligned throughout the testing process can pose additional challenges, particularly when it comes to defining and validating the expected analytics outcomes.

Overcoming These Challenges with Trackingplan’s Regression Testing

Trackingplan’s module for Regression Testing tackles these difficulties by piggybacking on your existing functional testing frameworks, eliminating the need for standalone, complex testing setups. Yes, you do not need to write new tests! Instead, the analytics of your existing functional tests will be validated automatically. This approach not only saves time and resources, but also ensures consistency and coherence across the testing pipeline.

Moreover, the module's ability to seamlessly integrate with popular testing frameworks such as Cypress and others enhances its accessibility and usability for development teams. With minimal setup and configuration requirements, teams can quickly incorporate analytics testing into their existing workflows, fostering a culture of continuous testing and quality assurance and democratizing access to decisive testing capabilities.
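To give a sense of what this looks like in practice, the sketch below is an ordinary Cypress test with made-up routes and selectors. Note that it contains no analytics assertions at all: the analytics traffic generated while this functional test runs is what gets captured and validated against the baseline.

```typescript
// Illustrative only: a plain functional Cypress test with hypothetical
// routes and selectors. No analytics-specific code is added here; the
// analytics fired during the run are validated separately.
describe('checkout flow', () => {
  it('completes a purchase', () => {
    cy.visit('/products/blue-widget');
    cy.get('[data-test="add-to-cart"]').click();
    cy.get('[data-test="checkout"]').click();
    cy.get('[data-test="confirm-purchase"]').click();
    cy.contains('Thank you for your order');
  });
});
```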

View of Trackingplan's Regression Testing module.

Lessons Learned: Navigating Challenges in Regression Testing

A few months have passed since the release of the public version of Trackingplan’s Regression Testing and its API for CI/CD Integration. Since then, we have seen it run automatically in thousands of CI/CD pipeline executions by companies that obtained analytics testing coverage for hundreds of tests, for free, from one day to the next.

Likewise, we’ve witnessed dozens of users easily debug regression bugs and specification mismatches that our solution spotted automatically and that would otherwise have gone unnoticed. This has allowed them to fix mistakes, update baselines, and approve release candidates before errors slipped through the cracks into production.

To our surprise, Google Analytics 4 and custom tracking have been the most tested with Regression Testing, whereas Universal Analytics is still ahead in our Production Monitoring.

Another clear lesson we’ve come across is how difficult it can be for large-scale software projects to obtain a baseline they can reliably use for analytics testing: one in which all tests passed and the analytics tracked are exactly the ones they specified.

In particular, in large-scale software projects each build introduces changes from potentially hundreds of developers, which makes it almost impossible to test changes in isolation and attribute any subsequent issues or breakages to a single cause. This is even worse for shared builds deployed to staging environments or release candidates, to which multiple teams or departments may contribute.

As a result, while comparing each new release or build against the previous one may help identify new issues or breakages, it does not provide a clear picture of how well we are doing compared to an ideal, or golden, baseline, which may not correspond to any of our past releases.
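Conceptually (and leaving aside how Trackingplan implements it internally), comparing against a golden baseline means checking a build’s captured analytics against the ideal specification rather than against whatever the previous build happened to send. A rough sketch, with hypothetical types and names:

```typescript
// Conceptual sketch only, not Trackingplan's implementation: flag any event
// from the golden baseline that a build failed to send, or sent with
// missing required properties.
interface EventSpec {
  name: string;
  requiredProperties: string[];
}

function findRegressions(
  goldenBaseline: EventSpec[],
  captured: Map<string, Record<string, unknown>[]>, // event name -> payloads seen
): string[] {
  const problems: string[] = [];
  for (const spec of goldenBaseline) {
    const payloads = captured.get(spec.name) ?? [];
    if (payloads.length === 0) {
      problems.push(`Missing event: ${spec.name}`);
      continue;
    }
    for (const prop of spec.requiredProperties) {
      if (payloads.some((payload) => !(prop in payload))) {
        problems.push(`Event "${spec.name}" is missing property "${prop}"`);
      }
    }
  }
  return problems;
}
```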

Towards a Golden Standard: Crafting Baselines for Reliable Testing

In light of these challenges, and the growing need for an ideal and reliable baseline for analytics testing, Trackingplan has been hard at work to give our users the possibility to craft a golden baseline manually.

This golden baseline serves as a reference point against which future builds or releases can be compared. Moreover, even if no build has yet achieved this ideal state, users have the opportunity to define and establish it themselves.

In essence, users can manually craft the set of events, properties, and their specifications (including on which pages these should trigger) that represents the desired behavior of each of their functional tests.
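As a rough illustration, such a manually crafted baseline for a single functional test might look like the following; the structure, event names, and page patterns are hypothetical rather than Trackingplan’s actual format:

```typescript
// Hypothetical shape of a golden baseline for one functional test:
// which events must fire, on which pages, and with which properties.
const checkoutBaseline = {
  test: 'checkout flow',
  expectedEvents: [
    {
      name: 'Product Added',
      page: '/products/*',
      properties: ['productId', 'price', 'currency'],
    },
    {
      name: 'Checkout Completed',
      page: '/checkout/confirmation',
      properties: ['orderId', 'revenue', 'currency'],
    },
  ],
};
```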

Crafting a Golden Baseline: A Step-by-Step Guide

Creating a golden baseline for analytics testing with Trackingplan is simple. Let’s explore a step-by-step walkthrough together:

  1. Select a starting point: Begin by choosing a test session that serves as a suitable starting point for creating your golden baseline. This could be your organization's last public release or any other test session that accurately represents the current state of your analytics.
  2. Identify and delete failed tests: Then, delete any tests that have failed functionally or have previously been identified as out of spec, so that only valid, passing tests are included in the creation of your golden baseline.
View of Trackingplan's Regression Testing: Test sessions and baseline sessions.
  3. Manually fix the code: We can then fix the code and run only those tests again, iterating this process until the tests pass and our analytics align with our specifications.
  4. Set as default: Once the golden baseline has been successfully created and validated, set it as the default reference point for future analytics testing. This ensures that all subsequent test sessions are compared against the established baseline, making it easier to detect regressions or deviations in analytics data in future CI/CD runs.
Image showing how to set a baseline as the default.

Conclusion

By following these steps, you can effectively create a golden baseline for analytics testing, providing a reliable benchmark against which to evaluate the performance and accuracy of your analytics system.

We invite you to reflect on your current analytics testing approach. Are you happy with the process and the coverage it provides? Do you find yourself facing data downtime because bugs reached production and were detected too late? We're keen to hear your thoughts, experiences, and any hurdles you've encountered (swag and free trials may apply!).

Getting started is simple

In our easy onboarding process, install Trackingplan on your websites and apps, and sit back while we automatically create your dashboard.
