More Trustworthy A/B Analysis: Less Data Sampling and More Data Reducing

We are all familiar with terabytes and petabytes. But have you heard of zettabytes (one zettabyte is one million petabytes)? Worldwide data volume is expected to hit 163 zettabytes by 2025, 10 times the data in 2017. Your product will contribute to the surge, especially if it is growing rapidly, so you should be prepared to manage the increase in data.

The cost of storage and computation will climb as data volume keeps increasing. Your data pipeline could even fail to process the data if a computation request exceeds its capacity. To avoid these issues, you can reduce the data volume by collecting only a portion of the data generated. But you need to answer several questions to ensure the data is collected in a trustworthy way: Are you mindful of the impact on A/B test analysis? Are your metrics still valid and sensitive? Are you confident the A/B analysis is still trustworthy enough to support correct ship decisions?

Read more: https://www.microsoft.com/en-us/research/group/experimentation-platform-exp/articles/more-trustworthy-a-b-analysis-less-data-sampling-and-more-data-reducing/