The p-value won’t trend consistently toward zero unless there is a true and significant difference between the variants. Some variability is expected.
Early in a test, when sample sizes are small, the p-value tends to be quite volatile, but it should stabilise as the sample size increases. One factor that can cause it to fluctuate is outlier exclusion: on a day when an outlier exceeds the threshold, the number of tracked searches may drop suddenly, which shifts the rate ratios and, in turn, the p-value.
Other potential causes of p-value fluctuations include:
- sales cycles or user behaviour patterns (e.g. almost nobody searching over a weekend compared with the following Monday)
- events such as marketing campaigns or the launch of a sale
- any change that affects the volume of tracked searches or of tracked events
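
The volatility described above can be illustrated with a small simulation. The sketch below is not the calculation used by the product: it assumes a simple pooled two-proportion z-test as a stand-in for the rate-ratio test, and the function names, traffic volumes, and conversion rates are all hypothetical. Both variants are given the same underlying rate, so any movement in the p-value is noise rather than a real effect.

```python
import math
import random


def two_proportion_p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value from a pooled two-proportion z-test (illustrative stand-in)."""
    if n_a == 0 or n_b == 0:
        return 1.0
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 1.0  # no variation in the data, so no evidence either way
    z = (conv_b / n_b - conv_a / n_a) / se
    # Two-sided tail of the standard normal: P(|Z| > |z|) = erfc(|z| / sqrt(2))
    return math.erfc(abs(z) / math.sqrt(2))


def simulate_daily_p_values(days=28, rate_a=0.10, rate_b=0.10, seed=7):
    """Return the p-value computed on cumulative data at the end of each day."""
    rng = random.Random(seed)
    conv_a = conv_b = n_a = n_b = 0
    p_values = []
    for day in range(days):
        # Hypothetical traffic pattern: two quiet weekend days in every seven.
        searches = 50 if day % 7 in (5, 6) else 500
        n_a += searches
        n_b += searches
        conv_a += sum(rng.random() < rate_a for _ in range(searches))
        conv_b += sum(rng.random() < rate_b for _ in range(searches))
        p_values.append(two_proportion_p_value(conv_a, n_a, conv_b, n_b))
    return p_values


if __name__ == "__main__":
    for day, p in enumerate(simulate_daily_p_values(), start=1):
        print(f"day {day:2d}: p = {p:.3f}")
```

Running the script with different `seed` values shows how the day-by-day p-value path changes between runs even though the two variants are identical. Setting `rate_b` above `rate_a` simulates a real difference between the variants for comparison.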