Coronavirus Case Data Illustrates Heteroskedasticity

John Vandivier

Coronavirus case data is an ideal illustration of heteroskedasticity: the variance of the error in the data is not constant across observations.

Test coverage and data quality have increased over time, so the data is becoming more reliable and the variance of the measurement error is falling. Measurement error was high at first because policy, testing, and reporting were implemented differently from state to state and at the federal level.
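A small simulation can make the falling error variance concrete. The sketch below is purely illustrative and uses no real case data: the function name, the 200-day window, and the starting and ending standard deviations are all hypothetical, and the linear decline in noise is an assumption rather than something measured.

```python
import random
import statistics


def simulate_residuals(n_days=200, sd_start=50.0, sd_end=5.0, seed=0):
    """Draw one measurement error per day, with noise that shrinks over time.

    All parameters are made-up values for illustration only.
    """
    rng = random.Random(seed)
    residuals = []
    for day in range(n_days):
        # Assumption: the error standard deviation declines linearly.
        frac = day / (n_days - 1)
        sd = sd_start + (sd_end - sd_start) * frac
        residuals.append(rng.gauss(0.0, sd))
    return residuals


residuals = simulate_residuals()
half = len(residuals) // 2

# Heteroskedasticity shows up as unequal error variance across periods.
early_var = statistics.pvariance(residuals[:half])
late_var = statistics.pvariance(residuals[half:])
print(f"early variance: {early_var:.1f}, late variance: {late_var:.1f}")
```

If the errors were homoskedastic, the two variances would be roughly equal. Here the early-period variance is much larger, which is the pattern the case data is argued to follow.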

It is true that the early data is bad, but it does not follow that forecasting is unsound. As long as data quality improves over time, rather than error growing over time, the point estimate converges, and the eventual confidence for any particular range is greater than or equal to the confidence based on the current period's data.
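One way to see why confidence cannot get worse is to combine observations by inverse-variance weighting. This is a minimal sketch and not necessarily the estimator a forecaster would use. It assumes the measurement errors are independent, unbiased, and of known standard deviation, and the list of standard deviations is hypothetical. Under those assumptions the precision (one over the variance) of the combined estimate is the sum of the individual precisions, so its standard error never rises when a new observation arrives.

```python
import math


def cumulative_standard_errors(sds):
    """Standard error of the inverse-variance weighted mean after each observation.

    Assumes independent, unbiased errors with known standard deviations.
    """
    total_precision = 0.0
    errors = []
    for sd in sds:
        # Each observation adds precision 1 / sd^2 to the running total.
        total_precision += 1.0 / (sd * sd)
        errors.append(1.0 / math.sqrt(total_precision))
    return errors


# Hypothetical per-period error standard deviations, improving over time.
sds = [50.0, 40.0, 30.0, 20.0, 10.0, 5.0]
ses = cumulative_standard_errors(sds)
for period, se in enumerate(ses, start=1):
    print(f"period {period}: standard error {se:.2f}")
```

Because every observation contributes a positive amount of precision, the standard error is non-increasing even when the new data is noisy. When the new data is cleaner than the old, as argued above, the interval narrows faster.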