Data Sources

My model works by trying to “guess” the values of some parameters so that the model dynamics best fit the empirical data on the cumulative number of cases and deaths counted over time. The scientific and medical communities have made great efforts to share data widely, and I've found several sources I rely on regularly. Thanks to these data curators:

When I run my model (i.e., systematically try to “guess” the parameter values) I first load the empirical data from the sources above and then merge them into a single time series of total global cases and deaths. You can read my scripts to see how I load and merge the data. I try to remove duplicates that appear in multiple datasets, but some can still slip through (for example, if the same data are marked with different datestamps). It may seem good to have extra data, but redundant data can be dangerous: they can give the impression that your data are more reliable than they actually are. I've tried to design my model and analysis so that redundant data don't give me false impressions.
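
To make the merging step concrete, here is a minimal sketch in Python. It is not my actual script: the `Record` type, the function names and the sample numbers are all hypothetical, chosen only for illustration. It assumes each source has already been parsed into (date, cumulative cases, cumulative deaths) records. Exact duplicates are dropped during the merge, while records with identical counts but different datestamps are only flagged, since they might be the same report entered twice or might be genuine days with no new reports.

```python
from datetime import date
from typing import Iterable, List, NamedTuple, Tuple


class Record(NamedTuple):
    """One report (hypothetical layout, not the real file format)."""
    day: date    # date of the report
    cases: int   # cumulative cases to date
    deaths: int  # cumulative deaths to date


def merge_sources(*sources: Iterable[Record]) -> List[Record]:
    """Merge sources into one date-sorted series, dropping exact duplicates."""
    unique = set()
    for source in sources:
        unique.update(source)
    # NamedTuples sort field by field, so this orders by day first.
    return sorted(unique)


def suspected_duplicates(series: List[Record]) -> List[Tuple[Record, Record]]:
    """Return consecutive records with identical counts but different dates."""
    return [
        (a, b)
        for a, b in zip(series, series[1:])
        if (a.cases, a.deaths) == (b.cases, b.deaths) and a.day != b.day
    ]


# Made-up numbers, for illustration only.
source_a = [Record(date(2014, 8, 1), 100, 50), Record(date(2014, 8, 2), 120, 60)]
source_b = [Record(date(2014, 8, 2), 120, 60), Record(date(2014, 8, 3), 120, 60)]

merged = merge_sources(source_a, source_b)  # the shared 2014-08-02 record appears once
flagged = suspected_duplicates(merged)      # the 08-02 / 08-03 pair needs a manual look
```

Flagging rather than deleting the suspicious pairs is a deliberate choice in this sketch: cumulative counts can legitimately stay flat between reports, so deciding which records are truly redundant still takes judgment.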

I also estimate a single value of the parameter $\phi$ to anchor it. Here's the spreadsheet and data:

  • science/ebola2014/data/start.txt
  • Last modified: 2015-07-30 22:14
  • by Rik Blok