This was inspired by the disease incidence rate in the US featured on the Wall Street Journal which I mentioned in one of the previous posts. The disease incidence dataset was originally used in this article in the New England Journal of Medicine. Here, I use the measles level 1 incidence (cases per 100,000 people) dataset obtained as a .csv file from Project Tycho. Download the .csv file here or head over to Project Tycho for other datasets.
In this post, we will look into creating a neat, clean and elegant heatmap in R. No clustering, no dendrograms, no trace lines, no bullshit. We will go through some basic data cleanup, reformatting and finally plotting. We go through this step by step. For the whole code with minimal explanations, scroll to the bottom of the page.