
Multiple Statistical Testing

With the advent of big data such as genomics, running numerous statistical tests is unavoidable. But along come strange statistical problems. This post investigates issues with multiple statistical testing and their solutions, using simulated data.

In a standard statistical test, one assumes a null hypothesis, performs a statistical test and computes a p-value. The estimated p-value is compared to a predetermined threshold (usually 0.05). If the estimated p-value is greater than 0.05 (say 0.2), it means that there is a 20% chance of obtaining a result at least as extreme as the current one if the null hypothesis is true. Since we decided our threshold as 5%, the 20% is too high to reject the null hypothesis, so we fail to reject it. Now, if the estimated p-value was less than 0.05 (say 0.02), there is a 2% probability of obtaining a result at least as extreme as the observed one if the null hypothesis is true. Since 2% is a very low probability and it is below our threshold of 5%, we reject the null hypothesis in favour of the alternative hypothesis.

The 5% threshold, although giving us high confidence, is an arbitrary value and does not absolutely guarantee an outcome. When the null hypothesis is actually true, we will still wrongly reject it 5% of the time. This is known as the probability of a Type I error. A Type I error occurs when a researcher falsely concludes that an observed difference is real, when in fact, there is no difference.
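To make the 5% figure concrete, here is a minimal simulation sketch. It is not taken from the original post: the function names, the sample size, and the choice of a two-sample z-test with known standard deviation are all assumptions made for illustration. Both groups are drawn from the same distribution, so the null hypothesis is true by construction and every rejection is a Type I error.

```python
import random
from statistics import NormalDist, mean


def two_sample_z_pvalue(a, b, sigma=1.0):
    """Two-sided p-value for a difference in means, assuming a known sigma."""
    # Standard error of the difference between the two sample means.
    se = sigma * (1.0 / len(a) + 1.0 / len(b)) ** 0.5
    z = (mean(a) - mean(b)) / se
    # Probability of a |z| at least this large under the null hypothesis.
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def type1_error_rate(n_sim=2000, n=20, alpha=0.05, seed=42):
    """Fraction of simulated null experiments that are (wrongly) rejected."""
    rng = random.Random(seed)
    false_positives = 0
    for _ in range(n_sim):
        # Both groups come from N(0, 1): there is no real difference.
        a = [rng.gauss(0.0, 1.0) for _ in range(n)]
        b = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if two_sample_z_pvalue(a, b) < alpha:
            false_positives += 1
    return false_positives / n_sim


if __name__ == "__main__":
    print("Observed Type I error rate:", type1_error_rate())
```

The observed rate should land close to the chosen threshold of 0.05, with some run-to-run noise that shrinks as the number of simulated experiments grows.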

That was the story of a single statistical test. With large data, it is common for data analysts to do multiple statistical tests on the same data. Just like a single test, each test in a set of multiple tests has a 5% Type I error rate. These errors accumulate: the more tests we run, the higher the probability that at least one of them is a false positive.
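How quickly the error accumulates can be sketched with a short calculation. Assuming the m tests are independent (an assumption made here for illustration; real tests on the same data are often correlated), the probability of at least one false positive is 1 - (1 - 0.05)^m. The sketch below computes this value and checks it by simulation, using the fact that p-values are uniformly distributed between 0 and 1 when the null hypothesis is true. The function names are hypothetical.

```python
import random


def fwer_analytic(m, alpha=0.05):
    """Probability of at least one false positive in m independent tests."""
    return 1.0 - (1.0 - alpha) ** m


def fwer_simulated(m, alpha=0.05, n_sim=5000, seed=1):
    """Estimate the same probability by drawing null p-values."""
    rng = random.Random(seed)
    families_with_error = 0
    for _ in range(n_sim):
        # Under a true null hypothesis each p-value is Uniform(0, 1).
        if any(rng.random() < alpha for _ in range(m)):
            families_with_error += 1
    return families_with_error / n_sim


if __name__ == "__main__":
    for m in (1, 5, 20, 100):
        print(m, round(fwer_analytic(m), 3), round(fwer_simulated(m), 3))
```

With 20 independent tests at the 5% threshold, the analytic value is already about 0.64, so a false positive somewhere in the set is more likely than not.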


India 2017

A short visit to India to visit my family and rejuvenate my senses.

I took a short vacation to Kerala, India to visit my family and to rejuvenate myself.

It was mostly visiting relatives, excessive eating and dealing with heat, humidity and traffic. On the bright side, the ayurvedic massages can be quite relaxing. The trip also presents opportunities to enjoy some local cuisine. I do enjoy Indian food, but South Indian food tends to be quite spicy and it can be challenging to find food that agrees with my taste buds. I usually eat a lot of non-veg, but here in Kerala, there is so much diversity in vegetarian food that I would happily become a vegetarian.

The mountainous regions of Kerala, such as Idukki district, are among my favourite destinations to escape the heat and pollution of cities. The route is scenic and if you are lucky, you might even get to see some wildlife, as a lot of Idukki is part of the Western Ghats nature reserves and national parks. This is also an ideal location for stargazing as there is minimal light pollution. As a plus, there tend to be fewer mosquitoes in the mountains.

Humanity has never lived in better times

It is easy to be disillusioned and pessimistic about the world we live in. Bad news seems to be followed by worse news. But humanity has come a long way from the disease-ridden, impoverished, war-torn lives of our forefathers. Here we look at a few data-driven graphs to convince ourselves of the progress we have made over time in various aspects of life. Slow progress never makes headlines.

It may seem like the world is descending into total chaos, violence, and destruction. War in Syria, Ukraine and Yemen, the Islamic State, the migrant crisis, Ebola, plane crashes, earthquakes, tsunamis and whatnot. The more news you watch, the more worried you will be. This is because news outlets tend to focus on spectacularly negative events. Violence, atrocities, and hatred are thrown into the spotlight and into the lives of common people. With ever-increasing digital connectivity, it is easy to disseminate and absorb information at an unprecedented level. Relatively small incidents have a larger voice. As Ray Kurzweil said, “The world isn’t getting worse, our information is getting better”. To appreciate the world we live in, we have to put things into a wider context.

The fact is that humanity has never lived in a better time than now in pretty much every aspect you look at: war, violence, disease and poverty are all at the lowest levels they have ever been. Of course, there is still a long way to go, but this is the best it has been since the beginning of humankind. To prove my point, here we evaluate human progress using some real data and simple time-series plots. Most of the data and information was obtained from OurWorldInData.


Autumn in Fulufjället

Autumn pictures from Fulufjällets national park.

Autumn in Fulufjällets national park in western Sweden.
Fulufjällets national park on the border with Norway is a great location for a weekend away in nature. The place is charming, with vast meadows, forests, lakes, streams, mountains and idyllic houses. Fulufjället is known for its well-preserved primeval forests, wildlife and unique geology. Fulufjället is also home to Sweden’s highest waterfall, Njupeskär falls, and oldest tree, Old Tjikko. The location is also popular for winter activities such as skiing, snowmobiling and ice climbing on the Njupeskär falls. The park has an attractive and informative visitor centre (naturum) with visitor information including maps and guides. The park has wind shelters and camping sites. Part of the national park is located in Norway.