Speaker
Description
Sir Ronald Fisher was a giant in both statistics and genetics. His seminal work was done at Rothamsted Experimental Station outside of London.
Fisher was both a statistician and a scientist. He was well trained in mathematics, but in the final analysis, he was a scientist who understood how to perform proper data analysis. Fisher strongly rejected the Neyman-Pearson approach because of the strict distributional assumptions it requires for hypothesis testing. Fisher understood that the assumption that the data are independent and identically distributed does not hold for real-world data.
This talk summarizes his three basic principles for conducting experiments: randomization, replication, and local control of error (often referred to as “blocking”). Through simple examples, it shows why these principles, especially randomization, are extremely important. Fisher understood that real data are always correlated, often highly so. Randomization provides a mechanism for making fair comparisons among the various treatments under study.
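As a rough illustration of how the three principles fit together, the sketch below lays out a randomized complete block design in Python. It is not taken from the talk: the function name `randomized_complete_block`, the treatment labels, and the block names are all hypothetical. Each block receives every treatment once (local control of error), the blocks repeat the comparison (replication), and the order within each block is shuffled independently (randomization).

```python
import random


def randomized_complete_block(treatments, blocks, seed=None):
    """Return a dict mapping each block to a random ordering of all treatments.

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)  # seeded generator so a layout can be reproduced
    layout = {}
    for block in blocks:
        order = list(treatments)  # every treatment appears once per block
        rng.shuffle(order)        # independent randomization within the block
        layout[block] = order
    return layout


# Assumed example: three treatments compared across four blocks (replicates).
treatments = ["A", "B", "C"]
blocks = ["block_1", "block_2", "block_3", "block_4"]

layout = randomized_complete_block(treatments, blocks, seed=0)
for block, order in layout.items():
    print(block, order)
```

Because treatments are compared within a block, conditions shared by the plots in that block affect all treatments alike, and the random ordering keeps any remaining pattern among neighboring plots from systematically favoring one treatment.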
This talk then discusses how to adapt these principles for a large data universe, something that did not exist when Fisher was alive, much less when he was at Rothamsted. In the current big data universe, all of the data are highly correlated. His fundamental approach, properly modified, provides a formal basis for the analysis of complex data that corrects for the complexity of the models used to explain the data.
| Special/Invited session | "Statistics and data science in the technological field: current issues and new proposals" |
|---|---|
| Classification | Mainly methodology |
| Keywords | mathematical statistics versus scientific data analysis, deterministic data, complex model identification |