May 8, 2017
The first annual Statistics and Data Science Center Day (SDSCon) at MIT highlighted a variety of research projects, including efforts to better understand gene editing, climate change, microcredit programs, international trade, and recommendation systems. The common thread across these diverse research areas is that researchers can use statistics and data science to learn about and accurately model different systems, leading to insights into how the systems work and to better-informed decisions and policies.
SDSCon 2017 was hosted by the Statistics and Data Science Center (SDSC), which is part of the MIT Institute for Data, Systems, and Society. The April 21 conference was the first of what will be an annual celebration of statistics and data science, bringing together a growing community at MIT and beyond.
The day featured several short talks by SDSC faculty, three longer presentations by experts from outside of MIT, a brief industry session, and a graduate student poster session. Videos of all talks are available online.
SDSC brings together Institute-wide efforts and expertise in statistics and data science, supporting both academic programs and research. New academic programs include an undergraduate minor in statistics and data science, launched last fall, and a PhD program that is still in the planning stages.
Devavrat Shah, SDSC director and a professor of electrical engineering and computer science, noted the interdependence of the academic and research components of SDSC. "As we know, a good education cannot happen without good research activities," he said.
Shah described SDSC as providing "a wide and common umbrella for people across the campus to come together and … make progress and learn from each other." The major challenges addressed by SDSC researchers often involve both people and data, and their research often looks at questions of how to analyze data and how to use it to inform decisions. Shah also addressed some different perceptions of statistics, as well as how the field is shifting and evolving.
"It's important to understand and remember and celebrate classical statistics," he said. "But it's also important to expand our horizons by bringing things like computation as a foundational topic."
Michael Steele, a professor at the University of Pennsylvania, discussed the effectiveness of decision-making algorithms in relation to the St. Petersburg paradox, a classic problem in which a coin-flipping game has an infinite expected payoff yet seems worth only a modest price to play, illustrating the difficulty of making decisions based on expected reward alone. He highlighted new theoretical work that attempts to explain strategies that might allow decision-making algorithms to work well despite this challenge.
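For readers unfamiliar with the paradox, the sketch below may help. It is not drawn from Steele's talk; it assumes the common textbook version of the game, in which a pot starts at $2 and doubles on every tail, and the first head pays out the pot. The function names (`play_once`, `partial_expected_value`, `sample_mean`) are hypothetical, chosen only for illustration. Each possible outcome contributes exactly $1 to the expected payout, so the expectation grows without bound, while the average over any finite number of simulated games stays finite and varies widely from run to run.

```python
import random


def play_once(rng):
    """Play one game: the pot starts at 2 and doubles on each tail; the first head pays the pot."""
    payout = 2
    while rng.random() < 0.5:  # treat this branch as "tails": keep flipping
        payout *= 2
    return payout


def partial_expected_value(n_terms):
    """Sum the first n_terms of the expected-payout series.

    The first head lands on flip k with probability 2**-k and pays 2**k,
    so every term equals 1 and the partial sum equals n_terms.
    """
    return sum((0.5 ** k) * (2 ** k) for k in range(1, n_terms + 1))


def sample_mean(n_games, seed=0):
    """Average payout over n_games simulated games (seeded for repeatability)."""
    rng = random.Random(seed)
    return sum(play_once(rng) for _ in range(n_games)) / n_games


if __name__ == "__main__":
    for n_terms in (10, 20, 40):
        print("partial expectation,", n_terms, "terms:", partial_expected_value(n_terms))
    for n_games in (100, 10_000, 1_000_000):
        print("sample mean over", n_games, "games:", sample_mean(n_games))
```

The gap between the unbounded expectation and the modest averages seen in practice is the difficulty the paradox points to: a rule that simply maximizes expected reward would pay any price to play.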
Jennifer Listgarten, a senior researcher at Microsoft Research New England, spoke about data science challenges in the area of genetics, which she described as "a truly data-driven science." Listgarten focused primarily on the gene-editing system CRISPR.
Harvard Kennedy School Professor James Stock talked about statistical analysis of climate change, especially in the context of clearly communicating climate change research and models to policymakers. He noted that although climate change might seem to be a "data-rich challenge," the reality is still that the data are from only one "experiment" — the increasing temperatures of the Earth. He also presented some of the different types of climate change data available and some insights they might provide.