Annual Report

Letter From the President and the Chair

In 2014, The New York Times featured 7,029 articles that included the word ‘data.’ One hundred years earlier, in 1914, there were just 467 articles mentioning ‘data,’ and 50 years before that, in 1864, a mere 63 articles. Percentagewise, these figures represent 8.5 percent, 0.63 percent and 0.31 percent of the newspaper’s stories for those years, respectively. As these statistics from The New York Times’ language usage tool Chronicle illustrate, the role of data is increasingly important in our world.

Why is data the new frontier? Recent technology is facilitating our ability to capture, store and process massive sets of information. With the potential to accumulate such troves of numerical observations across fields of expertise, researchers see many opportunities for querying large resources to ask the big questions in science. What are the ultimate constituents of the universe? What are the origins of life? Is there life on other planets? How do neural circuits integrate information to form thoughts and memories? What can we learn from our genome about disease, evolution and the diversity of life?

While large-scale data collection holds the promise of advancing our knowledge, there are also many challenges to surmount in the handling of these vast datasets. We need improved techniques for analysis, advanced filtering algorithms, larger storage capacity, better transporting capability and faster processing hardware.

At the Simons Foundation, we recognize both the potential gains from analyzing datasets and the complexities inherent in processing immense stores of information. As you will see in the following pages, we are supporting both theoretical and applied efforts in big data.

The foundation is interested not only in big data, however, but in data generally. We support the development of datasets and aim to provide them to investigators as a no-cost, collective resource. Such shared resources facilitate the cross-pollination of ideas among scientists who share information across disciplines and organizations. The foundation also fosters collaboration between outside investigators and in-house working groups.

At the nexus of this interchange is our new internal data research division, the Simons Center for Data Analysis. SCDA seeks to study datasets of great scientific interest and, in the process, develop new mathematical tools for their study. With an initial focus on neuroscience, genomics and systems biology, the modus operandi of SCDA is to collect, analyze, innovate and share.

With data taking on an increasingly important role in our society and in decision-making, mathematical skills and scientific literacy are becoming ever more essential. We need these skills not only to process information, but also to weigh the validity of its purported conclusions. As John Ewing cautions in his opinion piece, intelligence and insight must always be applied to draw real meaning from a set of numbers.

In this annual report, our goal is to show you some of our efforts around big data, and around data in general. The word ‘data’ will appear in 64.7 percent of our stories, or 76.57 percent if you include the words ‘database’ and ‘dataset.’ We hope you enjoy reading about our work.

Marilyn Hawrys Simons, Ph.D.

President

James H. Simons, Ph.D.

Chair
