Big Data in Natural Sciences, Humanities and Social Sciences: Review of the 6th Exploratory Round Table Conference
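Authors
Guo Huadong (Institute of Remote Sensing and Digital Earth, Chinese Academy of Sciences, Beijing 100094)
Chen Runsheng (Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101)
Xu Zhiwei (Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190)
Sun Jianjun (Nanjing University, Nanjing 210023)
Bi Jun (Nanjing University, Nanjing 210023)
Wang Lizhe (Institute of Remote Sensing and Digital Earth, Chinese Academy of Sciences, Beijing 100094)
Luo Jianjun (Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101)
Shen Huawei (Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190)
Gu Dongxiao (Nanjing University, Nanjing 210023)
Liang Dong (Institute of Remote Sensing and Digital Earth, Chinese Academy of Sciences, Beijing 100094)
Shen Wenqing (Shanghai Branch, Chinese Academy of Sciences, Shanghai 200031)
Zhang Xu (Shanghai Branch, Chinese Academy of Sciences, Shanghai 200031)
Hans Wolfgang Spiess (Max Planck Institute for Polymer Research, Mainz 55128)
Thomas Lengauer (Max Planck Institute for Informatics, Saarbrücken 66123)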
Keywords
big data; scientific big data; life sciences; Earth sciences; humanities; social sciences; computing technology; Exploratory Round Table Conference
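Chinese abstract
Big data is a strategic high ground in the era of the knowledge economy and a new type of strategic resource for nations and the world. As a revolutionary innovation in thinking, big data has brought a new methodology to scientific research. The 6th Sino-German Exploratory Round Table Conference, held under the theme "Big Data in the Natural Sciences and Humanities", organized its discussions around four areas: "big data in biomedicine", "big data in physics, chemistry, and the Earth sciences", "big data in the humanities and social sciences", and "big data processing technologies and methods". It summarized the important role and significance of big data for scientific discovery and the major problems it faces, and it produced recommendations for developing scientific big data research.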
English abstract
Big data has begun to significantly influence global production, circulation, distribution, and consumption patterns. It is changing humankind's production methods, lifestyles, mechanisms of economic operation, and country governance models. It is a strategic enabling technology in the era of knowledge-driven economies, and also a new type of strategic resource for nations and the world. It offers a promising new route for innovative methods of analysis and inference, and provides new opportunities for natural sciences, humanities and social sciences. Ubiquitous in the discussion of today's technology, the colorful and not clearly delineated term "big data" is on people's minds, regarding both its immense potential and its actual and perceived risks. The 6th Exploratory Round Table Conference (ERTC 2015) under the theme of "Big Data in the Natural Sciences and Humanities" was successfully held in Shanghai in November 2015. It was a joint project of the Chinese Academy of Sciences (CAS) and Max Planck Society (MPG), focused on topics that are only just beginning to emerge in the scientific community. Scientists from CAS and MPG met with experts from across China and around the world to review the status of research and technology that concern or use big data and to discuss how it can and should be harnessed for furthering science. Big data is characterized by (1) readily accessible generation of large volumes of data, which (2) are generated continuously in a highly dynamic fashion and which feature (3) high data heterogeneity and (4) serious issues of data quality regarding noise, incompleteness, and biases. The status and requirements of big data research differ substantially among individual scientific domains. The life sciences have large, internationally shared repositories of highly diverse omics data.
Current activities include bringing together biological and medical (patient) data for research on diagnosis and therapy and making patient data accessible while preserving patient privacy. In the Earth sciences, various Earth observation methods, for example, remote sensing, ground sensor networks, geophysics, geochemistry, and geological surveys, have afforded huge volumes of data, so-called big Earth data. Exciting themes include global change and digital Earth science. Digital Earth is conceived as a virtual representation of our planet constructed from massive, multi-resolution, multi-temporal Earth observation data and socioeconomic data of different types. This multi-disciplinary challenge relies on big data. Big data is also emerging in the humanities and social sciences. High-resolution 3D imaging, for example, has led to the generation of large amounts of data for digital reproductions of cultural heritage artifacts that require large processing capabilities for filtering and reassembly. The key problem in the social sciences is that the vast majority of data is still only available as images, texts, or websites, without appropriate metadata to enable discovery and analysis. Methodologies based on big data pose a number of challenges. (1) In order to gain trust in the data and learned predictive models, the predictions must be interpretable by a human. (2) Another challenge is the resulting loss of privacy: in some settings, complex predictive models are able to recover partial information from different databases, and effectively deanonymize seemingly anonymous data. (3) At the infrastructure level, energy- and cost-efficient solutions are becoming a growing necessity. (4) Furthermore, the software deployed on such infrastructure must deal transparently and resiliently with the noise and heterogeneity inherent to big data.
Over the three-day conference, a preliminary consensus emerged that big data, as a new way of living and of understanding the world, is driving the transformation of scientific research paradigms and promoting scientific development. The scientific community needs a clear understanding of how big data plays a critical role in scientific discovery, what its significance is, and what major challenges it faces. The conference also recommended establishing a Scientific Data Center through communication and cooperation, forming a scientific working group to study big data issues, and strengthening the training of young scientists in the field of big data.
DOI: 10.16418/j.issn.1000-3045.2016.06.014
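About the author
Guo Huadong is a professor at the Institute of Remote Sensing and Digital Earth, Chinese Academy of Sciences (CAS). He is an Academician of CAS, a Fellow of the Academy of Sciences for the Developing World, and an Academician of the International Eurasian Academy of Sciences. He currently serves as President of the International Society for Digital Earth (ISDE) and Chair of the ISDE Chinese National Committee; is Past President of, and China's national delegate to, the Committee on Data for Science and Technology (CODATA) of the International Council for Science (ICSU); is a member of the Scientific Committee of the Integrated Research on Disaster Risk (IRDR) programme and Chair of the IRDR China Committee; and is Editor-in-Chief of the International Journal of Digital Earth, among other positions. His research focuses on remote sensing science and its applications, and he has produced a series of results in remote sensing information mechanisms, radar Earth observation, and digital Earth science. He has published more than 400 papers, authored or edited 16 books, and received 13 national and provincial or ministerial science and technology awards. E-mail: hdguo@radi.ac.cn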