
Communications of the ACM


Managing Scientific Data: Coping with a Multidisciplinary World Grace Hopper Talk


Valerie Barr, Professor of Computer Science, Union College

After the opening session of Grace Hopper on Thursday, I attended a talk on Managing Scientific Data: Coping with a Multidisciplinary World, given by Claudia Bauzer Medeiros, professor of Computer Science at the University of Campinas. Medeiros has received several awards for research, teaching, and her work on women and IT. She is also the ACM-W ambassador in Brazil, and was president of the Brazilian Computer Society from 2003 to 2007.


In her talk, Medeiros discussed the kinds of issues that arise in medicine, agriculture, urban planning, and other fields when you are dealing with large volumes of heterogeneous data. Some of the problems involve familiar areas such as networking or image processing, but require new algorithms because of the quantity and nature of the data. Data-driven science is a realm in which computer scientists jointly develop research with scientists from other domains in order to advance scientific discovery. Data comes from many places: scientific devices, archives, research literature, simulations, models. The form of the data could be images, people, text. There's a high degree of heterogeneity. Medeiros referred to this as a data deluge, yet in many situations we are data rich but analysis poor. Scientists need new approaches to data mining, filtering, massaging, and visualization. They need more analysis tools and visualization tools. Most of the data will never be viewed directly by people because there is so much of it, so we need tools to digest it. The computer science challenge, as people move into new domains, is to figure out new ways to handle the volume and heterogeneity of the data while meeting the requirements of our colleague scientists from other disciplines. At what level should data be analyzed, and what level of granularity presents a useful solution in the original problem domain? For example, if we are using satellite data, should we analyze an image that represents one day of data, or a composition of 16 days of daily images (as in certain agricultural applications)? What about years of image data? Does the answer change depending on what phenomenon we are studying, whether we are looking at heat, water, or underground humidity?


In her presentation, Medeiros echoed many themes from Duy-Loan Le's opening keynote address. She said that dealing with scientists requires collaboration and cooperation, that computer scientists have to be able to manage user expectations, and that we have to understand the differing "dialects" used in different disciplines. In order to contribute to solving the problems that arise in other research areas, we need to have not only technical skills, but also communication skills and transcultural skills. In addition, rather than seeing ourselves as simply giving help, computer scientists have to build two-way roads in which we also receive feedback that helps us develop better collaborative solutions.


Medeiros closed by saying that when computer scientists work with people in other disciplines, we have to develop some understanding of what their job entails.  This will help us understand their problems in the appropriate context. We need to draw on technical cross-cultural understanding. But she emphasized that we should never forget that we are computer scientists!





