Scientific data, including that from NASA and non-NASA satellites, is accumulating at an ever-increasing rate. If it can be processed and analyzed, this data holds great potential for the discovery of new knowledge. A significant problem is how to give users the means to extract useful information effectively from this growing volume of data. In the last few years, a new discipline called data mining has emerged that has the potential to provide additional techniques and tools for extracting knowledge from this large volume of data.

Data mining consists of an evolving set of techniques that can be used to extract useful knowledge from massive amounts of data. Up to this time, data mining research and tools have primarily focused on applications oriented toward the commercial sector. Only a limited amount of research has focused on mining scientific data, including remotely sensed satellite data.

There have been a number of conferences on various aspects of data mining that have included some discussion of scientific data mining. Data mining has also been considered in the context of scientific data conferences. What has been lacking, however, is a focused interchange of ideas between domain scientists and the data mining community on issues important to the mining of scientific data. This workshop is intended as a vehicle to bring these two communities together so that they can begin to identify issues and formulate mutually beneficial data mining objectives.

For this first workshop, the focus will be limited to data mining in the Earth Science domain. Participation is by invitation only, based on acknowledged expertise in the field.


The workshop is sponsored by NASA. 


The general objectives of the proposed data mining workshop center on identifying issues relevant to data mining applied to scientific data. The specific objectives of the workshop are the following:

  1. Bring together representatives of the data mining community and the domain science community so that they can begin to understand the current capabilities and research objectives of each other's communities related to data mining.
  2. Identify a set of research objectives from the domain science community that would be facilitated by current or anticipated data mining techniques.
  3. Identify a set of research objectives for the data mining community that could be useful to support the research objectives of the domain science community.
  4. Identify any requirements for additional national infrastructure to support data mining.



The workshop will begin with keynote presentations by a data mining authority and a domain science authority. These two keynote speakers will respectively characterize the state of the art of data mining and the state of science problems that are currently challenging existing technology or are anticipated to challenge it in the future. The workshop format allows for the formal presentation of a few worked examples in which data mining has been applied to a science domain. In addition, a few invited talks will highlight various projects that are relevant to data mining. These presentations will serve as a catalyst for discussions held within smaller theme groups, each consisting of an interdisciplinary mix of scientists and data mining experts. These theme groups will address a number of questions that will form the basis for presentations at the conclusion of the workshop.

In contrast to other data mining conferences, we are attempting to have more domain scientists than computer scientists, since we want to understand what the science community would like to see from the data mining community.