Massive Data Sets: Proceedings of a Workshop (1996)

Chapter: Items for Ongoing Consideration

Suggested Citation: "Items for Ongoing Consideration." National Research Council. 1996. Massive Data Sets: Proceedings of a Workshop. Washington, DC: The National Academies Press. doi: 10.17226/5505.

Items for Ongoing Consideration

Data Preparation

  • Elevation of status of data preparation and data quality stages in professional societies
  • Clear articulation of what is meant by a massive data set
  • Development of rigorous, theory-based methods for reduction of dimensionality
  • Systematic study of how, when, and why methods used with small and medium-sized data sets break down with large data sets; understanding of how far current methods, both statistical and computational, can be pushed; articulation of the variety of models that might be useful
  • Development of methods for integration of tools and techniques
  • Development of specialized tools in general "packages" for non-standard (e.g., sensor-based) data
  • Establishment of better links between statistics and computer science
  • Exploration of the use of "infinite" data sets to stimulate methods for massive data sets
  • Creation of richer language for describing structure in data
  • Educational opportunities, both for nonstatisticians who use some statistical techniques and for statisticians, to broaden the knowledge base and provide better links to computer science

Models and Data Presentation Research Issues

  • Discovery and comparison of homogeneous groups
  • Communication and display of variability and bias in models
  • Better design of hierarchical visual display
  • New modeling metaphors and richer class of presentation approaches
  • Methods to help "generalize" and "match" local models (e.g., automated agents)
  • Robust or multiple models; sequential and dynamic models
  • Alternatives to internal cross-validation for model verification
  • Retooling of computing environment for modeling massive data sets
  • Simple presentation of "massive" complex data analyses