Methods for Approximate Query Processing (AQP) are essential for dealing with massive data. They are often the only means of providing interactive response times when exploring massive datasets, and are also needed to handle high speed data streams. These methods proceed by computing a lossy, compact synopsis of the data, and then executing the query of interest against the synopsis rather than the entire dataset. We describe basic principles and recent developments in AQP. We focus on four key synopses: random samples, histograms, wavelets, and sketches. We consider issues such as accuracy, space and time efficiency, optimality, practicality, range of applicability, error bounds on query answers, and incremental maintenance. We also discuss the tradeoffs between the different synopsis types.
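As a rough illustration of running a query against a synopsis rather than the full dataset, the following minimal sketch uses a uniform random sample as the synopsis and estimates a filtered SUM from it. The function names `build_sample` and `approx_sum`, the synthetic data, and the sample size are all assumptions made for illustration; they are not taken from the monograph. The error bar is a simple normal approximation that ignores the finite-population correction.

```python
import math
import random


def build_sample(data, k, seed=0):
    """Build the synopsis: a uniform random sample of k rows, without replacement."""
    rng = random.Random(seed)
    return rng.sample(data, k)


def approx_sum(sample, n, predicate=lambda x: True):
    """Estimate SUM(x) over rows matching `predicate` in a dataset of n rows.

    Returns (estimate, half_width), where half_width is an approximate
    95% confidence half-width based on the central limit theorem.
    """
    k = len(sample)
    # Rows that fail the predicate contribute 0 to the sum.
    contrib = [x if predicate(x) else 0.0 for x in sample]
    mean = sum(contrib) / k
    var = sum((c - mean) ** 2 for c in contrib) / (k - 1)
    estimate = n * mean  # scale the sample mean up to the full dataset
    half_width = 1.96 * n * math.sqrt(var / k)
    return estimate, half_width


# Synthetic "massive" dataset (hypothetical): 100,000 values in [0, 100).
rng = random.Random(1)
data = [rng.uniform(0, 100) for _ in range(100_000)]

# The synopsis is 1% of the data.
sample = build_sample(data, k=1_000)

# Query: SELECT SUM(x) WHERE x > 50, answered exactly and approximately.
exact = sum(x for x in data if x > 50)
estimate, half_width = approx_sum(sample, len(data), lambda x: x > 50)

print(f"exact={exact:.0f} estimate={estimate:.0f} +/- {half_width:.0f}")
```

The approximate answer touches only 1,000 rows instead of 100,000, trading a bounded amount of error for speed. Histograms, wavelets, and sketches make the same trade with different synopsis structures.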
31 December 2011
Research Article
Synopses for Massive Data: Samples, Histograms, Wavelets, Sketches
Graham Cormode
AT&T Labs - Research, 180 Park Avenue, Florham Park, NJ 07932, USA
Minos Garofalakis
Technical University of Crete, University Campus - Kounoupidiana, Chania, 73100, Greece
Peter J. Haas
IBM Almaden Research Center, 650 Harry Road, San Jose, CA 95120-6099, USA
Chris Jermaine
Rice University, 6100 Main Street, Houston, TX 77005, USA
Online ISSN: 1931-7891
Print ISSN: 1931-7883
© 2012 G. Cormode, M. Garofalakis, P. J. Haas and C. Jermaine
Licensed re-use rights only
Foundations and Trends in Databases (2011) 4 (1-3): 1–294.
Citation
Cormode G, Garofalakis M, Haas PJ, Jermaine C (2011), "Synopses for Massive Data: Samples, Histograms, Wavelets, Sketches". Foundations and Trends in Databases, Vol. 4 No. 1-3 pp. 1–294, doi: https://doi.org/10.1561/1900000004