In many applications data is generated at such a fast pace that storing it is no longer desirable, or is even impossible. Typical examples of such data streams include Internet traffic data and continuous sensor readings. Nevertheless, there is often a need to analyze these streams, and in many cases the analysis results must be reported immediately. Think, for example, of a network being monitored for unusual traffic loads in order to detect network attacks; any approach that does not immediately report suspicious patterns is simply useless. Traditional data mining approaches are not suitable for mining such streams, because they assume static data stored in a database. At first sight, the situation seems hopeless; even simple tasks such as counting the number of distinct items in a stream or identifying the most frequent items become impossible. In my presentation I will show that the situation is not that bad after all. I will survey three surprisingly elegant and simple algorithms for answering important queries over streaming data.
Tuesday, January 11, 2011 at 12:00 AM
HG 6.29
Please contact the organizing party or the board (cib@gewis.nl) with any questions or concerns. Have fun!