In many applications, data is generated at such a fast pace that storing it is no longer desirable, or even possible. Typical examples of such data streams include Internet traffic data and continuous sensor readings. Nevertheless, these streams often need to be analyzed, and in many cases the results of the analysis must be reported immediately. Think, for example, of a network being monitored for unusual traffic loads in order to detect attacks; any approach that does not report suspicious patterns immediately is simply useless. Traditional data mining approaches are not suitable for mining such streams because they assume static data stored in a database. At first sight the situation seems hopeless: even simple tasks, such as counting the number of distinct items in a stream or identifying the most frequent items, already become impossible. In my presentation I will show that the situation is not that bad after all. I will survey three surprisingly elegant and simple algorithms for answering important queries over streaming data.
Tuesday 11 January 2011 at 00:00
HG 6.29
If you have any questions or comments, contact the organizing party or the board (cib@gewis.nl). Have fun!