Seminar on Data Stream Mining
Winter Semester 2010/2011
Dr. Rainer Gemulla
Dr.-Ing. Sebastian Michel
The seminar has moved from Thursday to Friday, 16:15-18:00.
For the schedule, see below.
- Place: Seminarraum 001, Building E1.7 (New MMCI Cluster of Excellence Building)
- Time: Friday, 16:15-18:00 (updated)
- Attend all talks - not just your own. We will keep track of participation! If you are sick, please let us know in advance by sending a short e-mail.
- Read your papers and other related literature.
- Contact your tutor at least 3 weeks before your talk and present a brief draft of your intended talk.
- Prepare a 45-minute talk about your topic that introduces the subject to your fellow students. This is about twice the length of a conference talk, so there should be enough time to present some background information on the topic. Try to pick the most interesting, challenging or futuristic contribution(s) from the paper. You are very welcome to discuss any potential weaknesses or problems of the paper(s) in your talk. If you are unsure about what to present, ask your tutor. Note that, even though the conference slides of some papers are available on the Web, we expect you to prepare your own slides (which may, of course, be inspired by the original slides).
You must send your slides to your tutor and discuss them with him or her by 4pm on the Friday before your talk at the latest; otherwise your talk will be cancelled (this is a hard deadline).
- Both the slides and the presentation itself must be in English. Otherwise, some students will not be able to follow all talks, and following them is one of the main purposes of the seminar. After the presentations, there will be a discussion in which all fellow students are encouraged to ask questions. We will keep track of your participation (i.e., whether you ask questions) and, of course, of the presenter's answers.
- For each talk, a second student will be preselected as an opponent. His or her role is to prepare tough questions to challenge the paper presented in the talk (not the talk itself or the speaker!). To make life a little easier, the preliminary version of the slides will be sent to the opponent on the Friday before the talk. However, as interaction is an important part of science, we expect every participant to take an active part in the discussions.
- Two weeks after the talk, the presenter and the opponent together have to submit a short (usually not longer than 5 pages) summary of the topic of the talk. The focus of this report should be on pointing out strengths and weaknesses of the approach presented in the paper(s), not just summarizing the paper(s).
- After your talk, there will be another meeting with your tutor and Rainer and/or Sebastian to give feedback on the talk and the report.
- In summary, your final grade will be influenced by the following components: your oral presentation, your knowledge of your topic (your answers to questions after the presentation), the questions you ask as opponent, your general participation in the seminar, and your two written reports (one in the role of presenter, one in the role of opponent).
Recent topics on data mining over streaming data.
(The order is subject to change.)
- 19.11.: Statistics
Maintaining stream statistics over sliding windows (SODA02)
Mayur Datar, Aristides Gionis, Piotr Indyk, Rajeev Motwani
Speaker: Andreas Frische
Opponent: Jianan Ma
Tutor: Rainer Gemulla
- 26.11.: Skylines
Categorical Skylines for Streaming Data (SIGMOD08)
Nikos Sarkas, Gautam Das, Nick Koudas, Anthony K. H. Tung
Speaker: Frederic Raber
Opponent: Wenkai Dai
Tutor: Sebastian Michel
- 03.12.: Top-K
Ad-hoc top-k query answering for data streams (VLDB07)
Gautam Das, Dimitrios Gunopulos, Nick Koudas, Nikos Sarkas
Speaker: Isha Khosla
Opponent: Ugur Kira
Tutor: Sebastian Michel
- 10.12.: Sampling
Optimal sampling from sliding windows (PODS09)
Vladimir Braverman, Rafail Ostrovsky, Carlo Zaniolo
Speaker: Djeukem Djoumbou
Opponent: Frederic Raber
Tutor: Rainer Gemulla
- 17.12.: Sketching (Canceled)
An Improved Data Stream Summary: The Count-Min Sketch and its Applications (JA05)
Graham Cormode, S. Muthukrishnan
- 07.01.: Frequent items
Finding frequent items in data streams (VLDB08)
Graham Cormode, Marios Hadjieleftheriou
Speaker: Aliaksandr Talaika
Opponent: Andreas Frische
Tutor: Rainer Gemulla
- 14.01.: Patterns
Efficient Pattern Matching over Event Streams (SIGMOD08)
Jagrati Agrawal, Yanlei Diao, Daniel Gyllstrom, Neil Immerman
Speaker: Ugur Kira
Opponent: Djeukem Djoumbou
Tutor: Rainer Gemulla
- 21.01.: Nearest Neighbors (Canceled)
Continuous nearest neighbor queries over sliding windows (TKDE07)
Kyriakos Mouratidis, Dimitris Papadias
- 28.01.: Sketching
An Improved Data Stream Summary: The Count-Min Sketch and its Applications (JA05)
Graham Cormode, S. Muthukrishnan
Speaker: Luciano Del Corro
Opponent: Sallam Abualhaija
Tutor: Rainer Gemulla
- 04.02.: Distributed clustering
Conquering the divide: Continuous clustering of distributed data streams (ICDE07)
G. Cormode, S. Muthukrishnan, W. Zhuang
Speaker: Wenkai Dai
Opponent: Aliaksandr Talaika
Tutor: Rainer Gemulla
- 11.02.: Time series
Managing Massive Time Series Streams with MultiScale Compressed Trickles (VLDB09)
Galen Reeves, Jie Liu, Suman Nath, Feng Zhao
Speaker: Sallam Abualhaija
Opponent: Luciano Del Corro
Tutor: Sebastian Michel