Thursday, 10/01/2013 - Talks
Lecture titled: Enabling Exploratory Analysis on Very Large Scientific Data
Lecture by Dr. Themis Palpanas, on Thursday 10/1/2013 at 14:00, in the seminar room of the Computer Science building, University of Ioannina.
Abstract
In this talk, we describe iSAX 2.0 and its improvements, iSAX 2.0 Clustered and iSAX2+, three methods designed for indexing and mining truly massive collections of data series. We show that the main bottleneck in mining such massive datasets is the time taken to build the index, and we thus introduce a novel bulk loading mechanism, the first of its kind specifically tailored to a data series index. Furthermore, we observe that in several cases scientists, and data analysts in general, need to issue a set of queries as soon as possible, as a first exploratory step on the datasets. We discuss extensions of our previous techniques that adaptively create data series indexes and, at the same time, are able to correctly answer user queries. We show how our methods allow mining on datasets that would otherwise be completely untenable, including the first published experiments to index one billion data series, and experiments in mining massive data from domains as diverse as entomology, DNA and web-scale image collections.