Anastasia Ailamaki

École Polytechnique Fédérale de Lausanne (EPFL)


Title: Querying and Exploring Big Brain Data

Today's scientific processes heavily depend on fast and accurate analysis of experimental data. Scientists are routinely overwhelmed by the effort needed to manage the volumes of data produced either by observing phenomena or by sophisticated simulations. As database systems have proven inefficient, inadequate, or insufficient to meet the needs of scientific applications, the scientific community typically uses special-purpose legacy software. With the exponential growth of dataset size and complexity, however, application-specific systems no longer scale to efficiently analyse the relevant parts of their data, thereby slowing down the cycle of analysing, understanding, and preparing new experiments. I will illustrate the problem with a challenging application on brain simulation data and will show how the problems from neuroscience translate into challenges for the data management community. I will also use the example of neuroscience to show how novel data management and, in particular, spatial indexing and navigation have enabled today's neuroscientists to simulate a meaningful percentage of the human brain. Finally, I will describe the challenges of integrating simulation and medical neuroscience data to advance our understanding of the functionality of the brain.


Anastasia Ailamaki is a Professor of Computer Science at the École Polytechnique Fédérale de Lausanne (EPFL) in Switzerland. Her research interests are in database systems and applications, and in particular (a) in strengthening the interaction between the database software and emerging hardware and I/O devices, and (b) in automating database management to support computationally demanding, data-intensive scientific applications. She has received a Finmeccanica endowed chair from the Computer Science Department at Carnegie Mellon (2007), a European Young Investigator Award from the European Science Foundation (2007), an Alfred P. Sloan Research Fellowship (2005), eight best-paper awards at top conferences (2001-2011), and an NSF CAREER award (2002). She earned her Ph.D. in Computer Science from the University of Wisconsin-Madison in 2000. She is a senior member of the IEEE and a member of the ACM, and has also been a CRA-W mentor.


Bela Stantic

Griffith University, Australia


Title: Periodic data: burden or convenience?


Periodic data are present in many application domains, ranging from manufacturing and scheduling to the medical domain. In many such domains, the huge number of repetitions, together with the possibly unknown end of the repetitions, makes the goal of explicitly storing and accessing such data very challenging. This talk explains a novel approach that copes with periodic data in an implicit way. It will be shown that the proposed concept captures the notion of periodic granularity provided by the temporal database glossary. The algebraic operators will be defined and access algorithms introduced; a proof will then be given that they are correct and complete with respect to the traditional explicit approach. Finally, results from an extensive experimental evaluation will be presented, demonstrating that the implicit representation of periodic data outperforms the explicit approach.
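The abstract does not spell out the representation itself; as a rough, hypothetical sketch of the general idea, a periodic set of intervals can be stored implicitly as a start, a duration, a period, and an optional repetition count (all names here are illustrative assumptions, not the talk's actual data structures):

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

@dataclass
class PeriodicInterval:
    """Implicitly stores the periodic intervals
    [start + k*period, start + k*period + duration) for k = 0, 1, ...
    'count' is None when the end of the repetitions is unknown."""
    start: int
    duration: int
    period: int
    count: Optional[int] = None  # None = open-ended repetition

    def contains(self, t: int) -> bool:
        """Membership test without materialising any repetition."""
        if t < self.start:
            return False
        k, offset = divmod(t - self.start, self.period)
        if self.count is not None and k >= self.count:
            return False
        return offset < self.duration

    def expand(self, window_end: int) -> List[Tuple[int, int]]:
        """The explicit (materialised) equivalent, bounded by a query window."""
        out, k = [], 0
        while self.count is None or k < self.count:
            s = self.start + k * self.period
            if s >= window_end:
                break
            out.append((s, s + self.duration))
            k += 1
        return out

# A weekly 2-hour slot starting at hour 9, repeating indefinitely.
slot = PeriodicInterval(start=9, duration=2, period=168)
assert slot.contains(9 + 168 * 5)   # falls inside the 6th repetition
assert not slot.contains(12)        # outside the 2-hour duration
assert slot.expand(window_end=400) == [(9, 11), (177, 179), (345, 347)]
```

The point of the contrast is that the implicit form answers a membership query with a single modulo computation, whereas the explicit form needs one stored tuple per repetition, which is exactly what becomes infeasible when repetitions are numerous or unbounded.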


Dr Bela Stantic is a member of the Institute for Integrated and Intelligent Systems at Griffith University, Brisbane, Australia. His area of research is the efficient management of complex data structures. He has published more than 80 conference papers, journal articles, and book chapters. He has served on more than 100 program committees and performs editorial duties for many journals in his area of research.


Martin Theobald

University of Antwerp


Title: 10 Years of Probabilistic Querying -- What Next?


Over the past decade, the two research areas of probabilistic databases and probabilistic programming have intensively studied the problem of making structured probabilistic inference scalable, but—so far—both areas have developed almost independently of one another. While probabilistic databases have focused on characterising tractable query classes based on the structure of query plans and data lineage, probabilistic programming has contributed sophisticated inference techniques based on knowledge compilation and lifted (first-order) inference. Both fields have developed their own variants of—both exact and approximate—top-k algorithms for query evaluation, and both investigate query optimisation techniques known from SQL, Datalog, and Prolog, all of which calls for a more intensive study of the commonalities and integration of the two fields. Moreover, we believe that natural-language processing and information extraction will remain a driving factor, and in fact a longstanding challenge, for developing expressive representation models that can be combined with structured probabilistic inference for decades to come.
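To make the connection between lineage structure and tractability concrete, here is a minimal sketch (not from the talk; function and variable names are hypothetical) of exact query-answer probability computation in the standard tuple-independent model, by brute-force enumeration of possible worlds:

```python
from itertools import product

def lineage_probability(lineage, probs):
    """Exact probability that a Boolean lineage formula holds, given
    independent tuple probabilities. 'lineage' maps a possible world
    (a dict from tuple id to bool) to True/False. Enumeration is
    exponential in the number of tuples, which is why probabilistic
    databases look for tractable (e.g. read-once) lineage structures."""
    tuple_ids = list(probs)
    total = 0.0
    for assignment in product([False, True], repeat=len(tuple_ids)):
        world = dict(zip(tuple_ids, assignment))
        if lineage(world):
            p = 1.0
            for t in tuple_ids:
                p *= probs[t] if world[t] else 1.0 - probs[t]
            total += p
    return total

# Lineage of a simple join answer: (x1 AND y1) OR (x1 AND y2).
probs = {"x1": 0.9, "y1": 0.5, "y2": 0.5}
f = lambda w: (w["x1"] and w["y1"]) or (w["x1"] and w["y2"])
p = lineage_probability(f, probs)

# The read-once factorisation x1 AND (y1 OR y2) gives the same value
# in linear time, illustrating why lineage structure matters.
assert abs(p - 0.9 * (1 - 0.5 * 0.5)) < 1e-9
```

Knowledge compilation, as contributed by the probabilistic-programming side, can be viewed as systematically finding such factorised forms so that the exponential enumeration above is avoided whenever the lineage admits it.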



Martin Theobald is an Associate Professor for Databases and Information Retrieval at the University of Antwerp. Before joining the ADReM research group in Antwerp in 2012, he spent four years as a Senior Researcher at the Max Planck Institute for Informatics in Saarbrücken. Between 2006 and 2008, Martin was a Post-Doc at the Stanford Infolab, where he worked on the Trio probabilistic database system. Martin obtained a doctoral degree in Computer Science from Saarland University in 2006. For his dissertation, entitled “Efficient Top-k Query Processing for Text, Semistructured, and Structured Data”, Martin received several awards, including an Honorable Mention for the ACM SIGMOD Jim Gray Doctoral Dissertation Award. Martin is currently a member of the editorial advisory board of Elsevier's Information Systems, and he has served on program committees and as a reviewer for numerous international journals, conferences, and workshops, including TODS, TKDE, VLDB-J, PVLDB, SIGMOD, SIGIR, ICDE, WSDM, and WWW.