‘Big Data’ is an up-and-coming field. No wonder Carleton is hosting a ‘Big Data’ event on April 24 to celebrate its more than 130 researchers who are focused on this area. The university is also in the process of creating a new collaborative master’s program in data science.

PhD student David Robillard is one of those researchers.

Collaborating with a team in Carleton’s Parallel Computing & Data Science Research Lab, he is working on a cloud-based system for Online Analytical Processing, or OLAP.

OLAP is a way of analyzing multi-dimensional data like sales figures or student grades. For example, one might be interested in the average GPA of students at Carleton with respect to many different “dimensions” like age, country of birth, department, and so on.
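To make the idea concrete, here is a minimal sketch of that kind of query in Python. It is not the lab's system: the records, field names, and the `average_gpa` helper are all hypothetical and invented for illustration, and a real OLAP engine would work on far larger data with specialized index structures.

```python
from collections import defaultdict

# Hypothetical student records; every value here is invented for illustration.
students = [
    {"department": "Computer Science", "country": "Canada", "gpa": 9.0},
    {"department": "Computer Science", "country": "France", "gpa": 11.0},
    {"department": "Biology", "country": "Canada", "gpa": 10.0},
    {"department": "Biology", "country": "Canada", "gpa": 8.0},
]


def average_gpa(records, dimensions):
    """Group records by the given dimensions and return the mean GPA per group."""
    totals = defaultdict(lambda: [0.0, 0])  # group key -> [sum of GPAs, count]
    for record in records:
        key = tuple(record[d] for d in dimensions)
        totals[key][0] += record["gpa"]
        totals[key][1] += 1
    return {key: gpa_sum / count for key, (gpa_sum, count) in totals.items()}


# The same measure (average GPA) viewed along different dimensions.
by_department = average_gpa(students, ["department"])
by_department_and_country = average_gpa(students, ["department", "country"])
```

Each call scans the current records directly, so the answer reflects whatever the data looks like at query time, at the cost of doing the aggregation work on every query.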

Explains Robillard: “The traditional approach to this problem pre-calculates the necessary data, which forces analysts to work with out-of-date information. This pre-calculated data must also be stored, which becomes increasingly difficult and expensive as data grows. Our research eliminates this pre-calculation, reducing system requirements and allowing analysts to perform OLAP queries directly on current data as it changes.”

Because the system is designed specifically for the cloud, it brings the benefits of cloud-based computing to the OLAP workloads that are important for many large organizations.

Robillard chose to pursue his PhD in Computer Science at Carleton primarily because “Prof. Frank Dehne’s research suits my desire to work on problems that have an interesting theoretical component, but are also practical and rooted in strong implementation.”

He also did his master’s at Carleton, taking a break afterwards to work on other projects. He is active in the Free/Open Source Software community, where he authors and maintains several projects related to audio and the Semantic Web. This year he is presenting at the Linux Audio Conference in Karlsruhe for the first time, after working in that community for many years.

Robillard says that he is interested in how applicable his research on cloud-based OLAP is to Semantic Web workloads. The Semantic Web provides a common framework that allows meaningful data to be shared and reused across application, enterprise, and community boundaries.

“Like OLAP, the Semantic Web has huge potential for performance improvements,” says Robillard. “My project ‘Serd’ raised the bar for RDF read and write performance, and it would be great to leverage our OLAP work to similarly improve query speed.” RDF (Resource Description Framework) is the data model of the Semantic Web.

For the future, Robillard says only that there are many interesting and important problems in the world. “I only hope to be able to work on some of them and have my contributions make a positive impact on the world.”

Thursday, April 10, 2014