Connected Nation - Interview: Sofia Olhede, UCL & Alan Turing Institute

Supplementary content information

Sofia Olhede is Professor of Statistics at University College London in the Department of Statistical Science. Her research interests include the theoretical underpinnings of Big Data analysis, including the analysis of network data and of heterogeneous observations in time and space.

Sofia Olhede, UCL & Alan Turing Institute

I'm Sofia Olhede. I work at UCL and I’m associated with three different departments there, which sort of reflects data science, which was sort of the topic of the panels. I'm in maths, computer science and statistical science.

The panel was all chosen from outside academia, so we have the public sector represented by Sian Thomas who comes from the Food Standards Agency, Patrick Callinan who comes from Channel 4, and then Rosie who comes from dunnhumby which is the analytics company of Tesco. They all have very different experiences of data, they all have very strong commitments to the people who use their services, or who go to Channel 4’s website. So they have got very strong ethical standpoints in terms of how they treat their data: not sharing it, not redistributing it, but only using it for the purposes that they’ve told the people visiting their websites about.

One of the interesting aspects of data science is that it's still a very ill-defined area. Everyone seems to know what Big Data is when they see it, but they can't really define it. Similarly, data science is very hot and lots of people claim to be doing it, but no-one can really say what it is. So we started to discuss: how do you hire good data scientists, given that there are not very many universities that give out degrees in data science?

So Sian talked about setting up her unit at the Food Standards Agency and how they've been getting Twitter analytics people in just to understand norovirus, and it’s a really cool application. Then Patrick Callinan talked about how Channel 4 deal with the difficulty of recruiting; and Rosie, coming from dunnhumby, doesn't have quite the same difficulty in recruiting. But they all expressed, sort of, concern about how they are going to deal with future challenges, given that they are, sort of, barely keeping up with getting the people they need.

I think the most powerful message was really that all of their stories were consistent. So all of them said that they had difficulties recruiting, they also were very excited about what they can achieve with data science. So it was a positive message, but also a slightly worrying one. How do we get more people engaged in the subject? Do things need to happen at a lower level, like secondary school, so that people have the basic skills? And something else which is ever recurring when you have discussions like this is, how do you get management in who understand enough of the analytics to, sort of, contribute to the process when they’re not directly involved?

Everyone seems convinced that getting large volumes of data together and linking data sets can create cheaper options, can make for a better society, and can increase democracy; but at the core of the challenge lies answering the technical problems we have, to enable such developments. So yes, the government can help, they can get help from data scientists to increase democracy, have more involvement in the decisions, but the technology needs to be there. And one of the things that came up was data sharing. Sian called for more efforts in helping data sharing, and I think there are a lot of technical challenges there which still haven’t been met.

It sort of has multiple resonances, so at the basic level you can think of Engineering and Technology; you can think about having the infrastructure to achieve connectedness; you can think about people living in the countryside having the same services as people who live in the city, and that's, of course, a great ideal to strive towards. But then you have the second facet of this - connected means talking. It means people from different areas of the country being able to exchange ideas and collaborate even if they're not necessarily in the same place, and there's a positive aspect of that connectedness, of being able to achieve more because you are connected.

One of the main things that connects researchers is cool ideas, which may sound a little gauche, but basically most researchers want to work on things that have the potential to be transformative; that really have the ability to push the boundaries of what we can achieve. So I think it's really about getting the message across that the things you are working on have the potential to really change what we do. But it's also about identifying problems; important problems where the solution can make a big impact on our lives. So it's not just about researchers finding new tools to solve problems.

I mean, one of the cool things about Sian was she said "well, we're making data available online because there are problems we can't solve, and we want to engage broadly with people to solve these challenges, you know, on a time scale that's realistic". So I think it's really finding that idea of what makes an important problem. It has to be technically challenging, thereby making it fun; it has to have an impact so that it matters to people; and the fundamental computer science, engineering and maths problems have to be interesting enough to engage people.

One thing that people asked me a lot about was the Alan Turing Institute, which is my new place of work. The Alan Turing Institute is the new National Data Science Institute. It's been operational just for a few months. The joint venture agreement between the parties to form the Institute was signed earlier this year, in the spring, and we're very much gathering people together to start collaborating within the remit of data science. One of the big exciting aspects is the interdisciplinary work and the potential this might bring, so we're bringing together engineers, computer scientists, mathematicians, and also social scientists who understand how people data functions, and what we can get out of it.

So we’re bringing all these people together under one roof to do research, and one of the really exciting things is that we've been running a series of workshops over the last two months, really, running until the end of the year and a little beyond, to get a notion of what people want to be working on. It’s very exciting. We're going to write the report once we’re through this process and figure out what data science is to researchers, and how we can all come together to make a dent in these sorts of problems.