Close your eyes and try to visualize “data”. What do you see? Let me guess: a white background and black numbers in a serif font. Possibly a table organizing those numbers. It looks like a bank statement, a telephone bill or, worse, just ones and zeroes. The good news is that it gets much more interesting than that.
You all know the story: data science is all the hype, and there is more and more “big data”. But other than peeking into your bank accounts for credit scores, or into your e-mails to determine your terrorist affiliations, what do data scientists do? Anything cool enough for a retweet?
It is true that when working with data we usually end up looking at boring screens with a lot of numbers around, but the repercussions of “data” echo much wider. Here are a few examples of what “data” can mean, and what a scientist can beat out of it on a nice day.
Marketing people know the story: Walmart looked at its data and figured out that beer sells well with diapers. That is the old story. Nowadays the same people write computer programs that take into account your local weather, your click stream, your site usage and your previous purchases to predict the next product you will buy. Then they use that information to send out a personalized offer. Ever had that moment where you saw a scary offer e-mail or online ad and said “they’re watching me!”? They are. It’s just that they’re not human.
In fact, the large players are in the business of deliberately lowering the accuracy of their targeting algorithms, since people tend to get scared and stop using the service.
When you upload a photo to Facebook, do you ever marvel at how Facebook predicts who the people in that picture are? That’s also data science. Facebook’s machines know which people have a greater probability of co-occurring in the same photo, and also run face recognition on the imagery to identify the people. In the end, an image is also data! You can see your Hollywood selfie, but a machine sees a large matrix of numbers representing colors.
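To make the “matrix of numbers” idea concrete, here is a minimal sketch in plain Python. The tiny 2×2 image and the `to_grayscale` helper are made up for illustration; real photos have millions of pixels, and real systems use far more sophisticated processing than a channel average.

```python
# A hypothetical 2x2 "photo": each pixel is a (red, green, blue) triple,
# with every channel an integer between 0 and 255.
image = [
    [(255, 0, 0), (0, 255, 0)],      # a red pixel, a green pixel
    [(0, 0, 255), (255, 255, 255)],  # a blue pixel, a white pixel
]

def to_grayscale(img):
    """Collapse each pixel to one number by averaging its three channels."""
    return [[sum(pixel) // 3 for pixel in row] for row in img]

height, width = len(image), len(image[0])
gray = to_grayscale(image)

print(f"{height}x{width} image, as the machine sees it:")
for row in gray:
    print(row)
```

Everything a program does with a picture, from cropping to recognizing a face, starts from arithmetic on a grid of numbers like this one.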
A data scientist has the tools and tricks to predict the species of a bird from a recording of its chirps, classify the morphology of galaxies from telescope imagery, or infer your feelings on politics from your Twitter content and use that to predict election outcomes. When a high-end SUV predicts that you’re in trouble or gives recommendations based on your driving performance, that’s data science. When the Oakland Athletics used sabermetrics to assemble a winning team, they relied on, again, data science.
To see data, you don’t have to close your eyes. But close them again, and imagine yourself walking down the busiest street in your city. Smartphones, security cameras, cars, cash registers, city lights, bus tickets and Wi-Fi signals are just some of the more readily available, obvious generators of “big data”.
This is where we start.