THE 4 C’S THAT MAKE A MODERN DAY DATA SCIENTIST


By Neeraj Kulkarni

Nate Silver has become a rock star. Data scientist has become a buzzword. The title itself has been butchered because it lacks specificity and can be perceived as a glorified synonym for data analyst. Regardless, I see enterprises, business pundits and “pseudo” data-savvy people use (or more like abuse) it all the time. There are a zillion descriptions online, and every organization is trying to hire data scientists in every shape and form. Funnily enough, on a recent client engagement, the CMO asked me, “Do you think it’s time for us to start looking at Silicon Valley to hire a data scientist to help our analytics team?” Having practiced analytics for over 15 years and held various titles during that time, I found that question really fascinating. I believe there is so much hype, and so much variety in how the role is described, that marketers are left confused, thinking that they don’t have people with that kind of talent in their organizations, which may or may not be true.

I have described here what I feel are the 4 essential qualities of the modern day data scientist. Hopefully this will help marketers identify this talent closer to home rather than in Silicon Valley (unless they are already based in Silicon Valley… duh!).

 

a) Curiosity

“Intellectual growth should commence at birth and cease only at death.”
  – Albert Einstein

Curiosity killed the cat. But not the data cat; it actually made him/her stronger. Data analysts will take a request, implement it, and deliver the results that they arrive at using some statistical techniques with a certain degree of confidence. A data scientist will first start with a data/business discovery discussion to understand the essential question (and the underlying context and challenge) and then interrogate the data with the end goal in mind. It is that underlying curiosity to research and learn more about the business problem that helps them come up with the most effective solution. It helps them identify the right data sets and the right variables, and deliver the right insights, which are relevant, timely and meaningful for the C-suite to implement.

 

b) Coding

If you can’t code then you can’t be a data scientist. It’s as simple as that. I have found the KDNuggets and IBM blogs very informative for keeping abreast of new techniques in data mining and marketing decision sciences. It’s important for a data scientist to have a learning mindset and to pick up new skills, such as open source languages like R and Python, in their free time through Coursera or Udacity to get better at their trade. That said, I have to emphasize that it is not how efficient you are as a coder (it helps if you are) that will make you a great data scientist. It is how effective and analytical you are in attacking a particular problem to reach your end goal and arrive at actionable insights. That is what differentiates a data scientist from someone who is just a coder/programmer.

 

c) Causality

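[Figure: chart of two correlated trends, the marriage rate in Alabama and deaths by electrocution from power lines]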

The 91% correlation between the two trends does not imply that people marrying in the state of Alabama are generally unhappy with their marriages and commit suicide by electrocuting themselves on power lines. Spurious correlations can lead to inaccurate conclusions. This is true of today’s marketers who try to understand the value of their marketing efforts by correlating the contributions of individual channels rather than looking at their marketing ecosystem holistically. Having good data scientists will help you avoid this pitfall. Good data scientists have solid knowledge of statistics, probability theory and statistical software. They have the ability to understand the business problem and then pick the right set of variables that can help answer it. Further, they can combine data and human knowledge to derive insights which are not only statistically significant but also have causal significance in predicting the outcome.
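To make the point concrete, here is a minimal sketch in Python of how a spurious correlation can arise. It uses made-up numbers, not the Alabama data: two series are generated independently of each other, but both happen to drift downward over time. The names (pearson, first_differences, series_a, series_b) and all parameters are illustrative assumptions.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x ** 0.5 * var_y ** 0.5)


def first_differences(xs):
    """Year-over-year changes, which strip out a shared linear trend."""
    return [later - earlier for earlier, later in zip(xs, xs[1:])]


rng = random.Random(0)  # fixed seed so the run is repeatable
years = range(20)

# Two hypothetical, unrelated series. Each has its own independent noise,
# but both decline steadily over the 20 "years".
series_a = [100 - 2 * t + rng.gauss(0, 1) for t in years]
series_b = [50 - 1 * t + rng.gauss(0, 1) for t in years]

# Correlating the raw levels mostly measures the shared downward drift.
level_corr = pearson(series_a, series_b)

# Correlating the year-over-year changes removes that drift.
diff_corr = pearson(first_differences(series_a), first_differences(series_b))

print(f"correlation of levels:  {level_corr:.2f}")
print(f"correlation of changes: {diff_corr:.2f}")
```

The raw levels come out strongly correlated simply because both series trend in the same direction, while the year-over-year changes show a much weaker relationship. Differencing is only one simple check; it does not establish causality, which still takes business context and judgment.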

 

d) Communication

Good data scientists have the ability to communicate their insights in a simple, visual and easy-to-understand manner. It’s not enough to just have the technical chops; a data scientist must be able to effectively explain how he or she came to a specific conclusion and convince the internal or external customer that the results should be leveraged. The real “aha” moment for me in a presentation is when a non-technical person in the audience looks at a chart and draws an insight which may not even have been called out on the slide. Instead of telling a marketer that 4500 customers in their database have only transacted once, good data scientists call out the fact that 70% of the customers have only transacted once. Through effective storytelling, they provide context for the insights and create buy-in for the data solution to be implemented.

In my opinion, these are the 4 essential qualities of a good data scientist. You could argue that expertise in text mining, Hadoop, MongoDB, etc. is important and go on to list 40 more attributes. But to me those things are part of the learning ethic of a good data scientist, skills they will train for and acquire as the business need arises. You can find your data scientists right where you are and don’t always need to run to Silicon Valley. If you don’t trust me, at least trust Einstein, and he wasn’t even called a data scientist.