Wednesday, June 12, 2013

Data Scientists . . . or Data Wannabes?

Ladies and gentlemen of the analytic community – we have an epidemic. A seed has been planted in the minds of quantitative professionals that has sprouted and will not subside. The more we hear about Big Data, the more we hear about ‘Data Scientists.’ And as with any hot job, once the media starts to speculate on salary, candidates go nuts. As Scott Gnau, the President of Teradata, said in a recent article about careers in Big Data, “There are a lot of people who can spell Hadoop and put it on their resume and call themselves data scientists.”

Gnau makes a plea to get the term ‘Data Scientist’ defined, and I could not agree more. When a national article claims that one of these workers can make $200K+ with a couple of years of experience, you can bet everybody who works with data will start calling themselves a Data Scientist. I have already seen it start to happen and thought I would offer my two cents on the reality of this highly coveted (and grossly overused) title.

The compensation issue is a tough one to tackle. Probably the most common question I get asked as a recruiter, by both candidates and clients, is about salary. While it’s true that careers in marketing analytics are in demand and lucrative, compensation depends on multiple factors, including years of experience, software familiarity, advanced degrees, pedigree of school, preferred location, and many more. So when Rob Bearden, the CEO of Hortonworks, says that a “qualified data analyst” coming right out of school can make $125K, note that this is a very specific candidate with a very impressive background. To name just a few attributes of the perfect candidate, data scientists typically have:
  • A PhD in Computer Science and advanced degrees in other highly quantitative disciplines like Mathematics  
  • Knowledge of MapReduce tools like Hadoop
  • Experience with tools such as Python, R, Java, Hive, and Pig
  • Prior work or internship experience working with enormous, unstructured data sets

These are just general guidelines, of course. To be sure, a candidate with a Master’s in Statistics can be a Data Scientist, but this candidate will likely not command as high a salary as someone with two PhDs and Google on his resume. It might help to think of Data Scientists as a unique subset of the Big Data analytics profession. Just as an IT professional who works with Big Data is not necessarily a Data Scientist, neither is a marketing analyst who builds predictive models.

Allow me to reiterate what I’m sure you have already heard countless times in the media: careers in Big Data are in demand and continue to grow. However, not every person with a quantitative background or experience in analytics is a real Data Scientist. The tools they are using are very new, and many companies looking to take advantage of Big Data do not have a previously established foundation of Data Scientists to provide road maps for less experienced professionals. Likewise, innovative companies like Hortonworks that work to support tools such as Hive and HBase can and must offer workers competitive salaries to ensure that the use of these systems is further developed.

For now we will continue to see ‘Data Scientist’ placed on resumes to catch the attention of hiring managers, and hiring managers will continue to shell out unprecedented resources to attract talent. Rightfully so – these guys are one in a million.