Understanding Data Science: Two Recent Studies

If you need a deeper understanding of data science than Drew Conway's popular Venn diagram model or Josh Wills' tongue-in-cheek characterization ("Data Scientist (n.): Person who is better at statistics than any software engineer and better at software engineering than any statistician"), two relatively recent studies are worth reading.

'Analyzing the Analyzers,' an O'Reilly e-book by Harlan Harris, Sean Patrick Murphy, and Marck Vaisman, suggests four distinct types of data scientists -- effectively personas, in a design sense -- based on analysis of self-identified skills among practitioners. The scenario format dramatizes the different personas, making what could be a dry statistical readout of survey data more engaging. The survey-only nature of the data, the restriction of scope to just skills, and the suggested models of skill profiles make this feel like the sort of exercise that data scientists undertake as an everyday task: collecting data, analyzing it using a mix of statistical techniques, and sharing the model that emerges from the data mining exercise. That's not an indictment, simply an observation about the consistent feel of the effort as a product of data scientists, about data science.

The paper 'Enterprise Data Analysis and Visualization: An Interview Study,' by researchers Sean Kandel, Andreas Paepcke, Joseph Hellerstein, and Jeffrey Heer, considers data science within the larger context of industrial data analysis. It examines analytical workflows, skills, and the challenges common to enterprise analysis efforts, and identifies three archetypes of data scientist. Because this is an interview-based study, the data the researchers collected is richer, and there's correspondingly greater depth in the synthesis. The scope of the study included a broader set of roles than data scientist (enterprise analysts) and involved questions of workflow and organizational context for analytical efforts in general. I'd suggest this is useful as a primer on analytical work and workers in enterprise settings for those who need a baseline understanding; it also offers some genuinely interesting nuggets for those already familiar with discovery work.

We've undertaken a considerable amount of research into discovery, analytical work and workers, and data science over the past three years -- part of our programmatic approach to laying a foundation for product strategy and highlighting innovation opportunities -- and both studies complement and confirm much of the direct research into data science that we have conducted. There were a few important differences in our findings, which I'll share and discuss in upcoming posts.

Category: Language of Discovery, User Research