Panel Speakers: How is Data Science Organized?
Each year at JSM, the NISS Affiliates gather over lunch for facilitated discussion and a chance to meet new colleagues and catch up. This year’s luncheon was very well attended. The topic was how data science groups are organized within industry and within institutes of data science in academia. Affiliates from all three sectors (academic, industry, and government) got a chance to hear from a number of dynamic speakers.
“Statisticians need to stand up for our important contribution to data science, and not allow, if at all possible, for computer science to define what data science is, or control committees that establish data science programs.”
Joel Dubin, Affiliate Luncheon Moderator
James Rosenberger, NISS Director, welcomed everyone and introduced Joel Dubin, the session organizer and moderator for the event. Joel first introduced the industry panel speakers, Ming Li from Amazon and Souvik Ghosh from LinkedIn. The speakers focused on their training needs in data science. Ming Li’s talk addressed data science in technology companies, examining the differences between a statistician and a data scientist and between a generalist and a specialist. Souvik Ghosh talked about the role of statisticians at LinkedIn, noting that some problems can be solved with tools that are not statistical in nature.
They were followed by a panel of academic speakers that included Tian Zheng from Columbia University, Roy Welsch from MIT, and Bhramar Mukherjee from the University of Michigan. Tian Zheng described how Columbia prepares students for a career in data science. She introduced perhaps the most memorable concept of the session: the ultimate data scientist as the ‘mythical unicorn!’ Her point was that while many curricula in Statistics prepare students for some of the skills needed to create the data science workflow, data science job descriptions often list many other skills that no single person possesses. Roy Welsch talked about the MIT Institute for Data, Systems, and Society and the wide array of skills required to solve current data problems in settings such as retail, technology, power systems, and social media. Lastly, Bhramar Mukherjee talked about the ‘Battle of Two Cultures: Statistics vs Computer Science.’ She argued for expanding the boundaries of how statisticians work to solve problems. Her challenge included the words of Gertrude Cox:
“With our scientific backgrounds, we should spend most of our time seeking out the new, the underdeveloped, the unexplored or even the dangerous areas. It is one of the challenges of the statistical universe that, as new regions are discovered and developed, the horizon moves further away.”
- Gertrude M Cox (1957)
Lively discussion ensued after each of these panel presentations. Participants acknowledged that the distinction between statistician and data scientist will be debated and discussed for some time to come.
“Listening to the industry and academic representatives discuss data science confirmed my view that data science requires statisticians, computer scientists, and data base specialists to work together. The unusual data scientist with all of these skills embodied in one person, we learned, is like the mythical unicorn.”
James Rosenberger, NISS Director
Affiliate Certificates
During the event, Jim Rosenberger also handed out Affiliate Certificates to each of the institutions that have signed on as affiliates of NISS. This is something new that NISS is doing, and participants appreciated the acknowledgement. The idea is for affiliates to bring the certificate home and display it so that colleagues can recognize their institution’s involvement with NISS.