Nand Kishor Contributor

Nand Kishor is the Product Manager of House of Bots. After finishing his studies in computer science, he ideated and relaunched Real Estate Business Intelligence Tool, where he created one of the leading business intelligence tools for property price analysis in 2012. He also writes about, researches, and shares knowledge on Artificial Intelligence (AI), Machine Learning (ML), Data Science, Big Data, the Python language, and related topics.

The Data Science Diversity Gap

Jan 8, 2018

How diverse will a lucrative, growing field like data science be in the future?

Will it end up like computer science today (not very diverse) or computer science a few decades ago (much more so)?  One way to prognosticate the future demographic composition of data science is to look at who is studying data science and its prerequisite skills today. For data science, the results are not encouraging.

A recent article in Forbes notes, "Women hold only about 26% of data jobs in the United States. There are a few proposed reasons for the gender gap: a lack of STEM education for women early on in life, lack of mentorship for women in data science, and human resources rules and regulations not catching up to gender balance policies, to name a few." 

Moreover, federal civil rights data further demonstrate that "black and Latino high school students are being shortchanged in their access to high-level math and science courses that could prepare them for college" and for careers in fields like data science.

Just how diverse is data science? More specifically, if we look at the study of data science as a predictor of future participation in the field, what is the gender and demographic breakdown of its students compared to other fields?

We analyzed data from Priceonomics customer General Assembly, an education company that trains students in data science and other technical fields. The data comes from their part-time programs, which typically reach students who already have jobs and are looking to expand their skill set as they pursue a promotion or a career shift. Here's what we found:

While great strides toward gender parity have been made in fields like web development and user experience (UX) design, data science, a relatively new concentration, still has a ways to go.

Of all the technical education fields we studied, data science had the lowest representation of female students, at just 35.3%.

Additionally, among these same technical fields, data science had the lowest percentage of African American and Latino/Hispanic students enrolled.

Gender and Data Science
For our analysis, we went through five months' worth (September 2016 through January 2017) of anonymized enrollment data for part-time General Assembly students (those enrolled in 10- to 12-week evening courses). We chose to focus on part-time data (rather than the full-time program) because the sample size was larger, though the results would be similar.
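As a rough illustration of how a breakdown like this could be tabulated from anonymized enrollment records, here is a minimal Python sketch. The record layout, the `share_by_course` helper, and the toy rows are all hypothetical and are not General Assembly's data or the method actually used for this analysis.

```python
from collections import Counter, defaultdict


def share_by_course(records, field):
    """Return {course: {value: percent}} for one demographic field.

    `records` is assumed to be a list of dicts with a "course" key and
    the requested `field` (a hypothetical layout, for illustration only).
    """
    counts = defaultdict(Counter)
    for row in records:
        counts[row["course"]][row[field]] += 1

    shares = {}
    for course, counter in counts.items():
        total = sum(counter.values())
        shares[course] = {
            value: round(100 * n / total, 1) for value, n in counter.items()
        }
    return shares


# Made-up toy rows, NOT real enrollment data.
toy_records = [
    {"course": "Course A", "gender": "female"},
    {"course": "Course A", "gender": "male"},
    {"course": "Course A", "gender": "female"},
    {"course": "Course B", "gender": "male"},
]

toy_shares = share_by_course(toy_records, "gender")
```

The same helper could be pointed at a race/ethnicity or education field, as long as each record carries that key.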

First, let's take a look specifically at the gender breakdown of students in these courses.

Data source: General Assembly

On average, part-time courses skew more female (56.5%) than male (42.3%).

Some courses, like Product Management and Data Analytics, seem to come close to gender parity. Front-End Web Development falls right around the average across all courses, and in Digital Marketing and User Experience Design, both more consumer-facing fields, two-thirds or more of the students are women.

But the Data Science course shows the largest share of male students and the lowest share of female students, at just 35.3%.

Race and Ethnicity in Data Science

Turning to the same anonymized data set, let's now look at race and ethnicity.

Data source: General Assembly

At first glance, it appears that Data Science courses fare pretty well in diversity: The percentage of enrolled students who are white (46.1%) is less than average (46.9%).

But looking specifically at Hispanic/Latino and African-American students, the course has by far the lowest total percentage of students.

Data source: General Assembly

To put this data in context, the population of the United States is 62% white, 17% Hispanic or Latino, 12% African-American, and 6% Asian/Pacific Islander.

Just 11.8% of part-time Data Science enrollees are Hispanic/Latino or African-American. That's 5.7 percentage points below the overall average, and nearly half the figure in the Front-End Web Development courses.
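To make the comparison concrete, the short Python sketch below works only from the figures quoted above. It assumes the 5.7 figure is a gap in percentage points (the text does not say so explicitly), and the variable names are illustrative.

```python
# Figures quoted in the article.
ds_share = 11.8          # % of Data Science enrollees who are Hispanic/Latino or African-American
gap_points = 5.7         # stated gap to the overall average (assumed to be percentage points)
us_hispanic_latino = 17  # % of U.S. population
us_african_american = 12 # % of U.S. population

# Overall course average implied by the stated gap.
implied_average = round(ds_share + gap_points, 1)

# Combined U.S. population share for the two groups, and the gap to it.
us_share = us_hispanic_latino + us_african_american
gap_vs_population = round(us_share - ds_share, 1)

# The same shortfall expressed relative to the implied average.
relative_shortfall = round(100 * gap_points / implied_average, 1)

print(implied_average, us_share, gap_vs_population, relative_shortfall)
# prints: 17.5 29 17.2 32.6
```

Under that assumption, the all-course average would be about 17.5%, which is itself well below the 29% combined share of the U.S. population.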

Education in Data Science
This data set also gives us insight into the highest level of education attained by part-time enrollees across these courses.

On average, Data Science students come in with the highest degree attainment.

Data source: General Assembly

Across all courses, 85.4% of these part-time students have a bachelor's degree or higher; in Data Science, that figure is 93.8%. This seems to be driven largely by the fact that there are far more master's and Ph.D. graduates in Data Science (37.7%) than the overall average (24.%). A surprisingly high 3.7% of students hold a Ph.D., more than triple the average of 1.2%.
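The arithmetic behind these comparisons can be checked with a few lines of Python. This sketch uses only the complete percentages quoted above; the variable names are illustrative.

```python
# Figures quoted in the article.
bachelors_all = 85.4  # % with a bachelor's degree or higher, all courses
bachelors_ds = 93.8   # % with a bachelor's degree or higher, Data Science
phd_all = 1.2         # % holding a Ph.D., all courses
phd_ds = 3.7          # % holding a Ph.D., Data Science

# Gap in bachelor's-or-higher attainment, in percentage points.
bachelors_gap = round(bachelors_ds - bachelors_all, 1)

# How many times more common a Ph.D. is among Data Science students.
phd_ratio = round(phd_ds / phd_all, 2)

print(bachelors_gap, phd_ratio)
# prints: 8.4 3.08
```

A ratio of about 3.08 is consistent with the "more than triple" description.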

Data Science seems to draw from a smaller, more specialized pool, which could, in part, perpetuate diversity issues.

Data Science Is Still New
Female and minority students have made positive strides in coding and tech education in this data set.

When coding and web development started getting increasingly popular two decades ago, the fields were almost entirely dominated by men, most of whom were white.

Looking at the data here, though, it's clear things have changed dramatically: Front-End Web Development courses are now 57% female and boast the highest percentage of students of color of any course. Since data science is still a relatively new field, it is possible things may just take some time to equalize, but it's entirely possible they won't unless the issue is addressed directly.

Source: Forbes