Where do Data Science experts exist?

The amount of data created every day has increased rapidly. Many sectors are seeing this increase, including Healthcare, Retail, Information Technology, Consulting and even government organisations. The basic reason for this growth is that more people have more tools to create and share information than ever before. Consumers aren’t the only ones creating data; businesses are churning out plenty as well.

In the not-so-distant past, industries and organisations relied largely on guesswork when making crucial decisions. Big data and Data Science allow them to sift through enormous amounts of information and act with confidence in their respective industries. As the amount of available data grows, managing it becomes more difficult. To handle this ever-growing data and make sense of it, more and more data science experts are required so that organisations can make informed decisions about their businesses.

Because of this explosion over the last few years, the number of data science experts has also increased across the globe. So the question arises: if the number of data science experts is increasing over the years, where are these experts, and in what proportion?

In this post, we try to find answers using the Stack Overflow survey data from 2011–2018. Survey data for multiple years can be found here. Stack Overflow is an online technology forum with a large monthly active user base. The survey results offer insight into the general software engineering community as well as the data science community. In this analysis, I was interested in using the 2011–2018 Stack Overflow developer survey data to understand the growth of the Data Science community.

In general, the Data Science community consists of ‘Database Administrator’, ‘Business intelligence expert’, ‘Data warehousing expert’, ‘Machine learning specialist’, ‘Data Scientist’ and ‘Developer with a statistics or mathematics background’.
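As a minimal sketch of how this definition could be applied to survey responses, the snippet below flags a respondent as part of the data science community when any of their roles matches the list above. The record layout (a year plus a semicolon-separated role string) and the function names are assumptions made for illustration; the real survey files likely use different column names and label spellings from year to year.

```python
# Role labels are taken from the post; everything else here is a hypothetical layout.
DATA_SCIENCE_ROLES = {
    "Database Administrator",
    "Business intelligence expert",
    "Data warehousing expert",
    "Machine learning specialist",
    "Data Scientist",
    "Developer with a statistics or mathematics background",
}


def is_data_science(roles):
    """Return True if any semicolon-separated role is a data science role."""
    return any(role.strip() in DATA_SCIENCE_ROLES for role in roles.split(";"))


def data_science_share_by_year(responses):
    """Map each year to the fraction of respondents in the data science community.

    `responses` is an iterable of (year, roles) pairs (assumed format).
    """
    totals = {}
    matches = {}
    for year, roles in responses:
        totals[year] = totals.get(year, 0) + 1
        if is_data_science(roles):
            matches[year] = matches.get(year, 0) + 1
    return {year: matches.get(year, 0) / totals[year] for year in sorted(totals)}
```

Counting a respondent once, even when several of their roles match, keeps the yearly share comparable across surveys that allow multiple answers.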

This post analyses the proportion and trend of data science community growth across countries, industries and different sized companies around the globe.

We can therefore ask ourselves the following questions:

  1. What is the trend in Data Science community growth from 2011 to 2018?
  2. In which countries has the Data Science community grown?
  3. What is the trend in Data Science community growth in various countries over the years?
  4. In which Industries has the Data Science community grown and in what proportion?
  5. What is the trend in Data Science community growth in various industries over the years?
  6. In which companies (small, medium & large) has the Data Science community grown and in what proportion?
  7. What is the trend in Data Science community growth in different sized companies over the years?

All the answers to the above questions are based on the survey data. Let’s try to answer each of these questions one by one…

1. What is the trend in Data Science community growth from 2011 to 2018?

From the above visualization, we can observe that the Data Science community grew rapidly as a share of all software developers in recent years. It was not prominent until 2014, but from 2015 it grew in an exponential manner. This goes hand in hand with the data explosion of the last few years, which is also exponential.

Data keeps growing, and to handle and draw inferences from the new data created every day, more and more data science jobs are being created across the globe with each passing year.


2. In which countries has the Data Science community grown?

From the above visualization, we can observe the growth trend of the data science community in the top 10 countries by number of data science experts. The United States leads the growth trend, followed by India, Germany, the United Kingdom and so on. Growth in the United States is exponential and in full boom, whereas growth in India, Germany & the United Kingdom is also exponential but has not yet reached its full boom. In the remaining countries (Canada, Brazil, Russia, France, Australia & Spain), the data science community is growing, but slowly compared to the top 4 countries.

The United States has Silicon Valley and is home to large software & IT organisations, banking, finance and insurance firms, healthcare service providers and educational institutions. It also has better infrastructure and is consistently at the forefront of technological and IT advancements. These sectors and their services create a large amount of data every day in the United States alone. As a result, far more data science experts are required in the United States than in other countries, which would explain the exponential growth.

India has long been one of the major IT services providers for the United States, and a fair amount of IT workload is shared with India by the United States. Like the United States, India also has its own data science requirements in the sectors mentioned above. As a result, many data science opportunities are being created in India, which is leading to rapid growth of its data science community.

The same goes for Germany, the United Kingdom and the other 6 of the top 10 countries. A large amount of data is being created, and the need for data science experts to handle it and extract meaning from it is growing rapidly in each country, but at a different rate depending on the need, demand and market within each country.

From the visualization on the left, we can observe the growth trend of the data science community in the top 10 countries, now expressed as a proportion (percentage) across years for each country. For each country, 2011 has the lowest proportion of experts, and the proportion increases from then on until 2018, where it is at its maximum. The proportions across the 8 years sum to 100% for each country.
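The per-country normalisation described here can be sketched as follows: each year's count is divided by the country's total over all years, so the percentages for one country add up to 100%. The function name and the example counts are hypothetical and are not taken from the survey.

```python
def share_across_years(counts_by_year):
    """Return each year's percentage of one country's total expert count."""
    total = sum(counts_by_year.values())
    if total == 0:
        # A country with no data science respondents gets 0% in every year.
        return {year: 0.0 for year in counts_by_year}
    return {year: 100.0 * count / total for year, count in counts_by_year.items()}


# Hypothetical counts for a single country, for illustration only.
example_counts = {2011: 1, 2014: 3, 2018: 16}
example_shares = share_across_years(example_counts)
```

Because every country is scaled by its own total, this view compares the shape of growth between countries rather than their absolute sizes.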

The rise in proportion is also exponential, which coincides with the exponential growth of data created within these countries over the years. Every one of the top 10 countries shows the same pattern, i.e. exponential growth across years, but at a different rate.


3. What is the trend in Data Science community growth in various countries over the years?

From the above visualization, we can observe the following –

  • In 2011, not all of the top 10 countries were actually using data science. Most countries had a 0 proportion of data science experts, and only a few countries such as the United States, United Kingdom, Australia & Germany had any data science presence. The United States in particular had a 50% share of the data science experts across the top 10 countries.
  • The share of the United States is far larger than that of any other country in every year. In 2011, the United States held about 50% of the data science experts, and it continued to hold the highest share in each subsequent year. In 2018, the United States held approx. 38% of the data science experts.
  • As the years passed, the rest of the countries also started using data science, and their shares rose from 0 to 20%.
  • Different countries grew their data science communities at different rates. India in particular rose from a 0 proportion in 2011 to around 18% in 2018.
  • Germany also increased its share. Australia lost share across the years. The United Kingdom first gained and then slightly lost share.
  • As other countries gained share, the United States lost some of its proportion of the data science community, but it still held the highest proportion in every year.

The difference between countries’ proportions was largest in 2011 and narrowed over the years. By 2018 the gap between countries had become smaller, meaning each of the top 10 countries was using data science, though in different proportions depending on the need, demand & market within each country.
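One way to put a number on this narrowing gap is to compute each country's share within a year and then take the distance between the largest and smallest share. The sketch below does that. The counts are made up for illustration, chosen only so that the United States lands near the roughly 50% and 38% shares mentioned above, and the helper names are hypothetical.

```python
def shares_within_year(counts_by_country):
    """Return each country's percentage of one year's total expert count."""
    total = sum(counts_by_country.values())
    if total == 0:
        return {country: 0.0 for country in counts_by_country}
    return {
        country: 100.0 * count / total
        for country, count in counts_by_country.items()
    }


def spread(shares):
    """Gap in percentage points between the largest and smallest share."""
    return max(shares.values()) - min(shares.values())


# Made-up counts, not survey data.
counts = {
    2011: {"United States": 5, "United Kingdom": 2, "Germany": 2,
           "Australia": 1, "India": 0},
    2018: {"United States": 38, "India": 18, "Germany": 16,
           "United Kingdom": 15, "Australia": 13},
}
spread_by_year = {
    year: spread(shares_within_year(by_country))
    for year, by_country in counts.items()
}
```

With these illustrative numbers the spread falls from 50 to 25 percentage points, which is the kind of convergence the paragraph describes.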


4. In which Industries has the Data Science community grown and in what proportion?

When combining the data from 2011–2018, the 2017 & 2018 surveys had no data about the industry an individual belonged to. The inference below is therefore based on the survey data from 2011–2016.

From the above visualization, almost all industries use Data Science to a greater or lesser extent, with Software products, Finance & Banking, Consulting, Healthcare & Education using it predominantly. We can also see that the top industry is ‘Other’, which indicates that the Stack Overflow surveys didn’t list every industry in which data science is being used. ‘Other’ could cover various industries such as research of different kinds, Medicines, Pharmaceuticals, E-commerce, Construction, Transportation, Insurance, Travel & Hospitality, Utilities, Natural resources & Energy, etc.

Industries and organisations alike are awash with data in this pro-tech age, and data is being created in an exponential manner, so every industry uses Data Science in a different proportion, as Data Science leads to smarter decision-making.

  • From the above visualization, we can observe the growth trend of the data science community in the top 10 industries by number of data science experts. ‘Other’ leads the growth trend, followed by Software Products, Finance/Banking and so on. ‘Other’ here could collectively mean industries such as research of different kinds, Medicines, Pharmaceuticals, E-commerce, Construction, Transportation, Insurance, Travel & Hospitality, Utilities, Natural resources & Energy, etc.
  • Growth of the Data Science community in Software Products & Finance/Banking is exponential, whereas growth in Consulting, Education and Healthcare is also upward but slower in comparison.
  • Internet, Government, Media/Advertising & Manufacturing also show an upward trend, but only after 2014, and it is much smaller than in the rest of the industries.
  • ‘Other’, being a combination of many industries, shows the biggest upward trend because it aggregates the trend of every industry included in the ‘Other’ category. So we can’t say that ‘Other’ truly had the biggest upward trend.

The upward trend across industries looks exponential, which goes hand in hand with the exponential creation of data across industries. As more data was created and exchanged within industries, they needed more data science experts to handle the data and draw inferences from it in order to make better, informed business decisions. This requirement grew at a different rate in each industry depending on the need, demand, geographical location and market. So, after 2014, almost all of the industries were using data science.


5. What is the trend in Data Science community growth in various industries over the years?

From the above visualization, we can observe the following –

  • In 2011, not all of the top 10 industries were actually using data science. 3 out of 10 industries had a 0 proportion of data science experts.
  • In 2011, Software products had the highest share of the data science community at 30%, followed by Consulting and Finance/Banking with 20% each; together these 3 industries held 70% of the data science experts. The remaining 30% was split among Other & Education with 10% each, and Healthcare & Manufacturing with 5% each.
  • Media/Advertising, Internet and Government didn’t have any share until 2013 or 2014.

Over the years, the shares of the top 10 industries rose and fell, and the differences between them narrowed.

  • As other industries gained share, Software Products, Consulting and Finance/Banking lost some of their proportion of the data science community across the years.
  • We can also see from the visualization that data science experts in ‘Other’ industries (which could mean research of different kinds, Medicines, Pharmaceuticals, E-commerce, Construction, Transportation, Insurance, Travel & Hospitality, Utilities, Natural resources & Energy, etc.) grew more than in the rest of the industries, and ‘Other’ remained in the top 2 spots after 2011. This means data science was being used in many other industries and was not limited to only a few.

After 2014, the use of data science was consistent across industries, with each industry holding a share of approx. 3% to 18% of the data science experts.

  • The difference between industries’ proportions was largest in 2011 and narrowed over the years. By 2016, the last year with industry data, the gap had become smaller, meaning each of the top 10 industries was using data science, though in different proportions depending on the geographical location, need, demand & market within each industry.

6. In which companies (small, medium & large) has the Data Science community grown and in what proportion?

Since Stack Overflow didn’t have company-size data for 2014–2015, we used two intervals, 2011–2013 & 2016–2018, to infer the trend of data science experts in different sized companies. The company-size categories were also different for these 2 intervals.

Below is the observation for years 2011–2013:

From the above visualization of the proportion of data science experts in companies of different sizes, we can observe the following for 2011 to 2013 –

Treating Small sized companies as the combination of Start Up (1–25) & Mature Small Business (25–100), Medium sized companies as Mid sized (100–999), and Large sized companies as Fortune 1000 (1000+), we can state the following:

1) Small sized companies (1–100 employees) accounted for 35% of the data science experts, while Medium sized companies (100–999 employees) and Large sized companies (1000+ employees) accounted for 29% & 36% respectively.

2) Medium sized companies had 35 data science experts in 2011–2013, slightly fewer than Small sized & Large sized companies, which had 42 & 43 respectively.

So, for 2011–2013, data science experts were distributed about evenly across different sized companies if we treat Start Ups & Mature Small Businesses as Small sized companies.
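The percentages quoted for 2011–2013 follow directly from the counts given above (42, 35 and 43 experts). A quick check, with dictionary labels chosen here only for readability:

```python
# Counts of data science experts for 2011-2013, as stated in the post.
expert_counts = {
    "Small (1-100)": 42,
    "Medium (100-999)": 35,
    "Large (1000+)": 43,
}

total_experts = sum(expert_counts.values())

# Percentage of all experts in each size group, rounded to whole percent.
expert_percentages = {
    size: round(100 * count / total_experts)
    for size, count in expert_counts.items()
}
```

The three groups total 120 experts, which gives the 35%, 29% and 36% split reported in point 1.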

Below is the observation for years 2016–2018:

From the above visualization of the proportion of data science experts in companies of different sizes for 2016 to 2018, we can observe that Small sized companies (0–499 employees) had far more data science experts than Medium sized companies (500–4999 employees) and Large sized companies (5000–10000+ employees). Small sized companies accounted for 65% of the data science experts, while Medium sized & Large sized companies accounted for 16.47% & 18.22% respectively.

So most data science experts were with Small sized companies (1–499 employees) & extremely Large sized companies (10000+ employees), 79% combined, and the remaining 21% were with companies of 500–9999 employees. Data science experts are therefore mostly with either Small sized or extremely Large sized companies.
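The grouping used for 2016–2018 can be sketched as a lookup from the fine-grained size categories to the three buckets, followed by a percentage calculation. The category keys below mirror the ranges quoted in this post. The raw survey strings are likely worded differently, so the keys and the function name should be read as assumptions.

```python
# Mapping from size category to bucket, following the ranges used in this post.
SIZE_BUCKETS = {
    "0-9": "Small",
    "10-19": "Small",
    "20-99": "Small",
    "100-499": "Small",
    "500-999": "Medium",
    "1000-4999": "Medium",
    "5000-9999": "Large",
    "10000+": "Large",
}


def bucket_shares(counts_by_category):
    """Return the percentage of experts in each of the three size buckets."""
    totals = {"Small": 0, "Medium": 0, "Large": 0}
    for category, count in counts_by_category.items():
        totals[SIZE_BUCKETS[category]] += count
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {bucket: 0.0 for bucket in totals}
    return {bucket: 100.0 * n / grand_total for bucket, n in totals.items()}
```

Keeping the mapping in one dictionary makes it easy to try a different cut-off, for example treating 10000+ as its own bucket as the paragraph above does.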

The percentage of data science experts in Small sized companies increased in 2016–2018 compared to 2011–2013, whereas the percentage in Medium sized & Large sized companies decreased.

Potential reasons for this shift are given below –

  • By 2016, data science had emerged as the new technology for the future. Existing Small sized companies therefore started hiring more and more data science experts, just as large sized organisations had done in 2011–2013, in order to handle the ever-increasing data and make informed decisions in their respective businesses.
  • As data science had a huge impact on every industry across the globe, new Start-Ups or Small sized companies emerged that were well equipped to handle the data science requirements of various industries, and they hired highly trained professionals with data science skill sets. These Start-Ups were willing to pay for highly priced data science experts because they knew the returns they would reap from these hires: data was expected to grow exponentially, and demand for services to handle and make meaning of the data would grow with it, benefiting Small sized companies in the long run. This led to an increase in the number of data science experts in Small sized companies compared to 2011–2013.

Since the share of data science experts with Small sized companies increased in 2016–2018, the share with Medium sized & Large sized companies decreased.


7. What is the trend in Data Science community growth in different sized companies over the years?

Below is the observation for years 2011–2013:

From above visualization, we can observe the following –

1) In 2011, Start Ups (1–25 employees) and Large sized companies (1000+ employees) were the bottom two among the 4 categories, and in 2013 they were the top two.

2) Similarly, in 2011, Mature Small Businesses (25–100 employees) & Mid Sized companies (100–999 employees) were the top two among the 4 categories, and in 2013 they were the bottom two.

3) So the share of data science experts with Start Ups (1–25 employees) and Large sized companies (1000+ employees) increased over time, whereas the share with Mature Small Businesses (25–100 employees) & Mid Sized companies (100–999 employees) decreased over time.

  • One potential reason why large companies had more data science experts is that they invest in research and development and had a vision of the technology that would be more productive, efficient and widely used in the coming years. Since large sized companies anticipated an exponential increase in the data created daily, they hired more data science experts to be better prepared to handle it and make informed decisions in their respective businesses. On the other hand, a few Start Ups were founded by groups of like-minded individuals who saw the potential of data science as the future technology and started hiring data science experts over the period from 2011 to 2013. Medium sized companies also hired data science experts, but their numbers fluctuated.
  • Also, there were far fewer data science experts during 2011–2013. Large sized companies were therefore better able to afford experts with these rare skill sets. Start Ups may have been founded by data science experts themselves, who then hired like-minded data experts. As a result, the number of data science experts increased in Large sized companies & Start Ups.

Also, large sized organisations have far more data than other sized companies, so their demand for data science experts was higher and they hired more of them. As for Start Ups, since they were formed on the strength of the future potential of data science, they hired more data science experts as well.

Below is the observation for years 2016–2018:

From above visualization, we can observe the following –

1) There is an upward trend for Small sized companies (0–9, 10–19, 20–99 & 100–499 employees) across years and a downward trend for Medium sized companies (500–999, 1000–4999 employees) & Large sized companies (5000–9999, 10000+ employees). So the proportion of data science experts with Small sized companies (1–499 employees) increased from 2016 to 2018, while the proportion with Medium sized (500–4999 employees) & Large sized (5000–10000+ employees) companies decreased.

2) Extremely Large sized companies (10000+ employees) were still among the top 4 categories by proportion of data science experts in every year from 2016 to 2018.

So Small sized companies (0–499 employees) & extremely Large sized companies (10000+ employees) held the biggest share of data science experts across different sized companies for 2016–2018.


Summary

So, from the answers to all the questions, and as per the Stack Overflow survey data from 2011–2018, we can say –

  • As data is growing at an incredible rate, it’s wise to take note: we can’t ignore the data revolution.
  • Given the rocket speed of data growth, the requirement for data science experts is growing rapidly, but at different rates depending on the need, demand, geographical location and market within each country & industry.
  • The major proportion of data science experts can be found in the United States, followed by India, Germany, the United Kingdom, Canada and other countries.
  • Almost all industries use Data Science to a greater or lesser extent, with Software products, Finance & Banking, Consulting, Healthcare & Education using it predominantly, alongside other industries.
  • By the end of 2018, Small sized companies (0–499 employees) & extremely Large sized companies (10000+ employees) held the biggest share of data science experts across different sized companies.
  • While at some point the data explosion may begin to slow, it’s a fact that both businesses and consumers will continue to create new information every second of every day. This represents an opportunity for all industries to offer the data science projects businesses need to create, store, manage, and analyze the masses of data they have at their fingertips.

Originally posted here.