Career Switch: Becoming Part of the “World of Data”
Analyzing Stack Overflow’s 2017 survey data
I believe there are lots of people like me who would like to switch their career path and find a prestigious data-related job.
A couple of years ago, I decided to dedicate part of each day to learning how to work with data. At that time, I didn’t have any serious plan to change my career path. I loved, and still love, the work I am doing, but I strongly believed that sooner or later knowing how to work with data and how to draw interesting conclusions from it would stop being an “option” and become a “MUST-HAVE” capability.
So, almost two years ago and without any plan to change my career path, I started educating myself about data. The more I read and learned about the topic, the more I realized how interested I am in becoming a Data Scientist and/or Machine Learning Engineer. That is when the new challenge started: “HOW?”
I completed several Udacity Nanodegree programs and was admitted to Georgia Tech’s Online Master of Science in Analytics. I worked a lot on my resume and cover letter and networked with several friends working in this area, but honestly, there were still lots of unanswered questions in my mind.
For the first project of the Data Science Nanodegree program, we were supposed to analyze a data set. I found the 2017 Stack Overflow survey data the most interesting choice: not only could I complete a required project for the Nanodegree, I could also learn a lot from it and make my future moves more accurate and more fit for purpose.
Here is the list of what I am going to share with you in this post:
What is the employment status of participants? What about participants from Canada and the USA specifically? Is there a specific parameter that helps individuals become employed?
How happy are people with their careers? That is, how does “Career Satisfaction” vary, and which parameters may affect it more than others?
How is Salary related to some of the main features of the data set?
Each of the above questions is discussed in detail in the following sections.
Key Finding 1: The job market in the USA is “relatively” better than in Canada and globally.
If we take “Employed full-time” as the KPI for job market status in each country, the USA appears to be in better shape than Canada and the world as a whole, although the difference is not large.
The following pie chart shows the employment status of all participants:
The following plots show the same information for participants from Canada (top) and the USA (bottom) only.
Again, if we take full-time employment as an indicator of the job market, we can conclude that the job market in the US (~77% employed full-time) is relatively better than in Canada (~72%) and worldwide (~70%).
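As a rough illustration of how such shares can be computed, here is a minimal sketch in plain Python. The rows are toy values made up for the example, not survey data, and the field names `Country` and `EmploymentStatus` as well as the helper `full_time_share` are assumptions chosen for illustration rather than the exact code behind the charts.

```python
from collections import Counter

# Toy rows standing in for survey responses (made up, NOT real survey data).
# The field names "Country" and "EmploymentStatus" are assumptions.
responses = [
    {"Country": "United States", "EmploymentStatus": "Employed full-time"},
    {"Country": "United States", "EmploymentStatus": "Employed full-time"},
    {"Country": "United States", "EmploymentStatus": "Employed part-time"},
    {"Country": "Canada", "EmploymentStatus": "Employed full-time"},
    {"Country": "Canada", "EmploymentStatus": "Not employed, but looking for work"},
    {"Country": "Germany", "EmploymentStatus": "Employed full-time"},
    {"Country": "India", "EmploymentStatus": "Self-employed"},
    {"Country": "India", "EmploymentStatus": "Employed full-time"},
]


def full_time_share(rows, country=None):
    """Fraction of rows that are employed full-time, optionally for one country."""
    selected = [r for r in rows if country is None or r["Country"] == country]
    if not selected:
        return float("nan")
    counts = Counter(r["EmploymentStatus"] for r in selected)
    return counts["Employed full-time"] / len(selected)


# Compare all participants with the two countries of interest.
for label, country in [("All", None), ("Canada", "Canada"), ("USA", "United States")]:
    print(f"{label}: {full_time_share(responses, country):.0%}")
```

The same helper can be reused for any other status by changing the key that is looked up in `counts`.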
I was curious to see whether the higher full-time employment rate in the US reflects better job conditions there or better-qualified candidates. To check this, I compared the profile of participants in the US with those in Canada and in the full data set, using Formal Education and Major Undergrad as two indicators to compare US full-time employees with Canadian and global full-time employees.
The Formal Education breakdown for global, Canadian and US full-time employees is summarized in the following three figures:
The proportion of employees with a bachelor’s degree appears to be higher in the US than in Canada and globally. In the US, ~57% of participants with a full-time job have a bachelor’s degree, while globally the figure is ~47%.
The second characteristic I was interested in was the undergraduate major of participants. The Major Undergrad breakdown for global, Canadian and US full-time employees is summarized in the following three figures:
The outcome is almost the same in all three cases, which suggests that whether you want to work in the US, Canada or anywhere else in the world, the chance of getting a full-time job is almost the same. A computer science background is the most common one among full-time employees (~40%), so it appears to improve your chances.
Key Finding 2: Career satisfaction is “relatively” better in the USA than among participants from other countries.
Comparing the career satisfaction of participants from all countries except the USA with that of US participants shows that US employees are somewhat more satisfied.
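One simple way to make this kind of comparison is to split the respondents into US and non-US groups and compare their mean scores. The sketch below does that with toy values invented for the example. It assumes satisfaction is stored as a number with some answers missing, and the helper name `mean_satisfaction` is hypothetical.

```python
from statistics import mean

# Toy (country, career satisfaction) pairs, made up for illustration only.
# None marks a respondent who skipped the question.
rows = [
    ("United States", 8), ("United States", 9), ("United States", 7),
    ("Canada", 7), ("Germany", 8), ("India", 6), ("Brazil", None),
]


def mean_satisfaction(rows, in_us):
    """Mean score for US respondents (in_us=True) or for everyone else (in_us=False)."""
    scores = [
        score for country, score in rows
        if score is not None and (country == "United States") == in_us
    ]
    return mean(scores)


us_mean = mean_satisfaction(rows, in_us=True)
other_mean = mean_satisfaction(rows, in_us=False)
print(f"US: {us_mean:.2f}, other countries: {other_mean:.2f}")
```

A difference in means on its own does not show whether the gap is meaningful, so comparing the full distributions, as the plots do, is a useful complement.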
I was curious to compare some other parameters that may affect career satisfaction. For this purpose, I compared Problem Solving, Building Things, Learning New Tech, Job Security and Importance of Diversity between participants from the US and participants from other countries.
As one example, the following plots show that problem solving is valued more by employees in the US than by employees from the rest of the world.
More or less the same trend was observed for all the other attributes listed above when comparing US participants with participants from other countries.
What MIGHT be concluded here is that US participants rank the attributes that affect job satisfaction higher than participants from other countries do. We cannot state this firmly without further investigation, but it suggests that US participants are more satisfied with problem solving, job security, building things and learning new tech in their jobs than other participants, and that this may be the reason they report better career satisfaction.
Key Finding 3: Salary can be regressed fairly well on some of the main features in the data set.
I fitted a linear regression model to predict Salary from some of the variables in the data set. The variables I used are Race, Gender, Currency, level of experience, level of coding experience, and the number of hours each participant would like to work per week.
Although I was happy with the final regression KPIs (i.e. R², MSE, etc.), I do believe that much more work could be done to improve this regression model.
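To make the idea concrete, here is a minimal sketch of a linear regression with one categorical and one numeric feature, followed by the two KPIs mentioned above. It is not the model from this post: the data are toy values invented for the example, the variable names are hypothetical, and the fit uses NumPy's least-squares solver as a stand-in for whatever library you prefer.

```python
import numpy as np

# Toy data, made up for illustration only (NOT survey data).
gender = ["Male", "Female", "Male", "Female", "Male", "Female"]
years_coding = np.array([1.0, 2.0, 5.0, 6.0, 10.0, 12.0])
salary = np.array([40_000, 45_000, 62_000, 68_000, 90_000, 101_000], dtype=float)


def one_hot(values):
    """One-hot encode labels, dropping the first category so the
    columns are not collinear with the intercept."""
    categories = sorted(set(values))
    return np.array([[1.0 if v == c else 0.0 for c in categories[1:]] for v in values])


# Design matrix: intercept, encoded categorical feature, numeric feature.
X = np.column_stack([np.ones(len(salary)), one_hot(gender), years_coding])

# Ordinary least squares fit.
coef, *_ = np.linalg.lstsq(X, salary, rcond=None)
predicted = X @ coef

# Mean squared error and coefficient of determination (R^2).
mse = float(np.mean((salary - predicted) ** 2))
ss_res = float(np.sum((salary - predicted) ** 2))
ss_tot = float(np.sum((salary - salary.mean()) ** 2))
r2 = 1.0 - ss_res / ss_tot
print(f"R^2 = {r2:.3f}, MSE = {mse:.1f}")
```

Note that this sketch scores the model on the same rows it was fitted on, which flatters both metrics. Holding out a test set gives a more honest picture.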
Here are the predicted vs. actual results of the regression model. As you can see, the model yields a good positive correlation:
I’d love to hear your thoughts:
There are lots of other things I could do with the survey data, but I decided to stop here. If you were me, what other questions would you want to answer? If you wanted to build a regression model to predict Salary, which main features would you put in your model? I REALLY WOULD LOVE TO HEAR FROM YOU :-)