Learning Data Science can be hard, and finding a job in the Data Science field can be equally hard if you are a beginner. As a beginner, what are the opportunities, and what should you know to make better decisions?
I had other questions too.
As with any data-related project, I started by digging through the data. I read and analyzed more than 100 job postings on two job portals across four cities in India, all asking for 0 to 5 years of experience.
At the end of this post, if you had the same questions as I did, you will be very well informed to make the right decisions. The objective of this post is to help fine-tune your resume with the needed skills and experience levels. If you just started in Data Science, it will help in getting a realistic view of the job market for Data Science for Beginners.
Some companies refer to Data Scientists by other names, such as Machine Learning Engineer, Data Science Consultant, Decision Scientist, and so on. All jobs listed under these different titles but related to data science are in scope.
I had to filter out some jobs that fit the above criteria because they were too specific or too vague. I call them outliers. Here are the criteria:
Note: I reviewed each job description and prepared a dataset for you to explore. You can subscribe to my newsletter to get a copy of the dataset for further analysis, along with the actual job postings and company links.
Let’s get to the analysis!
If you guessed Bangalore, you are correct! Bangalore (Bengaluru), India’s Silicon Valley, has the most jobs, with companies like Amazon, Citi, Genpact, and Honeywell posting Data Scientist positions. Bangalore accounts for around 35% of the job positions analyzed.
If you are starting in Data Science with just a couple of years of related experience under your belt, Bangalore is the way to go. Chennai stands second with 25% of the job positions.
The word best is subjective. In my view, the best site is one that helps you find jobs quickly and easily without having to hunt for important information. Naukri and Indeed had almost the same number of job postings - 61 and 57 respectively over the past 30 days. With Naukri, the job search was broad - you can search multiple job titles like Data Scientist, Data Analyst, etc. at once, and you can select several cities in a single go. With Indeed, the best I could do was choose one city or a single job title at a time. I found it constraining. Definitely not the best experience with Indeed.
I also felt that Indeed lists jobs in an unstructured manner. Naukri had dedicated fields for job location, min salary, max salary, and part-time/full-time information, separated from the rest of the job description. In Indeed, it was all over the place, making it tough to find information quickly.
Beware! I found that less than 10% of the job postings are duplicates, posted on both Naukri and Indeed. It means that when you are applying for jobs, you should apply on both sites.
I didn’t do a job search on LinkedIn, which is another excellent resource for the job hunt. There might be other sites that you may need to explore.
Almost all the positions posted require adequate familiarity with math, including strong skills in statistics, probability, and other quantitative methods. Still, only 11% of the job openings explicitly state that exceptional math skills are mandatory. So if you are a self-learner in data science or have decent math skills, you should still be able to get a job.
It is a no-brainer. A Data Scientist keyword search covers Data Scientist, Lead Data Scientist, Senior Data Scientist, and Machine Learning Engineer, and should capture around 80% of the job openings. If a job opening is not tagged correctly by the recruiters, you might also try other keywords like AI Engineer, Data Science Consultant, Applied Scientist, Decision Scientist, etc.
The full range of keywords is present in the dataset used for this search. See the “Role” column in the dataset.
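As a rough illustration of how the “Role” column could be summarized, here is a minimal Python sketch. It assumes the dataset is a CSV file with a “Role” column; the sample rows and the “City” column are placeholders I made up for the example, not values from the real dataset.

```python
import csv
import io
from collections import Counter

# Placeholder rows for illustration only; these are NOT the real dataset.
# Only the "Role" column name comes from the post; "City" is an assumption.
SAMPLE_CSV = """Role,City
Data Scientist,Bangalore
Senior Data Scientist,Chennai
Data Scientist,Bangalore
Decision Scientist,Bangalore
"""


def role_shares(csv_file):
    """Return {role: share of postings} for a CSV with a 'Role' column."""
    counts = Counter(row["Role"].strip() for row in csv.DictReader(csv_file))
    total = sum(counts.values())
    return {role: n / total for role, n in counts.items()}


shares = role_shares(io.StringIO(SAMPLE_CSV))

# Print roles from most to least frequent.
for role, share in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"{role}: {share:.0%}")
```

To run this on the real file, you would pass an opened file object instead of the in-memory sample.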
For starters, domain here refers to the industry type of the company. E.g., Banking, Retail, Manufacturing, etc.
Most software services companies are now enabling clients to make decisions based on data, and I see that 25% of these companies are working in multiple domains. It is good news, as you can apply for any of these positions irrespective of your area of expertise. No specific domain knowledge is mandated.
But if you want to get past the competition or target a specific domain of interest, these are the top three domains with the most job postings: Banking and Financial Services (BFSI) 16%, Healthcare 8%, and Telecom 5%.
If you want to target niche domains like Media, Real Estate, or Gaming, you have positions, but not many. Refer to the dataset for the exact numbers and the companies list.
You can certainly use your domain knowledge to kill your competition. Also, data would make much more sense if you have domain knowledge, and you can deliver more business value.
Is it mandatory? Only 12% of the companies are looking for specific domain knowledge. You can use this either way - you can apply to the remaining 88% of the companies, where your chances of getting selected are higher, or you can specifically target the 12% where you have an edge and cruise through your competition.
37% of the companies are looking for some kind of experience in deep learning - especially with packages like TensorFlow.
The short answer is Yes. Around 30% of the companies mandate text analytical skills like NLP (Natural Language Processing).
Around 20% of the job postings need experience deploying models on the cloud. The remaining 80% are not keen on cloud experience. If you don’t have cloud experience, it is logical to apply to large companies that may have separate teams to take care of this requirement. Smaller companies, or those with a small data science team, will look for this experience as they depend on the cloud for computing power and scalability.
As for which cloud technologies these 20% of firms are looking for - exposure to Azure/AWS will cover you in about 96% of the cases.
Actual data: 23/118 companies need cloud experience. Out of these, 22/23 companies need experience with either AWS or Azure.
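For readers who want to check the percentages, the counts above can be converted with a few lines of Python. This is only the arithmetic on the numbers quoted in this post; the variable names are mine.

```python
# Counts quoted in the post.
total_entries = 118       # entries analyzed (matches 61 + 57 postings from the two portals)
need_cloud = 23           # entries asking for cloud experience
need_aws_or_azure = 22    # of those, entries asking for AWS or Azure

cloud_share = need_cloud / total_entries
aws_azure_share = need_aws_or_azure / need_cloud

print(f"Cloud experience needed: {cloud_share:.1%}")        # 19.5%
print(f"AWS/Azure among those:   {aws_azure_share:.1%}")    # 95.7%
```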
Python is the undisputed leader here with close to 95% of the companies using it for data wrangling, analysis, and machine learning.
R is next with 52% of the companies naming it as one of the statistical tools they use.
Most of the time, organizations are looking for exposure in both these tools. Having R knowledge can give you an edge.
Apart from these two languages, SAS and Java are popular with 18% of the companies looking for experience in these languages.
Big data technologies like Hadoop and Spark are sought after by 25% of the recruiting companies.
In addition to these, related technologies like Hive, Pig, Sqoop are also sought after - though I have not captured these numbers in the dataset as they fall under the big data realm.
Most data scientist aspirants and enthusiasts concentrate on machine learning algorithms and programming languages, and they miss out on the fundamental technologies that companies already rely on. Almost all the companies have some sort of relational database to store structured data and are looking for people who can capitalize on this data. It makes SQL expertise a valuable skill.
58% of the companies expect the Data Scientist role to write SQL statements to work with structured data. With that said, companies are also looking for NoSQL database skills - with 15% of the companies asking for expertise in a non-relational database like MongoDB.
Tableau is the most popular tool in the industry, with 27% asking for this skill. Power BI comes second with 15%, and QlikView third with 10%.
If you expose yourself to the following tools and frameworks, it will increase your probability of getting selected for the position. Listed in order of importance:
If you are a starter with zero to two years of experience, the likelihood of getting an interview call is around 20%.
If you have three years of experience, the likelihood of getting an interview call increases by 45%.
Based on the analysis, I would say the average minimum experience for an entry-level Data Scientist is three years.
Here is the infographic that summarizes everything.
Get access to the dataset by signing up for the newsletter. You will be sent an email with the link to download the dataset. The dataset is provided in the
With the dataset, you can derive further insights.
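As one example of a further insight, the sketch below estimates how often a skill is requested among postings that a beginner could apply to. It is only a sketch under assumptions: the column names “MinExperience” and “SQL”, the yes/no encoding, and the sample rows are all hypothetical and will need adjusting to the actual dataset.

```python
import csv
import io

# Made-up placeholder rows; column names other than "Role" are assumptions.
SAMPLE_CSV = """Role,City,MinExperience,SQL
Data Scientist,Bangalore,0,yes
Data Scientist,Chennai,3,no
Decision Scientist,Bangalore,2,yes
Senior Data Scientist,Bangalore,5,yes
"""


def skill_share(rows, skill, max_min_experience):
    """Share of postings asking for `skill`, among postings whose minimum
    experience requirement is at most `max_min_experience` years."""
    eligible = [r for r in rows if int(r["MinExperience"]) <= max_min_experience]
    if not eligible:
        return 0.0
    wanted = sum(1 for r in eligible if r[skill].strip().lower() == "yes")
    return wanted / len(eligible)


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))

# Among postings open to candidates with up to two years of experience,
# what fraction ask for SQL?
sql_share_for_beginners = skill_share(rows, "SQL", max_min_experience=2)
print(f"SQL requested in {sql_share_for_beginners:.0%} of beginner-friendly postings")
```

The same function can be reused for any other yes/no skill column by changing the `skill` argument.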
Thank you for your time. If you have any comments/questions, you can record your comments below and I shall try to answer as soon as I can.