It's 9am on a March morning and the mercury is just inching past 24 degrees Celsius in Bengaluru. The workday is already two hours old in the metropolis's densely laid-out eastern suburb of Marathahalli. A student batch of both unemployed and working software professionals at Robotek Minds, a tech training institute, has just finished its data science class.
Data science is the new buzzword in the tech industry and the code jocks in the Marathahalli class have a singular focus - a job or a leg-up at one of the shiny information technology campuses dotting the city and housing the world's leading tech corporations. This, they hope, will be a passport to a comfortable salary that will grow in long strides in the years ahead, as the use of data in the world economy explodes.
"The very first point is on the salary... we get a good pay," says Azmat Ali, who is paying Rs25,000 for a 50-hour data science training course at Robotek Minds, explaining his interest in the field. "The world is completely dependent on data and processing it. Other tools may expire but data will not expire."
There are hundreds of thousands in India who have ambitions similar to Ali's. Look around these bylanes of Marathahalli and you are greeted by posters covering almost every wall advertising data science or related courses in local training shops, which jostle for space with paying guest digs for young men and women.
Prasad Reddy, the operations head at Robotek Minds, counts his employer among half a dozen institutes providing data science courses in the immediate vicinity, out of the 50-60 in the larger neighbourhood. Marathahalli's nearest competition is Ameerpet "where around 10 institutes provide this kind of training", he says, speaking of the IT training hub in Hyderabad.
Robotek Minds also offers courses in machine learning, deep learning, and artificial intelligence (AI) with "100% job support". "Our faculty will explain how to crack the interview, help with resume preparation, and discuss real-time scenarios," says Reddy. Some of the large certified institutes charge more. "We are not charging that much since we get a lot of unemployed students, who are not willing to pay that much," he says.
Another training institute, Eminent IT Info, has done its bit to make sure anyone interested in data science knows of it - its posters are splashed all over the neighbourhood, advertising weekday and weekend batches for training in Python, a high-level programming language; R, a statistical software framework and language; machine learning and deep learning; natural language processing (NLP); and AI. A receptionist says its 90-day data science course starts at Rs30,000 and quickly forwards PDFs of the courseware over WhatsApp.
"Three to four years back, everyone was crazy for Hadoop (an open-source big data processing framework). Now, it's data science. From the inquiries that we get, almost 50% of the crowd wants to do data science," says Sourabh Sharma, marketing manager at Realtime Signal Technologies, a training institute with branches in Marathahalli and BTM Layout in south Bengaluru. The institute charges Rs40,000 for a course on data science covering machine learning, Python and R and taking three to four months to complete.
The Marathahalli institute managers we interviewed say they are holding up against the rapid spread of massive open online courses (MOOCs) that offer learners the flexibility to work through a course of their choice and at their pace. Akhil Teja, data science trainer at Robotek Minds, insists the key to learning data science concepts is finding the right mentors, which MOOCs can't provide. "It's not a tough subject to learn if you are able to focus on the concepts," he says.
Warning: MOOCs ahead
Teja's edge over MOOCs may be getting dull, as a recent visit to Ameerpet, a locality that presents a shabby facade of decades-old buildings, showed. A sewage canal nearby is undergoing a major engineering overhaul, throwing up a stench, and the adjacent road has been barricaded into a narrow lane where people elbow for space with vehicles. In the middle of this north-central Hyderabad chaos, we found at least a dozen IT institutes advertising machine learning and data science courses.
The locality has seen better days and is now at the receiving end of tech disruption, due to the proliferation of MOOCs - global ones such as Udacity, Coursera and EdX, and NPTEL, an Indian government-run MOOC. "There were around 10,000 to 12,000 institutes earlier in 2009-10, but now there are hardly 3,000," says Suchitha Rudragani, administrator at Sathya Technologies, an Ameerpet IT training institute. It charges Rs20,000 for its 40-hour course in data science and offers a range of options, including online, offline, fast-track, and weekend batches. "Over 60-70% candidates are interested in data science now. This technology is booming in the US and UK," she says.
Some of them are working professionals. Australia-returned Manikiran Reddy, a mid-career Oracle developer, has enrolled for a morning data science course in a nearby training shop, Kelly Technologies. His aim - add data science skills to his talent stack to be more versatile in his work and target job opportunities better.
Others, to be sure, are less enthusiastic about getting an education from here. Dilip Kumar Reddy, a third-year computer science student, finds more value in Coursera, which offers a Rs3,103 per month subscription. Apart from certifications from a number of international universities, the all-you-can-eat package and the quality of teachers attract him.
"It's a lot better; it's taught by professors who are teaching courses at universities like University of Michigan, people who teach Masters students," says Reddy, who studies at B.V. Raju Institute of Technology in Medak district, about 60km north of Hyderabad. "Your assignments are verified by international professors directly. They give you personal feedback on your assignment, areas you should concentrate on."
The demand is reflected in Indian MOOCs as well. IIT-Kharagpur professor Sudeshna Sarkar's 'Introduction to Machine Learning' course on NPTEL has seen a 4X year-on-year growth in enrolments in its July-September 2017 batch. Data for later courses are not immediately available.
Nirant Kasliwal, an undergrad from BITS Pilani, puts his passion for data science down to an EdX course he took some five years ago. For him, today "Udacity is the best, even their free courses are pretty good". A Medium post that takes a data-driven approach to ranking the top-reviewed introductory courses on data science rates Kirill Eremenko's Data Science A-Z course on Udemy and Intro to Data Analysis by Udacity as the best online resources.
Six out of 10 developers are looking to acquire, or are currently learning machine learning and deep learning skills, says competitive coding platform HackerRank as part of its 2018 developer skills report released last month. AI beats blockchain, augmented reality/virtual reality, internet of things, and quantum computing, as the most sought-after tech skill.
Two of the top three languages Indian developers plan to learn next are Python (43%), and R (36%) - the two most used languages by data scientists and statisticians respectively, according to Kaggle's 2017 State of Data Science and Machine Learning survey, which polled its global online community of data scientists.
Python is also the most loved language in HackerRank's language preference graph, with more than a 30% lead over the next closest language, C. "Python has a number of libraries (NumPy, SciPy, Pandas) geared towards machine learning, which are used by a lot of companies, and it's an easy language to learn, the closest programming language to English," says HackerRank co-founder Vivek Ravisankar, explaining its growing popularity.
While high-quality resource materials are available and access to information has been democratized through MOOCs, learning by doing is the most important, says Partha Pratim Talukdar, assistant professor at the Department of Computational and Data Sciences at Indian Institute of Science, Bangalore. "You need to get your hands dirty, really do stuff," he says. "It's not just doing coursework, doing real projects is key to learning data science."
Keeping obsolescence at bay
For developers, the race to acquire AI skills is an existential challenge. There's the fear of job loss to automation and being left with outdated skills. While data science boasts a higher take-home package and scope for growth, it requires interdisciplinary skills - a combination of maths, statistics, programming, and some domain expertise.
"There is a genuine concern among software developers and testers that the requirements for their kin are on the way down. These guys are a bit uncertain about their own future, and one of the popular avenues to up-skill towards a more rewarding future is getting into data science and analytics," says Charanpreet Singh, co-founder and director of Praxis Business School, a business school that offers analytics programs at its Kolkata and Bengaluru campuses.
Its one-year programme in business analytics, priced a little above Rs5 lakh, started off with eight students in 2011. It now has about 150 students, Singh says, adding he keeps the class size limited to focus on quality. "All the big names have since started their analytics programme, so obviously the market demand has increased tremendously," he says.
Praxis's business analytics course has 80-85% engineers and a growing number of science graduates majoring in economics, statistics, and maths. It teaches techniques (statistics, machine learning, deep learning, visualization), tools and technologies (Python, R, Tableau, Spark and Hadoop), and how they can be applied to business. "Data science requires a problem solver's mindset - understanding business problems, converting a business problem into a data problem, and going deep into data. Some of these people struggle at the beginning to get into that mould," he says.
The push by tech giants like Google, Amazon, Microsoft, and IBM to ensure that their respective ecosystems come up on top doesn't help, says Gunnvant Singh Saini, a data science trainer at Bengaluru's Jigsaw Academy. "They have their own libraries that they want to promote. So, for example, Google has a big stake there and somehow Tensorflow becomes the de facto deep learning library and everyone just uses that," he points out. Tensorflow is an open source machine learning framework from Google.
Easy as they make data science on-boarding, such tools and crash courses come with the risks of shortcuts. "Being able to use that toolkit and get the best advantage out of it still requires you to understand the basics of maths and stats. If you do not understand what is the statistically valid sample size for your models, it will always fail," says Santanu Bhattacharya, a prominent Indian data scientist who has led teams at Facebook and is now at MIT's Emerging Market Group. "It becomes a scenario of a monkey with a machine gun."
Shinu Abhi, director, corporate training at REVA Academy for Corporate Excellence, which runs an advanced analytics program for working professionals, agrees. Its training course stresses what she calls the "two primal KPIs" for companies - increasing revenue and reducing cost. "The creamy layer of smart data scientists bring business impact," she says, stressing how skills need to go beyond just the use of tools.
At the higher end of the training value chain, international certification is one of the big draws. International School of Engineering (INSOFE), with a presence in Bengaluru and Hyderabad, for instance, provides certification recognized by the Language Technologies Institute of Carnegie Mellon University (CMU) for a post-graduate program in data analytics and optimization. Jigsaw Academy's data science post-graduate program gets you certified from the University of Chicago Graham School.
Piyush Mishra, an engineer and aspiring data scientist currently undergoing a course at INSOFE in Bengaluru, says its CMU affiliation, access to labs, and faculty of data scientists motivated him to spend over Rs3.5 lakh on the 23-weekend course. The programme includes a paid internship as well. Despite layoffs in the IT industry, AI, deep learning, and automation are performing, he says. "That's why I planned for this course."
Vaibhav Gokhale, who did a Master's in mechanical engineering from Purdue University and recently completed his internship at INSOFE, will be soon joining a newly created analytics team at a Pune firm with a salary of Rs4 lakh a year. The relatively low salary for his kind of education doesn't deter him. "Initially, it's fine, the field is growing. Once I have some experience I can demand more," he says, attributing the low take-home to his lack of programming experience. "In our class, most have an average of 10 years experience in some domain. If someone has experience in software development and business background, they are more likely to get a higher pay."
Core AI job roles related to deep learning, machine learning, and NLP (natural language processing), are areas where talent supply is lower than the market demand in India, says Rishabh Kaul, co-founder, Belong, a Bengaluru HR start-up, citing its talent supply index study from March 2017.
A bulk of the demand for data scientists and machine learning engineers is being generated by R&D centres of large global corporations, says Kaul. "India is still maturing as an AI/ML ecosystem. For a lot of companies, their expectation from AI talent is knowledge of tools, languages, software packages, and libraries."
According to Belong's research, less than 2% of professionals who call themselves data scientists or data engineers have a Ph.D in AI-related technologies and just 4% of AI professionals in India have worked on areas such as deep learning and neural networks. Kaggle's research also seems to indicate India's talent pool is much younger than elsewhere and more bottom-heavy, with fewer Master's and Ph.D-level professionals.
"Starting salary for basic analytics folks can be anywhere from Rs4-8 lakh per annum, while for data scientists with 4-5 years of experience, it can be Rs15-30 lakh per annum. A 10-plus years of experience in AI can pay anywhere between Rs60 lakh and Rs1.5 crore, depending upon the company," says Kaul. In the US, AI professionals with 10-plus years of experience earn upwards of $300,000 and with top hirers, this can go upwards of half a million dollars and stock, he says.
The market demand for data science in India is much smaller than the developer job market, as expected. As of 15 February, LinkedIn listed 2,657 jobs related to data science in India in the past month, while there were 31,378 results for other developer jobs. Kaul believes this is an underestimation, and some 4,000-5,000 data science jobs are added every month in India.
Kasliwal, the BITS Pilani alum who works at a stealth start-up in Bengaluru, strikes a note of caution about the kind of data science work done in India. "Even if you get a data science job, you will be doing a lot more development work, rather than data science. People looking to get into this field in India need to temper their expectations, and learn to take the ones and twos, rather than go for the sixes," he says, borrowing a metaphor from cricket. The students in Marathahalli could benefit from that advice.