5 Ways Data Scientists Keep Learning After College
- Randy Bartlett has held analyst roles at Citibank, Wells Fargo, PwC, and AstraZeneca, has authored A Practitioner's Guide to Business Analytics, and holds two patents for predictive modeling.
- Edwin Chen has worked on ads quality at Twitter, quantitative analysis at Google, and data science at Dropbox. His blog is a must-read among data enthusiasts.
- Jason Dolatshahi created a data science curriculum for General Assemb.ly and taught the first session of the introduction to data science course. He is currently the Manager of Data Science at Bonobos.
- Amy Heineike co-authored Data Scientists at Work, was the Head of Mathematics at Quid, and is now the VP of Technology at a stealth startup.
- Rob Hyndman has written more than 100 research papers and 5 books. He is currently the Editor-in-Chief of the International Journal of Forecasting.
- Mark Madsen has received numerous information management awards, including the Smithsonian/Computerworld award for innovative use of information technology. He is the President of Third Nature.
- Andreas Weigend is the former Chief Scientist at Amazon. He's written over 100 scientific papers on machine learning techniques and is currently a professor at the UC Berkeley Social Lab.
Hanging around Q&A sites like crossvalidated.com is really useful. Typically, someone who's practicing data science will also be attending conferences like useR!, their local data science meetup group, or their local R user group. There are often speakers coming through that they get new ideas from, or they're discussing some package they've heard of. There's a lot of self-learning happening that way.
Don't be swayed by consultants who tell you Hadoop is data science. It's not about the plumbing; it's what you do with it. Many consultants make money by selling you systems, but instead you should ask the right questions. That's why data scientists come from other fields like physics; they are used to carrying out experiments and forming hypotheses. Know what tool to pick for a given problem, and formulate the question.
The classes offered now are extremely valuable because you don't need a master's or any other qualification if you're interested in learning the material. The nice thing about taking a class is that it gives you a supportive learning environment with instructors, TAs, opportunities to ask questions, and a community of fellow students.
Coursera has a fantastic data science program. I think there are four separate Coursera courses run out of Johns Hopkins that are really good, and I recommend them regardless of your background.
You can't really do anything without software, so the books that teach you in the context of a particular software package are some of the best. Frank Harrell wrote a book on using R for survival analysis and basic regression. What you want are books written by practitioners, people who've actually done things in the field.
I want people who will bring something to the table, so maybe they have expertise in some area of statistics, mathematics, or computer science that's novel to the team and broadens what we can think about. You need to have a ferocious appetite for learning and know how to cope with continually not knowing what you're doing; you have to know how to continually be in a position where you have to learn a lot.