Nand Kishor Contributor

Nand Kishor is the Product Manager of House of Bots. After finishing his studies in computer science, he ideated and re-launched Real Estate Business Intelligence Tool, where he created one of the leading business intelligence tools for property price analysis in 2012. He also writes, researches and shares knowledge about Artificial Intelligence (AI), Machine Learning (ML), Data Science, Big Data, the Python language, etc.


Machine Learning Showdown: Python vs R

By Nand Kishor | Aug 3, 2017

Let's say you have an amazing idea for a machine learning app. It's going to be brilliant. It's going to revolutionize the world of finance, mobile advertising, or... some other world, but it's definitely going to revolutionize something. And gosh darn it, it's going to be the smartest, most learned app the world has ever seen.

The only thing standing between you and glory is the small matter of actually coding your brilliant idea. The first question to ask yourself is which programming language to use for your app, and the two immediate candidates are likely to be R and Python.

Each of these languages has its pros, cons, and diehard fanbase. This article is meant to help developers choose between these two bitter rivals, in the context of machine learning (for a more general, feature-by-feature comparison you might want to check out this great infographic by DataCamp).

Let's get down to it then!

Round 1: Ease of Development

Python lets you hit the ground running... if you have programming experience.

While both Python and R are completely manageable and used by many developers in both business and academia, Python lends itself more easily to developers who have experience with other programming languages. Its syntax is more familiar than R's, while also closer to regular English text - making it easier to read and debug.
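To give a feel for that readability, here is a minimal sketch of a one-nearest-neighbor classifier using only Python's standard library. The function name and the toy data are invented for illustration; a real app would more likely rely on an established library.

```python
import math


def nearest_neighbor(sample, labeled_points):
    """Return the label of the training point closest to `sample`."""
    # Each item in labeled_points is a (features, label) pair.
    closest_point, closest_label = min(
        labeled_points,
        key=lambda pair: math.dist(sample, pair[0]),
    )
    return closest_label


# Hypothetical toy data, made up for this example.
training_data = [
    ((1.0, 1.0), "small"),
    ((1.5, 2.0), "small"),
    ((8.0, 9.0), "large"),
    ((9.0, 8.5), "large"),
]

prediction = nearest_neighbor((2.0, 1.5), training_data)
print(prediction)
```

Even without comments, the body reads almost like the sentence "pick the labeled point with the minimum distance to the sample".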

R is very popular with advanced business users - e.g. data analysts in fields such as retail, marketing or finance - who come from more of a statistics background, rather than programming or software development. Since you're developing a machine learning app, we're guessing you're closer to the latter group - in which case you might appreciate Python's flexibility, readability and similarity to the type of programming you already know and love.

Winner: Python

Round 2: Robustness and Production Readiness

Python fits more naturally into a complex coding environment.

While applications of R in the business world are definitely on a growth trajectory, Python is still a more full-fledged programming language and is used for many types of web and other applications, in addition to its data science applications. R, on the other hand, is still mostly used for data analysis and advanced statistical modeling.

Hence, assuming you would want to integrate your machine learning algorithms into some kind of interface that's communicating with other code, written by other programmers, Python might be the better choice. R can be used for rapid prototyping or to solve a specific problem, but Python will be easier to maintain and scale in the long run (especially considering its versioning and documentation are far more consistent).

Winner: Python

Round 3: External Libraries

While both languages have a breadth of external libraries that can be (relatively) easily used in a machine learning project, Python's are a bit more mature. Specifically, scikit-learn is an extremely popular, open-source machine learning package that is used in many commercial applications.

Meanwhile, R libraries such as caret are catching up, but are not quite there yet when it comes to breadth of functionality. With R you might be able to more quickly build and launch your first model - but mastering scikit-learn and similar libraries will provide you with a deeper and more complete toolset that you can feel safe using in your machine learning app.

Winner: Python

Round 4: Performance with Big Data

R can provide better performance when performing large computations.

Machine learning will often involve working with massive datasets and highly complex computations to train and test your algorithms - so you'll want to make sure the programming language you use will perform well in these kinds of scenarios.

While both R and Python can integrate with Hadoop for big data, newer R packages utilize C to provide better performance for large-scale computation. Hence, you might get faster results when using R in these situations.
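The mechanism behind that claim, handing the inner loop to compiled C code instead of running it in the interpreter, is not specific to R. As a rough sketch of the effect, the snippet below times a hand-written Python loop against the built-in sum, which CPython implements in C. The function name is invented for illustration, and the timings will vary by machine, so treat the numbers as indicative only.

```python
import timeit


def interpreted_sum(values):
    """Add the values one by one in an interpreted Python loop."""
    total = 0
    for value in values:
        total += value
    return total


numbers = list(range(1_000_000))

# Both approaches must agree before their speed is compared.
assert interpreted_sum(numbers) == sum(numbers)

loop_seconds = timeit.timeit(lambda: interpreted_sum(numbers), number=5)
builtin_seconds = timeit.timeit(lambda: sum(numbers), number=5)

print(f"interpreted loop: {loop_seconds:.3f}s")
print(f"built-in sum:     {builtin_seconds:.3f}s")
```

This says nothing about R versus Python directly. It only shows why a package that pushes heavy computation down to C can run large jobs faster than code that stays in the interpreter.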

Winner: R

Round 5: Statistics and Data Visualization

While this would not be the core of your machine learning software, your app might very well include some elements of statistics, analytics and data visualization.

Here, R is the hands-down winner as a tool that's built from the ground up to provide a robust platform for advanced statistical analysis. Integrating ggplot2 will enable you to create some really nifty visualizations as well, and companion packages such as plotly can turn them into interactive, browser-based graphs and charts.

While Python can be and is used for statistical analysis and data visualization, R will probably be the better choice for this type of functionality - especially when it comes to 'one-off' operations, prototyping and testing various hypotheses (versus creating reusable and extensible features).
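For comparison, here is what a small one-off analysis might look like in Python using only the standard library. This is a sketch: the helper name and the ad-spend and sales figures are invented for illustration, and the Pearson correlation is computed by hand so the example does not depend on any particular Python version.

```python
import math
import statistics


def pearson_correlation(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = statistics.mean(xs)
    mean_y = statistics.mean(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


# Hypothetical figures, made up for this example.
ad_spend = [10, 20, 30, 40, 50]
sales = [12, 25, 31, 48, 55]

print("mean sales:", statistics.mean(sales))
print("sample standard deviation:", statistics.stdev(sales))
print("correlation with ad spend:", pearson_correlation(ad_spend, sales))
```

It works, but the helper has to be written by hand, which hints at why a language built around statistics feels more natural for this kind of exploratory task.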

Winner: R

And the overall winner is...

Python. With the necessary caveats that every application, use case and business scenario is different, Python is the more mature, fully-fledged and flexible option for machine learning - and for creating complex coding projects in general. However, with R's rapid development and growing popularity, we won't be surprised if it catches up within a few years.

P.S.: if you're developing from scratch, it's probably neither

Our discussion above assumes you would want to use an external library and build your machine learning app around it. Unless you've got a team of programming superstars, this is probably the direction you'd go.

However, if you want to start from scratch and rewrite the libraries themselves - either as a research project or because you have a truly brilliant idea for optimizing some of the under-the-hood processes - then you probably would use a compiled language (rather than an interpreted one), such as C or Java. In fact, most of the external libraries you'll be using are actually written in these languages.
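To make "from scratch" concrete, here is a minimal sketch of fitting a straight line with gradient descent and no libraries at all. It is written in Python only for readability; the function name, learning rate, epoch count and data are assumptions chosen for illustration. The nested loops are exactly the kind of under-the-hood work that production libraries hand off to compiled code.

```python
def fit_line(xs, ys, learning_rate=0.05, epochs=2000):
    """Fit y = slope * x + intercept by minimizing mean squared error."""
    slope, intercept = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        slope_gradient = 0.0
        intercept_gradient = 0.0
        # Accumulate the gradient of the mean squared error over all points.
        for x, y in zip(xs, ys):
            error = (slope * x + intercept) - y
            slope_gradient += 2 * error * x / n
            intercept_gradient += 2 * error / n
        # Step against the gradient.
        slope -= learning_rate * slope_gradient
        intercept -= learning_rate * intercept_gradient
    return slope, intercept


# Hypothetical data generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(xs, ys)
print(f"slope={slope:.4f}, intercept={intercept:.4f}")
```

On five points this finishes instantly. On millions of rows, the same loops running in an interpreter become the bottleneck, which is the practical argument for moving them to a compiled language.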


Source: Webhose