R vs Pandas & Python – Which is better suited for Data Science applications, and why?

over 3 years ago

I was curious upon hearing a Data Scientist mention that he uses Pandas, the open source data analysis library for the Python programming language, instead of R, and decided to investigate.

I understand that R is a key part of the Data Scientist toolset and wanted to find out three things about R and Pandas:


  1. Whether Pandas would overtake R in popularity within the data scientist community,

  2. Whether, and why, our clients may be requesting the Python / Pandas skillset over R,

  3. Whether a data scientist might use Pandas over R for certain Data Science tasks.


The obvious benefit of both Pandas and R is that they are open source software and free to download. That aside, what would make a Data Scientist on the front line choose one over the other?

I found some interesting articles focussing on a comparison between R and Pandas:

As to a definitive winner, the evidence seems inconclusive. Both R and Python / Pandas have arguably strong analytical capabilities, and it appears that the choice between the two will depend on the type of analytical challenges one is facing.

It would be interesting to hear any comments / experiences from Data Scientists who have used either R or Pandas for particular tasks / assignments and why they have chosen one over the other.

Chris Wright,


