When Python has ML libraries, why do you need Apache Spark for analytics?

Awantik Das
Posted on May 22, 2017, 10:55 a.m.

Summing it up in two points:

  1. Python ML libraries do not inherently support distributed computing; they typically work on data that fits on a single machine.
  2. Spark MLlib leverages distributed computing infrastructure without the programmer having to manage it.
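To make the second point concrete, here is a minimal sketch in plain Python of the split, compute, and combine pattern that a distributed ML library hides from the programmer. It is an illustration only, not Spark's actual implementation: the function names (`partition`, `partial_stats`, `fit_line`) are hypothetical, and threads on one machine stand in for the nodes of a cluster.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def partition(data, n_parts):
    """Split data into roughly equal chunks (stand-ins for cluster partitions)."""
    size = (len(data) + n_parts - 1) // n_parts
    return [data[i:i + size] for i in range(0, len(data), size)]


def partial_stats(chunk):
    """Work done independently on each chunk: the sums needed for a
    one-variable least-squares fit of y = slope * x + intercept."""
    n = len(chunk)
    sx = sum(x for x, _ in chunk)
    sy = sum(y for _, y in chunk)
    sxx = sum(x * x for x, _ in chunk)
    sxy = sum(x * y for x, y in chunk)
    return n, sx, sy, sxx, sxy


def combine(a, b):
    """Merge the statistics of two chunks by adding them element-wise."""
    return tuple(p + q for p, q in zip(a, b))


def fit_line(data, n_parts=4):
    """Fit a line by computing per-chunk sums in parallel and merging them.
    Assumes data is non-empty and contains at least two distinct x values."""
    chunks = partition(data, n_parts)
    # Threads here are an assumption standing in for separate machines.
    with ThreadPoolExecutor(max_workers=n_parts) as pool:
        parts = list(pool.map(partial_stats, chunks))
    n, sx, sy, sxx, sxy = reduce(combine, parts)
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    intercept = (sy - slope * sx) / n
    return slope, intercept


if __name__ == "__main__":
    points = [(x, 2 * x + 1) for x in range(100)]
    print(fit_line(points))
```

With a single-machine library you would hand the whole dataset to one fit call. In a cluster setting, someone has to do the partitioning, scheduling, and merging shown above, and a framework like Spark takes that work off the programmer.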

About the Expert: Awantik Das is a Technology Evangelist currently working as a Corporate Trainer. He has trained more than 3000 professionals from Fortune 500 companies, including Cognizant, Mindtree, HappiestMinds, CISCO and others. He is also involved in talent acquisition consulting for leading companies on niche technologies. Previously, he worked with technology companies such as CISCO, Juniper and Rancore (a Reliance Group company).




Keywords: data-science, spark, scala

