Top 3 Applications of Apache Spark

Awantik Das
Posted on May 22, 2017, 12:07 p.m.

According to statistics published by Databricks, the top three applications of Apache Spark are the following.

  • BUSINESS / CUSTOMER INTELLIGENCE (68% of Spark users)
  • DATA WAREHOUSING (52% of Spark users)
  • REAL-TIME / STREAMING SOLUTIONS (45% of Spark users)

BUSINESS / CUSTOMER INTELLIGENCE

Business Intelligence (BI) is the practice of deriving presentable, actionable information that helps corporate executives, business managers, and other stakeholders make informed business decisions. Its benefits include faster and better decisions and the discovery of new opportunities.

Customer Intelligence (CI) is the information derived from customer data that an organization can use to understand customer needs and serve customers better.

Before Spark, accessing a few days of data could take 24 hours. With Spark, a year's worth of data can be processed in a 10-minute coffee break.

Because Spark makes BI and CI this fast, businesses gain a real competitive edge over their rivals. The simplest example: understanding your customers before your rivals do is an unparalleled advantage.

DATA WAREHOUSING

Traditional data warehouses are great for structured data. But today's data is characterized by the 4 V's (Volume, Velocity, Variety, and Veracity). Data comes from various sources such as smartphones, sensors, social media, logs, and transactions. Your competitive edge lies in processing it faster. Data warehouses built using Spark SQL can address the 4 V's and give you an edge over competitors.

REAL-TIME / STREAMING SOLUTIONS

Organizations receive data in real time from various sources such as sensors, mobile devices, IoT devices, Twitter, and online transactions. All of this data needs to be monitored and processed, so the need of the hour is large-scale, real-time data processing capability.

Streaming ETL: data is continuously cleaned and aggregated before it is pushed to data stores. Spark Streaming is used by companies like Pinterest to provide live insight into how users are engaging with Pins across the world. Based on this, Pinterest's recommendation engine shows more related Pins.
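To make the streaming ETL idea concrete, here is a minimal sketch in plain Python. It is not Spark's API; it only imitates the micro-batch pattern of cleaning and aggregating events before pushing them to a store. The event fields (`pin_id`, `action`), the helper names, and the in-memory `store` are all assumptions made up for illustration.

```python
from collections import Counter
from typing import Iterable, Iterator


def micro_batches(events: Iterable, batch_size: int) -> Iterator[list]:
    """Group an incoming event stream into fixed-size micro-batches."""
    batch = []
    for event in events:
        batch.append(event)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # flush the final, partially filled batch
        yield batch


def clean(batch: list) -> list:
    """Drop malformed events and normalize the action name."""
    cleaned = []
    for event in batch:
        pin_id = event.get("pin_id")
        action = event.get("action")
        if pin_id is None or not isinstance(action, str):
            continue  # skip events we cannot interpret
        cleaned.append({"pin_id": pin_id, "action": action.strip().lower()})
    return cleaned


def aggregate(batch: list) -> Counter:
    """Count events per (pin_id, action) pair within one batch."""
    return Counter((e["pin_id"], e["action"]) for e in batch)


def run_etl(events: Iterable, store: Counter, batch_size: int = 3) -> None:
    """Clean and aggregate each micro-batch, then push the result to the store."""
    for batch in micro_batches(events, batch_size):
        store.update(aggregate(clean(batch)))


# Hypothetical sample events; the field names are assumptions.
sample_events = [
    {"pin_id": 1, "action": " Click "},
    {"pin_id": 1, "action": "click"},
    {"pin_id": 2, "action": "SAVE"},
    {"pin_id": None, "action": "click"},  # malformed: missing pin_id
    {"pin_id": 2, "action": "save"},
]

store = Counter()  # stands in for a real data store
run_etl(sample_events, store)
print(dict(store))
```

In a real deployment the batching, fault tolerance, and distribution across machines would be handled by Spark itself; the sketch only shows the order of the steps: batch, clean, aggregate, store.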

Other applications that use Apache Spark include RECOMMENDATION ENGINES, LOG PROCESSING, USER-FACING SERVICES, and FRAUD DETECTION.


Awantik Das is a Technology Evangelist currently working as a Corporate Trainer. He has trained more than 3,000 professionals from Fortune 500 companies such as Cognizant, Mindtree, HappiestMinds, Cisco, and others. He is also involved in Talent Acquisition Consulting for leading companies on niche technologies. Previously, he worked with technology companies such as Cisco, Juniper, and Rancore (a Reliance Group company).




Keywords : data-science spark

