An Introduction of
Hadoop:
Hadoop is an Open Source framework for Big data processing
and storing using simple programming language. Hadoop allow distributed data
processing of large data set across various cluster of computer
Read More- Hadoop Intoduction
Read More- What is Big data and Hadoop
Apache Spark:
Apache Spark is designed for cluster computing. Apache Spark
used HDFS file system and use computer memory and disk for fast computation.
Similarities Hadoop and Spark :
Similarities
|
Hadoop
|
Spark
|
open source framework
|
YES
|
YES
|
Real Time BI & Big data analytics
|
YES
|
YES
|
fault tolerance & scalability
|
YES
|
YES
|
JVM based programming languages
|
YES
|
YES
|
Difference Hadoop and
Spark :
Hadoop
|
Spark
|
Hadoop MapReduce is best suited for batch processing. For big data
applications that require real time options,
|
Apache Spark used for both batch processing and real time processing. |
Spark processes in-memory data
|
Hadoop MapReduce persists back to the disk
|
Hadoop MapReduce is written in Java. using Apache Pig makes it
easier to develop in Hadoop, although some time needs to be spent on
understand and learning the Syntax of Apache Pig. To add the SQL
compatibility to Hadoop, developers can use Hive on top of Hadoop.
|
Spark uses Scala tuples and they can only be intensified by nesting the generic types because Scala tuples are difficult to be implemented in Java. |
Read Some Use Articles:
1 comments:
Click here for commentsCheck out the latest Bollywood news, new Hindi movie reviews, box office collection updates Bollywood Hungama .
ConversionConversion EmoticonEmoticon