Spark - Matei Zaharia
Spark 5 times faster than Hive on disk
Spark 18 times faster than Hive in memory (RAM)
Spark up to 100 times faster than MapReduce
Spark Stack - Shark (SQL), Spark Streaming, MLlib (machine learning), GraphX
Hadoop - Batch Processing
Spark - Iterative Processing
YARN - Resource Manager
HDFS, HBase, etc. - Storage
120 lines in Scala, compared to 15K lines in C++
30 minutes to run on 100 million samples
Yahoo Ad Analytics - Hive on Spark (Shark)
Storm - Streaming
Hadoop
MapReduce - batch processing
Impala - SQL processing in Big Data
Spark - Shark (Hive SQL queries on top of Spark)
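
The "Hadoop - Batch Processing" versus "Spark - Iterative Processing" note above is easier to see with a toy sketch. This is plain Python, not the Spark API. The `Dataset` class, its `disk_reads` counter, and both functions are hypothetical names made up for illustration. The sketch assumes an iterative algorithm that scans the same input on every pass. A chain of batch jobs re-reads the input each time, while an in-memory approach (roughly what caching a dataset in Spark is meant to give you) loads it once and reuses it.

```python
# Toy illustration only: plain Python, not Spark or Hadoop code.

class Dataset:
    """Hypothetical stand-in for a dataset stored on disk (e.g. in HDFS)."""

    def __init__(self, rows):
        self._rows = list(rows)
        self.disk_reads = 0  # counts how many times the "disk" was scanned

    def load(self):
        """Simulate a full scan of the stored data."""
        self.disk_reads += 1
        return list(self._rows)


def step(estimate, rows):
    """One pass of a simple iterative update: move halfway toward the mean."""
    mean = sum(rows) / len(rows)
    return estimate + 0.5 * (mean - estimate)


def iterate_batch_style(dataset, iterations):
    """Re-read the input on every pass, as a chain of batch jobs would."""
    estimate = 0.0
    for _ in range(iterations):
        rows = dataset.load()  # one scan per iteration
        estimate = step(estimate, rows)
    return estimate


def iterate_cached_style(dataset, iterations):
    """Load once, keep the rows in memory, and reuse them on every pass."""
    rows = dataset.load()  # single scan, then reuse
    estimate = 0.0
    for _ in range(iterations):
        estimate = step(estimate, rows)
    return estimate


batch_data = Dataset([1, 2, 3, 4])
cached_data = Dataset([1, 2, 3, 4])

batch_result = iterate_batch_style(batch_data, 10)
cached_result = iterate_cached_style(cached_data, 10)

print(batch_data.disk_reads, cached_data.disk_reads)
```

Both versions compute the same estimate. Only the number of scans over the stored data differs, and that gap grows with the iteration count. Real speedups depend on the workload and cluster, so this shows the shape of the argument rather than a benchmark.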