Introduction to Big Data Week 5 Quiz Answers


The complete Introduction to Big Data course is offered by UC San Diego through the Coursera platform.

Learning Outcomes for Introduction to Big Data Course!

At the end of this course, you will be able to:

* Describe the Big Data landscape, with examples of real-world big data problems and the three key sources of Big Data: people, organizations, and sensors.

* Explain the V’s of Big Data (volume, velocity, variety, veracity, valence, and value) and why each impacts data collection, monitoring, storage, analysis and reporting.

* Get value out of Big Data by using a 5-step process to structure your analysis. 

* Identify what is and is not a big data problem, and be able to recast big data problems as data science questions.

* Provide an explanation of the architectural components and programming models used for scalable big data analysis.

* Summarize the features and value of core Hadoop stack components including the YARN resource and job management system, the HDFS file system and the MapReduce programming model.

Instructors for Introduction to Big Data Course!

– Ilkay Altintas

– Amarnath Gupta

Skills You Will Gain

  • Big Data
  • Apache Hadoop
  • MapReduce
  • Cloudera


Introduction to Big Data Coursera Week 5 Quiz Answers!

Intro to MapReduce

Q1) What does IaaS provide?

  • Answer: Hardware, i.e. access to raw computing infrastructure (servers, storage, and networking).

Q2) What does PaaS provide?

  • Answer: A computing environment, i.e. the platform (operating system, runtime, and tools) on which applications are built and run.

Q3) What does SaaS provide?

  • Answer: Software on demand, i.e. complete applications delivered over the web.
Q4) What are the two key components of HDFS and what are they used for?

  • NameNode for block storage and DataNode for metadata.
  • ✓ NameNode for metadata and DataNode for block storage.
  • FASTA for genome sequence and Rasters for geospatial data.
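The division of labor in the correct answer, with the NameNode holding metadata and DataNodes holding blocks, can be sketched as a toy Python model. This is a conceptual illustration only: the file name, block size, and round-robin placement here are invented for the sketch, while real HDFS is a distributed service with a 128 MB default block size and replication.

```python
# Toy model of the HDFS split between metadata and block storage.
# (Conceptual sketch only; real HDFS is a distributed Java service.)

BLOCK_SIZE = 8  # bytes, tiny for illustration (real HDFS defaults to 128 MB)

namenode = {}         # NameNode: metadata -> which blocks make up each file
datanodes = [{}, {}]  # DataNodes: actual block storage, spread across nodes

def put(filename, data):
    block_ids = []
    for i in range(0, len(data), BLOCK_SIZE):
        block_id = f"{filename}#blk{i // BLOCK_SIZE}"
        # Blocks land on DataNodes (round-robin here; HDFS also replicates).
        datanodes[len(block_ids) % len(datanodes)][block_id] = data[i:i + BLOCK_SIZE]
        block_ids.append(block_id)
    namenode[filename] = block_ids  # The NameNode records only metadata.

def get(filename):
    # A client asks the NameNode where the blocks are, then reads DataNodes.
    blocks = namenode[filename]
    return "".join(next(dn[b] for dn in datanodes if b in dn) for b in blocks)

put("poem.txt", "big data is really big")
print(get("poem.txt"))  # reassembles "big data is really big"
```

Note how the NameNode never touches file contents; it only maps file names to block locations, which is why losing it (without a backup) makes the cluster's data unreadable.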

Q5) What is the job of the NameNode?

  • For gene sequencing calculations.
  • ✓ Coordinates operations and assigns tasks to DataNodes.
  • Listens to DataNodes for block creation, deletion, and replication.

Q6) What is the order of the three steps to Map Reduce?

  • ✓ Map -> Shuffle and Sort -> Reduce
  • Shuffle and Sort -> Map -> Reduce
  • Map -> Reduce -> Shuffle and Sort
  • Shuffle and Sort -> Reduce -> Map
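The correct order (Map, then Shuffle and Sort, then Reduce) can be illustrated with a minimal word-count sketch in plain Python. This is illustrative only: a real Hadoop job distributes these phases across the cluster, with mappers and reducers on different nodes.

```python
from collections import defaultdict

def map_phase(lines):
    # Map: emit a (word, 1) key-value pair for every word in the input.
    for line in lines:
        for word in line.split():
            yield (word, 1)

def shuffle_and_sort(pairs):
    # Shuffle and Sort: group all emitted values by key, in key order.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return sorted(groups.items())

def reduce_phase(groups):
    # Reduce: sum the counts for each word.
    return {key: sum(values) for key, values in groups}

lines = ["big data big ideas", "big clusters"]
counts = reduce_phase(shuffle_and_sort(map_phase(lines)))
print(counts)  # {'big': 3, 'clusters': 1, 'data': 1, 'ideas': 1}
```

The same shape applies to any MapReduce job: only the map and reduce functions change, while the framework supplies the shuffle-and-sort step in between.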

Q7) What is a benefit of using pre-built Hadoop images?

  • Guaranteed hardware support.
  • Fewer software choices to choose from.
  • ✓ Quick prototyping, deploying, and validating of projects.
  • Quick prototyping, deploying, and guaranteed bug-free.

Q8) What is an example of open-source tools built for Hadoop and what does it do?

  • Giraph, for SQL-like queries.
  • Zookeeper, for analyzing social graphs.
  • Pig, for real-time and in-memory processing of big data.
  • ✓ Zookeeper, a management system for the (often animal-named) components of the Hadoop ecosystem.

Q9) What is the difference between low level interfaces and high level interfaces?

  • ✓ Low level deals with storage and scheduling while high level deals with interactivity.
  • Low level deals with interactivity while high level deals with storage and scheduling.

Q10) Which of the following are problems to look out for when integrating your project with Hadoop?

  • ✓ Infrastructure Replacement

Q11) As covered in the slides, which of the following are the major goals of Hadoop?

  • ✓ Facilitate a Shared Environment
  • ✓ Optimized for a Variety of Data Types

Q12) What is the purpose of YARN?

  • Implementation of MapReduce.
  • Enables large-scale data across clusters.
  • ✓ Allows various applications to run on the same Hadoop cluster.

Q13) What are the two main components for a data computation framework that were described in the slides?

  • Node Manager and Container
  • Resource Manager and Container
  • Applications Master and Container
  • Node Manager and Applications Master
  • ✓ Resource Manager and Node Manager


