The complete Graph Analytics for Big Data course is currently offered by UC San Diego through the Coursera platform.
About this Course
After completing this course, you will be able to model a problem into a graph database and perform analytical tasks over the graph in a scalable manner. Better yet, you will be able to apply these techniques to understand the significance of your data sets for your own projects.
Skills You Will Gain
- Graph Theory
- Neo4j
- Analytics
- Graph Database
Quiz 1 – Introduction to Graphs
Q1) Which of the following are graphs? (check all that apply)
Q2) Which of the following is the correct adjacency matrix for this graph?
- Neither option is correct.
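The graph for Q2 is not reproduced here, so the following is only a minimal sketch of how an adjacency matrix is built, using a hypothetical three-node directed graph (the node names and edges are assumptions for illustration). Entry `[i][j]` is 1 when an edge runs from node `i` to node `j`.

```python
def adjacency_matrix(nodes, edges):
    """Return a 0/1 matrix where entry [i][j] is 1 if an edge goes from nodes[i] to nodes[j]."""
    index = {name: i for i, name in enumerate(nodes)}
    matrix = [[0] * len(nodes) for _ in nodes]
    for src, dst in edges:
        matrix[index[src]][index[dst]] = 1
    return matrix


# Hypothetical directed graph: A -> B, A -> C, C -> B
nodes = ["A", "B", "C"]
edges = [("A", "B"), ("A", "C"), ("C", "B")]
matrix = adjacency_matrix(nodes, edges)

for row in matrix:
    print(row)
```

For an undirected graph, each edge would be recorded in both directions, which makes the matrix symmetric.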
- Created_post (the action of creating a post)
- friends (the action of making someone your friend)
- extract conversation threads
- find interacting groups of users
- find influencers in a Twitter community
- The complexity of interactions that correlate to inform phenotypes.
- The new use of computational techniques to explore new areas of biology research more quickly than can be done with “live” or wetlab experiments.
- The integration of multiple data sources from different researchers and of different sources of information.
- Velocity
- Valence
- Variety
- Volume
- Valence
- Velocity
- Volume
- Variety
- Variety
- Valence
- Volume
- Velocity
- Velocity
- Variety
- Volume
- Valence
- Variety
- Valence
- Velocity
- Volume
- Valence
- Velocity
- Variety
- Volume
Quiz 2 – Graph Analytics Applications
- the total number of emails sent by one user in a week
- the total number of people who sent an email in a week
- average number of emails sent from one user to another in a week
- where there is an edge from a node to itself.
- when there is an edge from A->B, there is also an edge from B->A.
- where there is a path in some way from a node, through 1 or more other nodes, back to the original node.
- Separate graphs for each kind of relationship
- Multiple nodes for each of Maria and Julio, to capture the various relationships
- Multiple edges between Maria and Julio
- Routing to avoid visiting the same city.
- An email network tracing email replies.
- Routing to avoid using the same bridge or road.
- An email network tracing frequency of emails from one person to another.
- Inclusion of nodes and/or edges
- Exclusion of nodes and/or edges
- Avoid roads under construction
Q17) In the video on “Inclusion and Exclusion Constraints” we learn that adding constraints can actually make our analysis job easier. For example, when we require that a given node be included on a path, which of the following impacts now make the analysis job easier?
- Reduction of the size of the graph
- Splitting the task into 2 independent shortest path problems
- Changing the weights on the edges of the graph and/or subgraphs
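To illustrate the idea in Q17 of splitting a path search around a required node, here is a minimal sketch that computes a route from a source to a target through a mandatory intermediate node as two independent shortest-path searches. The road network, node names, and the helper `shortest_via` are hypothetical, and the sketch assumes non-negative edge weights (it uses Dijkstra's algorithm) and allows the two legs to share nodes.

```python
import heapq


def dijkstra(graph, source):
    """Return a dict of shortest distances from source to every reachable node."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        for neighbor, weight in graph.get(node, {}).items():
            candidate = d + weight
            if candidate < dist.get(neighbor, float("inf")):
                dist[neighbor] = candidate
                heapq.heappush(heap, (candidate, neighbor))
    return dist


def shortest_via(graph, source, must_visit, target):
    """Length of the shortest source -> must_visit -> target route, as two separate searches."""
    first_leg = dijkstra(graph, source).get(must_visit, float("inf"))
    second_leg = dijkstra(graph, must_visit).get(target, float("inf"))
    return first_leg + second_leg


# Hypothetical weighted, directed road network.
roads = {
    "A": {"B": 1, "E": 2},
    "B": {"D": 1},
    "E": {"D": 4},
    "D": {},
}

unconstrained = dijkstra(roads, "A")["D"]
through_e = shortest_via(roads, "A", "E", "D")
print(unconstrained, through_e)
```

Each leg is an ordinary shortest-path problem, so no special constrained-search algorithm is needed for this kind of inclusion constraint.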
Quiz 3 – Connectivity, Community, and Centrality Analytics
Q1) The example given in the lectures of when a power network loses power in large portions of its service area was an example of what?
- a problem that can occur when centrality is too high
- an attack which causes disconnection of the graph
- high levels of connectivity which make it easy to bring a network down
Q3) Is the following graph strongly connected, weakly connected or neither?
- nodes that, if they were removed, would cause the graph to go from strongly connected to weakly connected
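The graph for Q3 is not reproduced here. As a rough sketch of how the distinction can be checked, the code below treats a directed graph as strongly connected when every node is reachable from a start node both along and against the edge directions, and as weakly connected when that only holds after edge directions are ignored. The function names and sample graphs are assumptions for illustration, and the node list is assumed to be non-empty.

```python
def reachable(adjacency, start):
    """Return the set of nodes reachable from start by depth-first search."""
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for neighbor in adjacency.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                stack.append(neighbor)
    return seen


def classify_connectivity(nodes, edges):
    """Classify a directed graph as strongly connected, weakly connected, or neither."""
    forward = {n: set() for n in nodes}
    backward = {n: set() for n in nodes}
    undirected = {n: set() for n in nodes}
    for src, dst in edges:
        forward[src].add(dst)
        backward[dst].add(src)
        undirected[src].add(dst)
        undirected[dst].add(src)
    everything = set(nodes)
    start = nodes[0]
    if reachable(forward, start) == everything and reachable(backward, start) == everything:
        return "strongly connected"
    if reachable(undirected, start) == everything:
        return "weakly connected"
    return "neither"


print(classify_connectivity(["A", "B", "C"], [("A", "B"), ("B", "C"), ("C", "A")]))
print(classify_connectivity(["A", "B", "C"], [("A", "B"), ("B", "C")]))
print(classify_connectivity(["A", "B", "C"], [("A", "B")]))
```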
Q8) What would we be looking for if we followed the steps below? Note: we have 2 graphs.
Create a table for each graph where, for each node, you list the degree of the node. For each graph, create a histogram indicating how many nodes in that graph have a specific degree (e.g., how many nodes have degree 1? 2? etc.). Use advanced approaches (e.g. Euclidean distances) to compare these two histograms.
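The steps described in Q8 can be sketched roughly as follows, assuming undirected graphs given as edge lists. The two sample graphs (a path and a star) and the function names are hypothetical; nodes with no edges do not appear in an edge list, so this sketch does not count degree-0 nodes.

```python
from collections import Counter
import math


def degree_histogram(edges):
    """Map each degree value to the number of nodes that have that degree."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return Counter(degree.values())


def histogram_distance(hist_a, hist_b):
    """Euclidean distance between two degree histograms, aligned on degree value."""
    degrees = set(hist_a) | set(hist_b)
    return math.sqrt(sum((hist_a[d] - hist_b[d]) ** 2 for d in degrees))


# Hypothetical graphs: a 4-node path and a 4-node star.
path_edges = [(1, 2), (2, 3), (3, 4)]
star_edges = [(1, 2), (1, 3), (1, 4)]

path_hist = degree_histogram(path_edges)
star_hist = degree_histogram(star_edges)
distance = histogram_distance(path_hist, star_hist)
print(dict(path_hist), dict(star_hist), distance)
```

A distance of zero means the two degree histograms are identical; larger values indicate that the graphs' degree distributions differ more.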
Did a community form on Twitter around the 2014 World Cup in Brazil?
How tightly knit was the 2014 World Cup Twitter community on July 13, 2014 (the day of the finals)?
- the biggest gossip in the network
- a node which can reach all other nodes quickly
- a node which has heavy weight edges to at least 1/2 of the nodes in the network
- What is the shortest path through a network
- A set of nodes which can reach (almost) all other nodes
- Which nodes’ removal will maximally disrupt the network
- Which nodes have the highest ratio of out-degree nodes to in-degree nodes
Quiz 4 – Graph Analytics with Neo4j
Q2) For a graph network whose nodes are all of type “MyNode”, which has both incoming and outgoing edges, and which has both root and leaf nodes, what will the following Cypher code return in a Neo4j report?
- All nodes except root nodes.
- The entire network, all nodes and edges
- All nodes and edges except leaf nodes and their edges.
Q4) The following query will return a graph containing whatever loops might exist.
Quiz 5 – Assessment Questions on ‘Practicing Graph Analytics in Neo4j With Cypher’
Q4) The query match (n)-[r]->(m) where m <> n return distinct n, m, count(r) gives us
- the count of all edges.
- None of the above
- the count of all edges between every adjacent node pair.
- the count of all non loop edges between every adjacent node pair.
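To make the behavior of the Q4 query easier to see, here is a small sketch of the same grouping logic outside Neo4j: the `m <> n` filter drops loop edges, and `count(r)` is grouped by each directed `(n, m)` pair. The edge list and function name are hypothetical.

```python
from collections import Counter


def count_edges_between_pairs(edges):
    """Count edges per directed (source, target) pair, skipping loops (source == target)."""
    return Counter((src, dst) for src, dst in edges if src != dst)


# Hypothetical edge list with a multi-edge (a -> b twice) and a loop (c -> c).
edges = [("a", "b"), ("a", "b"), ("b", "c"), ("c", "c"), ("b", "a")]
pair_counts = count_edges_between_pairs(edges)

for (src, dst), count in sorted(pair_counts.items()):
    print(src, dst, count)
```

Because the pattern is directed, `("a", "b")` and `("b", "a")` are counted as separate pairs.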
- a random edge
- two neighboring nodes, each with a high outdegree
- the node with the maximum number of looping edges
- the pair of nodes with the maximum number of multi-edges between them
- The neighbors of the node whose name is ‘BRCA1’
- The 2-neighborhood of the node whose name is ‘BRCA1’
- The neighbors’ neighbors of the node whose name is ‘BRCA1’
- The neighbors whose distance is greater than 1 and less than 2 of the node whose name is ‘BRCA1’
Q8) The top 2 nodes with the highest outdegree are:
- GRB2 and TP53
- MEPCE and EGFR
- SNCA and BRCA1
- EP300 and BRCA1
Quiz 6 – Using GraphX
Q1) In this code snippet below from the Hands On exercise on importing data, ‘100L + row…’ adds 100 to the value of every country ID. Which of the following statements are true regarding this decision? (Note: you may select more than one)
val countries: RDD[(VertexId, PlaceNode)] =
  sc.textFile("./EOADATA/country.csv").
    filter(! _.startsWith("#")).
    map { line =>
      val row = line split ','
      (100L + row(0).toInt, Country(row(1)))
    }
- Another option would be to add 500 to the country keys.
- This step was needed to create unique keys between the country and the metropolis datasets.
- Another option would have been to add 100 to the metropolis keys as they were imported, and leave the country keys as they were originally numbered.
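The effect of offsetting IDs, as discussed in Q1, can be sketched without Spark. In this hypothetical example both datasets number their rows from 1, so merging them into one vertex set would make the keys collide unless one side is shifted. The sample rows, the `load_vertices` helper, and the constant name are assumptions for illustration only.

```python
COUNTRY_ID_OFFSET = 100  # must be larger than the biggest ID in the other dataset


def load_vertices(rows, offset=0):
    """Build {vertex_id: name} from (raw_id, name) rows, shifting IDs by offset."""
    return {offset + int(raw_id): name for raw_id, name in rows}


# Hypothetical rows; both files start numbering at 1.
metropolis_rows = [("1", "Tokyo"), ("2", "Seoul")]
country_rows = [("1", "Japan"), ("2", "South Korea")]

metropolises = load_vertices(metropolis_rows)
countries = load_vertices(country_rows, offset=COUNTRY_ID_OFFSET)

# With the offset applied, the two key sets are disjoint and can be merged safely.
vertices = {**metropolises, **countries}
print(sorted(vertices.items()))
```

Without the offset, merging the two dictionaries would silently overwrite entries that share a key.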
- A metro area or metropolis
- It had a vertex ID of 205
- It is the green dot that has no connections, or it is the least connected cluster
- In a directed graph, the stalks are large.
- Social networks have communities or pockets of people who interact densely.
- The high centrality of some people nodes in Facebook gives the graph its broccoli shape.