Graph Analytics for Big Data Quiz Answers

Q17) In the video on “Inclusion and Exclusion Constraints” we learn that adding constraints can actually make our analysis job easier. For example, when we require that a given node be included on a path, which of the following impacts now make the analysis job easier?

Reduction of the size of the graph

Splitting the task into 2 independent shortest path problems

Changing the weights on the edges of the graph and/or subgraphs
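To illustrate the option about splitting the task: when a path from a source to a target must pass through a given node, the search can be treated as two separate shortest-path searches (source to required node, then required node to target) whose results are joined. The sketch below is only an illustration under stated assumptions. The small weighted directed graph, its node names, and the helper names dijkstra and shortest_path_via are hypothetical and do not come from the course material. Note that the joined result may revisit a node, so strictly speaking it is a shortest walk through the required node.

```python
import heapq

def dijkstra(graph, source, target):
    """Return (cost, path) for a cheapest path, or (inf, []) if unreachable."""
    queue = [(0, source, [source])]
    settled = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == target:
            return cost, path
        if node in settled:
            continue
        settled.add(node)
        for neighbor, weight in graph.get(node, {}).items():
            if neighbor not in settled:
                heapq.heappush(queue, (cost + weight, neighbor, path + [neighbor]))
    return float("inf"), []

def shortest_path_via(graph, source, required, target):
    """Solve two independent shortest-path problems and join the results."""
    cost_in, path_in = dijkstra(graph, source, required)
    cost_out, path_out = dijkstra(graph, required, target)
    if not path_in or not path_out:
        return float("inf"), []
    # Drop the duplicated required node when joining the two halves.
    return cost_in + cost_out, path_in + path_out[1:]

# Hypothetical example graph: node -> {neighbor: edge weight}
graph = {
    "A": {"B": 1, "C": 4, "E": 2},
    "B": {"C": 1, "D": 5},
    "C": {"D": 1},
    "E": {"D": 4},
    "D": {},
}

print(dijkstra(graph, "A", "D"))                # unconstrained search
print(shortest_path_via(graph, "A", "E", "D"))  # path forced through E
```

In this toy graph the unconstrained search goes through B and C, while requiring E yields a different, costlier route built from two smaller searches.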

Quiz 3 – Connectivity, Community, and Centrality Analytics

Q1) The example given in the lectures of when a power network loses power in large portions of its service area was an example of what?

a problem that can occur when centrality is too high

an attack which causes disconnection of the graph

high levels of connectivity which make it easy to bring a network down

Q2) Is the following graph strongly connected, weakly connected or neither?

Q3) Is the following graph strongly connected, weakly connected or neither?
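The graphs for Q2 and Q3 are not reproduced here, so the following is only a sketch of how the distinction could be checked on any directed graph. A directed graph is strongly connected if every node can reach every other node along edge directions, and weakly connected if it is connected only once directions are ignored. The function names and the example edge lists are hypothetical.

```python
def reachable(adjacency, start):
    """Return the set of nodes reachable from start via depth-first search."""
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for neighbor in adjacency.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                stack.append(neighbor)
    return seen

def classify_connectivity(nodes, edges):
    """Classify a directed graph as strongly connected, weakly connected, or neither."""
    forward = {n: set() for n in nodes}
    backward = {n: set() for n in nodes}
    undirected = {n: set() for n in nodes}
    for u, v in edges:
        forward[u].add(v)
        backward[v].add(u)
        undirected[u].add(v)
        undirected[v].add(u)
    all_nodes = set(nodes)
    start = nodes[0]
    # Strong: start reaches everyone, and everyone reaches start.
    if reachable(forward, start) == all_nodes and reachable(backward, start) == all_nodes:
        return "strongly connected"
    # Weak: connected only when edge directions are ignored.
    if reachable(undirected, start) == all_nodes:
        return "weakly connected"
    return "neither"

# Hypothetical examples
print(classify_connectivity(["A", "B", "C"], [("A", "B"), ("B", "C"), ("C", "A")]))
print(classify_connectivity(["A", "B", "C"], [("A", "B"), ("C", "B")]))
print(classify_connectivity(["A", "B", "C"], [("A", "B")]))
```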

Q4) If you were going to look for a node which would be most likely to be the target of an attack to disconnect a network, what would be the best characteristic to look for?

nodes that, if they were removed, would cause the graph to go from strongly connected to weakly connected

Q5) What is the out-degree of node B?

Q6) In the graph below, which node is the greatest listener?

Q7) In the graph below, which nodes are the greatest communicators? (Hint: there’s a tie)

Q8) What would we be looking for if we followed the steps below? Note: we have 2 graphs.

Create a table for each graph where, for each node, you list the degree of the node. For each graph, create a histogram indicating how many nodes in that graph have a specific degree (e.g., how many nodes have degree 1? 2? etc.). Use advanced approaches (e.g. Euclidean distances) to compare these two histograms.
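The steps above can be sketched as follows: build a degree table per graph, turn it into a degree histogram, and compare the two histograms with a Euclidean distance. This is a minimal sketch that assumes undirected graphs given as edge lists; the function names and the two example graphs are hypothetical, and nodes without any edge do not appear in the table.

```python
from collections import Counter
import math

def degree_table(edges):
    """Map each node to its degree in an undirected edge list."""
    degrees = Counter()
    for u, v in edges:
        degrees[u] += 1
        degrees[v] += 1
    return dict(degrees)

def degree_histogram(edges):
    """Map each degree value to the number of nodes having that degree."""
    return Counter(degree_table(edges).values())

def histogram_distance(hist_a, hist_b):
    """Euclidean distance between two degree histograms."""
    all_degrees = set(hist_a) | set(hist_b)
    return math.sqrt(sum((hist_a.get(d, 0) - hist_b.get(d, 0)) ** 2
                         for d in all_degrees))

# Hypothetical graphs: a 4-node path and a star with 3 leaves
path_graph = [("a", "b"), ("b", "c"), ("c", "d")]
star_graph = [("hub", "x"), ("hub", "y"), ("hub", "z")]

hist_path = degree_histogram(path_graph)
hist_star = degree_histogram(star_graph)
print(dict(hist_path), dict(hist_star))
print(histogram_distance(hist_path, hist_star))
```

A distance of zero means the two graphs have identical degree distributions; larger values indicate the degree distributions differ more.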

Q9) Which of the following are the three types of analytics questions asked about communities?

Q10) What type of community analytics question is the following?

Did a community form on Twitter around the 2014 World Cup in Brazil?

Q11) Which type of community analytics question is the following?

How tightly knit was the 2014 World Cup Twitter community on July 13, 2014 (the day of the finals)?

Q12) What is the external degree of the node indicated in the graph below?

Q13) Which of the two graphs below is more modular?

Q14) Which of the following community tracking phases usually occurs when a company spins off a start-up?

Q15) An influencer in a network is defined as:

the biggest gossip in the network

a node which can reach all other nodes quickly

a node which has heavy weight edges to at least 1/2 of the nodes in the network

Q16) Which of the following are the 2 core “key player” problems that centrality analytics can address?

What is the shortest path through a network

A set of nodes which can reach (almost) all other nodes

Which nodes’ removal will maximally disrupt the network

Which nodes have the highest ratio of out-degree nodes to in-degree nodes

Q17) What kind of centrality would you want to analyze in a graph if you wanted to inject information that flows through the shortest path in a network and have it spread quickly?

Q18) What kind of centrality would you want to analyze in a graph if you wanted to maximize commodity flow in a network?

Q19) What kind of centrality identifies “hubness”?

Quiz 4 – Graph Analytics with Neo4j

Q1) Which of the following is a Cypher command used to combine two or more query results?

Q2) For a graph network whose nodes are all of type “MyNode”, which has both incoming and outgoing edges, and which has both root and leaf nodes, what will the following Cypher code return in a Neo4j report?

match (n:MyNode)<-[r]-() return n

All nodes except root nodes.

The entire network, all nodes and edges

All nodes and edges except leaf nodes and their edges.
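To reason about the query in Q2: the pattern (n:MyNode)<-[r]-() only matches a node n that has at least one incoming edge. The sketch below mimics that filter in Python on a hypothetical edge list; the function name and the example tree are assumptions for illustration. Unlike Cypher, which returns one row per matching incoming edge, the sketch collects each node once.

```python
def nodes_with_incoming_edge(edges):
    """Return the nodes that are the target of at least one directed edge."""
    return {target for _source, target in edges}

# Hypothetical tree: root -> a, root -> b, a -> leaf
edges = [("root", "a"), ("root", "b"), ("a", "leaf")]

all_nodes = {node for edge in edges for node in edge}
matched = nodes_with_incoming_edge(edges)

print(sorted(matched))              # nodes the pattern would match
print(sorted(all_nodes - matched))  # nodes with no incoming edge
```

In this toy tree the only node left out is the one with no incoming edge, while leaf nodes are still matched because they have an incoming edge.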

Q3) The Cypher query language shares some commands in common with SQL.

Q4) The following query will return a graph containing whatever loops might exist.

match (n)-[r]-(n) return n, r

Q5) Which Cypher pattern is used to represent a node?

Q7) Which Cypher command launches a Neo4j database search?

Q8) Cypher does not include a specific command to find the shortest path in a graph network.

Q9) Cypher includes a ‘diameter’ command to find the longest path in a graph network.

Quiz 5 – Assessment Questions on ‘Practicing Graph Analytics in Neo4j With Cypher’

Q1) What is the number of nodes returned?

Q2) What’s the number of edges?

49,834
46,621
50,000
None of the above

Q3) The number of loops in the graph is:

Q4) The query match (n)-[r]->(m) where m <> n return distinct n, m, count(r) gives us

the count of all edges.
None of the above
the count of all edges between every adjacent node pair.
the count of all non loop edges between every adjacent node pair.

Q5) The query match (n)-[r]->(m) where m <> n return distinct n, m, count(r) as myCount order by myCount desc limit 1 produces what?

a random edge
two neighboring nodes, each with a high out-degree
the node with the maximum number of looping edges
the pair of nodes with the maximum number of multi-edges between them
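The queries in Q4 and Q5 rely on Cypher's implicit grouping: returning n, m together with count(r) groups the matched edges by the ordered pair (n, m), and the where clause drops edges whose two ends are the same node. A rough Python analogue on a hypothetical edge list is sketched below; the function names and the data are illustrative assumptions, not part of the course dataset.

```python
from collections import Counter

def count_non_loop_edges(edges):
    """Count directed edges per ordered (n, m) pair, skipping loops where n == m."""
    return Counter((n, m) for n, m in edges if n != m)

def busiest_pair(edges):
    """Mimic 'order by myCount desc limit 1': the pair with the most edges."""
    counts = count_non_loop_edges(edges)
    if not counts:
        return None
    return counts.most_common(1)[0]

# Hypothetical multigraph with parallel edges and one loop
edges = [("A", "B"), ("A", "B"), ("A", "B"), ("B", "A"), ("C", "C"), ("B", "C")]

print(dict(count_non_loop_edges(edges)))
print(busiest_pair(edges))
```

Because the edges are directed, (A, B) and (B, A) are counted as separate pairs, and the loop on C does not appear in the result.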

Q6) The query match p=(n {Name:'BRCA1'})-[:AssociationType*..2]->(m) return p produces what?

The neighbors of the node whose name is ‘BRCA1’
The 2-neighborhood of the node whose name is ‘BRCA1’
The neighbors’ neighbors of the node whose name is ‘BRCA1’
The neighbors whose distance is greater than 1 and less than 2 of the node whose name is ‘BRCA1’
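For intuition about the variable-length pattern in Q6: in Cypher, *..2 has a default lower bound of 1, so the pattern matches paths of one or two hops that follow outgoing edges from the start node. The sketch below enumerates such paths in Python. It assumes a simple directed graph without self-loops or parallel edges, and the neighbor names g1 to g4 are hypothetical placeholders rather than real associations from the dataset.

```python
def paths_up_to(adjacency, start, max_hops=2):
    """List every directed path of 1..max_hops edges that begins at start."""
    paths = []
    frontier = [[start]]
    for _ in range(max_hops):
        next_frontier = []
        for path in frontier:
            for neighbor in adjacency.get(path[-1], []):
                extended = path + [neighbor]
                paths.append(extended)
                next_frontier.append(extended)
        frontier = next_frontier
    return paths

# Hypothetical adjacency list; only the start node name comes from the quiz
adjacency = {
    "BRCA1": ["g1", "g2"],
    "g1": ["g3"],
    "g2": [],
    "g3": ["g4"],
}

paths = paths_up_to(adjacency, "BRCA1", max_hops=2)
for path in paths:
    print(" -> ".join(path))

# Nodes reached within two hops of the start node
print(sorted({path[-1] for path in paths}))
```

Here g4 is three hops away, so it is not part of the result.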

Q7) How many non-directed shortest paths are there between the node named ‘BRCA1’ and the node named ‘NBR1’?

Q8) The top 2 nodes with the highest out-degree are:

GRB2 and TP53
MEPCE and EGFR
SNCA and BRCA1
EP300 and BRCA1

Q9) Applying the example queries provided to you, create the degree histogram for the network. How many nodes in the graph have a degree of 3?

Quiz 6 – Using GraphX

Q1) In this code snippet below from the Hands On exercise on importing data, ‘100L + row…’ adds 100 to the value of every country ID. Which of the following statements are true regarding this decision? (Note: you may select more than one)