What is the secret of Neo4j’s speed?

No, Neo4j developers haven’t invented a superfast algorithm for the military. Nor is Neo4j’s speed a

product of the fantastic speed of the technologies it relies on (it’s implemented in Java after all!).

The secret is in the data structure: the localized nature of graphs makes this type of traversal very

fast. Imagine yourself cheering on your team at a small local football stadium. If someone asks you

how many people are sitting within 15 feet of you, you’ll get up and count them, as fast as you can

count. Now imagine you’re attending the game at the national stadium,

with a lot more spectators, and you want to answer the same question—how many people are there

within 15 feet of you. Given that the density of people in both stadiums is the same, you’ll have

approximately the same number of people to count, taking a very similar time. We can say that

regardless of how many people can fit into the stadium, you’ll be able to count the people around you at

a predictable speed; you’re only interested in the people sitting within 15 feet of you, so you won’t be

worried about packed seats on the other end of the stadium, for example.

This is exactly how the Neo4j engine works in the example—it visits nodes connected to the starting

node, at a predictable speed. Even when the number of nodes in the whole graph increases (given

similar node density), the performance can remain predictably fast.
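
The stadium analogy can be sketched in a few lines of Python. This is not Neo4j code or its internal algorithm; it is a minimal illustration that assumes a plain in-memory adjacency list, and the names `ring_graph` and `count_within` are hypothetical helpers invented for this sketch. Each node is linked to its two nearest neighbors on either side, so the "density" stays the same as the graph grows.

```python
from collections import deque


def ring_graph(n, k=2):
    """Hypothetical test graph: each node links to its k nearest neighbors on each side."""
    return {i: [(i + d) % n for d in range(-k, k + 1) if d != 0] for i in range(n)}


def count_within(graph, start, max_depth):
    """Breadth-first traversal from start, limited to max_depth hops.

    Returns (nodes found, nodes touched), where "touched" is the number of
    nodes the traversal actually had to visit.
    """
    seen = {start}
    queue = deque([(start, 0)])
    touched = 0
    while queue:
        node, depth = queue.popleft()
        touched += 1
        if depth == max_depth:
            continue  # do not expand beyond the requested distance
        for neighbor in graph[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, depth + 1))
    return len(seen) - 1, touched


small = count_within(ring_graph(100), 0, 2)
large = count_within(ring_graph(100_000), 0, 2)
print(small, large)  # same work in both graphs
```

Under these assumptions the traversal touches the same number of nodes in the small graph and in the one a thousand times larger, because it only ever follows links out of the starting node's neighborhood.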

If you apply the same football analogy to relational database queries, you’d count all the people in

the stadium and then remove those not around you, which is not the most efficient strategy given the

interconnectivity of the data.
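
The "count everyone, then filter" strategy can be sketched the same way. This is a deliberate simplification of the analogy, not a model of any real relational engine: it assumes an unindexed table of `(source, target)` rows that is scanned in full for every hop, whereas real databases would normally use indexes. The names `ring_edges` and `count_within_by_scan` are hypothetical.

```python
def ring_edges(n, k=2):
    """Hypothetical join table: one (source, target) row per link."""
    return [(i, (i + d) % n) for i in range(n) for d in range(-k, k + 1) if d != 0]


def count_within_by_scan(rows, start, max_depth):
    """Expand one hop at a time by scanning every row of the table.

    Returns (nodes found, rows examined).
    """
    seen = {start}
    frontier = {start}
    examined = 0
    for _ in range(max_depth):
        next_frontier = set()
        for source, target in rows:  # full scan: every row is looked at
            examined += 1
            if source in frontier and target not in seen:
                next_frontier.add(target)
        seen |= next_frontier
        frontier = next_frontier
    return len(seen) - 1, examined


print(count_within_by_scan(ring_edges(100), 0, 2))
print(count_within_by_scan(ring_edges(10_000), 0, 2))
```

In this sketch the number of people found stays the same, but the number of rows examined grows in step with the size of the table, which is the cost the analogy describes.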

These experiments demonstrate that the Neo4j graph database is significantly faster at querying graph

data than a relational database. In addition, a single Neo4j instance can handle data sets whose sizes span three

orders of magnitude without performance penalties. The independence of traversal performance from

graph size is one of the key aspects that make Neo4j an ideal candidate for solving graph problems,

even when data sets are very large.
