ARN

Neo4j 5.0 improves on scalability, performance and agility

The fifth version of the leading native graph database has been released with additional features such as an Ops Manager and the ability to continuously provide updates to self-managed users.

Neo4j, broadly considered to offer the leading graph database, said that the fifth version of its namesake database, dubbed Neo4j 5.0, is being made generally available for both community and enterprise users.

The latest iteration of the native graph database, which can be downloaded from the company’s website and the AWS, Azure, and GCP marketplaces, improves on the scalability, performance and agility aspects of Neo4j 4.0, launched in February 2020.  

The rationale behind the new version of the database is pinned on enterprises’ need to ask more complex questions of their data, in order to unearth unexplored business insights and develop new strategies in the face of continued economic uncertainty, Neo4j said.

What is new in Neo4j's scalability?

Neo4j 5.0 comes with an automated scale-out ability that will allow self-managed users (those who are not on a managed service plan) to grow the database and manage massive queries with little manual effort and relatively lower infrastructure cost, the company said, adding that the 5.0 version also automates the allocation and reassignment of computing resources. This automated horizontal scaling is made possible by back-end features such as Autonomous Clustering and the new enhanced Fabric.

Clustering in database management systems is the practice of connecting more than one server or instance to a single database. It is done to increase the performance and scalability of the database.

The Autonomous Clustering feature can automatically decide how to distribute primary (writable, synchronous) and secondary (read-only, asynchronous) database instances or copies across servers (bare metal machines, virtual machines, or containers) according to the requirements and constraints that the database administrator provides when specifying the cluster topology, the company said.
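The placement decision described above can be illustrated with a toy allocator. This is only a sketch under stated assumptions, not Neo4j's actual algorithm: the function `place_database`, the `load` dictionary and the "least-loaded server first" rule are all hypothetical and exist purely to show the idea of turning a requested topology (a number of primaries and secondaries) into a server assignment.

```python
def place_database(servers, load, primaries, secondaries):
    """Assign each copy of a database to a distinct server.

    Hypothetical rule: prefer the servers that currently host the
    fewest copies. `load` maps server name -> copies already hosted
    and is updated in place.
    """
    needed = primaries + secondaries
    if needed > len(servers):
        raise ValueError("not enough servers for the requested topology")

    # Sort by current load, then by name so the result is deterministic.
    chosen = sorted(servers, key=lambda s: (load.get(s, 0), s))[:needed]
    for server in chosen:
        load[server] = load.get(server, 0) + 1

    return {
        "primary": chosen[:primaries],     # writable copies
        "secondary": chosen[primaries:],   # read-only copies
    }
```

A real cluster would also weigh constraints such as regions, server capacity and rebalancing after failures, none of which this sketch attempts.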

Additionally, Server-Side Routing (SSR), which is turned on by default in the latest version, enables the use of network load balancers and other network technologies to route queries internally to an appropriate database management system server.

Further, the latest version builds on the company’s Fabric feature.

Users can use the company’s Cypher Query Language to create a composite graph database of other graph databases or sharded databases, Neo4j said, adding that Fabric can be used in an Autonomous Cluster to execute queries across databases, including ones on remote clusters.

Neo4j 5.0 will allow users to import bulk data incrementally into an existing database, which the company said is expected to reduce data loading time.

Faster query performance than Neo4j 4.0

The latest version, according to the company, can offer faster query performance than the previous version. The improvement in query performance is based on optimisations in indexing, query planning and runtime.

Indexing in database management systems is the process of improving query performance by limiting the number of disk accesses required to run a query, and Neo4j 5.0 extends the matching capabilities of indexes, the company said.

“FULLTEXT now indexes lists and arrays of strings to improve the quality of text search results. The Cypher clauses CONTAINS and ENDS WITH are widely used for filtering results by text properties. The new TEXT indexes implementation in Neo4j 5, based on trigrams, makes them up to hundreds of times faster,” the company said in a statement.

“RANGE allows you to specify or compare values, e.g., find reviews rated 3-5 by users in postal codes 94000-95000. POINT, often used in routing and supply chain analysis, allows you to find and compare spatial data like longitude and latitude,” it added.
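To see why a trigram-based index can speed up CONTAINS-style filtering, consider the minimal sketch below. It is a generic illustration of the technique, not Neo4j's implementation, and the class `TrigramIndex` and its methods are hypothetical names. Each string is broken into three-character pieces, and a substring query only has to verify the strings that share all of the query's trigrams instead of scanning every value.

```python
from collections import defaultdict


def trigrams(text):
    """Return the set of all 3-character substrings of `text`."""
    return {text[i:i + 3] for i in range(len(text) - 2)}


class TrigramIndex:
    """Toy inverted index from trigram to the ids of strings containing it."""

    def __init__(self):
        self.postings = defaultdict(set)  # trigram -> set of item ids
        self.values = {}                  # item id -> original string

    def add(self, item_id, value):
        self.values[item_id] = value
        for gram in trigrams(value):
            self.postings[gram].add(item_id)

    def contains(self, needle):
        """Return sorted ids of the strings that contain `needle`."""
        grams = trigrams(needle)
        if not grams:
            # Needles shorter than 3 characters have no trigrams,
            # so fall back to checking every stored string.
            candidates = set(self.values)
        else:
            # Only strings holding every trigram of the needle can match.
            candidates = set.intersection(
                *(self.postings.get(g, set()) for g in grams)
            )
        # Sharing all trigrams is necessary but not sufficient: verify.
        return sorted(i for i in candidates if needle in self.values[i])
```

The final verification step matters because two strings can share every trigram of the needle without containing the needle itself as one contiguous run.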

In order to make it easier to write complex pattern matching queries in the Cypher Query Language, the company has introduced new syntax as part of the latest version.

“Cypher now has syntax for label and relationship type expressions, allowing the user to specify Disjunction (OR), Negation (NOT), and Conjunction (AND) operators between individual labels and relationship types,” the company said.
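The semantics of these operators can be sketched outside the database as predicates over a node's set of labels. The helper names below (`has`, `any_of`, `all_of`, `negate`, `match_nodes`) are hypothetical and the code is only a model of the behaviour. In Cypher 5 the corresponding patterns are written roughly as `(n:Person|Company)`, `(n:Person&Employee)` and `(n:!Person)`.

```python
def has(name):
    """Predicate: the node carries the label `name`."""
    return lambda labels: name in labels


def any_of(*preds):
    """Disjunction (OR): at least one sub-expression holds."""
    return lambda labels: any(p(labels) for p in preds)


def all_of(*preds):
    """Conjunction (AND): every sub-expression holds."""
    return lambda labels: all(p(labels) for p in preds)


def negate(pred):
    """Negation (NOT): the sub-expression does not hold."""
    return lambda labels: not pred(labels)


def match_nodes(nodes, pred):
    """Return the sorted names of nodes whose label set satisfies `pred`.

    `nodes` maps a node name to its set of labels.
    """
    return sorted(name for name, labels in nodes.items() if pred(labels))
```

Because the helpers compose, an expression such as "Person but not Employee" is simply `all_of(has("Person"), negate(has("Employee")))`.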

Neo4j 5.0’s query response times are accelerated by the use of K-Hop query processing that is used to find unique nodes with a breadth-first search approach.
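The general idea of a K-hop query, finding each node reachable within k relationships exactly once, can be sketched with a plain breadth-first search. This is a textbook illustration rather than Neo4j's optimised implementation. The function `k_hop_neighbours` and the adjacency-dictionary graph format are assumptions made for the example.

```python
from collections import deque


def k_hop_neighbours(graph, start, k):
    """Return the set of distinct nodes within `k` hops of `start`.

    `graph` maps each node to an iterable of its neighbours.
    The start node itself is not included in the result.
    """
    seen = {start}                  # nodes already discovered
    frontier = deque([(start, 0)])  # (node, distance from start)
    found = set()

    while frontier:
        node, depth = frontier.popleft()
        if depth == k:
            continue  # do not expand beyond k hops
        for neighbour in graph.get(node, ()):
            if neighbour not in seen:
                # First discovery is at the shortest distance, so each
                # node is recorded and expanded at most once.
                seen.add(neighbour)
                found.add(neighbour)
                frontier.append((neighbour, depth + 1))

    return found
```

Tracking visited nodes is what keeps the work proportional to the neighbourhood size, rather than to the number of distinct paths, which can grow far faster in a densely connected graph.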

Additionally, the company has changed the default Cypher runtime of its community edition from Interpreted to Slotted, claiming that the change will improve read speeds by 30%.

The Slotted runtime, which is very similar to Interpreted, adds optimisations to the way in which records are streamed through the iterators. This, according to the company, improves both the performance and the memory usage of queries.

Neo4j now offers continuous updates

Neo4j 5.0 comes with cloud-like agile features, the company said, adding that the latest version comes with a new Ops Manager and the ability to support a frequent release update schedule for self-managed users.

The new Ops Manager is an operations console that can be used for monitoring and administering Neo4j deployments such as a database, instance, or cluster, Neo4j said.

In contrast to the previous edition, where continuous updates were available only to managed service users, the new edition will also provide continuous updates to self-managed users on multicloud, hybrid and on-premises deployments.