From SQL to The Immutable Database

published
October 14, 2015

Developing an innovative database engine wasn’t on our radar.

Back in 2007, Nexthink was using PostgreSQL to manage insertions and queries, and as the business grew, so did lag times. Insertion wait times were unpredictable at best, and data queries often timed out after five minutes. And with only 1,000 supported devices per Engine, the volume of data was nothing compared with what our potential growth would bring.

The solution?

Two choices:

  1. Use the tried-and-true method of piling up servers to handle the volume; or
  2. Double down and figure out a smarter way to store and access data.

We opted for risk over easy reward, the opposite of most tech advice, and created our own in-memory database. Today, an Engine easily processes 5,000 to 10,000 insertions per second while also serving queries from Portal and Finder.

The magic behind our curtain? Performing all queries, insertions and updates directly in RAM, which lets us use both server RAM and disk at maximum capacity. Because RAM is roughly 1,000 times faster than disk and there is no need for context switches, queries take only seconds, and modifications are persisted to disk as a continuous flow of write operations. Now we’re betting again, and this time the bet is on immutability.
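
To make the idea concrete, here is a minimal C++ sketch of a store that serves everything from RAM and persists modifications as a sequential stream of appends. It is an illustration under stated assumptions, not the Engine's actual code: the class name `InMemoryStore`, the "key value" log format and the `recovered_value_demo` helper are all hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>
#include <unordered_map>

// Hypothetical store: every read and write is served from RAM, and each
// modification is also appended to a sequential log for durability.
class InMemoryStore {
public:
    // `log` can be any output stream; in practice it would be an
    // std::ofstream opened in append mode. It must outlive the store.
    explicit InMemoryStore(std::ostream& log) : log_(log) {}

    void put(const std::string& key, std::int64_t value) {
        // The toy log format is "key value\n", so keys must be non-empty
        // and free of whitespace (an assumption of this sketch).
        assert(!key.empty() && key.find_first_of(" \t\n") == std::string::npos);
        table_[key] = value;                  // update the in-RAM table
        log_ << key << ' ' << value << '\n';  // sequential append, no seeks
    }

    // Returns true and fills `out` if the key exists.
    bool get(const std::string& key, std::int64_t& out) const {
        const auto it = table_.find(key);
        if (it == table_.end()) return false;
        out = it->second;
        return true;
    }

    // Rebuild the RAM state after a restart by replaying the log in order.
    void replay(std::istream& log) {
        std::string key;
        std::int64_t value = 0;
        while (log >> key >> value) table_[key] = value;
    }

private:
    std::unordered_map<std::string, std::int64_t> table_;
    std::ostream& log_;
};

// Usage: write through one store, then recover a second store from the log.
std::int64_t recovered_value_demo() {
    std::ostringstream disk;  // stands in for the on-disk log
    InMemoryStore live(disk);
    live.put("device-42", 1);
    live.put("device-42", 7);  // the later write wins on replay

    std::ostringstream scratch;
    InMemoryStore restarted(scratch);
    std::istringstream log(disk.str());
    restarted.replay(log);

    std::int64_t value = 0;
    restarted.get("device-42", value);
    return value;
}
```

Because the log is only ever appended to, the disk sees one continuous stream of writes while all lookups stay in memory.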

The Immutable Advantage

Why use immutable data?

The biggest advantage is parallelization. Each transaction can be executed without affecting the performance of other transactions, so more CPU cores really do mean more performance. As noted by The Scale-Out Blog, immutability is now part of many database systems, but it may require far more disk space than mutable systems. Designing an immutable database system that runs in RAM on the classic Nexthink appliance is the challenge we are tackling.
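
The parallelization argument can be sketched in a few lines of C++. This is a simplified illustration, not how the Engine stores data: `Snapshot`, `parallel_sum` and `with_appended` are hypothetical names, and a flat vector of integers stands in for real records. Because a snapshot can never change, any number of threads can read it at once without taking a lock.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <numeric>
#include <thread>
#include <vector>

// An immutable snapshot: once built, nobody can modify the vector.
using Snapshot = std::shared_ptr<const std::vector<std::int64_t>>;

// Sum a snapshot with `workers` threads. No locks are needed, because no
// thread can change the data the others are reading.
std::int64_t parallel_sum(const Snapshot& snap, std::size_t workers) {
    assert(snap != nullptr && workers > 0);
    const std::size_t n = snap->size();
    std::vector<std::int64_t> partial(workers, 0);
    std::vector<std::thread> pool;
    for (std::size_t w = 0; w < workers; ++w) {
        // Each worker gets its own contiguous slice [first, last).
        const auto first = static_cast<std::ptrdiff_t>(n * w / workers);
        const auto last = static_cast<std::ptrdiff_t>(n * (w + 1) / workers);
        pool.emplace_back([&partial, snap, w, first, last] {
            partial[w] = std::accumulate(snap->begin() + first,
                                         snap->begin() + last,
                                         std::int64_t{0});
        });
    }
    for (auto& t : pool) t.join();
    return std::accumulate(partial.begin(), partial.end(), std::int64_t{0});
}

// A "write" builds a new snapshot instead of touching the old one.
// Readers holding the old snapshot are unaffected.
Snapshot with_appended(const Snapshot& snap, std::int64_t value) {
    auto next = std::make_shared<std::vector<std::int64_t>>(*snap);
    next->push_back(value);
    return next;
}
```

The full copy in `with_appended` also shows the cost side of the trade-off: a naive immutable design uses far more storage than a mutable one.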

The CPU Conundrum

When we started designing the new Engine in 2007, good servers came with two CPU cores. Today, 16 or more are not uncommon. And while the Engine can easily run parallelized queries across multiple cores, it has to serialize read and write operations because memory is not transactional.

The result?

Queries may not always be performed concurrently on different cores — something especially problematic with real-time data streams. Our new challenge is clear: use all available cores without losing the benefits of RAM processing.
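
For contrast, here is a hedged sketch of what serializing reads and writes typically looks like in C++. It is a generic reader-writer lock pattern, not the Engine's code, and `LockedTable` and `contended_total` are hypothetical names. Readers can share the lock with each other, but every write takes it exclusively, so extra cores spend their time waiting.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <shared_mutex>
#include <string>
#include <thread>
#include <unordered_map>
#include <vector>

// A mutable table guarded by a reader-writer lock.
class LockedTable {
public:
    void add(const std::string& key, std::int64_t delta) {
        // Exclusive: blocks all readers and all other writers.
        std::unique_lock<std::shared_mutex> exclusive(mutex_);
        rows_[key] += delta;
    }

    std::int64_t get(const std::string& key) const {
        // Shared: readers run together, but wait while a writer holds the lock.
        std::shared_lock<std::shared_mutex> shared(mutex_);
        const auto it = rows_.find(key);
        return it == rows_.end() ? 0 : it->second;
    }

private:
    mutable std::shared_mutex mutex_;
    std::unordered_map<std::string, std::int64_t> rows_;
};

// Usage: several writers update one key. The result is correct, but the
// writers execute one at a time no matter how many cores are available.
std::int64_t contended_total(std::size_t writers, std::int64_t adds_per_writer) {
    assert(writers > 0);
    LockedTable table;
    std::vector<std::thread> pool;
    for (std::size_t w = 0; w < writers; ++w) {
        pool.emplace_back([&table, adds_per_writer] {
            for (std::int64_t i = 0; i < adds_per_writer; ++i) {
                table.add("events", 1);
            }
        });
    }
    for (auto& t : pool) t.join();
    return table.get("events");
}
```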

Of course, we’re not going in blind. We started by studying other database systems like CouchDB and Datomic, in addition to functional languages like Scala.

Our conclusion?

That immutability only gets you so far: completely immutable databases have no chance of handling a real-time data flow like ours on our hardware. Solving the problem meant creating a partially mutable system using C++ data structures that are mostly lockless, along with developing new ways to make better use of CPU caches. The process is ongoing, but we’re pleased with the results so far: the concurrency level is almost equal to the number of cores when processing real-time data, and queries return results five times faster.
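
One common way to combine the two ideas, shown here as a sketch under assumptions rather than as the Engine's actual design, is an append-only list whose nodes never change once published and whose only mutable state is a single atomic head pointer. Writers publish with a compare-and-swap loop instead of a lock, and any reader that loads the head sees a stable snapshot. The names `Node`, `LockFreeLog`, `sum_from` and `parallel_push_sum` are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>

struct Node {
    std::int64_t value;
    Node* next;  // set before publication, never changed afterwards
};

class LockFreeLog {
public:
    LockFreeLog() = default;
    LockFreeLog(const LockFreeLog&) = delete;
    LockFreeLog& operator=(const LockFreeLog&) = delete;

    ~LockFreeLog() {
        Node* n = head_.load();
        while (n != nullptr) {
            Node* next = n->next;
            delete n;
            n = next;
        }
    }

    // Lock-free push: on failure, compare_exchange_weak reloads the current
    // head into fresh->next, so the loop simply retries.
    void push(std::int64_t value) {
        Node* fresh = new Node{value, head_.load(std::memory_order_relaxed)};
        while (!head_.compare_exchange_weak(fresh->next, fresh,
                                            std::memory_order_release,
                                            std::memory_order_relaxed)) {
        }
    }

    // Everything reachable from the returned pointer is immutable.
    // The snapshot is valid only while the log itself is alive.
    const Node* snapshot() const {
        return head_.load(std::memory_order_acquire);
    }

private:
    std::atomic<Node*> head_{nullptr};  // the only mutable state
};

// Readers walk a snapshot without any synchronization.
std::int64_t sum_from(const Node* head) {
    std::int64_t total = 0;
    for (const Node* n = head; n != nullptr; n = n->next) total += n->value;
    return total;
}

// Usage: each writer pushes 1..per_writer concurrently, then we sum the log.
std::int64_t parallel_push_sum(std::size_t writers, std::int64_t per_writer) {
    assert(writers > 0 && per_writer >= 0);
    LockFreeLog log;
    std::vector<std::thread> pool;
    for (std::size_t w = 0; w < writers; ++w) {
        pool.emplace_back([&log, per_writer] {
            for (std::int64_t i = 1; i <= per_writer; ++i) log.push(i);
        });
    }
    for (auto& t : pool) t.join();
    return sum_from(log.snapshot());
}
```

The sketch never removes nodes, which sidesteps memory reclamation and ABA problems. A real system that also deletes or compacts data would need to handle both.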

We’re betting that hybrid mutable/immutable structures are the ideal way to reduce query times and leverage real-time data without neglecting the benefits of multiple CPU cores. SQL has had its day; it’s time to opt for immutable.
