A typical Nexthink engine database contains 100 million events, up to one million of events may be created on a busy hour by 5,000 devices. To analyze this amount of data, you needed a query language which is both simple and powerful. It’s why we developed Nexthink Query Language or NXQL. We considered SQL but we needed some language of a higher level dedicated to search on time events linked to complex environment. However, NXQL and SQL are close relative. So, if you are familiar with SQL, you should be able to feel comfortable with NXQL in no time.
Let’s start by modeling a database to store TCP connections.
Get the best pods, articles, & research in IT experience
Let’s imagine that a connection is made by a user on a device to a destination at a given time and once established, some traffic can be exchanged between the device and the destination during a given period. Figure bellow depicts a possible schema for this database. The connection table contains the connection specific data like its establishment time, its duration and traffic and the signature table links a device, a user and a destination. Every connection points to a signature. In fact, it is a simplified version of Nexthink engine database but it does capture the essence of Nexthink’s data model.
Now that we have a schema, we can try to see how NXQL and SQL can be used to retrieve the data. Let’s imagine that a user john made a connection to company servers and we want to discover what he is doing or not doing. If you are interested in the list of destinations that he visited in the last 24 hours, you would probably write the following SQL query.
As you see, you clearly have to know the model and how to join tables to forge this query. You have to go from destinations to connections and from connections to users to filter out connections linked to users you are not interested in. NXQL simplifies the query by given a logical name to this relationship in our case connection. Joining a connection to a destination or a user is implicit. The following NXQL query is identical to SQL above.
So, with defines the relationship, in that case connection. The internal of the database is hidden and the query is clearer. In fact, if we would take our current data model, the SQL statement would at least have 6 joins, an extra 4 conditions and would require a full table scan of a 100 million rows table to determine which connections are not only established but also are active during the time period. All that complexity is captured in the “with connection”.
One of the first reaction with NXQL is why so much parentheses. The reason is that NXQL is often a machine to machine language used either by the Finder or the Portal or though the Web API. So we decided to define a language that can be either written with parentheses for you and me or as XML snippets for software clients.
In you want to go a little bit further and to get the traffic per destination, both SQL and NXQL are using aggregates, let’s see the difference. In SQL, you will use the SUM aggregate with a GROUP BY.
In NXQL, you will use the compute keyword.
The logic is the same but the computation performed by NXQL is more complex because the traffic of the connection is averaged using the connection duration and only the part of the traffic made in the last 24 hours is taken into account. Doing the same in SQL would be challenging but here as well, NXQL hides all that complexity. Impressive, isn’t it?
All the available aggregates, the relationships between tables and tables themselves are defined in the NXQL data model. In a way, NXQL defines both a query language and a data model and in that respect, it sets it apart from SQL.
Sometimes, we need to know no only what happened but also what did not happened. For instance, if we want to know all the machine that did not connect to a given destination. In SQL, this can be achieved by using an inner query or the SQL EXCEPT. I will use the latter because NXQL uses the same keyword and logic but probably you would naturally use the first solution. For instance, the SQL query below returns the IP addresses of the destinations used made by at least one device in the last 24 hours, but not used by john during the same period.
Basically this query is building two sets of destinations A and B: A, the destinations used during the period and B the destination used by john. The result is the destinations present in A not in B or in a more mathematical notation A B. As for SQL, NXQL uses the same except keyword.
NXQL is little bit clearer because it assumes that only set of rows of the same table can combined. In addition to except, you may also use union and intersect.
I hope that you see now the SQL and NXQL common points and that now you feel like giving NXQL a try. Enjoy it!