Recursive SQL

November 15, 2015

Recently I had to write some SQL to connect some CIs (for example, servers) with their parent applications. The crucial point was that there could be arbitrary many layers of applications above the server.

Here's an example with 3 layers of applications (the → indicates "... is parent of ..."):

All the relations are in the same table, so there's not really an indicator on which hierarchical level an application is until you search for more relations to other connected applications. Technically, the goal was to join the table by itself n-times, whereas n may be different for each entry:

As this is not possible by simple joining, I had to find another way to connect the different applications with each other. Let's start with a simple table with 11 relations:

As you can see App A is the root node with its children (App B, App E), which again have their own children.

Using CONNECT BY

To query the table recursively, one can use Oracle's CONNECT BY, and connect the child_id with the corresponding parent_id:

SELECT
  parent_id,
  child_id,
  CONNECT_BY_ROOT parent_id root_id,          -- root application
  SYS_CONNECT_BY_PATH(child_id, ' -> ') path, -- path
  LEVEL depth                                 -- tree depth    
FROM
  app_relations
CONNECT BY NOCYCLE PRIOR child_id = parent_id -- connect w/ PRIOR
ORDER BY
  depth, root_id, parent_id

With CONNECT BY ... PRIOR the chaining is initiated, connecting Applications who are a child and a parent at the same time at a different place in the table.

This will result in the complete list of existing relations, complete with root_id, path and also a depth for each relation, indicating how many nodes have been found:

Quite some paths found! As you can see, also incomplete paths "in the middle" are shown, for example App B → App D, a path which misses the root node (App A) and also the leaf node (Server D or E), but is nonetheless a valid path.

Narrowing it down

Because we're not interested in all the paths, we have to narrow it down to paths which start at our root Application, App A. Luckily, Oracle provides an addition to the CONNECT BY clause, namely START WITH. Here we can specify that we're not interested in any path other than starting with App A, and efficiently rule out the whole rest.

SELECT ...
FROM app_relations
START WITH parent_id = 'App A'
CONNECT BY NOCYCLE PRIOR child_id = parent_id

Now we have all relations from our root node, and we can as a last step just take those which end with a server:

SELECT
  root_id || path AS complete_path,
  depth
FROM ( ... ) -- subquery from above
WHERE
  child_id like 'Server%'   -- only paths which end with a server.

This gives us the list:

  • App A -> App E -> Server F
  • App A -> App E -> Server G
  • App A -> App B -> App C -> Server A
  • App A -> App B -> App C -> Server C
  • App A -> App B -> App C -> Server B
  • App A -> App B -> App D -> Server D
  • App A -> App B -> App D -> Server E

Hierarchy tree

Using the jQuery plugin jsTree and a little helper library we can generate a clickable Javascript tree:

For further reading and some nice examples of CONNECT BY have a look here.

Thanks for reading!