Richard Bucker

Recursive Select - tree traversal

Posted at — Aug 18, 2015

Walking a tree is not that big of a challenge. There are a few different variations in the SQL language but they are similar enough.Given a schema that looks like:create table tree( id int, parentid int, type int);SQLite:;with recursive    tree_cte (id, parentid, type) as (       select 1,0,0       union all       select t.id, t.parentid, t.type         from tree t         inner join tree_cte tc on t.parentid=tc.id     )     select * from tree_cte tc     ;SQL Server:;with     tree_cte (id, parentid, type) as (       select id, parentid, type from (values(1,0,0) X(id, parentid, type, path))       union all       select t.id, t.parentid, t.type         from tree t         inner join tree_cte tc on t.parentid=tc.id     )     select * from tree_cte tc     ;The differences are ‘+’ instead of ‘||’ and “recursive” and a little of the initiator SQL in the SQL Server version.In a project I’m working on I need to create a graph of the entire tree, however, there are a few constraints. (a) the graph is huge so limit the graph as soon as possible; (b) only report full paths (no downline nodes); (c) a downline node can only be of type ‘1’. (NOTE: nodes of type 1 will not be separated by other node types.)The original query returned:1|0|02|1|13|2|14|2|15|2|16|4|27|4|38|4|19|4|110|9|111|9|112|9|1Nothing special to see, it’s the same as if I had selected everything:select * from treeLet’s look at the individual paths:;with recursive    tree_cte (id, parentid, type, path) as (       select 1,0,0,‘1'       union all       select t.id, t.parentid, t.type, tc.path || ‘.’ || cast(t.id as varchar)         from tree t         inner join tree_cte tc on t.parentid=tc.id     )     select * from tree_cte tc     ;With this output:1|0|0|12|1|1|1.23|2|1|1.2.34|2|1|1.2.45|2|1|1.2.56|4|2|1.2.4.67|4|3|1.2.4.78|4|1|1.2.4.89|4|1|1.2.4.910|9|1|1.2.4.9.1011|9|1|1.2.4.9.1112|9|1|1.2.4.9.12But according to my rules there are a few extra lines:1|0|0|12|1|1|1.23|2|1|1.2.34|2|1|1.2.45|2|1|1.2.5+6|4|2|1.2.4.6+7|4|3|1.2.4.78|4|1|1.2.4.89|4|1|1.2.4.910|9|1|1.2.4.9.1011|9|1|1.2.4.9.1112|9|1|1.2.4.9.12'+’ when the node was the wrong type’*’ when the node was not a leaf nodeThis is the final code unless someone posts a recommendation… it is pretty simple:;with recursive    tree_cte (id, parentid, type, path) as (       select 1,0,0,‘1'       union all       select t.id, t.parentid, t.type, tc.path || ‘.’ || cast(t.id as varchar)         from tree t         inner join tree_cte tc on t.parentid=tc.id         where t.type =1     )     select * from tree_cte tc     where not exists (select 1 from tree t where t.parentid=tc.id and type=1)     ;and it produced:3|2|1|1.2.35|2|1|1.2.58|4|1|1.2.4.810|9|1|1.2.4.9.1011|9|1|1.2.4.9.1112|9|1|1.2.4.9.12The differences being the where clause in the recursive SQL in the CTE and the conditional check in the outer select.(1) where t.type =1and(2) where not exists (select 1 from tree t where t.parentid=tc.id and type=1)(1) being some sort of a sentinel terminator for quick termination of the recursion instead of exhausting the data.(2) when there is a mix of nodes where the non-desired nodes appear at different depths in the tree we need to make sure that the perceived leaf node is actually a leaf or the last of it’s type.Feels good to me. Now I have to put this into practice.