Hierarchies WITH Common Table Expressions
Storing hierarchical data in a database is a very common task, and SQL Server 2005 brings us Common Table Expressions (CTEs) – a very useful tool for retrieving such data both elegantly and efficiently. I haven’t actually used these before, and they came to mind during a session with a customer, so I did a bit of research to see if they would meet the customer’s needs, and thought I’d publish the path I took.
For the purposes of this post, I’ll be using the simple database schema shown below;
The key point to note is that an employee has a manager, and this is represented as a foreign key field (named Manager) on the Employee table that points to another record in the same Employee table, yielding our hierarchy.
Adapting the SQL from the Common Table Expression documentation in SQL Server Books Online results in the query below;
WITH OrganisationChart (Id, [Name], JobTitle, Manager) AS( SELECT Id, [Name], JobTitle, Manager FROM dbo.Employee WHERE Manager IS NULL UNION ALL SELECT emp.Id, emp.[Name], emp.JobTitle, emp.Manager FROM dbo.Employee emp INNER JOIN OrganisationChart ON emp.Manager = OrganisationChart.Id)SELECT * FROM OrganisationChart
Breaking down the WITH statement, our CTE, we see two distinct parts. The first SELECT fetches the top level employees in our organisation, those which have no manager. The second SELECT is the really clever bit – it selects all employees that have a manager in the results of a recursive call to the OrganisationChart CTE. This recursion builds a tree of employees and their manager when it is invoked by SELECT-ing from the CTE – the final line in the script. The documentation contains more explanation about how these clauses work together.
Running this query gives some results that might look like those below;
Now when I first ran this query, I have to be honest, my first reaction was “so what?”! As an application developer, what does this give me that a “SELECT * FROM Employee” statement wouldn’t? Surely with either statement I’ve got to load all the data into memory and build my object hierarchy before I can filter it for a specific management branch of the organisation that I’m after?
Well, yes.
The documentation adds something that makes the query slightly more useful though – namely a “level” field, which indicates what level of the hierarchy this employee is in. Our new query looks very similar to the first, then, but with one more field;
WITH OrganisationChart (Id, [Name], JobTitle, [Level], Manager) AS
(
SELECT
Id, [Name], JobTitle, 0, Manager FROM dbo.Employee WHERE Manager IS NULL UNION ALL SELECT emp.Id, emp.[Name], emp.JobTitle, [Level] + 1, emp.Manager FROM dbo.Employee emp INNER JOIN OrganisationChart ON emp.Manager = OrganisationChart.Id)SELECT * FROM OrganisationChartORDER BY [Level]
On the top level clause, the “Level” is fixed at zero for obvious reasons. And relying on recursion, Level is then defined in the second clause as the previous level plus one. This yields results as follows;
Now the results seem much more useful – the Level field not only gives us an order that helps us build our in-memory object hierarchy by organising the data logically, but it also may meet your full requirements – if you only want a list of people that are 3 steps removed from a Director, a simple WHERE clause on the script’s final SELECT statement would suffice.
But let’s be honest – this isn’t what I usually want. What I want to be able to do is say “show me everyone that reports to X, directly or indirectly”. That is, to retrieve one whole branch of an organisation with a simple WHERE clause. This enables me to load only the data I want into memory, minimising memory consumption, processing, network bandwidth, and so on. But at the moment I can’t do this – if I filter on the Manager column I will only get direct reports.
Well, it turns out the answer is surprisingly simple. Consider the following modified query;
WITH OrganisationChart (Id, [Name], JobTitle, [Level], Manager, [Root]) AS( SELECT Id, [Name], JobTitle, 0, Manager, Id FROM dbo.Employee WHERE Manager IS NULL UNION ALL SELECT emp.Id, emp.[Name], emp.JobTitle, [Level] + 1, emp.Manager, [Root] FROM dbo.Employee emp INNER JOIN OrganisationChart ON emp.Manager = OrganisationChart.Id)SELECT * FROM OrganisationChartWHERE [Root] = 11
I have named the new field “Root”, and it holds the top-most manager for each employee. The key to making this work is that in the first SELECT clause of the WITH statement, Root is hard-coded to the Identifier of the current record. In the second SELECT clause, and therefore for all other levels, Root is selected directly out of the recursive call to the OrganisationChart CTE – so it is never modified throughout the hierarchy.
This immediately enables us to add a WHERE clause to limit the results to employees that report to a single manager, directly or indirectly, such as follows;
In this example, I have retrieved all employees that report to the Finance Director – and no-one else.
Suppose you don’t want the whole organisation branch – perhaps you want to retrieve the hierarchy that report to a manager in charge of a group of teams, further down the organisation? No problem, just change the WHERE clause in the WITH statement’s first SELECT statement; for example, instead of WHERE Manager IS NULL you could try filtering on another known field, such as WHERE [Name] = ‘A Ince’. You could of course create a Stored Procedure, and parameterise this WHERE clause to enable you to retrieve a hierarchy below any individual you specify.
So to conclude, Common Table Expressions are indeed a powerful tool to ensure you can retrieve all the data you need, and only the data you need, in a flexible and straightforward way. This post has shown just one simple way that you can optimise your data retrieval with them. Enjoy!
Comments
Anonymous
October 17, 2007
PingBack from http://www.artofbam.com/wordpress/?p=9580Anonymous
October 30, 2007
How about the other way around? Given Peon A, give me his chain of management all the way up through the CEO in one simple recursive query. How would this be done?Anonymous
October 30, 2007
There was an interesting question left on my last post entitled Hierarchies WITH Common Table ExpressionsAnonymous
October 30, 2007
Hi - good question. As might be able to see from the automatic ping-back above, I've just put up a short post that should answer your question; http://blogs.msdn.com/simonince/archive/2007/10/31/common-table-expressions-reversed.aspx Hope that helps! SimonAnonymous
October 30, 2007
There was an interesting question left on my last post entitled Hierarchies WITH Common Table ExpressionsAnonymous
January 08, 2008
helo i have a hierarchie for an auditory the hierarchie is ok but now i need to show in a report the customer who was modify or create for the name and in a deteminate date, i need show the herarchie whit all information this is the query, i apreciate your help thank you CREATE PROCEDURE [dbo].[ReporteAuditoriaDatosDetallePropiedad] ( @Entidad Varchar(200) ,@Propiedad Varchar(100) ,@FechaInicial dateTime ,@FechaFinal dateTime ) as DECLARE @Bitacora table ( IdOperacion int ,OperacionPadre int ,NombreOperacion varchar(200) ,Entidad Varchar(200) ,Fecha dateTime ,Usuario Varchar(200) ,Ip Varchar(100) ,IdEntidad int ) ;WITH CTEBitacora as ( select IdOperacion , OperacionPadre,NombreOperacion,right(NombreOperacion, charindex(' ', reverse(NombreOperacion)) - 1) as Entidad ,Fecha,Usuario,Ip,IdEntidad from Bitacora b where right(NombreOperacion, charindex(' ', reverse(NombreOperacion)) - 1) = @Entidad UNION ALL SELECT b.IdOperacion , b.OperacionPadre,b.NombreOperacion,right(b.NombreOperacion, charindex(' ', reverse(b.NombreOperacion)) - 1) as Entidad ,b.Fecha,b.Usuario,b.Ip,b.IdEntidad from CTEBitacora cte inner join bitacora b on cte.IdOperacion = b.OperacionPadre ) insert into @Bitacora select IdOperacion, OperacionPadre,NombreOperacion,Entidad,Fecha,Usuario,Ip,IdEntidad from CTEBitacora set dateformat dmy SELECT B.IdOperacion, B.OperacionPadre , B.Entidad, B.NombreOperacion, B.Fecha, B.Usuario, B.IP,B.IdEntidad , DatosBitacora.Propiedad,DatosBitacora.DescripcionPropiedad , DatosBitacora.ValorAnterior, DatosBitacora.ValorNuevo FROM @Bitacora as B left JOIN DatosBitacora ON B.IdOperacion = DatosBitacora.IdOperacion WHERE B.Fecha BETWEEN convert(datetime, @FechaInicial, 103) and convert(datetime, dateadd(day,1,@FechaFinal) ,103) and DatosBitacora.Propiedad=@Propiedad ORDER BY B.IDOPERACIONAnonymous
January 12, 2008
Ingrid; I have struggled a little to understand what you are trying to achieve as I don't know your database or system. I believe you are saying that you are using a CTE to retrieve the correct data, but now you also need to show more data in a report? If this is the case, the simplest solution is to join the results of the CTE back to your customer table, to fetch the other data. In other words, work out which data you need using the CTE, and then use this result set to fetch the data you wish to see in the report. It looks like you might already be heading down this path? My apologies if I have failed to understand your problem. SimonAnonymous
July 06, 2008
The keyword WITH is giving me an errorAnonymous
July 08, 2008
Andile; in what context? I assume you are running SQL Server 2005 or later? If so, the most common situation I've seen cause errors is when incorrectly declaring the CTE in a stored procedure (e.g. trying to reference a WITH block defined in a different stored proc, or similar, which isn't possible). SimonAnonymous
October 17, 2008
If you’ve been following my blog you should remember a couple of posts about hierarchical data in SQLAnonymous
November 11, 2008
This post by Simon Ince explores the topic of Hierarchies with HierarchyID in SQL 2008. If you’ve beenAnonymous
January 13, 2011
Awesome -- your final example with the ROOT ID is just what I was looking for. Thanks!!Anonymous
April 03, 2014
Just what I was looking for! This solution has answered a lot of questions for me. And really simple. Great article!