TFS Warehouse Work Item Tracking Compensating Records
Compensating Records
What are compensating records?
First a little background. Whenever a work item is updated a new revision is created. These revisions are moved over to the Warehouse and stored in the various Work Item Dimension and Fact tables. Each revision will be in the Warehouse – however each date will not have a revision. Something like (ignoring actual schema for now):
ID |
Revision |
Date |
Remaining Work |
1 |
1 |
4/10/2009 |
50 |
1 |
2 |
4/15/2009 |
40 |
1 |
3 |
4/20/2009 |
30 |
1 |
4 |
4/25/2009 |
20 |
Now if you wanted to ask a question about work items on a specific date then one way you can answer your question is to get the latest revision of the work item created before that date.
For example say you want to know the Remaining Work for a Work item on 4/22/2009. You could get all the work items whose Date is <= 4/22/2009 (Revisions 1, 2, and 3); then from these find the one with the largest Revision (Revision 3). This gives you a Remaining Work of 30.
However this requires you to do a MAX in your query and doesn’t work particularly well in the Cube. So we have added compensating records. These records are primarily used for running sum computations. Each one effectively negates, or compensates, the previous revision of that work item.
For example, using the same data from above:
1. 4/10/2009 – Work Item 1 is added.
ID |
Revision |
Date |
Remaining Work |
Record Count |
Is Compensating |
1 |
1 |
4/10/2009 |
50 |
1 |
No |
2. 4/15/2009 – Work Item 1 is revised.
ID |
Revision |
Date |
Remaining Work |
Record Count |
Is Compensating |
1 |
1 |
4/10/2009 |
50 |
1 |
No |
1 |
1 |
4/15/2009 |
-50 |
-1 |
Yes |
1 |
2 |
4/15/2009 |
40 |
1 |
No |
3. 4/20/2009 and 4/25/2009 – Work Item 1 is revised.
ID |
Revision |
Date |
Remaining Work |
Record Count |
Is Compensating |
1 |
1 |
4/10/2009 |
50 |
1 |
No |
1 |
1 |
4/15/2009 |
-50 |
-1 |
Yes |
1 |
2 |
4/15/2009 |
40 |
1 |
No |
1 |
2 |
4/20/2009 |
-40 |
-1 |
Yes |
1 |
3 |
4/20/2009 |
30 |
1 |
No |
1 |
3 |
4/25/2009 |
-30 |
-1 |
Yes |
1 |
4 |
4/25/2009 |
20 |
1 |
No |
Now you can answer the same question as above by getting all the revisions of work items older than 4/22/2009 (Revisions 1, 1’, 2, 2’, and 3) and summing their Remaining Work. This gives you a Remaining Work of (50 + -50 + 40 + -40 + 30) = 30. This should be faster than without the compensating records (and it makes the Cube happy!).
What about that Record Count column?
You can apply the same running sum logic to the Record Count to get the number of work items as of a date. So you could ask how many work items were there on 4/22/2009. Again we sum up the values for all rows with a Date <= 4/22/2009 and we get (1 + -1 + 1 + -1 + 1) = 1. Not terribly exciting with our data, but you can see how if we had more work items this could be useful.
Again – with TFS
Alright so let’s look at some example using the TFS 2008 schema. Say I want to get the number of Active and Resolved Work Items assigned to me as of a given date. First thing I need to know is what tables I should look at:
Info needed in query |
Warehouse Table |
Active and Resolved Work Items |
[Work Item] filtered on [System_State] |
Assigned to me |
[Work Item History] or [Current Work Item] joined to [Person] on [Work Item History].[Person] = [Person].[__ID] or [Current Work Item].[Person] = [Person].[__ID] |
Work Items as of a date |
[Work Item History] filtered on [Date] |
Putting this together I see I need [Work Item History], [Work Item], and [Person]. So let’s build the query:
SELECT wi.[System_State]
,SUM([Record Count]) AS [Count]
FROM [Work Item History] AS wih
LEFT JOIN [Work Item] AS wi
ON wi.[__ID] = wih.[Work Item]
LEFT JOIN [Person] AS p
ON p.[__ID] = wih.[Assigned To]
WHERE [Date] < CONVERT(DATETIME, '2009-03-19', 126) AND
p.[Person] = N'Nick Ericson' AND
wi.[System_State] IN ('Active', 'Resolved')
GROUP BY wi.[System_State]
System_State |
Count |
Resolved |
2 |
Active |
6 |
I can change the query slightly to get a better query plan (please send any further optimizations along – I’ll add them):
SELECT wi.[System_State]
,SUM([Record Count]) AS [Count]
FROM [Work Item History] AS wih
LEFT JOIN [Work Item] AS wi
ON wi.[__ID] = wih.[Work Item]
WHERE wih.[Assigned To] = (SELECT TOP 1 [__ID] FROM [Person] WHERE [Person] = N'Nick Ericson') AND
wih.[Date] < CONVERT(DATETIME, '2009-03-19', 126) AND
wi.[System_State] IN ('Active', 'Resolved')
GROUP BY wi.[System_State]
One thing I should note that can be confusing is that I am passing in 2009-03-19 00:00:00.000 for the Date. Most people would consider this not to be 19th, but instead the 18th (or so barely the 19th that we shouldn’t use it for determining how many bugs were Active on the 19th). However in the Warehouse we truncate the times off of the Date field. So if a work item was marked active at 6 PM on the 19th the Date field in the Warehouse will have 2009-03-19 00:00:00.000. This is a long way of saying that 2009-03-19 is the 19th, not the 18th.
Alright getting the counts for someone on a given day was fairly straight forward. You may want to add in some logic to return Active and Resolved counts of 0 when the query returns NULLs – but this depends on how you will build your report.
Trending
How about a trend report? You basically need to execute the queries from the section above over a range of dates. We can use a Cross Apply here (again send along your optimized queries):
SELECT d.[Date] AS [Date], ca.*
FROM [Date] AS d
CROSS APPLY
(
SELECT wi.[System_State]
,SUM([Record Count]) AS [Count]
FROM [Work Item History] AS wih
LEFT JOIN [Work Item] AS wi
ON wi.[__ID] = wih.[Work Item]
WHERE wih.[Assigned To] = (SELECT TOP 1 [__ID] FROM [Person] WHERE [Person] = N'Nick Ericson') AND
wih.[Date] < d.[__ID] AND
wi.[System_State] IN ('Active', 'Resolved')
GROUP BY wi.[System_State]
) AS ca
WHERE d.[__ID] BETWEEN CONVERT(DATETIME, '2009-03-19', 126) AND CONVERT(DATETIME, '2009-03-21', 126)
Date |
System_State |
Count |
3/19/2009 |
Active |
6 |
3/19/2009 |
Resolved |
2 |
3/20/2009 |
Active |
7 |
3/20/2009 |
Resolved |
2 |
3/21/2009 |
Active |
7 |
3/21/2009 |
Resolved |
2 |
[Cross Posted from blogs.msdn.com/nericson]
Comments
Anonymous
March 21, 2009
PingBack from http://www.arjansworld.com/2009/03/21/linkblog-for-march-21-2009/Anonymous
March 26, 2009
The SRL Team Blog on How To Add More Browsers and Network Types To The Web Test List and How To Extract