Diagnosing failures in DTM

So you've run all the tests in DTM and tried to create a submission package, but you find that you are failing tests. Don't panic: DTM was built with testing in mind and will give you all the information you need to fix the problem fast.

The first thing you need to know to diagnose a test failure is that each test consists of a job made up of tasks. These tasks may be Copy tasks, Execute tasks, Run Job tasks, or Copy Results tasks. Because one job can call another, you get a tree structure of tasks for any particular job. This can seem intimidating at first, but don't lose hope: with a systematic approach to examining the failure we can quickly converge on the root cause.

When analyzing a failure you need to home in on the task that contains the point of failure. You may see many failed tasks in a job, but not all of them matter. I know this is confusing, but stay with me. There are two key concepts to understand: roll up results and failure action. Whenever a task fails (whether that is determined from the log, the exit code, an unexpected reboot, etc.) DTM marks the task with a red X. A red X alone does not mean that the task is what caused the job to fail. If the task marked with the X has roll up results set to false, it will not cause the job to fail. Tasks that do not roll up results also commonly have a failure action of IgnoreFailAndContinue. This tells DTM that the failure should not halt job execution. Let's use this knowledge to analyze a failure.
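Before looking at a real job, here is a minimal Python sketch of the roll up rule described above. It is only an illustration: the `Task` class, its fields, and `root_cause_candidates` are hypothetical names of my own and not part of any DTM API, and the placement of each task under its parent job in the sample tree is assumed for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Task:
    """Hypothetical stand-in for a DTM task or job node (not a DTM API)."""
    name: str
    failed: bool = False            # the red X in Job Monitor
    roll_up_results: bool = True    # does a failure count against the parent?
    children: List["Task"] = field(default_factory=list)  # child job tasks


def root_cause_candidates(task: Task) -> List[str]:
    """Return the deepest failed tasks whose failures roll up to the job."""
    if not task.failed or not task.roll_up_results:
        return []  # passed, or the failure does not count against the job
    found: List[str] = []
    for child in task.children:
        found.extend(root_cause_candidates(child))
    # If no child explains the failure, this task itself is the candidate.
    return found or [task.name]


# Assumed layout of the example job, for illustration only.
job = Task("iSCSI Digest", failed=True, children=[
    Task("iSCSI Digest Setup", children=[
        Task("Create Skip Parameter", failed=True, roll_up_results=False),
    ]),
    Task("iSCSI Digest Disk Header", failed=True, children=[
        Task("Execute SDStress", failed=True),
        Task("Copy SDStress ntlog based log", failed=True,
             roll_up_results=False),
    ]),
])

print(root_cause_candidates(job))  # → ['Execute SDStress']
```

The sketch ignores failure action entirely; it only shows why a red X on a task that does not roll up results can be skipped when hunting for the cause.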

I have consolidated all the task information for this job into one view. You can get the same information by selecting the job in Job Monitor and viewing the Job and Task panes. To look at the tasks of a job called by a Run Job task, right-click the Run Job task and choose “Child Job Result”; I call this “pushing down”. To go back to the tasks of the parent job, right-click the job and choose “Parent Job Result”; I call this “popping up”.

If we look at the iSCSI Digest results, our first intuition might point to the “Create Skip Parameter” task as the cause of the job failure. This is incorrect. We can look at the Result Report for the “iSCSI Digest Setup” job to see that the “Create Skip Parameter” task has roll up results set to false. To see this, push down so that “iSCSI Digest Setup” is in the job pane, right-click the job, and choose “View Result Report”.

The next possible failure is “Copy SDStress ntlog based log” under iSCSI Digest Disk Header. If we view the result report for this we’ll see again that this task does not roll up results. A good rule of thumb is to investigate failures in Copy Results tasks and Cleanup phase tasks last: these are generally non-essential.

The next failure we see is the “Execute SDStress” task. Execute tasks are normally the prime candidates for the cause of a test failure, because this is where all the testing happens. Another key thing to notice is that all parents of this task are marked as failed. Finally, we can look at the result report and see that this task does indeed roll up results and is shown as failing.

The next thing to look at is the “Task Result Error Details” section of the report. I will discuss a few common errors and how to debug them:

· Task is Marked Failed as it had non-zero Fail Counts in the LogFile.

o This is the most useful error. It means that the task produced a log and the log contains an error message. To open the log, you can either click the link in the result report or right-click the task and choose “View Task Log”.

· Task Cancelled Because of an Unexpected Reboot

o This usually indicates a system crash or bugcheck. To debug this issue you can either connect the test client to a kernel debugger, or enable crash dumps and, after the crash, open the resulting memory dump with the -z flag in your preferred debugger. Check the documentation for Debugging Tools for Windows for more hints.

· The Execute Task with Commandline ...Failed with ExitCode XXXXXXXX

o Check to see if the exit code matches a Win32 Error Code. If the error code looks like an HRESULT (starts with 0x8 for failures) try using this tool to decode it. Alternatively you could enter the code into the Error Lookup tool in Visual Studio.

o If this doesn’t get you anywhere, check the test’s parameters. You can do so by checking the Effective Parameters section of the Result Report of the topmost job (pop up all the way). Normally a bad parameter should be caught by our Parameter Validator (the annoying thing that puts the yellow exclamation mark on all the storage jobs and brings up a dialog if there’s a problem), but this only runs on storage jobs.

o A number of issues resulting from network outages and unavailable sessions will report a failure based on exit code. Check DTM’s “Resolution” section in the error view for more details on this.
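For the exit code case above, it can help to split the number into its parts before looking it up. The following is a minimal Python sketch, assuming the standard HRESULT layout (failure bit in bit 31, an 11-bit facility in bits 16 to 26, and a 16-bit code in the low bits). The function name `decode_exit_code` and the dictionary keys are my own and not part of DTM or any Windows tool.

```python
FACILITY_WIN32 = 7  # facility value used when an HRESULT wraps a Win32 error


def decode_exit_code(exit_code: int) -> dict:
    """Split a 32-bit exit code into the standard HRESULT fields."""
    value = exit_code & 0xFFFFFFFF  # treat signed (negative) codes as unsigned
    fields = {
        "hex": f"0x{value:08X}",
        "failure_bit": bool(value >> 31),   # set for failure HRESULTs (0x8...)
        "facility": (value >> 16) & 0x7FF,  # which component defined the code
        "code": value & 0xFFFF,             # the error number itself
    }
    # Only meaningful when the value really is an HRESULT with the Win32
    # facility; the low 16 bits are then a plain Win32 error code.
    fields["wraps_win32_error"] = (
        fields["failure_bit"] and fields["facility"] == FACILITY_WIN32
    )
    return fields


# 0x80004003 is the "Invalid pointer" HRESULT; 0x80070005 wraps Win32 error 5.
for sample in (0x80004003, 0x80070005, -2147467261, 5):
    print(decode_exit_code(sample))
```

If `wraps_win32_error` is true, or the exit code is a small positive number, look up the `code` value as a Win32 error code. The sketch only separates the fields; it does not translate them into message text.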

 

[Figure: digest_fail.png, the consolidated task view for the failing iSCSI Digest job]

Comments

  • Anonymous
    October 09, 2007
    I meet the "Task is Marked Failed as it had non-zero Fail Counts in the LogFile" fail, Can you help me? and tell me how to resolve about it? thanks!!!

  • Anonymous
    October 09, 2007
    and my email is lebing823@126.com thanks!!!

  • Anonymous
    October 10, 2007
    Hello Lebing, Please try viewing the logfile by right-clicking the failing task and "View Task Log".

  • Anonymous
    October 10, 2007
    hi,ericstj my msn: redroselily@hotmail.com  please add me! thanks!

  • Anonymous
    October 10, 2007
    the task log show as below:
    Title                     Result
    Generating test cases     Failed
    Start Test 10/10/2007 4:04:46.493 PM
    Error in testcase "Generating test cases".
    Exception
      Message Object reference not set to an instance of an object.
      TargetSite UInt64 get_FreeBytes()
      StackTrace
        StackFrame
          Class StorageDevices.Disk
          Method get_FreeBytes
          Signature UInt64 get_FreeBytes()
        StackFrame
          Class StorageDevices.Disk
          Method IsRaw
          Signature Boolean IsRaw()
        StackFrame
          Class DiskIO.DiskIOTestCase
          Method GetTestCases
          Signature DiskIO.DiskIOTestCase[] GetTestCases(System.String)
        StackFrame
          Class DiskIO.DiskIOApplication
          Method Main
          Signature Void Main(System.String[])
      Source StorageDevices
      HResult 0x80004003
    File:    Line: 0
    Error Type:   HRESULT
    Error Code:   0x80004003
    Error Text:   Invalid pointer
    End Test 10/10/2007 4:04:48.493 PM
    Generating test cases Result:   Fail
    Repro:   diskio.exe /d "" /b 32KB /t 00:30:00 /c sr;sw;sv;xr;xw;xv /a /o

  • Anonymous
    June 09, 2008
    Dear Sir, I use DTM 1.2 and the issue I met is about errata filter. I updated UpdateFilters.sql 2008/06/05 and confirm that the errata we needed is expired at 2009/06/01. But it didn't work and the items still failed after checking "Status" of submission. The errata works successfully on DTM 1.1 and before. Is there any method to check if the errata is workable?

  • Anonymous
    September 09, 2008
    Dear Eric, I use the DTM to test on our product (solely USB speaker).  The controller is a WHQL certified IC.  However, I find that the device shows error on KS Position test.  The IC supplier told me that the problem was solved as an errata ID 182 was raised but they cannot disclose the detail to me.  I doubt that as I cannot find any information from the web about this errata ID.  Can I still submit the test log with the failed test item in attachment with the errata ID that the supplier told me? If so, usually how long will it take for the approval? Raymond

  • Anonymous
    September 10, 2008
    Hi Raymond,          The KS Position test is not a storage test. So, I may not be able to provide accurate information about that test. You may have to contact DTM support to resolve this issue. Thanks, Bhanu

  • Anonymous
    September 22, 2008
    storage query property failed in DTM wlk1.2 during the SCSI Driver testing

  • Anonymous
    September 26, 2008
    Sathya - What type of failure are you getting ?