More on hard links

After posting my original entry on how hard links work, a number of comments were made requesting clarification.  The original blog posting is below:

https://blogs.technet.com/b/joscon/archive/2011/01/06/how-hard-links-work.aspx

To his credit, Joseph has been asking me about revisiting this topic for months.

I think part of the confusion about how hard links work revolves around the difference between what the Windows shell shows us and what is really happening in NTFS.

Here we have a couple of directories, shown roughly as the Windows shell would show them to us. The diagram gives the impression that the files exist inside their respective directories. In the following example there are two instances of ‘File1.txt’.

[Diagram: the shell view, with Dir1 and Dir2 each appearing to contain its own File1.txt]

If we look under the hood we can see that each directory and each file has its own entry in the Master File Table (MFT).

[Diagram: the MFT view, with separate entries for each directory and each File1.txt]

As you can see from the above diagram a file isn’t really ‘inside’ the directory. The directory just has a pointer to the location where the file exists in the MFT. Using the diagram from my old blog entry we can see the three part relationship between the file and the parent directory.

[Diagram: the three-part relationship between a file and its parent directory]

1. The directory has an index entry that tells us the MFT address for the child file.

2. The file has a file name attribute that tells us the file record number of the parent directory.

3. The file has a link count that tells us that it only has one parent directory.

If I were to dump out the metadata for a directory, it would only tell me the location in the MFT for the files that are related to the directory. No part of the actual file is actually IN the directory. If you were to look at the actual addresses in the MFT they might appear like this…

0025 – Dir1

005a – Dir2

100a – File1.txt

15ab – File1.txt

Dir1 would have an index entry that included a reference to 100a (File1.txt). And Dir2 would have an index entry that included a reference to 15ab (the second instance of File1.txt).

Now let’s look at a hard linked file. The shell part isn’t really going to appear differently.

[Diagram: the shell view of a hard-linked File1.txt appearing under both Dir1 and Dir2]

But when we add in what NTFS is really doing you can start to see a difference.

[Diagram: the MFT view, with the index entries in both directories pointing to a single File1.txt entry]

Instead of having two copies of the same file, the index entries in both directories point to the same address in the MFT for the child file.

The three part relationship also changes. The file becomes aware that it is referenced by multiple directories.

[Diagram: the three-part relationship for a hard-linked file with two parent directories]

1. Each directory has an index entry that tells us the MFT address for the child file.

2. The file has two file name attributes. One for each parent directory.

3. The link count is incremented to 2h.

And finally if we looked at the addresses in the MFT, they might look like this….

0025 – Dir1

005a – Dir2

100a – File1.txt

100a – File1.txt
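If it helps to see the same relationships as data, here is a small Python sketch. It is a toy model and not the real on-disk format: the names `Record`, `FileNameAttr`, and `link` are made up for illustration, and the link count is derived from the list of file name attributes here rather than kept as its own stored field.

```python
from dataclasses import dataclass, field


@dataclass
class FileNameAttr:
    parent: int  # file record number of the parent directory
    name: str


@dataclass
class Record:
    # For directories: child name -> MFT record number (the index entries).
    index: dict = field(default_factory=dict)
    # For files: one file name attribute per directory that links to the file.
    names: list = field(default_factory=list)

    @property
    def link_count(self):
        # Toy shortcut: count the file name attributes.
        return len(self.names)


def link(mft, dir_no, file_no, name):
    """Add the directory index entry and the matching file name attribute."""
    mft[dir_no].index[name] = file_no
    mft[file_no].names.append(FileNameAttr(dir_no, name))


# Same example addresses as in the listing above.
mft = {0x0025: Record(), 0x005A: Record(), 0x100A: Record()}
link(mft, 0x0025, 0x100A, "File1.txt")  # Dir1 -> File1.txt
link(mft, 0x005A, 0x100A, "File1.txt")  # Dir2 -> the very same record

print(hex(mft[0x0025].index["File1.txt"]), hex(mft[0x005A].index["File1.txt"]))
print(mft[0x100A].link_count)
```

Both directories end up holding the same record number, and the one file record carries two file name attributes, one per parent.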

Hopefully the new diagrams combined with the older ones will help you to properly visualize what NTFS is doing. To really get your head around it, it is essential to stop thinking about ‘the real copy of the file’ or ‘the file being IN the directory’.

Finally, when looking at the two link diagrams side-by-side…

[Diagram: the normal link relationship and the hard link relationship, side by side]

…you might be asking yourself, “How is the hard link different from the normal link relationship?”

The answer is that it isn’t. Technically EVERY file is hard linked. We just reserve the term for talking about files that have more than one directory linked to them.
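You can watch this happen on a scratch directory. The following is a minimal Python sketch, assuming a file system that supports hard links (NTFS does, FAT does not). The directory and file names are just the ones from the diagrams, created under a temporary folder.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    dir1 = os.path.join(root, "Dir1")
    dir2 = os.path.join(root, "Dir2")
    os.mkdir(dir1)
    os.mkdir(dir2)

    first = os.path.join(dir1, "File1.txt")
    second = os.path.join(dir2, "File1.txt")

    with open(first, "w") as f:
        f.write("hello")

    # An ordinary file already has a link count of one.
    links_before = os.stat(first).st_nlink

    # Add a second directory entry for the same file record.
    os.link(first, second)
    links_after = os.stat(first).st_nlink
    same_file = os.path.samefile(first, second)

    # Writing through one name is visible through the other,
    # because there is only one file.
    with open(second, "a") as f:
        f.write(" world")
    with open(first) as f:
        content_via_first = f.read()

print(links_before, links_after, same_file, content_via_first)
```

Neither name is ‘the real one’. Deleting either path just removes one link, and the data stays until the last link is gone.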

Now moving forward, let’s look at some real world information. Simple names like Dir1 and File1.txt are fine to start off with but we need to relate it to what’s in the Windows directory. We can do this with some easy substitutions.

Dir1 = c:\windows\system32

Dir2 = C:\Windows\winsxs\amd64_microsoft-windows-securestartup-service_31bf3856ad364e35_6.1.7600.16385_none_c09aa5b3bec88beb

File1.txt = bdesvc.dll

And I kept them color coded to keep it easier to follow.

I dumped out the metadata for the file bdesvc.dll. I’ve simplified it for readability but you can see that it has two file name attributes, one that lists a parent directory of 280b and one that lists a parent directory of 124d.

_FILE_NAME {

_MFT_SEGMENT_REFERENCE ParentDirectory {

ULONGLONG SegmentNumber : 0x000000000000280b

USHORT SequenceNumber : 0x0001

..... FileName : "bdesvc.dll"

_FILE_NAME {

_MFT_SEGMENT_REFERENCE ParentDirectory {

ULONGLONG SegmentNumber : 0x000000000000124d

USHORT SequenceNumber : 0x0001

..... FileName : "bdesvc.dll"

And of course the metadata also showed the higher ‘link count’, meaning that there are two links pointing to the file record.

USHORT ReferenceCount : 0x0002

I dumped out the metadata for both 280b and 124d and found that they were the two directories that I’d expected (system32 and amd64_microsoft-windows-securestartup-service_31bf3856ad364e35_6.1.7600.16385_none_c09aa5b3bec88beb).

Joseph brought up an example of what would happen if a private hotfix were installed. Depending on how that was done it would sever the hardlink and put a new version of the file in the system32 directory. So we would end up with two copies of the file. The old one would still be under amd64_microsoft-windows-securestartup-service_31bf3856ad364e35_6.1.7600.16385_none_c09aa5b3bec88beb. And the new one would be in the system32 directory.

Later if you were to run ‘SFC /scannow’ Windows would remove the new copy and establish a new hard link using the file that was still stored under WinSxS.

When SFC runs it compares a checksum of the file against a copy of the checksum that Windows has squirreled away somewhere.
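To make the idea concrete, here is a rough Python sketch of "compare a checksum, then re-establish the hard link". It is not how SFC is implemented. The choice of SHA-256, the function names `sha256_of` and `repair`, and the three result strings are all assumptions made for illustration.

```python
import hashlib
import os


def sha256_of(path):
    """Hash a file in chunks so large files do not have to fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def repair(projected, store_copy, expected_digest):
    """Return 'ok', 'relinked', or 'unrepairable' (hypothetical result names)."""
    if os.path.exists(projected) and sha256_of(projected) == expected_digest:
        return "ok"
    # The projected file is missing or wrong. Only trust the stored copy
    # if it still matches the recorded checksum.
    if sha256_of(store_copy) != expected_digest:
        return "unrepairable"
    if os.path.exists(projected):
        os.remove(projected)          # drop the unexpected copy
    os.link(store_copy, projected)    # establish a new hard link
    return "relinked"
```

Note the last case. If the two names are hard linked and the file is damaged, both names see the damage because there is only one file, so there is nothing good left to link back to.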

However if the one and only file were to become damaged, then SFC would fail with an error…

“Windows Resource Protection found corrupt files but was unable to fix some of them.

Details are included in the CBS.Log windir\Logs\CBS\CBS.log.”

The other main concern was how to view disk space. That’s actually the easy one.

[Screenshot: the pie chart of used and free space for the volume]

See the pie chart? It’s correct.

Okay, I’ll explain it a bit more in-depth than that.

There are two ways to view how much free space you have. The first way is to use the pie chart. The information in the pie chart actually comes from a special metafile named $BITMAP. This file maintains a list of all the clusters of the volume and whether they are in use or not. When a file needs space, $BITMAP is queried to see what is free. When space is allocated, $BITMAP is updated to show that the allocated clusters are now in use. Keep in mind that $BITMAP doesn’t track what files own what clusters. It only tracks what clusters are in use. So when we draw the pie chart, we just query $BITMAP to find out how many clusters we have and how many are free. This is also why the pie chart is populated so quickly. We just have to read a single file to build the chart.
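If you want the same kind of volume-level answer from a script, ask the operating system for the volume totals instead of adding up files. Here is a minimal Python sketch using `shutil.disk_usage`. The helper name `free_space_report` is made up, and I am not claiming this call reads $BITMAP itself, only that it reports per-volume totals rather than per-file sizes.

```python
import shutil


def free_space_report(path):
    """Return (total, used, free, percent_free) for the volume holding path."""
    usage = shutil.disk_usage(path)
    pct_free = 100.0 * usage.free / usage.total if usage.total else 0.0
    return usage.total, usage.used, usage.free, pct_free


# "." is the current directory. Pass a drive root such as "C:\\" on Windows.
total, used, free, pct_free = free_space_report(".")
print(f"total={total} used={used} free={free} ({pct_free:.1f}% free)")
```

Free space on a system volume is in constant flux, so two readings taken a moment apart may not match exactly.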

The second way to get free space is what I refer to as “the wrong way”. That is to open a CMD prompt at the root directory and do a ‘dir /s’. This will list all the files on the volume that you have access to and add up the sizes at the end. This method is just plain wrong. A big part of why it is so wrong is that hard-linked files get counted more than once: once for each directory that is linked to them. The other big reason is that DIR will only list files that you have access to. Files in the System Volume Information directory will not be included. That’s a problem because that’s where the VSS snapshots are stored. And the special metafiles that are hidden from the user are also not listed in the total. So the space used by your MFT will not be listed, your security file ($SECURE) will not be listed, and so on. There’s just too much to take into account to get a truly accurate total by adding files together.
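The double counting is easy to demonstrate. This Python sketch walks a directory tree and adds up file sizes twice: once naively, the way a recursive listing does, and once counting each underlying file only one time. The function name `tree_sizes` is made up for illustration, and the sketch assumes the platform reports a volume ID and file ID through `st_dev` and `st_ino`.

```python
import os


def tree_sizes(root):
    """Return (naive_total, unique_total) in bytes for all files under root."""
    naive = 0
    unique = 0
    seen = set()  # (volume, file id) pairs already counted
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            st = os.lstat(os.path.join(dirpath, name))
            naive += st.st_size           # counted once per directory entry
            key = (st.st_dev, st.st_ino)  # identifies the file, not the name
            if key not in seen:
                seen.add(key)
                unique += st.st_size      # counted once per file
    return naive, unique
```

Even the second number is not the space used on the volume. `os.walk` quietly skips directories it cannot read, and the metafiles never show up at all, which is the other half of the problem described above.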

I know it sounds like it should work but there are factors involved in storing your files that most people just don’t know about. As an example, Windows 2003 reserved about 12% of the volume for the MFT to have room to grow. So if you had a very large volume with just a few files, you might wonder where all your space was.

The takeaway from that is what I tell my customers and coworkers: “Trust the Pie Chart”.

I hope this has been helpful.

Robert Mitchell

High Availability

Enterprise Platform Support

Enjoy my writing? Here are other blog entries that I have authored…

https://blogs.technet.com/askcore/archive/2009/10/16/the-four-stages-of-ntfs-file-growth.aspx

https://blogs.technet.com/askcore/archive/2009/12/30/ntfs-metafiles.aspx

https://blogs.technet.com/b/askcore/archive/2010/08/25/ntfs-file-attributes.aspx

https://blogs.technet.com/b/askcore/archive/2010/10/08/gpt-in-windows.aspx

https://blogs.technet.com/b/askperf/archive/2010/12/03/performance-counter-for-iscsi.aspx

https://blogs.technet.com/b/joscon/archive/2011/01/06/how-hard-links-work.aspx

https://blogs.technet.com/b/askcore/archive/2011/04/07/gpt-and-failover-clustering.aspx

https://blogs.technet.com/askcore/archive/2010/02/18/understanding-the-2-tb-limit-in-windows-storage.aspx

Comments

  • Anonymous
    January 01, 2003
    @Dean Windows XP HAD a folder with backups. This was the DLLCache folder, which sfc used to restore files.

  • Anonymous
    January 01, 2003
    It's not used as a backup, think of it more as a flat directory of the files (similar to what you would have done in XP when you wanted the installation files locally).

  • Anonymous
    January 01, 2003
    The backup folder inside WinSxS is used to get Windows working again if some critical files are corrupted which Windows needs to boot into safe mode. This is called Windows Resource Protection: "WRP copies files that are needed to restart Windows in the cache directory located at %Windir%\winsxs\Backup. Critical files that are not needed to restart Windows are not copied to the cache directory. The size of the cache directory and the list of files copied to cache cannot be modified." msdn.microsoft.com/.../aa382530%28v=vs.85%29.aspx

  • Anonymous
    January 01, 2003
    Dean, the installation methods are very different actually.  XP was a flat file copy process: we had an ordered list of files that were expanded onto the disk one at a time.  Vista ++ explodes the install.wim to its given directory structure but it's not a flat file copy.  The foundation packages are laid down first and then the SKU differentiating packages that make up your Windows edition are laid down.  From there they are parented with the servicing stack and then projected to the appropriate directories using hard links.

  • Anonymous
    January 01, 2003
    Dean, actually I’m glad that Joseph kept reminding me.  I just had a great deal of content creation in the last 6 months, so my time was stretched pretty thin. > Although I still want to know why your first example where there is more than one entry in the MFT for a file would ever happen. The first example was of two files in two different directories that just happened to have the same name.  Since each instance of the file is unique, each gets an entry in the MFT.  Hard links are the exception.  A hard link allows you to have one file that is seemingly in two or more places at once.  As such it only gets a single entry in the MFT. I can’t address your installation question.  That’s a bit outside my area.  Perhaps Joseph can handle that one.

  • Anonymous
    January 01, 2003
    Drewfus, There is a CMD way to get the free space that does use the $BITMAP method.   'fsutil volume diskfree c:' Just be aware that if you go to compare the GUI and CMD methods you have to do them quickly.  Free space, especially on your system volume is in constant flux. Having a command like what you are suggesting wouldn’t work currently because the $BITMAP file is per volume, not per directory.  For your suggestion to work we would have to add a $BITMAP attribute to every directory….and that would end up being an extreme performance hit.

  • Anonymous
    January 01, 2003
    LOL, good suggestions Drew.  I'm not sure what the content plans around Win8 are right now but I will see what I can do to get this feedback in front of the right group(s) internally here.

  • Anonymous
    January 01, 2003
    I'd thought about that but we were trying to keep that KB as informative without being overly technical as possible to allow for easier reading.

  • Anonymous
    August 29, 2011
    Thanks Robert, very informative. "See the pie chart? Its correct." Yes, but should the pie chart be a bar chart instead? simplecomplexity.net/pie-chart-arguments "The second way to get free space is what I refer to as “the wrong way”. That is to open a CMD prompt at the root directory and do a ‘dir /s’." It's odd that the GUI interface is accurate, and the CLI isn't. How about a new command? > bitmap <drive>[path] [/free | /used | /both] [/units:<Bytes|KB|MB|GB|%>] [/b]

  • Anonymous
    August 31, 2011
    When I said: "So in reality the only difference between the way it was done in XP and the way it is done in Windows 7 ( and Vista but we won't count that :-) ) is that in Windows 7 the WinSxS directories act as the last reference to the files as a backup." I meant AFTER the files were laid down and the installation was finished. So am I right ? Also there is a major problem with your new site design. Once you get past about 5 lines of comments everything slows WAY down and you can only type one character a second, and after about 8 lines you can't see what you're typing anymore but it's there.

  • Anonymous
    August 31, 2011
    I thought the whole point of the WinSxS directory was to act as a backup of the installation files to be able to replace them from the WinSxS directory ( using hard links ) in case they needed to be replaced for some reason. In this regard Windows 7 would be different from Windows XP in that Windows XP had no directory installed that contained a backup of the installation files.

  • Anonymous
    August 31, 2011
    So the winsxs\backup directory is the backup and is where the SFC gets its files from to hardlink ? The WinSxS directory is just an installation source like copying the XP CD to the hard drive ? If so why can't the WinSxS directory also act as the backup ? Why make another redundant directory just for backup ?

  • Anonymous
    August 31, 2011
    I just looked at the WinSxS\backup folder on my Windows 7 installation and it contained 650 Amd64 manifest files. I then looked at the WinSxS\manifests folder and it contained 9,976 Amd64 manifest files. So your statement about the WinSxS\backup directory only containing the most critical manifests and files seems to be valid.

  • Anonymous
    August 31, 2011
    @joscon: "Vista ++ explodes the install.wim to its given directory structure but it's not a flat file copy.  The foundation packages are laid down first and then the SKU differentiating packages that make up your Windows edition are laid down.  From there they are parented with the servicing stack and then projected to the appropriate directories using hard links." Interesting. Questions:

  1. So by 'given directory structure' you mean relative to %systemroot%\winsxs ?
  2. Would it be accurate to think of %systemroot%\winsxs as the true installation root, %windir%\system32 as the 'working backup', and %systemroot%\winsxs\backup as the boot critical backup?
  3. How does a manifest determine foundation versus SKU package?
  4. Is 'parented with servicing stack' outlined somewhere?
  5. Is the projection of packages achieved simply by calling SFC.exe /scanNow?

    Dean is correct about the comments field. I got dizzy writing this. :-)

  • Anonymous
    August 31, 2011
    This: msdn.microsoft.com/.../aa382541(v=vs.85).aspx states: "Protected files not critical to restart Windows are not repaired." and another KB said that it only works on "non modifiable" system files. So that is also verified. The KB that got me thinking that ALL OS files were scanned and protected is KB929833 that says "The sfc /scannow command scans ALL protected system files and replaces incorrect versions with correct Microsoft versions." but you have to know ahead of time that ALL means ALL of the WRP files which are NOT ALL of the files but ALL of the critical files that are protected by WRP. I think I am getting the hang of writing definitions in circles. Maybe I should apply for a job at Microsoft now.

  • Anonymous
    August 31, 2011
    So from my "crying" posting I still need question 6 answered.

  • Anonymous
    August 31, 2011
    "5. Is the projection of packages achieved simply by calling SFC.exe /scanNow?" With what I have learned here I think I may be able to answer that one myself. I think the answer would have to be no because the SFC only works with the most critical boot files and not ALL the OS files.

  • Anonymous
    August 31, 2011
    "5. Is the projection of packages achieved simply by calling SFC.exe /scanNow?" I'm really going to go out on a limb here and say it's done by the Trusted Installer Module.

  • Anonymous
    September 01, 2011
    I would cut and paste some of this stuff into a KB article maybe titled "Explaining the Mystery of the WinSxS Directory" I would use the two postings on the Hard Links and some of the comments and answers modified for KB format.

  • Anonymous
    September 01, 2011
    Which KB ? And if there is already a KB then there needs to be an additional one. One KB that's not too technical and one that I want which is very technical. After all I think your web servers could handle the addition of one more document without running out of drive space.

  • Anonymous
    September 02, 2011
    And it's not just about the hard links. It's also about the backup directory and how the SFC works with the WinSxS directory. It all needs to be in ONE document that CLEARLY lays it all out.

  • Anonymous
    September 02, 2011
    If you want I can do a draft of what I think it should look like and send it to you.

  • Anonymous
    September 02, 2011
    @joscon: "Aside from knowing how the servicing mechanics work, what else would you do with that information?" Other than as a troubleshooting aid, understanding how a new technology works can help people feel comfortable with that technology, and in the case of Windows 6.x, contribute to overcoming this issue: apcmag.com/calling-time-microsofts-chris-jackson-on-retiring-win-xp-ie6-and-office-2003.htm
