NTFS hardlink creation performance issues

Hi Everyone,

I have a task that requires creating many thousands of hardlinks, so
I built a simple test tool to benchmark the performance. I start with
the following:

-Freshly formatted 750 GB drive with 1 simple volume
-1 directory containing 10,000 files, each 512 KB in length
-Windows 7 x64, Core i7 with 16 GB RAM

Then my test tool creates new folders, and in each one it creates a
hardlink to every file in the original folder. The first 7 folders
always go quickly (averaging around 7,000 links/second created), but
then the rate immediately drops to 700-800 links/second for however
many folders remain.
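
A minimal sketch of such a benchmark (the drive letter, file naming,
and counts are illustrative, and error handling is minimal) would
look something like this:

// Minimal hardlink benchmark sketch (illustrative paths and counts).
// Assumes D:\src already holds file00000 .. file09999; creates folders
// D:\link0000, D:\link0001, ... full of hardlinks and reports links/second.
#include <windows.h>
#include <cstdio>

int wmain()
{
    const int kFiles   = 10000;  // files in the source directory (assumption)
    const int kFolders = 20;     // number of link folders to create (assumption)

    LARGE_INTEGER freq;
    QueryPerformanceFrequency(&freq);

    for (int d = 0; d < kFolders; ++d)
    {
        wchar_t dir[MAX_PATH];
        swprintf_s(dir, MAX_PATH, L"D:\\link%04d", d);
        CreateDirectoryW(dir, NULL);       // ignore "already exists" for the sketch

        LARGE_INTEGER t0, t1;
        QueryPerformanceCounter(&t0);

        for (int f = 0; f < kFiles; ++f)
        {
            wchar_t src[MAX_PATH], dst[MAX_PATH];
            swprintf_s(src, MAX_PATH, L"D:\\src\\file%05d", f);
            swprintf_s(dst, MAX_PATH, L"%s\\file%05d", dir, f);
            if (!CreateHardLinkW(dst, src, NULL))  // new link name first, then existing file
            {
                wprintf(L"CreateHardLink failed (%lu) on %s\n", GetLastError(), dst);
                return 1;
            }
        }

        QueryPerformanceCounter(&t1);
        double sec = double(t1.QuadPart - t0.QuadPart) / double(freq.QuadPart);
        wprintf(L"folder %2d: %.0f links/second\n", d, kFiles / sec);
    }
    return 0;
}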

I have tried many things and none of them affect this in any
measurable way. It is completely repeatable in terms of when it
happens and how much the performance drops off. I have tried the
following (the registry values behind the first three tweaks are
sketched in code after the list):

-Disabling short name creation
-Disabling last access update (and rebooting)
-Setting the MFTZone to 4 (and rebooting and reformatting the volume)
-Rebooting after the slowdown begins; after the reboot, additional
folders are still created at the slower speed
-Deleting all folders except the original and then creating new
folders - still slow
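
In case it helps anyone reproduce those first three tweaks: they
correspond to DWORD values under
HKLM\SYSTEM\CurrentControlSet\Control\FileSystem, and the sketch
below is just one way to set them (fsutil or regedit do the same
job). A reboot is still required afterwards.

// Sketch: set the NTFS registry values behind the first three tweaks above.
// Run elevated; the changes only take effect after a reboot.
#include <windows.h>
#include <cstdio>

static void SetFsDword(HKEY key, const wchar_t* name, DWORD value)
{
    LONG rc = RegSetValueExW(key, name, 0, REG_DWORD,
                             reinterpret_cast<const BYTE*>(&value), sizeof(value));
    wprintf(L"%s = %lu -> %s\n", name, value,
            rc == ERROR_SUCCESS ? L"ok" : L"failed");
}

int wmain()
{
    HKEY key;
    if (RegOpenKeyExW(HKEY_LOCAL_MACHINE,
                      L"SYSTEM\\CurrentControlSet\\Control\\FileSystem",
                      0, KEY_SET_VALUE, &key) != ERROR_SUCCESS)
    {
        wprintf(L"RegOpenKeyEx failed\n");
        return 1;
    }

    SetFsDword(key, L"NtfsDisable8dot3NameCreation", 1);  // disable short name creation
    SetFsDword(key, L"NtfsDisableLastAccessUpdate", 1);   // disable last-access updates
    SetFsDword(key, L"NtfsMftZoneReservation", 4);        // largest MFT zone reservation
    RegCloseKey(key);
    return 0;
}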

Using the contig tool, I see that $Mft::Bitmap is in only 2
fragments, so that doesn’t seem to be an issue.

Every time I reformat and create a new base folder, the first 7
folders of links are very fast and the remaining folders drop to
roughly the same 1/10 performance.

Can anyone shed some light on what wall I am hitting and, hopefully,
any ways to avoid it? The 7,000 links/s rate would be perfect for my
application, but the 700 links/s rate will not do, so this
performance result is going to make or break the entire thing. I
regret that I cannot disclose more about the “why” of what I’m doing;
it is protected under NDA for a client. In any case, I hope the
limitations that exist here will be of interest to the community.

Thanks,
-JT

OK, the plot thickens. I have done a lot more testing and found something interesting: it doesn’t matter how many files are on the volume. I originally thought this was related to total file counts (and MFT fragmentation, etc.), but that’s not it at all. I can create 10 directories, each with 10,000 unique files, and when I start creating the directories full of hardlinks I get 7 fast ones and then they slow down. If I then start linking to a different source directory of unique files, again I get 7 fast directories and then they slow down.

So what this tells me is that the speed is related to the link count, not the volume’s file count. And the magic number is 8 (my original filename plus the 7 links). Any link count higher than that and there is a big drop in creation performance.
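
(For anyone following along, the link counts themselves are easy to confirm from user mode: GetFileInformationByHandle reports them in nNumberOfLinks. A quick sketch, with an illustrative path:)

// Report the NTFS hardlink count of a file via GetFileInformationByHandle.
#include <windows.h>
#include <cstdio>

int wmain()
{
    // Any one of the linked names will do; this path is illustrative.
    HANDLE h = CreateFileW(L"D:\\src\\file00000", FILE_READ_ATTRIBUTES,
                           FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE,
                           NULL, OPEN_EXISTING, 0, NULL);
    if (h == INVALID_HANDLE_VALUE)
    {
        wprintf(L"CreateFile failed (%lu)\n", GetLastError());
        return 1;
    }

    BY_HANDLE_FILE_INFORMATION info;
    if (GetFileInformationByHandle(h, &info))
        wprintf(L"hardlink count: %lu\n", info.nNumberOfLinks);

    CloseHandle(h);
    return 0;
}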

The really interesting thing is that this behavior seems to be “sticky”. In other words, once the link count exceeds 8, creating new links stays at the slower rate even if I first delete half of the links (recreating link numbers 4-8, for example).

Can anyone enlighten me as to what is going on here and if there’s any way to get around it?

If I had to guess, I’d say it sounds like something is going from resident (stored inside the file’s MFT record) to non-resident (stored outside the MFT record, in separately allocated clusters). Have you looked at the information in the MFT record after you get into this “slow link count” state?

I note that http://en.wikipedia.org/wiki/NTFS states “Hard links can only be applied to files on the same volume since an additional filename record is added to the file’s MFT record”, which is consistent with the theory that this is related to what happens when the MFT record overflows.

Ultimately, you’d need to confirm either by looking at the MFT record directly or by having someone familiar with the NTFS implementation confirm (or deny) this theory.
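
If you want to take that look from user mode, FSCTL_GET_NTFS_FILE_RECORD will hand back the raw FILE record for a given file reference number (the volume handle needs to be opened elevated). Here is a rough sketch; the paths are illustrative, and the attribute-walking offsets come from the documented NTFS on-disk layout rather than any public header, so treat that part as an assumption:

// Sketch: dump the attribute type codes in a file's MFT FILE record using
// FSCTL_GET_NTFS_FILE_RECORD.  Type 0x30 is $FILE_NAME, 0x20 is $ATTRIBUTE_LIST.
// Run elevated; a production parser would also apply the update sequence (fixup) array.
#include <windows.h>
#include <winioctl.h>
#include <cstdio>

int wmain()
{
    // Illustrative paths: one of the linked files, and the volume it lives on.
    HANDLE file = CreateFileW(L"D:\\src\\file00000", FILE_READ_ATTRIBUTES,
                              FILE_SHARE_READ | FILE_SHARE_WRITE, NULL,
                              OPEN_EXISTING, 0, NULL);
    HANDLE vol  = CreateFileW(L"\\\\.\\D:", GENERIC_READ,
                              FILE_SHARE_READ | FILE_SHARE_WRITE, NULL,
                              OPEN_EXISTING, 0, NULL);
    if (file == INVALID_HANDLE_VALUE || vol == INVALID_HANDLE_VALUE)
        return 1;

    // The 64-bit file ID doubles as the file reference number for the FSCTL.
    BY_HANDLE_FILE_INFORMATION info;
    GetFileInformationByHandle(file, &info);

    NTFS_FILE_RECORD_INPUT_BUFFER in;
    in.FileReferenceNumber.QuadPart =
        ((ULONGLONG)info.nFileIndexHigh << 32) | info.nFileIndexLow;

    BYTE outBuf[sizeof(NTFS_FILE_RECORD_OUTPUT_BUFFER) + 4096];
    DWORD bytes;
    if (!DeviceIoControl(vol, FSCTL_GET_NTFS_FILE_RECORD, &in, sizeof(in),
                         outBuf, sizeof(outBuf), &bytes, NULL))
        return 1;

    NTFS_FILE_RECORD_OUTPUT_BUFFER* out = (NTFS_FILE_RECORD_OUTPUT_BUFFER*)outBuf;
    BYTE* rec = out->FileRecordBuffer;

    // Per the documented FILE record header: link count at offset 0x12,
    // offset of the first attribute at 0x14.
    wprintf(L"link count in record: %u\n", *(USHORT*)(rec + 0x12));

    DWORD attrOffset = *(USHORT*)(rec + 0x14);
    while (attrOffset + 8 <= out->FileRecordLength)
    {
        DWORD type = *(DWORD*)(rec + attrOffset);        // attribute type code
        if (type == 0xFFFFFFFF)                          // end-of-attributes marker
            break;
        DWORD length = *(DWORD*)(rec + attrOffset + 4);  // total attribute length
        wprintf(L"attribute type 0x%02lX, length %lu\n", type, length);
        if (length == 0)
            break;
        attrOffset += length;
    }

    CloseHandle(vol);
    CloseHandle(file);
    return 0;
}

If a 0x20 ($ATTRIBUTE_LIST) shows up once you cross the threshold, that would be a pretty strong confirmation that the record has overflowed.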

Tony
OSR

I guess:
When you create a new hardlink for a file, NTFS adds a new $FILE_NAME ($30) attribute to the file's MFT record. After you create about 7 hardlinks, the MFT record (1,024 bytes) can no longer hold all of the attributes, so NTFS creates an $ATTRIBUTE_LIST ($20) attribute for the file and moves attributes out into extension records, and that will slow down file access.
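
The arithmetic is at least in the right ballpark. Using the commonly documented on-disk sizes (these are assumptions about a typical record, not measurements from the volume in question), each resident $FILE_NAME attribute costs on the order of 110-120 bytes, so a 1 KB record only has room for a handful of names once the record header, $STANDARD_INFORMATION and the non-resident $DATA attribute are accounted for:

// Back-of-the-envelope check of the "link count ~8" threshold.
// All sizes are the commonly documented NTFS on-disk values and are
// assumptions about a typical record, not measurements.
#include <cstdio>

int main()
{
    const int record     = 1024;     // default MFT FILE record size
    const int fileHeader = 56;       // FILE record header (first attribute at 0x38)
    const int stdInfo    = 24 + 72;  // resident header + $STANDARD_INFORMATION value
    const int dataAttr   = 72;       // small non-resident $DATA attribute (approx.)
    const int endMarker  = 8;        // 0xFFFFFFFF end-of-attributes marker
    const int nameLen    = 8;        // characters per link name (example)

    int fileName = 24 + 66 + 2 * nameLen;   // resident $FILE_NAME attribute
    fileName     = (fileName + 7) & ~7;     // attributes are 8-byte aligned

    int freeSpace = record - fileHeader - stdInfo - dataAttr - endMarker;
    printf("~%d bytes per $FILE_NAME, room for roughly %d names in the record\n",
           fileName, freeSpace / fileName);
    return 0;
}

The exact count depends on the name lengths, whether 8.3 short names are present, and the size of the $DATA run list, but 7 or 8 names before the record overflows lines up with the threshold being reported here.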