* Link count for directories
From: Steve Keller @ 2020-08-21 17:40 UTC
To: linux-btrfs

Are there any plans to implement the traditional link count behavior in btrfs,
as described in the following URL?

https://btrfs.wiki.kernel.org/index.php/Project_ideas#Track_link_count_for_directories

Would it be a major effort to do so? I'd really like that feature.

Steve
* Re: Link count for directories
From: Nikolay Borisov @ 2020-08-24 12:44 UTC
To: Steve Keller, linux-btrfs

On 21.08.20 at 20:40, Steve Keller wrote:
> Are there any plans to implement the traditional link count behavior in btrfs,
> as described in the following URL?
>
> https://btrfs.wiki.kernel.org/index.php/Project_ideas#Track_link_count_for_directories
>
> Would it be a major effort to do so? I'd really like that feature.

I have implemented it, so it's not that big of a deal. However, it turns out
it has pretty steep requirements for backport because so far btrfs has always
kept the link count of dirs at 1. So such a change should be justifiable,
because it's not only the kernel code that is affected but also:

1. Backporting the relevant patch to older, stable kernels
2. Changing btrfs-progs so that it doesn't erroneously think a kernel
   with link count larger than 1 is broken.

So how effective is such an optimisation to the software using it?

>
> Steve
>
* Re: Link count for directories
From: Steve Keller @ 2020-08-24 21:50 UTC
To: linux-btrfs

Nikolay Borisov <nborisov@suse.com> wrote:

> I have implemented it so it's not that big of a deal. However turns out
> it has pretty steep requirements for backport because so far btrfs
> always kept the link count of dirs to 1. So such a change should be
> justifiable because it's not only the kernel code that is affected but:
>
> 1. Backporting relevant patch to older, stable kernels

Why would that be needed?

> 2. Changing btrfs-progs so that it doesn't erroneously think a kernel
> with link count larger than 1 is broken.

OK, should be doable, right?

> So how effective is such an optimisation to the software using it ?

It's not only optimization like in find(1). As an old and long-time Unix
user I'd also like that traditional behavior. It just feels more correct
since if you do mkdir ./a ./b ./c ./d, you will actually see the 4 links
to the current dir if you do ls -ai a b c d and the two links from . itself
and from ..

Steve
* Re: Link count for directories
From: Adam Borowski @ 2020-08-24 22:23 UTC
To: linux-btrfs

On Mon, Aug 24, 2020 at 11:50:30PM +0200, Steve Keller wrote:
> Nikolay Borisov <nborisov@suse.com> wrote:
> > I have implemented it so it's not that big of a deal. However turns out
> > it has pretty steep requirements for backport because so far btrfs
> > always kept the link count of dirs to 1.
>
> > So how effective is such an optimisation to the software using it ?
>
> It's not only optimization like in find(1). As an old and long-time Unix
> user I'd also like that traditional behavior. It just feels more correct
> since if you do mkdir ./a ./b ./c ./d, you will actually see the 4 links
> to the current dir if you do ls -ai a b c d and the two links from . itself
> and from ..

It's just an implementation detail of sysvfs, and a case of
bug-compatibility. The link count of a directory is always 1 as btrfs,
ext4, xfs, etc -- none of them support directory hardlinks, unlike sysvfs.

So the proper value, as documented, is 1. Copying sysvfs behaviour is also
costly as you need to know the count of contents while statting parent.


Meow!
-- 
⢀⣴⠾⠻⢶⣦⠀
⣾⠁⢠⠒⠀⣿⡁
⢿⡄⠘⠷⠚⠋⠀ It's time to migrate your Imaginary Protocol from version 4i to 6i.
⠈⠳⣄⠀⠀⠀⠀
* Re: Link count for directories
From: Urs Thuermann @ 2020-08-25 12:24 UTC
To: linux-btrfs

Adam Borowski <kilobyte@angband.pl> writes:

> It's just an implementation detail of sysvfs, and a case of
> bug-compatibility. The link count of a directory is always 1 as btrfs,
> ext4, xfs, etc -- none of them support directory hardlinks, unlike sysvfs.

No, almost all other file systems handle the directory link count in
the traditional way; at least minix, ext2, ext3, ext4, xfs, tmpfs,
devtmpfs, devpts, sysfs, and cgroup on Linux do so. FreeBSD ufs
and devfs, NetBSD ffs, tmpfs, and kernfs, and OpenBSD ffs and mfs do
that as well.

I'd like to check how zfs handles this (on Linux, FreeBSD or Solaris),
but currently have no access to a system using it.

> So the proper value, as documented, is 1. Copying sysvfs behaviour is also
> costly as you need to know the count of contents while statting parent.

No, it's not that costly. Directories start with nlink = 2 and nlink
is incremented or decremented with each mkdir or rmdir system call.

urs
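For readers who want to check the rule from the message above on their own system, here is a minimal sketch in Python. It is only an illustration, not kernel code: `traditional_nlink` and `compare` are hypothetical helper names, and the observed `st_nlink` depends on the filesystem under the temporary directory (per this thread, btrfs reports 1).

```python
import os
import tempfile


def traditional_nlink(subdir_count: int) -> int:
    """Traditional scheme: one link from the parent's entry, one from the
    directory's own ".", plus one ".." link per subdirectory."""
    return 2 + subdir_count


def compare(path: str) -> tuple:
    """Return (observed st_nlink, traditional value) for a directory."""
    with os.scandir(path) as entries:
        subdirs = sum(1 for e in entries if e.is_dir(follow_symlinks=False))
    return os.stat(path).st_nlink, traditional_nlink(subdirs)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as top:
        # Same layout as "mkdir ./a ./b ./c ./d" from earlier in the thread.
        for name in ("a", "b", "c", "d"):
            os.mkdir(os.path.join(top, name))
        observed, expected = compare(top)
        # The two values should agree on filesystems that keep the
        # traditional count; on btrfs the observed value is expected to be 1.
        print(f"observed st_nlink={observed}, traditional={expected}")
```

The traditional value for the four-subdirectory case is 6; whether the observed value matches is exactly the behavior under discussion.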
* Re: Link count for directories
From: Nikolay Borisov @ 2020-08-25 12:38 UTC
To: Urs Thuermann, linux-btrfs

On 25.08.20 at 15:24, Urs Thuermann wrote:
> Adam Borowski <kilobyte@angband.pl> writes:
>
>> It's just an implementation detail of sysvfs, and a case of
>> bug-compatibility. The link count of a directory is always 1 as btrfs,
>> ext4, xfs, etc -- none of them support directory hardlinks, unlike sysvfs.
>
> No, almost all other file systems handle the directory link count in
> the traditional way, at least minix, ext2, ext3, ext4, xfs, tmpfs,
> devtmpfs, devpts, sysfs, cgroup on Linux do so. And also FreeBSD ufs
> and devfs, NetBSD ffs, tmpfs, and kernfs, and OpenBSD ffs and mfs do
> that also.
>
> I'd like to check how zfs handles this (on Linux, FreeBSD or Solaris),
> but currently have no access to a system using it.
>
>> So the proper value, as documented, is 1. Copying sysvfs behaviour is also
>> costly as you need to know the count of contents while statting parent.
>
> No, it's not that costly. Directories start with nlink = 2 and nlinks
> is incremented or decremented with each mkdir or rmdir system call.

It's slightly more complicated on btrfs because subvolumes and snapshots
are also directories from the POV of the user but are created in
different code paths. But that's the general idea.

>
> urs
>
* Re: Link count for directories
From: Nikolay Borisov @ 2020-08-25 12:37 UTC
To: Adam Borowski, linux-btrfs

On 25.08.20 at 1:23, Adam Borowski wrote:
> On Mon, Aug 24, 2020 at 11:50:30PM +0200, Steve Keller wrote:
>> Nikolay Borisov <nborisov@suse.com> wrote:
>>> I have implemented it so it's not that big of a deal. However turns out
>>> it has pretty steep requirements for backport because so far btrfs
>>> always kept the link count of dirs to 1.
>
>>> So how effective is such an optimisation to the software using it ?
>>
>> It's not only optimization like in find(1). As an old and long-time Unix
>> user I'd also like that traditional behavior. It just feels more correct
>> since if you do mkdir ./a ./b ./c ./d, you will actually see the 4 links
>> to the current dir if you do ls -ai a b c d and the two links from . itself
>> and from ..
>
> It's just an implementation detail of sysvfs, and a case of
> bug-compatibility. The link count of a directory is always 1 as btrfs,
> ext4, xfs, etc -- none of them support directory hardlinks, unlike sysvfs.

Wrong, ext4 and xfs do maintain the count.

>
> So the proper value, as documented, is 1. Copying sysvfs behaviour is also
> costly as you need to know the count of contents while statting parent.
>

Turns out it's not costly at all, at least implementation-wise. However, it
incurs a significant cost on maintainability because progs need to be
modified: otherwise they'd think a filesystem with a directory link count
different from 1 is broken... So really the pros need to outweigh the cons.

>
> Meow!
>
* Re: Link count for directories
From: Chris Mason @ 2020-08-25 13:46 UTC
To: Nikolay Borisov; +Cc: Adam Borowski, linux-btrfs

On 25 Aug 2020, at 8:37, Nikolay Borisov wrote:

> On 25.08.20 at 1:23, Adam Borowski wrote:
>> On Mon, Aug 24, 2020 at 11:50:30PM +0200, Steve Keller wrote:
>>> Nikolay Borisov <nborisov@suse.com> wrote:
>>>> I have implemented it so it's not that big of a deal. However turns
>>>> out it has pretty steep requirements for backport because so far btrfs
>>>> always kept the link count of dirs to 1.
>>
>>>> So how effective is such an optimisation to the software using it ?
>>>
>>> It's not only optimization like in find(1). As an old and long-time
>>> Unix user I'd also like that traditional behavior. It just feels more
>>> correct since if you do mkdir ./a ./b ./c ./d, you will actually see
>>> the 4 links to the current dir if you do ls -ai a b c d and the two
>>> links from . itself and from ..
>>
>> It's just an implementation detail of sysvfs, and a case of
>> bug-compatibility. The link count of a directory is always 1 as
>> btrfs, ext4, xfs, etc -- none of them support directory hardlinks,
>> unlike sysvfs.
>
> Wrong, ext4 and xfs do maintain the count.

Correct, at least until they hit 65536. It’s some weird special case
handling for an optimization I didn’t see applications really using, so
I just went with a fixed link count of one.

Find certainly paid attention, but from an optimization point of view,
d_type in directory entries is dramatically more helpful. I’d want to see
big wins from applications before adding this into btrfs.

-chris
* Re: Link count for directories
From: Urs Thuermann @ 2020-08-26 7:04 UTC
To: linux-btrfs

Chris Mason <clm@fb.com> writes:

> > Wrong, ext4 and xfs do maintain the count.
>
> Correct, at least until they hit 65536.

For ext4 the limit is actually 65000 links. For directories, nlink is
set to 1 if an additional sub-directory is created (and stays at 1
even if you remove directories), while for other files you get EMLINK.

In xfs there seems to be no limit (dirs and other files, I have seen
more than 200000 for both) and btrfs has a limit of 65535 links on
non-directories.

> Find certainly paid attention, but from an optimization point of
> view, d_type in directory entries is dramatically more helpful.

1. d_type is not in POSIX. OTOH, I seem to remember that POSIX states
   st_nlink == 1 for directories or otherwise it has the traditional
   value of 2 + sub-directory count. However, I cannot find this
   anymore. Is that correct and can anyone give me a pointer?

2. If you just want to find all directories you only need stat(2) on a
   directory and if it has st_nlink == 2 you don't need to read all
   directory entries (with or without d_type). So this optimization is
   possible with the traditional link count of directories but not
   with d_type.

> I'd want to see big wins from applications before adding this
> into btrfs.

I would expect noticeable but not big wins. However, I'd like adding
this to btrfs just because it looks nicer. Since ls (and readdir)
gives you the . and .. links they should be counted in the usual way.

urs
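The two optimizations compared in this message can be put side by side in a short Python sketch. It is an illustration under stated assumptions, not how find(1) is implemented: `may_have_subdirs` and `count_dirs` are hypothetical names, and the `st_nlink == 2` shortcut is only valid on filesystems that keep the traditional count, so a real tool would have to decide when to trust it (hence the `trust_nlink` switch).

```python
import os
import tempfile


def may_have_subdirs(st_nlink: int) -> bool:
    """Interpret a directory's link count.

    2    -> traditional count, no subdirectories: the directory can be skipped
    1    -> the filesystem does not track the count (btrfs today): unknown
    else -> at least one subdirectory
    """
    return st_nlink != 2


def count_dirs(root: str, trust_nlink: bool = True) -> int:
    """Count root plus every directory below it."""
    total = 1
    if trust_nlink and not may_have_subdirs(os.stat(root).st_nlink):
        return total  # leaf directory: no readdir needed at all
    with os.scandir(root) as entries:
        for entry in entries:
            # is_dir() relies on d_type from readdir when the OS supplies it,
            # so non-directories normally need no extra stat call.
            if entry.is_dir(follow_symlinks=False):
                total += count_dirs(entry.path, trust_nlink)
    return total


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as top:
        os.makedirs(os.path.join(top, "a", "x"))
        os.mkdir(os.path.join(top, "b"))
        print(count_dirs(top, trust_nlink=False))
```

With a link count of 1 the shortcut never fires and the walk falls back to reading every directory, which is the situation on btrfs described in the thread.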
* Re: Link count for directories
From: David Sterba @ 2020-08-31 13:03 UTC
To: Urs Thuermann; +Cc: linux-btrfs

On Wed, Aug 26, 2020 at 09:04:21AM +0200, Urs Thuermann wrote:
> > Find certainly paid attention, but from an optimization point of
> > view, d_type in directory entries is dramatically more helpful.
>
> 1. d_type is not in POSIX.

It is not, but is this a problem in practice? Linux supports d_type and
that's what matters right now (the manual page says that some BSDs
support it too).

> OTOH, I seem to remember that POSIX states
> st_nlink == 1 for directories or otherwise it has the traditional
> value of 2 + sub-directory count. However, I cannot find this
> anymore. Is that correct and can anyone give me a pointer?

I can't find a reliable source for that but I remember reading it
somewhere.

> 2. If you just want to find all directories you only need stat(2) on a
> directory and if it has st_nlink == 2 you don't need to read all
> directory entries (with or without d_type). So this optimization is
> possible with the traditional link count of directories but not
> with d_type.
>
> > I'd want to see big wins from applications before adding this
> > into btrfs.
>
> I would expect noticeable but not big wins.

The use case you describe skips readdir in case there are no
directories, i.e. if there are, a full readdir will need to be done
anyway. I can see that this saves some calls but otherwise it's IMHO
quite narrow.

> However, I'd like adding
> this to btrfs just because it looks nicer. Since ls (and readdir)
> gives you the . and .. links they should be counted in the usual way.

As said elsewhere, this comes with a cost of backward compatibility
issues and unfortunately overrides a 'nice to have' feature.
* Re: Link count for directories
From: Nikolay Borisov @ 2020-08-25 12:39 UTC
To: Steve Keller, linux-btrfs

On 25.08.20 at 0:50, Steve Keller wrote:
> Nikolay Borisov <nborisov@suse.com> wrote:
>
>> I have implemented it so it's not that big of a deal. However turns out
>> it has pretty steep requirements for backport because so far btrfs
>> always kept the link count of dirs to 1. So such a change should be
>> justifiable because it's not only the kernel code that is affected but:
>>
>> 1. Backporting relevant patch to older, stable kernels
>
> Why would that be needed?

So what happens when a new filesystem (i.e. one created with a kernel
with this feature) gets mounted on an older kernel (one that doesn't
support it)? (It's a rhetorical question: it will refuse to mount because
of the tree checker.)

>
>> 2. Changing btrfs-progs so that it doesn't erroneously think a kernel
>> with link count larger than 1 is broken.
>
> OK, should be doable, right?

I never said it wasn't; however, the question is whether it's worth doing
that work.

>
>> So how effective is such an optimisation to the software using it ?
>
> It's not only optimization like in find(1). As an old and long-time Unix
> user I'd also like that traditional behavior. It just feels more correct
> since if you do mkdir ./a ./b ./c ./d, you will actually see the 4 links
> to the current dir if you do ls -ai a b c d and the two links from . itself
> and from ..
>
> Steve
>
* Re: Link count for directories
From: David Sterba @ 2020-08-31 13:18 UTC
To: Steve Keller; +Cc: linux-btrfs

On Fri, Aug 21, 2020 at 07:40:19PM +0200, Steve Keller wrote:
> Are there any plans to implement the traditional link count behavior in btrfs,
> as described in the following URL?
>
> https://btrfs.wiki.kernel.org/index.php/Project_ideas#Track_link_count_for_directories
>
> Would it be a major effort to do so? I'd really like that feature.

So the main concern is backward compatibility and what would happen if a
filesystem with nlink tracking gets mounted by a kernel without the
support. The wiki project entry seems to be too optimistic regarding
that (and I think it was me adding it there):

  It seems that the link count can be tracked like the other filesystems
  do. This will be even backward compatible:

  * for new directories and subvolumes, set the initial link count to 2
  * a mkdir/rmdir/move/snapshot will update the link count accordingly
    iff the current link count is not 1

Bad scenario:

* new kernel creates directory with many subdirectories, with nlink e.g. 100
* reboot to an older kernel
* delete some of the subdirectories, nlink untouched and silently out of sync
* reboot to new kernel
* creating more subdirectories will not fix nlink, only add the new
  entries, and deletion can go below zero (though it would stop at 1)

This does not sound unrealistic to me, e.g. booting a new kernel after
update and then going back because some random driver does not work. All
the directories created meanwhile will be affected.

Usually such incompatibilities are shielded by incompat bits but in this
case it sounds like a too heavy measure for a minor performance
optimization. I'll move the project idea away with explanation why it's
not implemented.
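The bad scenario above can be replayed with a toy model in Python. This is a simulation under assumptions, not btrfs code: the `Dir` class and the `mkdir`/`rmdir` helpers are hypothetical, the "new kernel" follows the wiki rule (update the count iff it is not 1), the "old kernel" is assumed to never touch the stored count, and the numbers 98, 10 and 5 are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Dir:
    nlink: int = 2    # value stored in the inode (new directories start at 2)
    subdirs: int = 0  # subdirectories that really exist on disk

    def traditional(self) -> int:
        """What the link count should be under the traditional scheme."""
        return 2 + self.subdirs


def mkdir(d: Dir, new_kernel: bool) -> None:
    d.subdirs += 1
    if new_kernel and d.nlink != 1:  # wiki rule: update iff count is not 1
        d.nlink += 1


def rmdir(d: Dir, new_kernel: bool) -> None:
    d.subdirs -= 1
    if new_kernel and d.nlink != 1:
        d.nlink -= 1


def bad_scenario() -> Dir:
    d = Dir()
    for _ in range(98):  # new kernel: stored nlink reaches 100
        mkdir(d, new_kernel=True)
    for _ in range(10):  # old kernel: entries removed, nlink untouched
        rmdir(d, new_kernel=False)
    for _ in range(5):   # new kernel again: increments a stale value
        mkdir(d, new_kernel=True)
    return d


if __name__ == "__main__":
    d = bad_scenario()
    print(f"stored nlink={d.nlink}, traditional={d.traditional()}")
    # prints: stored nlink=105, traditional=95
```

In this model the stored count stays 10 too high after the detour through the old kernel, and nothing in the update rule ever corrects it, which is the silent drift the message describes.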