* ext3: fix ext3_dx_readdir hash collision handling - Regression @ 2008-10-22 9:32 Markus Trippelsdorf 2008-10-23 3:28 ` Theodore Tso 0 siblings, 1 reply; 13+ messages in thread From: Markus Trippelsdorf @ 2008-10-22 9:32 UTC (permalink / raw) To: linux-kernel, eugene, msnitzer, tytso, akpm, torvalds Commit 6a897cf447a83c9c3fd1b85a1e525c02d6eada7d "ext3: fix ext3_dx_readdir hash collision handling" causes a regression when deleting big directories. I'm using an ext3 filesystem in data=writeback mode as my root fs. When I untar a kernel tarball and then rm -r the files, this is what happens on my machine (latest git): markus@gentoox2 ~ % tar xjf linux-2.6.27.2.tar.bz2 markus@gentoox2 ~ % rm -r linux-2.6.27.2 rm: cannot remove `linux-2.6.27.2/arch/alpha/include/asm/statfs.h': No such file or directory rm: cannot remove `linux-2.6.27.2/arch/m68knommu/include/asm/cputime.h': No such file or directory rm: cannot remove `linux-2.6.27.2/arch/sparc/include/asm/highmem.h': No such file or directory rm: cannot remove `linux-2.6.27.2/arch/sparc/include/asm/sections_32.h': No such file or directory rm: cannot remove `linux-2.6.27.2/arch/s390/include/asm/io.h': No such file or directory rm: cannot remove `linux-2.6.27.2/arch/arm/configs/pcm038_defconfig': No such file or directory rm: cannot remove `linux-2.6.27.2/arch/powerpc/include/asm/pgtable-ppc32.h': No such file or directory rm: cannot remove `linux-2.6.27.2/include/asm-um/linkage.h': No such file or directory rm: cannot remove `linux-2.6.27.2/include/asm-m68k/cachectl.h': No such file or directory markus@gentoox2 ~ % rm -r linux-2.6.27.2 markus@gentoox2 ~ % After reverting 6a897cf447a83c9c3fd1b85a1e525c02d6eada7d I get the expected result: markus@gentoox2 ~ % tar xjf linux-2.6.27.2.tar.bz2 markus@gentoox2 ~ % rm -r linux-2.6.27.2 markus@gentoox2 ~ % -- Markus ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: ext3: fix ext3_dx_readdir hash collision handling - Regression 2008-10-22 9:32 ext3: fix ext3_dx_readdir hash collision handling - Regression Markus Trippelsdorf @ 2008-10-23 3:28 ` Theodore Tso 2008-10-23 6:37 ` Markus Trippelsdorf 0 siblings, 1 reply; 13+ messages in thread From: Theodore Tso @ 2008-10-23 3:28 UTC (permalink / raw) To: Markus Trippelsdorf; +Cc: linux-kernel, eugene, msnitzer, akpm, torvalds On Wed, Oct 22, 2008 at 11:32:01AM +0200, Markus Trippelsdorf wrote: > Commit 6a897cf447a83c9c3fd1b85a1e525c02d6eada7d > "ext3: fix ext3_dx_readdir hash collision handling" causes a regression > when deleting big directories. > > I'm using an ext3 filesystem in data=writeback mode as my root fs. > When I untar a kernel tarball and then rm -r the files, this is what > happens on my machine (latest git): > > markus@gentoox2 ~ % tar xjf linux-2.6.27.2.tar.bz2 > markus@gentoox2 ~ % rm -r linux-2.6.27.2 > rm: cannot remove `linux-2.6.27.2/arch/alpha/include/asm/statfs.h': No such file or directory... I haven't been able to replicate this. Does it matter which kernel tarball you use? And can you send me the output of "dumpe2fs -h /dev/hdXX", where hdXX is the device of the filesystem where this was failing? Thanks, - Ted ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: ext3: fix ext3_dx_readdir hash collision handling - Regression 2008-10-23 3:28 ` Theodore Tso @ 2008-10-23 6:37 ` Markus Trippelsdorf 2008-10-24 0:01 ` Theodore Tso 0 siblings, 1 reply; 13+ messages in thread From: Markus Trippelsdorf @ 2008-10-23 6:37 UTC (permalink / raw) To: Theodore Tso, linux-kernel, eugene, msnitzer, akpm, torvalds On Wed, Oct 22, 2008 at 11:28:32PM -0400, Theodore Tso wrote: > On Wed, Oct 22, 2008 at 11:32:01AM +0200, Markus Trippelsdorf wrote: > > Commit 6a897cf447a83c9c3fd1b85a1e525c02d6eada7d > > "ext3: fix ext3_dx_readdir hash collision handling" causes a regression > > when deleting big directories. > > > > I'm using an ext3 filesystem in data=writeback mode as my root fs. > > When I untar a kernel tarball and then rm -r the files, this is what > > happens on my machine (latest git): > > > > markus@gentoox2 ~ % tar xjf linux-2.6.27.2.tar.bz2 > > markus@gentoox2 ~ % rm -r linux-2.6.27.2 > > rm: cannot remove `linux-2.6.27.2/arch/alpha/include/asm/statfs.h': No such file or directory... > > I haven't been able to replicate this. Does it matter which kernel > tarball you use? And can you send me the output of "dumpe2fs -h > /dev/hdXX", where hdXX is the device of the filesystem where this was > failing? No, it happens with _every_ tarball from a certain size on. Ext4 is also affected due to d015641734cde55d2fce48a6db3983c8a029fe05 . This is the output of dumpe2fs -h /dev/sda1: dumpe2fs 1.41.3 (12-Oct-2008) Filesystem volume name: / Last mounted on: <not available> Filesystem UUID: 9a6c302d-e3f3-4fea-96d8-dad6890780a9 Filesystem magic number: 0xEF53 Filesystem revision #: 1 (dynamic) Filesystem features: has_journal ext_attr resize_inode dir_index filetype needs_recovery sparse_super large_file Filesystem flags: signed_directory_hash Default mount options: journal_data_writeback Filesystem state: clean Errors behavior: Continue Filesystem OS type: Linux Inode count: 6406144 Block count: 25599569 Reserved block count: 1279978 Free blocks: 19067411 Free inodes: 5642168 First block: 0 Block size: 4096 Fragment size: 4096 Reserved GDT blocks: 1017 Blocks per group: 32768 Fragments per group: 32768 Inodes per group: 8192 Inode blocks per group: 512 Filesystem created: Sat May 3 13:22:22 2008 Last mount time: Thu Oct 23 08:27:43 2008 Last write time: Thu Oct 23 08:27:43 2008 Mount count: 40 Maximum mount count: -1 Last checked: Wed Oct 15 20:42:58 2008 Check interval: 0 (<none>) Reserved blocks uid: 0 (user root) Reserved blocks gid: 0 (group root) First inode: 11 Inode size: 256 Journal inode: 8 First orphan inode: 2233455 Default directory hash: tea Directory Hash Seed: c3188d40-b161-4458-b3f9-22ca7788110f Journal backup: inode blocks Journal size: 128M -- Markus ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: ext3: fix ext3_dx_readdir hash collision handling - Regression 2008-10-23 6:37 ` Markus Trippelsdorf @ 2008-10-24 0:01 ` Theodore Tso [not found] ` <20081024042851.GA2360@gentoox2.trippelsdorf.de> 0 siblings, 1 reply; 13+ messages in thread From: Theodore Tso @ 2008-10-24 0:01 UTC (permalink / raw) To: Markus Trippelsdorf; +Cc: linux-kernel, eugene, msnitzer, akpm, torvalds On Thu, Oct 23, 2008 at 08:37:40AM +0200, Markus Trippelsdorf wrote: > > I haven't been able to replicate this. Does it matter which kernel > > tarball you use? And can you send me the output of "dumpe2fs -h > > /dev/hdXX", where hdXX is the device of the filesystem where this was > > failing? > > No, it happens with _every_ tarball from a certain size on. > Ext4 is also affected due to d015641734cde55d2fce48a6db3983c8a029fe05 . Sorry for not getting back to you right away; I've been in the Bay Area the past few days, and so I haven't as much ext4 hacking time this week. I'm still not able to reproduce this, on a 2.6.27-git7 based kernel, using either ext3 or ext4. (My attempt to reproduce using ext3 shown below; the debugfs -w commands were to replicate using a filesystem with your hash_seed and hash algorithm as you sent in your dumpe2fs output. Probably not important if you say that it happens on any kernel tarball; since I'm testing while on an airplane, I had to generate a tar.gz file from the git repository of 2.6.27, as shown below.) So you're reproducing this on an unmodified kernel from Linus's tip post 2.6.27, correct? The funny thing is the git commit you've identified was merged sometime just before and 2.6.27-rc5, and I pretty sure it's in the Fedora 10 beta kernel, and so I'm a bit surprised we haven't seen more reports of this problem than just yours. The question then is what is going on which means that you see this problem rather easily, but for whatever reason it's not reproducing easily for me. Can you send me your .config? Is there anything unusual about your setup you can think of? What distribution are you running? You've done an explicit revert of just that commit and it afterwards you were unable to reproduce the problem, is that correct? Can you reproduce it while running the "rm -rf" command under strace, (i.e., "strace -o /tmp/strace.log rm -rf linux-2.6.27.2") and send me the strace.log file? Thanks, regards, - Ted Script started on Thu 23 Oct 2008 07:39:57 PM EDT Top-level shell (parent script) Using forwarded ssh authentication socket <tytso.root@closure> {/usr/projects/linux/linux-2.6} [master] 497# git archive v2.6.27 | gzip > /tmp/v2.6.27.tar.gz <tytso.root@closure> {/usr/projects/linux/linux-2.6} [master] 498# mke2fs -t ext3 /dev/thunk/testext4 mke2fs 1.41.3 (12-Oct-2008) Filesystem label= OS type: Linux Block size=4096 (log=2) Fragment size=4096 (log=2) 65536 inodes, 262144 blocks 13107 blocks (5.00%) reserved for the super user First data block=0 Maximum filesystem blocks=268435456 8 block groups 32768 blocks per group, 32768 fragments per group 8192 inodes per group Superblock backups stored on blocks: 32768, 98304, 163840, 229376 Writing inode tables: done Creating journal (8192 blocks): done Writing superblocks and filesystem accounting information: done This filesystem will be automatically checked every 25 mounts or 180 days, whichever comes first. Use tune2fs -c or -i to override. <tytso.root@closure> {/usr/projects/linux/linux-2.6} [master] 499# debugfs -w /dev/thunk/testext4 debugfs 1.41.3 (12-Oct-2008) debugfs: ssv s_hash_seed c3188d40-b161-4458-b3f9-22ca7788110f debugfs: ssv s_def_hash_version tea debugfs: q <tytso.root@closure> {/usr/projects/linux/linux-2.6} [master] 500# mount /dev/thunk/testext4 /mnt <tytso.root@closure> {/usr/projects/linux/linux-2.6} [master] 501# mkdir /mnt/v2.6.27 <tytso.root@closure> {/usr/projects/linux/linux-2.6} [master] 502# tar -C /mnt/v2.6.27 -x -z -f /tmp/v2.6.27.tar.gz <tytso.root@closure> {/usr/projects/linux/linux-2.6} [master] 503# /bin/rm -rf /mnt/v2.6.27 <tytso.root@closure> {/usr/projects/linux/linux-2.6} [master] 504# umount /mnt <tytso.root@closure> {/usr/projects/linux/linux-2.6} [master] 505# uname -a Linux closure 2.6.27-05216-g0719d38 #86 SMP Fri Oct 17 11:53:34 EDT 2008 i686 GNU/Linux <tytso.root@closure> {/usr/projects/linux/linux-2.6} [master] 506# exit Script done on Thu 23 Oct 2008 07:42:30 PM EDT ^ permalink raw reply [flat|nested] 13+ messages in thread
[parent not found: <20081024042851.GA2360@gentoox2.trippelsdorf.de>]
* Re: ext3: fix ext3_dx_readdir hash collision handling - Regression [not found] ` <20081024042851.GA2360@gentoox2.trippelsdorf.de> @ 2008-10-24 10:41 ` Theodore Tso 2008-10-24 16:08 ` Linus Torvalds 1 sibling, 0 replies; 13+ messages in thread From: Theodore Tso @ 2008-10-24 10:41 UTC (permalink / raw) To: Markus Trippelsdorf; +Cc: linux-kernel, eugene, msnitzer, akpm, torvalds On Fri, Oct 24, 2008 at 06:28:51AM +0200, Markus Trippelsdorf wrote: > > Notice that I don't use "rm -rf" but "rm -r". The problem only occurs > when I run "rm -r" without the "-f" switch. > Still having trouble reproducing it. I'll try to look more at your .config. One difference i can see right off the bat is that I'm still using an 32-bit x86 system, not a 64-bit x86 system. That shouldn't make a difference, though.... - Ted Script started on Fri 24 Oct 2008 06:36:47 AM EDT Top-level shell (parent script) Using forwarded ssh authentication socket <tytso.root@closure> {/usr/projects/linux/ext4} [btree-debug] 498# mke2fs -t ext3 /dev/thunk/testext4 mke2fs 1.41.3 (12-Oct-2008) Filesystem label= OS type: Linux Block size=4096 (log=2) Fragment size=4096 (log=2) 65536 inodes, 262144 blocks 13107 blocks (5.00%) reserved for the super user First data block=0 Maximum filesystem blocks=268435456 8 block groups 32768 blocks per group, 32768 fragments per group 8192 inodes per group Superblock backups stored on blocks: 32768, 98304, 163840, 229376 Writing inode tables: done Creating journal (8192 blocks): done Writing superblocks and filesystem accounting information: done This filesystem will be automatically checked every 22 mounts or 180 days, whichever comes first. Use tune2fs -c or -i to override. <tytso.root@closure> {/usr/projects/linux/ext4} [btree-debug] 499# debugfs -w /dev/thunk/testext4 debugfs 1.41.3 (12-Oct-2008) debugfs: ssv s_hash_seed c3188d40-b161-4458-b3f9-22ca7788110f debugfs: ssv s_def_hash_version tea debugfs: q <tytso.root@closure> {/usr/projects/linux/ext4} [btree-debug] 500# mount /dev/thunk/testext4 /mnt <tytso.root@closure> {/usr/projects/linux/ext4} [btree-debug] 501# tar -C /mnt -xjf /usr/projects/lin\aux/linux-2.6.27.2.tar.bz2 <tytso.root@closure> {/usr/projects/linux/ext4} [btree-debug] 502# /bin/rm -r /mnt/linux-2.6.27.2/ <tytso.root@closure> {/usr/projects/linux/ext4} [btree-debug] 503# umount /mnt/ <tytso.root@closure> {/usr/projects/linux/ext4} [btree-debug] 504# uname -a Linux closure 2.6.27-00067-g5593f6c-dirty #13 SMP Thu Oct 23 22:22:03 EDT 2008 i686 GNU/Linux <tytso.root@closure> {/usr/projects/linux/ext4} [btree-debug] 505# exit Script done on Fri 24 Oct 2008 06:39:22 AM EDT ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: ext3: fix ext3_dx_readdir hash collision handling - Regression [not found] ` <20081024042851.GA2360@gentoox2.trippelsdorf.de> 2008-10-24 10:41 ` Theodore Tso @ 2008-10-24 16:08 ` Linus Torvalds 2008-10-24 20:44 ` Theodore Tso 2008-10-25 11:56 ` [PATCH] " Theodore Tso 1 sibling, 2 replies; 13+ messages in thread From: Linus Torvalds @ 2008-10-24 16:08 UTC (permalink / raw) To: Markus Trippelsdorf; +Cc: Theodore Tso, linux-kernel, eugene, msnitzer, akpm On Fri, 24 Oct 2008, Markus Trippelsdorf wrote: > > Notice that I don't use "rm -rf" but "rm -r". The problem only occurs > when I run "rm -r" without the "-f" switch. > > The first time I encountered this problem was on an ext4 test partition > a few days ago. When I tried to delete a big directory structure on that > partition with mc (GNU Midnight Commander) by hitting F8 and then selecting > "All" on the "Delete it recursively" popup, it just stopped deleting > files after a while. So I had to repeat this procedure several times to > get rid of the directory. Hmm. I may actually have hit the same problem. I attributed it to a git buglet, and left it for later, because the last few days were pretty crazy (closing the merge window is when people always send their last-minute stuff even if they shouldn't, but to make things worse I was also gone for a day-and-a-half). The "git buglet" looks liek this: - build a kernel - do "git clean -dqfx". This fails with warning: failed to remove 'include/config/' - do "git clean -dqfx" again. And now it works - apparently because the first invocation deleted enough of the big directory. Doing an 'strace' to see what is going on, I see: .. getdents(3, /* 132 entries */, 4096) = 3888 lstat("include/config/sgi", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0 open("include/config/sgi", O_RDONLY|O_NONBLOCK|O_DIRECTORY|0x80000) = 4 fstat(4, {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0 getdents(4, /* 3 entries */, 4096) = 80 lstat("include/config/sgi/partition.h", {st_mode=S_IFREG|0644, st_size=0, ...}) = 0 unlink("include/config/sgi/partition.h") = 0 getdents(4, /* 0 entries */, 4096) = 0 close(4) = 0 rmdir("include/config/sgi") = 0 lstat("include/config/sgi", 0x7fffdc4d2110) = -1 ENOENT (No such file or directory) close(3) = 0 write(2, "warning: failed to remove \'include/config/\'\n", 44) = 44 .. and notice how it does that lstat("include/config/sgi", .. _twice_. That, in turn (knowing the git implementation), implies that it was returned twice from that first "getdents(3, ...)" call. So what git clean does is to scan over the readdir() return values, see if it's a file it knows about, try to remove it if not, lstat() every entry, recurse into them if they are directories, and then remove it. If the lstat() fails, "git clean" will fail. So what happens is that it sees that "sgi" entry _twice_ in the readdir output, traverse into it once (and delete it), and the second time when it does an lstat() it will obviously fail (because now it's deleted!), and that will make "git clean" fail the whole thing due to an unexpected error. And I _think_ that what brings this on is: - caring about the error value This is why "rm -rf" works for you, but "rm -r" does not. With "-f", rm simply won't care. And like "rm -r", "git clean" cares about error values. - large enough directories that you have *multiple* "getdents()" calls. - likely: removing files in between getdents() calls. Hmm. Ted? I have not tried to revert that commit that Markus pinpointed (6a897cf447a83c9c3fd1b85a1e525c02d6eada7d: "ext3: fix ext3_dx_readdir hash collision handling"), but now that I look at that "git bug", I am getting pretty damn sure that it's exactly the same issue, and it's not a git bug at all. Markus basically has exactly the same thing: rm: cannot remove `linux-2.6.27.2/arch/alpha/include/asm/statfs.h': No such file or directory and that ENOENT is almost certainly because "rm" already removed that filename _once_, and it's the second time that fails. And yes, that commit is all about returning directory entries twice. It claims to fix it, but I bet it breaks it. Linus ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: ext3: fix ext3_dx_readdir hash collision handling - Regression 2008-10-24 16:08 ` Linus Torvalds @ 2008-10-24 20:44 ` Theodore Tso 2008-10-25 11:56 ` [PATCH] " Theodore Tso 1 sibling, 0 replies; 13+ messages in thread From: Theodore Tso @ 2008-10-24 20:44 UTC (permalink / raw) To: Linus Torvalds; +Cc: Markus Trippelsdorf, linux-kernel, eugene, msnitzer, akpm On Fri, Oct 24, 2008 at 09:08:26AM -0700, Linus Torvalds wrote: > > Hmm. Ted? I have not tried to revert that commit that Markus pinpointed > (6a897cf447a83c9c3fd1b85a1e525c02d6eada7d: "ext3: fix ext3_dx_readdir hash > collision handling"), but now that I look at that "git bug", I am getting > pretty damn sure that it's exactly the same issue, and it's not a git bug > at all. Yep, I can replicate it now. It appears to only show up if you are running with an x86_64 kernel. I normally run with an x86_32 kernel, so I didn't notice the problem. The commit in question avoids returning duplicate entries when there is a hash collision; unfortunately, it seems to return duplicate entries for any large directory if you are running on x86_64 (and possibly/probably other 64-bit platforms). I'm working on it, and should hopefully have a fix for you soon. If this is too annoying, you can revert it and I'll resubmit a fixed version once I get a fix. - Ted ^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH] Re: ext3: fix ext3_dx_readdir hash collision handling - Regression 2008-10-24 16:08 ` Linus Torvalds 2008-10-24 20:44 ` Theodore Tso @ 2008-10-25 11:56 ` Theodore Tso 2008-10-25 12:25 ` Markus Trippelsdorf 2008-10-25 21:15 ` Linus Torvalds 1 sibling, 2 replies; 13+ messages in thread From: Theodore Tso @ 2008-10-25 11:56 UTC (permalink / raw) To: Linus Torvalds; +Cc: Markus Trippelsdorf, linux-kernel, eugene, msnitzer, akpm I believe I've found the problem. I'm still not 100% sure why I'm not seeing this on a 32-bit kernel, so I want to do a bit more testing to make sure I completely understand what was happening. However, once I reproduced it on a a 64-bit kernel, it was pretty clear what the problem was. The problem only happens when the directory is changing out from under us (which would be the case when the directory is getting deleted via "/bin/rm -r" or "git clean"). When that happens, the following check causes us to reload the rbtree we use to return the directory entries in hash sort error: /* * Fill the rbtree if we have no more entries, * or the inode has changed since we last read in the * cached entries. */ if ((!info->curr_node) || (filp->f_version != inode->i_version)) { info->curr_node = NULL; free_rb_tree_fname(&info->root); filp->f_version = inode->i_version; ret = ext3_htree_fill_tree(filp, info->curr_hash, info->curr_minor_hash, &info->next_hash); Unfortunately, the commit which which fixes the duplicated entries when there are hash collisions wasn't properly setting info->curr_hash and info->curr_minor_hash after going to the next node, which results the first entry being returned twice on the subsequent getdents() system call. What I don't understand was why I wasn't seeing this on a 32-bit kernel (maybe I screwed up my test environment; this problem should have occurred regardless of the 32-bit or 64-bit environment). In any case, if folks could test to see whether this patch fixes things, I'd appreciate it. (Patch for ext3 and ext4 follows). - Ted diff --git a/fs/ext3/dir.c b/fs/ext3/dir.c index 4c82531..5853f44 100644 --- a/fs/ext3/dir.c +++ b/fs/ext3/dir.c @@ -456,17 +456,8 @@ static int ext3_dx_readdir(struct file * filp, if (info->extra_fname) { if (call_filldir(filp, dirent, filldir, info->extra_fname)) goto finished; - info->extra_fname = NULL; - info->curr_node = rb_next(info->curr_node); - if (!info->curr_node) { - if (info->next_hash == ~0) { - filp->f_pos = EXT3_HTREE_EOF; - goto finished; - } - info->curr_hash = info->next_hash; - info->curr_minor_hash = 0; - } + goto next_node; } else if (!info->curr_node) info->curr_node = rb_first(&info->root); @@ -498,9 +489,14 @@ static int ext3_dx_readdir(struct file * filp, info->curr_minor_hash = fname->minor_hash; if (call_filldir(filp, dirent, filldir, fname)) break; - + next_node: info->curr_node = rb_next(info->curr_node); - if (!info->curr_node) { + if (info->curr_node) { + fname = rb_entry(info->curr_node, struct fname, + rb_hash); + info->curr_hash = fname->hash; + info->curr_minor_hash = fname->minor_hash; + } else { if (info->next_hash == ~0) { filp->f_pos = EXT3_HTREE_EOF; break; diff --git a/fs/ext4/dir.c b/fs/ext4/dir.c index 3ca6a2b..fed5b61 100644 --- a/fs/ext4/dir.c +++ b/fs/ext4/dir.c @@ -459,17 +459,8 @@ static int ext4_dx_readdir(struct file *filp, if (info->extra_fname) { if (call_filldir(filp, dirent, filldir, info->extra_fname)) goto finished; - info->extra_fname = NULL; - info->curr_node = rb_next(info->curr_node); - if (!info->curr_node) { - if (info->next_hash == ~0) { - filp->f_pos = EXT4_HTREE_EOF; - goto finished; - } - info->curr_hash = info->next_hash; - info->curr_minor_hash = 0; - } + goto next_node; } else if (!info->curr_node) info->curr_node = rb_first(&info->root); @@ -501,9 +492,14 @@ static int ext4_dx_readdir(struct file *filp, info->curr_minor_hash = fname->minor_hash; if (call_filldir(filp, dirent, filldir, fname)) break; - + next_node: info->curr_node = rb_next(info->curr_node); - if (!info->curr_node) { + if (info->curr_node) { + fname = rb_entry(info->curr_node, struct fname, + rb_hash); + info->curr_hash = fname->hash; + info->curr_minor_hash = fname->minor_hash; + } else { if (info->next_hash == ~0) { filp->f_pos = EXT4_HTREE_EOF; break; ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH] Re: ext3: fix ext3_dx_readdir hash collision handling - Regression 2008-10-25 11:56 ` [PATCH] " Theodore Tso @ 2008-10-25 12:25 ` Markus Trippelsdorf 2008-10-25 21:15 ` Linus Torvalds 1 sibling, 0 replies; 13+ messages in thread From: Markus Trippelsdorf @ 2008-10-25 12:25 UTC (permalink / raw) To: Theodore Tso; +Cc: Linus Torvalds, linux-kernel, eugene, msnitzer, akpm On Sat, Oct 25, 2008 at 07:56:31AM -0400, Theodore Tso wrote: > > Unfortunately, the commit which which fixes the duplicated entries > when there are hash collisions wasn't properly setting info->curr_hash > and info->curr_minor_hash after going to the next node, which results > the first entry being returned twice on the subsequent getdents() > system call. What I don't understand was why I wasn't seeing this on > a 32-bit kernel (maybe I screwed up my test environment; this problem > should have occurred regardless of the 32-bit or 64-bit environment). > > In any case, if folks could test to see whether this patch fixes > things, I'd appreciate it. (Patch for ext3 and ext4 follows). > > - Ted Yep, your patch fixes the issue. Thank you Ted. -- Markus ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] Re: ext3: fix ext3_dx_readdir hash collision handling - Regression 2008-10-25 11:56 ` [PATCH] " Theodore Tso 2008-10-25 12:25 ` Markus Trippelsdorf @ 2008-10-25 21:15 ` Linus Torvalds 2008-10-26 2:39 ` [GIT PULL] " Theodore Tso 1 sibling, 1 reply; 13+ messages in thread From: Linus Torvalds @ 2008-10-25 21:15 UTC (permalink / raw) To: Theodore Tso; +Cc: Markus Trippelsdorf, linux-kernel, eugene, msnitzer, akpm On Sat, 25 Oct 2008, Theodore Tso wrote: > > I believe I've found the problem. Your patch seems to fix it for me, so ack on that. Please do send a version with a sign-off when you're ok with it - otherwise I'll have to revert the old one.. Linus ^ permalink raw reply [flat|nested] 13+ messages in thread
* [GIT PULL] Re: [PATCH] Re: ext3: fix ext3_dx_readdir hash collision handling - Regression 2008-10-25 21:15 ` Linus Torvalds @ 2008-10-26 2:39 ` Theodore Tso 2008-10-26 2:42 ` [PATCH 1/2] ext3: Fix duplicate entries returned from getdents() system call Theodore Ts'o 0 siblings, 1 reply; 13+ messages in thread From: Theodore Tso @ 2008-10-26 2:39 UTC (permalink / raw) To: Linus Torvalds; +Cc: Markus Trippelsdorf, linux-kernel, eugene, msnitzer, akpm On Sat, Oct 25, 2008 at 02:15:41PM -0700, Linus Torvalds wrote: > Your patch seems to fix it for me, so ack on that. Please do send a > version with a sign-off when you're ok with it - otherwise I'll have to > revert the old one.. OK, please pull from: git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4.git for_linus I did some more testing, and I think I screwed up my tests on the 32-bit kernel (or I was testing "rm -rf", not "rm -r"). These patches are identical to what I sent before, but with patch comments and a sign-off. - Ted Theodore Ts'o (2): ext3: Fix duplicate entries returned from getdents() system call ext4: Fix duplicate entries returned from getdents() system call fs/ext3/dir.c | 20 ++++++++------------ fs/ext4/dir.c | 20 ++++++++------------ 2 files changed, 16 insertions(+), 24 deletions(-) ^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH 1/2] ext3: Fix duplicate entries returned from getdents() system call 2008-10-26 2:39 ` [GIT PULL] " Theodore Tso @ 2008-10-26 2:42 ` Theodore Ts'o 2008-10-26 2:42 ` [PATCH 2/2] ext4: " Theodore Ts'o 0 siblings, 1 reply; 13+ messages in thread From: Theodore Ts'o @ 2008-10-26 2:42 UTC (permalink / raw) To: Linus Torvalds Cc: Markus Trippelsdorf, linux-kernel, eugene, msnitzer, akpm, Theodore Ts'o Fix a regression caused by commit 6a897cf4, "ext3: fix ext3_dx_readdir hash collision handling", where deleting files in a large directory (requiring more than one getdents system call), results in some filenames being returned twice. This was caused by a failure to update info->curr_hash and info->curr_minor_hash, so that if the directory had gotten modified since the last getdents() system call (as would be the case if the user is running "rm -r" or "git clean"), a directory entry would get returned twice to the userspace. This patch fixes the bug reported by Markus Trippelsdorf at: http://bugzilla.kernel.org/show_bug.cgi?id=11844 Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Tested-by: Markus Trippelsdorf <markus@trippelsdorf.de> --- fs/ext3/dir.c | 20 ++++++++------------ 1 files changed, 8 insertions(+), 12 deletions(-) diff --git a/fs/ext3/dir.c b/fs/ext3/dir.c index 4c82531..5853f44 100644 --- a/fs/ext3/dir.c +++ b/fs/ext3/dir.c @@ -456,17 +456,8 @@ static int ext3_dx_readdir(struct file * filp, if (info->extra_fname) { if (call_filldir(filp, dirent, filldir, info->extra_fname)) goto finished; - info->extra_fname = NULL; - info->curr_node = rb_next(info->curr_node); - if (!info->curr_node) { - if (info->next_hash == ~0) { - filp->f_pos = EXT3_HTREE_EOF; - goto finished; - } - info->curr_hash = info->next_hash; - info->curr_minor_hash = 0; - } + goto next_node; } else if (!info->curr_node) info->curr_node = rb_first(&info->root); @@ -498,9 +489,14 @@ static int ext3_dx_readdir(struct file * filp, info->curr_minor_hash = fname->minor_hash; if (call_filldir(filp, dirent, filldir, fname)) break; - + next_node: info->curr_node = rb_next(info->curr_node); - if (!info->curr_node) { + if (info->curr_node) { + fname = rb_entry(info->curr_node, struct fname, + rb_hash); + info->curr_hash = fname->hash; + info->curr_minor_hash = fname->minor_hash; + } else { if (info->next_hash == ~0) { filp->f_pos = EXT3_HTREE_EOF; break; -- 1.5.6.1.205.ge2c7.dirty ^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH 2/2] ext4: Fix duplicate entries returned from getdents() system call 2008-10-26 2:42 ` [PATCH 1/2] ext3: Fix duplicate entries returned from getdents() system call Theodore Ts'o @ 2008-10-26 2:42 ` Theodore Ts'o 0 siblings, 0 replies; 13+ messages in thread From: Theodore Ts'o @ 2008-10-26 2:42 UTC (permalink / raw) To: Linus Torvalds Cc: Markus Trippelsdorf, linux-kernel, eugene, msnitzer, akpm, Theodore Ts'o Fix a regression caused by commit d0156417, "ext4: fix ext4_dx_readdir hash collision handling", where deleting files in a large directory (requiring more than one getdents system call), results in some filenames being returned twice. This was caused by a failure to update info->curr_hash and info->curr_minor_hash, so that if the directory had gotten modified since the last getdents() system call (as would be the case if the user is running "rm -r" or "git clean"), a directory entry would get returned twice to the userspace. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> This patch fixes the bug reported by Markus Trippelsdorf at: http://bugzilla.kernel.org/show_bug.cgi?id=11844 Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Tested-by: Markus Trippelsdorf <markus@trippelsdorf.de> --- fs/ext4/dir.c | 20 ++++++++------------ 1 files changed, 8 insertions(+), 12 deletions(-) diff --git a/fs/ext4/dir.c b/fs/ext4/dir.c index 3ca6a2b..fed5b61 100644 --- a/fs/ext4/dir.c +++ b/fs/ext4/dir.c @@ -459,17 +459,8 @@ static int ext4_dx_readdir(struct file *filp, if (info->extra_fname) { if (call_filldir(filp, dirent, filldir, info->extra_fname)) goto finished; - info->extra_fname = NULL; - info->curr_node = rb_next(info->curr_node); - if (!info->curr_node) { - if (info->next_hash == ~0) { - filp->f_pos = EXT4_HTREE_EOF; - goto finished; - } - info->curr_hash = info->next_hash; - info->curr_minor_hash = 0; - } + goto next_node; } else if (!info->curr_node) info->curr_node = rb_first(&info->root); @@ -501,9 +492,14 @@ static int ext4_dx_readdir(struct file *filp, info->curr_minor_hash = fname->minor_hash; if (call_filldir(filp, dirent, filldir, fname)) break; - + next_node: info->curr_node = rb_next(info->curr_node); - if (!info->curr_node) { + if (info->curr_node) { + fname = rb_entry(info->curr_node, struct fname, + rb_hash); + info->curr_hash = fname->hash; + info->curr_minor_hash = fname->minor_hash; + } else { if (info->next_hash == ~0) { filp->f_pos = EXT4_HTREE_EOF; break; -- 1.5.6.1.205.ge2c7.dirty ^ permalink raw reply related [flat|nested] 13+ messages in thread
end of thread, other threads:[~2008-10-26 2:42 UTC | newest] Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2008-10-22 9:32 ext3: fix ext3_dx_readdir hash collision handling - Regression Markus Trippelsdorf 2008-10-23 3:28 ` Theodore Tso 2008-10-23 6:37 ` Markus Trippelsdorf 2008-10-24 0:01 ` Theodore Tso [not found] ` <20081024042851.GA2360@gentoox2.trippelsdorf.de> 2008-10-24 10:41 ` Theodore Tso 2008-10-24 16:08 ` Linus Torvalds 2008-10-24 20:44 ` Theodore Tso 2008-10-25 11:56 ` [PATCH] " Theodore Tso 2008-10-25 12:25 ` Markus Trippelsdorf 2008-10-25 21:15 ` Linus Torvalds 2008-10-26 2:39 ` [GIT PULL] " Theodore Tso 2008-10-26 2:42 ` [PATCH 1/2] ext3: Fix duplicate entries returned from getdents() system call Theodore Ts'o 2008-10-26 2:42 ` [PATCH 2/2] ext4: " Theodore Ts'o
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).