ext4 bug: getdents uninterruptible for 117 seconds

From: Benjamin LaHaise @ 2016-03-02 17:15 UTC
  To: linux-ext4

Hi folks,

While working on a bug involving write starvation, the test I was running
managed to trigger some pretty horrific worst-case behaviour in ext4.  The
filesystem I'm working on is about 4TB in size, and is used for storing a
number of spool files across 100 subdirectories.  One of these
subdirectories ended up growing to ~497MB in size (the size of the
directory itself, as reported by st_size).  Once all of the files were
removed from these directories, the filesystem was unmounted.  On
subsequent mounts of the filesystem, it became apparent that whenever
that directory was accessed using ls or find, the kernel would block in
getdents() for more than 117 seconds.  It is clear that ext4 is slowly
reading the entire contents of the directory into memory during this time
at a rate of ~4MB/s, even though getdents() finds only 2 entries in it.
This filesystem is stored on an external 8Gbps FC SAN made up of about
8 x 10Krpm spindles.
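
For anyone who wants to check the ~4MB/s figure, here is a rough sketch of
the arithmetic in Python.  The two constants are copied from the strace
output in this message; the variable names are just mine, and the
calculation assumes the whole directory was read during the one slow
getdents() call.

```python
# Values copied from the strace output in this message.
DIR_SIZE_BYTES = 520896512      # st_size reported by fstat() on the directory
GETDENTS_SECONDS = 117.122463   # duration of the first getdents() call

# Assumption: the entire directory was read during that single call.
size_mib = DIR_SIZE_BYTES / 2**20
rate_mb_s = DIR_SIZE_BYTES / GETDENTS_SECONDS / 1e6
rate_mib_s = size_mib / GETDENTS_SECONDS

print(f"directory size: {size_mib:.1f} MiB")
print(f"implied read rate: {rate_mb_s:.2f} MB/s ({rate_mib_s:.2f} MiB/s)")
```

That works out to a bit over 4MB/s, which matches the rate quoted above.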

I've placed a copy of the e2image for the filesystem at 
http://www.kvack.org/~bcrl/ext4/ext4-readdir.img.xz .  The problematic 
directory is broken/1.  The relevant snippet of strace output is below.  
Thoughts?

		-ben

write(1, "/mnt/broken/"..., 34) = 34 <0.000039>
openat(5, "1", O_RDONLY|O_NOCTTY|O_NONBLOCK|O_DIRECTORY|O_NOFOLLOW) = 6 <0.000007>
fcntl(6, F_GETFD)                       = 0 <0.000004>
fcntl(6, F_SETFD, FD_CLOEXEC)           = 0 <0.000004>
fstat(6, {st_mode=S_IFDIR|0755, st_size=520896512, ...}) = 0 <0.000004>
fcntl(6, F_GETFL)                       = 0x38800 (flags O_RDONLY|O_NONBLOCK|O_LARGEFILE|O_DIRECTORY|O_NOFOLLOW) <0.000004>
fcntl(6, F_SETFD, FD_CLOEXEC)           = 0 <0.000004>
newfstatat(5, "1", {st_mode=S_IFDIR|0755, st_size=520896512, ...}, AT_SYMLINK_NOFOLLOW) = 0 <0.000006>
fcntl(6, F_DUPFD, 3)                    = 7 <0.000005>
fcntl(7, F_GETFD)                       = 0 <0.000005>
fcntl(7, F_SETFD, FD_CLOEXEC)           = 0 <0.000004>
getdents(6, /* 2 entries */, 32768)     = 48 <117.122463>
getdents(6, /* 0 entries */, 32768)     = 0 <0.000005>
close(6)                                = 0 <0.000004>
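
If it helps anyone reproduce the measurement without strace, something
along these lines should do.  This is only a sketch: the function name is
made up for illustration, and it times a full pass with os.scandir()
rather than a single getdents() call, so the number is not directly
comparable to the per-syscall timings above.

```python
import os
import sys
import time


def time_directory_scan(path):
    """Return (entry_count, seconds) for one full pass over `path`.

    os.scandir() does not report "." and "..", so a directory that
    getdents() shows with 2 entries yields a count of 0 here.
    """
    start = time.monotonic()
    with os.scandir(path) as entries:
        count = sum(1 for _ in entries)
    return count, time.monotonic() - start


if __name__ == "__main__":
    # Directory to scan; defaults to the current directory.
    target = sys.argv[1] if len(sys.argv) > 1 else "."
    count, seconds = time_directory_scan(target)
    print(f"{target}: {count} entries in {seconds:.3f}s")
```

Run it with the directory as the only argument, e.g. /mnt/broken/1 on the
image above.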
-- 
"Thought is the essence of where you are now."
