linux-ext4.vger.kernel.org archive mirror
* Performance issue with recently_deleted() /no_journal with huge directories
From: James Scriven (jamscriv) @ 2020-08-27 12:09 UTC (permalink / raw)
  To: linux-ext4

Hi, I'm working on migrating a workload from kernel 2.6 to 4.18 (RHEL6 to RHEL8).

The use case is a build farm that has a basic workflow of:

1) rm -rf a large directory tree (about 2M files ~ 200GB) to free some space
2) download and extract a large tarball (about 2M files ~ 200GB)
3) perform a build in the extracted directory tree
Repeat...

We've been using an ext4 filesystem with no journal for maximum performance, with great success. We're not very concerned about losing data, but do want some persistence, which is why we don't just use tmpfs for this. We keep a number of these large workspaces around as long as space permits, and delete the oldest (step 1) just before starting a new one (step 2).

When migrating to this newer kernel, we are seeing performance degradation when we expand the tar, which I suspect is caused by inode allocation trying to find an unused inode that has not been used too recently. Since we have 2M inodes that *have* been recently deleted, every one of the 2M new inodes has to search through all 2M of the deleted ones (or something approximating that; my understanding of the ext4 code is limited).
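
To make the suspected cost concrete, here is a toy model of it. This is not the ext4 code: the slot states and the `count_rejected` helper are invented for illustration, and it assumes every allocation rescans the table from the start and has to step over each free slot that was deleted too recently.

```shell
#!/usr/bin/env bash
# Toy model only, not the ext4 allocator.
# Slot states: 0 = free, 1 = in use, 2 = free but "recently deleted".

# count_rejected NSLOTS NRECENT NALLOC
# Allocates NALLOC slots from a table of NSLOTS slots whose first NRECENT
# slots are recently deleted, and prints how many recently deleted slots
# were examined and skipped in total.
count_rejected() {
    local nslots=$1 nrecent=$2 nalloc=$3
    local -a slot
    local i n rejected=0
    for ((i = 0; i < nslots; i++)); do
        slot[i]=$(( i < nrecent ? 2 : 0 ))
    done
    for ((n = 0; n < nalloc; n++)); do
        # Assumption: each allocation restarts its scan at slot 0.
        for ((i = 0; i < nslots; i++)); do
            if (( slot[i] == 2 )); then
                rejected=$((rejected + 1))   # skipped: deleted too recently
            elif (( slot[i] == 0 )); then
                slot[i]=1                    # take the first acceptable slot
                break
            fi
        done
    done
    echo "$rejected"
}

count_rejected 1000 0 100     # prints 0
count_rejected 1000 500 100   # prints 50000
```

In this model the skipped work is the number of recently deleted slots times the number of allocations, which is the quadratic behaviour I suspect, though the real allocator may well differ in the details.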

The simple test case below shows the issue. Is this edge case already understood? Is there a good way to regain the lost performance? Adding a "sync + drop_caches", or a sufficiently long sleep, between steps 1 and 2 works around the issue, but is not ideal.
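
For reference, the "sync + drop_caches" workaround placed between steps 1 and 2 looks roughly like the sketch below. The paths, the `run` wrapper and the `recycle_workspace` function are hypothetical placeholders, and the script only prints the commands unless DRY_RUN=0 is set.

```shell
#!/usr/bin/env bash
# Sketch of the workaround: flush and drop caches between delete and extract.
# All paths below are made-up examples, not our real layout.

# Print the command in dry-run mode (the default), otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# recycle_workspace OLD_WS NEW_WS TARBALL
recycle_workspace() {
    local old_ws=$1 new_ws=$2 tarball=$3
    run rm -rf -- "$old_ws"                              # step 1: free space
    run sync                                             # write back dirty data
    run sudo sh -c 'echo 3 > /proc/sys/vm/drop_caches'   # drop caches
    run mkdir -p -- "$new_ws"
    run tar -xf "$tarball" -C "$new_ws"                  # step 2: extract
}

DRY_RUN=1 recycle_workspace /build/ws-old /build/ws-new /build/src.tar
```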



# With each iteration of the test case, the number of recently deleted inodes increases and performance degrades.

$ uname -a
Linux sjc-asr-bm-470 4.18.0-147.3.1.el8_1.x86_64 #1 SMP Wed Nov 27 01:11:44 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
$ sync; echo 3 | sudo tee /proc/sys/vm/drop_caches; for x in {1..10}; do rm -rf dirtree; mkdir dirtree; time mkdir dirtree/{1..50000}; done
3

real    0m1.796s
user    0m0.041s
sys     0m1.528s

real    0m3.280s
user    0m0.035s
sys     0m3.235s

real    0m4.329s
user    0m0.035s
sys     0m4.279s

real    0m6.033s
user    0m0.032s
sys     0m5.988s

real    0m7.303s
user    0m0.041s
sys     0m7.246s

real    0m7.874s
user    0m0.032s
sys     0m7.826s

real    0m9.376s
user    0m0.036s
sys     0m9.323s

real    0m9.979s
user    0m0.052s
sys     0m9.910s

real    0m9.808s
user    0m0.037s
sys     0m9.749s

real    0m9.067s
user    0m0.038s
sys     0m9.011s




$ uname -a
Linux sjc-asr-bm-100 2.6.32-754.17.1.el6.x86_64 #1 SMP Thu Jun 20 11:47:12 EDT 2019 x86_64 x86_64 x86_64 GNU/Linux
$ sync; echo 3 | sudo tee /proc/sys/vm/drop_caches; for x in {1..10}; do rm -rf dirtree; mkdir dirtree; time mkdir dirtree/{1..50000}; done
3

real    0m0.724s
user    0m0.031s
sys     0m0.693s

real    0m0.762s
user    0m0.041s
sys     0m0.721s

real    0m0.717s
user    0m0.043s
sys     0m0.674s

real    0m0.712s
user    0m0.037s
sys     0m0.675s

real    0m0.749s
user    0m0.036s
sys     0m0.712s

real    0m0.710s
user    0m0.040s
sys     0m0.670s

real    0m0.746s
user    0m0.038s
sys     0m0.707s

real    0m0.715s
user    0m0.034s
sys     0m0.680s

real    0m0.747s
user    0m0.040s
sys     0m0.707s

real    0m0.732s
user    0m0.042s
sys     0m0.690s





* Re: Performance issue with recently_deleted() /no_journal with huge directories
From: Jan Kara @ 2020-08-28  9:13 UTC (permalink / raw)
  To: James Scriven (jamscriv); +Cc: linux-ext4

Hi!

On Thu 27-08-20 12:09:21, James Scriven (jamscriv) wrote:
> Hi, I'm working on migrating a workload from kernel 2.6 to 4.18 (RHEL6 to
> RHEL8).
> 
> The use case is a build farm that has a basic workflow of:
> 
> 1) rm -rf a large directory tree (about 2M files ~ 200GB) to free some space
> 2) download and extract a large tarball (about 2M files ~ 200GB)
> 3) perform a build in the extracted directory tree
> Repeat...
> 
> We've been using an ext4 filesystem with no journal for maximum
> performance with great success. We're not very concerned about losing
> data, but do want some persistence, which is why we don't just use tmpfs
> for this. We'll keep a number of these large workspaces around as long as
> space permits, and delete the oldest (step 1) just before starting a new
> one (step 2). 
> 
> When migrating to this newer kernel, we are seeing performance
> degradation when we expand the tar, which I suspect is caused by inode
> allocation trying to find an unused inode that has not been used too
> recently. Since we have 2M deleted inodes that *have* been recently
> deleted, every one of the new 2M inodes has to search through all 2M of
> the deleted ones (or something approximating that; my understanding of
> the ext4 code is limited).
> 
> The simple testcase below shows the issue. My question is, is this edge
> case already understood? Is there a good way to regain this lost
> performance? Adding a "sync + drop_caches", or a sufficiently long sleep,
> between steps 1 and 2 will work around the issue, but is not ideal.

So from your tests it isn't obvious to me that the recently_deleted()
logic is the culprit. But it is quite possible. We have somewhat changed
the logic in commit d05466b27b19 "ext4: avoid ENOSPC when avoiding to reuse
recently deleted inodes", so now we just reuse a recently deleted inode if
we cannot find anything better in the current group. That should
significantly reduce the cost of searching for a free inode in your use
case, so you might give that change a try...
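
As a toy illustration of that fallback, here is a sketch based only on the description above, not on the kernel code. The slot states and the `pick_slot` name are made up for the example, and which recently deleted inode the kernel actually falls back to is not something it tries to match.

```shell
#!/usr/bin/env bash
# Toy model of "prefer a clean free slot, else reuse a recently deleted one".
# Slot states: 0 = free, 1 = in use, 2 = free but recently deleted.

# pick_slot STATE...
# Prints the index of the chosen slot within one group, or -1 if the group
# has no free slot at all.
pick_slot() {
    local -a slot=("$@")
    local i fallback=-1
    for ((i = 0; i < ${#slot[@]}; i++)); do
        if (( slot[i] == 0 )); then
            echo "$i"            # a slot that was not recently deleted wins
            return 0
        elif (( slot[i] == 2 && fallback < 0 )); then
            fallback=$i          # remember a recently deleted slot
        fi
    done
    echo "$fallback"             # nothing better found in this group
}

pick_slot 1 2 2 0 1   # prints 3
pick_slot 1 2 2 1     # prints 1
pick_slot 1 1         # prints -1
```

The point is that a group full of recently deleted inodes still yields an inode after a single pass, instead of sending the search on to other groups.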

								Honza

-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR
