[PATCH v2 0/1] scale loop device lock

From: Pavel Tatashin @ 2020-07-23 21:17 UTC
  To: pasha.tatashin, tyhicks, axboe, linux-block, linux-kernel

Changelog
v2: Addressed comments from Tyler Hicks
	- added mutex_destroy()
	- comment in lo_open()
	- added lock around lo_disk in 

===

In our environment we use systemd portable containers in squashfs
format: we attach each image to a loop device and mount it.

NAME                      MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
loop5                       7:5    0  76.4M  0 loop
`-BaseImageM1908          252:3    0  76.4M  1 crypt /BaseImageM1908
loop6                       7:6    0    20K  0 loop
`-test_launchperf20       252:17   0   1.3M  1 crypt /app/test_launchperf20
loop7                       7:7    0    20K  0 loop
`-test_launchperf18       252:4    0   1.5M  1 crypt /app/test_launchperf18
loop8                       7:8    0     8K  0 loop
`-test_launchperf8        252:25   0    28K  1 crypt app/test_launchperf8
loop9                       7:9    0   376K  0 loop
`-test_launchperf14       252:29   0  45.7M  1 crypt /app/test_launchperf14
loop10                      7:10   0    16K  0 loop
`-test_launchperf4        252:11   0   968K  1 crypt app/test_launchperf4
loop11                      7:11   0   1.2M  0 loop
`-test_launchperf17       252:26   0 150.4M  1 crypt /app/test_launchperf17
loop12                      7:12   0    36K  0 loop
`-test_launchperf19       252:13   0   3.3M  1 crypt /app/test_launchperf19
loop13                      7:13   0     8K  0 loop
...

We have over 50 loop devices that are mounted during boot.

We observed contention on loop_ctl_mutex.

Sample contention stacks:

Contention 1:
__blkdev_get()
   bdev->bd_disk->fops->open()
      lo_open()
         mutex_lock_killable(&loop_ctl_mutex); <- contention

Contention 2:
__blkdev_put()
   disk->fops->release()
      lo_release()
         mutex_lock(&loop_ctl_mutex); <- contention

The total time spent waiting for loop_ctl_mutex during boot is ~18.8s
(summed across 8 CPUs) on our machine (69 loop devices), i.e. 2.35s per CPU.

Scaling this lock eliminates this contention entirely, and improves the boot
performance by 2s on our machine.
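
For readers who want the idea in miniature, here is a userspace sketch of
the per-device locking pattern. It is not the patch itself: pthread mutexes
stand in for kernel mutexes, and every name in it (demo_loop_device,
demo_open, demo_release, demo_run, NR_DEVICES, NR_ITERS) is hypothetical.
Each device carries its own lock, so open/release calls on different
devices never wait on each other, while calls on the same device still
serialize.

```c
/* Userspace analogy only; all names here are hypothetical. */
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define NR_DEVICES 4        /* assumed device count for the demo */
#define NR_THREADS (2 * NR_DEVICES)
#define NR_ITERS   100000   /* open/release pairs per thread */

/* Hypothetical stand-in for a loop device with its own lock. */
struct demo_loop_device {
	pthread_mutex_t lock;   /* per-device lock, protects refcnt */
	int refcnt;             /* current number of openers */
};

static struct demo_loop_device demo_devs[NR_DEVICES];

/* Analogous to an open path: only this device's lock is taken. */
static void demo_open(struct demo_loop_device *d)
{
	pthread_mutex_lock(&d->lock);
	d->refcnt++;
	pthread_mutex_unlock(&d->lock);
}

/* Analogous to a release path: again only this device's lock. */
static void demo_release(struct demo_loop_device *d)
{
	pthread_mutex_lock(&d->lock);
	d->refcnt--;
	pthread_mutex_unlock(&d->lock);
}

static void *demo_worker(void *arg)
{
	struct demo_loop_device *d = arg;

	for (int i = 0; i < NR_ITERS; i++) {
		demo_open(d);
		demo_release(d);
	}
	return NULL;
}

/*
 * Run two threads per device and return the sum of all reference
 * counts afterwards. Every open is paired with a release, so the
 * sum is 0 when the per-device locks protect refcnt correctly.
 */
static int demo_run(void)
{
	pthread_t threads[NR_THREADS];
	int total = 0;
	int err;

	for (size_t i = 0; i < NR_DEVICES; i++) {
		err = pthread_mutex_init(&demo_devs[i].lock, NULL);
		assert(err == 0);
		demo_devs[i].refcnt = 0;
	}

	for (size_t t = 0; t < NR_THREADS; t++) {
		err = pthread_create(&threads[t], NULL, demo_worker,
				     &demo_devs[t % NR_DEVICES]);
		assert(err == 0);
	}
	for (size_t t = 0; t < NR_THREADS; t++)
		pthread_join(threads[t], NULL);

	for (size_t i = 0; i < NR_DEVICES; i++) {
		total += demo_devs[i].refcnt;
		/* Destroy each lock once it is no longer needed. */
		pthread_mutex_destroy(&demo_devs[i].lock);
	}
	(void)err;   /* silences unused warnings when built with NDEBUG */
	return total;
}
```

To try it, call demo_run() from a main() and build with -pthread. Replacing
the per-device locks with one shared mutex would give the same result but
make all eight threads queue on a single lock, which is the kind of
contention shown in the stacks above.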

Pavel Tatashin (1):
  loop: scale loop device by introducing per device lock

 drivers/block/loop.c | 99 ++++++++++++++++++++++++++------------------
 drivers/block/loop.h |  1 +
 2 files changed, 59 insertions(+), 41 deletions(-)

-- 
2.25.1
