* [PATCH] zram: do not lookup algorithm in backends table
@ 2022-06-22  2:35 Sergey Senozhatsky
  2022-06-22 15:20 ` Sergey Senozhatsky
  0 siblings, 1 reply; 4+ messages in thread
From: Sergey Senozhatsky @ 2022-06-22  2:35 UTC (permalink / raw)
  To: Minchan Kim, Nitin Gupta
  Cc: Andrew Morton, linux-kernel, linux-block, Sergey Senozhatsky

Always use crypto_has_comp() so that crypto can look up the module,
call usermodehelper to load it, wait for usermodehelper to finish
and so on, all outside of the CPU hot-plug lock. Otherwise crypto
does all of these steps under the CPU hot-plug lock, which is too
much work to do under that lock. Besides, this can end up in a
deadlock when usermodehelper triggers a code path that attempts to
take the CPU hot-plug lock, which zram already holds.

Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org>
---
 drivers/block/zram/zcomp.c | 6 ------
 1 file changed, 6 deletions(-)

diff --git a/drivers/block/zram/zcomp.c b/drivers/block/zram/zcomp.c
index 052aa3f65514..398eb9e24eff 100644
--- a/drivers/block/zram/zcomp.c
+++ b/drivers/block/zram/zcomp.c
@@ -63,12 +63,6 @@ static int zcomp_strm_init(struct zcomp_strm *zstrm, struct zcomp *comp)
 
 bool zcomp_available_algorithm(const char *comp)
 {
-	int i;
-
-	i = sysfs_match_string(backends, comp);
-	if (i >= 0)
-		return true;
-
 	/*
 	 * Crypto does not ignore a trailing new line symbol,
 	 * so make sure you don't supply a string containing
-- 
2.37.0.rc0.104.g0611611a94-goog



* Re: [PATCH] zram: do not lookup algorithm in backends table
  2022-06-22  2:35 [PATCH] zram: do not lookup algorithm in backends table Sergey Senozhatsky
@ 2022-06-22 15:20 ` Sergey Senozhatsky
  2022-06-22 19:19   ` Andrew Morton
  0 siblings, 1 reply; 4+ messages in thread
From: Sergey Senozhatsky @ 2022-06-22 15:20 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Nitin Gupta, Andrew Morton, linux-kernel, linux-block,
	Sergey Senozhatsky

On (22/06/22 11:35), Sergey Senozhatsky wrote:
> Always use crypto_has_comp() so that crypto can look up the module,
> call usermodehelper to load it, wait for usermodehelper to finish
> and so on, all outside of the CPU hot-plug lock. Otherwise crypto
> does all of these steps under the CPU hot-plug lock, which is too
> much work to do under that lock. Besides, this can end up in a
> deadlock when usermodehelper triggers a code path that attempts to
> take the CPU hot-plug lock, which zram already holds.

And we think that we (not exactly "we", our partners) actually
see a deadlock. It goes something like this:

- path A. zram grabs CPU hot-plug lock, execs /sbin/modprobe from crypto
  and waits for modprobe to finish

disksize_store
 zcomp_create
  __cpuhp_state_add_instance
   __cpuhp_state_add_instance_cpuslocked
    zcomp_cpu_up_prepare
     crypto_alloc_base
      crypto_alg_mod_lookup
       call_usermodehelper_exec
        wait_for_completion_killable
         do_wait_for_common
          schedule

- path B. async work kthread that brings in scsi device. It wants to
  register CPUHP states at some point, and it needs the CPU hot-plug
  lock for that, which is owned by zram.

async_run_entry_fn
 scsi_probe_and_add_lun
  scsi_mq_alloc_queue
   blk_mq_init_queue
    blk_mq_init_allocated_queue
     blk_mq_realloc_hw_ctxs
      __cpuhp_state_add_instance
       __cpuhp_state_add_instance_cpuslocked
        mutex_lock
         schedule

- path C. modprobe sleeps, waiting for all async work to finish.

load_module
 do_init_module
  async_synchronize_full
   async_synchronize_cookie_domain
    schedule

And none can make any progress.
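
The three paths above form a circular wait: A waits for modprobe (C),
C waits for the async work (B), and B waits for the hot-plug lock held
by A. A minimal userspace sketch of that wait-for graph, in plain C and
not kernel code; every name in it (PATH_A, deadlock_waits, fixed_waits,
in_wait_cycle) is hypothetical and only illustrates the reasoning:

```c
#include <stdbool.h>

/* Hypothetical labels for the three stuck contexts from the traces. */
enum task { PATH_A, PATH_B, PATH_C, NR_TASKS };

#define NO_WAIT (-1)

/* waits[i] is the task that task i is blocked on, or NO_WAIT. */
static const int deadlock_waits[NR_TASKS] = {
	[PATH_A] = PATH_C,	/* zram holds the lock, waits for modprobe */
	[PATH_B] = PATH_A,	/* async scsi work waits for the lock */
	[PATH_C] = PATH_B,	/* modprobe waits for async work */
};

/* Same graph once modprobe no longer runs under the hot-plug lock. */
static const int fixed_waits[NR_TASKS] = {
	[PATH_A] = NO_WAIT,	/* module already loaded, nothing to wait for */
	[PATH_B] = PATH_A,
	[PATH_C] = PATH_B,
};

/* Follow the wait-for edges from 'start'; true if we come back to it. */
static bool in_wait_cycle(const int *waits, int n, int start)
{
	int cur = start;

	for (int steps = 0; steps < n; steps++) {
		cur = waits[cur];
		if (cur == NO_WAIT)
			return false;
		if (cur == start)
			return true;
	}
	return false;
}
```

Calling in_wait_cycle(deadlock_waits, NR_TASKS, PATH_A) reports a cycle,
while the same call on fixed_waits does not, because path A no longer
waits on anything while it holds the lock.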

So I think we need to move crypto_alg_mod_lookup()->call_usermodehelper_exec()
out of CPU hot-plug lock and pre-load modules in advance, before we grab the
hot-plug lock.


* Re: [PATCH] zram: do not lookup algorithm in backends table
  2022-06-22 15:20 ` Sergey Senozhatsky
@ 2022-06-22 19:19   ` Andrew Morton
  2022-06-23  0:21     ` Sergey Senozhatsky
  0 siblings, 1 reply; 4+ messages in thread
From: Andrew Morton @ 2022-06-22 19:19 UTC (permalink / raw)
  To: Sergey Senozhatsky; +Cc: Minchan Kim, Nitin Gupta, linux-kernel, linux-block

On Thu, 23 Jun 2022 00:20:05 +0900 Sergey Senozhatsky <senozhatsky@chromium.org> wrote:

> On (22/06/22 11:35), Sergey Senozhatsky wrote:
> > Always use crypto_has_comp() so that crypto can look up the module,
> > call usermodehelper to load it, wait for usermodehelper to finish
> > and so on, all outside of the CPU hot-plug lock. Otherwise crypto
> > does all of these steps under the CPU hot-plug lock, which is too
> > much work to do under that lock. Besides, this can end up in a
> > deadlock when usermodehelper triggers a code path that attempts to
> > take the CPU hot-plug lock, which zram already holds.
> 
> And we think that we (not exactly "we", our partners) actually
> see a deadlock. It goes something like this:
> 
> - path A. zram grabs CPU hot-plug lock, execs /sbin/modprobe from crypto
>   and waits for modprobe to finish

Nope, can't do that.

> disksize_store
>  zcomp_create
>   __cpuhp_state_add_instance
>    __cpuhp_state_add_instance_cpuslocked
>     zcomp_cpu_up_prepare
>      crypto_alloc_base
>       crypto_alg_mod_lookup
>        call_usermodehelper_exec
>         wait_for_completion_killable
>          do_wait_for_common
>           schedule

The usermode helper is free to do anything it wants, including
operations that take the CPU hotplug lock.  Or operations which might
in the future be changed to take that lock.

> - path B. async work kthread that brings in scsi device. It wants to
>   register CPUHP states at some point, and it needs the CPU hot-plug
>   lock for that, which is owned by zram.
> 
> async_run_entry_fn
>  scsi_probe_and_add_lun
>   scsi_mq_alloc_queue
>    blk_mq_init_queue
>     blk_mq_init_allocated_queue
>      blk_mq_realloc_hw_ctxs
>       __cpuhp_state_add_instance
>        __cpuhp_state_add_instance_cpuslocked
>         mutex_lock
>          schedule
> 
> - path C. modprobe sleeps, waiting for all async work to finish.
> 
> load_module
>  do_init_module
>   async_synchronize_full
>    async_synchronize_cookie_domain
>     schedule
> 
> And none can make any progress.
> 
> So I think we need to move crypto_alg_mod_lookup()->call_usermodehelper_exec()
> out of CPU hot-plug lock and pre-load modules in advance, before we grab the
> hot-plug lock.

If the locking is fixed, why is there still a need to preload modules?


* Re: [PATCH] zram: do not lookup algorithm in backends table
  2022-06-22 19:19   ` Andrew Morton
@ 2022-06-23  0:21     ` Sergey Senozhatsky
  0 siblings, 0 replies; 4+ messages in thread
From: Sergey Senozhatsky @ 2022-06-23  0:21 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Sergey Senozhatsky, Minchan Kim, Nitin Gupta, linux-kernel, linux-block

On (22/06/22 12:19), Andrew Morton wrote:
> > On (22/06/22 11:35), Sergey Senozhatsky wrote:
> > > Always use crypto_has_comp() so that crypto can look up the module,
> > > call usermodehelper to load it, wait for usermodehelper to finish
> > > and so on, all outside of the CPU hot-plug lock. Otherwise crypto
> > > does all of these steps under the CPU hot-plug lock, which is too
> > > much work to do under that lock. Besides, this can end up in a
> > > deadlock when usermodehelper triggers a code path that attempts to
> > > take the CPU hot-plug lock, which zram already holds.
> > 
> > And we think that we (not exactly "we", our partners) actually
> > see a deadlock. It goes something like this:
> > 
> > - path A. zram grabs CPU hot-plug lock, execs /sbin/modprobe from crypto
> >   and waits for modprobe to finish
> 
> Nope, can't do that.
> 
> > disksize_store
> >  zcomp_create
> >   __cpuhp_state_add_instance
> >    __cpuhp_state_add_instance_cpuslocked
> >     zcomp_cpu_up_prepare
> >      crypto_alloc_base
> >       crypto_alg_mod_lookup
> >        call_usermodehelper_exec
> >         wait_for_completion_killable
> >          do_wait_for_common
> >           schedule
> 
> The usermode helper is free to do anything it wants, including
> operations that take the CPU hotplug lock.  Or operations which might
> in the future be changed to take that lock.

Agreed.

> > - path B. async work kthread that brings in scsi device. It wants to
> >   register CPUHP states at some point, and it needs the CPU hot-plug
> >   lock for that, which is owned by zram.
> > 
> > async_run_entry_fn
> >  scsi_probe_and_add_lun
> >   scsi_mq_alloc_queue
> >    blk_mq_init_queue
> >     blk_mq_init_allocated_queue
> >      blk_mq_realloc_hw_ctxs
> >       __cpuhp_state_add_instance
> >        __cpuhp_state_add_instance_cpuslocked
> >         mutex_lock
> >          schedule
> > 
> > - path C. modprobe sleeps, waiting for all async work to finish.
> > 
> > load_module
> >  do_init_module
> >   async_synchronize_full
> >    async_synchronize_cookie_domain
> >     schedule
> > 
> > And none can make any progress.
> > 
> > So I think we need to move crypto_alg_mod_lookup()->call_usermodehelper_exec()
> > out of CPU hot-plug lock and pre-load modules in advance, before we grab the
> > hot-plug lock.
> 
> If the locking is fixed, why is there still a need to preload modules?

We "fix" the locking by doing the initial crypto compression algorithm
lookup outside of the hot-plug lock (the pre-load).

The crypto API keeps a list of already loaded modules internally.
Currently we call crypto_alloc_base() under the hot-plug lock, which
calls crypto_alg_mod_lookup(), which finds that the crypto modules list
does not contain that module yet, and so it modprobes it.

With this patch we do the first crypto_alg_mod_lookup() outside of the
hot-plug lock, so that it can safely modprobe the compression module.
Then, when we grab the hot-plug lock and set up the per-CPU streams,
crypto_alloc_base()->crypto_alg_mod_lookup() finds that the module is
already on the list, so no modprobe is needed.
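
The ordering described above can be sketched as a small userspace
model, in plain C and not kernel code. All names here
(model_alg_lookup, model_zcomp_create, the loaded[] table) are
hypothetical stand-ins; the sketch only assumes, as described above,
that a lookup modprobes once and later lookups find the module on the
list:

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_ALGS 8

/* Model of the crypto API's internal list of loaded modules. */
static const char *loaded[MAX_ALGS];
static int nr_loaded;

static bool hotplug_lock_held;		/* model of the CPU hot-plug lock */
static int modprobes_under_lock;	/* how often we hit the bad case */

static bool alg_is_loaded(const char *name)
{
	for (int i = 0; i < nr_loaded; i++)
		if (strcmp(loaded[i], name) == 0)
			return true;
	return false;
}

/* Stand-in for crypto_alg_mod_lookup(): "modprobe" only if not loaded. */
static void model_alg_lookup(const char *name)
{
	if (alg_is_loaded(name))
		return;
	if (hotplug_lock_held)
		modprobes_under_lock++;	/* this is what can deadlock */
	assert(nr_loaded < MAX_ALGS);
	loaded[nr_loaded++] = name;
}

/*
 * Stand-in for the zcomp setup path. With preload set, the first
 * lookup happens before the lock is taken. Returns the number of
 * modprobes that ran under the lock during this call.
 */
static int model_zcomp_create(const char *name, bool preload)
{
	int before = modprobes_under_lock;

	if (preload)
		model_alg_lookup(name);	/* outside of the hot-plug lock */

	hotplug_lock_held = true;
	model_alg_lookup(name);		/* per-CPU stream setup */
	hotplug_lock_held = false;

	return modprobes_under_lock - before;
}
```

In this model, model_zcomp_create("lzo", false) counts one modprobe
under the lock, while model_zcomp_create("zstd", true) counts none,
because the second lookup finds the module already on the list.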
