* [PATCH] dm: use noio when sending kobject event
@ 2020-07-08 16:25 Mikulas Patocka
2020-07-08 18:26 ` Gabriel Krisman Bertazi
From: Mikulas Patocka @ 2020-07-08 16:25 UTC
To: Mike Snitzer
Cc: Tahsin Erdogan, dm-devel, Gabriel Krisman Bertazi, Khazhismel Kumykov
kobject_uevent may allocate memory and it may be called while there are dm
devices suspended. The allocation may recurse into a suspended device,
causing a deadlock. We must set the noio flag when sending a uevent.
This is the observed deadlock:
iSCSI-write
holding: rx_queue_mutex
waiting: uevent_sock_mutex
kobject_uevent_env+0x1bd/0x419
kobject_uevent+0xb/0xd
device_add+0x48a/0x678
scsi_add_host_with_dma+0xc5/0x22d
iscsi_host_add+0x53/0x55
iscsi_sw_tcp_session_create+0xa6/0x129
iscsi_if_rx+0x100/0x1247
netlink_unicast+0x213/0x4f0
netlink_sendmsg+0x230/0x3c0
iscsi_fail iscsi_conn_failure
waiting: rx_queue_mutex
schedule_preempt_disabled+0x325/0x734
__mutex_lock_slowpath+0x18b/0x230
mutex_lock+0x22/0x40
iscsi_conn_failure+0x42/0x149
worker_thread+0x24a/0xbc0
EventManager_
holding: uevent_sock_mutex
waiting: dm_bufio_client->lock
dm_bufio_lock+0xe/0x10
shrink+0x34/0xf7
shrink_slab+0x177/0x5d0
do_try_to_free_pages+0x129/0x470
try_to_free_mem_cgroup_pages+0x14f/0x210
memcg_kmem_newpage_charge+0xa6d/0x13b0
__alloc_pages_nodemask+0x4a3/0x1a70
fallback_alloc+0x1b2/0x36c
__kmalloc_node_track_caller+0xb9/0x10d0
__alloc_skb+0x83/0x2f0
kobject_uevent_env+0x26b/0x419
dm_kobject_uevent+0x70/0x79
dev_suspend+0x1a9/0x1e7
ctl_ioctl+0x3e9/0x411
dm_ctl_ioctl+0x13/0x17
do_vfs_ioctl+0xb3/0x460
SyS_ioctl+0x5e/0x90
MemcgReclaimerD
holding: dm_bufio_client->lock
waiting: stuck io to finish (needs iscsi_fail thread to progress)
schedule at ffffffffbd603618
io_schedule at ffffffffbd603ba4
do_io_schedule at ffffffffbdaf0d94
__wait_on_bit at ffffffffbd6008a6
out_of_line_wait_on_bit at ffffffffbd600960
wait_on_bit.constprop.10 at ffffffffbdaf0f17
__make_buffer_clean at ffffffffbdaf18ba
__cleanup_old_buffer at ffffffffbdaf192f
shrink at ffffffffbdaf19fd
do_shrink_slab at ffffffffbd6ec000
shrink_slab at ffffffffbd6ec24a
do_try_to_free_pages at ffffffffbd6eda09
try_to_free_mem_cgroup_pages at ffffffffbd6ede7e
mem_cgroup_resize_limit at ffffffffbd7024c0
mem_cgroup_write at ffffffffbd703149
cgroup_file_write at ffffffffbd6d9c6e
sys_write at ffffffffbd6662ea
system_call_fastpath at ffffffffbdbc34a2
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Reported-by: Khazhismel Kumykov <khazhy@google.com>
Reported-by: Tahsin Erdogan <tahsin@google.com>
Reported-by: Gabriel Krisman Bertazi <krisman@collabora.com>
Cc: stable@vger.kernel.org
---
drivers/md/dm.c | 15 ++++++++++++---
1 file changed, 12 insertions(+), 3 deletions(-)
Index: linux-2.6/drivers/md/dm.c
===================================================================
--- linux-2.6.orig/drivers/md/dm.c 2020-06-29 14:50:22.000000000 +0200
+++ linux-2.6/drivers/md/dm.c 2020-07-08 18:17:55.000000000 +0200
@@ -12,6 +12,7 @@
#include <linux/init.h>
#include <linux/module.h>
#include <linux/mutex.h>
+#include <linux/sched/mm.h>
#include <linux/sched/signal.h>
#include <linux/blkpg.h>
#include <linux/bio.h>
@@ -2926,17 +2927,25 @@ EXPORT_SYMBOL_GPL(dm_internal_resume_fas
int dm_kobject_uevent(struct mapped_device *md, enum kobject_action action,
unsigned cookie)
{
+ int r;
+ unsigned noio_flag;
char udev_cookie[DM_COOKIE_LENGTH];
char *envp[] = { udev_cookie, NULL };
+ noio_flag = memalloc_noio_save();
+
if (!cookie)
- return kobject_uevent(&disk_to_dev(md->disk)->kobj, action);
+ r = kobject_uevent(&disk_to_dev(md->disk)->kobj, action);
else {
snprintf(udev_cookie, DM_COOKIE_LENGTH, "%s=%u",
DM_COOKIE_ENV_VAR_NAME, cookie);
- return kobject_uevent_env(&disk_to_dev(md->disk)->kobj,
- action, envp);
+ r = kobject_uevent_env(&disk_to_dev(md->disk)->kobj,
+ action, envp);
}
+
+ memalloc_noio_restore(noio_flag);
+
+ return r;
}
uint32_t dm_next_uevent_seq(struct mapped_device *md)
* Re: dm: use noio when sending kobject event
2020-07-08 18:26 ` Gabriel Krisman Bertazi
@ 2020-07-08 17:33 ` Mike Snitzer
2020-07-08 19:37 ` Gabriel Krisman Bertazi
From: Mike Snitzer @ 2020-07-08 17:33 UTC
To: Gabriel Krisman Bertazi
Cc: Tahsin Erdogan, dm-devel, Mikulas Patocka, Khazhismel Kumykov
On Wed, Jul 08 2020 at 2:26pm -0400,
Gabriel Krisman Bertazi <krisman@collabora.com> wrote:
> Mikulas Patocka <mpatocka@redhat.com> writes:
>
> > kobject_uevent may allocate memory and it may be called while there are dm
> > devices suspended. The allocation may recurse into a suspended device,
> > causing a deadlock. We must set the noio flag when sending a uevent.
>
> If I understand it correctly, considering the deadlock you shared, this
> doesn't solve the entire issue. For instance, kobject_uevent_env on the
> GFP_NOIO thread waits on uevent_sock_mutex, and another thread with
> GFP_IO holding the mutex might have triggered the shrinker from inside
> kobject_uevent_net_broadcast. I believe 7e7cd796f277 ("scsi: iscsi: Fix
> deadlock on recovery path during GFP_IO reclaim") solved the one you
> shared and other similar cases for iSCSI in a different way.
I staged a different fix, from Mikulas, for 5.9 that is meant to address
the original report; please see:
https://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm.git/commit/?h=dm-5.9&id=e5bfe9baf23dca211f4b794b651e871032c427ec
I'd appreciate it if you could try this commit to see if it fixes the
original issue you reported.
> I know this is similar to the log I shared on an earlier patch. Are you
> able to reproduce the deadlock with the above patch applied?
Mikulas seized on the fact that the backtrace shows the uevent upcall
to have occurred while suspending. I know he didn't reproduce your
issue.
> That said, I think this patch is an improvement as we shouldn't be using
> GFP_IO in this path to begin with, so please add:
>
> Reviewed-by: Gabriel Krisman Bertazi <krisman@collabora.com>
FYI, while I do appreciate your Reviewed-by, I already staged this for
5.8, so I'd rather not rebase to add your Reviewed-by; see:
https://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm.git/commit/?h=dm-5.8&id=6958c1c640af8c3f40fa8a2eee3b5b905d95b677
Thanks,
Mike
* Re: [PATCH] dm: use noio when sending kobject event
2020-07-08 16:25 [PATCH] dm: use noio when sending kobject event Mikulas Patocka
@ 2020-07-08 18:26 ` Gabriel Krisman Bertazi
2020-07-08 17:33 ` Mike Snitzer
From: Gabriel Krisman Bertazi @ 2020-07-08 18:26 UTC
To: Mikulas Patocka
Cc: Tahsin Erdogan, Mike Snitzer, dm-devel, Khazhismel Kumykov
Mikulas Patocka <mpatocka@redhat.com> writes:
> kobject_uevent may allocate memory and it may be called while there are dm
> devices suspended. The allocation may recurse into a suspended device,
> causing a deadlock. We must set the noio flag when sending a uevent.
If I understand it correctly, considering the deadlock you shared, this
doesn't solve the entire issue. For instance, kobject_uevent_env on the
GFP_NOIO thread waits on uevent_sock_mutex, and another thread with
GFP_IO holding the mutex might have triggered the shrinker from inside
kobject_uevent_net_broadcast. I believe 7e7cd796f277 ("scsi: iscsi: Fix
deadlock on recovery path during GFP_IO reclaim") solved the one you
shared and other similar cases for iSCSI in a different way.
I know this is similar to the log I shared on an earlier patch. Are you
able to reproduce the deadlock with the above patch applied?
That said, I think this patch is an improvement as we shouldn't be using
GFP_IO in this path to begin with, so please add:
Reviewed-by: Gabriel Krisman Bertazi <krisman@collabora.com>
> This is the observed deadlock:
>
> iSCSI-write
> holding: rx_queue_mutex
> waiting: uevent_sock_mutex
>
> kobject_uevent_env+0x1bd/0x419
> kobject_uevent+0xb/0xd
> device_add+0x48a/0x678
> scsi_add_host_with_dma+0xc5/0x22d
> iscsi_host_add+0x53/0x55
> iscsi_sw_tcp_session_create+0xa6/0x129
> iscsi_if_rx+0x100/0x1247
> netlink_unicast+0x213/0x4f0
> netlink_sendmsg+0x230/0x3c0
>
> iscsi_fail iscsi_conn_failure
> waiting: rx_queue_mutex
>
> schedule_preempt_disabled+0x325/0x734
> __mutex_lock_slowpath+0x18b/0x230
> mutex_lock+0x22/0x40
> iscsi_conn_failure+0x42/0x149
> worker_thread+0x24a/0xbc0
>
> EventManager_
> holding: uevent_sock_mutex
> waiting: dm_bufio_client->lock
>
> dm_bufio_lock+0xe/0x10
> shrink+0x34/0xf7
> shrink_slab+0x177/0x5d0
> do_try_to_free_pages+0x129/0x470
> try_to_free_mem_cgroup_pages+0x14f/0x210
> memcg_kmem_newpage_charge+0xa6d/0x13b0
> __alloc_pages_nodemask+0x4a3/0x1a70
> fallback_alloc+0x1b2/0x36c
> __kmalloc_node_track_caller+0xb9/0x10d0
> __alloc_skb+0x83/0x2f0
> kobject_uevent_env+0x26b/0x419
> dm_kobject_uevent+0x70/0x79
> dev_suspend+0x1a9/0x1e7
> ctl_ioctl+0x3e9/0x411
> dm_ctl_ioctl+0x13/0x17
> do_vfs_ioctl+0xb3/0x460
> SyS_ioctl+0x5e/0x90
>
> MemcgReclaimerD
> holding: dm_bufio_client->lock
> waiting: stuck io to finish (needs iscsi_fail thread to progress)
>
> schedule at ffffffffbd603618
> io_schedule at ffffffffbd603ba4
> do_io_schedule at ffffffffbdaf0d94
> __wait_on_bit at ffffffffbd6008a6
> out_of_line_wait_on_bit at ffffffffbd600960
> wait_on_bit.constprop.10 at ffffffffbdaf0f17
> __make_buffer_clean at ffffffffbdaf18ba
> __cleanup_old_buffer at ffffffffbdaf192f
> shrink at ffffffffbdaf19fd
> do_shrink_slab at ffffffffbd6ec000
> shrink_slab at ffffffffbd6ec24a
> do_try_to_free_pages at ffffffffbd6eda09
> try_to_free_mem_cgroup_pages at ffffffffbd6ede7e
> mem_cgroup_resize_limit at ffffffffbd7024c0
> mem_cgroup_write at ffffffffbd703149
> cgroup_file_write at ffffffffbd6d9c6e
> sys_write at ffffffffbd6662ea
> system_call_fastpath at ffffffffbdbc34a2
>
>
> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
> Reported-by: Khazhismel Kumykov <khazhy@google.com>
> Reported-by: Tahsin Erdogan <tahsin@google.com>
> Reported-by: Gabriel Krisman Bertazi <krisman@collabora.com>
> Cc: stable@vger.kernel.org
>
> ---
> drivers/md/dm.c | 15 ++++++++++++---
> 1 file changed, 12 insertions(+), 3 deletions(-)
>
> Index: linux-2.6/drivers/md/dm.c
> ===================================================================
> --- linux-2.6.orig/drivers/md/dm.c 2020-06-29 14:50:22.000000000 +0200
> +++ linux-2.6/drivers/md/dm.c 2020-07-08 18:17:55.000000000 +0200
> @@ -12,6 +12,7 @@
> #include <linux/init.h>
> #include <linux/module.h>
> #include <linux/mutex.h>
> +#include <linux/sched/mm.h>
> #include <linux/sched/signal.h>
> #include <linux/blkpg.h>
> #include <linux/bio.h>
> @@ -2926,17 +2927,25 @@ EXPORT_SYMBOL_GPL(dm_internal_resume_fas
> int dm_kobject_uevent(struct mapped_device *md, enum kobject_action action,
> unsigned cookie)
> {
> + int r;
> + unsigned noio_flag;
> char udev_cookie[DM_COOKIE_LENGTH];
> char *envp[] = { udev_cookie, NULL };
>
> + noio_flag = memalloc_noio_save();
> +
> if (!cookie)
> - return kobject_uevent(&disk_to_dev(md->disk)->kobj, action);
> + r = kobject_uevent(&disk_to_dev(md->disk)->kobj, action);
> else {
> snprintf(udev_cookie, DM_COOKIE_LENGTH, "%s=%u",
> DM_COOKIE_ENV_VAR_NAME, cookie);
> - return kobject_uevent_env(&disk_to_dev(md->disk)->kobj,
> - action, envp);
> + r = kobject_uevent_env(&disk_to_dev(md->disk)->kobj,
> + action, envp);
> }
> +
> + memalloc_noio_restore(noio_flag);
> +
> + return r;
> }
>
> uint32_t dm_next_uevent_seq(struct mapped_device *md)
>
--
Gabriel Krisman Bertazi
* Re: dm: use noio when sending kobject event
2020-07-08 17:33 ` Mike Snitzer
@ 2020-07-08 19:37 ` Gabriel Krisman Bertazi
2020-07-10 18:22 ` Mike Snitzer
From: Gabriel Krisman Bertazi @ 2020-07-08 19:37 UTC
To: Mike Snitzer; +Cc: dm-devel, Mikulas Patocka, Khazhismel Kumykov
Mike Snitzer <snitzer@redhat.com> writes:
> On Wed, Jul 08 2020 at 2:26pm -0400,
> Gabriel Krisman Bertazi <krisman@collabora.com> wrote:
>
>> If I understand it correctly, considering the deadlock you shared, this
>> doesn't solve the entire issue. For instance, kobject_uevent_env on the
>> GFP_NOIO thread waits on uevent_sock_mutex, and another thread with
>> GFP_IO holding the mutex might have triggered the shrinker from inside
>> kobject_uevent_net_broadcast. I believe 7e7cd796f277 ("scsi: iscsi: Fix
>> deadlock on recovery path during GFP_IO reclaim") solved the one you
>> shared and other similar cases for iSCSI in a different way.
>
> I staged a different fix, from Mikulas, for 5.9 that is meant to address
> the original report, please see:
> https://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm.git/commit/?h=dm-5.9&id=e5bfe9baf23dca211f4b794b651e871032c427ec
>
> I'd appreciate it if you could try this commit to see if it fixes the
> original issue you reported.
I reverted 7e7cd796f277 and cherry-picked e5bfe9baf23dc on my tree.
After a few iterations, I could see the conditions that formerly
triggered the deadlock happening, but this patch successfully allowed
the reclaim to succeed and the iscsi recovery thread to run.
My reproducer is a bit artificial, as I wrote it from only the
problem description provided by Google. They were hitting this in
production and might have a better final word on the fix, though I know
they don't have a simple way to reproduce it.
>> That said, I think this patch is an improvement as we shouldn't be using
>> GFP_IO in this path to begin with, so please add:
>>
>> Reviewed-by: Gabriel Krisman Bertazi <krisman@collabora.com>
>
> FYI, while I do appreciate your Reviewed-by, I already staged this for
> 5.8, so I'd rather not rebase to add your Reviewed-by; see:
> https://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm.git/commit/?h=dm-5.8&id=6958c1c640af8c3f40fa8a2eee3b5b905d95b677
No worries. Actually, thank you guys for helping with this issue.
--
Gabriel Krisman Bertazi
* Re: dm: use noio when sending kobject event
2020-07-08 19:37 ` Gabriel Krisman Bertazi
@ 2020-07-10 18:22 ` Mike Snitzer
From: Mike Snitzer @ 2020-07-10 18:22 UTC
To: Gabriel Krisman Bertazi; +Cc: dm-devel, Mikulas Patocka, Khazhismel Kumykov
On Wed, Jul 08 2020 at 3:37pm -0400,
Gabriel Krisman Bertazi <krisman@collabora.com> wrote:
> Mike Snitzer <snitzer@redhat.com> writes:
>
> > On Wed, Jul 08 2020 at 2:26pm -0400,
> > Gabriel Krisman Bertazi <krisman@collabora.com> wrote:
> >
> >> If I understand it correctly, considering the deadlock you shared, this
> >> doesn't solve the entire issue. For instance, kobject_uevent_env on the
> >> GFP_NOIO thread waits on uevent_sock_mutex, and another thread with
> >> GFP_IO holding the mutex might have triggered the shrinker from inside
> >> kobject_uevent_net_broadcast. I believe 7e7cd796f277 ("scsi: iscsi: Fix
> >> deadlock on recovery path during GFP_IO reclaim") solved the one you
> >> shared and other similar cases for iSCSI in a different way.
> >
> > I staged a different fix, from Mikulas, for 5.9 that is meant to address
> > the original report, please see:
> > https://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm.git/commit/?h=dm-5.9&id=e5bfe9baf23dca211f4b794b651e871032c427ec
> >
> > I'd appreciate it if you could try this commit to see if it fixes the
> > original issue you reported.
>
> I reverted 7e7cd796f277 and cherry-picked e5bfe9baf23dc on my tree.
> After a few iterations, I could see the conditions that formerly
> triggered the deadlock happening, but this patch successfully allowed
> the reclaim to succeed and the iscsi recovery thread to run.
>
> My reproducer is a bit artificial, as I wrote it from only the
> problem description provided by Google. They were hitting this in
> production and might have a better final word on the fix, though I know
> they don't have a simple way to reproduce it.
Nice job on getting together a reproducer that even begins to model
the issue Google's production setup teased out.
Thanks for testing; I've added your Tested-by to the commit.
Mike