* deadlocks when the target server runs as initiator to itself
@ 2023-01-27  9:58 Maurizio Lombardi
  2023-01-30  0:09 ` Mike Christie
  0 siblings, 1 reply; 5+ messages in thread
From: Maurizio Lombardi @ 2023-01-27  9:58 UTC (permalink / raw)
  To: Mike Christie, d.bogdanov; +Cc: target-devel

Hello Mike, Dmitry,

A customer of ours needs an unusual configuration where an iSCSI initiator
runs on the same host as the target;
in other words, the host sees an iSCSI disk that is in fact just a local disk.

The problem is that under heavy load the system sometimes hangs.
Example backtrace:

    crash> bt 2037117
    PID: 2037117  TASK: ffff8bb4c901dac0  CPU: 0    COMMAND: "iscsi_trx"
     #0 [ffffa3f4199db378] __schedule at ffffffff9134b2ed
     #1 [ffffa3f4199db408] schedule at ffffffff9134b7c8
     #2 [ffffa3f4199db418] io_schedule at ffffffff9134bbe2
     #3 [ffffa3f4199db428] rq_qos_wait at ffffffff90e61245
     #4 [ffffa3f4199db4b0] wbt_wait at ffffffff90e7bb99
     #5 [ffffa3f4199db4f0] __rq_qos_throttle at ffffffff90e60fc3
     #6 [ffffa3f4199db508] blk_mq_make_request at ffffffff90e5159d
     #7 [ffffa3f4199db598] generic_make_request at ffffffff90e4592f
     #8 [ffffa3f4199db600] submit_bio at ffffffff90e45bcc
     #9 [ffffa3f4199db640] xlog_state_release_iclog at ffffffffc0358cae [xfs]
    #10 [ffffa3f4199db668] __xfs_log_force_lsn at ffffffffc0359059 [xfs]
    #11 [ffffa3f4199db6d8] xfs_log_force_lsn at ffffffffc035a21f [xfs]
    #12 [ffffa3f4199db710] __xfs_iunpin_wait at ffffffffc03454e6 [xfs]
    #13 [ffffa3f4199db780] xfs_reclaim_inode at ffffffffc033c203 [xfs]
    #14 [ffffa3f4199db7c8] xfs_reclaim_inodes_ag at ffffffffc033c620 [xfs]
    #15 [ffffa3f4199db948] xfs_reclaim_inodes_nr at ffffffffc033d851 [xfs]
    #16 [ffffa3f4199db960] super_cache_scan at ffffffff90d1cad2
    #17 [ffffa3f4199db9b0] do_shrink_slab at ffffffff90c73e9c
    #18 [ffffa3f4199dba20] shrink_slab at ffffffff90c74761
    #19 [ffffa3f4199dbaa0] shrink_node at ffffffff90c7908c
    #20 [ffffa3f4199dbb20] do_try_to_free_pages at ffffffff90c79659
    #21 [ffffa3f4199dbb70] try_to_free_pages at ffffffff90c79a5f
    #22 [ffffa3f4199dbc10] __alloc_pages_slowpath at ffffffff90cbcd31
    #23 [ffffa3f4199dbd08] __alloc_pages_nodemask at ffffffff90cbd953
    #24 [ffffa3f4199dbd68] sgl_alloc_order at ffffffff90e80e08
    #25 [ffffa3f4199dbdb8] transport_generic_new_cmd at ffffffffc0972ce5 [target_core_mod]
    #26 [ffffa3f4199dbdf8] iscsit_process_scsi_cmd at ffffffffc09eabf5 [iscsi_target_mod]
    #27 [ffffa3f4199dbe18] iscsit_get_rx_pdu at ffffffffc09ec239 [iscsi_target_mod]
    #28 [ffffa3f4199dbed8] iscsi_target_rx_thread at ffffffffc09eda61 [iscsi_target_mod]
    #29 [ffffa3f4199dbf10] kthread at ffffffff90b036c6

This is what I think may be happening:

The rx thread receives an iSCSI command and calls sgl_alloc(), but the
kernel needs to reclaim memory to satisfy the allocation. The memory
reclaim code starts a flush against the filesystem mounted on top of
the iSCSI device, and this ends up in a deadlock: the filesystem needs
the target driver to complete the task, but the iscsi_rx thread is
stuck in sgl_alloc().

Does this sound correct to you?

What do you think about using memalloc_noio_*() in the iscsi_rx thread
to prevent the memory reclaim code from starting I/O operations? Any
alternative ideas?

Thanks!
Maurizio



* Re: deadlocks when the target server runs as initiator to itself
  2023-01-27  9:58 deadlocks when the target server runs as initiator to itself Maurizio Lombardi
@ 2023-01-30  0:09 ` Mike Christie
  2023-01-30  9:41   ` Maurizio Lombardi
  2023-02-06 15:18   ` Maurizio Lombardi
  0 siblings, 2 replies; 5+ messages in thread
From: Mike Christie @ 2023-01-30  0:09 UTC (permalink / raw)
  To: Maurizio Lombardi, d.bogdanov; +Cc: target-devel

On 1/27/23 03:58, Maurizio Lombardi wrote:
> Hello Mike, Dmitry,
> 
> A customer of ours needs an unusual configuration where an iSCSI initiator
> runs on the same host as the target;
> in other words, the host sees an iSCSI disk that is in fact just a local disk.
> 
> The problem is that under heavy load the system sometimes hangs.
> Example backtrace:
> 
>     [backtrace snipped]
> 
> This is what I think may be happening:
> 
> The rx thread receives an iSCSI command and calls sgl_alloc(), but the
> kernel needs to reclaim memory to satisfy the allocation. The memory
> reclaim code starts a flush against the filesystem mounted on top of
> the iSCSI device, and this ends up in a deadlock: the filesystem needs
> the target driver to complete the task, but the iscsi_rx thread is
> stuck in sgl_alloc().
> 
> Does this sound correct to you?

Yeah, I think nbd and rbd have similar issues. I think they just say don't
do that.

> 
> What do you think about using memalloc_noio_*() in the iscsi_rx thread
> to prevent the memory reclaim code from starting I/O operations? Any
> alternative ideas?

I don't think that's the best option because it's a rare use case and it
will affect other users. Why can't the user just use tcm_loop for the
local use case?


* Re: deadlocks when the target server runs as initiator to itself
  2023-01-30  0:09 ` Mike Christie
@ 2023-01-30  9:41   ` Maurizio Lombardi
  2023-02-06 15:18   ` Maurizio Lombardi
  1 sibling, 0 replies; 5+ messages in thread
From: Maurizio Lombardi @ 2023-01-30  9:41 UTC (permalink / raw)
  To: Mike Christie; +Cc: d.bogdanov, target-devel

po 30. 1. 2023 v 1:09 odesílatel Mike Christie
<michael.christie@oracle.com> napsal:
>
> > Does this sound correct to you?
>
> Yeah, I think nbd and rbd have similar issues. I think they just say don't
> do that.

Ok. So running the iSCSI initiator and target on the same host must be
considered unsupported.

>
> >
> > What do you think about using memalloc_noio_*() in the iscsi_rx thread
> > to prevent the memory reclaim code from starting I/O operations? Any
> > alternative ideas?
>
> I don't think that's the best option because it's a rare use case and it
> will affect other users.

Yeah, it would be overkill for such a corner case.

> Why can't the user just use tcm_loop for the local
> use case?

This is indeed what we are going to suggest to them: reconfigure their
target setup to avoid using iSCSI in loopback and use tcm_loop instead.

Thanks,
Maurizio



* Re: deadlocks when the target server runs as initiator to itself
  2023-01-30  0:09 ` Mike Christie
  2023-01-30  9:41   ` Maurizio Lombardi
@ 2023-02-06 15:18   ` Maurizio Lombardi
  2023-02-06 15:52     ` Mike Christie
  1 sibling, 1 reply; 5+ messages in thread
From: Maurizio Lombardi @ 2023-02-06 15:18 UTC (permalink / raw)
  To: Mike Christie; +Cc: d.bogdanov, target-devel, David Jeffery, Laurence Oberman

po 30. 1. 2023 v 1:09 odesílatel Mike Christie
<michael.christie@oracle.com> napsal:
>
> I don't think that's the best option because it's a rare use case and it
> will affect other users. Why can't the user just use tcm loop for the local
> use case?
>

Hi Mike,
our customer is still interested in getting iSCSI in loopback to work, and I
have also been informed that this use case isn't rare among our users.
One of my colleagues suggested using the IFF_LOOPBACK flag to restrict the
memalloc_noio_*() usage to only those connections that are in loopback, so
other use cases would be left unaffected.

I'm copy-pasting his patch here; our customer confirmed that it works.

--- a/drivers/target/iscsi/iscsi_target.c 2023-01-30 13:48:43.310455860 -0500
+++ b/drivers/target/iscsi/iscsi_target.c 2023-01-30 17:10:08.171410784 -0500
@@ -24,6 +24,7 @@
 #include <linux/vmalloc.h>
 #include <linux/idr.h>
 #include <linux/delay.h>
+#include <linux/sched/mm.h>
 #include <linux/sched/signal.h>
 #include <asm/unaligned.h>
 #include <linux/inet.h>
@@ -4043,7 +4044,10 @@ int iscsi_target_rx_thread(void *arg)
 {
 	int rc;
 	struct iscsi_conn *conn = arg;
+	struct dst_entry *dst;
 	bool conn_freed = false;
+	bool local = false;
+	unsigned int flags;
 
 	/*
 	 * Allow ourselves to be interrupted by SIGINT so that a
@@ -4061,7 +4065,17 @@ int iscsi_target_rx_thread(void *arg)
 	if (!conn->conn_transport->iscsit_get_rx_pdu)
 		return 0;
 
+	rcu_read_lock();
+	dst = rcu_dereference(conn->sock->sk->sk_dst_cache);
+	if (dst && dst->dev && dst->dev->flags & IFF_LOOPBACK)
+		local = true;
+	rcu_read_unlock();
+
+	if (local)
+		flags = memalloc_noio_save();
 	conn->conn_transport->iscsit_get_rx_pdu(conn);
+	if (local)
+		memalloc_noio_restore(flags);



* Re: deadlocks when the target server runs as initiator to itself
  2023-02-06 15:18   ` Maurizio Lombardi
@ 2023-02-06 15:52     ` Mike Christie
  0 siblings, 0 replies; 5+ messages in thread
From: Mike Christie @ 2023-02-06 15:52 UTC (permalink / raw)
  To: Maurizio Lombardi
  Cc: d.bogdanov, target-devel, David Jeffery, Laurence Oberman

On 2/6/23 9:18 AM, Maurizio Lombardi wrote:
> po 30. 1. 2023 v 1:09 odesílatel Mike Christie
> <michael.christie@oracle.com> napsal:
>>
>> I don't think that's the best option because it's a rare use case and it
>> will affect other users. Why can't the user just use tcm loop for the local
>> use case?
>>
> 
> Hi Mike,
> our customer is still interested in getting iSCSI in loopback to work, and I
> have also been informed that this use case isn't rare among our users.

Why do people use it like this in production? Is it for some sort of clustering
or container use?

I was actually going to ping you and tell you that loop could have issues too,
because the backend could still allocate memory with GFP_KERNEL, so that
would just be moving the problem around.

> One of my colleagues suggested using the IFF_LOOPBACK flag to restrict
> the memalloc_noio_*()
> usage to only those connections that are in loopback, so other use
> cases would be left unaffected.

If it's ok for us to access those flags like that, then it seems ok to me
if you have legitimate uses.


