linux-rdma.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Sagi Grimberg <sagi@grimberg.me>
To: Krishnamraju Eraparaju <krishna2@chelsio.com>,
	Bart Van Assche <bvanassche@acm.org>,
	Max Gurtovoy <maxg@mellanox.com>
Cc: linux-rdma@vger.kernel.org,
	Potnuri Bharat Teja <bharat@chelsio.com>,
	Nirranjan Kirubaharan <nirranjan@chelsio.com>
Subject: Re: iSERT completions hung due to unavailable iscsit tag
Date: Wed, 3 Jun 2020 13:27:29 -0700	[thread overview]
Message-ID: <d316fdb7-4676-0bb9-c208-b06e43d46534@grimberg.me> (raw)
In-Reply-To: <20200601134637.GA17657@chelsio.com>


> Hi,
> 
> I observe, iscsit_allocate_cmd() process waiting indefinitely for
> 'iscsit tag' to become available.
> 
> Here is the scenario:
> - at initiator, run "iozone -i 0 -I -+d -s 100000 -r 16384 -w" in an
>    infinite loop.
> - after few seconds, target fails to obtain 'tag' from
>    sbitmap_queue_get(), while processing ISCSI_CTRL PDU via
> isert_recv_done().
> - then iscsit waits there for tag to become available, by calling
>    schedule() in iscsit_wait_for_tag()
> - and the process never scheduled back again, not sure why.

The trace shows only the failed login attempt that
happened probably after what you are talking about.

Do you have a trace that shows your analysis?

> Due to this blockage the whole completion logic at iSER target went to
> grinding halt, causing NOPOUT request timeouts at initator, and below
> hung traces at target. Is this a known issue?

No, its not a known issue.

I was not able to reproduce it on my VM with siw...


> Target:
> [May 7 20:30] iSCSI Login timeout on Network Portal 102.1.1.6:3260
> [May 7 20:33] kworker/dying (2185) used greatest stack depth: 11672
> bytes left
> [  +2.640115] kmemleak: 2 new suspected memory leaks (see
> /sys/kernel/debug/kmemleak)

Can you output what kmemleak complains about?

> [ +26.031050] INFO: task iscsi_np:8387 blocked for more than 122
> seconds.
> [  +0.000004]       Not tainted 5.7.0-rc2+ #2
> [  +0.000001] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [  +0.000001] iscsi_np        D14304  8387      2 0x80004084
> [  +0.000005] Call Trace:
> [  +0.000008]  ? __schedule+0x274/0x5e0
> [  +0.000003]  ? stack_trace_save+0x46/0x70
> [  +0.000002]  schedule+0x45/0xb0
> [  +0.000002]  schedule_timeout+0x1d6/0x290
> [  +0.000001]  wait_for_completion+0x82/0xe0
> [  +0.000007]  iscsi_check_for_session_reinstatement+0x205/0x260
> [iscsi_target_mod]
> [  +0.000004]  iscsi_target_do_login+0xe3/0x5c0 [iscsi_target_mod]
> [  +0.000004]  iscsi_target_start_negotiation+0x4c/0xa0
> [iscsi_target_mod]
> [  +0.000004]  iscsi_target_login_thread+0x86f/0x1000 [iscsi_target_mod]
> [  +0.000003]  kthread+0xf3/0x130
> [  +0.000004]  ? iscsi_target_login_sess_out+0x1f0/0x1f0
> [iscsi_target_mod]
> [  +0.000001]  ? kthread_bind+0x10/0x10
> [  +0.000002]  ret_from_fork+0x35/0x40
> [  +0.000003] INFO: task iscsi_ttx:8863 blocked for more than 122
> seconds.
> [  +0.000001]       Not tainted 5.7.0-rc2+ #2
> [  +0.000000] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [  +0.000001] iscsi_ttx       D14384  8863      2 0x80004084
> [  +0.000003] Call Trace:
> [  +0.000002]  ? __schedule+0x274/0x5e0
> [  +0.000004]  ? c4iw_ib_modify_qp+0xf3/0x160 [iw_cxgb4]
> [  +0.000002]  schedule+0x45/0xb0
> [  +0.000002]  schedule_timeout+0x1d6/0x290
> [  +0.000001]  wait_for_completion+0x82/0xe0
> [  +0.000003]  __ib_drain_sq+0x147/0x170
> [  +0.000002]  ? ib_sg_to_pages+0x1a0/0x1a0
> [  +0.000003]  ib_drain_sq+0x6e/0x80
> [  +0.000002]  ib_drain_qp+0x9/0x20
> [  +0.000002]  isert_wait_conn+0x51/0x2c0 [ib_isert]
> [  +0.000004]  iscsit_close_connection+0x14c/0x840 [iscsi_target_mod]
> [  +0.000002]  ? __schedule+0x27c/0x5e0
> [  +0.000004]  iscsit_take_action_for_connection_exit+0x79/0x100
> [iscsi_target_mod]
> [  +0.000003]  iscsi_target_tx_thread+0x160/0x200 [iscsi_target_mod]
> [  +0.000004]  ? wait_woken+0x80/0x80
> [  +0.000002]  kthread+0xf3/0x130
> [  +0.000003]  ? iscsit_thread_get_cpumask+0x80/0x80 [iscsi_target_mod]
> [  +0.000001]  ? kthread_bind+0x10/0x10
> [  +0.000001]  ret_from_fork+0x35/0x40
> [  +0.000003] NMI backtrace for cpu 7
> [  +0.000002] CPU: 7 PID: 493 Comm: khungtaskd Not tainted 5.7.0-rc2+ #2
> [  +0.000001] Hardware name: Supermicro X10SRA-F/X10SRA-F, BIOS 2.1a
> 10/24/2018
> [  +0.000001] Call Trace:
> [  +0.000003]  dump_stack+0x50/0x6b
> [  +0.000002]  nmi_cpu_backtrace+0x9e/0xb0
> [  +0.000003]  ? lapic_can_unplug_cpu+0x90/0x90
> [  +0.000002]  nmi_trigger_cpumask_backtrace+0x9c/0xf0
> [  +0.000003]  watchdog+0x310/0x4f0
> [  +0.000002]  kthread+0xf3/0x130
> [  +0.000002]  ? hungtask_pm_notify+0x40/0x40
> [  +0.000001]  ? kthread_bind+0x10/0x10
> [  +0.000001]  ret_from_fork+0x35/0x40
> [  +0.000002] Sending NMI from CPU 7 to CPUs 0-6:
> [  +0.000009] NMI backtrace for cpu 0 skipped: idling at
> acpi_processor_ffh_cstate_enter+0x6f/0xc0
> [  +0.000001] NMI backtrace for cpu 4 skipped: idling at
> acpi_processor_ffh_cstate_enter+0x6f/0xc0
> [  +0.000002] NMI backtrace for cpu 1 skipped: idling at
> acpi_processor_ffh_cstate_enter+0x6f/0xc0
> [  +0.000019] NMI backtrace for cpu 2 skipped: idling at
> acpi_processor_ffh_cstate_enter+0x6f/0xc0
> [  +0.000001] NMI backtrace for cpu 6 skipped: idling at
> acpi_processor_ffh_cstate_enter+0x6f/0xc0
> [  +0.000010] NMI backtrace for cpu 5
> [  +0.000000] CPU: 5 PID: 4465 Comm: kworker/5:1H Not tainted 5.7.0-rc2+
> #2
> [  +0.000000] Hardware name: Supermicro X10SRA-F/X10SRA-F, BIOS 2.1a
> 10/24/2018
> [  +0.000001] Workqueue: ib-comp-wq ib_cq_poll_work
> [  +0.000000] RIP: 0010:__free_pages+0xe/0x60
> 
> 
> Initiator:
> [ 2002.689250]  connection5:0: ping timeout of 5 secs expired, recv
> timeout 5, last rx 4296659424, last ping 4296664448, now 4296669696
> [ 2002.689275]  connection5:0: detected conn error (1022)
> 
> 
> 
> Thanks,
> Krishna.
> 

  reply	other threads:[~2020-06-03 20:27 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-06-01 13:46 iSERT completions hung due to unavailable iscsit tag Krishnamraju Eraparaju
2020-06-03 20:27 ` Sagi Grimberg [this message]
2020-06-04 15:13   ` Krishnamraju Eraparaju
2020-06-16  9:49     ` Krishnamraju Eraparaju
2020-07-01 20:59       ` Krishnamraju Eraparaju
2020-06-05  5:07   ` Krishnamraju Eraparaju
2020-06-07 10:16 ` Max Gurtovoy

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=d316fdb7-4676-0bb9-c208-b06e43d46534@grimberg.me \
    --to=sagi@grimberg.me \
    --cc=bharat@chelsio.com \
    --cc=bvanassche@acm.org \
    --cc=krishna2@chelsio.com \
    --cc=linux-rdma@vger.kernel.org \
    --cc=maxg@mellanox.com \
    --cc=nirranjan@chelsio.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).