linux-block.vger.kernel.org archive mirror
From: Bart Van Assche <bvanassche@acm.org>
To: Yi Zhang <yi.zhang@redhat.com>, Ming Lei <ming.lei@redhat.com>
Cc: Laurence Oberman <loberman@redhat.com>,
	linux-block <linux-block@vger.kernel.org>,
	CKI Project <cki-project@redhat.com>
Subject: Re: [bug report] blktests srp/013 lead kernel panic with latest block/for-next and 5.13.15
Date: Tue, 28 Sep 2021 11:07:11 -0700
Message-ID: <491cab4b-1f5b-2881-5ba2-943c23d407ff@acm.org>
In-Reply-To: <CAHj4cs_8KbMJ+HU22E4-e_zYuPj8TfGOzxNtzQqxqKig9S=gQg@mail.gmail.com>

On 9/27/21 10:10 PM, Yi Zhang wrote:
> Hi Bart
> 
> Bisect shows this issue was introduced by the commit below. By the way, it always reproduces in the s390x KVM environment:
> 
> commit 65ca846a53149a1a72cd8d02e7b2e73dd545b834
> Author: Bart Van Assche <bvanassche@acm.org>
> Date:   Wed Jan 22 19:56:34 2020 -0800
> 
>      scsi: core: Introduce {init,exit}_cmd_priv()
> 
>      The current behavior of the SCSI core is to clear driver-private data
>      before preparing a request for submission to the SCSI LLD. Make it possible
>      for SCSI LLDs to disable clearing of driver-private data.
> 
>      These hooks will be used by a later patch, namely "scsi: ufs: Let the SCSI
>      core allocate per-command UFS data".
> 
> (gdb) l *(scsi_mq_exit_request+0x2c)
> 0x8d7be4 is in scsi_mq_exit_request (drivers/scsi/scsi_lib.c:1780).
> 1775 unsigned int hctx_idx)
> 1776 {
> 1777 struct Scsi_Host *shost = set->driver_data;
> 1778 struct scsi_cmnd *cmd = blk_mq_rq_to_pdu(rq);
> 1779
> 1780 if (shost->hostt->exit_cmd_priv)
> 1781 shost->hostt->exit_cmd_priv(shost, cmd);
> 1782 kmem_cache_free(scsi_sense_cache, cmd->sense_buffer);
> 1783 }
> 1784
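The optional-hook pattern in the listing above can be modelled in userspace as a minimal sketch. All names below (struct host, struct cmd, struct host_template, demo_exit_cmd_priv, exit_request) are hypothetical stand-ins, not the kernel's definitions. The sketch only shows that the statement at line 1780 has to read shost->hostt before it can test the hook, so it depends on both pointers still being valid at that point. It makes no claim about the cause of the reported panic.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins; these are NOT the kernel structures. */
struct host;

struct cmd {
	void *sense_buffer;	/* models cmd->sense_buffer */
	void *priv;		/* models driver-private per-command data */
};

struct host_template {
	/* Optional hooks: a driver may leave either one NULL. */
	int (*init_cmd_priv)(struct host *h, struct cmd *c);
	int (*exit_cmd_priv)(struct host *h, struct cmd *c);
};

struct host {
	const struct host_template *hostt;
};

/* Counts hook invocations so callers can observe whether it ran. */
static int exit_calls;

/* Example hook: releases the driver-private data of one command. */
static int demo_exit_cmd_priv(struct host *h, struct cmd *c)
{
	(void)h;
	free(c->priv);
	c->priv = NULL;
	exit_calls++;
	return 0;
}

/*
 * Models the shape of the quoted function: call the hook only if the
 * driver provided one, then release the sense buffer. Note that
 * h->hostt is dereferenced before the NULL test on the hook itself.
 */
static void exit_request(struct host *h, struct cmd *c)
{
	if (h->hostt->exit_cmd_priv)
		h->hostt->exit_cmd_priv(h, c);
	free(c->sense_buffer);
	c->sense_buffer = NULL;
}
```

Calling exit_request() with a template whose exit_cmd_priv is NULL skips the hook and only frees the sense buffer. With demo_exit_cmd_priv installed, the private data is freed first.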

Hi Yi,

Thank you for taking the time to run a bisect. However, I strongly doubt
that the bisection result is correct. If there were anything wrong with the
above patch, it would already have been noticed on other architectures. I
recommend proceeding as follows:
* Verify whether the reported issue only occurs with the stable kernel series or
   also with mainline kernels.
* Work with the soft-iWARP author to improve the reliability of the siw driver.
   If I run blktests in an x86 VM then the following appears sporadically in
   the kernel log:

------------[ cut here ]------------
WARNING: CPU: 18 PID: 5462 at drivers/infiniband/sw/siw/siw_cm.c:255 __siw_cep_dealloc+0x184/0x190 [siw]
CPU: 1 PID: 5462 Comm: kworker/u144:13 Tainted: G            E     5.15.0-rc2-dbg+ #7
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-2 04/01/2014
Workqueue: iw_cm_wq cm_work_handler [iw_cm]
RIP: 0010:__siw_cep_dealloc+0x184/0x190 [siw]
Call Trace:
  siw_cep_put+0x5c/0x80 [siw]
  siw_reject+0x13c/0x230 [siw]
  iw_cm_reject+0xac/0x130 [iw_cm]
  cm_conn_req_handler+0x4f1/0x7d0 [iw_cm]
  cm_work_handler+0x885/0x9c0 [iw_cm]
  process_one_work+0x535/0xad0
  worker_thread+0x2e7/0x700
  kthread+0x1f6/0x220
  ret_from_fork+0x1f/0x30
irq event stamp: 11449266
hardirqs last  enabled at (11449265): [<ffffffff81fc4248>] _raw_spin_unlock_irq+0x28/0x50
hardirqs last disabled at (11449266): [<ffffffff81fb7e44>] __schedule+0x5f4/0xbb0
softirqs last  enabled at (11449176): [<ffffffffa06d142f>] p_fill_from_dev_buffer+0xff/0x140 [scsi_debug]
softirqs last disabled at (11449168): [<ffffffffa06d1400>] p_fill_from_dev_buffer+0xd0/0x140 [scsi_debug]
---[ end trace b23871487c995b72 ]---

* Use the rdma_rxe driver to run blktests since at least in my experience that
   driver is more reliable than the soft-iWARP driver.

Thanks,

Bart.
