From: Yi Zhang <yi.zhang@redhat.com>
To: Bart Van Assche <bvanassche@acm.org>
Cc: Ming Lei <ming.lei@redhat.com>,
	Laurence Oberman <loberman@redhat.com>,
	linux-block <linux-block@vger.kernel.org>,
	CKI Project <cki-project@redhat.com>
Subject: Re: [bug report] blktests srp/013 lead kernel panic with latest block/for-next and 5.13.15
Date: Tue, 5 Oct 2021 11:21:30 +0800
Message-ID: <CAHj4cs_6DkGwj3UJWi3YJYJxuzGST-Yi8x4EVrO4YsHP+Pk=7Q@mail.gmail.com>
In-Reply-To: <491cab4b-1f5b-2881-5ba2-943c23d407ff@acm.org>

On Wed, Sep 29, 2021 at 2:07 AM Bart Van Assche <bvanassche@acm.org> wrote:
>
> On 9/27/21 10:10 PM, Yi Zhang wrote:
> > Hi Bart
> >
> > Bisect shows this issue was introduced by the commit below. BTW, it always reproduces in the s390x KVM environment:
> >
> > commit 65ca846a53149a1a72cd8d02e7b2e73dd545b834
> > Author: Bart Van Assche <bvanassche@acm.org>
> > Date:   Wed Jan 22 19:56:34 2020 -0800
> >
> >      scsi: core: Introduce {init,exit}_cmd_priv()
> >
> >      The current behavior of the SCSI core is to clear driver-private data
> >      before preparing a request for submission to the SCSI LLD. Make it possible
> >      for SCSI LLDs to disable clearing of driver-private data.
> >
> >      These hooks will be used by a later patch, namely "scsi: ufs: Let the SCSI
> >      core allocate per-command UFS data".
> >
> > (gdb) l *(scsi_mq_exit_request+0x2c)
> > 0x8d7be4 is in scsi_mq_exit_request (drivers/scsi/scsi_lib.c:1780).
> > 1775				 unsigned int hctx_idx)
> > 1776	{
> > 1777		struct Scsi_Host *shost = set->driver_data;
> > 1778		struct scsi_cmnd *cmd = blk_mq_rq_to_pdu(rq);
> > 1779
> > 1780		if (shost->hostt->exit_cmd_priv)
> > 1781			shost->hostt->exit_cmd_priv(shost, cmd);
> > 1782		kmem_cache_free(scsi_sense_cache, cmd->sense_buffer);
> > 1783	}
> > 1784
>
> Hi Yi,
>
> Thank you for having taken the time to run a bisect. However, I strongly doubt
> that the bisection result is correct. If there would be anything wrong with the
> above patch it would already have been noticed on other architectures. I
> recommend to proceed as follows:
> * Verify whether the reported issue only occurs with the stable kernel series or
>    also with mainline kernels.

This can be reproduced on both stable kernels and mainline kernels.

> * Work with the soft-iWARP author to improve the reliability of the siw driver.
>    If I run blktests in an x86 VM then the following appears sporadically in
>    the kernel log:
>
> ------------[ cut here ]------------
> WARNING: CPU: 18 PID: 5462 at drivers/infiniband/sw/siw/siw_cm.c:255 __siw_cep_dealloc+0x184/0x190 [siw]
> CPU: 1 PID: 5462 Comm: kworker/u144:13 Tainted: G            E     5.15.0-rc2-dbg+ #7
> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-2 04/01/2014
> Workqueue: iw_cm_wq cm_work_handler [iw_cm]
> RIP: 0010:__siw_cep_dealloc+0x184/0x190 [siw]
> Call Trace:
>   siw_cep_put+0x5c/0x80 [siw]
>   siw_reject+0x13c/0x230 [siw]
>   iw_cm_reject+0xac/0x130 [iw_cm]
>   cm_conn_req_handler+0x4f1/0x7d0 [iw_cm]
>   cm_work_handler+0x885/0x9c0 [iw_cm]
>   process_one_work+0x535/0xad0
>   worker_thread+0x2e7/0x700
>   kthread+0x1f6/0x220
>   ret_from_fork+0x1f/0x30
> irq event stamp: 11449266
> hardirqs last  enabled at (11449265): [<ffffffff81fc4248>] _raw_spin_unlock_irq+0x28/0x50
> hardirqs last disabled at (11449266): [<ffffffff81fb7e44>] __schedule+0x5f4/0xbb0
> softirqs last  enabled at (11449176): [<ffffffffa06d142f>] p_fill_from_dev_buffer+0xff/0x140 [scsi_debug]
> softirqs last disabled at (11449168): [<ffffffffa06d1400>] p_fill_from_dev_buffer+0xd0/0x140 [scsi_debug]
> ---[ end trace b23871487c995b72 ]---
>
> * Use the rdma_rxe driver to run blktests since at least in my experience that
>    driver is more reliable than the soft-iWARP driver.
>

I would suggest reproducing it on the s390x platform, since in my
testing it reproduces easily there.
According to the CKI test history, it has also been reproduced on
ppc64le/aarch64 with rdma_rxe.

BTW, I've verified that this issue is fixed with Ming's patch on
s390x. Thanks for looking into this issue.

https://lore.kernel.org/linux-scsi/20210930124415.1160754-1-ming.lei@redhat.com/T/#u
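
For readers following the gdb listing quoted above, here is a minimal
userspace sketch of the optional-hook pattern at scsi_lib.c:1780. All
names in it (struct host, struct host_template, struct cmd,
exit_request, run_demo) are hypothetical stand-ins and not the kernel
definitions. It only illustrates that the line first loads the
template pointer and then the hook pointer, so both the host and its
template must still be valid when the request is torn down. It makes
no claim about the root cause of the panic.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; NOT the kernel structures. */
struct cmd {
	int priv_released;	/* set by the demo hook */
};

struct host;

struct host_template {
	/* Optional hook: NULL when the driver has no per-command data. */
	void (*exit_cmd_priv)(struct host *h, struct cmd *c);
};

struct host {
	const struct host_template *hostt;
};

/* Demo hook that records that it ran. */
static void demo_exit_cmd_priv(struct host *h, struct cmd *c)
{
	(void)h;
	c->priv_released = 1;
}

/*
 * Same shape as the check in the listing: dereference host, then
 * template, then test the hook before calling it. The assert spells
 * out the precondition that the two dereferences rely on.
 */
static void exit_request(struct host *h, struct cmd *c)
{
	assert(h != NULL && h->hostt != NULL);
	if (h->hostt->exit_cmd_priv)
		h->hostt->exit_cmd_priv(h, c);
}

/* Returns 1 if the hook ran, 0 if it was skipped. */
static int run_demo(int with_hook)
{
	struct host_template t = {
		.exit_cmd_priv = with_hook ? demo_exit_cmd_priv : NULL,
	};
	struct host h = { .hostt = &t };
	struct cmd c = { 0 };

	exit_request(&h, &c);
	return c.priv_released;
}
```

In this sketch run_demo(1) calls the hook and run_demo(0) skips it;
neither case models a stale host or template pointer.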


> Thanks,
>
> Bart.
>


-- 
Best Regards,
  Yi Zhang

