All of lore.kernel.org
 help / color / mirror / Atom feed
From: Laurence Oberman <loberman@redhat.com>
To: Ming Lei <ming.lei@redhat.com>
Cc: Ming Lei <tom.leiming@gmail.com>,
	jianchao.w.wang@oracle.com, Bart Van Assche <bvanassche@acm.org>,
	linux-rdma <linux-rdma@vger.kernel.org>,
	"linux-block@vger.kernel.org" <linux-block@vger.kernel.org>,
	Jens Axboe <axboe@fb.com>,
	linux-scsi <linux-scsi@vger.kernel.org>
Subject: Re: Panic when rebooting target server testing srp on 5.0.0-rc2
Date: Thu, 28 Mar 2019 11:09:58 -0400	[thread overview]
Message-ID: <ea38f0a025d9686ad56bfb0c15dc824a1473915d.camel@redhat.com> (raw)
In-Reply-To: <f1414550876007cee564551561e3b21065b5d7a8.camel@redhat.com>

On Thu, 2019-03-28 at 10:58 -0400, Laurence Oberman wrote:
> On Thu, 2019-03-28 at 09:31 +0800, Ming Lei wrote:
> > On Wed, Mar 27, 2019 at 07:34:37PM -0400, Laurence Oberman wrote:
> > > On Wed, 2019-03-27 at 18:00 -0400, Laurence Oberman wrote:
> > > > On Wed, 2019-03-27 at 08:56 -0400, Laurence Oberman wrote:
> > > > > Truncating email content, starting bisect again as suggested.
> > > > > Email was getting too long with repetition.
> > > > > 
> > > > > Crux of the issue repeated here so easy to understand topic
> > > > > 
> > > > > We got to dispatch passing rq_list and the list is
> > > > > corrupted/freed
> > > > > so
> > > > > we panic. Clearly a race and is in v5.x+ kernels.
> > > > > This new bisect will find it.
> > > > > 
> > > > > crash> bt
> > > > > PID: 9191   TASK: ffff9dea0a8395c0  CPU: 1   COMMAND:
> > > > > "kworker/1:1H"
> > > > >  #0 [ffffa9fe0759fab0] machine_kexec at ffffffff938606cf
> > > > >  #1 [ffffa9fe0759fb08] __crash_kexec at ffffffff9393a48d
> > > > >  #2 [ffffa9fe0759fbd0] crash_kexec at ffffffff9393b659
> > > > >  #3 [ffffa9fe0759fbe8] oops_end at ffffffff93831c41
> > > > >  #4 [ffffa9fe0759fc08] no_context at ffffffff9386ecb9
> > > > >  #5 [ffffa9fe0759fcb0] do_page_fault at ffffffff93870012
> > > > >  #6 [ffffa9fe0759fce0] page_fault at ffffffff942010ee
> > > > >     [exception RIP: blk_mq_dispatch_rq_list+114]
> > > > >     RIP: ffffffff93b9f202  RSP: ffffa9fe0759fd90  RFLAGS:
> > > > > 00010246
> > > > >     RAX: ffff9de9c4d3bbc8  RBX: ffff9de9c4d3bbc8  RCX:
> > > > > 0000000000000004
> > > > >     RDX: 0000000000000000  RSI: ffffa9fe0759fe20  RDI:
> > > > > ffff9dea0dad87f0
> > > > >     RBP: 0000000000000000   R8: 0000000000000000   R9:
> > > > > 8080808080808080
> > > > >     R10: ffff9dea33827660  R11: ffffee9d9e097a00  R12:
> > > > > ffffa9fe0759fe20
> > > > >     R13: ffff9de9c4d3bb80  R14: 0000000000000000  R15:
> > > > > ffff9dea0dad87f0
> > > > >     ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
> > > > >  #7 [ffffa9fe0759fe18] blk_mq_sched_dispatch_requests at
> > > > > ffffffff93ba455c
> > > > >  #8 [ffffa9fe0759fe60] __blk_mq_run_hw_queue at
> > > > > ffffffff93b9e3cf
> > > > >  #9 [ffffa9fe0759fe78] process_one_work at ffffffff938b0c21
> > > > > #10 [ffffa9fe0759feb8] worker_thread at ffffffff938b18d9
> > > > > #11 [ffffa9fe0759ff10] kthread at ffffffff938b6ee8
> > > > > #12 [ffffa9fe0759ff50] ret_from_fork at ffffffff94200215
> > > > > 
> > > > 
> > > > Hello Jens, Jianchao
> > > > Finally made it to this one.
> > > > I will see if I can revert and test
> > > > 
> > > > 7f556a44e61d0b62d78db9a2662a5f0daef010f2 is the first bad
> > > > commit
> > > > commit 7f556a44e61d0b62d78db9a2662a5f0daef010f2
> > > > Author: Jianchao Wang <jianchao.w.wang@oracle.com>
> > > > Date:   Fri Dec 14 09:28:18 2018 +0800
> > > > 
> > > >     blk-mq: refactor the code of issue request directly
> > > >     
> > > >     Merge blk_mq_try_issue_directly and
> > > > __blk_mq_try_issue_directly
> > > >     into one interface to unify the interfaces to issue
> > > > requests
> > > >     directly. The merged interface takes over the requests
> > > > totally,
> > > >     it could insert, end or do nothing based on the return
> > > > value
> > > > of
> > > >     .queue_rq and 'bypass' parameter. Then caller needn't any
> > > > other
> > > >     handling any more and then code could be cleaned up.
> > > >     
> > > >     And also the commit c616cbee ( blk-mq: punt failed direct
> > > > issue
> > > >     to dispatch list ) always inserts requests to hctx dispatch
> > > > list
> > > >     whenever get a BLK_STS_RESOURCE or BLK_STS_DEV_RESOURCE,
> > > > this
> > > > is
> > > >     overkill and will harm the merging. We just need to do that
> > > > for
> > > >     the requests that has been through .queue_rq. This patch
> > > > also
> > > >     could fix this.
> > > >     
> > > >     Signed-off-by: Jianchao Wang <jianchao.w.wang@oracle.com>
> > > >     Signed-off-by: Jens Axboe <axboe@kernel.dk>
> > > > 
> > > > 
> > > > 
> > > 
> > > Cannot clean revert 
> > > 
> > > loberman@ibclient linux]$ git revert
> > > 7f556a44e61d0b62d78db9a2662a5f0daef010f2 
> > > error: could not revert 7f556a4... blk-mq: refactor the code of
> > > issue
> > > request directly
> > > hint: after resolving the conflicts, mark the corrected paths
> > > hint: with 'git add <paths>' or 'git rm <paths>'
> > > hint: and commit the result with 'git commit'
> > > 
> > > Revert "blk-mq: refactor the code of issue request directly"
> > > 
> > > This reverts commit 7f556a44e61d0b62d78db9a2662a5f0daef010f2.
> > > 
> > > Conflicts:
> > >         block/blk-mq.c
> > > 
> > > No clear what in this commit is breaking things and causing the
> > > race
> > 
> > Hi Laurence,
> > 
> > Could you test the following patch?
> > 
> > 'bypass' means the caller handle the dispatch result.
> > 
> > Also, we might remove the handling for 'force', so we can align to
> > blk_mq_dispatch_rq_list(), but that shouldn't be related with this
> > issue.
> > 
> > diff --git a/block/blk-mq.c b/block/blk-mq.c
> > index bc3524428b96..ee4bfd9cbde5 100644
> > --- a/block/blk-mq.c
> > +++ b/block/blk-mq.c
> > @@ -1848,12 +1848,14 @@ blk_status_t
> > blk_mq_try_issue_directly(struct
> > blk_mq_hw_ctx *hctx,
> >  	ret = __blk_mq_issue_directly(hctx, rq, cookie, last);
> >  out_unlock:
> >  	hctx_unlock(hctx, srcu_idx);
> > +	if (bypass)
> > +		return ret;
> >  	switch (ret) {
> >  	case BLK_STS_OK:
> >  		break;
> >  	case BLK_STS_DEV_RESOURCE:
> >  	case BLK_STS_RESOURCE:
> > -		if (force) {
> > +		if (force)
> >  			blk_mq_request_bypass_insert(rq, run_queue);
> >  			/*
> >  			 * We have to return BLK_STS_OK for the DM
> > @@ -1861,18 +1863,14 @@ blk_status_t
> > blk_mq_try_issue_directly(struct
> > blk_mq_hw_ctx *hctx,
> >  			 * the real result to indicate whether the
> >  			 * request is direct-issued successfully.
> >  			 */
> > -			ret = bypass ? BLK_STS_OK : ret;
> > -		} else if (!bypass) {
> > +		else
> >  			blk_mq_sched_insert_request(rq, false,
> >  						    run_queue, false);
> > -		}
> >  		break;
> >  	default:
> > -		if (!bypass)
> > -			blk_mq_end_request(rq, ret);
> > +		blk_mq_end_request(rq, ret);
> >  		break;
> >  	}
> > -
> >  	return ret;
> >  }
> >  
> > Thanks,
> > Ming
> 
> Hello Ming
> 
> Thanks for the patch, unfortunately it did not help.
> I stared at the changes for a while and could not see how it would
> make
> a difference to the race.
> However, thats is mostly because I still need to get my head around
> all
> the block-mq code.
> 
> The test was with this commit I added
> 
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index 70b210a..8952116 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -1834,12 +1834,15 @@ blk_status_t blk_mq_try_issue_directly(struct
> blk_mq_hw_ctx *hctx,
>         ret = __blk_mq_issue_directly(hctx, rq, cookie, last);
>  out_unlock:
>         hctx_unlock(hctx, srcu_idx);
> +        if (bypass)
> +                return ret;
> +
>         switch (ret) {
>         case BLK_STS_OK:
>                 break;
>         case BLK_STS_DEV_RESOURCE:
>         case BLK_STS_RESOURCE:
> -               if (force) {
> +                if (force)
>                         blk_mq_request_bypass_insert(rq, run_queue);
>                         /*
>                          * We have to return BLK_STS_OK for the DM
> @@ -1847,18 +1850,14 @@ blk_status_t blk_mq_try_issue_directly(struct
> blk_mq_hw_ctx *hctx,
>                          * the real result to indicate whether the
>                          * request is direct-issued successfully.
>                          */
> -                       ret = bypass ? BLK_STS_OK : ret;
> -               } else if (!bypass) {
> +       else
>                         blk_mq_sched_insert_request(rq, false,
>                                                     run_queue,
> false);
> -               }
>                 break;
>         default:
> -               if (!bypass)
> -                       blk_mq_end_request(rq, ret);
> +                blk_mq_end_request(rq, ret);
>                 break;
>         }
> -
>         return ret;
>  }
> 
> 
> 
> [  193.068245] device-mapper: multipath: Failing path 66:96.
> [  193.092929] device-mapper: multipath: Failing path 8:96.
> [  193.137068] #PF error: [normal kernel read fault]
> [  195.217691] PGD 0 P4D 0 
> [  195.231958] Oops: 0000 [#1] SMP PTI
> [  195.250796] CPU: 4 PID: 8525 Comm: kworker/4:1H Tainted:
> G        W
> I       5.1.0-rc2+ #2
> [  195.295820] Hardware name: HP ProLiant DL380 G7, BIOS P67
> 08/16/2015
> [  195.330339] Workqueue: kblockd blk_mq_run_work_fn
> [  195.355613] RIP: 0010:blk_mq_dispatch_rq_list+0x72/0x570
> [  195.385494] Code: 08 84 d2 0f 85 cf 03 00 00 45 31 f6 c7 44 24 38
> 00
> 00 00 00 c7 44 24 3c 00 00 00 00 80 3c 24 00 4c 8d 6b b8 48 8b 6b c8
> 75
> 24 <48> 8b 85 b8 00 00 00 48 8b 40 40 48 8b 40 10 48 85 c0 74 10 48
> 89
> [  195.489501] RSP: 0018:ffffbb074781bd90 EFLAGS: 00010246
> [  195.519090] RAX: ffff9f7e6e1de148 RBX: ffff9f7e6e1de148 RCX:
> 0000000000000004
> [  195.560254] RDX: 0000000000000000 RSI: ffffbb074781be20 RDI:
> ffff9f8aa95297d0
> [  195.601196] RBP: 0000000000000000 R08: 0000000000000000 R09:
> 8080808080808080
> [  195.642186] R10: 0000000000000001 R11: 0000000000000000 R12:
> ffffbb074781be20
> [  195.683094] R13: ffff9f7e6e1de100 R14: 0000000000000000 R15:
> ffff9f8aa95297d0
> [  195.724153] FS:  0000000000000000(0000) GS:ffff9f7eb7880000(0000)
> knlGS:0000000000000000
> [  195.769427] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  195.801687] CR2: 00000000000000b8 CR3: 000000020d20e004 CR4:
> 00000000000206e0
> [  195.841126] Call Trace:
> [  195.854014]  ? blk_mq_flush_busy_ctxs+0xca/0x120
> [  195.879448]  blk_mq_sched_dispatch_requests+0x15c/0x180
> [  195.908389]  __blk_mq_run_hw_queue+0x5f/0x100
> [  195.932961]  process_one_work+0x171/0x380
> [  195.956414]  worker_thread+0x49/0x3f0
> [  195.976984]  kthread+0xf8/0x130
> [  195.995078]  ? max_active_store+0x80/0x80
> [  196.018080]  ? kthread_bind+0x10/0x10
> [  196.038383]  ret_from_fork+0x35/0x40
> [  196.058435] Modules linked in: ib_isert iscsi_target_mod
> target_core_mod ib_srp scsi_transport_srp rpcrdma ib_iser rdma_ucm
> ib_umad rdma_cm ib_ipoib sunrpc libiscsi iw_cm scsi_transport_iscsi
> ib_cm mlx5_ib ib_uverbs ib_core intel_powerclamp coretemp kvm_intel
> kvm
> irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel
> aesni_intel
> crypto_simd iTCO_wdt cryptd ipmi_ssif glue_helper gpio_ich
> iTCO_vendor_support dm_service_time pcspkr joydev hpwdt hpilo ipmi_si
> acpi_power_meter pcc_cpufreq sg ipmi_devintf i7core_edac lpc_ich
> ipmi_msghandler dm_multipath ip_tables xfs libcrc32c radeon sd_mod
> i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt
> fb_sys_fops mlx5_core ttm mlxfw drm crc32c_intel serio_raw hpsa ptp
> bnx2 scsi_transport_sas pps_core dm_mirror dm_region_hash dm_log
> dm_mod
> [  196.451938] CR2: 00000000000000b8
> [  196.469994] ---[ end trace b2b431da81df2e95 ]---
> [  196.469996] BUG: unable to handle kernel NULL pointer dereference
> at
> 00000000000000b8
> [  196.495380] RIP: 0010:blk_mq_dispatch_rq_list+0x72/0x570
> [  196.540456] #PF error: [normal kernel read fault]
> [  196.570957] Code: 08 84 d2 0f 85 cf 03 00 00 45 31 f6 c7 44 24 38
> 00
> 00 00 00 c7 44 24 3c 00 00 00 00 80 3c 24 00 4c 8d 6b b8 48 8b 6b c8
> 75
> 24 <48> 8b 85 b8 00 00 00 48 8b 40 40 48 8b 40 10 48 85 c0 74 10 48
> 89
> [  196.597976] PGD 0 P4D 0 
> [  196.705627] RSP: 0018:ffffbb074781bd90 EFLAGS: 00010246
> [  196.720114] Oops: 0000 [#2] SMP PTI
> [  196.749959] RAX: ffff9f7e6e1de148 RBX: ffff9f7e6e1de148 RCX:
> 0000000000000004
> [  196.769936] CPU: 6 PID: 4034 Comm: kworker/6:1H Tainted: G      D
> W
> I       5.1.0-rc2+ #2
> [  196.809000] RDX: 0000000000000000 RSI: ffffbb074781be20 RDI:
> ffff9f8aa95297d0
> [  196.854098] Hardware name: HP ProLiant DL380 G7, BIOS P67
> 08/16/2015
> [  196.893109] RBP: 0000000000000000 R08: 0000000000000000 R09:
> 8080808080808080
> [  196.928283] Workqueue: kblockd blk_mq_run_work_fn
> [  196.968750] R10: 0000000000000001 R11: 0000000000000000 R12:
> ffffbb074781be20
> [  196.995203] RIP: 0010:blk_mq_dispatch_rq_list+0x72/0x570
> [  197.035808] R13: ffff9f7e6e1de100 R14: 0000000000000000 R15:
> ffff9f8aa95297d0
> [  197.065780] Code: 08 84 d2 0f 85 cf 03 00 00 45 31 f6 c7 44 24 38
> 00
> 00 00 00 c7 44 24 3c 00 00 00 00 80 3c 24 00 4c 8d 6b b8 48 8b 6b c8
> 75
> 24 <48> 8b 85 b8 00 00 00 48 8b 40 40 48 8b 40 10 48 85 c0 74 10 48
> 89
> [  197.105262] FS:  0000000000000000(0000) GS:ffff9f7eb7880000(0000)
> knlGS:0000000000000000
> [  197.213467] RSP: 0018:ffffbb074722bd90 EFLAGS: 00010246
> [  197.259374] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  197.289579] RAX: ffff9f7e6e25a908 RBX: ffff9f7e6e25a908 RCX:
> 0000000000000004
> [  197.322352] CR2: 00000000000000b8 CR3: 000000020d20e004 CR4:
> 00000000000206e0
> [  197.361122] RDX: 0000000000000000 RSI: ffffbb074722be20 RDI:
> ffff9f8a19f60000
> [  197.400084] Kernel panic - not syncing: Fatal exception
> [  197.440204] RBP: 0000000000000000 R08: 0000000000000000 R09:
> 8080808080808080
> [  197.509179] R10: 0000000000000001 R11: 0000000000000000 R12:
> ffffbb074722be20
> [  197.549143] R13: ffff9f7e6e25a8c0 R14: 0000000000000000 R15:
> ffff9f8a19f60000
> [  197.590162] FS:  0000000000000000(0000) GS:ffff9f7eb78c0000(0000)
> knlGS:0000000000000000
> [  197.636223] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  197.668680] CR2: 00000000000000b8 CR3: 000000020d20e004 CR4:
> 00000000000206e0
> [  197.708291] Call Trace:
> [  197.722282]  ? blk_mq_flush_busy_ctxs+0xca/0x120
> [  197.748285]  blk_mq_sched_dispatch_requests+0x15c/0x180
> [  197.777145]  __blk_mq_run_hw_queue+0x5f/0x100
> [  197.801406]  process_one_work+0x171/0x380
> [  197.824021]  worker_thread+0x49/0x3f0
> [  197.844756]  kthread+0xf8/0x130
> [  197.862142]  ? max_active_store+0x80/0x80
> [  197.883917]  ? kthread_bind+0x10/0x10
> [  197.904609]  ret_from_fork+0x35/0x40
> [  197.924992] Modules linked in: ib_isert iscsi_target_mod
> target_core_mod ib_srp scsi_transport_srp rpcrdma ib_iser rdma_ucm
> ib_umad rdma_cm ib_ipoib sunrpc libiscsi iw_cm scsi_transport_iscsi
> ib_cm mlx5_ib ib_uverbs ib_core intel_powerclamp coretemp kvm_intel
> kvm
> irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel
> aesni_intel
> crypto_simd iTCO_wdt cryptd ipmi_ssif glue_helper gpio_ich
> iTCO_vendor_support dm_service_time pcspkr joydev hpwdt hpilo ipmi_si
> acpi_power_meter pcc_cpufreq sg ipmi_devintf i7core_edac lpc_ich
> ipmi_msghandler dm_multipath ip_tables xfs libcrc32c radeon sd_mod
> i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt
> fb_sys_fops mlx5_core ttm mlxfw drm crc32c_intel serio_raw hpsa ptp
> bnx2 scsi_transport_sas pps_core dm_mirror dm_region_hash dm_log
> dm_mod
> [  198.317894] CR2: 00000000000000b8
> [  198.336570] ---[ end trace b2b431da81df2e96 ]---
> [  198.336571] BUG: unable to handle kernel NULL pointer dereference
> at
> 00000000000000b8
> [  198.336572] #PF error: [normal kernel read fault]
> [  198.362241] RIP: 0010:blk_mq_dispatch_rq_list+0x72/0x570
> [  198.405359] PGD 0 P4D 0 
> [  198.432068] Code: 08 84 d2 0f 85 cf 03 00 00 45 31 f6 c7 44 24 38
> 00[  198.461906] Oops: 0000 [#3] SMP PTI
> 

I am going to run the same test not using srp but for example qla2xxx.
This issue is clearly in the block layer, but I wonder why I am the
first to report it.
The exposure, if it happens also on F/C luns means we should have seen
panics when arrays were rebooted with any kernel >= v5.0-rc1




  reply	other threads:[~2019-03-28 15:10 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <6e19971d315f4a3ce2cc20a1c6693f4a263a280c.camel@redhat.com>
     [not found] ` <7858e19ce3fc3ebf7845494a2209c58cd9e3086d.camel@redhat.com>
     [not found]   ` <1553113730.65329.60.camel@acm.org>
     [not found]     ` <3645c45e88523d4b242333d96adbb492ab100f97.camel@redhat.com>
     [not found]       ` <d9febee1de759206bbf4d66f6570415ae64e4f33.camel@redhat.com>
2019-03-26 15:16         ` Panic when rebooting target server testing srp on 5.0.0-rc2 Laurence Oberman
2019-03-26 19:06           ` Laurence Oberman
2019-04-03 14:53             ` Laurence Oberman
2019-04-03 15:09               ` Bart Van Assche
2019-04-03 17:58                 ` Laurence Oberman
2019-03-27  1:40           ` Ming Lei
2019-03-27 12:35             ` Laurence Oberman
2019-03-27 12:56               ` Laurence Oberman
2019-03-27 22:00                 ` Laurence Oberman
2019-03-27 23:34                   ` Laurence Oberman
2019-03-28  1:31                     ` Ming Lei
2019-03-28 14:58                       ` Laurence Oberman
2019-03-28 15:09                         ` Laurence Oberman [this message]
2019-04-03 17:52                   ` Bart Van Assche
2019-04-03 17:59                     ` Laurence Oberman
2019-04-03 18:43                     ` Laurence Oberman
2019-04-03 21:47                     ` Sagi Grimberg
2019-04-03 21:54                       ` Bart Van Assche

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ea38f0a025d9686ad56bfb0c15dc824a1473915d.camel@redhat.com \
    --to=loberman@redhat.com \
    --cc=axboe@fb.com \
    --cc=bvanassche@acm.org \
    --cc=jianchao.w.wang@oracle.com \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-rdma@vger.kernel.org \
    --cc=linux-scsi@vger.kernel.org \
    --cc=ming.lei@redhat.com \
    --cc=tom.leiming@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.