* Re: Panic when rebooting target server testing srp on 5.0.0-rc2
       [not found]       ` <d9febee1de759206bbf4d66f6570415ae64e4f33.camel@redhat.com>
@ 2019-03-26 15:16         ` Laurence Oberman
  2019-03-26 19:06           ` Laurence Oberman
  2019-03-27  1:40           ` Ming Lei
  0 siblings, 2 replies; 18+ messages in thread
From: Laurence Oberman @ 2019-03-26 15:16 UTC (permalink / raw)
  To: Bart Van Assche, linux-rdma, linux-block, Jens Axboe; +Cc: linux-scsi

On Thu, 2019-03-21 at 08:44 -0400, Laurence Oberman wrote:
> On Wed, 2019-03-20 at 16:35 -0400, Laurence Oberman wrote:
> > On Wed, 2019-03-20 at 13:28 -0700, Bart Van Assche wrote:
> > > On Wed, 2019-03-20 at 11:11 -0400, Laurence Oberman wrote:
> > > > On Wed, 2019-03-20 at 09:45 -0400, Laurence Oberman wrote:
> > > > > Hello Bart, I hope all is well with you.
> > > > > 
> > > > > Quick question
> > > > > preparing to test v5.1-rc2 SRP, my usual method is first
> > > > > validate
> > > > > the
> > > > > prior kernel I had in place.
> > > > > This had passed tests previously (5.0.0-rc2) but I had not
> > > > > run
> > > > > the
> > > > > target server reboot test, just the disconnect tests.
> > > > > 
> > > > > Today with mapped SRP devices I rebooted the target server
> > > > > and
> > > > > the
> > > > > client panicked.
> > > > > 
> > > > > Its been a while and I have been so busy that have not kept
> > > > > up
> > > > > with
> > > > > all
> > > > > the fixes. Is this a known issue.
> > > > > 
> > > > > Thanks
> > > > > Laurence
> > > > > 
> > > > > 5414228.917507] scsi host2: ib_srp: Path record query failed:
> > > > > sgid
> > > > > fe80:0000:0000:0000:7cfe:9003:0072:6ed3, dgid
> > > > > fe80:0000:0000:0000:7cfe:9003:0072:6e4f, pkey 0xffff,
> > > > > service_id
> > > > > 0x7cfe900300726e4e
> > > > > [5414229.014355] scsi host2: reconnect attempt 7 failed (-
> > > > > 110)
> > > > > [5414239.318161] scsi host2: ib_srp: Sending CM DREQ failed
> > > > > [5414239.318165] scsi host2: ib_srp: Sending CM DREQ failed
> > > > > [5414239.318167] scsi host2: ib_srp: Sending CM DREQ failed
> > > > > [5414239.318168] scsi host2: ib_srp: Sending CM DREQ failed
> > > > > [5414239.318170] scsi host2: ib_srp: Sending CM DREQ failed
> > > > > [5414239.318172] scsi host2: ib_srp: Sending CM DREQ failed
> > > > > [5414239.318173] scsi host2: ib_srp: Sending CM DREQ failed
> > > > > [5414239.318175] scsi host2: ib_srp: Sending CM DREQ failed
> > > > > [5414243.670072] scsi host2: ib_srp: Got failed path rec
> > > > > status
> > > > > -110
> > > > > [5414243.702179] scsi host2: ib_srp: Path record query
> > > > > failed:
> > > > > sgid
> > > > > fe80:0000:0000:0000:7cfe:9003:0072:6ed3, dgid
> > > > > fe80:0000:0000:0000:7cfe:9003:0072:6e4f, pkey 0xffff,
> > > > > service_id
> > > > > 0x7cfe900300726e4e
> > > > > [5414243.799313] scsi host2: reconnect attempt 8 failed (-
> > > > > 110)
> > > > > [5414247.510115] scsi host1: ib_srp: Sending CM REQ failed
> > > > > [5414247.510140] scsi host1: reconnect attempt 1 failed (-
> > > > > 104)
> > > > > [5414247.849078] BUG: unable to handle kernel NULL pointer
> > > > > dereference
> > > > > at 00000000000000b8
> > > > > [5414247.893793] #PF error: [normal kernel read fault]
> > > > > [5414247.921839] PGD 0 P4D 0 
> > > > > [5414247.937280] Oops: 0000 [#1] SMP PTI
> > > > > [5414247.958332] CPU: 4 PID: 7773 Comm: kworker/4:1H Kdump:
> > > > > loaded
> > > > > Tainted: G          I       5.0.0-rc2+ #2
> > > > > [5414248.012856] Hardware name: HP ProLiant DL380 G7, BIOS
> > > > > P67
> > > > > 08/16/2015
> > > > > [5414248.026174] device-mapper: multipath: Failing path 8:48.
> > > > > 
> > > > > 
> > > > > [5414248.050003] Workqueue: kblockd blk_mq_run_work_fn
> > > > > [5414248.108378] RIP: 0010:blk_mq_dispatch_rq_list+0xc9/0x590
> > > > > [5414248.139724] Code: 0f 85 c2 04 00 00 83 44 24 28 01 48 8b
> > > > > 45
> > > > > 00
> > > > > 48
> > > > > 39 c5 0f 84 ea 00 00 00 48 8b 5d 00 80 3c 24 00 4c 8d 6b b8
> > > > > 4c
> > > > > 8b
> > > > > 63
> > > > > c8
> > > > > 75 25 <49> 8b 84 24 b8 00 00 00 48 8b 40 40 48 8b 40 10 48 85
> > > > > c0
> > > > > 74
> > > > > 10
> > > > > 4c
> > > > > [5414248.246176] RSP: 0018:ffffb1cd8760fd90 EFLAGS: 00010246
> > > > > [5414248.275599] RAX: ffffa049d67a1308 RBX: ffffa049d67a1308
> > > > > RCX:
> > > > > 0000000000000004
> > > > > [5414248.316090] RDX: 0000000000000000 RSI: ffffb1cd8760fe20
> > > > > RDI:
> > > > > ffffa0552ca08000
> > > > > [5414248.356884] RBP: ffffb1cd8760fe20 R08: 0000000000000000
> > > > > R09:
> > > > > 8080808080808080
> > > > > [5414248.397632] R10: 0000000000000001 R11: 0000000000000001
> > > > > R12:
> > > > > 0000000000000000
> > > > > [5414248.439323] R13: ffffa049d67a12c0 R14: 0000000000000000
> > > > > R15:
> > > > > ffffa0552ca08000
> > > > > [5414248.481743] FS:  0000000000000000(0000)
> > > > > GS:ffffa04a37880000(0000)
> > > > > knlGS:0000000000000000
> > > > > [5414248.528310] CS:  0010 DS: 0000 ES: 0000 CR0:
> > > > > 0000000080050033
> > > > > [5414248.561779] CR2: 00000000000000b8 CR3: 0000000e9d40e004
> > > > > CR4:
> > > > > 00000000000206e0
> > > > > [5414248.602420] Call Trace:
> > > > > [5414248.616660]  blk_mq_sched_dispatch_requests+0x15c/0x180
> > > > > [5414248.647066]  __blk_mq_run_hw_queue+0x5f/0xf0
> > > > > [5414248.672633]  process_one_work+0x171/0x370
> > > > > [5414248.695443]  worker_thread+0x49/0x3f0
> > > > > [5414248.716730]  kthread+0xf8/0x130
> > > > > [5414248.735085]  ? max_active_store+0x80/0x80
> > > > > [5414248.758569]  ? kthread_bind+0x10/0x10
> > > > > [5414248.779953]  ret_from_fork+0x35/0x40
> > > > > 
> > > > > [5414248.801005] Modules linked in: ib_isert iscsi_target_mod
> > > > > target_core_mod ib_srp rpcrdma scsi_transport_srp rdma_ucm
> > > > > ib_iser
> > > > > ib_ipoib ib_umad rdma_cm libiscsi iw_cm scsi_transport_iscsi
> > > > > ib_cm
> > > > > sunrpc mlx5_ib ib_uverbs ib_core intel_powerclamp coretemp
> > > > > kvm_intel
> > > > > kvm irqbypass ipmi_ssif crct10dif_pclmul crc32_pclmul
> > > > > ghash_clmulni_intel aesni_intel iTCO_wdt crypto_simd cryptd
> > > > > gpio_ich
> > > > > iTCO_vendor_support glue_helper joydev ipmi_si
> > > > > dm_service_time
> > > > > pcspkr
> > > > > ipmi_devintf hpilo hpwdt sg ipmi_msghandler acpi_power_meter
> > > > > lpc_ich
> > > > > i7core_edac pcc_cpufreq dm_multipath ip_tables xfs libcrc32c
> > > > > radeon
> > > > > sd_mod i2c_algo_bit drm_kms_helper syscopyarea sysfillrect
> > > > > sysimgblt
> > > > > fb_sys_fops ttm drm mlx5_core crc32c_intel serio_raw i2c_core
> > > > > hpsa
> > > > > bnx2
> > > > > scsi_transport_sas mlxfw devlink dm_mirror dm_region_hash
> > > > > dm_log
> > > > > dm_mod
> > > > > [5414249.199354] CR2: 00000000000000b8
> > > > > 
> > > > > 
> > > > 
> > > > Looking at the vmcore
> > > > 
> > > > PID: 7773   TASK: ffffa04a2c1e2b80  CPU: 4   COMMAND:
> > > > "kworker/4:1H"
> > > >  #0 [ffffb1cd8760fab0] machine_kexec at ffffffffaaa6003f
> > > >  #1 [ffffb1cd8760fb08] __crash_kexec at ffffffffaab373ed
> > > >  #2 [ffffb1cd8760fbd0] crash_kexec at ffffffffaab385b9
> > > >  #3 [ffffb1cd8760fbe8] oops_end at ffffffffaaa31931
> > > >  #4 [ffffb1cd8760fc08] no_context at ffffffffaaa6eb59
> > > >  #5 [ffffb1cd8760fcb0] do_page_fault at ffffffffaaa6feb2
> > > >  #6 [ffffb1cd8760fce0] page_fault at ffffffffab2010ee
> > > >     [exception RIP: blk_mq_dispatch_rq_list+201]
> > > >     RIP: ffffffffaad90589  RSP: ffffb1cd8760fd90  RFLAGS:
> > > > 00010246
> > > >     RAX: ffffa049d67a1308  RBX: ffffa049d67a1308  RCX:
> > > > 0000000000000004
> > > >     RDX: 0000000000000000  RSI: ffffb1cd8760fe20  RDI:
> > > > ffffa0552ca08000
> > > >     RBP: ffffb1cd8760fe20   R8: 0000000000000000   R9:
> > > > 8080808080808080
> > > >     R10: 0000000000000001  R11: 0000000000000001  R12:
> > > > 0000000000000000
> > > >     R13: ffffa049d67a12c0  R14: 0000000000000000  R15:
> > > > ffffa0552ca08000
> > > >     ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
> > > >  #7 [ffffb1cd8760fe18] blk_mq_sched_dispatch_requests at
> > > > ffffffffaad9570c
> > > >  #8 [ffffb1cd8760fe60] __blk_mq_run_hw_queue at
> > > > ffffffffaad8de3f
> > > >  #9 [ffffb1cd8760fe78] process_one_work at ffffffffaaab0ab1
> > > > #10 [ffffb1cd8760feb8] worker_thread at ffffffffaaab11d9
> > > > #11 [ffffb1cd8760ff10] kthread at ffffffffaaab6758
> > > > #12 [ffffb1cd8760ff50] ret_from_fork at ffffffffab200215
> > > > 
> > > > We were working on this request_queue for
> > > >  blk_mq_sched_dispatch_requests
> > > > 
> > > > crash> dev -d | grep ffffa0552ca08000
> > > >     8
> > > > ffffa055c81b5800   sdd        ffffa0552ca08000       0     0   
> > > >   
> > > > 0 
> > > > ]
> > > > 
> > > > That device was no longer accessible 
> > > > 
> > > > sdev_state = SDEV_TRANSPORT_OFFLINE,
> > > > 
> > > > So it looks like we tried to process a no longer valid list
> > > > entry
> > > > in 
> > > > blk_mq_dispatch_rq_list
> > > > 
> > > > /home/loberman/rpmbuild/BUILD/kernel-5.0.0_rc2+/block/blk-mq.h:
> > > > 211
> > > > 0xffffffffaad90589
> > > > <blk_mq_dispatch_rq_list+201>:       mov    0xb8(%r12),%rax
> > > > 
> > > > R12 is NULL
> > > > 
> > > > 
> > > > From
> > > > static inline bool blk_mq_get_dispatch_budget(struct
> > > > blk_mq_hw_ctx
> > > > *hctx)
> > > > {
> > > >         struct request_queue *q = hctx->queue;
> > > > 
> > > >         if (q->mq_ops->get_budget)
> > > >                 return q->mq_ops->get_budget(hctx);
> > > >         return true;
> > > > }
> > > > 
> > > > > Will wait for a reply before I try the newer kernel, but it looks
> > > > > like a use-after-free to me.
> > > 
> > > Hi Laurence,
> > > 
> > > I don't think that any of the recent SRP initiator changes can be
> > > the
> > > root
> > > cause of this crash. However, significant changes went upstream
> > > in
> > > the block
> > > layer core during the v5.1-rc1 merge window, e.g. multi-page bvec
> > > support.
> > > Is it possible for you to bisect this kernel oops?
> > > 
> > > Thanks,
> > > 
> > > Bart.
> > > 
> > 
> > OK, I will see I can reproduce on the fly and I will bisect.
> > I do agree its not SRP, more likley some block layer race.
> > I was just able to reproduce it using SRP
> > 
> > Note that this was on 5.0.0-rc2+, prior to me trying 5.1.
> > 
> > I usually reboot the target server as part of my test series but
> > when
> > I
> > last tested 5.0.0-rc2+ I only reset the SRP interfaces and had
> > devices
> > get rediscovered.
> > 
> > I did not see it during those tests.
> > 
> > Back when I have more to share.
> > 
> > Many Thanks for your time as always
> > Laurence
> > 
> > 
> 
> Something crept in, in the block layer causing a use after free
> 4.19.0-rc1+ does not have the issue, so I will narrow the bisect 
> Thanks
> Laurence
> 

This took a long time to bisect.
To repeat the issue seen: there is a change such that when the target is
rebooted with mapped SRP devices, the client then experiences a list
corruption and panic, as already shown.

Some stacks

[  222.631998] scsi host1: ib_srp: Path record query failed: sgid
fe80:0000:0000:0000:7cfe:9003:0072:6ed2, dgid
fe80:0000:0000:0000:7cfe:9003:0072:6e4e, pkey 0xffff, service_id
0x7cfe900300726e4e
[  222.729639] scsi host1: reconnect attempt 1 failed (-110)
[  223.176766] fast_io_fail_tmo expired for SRP port-1:1 / host1.

[  223.518759] BUG: unable to handle kernel NULL pointer dereference at
00000000000000b8
[  223.519736] sd 1:0:0:0: rejecting I/O to offline device
[  223.563769] #PF error: [normal kernel read fault]
[  223.563770] PGD 0 P4D 0 
[  223.563774] Oops: 0000 [#1] SMP PTI
[  223.563778] CPU: 3 PID: 9027 Comm: kworker/3:1H Tainted:
G          I       5.0.0-rc1 #22
[  223.563779] Hardware name: HP ProLiant DL380 G7, BIOS P67 08/16/2015
[  223.563787] Workqueue: kblockd blk_mq_run_work_fn
[  223.593723] device-mapper: multipath: Failing path 8:48.
[  223.620801] RIP: 0010:blk_mq_dispatch_rq_list+0xc9/0x590
[  223.635266] print_req_error: I/O error, dev dm-6, sector 8191872
flags 80700
[  223.655565] Code: 0f 85 c2 04 00 00 83 44 24 28 01 48 8b 45 00 48 39
c5 0f 84 ea 00 00 00 48 8b 5d 00 80 3c 24 00 4c 8d 6b b8 4c 8b 63 c8 75
25 <49> 8b 84 24 b8 00 00 00 48 8b 40 40 48 8b 40 10 48 85 c0 74 10 4c
[  223.655566] RSP: 0018:ffffa65b4c43fd90 EFLAGS: 00010246
[  223.655570] RAX: ffff93ed9bfdbbc8 RBX: ffff93ed9bfdbbc8 RCX:
0000000000000004
[  223.702351] print_req_error: I/O error, dev dm-6, sector 8191872
flags 0
[  223.737640] RDX: 0000000000000000 RSI: ffffa65b4c43fe20 RDI:
ffff93ed9b838000
[  223.737641] RBP: ffffa65b4c43fe20 R08: 0000000000000000 R09:
8080808080808080
[  223.737642] R10: 0000000000000001 R11: 071c71c71c71c71c R12:
0000000000000000
[  223.737643] R13: ffff93ed9bfdbb80 R14: 0000000000000000 R15:
ffff93ed9b838000
[  223.737645] FS:  0000000000000000(0000) GS:ffff93ee33840000(0000)
knlGS:0000000000000000
[  223.737646] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  223.737646] CR2: 00000000000000b8 CR3: 000000059da0e006 CR4:
00000000000206e0
[  223.737648] Call Trace:          

[  223.737657]  blk_mq_sched_dispatch_requests+0x15c/0x180  *** Freed
already

[  223.737660]  __blk_mq_run_hw_queue+0x5f/0xf0
[  223.737665]  process_one_work+0x171/0x370
[  223.737667]  worker_thread+0x49/0x3f0
[  223.737670]  kthread+0xf8/0x130
[  223.737673]  ? max_active_store+0x80/0x80

And:

[  643.425005] device-mapper: multipath: Failing path 67:0.
[  643.696365] ------------[ cut here ]------------
[  643.722927] list_add corruption. prev->next should be next
(ffffc8c0c3bd9448), but was ffff93b971965e08. (prev=ffff93b971965e08).
[  643.787089] WARNING: CPU: 14 PID: 6533 at lib/list_debug.c:28
__list_add_valid+0x6a/0x70
[  643.830951] Modules linked in: ib_isert iscsi_target_mod
target_core_mod ib_srp scsi_transport_srp rpcrdma ib_ipoib ib_umad
rdma_ucm ib_iser rdma_cm iw_cm libiscsi sunrpc scsi_transport_iscsi
ib_cm mlx5_ib ib_uverbs ib_core intel_powerclamp coretemp kvm_intel kvm
irqbypass crct10dif_pclmul crc32_pclmul iTCO_wdt ghash_clmulni_intel
ipmi_ssif aesni_intel iTCO_vendor_support gpio_ich crypto_simd
dm_service_time cryptd glue_helper joydev ipmi_si ipmi_devintf pcspkr
hpilo hpwdt sg ipmi_msghandler acpi_power_meter pcc_cpufreq lpc_ich
i7core_edac dm_multipath ip_tables xfs libcrc32c radeon sd_mod
i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt
fb_sys_fops ttm drm crc32c_intel serio_raw mlx5_core i2c_core bnx2 hpsa
mlxfw scsi_transport_sas dm_mirror dm_region_hash dm_log dm_mod
[  644.224150] CPU: 14 PID: 6533 Comm: kworker/14:1H Tainted:
G          I       4.20.0+ #26
[  644.269637] Hardware name: HP ProLiant DL380 G7, BIOS P67 08/16/2015
[  644.305110] Workqueue: kblockd blk_mq_run_work_fn
[  644.330984] RIP: 0010:__list_add_valid+0x6a/0x70
[  644.357601] Code: 31 c0 48 c7 c7 70 e4 ab b6 e8 22 a6 cc ff 0f 0b 31
c0 c3 48 89 d1 48 c7 c7 20 e4 ab b6 48 89 f2 48 89 c6 31 c0 e8 06 a6 cc
ff <0f> 0b 31 c0 c3 90 48 8b 07 48 b9 00 01 00 00 00 00 ad de 48 8b 57
[  644.462542] RSP: 0018:ffffa8ccc72bfc00 EFLAGS: 00010286
[  644.491594] RAX: 0000000000000000 RBX: ffff93b971965dc0 RCX:
0000000000000000
[  644.532745] RDX: 0000000000000001 RSI: ffff93b9b79d67b8 RDI:
ffff93b9b79d67b8
[  644.573533] RBP: ffffc8c0c3bd9448 R08: 0000000000000000 R09:
000000000000072c
[  644.614180] R10: 0000000000000000 R11: ffffa8ccc72bf968 R12:
ffff93b96d454c00
[  644.654683] R13: ffffc8c0c3bd9440 R14: ffff93b971965e08 R15:
ffff93b971965e08
[  644.694275] FS:  0000000000000000(0000) GS:ffff93b9b79c0000(0000)
knlGS:0000000000000000
[  644.739906] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  644.771879] CR2: 00007fd566c10000 CR3: 0000000253e0e002 CR4:
00000000000206e0
[  644.811438] Call Trace:
[  644.824809]  __blk_mq_insert_request+0x62/0x130
[  644.849886]  blk_mq_sched_insert_request+0x13c/0x1b0
[  644.877402]  blk_mq_try_issue_directly+0x105/0x2c0
[  644.904452]  blk_insert_cloned_request+0x9a/0x130
[  644.931146]  ? ktime_get+0x37/0x90
[  644.950545]  dm_mq_queue_rq+0x21c/0x3f0 [dm_mod]
[  644.977064]  ? blk_mq_get_driver_tag+0xa1/0x120
[  645.003002]  blk_mq_dispatch_rq_list+0x8e/0x590
[  645.029812]  ? __switch_to_asm+0x40/0x70
[  645.052059]  ? __switch_to_asm+0x34/0x70
[  645.074664]  ? __switch_to_asm+0x40/0x70
[  645.097111]  ? __switch_to_asm+0x34/0x70
[  645.119948]  ? __switch_to_asm+0x34/0x70
[  645.143101]  ? __switch_to_asm+0x40/0x70
[  645.165273]  ? syscall_return_via_sysret+0x10/0x7f
[  645.192161]  blk_mq_sched_dispatch_requests+0xe8/0x180
[  645.221460]  __blk_mq_run_hw_queue+0x5f/0xf0
[  645.244766]  process_one_work+0x171/0x370
[  645.267164]  worker_thread+0x49/0x3f0
[  645.287860]  kthread+0xf8/0x130
[  645.304894]  ? max_active_store+0x80/0x80
[  645.327748]  ? kthread_bind+0x10/0x10
[  645.347898]  ret_from_fork+0x35/0x40
[  645.368298] ---[ end trace afa70bf68ffb006b ]---
[  645.397356] ------------[ cut here ]------------
[  645.423878] list_add corruption. prev->next should be next
(ffffc8c0c3bd9448), but was ffff93b971965e08. (prev=ffff93b971965e08).
[  645.488009] WARNING: CPU: 14 PID: 6533 at lib/list_debug.c:28
__list_add_valid+0x6a/0x70


I started by looking just at block, but nothing made sense, so I re-ran the
bisect over the entire kernel.

$ git bisect start 
$ git bisect good v4.20
$ git bisect bad v5.0-rc1

Bisecting: 5477 revisions left to test after this (roughly 13 steps)
*** Groan

I got to here, and the problem is that it's an entire merge:

[loberman@ibclient linux_torvalds]$ git bisect bad
4b9254328254bed12a4ac449cdff2c332e630837 is the first bad commit
[loberman@ibclient linux_torvalds]$ git show
4b9254328254bed12a4ac449cdff2c332e630837
commit 4b9254328254bed12a4ac449cdff2c332e630837
Merge: 1a9430d cd19181
Author: Jens Axboe <axboe@kernel.dk>
Date:   Tue Dec 18 08:29:53 2018 -0700

    Merge branch 'for-4.21/block' into for-4.21/aio
    
    * for-4.21/block: (351 commits)
      blk-mq: enable IO poll if .nr_queues of type poll > 0
      blk-mq: change blk_mq_queue_busy() to blk_mq_queue_inflight()
      blk-mq: skip zero-queue maps in blk_mq_map_swqueue
      block: fix blk-iolatency accounting underflow
      blk-mq: fix dispatch from sw queue
      block: mq-deadline: Fix write completion handling
      nvme-pci: don't share queue maps
      blk-mq: only dispatch to non-defauly queue maps if they have
queues
      blk-mq: export hctx->type in debugfs instead of sysfs
      blk-mq: fix allocation for queue mapping table
      blk-wbt: export internal state via debugfs
      blk-mq-debugfs: support rq_qos
      block: update sysfs documentation
      block: loop: check error using IS_ERR instead of IS_ERR_OR_NULL
in loop_add()
      aoe: add __exit annotation
      block: clear REQ_HIPRI if polling is not supported
      blk-mq: replace and kill blk_mq_request_issue_directly
      blk-mq: issue directly with bypass 'false' in
blk_mq_sched_insert_requests
      blk-mq: refactor the code of issue request directly
      block: remove the bio_integrity_advance export
      ...

Definitely looks like a list corruption and use-after-free, but it's an
entire merge, so I'm looking for ideas now from the block folks.
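
One way to enumerate what that merge pulled in, so I can walk the commits by
heading, is a range log over the two parents shown by git show above (a
sketch; the abbreviated SHAs are the ones printed on the "Merge:" line):

$ git log --oneline 1a9430d..cd19181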




^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Panic when rebooting target server testing srp on 5.0.0-rc2
  2019-03-26 15:16         ` Panic when rebooting target server testing srp on 5.0.0-rc2 Laurence Oberman
@ 2019-03-26 19:06           ` Laurence Oberman
  2019-04-03 14:53             ` Laurence Oberman
  2019-03-27  1:40           ` Ming Lei
  1 sibling, 1 reply; 18+ messages in thread
From: Laurence Oberman @ 2019-03-26 19:06 UTC (permalink / raw)
  To: Bart Van Assche, linux-rdma, linux-block, Jens Axboe; +Cc: linux-scsi

On Tue, 2019-03-26 at 11:16 -0400, Laurence Oberman wrote:
> On Thu, 2019-03-21 at 08:44 -0400, Laurence Oberman wrote:
> > On Wed, 2019-03-20 at 16:35 -0400, Laurence Oberman wrote:
> > > On Wed, 2019-03-20 at 13:28 -0700, Bart Van Assche wrote:
> > > > On Wed, 2019-03-20 at 11:11 -0400, Laurence Oberman wrote:
> > > > > On Wed, 2019-03-20 at 09:45 -0400, Laurence Oberman wrote:
> > > > > > Hello Bart, I hope all is well with you.
> > > > > > 
> > > > > > Quick question
> > > > > > preparing to test v5.1-rc2 SRP, my usual method is first
> > > > > > validate
> > > > > > the
> > > > > > prior kernel I had in place.
> > > > > > This had passed tests previously (5.0.0-rc2) but I had not
> > > > > > run
> > > > > > the
> > > > > > target server reboot test, just the disconnect tests.
> > > > > > 
> > > > > > Today with mapped SRP devices I rebooted the target server
> > > > > > and
> > > > > > the
> > > > > > client panicked.
> > > > > > 
> > > > > > Its been a while and I have been so busy that have not kept
> > > > > > up
> > > > > > with
> > > > > > all
> > > > > > the fixes. Is this a known issue.
> > > > > > 
> > > > > > Thanks
> > > > > > Laurence
> > > > > > 
> > > > > > 5414228.917507] scsi host2: ib_srp: Path record query
> > > > > > failed:
> > > > > > sgid
> > > > > > fe80:0000:0000:0000:7cfe:9003:0072:6ed3, dgid
> > > > > > fe80:0000:0000:0000:7cfe:9003:0072:6e4f, pkey 0xffff,
> > > > > > service_id
> > > > > > 0x7cfe900300726e4e
> > > > > > [5414229.014355] scsi host2: reconnect attempt 7 failed (-
> > > > > > 110)
> > > > > > [5414239.318161] scsi host2: ib_srp: Sending CM DREQ failed
> > > > > > [5414239.318165] scsi host2: ib_srp: Sending CM DREQ failed
> > > > > > [5414239.318167] scsi host2: ib_srp: Sending CM DREQ failed
> > > > > > [5414239.318168] scsi host2: ib_srp: Sending CM DREQ failed
> > > > > > [5414239.318170] scsi host2: ib_srp: Sending CM DREQ failed
> > > > > > [5414239.318172] scsi host2: ib_srp: Sending CM DREQ failed
> > > > > > [5414239.318173] scsi host2: ib_srp: Sending CM DREQ failed
> > > > > > [5414239.318175] scsi host2: ib_srp: Sending CM DREQ failed
> > > > > > [5414243.670072] scsi host2: ib_srp: Got failed path rec
> > > > > > status
> > > > > > -110
> > > > > > [5414243.702179] scsi host2: ib_srp: Path record query
> > > > > > failed:
> > > > > > sgid
> > > > > > fe80:0000:0000:0000:7cfe:9003:0072:6ed3, dgid
> > > > > > fe80:0000:0000:0000:7cfe:9003:0072:6e4f, pkey 0xffff,
> > > > > > service_id
> > > > > > 0x7cfe900300726e4e
> > > > > > [5414243.799313] scsi host2: reconnect attempt 8 failed (-
> > > > > > 110)
> > > > > > [5414247.510115] scsi host1: ib_srp: Sending CM REQ failed
> > > > > > [5414247.510140] scsi host1: reconnect attempt 1 failed (-
> > > > > > 104)
> > > > > > [5414247.849078] BUG: unable to handle kernel NULL pointer
> > > > > > dereference
> > > > > > at 00000000000000b8
> > > > > > [5414247.893793] #PF error: [normal kernel read fault]
> > > > > > [5414247.921839] PGD 0 P4D 0 
> > > > > > [5414247.937280] Oops: 0000 [#1] SMP PTI
> > > > > > [5414247.958332] CPU: 4 PID: 7773 Comm: kworker/4:1H Kdump:
> > > > > > loaded
> > > > > > Tainted: G          I       5.0.0-rc2+ #2
> > > > > > [5414248.012856] Hardware name: HP ProLiant DL380 G7, BIOS
> > > > > > P67
> > > > > > 08/16/2015
> > > > > > [5414248.026174] device-mapper: multipath: Failing path
> > > > > > 8:48.
> > > > > > 
> > > > > > 
> > > > > > [5414248.050003] Workqueue: kblockd blk_mq_run_work_fn
> > > > > > [5414248.108378] RIP:
> > > > > > 0010:blk_mq_dispatch_rq_list+0xc9/0x590
> > > > > > [5414248.139724] Code: 0f 85 c2 04 00 00 83 44 24 28 01 48
> > > > > > 8b
> > > > > > 45
> > > > > > 00
> > > > > > 48
> > > > > > 39 c5 0f 84 ea 00 00 00 48 8b 5d 00 80 3c 24 00 4c 8d 6b b8
> > > > > > 4c
> > > > > > 8b
> > > > > > 63
> > > > > > c8
> > > > > > 75 25 <49> 8b 84 24 b8 00 00 00 48 8b 40 40 48 8b 40 10 48
> > > > > > 85
> > > > > > c0
> > > > > > 74
> > > > > > 10
> > > > > > 4c
> > > > > > [5414248.246176] RSP: 0018:ffffb1cd8760fd90 EFLAGS:
> > > > > > 00010246
> > > > > > [5414248.275599] RAX: ffffa049d67a1308 RBX:
> > > > > > ffffa049d67a1308
> > > > > > RCX:
> > > > > > 0000000000000004
> > > > > > [5414248.316090] RDX: 0000000000000000 RSI:
> > > > > > ffffb1cd8760fe20
> > > > > > RDI:
> > > > > > ffffa0552ca08000
> > > > > > [5414248.356884] RBP: ffffb1cd8760fe20 R08:
> > > > > > 0000000000000000
> > > > > > R09:
> > > > > > 8080808080808080
> > > > > > [5414248.397632] R10: 0000000000000001 R11:
> > > > > > 0000000000000001
> > > > > > R12:
> > > > > > 0000000000000000
> > > > > > [5414248.439323] R13: ffffa049d67a12c0 R14:
> > > > > > 0000000000000000
> > > > > > R15:
> > > > > > ffffa0552ca08000
> > > > > > [5414248.481743] FS:  0000000000000000(0000)
> > > > > > GS:ffffa04a37880000(0000)
> > > > > > knlGS:0000000000000000
> > > > > > [5414248.528310] CS:  0010 DS: 0000 ES: 0000 CR0:
> > > > > > 0000000080050033
> > > > > > [5414248.561779] CR2: 00000000000000b8 CR3:
> > > > > > 0000000e9d40e004
> > > > > > CR4:
> > > > > > 00000000000206e0
> > > > > > [5414248.602420] Call Trace:
> > > > > > [5414248.616660]  blk_mq_sched_dispatch_requests+0x15c/0x18
> > > > > > 0
> > > > > > [5414248.647066]  __blk_mq_run_hw_queue+0x5f/0xf0
> > > > > > [5414248.672633]  process_one_work+0x171/0x370
> > > > > > [5414248.695443]  worker_thread+0x49/0x3f0
> > > > > > [5414248.716730]  kthread+0xf8/0x130
> > > > > > [5414248.735085]  ? max_active_store+0x80/0x80
> > > > > > [5414248.758569]  ? kthread_bind+0x10/0x10
> > > > > > [5414248.779953]  ret_from_fork+0x35/0x40
> > > > > > 
> > > > > > [5414248.801005] Modules linked in: ib_isert
> > > > > > iscsi_target_mod
> > > > > > target_core_mod ib_srp rpcrdma scsi_transport_srp rdma_ucm
> > > > > > ib_iser
> > > > > > ib_ipoib ib_umad rdma_cm libiscsi iw_cm
> > > > > > scsi_transport_iscsi
> > > > > > ib_cm
> > > > > > sunrpc mlx5_ib ib_uverbs ib_core intel_powerclamp coretemp
> > > > > > kvm_intel
> > > > > > kvm irqbypass ipmi_ssif crct10dif_pclmul crc32_pclmul
> > > > > > ghash_clmulni_intel aesni_intel iTCO_wdt crypto_simd cryptd
> > > > > > gpio_ich
> > > > > > iTCO_vendor_support glue_helper joydev ipmi_si
> > > > > > dm_service_time
> > > > > > pcspkr
> > > > > > ipmi_devintf hpilo hpwdt sg ipmi_msghandler
> > > > > > acpi_power_meter
> > > > > > lpc_ich
> > > > > > i7core_edac pcc_cpufreq dm_multipath ip_tables xfs
> > > > > > libcrc32c
> > > > > > radeon
> > > > > > sd_mod i2c_algo_bit drm_kms_helper syscopyarea sysfillrect
> > > > > > sysimgblt
> > > > > > fb_sys_fops ttm drm mlx5_core crc32c_intel serio_raw
> > > > > > i2c_core
> > > > > > hpsa
> > > > > > bnx2
> > > > > > scsi_transport_sas mlxfw devlink dm_mirror dm_region_hash
> > > > > > dm_log
> > > > > > dm_mod
> > > > > > [5414249.199354] CR2: 00000000000000b8
> > > > > > 
> > > > > > 
> > > > > 
> > > > > Looking at the vmcore
> > > > > 
> > > > > PID: 7773   TASK: ffffa04a2c1e2b80  CPU: 4   COMMAND:
> > > > > "kworker/4:1H"
> > > > >  #0 [ffffb1cd8760fab0] machine_kexec at ffffffffaaa6003f
> > > > >  #1 [ffffb1cd8760fb08] __crash_kexec at ffffffffaab373ed
> > > > >  #2 [ffffb1cd8760fbd0] crash_kexec at ffffffffaab385b9
> > > > >  #3 [ffffb1cd8760fbe8] oops_end at ffffffffaaa31931
> > > > >  #4 [ffffb1cd8760fc08] no_context at ffffffffaaa6eb59
> > > > >  #5 [ffffb1cd8760fcb0] do_page_fault at ffffffffaaa6feb2
> > > > >  #6 [ffffb1cd8760fce0] page_fault at ffffffffab2010ee
> > > > >     [exception RIP: blk_mq_dispatch_rq_list+201]
> > > > >     RIP: ffffffffaad90589  RSP: ffffb1cd8760fd90  RFLAGS:
> > > > > 00010246
> > > > >     RAX: ffffa049d67a1308  RBX: ffffa049d67a1308  RCX:
> > > > > 0000000000000004
> > > > >     RDX: 0000000000000000  RSI: ffffb1cd8760fe20  RDI:
> > > > > ffffa0552ca08000
> > > > >     RBP: ffffb1cd8760fe20   R8: 0000000000000000   R9:
> > > > > 8080808080808080
> > > > >     R10: 0000000000000001  R11: 0000000000000001  R12:
> > > > > 0000000000000000
> > > > >     R13: ffffa049d67a12c0  R14: 0000000000000000  R15:
> > > > > ffffa0552ca08000
> > > > >     ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
> > > > >  #7 [ffffb1cd8760fe18] blk_mq_sched_dispatch_requests at
> > > > > ffffffffaad9570c
> > > > >  #8 [ffffb1cd8760fe60] __blk_mq_run_hw_queue at
> > > > > ffffffffaad8de3f
> > > > >  #9 [ffffb1cd8760fe78] process_one_work at ffffffffaaab0ab1
> > > > > #10 [ffffb1cd8760feb8] worker_thread at ffffffffaaab11d9
> > > > > #11 [ffffb1cd8760ff10] kthread at ffffffffaaab6758
> > > > > #12 [ffffb1cd8760ff50] ret_from_fork at ffffffffab200215
> > > > > 
> > > > > We were working on this request_queue for
> > > > >  blk_mq_sched_dispatch_requests
> > > > > 
> > > > > crash> dev -d | grep ffffa0552ca08000
> > > > >     8
> > > > > ffffa055c81b5800   sdd        ffffa0552ca08000       0     0 
> > > > >   
> > > > >   
> > > > > 0 
> > > > > ]
> > > > > 
> > > > > That device was no longer accessible 
> > > > > 
> > > > > sdev_state = SDEV_TRANSPORT_OFFLINE,
> > > > > 
> > > > > So it looks like we tried to process a no longer valid list
> > > > > entry
> > > > > in 
> > > > > blk_mq_dispatch_rq_list
> > > > > 
> > > > > /home/loberman/rpmbuild/BUILD/kernel-5.0.0_rc2+/block/blk-
> > > > > mq.h:
> > > > > 211
> > > > > 0xffffffffaad90589
> > > > > <blk_mq_dispatch_rq_list+201>:       mov    0xb8(%r12),%rax
> > > > > 
> > > > > R12 is NULL
> > > > > 
> > > > > 
> > > > > From
> > > > > static inline bool blk_mq_get_dispatch_budget(struct
> > > > > blk_mq_hw_ctx
> > > > > *hctx)
> > > > > {
> > > > >         struct request_queue *q = hctx->queue;
> > > > > 
> > > > >         if (q->mq_ops->get_budget)
> > > > >                 return q->mq_ops->get_budget(hctx);
> > > > >         return true;
> > > > > }
> > > > > 
> > > > > Will wait for a reply before I try the newer kernel, but it
> > > > > looks like a use-after-free to me.
> > > > 
> > > > Hi Laurence,
> > > > 
> > > > I don't think that any of the recent SRP initiator changes can
> > > > be
> > > > the
> > > > root
> > > > cause of this crash. However, significant changes went upstream
> > > > in
> > > > the block
> > > > layer core during the v5.1-rc1 merge window, e.g. multi-page
> > > > bvec
> > > > support.
> > > > Is it possible for you to bisect this kernel oops?
> > > > 
> > > > Thanks,
> > > > 
> > > > Bart.
> > > > 
> > > 
> > > OK, I will see I can reproduce on the fly and I will bisect.
> > > I do agree its not SRP, more likley some block layer race.
> > > I was just able to reproduce it using SRP
> > > 
> > > Note that this was on 5.0.0-rc2+, prior to me trying 5.1.
> > > 
> > > I usually reboot the target server as part of my test series but
> > > when
> > > I
> > > last tested 5.0.0-rc2+ I only reset the SRP interfaces and had
> > > devices
> > > get rediscovered.
> > > 
> > > I did not see it during those tests.
> > > 
> > > Back when I have more to share.
> > > 
> > > Many Thanks for your time as always
> > > Laurence
> > > 
> > > 
> > 
> > Something crept in, in the block layer causing a use after free
> > 4.19.0-rc1+ does not have the issue, so I will narrow the bisect 
> > Thanks
> > Laurence
> > 
> 
> This took a long time to bisect.
> Repeating the issue seen. We have changes that when the target is
> rebooted with mapped srp devices the client then experiences a ist
> corruption and panic as already shown.
> 
> Some stacks
> 
> [  222.631998] scsi host1: ib_srp: Path record query failed: sgid
> fe80:0000:0000:0000:7cfe:9003:0072:6ed2, dgid
> fe80:0000:0000:0000:7cfe:9003:0072:6e4e, pkey 0xffff, service_id
> 0x7cfe900300726e4e
> [  222.729639] scsi host1: reconnect attempt 1 failed (-110)
> [  223.176766] fast_io_fail_tmo expired for SRP port-1:1 / host1.
> 
> [  223.518759] BUG: unable to handle kernel NULL pointer dereference
> at
> 00000000000000b8
> [  223.519736] sd 1:0:0:0: rejecting I/O to offline device
> [  223.563769] #PF error: [normal kernel read fault]
> [  223.563770] PGD 0 P4D 0 
> [  223.563774] Oops: 0000 [#1] SMP PTI
> [  223.563778] CPU: 3 PID: 9027 Comm: kworker/3:1H Tainted:
> G          I       5.0.0-rc1 #22
> [  223.563779] Hardware name: HP ProLiant DL380 G7, BIOS P67
> 08/16/2015
> [  223.563787] Workqueue: kblockd blk_mq_run_work_fn
> [  223.593723] device-mapper: multipath: Failing path 8:48.
> [  223.620801] RIP: 0010:blk_mq_dispatch_rq_list+0xc9/0x590
> [  223.635266] print_req_error: I/O error, dev dm-6, sector 8191872
> flags 80700
> [  223.655565] Code: 0f 85 c2 04 00 00 83 44 24 28 01 48 8b 45 00 48
> 39
> c5 0f 84 ea 00 00 00 48 8b 5d 00 80 3c 24 00 4c 8d 6b b8 4c 8b 63 c8
> 75
> 25 <49> 8b 84 24 b8 00 00 00 48 8b 40 40 48 8b 40 10 48 85 c0 74 10
> 4c
> [  223.655566] RSP: 0018:ffffa65b4c43fd90 EFLAGS: 00010246
> [  223.655570] RAX: ffff93ed9bfdbbc8 RBX: ffff93ed9bfdbbc8 RCX:
> 0000000000000004
> [  223.702351] print_req_error: I/O error, dev dm-6, sector 8191872
> flags 0
> [  223.737640] RDX: 0000000000000000 RSI: ffffa65b4c43fe20 RDI:
> ffff93ed9b838000
> [  223.737641] RBP: ffffa65b4c43fe20 R08: 0000000000000000 R09:
> 8080808080808080
> [  223.737642] R10: 0000000000000001 R11: 071c71c71c71c71c R12:
> 0000000000000000
> [  223.737643] R13: ffff93ed9bfdbb80 R14: 0000000000000000 R15:
> ffff93ed9b838000
> [  223.737645] FS:  0000000000000000(0000) GS:ffff93ee33840000(0000)
> knlGS:0000000000000000
> [  223.737646] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  223.737646] CR2: 00000000000000b8 CR3: 000000059da0e006 CR4:
> 00000000000206e0
> [  223.737648] Call Trace:          
> 
> [  223.737657]  blk_mq_sched_dispatch_requests+0x15c/0x180  *** Freed
> already
> 
> [  223.737660]  __blk_mq_run_hw_queue+0x5f/0xf0
> [  223.737665]  process_one_work+0x171/0x370
> [  223.737667]  worker_thread+0x49/0x3f0
> [  223.737670]  kthread+0xf8/0x130
> [  223.737673]  ? max_active_store+0x80/0x80
> 
> And:
> 
> [  643.425005] device-mapper: multipath: Failing path 67:0.
> [  643.696365] ------------[ cut here ]------------
> [  643.722927] list_add corruption. prev->next should be next
> (ffffc8c0c3bd9448), but was ffff93b971965e08.
> (prev=ffff93b971965e08).
> [  643.787089] WARNING: CPU: 14 PID: 6533 at lib/list_debug.c:28
> __list_add_valid+0x6a/0x70
> [  643.830951] Modules linked in: ib_isert iscsi_target_mod
> target_core_mod ib_srp scsi_transport_srp rpcrdma ib_ipoib ib_umad
> rdma_ucm ib_iser rdma_cm iw_cm libiscsi sunrpc scsi_transport_iscsi
> ib_cm mlx5_ib ib_uverbs ib_core intel_powerclamp coretemp kvm_intel
> kvm
> irqbypass crct10dif_pclmul crc32_pclmul iTCO_wdt ghash_clmulni_intel
> ipmi_ssif aesni_intel iTCO_vendor_support gpio_ich crypto_simd
> dm_service_time cryptd glue_helper joydev ipmi_si ipmi_devintf pcspkr
> hpilo hpwdt sg ipmi_msghandler acpi_power_meter pcc_cpufreq lpc_ich
> i7core_edac dm_multipath ip_tables xfs libcrc32c radeon sd_mod
> i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt
> fb_sys_fops ttm drm crc32c_intel serio_raw mlx5_core i2c_core bnx2
> hpsa
> mlxfw scsi_transport_sas dm_mirror dm_region_hash dm_log dm_mod
> [  644.224150] CPU: 14 PID: 6533 Comm: kworker/14:1H Tainted:
> G          I       4.20.0+ #26
> [  644.269637] Hardware name: HP ProLiant DL380 G7, BIOS P67
> 08/16/2015
> [  644.305110] Workqueue: kblockd blk_mq_run_work_fn
> [  644.330984] RIP: 0010:__list_add_valid+0x6a/0x70
> [  644.357601] Code: 31 c0 48 c7 c7 70 e4 ab b6 e8 22 a6 cc ff 0f 0b
> 31
> c0 c3 48 89 d1 48 c7 c7 20 e4 ab b6 48 89 f2 48 89 c6 31 c0 e8 06 a6
> cc
> ff <0f> 0b 31 c0 c3 90 48 8b 07 48 b9 00 01 00 00 00 00 ad de 48 8b
> 57
> [  644.462542] RSP: 0018:ffffa8ccc72bfc00 EFLAGS: 00010286
> [  644.491594] RAX: 0000000000000000 RBX: ffff93b971965dc0 RCX:
> 0000000000000000
> [  644.532745] RDX: 0000000000000001 RSI: ffff93b9b79d67b8 RDI:
> ffff93b9b79d67b8
> [  644.573533] RBP: ffffc8c0c3bd9448 R08: 0000000000000000 R09:
> 000000000000072c
> [  644.614180] R10: 0000000000000000 R11: ffffa8ccc72bf968 R12:
> ffff93b96d454c00
> [  644.654683] R13: ffffc8c0c3bd9440 R14: ffff93b971965e08 R15:
> ffff93b971965e08
> [  644.694275] FS:  0000000000000000(0000) GS:ffff93b9b79c0000(0000)
> knlGS:0000000000000000
> [  644.739906] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  644.771879] CR2: 00007fd566c10000 CR3: 0000000253e0e002 CR4:
> 00000000000206e0
> [  644.811438] Call Trace:
> [  644.824809]  __blk_mq_insert_request+0x62/0x130
> [  644.849886]  blk_mq_sched_insert_request+0x13c/0x1b0
> [  644.877402]  blk_mq_try_issue_directly+0x105/0x2c0
> [  644.904452]  blk_insert_cloned_request+0x9a/0x130
> [  644.931146]  ? ktime_get+0x37/0x90
> [  644.950545]  dm_mq_queue_rq+0x21c/0x3f0 [dm_mod]
> [  644.977064]  ? blk_mq_get_driver_tag+0xa1/0x120
> [  645.003002]  blk_mq_dispatch_rq_list+0x8e/0x590
> [  645.029812]  ? __switch_to_asm+0x40/0x70
> [  645.052059]  ? __switch_to_asm+0x34/0x70
> [  645.074664]  ? __switch_to_asm+0x40/0x70
> [  645.097111]  ? __switch_to_asm+0x34/0x70
> [  645.119948]  ? __switch_to_asm+0x34/0x70
> [  645.143101]  ? __switch_to_asm+0x40/0x70
> [  645.165273]  ? syscall_return_via_sysret+0x10/0x7f
> [  645.192161]  blk_mq_sched_dispatch_requests+0xe8/0x180
> [  645.221460]  __blk_mq_run_hw_queue+0x5f/0xf0
> [  645.244766]  process_one_work+0x171/0x370
> [  645.267164]  worker_thread+0x49/0x3f0
> [  645.287860]  kthread+0xf8/0x130
> [  645.304894]  ? max_active_store+0x80/0x80
> [  645.327748]  ? kthread_bind+0x10/0x10
> [  645.347898]  ret_from_fork+0x35/0x40
> [  645.368298] ---[ end trace afa70bf68ffb006b ]---
> [  645.397356] ------------[ cut here ]------------
> [  645.423878] list_add corruption. prev->next should be next
> (ffffc8c0c3bd9448), but was ffff93b971965e08.
> (prev=ffff93b971965e08).
> [  645.488009] WAR
> NING: CPU: 14 PID: 6533 at lib/list_debug.c:28
> __list_add_valid+0x6a/0x70
> 
> 
> I started just looking at block but nothing made sense so re-ran the
> bisect with the entire kernel.
> 
> $ git bisect start 
> $ git bisect good v4.20
> $ git bisect bad v5.0-rc1
> 
> Bisecting: 5477 revisions left to test after this (roughly 13 steps)
> *** Groan
> 
> I got to here and the problem is its an entire merge
> 
> [loberman@ibclient linux_torvalds]$ git bisect bad
> 4b9254328254bed12a4ac449cdff2c332e630837 is the first bad commit
> [loberman@ibclient linux_torvalds]$ git show
> 4b9254328254bed12a4ac449cdff2c332e630837
> commit 4b9254328254bed12a4ac449cdff2c332e630837
> Merge: 1a9430d cd19181
> Author: Jens Axboe <axboe@kernel.dk>
> Date:   Tue Dec 18 08:29:53 2018 -0700
> 
>     Merge branch 'for-4.21/block' into for-4.21/aio
>     
>     * for-4.21/block: (351 commits)
>       blk-mq: enable IO poll if .nr_queues of type poll > 0
>       blk-mq: change blk_mq_queue_busy() to blk_mq_queue_inflight()
>       blk-mq: skip zero-queue maps in blk_mq_map_swqueue
>       block: fix blk-iolatency accounting underflow
>       blk-mq: fix dispatch from sw queue
>       block: mq-deadline: Fix write completion handling
>       nvme-pci: don't share queue maps
>       blk-mq: only dispatch to non-defauly queue maps if they have
> queues
>       blk-mq: export hctx->type in debugfs instead of sysfs
>       blk-mq: fix allocation for queue mapping table
>       blk-wbt: export internal state via debugfs
>       blk-mq-debugfs: support rq_qos
>       block: update sysfs documentation
>       block: loop: check error using IS_ERR instead of IS_ERR_OR_NULL
> in loop_add()
>       aoe: add __exit annotation
>       block: clear REQ_HIPRI if polling is not supported
>       blk-mq: replace and kill blk_mq_request_issue_directly
>       blk-mq: issue directly with bypass 'false' in
> blk_mq_sched_insert_requests
>       blk-mq: refactor the code of issue request directly
>       block: remove the bio_integrity_advance export
>       ...
> 
> Definitley looks like a list corruption and use after free but its
> the
> merge so looking for ideas now from the block folks,
> 
> 
> 

for-4.21/block: (351 commits) *****
I will have to try to narrow these down based on the heading of each and do
a full review of the vmcore I captured.

This exposure is present from v5.0-rc1 onwards.
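
As a rough starting point for that vmcore review, a crash(8) pass along the
lines of the first dump's analysis (standard crash commands; the +201 offset
comes from the earlier decode and the request_queue address is a placeholder
to be taken from 'dev -d'): bt shows the faulting task, dis -l maps the RIP
back to a source line, dev -d ties the request_queue to its disk, and kmem on
that address shows whether the memory has already been freed.

crash> bt
crash> dis -l blk_mq_dispatch_rq_list+201
crash> dev -d
crash> kmem <request_queue address from dev -d>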


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Panic when rebooting target server testing srp on 5.0.0-rc2
  2019-03-26 15:16         ` Panic when rebooting target server testing srp on 5.0.0-rc2 Laurence Oberman
  2019-03-26 19:06           ` Laurence Oberman
@ 2019-03-27  1:40           ` Ming Lei
  2019-03-27 12:35             ` Laurence Oberman
  1 sibling, 1 reply; 18+ messages in thread
From: Ming Lei @ 2019-03-27  1:40 UTC (permalink / raw)
  To: Laurence Oberman
  Cc: Bart Van Assche, linux-rdma, linux-block, Jens Axboe, linux-scsi

On Tue, Mar 26, 2019 at 11:17 PM Laurence Oberman <loberman@redhat.com> wrote:
>
> On Thu, 2019-03-21 at 08:44 -0400, Laurence Oberman wrote:
> > On Wed, 2019-03-20 at 16:35 -0400, Laurence Oberman wrote:
> > > On Wed, 2019-03-20 at 13:28 -0700, Bart Van Assche wrote:
> > > > On Wed, 2019-03-20 at 11:11 -0400, Laurence Oberman wrote:
> > > > > On Wed, 2019-03-20 at 09:45 -0400, Laurence Oberman wrote:
> > > > > > Hello Bart, I hope all is well with you.
> > > > > >
> > > > > > Quick question
> > > > > > preparing to test v5.1-rc2 SRP, my usual method is first
> > > > > > validate
> > > > > > the
> > > > > > prior kernel I had in place.
> > > > > > This had passed tests previously (5.0.0-rc2) but I had not
> > > > > > run
> > > > > > the
> > > > > > target server reboot test, just the disconnect tests.
> > > > > >
> > > > > > Today with mapped SRP devices I rebooted the target server
> > > > > > and
> > > > > > the
> > > > > > client panicked.
> > > > > >
> > > > > > Its been a while and I have been so busy that have not kept
> > > > > > up
> > > > > > with
> > > > > > all
> > > > > > the fixes. Is this a known issue.
> > > > > >
> > > > > > Thanks
> > > > > > Laurence
> > > > > >
> > > > > > 5414228.917507] scsi host2: ib_srp: Path record query failed:
> > > > > > sgid
> > > > > > fe80:0000:0000:0000:7cfe:9003:0072:6ed3, dgid
> > > > > > fe80:0000:0000:0000:7cfe:9003:0072:6e4f, pkey 0xffff,
> > > > > > service_id
> > > > > > 0x7cfe900300726e4e
> > > > > > [5414229.014355] scsi host2: reconnect attempt 7 failed (-
> > > > > > 110)
> > > > > > [5414239.318161] scsi host2: ib_srp: Sending CM DREQ failed
> > > > > > [5414239.318165] scsi host2: ib_srp: Sending CM DREQ failed
> > > > > > [5414239.318167] scsi host2: ib_srp: Sending CM DREQ failed
> > > > > > [5414239.318168] scsi host2: ib_srp: Sending CM DREQ failed
> > > > > > [5414239.318170] scsi host2: ib_srp: Sending CM DREQ failed
> > > > > > [5414239.318172] scsi host2: ib_srp: Sending CM DREQ failed
> > > > > > [5414239.318173] scsi host2: ib_srp: Sending CM DREQ failed
> > > > > > [5414239.318175] scsi host2: ib_srp: Sending CM DREQ failed
> > > > > > [5414243.670072] scsi host2: ib_srp: Got failed path rec
> > > > > > status
> > > > > > -110
> > > > > > [5414243.702179] scsi host2: ib_srp: Path record query
> > > > > > failed:
> > > > > > sgid
> > > > > > fe80:0000:0000:0000:7cfe:9003:0072:6ed3, dgid
> > > > > > fe80:0000:0000:0000:7cfe:9003:0072:6e4f, pkey 0xffff,
> > > > > > service_id
> > > > > > 0x7cfe900300726e4e
> > > > > > [5414243.799313] scsi host2: reconnect attempt 8 failed (-
> > > > > > 110)
> > > > > > [5414247.510115] scsi host1: ib_srp: Sending CM REQ failed
> > > > > > [5414247.510140] scsi host1: reconnect attempt 1 failed (-
> > > > > > 104)
> > > > > > [5414247.849078] BUG: unable to handle kernel NULL pointer
> > > > > > dereference
> > > > > > at 00000000000000b8
> > > > > > [5414247.893793] #PF error: [normal kernel read fault]
> > > > > > [5414247.921839] PGD 0 P4D 0
> > > > > > [5414247.937280] Oops: 0000 [#1] SMP PTI
> > > > > > [5414247.958332] CPU: 4 PID: 7773 Comm: kworker/4:1H Kdump:
> > > > > > loaded
> > > > > > Tainted: G          I       5.0.0-rc2+ #2
> > > > > > [5414248.012856] Hardware name: HP ProLiant DL380 G7, BIOS
> > > > > > P67
> > > > > > 08/16/2015
> > > > > > [5414248.026174] device-mapper: multipath: Failing path 8:48.
> > > > > >
> > > > > >
> > > > > > [5414248.050003] Workqueue: kblockd blk_mq_run_work_fn
> > > > > > [5414248.108378] RIP: 0010:blk_mq_dispatch_rq_list+0xc9/0x590
> > > > > > [5414248.139724] Code: 0f 85 c2 04 00 00 83 44 24 28 01 48 8b
> > > > > > 45
> > > > > > 00
> > > > > > 48
> > > > > > 39 c5 0f 84 ea 00 00 00 48 8b 5d 00 80 3c 24 00 4c 8d 6b b8
> > > > > > 4c
> > > > > > 8b
> > > > > > 63
> > > > > > c8
> > > > > > 75 25 <49> 8b 84 24 b8 00 00 00 48 8b 40 40 48 8b 40 10 48 85
> > > > > > c0
> > > > > > 74
> > > > > > 10
> > > > > > 4c
> > > > > > [5414248.246176] RSP: 0018:ffffb1cd8760fd90 EFLAGS: 00010246
> > > > > > [5414248.275599] RAX: ffffa049d67a1308 RBX: ffffa049d67a1308
> > > > > > RCX:
> > > > > > 0000000000000004
> > > > > > [5414248.316090] RDX: 0000000000000000 RSI: ffffb1cd8760fe20
> > > > > > RDI:
> > > > > > ffffa0552ca08000
> > > > > > [5414248.356884] RBP: ffffb1cd8760fe20 R08: 0000000000000000
> > > > > > R09:
> > > > > > 8080808080808080
> > > > > > [5414248.397632] R10: 0000000000000001 R11: 0000000000000001
> > > > > > R12:
> > > > > > 0000000000000000
> > > > > > [5414248.439323] R13: ffffa049d67a12c0 R14: 0000000000000000
> > > > > > R15:
> > > > > > ffffa0552ca08000
> > > > > > [5414248.481743] FS:  0000000000000000(0000)
> > > > > > GS:ffffa04a37880000(0000)
> > > > > > knlGS:0000000000000000
> > > > > > [5414248.528310] CS:  0010 DS: 0000 ES: 0000 CR0:
> > > > > > 0000000080050033
> > > > > > [5414248.561779] CR2: 00000000000000b8 CR3: 0000000e9d40e004
> > > > > > CR4:
> > > > > > 00000000000206e0
> > > > > > [5414248.602420] Call Trace:
> > > > > > [5414248.616660]  blk_mq_sched_dispatch_requests+0x15c/0x180
> > > > > > [5414248.647066]  __blk_mq_run_hw_queue+0x5f/0xf0
> > > > > > [5414248.672633]  process_one_work+0x171/0x370
> > > > > > [5414248.695443]  worker_thread+0x49/0x3f0
> > > > > > [5414248.716730]  kthread+0xf8/0x130
> > > > > > [5414248.735085]  ? max_active_store+0x80/0x80
> > > > > > [5414248.758569]  ? kthread_bind+0x10/0x10
> > > > > > [5414248.779953]  ret_from_fork+0x35/0x40
> > > > > >
> > > > > > [5414248.801005] Modules linked in: ib_isert iscsi_target_mod
> > > > > > target_core_mod ib_srp rpcrdma scsi_transport_srp rdma_ucm
> > > > > > ib_iser
> > > > > > ib_ipoib ib_umad rdma_cm libiscsi iw_cm scsi_transport_iscsi
> > > > > > ib_cm
> > > > > > sunrpc mlx5_ib ib_uverbs ib_core intel_powerclamp coretemp
> > > > > > kvm_intel
> > > > > > kvm irqbypass ipmi_ssif crct10dif_pclmul crc32_pclmul
> > > > > > ghash_clmulni_intel aesni_intel iTCO_wdt crypto_simd cryptd
> > > > > > gpio_ich
> > > > > > iTCO_vendor_support glue_helper joydev ipmi_si
> > > > > > dm_service_time
> > > > > > pcspkr
> > > > > > ipmi_devintf hpilo hpwdt sg ipmi_msghandler acpi_power_meter
> > > > > > lpc_ich
> > > > > > i7core_edac pcc_cpufreq dm_multipath ip_tables xfs libcrc32c
> > > > > > radeon
> > > > > > sd_mod i2c_algo_bit drm_kms_helper syscopyarea sysfillrect
> > > > > > sysimgblt
> > > > > > fb_sys_fops ttm drm mlx5_core crc32c_intel serio_raw i2c_core
> > > > > > hpsa
> > > > > > bnx2
> > > > > > scsi_transport_sas mlxfw devlink dm_mirror dm_region_hash
> > > > > > dm_log
> > > > > > dm_mod
> > > > > > [5414249.199354] CR2: 00000000000000b8
> > > > > >
> > > > > >
> > > > >
> > > > > Looking at the vmcore
> > > > >
> > > > > PID: 7773   TASK: ffffa04a2c1e2b80  CPU: 4   COMMAND:
> > > > > "kworker/4:1H"
> > > > >  #0 [ffffb1cd8760fab0] machine_kexec at ffffffffaaa6003f
> > > > >  #1 [ffffb1cd8760fb08] __crash_kexec at ffffffffaab373ed
> > > > >  #2 [ffffb1cd8760fbd0] crash_kexec at ffffffffaab385b9
> > > > >  #3 [ffffb1cd8760fbe8] oops_end at ffffffffaaa31931
> > > > >  #4 [ffffb1cd8760fc08] no_context at ffffffffaaa6eb59
> > > > >  #5 [ffffb1cd8760fcb0] do_page_fault at ffffffffaaa6feb2
> > > > >  #6 [ffffb1cd8760fce0] page_fault at ffffffffab2010ee
> > > > >     [exception RIP: blk_mq_dispatch_rq_list+201]
> > > > >     RIP: ffffffffaad90589  RSP: ffffb1cd8760fd90  RFLAGS:
> > > > > 00010246
> > > > >     RAX: ffffa049d67a1308  RBX: ffffa049d67a1308  RCX:
> > > > > 0000000000000004
> > > > >     RDX: 0000000000000000  RSI: ffffb1cd8760fe20  RDI:
> > > > > ffffa0552ca08000
> > > > >     RBP: ffffb1cd8760fe20   R8: 0000000000000000   R9:
> > > > > 8080808080808080
> > > > >     R10: 0000000000000001  R11: 0000000000000001  R12:
> > > > > 0000000000000000
> > > > >     R13: ffffa049d67a12c0  R14: 0000000000000000  R15:
> > > > > ffffa0552ca08000
> > > > >     ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
> > > > >  #7 [ffffb1cd8760fe18] blk_mq_sched_dispatch_requests at
> > > > > ffffffffaad9570c
> > > > >  #8 [ffffb1cd8760fe60] __blk_mq_run_hw_queue at
> > > > > ffffffffaad8de3f
> > > > >  #9 [ffffb1cd8760fe78] process_one_work at ffffffffaaab0ab1
> > > > > #10 [ffffb1cd8760feb8] worker_thread at ffffffffaaab11d9
> > > > > #11 [ffffb1cd8760ff10] kthread at ffffffffaaab6758
> > > > > #12 [ffffb1cd8760ff50] ret_from_fork at ffffffffab200215
> > > > >
> > > > > We were working on this request_queue for
> > > > >  blk_mq_sched_dispatch_requests
> > > > >
> > > > > crash> dev -d | grep ffffa0552ca08000
> > > > >     8
> > > > > ffffa055c81b5800   sdd        ffffa0552ca08000       0     0
> > > > >
> > > > > 0
> > > > > ]
> > > > >
> > > > > That device was no longer accessible
> > > > >
> > > > > sdev_state = SDEV_TRANSPORT_OFFLINE,
> > > > >
> > > > > So it looks like we tried to process a no longer valid list
> > > > > entry
> > > > > in
> > > > > blk_mq_dispatch_rq_list
> > > > >
> > > > > /home/loberman/rpmbuild/BUILD/kernel-5.0.0_rc2+/block/blk-mq.h:
> > > > > 211
> > > > > 0xffffffffaad90589
> > > > > <blk_mq_dispatch_rq_list+201>:       mov    0xb8(%r12),%rax
> > > > >
> > > > > R12 is NULL
> > > > >
> > > > >
> > > > > From
> > > > > static inline bool blk_mq_get_dispatch_budget(struct
> > > > > blk_mq_hw_ctx
> > > > > *hctx)
> > > > > {
> > > > >         struct request_queue *q = hctx->queue;
> > > > >
> > > > >         if (q->mq_ops->get_budget)
> > > > >                 return q->mq_ops->get_budget(hctx);
> > > > >         return true;
> > > > > }
> > > > >
> > > > > Will wait for a reply before I try the newer kernel, but it looks
> > > > > like a use-after-free to me.
> > > >
> > > > Hi Laurence,
> > > >
> > > > I don't think that any of the recent SRP initiator changes can be
> > > > the
> > > > root
> > > > cause of this crash. However, significant changes went upstream
> > > > in
> > > > the block
> > > > layer core during the v5.1-rc1 merge window, e.g. multi-page bvec
> > > > support.
> > > > Is it possible for you to bisect this kernel oops?
> > > >
> > > > Thanks,
> > > >
> > > > Bart.
> > > >
> > >
> > > OK, I will see I can reproduce on the fly and I will bisect.
> > > I do agree its not SRP, more likley some block layer race.
> > > I was just able to reproduce it using SRP
> > >
> > > Note that this was on 5.0.0-rc2+, prior to me trying 5.1.
> > >
> > > I usually reboot the target server as part of my test series but
> > > when
> > > I
> > > last tested 5.0.0-rc2+ I only reset the SRP interfaces and had
> > > devices
> > > get rediscovered.
> > >
> > > I did not see it during those tests.
> > >
> > > Back when I have more to share.
> > >
> > > Many Thanks for your time as always
> > > Laurence
> > >
> > >
> >
> > Something crept in, in the block layer causing a use after free
> > 4.19.0-rc1+ does not have the issue, so I will narrow the bisect
> > Thanks
> > Laurence
> >
>
> This took a long time to bisect.
> Repeating the issue seen. We have changes that when the target is
> rebooted with mapped srp devices the client then experiences a ist
> corruption and panic as already shown.
>
> Some stacks
>
> [  222.631998] scsi host1: ib_srp: Path record query failed: sgid
> fe80:0000:0000:0000:7cfe:9003:0072:6ed2, dgid
> fe80:0000:0000:0000:7cfe:9003:0072:6e4e, pkey 0xffff, service_id
> 0x7cfe900300726e4e
> [  222.729639] scsi host1: reconnect attempt 1 failed (-110)
> [  223.176766] fast_io_fail_tmo expired for SRP port-1:1 / host1.
>
> [  223.518759] BUG: unable to handle kernel NULL pointer dereference at
> 00000000000000b8
> [  223.519736] sd 1:0:0:0: rejecting I/O to offline device
> [  223.563769] #PF error: [normal kernel read fault]
> [  223.563770] PGD 0 P4D 0
> [  223.563774] Oops: 0000 [#1] SMP PTI
> [  223.563778] CPU: 3 PID: 9027 Comm: kworker/3:1H Tainted:
> G          I       5.0.0-rc1 #22
> [  223.563779] Hardware name: HP ProLiant DL380 G7, BIOS P67 08/16/2015
> [  223.563787] Workqueue: kblockd blk_mq_run_work_fn
> [  223.593723] device-mapper: multipath: Failing path 8:48.
> [  223.620801] RIP: 0010:blk_mq_dispatch_rq_list+0xc9/0x590
> [  223.635266] print_req_error: I/O error, dev dm-6, sector 8191872
> flags 80700
> [  223.655565] Code: 0f 85 c2 04 00 00 83 44 24 28 01 48 8b 45 00 48 39
> c5 0f 84 ea 00 00 00 48 8b 5d 00 80 3c 24 00 4c 8d 6b b8 4c 8b 63 c8 75
> 25 <49> 8b 84 24 b8 00 00 00 48 8b 40 40 48 8b 40 10 48 85 c0 74 10 4c
> [  223.655566] RSP: 0018:ffffa65b4c43fd90 EFLAGS: 00010246
> [  223.655570] RAX: ffff93ed9bfdbbc8 RBX: ffff93ed9bfdbbc8 RCX:
> 0000000000000004
> [  223.702351] print_req_error: I/O error, dev dm-6, sector 8191872
> flags 0
> [  223.737640] RDX: 0000000000000000 RSI: ffffa65b4c43fe20 RDI:
> ffff93ed9b838000
> [  223.737641] RBP: ffffa65b4c43fe20 R08: 0000000000000000 R09:
> 8080808080808080
> [  223.737642] R10: 0000000000000001 R11: 071c71c71c71c71c R12:
> 0000000000000000
> [  223.737643] R13: ffff93ed9bfdbb80 R14: 0000000000000000 R15:
> ffff93ed9b838000
> [  223.737645] FS:  0000000000000000(0000) GS:ffff93ee33840000(0000)
> knlGS:0000000000000000
> [  223.737646] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  223.737646] CR2: 00000000000000b8 CR3: 000000059da0e006 CR4:
> 00000000000206e0
> [  223.737648] Call Trace:
>
> [  223.737657]  blk_mq_sched_dispatch_requests+0x15c/0x180  *** Freed
> already
>
> [  223.737660]  __blk_mq_run_hw_queue+0x5f/0xf0
> [  223.737665]  process_one_work+0x171/0x370
> [  223.737667]  worker_thread+0x49/0x3f0
> [  223.737670]  kthread+0xf8/0x130
> [  223.737673]  ? max_active_store+0x80/0x80
>
> And:
>
> [  643.425005] device-mapper: multipath: Failing path 67:0.
> [  643.696365] ------------[ cut here ]------------
> [  643.722927] list_add corruption. prev->next should be next
> (ffffc8c0c3bd9448), but was ffff93b971965e08. (prev=ffff93b971965e08).
> [  643.787089] WARNING: CPU: 14 PID: 6533 at lib/list_debug.c:28
> __list_add_valid+0x6a/0x70
> [  643.830951] Modules linked in: ib_isert iscsi_target_mod
> target_core_mod ib_srp scsi_transport_srp rpcrdma ib_ipoib ib_umad
> rdma_ucm ib_iser rdma_cm iw_cm libiscsi sunrpc scsi_transport_iscsi
> ib_cm mlx5_ib ib_uverbs ib_core intel_powerclamp coretemp kvm_intel kvm
> irqbypass crct10dif_pclmul crc32_pclmul iTCO_wdt ghash_clmulni_intel
> ipmi_ssif aesni_intel iTCO_vendor_support gpio_ich crypto_simd
> dm_service_time cryptd glue_helper joydev ipmi_si ipmi_devintf pcspkr
> hpilo hpwdt sg ipmi_msghandler acpi_power_meter pcc_cpufreq lpc_ich
> i7core_edac dm_multipath ip_tables xfs libcrc32c radeon sd_mod
> i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt
> fb_sys_fops ttm drm crc32c_intel serio_raw mlx5_core i2c_core bnx2 hpsa
> mlxfw scsi_transport_sas dm_mirror dm_region_hash dm_log dm_mod
> [  644.224150] CPU: 14 PID: 6533 Comm: kworker/14:1H Tainted:
> G          I       4.20.0+ #26
> [  644.269637] Hardware name: HP ProLiant DL380 G7, BIOS P67 08/16/2015
> [  644.305110] Workqueue: kblockd blk_mq_run_work_fn
> [  644.330984] RIP: 0010:__list_add_valid+0x6a/0x70
> [  644.357601] Code: 31 c0 48 c7 c7 70 e4 ab b6 e8 22 a6 cc ff 0f 0b 31
> c0 c3 48 89 d1 48 c7 c7 20 e4 ab b6 48 89 f2 48 89 c6 31 c0 e8 06 a6 cc
> ff <0f> 0b 31 c0 c3 90 48 8b 07 48 b9 00 01 00 00 00 00 ad de 48 8b 57
> [  644.462542] RSP: 0018:ffffa8ccc72bfc00 EFLAGS: 00010286
> [  644.491594] RAX: 0000000000000000 RBX: ffff93b971965dc0 RCX:
> 0000000000000000
> [  644.532745] RDX: 0000000000000001 RSI: ffff93b9b79d67b8 RDI:
> ffff93b9b79d67b8
> [  644.573533] RBP: ffffc8c0c3bd9448 R08: 0000000000000000 R09:
> 000000000000072c
> [  644.614180] R10: 0000000000000000 R11: ffffa8ccc72bf968 R12:
> ffff93b96d454c00
> [  644.654683] R13: ffffc8c0c3bd9440 R14: ffff93b971965e08 R15:
> ffff93b971965e08
> [  644.694275] FS:  0000000000000000(0000) GS:ffff93b9b79c0000(0000)
> knlGS:0000000000000000
> [  644.739906] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  644.771879] CR2: 00007fd566c10000 CR3: 0000000253e0e002 CR4:
> 00000000000206e0
> [  644.811438] Call Trace:
> [  644.824809]  __blk_mq_insert_request+0x62/0x130
> [  644.849886]  blk_mq_sched_insert_request+0x13c/0x1b0
> [  644.877402]  blk_mq_try_issue_directly+0x105/0x2c0
> [  644.904452]  blk_insert_cloned_request+0x9a/0x130
> [  644.931146]  ? ktime_get+0x37/0x90
> [  644.950545]  dm_mq_queue_rq+0x21c/0x3f0 [dm_mod]
> [  644.977064]  ? blk_mq_get_driver_tag+0xa1/0x120
> [  645.003002]  blk_mq_dispatch_rq_list+0x8e/0x590
> [  645.029812]  ? __switch_to_asm+0x40/0x70
> [  645.052059]  ? __switch_to_asm+0x34/0x70
> [  645.074664]  ? __switch_to_asm+0x40/0x70
> [  645.097111]  ? __switch_to_asm+0x34/0x70
> [  645.119948]  ? __switch_to_asm+0x34/0x70
> [  645.143101]  ? __switch_to_asm+0x40/0x70
> [  645.165273]  ? syscall_return_via_sysret+0x10/0x7f
> [  645.192161]  blk_mq_sched_dispatch_requests+0xe8/0x180
> [  645.221460]  __blk_mq_run_hw_queue+0x5f/0xf0
> [  645.244766]  process_one_work+0x171/0x370
> [  645.267164]  worker_thread+0x49/0x3f0
> [  645.287860]  kthread+0xf8/0x130
> [  645.304894]  ? max_active_store+0x80/0x80
> [  645.327748]  ? kthread_bind+0x10/0x10
> [  645.347898]  ret_from_fork+0x35/0x40
> [  645.368298] ---[ end trace afa70bf68ffb006b ]---
> [  645.397356] ------------[ cut here ]------------
> [  645.423878] list_add corruption. prev->next should be next
> (ffffc8c0c3bd9448), but was ffff93b971965e08. (prev=ffff93b971965e08).
> [  645.488009] WAR
> NING: CPU: 14 PID: 6533 at lib/list_debug.c:28
> __list_add_valid+0x6a/0x70
>
>
> I started just looking at block but nothing made sense so re-ran the
> bisect with the entire kernel.
>
> $ git bisect start
> $ git bisect good v4.20
> $ git bisect bad v5.0-rc1
>
> Bisecting: 5477 revisions left to test after this (roughly 13 steps)
> *** Groan
>
> I got to here and the problem is its an entire merge
>
> [loberman@ibclient linux_torvalds]$ git bisect bad
> 4b9254328254bed12a4ac449cdff2c332e630837 is the first bad commit
> [loberman@ibclient linux_torvalds]$ git show
> 4b9254328254bed12a4ac449cdff2c332e630837
> commit 4b9254328254bed12a4ac449cdff2c332e630837
> Merge: 1a9430d cd19181
> Author: Jens Axboe <axboe@kernel.dk>
> Date:   Tue Dec 18 08:29:53 2018 -0700
>
>     Merge branch 'for-4.21/block' into for-4.21/aio
>
>     * for-4.21/block: (351 commits)
>       blk-mq: enable IO poll if .nr_queues of type poll > 0

You may set 4b9254328254b or 'cd19181bf9ad' as the 'git bisect bad' point
and start the 'git bisect' again.
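For example, something like this (assuming the good point stays at v4.20;
adjust to wherever your tree currently is):

$ git bisect start
$ git bisect good v4.20
$ git bisect bad cd19181bf9ad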

thanks,
Ming Lei

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Panic when rebooting target server testing srp on 5.0.0-rc2
  2019-03-27  1:40           ` Ming Lei
@ 2019-03-27 12:35             ` Laurence Oberman
  2019-03-27 12:56               ` Laurence Oberman
  0 siblings, 1 reply; 18+ messages in thread
From: Laurence Oberman @ 2019-03-27 12:35 UTC (permalink / raw)
  To: Ming Lei; +Cc: Bart Van Assche, linux-rdma, linux-block, Jens Axboe, linux-scsi

On Wed, 2019-03-27 at 09:40 +0800, Ming Lei wrote:
> On Tue, Mar 26, 2019 at 11:17 PM Laurence Oberman <
> loberman@redhat.com> wrote:
> > 
> > On Thu, 2019-03-21 at 08:44 -0400, Laurence Oberman wrote:
> > > On Wed, 2019-03-20 at 16:35 -0400, Laurence Oberman wrote:
> > > > On Wed, 2019-03-20 at 13:28 -0700, Bart Van Assche wrote:
> > > > > On Wed, 2019-03-20 at 11:11 -0400, Laurence Oberman wrote:
> > > > > > On Wed, 2019-03-20 at 09:45 -0400, Laurence Oberman wrote:
> > > > > > > Hello Bart, I hope all is well with you.
> > > > > > > 
> > > > > > > Quick question
> > > > > > > preparing to test v5.1-rc2 SRP, my usual method is first
> > > > > > > validate
> > > > > > > the
> > > > > > > prior kernel I had in place.
> > > > > > > This had passed tests previously (5.0.0-rc2) but I had
> > > > > > > not
> > > > > > > run
> > > > > > > the
> > > > > > > target server reboot test, just the disconnect tests.
> > > > > > > 
> > > > > > > Today with mapped SRP devices I rebooted the target
> > > > > > > server
> > > > > > > and
> > > > > > > the
> > > > > > > client panicked.
> > > > > > > 
> > > > > > > Its been a while and I have been so busy that have not
> > > > > > > kept
> > > > > > > up
> > > > > > > with
> > > > > > > all
> > > > > > > the fixes. Is this a known issue.
> > > > > > > 
> > > > > > > Thanks
> > > > > > > Laurence
> > > > > > > 
> > > > > > > 5414228.917507] scsi host2: ib_srp: Path record query
> > > > > > > failed:
> > > > > > > sgid
> > > > > > > fe80:0000:0000:0000:7cfe:9003:0072:6ed3, dgid
> > > > > > > fe80:0000:0000:0000:7cfe:9003:0072:6e4f, pkey 0xffff,
> > > > > > > service_id
> > > > > > > 0x7cfe900300726e4e
> > > > > > > [5414229.014355] scsi host2: reconnect attempt 7 failed
> > > > > > > (-
> > > > > > > 110)
> > > > > > > [5414239.318161] scsi host2: ib_srp: Sending CM DREQ
> > > > > > > failed
> > > > > > > [5414239.318165] scsi host2: ib_srp: Sending CM DREQ
> > > > > > > failed
> > > > > > > [5414239.318167] scsi host2: ib_srp: Sending CM DREQ
> > > > > > > failed
> > > > > > > [5414239.318168] scsi host2: ib_srp: Sending CM DREQ
> > > > > > > failed
> > > > > > > [5414239.318170] scsi host2: ib_srp: Sending CM DREQ
> > > > > > > failed
> > > > > > > [5414239.318172] scsi host2: ib_srp: Sending CM DREQ
> > > > > > > failed
> > > > > > > [5414239.318173] scsi host2: ib_srp: Sending CM DREQ
> > > > > > > failed
> > > > > > > [5414239.318175] scsi host2: ib_srp: Sending CM DREQ
> > > > > > > failed
> > > > > > > [5414243.670072] scsi host2: ib_srp: Got failed path rec
> > > > > > > status
> > > > > > > -110
> > > > > > > [5414243.702179] scsi host2: ib_srp: Path record query
> > > > > > > failed:
> > > > > > > sgid
> > > > > > > fe80:0000:0000:0000:7cfe:9003:0072:6ed3, dgid
> > > > > > > fe80:0000:0000:0000:7cfe:9003:0072:6e4f, pkey 0xffff,
> > > > > > > service_id
> > > > > > > 0x7cfe900300726e4e
> > > > > > > [5414243.799313] scsi host2: reconnect attempt 8 failed
> > > > > > > (-
> > > > > > > 110)
> > > > > > > [5414247.510115] scsi host1: ib_srp: Sending CM REQ
> > > > > > > failed
> > > > > > > [5414247.510140] scsi host1: reconnect attempt 1 failed
> > > > > > > (-
> > > > > > > 104)
> > > > > > > [5414247.849078] BUG: unable to handle kernel NULL
> > > > > > > pointer
> > > > > > > dereference
> > > > > > > at 00000000000000b8
> > > > > > > [5414247.893793] #PF error: [normal kernel read fault]
> > > > > > > [5414247.921839] PGD 0 P4D 0
> > > > > > > [5414247.937280] Oops: 0000 [#1] SMP PTI
> > > > > > > [5414247.958332] CPU: 4 PID: 7773 Comm: kworker/4:1H
> > > > > > > Kdump:
> > > > > > > loaded
> > > > > > > Tainted: G          I       5.0.0-rc2+ #2
> > > > > > > [5414248.012856] Hardware name: HP ProLiant DL380 G7,
> > > > > > > BIOS
> > > > > > > P67
> > > > > > > 08/16/2015
> > > > > > > [5414248.026174] device-mapper: multipath: Failing path
> > > > > > > 8:48.
> > > > > > > 
> > > > > > > 
> > > > > > > [5414248.050003] Workqueue: kblockd blk_mq_run_work_fn
> > > > > > > [5414248.108378] RIP:
> > > > > > > 0010:blk_mq_dispatch_rq_list+0xc9/0x590
> > > > > > > [5414248.139724] Code: 0f 85 c2 04 00 00 83 44 24 28 01
> > > > > > > 48 8b
> > > > > > > 45
> > > > > > > 00
> > > > > > > 48
> > > > > > > 39 c5 0f 84 ea 00 00 00 48 8b 5d 00 80 3c 24 00 4c 8d 6b
> > > > > > > b8
> > > > > > > 4c
> > > > > > > 8b
> > > > > > > 63
> > > > > > > c8
> > > > > > > 75 25 <49> 8b 84 24 b8 00 00 00 48 8b 40 40 48 8b 40 10
> > > > > > > 48 85
> > > > > > > c0
> > > > > > > 74
> > > > > > > 10
> > > > > > > 4c
> > > > > > > [5414248.246176] RSP: 0018:ffffb1cd8760fd90 EFLAGS:
> > > > > > > 00010246
> > > > > > > [5414248.275599] RAX: ffffa049d67a1308 RBX:
> > > > > > > ffffa049d67a1308
> > > > > > > RCX:
> > > > > > > 0000000000000004
> > > > > > > [5414248.316090] RDX: 0000000000000000 RSI:
> > > > > > > ffffb1cd8760fe20
> > > > > > > RDI:
> > > > > > > ffffa0552ca08000
> > > > > > > [5414248.356884] RBP: ffffb1cd8760fe20 R08:
> > > > > > > 0000000000000000
> > > > > > > R09:
> > > > > > > 8080808080808080
> > > > > > > [5414248.397632] R10: 0000000000000001 R11:
> > > > > > > 0000000000000001
> > > > > > > R12:
> > > > > > > 0000000000000000
> > > > > > > [5414248.439323] R13: ffffa049d67a12c0 R14:
> > > > > > > 0000000000000000
> > > > > > > R15:
> > > > > > > ffffa0552ca08000
> > > > > > > [5414248.481743] FS:  0000000000000000(0000)
> > > > > > > GS:ffffa04a37880000(0000)
> > > > > > > knlGS:0000000000000000
> > > > > > > [5414248.528310] CS:  0010 DS: 0000 ES: 0000 CR0:
> > > > > > > 0000000080050033
> > > > > > > [5414248.561779] CR2: 00000000000000b8 CR3:
> > > > > > > 0000000e9d40e004
> > > > > > > CR4:
> > > > > > > 00000000000206e0
> > > > > > > [5414248.602420] Call Trace:
> > > > > > > [5414248.616660]  blk_mq_sched_dispatch_requests+0x15c/0x
> > > > > > > 180
> > > > > > > [5414248.647066]  __blk_mq_run_hw_queue+0x5f/0xf0
> > > > > > > [5414248.672633]  process_one_work+0x171/0x370
> > > > > > > [5414248.695443]  worker_thread+0x49/0x3f0
> > > > > > > [5414248.716730]  kthread+0xf8/0x130
> > > > > > > [5414248.735085]  ? max_active_store+0x80/0x80
> > > > > > > [5414248.758569]  ? kthread_bind+0x10/0x10
> > > > > > > [5414248.779953]  ret_from_fork+0x35/0x40
> > > > > > > 
> > > > > > > [5414248.801005] Modules linked in: ib_isert
> > > > > > > iscsi_target_mod
> > > > > > > target_core_mod ib_srp rpcrdma scsi_transport_srp
> > > > > > > rdma_ucm
> > > > > > > ib_iser
> > > > > > > ib_ipoib ib_umad rdma_cm libiscsi iw_cm
> > > > > > > scsi_transport_iscsi
> > > > > > > ib_cm
> > > > > > > sunrpc mlx5_ib ib_uverbs ib_core intel_powerclamp
> > > > > > > coretemp
> > > > > > > kvm_intel
> > > > > > > kvm irqbypass ipmi_ssif crct10dif_pclmul crc32_pclmul
> > > > > > > ghash_clmulni_intel aesni_intel iTCO_wdt crypto_simd
> > > > > > > cryptd
> > > > > > > gpio_ich
> > > > > > > iTCO_vendor_support glue_helper joydev ipmi_si
> > > > > > > dm_service_time
> > > > > > > pcspkr
> > > > > > > ipmi_devintf hpilo hpwdt sg ipmi_msghandler
> > > > > > > acpi_power_meter
> > > > > > > lpc_ich
> > > > > > > i7core_edac pcc_cpufreq dm_multipath ip_tables xfs
> > > > > > > libcrc32c
> > > > > > > radeon
> > > > > > > sd_mod i2c_algo_bit drm_kms_helper syscopyarea
> > > > > > > sysfillrect
> > > > > > > sysimgblt
> > > > > > > fb_sys_fops ttm drm mlx5_core crc32c_intel serio_raw
> > > > > > > i2c_core
> > > > > > > hpsa
> > > > > > > bnx2
> > > > > > > scsi_transport_sas mlxfw devlink dm_mirror dm_region_hash
> > > > > > > dm_log
> > > > > > > dm_mod
> > > > > > > [5414249.199354] CR2: 00000000000000b8
> > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > > Looking at the vmcore
> > > > > > 
> > > > > > PID: 7773   TASK: ffffa04a2c1e2b80  CPU: 4   COMMAND:
> > > > > > "kworker/4:1H"
> > > > > >  #0 [ffffb1cd8760fab0] machine_kexec at ffffffffaaa6003f
> > > > > >  #1 [ffffb1cd8760fb08] __crash_kexec at ffffffffaab373ed
> > > > > >  #2 [ffffb1cd8760fbd0] crash_kexec at ffffffffaab385b9
> > > > > >  #3 [ffffb1cd8760fbe8] oops_end at ffffffffaaa31931
> > > > > >  #4 [ffffb1cd8760fc08] no_context at ffffffffaaa6eb59
> > > > > >  #5 [ffffb1cd8760fcb0] do_page_fault at ffffffffaaa6feb2
> > > > > >  #6 [ffffb1cd8760fce0] page_fault at ffffffffab2010ee
> > > > > >     [exception RIP: blk_mq_dispatch_rq_list+201]
> > > > > >     RIP: ffffffffaad90589  RSP: ffffb1cd8760fd90  RFLAGS:
> > > > > > 00010246
> > > > > >     RAX: ffffa049d67a1308  RBX: ffffa049d67a1308  RCX:
> > > > > > 0000000000000004
> > > > > >     RDX: 0000000000000000  RSI: ffffb1cd8760fe20  RDI:
> > > > > > ffffa0552ca08000
> > > > > >     RBP: ffffb1cd8760fe20   R8: 0000000000000000   R9:
> > > > > > 8080808080808080
> > > > > >     R10: 0000000000000001  R11: 0000000000000001  R12:
> > > > > > 0000000000000000
> > > > > >     R13: ffffa049d67a12c0  R14: 0000000000000000  R15:
> > > > > > ffffa0552ca08000
> > > > > >     ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
> > > > > >  #7 [ffffb1cd8760fe18] blk_mq_sched_dispatch_requests at
> > > > > > ffffffffaad9570c
> > > > > >  #8 [ffffb1cd8760fe60] __blk_mq_run_hw_queue at
> > > > > > ffffffffaad8de3f
> > > > > >  #9 [ffffb1cd8760fe78] process_one_work at ffffffffaaab0ab1
> > > > > > #10 [ffffb1cd8760feb8] worker_thread at ffffffffaaab11d9
> > > > > > #11 [ffffb1cd8760ff10] kthread at ffffffffaaab6758
> > > > > > #12 [ffffb1cd8760ff50] ret_from_fork at ffffffffab200215
> > > > > > 
> > > > > > We were working on this request_queue for
> > > > > >  blk_mq_sched_dispatch_requests
> > > > > > 
> > > > > > crash> dev -d | grep ffffa0552ca08000
> > > > > >     8
> > > > > > ffffa055c81b5800   sdd        ffffa0552ca08000       0     
> > > > > > 0
> > > > > > 
> > > > > > 0
> > > > > > ]
> > > > > > 
> > > > > > That device was no longer accessible
> > > > > > 
> > > > > > sdev_state = SDEV_TRANSPORT_OFFLINE,
> > > > > > 
> > > > > > So it looks like we tried to process a no longer valid list
> > > > > > entry
> > > > > > in
> > > > > > blk_mq_dispatch_rq_list
> > > > > > 
> > > > > > /home/loberman/rpmbuild/BUILD/kernel-5.0.0_rc2+/block/blk-
> > > > > > mq.h:
> > > > > > 211
> > > > > > 0xffffffffaad90589
> > > > > > <blk_mq_dispatch_rq_list+201>:       mov    0xb8(%r12),%rax
> > > > > > 
> > > > > > R12 is NULL
> > > > > > 
> > > > > > 
> > > > > > From
> > > > > > static inline bool blk_mq_get_dispatch_budget(struct
> > > > > > blk_mq_hw_ctx
> > > > > > *hctx)
> > > > > > {
> > > > > >         struct request_queue *q = hctx->queue;
> > > > > > 
> > > > > >         if (q->mq_ops->get_budget)
> > > > > >                 return q->mq_ops->get_budget(hctx);
> > > > > >         return true;
> > > > > > }
> > > > > > 
> > > > > > Willw ait for a reply befaore i try the newer kernel, but
> > > > > > looks
> > > > > > like a
> > > > > > use after free to me
> > > > > 
> > > > > Hi Laurence,
> > > > > 
> > > > > I don't think that any of the recent SRP initiator changes
> > > > > can be
> > > > > the
> > > > > root
> > > > > cause of this crash. However, significant changes went
> > > > > upstream
> > > > > in
> > > > > the block
> > > > > layer core during the v5.1-rc1 merge window, e.g. multi-page
> > > > > bvec
> > > > > support.
> > > > > Is it possible for you to bisect this kernel oops?
> > > > > 
> > > > > Thanks,
> > > > > 
> > > > > Bart.
> > > > > 
> > > > 
> > > > OK, I will see I can reproduce on the fly and I will bisect.
> > > > I do agree its not SRP, more likley some block layer race.
> > > > I was just able to reproduce it using SRP
> > > > 
> > > > Note that this was on 5.0.0-rc2+, prior to me trying 5.1.
> > > > 
> > > > I usually reboot the target server as part of my test series
> > > > but
> > > > when
> > > > I
> > > > last tested 5.0.0-rc2+ I only reset the SRP interfaces and had
> > > > devices
> > > > get rediscovered.
> > > > 
> > > > I did not see it during those tests.
> > > > 
> > > > Back when I have more to share.
> > > > 
> > > > Many Thanks for your time as always
> > > > Laurence
> > > > 
> > > > 
> > > 
> > > Something crept in, in the block layer causing a use after free
> > > 4.19.0-rc1+ does not have the issue, so I will narrow the bisect
> > > Thanks
> > > Laurence
> > > 
> > 
> > This took a long time to bisect.
> > Repeating the issue seen. We have changes that when the target is
> > rebooted with mapped srp devices the client then experiences a ist
> > corruption and panic as already shown.
> > 
> > Some stacks
> > 
> > [  222.631998] scsi host1: ib_srp: Path record query failed: sgid
> > fe80:0000:0000:0000:7cfe:9003:0072:6ed2, dgid
> > fe80:0000:0000:0000:7cfe:9003:0072:6e4e, pkey 0xffff, service_id
> > 0x7cfe900300726e4e
> > [  222.729639] scsi host1: reconnect attempt 1 failed (-110)
> > [  223.176766] fast_io_fail_tmo expired for SRP port-1:1 / host1.
> > 
> > [  223.518759] BUG: unable to handle kernel NULL pointer
> > dereference at
> > 00000000000000b8
> > [  223.519736] sd 1:0:0:0: rejecting I/O to offline device
> > [  223.563769] #PF error: [normal kernel read fault]
> > [  223.563770] PGD 0 P4D 0
> > [  223.563774] Oops: 0000 [#1] SMP PTI
> > [  223.563778] CPU: 3 PID: 9027 Comm: kworker/3:1H Tainted:
> > G          I       5.0.0-rc1 #22
> > [  223.563779] Hardware name: HP ProLiant DL380 G7, BIOS P67
> > 08/16/2015
> > [  223.563787] Workqueue: kblockd blk_mq_run_work_fn
> > [  223.593723] device-mapper: multipath: Failing path 8:48.
> > [  223.620801] RIP: 0010:blk_mq_dispatch_rq_list+0xc9/0x590
> > [  223.635266] print_req_error: I/O error, dev dm-6, sector 8191872
> > flags 80700
> > [  223.655565] Code: 0f 85 c2 04 00 00 83 44 24 28 01 48 8b 45 00
> > 48 39
> > c5 0f 84 ea 00 00 00 48 8b 5d 00 80 3c 24 00 4c 8d 6b b8 4c 8b 63
> > c8 75
> > 25 <49> 8b 84 24 b8 00 00 00 48 8b 40 40 48 8b 40 10 48 85 c0 74 10
> > 4c
> > [  223.655566] RSP: 0018:ffffa65b4c43fd90 EFLAGS: 00010246
> > [  223.655570] RAX: ffff93ed9bfdbbc8 RBX: ffff93ed9bfdbbc8 RCX:
> > 0000000000000004
> > [  223.702351] print_req_error: I/O error, dev dm-6, sector 8191872
> > flags 0
> > [  223.737640] RDX: 0000000000000000 RSI: ffffa65b4c43fe20 RDI:
> > ffff93ed9b838000
> > [  223.737641] RBP: ffffa65b4c43fe20 R08: 0000000000000000 R09:
> > 8080808080808080
> > [  223.737642] R10: 0000000000000001 R11: 071c71c71c71c71c R12:
> > 0000000000000000
> > [  223.737643] R13: ffff93ed9bfdbb80 R14: 0000000000000000 R15:
> > ffff93ed9b838000
> > [  223.737645] FS:  0000000000000000(0000)
> > GS:ffff93ee33840000(0000)
> > knlGS:0000000000000000
> > [  223.737646] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [  223.737646] CR2: 00000000000000b8 CR3: 000000059da0e006 CR4:
> > 00000000000206e0
> > [  223.737648] Call Trace:
> > 
> > [  223.737657]  blk_mq_sched_dispatch_requests+0x15c/0x180  ***
> > Freed
> > already
> > 
> > [  223.737660]  __blk_mq_run_hw_queue+0x5f/0xf0
> > [  223.737665]  process_one_work+0x171/0x370
> > [  223.737667]  worker_thread+0x49/0x3f0
> > [  223.737670]  kthread+0xf8/0x130
> > [  223.737673]  ? max_active_store+0x80/0x80
> > 
> > And:
> > 
> > [  643.425005] device-mapper: multipath: Failing path 67:0.
> > [  643.696365] ------------[ cut here ]------------
> > [  643.722927] list_add corruption. prev->next should be next
> > (ffffc8c0c3bd9448), but was ffff93b971965e08.
> > (prev=ffff93b971965e08).
> > [  643.787089] WARNING: CPU: 14 PID: 6533 at lib/list_debug.c:28
> > __list_add_valid+0x6a/0x70
> > [  643.830951] Modules linked in: ib_isert iscsi_target_mod
> > target_core_mod ib_srp scsi_transport_srp rpcrdma ib_ipoib ib_umad
> > rdma_ucm ib_iser rdma_cm iw_cm libiscsi sunrpc scsi_transport_iscsi
> > ib_cm mlx5_ib ib_uverbs ib_core intel_powerclamp coretemp kvm_intel
> > kvm
> > irqbypass crct10dif_pclmul crc32_pclmul iTCO_wdt
> > ghash_clmulni_intel
> > ipmi_ssif aesni_intel iTCO_vendor_support gpio_ich crypto_simd
> > dm_service_time cryptd glue_helper joydev ipmi_si ipmi_devintf
> > pcspkr
> > hpilo hpwdt sg ipmi_msghandler acpi_power_meter pcc_cpufreq lpc_ich
> > i7core_edac dm_multipath ip_tables xfs libcrc32c radeon sd_mod
> > i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt
> > fb_sys_fops ttm drm crc32c_intel serio_raw mlx5_core i2c_core bnx2
> > hpsa
> > mlxfw scsi_transport_sas dm_mirror dm_region_hash dm_log dm_mod
> > [  644.224150] CPU: 14 PID: 6533 Comm: kworker/14:1H Tainted:
> > G          I       4.20.0+ #26
> > [  644.269637] Hardware name: HP ProLiant DL380 G7, BIOS P67
> > 08/16/2015
> > [  644.305110] Workqueue: kblockd blk_mq_run_work_fn
> > [  644.330984] RIP: 0010:__list_add_valid+0x6a/0x70
> > [  644.357601] Code: 31 c0 48 c7 c7 70 e4 ab b6 e8 22 a6 cc ff 0f
> > 0b 31
> > c0 c3 48 89 d1 48 c7 c7 20 e4 ab b6 48 89 f2 48 89 c6 31 c0 e8 06
> > a6 cc
> > ff <0f> 0b 31 c0 c3 90 48 8b 07 48 b9 00 01 00 00 00 00 ad de 48 8b
> > 57
> > [  644.462542] RSP: 0018:ffffa8ccc72bfc00 EFLAGS: 00010286
> > [  644.491594] RAX: 0000000000000000 RBX: ffff93b971965dc0 RCX:
> > 0000000000000000
> > [  644.532745] RDX: 0000000000000001 RSI: ffff93b9b79d67b8 RDI:
> > ffff93b9b79d67b8
> > [  644.573533] RBP: ffffc8c0c3bd9448 R08: 0000000000000000 R09:
> > 000000000000072c
> > [  644.614180] R10: 0000000000000000 R11: ffffa8ccc72bf968 R12:
> > ffff93b96d454c00
> > [  644.654683] R13: ffffc8c0c3bd9440 R14: ffff93b971965e08 R15:
> > ffff93b971965e08
> > [  644.694275] FS:  0000000000000000(0000)
> > GS:ffff93b9b79c0000(0000)
> > knlGS:0000000000000000
> > [  644.739906] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [  644.771879] CR2: 00007fd566c10000 CR3: 0000000253e0e002 CR4:
> > 00000000000206e0
> > [  644.811438] Call Trace:
> > [  644.824809]  __blk_mq_insert_request+0x62/0x130
> > [  644.849886]  blk_mq_sched_insert_request+0x13c/0x1b0
> > [  644.877402]  blk_mq_try_issue_directly+0x105/0x2c0
> > [  644.904452]  blk_insert_cloned_request+0x9a/0x130
> > [  644.931146]  ? ktime_get+0x37/0x90
> > [  644.950545]  dm_mq_queue_rq+0x21c/0x3f0 [dm_mod]
> > [  644.977064]  ? blk_mq_get_driver_tag+0xa1/0x120
> > [  645.003002]  blk_mq_dispatch_rq_list+0x8e/0x590
> > [  645.029812]  ? __switch_to_asm+0x40/0x70
> > [  645.052059]  ? __switch_to_asm+0x34/0x70
> > [  645.074664]  ? __switch_to_asm+0x40/0x70
> > [  645.097111]  ? __switch_to_asm+0x34/0x70
> > [  645.119948]  ? __switch_to_asm+0x34/0x70
> > [  645.143101]  ? __switch_to_asm+0x40/0x70
> > [  645.165273]  ? syscall_return_via_sysret+0x10/0x7f
> > [  645.192161]  blk_mq_sched_dispatch_requests+0xe8/0x180
> > [  645.221460]  __blk_mq_run_hw_queue+0x5f/0xf0
> > [  645.244766]  process_one_work+0x171/0x370
> > [  645.267164]  worker_thread+0x49/0x3f0
> > [  645.287860]  kthread+0xf8/0x130
> > [  645.304894]  ? max_active_store+0x80/0x80
> > [  645.327748]  ? kthread_bind+0x10/0x10
> > [  645.347898]  ret_from_fork+0x35/0x40
> > [  645.368298] ---[ end trace afa70bf68ffb006b ]---
> > [  645.397356] ------------[ cut here ]------------
> > [  645.423878] list_add corruption. prev->next should be next
> > (ffffc8c0c3bd9448), but was ffff93b971965e08.
> > (prev=ffff93b971965e08).
> > [  645.488009] WAR
> > NING: CPU: 14 PID: 6533 at lib/list_debug.c:28
> > __list_add_valid+0x6a/0x70
> > 
> > 
> > I started just looking at block but nothing made sense so re-ran
> > the
> > bisect with the entire kernel.
> > 
> > $ git bisect start
> > $ git bisect good v4.20
> > $ git bisect bad v5.0-rc1
> > 
> > Bisecting: 5477 revisions left to test after this (roughly 13
> > steps)
> > *** Groan
> > 
> > I got to here and the problem is its an entire merge
> > 
> > [loberman@ibclient linux_torvalds]$ git bisect bad
> > 4b9254328254bed12a4ac449cdff2c332e630837 is the first bad commit
> > [loberman@ibclient linux_torvalds]$ git show
> > 4b9254328254bed12a4ac449cdff2c332e630837
> > commit 4b9254328254bed12a4ac449cdff2c332e630837
> > Merge: 1a9430d cd19181
> > Author: Jens Axboe <axboe@kernel.dk>
> > Date:   Tue Dec 18 08:29:53 2018 -0700
> > 
> >     Merge branch 'for-4.21/block' into for-4.21/aio
> > 
> >     * for-4.21/block: (351 commits)
> >       blk-mq: enable IO poll if .nr_queues of type poll > 0
> 
> You may set 4b9254328254b or 'cd19181bf9ad' as 'git bad' and start
> the
> 'git bisect'
> again.
> 
> thanks,
> Ming Lei

Hello Ming, OK, but that means starting with v4.20 as the good point, so it
will be a long bisect again.

From the vmcore I got to the following analysis:

[  382.412285] fast_io_fail_tmo expired for SRP port-2:1 / host2.
[  382.604347] scsi host2: ib_srp: Got failed path rec status -110
[  382.638622] scsi host2: ib_srp: Path record query failed: sgid
fe80:0000:0000:0000:7cfe:9003:0072:6ed2, dgid
fe80:0000:0000:0000:7cfe:9003:0072:6e4e, pkey 0xffff, service_id
0x7cfe900300726e4e
[  382.736300] scsi host2: reconnect attempt 1 failed (-110)
[  383.239347] BUG: unable to handle kernel NULL pointer dereference at
00000000000000b8
[  383.239349] sd 2:0:0:0: rejecting I/O to offline device
[  383.239370] device-mapper: multipath: Failing path 8:64.
[  383.241278] sd 2:0:0:28: rejecting I/O to offline device
[  383.241284] sd 2:0:0:27: rejecting I/O to offline device
[  383.241289] device-mapper: multipath: Failing path 8:96.
[  383.241291] device-mapper: multipath: Failing path 8:160.
[  383.241301] print_req_error: I/O error, dev dm-8, sector 8191872
flags 80700
[  383.241303] print_req_error: I/O error, dev dm-7, sector 8191872
flags 80700
[  383.241335] print_req_error: I/O error, dev dm-8, sector 8191872
flags 0
[  383.241338] Buffer I/O error on dev dm-8, logical block 8191872,
async page read
[  383.241355] print_req_error: I/O error, dev dm-7, sector 8191872
flags 0
[  383.241357] Buffer I/O error on dev dm-7, logical block 8191872,
async page read
[  383.241367] print_req_error: I/O error, dev dm-7, sector 8191873
flags 0
[  383.241368] Buffer I/O error on dev dm-7, logical block 8191873,
async page read
[  383.241377] print_req_error: I/O error, dev dm-7, sector 8191874
flags 0
[  383.241378] Buffer I/O error on dev dm-7, logical block 8191874,
async page read
[  383.241387] print_req_error: I/O error, dev dm-7, sector 8191875
flags 0
[  383.241388] Buffer I/O error on dev dm-7, logical block 8191875,
async page read
[  383.241399] print_req_error: I/O error, dev dm-8, sector 8191873
flags 0
[  383.241400] Buffer I/O error on dev dm-8, logical block 8191873,
async page read
[  383.241409] print_req_error: I/O error, dev dm-7, sector 8191876
flags 0
[  383.241410] Buffer I/O error on dev dm-7, logical block 8191876,
async page read
[  383.241421] print_req_error: I/O error, dev dm-8, sector 8191874
flags 0
[  383.241422] Buffer I/O error on dev dm-8, logical block 8191874,
async page read
[  383.241431] Buffer I/O error on dev dm-7, logical block 8191877,
async page read
[  383.241440] Buffer I/O error on dev dm-8, logical block 8191875,
async page read

[  383.282566] #PF error: [normal kernel read fault]

[  384.290154] PGD 0 P4D 0 
[  384.303791] Oops: 0000 [#1] SMP PTI
[  384.323687] CPU: 1 PID: 9191 Comm: kworker/1:1H Kdump: loaded
Tainted: G        W I       5.1.0-rc2+ #39
[  384.375898] Hardware name: HP ProLiant DL380 G7, BIOS P67 08/16/2015

[  384.411812] Workqueue: kblockd blk_mq_run_work_fn
[  384.438498] RIP: 0010:blk_mq_dispatch_rq_list+0x72/0x570
[  384.468539] Code: 08 84 d2 0f 85 cf 03 00 00 45 31 f6 c7 44 24 38 00
00 00 00 c7 44 24 3c 00 00 00 00 80 3c 24 00 4c 8d 6b b8 48 8b 6b c8 75
24 <48> 8b 85 b8 00 00 00 48 8b 40 40 48 8b 40 10 48 85 c0 74 10 48 89
[  384.573942] RSP: 0018:ffffa9fe0759fd90 EFLAGS: 00010246
[  384.604181] RAX: ffff9de9c4d3bbc8 RBX: ffff9de9c4d3bbc8 RCX:
0000000000000004
[  384.645173] RDX: 0000000000000000 RSI: ffffa9fe0759fe20 RDI:
ffff9dea0dad87f0
[  384.686668] RBP: 0000000000000000 R08: 0000000000000000 R09:
8080808080808080
[  384.727154] R10: ffff9dea33827660 R11: ffffee9d9e097a00 R12:
ffffa9fe0759fe20
[  384.767777] R13: ffff9de9c4d3bb80 R14: 0000000000000000 R15:
ffff9dea0dad87f0
[  384.807728] FS:  0000000000000000(0000) GS:ffff9dea33800000(0000)
knlGS:0000000000000000
[  384.852776] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  384.883825] CR2: 00000000000000b8 CR3: 000000092560e002 CR4:
00000000000206e0
[  384.922960] Call Trace:
[  384.936550]  ? blk_mq_flush_busy_ctxs+0xca/0x120
[  384.962425]  blk_mq_sched_dispatch_requests+0x15c/0x180
[  384.992025]  __blk_mq_run_hw_queue+0x5f/0x100
[  385.016980]  process_one_work+0x171/0x380
[  385.040065]  worker_thread+0x49/0x3f0
[  385.060971]  kthread+0xf8/0x130
[  385.079101]  ? max_active_store+0x80/0x80
[  385.102005]  ? kthread_bind+0x10/0x10
[  385.122531]  ret_from_fork+0x35/0x40
[  385.142477] Modules linked in: ib_isert iscsi_target_mod
target_core_mod ib_srp scsi_transport_srp rpcrdma rdma_ucm ib_iser
ib_umad ib_ipoib sunrpc rdma_cm iw_cm libiscsi scsi_transport_iscsi
ib_cm mlx5_ib ib_uverbs ib_core intel_powerclamp coretemp kvm_intel
ipmi_ssif kvm iTCO_wdt gpio_ich iTCO_vendor_support irqbypass
crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel
dm_service_time crypto_simd cryptd glue_helper ipmi_si pcspkr lpc_ich
joydev ipmi_devintf hpilo hpwdt sg ipmi_msghandler i7core_edac
acpi_power_meter pcc_cpufreq dm_multipath ip_tables xfs libcrc32c
radeon sd_mod i2c_algo_bit drm_kms_helper syscopyarea sysfillrect
sysimgblt fb_sys_fops ttm serio_raw crc32c_intel drm mlx5_core i2c_core
bnx2 hpsa mlxfw scsi_transport_sas dm_mirror dm_region_hash dm_log
dm_mod
[  385.534331] CR2: 00000000000000b8
crash> 

kblockd runs blk_mq_run_work_fn, which ends up in blk_mq_dispatch_rq_list,
and that is where we hit the panic.

crash> bt
PID: 9191   TASK: ffff9dea0a8395c0  CPU: 1   COMMAND: "kworker/1:1H"
 #0 [ffffa9fe0759fab0] machine_kexec at ffffffff938606cf
 #1 [ffffa9fe0759fb08] __crash_kexec at ffffffff9393a48d
 #2 [ffffa9fe0759fbd0] crash_kexec at ffffffff9393b659
 #3 [ffffa9fe0759fbe8] oops_end at ffffffff93831c41
 #4 [ffffa9fe0759fc08] no_context at ffffffff9386ecb9
 #5 [ffffa9fe0759fcb0] do_page_fault at ffffffff93870012
 #6 [ffffa9fe0759fce0] page_fault at ffffffff942010ee
    [exception RIP: blk_mq_dispatch_rq_list+114]
    RIP: ffffffff93b9f202  RSP: ffffa9fe0759fd90  RFLAGS: 00010246
    RAX: ffff9de9c4d3bbc8  RBX: ffff9de9c4d3bbc8  RCX: 0000000000000004
    RDX: 0000000000000000  RSI: ffffa9fe0759fe20  RDI: ffff9dea0dad87f0
    RBP: 0000000000000000   R8: 0000000000000000   R9: 8080808080808080
    R10: ffff9dea33827660  R11: ffffee9d9e097a00  R12: ffffa9fe0759fe20
    R13: ffff9de9c4d3bb80  R14: 0000000000000000  R15: ffff9dea0dad87f0
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #7 [ffffa9fe0759fe18] blk_mq_sched_dispatch_requests at
ffffffff93ba455c
 #8 [ffffa9fe0759fe60] __blk_mq_run_hw_queue at ffffffff93b9e3cf
 #9 [ffffa9fe0759fe78] process_one_work at ffffffff938b0c21
#10 [ffffa9fe0759feb8] worker_thread at ffffffff938b18d9
#11 [ffffa9fe0759ff10] kthread at ffffffff938b6ee8
#12 [ffffa9fe0759ff50] ret_from_fork at ffffffff94200215

crash> whatis blk_mq_sched_dispatch_requests
void blk_mq_sched_dispatch_requests(struct blk_mq_hw_ctx *);

void blk_mq_sched_dispatch_requests(struct blk_mq_hw_ctx *hctx)
{
        struct request_queue *q = hctx->queue;
        struct elevator_queue *e = q->elevator;
        const bool has_sched_dispatch = e && e->type->ops.dispatch_request;

        ***** Should have panicked here

        LIST_HEAD(rq_list);
..
..
        /*
         * Only ask the scheduler for requests, if we didn't have residual
         * requests from the dispatch list. This is to avoid the case where
         * we only ever dispatch a fraction of the requests available because
         * of low device queue depth. Once we pull requests out of the IO
         * scheduler, we can no longer merge or sort them. So it's best to
         * leave them there for as long as we can. Mark the hw queue as
         * needing a restart in that case.
         *
         * We want to dispatch from the scheduler if there was nothing
         * on the dispatch list or we were able to dispatch from the
         * dispatch list.
         */
        if (!list_empty(&rq_list)) {
                blk_mq_sched_mark_restart_hctx(hctx);
                if (blk_mq_dispatch_rq_list(q, &rq_list, false)) {
                        if (has_sched_dispatch)
                                blk_mq_do_dispatch_sched(hctx);
                        else
                                blk_mq_do_dispatch_ctx(hctx);
                }
        } else if (has_sched_dispatch) {
                blk_mq_do_dispatch_sched(hctx);
        } else if (hctx->dispatch_busy) {
                /* dequeue request one by one from sw queue if queue is busy */
                blk_mq_do_dispatch_ctx(hctx);
        } else {
                blk_mq_flush_busy_ctxs(hctx, &rq_list);

                ***** Called here

                blk_mq_dispatch_rq_list(q, &rq_list, false);
        }
}


crash> request_queue.elevator 0xffff9dea0dad87f0
  elevator = 0x0
Should have panicked earlier as elevator is NULL

The request was for 2:0:0:0, whose sdev_state = SDEV_TRANSPORT_OFFLINE.

crash> dis -l blk_mq_dispatch_rq_list+114
/home/loberman/git/linux_torvalds/block/blk-mq.h: 210
0xffffffff93b9f202 <blk_mq_dispatch_rq_list+114>:       mov    0xb8(%rbp),%rax

RBP: 0000000000000000 



static inline bool blk_mq_get_dispatch_budget(struct blk_mq_hw_ctx *hctx)
{
        struct request_queue *q = hctx->queue;

        if (q->mq_ops->get_budget)                       ******* This line
                return q->mq_ops->get_budget(hctx);
        return true;
}
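As a side note on how the faulting address ties back to the source: with a
NULL struct pointer, the load faults at exactly the byte offset of the field
being read, which is why CR2 is 00000000000000b8 here. A tiny, self-contained
illustration (the struct and field names below are made up for the sketch,
not the real blk-mq layout):

#include <stdio.h>
#include <stddef.h>

struct fake_ctx {
        char pad[0xb8];         /* padding so the next field lands at 0xb8 */
        void *queue;            /* hypothetical field at offset 0xb8 */
};

int main(void)
{
        struct fake_ctx *ctx = NULL;

        /* &ctx->queue is NULL + offsetof(struct fake_ctx, queue) = 0xb8,
         * so dereferencing it would fault with CR2 == 0xb8, matching the
         * oops above. We only print the offset instead of faulting. */
        printf("would fault at %#zx\n",
               (size_t)offsetof(struct fake_ctx, queue));
        (void)ctx;
        return 0;
}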

crash> blk_mq_hw_ctx.queue ffff9de9c4d11000
  queue = 0xffff9dea0dad87f0

crash> request_queue.mq_ops 0xffff9dea0dad87f0 
  mq_ops = 0xffffffff946afdc0

crash> blk_mq_ops 0xffffffff946afdc0
struct blk_mq_ops {
  queue_rq = 0xffffffff93d619e0, 
  commit_rqs = 0x0, 
  get_budget = 0xffffffff93d60510, 
  put_budget = 0xffffffff93d5efd0, 
  timeout = 0xffffffff93d5fa80, 
  poll = 0x0, 
  complete = 0xffffffff93d60f70, 
  init_hctx = 0x0, 
  exit_hctx = 0x0, 
  init_request = 0xffffffff93d5f9e0, 
  exit_request = 0xffffffff93d5f9b0, 
  initialize_rq_fn = 0xffffffff93d5f850, 
  busy = 0xffffffff93d60900, 
  map_queues = 0xffffffff93d5f980, 
  show_rq = 0xffffffff93d68b30
}


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Panic when rebooting target server testing srp on 5.0.0-rc2
  2019-03-27 12:35             ` Laurence Oberman
@ 2019-03-27 12:56               ` Laurence Oberman
  2019-03-27 22:00                 ` Laurence Oberman
  0 siblings, 1 reply; 18+ messages in thread
From: Laurence Oberman @ 2019-03-27 12:56 UTC (permalink / raw)
  To: Ming Lei; +Cc: Bart Van Assche, linux-rdma, linux-block, Jens Axboe, linux-scsi

Truncating email content, starting bisect again as suggested.
Email was getting too long with repetition.

The crux of the issue is repeated here so the topic is easy to follow.

We get to dispatch passing rq_list, but the list has already been
corrupted/freed, so we panic. It is clearly a race and it is present in
v5.x+ kernels. This new bisect should find it.
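For reference, the "list_add corruption. prev->next should be next ..."
warnings quoted earlier come from the list debugging checks that validate a
node's neighbours before splicing in a new entry. A small self-contained
sketch of that invariant (modeled loosely on lib/list_debug.c, not the exact
upstream code):

#include <stdio.h>
#include <stdbool.h>

struct list_head { struct list_head *next, *prev; };

/* Returns false and complains if prev/next do not point at each other,
 * i.e. the list was modified underneath us or an entry was reused. */
static bool list_add_valid_sketch(struct list_head *prev,
                                  struct list_head *next)
{
        if (prev->next != next) {
                fprintf(stderr,
                        "list_add corruption. prev->next should be next (%p), but was %p. (prev=%p).\n",
                        (void *)next, (void *)prev->next, (void *)prev);
                return false;
        }
        if (next->prev != prev) {
                fprintf(stderr,
                        "list_add corruption. next->prev should be prev (%p), but was %p. (next=%p).\n",
                        (void *)prev, (void *)next->prev, (void *)next);
                return false;
        }
        return true;
}

int main(void)
{
        /* 'a' points back at itself, as a freshly (re)initialized or reused
         * entry would; 'b' still believes it sits right after 'a'. */
        struct list_head a = { &a, &a };
        struct list_head b = { &a, &a };

        list_add_valid_sketch(&a, &b);   /* trips the prev->next check */
        return 0;
}

Note that in the warnings quoted above prev->next equals prev itself, the
self-pointing pattern you would expect if the entry had been reinitialized
or freed and reused while something else still tried to add after it, which
lines up with the use-after-free suspicion.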

crash> bt
PID: 9191   TASK: ffff9dea0a8395c0  CPU: 1   COMMAND: "kworker/1:1H"
 #0 [ffffa9fe0759fab0] machine_kexec at ffffffff938606cf
 #1 [ffffa9fe0759fb08] __crash_kexec at ffffffff9393a48d
 #2 [ffffa9fe0759fbd0] crash_kexec at ffffffff9393b659
 #3 [ffffa9fe0759fbe8] oops_end at ffffffff93831c41
 #4 [ffffa9fe0759fc08] no_context at ffffffff9386ecb9
 #5 [ffffa9fe0759fcb0] do_page_fault at ffffffff93870012
 #6 [ffffa9fe0759fce0] page_fault at ffffffff942010ee
    [exception RIP: blk_mq_dispatch_rq_list+114]
    RIP: ffffffff93b9f202  RSP: ffffa9fe0759fd90  RFLAGS: 00010246
    RAX: ffff9de9c4d3bbc8  RBX: ffff9de9c4d3bbc8  RCX: 0000000000000004
    RDX: 0000000000000000  RSI: ffffa9fe0759fe20  RDI: ffff9dea0dad87f0
    RBP: 0000000000000000   R8: 0000000000000000   R9: 8080808080808080
    R10: ffff9dea33827660  R11: ffffee9d9e097a00  R12: ffffa9fe0759fe20
    R13: ffff9de9c4d3bb80  R14: 0000000000000000  R15: ffff9dea0dad87f0
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #7 [ffffa9fe0759fe18] blk_mq_sched_dispatch_requests at
ffffffff93ba455c
 #8 [ffffa9fe0759fe60] __blk_mq_run_hw_queue at ffffffff93b9e3cf
 #9 [ffffa9fe0759fe78] process_one_work at ffffffff938b0c21
#10 [ffffa9fe0759feb8] worker_thread at ffffffff938b18d9
#11 [ffffa9fe0759ff10] kthread at ffffffff938b6ee8
#12 [ffffa9fe0759ff50] ret_from_fork at ffffffff94200215


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Panic when rebooting target server testing srp on 5.0.0-rc2
  2019-03-27 12:56               ` Laurence Oberman
@ 2019-03-27 22:00                 ` Laurence Oberman
  2019-03-27 23:34                   ` Laurence Oberman
  2019-04-03 17:52                   ` Bart Van Assche
  0 siblings, 2 replies; 18+ messages in thread
From: Laurence Oberman @ 2019-03-27 22:00 UTC (permalink / raw)
  To: Ming Lei, jianchao.w.wang
  Cc: Bart Van Assche, linux-rdma, linux-block, Jens Axboe, linux-scsi

On Wed, 2019-03-27 at 08:56 -0400, Laurence Oberman wrote:
> Truncating email content, starting bisect again as suggested.
> Email was getting too long with repetition.
> 
> Crux of the issue repeated here so easy to understand topic
> 
> We got to dispatch passing rq_list and the list is corrupted/freed so
> we panic. Clearly a race and is in v5.x+ kernels.
> This new bisect will find it.
> 
> crash> bt
> PID: 9191   TASK: ffff9dea0a8395c0  CPU: 1   COMMAND: "kworker/1:1H"
>  #0 [ffffa9fe0759fab0] machine_kexec at ffffffff938606cf
>  #1 [ffffa9fe0759fb08] __crash_kexec at ffffffff9393a48d
>  #2 [ffffa9fe0759fbd0] crash_kexec at ffffffff9393b659
>  #3 [ffffa9fe0759fbe8] oops_end at ffffffff93831c41
>  #4 [ffffa9fe0759fc08] no_context at ffffffff9386ecb9
>  #5 [ffffa9fe0759fcb0] do_page_fault at ffffffff93870012
>  #6 [ffffa9fe0759fce0] page_fault at ffffffff942010ee
>     [exception RIP: blk_mq_dispatch_rq_list+114]
>     RIP: ffffffff93b9f202  RSP: ffffa9fe0759fd90  RFLAGS: 00010246
>     RAX: ffff9de9c4d3bbc8  RBX: ffff9de9c4d3bbc8  RCX:
> 0000000000000004
>     RDX: 0000000000000000  RSI: ffffa9fe0759fe20  RDI:
> ffff9dea0dad87f0
>     RBP: 0000000000000000   R8: 0000000000000000   R9:
> 8080808080808080
>     R10: ffff9dea33827660  R11: ffffee9d9e097a00  R12:
> ffffa9fe0759fe20
>     R13: ffff9de9c4d3bb80  R14: 0000000000000000  R15:
> ffff9dea0dad87f0
>     ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
>  #7 [ffffa9fe0759fe18] blk_mq_sched_dispatch_requests at
> ffffffff93ba455c
>  #8 [ffffa9fe0759fe60] __blk_mq_run_hw_queue at ffffffff93b9e3cf
>  #9 [ffffa9fe0759fe78] process_one_work at ffffffff938b0c21
> #10 [ffffa9fe0759feb8] worker_thread at ffffffff938b18d9
> #11 [ffffa9fe0759ff10] kthread at ffffffff938b6ee8
> #12 [ffffa9fe0759ff50] ret_from_fork at ffffffff94200215
> 
Hello Jens, Jianchao
Finally made it to this one.
I will see if I can revert and test

7f556a44e61d0b62d78db9a2662a5f0daef010f2 is the first bad commit
commit 7f556a44e61d0b62d78db9a2662a5f0daef010f2
Author: Jianchao Wang <jianchao.w.wang@oracle.com>
Date:   Fri Dec 14 09:28:18 2018 +0800

    blk-mq: refactor the code of issue request directly
    
    Merge blk_mq_try_issue_directly and __blk_mq_try_issue_directly
    into one interface to unify the interfaces to issue requests
    directly. The merged interface takes over the requests totally,
    it could insert, end or do nothing based on the return value of
    .queue_rq and 'bypass' parameter. Then caller needn't any other
    handling any more and then code could be cleaned up.
    
    And also the commit c616cbee ( blk-mq: punt failed direct issue
    to dispatch list ) always inserts requests to hctx dispatch list
    whenever get a BLK_STS_RESOURCE or BLK_STS_DEV_RESOURCE, this is
    overkill and will harm the merging. We just need to do that for
    the requests that has been through .queue_rq. This patch also
    could fix this.
    
    Signed-off-by: Jianchao Wang <jianchao.w.wang@oracle.com>
    Signed-off-by: Jens Axboe <axboe@kernel.dk>




^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Panic when rebooting target server testing srp on 5.0.0-rc2
  2019-03-27 22:00                 ` Laurence Oberman
@ 2019-03-27 23:34                   ` Laurence Oberman
  2019-03-28  1:31                     ` Ming Lei
  2019-04-03 17:52                   ` Bart Van Assche
  1 sibling, 1 reply; 18+ messages in thread
From: Laurence Oberman @ 2019-03-27 23:34 UTC (permalink / raw)
  To: Ming Lei, jianchao.w.wang
  Cc: Bart Van Assche, linux-rdma, linux-block, Jens Axboe, linux-scsi

On Wed, 2019-03-27 at 18:00 -0400, Laurence Oberman wrote:
> On Wed, 2019-03-27 at 08:56 -0400, Laurence Oberman wrote:
> > Truncating email content, starting bisect again as suggested.
> > Email was getting too long with repetition.
> > 
> > Crux of the issue repeated here so easy to understand topic
> > 
> > We got to dispatch passing rq_list and the list is corrupted/freed
> > so
> > we panic. Clearly a race and is in v5.x+ kernels.
> > This new bisect will find it.
> > 
> > crash> bt
> > PID: 9191   TASK: ffff9dea0a8395c0  CPU: 1   COMMAND:
> > "kworker/1:1H"
> >  #0 [ffffa9fe0759fab0] machine_kexec at ffffffff938606cf
> >  #1 [ffffa9fe0759fb08] __crash_kexec at ffffffff9393a48d
> >  #2 [ffffa9fe0759fbd0] crash_kexec at ffffffff9393b659
> >  #3 [ffffa9fe0759fbe8] oops_end at ffffffff93831c41
> >  #4 [ffffa9fe0759fc08] no_context at ffffffff9386ecb9
> >  #5 [ffffa9fe0759fcb0] do_page_fault at ffffffff93870012
> >  #6 [ffffa9fe0759fce0] page_fault at ffffffff942010ee
> >     [exception RIP: blk_mq_dispatch_rq_list+114]
> >     RIP: ffffffff93b9f202  RSP: ffffa9fe0759fd90  RFLAGS: 00010246
> >     RAX: ffff9de9c4d3bbc8  RBX: ffff9de9c4d3bbc8  RCX:
> > 0000000000000004
> >     RDX: 0000000000000000  RSI: ffffa9fe0759fe20  RDI:
> > ffff9dea0dad87f0
> >     RBP: 0000000000000000   R8: 0000000000000000   R9:
> > 8080808080808080
> >     R10: ffff9dea33827660  R11: ffffee9d9e097a00  R12:
> > ffffa9fe0759fe20
> >     R13: ffff9de9c4d3bb80  R14: 0000000000000000  R15:
> > ffff9dea0dad87f0
> >     ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
> >  #7 [ffffa9fe0759fe18] blk_mq_sched_dispatch_requests at
> > ffffffff93ba455c
> >  #8 [ffffa9fe0759fe60] __blk_mq_run_hw_queue at ffffffff93b9e3cf
> >  #9 [ffffa9fe0759fe78] process_one_work at ffffffff938b0c21
> > #10 [ffffa9fe0759feb8] worker_thread at ffffffff938b18d9
> > #11 [ffffa9fe0759ff10] kthread at ffffffff938b6ee8
> > #12 [ffffa9fe0759ff50] ret_from_fork at ffffffff94200215
> > 
> 
> Hello Jens, Jianchao
> Finally made it to this one.
> I will see if I can revert and test
> 
> 7f556a44e61d0b62d78db9a2662a5f0daef010f2 is the first bad commit
> commit 7f556a44e61d0b62d78db9a2662a5f0daef010f2
> Author: Jianchao Wang <jianchao.w.wang@oracle.com>
> Date:   Fri Dec 14 09:28:18 2018 +0800
> 
>     blk-mq: refactor the code of issue request directly
>     
>     Merge blk_mq_try_issue_directly and __blk_mq_try_issue_directly
>     into one interface to unify the interfaces to issue requests
>     directly. The merged interface takes over the requests totally,
>     it could insert, end or do nothing based on the return value of
>     .queue_rq and 'bypass' parameter. Then caller needn't any other
>     handling any more and then code could be cleaned up.
>     
>     And also the commit c616cbee ( blk-mq: punt failed direct issue
>     to dispatch list ) always inserts requests to hctx dispatch list
>     whenever get a BLK_STS_RESOURCE or BLK_STS_DEV_RESOURCE, this is
>     overkill and will harm the merging. We just need to do that for
>     the requests that has been through .queue_rq. This patch also
>     could fix this.
>     
>     Signed-off-by: Jianchao Wang <jianchao.w.wang@oracle.com>
>     Signed-off-by: Jens Axboe <axboe@kernel.dk>
> 
> 
> 

Cannot cleanly revert:

[loberman@ibclient linux]$ git revert 7f556a44e61d0b62d78db9a2662a5f0daef010f2
error: could not revert 7f556a4... blk-mq: refactor the code of issue
request directly
hint: after resolving the conflicts, mark the corrected paths
hint: with 'git add <paths>' or 'git rm <paths>'
hint: and commit the result with 'git commit'

Revert "blk-mq: refactor the code of issue request directly"

This reverts commit 7f556a44e61d0b62d78db9a2662a5f0daef010f2.

Conflicts:
        block/blk-mq.c

It is not clear what in this commit is breaking things and causing the race.




^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Panic when rebooting target server testing srp on 5.0.0-rc2
  2019-03-27 23:34                   ` Laurence Oberman
@ 2019-03-28  1:31                     ` Ming Lei
  2019-03-28 14:58                       ` Laurence Oberman
  0 siblings, 1 reply; 18+ messages in thread
From: Ming Lei @ 2019-03-28  1:31 UTC (permalink / raw)
  To: Laurence Oberman
  Cc: Ming Lei, jianchao.w.wang, Bart Van Assche, linux-rdma,
	linux-block, Jens Axboe, linux-scsi

On Wed, Mar 27, 2019 at 07:34:37PM -0400, Laurence Oberman wrote:
> On Wed, 2019-03-27 at 18:00 -0400, Laurence Oberman wrote:
> > On Wed, 2019-03-27 at 08:56 -0400, Laurence Oberman wrote:
> > > Truncating email content, starting bisect again as suggested.
> > > Email was getting too long with repetition.
> > > 
> > > Crux of the issue repeated here so easy to understand topic
> > > 
> > > We got to dispatch passing rq_list and the list is corrupted/freed
> > > so
> > > we panic. Clearly a race and is in v5.x+ kernels.
> > > This new bisect will find it.
> > > 
> > > crash> bt
> > > PID: 9191   TASK: ffff9dea0a8395c0  CPU: 1   COMMAND:
> > > "kworker/1:1H"
> > >  #0 [ffffa9fe0759fab0] machine_kexec at ffffffff938606cf
> > >  #1 [ffffa9fe0759fb08] __crash_kexec at ffffffff9393a48d
> > >  #2 [ffffa9fe0759fbd0] crash_kexec at ffffffff9393b659
> > >  #3 [ffffa9fe0759fbe8] oops_end at ffffffff93831c41
> > >  #4 [ffffa9fe0759fc08] no_context at ffffffff9386ecb9
> > >  #5 [ffffa9fe0759fcb0] do_page_fault at ffffffff93870012
> > >  #6 [ffffa9fe0759fce0] page_fault at ffffffff942010ee
> > >     [exception RIP: blk_mq_dispatch_rq_list+114]
> > >     RIP: ffffffff93b9f202  RSP: ffffa9fe0759fd90  RFLAGS: 00010246
> > >     RAX: ffff9de9c4d3bbc8  RBX: ffff9de9c4d3bbc8  RCX:
> > > 0000000000000004
> > >     RDX: 0000000000000000  RSI: ffffa9fe0759fe20  RDI:
> > > ffff9dea0dad87f0
> > >     RBP: 0000000000000000   R8: 0000000000000000   R9:
> > > 8080808080808080
> > >     R10: ffff9dea33827660  R11: ffffee9d9e097a00  R12:
> > > ffffa9fe0759fe20
> > >     R13: ffff9de9c4d3bb80  R14: 0000000000000000  R15:
> > > ffff9dea0dad87f0
> > >     ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
> > >  #7 [ffffa9fe0759fe18] blk_mq_sched_dispatch_requests at
> > > ffffffff93ba455c
> > >  #8 [ffffa9fe0759fe60] __blk_mq_run_hw_queue at ffffffff93b9e3cf
> > >  #9 [ffffa9fe0759fe78] process_one_work at ffffffff938b0c21
> > > #10 [ffffa9fe0759feb8] worker_thread at ffffffff938b18d9
> > > #11 [ffffa9fe0759ff10] kthread at ffffffff938b6ee8
> > > #12 [ffffa9fe0759ff50] ret_from_fork at ffffffff94200215
> > > 
> > 
> > Hello Jens, Jianchao
> > Finally made it to this one.
> > I will see if I can revert and test
> > 
> > 7f556a44e61d0b62d78db9a2662a5f0daef010f2 is the first bad commit
> > commit 7f556a44e61d0b62d78db9a2662a5f0daef010f2
> > Author: Jianchao Wang <jianchao.w.wang@oracle.com>
> > Date:   Fri Dec 14 09:28:18 2018 +0800
> > 
> >     blk-mq: refactor the code of issue request directly
> >     
> >     Merge blk_mq_try_issue_directly and __blk_mq_try_issue_directly
> >     into one interface to unify the interfaces to issue requests
> >     directly. The merged interface takes over the requests totally,
> >     it could insert, end or do nothing based on the return value of
> >     .queue_rq and 'bypass' parameter. Then caller needn't any other
> >     handling any more and then code could be cleaned up.
> >     
> >     And also the commit c616cbee ( blk-mq: punt failed direct issue
> >     to dispatch list ) always inserts requests to hctx dispatch list
> >     whenever get a BLK_STS_RESOURCE or BLK_STS_DEV_RESOURCE, this is
> >     overkill and will harm the merging. We just need to do that for
> >     the requests that has been through .queue_rq. This patch also
> >     could fix this.
> >     
> >     Signed-off-by: Jianchao Wang <jianchao.w.wang@oracle.com>
> >     Signed-off-by: Jens Axboe <axboe@kernel.dk>
> > 
> > 
> > 
> 
> Cannot clean revert 
> 
> loberman@ibclient linux]$ git revert
> 7f556a44e61d0b62d78db9a2662a5f0daef010f2 
> error: could not revert 7f556a4... blk-mq: refactor the code of issue
> request directly
> hint: after resolving the conflicts, mark the corrected paths
> hint: with 'git add <paths>' or 'git rm <paths>'
> hint: and commit the result with 'git commit'
> 
> Revert "blk-mq: refactor the code of issue request directly"
> 
> This reverts commit 7f556a44e61d0b62d78db9a2662a5f0daef010f2.
> 
> Conflicts:
>         block/blk-mq.c
> 
> No clear what in this commit is breaking things and causing the race

Hi Laurence,

Could you test the following patch?

'bypass' means the caller handles the dispatch result.

Also, we might remove the handling for 'force', so we can align to
blk_mq_dispatch_rq_list(), but that shouldn't be related with this
issue.
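
Not part of the patch, but to make the 'bypass' contract concrete, here is a
minimal sketch of how a bypass == true caller is expected to consume the
return value after this change (illustrative only: the call site is
hypothetical and everything beyond the hctx/rq arguments is assumed, this is
not the real dm or blk-core code):

	blk_qc_t cookie;
	blk_status_t ret;

	/* bypass == true: the callee neither inserts nor ends the request */
	ret = blk_mq_try_issue_directly(hctx, rq, &cookie, true, true);

	switch (ret) {
	case BLK_STS_OK:
		/* handed to ->queue_rq(), nothing more to do here */
		break;
	case BLK_STS_RESOURCE:
	case BLK_STS_DEV_RESOURCE:
		/* not queued: the caller has to insert/requeue it itself */
		blk_mq_request_bypass_insert(rq, true);
		break;
	default:
		/* hard failure: the caller still owns the request, end it */
		blk_mq_end_request(rq, ret);
		break;
	}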

diff --git a/block/blk-mq.c b/block/blk-mq.c
index bc3524428b96..ee4bfd9cbde5 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1848,12 +1848,14 @@ blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
 	ret = __blk_mq_issue_directly(hctx, rq, cookie, last);
 out_unlock:
 	hctx_unlock(hctx, srcu_idx);
+	if (bypass)
+		return ret;
 	switch (ret) {
 	case BLK_STS_OK:
 		break;
 	case BLK_STS_DEV_RESOURCE:
 	case BLK_STS_RESOURCE:
-		if (force) {
+		if (force)
 			blk_mq_request_bypass_insert(rq, run_queue);
 			/*
 			 * We have to return BLK_STS_OK for the DM
@@ -1861,18 +1863,14 @@ blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
 			 * the real result to indicate whether the
 			 * request is direct-issued successfully.
 			 */
-			ret = bypass ? BLK_STS_OK : ret;
-		} else if (!bypass) {
+		else
 			blk_mq_sched_insert_request(rq, false,
 						    run_queue, false);
-		}
 		break;
 	default:
-		if (!bypass)
-			blk_mq_end_request(rq, ret);
+		blk_mq_end_request(rq, ret);
 		break;
 	}
-
 	return ret;
 }
 
Thanks,
Ming

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: Panic when rebooting target server testing srp on 5.0.0-rc2
  2019-03-28  1:31                     ` Ming Lei
@ 2019-03-28 14:58                       ` Laurence Oberman
  2019-03-28 15:09                         ` Laurence Oberman
  0 siblings, 1 reply; 18+ messages in thread
From: Laurence Oberman @ 2019-03-28 14:58 UTC (permalink / raw)
  To: Ming Lei
  Cc: Ming Lei, jianchao.w.wang, Bart Van Assche, linux-rdma,
	linux-block, Jens Axboe, linux-scsi

On Thu, 2019-03-28 at 09:31 +0800, Ming Lei wrote:
> On Wed, Mar 27, 2019 at 07:34:37PM -0400, Laurence Oberman wrote:
> > On Wed, 2019-03-27 at 18:00 -0400, Laurence Oberman wrote:
> > > On Wed, 2019-03-27 at 08:56 -0400, Laurence Oberman wrote:
> > > > Truncating email content, starting bisect again as suggested.
> > > > Email was getting too long with repetition.
> > > > 
> > > > Crux of the issue repeated here so easy to understand topic
> > > > 
> > > > We got to dispatch passing rq_list and the list is
> > > > corrupted/freed
> > > > so
> > > > we panic. Clearly a race and is in v5.x+ kernels.
> > > > This new bisect will find it.
> > > > 
> > > > crash> bt
> > > > PID: 9191   TASK: ffff9dea0a8395c0  CPU: 1   COMMAND:
> > > > "kworker/1:1H"
> > > >  #0 [ffffa9fe0759fab0] machine_kexec at ffffffff938606cf
> > > >  #1 [ffffa9fe0759fb08] __crash_kexec at ffffffff9393a48d
> > > >  #2 [ffffa9fe0759fbd0] crash_kexec at ffffffff9393b659
> > > >  #3 [ffffa9fe0759fbe8] oops_end at ffffffff93831c41
> > > >  #4 [ffffa9fe0759fc08] no_context at ffffffff9386ecb9
> > > >  #5 [ffffa9fe0759fcb0] do_page_fault at ffffffff93870012
> > > >  #6 [ffffa9fe0759fce0] page_fault at ffffffff942010ee
> > > >     [exception RIP: blk_mq_dispatch_rq_list+114]
> > > >     RIP: ffffffff93b9f202  RSP: ffffa9fe0759fd90  RFLAGS:
> > > > 00010246
> > > >     RAX: ffff9de9c4d3bbc8  RBX: ffff9de9c4d3bbc8  RCX:
> > > > 0000000000000004
> > > >     RDX: 0000000000000000  RSI: ffffa9fe0759fe20  RDI:
> > > > ffff9dea0dad87f0
> > > >     RBP: 0000000000000000   R8: 0000000000000000   R9:
> > > > 8080808080808080
> > > >     R10: ffff9dea33827660  R11: ffffee9d9e097a00  R12:
> > > > ffffa9fe0759fe20
> > > >     R13: ffff9de9c4d3bb80  R14: 0000000000000000  R15:
> > > > ffff9dea0dad87f0
> > > >     ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
> > > >  #7 [ffffa9fe0759fe18] blk_mq_sched_dispatch_requests at
> > > > ffffffff93ba455c
> > > >  #8 [ffffa9fe0759fe60] __blk_mq_run_hw_queue at
> > > > ffffffff93b9e3cf
> > > >  #9 [ffffa9fe0759fe78] process_one_work at ffffffff938b0c21
> > > > #10 [ffffa9fe0759feb8] worker_thread at ffffffff938b18d9
> > > > #11 [ffffa9fe0759ff10] kthread at ffffffff938b6ee8
> > > > #12 [ffffa9fe0759ff50] ret_from_fork at ffffffff94200215
> > > > 
> > > 
> > > Hello Jens, Jianchao
> > > Finally made it to this one.
> > > I will see if I can revert and test
> > > 
> > > 7f556a44e61d0b62d78db9a2662a5f0daef010f2 is the first bad commit
> > > commit 7f556a44e61d0b62d78db9a2662a5f0daef010f2
> > > Author: Jianchao Wang <jianchao.w.wang@oracle.com>
> > > Date:   Fri Dec 14 09:28:18 2018 +0800
> > > 
> > >     blk-mq: refactor the code of issue request directly
> > >     
> > >     Merge blk_mq_try_issue_directly and
> > > __blk_mq_try_issue_directly
> > >     into one interface to unify the interfaces to issue requests
> > >     directly. The merged interface takes over the requests
> > > totally,
> > >     it could insert, end or do nothing based on the return value
> > > of
> > >     .queue_rq and 'bypass' parameter. Then caller needn't any
> > > other
> > >     handling any more and then code could be cleaned up.
> > >     
> > >     And also the commit c616cbee ( blk-mq: punt failed direct
> > > issue
> > >     to dispatch list ) always inserts requests to hctx dispatch
> > > list
> > >     whenever get a BLK_STS_RESOURCE or BLK_STS_DEV_RESOURCE, this
> > > is
> > >     overkill and will harm the merging. We just need to do that
> > > for
> > >     the requests that has been through .queue_rq. This patch also
> > >     could fix this.
> > >     
> > >     Signed-off-by: Jianchao Wang <jianchao.w.wang@oracle.com>
> > >     Signed-off-by: Jens Axboe <axboe@kernel.dk>
> > > 
> > > 
> > > 
> > 
> > Cannot cleanly revert 
> > 
> > loberman@ibclient linux]$ git revert
> > 7f556a44e61d0b62d78db9a2662a5f0daef010f2 
> > error: could not revert 7f556a4... blk-mq: refactor the code of
> > issue
> > request directly
> > hint: after resolving the conflicts, mark the corrected paths
> > hint: with 'git add <paths>' or 'git rm <paths>'
> > hint: and commit the result with 'git commit'
> > 
> > Revert "blk-mq: refactor the code of issue request directly"
> > 
> > This reverts commit 7f556a44e61d0b62d78db9a2662a5f0daef010f2.
> > 
> > Conflicts:
> >         block/blk-mq.c
> > 
> > Not clear what in this commit is breaking things and causing the
> > race
> 
> Hi Laurence,
> 
> Could you test the following patch?
> 
> 'bypass' means the caller handles the dispatch result.
> 
> Also, we might remove the handling for 'force', so we can align to
> blk_mq_dispatch_rq_list(), but that shouldn't be related with this
> issue.
> 
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index bc3524428b96..ee4bfd9cbde5 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -1848,12 +1848,14 @@ blk_status_t blk_mq_try_issue_directly(struct
> blk_mq_hw_ctx *hctx,
>  	ret = __blk_mq_issue_directly(hctx, rq, cookie, last);
>  out_unlock:
>  	hctx_unlock(hctx, srcu_idx);
> +	if (bypass)
> +		return ret;
>  	switch (ret) {
>  	case BLK_STS_OK:
>  		break;
>  	case BLK_STS_DEV_RESOURCE:
>  	case BLK_STS_RESOURCE:
> -		if (force) {
> +		if (force)
>  			blk_mq_request_bypass_insert(rq, run_queue);
>  			/*
>  			 * We have to return BLK_STS_OK for the DM
> @@ -1861,18 +1863,14 @@ blk_status_t blk_mq_try_issue_directly(struct
> blk_mq_hw_ctx *hctx,
>  			 * the real result to indicate whether the
>  			 * request is direct-issued successfully.
>  			 */
> -			ret = bypass ? BLK_STS_OK : ret;
> -		} else if (!bypass) {
> +		else
>  			blk_mq_sched_insert_request(rq, false,
>  						    run_queue, false);
> -		}
>  		break;
>  	default:
> -		if (!bypass)
> -			blk_mq_end_request(rq, ret);
> +		blk_mq_end_request(rq, ret);
>  		break;
>  	}
> -
>  	return ret;
>  }
>  
> Thanks,
> Ming

Hello Ming

Thanks for the patch, unfortunately it did not help.
I stared at the changes for a while and could not see how it would make
a difference to the race.
However, that is mostly because I still need to get my head around all
the block-mq code.
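
For reference while reading the oops below: the fault is in the list walk at
the top of blk_mq_dispatch_rq_list(). Roughly, from memory (a sketch of the
v5.0-era loop, not a verbatim copy of block/blk-mq.c):

	do {
		struct request *rq;
		struct blk_mq_hw_ctx *hctx;

		rq = list_first_entry(list, struct request, queuelist);
		hctx = rq->mq_hctx;	/* stale/NULL if rq was already freed */

		/*
		 * blk_mq_get_dispatch_budget() immediately reads hctx->queue,
		 * which matches the CR2: 00000000000000b8 NULL dereference
		 * and the hctx->queue analysis earlier in this thread.
		 */
		if (!got_budget && !blk_mq_get_dispatch_budget(hctx))
			break;

		/* ... driver tag allocation and ->queue_rq() dispatch ... */
	} while (!list_empty(list));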

The test was with this commit I added

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 70b210a..8952116 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1834,12 +1834,15 @@ blk_status_t blk_mq_try_issue_directly(struct
blk_mq_hw_ctx *hctx,
        ret = __blk_mq_issue_directly(hctx, rq, cookie, last);
 out_unlock:
        hctx_unlock(hctx, srcu_idx);
+        if (bypass)
+                return ret;
+
        switch (ret) {
        case BLK_STS_OK:
                break;
        case BLK_STS_DEV_RESOURCE:
        case BLK_STS_RESOURCE:
-               if (force) {
+                if (force)
                        blk_mq_request_bypass_insert(rq, run_queue);
                        /*
                         * We have to return BLK_STS_OK for the DM
@@ -1847,18 +1850,14 @@ blk_status_t blk_mq_try_issue_directly(struct
blk_mq_hw_ctx *hctx,
                         * the real result to indicate whether the
                         * request is direct-issued successfully.
                         */
-                       ret = bypass ? BLK_STS_OK : ret;
-               } else if (!bypass) {
+       else
                        blk_mq_sched_insert_request(rq, false,
                                                    run_queue, false);
-               }
                break;
        default:
-               if (!bypass)
-                       blk_mq_end_request(rq, ret);
+                blk_mq_end_request(rq, ret);
                break;
        }
-
        return ret;
 }



[  193.068245] device-mapper: multipath: Failing path 66:96.
[  193.092929] device-mapper: multipath: Failing path 8:96.
[  193.137068] #PF error: [normal kernel read fault]
[  195.217691] PGD 0 P4D 0 
[  195.231958] Oops: 0000 [#1] SMP PTI
[  195.250796] CPU: 4 PID: 8525 Comm: kworker/4:1H Tainted: G        W
I       5.1.0-rc2+ #2
[  195.295820] Hardware name: HP ProLiant DL380 G7, BIOS P67 08/16/2015
[  195.330339] Workqueue: kblockd blk_mq_run_work_fn
[  195.355613] RIP: 0010:blk_mq_dispatch_rq_list+0x72/0x570
[  195.385494] Code: 08 84 d2 0f 85 cf 03 00 00 45 31 f6 c7 44 24 38 00
00 00 00 c7 44 24 3c 00 00 00 00 80 3c 24 00 4c 8d 6b b8 48 8b 6b c8 75
24 <48> 8b 85 b8 00 00 00 48 8b 40 40 48 8b 40 10 48 85 c0 74 10 48 89
[  195.489501] RSP: 0018:ffffbb074781bd90 EFLAGS: 00010246
[  195.519090] RAX: ffff9f7e6e1de148 RBX: ffff9f7e6e1de148 RCX:
0000000000000004
[  195.560254] RDX: 0000000000000000 RSI: ffffbb074781be20 RDI:
ffff9f8aa95297d0
[  195.601196] RBP: 0000000000000000 R08: 0000000000000000 R09:
8080808080808080
[  195.642186] R10: 0000000000000001 R11: 0000000000000000 R12:
ffffbb074781be20
[  195.683094] R13: ffff9f7e6e1de100 R14: 0000000000000000 R15:
ffff9f8aa95297d0
[  195.724153] FS:  0000000000000000(0000) GS:ffff9f7eb7880000(0000)
knlGS:0000000000000000
[  195.769427] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  195.801687] CR2: 00000000000000b8 CR3: 000000020d20e004 CR4:
00000000000206e0
[  195.841126] Call Trace:
[  195.854014]  ? blk_mq_flush_busy_ctxs+0xca/0x120
[  195.879448]  blk_mq_sched_dispatch_requests+0x15c/0x180
[  195.908389]  __blk_mq_run_hw_queue+0x5f/0x100
[  195.932961]  process_one_work+0x171/0x380
[  195.956414]  worker_thread+0x49/0x3f0
[  195.976984]  kthread+0xf8/0x130
[  195.995078]  ? max_active_store+0x80/0x80
[  196.018080]  ? kthread_bind+0x10/0x10
[  196.038383]  ret_from_fork+0x35/0x40
[  196.058435] Modules linked in: ib_isert iscsi_target_mod
target_core_mod ib_srp scsi_transport_srp rpcrdma ib_iser rdma_ucm
ib_umad rdma_cm ib_ipoib sunrpc libiscsi iw_cm scsi_transport_iscsi
ib_cm mlx5_ib ib_uverbs ib_core intel_powerclamp coretemp kvm_intel kvm
irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel
crypto_simd iTCO_wdt cryptd ipmi_ssif glue_helper gpio_ich
iTCO_vendor_support dm_service_time pcspkr joydev hpwdt hpilo ipmi_si
acpi_power_meter pcc_cpufreq sg ipmi_devintf i7core_edac lpc_ich
ipmi_msghandler dm_multipath ip_tables xfs libcrc32c radeon sd_mod
i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt
fb_sys_fops mlx5_core ttm mlxfw drm crc32c_intel serio_raw hpsa ptp
bnx2 scsi_transport_sas pps_core dm_mirror dm_region_hash dm_log dm_mod
[  196.451938] CR2: 00000000000000b8
[  196.469994] ---[ end trace b2b431da81df2e95 ]---
[  196.469996] BUG: unable to handle kernel NULL pointer dereference at
00000000000000b8
[  196.495380] RIP: 0010:blk_mq_dispatch_rq_list+0x72/0x570
[  196.540456] #PF error: [normal kernel read fault]
[  196.570957] Code: 08 84 d2 0f 85 cf 03 00 00 45 31 f6 c7 44 24 38 00
00 00 00 c7 44 24 3c 00 00 00 00 80 3c 24 00 4c 8d 6b b8 48 8b 6b c8 75
24 <48> 8b 85 b8 00 00 00 48 8b 40 40 48 8b 40 10 48 85 c0 74 10 48 89
[  196.597976] PGD 0 P4D 0 
[  196.705627] RSP: 0018:ffffbb074781bd90 EFLAGS: 00010246
[  196.720114] Oops: 0000 [#2] SMP PTI
[  196.749959] RAX: ffff9f7e6e1de148 RBX: ffff9f7e6e1de148 RCX:
0000000000000004
[  196.769936] CPU: 6 PID: 4034 Comm: kworker/6:1H Tainted: G      D W
I       5.1.0-rc2+ #2
[  196.809000] RDX: 0000000000000000 RSI: ffffbb074781be20 RDI:
ffff9f8aa95297d0
[  196.854098] Hardware name: HP ProLiant DL380 G7, BIOS P67 08/16/2015
[  196.893109] RBP: 0000000000000000 R08: 0000000000000000 R09:
8080808080808080
[  196.928283] Workqueue: kblockd blk_mq_run_work_fn
[  196.968750] R10: 0000000000000001 R11: 0000000000000000 R12:
ffffbb074781be20
[  196.995203] RIP: 0010:blk_mq_dispatch_rq_list+0x72/0x570
[  197.035808] R13: ffff9f7e6e1de100 R14: 0000000000000000 R15:
ffff9f8aa95297d0
[  197.065780] Code: 08 84 d2 0f 85 cf 03 00 00 45 31 f6 c7 44 24 38 00
00 00 00 c7 44 24 3c 00 00 00 00 80 3c 24 00 4c 8d 6b b8 48 8b 6b c8 75
24 <48> 8b 85 b8 00 00 00 48 8b 40 40 48 8b 40 10 48 85 c0 74 10 48 89
[  197.105262] FS:  0000000000000000(0000) GS:ffff9f7eb7880000(0000)
knlGS:0000000000000000
[  197.213467] RSP: 0018:ffffbb074722bd90 EFLAGS: 00010246
[  197.259374] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  197.289579] RAX: ffff9f7e6e25a908 RBX: ffff9f7e6e25a908 RCX:
0000000000000004
[  197.322352] CR2: 00000000000000b8 CR3: 000000020d20e004 CR4:
00000000000206e0
[  197.361122] RDX: 0000000000000000 RSI: ffffbb074722be20 RDI:
ffff9f8a19f60000
[  197.400084] Kernel panic - not syncing: Fatal exception
[  197.440204] RBP: 0000000000000000 R08: 0000000000000000 R09:
8080808080808080
[  197.509179] R10: 0000000000000001 R11: 0000000000000000 R12:
ffffbb074722be20
[  197.549143] R13: ffff9f7e6e25a8c0 R14: 0000000000000000 R15:
ffff9f8a19f60000
[  197.590162] FS:  0000000000000000(0000) GS:ffff9f7eb78c0000(0000)
knlGS:0000000000000000
[  197.636223] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  197.668680] CR2: 00000000000000b8 CR3: 000000020d20e004 CR4:
00000000000206e0
[  197.708291] Call Trace:
[  197.722282]  ? blk_mq_flush_busy_ctxs+0xca/0x120
[  197.748285]  blk_mq_sched_dispatch_requests+0x15c/0x180
[  197.777145]  __blk_mq_run_hw_queue+0x5f/0x100
[  197.801406]  process_one_work+0x171/0x380
[  197.824021]  worker_thread+0x49/0x3f0
[  197.844756]  kthread+0xf8/0x130
[  197.862142]  ? max_active_store+0x80/0x80
[  197.883917]  ? kthread_bind+0x10/0x10
[  197.904609]  ret_from_fork+0x35/0x40
[  197.924992] Modules linked in: ib_isert iscsi_target_mod
target_core_mod ib_srp scsi_transport_srp rpcrdma ib_iser rdma_ucm
ib_umad rdma_cm ib_ipoib sunrpc libiscsi iw_cm scsi_transport_iscsi
ib_cm mlx5_ib ib_uverbs ib_core intel_powerclamp coretemp kvm_intel kvm
irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel
crypto_simd iTCO_wdt cryptd ipmi_ssif glue_helper gpio_ich
iTCO_vendor_support dm_service_time pcspkr joydev hpwdt hpilo ipmi_si
acpi_power_meter pcc_cpufreq sg ipmi_devintf i7core_edac lpc_ich
ipmi_msghandler dm_multipath ip_tables xfs libcrc32c radeon sd_mod
i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt
fb_sys_fops mlx5_core ttm mlxfw drm crc32c_intel serio_raw hpsa ptp
bnx2 scsi_transport_sas pps_core dm_mirror dm_region_hash dm_log dm_mod
[  198.317894] CR2: 00000000000000b8
[  198.336570] ---[ end trace b2b431da81df2e96 ]---
[  198.336571] BUG: unable to handle kernel NULL pointer dereference at
00000000000000b8
[  198.336572] #PF error: [normal kernel read fault]
[  198.362241] RIP: 0010:blk_mq_dispatch_rq_list+0x72/0x570
[  198.405359] PGD 0 P4D 0 
[  198.432068] Code: 08 84 d2 0f 85 cf 03 00 00 45 31 f6 c7 44 24 38
00[  198.461906] Oops: 0000 [#3] SMP PTI


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: Panic when rebooting target server testing srp on 5.0.0-rc2
  2019-03-28 14:58                       ` Laurence Oberman
@ 2019-03-28 15:09                         ` Laurence Oberman
  0 siblings, 0 replies; 18+ messages in thread
From: Laurence Oberman @ 2019-03-28 15:09 UTC (permalink / raw)
  To: Ming Lei
  Cc: Ming Lei, jianchao.w.wang, Bart Van Assche, linux-rdma,
	linux-block, Jens Axboe, linux-scsi

On Thu, 2019-03-28 at 10:58 -0400, Laurence Oberman wrote:
> On Thu, 2019-03-28 at 09:31 +0800, Ming Lei wrote:
> > On Wed, Mar 27, 2019 at 07:34:37PM -0400, Laurence Oberman wrote:
> > > On Wed, 2019-03-27 at 18:00 -0400, Laurence Oberman wrote:
> > > > On Wed, 2019-03-27 at 08:56 -0400, Laurence Oberman wrote:
> > > > > Truncating email content, starting bisect again as suggested.
> > > > > Email was getting too long with repetition.
> > > > > 
> > > > > Crux of the issue repeated here so easy to understand topic
> > > > > 
> > > > > We got to dispatch passing rq_list and the list is
> > > > > corrupted/freed
> > > > > so
> > > > > we panic. Clearly a race and is in v5.x+ kernels.
> > > > > This new bisect will find it.
> > > > > 
> > > > > crash> bt
> > > > > PID: 9191   TASK: ffff9dea0a8395c0  CPU: 1   COMMAND:
> > > > > "kworker/1:1H"
> > > > >  #0 [ffffa9fe0759fab0] machine_kexec at ffffffff938606cf
> > > > >  #1 [ffffa9fe0759fb08] __crash_kexec at ffffffff9393a48d
> > > > >  #2 [ffffa9fe0759fbd0] crash_kexec at ffffffff9393b659
> > > > >  #3 [ffffa9fe0759fbe8] oops_end at ffffffff93831c41
> > > > >  #4 [ffffa9fe0759fc08] no_context at ffffffff9386ecb9
> > > > >  #5 [ffffa9fe0759fcb0] do_page_fault at ffffffff93870012
> > > > >  #6 [ffffa9fe0759fce0] page_fault at ffffffff942010ee
> > > > >     [exception RIP: blk_mq_dispatch_rq_list+114]
> > > > >     RIP: ffffffff93b9f202  RSP: ffffa9fe0759fd90  RFLAGS:
> > > > > 00010246
> > > > >     RAX: ffff9de9c4d3bbc8  RBX: ffff9de9c4d3bbc8  RCX:
> > > > > 0000000000000004
> > > > >     RDX: 0000000000000000  RSI: ffffa9fe0759fe20  RDI:
> > > > > ffff9dea0dad87f0
> > > > >     RBP: 0000000000000000   R8: 0000000000000000   R9:
> > > > > 8080808080808080
> > > > >     R10: ffff9dea33827660  R11: ffffee9d9e097a00  R12:
> > > > > ffffa9fe0759fe20
> > > > >     R13: ffff9de9c4d3bb80  R14: 0000000000000000  R15:
> > > > > ffff9dea0dad87f0
> > > > >     ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
> > > > >  #7 [ffffa9fe0759fe18] blk_mq_sched_dispatch_requests at
> > > > > ffffffff93ba455c
> > > > >  #8 [ffffa9fe0759fe60] __blk_mq_run_hw_queue at
> > > > > ffffffff93b9e3cf
> > > > >  #9 [ffffa9fe0759fe78] process_one_work at ffffffff938b0c21
> > > > > #10 [ffffa9fe0759feb8] worker_thread at ffffffff938b18d9
> > > > > #11 [ffffa9fe0759ff10] kthread at ffffffff938b6ee8
> > > > > #12 [ffffa9fe0759ff50] ret_from_fork at ffffffff94200215
> > > > > 
> > > > 
> > > > Hello Jens, Jianchao
> > > > Finally made it to this one.
> > > > I will see if I can revert and test
> > > > 
> > > > 7f556a44e61d0b62d78db9a2662a5f0daef010f2 is the first bad
> > > > commit
> > > > commit 7f556a44e61d0b62d78db9a2662a5f0daef010f2
> > > > Author: Jianchao Wang <jianchao.w.wang@oracle.com>
> > > > Date:   Fri Dec 14 09:28:18 2018 +0800
> > > > 
> > > >     blk-mq: refactor the code of issue request directly
> > > >     
> > > >     Merge blk_mq_try_issue_directly and
> > > > __blk_mq_try_issue_directly
> > > >     into one interface to unify the interfaces to issue
> > > > requests
> > > >     directly. The merged interface takes over the requests
> > > > totally,
> > > >     it could insert, end or do nothing based on the return
> > > > value
> > > > of
> > > >     .queue_rq and 'bypass' parameter. Then caller needn't any
> > > > other
> > > >     handling any more and then code could be cleaned up.
> > > >     
> > > >     And also the commit c616cbee ( blk-mq: punt failed direct
> > > > issue
> > > >     to dispatch list ) always inserts requests to hctx dispatch
> > > > list
> > > >     whenever get a BLK_STS_RESOURCE or BLK_STS_DEV_RESOURCE,
> > > > this
> > > > is
> > > >     overkill and will harm the merging. We just need to do that
> > > > for
> > > >     the requests that has been through .queue_rq. This patch
> > > > also
> > > >     could fix this.
> > > >     
> > > >     Signed-off-by: Jianchao Wang <jianchao.w.wang@oracle.com>
> > > >     Signed-off-by: Jens Axboe <axboe@kernel.dk>
> > > > 
> > > > 
> > > > 
> > > 
> > > Cannot cleanly revert 
> > > 
> > > loberman@ibclient linux]$ git revert
> > > 7f556a44e61d0b62d78db9a2662a5f0daef010f2 
> > > error: could not revert 7f556a4... blk-mq: refactor the code of
> > > issue
> > > request directly
> > > hint: after resolving the conflicts, mark the corrected paths
> > > hint: with 'git add <paths>' or 'git rm <paths>'
> > > hint: and commit the result with 'git commit'
> > > 
> > > Revert "blk-mq: refactor the code of issue request directly"
> > > 
> > > This reverts commit 7f556a44e61d0b62d78db9a2662a5f0daef010f2.
> > > 
> > > Conflicts:
> > >         block/blk-mq.c
> > > 
> > > Not clear what in this commit is breaking things and causing the
> > > race
> > 
> > Hi Laurence,
> > 
> > Could you test the following patch?
> > 
> > 'bypass' means the caller handles the dispatch result.
> > 
> > Also, we might remove the handling for 'force', so we can align to
> > blk_mq_dispatch_rq_list(), but that shouldn't be related with this
> > issue.
> > 
> > diff --git a/block/blk-mq.c b/block/blk-mq.c
> > index bc3524428b96..ee4bfd9cbde5 100644
> > --- a/block/blk-mq.c
> > +++ b/block/blk-mq.c
> > @@ -1848,12 +1848,14 @@ blk_status_t
> > blk_mq_try_issue_directly(struct
> > blk_mq_hw_ctx *hctx,
> >  	ret = __blk_mq_issue_directly(hctx, rq, cookie, last);
> >  out_unlock:
> >  	hctx_unlock(hctx, srcu_idx);
> > +	if (bypass)
> > +		return ret;
> >  	switch (ret) {
> >  	case BLK_STS_OK:
> >  		break;
> >  	case BLK_STS_DEV_RESOURCE:
> >  	case BLK_STS_RESOURCE:
> > -		if (force) {
> > +		if (force)
> >  			blk_mq_request_bypass_insert(rq, run_queue);
> >  			/*
> >  			 * We have to return BLK_STS_OK for the DM
> > @@ -1861,18 +1863,14 @@ blk_status_t
> > blk_mq_try_issue_directly(struct
> > blk_mq_hw_ctx *hctx,
> >  			 * the real result to indicate whether the
> >  			 * request is direct-issued successfully.
> >  			 */
> > -			ret = bypass ? BLK_STS_OK : ret;
> > -		} else if (!bypass) {
> > +		else
> >  			blk_mq_sched_insert_request(rq, false,
> >  						    run_queue, false);
> > -		}
> >  		break;
> >  	default:
> > -		if (!bypass)
> > -			blk_mq_end_request(rq, ret);
> > +		blk_mq_end_request(rq, ret);
> >  		break;
> >  	}
> > -
> >  	return ret;
> >  }
> >  
> > Thanks,
> > Ming
> 
> Hello Ming
> 
> Thanks for the patch, unfortunately it did not help.
> I stared at the changes for a while and could not see how it would
> make
> a difference to the race.
> However, that is mostly because I still need to get my head around
> all
> the block-mq code.
> 
> The test was with this commit I added
> 
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index 70b210a..8952116 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -1834,12 +1834,15 @@ blk_status_t blk_mq_try_issue_directly(struct
> blk_mq_hw_ctx *hctx,
>         ret = __blk_mq_issue_directly(hctx, rq, cookie, last);
>  out_unlock:
>         hctx_unlock(hctx, srcu_idx);
> +        if (bypass)
> +                return ret;
> +
>         switch (ret) {
>         case BLK_STS_OK:
>                 break;
>         case BLK_STS_DEV_RESOURCE:
>         case BLK_STS_RESOURCE:
> -               if (force) {
> +                if (force)
>                         blk_mq_request_bypass_insert(rq, run_queue);
>                         /*
>                          * We have to return BLK_STS_OK for the DM
> @@ -1847,18 +1850,14 @@ blk_status_t blk_mq_try_issue_directly(struct
> blk_mq_hw_ctx *hctx,
>                          * the real result to indicate whether the
>                          * request is direct-issued successfully.
>                          */
> -                       ret = bypass ? BLK_STS_OK : ret;
> -               } else if (!bypass) {
> +       else
>                         blk_mq_sched_insert_request(rq, false,
>                                                     run_queue,
> false);
> -               }
>                 break;
>         default:
> -               if (!bypass)
> -                       blk_mq_end_request(rq, ret);
> +                blk_mq_end_request(rq, ret);
>                 break;
>         }
> -
>         return ret;
>  }
> 
> 
> 
> [  193.068245] device-mapper: multipath: Failing path 66:96.
> [  193.092929] device-mapper: multipath: Failing path 8:96.
> [  193.137068] #PF error: [normal kernel read fault]
> [  195.217691] PGD 0 P4D 0 
> [  195.231958] Oops: 0000 [#1] SMP PTI
> [  195.250796] CPU: 4 PID: 8525 Comm: kworker/4:1H Tainted:
> G        W
> I       5.1.0-rc2+ #2
> [  195.295820] Hardware name: HP ProLiant DL380 G7, BIOS P67
> 08/16/2015
> [  195.330339] Workqueue: kblockd blk_mq_run_work_fn
> [  195.355613] RIP: 0010:blk_mq_dispatch_rq_list+0x72/0x570
> [  195.385494] Code: 08 84 d2 0f 85 cf 03 00 00 45 31 f6 c7 44 24 38
> 00
> 00 00 00 c7 44 24 3c 00 00 00 00 80 3c 24 00 4c 8d 6b b8 48 8b 6b c8
> 75
> 24 <48> 8b 85 b8 00 00 00 48 8b 40 40 48 8b 40 10 48 85 c0 74 10 48
> 89
> [  195.489501] RSP: 0018:ffffbb074781bd90 EFLAGS: 00010246
> [  195.519090] RAX: ffff9f7e6e1de148 RBX: ffff9f7e6e1de148 RCX:
> 0000000000000004
> [  195.560254] RDX: 0000000000000000 RSI: ffffbb074781be20 RDI:
> ffff9f8aa95297d0
> [  195.601196] RBP: 0000000000000000 R08: 0000000000000000 R09:
> 8080808080808080
> [  195.642186] R10: 0000000000000001 R11: 0000000000000000 R12:
> ffffbb074781be20
> [  195.683094] R13: ffff9f7e6e1de100 R14: 0000000000000000 R15:
> ffff9f8aa95297d0
> [  195.724153] FS:  0000000000000000(0000) GS:ffff9f7eb7880000(0000)
> knlGS:0000000000000000
> [  195.769427] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  195.801687] CR2: 00000000000000b8 CR3: 000000020d20e004 CR4:
> 00000000000206e0
> [  195.841126] Call Trace:
> [  195.854014]  ? blk_mq_flush_busy_ctxs+0xca/0x120
> [  195.879448]  blk_mq_sched_dispatch_requests+0x15c/0x180
> [  195.908389]  __blk_mq_run_hw_queue+0x5f/0x100
> [  195.932961]  process_one_work+0x171/0x380
> [  195.956414]  worker_thread+0x49/0x3f0
> [  195.976984]  kthread+0xf8/0x130
> [  195.995078]  ? max_active_store+0x80/0x80
> [  196.018080]  ? kthread_bind+0x10/0x10
> [  196.038383]  ret_from_fork+0x35/0x40
> [  196.058435] Modules linked in: ib_isert iscsi_target_mod
> target_core_mod ib_srp scsi_transport_srp rpcrdma ib_iser rdma_ucm
> ib_umad rdma_cm ib_ipoib sunrpc libiscsi iw_cm scsi_transport_iscsi
> ib_cm mlx5_ib ib_uverbs ib_core intel_powerclamp coretemp kvm_intel
> kvm
> irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel
> aesni_intel
> crypto_simd iTCO_wdt cryptd ipmi_ssif glue_helper gpio_ich
> iTCO_vendor_support dm_service_time pcspkr joydev hpwdt hpilo ipmi_si
> acpi_power_meter pcc_cpufreq sg ipmi_devintf i7core_edac lpc_ich
> ipmi_msghandler dm_multipath ip_tables xfs libcrc32c radeon sd_mod
> i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt
> fb_sys_fops mlx5_core ttm mlxfw drm crc32c_intel serio_raw hpsa ptp
> bnx2 scsi_transport_sas pps_core dm_mirror dm_region_hash dm_log
> dm_mod
> [  196.451938] CR2: 00000000000000b8
> [  196.469994] ---[ end trace b2b431da81df2e95 ]---
> [  196.469996] BUG: unable to handle kernel NULL pointer dereference
> at
> 00000000000000b8
> [  196.495380] RIP: 0010:blk_mq_dispatch_rq_list+0x72/0x570
> [  196.540456] #PF error: [normal kernel read fault]
> [  196.570957] Code: 08 84 d2 0f 85 cf 03 00 00 45 31 f6 c7 44 24 38
> 00
> 00 00 00 c7 44 24 3c 00 00 00 00 80 3c 24 00 4c 8d 6b b8 48 8b 6b c8
> 75
> 24 <48> 8b 85 b8 00 00 00 48 8b 40 40 48 8b 40 10 48 85 c0 74 10 48
> 89
> [  196.597976] PGD 0 P4D 0 
> [  196.705627] RSP: 0018:ffffbb074781bd90 EFLAGS: 00010246
> [  196.720114] Oops: 0000 [#2] SMP PTI
> [  196.749959] RAX: ffff9f7e6e1de148 RBX: ffff9f7e6e1de148 RCX:
> 0000000000000004
> [  196.769936] CPU: 6 PID: 4034 Comm: kworker/6:1H Tainted: G      D
> W
> I       5.1.0-rc2+ #2
> [  196.809000] RDX: 0000000000000000 RSI: ffffbb074781be20 RDI:
> ffff9f8aa95297d0
> [  196.854098] Hardware name: HP ProLiant DL380 G7, BIOS P67
> 08/16/2015
> [  196.893109] RBP: 0000000000000000 R08: 0000000000000000 R09:
> 8080808080808080
> [  196.928283] Workqueue: kblockd blk_mq_run_work_fn
> [  196.968750] R10: 0000000000000001 R11: 0000000000000000 R12:
> ffffbb074781be20
> [  196.995203] RIP: 0010:blk_mq_dispatch_rq_list+0x72/0x570
> [  197.035808] R13: ffff9f7e6e1de100 R14: 0000000000000000 R15:
> ffff9f8aa95297d0
> [  197.065780] Code: 08 84 d2 0f 85 cf 03 00 00 45 31 f6 c7 44 24 38
> 00
> 00 00 00 c7 44 24 3c 00 00 00 00 80 3c 24 00 4c 8d 6b b8 48 8b 6b c8
> 75
> 24 <48> 8b 85 b8 00 00 00 48 8b 40 40 48 8b 40 10 48 85 c0 74 10 48
> 89
> [  197.105262] FS:  0000000000000000(0000) GS:ffff9f7eb7880000(0000)
> knlGS:0000000000000000
> [  197.213467] RSP: 0018:ffffbb074722bd90 EFLAGS: 00010246
> [  197.259374] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  197.289579] RAX: ffff9f7e6e25a908 RBX: ffff9f7e6e25a908 RCX:
> 0000000000000004
> [  197.322352] CR2: 00000000000000b8 CR3: 000000020d20e004 CR4:
> 00000000000206e0
> [  197.361122] RDX: 0000000000000000 RSI: ffffbb074722be20 RDI:
> ffff9f8a19f60000
> [  197.400084] Kernel panic - not syncing: Fatal exception
> [  197.440204] RBP: 0000000000000000 R08: 0000000000000000 R09:
> 8080808080808080
> [  197.509179] R10: 0000000000000001 R11: 0000000000000000 R12:
> ffffbb074722be20
> [  197.549143] R13: ffff9f7e6e25a8c0 R14: 0000000000000000 R15:
> ffff9f8a19f60000
> [  197.590162] FS:  0000000000000000(0000) GS:ffff9f7eb78c0000(0000)
> knlGS:0000000000000000
> [  197.636223] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  197.668680] CR2: 00000000000000b8 CR3: 000000020d20e004 CR4:
> 00000000000206e0
> [  197.708291] Call Trace:
> [  197.722282]  ? blk_mq_flush_busy_ctxs+0xca/0x120
> [  197.748285]  blk_mq_sched_dispatch_requests+0x15c/0x180
> [  197.777145]  __blk_mq_run_hw_queue+0x5f/0x100
> [  197.801406]  process_one_work+0x171/0x380
> [  197.824021]  worker_thread+0x49/0x3f0
> [  197.844756]  kthread+0xf8/0x130
> [  197.862142]  ? max_active_store+0x80/0x80
> [  197.883917]  ? kthread_bind+0x10/0x10
> [  197.904609]  ret_from_fork+0x35/0x40
> [  197.924992] Modules linked in: ib_isert iscsi_target_mod
> target_core_mod ib_srp scsi_transport_srp rpcrdma ib_iser rdma_ucm
> ib_umad rdma_cm ib_ipoib sunrpc libiscsi iw_cm scsi_transport_iscsi
> ib_cm mlx5_ib ib_uverbs ib_core intel_powerclamp coretemp kvm_intel
> kvm
> irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel
> aesni_intel
> crypto_simd iTCO_wdt cryptd ipmi_ssif glue_helper gpio_ich
> iTCO_vendor_support dm_service_time pcspkr joydev hpwdt hpilo ipmi_si
> acpi_power_meter pcc_cpufreq sg ipmi_devintf i7core_edac lpc_ich
> ipmi_msghandler dm_multipath ip_tables xfs libcrc32c radeon sd_mod
> i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt
> fb_sys_fops mlx5_core ttm mlxfw drm crc32c_intel serio_raw hpsa ptp
> bnx2 scsi_transport_sas pps_core dm_mirror dm_region_hash dm_log
> dm_mod
> [  198.317894] CR2: 00000000000000b8
> [  198.336570] ---[ end trace b2b431da81df2e96 ]---
> [  198.336571] BUG: unable to handle kernel NULL pointer dereference
> at
> 00000000000000b8
> [  198.336572] #PF error: [normal kernel read fault]
> [  198.362241] RIP: 0010:blk_mq_dispatch_rq_list+0x72/0x570
> [  198.405359] PGD 0 P4D 0 
> [  198.432068] Code: 08 84 d2 0f 85 cf 03 00 00 45 31 f6 c7 44 24 38
> 00[  198.461906] Oops: 0000 [#3] SMP PTI
> 

I am going to run the same test not using SRP but, for example, qla2xxx.
This issue is clearly in the block layer, but I wonder why I am the
first to report it.
The exposure, if it also happens on F/C LUNs, means we should have seen
panics when arrays were rebooted with any kernel >= v5.0-rc1.




^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Panic when rebooting target server testing srp on 5.0.0-rc2
  2019-03-26 19:06           ` Laurence Oberman
@ 2019-04-03 14:53             ` Laurence Oberman
  2019-04-03 15:09               ` Bart Van Assche
  0 siblings, 1 reply; 18+ messages in thread
From: Laurence Oberman @ 2019-04-03 14:53 UTC (permalink / raw)
  To: Bart Van Assche, linux-rdma, linux-block, Jens Axboe; +Cc: linux-scsi

On Tue, 2019-03-26 at 15:06 -0400, Laurence Oberman wrote:
> On Tue, 2019-03-26 at 11:16 -0400, Laurence Oberman wrote:
> > On Thu, 2019-03-21 at 08:44 -0400, Laurence Oberman wrote:
> > > On Wed, 2019-03-20 at 16:35 -0400, Laurence Oberman wrote:
> > > > On Wed, 2019-03-20 at 13:28 -0700, Bart Van Assche wrote:
> > > > > On Wed, 2019-03-20 at 11:11 -0400, Laurence Oberman wrote:
> > > > > > On Wed, 2019-03-20 at 09:45 -0400, Laurence Oberman wrote:
> > > > > > > Hello Bart, I hope all is well with you.
> > > > > > > 
> > > > > > > Quick question
> > > > > > > preparing to test v5.1-rc2 SRP, my usual method is first
> > > > > > > validate
> > > > > > > the
> > > > > > > prior kernel I had in place.
> > > > > > > This had passed tests previously (5.0.0-rc2) but I had
> > > > > > > not
> > > > > > > run
> > > > > > > the
> > > > > > > target server reboot test, just the disconnect tests.
> > > > > > > 
> > > > > > > Today with mapped SRP devices I rebooted the target
> > > > > > > server
> > > > > > > and
> > > > > > > the
> > > > > > > client panicked.
> > > > > > > 
> > > > > > > Its been a while and I have been so busy that have not
> > > > > > > kept
> > > > > > > up
> > > > > > > with
> > > > > > > all
> > > > > > > the fixes. Is this a known issue.
> > > > > > > 
> > > > > > > Thanks
> > > > > > > Laurence
> > > > > > > 
> > > > > > > 5414228.917507] scsi host2: ib_srp: Path record query
> > > > > > > failed:
> > > > > > > sgid
> > > > > > > fe80:0000:0000:0000:7cfe:9003:0072:6ed3, dgid
> > > > > > > fe80:0000:0000:0000:7cfe:9003:0072:6e4f, pkey 0xffff,
> > > > > > > service_id
> > > > > > > 0x7cfe900300726e4e
> > > > > > > [5414229.014355] scsi host2: reconnect attempt 7 failed
> > > > > > > (-
> > > > > > > 110)
> > > > > > > [5414239.318161] scsi host2: ib_srp: Sending CM DREQ
> > > > > > > failed
> > > > > > > [5414239.318165] scsi host2: ib_srp: Sending CM DREQ
> > > > > > > failed
> > > > > > > [5414239.318167] scsi host2: ib_srp: Sending CM DREQ
> > > > > > > failed
> > > > > > > [5414239.318168] scsi host2: ib_srp: Sending CM DREQ
> > > > > > > failed
> > > > > > > [5414239.318170] scsi host2: ib_srp: Sending CM DREQ
> > > > > > > failed
> > > > > > > [5414239.318172] scsi host2: ib_srp: Sending CM DREQ
> > > > > > > failed
> > > > > > > [5414239.318173] scsi host2: ib_srp: Sending CM DREQ
> > > > > > > failed
> > > > > > > [5414239.318175] scsi host2: ib_srp: Sending CM DREQ
> > > > > > > failed
> > > > > > > [5414243.670072] scsi host2: ib_srp: Got failed path rec
> > > > > > > status
> > > > > > > -110
> > > > > > > [5414243.702179] scsi host2: ib_srp: Path record query
> > > > > > > failed:
> > > > > > > sgid
> > > > > > > fe80:0000:0000:0000:7cfe:9003:0072:6ed3, dgid
> > > > > > > fe80:0000:0000:0000:7cfe:9003:0072:6e4f, pkey 0xffff,
> > > > > > > service_id
> > > > > > > 0x7cfe900300726e4e
> > > > > > > [5414243.799313] scsi host2: reconnect attempt 8 failed
> > > > > > > (-
> > > > > > > 110)
> > > > > > > [5414247.510115] scsi host1: ib_srp: Sending CM REQ
> > > > > > > failed
> > > > > > > [5414247.510140] scsi host1: reconnect attempt 1 failed
> > > > > > > (-
> > > > > > > 104)
> > > > > > > [5414247.849078] BUG: unable to handle kernel NULL
> > > > > > > pointer
> > > > > > > dereference
> > > > > > > at 00000000000000b8
> > > > > > > [5414247.893793] #PF error: [normal kernel read fault]
> > > > > > > [5414247.921839] PGD 0 P4D 0 
> > > > > > > [5414247.937280] Oops: 0000 [#1] SMP PTI
> > > > > > > [5414247.958332] CPU: 4 PID: 7773 Comm: kworker/4:1H
> > > > > > > Kdump:
> > > > > > > loaded
> > > > > > > Tainted: G          I       5.0.0-rc2+ #2
> > > > > > > [5414248.012856] Hardware name: HP ProLiant DL380 G7,
> > > > > > > BIOS
> > > > > > > P67
> > > > > > > 08/16/2015
> > > > > > > [5414248.026174] device-mapper: multipath: Failing path
> > > > > > > 8:48.
> > > > > > > 
> > > > > > > 
> > > > > > > [5414248.050003] Workqueue: kblockd blk_mq_run_work_fn
> > > > > > > [5414248.108378] RIP:
> > > > > > > 0010:blk_mq_dispatch_rq_list+0xc9/0x590
> > > > > > > [5414248.139724] Code: 0f 85 c2 04 00 00 83 44 24 28 01
> > > > > > > 48
> > > > > > > 8b
> > > > > > > 45
> > > > > > > 00
> > > > > > > 48
> > > > > > > 39 c5 0f 84 ea 00 00 00 48 8b 5d 00 80 3c 24 00 4c 8d 6b
> > > > > > > b8
> > > > > > > 4c
> > > > > > > 8b
> > > > > > > 63
> > > > > > > c8
> > > > > > > 75 25 <49> 8b 84 24 b8 00 00 00 48 8b 40 40 48 8b 40 10
> > > > > > > 48
> > > > > > > 85
> > > > > > > c0
> > > > > > > 74
> > > > > > > 10
> > > > > > > 4c
> > > > > > > [5414248.246176] RSP: 0018:ffffb1cd8760fd90 EFLAGS:
> > > > > > > 00010246
> > > > > > > [5414248.275599] RAX: ffffa049d67a1308 RBX:
> > > > > > > ffffa049d67a1308
> > > > > > > RCX:
> > > > > > > 0000000000000004
> > > > > > > [5414248.316090] RDX: 0000000000000000 RSI:
> > > > > > > ffffb1cd8760fe20
> > > > > > > RDI:
> > > > > > > ffffa0552ca08000
> > > > > > > [5414248.356884] RBP: ffffb1cd8760fe20 R08:
> > > > > > > 0000000000000000
> > > > > > > R09:
> > > > > > > 8080808080808080
> > > > > > > [5414248.397632] R10: 0000000000000001 R11:
> > > > > > > 0000000000000001
> > > > > > > R12:
> > > > > > > 0000000000000000
> > > > > > > [5414248.439323] R13: ffffa049d67a12c0 R14:
> > > > > > > 0000000000000000
> > > > > > > R15:
> > > > > > > ffffa0552ca08000
> > > > > > > [5414248.481743] FS:  0000000000000000(0000)
> > > > > > > GS:ffffa04a37880000(0000)
> > > > > > > knlGS:0000000000000000
> > > > > > > [5414248.528310] CS:  0010 DS: 0000 ES: 0000 CR0:
> > > > > > > 0000000080050033
> > > > > > > [5414248.561779] CR2: 00000000000000b8 CR3:
> > > > > > > 0000000e9d40e004
> > > > > > > CR4:
> > > > > > > 00000000000206e0
> > > > > > > [5414248.602420] Call Trace:
> > > > > > > [5414248.616660]  blk_mq_sched_dispatch_requests+0x15c/0x
> > > > > > > 18
> > > > > > > 0
> > > > > > > [5414248.647066]  __blk_mq_run_hw_queue+0x5f/0xf0
> > > > > > > [5414248.672633]  process_one_work+0x171/0x370
> > > > > > > [5414248.695443]  worker_thread+0x49/0x3f0
> > > > > > > [5414248.716730]  kthread+0xf8/0x130
> > > > > > > [5414248.735085]  ? max_active_store+0x80/0x80
> > > > > > > [5414248.758569]  ? kthread_bind+0x10/0x10
> > > > > > > [5414248.779953]  ret_from_fork+0x35/0x40
> > > > > > > 
> > > > > > > [5414248.801005] Modules linked in: ib_isert
> > > > > > > iscsi_target_mod
> > > > > > > target_core_mod ib_srp rpcrdma scsi_transport_srp
> > > > > > > rdma_ucm
> > > > > > > ib_iser
> > > > > > > ib_ipoib ib_umad rdma_cm libiscsi iw_cm
> > > > > > > scsi_transport_iscsi
> > > > > > > ib_cm
> > > > > > > sunrpc mlx5_ib ib_uverbs ib_core intel_powerclamp
> > > > > > > coretemp
> > > > > > > kvm_intel
> > > > > > > kvm irqbypass ipmi_ssif crct10dif_pclmul crc32_pclmul
> > > > > > > ghash_clmulni_intel aesni_intel iTCO_wdt crypto_simd
> > > > > > > cryptd
> > > > > > > gpio_ich
> > > > > > > iTCO_vendor_support glue_helper joydev ipmi_si
> > > > > > > dm_service_time
> > > > > > > pcspkr
> > > > > > > ipmi_devintf hpilo hpwdt sg ipmi_msghandler
> > > > > > > acpi_power_meter
> > > > > > > lpc_ich
> > > > > > > i7core_edac pcc_cpufreq dm_multipath ip_tables xfs
> > > > > > > libcrc32c
> > > > > > > radeon
> > > > > > > sd_mod i2c_algo_bit drm_kms_helper syscopyarea
> > > > > > > sysfillrect
> > > > > > > sysimgblt
> > > > > > > fb_sys_fops ttm drm mlx5_core crc32c_intel serio_raw
> > > > > > > i2c_core
> > > > > > > hpsa
> > > > > > > bnx2
> > > > > > > scsi_transport_sas mlxfw devlink dm_mirror dm_region_hash
> > > > > > > dm_log
> > > > > > > dm_mod
> > > > > > > [5414249.199354] CR2: 00000000000000b8
> > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > > Looking at the vmcore
> > > > > > 
> > > > > > PID: 7773   TASK: ffffa04a2c1e2b80  CPU: 4   COMMAND:
> > > > > > "kworker/4:1H"
> > > > > >  #0 [ffffb1cd8760fab0] machine_kexec at ffffffffaaa6003f
> > > > > >  #1 [ffffb1cd8760fb08] __crash_kexec at ffffffffaab373ed
> > > > > >  #2 [ffffb1cd8760fbd0] crash_kexec at ffffffffaab385b9
> > > > > >  #3 [ffffb1cd8760fbe8] oops_end at ffffffffaaa31931
> > > > > >  #4 [ffffb1cd8760fc08] no_context at ffffffffaaa6eb59
> > > > > >  #5 [ffffb1cd8760fcb0] do_page_fault at ffffffffaaa6feb2
> > > > > >  #6 [ffffb1cd8760fce0] page_fault at ffffffffab2010ee
> > > > > >     [exception RIP: blk_mq_dispatch_rq_list+201]
> > > > > >     RIP: ffffffffaad90589  RSP: ffffb1cd8760fd90  RFLAGS:
> > > > > > 00010246
> > > > > >     RAX: ffffa049d67a1308  RBX: ffffa049d67a1308  RCX:
> > > > > > 0000000000000004
> > > > > >     RDX: 0000000000000000  RSI: ffffb1cd8760fe20  RDI:
> > > > > > ffffa0552ca08000
> > > > > >     RBP: ffffb1cd8760fe20   R8: 0000000000000000   R9:
> > > > > > 8080808080808080
> > > > > >     R10: 0000000000000001  R11: 0000000000000001  R12:
> > > > > > 0000000000000000
> > > > > >     R13: ffffa049d67a12c0  R14: 0000000000000000  R15:
> > > > > > ffffa0552ca08000
> > > > > >     ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
> > > > > >  #7 [ffffb1cd8760fe18] blk_mq_sched_dispatch_requests at
> > > > > > ffffffffaad9570c
> > > > > >  #8 [ffffb1cd8760fe60] __blk_mq_run_hw_queue at
> > > > > > ffffffffaad8de3f
> > > > > >  #9 [ffffb1cd8760fe78] process_one_work at ffffffffaaab0ab1
> > > > > > #10 [ffffb1cd8760feb8] worker_thread at ffffffffaaab11d9
> > > > > > #11 [ffffb1cd8760ff10] kthread at ffffffffaaab6758
> > > > > > #12 [ffffb1cd8760ff50] ret_from_fork at ffffffffab200215
> > > > > > 
> > > > > > We were working on this request_queue for
> > > > > >  blk_mq_sched_dispatch_requests
> > > > > > 
> > > > > > crash> dev -d | grep ffffa0552ca08000
> > > > > >     8
> > > > > > ffffa055c81b5800   sdd        ffffa0552ca08000       0     
> > > > > > 0 
> > > > > >   
> > > > > >   
> > > > > > 0 
> > > > > > ]
> > > > > > 
> > > > > > That device was no longer accessible 
> > > > > > 
> > > > > > sdev_state = SDEV_TRANSPORT_OFFLINE,
> > > > > > 
> > > > > > So it looks like we tried to process a no longer valid list
> > > > > > entry
> > > > > > in 
> > > > > > blk_mq_dispatch_rq_list
> > > > > > 
> > > > > > /home/loberman/rpmbuild/BUILD/kernel-5.0.0_rc2+/block/blk-
> > > > > > mq.h:
> > > > > > 211
> > > > > > 0xffffffffaad90589
> > > > > > <blk_mq_dispatch_rq_list+201>:       mov    0xb8(%r12),%rax
> > > > > > 
> > > > > > R12 is NULL
> > > > > > 
> > > > > > 
> > > > > > From
> > > > > > static inline bool blk_mq_get_dispatch_budget(struct
> > > > > > blk_mq_hw_ctx
> > > > > > *hctx)
> > > > > > {
> > > > > >         struct request_queue *q = hctx->queue;
> > > > > > 
> > > > > >         if (q->mq_ops->get_budget)
> > > > > >                 return q->mq_ops->get_budget(hctx);
> > > > > >         return true;
> > > > > > }
> > > > > > 
> > > > > > Will wait for a reply before I try the newer kernel, but
> > > > > > looks
> > > > > > like a
> > > > > > use after free to me
> > > > > 
> > > > > Hi Laurence,
> > > > > 
> > > > > I don't think that any of the recent SRP initiator changes
> > > > > can
> > > > > be
> > > > > the
> > > > > root
> > > > > cause of this crash. However, significant changes went
> > > > > upstream
> > > > > in
> > > > > the block
> > > > > layer core during the v5.1-rc1 merge window, e.g. multi-page
> > > > > bvec
> > > > > support.
> > > > > Is it possible for you to bisect this kernel oops?
> > > > > 
> > > > > Thanks,
> > > > > 
> > > > > Bart.
> > > > > 
> > > > 
> > > > OK, I will see if I can reproduce on the fly and I will bisect.
> > > > I do agree its not SRP, more likley some block layer race.
> > > > I was just able to reproduce it using SRP
> > > > 
> > > > Note that this was on 5.0.0-rc2+, prior to me trying 5.1.
> > > > 
> > > > I usually reboot the target server as part of my test series
> > > > but
> > > > when
> > > > I
> > > > last tested 5.0.0-rc2+ I only reset the SRP interfaces and had
> > > > devices
> > > > get rediscovered.
> > > > 
> > > > I did not see it during those tests.
> > > > 
> > > > Back when I have more to share.
> > > > 
> > > > Many Thanks for your time as always
> > > > Laurence
> > > > 
> > > > 
> > > 
> > > Something crept in, in the block layer causing a use after free
> > > 4.19.0-rc1+ does not have the issue, so I will narrow the bisect 
> > > Thanks
> > > Laurence
> > > 
> > 
> > This took a long time to bisect.
> > Repeating the issue seen. We have changes that when the target is
> > rebooted with mapped SRP devices the client then experiences a list
> > corruption and panic as already shown.
> > 
> > Some stacks
> > 
> > [  222.631998] scsi host1: ib_srp: Path record query failed: sgid
> > fe80:0000:0000:0000:7cfe:9003:0072:6ed2, dgid
> > fe80:0000:0000:0000:7cfe:9003:0072:6e4e, pkey 0xffff, service_id
> > 0x7cfe900300726e4e
> > [  222.729639] scsi host1: reconnect attempt 1 failed (-110)
> > [  223.176766] fast_io_fail_tmo expired for SRP port-1:1 / host1.
> > 
> > [  223.518759] BUG: unable to handle kernel NULL pointer
> > dereference
> > at
> > 00000000000000b8
> > [  223.519736] sd 1:0:0:0: rejecting I/O to offline device
> > [  223.563769] #PF error: [normal kernel read fault]
> > [  223.563770] PGD 0 P4D 0 
> > [  223.563774] Oops: 0000 [#1] SMP PTI
> > [  223.563778] CPU: 3 PID: 9027 Comm: kworker/3:1H Tainted:
> > G          I       5.0.0-rc1 #22
> > [  223.563779] Hardware name: HP ProLiant DL380 G7, BIOS P67
> > 08/16/2015
> > [  223.563787] Workqueue: kblockd blk_mq_run_work_fn
> > [  223.593723] device-mapper: multipath: Failing path 8:48.
> > [  223.620801] RIP: 0010:blk_mq_dispatch_rq_list+0xc9/0x590
> > [  223.635266] print_req_error: I/O error, dev dm-6, sector 8191872
> > flags 80700
> > [  223.655565] Code: 0f 85 c2 04 00 00 83 44 24 28 01 48 8b 45 00
> > 48
> > 39
> > c5 0f 84 ea 00 00 00 48 8b 5d 00 80 3c 24 00 4c 8d 6b b8 4c 8b 63
> > c8
> > 75
> > 25 <49> 8b 84 24 b8 00 00 00 48 8b 40 40 48 8b 40 10 48 85 c0 74 10
> > 4c
> > [  223.655566] RSP: 0018:ffffa65b4c43fd90 EFLAGS: 00010246
> > [  223.655570] RAX: ffff93ed9bfdbbc8 RBX: ffff93ed9bfdbbc8 RCX:
> > 0000000000000004
> > [  223.702351] print_req_error: I/O error, dev dm-6, sector 8191872
> > flags 0
> > [  223.737640] RDX: 0000000000000000 RSI: ffffa65b4c43fe20 RDI:
> > ffff93ed9b838000
> > [  223.737641] RBP: ffffa65b4c43fe20 R08: 0000000000000000 R09:
> > 8080808080808080
> > [  223.737642] R10: 0000000000000001 R11: 071c71c71c71c71c R12:
> > 0000000000000000
> > [  223.737643] R13: ffff93ed9bfdbb80 R14: 0000000000000000 R15:
> > ffff93ed9b838000
> > [  223.737645] FS:  0000000000000000(0000)
> > GS:ffff93ee33840000(0000)
> > knlGS:0000000000000000
> > [  223.737646] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [  223.737646] CR2: 00000000000000b8 CR3: 000000059da0e006 CR4:
> > 00000000000206e0
> > [  223.737648] Call Trace:          
> > 
> > [  223.737657]  blk_mq_sched_dispatch_requests+0x15c/0x180  ***
> > Freed
> > already
> > 
> > [  223.737660]  __blk_mq_run_hw_queue+0x5f/0xf0
> > [  223.737665]  process_one_work+0x171/0x370
> > [  223.737667]  worker_thread+0x49/0x3f0
> > [  223.737670]  kthread+0xf8/0x130
> > [  223.737673]  ? max_active_store+0x80/0x80
> > 
> > And:
> > 
> > [  643.425005] device-mapper: multipath: Failing path 67:0.
> > [  643.696365] ------------[ cut here ]------------
> > [  643.722927] list_add corruption. prev->next should be next
> > (ffffc8c0c3bd9448), but was ffff93b971965e08.
> > (prev=ffff93b971965e08).
> > [  643.787089] WARNING: CPU: 14 PID: 6533 at lib/list_debug.c:28
> > __list_add_valid+0x6a/0x70
> > [  643.830951] Modules linked in: ib_isert iscsi_target_mod
> > target_core_mod ib_srp scsi_transport_srp rpcrdma ib_ipoib ib_umad
> > rdma_ucm ib_iser rdma_cm iw_cm libiscsi sunrpc scsi_transport_iscsi
> > ib_cm mlx5_ib ib_uverbs ib_core intel_powerclamp coretemp kvm_intel
> > kvm
> > irqbypass crct10dif_pclmul crc32_pclmul iTCO_wdt
> > ghash_clmulni_intel
> > ipmi_ssif aesni_intel iTCO_vendor_support gpio_ich crypto_simd
> > dm_service_time cryptd glue_helper joydev ipmi_si ipmi_devintf
> > pcspkr
> > hpilo hpwdt sg ipmi_msghandler acpi_power_meter pcc_cpufreq lpc_ich
> > i7core_edac dm_multipath ip_tables xfs libcrc32c radeon sd_mod
> > i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt
> > fb_sys_fops ttm drm crc32c_intel serio_raw mlx5_core i2c_core bnx2
> > hpsa
> > mlxfw scsi_transport_sas dm_mirror dm_region_hash dm_log dm_mod
> > [  644.224150] CPU: 14 PID: 6533 Comm: kworker/14:1H Tainted:
> > G          I       4.20.0+ #26
> > [  644.269637] Hardware name: HP ProLiant DL380 G7, BIOS P67
> > 08/16/2015
> > [  644.305110] Workqueue: kblockd blk_mq_run_work_fn
> > [  644.330984] RIP: 0010:__list_add_valid+0x6a/0x70
> > [  644.357601] Code: 31 c0 48 c7 c7 70 e4 ab b6 e8 22 a6 cc ff 0f
> > 0b
> > 31
> > c0 c3 48 89 d1 48 c7 c7 20 e4 ab b6 48 89 f2 48 89 c6 31 c0 e8 06
> > a6
> > cc
> > ff <0f> 0b 31 c0 c3 90 48 8b 07 48 b9 00 01 00 00 00 00 ad de 48 8b
> > 57
> > [  644.462542] RSP: 0018:ffffa8ccc72bfc00 EFLAGS: 00010286
> > [  644.491594] RAX: 0000000000000000 RBX: ffff93b971965dc0 RCX:
> > 0000000000000000
> > [  644.532745] RDX: 0000000000000001 RSI: ffff93b9b79d67b8 RDI:
> > ffff93b9b79d67b8
> > [  644.573533] RBP: ffffc8c0c3bd9448 R08: 0000000000000000 R09:
> > 000000000000072c
> > [  644.614180] R10: 0000000000000000 R11: ffffa8ccc72bf968 R12:
> > ffff93b96d454c00
> > [  644.654683] R13: ffffc8c0c3bd9440 R14: ffff93b971965e08 R15:
> > ffff93b971965e08
> > [  644.694275] FS:  0000000000000000(0000)
> > GS:ffff93b9b79c0000(0000)
> > knlGS:0000000000000000
> > [  644.739906] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [  644.771879] CR2: 00007fd566c10000 CR3: 0000000253e0e002 CR4:
> > 00000000000206e0
> > [  644.811438] Call Trace:
> > [  644.824809]  __blk_mq_insert_request+0x62/0x130
> > [  644.849886]  blk_mq_sched_insert_request+0x13c/0x1b0
> > [  644.877402]  blk_mq_try_issue_directly+0x105/0x2c0
> > [  644.904452]  blk_insert_cloned_request+0x9a/0x130
> > [  644.931146]  ? ktime_get+0x37/0x90
> > [  644.950545]  dm_mq_queue_rq+0x21c/0x3f0 [dm_mod]
> > [  644.977064]  ? blk_mq_get_driver_tag+0xa1/0x120
> > [  645.003002]  blk_mq_dispatch_rq_list+0x8e/0x590
> > [  645.029812]  ? __switch_to_asm+0x40/0x70
> > [  645.052059]  ? __switch_to_asm+0x34/0x70
> > [  645.074664]  ? __switch_to_asm+0x40/0x70
> > [  645.097111]  ? __switch_to_asm+0x34/0x70
> > [  645.119948]  ? __switch_to_asm+0x34/0x70
> > [  645.143101]  ? __switch_to_asm+0x40/0x70
> > [  645.165273]  ? syscall_return_via_sysret+0x10/0x7f
> > [  645.192161]  blk_mq_sched_dispatch_requests+0xe8/0x180
> > [  645.221460]  __blk_mq_run_hw_queue+0x5f/0xf0
> > [  645.244766]  process_one_work+0x171/0x370
> > [  645.267164]  worker_thread+0x49/0x3f0
> > [  645.287860]  kthread+0xf8/0x130
> > [  645.304894]  ? max_active_store+0x80/0x80
> > [  645.327748]  ? kthread_bind+0x10/0x10
> > [  645.347898]  ret_from_fork+0x35/0x40
> > [  645.368298] ---[ end trace afa70bf68ffb006b ]---
> > [  645.397356] ------------[ cut here ]------------
> > [  645.423878] list_add corruption. prev->next should be next
> > (ffffc8c0c3bd9448), but was ffff93b971965e08.
> > (prev=ffff93b971965e08).
> > [  645.488009] WARNING: CPU: 14 PID: 6533 at lib/list_debug.c:28
> > __list_add_valid+0x6a/0x70
> > 
> > 
> > I started just looking at block but nothing made sense so I re-ran
> > the
> > bisect with the entire kernel.
> > 
> > $ git bisect start 
> > $ git bisect good v4.20
> > $ git bisect bad v5.0-rc1
> > 
> > Bisecting: 5477 revisions left to test after this (roughly 13
> > steps)
> > *** Groan
> > 
> > I got to here and the problem is it's an entire merge
> > 
> > [loberman@ibclient linux_torvalds]$ git bisect bad
> > 4b9254328254bed12a4ac449cdff2c332e630837 is the first bad commit
> > [loberman@ibclient linux_torvalds]$ git show
> > 4b9254328254bed12a4ac449cdff2c332e630837
> > commit 4b9254328254bed12a4ac449cdff2c332e630837
> > Merge: 1a9430d cd19181
> > Author: Jens Axboe <axboe@kernel.dk>
> > Date:   Tue Dec 18 08:29:53 2018 -0700
> > 
> >     Merge branch 'for-4.21/block' into for-4.21/aio
> >     
> >     * for-4.21/block: (351 commits)
> >       blk-mq: enable IO poll if .nr_queues of type poll > 0
> >       blk-mq: change blk_mq_queue_busy() to blk_mq_queue_inflight()
> >       blk-mq: skip zero-queue maps in blk_mq_map_swqueue
> >       block: fix blk-iolatency accounting underflow
> >       blk-mq: fix dispatch from sw queue
> >       block: mq-deadline: Fix write completion handling
> >       nvme-pci: don't share queue maps
> >       blk-mq: only dispatch to non-defauly queue maps if they have
> > queues
> >       blk-mq: export hctx->type in debugfs instead of sysfs
> >       blk-mq: fix allocation for queue mapping table
> >       blk-wbt: export internal state via debugfs
> >       blk-mq-debugfs: support rq_qos
> >       block: update sysfs documentation
> >       block: loop: check error using IS_ERR instead of
> > IS_ERR_OR_NULL
> > in loop_add()
> >       aoe: add __exit annotation
> >       block: clear REQ_HIPRI if polling is not supported
> >       blk-mq: replace and kill blk_mq_request_issue_directly
> >       blk-mq: issue directly with bypass 'false' in
> > blk_mq_sched_insert_requests
> >       blk-mq: refactor the code of issue request directly
> >       block: remove the bio_integrity_advance export
> >       ...
> > 
> > Definitely looks like a list corruption and use after free but it's
> > the
> > merge, so looking for ideas now from the block folks,
> > 
> > 
> > 
> 
> for-4.21/block: (351 commits) *****
> I will have to try to narrow these down based on the heading of each and
> do a full review of the vmcore I captured.
> 
> This exposure is present from v5.0-rc1 onwards
> 

Update on this issue

If I reset the SRP target server and do not reboot it I do not see the
block-mq race with the SRP initiator.
Just resetting the SRP target array and not running the reboot avoids
the shutdown scripts of ib_srpt and seems somehow to prevent the race
with the initiator code.

Given that I see very little churn and activity with LIO lately I don't
know if it's worth spending more time on trying to isolate this.

If anybody has an SRP native array (like DDN) and can reboot it with
5.x plus, that would be a good test.

I will reach out to DDN and ask if they can test it.

Regards
Laurence



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Panic when rebooting target server testing srp on 5.0.0-rc2
  2019-04-03 14:53             ` Laurence Oberman
@ 2019-04-03 15:09               ` Bart Van Assche
  2019-04-03 17:58                 ` Laurence Oberman
  0 siblings, 1 reply; 18+ messages in thread
From: Bart Van Assche @ 2019-04-03 15:09 UTC (permalink / raw)
  To: Laurence Oberman, linux-rdma, linux-block, Jens Axboe; +Cc: linux-scsi

On Wed, 2019-04-03 at 10:53 -0400, Laurence Oberman wrote:
> Update on this issue
> 
> If I reset the SRP target server and do not reboot it I do not see the
> block-mq race with the SRP initiator.
> Just resetting the SRP target array and not running the reboot avoids
> the shutdown scripts of ib_srpt and seems somehow to prevent the race
> with the initiator code.
> 
> Given that I see very little churn and activity with LIO lately I don't
> know if it's worth spending more time on trying to isolate this.
> 
> If anybody has an SRP native array (like DDN) and can reboot it with
> 5.x plus, that would be a good test.
> 
> I will reach out to DDN and ask if they can test it.

Hi Laurence,

If the initiator side crashes it means there is a bug in the kernel at the
initiator side. It's not clear to me why you are commenting on the target
side in this e-mail thread?

Thanks,

Bart.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Panic when rebooting target server testing srp on 5.0.0-rc2
  2019-03-27 22:00                 ` Laurence Oberman
  2019-03-27 23:34                   ` Laurence Oberman
@ 2019-04-03 17:52                   ` Bart Van Assche
  2019-04-03 17:59                     ` Laurence Oberman
                                       ` (2 more replies)
  1 sibling, 3 replies; 18+ messages in thread
From: Bart Van Assche @ 2019-04-03 17:52 UTC (permalink / raw)
  To: Laurence Oberman, Ming Lei, jianchao.w.wang
  Cc: linux-rdma, linux-block, Jens Axboe, linux-scsi

On Wed, 2019-03-27 at 18:00 -0400, Laurence Oberman wrote:
> Hello Jens, Jianchao
> Finally made it to this one.
> I will see if I can revert and test
> 
> 7f556a44e61d0b62d78db9a2662a5f0daef010f2 is the first bad commit
> commit 7f556a44e61d0b62d78db9a2662a5f0daef010f2
> Author: Jianchao Wang <jianchao.w.wang@oracle.com>
> Date:   Fri Dec 14 09:28:18 2018 +0800
> 
>     blk-mq: refactor the code of issue request directly
>     
>     Merge blk_mq_try_issue_directly and __blk_mq_try_issue_directly
>     into one interface to unify the interfaces to issue requests
>     directly. The merged interface takes over the requests totally,
>     it could insert, end or do nothing based on the return value of
>     .queue_rq and 'bypass' parameter. Then caller needn't any other
>     handling any more and then code could be cleaned up.
>     
>     And also the commit c616cbee ( blk-mq: punt failed direct issue
>     to dispatch list ) always inserts requests to hctx dispatch list
>     whenever get a BLK_STS_RESOURCE or BLK_STS_DEV_RESOURCE, this is
>     overkill and will harm the merging. We just need to do that for
>     the requests that has been through .queue_rq. This patch also
>     could fix this.
>     
>     Signed-off-by: Jianchao Wang <jianchao.w.wang@oracle.com>
>     Signed-off-by: Jens Axboe <axboe@kernel.dk>

Hi Laurence,

I have not been able to reproduce this issue. But you may want to try
the following patch (applies on top of v5.1-rc3):


Subject: [PATCH] block: Fix blk_mq_try_issue_directly()

If blk_mq_try_issue_directly() returns BLK_STS*_RESOURCE that means that
the request has not been queued and that the caller should retry to submit
the request. Both blk_mq_request_bypass_insert() and
blk_mq_sched_insert_request() guarantee that a request will be processed.
Hence return BLK_STS_OK if one of these functions is called. This patch
avoids that blk_mq_dispatch_rq_list() crashes when using dm-mpath.

Reported-by: Laurence Oberman <loberman@redhat.com>
Fixes: 7f556a44e61d ("blk-mq: refactor the code of issue request directly") # v5.0.
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 block/blk-mq.c | 9 ++-------
 1 file changed, 2 insertions(+), 7 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 652d0c6d5945..b2c20dce8a30 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1859,16 +1859,11 @@ blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
 	case BLK_STS_RESOURCE:
 		if (force) {
 			blk_mq_request_bypass_insert(rq, run_queue);
-			/*
-			 * We have to return BLK_STS_OK for the DM
-			 * to avoid livelock. Otherwise, we return
-			 * the real result to indicate whether the
-			 * request is direct-issued successfully.
-			 */
-			ret = bypass ? BLK_STS_OK : ret;
+			ret = BLK_STS_OK;
 		} else if (!bypass) {
 			blk_mq_sched_insert_request(rq, false,
 						    run_queue, false);
+			ret = BLK_STS_OK;
 		}
 		break;
 	default:
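
For reference, here is a minimal standalone sketch (userspace C with
hypothetical stand-in helpers, not the kernel functions themselves) of the
ownership rule the patch enforces: once the request has been handed to a
fallback insert path, the caller must see BLK_STS_OK so it does not retry a
request it no longer owns; only in the !force && bypass case, where nothing
was inserted, does the real status propagate back.

/*
 * Toy model of the BLK_STS_RESOURCE branch after the patch. The two
 * insert helpers stand in for blk_mq_request_bypass_insert() and
 * blk_mq_sched_insert_request(); everything else is illustrative only.
 */
#include <stdbool.h>
#include <stdio.h>

typedef enum { BLK_STS_OK, BLK_STS_RESOURCE } blk_status_t;

static void bypass_insert(int rq) { printf("rq %d put on the hctx dispatch list\n", rq); }
static void sched_insert(int rq)  { printf("rq %d handed to the I/O scheduler\n", rq); }

static blk_status_t resource_case(int rq, bool force, bool bypass)
{
	blk_status_t ret = BLK_STS_RESOURCE;	/* what .queue_rq returned */

	if (force) {
		bypass_insert(rq);
		ret = BLK_STS_OK;	/* the dispatch list owns the request now */
	} else if (!bypass) {
		sched_insert(rq);
		ret = BLK_STS_OK;	/* the scheduler owns the request now */
	}
	/* !force && bypass: the caller keeps the request, so keep the real status */
	return ret;
}

int main(void)
{
	printf("force:           %d\n", resource_case(1, true,  false));
	printf("!force, !bypass: %d\n", resource_case(2, false, false));
	printf("!force,  bypass: %d\n", resource_case(3, false, true));
	return 0;
}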


Thanks,

Bart.

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: Panic when rebooting target server testing srp on 5.0.0-rc2
  2019-04-03 15:09               ` Bart Van Assche
@ 2019-04-03 17:58                 ` Laurence Oberman
  0 siblings, 0 replies; 18+ messages in thread
From: Laurence Oberman @ 2019-04-03 17:58 UTC (permalink / raw)
  To: Bart Van Assche, linux-rdma, linux-block, Jens Axboe; +Cc: linux-scsi

On Wed, 2019-04-03 at 08:09 -0700, Bart Van Assche wrote:
> On Wed, 2019-04-03 at 10:53 -0400, Laurence Oberman wrote:
> > Update on this issue
> > 
> > If I reset the SRP target server and do not reboot it, I do not see
> > the block-mq race with the SRP initiator.
> > Just resetting the SRP target array, without running the reboot,
> > avoids the ib_srpt shutdown scripts and somehow seems to prevent the
> > race with the initiator code.
> > 
> > Given that I see very little churn and activity with LIO lately, I
> > don't know if it's worth spending more time on trying to isolate this.
> > 
> > If anybody has a native SRP array (like DDN) and can reboot it with
> > 5.x or later, that would be a good test.
> > 
> > I will reach out to DDN and ask if they can test it.
> 
> Hi Laurence,
> 
> If the initiator side crashes it means there is a bug in the kernel
> at the
> initiator side. It's not clear to me why you are commenting on the
> target
> side in this e-mail thread?
> 
> Thanks,
> 
> Bart.

Hello Bart

It does indeed seem to be an initiator issue, but one related to something
that happens when the target reboots. If I only reset the target, the
initiator survives. I did not know this until today; until today I was
rebooting the target server to reproduce the race on the initiator side.
Today I started testing the same way as I was when I reproduced the
Qlogic lock recursion, and I noticed I could no longer reproduce it.

In other words, rebooting the target closes the SRP target ports in such
a way that the initiator gets into the block-mq race. With "echo b" or a
power reset the array just disappears and the initiator is fine.

Apologies if I was not clear.

Regards
Laurence



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Panic when rebooting target server testing srp on 5.0.0-rc2
  2019-04-03 17:52                   ` Bart Van Assche
@ 2019-04-03 17:59                     ` Laurence Oberman
  2019-04-03 18:43                     ` Laurence Oberman
  2019-04-03 21:47                     ` Sagi Grimberg
  2 siblings, 0 replies; 18+ messages in thread
From: Laurence Oberman @ 2019-04-03 17:59 UTC (permalink / raw)
  To: Bart Van Assche, Ming Lei, jianchao.w.wang
  Cc: linux-rdma, linux-block, Jens Axboe, linux-scsi

On Wed, 2019-04-03 at 10:52 -0700, Bart Van Assche wrote:
> On Wed, 2019-03-27 at 18:00 -0400, Laurence Oberman wrote:
> > Hello Jens, Jianchao
> > Finally made it to this one.
> > I will see if I can revert and test
> > 
> > 7f556a44e61d0b62d78db9a2662a5f0daef010f2 is the first bad commit
> > commit 7f556a44e61d0b62d78db9a2662a5f0daef010f2
> > Author: Jianchao Wang <jianchao.w.wang@oracle.com>
> > Date:   Fri Dec 14 09:28:18 2018 +0800
> > 
> >     blk-mq: refactor the code of issue request directly
> >     
> >     Merge blk_mq_try_issue_directly and __blk_mq_try_issue_directly
> >     into one interface to unify the interfaces to issue requests
> >     directly. The merged interface takes over the requests totally,
> >     it could insert, end or do nothing based on the return value of
> >     .queue_rq and 'bypass' parameter. Then caller needn't any other
> >     handling any more and then code could be cleaned up.
> >     
> >     And also the commit c616cbee ( blk-mq: punt failed direct issue
> >     to dispatch list ) always inserts requests to hctx dispatch
> > list
> >     whenever get a BLK_STS_RESOURCE or BLK_STS_DEV_RESOURCE, this
> > is
> >     overkill and will harm the merging. We just need to do that for
> >     the requests that has been through .queue_rq. This patch also
> >     could fix this.
> >     
> >     Signed-off-by: Jianchao Wang <jianchao.w.wang@oracle.com>
> >     Signed-off-by: Jens Axboe <axboe@kernel.dk>
> 
> Hi Laurence,
> 
> I have not been able to reproduce this issue. But you may want to try
> the following patch (applies on top of v5.1-rc3):
> 
> 
> Subject: [PATCH] block: Fix blk_mq_try_issue_directly()
> 
> If blk_mq_try_issue_directly() returns BLK_STS*_RESOURCE that means
> that
> the request has not been queued and that the caller should retry to
> submit
> the request. Both blk_mq_request_bypass_insert() and
> blk_mq_sched_insert_request() guarantee that a request will be
> processed.
> Hence return BLK_STS_OK if one of these functions is called. This
> patch
> avoids that blk_mq_dispatch_rq_list() crashes when using dm-mpath.
> 
> Reported-by: Laurence Oberman <loberman@redhat.com>
> Fixes: 7f556a44e61d ("blk-mq: refactor the code of issue request
> directly") # v5.0.
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
> ---
>  block/blk-mq.c | 9 ++-------
>  1 file changed, 2 insertions(+), 7 deletions(-)
> 
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index 652d0c6d5945..b2c20dce8a30 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -1859,16 +1859,11 @@ blk_status_t blk_mq_try_issue_directly(struct
> blk_mq_hw_ctx *hctx,
>  	case BLK_STS_RESOURCE:
>  		if (force) {
>  			blk_mq_request_bypass_insert(rq, run_queue);
> -			/*
> -			 * We have to return BLK_STS_OK for the DM
> -			 * to avoid livelock. Otherwise, we return
> -			 * the real result to indicate whether the
> -			 * request is direct-issued successfully.
> -			 */
> -			ret = bypass ? BLK_STS_OK : ret;
> +			ret = BLK_STS_OK;
>  		} else if (!bypass) {
>  			blk_mq_sched_insert_request(rq, false,
>  						    run_queue, false);
> +			ret = BLK_STS_OK;
>  		}
>  		break;
>  	default:
> 
> 
> Thanks,
> 
> Bart.

Thanks Bart
About to build and test


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Panic when rebooting target server testing srp on 5.0.0-rc2
  2019-04-03 17:52                   ` Bart Van Assche
  2019-04-03 17:59                     ` Laurence Oberman
@ 2019-04-03 18:43                     ` Laurence Oberman
  2019-04-03 21:47                     ` Sagi Grimberg
  2 siblings, 0 replies; 18+ messages in thread
From: Laurence Oberman @ 2019-04-03 18:43 UTC (permalink / raw)
  To: Bart Van Assche, Ming Lei, jianchao.w.wang
  Cc: linux-rdma, linux-block, Jens Axboe, linux-scsi

On Wed, 2019-04-03 at 10:52 -0700, Bart Van Assche wrote:
> On Wed, 2019-03-27 at 18:00 -0400, Laurence Oberman wrote:
> > Hello Jens, Jianchao
> > Finally made it to this one.
> > I will see if I can revert and test
> > 
> > 7f556a44e61d0b62d78db9a2662a5f0daef010f2 is the first bad commit
> > commit 7f556a44e61d0b62d78db9a2662a5f0daef010f2
> > Author: Jianchao Wang <jianchao.w.wang@oracle.com>
> > Date:   Fri Dec 14 09:28:18 2018 +0800
> > 
> >     blk-mq: refactor the code of issue request directly
> >     
> >     Merge blk_mq_try_issue_directly and __blk_mq_try_issue_directly
> >     into one interface to unify the interfaces to issue requests
> >     directly. The merged interface takes over the requests totally,
> >     it could insert, end or do nothing based on the return value of
> >     .queue_rq and 'bypass' parameter. Then caller needn't any other
> >     handling any more and then code could be cleaned up.
> >     
> >     And also the commit c616cbee ( blk-mq: punt failed direct issue
> >     to dispatch list ) always inserts requests to hctx dispatch
> > list
> >     whenever get a BLK_STS_RESOURCE or BLK_STS_DEV_RESOURCE, this
> > is
> >     overkill and will harm the merging. We just need to do that for
> >     the requests that has been through .queue_rq. This patch also
> >     could fix this.
> >     
> >     Signed-off-by: Jianchao Wang <jianchao.w.wang@oracle.com>
> >     Signed-off-by: Jens Axboe <axboe@kernel.dk>
> 
> Hi Laurence,
> 
> I have not been able to reproduce this issue. But you may want to try
> the following patch (applies on top of v5.1-rc3):
> 
> 
> Subject: [PATCH] block: Fix blk_mq_try_issue_directly()
> 
> If blk_mq_try_issue_directly() returns BLK_STS*_RESOURCE that means
> that
> the request has not been queued and that the caller should retry to
> submit
> the request. Both blk_mq_request_bypass_insert() and
> blk_mq_sched_insert_request() guarantee that a request will be
> processed.
> Hence return BLK_STS_OK if one of these functions is called. This
> patch
> avoids that blk_mq_dispatch_rq_list() crashes when using dm-mpath.
> 
> Reported-by: Laurence Oberman <loberman@redhat.com>
> Fixes: 7f556a44e61d ("blk-mq: refactor the code of issue request
> directly") # v5.0.
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
> ---
>  block/blk-mq.c | 9 ++-------
>  1 file changed, 2 insertions(+), 7 deletions(-)
> 
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index 652d0c6d5945..b2c20dce8a30 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -1859,16 +1859,11 @@ blk_status_t blk_mq_try_issue_directly(struct
> blk_mq_hw_ctx *hctx,
>  	case BLK_STS_RESOURCE:
>  		if (force) {
>  			blk_mq_request_bypass_insert(rq, run_queue);
> -			/*
> -			 * We have to return BLK_STS_OK for the DM
> -			 * to avoid livelock. Otherwise, we return
> -			 * the real result to indicate whether the
> -			 * request is direct-issued successfully.
> -			 */
> -			ret = bypass ? BLK_STS_OK : ret;
> +			ret = BLK_STS_OK;
>  		} else if (!bypass) {
>  			blk_mq_sched_insert_request(rq, false,
>  						    run_queue, false);
> +			ret = BLK_STS_OK;
>  		}
>  		break;
>  	default:
> 
> 
> Thanks,
> 
> Bart.

Hello Bart

For the above:

Reviewed-by: Laurence Oberman <loberman@redhat.com>
Tested-by: Laurence Oberman <loberman@redhat.com>


Thank you. Given that I know this issue very well, I can confirm your
patch fixes it.
Against 5.1-rc3 the initiator no longer panics when I reboot the
ib_srpt target server. It continues to try to reconnect, as it should.
I would never have found this. The patch makes sense now, of course, so
I can review it.

[  221.285919] device-mapper: multipath: Failing path 8:176.
[  221.286182] device-mapper: multipath: Failing path 65:144.
[  221.286266] device-mapper: multipath: Failing path 65:0.
[  221.286625] device-mapper: multipath: Failing path 65:32.
[  221.286708] device-mapper: multipath: Failing path 65:96.
[  221.286965] device-mapper: multipath: Failing path 65:224.
[  221.287115] device-mapper: multipath: Failing path 66:48.
[  221.309589] sd 1:0:0:14: rejecting I/O to offline device
[  221.309595] sd 1:0:0:6: rejecting I/O to offline device
[  231.692106] scsi host2: ib_srp: Got failed path rec status -110
[  231.722521] scsi host2: ib_srp: Path record query failed: sgid
fe80:0000:0000:0000:7cfe:9003:0072:6ed3, dgid
fe80:0000:0000:0000:7cfe:9003:0072:6e4f, pkey 0xffff, service_id
0x7cfe900300726e4e
[  231.816709] scsi host2: reconnect attempt 2 failed (-110)
[  236.684030] scsi host1: ib_srp: Got failed path rec status -110
[  236.716132] scsi host1: ib_srp: Path record query failed: sgid
fe80:0000:0000:0000:7cfe:9003:0072:6ed2, dgid
fe80:0000:0000:0000:7cfe:9003:0072:6e4e, pkey 0xffff, service_id
0x7cfe900300726e4e
[  236.814095] scsi host1: reconnect attempt 2 failed (-110)





^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Panic when rebooting target server testing srp on 5.0.0-rc2
  2019-04-03 17:52                   ` Bart Van Assche
  2019-04-03 17:59                     ` Laurence Oberman
  2019-04-03 18:43                     ` Laurence Oberman
@ 2019-04-03 21:47                     ` Sagi Grimberg
  2019-04-03 21:54                       ` Bart Van Assche
  2 siblings, 1 reply; 18+ messages in thread
From: Sagi Grimberg @ 2019-04-03 21:47 UTC (permalink / raw)
  To: Bart Van Assche, Laurence Oberman, Ming Lei, jianchao.w.wang
  Cc: linux-rdma, linux-block, Jens Axboe, linux-scsi


> I have not been able to reproduce this issue. But you may want to try
> the following patch (applies on top of v5.1-rc3):
> 
> 
> Subject: [PATCH] block: Fix blk_mq_try_issue_directly()
> 
> If blk_mq_try_issue_directly() returns BLK_STS*_RESOURCE that means that
> the request has not been queued and that the caller should retry to submit
> the request. Both blk_mq_request_bypass_insert() and
> blk_mq_sched_insert_request() guarantee that a request will be processed.
> Hence return BLK_STS_OK if one of these functions is called. This patch
> avoids that blk_mq_dispatch_rq_list() crashes when using dm-mpath.
> 
> Reported-by: Laurence Oberman <loberman@redhat.com>
> Fixes: 7f556a44e61d ("blk-mq: refactor the code of issue request directly") # v5.0.
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
> ---
>   block/blk-mq.c | 9 ++-------
>   1 file changed, 2 insertions(+), 7 deletions(-)
> 
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index 652d0c6d5945..b2c20dce8a30 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -1859,16 +1859,11 @@ blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
>   	case BLK_STS_RESOURCE:
>   		if (force) {
>   			blk_mq_request_bypass_insert(rq, run_queue);
> -			/*
> -			 * We have to return BLK_STS_OK for the DM
> -			 * to avoid livelock. Otherwise, we return
> -			 * the real result to indicate whether the
> -			 * request is direct-issued successfully.
> -			 */
> -			ret = bypass ? BLK_STS_OK : ret;
> +			ret = BLK_STS_OK;
>   		} else if (!bypass) {
>   			blk_mq_sched_insert_request(rq, false,
>   						    run_queue, false);
> +			ret = BLK_STS_OK;
>   		}

I would move the ret assignment here after the condition statement.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Panic when rebooting target server testing srp on 5.0.0-rc2
  2019-04-03 21:47                     ` Sagi Grimberg
@ 2019-04-03 21:54                       ` Bart Van Assche
  0 siblings, 0 replies; 18+ messages in thread
From: Bart Van Assche @ 2019-04-03 21:54 UTC (permalink / raw)
  To: Sagi Grimberg, Laurence Oberman, Ming Lei, jianchao.w.wang
  Cc: linux-rdma, linux-block, Jens Axboe, linux-scsi

On Wed, 2019-04-03 at 14:47 -0700, Sagi Grimberg wrote:
> > I have not been able to reproduce this issue. But you may want to try
> > the following patch (applies on top of v5.1-rc3):
> > 
> > 
> > Subject: [PATCH] block: Fix blk_mq_try_issue_directly()
> > 
> > If blk_mq_try_issue_directly() returns BLK_STS*_RESOURCE that means that
> > the request has not been queued and that the caller should retry to submit
> > the request. Both blk_mq_request_bypass_insert() and
> > blk_mq_sched_insert_request() guarantee that a request will be processed.
> > Hence return BLK_STS_OK if one of these functions is called. This patch
> > avoids that blk_mq_dispatch_rq_list() crashes when using dm-mpath.
> > 
> > Reported-by: Laurence Oberman <loberman@redhat.com>
> > Fixes: 7f556a44e61d ("blk-mq: refactor the code of issue request directly") # v5.0.
> > Signed-off-by: Bart Van Assche <bvanassche@acm.org>
> > ---
> >   block/blk-mq.c | 9 ++-------
> >   1 file changed, 2 insertions(+), 7 deletions(-)
> > 
> > diff --git a/block/blk-mq.c b/block/blk-mq.c
> > index 652d0c6d5945..b2c20dce8a30 100644
> > --- a/block/blk-mq.c
> > +++ b/block/blk-mq.c
> > @@ -1859,16 +1859,11 @@ blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
> >   	case BLK_STS_RESOURCE:
> >   		if (force) {
> >   			blk_mq_request_bypass_insert(rq, run_queue);
> > -			/*
> > -			 * We have to return BLK_STS_OK for the DM
> > -			 * to avoid livelock. Otherwise, we return
> > -			 * the real result to indicate whether the
> > -			 * request is direct-issued successfully.
> > -			 */
> > -			ret = bypass ? BLK_STS_OK : ret;
> > +			ret = BLK_STS_OK;
> >   		} else if (!bypass) {
> >   			blk_mq_sched_insert_request(rq, false,
> >   						    run_queue, false);
> > +			ret = BLK_STS_OK;
> >   		}
> 
> I would move the ret assignment here after the condition statement.

How is that expected to work for the !force && bypass case? I don't
want to alter "ret" in that case.
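
To make the objection concrete, here is the same hunk rearranged as
suggested, purely for illustration (not a proposed patch): with the
assignment hoisted below the if/else, the !force && bypass case, where
neither insert helper runs and the caller still owns the request, would
also have its real status overwritten with BLK_STS_OK.

	case BLK_STS_RESOURCE:
		if (force) {
			blk_mq_request_bypass_insert(rq, run_queue);
		} else if (!bypass) {
			blk_mq_sched_insert_request(rq, false,
						    run_queue, false);
		}
		/* also clobbers ret for the !force && bypass case */
		ret = BLK_STS_OK;
		break;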

Thanks,

Bart.

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2019-04-03 21:54 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <6e19971d315f4a3ce2cc20a1c6693f4a263a280c.camel@redhat.com>
     [not found] ` <7858e19ce3fc3ebf7845494a2209c58cd9e3086d.camel@redhat.com>
     [not found]   ` <1553113730.65329.60.camel@acm.org>
     [not found]     ` <3645c45e88523d4b242333d96adbb492ab100f97.camel@redhat.com>
     [not found]       ` <d9febee1de759206bbf4d66f6570415ae64e4f33.camel@redhat.com>
2019-03-26 15:16         ` Panic when rebooting target server testing srp on 5.0.0-rc2 Laurence Oberman
2019-03-26 19:06           ` Laurence Oberman
2019-04-03 14:53             ` Laurence Oberman
2019-04-03 15:09               ` Bart Van Assche
2019-04-03 17:58                 ` Laurence Oberman
2019-03-27  1:40           ` Ming Lei
2019-03-27 12:35             ` Laurence Oberman
2019-03-27 12:56               ` Laurence Oberman
2019-03-27 22:00                 ` Laurence Oberman
2019-03-27 23:34                   ` Laurence Oberman
2019-03-28  1:31                     ` Ming Lei
2019-03-28 14:58                       ` Laurence Oberman
2019-03-28 15:09                         ` Laurence Oberman
2019-04-03 17:52                   ` Bart Van Assche
2019-04-03 17:59                     ` Laurence Oberman
2019-04-03 18:43                     ` Laurence Oberman
2019-04-03 21:47                     ` Sagi Grimberg
2019-04-03 21:54                       ` Bart Van Assche
