All of lore.kernel.org
 help / color / mirror / Atom feed
* blktest failures
@ 2022-04-08 22:10 Bob Pearson
  2022-04-08 22:50 ` Bob Pearson
  0 siblings, 1 reply; 15+ messages in thread
From: Bob Pearson @ 2022-04-08 22:10 UTC (permalink / raw)
  To: Bart Van Assche, Jason Gunthorpe, linux-rdma, Yi Zhang

Bart,

I finally was able to build a kernel with lockdep enabled correctly and saw the error that you and others reported.
I am not familiar with lockdep output but I am guessing that it is reporting a mismatch between a _bh spinlock
and a _irqsave spinlock (since those are the only two types used by the driver.)

I went on campaign a while back to replace all the locks with _bh locks because I figured they would be
faster than _irqsave locks and because the driver never touched a lock except from a verbs API call or from
a tasklet (softirq.) As it turned out some code makes verbs API calls while in hardirq context which broke
that assumption. So some of the locks were reverted back to irqsave locks which fixed those warnings.

Now it is happening again. I did an experiment and went through the rxe driver and replaced all spinlocks
with _irqsave locks. Now the lockdep splats have gone away and the srp/001 test reports success. BUT,
it hangs and doesn't finish. If I try to run all the tests I get warnings about unable to remove the
scsi_debug driver. I am able to remove the rdma_rxe driver and reload it. I am not seeing any errors in
the rxe driver.

Do you have any ideas what to look at next?

Bob

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: blktest failures
  2022-04-08 22:10 blktest failures Bob Pearson
@ 2022-04-08 22:50 ` Bob Pearson
  2022-04-08 23:25   ` Bart Van Assche
  0 siblings, 1 reply; 15+ messages in thread
From: Bob Pearson @ 2022-04-08 22:50 UTC (permalink / raw)
  To: Bart Van Assche, Jason Gunthorpe, linux-rdma, Yi Zhang

On 4/8/22 17:10, Bob Pearson wrote:
> Bart,
> 
> I finally was able to build a kernel with lockdep enabled correctly and saw the error that you and others reported.
> I am not familiar with lockdep output but I am guessing that it is reporting a mismatch between a _bh spinlock
> and a _irqsave spinlock (since those are the only two types used by the driver.)
> 
> I went on campaign a while back to replace all the locks with _bh locks because I figured they would be
> faster than _irqsave locks and because the driver never touched a lock except from a verbs API call or from
> a tasklet (softirq.) As it turned out some code makes verbs API calls while in hardirq context which broke
> that assumption. So some of the locks were reverted back to irqsave locks which fixed those warnings.
> 
> Now it is happening again. I did an experiment and went through the rxe driver and replaced all spinlocks
> with _irqsave locks. Now the lockdep splats have gone away and the srp/001 test reports success. BUT,
> it hangs and doesn't finish. If I try to run all the tests I get warnings about unable to remove the
> scsi_debug driver. I am able to remove the rdma_rxe driver and reload it. I am not seeing any errors in
> the rxe driver.
> 
> Do you have any ideas what to look at next?
> 
> Bob

Actually it doesn't hang forever but I get the following

......
[  107.579576] sd 4:0:0:0: [sdb] Synchronizing SCSI cache

[  291.970133] sd 4:0:0:0: [sdb] Synchronize Cache(10) failed: Result: hostbyte=DID_TIME_OUT driverbyte=DRIVER_OK

[  292.247547] rdma_rxe: unloaded

So it waits for about 3 minutes for something and then gives up.

Bob

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: blktest failures
  2022-04-08 22:50 ` Bob Pearson
@ 2022-04-08 23:25   ` Bart Van Assche
  2022-04-09  0:31     ` Yi Zhang
  2022-04-09  5:04     ` Christoph Hellwig
  0 siblings, 2 replies; 15+ messages in thread
From: Bart Van Assche @ 2022-04-08 23:25 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Bob Pearson, Martin K. Petersen, Jason Gunthorpe, linux-rdma, Yi Zhang

On 4/8/22 15:50, Bob Pearson wrote:
> Actually it doesn't hang forever but I get the following
> 
> ......
> [  107.579576] sd 4:0:0:0: [sdb] Synchronizing SCSI cache
> 
> [  291.970133] sd 4:0:0:0: [sdb] Synchronize Cache(10) failed: Result: hostbyte=DID_TIME_OUT driverbyte=DRIVER_OK
> 
> [  292.247547] rdma_rxe: unloaded
> 
> So it waits for about 3 minutes for something and then gives up.

(+Christoph)

Hi Bob,

I can reproduce this behavior with the Soft-iWARP driver by running the 
following command:

cd blktests && use_siw=1 ./check -q srp/001

Christoph, the call stack involved in this issue is as follows:

__schedule+0x4c3/0xd20
schedule+0x82/0x110
schedule_timeout+0x122/0x200
io_schedule_timeout+0x7b/0xc0
__wait_for_common+0x2bc/0x380
wait_for_completion_io_timeout+0x1d/0x20
blk_execute_rq+0x1db/0x200
__scsi_execute+0x1fb/0x310
sd_sync_cache+0x155/0x2c0 [sd_mod]
sd_shutdown+0xbb/0x190 [sd_mod]
sd_remove+0x5b/0x80 [sd_mod]
device_remove+0x9a/0xb0
device_release_driver_internal+0x2c5/0x360
device_release_driver+0x12/0x20
bus_remove_device+0x1aa/0x270
device_del+0x2d4/0x640
__scsi_remove_device+0x168/0x1a0
scsi_forget_host+0xa8/0xb0
scsi_remove_host+0x9b/0x150
sdebug_driver_remove+0x3d/0x140 [scsi_debug]
device_remove+0x6f/0xb0
device_release_driver_internal+0x2c5/0x360
device_release_driver+0x12/0x20
bus_remove_device+0x1aa/0x270
device_del+0x2d4/0x640
device_unregister+0x18/0x70
sdebug_do_remove_host+0x138/0x180 [scsi_debug]
scsi_debug_exit+0x45/0xd5 [scsi_debug]
__do_sys_delete_module.constprop.0+0x210/0x320
__x64_sys_delete_module+0x1f/0x30
do_syscall_64+0x35/0x80
entry_SYSCALL_64_after_hwframe+0x44/0xae

One of the functions in the above call stack is sd_remove(). sd_remove() 
calls del_gendisk() just before calling sd_shutdown(). sd_shutdown() 
submits the SYNCHRONIZE CACHE command. In del_gendisk() I found the 
following comment: "Fail any new I/O". Do you agree that failing new I/O 
before sd_shutdown() is called is wrong? Is there any other way to fix 
this than moving the blk_queue_start_drain() etc. calls out of 
del_gendisk() and into a new function?

Thanks,

Bart.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: blktest failures
  2022-04-08 23:25   ` Bart Van Assche
@ 2022-04-09  0:31     ` Yi Zhang
  2022-04-09  4:33       ` Bart Van Assche
  2022-04-09  5:04     ` Christoph Hellwig
  1 sibling, 1 reply; 15+ messages in thread
From: Yi Zhang @ 2022-04-09  0:31 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Christoph Hellwig, Bob Pearson, Martin K. Petersen,
	Jason Gunthorpe, linux-rdma

On Sat, Apr 9, 2022 at 7:25 AM Bart Van Assche <bvanassche@acm.org> wrote:
>
> On 4/8/22 15:50, Bob Pearson wrote:
> > Actually it doesn't hang forever but I get the following
> >
> > ......
> > [  107.579576] sd 4:0:0:0: [sdb] Synchronizing SCSI cache
> >
> > [  291.970133] sd 4:0:0:0: [sdb] Synchronize Cache(10) failed: Result: hostbyte=DID_TIME_OUT driverbyte=DRIVER_OK
> >
> > [  292.247547] rdma_rxe: unloaded
> >
> > So it waits for about 3 minutes for something and then gives up.
>
> (+Christoph)
>
> Hi Bob,
>
> I can reproduce this behavior with the Soft-iWARP driver by running the
> following command:
>
> cd blktests && use_siw=1 ./check -q srp/001
>
> Christoph, the call stack involved in this issue is as follows:
>
> __schedule+0x4c3/0xd20
> schedule+0x82/0x110
> schedule_timeout+0x122/0x200
> io_schedule_timeout+0x7b/0xc0
> __wait_for_common+0x2bc/0x380
> wait_for_completion_io_timeout+0x1d/0x20
> blk_execute_rq+0x1db/0x200
> __scsi_execute+0x1fb/0x310
> sd_sync_cache+0x155/0x2c0 [sd_mod]
> sd_shutdown+0xbb/0x190 [sd_mod]
> sd_remove+0x5b/0x80 [sd_mod]
> device_remove+0x9a/0xb0
> device_release_driver_internal+0x2c5/0x360
> device_release_driver+0x12/0x20
> bus_remove_device+0x1aa/0x270
> device_del+0x2d4/0x640
> __scsi_remove_device+0x168/0x1a0
> scsi_forget_host+0xa8/0xb0
> scsi_remove_host+0x9b/0x150
> sdebug_driver_remove+0x3d/0x140 [scsi_debug]
> device_remove+0x6f/0xb0
> device_release_driver_internal+0x2c5/0x360
> device_release_driver+0x12/0x20
> bus_remove_device+0x1aa/0x270
> device_del+0x2d4/0x640
> device_unregister+0x18/0x70
> sdebug_do_remove_host+0x138/0x180 [scsi_debug]
> scsi_debug_exit+0x45/0xd5 [scsi_debug]
> __do_sys_delete_module.constprop.0+0x210/0x320
> __x64_sys_delete_module+0x1f/0x30
> do_syscall_64+0x35/0x80
> entry_SYSCALL_64_after_hwframe+0x44/0xae
>
> One of the functions in the above call stack is sd_remove(). sd_remove()
> calls del_gendisk() just before calling sd_shutdown(). sd_shutdown()
> submits the SYNCHRONIZE CACHE command. In del_gendisk() I found the
> following comment: "Fail any new I/O". Do you agree that failing new I/O
> before sd_shutdown() is called is wrong? Is there any other way to fix
> this than moving the blk_queue_start_drain() etc. calls out of
> del_gendisk() and into a new function?
>
> Thanks,
>
> Bart.
>

I reported/bisected for this issue last week, not sure if it helped.
https://lore.kernel.org/linux-block/CAHj4cs9OTm9sb_5fmzgz+W9OSLeVPKix3Yri856kqQVccwd_Mw@mail.gmail.com/T/#t
-- 
Best Regards,
  Yi Zhang


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: blktest failures
  2022-04-09  0:31     ` Yi Zhang
@ 2022-04-09  4:33       ` Bart Van Assche
  0 siblings, 0 replies; 15+ messages in thread
From: Bart Van Assche @ 2022-04-09  4:33 UTC (permalink / raw)
  To: Yi Zhang
  Cc: Christoph Hellwig, Bob Pearson, Martin K. Petersen,
	Jason Gunthorpe, linux-rdma

On 4/8/22 17:31, Yi Zhang wrote:
> I reported/bisected for this issue last week, not sure if it helped.
> https://lore.kernel.org/linux-block/CAHj4cs9OTm9sb_5fmzgz+W9OSLeVPKix3Yri856kqQVccwd_Mw@mail.gmail.com/T/#t

That definitely helps :-) Since reverting the scsi_debug patch that has 
been identified as the root cause unbreaks blktests, I will post a 
revert of that patch.

Thanks,

Bart.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: blktest failures
  2022-04-08 23:25   ` Bart Van Assche
  2022-04-09  0:31     ` Yi Zhang
@ 2022-04-09  5:04     ` Christoph Hellwig
  2022-04-09 21:43       ` Bob Pearson
  1 sibling, 1 reply; 15+ messages in thread
From: Christoph Hellwig @ 2022-04-09  5:04 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Christoph Hellwig, Bob Pearson, Martin K. Petersen,
	Jason Gunthorpe, linux-rdma, Yi Zhang

On Fri, Apr 08, 2022 at 04:25:12PM -0700, Bart Van Assche wrote:
> One of the functions in the above call stack is sd_remove(). sd_remove() 
> calls del_gendisk() just before calling sd_shutdown(). sd_shutdown() 
> submits the SYNCHRONIZE CACHE command. In del_gendisk() I found the 
> following comment: "Fail any new I/O". Do you agree that failing new I/O 
> before sd_shutdown() is called is wrong? Is there any other way to fix this 
> than moving the blk_queue_start_drain() etc. calls out of del_gendisk() and 
> into a new function?

That SYNCHRONIZE CACHE is a passthrough command sent on the request_queue
and should not be affected by stopping all file system I/O.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: blktest failures
  2022-04-09  5:04     ` Christoph Hellwig
@ 2022-04-09 21:43       ` Bob Pearson
  2022-04-09 21:47         ` Bart Van Assche
  2022-04-15  7:12         ` Yanjun Zhu
  0 siblings, 2 replies; 15+ messages in thread
From: Bob Pearson @ 2022-04-09 21:43 UTC (permalink / raw)
  To: Christoph Hellwig, Bart Van Assche
  Cc: Martin K. Petersen, Jason Gunthorpe, linux-rdma, Yi Zhang

On 4/9/22 00:04, Christoph Hellwig wrote:
> On Fri, Apr 08, 2022 at 04:25:12PM -0700, Bart Van Assche wrote:
>> One of the functions in the above call stack is sd_remove(). sd_remove() 
>> calls del_gendisk() just before calling sd_shutdown(). sd_shutdown() 
>> submits the SYNCHRONIZE CACHE command. In del_gendisk() I found the 
>> following comment: "Fail any new I/O". Do you agree that failing new I/O 
>> before sd_shutdown() is called is wrong? Is there any other way to fix this 
>> than moving the blk_queue_start_drain() etc. calls out of del_gendisk() and 
>> into a new function?
> 
> That SYNCHRONIZE CACHE is a passthrough command sent on the request_queue
> and should not be affected by stopping all file system I/O.

When I run check -q srp
all the test cases pass but each one stops for 3+ minutes at synchronize cache.
The rxe device is still active until sync cache returns when the last QP and the PD
are destroyed. It may be that the queues are blocked waiting for something else
even though they have reported success??

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: blktest failures
  2022-04-09 21:43       ` Bob Pearson
@ 2022-04-09 21:47         ` Bart Van Assche
  2022-04-15  7:12         ` Yanjun Zhu
  1 sibling, 0 replies; 15+ messages in thread
From: Bart Van Assche @ 2022-04-09 21:47 UTC (permalink / raw)
  To: Bob Pearson, Christoph Hellwig
  Cc: Martin K. Petersen, Jason Gunthorpe, linux-rdma, Yi Zhang

On 4/9/22 14:43, Bob Pearson wrote:
> On 4/9/22 00:04, Christoph Hellwig wrote:
>> On Fri, Apr 08, 2022 at 04:25:12PM -0700, Bart Van Assche wrote:
>>> One of the functions in the above call stack is sd_remove(). sd_remove()
>>> calls del_gendisk() just before calling sd_shutdown(). sd_shutdown()
>>> submits the SYNCHRONIZE CACHE command. In del_gendisk() I found the
>>> following comment: "Fail any new I/O". Do you agree that failing new I/O
>>> before sd_shutdown() is called is wrong? Is there any other way to fix this
>>> than moving the blk_queue_start_drain() etc. calls out of del_gendisk() and
>>> into a new function?
>>
>> That SYNCHRONIZE CACHE is a passthrough command sent on the request_queue
>> and should not be affected by stopping all file system I/O.
> 
> When I run check -q srp
> all the test cases pass but each one stops for 3+ minutes at synchronize cache.
> The rxe device is still active until sync cache returns when the last QP and the PD
> are destroyed. It may be that the queues are blocked waiting for something else
> even though they have reported success??

Hi Bob,

After having taken a closer look at del_gendisk(), I agree with what 
Christoph wrote above. Please revert patch "scsi: scsi_debug: Address 
races following module load" locally when running blktests. See also 
https://lore.kernel.org/linux-scsi/5fb68dbd-ae0e-6230-8f9f-dd6df5593584@interlog.com/T/#m47a23ffd5ce68b8183100444d6e711b6b4aba393.

Thanks,

Bart.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: blktest failures
  2022-04-09 21:43       ` Bob Pearson
  2022-04-09 21:47         ` Bart Van Assche
@ 2022-04-15  7:12         ` Yanjun Zhu
  2022-04-15  7:26           ` Bob Pearson
  2022-04-15  7:29           ` Bob Pearson
  1 sibling, 2 replies; 15+ messages in thread
From: Yanjun Zhu @ 2022-04-15  7:12 UTC (permalink / raw)
  To: Bob Pearson, Christoph Hellwig, Bart Van Assche
  Cc: Martin K. Petersen, Jason Gunthorpe, linux-rdma, Yi Zhang

在 2022/4/10 5:43, Bob Pearson 写道:
> On 4/9/22 00:04, Christoph Hellwig wrote:
>> On Fri, Apr 08, 2022 at 04:25:12PM -0700, Bart Van Assche wrote:
>>> One of the functions in the above call stack is sd_remove(). sd_remove()
>>> calls del_gendisk() just before calling sd_shutdown(). sd_shutdown()
>>> submits the SYNCHRONIZE CACHE command. In del_gendisk() I found the
>>> following comment: "Fail any new I/O". Do you agree that failing new I/O
>>> before sd_shutdown() is called is wrong? Is there any other way to fix this
>>> than moving the blk_queue_start_drain() etc. calls out of del_gendisk() and
>>> into a new function?
>>
>> That SYNCHRONIZE CACHE is a passthrough command sent on the request_queue
>> and should not be affected by stopping all file system I/O.
> 
> When I run check -q srp
> all the test cases pass but each one stops for 3+ minutes at synchronize cache.
> The rxe device is still active until sync cache returns when the last QP and the PD
> are destroyed. It may be that the queues are blocked waiting for something else
> even though they have reported success??

If you remove all the xarray patches and use the original source code. 
This will not occur.

Zhu Yanjun


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: blktest failures
  2022-04-15  7:12         ` Yanjun Zhu
@ 2022-04-15  7:26           ` Bob Pearson
  2022-04-15  7:29           ` Bob Pearson
  1 sibling, 0 replies; 15+ messages in thread
From: Bob Pearson @ 2022-04-15  7:26 UTC (permalink / raw)
  To: Yanjun Zhu, Christoph Hellwig, Bart Van Assche
  Cc: Martin K. Petersen, Jason Gunthorpe, linux-rdma, Yi Zhang

On 4/15/22 02:12, Yanjun Zhu wrote:
> 在 2022/4/10 5:43, Bob Pearson 写道:
>> On 4/9/22 00:04, Christoph Hellwig wrote:
>>> On Fri, Apr 08, 2022 at 04:25:12PM -0700, Bart Van Assche wrote:
>>>> One of the functions in the above call stack is sd_remove(). sd_remove()
>>>> calls del_gendisk() just before calling sd_shutdown(). sd_shutdown()
>>>> submits the SYNCHRONIZE CACHE command. In del_gendisk() I found the
>>>> following comment: "Fail any new I/O". Do you agree that failing new I/O
>>>> before sd_shutdown() is called is wrong? Is there any other way to fix this
>>>> than moving the blk_queue_start_drain() etc. calls out of del_gendisk() and
>>>> into a new function?
>>>
>>> That SYNCHRONIZE CACHE is a passthrough command sent on the request_queue
>>> and should not be affected by stopping all file system I/O.
>>
>> When I run check -q srp
>> all the test cases pass but each one stops for 3+ minutes at synchronize cache.
>> The rxe device is still active until sync cache returns when the last QP and the PD
>> are destroyed. It may be that the queues are blocked waiting for something else
>> even though they have reported success??
> 
> If you remove all the xarray patches and use the original source code. This will not occur.
> 
> Zhu Yanjun
> 

I know. I am trying to find out why. For performance reasons I very much want to
make the xarray + rcu_locking patches work correctly. All the spinlock issues are
now fixed in my tree. What is left is a race in the RDMA read retry code somewhere.
I'll find it. In the process of chasing this I have found several real bugs and
I suspect a few more are out there.

Bob

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: blktest failures
  2022-04-15  7:12         ` Yanjun Zhu
  2022-04-15  7:26           ` Bob Pearson
@ 2022-04-15  7:29           ` Bob Pearson
  2022-04-15  7:37             ` Yanjun Zhu
  1 sibling, 1 reply; 15+ messages in thread
From: Bob Pearson @ 2022-04-15  7:29 UTC (permalink / raw)
  To: Yanjun Zhu, Christoph Hellwig, Bart Van Assche
  Cc: Martin K. Petersen, Jason Gunthorpe, linux-rdma, Yi Zhang

On 4/15/22 02:12, Yanjun Zhu wrote:
> 在 2022/4/10 5:43, Bob Pearson 写道:
>> On 4/9/22 00:04, Christoph Hellwig wrote:
>>> On Fri, Apr 08, 2022 at 04:25:12PM -0700, Bart Van Assche wrote:
>>>> One of the functions in the above call stack is sd_remove(). sd_remove()
>>>> calls del_gendisk() just before calling sd_shutdown(). sd_shutdown()
>>>> submits the SYNCHRONIZE CACHE command. In del_gendisk() I found the
>>>> following comment: "Fail any new I/O". Do you agree that failing new I/O
>>>> before sd_shutdown() is called is wrong? Is there any other way to fix this
>>>> than moving the blk_queue_start_drain() etc. calls out of del_gendisk() and
>>>> into a new function?
>>>
>>> That SYNCHRONIZE CACHE is a passthrough command sent on the request_queue
>>> and should not be affected by stopping all file system I/O.
>>
>> When I run check -q srp
>> all the test cases pass but each one stops for 3+ minutes at synchronize cache.
>> The rxe device is still active until sync cache returns when the last QP and the PD
>> are destroyed. It may be that the queues are blocked waiting for something else
>> even though they have reported success??
> 
> If you remove all the xarray patches and use the original source code. This will not occur.
> 
> Zhu Yanjun
> 

I missed one other point. The 3 minute delay is actually not a rxe bug at all but was recently
caused by a bad scsi patch which has since been reverted.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: blktest failures
  2022-04-15  7:29           ` Bob Pearson
@ 2022-04-15  7:37             ` Yanjun Zhu
  2022-04-15  7:46               ` Bob Pearson
  0 siblings, 1 reply; 15+ messages in thread
From: Yanjun Zhu @ 2022-04-15  7:37 UTC (permalink / raw)
  To: Bob Pearson, Christoph Hellwig, Bart Van Assche
  Cc: Martin K. Petersen, Jason Gunthorpe, linux-rdma, Yi Zhang


在 2022/4/15 15:29, Bob Pearson 写道:
> On 4/15/22 02:12, Yanjun Zhu wrote:
>> 在 2022/4/10 5:43, Bob Pearson 写道:
>>> On 4/9/22 00:04, Christoph Hellwig wrote:
>>>> On Fri, Apr 08, 2022 at 04:25:12PM -0700, Bart Van Assche wrote:
>>>>> One of the functions in the above call stack is sd_remove(). sd_remove()
>>>>> calls del_gendisk() just before calling sd_shutdown(). sd_shutdown()
>>>>> submits the SYNCHRONIZE CACHE command. In del_gendisk() I found the
>>>>> following comment: "Fail any new I/O". Do you agree that failing new I/O
>>>>> before sd_shutdown() is called is wrong? Is there any other way to fix this
>>>>> than moving the blk_queue_start_drain() etc. calls out of del_gendisk() and
>>>>> into a new function?
>>>> That SYNCHRONIZE CACHE is a passthrough command sent on the request_queue
>>>> and should not be affected by stopping all file system I/O.
>>> When I run check -q srp
>>> all the test cases pass but each one stops for 3+ minutes at synchronize cache.
>>> The rxe device is still active until sync cache returns when the last QP and the PD
>>> are destroyed. It may be that the queues are blocked waiting for something else
>>> even though they have reported success??
>> If you remove all the xarray patches and use the original source code. This will not occur.
>>
>> Zhu Yanjun
>>
> I missed one other point. The 3 minute delay is actually not a rxe bug at all but was recently
> caused by a bad scsi patch which has since been reverted.

I am not sure about this because wr NULL problem exists with xarray patches.

Please let us find the root cause of wr NULL.

This can make RXE more stable.

Zhu Yanjun


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: blktest failures
  2022-04-15  7:37             ` Yanjun Zhu
@ 2022-04-15  7:46               ` Bob Pearson
  2022-04-15  7:59                 ` Yanjun Zhu
  0 siblings, 1 reply; 15+ messages in thread
From: Bob Pearson @ 2022-04-15  7:46 UTC (permalink / raw)
  To: Yanjun Zhu, Christoph Hellwig, Bart Van Assche
  Cc: Martin K. Petersen, Jason Gunthorpe, linux-rdma, Yi Zhang

On 4/15/22 02:37, Yanjun Zhu wrote:
> 
> 在 2022/4/15 15:29, Bob Pearson 写道:
>> On 4/15/22 02:12, Yanjun Zhu wrote:
>>> 在 2022/4/10 5:43, Bob Pearson 写道:
>>>> On 4/9/22 00:04, Christoph Hellwig wrote:
>>>>> On Fri, Apr 08, 2022 at 04:25:12PM -0700, Bart Van Assche wrote:
>>>>>> One of the functions in the above call stack is sd_remove(). sd_remove()
>>>>>> calls del_gendisk() just before calling sd_shutdown(). sd_shutdown()
>>>>>> submits the SYNCHRONIZE CACHE command. In del_gendisk() I found the
>>>>>> following comment: "Fail any new I/O". Do you agree that failing new I/O
>>>>>> before sd_shutdown() is called is wrong? Is there any other way to fix this
>>>>>> than moving the blk_queue_start_drain() etc. calls out of del_gendisk() and
>>>>>> into a new function?
>>>>> That SYNCHRONIZE CACHE is a passthrough command sent on the request_queue
>>>>> and should not be affected by stopping all file system I/O.
>>>> When I run check -q srp
>>>> all the test cases pass but each one stops for 3+ minutes at synchronize cache.
>>>> The rxe device is still active until sync cache returns when the last QP and the PD
>>>> are destroyed. It may be that the queues are blocked waiting for something else
>>>> even though they have reported success??
>>> If you remove all the xarray patches and use the original source code. This will not occur.
>>>
>>> Zhu Yanjun
>>>
>> I missed one other point. The 3 minute delay is actually not a rxe bug at all but was recently
>> caused by a bad scsi patch which has since been reverted.
> 
> I am not sure about this because wr NULL problem exists with xarray patches.
> 
> Please let us find the root cause of wr NULL.
> 
> This can make RXE more stable.
> 
> Zhu Yanjun
> 

You mean mr = NULL. And it is not happening in my tree. I have WARN_ONs looking for it
and it isn't happening.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: blktest failures
  2022-04-15  7:46               ` Bob Pearson
@ 2022-04-15  7:59                 ` Yanjun Zhu
  2022-04-15 15:44                   ` Bob Pearson
  0 siblings, 1 reply; 15+ messages in thread
From: Yanjun Zhu @ 2022-04-15  7:59 UTC (permalink / raw)
  To: Bob Pearson, Yanjun Zhu, Christoph Hellwig, Bart Van Assche
  Cc: Martin K. Petersen, Jason Gunthorpe, linux-rdma, Yi Zhang

在 2022/4/15 15:46, Bob Pearson 写道:
> On 4/15/22 02:37, Yanjun Zhu wrote:
>>
>> 在 2022/4/15 15:29, Bob Pearson 写道:
>>> On 4/15/22 02:12, Yanjun Zhu wrote:
>>>> 在 2022/4/10 5:43, Bob Pearson 写道:
>>>>> On 4/9/22 00:04, Christoph Hellwig wrote:
>>>>>> On Fri, Apr 08, 2022 at 04:25:12PM -0700, Bart Van Assche wrote:
>>>>>>> One of the functions in the above call stack is sd_remove(). sd_remove()
>>>>>>> calls del_gendisk() just before calling sd_shutdown(). sd_shutdown()
>>>>>>> submits the SYNCHRONIZE CACHE command. In del_gendisk() I found the
>>>>>>> following comment: "Fail any new I/O". Do you agree that failing new I/O
>>>>>>> before sd_shutdown() is called is wrong? Is there any other way to fix this
>>>>>>> than moving the blk_queue_start_drain() etc. calls out of del_gendisk() and
>>>>>>> into a new function?
>>>>>> That SYNCHRONIZE CACHE is a passthrough command sent on the request_queue
>>>>>> and should not be affected by stopping all file system I/O.
>>>>> When I run check -q srp
>>>>> all the test cases pass but each one stops for 3+ minutes at synchronize cache.
>>>>> The rxe device is still active until sync cache returns when the last QP and the PD
>>>>> are destroyed. It may be that the queues are blocked waiting for something else
>>>>> even though they have reported success??
>>>> If you remove all the xarray patches and use the original source code. This will not occur.
>>>>
>>>> Zhu Yanjun
>>>>
>>> I missed one other point. The 3 minute delay is actually not a rxe bug at all but was recently
>>> caused by a bad scsi patch which has since been reverted.
>>
>> I am not sure about this because wr NULL problem exists with xarray patches.
>>
>> Please let us find the root cause of wr NULL.
>>
>> This can make RXE more stable.
>>
>> Zhu Yanjun
>>
> 
> You mean mr = NULL. And it is not happening in my tree. I have WARN_ONs looking for it

Why you said that you can not reproduce this problem now?

Please check your mail. I remember that you can reproduce this wr NULL 
in your host.

Please focus on this problem. This can make RXE more stable.
Let us find the root cause and fix this problem ASAP.

Thanks.
Zhu Yanjun

> and it isn't happening.


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: blktest failures
  2022-04-15  7:59                 ` Yanjun Zhu
@ 2022-04-15 15:44                   ` Bob Pearson
  0 siblings, 0 replies; 15+ messages in thread
From: Bob Pearson @ 2022-04-15 15:44 UTC (permalink / raw)
  To: Yanjun Zhu, Christoph Hellwig, Bart Van Assche
  Cc: Martin K. Petersen, Jason Gunthorpe, linux-rdma, Yi Zhang

On 4/15/22 02:59, Yanjun Zhu wrote:
> 在 2022/4/15 15:46, Bob Pearson 写道:
>> On 4/15/22 02:37, Yanjun Zhu wrote:
>>>
>>> 在 2022/4/15 15:29, Bob Pearson 写道:
>>>> On 4/15/22 02:12, Yanjun Zhu wrote:
>>>>> 在 2022/4/10 5:43, Bob Pearson 写道:
>>>>>> On 4/9/22 00:04, Christoph Hellwig wrote:
>>>>>>> On Fri, Apr 08, 2022 at 04:25:12PM -0700, Bart Van Assche wrote:
>>>>>>>> One of the functions in the above call stack is sd_remove(). sd_remove()
>>>>>>>> calls del_gendisk() just before calling sd_shutdown(). sd_shutdown()
>>>>>>>> submits the SYNCHRONIZE CACHE command. In del_gendisk() I found the
>>>>>>>> following comment: "Fail any new I/O". Do you agree that failing new I/O
>>>>>>>> before sd_shutdown() is called is wrong? Is there any other way to fix this
>>>>>>>> than moving the blk_queue_start_drain() etc. calls out of del_gendisk() and
>>>>>>>> into a new function?
>>>>>>> That SYNCHRONIZE CACHE is a passthrough command sent on the request_queue
>>>>>>> and should not be affected by stopping all file system I/O.
>>>>>> When I run check -q srp
>>>>>> all the test cases pass but each one stops for 3+ minutes at synchronize cache.
>>>>>> The rxe device is still active until sync cache returns when the last QP and the PD
>>>>>> are destroyed. It may be that the queues are blocked waiting for something else
>>>>>> even though they have reported success??
>>>>> If you remove all the xarray patches and use the original source code. This will not occur.
>>>>>
>>>>> Zhu Yanjun
>>>>>
>>>> I missed one other point. The 3 minute delay is actually not a rxe bug at all but was recently
>>>> caused by a bad scsi patch which has since been reverted.
>>>
>>> I am not sure about this because wr NULL problem exists with xarray patches.
>>>
>>> Please let us find the root cause of wr NULL.
>>>
>>> This can make RXE more stable.
>>>
>>> Zhu Yanjun
>>>
>>
>> You mean mr = NULL. And it is not happening in my tree. I have WARN_ONs looking for it
> 
> Why you said that you can not reproduce this problem now?
> 
> Please check your mail. I remember that you can reproduce this wr NULL in your host.

Yes I did see it before but not with the new pool patches. The whole point of the patch series I
have been working on for the past month was to clean up races in shutdown and cleanup code.
These were demonstrated by colleagues here at hpe who are running lustre over soft roce.
I looked at the shutdown code and saw that the whole design was flawed. We were using
kref's to track reference counts on MRs and QPs etc. but always returning to rdma-core
regardless of the reference count. rdma-core having no visibility to our references
then deletes the object fairly soon after that while we are still trying to process late
arriving packets. One of the features in my tree is to use wait_for_completion() in the
return path to rdma-core and pause until all the references have been dropped.

I really don't see the point of debugging the old code because it is just wrong wrong wrong.
It will never be stable under heavy load.

Bob
> 
> Please focus on this problem. This can make RXE more stable.
> Let us find the root cause and fix this problem ASAP.
> 
> Thanks.
> Zhu Yanjun
> 
>> and it isn't happening.
> 


^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2022-04-15 15:44 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-04-08 22:10 blktest failures Bob Pearson
2022-04-08 22:50 ` Bob Pearson
2022-04-08 23:25   ` Bart Van Assche
2022-04-09  0:31     ` Yi Zhang
2022-04-09  4:33       ` Bart Van Assche
2022-04-09  5:04     ` Christoph Hellwig
2022-04-09 21:43       ` Bob Pearson
2022-04-09 21:47         ` Bart Van Assche
2022-04-15  7:12         ` Yanjun Zhu
2022-04-15  7:26           ` Bob Pearson
2022-04-15  7:29           ` Bob Pearson
2022-04-15  7:37             ` Yanjun Zhu
2022-04-15  7:46               ` Bob Pearson
2022-04-15  7:59                 ` Yanjun Zhu
2022-04-15 15:44                   ` Bob Pearson

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.