Hello Bart

Firstly, let me start with: you have always been kind, patient and helpful to me, and I the same to you, so I am not keen to get in the middle of this.

But it's not true about Red Hat, because I work very hard on this and I very often find bugs you are not seeing, so Red Hat is adding value here.

I emailed you a number of times asking if you can provide me the exact steps, but not via your srp-test suite.
I have a setup that is not conducive to running your loop disconnects etc., and if you are seeing a stall on multiple loops of 02-mq I should be able to reproduce it without having to run your test suite.

Please let me know how I can help.

Laurence

On Thu, Jan 18, 2018 at 4:39 PM, Bart Van Assche wrote:

> On Thu, 2018-01-18 at 16:23 -0500, Mike Snitzer wrote:
> > On Thu, Jan 18 2018 at 3:58P -0500,
> > Bart Van Assche wrote:
> >
> > > On Thu, 2018-01-18 at 15:48 -0500, Mike Snitzer wrote:
> > > > For Bart's test the underlying scsi-mq driver is what is regularly
> > > > hitting this case in __blk_mq_try_issue_directly():
> > > >
> > > > if (blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(q))
> > >
> > > These lockups were all triggered by incorrect handling of
> > > .queue_rq() returning BLK_STS_RESOURCE.
> >
> > Please be precise, dm_mq_queue_rq()'s return of BLK_STS_RESOURCE?
> > "Incorrect" because it no longer runs blk_mq_delay_run_hw_queue()?
>
> In what I wrote I was referring to both dm_mq_queue_rq() and
> scsi_queue_rq().
> With "incorrect" I meant that queue lockups are introduced that make user
> space processes unkillable. That's a severe bug.
>
> > Please try to do more work analyzing the test case that only you can
> > easily run (due to srp_test being a PITA).
>
> It is not correct that I'm the only one who is able to run that software.
> Anyone who is willing to merge the latest SRP initiator and target driver
> patches in his or her tree can run that software in any VM.
> I'm working hard on getting the patches upstream that make it possible
> to run the srp-test software on a setup that is not equipped with
> InfiniBand hardware.
>
> > We have time to get this right, please stop hyperventilating about
> > "regressions".
>
> Sorry Mike but that's something I consider as an unfair comment. If Ming
> and you work on patches together, it's your job to make sure that no
> regressions are introduced. Instead of blaming me because I report these
> regressions you should be grateful that I take the time and effort to
> report these regressions early. And since you are employed by a large
> organization that sells Linux support services, your employer should
> invest in developing test cases that reach a higher coverage of the dm,
> SCSI and block layer code. I don't think that it's normal that my tests
> discovered several issues that were not discovered by Red Hat's internal
> test suite. That's something Red Hat has to address.
>
> Bart.