From: Laurence Oberman <loberman@redhat.com>
To: Bart Van Assche <Bart.VanAssche@wdc.com>
Cc: "snitzer@redhat.com" <snitzer@redhat.com>,
	"dm-devel@redhat.com" <dm-devel@redhat.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"hch@infradead.org" <hch@infradead.org>,
	"linux-block@vger.kernel.org" <linux-block@vger.kernel.org>,
	"osandov@fb.com" <osandov@fb.com>,
	"axboe@kernel.dk" <axboe@kernel.dk>,
	"ming.lei@redhat.com" <ming.lei@redhat.com>
Subject: Re: [dm-devel] [RFC PATCH] blk-mq: fixup RESTART when queue becomes idle
Date: Thu, 18 Jan 2018 16:45:15 -0500
Message-ID: <CAFfF4qv+7gZY1dTdFnFkM-qu9Z3yUEV=Qe7fdUZuACzoxDyfTQ@mail.gmail.com>
In-Reply-To: <1516311554.2676.50.camel@wdc.com>

Hello Bart

First, let me say that you have always been kind, patient and helpful to
me, and I have tried to be the same to you, so I am not keen to get in the
middle of this.

But it's not true about Red Hat: I work very hard on this and I very often
find bugs you are not seeing, so Red Hat is adding value here.
I emailed you a number of times asking whether you could provide me the
exact steps to reproduce, without going through your srp-test suite.

I have a setup that is not conducive to running your loop disconnects etc.,
and if you are seeing a stall on multiple loops of 02-mq I should be able
to reproduce it without having to run your test suite.

Please let me know how I can help.

Laurence

On Thu, Jan 18, 2018 at 4:39 PM, Bart Van Assche <Bart.VanAssche@wdc.com>
wrote:

> On Thu, 2018-01-18 at 16:23 -0500, Mike Snitzer wrote:
> > On Thu, Jan 18 2018 at  3:58P -0500,
> > Bart Van Assche <Bart.VanAssche@wdc.com> wrote:
> >
> > > On Thu, 2018-01-18 at 15:48 -0500, Mike Snitzer wrote:
> > > > For Bart's test the underlying scsi-mq driver is what is regularly
> > > > hitting this case in __blk_mq_try_issue_directly():
> > > >
> > > >         if (blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(q))
> > >
> > > These lockups were all triggered by incorrect handling of
> > > .queue_rq() returning BLK_STS_RESOURCE.
> >
> > Please be precise, dm_mq_queue_rq()'s return of BLK_STS_RESOURCE?
> > "Incorrect" because it no longer runs blk_mq_delay_run_hw_queue()?
>
> In what I wrote I was referring to both dm_mq_queue_rq() and
> scsi_queue_rq(). With "incorrect" I meant that queue lockups are
> introduced that make user space processes unkillable. That's a severe bug.
>
> > Please try to do more work analyzing the test case that only you can
> > easily run (due to srp_test being a PITA).
>
> It is not correct that I'm the only one who is able to run that software.
> Anyone who is willing to merge the latest SRP initiator and target driver
> patches in his or her tree can run that software in any VM. I'm working
> hard on getting the patches upstream that make it possible to run the
> srp-test software on a setup that is not equipped with InfiniBand hardware.
>
> > We have time to get this right, please stop hyperventilating about
> > "regressions".
>
> Sorry Mike but that's something I consider as an unfair comment. If Ming
> and you work on patches together, it's your job to make sure that no
> regressions are introduced. Instead of blaming me because I report these
> regressions you should be grateful that I take the time and effort to
> report these regressions early. And since you are employed by a large
> organization that sells Linux support services, your employer should
> invest in developing test cases that reach a higher coverage of the dm,
> SCSI and block layer code. I don't think that it's normal that my tests
> discovered several issues that were not discovered by Red Hat's internal
> test suite. That's something Red Hat has to address.
>
> Bart.
