From: Ming Lei <ming.lei@canonical.com>
To: Mike Snitzer <snitzer@redhat.com>
Cc: Kent Overstreet <kent.overstreet@gmail.com>,
	"oleg.drokin@intel.com" <oleg.drokin@intel.com>,
	Ming Lin-SSI <ming.l@ssi.samsung.com>,
	"andreas.dilger@intel.com" <andreas.dilger@intel.com>,
	"martin.petersen@oracle.com" <martin.petersen@oracle.com>,
	"minchan@kernel.org" <minchan@kernel.org>,
	"jkosina@suse.cz" <jkosina@suse.cz>,
	kernel list <linux-kernel@vger.kernel.org>,
	"jim@jtan.com" <jim@jtan.com>,
	"pjk1939@linux.vnet.ibm.com" <pjk1939@linux.vnet.ibm.com>,
	"axboe@fb.com" <axboe@fb.com>,
	"geoff@infradead.org" <geoff@infradead.org>,
	"dm-devel@redhat.com" <dm-devel@redhat.com>,
	"dpark@posteo.net" <dpark@posteo.net>,
	Pavel Machek <pavel@ucw.cz>,
	"ngupta@vflare.org" <ngupta@vflare.org>,
	"hch@lst.de" <hch@lst.de>, "agk@redhat.com" <agk@redhat.com>
Subject: Re: 4.4-final: 28 bioset threads on small notebook
Date: Wed, 24 Feb 2016 10:48:10 +0800
Message-ID: <CACVXFVN+xV5_4zktCGpQUrnOtJ1EVOLPt5v4niuj9VUGL8uUEA@mail.gmail.com>
In-Reply-To: <20160223145442.GB8047@redhat.com>

On Tue, Feb 23, 2016 at 10:54 PM, Mike Snitzer <snitzer@redhat.com> wrote:
> On Mon, Feb 22 2016 at  9:55pm -0500,
> Ming Lei <ming.lei@canonical.com> wrote:
>
>> On Tue, Feb 23, 2016 at 6:58 AM, Kent Overstreet
>> <kent.overstreet@gmail.com> wrote:
>> > On Sun, Feb 21, 2016 at 05:40:59PM +0800, Ming Lei wrote:
>> >> On Sun, Feb 21, 2016 at 2:43 PM, Ming Lin-SSI <ming.l@ssi.samsung.com> wrote:
>> >> >>-----Original Message-----
>> >> >
>> >> > So it's almost already "per request_queue"
>> >>
>> >> Yes, that is because of the following line:
>> >>
>> >> q->bio_split = bioset_create(BIO_POOL_SIZE, 0);
>> >>
>> >> in blk_alloc_queue_node().
>> >>
>> >> Looks like this bio_set doesn't need to be per-request_queue, and
>> >> now it is only used for fast-cloning bio for splitting, and one global
>> >> split bio_set should be enough.
>> >
>> > It does have to be per request queue for stacking block devices (which includes
>> > loopback).
>>
>> In commit df2cb6daa4(block: Avoid deadlocks with bio allocation by
>> stacking drivers), deadlock in this situation has been avoided already.
>> Or are there other issues with global bio_set? I appreciate if you may
>> explain it a bit if there are.
>
> Even with commit df2cb6daa4 there is still risk of deadlocks (even
> without low memory condition), see:
> https://patchwork.kernel.org/patch/7398411/

That is definitely another problem, unrelated to low memory, and I
guess Kent means there might be a deadlock risk in the case of a shared
bio_set.

>
> (you may recall you blocked this patch with concerns about performance,
> context switches, plug merging being compromised, etc.. to which I never
> circled back to verify your concerns)

I still remember that problem:

1) Process A:
     - two bios (a, b) are split in dm's make_request function
     - bio(a) is submitted via generic_make_request(), so it is staged
       in current->bio_list
     - time t1
     - before bio(b) is submitted, down_write(&s->lock) is called and
       never returns

2) Process B:
     - during time t1, takes s->lock with down_write(&s->lock) and
       waits for completion of bio(a)

Process A then waits for the lock, which B acquired first, and neither
bio(a) nor bio(b) can reach the driver/device at all.

It looks like current->bio_list is fragile with respect to locks taken
in the make_request function, and moving the lock into workqueue context
should help.
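
To make the staging step concrete, here is a minimal userspace sketch of
how a bio submitted from inside a make_request function is only queued
on current->bio_list and is not dispatched until that function returns.
It is a toy model, not kernel code: every toy_* name is hypothetical, no
real lock is taken, and the point where Process A would block in
down_write(&s->lock) is only marked by recording the state of bio(a).

```c
#include <assert.h>
#include <stddef.h>

/* Toy userspace model of bio staging; every name here is hypothetical. */
struct toy_bio {
    void (*make_request)(struct toy_bio *bio); /* handler of the target device */
    int dispatched;                            /* 1 once the "driver" saw it */
    struct toy_bio *next;
};

struct toy_bio_list {
    struct toy_bio *head, *tail;
};

/* Stands in for current->bio_list: non-NULL while a make_request runs. */
static struct toy_bio_list *toy_current_bio_list;

/* State of bio(a) at the point where process A would call down_write(). */
static int toy_a_dispatched_at_lock = -1;

static struct toy_bio toy_a, toy_b;

static void toy_generic_make_request(struct toy_bio *bio)
{
    struct toy_bio_list on_stack = { NULL, NULL };

    if (toy_current_bio_list) {
        /* Nested submission: stage the bio and return immediately. */
        bio->next = NULL;
        if (toy_current_bio_list->tail)
            toy_current_bio_list->tail->next = bio;
        else
            toy_current_bio_list->head = bio;
        toy_current_bio_list->tail = bio;
        return;
    }

    toy_current_bio_list = &on_stack;
    while (bio) {
        bio->make_request(bio);
        /* Pop the next staged bio, if any. */
        bio = on_stack.head;
        if (bio) {
            on_stack.head = bio->next;
            if (!on_stack.head)
                on_stack.tail = NULL;
        }
    }
    toy_current_bio_list = NULL;
}

/* Lowest level: the bio finally reaches the device. */
static void toy_driver_make_request(struct toy_bio *bio)
{
    bio->dispatched = 1;
}

/* Stacking driver: splits the incoming bio into a and b. */
static void toy_dm_make_request(struct toy_bio *bio)
{
    (void)bio;
    toy_a.make_request = toy_driver_make_request;
    toy_b.make_request = toy_driver_make_request;

    toy_generic_make_request(&toy_a);   /* only staged, not dispatched */

    /* Process A would block in down_write(&s->lock) right here. */
    toy_a_dispatched_at_lock = toy_a.dispatched;

    toy_generic_make_request(&toy_b);
}

/* Returns bio(a)'s dispatched flag as seen at the lock point. */
static int toy_run(void)
{
    struct toy_bio parent = { toy_dm_make_request, 0, NULL };

    toy_a.dispatched = toy_b.dispatched = 0;
    toy_generic_make_request(&parent);
    /* Once the make_request function returns, both bios are drained. */
    assert(toy_a.dispatched == 1 && toy_b.dispatched == 1);
    return toy_a_dispatched_at_lock;
}
```

Calling toy_run() from a main() returns 0: at the lock point bio(a) has
not reached the driver yet, so anyone holding the lock while waiting for
bio(a) would wait forever if the make_request function blocked there.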

I am happy to discuss this issue further.

>
> But it illustrates the type of problems that can occur when your rescue
> infrastructure is shared across devices (in the context of df2cb6daa4,
> current->bio_list contains bios from multiple devices).
>
> If a single splitting bio_set were shared across devices there would be
> no guarantee of forward progress with complex stacked devices (one or
> more devices could exhaust the reserve and starve out other devices in
> the stack).  So keeping the bio_set per request_queue isn't prone to
> failure like a shared bio_set might be.

Leaving the dm lock problem aside, from the log and the patch of Kent's
commit (df2cb6daa4) it looks like forward progress can be guaranteed for
stacked devices sharing the same bio_set, but it would be better to get
Kent's clarification.

If forward progress can be guaranteed, a percpu mempool might make
exhaustion less likely, because it is reasonable to assume that one CPU
can only provide a certain amount of bandwidth wrt. block transfer.
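
The exhaustion concern can be illustrated with a small userspace sketch
that compares one reserve shared by two owners against one reserve per
owner (per request_queue, or per CPU as floated here). This is only a
toy model under stated assumptions: toy_pool and the other toy_* names
are hypothetical and are not the kernel mempool API, the reserve size of
2 is arbitrary, and a return value of 0 stands for the point where a
real mempool caller would have to wait.

```c
#include <assert.h>

#define TOY_RESERVE 2   /* arbitrary reserve size for the illustration */

/* Toy userspace model; not the kernel mempool API. */
struct toy_pool {
    int reserve;        /* elements still available in the reserve */
};

/* Returns 1 on success, 0 where a real mempool caller would have to wait. */
static int toy_pool_alloc(struct toy_pool *p)
{
    if (p->reserve == 0)
        return 0;
    p->reserve--;
    return 1;
}

/* Owner 0 drains its pool, then owner 1 tries a single allocation. */
static int toy_owner1_can_alloc(struct toy_pool *pool0, struct toy_pool *pool1)
{
    while (toy_pool_alloc(pool0))
        ;
    return toy_pool_alloc(pool1);
}

static void toy_demo(void)
{
    struct toy_pool shared = { TOY_RESERVE };
    struct toy_pool p0 = { TOY_RESERVE }, p1 = { TOY_RESERVE };

    /* Shared reserve: owner 0 starves owner 1. */
    assert(toy_owner1_can_alloc(&shared, &shared) == 0);
    /* Separate reserves: owner 1 is unaffected by owner 0. */
    assert(toy_owner1_can_alloc(&p0, &p1) == 1);
}
```

The sketch says nothing about whether forward progress actually holds
for a shared bio_set in the kernel; it only shows why splitting the
reserve bounds how much one owner can take from another.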

Thanks
Ming
