Date: Tue, 23 Feb 2016 09:54:42 -0500
From: Mike Snitzer
To: Ming Lei
Cc: Kent Overstreet, "oleg.drokin@intel.com", Ming Lin-SSI,
	"andreas.dilger@intel.com", "martin.petersen@oracle.com",
	"minchan@kernel.org", "jkosina@suse.cz", kernel list, "jim@jtan.com",
	"pjk1939@linux.vnet.ibm.com", "axboe@fb.com", "geoff@infradead.org",
	"dm-devel@redhat.com", "dpark@posteo.net", Pavel Machek,
	"ngupta@vflare.org", "hch@lst.de", "agk@redhat.com"
Subject: Re: 4.4-final: 28 bioset threads on small notebook
Message-ID: <20160223145442.GB8047@redhat.com>
References: <20160220184258.GA3753@amd>
 <20160220195136.GA27149@redhat.com>
 <20160220200432.GB22120@amd>
 <20160220203856.GB27149@redhat.com>
 <20160220205519.GA14108@amd>
 <20160221041540.GA24735@kmo-pixel>
 <3A47B4705F6BE24CBB43C61AA73286211B3AA6A9@SSIEXCH-MB3.ssi.samsung.com>
 <20160222225818.GA2675@kmo-pixel>

On Mon, Feb 22 2016 at 9:55pm -0500,
Ming Lei wrote:

> On Tue, Feb 23, 2016 at 6:58 AM, Kent Overstreet wrote:
> > On Sun, Feb 21, 2016 at 05:40:59PM +0800, Ming Lei wrote:
> >> On Sun, Feb 21, 2016 at 2:43 PM, Ming Lin-SSI wrote:
> >> >> -----Original Message-----
> >> >
> >> > So it's almost already "per request_queue"
> >>
> >> Yes, that is because of the following line:
> >>
> >>         q->bio_split = bioset_create(BIO_POOL_SIZE, 0);
> >>
> >> in blk_alloc_queue_node().
> >>
> >> It looks like this bio_set doesn't need to be per-request_queue: it
> >> is now only used for fast-cloning bios for splitting, so one global
> >> split bio_set should be enough.
> >
> > It does have to be per request queue for stacking block devices
> > (which includes loopback).
>
> Commit df2cb6daa4 ("block: Avoid deadlocks with bio allocation by
> stacking drivers") already avoids the deadlock in this situation.
> Or are there other issues with a global bio_set?  If there are, I
> would appreciate it if you could explain them a bit.

Even with commit df2cb6daa4 there is still a risk of deadlocks (even
without a low-memory condition), see:

https://patchwork.kernel.org/patch/7398411/

(You may recall that you blocked this patch with concerns about
performance, context switches, plug merging being compromised, etc.;
I never circled back to verify those concerns.)

But it illustrates the kind of problems that can occur when your rescue
infrastructure is shared across devices (in the context of df2cb6daa4,
current->bio_list contains bios from multiple devices).

If a single splitting bio_set were shared across devices, there would
be no guarantee of forward progress with complex stacked devices: one
or more devices could exhaust the reserve and starve out the other
devices in the stack.  So keeping the bio_set per-request_queue avoids
a failure mode that a shared bio_set would be prone to.

Mike
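
P.S. For anyone following along, here is a minimal sketch of the
per-queue split bio_set under discussion, modeled on the 4.4-era
blk_alloc_queue_node() and blk_queue_split() code paths.  The
*_sketch function names are mine, and the real code allocates the
queue from a kmem_cache with far more setup; only the bio_set
handling is shown.

#include <linux/bio.h>
#include <linux/blkdev.h>
#include <linux/slab.h>

struct request_queue *alloc_queue_sketch(gfp_t gfp_mask)
{
        struct request_queue *q;

        /* sketch only; real code allocates from a kmem_cache */
        q = kzalloc(sizeof(*q), gfp_mask);
        if (!q)
                return NULL;

        /*
         * Each queue gets its own bio_set, and therefore its own
         * mempool reserve, for splitting -- this is the line quoted
         * above from blk_alloc_queue_node().
         */
        q->bio_split = bioset_create(BIO_POOL_SIZE, 0);
        if (!q->bio_split) {
                kfree(q);
                return NULL;
        }

        return q;
}

static struct bio *split_sketch(struct request_queue *q, struct bio *bio,
                                int max_sectors)
{
        /*
         * Clone the front of an oversized bio from the queue's private
         * bio_set, so each layer of a stack draws on its own reserve
         * rather than on one shared pool.
         */
        return bio_split(bio, max_sectors, GFP_NOIO, q->bio_split);
}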
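
P.P.S. And the deadlock-avoidance idea in commit df2cb6daa4 works
roughly like this (a close paraphrase of punt_bios_to_rescuer(),
simplified and not verbatim; the caller has already checked that
current->bio_list is non-NULL before a bio_set allocation blocks):

#include <linux/bio.h>
#include <linux/sched.h>
#include <linux/spinlock.h>
#include <linux/workqueue.h>

static void punt_bios_to_rescuer_sketch(struct bio_set *bs)
{
        struct bio_list punt, nopunt;
        struct bio *bio;

        bio_list_init(&punt);
        bio_list_init(&nopunt);

        /*
         * Bios sitting on current->bio_list may be pinning the very
         * reserve we are about to sleep on.  Punt the ones allocated
         * from this bio_set; leave the rest where they are.
         */
        while ((bio = bio_list_pop(current->bio_list)))
                bio_list_add(bio->bi_pool == bs ? &punt : &nopunt, bio);
        *current->bio_list = nopunt;

        /* Hand the punted bios to this bio_set's rescue thread... */
        spin_lock(&bs->rescue_lock);
        bio_list_merge(&bs->rescue_list, &punt);
        spin_unlock(&bs->rescue_lock);

        /* ...which resubmits them from its own process context. */
        queue_work(bs->rescue_workqueue, &bs->rescue_work);
}

Note that the rescue machinery is per-bio_set, but the bios on
current->bio_list can come from multiple devices -- which is exactly
the sharing problem described above.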