From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
 id S1756465AbcIHMLf (ORCPT ); Thu, 8 Sep 2016 08:11:35 -0400
Received: from mx2.suse.de ([195.135.220.15]:49505 "EHLO mx2.suse.de"
 rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
 id S1752029AbcIHMLc (ORCPT ); Thu, 8 Sep 2016 08:11:32 -0400
Subject: Re: [PATCH V2 00/22] Replace the CFQ I/O Scheduler with BFQ
To: Eric Wheeler, Mark Brown
References: <1470654917-4280-1-git-send-email-paolo.valente@linaro.org>
 <20160808131954.GA12647@infradead.org>
 <20160831220949.GD5967@sirena.org.uk>
Cc: Christoph Hellwig, Tejun Heo, Jens Axboe, Paolo Valente,
 linux-block@vger.kernel.org, linux-kernel@vger.kernel.org,
 ulf.hansson@linaro.org, linus.walleij@linaro.org,
 linux-bcache@vger.kernel.org, Omar Sandoval
From: Hannes Reinecke
X-Enigmail-Draft-Status: N1110
Message-ID:
Date: Thu, 8 Sep 2016 14:11:29 +0200
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101
 Thunderbird/45.2
MIME-Version: 1.0
In-Reply-To:
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: 8bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On 09/01/2016 11:06 PM, Eric Wheeler wrote:
> On Wed, 31 Aug 2016, Mark Brown wrote:
> [...]
>> I personally feel that given that it looks like this is all going to
>> take a while it'd still be good to merge BFQ at least as an alternative
>> scheduler so that people can take advantage of it while the work on
>> modernising everything to use blk-mq - that way we can hopefully improve
>> the state of the art for users in the short term or at least help get
>> some wider feedback on how well this works in the real world
>> independently of the work on blk-mq.
>
> I would like to chime in and agree fervently with Mark.
>
> We have a pair of very busy hypervisors with a complicated block stack
> integrating bcache, drbd, LVM, dm-thin, kvm, ggaoed (AoE target), zram
> swap, continuous block-layer backups and snapshot verifies to tertiary
> storage, cgroup block IO throttled limits, and lots of hourly dm-thin
> snapshots replicated to tertiary storage. All of this is performed under
> heavy memory pressure (35-40% swapped out to zram).
>
> The systems work moderately well under cfq, but *amazingly well* using
> BFQ. I like BFQ so much that I've backported v8r2 to Linux v4.1 [1].
>
> +1 to upstream this as a new scheduler without replacing CFQ.
>
> Including BFQ would be a boon for Linux and the community at large.
>
Personally, the main grudge I have against the BFQ patchset is that it
_replaces_ the existing CFQ. CFQ with all its drawbacks is reasonably
well understood, and we have a very large performance dataset.
Replacing it with BFQ will invalidate all of this, with us having to
redo _every_ one of these performance tests.

If, OTOH, BFQ were added as an alternative to CFQ, we could switch to
it at runtime, allowing the user to configure the system as he sees
fit. We did the same thing for the 'as' scheduler, so it's not a
problem in principle.

With that modification it's then a matter of policy whether it
_should_ be integrated into the mainline kernel, seeing that it'll be
part of a subsystem deemed obsolete.

But this behaviour is precisely what made me give up on hacking qemu;
patches are being ignored or turned down because they touch areas
which are supposed to be rewritten in the near future. And no deadline
is given, nor are any repositories to be had where this rewrite could
be looked at. Which makes contributing _really_ hard and very
frustrating; and I think this indeed would be a suitable topic for KS.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                zSeries & Storage
hare@suse.de                       +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)