From mboxrd@z Thu Jan 1 00:00:00 1970
Subject: Re: [PATCH 7/8] wbt: add general throttling mechanism
To: Jan Kara
References: <1461686131-22999-1-git-send-email-axboe@fb.com>
 <1461686131-22999-8-git-send-email-axboe@fb.com>
 <20160428110559.GC17362@quack2.suse.cz>
 <57225C3E.7060504@fb.com>
 <20160503093410.GD12748@quack2.suse.cz>
From: Jens Axboe
Message-ID: <5728B45F.6050200@fb.com>
Date: Tue, 3 May 2016 08:23:27 -0600
In-Reply-To: <20160503093410.GD12748@quack2.suse.cz>
X-Mailing-List: linux-kernel@vger.kernel.org

On 05/03/2016 03:34 AM, Jan Kara wrote:
> On Thu 28-04-16 12:53:50, Jens Axboe wrote:
>>> 2) As far as I can see in patch 8/8, you have plugged the throttling above
>>> the IO scheduler. When there are e.g. multiple cgroups with different IO
>>> limits operating, this throttling can lead to strange results (like a
>>> cgroup with low limit using up all available background "slots" and thus
>>> effectively stopping background writeback for other cgroups)? So won't
>>> it make more sense to plug this below the IO scheduler?
>>> Now I understand there may be other problems with this but I think we
>>> should put more thought to that and provide some justification in
>>> changelogs.
>>
>> One complexity is that we have to do this early for blk-mq, since once you
>> get a request, you're already sitting on the hw tag. CoDel should actually
>> work fine at each hop, so hopefully this will as well.
>
> OK, I see. But then this suggests that any IO scheduling and / or
> cgroup-related throttling should happen before we get a request for blk-mq
> as well? And then we can still do writeback throttling below that layer?

Not necessarily. For IO scheduling, basically we care about two parts:

1) Are you allowed to allocate the resources to queue some IO
2) Are you allowed to dispatch

The latter part can still be handled independently, and the former as
well of course; wbt just deals with throttling back #1 for buffered
writes.

>> But yes, fairness is something that we have to pay attention to. Right now
>> the wait queue has no priority associated with it, that should probably be
>> improved to be able to wake up in a more appropriate order.
>> Needs testing, but hopefully it works out since if you do run into
>> starvation, then you'll go to the back of the queue for the next attempt.
>
> Yeah, once I'll hunt down that regression with old disk, I can have a look
> into how writeback throttling plays together with blkio-controller.

Thanks!

-- 
Jens Axboe
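
For reference, here is a toy model of the two parts discussed in this mail:
admission (are you allowed to allocate resources to queue some IO) and
dispatch. It is only a sketch under stated assumptions, not the wbt code
from the patch series. The class name, wb_limit, dispatch_batch and the
owner labels are all hypothetical. It shows throttling applied to part 1
for buffered writes only, a FIFO wait queue with no priority, and dispatch
handled independently of the admission check.

```python
from collections import deque


class TwoStageQueue:
    """Toy model: admission (part 1) is separate from dispatch (part 2)."""

    def __init__(self, wb_limit, dispatch_batch):
        self.wb_limit = wb_limit              # hypothetical cap on admitted buffered writes
        self.dispatch_batch = dispatch_batch  # hypothetical per-round dispatch budget
        self.inflight_wb = 0                  # buffered writes admitted and not yet completed
        self.waiters = deque()                # FIFO wait queue, no priority
        self.queued = deque()                 # admitted, waiting for dispatch

    def submit(self, owner, buffered_write):
        # Part 1: only buffered writes are throttled at admission time.
        if buffered_write and self.inflight_wb >= self.wb_limit:
            self.waiters.append(owner)        # throttled: wait at the back of the queue
            return False
        if buffered_write:
            self.inflight_wb += 1
        self.queued.append((owner, buffered_write))
        return True

    def dispatch(self):
        # Part 2: dispatch has its own budget and ignores the admission limit.
        n = min(self.dispatch_batch, len(self.queued))
        return [self.queued.popleft() for _ in range(n)]

    def complete(self, buffered_write):
        # A completion frees a slot and admits the oldest waiter, if any.
        if buffered_write:
            self.inflight_wb -= 1
        if self.waiters and self.inflight_wb < self.wb_limit:
            owner = self.waiters.popleft()
            self.inflight_wb += 1
            self.queued.append((owner, True))
            return owner
        return None
```

With wb_limit=2, one owner submitting two buffered writes takes both
slots, and a second owner's buffered write waits until a completion
arrives. That is the cgroup fairness concern raised earlier in the thread.
Other IO is admitted regardless of the limit.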