From mboxrd@z Thu Jan 1 00:00:00 1970
Subject: Re: [PATCH -next v3 2/2] blk-throttle: fix io hung due to configuration updates
From: "yukuai (C)"
To: Michal Koutný
Date: Fri, 20 May 2022 09:22:17 +0800
Message-ID: <73464ca6-9412-cc55-d9c0-f2e8a10f0607@huawei.com>
In-Reply-To: <20220519161026.GG16096@blackbody.suse.cz>
References: <20220519085811.879097-1-yukuai3@huawei.com>
 <20220519085811.879097-3-yukuai3@huawei.com>
 <20220519095857.GE16096@blackbody.suse.cz>
 <20220519161026.GG16096@blackbody.suse.cz>
X-Mailing-List: linux-block@vger.kernel.org

On 2022/05/20 0:10, Michal Koutný wrote:
> On Thu, May 19, 2022 at 08:14:28PM +0800, "yukuai (C)" wrote:
>> tg_with_in_bps_limit:
>>   jiffy_elapsed_rnd = jiffies - tg->slice_start[rw];
>>   tmp = bps_limit * jiffy_elapsed_rnd;
>>   do_div(tmp, HZ);
>>   bytes_allowed = tmp; -> how many bytes are allowed in this slice,
>>   including dispatched.
>>
>>   if (tg->bytes_disp[rw] + bio_size <= bytes_allowed)
>>     *wait = 0 -> no need to wait if this bio is within limit
>>
>>   extra_bytes = tg->bytes_disp[rw] + bio_size - bytes_allowed;
>>   -> extra_bytes is based on 'bytes_disp'
>>
>> For example:
>>
>> 1) bps_limit is 2k, we issue two io (1k and 9k)
>> 2) the first io (1k) will be dispatched, bytes_disp = 1k, slice_start = 0;
>>    the second io (9k) waits for (9 - (2 - 1)) / 2 = 4 s
>
> The 2nd io arrived at 1s, the wait time is 4s, i.e. it can be dispatched
> at 5s (i.e. 10k / 2kB/s = 5s).

No, in the example the second io arrives together with the first io.
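To make the numbers concrete, the wait computed by the logic above can be
modeled with a small stand-alone sketch (an illustration only, not the kernel
code: plain seconds replace jiffies/HZ, throtl_slice is taken as 1s, and the
roundup of the elapsed time is approximated with a one-slice minimum):

#include <stdio.h>

/*
 * Simplified model of the wait calculation in tg_with_in_bps_limit():
 * time is in plain seconds instead of jiffies/HZ, sizes are in bytes,
 * throtl_slice is 1s and the roundup of the elapsed time is approximated
 * by a one-slice minimum.  Illustration only, not the kernel code.
 */
static double wait_time(double now, double slice_start, double bytes_disp,
                        double bio_size, double bps_limit)
{
        double throtl_slice = 1.0;
        double elapsed = now - slice_start;

        /* elapsed time is rounded up so that at least one slice counts */
        if (elapsed < throtl_slice)
                elapsed = throtl_slice;

        double bytes_allowed = bps_limit * elapsed;

        if (bytes_disp + bio_size <= bytes_allowed)
                return 0;               /* within limit, no wait */

        double extra_bytes = bytes_disp + bio_size - bytes_allowed;
        return extra_bytes / bps_limit;
}

int main(void)
{
        /* the example above: bps_limit = 2k, the 1k io already dispatched */
        printf("9k io at t=0 waits %.1fs\n",
               wait_time(0, 0, 1024, 9 * 1024, 2 * 1024));      /* 4.0s */
        return 0;
}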
>
>> 3) after 3 s, we update bps_limit to 1k, then the new wait time is calculated:
>>
>> without this patch: bytes_disp = 0, slice_start = 3:
>> bytes_allowed = 1k <--- why 1k and not 0?

Because slice_start == jiffies, bytes_allowed is equal to bps_limit.

>> extra_bytes = 9k - 1k = 8k
>> wait = 8s
>
> This looks like it was calculated at time 4s (1s after new config was
> set).

No... it was calculated at time 3s:

jiffy_elapsed_rnd = roundup(jiffy_elapsed_rnd, tg->td->throtl_slice);

jiffies should be greater than 3s here, thus jiffy_elapsed_rnd is
3s + throtl_slice (I'm using throtl_slice = 1s here, it should not
affect the result).

>>
>> with this patch: bytes_disp = 0.5k, slice_start = 0,
>> bytes_allowed = 1k * 3 + 1k = 4k
>> extra_bytes = 0.5k + 9k - 4k = 5.5k
>> wait = 5.5s
>
> This looks like calculated at 4s, so the IO would be waiting till
> 4s+5.5s = 9.5s.

The wait time is based on extra_bytes, so it really is 5.5s; adding 4s is
wrong here:

bytes_allowed = ((jiffies - slice_start) / HZ + 1) * bps_limit
extra_bytes = bio_size + bytes_disp - bytes_allowed
wait = extra_bytes / bps_limit

>
> As I don't know why using time 4s, I'll shift this calculation to the
> time 3s (when the config changes):
>
> bytes_disp = 0.5k, slice_start = 0,
> bytes_allowed = 1k * 3 = 3k
> extra_bytes = 0.5k + 9k - 3k = 7.5k

6.5k (0.5k + 9k - 3k = 6.5k)

> wait = 7.5s
>
> In absolute time, the IO would wait till 3s+7.5s = 10.5s

Like I said above, the wait time should not add (jiffies - slice_start).

>
> OK, either your 9.5s or my 10.5s looks weird (although earlier than
> original 4s+8s=12s).
> However, the IO should ideally only wait till
>
> 3s + (9k - (6k - 1k)) / 1k/s =
>      bio - (allowed - dispatched) / new_limit
>
> = 3s + 4k / 1k/s = 7s
>
> ('allowed' is based on old limit)
>
> Or in another example, what if you change the config from 2k/s to ∞k/s
> (unlimited, let's neglect the arithmetic overflow that you handle
> explicitly, imagine a big number but not so big to be greater than
> division result).
>
> In such a case, the wait time should be zero, i.e. IO should be
> dispatched right at the time of config change.

I thought about it, however, IMO, this is not a good idea. If the user
updates the config frequently, io throttling will become ineffective.

Thanks,
Kuai

> (With your patch it still calculates >0 wait time, and the original
> behavior gives >0 wait too.)
>
>> I hope I can explain it clearly...
>
> Yes, thanks for pointing me to relevant parts.
> I hope I grasped them correctly.
>
> IOW, your patch and formula make the wait time shorter but still IO can
> be delayed indefinitely if you pass a sequence of new configs. (AFAIU)
>
> Regards,
> Michal
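For reference, the two wait times debated above (8s without the patch versus
5.5s with it) can be reproduced with the same kind of simplified model. This
is a sketch under the assumptions already stated in the thread (seconds
instead of jiffies, throtl_slice = 1s, config change from 2k/s to 1k/s at
t = 3s), not the kernel implementation:

#include <stdio.h>

/*
 * The recalculation at the config change (t = 3s, bps_limit 2k/s -> 1k/s)
 * for the still-throttled 9k io, under the same simplifications as above
 * (seconds instead of jiffies, throtl_slice = 1s).  Illustration only.
 */
int main(void)
{
        double new_limit = 1024;                /* 1k/s after the update */
        double bio_size = 9 * 1024;

        /*
         * Without the patch the slice is restarted: bytes_disp = 0,
         * slice_start = jiffies, so only one slice worth of bytes is
         * allowed.
         */
        double allowed_old = new_limit * 1;                     /* 1k */
        double wait_old = (0 + bio_size - allowed_old) / new_limit;

        /*
         * With the patch bytes_disp is scaled by new_limit/old_limit
         * (1k * 1k / 2k = 0.5k) and slice_start stays at 0, so the ~3s
         * already elapsed (rounded up to 4s) still counts.
         */
        double disp_scaled = 1024 * new_limit / 2048;           /* 0.5k */
        double allowed_new = new_limit * 4;                     /* 4k   */
        double wait_new = (disp_scaled + bio_size - allowed_new) / new_limit;

        printf("without the patch: wait %.1fs\n", wait_old);    /* 8.0s */
        printf("with the patch:    wait %.1fs\n", wait_new);    /* 5.5s */
        return 0;
}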