From mboxrd@z Thu Jan 1 00:00:00 1970
From: Weiping Zhang
Date: Tue, 31 Mar 2020 23:47:41 +0800
Subject: Re: [PATCH v5 0/4] Add support Weighted Round Robin for blkcg and nvme
To: Tejun Heo
Cc: Keith Busch, Jens Axboe, Christoph Hellwig, Bart Van Assche,
 Minwoo Im, Thomas Gleixner, Ming Lei, "Nadolski, Edmund",
 linux-block@vger.kernel.org, cgroups@vger.kernel.org,
 linux-nvme@lists.infradead.org
References: <20200204154200.GA5831@redsun51.ssa.fujisawa.hgst.com>
 <20200331143635.GS162390@mtj.duckdns.org>
In-Reply-To: <20200331143635.GS162390@mtj.duckdns.org>

On Tue, Mar 31, 2020 at 10:36 PM, Tejun Heo wrote:
>
> Hello, Weiping.
>
> On Tue, Mar 31, 2020 at 02:17:06PM +0800, Weiping Zhang wrote:
> > Recently I did some cgroup io weight testing:
> > https://github.com/dublio/iotrack/wiki/cgroup-io-weight-test
> > I think a proper io weight policy should consider the high-weight
> > cgroup's iops and latency and also take the whole disk's throughput
> > into account. That is to say, the policy should trade off more
> > carefully between a cgroup's IO performance and the whole disk's
> > throughput. I know one policy cannot do all things perfectly, but
> > from the test result nvme-wrr can work well.
>
> That's w/o iocost QoS targets configured, right? iocost should be able to
> achieve similar results as wrr with QoS configured.
>
Yes, I have not set a QoS target.

> > From the following test result, nvme-wrr works well for the cgroup's
> > latency and iops as well as the whole disk's throughput.
>
> As I wrote before, the issues I see with wrr are the following.
>
> * Hardware dependent. Some will work ok or even fantastic. Many others will do
>   horribly.
>
> * Lack of configuration granularity. We can't configure it granular enough to
>   serve hierarchical configuration.
>
> * Likely not a huge problem with the deep QD of nvmes but lack of queue depth
>   control can lead to loss of latency control and thus loss of protection for
>   low concurrency workloads when pitted against workloads which can saturate
>   QD.
>
> All that said, given the feature is available, I don't see any reason to not
> allow to use it, but I don't think it fits the cgroup interface model given the
> hardware dependency and coarse granularity. For these cases, I think the right
> thing to do is using cgroups to provide tagging information - i.e. build a
> dedicated interface which takes cgroup fd or ino as the tag and associate
> configurations that way. There already are other use cases which use cgroup this
> way (e.g. perf).

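To make the granularity point in the quoted list concrete, here is a toy
model of weighted round robin arbitration over the three weighted classes
that NVMe WRR offers (low, medium, high). It is only a sketch under stated
assumptions: the struct, the function name wrr_dispatch and the weights used
with it are hypothetical, and it models neither the admin/urgent classes nor
any real driver or controller behaviour.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: the three weighted classes of NVMe WRR arbitration. */
enum wrr_class { WRR_LOW, WRR_MEDIUM, WRR_HIGH, WRR_NR_CLASSES };

struct wrr_queue {
	unsigned int weight;   /* commands granted per arbitration round */
	unsigned long pending; /* commands waiting in this class */
	unsigned long served;  /* commands dispatched so far */
};

/*
 * Run arbitration rounds until `budget` commands have been dispatched or
 * every class is empty. In each round a class may dispatch at most `weight`
 * commands. Returns the number of commands dispatched.
 */
unsigned long wrr_dispatch(struct wrr_queue q[WRR_NR_CLASSES],
			   unsigned long budget)
{
	unsigned long done = 0;

	assert(q != NULL);
	while (done < budget) {
		unsigned long before = done;

		for (int c = WRR_HIGH; c >= WRR_LOW && done < budget; c--) {
			unsigned long grant = q[c].weight;

			if (grant > q[c].pending)
				grant = q[c].pending;
			if (grant > budget - done)
				grant = budget - done;
			q[c].pending -= grant;
			q[c].served += grant;
			done += grant;
		}
		if (done == before)
			break; /* nothing left to serve in any class */
	}
	return done;
}
```

With all three classes saturated, the served counts follow the weight ratio,
but every cgroup still has to be mapped onto one of only three buckets, which
is why a hierarchy of arbitrary weights cannot be expressed this way.
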
Do you mean dropping "io.wrr" or "blkio.wrr" from cgroup and using a
dedicated interface like /dev/xxx or /proc/xxx instead?

I see the perf code:

    struct fd f = fdget(fd);
    struct cgroup_subsys_state *css =
            css_tryget_online_from_dir(f.file->f_path.dentry,
                                       &perf_event_cgrp_subsys);

It looks like the same approach can be applied to the block cgroup.

Thanks for your help.
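
For the userspace half of the "cgroup fd or ino as the tag" idea, a minimal
sketch could look like the following. It only uses open(2) and fstat(2) on a
cgroup directory; the helper names cgroup_tag_open and cgroup_tag_ino are
hypothetical, and the dedicated interface that would consume the fd or inode
number does not exist in this series. On the kernel side the fd would then
presumably be resolved as in the perf snippet above.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Open a cgroup directory (e.g. a path under /sys/fs/cgroup) and return an
 * fd that a hypothetical dedicated interface could accept as the tag.
 * Returns -1 on failure, with errno set by open(2).
 */
int cgroup_tag_open(const char *cgroup_dir)
{
	assert(cgroup_dir != NULL);
	return open(cgroup_dir, O_RDONLY | O_DIRECTORY | O_CLOEXEC);
}

/*
 * Look up the inode number behind a directory fd, for an interface that
 * takes the inode number rather than the fd itself as the tag.
 * Returns 0 on success and -1 if fd is invalid or not a directory.
 */
int cgroup_tag_ino(int fd, uint64_t *ino)
{
	struct stat st;

	assert(ino != NULL);
	if (fstat(fd, &st) != 0 || !S_ISDIR(st.st_mode))
		return -1;
	*ino = (uint64_t)st.st_ino;
	return 0;
}
```

Passing the fd keeps the cgroup pinned while the configuration is applied,
whereas the inode number is just an identifier and can be stored or logged;
which of the two a real interface should take is left open here.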