From: Waiman Long
Subject: Re: [PATCH v2 2/6] cgroup/cpuset: Clarify the use of invalid partition root
To: Tejun Heo, Waiman Long
Cc: Zefan Li, Johannes Weiner, Jonathan Corbet, Shuah Khan,
 cgroups@vger.kernel.org, linux-kernel@vger.kernel.org,
 linux-doc@vger.kernel.org, linux-kselftest@vger.kernel.org,
 Andrew Morton, Roman Gushchin, Phil Auld, Peter Zijlstra, Juri Lelli
References: <20210621184924.27493-1-longman@redhat.com>
 <20210621184924.27493-3-longman@redhat.com>
 <6ea1ac38-73e1-3f78-a5d2-a4c23bcd8dd1@redhat.com>
Date: Fri, 16 Jul 2021 14:44:27 -0400
X-Mailing-List: linux-kernel@vger.kernel.org

On 7/5/21 1:51 PM, Tejun Heo wrote:
> Hello, Waiman.
>
> On Mon, Jun 28, 2021 at 09:06:50AM -0400, Waiman Long wrote:
>> The main reason for doing this is because normal cpuset control file actions
>> are under the direct control of the cpuset code. So it is up to us to decide
>> whether to grant it or deny it. Hotplug, on the other hand, is not under the
>> control of cpuset code. It can't deny a hotplug operation. This is the main
>> reason why the partition root error state was added in the first place.
> I have a difficult time convincing myself that this difference justifies the
> behavior difference and it keeps bothering me that there is a state which
> can be reached through one path but rejected by the other. I'll continue
> below.
>
>> Normally, users can set cpuset.cpus to whatever value they want even though
>> they are not actually granted. However, turning on partition root is under
>> more strict control. You can't turn on partition root if the CPUs requested
>> cannot actually be granted. The problem with setting the state to just
>> partition error is that users may not be aware that the partition creation
>> operation fails. We can't assume all users will do the proper error
>> checking. I would rather let them know the operation fails rather than
>> relying on them doing the proper check afterward.
>>
>> Yes, I agree that it is a different philosophy than the original cpuset
>> code, but I thought one reason of doing cgroup v2 is to simplify the
>> interface and make it a bit more error-proof. Since partition root creation
>> is a relatively rare operation, we can afford to make it more strict than
>> the other operations.
> So, IMO, one of the reasons why cgroup1 interface was such a mess was
> because each piece of interaction was designed ad-hoc without regard to the
> overall consistency. One person feels a particular way of interacting with
> the interface is "correct" and does it that way and another person does
> another part in a different way. In the end, we ended up with a messy
> patchwork.
>
> One problematic aspect of cpuset in cgroup1 was the handling of failure
> modes, which was caused by the same exact approach - we wanted the interface
> to reject invalid configurations outright even though we didn't have the
> ability to prevent those configurations from occurring through other paths,
> which makes the failure mode more subtle by further obscuring them.
>
> I think a better approach would be having a clear signal and mechanism to
> watch the state and explicitly requiring users to verify and monitor the
> state transitions.

Sorry for the late reply; I was busy with other work.

I agree with you in principle. However, the reason there are more
restrictions on enabling a partition is that I want to avoid forcing
users to always read back cpuset.partition.type to see whether the
operation succeeded, instead of just getting an error from the
operation. The former approach is more error prone. If you don't want
changes in existing behavior, I can relax the checking and allow such
cpusets to become invalid partitions if an illegal operation happens.

Also, there is now another cpuset patch to extend cpu isolation to cgroup
v1 [1]. I think it is better suited to the cgroup v2 partition scheme, but
cgroup v1 is still quite heavily used out there.

Please let me know what you want me to do and I will send out a v3 version.

Thanks a lot!
Longman
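To make the two approaches discussed above concrete, here is a minimal
sketch of what a userspace tool would have to do under each of them. It is
an illustration only, not code from the patch series. The helper names
(read_partition_state, enable_partition) are hypothetical. The file name
cpuset.cpus.partition is the one used by the cgroup v2 documentation (the
message above refers to it as cpuset.partition.type), and the cgroup
directory is passed in by the caller, so nothing here assumes a particular
mount point.

```python
from pathlib import Path

# Control file name as given in the cgroup v2 documentation.
PARTITION_FILE = "cpuset.cpus.partition"


def read_partition_state(cgroup_dir):
    """Return the current contents of the partition file, stripped."""
    return (Path(cgroup_dir) / PARTITION_FILE).read_text().strip()


def enable_partition(cgroup_dir, value="root"):
    """Hypothetical helper: request a partition state and verify it.

    Returns (ok, detail). Two failure paths are covered:
      1. the write itself is rejected (strict checking, error returned);
      2. the write is accepted but the state read back differs, for
         example "root invalid" (relaxed checking, caller must verify).
    """
    ctl = Path(cgroup_dir) / PARTITION_FILE
    try:
        ctl.write_text(value + "\n")
    except OSError as err:
        # Path 1: the operation fails right away with an error.
        return False, f"write rejected: {err}"
    # Path 2: read the state back and compare with what was requested.
    state = read_partition_state(cgroup_dir)
    return state == value, state
```

With strict checking, a caller that only looks at the write error is
already safe. With relaxed checking, the read-back step is mandatory, and
because hotplug can invalidate a partition later, the same read would have
to be repeated (or the file monitored) for as long as the partition is
expected to stay valid.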