From: Shakeel Butt
Date: Thu, 23 Mar 2023 09:09:49 -0700
Subject: Re: [RFC PATCH 1/7] cgroup: rstat: only disable interrupts for the percpu lock
To: Yosry Ahmed
Cc: Tejun Heo, Josef Bacik, Jens Axboe, Zefan Li, Johannes Weiner, Michal Hocko, Roman Gushchin, Muchun Song, Andrew Morton, Vasily Averin, cgroups@vger.kernel.org, linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, bpf@vger.kernel.org
References: <20230323040037.2389095-1-yosryahmed@google.com> <20230323040037.2389095-2-yosryahmed@google.com>
X-Mailing-List: bpf@vger.kernel.org

On Thu, Mar 23, 2023 at 8:46 AM Shakeel Butt wrote:
>
> On Thu, Mar 23, 2023 at 8:43 AM Yosry Ahmed wrote:
> >
> > On Thu, Mar 23, 2023 at 8:40 AM Shakeel Butt wrote:
> > >
> > > On Thu, Mar 23, 2023 at 6:36 AM Yosry Ahmed wrote:
> > > >
> > > [...]
> > > > > >
> > > > > > > 2. Are we really calling rstat flush in irq context?
> > > > > >
> > > > > > I think it is possible through the charge/uncharge path:
> > > > > > memcg_check_events()->mem_cgroup_threshold()->mem_cgroup_usage(). I
> > > > > > added the protection against flushing in an interrupt context for
> > > > > > future callers as well, as it may cause a deadlock if we don't disable
> > > > > > interrupts when acquiring cgroup_rstat_lock.
> > > > > >
> > > > > > > 3. The mem_cgroup_flush_stats() call in mem_cgroup_usage() is only
> > > > > > > done for root memcg. Why is mem_cgroup_threshold() interested in root
> > > > > > > memcg usage? Why not ignore root memcg in mem_cgroup_threshold() ?
> > > > > >
> > > > > > I am not sure, but the code looks like event notifications may be set
> > > > > > up on root memcg, which is why we need to check thresholds.
> > > > >
> > > > > This is something we should deprecate as root memcg's usage is ill defined.
> > > >
> > > > Right, but I think this would be orthogonal to this patch series.
> > > >
> > >
> > > I don't think we can make cgroup_rstat_lock a non-irq-disabling lock
> > > without either breaking a link between mem_cgroup_threshold and
> > > cgroup_rstat_lock or make mem_cgroup_threshold work without disabling
> > > irqs.
> > >
> > > So, this patch can not be applied before either of those two tasks are
> > > done (and we may find more such scenarios).
> >
> >
> > Could you elaborate why?
> >
> > My understanding is that with an in_task() check to make sure we only
> > acquire cgroup_rstat_lock from non-irq context it should be fine to
> > acquire cgroup_rstat_lock without disabling interrupts.
>
> From mem_cgroup_threshold() code path, cgroup_rstat_lock will be taken
> with irq disabled while other code paths will take cgroup_rstat_lock
> with irq enabled. This is a potential deadlock hazard unless
> cgroup_rstat_lock is always taken with irq disabled.

Oh you are making sure it is not taken in the irq context through
should_skip_flush(). Hmm seems like a hack.
Normally it is recommended to remove all such users instead of
silently ignoring or bypassing the functionality. So, how about removing
mem_cgroup_flush_stats() from mem_cgroup_usage()? That would break the
known chain that takes cgroup_rstat_lock with irqs disabled, and you
could then add WARN_ON_ONCE(!in_task()).
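
To make the suggestion concrete, here is a rough userspace model of it,
not kernel code and not a patch. The names sim_in_task, warn_on_once(),
model_rstat_flush() and model_usage() are hypothetical stand-ins for
in_task(), WARN_ON_ONCE(), the rstat flush entry point and
mem_cgroup_usage(). Returning early after the warning is an assumption
of this sketch; the mail above only proposes adding the warning.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the kernel's in_task(): true in process context. */
bool sim_in_task = true;

int warn_count;   /* number of times the one-shot warning was printed */
int flush_count;  /* number of flushes that actually ran */

/*
 * Userspace approximation of WARN_ON_ONCE(): print on the first hit only
 * and return the condition. Unlike the kernel macro, the "once" state here
 * is shared by all callers rather than kept per call site.
 */
bool warn_on_once(bool cond, const char *what)
{
	static bool warned;

	if (cond && !warned) {
		warned = true;
		warn_count++;
		fprintf(stderr, "WARNING: %s\n", what);
	}
	return cond;
}

/* Model of a flush entry point that is only valid from task context. */
void model_rstat_flush(void)
{
	/* Assumption of this sketch: bail out rather than continue after warning. */
	if (warn_on_once(!sim_in_task, "rstat flush outside task context"))
		return;

	/* The real code would take cgroup_rstat_lock here with irqs enabled. */
	flush_count++;
}

/*
 * Model of mem_cgroup_usage() after the proposed change: it no longer
 * flushes, so the threshold path never reaches the lock with irqs disabled.
 */
long model_usage(long cached_usage)
{
	return cached_usage;
}
```

With the flush gone from the usage path, any remaining caller that
reaches the flush outside task context shows up as a warning instead of
being skipped silently.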