Date: Wed, 14 Apr 2021 16:18:50 +0100
From: Mel Gorman
To: Vlastimil Babka
Cc: Linux-MM, Linux-RT-Users, LKML, Chuck Lever, Jesper Dangaard Brouer, Matthew Wilcox, Thomas Gleixner, Peter Zijlstra, Ingo Molnar, Michal Hocko, Oscar Salvador
Subject: Re: [PATCH 04/11] mm/vmstat: Convert NUMA statistics to basic NUMA counters
Message-ID: <20210414151850.GG3697@techsingularity.net>
References: <20210407202423.16022-1-mgorman@techsingularity.net> <20210407202423.16022-5-mgorman@techsingularity.net> <7a7ec563-0519-a850-563a-9680a7bd00d3@suse.cz>
In-Reply-To: <7a7ec563-0519-a850-563a-9680a7bd00d3@suse.cz>

On Wed, Apr 14, 2021 at 02:56:45PM +0200, Vlastimil Babka wrote:
> On 4/7/21 10:24 PM, Mel Gorman wrote:
> > NUMA statistics are maintained on the zone level for hits, misses, foreign
> > etc but nothing relies on them being perfectly accurate for functional
> > correctness.
> > The counters are used by userspace to get a general overview
> > of a workload's NUMA behaviour but the page allocator incurs a high cost to
> > maintain perfect accuracy similar to what is required for a vmstat like
> > NR_FREE_PAGES. There even is a sysctl vm.numa_stat to allow userspace to
> > turn off the collection of NUMA statistics like NUMA_HIT.
> >
> > This patch converts NUMA_HIT and friends to be NUMA events with similar
> > accuracy to VM events. There is a possibility that slight errors will be
> > introduced but the overall trend as seen by userspace will be similar.
> > Note that while these counters could be maintained at the node level that
> > it would have a user-visible impact.
>
> I guess this kind of inaccuracy is fine. I just don't like much
> fold_vm_zone_numa_events() which seems to calculate sums of percpu counters and
> then assign the result to zone counters for immediate consumption, which differs
> from other kinds of folds in vmstat that reset the percpu counters to 0 as they
> are treated as diffs to the global counters.
>

The counters that are diffs fit inside an s8 and they are kept limited
because their "true" value is sometimes critical -- e.g. NR_FREE_PAGES for
watermark checking. So the level of drift has to be controlled, and the
drift should not be allowed to persist indefinitely, so those counters are
folded periodically.

The inaccurate counters are only exported to userspace. There is no need
to update them every few seconds, so fold_vm_zone_numa_events() is only
called when a user cares, but you raise a valid point below.

> So it seems that this intermediate assignment to zone counters (using
> atomic_long_set() even) is unnecessary and this could mimic sum_vm_events() that
> just does the summation on a local array?
>

The atomic is unnecessary for sure but using a local array is problematic
because of your next point.
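
To make the distinction between the two kinds of counter concrete, here
is a rough userspace sketch. It is not the kernel code: NR_CPUS,
DIFF_THRESHOLD and every function name are hypothetical, and the real
vmstat code sizes its threshold differently. It only shows why a bounded
per-CPU diff keeps drift limited, while an event counter is exact only at
the moment somebody sums it.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS        4
#define DIFF_THRESHOLD 32	/* hypothetical; must stay well inside int8_t */

/* Style 1: bounded per-CPU diff that is folded into a global counter. */
static long   global_count;
static int8_t cpu_diff[NR_CPUS];

static void diff_counter_add(int cpu, int delta)
{
	int v;

	assert(cpu >= 0 && cpu < NR_CPUS);
	assert(delta >= -1 && delta <= 1);	/* sketch handles unit steps only */

	v = cpu_diff[cpu] + delta;
	if (v > DIFF_THRESHOLD || v < -DIFF_THRESHOLD) {
		global_count += v;	/* fold and reset so drift stays bounded */
		v = 0;
	}
	cpu_diff[cpu] = (int8_t)v;
}

/* Worst-case distance between global_count and the true value. */
static long diff_counter_max_drift(void)
{
	return (long)NR_CPUS * DIFF_THRESHOLD;
}

/* Style 2: unbounded per-CPU event count, summed only when read. */
static unsigned long cpu_events[NR_CPUS];

static void event_counter_inc(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	cpu_events[cpu]++;
}

static unsigned long event_counter_sum(void)
{
	unsigned long sum = 0;
	int cpu;

	for (cpu = 0; cpu < NR_CPUS; cpu++)
		sum += cpu_events[cpu];
	return sum;
}
```

In this sketch a reader of global_count is never off by more than
diff_counter_max_drift(), which is the property something like a
watermark check needs. The event style has no global value at all
between reads, which is acceptable only when userspace is the sole
consumer.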
> And probably a bit more serious is that vm_events have vm_events_fold_cpu() to
> deal with a cpu going away, but after your patch the stats counted on a cpu just
> disappear from the sums as it goes offline as there's no such thing for the numa
> counters.
>

That is a problem I missed. Even if zonestats was preserved on
hot-remove, fold_vm_zone_numa_events() would not be reading the offline
CPU, so the counts would jump all over the place across hotplug events.
So some periodic folding is necessary. I would still prefer not to do it
by time, but it could be done only on overflow or when a file like
/proc/vmstat is read. I'll think about it a bit more and see what I come
up with.

Thanks!

-- 
Mel Gorman
SUSE Labs
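
As an illustration of the hotplug problem discussed above, here is a
minimal userspace sketch. It is an assumption-laden model rather than
the kernel implementation: cpu_online[], folded_total and all function
names are hypothetical. It contrasts a read that only sums online CPUs
with a scheme that folds a CPU's count into a global total before the
CPU goes away and again whenever the total is read.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4

static unsigned long cpu_events[NR_CPUS];
static bool cpu_online[NR_CPUS] = { true, true, true, true };
static unsigned long folded_total;	/* counts already folded from CPUs */

static void count_event(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS && cpu_online[cpu]);
	cpu_events[cpu]++;
}

/* Naive read: sums online CPUs only, so an offline CPU's count vanishes. */
static unsigned long read_naive(void)
{
	unsigned long sum = 0;
	int cpu;

	for (cpu = 0; cpu < NR_CPUS; cpu++)
		if (cpu_online[cpu])
			sum += cpu_events[cpu];
	return sum;
}

/* Move one CPU's count into the global total and reset the per-CPU slot. */
static void fold_cpu(int cpu)
{
	folded_total += cpu_events[cpu];
	cpu_events[cpu] = 0;
}

/* Take a CPU offline, optionally folding its count first. */
static void cpu_offline(int cpu, bool fold_first)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	if (fold_first)
		fold_cpu(cpu);
	cpu_online[cpu] = false;
}

/* Fold-on-read: fold every online CPU, then report the global total. */
static unsigned long read_folded(void)
{
	int cpu;

	for (cpu = 0; cpu < NR_CPUS; cpu++)
		if (cpu_online[cpu])
			fold_cpu(cpu);
	return folded_total;
}
```

In this model, calling cpu_offline(cpu, false) leaves the count sitting
in cpu_events[cpu] where neither reader looks at it again, which mirrors
the case where the per-CPU data is preserved on hot-remove but never
read. Folding in the offline path and in the read path, with no timer
involved, keeps the reported total monotonic.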