References: <1C584B5C-E04E-4B04-A3B5-4DC8E5E67366@lca.pw> <20191004123349.GB10845@dhcp22.suse.cz> <20191004132624.ctaodxaxsd7wzwlh@box>
In-Reply-To: <20191004132624.ctaodxaxsd7wzwlh@box>
From: Daniel Colascione
Date: Fri, 4
Oct 2019 06:45:21 -0700
Subject: Re: [PATCH] Make SPLIT_RSS_COUNTING configurable
To: "Kirill A. Shutemov"
Cc: Michal Hocko, Qian Cai, Tim Murray, Suren Baghdasaryan, linux-kernel, linux-mm

On Fri, Oct 4, 2019 at 6:26 AM Kirill A. Shutemov wrote:
> On Fri, Oct 04, 2019 at 02:33:49PM +0200, Michal Hocko wrote:
> > On Wed 02-10-19 19:08:16, Daniel Colascione wrote:
> > > On Wed, Oct 2, 2019 at 6:56 PM Qian Cai wrote:
> > > > > On Oct 2, 2019, at 4:29 PM, Daniel Colascione wrote:
> > > > >
> > > > > Adding the correct linux-mm address.
> > > > >
> > > > >
> > > > >
> > > > >> +config SPLIT_RSS_COUNTING
> > > > >> +	bool "Per-thread mm counter caching"
> > > > >> +	depends on MMU
> > > > >> +	default y if NR_CPUS >= SPLIT_PTLOCK_CPUS
> > > > >> +	help
> > > > >> +	  Cache mm counter updates in thread structures and
> > > > >> +	  flush them to visible per-process statistics in batches.
> > > > >> +	  Say Y here to slightly reduce cache contention in processes
> > > > >> +	  with many threads at the expense of decreasing the accuracy
> > > > >> +	  of memory statistics in /proc.
> > > > >> +
> > > > >> endmenu
> > > >
> > > >
> > > > All those vague words are going to make developers almost
> > > > impossible to decide the right selection here. It sounds like we
> > > > should kill SPLIT_RSS_COUNTING at all to simplify the code as the
> > > > benefit is so small vs the side-effect?
> > >
> > > Killing SPLIT_RSS_COUNTING would be my first choice; IME, on mobile
> > > and a basic desktop, it doesn't make a difference. I figured making it
> > > a knob would help allay concerns about the performance impact in more
> > > extreme configurations.
> >
> > I do agree with Qian. Either it is really helpful (is it? probably on
> > the number of cpus) and it should be auto-enabled or it should be
> > dropped altogether. You cannot really expect people know how to enable
> > this without a deep understanding of the MM internals. Not to mention
> > all those users using distro kernels/configs.
> >
> > A config option sounds like a bad way forward.
>
> And I don't see much point anyway. Reading RSS counters from proc is
> inherently racy. It can just either way after the read due to process
> behaviour.

Split RSS accounting doesn't make reading from mm counters racy. It
makes these counters *wrong*. We flush task mm counters to the
mm_struct once every 64 page faults that a task incurs or when that
task exits. That means that if a thread takes 63 page faults and then
sleeps for a week, that thread's process's mm counters are wrong by 63
pages *for a week*. And some processes have a lot of threads,
compounding the error. Split RSS accounting means that memory usage
numbers don't add up. I don't think it's unreasonable to want a mode
where memory counters agree with other indicators of system activity.

Nobody has demonstrated that split RSS accounting actually helps in
the real world. But I've described above, concretely, how split RSS
accounting hurts.

I've been trying for over a year to either disable split RSS
accounting or to let people opt out of it. If you won't remove split
RSS accounting and you won't let me add a configuration knob that lets
people opt out of it, what will you accept?
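
For anyone who wants to see the arithmetic behind the "63 pages for a
week" case, here is a rough userspace model of the batching described
above. It is only a sketch, not kernel code: the names Process, Thread,
page_fault, flush, and simulate are made up for illustration, the
threshold of 64 is taken from the description above, and the model
assumes every fault adds exactly one page and that no thread exits
(so the flush-on-exit path never runs).

```python
from dataclasses import dataclass

# Assumption taken from the message: cached updates are flushed to the
# shared per-process counter once every 64 page faults.
FLUSH_THRESHOLD = 64


@dataclass
class Process:
    # Hypothetical stand-in for the per-process counter that /proc reports.
    visible_rss: int = 0


@dataclass
class Thread:
    # Hypothetical stand-in for the per-thread cache of counter updates.
    proc: Process
    cached_pages: int = 0
    events: int = 0

    def page_fault(self) -> None:
        """Record one faulted-in page in the thread-local cache."""
        self.cached_pages += 1
        self.events += 1
        if self.events >= FLUSH_THRESHOLD:
            self.flush()

    def flush(self) -> None:
        """Move the cached delta into the process-visible counter."""
        self.proc.visible_rss += self.cached_pages
        self.cached_pages = 0
        self.events = 0


def simulate(num_threads: int, faults_per_thread: int) -> tuple[int, int]:
    """Return (true page count, page count visible in the shared counter)."""
    proc = Process()
    threads = [Thread(proc) for _ in range(num_threads)]
    for thread in threads:
        for _ in range(faults_per_thread):
            thread.page_fault()
    return num_threads * faults_per_thread, proc.visible_rss


if __name__ == "__main__":
    for n in (1, 100):
        true_rss, visible = simulate(n, 63)
        print(f"{n} thread(s): true={true_rss} visible={visible} "
              f"error={true_rss - visible}")
```

In this model one thread that stops at 63 faults leaves the visible
counter 63 pages short, and 100 such threads leave it 6300 pages short,
until something forces a flush.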