From: Shakeel Butt
Date: Wed, 23 Feb 2022 12:28:23 -0800
Subject: Re: Regression in workingset_refault latency on 5.15
To: Ivan Babrou
Cc: Daniel Dao, kernel-team, Linux MM, Johannes Weiner, Roman Gushchin,
    Feng Tang, Michal Hocko, Hillf Danton, Michal Koutný, Andrew Morton,
    Linus Torvalds

On Wed, Feb 23, 2022 at 11:28 AM Ivan Babrou wrote:
>
[...]
> > 2) Can you please use the similar bpf+kprobe tracing for the
> > memcg_rstat_updated() (or __mod_memcg_lruvec_state()) to find the
> > source of frequent stat updates.
>
> "memcg_rstat_updated" is "static inline".
>
> With the following:
>
> bpftrace -e 'kprobe:__mod_memcg_lruvec_state { @stacks[kstack(10)]++ }'
>
[...]

Thanks, this is helpful. It seems most of the stat updates are happening
on anon page faults and, based on the signature, they look like swap
refaults.

>
> > 3) I am still pondering why disabling swap resolves the issue for you.
> > Is that only for a workload different from xfs read?
>
> My understanding is that any block IO (including swap) triggers new
> memcg accounting code. In our process we don't have any other IO than
> swap, so disabling swap removes the major (if not only) vector of
> triggering this issue.
>

Now I understand why disabling swap helps in your case: the number of
stat updates would be reduced drastically, and the rstat flush would
happen asynchronously most of the time.

[...]
> I should mention that there are really two issues:
>
> 1. Expensive workingset_refault, which shows up on flamegraphs. We see
> it for our rocksdb based database, which persists data on xfs (local
> nvme).
> 2. Expensive workingset_refault that causes latency hiccups, but
> doesn't show up on flamegraphs. We see it in our nginx based proxy
> with swap enabled (either zram or regular file on xfs on local nvme).
>
> We solved the latter by disabling swap. I think the proper solution
> would be for workingset_refault to be fast enough to be invisible, in
> line with what was happening on Linux 5.10.

Thanks for the info. Is it possible to test
https://lore.kernel.org/all/20210929235936.2859271-1-shakeelb@google.com/ ?

If that patch does not help, then we either have to optimize rstat
flushing or further increase the update buffer, which is nr_cpus * 32.
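
To illustrate the update buffer mentioned above, here is a rough userspace
model in Python. It is only a sketch and not the kernel implementation: the
class UpdateBuffer and its methods are hypothetical names, and the only
figure taken from this thread is the buffer size of nr_cpus * 32. The idea
it shows is that a reader pays for a synchronous flush only when the number
of pending stat updates has outgrown the buffer; otherwise the periodic
asynchronous flush is assumed to have kept the stats fresh enough.

```python
from dataclasses import dataclass


@dataclass
class UpdateBuffer:
    """Toy model of batched stat updates (hypothetical, not kernel code)."""

    nr_cpus: int
    batch: int = 32        # per-CPU share, from the "nr_cpus * 32" figure
    pending: int = 0       # stat updates recorded since the last flush
    sync_flushes: int = 0  # how often a reader had to flush by itself

    @property
    def capacity(self) -> int:
        # Total number of updates tolerated before readers must flush.
        return self.nr_cpus * self.batch

    def record_update(self, count: int = 1) -> None:
        # Writer side: e.g. a stat change caused by a swap refault.
        self.pending += count

    def async_flush(self) -> None:
        # Periodic background flush; readers are not blocked by it.
        self.pending = 0

    def read_stats(self) -> bool:
        """Reader side; returns True if a synchronous flush was needed."""
        if self.pending > self.capacity:
            self.pending = 0
            self.sync_flushes += 1
            return True
        return False
```

In this model a low update rate (for example with swap disabled) rarely
pushes pending past capacity between two async_flush() calls, so
read_stats() almost never takes the synchronous path. Increasing batch has
the same effect at the cost of staler stats.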