From: Yu Zhao
Date: Tue, 22 Mar 2022 02:14:04 -0600
Subject: Re: [PATCH v9 11/14] mm: multi-gen LRU: thrashing prevention
To: Barry Song <21cnbao@gmail.com>
Cc: Andrew Morton, Linus Torvalds, Andi Kleen, Aneesh Kumar,
 Catalin Marinas, Dave Hansen, Hillf Danton, Jens Axboe, Jesse Barnes,
 Johannes Weiner, Jonathan Corbet, Matthew Wilcox, Mel Gorman,
 Michael Larabel, Michal Hocko, Mike Rapoport, Rik van Riel,
 Vlastimil Babka, Will Deacon, Ying Huang, LAK, Linux Doc Mailing List,
 LKML, Linux-MM, Kernel Page Reclaim v2, x86, Brian Geffon,
 Jan Alexander Steffens, Oleksandr Natalenko, Steven Barrett,
 Suleiman Souhlal, Daniel Byrne, Donald Carr, Holger Hoffstätte,
 Konstantin Kharlamov, Shuang Zhai, Sofia Trinh, Vaibhav Jain
References: <20220309021230.721028-1-yuzhao@google.com>
 <20220309021230.721028-12-yuzhao@google.com>

On Tue, Mar 22, 2022 at 1:23 AM Barry Song <21cnbao@gmail.com> wrote:
>
> On Wed, Mar 9, 2022 at 3:48 PM Yu Zhao wrote:
> >
> > Add /sys/kernel/mm/lru_gen/min_ttl_ms for thrashing prevention, as
> > requested by many desktop users [1].
> >
> > When set to value N, it prevents the working set of N milliseconds
> > from getting evicted. The OOM killer is triggered if this working set
> > cannot be kept in memory. Based on the average human detectable lag
> > (~100ms), N=1000 usually eliminates intolerable lags due to thrashing.
> > Larger values like N=3000 make lags less noticeable at the risk of
> > premature OOM kills.
> >
> > Compared with the size-based approach, e.g., [2], this time-based
> > approach has the following advantages:
> > 1. It is easier to configure because it is agnostic to applications
> >    and memory sizes.
> > 2. It is more reliable because it is directly wired to the OOM killer.
> >
>
> how are userspace oom daemons like android lmkd, systemd-oomd supposed
> to work with this time-based oom killer?
> only one of min_ttl_ms and userspace daemon should be enabled? or both
> should be enabled at the same time?

Generally we just need one. lmkd and oomd are more flexible, but
1) they need customizations, 2) not all distros have them, and
3) they might be stuck in direct reclaim as well.

The last point is not just a theoretical problem:
a) We had many servers under such heavy (global) memory pressure that
   200+ direct reclaimers on each CPU competed for resources and
   userspace livelocked for 2 hours. Eventually hardware watchdogs
   kicked in.
b) On Chromebooks we have something similar to lmkd, and we still
   frequently observe crashes due to heavy memory pressure, meaning
   some Chrome tabs were stuck in direct reclaim for 120 seconds
   (hung_task_timeout_secs=120).
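
For anyone who wants to try the knob from the quoted patch description, here is a minimal sketch of setting it from a script. Only the path /sys/kernel/mm/lru_gen/min_ttl_ms and the example value N=1000 come from the description; the helper names set_min_ttl_ms and get_min_ttl_ms are hypothetical, and the sketch assumes the file accepts a plain decimal number of milliseconds.

```python
from pathlib import Path

# Path taken from the quoted patch description.
MIN_TTL_PATH = Path("/sys/kernel/mm/lru_gen/min_ttl_ms")


def set_min_ttl_ms(ttl_ms: int, path: Path = MIN_TTL_PATH) -> None:
    """Write a TTL in milliseconds to the knob (hypothetical helper).

    Assumes the file takes a plain decimal integer. Writing to the real
    sysfs file normally requires root.
    """
    if ttl_ms < 0:
        raise ValueError("ttl_ms must be non-negative")
    path.write_text(f"{ttl_ms}\n")


def get_min_ttl_ms(path: Path = MIN_TTL_PATH) -> int:
    """Read the current TTL in milliseconds back (hypothetical helper)."""
    return int(path.read_text().strip())
```

Calling set_min_ttl_ms(1000) as root would correspond to the N=1000 case above; the path argument exists only so the helpers can be pointed at a scratch file for testing.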