Date: Wed, 16 Feb 2022 17:13:13 -0700
From: Yu Zhao
To: Hillf Danton
Cc: Johannes Weiner, linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH v7 05/12] mm: multigenerational LRU: minimal implementation
References: <20220208081902.3550911-1-yuzhao@google.com>
 <20220213100417.1183-1-hdanton@sina.com>
In-Reply-To: <20220213100417.1183-1-hdanton@sina.com>

On Sun, Feb 13, 2022 at 06:04:17PM +0800, Hillf Danton wrote:

Hi Hillf,

> On Tue, 8 Feb 2022 01:18:55 -0700 Yu Zhao wrote:
> > +
> > +/******************************************************************************
> > + *                          the aging
> > + ******************************************************************************/
> > +
> > +static int folio_inc_gen(struct lruvec *lruvec, struct folio *folio, bool reclaiming)
> > +{
> > +        unsigned long old_flags, new_flags;
> > +        int type = folio_is_file_lru(folio);
> > +        struct lru_gen_struct *lrugen = &lruvec->lrugen;
> > +        int new_gen, old_gen = lru_gen_from_seq(lrugen->min_seq[type]);
> > +
> > +        do {
> > +                new_flags = old_flags = READ_ONCE(folio->flags);
> > +                VM_BUG_ON_FOLIO(!(new_flags & LRU_GEN_MASK), folio);
> > +
> > +                new_gen = ((new_flags & LRU_GEN_MASK) >> LRU_GEN_PGOFF) - 1;
>
> Is the chance zero for deadloop if new_gen != old_gen?

No, because the counter is only cleared during isolation, and here it's
protected against isolation (under the LRU lock, which is asserted in
the lru_gen_balance_size() -> lru_gen_update_size() path).

> > +                new_gen = (old_gen + 1) % MAX_NR_GENS;
> > +
> > +                new_flags &= ~LRU_GEN_MASK;
> > +                new_flags |= (new_gen + 1UL) << LRU_GEN_PGOFF;
> > +                new_flags &= ~(LRU_REFS_MASK | LRU_REFS_FLAGS);
> > +                /* for folio_end_writeback() */
>
> /* for folio_end_writeback() and sort_folio() */ in terms of
> reclaiming?

Right.

> > +                if (reclaiming)
> > +                        new_flags |= BIT(PG_reclaim);
> > +        } while (cmpxchg(&folio->flags, old_flags, new_flags) != old_flags);
> > +
> > +        lru_gen_balance_size(lruvec, folio, old_gen, new_gen);
> > +
> > +        return new_gen;
> > +}
>
> ...
>
> > +/******************************************************************************
> > + *                          the eviction
> > + ******************************************************************************/
> > +
> > +static bool sort_folio(struct lruvec *lruvec, struct folio *folio, int tier_idx)
> > +{
>
> Nit, the 80-column-char format is preferred.

Will do.
> > +        bool success;
> > +        int gen = folio_lru_gen(folio);
> > +        int type = folio_is_file_lru(folio);
> > +        int zone = folio_zonenum(folio);
> > +        int tier = folio_lru_tier(folio);
> > +        int delta = folio_nr_pages(folio);
> > +        struct lru_gen_struct *lrugen = &lruvec->lrugen;
> > +
> > +        VM_BUG_ON_FOLIO(gen >= MAX_NR_GENS, folio);
> > +
> > +        if (!folio_evictable(folio)) {
> > +                success = lru_gen_del_folio(lruvec, folio, true);
> > +                VM_BUG_ON_FOLIO(!success, folio);
> > +                folio_set_unevictable(folio);
> > +                lruvec_add_folio(lruvec, folio);
> > +                __count_vm_events(UNEVICTABLE_PGCULLED, delta);
> > +                return true;
> > +        }
> > +
> > +        if (type && folio_test_anon(folio) && folio_test_dirty(folio)) {
> > +                success = lru_gen_del_folio(lruvec, folio, true);
> > +                VM_BUG_ON_FOLIO(!success, folio);
> > +                folio_set_swapbacked(folio);
> > +                lruvec_add_folio_tail(lruvec, folio);
> > +                return true;
> > +        }
> > +
> > +        if (tier > tier_idx) {
> > +                int hist = lru_hist_from_seq(lrugen->min_seq[type]);
> > +
> > +                gen = folio_inc_gen(lruvec, folio, false);
> > +                list_move_tail(&folio->lru, &lrugen->lists[gen][type][zone]);
> > +
> > +                WRITE_ONCE(lrugen->promoted[hist][type][tier - 1],
> > +                           lrugen->promoted[hist][type][tier - 1] + delta);
> > +                __mod_lruvec_state(lruvec, WORKINGSET_ACTIVATE_BASE + type, delta);
> > +                return true;
> > +        }
> > +
> > +        if (folio_test_locked(folio) || folio_test_writeback(folio) ||
> > +            (type && folio_test_dirty(folio))) {
> > +                gen = folio_inc_gen(lruvec, folio, true);
> > +                list_move(&folio->lru, &lrugen->lists[gen][type][zone]);
> > +                return true;
>
> Make the cold dirty page cache younger instead of writeout in the background
> reclaimer context, and the question arising is whether laundry is deferred
> until the flusher threads are woken up in the following patches.

This is a good point. In contrast to the active/inactive LRU, MGLRU
doesn't write out dirty file pages from the reclaim path (kswapd or
direct reclaimers) -- that is writeback's job and it should be better
at doing it. In fact, commit 21b4ee7029 ("xfs: drop ->writepage
completely") has disabled dirty file page writeouts in the reclaim
path completely.

Reclaim indirectly wakes up writeback after clean file pages drop below
a threshold (dirty ratio). However, dirty pages might be undercounted
on a system that uses a large number of mmapped file pages. MGLRU
compensates for this by calling folio_mark_dirty() on pages mapped by
dirty PTEs when scanning page tables. (Why not, since it's already
looking at the accessed bit?)

The commit above explained this design choice from the performance
aspect. From the implementation aspect, it also creates a boundary
between reclaim and writeback. This simplifies things, e.g., the
PageWriteback() check in shrink_page_list() is no longer relevant for
MGLRU, and neither is the top half of the PageDirty() check.
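To illustrate that last point, here is a condensed sketch of what the
page table scan does with the dirty bit. Hedged: the helper name and
the exact condition below are mine for illustration, not a hunk from
this series.

/*
 * Illustrative sketch, not the series' exact code: while the aging
 * walks a PTE range looking at the accessed bit, it also transfers
 * the PTE dirty bit to the folio, so dirty mmapped file pages show up
 * in writeback's dirty accounting instead of having to be laundered
 * from the reclaim path.
 */
static void sketch_update_folio(pte_t pte, struct folio *folio)
{
        if (!pte_young(pte))
                return;

        /* ... the walk promotes the folio to the youngest generation here ... */

        /* dirty anon pages outside the swap cache have nothing to flush yet */
        if (pte_dirty(pte) && !folio_test_dirty(folio) &&
            !(folio_test_anon(folio) && folio_test_swapbacked(folio) &&
              !folio_test_swapcache(folio)))
                folio_mark_dirty(folio);
}

The flusher threads themselves are still woken by the existing
dirty-ratio mechanism; the scan mainly makes sure mmapped dirty pages
are counted toward it.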