From: David Howells
Organization: Red Hat UK Ltd. Registered Address: Red Hat UK Ltd,
	Amberley Place, 107-111 Peascod Street, Windsor, Berkshire,
	SI4 1TE, United Kingdom. Registered in England and Wales under
	Company Registration No. 3798903
To: Johannes Weiner
Cc: dhowells@redhat.com, Matthew Wilcox, Linus Torvalds,
	linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
	linux-kernel@vger.kernel.org, Andrew Morton
Subject: Re: [GIT PULL] Memory folios for v5.15
Date: Thu, 26 Aug 2021 09:58:06 +0100
Message-ID: <2101397.1629968286@warthog.procyon.org.uk>
X-Mailing-List: linux-kernel@vger.kernel.org

Johannes Weiner wrote:

> But we're here in part because the filesystems have been too exposed
> to the backing memory implementation details. So all I'm saying is, if
> you're touching all the file cache interface now anyway, why not use
> the opportunity to properly disconnect it from the reality of pages,
> instead of making the compound page the new interface for filesystems.
>
> What's wrong with the idea of a struct cache_entry

Well, the name's already taken, though only in cifs.  And we have a *lot* of
caches so just calling it "cache_entry" is kind of unspecific.

> which can be
> embedded wherever we want: in a page, a folio or a pageset. Or in the
> future allocated on demand for [...]
> actually have it be just a cache entry for the fs to read and write,
> not also a compound page and an anon page etc. all at the same time.
>
> Even today that would IMO delineate more clearly between the file
> cache data plane and the backing memory plane. It doesn't get in the
> way of also fixing the base-or-compound mess inside MM code with
> folio/pageset, either.

One thing I like about Willy's folio concept is that, as long as everyone uses
the proper accessor functions and macros, we can mostly ignore the fact that
they're 2^N sized/aligned and they're composed of exact multiples of pages.
What really matters are the correspondences between folio size/alignment and
medium/IO size/alignment, so you could look on the folio as being a tool to
disconnect the filesystem from the concept of pages.

We could, in the future, in theory, allow the internal implementation of a
folio to shift from being a page array to being a kmalloc'd page list or
allow higher order units to be mixed in.  The main thing we have to stop
people from doing is directly accessing the members of the struct.

There are some tricky bits: kmap and mmapped page handling, for example.  Some
of this can be mitigated by making iov_iters handle folios (the ITER_XARRAY
type does, for example) and providing utilities to populate scatterlists.

David