From: Andrew Morton <akpm@linux-foundation.org>
To: Nhat Pham <nphamcs@gmail.com>
Cc: hannes@cmpxchg.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, bfoster@redhat.com,
	willy@infradead.org, arnd@arndb.de, linux-api@vger.kernel.org,
	kernel-team@meta.com
Subject: Re: [PATCH v11 0/3] cachestat: a new syscall for page cache state of files
Date: Tue, 14 Mar 2023 16:00:41 -0700
Message-ID: <20230314160041.960ede03d5f5ff3dbb3e3fd0@linux-foundation.org>
In-Reply-To: <20230308032748.609510-1-nphamcs@gmail.com>

On Tue,  7 Mar 2023 19:27:45 -0800 Nhat Pham <nphamcs@gmail.com> wrote:

> There is currently no good way to query the page cache state of large
> file sets and directory trees. There is mincore(), but it scales poorly:
> the kernel writes out a lot of bitmap data that userspace has to
> aggregate, when the user really does not care about per-page information
> in that case. The user also needs to mmap and unmap each file as it goes
> along, which can be quite slow as well.
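
For reference, the mincore() pattern described in the quoted paragraph
looks roughly like the sketch below.  It is an illustration only, not
code from the patch series: the helper name count_resident_pages() is
hypothetical, and error handling is kept to a minimum.  It shows the
per-file mmap/munmap and the per-page byte vector that userspace has to
sum up just to get a single number.

```c
#define _DEFAULT_SOURCE		/* expose mincore() in glibc headers */
#include <fcntl.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: count how many pages of a file are resident in
 * the page cache using mmap() + mincore().  Returns -1 on error.
 */
static long count_resident_pages(const char *path)
{
	struct stat st;
	long resident = -1;
	int fd = open(path, O_RDONLY);

	if (fd < 0)
		return -1;
	if (fstat(fd, &st) < 0) {
		close(fd);
		return -1;
	}
	if (st.st_size == 0) {	/* nothing to map */
		close(fd);
		return 0;
	}

	size_t len = (size_t)st.st_size;
	size_t page_size = (size_t)sysconf(_SC_PAGESIZE);
	size_t npages = (len + page_size - 1) / page_size;

	/* Every file has to be mapped, queried and unmapped in turn. */
	void *map = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
	close(fd);
	if (map == MAP_FAILED)
		return -1;

	/* The kernel fills in one byte per page; bit 0 means "resident". */
	unsigned char *vec = malloc(npages);
	if (vec && mincore(map, len, vec) == 0) {
		resident = 0;
		for (size_t i = 0; i < npages; i++)
			resident += vec[i] & 1;
	}

	free(vec);
	munmap(map, len);
	return resident;
}
```

For a large file the vector alone is one byte per page, and all of it
is thrown away once the sum has been computed.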

A while ago I asked about the security implications: could cachestat()
be used to figure out what parts of a file another user is reading?
This also applies to mincore(), but cachestat() newly permits user A to
work out which parts of a file user B has *written* to.

I don't recall seeing a response to this, and there is no discussion in
the changelogs.


Secondly, I'm not seeing a description of any use cases.  OK, it's faster
and better than mincore(), but who cares?  In other words, what
end-user value compels us to add this feature to Linux?


>    struct cachestat {
>        __u64 nr_cache;
>        __u64 nr_dirty;
>        __u64 nr_writeback;
>        __u64 nr_evicted;
>        __u64 nr_recently_evicted;
>    };

And these fields are really getting into the weedy details of internal
kernel implementation.  Bear in mind that we must support this API
forever.

Particularly the "evicted" things.  The workingset code was implemented
eight years ago, which is actually relatively recent.  It could be that
eight years from now it will have been removed and possibly replaced
with something else.  Then what do we do?

For these reasons, and because of the lack of enthusiasm I have seen
from others, I don't think a case has yet been made for the addition of
this new syscall.
