From: Daniel Xu <dxu@dxuuu.xyz>
To: bpf@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org
Cc: Daniel Xu <dxu@dxuuu.xyz>,
	linux-kernel@vger.kernel.org, kernel-team@fb.com,
	jolsa@kernel.org, hannes@cmpxchg.org, yhs@fb.com
Subject: [RFC bpf-next 0/1] bpf: Add page cache iterator
Date: Wed,  7 Apr 2021 14:46:10 -0700
Message-ID: <cover.1617831474.git.dxu@dxuuu.xyz>

There is currently no way to answer the question: "What is in the page
cache?" There are various heuristics and counters, but nothing that can
tell you anything like:

  * 3M from /home/dxu/foo.txt
  * 5K from ...
  * etc.
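
As an illustration of the per-file heuristics that exist today, here is a
minimal userspace sketch using mincore(2). This is an assumption for
illustration only: the text above does not name a specific heuristic, and
resident_pages() is a hypothetical helper. It reports how many pages of
one already-known file are resident, which is exactly the limitation: you
must know which file to ask about, and it cannot enumerate the cache.

```c
#define _DEFAULT_SOURCE /* exposes mincore() on glibc under strict -std modes */
#include <fcntl.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: count how many pages of `path` are resident in
 * memory. Stores the file's total page count in *total_pages.
 * Returns the resident page count, or -1 on any error.
 */
long resident_pages(const char *path, long *total_pages)
{
	struct stat st;
	long page_size = sysconf(_SC_PAGESIZE);
	long resident = -1;
	int fd = open(path, O_RDONLY);

	if (fd < 0 || page_size <= 0)
		goto out_close;
	if (fstat(fd, &st) < 0)
		goto out_close;

	size_t len = (size_t)st.st_size;
	size_t npages = (len + (size_t)page_size - 1) / (size_t)page_size;

	*total_pages = (long)npages;
	if (npages == 0) {
		resident = 0; /* empty file: nothing to map */
		goto out_close;
	}

	void *map = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
	if (map == MAP_FAILED)
		goto out_close;

	/* mincore() fills one byte per page; bit 0 set means resident. */
	unsigned char *vec = malloc(npages);
	if (vec && mincore(map, len, vec) == 0) {
		resident = 0;
		for (size_t i = 0; i < npages; i++)
			resident += vec[i] & 1;
	}
	free(vec);
	munmap(map, len);

out_close:
	if (fd >= 0)
		close(fd);
	return resident;
}
```

Calling resident_pages("/home/dxu/foo.txt", &total) answers "how much of
this file is cached?", but not "what is in the page cache?".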

The answer to the question is particularly useful in the stacked
container world. Stacking containers means multiple containers run on
the same physical host. Memory is a precious resource on some (if not
most) of these systems, so it's useful to know how much duplicated data
is in the page cache. Once you know the answer, you can do something
about it. One possible technique would be to bind mount common items
from the root host into each container.

NOTES:

  * This patch compiles and (maybe) works, but it is not fully tested
    and not in a final state

  * I'm sending this early RFC to get comments on the general approach.
    I chatted with Johannes a little bit and it seems like the best way
    to do this is through superblock -> inode -> address_space iteration
    rather than going from NUMA node -> LRU iteration

  * I'll most likely add a page_hash() helper (or something) that hashes
    a page so that userspace can more easily tell which pages are
    duplicates
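
To make the page-hashing idea concrete, here is a minimal userspace
sketch of how per-page hashes could be used to spot duplicates. It is
not the proposed helper: page_hash_fnv1a() and count_duplicate_pages()
are hypothetical names, and FNV-1a is only an assumed stand-in for
whatever hash a real page_hash() would use. Equal hashes suggest, but do
not prove, equal contents, so a real tool would confirm matches by
comparing the bytes.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* 64-bit FNV-1a over `len` bytes (assumed stand-in for page_hash()). */
uint64_t page_hash_fnv1a(const unsigned char *page, size_t len)
{
	uint64_t h = 0xcbf29ce484222325ULL; /* FNV-1a offset basis */

	for (size_t i = 0; i < len; i++) {
		h ^= page[i];
		h *= 0x100000001b3ULL; /* FNV-1a prime */
	}
	return h;
}

static int cmp_u64(const void *a, const void *b)
{
	uint64_t x = *(const uint64_t *)a;
	uint64_t y = *(const uint64_t *)b;

	return (x > y) - (x < y);
}

/*
 * Treat `buf` as `npages` consecutive pages of `page_size` bytes and
 * count the pages whose hash repeats an already-seen hash.
 * Returns (size_t)-1 if memory allocation fails.
 */
size_t count_duplicate_pages(const unsigned char *buf, size_t npages,
			     size_t page_size)
{
	size_t dups = 0;
	uint64_t *hashes;

	if (npages == 0)
		return 0;

	hashes = malloc(npages * sizeof(*hashes));
	if (!hashes)
		return (size_t)-1;

	for (size_t i = 0; i < npages; i++)
		hashes[i] = page_hash_fnv1a(buf + i * page_size, page_size);

	/* After sorting, equal hashes are adjacent. */
	qsort(hashes, npages, sizeof(*hashes), cmp_u64);
	for (size_t i = 1; i < npages; i++)
		if (hashes[i] == hashes[i - 1])
			dups++;

	free(hashes);
	return dups;
}
```

Multiplying the result by the page size gives a rough upper bound on the
memory that deduplication (for example, via shared bind mounts) could
reclaim.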

Daniel Xu (1):
  bpf: Introduce iter_pagecache

 kernel/bpf/Makefile         |   2 +-
 kernel/bpf/pagecache_iter.c | 293 ++++++++++++++++++++++++++++++++++++
 2 files changed, 294 insertions(+), 1 deletion(-)
 create mode 100644 kernel/bpf/pagecache_iter.c

-- 
2.26.3

