From: Daniel Xu
To: bpf@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org
Cc: Daniel Xu, linux-kernel@vger.kernel.org, kernel-team@fb.com, jolsa@kernel.org, hannes@cmpxchg.org, yhs@fb.com
Subject: [RFC bpf-next 0/1] bpf: Add page cache iterator
Date: Wed, 7 Apr 2021 14:46:10 -0700

There is currently no way to answer the question: "What is in the page
cache?". There are various heuristics and counters, but nothing that can
tell you anything like:

  * 3M from /home/dxu/foo.txt
  * 5K from ...
  * etc.

The answer to the question is particularly useful in the stacked
container world. Stacking containers implies that multiple containers
run on the same physical host. Memory is a precious resource on some (if
not most) of these systems. On these systems, it's useful to know how
much duplicated data is in the page cache. Once you know the answer, you
can do something about it. One possible technique would be to bind mount
common items from the root host into each container.

NOTES:

  * This patch compiles and (maybe) works -- totally not fully tested
    or in a final state

  * I'm sending this early RFC to get comments on the general approach.
    I chatted w/ Johannes a little bit and it seems like the best way to
    do this is through superblock -> inode -> address_space iteration
    rather than going from NUMA node -> LRU iteration

  * I'll most likely add a page_hash() helper (or something) that hashes
    a page so that userspace can more easily tell which pages are
    duplicates

Daniel Xu (1):
  bpf: Introduce iter_pagecache

 kernel/bpf/Makefile         |   2 +-
 kernel/bpf/pagecache_iter.c | 293 ++++++++++++++++++++++++++++++++++++
 2 files changed, 294 insertions(+), 1 deletion(-)
 create mode 100644 kernel/bpf/pagecache_iter.c

-- 
2.26.3
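
To make the page_hash() note a bit more concrete, here is a rough
userspace sketch of how a consumer might estimate duplicated page cache
once it has one hash per cached page. Everything in it is an assumption
for illustration only: struct page_rec, duplicate_bytes(), and the fixed
4 KiB PAGE_SIZE_BYTES are hypothetical names, not part of the patch, and
how the hashes would reach userspace is left open. Equal hashes are
treated as identical contents, which ignores hash collisions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumption: 4 KiB pages; a real tool would query the page size. */
#define PAGE_SIZE_BYTES 4096u

/* Hypothetical record: one entry per cached page seen by the iterator. */
struct page_rec {
	uint64_t hash;	/* assumed output of a page_hash()-style helper */
};

/* Order records by hash so equal hashes end up adjacent. */
static int cmp_hash(const void *a, const void *b)
{
	uint64_t x = ((const struct page_rec *)a)->hash;
	uint64_t y = ((const struct page_rec *)b)->hash;

	return (x > y) - (x < y);
}

/*
 * Sort recs in place and return the number of bytes held by redundant
 * copies, i.e. every page after the first in each run of equal hashes.
 */
size_t duplicate_bytes(struct page_rec *recs, size_t n)
{
	size_t dup_pages = 0;

	assert(n == 0 || recs != NULL);
	if (n < 2)
		return 0;

	qsort(recs, n, sizeof(*recs), cmp_hash);
	for (size_t i = 1; i < n; i++) {
		if (recs[i].hash == recs[i - 1].hash)
			dup_pages++;
	}
	return dup_pages * PAGE_SIZE_BYTES;
}
```

For example, five pages where three share one hash would report two
redundant pages, or 8192 bytes that bind mounting a common copy could in
principle reclaim.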