From: Dave Hansen <dave.hansen@intel.com>
To: George Amvrosiadis <gamvrosi@gmail.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Al Viro <viro@zeniv.linux.org.uk>,
	linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-mm@kvack.org
Subject: Re: [PATCH 0/3] new feature: monitoring page cache events
Date: Thu, 28 Jul 2016 14:02:45 -0700
Message-ID: <579A72F5.10808@intel.com>
In-Reply-To: <cover.1469489884.git.gamvrosi@gmail.com>

On 07/25/2016 08:47 PM, George Amvrosiadis wrote:
>  21 files changed, 2424 insertions(+), 1 deletion(-)

I like the idea, but yikes, that's a lot of code.

Have you considered using or augmenting the kernel's existing tracing
mechanisms?  Have you considered using something like netlink for
transporting the data out of the kernel?

The PageDirty() hooks look simple but turn out to be horribly deep.
Where we used to have a plain old bit set, we now have new locks,
potentially long periods of irq disabling, and loops over all the tasks
doing duet, even path lookup!

Given a big system, I would imagine these locks slowing down
SetPageDirty() and things like write() pretty severely.  Have you done
an assessment of the performance impact of this change?  I can't
imagine this being used in any kind of performance- or
scalability-sensitive environment.

The current tracing code has a model where the trace producers put data
in *one* place, then all the multiple consumers pull it out of that
place.  Duet seems to have the model that the producer puts the data in
multiple places and consumers consume it from their own private copies.
 That seems a bit backwards and puts cost directly into hot code paths.
 Even a single task watching a single file on the system makes everyone
go in and pay some of this cost for every SetPageDirty().
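
A toy userspace sketch of the two models described above may make the
cost difference concrete.  It is not duet or ftrace code: the class
names (SharedBuffer, PerConsumerQueues) and the hot_path_cost() helper
are made up for illustration, and "producer_ops" just counts appends
as a stand-in for work done in the hot path.

```python
from collections import deque


class SharedBuffer:
    """Tracing-style model: the producer appends once; consumers keep cursors."""

    def __init__(self):
        self.events = []
        self.producer_ops = 0  # stand-in for work done on the hot path

    def emit(self, event):
        self.events.append(event)
        self.producer_ops += 1  # one append, regardless of consumer count

    def read_from(self, cursor):
        """Return the events after `cursor` plus the new cursor (consumer-side cost)."""
        return self.events[cursor:], len(self.events)


class PerConsumerQueues:
    """Fan-out model: the producer copies every event into each watcher's queue."""

    def __init__(self, n_consumers):
        self.queues = [deque() for _ in range(n_consumers)]
        self.producer_ops = 0

    def emit(self, event):
        for q in self.queues:  # loop over all watchers on the hot path
            q.append(event)
            self.producer_ops += 1


def hot_path_cost(n_consumers, n_events):
    """Return (shared_ops, fanout_ops) spent by the producer for the same events."""
    shared = SharedBuffer()
    fanout = PerConsumerQueues(n_consumers)
    for i in range(n_events):
        shared.emit(("dirty", i))
        fanout.emit(("dirty", i))
    return shared.producer_ops, fanout.producer_ops


if __name__ == "__main__":
    print(hot_path_cost(n_consumers=8, n_events=1000))  # (1000, 8000)
```

In this model the shared buffer's producer cost stays flat as consumers
are added, while the fan-out cost grows with the number of watchers.
The sketch ignores locking and irq disabling, which would only add to
the fan-out side.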

Let's say we had a big system with virtually everything sitting in the
page cache.  Does duet have a way to find things currently _in_ the
cache, or only when things move in/out of it?

Tasks seem to have a fixed 'struct path' ->regpath at duet_task_init()
time.  The code goes page->mapping->inode->i_dentry and then tries to
compare that with the originally recorded path.  Does this even work in
the face of things like bind mounts, mounts that change after
duet_task_init(), or mounting a fs with a different superblock
underneath a watched path?  It seems awfully fragile.
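
The general hazard of matching by path can be shown from userspace,
assuming a filesystem that supports hard links.  This is only an
analogy: a hard link stands in for a bind mount (which needs
privileges to set up), and the helper names under_registered_path(),
same_object() and demo() are hypothetical, not anything from the duet
patches.

```python
import os
import tempfile


def under_registered_path(path, regpath):
    """Path-string check: is `path` lexically below `regpath`?"""
    path = os.path.abspath(path)
    regpath = os.path.abspath(regpath)
    return os.path.commonpath([path, regpath]) == regpath


def same_object(a, b):
    """Identity check: same device and inode number."""
    sa, sb = os.stat(a), os.stat(b)
    return (sa.st_dev, sa.st_ino) == (sb.st_dev, sb.st_ino)


def demo():
    with tempfile.TemporaryDirectory() as root:
        watched = os.path.join(root, "watched")
        other = os.path.join(root, "other")
        os.mkdir(watched)
        os.mkdir(other)

        inside = os.path.join(watched, "data")
        alias = os.path.join(other, "data-alias")
        with open(inside, "w") as f:
            f.write("x")
        os.link(inside, alias)  # second name for the same inode

        return (
            under_registered_path(inside, watched),  # name under the watched dir
            under_registered_path(alias, watched),   # alias outside the watched dir
            same_object(inside, alias),              # yet both name one inode
        )


if __name__ == "__main__":
    print(demo())  # (True, False, True)
```

The same inode is reachable through a name that a path comparison
against the registered directory would reject, which is the kind of
mismatch the questions above are probing.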
