From: Andrea Righi <andrea@betterlinux.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: "Minchan Kim" <minchan.kim@gmail.com>,
	"Peter Zijlstra" <a.p.zijlstra@chello.nl>,
	"Johannes Weiner" <jweiner@redhat.com>,
	"KAMEZAWA Hiroyuki" <kamezawa.hiroyu@jp.fujitsu.com>,
	"KOSAKI Motohiro" <kosaki.motohiro@jp.fujitsu.com>,
	"Rik van Riel" <riel@redhat.com>,
	"Hugh Dickins" <hughd@google.com>,
	"Alexander Viro" <viro@zeniv.linux.org.uk>,
	"Shaohua Li" <shaohua.li@intel.com>,
	"Pádraig Brady" <P@draigBrady.com>,
	"John Stultz" <john.stultz@linaro.org>,
	"Jerry James" <jamesjer@betterlinux.com>,
	"Julius Plenz" <julius@plenz.com>,
	"Greg Thelen" <gthelen@google.com>, linux-mm <linux-mm@kvack.org>,
	linux-fsdevel@vger.kernel.org,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [RFC] [PATCH v5 0/3] fadvise: support POSIX_FADV_NOREUSE
Date: Wed, 15 Feb 2012 02:35:24 +0100
Message-ID: <20120215012957.GA1728@thinkpad>
In-Reply-To: <20120214152220.4f621975.akpm@linux-foundation.org>

On Tue, Feb 14, 2012 at 03:22:20PM -0800, Andrew Morton wrote:
> On Tue, 14 Feb 2012 23:59:22 +0100
> Andrea Righi <andrea@betterlinux.com> wrote:
> 
> > On Tue, Feb 14, 2012 at 01:33:37PM -0800, Andrew Morton wrote:
> > > On Sun, 12 Feb 2012 01:21:35 +0100
> > > Andrea Righi <andrea@betterlinux.com> wrote:
> > > 
> > > > The new proposal is to implement POSIX_FADV_NOREUSE as a way to perform a real
> > > > drop-behind policy where applications can mark certain intervals of a file as
> > > > FADV_NOREUSE before accessing the data.
> > > 
> > > I think you and John need to talk to each other, please.  The amount of
> > > duplication here is extraordinary.
> > 
> > Yes, definitely. I'm currently reviewing and testing John's patch
> > set. I was even considering applying my patch set on top of John's
> > patch, or at least proposing my tree-based approach to manage the list
> > of the POSIX_FADV_VOLATILE ranges.
> 
> Cool.
> 
> > > 
> > > Both patchsets add fields to the address_space (and hence inode), which
> > > is significant - we should convince ourselves that we're getting really
> > > good returns from a feature which does this.
> > > 
> > > 
> > > 
> > > Regarding the use of fadvise(): I suppose it's a reasonable thing to do
> > > in the long term - if the feature works well, popular data streaming
> > > applications will eventually switch over.  But I do think we should
> > > explore interfaces which don't require modification of userspace source
> > > code.  Because there will always be unconverted applications, and the
> > > feature becomes available immediately.
> > > 
> > > One such interface would be to toss the offending application into a
> > > container which has a modified drop-behind policy.  And here we need to
> > > drag out the crystal ball: what *is* the best way of tuning application
> > > pagecache behaviour?  Will we gravitate towards containerization, or
> > > will we gravitate towards finer-tuned fadvise/sync_page_range/etc
> > > behaviour?  Thus far it has been the latter, and I don't think that has
> > > been a great success.
> > > 
> > > Finally, are the problems which prompted these patchsets already
> > > solved?  What happens if you take the offending streaming application
> > > and toss it into a 16MB memcg?  That *should* avoid perturbing other
> > > things running on that machine.
> > 
> > Moving the streaming application into a 16MB memcg can be dangerous in
> > some cases... the application might start to do "bad" things, like
> > swapping (if the memcg can swap) or just fail due to OOMs.
> 
> Well OK, maybe there are problems with the current implementation.  But
> are they unfixable problems?  Is the right approach to give up on ever
> making containers useful for this application and to instead go off and
> implement a new and separate feature?
> 
> > > And yes, a container-based approach is pretty crude, and one can
> > > envision applications which only want modified reclaim policy for one
> > > particular file.  But I suspect an application-wide reclaim policy
> > > solves 90% of the problems.
> > 
> > I really like the container-based approach. But for this we need
> > better file cache control in the memory cgroup; right now we have
> > accounting of file pages, but there's no way to limit them.
> 
> Again, if/when memcg becomes sufficiently useful for this application
> we're left maintaining the obsolete POSIX_FADV_NOREUSE forever.

Yes, totally agree. For the future a memcg-based solution is probably
the best way to go.

This reminds me of the old per-memcg dirty memory discussion
(http://thread.gmane.org/gmane.linux.kernel.mm/67114), cc'ing Greg.

Maybe the generic feature that could solve both problems is better
file cache isolation in memcg.
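
As a rough illustration of the container approach discussed above, the
sketch below confines a process to a memory cgroup with a fixed limit.
It assumes the cgroup v1 memory controller and its memory.limit_in_bytes
and tasks control files. The helper names write_value() and
confine_to_memcg() are hypothetical and are not part of any patch in
this thread.

```c
#include <assert.h>
#include <stdio.h>

/* Write one integer value into <dir>/<file>; returns 0 on success. */
static int write_value(const char *dir, const char *file, long long value)
{
    char path[4096];
    int n = snprintf(path, sizeof(path), "%s/%s", dir, file);
    if (n < 0 || (size_t)n >= sizeof(path))
        return -1;

    FILE *f = fopen(path, "w");
    if (!f)
        return -1;
    int ok = fprintf(f, "%lld\n", value) > 0;
    /* fclose() flushes the buffer, so a rejected write shows up here. */
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}

/*
 * Hypothetical helper: set the memory limit of an existing cgroup
 * directory (assumed cgroup v1 layout) and move `pid` into it.
 */
int confine_to_memcg(const char *cgroup_dir, long long limit_bytes, long pid)
{
    assert(cgroup_dir != NULL && limit_bytes > 0 && pid > 0);
    if (write_value(cgroup_dir, "memory.limit_in_bytes", limit_bytes) != 0)
        return -1;
    return write_value(cgroup_dir, "tasks", pid);
}
```

For example, confine_to_memcg("/sys/fs/cgroup/memory/stream", 16LL << 20,
(long)getpid()) would request the 16MB limit mentioned above, assuming
that group directory already exists at that (assumed) mount point and
the caller may write to it. The limit covers all memory charged to the
group, not only file cache, which is why swapping or OOM kills remain
possible.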

Thanks,
-Andrea
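
For readers unfamiliar with the userspace side of the proposal quoted at
the top of this message, here is a minimal sketch of how a streaming
reader might mark a file as POSIX_FADV_NOREUSE before accessing the data.
The function name stream_once() is hypothetical. As far as I know,
mainline kernels of that era accepted POSIX_FADV_NOREUSE but treated it
as a no-op, so the sketch also issues POSIX_FADV_DONTNEED on each chunk
after reading it; that fallback is an assumption of this example, not
something taken from the patch set.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper: read a file once from start to end while
 * hinting that its pages will not be reused.
 * Returns the number of bytes read, or -1 on error.
 */
long long stream_once(const char *path)
{
    char buf[64 * 1024];
    long long total = 0;
    ssize_t n;

    assert(path != NULL);
    int fd = open(path, O_RDONLY);
    if (fd < 0) {
        perror(path);
        return -1;
    }

    /* Advisory only: len 0 means "to end of file"; errors are ignored. */
    (void)posix_fadvise(fd, 0, 0, POSIX_FADV_NOREUSE);

    while ((n = read(fd, buf, sizeof(buf))) > 0) {
        /* Assumed fallback: ask the kernel to drop the chunk just read. */
        (void)posix_fadvise(fd, (off_t)total, (off_t)n, POSIX_FADV_DONTNEED);
        total += n;
    }
    close(fd);
    return n < 0 ? -1 : total;
}
```

Both calls are hints: the return value of posix_fadvise() is ignored
here because the read itself is still correct if the kernel declines
the advice.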
