From: Hugh Dickins <hughd@google.com>
To: Michal Hocko <mhocko@suse.cz>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
	Mel Gorman <mgorman@suse.de>, Minchan Kim <minchan@kernel.org>,
	Rik van Riel <riel@redhat.com>, Ying Han <yinghan@google.com>,
	Greg Thelen <gthelen@google.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Fengguang Wu <fengguang.wu@intel.com>
Subject: Re: [PATCH v2 -mm] memcg: prevent from OOM with too many dirty pages
Date: Mon, 16 Jul 2012 01:30:48 -0700 (PDT)
Message-ID: <alpine.LSU.2.00.1207160111280.3936@eggly.anvils>
In-Reply-To: <20120713082150.GA1448@tiehlicka.suse.cz>

On Fri, 13 Jul 2012, Michal Hocko wrote:
> On Thu 12-07-12 15:42:53, Hugh Dickins wrote:
> > On Thu, 12 Jul 2012, Andrew Morton wrote:
> > > 
> > > I wasn't planning on 3.5, given the way it's been churning around.
> > 
> > I don't know if you had been intending to send it in for 3.5 earlier;
> > but I'm sorry if my late intervention on may_enter_fs has delayed it.
> 
> Well I should investigate more when the question came up...
>  
> > > How about we put it into 3.6 and tag it for a -stable backport, so
> > > it gets a bit of a run in mainline before we inflict it upon -stable
> > > users?
> > 
> > That sounds good enough to me, but does fall short of Michal's hope.
> 
> I would be happier if it went into 3.5 already because the problem (OOM
> on too many dirty pages) is real and long term (basically since ever).
> We have the patch in SLES11-SP2 for quite some time (the original one
> with the may_enter_fs check) and it helped a lot.
> The patch was designed as a band aid primarily because it is very simple
> that way and with a hope that the real fix will come later.
> The decision is up to you Andrew, but I vote for pushing it as soon as
> possible and try to come up with something more clever for 3.6.

Once I got around to trying dd in a memcg to a filesystem on a USB
stick, yes, I very much agree that the problem is real and well worth
fixing, and that your patch takes us most of the way there.

But Andrew's caution has proved to be well founded: in the last
few days I've found several problems with it.

I guess it makes more sense to go into detail in the patch I'm about
to send, fixing up what is (I think) currently in mmotm.

But in brief: my insistence on may_enter_fs actually took us backwards
on ext4, because that does __GFP_NOFS page allocations when writing.
I still don't understand how this showed up in none of my testing at
the end of the week, and only hit me today (er, yesterday).  But it is
not as big a problem as I thought at first, because loop also turns off
__GFP_IO, so we can check __GFP_IO instead of may_enter_fs.

And though I found your patch works most of the time, one in five
or ten attempts would OOM just as before: we actually have a problem
also with PageWriteback pages which are not PageReclaim, but the
answer is to mark those PageReclaim.

Patch follows separately in a moment.  I'm pretty happy with it now,
but I've not yet tried xfs, btrfs, vfat, tmpfs.  I notice now that
you specifically describe testing on ext3, but don't mention ext4:
I wonder if you got bogged down in the problems I've fixed on that.

Hugh
