linux-fsdevel.vger.kernel.org archive mirror
From: Theodore Ts'o <tytso@mit.edu>
To: Jan Kara <jack@suse.cz>
Cc: linux-fsdevel@vger.kernel.org, Michal Hocko <mhocko@kernel.org>,
	linux-mm@kvack.org
Subject: Re: [RFC PATCH] mm: retry writepages() on ENOMEM when doing an data integrity writeback
Date: Wed, 15 Mar 2017 10:09:27 -0400
Message-ID: <20170315140927.g5ylzcbxrvjqune3@thunk.org>
In-Reply-To: <20170315115933.GF12989@quack2.suse.cz>

On Wed, Mar 15, 2017 at 12:59:33PM +0100, Jan Kara wrote:
> > +	while (1) {
> > +		if (mapping->a_ops->writepages)
> > +			ret = mapping->a_ops->writepages(mapping, wbc);
> > +		else
> > +			ret = generic_writepages(mapping, wbc);
> > +		if ((ret != ENOMEM) || (wbc->sync_mode != WB_SYNC_ALL))
> 
> -ENOMEM I guess...

Oops.  Thanks for noticing!

Unless anyone has any objections I plan to carry this in the ext4
tree.

						- Ted

From 063312672cf277b12e337e91309672499bc797f7 Mon Sep 17 00:00:00 2001
From: Theodore Ts'o <tytso@mit.edu>
Date: Tue, 14 Mar 2017 21:13:04 -0400
Subject: [PATCH -v2] mm: retry writepages() on ENOMEM when doing an data integrity writeback

Currently, a file system's writepages() function must not fail with
ENOMEM, since if it does, it's possible for buffered data to be lost.
This is because on a data integrity writeback writepages() gets called
only once, and if it returns ENOMEM, then if you're lucky the error will
get reflected back to the userspace process calling fsync().  If you
aren't lucky, the user is unmounting the file system, and the dirty
pages will simply be lost.

For this reason, file system code generally will use GFP_NOFS, and in
some cases, will retry the allocation in a loop, on the theory that
"kernel livelocks are temporary; data loss is forever".
Unfortunately, this can indeed cause livelocks, since inside the
writepages() call, the file system is holding various mutexes, and
these mutexes may prevent the OOM killer from killing its targeted
victim if the victim is also holding on to those mutexes.

A better solution would be to allow writepages() to call the memory
allocator with flags that give greater latitude to the allocator to
fail, and then release its locks and return ENOMEM.  In the case of
background writeback, the writes can be retried at a later time.  In
the case of data-integrity writeback, do_writepages() retries after
waiting a brief amount of time.

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
---
 mm/page-writeback.c | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 290e8b7d3181..c623cef68a8e 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -2352,10 +2352,16 @@ int do_writepages(struct address_space *mapping, struct writeback_control *wbc)
 
 	if (wbc->nr_to_write <= 0)
 		return 0;
-	if (mapping->a_ops->writepages)
-		ret = mapping->a_ops->writepages(mapping, wbc);
-	else
-		ret = generic_writepages(mapping, wbc);
+	while (1) {
+		if (mapping->a_ops->writepages)
+			ret = mapping->a_ops->writepages(mapping, wbc);
+		else
+			ret = generic_writepages(mapping, wbc);
+		if ((ret != -ENOMEM) || (wbc->sync_mode != WB_SYNC_ALL))
+			break;
+		cond_resched();
+		congestion_wait(BLK_RW_ASYNC, HZ/50);
+	}
 	return ret;
 }
 
-- 
2.11.0.rc0.7.gbe5a750
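
For readers who want to poke at the control flow outside the kernel,
here is a minimal userspace sketch of the retry loop in the patch
above.  It is an illustration only, not kernel code: every sim_* name
is hypothetical, sim_writepages() stands in for ->writepages() by
failing with -ENOMEM a fixed number of times, and a 20 ms nanosleep()
stands in for cond_resched() plus congestion_wait(BLK_RW_ASYNC, HZ/50).

```c
#define _POSIX_C_SOURCE 200809L	/* for nanosleep() */
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical stand-ins for WB_SYNC_NONE / WB_SYNC_ALL. */
enum sim_sync_mode { SIM_SYNC_NONE, SIM_SYNC_ALL };

/* Hypothetical mapping: fails with -ENOMEM a fixed number of times. */
struct sim_mapping {
	int enomem_failures_left;	/* remaining simulated failures */
	int calls;			/* how often writepages was invoked */
};

/* Simulated ->writepages(): -ENOMEM until the failure budget is used up. */
static int sim_writepages(struct sim_mapping *m)
{
	assert(m != NULL);
	m->calls++;
	if (m->enomem_failures_left > 0) {
		m->enomem_failures_left--;
		return -ENOMEM;
	}
	return 0;
}

/* Userspace stand-in for the kernel's brief wait: sleep about 20 ms. */
static void sim_backoff(void)
{
	struct timespec ts = { .tv_sec = 0, .tv_nsec = 20 * 1000 * 1000 };

	nanosleep(&ts, NULL);
}

/*
 * Same shape as the patched do_writepages(): retry only when the error
 * is -ENOMEM and the caller asked for data-integrity writeback.
 */
static int sim_do_writepages(struct sim_mapping *m, enum sim_sync_mode mode)
{
	int ret;

	while (1) {
		ret = sim_writepages(m);
		if (ret != -ENOMEM || mode != SIM_SYNC_ALL)
			break;
		sim_backoff();
	}
	return ret;
}
```

With a mapping set up to fail three times, sim_do_writepages() in
SIM_SYNC_ALL mode calls sim_writepages() four times and returns 0; in
SIM_SYNC_NONE mode it returns -ENOMEM after a single call, leaving the
retry to a later background pass.  Like the patch, the sketch places no
upper bound on the number of retries.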


Thread overview: 25+ messages
2017-03-05 13:35 [PATCH 0/3] mm/fs: get PG_error out of the writeback reporting business Jeff Layton
2017-03-05 13:35 ` [PATCH 1/3] nilfs2: set the mapping error when calling SetPageError on writeback Jeff Layton
2017-03-07 13:46   ` Ryusuke Konishi
2017-03-05 13:35 ` [PATCH 2/3] mm: don't TestClearPageError in __filemap_fdatawait_range Jeff Layton
2017-03-05 13:35 ` [PATCH 3/3] mm: set mapping error when launder_pages fails Jeff Layton
2017-03-05 14:40 ` [PATCH 0/3] mm/fs: get PG_error out of the writeback reporting business Jeff Layton
2017-03-06 23:08   ` Ross Zwisler
2017-03-07 10:26     ` Jan Kara
2017-03-07 14:03       ` Jeff Layton
2017-03-07 15:59       ` Ross Zwisler
2017-03-07 16:17         ` Jan Kara
2017-03-09  2:57       ` Theodore Ts'o
2017-03-09  9:04         ` Jan Kara
2017-03-09 10:47           ` Jeff Layton
2017-03-09 11:02             ` Jan Kara
2017-03-09 12:43               ` Jeff Layton
2017-03-09 13:22                 ` Brian Foster
2017-03-09 14:21                 ` Theodore Ts'o
2017-03-15  5:07           ` [RFC PATCH] mm: retry writepages() on ENOMEM when doing an data integrity writeback Theodore Ts'o
2017-03-15 11:59             ` Jan Kara
2017-03-15 14:09               ` Theodore Ts'o [this message]
2017-03-15 13:03             ` Michal Hocko
2017-03-16 10:18               ` Tetsuo Handa
2017-03-06  3:06 ` [PATCH 0/3] mm/fs: get PG_error out of the writeback reporting business NeilBrown
2017-03-06 11:43   ` Jeff Layton
