From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1752476AbdFAQWO (ORCPT <rfc822;w@1wt.eu>);
        Thu, 1 Jun 2017 12:22:14 -0400
Received: from shadbolt.e.decadent.org.uk ([88.96.1.126]:33358 "EHLO
        shadbolt.e.decadent.org.uk" rhost-flags-OK-OK-OK-OK)
        by vger.kernel.org with ESMTP id S1752073AbdFAPox (ORCPT
        <rfc822;linux-kernel@vger.kernel.org>);
        Thu, 1 Jun 2017 11:44:53 -0400
Content-Type: text/plain; charset="UTF-8"
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
MIME-Version: 1.0
From: Ben Hutchings <ben@decadent.org.uk>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
CC: akpm@linux-foundation.org, "Theodore Ts'o" <tytso@mit.edu>,
        "Jan Kara" <jack@suse.com>
Date: Thu, 01 Jun 2017 16:43:16 +0100
Message-ID: <lsq.1496331796.207888707@decadent.org.uk>
X-Mailer: LinuxStableQueue (scripts by bwh)
Subject: [PATCH 3.16 081/212] ext4: use private version of page_zero_new_buffers()
 for data=journal mode
In-Reply-To: <lsq.1496331794.574686034@decadent.org.uk>
X-SA-Exim-Connect-IP: 82.70.136.246
X-SA-Exim-Mail-From: ben@decadent.org.uk
X-SA-Exim-Scanned: No (on shadbolt.decadent.org.uk); SAEximRunCond expanded to false
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

3.16.44-rc1 review patch.  If anyone has any objections, please let me know.

------------------

From: Theodore Ts'o <tytso@mit.edu>

commit b90197b655185a11640cce3a0a0bc5d8291b8ad2 upstream.

If there is a error while copying data from userspace into the page
cache during a write(2) system call, in data=journal mode, in
ext4_journalled_write_end() were using page_zero_new_buffers() from
fs/buffer.c.  Unfortunately, this sets the buffer dirty flag, which is
no good if journalling is enabled.  This is a long-standing bug that
goes back for years and years in ext3, but a combination of (a)
data=journal not being very common, (b) in many case it only results
in a warning message. and (c) only very rarely causes the kernel hang,
means that we only really noticed this as a problem when commit
998ef75ddb caused this failure to happen frequently enough to cause
generic/208 to fail when run in data=journal mode.

The fix is to have our own version of this function that doesn't call
mark_dirty_buffer(), since we will end up calling
ext4_handle_dirty_metadata() on the buffer head(s) in questions very
shortly afterwards in ext4_journalled_write_end().

Thanks to Dave Hansen and Linus Torvalds for helping to identify the
root cause of the problem.

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
---
 fs/ext4/inode.c | 34 +++++++++++++++++++++++++++++++++-
 1 file changed, 33 insertions(+), 1 deletion(-)

--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -1166,6 +1166,38 @@ errout:
 	return ret ? ret : copied;
 }
 
+/*
+ * This is a private version of page_zero_new_buffers() which doesn't
+ * set the buffer to be dirty, since in data=journalled mode we need
+ * to call ext4_handle_dirty_metadata() instead.
+ */
+static void zero_new_buffers(struct page *page, unsigned from, unsigned to)
+{
+	unsigned int block_start = 0, block_end;
+	struct buffer_head *head, *bh;
+
+	bh = head = page_buffers(page);
+	do {
+		block_end = block_start + bh->b_size;
+		if (buffer_new(bh)) {
+			if (block_end > from && block_start < to) {
+				if (!PageUptodate(page)) {
+					unsigned start, size;
+
+					start = max(from, block_start);
+					size = min(to, block_end) - start;
+
+					zero_user(page, start, size);
+					set_buffer_uptodate(bh);
+				}
+				clear_buffer_new(bh);
+			}
+		}
+		block_start = block_end;
+		bh = bh->b_this_page;
+	} while (bh != head);
+}
+
 static int ext4_journalled_write_end(struct file *file,
 				     struct address_space *mapping,
 				     loff_t pos, unsigned len, unsigned copied,
@@ -1191,7 +1223,7 @@ static int ext4_journalled_write_end(str
 		if (copied < len) {
 			if (!PageUptodate(page))
 				copied = 0;
-			page_zero_new_buffers(page, from+copied, to);
+			zero_new_buffers(page, from+copied, to);
 		}
 
 		ret = ext4_walk_page_buffers(handle, page_buffers(page), from,