* [PATCH] ext4: Fix data corruption caused by unaligned direct AIO
From: Lukas Czerner @ 2019-03-06 11:06 UTC
  To: linux-ext4; +Cc: Frank Sorenson, stable

Ext4 needs to serialize unaligned direct AIO because the zeroing of
partial blocks of two competing unaligned AIOs can result in data
corruption.

However, it skips serialization if the potentially unaligned AIO starts
at or past i_size, on the rationale that no pending writes are possible
past i_size. Unfortunately, if i_size is not block aligned and the second
unaligned write lands past i_size but still in the same block, it can
corrupt the previous unaligned write to that block.

This is a (very simplified) reproducer from Frank:

    // 41472 = (10 * 4096) + 512
    // 37376 = 41472 - 4096

    ftruncate(fd, 41472);
    io_prep_pwrite(iocbs[0], fd, buf[0], 4096, 37376);
    io_prep_pwrite(iocbs[1], fd, buf[1], 4096, 41472);

    io_submit(io_ctx, 1, &iocbs[0]);
    io_submit(io_ctx, 1, &iocbs[1]);

    io_getevents(io_ctx, 2, 2, events, NULL);

Without this patch, the 512B range from 40960 up to the start of the
second unaligned write (41472) is zeroed, overwriting the data written
by the first write. This is data corruption.

00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
*
00009200  30 30 30 30 30 30 30 30  30 30 30 30 30 30 30 30
*
0000a000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
*
0000a200  31 31 31 31 31 31 31 31  31 31 31 31 31 31 31 31

With this patch, the data corruption is avoided because we recognize
the unaligned_aio and wait for the unwritten extent conversion.

00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
*
00009200  30 30 30 30 30 30 30 30  30 30 30 30 30 30 30 30
*
0000a200  31 31 31 31 31 31 31 31  31 31 31 31 31 31 31 31
*
0000b200

Reported-by: Frank Sorenson <fsorenso@redhat.com>
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Fixes: e9e3bcecf44c ("ext4: serialize unaligned asynchronous DIO")
Cc: <stable@vger.kernel.org>
---
 fs/ext4/file.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/ext4/file.c b/fs/ext4/file.c
index 69d65d49837b..98ec11f69cd4 100644
--- a/fs/ext4/file.c
+++ b/fs/ext4/file.c
@@ -125,7 +125,7 @@ ext4_unaligned_aio(struct inode *inode, struct iov_iter *from, loff_t pos)
 	struct super_block *sb = inode->i_sb;
 	int blockmask = sb->s_blocksize - 1;
 
-	if (pos >= i_size_read(inode))
+	if (pos >= ALIGN(i_size_read(inode), sb->s_blocksize))
 		return 0;
 
 	if ((pos | iov_iter_alignment(from)) & blockmask)
-- 
2.20.1



* Re: [PATCH] ext4: Fix data corruption caused by unaligned direct AIO
From: Theodore Ts'o @ 2019-03-15  3:38 UTC
  To: Lukas Czerner; +Cc: linux-ext4, Frank Sorenson, stable

Thanks, applied.

					- Ted
