linux-kernel.vger.kernel.org archive mirror
* [PATCH v4] ext4: fix direct I/O read error
@ 2020-08-05  7:57 Jiang Ying
  2020-08-05  8:51 ` Jan Kara
  0 siblings, 1 reply; 8+ messages in thread
From: Jiang Ying @ 2020-08-05  7:57 UTC (permalink / raw)
  To: tytso, adilger.kernel, linux-ext4, linux-kernel, stable
  Cc: wanglong19, heguanjun, jack

This patch fixes an ext4 direct I/O read error that occurs when
the size of the file being read is not aligned with the block size.

The following test demonstrates the error.

(1) Make a file whose size is not aligned with the block size:
	$ dd if=/dev/zero of=./test.jar bs=1000 count=3

(2) I wrote a source file named "direct_io_read_file.c" as follows:

	#include <stdio.h>
	#include <stdlib.h>
	#include <unistd.h>
	#include <fcntl.h>
	#include <sys/file.h>
	#include <sys/types.h>
	#include <sys/stat.h>
	#include <string.h>
	#define BUF_SIZE 1024

	int main(void)
	{
		int fd;
		int ret;
		unsigned char *buf;

		/* O_DIRECT needs an aligned buffer; 512 bytes is used here */
		ret = posix_memalign((void **)&buf, 512, BUF_SIZE);
		if (ret) {
			/* posix_memalign() returns the error, it does not set errno */
			fprintf(stderr, "posix_memalign failed: %s\n", strerror(ret));
			exit(1);
		}
		fd = open("./test.jar", O_RDONLY | O_DIRECT);
		if (fd < 0) {
			perror("open ./test.jar failed");
			exit(1);
		}

		/* Read until EOF (ret == 0) or an error (ret < 0) */
		do {
			ret = read(fd, buf, BUF_SIZE);
			printf("ret=%d\n", ret);
			if (ret < 0) {
				perror("read test.jar failed");
			}
		} while (ret > 0);

		free(buf);
		close(fd);
		return 0;
	}

(3) Compile the source file:
	$ gcc direct_io_read_file.c -D_GNU_SOURCE

(4) Run the test program:
	$ ./a.out

	The result is as follows:
	ret=1024
	ret=1024
	ret=952
	ret=-1
	read test.jar failed: Invalid argument
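
The numbers above can be replayed with a small user-space model. This is
only a sketch, not kernel code: model_dio_read() and its eof_check_first
flag are hypothetical names, and the 512-byte block size is an assumption
taken from the alignment used in the test program.

```c
#include <errno.h>

/*
 * Hypothetical user-space model of one direct I/O read; not kernel code.
 * Assumption: a file offset that is not a multiple of blksz is rejected
 * with -EINVAL unless the end-of-file check runs first.
 */
static long model_dio_read(long offset, long size, long count,
			   long blksz, int eof_check_first)
{
	if (eof_check_first && offset >= size)
		return 0;		/* EOF is reported before alignment is checked */
	if (offset % blksz != 0)
		return -EINVAL;		/* unaligned file offset */
	if (offset >= size)
		return 0;		/* aligned offset at or past EOF */
	/* short read when fewer than count bytes remain */
	return (size - offset < count) ? size - offset : count;
}
```

With size = 3000, count = 1024 and blksz = 512, the model returns 1024,
1024 and 952 for offsets 0, 1024 and 2048. At offset 3000 it returns
-EINVAL when eof_check_first is 0 and 0 when it is 1.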

I have tested this program on the XFS filesystem. XFS does not have
this problem because it uses iomap_dio_rw() for direct I/O reads,
and the comparison between the read offset and the file size is done
in iomap_dio_rw(). The code is as follows:

	if (pos < size) {
		retval = filemap_write_and_wait_range(mapping, pos,
				pos + iov_length(iov, nr_segs) - 1);

		if (!retval) {
			retval = mapping->a_ops->direct_IO(READ, iocb,
						iov, pos, nr_segs);
		}
		...
	}

Direct I/O is performed only when "pos < size"; otherwise 0 is returned.

I have tested the fix on ext4. The result complies with the description
of EINVAL in read(2), which is as follows:
	#include <unistd.h>
	ssize_t read(int fd, void *buf, size_t count);

	EINVAL
		fd is attached to an object which is unsuitable for reading;
		or the file was opened with the O_DIRECT flag, and either the
		address specified in buf, the value specified in count, or the
		current file offset is not suitably aligned.
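
The three alignment conditions in this description can be checked in user
space before issuing a read. The following is a minimal sketch:
dio_args_aligned() is a hypothetical helper, and the required alignment
depends on the device and filesystem, so the caller has to supply it (the
test program above assumes 512).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>

/*
 * Hypothetical helper: returns true when the buffer address, the byte
 * count and the file offset are all multiples of align.
 * align must be a nonzero power of two.
 */
static bool dio_args_aligned(const void *buf, size_t count, off_t offset,
			     size_t align)
{
	uintptr_t mask = (uintptr_t)align - 1;

	/* OR the three values together: any low bit set means misalignment */
	return (((uintptr_t)buf | (uintptr_t)count | (uintptr_t)offset) & mask) == 0;
}
```

For the test file, the offset after the third read is 3000, which is not a
multiple of 512, so the check fails for the fourth read.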

So I think this patch can be applied to fix the ext4 direct I/O read error.

However, ext4 switched its direct I/O read path to the iomap infrastructure
in kernel 5.5 with commit b1b4705d54ab
("ext4: introduce direct I/O read using iomap infrastructure").
Since then ext4 behaves like XFS: both use iomap_dio_rw() for direct
I/O reads, so this problem does not exist on kernel 5.5 for ext4.

From the above description, we can see that this problem exists in all
kernel versions from 3.14 through 5.4. It causes applications to fail to
read. For example, while a search service downloads a new full index file,
the search engine is still loading the previous index file and processing
search requests. The service cannot use buffered I/O here, because that may
evict the in-use previous index file from the page cache, so the search
service must use direct I/O reads.
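
Until a fixed kernel is deployed, an application could avoid the failing
call by never issuing a read at or past the end of the file. The sketch
below is one possible workaround and is not part of this patch:
read_until_size() is a hypothetical helper, it assumes the file does not
grow while it is being read, and for an O_DIRECT descriptor the caller
must still pass a suitably aligned buffer and size.

```c
#include <errno.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: read fd from its current offset until the size
 * reported by fstat() is reached, so that no read is issued at EOF.
 * Returns the total number of bytes read, or -1 with errno set.
 */
static ssize_t read_until_size(int fd, void *buf, size_t bufsz)
{
	struct stat st;
	off_t pos;
	ssize_t total = 0;

	if (fstat(fd, &st) < 0)
		return -1;
	pos = lseek(fd, 0, SEEK_CUR);
	if (pos < 0)
		return -1;

	while (pos < st.st_size) {
		ssize_t n = read(fd, buf, bufsz);

		if (n < 0) {
			if (errno == EINTR)
				continue;	/* interrupted, retry */
			return -1;
		}
		if (n == 0)
			break;			/* file shrank underneath us */
		/* a real caller would consume the n bytes in buf here */
		pos += n;
		total += n;
	}
	return total;
}
```

For the 3000-byte test file and a 1024-byte buffer, the loop issues three
reads and stops without attempting the fourth one.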

Please apply this patch to these kernel versions, or use the kernel 5.5
approach to fix this problem.

Fixes: 9fe55eea7e4b ("Fix race when checking i_size on direct i/o read")
Reviewed-by: Jan Kara <jack@suse.cz>
Co-developed-by: Wang Long <wanglong19@meituan.com>
Signed-off-by: Wang Long <wanglong19@meituan.com>
Signed-off-by: Jiang Ying <jiangying8582@126.com>

Changes since V3:
	Add a note that this bug can break applications that run on
	stable kernel releases.

Changes since V2:
	Improve the commit message and simplify the patch by combining a
	declaration with its initialization:

		Before:
			loff_t size;
			size = i_size_read(inode);
		After:
			loff_t size = i_size_read(inode);

Changes since V1:
	Use real name in Signed-off-by and add a "Fixes:" tag.

---
 fs/ext4/inode.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 516faa2..a66b0ac 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -3821,6 +3821,11 @@ static ssize_t ext4_direct_IO_read(struct kiocb *iocb, struct iov_iter *iter)
 	struct inode *inode = mapping->host;
 	size_t count = iov_iter_count(iter);
 	ssize_t ret;
+	loff_t offset = iocb->ki_pos;
+	loff_t size = i_size_read(inode);
+
+	if (offset >= size)
+		return 0;
 
 	/*
 	 * Shared inode_lock is enough for us - it protects against concurrent
-- 
1.8.3.1



* Re: [PATCH v4] ext4: fix direct I/O read error
  2020-08-05  7:57 [PATCH v4] ext4: fix direct I/O read error Jiang Ying
@ 2020-08-05  8:51 ` Jan Kara
  2020-08-05 14:29   ` Greg KH
  2020-08-06  1:19   ` Sasha Levin
  0 siblings, 2 replies; 8+ messages in thread
From: Jan Kara @ 2020-08-05  8:51 UTC (permalink / raw)
  To: stable
  Cc: tytso, adilger.kernel, linux-ext4, linux-kernel, wanglong19,
	heguanjun, jack, Jiang Ying

Note to stable tree maintainers (summary from the rather long changelog):
This is a non-upstream patch. It will not go upstream because the problem
there has been fixed by converting ext4 to use iomap infrastructure.
However that change is out of scope for stable kernels and this is a
minimal fix for the problem that has hit real-world applications so I think
it would be worth it to include the fix in stable trees. Thanks.

								Honza

-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR


* Re: [PATCH v4] ext4: fix direct I/O read error
  2020-08-05  8:51 ` Jan Kara
@ 2020-08-05 14:29   ` Greg KH
  2020-08-06  1:19   ` Sasha Levin
  1 sibling, 0 replies; 8+ messages in thread
From: Greg KH @ 2020-08-05 14:29 UTC (permalink / raw)
  To: Jan Kara
  Cc: stable, tytso, adilger.kernel, linux-ext4, linux-kernel,
	wanglong19, heguanjun, Jiang Ying

On Wed, Aug 05, 2020 at 10:51:07AM +0200, Jan Kara wrote:
> Note to stable tree maintainers (summary from the rather long changelog):
> This is a non-upstream patch. It will not go upstream because the problem
> there has been fixed by converting ext4 to use iomap infrastructure.
> However that change is out of scope for stable kernels and this is a
> minimal fix for the problem that has hit real-world applications so I think
> it would be worth it to include the fix in stable trees. Thanks.

Thanks for the note, I wouldn't have noticed it otherwise :)

greg k-h


* Re: [PATCH v4] ext4: fix direct I/O read error
  2020-08-05  8:51 ` Jan Kara
  2020-08-05 14:29   ` Greg KH
@ 2020-08-06  1:19   ` Sasha Levin
  2020-08-06  1:52     ` 姜迎
  1 sibling, 1 reply; 8+ messages in thread
From: Sasha Levin @ 2020-08-06  1:19 UTC (permalink / raw)
  To: Jan Kara
  Cc: stable, tytso, adilger.kernel, linux-ext4, linux-kernel,
	wanglong19, heguanjun, Jiang Ying

On Wed, Aug 05, 2020 at 10:51:07AM +0200, Jan Kara wrote:
>Note to stable tree maintainers (summary from the rather long changelog):
>This is a non-upstream patch. It will not go upstream because the problem
>there has been fixed by converting ext4 to use iomap infrastructure.
>However that change is out of scope for stable kernels and this is a
>minimal fix for the problem that has hit real-world applications so I think
>it would be worth it to include the fix in stable trees. Thanks.

How far back should it go? It breaks the build on 4.9 and 4.4 but the
fix for the breakage is trivial.

It does however suggest that this fix wasn't tested on 4.9 or 4.4, so
I'd like to clarify it here before fixing it up (or dropping it).

-- 
Thanks,
Sasha


* Re: [PATCH v4] ext4: fix direct I/O read error
  2020-08-06  1:19   ` Sasha Levin
@ 2020-08-06  1:52     ` 姜迎
  0 siblings, 0 replies; 8+ messages in thread
From: 姜迎 @ 2020-08-06  1:52 UTC (permalink / raw)
  To: Sasha Levin
  Cc: Jan Kara, stable, tytso, adilger.kernel, linux-ext4,
	linux-kernel, wanglong19, heguanjun

Sorry, I will fix this error on 4.4 and 4.9 and then send a patch for both. Thanks!

Sent from my iPhone

> On Aug 6, 2020, at 9:19 AM, Sasha Levin <sashal@kernel.org> wrote:
> 
> On Wed, Aug 05, 2020 at 10:51:07AM +0200, Jan Kara wrote:
>> Note to stable tree maintainers (summary from the rather long changelog):
>> This is a non-upstream patch. It will not go upstream because the problem
>> there has been fixed by converting ext4 to use iomap infrastructure.
>> However that change is out of scope for stable kernels and this is a
>> minimal fix for the problem that has hit real-world applications so I think
>> it would be worth it to include the fix in stable trees. Thanks.
> 
> How far back should it go? It breaks the build on 4.9 and 4.4 but the
> fix for the breakage is trivial.
> 
> It does however suggest that this fix wasn't tested on 4.9 or 4.4, so
> I'd like to clarify it here before fixing it up (or dropping it).
> 
> -- 
> Thanks,
> Sasha



* Re: [PATCH v4] ext4: fix direct I/O read error
  2020-08-14  8:04 ` Christoph Hellwig
@ 2020-08-14  8:38   ` 姜迎
  0 siblings, 0 replies; 8+ messages in thread
From: 姜迎 @ 2020-08-14  8:38 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: tytso, adilger.kernel, linux-ext4, linux-kernel, stable,
	wanglong19, heguanjun, jack

OK, I will make time to do it.

Sent from my iPhone

> On Aug 14, 2020, at 4:04 PM, Christoph Hellwig <hch@infradead.org> wrote:
> 
> On Wed, Aug 05, 2020 at 03:40:34PM +0800, Jiang Ying wrote:
>> This patch is used to fix ext4 direct I/O read error when
>> the read size is not aligned with block size.
>> 
>> Then, I will use a test to explain the error.
>> 
>> (1) Make a file that is not aligned with block size:
>>    $dd if=/dev/zero of=./test.jar bs=1000 count=3
>> 
>> (2) I wrote a source file named "direct_io_read_file.c" as following:
> 
> Can you please add your reproducer to xfstests?
> 
> Thanks!



* Re: [PATCH v4] ext4: fix direct I/O read error
  2020-08-05  7:40 Jiang Ying
@ 2020-08-14  8:04 ` Christoph Hellwig
  2020-08-14  8:38   ` 姜迎
  0 siblings, 1 reply; 8+ messages in thread
From: Christoph Hellwig @ 2020-08-14  8:04 UTC (permalink / raw)
  To: Jiang Ying
  Cc: tytso, adilger.kernel, linux-ext4, linux-kernel, stable,
	wanglong19, heguanjun, jack

On Wed, Aug 05, 2020 at 03:40:34PM +0800, Jiang Ying wrote:
> This patch is used to fix ext4 direct I/O read error when
> the read size is not aligned with block size.
> 
> Then, I will use a test to explain the error.
> 
> (1) Make a file that is not aligned with block size:
> 	$dd if=/dev/zero of=./test.jar bs=1000 count=3
> 
> (2) I wrote a source file named "direct_io_read_file.c" as following:

Can you please add your reproducer to xfstests?

Thanks!


* [PATCH v4] ext4: fix direct I/O read error
@ 2020-08-05  7:40 Jiang Ying
  2020-08-14  8:04 ` Christoph Hellwig
  0 siblings, 1 reply; 8+ messages in thread
From: Jiang Ying @ 2020-08-05  7:40 UTC (permalink / raw)
  To: tytso, adilger.kernel, linux-ext4, linux-kernel, stable
  Cc: wanglong19, heguanjun, jack

