io-uring.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Re: [io_uring] 584b0180f0: phoronix-test-suite.fio.SequentialWrite.IO_uring.Yes.Yes.1MB.DefaultTestDirectory.mb_s -10.2% regression
       [not found] <20220527092432.GE11731@xsang-OptiPlex-9020>
@ 2022-05-27 13:50 ` Jens Axboe
  2022-06-08  8:00   ` Oliver Sang
                     ` (2 more replies)
  0 siblings, 3 replies; 18+ messages in thread
From: Jens Axboe @ 2022-05-27 13:50 UTC (permalink / raw)
  To: kernel test robot
  Cc: LKML, io-uring, lkp, lkp, ying.huang, feng.tang, zhengjun.xing,
	fengwei.yin, guobing.chen, ming.a.chen, frank.du, Shuhua.Fan,
	wangyang.guo, Wenhuan.Huang, jessica.ji, shan.kang, guangli.li,
	tiejun.li, yu.ma, dapeng1.mi, jiebin.sun, gengxin.xie, fan.zhao

On 5/27/22 3:24 AM, kernel test robot wrote:
> 
> 
> Greeting,
> 
> FYI, we noticed a -10.2% regression of phoronix-test-suite.fio.SequentialWrite.IO_uring.Yes.Yes.1MB.DefaultTestDirectory.mb_s due to commit:
> 
> 
> commit: 584b0180f0f4d67d7145950fe68c625f06c88b10 ("io_uring: move read/write file prep state into actual opcode handler")
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
> 
> in testcase: phoronix-test-suite
> on test machine: 96 threads 2 sockets Intel(R) Xeon(R) Gold 6252 CPU @ 2.10GHz with 512G memory
> with following parameters:
> 
> 	test: fio-1.14.1
> 	option_a: Sequential Write
> 	option_b: IO_uring
> 	option_c: Yes
> 	option_d: Yes
> 	option_e: 1MB
> 	option_f: Default Test Directory
> 	cpufreq_governor: performance
> 	ucode: 0x500320a
> 
> test-description: The Phoronix Test Suite is the most comprehensive testing and benchmarking platform available that provides an extensible framework for which new tests can be easily added.
> test-url: http://www.phoronix-test-suite.com/

I'm a bit skeptical on this, but I'd like to try and run the test case.
Since it's just a fio test case, why can't I find it somewhere? Seems
very convoluted to have to setup lkp-tests just for this. Besides, I
tried, but it doesn't work on aarch64...

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [io_uring] 584b0180f0: phoronix-test-suite.fio.SequentialWrite.IO_uring.Yes.Yes.1MB.DefaultTestDirectory.mb_s -10.2% regression
  2022-05-27 13:50 ` [io_uring] 584b0180f0: phoronix-test-suite.fio.SequentialWrite.IO_uring.Yes.Yes.1MB.DefaultTestDirectory.mb_s -10.2% regression Jens Axboe
@ 2022-06-08  8:00   ` Oliver Sang
  2022-06-14  1:54   ` [LKP] " Yin Fengwei
  2022-07-12  8:06   ` Yin Fengwei
  2 siblings, 0 replies; 18+ messages in thread
From: Oliver Sang @ 2022-06-08  8:00 UTC (permalink / raw)
  To: Jens Axboe
  Cc: LKML, io-uring, lkp, lkp, ying.huang, feng.tang, zhengjun.xing,
	fengwei.yin, guobing.chen, ming.a.chen, frank.du, Shuhua.Fan,
	wangyang.guo, Wenhuan.Huang, jessica.ji, shan.kang, guangli.li,
	tiejun.li, yu.ma, dapeng1.mi, jiebin.sun, gengxin.xie, fan.zhao

Hi Jens Axboe,

On Fri, May 27, 2022 at 07:50:27AM -0600, Jens Axboe wrote:
> On 5/27/22 3:24 AM, kernel test robot wrote:
> > 
> > 
> > Greeting,
> > 
> > FYI, we noticed a -10.2% regression of phoronix-test-suite.fio.SequentialWrite.IO_uring.Yes.Yes.1MB.DefaultTestDirectory.mb_s due to commit:
> > 
> > 
> > commit: 584b0180f0f4d67d7145950fe68c625f06c88b10 ("io_uring: move read/write file prep state into actual opcode handler")
> > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
> > 
> > in testcase: phoronix-test-suite
> > on test machine: 96 threads 2 sockets Intel(R) Xeon(R) Gold 6252 CPU @ 2.10GHz with 512G memory
> > with following parameters:
> > 
> > 	test: fio-1.14.1
> > 	option_a: Sequential Write
> > 	option_b: IO_uring
> > 	option_c: Yes
> > 	option_d: Yes
> > 	option_e: 1MB
> > 	option_f: Default Test Directory
> > 	cpufreq_governor: performance
> > 	ucode: 0x500320a
> > 
> > test-description: The Phoronix Test Suite is the most comprehensive testing and benchmarking platform available that provides an extensible framework for which new tests can be easily added.
> > test-url: http://www.phoronix-test-suite.com/
> 
> I'm a bit skeptical on this, but I'd like to try and run the test case.
> Since it's just a fio test case, why can't I find it somewhere? Seems
> very convoluted to have to setup lkp-tests just for this. Besides, I
> tried, but it doesn't work on aarch64...

we just follow doc on http://www.phoronix-test-suite.com/ to run tests in PTS
framework, so you don't need to care about lkp-tests.

and for this fio test, the parameters we used just as:
	test: fio-1.14.1
	option_a: Sequential Write
	option_b: IO_uring
	option_c: Yes
	option_d: Yes
	option_e: 1MB
	option_f: Default Test Directory


and yeah, we most focus on x86_64 and don't support lkp-tests to run on
aarch64...


if you have some idea that we could run other tests, could you let us know?
it will be great pleasure to run more tests to check for us if we can support.

Thanks a lot!

> 
> -- 
> Jens Axboe
> 

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [LKP] Re: [io_uring] 584b0180f0: phoronix-test-suite.fio.SequentialWrite.IO_uring.Yes.Yes.1MB.DefaultTestDirectory.mb_s -10.2% regression
  2022-05-27 13:50 ` [io_uring] 584b0180f0: phoronix-test-suite.fio.SequentialWrite.IO_uring.Yes.Yes.1MB.DefaultTestDirectory.mb_s -10.2% regression Jens Axboe
  2022-06-08  8:00   ` Oliver Sang
@ 2022-06-14  1:54   ` Yin Fengwei
  2022-07-12  8:06   ` Yin Fengwei
  2 siblings, 0 replies; 18+ messages in thread
From: Yin Fengwei @ 2022-06-14  1:54 UTC (permalink / raw)
  To: Jens Axboe, kernel test robot
  Cc: LKML, io-uring, lkp, lkp, guobing.chen, ming.a.chen, frank.du,
	Shuhua.Fan, wangyang.guo, Wenhuan.Huang, jessica.ji, shan.kang,
	guangli.li, tiejun.li, yu.ma, dapeng1.mi, jiebin.sun,
	gengxin.xie, fan.zhao


On 5/27/2022 9:50 PM, Jens Axboe wrote:
> On 5/27/22 3:24 AM, kernel test robot wrote:
>>
>>
>> Greeting,
>>
>> FYI, we noticed a -10.2% regression of phoronix-test-suite.fio.SequentialWrite.IO_uring.Yes.Yes.1MB.DefaultTestDirectory.mb_s due to commit:
>>
>>
>> commit: 584b0180f0f4d67d7145950fe68c625f06c88b10 ("io_uring: move read/write file prep state into actual opcode handler")
>> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
>>
>> in testcase: phoronix-test-suite
>> on test machine: 96 threads 2 sockets Intel(R) Xeon(R) Gold 6252 CPU @ 2.10GHz with 512G memory
>> with following parameters:
>>
>> 	test: fio-1.14.1
>> 	option_a: Sequential Write
>> 	option_b: IO_uring
>> 	option_c: Yes
>> 	option_d: Yes
>> 	option_e: 1MB
>> 	option_f: Default Test Directory
>> 	cpufreq_governor: performance
>> 	ucode: 0x500320a
>>
>> test-description: The Phoronix Test Suite is the most comprehensive testing and benchmarking platform available that provides an extensible framework for which new tests can be easily added.
>> test-url: http://www.phoronix-test-suite.com/
> 
> I'm a bit skeptical on this, but I'd like to try and run the test case.
> Since it's just a fio test case, why can't I find it somewhere? Seems
> very convoluted to have to setup lkp-tests just for this. Besides, I
> tried, but it doesn't work on aarch64...
> 
We re-run the test and still could get exactly same test result. We noticed
following info from perf profile:

     14.40 ± 21%     +71.3       85.71 ±  2%  perf-profile.calltrace.cycles-pp.io_wqe_worker.ret_from_fork
     14.25 ± 21%     +71.4       85.64 ±  2%  perf-profile.calltrace.cycles-pp.io_worker_handle_work.io_wqe_worker.ret_from_fork
     14.23 ± 21%     +71.4       85.63 ±  2%  perf-profile.calltrace.cycles-pp.io_issue_sqe.io_wq_submit_work.io_worker_handle_work.io_wqe_worker.ret_from_fork
     14.23 ± 21%     +71.4       85.64 ±  2%  perf-profile.calltrace.cycles-pp.io_wq_submit_work.io_worker_handle_work.io_wqe_worker.ret_from_fork
     14.22 ± 21%     +71.4       85.63 ±  2%  perf-profile.calltrace.cycles-pp.io_write.io_issue_sqe.io_wq_submit_work.io_worker_handle_work.io_wqe_worker
     14.10 ± 21%     +71.5       85.62 ±  2%  perf-profile.calltrace.cycles-pp.ext4_buffered_write_iter.io_write.io_issue_sqe.io_wq_submit_work.io_worker_handle_work
      0.00           +80.9       80.92 ±  2%  perf-profile.calltrace.cycles-pp.osq_lock.rwsem_optimistic_spin.rwsem_down_write_slowpath.ext4_buffered_write_iter.io_write
      0.00           +83.0       82.99 ±  2%  perf-profile.calltrace.cycles-pp.rwsem_optimistic_spin.rwsem_down_write_slowpath.ext4_buffered_write_iter.io_write.io_issue_sqe
      0.00           +83.6       83.63 ±  2%  perf-profile.calltrace.cycles-pp.rwsem_down_write_slowpath.ext4_buffered_write_iter.io_write.io_issue_sqe.io_wq_submit_work

The above operations takes more time with the patch applied.
It looks like the inode lock contention raised a lot with
the patch.

Frankly, we can't connect this behavior with the patch. Just
list here for your information. Thanks.


Regards
Yin, Fengwei

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [LKP] Re: [io_uring] 584b0180f0: phoronix-test-suite.fio.SequentialWrite.IO_uring.Yes.Yes.1MB.DefaultTestDirectory.mb_s -10.2% regression
  2022-05-27 13:50 ` [io_uring] 584b0180f0: phoronix-test-suite.fio.SequentialWrite.IO_uring.Yes.Yes.1MB.DefaultTestDirectory.mb_s -10.2% regression Jens Axboe
  2022-06-08  8:00   ` Oliver Sang
  2022-06-14  1:54   ` [LKP] " Yin Fengwei
@ 2022-07-12  8:06   ` Yin Fengwei
  2022-07-15 15:58     ` Jens Axboe
  2 siblings, 1 reply; 18+ messages in thread
From: Yin Fengwei @ 2022-07-12  8:06 UTC (permalink / raw)
  To: Jens Axboe, kernel test robot; +Cc: LKML, io-uring, lkp, lkp

Hi Jens,

On 5/27/2022 9:50 PM, Jens Axboe wrote:
> I'm a bit skeptical on this, but I'd like to try and run the test case.
> Since it's just a fio test case, why can't I find it somewhere? Seems
> very convoluted to have to setup lkp-tests just for this. Besides, I
> tried, but it doesn't work on aarch64...
Recheck this regression report. The regression could be reproduced if
the following config file is used with fio (tag: fio-3.25) :

	[global]
	rw=write
	ioengine=io_uring
	iodepth=64
	size=1g
	direct=1
	buffered=1
	startdelay=5
	force_async=4
	ramp_time=5
	runtime=20
	time_based
	clat_percentiles=0
	disable_lat=1
	disable_clat=1
	disable_slat=1
	filename=test_fiofile
	[test]
	name=test
	bs=1M
	stonewall

Just FYI, a small change to commit: 584b0180f0f4d67d7145950fe68c625f06c88b10:

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 969f65de9972..616d857f8fc6 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -3181,8 +3181,13 @@ static int io_prep_rw(struct io_kiocb *req, const struct io_uring_sqe *sqe)
 {
        struct kiocb *kiocb = &req->rw.kiocb;
        unsigned ioprio;
+       struct file *file = req->file;
        int ret;

+       if (likely(file && (file->f_mode & FMODE_WRITE)))
+               if (!io_req_ffs_set(req))
+                       req->flags |= io_file_get_flags(file) << REQ_F_SUPPORT_NOWAIT_BIT;
+
        kiocb->ki_pos = READ_ONCE(sqe->off);

        ioprio = READ_ONCE(sqe->ioprio);

could make regression gone. No idea how req->flags impact the write performance. Thanks.


Regards
Yin, Fengwei

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [LKP] Re: [io_uring] 584b0180f0: phoronix-test-suite.fio.SequentialWrite.IO_uring.Yes.Yes.1MB.DefaultTestDirectory.mb_s -10.2% regression
  2022-07-12  8:06   ` Yin Fengwei
@ 2022-07-15 15:58     ` Jens Axboe
  2022-07-18  0:58       ` Yin Fengwei
  2022-07-18  3:30       ` Yin Fengwei
  0 siblings, 2 replies; 18+ messages in thread
From: Jens Axboe @ 2022-07-15 15:58 UTC (permalink / raw)
  To: Yin Fengwei, kernel test robot; +Cc: LKML, io-uring, lkp, lkp

On 7/12/22 2:06 AM, Yin Fengwei wrote:
> Hi Jens,
> 
> On 5/27/2022 9:50 PM, Jens Axboe wrote:
>> I'm a bit skeptical on this, but I'd like to try and run the test case.
>> Since it's just a fio test case, why can't I find it somewhere? Seems
>> very convoluted to have to setup lkp-tests just for this. Besides, I
>> tried, but it doesn't work on aarch64...
> Recheck this regression report. The regression could be reproduced if
> the following config file is used with fio (tag: fio-3.25) :
> 
> 	[global]
> 	rw=write
> 	ioengine=io_uring
> 	iodepth=64
> 	size=1g
> 	direct=1
> 	buffered=1
> 	startdelay=5
> 	force_async=4
> 	ramp_time=5
> 	runtime=20
> 	time_based
> 	clat_percentiles=0
> 	disable_lat=1
> 	disable_clat=1
> 	disable_slat=1
> 	filename=test_fiofile
> 	[test]
> 	name=test
> 	bs=1M
> 	stonewall
> 
> Just FYI, a small change to commit: 584b0180f0f4d67d7145950fe68c625f06c88b10:
> 
> diff --git a/fs/io_uring.c b/fs/io_uring.c
> index 969f65de9972..616d857f8fc6 100644
> --- a/fs/io_uring.c
> +++ b/fs/io_uring.c
> @@ -3181,8 +3181,13 @@ static int io_prep_rw(struct io_kiocb *req, const struct io_uring_sqe *sqe)
>  {
>         struct kiocb *kiocb = &req->rw.kiocb;
>         unsigned ioprio;
> +       struct file *file = req->file;
>         int ret;
> 
> +       if (likely(file && (file->f_mode & FMODE_WRITE)))
> +               if (!io_req_ffs_set(req))
> +                       req->flags |= io_file_get_flags(file) << REQ_F_SUPPORT_NOWAIT_BIT;
> +
>         kiocb->ki_pos = READ_ONCE(sqe->off);
> 
>         ioprio = READ_ONCE(sqe->ioprio);
> 
> could make regression gone. No idea how req->flags impact the write
> performance. Thanks.

I can't really explain that either, at least not immediately. I tried
running with and without that patch, and don't see any difference here.
In terms of making this more obvious, does the below also fix it for
you?

And what filesystem is this being run on?


diff --git a/fs/io_uring.c b/fs/io_uring.c
index a01ea49f3017..797fad99780d 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -4269,9 +4269,6 @@ static int io_rw_init_file(struct io_kiocb *req, fmode_t mode)
 	if (unlikely(!file || !(file->f_mode & mode)))
 		return -EBADF;
 
-	if (!io_req_ffs_set(req))
-		req->flags |= io_file_get_flags(file) << REQ_F_SUPPORT_NOWAIT_BIT;
-
 	kiocb->ki_flags = iocb_flags(file);
 	ret = kiocb_set_rw_flags(kiocb, req->rw.flags);
 	if (unlikely(ret))
@@ -8309,7 +8306,13 @@ static bool io_assign_file(struct io_kiocb *req, unsigned int issue_flags)
 	else
 		req->file = io_file_get_normal(req, req->cqe.fd);
 
-	return !!req->file;
+	if (unlikely(!req->file))
+		return false;
+
+	if (!io_req_ffs_set(req))
+		req->flags |= io_file_get_flags(file) << REQ_F_SUPPORT_NOWAIT_BIT;
+
+	return true;
 }
 
 static int io_issue_sqe(struct io_kiocb *req, unsigned int issue_flags)

-- 
Jens Axboe


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [LKP] Re: [io_uring] 584b0180f0: phoronix-test-suite.fio.SequentialWrite.IO_uring.Yes.Yes.1MB.DefaultTestDirectory.mb_s -10.2% regression
  2022-07-15 15:58     ` Jens Axboe
@ 2022-07-18  0:58       ` Yin Fengwei
  2022-07-18  1:14         ` Jens Axboe
  2022-07-18  3:30       ` Yin Fengwei
  1 sibling, 1 reply; 18+ messages in thread
From: Yin Fengwei @ 2022-07-18  0:58 UTC (permalink / raw)
  To: Jens Axboe, kernel test robot; +Cc: LKML, io-uring, lkp, lkp



On 7/15/2022 11:58 PM, Jens Axboe wrote:
> I can't really explain that either, at least not immediately. I tried
> running with and without that patch, and don't see any difference here.
> In terms of making this more obvious, does the below also fix it for
> you?
I will try the fix and let you know the result.

> 
> And what filesystem is this being run on?
I am using ext4 and LKP are also using ext4. Thanks.


Regards
Yin, Fengwei

> 
> 
> diff --git a/fs/io_uring.c b/fs/io_uring.c
> index a01ea49f3017..797fad99780d 100644
> --- a/fs/io_uring.c
> +++ b/fs/io_uring.c
> @@ -4269,9 +4269,6 @@ static int io_rw_init_file(struct io_kiocb *req, fmode_t mode)
>  	if (unlikely(!file || !(file->f_mode & mode)))
>  		return -EBADF;
>  
> -	if (!io_req_ffs_set(req))
> -		req->flags |= io_file_get_flags(file) << REQ_F_SUPPORT_NOWAIT_BIT;
> -
>  	kiocb->ki_flags = iocb_flags(file);
>  	ret = kiocb_set_rw_flags(kiocb, req->rw.flags);
>  	if (unlikely(ret))
> @@ -8309,7 +8306,13 @@ static bool io_assign_file(struct io_kiocb *req, unsigned int issue_flags)
>  	else
>  		req->file = io_file_get_normal(req, req->cqe.fd);
>  
> -	return !!req->file;
> +	if (unlikely(!req->file))
> +		return false;
> +
> +	if (!io_req_ffs_set(req))
> +		req->flags |= io_file_get_flags(file) << REQ_F_SUPPORT_NOWAIT_BIT;
> +
> +	return true;
>  }
>  
>  static int io_issue_sqe(struct io_kiocb *req, unsigned int issue_flags)

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [LKP] Re: [io_uring] 584b0180f0: phoronix-test-suite.fio.SequentialWrite.IO_uring.Yes.Yes.1MB.DefaultTestDirectory.mb_s -10.2% regression
  2022-07-18  0:58       ` Yin Fengwei
@ 2022-07-18  1:14         ` Jens Axboe
  0 siblings, 0 replies; 18+ messages in thread
From: Jens Axboe @ 2022-07-18  1:14 UTC (permalink / raw)
  To: Yin Fengwei, kernel test robot; +Cc: LKML, io-uring, lkp, lkp

On 7/17/22 6:58 PM, Yin Fengwei wrote:
> 
> 
> On 7/15/2022 11:58 PM, Jens Axboe wrote:
>> I can't really explain that either, at least not immediately. I tried
>> running with and without that patch, and don't see any difference here.
>> In terms of making this more obvious, does the below also fix it for
>> you?
> I will try the fix and let you know the result.
> 
>>
>> And what filesystem is this being run on?
> I am using ext4 and LKP are also using ext4. Thanks.

Thanks, I'll try ext4 as well (was on XFS).

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [LKP] Re: [io_uring] 584b0180f0: phoronix-test-suite.fio.SequentialWrite.IO_uring.Yes.Yes.1MB.DefaultTestDirectory.mb_s -10.2% regression
  2022-07-15 15:58     ` Jens Axboe
  2022-07-18  0:58       ` Yin Fengwei
@ 2022-07-18  3:30       ` Yin Fengwei
  2022-07-18 16:27         ` Jens Axboe
  1 sibling, 1 reply; 18+ messages in thread
From: Yin Fengwei @ 2022-07-18  3:30 UTC (permalink / raw)
  To: Jens Axboe, kernel test robot; +Cc: LKML, io-uring, lkp, lkp

Hi Jens,

On 7/15/2022 11:58 PM, Jens Axboe wrote:
> In terms of making this more obvious, does the below also fix it for
> you?

The regression is still there after applied the change you posted.


Your change can't be applied to v5.18 (the latest commit on master branch).
I changed it a little bit to be applied:

diff --git a/fs/io_uring.c b/fs/io_uring.c
index e0823f58f7959..0bf7f3d18d46e 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -3762,9 +3762,6 @@ static int io_rw_init_file(struct io_kiocb *req, fmode_t mode)
        if (unlikely(!file || !(file->f_mode & mode)))
                return -EBADF;

-       if (!io_req_ffs_set(req))
-               req->flags |= io_file_get_flags(file) << REQ_F_SUPPORT_NOWAIT_BIT;
-
        kiocb->ki_flags = iocb_flags(file);
        ret = kiocb_set_rw_flags(kiocb, req->rw.flags);
        if (unlikely(ret))
@@ -7114,8 +7111,13 @@ static bool io_assign_file(struct io_kiocb *req, unsigned int issue_flags)
                req->file = io_file_get_fixed(req, req->fd, issue_flags);
        else
                req->file = io_file_get_normal(req, req->fd);
-       if (req->file)
+       if (req->file) {
+               if (!io_req_ffs_set(req))
+                       req->flags |= io_file_get_flags(req->file) <<
+                                               REQ_F_SUPPORT_NOWAIT_BIT;
+
                return true;
+       }

        req_set_fail(req);
        req->result = -EBADF;


Regards
Yin, Fengwei

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [LKP] Re: [io_uring] 584b0180f0: phoronix-test-suite.fio.SequentialWrite.IO_uring.Yes.Yes.1MB.DefaultTestDirectory.mb_s -10.2% regression
  2022-07-18  3:30       ` Yin Fengwei
@ 2022-07-18 16:27         ` Jens Axboe
  2022-07-19  0:27           ` Yin Fengwei
  2022-07-19  2:16           ` Yin Fengwei
  0 siblings, 2 replies; 18+ messages in thread
From: Jens Axboe @ 2022-07-18 16:27 UTC (permalink / raw)
  To: Yin Fengwei, kernel test robot; +Cc: LKML, io-uring, lkp, lkp

On 7/17/22 9:30 PM, Yin Fengwei wrote:
> Hi Jens,
> 
> On 7/15/2022 11:58 PM, Jens Axboe wrote:
>> In terms of making this more obvious, does the below also fix it for
>> you?
> 
> The regression is still there after applied the change you posted.

Still don't see the regression here, using ext4. I get about 1020-1045
IOPS with or without the patch you sent.

This is running it in a vm, and the storage device is nvme. What is
hosting your ext4 fs?

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [LKP] Re: [io_uring] 584b0180f0: phoronix-test-suite.fio.SequentialWrite.IO_uring.Yes.Yes.1MB.DefaultTestDirectory.mb_s -10.2% regression
  2022-07-18 16:27         ` Jens Axboe
@ 2022-07-19  0:27           ` Yin Fengwei
  2022-07-19  2:16           ` Yin Fengwei
  1 sibling, 0 replies; 18+ messages in thread
From: Yin Fengwei @ 2022-07-19  0:27 UTC (permalink / raw)
  To: Jens Axboe, kernel test robot; +Cc: LKML, io-uring, lkp, lkp



On 7/19/2022 12:27 AM, Jens Axboe wrote:
> On 7/17/22 9:30 PM, Yin Fengwei wrote:
>> Hi Jens,
>>
>> On 7/15/2022 11:58 PM, Jens Axboe wrote:
>>> In terms of making this more obvious, does the below also fix it for
>>> you?
>>
>> The regression is still there after applied the change you posted.
> 
> Still don't see the regression here, using ext4. I get about 1020-1045
> IOPS with or without the patch you sent.
> 
> This is running it in a vm, and the storage device is nvme. What is
> hosting your ext4 fs?
> 
My local testing system is also a vm with SATA disk. LKP test platform
is a native one with SATA disk.

I could reproduce the regression on both environment. I will try to use
nvme to host my local vm disk and check whether we could see something
different.


Regards
Yin, Fengwei

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [LKP] Re: [io_uring] 584b0180f0: phoronix-test-suite.fio.SequentialWrite.IO_uring.Yes.Yes.1MB.DefaultTestDirectory.mb_s -10.2% regression
  2022-07-18 16:27         ` Jens Axboe
  2022-07-19  0:27           ` Yin Fengwei
@ 2022-07-19  2:16           ` Yin Fengwei
  2022-07-19  2:29             ` Jens Axboe
  1 sibling, 1 reply; 18+ messages in thread
From: Yin Fengwei @ 2022-07-19  2:16 UTC (permalink / raw)
  To: Jens Axboe, kernel test robot; +Cc: LKML, io-uring, lkp, lkp

Hi Jens,

On 7/19/2022 12:27 AM, Jens Axboe wrote:
> On 7/17/22 9:30 PM, Yin Fengwei wrote:
>> Hi Jens,
>>
>> On 7/15/2022 11:58 PM, Jens Axboe wrote:
>>> In terms of making this more obvious, does the below also fix it for
>>> you?
>>
>> The regression is still there after applied the change you posted.
> 
> Still don't see the regression here, using ext4. I get about 1020-1045
> IOPS with or without the patch you sent.
> 
> This is running it in a vm, and the storage device is nvme. What is
> hosting your ext4 fs?
Just did more test with vm. The regression can't be reproduced with latest
code (I tried the tag v5.19-rc7) whatever the underneath storage is SATA
or NVME.

But the regression and the debugging patch from me could be reproduced
on both SATA and NVME if use commit 584b0180f0f4d6 as base commit
(584b0180f0f4d6 vs 584b0180f0f4d6 with my debugging patch).


Here is the test result I got:
NVME as host storage:
  5.19.0-rc7:
    write: IOPS=933, BW=937MiB/s (982MB/s)(18.3GiB/20020msec); 0 zone resets
    write: IOPS=993, BW=996MiB/s (1045MB/s)(19.5GiB/20020msec); 0 zone resets
    write: IOPS=1005, BW=1009MiB/s (1058MB/s)(19.7GiB/20020msec); 0 zone resets
    write: IOPS=985, BW=989MiB/s (1037MB/s)(19.3GiB/20020msec); 0 zone resets
    write: IOPS=1020, BW=1024MiB/s (1073MB/s)(20.0GiB/20020msec); 0 zone resets

  5.19.0-rc7 with my debugging patch:
    write: IOPS=988, BW=992MiB/s (1040MB/s)(19.7GiB/20384msec); 0 zone resets
    write: IOPS=995, BW=998MiB/s (1047MB/s)(20.1GiB/20574msec); 0 zone resets
    write: IOPS=996, BW=1000MiB/s (1048MB/s)(19.5GiB/20020msec); 0 zone resets
    write: IOPS=995, BW=998MiB/s (1047MB/s)(19.5GiB/20020msec); 0 zone resets
    write: IOPS=1006, BW=1009MiB/s (1058MB/s)(19.7GiB/20019msec); 0 zone resets

  584b0180f0:
    write: IOPS=1004, BW=1008MiB/s (1057MB/s)(19.7GiB/20020msec); 0 zone resets
    write: IOPS=968, BW=971MiB/s (1018MB/s)(19.4GiB/20468msec); 0 zone resets
    write: IOPS=982, BW=986MiB/s (1033MB/s)(19.3GiB/20020msec); 0 zone resets
    write: IOPS=1000, BW=1004MiB/s (1053MB/s)(20.1GiB/20461msec); 0 zone resets
    write: IOPS=903, BW=906MiB/s (950MB/s)(18.1GiB/20419msec); 0 zone resets

  584b0180f0 with my debugging the patch:
    write: IOPS=1073, BW=1076MiB/s (1129MB/s)(21.1GiB/20036msec); 0 zone resets
    write: IOPS=1131, BW=1135MiB/s (1190MB/s)(22.2GiB/20022msec); 0 zone resets
    write: IOPS=1122, BW=1126MiB/s (1180MB/s)(22.1GiB/20071msec); 0 zone resets
    write: IOPS=1071, BW=1075MiB/s (1127MB/s)(21.1GiB/20071msec); 0 zone resets
    write: IOPS=1049, BW=1053MiB/s (1104MB/s)(21.1GiB/20482msec); 0 zone resets


SATA disk as host storage:
  5.19.0-rc7:
    write: IOPS=624, BW=627MiB/s (658MB/s)(12.3GiB/20023msec); 0 zone resets
    write: IOPS=655, BW=658MiB/s (690MB/s)(12.9GiB/20021msec); 0 zone resets
    write: IOPS=596, BW=600MiB/s (629MB/s)(12.1GiB/20586msec); 0 zone resets
    write: IOPS=647, BW=650MiB/s (682MB/s)(12.7GiB/20020msec); 0 zone resets
    write: IOPS=591, BW=594MiB/s (623MB/s)(12.1GiB/20787msec); 0 zone resets

  5.19.0-rc7 with my debugging patch:
    write: IOPS=633, BW=637MiB/s (668MB/s)(12.6GiB/20201msec); 0 zone resets
    write: IOPS=614, BW=617MiB/s (647MB/s)(13.1GiB/21667msec); 0 zone resets
    write: IOPS=653, BW=657MiB/s (689MB/s)(12.8GiB/20020msec); 0 zone resets
    write: IOPS=618, BW=622MiB/s (652MB/s)(12.2GiB/20033msec); 0 zone resets
    write: IOPS=604, BW=608MiB/s (638MB/s)(12.1GiB/20314msec); 0 zone resets

  584b0180f0:
    write: IOPS=635, BW=638MiB/s (669MB/s)(12.5GiB/20020msec); 0 zone resets
    write: IOPS=649, BW=652MiB/s (684MB/s)(12.8GiB/20066msec); 0 zone resets
    write: IOPS=639, BW=642MiB/s (674MB/s)(13.1GiB/20818msec); 0 zone resets

  584b0180f0 with my debugging patch:
    write: IOPS=850, BW=853MiB/s (895MB/s)(17.1GiB/20474msec); 0 zone resets
    write: IOPS=738, BW=742MiB/s (778MB/s)(15.1GiB/20787msec); 0 zone resets
    write: IOPS=751, BW=755MiB/s (792MB/s)(15.1GiB/20432msec); 0 zone resets


Regards
Yin, Fengwei

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [LKP] Re: [io_uring] 584b0180f0: phoronix-test-suite.fio.SequentialWrite.IO_uring.Yes.Yes.1MB.DefaultTestDirectory.mb_s -10.2% regression
  2022-07-19  2:16           ` Yin Fengwei
@ 2022-07-19  2:29             ` Jens Axboe
  2022-07-19  8:58               ` Yin Fengwei
  0 siblings, 1 reply; 18+ messages in thread
From: Jens Axboe @ 2022-07-19  2:29 UTC (permalink / raw)
  To: Yin Fengwei, kernel test robot; +Cc: LKML, io-uring, lkp, lkp

On 7/18/22 8:16 PM, Yin Fengwei wrote:
> Hi Jens,
> 
> On 7/19/2022 12:27 AM, Jens Axboe wrote:
>> On 7/17/22 9:30 PM, Yin Fengwei wrote:
>>> Hi Jens,
>>>
>>> On 7/15/2022 11:58 PM, Jens Axboe wrote:
>>>> In terms of making this more obvious, does the below also fix it for
>>>> you?
>>>
>>> The regression is still there after applied the change you posted.
>>
>> Still don't see the regression here, using ext4. I get about 1020-1045
>> IOPS with or without the patch you sent.
>>
>> This is running it in a vm, and the storage device is nvme. What is
>> hosting your ext4 fs?
> Just did more test with vm. The regression can't be reproduced with latest
> code (I tried the tag v5.19-rc7) whatever the underneath storage is SATA
> or NVME.
> 
> But the regression and the debugging patch from me could be reproduced
> on both SATA and NVME if use commit 584b0180f0f4d6 as base commit
> (584b0180f0f4d6 vs 584b0180f0f4d6 with my debugging patch).
> 
> 
> Here is the test result I got:
> NVME as host storage:
>   5.19.0-rc7:
>     write: IOPS=933, BW=937MiB/s (982MB/s)(18.3GiB/20020msec); 0 zone resets
>     write: IOPS=993, BW=996MiB/s (1045MB/s)(19.5GiB/20020msec); 0 zone resets
>     write: IOPS=1005, BW=1009MiB/s (1058MB/s)(19.7GiB/20020msec); 0 zone resets
>     write: IOPS=985, BW=989MiB/s (1037MB/s)(19.3GiB/20020msec); 0 zone resets
>     write: IOPS=1020, BW=1024MiB/s (1073MB/s)(20.0GiB/20020msec); 0 zone resets
> 
>   5.19.0-rc7 with my debugging patch:
>     write: IOPS=988, BW=992MiB/s (1040MB/s)(19.7GiB/20384msec); 0 zone resets
>     write: IOPS=995, BW=998MiB/s (1047MB/s)(20.1GiB/20574msec); 0 zone resets
>     write: IOPS=996, BW=1000MiB/s (1048MB/s)(19.5GiB/20020msec); 0 zone resets
>     write: IOPS=995, BW=998MiB/s (1047MB/s)(19.5GiB/20020msec); 0 zone resets
>     write: IOPS=1006, BW=1009MiB/s (1058MB/s)(19.7GiB/20019msec); 0 zone resets

These two basically look identical, which may be why I get the same with
and without your patch. I don't think it makes a difference for this.
Curious how it came about?

>   584b0180f0:
>     write: IOPS=1004, BW=1008MiB/s (1057MB/s)(19.7GiB/20020msec); 0 zone resets
>     write: IOPS=968, BW=971MiB/s (1018MB/s)(19.4GiB/20468msec); 0 zone resets
>     write: IOPS=982, BW=986MiB/s (1033MB/s)(19.3GiB/20020msec); 0 zone resets
>     write: IOPS=1000, BW=1004MiB/s (1053MB/s)(20.1GiB/20461msec); 0 zone resets
>     write: IOPS=903, BW=906MiB/s (950MB/s)(18.1GiB/20419msec); 0 zone resets
> 
>   584b0180f0 with my debugging the patch:
>     write: IOPS=1073, BW=1076MiB/s (1129MB/s)(21.1GiB/20036msec); 0 zone resets
>     write: IOPS=1131, BW=1135MiB/s (1190MB/s)(22.2GiB/20022msec); 0 zone resets
>     write: IOPS=1122, BW=1126MiB/s (1180MB/s)(22.1GiB/20071msec); 0 zone resets
>     write: IOPS=1071, BW=1075MiB/s (1127MB/s)(21.1GiB/20071msec); 0 zone resets
>     write: IOPS=1049, BW=1053MiB/s (1104MB/s)(21.1GiB/20482msec); 0 zone resets

Last one looks like it may be faster indeed. I do wonder if this is
something else, though. There's no reason why -rc7 with that same patch
applied should be any different than 584b0180f0 with it.


these resu
> 
> 
> SATA disk as host storage:
>   5.19.0-rc7:
>     write: IOPS=624, BW=627MiB/s (658MB/s)(12.3GiB/20023msec); 0 zone resets
>     write: IOPS=655, BW=658MiB/s (690MB/s)(12.9GiB/20021msec); 0 zone resets
>     write: IOPS=596, BW=600MiB/s (629MB/s)(12.1GiB/20586msec); 0 zone resets
>     write: IOPS=647, BW=650MiB/s (682MB/s)(12.7GiB/20020msec); 0 zone resets
>     write: IOPS=591, BW=594MiB/s (623MB/s)(12.1GiB/20787msec); 0 zone resets
> 
>   5.19.0-rc7 with my debugging patch:
>     write: IOPS=633, BW=637MiB/s (668MB/s)(12.6GiB/20201msec); 0 zone resets
>     write: IOPS=614, BW=617MiB/s (647MB/s)(13.1GiB/21667msec); 0 zone resets
>     write: IOPS=653, BW=657MiB/s (689MB/s)(12.8GiB/20020msec); 0 zone resets
>     write: IOPS=618, BW=622MiB/s (652MB/s)(12.2GiB/20033msec); 0 zone resets
>     write: IOPS=604, BW=608MiB/s (638MB/s)(12.1GiB/20314msec); 0 zone resets

These again are probably the same, within variance.

>   584b0180f0:
>     write: IOPS=635, BW=638MiB/s (669MB/s)(12.5GiB/20020msec); 0 zone resets
>     write: IOPS=649, BW=652MiB/s (684MB/s)(12.8GiB/20066msec); 0 zone resets
>     write: IOPS=639, BW=642MiB/s (674MB/s)(13.1GiB/20818msec); 0 zone resets
> 
>   584b0180f0 with my debugging patch:
>     write: IOPS=850, BW=853MiB/s (895MB/s)(17.1GiB/20474msec); 0 zone resets
>     write: IOPS=738, BW=742MiB/s (778MB/s)(15.1GiB/20787msec); 0 zone resets
>     write: IOPS=751, BW=755MiB/s (792MB/s)(15.1GiB/20432msec); 0 zone resets

But this one looks like a clear difference.

I'll poke at this tomorrow.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [LKP] Re: [io_uring] 584b0180f0: phoronix-test-suite.fio.SequentialWrite.IO_uring.Yes.Yes.1MB.DefaultTestDirectory.mb_s -10.2% regression
  2022-07-19  2:29             ` Jens Axboe
@ 2022-07-19  8:58               ` Yin Fengwei
  2022-07-20 17:24                 ` Jens Axboe
  0 siblings, 1 reply; 18+ messages in thread
From: Yin Fengwei @ 2022-07-19  8:58 UTC (permalink / raw)
  To: Jens Axboe, kernel test robot; +Cc: LKML, io-uring, lkp, lkp

Hi Jens,

On 7/19/2022 10:29 AM, Jens Axboe wrote:
> I'll poke at this tomorrow.

Just FYI. Another finding (test is based on commit 584b0180f0):
If the code block is put to different function, the fio performance result is
different:

Patch1:
diff --git a/fs/io_uring.c b/fs/io_uring.c
index 616d857f8fc6..b0578a3d063a 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -3184,10 +3184,6 @@ static int io_prep_rw(struct io_kiocb *req, const struct io_uring_sqe *sqe)
        struct file *file = req->file;
        int ret;

-       if (likely(file && (file->f_mode & FMODE_WRITE)))
-               if (!io_req_ffs_set(req))
-                       req->flags |= io_file_get_flags(file) << REQ_F_SUPPORT_NOWAIT_BIT;
-
        kiocb->ki_pos = READ_ONCE(sqe->off);

        ioprio = READ_ONCE(sqe->ioprio);
@@ -7852,6 +7848,10 @@ static int io_submit_sqe(struct io_ring_ctx *ctx, struct io_kiocb *req,
                return 0;
        }

+       if (likely(req->file))
+               if (!io_req_ffs_set(req))
+                       req->flags |= io_file_get_flags(req->file) << REQ_F_SUPPORT_NOWAIT_BIT;
+
        io_queue_sqe(req);
        return 0;


Patch2:
diff --git a/fs/io_uring.c b/fs/io_uring.c
index b0578a3d063a..af705e7ba8d3 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -7639,6 +7639,11 @@ static void io_queue_sqe_fallback(struct io_kiocb *req)
 static inline void io_queue_sqe(struct io_kiocb *req)
        __must_hold(&req->ctx->uring_lock)
 {
+
+       if (likely(req->file))
+               if (!io_req_ffs_set(req))
+                       req->flags |= io_file_get_flags(req->file) << REQ_F_SUPPORT_NOWAIT_BIT;
+
        if (likely(!(req->flags & (REQ_F_FORCE_ASYNC | REQ_F_FAIL))))
                __io_queue_sqe(req);
        else
@@ -7848,10 +7853,6 @@ static int io_submit_sqe(struct io_ring_ctx *ctx, struct io_kiocb *req,
                return 0;
        }

-       if (likely(req->file))
-               if (!io_req_ffs_set(req))
-                       req->flags |= io_file_get_flags(req->file) << REQ_F_SUPPORT_NOWAIT_BIT;
-
        io_queue_sqe(req);
        return 0;
 }

Patch3:
diff --git a/fs/io_uring.c b/fs/io_uring.c
index af705e7ba8d3..5771d6d0ad8a 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -7598,6 +7598,10 @@ static inline void __io_queue_sqe(struct io_kiocb *req)
        struct io_kiocb *linked_timeout;
        int ret;

+       if (likely(req->file))
+               if (!io_req_ffs_set(req))
+                       req->flags |= io_file_get_flags(req->file) << REQ_F_SUPPORT_NOWAIT_BIT;
+
        ret = io_issue_sqe(req, IO_URING_F_NONBLOCK|IO_URING_F_COMPLETE_DEFER);

        if (req->flags & REQ_F_COMPLETE_INLINE) {
@@ -7640,10 +7644,6 @@ static inline void io_queue_sqe(struct io_kiocb *req)
        __must_hold(&req->ctx->uring_lock)
 {

-       if (likely(req->file))
-               if (!io_req_ffs_set(req))
-                       req->flags |= io_file_get_flags(req->file) << REQ_F_SUPPORT_NOWAIT_BIT;
-
        if (likely(!(req->flags & (REQ_F_FORCE_ASYNC | REQ_F_FAIL))))
                __io_queue_sqe(req);
        else


The test result (confirmed on my own test env and LKP):
    patch1 and patch2 have no regression. patch3 has regression.


Regards
Yin, Fengwei

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [LKP] Re: [io_uring] 584b0180f0: phoronix-test-suite.fio.SequentialWrite.IO_uring.Yes.Yes.1MB.DefaultTestDirectory.mb_s -10.2% regression
  2022-07-19  8:58               ` Yin Fengwei
@ 2022-07-20 17:24                 ` Jens Axboe
  2022-07-20 18:13                   ` Jens Axboe
  2022-07-21  3:08                   ` Yin Fengwei
  0 siblings, 2 replies; 18+ messages in thread
From: Jens Axboe @ 2022-07-20 17:24 UTC (permalink / raw)
  To: Yin Fengwei, kernel test robot; +Cc: LKML, io-uring, lkp, lkp

On 7/19/22 2:58 AM, Yin Fengwei wrote:
> Hi Jens,
> 
> On 7/19/2022 10:29 AM, Jens Axboe wrote:
>> I'll poke at this tomorrow.
> 
> Just FYI. Another finding (test is based on commit 584b0180f0):
> If the code block is put to different function, the fio performance result is
> different:

I think this turned out to be a little bit of a goose chase. What's
happening here is that later kernels defer the file assignment, which
means it isn't set if a request is queued with IOSQE_ASYNC. That in
turn, for writes, means that we don't hash it on io-wq insertion, and
then it doesn't get serialized with other writes to that file.

I'll come up with a patch for this that you can test.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [LKP] Re: [io_uring] 584b0180f0: phoronix-test-suite.fio.SequentialWrite.IO_uring.Yes.Yes.1MB.DefaultTestDirectory.mb_s -10.2% regression
  2022-07-20 17:24                 ` Jens Axboe
@ 2022-07-20 18:13                   ` Jens Axboe
  2022-07-20 23:25                     ` Yin Fengwei
  2022-07-21  2:59                     ` Yin Fengwei
  2022-07-21  3:08                   ` Yin Fengwei
  1 sibling, 2 replies; 18+ messages in thread
From: Jens Axboe @ 2022-07-20 18:13 UTC (permalink / raw)
  To: Yin Fengwei, kernel test robot; +Cc: LKML, io-uring, lkp, lkp

On 7/20/22 11:24 AM, Jens Axboe wrote:
> On 7/19/22 2:58 AM, Yin Fengwei wrote:
>> Hi Jens,
>>
>> On 7/19/2022 10:29 AM, Jens Axboe wrote:
>>> I'll poke at this tomorrow.
>>
>> Just FYI. Another finding (test is based on commit 584b0180f0):
>> If the code block is put to different function, the fio performance result is
>> different:
> 
> I think this turned out to be a little bit of a goose chase. What's
> happening here is that later kernels defer the file assignment, which
> means it isn't set if a request is queued with IOSQE_ASYNC. That in
> turn, for writes, means that we don't hash it on io-wq insertion, and
> then it doesn't get serialized with other writes to that file.
> 
> I'll come up with a patch for this that you can test.

Can you try this? It's against 5.19-rc7.


diff --git a/fs/io_uring.c b/fs/io_uring.c
index a01ea49f3017..34758e95990a 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -2015,6 +2015,64 @@ static inline void io_arm_ltimeout(struct io_kiocb *req)
 		__io_arm_ltimeout(req);
 }
 
+static bool io_bdev_nowait(struct block_device *bdev)
+{
+	return !bdev || blk_queue_nowait(bdev_get_queue(bdev));
+}
+
+/*
+ * If we tracked the file through the SCM inflight mechanism, we could support
+ * any file. For now, just ensure that anything potentially problematic is done
+ * inline.
+ */
+static bool __io_file_supports_nowait(struct file *file, umode_t mode)
+{
+	if (S_ISBLK(mode)) {
+		if (IS_ENABLED(CONFIG_BLOCK) &&
+		    io_bdev_nowait(I_BDEV(file->f_mapping->host)))
+			return true;
+		return false;
+	}
+	if (S_ISSOCK(mode))
+		return true;
+	if (S_ISREG(mode)) {
+		if (IS_ENABLED(CONFIG_BLOCK) &&
+		    io_bdev_nowait(file->f_inode->i_sb->s_bdev) &&
+		    file->f_op != &io_uring_fops)
+			return true;
+		return false;
+	}
+
+	/* any ->read/write should understand O_NONBLOCK */
+	if (file->f_flags & O_NONBLOCK)
+		return true;
+	return file->f_mode & FMODE_NOWAIT;
+}
+
+static inline bool io_file_supports_nowait(struct io_kiocb *req)
+{
+	return req->flags & REQ_F_SUPPORT_NOWAIT;
+}
+
+/*
+ * If we tracked the file through the SCM inflight mechanism, we could support
+ * any file. For now, just ensure that anything potentially problematic is done
+ * inline.
+ */
+static unsigned int io_file_get_flags(struct file *file)
+{
+	umode_t mode = file_inode(file)->i_mode;
+	unsigned int res = 0;
+
+	if (S_ISREG(mode))
+		res |= FFS_ISREG;
+	if (__io_file_supports_nowait(file, mode))
+		res |= FFS_NOWAIT;
+	if (io_file_need_scm(file))
+		res |= FFS_SCM;
+	return res;
+}
+
 static void io_prep_async_work(struct io_kiocb *req)
 {
 	const struct io_op_def *def = &io_op_defs[req->opcode];
@@ -2031,6 +2089,9 @@ static void io_prep_async_work(struct io_kiocb *req)
 	if (req->flags & REQ_F_FORCE_ASYNC)
 		req->work.flags |= IO_WQ_WORK_CONCURRENT;
 
+	if (req->file && !io_req_ffs_set(req))
+		req->flags |= io_file_get_flags(req->file) << REQ_F_SUPPORT_NOWAIT_BIT;
+
 	if (req->flags & REQ_F_ISREG) {
 		if (def->hash_reg_file || (ctx->flags & IORING_SETUP_IOPOLL))
 			io_wq_hash_work(&req->work, file_inode(req->file));
@@ -3556,64 +3617,6 @@ static void io_iopoll_req_issued(struct io_kiocb *req, unsigned int issue_flags)
 	}
 }
 
-static bool io_bdev_nowait(struct block_device *bdev)
-{
-	return !bdev || blk_queue_nowait(bdev_get_queue(bdev));
-}
-
-/*
- * If we tracked the file through the SCM inflight mechanism, we could support
- * any file. For now, just ensure that anything potentially problematic is done
- * inline.
- */
-static bool __io_file_supports_nowait(struct file *file, umode_t mode)
-{
-	if (S_ISBLK(mode)) {
-		if (IS_ENABLED(CONFIG_BLOCK) &&
-		    io_bdev_nowait(I_BDEV(file->f_mapping->host)))
-			return true;
-		return false;
-	}
-	if (S_ISSOCK(mode))
-		return true;
-	if (S_ISREG(mode)) {
-		if (IS_ENABLED(CONFIG_BLOCK) &&
-		    io_bdev_nowait(file->f_inode->i_sb->s_bdev) &&
-		    file->f_op != &io_uring_fops)
-			return true;
-		return false;
-	}
-
-	/* any ->read/write should understand O_NONBLOCK */
-	if (file->f_flags & O_NONBLOCK)
-		return true;
-	return file->f_mode & FMODE_NOWAIT;
-}
-
-/*
- * If we tracked the file through the SCM inflight mechanism, we could support
- * any file. For now, just ensure that anything potentially problematic is done
- * inline.
- */
-static unsigned int io_file_get_flags(struct file *file)
-{
-	umode_t mode = file_inode(file)->i_mode;
-	unsigned int res = 0;
-
-	if (S_ISREG(mode))
-		res |= FFS_ISREG;
-	if (__io_file_supports_nowait(file, mode))
-		res |= FFS_NOWAIT;
-	if (io_file_need_scm(file))
-		res |= FFS_SCM;
-	return res;
-}
-
-static inline bool io_file_supports_nowait(struct io_kiocb *req)
-{
-	return req->flags & REQ_F_SUPPORT_NOWAIT;
-}
-
 static int io_prep_rw(struct io_kiocb *req, const struct io_uring_sqe *sqe)
 {
 	struct kiocb *kiocb = &req->rw.kiocb;

-- 
Jens Axboe


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [LKP] Re: [io_uring] 584b0180f0: phoronix-test-suite.fio.SequentialWrite.IO_uring.Yes.Yes.1MB.DefaultTestDirectory.mb_s -10.2% regression
  2022-07-20 18:13                   ` Jens Axboe
@ 2022-07-20 23:25                     ` Yin Fengwei
  2022-07-21  2:59                     ` Yin Fengwei
  1 sibling, 0 replies; 18+ messages in thread
From: Yin Fengwei @ 2022-07-20 23:25 UTC (permalink / raw)
  To: Jens Axboe, kernel test robot; +Cc: LKML, io-uring, lkp, lkp



On 7/21/2022 2:13 AM, Jens Axboe wrote:
> Can you try this? It's against 5.19-rc7.
Sure. I will try it and share the test result. Thanks.


Regards
Yin, Fengwei

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [LKP] Re: [io_uring] 584b0180f0: phoronix-test-suite.fio.SequentialWrite.IO_uring.Yes.Yes.1MB.DefaultTestDirectory.mb_s -10.2% regression
  2022-07-20 18:13                   ` Jens Axboe
  2022-07-20 23:25                     ` Yin Fengwei
@ 2022-07-21  2:59                     ` Yin Fengwei
  1 sibling, 0 replies; 18+ messages in thread
From: Yin Fengwei @ 2022-07-21  2:59 UTC (permalink / raw)
  To: Jens Axboe, kernel test robot; +Cc: LKML, io-uring, lkp, lkp

Hi Jens,

On 7/21/2022 2:13 AM, Jens Axboe wrote:
> Can you try this? It's against 5.19-rc7.
> 
> 
> diff --git a/fs/io_uring.c b/fs/io_uring.c
> index a01ea49f3017..34758e95990a 100644
> --- a/fs/io_uring.c
> +++ b/fs/io_uring.c
> @@ -2015,6 +2015,64 @@ static inline void io_arm_ltimeout(struct io_kiocb *req)
>  		__io_arm_ltimeout(req);
>  }
>  
> +static bool io_bdev_nowait(struct block_device *bdev)
> +{
> +	return !bdev || blk_queue_nowait(bdev_get_queue(bdev));
> +}
> +
> +/*
> + * If we tracked the file through the SCM inflight mechanism, we could support
> + * any file. For now, just ensure that anything potentially problematic is done
> + * inline.
> + */
> +static bool __io_file_supports_nowait(struct file *file, umode_t mode)
> +{
> +	if (S_ISBLK(mode)) {
> +		if (IS_ENABLED(CONFIG_BLOCK) &&
> +		    io_bdev_nowait(I_BDEV(file->f_mapping->host)))
> +			return true;
> +		return false;
> +	}
> +	if (S_ISSOCK(mode))
> +		return true;
> +	if (S_ISREG(mode)) {
> +		if (IS_ENABLED(CONFIG_BLOCK) &&
> +		    io_bdev_nowait(file->f_inode->i_sb->s_bdev) &&
> +		    file->f_op != &io_uring_fops)
> +			return true;
> +		return false;
> +	}
> +
> +	/* any ->read/write should understand O_NONBLOCK */
> +	if (file->f_flags & O_NONBLOCK)
> +		return true;
> +	return file->f_mode & FMODE_NOWAIT;
> +}
> +
> +static inline bool io_file_supports_nowait(struct io_kiocb *req)
> +{
> +	return req->flags & REQ_F_SUPPORT_NOWAIT;
> +}
> +
> +/*
> + * If we tracked the file through the SCM inflight mechanism, we could support
> + * any file. For now, just ensure that anything potentially problematic is done
> + * inline.
> + */
> +static unsigned int io_file_get_flags(struct file *file)
> +{
> +	umode_t mode = file_inode(file)->i_mode;
> +	unsigned int res = 0;
> +
> +	if (S_ISREG(mode))
> +		res |= FFS_ISREG;
> +	if (__io_file_supports_nowait(file, mode))
> +		res |= FFS_NOWAIT;
> +	if (io_file_need_scm(file))
> +		res |= FFS_SCM;
> +	return res;
> +}
> +
>  static void io_prep_async_work(struct io_kiocb *req)
>  {
>  	const struct io_op_def *def = &io_op_defs[req->opcode];
> @@ -2031,6 +2089,9 @@ static void io_prep_async_work(struct io_kiocb *req)
>  	if (req->flags & REQ_F_FORCE_ASYNC)
>  		req->work.flags |= IO_WQ_WORK_CONCURRENT;
>  
> +	if (req->file && !io_req_ffs_set(req))
> +		req->flags |= io_file_get_flags(req->file) << REQ_F_SUPPORT_NOWAIT_BIT;
> +
>  	if (req->flags & REQ_F_ISREG) {
>  		if (def->hash_reg_file || (ctx->flags & IORING_SETUP_IOPOLL))
>  			io_wq_hash_work(&req->work, file_inode(req->file));
> @@ -3556,64 +3617,6 @@ static void io_iopoll_req_issued(struct io_kiocb *req, unsigned int issue_flags)
>  	}
>  }
>  
> -static bool io_bdev_nowait(struct block_device *bdev)
> -{
> -	return !bdev || blk_queue_nowait(bdev_get_queue(bdev));
> -}
> -
> -/*
> - * If we tracked the file through the SCM inflight mechanism, we could support
> - * any file. For now, just ensure that anything potentially problematic is done
> - * inline.
> - */
> -static bool __io_file_supports_nowait(struct file *file, umode_t mode)
> -{
> -	if (S_ISBLK(mode)) {
> -		if (IS_ENABLED(CONFIG_BLOCK) &&
> -		    io_bdev_nowait(I_BDEV(file->f_mapping->host)))
> -			return true;
> -		return false;
> -	}
> -	if (S_ISSOCK(mode))
> -		return true;
> -	if (S_ISREG(mode)) {
> -		if (IS_ENABLED(CONFIG_BLOCK) &&
> -		    io_bdev_nowait(file->f_inode->i_sb->s_bdev) &&
> -		    file->f_op != &io_uring_fops)
> -			return true;
> -		return false;
> -	}
> -
> -	/* any ->read/write should understand O_NONBLOCK */
> -	if (file->f_flags & O_NONBLOCK)
> -		return true;
> -	return file->f_mode & FMODE_NOWAIT;
> -}
> -
> -/*
> - * If we tracked the file through the SCM inflight mechanism, we could support
> - * any file. For now, just ensure that anything potentially problematic is done
> - * inline.
> - */
> -static unsigned int io_file_get_flags(struct file *file)
> -{
> -	umode_t mode = file_inode(file)->i_mode;
> -	unsigned int res = 0;
> -
> -	if (S_ISREG(mode))
> -		res |= FFS_ISREG;
> -	if (__io_file_supports_nowait(file, mode))
> -		res |= FFS_NOWAIT;
> -	if (io_file_need_scm(file))
> -		res |= FFS_SCM;
> -	return res;
> -}
> -
> -static inline bool io_file_supports_nowait(struct io_kiocb *req)
> -{
> -	return req->flags & REQ_F_SUPPORT_NOWAIT;
> -}
> -
>  static int io_prep_rw(struct io_kiocb *req, const struct io_uring_sqe *sqe)
>  {
>  	struct kiocb *kiocb = &req->rw.kiocb;
> 
> -- Jens Axboe
This change could make regression gone. The test result is as following:
28d3a5662d44077aa6eb42bfcfa is your patch


584b0180f0f4d67d                   v5.19-rc7 28d3a5662d44077aa6eb42bfcfa
---------------- --------------------------- ---------------------------
       fail:runs  %reproduction    fail:runs  %reproduction    fail:runs
           |             |             |             |             |
        503:3         9297%         782:3          178%         509:3     dmesg.timestamp:last
          3:3            0%           3:3            0%           3:3     pmeter.pmeter.fail
           :3          100%           3:3          100%           3:3     kmsg.I/O_error,dev_loop#,sector#op#:(READ)flags#phys_seg#prio_class
           :3         3755%         112:3         4016%         120:3     kmsg.timestamp:I/O_error,dev_loop#,sector#op#:(READ)flags#phys_seg#prio_class
        465:3         9221%         742:3          235%         473:3     kmsg.timestamp:last
         %stddev     %change         %stddev     %change         %stddev
             \          |                \          |                \
    972.00            -0.3%     968.67           +11.4%       1082        phoronix-test-suite.fio.SequentialWrite.IO_uring.Yes.Yes.1MB.DefaultTestDirectory.iops
    975.00            -0.3%     972.33           +11.5%       1086        phoronix-test-suite.fio.SequentialWrite.IO_uring.Yes.Yes.1MB.DefaultTestDirectory.mb_s

Comparing to v5.19-rc7 and 584b0180f0f4d67d, it could bring 11% regression back.
Thanks.


Regards
Yin, Fengwei

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [LKP] Re: [io_uring] 584b0180f0: phoronix-test-suite.fio.SequentialWrite.IO_uring.Yes.Yes.1MB.DefaultTestDirectory.mb_s -10.2% regression
  2022-07-20 17:24                 ` Jens Axboe
  2022-07-20 18:13                   ` Jens Axboe
@ 2022-07-21  3:08                   ` Yin Fengwei
  1 sibling, 0 replies; 18+ messages in thread
From: Yin Fengwei @ 2022-07-21  3:08 UTC (permalink / raw)
  To: Jens Axboe, kernel test robot; +Cc: LKML, io-uring, lkp, lkp

Hi Jens,

On 7/21/2022 1:24 AM, Jens Axboe wrote:
> I think this turned out to be a little bit of a goose chase. What's
> happening here is that later kernels defer the file assignment, which
> means it isn't set if a request is queued with IOSQE_ASYNC. That in
> turn, for writes, means that we don't hash it on io-wq insertion, and
> then it doesn't get serialized with other writes to that file.
Thanks a lot for the detail behavior explanation.


Regards
Yin, Fengwei

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2022-07-21  3:08 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <20220527092432.GE11731@xsang-OptiPlex-9020>
2022-05-27 13:50 ` [io_uring] 584b0180f0: phoronix-test-suite.fio.SequentialWrite.IO_uring.Yes.Yes.1MB.DefaultTestDirectory.mb_s -10.2% regression Jens Axboe
2022-06-08  8:00   ` Oliver Sang
2022-06-14  1:54   ` [LKP] " Yin Fengwei
2022-07-12  8:06   ` Yin Fengwei
2022-07-15 15:58     ` Jens Axboe
2022-07-18  0:58       ` Yin Fengwei
2022-07-18  1:14         ` Jens Axboe
2022-07-18  3:30       ` Yin Fengwei
2022-07-18 16:27         ` Jens Axboe
2022-07-19  0:27           ` Yin Fengwei
2022-07-19  2:16           ` Yin Fengwei
2022-07-19  2:29             ` Jens Axboe
2022-07-19  8:58               ` Yin Fengwei
2022-07-20 17:24                 ` Jens Axboe
2022-07-20 18:13                   ` Jens Axboe
2022-07-20 23:25                     ` Yin Fengwei
2022-07-21  2:59                     ` Yin Fengwei
2022-07-21  3:08                   ` Yin Fengwei

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).