linux-unionfs.vger.kernel.org archive mirror
* [Regression] performance regression since v4.19
@ 2019-11-12  7:53 Jiufei Xue
  2019-11-12  9:12 ` Miklos Szeredi
  0 siblings, 1 reply; 6+ messages in thread
From: Jiufei Xue @ 2019-11-12  7:53 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: overlayfs, Joseph Qi

Hi Miklos,

A performance regression has been observed since Linux v4.19 when
running an aio test using fio with iodepth 128 on overlayfs. We found
that the queue depth of the device is always 1, which is unexpected.
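
For reference, a fio job along these lines reproduces the setup (the
parameters below are illustrative, not our exact job file):

```ini
; illustrative fio job: async direct random reads at iodepth 128 on an
; overlayfs mount (paths and sizes are placeholders)
[overlay-aio]
ioengine=libaio
direct=1
rw=randread
bs=4k
iodepth=128
size=1g
directory=/mnt/overlay
```

With direct=1 and libaio, requests should be submitted asynchronously,
so the device queue depth should approach 128 rather than stay at 1.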

After investigation, we found that commit 16914e6fc7
(“ovl: add ovl_read_iter()”) and commit 2a92e07edc
(“ovl: add ovl_write_iter()”) use do_iter_readv_writev() to submit
requests to the real filesystem. Async IOs are converted to sync IOs
there, which causes the performance regression.

Is this a design flaw, or is it the intended behavior?

Thanks a lot,
Jiufei

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [Regression] performance regression since v4.19
  2019-11-12  7:53 [Regression] performance regression since v4.19 Jiufei Xue
@ 2019-11-12  9:12 ` Miklos Szeredi
  2019-11-12 10:48   ` Amir Goldstein
  2019-11-13  6:31   ` Jiufei Xue
  0 siblings, 2 replies; 6+ messages in thread
From: Miklos Szeredi @ 2019-11-12  9:12 UTC (permalink / raw)
  To: Jiufei Xue; +Cc: overlayfs, Joseph Qi

On Tue, Nov 12, 2019 at 8:53 AM Jiufei Xue <jiufei.xue@linux.alibaba.com> wrote:
>
> Hi Miklos,
>
> A performance regression has been observed since Linux v4.19 when
> running an aio test using fio with iodepth 128 on overlayfs. We found
> that the queue depth of the device is always 1, which is unexpected.
>
> After investigation, we found that commit 16914e6fc7
> (“ovl: add ovl_read_iter()”) and commit 2a92e07edc
> (“ovl: add ovl_write_iter()”) use do_iter_readv_writev() to submit
> requests to the real filesystem. Async IOs are converted to sync IOs
> there, which causes the performance regression.
>
> Is this a design flaw, or is it the intended behavior?

It's not theoretically difficult to fix. The challenge is to do it
without too much complexity or code duplication.

Perhaps the best approach would be to introduce VFS helpers
specifically for stacked operation, such as:

  ssize_t vfs_read_iter_on_file(struct file *file, struct kiocb *orig_iocb,
                                struct iov_iter *iter);
  ssize_t vfs_write_iter_on_file(struct file *file, struct kiocb *orig_iocb,
                                 struct iov_iter *iter);

Implementation-wise I'm quite sure we need to allocate a new kiocb and
initialize it from the old one, adding our own completion callback.

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [Regression] performance regression since v4.19
  2019-11-12  9:12 ` Miklos Szeredi
@ 2019-11-12 10:48   ` Amir Goldstein
  2019-11-12 11:24     ` Miklos Szeredi
  2019-11-13  6:33     ` Jiufei Xue
  2019-11-13  6:31   ` Jiufei Xue
  1 sibling, 2 replies; 6+ messages in thread
From: Amir Goldstein @ 2019-11-12 10:48 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Jiufei Xue, overlayfs, Joseph Qi

On Tue, Nov 12, 2019 at 11:15 AM Miklos Szeredi <miklos@szeredi.hu> wrote:
>
> On Tue, Nov 12, 2019 at 8:53 AM Jiufei Xue <jiufei.xue@linux.alibaba.com> wrote:
> >
> > Hi Miklos,
> >
> > A performance regression has been observed since Linux v4.19 when
> > running an aio test using fio with iodepth 128 on overlayfs. We found
> > that the queue depth of the device is always 1, which is unexpected.
> >
> > After investigation, we found that commit 16914e6fc7
> > (“ovl: add ovl_read_iter()”) and commit 2a92e07edc
> > (“ovl: add ovl_write_iter()”) use do_iter_readv_writev() to submit
> > requests to the real filesystem. Async IOs are converted to sync IOs
> > there, which causes the performance regression.
> >
> > Is this a design flaw, or is it the intended behavior?
>
> It's not theoretically difficult to fix. The challenge is to do it
> without too much complexity or code duplication.
>
> Perhaps the best approach would be to introduce VFS helpers
> specifically for stacked operation, such as:
>
>   ssize_t vfs_read_iter_on_file(struct file *file, struct kiocb *orig_iocb,
>                                 struct iov_iter *iter);
>   ssize_t vfs_write_iter_on_file(struct file *file, struct kiocb *orig_iocb,
>                                  struct iov_iter *iter);
>
> Implementation-wise I'm quite sure we need to allocate a new kiocb and
> initialize it from the old one, adding our own completion callback.
>

Isn't it "just" a matter of implementing ovl-aops, so that
generic_file_read/write_iter() would do the aio properly?

I don't remember in what state we left the ovl-aops experiment [1].
IIRC, it passed most xfstests but still had some corner cases to
cover.

Jiufei,

If you are interested, you can try out the experimental code [2] to
see how it plays with aio, although since readpages/writepages
are not implemented, overall performance may not be better (and may
even be worse).

Thanks,
Amir.

[1] https://marc.info/?l=linux-unionfs&m=154995908004146&w=2
[2] https://github.com/amir73il/linux/commits/ovl-aops-wip


* Re: [Regression] performance regression since v4.19
  2019-11-12 10:48   ` Amir Goldstein
@ 2019-11-12 11:24     ` Miklos Szeredi
  2019-11-13  6:33     ` Jiufei Xue
  1 sibling, 0 replies; 6+ messages in thread
From: Miklos Szeredi @ 2019-11-12 11:24 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Jiufei Xue, overlayfs, Joseph Qi

On Tue, Nov 12, 2019 at 11:48 AM Amir Goldstein <amir73il@gmail.com> wrote:
>
> On Tue, Nov 12, 2019 at 11:15 AM Miklos Szeredi <miklos@szeredi.hu> wrote:
> >
> > On Tue, Nov 12, 2019 at 8:53 AM Jiufei Xue <jiufei.xue@linux.alibaba.com> wrote:
> > >
> > > Hi Miklos,
> > >
> > > A performance regression has been observed since Linux v4.19 when
> > > running an aio test using fio with iodepth 128 on overlayfs. We found
> > > that the queue depth of the device is always 1, which is unexpected.
> > >
> > > After investigation, we found that commit 16914e6fc7
> > > (“ovl: add ovl_read_iter()”) and commit 2a92e07edc
> > > (“ovl: add ovl_write_iter()”) use do_iter_readv_writev() to submit
> > > requests to the real filesystem. Async IOs are converted to sync IOs
> > > there, which causes the performance regression.
> > >
> > > Is this a design flaw, or is it the intended behavior?
> >
> > It's not theoretically difficult to fix. The challenge is to do it
> > without too much complexity or code duplication.
> >
> > Perhaps the best approach would be to introduce VFS helpers
> > specifically for stacked operation, such as:
> >
> >   ssize_t vfs_read_iter_on_file(struct file *file, struct kiocb *orig_iocb,
> >                                 struct iov_iter *iter);
> >   ssize_t vfs_write_iter_on_file(struct file *file, struct kiocb *orig_iocb,
> >                                  struct iov_iter *iter);
> >
> > Implementation-wise I'm quite sure we need to allocate a new kiocb and
> > initialize it from the old one, adding our own completion callback.
> >
>
> Isn't it "just" a matter of implementing ovl-aops, so that
> generic_file_read/write_iter() would do the aio properly?

Not quite. First of all, generally only direct I/O can be async,
which means ovl_direct_IO needs to handle async iocbs. But, like the
stacked versions of read/write_iter, ovl_direct_IO just calls
vfs_iter_*() at the moment.

Also, the generic_file_read/write_iter() functions are called only if
the file has been copied up. For lower files we want to retain the
stacked read implementation for page cache sharing.

Thanks,
Miklos


* Re: [Regression] performance regression since v4.19
  2019-11-12  9:12 ` Miklos Szeredi
  2019-11-12 10:48   ` Amir Goldstein
@ 2019-11-13  6:31   ` Jiufei Xue
  1 sibling, 0 replies; 6+ messages in thread
From: Jiufei Xue @ 2019-11-13  6:31 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: overlayfs, Joseph Qi



On 2019/11/12 5:12 PM, Miklos Szeredi wrote:
> On Tue, Nov 12, 2019 at 8:53 AM Jiufei Xue <jiufei.xue@linux.alibaba.com> wrote:
>>
>> Hi Miklos,
>>
>> A performance regression has been observed since Linux v4.19 when
>> running an aio test using fio with iodepth 128 on overlayfs. We found
>> that the queue depth of the device is always 1, which is unexpected.
>>
>> After investigation, we found that commit 16914e6fc7
>> (“ovl: add ovl_read_iter()”) and commit 2a92e07edc
>> (“ovl: add ovl_write_iter()”) use do_iter_readv_writev() to submit
>> requests to the real filesystem. Async IOs are converted to sync IOs
>> there, which causes the performance regression.
>>
>> Is this a design flaw, or is it the intended behavior?
> 
> It's not theoretically difficult to fix. The challenge is to do it
> without too much complexity or code duplication.
> 
> Perhaps the best approach would be to introduce VFS helpers
> specifically for stacked operation, such as:
> 
>   ssize_t vfs_read_iter_on_file(struct file *file, struct kiocb *orig_iocb,
>                                 struct iov_iter *iter);
>   ssize_t vfs_write_iter_on_file(struct file *file, struct kiocb *orig_iocb,
>                                  struct iov_iter *iter);
> 
> Implementation-wise I'm quite sure we need to allocate a new kiocb and
> initialize it from the old one, adding our own completion callback.
>
Yes, I totally agree with you. I will try to fix this regression and send
the patch.

Thanks,
Jiufei

> Thanks,
> Miklos
> 


* Re: [Regression] performance regression since v4.19
  2019-11-12 10:48   ` Amir Goldstein
  2019-11-12 11:24     ` Miklos Szeredi
@ 2019-11-13  6:33     ` Jiufei Xue
  1 sibling, 0 replies; 6+ messages in thread
From: Jiufei Xue @ 2019-11-13  6:33 UTC (permalink / raw)
  To: Amir Goldstein, Miklos Szeredi; +Cc: overlayfs, Joseph Qi


Hi,
On 2019/11/12 6:48 PM, Amir Goldstein wrote:
> On Tue, Nov 12, 2019 at 11:15 AM Miklos Szeredi <miklos@szeredi.hu> wrote:
>>
>> On Tue, Nov 12, 2019 at 8:53 AM Jiufei Xue <jiufei.xue@linux.alibaba.com> wrote:
>>>
>>> Hi Miklos,
>>>
>>> A performance regression has been observed since Linux v4.19 when
>>> running an aio test using fio with iodepth 128 on overlayfs. We found
>>> that the queue depth of the device is always 1, which is unexpected.
>>>
>>> After investigation, we found that commit 16914e6fc7
>>> (“ovl: add ovl_read_iter()”) and commit 2a92e07edc
>>> (“ovl: add ovl_write_iter()”) use do_iter_readv_writev() to submit
>>> requests to the real filesystem. Async IOs are converted to sync IOs
>>> there, which causes the performance regression.
>>>
>>> Is this a design flaw, or is it the intended behavior?
>>
>> It's not theoretically difficult to fix. The challenge is to do it
>> without too much complexity or code duplication.
>>
>> Perhaps the best approach would be to introduce VFS helpers
>> specifically for stacked operation, such as:
>>
>>   ssize_t vfs_read_iter_on_file(struct file *file, struct kiocb *orig_iocb,
>>                                 struct iov_iter *iter);
>>   ssize_t vfs_write_iter_on_file(struct file *file, struct kiocb *orig_iocb,
>>                                  struct iov_iter *iter);
>>
>> Implementation-wise I'm quite sure we need to allocate a new kiocb and
>> initialize it from the old one, adding our own completion callback.
>>
> 
> Isn't it "just" a matter of implementing ovl-aops, so that
> generic_file_read/write_iter() would do the aio properly?
>
> I don't remember in what state we left the ovl-aops experiment [1].
> IIRC, it passed most xfstests but still had some corner cases to
> cover.
>
> Jiufei,
>
> If you are interested, you can try out the experimental code [2] to
> see how it plays with aio, although since readpages/writepages
> are not implemented, overall performance may not be better (and may
> even be worse).
>

Thanks for your reply.

I have checked the experimental code [2] and found that ovl_direct_IO()
also converts async IOs to sync ones, so I don't think it can solve
the regression.

Thanks,
Jiufei.

> Thanks,
> Amir.
> 
> [1] https://marc.info/?l=linux-unionfs&m=154995908004146&w=2
> [2] https://github.com/amir73il/linux/commits/ovl-aops-wip
> 


end of thread, other threads:[~2019-11-13  6:33 UTC | newest]

Thread overview: 6+ messages
2019-11-12  7:53 [Regression] performance regression since v4.19 Jiufei Xue
2019-11-12  9:12 ` Miklos Szeredi
2019-11-12 10:48   ` Amir Goldstein
2019-11-12 11:24     ` Miklos Szeredi
2019-11-13  6:33     ` Jiufei Xue
2019-11-13  6:31   ` Jiufei Xue
