All of lore.kernel.org
 help / color / mirror / Atom feed
* Performance regression caused by stack operation of regular file
@ 2019-10-28  6:21 JeffleXu
  2019-11-01  7:27 ` Joseph Qi
  0 siblings, 1 reply; 4+ messages in thread
From: JeffleXu @ 2019-10-28  6:21 UTC (permalink / raw)
  To: miklos; +Cc: linux-unionfs

Hi, Miklos,

I noticed a performance regression of reading/writing files in mergeddir 
caused by commit a6518f73e60e5044656d1ba587e7463479a9381a (vfs: don't 
open real), using unixbench fstime.


Reproduce Steps:

1. cd /mnt/lower/ && git clone 
https://github.com/kdlucas/byte-unixbench.git

2. mount -t overlay overlay 
-olowerdir=/mnt/lower,upperdir=/mnt/upper,workdir=/mnt/work /mnt/merge

3. cd /mnt/merge/byte-unixbench/UnixBench && ./Run -c 1 -i 1 fstime


The score is 2870 before applying the patch, while it is 1780 after 
applying the patch, causing a 40% performance regression.

The testcase repeatedly reads 1024 bytes from one file and writes the 
readed data into another file, while both these two files

are created under /mnt/merge/tmp.  I have testsed the latest kernel 
5.4.0-rc4+, same results.


The perf shows that there's extra one call of file_remove_privs(), 
override_creds() and revert_creds() every write() syscall,

among which file_remove_privs() is pretty expensive.


- perf data before applying the patch.

```

-   53.00%     0.93%  fstime    [kernel.kallsyms]   [k] __vfs_write
    - 52.08% __vfs_write
       - 51.94% ext4_file_write_iter
          + 48.89% __generic_file_write_iter
            0.83% down_write_trylock
            0.79% up_write

```


- perf data after applying the patch.

```

+   94.88%     0.00%  fstime    [kernel.kallsyms]   [k] 
entry_SYSCALL_64_after_hwframe
+   94.88%     4.67%  fstime    [kernel.kallsyms]   [k] do_syscall_64
+   66.08%     1.60%  fstime    libc-2.17.so        [.] __GI___libc_write
+   62.37%     0.23%  fstime    [kernel.kallsyms]   [k] ksys_write
+   61.74%     0.62%  fstime    [kernel.kallsyms]   [k] vfs_write
-   60.10%     0.49%  fstime    [kernel.kallsyms]   [k] __vfs_write
    - 59.61% __vfs_write
       - 59.56% ovl_write_iter
          - 33.81% do_iter_write
             - 32.50% do_iter_readv_writev
                + ext4_file_write_iter
          + 19.15% file_remove_privs
            2.15% revert_creds
            2.02% override_creds
            0.64% down_write
            0.63% up_write

```


Regards.

Jeffle

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: Performance regression caused by stack operation of regular file
  2019-10-28  6:21 Performance regression caused by stack operation of regular file JeffleXu
@ 2019-11-01  7:27 ` Joseph Qi
  2019-11-01 14:25   ` Amir Goldstein
  0 siblings, 1 reply; 4+ messages in thread
From: Joseph Qi @ 2019-11-01  7:27 UTC (permalink / raw)
  To: JeffleXu, miklos, Amir Goldstein; +Cc: linux-unionfs

Hi Miklos & Amir,
Could you please take a look at this?
It behaves different between the latest kernel and an old one, e.g. 4.9.

Thanks,
Joseph

On 19/10/28 14:21, JeffleXu wrote:
> Hi, Miklos,
> 
> I noticed a performance regression of reading/writing files in mergeddir caused by commit a6518f73e60e5044656d1ba587e7463479a9381a (vfs: don't open real), using unixbench fstime.
> 
> 
> Reproduce Steps:
> 
> 1. cd /mnt/lower/ && git clone https://github.com/kdlucas/byte-unixbench.git
> 
> 2. mount -t overlay overlay -olowerdir=/mnt/lower,upperdir=/mnt/upper,workdir=/mnt/work /mnt/merge
> 
> 3. cd /mnt/merge/byte-unixbench/UnixBench && ./Run -c 1 -i 1 fstime
> 
> 
> The score is 2870 before applying the patch, while it is 1780 after applying the patch, causing a 40% performance regression.
> 
> The testcase repeatedly reads 1024 bytes from one file and writes the readed data into another file, while both these two files
> 
> are created under /mnt/merge/tmp.  I have testsed the latest kernel 5.4.0-rc4+, same results.
> 
> 
> The perf shows that there's extra one call of file_remove_privs(), override_creds() and revert_creds() every write() syscall,
> 
> among which file_remove_privs() is pretty expensive.
> 
> 
> - perf data before applying the patch.
> 
> ```
> 
> -   53.00%     0.93%  fstime    [kernel.kallsyms]   [k] __vfs_write
>    - 52.08% __vfs_write
>       - 51.94% ext4_file_write_iter
>          + 48.89% __generic_file_write_iter
>            0.83% down_write_trylock
>            0.79% up_write
> 
> ```
> 
> 
> - perf data after applying the patch.
> 
> ```
> 
> +   94.88%     0.00%  fstime    [kernel.kallsyms]   [k] entry_SYSCALL_64_after_hwframe
> +   94.88%     4.67%  fstime    [kernel.kallsyms]   [k] do_syscall_64
> +   66.08%     1.60%  fstime    libc-2.17.so        [.] __GI___libc_write
> +   62.37%     0.23%  fstime    [kernel.kallsyms]   [k] ksys_write
> +   61.74%     0.62%  fstime    [kernel.kallsyms]   [k] vfs_write
> -   60.10%     0.49%  fstime    [kernel.kallsyms]   [k] __vfs_write
>    - 59.61% __vfs_write
>       - 59.56% ovl_write_iter
>          - 33.81% do_iter_write
>             - 32.50% do_iter_readv_writev
>                + ext4_file_write_iter
>          + 19.15% file_remove_privs
>            2.15% revert_creds
>            2.02% override_creds
>            0.64% down_write
>            0.63% up_write
> 
> ```
> 
> 
> Regards.
> 
> Jeffle
> 

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: Performance regression caused by stack operation of regular file
  2019-11-01  7:27 ` Joseph Qi
@ 2019-11-01 14:25   ` Amir Goldstein
  2019-11-04  1:03     ` Joseph Qi
  0 siblings, 1 reply; 4+ messages in thread
From: Amir Goldstein @ 2019-11-01 14:25 UTC (permalink / raw)
  To: Joseph Qi; +Cc: JeffleXu, Miklos Szeredi, overlayfs

On Fri, Nov 1, 2019 at 9:27 AM Joseph Qi <jiangqi903@gmail.com> wrote:
>
> Hi Miklos & Amir,
> Could you please take a look at this?
> It behaves different between the latest kernel and an old one, e.g. 4.9.

Not surprisingly.
Stacked file operations in 4.19 shuffled the cards.
See below.

>
> Thanks,
> Joseph
>
> On 19/10/28 14:21, JeffleXu wrote:
> > Hi, Miklos,
> >
> > I noticed a performance regression of reading/writing files in mergeddir caused by commit a6518f73e60e5044656d1ba587e7463479a9381a (vfs: don't open real), using unixbench fstime.
> >
> >
> > Reproduce Steps:
> >
> > 1. cd /mnt/lower/ && git clone https://github.com/kdlucas/byte-unixbench.git
> >
> > 2. mount -t overlay overlay -olowerdir=/mnt/lower,upperdir=/mnt/upper,workdir=/mnt/work /mnt/merge
> >
> > 3. cd /mnt/merge/byte-unixbench/UnixBench && ./Run -c 1 -i 1 fstime
> >
> >
> > The score is 2870 before applying the patch, while it is 1780 after applying the patch, causing a 40% performance regression.
> >
> > The testcase repeatedly reads 1024 bytes from one file and writes the readed data into another file, while both these two files
> >
> > are created under /mnt/merge/tmp.  I have testsed the latest kernel 5.4.0-rc4+, same results.
> >

Is this really a workload that you are interested in or just a random
micro benchmark?
If kernel changes behavior for the better in some workloads and for the worst
in other workloads, it is important to distinguish between the case of real
life workloads and less meaningful micro benchmarks that do not really have
that much effect on real world.

> >
> > The perf shows that there's extra one call of file_remove_privs(), override_creds() and revert_creds() every write() syscall,
> >
> > among which file_remove_privs() is pretty expensive.
> >

Interesting.
If this is indeed the reasons for the perf regression
then it boils down to performance vs. security, because if kernel
4.9 is truly faster due to skipped file_remove_privs() and override_creds()
then it is not really enforcing security in a consistent manner.
It's true that in the common case, mounter credentials are a super set of
user credentials, so file_remove_privs() and  security_file_permission()
with user credentials are most of the time practically enough, but that is
not universally true.

If the workload is truly important to you, please try to figure out
why the extra calls are so expensive.
Do you have any LSMs enabled?

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: Performance regression caused by stack operation of regular file
  2019-11-01 14:25   ` Amir Goldstein
@ 2019-11-04  1:03     ` Joseph Qi
  0 siblings, 0 replies; 4+ messages in thread
From: Joseph Qi @ 2019-11-04  1:03 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: JeffleXu, Miklos Szeredi, overlayfs

Hi Amir,
Thanks for your valuable inputs.

On 19/11/1 22:25, Amir Goldstein wrote:
> On Fri, Nov 1, 2019 at 9:27 AM Joseph Qi <jiangqi903@gmail.com> wrote:
>>
>> Hi Miklos & Amir,
>> Could you please take a look at this?
>> It behaves different between the latest kernel and an old one, e.g. 4.9.
> 
> Not surprisingly.
> Stacked file operations in 4.19 shuffled the cards.
> See below.
> 
>>
>> Thanks,
>> Joseph
>>
>> On 19/10/28 14:21, JeffleXu wrote:
>>> Hi, Miklos,
>>>
>>> I noticed a performance regression of reading/writing files in mergeddir caused by commit a6518f73e60e5044656d1ba587e7463479a9381a (vfs: don't open real), using unixbench fstime.
>>>
>>>
>>> Reproduce Steps:
>>>
>>> 1. cd /mnt/lower/ && git clone https://github.com/kdlucas/byte-unixbench.git
>>>
>>> 2. mount -t overlay overlay -olowerdir=/mnt/lower,upperdir=/mnt/upper,workdir=/mnt/work /mnt/merge
>>>
>>> 3. cd /mnt/merge/byte-unixbench/UnixBench && ./Run -c 1 -i 1 fstime
>>>
>>>
>>> The score is 2870 before applying the patch, while it is 1780 after applying the patch, causing a 40% performance regression.
>>>
>>> The testcase repeatedly reads 1024 bytes from one file and writes the readed data into another file, while both these two files
>>>
>>> are created under /mnt/merge/tmp.  I have testsed the latest kernel 5.4.0-rc4+, same results.
>>>
> 
> Is this really a workload that you are interested in or just a random
> micro benchmark?
> If kernel changes behavior for the better in some workloads and for the worst
> in other workloads, it is important to distinguish between the case of real
> life workloads and less meaningful micro benchmarks that do not really have
> that much effect on real world.
> 
We'll figure out if there is a real use case with respect to this benchmark.

>>>
>>> The perf shows that there's extra one call of file_remove_privs(), override_creds() and revert_creds() every write() syscall,
>>>
>>> among which file_remove_privs() is pretty expensive.
>>>
> 
> Interesting.
> If this is indeed the reasons for the perf regression
> then it boils down to performance vs. security, because if kernel
> 4.9 is truly faster due to skipped file_remove_privs() and override_creds()
> then it is not really enforcing security in a consistent manner.
> It's true that in the common case, mounter credentials are a super set of
> user credentials, so file_remove_privs() and  security_file_permission()
> with user credentials are most of the time practically enough, but that is
> not universally true.
> > If the workload is truly important to you, please try to figure out
> why the extra calls are so expensive.
> Do you have any LSMs enabled?
> 
Yes, we've enabled SELinux by default.

Thanks,
Joseph

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2019-11-04  1:03 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-10-28  6:21 Performance regression caused by stack operation of regular file JeffleXu
2019-11-01  7:27 ` Joseph Qi
2019-11-01 14:25   ` Amir Goldstein
2019-11-04  1:03     ` Joseph Qi

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.