From: Bernd Schubert <bschubert@ddn.com>
To: Vivek Goyal <vgoyal@redhat.com>,
	Dharmendra Hans <dharamhans87@gmail.com>
Cc: Miklos Szeredi <miklos@szeredi.hu>,
	linux-fsdevel@vger.kernel.org,
	fuse-devel <fuse-devel@lists.sourceforge.net>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v4 0/3] FUSE: Implement atomic lookup + open/create
Date: Thu, 5 May 2022 17:13:00 +0200	[thread overview]
Message-ID: <882fbf7f-a56b-1e82-a158-9e2186ec7c4c@ddn.com> (raw)
In-Reply-To: <YnPI6f2fRZUXbCFP@redhat.com>



On 5/5/22 14:54, Vivek Goyal wrote:
> On Thu, May 05, 2022 at 11:42:51AM +0530, Dharmendra Hans wrote:
>> Here are the numbers I took last time. These were taken on tmpfs to
>> actually see the effect of reduced calls. On local file systems it
>> might not be that much visible. But we have observed that on systems
>> where we have thousands of clients hammering the metadata servers, it
>> helps a lot (We did not take numbers yet as  we are required to change
>> a lot of our client code but would be doing it later on).
>>
>> Note that for a change in performance number due to the new version of
>> these patches, we have just refactored the code and functionality has
>> remained the same since then.
>>
>> here is the link to the performance numbers
>> https://lore.kernel.org/linux-fsdevel/20220322121212.5087-1-dharamhans87@gmail.com/
> 
> There is a lot going in that table. Trying to understand it.
> 
> - Why care about No-Flush. I mean that's independent of these changes,
>    right?  I am assuming this means that upon file close do not send
>    a flush to fuse server. Not sure how bringing No-Flush into the
>    mix is helpful here.


It basically removes another call from kernel to user space. The more 
calls there are in total, the lower the relative gain from atomic open 
looks - e.g. if a create is lookup + create + flush, atomic create 
removes one request out of three, but without the flush it removes one 
out of two.


> 
> - What is "Patched Libfuse"? I am assuming that these are changes
>    needed in libfuse to support atomic create + atomic open. Similarly
>    assuming "Patched FuseK" means patched kernel with your changes.

Yes, I did that to ensure there is no regression with the patches, when 
the other side is not patched.

> 
>    If this is correct, I would probably only be interested in
>    looking at "Patched Libfuse + Patched FuseK" numbers to figure out
>    what's the effect of your changes w.r.t vanilla kernel + libfuse.
>    Am I understanding it right?

Yes.

> 
> - I am wondering why do we measure "Sequential" and "Random" patterns.
>    These optimizations are primarily for file creation + file opening
>    and I/O pattern should not matter.

bonnie++ does this automatically, and it is just convenient to take the 
bonnie++ CSV values and paste them into a table.

In our HPC world mdtest is more common, but it requires MPI, which 
makes it harder to run. Reproducing the values with bonnie++ should be 
rather easy for you.

The only issue with bonnie++ is that it does not run multi-threaded by 
default, and the old 3rd-party perl scripts I used to run it with 
multiple processes and to sum up the values don't work anymore with 
recent perl versions. I need to find some time to fix that.
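
The process-spawning part is the easy bit; an untested sketch like the 
one below (bonnie++ flags taken from its man page, adjust as needed) 
forks N workers, each in its own directory, and waits for them - the 
CSV summing is what the perl scripts did and still needs redoing:

/* Untested sketch: run N bonnie++ instances in parallel, each in its
 * own directory. -s 0 skips the throughput phase, -n 16:1:1 creates
 * 16*1024 files of 1 byte, -q emits CSV. Summing the per-process CSV
 * lines is not done here. */
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/wait.h>
#include <unistd.h>

int main(int argc, char **argv)
{
	int nproc = argc > 1 ? atoi(argv[1]) : 4;

	for (int i = 0; i < nproc; i++) {
		if (fork() == 0) {
			char dir[64];
			snprintf(dir, sizeof(dir), "bonnie.%d", i);
			mkdir(dir, 0755);
			execlp("bonnie++", "bonnie++", "-q", "-s", "0",
			       "-n", "16:1:1", "-d", dir, (char *)NULL);
			perror("execlp bonnie++");
			_exit(127);
		}
	}
	while (wait(NULL) > 0)
		;	/* collect all workers */
	return 0;
}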


> 
> - Also wondering why performance of Read/s improves. Assuming once
>    file has been opened, I think your optimizations get out of the
>    way (no create, no open) and we are just going through data path of
>    reading file data and no lookups happening. If that's the case, why
>    do Read/s numbers show an improvement.

That is how bonnie++ works. It creates the files, closes them (which 
causes the flush) and then re-opens them for stat and read - atomic 
open comes into the picture here. Also, read() is skipped entirely when 
the files are empty, which is why one should use something like 1-byte 
files.
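
Roughly, the per-file pattern it generates looks like this (simplified 
sketch, not bonnie++'s actual code):

/* Simplified sketch of the metadata pattern described above: create
 * 1-byte files, close them (which triggers FLUSH), then re-open each
 * one for fstat() + read(). The re-open is where atomic open saves the
 * extra LOOKUP; the 1-byte content makes sure read() is not skipped. */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

int main(int argc, char **argv)
{
	int nfiles = argc > 1 ? atoi(argv[1]) : 1000;
	char name[64], byte = 'x', buf;
	struct stat st;

	for (int i = 0; i < nfiles; i++) {	/* create phase */
		snprintf(name, sizeof(name), "f.%d", i);
		int fd = open(name, O_CREAT | O_WRONLY | O_TRUNC, 0644);
		if (fd < 0) { perror("create"); return 1; }
		if (write(fd, &byte, 1) != 1) { perror("write"); return 1; }
		close(fd);			/* -> FUSE FLUSH */
	}

	for (int i = 0; i < nfiles; i++) {	/* stat + read phase */
		snprintf(name, sizeof(name), "f.%d", i);
		int fd = open(name, O_RDONLY);	/* LOOKUP + OPEN, or one
						   atomic open */
		if (fd < 0) { perror("open"); return 1; }
		fstat(fd, &st);
		if (read(fd, &buf, 1) != 1) { perror("read"); return 1; }
		close(fd);
	}
	return 0;
}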

If you have another metadata benchmark - please let us know.

> 
> - Why do we measure "Patched Libfuse". It shows performance regression
>    of 4-5% in table 0B, Sequential workoad. That sounds bad. So without
>    any optimization kicking in, it has a performance cost.

Yes, I'm not sure yet. There is not that much code that has changed on 
the libfuse side.
However, the table needs to be redone with a fixed libfuse - limiting 
the number of threads caused permanent libfuse thread creation and 
destruction:

https://github.com/libfuse/libfuse/pull/652

The numbers in the table are also with passthrough_ll, which has its 
own issue due to its linear inode search. passthrough_hp uses a C++ map 
and avoids that. I noticed this too late, when I started to investigate 
why there are regressions...
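
For illustration only (this is not the exact passthrough_ll code), the 
_ll example finds its inodes by a linear scan keyed on (st_ino, 
st_dev), so lookups get slower as the file count grows:

/* Illustration only, not the actual passthrough_ll code: inodes kept
 * in a list and found by a linear scan on (st_ino, st_dev). With many
 * files this O(n) search dominates; a map keyed the same way (as in
 * passthrough_hp) avoids it. */
#include <stddef.h>
#include <sys/stat.h>

struct inode_entry {
	struct inode_entry *next;
	ino_t ino;
	dev_t dev;
	/* ... per-inode state ... */
};

struct inode_entry *find_inode(struct inode_entry *head,
			       const struct stat *st)
{
	for (struct inode_entry *e = head; e; e = e->next)	/* O(n) */
		if (e->ino == st->st_ino && e->dev == st->st_dev)
			return e;
	return NULL;
}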

The table also made me investigate/profile all the fuse operations, 
which resulted in my waitq question. Please see that thread for more 
details:
https://lore.kernel.org/lkml/9326bb76-680f-05f6-6f78-df6170afaa2c@fastmail.fm/T/

Regarding atomic-open/create with avoiding lookup/revalidate, our 
primary goal is to reduce network calls. A file system that handles it 
locally only reduces the number of fuse kernel/user-space crossings. A 
network file system that fully supports it needs to do the atomic open 
(or, in old terms, lookup-intent-open) on the server side of the 
network and needs to transfer the attributes together with the open 
result.
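
Conceptually (hypothetical sketch, not the actual FUSE protocol and not 
our patch - all names are made up), the server side of such a file 
system resolves the name, opens it and hands the attributes back in the 
same reply, roughly:

/* Hypothetical sketch: a combined lookup+open reply that carries both
 * the open handle and the attributes, so the client needs one round
 * trip instead of LOOKUP + OPEN/CREATE + GETATTR. */
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/stat.h>
#include <unistd.h>

struct net_open_reply {			/* made-up RPC reply */
	uint64_t fh;			/* server-side open handle */
	uint64_t nodeid;		/* resolved node id (lookup result) */
	struct stat attr;		/* attrs, no separate getattr */
	uint32_t attr_valid_sec;	/* client-side attr cache time */
};

int server_atomic_open(int parent_fd, const char *name, int flags,
		       mode_t mode, struct net_open_reply *out)
{
	int fd = openat(parent_fd, name, flags, mode);
	if (fd < 0)
		return -errno;
	if (fstat(fd, &out->attr) < 0) {
		int err = -errno;
		close(fd);
		return err;
	}
	out->fh = (uint64_t)fd;		/* real servers use a handle table */
	out->nodeid = out->attr.st_ino;
	out->attr_valid_sec = 1;
	return 0;
}

The important part is that the attributes come back in the same message 
as the handle, so the client can instantiate dentry and inode without a 
second request.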

Lustre does this, although I cannot easily point you to the right code. 
It all started almost two decades ago:
https://groups.google.com/g/lucky.linux.fsdevel/c/iYNFIIrkJ1s


BeeGFS does this as well
https://git.beegfs.io/pub/v7/-/blob/master/client_module/source/filesystem/FhgfsOpsInode.c
See for example FhgfsOps_atomicOpen() and FhgfsOps_createIntent().

(FhGFS is the old name when I was still involved in the project.)

Off the top of my head I'm not sure whether NFS does it over the wire; 
maybe v4 does.


Thanks,
Bernd






