Linux-Fsdevel Archive on lore.kernel.org
 help / color / Atom feed
* [PATCH] fuse: fix inode rwsem regression
@ 2020-02-01  5:49 qiwuchen55
  2020-02-01 15:47 ` Matthew Wilcox
  2020-02-01 23:09 ` Bernd Schubert
  0 siblings, 2 replies; 7+ messages in thread
From: qiwuchen55 @ 2020-02-01  5:49 UTC (permalink / raw)
  To: miklos; +Cc: linux-fsdevel, chenqiwu

From: chenqiwu <chenqiwu@xiaomi.com>

Apparently our current rwsem code doesn't like doing the trylock, then
lock for real scheme.  So change our direct write method to just do the
trylock for the RWF_NOWAIT case.
This seems to fix AIM7 regression in some scalable filesystems upto ~25%
in some cases. Claimed in commit 942491c9e6d6 ("xfs: fix AIM7 regression")

Signed-off-by: chenqiwu <chenqiwu@xiaomi.com>
---
 fs/fuse/file.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index ce71538..ac16994 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -1529,7 +1529,13 @@ static ssize_t fuse_direct_write_iter(struct kiocb *iocb, struct iov_iter *from)
 	ssize_t res;
 
 	/* Don't allow parallel writes to the same file */
-	inode_lock(inode);
+	if (iocb->ki_flags & IOCB_NOWAIT) {
+		if (!inode_trylock(inode))
+			return -EAGAIN;
+	} else {
+		inode_lock(inode);
+	}
+
 	res = generic_write_checks(iocb, from);
 	if (res > 0) {
 		if (!is_sync_kiocb(iocb) && iocb->ki_flags & IOCB_DIRECT) {
-- 
1.9.1


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] fuse: fix inode rwsem regression
  2020-02-01  5:49 [PATCH] fuse: fix inode rwsem regression qiwuchen55
@ 2020-02-01 15:47 ` Matthew Wilcox
  2020-02-01 23:09 ` Bernd Schubert
  1 sibling, 0 replies; 7+ messages in thread
From: Matthew Wilcox @ 2020-02-01 15:47 UTC (permalink / raw)
  To: qiwuchen55; +Cc: miklos, linux-fsdevel, chenqiwu

On Sat, Feb 01, 2020 at 01:49:31PM +0800, qiwuchen55@gmail.com wrote:
> Apparently our current rwsem code doesn't like doing the trylock, then
> lock for real scheme.  So change our direct write method to just do the
> trylock for the RWF_NOWAIT case.
> This seems to fix AIM7 regression in some scalable filesystems upto ~25%
> in some cases. Claimed in commit 942491c9e6d6 ("xfs: fix AIM7 regression")

This commit message doesn't match the patch.

>  	/* Don't allow parallel writes to the same file */
> -	inode_lock(inode);
> +	if (iocb->ki_flags & IOCB_NOWAIT) {
> +		if (!inode_trylock(inode))
> +			return -EAGAIN;
> +	} else {
> +		inode_lock(inode);
> +	}
> +
>  	res = generic_write_checks(iocb, from);
>  	if (res > 0) {
>  		if (!is_sync_kiocb(iocb) && iocb->ki_flags & IOCB_DIRECT) {
> -- 
> 1.9.1
> 

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] fuse: fix inode rwsem regression
  2020-02-01  5:49 [PATCH] fuse: fix inode rwsem regression qiwuchen55
  2020-02-01 15:47 ` Matthew Wilcox
@ 2020-02-01 23:09 ` Bernd Schubert
  2020-02-02  2:08   ` chenqiwu
  1 sibling, 1 reply; 7+ messages in thread
From: Bernd Schubert @ 2020-02-01 23:09 UTC (permalink / raw)
  To: qiwuchen55, miklos; +Cc: linux-fsdevel, chenqiwu, Matthew Wilcox



On 2/1/20 6:49 AM, qiwuchen55@gmail.com wrote:
> From: chenqiwu <chenqiwu@xiaomi.com>
> 
> Apparently our current rwsem code doesn't like doing the trylock, then
> lock for real scheme.  So change our direct write method to just do the
> trylock for the RWF_NOWAIT case.
> This seems to fix AIM7 regression in some scalable filesystems upto ~25%
> in some cases. Claimed in commit 942491c9e6d6 ("xfs: fix AIM7 regression")
> 
> Signed-off-by: chenqiwu <chenqiwu@xiaomi.com>
> ---
>  fs/fuse/file.c | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/fuse/file.c b/fs/fuse/file.c
> index ce71538..ac16994 100644
> --- a/fs/fuse/file.c
> +++ b/fs/fuse/file.c
> @@ -1529,7 +1529,13 @@ static ssize_t fuse_direct_write_iter(struct kiocb *iocb, struct iov_iter *from)
>  	ssize_t res;
>  
>  	/* Don't allow parallel writes to the same file */
> -	inode_lock(inode);
> +	if (iocb->ki_flags & IOCB_NOWAIT) {
> +		if (!inode_trylock(inode))
> +			return -EAGAIN;
> +	} else {
> +		inode_lock(inode);
> +	}
> +
>  	res = generic_write_checks(iocb, from);
>  	if (res > 0) {
>  		if (!is_sync_kiocb(iocb) && iocb->ki_flags & IOCB_DIRECT) {
> 


I would actually like to ask if we can do something about this lock
altogether. Replace it with a range lock?  This very lock badly hurts
fuse shared file performance and maybe I miss something, but it should
be needed only for writes/reads going into the same file?


Thanks,
Bernd

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] fuse: fix inode rwsem regression
  2020-02-01 23:09 ` Bernd Schubert
@ 2020-02-02  2:08   ` chenqiwu
  2020-02-02 21:18     ` Bernd Schubert
  0 siblings, 1 reply; 7+ messages in thread
From: chenqiwu @ 2020-02-02  2:08 UTC (permalink / raw)
  To: Bernd Schubert; +Cc: miklos, linux-fsdevel, chenqiwu, Matthew Wilcox

On Sun, Feb 02, 2020 at 12:09:50AM +0100, Bernd Schubert wrote:
> 
> 
> On 2/1/20 6:49 AM, qiwuchen55@gmail.com wrote:
> > From: chenqiwu <chenqiwu@xiaomi.com>
> > 
> > Apparently our current rwsem code doesn't like doing the trylock, then
> > lock for real scheme.  So change our direct write method to just do the
> > trylock for the RWF_NOWAIT case.
> > This seems to fix AIM7 regression in some scalable filesystems upto ~25%
> > in some cases. Claimed in commit 942491c9e6d6 ("xfs: fix AIM7 regression")
> > 
> > Signed-off-by: chenqiwu <chenqiwu@xiaomi.com>
> > ---
> >  fs/fuse/file.c | 8 +++++++-
> >  1 file changed, 7 insertions(+), 1 deletion(-)
> > 
> > diff --git a/fs/fuse/file.c b/fs/fuse/file.c
> > index ce71538..ac16994 100644
> > --- a/fs/fuse/file.c
> > +++ b/fs/fuse/file.c
> > @@ -1529,7 +1529,13 @@ static ssize_t fuse_direct_write_iter(struct kiocb *iocb, struct iov_iter *from)
> >  	ssize_t res;
> >  
> >  	/* Don't allow parallel writes to the same file */
> > -	inode_lock(inode);
> > +	if (iocb->ki_flags & IOCB_NOWAIT) {
> > +		if (!inode_trylock(inode))
> > +			return -EAGAIN;
> > +	} else {
> > +		inode_lock(inode);
> > +	}
> > +
> >  	res = generic_write_checks(iocb, from);
> >  	if (res > 0) {
> >  		if (!is_sync_kiocb(iocb) && iocb->ki_flags & IOCB_DIRECT) {
> > 
> 
> 
> I would actually like to ask if we can do something about this lock
> altogether. Replace it with a range lock?  This very lock badly hurts
> fuse shared file performance and maybe I miss something, but it should
> be needed only for writes/reads going into the same file?
>
I think replacing the internal inode rwsem with a range lock maybe not
a good idea, because it may cause potential block for different writes/reads
routes when this range lock is owned by someone. Using internal inode rwsem
can avoid this range racy.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] fuse: fix inode rwsem regression
  2020-02-02  2:08   ` chenqiwu
@ 2020-02-02 21:18     ` Bernd Schubert
  2020-02-03  3:20       ` chenqiwu
  2020-02-13  9:50       ` Miklos Szeredi
  0 siblings, 2 replies; 7+ messages in thread
From: Bernd Schubert @ 2020-02-02 21:18 UTC (permalink / raw)
  To: chenqiwu; +Cc: miklos, linux-fsdevel, chenqiwu, Matthew Wilcox



On 2/2/20 3:08 AM, chenqiwu wrote:
> On Sun, Feb 02, 2020 at 12:09:50AM +0100, Bernd Schubert wrote:
>>
>>
>> On 2/1/20 6:49 AM, qiwuchen55@gmail.com wrote:
>>> From: chenqiwu <chenqiwu@xiaomi.com>
>>>
>>> Apparently our current rwsem code doesn't like doing the trylock, then
>>> lock for real scheme.  So change our direct write method to just do the
>>> trylock for the RWF_NOWAIT case.
>>> This seems to fix AIM7 regression in some scalable filesystems upto ~25%
>>> in some cases. Claimed in commit 942491c9e6d6 ("xfs: fix AIM7 regression")
>>>
>>> Signed-off-by: chenqiwu <chenqiwu@xiaomi.com>
>>> ---
>>>  fs/fuse/file.c | 8 +++++++-
>>>  1 file changed, 7 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/fs/fuse/file.c b/fs/fuse/file.c
>>> index ce71538..ac16994 100644
>>> --- a/fs/fuse/file.c
>>> +++ b/fs/fuse/file.c
>>> @@ -1529,7 +1529,13 @@ static ssize_t fuse_direct_write_iter(struct kiocb *iocb, struct iov_iter *from)
>>>  	ssize_t res;
>>>  
>>>  	/* Don't allow parallel writes to the same file */
>>> -	inode_lock(inode);
>>> +	if (iocb->ki_flags & IOCB_NOWAIT) {
>>> +		if (!inode_trylock(inode))
>>> +			return -EAGAIN;
>>> +	} else {
>>> +		inode_lock(inode);
>>> +	}
>>> +
>>>  	res = generic_write_checks(iocb, from);
>>>  	if (res > 0) {
>>>  		if (!is_sync_kiocb(iocb) && iocb->ki_flags & IOCB_DIRECT) {
>>>
>>
>>
>> I would actually like to ask if we can do something about this lock
>> altogether. Replace it with a range lock?  This very lock badly hurts
>> fuse shared file performance and maybe I miss something, but it should
>> be needed only for writes/reads going into the same file?
>>
> I think replacing the internal inode rwsem with a range lock maybe not
> a good idea, because it may cause potential block for different writes/reads
> routes when this range lock is owned by someone. Using internal inode rwsem
> can avoid this range racy.
> 

So your 2nd patch changes to rw-locks and should solve low read
direct-io performance, but single shared file writes is still an issue.
For network file systems it also common to globally enforce fuse
direct-io to reduce/avoid cache coherency issues - the application
typically doesn't ask for that on its own. And that is where this lock
is badly hurting.  Hmm, maybe we should differentiate between
fuse-internal direct-io and application direct-io requests here? Or we
need a range lock,that supports shared readers (I haven't looked at any
of the proposed range lock patches yet (non has landed yet, right?).

Thanks,
Bernd

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] fuse: fix inode rwsem regression
  2020-02-02 21:18     ` Bernd Schubert
@ 2020-02-03  3:20       ` chenqiwu
  2020-02-13  9:50       ` Miklos Szeredi
  1 sibling, 0 replies; 7+ messages in thread
From: chenqiwu @ 2020-02-03  3:20 UTC (permalink / raw)
  To: Bernd Schubert; +Cc: miklos, linux-fsdevel, chenqiwu, Matthew Wilcox

On Sun, Feb 02, 2020 at 10:18:58PM +0100, Bernd Schubert wrote:
> 
> 
> On 2/2/20 3:08 AM, chenqiwu wrote:
> > On Sun, Feb 02, 2020 at 12:09:50AM +0100, Bernd Schubert wrote:
> >>
> >>
> >> On 2/1/20 6:49 AM, qiwuchen55@gmail.com wrote:
> >>> From: chenqiwu <chenqiwu@xiaomi.com>
> >>>
> >>> Apparently our current rwsem code doesn't like doing the trylock, then
> >>> lock for real scheme.  So change our direct write method to just do the
> >>> trylock for the RWF_NOWAIT case.
> >>> This seems to fix AIM7 regression in some scalable filesystems upto ~25%
> >>> in some cases. Claimed in commit 942491c9e6d6 ("xfs: fix AIM7 regression")
> >>>
> >>> Signed-off-by: chenqiwu <chenqiwu@xiaomi.com>
> >>> ---
> >>>  fs/fuse/file.c | 8 +++++++-
> >>>  1 file changed, 7 insertions(+), 1 deletion(-)
> >>>
> >>> diff --git a/fs/fuse/file.c b/fs/fuse/file.c
> >>> index ce71538..ac16994 100644
> >>> --- a/fs/fuse/file.c
> >>> +++ b/fs/fuse/file.c
> >>> @@ -1529,7 +1529,13 @@ static ssize_t fuse_direct_write_iter(struct kiocb *iocb, struct iov_iter *from)
> >>>  	ssize_t res;
> >>>  
> >>>  	/* Don't allow parallel writes to the same file */
> >>> -	inode_lock(inode);
> >>> +	if (iocb->ki_flags & IOCB_NOWAIT) {
> >>> +		if (!inode_trylock(inode))
> >>> +			return -EAGAIN;
> >>> +	} else {
> >>> +		inode_lock(inode);
> >>> +	}
> >>> +
> >>>  	res = generic_write_checks(iocb, from);
> >>>  	if (res > 0) {
> >>>  		if (!is_sync_kiocb(iocb) && iocb->ki_flags & IOCB_DIRECT) {
> >>>
> >>
> >>
> >> I would actually like to ask if we can do something about this lock
> >> altogether. Replace it with a range lock?  This very lock badly hurts
> >> fuse shared file performance and maybe I miss something, but it should
> >> be needed only for writes/reads going into the same file?
> >>
> > I think replacing the internal inode rwsem with a range lock maybe not
> > a good idea, because it may cause potential block for different writes/reads
> > routes when this range lock is owned by someone. Using internal inode rwsem
> > can avoid this range racy.
> > 
> 
> So your 2nd patch changes to rw-locks and should solve low read
> direct-io performance, but single shared file writes is still an issue.
> For network file systems it also common to globally enforce fuse
> direct-io to reduce/avoid cache coherency issues - the application
> typically doesn't ask for that on its own. And that is where this lock
> is badly hurting.  Hmm, maybe we should differentiate between
> fuse-internal direct-io and application direct-io requests here? Or we
> need a range lock,that supports shared readers (I haven't looked at any
> of the proposed range lock patches yet (non has landed yet, right?).
>
There is a recent fix for ext4 and we can evaluate and apply it to fuse
filesytem for solving low dio-write performance.
aa9714d0e(ext4: Start with shared i_rwsem in case of DIO instead of exclusive)

Thanks!
Qiwu

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] fuse: fix inode rwsem regression
  2020-02-02 21:18     ` Bernd Schubert
  2020-02-03  3:20       ` chenqiwu
@ 2020-02-13  9:50       ` Miklos Szeredi
  1 sibling, 0 replies; 7+ messages in thread
From: Miklos Szeredi @ 2020-02-13  9:50 UTC (permalink / raw)
  To: Bernd Schubert; +Cc: chenqiwu, linux-fsdevel, chenqiwu, Matthew Wilcox

On Sun, Feb 2, 2020 at 10:18 PM Bernd Schubert
<bernd.schubert@fastmail.fm> wrote:
> For network file systems it also common to globally enforce fuse
> direct-io to reduce/avoid cache coherency issues - the application
> typically doesn't ask for that on its own. And that is where this lock
> is badly hurting.  Hmm, maybe we should differentiate between
> fuse-internal direct-io and application direct-io requests here?

I'm not against optionally removing the locking for direct write.
Theoretically that should be doable.  But we need to be very careful
about not breaking any assumptions in the kernel and libfuse code.
Obviously this would need to be enabled with a flag (e.g.
FOPEN_PARALLEL_WRITES).

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, back to index

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-02-01  5:49 [PATCH] fuse: fix inode rwsem regression qiwuchen55
2020-02-01 15:47 ` Matthew Wilcox
2020-02-01 23:09 ` Bernd Schubert
2020-02-02  2:08   ` chenqiwu
2020-02-02 21:18     ` Bernd Schubert
2020-02-03  3:20       ` chenqiwu
2020-02-13  9:50       ` Miklos Szeredi

Linux-Fsdevel Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-fsdevel/0 linux-fsdevel/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-fsdevel linux-fsdevel/ https://lore.kernel.org/linux-fsdevel \
		linux-fsdevel@vger.kernel.org
	public-inbox-index linux-fsdevel

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-fsdevel


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git