Linux-NFS Archive on lore.kernel.org
 help / color / Atom feed
* Re: Re: [PATCH] lockd: Show pid of lockd for remote locks
@ 2019-05-17 21:45 Xuewei Zhang
  2019-05-18 12:09 ` Benjamin Coddington
  0 siblings, 1 reply; 8+ messages in thread
From: Xuewei Zhang @ 2019-05-17 21:45 UTC (permalink / raw)
  To: bfields, bcodding, Grigor Avagyan, Trevor Bourget, Nauman Rafique
  Cc: trond.myklebust, anna.schumaker, jlayton, linux-nfs

> On Fri, Nov 02, 2018 at 02:45:16PM -0400, J. Bruce Fields wrote:
> > On Thu, Nov 01, 2018 at 01:39:49PM -0400, Benjamin Coddington wrote:
> > > Commit 9d5b86ac13c5 ("fs/locks: Remove fl_nspid and use fs-specific l_pid
> > > for remote locks") specified that the l_pid returned for F_GETLK on a local
> > > file that has a remote lock should be the pid of the lock manager process.
> > > That commit, while updating other filesystems, failed to update lockd, such
> > > that locks created by lockd had their fl_pid set to that of the remote
> > > process holding the lock.  Fix that here to be the pid of lockd.
> > >
> > > Also, fix the client case so that the returned lock pid is negative, which
> > > indicates a remote lock on a remote file.

Seems this patch introduced a bug in how lock protocol handles
GRANTED_MSG in nfs.

For example under this scenario:
1. Client1 takes an exclusive lock on a file and holds it
   (client1 sends LOCK request, and server replies with success)
2. Client2 attempts to take an exclusive lock on the same file and gets blocked
   (client2 sends LOCK request, and server replies with NLM_BLOCKED)
3. Client1 release the lock
   (client1 sends UNLOCK request, and server replies with success)
4. Server grant the lock to Client
   (server sends GRANTED_MSG to clients)
5. Client2 looks at all currently blocked locks and see if the
   fl_blocked->fl_u.nfs_fl.owner->pid matches lock->svid [1]
6. The owner->pid and lock->svid *should* match, and Client2 will then
take the lock.

Looking closely of how the matching at step 5 is implemented:
https://github.com/torvalds/linux/blob/e93c9c99a629c61837d5a7fc2120cd2b6c70dbdd/fs/lockd/clntlock.c#L186
The comment says that the owner->pid should be the lockowner's PID, rather
than lock manager process(lockd)'s PID.

This patch b8eee0e90f97 ("lockd: Show pid of lockd for remote locks") changed
the behavior of lockd so that it sends the PID of lockd instead of lockowner's
PID at lock->svid, which results in the bug in the lock protocol.

The bug can be reproduced by opening two nfs clients on /nfs. And do this:
1. On client1: sudo flock /nfs/lock -c read
2. On client2: sudo flock /nfs/lock -c date
3. Now quit client 1 by ctrl-D.

You will see that client2 will take ~30 seconds before taking the lock (this is
because the GRANTED_MSG got discarded by client2 since svid doesn't match.
And then client2 retries after 30 seconds). The expected behavior is client2
should take the lock immediately.

On a kernel built with this b8eee0e90f97 ("lockd: Show pid of lockd for remote
locks") reverted, it shows the expected behavior.

I suspect 9d5b86ac13c5 ("fs/locks: Remove fl_nspid and use fs-specific l_pid for
remote locks") probably chose to not update lockd for this reason.

Should we consider maybe reverting or fixing b8eee0e90f97 ("lockd:
Show pid of lockd
for remote locks")?

Thanks!
Xuewei

[1] https://github.com/torvalds/linux/blob/e93c9c99a629c61837d5a7fc2120cd2b6c70dbdd/fs/lockd/clntlock.c#L186

> >
> > ACK.
> >
> > Uh, I guess I'll take this if nobody else speaks up.
>
> Applied for 4.21.--b.
>
> >
> > --b.
> >
> > >
> > > Fixes: 9d5b86ac13c5 ("fs/locks: Remove fl_nspid and use fs-specific...")
> > > Cc: stable@vger.kernel.org
> > >
> > > Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
> > > ---
> > >  fs/lockd/clntproc.c | 2 +-
> > >  fs/lockd/xdr.c      | 4 ++--
> > >  fs/lockd/xdr4.c     | 4 ++--
> > >  3 files changed, 5 insertions(+), 5 deletions(-)
> > >
> > > diff --git a/fs/lockd/clntproc.c b/fs/lockd/clntproc.c
> > > index d20b92f271c2..0a67dd4250e9 100644
> > > --- a/fs/lockd/clntproc.c
> > > +++ b/fs/lockd/clntproc.c
> > > @@ -442,7 +442,7 @@ nlmclnt_test(struct nlm_rqst *req, struct file_lock *fl)
> > >   fl->fl_start = req->a_res.lock.fl.fl_start;
> > >   fl->fl_end = req->a_res.lock.fl.fl_end;
> > >   fl->fl_type = req->a_res.lock.fl.fl_type;
> > > - fl->fl_pid = 0;
> > > + fl->fl_pid = -req->a_res.lock.fl.fl_pid;
> > >   break;
> > >   default:
> > >   status = nlm_stat_to_errno(req->a_res.status);
> > > diff --git a/fs/lockd/xdr.c b/fs/lockd/xdr.c
> > > index 7147e4aebecc..9846f7e95282 100644
> > > --- a/fs/lockd/xdr.c
> > > +++ b/fs/lockd/xdr.c
> > > @@ -127,7 +127,7 @@ nlm_decode_lock(__be32 *p, struct nlm_lock *lock)
> > >
> > >   locks_init_lock(fl);
> > >   fl->fl_owner = current->files;
> > > - fl->fl_pid   = (pid_t)lock->svid;
> > > + fl->fl_pid   = current->tgid;
> > >   fl->fl_flags = FL_POSIX;
> > >   fl->fl_type  = F_RDLCK; /* as good as anything else */
> > >   start = ntohl(*p++);
> > > @@ -269,7 +269,7 @@ nlmsvc_decode_shareargs(struct svc_rqst *rqstp, __be32 *p)
> > >   memset(lock, 0, sizeof(*lock));
> > >   locks_init_lock(&lock->fl);
> > >   lock->svid = ~(u32) 0;
> > > - lock->fl.fl_pid = (pid_t)lock->svid;
> > > + lock->fl.fl_pid = current->tgid;
> > >
> > >   if (!(p = nlm_decode_cookie(p, &argp->cookie))
> > >   || !(p = xdr_decode_string_inplace(p, &lock->caller,
> > > diff --git a/fs/lockd/xdr4.c b/fs/lockd/xdr4.c
> > > index 7ed9edf9aed4..70154f376695 100644
> > > --- a/fs/lockd/xdr4.c
> > > +++ b/fs/lockd/xdr4.c
> > > @@ -119,7 +119,7 @@ nlm4_decode_lock(__be32 *p, struct nlm_lock *lock)
> > >
> > >   locks_init_lock(fl);
> > >   fl->fl_owner = current->files;
> > > - fl->fl_pid   = (pid_t)lock->svid;
> > > + fl->fl_pid   = current->tgid;
> > >   fl->fl_flags = FL_POSIX;
> > >   fl->fl_type  = F_RDLCK; /* as good as anything else */
> > >   p = xdr_decode_hyper(p, &start);
> > > @@ -266,7 +266,7 @@ nlm4svc_decode_shareargs(struct svc_rqst *rqstp, __be32 *p)
> > >   memset(lock, 0, sizeof(*lock));
> > >   locks_init_lock(&lock->fl);
> > >   lock->svid = ~(u32) 0;
> > > - lock->fl.fl_pid = (pid_t)lock->svid;
> > > + lock->fl.fl_pid = current->tgid;
> > >
> > >   if (!(p = nlm4_decode_cookie(p, &argp->cookie))
> > >   || !(p = xdr_decode_string_inplace(p, &lock->caller,
> > > --
> > > 2.14.3

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] lockd: Show pid of lockd for remote locks
  2019-05-17 21:45 Re: [PATCH] lockd: Show pid of lockd for remote locks Xuewei Zhang
@ 2019-05-18 12:09 ` Benjamin Coddington
  2019-05-19  2:15   ` Xuewei Zhang
  0 siblings, 1 reply; 8+ messages in thread
From: Benjamin Coddington @ 2019-05-18 12:09 UTC (permalink / raw)
  To: Xuewei Zhang
  Cc: bfields, Grigor Avagyan, Trevor Bourget, Nauman Rafique,
	trond.myklebust, anna.schumaker, jlayton, linux-nfs

On 17 May 2019, at 17:45, Xuewei Zhang wrote:
> Seems this patch introduced a bug in how lock protocol handles
> GRANTED_MSG in nfs.

Yes, you're right: it's broken, and broken badly because we find conflicting
locks based on lockd's fl_pid and lockd's fl_owner, which is current->files.
That means that clients are not differentiated, and that means that v3 locks
are broken.

I'd really like to see the fl_pid value make sense on the server when we
show it to userspace, so I think that we should stuff the svid in fl_owner.

Clearly I need to be more careful making changes here, so I am going to take
my time fixing this, and I won't get to it until Monday. A revert would get
us back to safe behavior.

Ben

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] lockd: Show pid of lockd for remote locks
  2019-05-18 12:09 ` Benjamin Coddington
@ 2019-05-19  2:15   ` Xuewei Zhang
  2019-05-20 13:12     ` Benjamin Coddington
  0 siblings, 1 reply; 8+ messages in thread
From: Xuewei Zhang @ 2019-05-19  2:15 UTC (permalink / raw)
  To: Benjamin Coddington
  Cc: bfields, Grigor Avagyan, Trevor Bourget, Nauman Rafique,
	trond.myklebust, anna.schumaker, jlayton, linux-nfs

On Sat, May 18, 2019 at 5:09 AM Benjamin Coddington <bcodding@redhat.com> wrote:
>
> On 17 May 2019, at 17:45, Xuewei Zhang wrote:
> > Seems this patch introduced a bug in how lock protocol handles
> > GRANTED_MSG in nfs.
>
> Yes, you're right: it's broken, and broken badly because we find conflicting
> locks based on lockd's fl_pid and lockd's fl_owner, which is current->files.
> That means that clients are not differentiated, and that means that v3 locks
> are broken.

Thanks a lot for the quick response and confirming the problem!

>
> I'd really like to see the fl_pid value make sense on the server when we
> show it to userspace, so I think that we should stuff the svid in fl_owner.
>
> Clearly I need to be more careful making changes here, so I am going to take
> my time fixing this, and I won't get to it until Monday. A revert would get
> us back to safe behavior.

From my limited understanding, b8eee0e90f97 ("lockd: Show pid of lockd
for remote locks")
exists only for fixing lockd in 9d5b86ac13c5 ("fs/locks: Remove
fl_nspid and use fs-specific...").

But I don't see anything wrong in 9d5b86ac13c5 ("fs/locks: Remove
fl_nspid and use fs-specific..."). Could you let me know what's the
problem? Thanks a lot!

If 9d5b86ac13c5 ("fs/locks: Remove fl_nspid and use fs-specific...")
is correct, we
probably don't need to add another fixing patch. Perhaps reverting b8eee0e90f97
("lockd: Show pid of lockd for remote locks") would be the best way then.

>
> Ben

Best regards,
Xuewei

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] lockd: Show pid of lockd for remote locks
  2019-05-19  2:15   ` Xuewei Zhang
@ 2019-05-20 13:12     ` Benjamin Coddington
  2019-05-20 14:22       ` Benjamin Coddington
  0 siblings, 1 reply; 8+ messages in thread
From: Benjamin Coddington @ 2019-05-20 13:12 UTC (permalink / raw)
  To: Xuewei Zhang, jlayton
  Cc: bfields, Grigor Avagyan, Trevor Bourget, Nauman Rafique,
	trond.myklebust, anna.schumaker, linux-nfs

On 18 May 2019, at 22:15, Xuewei Zhang wrote:

> On Sat, May 18, 2019 at 5:09 AM Benjamin Coddington 
> <bcodding@redhat.com> wrote:
>>
>> On 17 May 2019, at 17:45, Xuewei Zhang wrote:
>>> Seems this patch introduced a bug in how lock protocol handles
>>> GRANTED_MSG in nfs.
>>
>> Yes, you're right: it's broken, and broken badly because we find 
>> conflicting
>> locks based on lockd's fl_pid and lockd's fl_owner, which is 
>> current->files.
>> That means that clients are not differentiated, and that means that 
>> v3 locks
>> are broken.
>
> Thanks a lot for the quick response and confirming the problem!
>
>>
>> I'd really like to see the fl_pid value make sense on the server when 
>> we
>> show it to userspace, so I think that we should stuff the svid in 
>> fl_owner.
>>
>> Clearly I need to be more careful making changes here, so I am going 
>> to take
>> my time fixing this, and I won't get to it until Monday. A revert 
>> would get
>> us back to safe behavior.
>
> From my limited understanding, b8eee0e90f97 ("lockd: Show pid of lockd
> for remote locks")
> exists only for fixing lockd in 9d5b86ac13c5 ("fs/locks: Remove
> fl_nspid and use fs-specific...").
>
> But I don't see anything wrong in 9d5b86ac13c5 ("fs/locks: Remove
> fl_nspid and use fs-specific..."). Could you let me know what's the
> problem? Thanks a lot!
>
> If 9d5b86ac13c5 ("fs/locks: Remove fl_nspid and use fs-specific...")
> is correct, we
> probably don't need to add another fixing patch. Perhaps reverting 
> b8eee0e90f97
> ("lockd: Show pid of lockd for remote locks") would be the best way 
> then.

I think we have an existing problem:  the NLM server is setting fl_owner 
to
current->files and (before the bad patch) fl_pid to svid.

That means that we can't tell the difference between locks from 
different
clients that may have the same svid.  The bad patch just made the 
problem
far more likely to occur, that's what you're now noticing.

What needs to happen is that we generate our own fl_owner_t based on the
nlm_host and svid, so that we end up with unique fl_owner for each
client/svid pair, the same way that nlmclnt does.  That way the nlm 
server
can let the kernel do the lock matching based on unique fl_owner for 
each
client/svid.

The mech in clntproc.c for all this can probably be shared, so I'll try 
to
make that common code.

Jeff, any words of wisdom?

Ben

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] lockd: Show pid of lockd for remote locks
  2019-05-20 13:12     ` Benjamin Coddington
@ 2019-05-20 14:22       ` Benjamin Coddington
  2019-05-20 20:51         ` J. Bruce Fields
  0 siblings, 1 reply; 8+ messages in thread
From: Benjamin Coddington @ 2019-05-20 14:22 UTC (permalink / raw)
  To: Xuewei Zhang, jlayton
  Cc: bfields, Grigor Avagyan, Trevor Bourget, Nauman Rafique,
	trond.myklebust, anna.schumaker, linux-nfs

On 20 May 2019, at 9:12, Benjamin Coddington wrote:

> On 18 May 2019, at 22:15, Xuewei Zhang wrote:
>
>> On Sat, May 18, 2019 at 5:09 AM Benjamin Coddington 
>> <bcodding@redhat.com> wrote:
>>>
>>> On 17 May 2019, at 17:45, Xuewei Zhang wrote:
>>>> Seems this patch introduced a bug in how lock protocol handles
>>>> GRANTED_MSG in nfs.
>>>
>>> Yes, you're right: it's broken, and broken badly because we find 
>>> conflicting
>>> locks based on lockd's fl_pid and lockd's fl_owner, which is 
>>> current->files.
>>> That means that clients are not differentiated, and that means that 
>>> v3 locks
>>> are broken.
>>
>> Thanks a lot for the quick response and confirming the problem!
>>
>>>
>>> I'd really like to see the fl_pid value make sense on the server 
>>> when we
>>> show it to userspace, so I think that we should stuff the svid in 
>>> fl_owner.
>>>
>>> Clearly I need to be more careful making changes here, so I am going 
>>> to take
>>> my time fixing this, and I won't get to it until Monday. A revert 
>>> would get
>>> us back to safe behavior.
>>
>> From my limited understanding, b8eee0e90f97 ("lockd: Show pid of 
>> lockd
>> for remote locks")
>> exists only for fixing lockd in 9d5b86ac13c5 ("fs/locks: Remove
>> fl_nspid and use fs-specific...").
>>
>> But I don't see anything wrong in 9d5b86ac13c5 ("fs/locks: Remove
>> fl_nspid and use fs-specific..."). Could you let me know what's the
>> problem? Thanks a lot!
>>
>> If 9d5b86ac13c5 ("fs/locks: Remove fl_nspid and use fs-specific...")
>> is correct, we
>> probably don't need to add another fixing patch. Perhaps reverting 
>> b8eee0e90f97
>> ("lockd: Show pid of lockd for remote locks") would be the best way 
>> then.
>
> I think we have an existing problem:  the NLM server is setting 
> fl_owner to
> current->files and (before the bad patch) fl_pid to svid.
>
> That means that we can't tell the difference between locks from 
> different
> clients that may have the same svid.  The bad patch just made the 
> problem
> far more likely to occur, that's what you're now noticing.

Ok, I just noticed that we set fl_owner to the nlm_host in
nlm4svc_retrieve_args, so things are not as dire as I thought.  What 
would
be nice is a sane set of tests for NLM..

Since we already were placing the nlm_host in fl_owner, I think 
reverting
9d5b86ac13c5 at this point is the proper thing to do.

Ben

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] lockd: Show pid of lockd for remote locks
  2019-05-20 14:22       ` Benjamin Coddington
@ 2019-05-20 20:51         ` J. Bruce Fields
  2019-05-21 11:18           ` Benjamin Coddington
  0 siblings, 1 reply; 8+ messages in thread
From: J. Bruce Fields @ 2019-05-20 20:51 UTC (permalink / raw)
  To: Benjamin Coddington
  Cc: Xuewei Zhang, jlayton, Grigor Avagyan, Trevor Bourget,
	Nauman Rafique, trond.myklebust, anna.schumaker, linux-nfs

On Mon, May 20, 2019 at 10:22:00AM -0400, Benjamin Coddington wrote:
> Ok, I just noticed that we set fl_owner to the nlm_host in
> nlm4svc_retrieve_args, so things are not as dire as I thought.  What
> would be nice is a sane set of tests for NLM..

What would we have needed to catch this?  Sounds like it turns
multi-client testing wouldn't have been required?  (Not that that would
be a bad idea.)

--b.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] lockd: Show pid of lockd for remote locks
  2019-05-20 20:51         ` J. Bruce Fields
@ 2019-05-21 11:18           ` Benjamin Coddington
  2019-05-21 14:49             ` J. Bruce Fields
  0 siblings, 1 reply; 8+ messages in thread
From: Benjamin Coddington @ 2019-05-21 11:18 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: Xuewei Zhang, jlayton, Grigor Avagyan, Trevor Bourget,
	Nauman Rafique, trond.myklebust, anna.schumaker, linux-nfs

On 20 May 2019, at 16:51, J. Bruce Fields wrote:

> On Mon, May 20, 2019 at 10:22:00AM -0400, Benjamin Coddington wrote:
>> Ok, I just noticed that we set fl_owner to the nlm_host in
>> nlm4svc_retrieve_args, so things are not as dire as I thought.  What
>> would be nice is a sane set of tests for NLM..
>
> What would we have needed to catch this?  Sounds like it turns
> multi-client testing wouldn't have been required?  (Not that that 
> would
> be a bad idea.)

Two NLM clients would be ideal to exercise the full range of expected 
lock behavior.  I suspect that's something I can do with what's in pynfs 
today, but I haven't looked yet.  I suppose if there's a test for NLM I 
should make one for v4 too..

Ben

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] lockd: Show pid of lockd for remote locks
  2019-05-21 11:18           ` Benjamin Coddington
@ 2019-05-21 14:49             ` J. Bruce Fields
  0 siblings, 0 replies; 8+ messages in thread
From: J. Bruce Fields @ 2019-05-21 14:49 UTC (permalink / raw)
  To: Benjamin Coddington
  Cc: Xuewei Zhang, jlayton, Grigor Avagyan, Trevor Bourget,
	Nauman Rafique, trond.myklebust, anna.schumaker, linux-nfs

On Tue, May 21, 2019 at 07:18:57AM -0400, Benjamin Coddington wrote:
> On 20 May 2019, at 16:51, J. Bruce Fields wrote:
> 
> >On Mon, May 20, 2019 at 10:22:00AM -0400, Benjamin Coddington wrote:
> >>Ok, I just noticed that we set fl_owner to the nlm_host in
> >>nlm4svc_retrieve_args, so things are not as dire as I thought.  What
> >>would be nice is a sane set of tests for NLM..
> >
> >What would we have needed to catch this?  Sounds like it turns
> >multi-client testing wouldn't have been required?  (Not that that
> >would
> >be a bad idea.)
> 
> Two NLM clients would be ideal to exercise the full range of
> expected lock behavior.  I suspect that's something I can do with
> what's in pynfs today, but I haven't looked yet.  I suppose if
> there's a test for NLM I should make one for v4 too..

There isn't any pynfs NLM code.  Some isilon folks did NLM/NSM/NFSv2/v3
pynfs tests:

	https://github.com/sthaber/pynfs

I just never got a chance to incorporate them and try them.  It's been a
while, and I think there were one or two odd things about it, but maybe
it'd be a good starting point.

--b.

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, back to index

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-05-17 21:45 Re: [PATCH] lockd: Show pid of lockd for remote locks Xuewei Zhang
2019-05-18 12:09 ` Benjamin Coddington
2019-05-19  2:15   ` Xuewei Zhang
2019-05-20 13:12     ` Benjamin Coddington
2019-05-20 14:22       ` Benjamin Coddington
2019-05-20 20:51         ` J. Bruce Fields
2019-05-21 11:18           ` Benjamin Coddington
2019-05-21 14:49             ` J. Bruce Fields

Linux-NFS Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-nfs/0 linux-nfs/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-nfs linux-nfs/ https://lore.kernel.org/linux-nfs \
		linux-nfs@vger.kernel.org linux-nfs@archiver.kernel.org
	public-inbox-index linux-nfs


Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-nfs


AGPL code for this site: git clone https://public-inbox.org/ public-inbox