* [NLM] 2.6.27.14 breakage when grace period expires
@ 2009-02-11 11:23 Frank van Maarseveen
  2009-02-11 20:35 ` J. Bruce Fields
  0 siblings, 1 reply; 25+ messages in thread
From: Frank van Maarseveen @ 2009-02-11 11:23 UTC (permalink / raw)
  To: Linux NFS mailing list

I'm sorry to inform you, but... it seems there is a problem in the NLM
subsystem similar to the one reported previously, this time triggered
when the grace period expires after a reboot.

Client and server run 2.6.27.14 + previous fix, NFSv3.

On the client there are three shells running:

	while :; do lck -w /mnt/foo 2; done

The "lck" program is the same as posted before and it obtains an exclusive
write lock then waits 2 seconds in above invocation (there's probably an
"fcntl" command equivalent). After an orderly server reboot + grace time
expiration one of above command loops reports:

	lck: fcntl: No locks available

and all three get stuck. After ^C-ing all "lck" loops the server still
shows an entry in /proc/locks, which leaves the file locked
indefinitely. Maybe two loops are sufficient to reproduce the issue, or
maybe you need more; I don't know.
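
(For reference, a minimal sketch of what an "lck"-style program does in
the above invocation. This is not the actual "lck" source, which was
posted in the earlier thread; it just approximates the behavior described:
open the file, take an exclusive write lock with fcntl(), hold it for the
given number of seconds, then exit.)

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <fcntl.h>

int main(int argc, char **argv)
{
	struct flock fl;
	int fd;

	if (argc != 4 || strcmp(argv[1], "-w") != 0) {
		fprintf(stderr, "usage: %s -w <file> <seconds>\n", argv[0]);
		return 1;
	}
	fd = open(argv[2], O_RDWR | O_CREAT, 0644);
	if (fd < 0) {
		perror("lck: open");
		return 1;
	}
	memset(&fl, 0, sizeof(fl));
	fl.l_type = F_WRLCK;		/* exclusive write lock */
	fl.l_whence = SEEK_SET;
	fl.l_start = 0;
	fl.l_len = 0;			/* whole file */
	if (fcntl(fd, F_SETLKW, &fl) == -1) {	/* wait until granted */
		perror("lck: fcntl");	/* e.g. "No locks available" */
		return 1;
	}
	sleep(atoi(argv[3]));		/* hold the lock */
	return 0;			/* exit releases the lock */
}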

Interestingly, during the grace period at least one of the "lck" processes
should have re-obtained the lock, but it didn't show up in /proc/locks
on the server.

Interestingly (#2), after removing the file on the server (i.e. no
sillyrename), the now-free inode is still locked according to /proc/locks.
Even stopping/starting /etc/init.d/nfs-kernel-server plus "echo
3 >/proc/sys/vm/drop_caches" did not remove the lock (it did re-enter
the grace period).

-- 
Frank


* Re: [NLM] 2.6.27.14 breakage when grace period expires
  2009-02-11 11:23 [NLM] 2.6.27.14 breakage when grace period expires Frank van Maarseveen
@ 2009-02-11 20:35 ` J. Bruce Fields
  2009-02-11 20:37   ` Frank van Maarseveen
  0 siblings, 1 reply; 25+ messages in thread
From: J. Bruce Fields @ 2009-02-11 20:35 UTC (permalink / raw)
  To: Frank van Maarseveen; +Cc: Linux NFS mailing list

On Wed, Feb 11, 2009 at 12:23:18PM +0100, Frank van Maarseveen wrote:
> I'm sorry to inform you, but... it seems there is a problem in the NLM
> subsystem similar to the one reported previously, this time triggered
> when the grace period expires after a reboot.
> 
> Client and server run 2.6.27.14 + previous fix, NFSv3.
> 
> On the client there are three shells running:
> 
> 	while :; do lck -w /mnt/foo 2; done
> 
> The "lck" program is the same as posted before and it obtains an exclusive
> write lock then waits 2 seconds in above invocation (there's probably an
> "fcntl" command equivalent). After an orderly server reboot + grace time

How are you rebooting the server?

--b.

> expiration, one of the above command loops reports:
> 
> 	lck: fcntl: No locks available
> 
> and all three get stuck. After ^C-ing all "lck" loops the server still
> shows an entry in /proc/locks, which leaves the file locked
> indefinitely. Maybe two loops are sufficient to reproduce the issue, or
> maybe you need more; I don't know.
> 
> Interestingly, during the grace period at least one of the "lck" processes
> should have re-obtained the lock, but it didn't show up in /proc/locks
> on the server.
> 
> Interestingly (#2), after removing the file on the server (i.e. no
> sillyrename), the now-free inode is still locked according to /proc/locks.
> Even stopping/starting /etc/init.d/nfs-kernel-server plus "echo
> 3 >/proc/sys/vm/drop_caches" did not remove the lock (it did re-enter
> the grace period).
> 
> -- 
> Frank


* Re: [NLM] 2.6.27.14 breakage when grace period expires
  2009-02-11 20:35 ` J. Bruce Fields
@ 2009-02-11 20:37   ` Frank van Maarseveen
  2009-02-11 20:39     ` J. Bruce Fields
  0 siblings, 1 reply; 25+ messages in thread
From: Frank van Maarseveen @ 2009-02-11 20:37 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: Frank van Maarseveen, Linux NFS mailing list

On Wed, Feb 11, 2009 at 03:35:55PM -0500, J. Bruce Fields wrote:
> On Wed, Feb 11, 2009 at 12:23:18PM +0100, Frank van Maarseveen wrote:
> > I'm sorry to inform you, but... it seems there is a problem in the NLM
> > subsystem similar to the one reported previously, this time triggered
> > when the grace period expires after a reboot.
> > 
> > Client and server run 2.6.27.14 + previous fix, NFSv3.
> > 
> > On the client there are three shells running:
> > 
> > 	while :; do lck -w /mnt/foo 2; done
> > 
> > The "lck" program is the same as posted before and it obtains an exclusive
> > write lock then waits 2 seconds in above invocation (there's probably an
> > "fcntl" command equivalent). After an orderly server reboot + grace time
> 
> How are you rebooting the server?

"reboot"

> 
> --b.
> 
> > expiration, one of the above command loops reports:
> > 
> > 	lck: fcntl: No locks available
> > 
> > and all three get stuck. After ^C-ing all "lck" loops the server still
> > shows an entry in /proc/locks, which leaves the file locked
> > indefinitely. Maybe two loops are sufficient to reproduce the issue, or
> > maybe you need more; I don't know.
> > 
> > Interestingly, during the grace period at least one of the "lck" processes
> > should have re-obtained the lock, but it didn't show up in /proc/locks
> > on the server.
> > 
> > Interestingly (#2), after removing the file on the server (i.e. no
> > sillyrename), the now-free inode is still locked according to /proc/locks.
> > Even stopping/starting /etc/init.d/nfs-kernel-server plus "echo
> > 3 >/proc/sys/vm/drop_caches" did not remove the lock (it did re-enter
> > the grace period).
> > 
> > -- 
> > Frank

-- 
Frank


* Re: [NLM] 2.6.27.14 breakage when grace period expires
  2009-02-11 20:37   ` Frank van Maarseveen
@ 2009-02-11 20:39     ` J. Bruce Fields
  2009-02-11 20:57       ` Frank van Maarseveen
  2009-02-12 14:28       ` Frank van Maarseveen
  0 siblings, 2 replies; 25+ messages in thread
From: J. Bruce Fields @ 2009-02-11 20:39 UTC (permalink / raw)
  To: Frank van Maarseveen; +Cc: Linux NFS mailing list

On Wed, Feb 11, 2009 at 09:37:03PM +0100, Frank van Maarseveen wrote:
> On Wed, Feb 11, 2009 at 03:35:55PM -0500, J. Bruce Fields wrote:
> > On Wed, Feb 11, 2009 at 12:23:18PM +0100, Frank van Maarseveen wrote:
> > > I'm sorry to inform you, but... it seems there is a problem in the NLM
> > > subsystem similar to the one reported previously, this time triggered
> > > when the grace period expires after a reboot.
> > > 
> > > Client and server run 2.6.27.14 + previous fix, NFSv3.
> > > 
> > > On the client there are three shells running:
> > > 
> > > 	while :; do lck -w /mnt/foo 2; done
> > > 
> > > The "lck" program is the same as posted before and it obtains an exclusive
> > > write lock then waits 2 seconds in above invocation (there's probably an
> > > "fcntl" command equivalent). After an orderly server reboot + grace time
> > 
> > How are you rebooting the server?
> 
> "reboot"

Could you watch the nfs/nlm/nsm traffic on reboot and make sure that the
server is actually sending the reboot notification to the client, and
that the client is trying to reclaim?  (Wireshark should make this all
fairly clear.  But capture the traffic with tcpdump -s0 -wtmp.pcap and
send it to me if you're having trouble interpreting it.)
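
For example, something along these lines (interface and addresses are
placeholders, adjust to your setup):

	tcpdump -i eth0 -s0 -w tmp.pcap host <client-ip>

and afterwards inspect it with wireshark, or on the command line:

	tshark -r tmp.pcap -R "nfs or stat or nlm"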

--b.

> 
> > 
> > --b.
> > 
> > > expiration, one of the above command loops reports:
> > > 
> > > 	lck: fcntl: No locks available
> > > 
> > > and all three get stuck. After ^C-ing all "lck" loops the server still
> > > shows an entry in /proc/locks, which leaves the file locked
> > > indefinitely. Maybe two loops are sufficient to reproduce the issue, or
> > > maybe you need more; I don't know.
> > > 
> > > Interestingly, during the grace period at least one of the "lck" processes
> > > should have re-obtained the lock, but it didn't show up in /proc/locks
> > > on the server.
> > > 
> > > Interestingly (#2), after removing the file on the server (i.e. no
> > > sillyrename), the now-free inode is still locked according to /proc/locks.
> > > Even stopping/starting /etc/init.d/nfs-kernel-server plus "echo
> > > 3 >/proc/sys/vm/drop_caches" did not remove the lock (it did re-enter
> > > the grace period).
> > > 
> > > -- 
> > > Frank
> 
> -- 
> Frank


* Re: [NLM] 2.6.27.14 breakage when grace period expires
  2009-02-11 20:39     ` J. Bruce Fields
@ 2009-02-11 20:57       ` Frank van Maarseveen
  2009-02-12 14:28       ` Frank van Maarseveen
  1 sibling, 0 replies; 25+ messages in thread
From: Frank van Maarseveen @ 2009-02-11 20:57 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: Frank van Maarseveen, Linux NFS mailing list

On Wed, Feb 11, 2009 at 03:39:48PM -0500, J. Bruce Fields wrote:
> On Wed, Feb 11, 2009 at 09:37:03PM +0100, Frank van Maarseveen wrote:
> > On Wed, Feb 11, 2009 at 03:35:55PM -0500, J. Bruce Fields wrote:
> > > On Wed, Feb 11, 2009 at 12:23:18PM +0100, Frank van Maarseveen wrote:
> > > > I'm sorry to inform you, but... it seems there is a problem in the NLM
> > > > subsystem similar to the one reported previously, this time triggered
> > > > when the grace period expires after a reboot.
> > > > 
> > > > Client and server run 2.6.27.14 + previous fix, NFSv3.
> > > > 
> > > > On the client there are three shells running:
> > > > 
> > > > 	while :; do lck -w /mnt/foo 2; done
> > > > 
> > > > The "lck" program is the same as posted before and it obtains an exclusive
> > > > write lock then waits 2 seconds in above invocation (there's probably an
> > > > "fcntl" command equivalent). After an orderly server reboot + grace time
> > > 
> > > How are you rebooting the server?
> > 
> > "reboot"
> 
> Could you watch the nfs/nlm/nsm traffic on reboot and make sure that the
> server is actually sending the reboot notification to the client, and
> that the client is trying to reclaim?  (Wireshark should make this all
> fairly clear.  But capture the traffic with tcpdump -s0 -wtmp.pcap and
> send it to me if you're having trouble interpreting it.)

I can't try it right now, but I can tomorrow. However, I'm pretty sure at
least the reboot notification is there, because:

1)	The issue also happens in a totally different NFS server setup,
	one which by definition invokes sm-notify from a script. This is
	the real use case.
2)	If not, then I would expect different behavior anyway compared to
	what I saw. A lost reboot notification is always possible, but in
	that case the client(s) might end up holding more locks than the
	server, not the other way around as it is right now.

I'll make a capture.

-- 
Frank


* Re: [NLM] 2.6.27.14 breakage when grace period expires
  2009-02-11 20:39     ` J. Bruce Fields
  2009-02-11 20:57       ` Frank van Maarseveen
@ 2009-02-12 14:28       ` Frank van Maarseveen
  2009-02-12 15:16         ` Trond Myklebust
  1 sibling, 1 reply; 25+ messages in thread
From: Frank van Maarseveen @ 2009-02-12 14:28 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: Linux NFS mailing list

On Wed, Feb 11, 2009 at 03:39:48PM -0500, J. Bruce Fields wrote:
> On Wed, Feb 11, 2009 at 09:37:03PM +0100, Frank van Maarseveen wrote:
> > On Wed, Feb 11, 2009 at 03:35:55PM -0500, J. Bruce Fields wrote:
> > > On Wed, Feb 11, 2009 at 12:23:18PM +0100, Frank van Maarseveen wrote:
> > > > I'm sorry to inform you, but... it seems there is a problem in the NLM
> > > > subsystem similar to the one reported previously, this time triggered
> > > > when the grace period expires after a reboot.
> > > > 
> > > > Client and server run 2.6.27.14 + previous fix, NFSv3.
> > > > 
> > > > On the client there are three shells running:
> > > > 
> > > > 	while :; do lck -w /mnt/foo 2; done
> > > > 
> > > > The "lck" program is the same as posted before and it obtains an exclusive
> > > > write lock then waits 2 seconds in above invocation (there's probably an
> > > > "fcntl" command equivalent). After an orderly server reboot + grace time
> > > 
> > > How are you rebooting the server?
> > 
> > "reboot"
> 
> Could you watch the nfs/nlm/nsm traffic on reboot and make sure that the
> server is actually sending the reboot notification to the client, and
> that the client is trying to reclaim?  (Wireshark should make this all
> fairly clear.  But capture the traffic with tcpdump -s0 -wtmp.pcap and
> send it to me if you're having trouble interpreting it.)

I have a capture with comments below. It raised so many questions
that I decided to do some more testing, trying to figure out how
it looks when the locking works. This issue now appears to predate the
fuse changes and is also present when both client and server run
2.6.24.4. I decided to stick with the traffic capture for 2.6.27.14 +
previous fix as discussed earlier. The full capture is available at
http://www.frankvm.com/tmp/2.6.27.14-nlm-grace.pcap. It's about 33k and
was started on the server from the initscripts, right after the reboot,
and filtered on the client IP address.

Exported by wireshark (filter: nfs or stat or nlm) and condensed:

  #   time      src      prot
  1   0.000000  client:  NFS  V3 GETATTR Call (Reply In 42), FH:0x0308030a
  2   0.000018  client:  NFS  [RPC retransmission of #1]V3 GETATTR Call (Reply In 42), FH:0x0308030a
  5   0.000583  server:  ICMP Destination unreachable (Port unreachable)
  6   0.000589  server:  ICMP Destination unreachable (Port unreachable)
  7   1.891277  client:  NFS  [RPC retransmission of #1]V3 GETATTR Call (Reply In 42), FH:0x0308030a
  8   1.891320  server:  ICMP Destination unreachable (Port unreachable)
  9   5.827053  client:  NFS  [RPC retransmission of #1]V3 GETATTR Call (Reply In 42), FH:0x0308030a
 10   5.827119  server:  ICMP Destination unreachable (Port unreachable)
 11  14.626501  client:  NFS  [RPC retransmission of #1]V3 GETATTR Call (Reply In 42), FH:0x0308030a
 12  14.626587  server:  ICMP Destination unreachable (Port unreachable)
 15  15.726426  client:  NFS  [RPC retransmission of #1]V3 GETATTR Call (Reply In 42), FH:0x0308030a
 16  15.726505  server:  ICMP Destination unreachable (Port unreachable)
 17  17.926284  client:  NFS  [RPC retransmission of #1]V3 GETATTR Call (Reply In 42), FH:0x0308030a
 18  17.926368  server:  ICMP Destination unreachable (Port unreachable)
 25  22.326006  client:  NFS  [RPC retransmission of #1]V3 GETATTR Call (Reply In 42), FH:0x0308030a
 26  22.326090  server:  ICMP Destination unreachable (Port unreachable)
 35  30.022271  client:  NLM  V4 UNLOCK Call (Reply In 36) FH:0xcafa61cc svid:114 pos:0-0
 36  30.029511  server:  NLM  V4 UNLOCK Reply (Call In 35) NLM_DENIED_GRACE_PERIOD
 37  30.029660  client:  NLM  V4 LOCK Call (Reply In 39) FH:0xcafa61cc svid:116 pos:0-0
 38  30.029691  client:  NLM  V4 LOCK Call (Reply In 40) FH:0xcafa61cc svid:115 pos:0-0
 39  30.029884  server:  NLM  V4 LOCK Reply (Call In 37) NLM_DENIED_GRACE_PERIOD
 40  30.029914  server:  NLM  V4 LOCK Reply (Call In 38) NLM_DENIED_GRACE_PERIOD
 41  31.125403  client:  NFS  [RPC retransmission of #1]V3 GETATTR Call (Reply In 42), FH:0x0308030a
 42  31.127499  server:  NFS  V3 GETATTR Reply (Call In 1)  Directory mode:0755 uid:0 gid:0
 43  31.127942  client:  NFS  V3 GETATTR Call (Reply In 45), FH:0x0308030a
 45  31.129378  server:  NFS  V3 GETATTR Reply (Call In 43)  Directory mode:0755 uid:0 gid:0
 47  31.129958  server:  STAT V1 NOTIFY Call (Reply In 48)
 48  31.130301  client:  STAT V1 NOTIFY Reply (Call In 47)

Reboot notification ok.

 51  35.029968  client:  NLM  V4 UNLOCK Call (Reply In 54) FH:0xcafa61cc svid:114 pos:0-0
 52  35.030003  client:  NLM  V4 LOCK Call (Reply In 55) FH:0xcafa61cc svid:116 pos:0-0
 53  35.030016  client:  NLM  V4 LOCK Call (Reply In 56) FH:0xcafa61cc svid:115 pos:0-0
 54  35.030085  server:  NLM  V4 UNLOCK Reply (Call In 51) NLM_DENIED_GRACE_PERIOD
 55  35.030126  server:  NLM  V4 LOCK Reply (Call In 52) NLM_DENIED_GRACE_PERIOD
 56  35.030153  server:  NLM  V4 LOCK Reply (Call In 53) NLM_DENIED_GRACE_PERIOD

The three contending client processes. I don't see a lock registration for
svid:114, only UNLOCK calls which fail with NLM_DENIED_GRACE_PERIOD. The
above goes on for a while. Neither the server nor the client shows any lock
in /proc/locks at this point.

166 115.028376  client:  NLM  V4 LOCK Call (Reply In 168) FH:0xcafa61cc svid:115 pos:0-0
167 115.028394  client:  NLM  V4 LOCK Call (Reply In 169) FH:0xcafa61cc svid:116 pos:0-0
168 115.028440  server:  NLM  V4 LOCK Reply (Call In 166) NLM_DENIED_GRACE_PERIOD
169 115.028465  server:  NLM  V4 LOCK Reply (Call In 167) NLM_DENIED_GRACE_PERIOD
170 120.027233  client:  NLM  V4 UNLOCK Call (Reply In 171) FH:0xcafa61cc svid:114 pos:0-0
171 120.027337  server:  NLM  V4 UNLOCK Reply (Call In 170) NLM_DENIED_GRACE_PERIOD
172 120.028234  client:  NLM  V4 LOCK Call (Reply In 175) FH:0xcafa61cc svid:116 pos:0-0
173 120.028258  client:  NLM  V4 LOCK Call (Reply In 174) FH:0xcafa61cc svid:115 pos:0-0
174 120.030601  server:  NLM  V4 LOCK Reply (Call In 173)
175 120.030656  server:  NLM  V4 LOCK Reply (Call In 172) NLM_BLOCKED

This doesn't add up. There hasn't been a successful unlock for svid:114
(see #213 for that) but still one of the locks is granted.

176 120.030781  client:  NLM  V4 LOCK Call (Reply In 177) FH:0xcafa61cc svid:115 pos:0-0
177 120.030849  server:  NLM  V4 LOCK Reply (Call In 176)

Strange: an identical lock request but with a different rpc xid (i.e. no
packet duplication).

178 120.031078  client:  NFS  V3 GETATTR Call (Reply In 179), FH:0xcafa61cc
179 120.031154  server:  NFS  V3 GETATTR Reply (Call In 178)  Regular File mode:0644 uid:363 gid:1500
180 120.033973  client:  NFS  V3 ACCESS Call (Reply In 181), FH:0x0308030a
181 120.034030  server:  NFS  V3 ACCESS Reply (Call In 180)
182 120.034223  client:  NFS  V3 LOOKUP Call (Reply In 183), DH:0x0308030a/loc
183 120.034285  server:  NFS  V3 LOOKUP Reply (Call In 182), FH:0x81685ca0
184 120.034472  client:  NFS  V3 ACCESS Call (Reply In 185), FH:0x0308030c
185 120.034526  server:  NFS  V3 ACCESS Reply (Call In 184)
186 120.034722  client:  NFS  V3 ACCESS Call (Reply In 187), FH:0x0308030c
187 120.034776  server:  NFS  V3 ACCESS Reply (Call In 186)
188 120.034922  client:  NFS  V3 LOOKUP Call (Reply In 189), DH:0x0308030c/locktest
189 120.034993  server:  NFS  V3 LOOKUP Reply (Call In 188), FH:0xcafa61cc
190 120.035172  client:  NFS  V3 ACCESS Call (Reply In 191), FH:0xcafa61cc
191 120.035230  server:  NFS  V3 ACCESS Reply (Call In 190)
193 122.032218  client:  NLM  V4 UNLOCK Call (Reply In 195) FH:0xcafa61cc svid:115 pos:0-0
194 122.032253  client:  NLM  V4 LOCK Call (Reply In 197) FH:0xcafa61cc svid:119 pos:0-0
195 122.032343  server:  NLM  V4 UNLOCK Reply (Call In 193)
197 122.032794  server:  NLM  V4 LOCK Reply (Call In 194) NLM_BLOCKED
201 122.033767  server:  NLM  V4 GRANTED_MSG Call (Reply In 202) FH:0xcafa61cc svid:116 pos:0-0
202 122.034066  client:  NLM  V4 GRANTED_MSG Reply (Call In 201)
205 122.034665  client:  NLM  V4 GRANTED_RES Call (Reply In 206) NLM_DENIED
206 122.034753  server:  NLM  V4 GRANTED_RES Reply (Call In 205)
207 122.036312  client:  NFS  V3 GETATTR Call (Reply In 208), FH:0xcafa61cc
208 122.036394  server:  NFS  V3 GETATTR Reply (Call In 207)  Regular File mode:0644 uid:363 gid:1500
209 122.036611  client:  NLM  V4 LOCK Call (Reply In 210) FH:0xcafa61cc svid:120 pos:0-0
210 122.036674  server:  NLM  V4 LOCK Reply (Call In 209) NLM_BLOCKED
213 125.027091  client:  NLM  V4 UNLOCK Call (Reply In 214) FH:0xcafa61cc svid:114 pos:0-0
214 125.027194  server:  NLM  V4 UNLOCK Reply (Call In 213)
215 125.029487  client:  NFS  V3 GETATTR Call (Reply In 216), FH:0xcafa61cc
216 125.029570  server:  NFS  V3 GETATTR Reply (Call In 215)  Regular File mode:0644 uid:363 gid:1500
217 125.029836  client:  NLM  V4 LOCK Call (Reply In 218) FH:0xcafa61cc svid:121 pos:0-0
218 125.029895  server:  NLM  V4 LOCK Reply (Call In 217) NLM_BLOCKED
224 152.032157  client:  NLM  V4 LOCK Call (Reply In 225) FH:0xcafa61cc svid:119 pos:0-0
225 152.032283  server:  NLM  V4 LOCK Reply (Call In 224) NLM_BLOCKED
226 152.035103  client:  NLM  V4 LOCK Call (Reply In 227) FH:0xcafa61cc svid:120 pos:0-0
227 152.035157  server:  NLM  V4 LOCK Reply (Call In 226) NLM_BLOCKED
230 155.029676  client:  NLM  V4 LOCK Call (Reply In 231) FH:0xcafa61cc svid:121 pos:0-0
231 155.029761  server:  NLM  V4 LOCK Reply (Call In 230) NLM_BLOCKED

To recap the problem: one of the fcntl calls to obtain a write lock
returns

	lck: fcntl: No locks available

shortly after the grace period expires. After that everything gets stuck,
with the server holding a write lock and no corresponding client-side lock.


IMO it looks like the client is to blame, even if the server should or
could have accepted UNLOCK during grace (I don't know, I'm not an expert
on that one).

-- 
Frank


* Re: [NLM] 2.6.27.14 breakage when grace period expires
  2009-02-12 14:28       ` Frank van Maarseveen
@ 2009-02-12 15:16         ` Trond Myklebust
       [not found]           ` <1234451789.7190.38.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
  0 siblings, 1 reply; 25+ messages in thread
From: Trond Myklebust @ 2009-02-12 15:16 UTC (permalink / raw)
  To: Frank van Maarseveen; +Cc: J. Bruce Fields, Linux NFS mailing list

On Thu, 2009-02-12 at 15:28 +0100, Frank van Maarseveen wrote:
> On Wed, Feb 11, 2009 at 03:39:48PM -0500, J. Bruce Fields wrote:
> > On Wed, Feb 11, 2009 at 09:37:03PM +0100, Frank van Maarseveen wrote:
> > > On Wed, Feb 11, 2009 at 03:35:55PM -0500, J. Bruce Fields wrote:
> > > > On Wed, Feb 11, 2009 at 12:23:18PM +0100, Frank van Maarseveen wrote:
> > > > > I'm sorry to inform you, but... it seems there is a problem in the NLM
> > > > > subsystem similar to the one reported previously, this time triggered
> > > > > when the grace period expires after a reboot.
> > > > > 
> > > > > Client and server run 2.6.27.14 + previous fix, NFSv3.
> > > > > 
> > > > > On the client there are three shells running:
> > > > > 
> > > > > 	while :; do lck -w /mnt/foo 2; done
> > > > > 
> > > > > The "lck" program is the same as posted before and it obtains an exclusive
> > > > > write lock then waits 2 seconds in above invocation (there's probably an
> > > > > "fcntl" command equivalent). After an orderly server reboot + grace time
> > > > 
> > > > How are you rebooting the server?
> > > 
> > > "reboot"
> > 
> > Could you watch the nfs/nlm/nsm traffic on reboot and make sure that the
> > server is actually sending the reboot notification to the client, and
> > that the client is trying to reclaim?  (Wireshark should make this all
> > fairly clear.  But capture the traffic with tcpdump -s0 -wtmp.pcap and
> > send it to me if you're having trouble interpreting it.)
> 
> I have a capture with comments below. It raised so many questions
> that I decided to do some more testing, trying to figure out how
> it looks when the locking works. This issue now appears to predate the
> fuse changes and is also present when both client and server run
> 2.6.24.4. I decided to stick with the traffic capture for 2.6.27.14 +
> previous fix as discussed earlier. The full capture is available at
> http://www.frankvm.com/tmp/2.6.27.14-nlm-grace.pcap. It's about 33k and
> was started on the server from the initscripts, right after the reboot,
> and filtered on the client IP address.
> 
> Exported by wireshark (filter: nfs or stat or nlm) and condensed:
> 
>   #   time      src      prot
>   1   0.000000  client:  NFS  V3 GETATTR Call (Reply In 42), FH:0x0308030a
>   2   0.000018  client:  NFS  [RPC retransmission of #1]V3 GETATTR Call (Reply In 42), FH:0x0308030a
>   5   0.000583  server:  ICMP Destination unreachable (Port unreachable)
>   6   0.000589  server:  ICMP Destination unreachable (Port unreachable)
>   7   1.891277  client:  NFS  [RPC retransmission of #1]V3 GETATTR Call (Reply In 42), FH:0x0308030a
>   8   1.891320  server:  ICMP Destination unreachable (Port unreachable)
>   9   5.827053  client:  NFS  [RPC retransmission of #1]V3 GETATTR Call (Reply In 42), FH:0x0308030a
>  10   5.827119  server:  ICMP Destination unreachable (Port unreachable)
>  11  14.626501  client:  NFS  [RPC retransmission of #1]V3 GETATTR Call (Reply In 42), FH:0x0308030a
>  12  14.626587  server:  ICMP Destination unreachable (Port unreachable)
>  15  15.726426  client:  NFS  [RPC retransmission of #1]V3 GETATTR Call (Reply In 42), FH:0x0308030a
>  16  15.726505  server:  ICMP Destination unreachable (Port unreachable)
>  17  17.926284  client:  NFS  [RPC retransmission of #1]V3 GETATTR Call (Reply In 42), FH:0x0308030a
>  18  17.926368  server:  ICMP Destination unreachable (Port unreachable)
>  25  22.326006  client:  NFS  [RPC retransmission of #1]V3 GETATTR Call (Reply In 42), FH:0x0308030a
>  26  22.326090  server:  ICMP Destination unreachable (Port unreachable)
>  35  30.022271  client:  NLM  V4 UNLOCK Call (Reply In 36) FH:0xcafa61cc svid:114 pos:0-0
>  36  30.029511  server:  NLM  V4 UNLOCK Reply (Call In 35) NLM_DENIED_GRACE_PERIOD
>  37  30.029660  client:  NLM  V4 LOCK Call (Reply In 39) FH:0xcafa61cc svid:116 pos:0-0
>  38  30.029691  client:  NLM  V4 LOCK Call (Reply In 40) FH:0xcafa61cc svid:115 pos:0-0
>  39  30.029884  server:  NLM  V4 LOCK Reply (Call In 37) NLM_DENIED_GRACE_PERIOD
>  40  30.029914  server:  NLM  V4 LOCK Reply (Call In 38) NLM_DENIED_GRACE_PERIOD
>  41  31.125403  client:  NFS  [RPC retransmission of #1]V3 GETATTR Call (Reply In 42), FH:0x0308030a
>  42  31.127499  server:  NFS  V3 GETATTR Reply (Call In 1)  Directory mode:0755 uid:0 gid:0
>  43  31.127942  client:  NFS  V3 GETATTR Call (Reply In 45), FH:0x0308030a
>  45  31.129378  server:  NFS  V3 GETATTR Reply (Call In 43)  Directory mode:0755 uid:0 gid:0
>  47  31.129958  server:  STAT V1 NOTIFY Call (Reply In 48)
>  48  31.130301  client:  STAT V1 NOTIFY Reply (Call In 47)
> 
> Reboot notification ok.
> 
>  51  35.029968  client:  NLM  V4 UNLOCK Call (Reply In 54) FH:0xcafa61cc svid:114 pos:0-0
>  52  35.030003  client:  NLM  V4 LOCK Call (Reply In 55) FH:0xcafa61cc svid:116 pos:0-0
>  53  35.030016  client:  NLM  V4 LOCK Call (Reply In 56) FH:0xcafa61cc svid:115 pos:0-0
>  54  35.030085  server:  NLM  V4 UNLOCK Reply (Call In 51) NLM_DENIED_GRACE_PERIOD
>  55  35.030126  server:  NLM  V4 LOCK Reply (Call In 52) NLM_DENIED_GRACE_PERIOD
>  56  35.030153  server:  NLM  V4 LOCK Reply (Call In 53) NLM_DENIED_GRACE_PERIOD
> 
> The three contending client processes. I don't see a lock registration for
> svid:114, only UNLOCK calls which fail with NLM_DENIED_GRACE_PERIOD. The
> above goes on for a while. Neither the server nor the client shows any lock
> in /proc/locks at this point.
> 
> 166 115.028376  client:  NLM  V4 LOCK Call (Reply In 168) FH:0xcafa61cc svid:115 pos:0-0
> 167 115.028394  client:  NLM  V4 LOCK Call (Reply In 169) FH:0xcafa61cc svid:116 pos:0-0
> 168 115.028440  server:  NLM  V4 LOCK Reply (Call In 166) NLM_DENIED_GRACE_PERIOD
> 169 115.028465  server:  NLM  V4 LOCK Reply (Call In 167) NLM_DENIED_GRACE_PERIOD
> 170 120.027233  client:  NLM  V4 UNLOCK Call (Reply In 171) FH:0xcafa61cc svid:114 pos:0-0
> 171 120.027337  server:  NLM  V4 UNLOCK Reply (Call In 170) NLM_DENIED_GRACE_PERIOD
> 172 120.028234  client:  NLM  V4 LOCK Call (Reply In 175) FH:0xcafa61cc svid:116 pos:0-0
> 173 120.028258  client:  NLM  V4 LOCK Call (Reply In 174) FH:0xcafa61cc svid:115 pos:0-0
> 174 120.030601  server:  NLM  V4 LOCK Reply (Call In 173)
> 175 120.030656  server:  NLM  V4 LOCK Reply (Call In 172) NLM_BLOCKED
> 
> This doesn't add up. There hasn't been a successful unlock for svid:114
> (see #213 for that) but still one of the locks is granted.

Has the client attempted to recover the lock for svid:114? If not, then
the server has no knowledge of that lock.

> 176 120.030781  client:  NLM  V4 LOCK Call (Reply In 177) FH:0xcafa61cc svid:115 pos:0-0
> 177 120.030849  server:  NLM  V4 LOCK Reply (Call In 176)
> 
> Strange: an identical lock request but with a different rpc xid (i.e. no
> packet duplication).

No. That would be the non-blocking lock that is intended as a 'ping' to
see if the server is still alive. It duplicates the blocking lock in all
details except that the 'block' flag is not set.

> 178 120.031078  client:  NFS  V3 GETATTR Call (Reply In 179), FH:0xcafa61cc
> 179 120.031154  server:  NFS  V3 GETATTR Reply (Call In 178)  Regular File mode:0644 uid:363 gid:1500
> 180 120.033973  client:  NFS  V3 ACCESS Call (Reply In 181), FH:0x0308030a
> 181 120.034030  server:  NFS  V3 ACCESS Reply (Call In 180)
> 182 120.034223  client:  NFS  V3 LOOKUP Call (Reply In 183), DH:0x0308030a/loc
> 183 120.034285  server:  NFS  V3 LOOKUP Reply (Call In 182), FH:0x81685ca0
> 184 120.034472  client:  NFS  V3 ACCESS Call (Reply In 185), FH:0x0308030c
> 185 120.034526  server:  NFS  V3 ACCESS Reply (Call In 184)
> 186 120.034722  client:  NFS  V3 ACCESS Call (Reply In 187), FH:0x0308030c
> 187 120.034776  server:  NFS  V3 ACCESS Reply (Call In 186)
> 188 120.034922  client:  NFS  V3 LOOKUP Call (Reply In 189), DH:0x0308030c/locktest
> 189 120.034993  server:  NFS  V3 LOOKUP Reply (Call In 188), FH:0xcafa61cc
> 190 120.035172  client:  NFS  V3 ACCESS Call (Reply In 191), FH:0xcafa61cc
> 191 120.035230  server:  NFS  V3 ACCESS Reply (Call In 190)
> 193 122.032218  client:  NLM  V4 UNLOCK Call (Reply In 195) FH:0xcafa61cc svid:115 pos:0-0
> 194 122.032253  client:  NLM  V4 LOCK Call (Reply In 197) FH:0xcafa61cc svid:119 pos:0-0
> 195 122.032343  server:  NLM  V4 UNLOCK Reply (Call In 193)
> 197 122.032794  server:  NLM  V4 LOCK Reply (Call In 194) NLM_BLOCKED
> 201 122.033767  server:  NLM  V4 GRANTED_MSG Call (Reply In 202) FH:0xcafa61cc svid:116 pos:0-0
> 202 122.034066  client:  NLM  V4 GRANTED_MSG Reply (Call In 201)
> 205 122.034665  client:  NLM  V4 GRANTED_RES Call (Reply In 206) NLM_DENIED
> 206 122.034753  server:  NLM  V4 GRANTED_RES Reply (Call In 205)

What happened here? Why did the client refuse the lock for svid 116?

Did the task get signalled? If so, where is the CANCEL request?

> 207 122.036312  client:  NFS  V3 GETATTR Call (Reply In 208), FH:0xcafa61cc
> 208 122.036394  server:  NFS  V3 GETATTR Reply (Call In 207)  Regular File mode:0644 uid:363 gid:1500
> 209 122.036611  client:  NLM  V4 LOCK Call (Reply In 210) FH:0xcafa61cc svid:120 pos:0-0
> 210 122.036674  server:  NLM  V4 LOCK Reply (Call In 209) NLM_BLOCKED
> 213 125.027091  client:  NLM  V4 UNLOCK Call (Reply In 214) FH:0xcafa61cc svid:114 pos:0-0
> 214 125.027194  server:  NLM  V4 UNLOCK Reply (Call In 213)
> 215 125.029487  client:  NFS  V3 GETATTR Call (Reply In 216), FH:0xcafa61cc
> 216 125.029570  server:  NFS  V3 GETATTR Reply (Call In 215)  Regular File mode:0644 uid:363 gid:1500
> 217 125.029836  client:  NLM  V4 LOCK Call (Reply In 218) FH:0xcafa61cc svid:121 pos:0-0
> 218 125.029895  server:  NLM  V4 LOCK Reply (Call In 217) NLM_BLOCKED
> 224 152.032157  client:  NLM  V4 LOCK Call (Reply In 225) FH:0xcafa61cc svid:119 pos:0-0
> 225 152.032283  server:  NLM  V4 LOCK Reply (Call In 224) NLM_BLOCKED
> 226 152.035103  client:  NLM  V4 LOCK Call (Reply In 227) FH:0xcafa61cc svid:120 pos:0-0
> 227 152.035157  server:  NLM  V4 LOCK Reply (Call In 226) NLM_BLOCKED
> 230 155.029676  client:  NLM  V4 LOCK Call (Reply In 231) FH:0xcafa61cc svid:121 pos:0-0
> 231 155.029761  server:  NLM  V4 LOCK Reply (Call In 230) NLM_BLOCKED
> 
> To recap the problem: one of the fcntl calls to obtain a write lock
> returns
> 
> 	lck: fcntl: No locks available
> 
> shortly after the grace period expires. After that everything gets stuck,
> with the server holding a write lock and no corresponding client-side lock.
> 
> 
> IMO it looks like the client is to blame, even if the server should or
> could have accepted UNLOCK during grace (I don't know, I'm not an expert
> on that one).

Possibly... It depends entirely on what happened to cause it to deny the
GRANTED callback...

Trond



* Re: [NLM] 2.6.27.14 breakage when grace period expires
       [not found]           ` <1234451789.7190.38.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
@ 2009-02-12 15:36             ` Frank van Maarseveen
  2009-02-12 18:17               ` Trond Myklebust
  0 siblings, 1 reply; 25+ messages in thread
From: Frank van Maarseveen @ 2009-02-12 15:36 UTC (permalink / raw)
  To: Trond Myklebust
  Cc: Frank van Maarseveen, J. Bruce Fields, Linux NFS mailing list

On Thu, Feb 12, 2009 at 10:16:29AM -0500, Trond Myklebust wrote:
> On Thu, 2009-02-12 at 15:28 +0100, Frank van Maarseveen wrote:
> > On Wed, Feb 11, 2009 at 03:39:48PM -0500, J. Bruce Fields wrote:
> > > On Wed, Feb 11, 2009 at 09:37:03PM +0100, Frank van Maarseveen wrote:
> > > > On Wed, Feb 11, 2009 at 03:35:55PM -0500, J. Bruce Fields wrote:
> > > > > On Wed, Feb 11, 2009 at 12:23:18PM +0100, Frank van Maarseveen wrote:
> > > > > > I'm sorry to inform you, but... it seems there is a problem in the NLM
> > > > > > subsystem similar to the one reported previously, this time triggered
> > > > > > when the grace period expires after a reboot.
> > > > > > 
> > > > > > Client and server run 2.6.27.14 + previous fix, NFSv3.
> > > > > > 
> > > > > > On the client there are three shells running:
> > > > > > 
> > > > > > 	while :; do lck -w /mnt/foo 2; done
> > > > > > 
> > > > > > The "lck" program is the same as posted before and it obtains an exclusive
> > > > > > write lock then waits 2 seconds in above invocation (there's probably an
> > > > > > "fcntl" command equivalent). After an orderly server reboot + grace time
> > > > > 
> > > > > How are you rebooting the server?
> > > > 
> > > > "reboot"
> > > 
> > > Could you watch the nfs/nlm/nsm traffic on reboot and make sure that the
> > > server is actually sending the reboot notification to the client, and
> > > that the client is trying to reclaim?  (Wireshark should make this all
> > > fairly clear.  But capture the traffic with tcpdump -s0 -wtmp.pcap and
> > > send it to me if you're having trouble interpreting it.)
> > 
> > I have a capture with comments below. It raised so many questions
> > that I decided to do some more testing, trying to figure out how
> > it looks when the locking works. This issue now appears to predate the
> > fuse changes and is also present when both client and server run
> > 2.6.24.4. I decided to stick with the traffic capture for 2.6.27.14 +
> > previous fix as discussed earlier. The full capture is available at
> > http://www.frankvm.com/tmp/2.6.27.14-nlm-grace.pcap. It's about 33k and
> > was started on the server from the initscripts, right after the reboot,
> > and filtered on the client IP address.
> > 
> > Exported by wireshark (filter: nfs or stat or nlm) and condensed:
> > 
> >   #   time      src      prot
> >   1   0.000000  client:  NFS  V3 GETATTR Call (Reply In 42), FH:0x0308030a
> >   2   0.000018  client:  NFS  [RPC retransmission of #1]V3 GETATTR Call (Reply In 42), FH:0x0308030a
> >   5   0.000583  server:  ICMP Destination unreachable (Port unreachable)
> >   6   0.000589  server:  ICMP Destination unreachable (Port unreachable)
> >   7   1.891277  client:  NFS  [RPC retransmission of #1]V3 GETATTR Call (Reply In 42), FH:0x0308030a
> >   8   1.891320  server:  ICMP Destination unreachable (Port unreachable)
> >   9   5.827053  client:  NFS  [RPC retransmission of #1]V3 GETATTR Call (Reply In 42), FH:0x0308030a
> >  10   5.827119  server:  ICMP Destination unreachable (Port unreachable)
> >  11  14.626501  client:  NFS  [RPC retransmission of #1]V3 GETATTR Call (Reply In 42), FH:0x0308030a
> >  12  14.626587  server:  ICMP Destination unreachable (Port unreachable)
> >  15  15.726426  client:  NFS  [RPC retransmission of #1]V3 GETATTR Call (Reply In 42), FH:0x0308030a
> >  16  15.726505  server:  ICMP Destination unreachable (Port unreachable)
> >  17  17.926284  client:  NFS  [RPC retransmission of #1]V3 GETATTR Call (Reply In 42), FH:0x0308030a
> >  18  17.926368  server:  ICMP Destination unreachable (Port unreachable)
> >  25  22.326006  client:  NFS  [RPC retransmission of #1]V3 GETATTR Call (Reply In 42), FH:0x0308030a
> >  26  22.326090  server:  ICMP Destination unreachable (Port unreachable)
> >  35  30.022271  client:  NLM  V4 UNLOCK Call (Reply In 36) FH:0xcafa61cc svid:114 pos:0-0
> >  36  30.029511  server:  NLM  V4 UNLOCK Reply (Call In 35) NLM_DENIED_GRACE_PERIOD
> >  37  30.029660  client:  NLM  V4 LOCK Call (Reply In 39) FH:0xcafa61cc svid:116 pos:0-0
> >  38  30.029691  client:  NLM  V4 LOCK Call (Reply In 40) FH:0xcafa61cc svid:115 pos:0-0
> >  39  30.029884  server:  NLM  V4 LOCK Reply (Call In 37) NLM_DENIED_GRACE_PERIOD
> >  40  30.029914  server:  NLM  V4 LOCK Reply (Call In 38) NLM_DENIED_GRACE_PERIOD
> >  41  31.125403  client:  NFS  [RPC retransmission of #1]V3 GETATTR Call (Reply In 42), FH:0x0308030a
> >  42  31.127499  server:  NFS  V3 GETATTR Reply (Call In 1)  Directory mode:0755 uid:0 gid:0
> >  43  31.127942  client:  NFS  V3 GETATTR Call (Reply In 45), FH:0x0308030a
> >  45  31.129378  server:  NFS  V3 GETATTR Reply (Call In 43)  Directory mode:0755 uid:0 gid:0
> >  47  31.129958  server:  STAT V1 NOTIFY Call (Reply In 48)
> >  48  31.130301  client:  STAT V1 NOTIFY Reply (Call In 47)
> > 
> > Reboot notification ok.
> > 
> >  51  35.029968  client:  NLM  V4 UNLOCK Call (Reply In 54) FH:0xcafa61cc svid:114 pos:0-0
> >  52  35.030003  client:  NLM  V4 LOCK Call (Reply In 55) FH:0xcafa61cc svid:116 pos:0-0
> >  53  35.030016  client:  NLM  V4 LOCK Call (Reply In 56) FH:0xcafa61cc svid:115 pos:0-0
> >  54  35.030085  server:  NLM  V4 UNLOCK Reply (Call In 51) NLM_DENIED_GRACE_PERIOD
> >  55  35.030126  server:  NLM  V4 LOCK Reply (Call In 52) NLM_DENIED_GRACE_PERIOD
> >  56  35.030153  server:  NLM  V4 LOCK Reply (Call In 53) NLM_DENIED_GRACE_PERIOD
> > 
> > The three contending client processes. I don't see a lock registration for
> > svid:114, only UNLOCK calls which fail with NLM_DENIED_GRACE_PERIOD. The
> > above goes on for a while. Neither the server nor the client shows any lock
> > in /proc/locks at this point.
> > 
> > 166 115.028376  client:  NLM  V4 LOCK Call (Reply In 168) FH:0xcafa61cc svid:115 pos:0-0
> > 167 115.028394  client:  NLM  V4 LOCK Call (Reply In 169) FH:0xcafa61cc svid:116 pos:0-0
> > 168 115.028440  server:  NLM  V4 LOCK Reply (Call In 166) NLM_DENIED_GRACE_PERIOD
> > 169 115.028465  server:  NLM  V4 LOCK Reply (Call In 167) NLM_DENIED_GRACE_PERIOD
> > 170 120.027233  client:  NLM  V4 UNLOCK Call (Reply In 171) FH:0xcafa61cc svid:114 pos:0-0
> > 171 120.027337  server:  NLM  V4 UNLOCK Reply (Call In 170) NLM_DENIED_GRACE_PERIOD
> > 172 120.028234  client:  NLM  V4 LOCK Call (Reply In 175) FH:0xcafa61cc svid:116 pos:0-0
> > 173 120.028258  client:  NLM  V4 LOCK Call (Reply In 174) FH:0xcafa61cc svid:115 pos:0-0
> > 174 120.030601  server:  NLM  V4 LOCK Reply (Call In 173)
> > 175 120.030656  server:  NLM  V4 LOCK Reply (Call In 172) NLM_BLOCKED
> > 
> > This doesn't add up. There hasn't been a successful unlock for svid:114
> > (see #213 for that) but still one of the locks is granted.
> 
> Has the client attempted to recover the lock for svid:114? If not, then
> the server has no knowledge of that lock.

Exactly. Apparently the client tries to unlock an unrecovered lock.

> 
> > 176 120.030781  client:  NLM  V4 LOCK Call (Reply In 177) FH:0xcafa61cc svid:115 pos:0-0
> > 177 120.030849  server:  NLM  V4 LOCK Reply (Call In 176)
> > 
> > Strange: an identical lock request but with a different rpc xid (i.e. no
> > packet duplication).
> 
> No. That would be the non-blocking lock that is intended as a 'ping' to
> see if the server is still alive. It duplicates the blocking lock in all
> details except that the 'block' flag is not set.
> 
> > 178 120.031078  client:  NFS  V3 GETATTR Call (Reply In 179), FH:0xcafa61cc
> > 179 120.031154  server:  NFS  V3 GETATTR Reply (Call In 178)  Regular File mode:0644 uid:363 gid:1500
> > 180 120.033973  client:  NFS  V3 ACCESS Call (Reply In 181), FH:0x0308030a
> > 181 120.034030  server:  NFS  V3 ACCESS Reply (Call In 180)
> > 182 120.034223  client:  NFS  V3 LOOKUP Call (Reply In 183), DH:0x0308030a/loc
> > 183 120.034285  server:  NFS  V3 LOOKUP Reply (Call In 182), FH:0x81685ca0
> > 184 120.034472  client:  NFS  V3 ACCESS Call (Reply In 185), FH:0x0308030c
> > 185 120.034526  server:  NFS  V3 ACCESS Reply (Call In 184)
> > 186 120.034722  client:  NFS  V3 ACCESS Call (Reply In 187), FH:0x0308030c
> > 187 120.034776  server:  NFS  V3 ACCESS Reply (Call In 186)
> > 188 120.034922  client:  NFS  V3 LOOKUP Call (Reply In 189), DH:0x0308030c/locktest
> > 189 120.034993  server:  NFS  V3 LOOKUP Reply (Call In 188), FH:0xcafa61cc
> > 190 120.035172  client:  NFS  V3 ACCESS Call (Reply In 191), FH:0xcafa61cc
> > 191 120.035230  server:  NFS  V3 ACCESS Reply (Call In 190)
> > 193 122.032218  client:  NLM  V4 UNLOCK Call (Reply In 195) FH:0xcafa61cc svid:115 pos:0-0
> > 194 122.032253  client:  NLM  V4 LOCK Call (Reply In 197) FH:0xcafa61cc svid:119 pos:0-0
> > 195 122.032343  server:  NLM  V4 UNLOCK Reply (Call In 193)
> > 197 122.032794  server:  NLM  V4 LOCK Reply (Call In 194) NLM_BLOCKED
> > 201 122.033767  server:  NLM  V4 GRANTED_MSG Call (Reply In 202) FH:0xcafa61cc svid:116 pos:0-0
> > 202 122.034066  client:  NLM  V4 GRANTED_MSG Reply (Call In 201)
> > 205 122.034665  client:  NLM  V4 GRANTED_RES Call (Reply In 206) NLM_DENIED
> > 206 122.034753  server:  NLM  V4 GRANTED_RES Reply (Call In 205)
> 
> What happened here? Why did the client refuse the lock for svid 116?
> 
> Did the task get signalled? If so, where is the CANCEL request?

The task did not get signaled; there is no CANCEL.

> 
> > 207 122.036312  client:  NFS  V3 GETATTR Call (Reply In 208), FH:0xcafa61cc
> > 208 122.036394  server:  NFS  V3 GETATTR Reply (Call In 207)  Regular File mode:0644 uid:363 gid:1500
> > 209 122.036611  client:  NLM  V4 LOCK Call (Reply In 210) FH:0xcafa61cc svid:120 pos:0-0
> > 210 122.036674  server:  NLM  V4 LOCK Reply (Call In 209) NLM_BLOCKED
> > 213 125.027091  client:  NLM  V4 UNLOCK Call (Reply In 214) FH:0xcafa61cc svid:114 pos:0-0
> > 214 125.027194  server:  NLM  V4 UNLOCK Reply (Call In 213)
> > 215 125.029487  client:  NFS  V3 GETATTR Call (Reply In 216), FH:0xcafa61cc
> > 216 125.029570  server:  NFS  V3 GETATTR Reply (Call In 215)  Regular File mode:0644 uid:363 gid:1500
> > 217 125.029836  client:  NLM  V4 LOCK Call (Reply In 218) FH:0xcafa61cc svid:121 pos:0-0
> > 218 125.029895  server:  NLM  V4 LOCK Reply (Call In 217) NLM_BLOCKED
> > 224 152.032157  client:  NLM  V4 LOCK Call (Reply In 225) FH:0xcafa61cc svid:119 pos:0-0
> > 225 152.032283  server:  NLM  V4 LOCK Reply (Call In 224) NLM_BLOCKED
> > 226 152.035103  client:  NLM  V4 LOCK Call (Reply In 227) FH:0xcafa61cc svid:120 pos:0-0
> > 227 152.035157  server:  NLM  V4 LOCK Reply (Call In 226) NLM_BLOCKED
> > 230 155.029676  client:  NLM  V4 LOCK Call (Reply In 231) FH:0xcafa61cc svid:121 pos:0-0
> > 231 155.029761  server:  NLM  V4 LOCK Reply (Call In 230) NLM_BLOCKED
> > 
> > To recap the problem: one of the fcntl calls to obtain a write lock
> > returns
> > 
> > 	lck: fcntl: No locks available
> > 
> > shortly after the grace period expires. After that everything gets stuck,
> > with the server holding a write lock and no corresponding client-side lock.
> > 
> > 
> > IMO it looks like the client is to blame, even if the server should or
> > could have accepted UNLOCK during grace (I don't know, I'm not an expert
> > on that one).
> 
> Possibly... It depends entirely on what happened to cause it to deny the
> GRANTED callback...

A little theorizing:
If the unlock of a yet-unrecovered lock has failed up to that point, then
the client surely must remember the lock somehow. That might explain the
secondary error when a conflicting lock is granted by the server.

-- 
Frank


* Re: [NLM] 2.6.27.14 breakage when grace period expires
  2009-02-12 15:36             ` Frank van Maarseveen
@ 2009-02-12 18:17               ` Trond Myklebust
       [not found]                 ` <1234462647.7190.53.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
  0 siblings, 1 reply; 25+ messages in thread
From: Trond Myklebust @ 2009-02-12 18:17 UTC (permalink / raw)
  To: Frank van Maarseveen; +Cc: J. Bruce Fields, Linux NFS mailing list

On Thu, 2009-02-12 at 16:36 +0100, Frank van Maarseveen wrote:
> A little theorizing:
> If the unlock of a yet-unrecovered lock has failed up to that point, then
> the client surely must remember the lock somehow. That might explain the
> secondary error when a conflicting lock is granted by the server.

Sorry, but that doesn't hold water. The client will release the VFS
'mirror' of the lock before it attempts to unlock. Otherwise, you could
have some nasty races between the unlock thread and the recovery
thread...
Besides, the granted callback handler on the client only checks the list
of blocked locks for a match.

Oh, bugger, I know what this is... It's the same thing that happened to
the NFSv4 callback server. If you compile with CONFIG_IPV6 or
CONFIG_IPV6_MODULE enabled, and also set CONFIG_SUNRPC_REGISTER_V4, then
the NLM server will listen on an IPv6 socket, and so the RPC requests
come in with their IPv4 addresses mapped into the IPv6 namespace.
The client, on the other hand, is using an IPv4 socket, 'cos you
specified an IPv4 address to the mount command.
The result is that the call to nlm_cmp_addr() in nlmclnt_grant() always
fails...

Basically, we need to replace nlm_cmp_addr() with something akin to
nfs_sockaddr_match_ipaddr(), which will compare v4 mapped addresses.
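
Roughly something like this (a userspace sketch of the comparison only;
the shape of the eventual kernel helper is a different matter):

#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Compare two socket addresses, treating an IPv4 address and its
 * v4-mapped IPv6 form (::ffff:a.b.c.d) as equal. */
static int addr_match_v4mapped(const struct sockaddr *sa1,
			       const struct sockaddr *sa2)
{
	const struct sockaddr_in *sin4;
	const struct sockaddr_in6 *sin6;

	if (sa1->sa_family == AF_INET && sa2->sa_family == AF_INET) {
		const struct sockaddr_in *a = (const void *)sa1;
		const struct sockaddr_in *b = (const void *)sa2;
		return a->sin_addr.s_addr == b->sin_addr.s_addr;
	}
	if (sa1->sa_family == AF_INET6 && sa2->sa_family == AF_INET6) {
		const struct sockaddr_in6 *a = (const void *)sa1;
		const struct sockaddr_in6 *b = (const void *)sa2;
		return IN6_ARE_ADDR_EQUAL(&a->sin6_addr, &b->sin6_addr);
	}
	/* Mixed families: equal only if the v6 side is v4-mapped and its
	 * last four bytes match the IPv4 address. */
	if (sa1->sa_family == AF_INET6) {
		sin6 = (const void *)sa1;
		sin4 = (const void *)sa2;
	} else {
		sin6 = (const void *)sa2;
		sin4 = (const void *)sa1;
	}
	if (!IN6_IS_ADDR_V4MAPPED(&sin6->sin6_addr))
		return 0;
	return memcmp(&sin6->sin6_addr.s6_addr[12], &sin4->sin_addr,
		      sizeof(struct in_addr)) == 0;
}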

The workaround should be simply to turn off CONFIG_SUNRPC_REGISTER_V4 if
you're not planning on ever using NFS-over-IPv6...

Cheers
  Trond



* Re: [NLM] 2.6.27.14 breakage when grace period expires
       [not found]                 ` <1234462647.7190.53.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
@ 2009-02-12 18:29                   ` Frank van Maarseveen
  2009-02-12 19:10                     ` Trond Myklebust
  0 siblings, 1 reply; 25+ messages in thread
From: Frank van Maarseveen @ 2009-02-12 18:29 UTC (permalink / raw)
  To: Trond Myklebust
  Cc: Frank van Maarseveen, J. Bruce Fields, Linux NFS mailing list

On Thu, Feb 12, 2009 at 01:17:27PM -0500, Trond Myklebust wrote:
> On Thu, 2009-02-12 at 16:36 +0100, Frank van Maarseveen wrote:
> > A little theorizing:
> > If the unlock of a yet-unrecovered lock has failed up to that point, then
> > the client surely must remember the lock somehow. That might explain the
> > secondary error when a conflicting lock is granted by the server.
> 
> Sorry, but that doesn't hold water. The client will release the VFS
> 'mirror' of the lock before it attempts to unlock. Otherwise, you could
> have some nasty races between the unlock thread and the recovery
> thread...
> Besides, the granted callback handler on the client only checks the list
> of blocked locks for a match.

OK, then we have more than one NLM bug to resolve.

> 
> Oh, bugger, I know what this is... It's the same thing that happened to
> the NFSv4 callback server. If you compile with CONFIG_IPV6 or
> CONFIG_IPV6_MODULE enabled, and also set CONFIG_SUNRPC_REGISTER_V4, then
> the NLM server will listen on an IPv6 socket, and so the RPC requests
> come in with their IPv4 addresses mapped into the IPv6 namespace.

Nope:

	$ zgrep IPV6 /proc/config.gz 
	# CONFIG_IPV6 is not set
	$ zgrep SUNRPC /proc/config.gz 
	CONFIG_SUNRPC=y
	CONFIG_SUNRPC_GSS=y
	# CONFIG_SUNRPC_BIND34 is not set


And remember this is not a recent regression.

-- 
Frank


* Re: [NLM] 2.6.27.14 breakage when grace period expires
  2009-02-12 18:29                   ` Frank van Maarseveen
@ 2009-02-12 19:10                     ` Trond Myklebust
       [not found]                       ` <1234465837.7190.62.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
  0 siblings, 1 reply; 25+ messages in thread
From: Trond Myklebust @ 2009-02-12 19:10 UTC (permalink / raw)
  To: Frank van Maarseveen, Mr. Charles Edward Lever
  Cc: J. Bruce Fields, Linux NFS mailing list

On Thu, 2009-02-12 at 19:29 +0100, Frank van Maarseveen wrote:
> On Thu, Feb 12, 2009 at 01:17:27PM -0500, Trond Myklebust wrote:
> > On Thu, 2009-02-12 at 16:36 +0100, Frank van Maarseveen wrote:
> > > A little theorizing:
> > > If the unlock of a yet-unrecovered lock has failed up to that point, then
> > > the client surely must remember the lock somehow. That might explain the
> > > secondary error when a conflicting lock is granted by the server.
> > 
> > Sorry, but that doesn't hold water. The client will release the VFS
> > 'mirror' of the lock before it attempts to unlock. Otherwise, you could
> > have some nasty races between the unlock thread and the recovery
> > thread...
> > Besides, the granted callback handler on the client only checks the list
> > of blocked locks for a match.
> 
> OK, then we have more than one NLM bug to resolve.
> 
> > 
> > Oh, bugger, I know what this is... It's the same thing that happened to
> > the NFSv4 callback server. If you compile with CONFIG_IPV6 or
> > CONFIG_IPV6_MODULE enabled, and also set CONFIG_SUNRPC_REGISTER_V4, then
> > the NLM server will listen on an IPv6 socket, and so the RPC requests
> > come in with their IPv4 addresses mapped into the IPv6 namespace.
> 
> Nope:
> 
> 	$ zgrep IPV6 /proc/config.gz 
> 	# CONFIG_IPV6 is not set
> 	$ zgrep SUNRPC /proc/config.gz 
> 	CONFIG_SUNRPC=y
> 	CONFIG_SUNRPC_GSS=y
> 	# CONFIG_SUNRPC_BIND34 is not set

Sorry, yes... 2.6.27.x should be OK. The lockd v4mapped addresses bug is
specific to 2.6.29. Chuck, are you planning on fixing this before
2.6.29-final comes out?

> And remember this is not a recent regression.

It would help if you sent us the full binary tcpdump, instead of just
the summary. That should enable us to figure out which of the tests is
failing in nlmclnt_grant().

Trond



* Re: [NLM] 2.6.27.14 breakage when grace period expires
       [not found]                       ` <1234465837.7190.62.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
@ 2009-02-12 19:16                         ` Frank van Maarseveen
  2009-02-12 20:24                           ` Trond Myklebust
  2009-02-12 19:35                         ` Chuck Lever
  1 sibling, 1 reply; 25+ messages in thread
From: Frank van Maarseveen @ 2009-02-12 19:16 UTC (permalink / raw)
  To: Trond Myklebust
  Cc: Frank van Maarseveen, Mr. Charles Edward Lever, J. Bruce Fields,
	Linux NFS mailing list

On Thu, Feb 12, 2009 at 02:10:37PM -0500, Trond Myklebust wrote:
> On Thu, 2009-02-12 at 19:29 +0100, Frank van Maarseveen wrote:
> > On Thu, Feb 12, 2009 at 01:17:27PM -0500, Trond Myklebust wrote:
> > > On Thu, 2009-02-12 at 16:36 +0100, Frank van Maarseveen wrote:
> > > > A little theorizing:
> > > > If the unlock of a yet-unrecovered lock has failed up to that point, then
> > > > the client surely must remember the lock somehow. That might explain the
> > > > secondary error when a conflicting lock is granted by the server.
> > > 
> > > Sorry, but that doesn't hold water. The client will release the VFS
> > > 'mirror' of the lock before it attempts to unlock. Otherwise, you could
> > > have some nasty races between the unlock thread and the recovery
> > > thread...
> > > Besides, the granted callback handler on the client only checks the list
> > > of blocked locks for a match.
> > 
> > OK, then we have more than one NLM bug to resolve.
> > 
> > > 
> > > Oh, bugger, I know what this is... It's the same thing that happened to
> > > the NFSv4 callback server. If you compile with CONFIG_IPV6 or
> > > CONFIG_IPV6_MODULE enabled, and also set CONFIG_SUNRPC_REGISTER_V4, then
> > > the NLM server will listen on an IPv6 socket, and so the RPC requests
> > > come in with their IPv4 addresses mapped into the IPv6 namespace.
> > 
> > Nope:
> > 
> > 	$ zgrep IPV6 /proc/config.gz 
> > 	# CONFIG_IPV6 is not set
> > 	$ zgrep SUNRPC /proc/config.gz 
> > 	CONFIG_SUNRPC=y
> > 	CONFIG_SUNRPC_GSS=y
> > 	# CONFIG_SUNRPC_BIND34 is not set
> 
> Sorry, yes... 2.6.27.x should be OK. The lockd v4mapped addresses bug is
> specific to 2.6.29. Chuck, are you planning on fixing this before
> 2.6.29-final comes out?
> 
> > And remember this is not a recent regression.
> 
> It would help if you sent us the full binary tcpdump, instead of just
> the summary. That should enable us to figure out which of the tests is
> failing in nlmclnt_grant().

I posted the link already. Anyway, see the attachment.

-- 
Frank

[-- Attachment #2: 2.6.27.14-nlm-grace.pcap --]
[-- Type: application/octet-stream, Size: 33708 bytes --]


* Re: [NLM] 2.6.27.14 breakage when grace period expires
       [not found]                       ` <1234465837.7190.62.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
  2009-02-12 19:16                         ` Frank van Maarseveen
@ 2009-02-12 19:35                         ` Chuck Lever
  2009-02-12 19:43                           ` Trond Myklebust
  1 sibling, 1 reply; 25+ messages in thread
From: Chuck Lever @ 2009-02-12 19:35 UTC (permalink / raw)
  To: Trond Myklebust
  Cc: Frank van Maarseveen, J. Bruce Fields, Linux NFS mailing list

On Feb 12, 2009, at 2:10 PM, Trond Myklebust wrote:
> On Thu, 2009-02-12 at 19:29 +0100, Frank van Maarseveen wrote:
>> On Thu, Feb 12, 2009 at 01:17:27PM -0500, Trond Myklebust wrote:
>>> On Thu, 2009-02-12 at 16:36 +0100, Frank van Maarseveen wrote:
>>>> A little theorizing:
>>>> If the unlock of a yet-unrecovered lock has failed up to that point,
>>>> then the client surely must remember the lock somehow. That might
>>>> explain the secondary error when a conflicting lock is granted by
>>>> the server.
>>>
>>> Sorry, but that doesn't hold water. The client will release the VFS
>>> 'mirror' of the lock before it attempts to unlock. Otherwise, you
>>> could have some nasty races between the unlock thread and the
>>> recovery thread...
>>> Besides, the granted callback handler on the client only checks the
>>> list of blocked locks for a match.
>>
>> OK, then we have more than one NLM bug to resolve.
>>
>>>
>>> Oh, bugger, I know what this is... It's the same thing that happened
>>> to the NFSv4 callback server. If you compile with CONFIG_IPV6 or
>>> CONFIG_IPV6_MODULE enabled, and also set CONFIG_SUNRPC_REGISTER_V4,
>>> then the NLM server will listen on an IPv6 socket, and so the RPC
>>> requests come in with their IPv4 addresses mapped into the IPv6
>>> namespace.
>>
>> Nope:
>>
>> 	$ zgrep IPV6 /proc/config.gz
>> 	# CONFIG_IPV6 is not set
>> 	$ zgrep SUNRPC /proc/config.gz
>> 	CONFIG_SUNRPC=y
>> 	CONFIG_SUNRPC_GSS=y
>> 	# CONFIG_SUNRPC_BIND34 is not set
>
> Sorry, yes... 2.6.27.x should be OK. The lockd v4mapped addresses bug
> is specific to 2.6.29. Chuck, are you planning on fixing this before
> 2.6.29-final comes out?

I wasn't sure exactly where the compared addresses came from.  I had  
assumed that they all came through the listener, so we wouldn't need  
this kind of translation.  It shouldn't be difficult to map addresses  
passed in via nlmclnt_init() to AF_INET6.

But this is the kind of thing that makes "falling back" to an AF_INET  
listener a little challenging.  We will have to record what flavor the  
listener is and do a translation depending on what listener family was  
actually created.

--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com


* Re: [NLM] 2.6.27.14 breakage when grace period expires
  2009-02-12 19:35                         ` Chuck Lever
@ 2009-02-12 19:43                           ` Trond Myklebust
       [not found]                             ` <1234467795.7190.70.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
  0 siblings, 1 reply; 25+ messages in thread
From: Trond Myklebust @ 2009-02-12 19:43 UTC (permalink / raw)
  To: Chuck Lever; +Cc: Frank van Maarseveen, J. Bruce Fields, Linux NFS mailing list

On Thu, 2009-02-12 at 14:35 -0500, Chuck Lever wrote:
> I wasn't sure exactly where the compared addresses came from.  I had  
> assumed that they all came through the listener, so we wouldn't need  
> this kind of translation.  It shouldn't be difficult to map addresses  
> passed in via nlmclnt_init() to AF_INET6.
> 
> But this is the kind of thing that makes "falling back" to an AF_INET  
> listener a little challenging.  We will have to record what flavor the  
> listener is and do a translation depending on what listener family was  
> actually created.

Why? Should we care whether we're receiving IPv4 addresses or IPv6
v4-mapped addresses? They're the same thing...

We're already doing the mapping for the NFSv4 callback channel; see
nfs_sockaddr_match_ipaddr() in fs/nfs/client.c.
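
Roughly, that comparison amounts to the following (an illustrative
userspace sketch, not the actual in-tree helper):

#include <string.h>
#include <netinet/in.h>

/* Illustrative only: treat a v4-mapped IPv6 address (::ffff:a.b.c.d)
 * as equal to the IPv4 address it embeds. */
static int ipaddr_match_v4mapped(const struct sockaddr_in *sin,
                                 const struct sockaddr_in6 *sin6)
{
        if (!IN6_IS_ADDR_V4MAPPED(&sin6->sin6_addr))
                return 0;
        return memcmp(&sin6->sin6_addr.s6_addr[12],
                      &sin->sin_addr, 4) == 0;
}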

Trond


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [NLM] 2.6.27.14 breakage when grace period expires
       [not found]                             ` <1234467795.7190.70.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
@ 2009-02-12 20:11                               ` Chuck Lever
  2009-02-12 20:27                                 ` Trond Myklebust
  0 siblings, 1 reply; 25+ messages in thread
From: Chuck Lever @ 2009-02-12 20:11 UTC (permalink / raw)
  To: Trond Myklebust
  Cc: Frank van Maarseveen, J. Bruce Fields, Linux NFS mailing list

On Feb 12, 2009, at 2:43 PM, Trond Myklebust wrote:
> On Thu, 2009-02-12 at 14:35 -0500, Chuck Lever wrote:
>> I wasn't sure exactly where the compared addresses came from.  I had
>> assumed that they all came through the listener, so we wouldn't need
>> this kind of translation.  It shouldn't be difficult to map addresses
>> passed in via nlmclnt_init() to AF_INET6.
>>
>> But this is the kind of thing that makes "falling back" to an AF_INET
>> listener a little challenging.  We will have to record what flavor
>> the listener is and do a translation depending on what listener
>> family was actually created.
>
> Why? Should we care whether we're receiving IPv4 addresses or IPv6
> v4-mapped addresses? They're the same thing...

The problem is the listener family is now decided at run-time.  If an
AF_INET6 listener can't be created, an AF_INET listener is created
instead, even if CONFIG_IPV6 || CONFIG_IPV6_MODULE is enabled.  If an
AF_INET listener is created, we get only IPv4 addresses in
svc_rqst->rq_addr.

So we can do it either way.  Taking lockd as an example:

1.  Have nlmclnt_init() map AF_INET mount addresses to AF_INET6 iff  
the lockd listener is AF_INET6, so nlm_cmp_addr() is always dealing  
with AF_INET6 in this case, or

2.  If CONFIG_IPV6 || CONFIG_IPV6_MODULE, unconditionally map AF_INET  
addresses in nlmclnt_init and for incoming NLM requests (when lockd  
happens to have fallen back to an AF_INET listener)

Personally I think solution 1 will be less confusing operationally and
less invasive code-wise; see the sketch below.  I suppose IPv6 purists
would prefer keeping the whole stack in AF_INET6, so they would like
solution 2.
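
A sketch of solution 1 (hypothetical names throughout; "listener_family"
stands in for whatever mechanism would record which listener lockd
managed to create, and map_ipv4_to_ipv6() is the v4-mapping helper
sketched earlier in the thread):

#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>

void map_ipv4_to_ipv6(const struct sockaddr_in *sin,
                      struct sockaddr_in6 *sin6);   /* earlier sketch */

/* Hypothetical: translate the server address once, at initialization
 * time, and only when lockd actually ended up with an AF_INET6
 * listener.  After this, comparisons are always same-family. */
static void nlm_map_server_addr(struct sockaddr_storage *addr,
                                int listener_family)
{
        struct sockaddr_in sin;
        struct sockaddr_in6 sin6;

        if (listener_family != AF_INET6 ||
            ((struct sockaddr *)addr)->sa_family != AF_INET)
                return;         /* families already agree */

        memcpy(&sin, addr, sizeof(sin));
        map_ipv4_to_ipv6(&sin, &sin6);
        memset(addr, 0, sizeof(*addr));
        memcpy(addr, &sin6, sizeof(sin6));
}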

Eventually we could map incoming addresses on AF_INET listeners in the  
RPC server code, but I prefer to wait until all kernel RPC services  
have IPv6 support.

Since 2.6.29 has the CONFIG_SUNRPC_REGISTER_V4=N workaround, do we  
need to fix 2.6.29, or can this wait until 2.6.30?

--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [NLM] 2.6.27.14 breakage when grace period expires
  2009-02-12 19:16                         ` Frank van Maarseveen
@ 2009-02-12 20:24                           ` Trond Myklebust
       [not found]                             ` <1234470251.7190.102.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
  0 siblings, 1 reply; 25+ messages in thread
From: Trond Myklebust @ 2009-02-12 20:24 UTC (permalink / raw)
  To: Frank van Maarseveen
  Cc: Mr. Charles Edward Lever, J. Bruce Fields, Linux NFS mailing list

On Thu, 2009-02-12 at 20:16 +0100, Frank van Maarseveen wrote:
> On Thu, Feb 12, 2009 at 02:10:37PM -0500, Trond Myklebust wrote:
> > On Thu, 2009-02-12 at 19:29 +0100, Frank van Maarseveen wrote:
> > > On Thu, Feb 12, 2009 at 01:17:27PM -0500, Trond Myklebust wrote:
> > > > On Thu, 2009-02-12 at 16:36 +0100, Frank van Maarseveen wrote:
> > > > > A little theorizing:
> > > > > If the unlock of a yet unrecovered lock has failed up to that point then
> > > > > the client sure must remember the lock somehow. That might explain the
> > > > > secondary error when a conflicting lock is granted by the server.
> > > > 
> > > > Sorry, but that doesn't hold water. The client will release the VFS
> > > > 'mirror' of the lock before it attempts to unlock. Otherwise, you could
> > > > have some nasty races between the unlock thread and the recovery
> > > > thread...
> > > > Besides, the granted callback handler on the client only checks the list
> > > > of blocked locks for a match.
> > > 
> > > ok, then we have more than one NLM bug to resolve.
> > > 
> > > > 
> > > > Oh, bugger, I know what this is... It's the same thing that happened to
> > > > the NFSv4 callback server. If you compile with CONFIG_IPV6 or
> > > > CONFIG_IPV6_MODULE enabled, and also set CONFIG_SUNRPC_REGISTER_V4, then
> > > > the NLM server will listen on an IPv6 socket, and so the RPC requests
> > > > come in with their IPv4 addresses mapped into the IPv6 namespace.
> > > 
> > > Nope:
> > > 
> > > 	$ zgrep IPV6 /proc/config.gz 
> > > 	# CONFIG_IPV6 is not set
> > > 	$ zgrep SUNRPC /proc/config.gz 
> > > 	CONFIG_SUNRPC=y
> > > 	CONFIG_SUNRPC_GSS=y
> > > 	# CONFIG_SUNRPC_BIND34 is not set
> > 
> > Sorry, yes... 2.6.27.x should be OK. The lockd v4mapped addresses bug is
> > specific to 2.6.29. Chuck, are you planning on fixing this before
> > 2.6.29-final comes out?
> > 
> > > And remember this is not a recent regression.
> > 
> > It would help if you sent us the full binary tcpdump, instead of just
> > the summary. That should enable us to figure out which of the tests is
> > failing in nlmclnt_grant().
> 
> I posted the link already. Anyway, see attachment.

Yeah... It looks all right. The one thing that looks a bit odd is that
the GRANTED lock has a 'caller_name' field set to the name of the
server. I'm pretty sure we don't care about that, though...

Hmm... I wonder if the problem isn't just that we're failing to cancel
the lock request when the process is signalled. Can you try the
following patch?

--------------------------------------------------------------------
From: Trond Myklebust <Trond.Myklebust@netapp.com>
NLM/lockd: Always cancel blocked locks when exiting early from nlmclnt_lock

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
---

 fs/lockd/clntproc.c |    9 +++++++--
 1 files changed, 7 insertions(+), 2 deletions(-)


diff --git a/fs/lockd/clntproc.c b/fs/lockd/clntproc.c
index 31668b6..f956d1e 100644
--- a/fs/lockd/clntproc.c
+++ b/fs/lockd/clntproc.c
@@ -542,9 +542,14 @@ again:
 		status = nlmclnt_call(cred, req, NLMPROC_LOCK);
 		if (status < 0)
 			break;
-		/* Did a reclaimer thread notify us of a server reboot? */
-		if (resp->status ==  nlm_lck_denied_grace_period)
+		/* Is the server in a grace period state?
+		 * If so, we need to reset the resp->status, and
+		 * retry...
+		 */
+		if (resp->status ==  nlm_lck_denied_grace_period) {
+			resp->status = nlm_lck_blocked;
 			continue;
+		}
 		if (resp->status != nlm_lck_blocked)
 			break;
 		/* Wait on an NLM blocking lock */



^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [NLM] 2.6.27.14 breakage when grace period expires
  2009-02-12 20:11                               ` Chuck Lever
@ 2009-02-12 20:27                                 ` Trond Myklebust
       [not found]                                   ` <1234470457.7190.106.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
  0 siblings, 1 reply; 25+ messages in thread
From: Trond Myklebust @ 2009-02-12 20:27 UTC (permalink / raw)
  To: Chuck Lever; +Cc: Frank van Maarseveen, J. Bruce Fields, Linux NFS mailing list

On Thu, 2009-02-12 at 15:11 -0500, Chuck Lever wrote:
> On Feb 12, 2009, at 2:43 PM, Trond Myklebust wrote:
> > On Thu, 2009-02-12 at 14:35 -0500, Chuck Lever wrote:
> >> I wasn't sure exactly where the compared addresses came from.  I had
> >> assumed that they all came through the listener, so we wouldn't need
> >> this kind of translation.  It shouldn't be difficult to map addresses
> >> passed in via nlmclnt_init() to AF_INET6.
> >>
> >> But this is the kind of thing that makes "falling back" to an AF_INET
> >> listener a little challenging.  We will have to record what flavor
> >> the listener is and do a translation depending on what listener
> >> family was actually created.
> >
> > Why? Should we care whether we're receiving IPv4 addresses or IPv6
> > v4-mapped addresses? They're the same thing...
> 
> The problem is the listener family is now decided at run-time.  If an
> AF_INET6 listener can't be created, an AF_INET listener is created
> instead, even if CONFIG_IPV6 || CONFIG_IPV6_MODULE is enabled.  If an
> AF_INET listener is created, we get only IPv4 addresses in
> svc_rqst->rq_addr.

You're missing my point. Why should we care if it's one or the other? In
the NFSv4 case, we v4map all IPv4 addresses _unconditionally_ if it
turns out that CONFIG_IPV6 is enabled.

IOW: we always compare IPv6 addresses.

Trond


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [NLM] 2.6.27.14 breakage when grace period expires
       [not found]                                   ` <1234470457.7190.106.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
@ 2009-02-12 20:43                                     ` Chuck Lever
  2009-02-12 20:54                                       ` Trond Myklebust
  2009-02-12 22:02                                       ` Trond Myklebust
  0 siblings, 2 replies; 25+ messages in thread
From: Chuck Lever @ 2009-02-12 20:43 UTC (permalink / raw)
  To: Trond Myklebust
  Cc: Frank van Maarseveen, J. Bruce Fields, Linux NFS mailing list

On Feb 12, 2009, at 3:27 PM, Trond Myklebust wrote:
> On Thu, 2009-02-12 at 15:11 -0500, Chuck Lever wrote:
>> On Feb 12, 2009, at 2:43 PM, Trond Myklebust wrote:
>>> On Thu, 2009-02-12 at 14:35 -0500, Chuck Lever wrote:
>>>> I wasn't sure exactly where the compared addresses came from.  I
>>>> had assumed that they all came through the listener, so we wouldn't
>>>> need this kind of translation.  It shouldn't be difficult to map
>>>> addresses passed in via nlmclnt_init() to AF_INET6.
>>>>
>>>> But this is the kind of thing that makes "falling back" to an
>>>> AF_INET listener a little challenging.  We will have to record what
>>>> flavor the listener is and do a translation depending on what
>>>> listener family was actually created.
>>>
>>> Why? Should we care whether we're receiving IPv4 addresses or IPv6
>>> v4-mapped addresses? They're the same thing...
>>
>> The problem is the listener family is now decided at run-time.  If an
>> AF_INET6 listener can't be created, an AF_INET listener is created
>> instead, even if CONFIG_IPV6 || CONFIG_IPV6_MODULE is enabled.  If an
>> AF_INET listener is created, we get only IPv4 addresses in
>> svc_rqst->rq_addr.
>
> You're missing my point. Why should we care if it's one or the other?
> In the NFSv4 case, we v4map all IPv4 addresses _unconditionally_ if it
> turns out that CONFIG_IPV6 is enabled.
>
> IOW: we always compare IPv6 addresses.

The reason we might care in this case is that nlm_cmp_addr() is
executed more frequently than nfs_sockaddr_match_ipaddr().

Mapping the server address in nlmclnt_init() means we translate the  
server address once and are done with it.  We never have to map  
incoming AF_INET addresses in NLM requests, and we don't have the  
extra conditionals every time we go through nlm_cmp_addr().

This keeps nlm_cmp_addr() as simple as it can be: it compares only two  
AF_INET addresses or two AF_INET6 addresses.
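
Something of this shape, in other words (a sketch of the idea, not the
in-tree function):

#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Sketch: with both arguments guaranteed to be the same family, the
 * comparison needs no mapping logic at all. */
static int nlm_cmp_addr_sketch(const struct sockaddr *a,
                               const struct sockaddr *b)
{
        if (a->sa_family != b->sa_family)
                return 0;
        switch (a->sa_family) {
        case AF_INET:
                return ((const struct sockaddr_in *)a)->sin_addr.s_addr ==
                       ((const struct sockaddr_in *)b)->sin_addr.s_addr;
        case AF_INET6:
                return memcmp(&((const struct sockaddr_in6 *)a)->sin6_addr,
                              &((const struct sockaddr_in6 *)b)->sin6_addr,
                              sizeof(struct in6_addr)) == 0;
        }
        return 0;
}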

--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [NLM] 2.6.27.14 breakage when grace period expires
  2009-02-12 20:43                                     ` Chuck Lever
@ 2009-02-12 20:54                                       ` Trond Myklebust
       [not found]                                         ` <1234472083.7190.124.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
  2009-02-12 22:02                                       ` Trond Myklebust
  1 sibling, 1 reply; 25+ messages in thread
From: Trond Myklebust @ 2009-02-12 20:54 UTC (permalink / raw)
  To: Chuck Lever; +Cc: Frank van Maarseveen, J. Bruce Fields, Linux NFS mailing list

On Thu, 2009-02-12 at 15:43 -0500, Chuck Lever wrote:
> On Feb 12, 2009, at 3:27 PM, Trond Myklebust wrote:
> > On Thu, 2009-02-12 at 15:11 -0500, Chuck Lever wrote:
> >> On Feb 12, 2009, at 2:43 PM, Trond Myklebust wrote:
> >>> On Thu, 2009-02-12 at 14:35 -0500, Chuck Lever wrote:
> >>>> I wasn't sure exactly where the compared addresses came from.  I
> >>>> had assumed that they all came through the listener, so we
> >>>> wouldn't need this kind of translation.  It shouldn't be difficult
> >>>> to map addresses passed in via nlmclnt_init() to AF_INET6.
> >>>>
> >>>> But this is the kind of thing that makes "falling back" to an
> >>>> AF_INET listener a little challenging.  We will have to record
> >>>> what flavor the listener is and do a translation depending on
> >>>> what listener family was actually created.
> >>>
> >>> Why? Should we care whether we're receiving IPv4 addresses or IPv6
> >>> v4-mapped addresses? They're the same thing...
> >>
> >> The problem is the listener family is now decided at run-time.  If an
> >> AF_INET6 listener can't be created, an AF_INET listener is created
> >> instead, even if CONFIG_IPV6 || CONFIG_IPV6_MODULE is enabled.  If an
> >> AF_INET listener is created, we get only IPv4 addresses in
> >> svc_rqst->rq_addr.
> >
> > You're missing my point. Why should we care if it's one or the
> > other? In the NFSv4 case, we v4map all IPv4 addresses
> > _unconditionally_ if it turns out that CONFIG_IPV6 is enabled.
> >
> > IOW: we always compare IPv6 addresses.
> 
> The reason we might care in this case is that nlm_cmp_addr() is
> executed more frequently than nfs_sockaddr_match_ipaddr().
> 
> Mapping the server address in nlmclnt_init() means we translate the  
> server address once and are done with it.  We never have to map  
> incoming AF_INET addresses in NLM requests, and we don't have the  
> extra conditionals every time we go through nlm_cmp_addr().
> 
> This keeps nlm_cmp_addr() as simple as it can be: it compares only two  
> AF_INET addresses or two AF_INET6 addresses.

I don't see how that changes the general principle. All it means is that
you should be caching v4-mapped addresses instead of IPv4 addresses.
That would allow you to simplify nlm_cmp_addr() even further...

Trond


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [NLM] 2.6.27.14 breakage when grace period expires
       [not found]                                         ` <1234472083.7190.124.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
@ 2009-02-12 21:43                                           ` Chuck Lever
  2009-02-12 22:03                                             ` Trond Myklebust
  0 siblings, 1 reply; 25+ messages in thread
From: Chuck Lever @ 2009-02-12 21:43 UTC (permalink / raw)
  To: Trond Myklebust
  Cc: Frank van Maarseveen, J. Bruce Fields, Linux NFS mailing list

On Feb 12, 2009, at 3:54 PM, Trond Myklebust wrote:
> On Thu, 2009-02-12 at 15:43 -0500, Chuck Lever wrote:
>> On Feb 12, 2009, at 3:27 PM, Trond Myklebust wrote:
>>> On Thu, 2009-02-12 at 15:11 -0500, Chuck Lever wrote:
>>>> On Feb 12, 2009, at 2:43 PM, Trond Myklebust wrote:
>>>>> On Thu, 2009-02-12 at 14:35 -0500, Chuck Lever wrote:
>>>>>> I wasn't sure exactly where the compared addresses came from.  I
>>>>>> had assumed that they all came through the listener, so we
>>>>>> wouldn't need this kind of translation.  It shouldn't be
>>>>>> difficult to map addresses passed in via nlmclnt_init() to
>>>>>> AF_INET6.
>>>>>>
>>>>>> But this is the kind of thing that makes "falling back" to an
>>>>>> AF_INET listener a little challenging.  We will have to record
>>>>>> what flavor the listener is and do a translation depending on
>>>>>> what listener family was actually created.
>>>>>
>>>>> Why? Should we care whether we're receiving IPv4 addresses or IPv6
>>>>> v4-mapped addresses? They're the same thing...
>>>>
>>>> The problem is the listener family is now decided at run-time.  If
>>>> an AF_INET6 listener can't be created, an AF_INET listener is
>>>> created instead, even if CONFIG_IPV6 || CONFIG_IPV6_MODULE is
>>>> enabled.  If an AF_INET listener is created, we get only IPv4
>>>> addresses in svc_rqst->rq_addr.
>>>
>>> You're missing my point. Why should we care if it's one or the
>>> other? In the NFSv4 case, we v4map all IPv4 addresses
>>> _unconditionally_ if it turns out that CONFIG_IPV6 is enabled.
>>>
>>> IOW: we always compare IPv6 addresses.
>>
>> The reason we might care in this case is that nlm_cmp_addr() is
>> executed more frequently than nfs_sockaddr_match_ipaddr().
>>
>> Mapping the server address in nlmclnt_init() means we translate the
>> server address once and are done with it.  We never have to map
>> incoming AF_INET addresses in NLM requests, and we don't have the
>> extra conditionals every time we go through nlm_cmp_addr().
>>
>> This keeps nlm_cmp_addr() as simple as it can be: it compares only
>> two AF_INET addresses or two AF_INET6 addresses.
>
> I don't see how that changes the general principle. All it means is
> that you should be caching v4-mapped addresses instead of IPv4
> addresses. That would allow you to simplify nlm_cmp_addr() even
> further...

Operationally we have to support both AF_INET and AF_INET6 addresses  
in the cache, because we don't know what kind of lockd listener can be  
created until runtime.  So, I can't see how we can eliminate the  
AF_INET arm in nlm_cmp_addr() unless we unconditionally convert all  
incoming AF_INET addresses from putative PF_INET listeners _and_  
convert incoming IPv4 server addresses in NFS mount requests to  
AF_INET6.

Doesn't that add computational overhead to a fairly common case?

This goes away if we ensure that the address family of the server  
address passed to nlmclnt_lookup_host() always matches the protocol  
family of lockd's listener sockets.  Then address mapping overhead is  
entirely removed from the common cases involving PF_INET listeners.

For PF_INET6 listeners, incoming IPv4 addresses are already mapped by  
the underlying network layer.  Nothing can be done about that.  But we  
can make sure the address family of the server address passed to  
nlmclnt_lookup_host() matches the incoming mapped addresses to  
eliminate the need for nlm_cmp_addr() to do the mapping every time it  
wants to compare an address.

It should be fairly simple to record the listener's protocol family,  
check it against incoming server addresses in nlmclnt_init(), then map  
the address as needed.

Having nlm_cmp_addr() do the mapping solves some problems, but at the  
cost of extra CPU time every time it is called; each loop iteration in  
nlm_lookup_host() for example.  All I'm doing is removing a loop  
invariant, essentially.
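
The loop-invariant argument in miniature (hypothetical types, reusing
the map_ipv4_to_ipv6() sketch from earlier in the thread):

#include <stddef.h>
#include <string.h>
#include <netinet/in.h>

void map_ipv4_to_ipv6(const struct sockaddr_in *sin,
                      struct sockaddr_in6 *sin6);   /* earlier sketch */

struct host {                           /* stand-in for nlm_host */
        struct sockaddr_in6 addr;       /* cached in v4-mapped form */
        struct host *next;
};

/* Translate the IPv4 search key to its v4-mapped form once, before
 * the walk, so each iteration is a plain same-family comparison. */
static struct host *lookup_host(struct host *list,
                                const struct sockaddr_in *key)
{
        struct sockaddr_in6 key6;

        map_ipv4_to_ipv6(key, &key6);   /* hoisted out of the loop */

        for (; list != NULL; list = list->next)
                if (memcmp(&list->addr.sin6_addr, &key6.sin6_addr,
                           sizeof(key6.sin6_addr)) == 0)
                        return list;
        return NULL;
}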

--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [NLM] 2.6.27.14 breakage when grace period expires
  2009-02-12 20:43                                     ` Chuck Lever
  2009-02-12 20:54                                       ` Trond Myklebust
@ 2009-02-12 22:02                                       ` Trond Myklebust
       [not found]                                         ` <1234476134.7190.187.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
  1 sibling, 1 reply; 25+ messages in thread
From: Trond Myklebust @ 2009-02-12 22:02 UTC (permalink / raw)
  To: Chuck Lever; +Cc: Frank van Maarseveen, J. Bruce Fields, Linux NFS mailing list

On Thu, 2009-02-12 at 15:43 -0500, Chuck Lever wrote:
> The reason we might care in this case is nlm_cmp_addr() is executed  
> more frequently than nfs_sockaddr_match_ipaddr().

Actually, I'm not sure this assertion is correct. The only users of
nlm_cmp_addr() are nlmclnt_grant(), nlm_lookup_host() and
nlmsvc_unlock_all_by_ip().

AFAICS, the only one that needs to handle v4-mapped addresses should be
nlmclnt_grant(), which is not in a performance-critical path...



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [NLM] 2.6.27.14 breakage when grace period expires
  2009-02-12 21:43                                           ` Chuck Lever
@ 2009-02-12 22:03                                             ` Trond Myklebust
  0 siblings, 0 replies; 25+ messages in thread
From: Trond Myklebust @ 2009-02-12 22:03 UTC (permalink / raw)
  To: Chuck Lever; +Cc: Frank van Maarseveen, J. Bruce Fields, Linux NFS mailing list

On Thu, 2009-02-12 at 16:43 -0500, Chuck Lever wrote:
> Having nlm_cmp_addr() do the mapping solves some problems, but at the  
> cost of extra CPU time every time it is called; each loop iteration in  
> nlm_lookup_host() for example.  All I'm doing is removing a loop  
> invariant, essentially.

nlm_lookup_host() shouldn't need to compare v4-mapped addresses with
IPv4 addresses, AFAICS.



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [NLM] 2.6.27.14 breakage when grace period expires
       [not found]                                         ` <1234476134.7190.187.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
@ 2009-02-12 22:11                                           ` Chuck Lever
  2009-02-12 22:19                                             ` Trond Myklebust
  0 siblings, 1 reply; 25+ messages in thread
From: Chuck Lever @ 2009-02-12 22:11 UTC (permalink / raw)
  To: Trond Myklebust
  Cc: Frank van Maarseveen, J. Bruce Fields, Linux NFS mailing list

On Feb 12, 2009, at 5:02 PM, Trond Myklebust wrote:
> On Thu, 2009-02-12 at 15:43 -0500, Chuck Lever wrote:
>> The reason we might care in this case is that nlm_cmp_addr() is
>> executed more frequently than nfs_sockaddr_match_ipaddr().
>
> Actually, I'm not sure this assertion is correct. The only users of
> nlm_cmp_addr() are nlmclnt_grant(), nlm_lookup_host() and
> nlmsvc_unlock_all_by_ip().
>
> AFAICS, the only one that needs to handle v4-mapped addresses should
> be nlmclnt_grant(), which is not in a performance-critical path...

So then your proposal is to ensure the two arguments of the  
nlm_cmp_addr() callsite in nlmclnt_grant() are both AF_INET6?

That doesn't sound so bad.

--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [NLM] 2.6.27.14 breakage when grace period expires
  2009-02-12 22:11                                           ` Chuck Lever
@ 2009-02-12 22:19                                             ` Trond Myklebust
  0 siblings, 0 replies; 25+ messages in thread
From: Trond Myklebust @ 2009-02-12 22:19 UTC (permalink / raw)
  To: Chuck Lever; +Cc: Frank van Maarseveen, J. Bruce Fields, Linux NFS mailing list

On Thu, 2009-02-12 at 17:11 -0500, Chuck Lever wrote:
> On Feb 12, 2009, at 5:02 PM, Trond Myklebust wrote:
> > On Thu, 2009-02-12 at 15:43 -0500, Chuck Lever wrote:
> >> The reason we might care in this case is that nlm_cmp_addr() is
> >> executed more frequently than nfs_sockaddr_match_ipaddr().
> >
> > Actually, I'm not sure this assertion is correct. The only users of
> > nlm_cmp_addr() are nlmclnt_grant(), nlm_lookup_host() and
> > nlmsvc_unlock_all_by_ip().
> >
> > AFAICS, the only one that needs to handle v4-mapped addresses
> > should be nlmclnt_grant(), which is not in a performance-critical
> > path...
> 
> So then your proposal is to ensure the two arguments of the  
> nlm_cmp_addr() callsite in nlmclnt_grant() are both AF_INET6?

Yup... I can't see that the other two callsites need anything like that.



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [NLM] 2.6.27.14 breakage when grace period expires
       [not found]                             ` <1234470251.7190.102.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
@ 2009-02-13 11:04                               ` Frank van Maarseveen
  0 siblings, 0 replies; 25+ messages in thread
From: Frank van Maarseveen @ 2009-02-13 11:04 UTC (permalink / raw)
  To: Trond Myklebust
  Cc: Frank van Maarseveen, Mr. Charles Edward Lever, J. Bruce Fields,
	Linux NFS mailing list

On Thu, Feb 12, 2009 at 03:24:11PM -0500, Trond Myklebust wrote:
> 
> Hmm... I wonder if the problem isn't just that we're failing to cancel
> the lock request when the process is signalled. Can you try the
> following patch?
> 
> --------------------------------------------------------------------
> From: Trond Myklebust <Trond.Myklebust@netapp.com>
> NLM/lockd: Always cancel blocked locks when exiting early from nlmclnt_lock
> 
> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
> ---
> 
>  fs/lockd/clntproc.c |    9 +++++++--
>  1 files changed, 7 insertions(+), 2 deletions(-)
> 
> 
> diff --git a/fs/lockd/clntproc.c b/fs/lockd/clntproc.c
> index 31668b6..f956d1e 100644
> --- a/fs/lockd/clntproc.c
> +++ b/fs/lockd/clntproc.c
> @@ -542,9 +542,14 @@ again:
>  		status = nlmclnt_call(cred, req, NLMPROC_LOCK);
>  		if (status < 0)
>  			break;
> -		/* Did a reclaimer thread notify us of a server reboot? */
> -		if (resp->status ==  nlm_lck_denied_grace_period)
> +		/* Is the server in a grace period state?
> +		 * If so, we need to reset the resp->status, and
> +		 * retry...
> +		 */
> +		if (resp->status ==  nlm_lck_denied_grace_period) {
> +			resp->status = nlm_lck_blocked;
>  			continue;
> +		}
>  		if (resp->status != nlm_lck_blocked)
>  			break;
>  		/* Wait on an NLM blocking lock */

I tried the patch, but it didn't make any difference. Note that there
isn't any ^C or any other signal involved. The client runs three loops
in the shell:

	while :; do lck -w /mnt/locktest 2; done

and every "lck" opens the file, obtains an exclusive write lock (waits
if necessary), calls sleep(2), closes the fd (releasing the lock) and
goes exit.
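
For reference, the test program is essentially the following (a
reconstruction from the description above, not the source actually
posted earlier):

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>

/* Reconstruction of "lck -w <file> <seconds>": take an exclusive
 * write lock over the whole file (blocking until granted), hold it
 * for <seconds>, then exit; closing the fd releases the lock. */
int main(int argc, char **argv)
{
        struct flock fl = {
                .l_type   = F_WRLCK,    /* exclusive write lock */
                .l_whence = SEEK_SET,
                .l_start  = 0,
                .l_len    = 0,          /* zero length = whole file */
        };
        int fd;

        if (argc != 4) {
                fprintf(stderr, "usage: lck -w file seconds\n");
                return 1;
        }
        fd = open(argv[2], O_RDWR | O_CREAT, 0644);
        if (fd < 0) {
                perror("lck: open");
                return 1;
        }
        if (fcntl(fd, F_SETLKW, &fl) < 0) {     /* block until granted */
                perror("lck: fcntl");
                return 1;
        }
        sleep((unsigned)atoi(argv[3]));
        close(fd);
        return 0;
}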

The "lck" which ends up unlocking during grace terminates normally but
one of the others gets a "fcntl: No locks available" when trying to
obtain the lock.


Question: shouldn't the server drop the lock after a sequence like:

201 122.033767  server:  NLM  V4 GRANTED_MSG Call (Reply In 202) FH:0xcafa61cc svid:116 pos:0-0
202 122.034066  client:  NLM  V4 GRANTED_MSG Reply (Call In 201)
205 122.034665  client:  NLM  V4 GRANTED_RES Call (Reply In 206) NLM_DENIED
206 122.034753  server:  NLM  V4 GRANTED_RES Reply (Call In 205)

?

-- 
Frank

^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2009-02-13 11:04 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-02-11 11:23 [NLM] 2.6.27.14 breakage when grace period expires Frank van Maarseveen
2009-02-11 20:35 ` J. Bruce Fields
2009-02-11 20:37   ` Frank van Maarseveen
2009-02-11 20:39     ` J. Bruce Fields
2009-02-11 20:57       ` Frank van Maarseveen
2009-02-12 14:28       ` Frank van Maarseveen
2009-02-12 15:16         ` Trond Myklebust
     [not found]           ` <1234451789.7190.38.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
2009-02-12 15:36             ` Frank van Maarseveen
2009-02-12 18:17               ` Trond Myklebust
     [not found]                 ` <1234462647.7190.53.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
2009-02-12 18:29                   ` Frank van Maarseveen
2009-02-12 19:10                     ` Trond Myklebust
     [not found]                       ` <1234465837.7190.62.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
2009-02-12 19:16                         ` Frank van Maarseveen
2009-02-12 20:24                           ` Trond Myklebust
     [not found]                             ` <1234470251.7190.102.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
2009-02-13 11:04                               ` Frank van Maarseveen
2009-02-12 19:35                         ` Chuck Lever
2009-02-12 19:43                           ` Trond Myklebust
     [not found]                             ` <1234467795.7190.70.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
2009-02-12 20:11                               ` Chuck Lever
2009-02-12 20:27                                 ` Trond Myklebust
     [not found]                                   ` <1234470457.7190.106.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
2009-02-12 20:43                                     ` Chuck Lever
2009-02-12 20:54                                       ` Trond Myklebust
     [not found]                                         ` <1234472083.7190.124.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
2009-02-12 21:43                                           ` Chuck Lever
2009-02-12 22:03                                             ` Trond Myklebust
2009-02-12 22:02                                       ` Trond Myklebust
     [not found]                                         ` <1234476134.7190.187.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
2009-02-12 22:11                                           ` Chuck Lever
2009-02-12 22:19                                             ` Trond Myklebust
