* RPC Race Condition
@ 2012-10-22 18:03 Chris Perl
  2012-10-22 18:28 ` Myklebust, Trond
  0 siblings, 1 reply; 14+ messages in thread
From: Chris Perl @ 2012-10-22 18:03 UTC (permalink / raw)
  To: linux-nfs

[-- Attachment #1: Type: text/plain, Size: 7877 bytes --]

I believe I may have identified a race condition with the client side
RPC code that comes into play when handling a reconnect.

First, I'll detail how to reproduce it, then give some details on what I
think is going on.


TO REPRODUCE:

Client is 192.168.1.100
Server is 192.168.1.200

client# showmount -e server
Export list for server:
/export/test 192.168.1.100
client# mkdir -p /mnt/test
client# mount -t nfs -o \
	ro,relatime,vers=3,rsize=131072,wsize=524288,namlen=255,acregmin=0,acregmax=0,acdirmin=0,acdirmax=0,hard,proto=tcp,timeo=600,retrans=2,sec=sys \
	192.168.1.200:/export/test /mnt/test

Note: I have set all the ac* mount options to 0 to make it easier to
correlate the calls my user space program makes with what goes out over
the wire.  This is not crucial; the problem still exists with these set
to the default.

Now, on the client start the attached python program [1] with real time
priority (SCHED_RR), stat'ing a file on the NFS mount:

client# touch /mnt/test/file.txt
client# chrt 50 ./hang.py /mnt/test/file.txt 10

From here, it is necessary to simulate a network event such that the
client is unable to talk to the server, but the server is still able to
talk to the client.

client# ip route add blackhole 192.168.1.200/32 && ip route flush cache

Now, we wait until the userspace program makes its next stat call.  Once
it does, restart the NFS server processes on the server:

server# /etc/init.d/nfs restart

The idea of restarting the nfs server processes is to ensure that FINs
and/or an RST come across from the server to the client, changing the
state of its socket while it's waiting for the response to its call of
`call_transmit' from within `__rpc_execute'.

Finally back on the client, remove the blackhole route:

client# ip route del 192.168.1.200/32 && ip route flush cache

Once the route is removed, you should be able to do things like ping the
server again, run `rpcinfo -p' against it, etc.  However, any NFS
operations for this server will continue to hang.  The way to recover is
to kill the SCHED_RR process; after that, things return to normal.


NOTES:

This setup for reproducing the problem may seem a little silly, but we
had something very similar happen in our production environment a
couple of weeks ago and it caused us to get into this situation.  Note
that it has nothing to do with the server side of things.  I've seen
this happen with a linux NFS server (as above) and with a NAS from a
large storage vendor.

If you run the stat'ing process as a normal process, i.e. SCHED_OTHER,
then everything recovers without intervention (i.e. it all works as it
should).  However, I believe the same race condition is still there, but
just doesn't manifest because in this case the stat'ing process does not
have the same scheduling advantage over rpciod.

We experienced this problem on the following CentOS kernel:
2.6.32-279.5.2.el6.centos.plus.x86_64, but I've verified that the same
behavior exists as late as 3.7rc2.

During the hanging, if you watch the network, you can see that the
client continually completes a TCP 3-way handshake, but then immediately
tears the connection down.


MY ATTEMPTED ANALYSIS:

First, I am no expert and only started looking at the RPC code in the
kernel about 2 weeks ago.  Therefore some or all of what I lay out here
may be wrong or misguided etc.  This is just what I've been able to
figure out by looking at the source code and writing a bunch of
SystemTap scripts.

I can get more detailed if necessary (including sharing the SystemTap
scripts I used to observe this behavior), but don't want to make this
email overly long, so will just touch on the high points:

  - The sleeping process that is in the middle of waiting for a
    response to `call_transmit' is woken up via `xs_error_report' (see
    the snippet just after this list) when the FINs and/or RST come
    across the wire.

  - It finds the `tk_status' field of the task set to EAGAIN, and so
    retries `call_transmit'.

  - On the next attempt, `tcp_sendmsg' and its call of
    `sk_stream_wait_connect' see the socket has an error of ECONNRESET.
    Therefore, the next call to `call_status' sees this and sets the
    next `tk_action' to be `call_bind'. 

  - `call_bind' does its thing and then we come to `call_connect'

  - `call_connect' schedules the execution of `xs_tcp_setup_socket' with
    rpciod via rpciod's workqueue and then goes to sleep.

  - Here is where I believe the problem comes about.  Our sleeping
    thread waiting for the connect to finish is woken up early by
    rpciod via the following call stack:

      0xffffffffa042f840 : rpc_make_runnable+0x0/0x80 [sunrpc]
      0xffffffffa042fb5e : rpc_wake_up_task_queue_locked+0x18e/0x270 [sunrpc]
      0xffffffffa042fc8e : rpc_wake_up_status+0x4e/0x80 [sunrpc]
      0xffffffffa04288ec : xprt_wake_pending_tasks+0x2c/0x30 [sunrpc]
      0xffffffffa042b9fd : xs_error_report+0x4d/0x90 [sunrpc]
      0xffffffff81487436 : tcp_disconnect+0x356/0x410 [kernel]
      0xffffffff814a9c20 : inet_stream_connect+0x50/0x2c0 [kernel]
      0xffffffff8142bb20 : kernel_connect+0x10/0x20 [kernel]
      0xffffffffa042b368 : xs_tcp_setup_socket+0xe8/0x4c0 [sunrpc]
      0xffffffff8108c760 : worker_thread+0x170/0x2a0 [kernel]
      0xffffffff81091d66 : kthread+0x96/0xa0 [kernel]
      0xffffffff8100c14a : child_rip+0xa/0x20 [kernel]

    It appears that with rpciod trying to reconnect, it winds up
    indirectly invoking `xs_error_report' via a callback from the
    sock/tcp layer.

  - At this point the `tk_status' is set to EAGAIN, so if
    `call_connect_status' handled EAGAIN by setting `tk_action' to
    `call_connect' again AND rpciod was finished trying to execute
    `xs_tcp_setup_socket' then we would be ok.

  - However, `call_connect_status' sets `tk_action' to `call_transmit'
    and goes on.

  - At this point rpciod is still executing trying to connect the
    socket, and eventually will succeed, but now our thread and rpciod
    are racing to see who can update/check the state of the socket
    first.

    Depending on who wins and what status is returned,
    `xs_tcp_send_request' will either tear the connection down and
    retry, or it will simply retry without the teardown.  We can wind up
    in a loop of setup/teardowns and are never able to actually return
    from `__rpc_execute'.

    Since we're stuck in `__rpc_execute' and the transport is locked, no
    other NFS operations can come in and they all hang.
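
For reference, the callback in question, `xs_error_report' in
net/sunrpc/xprtsock.c, is essentially:

  static void xs_error_report(struct sock *sk)
  {
          struct rpc_xprt *xprt;

          read_lock_bh(&sk->sk_callback_lock);
          if (!(xprt = xprt_from_sock(sk)))
                  goto out;
          dprintk("RPC:       %s client %p...\n"
                          "RPC:       error %d\n",
                          __func__, xprt, sk->sk_err);
          xprt_wake_pending_tasks(xprt, -EAGAIN);
  out:
          read_unlock_bh(&sk->sk_callback_lock);
  }

In other words, any socket error callback wakes every pending task with
-EAGAIN, regardless of whether a reconnect is already in progress.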

Simply changing `call_connect_status' to handle EAGAIN as described
above did "fix" my problem and I could no longer reproduce the issue;
however, I believe that could result in another race condition, with
multiple rpciod threads trying to execute `xs_tcp_setup_socket' at the
same time.
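
To make that concrete, the kind of change I mean looks roughly like the
following.  This is only a sketch of the idea, not the exact change I
tested, and the real `call_connect_status' handles more cases than are
shown here:

  static void call_connect_status(struct rpc_task *task)
  {
          int status = task->tk_status;

          task->tk_status = 0;
          if (status == -EAGAIN) {
                  /* rpciod has not finished (re)connecting the socket;
                   * retry the connect instead of moving on to
                   * call_transmit and racing with rpciod. */
                  task->tk_action = call_connect;
                  return;
          }
          if (status == 0) {
                  task->tk_action = call_transmit;
                  return;
          }
          rpc_exit(task, status); /* sketch only: other errors elided */
  }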

I think the "right" fix, at least as far as I can tell would be to
ensure that rpciod does not wake up the process sleeping on
`call_connect' until it is completely done with its work (whether that
is success or failure).  I believe this coupled with making
`call_connect_status' handle EAGAIN would work.

Unfortunately, I'm stretching my abilities here and do not really have a
good understanding of how the interactions come together to be able to
submit a patch to fix this.


WORKAROUND:

Based on what I found above I decided to try setting the real time
priority of all my rpciod threads to one higher than that of my real
time stat'ing process (i.e. 51).  When I did this, I could no longer
reproduce the problem.  Setting the rt priority to 49, however, did
allow it to still happen.  I believe this is essentially the same
reason that things seem to work normally for SCHED_OTHER processes:
they simply do not have the same scheduling advantage over rpciod, and
therefore it's likely that rpciod wins and our thread finds the socket
state as either `TCP_SYN_SENT' or `TCP_ESTABLISHED' and everything
recovers.

Chris

[1] hang.py

[-- Attachment #2: hang.py --]
[-- Type: text/plain, Size: 195 bytes --]

#!/usr/bin/python

# Stat the given file every <interval> seconds forever, printing the
# result.  With the ac* mount options set to 0, each stat() forces a
# GETATTR to go out over the wire.

import os
import sys
import time
import datetime

f = sys.argv[1]         # path to stat
s = float(sys.argv[2])  # interval in seconds

while True:
	r = os.stat(f)
	print "%s %r" % (datetime.datetime.now(), r)
	time.sleep(s)


* Re: RPC Race Condition
  2012-10-22 18:03 RPC Race Condition Chris Perl
@ 2012-10-22 18:28 ` Myklebust, Trond
  2012-10-22 20:26   ` Chris Perl
  0 siblings, 1 reply; 14+ messages in thread
From: Myklebust, Trond @ 2012-10-22 18:28 UTC (permalink / raw)
  To: Chris Perl; +Cc: linux-nfs

On Mon, 2012-10-22 at 14:03 -0400, Chris Perl wrote:
> [...]

Hi Chris,

Thanks for helping to debug this, and for providing such a thorough
analysis! This is great work!

Does adding a check in xs_error_report() for xprt_connecting(xprt) so
that we skip the call to xprt_wake_pending_tasks() if true, help to
resolve the race?

Cheers
  Trond
-- 
Trond Myklebust
Linux NFS client maintainer

NetApp
Trond.Myklebust@netapp.com
www.netapp.com


* Re: RPC Race Condition
  2012-10-22 18:28 ` Myklebust, Trond
@ 2012-10-22 20:26   ` Chris Perl
  2012-10-22 21:20     ` Myklebust, Trond
  0 siblings, 1 reply; 14+ messages in thread
From: Chris Perl @ 2012-10-22 20:26 UTC (permalink / raw)
  To: Myklebust, Trond; +Cc: linux-nfs

On Mon, Oct 22, 2012 at 06:28:02PM +0000, Myklebust, Trond wrote:
> Hi Chris,
> 
> Thanks for helping to debug this, and for providing such a thorough
> analysis! This is great work!
> 
> Does adding a check in xs_error_report() for xprt_connecting(xprt) so
> that we skip the call to xprt_wake_pending_tasks() if true, help to
> resolve the race?

This is what I changed:

--- a/net/sunrpc/xprtsock.c     2012-08-14 08:47:16.000000000 -0400
+++ b/net/sunrpc/xprtsock.c     2012-10-22 14:50:09.237725498 -0400
@@ -1571,7 +1571,8 @@
        dprintk("RPC:       %s client %p...\n"
                        "RPC:       error %d\n",
                        __func__, xprt, sk->sk_err);
-       xprt_wake_pending_tasks(xprt, -EAGAIN);
+       if (!xprt_connecting(xprt))
+               xprt_wake_pending_tasks(xprt, -EAGAIN);
 out:
        read_unlock_bh(&sk->sk_callback_lock);
 }

Unfortunately it does not resolve it.  It appears that it does stop the
wakeup from happening where I detailed it before, but we still wake up
too early.  The following is the call stack from SystemTap catching the
call to `rpc_make_runnable':

 0xffffffffa0434840 : rpc_make_runnable+0x0/0x80 [sunrpc]
 0xffffffffa0434b5e : rpc_wake_up_task_queue_locked+0x18e/0x270 [sunrpc]
 0xffffffffa0434c8e : rpc_wake_up_status+0x4e/0x80 [sunrpc]
 0xffffffffa042d8ec : xprt_wake_pending_tasks+0x2c/0x30 [sunrpc]
 0xffffffffa042da56 : xprt_disconnect_done+0x46/0x70 [sunrpc]
 0xffffffffa04305b0 : xs_tcp_setup_socket+0x330/0x4c0 [sunrpc]
 0xffffffff8108c760 : worker_thread+0x170/0x2a0 [kernel]
 0xffffffff81091d66 : kthread+0x96/0xa0 [kernel]
 0xffffffff8100c14a : child_rip+0xa/0x20 [kernel]

However, it's mildly confusing to me since I cannot follow that exact
call stack in the code.  The nearest I can find to that is the
following:

  rpc_make_runnable
  rpc_wake_up_task_queue_locked
  rpc_wake_up_status
  xprt_wake_pending_tasks
  xprt_disconnect_done
  xs_sock_mark_closed
  xs_abort_connection
  xs_tcp_reuse_connection
  xs_tcp_setup_socket

As far as I can tell, we get here because the call to `kernel_connect'
from within `xs_abort_connection' by rpciod is returning 0 sometimes
(and EINPROGRESS others).  When it returns 0, we enter this block:

  if (!result)
          xs_sock_mark_closed(&transport->xprt);
  else
          dprintk("RPC:       AF_UNSPEC connect return code %d\n", result);

and that eventually gets us woken up via the call stack above.  The end
result is the same, we wake up and move on to `call_transmit' before
rpciod is done connecting the socket.
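
For context (this is paraphrased from my reading of net/sunrpc/xprt.c,
so take the details as approximate), `xprt_disconnect_done' is roughly:

  void xprt_disconnect_done(struct rpc_xprt *xprt)
  {
          dprintk("RPC:       disconnected transport %p\n", xprt);
          spin_lock_bh(&xprt->transport_lock);
          xprt_clear_connected(xprt);
          /* every task pending on this transport is woken with -EAGAIN */
          xprt_wake_pending_tasks(xprt, -EAGAIN);
          spin_unlock_bh(&xprt->transport_lock);
  }

which is why the sleeping task is woken with `tk_status' set to EAGAIN
even though rpciod is still in the middle of setting up the socket.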

I'll do some more tracing and see what else I can come up with.

Chris


* Re: RPC Race Condition
  2012-10-22 20:26   ` Chris Perl
@ 2012-10-22 21:20     ` Myklebust, Trond
  2012-10-23 13:45       ` Chris Perl
  0 siblings, 1 reply; 14+ messages in thread
From: Myklebust, Trond @ 2012-10-22 21:20 UTC (permalink / raw)
  To: Chris Perl; +Cc: linux-nfs

On Mon, 2012-10-22 at 16:26 -0400, Chris Perl wrote:
> [...]

OK, then let's just get rid of xs_error_report(). All the cases that we
care about will call tcp_done(), which will induce a state change when
needed.

Cheers,
  Trond
8<---------------------------------------------------------------------
From 047ee9ea072698f1d35573c1f26dc02d13a4af7f Mon Sep 17 00:00:00 2001
From: Trond Myklebust <Trond.Myklebust@netapp.com>
Date: Mon, 22 Oct 2012 17:14:36 -0400
Subject: [PATCH] SUNRPC: Get rid of the xs_error_report socket callback

Chris Perl reports that we're seeing races between the wakeup call in
xs_error_report and the connect attempts. Basically, Chris has shown
that in certain circumstances, the call to xs_error_report causes the
rpc_task that is responsible for reconnecting to wake up early, thus
triggering a disconnect and retry.

Since the sk->sk_error_report() calls in the socket layer are always
followed by a tcp_done() in the cases where we care about waking up
the rpc_tasks, just let the state_change callbacks take responsibility
for those wake ups.

Reported-by: Chris Perl <chris.perl@gmail.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
---
 net/sunrpc/xprtsock.c | 25 -------------------------
 1 file changed, 25 deletions(-)

diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
index aaaadfb..89bde21 100644
--- a/net/sunrpc/xprtsock.c
+++ b/net/sunrpc/xprtsock.c
@@ -254,7 +254,6 @@ struct sock_xprt {
 	void			(*old_data_ready)(struct sock *, int);
 	void			(*old_state_change)(struct sock *);
 	void			(*old_write_space)(struct sock *);
-	void			(*old_error_report)(struct sock *);
 };
 
 /*
@@ -781,7 +780,6 @@ static void xs_save_old_callbacks(struct sock_xprt *transport, struct sock *sk)
 	transport->old_data_ready = sk->sk_data_ready;
 	transport->old_state_change = sk->sk_state_change;
 	transport->old_write_space = sk->sk_write_space;
-	transport->old_error_report = sk->sk_error_report;
 }
 
 static void xs_restore_old_callbacks(struct sock_xprt *transport, struct sock *sk)
@@ -789,7 +787,6 @@ static void xs_restore_old_callbacks(struct sock_xprt *transport, struct sock *s
 	sk->sk_data_ready = transport->old_data_ready;
 	sk->sk_state_change = transport->old_state_change;
 	sk->sk_write_space = transport->old_write_space;
-	sk->sk_error_report = transport->old_error_report;
 }
 
 static void xs_reset_transport(struct sock_xprt *transport)
@@ -1540,25 +1537,6 @@ static void xs_tcp_state_change(struct sock *sk)
 	read_unlock_bh(&sk->sk_callback_lock);
 }
 
-/**
- * xs_error_report - callback mainly for catching socket errors
- * @sk: socket
- */
-static void xs_error_report(struct sock *sk)
-{
-	struct rpc_xprt *xprt;
-
-	read_lock_bh(&sk->sk_callback_lock);
-	if (!(xprt = xprt_from_sock(sk)))
-		goto out;
-	dprintk("RPC:       %s client %p...\n"
-			"RPC:       error %d\n",
-			__func__, xprt, sk->sk_err);
-	xprt_wake_pending_tasks(xprt, -EAGAIN);
-out:
-	read_unlock_bh(&sk->sk_callback_lock);
-}
-
 static void xs_write_space(struct sock *sk)
 {
 	struct socket *sock;
@@ -1858,7 +1836,6 @@ static int xs_local_finish_connecting(struct rpc_xprt *xprt,
 		sk->sk_user_data = xprt;
 		sk->sk_data_ready = xs_local_data_ready;
 		sk->sk_write_space = xs_udp_write_space;
-		sk->sk_error_report = xs_error_report;
 		sk->sk_allocation = GFP_ATOMIC;
 
 		xprt_clear_connected(xprt);
@@ -1983,7 +1960,6 @@ static void xs_udp_finish_connecting(struct rpc_xprt *xprt, struct socket *sock)
 		sk->sk_user_data = xprt;
 		sk->sk_data_ready = xs_udp_data_ready;
 		sk->sk_write_space = xs_udp_write_space;
-		sk->sk_error_report = xs_error_report;
 		sk->sk_no_check = UDP_CSUM_NORCV;
 		sk->sk_allocation = GFP_ATOMIC;
 
@@ -2098,7 +2074,6 @@ static int xs_tcp_finish_connecting(struct rpc_xprt *xprt, struct socket *sock)
 		sk->sk_data_ready = xs_tcp_data_ready;
 		sk->sk_state_change = xs_tcp_state_change;
 		sk->sk_write_space = xs_tcp_write_space;
-		sk->sk_error_report = xs_error_report;
 		sk->sk_allocation = GFP_ATOMIC;
 
 		/* socket options */
-- 
1.7.11.7


-- 
Trond Myklebust
Linux NFS client maintainer

NetApp
Trond.Myklebust@netapp.com
www.netapp.com


* Re: RPC Race Condition
  2012-10-22 21:20     ` Myklebust, Trond
@ 2012-10-23 13:45       ` Chris Perl
  2012-10-23 16:02         ` Myklebust, Trond
  0 siblings, 1 reply; 14+ messages in thread
From: Chris Perl @ 2012-10-23 13:45 UTC (permalink / raw)
  To: Myklebust, Trond; +Cc: linux-nfs

On Mon, Oct 22, 2012 at 09:20:34PM +0000, Myklebust, Trond wrote:
> > This is what I changed:
> > 
> > --- a/net/sunrpc/xprtsock.c     2012-08-14 08:47:16.000000000 -0400
> > +++ b/net/sunrpc/xprtsock.c     2012-10-22 14:50:09.237725498 -0400
> > @@ -1571,7 +1571,8 @@
> >         dprintk("RPC:       %s client %p...\n"
> >                         "RPC:       error %d\n",
> >                         __func__, xprt, sk->sk_err);
> > -       xprt_wake_pending_tasks(xprt, -EAGAIN);
> > +       if (!xprt_connecting(xprt))
> > +               xprt_wake_pending_tasks(xprt, -EAGAIN);
> >  out:
> >         read_unlock_bh(&sk->sk_callback_lock);
> >  }
> > 
> > Unfortunately it does not resolve it.  It appears that does stop the wakeup
> > from happening where I detailed it before, but we still wake up too early.  The
> > following is the call stack from SystemTap catching the call to `rpc_make_runnable':
> > 
> >  0xffffffffa0434840 : rpc_make_runnable+0x0/0x80 [sunrpc]
> >  0xffffffffa0434b5e : rpc_wake_up_task_queue_locked+0x18e/0x270 [sunrpc]
> >  0xffffffffa0434c8e : rpc_wake_up_status+0x4e/0x80 [sunrpc]
> >  0xffffffffa042d8ec : xprt_wake_pending_tasks+0x2c/0x30 [sunrpc]
> >  0xffffffffa042da56 : xprt_disconnect_done+0x46/0x70 [sunrpc]
> >  0xffffffffa04305b0 : xs_tcp_setup_socket+0x330/0x4c0 [sunrpc]
> >  0xffffffff8108c760 : worker_thread+0x170/0x2a0 [kernel]
> >  0xffffffff81091d66 : kthread+0x96/0xa0 [kernel]
> >  0xffffffff8100c14a : child_rip+0xa/0x20 [kernel]
> > 
> > However, its mildly confusing to me since I cannot follow that exact
> > call stack in the code.  The nearest I can find to that is the
> > following:
> > 
> >   rpc_make_runnable
> >   rpc_wake_up_task_queue_locked
> >   rpc_wake_up_status
> >   xprt_wake_pending_tasks
> >   xprt_disconnect_done
> >   xs_sock_mark_closed
> >   xs_abort_connection
> >   xs_tcp_reuse_connection
> >   xs_tcp_setup_socket
> > 
> > As far as I can tell, we get here because the call to `kernel_connect'
> > from within `xs_abort_connection' by rpciod is returning 0 sometimes
> > (and EINPROGRESS others).  When it returns 0, we enter this block:
> > 
> >   if (!result)
> >           xs_sock_mark_closed(&transport->xprt);
> >   else
> >     dprintk("RPC:       AF_UNSPEC connect return code %d\n", result);
> > 
> > and that eventually gets us woken up via the call stack above.  The end
> > result is the same, we wake up and move on to `call_transmit' before
> > rpciod is done connecting the socket.
> > 
> > I'll do some more tracing and see what else I can come up with.
> > 
> > Chris
> 
> OK, then let's just get rid of xs_error_report(). All the cases that we
> care about will call tcp_done(), which will induce a state change when
> needed.
> 
> Cheers,
>   Trond
> 8<---------------------------------------------------------------------
> From 047ee9ea072698f1d35573c1f26dc02d13a4af7f Mon Sep 17 00:00:00 2001
> From: Trond Myklebust <Trond.Myklebust@netapp.com>
> Date: Mon, 22 Oct 2012 17:14:36 -0400
> Subject: [PATCH] SUNRPC: Get rid of the xs_error_report socket callback
> 
> Chris Perl reports that we're seeing races between the wakeup call in
> xs_error_report and the connect attempts. Basically, Chris has shown
> that in certain circumstances, the call to xs_error_report causes the
> rpc_task that is responsible for reconnecting to wake up early, thus
> triggering a disconnect and retry.
> 
> Since the sk->sk_error_report() calls in the socket layer are always
> followed by a tcp_done() in the cases where we care about waking up
> the rpc_tasks, just let the state_change callbacks take responsibility
> for those wake ups.
> 
> Reported-by: Chris Perl <chris.perl@gmail.com>
> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
> ---
>  net/sunrpc/xprtsock.c | 25 -------------------------
>  1 file changed, 25 deletions(-)
> 
> diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
> index aaaadfb..89bde21 100644
> --- a/net/sunrpc/xprtsock.c
> +++ b/net/sunrpc/xprtsock.c
> @@ -254,7 +254,6 @@ struct sock_xprt {
>  	void			(*old_data_ready)(struct sock *, int);
>  	void			(*old_state_change)(struct sock *);
>  	void			(*old_write_space)(struct sock *);
> -	void			(*old_error_report)(struct sock *);
>  };
>  
>  /*
> @@ -781,7 +780,6 @@ static void xs_save_old_callbacks(struct sock_xprt *transport, struct sock *sk)
>  	transport->old_data_ready = sk->sk_data_ready;
>  	transport->old_state_change = sk->sk_state_change;
>  	transport->old_write_space = sk->sk_write_space;
> -	transport->old_error_report = sk->sk_error_report;
>  }
>  
>  static void xs_restore_old_callbacks(struct sock_xprt *transport, struct sock *sk)
> @@ -789,7 +787,6 @@ static void xs_restore_old_callbacks(struct sock_xprt *transport, struct sock *s
>  	sk->sk_data_ready = transport->old_data_ready;
>  	sk->sk_state_change = transport->old_state_change;
>  	sk->sk_write_space = transport->old_write_space;
> -	sk->sk_error_report = transport->old_error_report;
>  }
>  
>  static void xs_reset_transport(struct sock_xprt *transport)
> @@ -1540,25 +1537,6 @@ static void xs_tcp_state_change(struct sock *sk)
>  	read_unlock_bh(&sk->sk_callback_lock);
>  }
>  
> -/**
> - * xs_error_report - callback mainly for catching socket errors
> - * @sk: socket
> - */
> -static void xs_error_report(struct sock *sk)
> -{
> -	struct rpc_xprt *xprt;
> -
> -	read_lock_bh(&sk->sk_callback_lock);
> -	if (!(xprt = xprt_from_sock(sk)))
> -		goto out;
> -	dprintk("RPC:       %s client %p...\n"
> -			"RPC:       error %d\n",
> -			__func__, xprt, sk->sk_err);
> -	xprt_wake_pending_tasks(xprt, -EAGAIN);
> -out:
> -	read_unlock_bh(&sk->sk_callback_lock);
> -}
> -
>  static void xs_write_space(struct sock *sk)
>  {
>  	struct socket *sock;
> @@ -1858,7 +1836,6 @@ static int xs_local_finish_connecting(struct rpc_xprt *xprt,
>  		sk->sk_user_data = xprt;
>  		sk->sk_data_ready = xs_local_data_ready;
>  		sk->sk_write_space = xs_udp_write_space;
> -		sk->sk_error_report = xs_error_report;
>  		sk->sk_allocation = GFP_ATOMIC;
>  
>  		xprt_clear_connected(xprt);
> @@ -1983,7 +1960,6 @@ static void xs_udp_finish_connecting(struct rpc_xprt *xprt, struct socket *sock)
>  		sk->sk_user_data = xprt;
>  		sk->sk_data_ready = xs_udp_data_ready;
>  		sk->sk_write_space = xs_udp_write_space;
> -		sk->sk_error_report = xs_error_report;
>  		sk->sk_no_check = UDP_CSUM_NORCV;
>  		sk->sk_allocation = GFP_ATOMIC;
>  
> @@ -2098,7 +2074,6 @@ static int xs_tcp_finish_connecting(struct rpc_xprt *xprt, struct socket *sock)
>  		sk->sk_data_ready = xs_tcp_data_ready;
>  		sk->sk_state_change = xs_tcp_state_change;
>  		sk->sk_write_space = xs_tcp_write_space;
> -		sk->sk_error_report = xs_error_report;
>  		sk->sk_allocation = GFP_ATOMIC;
>  
>  		/* socket options */

Hi Trond,

First, thanks for being so responsive!  I never expected to get such
quick feedback and help.

I think perhaps you misread my last email.  Adding the check for
`xprt_connecting(xprt)' to `xs_error_report' _did_ stop the early wake
up from happening at that point in the code.  Removing the
entire callback with your patch works just as well.

I have confirmed that I can still reproduce the problem with this
patch.

Now that we're not waking up early because of `xs_error_report', we're
waking up early in another part of the code as I mentioned above.  I'm
not entirely clear about all the details here yet.

What I do know is that we're woken up with `tk_status' set to EAGAIN, and
since `call_connect_status' doesn't do anything special for that, we move
on to `call_transmit'.  Meanwhile, rpciod is still trying to connect the
socket (eventually succeeding) and so we're racing.

Question: should `call_connect_status' treat EAGAIN specially and reset
`tk_action' to `call_connect' again (rather than just moving on to
`call_transmit')?  I don't think this alone would fix things properly,
but it seems logical to me.  Perhaps I'm missing something.

Chris


* Re: RPC Race Condition
  2012-10-23 13:45       ` Chris Perl
@ 2012-10-23 16:02         ` Myklebust, Trond
  2012-10-23 20:08           ` Chris Perl
  0 siblings, 1 reply; 14+ messages in thread
From: Myklebust, Trond @ 2012-10-23 16:02 UTC (permalink / raw)
  To: Chris Perl; +Cc: linux-nfs

[-- Attachment #1: Type: text/plain, Size: 9312 bytes --]

On Tue, 2012-10-23 at 09:45 -0400, Chris Perl wrote:
> On Mon, Oct 22, 2012 at 09:20:34PM +0000, Myklebust, Trond wrote:
> > > This is what I changed:
> > > 
> > > --- a/net/sunrpc/xprtsock.c     2012-08-14 08:47:16.000000000 -0400
> > > +++ b/net/sunrpc/xprtsock.c     2012-10-22 14:50:09.237725498 -0400
> > > @@ -1571,7 +1571,8 @@
> > >         dprintk("RPC:       %s client %p...\n"
> > >                         "RPC:       error %d\n",
> > >                         __func__, xprt, sk->sk_err);
> > > -       xprt_wake_pending_tasks(xprt, -EAGAIN);
> > > +       if (!xprt_connecting(xprt))
> > > +               xprt_wake_pending_tasks(xprt, -EAGAIN);
> > >  out:
> > >         read_unlock_bh(&sk->sk_callback_lock);
> > >  }
> > > 
> > > Unfortunately it does not resolve it.  It appears that does stop the wakeup
> > > from happening where I detailed it before, but we still wake up too early.  The
> > > following is the call stack from SystemTap catching the call to `rpc_make_runnable':
> > > 
> > >  0xffffffffa0434840 : rpc_make_runnable+0x0/0x80 [sunrpc]
> > >  0xffffffffa0434b5e : rpc_wake_up_task_queue_locked+0x18e/0x270 [sunrpc]
> > >  0xffffffffa0434c8e : rpc_wake_up_status+0x4e/0x80 [sunrpc]
> > >  0xffffffffa042d8ec : xprt_wake_pending_tasks+0x2c/0x30 [sunrpc]
> > >  0xffffffffa042da56 : xprt_disconnect_done+0x46/0x70 [sunrpc]
> > >  0xffffffffa04305b0 : xs_tcp_setup_socket+0x330/0x4c0 [sunrpc]
> > >  0xffffffff8108c760 : worker_thread+0x170/0x2a0 [kernel]
> > >  0xffffffff81091d66 : kthread+0x96/0xa0 [kernel]
> > >  0xffffffff8100c14a : child_rip+0xa/0x20 [kernel]
> > > 
> > > However, its mildly confusing to me since I cannot follow that exact
> > > call stack in the code.  The nearest I can find to that is the
> > > following:
> > > 
> > >   rpc_make_runnable
> > >   rpc_wake_up_task_queue_locked
> > >   rpc_wake_up_status
> > >   xprt_wake_pending_tasks
> > >   xprt_disconnect_done
> > >   xs_sock_mark_closed
> > >   xs_abort_connection
> > >   xs_tcp_reuse_connection
> > >   xs_tcp_setup_socket
> > > 
> > > As far as I can tell, we get here because the call to `kernel_connect'
> > > from within `xs_abort_connection' by rpciod is returning 0 sometimes
> > > (and EINPROGRESS others).  When it returns 0, we enter this block:
> > > 
> > >   if (!result)
> > >           xs_sock_mark_closed(&transport->xprt);
> > >   else
> > >     dprintk("RPC:       AF_UNSPEC connect return code %d\n", result);
> > > 
> > > and that eventually gets us woken up via the call stack above.  The end
> > > result is the same, we wake up and move on to `call_transmit' before
> > > rpciod is done connecting the socket.
> > > 
> > > I'll do some more tracing and see what else I can come up with.
> > > 
> > > Chris
> > 
> > OK, then let's just get rid of xs_error_report(). All the cases that we
> > care about will call tcp_done(), which will induce a state change when
> > needed.
> > 
> > Cheers,
> >   Trond
> > 8<---------------------------------------------------------------------
> > From 047ee9ea072698f1d35573c1f26dc02d13a4af7f Mon Sep 17 00:00:00 2001
> > From: Trond Myklebust <Trond.Myklebust@netapp.com>
> > Date: Mon, 22 Oct 2012 17:14:36 -0400
> > Subject: [PATCH] SUNRPC: Get rid of the xs_error_report socket callback
> > 
> > Chris Perl reports that we're seeing races between the wakeup call in
> > xs_error_report and the connect attempts. Basically, Chris has shown
> > that in certain circumstances, the call to xs_error_report causes the
> > rpc_task that is responsible for reconnecting to wake up early, thus
> > triggering a disconnect and retry.
> > 
> > Since the sk->sk_error_report() calls in the socket layer are always
> > followed by a tcp_done() in the cases where we care about waking up
> > the rpc_tasks, just let the state_change callbacks take responsibility
> > for those wake ups.
> > 
> > Reported-by: Chris Perl <chris.perl@gmail.com>
> > Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
> > ---
> >  net/sunrpc/xprtsock.c | 25 -------------------------
> >  1 file changed, 25 deletions(-)
> > 
> > diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
> > index aaaadfb..89bde21 100644
> > --- a/net/sunrpc/xprtsock.c
> > +++ b/net/sunrpc/xprtsock.c
> > @@ -254,7 +254,6 @@ struct sock_xprt {
> >  	void			(*old_data_ready)(struct sock *, int);
> >  	void			(*old_state_change)(struct sock *);
> >  	void			(*old_write_space)(struct sock *);
> > -	void			(*old_error_report)(struct sock *);
> >  };
> >  
> >  /*
> > @@ -781,7 +780,6 @@ static void xs_save_old_callbacks(struct sock_xprt *transport, struct sock *sk)
> >  	transport->old_data_ready = sk->sk_data_ready;
> >  	transport->old_state_change = sk->sk_state_change;
> >  	transport->old_write_space = sk->sk_write_space;
> > -	transport->old_error_report = sk->sk_error_report;
> >  }
> >  
> >  static void xs_restore_old_callbacks(struct sock_xprt *transport, struct sock *sk)
> > @@ -789,7 +787,6 @@ static void xs_restore_old_callbacks(struct sock_xprt *transport, struct sock *s
> >  	sk->sk_data_ready = transport->old_data_ready;
> >  	sk->sk_state_change = transport->old_state_change;
> >  	sk->sk_write_space = transport->old_write_space;
> > -	sk->sk_error_report = transport->old_error_report;
> >  }
> >  
> >  static void xs_reset_transport(struct sock_xprt *transport)
> > @@ -1540,25 +1537,6 @@ static void xs_tcp_state_change(struct sock *sk)
> >  	read_unlock_bh(&sk->sk_callback_lock);
> >  }
> >  
> > -/**
> > - * xs_error_report - callback mainly for catching socket errors
> > - * @sk: socket
> > - */
> > -static void xs_error_report(struct sock *sk)
> > -{
> > -	struct rpc_xprt *xprt;
> > -
> > -	read_lock_bh(&sk->sk_callback_lock);
> > -	if (!(xprt = xprt_from_sock(sk)))
> > -		goto out;
> > -	dprintk("RPC:       %s client %p...\n"
> > -			"RPC:       error %d\n",
> > -			__func__, xprt, sk->sk_err);
> > -	xprt_wake_pending_tasks(xprt, -EAGAIN);
> > -out:
> > -	read_unlock_bh(&sk->sk_callback_lock);
> > -}
> > -
> >  static void xs_write_space(struct sock *sk)
> >  {
> >  	struct socket *sock;
> > @@ -1858,7 +1836,6 @@ static int xs_local_finish_connecting(struct rpc_xprt *xprt,
> >  		sk->sk_user_data = xprt;
> >  		sk->sk_data_ready = xs_local_data_ready;
> >  		sk->sk_write_space = xs_udp_write_space;
> > -		sk->sk_error_report = xs_error_report;
> >  		sk->sk_allocation = GFP_ATOMIC;
> >  
> >  		xprt_clear_connected(xprt);
> > @@ -1983,7 +1960,6 @@ static void xs_udp_finish_connecting(struct rpc_xprt *xprt, struct socket *sock)
> >  		sk->sk_user_data = xprt;
> >  		sk->sk_data_ready = xs_udp_data_ready;
> >  		sk->sk_write_space = xs_udp_write_space;
> > -		sk->sk_error_report = xs_error_report;
> >  		sk->sk_no_check = UDP_CSUM_NORCV;
> >  		sk->sk_allocation = GFP_ATOMIC;
> >  
> > @@ -2098,7 +2074,6 @@ static int xs_tcp_finish_connecting(struct rpc_xprt *xprt, struct socket *sock)
> >  		sk->sk_data_ready = xs_tcp_data_ready;
> >  		sk->sk_state_change = xs_tcp_state_change;
> >  		sk->sk_write_space = xs_tcp_write_space;
> > -		sk->sk_error_report = xs_error_report;
> >  		sk->sk_allocation = GFP_ATOMIC;
> >  
> >  		/* socket options */
> 
> Hi Trond,
> 
> First, thanks for being so responsive!  I never expected to get such
> quick feedback and help.
> 
> I think perhaps you misread my last email.  Adding the check for
> `xprt_connecting(xprt)' to `xs_error_report' _did_ stop the early wake
> up from happening at that point in the code.  Removing the
> entire call back with your patch works just as well.

OK. I did misunderstand that.

> I have confirmed that I can still reproduce the problem with this
> patch.
> 
> Now that we're not waking up early because of `xs_error_report', we're
> waking up early in another part of the code as I mentioned above.  I'm
> not entirely clear about all the details here yet.
> 
> What I do know is that we're woken up with `tk_status' set to EGAIN, and
> since `call_connect_status' doesn't do anything special for that, we move
> on to `call_transmit'.  Meanwhile, rpciod is still trying to connect the
> socket (eventually succeeding) and so we're racing.
> 
> Question: should `call_connect_status' treat EAGAIN specially and reset
> `tk_action' to `call_connect' again (rather than just moving on to
> `call_transmit')?  I don't think this alone would fix things properly,
> but seems logical to me.  Perhaps I'm missing something.

call_transmit() should normally just cause the rpc_task to detect that
the connection is down, and should cause it to go to call_connect in
order to fix everything.

If it is causing a race, then I'm thinking that is probably due to the
call to xs_tcp_shutdown() when we get an EPIPE error in
xs_tcp_send_request(). That shutdown seems to be needed only in the case
when we're in TCP_CLOSE_WAIT, in which case what we really want is to
force the rpc_task to go to xprt_connect, which means clearing
xprt_connected().

Hmm.... Can you try applying the attached 2 patches?

Cheers
  Trond

-- 
Trond Myklebust
Linux NFS client maintainer

NetApp
Trond.Myklebust@netapp.com
www.netapp.com

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-SUNRPC-Clear-the-connect-flag-when-socket-state-is-T.patch --]
[-- Type: text/x-patch; name="0001-SUNRPC-Clear-the-connect-flag-when-socket-state-is-T.patch", Size: 959 bytes --]

From 38b548821defc1ed171caf60b49d92172232c443 Mon Sep 17 00:00:00 2001
From: Trond Myklebust <Trond.Myklebust@netapp.com>
Date: Tue, 23 Oct 2012 11:35:47 -0400
Subject: [PATCH 1/2] SUNRPC: Clear the connect flag when socket state is
 TCP_CLOSE_WAIT

This is needed to ensure that we call xprt_connect() upon the next
call to call_connect().

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org
---
 net/sunrpc/xprtsock.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
index aaaadfb..6e6967d 100644
--- a/net/sunrpc/xprtsock.c
+++ b/net/sunrpc/xprtsock.c
@@ -1516,6 +1516,7 @@ static void xs_tcp_state_change(struct sock *sk)
 	case TCP_CLOSE_WAIT:
 		/* The server initiated a shutdown of the socket */
 		xprt->connect_cookie++;
+		clear_bit(XPRT_CONNECTED, &xprt->state);
 		xs_tcp_force_close(xprt);
 	case TCP_CLOSING:
 		/*
-- 
1.7.11.7


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #3: 0002-Revert-SUNRPC-Ensure-we-close-the-socket-on-EPIPE-er.patch --]
[-- Type: text/x-patch; name="0002-Revert-SUNRPC-Ensure-we-close-the-socket-on-EPIPE-er.patch", Size: 1290 bytes --]

From ed42390cf42a0ea483d7a8d7e2e2b397ec0a8f00 Mon Sep 17 00:00:00 2001
From: Trond Myklebust <Trond.Myklebust@netapp.com>
Date: Tue, 23 Oct 2012 11:40:02 -0400
Subject: [PATCH 2/2] Revert "SUNRPC: Ensure we close the socket on EPIPE
 errors too..."

This reverts commit 55420c24a0d4d1fce70ca713f84aa00b6b74a70e.
Now that we clear the connected flag when entering TCP_CLOSE_WAIT,
the deadlock described in this commit is no longer possible.
Instead, the resulting call to xs_tcp_shutdown() can interfere
with pending reconnection attempts.

Reported-by: Chris Perl <chris.perl@gmail.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org
---
 net/sunrpc/xprtsock.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
index 6e6967d..7e2dd0d 100644
--- a/net/sunrpc/xprtsock.c
+++ b/net/sunrpc/xprtsock.c
@@ -737,10 +737,10 @@ static int xs_tcp_send_request(struct rpc_task *task)
 		dprintk("RPC:       sendmsg returned unrecognized error %d\n",
 			-status);
 	case -ECONNRESET:
-	case -EPIPE:
 		xs_tcp_shutdown(xprt);
 	case -ECONNREFUSED:
 	case -ENOTCONN:
+	case -EPIPE:
 		clear_bit(SOCK_ASYNC_NOSPACE, &transport->sock->flags);
 	}
 
-- 
1.7.11.7



* Re: RPC Race Condition
  2012-10-23 16:02         ` Myklebust, Trond
@ 2012-10-23 20:08           ` Chris Perl
  2012-10-23 21:58             ` Myklebust, Trond
  0 siblings, 1 reply; 14+ messages in thread
From: Chris Perl @ 2012-10-23 20:08 UTC (permalink / raw)
  To: Myklebust, Trond; +Cc: linux-nfs

On Tue, Oct 23, 2012 at 04:02:27PM +0000, Myklebust, Trond wrote:
> call_transmit() should normally just cause the rpc_task to detect that
> the connection is down, and should cause it to go to call_connect in
> order to fix everything.
> 
> If it is causing a race, then I'm thinking that is probably due to the
> call to xs_tcp_shutdown() when we get an EPIPE error in
> xs_tcp_send_request(). That shutdown seems to be needed only in the case
> when we're in TCP_CLOSE_WAIT, in which case what we really want is to
> force the rpc_task to go to xprt_connect, which means clearing
> xprt_connected().
> 
> Hmm.... Can you try applying the attached 2 patches?

I rebuilt with those two patches as well as the first one that removed
the `xs_error_report' callback.

Good news, it works!

However, looking at the interactions, it still seems a little bit racy.
Clearly you know more about this than I do, so I'll defer to your better
judgement, but let me lay out what I believe is happening:

            Stat Thread                                rpciod
           -------------                              --------
  - woken up with tk_status set to
    EAGAIN via xs_tcp_state_change

  - tries call_transmit again, which
    calls xs_tcp_send_request which
    comes back with ECONNRESET (b/c this
    is what sk_err is)

  - call_status sees this and goes to
    sleep for 3 seconds and then sets
    tk_action to call_bind

  - call_bind

  - call_connect, queues execution of
    xs_tcp_setup_socket with rpciod and
    goes to sleep
    			
					   - starts doing its thing, but wakes
					     up our other thread before its done
					     trying to connect the socket (or
					     failing to do so) via the call
					     stack mentioned before (via
					     xprt_disconnect_done)
					     
  - tk_status is set to EAGAIN, so
    call_connect_status moves on to
    call_transmit

  - call_transmit calls
    xs_tcp_send_request again which
    finds the socket with sk_shutdown !=
    0 and so returns EPIPE

  - call_status sees the EPIPE and sets
    tk_action back to call_bind

					   - continues on and updates the state
					     of the socket to TCP_SYN_SENT

					   - When the connection is established,
					     xs_tcp_state_change takes care of
					     marking the xprt connected

  - call_bind

  - call_connect, however this time it
    schedules no work because it finds
    that the transport is already
    connected

  - call_transmit, succeeds, we recover

This is how the recovery went when I was testing, but I could imagine a
situation where different timing comes into play and we wind up asking
rpciod to execute xs_tcp_setup_socket twice (because the first connect
hasn't been established yet).  I haven't thought too hard about what
that would mean, but I imagine it probably wouldn't be good.

Like I said, you know way more about this than I do; I just wanted to point
this out.  It's entirely possible that I'm missing or misunderstanding
something.

Thanks!

Chris

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: RPC Race Condition
  2012-10-23 20:08           ` Chris Perl
@ 2012-10-23 21:58             ` Myklebust, Trond
  2012-10-24 12:40               ` Chris Perl
  0 siblings, 1 reply; 14+ messages in thread
From: Myklebust, Trond @ 2012-10-23 21:58 UTC (permalink / raw)
  To: Chris Perl; +Cc: linux-nfs

On Tue, 2012-10-23 at 16:08 -0400, Chris Perl wrote:
> On Tue, Oct 23, 2012 at 04:02:27PM +0000, Myklebust, Trond wrote:
> > call_transmit() should normally just cause the rpc_task to detect that
> > the connection is down, and should cause it to go to call_connect in
> > order to fix everything.
> > 
> > If it is causing a race, then I'm thinking that is probably due to the
> > call to xs_tcp_shutdown() when we get an EPIPE error in
> > xs_tcp_send_request(). That shutdown seems to be needed only in the case
> > when we're in TCP_CLOSE_WAIT, in which case what we really want is to
> > force the rpc_task to go to xprt_connect, which means clearing
> > xprt_connected().
> > 
> > Hmm.... Can you try applying the attached 2 patches?
> 
> I rebuilt with those two patches as well as the first one that removed
> the `xs_error_report' callback.
> 
> Good news, it works!
> 
> However, looking at the interactions, it still seems a little bit racy.
> Clearly you know more about this than I do, so I'll defer to your better
> judgement, but let me lay out what I believe is happening:
> 
>             Stat Thread                                rpciod
>            -------------                              --------
>   - woken up with tk_status set to
>     EAGAIN via xs_tcp_state_change
> 
>   - tries call_transmit again, which
>     calls xs_tcp_send_request which
>     comes back with ECONNRESET (b/c this
>     is what sk_err is)
> 
>   - call_status sees this and goes to
>     sleep for 3 seconds and then sets
>     tk_action to call_bind
> 
>   - call_bind
> 
>   - call_connect, queues execution of
>     xs_tcp_setup_socket with rpciod and
>     goes to sleep
> 
>                                            - starts doing its thing, but wakes
>                                              up our other thread before its done
>                                              trying to connect the socket (or
>                                              failing to do so) via the call
>                                              stack mentioned before (via
>                                              xprt_disconnect_done)
> 
>   - tk_status is set to EAGAIN, so
>     call_connect_status moves on to
>     call_transmit
> 
>   - call_transmit calls
>     xs_tcp_send_request again which
>     finds the socket with sk_shutdown !=
>     0 and so returns EPIPE
> 
>   - call_status sees the EPIPE and sets
>     tk_action back to call_bind
> 
>                                            - continues on and updates the state
>                                              of the socket to TCP_SYN_SENT
> 
>                                            - When the connection is established,
>                                              xs_tcp_state_change takes care of
>                                              marking the xprt connected
> 
>   - call_bind
> 
>   - call_connect, however this time it
>     schedules no work because it finds
>     that the transport is already
>     connected
> 
>   - call_transmit, succeeds, we recover
> 
> This is how the recovery went when I was testing, but I could imagine a
> situation where different timing comes into play and we wind up asking
> rpciod to execute xs_tcp_setup_socket twice (because the first connect
> hasn't been established yet).  I haven't thought too hard about what
> that would mean, but I imagine it probably wouldn't be good.
> 
> Like I said, you know way more about this than I do; I just wanted to point
> this out.  It's entirely possible that I'm missing or misunderstanding
> something.

The only problem there would be the call to xs_sock_mark_closed() in
xs_abort_connection. As far as I can tell, all we want to do there is
clear all those flags.

How about the following?

Cheers
  Trond
8<---------------------------------------------------------------------
From c2d1bd4ab6c85cdbdf3380de2269817092fb9a11 Mon Sep 17 00:00:00 2001
From: Trond Myklebust <Trond.Myklebust@netapp.com>
Date: Tue, 23 Oct 2012 17:50:07 -0400
Subject: [PATCH] SUNRPC: Prevent races in xs_abort_connection()

The call to xprt_disconnect_done() that is triggered by a successful
connection reset will trigger another automatic wakeup of all tasks
on the xprt->pending rpc_wait_queue. In particular it will cause an
early wake up of the task that called xprt_connect().

All we really want to do here is clear all the socket-specific state
flags, so we split that functionality out of xs_sock_mark_closed()
into a helper that can be called by xs_abort_connection()

Reported-by: Chris Perl <chris.perl@gmail.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org
---
 net/sunrpc/xprtsock.c | 13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
index 7e2dd0d..1f105c2 100644
--- a/net/sunrpc/xprtsock.c
+++ b/net/sunrpc/xprtsock.c
@@ -1453,7 +1453,7 @@ static void xs_tcp_cancel_linger_timeout(struct rpc_xprt *xprt)
 	xprt_clear_connecting(xprt);
 }
 
-static void xs_sock_mark_closed(struct rpc_xprt *xprt)
+static void xs_sock_reset_connection_flags(struct rpc_xprt *xprt)
 {
 	smp_mb__before_clear_bit();
 	clear_bit(XPRT_CONNECTION_ABORT, &xprt->state);
@@ -1461,6 +1461,11 @@ static void xs_sock_mark_closed(struct rpc_xprt *xprt)
 	clear_bit(XPRT_CLOSE_WAIT, &xprt->state);
 	clear_bit(XPRT_CLOSING, &xprt->state);
 	smp_mb__after_clear_bit();
+}
+
+static void xs_sock_mark_closed(struct rpc_xprt *xprt)
+{
+	xs_sock_reset_connection_flags(xprt);
 	/* Mark transport as closed and wake up all pending tasks */
 	xprt_disconnect_done(xprt);
 }
@@ -2051,10 +2056,8 @@ static void xs_abort_connection(struct sock_xprt *transport)
 	any.sa_family = AF_UNSPEC;
 	result = kernel_connect(transport->sock, &any, sizeof(any), 0);
 	if (!result)
-		xs_sock_mark_closed(&transport->xprt);
-	else
-		dprintk("RPC:       AF_UNSPEC connect return code %d\n",
-				result);
+		xs_sock_reset_connection_flags(&transport->xprt);
+	dprintk("RPC:       AF_UNSPEC connect return code %d\n", result);
 }
 
 static void xs_tcp_reuse_connection(struct sock_xprt *transport)
-- 
1.7.11.7


-- 
Trond Myklebust
Linux NFS client maintainer

NetApp
Trond.Myklebust@netapp.com
www.netapp.com

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: RPC Race Condition
  2012-10-23 21:58             ` Myklebust, Trond
@ 2012-10-24 12:40               ` Chris Perl
  2012-10-24 13:04                 ` Myklebust, Trond
  0 siblings, 1 reply; 14+ messages in thread
From: Chris Perl @ 2012-10-24 12:40 UTC (permalink / raw)
  To: Myklebust, Trond; +Cc: linux-nfs

On Tue, Oct 23, 2012 at 09:58:35PM +0000, Myklebust, Trond wrote:
> The only problem there would be the call to xs_sock_mark_closed() in
> xs_abort_connection. As far as I can tell, all we want to do there is
> clear all those flags.
> 
> How about the following?

Looks good!  With all 4 patches applied things recover correctly and I
no longer see anything racing.

Just to be clear, I took your patches and integrated them into the
CentOS 2.6.32-279.5.2.el6.centos.plus kernel.  Basically everything was
the same except the offsets and perhaps a little of the surrounding
context.  I do not believe the meaning of the patches was changed at
all.

If you want me to test your latest git where these commits live, I can,
just point me at the git repo url.

Thanks for the help!

Chris

^ permalink raw reply	[flat|nested] 14+ messages in thread

* RE: RPC Race Condition
  2012-10-24 12:40               ` Chris Perl
@ 2012-10-24 13:04                 ` Myklebust, Trond
  2012-10-24 14:23                   ` Chris Perl
  0 siblings, 1 reply; 14+ messages in thread
From: Myklebust, Trond @ 2012-10-24 13:04 UTC (permalink / raw)
  To: Chris Perl; +Cc: linux-nfs

> -----Original Message-----
> From: linux-nfs-owner@vger.kernel.org [mailto:linux-nfs-
> owner@vger.kernel.org] On Behalf Of Chris Perl
> Sent: Wednesday, October 24, 2012 8:41 AM
> To: Myklebust, Trond
> Cc: linux-nfs@vger.kernel.org
> Subject: Re: RPC Race Condition
> 
> On Tue, Oct 23, 2012 at 09:58:35PM +0000, Myklebust, Trond wrote:
> > The only problem there would be the call to xs_sock_mark_closed() in
> > xs_abort_connection. As far as I can tell, all we want to do there is
> > clear all those flags.
> >
> > How about the following?
> 
> Looks good!  With all 4 patches applied things recover correctly and I no
> longer see anything racing.
> 
> Just to be clear, I took your patches and integrated them into the CentOS
> 2.6.32-279.5.2.el6.centos.plus kernel.  Basically everything was the same
> except the offsets and perhaps a little of the surrounding context.  I do not
> believe the meaning of the patches was changed at all.

Cool! Would you mind if I add "Tested-by" tags?

> If you want me to test your latest git where these commits live, I can, just
> point me at the git repo url.

Sure. I'll let you know as soon as I push them out later today...

Thanks!
   Trond

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: RPC Race Condition
  2012-10-24 13:04                 ` Myklebust, Trond
@ 2012-10-24 14:23                   ` Chris Perl
  2012-10-24 14:56                     ` Myklebust, Trond
  0 siblings, 1 reply; 14+ messages in thread
From: Chris Perl @ 2012-10-24 14:23 UTC (permalink / raw)
  To: Myklebust, Trond; +Cc: linux-nfs

On Wed, Oct 24, 2012 at 01:04:16PM +0000, Myklebust, Trond wrote:
> Cool! Would you mind if I add "Tested-by" tags?

Wouldn't mind at all.

I've also run several of the filebench [1] workload personalities with
this patched kernel for ten minutes apiece (i.e. the `fileserver' and
`networkfs' personalities) and experienced no problems.  During the
tests I also added the blackhole route to the server several times
(along with restarting the NFS server) and it recovered each time.  I
did this for filebench running SCHED_OTHER and SCHED_RR with a priority
of 50.

> Sure. I'll let you know as soon as I push them out later today...

Cool, I'll keep an eye out.

Chris

[1] http://sourceforge.net/apps/mediawiki/filebench/index.php?title=Filebench

^ permalink raw reply	[flat|nested] 14+ messages in thread

* RE: RPC Race Condition
  2012-10-24 14:23                   ` Chris Perl
@ 2012-10-24 14:56                     ` Myklebust, Trond
  2012-10-24 16:10                       ` Chris Perl
  0 siblings, 1 reply; 14+ messages in thread
From: Myklebust, Trond @ 2012-10-24 14:56 UTC (permalink / raw)
  To: Chris Perl; +Cc: linux-nfs

> -----Original Message-----
> From: Chris Perl [mailto:chris.perl@gmail.com]
> Sent: Wednesday, October 24, 2012 10:24 AM
> To: Myklebust, Trond
> Cc: linux-nfs@vger.kernel.org
> Subject: Re: RPC Race Condition
> 
> On Wed, Oct 24, 2012 at 01:04:16PM +0000, Myklebust, Trond wrote:
> > Cool! Would you mind if I add "Tested-by" tags?
> 
> Wouldn't mind at all.
> 
> I've also run several of the filebench [1] workload personalities with this
> patched kernel for ten minutes apiece (i.e. the `fileserver' and `networkfs'
> personalities) and experienced no problems.  During the tests I also added
> the blackhole route to the server several times (along with restarting the NFS
> server) and it recovered each time.  I did this for filebench running
> SCHED_OTHER and SCHED_RR with a priority of 50.
> 
> > Sure. I'll let you know as soon as I push them out later today...
> 
> Cool, I'll keep an eye out.

OK, I've now posted the patches in the "bugfixes" branch of

       git://git.linux-nfs.org/projects/trondmy/linux-nfs.git

By the way, you said that you were using a set of systemtap scripts to debug these races? If you'd be OK sharing those scripts, I'd love to see them. Debugging this kind of race is always difficult...

Cheers
  Trond

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: RPC Race Condition
  2012-10-24 14:56                     ` Myklebust, Trond
@ 2012-10-24 16:10                       ` Chris Perl
  2012-10-24 20:19                         ` Chris Perl
  0 siblings, 1 reply; 14+ messages in thread
From: Chris Perl @ 2012-10-24 16:10 UTC (permalink / raw)
  To: Myklebust, Trond; +Cc: linux-nfs

[-- Attachment #1: Type: text/plain, Size: 5737 bytes --]

On Wed, Oct 24, 2012 at 02:56:55PM +0000, Myklebust, Trond wrote:
> OK, I've now posted the patches in the "bugfixes" branch of
> 
>        git://git.linux-nfs.org/projects/trondmy/linux-nfs.git

Cool, I'll try to get that pulled, compiled and tested by the end of the
day.

> 
> By the way, you said that you were using a set of systemtap scripts to debug these races? If you'd be OK sharing those scripts, I'd love to see them. Debugging this kind of race is always difficult...
> 

Sure.  Attached are a bunch of them.  They have very unhelpful names. :)

They progressively get more involved.  As I watched their output coupled
with reading the source code, I found new areas I wanted to explore.

The numbered scripts are all aimed at being run with the `-x' flag
pointed at the process doing the stat calls, i.e.:

# stap 1.stp -x $(pgrep hang.py)

The `rpciod.stp' script is just meant to be run without any
qualifications to trace the function calls rpciod is making, i.e.:

# stap rpciod.stp

The script that really allowed me to see the race condition was 14.stp.
It produces a lot of output to poke through, but here is basically what
it's doing:

  - At the top there are a lot of helper functions.  Their basic goal
    is to translate numbers into human-readable strings, e.g. instead
    of seeing `xs_tcp_send_request' return 0xffffffffffffffe0, I wanted
    to see EPIPE.  It makes it a lot easier to pore over several pages
    of output while testing.

  - Below that there are the actual probes for the functions we want to
    trace, the gist of which is:

    - It watches for a call to `__rpc_execute' for the process id we're
      interested in.  Once it fires, we use the address of its
      task_struct to mark it as being traced via the `trace' array.

    - This probe also takes the address of the `struct rpc_task' we're
      currently working with and marks it as being traced via the
      `task_trace' array.  The idea with this is that another one of our
      probes enables tracing of `rpc_make_runnable' and on entry into
      that call we make sure we're tracing that task.  This way we can
      catch wake up calls for the task we care about.  When we do, we
      print that fact and a stack backtrace of how we got there.

    - There are many other functions that I ask to trace, but only if
      we're tracing the main task via the address of its task_struct
      being a valid key in the `trace' array.

    - I do something similar to the `rpc_make_runnable' tracing, but with
      sockets.  Here we record the `struct sock' address in
      `xs_tcp_send_request'.  In another probe we enable tracing of
      `tcp_set_state', but only for the `struct sock' that we care about.
      (A stripped-down sketch of this tagging pattern follows right after
      this list.)
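
(For reference, here is a stripped-down sketch of just that tagging
pattern.  It is not one of the attached scripts, only the shape of what
14.stp does with its `trace' and `task_trace' arrays, using the same
probed functions:)

-----8<-----
global traced_task

# Tag the struct rpc_task that the target process (-x PID) is executing.
probe module("sunrpc").function("__rpc_execute").call {
  if (pid() == target())
    traced_task[$task] = 1
}

probe module("sunrpc").function("__rpc_execute").return {
  if (pid() == target())
    delete traced_task
}

# Whatever context (rpciod, softirq, ...) wakes the tagged task fires
# this probe, and we print who did it and the call stack that got here.
probe module("sunrpc").function("rpc_make_runnable").call {
  if (traced_task[$task]) {
    printf("%s woke the traced rpc_task\n", execname())
    print_backtrace()
  }
}
-----8<-----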

I'm far from a SystemTap expert, and some of this stuff might not be
particularly safe (e.g. if I'm trying to translate tk_runstate to a
human-readable string while something else is updating it on another
CPU).  But it all seemed to work "well enough" and give insight into
what was actually happening.

Here is an example snippet of some of the output produced with 14.stp:
-----8<-----
1351091111259227075 [0xffff88060b27d540]    150 hang.py(5560):  -> call_transmit: task: tk_status: 0, tk_runstate: RPC_TASK_RUNNING|RPC_TASK_ACTIVE, parms: task=0xffff8806355c9080
1351091111259235702 [0xffff88060b27d540]    159 hang.py(5560):   -> xs_tcp_send_request: sock: sock=0xffff880620803140, state: SS_CONNECTING, sk=0xffff880635ded540, sk_state: TCP_ESTABLISHED, sk_err: 0, sk_shutdown: 0, task: tk_status: 0, tk_runstate: RPC_TASK_RUNNING|RPC_TASK_ACTIVE, parms: task=0xffff8806355c9080
1351091111259253562 [0xffff88060b27d540]    177 hang.py(5560):   <- xs_tcp_send_request: 0
1351091111259258719 [0xffff88060b27d540]    182 hang.py(5560):   -> call_transmit_status: task: tk_status: 0, tk_runstate: RPC_TASK_RUNNING|RPC_TASK_QUEUED|RPC_TASK_ACTIVE, parms: task=0xffff8806355c9080
1351091111259264704 [0xffff88060b27d540]    188 hang.py(5560):   <- call_transmit_status:
1351091111259267273 [0xffff88060b27d540]    190 hang.py(5560):  <- call_transmit:
1351091111259270415 [0xffff88060b27d540]    193 hang.py(5560):  -> out_of_line_wait_on_bit: word: tk_runstate: RPC_TASK_QUEUED|RPC_TASK_ACTIVE, parms: word=0xffff8806355c90f0 bit=0x1 action=0xffffffffa042f930 mode=0x82
1351091111259276940 [0xffff88060b27d540]    200 hang.py(5560):   -> rpc_wait_bit_killable: parms: word=0xffff8806355c90f0
1351091117489543178 [0xffff880637daeaa0] swapper -> rpc_make_runnable: task: tk_status: EAGAIN, tk_runstate: RPC_TASK_QUEUED|RPC_TASK_ACTIVE, parms: task=0xffff8806355c9080
 0xffffffffa042f840 : rpc_make_runnable+0x0/0x80 [sunrpc]
 0xffffffffa042fb5e : rpc_wake_up_task_queue_locked+0x18e/0x270 [sunrpc]
 0xffffffffa042fc8e : rpc_wake_up_status+0x4e/0x80 [sunrpc]
 0xffffffffa04288ec : xprt_wake_pending_tasks+0x2c/0x30 [sunrpc]
 0xffffffffa042b9fd : xs_error_report+0x4d/0x90 [sunrpc]
 0xffffffff8148d629 : tcp_reset+0x59/0x70 [kernel]
 0xffffffff8148d918 : tcp_validate_incoming+0x2d8/0x3a0 [kernel]
 0xffffffff81491006 : tcp_rcv_established+0x2e6/0x800 [kernel]
 0xffffffff814990f3 : tcp_v4_do_rcv+0x2e3/0x430 [kernel]
 0xffffffff8149a96e : tcp_v4_rcv+0x4fe/0x8d0 [kernel]
 0xffffffff8147868d : ip_local_deliver_finish+0xdd/0x2d0 [kernel]
 0xffffffff81478918 : ip_local_deliver+0x98/0xa0 [kernel]
 0xffffffff81477ddd : ip_rcv_finish+0x12d/0x440 [kernel]
 0xffffffff81478365 : ip_rcv+0x275/0x350 [kernel]
 0xffffffff81441a5b : __netif_receive_skb+0x49b/0x6f0 [kernel]
 0xffffffff81443cd8 : netif_receive_skb+0x58/0x60 [kernel]
 0xffffffff81443de0 : napi_skb_finish+0x50/0x70 [kernel]
 0xffffffff81446319 : napi_gro_receive+0x39/0x50 [kernel]
 0xffffffffa0126f2f [bnx2]
 0x0 (inexact)
-----8<-----

Chris

[-- Attachment #2: 1.stp --]
[-- Type: text/plain, Size: 816 bytes --]
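
# 1.stp: tag the target PID (-x) when it enters rpc_call_sync and, while it
# is inside that call, trace every sunrpc function call/return with
# thread_indent() nesting.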

global trace

probe module("sunrpc").function("rpc_call_sync").call {
    if (pid() == target()) {
        t = task_current()
        printf("[0x%x] %s -> %s: %s\n", t, thread_indent(1), probefunc(), $$parms)
        trace[t] = 1
    }
}

probe module("sunrpc").function("rpc_call_sync").return {
    t = task_current()
    if (trace[t]) {
        printf("[0x%x] %s <- %s: %s\n", t, thread_indent(-1), probefunc(), $$return)
        trace[t] = 0
    }
}

probe module("sunrpc").function("*").call {
    t = task_current()
    if (trace[t]) {
        printf("[0x%x] %s -> %s: %s\n", t, thread_indent(1), probefunc(), $$parms)
    }
}

probe module("sunrpc").function("*").return {
    t = task_current()
    if (trace[t]) {
        printf("[0x%x] %s <- %s: %s\n", t, thread_indent(-1), probefunc(), $$return)
    }
}

[-- Attachment #3: 3.stp --]
[-- Type: text/plain, Size: 449 bytes --]
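
# 3.stp: only log entry/exit of __rpc_execute for the target PID, without
# tracing the functions in between.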

# vim: ts=2 sts=2 sw=2 et

global trace

probe module("sunrpc").function("__rpc_execute").call {
  t = task_current()
  if (pid() == target()) {
    printf("[0x%x] %s -> %s: %s\n", t, thread_indent(1), probefunc(), $$parms)
    trace[t] = 1
  }
}

probe module("sunrpc").function("__rpc_execute").return {
  t = task_current()
  if (trace[t]) {
    printf("[0x%x] %s <- %s: %s\n", t, thread_indent(-1), probefunc(), $$return)
    trace[t] = 0
  }
}

[-- Attachment #4: 5.stp --]
[-- Type: text/plain, Size: 1102 bytes --]
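
# 5.stp: like 1.stp but keyed on __rpc_execute, annotating the entry line
# with the RPC procedure name and server pulled out of the rpc_task.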

global trace

probe module("sunrpc").function("__rpc_execute").call {
  if (pid() == target()) {
    t = task_current()
    proc = @cast(&$task->tk_msg, "struct rpc_message")->rpc_proc
    func = @cast(proc, "struct rpc_procinfo")->p_name
    serv = @cast($task->tk_client, "struct rpc_clnt")->cl_server
    func_name = kernel_string(func)
    serv_name = kernel_string(serv)
    printf("[0x%x] %s -> %s: %s: proc_name: %s, server: %s\n", t, thread_indent(1), probefunc(), $$parms, func_name, serv_name)
    trace[t] = 1
  }
}

probe module("sunrpc").function("__rpc_execute").return {
  t = task_current()
  if (trace[t]) {
    printf("[0x%x] %s <- %s: %s\n", t, thread_indent(-1), probefunc(), $$return)
    trace[t] = 0
  }
}

probe module("sunrpc").function("*").call {
  t = task_current()
  if (trace[t]) {
      printf("[0x%x] %s -> %s: %s\n", t, thread_indent(1), probefunc(), $$parms)
  }
}

probe module("sunrpc").function("*").return {
  t = task_current()
  if (trace[t]) {
      printf("[0x%x] %s <- %s: %s\n", t, thread_indent(-1), probefunc(), $$return)
  }
}

# vim: ts=2 sts=2 sw=2 et

[-- Attachment #5: 7.stp --]
[-- Type: text/plain, Size: 4799 bytes --]
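
# 7.stp: adds errno-number-to-name translation and narrows the tracing down
# to xs_tcp_send_request calls/returns made under __rpc_execute.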

global trace

global _errno[134]

probe begin {
  _errno[0] = "0"
  _errno[1] = "EPERM"
  _errno[2] = "ENOENT"
  _errno[3] = "ESRCH"
  _errno[4] = "EINTR"
  _errno[5] = "EIO"
  _errno[6] = "ENXIO"
  _errno[7] = "E2BIG"
  _errno[8] = "ENOEXEC"
  _errno[9] = "EBADF"
  _errno[10] = "ECHILD"
  _errno[11] = "EAGAIN"
  _errno[12] = "ENOMEM"
  _errno[13] = "EACCES"
  _errno[14] = "EFAULT"
  _errno[15] = "ENOTBLK"
  _errno[16] = "EBUSY"
  _errno[17] = "EEXIST"
  _errno[18] = "EXDEV"
  _errno[19] = "ENODEV"
  _errno[20] = "ENOTDIR"
  _errno[21] = "EISDIR"
  _errno[22] = "EINVAL"
  _errno[23] = "ENFILE"
  _errno[24] = "EMFILE"
  _errno[25] = "ENOTTY"
  _errno[26] = "ETXTBSY"
  _errno[27] = "EFBIG"
  _errno[28] = "ENOSPC"
  _errno[29] = "ESPIPE"
  _errno[30] = "EROFS"
  _errno[31] = "EMLINK"
  _errno[32] = "EPIPE"
  _errno[33] = "EDOM"
  _errno[34] = "ERANGE"
  _errno[35] = "EDEADLK"
  _errno[36] = "ENAMETOOLONG"
  _errno[37] = "ENOLCK"
  _errno[38] = "ENOSYS"
  _errno[39] = "ENOTEMPTY"
  _errno[40] = "ELOOP"
  _errno[41] = "UNDEF"
  _errno[42] = "ENOMSG"
  _errno[43] = "EIDRM"
  _errno[44] = "ECHRNG"
  _errno[45] = "EL2NSYNC"
  _errno[46] = "EL3HLT"
  _errno[47] = "EL3RST"
  _errno[48] = "ELNRNG"
  _errno[49] = "EUNATCH"
  _errno[50] = "ENOCSI"
  _errno[51] = "EL2HLT"
  _errno[52] = "EBADE"
  _errno[53] = "EBADR"
  _errno[54] = "EXFULL"
  _errno[55] = "ENOANO"
  _errno[56] = "EBADRQC"
  _errno[57] = "EBADSLT"
  _errno[58] = "UNDEF"
  _errno[59] = "EBFONT"
  _errno[60] = "ENOSTR"
  _errno[61] = "ENODATA"
  _errno[62] = "ETIME"
  _errno[63] = "ENOSR"
  _errno[64] = "ENONET"
  _errno[65] = "ENOPKG"
  _errno[66] = "EREMOTE"
  _errno[67] = "ENOLINK"
  _errno[68] = "EADV"
  _errno[69] = "ESRMNT"
  _errno[70] = "ECOMM"
  _errno[71] = "EPROTO"
  _errno[72] = "EMULTIHOP"
  _errno[73] = "EDOTDOT"
  _errno[74] = "EBADMSG"
  _errno[75] = "EOVERFLOW"
  _errno[76] = "ENOTUNIQ"
  _errno[77] = "EBADFD"
  _errno[78] = "EREMCHG"
  _errno[79] = "ELIBACC"
  _errno[80] = "ELIBBAD"
  _errno[81] = "ELIBSCN"
  _errno[82] = "ELIBMAX"
  _errno[83] = "ELIBEXEC"
  _errno[84] = "EILSEQ"
  _errno[85] = "ERESTART"
  _errno[86] = "ESTRPIPE"
  _errno[87] = "EUSERS"
  _errno[88] = "ENOTSOCK"
  _errno[89] = "EDESTADDRREQ"
  _errno[90] = "EMSGSIZE"
  _errno[91] = "EPROTOTYPE"
  _errno[92] = "ENOPROTOOPT"
  _errno[93] = "EPROTONOSUPPORT"
  _errno[94] = "ESOCKTNOSUPPORT"
  _errno[95] = "EOPNOTSUPP"
  _errno[96] = "EPFNOSUPPORT"
  _errno[97] = "EAFNOSUPPORT"
  _errno[98] = "EADDRINUSE"
  _errno[99] = "EADDRNOTAVAIL"
  _errno[100] = "ENETDOWN"
  _errno[101] = "ENETUNREACH"
  _errno[102] = "ENETRESET"
  _errno[103] = "ECONNABORTED"
  _errno[104] = "ECONNRESET"
  _errno[105] = "ENOBUFS"
  _errno[106] = "EISCONN"
  _errno[107] = "ENOTCONN"
  _errno[108] = "ESHUTDOWN"
  _errno[109] = "ETOOMANYREFS"
  _errno[110] = "ETIMEDOUT"
  _errno[111] = "ECONNREFUSED"
  _errno[112] = "EHOSTDOWN"
  _errno[113] = "EHOSTUNREACH"
  _errno[114] = "EALREADY"
  _errno[115] = "EINPROGRESS"
  _errno[116] = "ESTALE"
  _errno[117] = "EUCLEAN"
  _errno[118] = "ENOTNAM"
  _errno[119] = "ENAVAIL"
  _errno[120] = "EISNAM"
  _errno[121] = "EREMOTEIO"
  _errno[122] = "EDQUOT"
  _errno[123] = "ENOMEDIUM"
  _errno[124] = "EMEDIUMTYPE"
  _errno[125] = "ECANCELED"
  _errno[126] = "ENOKEY"
  _errno[127] = "EKEYEXPIRED"
  _errno[128] = "EKEYREVOKED"
  _errno[129] = "EKEYREJECTED"
  _errno[130] = "EOWNERDEAD"
  _errno[131] = "ENOTRECOVERABLE"
  _errno[132] = "ERFKILL"
  _errno[133] = "EHWPOISON"
}

function err_num2str:string (error:long) {
  error = -error
  return (error in _errno ? _errno[error] : sprintf("0x%x", -error))
}

probe module("sunrpc").function("__rpc_execute").call {
  if (pid() == target()) {
      t = task_current()
      p = probefunc()
      serv      = @cast($task->tk_client, "struct rpc_clnt")->cl_server
      serv_name = kernel_string(serv)
      proc      = @cast(&$task->tk_msg, "struct rpc_message")->rpc_proc
      func      = @cast(proc, "struct rpc_procinfo")->p_name
      func_name = kernel_string(func)
      printf("[0x%x] %s -> %s: func: %s, server: %s, parms: %s\n", t, thread_indent(1), p, func_name, serv_name, $$parms)
      trace[t] = 1
  }
}

probe module("sunrpc").function("__rpc_execute").return {
  t = task_current()
  p = probefunc()
  if (trace[t]) {
    printf("[0x%x] %s <- %s: %s\n", t, thread_indent(-1), p, $$return)
    trace[t] = 0
  }
}

probe module("sunrpc").function("xs_tcp_send_request").call {
  t = task_current()
  p = probefunc()
  if (trace[t]) {
    printf("[0x%x] %s -> %s: %s\n", t, thread_indent(1), p, $$parms)
  }
}

probe module("sunrpc").function("xs_tcp_send_request").return {
  t = task_current()
  if (trace[t]) {
    printf("[0x%x] %s <- %s: %s\n", t, thread_indent(-1), probefunc(), err_num2str($return))
  }
}

# vim: ts=2 sts=2 sw=2 et

[-- Attachment #6: 8.stp --]
[-- Type: text/plain, Size: 5505 bytes --]
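
# 8.stp: like 7.stp, but while inside xs_tcp_send_request it also follows
# the kernel's tcp, stream and socket code (net/ipv4/tcp*,
# net/core/stream.c, net/socket.c).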

global trace

global _errno[134]

probe begin {
  _errno[0] = "0"
  _errno[1] = "EPERM"
  _errno[2] = "ENOENT"
  _errno[3] = "ESRCH"
  _errno[4] = "EINTR"
  _errno[5] = "EIO"
  _errno[6] = "ENXIO"
  _errno[7] = "E2BIG"
  _errno[8] = "ENOEXEC"
  _errno[9] = "EBADF"
  _errno[10] = "ECHILD"
  _errno[11] = "EAGAIN"
  _errno[12] = "ENOMEM"
  _errno[13] = "EACCES"
  _errno[14] = "EFAULT"
  _errno[15] = "ENOTBLK"
  _errno[16] = "EBUSY"
  _errno[17] = "EEXIST"
  _errno[18] = "EXDEV"
  _errno[19] = "ENODEV"
  _errno[20] = "ENOTDIR"
  _errno[21] = "EISDIR"
  _errno[22] = "EINVAL"
  _errno[23] = "ENFILE"
  _errno[24] = "EMFILE"
  _errno[25] = "ENOTTY"
  _errno[26] = "ETXTBSY"
  _errno[27] = "EFBIG"
  _errno[28] = "ENOSPC"
  _errno[29] = "ESPIPE"
  _errno[30] = "EROFS"
  _errno[31] = "EMLINK"
  _errno[32] = "EPIPE"
  _errno[33] = "EDOM"
  _errno[34] = "ERANGE"
  _errno[35] = "EDEADLK"
  _errno[36] = "ENAMETOOLONG"
  _errno[37] = "ENOLCK"
  _errno[38] = "ENOSYS"
  _errno[39] = "ENOTEMPTY"
  _errno[40] = "ELOOP"
  _errno[41] = "UNDEF"
  _errno[42] = "ENOMSG"
  _errno[43] = "EIDRM"
  _errno[44] = "ECHRNG"
  _errno[45] = "EL2NSYNC"
  _errno[46] = "EL3HLT"
  _errno[47] = "EL3RST"
  _errno[48] = "ELNRNG"
  _errno[49] = "EUNATCH"
  _errno[50] = "ENOCSI"
  _errno[51] = "EL2HLT"
  _errno[52] = "EBADE"
  _errno[53] = "EBADR"
  _errno[54] = "EXFULL"
  _errno[55] = "ENOANO"
  _errno[56] = "EBADRQC"
  _errno[57] = "EBADSLT"
  _errno[58] = "UNDEF"
  _errno[59] = "EBFONT"
  _errno[60] = "ENOSTR"
  _errno[61] = "ENODATA"
  _errno[62] = "ETIME"
  _errno[63] = "ENOSR"
  _errno[64] = "ENONET"
  _errno[65] = "ENOPKG"
  _errno[66] = "EREMOTE"
  _errno[67] = "ENOLINK"
  _errno[68] = "EADV"
  _errno[69] = "ESRMNT"
  _errno[70] = "ECOMM"
  _errno[71] = "EPROTO"
  _errno[72] = "EMULTIHOP"
  _errno[73] = "EDOTDOT"
  _errno[74] = "EBADMSG"
  _errno[75] = "EOVERFLOW"
  _errno[76] = "ENOTUNIQ"
  _errno[77] = "EBADFD"
  _errno[78] = "EREMCHG"
  _errno[79] = "ELIBACC"
  _errno[80] = "ELIBBAD"
  _errno[81] = "ELIBSCN"
  _errno[82] = "ELIBMAX"
  _errno[83] = "ELIBEXEC"
  _errno[84] = "EILSEQ"
  _errno[85] = "ERESTART"
  _errno[86] = "ESTRPIPE"
  _errno[87] = "EUSERS"
  _errno[88] = "ENOTSOCK"
  _errno[89] = "EDESTADDRREQ"
  _errno[90] = "EMSGSIZE"
  _errno[91] = "EPROTOTYPE"
  _errno[92] = "ENOPROTOOPT"
  _errno[93] = "EPROTONOSUPPORT"
  _errno[94] = "ESOCKTNOSUPPORT"
  _errno[95] = "EOPNOTSUPP"
  _errno[96] = "EPFNOSUPPORT"
  _errno[97] = "EAFNOSUPPORT"
  _errno[98] = "EADDRINUSE"
  _errno[99] = "EADDRNOTAVAIL"
  _errno[100] = "ENETDOWN"
  _errno[101] = "ENETUNREACH"
  _errno[102] = "ENETRESET"
  _errno[103] = "ECONNABORTED"
  _errno[104] = "ECONNRESET"
  _errno[105] = "ENOBUFS"
  _errno[106] = "EISCONN"
  _errno[107] = "ENOTCONN"
  _errno[108] = "ESHUTDOWN"
  _errno[109] = "ETOOMANYREFS"
  _errno[110] = "ETIMEDOUT"
  _errno[111] = "ECONNREFUSED"
  _errno[112] = "EHOSTDOWN"
  _errno[113] = "EHOSTUNREACH"
  _errno[114] = "EALREADY"
  _errno[115] = "EINPROGRESS"
  _errno[116] = "ESTALE"
  _errno[117] = "EUCLEAN"
  _errno[118] = "ENOTNAM"
  _errno[119] = "ENAVAIL"
  _errno[120] = "EISNAM"
  _errno[121] = "EREMOTEIO"
  _errno[122] = "EDQUOT"
  _errno[123] = "ENOMEDIUM"
  _errno[124] = "EMEDIUMTYPE"
  _errno[125] = "ECANCELED"
  _errno[126] = "ENOKEY"
  _errno[127] = "EKEYEXPIRED"
  _errno[128] = "EKEYREVOKED"
  _errno[129] = "EKEYREJECTED"
  _errno[130] = "EOWNERDEAD"
  _errno[131] = "ENOTRECOVERABLE"
  _errno[132] = "ERFKILL"
  _errno[133] = "EHWPOISON"
}

function err_num2str:string (error:long) {
  error = -error
  return (error in _errno ? _errno[error] : sprintf("0x%x", -error))
}

probe module("sunrpc").function("__rpc_execute").call {
  if (pid() == target()) {
      t = task_current()
      p = probefunc()
      serv      = @cast($task->tk_client, "struct rpc_clnt")->cl_server
      serv_name = kernel_string(serv)
      proc      = @cast(&$task->tk_msg, "struct rpc_message")->rpc_proc
      func      = @cast(proc, "struct rpc_procinfo")->p_name
      func_name = kernel_string(func)
      printf("[0x%x] %s -> %s: func: %s, server: %s, parms: %s\n", t, thread_indent(1), p, func_name, serv_name, $$parms)
      trace[t] = 1
  }
}

probe module("sunrpc").function("__rpc_execute").return {
  t = task_current()
  p = probefunc()
  if (trace[t]) {
    printf("[0x%x] %s <- %s: %s\n", t, thread_indent(-1), p, $$return)
    trace[t] = 0
  }
}

probe module("sunrpc").function("xs_tcp_send_request").call {
  t = task_current()
  p = probefunc()
  if (trace[t] == 1) {
    printf("[0x%x] %s -> %s: %s\n", t, thread_indent(1), p, $$parms)
    trace[t] = 2
  }
}

probe module("sunrpc").function("xs_tcp_send_request").return {
  t = task_current()
  if (trace[t] == 2) {
    printf("[0x%x] %s <- %s: %s\n", t, thread_indent(-1), probefunc(), err_num2str($return))
    trace[t] = 1
  }
}

probe module("sunrpc").function("*").call,
      kernel.function("*@net/ipv4/tcp*").call,
      kernel.function("*@net/core/stream.c").call,
      kernel.function("*@net/socket.c").call {
  t = task_current()
  p = probefunc()
  if (trace[t] == 2) {
    printf("[0x%x] %s -> %s: %s\n", t, thread_indent(1), p, $$parms)
  }
}

probe module("sunrpc").function("*").return,
      kernel.function("*@net/ipv4/tcp*").return,
      kernel.function("*@net/core/stream.c").return,
      kernel.function("*@net/socket.c").return {
  t = task_current()
  p = probefunc()
  if (trace[t] == 2) {
    printf("[0x%x] %s <- %s: %s\n", t, thread_indent(-1), p, $$return)
  }
}

# vim: ts=2 sts=2 sw=2 et

[-- Attachment #7: 11.stp --]
[-- Type: text/plain, Size: 7852 bytes --]
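
# 11.stp: adds decoding of socket state (TCP state, sk_err, sk_shutdown) and
# traces tcp_sendmsg and sk_stream_wait_connect with that state included.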

global trace

global _tcp_state[13]
global _sock_state[5]
global _errno[134]

global _sock_sk_shutdown[2]

probe begin {
  /* from include/net/tcp_states.h */
  _tcp_state[0]  = "UNDEF"
  _tcp_state[1]  = "TCP_ESTABLISHED"
  _tcp_state[2]  = "TCP_SYN_SENT"
  _tcp_state[3]  = "TCP_SYN_RECV"
  _tcp_state[4]  = "TCP_FIN_WAIT1"
  _tcp_state[5]  = "TCP_FIN_WAIT2"
  _tcp_state[6]  = "TCP_TIME_WAIT"
  _tcp_state[7]  = "TCP_CLOSE"
  _tcp_state[8]  = "TCP_CLOSE_WAIT"
  _tcp_state[9]  = "TCP_LAST_ACK"
  _tcp_state[10] = "TCP_LISTEN"
  _tcp_state[11] = "TCP_CLOSING"
  _tcp_state[12] = "TCP_MAX_STATES"

  /* from include/linux/net.h */
  _sock_state[0] = "SS_FREE"
  _sock_state[1] = "SS_UNCONNECTED"
  _sock_state[2] = "SS_CONNECTING"
  _sock_state[3] = "SS_CONNECTED"
  _sock_state[4] = "SS_DISCONNECTING"

  /* from include/asm-generic/errno-base.h
   * and  include/asm-generic/errno.h
   */
  _errno[0] = "0"
  _errno[1] = "EPERM"
  _errno[2] = "ENOENT"
  _errno[3] = "ESRCH"
  _errno[4] = "EINTR"
  _errno[5] = "EIO"
  _errno[6] = "ENXIO"
  _errno[7] = "E2BIG"
  _errno[8] = "ENOEXEC"
  _errno[9] = "EBADF"
  _errno[10] = "ECHILD"
  _errno[11] = "EAGAIN"
  _errno[12] = "ENOMEM"
  _errno[13] = "EACCES"
  _errno[14] = "EFAULT"
  _errno[15] = "ENOTBLK"
  _errno[16] = "EBUSY"
  _errno[17] = "EEXIST"
  _errno[18] = "EXDEV"
  _errno[19] = "ENODEV"
  _errno[20] = "ENOTDIR"
  _errno[21] = "EISDIR"
  _errno[22] = "EINVAL"
  _errno[23] = "ENFILE"
  _errno[24] = "EMFILE"
  _errno[25] = "ENOTTY"
  _errno[26] = "ETXTBSY"
  _errno[27] = "EFBIG"
  _errno[28] = "ENOSPC"
  _errno[29] = "ESPIPE"
  _errno[30] = "EROFS"
  _errno[31] = "EMLINK"
  _errno[32] = "EPIPE"
  _errno[33] = "EDOM"
  _errno[34] = "ERANGE"
  _errno[35] = "EDEADLK"
  _errno[36] = "ENAMETOOLONG"
  _errno[37] = "ENOLCK"
  _errno[38] = "ENOSYS"
  _errno[39] = "ENOTEMPTY"
  _errno[40] = "ELOOP"
  _errno[41] = "UNDEF"
  _errno[42] = "ENOMSG"
  _errno[43] = "EIDRM"
  _errno[44] = "ECHRNG"
  _errno[45] = "EL2NSYNC"
  _errno[46] = "EL3HLT"
  _errno[47] = "EL3RST"
  _errno[48] = "ELNRNG"
  _errno[49] = "EUNATCH"
  _errno[50] = "ENOCSI"
  _errno[51] = "EL2HLT"
  _errno[52] = "EBADE"
  _errno[53] = "EBADR"
  _errno[54] = "EXFULL"
  _errno[55] = "ENOANO"
  _errno[56] = "EBADRQC"
  _errno[57] = "EBADSLT"
  _errno[58] = "UNDEF"
  _errno[59] = "EBFONT"
  _errno[60] = "ENOSTR"
  _errno[61] = "ENODATA"
  _errno[62] = "ETIME"
  _errno[63] = "ENOSR"
  _errno[64] = "ENONET"
  _errno[65] = "ENOPKG"
  _errno[66] = "EREMOTE"
  _errno[67] = "ENOLINK"
  _errno[68] = "EADV"
  _errno[69] = "ESRMNT"
  _errno[70] = "ECOMM"
  _errno[71] = "EPROTO"
  _errno[72] = "EMULTIHOP"
  _errno[73] = "EDOTDOT"
  _errno[74] = "EBADMSG"
  _errno[75] = "EOVERFLOW"
  _errno[76] = "ENOTUNIQ"
  _errno[77] = "EBADFD"
  _errno[78] = "EREMCHG"
  _errno[79] = "ELIBACC"
  _errno[80] = "ELIBBAD"
  _errno[81] = "ELIBSCN"
  _errno[82] = "ELIBMAX"
  _errno[83] = "ELIBEXEC"
  _errno[84] = "EILSEQ"
  _errno[85] = "ERESTART"
  _errno[86] = "ESTRPIPE"
  _errno[87] = "EUSERS"
  _errno[88] = "ENOTSOCK"
  _errno[89] = "EDESTADDRREQ"
  _errno[90] = "EMSGSIZE"
  _errno[91] = "EPROTOTYPE"
  _errno[92] = "ENOPROTOOPT"
  _errno[93] = "EPROTONOSUPPORT"
  _errno[94] = "ESOCKTNOSUPPORT"
  _errno[95] = "EOPNOTSUPP"
  _errno[96] = "EPFNOSUPPORT"
  _errno[97] = "EAFNOSUPPORT"
  _errno[98] = "EADDRINUSE"
  _errno[99] = "EADDRNOTAVAIL"
  _errno[100] = "ENETDOWN"
  _errno[101] = "ENETUNREACH"
  _errno[102] = "ENETRESET"
  _errno[103] = "ECONNABORTED"
  _errno[104] = "ECONNRESET"
  _errno[105] = "ENOBUFS"
  _errno[106] = "EISCONN"
  _errno[107] = "ENOTCONN"
  _errno[108] = "ESHUTDOWN"
  _errno[109] = "ETOOMANYREFS"
  _errno[110] = "ETIMEDOUT"
  _errno[111] = "ECONNREFUSED"
  _errno[112] = "EHOSTDOWN"
  _errno[113] = "EHOSTUNREACH"
  _errno[114] = "EALREADY"
  _errno[115] = "EINPROGRESS"
  _errno[116] = "ESTALE"
  _errno[117] = "EUCLEAN"
  _errno[118] = "ENOTNAM"
  _errno[119] = "ENAVAIL"
  _errno[120] = "EISNAM"
  _errno[121] = "EREMOTEIO"
  _errno[122] = "EDQUOT"
  _errno[123] = "ENOMEDIUM"
  _errno[124] = "EMEDIUMTYPE"
  _errno[125] = "ECANCELED"
  _errno[126] = "ENOKEY"
  _errno[127] = "EKEYEXPIRED"
  _errno[128] = "EKEYREVOKED"
  _errno[129] = "EKEYREJECTED"
  _errno[130] = "EOWNERDEAD"
  _errno[131] = "ENOTRECOVERABLE"
  _errno[132] = "ERFKILL"
  _errno[133] = "EHWPOISON"
}

function err_num2str:string (error:long) {
  error = -error
  return (error in _errno ? _errno[error] : sprintf("0x%x", -error))
}

function sk_state_num2str:string (state:long) {
  return (state in _tcp_state ? _tcp_state[state] : sprintf("0x%x", state))
}

function sock_state_num2str:string (state:long)
{
  return (state in _sock_state ? _sock_state[state] : sprintf("0x%x", state))
}

function sk_shutdown_num2str:string (flags:long)
{
  /* from include/net/sock.h */
  RCV_SHUTDOWN  = 1
  SEND_SHUTDOWN = 2

  retvalue = ""
  if (flags & SEND_SHUTDOWN)
    _sock_sk_shutdown["SEND_SHUTDOWN"] = 1

  if (flags & RCV_SHUTDOWN)
    _sock_sk_shutdown["RCV_SHUTDOWN"] = 1

  foreach (key in _sock_sk_shutdown) {
    sep = retvalue == "" ? "" : "|"
    retvalue = sprintf("%s%s%s", retvalue, sep, key)
  }

  delete _sock_sk_shutdown
  return retvalue == "" ? "0" : retvalue
}

function sk2str:string (sk:long)
{
  sk_state    = sk_state_num2str(@cast(sk, "struct sock")->__sk_common->skc_state)
  sk_shutdown = sk_shutdown_num2str(@cast(sk, "struct sock")->sk_shutdown)
  sk_err      = err_num2str(-(@cast(sk, "struct sock")->sk_err))
  return sprintf("sk=0x%x, sk_state: %s, sk_err: %s, sk_shutdown: %s", sk, sk_state, sk_err, sk_shutdown)
}

function sock2str:string (sock:long)
{
  sock_state = @cast(sock, "struct socket")->state
  sock_sk    = @cast(sock, "struct socket")->sk
  return sprintf("sock=0x%x, state: %s, %s", sock, sock_state_num2str(sock_state), sk2str(sock_sk))
}

probe module("sunrpc").function("__rpc_execute").call {
  if (pid() == target()) {
      t = task_current()
      p = probefunc()
      serv      = @cast($task->tk_client, "struct rpc_clnt")->cl_server
      serv_name = kernel_string(serv)
      proc      = @cast(&$task->tk_msg, "struct rpc_message")->rpc_proc
      func      = @cast(proc, "struct rpc_procinfo")->p_name
      func_name = kernel_string(func)
      printf("[0x%x] %s -> %s: func: %s, server: %s, parms: %s\n", t, thread_indent(1), p, func_name, serv_name, $$parms)
      trace[t] = 1
  }
}

probe module("sunrpc").function("__rpc_execute").return {
  t = task_current()
  p = probefunc()
  if (trace[t]) {
    printf("[0x%x] %s <- %s: %s\n", t, thread_indent(-1), p, $$return)
    trace[t] = 0
  }
}

probe module("sunrpc").function("xs_tcp_send_request").call {
  t = task_current()
  p = probefunc()
  if (trace[t] == 1) {
    printf("[0x%x] %s -> %s: %s\n", t, thread_indent(1), p, $$parms)
    trace[t] = 2
  }
}

probe module("sunrpc").function("xs_tcp_send_request").return {
  t = task_current()
  if (trace[t] == 2) {
    printf("[0x%x] %s <- %s: %s\n", t, thread_indent(-1), probefunc(), err_num2str($return))
    trace[t] = 1
  }
}

probe kernel.function("tcp_sendmsg").call {
  t = task_current()
  if (trace[t] == 2) {
    printf("[0x%x] %s -> %s: sock: %s\n", t, thread_indent(1), probefunc(), sock2str($sock))
  }
}

probe kernel.function("tcp_sendmsg").return {
  t = task_current()
  if (trace[t] == 2) {
    printf("[0x%x] %s <- %s: %s\n", t, thread_indent(-1), probefunc(), err_num2str($return))
  }
}

probe kernel.function("sk_stream_wait_connect").call {
  t = task_current()
  if (trace[t] == 2) {
    printf("[0x%x] %s -> %s: sk: %s, time_out: %ld\n", t, thread_indent(1), probefunc(), sk2str($sk), kernel_long($timeo_p))
  }
}

probe kernel.function("sk_stream_wait_connect").return {
  t = task_current()
  if (trace[t] == 2) {
    printf("[0x%x] %s <- %s: %s\n", t, thread_indent(-1), probefunc(), err_num2str($return))
  }
}

#  vim: ts=2 sts=2 sw=2 et

[-- Attachment #8: 14.stp --]
[-- Type: text/plain, Size: 11504 bytes --]
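
# 14.stp: the script described in the mail above; tags the task_struct, the
# rpc_task and the struct sock, decodes tk_status/tk_runstate, and catches
# rpc_make_runnable and tcp_set_state from any context (with backtraces) for
# the traced task and socket.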

global trace
global sk_trace
global task_trace

global _tcp_state[13]
global _sock_state[5]
global _errno[134]

global _task_tk_runstate[3]
global _sock_sk_shutdown[2]

probe begin {
  /* from include/net/tcp_states.h */
  _tcp_state[0]  = "UNDEF"
  _tcp_state[1]  = "TCP_ESTABLISHED"
  _tcp_state[2]  = "TCP_SYN_SENT"
  _tcp_state[3]  = "TCP_SYN_RECV"
  _tcp_state[4]  = "TCP_FIN_WAIT1"
  _tcp_state[5]  = "TCP_FIN_WAIT2"
  _tcp_state[6]  = "TCP_TIME_WAIT"
  _tcp_state[7]  = "TCP_CLOSE"
  _tcp_state[8]  = "TCP_CLOSE_WAIT"
  _tcp_state[9]  = "TCP_LAST_ACK"
  _tcp_state[10] = "TCP_LISTEN"
  _tcp_state[11] = "TCP_CLOSING"
  _tcp_state[12] = "TCP_MAX_STATES"

  /* from include/linux/net.h */
  _sock_state[0] = "SS_FREE"
  _sock_state[1] = "SS_UNCONNECTED"
  _sock_state[2] = "SS_CONNECTING"
  _sock_state[3] = "SS_CONNECTED"
  _sock_state[4] = "SS_DISCONNECTING"

  _errno[0] = "0"
  _errno[1] = "EPERM"
  _errno[2] = "ENOENT"
  _errno[3] = "ESRCH"
  _errno[4] = "EINTR"
  _errno[5] = "EIO"
  _errno[6] = "ENXIO"
  _errno[7] = "E2BIG"
  _errno[8] = "ENOEXEC"
  _errno[9] = "EBADF"
  _errno[10] = "ECHILD"
  _errno[11] = "EAGAIN"
  _errno[12] = "ENOMEM"
  _errno[13] = "EACCES"
  _errno[14] = "EFAULT"
  _errno[15] = "ENOTBLK"
  _errno[16] = "EBUSY"
  _errno[17] = "EEXIST"
  _errno[18] = "EXDEV"
  _errno[19] = "ENODEV"
  _errno[20] = "ENOTDIR"
  _errno[21] = "EISDIR"
  _errno[22] = "EINVAL"
  _errno[23] = "ENFILE"
  _errno[24] = "EMFILE"
  _errno[25] = "ENOTTY"
  _errno[26] = "ETXTBSY"
  _errno[27] = "EFBIG"
  _errno[28] = "ENOSPC"
  _errno[29] = "ESPIPE"
  _errno[30] = "EROFS"
  _errno[31] = "EMLINK"
  _errno[32] = "EPIPE"
  _errno[33] = "EDOM"
  _errno[34] = "ERANGE"
  _errno[35] = "EDEADLK"
  _errno[36] = "ENAMETOOLONG"
  _errno[37] = "ENOLCK"
  _errno[38] = "ENOSYS"
  _errno[39] = "ENOTEMPTY"
  _errno[40] = "ELOOP"
  _errno[41] = "UNDEF"
  _errno[42] = "ENOMSG"
  _errno[43] = "EIDRM"
  _errno[44] = "ECHRNG"
  _errno[45] = "EL2NSYNC"
  _errno[46] = "EL3HLT"
  _errno[47] = "EL3RST"
  _errno[48] = "ELNRNG"
  _errno[49] = "EUNATCH"
  _errno[50] = "ENOCSI"
  _errno[51] = "EL2HLT"
  _errno[52] = "EBADE"
  _errno[53] = "EBADR"
  _errno[54] = "EXFULL"
  _errno[55] = "ENOANO"
  _errno[56] = "EBADRQC"
  _errno[57] = "EBADSLT"
  _errno[58] = "UNDEF"
  _errno[59] = "EBFONT"
  _errno[60] = "ENOSTR"
  _errno[61] = "ENODATA"
  _errno[62] = "ETIME"
  _errno[63] = "ENOSR"
  _errno[64] = "ENONET"
  _errno[65] = "ENOPKG"
  _errno[66] = "EREMOTE"
  _errno[67] = "ENOLINK"
  _errno[68] = "EADV"
  _errno[69] = "ESRMNT"
  _errno[70] = "ECOMM"
  _errno[71] = "EPROTO"
  _errno[72] = "EMULTIHOP"
  _errno[73] = "EDOTDOT"
  _errno[74] = "EBADMSG"
  _errno[75] = "EOVERFLOW"
  _errno[76] = "ENOTUNIQ"
  _errno[77] = "EBADFD"
  _errno[78] = "EREMCHG"
  _errno[79] = "ELIBACC"
  _errno[80] = "ELIBBAD"
  _errno[81] = "ELIBSCN"
  _errno[82] = "ELIBMAX"
  _errno[83] = "ELIBEXEC"
  _errno[84] = "EILSEQ"
  _errno[85] = "ERESTART"
  _errno[86] = "ESTRPIPE"
  _errno[87] = "EUSERS"
  _errno[88] = "ENOTSOCK"
  _errno[89] = "EDESTADDRREQ"
  _errno[90] = "EMSGSIZE"
  _errno[91] = "EPROTOTYPE"
  _errno[92] = "ENOPROTOOPT"
  _errno[93] = "EPROTONOSUPPORT"
  _errno[94] = "ESOCKTNOSUPPORT"
  _errno[95] = "EOPNOTSUPP"
  _errno[96] = "EPFNOSUPPORT"
  _errno[97] = "EAFNOSUPPORT"
  _errno[98] = "EADDRINUSE"
  _errno[99] = "EADDRNOTAVAIL"
  _errno[100] = "ENETDOWN"
  _errno[101] = "ENETUNREACH"
  _errno[102] = "ENETRESET"
  _errno[103] = "ECONNABORTED"
  _errno[104] = "ECONNRESET"
  _errno[105] = "ENOBUFS"
  _errno[106] = "EISCONN"
  _errno[107] = "ENOTCONN"
  _errno[108] = "ESHUTDOWN"
  _errno[109] = "ETOOMANYREFS"
  _errno[110] = "ETIMEDOUT"
  _errno[111] = "ECONNREFUSED"
  _errno[112] = "EHOSTDOWN"
  _errno[113] = "EHOSTUNREACH"
  _errno[114] = "EALREADY"
  _errno[115] = "EINPROGRESS"
  _errno[116] = "ESTALE"
  _errno[117] = "EUCLEAN"
  _errno[118] = "ENOTNAM"
  _errno[119] = "ENAVAIL"
  _errno[120] = "EISNAM"
  _errno[121] = "EREMOTEIO"
  _errno[122] = "EDQUOT"
  _errno[123] = "ENOMEDIUM"
  _errno[124] = "EMEDIUMTYPE"
  _errno[125] = "ECANCELED"
  _errno[126] = "ENOKEY"
  _errno[127] = "EKEYEXPIRED"
  _errno[128] = "EKEYREVOKED"
  _errno[129] = "EKEYREJECTED"
  _errno[130] = "EOWNERDEAD"
  _errno[131] = "ENOTRECOVERABLE"
  _errno[132] = "ERFKILL"
  _errno[133] = "EHWPOISON"
}

function err_num2str:string (error:long) {
  error = -error
  return (error in _errno ? _errno[error] : sprintf("0x%x", -error))
}

function sk_state_num2str:string (state:long) {
  return (state in _tcp_state ? _tcp_state[state] : sprintf("0x%x", state))
}

function sock_state_num2str:string (state:long)
{
  return (state in _sock_state ? _sock_state[state] : sprintf("0x%x", state))
}

function sk_shutdown_num2str:string (flags:long)
{
  /* from include/net/sock.h */
  RCV_SHUTDOWN  = 1
  SEND_SHUTDOWN = 2

  retvalue = ""
  if (flags & SEND_SHUTDOWN)
    _sock_sk_shutdown["SEND_SHUTDOWN"] = 1

  if (flags & RCV_SHUTDOWN)
    _sock_sk_shutdown["RCV_SHUTDOWN"] = 1

  foreach (key in _sock_sk_shutdown) {
    sep = retvalue == "" ? "" : "|"
    retvalue = sprintf("%s%s%s", retvalue, sep, key)
  }

  delete _sock_sk_shutdown
  return retvalue == "" ? "0" : retvalue
}

function sk2str:string (sk:long)
{
  sk_state    = sk_state_num2str(@cast(sk, "struct sock")->__sk_common->skc_state)
  sk_shutdown = sk_shutdown_num2str(@cast(sk, "struct sock")->sk_shutdown)
  sk_err      = err_num2str(-(@cast(sk, "struct sock")->sk_err))
  return sprintf("sk=0x%x, sk_state: %s, sk_err: %s, sk_shutdown: %s", sk, sk_state, sk_err, sk_shutdown)
}

function sock2str:string (sock:long)
{
  sock_state = @cast(sock, "struct socket")->state
  sock_sk    = @cast(sock, "struct socket")->sk
  return sprintf("sock=0x%x, state: %s, %s", sock, sock_state_num2str(sock_state), sk2str(sock_sk))
}

function task_runstate_num2str:string (word:long)
{
  /* from include/linux/sunrpc/sched.h but watch out, needs translation from
   * the values there for use in '&' as they use `test_bit'
   */

  RPC_TASK_RUNNING = 1
  RPC_TASK_QUEUED  = 2
  RPC_TASK_ACTIVE  = 4

  retvalue = ""
  if (word & RPC_TASK_RUNNING)
    _task_tk_runstate["RPC_TASK_RUNNING"] = 1
  if (word & RPC_TASK_QUEUED)
    _task_tk_runstate["RPC_TASK_QUEUED"] = 1
  if (word & RPC_TASK_ACTIVE)
    _task_tk_runstate["RPC_TASK_ACTIVE"] = 1

  foreach (key in _task_tk_runstate) {
    sep = retvalue == "" ? "" : "|"
    retvalue = sprintf("%s%s%s", retvalue, sep, key)
  }

  delete _task_tk_runstate
  retvalue = retvalue == "" ? "0" : retvalue
  return sprintf("tk_runstate: %s", retvalue)
}

function task2str:string (task:long)
{
  tk_runstate = @cast(task, "struct rpc_task", "kernel<linux/sunrpc/sched.h>")->tk_runstate
  tk_status   = @cast(task, "struct rpc_task", "kernel<linux/sunrpc/sched.h>")->tk_status
  return sprintf("tk_status: %s, %s", err_num2str(tk_status), task_runstate_num2str(tk_runstate))
}

probe module("sunrpc").function("__rpc_execute").call {
  if (pid() == target()) {
      t = task_current()
      p = probefunc()

      /* pull the server name */
      serv      = @cast($task->tk_client, "struct rpc_clnt")->cl_server
      serv_name = kernel_string(serv)

      proc      = @cast(&$task->tk_msg, "struct rpc_message")->rpc_proc
      func      = @cast(proc, "struct rpc_procinfo")->p_name
      func_name = kernel_string(func)

      printf("%d [0x%x] %s -> %s: func: %s, server: %s, task: %s, parms: %s\n", gettimeofday_ns(), t, thread_indent(1), p, func_name, serv_name, task2str($task), $$parms)
      trace[t] = 1
      task_trace[$task] = 1
  }
}

probe module("sunrpc").function("__rpc_execute").return {
  t = task_current()
  p = probefunc()
  if (trace[t]) {
    printf("%d [0x%x] %s <- %s: %s\n", gettimeofday_ns(), t, thread_indent(-1), p, $$return)
    trace[t] = 0
    delete task_trace
  }
}

probe module("sunrpc").function("call_*").call {
  t = task_current()
  p = probefunc()
  if (trace[t]) {
    printf("%d [0x%x] %s -> %s: task: %s, parms: %s\n", gettimeofday_ns(), t, thread_indent(1), p, task2str($task), $$parms)
  }
}

probe module("sunrpc").function("call_*").return {
  t = task_current()
  if (trace[t]) {
    printf("%d [0x%x] %s <- %s: %s\n", gettimeofday_ns(), t, thread_indent(-1), probefunc(), $$return)
  }
}

probe module("sunrpc").function("xs_tcp_send_request").call {
  t = task_current()
  p = probefunc()
  if (trace[t]) {
    rq_xprt     = @cast($task->tk_rqstp, "struct rpc_rqst")->rq_xprt
    sock        = @cast(rq_xprt - (& @cast(0, "struct sock_xprt")->xprt), "struct sock_xprt")->sock
    sk          = @cast(sock, "struct socket")->sk
    printf("%d [0x%x] %s -> %s: sock: %s, task: %s, parms: %s\n", gettimeofday_ns(), t, thread_indent(1), p, sock2str(sock), task2str($task), $$parms)
    sk_trace[sk] = 1
  }
}

probe module("sunrpc").function("xs_tcp_send_request").return {
  t = task_current()
  if (trace[t]) {
    printf("%d [0x%x] %s <- %s: %s\n", gettimeofday_ns(), t, thread_indent(-1), probefunc(), err_num2str($return))
  }
}

probe module("sunrpc").function("xs_connect").call {
  t = task_current()
  if (trace[t]) {
    rq_xprt     = @cast($task->tk_rqstp, "struct rpc_rqst")->rq_xprt
    sock        = @cast(rq_xprt - (& @cast(0, "struct sock_xprt")->xprt), "struct sock_xprt")->sock
    printf("%d [0x%x] %s -> %s: sock: %s, task: %s, parms: %s\n", gettimeofday_ns(), t, thread_indent(1), probefunc(), sock2str(sock), task2str($task), $$parms)
  }
}

probe module("sunrpc").function("xs_connect").return {
  t = task_current()
  if (trace[t]) {
    printf("%d [0x%x] %s <- %s\n", gettimeofday_ns(), t, thread_indent(-1), probefunc())
  }
}

probe module("sunrpc").function("rpc_wait_bit_killable").call {
  t = task_current()
  if (trace[t]) {
    printf("%d [0x%x] %s -> %s: parms: %s\n", gettimeofday_ns(), t, thread_indent(1), probefunc(), $$parms)
  }
}

probe module("sunrpc").function("rpc_wait_bit_killable").return {
  t = task_current()
  if (trace[t]) {
    printf("%d [0x%x] %s <- %s\n", gettimeofday_ns(), t, thread_indent(-1), probefunc())
  }
}

probe kernel.function("sk_stream_wait_connect").call {
  t = task_current()
  p = probefunc()
  if (trace[t]) {
    printf("%d [0x%x] %s -> %s: sk: %s, time_out: %d, parms: %s\n", gettimeofday_ns(), t, thread_indent(1), p, sk2str($sk), kernel_long($timeo_p), $$parms)
  }
}

probe kernel.function("sk_stream_wait_connect").return {
  t = task_current()
  if (trace[t]) {
    printf("%d [0x%x] %s <- %s: %s\n", gettimeofday_ns(), t, thread_indent(-1), probefunc(), err_num2str($return))
  }
}

probe kernel.function("out_of_line_wait_on_bit").call {
  t = task_current()
  if (trace[t]) {
    printf("%d [0x%x] %s -> %s: word: %s, parms: %s\n", gettimeofday_ns(), t, thread_indent(1), probefunc(), task_runstate_num2str(kernel_long($word)), $$parms)
  }
}

probe kernel.function("out_of_line_wait_on_bit").return {
  t = task_current()
  if (trace[t]) {
    printf("%d [0x%x] %s <- %s\n", gettimeofday_ns(), t, thread_indent(-1), probefunc())
  }
}

probe module("sunrpc").function("rpc_make_runnable").call {
  t = task_current()
  if (task_trace[$task]) {
    printf("%d [0x%x] %s -> %s: task: %s, parms: %s\n", gettimeofday_ns(), t, execname(), probefunc(), task2str($task), $$parms)
    print_backtrace()
  }
}

probe kernel.function("tcp_set_state").call {
  if (sk_trace[$sk]) {
    t = task_current()
    printf("%d [0x%x] %s -> %s: sk: %s, NEW sk_state: %s\n",
            gettimeofday_ns(), t, execname(), probefunc(), sk2str($sk), sk_state_num2str($state))
    print_backtrace()
  }
}

[-- Attachment #9: rpciod.stp --]
[-- Type: text/plain, Size: 7185 bytes --]
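
# rpciod.stp: run without -x; traces xs_tcp_setup_socket (and kernel_connect,
# sunrpc and net/socket.c functions under it) whenever rpciod executes it.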

global trace

global _tcp_state[13]
global _sock_state[5]
global _errno[134]

global _sock_sk_shutdown[2]

probe begin {
  /* from include/net/tcp_states.h */
  _tcp_state[0]  = "UNDEF"
  _tcp_state[1]  = "TCP_ESTABLISHED"
  _tcp_state[2]  = "TCP_SYN_SENT"
  _tcp_state[3]  = "TCP_SYN_RECV"
  _tcp_state[4]  = "TCP_FIN_WAIT1"
  _tcp_state[5]  = "TCP_FIN_WAIT2"
  _tcp_state[6]  = "TCP_TIME_WAIT"
  _tcp_state[7]  = "TCP_CLOSE"
  _tcp_state[8]  = "TCP_CLOSE_WAIT"
  _tcp_state[9]  = "TCP_LAST_ACK"
  _tcp_state[10] = "TCP_LISTEN"
  _tcp_state[11] = "TCP_CLOSING"
  _tcp_state[12] = "TCP_MAX_STATES"

  /* from include/linux/net.h */
  _sock_state[0] = "SS_FREE"
  _sock_state[1] = "SS_UNCONNECTED"
  _sock_state[2] = "SS_CONNECTING"
  _sock_state[3] = "SS_CONNECTED"
  _sock_state[4] = "SS_DISCONNECTING"

  _errno[0] = "0"
  _errno[1] = "EPERM"
  _errno[2] = "ENOENT"
  _errno[3] = "ESRCH"
  _errno[4] = "EINTR"
  _errno[5] = "EIO"
  _errno[6] = "ENXIO"
  _errno[7] = "E2BIG"
  _errno[8] = "ENOEXEC"
  _errno[9] = "EBADF"
  _errno[10] = "ECHILD"
  _errno[11] = "EAGAIN"
  _errno[12] = "ENOMEM"
  _errno[13] = "EACCES"
  _errno[14] = "EFAULT"
  _errno[15] = "ENOTBLK"
  _errno[16] = "EBUSY"
  _errno[17] = "EEXIST"
  _errno[18] = "EXDEV"
  _errno[19] = "ENODEV"
  _errno[20] = "ENOTDIR"
  _errno[21] = "EISDIR"
  _errno[22] = "EINVAL"
  _errno[23] = "ENFILE"
  _errno[24] = "EMFILE"
  _errno[25] = "ENOTTY"
  _errno[26] = "ETXTBSY"
  _errno[27] = "EFBIG"
  _errno[28] = "ENOSPC"
  _errno[29] = "ESPIPE"
  _errno[30] = "EROFS"
  _errno[31] = "EMLINK"
  _errno[32] = "EPIPE"
  _errno[33] = "EDOM"
  _errno[34] = "ERANGE"
  _errno[35] = "EDEADLK"
  _errno[36] = "ENAMETOOLONG"
  _errno[37] = "ENOLCK"
  _errno[38] = "ENOSYS"
  _errno[39] = "ENOTEMPTY"
  _errno[40] = "ELOOP"
  _errno[41] = "UNDEF"
  _errno[42] = "ENOMSG"
  _errno[43] = "EIDRM"
  _errno[44] = "ECHRNG"
  _errno[45] = "EL2NSYNC"
  _errno[46] = "EL3HLT"
  _errno[47] = "EL3RST"
  _errno[48] = "ELNRNG"
  _errno[49] = "EUNATCH"
  _errno[50] = "ENOCSI"
  _errno[51] = "EL2HLT"
  _errno[52] = "EBADE"
  _errno[53] = "EBADR"
  _errno[54] = "EXFULL"
  _errno[55] = "ENOANO"
  _errno[56] = "EBADRQC"
  _errno[57] = "EBADSLT"
  _errno[58] = "UNDEF"
  _errno[59] = "EBFONT"
  _errno[60] = "ENOSTR"
  _errno[61] = "ENODATA"
  _errno[62] = "ETIME"
  _errno[63] = "ENOSR"
  _errno[64] = "ENONET"
  _errno[65] = "ENOPKG"
  _errno[66] = "EREMOTE"
  _errno[67] = "ENOLINK"
  _errno[68] = "EADV"
  _errno[69] = "ESRMNT"
  _errno[70] = "ECOMM"
  _errno[71] = "EPROTO"
  _errno[72] = "EMULTIHOP"
  _errno[73] = "EDOTDOT"
  _errno[74] = "EBADMSG"
  _errno[75] = "EOVERFLOW"
  _errno[76] = "ENOTUNIQ"
  _errno[77] = "EBADFD"
  _errno[78] = "EREMCHG"
  _errno[79] = "ELIBACC"
  _errno[80] = "ELIBBAD"
  _errno[81] = "ELIBSCN"
  _errno[82] = "ELIBMAX"
  _errno[83] = "ELIBEXEC"
  _errno[84] = "EILSEQ"
  _errno[85] = "ERESTART"
  _errno[86] = "ESTRPIPE"
  _errno[87] = "EUSERS"
  _errno[88] = "ENOTSOCK"
  _errno[89] = "EDESTADDRREQ"
  _errno[90] = "EMSGSIZE"
  _errno[91] = "EPROTOTYPE"
  _errno[92] = "ENOPROTOOPT"
  _errno[93] = "EPROTONOSUPPORT"
  _errno[94] = "ESOCKTNOSUPPORT"
  _errno[95] = "EOPNOTSUPP"
  _errno[96] = "EPFNOSUPPORT"
  _errno[97] = "EAFNOSUPPORT"
  _errno[98] = "EADDRINUSE"
  _errno[99] = "EADDRNOTAVAIL"
  _errno[100] = "ENETDOWN"
  _errno[101] = "ENETUNREACH"
  _errno[102] = "ENETRESET"
  _errno[103] = "ECONNABORTED"
  _errno[104] = "ECONNRESET"
  _errno[105] = "ENOBUFS"
  _errno[106] = "EISCONN"
  _errno[107] = "ENOTCONN"
  _errno[108] = "ESHUTDOWN"
  _errno[109] = "ETOOMANYREFS"
  _errno[110] = "ETIMEDOUT"
  _errno[111] = "ECONNREFUSED"
  _errno[112] = "EHOSTDOWN"
  _errno[113] = "EHOSTUNREACH"
  _errno[114] = "EALREADY"
  _errno[115] = "EINPROGRESS"
  _errno[116] = "ESTALE"
  _errno[117] = "EUCLEAN"
  _errno[118] = "ENOTNAM"
  _errno[119] = "ENAVAIL"
  _errno[120] = "EISNAM"
  _errno[121] = "EREMOTEIO"
  _errno[122] = "EDQUOT"
  _errno[123] = "ENOMEDIUM"
  _errno[124] = "EMEDIUMTYPE"
  _errno[125] = "ECANCELED"
  _errno[126] = "ENOKEY"
  _errno[127] = "EKEYEXPIRED"
  _errno[128] = "EKEYREVOKED"
  _errno[129] = "EKEYREJECTED"
  _errno[130] = "EOWNERDEAD"
  _errno[131] = "ENOTRECOVERABLE"
  _errno[132] = "ERFKILL"
  _errno[133] = "EHWPOISON"
}

/**** Helpers ****/

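/*
 * Callers pass in negative errno values (e.g. the $return value of
 * kernel_connect()), so negate before the table lookup; values not in
 * the table fall back to a hex representation.
 */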
function err_num2str:string (error:long) {
  error = -error
  return (error in _errno ? _errno[error] : sprintf("0x%x", -error))
}

function sk_state_num2str:string (state:long) {
  return (state in _tcp_state ? _tcp_state[state] : sprintf("0x%x", state))
}

function sock_state_num2str:string (state:long)
{
  return (state in _sock_state ? _sock_state[state] : sprintf("0x%x", state))
}

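/*
 * Decode the sk_shutdown bitmask into a "RCV_SHUTDOWN|SEND_SHUTDOWN"
 * style string.  The scratch global _sock_sk_shutdown collects the set
 * flag names and is cleared again before returning.
 */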
function sk_shutdown_num2str:string (flags:long)
{
  /* from include/net/sock.h */
  RCV_SHUTDOWN  = 1
  SEND_SHUTDOWN = 2

  retvalue = ""
  if (flags & SEND_SHUTDOWN)
    _sock_sk_shutdown["SEND_SHUTDOWN"] = 1

  if (flags & RCV_SHUTDOWN)
    _sock_sk_shutdown["RCV_SHUTDOWN"] = 1

  foreach (key in _sock_sk_shutdown) {
    sep = retvalue == "" ? "" : "|"
    retvalue = sprintf("%s%s%s", retvalue, sep, key)
  }

  delete _sock_sk_shutdown
  return retvalue == "" ? "0" : retvalue
}

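/*
 * Render the interesting fields of a struct sock / struct socket via
 * @cast so the probes below can print socket state without embedded C.
 */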
function sk2str:string (sk:long)
{
  sk_state    = sk_state_num2str(@cast(sk, "struct sock")->__sk_common->skc_state)
  sk_shutdown = sk_shutdown_num2str(@cast(sk, "struct sock")->sk_shutdown)
  sk_err      = err_num2str(-(@cast(sk, "struct sock")->sk_err))
  return sprintf("sk=0x%x, sk_state: %s, sk_err: %s, sk_shutdown: %s", sk, sk_state, sk_err, sk_shutdown)
}

function sock2str:string (sock:long)
{
  sock_state = @cast(sock, "struct socket")->state
  sock_sk    = @cast(sock, "struct socket")->sk
  return sprintf("sock=0x%x, state: %s, %s", sock, sock_state_num2str(sock_state), sk2str(sock_sk))
}

/**** Actual Probes Below ****/

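/*
 * xs_tcp_setup_socket() is the entry point rpciod uses to (re)connect
 * the transport socket: its .call probe turns tracing on for the current
 * task and the matching .return probe turns it off.  While tracing is on,
 * every sunrpc and net/socket.c function call/return made by that task is
 * printed with thread_indent() to show the call tree.
 */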
probe module("sunrpc").function("xs_tcp_setup_socket").call {
  if (isinstr(execname(), "rpciod")) {
    t = task_current()
    trace[t] = 1
    printf("%d [0x%x]: %s -> %s: parms: %s\n", gettimeofday_ns(), t, thread_indent(1), probefunc(), $$parms)
  }
}

probe module("sunrpc").function("xs_tcp_setup_socket").return {
  t = task_current()
  if (trace[t]) {
    printf("%d [0x%x]: %s <- %s: %s\n", gettimeofday_ns(), t, thread_indent(-1), probefunc(), $$return)
  }
  trace[t] = 0
}

probe kernel.function("kernel_connect").call {
  t = task_current()
  if (trace[t]) {
    printf("%d [0x%x]: %s -> %s: sock: %s, parms: %s\n", gettimeofday_ns(), t, thread_indent(1), probefunc(), sock2str($sock), $$parms)
  }
}

probe kernel.function("kernel_connect").return {
  t = task_current()
  if (trace[t]) {
    printf("%d [0x%x]: %s <- %s: %s\n", gettimeofday_ns(), t, thread_indent(-1), probefunc(), err_num2str($return))
  }
}

probe module("sunrpc").function("*").call,
      kernel.function("*@net/socket.c").call {
  t = task_current()
  if (trace[t]) {
    printf("%d [0x%x]: %s -> %s: parms: %s\n", gettimeofday_ns(), t, thread_indent(1), probefunc(), $$parms)
  }
}

probe module("sunrpc").function("*").return,
      kernel.function("*@net/socket.c").return {
  t = task_current()
  if (trace[t]) {
    printf("%d [0x%x]: %s <- %s: %s\n", gettimeofday_ns(), t, thread_indent(-1), probefunc(), $$return)
  }
}

* Re: RPC Race Condition
  2012-10-24 16:10                       ` Chris Perl
@ 2012-10-24 20:19                         ` Chris Perl
  0 siblings, 0 replies; 14+ messages in thread
From: Chris Perl @ 2012-10-24 20:19 UTC (permalink / raw)
  To: Myklebust, Trond; +Cc: linux-nfs

On Wed, Oct 24, 2012 at 12:10:04PM -0400, Chris Perl wrote:
> On Wed, Oct 24, 2012 at 02:56:55PM +0000, Myklebust, Trond wrote:
> > OK, I've now posted the patches in the "bugfixes" branch of
> > 
> >        git://git.linux-nfs.org/projects/trondmy/linux-nfs.git
> 
> Cool, I'll try to get that pulled, compiled and tested by the end of the
> day.

Pulled, compiled, and tested it in the same manner as described before,
without any problems.

Thanks for getting this resolved!

Chris

Thread overview: 14+ messages
2012-10-22 18:03 RPC Race Condition Chris Perl
2012-10-22 18:28 ` Myklebust, Trond
2012-10-22 20:26   ` Chris Perl
2012-10-22 21:20     ` Myklebust, Trond
2012-10-23 13:45       ` Chris Perl
2012-10-23 16:02         ` Myklebust, Trond
2012-10-23 20:08           ` Chris Perl
2012-10-23 21:58             ` Myklebust, Trond
2012-10-24 12:40               ` Chris Perl
2012-10-24 13:04                 ` Myklebust, Trond
2012-10-24 14:23                   ` Chris Perl
2012-10-24 14:56                     ` Myklebust, Trond
2012-10-24 16:10                       ` Chris Perl
2012-10-24 20:19                         ` Chris Perl
