* [PATCH] NFS regression in 2.6.26?, "task blocked for more than 120 seconds"
@ 2008-10-17 12:32 Max Kellermann
  2008-10-17 14:33 ` Glauber Costa
  2008-10-20  6:27 ` Ian Campbell
  0 siblings, 2 replies; 131+ messages in thread
From: Max Kellermann @ 2008-10-17 12:32 UTC (permalink / raw)
  To: linux-kernel, gcosta, ijc

Hi,

Ian: this is a follow-up to your post "NFS regression? Odd delays and
lockups accessing an NFS export" a few weeks ago
(http://lkml.org/lkml/2008/9/27/42).

I am able to trigger this bug within a few minutes on a customer's
machine (large web hoster, a *lot* of NFS traffic).

Symptom: with 2.6.26 (2.6.27.1, too), load goes to 100+, dmesg says
"INFO: task migration/2:9 blocked for more than 120 seconds." with
varying task names.  Except for the high load average, the machine
seems to work.

With git bisect, I was finally able to identify the guilty commit;
it's not "Ensure we zap only the access and acl caches when setting
new acls" like you guessed, Ian.  According to my bisect,
6becedbb06072c5741d4057b9facecb4b3143711 is the origin of the problem.
e481fcf8563d300e7f8875cae5fdc41941d29de0 (its parent) works well.
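
For anyone retracing this, the mechanics of the hunt compress into a
throwaway toy repo as follows (a sketch: the repo, file name and "bug"
marker are invented for illustration; the real bisect replaces the grep
with a build, reboot and NFS soak test, and the endpoints are the
kernels named above):

```shell
#!/bin/sh
# Miniature stand-in for the kernel bisect: plant a "regression" in a
# toy repo, then let `git bisect run` find the first bad commit.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email demo@example.com
git config user.name  demo
echo ok  > module.c; git add module.c; git commit -qm "good: baseline"
git commit -qm "good: unrelated change" --allow-empty
echo bug > module.c; git add module.c; git commit -qm "bad: regression"
git commit -qm "bad: unrelated change" --allow-empty

git bisect start HEAD HEAD~3 >/dev/null    # bad tip, good baseline
# The stand-in "test boot": a revision is good when the marker is absent.
git bisect run sh -c '! grep -q bug module.c' | grep 'first bad commit'
git bisect reset >/dev/null                # back to the original branch
```

The printed commit is the one that introduced the marker, just as the
real run named 6becedbb as the origin of the hangs.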

Glauber: that is your patch "x86: minor adjustments for do_boot_cpu"
(http://lkml.org/lkml/2008/3/19/143).  I don't understand this patch
well, and I fail to see a connection with the symptom, but maybe
somebody else does...

See the patch below (applies to 2.6.27.1).  So far, it looks like the
problem is solved on the server, with no visible side effects.

Max


Revert "x86: minor adjustments for do_boot_cpu"

According to a bisect, Glauber Costa's patch induced high load and
"task ... blocked for more than 120 seconds" messages in dmesg.  This
patch reverts 6becedbb06072c5741d4057b9facecb4b3143711.

Signed-off-by: Max Kellermann <mk@cm4all.com>
---

 arch/x86/kernel/smpboot.c |   21 ++++++++-------------
 1 files changed, 8 insertions(+), 13 deletions(-)


diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 7985c5b..789cf84 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -808,7 +808,7 @@ static int __cpuinit do_boot_cpu(int apicid, int cpu)
  * Returns zero if CPU booted OK, else error code from wakeup_secondary_cpu.
  */
 {
-	unsigned long boot_error = 0;
+	unsigned long boot_error;
 	int timeout;
 	unsigned long start_ip;
 	unsigned short nmi_high = 0, nmi_low = 0;
@@ -828,7 +828,11 @@ static int __cpuinit do_boot_cpu(int apicid, int cpu)
 	}
 #endif
 
-	alternatives_smp_switch(1);
+	/*
+	 * Save current MTRR state in case it was changed since early boot
+	 * (e.g. by the ACPI SMI) to initialize new CPUs with MTRRs in sync:
+	 */
+	mtrr_save_state();
 
 	c_idle.idle = get_idle_for_cpu(cpu);
 
@@ -873,6 +877,8 @@ do_rest:
 	/* start_ip had better be page-aligned! */
 	start_ip = setup_trampoline();
 
+	alternatives_smp_switch(1);
+
 	/* So we see what's up   */
 	printk(KERN_INFO "Booting processor %d/%d ip %lx\n",
 			  cpu, apicid, start_ip);
@@ -891,11 +897,6 @@ do_rest:
 		store_NMI_vector(&nmi_high, &nmi_low);
 
 		smpboot_setup_warm_reset_vector(start_ip);
-		/*
-		 * Be paranoid about clearing APIC errors.
-	 	*/
-		apic_write(APIC_ESR, 0);
-		apic_read(APIC_ESR);
 	}
 
 	/*
@@ -986,12 +987,6 @@ int __cpuinit native_cpu_up(unsigned int cpu)
 		return -ENOSYS;
 	}
 
-	/*
-	 * Save current MTRR state in case it was changed since early boot
-	 * (e.g. by the ACPI SMI) to initialize new CPUs with MTRRs in sync:
-	 */
-	mtrr_save_state();
-
 	per_cpu(cpu_state, cpu) = CPU_UP_PREPARE;
 
 #ifdef CONFIG_X86_32

^ permalink raw reply related	[flat|nested] 131+ messages in thread

* Re: [PATCH] NFS regression in 2.6.26?, "task blocked for more than 120 seconds"
  2008-10-17 12:32 [PATCH] NFS regression in 2.6.26?, "task blocked for more than 120 seconds" Max Kellermann
@ 2008-10-17 14:33 ` Glauber Costa
  2008-10-20  6:51   ` Max Kellermann
  2008-10-20  6:27 ` Ian Campbell
  1 sibling, 1 reply; 131+ messages in thread
From: Glauber Costa @ 2008-10-17 14:33 UTC (permalink / raw)
  To: Max Kellermann; +Cc: linux-kernel, gcosta, ijc

On Fri, Oct 17, 2008 at 02:32:07PM +0200, Max Kellermann wrote:
> Hi,
> 
> Ian: this is a follow-up to your post "NFS regression? Odd delays and
> lockups accessing an NFS export" a few weeks ago
> (http://lkml.org/lkml/2008/9/27/42).
> 
> I am able to trigger this bug within a few minutes on a customer's
> machine (large web hoster, a *lot* of NFS traffic).
> 
> Symptom: with 2.6.26 (2.6.27.1, too), load goes to 100+, dmesg says
> "INFO: task migration/2:9 blocked for more than 120 seconds." with
> varying task names.  Except for the high load average, the machine
> seems to work.
> 
> With git bisect, I was finally able to identify the guilty commit,
> it's not "Ensure we zap only the access and acl caches when setting
> new acls" like you guessed, Ian.  According to my bisect,
> 6becedbb06072c5741d4057b9facecb4b3143711 is the origin of the problem.
> e481fcf8563d300e7f8875cae5fdc41941d29de0 (its parent) works well.
> 
> Glauber: that is your patch "x86: minor adjustments for do_boot_cpu"
> (http://lkml.org/lkml/2008/3/19/143).  I don't understand this patch
> well, and I fail to see a connection with the symptom, but maybe
> somebody else does...
> 
> See patch below (applies to 2.6.27.1).  So far, it looks like the
> problem is solved on the server, no visible side effects.
> 
> Max
That's probably something related to APIC congestion.
Does the problem go away if the only thing you change is this:


> @@ -891,11 +897,6 @@ do_rest:
>  		store_NMI_vector(&nmi_high, &nmi_low);
>  
>  		smpboot_setup_warm_reset_vector(start_ip);
> -		/*
> -		 * Be paranoid about clearing APIC errors.
> -	 	*/
> -		apic_write(APIC_ESR, 0);
> -		apic_read(APIC_ESR);
>  	}


Please let me know.



* Re: [PATCH] NFS regression in 2.6.26?, "task blocked for more than 120 seconds"
  2008-10-17 12:32 [PATCH] NFS regression in 2.6.26?, "task blocked for more than 120 seconds" Max Kellermann
  2008-10-17 14:33 ` Glauber Costa
@ 2008-10-20  6:27 ` Ian Campbell
  2008-11-01 11:45   ` Ian Campbell
  1 sibling, 1 reply; 131+ messages in thread
From: Ian Campbell @ 2008-10-20  6:27 UTC (permalink / raw)
  To: Max Kellermann
  Cc: linux-kernel, gcosta, Grant Coady, Trond Myklebust,
	J. Bruce Fields, Tom Tucker

[-- Attachment #1: Type: text/plain, Size: 3490 bytes --]

(adding back some CC's, please don't drop people)

On Fri, 2008-10-17 at 14:32 +0200, Max Kellermann wrote:
> Ian: this is a follow-up to your post "NFS regression? Odd delays and
> lockups accessing an NFS export" a few weeks ago
> (http://lkml.org/lkml/2008/9/27/42).
>
> I am able to trigger this bug within a few minutes on a customer's
> machine (large web hoster, a *lot* of NFS traffic).
> 
> Symptom: with 2.6.26 (2.6.27.1, too), load goes to 100+, dmesg says
> "INFO: task migration/2:9 blocked for more than 120 seconds." with
> varying task names.  Except for the high load average, the machine
> seems to work.
> 
> With git bisect, I was finally able to identify the guilty commit,
> it's not "Ensure we zap only the access and acl caches when setting
> new acls" like you guessed, Ian.  According to my bisect,
> 6becedbb06072c5741d4057b9facecb4b3143711 is the origin of the problem.
> e481fcf8563d300e7f8875cae5fdc41941d29de0 (its parent) works well.

The issue I see still occurs in kernels from well before those
changesets.  I have seen it with v2.6.25, but v2.6.24 survived for 7
days without issue (my threshold for a good kernel is 7 days, hence
bisecting is a bit slow...).

So far I have bisected down to this range and am currently testing
acee478, which has been up for >4 days.

$ git bisect visualize  --pretty=oneline  
bdc7f021f3a1fade77adf3c2d7f65690566fddfe NFS: Clean up the (commit|read|write)_setup() callback routines
3ff7576ddac06c3d07089e241b40826d24bbf1ac SUNRPC: Clean up the initialisation of priority queue scheduling info.
c970aa85e71bd581726c42df843f6f129db275ac SUNRPC: Clean up rpc_run_task
84115e1cd4a3614c4e566d4cce31381dce3dbef9 SUNRPC: Cleanup of rpc_task initialisation
ef818a28fac9bd214e676986d8301db0582b92a9 NFS: Stop sillyname renames and unmounts from racing
2f74c0a05612b9c2014b5b67833dba9b9f523948 NFSv4: Clean up the OPEN/CLOSE serialisation code
acee478afc6ff7e1b8852d9a4dca1ff36021414d NFS: Clean up the write request locking.
8b1f9ee56e21e505a3d5d3e33f823006d1abdbaf NFS: Optimise nfs_vm_page_mkwrite()
77f111929d024165e736e919187cff017279bebe NFS: Ensure that we eject stale inodes as soon as possible
d45b9d8baf41acb177abbbe6746b1dea094b8a28 NFS: Handle -ENOENT errors in unlink()/rmdir()/rename()
609005c319bc6062b95ed82e132884ed7e22cdb9 NFS: Sillyrename: in the case of a race, check aliases are really positive
fccca7fc6aab4e6b519e2d606ef34632e4f50e33 NFS: Fix a sillyrename race...

Note that this bisect is over fs/nfs only, so it's possible that I
might drop off the beginning and have to bisect the 3878 commits
between v2.6.24 and fccca7f.  I hope not!  acee478 looks good so far.

$ git bisect log
# bad: [4b119e21d0c66c22e8ca03df05d9de623d0eb50f] Linux 2.6.25
# good: [49914084e797530d9baaf51df9eda77babc98fa8] Linux 2.6.24
git-bisect start 'v2.6.25' 'v2.6.24' '--' 'fs/nfs'
# bad: [4c5680177012a2b5c0f3fdf58f4375dd84a1da67] NFS: Support non-IPv4 addresses in nfs_parsed_mount_data
git-bisect bad 4c5680177012a2b5c0f3fdf58f4375dd84a1da67
# bad: [d45273ed6f4613e81701c3e896d9db200c288fff] NFS: Clean up address comparison in __nfs_find_client()
git-bisect bad d45273ed6f4613e81701c3e896d9db200c288fff
# bad: [bdc7f021f3a1fade77adf3c2d7f65690566fddfe] NFS: Clean up the (commit|read|write)_setup() callback routines
git-bisect bad bdc7f021f3a1fade77adf3c2d7f65690566fddfe
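
Since each verdict here costs up to a week of uptime, the bisect state
is worth checkpointing; `git bisect log` / `git bisect replay` do
exactly that.  Shown below on a throwaway repo (toy commits invented
for illustration; the real session starts with the fs/nfs-limited
invocation from the log above):

```shell
#!/bin/sh
# Checkpoint a long-running bisect with `git bisect log` / `replay`,
# demonstrated on a throwaway repo.  The real session begins with:
#   git bisect start v2.6.25 v2.6.24 -- fs/nfs
set -e
repo=$(mktemp -d); cd "$repo"; git init -q
git config user.email demo@example.com
git config user.name  demo
for i in 1 2 3 4 5; do git commit -qm "commit $i" --allow-empty; done
good1=$(git rev-parse HEAD~3)              # a candidate that soaked clean

git bisect start HEAD HEAD~4 >/dev/null    # bad tip, good baseline
git bisect good "$good1"     >/dev/null    # record a week-long verdict
git bisect log > bisect.log                # checkpoint after each verdict

git bisect reset >/dev/null                # e.g. the box was rebooted
git bisect replay bisect.log >/dev/null    # resume exactly where we were
git bisect log                             # the verdicts survived the reset
git bisect reset >/dev/null
```

To undo a mistaken verdict, delete its line from bisect.log and replay
the edited file.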

Ian.
-- 
Ian Campbell

"It is easier to fight for principles than to live up to them."
		-- Alfred Adler

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 197 bytes --]


* Re: [PATCH] NFS regression in 2.6.26?, "task blocked for more than 120 seconds"
  2008-10-17 14:33 ` Glauber Costa
@ 2008-10-20  6:51   ` Max Kellermann
  2008-10-20  7:43     ` Ian Campbell
                       ` (2 more replies)
  0 siblings, 3 replies; 131+ messages in thread
From: Max Kellermann @ 2008-10-20  6:51 UTC (permalink / raw)
  To: Glauber Costa
  Cc: linux-kernel, ijc, Grant Coady, Trond Myklebust, J. Bruce Fields,
	Tom Tucker

On 2008/10/17 16:33, Glauber Costa <glommer@redhat.com> wrote:
> That's probably something related to apic congestion.
> Does the problem go away if the only thing you change is this:
> 
> 
> > @@ -891,11 +897,6 @@ do_rest:
> >  		store_NMI_vector(&nmi_high, &nmi_low);
> >  
> >  		smpboot_setup_warm_reset_vector(start_ip);
> > -		/*
> > -		 * Be paranoid about clearing APIC errors.
> > -	 	*/
> > -		apic_write(APIC_ESR, 0);
> > -		apic_read(APIC_ESR);
> >  	}
> 
> 
> Please let me know.

Hello Glauber,

I have rebooted the server with 2.6.27.1 + this patchlet an hour ago.
No problems since.

Hardware: Compaq P4 Xeon server, Broadcom CMIC-WS / CIOB-X2 board.
Tell me if you need more detailed information.


On 2008/10/20 08:27, Ian Campbell <ijc@hellion.org.uk> wrote:
> The issue I see still occurs well before those changesets. I have
> seen it with v2.6.25 but v2.6.24 survived for 7 days without issue
> (my threshold for a good kernel is 7 days, hence bisecting is a bit
> slow...).

Hello Ian,

It seems we're hunting down different bugs after all.  Too bad, I had
hoped I could solve your problem, too.  Our machine has been running
well over the weekend with the patch I posted; with faulty kernels,
the problem would occur after a few minutes.

Max


* Re: [PATCH] NFS regression in 2.6.26?, "task blocked for more than 120 seconds"
  2008-10-20  6:51   ` Max Kellermann
@ 2008-10-20  7:43     ` Ian Campbell
  2008-10-20 13:15     ` Glauber Costa
  2009-05-22 20:59     ` H. Peter Anvin
  2 siblings, 0 replies; 131+ messages in thread
From: Ian Campbell @ 2008-10-20  7:43 UTC (permalink / raw)
  To: Max Kellermann
  Cc: Glauber Costa, linux-kernel, Grant Coady, Trond Myklebust,
	J. Bruce Fields, Tom Tucker


On Mon, 2008-10-20 at 08:51 +0200, Max Kellermann wrote:
> 
> On 2008/10/20 08:27, Ian Campbell <ijc@hellion.org.uk> wrote:
> > The issue I see still occurs well before those changesets. I have
> > seen it with v2.6.25 but v2.6.24 survived for 7 days without issue
> > (my threshold for a good kernel is 7 days, hence bisecting is a bit
> > slow...).
> 
> Hello Ian,
> 
> it seems we're hunting down different bugs after all.  Too bad, I
> hoped I could have solved your problem, too.

Thanks anyway, I'll just keep on bisecting ;-)

>   Our machine has been
> running well over the weekend with the patch I posted; with faulty
> kernels, the problem would occur after a few minutes.
> 
-- 
Ian Campbell

BOFH excuse #400:

We are Microsoft.  What you are experiencing is not a problem; it is an undocumented feature.



* Re: [PATCH] NFS regression in 2.6.26?, "task blocked for more than 120 seconds"
  2008-10-20  6:51   ` Max Kellermann
  2008-10-20  7:43     ` Ian Campbell
@ 2008-10-20 13:15     ` Glauber Costa
  2008-10-20 14:12       ` Max Kellermann
  2008-10-20 14:21       ` Cyrill Gorcunov
  2009-05-22 20:59     ` H. Peter Anvin
  2 siblings, 2 replies; 131+ messages in thread
From: Glauber Costa @ 2008-10-20 13:15 UTC (permalink / raw)
  To: Max Kellermann
  Cc: Glauber Costa, linux-kernel, ijc, Grant Coady, Trond Myklebust,
	J. Bruce Fields, Tom Tucker, gorcunov

On Mon, Oct 20, 2008 at 4:51 AM, Max Kellermann <mk@cm4all.com> wrote:
> On 2008/10/17 16:33, Glauber Costa <glommer@redhat.com> wrote:
>> That's probably something related to apic congestion.
>> Does the problem go away if the only thing you change is this:
>>
>>
>> > @@ -891,11 +897,6 @@ do_rest:
>> >             store_NMI_vector(&nmi_high, &nmi_low);
>> >
>> >             smpboot_setup_warm_reset_vector(start_ip);
>> > -           /*
>> > -            * Be paranoid about clearing APIC errors.
>> > -           */
>> > -           apic_write(APIC_ESR, 0);
>> > -           apic_read(APIC_ESR);
>> >     }
>>
>>
>> Please let me know.
>
> Hello Glauber,
>
> I have rebooted the server with 2.6.27.1 + this patchlet an hour ago.
> No problems since.
>
> Hardware: Compaq P4 Xeon server, Broadcom CMIC-WS / CIOB-X2 board.
> Tell me if you need more detailed information.
>

There's a patch in flight from Cyrill that probably fixes your problem:
http://lkml.org/lkml/2008/9/15/93

The checks are obviously there for a reason, and we can't just wipe
them out unconditionally ;-) So can you please check that you are also
covered by the case his patch handles?

> On 2008/10/20 08:27, Ian Campbell <ijc@hellion.org.uk> wrote:
>> The issue I see still occurs well before those changesets. I have
>> seen it with v2.6.25 but v2.6.24 survived for 7 days without issue
>> (my threshold for a good kernel is 7 days, hence bisecting is a bit
>> slow...).
>
> Hello Ian,
>
> it seems we're hunting down different bugs after all.  Too bad, I
> hoped I could have solved your problem, too.  Our machine has been
> running well over the weekend with the patch I posted; with faulty
> kernels, the problem would occur after a few minutes.
>
> Max



-- 
Glauber  Costa.
"Free as in Freedom"
http://glommer.net

"The less confident you are, the more serious you have to act."


* Re: [PATCH] NFS regression in 2.6.26?, "task blocked for more than 120 seconds"
  2008-10-20 13:15     ` Glauber Costa
@ 2008-10-20 14:12       ` Max Kellermann
  2008-10-20 14:34         ` Cyrill Gorcunov
  2008-10-20 14:21       ` Cyrill Gorcunov
  1 sibling, 1 reply; 131+ messages in thread
From: Max Kellermann @ 2008-10-20 14:12 UTC (permalink / raw)
  To: Glauber Costa; +Cc: linux-kernel

On 2008/10/20 15:15, Glauber Costa <glommer@gmail.com> wrote:
> There's a patch in flight from cyrill that probably fixes your
> problem: http://lkml.org/lkml/2008/9/15/93
> 
> The checks are obviously there for a reason, and we can't just wipe
> them out unconditionally ;-) So can you check please that you are
> also covered by the case provided?

Looks good: booted the machine 30 minutes ago, no problems so far.

Max


* Re: [PATCH] NFS regression in 2.6.26?, "task blocked for more than 120 seconds"
  2008-10-20 13:15     ` Glauber Costa
  2008-10-20 14:12       ` Max Kellermann
@ 2008-10-20 14:21       ` Cyrill Gorcunov
  1 sibling, 0 replies; 131+ messages in thread
From: Cyrill Gorcunov @ 2008-10-20 14:21 UTC (permalink / raw)
  To: Glauber Costa
  Cc: Max Kellermann, Glauber Costa, linux-kernel, ijc, Grant Coady,
	Trond Myklebust, J. Bruce Fields, Tom Tucker

[Glauber Costa - Mon, Oct 20, 2008 at 11:15:56AM -0200]
| On Mon, Oct 20, 2008 at 4:51 AM, Max Kellermann <mk@cm4all.com> wrote:
| > On 2008/10/17 16:33, Glauber Costa <glommer@redhat.com> wrote:
| >> That's probably something related to apic congestion.
| >> Does the problem go away if the only thing you change is this:
| >>
| >>
| >> > @@ -891,11 +897,6 @@ do_rest:
| >> >             store_NMI_vector(&nmi_high, &nmi_low);
| >> >
| >> >             smpboot_setup_warm_reset_vector(start_ip);
| >> > -           /*
| >> > -            * Be paranoid about clearing APIC errors.
| >> > -           */
| >> > -           apic_write(APIC_ESR, 0);
| >> > -           apic_read(APIC_ESR);
| >> >     }
| >>
| >>
| >> Please let me know.
| >
| > Hello Glauber,
| >
| > I have rebooted the server with 2.6.27.1 + this patchlet an hour ago.
| > No problems since.
| >
| > Hardware: Compaq P4 Xeon server, Broadcom CMIC-WS / CIOB-X2 board.
| > Tell me if you need more detailed information.
| >
| 
| There's a patch in flight from cyrill that probably fixes your problem:
| http://lkml.org/lkml/2008/9/15/93
| 
| The checks are obviously there for a reason, and we can't just wipe
| them out unconditionally ;-) So can you check please that you are also
| covered by the case provided?

Actually, I wonder whether it will help.  Do these Xeon processors
really have no ESR register, i.e. a non-integrated APIC?

...

		- Cyrill -


* Re: [PATCH] NFS regression in 2.6.26?, "task blocked for more than 120 seconds"
  2008-10-20 14:12       ` Max Kellermann
@ 2008-10-20 14:34         ` Cyrill Gorcunov
  0 siblings, 0 replies; 131+ messages in thread
From: Cyrill Gorcunov @ 2008-10-20 14:34 UTC (permalink / raw)
  To: Max Kellermann; +Cc: Glauber Costa, linux-kernel

[Max Kellermann - Mon, Oct 20, 2008 at 04:12:58PM +0200]
| On 2008/10/20 15:15, Glauber Costa <glommer@gmail.com> wrote:
| > There's a patch in flight from cyrill that probably fixes your
| > problem: http://lkml.org/lkml/2008/9/15/93
| > 
| > The checks are obviously there for a reason, and we can't just wipe
| > them out unconditionally ;-) So can you check please that you are
| > also covered by the case provided?
| 
| Looks good: booted the machine 30 minutes ago, no problems so far.
| 
| Max
| 

Thanks, Max, for testing!  (Though I'm still wondering why the patch
helped :-)

		- Cyrill -


* Re: [PATCH] NFS regression in 2.6.26?, "task blocked for more than 120 seconds"
  2008-10-20  6:27 ` Ian Campbell
@ 2008-11-01 11:45   ` Ian Campbell
  2008-11-01 13:41     ` Trond Myklebust
  0 siblings, 1 reply; 131+ messages in thread
From: Ian Campbell @ 2008-11-01 11:45 UTC (permalink / raw)
  To: Max Kellermann
  Cc: linux-kernel, gcosta, Grant Coady, Trond Myklebust,
	J. Bruce Fields, Tom Tucker


On Mon, 2008-10-20 at 07:27 +0100, Ian Campbell wrote:
> So far I have bisected down to this range and am currently testing
> acee478 which has been up for >4days.

Another update. It has now bisected down to a small range:

7272dcd31d56580dee7693c21e369fd167e137fe SUNRPC: xprt_autoclose() should not call xprt_disconnect()
e06799f958bf7f9f8fae15f0c6f519953fb0257c SUNRPC: Use shutdown() instead of close() when disconnecting a TCP socket
ef80367071dce7d2533e79ae8f3c84ec42708dc8 SUNRPC: TCP clear XPRT_CLOSE_WAIT when the socket is closed for writes
3b948ae5be5e22532584113e2e02029519bbad8f SUNRPC: Allow the client to detect if the TCP connection is closed
67a391d72ca7efb387c30ec761a487e50a3ff085 SUNRPC: Fix TCP rebinding logic
66af1e558538137080615e7ad6d1f2f80862de01 SUNRPC: Fix a race in xs_tcp_state_change()

I'm currently testing 3b948ae5be5e22532584113e2e02029519bbad8f.

7272dcd31d56580dee7693c21e369fd167e137fe repro'd in half a day while
ef818a28fac9bd214e676986d8301db0582b92a9 (parent of
66af1e558538137080615e7ad6d1f2f80862de01) survived for 7 days.

Ian.
-- 
Ian Campbell

There is no delight the equal of dread.  As long as it is somebody
else's.
		-- Clive Barker



* Re: [PATCH] NFS regression in 2.6.26?, "task blocked for more than 120 seconds"
  2008-11-01 11:45   ` Ian Campbell
@ 2008-11-01 13:41     ` Trond Myklebust
  2008-11-02 14:40       ` Ian Campbell
                         ` (2 more replies)
  0 siblings, 3 replies; 131+ messages in thread
From: Trond Myklebust @ 2008-11-01 13:41 UTC (permalink / raw)
  To: Ian Campbell
  Cc: Max Kellermann, linux-kernel, gcosta, Grant Coady,
	J. Bruce Fields, Tom Tucker


On Sat, 2008-11-01 at 11:45 +0000, Ian Campbell wrote:
> On Mon, 2008-10-20 at 07:27 +0100, Ian Campbell wrote:
> > So far I have bisected down to this range and am currently testing
> > acee478 which has been up for >4days.
> 
> Another update. It has now bisected down to a small range 
> 
> 7272dcd31d56580dee7693c21e369fd167e137fe SUNRPC: xprt_autoclose() should not call xprt_disconnect()
> e06799f958bf7f9f8fae15f0c6f519953fb0257c SUNRPC: Use shutdown() instead of close() when disconnecting a TCP socket
> ef80367071dce7d2533e79ae8f3c84ec42708dc8 SUNRPC: TCP clear XPRT_CLOSE_WAIT when the socket is closed for writes
> 3b948ae5be5e22532584113e2e02029519bbad8f SUNRPC: Allow the client to detect if the TCP connection is closed
> 67a391d72ca7efb387c30ec761a487e50a3ff085 SUNRPC: Fix TCP rebinding logic
> 66af1e558538137080615e7ad6d1f2f80862de01 SUNRPC: Fix a race in xs_tcp_state_change()
> 
> I'm currently testing 3b948ae5be5e22532584113e2e02029519bbad8f.
> 
> 7272dcd31d56580dee7693c21e369fd167e137fe repro'd in half a day while
> ef818a28fac9bd214e676986d8301db0582b92a9 (parent of
> 66af1e558538137080615e7ad6d1f2f80862de01) survived for 7 days.
> 
> Ian.

Have you tested with the TCP RST fix yet? It has been merged into
mainline, so it should be in the latest 2.6.28-git, but I've attached it
so you can apply it to your test kernel...

Cheers
  Trond


[-- Attachment #2: linux-2.6.27-001-respond_promptly_to_socket_errors.dif --]
[-- Type: application/x-dif, Size: 4418 bytes --]


* Re: [PATCH] NFS regression in 2.6.26?, "task blocked for more than 120 seconds"
  2008-11-01 13:41     ` Trond Myklebust
@ 2008-11-02 14:40       ` Ian Campbell
  2008-11-07  2:12         ` kenneth johansson
  2008-11-04 19:10       ` Ian Campbell
  2008-11-25  7:09       ` Ian Campbell
  2 siblings, 1 reply; 131+ messages in thread
From: Ian Campbell @ 2008-11-02 14:40 UTC (permalink / raw)
  To: Trond Myklebust
  Cc: Max Kellermann, linux-kernel, gcosta, Grant Coady,
	J. Bruce Fields, Tom Tucker


On Sat, 2008-11-01 at 09:41 -0400, Trond Myklebust wrote:
> 
> 
> Have you tested with the TCP RST fix yet? It has been merged into
> mainline, so it should be in the latest 2.6.28-git, but I've attached
> it so you can apply it to your test kernel...

I wasn't aware of it. I'll give it a go.

Thanks,
Ian.
> 
-- 
Ian Campbell

His designs were strictly honourable, as the phrase is: that is, to rob
a lady of her fortune by way of marriage.
		-- Henry Fielding, "Tom Jones"



* Re: [PATCH] NFS regression in 2.6.26?, "task blocked for more than 120 seconds"
  2008-11-01 13:41     ` Trond Myklebust
  2008-11-02 14:40       ` Ian Campbell
@ 2008-11-04 19:10       ` Ian Campbell
  2008-11-25  7:09       ` Ian Campbell
  2 siblings, 0 replies; 131+ messages in thread
From: Ian Campbell @ 2008-11-04 19:10 UTC (permalink / raw)
  To: Trond Myklebust
  Cc: Max Kellermann, linux-kernel, gcosta, Grant Coady,
	J. Bruce Fields, Tom Tucker


On Sat, 2008-11-01 at 09:41 -0400, Trond Myklebust wrote:
> On Sat, 2008-11-01 at 11:45 +0000, Ian Campbell wrote:
> > On Mon, 2008-10-20 at 07:27 +0100, Ian Campbell wrote:
> > > So far I have bisected down to this range and am currently testing
> > > acee478 which has been up for >4days.
> > 
> > Another update. It has now bisected down to a small range 
> > 
> > 7272dcd31d56580dee7693c21e369fd167e137fe SUNRPC: xprt_autoclose() should not call xprt_disconnect()
> > e06799f958bf7f9f8fae15f0c6f519953fb0257c SUNRPC: Use shutdown() instead of close() when disconnecting a TCP socket
> > ef80367071dce7d2533e79ae8f3c84ec42708dc8 SUNRPC: TCP clear XPRT_CLOSE_WAIT when the socket is closed for writes
> > 3b948ae5be5e22532584113e2e02029519bbad8f SUNRPC: Allow the client to detect if the TCP connection is closed
> > 67a391d72ca7efb387c30ec761a487e50a3ff085 SUNRPC: Fix TCP rebinding logic
> > 66af1e558538137080615e7ad6d1f2f80862de01 SUNRPC: Fix a race in xs_tcp_state_change()
> > 
> > I'm currently testing 3b948ae5be5e22532584113e2e02029519bbad8f.
> > 
> > 7272dcd31d56580dee7693c21e369fd167e137fe repro'd in half a day while
> > ef818a28fac9bd214e676986d8301db0582b92a9 (parent of
> > 66af1e558538137080615e7ad6d1f2f80862de01) survived for 7 days.
> > 
> > Ian.
> 
> Have you tested with the TCP RST fix yet? It has been merged into
> mainline, so it should be in the latest 2.6.28-git, but I've attached it
> so you can apply it to your test kernel...

I cherry-picked 2a9e1cfa23fb62da37739af81127dab5af095d99 onto v2.6.25
and unfortunately it has not fixed the issue.  I'll go back to bisecting
with 3b948ae5be5e22532584113e2e02029519bbad8f.

Ian.

> 
> Cheers
>   Trond
> 
-- 
Ian Campbell

Superior ability breeds superior ambition.
		-- Spock, "Space Seed", stardate 3141.9



* Re: [PATCH] NFS regression in 2.6.26?, "task blocked for more than 120 seconds"
  2008-11-02 14:40       ` Ian Campbell
@ 2008-11-07  2:12         ` kenneth johansson
  0 siblings, 0 replies; 131+ messages in thread
From: kenneth johansson @ 2008-11-07  2:12 UTC (permalink / raw)
  To: linux-kernel

On Sun, 02 Nov 2008 14:40:47 +0000, Ian Campbell wrote:

> On Sat, 2008-11-01 at 09:41 -0400, Trond Myklebust wrote:
>> 
>> 
>> Have you tested with the TCP RST fix yet? It has been merged into
>> mainline, so it should be in the latest 2.6.28-git, but I've attached
>> it so you can apply it to your test kernel...
> 
> I wasn't aware of it. I'll give it a go.
> 
> Thanks,
> Ian.
>>

I think I'm having the same problem as you.  At least I have a gut
feeling it's NFS-related.

What good and bad versions do you have so far in your bisecting?

I see the problem several times a day, so it should be possible
to at least try one or two versions per day.

This is on a 2.6.27.2 client against a 2.6.26.3 server.
--------

$ sudo grep blocked /var/log/syslog.0
Nov  5 02:06:27 duo kernel: [ 5080.947067] INFO: task cpufreq-applet:11956 blocked for more than 120 seconds.
Nov  5 02:08:40 duo kernel: [ 5214.091071] INFO: task cpufreq-applet:11956 blocked for more than 120 seconds.
Nov  5 02:10:49 duo kernel: [ 5342.940064] INFO: task cpufreq-applet:11956 blocked for more than 120 seconds.
Nov  5 02:33:15 duo kernel: [ 6688.338072] INFO: task cpufreq-applet:11956 blocked for more than 120 seconds.
Nov  5 02:35:44 duo kernel: [ 6837.588072] INFO: task cpufreq-applet:11956 blocked for more than 120 seconds.
Nov  5 02:38:12 duo kernel: [ 6985.765070] INFO: task cpufreq-applet:11956 blocked for more than 120 seconds.
Nov  5 02:40:37 duo kernel: [ 7130.720067] INFO: task cpufreq-applet:11956 blocked for more than 120 seconds.
Nov  5 06:56:00 duo kernel: [22454.090070] INFO: task cpufreq-applet:11956 blocked for more than 120 seconds.
Nov  5 07:51:38 duo kernel: [25791.279105] INFO: task cpufreq-applet:11956 blocked for more than 120 seconds.
Nov  5 09:39:33 duo kernel: [32267.016068] INFO: task cpufreq-applet:11956 blocked for more than 120 seconds.
Nov  5 09:41:55 duo kernel: [32408.750061] INFO: task cpufreq-applet:11956 blocked for more than 120 seconds.
Nov  5 09:44:17 duo kernel: [32550.484061] INFO: task cpufreq-applet:11956 blocked for more than 120 seconds.
Nov  5 09:46:37 duo kernel: [32691.144064] INFO: task cpufreq-applet:11956 blocked for more than 120 seconds.
Nov  5 16:26:25 duo kernel: [56678.536068] INFO: task cpufreq-applet:11956 blocked for more than 120 seconds.
Nov  5 16:28:50 duo kernel: [56823.492067] INFO: task cpufreq-applet:11956 blocked for more than 120 seconds.
Nov  5 16:31:17 duo kernel: [56970.594061] INFO: task cpufreq-applet:11956 blocked for more than 120 seconds.
Nov  5 16:33:44 duo kernel: [57117.697062] INFO: task cpufreq-applet:11956 blocked for more than 120 seconds.
Nov  5 16:51:04 duo kernel: [58158.153065] INFO: task cpufreq-applet:11956 blocked for more than 120 seconds.
Nov  5 18:00:39 duo kernel: [62256.625050] INFO: task hald-addon-stor:7110 blocked for more than 120 seconds.
Nov  5 18:15:16 duo kernel: [63210.108080] INFO: task cpufreq-applet:11956 blocked for more than 120 seconds.
Nov  5 18:24:08 duo kernel: [63741.610074] INFO: task cpufreq-applet:11956 blocked for more than 120 seconds.
Nov  5 19:47:09 duo kernel: [68722.698056] INFO: task hald-addon-stor:7102 blocked for more than 120 seconds.
Nov  5 19:47:53 duo kernel: [68722.698307] INFO: task hald-addon-stor:7105 blocked for more than 120 seconds.
Nov  5 19:47:53 duo kernel: [68722.698513] INFO: task hald-addon-stor:7110 blocked for more than 120 seconds.
Nov  5 19:47:53 duo kernel: [68755.984030] INFO: task scsi_eh_12:2687 blocked for more than 120 seconds.
Nov  5 22:20:11 duo kernel: [77904.265068] INFO: task cpufreq-applet:11956 blocked for more than 120 seconds.
Nov  5 22:23:10 duo kernel: [78083.580065] INFO: task cpufreq-applet:11956 blocked for more than 120 seconds.
Nov  5 22:46:09 duo kernel: [79462.264081] INFO: task cpufreq-applet:11956 blocked for more than 120 seconds.
Nov  5 23:22:05 duo kernel: [81599.010033] INFO: task scsi_eh_10:2675 blocked for more than 120 seconds.
Nov  5 23:22:05 duo kernel: [81599.010253] INFO: task hald-addon-stor:7097 blocked for more than 120 seconds.
Nov  5 23:22:05 duo kernel: [81599.010468] INFO: task hald-addon-stor:7102 blocked for more than 120 seconds.
Nov  5 23:22:05 duo kernel: [81599.010674] INFO: task hald-addon-stor:7110 blocked for more than 120 seconds.
Nov  6 01:46:31 duo kernel: [90209.346061] INFO: task cpufreq-applet:11956 blocked for more than 120 seconds.
Nov  6 01:46:31 duo kernel: [90242.632048] INFO: task hald-addon-stor:7102 blocked for more than 120 seconds.
Nov  6 01:46:31 duo kernel: [90242.632230] INFO: task hald-addon-stor:7105 blocked for more than 120 seconds.
Nov  6 01:46:31 duo kernel: [90242.632368] INFO: task hald-addon-stor:7110 blocked for more than 120 seconds.
Nov  6 01:46:31 duo kernel: [90275.918024] INFO: task scsi_eh_12:2687 blocked for more than 120 seconds.
Nov  6 02:11:59 duo kernel: [91812.443070] INFO: task cpufreq-applet:11956 blocked for more than 120 seconds.
Nov  6 02:14:12 duo kernel: [91945.587069] INFO: task cpufreq-applet:11956 blocked for more than 120 seconds.
Nov  6 02:16:50 duo kernel: [92103.427069] INFO: task cpufreq-applet:11956 blocked for more than 120 seconds.
Nov  6 02:28:26 duo kernel: [92759.483053] INFO: task hald-addon-usb-:7009 blocked for more than 120 seconds.
Nov  6 02:28:26 duo kernel: [92759.483233] INFO: task hald-addon-stor:7105 blocked for more than 120 seconds.
Nov  6 02:28:26 duo kernel: [92759.483447] INFO: task hald-addon-stor:7110 blocked for more than 120 seconds.
Nov  6 02:28:26 duo kernel: [92792.769034] INFO: task scsi_eh_12:2687 blocked for more than 120 seconds.
Nov  6 02:58:49 duo kernel: [94622.425059] INFO: task cpufreq-applet:11956 blocked for more than 120 seconds.


^ permalink raw reply	[flat|nested] 131+ messages in thread

* Re: [PATCH] NFS regression in 2.6.26?, "task blocked for more than 120 seconds"
  2008-11-01 13:41     ` Trond Myklebust
  2008-11-02 14:40       ` Ian Campbell
  2008-11-04 19:10       ` Ian Campbell
@ 2008-11-25  7:09       ` Ian Campbell
  2008-11-25 13:28           ` Trond Myklebust
  2 siblings, 1 reply; 131+ messages in thread
From: Ian Campbell @ 2008-11-25  7:09 UTC (permalink / raw)
  To: Trond Myklebust, linux-nfs
  Cc: Max Kellermann, linux-kernel, gcosta, Grant Coady,
	J. Bruce Fields, Tom Tucker

[-- Attachment #1: Type: text/plain, Size: 2233 bytes --]

On Sat, 2008-11-01 at 09:41 -0400, Trond Myklebust wrote:
> On Sat, 2008-11-01 at 11:45 +0000, Ian Campbell wrote:
> > On Mon, 2008-10-20 at 07:27 +0100, Ian Campbell wrote:
> > > So far I have bisected down to this range and am currently testing
> > > acee478 which has been up for >4days.
> > 
> > Another update. It has now bisected down to a small range 
> > 
> > 7272dcd31d56580dee7693c21e369fd167e137fe SUNRPC: xprt_autoclose() should not call xprt_disconnect()
> > e06799f958bf7f9f8fae15f0c6f519953fb0257c SUNRPC: Use shutdown() instead of close() when disconnecting a TCP socket
> > ef80367071dce7d2533e79ae8f3c84ec42708dc8 SUNRPC: TCP clear XPRT_CLOSE_WAIT when the socket is closed for writes
> > 3b948ae5be5e22532584113e2e02029519bbad8f SUNRPC: Allow the client to detect if the TCP connection is closed
> > 67a391d72ca7efb387c30ec761a487e50a3ff085 SUNRPC: Fix TCP rebinding logic
> > 66af1e558538137080615e7ad6d1f2f80862de01 SUNRPC: Fix a race in xs_tcp_state_change()
> > 
> > I'm currently testing 3b948ae5be5e22532584113e2e02029519bbad8f.
> > 
> > 7272dcd31d56580dee7693c21e369fd167e137fe repro'd in half a day while
> > ef818a28fac9bd214e676986d8301db0582b92a9 (parent of
> > 66af1e558538137080615e7ad6d1f2f80862de01) survived for 7 days.

According to bisect:

e06799f958bf7f9f8fae15f0c6f519953fb0257c is first bad commit
commit e06799f958bf7f9f8fae15f0c6f519953fb0257c
Author: Trond Myklebust <Trond.Myklebust@netapp.com>
Date:   Mon Nov 5 15:44:12 2007 -0500

    SUNRPC: Use shutdown() instead of close() when disconnecting a TCP socket
    
    By using shutdown() rather than close() we allow the RPC client to wait
    for the TCP close handshake to complete before we start trying to reconnect
    using the same port.
    We use shutdown(SHUT_WR) only instead of shutting down both directions,
    however we wait until the server has closed the connection on its side.
    
    Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

I've started testing 2.6.26 + revert. It's been a long while since I
started this process so I'll also have a go at an up to date version.

Cheers,
Ian.
-- 
Ian Campbell

By failing to prepare, you are preparing to fail.

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 197 bytes --]

^ permalink raw reply	[flat|nested] 131+ messages in thread

* Re: [PATCH] NFS regression in 2.6.26?, "task blocked for more than 120 seconds"
@ 2008-11-25 13:28           ` Trond Myklebust
  0 siblings, 0 replies; 131+ messages in thread
From: Trond Myklebust @ 2008-11-25 13:28 UTC (permalink / raw)
  To: Ian Campbell
  Cc: linux-nfs, Max Kellermann, linux-kernel, gcosta, Grant Coady,
	J. Bruce Fields, Tom Tucker

On Tue, 2008-11-25 at 07:09 +0000, Ian Campbell wrote:
> On Sat, 2008-11-01 at 09:41 -0400, Trond Myklebust wrote:
> > On Sat, 2008-11-01 at 11:45 +0000, Ian Campbell wrote:
> > > On Mon, 2008-10-20 at 07:27 +0100, Ian Campbell wrote:
> > > > So far I have bisected down to this range and am currently testing
> > > > acee478 which has been up for >4days.
> > > 
> > > Another update. It has now bisected down to a small range 
> > > 
> > > 7272dcd31d56580dee7693c21e369fd167e137fe SUNRPC: xprt_autoclose() should not call xprt_disconnect()
> > > e06799f958bf7f9f8fae15f0c6f519953fb0257c SUNRPC: Use shutdown() instead of close() when disconnecting a TCP socket
> > > ef80367071dce7d2533e79ae8f3c84ec42708dc8 SUNRPC: TCP clear XPRT_CLOSE_WAIT when the socket is closed for writes
> > > 3b948ae5be5e22532584113e2e02029519bbad8f SUNRPC: Allow the client to detect if the TCP connection is closed
> > > 67a391d72ca7efb387c30ec761a487e50a3ff085 SUNRPC: Fix TCP rebinding logic
> > > 66af1e558538137080615e7ad6d1f2f80862de01 SUNRPC: Fix a race in xs_tcp_state_change()
> > > 
> > > I'm currently testing 3b948ae5be5e22532584113e2e02029519bbad8f.
> > > 
> > > 7272dcd31d56580dee7693c21e369fd167e137fe repro'd in half a day while
> > > ef818a28fac9bd214e676986d8301db0582b92a9 (parent of
> > > 66af1e558538137080615e7ad6d1f2f80862de01) survived for 7 days.
> 
> According to bisect:
> 
> e06799f958bf7f9f8fae15f0c6f519953fb0257c is first bad commit
> commit e06799f958bf7f9f8fae15f0c6f519953fb0257c
> Author: Trond Myklebust <Trond.Myklebust@netapp.com>
> Date:   Mon Nov 5 15:44:12 2007 -0500
> 
>     SUNRPC: Use shutdown() instead of close() when disconnecting a TCP socket
>     
>     By using shutdown() rather than close() we allow the RPC client to wait
>     for the TCP close handshake to complete before we start trying to reconnect
>     using the same port.
>     We use shutdown(SHUT_WR) only instead of shutting down both directions,
>     however we wait until the server has closed the connection on its side.
>     
>     Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
> 
> I've started testing 2.6.26 + revert. It's been a long while since I
> started this process so I'll also have a go at an up to date version.
> 
> Cheers,

That would indicate that the server is failing to close the TCP
connection when the client closes on its end.

Could you remind me what server you are using? Also, does 'netstat -t'
show connections that are stuck in the CLOSE_WAIT state when you see the
hang?

Trond


^ permalink raw reply	[flat|nested] 131+ messages in thread

* Re: [PATCH] NFS regression in 2.6.26?, "task blocked for more than 120 seconds"
@ 2008-11-25 13:38             ` Ian Campbell
  0 siblings, 0 replies; 131+ messages in thread
From: Ian Campbell @ 2008-11-25 13:38 UTC (permalink / raw)
  To: Trond Myklebust
  Cc: linux-nfs, Max Kellermann, linux-kernel, gcosta, Grant Coady,
	J. Bruce Fields, Tom Tucker

On Tue, 2008-11-25 at 08:28 -0500, Trond Myklebust wrote:
> On Tue, 2008-11-25 at 07:09 +0000, Ian Campbell wrote:
> > On Sat, 2008-11-01 at 09:41 -0400, Trond Myklebust wrote:
> > > On Sat, 2008-11-01 at 11:45 +0000, Ian Campbell wrote:
> > > > On Mon, 2008-10-20 at 07:27 +0100, Ian Campbell wrote:
> > > > > So far I have bisected down to this range and am currently testing
> > > > > acee478 which has been up for >4days.
> > > > 
> > > > Another update. It has now bisected down to a small range 
> > > > 
> > > > 7272dcd31d56580dee7693c21e369fd167e137fe SUNRPC: xprt_autoclose() should not call xprt_disconnect()
> > > > e06799f958bf7f9f8fae15f0c6f519953fb0257c SUNRPC: Use shutdown() instead of close() when disconnecting a TCP socket
> > > > ef80367071dce7d2533e79ae8f3c84ec42708dc8 SUNRPC: TCP clear XPRT_CLOSE_WAIT when the socket is closed for writes
> > > > 3b948ae5be5e22532584113e2e02029519bbad8f SUNRPC: Allow the client to detect if the TCP connection is closed
> > > > 67a391d72ca7efb387c30ec761a487e50a3ff085 SUNRPC: Fix TCP rebinding logic
> > > > 66af1e558538137080615e7ad6d1f2f80862de01 SUNRPC: Fix a race in xs_tcp_state_change()
> > > > 
> > > > I'm currently testing 3b948ae5be5e22532584113e2e02029519bbad8f.
> > > > 
> > > > 7272dcd31d56580dee7693c21e369fd167e137fe repro'd in half a day while
> > > > ef818a28fac9bd214e676986d8301db0582b92a9 (parent of
> > > > 66af1e558538137080615e7ad6d1f2f80862de01) survived for 7 days.
> > 
> > According to bisect:
> > 
> > e06799f958bf7f9f8fae15f0c6f519953fb0257c is first bad commit
> > commit e06799f958bf7f9f8fae15f0c6f519953fb0257c
> > Author: Trond Myklebust <Trond.Myklebust@netapp.com>
> > Date:   Mon Nov 5 15:44:12 2007 -0500
> > 
> >     SUNRPC: Use shutdown() instead of close() when disconnecting a TCP socket
> >     
> >     By using shutdown() rather than close() we allow the RPC client to wait
> >     for the TCP close handshake to complete before we start trying to reconnect
> >     using the same port.
> >     We use shutdown(SHUT_WR) only instead of shutting down both directions,
> >     however we wait until the server has closed the connection on its side.
> >     
> >     Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
> > 
> > I've started testing 2.6.26 + revert. It's been a long while since I
> > started this process so I'll also have a go at an up to date version.
> > 
> > Cheers,
> 
> That would indicate that the server is failing to close the TCP
> connection when the client closes on its end.
> 
> Could you remind me what server you are using?

2.6.25-2-486 which is a Debian package from backports.org, changelog
indicates that it contains 2.6.25.7.

> Also, does 'netstat -t'
> show connections that are stuck in the CLOSE_WAIT state when you see the
> hang?

I'd have to wait for it to reproduce again to be 100% sure but according
to http://lkml.indiana.edu/hypermail/linux/kernel/0808.3/0120.html
I was seeing connections in FIN_WAIT2 but not CLOSE_WAIT.

Ian.

-- 
Ian Campbell
Current Noise: Diamond Head - It's Electric

"The only real way to look younger is not to be born so soon."
		-- Charles Schulz, "Things I've Had to Learn Over and
		   Over and Over"


^ permalink raw reply	[flat|nested] 131+ messages in thread

* Re: [PATCH] NFS regression in 2.6.26?, "task blocked for more than 120 seconds"
@ 2008-11-25 13:57               ` Trond Myklebust
  0 siblings, 0 replies; 131+ messages in thread
From: Trond Myklebust @ 2008-11-25 13:57 UTC (permalink / raw)
  To: Ian Campbell
  Cc: linux-nfs, Max Kellermann, linux-kernel, gcosta, Grant Coady,
	J. Bruce Fields, Tom Tucker

On Tue, 2008-11-25 at 13:38 +0000, Ian Campbell wrote:
> > That would indicate that the server is failing to close the TCP
> > connection when the client closes on its end.
> > 
> > Could you remind me what server you are using?
> 
> 2.6.25-2-486 which is a Debian package from backports.org, changelog
> indicates that it contains 2.6.25.7.

Hmm... It should normally close sockets when the state changes. There
might be a race, though...

> > Also, does 'netstat -t'
> > show connections that are stuck in the CLOSE_WAIT state when you see the
> > hang?
> 
> I'd have to wait for it to reproduce again to be 100% sure but according
> to http://lkml.indiana.edu/hypermail/linux/kernel/0808.3/0120.html
> I was seeing connections in FIN_WAIT2 but not CLOSE_WAIT.

That would be on the client side. I'm talking about the server.

Cheers
  Trond


^ permalink raw reply	[flat|nested] 131+ messages in thread

* Re: [PATCH] NFS regression in 2.6.26?, "task blocked for more than 120 seconds"
@ 2008-11-25 14:04                 ` Ian Campbell
  0 siblings, 0 replies; 131+ messages in thread
From: Ian Campbell @ 2008-11-25 14:04 UTC (permalink / raw)
  To: Trond Myklebust
  Cc: linux-nfs, Max Kellermann, linux-kernel, gcosta, Grant Coady,
	J. Bruce Fields, Tom Tucker

On Tue, 2008-11-25 at 08:57 -0500, Trond Myklebust wrote:
> On Tue, 2008-11-25 at 13:38 +0000, Ian Campbell wrote:
> > > That would indicate that the server is failing to close the TCP
> > > connection when the client closes on its end.
> > > 
> > > Could you remind me what server you are using?
> > 
> > 2.6.25-2-486 which is a Debian package from backports.org, changelog
> > indicates that it contains 2.6.25.7.
> 
> Hmm... It should normally close sockets when the state changes. There
> might be a race, though...
> 
> > > Also, does 'netstat -t'
> > > show connections that are stuck in the CLOSE_WAIT state when you see the
> > > hang?
> > 
> > I'd have to wait for it to reproduce again to be 100% sure but according
> > to http://lkml.indiana.edu/hypermail/linux/kernel/0808.3/0120.html
> > I was seeing connections in FIN_WAIT2 but not CLOSE_WAIT.
> 
> That would be on the client side. I'm talking about the server.

Ah, OK. I'll abort my current test of 2.6.26+revert and wait for a repro
so I can netstat the server, give me a couple of days...

Ian.
-- 
Ian Campbell

It is more rational to sacrifice one life than six.
		-- Spock, "The Galileo Seven", stardate 2822.3


^ permalink raw reply	[flat|nested] 131+ messages in thread

* Re: [PATCH] NFS regression in 2.6.26?,
  2008-11-25 13:38             ` Ian Campbell
  (?)
  (?)
@ 2008-11-26  9:16             ` Tomas Kasparek
  -1 siblings, 0 replies; 131+ messages in thread
From: Tomas Kasparek @ 2008-11-26  9:16 UTC (permalink / raw)
  To: linux-nfs

Ian Campbell <ijc@...> writes:

> According to bisect:
> commit e06799f958bf7f9f8fae15f0c6f519953fb0257c
> Author: Trond Myklebust <Trond.Myklebust@...>
> Date:   Mon Nov 5 15:44:12 2007 -0500
>     SUNRPC: Use shutdown() instead of close() when disconnecting a TCP 

Hi, it seems I have the same problem with a slightly different
configuration. The client is OK on 2.6.24.7 but fails on 2.6.25-rc1 and
later (2.6.25, .26, .27, .28-rc6). On the client I see connections in
the FIN_WAIT2 state. The server is FreeBSD 7.0-STABLE (I have two of
them with the same behavior). The fastest way to get into trouble is to
use the automounter, since it cycles through the ports - when it
reaches one stuck in FIN_WAIT2, the machine goes "dead".

Reverting the patch on top of 2.6.27.4 seems to help: before, the
machine got stuck within about an hour; now it has been up for 16 hours
and seems OK.

> > That would indicate that the server is failing to close the TCP
> > connection when the client closes on its end.

Once or twice I verified with tcpdump that the client sent a FIN and
got an ACK from the server, but never a FIN back from the server.

Tom


^ permalink raw reply	[flat|nested] 131+ messages in thread

* Re: [PATCH] NFS regression in 2.6.26?, "task blocked for more than 120 seconds"
@ 2008-11-26 22:12                   ` Ian Campbell
  0 siblings, 0 replies; 131+ messages in thread
From: Ian Campbell @ 2008-11-26 22:12 UTC (permalink / raw)
  To: Trond Myklebust
  Cc: linux-nfs, Max Kellermann, linux-kernel, gcosta, Grant Coady,
	J. Bruce Fields, Tom Tucker

[-- Attachment #1: Type: text/plain, Size: 5264 bytes --]

On Tue, 2008-11-25 at 14:04 +0000, Ian Campbell wrote:
> On Tue, 2008-11-25 at 08:57 -0500, Trond Myklebust wrote:
> > On Tue, 2008-11-25 at 13:38 +0000, Ian Campbell wrote:
> > > > That would indicate that the server is failing to close the TCP
> > > > connection when the client closes on its end.
> > > > 
> > > > Could you remind me what server you are using?
> > > 
> > > 2.6.25-2-486 which is a Debian package from backports.org, changelog
> > > indicates that it contains 2.6.25.7.
> > 
> > Hmm... It should normally close sockets when the state changes. There
> > might be a race, though...
> > 
> > > > Also, does 'netstat -t'
> > > > show connections that are stuck in the CLOSE_WAIT state when you see the
> > > > hang?
> > > 
> > > I'd have to wait for it to reproduce again to be 100% sure but according
> > > to http://lkml.indiana.edu/hypermail/linux/kernel/0808.3/0120.html
> > > I was seeing connections in FIN_WAIT2 but not CLOSE_WAIT.
> > 
> > That would be on the client side. I'm talking about the server.
> 
> Ah, OK. I'll abort my current test of 2.6.26+revert and wait for a repro
> so I can netstat the server, give me a couple of days...

So on the server I see the following. 192.168.1.4 is the problematic
client and 192.168.1.6 is the server.

Maybe not interesting but 192.168.1.5 also uses NFS for my $HOME and
runs 2.6.26 with no lockups.

# netstat -t -n
Active Internet connections (w/o servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State      
tcp        1      0 192.168.1.6:2049        192.168.1.4:723         CLOSE_WAIT 
tcp        1      0 192.168.1.6:2049        192.168.1.4:920         CLOSE_WAIT 
tcp        1      0 192.168.1.6:2049        192.168.1.4:890         CLOSE_WAIT 
tcp        1      0 192.168.1.6:2049        192.168.1.4:698         CLOSE_WAIT 
tcp        1      0 192.168.1.6:2049        192.168.1.4:705         CLOSE_WAIT 
tcp        1      0 192.168.1.6:2049        192.168.1.4:943         CLOSE_WAIT 
tcp        1      0 192.168.1.6:2049        192.168.1.4:915         CLOSE_WAIT 
tcp        0      0 192.168.1.6:2049        192.168.1.5:783         ESTABLISHED
tcp        1      0 192.168.1.6:2049        192.168.1.4:998         CLOSE_WAIT 
tcp        1      0 192.168.1.6:2049        192.168.1.4:758         CLOSE_WAIT 
tcp        1      0 192.168.1.6:2049        192.168.1.4:955         CLOSE_WAIT 
tcp        1      0 192.168.1.6:2049        192.168.1.4:845         CLOSE_WAIT 
tcp        1      0 192.168.1.6:2049        192.168.1.4:827         CLOSE_WAIT 
tcp        0      0 192.168.1.6:58464       128.31.0.36:80          ESTABLISHED
tcp        1      0 192.168.1.6:2049        192.168.1.4:754         CLOSE_WAIT 
tcp        1      0 192.168.1.6:2049        192.168.1.4:837         CLOSE_WAIT 
tcp        1      0 192.168.1.6:2049        192.168.1.4:918         CLOSE_WAIT 
tcp        1      0 192.168.1.6:2049        192.168.1.4:865         CLOSE_WAIT 
tcp        0      0 192.168.1.6:48343       192.168.1.5:832         ESTABLISHED
tcp        1      0 192.168.1.6:2049        192.168.1.4:840         CLOSE_WAIT 
tcp        1      0 192.168.1.6:2049        192.168.1.4:883         CLOSE_WAIT 
tcp        1      0 192.168.1.6:2049        192.168.1.4:785         CLOSE_WAIT 
tcp        1      0 192.168.1.6:2049        192.168.1.4:720         CLOSE_WAIT 
tcp6       0      0 ::ffff:192.168.1.6:22   ::ffff:192.168.1.:38206 ESTABLISHED
tcp6       0      0 ::ffff:192.168.1.6:143  ::ffff:192.168.1.:41308 ESTABLISHED
tcp6       0      0 ::ffff:192.168.1.6:143  ::ffff:192.168.1.:55784 ESTABLISHED
tcp6       0      0 ::ffff:192.168.1.6:22   ::ffff:192.168.1.:39046 ESTABLISHED

and on the client

# netstat -t -n
Active Internet connections (w/o servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State      
tcp        0      0 192.168.1.4:943         192.168.1.6:2049        FIN_WAIT2  
tcp        0      0 192.168.1.4:33959       192.168.1.4:6543        ESTABLISHED
tcp        0      0 192.168.1.4:6543        192.168.1.4:54157       ESTABLISHED
tcp        0      0 127.0.0.1:13666         127.0.0.1:33364         ESTABLISHED
tcp        0      0 192.168.1.4:22          192.168.1.5:54696       ESTABLISHED
tcp        0      0 192.168.1.4:22          192.168.1.5:47599       ESTABLISHED
tcp        0      0 192.168.1.4:54156       192.168.1.4:6543        ESTABLISHED
tcp        0      0 192.168.1.4:6543        192.168.1.4:33957       ESTABLISHED
tcp        0      0 192.168.1.4:33957       192.168.1.4:6543        ESTABLISHED
tcp        0      0 192.168.1.4:54157       192.168.1.4:6543        ESTABLISHED
tcp        0      0 192.168.1.4:6543        192.168.1.4:54156       ESTABLISHED
tcp        0      0 192.168.1.4:6543        192.168.1.4:33959       ESTABLISHED
tcp        0      0 127.0.0.1:47756         127.0.0.1:6545          ESTABLISHED
tcp        0      0 127.0.0.1:33364         127.0.0.1:13666         ESTABLISHED
tcp        0      0 127.0.0.1:6545          127.0.0.1:47756         ESTABLISHED

> 
> Ian.
-- 
Ian Campbell

Just once, I wish we would encounter an alien menace that wasn't
immune to bullets.
		-- The Brigadier, "Dr. Who"

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 197 bytes --]

^ permalink raw reply	[flat|nested] 131+ messages in thread

* Re: [PATCH] NFS regression in 2.6.26?, "task blocked for more than 120 seconds"
@ 2008-11-26 22:12                   ` Ian Campbell
  0 siblings, 0 replies; 131+ messages in thread
From: Ian Campbell @ 2008-11-26 22:12 UTC (permalink / raw)
  To: Trond Myklebust
  Cc: linux-nfs, Max Kellermann, linux-kernel, gcosta, Grant Coady,
	J. Bruce Fields, Tom Tucker

[-- Attachment #1: Type: text/plain, Size: 5264 bytes --]

On Tue, 2008-11-25 at 14:04 +0000, Ian Campbell wrote:
> On Tue, 2008-11-25 at 08:57 -0500, Trond Myklebust wrote:
> > On Tue, 2008-11-25 at 13:38 +0000, Ian Campbell wrote:
> > > > That would indicate that the server is failing to close the TCP
> > > > connection when the client closes on its end.
> > > > 
> > > > Could you remind me what server you are using?
> > > 
> > > 2.6.25-2-486 which is a Debian package from backports.org, changelog
> > > indicates that it contains 2.6.25.7.
> > 
> > Hmm... It should normally close sockets when the state changes. There
> > might be a race, though...
> > 
> > > > Also, does 'netstat -t'
> > > > show connections that are stuck in the CLOSE_WAIT state when you see the
> > > > hang?
> > > 
> > > I'd have to wait for it to reproduce again to be 100% sure but according
> > > to http://lkml.indiana.edu/hypermail/linux/kernel/0808.3/0120.html
> > > I was seeing connections in FIN_WAIT2 but not CLOSE_WAIT.
> > 
> > That would be on the client side. I'm talking about the server.
> 
> Ah, OK. I'll abort my current test of 2.6.26+revert and wait for a repro
> so I can netstat the server, give me a couple of days...

So on the server I see the following. 192.168.1.4 is the problematic
client and 192.168.1.6 is the server.

Maybe not interesting, but 192.168.1.5 also uses NFS for my $HOME and
runs 2.6.26 with no lockups.

# netstat -t -n
Active Internet connections (w/o servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State      
tcp        1      0 192.168.1.6:2049        192.168.1.4:723         CLOSE_WAIT 
tcp        1      0 192.168.1.6:2049        192.168.1.4:920         CLOSE_WAIT 
tcp        1      0 192.168.1.6:2049        192.168.1.4:890         CLOSE_WAIT 
tcp        1      0 192.168.1.6:2049        192.168.1.4:698         CLOSE_WAIT 
tcp        1      0 192.168.1.6:2049        192.168.1.4:705         CLOSE_WAIT 
tcp        1      0 192.168.1.6:2049        192.168.1.4:943         CLOSE_WAIT 
tcp        1      0 192.168.1.6:2049        192.168.1.4:915         CLOSE_WAIT 
tcp        0      0 192.168.1.6:2049        192.168.1.5:783         ESTABLISHED
tcp        1      0 192.168.1.6:2049        192.168.1.4:998         CLOSE_WAIT 
tcp        1      0 192.168.1.6:2049        192.168.1.4:758         CLOSE_WAIT 
tcp        1      0 192.168.1.6:2049        192.168.1.4:955         CLOSE_WAIT 
tcp        1      0 192.168.1.6:2049        192.168.1.4:845         CLOSE_WAIT 
tcp        1      0 192.168.1.6:2049        192.168.1.4:827         CLOSE_WAIT 
tcp        0      0 192.168.1.6:58464       128.31.0.36:80          ESTABLISHED
tcp        1      0 192.168.1.6:2049        192.168.1.4:754         CLOSE_WAIT 
tcp        1      0 192.168.1.6:2049        192.168.1.4:837         CLOSE_WAIT 
tcp        1      0 192.168.1.6:2049        192.168.1.4:918         CLOSE_WAIT 
tcp        1      0 192.168.1.6:2049        192.168.1.4:865         CLOSE_WAIT 
tcp        0      0 192.168.1.6:48343       192.168.1.5:832         ESTABLISHED
tcp        1      0 192.168.1.6:2049        192.168.1.4:840         CLOSE_WAIT 
tcp        1      0 192.168.1.6:2049        192.168.1.4:883         CLOSE_WAIT 
tcp        1      0 192.168.1.6:2049        192.168.1.4:785         CLOSE_WAIT 
tcp        1      0 192.168.1.6:2049        192.168.1.4:720         CLOSE_WAIT 
tcp6       0      0 ::ffff:192.168.1.6:22   ::ffff:192.168.1.:38206 ESTABLISHED
tcp6       0      0 ::ffff:192.168.1.6:143  ::ffff:192.168.1.:41308 ESTABLISHED
tcp6       0      0 ::ffff:192.168.1.6:143  ::ffff:192.168.1.:55784 ESTABLISHED
tcp6       0      0 ::ffff:192.168.1.6:22   ::ffff:192.168.1.:39046 ESTABLISHED

and on the client

# netstat -t -n
Active Internet connections (w/o servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State      
tcp        0      0 192.168.1.4:943         192.168.1.6:2049        FIN_WAIT2  
tcp        0      0 192.168.1.4:33959       192.168.1.4:6543        ESTABLISHED
tcp        0      0 192.168.1.4:6543        192.168.1.4:54157       ESTABLISHED
tcp        0      0 127.0.0.1:13666         127.0.0.1:33364         ESTABLISHED
tcp        0      0 192.168.1.4:22          192.168.1.5:54696       ESTABLISHED
tcp        0      0 192.168.1.4:22          192.168.1.5:47599       ESTABLISHED
tcp        0      0 192.168.1.4:54156       192.168.1.4:6543        ESTABLISHED
tcp        0      0 192.168.1.4:6543        192.168.1.4:33957       ESTABLISHED
tcp        0      0 192.168.1.4:33957       192.168.1.4:6543        ESTABLISHED
tcp        0      0 192.168.1.4:54157       192.168.1.4:6543        ESTABLISHED
tcp        0      0 192.168.1.4:6543        192.168.1.4:54156       ESTABLISHED
tcp        0      0 192.168.1.4:6543        192.168.1.4:33959       ESTABLISHED
tcp        0      0 127.0.0.1:47756         127.0.0.1:6545          ESTABLISHED
tcp        0      0 127.0.0.1:33364         127.0.0.1:13666         ESTABLISHED
tcp        0      0 127.0.0.1:6545          127.0.0.1:47756         ESTABLISHED
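Dumps like the two above can be tallied per TCP state to spot a CLOSE_WAIT
pile-up at a glance. A small awk pipeline (a diagnostic sketch with inlined
sample rows, not part of the original report):

```shell
# Tally TCP states (column 6 of `netstat -t -n` output).  The sample
# rows are inlined so the sketch is self-contained; against a live
# machine you would pipe `netstat -t -n | tail -n +3` into the awk.
sample='tcp        1      0 192.168.1.6:2049        192.168.1.4:723         CLOSE_WAIT
tcp        0      0 192.168.1.6:2049        192.168.1.5:783         ESTABLISHED
tcp        1      0 192.168.1.6:2049        192.168.1.4:920         CLOSE_WAIT'

printf '%s\n' "$sample" \
  | awk '{count[$6]++} END {for (s in count) print s, count[s]}' \
  | sort
# → CLOSE_WAIT 2
#   ESTABLISHED 1
```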

> 
> Ian.
-- 
Ian Campbell

Just once, I wish we would encounter an alien menace that wasn't
immune to bullets.
		-- The Brigadier, "Dr. Who"

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 197 bytes --]

^ permalink raw reply	[flat|nested] 131+ messages in thread

* Re: [PATCH 0/3] NFS regression in 2.6.26?, "task blocked for more than 120 seconds"
@ 2008-12-01  0:17                     ` Trond Myklebust
  0 siblings, 0 replies; 131+ messages in thread
From: Trond Myklebust @ 2008-12-01  0:17 UTC (permalink / raw)
  To: Ian Campbell
  Cc: linux-nfs, Max Kellermann, linux-kernel, gcosta, Grant Coady,
	J. Bruce Fields, Tom Tucker

Can you see if the following 3 patches help? They're against 2.6.28-rc6,
but afaics the problems are pretty much the same on 2.6.26.

Cheers
  Trond


^ permalink raw reply	[flat|nested] 131+ messages in thread

* Re: [PATCH 1/3] SUNRPC: Ensure the server closes sockets in a timely fashion
@ 2008-12-01  0:18                       ` Trond Myklebust
  0 siblings, 0 replies; 131+ messages in thread
From: Trond Myklebust @ 2008-12-01  0:18 UTC (permalink / raw)
  To: Ian Campbell
  Cc: linux-nfs, Max Kellermann, linux-kernel, gcosta, Grant Coady,
	J. Bruce Fields, Tom Tucker

We want to ensure that connected sockets close down the connection when we
set XPT_CLOSE, so that we don't keep it hanging while cleaning up all the
stuff that is keeping a reference to the socket.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
---

 net/sunrpc/svcsock.c |   21 ++++++++++++++++++++-
 1 files changed, 20 insertions(+), 1 deletions(-)


diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
index 95293f5..a1b048d 100644
--- a/net/sunrpc/svcsock.c
+++ b/net/sunrpc/svcsock.c
@@ -59,6 +59,7 @@ static void		svc_udp_data_ready(struct sock *, int);
 static int		svc_udp_recvfrom(struct svc_rqst *);
 static int		svc_udp_sendto(struct svc_rqst *);
 static void		svc_sock_detach(struct svc_xprt *);
+static void		svc_tcp_sock_detach(struct svc_xprt *);
 static void		svc_sock_free(struct svc_xprt *);
 
 static struct svc_xprt *svc_create_socket(struct svc_serv *, int,
@@ -1017,7 +1018,7 @@ static struct svc_xprt_ops svc_tcp_ops = {
 	.xpo_recvfrom = svc_tcp_recvfrom,
 	.xpo_sendto = svc_tcp_sendto,
 	.xpo_release_rqst = svc_release_skb,
-	.xpo_detach = svc_sock_detach,
+	.xpo_detach = svc_tcp_sock_detach,
 	.xpo_free = svc_sock_free,
 	.xpo_prep_reply_hdr = svc_tcp_prep_reply_hdr,
 	.xpo_has_wspace = svc_tcp_has_wspace,
@@ -1282,6 +1283,24 @@ static void svc_sock_detach(struct svc_xprt *xprt)
 	sk->sk_state_change = svsk->sk_ostate;
 	sk->sk_data_ready = svsk->sk_odata;
 	sk->sk_write_space = svsk->sk_owspace;
+
+	if (sk->sk_sleep && waitqueue_active(sk->sk_sleep))
+		wake_up_interruptible(sk->sk_sleep);
+}
+
+/*
+ * Disconnect the socket, and reset the callbacks
+ */
+static void svc_tcp_sock_detach(struct svc_xprt *xprt)
+{
+	struct svc_sock *svsk = container_of(xprt, struct svc_sock, sk_xprt);
+
+	dprintk("svc: svc_tcp_sock_detach(%p)\n", svsk);
+
+	svc_sock_detach(xprt);
+
+	if (!test_bit(XPT_LISTENER, &xprt->xpt_flags))
+		kernel_sock_shutdown(svsk->sk_sock, SHUT_RDWR);
 }
 
 /*

-- 
Trond Myklebust
Linux NFS client maintainer

NetApp
Trond.Myklebust@netapp.com
www.netapp.com

^ permalink raw reply related	[flat|nested] 131+ messages in thread

* Re: [PATCH 2/3] SUNRPC: We only need to call svc_delete_xprt() once...
  2008-12-01  0:17                     ` Trond Myklebust
  (?)
  (?)
@ 2008-12-01  0:19                     ` Trond Myklebust
  -1 siblings, 0 replies; 131+ messages in thread
From: Trond Myklebust @ 2008-12-01  0:19 UTC (permalink / raw)
  To: Ian Campbell
  Cc: linux-nfs, Max Kellermann, linux-kernel, gcosta, Grant Coady,
	J. Bruce Fields, Tom Tucker

Use XPT_DEAD to ensure that we only call xpo_detach & friends once.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
---

 net/sunrpc/svc_xprt.c |   17 +++++++++++------
 1 files changed, 11 insertions(+), 6 deletions(-)


diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
index bf5b5cd..a417064 100644
--- a/net/sunrpc/svc_xprt.c
+++ b/net/sunrpc/svc_xprt.c
@@ -838,6 +838,10 @@ void svc_delete_xprt(struct svc_xprt *xprt)
 {
 	struct svc_serv	*serv = xprt->xpt_server;
 
+	/* Only do this once */
+	if (test_and_set_bit(XPT_DEAD, &xprt->xpt_flags) != 0)
+		return;
+
 	dprintk("svc: svc_delete_xprt(%p)\n", xprt);
 	xprt->xpt_ops->xpo_detach(xprt);
 
@@ -851,13 +855,14 @@ void svc_delete_xprt(struct svc_xprt *xprt)
 	 * while still attached to a queue, the queue itself
 	 * is about to be destroyed (in svc_destroy).
 	 */
-	if (!test_and_set_bit(XPT_DEAD, &xprt->xpt_flags)) {
-		BUG_ON(atomic_read(&xprt->xpt_ref.refcount) < 2);
-		if (test_bit(XPT_TEMP, &xprt->xpt_flags))
-			serv->sv_tmpcnt--;
-		svc_xprt_put(xprt);
-	}
+	if (test_bit(XPT_TEMP, &xprt->xpt_flags))
+		serv->sv_tmpcnt--;
 	spin_unlock_bh(&serv->sv_lock);
+
+	/* FIXME: Is this really needed here? */
+	BUG_ON(atomic_read(&xprt->xpt_ref.refcount) < 2);
+
+	svc_xprt_put(xprt);
 }
 
 void svc_close_xprt(struct svc_xprt *xprt)

-- 
Trond Myklebust
Linux NFS client maintainer

NetApp
Trond.Myklebust@netapp.com
www.netapp.com

^ permalink raw reply related	[flat|nested] 131+ messages in thread

* Re: [PATCH 3/3] SUNRPC: svc_xprt_enqueue should not refuse to enqueue 'XPT_DEAD' transports
@ 2008-12-01  0:20                       ` Trond Myklebust
  0 siblings, 0 replies; 131+ messages in thread
From: Trond Myklebust @ 2008-12-01  0:20 UTC (permalink / raw)
  To: Ian Campbell
  Cc: linux-nfs, Max Kellermann, linux-kernel, gcosta, Grant Coady,
	J. Bruce Fields, Tom Tucker

Aside from being racy (there is nothing preventing someone from setting
XPT_DEAD after the test in svc_xprt_enqueue, and before XPT_BUSY is set),
it is wrong to assume that transports which have called svc_delete_xprt()
will never need to be re-enqueued.

See the list of deferred requests, which is currently never going to
be cleared if the revisit call happens after svc_delete_xprt(). In this
case, the deferred request will currently keep a reference to the transport
forever.

The fix should be to allow dead transports to be enqueued in order to clear
the deferred requests, then change the order of processing in svc_recv() so
that we pick up deferred requests before we do the XPT_CLOSE processing.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
---

 net/sunrpc/svc_xprt.c |  124 +++++++++++++++++++++++++++----------------------
 1 files changed, 69 insertions(+), 55 deletions(-)


diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
index a417064..b54cf84 100644
--- a/net/sunrpc/svc_xprt.c
+++ b/net/sunrpc/svc_xprt.c
@@ -297,10 +297,15 @@ void svc_xprt_enqueue(struct svc_xprt *xprt)
 	struct svc_serv	*serv = xprt->xpt_server;
 	struct svc_pool *pool;
 	struct svc_rqst	*rqstp;
+	unsigned long flags;
 	int cpu;
 
-	if (!(xprt->xpt_flags &
-	      ((1<<XPT_CONN)|(1<<XPT_DATA)|(1<<XPT_CLOSE)|(1<<XPT_DEFERRED))))
+	flags = xprt->xpt_flags &
+		(1UL<<XPT_CONN | 1UL<<XPT_DATA | 1UL<<XPT_CLOSE |
+		 1UL<<XPT_DEAD | 1UL<<XPT_DEFERRED);
+	if (flags == 0)
+		return;
+	if ((flags & 1UL<<XPT_DEAD) != 0 && (flags & 1UL<<XPT_DEFERRED) == 0)
 		return;
 
 	cpu = get_cpu();
@@ -315,12 +320,6 @@ void svc_xprt_enqueue(struct svc_xprt *xprt)
 		       "svc_xprt_enqueue: "
 		       "threads and transports both waiting??\n");
 
-	if (test_bit(XPT_DEAD, &xprt->xpt_flags)) {
-		/* Don't enqueue dead transports */
-		dprintk("svc: transport %p is dead, not enqueued\n", xprt);
-		goto out_unlock;
-	}
-
 	/* Mark transport as busy. It will remain in this state until
 	 * the provider calls svc_xprt_received. We update XPT_BUSY
 	 * atomically because it also guards against trying to enqueue
@@ -566,6 +565,7 @@ static void svc_check_conn_limits(struct svc_serv *serv)
 int svc_recv(struct svc_rqst *rqstp, long timeout)
 {
 	struct svc_xprt		*xprt = NULL;
+	struct svc_xprt		*newxpt;
 	struct svc_serv		*serv = rqstp->rq_server;
 	struct svc_pool		*pool = rqstp->rq_pool;
 	int			len, i;
@@ -673,62 +673,76 @@ int svc_recv(struct svc_rqst *rqstp, long timeout)
 	spin_unlock_bh(&pool->sp_lock);
 
 	len = 0;
+
+	/*
+	 * Deal with deferred requests first, since they need to be
+	 * dequeued and dropped if the transport has been closed.
+	 */
+	rqstp->rq_deferred = svc_deferred_dequeue(xprt);
+	if (rqstp->rq_deferred) {
+		svc_xprt_received(xprt);
+		len = svc_deferred_recv(rqstp);
+	}
+
 	if (test_bit(XPT_CLOSE, &xprt->xpt_flags)) {
 		dprintk("svc_recv: found XPT_CLOSE\n");
 		svc_delete_xprt(xprt);
-	} else if (test_bit(XPT_LISTENER, &xprt->xpt_flags)) {
-		struct svc_xprt *newxpt;
-		newxpt = xprt->xpt_ops->xpo_accept(xprt);
-		if (newxpt) {
-			/*
-			 * We know this module_get will succeed because the
-			 * listener holds a reference too
-			 */
-			__module_get(newxpt->xpt_class->xcl_owner);
-			svc_check_conn_limits(xprt->xpt_server);
-			spin_lock_bh(&serv->sv_lock);
-			set_bit(XPT_TEMP, &newxpt->xpt_flags);
-			list_add(&newxpt->xpt_list, &serv->sv_tempsocks);
-			serv->sv_tmpcnt++;
-			if (serv->sv_temptimer.function == NULL) {
-				/* setup timer to age temp transports */
-				setup_timer(&serv->sv_temptimer,
-					    svc_age_temp_xprts,
-					    (unsigned long)serv);
-				mod_timer(&serv->sv_temptimer,
-					  jiffies + svc_conn_age_period * HZ);
-			}
-			spin_unlock_bh(&serv->sv_lock);
-			svc_xprt_received(newxpt);
-		}
-		svc_xprt_received(xprt);
-	} else {
-		dprintk("svc: server %p, pool %u, transport %p, inuse=%d\n",
-			rqstp, pool->sp_id, xprt,
-			atomic_read(&xprt->xpt_ref.refcount));
-		rqstp->rq_deferred = svc_deferred_dequeue(xprt);
-		if (rqstp->rq_deferred) {
-			svc_xprt_received(xprt);
-			len = svc_deferred_recv(rqstp);
-		} else
+		goto drop_request;
+	}
+
+	if (!test_bit(XPT_LISTENER, &xprt->xpt_flags)) {
+		if (len == 0) {
+			dprintk("svc: server %p, pool %u, transport %p, inuse=%d\n",
+					rqstp, pool->sp_id, xprt,
+					atomic_read(&xprt->xpt_ref.refcount));
 			len = xprt->xpt_ops->xpo_recvfrom(rqstp);
+
+			/* No data, incomplete (TCP) read, or accept() */
+			if (len == 0 || len == -EAGAIN)
+				goto drop_request;
+		}
+
 		dprintk("svc: got len=%d\n", len);
-	}
 
-	/* No data, incomplete (TCP) read, or accept() */
-	if (len == 0 || len == -EAGAIN) {
-		rqstp->rq_res.len = 0;
-		svc_xprt_release(rqstp);
-		return -EAGAIN;
+		clear_bit(XPT_OLD, &xprt->xpt_flags);
+
+		rqstp->rq_secure = svc_port_is_privileged(svc_addr(rqstp));
+		rqstp->rq_chandle.defer = svc_defer;
+
+		if (serv->sv_stats)
+			serv->sv_stats->netcnt++;
+		return len;
 	}
-	clear_bit(XPT_OLD, &xprt->xpt_flags);
 
-	rqstp->rq_secure = svc_port_is_privileged(svc_addr(rqstp));
-	rqstp->rq_chandle.defer = svc_defer;
+	newxpt = xprt->xpt_ops->xpo_accept(xprt);
+	if (newxpt) {
+		/*
+		 * We know this module_get will succeed because the
+		 * listener holds a reference too
+		 */
+		__module_get(newxpt->xpt_class->xcl_owner);
+		svc_check_conn_limits(xprt->xpt_server);
+		spin_lock_bh(&serv->sv_lock);
+		set_bit(XPT_TEMP, &newxpt->xpt_flags);
+		list_add(&newxpt->xpt_list, &serv->sv_tempsocks);
+		serv->sv_tmpcnt++;
+		if (serv->sv_temptimer.function == NULL) {
+			/* setup timer to age temp transports */
+			setup_timer(&serv->sv_temptimer,
+				    svc_age_temp_xprts,
+				    (unsigned long)serv);
+			mod_timer(&serv->sv_temptimer,
+				  jiffies + svc_conn_age_period * HZ);
+		}
+		spin_unlock_bh(&serv->sv_lock);
+		svc_xprt_received(newxpt);
+	}
+	svc_xprt_received(xprt);
 
-	if (serv->sv_stats)
-		serv->sv_stats->netcnt++;
-	return len;
+drop_request:
+	rqstp->rq_res.len = 0;
+	svc_xprt_release(rqstp);
+	return -EAGAIN;
 }
 EXPORT_SYMBOL(svc_recv);
 

-- 
Trond Myklebust
Linux NFS client maintainer

NetApp
Trond.Myklebust@netapp.com
www.netapp.com

^ permalink raw reply related	[flat|nested] 131+ messages in thread

* Re: [PATCH 0/3] NFS regression in 2.6.26?, "task blocked for more than 120 seconds"
@ 2008-12-01  0:29                       ` Trond Myklebust
  0 siblings, 0 replies; 131+ messages in thread
From: Trond Myklebust @ 2008-12-01  0:29 UTC (permalink / raw)
  To: Ian Campbell
  Cc: linux-nfs, Max Kellermann, linux-kernel, gcosta, Grant Coady,
	J. Bruce Fields, Tom Tucker

On Sun, 2008-11-30 at 19:17 -0500, Trond Myklebust wrote:
> Can you see if the following 3 patches help? They're against 2.6.28-rc6,
> but afaics the problems are pretty much the same on 2.6.26.
> 
> Cheers
>   Trond

Sorry... I forgot to add that these 3 patches need to be applied to the
nfs server, not the client.

Cheers
  Trond
-- 
Trond Myklebust
Linux NFS client maintainer

NetApp
Trond.Myklebust@netapp.com
www.netapp.com

^ permalink raw reply	[flat|nested] 131+ messages in thread

* Re: [PATCH 0/3] NFS regression in 2.6.26?, "task blocked for more than 120 seconds"
@ 2008-12-01 22:09                       ` Ian Campbell
  0 siblings, 0 replies; 131+ messages in thread
From: Ian Campbell @ 2008-12-01 22:09 UTC (permalink / raw)
  To: Trond Myklebust
  Cc: linux-nfs, Max Kellermann, linux-kernel, gcosta, Grant Coady,
	J. Bruce Fields, Tom Tucker

[-- Attachment #1: Type: text/plain, Size: 561 bytes --]

On Sun, 2008-11-30 at 19:17 -0500, Trond Myklebust wrote: 
> Can you see if the following 3 patches help? They're against 2.6.28-rc6,
> but afaics the problems are pretty much the same on 2.6.26.

Thanks.

The server was actually running 2.6.25.7, but the matching sources have
since been removed from backports.org, so I've reproduced with 2.6.26 and
now I'll add the patches.

Ian.

-- 
Ian Campbell

It has been said that man is a rational animal.  All my life I have
been searching for evidence which could support this.
		-- Bertrand Russell

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 197 bytes --]

^ permalink raw reply	[flat|nested] 131+ messages in thread

* Re: [PATCH 0/3] NFS regression in 2.6.26?, "task blocked for more than 120 seconds"
@ 2008-12-02 15:22                         ` Kasparek Tomas
  0 siblings, 0 replies; 131+ messages in thread
From: Kasparek Tomas @ 2008-12-02 15:22 UTC (permalink / raw)
  To: Trond Myklebust
  Cc: Ian Campbell, linux-nfs, Max Kellermann, linux-kernel, gcosta,
	Grant Coady, J. Bruce Fields, Tom Tucker

On Sun, Nov 30, 2008 at 07:29:40PM -0500, Trond Myklebust wrote:
> On Sun, 2008-11-30 at 19:17 -0500, Trond Myklebust wrote:
> > Can you see if the following 3 patches help? They're against 2.6.28-rc6,
> > but afaics the problems are pretty much the same on 2.6.26.
> 
> Sorry... I forgot to add that these 3 patches need to be applied to the
> nfs server, not the client.

Hi,

I have the problem on the client side and cannot change the server
(FreeBSD 7.0). These patches do not change the situation (and they are
probably not supposed to; I was just giving it a try). After a few minutes
I got this on the client with 2.6.28-rc6 with the patches:

tcp   0   0 147.229.12.146:674          147.229.176.14:2049     FIN_WAIT2

Reverting e06799f958bf7f9f8fae15f0c6f519953fb0257c as suggested by Ian
does help on the other side (with 2.6.27.4).

Bye

--   

  Tomas Kasparek, PhD student  E-mail: kasparek@fit.vutbr.cz
  CVT FIT VUT Brno, L127       Web:    http://www.fit.vutbr.cz/~kasparek
  Bozetechova 1, 612 66        Fax:    +420 54114-1270
  Brno, Czech Republic         Phone:  +420 54114-1220

  jabber: tomas.kasparek@jabber.cz
  GPG: 2F1E 1AAF FD3B CFA3 1537  63BD DCBE 18FF A035 53BC


^ permalink raw reply	[flat|nested] 131+ messages in thread

* Re: [PATCH 0/3] NFS regression in 2.6.26?, "task blocked for more than 120 seconds"
  2008-12-02 15:22                         ` Kasparek Tomas
@ 2008-12-02 15:37                           ` Trond Myklebust
  -1 siblings, 0 replies; 131+ messages in thread
From: Trond Myklebust @ 2008-12-02 15:37 UTC (permalink / raw)
  To: Kasparek Tomas
  Cc: Ian Campbell, linux-nfs, Max Kellermann, linux-kernel, gcosta,
	Grant Coady, J. Bruce Fields, Tom Tucker

On Tue, 2008-12-02 at 16:22 +0100, Kasparek Tomas wrote:
> On Sun, Nov 30, 2008 at 07:29:40PM -0500, Trond Myklebust wrote:
> > On Sun, 2008-11-30 at 19:17 -0500, Trond Myklebust wrote:
> > > Can you see if the following 3 patches help? They're against 2.6.28-rc6,
> > > but afaics the problems are pretty much the same on 2.6.26.
> > 
> > Sorry... I forgot to add that these 3 patches need to be applied to the
> > nfs server, not the client.
> 
> Hi,
> 
> I have the problem on client side and can not change server (FreeBSD 7.0). 
> these patches does not change the situation (and they are probably not
> supposed to do so, just giving it a try). After few minutes I got this on
> the client with 2.6.28-rc6 with patches:
> 
> tcp   0   0 147.229.12.146:674          147.229.176.14:2049     FIN_WAIT2
> 
> Applying reverse e06799f958bf7f9f8fae15f0c6f519953fb0257c suggested by Ian
> does help on the other side (with 2.6.27.4).

Then I suggest working around the problem by reducing the value of the
sysctl net.ipv4.tcp_fin_timeout on the client.
-- 
Trond Myklebust
Linux NFS client maintainer

NetApp
Trond.Myklebust@netapp.com
www.netapp.com

^ permalink raw reply	[flat|nested] 131+ messages in thread
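[Editor's note: for anyone following along, the workaround Trond suggests can be tried as below. net.ipv4.tcp_fin_timeout bounds how long a socket that the local side has already closed may linger in FIN_WAIT2; the value 10 is purely illustrative, and as the follow-ups show, it did not help in this particular case.]

```shell
# Inspect the current value (seconds; the default is 60):
sysctl net.ipv4.tcp_fin_timeout

# Lower it at runtime -- 10 is just an illustrative value:
sysctl -w net.ipv4.tcp_fin_timeout=10

# The procfs equivalent of the same setting:
echo 10 > /proc/sys/net/ipv4/tcp_fin_timeout
```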

* Re: [PATCH 0/3] NFS regression in 2.6.26?, "task blocked for more than 120 seconds"
@ 2008-12-02 16:26                             ` Kasparek Tomas
  0 siblings, 0 replies; 131+ messages in thread
From: Kasparek Tomas @ 2008-12-02 16:26 UTC (permalink / raw)
  To: Trond Myklebust
  Cc: Ian Campbell, linux-nfs, Max Kellermann, linux-kernel, gcosta,
	J. Bruce Fields, Tom Tucker

On Tue, Dec 02, 2008 at 10:37:02AM -0500, Trond Myklebust wrote:
> On Tue, 2008-12-02 at 16:22 +0100, Kasparek Tomas wrote:
> > On Sun, Nov 30, 2008 at 07:29:40PM -0500, Trond Myklebust wrote:
> > > On Sun, 2008-11-30 at 19:17 -0500, Trond Myklebust wrote:
> > > > Can you see if the following 3 patches help? They're against 2.6.28-rc6,
> > > > but afaics the problems are pretty much the same on 2.6.26.
> > > 
> > > Sorry... I forgot to add that these 3 patches need to be applied to the
> > > nfs server, not the client.
> > 
> > Hi,
> > 
> > I have the problem on client side and can not change server (FreeBSD 7.0). 
> > these patches does not change the situation (and they are probably not
> > supposed to do so, just giving it a try). After few minutes I got this on
> > the client with 2.6.28-rc6 with patches:
> > 
> > tcp   0   0 147.229.12.146:674          147.229.176.14:2049     FIN_WAIT2
> > 
> > Applying reverse e06799f958bf7f9f8fae15f0c6f519953fb0257c suggested by Ian
> > does help on the other side (with 2.6.27.4).
> 
> Then I suggest working around the problem by reducing the value of the
> sysctl net.ipv4.tcp_fin_timeout on the client.

I did try that. The value is in seconds and defaults to 60, yet these
connections are still there after several hours. Changing it to 10 seconds
gives the same behaviour. (BTW, the server has not changed in the last several months.)

--   

  Tomas Kasparek, PhD student  E-mail: kasparek@fit.vutbr.cz
  CVT FIT VUT Brno, L127       Web:    http://www.fit.vutbr.cz/~kasparek
  Bozetechova 1, 612 66        Fax:    +420 54114-1270
  Brno, Czech Republic         Phone:  +420 54114-1220

  jabber: tomas.kasparek@jabber.cz
  GPG: 2F1E 1AAF FD3B CFA3 1537  63BD DCBE 18FF A035 53BC


^ permalink raw reply	[flat|nested] 131+ messages in thread

* Re: [PATCH 0/3] NFS regression in 2.6.26?, "task blocked for more than 120 seconds"
  2008-12-02 16:26                             ` Kasparek Tomas
@ 2008-12-02 18:10                               ` Trond Myklebust
  -1 siblings, 0 replies; 131+ messages in thread
From: Trond Myklebust @ 2008-12-02 18:10 UTC (permalink / raw)
  To: Kasparek Tomas
  Cc: Ian Campbell, linux-nfs, Max Kellermann, linux-kernel, gcosta,
	J. Bruce Fields, Tom Tucker

On Tue, 2008-12-02 at 17:26 +0100, Kasparek Tomas wrote:

> Did tried. The number should be seconds and defaults to 60, These
> connections are still there after several hours. Changing it to 10 (sec)
> and same behaviour. (BTW The server did not changed in last several months)

Are you seeing the same behaviour with 'netstat -t'?

-- 
Trond Myklebust
Linux NFS client maintainer

NetApp
Trond.Myklebust@netapp.com
www.netapp.com

^ permalink raw reply	[flat|nested] 131+ messages in thread

* Re: [PATCH 0/3] NFS regression in 2.6.26?, "task blocked for more than 120 seconds"
       [not found]                               ` <1228241407.3090.7.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
@ 2008-12-04 10:23                                 ` Kasparek Tomas
       [not found]                                   ` <1229284201.6463.98.camel@heimdal.trondhjem.org>
  0 siblings, 1 reply; 131+ messages in thread
From: Kasparek Tomas @ 2008-12-04 10:23 UTC (permalink / raw)
  To: Trond Myklebust

On Tue, Dec 02, 2008 at 01:10:07PM -0500, Trond Myklebust wrote:
> On Tue, 2008-12-02 at 17:26 +0100, Kasparek Tomas wrote:
> 
> > Did tried. The number should be seconds and defaults to 60, These
> > connections are still there after several hours. Changing it to 10 (sec)
> > and same behaviour. (BTW The server did not changed in last several months)
> 
> Are you seeing the same behaviour with 'netstat -t'?

yes:

root@pckasparek: ~# ssh root@pcnlp1 'netstat -pan | grep WAIT' | cut -c-85
tcp    0   0 147.229.12.146:989          147.229.176.14:2049 FIN_WAIT2
root@pckasparek: ~# ssh root@pcnlp1 'netstat -t | grep WAIT' | cut -c-85
tcp    0   0 pcnlp1.fit.vutbr.:ftps-data eva.fit.vutbr.cz:nfs FIN_WAIT2

but it should be the same, shouldn't it? -t just selects TCP connections,
and since this is a TCP connection it shows the same thing

--   

  Tomas Kasparek, PhD student  E-mail: kasparek@fit.vutbr.cz
  CVT FIT VUT Brno, L127       Web:    http://www.fit.vutbr.cz/~kasparek
  Bozetechova 1, 612 66        Fax:    +420 54114-1270
  Brno, Czech Republic         Phone:  +420 54114-1220

  jabber: tomas.kasparek@jabber.cz
  GPG: 2F1E 1AAF FD3B CFA3 1537  63BD DCBE 18FF A035 53BC


^ permalink raw reply	[flat|nested] 131+ messages in thread
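[Editor's note: the two listings above differ only because plain `netstat -t` resolves host and service names; `netstat -tn` keeps everything numeric. The stuck sockets can then be counted mechanically, as in this sketch. The two sample lines are canned input so the filter is self-contained; on a live client you would pipe `netstat -tn` in instead.]

```shell
# Count client sockets stuck in FIN_WAIT2 against an NFS server (port 2049).
# netstat -tn columns: Proto Recv-Q Send-Q Local-Address Foreign-Address State
printf '%s\n' \
  'tcp 0 0 147.229.12.146:989 147.229.176.14:2049 FIN_WAIT2' \
  'tcp 0 0 147.229.12.146:22 147.229.176.20:51234 ESTABLISHED' \
| awk '$6 == "FIN_WAIT2" && $5 ~ /:2049$/ { n++ } END { print n+0 }'
# prints 1
```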

* Re: [PATCH 0/3] NFS regression in 2.6.26?, "task blocked for more than 120 seconds"
@ 2008-12-06 12:16                         ` Ian Campbell
  0 siblings, 0 replies; 131+ messages in thread
From: Ian Campbell @ 2008-12-06 12:16 UTC (permalink / raw)
  To: Trond Myklebust
  Cc: linux-nfs, Max Kellermann, linux-kernel, gcosta, Grant Coady,
	J. Bruce Fields, Tom Tucker

[-- Attachment #1: Type: text/plain, Size: 1063 bytes --]

On Mon, 2008-12-01 at 22:09 +0000, Ian Campbell wrote:
> On Sun, 2008-11-30 at 19:17 -0500, Trond Myklebust wrote: 
> > Can you see if the following 3 patches help? They're against 2.6.28-rc6,
> > but afaics the problems are pretty much the same on 2.6.26.
> 
> Thanks.
> 
> The server was actually running 2.6.25.7 but the matching sources have
> since been removed the backports.org so I've reproduce with 2.6.26 and
> now I'll add the patches.

Just a small progress report. Anecdotally I thought that unpatched
2.6.26.7 was worse than 2.6.25.7, mostly because it hung twice in the ~1
day I was running it where previously it was less frequent than once per
day.

With the patched server the client ran OK for 2.5 days then mysteriously
hung, the logs show none of the normal symptoms and my wife reset it
before I got home so I've no real clue what happened but I'm inclined to
think it was unrelated for now. I'll get back to you in a week or so if
the problem hasn't reoccurred.

Ian.
-- 
Ian Campbell

It's later than you think.

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 197 bytes --]

^ permalink raw reply	[flat|nested] 131+ messages in thread

* Re: [PATCH 0/3] NFS regression in 2.6.26?, "task blocked for more than 120 seconds"
@ 2008-12-14 18:24                           ` Ian Campbell
  0 siblings, 0 replies; 131+ messages in thread
From: Ian Campbell @ 2008-12-14 18:24 UTC (permalink / raw)
  To: Trond Myklebust
  Cc: linux-nfs, Max Kellermann, linux-kernel, gcosta, Grant Coady,
	J. Bruce Fields, Tom Tucker

[-- Attachment #1: Type: text/plain, Size: 1440 bytes --]

On Sat, 2008-12-06 at 12:16 +0000, Ian Campbell wrote:
> On Mon, 2008-12-01 at 22:09 +0000, Ian Campbell wrote:
> > On Sun, 2008-11-30 at 19:17 -0500, Trond Myklebust wrote: 
> > > Can you see if the following 3 patches help? They're against 2.6.28-rc6,
> > > but afaics the problems are pretty much the same on 2.6.26.
> > 
> > Thanks.
> > 
> > The server was actually running 2.6.25.7 but the matching sources have
> > since been removed the backports.org so I've reproduce with 2.6.26 and
> > now I'll add the patches.
> 
> Just a small progress report. Anecdotally I thought that unpatched
> 2.6.26.7 was worse than 2.6.25.7, mostly because it hung twice in the ~1
> day I was running it where previously it was less frequent than once per
> day.
> 
> With the patched server the client ran OK for 2.5 days then mysteriously
> hung, the logs show none of the normal symptoms and my wife reset it
> before I got home so I've no real clue what happened but I'm inclined to
> think it was unrelated for now. I'll get back to you in a week or so if
> the problem hasn't reoccurred.

$ uptime 
 18:15:29 up 9 days, 22 min,  1 user,  load average: 0.74, 0.64, 0.46

This is on the problematic client, so it looks like the server side fix
has sorted it. Thanks very much Trond.

Ian.
-- 
Ian Campbell

You have only to mumble a few words in church to get married and few words
in your sleep to get divorced.

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 197 bytes --]

^ permalink raw reply	[flat|nested] 131+ messages in thread

* Re: [PATCH 0/3] NFS regression in 2.6.26?, "task blocked for more than 120 seconds"
       [not found]                                     ` <1229284201.6463.98.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
@ 2008-12-16 12:05                                       ` Kasparek Tomas
  2008-12-16 12:10                                         ` Kasparek Tomas
  2008-12-23 22:34                                         ` Trond Myklebust
  0 siblings, 2 replies; 131+ messages in thread
From: Kasparek Tomas @ 2008-12-16 12:05 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: linux-nfs

On Sun, Dec 14, 2008 at 02:50:01PM -0500, Trond Myklebust wrote:
> On Thu, 2008-12-04 at 11:23 +0100, Kasparek Tomas wrote:
> > On Tue, Dec 02, 2008 at 01:10:07PM -0500, Trond Myklebust wrote:
> > > On Tue, 2008-12-02 at 17:26 +0100, Kasparek Tomas wrote:
> > > 
> > > > Did tried. The number should be seconds and defaults to 60, These
> > > > connections are still there after several hours. Changing it to 10 (sec)
> > > > and same behaviour. (BTW The server did not changed in last several months)
> > > 
> > > Are you seeing the same behaviour with 'netstat -t'?
> > 
> > yes:
> > 
> > root@pckasparek: ~# ssh root@pcnlp1 'netstat -pan | grep WAIT' | cut -c-85
> > tcp    0   0 147.229.12.146:989          147.229.176.14:2049 FIN_WAIT2
> > root@pckasparek: ~# ssh root@pcnlp1 'netstat -t | grep WAIT' | cut -c-85
> > tcp    0   0 pcnlp1.fit.vutbr.:ftps-data eva.fit.vutbr.cz:nfs FIN_WAIT2
> > 
> > but it should be the same, did't it? -t just selects TCP connections and
> > this is TCP connection so it shows the same
> 
> Right, but the point is that the client is in the state FIN_WAIT2, which
> means that it has closed the socket on its end, and is waiting for the
> server to close on its end. The fact that the server is failing to do
> this is a server bug.
> 
> That said, we can't wait forever for buggy servers. I see now why the
> linger2 stuff isn't working. I believe that the appended patch should
> help...

Hm, I'm not happy to say it, but it still does not work after some time. Now
the problem is the opposite: there are no connections to the server according
to netstat on the client; just from time to time there is

pcnlp1.fit.vutbr.cz.15234 > kazi.fit.vutbr.cz.nfs: 40 null
kazi.fit.vutbr.cz.nfs > pcnlp1.fit.vutbr.cz.15234: reply ok 24 null

(kazi is the server). I will try to investigate in more detail.

(Just as a reminder, the same kernel with
e06799f958bf7f9f8fae15f0c6f519953fb0257c reverted works fine - the exact
patch is attached; it was slightly modified to fit 2.6.27.x kernels.)

Thank you very much for your help so far.

--   

  Tomas Kasparek, PhD student  E-mail: kasparek@fit.vutbr.cz
  CVT FIT VUT Brno, L127       Web:    http://www.fit.vutbr.cz/~kasparek
  Bozetechova 1, 612 66        Fax:    +420 54114-1270
  Brno, Czech Republic         Phone:  +420 54114-1220

  jabber: tomas.kasparek@jabber.cz
  GPG: 2F1E 1AAF FD3B CFA3 1537  63BD DCBE 18FF A035 53BC


^ permalink raw reply	[flat|nested] 131+ messages in thread

* Re: [PATCH 0/3] NFS regression in 2.6.26?, "task blocked for more than 120 seconds"
  2008-12-16 12:05                                       ` Kasparek Tomas
@ 2008-12-16 12:10                                         ` Kasparek Tomas
  2008-12-16 12:59                                           ` Trond Myklebust
  2008-12-23 22:34                                         ` Trond Myklebust
  1 sibling, 1 reply; 131+ messages in thread
From: Kasparek Tomas @ 2008-12-16 12:10 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: linux-nfs

[-- Attachment #1: Type: text/plain, Size: 2596 bytes --]

On Tue, Dec 16, 2008 at 01:05:47PM +0100, Kasparek Tomas wrote:
> On Sun, Dec 14, 2008 at 02:50:01PM -0500, Trond Myklebust wrote:
> > On Thu, 2008-12-04 at 11:23 +0100, Kasparek Tomas wrote:
> > > On Tue, Dec 02, 2008 at 01:10:07PM -0500, Trond Myklebust wrote:
> > > > On Tue, 2008-12-02 at 17:26 +0100, Kasparek Tomas wrote:
> > > > 
> > > > > Did tried. The number should be seconds and defaults to 60, These
> > > > > connections are still there after several hours. Changing it to 10 (sec)
> > > > > and same behaviour. (BTW The server did not changed in last several months)
> > > > 
> > > > Are you seeing the same behaviour with 'netstat -t'?
> > > 
> > > yes:
> > > 
> > > root@pckasparek: ~# ssh root@pcnlp1 'netstat -pan | grep WAIT' | cut -c-85
> > > tcp    0   0 147.229.12.146:989          147.229.176.14:2049 FIN_WAIT2
> > > root@pckasparek: ~# ssh root@pcnlp1 'netstat -t | grep WAIT' | cut -c-85
> > > tcp    0   0 pcnlp1.fit.vutbr.:ftps-data eva.fit.vutbr.cz:nfs FIN_WAIT2
> > > 
> > > but it should be the same, did't it? -t just selects TCP connections and
> > > this is TCP connection so it shows the same
> > 
> > Right, but the point is that the client is in the state FIN_WAIT2, which
> > means that it has closed the socket on its end, and is waiting for the
> > server to close on its end. The fact that the server is failing to do
> > this is a server bug.
> > 
> > That said, we can't wait forever for buggy servers. I see now why the
> > linger2 stuff isn't working. I believe that the appended patch should
> > help...
> 
> Hm, not happy to say that but it still does not work after some time. Now
> the problem is opposite there are no connections to the server according to
> netstat on client, just time to time there is
> 
> pcnlp1.fit.vutbr.cz.15234 > kazi.fit.vutbr.cz.nfs: 40 null
> kazi.fit.vutbr.cz.nfs > pcnlp1.fit.vutbr.cz.15234: reply ok 24 null
> 
> (kazi is server). Will try to investigate more details.
> 
> (just to remember the same kernel with reversed
> e06799f958bf7f9f8fae15f0c6f519953fb0257c works fine - exact patch is
> included - it was slightly modified to fit 2.6.27.x kernels)
> 
> Thank you very much for your help so far.

Here is the forgotten patch I promised.

--   

  Tomas Kasparek, PhD student  E-mail: kasparek@fit.vutbr.cz
  CVT FIT VUT Brno, L127       Web:    http://www.fit.vutbr.cz/~kasparek
  Bozetechova 1, 612 66        Fax:    +420 54114-1270
  Brno, Czech Republic         Phone:  +420 54114-1220

  jabber: tomas.kasparek@jabber.cz
  GPG: 2F1E 1AAF FD3B CFA3 1537  63BD DCBE 18FF A035 53BC


[-- Attachment #2: linux-2.6.git-e06799f958bf7f9f8fae15f0c6f519953fb0257c-own-modified.patch --]
[-- Type: text/plain, Size: 2417 bytes --]

diff -ruN linux-2.6.27.4/net/sunrpc/xprtsock.c linux-2.6.27.4-64/net/sunrpc/xprtsock.c
--- linux-2.6.27.4/net/sunrpc/xprtsock.c	2008-11-04 14:30:26.000000000 +0100
+++ linux-2.6.27.4-64/net/sunrpc/xprtsock.c	2008-11-25 19:11:34.000000000 +0100
@@ -615,22 +615,6 @@
 	return status;
 }
 
-/**
- * xs_tcp_shutdown - gracefully shut down a TCP socket
- * @xprt: transport
- *
- * Initiates a graceful shutdown of the TCP socket by calling the
- * equivalent of shutdown(SHUT_WR);
- */
-static void xs_tcp_shutdown(struct rpc_xprt *xprt)
-{
-	struct sock_xprt *transport = container_of(xprt, struct sock_xprt, xprt);
-	struct socket *sock = transport->sock;
-
-	if (sock != NULL)
-		kernel_sock_shutdown(sock, SHUT_WR);
-}
-
 static inline void xs_encode_tcp_record_marker(struct xdr_buf *buf)
 {
 	u32 reclen = buf->len - sizeof(rpc_fraghdr);
@@ -709,7 +693,8 @@
 		dprintk("RPC:       sendmsg returned unrecognized error %d\n",
 			-status);
 		clear_bit(SOCK_ASYNC_NOSPACE, &transport->sock->flags);
-		xs_tcp_shutdown(xprt);
+ 		xprt_disconnect_done(xprt);
+        break;
 	}
 
 	return status;
@@ -1670,7 +1655,8 @@
 				break;
 			default:
 				/* get rid of existing socket, and retry */
-				xs_tcp_shutdown(xprt);
+				xs_close(xprt);
+				break;
 		}
 	}
 out:
@@ -1729,7 +1715,8 @@
 				break;
 			default:
 				/* get rid of existing socket, and retry */
-				xs_tcp_shutdown(xprt);
+				xs_close(xprt);
+				break;
 		}
 	}
 out:
@@ -1776,19 +1763,6 @@
 	}
 }
 
-static void xs_tcp_connect(struct rpc_task *task)
-{
-	struct rpc_xprt *xprt = task->tk_xprt;
-
-	/* Initiate graceful shutdown of the socket if not already done */
-	if (test_bit(XPRT_CONNECTED, &xprt->state))
-		xs_tcp_shutdown(xprt);
-	/* Exit if we need to wait for socket shutdown to complete */
-	if (test_bit(XPRT_CLOSING, &xprt->state))
-		return;
-	xs_connect(task);
-}
-
 /**
  * xs_udp_print_stats - display UDP socket-specifc stats
  * @xprt: rpc_xprt struct containing statistics
@@ -1859,12 +1833,12 @@
 	.release_xprt		= xs_tcp_release_xprt,
 	.rpcbind		= rpcb_getport_async,
 	.set_port		= xs_set_port,
-	.connect		= xs_tcp_connect,
+	.connect		= xs_connect,
 	.buf_alloc		= rpc_malloc,
 	.buf_free		= rpc_free,
 	.send_request		= xs_tcp_send_request,
 	.set_retrans_timeout	= xprt_set_retrans_timeout_def,
-	.close			= xs_tcp_shutdown,
+	.close			= xs_close,
 	.destroy		= xs_destroy,
 	.print_stats		= xs_tcp_print_stats,
 };

^ permalink raw reply	[flat|nested] 131+ messages in thread

* Re: [PATCH 0/3] NFS regression in 2.6.26?, "task blocked for more than 120 seconds"
  2008-12-16 12:10                                         ` Kasparek Tomas
@ 2008-12-16 12:59                                           ` Trond Myklebust
  0 siblings, 0 replies; 131+ messages in thread
From: Trond Myklebust @ 2008-12-16 12:59 UTC (permalink / raw)
  To: Kasparek Tomas; +Cc: linux-nfs

On Tue, 2008-12-16 at 13:10 +0100, Kasparek Tomas wrote:
> On Tue, Dec 16, 2008 at 01:05:47PM +0100, Kasparek Tomas wrote:
> > On Sun, Dec 14, 2008 at 02:50:01PM -0500, Trond Myklebust wrote:
> > > On Thu, 2008-12-04 at 11:23 +0100, Kasparek Tomas wrote:
> > > > On Tue, Dec 02, 2008 at 01:10:07PM -0500, Trond Myklebust wrote:
> > > > > On Tue, 2008-12-02 at 17:26 +0100, Kasparek Tomas wrote:
> > > > > 
> > > > > > Did tried. The number should be seconds and defaults to 60, These
> > > > > > connections are still there after several hours. Changing it to 10 (sec)
> > > > > > and same behaviour. (BTW The server did not changed in last several months)
> > > > > 
> > > > > Are you seeing the same behaviour with 'netstat -t'?
> > > > 
> > > > yes:
> > > > 
> > > > root@pckasparek: ~# ssh root@pcnlp1 'netstat -pan | grep WAIT' | cut -c-85
> > > > tcp    0   0 147.229.12.146:989          147.229.176.14:2049 FIN_WAIT2
> > > > root@pckasparek: ~# ssh root@pcnlp1 'netstat -t | grep WAIT' | cut -c-85
> > > > tcp    0   0 pcnlp1.fit.vutbr.:ftps-data eva.fit.vutbr.cz:nfs FIN_WAIT2
> > > > 
> > > > but it should be the same, did't it? -t just selects TCP connections and
> > > > this is TCP connection so it shows the same
> > > 
> > > Right, but the point is that the client is in the state FIN_WAIT2, which
> > > means that it has closed the socket on its end, and is waiting for the
> > > server to close on its end. The fact that the server is failing to do
> > > this is a server bug.
> > > 
> > > That said, we can't wait forever for buggy servers. I see now why the
> > > linger2 stuff isn't working. I believe that the appended patch should
> > > help...
> > 
> > Hm, not happy to say that but it still does not work after some time. Now
> > the problem is opposite there are no connections to the server according to
> > netstat on client, just time to time there is
> > 
> > pcnlp1.fit.vutbr.cz.15234 > kazi.fit.vutbr.cz.nfs: 40 null
> > kazi.fit.vutbr.cz.nfs > pcnlp1.fit.vutbr.cz.15234: reply ok 24 null
> > 
> > (kazi is server). Will try to investigate more details.
> > 
> > (just to remember the same kernel with reversed
> > e06799f958bf7f9f8fae15f0c6f519953fb0257c works fine - exact patch is
> > included - it was slightly modified to fit 2.6.27.x kernels)
> > 
> > Thank you very much for your help so far.
> 
> just the forgoten patch promised.

NACK. That takes us right back to the previous broken behaviour.

-- 
Trond Myklebust
Linux NFS client maintainer

NetApp
Trond.Myklebust@netapp.com
www.netapp.com

^ permalink raw reply	[flat|nested] 131+ messages in thread

* Re: [PATCH 0/3] NFS regression in 2.6.26?, "task blocked for more than 120 seconds"
@ 2008-12-16 17:55                             ` J. Bruce Fields
  0 siblings, 0 replies; 131+ messages in thread
From: J. Bruce Fields @ 2008-12-16 17:55 UTC (permalink / raw)
  To: Ian Campbell
  Cc: Trond Myklebust, linux-nfs, Max Kellermann, linux-kernel, gcosta,
	Grant Coady, Tom Tucker

On Sun, Dec 14, 2008 at 06:24:05PM +0000, Ian Campbell wrote:
> On Sat, 2008-12-06 at 12:16 +0000, Ian Campbell wrote:
> > On Mon, 2008-12-01 at 22:09 +0000, Ian Campbell wrote:
> > > On Sun, 2008-11-30 at 19:17 -0500, Trond Myklebust wrote: 
> > > > Can you see if the following 3 patches help? They're against 2.6.28-rc6,
> > > > but afaics the problems are pretty much the same on 2.6.26.
> > > 
> > > Thanks.
> > > 
> > > The server was actually running 2.6.25.7 but the matching sources have
> > > since been removed from backports.org, so I've reproduced with 2.6.26 and
> > > now I'll add the patches.
> > 
> > Just a small progress report. Anecdotally I thought that unpatched
> > 2.6.26.7 was worse than 2.6.25.7, mostly because it hung twice in the ~1
> > day I was running it where previously it was less frequent than once per
> > day.
> > 
> > With the patched server the client ran OK for 2.5 days then mysteriously
> > hung, the logs show none of the normal symptoms and my wife reset it
> > before I got home so I've no real clue what happened but I'm inclined to
> > think it was unrelated for now. I'll get back to you in a week or so if
> > the problem hasn't reoccurred.
> 
> $ uptime 
>  18:15:29 up 9 days, 22 min,  1 user,  load average: 0.74, 0.64, 0.46
> 
> This is on the problematic client, so it looks like the server side fix
> has sorted it. Thanks very much Trond.

Thanks for the testing!  So this was with the following three patches
applied on the server on top of 2.6.26?

	[PATCH 1/3] SUNRPC: Ensure the server closes sockets in a timely fashion
	[PATCH 2/3] SUNRPC: We only need to call svc_delete_xprt() once...
	[PATCH 3/3] SUNRPC: svc_xprt_enqueue should not refuse to enqueue 'XPT_DEAD' transports

I'll try to take a look at these before I leave for the holidays,
assuming the versions Trond posted on Nov. 30 are the latest.

--b.

^ permalink raw reply	[flat|nested] 131+ messages in thread

* Re: [PATCH 0/3] NFS regression in 2.6.26?, "task blocked for more than 120 seconds"
  2008-12-16 17:55                             ` J. Bruce Fields
@ 2008-12-16 18:39                               ` Ian Campbell
  -1 siblings, 0 replies; 131+ messages in thread
From: Ian Campbell @ 2008-12-16 18:39 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: Trond Myklebust, linux-nfs, Max Kellermann, linux-kernel, gcosta,
	Grant Coady, Tom Tucker

[-- Attachment #1: Type: text/plain, Size: 2195 bytes --]

On Tue, 2008-12-16 at 12:55 -0500, J. Bruce Fields wrote:
> On Sun, Dec 14, 2008 at 06:24:05PM +0000, Ian Campbell wrote:
> > On Sat, 2008-12-06 at 12:16 +0000, Ian Campbell wrote:
> > > On Mon, 2008-12-01 at 22:09 +0000, Ian Campbell wrote:
> > > > On Sun, 2008-11-30 at 19:17 -0500, Trond Myklebust wrote: 
> > > > > Can you see if the following 3 patches help? They're against 2.6.28-rc6,
> > > > > but afaics the problems are pretty much the same on 2.6.26.
> > > > 
> > > > Thanks.
> > > > 
> > > > The server was actually running 2.6.25.7 but the matching sources have
> > > > since been removed from backports.org, so I've reproduced with 2.6.26 and
> > > > now I'll add the patches.
> > > 
> > > Just a small progress report. Anecdotally I thought that unpatched
> > > 2.6.26.7 was worse than 2.6.25.7, mostly because it hung twice in the ~1
> > > day I was running it where previously it was less frequent than once per
> > > day.
> > > 
> > > With the patched server the client ran OK for 2.5 days then mysteriously
> > > hung, the logs show none of the normal symptoms and my wife reset it
> > > before I got home so I've no real clue what happened but I'm inclined to
> > > think it was unrelated for now. I'll get back to you in a week or so if
> > > the problem hasn't reoccurred.
> > 
> > $ uptime 
> >  18:15:29 up 9 days, 22 min,  1 user,  load average: 0.74, 0.64, 0.46
> > 
> > This is on the problematic client, so it looks like the server side fix
> > has sorted it. Thanks very much Trond.
> 
> Thanks for the testing!  So this was with the following three patches
> applied on the server on top of 2.6.26?
> 
> 	[PATCH 1/3] SUNRPC: Ensure the server closes sockets in a timely fashion
> 	[PATCH 2/3] SUNRPC: We only need to call svc_delete_xprt() once...
> 	[PATCH 3/3] SUNRPC: svc_xprt_enqueue should not refuse to enqueue 'XPT_DEAD' transports

That's right, it was actually 2.6.26.7 FWIW.

> I'll try to take a look at these before I leave for the holidays,
> assuming the versions Trond posted on Nov. 30 are the latest.

Thanks.

Ian.
-- 
Ian Campbell

The light of a hundred stars does not equal the light of the moon.

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 197 bytes --]

^ permalink raw reply	[flat|nested] 131+ messages in thread

* Re: [PATCH 1/3] SUNRPC: Ensure the server closes sockets in a timely fashion
@ 2008-12-17 15:27                         ` Tom Tucker
  0 siblings, 0 replies; 131+ messages in thread
From: Tom Tucker @ 2008-12-17 15:27 UTC (permalink / raw)
  To: Trond Myklebust
  Cc: Ian Campbell, linux-nfs, Max Kellermann, linux-kernel, gcosta,
	Grant Coady, J. Bruce Fields

Trond Myklebust wrote:
> We want to ensure that connected sockets close down the connection when we
> set XPT_CLOSE, so that we don't keep it hanging while cleaning up all the
> stuff that is keeping a reference to the socket.
> 
> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
> ---
> 
>  net/sunrpc/svcsock.c |   21 ++++++++++++++++++++-
>  1 files changed, 20 insertions(+), 1 deletions(-)
> 
> 
> diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
> index 95293f5..a1b048d 100644
> --- a/net/sunrpc/svcsock.c
> +++ b/net/sunrpc/svcsock.c
> @@ -59,6 +59,7 @@ static void		svc_udp_data_ready(struct sock *, int);
>  static int		svc_udp_recvfrom(struct svc_rqst *);
>  static int		svc_udp_sendto(struct svc_rqst *);
>  static void		svc_sock_detach(struct svc_xprt *);
> +static void		svc_tcp_sock_detach(struct svc_xprt *);
>  static void		svc_sock_free(struct svc_xprt *);
>  
>  static struct svc_xprt *svc_create_socket(struct svc_serv *, int,
> @@ -1017,7 +1018,7 @@ static struct svc_xprt_ops svc_tcp_ops = {
>  	.xpo_recvfrom = svc_tcp_recvfrom,
>  	.xpo_sendto = svc_tcp_sendto,
>  	.xpo_release_rqst = svc_release_skb,
> -	.xpo_detach = svc_sock_detach,
> +	.xpo_detach = svc_tcp_sock_detach,
>  	.xpo_free = svc_sock_free,
>  	.xpo_prep_reply_hdr = svc_tcp_prep_reply_hdr,
>  	.xpo_has_wspace = svc_tcp_has_wspace,
> @@ -1282,6 +1283,24 @@ static void svc_sock_detach(struct svc_xprt *xprt)
>  	sk->sk_state_change = svsk->sk_ostate;
>  	sk->sk_data_ready = svsk->sk_odata;
>  	sk->sk_write_space = svsk->sk_owspace;
> +
> +	if (sk->sk_sleep && waitqueue_active(sk->sk_sleep))
> +		wake_up_interruptible(sk->sk_sleep);
> +}
> +
> +/*
> + * Disconnect the socket, and reset the callbacks
> + */
> +static void svc_tcp_sock_detach(struct svc_xprt *xprt)
> +{
> +	struct svc_sock *svsk = container_of(xprt, struct svc_sock, sk_xprt);
> +
> +	dprintk("svc: svc_tcp_sock_detach(%p)\n", svsk);
> +
> +	svc_sock_detach(xprt);
> +
> +	if (!test_bit(XPT_LISTENER, &xprt->xpt_flags))
> +		kernel_sock_shutdown(svsk->sk_sock, SHUT_RDWR);

How is this different than what happens as an artifact of sock_release?

>  }
>  
>  /*
> 


^ permalink raw reply	[flat|nested] 131+ messages in thread

* Re: [PATCH 3/3] SUNRPC: svc_xprt_enqueue should not refuse to enqueue 'XPT_DEAD' transports
  2008-12-01  0:20                       ` Trond Myklebust
  (?)
@ 2008-12-17 15:35                       ` Tom Tucker
  2008-12-17 19:07                           ` Trond Myklebust
  -1 siblings, 1 reply; 131+ messages in thread
From: Tom Tucker @ 2008-12-17 15:35 UTC (permalink / raw)
  To: Trond Myklebust
  Cc: Ian Campbell, linux-nfs, Max Kellermann, linux-kernel, gcosta,
	Grant Coady, J. Bruce Fields

Trond Myklebust wrote:
> Aside from being racy (there is nothing preventing someone setting XPT_DEAD
> after the test in svc_xprt_enqueue, and before XPT_BUSY is set), it is
> wrong to assume that transports which have called svc_delete_xprt() might
> not need to be re-enqueued.

This is only true because now you allow transports with XPT_DEAD set to 
be enqueued -- yes?

> 
> See the list of deferred requests, which is currently never going to
> be cleared if the revisit call happens after svc_delete_xprt(). In this
> case, the deferred request will currently keep a reference to the transport
> forever.
>

I agree this is a possibility and it needs to be fixed. I'm concerned 
that the root cause is still there though. I thought the test case was 
the client side timing out the connection. Why are there deferred 
requests sitting on what is presumably an idle connection?


> The fix should be to allow dead transports to be enqueued in order to clear
> the deferred requests, then change the order of processing in svc_recv() so
> that we pick up deferred requests before we do the XPT_CLOSE processing.
> 

Wouldn't it be simpler to clean up any deferred requests in the close 
path instead of changing the meaning of XPT_DEAD and dispatching 
N-threads to do the same?

> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
> ---
> 
>  net/sunrpc/svc_xprt.c |  124 +++++++++++++++++++++++++++----------------------
>  1 files changed, 69 insertions(+), 55 deletions(-)
> 
> 
> diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
> index a417064..b54cf84 100644
> --- a/net/sunrpc/svc_xprt.c
> +++ b/net/sunrpc/svc_xprt.c
> @@ -297,10 +297,15 @@ void svc_xprt_enqueue(struct svc_xprt *xprt)
>  	struct svc_serv	*serv = xprt->xpt_server;
>  	struct svc_pool *pool;
>  	struct svc_rqst	*rqstp;
> +	unsigned long flags;
>  	int cpu;
>  
> -	if (!(xprt->xpt_flags &
> -	      ((1<<XPT_CONN)|(1<<XPT_DATA)|(1<<XPT_CLOSE)|(1<<XPT_DEFERRED))))
> +	flags = xprt->xpt_flags &
> +		(1UL<<XPT_CONN | 1UL<<XPT_DATA | 1UL<<XPT_CLOSE |
> +		 1UL<<XPT_DEAD | 1UL<<XPT_DEFERRED);
> +	if (flags == 0)
> +		return;
> +	if ((flags & 1UL<<XPT_DEAD) != 0 && (flags & 1UL<<XPT_DEFERRED) == 0)
>  		return;
>  
>  	cpu = get_cpu();
> @@ -315,12 +320,6 @@ void svc_xprt_enqueue(struct svc_xprt *xprt)
>  		       "svc_xprt_enqueue: "
>  		       "threads and transports both waiting??\n");
>  
> -	if (test_bit(XPT_DEAD, &xprt->xpt_flags)) {
> -		/* Don't enqueue dead transports */
> -		dprintk("svc: transport %p is dead, not enqueued\n", xprt);
> -		goto out_unlock;
> -	}
> -
>  	/* Mark transport as busy. It will remain in this state until
>  	 * the provider calls svc_xprt_received. We update XPT_BUSY
>  	 * atomically because it also guards against trying to enqueue
> @@ -566,6 +565,7 @@ static void svc_check_conn_limits(struct svc_serv *serv)
>  int svc_recv(struct svc_rqst *rqstp, long timeout)
>  {
>  	struct svc_xprt		*xprt = NULL;
> +	struct svc_xprt		*newxpt;
>  	struct svc_serv		*serv = rqstp->rq_server;
>  	struct svc_pool		*pool = rqstp->rq_pool;
>  	int			len, i;
> @@ -673,62 +673,76 @@ int svc_recv(struct svc_rqst *rqstp, long timeout)
>  	spin_unlock_bh(&pool->sp_lock);
>  
>  	len = 0;
> +
> +	/*
> +	 * Deal with deferred requests first, since they need to be
> +	 * dequeued and dropped if the transport has been closed.
> +	 */
> +	rqstp->rq_deferred = svc_deferred_dequeue(xprt);
> +	if (rqstp->rq_deferred) {
> +		svc_xprt_received(xprt);
> +		len = svc_deferred_recv(rqstp);
> +	}
> +
>  	if (test_bit(XPT_CLOSE, &xprt->xpt_flags)) {
>  		dprintk("svc_recv: found XPT_CLOSE\n");
>  		svc_delete_xprt(xprt);
> -	} else if (test_bit(XPT_LISTENER, &xprt->xpt_flags)) {
> -		struct svc_xprt *newxpt;
> -		newxpt = xprt->xpt_ops->xpo_accept(xprt);
> -		if (newxpt) {
> -			/*
> -			 * We know this module_get will succeed because the
> -			 * listener holds a reference too
> -			 */
> -			__module_get(newxpt->xpt_class->xcl_owner);
> -			svc_check_conn_limits(xprt->xpt_server);
> -			spin_lock_bh(&serv->sv_lock);
> -			set_bit(XPT_TEMP, &newxpt->xpt_flags);
> -			list_add(&newxpt->xpt_list, &serv->sv_tempsocks);
> -			serv->sv_tmpcnt++;
> -			if (serv->sv_temptimer.function == NULL) {
> -				/* setup timer to age temp transports */
> -				setup_timer(&serv->sv_temptimer,
> -					    svc_age_temp_xprts,
> -					    (unsigned long)serv);
> -				mod_timer(&serv->sv_temptimer,
> -					  jiffies + svc_conn_age_period * HZ);
> -			}
> -			spin_unlock_bh(&serv->sv_lock);
> -			svc_xprt_received(newxpt);
> -		}
> -		svc_xprt_received(xprt);
> -	} else {
> -		dprintk("svc: server %p, pool %u, transport %p, inuse=%d\n",
> -			rqstp, pool->sp_id, xprt,
> -			atomic_read(&xprt->xpt_ref.refcount));
> -		rqstp->rq_deferred = svc_deferred_dequeue(xprt);
> -		if (rqstp->rq_deferred) {
> -			svc_xprt_received(xprt);
> -			len = svc_deferred_recv(rqstp);
> -		} else
> +		goto drop_request;
> +	}
> +
> +	if (!test_bit(XPT_LISTENER, &xprt->xpt_flags)) {
> +		if (len == 0) {
> +			dprintk("svc: server %p, pool %u, transport %p, inuse=%d\n",
> +					rqstp, pool->sp_id, xprt,
> +					atomic_read(&xprt->xpt_ref.refcount));
>  			len = xprt->xpt_ops->xpo_recvfrom(rqstp);
> +
> +			/* No data, incomplete (TCP) read, or accept() */
> +			if (len == 0 || len == -EAGAIN)
> +				goto drop_request;
> +		}
> +
>  		dprintk("svc: got len=%d\n", len);
> -	}
>  
> -	/* No data, incomplete (TCP) read, or accept() */
> -	if (len == 0 || len == -EAGAIN) {
> -		rqstp->rq_res.len = 0;
> -		svc_xprt_release(rqstp);
> -		return -EAGAIN;
> +		clear_bit(XPT_OLD, &xprt->xpt_flags);
> +
> +		rqstp->rq_secure = svc_port_is_privileged(svc_addr(rqstp));
> +		rqstp->rq_chandle.defer = svc_defer;
> +
> +		if (serv->sv_stats)
> +			serv->sv_stats->netcnt++;
> +		return len;
>  	}
> -	clear_bit(XPT_OLD, &xprt->xpt_flags);
>  
> -	rqstp->rq_secure = svc_port_is_privileged(svc_addr(rqstp));
> -	rqstp->rq_chandle.defer = svc_defer;
> +	newxpt = xprt->xpt_ops->xpo_accept(xprt);
> +	if (newxpt) {
> +		/*
> +		 * We know this module_get will succeed because the
> +		 * listener holds a reference too
> +		 */
> +		__module_get(newxpt->xpt_class->xcl_owner);
> +		svc_check_conn_limits(xprt->xpt_server);
> +		spin_lock_bh(&serv->sv_lock);
> +		set_bit(XPT_TEMP, &newxpt->xpt_flags);
> +		list_add(&newxpt->xpt_list, &serv->sv_tempsocks);
> +		serv->sv_tmpcnt++;
> +		if (serv->sv_temptimer.function == NULL) {
> +			/* setup timer to age temp transports */
> +			setup_timer(&serv->sv_temptimer,
> +				    svc_age_temp_xprts,
> +				    (unsigned long)serv);
> +			mod_timer(&serv->sv_temptimer,
> +				  jiffies + svc_conn_age_period * HZ);
> +		}
> +		spin_unlock_bh(&serv->sv_lock);
> +		svc_xprt_received(newxpt);
> +	}
> +	svc_xprt_received(xprt);
>  
> -	if (serv->sv_stats)
> -		serv->sv_stats->netcnt++;
> -	return len;
> +drop_request:
> +	rqstp->rq_res.len = 0;
> +	svc_xprt_release(rqstp);
> +	return -EAGAIN;
>  }
>  EXPORT_SYMBOL(svc_recv);
>  
> 


^ permalink raw reply	[flat|nested] 131+ messages in thread

* Re: [PATCH 1/3] SUNRPC: Ensure the server closes sockets in a timely fashion
  2008-12-17 15:27                         ` Tom Tucker
@ 2008-12-17 18:08                           ` Trond Myklebust
  -1 siblings, 0 replies; 131+ messages in thread
From: Trond Myklebust @ 2008-12-17 18:08 UTC (permalink / raw)
  To: Tom Tucker
  Cc: Ian Campbell, linux-nfs, Max Kellermann, linux-kernel, gcosta,
	Grant Coady, J. Bruce Fields

On Wed, 2008-12-17 at 09:27 -0600, Tom Tucker wrote:
> > +	if (!test_bit(XPT_LISTENER, &xprt->xpt_flags))
> > +		kernel_sock_shutdown(svsk->sk_sock, SHUT_RDWR);
> 
> How is this different than what happens as an artifact of sock_release?

The point is that it is independent of whether or not something is
holding a reference to the svc_sock.

-- 
Trond Myklebust
Linux NFS client maintainer

NetApp
Trond.Myklebust@netapp.com
www.netapp.com

^ permalink raw reply	[flat|nested] 131+ messages in thread

* Re: [PATCH 1/3] SUNRPC: Ensure the server closes sockets in a timely fashion
@ 2008-12-17 18:59                             ` Tom Tucker
  0 siblings, 0 replies; 131+ messages in thread
From: Tom Tucker @ 2008-12-17 18:59 UTC (permalink / raw)
  To: Trond Myklebust
  Cc: Ian Campbell, linux-nfs, Max Kellermann, linux-kernel, gcosta,
	Grant Coady, J. Bruce Fields

Trond Myklebust wrote:
> On Wed, 2008-12-17 at 09:27 -0600, Tom Tucker wrote:
>>> +	if (!test_bit(XPT_LISTENER, &xprt->xpt_flags))
>>> +		kernel_sock_shutdown(svsk->sk_sock, SHUT_RDWR);
>> How is this different than what happens as an artifact of sock_release?
> 
> The point is that it is independent of whether or not something is
> holding a reference to the svc_sock.

Thanks, makes sense.

> 


^ permalink raw reply	[flat|nested] 131+ messages in thread

* Re: [PATCH 3/3] SUNRPC: svc_xprt_enqueue should not refuse to enqueue 'XPT_DEAD' transports
  2008-12-17 15:35                       ` Tom Tucker
@ 2008-12-17 19:07                           ` Trond Myklebust
  0 siblings, 0 replies; 131+ messages in thread
From: Trond Myklebust @ 2008-12-17 19:07 UTC (permalink / raw)
  To: Tom Tucker
  Cc: Ian Campbell, linux-nfs, Max Kellermann, linux-kernel, gcosta,
	Grant Coady, J. Bruce Fields

On Wed, 2008-12-17 at 09:35 -0600, Tom Tucker wrote:
> Trond Myklebust wrote:
> > Aside from being racy (there is nothing preventing someone setting XPT_DEAD
> > after the test in svc_xprt_enqueue, and before XPT_BUSY is set), it is
> > wrong to assume that transports which have called svc_delete_xprt() might
> > not need to be re-enqueued.
> 
> This is only true because now you allow transports with XPT_DEAD set to 
> be enqueued -- yes?
> 
> > 
> > See the list of deferred requests, which is currently never going to
> > be cleared if the revisit call happens after svc_delete_xprt(). In this
> > case, the deferred request will currently keep a reference to the transport
> > forever.
> >
> 
> I agree this is a possibility and it needs to be fixed. I'm concerned 
> that the root cause is still there though. I thought the test case was 
> the client side timing out the connection. Why are there deferred 
> requests sitting on what is presumably an idle connection?

I haven't said that they are the cause of this test case. I've said that
deferred requests hold references to the socket that can obviously
deadlock. That needs to be fixed regardless of whether or not it is the
cause here.

There are plenty of situations in which the client may choose to close
the TCP socket even if there are outstanding requests. One of the most
common is when the user signals the process, so that an RPC call that
was partially transmitted (ran out of buffer space) gets cancelled
before it can finish transmitting. In that case the client has no choice
but to disconnect and immediately reconnect.

> > The fix should be to allow dead transports to be enqueued in order to clear
> > the deferred requests, then change the order of processing in svc_recv() so
> > that we pick up deferred requests before we do the XPT_CLOSE processing.
> > 
> 
> Wouldn't it be simpler to clean up any deferred requests in the close 
> path instead of changing the meaning of XPT_DEAD and dispatching 
> N-threads to do the same?

AFAICS, deferred requests are the property of the cache until they
expire or a downcall occurs. I'm not aware of any way to cancel only
those deferred requests that hold a reference to this particular
transport.

-- 
Trond Myklebust
Linux NFS client maintainer

NetApp
Trond.Myklebust@netapp.com
www.netapp.com

^ permalink raw reply	[flat|nested] 131+ messages in thread

* Re: [PATCH 3/3] SUNRPC: svc_xprt_enqueue should not refuse to enqueue 'XPT_DEAD' transports
       [not found]                           ` <1229540877.7257.97.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
@ 2008-12-23 14:49                             ` Tom Tucker
  0 siblings, 0 replies; 131+ messages in thread
From: Tom Tucker @ 2008-12-23 14:49 UTC (permalink / raw)
  To: Trond Myklebust
  Cc: Ian Campbell, linux-nfs, Max Kellermann, linux-kernel, gcosta,
	Grant Coady, J. Bruce Fields

Trond Myklebust wrote:
> On Wed, 2008-12-17 at 09:35 -0600, Tom Tucker wrote:
>   
>> Trond Myklebust wrote:
>>     
>>> Aside from being racy (there is nothing preventing someone setting XPT_DEAD
>>> after the test in svc_xprt_enqueue, and before XPT_BUSY is set), it is
>>> wrong to assume that transports which have called svc_delete_xprt() might
>>> not need to be re-enqueued.
>>>       
>> This is only true because now you allow transports with XPT_DEAD set to 
>> be enqueued -- yes?
>>
>>     
>>> See the list of deferred requests, which is currently never going to
>>> be cleared if the revisit call happens after svc_delete_xprt(). In this
>>> case, the deferred request will currently keep a reference to the transport
>>> forever.
>>>
>>>       
>> I agree this is a possibility and it needs to be fixed. I'm concerned 
>> that the root cause is still there though. I thought the test case was 
>> the client side timing out the connection. Why are there deferred 
>> requests sitting on what is presumably an idle connection?
>>     
>
> I haven't said that they are the cause of this test case. I've said that
> deferred requests hold references to the socket that can obviously
> deadlock. That needs to be fixed regardless of whether or not it is the
> cause here.
>
> There are plenty of situations in which the client may choose to close
> the TCP socket even if there are outstanding requests. One of the most
> common is when the user signals the process, so that an RPC call that
> was partially transmitted (ran out of buffer space) gets cancelled
> before it can finish transmitting. In that case the client has no choice
> but to disconnect and immediately reconnect.
>
>   
>>> The fix should be to allow dead transports to be enqueued in order to clear
>>> the deferred requests, then change the order of processing in svc_recv() so
>>> that we pick up deferred requests before we do the XPT_CLOSE processing.
>>>
>>>       
>> Wouldn't it be simpler to clean up any deferred requests in the close 
>> path instead of changing the meaning of XPT_DEAD and dispatching 
>> N-threads to do the same?
>>     
>
> AFAICS, deferred requests are the property of the cache until they
> expire or a downcall occurs. I'm not aware of any way to cancel only
> those deferred requests that hold a reference to this particular
> transport.
>
>   
Ok, I think you're right, and I think that this fix is correct and makes 
the symptom go away.

I may be completely confused here, but:

- The deferred requests should be getting cleaned up by timing out, and 
that does not seem to be happening. (Is this true?)

- By releasing the underlying connection prior to releasing the 
transport that manages it, we've converted the visible resource leak to 
an invisible one.

- This bug has been around forever; changing the client-side close 
behavior gracefully exposed it.

So I'm wondering if what we want to do here is to provide a mechanism 
for canceling deferred requests for a particular transport. This would 
provide a mechanism for the generic transport driver to force 
cancellation of deferred requests when closing.

Tom



^ permalink raw reply	[flat|nested] 131+ messages in thread


* Re: [PATCH 0/3] NFS regression in 2.6.26?, "task blocked for more than 120 seconds"
  2008-12-16 12:05                                       ` Kasparek Tomas
  2008-12-16 12:10                                         ` Kasparek Tomas
@ 2008-12-23 22:34                                         ` Trond Myklebust
       [not found]                                           ` <1230071647.17701.27.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
  1 sibling, 1 reply; 131+ messages in thread
From: Trond Myklebust @ 2008-12-23 22:34 UTC (permalink / raw)
  To: Kasparek Tomas; +Cc: linux-nfs

On Tue, 2008-12-16 at 13:05 +0100, Kasparek Tomas wrote:
> Hm, I'm not happy to say it, but it still does not work after some time.
> Now the problem is the opposite: there are no connections to the server
> according to netstat on the client; just from time to time there is
> 
> pcnlp1.fit.vutbr.cz.15234 > kazi.fit.vutbr.cz.nfs: 40 null
> kazi.fit.vutbr.cz.nfs > pcnlp1.fit.vutbr.cz.15234: reply ok 24 null
> 
> (kazi is the server). Will try to investigate more details.

OK. Here is one more try. I've tightened up some locking issues with the
previous patch.

Thanks for helping test this!

Cheers
  Trond
-----------------------------------------------------------
From: Trond Myklebust <Trond.Myklebust@netapp.com>
Date: Tue, 23 Dec 2008 16:21:25 -0500
SUNRPC: Add the equivalent of the linger2 timeout to RPC sockets

This avoids us getting stuck in the TCP_FIN_WAIT2 state forever.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
---

 include/linux/sunrpc/xprt.h |    1 +
 net/sunrpc/xprtsock.c       |   63 ++++++++++++++++++++++++++++++-------------
 2 files changed, 45 insertions(+), 19 deletions(-)


diff --git a/include/linux/sunrpc/xprt.h b/include/linux/sunrpc/xprt.h
index 11fc71d..1a6ecd7 100644
--- a/include/linux/sunrpc/xprt.h
+++ b/include/linux/sunrpc/xprt.h
@@ -259,6 +259,7 @@ void			xprt_conditional_disconnect(struct rpc_xprt *xprt, unsigned int cookie);
 #define XPRT_BOUND		(4)
 #define XPRT_BINDING		(5)
 #define XPRT_CLOSING		(6)
+#define XPRT_CONNECTION_ABORT	(7)
 
 static inline void xprt_set_connected(struct rpc_xprt *xprt)
 {
diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
index 0a50361..dfb0aeb 100644
--- a/net/sunrpc/xprtsock.c
+++ b/net/sunrpc/xprtsock.c
@@ -171,6 +171,7 @@ static ctl_table sunrpc_table[] = {
  */
 #define XS_TCP_INIT_REEST_TO	(3U * HZ)
 #define XS_TCP_MAX_REEST_TO	(5U * 60 * HZ)
+#define XS_TCP_LINGER2_TO	(5U * HZ)
 
 /*
  * TCP idle timeout; client drops the transport socket if it is idle
@@ -792,6 +793,7 @@ static void xs_close(struct rpc_xprt *xprt)
 	sock_release(sock);
 clear_close_wait:
 	smp_mb__before_clear_bit();
+	clear_bit(XPRT_CONNECTION_ABORT, &xprt->state);
 	clear_bit(XPRT_CLOSE_WAIT, &xprt->state);
 	clear_bit(XPRT_CLOSING, &xprt->state);
 	smp_mb__after_clear_bit();
@@ -1126,6 +1128,7 @@ out:
  */
 static void xs_tcp_state_change(struct sock *sk)
 {
+	struct sock_xprt *transport;
 	struct rpc_xprt *xprt;
 
 	read_lock(&sk->sk_callback_lock);
@@ -1137,13 +1140,12 @@ static void xs_tcp_state_change(struct sock *sk)
 			sock_flag(sk, SOCK_DEAD),
 			sock_flag(sk, SOCK_ZAPPED));
 
+	transport = container_of(xprt, struct sock_xprt, xprt);
+
 	switch (sk->sk_state) {
 	case TCP_ESTABLISHED:
 		spin_lock_bh(&xprt->transport_lock);
 		if (!xprt_test_and_set_connected(xprt)) {
-			struct sock_xprt *transport = container_of(xprt,
-					struct sock_xprt, xprt);
-
 			/* Reset TCP record info */
 			transport->tcp_offset = 0;
 			transport->tcp_reclen = 0;
@@ -1184,7 +1186,24 @@ static void xs_tcp_state_change(struct sock *sk)
 		clear_bit(XPRT_CONNECTED, &xprt->state);
 		smp_mb__after_clear_bit();
 		break;
+	case TCP_FIN_WAIT2:
+		/* Do the equivalent of linger2 handling for dealing with
+		 * broken servers that don't close the socket in a timely
+		 * fashion
+		 */
+		if (!xprt_test_and_set_connecting(xprt)) {
+			set_bit(XPRT_CONNECTION_ABORT, &xprt->state);
+			queue_delayed_work(rpciod_workqueue,
+					&transport->connect_worker,
+					XS_TCP_LINGER2_TO);
+		}
+		break;
 	case TCP_CLOSE:
+		if (delayed_work_pending(&transport->connect_worker) &&
+			cancel_delayed_work(&transport->connect_worker)) {
+			clear_bit(XPRT_CONNECTION_ABORT, &xprt->state);
+			xprt_clear_connecting(xprt);
+		}
 		smp_mb__before_clear_bit();
 		clear_bit(XPRT_CLOSE_WAIT, &xprt->state);
 		clear_bit(XPRT_CLOSING, &xprt->state);
@@ -1549,8 +1568,8 @@ static void xs_udp_connect_worker4(struct work_struct *work)
 	xs_udp_finish_connecting(xprt, sock);
 	status = 0;
 out:
-	xprt_wake_pending_tasks(xprt, status);
 	xprt_clear_connecting(xprt);
+	xprt_wake_pending_tasks(xprt, status);
 }
 
 /**
@@ -1590,8 +1609,8 @@ static void xs_udp_connect_worker6(struct work_struct *work)
 	xs_udp_finish_connecting(xprt, sock);
 	status = 0;
 out:
-	xprt_wake_pending_tasks(xprt, status);
 	xprt_clear_connecting(xprt);
+	xprt_wake_pending_tasks(xprt, status);
 }
 
 /*
@@ -1675,6 +1694,7 @@ static void xs_tcp_connect_worker4(struct work_struct *work)
 		goto out;
 
 	if (!sock) {
+		clear_bit(XPRT_CONNECTION_ABORT, &xprt->state);
 		/* start from scratch */
 		if ((err = sock_create_kern(PF_INET, SOCK_STREAM, IPPROTO_TCP, &sock)) < 0) {
 			dprintk("RPC:       can't create TCP transport socket (%d).\n", -err);
@@ -1686,10 +1706,14 @@ static void xs_tcp_connect_worker4(struct work_struct *work)
 			sock_release(sock);
 			goto out;
 		}
-	} else
+	} else {
 		/* "close" the socket, preserving the local port */
 		xs_tcp_reuse_connection(xprt);
 
+		if (test_and_clear_bit(XPRT_CONNECTION_ABORT, &xprt->state))
+			goto out;
+	}
+
 	dprintk("RPC:       worker connecting xprt %p to address: %s\n",
 			xprt, xprt->address_strings[RPC_DISPLAY_ALL]);
 
@@ -1701,19 +1725,17 @@ static void xs_tcp_connect_worker4(struct work_struct *work)
 		switch (status) {
 			case -EINPROGRESS:
 			case -EALREADY:
-				goto out_clear;
-			case -ECONNREFUSED:
-			case -ECONNRESET:
-				/* retry with existing socket, after a delay */
 				break;
 			default:
 				/* get rid of existing socket, and retry */
 				xs_tcp_shutdown(xprt);
+			case -ECONNREFUSED:
+			case -ECONNRESET:
+				/* retry with existing socket, after a delay */
+				xprt_wake_pending_tasks(xprt, status);
 		}
 	}
 out:
-	xprt_wake_pending_tasks(xprt, status);
-out_clear:
 	xprt_clear_connecting(xprt);
 }
 
@@ -1735,6 +1757,7 @@ static void xs_tcp_connect_worker6(struct work_struct *work)
 		goto out;
 
 	if (!sock) {
+		clear_bit(XPRT_CONNECTION_ABORT, &xprt->state);
 		/* start from scratch */
 		if ((err = sock_create_kern(PF_INET6, SOCK_STREAM, IPPROTO_TCP, &sock)) < 0) {
 			dprintk("RPC:       can't create TCP transport socket (%d).\n", -err);
@@ -1746,10 +1769,14 @@ static void xs_tcp_connect_worker6(struct work_struct *work)
 			sock_release(sock);
 			goto out;
 		}
-	} else
+	} else {
 		/* "close" the socket, preserving the local port */
 		xs_tcp_reuse_connection(xprt);
 
+		if (test_and_clear_bit(XPRT_CONNECTION_ABORT, &xprt->state))
+			goto out;
+	}
+
 	dprintk("RPC:       worker connecting xprt %p to address: %s\n",
 			xprt, xprt->address_strings[RPC_DISPLAY_ALL]);
 
@@ -1760,19 +1787,17 @@ static void xs_tcp_connect_worker6(struct work_struct *work)
 		switch (status) {
 			case -EINPROGRESS:
 			case -EALREADY:
-				goto out_clear;
-			case -ECONNREFUSED:
-			case -ECONNRESET:
-				/* retry with existing socket, after a delay */
 				break;
 			default:
 				/* get rid of existing socket, and retry */
 				xs_tcp_shutdown(xprt);
+			case -ECONNREFUSED:
+			case -ECONNRESET:
+				/* retry with existing socket, after a delay */
+				xprt_wake_pending_tasks(xprt, status);
 		}
 	}
 out:
-	xprt_wake_pending_tasks(xprt, status);
-out_clear:
 	xprt_clear_connecting(xprt);
 }
 


-- 
Trond Myklebust
Linux NFS client maintainer

NetApp
Trond.Myklebust@netapp.com
www.netapp.com

^ permalink raw reply related	[flat|nested] 131+ messages in thread

* Re: [PATCH 3/3] SUNRPC: svc_xprt_enqueue should not refuse to enqueue 'XPT_DEAD' transports
  2008-12-23 14:49                             ` Tom Tucker
@ 2008-12-23 23:39                               ` Tom Tucker
  -1 siblings, 0 replies; 131+ messages in thread
From: Tom Tucker @ 2008-12-23 23:39 UTC (permalink / raw)
  To: Trond Myklebust
  Cc: Ian Campbell, linux-nfs, Max Kellermann, linux-kernel, gcosta,
	Grant Coady, J. Bruce Fields

Tom Tucker wrote:
> Trond Myklebust wrote:
>> On Wed, 2008-12-17 at 09:35 -0600, Tom Tucker wrote:
>>  
>>> Trond Myklebust wrote:
>>>    
>>>> Aside from being racy (there is nothing preventing someone setting 
>>>> XPT_DEAD
>>>> after the test in svc_xprt_enqueue, and before XPT_BUSY is set), it is
>>>> wrong to assume that transports which have called svc_delete_xprt() 
>>>> might
>>>> not need to be re-enqueued.
>>>>       
>>> This is only true because now you allow transports with XPT_DEAD set 
>>> to be enqueued -- yes?
>>>
>>>    
>>>> See the list of deferred requests, which is currently never going to
>>>> be cleared if the revisit call happens after svc_delete_xprt(). In 
>>>> this
>>>> case, the deferred request will currently keep a reference to the 
>>>> transport
>>>> forever.
>>>>
>>>>       
>>> I agree this is a possibility and it needs to be fixed. I'm 
>>> concerned that the root cause is still there though. I thought the 
>>> test case was the client side timing out the connection. Why are 
>>> there deferred requests sitting on what is presumably an idle 
>>> connection?
>>>     
>>
>> I haven't said that they are the cause of this test case. I've said that
>> deferred requests hold references to the socket that can obviously
>> deadlock. That needs to be fixed regardless of whether or not it is the
>> cause here.
>>
>> There are plenty of situations in which the client may choose to close
>> the TCP socket even if there are outstanding requests. One of the most
>> common is when the user signals the process, so that an RPC call that
>> was partially transmitted (ran out of buffer space) gets cancelled
>> before it can finish transmitting. In that case the client has no choice
>> but to disconnect and immediately reconnect.
>>
>>  
>>>> The fix should be to allow dead transports to be enqueued in order 
>>>> to clear
>>>> the deferred requests, then change the order of processing in 
>>>> svc_recv() so
>>>> that we pick up deferred requests before we do the XPT_CLOSE 
>>>> processing.
>>>>
>>>>       
>>> Wouldn't it be simpler to clean up any deferred requests in the 
>>> close path instead of changing the meaning of XPT_DEAD and 
>>> dispatching N-threads to do the same?
>>>     
>>
>> AFAICS, deferred requests are the property of the cache until they
>> expire or a downcall occurs. I'm not aware of any way to cancel only
>> those deferred requests that hold a reference to this particular
>> transport.
>>
>>   
> Ok, I think you're right, and I think that this fix is correct and 
> makes the symptom go away.
>
> I may be completely confused here, but:
>
> - The deferred requests should be getting cleaned up by timing out, 
> and that does not seem to be happening. (Is this true?)
>
They are getting "cleaned up", but by the time they are, the transport is 
dead: the request has been added to the deferred queue, but it will never 
get processed because svc_xprt_enqueue won't "schedule" a dead transport.

> - By releasing the underlying connection prior to releasing the 
> transport that manages it, we've converted the visible resource leak 
> to an invisible one.
>
Not with your changes per the above.

> - This bug has been around forever; changing the client-side close 
> behavior gracefully exposed it.
>
> So I'm wondering if what we want to do here is to provide a mechanism 
> for canceling deferred requests for a particular transport. This would 
> provide a mechanism for the generic transport driver to force 
> cancellation of deferred requests when closing. 
This is a new interface and we'd still need to handle requests sitting 
on the transport's deferred queue. Probably not a good idea.

> Tom
>
>
> -- 
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html



^ permalink raw reply	[flat|nested] 131+ messages in thread


* Re: [PATCH 3/3] SUNRPC: svc_xprt_enqueue should not refuse to enqueue 'XPT_DEAD' transports
       [not found]                           ` <1229540877.7257.97.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
@ 2009-01-02 21:44                             ` Tom Tucker
  2009-01-04 19:12                               ` Trond Myklebust
  0 siblings, 1 reply; 131+ messages in thread
From: Tom Tucker @ 2009-01-02 21:44 UTC (permalink / raw)
  To: Trond Myklebust, J. Bruce Fields; +Cc: linux-nfs


Bruce/Trond:

This is an alternative to patches 2 and 3 from Trond's fix. I think 
Trond's fix is correct, but I believe this approach to be simpler.

From: Tom Tucker <tom@opengridcomputing.com>
Date: Wed, 31 Dec 2008 17:18:33 -0600
Subject: [PATCH] svc: Clean up deferred requests on transport destruction

A race between svc_revisit and svc_delete_xprt can result in
deferred requests holding references on a transport that can never be
recovered because dead transports are not enqueued for subsequent
processing.

Check for XPT_DEAD in revisit to clean up completing deferrals on a dead
transport and sweep a transport's deferred queue to do the same for queued
but unprocessed deferrals.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
---
 net/sunrpc/svc_xprt.c |   20 +++++++++++++++-----
 1 files changed, 15 insertions(+), 5 deletions(-)

diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
index bf5b5cd..92ca5c6 100644
--- a/net/sunrpc/svc_xprt.c
+++ b/net/sunrpc/svc_xprt.c
@@ -837,6 +837,11 @@ static void svc_age_temp_xprts(unsigned long closure)
 void svc_delete_xprt(struct svc_xprt *xprt)
 {
        struct svc_serv *serv = xprt->xpt_server;
+       struct svc_deferred_req *dr;
+
+       /* Only do this once */
+       if (test_and_set_bit(XPT_DEAD, &xprt->xpt_flags))
+               return;

        dprintk("svc: svc_delete_xprt(%p)\n", xprt);
        xprt->xpt_ops->xpo_detach(xprt);
@@ -851,12 +856,16 @@ void svc_delete_xprt(struct svc_xprt *xprt)
         * while still attached to a queue, the queue itself
         * is about to be destroyed (in svc_destroy).
         */
-       if (!test_and_set_bit(XPT_DEAD, &xprt->xpt_flags)) {
-               BUG_ON(atomic_read(&xprt->xpt_ref.refcount) < 2);
-               if (test_bit(XPT_TEMP, &xprt->xpt_flags))
-                       serv->sv_tmpcnt--;
+       if (test_bit(XPT_TEMP, &xprt->xpt_flags))
+               serv->sv_tmpcnt--;
+
+       for (dr = svc_deferred_dequeue(xprt); dr;
+            dr = svc_deferred_dequeue(xprt)) {
                svc_xprt_put(xprt);
+               kfree(dr);
        }
+
+       svc_xprt_put(xprt);
        spin_unlock_bh(&serv->sv_lock);
 }

@@ -902,7 +911,8 @@ static void svc_revisit(struct cache_deferred_req *dreq, int too_many)
                container_of(dreq, struct svc_deferred_req, handle);
        struct svc_xprt *xprt = dr->xprt;

-       if (too_many) {
+       if (too_many || test_bit(XPT_DEAD, &xprt->xpt_flags)) {
+               dprintk("revisit cancelled\n");
                svc_xprt_put(xprt);
                kfree(dr);
                return;


^ permalink raw reply related	[flat|nested] 131+ messages in thread

* Re: [PATCH 3/3] SUNRPC: svc_xprt_enqueue should not refuse to enqueue 'XPT_DEAD' transports
  2009-01-02 21:44                             ` Tom Tucker
@ 2009-01-04 19:12                               ` Trond Myklebust
       [not found]                                 ` <1231096358.7363.6.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
       [not found]                                 ` <1231097131.7363.11.camel@heimdal.trondhjem.org>
  0 siblings, 2 replies; 131+ messages in thread
From: Trond Myklebust @ 2009-01-04 19:12 UTC (permalink / raw)
  To: Tom Tucker; +Cc: J. Bruce Fields, linux-nfs

On Fri, 2009-01-02 at 15:44 -0600, Tom Tucker wrote:
> Bruce/Trond:
> 
> This is an alternative to patches 2 and 3 from Trond's fix. I think 
> Trond's fix is correct, but I believe this approach to be simpler.
> 
> From: Tom Tucker <tom@opengridcomputing.com>
> Date: Wed, 31 Dec 2008 17:18:33 -0600
> Subject: [PATCH] svc: Clean up deferred requests on transport destruction
> 
> A race between svc_revisit and svc_delete_xprt can result in
> deferred requests holding references on a transport that can never be
> recovered because dead transports are not enqueued for subsequent
> processing.
> 
> Check for XPT_DEAD in revisit to clean up completing deferrals on a dead
> transport and sweep a transport's deferred queue to do the same for queued
> but unprocessed deferrals.
> 
> Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
> ---
>  net/sunrpc/svc_xprt.c |   20 +++++++++++++++-----
>  1 files changed, 15 insertions(+), 5 deletions(-)
> 
> diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
> index bf5b5cd..92ca5c6 100644
> --- a/net/sunrpc/svc_xprt.c
> +++ b/net/sunrpc/svc_xprt.c
> @@ -837,6 +837,11 @@ static void svc_age_temp_xprts(unsigned long closure)
>  void svc_delete_xprt(struct svc_xprt *xprt)
>  {
>         struct svc_serv *serv = xprt->xpt_server;
> +       struct svc_deferred_req *dr;
> +
> +       /* Only do this once */
> +       if (test_and_set_bit(XPT_DEAD, &xprt->xpt_flags))
> +               return;
> 
>         dprintk("svc: svc_delete_xprt(%p)\n", xprt);
>         xprt->xpt_ops->xpo_detach(xprt);
> @@ -851,12 +856,16 @@ void svc_delete_xprt(struct svc_xprt *xprt)
>          * while still attached to a queue, the queue itself
>          * is about to be destroyed (in svc_destroy).
>          */
> -       if (!test_and_set_bit(XPT_DEAD, &xprt->xpt_flags)) {
> -               BUG_ON(atomic_read(&xprt->xpt_ref.refcount) < 2);
> -               if (test_bit(XPT_TEMP, &xprt->xpt_flags))
> -                       serv->sv_tmpcnt--;
> +       if (test_bit(XPT_TEMP, &xprt->xpt_flags))
> +               serv->sv_tmpcnt--;
> +
> +       for (dr = svc_deferred_dequeue(xprt); dr;
> +            dr = svc_deferred_dequeue(xprt)) {
>                 svc_xprt_put(xprt);
> +               kfree(dr);
>         }
> +
> +       svc_xprt_put(xprt);
>         spin_unlock_bh(&serv->sv_lock);
>  }
> 
> @@ -902,7 +911,8 @@ static void svc_revisit(struct cache_deferred_req *dreq, int too_many)
>                 container_of(dreq, struct svc_deferred_req, handle);
>         struct svc_xprt *xprt = dr->xprt;
> 
> -       if (too_many) {
> +       if (too_many || test_bit(XPT_DEAD, &xprt->xpt_flags)) {
> +               dprintk("revisit cancelled\n");
>                 svc_xprt_put(xprt);
>                 kfree(dr);
>                 return;
> 

I see nothing that stops svc_delete_xprt() from setting XPT_DEAD after
the above test in svc_revisit(), and before the test inside
svc_xprt_enqueue(). What's preventing a race there?

Trond
-- 
Trond Myklebust
Linux NFS client maintainer

NetApp
Trond.Myklebust@netapp.com
www.netapp.com

^ permalink raw reply	[flat|nested] 131+ messages in thread

* Re: [PATCH 3/3] SUNRPC: svc_xprt_enqueue should not refuse to enqueue 'XPT_DEAD' transports
       [not found]                                 ` <1231096358.7363.6.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
@ 2009-01-04 19:25                                   ` Trond Myklebust
  2009-01-05  3:33                                   ` Tom Tucker
  1 sibling, 0 replies; 131+ messages in thread
From: Trond Myklebust @ 2009-01-04 19:25 UTC (permalink / raw)
  To: Tom Tucker; +Cc: J. Bruce Fields, linux-nfs

On Sun, 2009-01-04 at 14:12 -0500, Trond Myklebust wrote:
> On Fri, 2009-01-02 at 15:44 -0600, Tom Tucker wrote:
> > Bruce/Trond:
> > 
> > This is an alternative to patches 2 and 3 from Trond's fix. I think 
> > Trond's fix is correct, but I believe this approach to be simpler.
> > 
> > From: Tom Tucker <tom@opengridcomputing.com>
> > Date: Wed, 31 Dec 2008 17:18:33 -0600
> > Subject: [PATCH] svc: Clean up deferred requests on transport destruction
> > 
> > A race between svc_revisit and svc_delete_xprt can result in
> > deferred requests holding references on a transport that can never be
> > recovered because dead transports are not enqueued for subsequent
> > processing.
> > 
> > Check for XPT_DEAD in revisit to clean up completing deferrals on a dead
> > transport and sweep a transport's deferred queue to do the same for queued
> > but unprocessed deferrals.
> > 
> > Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
> > ---
> >  net/sunrpc/svc_xprt.c |   20 +++++++++++++++-----
> >  1 files changed, 15 insertions(+), 5 deletions(-)
> > 
> > diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
> > index bf5b5cd..92ca5c6 100644
> > --- a/net/sunrpc/svc_xprt.c
> > +++ b/net/sunrpc/svc_xprt.c
> > @@ -837,6 +837,11 @@ static void svc_age_temp_xprts(unsigned long closure)
> >  void svc_delete_xprt(struct svc_xprt *xprt)
> >  {
> >         struct svc_serv *serv = xprt->xpt_server;
> > +       struct svc_deferred_req *dr;
> > +
> > +       /* Only do this once */
> > +       if (test_and_set_bit(XPT_DEAD, &xprt->xpt_flags))
> > +               return;
> > 
> >         dprintk("svc: svc_delete_xprt(%p)\n", xprt);
> >         xprt->xpt_ops->xpo_detach(xprt);
> > @@ -851,12 +856,16 @@ void svc_delete_xprt(struct svc_xprt *xprt)
> >          * while still attached to a queue, the queue itself
> >          * is about to be destroyed (in svc_destroy).
> >          */
> > -       if (!test_and_set_bit(XPT_DEAD, &xprt->xpt_flags)) {
> > -               BUG_ON(atomic_read(&xprt->xpt_ref.refcount) < 2);
> > -               if (test_bit(XPT_TEMP, &xprt->xpt_flags))
> > -                       serv->sv_tmpcnt--;
> > +       if (test_bit(XPT_TEMP, &xprt->xpt_flags))
> > +               serv->sv_tmpcnt--;
> > +
> > +       for (dr = svc_deferred_dequeue(xprt); dr;
> > +            dr = svc_deferred_dequeue(xprt)) {
> >                 svc_xprt_put(xprt);
> > +               kfree(dr);
> >         }
> > +
> > +       svc_xprt_put(xprt);
> >         spin_unlock_bh(&serv->sv_lock);
> >  }
> > 
> > @@ -902,7 +911,8 @@ static void svc_revisit(struct cache_deferred_req *dreq, int too_many)
> >                 container_of(dreq, struct svc_deferred_req, handle);
> >         struct svc_xprt *xprt = dr->xprt;
> > 
> > -       if (too_many) {
> > +       if (too_many || test_bit(XPT_DEAD, &xprt->xpt_flags)) {
> > +               dprintk("revisit cancelled\n");
> >                 svc_xprt_put(xprt);
> >                 kfree(dr);
> >                 return;
> > 
> 
> I see nothing that stops svc_delete_xprt() from setting XPT_DEAD after
> the above test in svc_revisit(), and before the test inside
> svc_xprt_enqueue(). What's preventing a race there?

I suppose one way to fix it would be to hold the xprt->xpt_lock across
the above test, and to make sure that you set XPT_DEFERRED while holding
the lock, and _before_ you test for XPT_DEAD. That way, you guarantee
that the svc_deferred_dequeue() loop in svc_delete_xprt() will pick up
anything that races with the setting of XPT_DEAD.

Trond




* Re: [PATCH 3/3] SUNRPC: svc_xprt_enqueue should not refuse to enqueue 'XPT_DEAD' transports
       [not found]                                 ` <1231096358.7363.6.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
  2009-01-04 19:25                                   ` Trond Myklebust
@ 2009-01-05  3:33                                   ` Tom Tucker
  1 sibling, 0 replies; 131+ messages in thread
From: Tom Tucker @ 2009-01-05  3:33 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: J. Bruce Fields, linux-nfs

Trond Myklebust wrote:
> On Fri, 2009-01-02 at 15:44 -0600, Tom Tucker wrote:
>> Bruce/Trond:
>>
>> This is an alternative to patches 2 and 3 from Trond's fix. I think 
>> Trond's fix is correct, but I believe this approach to be simpler.
>>
>> From: Tom Tucker <tom@opengridcomputing.com>
>> Date: Wed, 31 Dec 2008 17:18:33 -0600
>> Subject: [PATCH] svc: Clean up deferred requests on transport destruction
>>
>> A race between svc_revisit and svc_delete_xprt can result in
>> deferred requests holding references on a transport that can never be
>> recovered because dead transports are not enqueued for subsequent
>> processing.
>>
>> Check for XPT_DEAD in revisit to clean up completing deferrals on a dead
>> transport and sweep a transport's deferred queue to do the same for queued
>> but unprocessed deferrals.
>>
>> Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
>> ---
>>  net/sunrpc/svc_xprt.c |   20 +++++++++++++++-----
>>  1 files changed, 15 insertions(+), 5 deletions(-)
>>
>> diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
>> index bf5b5cd..92ca5c6 100644
>> --- a/net/sunrpc/svc_xprt.c
>> +++ b/net/sunrpc/svc_xprt.c
>> @@ -837,6 +837,11 @@ static void svc_age_temp_xprts(unsigned long closure)
>>  void svc_delete_xprt(struct svc_xprt *xprt)
>>  {
>>         struct svc_serv *serv = xprt->xpt_server;
>> +       struct svc_deferred_req *dr;
>> +
>> +       /* Only do this once */
>> +       if (test_and_set_bit(XPT_DEAD, &xprt->xpt_flags))
>> +               return;
>>
>>         dprintk("svc: svc_delete_xprt(%p)\n", xprt);
>>         xprt->xpt_ops->xpo_detach(xprt);
>> @@ -851,12 +856,16 @@ void svc_delete_xprt(struct svc_xprt *xprt)
>>          * while still attached to a queue, the queue itself
>>          * is about to be destroyed (in svc_destroy).
>>          */
>> -       if (!test_and_set_bit(XPT_DEAD, &xprt->xpt_flags)) {
>> -               BUG_ON(atomic_read(&xprt->xpt_ref.refcount) < 2);
>> -               if (test_bit(XPT_TEMP, &xprt->xpt_flags))
>> -                       serv->sv_tmpcnt--;
>> +       if (test_bit(XPT_TEMP, &xprt->xpt_flags))
>> +               serv->sv_tmpcnt--;
>> +
>> +       for (dr = svc_deferred_dequeue(xprt); dr;
>> +            dr = svc_deferred_dequeue(xprt)) {
>>                 svc_xprt_put(xprt);
>> +               kfree(dr);
>>         }
>> +

If there are queued deferrals that are not processed, they will be 
cleaned up here and their references dropped.

>> +       svc_xprt_put(xprt);
>>         spin_unlock_bh(&serv->sv_lock);
>>  }
>>
>> @@ -902,7 +911,8 @@ static void svc_revisit(struct cache_deferred_req *dreq, int too_many)
>>                 container_of(dreq, struct svc_deferred_req, handle);
>>         struct svc_xprt *xprt = dr->xprt;
>>
>> -       if (too_many) {
>> +       if (too_many || test_bit(XPT_DEAD, &xprt->xpt_flags)) {

If there were references held by the cache, they will get cleaned up here.

>> +               dprintk("revisit cancelled\n");
>>                 svc_xprt_put(xprt);
>>                 kfree(dr);
>>                 return;
>>
> 
> I see nothing that stops svc_delete_xprt() from setting XPT_DEAD after
> the above test in svc_revisit(), and before the test inside
> svc_xprt_enqueue(). What's preventing a race there?

Yep. I originally had a lock around the check for XPT_DEAD and the 
deferred enqueue, but convinced myself (incorrectly) it wasn't 
necessary. Erf.

Thanks, I'll repost a fix...

> 
> Trond



* Re: [PATCH 3/3] SUNRPC: svc_xprt_enqueue should not refuse to enqueue 'XPT_DEAD' transports
       [not found]                                   ` <1231097131.7363.11.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
@ 2009-01-05  3:33                                     ` Tom Tucker
  2009-01-05 17:04                                     ` Tom Tucker
  1 sibling, 0 replies; 131+ messages in thread
From: Tom Tucker @ 2009-01-05  3:33 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: J. Bruce Fields, linux-nfs

Trond Myklebust wrote:
> On Sun, 2009-01-04 at 14:12 -0500, Trond Myklebust wrote:
>> On Fri, 2009-01-02 at 15:44 -0600, Tom Tucker wrote:
>>> Bruce/Trond:
>>>
>>> This is an alternative to patches 2 and 3 from Trond's fix. I think 
>>> Trond's fix is correct, but I believe this approach to be simpler.
>>>
>>> From: Tom Tucker <tom@opengridcomputing.com>
>>> Date: Wed, 31 Dec 2008 17:18:33 -0600
>>> Subject: [PATCH] svc: Clean up deferred requests on transport destruction
>>>
>>> A race between svc_revisit and svc_delete_xprt can result in
>>> deferred requests holding references on a transport that can never be
>>> recovered because dead transports are not enqueued for subsequent
>>> processing.
>>>
>>> Check for XPT_DEAD in revisit to clean up completing deferrals on a dead
>>> transport and sweep a transport's deferred queue to do the same for queued
>>> but unprocessed deferrals.
>>>
>>> Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
>>> ---
>>>  net/sunrpc/svc_xprt.c |   20 +++++++++++++++-----
>>>  1 files changed, 15 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
>>> index bf5b5cd..92ca5c6 100644
>>> --- a/net/sunrpc/svc_xprt.c
>>> +++ b/net/sunrpc/svc_xprt.c
>>> @@ -837,6 +837,11 @@ static void svc_age_temp_xprts(unsigned long closure)
>>>  void svc_delete_xprt(struct svc_xprt *xprt)
>>>  {
>>>         struct svc_serv *serv = xprt->xpt_server;
>>> +       struct svc_deferred_req *dr;
>>> +
>>> +       /* Only do this once */
>>> +       if (test_and_set_bit(XPT_DEAD, &xprt->xpt_flags))
>>> +               return;
>>>
>>>         dprintk("svc: svc_delete_xprt(%p)\n", xprt);
>>>         xprt->xpt_ops->xpo_detach(xprt);
>>> @@ -851,12 +856,16 @@ void svc_delete_xprt(struct svc_xprt *xprt)
>>>          * while still attached to a queue, the queue itself
>>>          * is about to be destroyed (in svc_destroy).
>>>          */
>>> -       if (!test_and_set_bit(XPT_DEAD, &xprt->xpt_flags)) {
>>> -               BUG_ON(atomic_read(&xprt->xpt_ref.refcount) < 2);
>>> -               if (test_bit(XPT_TEMP, &xprt->xpt_flags))
>>> -                       serv->sv_tmpcnt--;
>>> +       if (test_bit(XPT_TEMP, &xprt->xpt_flags))
>>> +               serv->sv_tmpcnt--;
>>> +
>>> +       for (dr = svc_deferred_dequeue(xprt); dr;
>>> +            dr = svc_deferred_dequeue(xprt)) {
>>>                 svc_xprt_put(xprt);
>>> +               kfree(dr);
>>>         }
>>> +
>>> +       svc_xprt_put(xprt);
>>>         spin_unlock_bh(&serv->sv_lock);
>>>  }
>>>
>>> @@ -902,7 +911,8 @@ static void svc_revisit(struct cache_deferred_req *dreq, int too_many)
>>>                 container_of(dreq, struct svc_deferred_req, handle);
>>>         struct svc_xprt *xprt = dr->xprt;
>>>
>>> -       if (too_many) {
>>> +       if (too_many || test_bit(XPT_DEAD, &xprt->xpt_flags)) {
>>> +               dprintk("revisit cancelled\n");
>>>                 svc_xprt_put(xprt);
>>>                 kfree(dr);
>>>                 return;
>>>
>> I see nothing that stops svc_delete_xprt() from setting XPT_DEAD after
>> the above test in svc_revisit(), and before the test inside
>> svc_xprt_enqueue(). What's preventing a race there?
> 
> I suppose one way to fix it would be to hold the xprt->xpt_lock across
> the above test, and to make sure that you set XPT_DEFERRED while holding
> the lock, and _before_ you test for XPT_DEAD. That way, you guarantee
> that the svc_deferred_dequeue() loop in svc_delete_xprt() will pick up
> anything that races with the setting of XPT_DEAD.
> 

Yes, see previous post. Thanks Trond.


> Trond
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html



* Re: [PATCH 0/3] NFS regression in 2.6.26?, "task blocked for more than 120 seconds"
       [not found]                                           ` <1230071647.17701.27.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
@ 2009-01-05 12:18                                             ` Kasparek Tomas
  2009-01-09 14:56                                             ` Kasparek Tomas
  1 sibling, 0 replies; 131+ messages in thread
From: Kasparek Tomas @ 2009-01-05 12:18 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: linux-nfs

On Tue, Dec 23, 2008 at 05:34:07PM -0500, Trond Myklebust wrote:
> On Tue, 2008-12-16 at 13:05 +0100, Kasparek Tomas wrote:
> > Hm, not happy to say that but it still does not work after some time. Now
> > the problem is opposite there are no connections to the server according to
> > netstat on client, just time to time there is
> > 
> > pcnlp1.fit.vutbr.cz.15234 > kazi.fit.vutbr.cz.nfs: 40 null
> > kazi.fit.vutbr.cz.nfs > pcnlp1.fit.vutbr.cz.15234: reply ok 24 null
> > 
> > (kazi is server). Will try to investigate more details.
> 
> OK. Here is one more try. I've tightened up some locking issues with the
> previous patch.
> 
> Thanks for helping test this!

Finally back from vacation; give me a few days to test it.

Thanks for your help!

--   

  Tomas Kasparek, PhD student  E-mail: kasparek@fit.vutbr.cz
  CVT FIT VUT Brno, L127       Web:    http://www.fit.vutbr.cz/~kasparek
  Bozetechova 1, 612 66        Fax:    +420 54114-1270
  Brno, Czech Republic         Phone:  +420 54114-1220

  jabber: tomas.kasparek-2ASvDZBniIelVyrhU4qvOw@public.gmane.org
  GPG: 2F1E 1AAF FD3B CFA3 1537  63BD DCBE 18FF A035 53BC



* Re: [PATCH 3/3] SUNRPC: svc_xprt_enqueue should not refuse to enqueue 'XPT_DEAD' transports
       [not found]                                   ` <1231097131.7363.11.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
  2009-01-05  3:33                                     ` Tom Tucker
@ 2009-01-05 17:04                                     ` Tom Tucker
  2009-01-05 17:13                                       ` Trond Myklebust
  1 sibling, 1 reply; 131+ messages in thread
From: Tom Tucker @ 2009-01-05 17:04 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: J. Bruce Fields, linux-nfs

Trond Myklebust wrote:
> On Sun, 2009-01-04 at 14:12 -0500, Trond Myklebust wrote:
[...snip...]

>> I see nothing that stops svc_delete_xprt() from setting XPT_DEAD after
>> the above test in svc_revisit(), and before the test inside
>> svc_xprt_enqueue(). What's preventing a race there?
> 
> I suppose one way to fix it would be to hold the xprt->xpt_lock across
> the above test, and to make sure that you set XPT_DEFERRED while holding
> the lock, and _before_ you test for XPT_DEAD. That way, you guarantee
> that the svc_deferred_dequeue() loop in svc_delete_xprt() will pick up
> anything that races with the setting of XPT_DEAD.
> 
> Trond

I think this patch fixes this. Thanks again,

From: Tom Tucker <tom@opengridcomputing.com>
Date: Mon, 5 Jan 2009 10:56:03 -0600
Subject: [PATCH] svc: Clean up deferred requests on transport destruction

A race between svc_revisit and svc_delete_xprt can result in
deferred requests holding references on a transport that can never be
recovered because dead transports are not enqueued for subsequent
processing.

Check for XPT_DEAD in revisit to clean up completing deferrals on a dead
transport and sweep a transport's deferred queue to do the same for queued
but unprocessed deferrals.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>

---
  net/sunrpc/svc_xprt.c |   29 ++++++++++++++++++++++-------
  1 files changed, 22 insertions(+), 7 deletions(-)

diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
index bf5b5cd..375a695 100644
--- a/net/sunrpc/svc_xprt.c
+++ b/net/sunrpc/svc_xprt.c
@@ -837,6 +837,15 @@ static void svc_age_temp_xprts(unsigned long closure)
  void svc_delete_xprt(struct svc_xprt *xprt)
  {
         struct svc_serv *serv = xprt->xpt_server;
+       struct svc_deferred_req *dr;
+
+       /* Only do this once */
+       spin_lock(&xprt->xpt_lock);
+       if (test_and_set_bit(XPT_DEAD, &xprt->xpt_flags)) {
+               spin_unlock(&xprt->xpt_lock);
+               return;
+       }
+       spin_unlock(&xprt->xpt_lock);

         dprintk("svc: svc_delete_xprt(%p)\n", xprt);
         xprt->xpt_ops->xpo_detach(xprt);
@@ -851,12 +860,16 @@ void svc_delete_xprt(struct svc_xprt *xprt)
          * while still attached to a queue, the queue itself
          * is about to be destroyed (in svc_destroy).
          */
-       if (!test_and_set_bit(XPT_DEAD, &xprt->xpt_flags)) {
-               BUG_ON(atomic_read(&xprt->xpt_ref.refcount) < 2);
-               if (test_bit(XPT_TEMP, &xprt->xpt_flags))
-                       serv->sv_tmpcnt--;
+       if (test_bit(XPT_TEMP, &xprt->xpt_flags))
+               serv->sv_tmpcnt--;
+
+       for (dr = svc_deferred_dequeue(xprt); dr;
+            dr = svc_deferred_dequeue(xprt)) {
                 svc_xprt_put(xprt);
+               kfree(dr);
         }
+
+       svc_xprt_put(xprt);
         spin_unlock_bh(&serv->sv_lock);
  }

@@ -902,17 +915,19 @@ static void svc_revisit(struct cache_deferred_req *dreq, int too_many)
                 container_of(dreq, struct svc_deferred_req, handle);
         struct svc_xprt *xprt = dr->xprt;

-       if (too_many) {
+       spin_lock(&xprt->xpt_lock);
+       if (too_many || test_bit(XPT_DEAD, &xprt->xpt_flags)) {
+               spin_unlock(&xprt->xpt_lock);
+               dprintk("revisit cancelled\n");
                 svc_xprt_put(xprt);
                 kfree(dr);
                 return;
         }
         dprintk("revisit queued\n");
         dr->xprt = NULL;
-       spin_lock(&xprt->xpt_lock);
         list_add(&dr->handle.recent, &xprt->xpt_deferred);
-       spin_unlock(&xprt->xpt_lock);
         set_bit(XPT_DEFERRED, &xprt->xpt_flags);
+       spin_unlock(&xprt->xpt_lock);
         svc_xprt_enqueue(xprt);
         svc_xprt_put(xprt);
  }


* Re: [PATCH 3/3] SUNRPC: svc_xprt_enqueue should not refuse to enqueue 'XPT_DEAD' transports
  2009-01-05 17:04                                     ` Tom Tucker
@ 2009-01-05 17:13                                       ` Trond Myklebust
       [not found]                                         ` <1231175613.7127.6.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
  0 siblings, 1 reply; 131+ messages in thread
From: Trond Myklebust @ 2009-01-05 17:13 UTC (permalink / raw)
  To: Tom Tucker; +Cc: J. Bruce Fields, linux-nfs

On Mon, 2009-01-05 at 11:04 -0600, Tom Tucker wrote:
> Trond Myklebust wrote:
> > On Sun, 2009-01-04 at 14:12 -0500, Trond Myklebust wrote:
> [...snip...]
> 
> >> I see nothing that stops svc_delete_xprt() from setting XPT_DEAD after
> >> the above test in svc_revisit(), and before the test inside
> >> svc_xprt_enqueue(). What's preventing a race there?
> > 
> > I suppose one way to fix it would be to hold the xprt->xpt_lock across
> > the above test, and to make sure that you set XPT_DEFERRED while holding
> > the lock, and _before_ you test for XPT_DEAD. That way, you guarantee
> > that the svc_deferred_dequeue() loop in svc_delete_xprt() will pick up
> > anything that races with the setting of XPT_DEAD.
> > 
> > Trond
> 
> I think this patch fixes this. Thanks again,
> 
> From: Tom Tucker <tom@opengridcomputing.com>
> Date: Mon, 5 Jan 2009 10:56:03 -0600
> Subject: [PATCH] svc: Clean up deferred requests on transport destruction
> 
> A race between svc_revisit and svc_delete_xprt can result in
> deferred requests holding references on a transport that can never be
> recovered because dead transports are not enqueued for subsequent
> processing.
> 
> Check for XPT_DEAD in revisit to clean up completing deferrals on a dead
> transport and sweep a transport's deferred queue to do the same for queued
> but unprocessed deferrals.
> 
> Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
> 
> ---
>   net/sunrpc/svc_xprt.c |   29 ++++++++++++++++++++++-------
>   1 files changed, 22 insertions(+), 7 deletions(-)
> 
> diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
> index bf5b5cd..375a695 100644
> --- a/net/sunrpc/svc_xprt.c
> +++ b/net/sunrpc/svc_xprt.c
> @@ -837,6 +837,15 @@ static void svc_age_temp_xprts(unsigned long closure)
>   void svc_delete_xprt(struct svc_xprt *xprt)
>   {
>          struct svc_serv *serv = xprt->xpt_server;
> +       struct svc_deferred_req *dr;
> +
> +       /* Only do this once */
> +       spin_lock(&xprt->xpt_lock);
> +       if (test_and_set_bit(XPT_DEAD, &xprt->xpt_flags)) {
> +               spin_unlock(&xprt->xpt_lock);
> +               return;
> +       }
> +       spin_unlock(&xprt->xpt_lock);

You shouldn't need to take the spinlock here if you just move the line
	set_bit(XPT_DEFERRED, &xprt->xpt_flags);
in svc_revisit(). See below...

>          dprintk("svc: svc_delete_xprt(%p)\n", xprt);
>          xprt->xpt_ops->xpo_detach(xprt);
> @@ -851,12 +860,16 @@ void svc_delete_xprt(struct svc_xprt *xprt)
>           * while still attached to a queue, the queue itself
>           * is about to be destroyed (in svc_destroy).
>           */
> -       if (!test_and_set_bit(XPT_DEAD, &xprt->xpt_flags)) {
> -               BUG_ON(atomic_read(&xprt->xpt_ref.refcount) < 2);
> -               if (test_bit(XPT_TEMP, &xprt->xpt_flags))
> -                       serv->sv_tmpcnt--;
> +       if (test_bit(XPT_TEMP, &xprt->xpt_flags))
> +               serv->sv_tmpcnt--;
> +
> +       for (dr = svc_deferred_dequeue(xprt); dr;
> +            dr = svc_deferred_dequeue(xprt)) {
>                  svc_xprt_put(xprt);
> +               kfree(dr);
>          }
> +
> +       svc_xprt_put(xprt);
>          spin_unlock_bh(&serv->sv_lock);
>   }
> 
> @@ -902,17 +915,19 @@ static void svc_revisit(struct cache_deferred_req *dreq, int too_many)
>                  container_of(dreq, struct svc_deferred_req, handle);
>          struct svc_xprt *xprt = dr->xprt;
> 
> -       if (too_many) {
> +       spin_lock(&xprt->xpt_lock);

 +        set_bit(XPT_DEFERRED, &xprt->xpt_flags);


> +       if (too_many || test_bit(XPT_DEAD, &xprt->xpt_flags)) {
> +               spin_unlock(&xprt->xpt_lock);
> +               dprintk("revisit cancelled\n");
>                  svc_xprt_put(xprt);
>                  kfree(dr);
>                  return;
>          }
>          dprintk("revisit queued\n");
>          dr->xprt = NULL;
> -       spin_lock(&xprt->xpt_lock);
>          list_add(&dr->handle.recent, &xprt->xpt_deferred);
> -       spin_unlock(&xprt->xpt_lock);
>          set_bit(XPT_DEFERRED, &xprt->xpt_flags);

 -        set_bit(XPT_DEFERRED, &xprt->xpt_flags);

> +       spin_unlock(&xprt->xpt_lock);
>          svc_xprt_enqueue(xprt);
>          svc_xprt_put(xprt);
>   }



* Re: [PATCH 3/3] SUNRPC: svc_xprt_enqueue should not refuse to enqueue 'XPT_DEAD' transports
       [not found]                                         ` <1231175613.7127.6.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
@ 2009-01-05 19:33                                           ` Tom Tucker
  2009-01-05 19:51                                             ` Trond Myklebust
  0 siblings, 1 reply; 131+ messages in thread
From: Tom Tucker @ 2009-01-05 19:33 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: J. Bruce Fields, linux-nfs

Trond Myklebust wrote:
> On Mon, 2009-01-05 at 11:04 -0600, Tom Tucker wrote:
>> Trond Myklebust wrote:
>>> On Sun, 2009-01-04 at 14:12 -0500, Trond Myklebust wrote:
>> [...snip...]
>>
>>>> I see nothing that stops svc_delete_xprt() from setting XPT_DEAD after
>>>> the above test in svc_revisit(), and before the test inside
>>>> svc_xprt_enqueue(). What's preventing a race there?
>>> I suppose one way to fix it would be to hold the xprt->xpt_lock across
>>> the above test, and to make sure that you set XPT_DEFERRED while holding
>>> the lock, and _before_ you test for XPT_DEAD. That way, you guarantee
>>> that the svc_deferred_dequeue() loop in svc_delete_xprt() will pick up
>>> anything that races with the setting of XPT_DEAD.
>>>
>>> Trond
>> I think this patch fixes this. Thanks again,
>>
>> From: Tom Tucker <tom@opengridcomputing.com>
>> Date: Mon, 5 Jan 2009 10:56:03 -0600
>> Subject: [PATCH] svc: Clean up deferred requests on transport destruction
>>
>> A race between svc_revisit and svc_delete_xprt can result in
>> deferred requests holding references on a transport that can never be
>> recovered because dead transports are not enqueued for subsequent
>> processing.
>>
>> Check for XPT_DEAD in revisit to clean up completing deferrals on a dead
>> transport and sweep a transport's deferred queue to do the same for queued
>> but unprocessed deferrals.
>>
>> Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
>>
>> ---
>>   net/sunrpc/svc_xprt.c |   29 ++++++++++++++++++++++-------
>>   1 files changed, 22 insertions(+), 7 deletions(-)
>>
>> diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
>> index bf5b5cd..375a695 100644
>> --- a/net/sunrpc/svc_xprt.c
>> +++ b/net/sunrpc/svc_xprt.c
>> @@ -837,6 +837,15 @@ static void svc_age_temp_xprts(unsigned long closure)
>>   void svc_delete_xprt(struct svc_xprt *xprt)
>>   {
>>          struct svc_serv *serv = xprt->xpt_server;
>> +       struct svc_deferred_req *dr;
>> +
>> +       /* Only do this once */
>> +       spin_lock(&xprt->xpt_lock);
>> +       if (test_and_set_bit(XPT_DEAD, &xprt->xpt_flags)) {
>> +               spin_unlock(&xprt->xpt_lock);
>> +               return;
>> +       }
>> +       spin_unlock(&xprt->xpt_lock);
> 
> You shouldn't need to take the spinlock here if you just move the line
> 	set_bit(XPT_DEFERRED, &xprt->xpt_flags);
> in svc_revisit(). See below...
> 

I'm confused...sorry.

This lock is intended to avoid the following race:

				revisit:
					- test_bit == 0
svc_delete_xprt:
	- test_and_set_bit == 0
	- iterates over deferred queue,
	  but there's nothing in it yet
	  to clean up.

					- Adds deferred request to transport's
					  deferred list.
					- enqueue fails because XPT_DEAD is set

Now we've got a dangling reference.

The lock forces the delete to wait until the revisit has added
the deferral to the transport list.


>>          dprintk("svc: svc_delete_xprt(%p)\n", xprt);
>>          xprt->xpt_ops->xpo_detach(xprt);
>> @@ -851,12 +860,16 @@ void svc_delete_xprt(struct svc_xprt *xprt)
>>           * while still attached to a queue, the queue itself
>>           * is about to be destroyed (in svc_destroy).
>>           */
>> -       if (!test_and_set_bit(XPT_DEAD, &xprt->xpt_flags)) {
>> -               BUG_ON(atomic_read(&xprt->xpt_ref.refcount) < 2);
>> -               if (test_bit(XPT_TEMP, &xprt->xpt_flags))
>> -                       serv->sv_tmpcnt--;
>> +       if (test_bit(XPT_TEMP, &xprt->xpt_flags))
>> +               serv->sv_tmpcnt--;
>> +
>> +       for (dr = svc_deferred_dequeue(xprt); dr;
>> +            dr = svc_deferred_dequeue(xprt)) {
>>                  svc_xprt_put(xprt);
>> +               kfree(dr);
>>          }
>> +
>> +       svc_xprt_put(xprt);
>>          spin_unlock_bh(&serv->sv_lock);
>>   }
>>
>> @@ -902,17 +915,19 @@ static void svc_revisit(struct cache_deferred_req *dreq, int too_many)
>>                  container_of(dreq, struct svc_deferred_req, handle);
>>          struct svc_xprt *xprt = dr->xprt;
>>
>> -       if (too_many) {
>> +       spin_lock(&xprt->xpt_lock);
> 
>  +        set_bit(XPT_DEFERRED, &xprt->xpt_flags);
> 

Given the above, how does this avoid the race?

Thanks,
Tom

> 
>> +       if (too_many || test_bit(XPT_DEAD, &xprt->xpt_flags)) {
>> +               spin_unlock(&xprt->xpt_lock);
>> +               dprintk("revisit cancelled\n");
>>                  svc_xprt_put(xprt);
>>                  kfree(dr);
>>                  return;
>>          }
>>          dprintk("revisit queued\n");
>>          dr->xprt = NULL;
>> -       spin_lock(&xprt->xpt_lock);
>>          list_add(&dr->handle.recent, &xprt->xpt_deferred);
>> -       spin_unlock(&xprt->xpt_lock);
>>          set_bit(XPT_DEFERRED, &xprt->xpt_flags);
> 
>  -        set_bit(XPT_DEFERRED, &xprt->xpt_flags);
> 
>> +       spin_unlock(&xprt->xpt_lock);
>>          svc_xprt_enqueue(xprt);
>>          svc_xprt_put(xprt);
>>   }
> 



* Re: [PATCH 3/3] SUNRPC: svc_xprt_enqueue should not refuse to enqueue 'XPT_DEAD' transports
  2009-01-05 19:33                                           ` Tom Tucker
@ 2009-01-05 19:51                                             ` Trond Myklebust
       [not found]                                               ` <1231185115.7127.28.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
  0 siblings, 1 reply; 131+ messages in thread
From: Trond Myklebust @ 2009-01-05 19:51 UTC (permalink / raw)
  To: Tom Tucker; +Cc: J. Bruce Fields, linux-nfs

On Mon, 2009-01-05 at 13:33 -0600, Tom Tucker wrote:
> Trond Myklebust wrote:
> > On Mon, 2009-01-05 at 11:04 -0600, Tom Tucker wrote:
> >> Trond Myklebust wrote:
> >>> On Sun, 2009-01-04 at 14:12 -0500, Trond Myklebust wrote:
> >> [...snip...]
> >>
> >>>> I see nothing that stops svc_delete_xprt() from setting XPT_DEAD after
> >>>> the above test in svc_revisit(), and before the test inside
> >>>> svc_xprt_enqueue(). What's preventing a race there?
> >>> I suppose one way to fix it would be to hold the xprt->xpt_lock across
> >>> the above test, and to make sure that you set XPT_DEFERRED while holding
> >>> the lock, and _before_ you test for XPT_DEAD. That way, you guarantee
> >>> that the svc_deferred_dequeue() loop in svc_delete_xprt() will pick up
> >>> anything that races with the setting of XPT_DEAD.
> >>>
> >>> Trond
> >> I think this patch fixes this. Thanks again,
> >>
> >> From: Tom Tucker <tom@opengridcomputing.com>
> >> Date: Mon, 5 Jan 2009 10:56:03 -0600
> >> Subject: [PATCH] svc: Clean up deferred requests on transport destruction
> >>
> >> A race between svc_revisit and svc_delete_xprt can result in
> >> deferred requests holding references on a transport that can never be
> >> recovered because dead transports are not enqueued for subsequent
> >> processing.
> >>
> >> Check for XPT_DEAD in revisit to clean up completing deferrals on a dead
> >> transport and sweep a transport's deferred queue to do the same for queued
> >> but unprocessed deferrals.
> >>
> >> Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
> >>
> >> ---
> >>   net/sunrpc/svc_xprt.c |   29 ++++++++++++++++++++++-------
> >>   1 files changed, 22 insertions(+), 7 deletions(-)
> >>
> >> diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
> >> index bf5b5cd..375a695 100644
> >> --- a/net/sunrpc/svc_xprt.c
> >> +++ b/net/sunrpc/svc_xprt.c
> >> @@ -837,6 +837,15 @@ static void svc_age_temp_xprts(unsigned long closure)
> >>   void svc_delete_xprt(struct svc_xprt *xprt)
> >>   {
> >>          struct svc_serv *serv = xprt->xpt_server;
> >> +       struct svc_deferred_req *dr;
> >> +
> >> +       /* Only do this once */
> >> +       spin_lock(&xprt->xpt_lock);
> >> +       if (test_and_set_bit(XPT_DEAD, &xprt->xpt_flags)) {
> >> +               spin_unlock(&xprt->xpt_lock);
> >> +               return;
> >> +       }
> >> +       spin_unlock(&xprt->xpt_lock);
> > 
> > You shouldn't need to take the spinlock here if you just move the line
> > 	set_bit(XPT_DEFERRED, &xprt->xpt_flags);
> > in svc_revisit(). See below...
> > 
> 
> I'm confused...sorry.
> 
> This lock is intended to avoid the following race:
> 
> 				revisit:
> 					- test_bit == 0
> svc_delete_xprt:
> 	- test_and_set_bit == 0
> 	- iterates over deferred queue,
> 	  but there's nothing in it yet
> 	  to clean up.
> 
> 					- Adds deferred request to transport's
> 					  deferred list.
> 					- enqueue fails because XPT_DEAD is set
> 
> Now we've got a dangling reference.
> 
> The lock forces the delete to wait until the revisit has added
> the deferral to the transport list.
> 
> 
> >>          dprintk("svc: svc_delete_xprt(%p)\n", xprt);
> >>          xprt->xpt_ops->xpo_detach(xprt);
> >> @@ -851,12 +860,16 @@ void svc_delete_xprt(struct svc_xprt *xprt)
> >>           * while still attached to a queue, the queue itself
> >>           * is about to be destroyed (in svc_destroy).
> >>           */
> >> -       if (!test_and_set_bit(XPT_DEAD, &xprt->xpt_flags)) {
> >> -               BUG_ON(atomic_read(&xprt->xpt_ref.refcount) < 2);
> >> -               if (test_bit(XPT_TEMP, &xprt->xpt_flags))
> >> -                       serv->sv_tmpcnt--;
> >> +       if (test_bit(XPT_TEMP, &xprt->xpt_flags))
> >> +               serv->sv_tmpcnt--;
> >> +
> >> +       for (dr = svc_deferred_dequeue(xprt); dr;
> >> +            dr = svc_deferred_dequeue(xprt)) {
> >>                  svc_xprt_put(xprt);
> >> +               kfree(dr);
> >>          }
> >> +
> >> +       svc_xprt_put(xprt);
> >>          spin_unlock_bh(&serv->sv_lock);
> >>   }
> >>
> >> @@ -902,17 +915,19 @@ static void svc_revisit(struct cache_deferred_req *dreq, int too_many)
> >>                  container_of(dreq, struct svc_deferred_req, handle);
> >>          struct svc_xprt *xprt = dr->xprt;
> >>
> >> -       if (too_many) {
> >> +       spin_lock(&xprt->xpt_lock);
> > 
> >  +        set_bit(XPT_DEFERRED, &xprt->xpt_flags);
> > 
> 
> Given the above, how does this avoid the race?

By setting XPT_DEFERRED, you will force svc_deferred_dequeue to wait for
the xprt->xpt_lock, which you are already holding. At that point, it
would be OK for the test of XPT_DEAD to race, since you are still
holding the xpt_lock, so the loop over svc_deferred_dequeue() will catch
it...

Cheers
  Trond




^ permalink raw reply	[flat|nested] 131+ messages in thread

* Re: [PATCH 3/3] SUNRPC: svc_xprt_enqueue should not refuse to enqueue 'XPT_DEAD' transports
       [not found]                                               ` <1231185115.7127.28.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
@ 2009-01-05 20:13                                                 ` Tom Tucker
  2009-01-05 20:41                                                 ` Tom Tucker
  1 sibling, 0 replies; 131+ messages in thread
From: Tom Tucker @ 2009-01-05 20:13 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: J. Bruce Fields, linux-nfs

Trond Myklebust wrote:
> On Mon, 2009-01-05 at 13:33 -0600, Tom Tucker wrote:
>> Trond Myklebust wrote:
>>> On Mon, 2009-01-05 at 11:04 -0600, Tom Tucker wrote:
>>>> Trond Myklebust wrote:
>>>>> On Sun, 2009-01-04 at 14:12 -0500, Trond Myklebust wrote:
>>>> [...snip...]
>>>>
>>>>>> I see nothing that stops svc_delete_xprt() from setting XPT_DEAD after
>>>>>> the above test in svc_revisit(), and before the test inside
>>>>>> svc_xprt_enqueue(). What's preventing a race there?
>>>>> I suppose one way to fix it would be to hold the xprt->xpt_lock across
>>>>> the above test, and to make sure that you set XPT_DEFERRED while holding
>>>>> the lock, and _before_ you test for XPT_DEAD. That way, you guarantee
>>>>> that the svc_deferred_dequeue() loop in svc_delete_xprt() will pick up
>>>>> anything that races with the setting of XPT_DEAD.
>>>>>
>>>>> Trond
>>>> I think this patch fixes this. Thanks again,
>>>>
>>>> From: Tom Tucker <tom@opengridcomputing.com>
>>>> Date: Mon, 5 Jan 2009 10:56:03 -0600
>>>> Subject: [PATCH] svc: Clean up deferred requests on transport destruction
>>>>
>>>> A race between svc_revisit and svc_delete_xprt can result in
>>>> deferred requests holding references on a transport that can never be
>>>> recovered because dead transports are not enqueued for subsequent
>>>> processing.
>>>>
>>>> Check for XPT_DEAD in revisit to clean up completing deferrals on a dead
>>>> transport and sweep a transport's deferred queue to do the same for queued
>>>> but unprocessed deferrals.
>>>>
>>>> Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
>>>>
>>>> ---
>>>>   net/sunrpc/svc_xprt.c |   29 ++++++++++++++++++++++-------
>>>>   1 files changed, 22 insertions(+), 7 deletions(-)
>>>>
>>>> diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
>>>> index bf5b5cd..375a695 100644
>>>> --- a/net/sunrpc/svc_xprt.c
>>>> +++ b/net/sunrpc/svc_xprt.c
>>>> @@ -837,6 +837,15 @@ static void svc_age_temp_xprts(unsigned long closure)
>>>>   void svc_delete_xprt(struct svc_xprt *xprt)
>>>>   {
>>>>          struct svc_serv *serv = xprt->xpt_server;
>>>> +       struct svc_deferred_req *dr;
>>>> +
>>>> +       /* Only do this once */
>>>> +       spin_lock(&xprt->xpt_lock);
>>>> +       if (test_and_set_bit(XPT_DEAD, &xprt->xpt_flags)) {
>>>> +               spin_unlock(&xprt->xpt_lock);
>>>> +               return;
>>>> +       }
>>>> +       spin_unlock(&xprt->xpt_lock);
>>> You shouldn't need to take the spinlock here if you just move the line
>>> 	set_bit(XPT_DEFERRED, &xprt->xpt_flags);
>>> in svc_revisit(). See below...
>>>
>> I'm confused...sorry.
>>
>> This lock is intended to avoid the following race:
>>
>> 				revisit:
>> 					- test_bit == 0
>> svc_delete_xprt:
>> 	- test_and_set_bit == 0
>> 	- iterates over deferred queue,
>> 	  but there's nothing in it yet
>> 	  to clean up.
>>
>> 					- Adds deferred request to transport's
>> 					  deferred list.
>> 					- enqueue fails because XPT_DEAD is set
>>
>> Now we've got a dangling reference.
>>
>> The lock forces the delete to wait until the revisit has
>> added the deferral to the transport list.
>>
>>
>>>>          dprintk("svc: svc_delete_xprt(%p)\n", xprt);
>>>>          xprt->xpt_ops->xpo_detach(xprt);
>>>> @@ -851,12 +860,16 @@ void svc_delete_xprt(struct svc_xprt *xprt)
>>>>           * while still attached to a queue, the queue itself
>>>>           * is about to be destroyed (in svc_destroy).
>>>>           */
>>>> -       if (!test_and_set_bit(XPT_DEAD, &xprt->xpt_flags)) {
>>>> -               BUG_ON(atomic_read(&xprt->xpt_ref.refcount) < 2);
>>>> -               if (test_bit(XPT_TEMP, &xprt->xpt_flags))
>>>> -                       serv->sv_tmpcnt--;
>>>> +       if (test_bit(XPT_TEMP, &xprt->xpt_flags))
>>>> +               serv->sv_tmpcnt--;
>>>> +
>>>> +       for (dr = svc_deferred_dequeue(xprt); dr;
>>>> +            dr = svc_deferred_dequeue(xprt)) {
>>>>                  svc_xprt_put(xprt);
>>>> +               kfree(dr);
>>>>          }
>>>> +
>>>> +       svc_xprt_put(xprt);
>>>>          spin_unlock_bh(&serv->sv_lock);
>>>>   }
>>>>
>>>> @@ -902,17 +915,19 @@ static void svc_revisit(struct cache_deferred_req *dreq, int too_many)
>>>>                  container_of(dreq, struct svc_deferred_req, handle);
>>>>          struct svc_xprt *xprt = dr->xprt;
>>>>
>>>> -       if (too_many) {
>>>> +       spin_lock(&xprt->xpt_lock);
>>>  +        set_bit(XPT_DEFERRED, &xprt->xpt_flags);
>>>
>> Given the above, how does this avoid the race?
> 
> By setting XPT_DEFERRED, you will force svc_deferred_dequeue to wait for
> the xprt->xpt_lock, which you are already holding. At that point, it
> would be OK for the test of XPT_DEAD to race, since you are still
> holding the xpt_lock, so the loop over svc_deferred_dequeue() will catch
> it...
> 

Right. I should have looked more carefully.


> Cheers
>   Trond
> 
> 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html



* Re: [PATCH 3/3] SUNRPC: svc_xprt_enqueue should not refuse to enqueue 'XPT_DEAD' transports
       [not found]                                               ` <1231185115.7127.28.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
  2009-01-05 20:13                                                 ` Tom Tucker
@ 2009-01-05 20:41                                                 ` Tom Tucker
  2009-01-05 20:48                                                   ` Trond Myklebust
  1 sibling, 1 reply; 131+ messages in thread
From: Tom Tucker @ 2009-01-05 20:41 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: J. Bruce Fields, linux-nfs

Trond Myklebust wrote:
> On Mon, 2009-01-05 at 13:33 -0600, Tom Tucker wrote:
>> Trond Myklebust wrote:
>>> On Mon, 2009-01-05 at 11:04 -0600, Tom Tucker wrote:
>>>> Trond Myklebust wrote:
>>>>> On Sun, 2009-01-04 at 14:12 -0500, Trond Myklebust wrote:
>>>> [...snip...]
>>>>
>>>>>> I see nothing that stops svc_delete_xprt() from setting XPT_DEAD after
>>>>>> the above test in svc_revisit(), and before the test inside
>>>>>> svc_xprt_enqueue(). What's preventing a race there?
>>>>> I suppose one way to fix it would be to hold the xprt->xpt_lock across
>>>>> the above test, and to make sure that you set XPT_DEFERRED while holding
>>>>> the lock, and _before_ you test for XPT_DEAD. That way, you guarantee
>>>>> that the svc_deferred_dequeue() loop in svc_delete_xprt() will pick up
>>>>> anything that races with the setting of XPT_DEAD.
>>>>>
>>>>> Trond
>>>> I think this patch fixes this. Thanks again,
>>>>
>>>> From: Tom Tucker <tom@opengridcomputing.com>
>>>> Date: Mon, 5 Jan 2009 10:56:03 -0600
>>>> Subject: [PATCH] svc: Clean up deferred requests on transport destruction
>>>>
>>>> A race between svc_revisit and svc_delete_xprt can result in
>>>> deferred requests holding references on a transport that can never be
>>>> recovered because dead transports are not enqueued for subsequent
>>>> processing.
>>>>
>>>> Check for XPT_DEAD in revisit to clean up completing deferrals on a dead
>>>> transport and sweep a transport's deferred queue to do the same for queued
>>>> but unprocessed deferrals.
>>>>
>>>> Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
>>>>
>>>> ---
>>>>   net/sunrpc/svc_xprt.c |   29 ++++++++++++++++++++++-------
>>>>   1 files changed, 22 insertions(+), 7 deletions(-)
>>>>
>>>> diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
>>>> index bf5b5cd..375a695 100644
>>>> --- a/net/sunrpc/svc_xprt.c
>>>> +++ b/net/sunrpc/svc_xprt.c
>>>> @@ -837,6 +837,15 @@ static void svc_age_temp_xprts(unsigned long closure)
>>>>   void svc_delete_xprt(struct svc_xprt *xprt)
>>>>   {
>>>>          struct svc_serv *serv = xprt->xpt_server;
>>>> +       struct svc_deferred_req *dr;
>>>> +
>>>> +       /* Only do this once */
>>>> +       spin_lock(&xprt->xpt_lock);
>>>> +       if (test_and_set_bit(XPT_DEAD, &xprt->xpt_flags)) {
>>>> +               spin_unlock(&xprt->xpt_lock);
>>>> +               return;
>>>> +       }
>>>> +       spin_unlock(&xprt->xpt_lock);
>>> You shouldn't need to take the spinlock here if you just move the line
>>> 	set_bit(XPT_DEFERRED, &xprt->xpt_flags);
>>> in svc_revisit(). See below...
>>>
>> I'm confused...sorry.
>>
>> This lock is intended to avoid the following race:
>>
>> 				revisit:
>> 					- test_bit == 0
>> svc_delete_xprt:
>> 	- test_and_set_bit == 0
>> 	- iterates over deferred queue,
>> 	  but there's nothing in it yet
>> 	  to clean up.
>>
>> 					- Adds deferred request to transport's
>> 					  deferred list.
>> 					- enqueue fails because XPT_DEAD is set
>>
>> Now we've got a dangling reference.
>>
>> The lock forces the delete to wait until the revisit has
>> added the deferral to the transport list.
>>
>>
>>>>          dprintk("svc: svc_delete_xprt(%p)\n", xprt);
>>>>          xprt->xpt_ops->xpo_detach(xprt);
>>>> @@ -851,12 +860,16 @@ void svc_delete_xprt(struct svc_xprt *xprt)
>>>>           * while still attached to a queue, the queue itself
>>>>           * is about to be destroyed (in svc_destroy).
>>>>           */
>>>> -       if (!test_and_set_bit(XPT_DEAD, &xprt->xpt_flags)) {
>>>> -               BUG_ON(atomic_read(&xprt->xpt_ref.refcount) < 2);
>>>> -               if (test_bit(XPT_TEMP, &xprt->xpt_flags))
>>>> -                       serv->sv_tmpcnt--;
>>>> +       if (test_bit(XPT_TEMP, &xprt->xpt_flags))
>>>> +               serv->sv_tmpcnt--;
>>>> +
>>>> +       for (dr = svc_deferred_dequeue(xprt); dr;
>>>> +            dr = svc_deferred_dequeue(xprt)) {
>>>>                  svc_xprt_put(xprt);
>>>> +               kfree(dr);
>>>>          }
>>>> +
>>>> +       svc_xprt_put(xprt);
>>>>          spin_unlock_bh(&serv->sv_lock);
>>>>   }
>>>>
>>>> @@ -902,17 +915,19 @@ static void svc_revisit(struct cache_deferred_req *dreq, int too_many)
>>>>                  container_of(dreq, struct svc_deferred_req, handle);
>>>>          struct svc_xprt *xprt = dr->xprt;
>>>>
>>>> -       if (too_many) {
>>>> +       spin_lock(&xprt->xpt_lock);
>>>  +        set_bit(XPT_DEFERRED, &xprt->xpt_flags);
>>>
>> Given the above, how does this avoid the race?
> 
> By setting XPT_DEFERRED, you will force svc_deferred_dequeue to wait for
> the xprt->xpt_lock, which you are already holding. At that point, it
> would be OK for the test of XPT_DEAD to race, since you are still
> holding the xpt_lock, so the loop over svc_deferred_dequeue() will catch
> it...
> 

Trond, this version removes the extra unnecessary spin locks.

If it's ok, do you want me to resend to Bruce, or
do you want to couple it with your other patch?

Thanks,
Tom

A race between svc_revisit and svc_delete_xprt can result in
deferred requests holding references on a transport that can never be
recovered because dead transports are not enqueued for subsequent
processing.

Check for XPT_DEAD in revisit to clean up completing deferrals on a dead
transport and sweep a transport's deferred queue to do the same for queued
but unprocessed deferrals.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>

---
  net/sunrpc/svc_xprt.c |   25 ++++++++++++++++++-------
  1 files changed, 18 insertions(+), 7 deletions(-)

diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
index bf5b5cd..6bdbb79 100644
--- a/net/sunrpc/svc_xprt.c
+++ b/net/sunrpc/svc_xprt.c
@@ -837,6 +837,11 @@ static void svc_age_temp_xprts(unsigned long closure)
  void svc_delete_xprt(struct svc_xprt *xprt)
  {
  	struct svc_serv	*serv = xprt->xpt_server;
+	struct svc_deferred_req *dr;
+
+	/* Only do this once */
+	if (test_and_set_bit(XPT_DEAD, &xprt->xpt_flags))
+		return;

  	dprintk("svc: svc_delete_xprt(%p)\n", xprt);
  	xprt->xpt_ops->xpo_detach(xprt);
@@ -851,12 +856,16 @@ void svc_delete_xprt(struct svc_xprt *xprt)
  	 * while still attached to a queue, the queue itself
  	 * is about to be destroyed (in svc_destroy).
  	 */
-	if (!test_and_set_bit(XPT_DEAD, &xprt->xpt_flags)) {
-		BUG_ON(atomic_read(&xprt->xpt_ref.refcount) < 2);
-		if (test_bit(XPT_TEMP, &xprt->xpt_flags))
-			serv->sv_tmpcnt--;
+	if (test_bit(XPT_TEMP, &xprt->xpt_flags))
+		serv->sv_tmpcnt--;
+
+	for (dr = svc_deferred_dequeue(xprt); dr;
+	     dr = svc_deferred_dequeue(xprt)) {
  		svc_xprt_put(xprt);
+		kfree(dr);
  	}
+
+	svc_xprt_put(xprt);
  	spin_unlock_bh(&serv->sv_lock);
  }

@@ -902,17 +911,19 @@ static void svc_revisit(struct cache_deferred_req *dreq, int too_many)
  		container_of(dreq, struct svc_deferred_req, handle);
  	struct svc_xprt *xprt = dr->xprt;

-	if (too_many) {
+	spin_lock(&xprt->xpt_lock);
+	set_bit(XPT_DEFERRED, &xprt->xpt_flags);
+	if (too_many || test_bit(XPT_DEAD, &xprt->xpt_flags)) {
+		spin_unlock(&xprt->xpt_lock);
+		dprintk("revisit cancelled\n");
  		svc_xprt_put(xprt);
  		kfree(dr);
  		return;
  	}
  	dprintk("revisit queued\n");
  	dr->xprt = NULL;
-	spin_lock(&xprt->xpt_lock);
  	list_add(&dr->handle.recent, &xprt->xpt_deferred);
  	spin_unlock(&xprt->xpt_lock);
-	set_bit(XPT_DEFERRED, &xprt->xpt_flags);
  	svc_xprt_enqueue(xprt);
  	svc_xprt_put(xprt);
  }


> Cheers
>   Trond
> 
> 



* Re: [PATCH 3/3] SUNRPC: svc_xprt_enqueue should not refuse to enqueue 'XPT_DEAD' transports
  2009-01-05 20:41                                                 ` Tom Tucker
@ 2009-01-05 20:48                                                   ` Trond Myklebust
       [not found]                                                     ` <1231188518.7127.30.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
  0 siblings, 1 reply; 131+ messages in thread
From: Trond Myklebust @ 2009-01-05 20:48 UTC (permalink / raw)
  To: Tom Tucker; +Cc: J. Bruce Fields, linux-nfs

On Mon, 2009-01-05 at 14:41 -0600, Tom Tucker wrote:
> Trond, this version removes the extra unnecessary spin locks.
> 
> If it's ok, do you want me to resend to Bruce, or
> do you want to couple it with your other patch?

As long as it applies on top of my patch, then just send it on to Bruce.

Cheers
  Trond



* Re: [PATCH 3/3] SUNRPC: svc_xprt_enqueue should not refuse to enqueue 'XPT_DEAD' transports
       [not found]                                                     ` <1231188518.7127.30.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
@ 2009-01-05 21:10                                                       ` Tom Tucker
  0 siblings, 0 replies; 131+ messages in thread
From: Tom Tucker @ 2009-01-05 21:10 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: J. Bruce Fields, linux-nfs

Trond Myklebust wrote:
> On Mon, 2009-01-05 at 14:41 -0600, Tom Tucker wrote:
>> Trond, this version removes the extra unnecessary spin locks.
>>
>> If it's ok, do you want me to resend to Bruce, or
>> do you want to couple it with your other patch?
> 
> As long as it applies on top of my patch, then just send it on to Bruce.
> 
> Cheers
>   Trond

This patch only touches svc_xprt.c and yours only touches svcsock.c,
so we're clean.

Thanks for your help,
Tom


* Re: [PATCH 0/3] NFS regression in 2.6.26?, "task blocked for more than 120 seconds"
@ 2009-01-07 22:21                                 ` J. Bruce Fields
  0 siblings, 0 replies; 131+ messages in thread
From: J. Bruce Fields @ 2009-01-07 22:21 UTC (permalink / raw)
  To: Ian Campbell
  Cc: Trond Myklebust, linux-nfs, Max Kellermann, linux-kernel, gcosta,
	Grant Coady, Tom Tucker

On Tue, Dec 16, 2008 at 06:39:35PM +0000, Ian Campbell wrote:
> That's right, it was actually 2.6.26.7 FWIW.
> 
> > I'll try to take a look at these before I leave for the holidays,
> > assuming the versions Trond posted on Nov. 30 are the latest.
> 
> Thanks.

Sorry for getting behind.

If you got a chance to retest with the for-2.6.29 branch at

	git://linux-nfs.org/~bfields/linux.git for-2.6.29

that'd be great; that's what I intend to send to Linus.

--b.



* Re: [PATCH 0/3] NFS regression in 2.6.26?, "task blocked for more than 120 seconds"
  2009-01-07 22:21                                 ` J. Bruce Fields
@ 2009-01-08 18:20                                   ` J. Bruce Fields
  -1 siblings, 0 replies; 131+ messages in thread
From: J. Bruce Fields @ 2009-01-08 18:20 UTC (permalink / raw)
  To: Ian Campbell
  Cc: Trond Myklebust, linux-nfs, Max Kellermann, linux-kernel, gcosta,
	Grant Coady, Tom Tucker

On Wed, Jan 07, 2009 at 05:21:15PM -0500, J. Bruce Fields wrote:
> On Tue, Dec 16, 2008 at 06:39:35PM +0000, Ian Campbell wrote:
> > That's right, it was actually 2.6.26.7 FWIW.
> > 
> > > I'll try to take a look at these before I leave for the holidays,
> > > assuming the versions Trond posted on Nov. 30 are the latest.
> > 
> > Thanks.
> 
> Sorry for getting behind.
> 
> If you got a chance to retest with the for-2.6.29 branch at
> 
> 	git://linux-nfs.org/~bfields/linux.git for-2.6.29
> 
> that'd be great; that's what I intend to send to Linus.

(Merged now, so testing mainline as of today should work too.)

--b.



* Re: [PATCH 0/3] NFS regression in 2.6.26?, "task blocked for more than 120 seconds"
  2009-01-08 18:20                                   ` J. Bruce Fields
@ 2009-01-08 21:22                                     ` Ian Campbell
  -1 siblings, 0 replies; 131+ messages in thread
From: Ian Campbell @ 2009-01-08 21:22 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: Trond Myklebust, linux-nfs, Max Kellermann, linux-kernel, gcosta,
	Grant Coady, Tom Tucker


On Thu, 2009-01-08 at 13:20 -0500, J. Bruce Fields wrote:
> On Wed, Jan 07, 2009 at 05:21:15PM -0500, J. Bruce Fields wrote:
> > On Tue, Dec 16, 2008 at 06:39:35PM +0000, Ian Campbell wrote:
> > > That's right, it was actually 2.6.26.7 FWIW.
> > > 
> > > > I'll try to take a look at these before I leave for the holidays,
> > > > assuming the versions Trond posted on Nov. 30 are the latest.
> > > 
> > > Thanks.
> > 
> > Sorry for getting behind.
> > 
> > If you got a chance to retest with the for-2.6.29 branch at
> > 
> > 	git://linux-nfs.org/~bfields/linux.git for-2.6.29
> > 
> > that'd be great; that's what I intend to send to Linus.
> 
> (Merged now, so testing mainline as of today should work too.)

The server isn't really a machine I want to test random kernels on; is
there some subset of those changesets that it would be useful for me to
pull back onto the 2.6.26 kernel I'm using to test? (I can most likely
manage the backporting myself.)

These two look like the relevant ones to me but I'm not sure:
22945e4a1c7454c97f5d8aee1ef526c83fef3223 svc: Clean up deferred requests on transport destruction
69b6ba3712b796a66595cfaf0a5ab4dfe1cf964a SUNRPC: Ensure the server closes sockets in a timely fashion

I think 69b6 was in the set of three I tested previously and the other
two turned into 2294?

Ian.

Full list 238c6d54830c624f34ac9cf123ac04aebfca5013..nfs-bfields/for-2.6.29:
db43910cb42285a99f45f7e0a0a32e32d0b61dcf nfsd: get rid of NFSD_VERSION
87df4de8073f922a1f643b9fa6ba0412d5529ecf nfsd: last_byte_offset
4e65ebf08951326709817e654c149d0a94982e01 nfsd: delete wrong file comment from nfsd/nfs4xdr.c
df96fcf02a5fd2ae4e9b09e079dd6ef12d10ecd7 nfsd: git rid of nfs4_cb_null_ops declaration
0407717d8587f60003f4904bff27650cd836c00c nfsd: dprint each op status in nfsd4_proc_compound
b7aeda40d3010666d2c024c80557b6aa92a1a1ad nfsd: add etoosmall to nfserrno
30fa8c0157e4591ee2227aaa0b17cd3b0da5e6cb NFSD: FIDs need to take precedence over UUIDs
24c3767e41a6a59d32bb45abe899eb194e6bf1b8 SUNRPC: The sunrpc server code should not be used by out-of-tree modules
22945e4a1c7454c97f5d8aee1ef526c83fef3223 svc: Clean up deferred requests on transport destruction
9a8d248e2d2e9c880ac4561f27fea5dc200655bd nfsd: fix double-locks of directory mutex
2779e3ae39645515cb6c1126634f47c28c9e7190 svc: Move kfree of deferral record to common code
f05ef8db1abe68e3f6fc272efee51bc54ce528c5 CRED: Fix NFSD regression
0dba7c2a9ed3d4a1e58f5d94fffa9f44dbe012e6 NLM: Clean up flow of control in make_socks() function
d3fe5ea7cf815c037c90b1f1464ffc1ab5e8601b NLM: Refactor make_socks() function
55ef1274dddd4de387c54d110e354ffbb6cdc706 nfsd: Ensure nfsv4 calls the underlying filesystem on LOCKT
69b6ba3712b796a66595cfaf0a5ab4dfe1cf964a SUNRPC: Ensure the server closes sockets in a timely fashion
262a09823bb07c6aafb6c1d312cde613d0b90c85 NFSD: Add documenting comments for nfsctl interface
9e074856caf13ba83363f73759f5e395f74ccf41 NFSD: Replace open-coded integer with macro
54224f04ae95d86b27c0673cd773ebb120d86876 NFSD: Fix a handful of coding style issues in write_filehandle()
b046ccdc1f8171f6d0129dcc2a28d49187b4bf69 NFSD: clean up failover sysctl function naming
b064ec038a6180b13e5f89b6a30b42cb5ce8febc lockd: Enable NLM use of AF_INET6
57ef692588bc225853ca3267ca5b7cea2b07e058 NLM: Rewrite IPv4 privileged requester's check
d1208f70738c91f13b4eadb1b7a694082e439da2 NLM: nlm_privileged_requester() doesn't recognize mapped loopback address
49b5699b3fc22b363534c509c1b7dba06bc677bf NSM: Move nsm_create()
b7ba597fb964dfa44284904b3b3d74d44b8e1c42 NSM: Move nsm_use_hostnames to mon.c
8529bc51d30b8f001734b29b21a51b579c260f5b NSM: Move nsm_addr() to fs/lockd/mon.c
e6765b83977f07983c7a10e6bbb19d6c7bbfc3a4 NSM: Remove include/linux/lockd/sm_inter.h
94da7663db26530a8377f7219f8be8bd4d4822c2 NSM: Replace IP address as our nlm_reboot lookup key
77a3ef33e2de6fc8aabd7cb1700bfef81757c28a NSM: More clean up of nsm_get_handle()
b39b897c259fc1fd1998505f2b1d4ec1f115bce1 NSM: Refactor nsm_handle creation into a helper function
92fd91b998a5216a6d6606704e71d541a180216c NLM: Remove "create" argument from nsm_find()
8c7378fd2a5f22016542931b887a2ae98d146eaf NLM: Call nsm_reboot_lookup() instead of nsm_find()
3420a8c4359a189f7d854ed7075d151257415447 NSM: Add nsm_lookup() function
576df4634e37e46b441fefb91915184edb13bb94 NLM: Decode "priv" argument of NLMPROC_SM_NOTIFY as an opaque
7fefc9cb9d5f129c238d93166f705c96ca2e7e51 NLM: Change nlm_host_rebooted() to take a single nlm_reboot argument
cab2d3c99165abbba2943f1b269003b17fd3b1cb NSM: Encode the new "priv" cookie for NSMPROC_MON requests
7e44d3bea21fbb9494930d1cd35ca92a9a4a3279 NSM: Generate NSMPROC_MON's "priv" argument when nsm_handle is created
05f3a9af58180d24a9decedd71d4587935782d70 NSM: Remove !nsm check from nsm_release()
bc1cc6c4e476b60df48227165990c87a22db6bb7 NSM: Remove NULL pointer check from nsm_find()
5cf1c4b19db99d21d44c2ab457cfd44eb86b4439 NSM: Add dprintk() calls in nsm_find and nsm_release
67c6d107a689243979a2b5f15244b5261634a924 NSM: Move nsm_find() to fs/lockd/mon.c
03eb1dcbb799304b58730f4dba65812f49fb305e NSM: move to xdr_stream-based XDR encoders and decoders
36e8e668d3e6a61848a8921ddeb663b417299fa5 NSM: Move NSM program and procedure numbers to fs/lockd/mon.c
9c1bfd037f7ff8badaecb47418f109148d88bf45 NSM: Move NSM-related XDR data structures to lockd's xdr.h
0c7aef4569f8680951b7dee01dddffb9d2f809ff NSM: Check result of SM_UNMON upcall
356c3eb466fd1a12afd6448d90fba3922836e5f1 NLM: Move the public declaration of nsm_unmonitor() to lockd.h
c8c23c423dec49cb439697d3dc714e1500ff1610 NSM: Release nsmhandle in nlm_destroy_host
1e49323c4ab044d05bbc68cf13cadcbd4372468c NLM: Move the public declaration of nsm_monitor() to lockd.h
5d254b119823658cc318f88589c6c426b3d0a153 NSM: Make sure to return an error if the SM_MON call result is not zero
5bc74bef7c9b652f0f2aa9c5a8d5ac86881aba79 NSM: Remove BUG_ON() in nsm_monitor()
501c1ed3fb5c2648ba1709282c71617910917f66 NLM: Remove redundant printk() in nlmclnt_lock()
9fee49024ed19d849413df4ab6ec1a1a60aaae94 NSM: Use sm_name instead of h_name in nsm_monitor() and nsm_unmonitor()
29ed1407ed81086b778ebf12145b048ac3f7e10e NSM: Support IPv6 version of mon_name
f47534f7f0ac7727e05ec4274b764b181df2cf7f NSM: Use modern style for sm_name field in nsm_handle
5acf43155d1bcc412d892c73f64044f9a826cde6 NSM: convert printk(KERN_DEBUG) to a dprintk()
a4846750f090702e2fb848ac4fe5827bcef34060 NSM: Use C99 structure initializer to initialize nsm_args
afb03699dc0a920aed3322ad0e6895533941fb1e NLM: Add helper to handle IPv4 addresses
bc995801a09d1fead0bec1356bfd836911c8eed7 NLM: Support IPv6 scope IDs in nlm_display_address()
6999fb4016b2604c2f8a65586bba4a62a4b24ce7 NLM: Remove AF_UNSPEC arm in nlm_display_address()
1df40b609ad5a622904eb652109c287fe9c93ec5 NLM: Remove address eye-catcher buffers from nlm_host
7538ce1eb656a1477bedd5b1c202226e7abf5e7b NLM: Use modern style for pointer fields in nlm_host
c72a476b4b7ecadb80185de31236edb303c1a5d0 lockd: set svc_serv->sv_maxconn to a more reasonable value (try #3)
c9233eb7b0b11ef176d4bf68da2ce85464b6ec39 sunrpc: add sv_maxconn field to svc_serv (try #3)
548eaca46b3cf4419b6c2be839a106d8641ffb70 nfsd: document new filehandle fsid types
2bd9e7b62e6e1da3f881c40c73d93e9a212ce6de nfsd: Fix leaked memory in nfs4_make_rec_clidname
9346eff0dea1e5855fba25c9fe639d92a4db3135 nfsd: Minor cleanup of find_stateid
b3d47676d474ecd914c72049c87e71e5f0ffe040 nfsd: update fh_verify description


-- 
Ian Campbell

You're definitely on their list.  The question to ask next is what list it is.



05f3a9af58180d24a9decedd71d4587935782d70 NSM: Remove !nsm check from nsm_release()
bc1cc6c4e476b60df48227165990c87a22db6bb7 NSM: Remove NULL pointer check from nsm_find()
5cf1c4b19db99d21d44c2ab457cfd44eb86b4439 NSM: Add dprintk() calls in nsm_find and nsm_release
67c6d107a689243979a2b5f15244b5261634a924 NSM: Move nsm_find() to fs/lockd/mon.c
03eb1dcbb799304b58730f4dba65812f49fb305e NSM: move to xdr_stream-based XDR encoders and decoders
36e8e668d3e6a61848a8921ddeb663b417299fa5 NSM: Move NSM program and procedure numbers to fs/lockd/mon.c
9c1bfd037f7ff8badaecb47418f109148d88bf45 NSM: Move NSM-related XDR data structures to lockd's xdr.h
0c7aef4569f8680951b7dee01dddffb9d2f809ff NSM: Check result of SM_UNMON upcall
356c3eb466fd1a12afd6448d90fba3922836e5f1 NLM: Move the public declaration of nsm_unmonitor() to lockd.h
c8c23c423dec49cb439697d3dc714e1500ff1610 NSM: Release nsmhandle in nlm_destroy_host
1e49323c4ab044d05bbc68cf13cadcbd4372468c NLM: Move the public declaration of nsm_monitor() to lockd.h
5d254b119823658cc318f88589c6c426b3d0a153 NSM: Make sure to return an error if the SM_MON call result is not zero
5bc74bef7c9b652f0f2aa9c5a8d5ac86881aba79 NSM: Remove BUG_ON() in nsm_monitor()
501c1ed3fb5c2648ba1709282c71617910917f66 NLM: Remove redundant printk() in nlmclnt_lock()
9fee49024ed19d849413df4ab6ec1a1a60aaae94 NSM: Use sm_name instead of h_name in nsm_monitor() and nsm_unmonitor()
29ed1407ed81086b778ebf12145b048ac3f7e10e NSM: Support IPv6 version of mon_name
f47534f7f0ac7727e05ec4274b764b181df2cf7f NSM: Use modern style for sm_name field in nsm_handle
5acf43155d1bcc412d892c73f64044f9a826cde6 NSM: convert printk(KERN_DEBUG) to a dprintk()
a4846750f090702e2fb848ac4fe5827bcef34060 NSM: Use C99 structure initializer to initialize nsm_args
afb03699dc0a920aed3322ad0e6895533941fb1e NLM: Add helper to handle IPv4 addresses
bc995801a09d1fead0bec1356bfd836911c8eed7 NLM: Support IPv6 scope IDs in nlm_display_address()
6999fb4016b2604c2f8a65586bba4a62a4b24ce7 NLM: Remove AF_UNSPEC arm in nlm_display_address()
1df40b609ad5a622904eb652109c287fe9c93ec5 NLM: Remove address eye-catcher buffers from nlm_host
7538ce1eb656a1477bedd5b1c202226e7abf5e7b NLM: Use modern style for pointer fields in nlm_host
c72a476b4b7ecadb80185de31236edb303c1a5d0 lockd: set svc_serv->sv_maxconn to a more reasonable value (try #3)
c9233eb7b0b11ef176d4bf68da2ce85464b6ec39 sunrpc: add sv_maxconn field to svc_serv (try #3)
548eaca46b3cf4419b6c2be839a106d8641ffb70 nfsd: document new filehandle fsid types
2bd9e7b62e6e1da3f881c40c73d93e9a212ce6de nfsd: Fix leaked memory in nfs4_make_rec_clidname
9346eff0dea1e5855fba25c9fe639d92a4db3135 nfsd: Minor cleanup of find_stateid
b3d47676d474ecd914c72049c87e71e5f0ffe040 nfsd: update fh_verify description


-- 
Ian Campbell

You're definitely on their list.  The question to ask next is what list it is.


^ permalink raw reply	[flat|nested] 131+ messages in thread

* Re: [PATCH 0/3] NFS regression in 2.6.26?, "task blocked for more than 120 seconds"
@ 2009-01-08 21:26                                       ` J. Bruce Fields
  0 siblings, 0 replies; 131+ messages in thread
From: J. Bruce Fields @ 2009-01-08 21:26 UTC (permalink / raw)
  To: Ian Campbell
  Cc: Trond Myklebust, linux-nfs, Max Kellermann, linux-kernel, gcosta,
	Grant Coady, Tom Tucker

On Thu, Jan 08, 2009 at 09:22:33PM +0000, Ian Campbell wrote:
> On Thu, 2009-01-08 at 13:20 -0500, J. Bruce Fields wrote:
> > On Wed, Jan 07, 2009 at 05:21:15PM -0500, J. Bruce Fields wrote:
> > > On Tue, Dec 16, 2008 at 06:39:35PM +0000, Ian Campbell wrote:
> > > > That's right, it was actually 2.6.26.7 FWIW.
> > > > 
> > > > > I'll try to take a look at these before I leave for the holidays,
> > > > > assuming the versions Trond posted on Nov. 30 are the latest.
> > > > 
> > > > Thanks.
> > > 
> > > Sorry for getting behind.
> > > 
> > > If you got a chance to retest with the for-2.6.29 branch at
> > > 
> > > 	git://linux-nfs.org/~bfields/linux.git for-2.6.29
> > > 
> > > that'd be great; that's what I intend to send to Linus.
> > 
> > (Merged now, so testing mainline as of today should work too.)
> 
> The server isn't really a machine I want to test random kernels on, is
> there some subset of those changesets which it would be useful for me to
> pull back onto the 2.6.26 kernel I'm using to test? (I can most likely
> manage the backporting myself).
> 
> These two look like the relevant ones to me but I'm not sure:
> 22945e4a1c7454c97f5d8aee1ef526c83fef3223 svc: Clean up deferred requests on transport destruction
> 69b6ba3712b796a66595cfaf0a5ab4dfe1cf964a SUNRPC: Ensure the server closes sockets in a timely fashion
> 
> I think 69b6 was in the set of three I tested previously and the other
> two turned into 2294?

Yep, exactly.--b.

> 
> Ian.
> 
> Full list 238c6d54830c624f34ac9cf123ac04aebfca5013..nfs-bfields/for-2.6.29:
> db43910cb42285a99f45f7e0a0a32e32d0b61dcf nfsd: get rid of NFSD_VERSION
> 87df4de8073f922a1f643b9fa6ba0412d5529ecf nfsd: last_byte_offset
> 4e65ebf08951326709817e654c149d0a94982e01 nfsd: delete wrong file comment from nfsd/nfs4xdr.c
> df96fcf02a5fd2ae4e9b09e079dd6ef12d10ecd7 nfsd: git rid of nfs4_cb_null_ops declaration
> 0407717d8587f60003f4904bff27650cd836c00c nfsd: dprint each op status in nfsd4_proc_compound
> b7aeda40d3010666d2c024c80557b6aa92a1a1ad nfsd: add etoosmall to nfserrno
> 30fa8c0157e4591ee2227aaa0b17cd3b0da5e6cb NFSD: FIDs need to take precedence over UUIDs
> 24c3767e41a6a59d32bb45abe899eb194e6bf1b8 SUNRPC: The sunrpc server code should not be used by out-of-tree modules
> 22945e4a1c7454c97f5d8aee1ef526c83fef3223 svc: Clean up deferred requests on transport destruction
> 9a8d248e2d2e9c880ac4561f27fea5dc200655bd nfsd: fix double-locks of directory mutex
> 2779e3ae39645515cb6c1126634f47c28c9e7190 svc: Move kfree of deferral record to common code
> f05ef8db1abe68e3f6fc272efee51bc54ce528c5 CRED: Fix NFSD regression
> 0dba7c2a9ed3d4a1e58f5d94fffa9f44dbe012e6 NLM: Clean up flow of control in make_socks() function
> d3fe5ea7cf815c037c90b1f1464ffc1ab5e8601b NLM: Refactor make_socks() function
> 55ef1274dddd4de387c54d110e354ffbb6cdc706 nfsd: Ensure nfsv4 calls the underlying filesystem on LOCKT
> 69b6ba3712b796a66595cfaf0a5ab4dfe1cf964a SUNRPC: Ensure the server closes sockets in a timely fashion
> 262a09823bb07c6aafb6c1d312cde613d0b90c85 NFSD: Add documenting comments for nfsctl interface
> 9e074856caf13ba83363f73759f5e395f74ccf41 NFSD: Replace open-coded integer with macro
> 54224f04ae95d86b27c0673cd773ebb120d86876 NFSD: Fix a handful of coding style issues in write_filehandle()
> b046ccdc1f8171f6d0129dcc2a28d49187b4bf69 NFSD: clean up failover sysctl function naming
> b064ec038a6180b13e5f89b6a30b42cb5ce8febc lockd: Enable NLM use of AF_INET6
> 57ef692588bc225853ca3267ca5b7cea2b07e058 NLM: Rewrite IPv4 privileged requester's check
> d1208f70738c91f13b4eadb1b7a694082e439da2 NLM: nlm_privileged_requester() doesn't recognize mapped loopback address
> 49b5699b3fc22b363534c509c1b7dba06bc677bf NSM: Move nsm_create()
> b7ba597fb964dfa44284904b3b3d74d44b8e1c42 NSM: Move nsm_use_hostnames to mon.c
> 8529bc51d30b8f001734b29b21a51b579c260f5b NSM: Move nsm_addr() to fs/lockd/mon.c
> e6765b83977f07983c7a10e6bbb19d6c7bbfc3a4 NSM: Remove include/linux/lockd/sm_inter.h
> 94da7663db26530a8377f7219f8be8bd4d4822c2 NSM: Replace IP address as our nlm_reboot lookup key
> 77a3ef33e2de6fc8aabd7cb1700bfef81757c28a NSM: More clean up of nsm_get_handle()
> b39b897c259fc1fd1998505f2b1d4ec1f115bce1 NSM: Refactor nsm_handle creation into a helper function
> 92fd91b998a5216a6d6606704e71d541a180216c NLM: Remove "create" argument from nsm_find()
> 8c7378fd2a5f22016542931b887a2ae98d146eaf NLM: Call nsm_reboot_lookup() instead of nsm_find()
> 3420a8c4359a189f7d854ed7075d151257415447 NSM: Add nsm_lookup() function
> 576df4634e37e46b441fefb91915184edb13bb94 NLM: Decode "priv" argument of NLMPROC_SM_NOTIFY as an opaque
> 7fefc9cb9d5f129c238d93166f705c96ca2e7e51 NLM: Change nlm_host_rebooted() to take a single nlm_reboot argument
> cab2d3c99165abbba2943f1b269003b17fd3b1cb NSM: Encode the new "priv" cookie for NSMPROC_MON requests
> 7e44d3bea21fbb9494930d1cd35ca92a9a4a3279 NSM: Generate NSMPROC_MON's "priv" argument when nsm_handle is created
> 05f3a9af58180d24a9decedd71d4587935782d70 NSM: Remove !nsm check from nsm_release()
> bc1cc6c4e476b60df48227165990c87a22db6bb7 NSM: Remove NULL pointer check from nsm_find()
> 5cf1c4b19db99d21d44c2ab457cfd44eb86b4439 NSM: Add dprintk() calls in nsm_find and nsm_release
> 67c6d107a689243979a2b5f15244b5261634a924 NSM: Move nsm_find() to fs/lockd/mon.c
> 03eb1dcbb799304b58730f4dba65812f49fb305e NSM: move to xdr_stream-based XDR encoders and decoders
> 36e8e668d3e6a61848a8921ddeb663b417299fa5 NSM: Move NSM program and procedure numbers to fs/lockd/mon.c
> 9c1bfd037f7ff8badaecb47418f109148d88bf45 NSM: Move NSM-related XDR data structures to lockd's xdr.h
> 0c7aef4569f8680951b7dee01dddffb9d2f809ff NSM: Check result of SM_UNMON upcall
> 356c3eb466fd1a12afd6448d90fba3922836e5f1 NLM: Move the public declaration of nsm_unmonitor() to lockd.h
> c8c23c423dec49cb439697d3dc714e1500ff1610 NSM: Release nsmhandle in nlm_destroy_host
> 1e49323c4ab044d05bbc68cf13cadcbd4372468c NLM: Move the public declaration of nsm_monitor() to lockd.h
> 5d254b119823658cc318f88589c6c426b3d0a153 NSM: Make sure to return an error if the SM_MON call result is not zero
> 5bc74bef7c9b652f0f2aa9c5a8d5ac86881aba79 NSM: Remove BUG_ON() in nsm_monitor()
> 501c1ed3fb5c2648ba1709282c71617910917f66 NLM: Remove redundant printk() in nlmclnt_lock()
> 9fee49024ed19d849413df4ab6ec1a1a60aaae94 NSM: Use sm_name instead of h_name in nsm_monitor() and nsm_unmonitor()
> 29ed1407ed81086b778ebf12145b048ac3f7e10e NSM: Support IPv6 version of mon_name
> f47534f7f0ac7727e05ec4274b764b181df2cf7f NSM: Use modern style for sm_name field in nsm_handle
> 5acf43155d1bcc412d892c73f64044f9a826cde6 NSM: convert printk(KERN_DEBUG) to a dprintk()
> a4846750f090702e2fb848ac4fe5827bcef34060 NSM: Use C99 structure initializer to initialize nsm_args
> afb03699dc0a920aed3322ad0e6895533941fb1e NLM: Add helper to handle IPv4 addresses
> bc995801a09d1fead0bec1356bfd836911c8eed7 NLM: Support IPv6 scope IDs in nlm_display_address()
> 6999fb4016b2604c2f8a65586bba4a62a4b24ce7 NLM: Remove AF_UNSPEC arm in nlm_display_address()
> 1df40b609ad5a622904eb652109c287fe9c93ec5 NLM: Remove address eye-catcher buffers from nlm_host
> 7538ce1eb656a1477bedd5b1c202226e7abf5e7b NLM: Use modern style for pointer fields in nlm_host
> c72a476b4b7ecadb80185de31236edb303c1a5d0 lockd: set svc_serv->sv_maxconn to a more reasonable value (try #3)
> c9233eb7b0b11ef176d4bf68da2ce85464b6ec39 sunrpc: add sv_maxconn field to svc_serv (try #3)
> 548eaca46b3cf4419b6c2be839a106d8641ffb70 nfsd: document new filehandle fsid types
> 2bd9e7b62e6e1da3f881c40c73d93e9a212ce6de nfsd: Fix leaked memory in nfs4_make_rec_clidname
> 9346eff0dea1e5855fba25c9fe639d92a4db3135 nfsd: Minor cleanup of find_stateid
> b3d47676d474ecd914c72049c87e71e5f0ffe040 nfsd: update fh_verify description
> 
> 
> -- 
> Ian Campbell
> 
> You're definitely on their list.  The question to ask next is what list it is.




* Re: [PATCH 0/3] NFS regression in 2.6.26?, "task blocked for more than 120 seconds"
       [not found]                                           ` <1230071647.17701.27.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
  2009-01-05 12:18                                             ` Kasparek Tomas
@ 2009-01-09 14:56                                             ` Kasparek Tomas
  2009-01-09 17:59                                               ` Trond Myklebust
  1 sibling, 1 reply; 131+ messages in thread
From: Kasparek Tomas @ 2009-01-09 14:56 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: linux-nfs

On Tue, Dec 23, 2008 at 05:34:07PM -0500, Trond Myklebust wrote:
> On Tue, 2008-12-16 at 13:05 +0100, Kasparek Tomas wrote:
> > Hm, not happy to say that, but it still does not work after some time. Now
> > the problem is the opposite: there are no connections to the server according to
> > netstat on the client, just from time to time there is
> > 
> > pcnlp1.fit.vutbr.cz.15234 > kazi.fit.vutbr.cz.nfs: 40 null
> > kazi.fit.vutbr.cz.nfs > pcnlp1.fit.vutbr.cz.15234: reply ok 24 null
> > 
> > (kazi is server). Will try to investigate more details.
> 
> OK. Here is one more try. I've tightened up some locking issues with the
> previous patch.

I did try this new version. Applied it to 2.6.27.10, but the behaviour is
the same as with the first version - when old mounts are removed by amd,
new ones are not created; after a while, there are no sockets on the client
and just CLOSED sockets on the server. The client is issuing null RPC checks
every 30sec and they are OK, but no other communication between client and
server takes place.

15:45:41.238796 IP pcnlp1.897490 > kazi.nfs: 40 null
15:45:41.239009 IP kazi.nfs > pcnlp1.897490: reply ok 24 null
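[As an aside, the absence of client sockets described above can be
double-checked straight from the socket table. A minimal sketch, using a
fabricated /proc/net/tcp-style line for illustration (a live check would
read the real /proc/net/tcp or run netstat -tn); it relies only on NFS's
well-known TCP port 2049, which the kernel records there in hex as 0801:]

```shell
# NFS over TCP uses port 2049; /proc/net/tcp records ports in hex.
printf '%d\n' 0x0801    # prints 2049, confirming the hex port value

# Fabricated socket-table line for illustration; a real check would read
# /proc/net/tcp itself and count lines whose remote port is :0801.
sample='   1: 0100007F:C350 0A000001:0801 01 00000000:00000000'
echo "$sample" | grep -c ':0801'    # counts sockets to the NFS port
```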


I will try to get more info, but if you have some idea where to look or
what to try, it will be helpful.

Thanks for your help

--   

  Tomas Kasparek, PhD student  E-mail: kasparek@fit.vutbr.cz
  CVT FIT VUT Brno, L127       Web:    http://www.fit.vutbr.cz/~kasparek
  Bozetechova 1, 612 66        Fax:    +420 54114-1270
  Brno, Czech Republic         Phone:  +420 54114-1220

  jabber: tomas.kasparek-2ASvDZBniIelVyrhU4qvOw@public.gmane.org
  GPG: 2F1E 1AAF FD3B CFA3 1537  63BD DCBE 18FF A035 53BC



* Re: [PATCH 0/3] NFS regression in 2.6.26?, "task blocked for more than 120 seconds"
  2009-01-09 14:56                                             ` Kasparek Tomas
@ 2009-01-09 17:59                                               ` Trond Myklebust
       [not found]                                                 ` <1231523966.7179.67.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
  0 siblings, 1 reply; 131+ messages in thread
From: Trond Myklebust @ 2009-01-09 17:59 UTC (permalink / raw)
  To: Kasparek Tomas; +Cc: linux-nfs

On Fri, 2009-01-09 at 15:56 +0100, Kasparek Tomas wrote:
> On Tue, Dec 23, 2008 at 05:34:07PM -0500, Trond Myklebust wrote:
> > On Tue, 2008-12-16 at 13:05 +0100, Kasparek Tomas wrote:
> > > Hm, not happy to say that, but it still does not work after some time. Now
> > > the problem is the opposite: there are no connections to the server according to
> > > netstat on the client, just from time to time there is
> > > 
> > > pcnlp1.fit.vutbr.cz.15234 > kazi.fit.vutbr.cz.nfs: 40 null
> > > kazi.fit.vutbr.cz.nfs > pcnlp1.fit.vutbr.cz.15234: reply ok 24 null
> > > 
> > > (kazi is server). Will try to investigate more details.
> > 
> > OK. Here is one more try. I've tightened up some locking issues with the
> > previous patch.
> 
> I did try this new version. Applied it to 2.6.27.10, but the behaviour is
> the same as with the first version - when old mounts are removed by amd,
> new ones are not created; after a while, there are no sockets on the client
> and just CLOSED sockets on the server. The client is issuing null RPC checks
> every 30sec and they are OK, but no other communication between client and
> server takes place.
> 
> 15:45:41.238796 IP pcnlp1.897490 > kazi.nfs: 40 null
> 15:45:41.239009 IP kazi.nfs > pcnlp1.897490: reply ok 24 null
> 
> 
> I will try to get more info, but if you have some idea where to look or
> what to try, it will be helpful.
> 
> Thanks for your help

Wait. You're using amd when testing? Could you please rather retry using
just a static mount? amd has historically had way too many bugs
(particularly w.r.t. tcp) to be considered a reliable test.

-- 
Trond Myklebust
Linux NFS client maintainer

NetApp
Trond.Myklebust@netapp.com
www.netapp.com


* Re: [PATCH 0/3] NFS regression in 2.6.26?, "task blocked for more than 120 seconds"
       [not found]                                                 ` <1231523966.7179.67.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
@ 2009-01-10 10:24                                                   ` Kasparek Tomas
  2009-01-10 16:00                                                     ` Trond Myklebust
  0 siblings, 1 reply; 131+ messages in thread
From: Kasparek Tomas @ 2009-01-10 10:24 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: linux-nfs

On Fri, Jan 09, 2009 at 12:59:26PM -0500, Trond Myklebust wrote:
> On Fri, 2009-01-09 at 15:56 +0100, Kasparek Tomas wrote:
> > On Tue, Dec 23, 2008 at 05:34:07PM -0500, Trond Myklebust wrote:
> > > On Tue, 2008-12-16 at 13:05 +0100, Kasparek Tomas wrote:
> > > > Hm, not happy to say that, but it still does not work after some time. Now
> > > > the problem is the opposite: there are no connections to the server according to
> > > > netstat on the client, just from time to time there is
> > > > 
> > > > pcnlp1.fit.vutbr.cz.15234 > kazi.fit.vutbr.cz.nfs: 40 null
> > > > kazi.fit.vutbr.cz.nfs > pcnlp1.fit.vutbr.cz.15234: reply ok 24 null
> > > > 
> > > > (kazi is server). Will try to investigate more details.
> > > 
> > > OK. Here is one more try. I've tightened up some locking issues with the
> > > previous patch.
> > 
> > I did try this new version. Applied it to 2.6.27.10, but the behaviour is
> > the same as with the first version - when old mounts are removed by amd,
> > new ones are not created; after a while, there are no sockets on the client
> > and just CLOSED sockets on the server. The client is issuing null RPC checks
> > every 30sec and they are OK, but no other communication between client and
> > server takes place.
> > 
> > 15:45:41.238796 IP pcnlp1.897490 > kazi.nfs: 40 null
> > 15:45:41.239009 IP kazi.nfs > pcnlp1.897490: reply ok 24 null
> > 
> > 
> > I will try to get more info, but if you have some idea where to look or
> > what to try, it will be helpful.
> > 
> > Thanks for your help
> 
> Wait. You're using amd when testing? Could you please rather retry using
> just a static mount? amd has historically had way too many bugs
> (particularly w.r.t. tcp) to be considered a reliable test.

Will try, but with a static mount there is one stable TCP connection between
the client and server, so this problem cannot happen at all, can it?

--   

  Tomas Kasparek, PhD student  E-mail: kasparek@fit.vutbr.cz
  CVT FIT VUT Brno, L127       Web:    http://www.fit.vutbr.cz/~kasparek
  Bozetechova 1, 612 66        Fax:    +420 54114-1270
  Brno, Czech Republic         Phone:  +420 54114-1220

  jabber: tomas.kasparek-2ASvDZBniIelVyrhU4qvOw@public.gmane.org
  GPG: 2F1E 1AAF FD3B CFA3 1537  63BD DCBE 18FF A035 53BC



* Re: [PATCH 0/3] NFS regression in 2.6.26?, "task blocked for more than 120 seconds"
  2009-01-10 10:24                                                   ` Kasparek Tomas
@ 2009-01-10 16:00                                                     ` Trond Myklebust
       [not found]                                                       ` <20090112090404.GL47559@fit.vutbr.cz>
  0 siblings, 1 reply; 131+ messages in thread
From: Trond Myklebust @ 2009-01-10 16:00 UTC (permalink / raw)
  To: Kasparek Tomas; +Cc: linux-nfs

On Sat, 2009-01-10 at 11:24 +0100, Kasparek Tomas wrote:
> > Wait. You're using amd when testing? Could you please rather retry using
> > just a static mount? amd has historically had way too many bugs
> > (particularly w.r.t. tcp) to be considered a reliable test.
> 
> Will try, but with a static mount there is one stable TCP connection between
> the client and server, so this problem cannot happen at all, can it?

It can. The client will automatically disconnect when the mountpoint has
not been used for 5 minutes.

As for amd, I know that older versions used to set absolutely insane
timeouts for tcp connections. They'd use the same defaults as UDP, IOW
timeo=7, retrans=3 ('cat /proc/mounts' should be able to tell you if
that's the case for your setup).
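
[The check Trond suggests can be scripted. A minimal sketch, assuming a
fabricated /proc/mounts line for illustration - on a real client you would
read the live /proc/mounts instead:]

```shell
# Look for UDP-style defaults (timeo=7,retrans=3) on a TCP NFS mount,
# which old amd versions were known to set.
# The sample line below is fabricated; a real check would use /proc/mounts.
sample='kazi:/export /mnt nfs rw,proto=tcp,timeo=7,retrans=3,addr=10.0.0.1 0 0'
if echo "$sample" | grep -Eq 'proto=tcp.*timeo=7'; then
    echo 'suspicious UDP-style timeouts on a TCP mount'
fi
```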

-- 
Trond Myklebust
Linux NFS client maintainer

NetApp
Trond.Myklebust@netapp.com
www.netapp.com


* Re: [PATCH 0/3] NFS regression in 2.6.26?, "task blocked for more than 120 seconds"
  2009-01-08 21:26                                       ` J. Bruce Fields
@ 2009-01-12  9:46                                         ` Ian Campbell
  -1 siblings, 0 replies; 131+ messages in thread
From: Ian Campbell @ 2009-01-12  9:46 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: Trond Myklebust, linux-nfs, Max Kellermann, linux-kernel, gcosta,
	Grant Coady, Tom Tucker

On Thu, 2009-01-08 at 16:26 -0500, J. Bruce Fields wrote:
> On Thu, Jan 08, 2009 at 09:22:33PM +0000, Ian Campbell wrote:

> > These two look like the relevant ones to me but I'm not sure:
> > 22945e4a1c7454c97f5d8aee1ef526c83fef3223 svc: Clean up deferred requests on transport destruction
> > 69b6ba3712b796a66595cfaf0a5ab4dfe1cf964a SUNRPC: Ensure the server closes sockets in a timely fashion
> > 
> > I think 69b6 was in the set of three I tested previously and the other
> > two turned into 2294?
> 
> Yep, exactly.--b.

OK, I have patched my 2.6.26 kernel with these and it is now running.
It'll be about a week before I can say with any certainty that the issue
hasn't recurred.

Ian.
-- 
Ian Campbell
Current Noise: Pitchshifter - Subject To Status

Do I have a lifestyle yet?



* Re: [PATCH 0/3] NFS regression in 2.6.26?, "task blocked for more than 120 seconds"
       [not found]                                                             ` <1231809446.7322.17.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
@ 2009-01-13 15:22                                                               ` Kasparek Tomas
  2009-01-16 10:48                                                                 ` Kasparek Tomas
  0 siblings, 1 reply; 131+ messages in thread
From: Kasparek Tomas @ 2009-01-13 15:22 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: linux-nfs

On Mon, Jan 12, 2009 at 08:17:26PM -0500, Trond Myklebust wrote:
> On Mon, 2009-01-12 at 12:40 -0500, Trond Myklebust wrote:
> > On Mon, 2009-01-12 at 10:04 +0100, Kasparek Tomas wrote:
> > > Ok, I found that already. With static mount the behaviour is the same as
> > > with amd - no new connection is created and the client waits forever
> > > (~tens of hours at least).
> >  
> > OK. I now appear to be able to reproduce this problem. I should have a
> > fix ready soon.
> 
> The attached 2 patches have been tested using a server that was rigged
> not to ever close the socket. They appear to work fine on my setup,
> without the hang that you reported earlier.

After 8 hours it seems to work both with static mount and with amd. I will
let you know the state again after a few more days.

Thank you very much for your help.

--   

  Tomas Kasparek, PhD student  E-mail: kasparek@fit.vutbr.cz
  CVT FIT VUT Brno, L127       Web:    http://www.fit.vutbr.cz/~kasparek
  Bozetechova 1, 612 66        Fax:    +420 54114-1270
  Brno, Czech Republic         Phone:  +420 54114-1220

  jabber: tomas.kasparek-2ASvDZBniIelVyrhU4qvOw@public.gmane.org
  GPG: 2F1E 1AAF FD3B CFA3 1537  63BD DCBE 18FF A035 53BC


^ permalink raw reply	[flat|nested] 131+ messages in thread

* Re: [PATCH 0/3] NFS regression in 2.6.26?, "task blocked for more than 120 seconds"
  2009-01-13 15:22                                                               ` Kasparek Tomas
@ 2009-01-16 10:48                                                                 ` Kasparek Tomas
  2009-01-18 13:08                                                                   ` Kasparek Tomas
  0 siblings, 1 reply; 131+ messages in thread
From: Kasparek Tomas @ 2009-01-16 10:48 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: linux-nfs

On Tue, Jan 13, 2009 at 04:22:01PM +0100, Kasparek Tomas wrote:
> On Mon, Jan 12, 2009 at 08:17:26PM -0500, Trond Myklebust wrote:
> > On Mon, 2009-01-12 at 12:40 -0500, Trond Myklebust wrote:
> > > On Mon, 2009-01-12 at 10:04 +0100, Kasparek Tomas wrote:
> > > > Ok, I found that already. With static mount the behaviour is the same as
> > > > with amd - no new connection is created and the client waits forever
> > > > (~tens of hours at least).
> > >  
> > > OK. I now appear to be able to reproduce this problem. I should have a
> > > fix ready soon.
> > 
> > The attached 2 patches have been tested using a server that was rigged
> > not to ever close the socket. They appear to work fine on my setup,
> > without the hang that you reported earlier.
> 
> After 8 hours it seems to work both with static mount and with amd. I will
> let you know the state again after a few more days.
> 
> Thank you very much for your help.

Just confirming that the last patch did help and it works well both with
static mount and amd.

Thank you very much for fixing this. Should I do anything more, or can
you propagate the change into vanilla and, if possible, to Greg for stable
so it gets into 2.6.27.x?

--   

  Tomas Kasparek, PhD student  E-mail: kasparek@fit.vutbr.cz
  CVT FIT VUT Brno, L127       Web:    http://www.fit.vutbr.cz/~kasparek
  Bozetechova 1, 612 66        Fax:    +420 54114-1270
  Brno, Czech Republic         Phone:  +420 54114-1220

  jabber: tomas.kasparek-2ASvDZBniIelVyrhU4qvOw@public.gmane.org
  GPG: 2F1E 1AAF FD3B CFA3 1537  63BD DCBE 18FF A035 53BC


^ permalink raw reply	[flat|nested] 131+ messages in thread

* Re: [PATCH 0/3] NFS regression in 2.6.26?, "task blocked for more than 120 seconds"
  2009-01-16 10:48                                                                 ` Kasparek Tomas
@ 2009-01-18 13:08                                                                   ` Kasparek Tomas
  2009-01-20 15:03                                                                     ` Kasparek Tomas
  0 siblings, 1 reply; 131+ messages in thread
From: Kasparek Tomas @ 2009-01-18 13:08 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: linux-nfs

On Fri, Jan 16, 2009 at 11:48:02AM +0100, Kasparek Tomas wrote:
> On Tue, Jan 13, 2009 at 04:22:01PM +0100, Kasparek Tomas wrote:
> > On Mon, Jan 12, 2009 at 08:17:26PM -0500, Trond Myklebust wrote:
> > > On Mon, 2009-01-12 at 12:40 -0500, Trond Myklebust wrote:
> > > > On Mon, 2009-01-12 at 10:04 +0100, Kasparek Tomas wrote:
> > > > > Ok, I found that already. With static mount the behaviour is the same as
> > > > > with amd - no new connection is created and the client waits forever
> > > > > (~tens of hours at least).
> > > >  
> > > > OK. I now appear to be able to reproduce this problem. I should have a
> > > > fix ready soon.
> > > 
> > > The attached 2 patches have been tested using a server that was rigged
> > > not to ever close the socket. They appear to work fine on my setup,
> > > without the hang that you reported earlier.
> > 
> > After 8 hours it seems to work both with static mount and with amd. I will
> > let you know the state again after a few more days.
> > 
> > Thank you very much for your help.
> 
> Just confirming that the last patch did help and it works well both with
> static mount and amd.
> 
> Thank you very much for fixing this. Should I do anything more, or can
> you propagate the change into vanilla and, if possible, to Greg for stable
> so it gets into 2.6.27.x?

Hi Trond, for now please do not push your patches to mainline; I am having
some big trouble with my machines, and it is starting to look like the new
kernel may be the cause.

It seems that machines with this new kernel (tried on 10 other machines
and the original client) may after a few days get into a state where they
generate huge amounts (10000-100000 pkt/s) of packets at another server they
use (Linux 2.6.26.62, but the same behaviour with the other kernels I tried -
2.6.24.7, 2.6.22.19, 2.6.27.10). It seems the packets are quite small, as the
flow on the server is about 5-10 MB/s, and (probably) each packet generates
an answer. With this flow it is hard to get more info, and the server is a
production one, so for now I only know it comes from these clients and ends
on TCP port 2049 on that server. It kills just this server; communication
with the previously problematic FreeBSD machines is fine now.

Will try to investigate more details.

Thanks.

--   

  Tomas Kasparek, PhD student  E-mail: kasparek@fit.vutbr.cz
  CVT FIT VUT Brno, L127       Web:    http://www.fit.vutbr.cz/~kasparek
  Bozetechova 1, 612 66        Fax:    +420 54114-1270
  Brno, Czech Republic         Phone:  +420 54114-1220

  jabber: tomas.kasparek@jabber.cz
  GPG: 2F1E 1AAF FD3B CFA3 1537  63BD DCBE 18FF A035 53BC


^ permalink raw reply	[flat|nested] 131+ messages in thread

* Re: [PATCH 0/3] NFS regression in 2.6.26?, "task blocked for more than 120 seconds"
  2009-01-18 13:08                                                                   ` Kasparek Tomas
@ 2009-01-20 15:03                                                                     ` Kasparek Tomas
  2009-01-20 15:32                                                                       ` Trond Myklebust
  0 siblings, 1 reply; 131+ messages in thread
From: Kasparek Tomas @ 2009-01-20 15:03 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: linux-nfs

On Sun, Jan 18, 2009 at 02:08:35PM +0100, Kasparek Tomas wrote:
> > > > The attached 2 patches have been tested using a server that was rigged
> > > > not to ever close the socket. They appear to work fine on my setup,
> > > > without the hang that you reported earlier.
> ...
> It seems that machines with this new kernel (tried on 10 other machines
> and the original client) may after a few days get into a state where they
> generate huge amounts (10000-100000 pkt/s) of packets at another server they
> use (Linux 2.6.26.62, but the same behaviour with the other kernels I tried -
> 2.6.24.7, 2.6.22.19, 2.6.27.10). It seems the packets are quite small, as the
> flow on the server is about 5-10 MB/s, and (probably) each packet generates
> an answer. With this flow it is hard to get more info, and the server is a
> production one, so for now I only know it comes from these clients and ends
> on TCP port 2049 on that server. It kills just this server; communication
> with the previously problematic FreeBSD machines is fine now.

Hi all,

Confirming that the problem is with machines running 2.6.27.10 + Trond's
patches. I do not have more info about what is on the network; the only new
thing I can add is that the client is dead, not reacting even to the keyboard
or anything else. Trond, would you have an idea of what to try now, or what
other information to gather to get any further with this?

The clients are different machines - Intel and AMD with different boards,
NICs, etc. The server is dead too, but it may recover after some time - I
cannot afford to leave it down for longer to see what happens.

Thanks so far.

--   

  Tomas Kasparek, PhD student  E-mail: kasparek@fit.vutbr.cz
  CVT FIT VUT Brno, L127       Web:    http://www.fit.vutbr.cz/~kasparek
  Bozetechova 1, 612 66        Fax:    +420 54114-1270
  Brno, Czech Republic         Phone:  +420 54114-1220

  jabber: tomas.kasparek-2ASvDZBniIelVyrhU4qvOw@public.gmane.org
  GPG: 2F1E 1AAF FD3B CFA3 1537  63BD DCBE 18FF A035 53BC


^ permalink raw reply	[flat|nested] 131+ messages in thread

* Re: [PATCH 0/3] NFS regression in 2.6.26?, "task blocked for more than 120 seconds"
  2009-01-20 15:03                                                                     ` Kasparek Tomas
@ 2009-01-20 15:32                                                                       ` Trond Myklebust
       [not found]                                                                         ` <1232465547.7055.3.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
  2009-03-03 12:08                                                                         ` Kasparek Tomas
  0 siblings, 2 replies; 131+ messages in thread
From: Trond Myklebust @ 2009-01-20 15:32 UTC (permalink / raw)
  To: Kasparek Tomas; +Cc: linux-nfs

On Tue, 2009-01-20 at 16:03 +0100, Kasparek Tomas wrote:
> On Sun, Jan 18, 2009 at 02:08:35PM +0100, Kasparek Tomas wrote:
> > > > > The attached 2 patches have been tested using a server that was rigged
> > > > > not to ever close the socket. They appear to work fine on my setup,
> > > > > without the hang that you reported earlier.
> > ...
> > It seems that machines with this new kernel (tried on 10 other machines
> > and the original client) may after a few days get into a state where they
> > generate huge amounts (10000-100000 pkt/s) of packets at another server they
> > use (Linux 2.6.26.62, but the same behaviour with the other kernels I tried -
> > 2.6.24.7, 2.6.22.19, 2.6.27.10). It seems the packets are quite small, as the
> > flow on the server is about 5-10 MB/s, and (probably) each packet generates
> > an answer. With this flow it is hard to get more info, and the server is a
> > production one, so for now I only know it comes from these clients and ends
> > on TCP port 2049 on that server. It kills just this server; communication
> > with the previously problematic FreeBSD machines is fine now.
> 
> Hi all,
> 
> Confirming that the problem is with machines running 2.6.27.10 + Trond's
> patches. I do not have more info about what is on the network; the only new
> thing I can add is that the client is dead, not reacting even to the keyboard
> or anything else. Trond, would you have an idea of what to try now, or what
> other information to gather to get any further with this?

A binary wireshark dump of the traffic between one such client and the
server would help.
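
One way such a dump could be taken (a sketch, not something posted in the
thread: the interface name, client address, and output path are placeholders,
and the block only prints the command rather than running it, since a real
capture needs root on the server):

```shell
# Placeholders: adjust the interface and client address for your setup.
iface=eth0
client=192.0.2.10

# -s 0 captures whole packets; -w writes binary libpcap output that
# wireshark can open. Restricting to TCP port 2049 keeps only NFS traffic.
cmd="tcpdump -i $iface -s 0 -w /tmp/nfs-client.pcap host $client and tcp port 2049"
echo "$cmd"
```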

Cheers
  Trond
-- 
Trond Myklebust
Linux NFS client maintainer

NetApp
Trond.Myklebust@netapp.com
www.netapp.com

^ permalink raw reply	[flat|nested] 131+ messages in thread

* Re: [PATCH 0/3] NFS regression in 2.6.26?, "task blocked for more than 120 seconds"
  2009-01-08 21:26                                       ` J. Bruce Fields
@ 2009-01-22  8:27                                         ` Ian Campbell
  -1 siblings, 0 replies; 131+ messages in thread
From: Ian Campbell @ 2009-01-22  8:27 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: Trond Myklebust, linux-nfs, Max Kellermann, linux-kernel, gcosta,
	Grant Coady, Tom Tucker

[-- Attachment #1: Type: text/plain, Size: 1140 bytes --]

On Thu, 2009-01-08 at 16:26 -0500, J. Bruce Fields wrote:
> 
> > > (Merged now, so testing mainline as of today should work too.)
> > 
> > The server isn't really a machine I want to test random kernels on,
> is
> > there some subset of those changesets which it would be useful for
> me to
> > pull back onto the 2.6.26 kernel I'm using to test? (I can most like
> > manage the backporting myself).
> > 
> > These two look like the relevant ones to me but I'm not sure:
> > 22945e4a1c7454c97f5d8aee1ef526c83fef3223 svc: Clean up deferred
> requests on transport destruction
> > 69b6ba3712b796a66595cfaf0a5ab4dfe1cf964a SUNRPC: Ensure the server
> closes sockets in a timely fashion
> > 
> > I think 69b6 was in the set of three I tested previously and the
> other
> > two turned into 2294?
> 
> Yep, exactly.--b.

The client machine now has an uptime of ten days without error after
these two patches were applied to the server.

Thanks everybody,
Ian.

-- 
Ian Campbell

I used to think that the brain was the most wonderful organ in
my body.  Then I realized who was telling me this.
		-- Emo Phillips

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 197 bytes --]

^ permalink raw reply	[flat|nested] 131+ messages in thread

* Re: [PATCH 0/3] NFS regression in 2.6.26?, "task blocked for more than 120 seconds"
@ 2009-01-22 16:44                                           ` J. Bruce Fields
  0 siblings, 0 replies; 131+ messages in thread
From: J. Bruce Fields @ 2009-01-22 16:44 UTC (permalink / raw)
  To: Ian Campbell
  Cc: Trond Myklebust, linux-nfs, Max Kellermann, linux-kernel, gcosta,
	Grant Coady, Tom Tucker

On Thu, Jan 22, 2009 at 08:27:40AM +0000, Ian Campbell wrote:
> On Thu, 2009-01-08 at 16:26 -0500, J. Bruce Fields wrote:
> > 
> > > > (Merged now, so testing mainline as of today should work too.)
> > > 
> > > The server isn't really a machine I want to test random kernels on,
> > is
> > > there some subset of those changesets which it would be useful for
> > me to
> > > pull back onto the 2.6.26 kernel I'm using to test? (I can most like
> > > manage the backporting myself).
> > > 
> > > These two look like the relevant ones to me but I'm not sure:
> > > 22945e4a1c7454c97f5d8aee1ef526c83fef3223 svc: Clean up deferred
> > requests on transport destruction
> > > 69b6ba3712b796a66595cfaf0a5ab4dfe1cf964a SUNRPC: Ensure the server
> > closes sockets in a timely fashion
> > > 
> > > I think 69b6 was in the set of three I tested previously and the
> > other
> > > two turned into 2294?
> > 
> > Yep, exactly.--b.
> 
> The client machine now has an uptime of ten days without error after
> these two patches were applied to the server.
> 
> Thanks everybody,

Very good, so upstream should be OK.  Thanks for the testing!

--b.

^ permalink raw reply	[flat|nested] 131+ messages in thread

* Re: [PATCH 0/3] NFS regression in 2.6.26?, "task blocked for more than 120 seconds"
       [not found]                                                                         ` <1232465547.7055.3.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
@ 2009-01-28  8:18                                                                           ` Kasparek Tomas
  2009-02-06  6:35                                                                             ` Kasparek Tomas
  0 siblings, 1 reply; 131+ messages in thread
From: Kasparek Tomas @ 2009-01-28  8:18 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: linux-nfs

On Tue, Jan 20, 2009 at 10:32:27AM -0500, Trond Myklebust wrote:
> On Tue, 2009-01-20 at 16:03 +0100, Kasparek Tomas wrote:
> > On Sun, Jan 18, 2009 at 02:08:35PM +0100, Kasparek Tomas wrote:
> > > > > > The attached 2 patches have been tested using a server that was rigged
> > > > > > not to ever close the socket. They appear to work fine on my setup,
> > > > > > without the hang that you reported earlier.
> > > ...
> > > It seems that machines with this new kernel (tried on 10 other machines
> > > and the original client) may after a few days get into a state where they
> > > generate huge amounts (10000-100000 pkt/s) of packets at another server they
> > > use (Linux 2.6.26.62, but the same behaviour with the other kernels I tried -
> > > 2.6.24.7, 2.6.22.19, 2.6.27.10). It seems the packets are quite small, as the
> > > flow on the server is about 5-10 MB/s, and (probably) each packet generates
> > > an answer. With this flow it is hard to get more info, and the server is a
> > > production one, so for now I only know it comes from these clients and ends
> > > on TCP port 2049 on that server. It kills just this server; communication
> > > with the previously problematic FreeBSD machines is fine now.
> > 
> > patches. I do not have more info about what is on the network; the only new
> > thing I can add is that the client is dead, not reacting even to the keyboard or
> > anything else. Trond, would you have an idea of what to try now, or what other
> 
> A binary wireshark dump of the traffic between one such client and the
> server would help.

I tried to get some data several times, but the client is dead and the
server is overloaded so much that I'm unable to get anything reasonable. I
did try to insert another machine in front of the client as a bridge, but
the traffic overloaded it the same way as the server. I will try to figure
out how to get some traffic dump, but have no other ideas for now.

Bye

--   

  Tomas Kasparek, PhD student  E-mail: kasparek@fit.vutbr.cz
  CVT FIT VUT Brno, L127       Web:    http://www.fit.vutbr.cz/~kasparek
  Bozetechova 1, 612 66        Fax:    +420 54114-1270
  Brno, Czech Republic         Phone:  +420 54114-1220

  jabber: tomas.kasparek-2ASvDZBniIelVyrhU4qvOw@public.gmane.org
  GPG: 2F1E 1AAF FD3B CFA3 1537  63BD DCBE 18FF A035 53BC


^ permalink raw reply	[flat|nested] 131+ messages in thread

* Re: [PATCH 0/3] NFS regression in 2.6.26?, "task blocked for more than 120 seconds"
  2009-01-28  8:18                                                                           ` Kasparek Tomas
@ 2009-02-06  6:35                                                                             ` Kasparek Tomas
  2009-02-10  7:55                                                                               ` Kasparek Tomas
  0 siblings, 1 reply; 131+ messages in thread
From: Kasparek Tomas @ 2009-02-06  6:35 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: linux-nfs

On Wed, Jan 28, 2009 at 09:18:52AM +0100, Kasparek Tomas wrote:
> On Tue, Jan 20, 2009 at 10:32:27AM -0500, Trond Myklebust wrote:
> > On Tue, 2009-01-20 at 16:03 +0100, Kasparek Tomas wrote:
> > > On Sun, Jan 18, 2009 at 02:08:35PM +0100, Kasparek Tomas wrote:
> > > > > > > The attached 2 patches have been tested using a server that was rigged
> > > > > > > not to ever close the socket. They appear to work fine on my setup,
> > > > > > > without the hang that you reported earlier.
> > > > ...
> > > > It seems that machines with this new kernel (tried on 10 other machines
> > > > and the original client) may after a few days get into a state where they
> > > > generate huge amounts (10000-100000 pkt/s) of packets at another server they
> > > > use (Linux 2.6.26.62, but the same behaviour with the other kernels I tried -
> > > > 2.6.24.7, 2.6.22.19, 2.6.27.10). It seems the packets are quite small, as the
> > > > flow on the server is about 5-10 MB/s, and (probably) each packet generates
> > > > an answer. With this flow it is hard to get more info, and the server is a
> > > > production one, so for now I only know it comes from these clients and ends
> > > > on TCP port 2049 on that server. It kills just this server; communication
> > > > with the previously problematic FreeBSD machines is fine now.
> > > 
> > > patches. I do not have more info about what is on the network; the only new
> > > thing I can add is that the client is dead, not reacting even to the keyboard or
> > > anything else. Trond, would you have an idea of what to try now, or what other
> > 
> > A binary wireshark dump of the traffic between one such client and the
> > server would help.
> 
> I tried to get some data several times, but the client is dead and the
> server is overloaded so much that I'm unable to get anything reasonable. I
> did try to insert another machine in front of the client as a bridge, but
> the traffic overloaded it the same way as the server. I will try to figure
> out how to get some traffic dump, but have no other ideas for now.

Hi,

As another try, I upgraded from 2.6.27.10 to 2.6.27.13 (and .14), and it
looks like the problem disappeared. Right now I'm running 5 clients with .13
or .14 and tcpdumps for 3 days without any problem. I will try to stop the
tcpdumps, as they can potentially influence the behaviour, and will confirm
the state next week.

Thank you very much for all the work you and the others from linux-nfs are
doing!
(I'm going to test pNFS shortly, so if I can be helpful in any way, let me
know.)

Bye

--   

  Tomas Kasparek, PhD student  E-mail: kasparek@fit.vutbr.cz
  CVT FIT VUT Brno, L127       Web:    http://www.fit.vutbr.cz/~kasparek
  Bozetechova 1, 612 66        Fax:    +420 54114-1270
  Brno, Czech Republic         Phone:  +420 54114-1220

  jabber: tomas.kasparek-2ASvDZBniIelVyrhU4qvOw@public.gmane.org
  GPG: 2F1E 1AAF FD3B CFA3 1537  63BD DCBE 18FF A035 53BC


^ permalink raw reply	[flat|nested] 131+ messages in thread

* Re: [PATCH 0/3] NFS regression in 2.6.26?, "task blocked for more than 120 seconds"
  2009-02-06  6:35                                                                             ` Kasparek Tomas
@ 2009-02-10  7:55                                                                               ` Kasparek Tomas
  0 siblings, 0 replies; 131+ messages in thread
From: Kasparek Tomas @ 2009-02-10  7:55 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: linux-nfs

On Fri, Feb 06, 2009 at 07:35:13AM +0100, Kasparek Tomas wrote:
> > > A binary wireshark dump of the traffic between one such client and the
> > > server would help.
> > 
> > I tried to get some data several times, but the client is dead and the
> > server is overloaded so much that I'm unable to get anything reasonable. I
> > did try to insert another machine in front of the client as a bridge, but
> > the traffic overloaded it the same way as the server. I will try to figure
> > out how to get some traffic dump, but have no other ideas for now.

> As another try, I upgraded from 2.6.27.10 to 2.6.27.13 (and .14), and it
> looks like the problem disappeared. Right now I'm running 5 clients with .13
> or .14 and tcpdumps for 3 days without any problem. I will try to stop the
> tcpdumps, as they can potentially influence the behaviour, and will confirm
> the state next week.

After 6 days all machines except the first client I used are fine and have
no problems. Based on this I would conclude that:

- your patch fixes the problem I had
- there may be something wrong in <2.6.27.13, but it's ok in .13+
- I finally have some tcpdumps from the server concerning the first
  problematic client; I will try to extract the interesting packets and send
  them here if you or someone else can find anything helpful there. With .14
  the client runs much better anyway, staying alive for 3 days instead of
  6-10 hours as with .10

Thank you for your support.

--   

  Tomas Kasparek, PhD student  E-mail: kasparek@fit.vutbr.cz
  CVT FIT VUT Brno, L127       Web:    http://www.fit.vutbr.cz/~kasparek
  Bozetechova 1, 612 66        Fax:    +420 54114-1270
  Brno, Czech Republic         Phone:  +420 54114-1220

  jabber: tomas.kasparek-2ASvDZBniIelVyrhU4qvOw@public.gmane.org
  GPG: 2F1E 1AAF FD3B CFA3 1537  63BD DCBE 18FF A035 53BC


^ permalink raw reply	[flat|nested] 131+ messages in thread

* Re: [PATCH 0/3] NFS regression in 2.6.26?, "task blocked for more than 120 seconds"
  2009-01-20 15:32                                                                       ` Trond Myklebust
       [not found]                                                                         ` <1232465547.7055.3.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
@ 2009-03-03 12:08                                                                         ` Kasparek Tomas
  2009-03-03 14:16                                                                           ` Trond Myklebust
  1 sibling, 1 reply; 131+ messages in thread
From: Kasparek Tomas @ 2009-03-03 12:08 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: linux-nfs

On Tue, Jan 20, 2009 at 10:32:27AM -0500, Trond Myklebust wrote:
> > > > > > The attached 2 patches have been tested using a server that was rigged
> > > > > > not to ever close the socket. They appear to work fine on my setup,
> > > > > > without the hang that you reported earlier.
> > > ...
> > > It seems that machines with this new kernel (tried on 10 other machines
> > > and the original client) may after few days get into state where they
> > > generate huge amounts (10000-100000pkt/s) of packets on another server they
> > > use (Linux 2.6.26.62, but the same behaviour with other kernels I tried -
> > > 2.6.24.7, 2.6.22.19, 2.6.27.10). It seems packets are quiet small as the
> > > flow on server is about 5-10MB/s. (probably) Each packet generates an answer.
> > > With this flow it is hard to get more info and the server is production
> > > one, so for now I only know it goes from these clients and end on tcp port
> > > 2049 on that server. It kills just this server, communication with the
> > > previously problematic (FreeBSD machines) is fine now.

> > configrming that the problem is with machines with 2.6.27.10+trond's
> > patches. Do not have more info about what's there on network, the only new
> > thing I can add is that the client is dead not reacting even on keyborad or
> > anything else. Trond, would you have and idea what to try now or what other
> > information to find to get any further in this?
> 
> A binary wireshark dump of the traffic between one such client and the
> server would help.

I was finally able to get the tcpdump. I got it from a 2.6.27.19 client, but
only after several weeks without problems. I include the file and place it on
http://merlin.fit.vutbr.cz/tmp/nfs/dump_kas2_mat.dump_small (I have over 1GB
of dump, but it's the same SYN+RST packets all the time). The packet rate
maxed at 260000 pps from two clients.

This dump is taken from the server after a reset (the server does not respond
even to the keyboard), before the clients are disconnected/rebooted. As a
reminder - all clients seem to work well with
e06799f958bf7f9f8fae15f0c6f519953fb0257c reverted

(just as a hint; I do not mean the patch is wrong)
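
A dump like this can be checked offline by filtering it for connection
setup/reset segments (a sketch, not from the thread; the block only prints
the command, since running it needs the dump file and tcpdump installed):

```shell
# Read an existing binary dump and keep only SYN or RST segments on the
# NFS port -- if the storm is a connect/reset loop, almost every packet
# should match. The file name is taken from the posted URL.
filter='tcp port 2049 and (tcp[tcpflags] & (tcp-syn|tcp-rst)) != 0'
echo "tcpdump -n -r dump_kas2_mat.dump_small '$filter'"
```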

Thanks in advance

--   

  Tomas Kasparek, PhD student  E-mail: kasparek@fit.vutbr.cz
  CVT FIT VUT Brno, L127       Web:    http://www.fit.vutbr.cz/~kasparek
  Bozetechova 1, 612 66        Fax:    +420 54114-1270
  Brno, Czech Republic         Phone:  +420 54114-1220

  jabber: tomas.kasparek@jabber.cz
  GPG: 2F1E 1AAF FD3B CFA3 1537  63BD DCBE 18FF A035 53BC


^ permalink raw reply	[flat|nested] 131+ messages in thread

* Re: [PATCH 0/3] NFS regression in 2.6.26?, "task blocked for more than 120 seconds"
  2009-03-03 12:08                                                                         ` Kasparek Tomas
@ 2009-03-03 14:16                                                                           ` Trond Myklebust
       [not found]                                                                             ` <1236089767.9631.4.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
  0 siblings, 1 reply; 131+ messages in thread
From: Trond Myklebust @ 2009-03-03 14:16 UTC (permalink / raw)
  To: Kasparek Tomas; +Cc: linux-nfs

On Tue, 2009-03-03 at 13:08 +0100, Kasparek Tomas wrote:
> On Tue, Jan 20, 2009 at 10:32:27AM -0500, Trond Myklebust wrote:
> > A binary wireshark dump of the traffic between one such client and the
> > server would help.
> 
> I was able to finally got the tcpdump. I got it from 2.6.27.19 client but
> after several weeks without problems. I include the file and place it on
> http://merlin.fit.vutbr.cz/tmp/nfs/dump_kas2_mat.dump_small (have over 1GB
> of dump, but it's all the time the same SYN+RST packets). The packet rate
> maxed at 260000pps from two clients.
> 
> This dump is taken from server after reset (the server does not respond
> even to keybord) before clients are disconnected/rebooted. To remind it - all
> clients seems to work well with reversed
> e06799f958bf7f9f8fae15f0c6f519953fb0257c

Yes. I saw that behaviour when testing at Connectathon last week. When
one of the servers I was testing against crashed and later came up
again, the patched client went into that same SYN+RST frenzy. I'm
planning to look at this now that I'm back at home.
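
One way to avoid such a frenzy is capped exponential backoff between reconnect
attempts; a hypothetical sketch of that behaviour in Python (illustration only,
not the actual sunrpc reconnect code, and all names are made up):

```python
# Illustrative only: a client that retries with growing, capped delays
# cannot degenerate into a SYN storm against a just-rebooted server.

def reconnect_delays(initial=3.0, maximum=300.0, attempts=8):
    """Yield exponentially growing reconnect delays, capped at `maximum`."""
    delay = initial
    for _ in range(attempts):
        yield delay
        delay = min(delay * 2, maximum)

print(list(reconnect_delays()))
# → [3.0, 6.0, 12.0, 24.0, 48.0, 96.0, 192.0, 300.0]
```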

Cheers
  Trond

-- 
Trond Myklebust
Linux NFS client maintainer

NetApp
Trond.Myklebust@netapp.com
www.netapp.com

^ permalink raw reply	[flat|nested] 131+ messages in thread

* Re: [PATCH 0/3] NFS regression in 2.6.26?, "task blocked for more than 120 seconds"
       [not found]                                                                             ` <1236089767.9631.4.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
@ 2009-03-25  8:46                                                                               ` Kasparek Tomas
  2009-04-18  5:17                                                                               ` Kasparek Tomas
  1 sibling, 0 replies; 131+ messages in thread
From: Kasparek Tomas @ 2009-03-25  8:46 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: linux-nfs

On Tue, Mar 03, 2009 at 09:16:07AM -0500, Trond Myklebust wrote:
> On Tue, 2009-03-03 at 13:08 +0100, Kasparek Tomas wrote:
> > On Tue, Jan 20, 2009 at 10:32:27AM -0500, Trond Myklebust wrote:
> > > A binary wireshark dump of the traffic between one such client and the
> > > server would help.
> > 
> > I was able to finally got the tcpdump. I got it from 2.6.27.19 client but
> > after several weeks without problems. I include the file and place it on
> > http://merlin.fit.vutbr.cz/tmp/nfs/dump_kas2_mat.dump_small (have over 1GB
> > of dump, but it's all the time the same SYN+RST packets). The packet rate
> > maxed at 260000pps from two clients.
> > 
> > This dump is taken from server after reset (the server does not respond
> > even to keybord) before clients are disconnected/rebooted. To remind it - all
> > clients seems to work well with reversed
> > e06799f958bf7f9f8fae15f0c6f519953fb0257c
> 
> Yes. I saw that behaviour when testing at Connectathon last week. When
> one of the servers I was testing against crashed and later came up
> again, the patched client went into that same SYN+RST frenzy. I'm
> planning to look at this now that I'm back at home.

Hi Trond, is there something new about this? Having this solved will enable
me to test 2.6.27.x on more machines and hopefully solve the last lockd
problem.

Thank you in advance

--   

  Tomas Kasparek, PhD student  E-mail: kasparek@fit.vutbr.cz
  CVT FIT VUT Brno, L127       Web:    http://www.fit.vutbr.cz/~kasparek
  Bozetechova 1, 612 66        Fax:    +420 54114-1270
  Brno, Czech Republic         Phone:  +420 54114-1220

  jabber: tomas.kasparek-2ASvDZBniIelVyrhU4qvOw@public.gmane.org
  GPG: 2F1E 1AAF FD3B CFA3 1537  63BD DCBE 18FF A035 53BC


^ permalink raw reply	[flat|nested] 131+ messages in thread

* Re: [PATCH 0/3] NFS regression in 2.6.26?, "task blocked for more than 120 seconds"
       [not found]                                                                             ` <1236089767.9631.4.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
  2009-03-25  8:46                                                                               ` Kasparek Tomas
@ 2009-04-18  5:17                                                                               ` Kasparek Tomas
  2009-04-22 17:27                                                                                   ` Kasparek Tomas
  1 sibling, 1 reply; 131+ messages in thread
From: Kasparek Tomas @ 2009-04-18  5:17 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: linux-nfs

On Tue, Mar 03, 2009 at 09:16:07AM -0500, Trond Myklebust wrote:
> On Tue, 2009-03-03 at 13:08 +0100, Kasparek Tomas wrote:
> > On Tue, Jan 20, 2009 at 10:32:27AM -0500, Trond Myklebust wrote:
> > > A binary wireshark dump of the traffic between one such client and the
> > > server would help.
> > 
> > I was able to finally got the tcpdump. I got it from 2.6.27.19 client but
> > after several weeks without problems. I include the file and place it on
> > http://merlin.fit.vutbr.cz/tmp/nfs/dump_kas2_mat.dump_small (have over 1GB
> > of dump, but it's all the time the same SYN+RST packets). The packet rate
> > maxed at 260000pps from two clients.
> > 
> > This dump is taken from server after reset (the server does not respond
> > even to keybord) before clients are disconnected/rebooted. To remind it - all
> > clients seems to work well with reversed
> > e06799f958bf7f9f8fae15f0c6f519953fb0257c
> 
> Yes. I saw that behaviour when testing at Connectathon last week. When
> one of the servers I was testing against crashed and later came up
> again, the patched client went into that same SYN+RST frenzy. I'm
> planning to look at this now that I'm back at home.

Hi, got a bit more data today, as I got to the client early, before it became
unresponsive.

: BUG: soft lockup - CPU#5 stuck for 61s!  [rpciod/5:2730]
: Modules linked in: nfsd auth_rpcgss exportfs i2c_dev i2c_core nfs lockd
: nfs_acl sunrpc ipv6 xfs dm_mirror dm_log dm_mod pci_slot fan snd_hda_intel
: snd_seq_dummy thermal snd_seq_oss snd_seq_midi_event snd_seq processor igb
: 8250_pnp sg firewire_ohci firewire_core crc_itu_t thermal_sys snd_seq_device
: snd_pcm_oss snd_mixer_oss evdev snd_pcm hwmon 3w_9xxx inet_lro button
: snd_timer sr_mod cdrom 8250 serial_core rtc_cmos rtc_core rtc_lib ehci_hcd
: uhci_hcd snd soundcore snd_page_alloc usbcore
: 
: Pid: 2730, comm: rpciod/5 Not tainted (2.6.27.21 #1)
: EIP: 0060:[<c02972c8>] EFLAGS: 00000202 CPU: 5
: EIP is at tcp_connect+0x213/0x2e6
: EAX: c55cf700 EBX: f67b7b40 ECX: 00000002 EDX: ed451d8c
: ESI: c5409380 EDI: 00000000 EBP: 00000001 ESP: f67cfe78
:  DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
: CR0: 8005003b CR2: b7f7b9c8 CR3: 003b5000 CR4: 000006d0
: DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
: DR6: ffff0ff0 DR7: 00000400 
:  [<c0299c5d>] ? tcp_v4_connect+0x3b2/0x40a
:  [<c02a312e>] ?  inet_stream_connect+0x87/0x20b
:  [<f8b41a88>] ?  rpc_wake_up_status+0x33/0x57 [sunrpc]
:  [<c026417a>] ? kernel_connect+0xb/0xe
:  [<f8b401b3>] ?  xs_tcp_finish_connecting+0xe4/0xea [sunrpc]
:  [<f8b41145>] ?  xs_tcp_connect_worker4+0x0/0x15a [sunrpc]
:  [<f8b41221>] ?  xs_tcp_connect_worker4+0xdc/0x15a [sunrpc]
:  [<c012854e>] ? run_workqueue+0x6a/0xe1
:  [<c0128c77>] ? worker_thread+0x0/0x8a
:  [<c0128cf6>] ? worker_thread+0x7f/0x8a
:  [<c012aeac>] ?  autoremove_wake_function+0x0/0x2b
:  [<c0128c77>] ? worker_thread+0x0/0x8a
:  [<c012ade8>] ? kthread+0x38/0x60
:  [<c012adb0>] ? kthread+0x0/0x60
:  [<c010371b>] ?  kernel_thread_helper+0x7/0x10
:  =======================

The lockup may be because I disconnected the cable from that client to stop
the packet storm, but the backtrace may still be useful.
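
For context, both messages seen in this thread come from tunable kernel
watchdogs; on kernels of this era the relevant knobs look roughly like this
(paths and names may differ by kernel version):

```shell
# Hung-task watchdog: produces the "blocked for more than 120 seconds"
# messages from the original report (setting it to 0 disables the check).
cat /proc/sys/kernel/hung_task_timeout_secs

# Soft-lockup watchdog: produces the "CPU#5 stuck for 61s" message above.
cat /proc/sys/kernel/softlockup_thresh
```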

Is there anything else I can do, that will help with this problem?

Thanks in advance

--   

  Tomas Kasparek, PhD student  E-mail: kasparek@fit.vutbr.cz
  CVT FIT VUT Brno, L127       Web:    http://www.fit.vutbr.cz/~kasparek
  Bozetechova 1, 612 66        Fax:    +420 54114-1270
  Brno, Czech Republic         Phone:  +420 54114-1220

  jabber: tomas.kasparek-2ASvDZBniIelVyrhU4qvOw@public.gmane.org
  GPG: 2F1E 1AAF FD3B CFA3 1537  63BD DCBE 18FF A035 53BC


^ permalink raw reply	[flat|nested] 131+ messages in thread

* NFS client packet storm on 2.6.27.x
  2009-04-18  5:17                                                                               ` Kasparek Tomas
@ 2009-04-22 17:27                                                                                   ` Kasparek Tomas
  0 siblings, 0 replies; 131+ messages in thread
From: Kasparek Tomas @ 2009-04-22 17:27 UTC (permalink / raw)
  To: linux-nfs, linux-kernel

On Sat, Apr 18, 2009 at 07:17:39AM +0200, Kasparek Tomas wrote:
> On Tue, Mar 03, 2009 at 09:16:07AM -0500, Trond Myklebust wrote:
> > On Tue, 2009-03-03 at 13:08 +0100, Kasparek Tomas wrote:
> > > On Tue, Jan 20, 2009 at 10:32:27AM -0500, Trond Myklebust wrote:
> > > > A binary wireshark dump of the traffic between one such client and the
> > > > server would help.
> > > 
> > > I was able to finally got the tcpdump. I got it from 2.6.27.19 client but
> > > after several weeks without problems. I include the file and place it on
> > > http://merlin.fit.vutbr.cz/tmp/nfs/dump_kas2_mat.dump_small (have over 1GB
> > > of dump, but it's all the time the same SYN+RST packets). The packet rate
> > > maxed at 260000pps from two clients.
> > > 
> > > This dump is taken from server after reset (the server does not respond
> > > even to keybord) before clients are disconnected/rebooted. To remind it - all
> > > clients seems to work well with reversed
> > > e06799f958bf7f9f8fae15f0c6f519953fb0257c
> > 
> > Yes. I saw that behaviour when testing at Connectathon last week. When
> > one of the servers I was testing against crashed and later came up
> > again, the patched client went into that same SYN+RST frenzy. I'm
> > planning to look at this now that I'm back at home.
> 
> Hi, got a bit more data today as I get to the client early before it become
> unresponsible. 
> 
> 
> The lockup may be becouse I disconnected the cable from that client to stop
> the packet storm, but still the backtrace may be usefull.
> 
> Is there anything else I can do, that will help with this problem?

Hi,

(I changed the subject to be more descriptive of the current problem)

I got another client lockup today. It was a desktop, so I have some more
dmesg warnings about soft lockups, probably caused by the network cable
unplug (but hopefully still showing what happens in rpciod), at

http://merlin.fit.vutbr.cz/tmp/nfs/pckas-dmesg

I could see with top that rpciod was using 100% CPU. I limited the flow
from the client to the server with a firewall, so I was able to save the server and
get some tcpdump -s0 data (actually RPC NULL calls with an ERR response from the server).

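The firewall throttle mentioned above can be sketched with a plain rate limit
on new connections (a hypothetical example; the numbers, chain, and exact rules
are assumptions, not the actual firewall used):

```shell
# Let at most 100 new NFS connections per second through to nfsd,
# and drop the excess so a storming client cannot saturate the server.
iptables -A INPUT -p tcp --dport 2049 --syn \
  -m limit --limit 100/second --limit-burst 200 -j ACCEPT
iptables -A INPUT -p tcp --dport 2049 --syn -j DROP
```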
Just as a reminder, the client is 2.6.27.21 (i386) and the server is 2.6.16.62
(x86_64).

Please let me know if I can do anything more; this is really painful for
me.

--   

  Tomas Kasparek, PhD student  E-mail: kasparek@fit.vutbr.cz
  CVT FIT VUT Brno, L127       Web:    http://www.fit.vutbr.cz/~kasparek
  Bozetechova 1, 612 66        Fax:    +420 54114-1270
  Brno, Czech Republic         Phone:  +420 54114-1220

  jabber: tomas.kasparek@jabber.cz
  GPG: 2F1E 1AAF FD3B CFA3 1537  63BD DCBE 18FF A035 53BC


^ permalink raw reply	[flat|nested] 131+ messages in thread


* Re: NFS client packet storm on 2.6.27.x
  2009-04-22 17:27                                                                                   ` Kasparek Tomas
@ 2009-04-29 12:12                                                                                   ` Steve Dickson
  2009-04-29 14:57                                                                                       ` Kasparek Tomas
  -1 siblings, 1 reply; 131+ messages in thread
From: Steve Dickson @ 2009-04-29 12:12 UTC (permalink / raw)
  To: Kasparek Tomas; +Cc: linux-nfs, linux-kernel



Kasparek Tomas wrote:
> On Sat, Apr 18, 2009 at 07:17:39AM +0200, Kasparek Tomas wrote:
>> On Tue, Mar 03, 2009 at 09:16:07AM -0500, Trond Myklebust wrote:
>>> On Tue, 2009-03-03 at 13:08 +0100, Kasparek Tomas wrote:
>>>> On Tue, Jan 20, 2009 at 10:32:27AM -0500, Trond Myklebust wrote:
>>>>> A binary wireshark dump of the traffic between one such client and the
>>>>> server would help.
>>>> I was able to finally got the tcpdump. I got it from 2.6.27.19 client but
>>>> after several weeks without problems. I include the file and place it on
>>>> http://merlin.fit.vutbr.cz/tmp/nfs/dump_kas2_mat.dump_small (have over 1GB
>>>> of dump, but it's all the time the same SYN+RST packets). The packet rate
>>>> maxed at 260000pps from two clients.
>>>>
>>>> This dump is taken from server after reset (the server does not respond
>>>> even to keybord) before clients are disconnected/rebooted. To remind it - all
>>>> clients seems to work well with reversed
>>>> e06799f958bf7f9f8fae15f0c6f519953fb0257c
>>> Yes. I saw that behaviour when testing at Connectathon last week. When
>>> one of the servers I was testing against crashed and later came up
>>> again, the patched client went into that same SYN+RST frenzy. I'm
>>> planning to look at this now that I'm back at home.
>> Hi, got a bit more data today as I get to the client early before it become
>> unresponsible. 
>>
>>
>> The lockup may be becouse I disconnected the cable from that client to stop
>> the packet storm, but still the backtrace may be usefull.
>>
>> Is there anything else I can do, that will help with this problem?
> 
> Hi,
> 
> (I changed the SUBJ to be more descriptive for current problem)
> 
> I got another client lockup today. It was a desktop so I have some more
> dmesg warnings about soft lockup caused probably by network cable unplug
> (but hopefully still showing what happens in rpciod) on
> 
> http://merlin.fit.vutbr.cz/tmp/nfs/pckas-dmesg
> 
> I can check with top, that rpciod was using 100% cpu. I limited the flow
> from client to server with firewall so I was able to save the server and
> get some tcpdump -s0 data (actually RPC null with ERR response from server)
> 
> Just to remind, the client is 2.6.27.21 (i386), the server is 2.6.16.62
> (x86_64).
> 
> Please let me know if I can do anything more, this is really paintfull for
> me.
Try commenting out the tcp6/udp6 entries from /etc/netconfig...
This has helped in other places...
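
For reference, the suggestion amounts to commenting out the IPv6 transport
lines in /etc/netconfig; a sketch of what that looks like (exact file contents
vary by distribution):

```
udp        tpi_clts      v     inet     udp     -       -
tcp        tpi_cots_ord  v     inet     tcp     -       -
#udp6      tpi_clts      v     inet6    udp     -       -
#tcp6      tpi_cots_ord  v     inet6    tcp     -       -
```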

steved.

^ permalink raw reply	[flat|nested] 131+ messages in thread

* Re: NFS client packet storm on 2.6.27.x
@ 2009-04-29 14:57                                                                                       ` Kasparek Tomas
  0 siblings, 0 replies; 131+ messages in thread
From: Kasparek Tomas @ 2009-04-29 14:57 UTC (permalink / raw)
  To: Steve Dickson; +Cc: linux-nfs, linux-kernel

On Wed, Apr 29, 2009 at 08:12:38AM -0400, Steve Dickson wrote:
> > I got another client lockup today. It was a desktop so I have some more
> > dmesg warnings about soft lockup caused probably by network cable unplug
> > (but hopefully still showing what happens in rpciod) on
> > 
> > http://merlin.fit.vutbr.cz/tmp/nfs/pckas-dmesg
> > 
> > I can check with top, that rpciod was using 100% cpu. I limited the flow
> > from client to server with firewall so I was able to save the server and
> > get some tcpdump -s0 data (actually RPC null with ERR response from server)
> > 
> > Just to remind, the client is 2.6.27.21 (i386), the server is 2.6.16.62
> > (x86_64).
> > 
> > Please let me know if I can do anything more, this is really paintfull for
> > me.
> Try commenting out the tcp6/udp6 entries from /etc/netconfig....
> This has helped in other places...

It's CentOS 5.3, but if you mean disabling IPv6, I cannot do that: the
server is using IPv6 (though not with these clients). And as mentioned before,
it works well without e06799f958bf7f9f8fae15f0c6f519953fb0257c, both for
lockd and the client floods.

Bye

--   

  Tomas Kasparek, PhD student  E-mail: kasparek@fit.vutbr.cz
  CVT FIT VUT Brno, L127       Web:    http://www.fit.vutbr.cz/~kasparek
  Bozetechova 1, 612 66        Fax:    +420 54114-1270
  Brno, Czech Republic         Phone:  +420 54114-1220

  jabber: tomas.kasparek@jabber.cz
  GPG: 2F1E 1AAF FD3B CFA3 1537  63BD DCBE 18FF A035 53BC


^ permalink raw reply	[flat|nested] 131+ messages in thread


* Re: [PATCH] NFS regression in 2.6.26?, "task blocked for more than 120 seconds"
  2008-10-20  6:51   ` Max Kellermann
  2008-10-20  7:43     ` Ian Campbell
  2008-10-20 13:15     ` Glauber Costa
@ 2009-05-22 20:59     ` H. Peter Anvin
  2009-05-25 13:12       ` Max Kellermann
  2 siblings, 1 reply; 131+ messages in thread
From: H. Peter Anvin @ 2009-05-22 20:59 UTC (permalink / raw)
  To: Max Kellermann
  Cc: Glauber Costa, linux-kernel, ijc, Grant Coady, Trond Myklebust,
	J. Bruce Fields, Tom Tucker, Cyrill Gorcunov

Max Kellermann wrote:
> On 2008/10/17 16:33, Glauber Costa <glommer@redhat.com> wrote:
>> That's probably something related to apic congestion.
>> Does the problem go away if the only thing you change is this:
>>
>>
>>> @@ -891,11 +897,6 @@ do_rest:
>>>  		store_NMI_vector(&nmi_high, &nmi_low);
>>>  
>>>  		smpboot_setup_warm_reset_vector(start_ip);
>>> -		/*
>>> -		 * Be paranoid about clearing APIC errors.
>>> -	 	*/
>>> -		apic_write(APIC_ESR, 0);
>>> -		apic_read(APIC_ESR);
>>>  	}
>>
>> Please let me know.
> 
> Hello Glauber,
> 
> I have rebooted the server with 2.6.27.1 + this patchlet an hour ago.
> No problems since.
> 
> Hardware: Compaq P4 Xeon server, Broadcom CMIC-WS / CIOB-X2 board.
> Tell me if you need more detailed information.
> 

I was just pointed to this thread in the archives, and I'm really
wondering what was going on here.  In particular, if this machine still
is around, I would appreciate a dump of /proc/cpuinfo as well as
dmidecode from this machine.

Thanks,

	-hpa

^ permalink raw reply	[flat|nested] 131+ messages in thread

* Re: [PATCH] NFS regression in 2.6.26?, "task blocked for more than 120 seconds"
  2009-05-22 20:59     ` H. Peter Anvin
@ 2009-05-25 13:12       ` Max Kellermann
  0 siblings, 0 replies; 131+ messages in thread
From: Max Kellermann @ 2009-05-25 13:12 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: linux-kernel

[-- Attachment #1: Type: text/plain, Size: 376 bytes --]

On 2009/05/22 22:59, "H. Peter Anvin" <hpa@zytor.com> wrote:
> I was just pointed to this thread in the archives, and I'm really
> wondering what was going on here.  In particular, if this machine still
> is around, I would appreciate a dump of /proc/cpuinfo as well as
> dmidecode from this machine.

Here it is.  We run these machines with 2.6.29; the problem is gone.

Max

[-- Attachment #2: cpuinfo --]
[-- Type: text/plain, Size: 2264 bytes --]

processor	: 0
vendor_id	: GenuineIntel
cpu family	: 15
model		: 2
model name	: Intel(R) Xeon(TM) CPU 3.06GHz
stepping	: 5
cpu MHz		: 3066.590
cache size	: 1024 KB
physical id	: 0
siblings	: 2
core id		: 0
cpu cores	: 1
apicid		: 0
initial apicid	: 0
fdiv_bug	: no
hlt_bug		: no
f00f_bug	: no
coma_bug	: no
fpu		: yes
fpu_exception	: yes
cpuid level	: 2
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe pebs bts cid xtpr
bogomips	: 6105.99
clflush size	: 64
power management:

processor	: 1
vendor_id	: GenuineIntel
cpu family	: 15
model		: 2
model name	: Intel(R) Xeon(TM) CPU 3.06GHz
stepping	: 5
cpu MHz		: 3066.590
cache size	: 1024 KB
physical id	: 3
siblings	: 2
core id		: 0
cpu cores	: 1
apicid		: 6
initial apicid	: 6
fdiv_bug	: no
hlt_bug		: no
f00f_bug	: no
coma_bug	: no
fpu		: yes
fpu_exception	: yes
cpuid level	: 2
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe pebs bts cid xtpr
bogomips	: 6106.57
clflush size	: 64
power management:

processor	: 2
vendor_id	: GenuineIntel
cpu family	: 15
model		: 2
model name	: Intel(R) Xeon(TM) CPU 3.06GHz
stepping	: 5
cpu MHz		: 3066.590
cache size	: 1024 KB
physical id	: 0
siblings	: 2
core id		: 0
cpu cores	: 1
apicid		: 1
initial apicid	: 1
fdiv_bug	: no
hlt_bug		: no
f00f_bug	: no
coma_bug	: no
fpu		: yes
fpu_exception	: yes
cpuid level	: 2
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe pebs bts cid xtpr
bogomips	: 6106.63
clflush size	: 64
power management:

processor	: 3
vendor_id	: GenuineIntel
cpu family	: 15
model		: 2
model name	: Intel(R) Xeon(TM) CPU 3.06GHz
stepping	: 5
cpu MHz		: 3066.590
cache size	: 1024 KB
physical id	: 3
siblings	: 2
core id		: 0
cpu cores	: 1
apicid		: 7
initial apicid	: 7
fdiv_bug	: no
hlt_bug		: no
f00f_bug	: no
coma_bug	: no
fpu		: yes
fpu_exception	: yes
cpuid level	: 2
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe pebs bts cid xtpr
bogomips	: 6106.64
clflush size	: 64
power management:


[-- Attachment #3: dmidecode --]
[-- Type: text/plain, Size: 14551 bytes --]

# dmidecode 2.9
SMBIOS 2.3 present.
49 structures occupying 1334 bytes.
Table at 0x000EC000.

Handle 0x0000, DMI type 0, 20 bytes
BIOS Information
	Vendor: HP
	Version: P29
	Release Date: 07/25/2003
	Address: 0xF0000
	Runtime Size: 64 kB
	ROM Size: 2048 kB
	Characteristics:
		PCI is supported
		PNP is supported
		BIOS is upgradeable
		BIOS shadowing is allowed
		ESCD support is available
		Boot from CD is supported
		Selectable boot is supported
		5.25"/360 KB floppy services are supported (int 13h)
		5.25"/1.2 MB floppy services are supported (int 13h)
		3.5"/720 KB floppy services are supported (int 13h)
		Print screen service is supported (int 5h)
		8042 keyboard services are supported (int 9h)
		Serial services are supported (int 14h)
		Printer services are supported (int 17h)
		CGA/mono video services are supported (int 10h)
		ACPI is supported
		USB legacy is supported
		BIOS boot specification is supported

Handle 0x0100, DMI type 1, 25 bytes
System Information
	Manufacturer: HP
	Product Name: ProLiant DL380 G3
	Version: Not Specified
	Serial Number: 8090LDN31R      
	UUID: 38303930-4C44-4E33-3152-202020202020
	Wake-up Type: Power Switch

Handle 0x0300, DMI type 3, 13 bytes
Chassis Information
	Manufacturer: HP
	Type: Rack Mount Chassis
	Lock: Not Present
	Version: Not Specified
	Serial Number: 8090LDN31R      
	Asset Tag:                                 
	Boot-up State: Unknown
	Power Supply State: Unknown
	Thermal State: Unknown
	Security Status: Unknown

Handle 0x0406, DMI type 4, 32 bytes
Processor Information
	Socket Designation: Proc 1
	Type: Central Processor
	Family: Xeon
	Manufacturer: Intel
	ID: 25 0F 00 00 FF FB EB BF
	Signature: Type 0, Family 15, Model 2, Stepping 5
	Flags:
		FPU (Floating-point unit on-chip)
		VME (Virtual mode extension)
		DE (Debugging extension)
		PSE (Page size extension)
		TSC (Time stamp counter)
		MSR (Model specific registers)
		PAE (Physical address extension)
		MCE (Machine check exception)
		CX8 (CMPXCHG8 instruction supported)
		APIC (On-chip APIC hardware supported)
		SEP (Fast system call)
		MTRR (Memory type range registers)
		PGE (Page global enable)
		MCA (Machine check architecture)
		CMOV (Conditional move instruction supported)
		PAT (Page attribute table)
		PSE-36 (36-bit page size extension)
		CLFSH (CLFLUSH instruction supported)
		DS (Debug store)
		ACPI (ACPI supported)
		MMX (MMX technology supported)
		FXSR (Fast floating-point save and restore)
		SSE (Streaming SIMD extensions)
		SSE2 (Streaming SIMD extensions 2)
		SS (Self-snoop)
		HTT (Hyper-threading technology)
		TM (Thermal monitor supported)
		PBE (Pending break enabled)
	Version: Not Specified
	Voltage: 1.7 V
	External Clock: 533 MHz
	Max Speed: 3600 MHz
	Current Speed: 3066 MHz
	Status: Populated, Idle
	Upgrade: ZIF Socket
	L1 Cache Handle: 0x0716
	L2 Cache Handle: 0x0726
	L3 Cache Handle: 0x0736

Handle 0x0400, DMI type 4, 32 bytes
Processor Information
	Socket Designation: Proc 2
	Type: Central Processor
	Family: Xeon
	Manufacturer: Intel
	ID: 25 0F 00 00 FF FB EB BF
	Signature: Type 0, Family 15, Model 2, Stepping 5
	Flags:
		FPU (Floating-point unit on-chip)
		VME (Virtual mode extension)
		DE (Debugging extension)
		PSE (Page size extension)
		TSC (Time stamp counter)
		MSR (Model specific registers)
		PAE (Physical address extension)
		MCE (Machine check exception)
		CX8 (CMPXCHG8 instruction supported)
		APIC (On-chip APIC hardware supported)
		SEP (Fast system call)
		MTRR (Memory type range registers)
		PGE (Page global enable)
		MCA (Machine check architecture)
		CMOV (Conditional move instruction supported)
		PAT (Page attribute table)
		PSE-36 (36-bit page size extension)
		CLFSH (CLFLUSH instruction supported)
		DS (Debug store)
		ACPI (ACPI supported)
		MMX (MMX technology supported)
		FXSR (Fast floating-point save and restore)
		SSE (Streaming SIMD extensions)
		SSE2 (Streaming SIMD extensions 2)
		SS (Self-snoop)
		HTT (Hyper-threading technology)
		TM (Thermal monitor supported)
		PBE (Pending break enabled)
	Version: Not Specified
	Voltage: 1.7 V
	External Clock: 533 MHz
	Max Speed: 3600 MHz
	Current Speed: 3066 MHz
	Status: Populated, Enabled
	Upgrade: ZIF Socket
	L1 Cache Handle: 0x0710
	L2 Cache Handle: 0x0720
	L3 Cache Handle: 0x0730

Handle 0x0716, DMI type 7, 19 bytes
Cache Information
	Socket Designation: Processor 1 Internal L1 Cache
	Configuration: Enabled, Not Socketed, Level 1
	Operational Mode: Write Back
	Location: Internal
	Installed Size: 8 KB
	Maximum Size: 32 KB
	Supported SRAM Types:
		Burst
	Installed SRAM Type: Burst
	Speed: Unknown
	Error Correction Type: Unknown
	System Type: Unknown
	Associativity: 4-way Set-associative

Handle 0x0710, DMI type 7, 19 bytes
Cache Information
	Socket Designation: Processor 2 Internal L1 Cache
	Configuration: Enabled, Not Socketed, Level 1
	Operational Mode: Write Back
	Location: Internal
	Installed Size: 8 KB
	Maximum Size: 32 KB
	Supported SRAM Types:
		Burst
	Installed SRAM Type: Burst
	Speed: Unknown
	Error Correction Type: Unknown
	System Type: Unknown
	Associativity: 4-way Set-associative

Handle 0x0726, DMI type 7, 19 bytes
Cache Information
	Socket Designation: Processor 1 Internal L2 Cache
	Configuration: Enabled, Not Socketed, Level 2
	Operational Mode: Write Back
	Location: Internal
	Installed Size: 512 KB
	Maximum Size: 2048 KB
	Supported SRAM Types:
		Burst
	Installed SRAM Type: Burst
	Speed: Unknown
	Error Correction Type: Unknown
	System Type: Unknown
	Associativity: Other

Handle 0x0720, DMI type 7, 19 bytes
Cache Information
	Socket Designation: Processor 2 Internal L2 Cache
	Configuration: Enabled, Not Socketed, Level 2
	Operational Mode: Write Back
	Location: Internal
	Installed Size: 512 KB
	Maximum Size: 2048 KB
	Supported SRAM Types:
		Burst
	Installed SRAM Type: Burst
	Speed: Unknown
	Error Correction Type: Unknown
	System Type: Unknown
	Associativity: Other

Handle 0x0736, DMI type 7, 19 bytes
Cache Information
	Socket Designation: Processor 1 Internal L3 Cache
	Configuration: Enabled, Not Socketed, Level 3
	Operational Mode: Write Back
	Location: Internal
	Installed Size: 1024 KB
	Maximum Size: 1024 KB
	Supported SRAM Types:
		Burst
	Installed SRAM Type: Burst
	Speed: Unknown
	Error Correction Type: Unknown
	System Type: Unknown
	Associativity: Other

Handle 0x0730, DMI type 7, 19 bytes
Cache Information
	Socket Designation: Processor 2 Internal L3 Cache
	Configuration: Enabled, Not Socketed, Level 3
	Operational Mode: Write Back
	Location: Internal
	Installed Size: 1024 KB
	Maximum Size: 1024 KB
	Supported SRAM Types:
		Burst
	Installed SRAM Type: Burst
	Speed: Unknown
	Error Correction Type: Unknown
	System Type: Unknown
	Associativity: Other

Handle 0x0801, DMI type 8, 9 bytes
Port Connector Information
	Internal Reference Designator: J58
	Internal Connector Type: Access Bus (USB)
	External Reference Designator: USB Port 1
	External Connector Type: Access Bus (USB)
	Port Type: USB

Handle 0x0802, DMI type 8, 9 bytes
Port Connector Information
	Internal Reference Designator: J58
	Internal Connector Type: Access Bus (USB)
	External Reference Designator: USB Port 2
	External Connector Type: Access Bus (USB)
	Port Type: USB

Handle 0x0901, DMI type 9, 13 bytes
System Slot Information
	Designation: PCI Slot 1
	Type: 64-bit PCI-X
	Current Usage: Available
	Length: Long
	ID: 1
	Characteristics:
		3.3 V is provided
		PME signal is supported

Handle 0x0902, DMI type 9, 13 bytes
System Slot Information
	Designation: PCI Slot 2
	Type: 64-bit PCI-X
	Current Usage: In Use
	Length: Long
	ID: 2
	Characteristics:
		3.3 V is provided
		PME signal is supported
		Hot-plug devices are supported

Handle 0x0903, DMI type 9, 13 bytes
System Slot Information
	Designation: PCI Slot 3
	Type: 64-bit PCI-X
	Current Usage: In Use
	Length: Long
	ID: 3
	Characteristics:
		3.3 V is provided
		PME signal is supported
		Hot-plug devices are supported

Handle 0x1000, DMI type 16, 15 bytes
Physical Memory Array
	Location: System Board Or Motherboard
	Use: System Memory
	Error Correction Type: Single-bit ECC
	Maximum Capacity: 12 GB
	Error Information Handle: Not Provided
	Number Of Devices: 6

Handle 0x1100, DMI type 17, 23 bytes
Memory Device
	Array Handle: 0x1000
	Error Information Handle: Not Provided
	Total Width: 72 bits
	Data Width: 64 bits
	Size: 512 MB
	Form Factor: DIMM
	Set: 1
	Locator: DIMM 01
	Bank Locator: Not Specified
	Type: DDR
	Type Detail: Synchronous
	Speed: 266 MHz (3.8 ns)

Handle 0x1101, DMI type 17, 23 bytes
Memory Device
	Array Handle: 0x1000
	Error Information Handle: Not Provided
	Total Width: 72 bits
	Data Width: 64 bits
	Size: 512 MB
	Form Factor: DIMM
	Set: 1
	Locator: DIMM 02
	Bank Locator: Not Specified
	Type: DDR
	Type Detail: Synchronous
	Speed: 266 MHz (3.8 ns)

Handle 0x1102, DMI type 17, 23 bytes
Memory Device
	Array Handle: 0x1000
	Error Information Handle: Not Provided
	Total Width: 72 bits
	Data Width: 64 bits
	Size: 1024 MB
	Form Factor: DIMM
	Set: 2
	Locator: DIMM 03
	Bank Locator: Not Specified
	Type: DDR
	Type Detail: Synchronous
	Speed: 266 MHz (3.8 ns)

Handle 0x1103, DMI type 17, 23 bytes
Memory Device
	Array Handle: 0x1000
	Error Information Handle: Not Provided
	Total Width: 72 bits
	Data Width: 64 bits
	Size: 1024 MB
	Form Factor: DIMM
	Set: 2
	Locator: DIMM 04
	Bank Locator: Not Specified
	Type: DDR
	Type Detail: Synchronous
	Speed: 266 MHz (3.8 ns)

Handle 0x1104, DMI type 17, 23 bytes
Memory Device
	Array Handle: 0x1000
	Error Information Handle: Not Provided
	Total Width: 72 bits
	Data Width: 64 bits
	Size: 512 MB
	Form Factor: DIMM
	Set: 3
	Locator: DIMM 05
	Bank Locator: Not Specified
	Type: DDR
	Type Detail: Synchronous
	Speed: 266 MHz (3.8 ns)

Handle 0x1105, DMI type 17, 23 bytes
Memory Device
	Array Handle: 0x1000
	Error Information Handle: Not Provided
	Total Width: 72 bits
	Data Width: 64 bits
	Size: 512 MB
	Form Factor: DIMM
	Set: 3
	Locator: DIMM 06
	Bank Locator: Not Specified
	Type: DDR
	Type Detail: Synchronous
	Speed: 266 MHz (3.8 ns)

Handle 0x1300, DMI type 19, 15 bytes
Memory Array Mapped Address
	Starting Address: 0x00000000000
	Ending Address: 0x000FFFFFFFF
	Range Size: 4 GB
	Physical Array Handle: 0x1000
	Partition Width: 0

Handle 0x1400, DMI type 20, 19 bytes
Memory Device Mapped Address
	Starting Address: 0x00000000000
	Ending Address: 0x0003FFFFFFF
	Range Size: 1 GB
	Physical Device Handle: 0x1100
	Memory Array Mapped Address Handle: 0x1300
	Partition Row Position: 1
	Interleave Position: 1
	Interleaved Data Depth: Unknown

Handle 0x1401, DMI type 20, 19 bytes
Memory Device Mapped Address
	Starting Address: 0x00000000000
	Ending Address: 0x0003FFFFFFF
	Range Size: 1 GB
	Physical Device Handle: 0x1101
	Memory Array Mapped Address Handle: 0x1300
	Partition Row Position: 1
	Interleave Position: 2
	Interleaved Data Depth: Unknown

Handle 0x1402, DMI type 20, 19 bytes
Memory Device Mapped Address
	Starting Address: 0x00040000000
	Ending Address: 0x000BFFFFFFF
	Range Size: 2 GB
	Physical Device Handle: 0x1102
	Memory Array Mapped Address Handle: 0x1300
	Partition Row Position: 1
	Interleave Position: 1
	Interleaved Data Depth: Unknown

Handle 0x1403, DMI type 20, 19 bytes
Memory Device Mapped Address
	Starting Address: 0x00040000000
	Ending Address: 0x000BFFFFFFF
	Range Size: 2 GB
	Physical Device Handle: 0x1103
	Memory Array Mapped Address Handle: 0x1300
	Partition Row Position: 1
	Interleave Position: 2
	Interleaved Data Depth: Unknown

Handle 0x1404, DMI type 20, 19 bytes
Memory Device Mapped Address
	Starting Address: 0x000C0000000
	Ending Address: 0x000F50003FF
	Range Size: 868353 kB
	Physical Device Handle: 0x1104
	Memory Array Mapped Address Handle: 0x1300
	Partition Row Position: 1
	Interleave Position: 1
	Interleaved Data Depth: Unknown

Handle 0x1405, DMI type 20, 19 bytes
Memory Device Mapped Address
	Starting Address: 0x000C0000000
	Ending Address: 0x000F50003FF
	Range Size: 868353 kB
	Physical Device Handle: 0x1105
	Memory Array Mapped Address Handle: 0x1300
	Partition Row Position: 1
	Interleave Position: 2
	Interleaved Data Depth: Unknown

Handle 0x1450, DMI type 20, 19 bytes
Memory Device Mapped Address
	Starting Address: 0x00000000000
	Ending Address: 0x000000003FF
	Range Size: 1 kB
	Physical Device Handle: 0x0000
	Memory Array Mapped Address Handle: 0x0000
	Partition Row Position: 1
	Interleave Position: 1
	Interleaved Data Depth: Unknown

Handle 0x1451, DMI type 20, 19 bytes
Memory Device Mapped Address
	Starting Address: 0x00000000000
	Ending Address: 0x000000003FF
	Range Size: 1 kB
	Physical Device Handle: 0x0000
	Memory Array Mapped Address Handle: 0x0000
	Partition Row Position: 1
	Interleave Position: 2
	Interleaved Data Depth: Unknown

Handle 0x2000, DMI type 32, 11 bytes
System Boot Information
	Status: Firmware-detected hardware failure

Handle 0xC100, DMI type 193, 7 bytes
OEM-specific Type
	Header and Data:
		C1 07 00 C1 01 01 02
	Strings:
		07/25/2003
		03/19/2003

Handle 0xC200, DMI type 194, 5 bytes
OEM-specific Type
	Header and Data:
		C2 05 00 C2 09

Handle 0xC300, DMI type 195, 5 bytes
OEM-specific Type
	Header and Data:
		C3 05 00 C3 01
	Strings:
		$0E110727

Handle 0xC400, DMI type 196, 5 bytes
OEM-specific Type
	Header and Data:
		C4 05 00 C4 00

Handle 0xC506, DMI type 197, 10 bytes
OEM-specific Type
	Header and Data:
		C5 0A 06 C5 06 04 06 00 FF 01

Handle 0xC500, DMI type 197, 10 bytes
OEM-specific Type
	Header and Data:
		C5 0A 00 C5 00 04 00 01 FF 02

Handle 0xC600, DMI type 198, 10 bytes
OEM-specific Type
	Header and Data:
		C6 0A 00 C6 17 00 00 01 3C 00

Handle 0xC700, DMI type 199, 52 bytes
OEM-specific Type
	Header and Data:
		C7 34 00 C7 11 00 00 00 03 20 13 06 25 0F 00 00
		15 00 00 00 03 20 04 06 29 0F 00 00 1D 00 00 00
		03 20 16 01 24 0F 00 00 38 00 00 00 03 20 04 06
		27 0F 00 00

Handle 0xCD00, DMI type 205, 22 bytes
OEM-specific Type
	Header and Data:
		CD 16 00 CD 01 01 46 41 54 78 00 00 F1 FF 00 00
		00 00 00 80 02 00

Handle 0xCA00, DMI type 202, 8 bytes
OEM-specific Type
	Header and Data:
		CA 08 00 CA 00 11 FF 01

Handle 0xCA01, DMI type 202, 8 bytes
OEM-specific Type
	Header and Data:
		CA 08 01 CA 01 11 FF 02

Handle 0xCA02, DMI type 202, 8 bytes
OEM-specific Type
	Header and Data:
		CA 08 02 CA 02 11 FF 03

Handle 0xCA03, DMI type 202, 8 bytes
OEM-specific Type
	Header and Data:
		CA 08 03 CA 03 11 FF 04

Handle 0xCA04, DMI type 202, 8 bytes
OEM-specific Type
	Header and Data:
		CA 08 04 CA 04 11 FF 05

Handle 0xCA05, DMI type 202, 8 bytes
OEM-specific Type
	Header and Data:
		CA 08 05 CA 05 11 FF 06

Handle 0x7F00, DMI type 127, 4 bytes
End Of Table


^ permalink raw reply	[flat|nested] 131+ messages in thread

* Re: NFS client packet storm on 2.6.27.x
  2009-04-22 17:27                                                                                   ` Kasparek Tomas
@ 2009-06-25  5:55                                                                                     ` Kasparek Tomas
  -1 siblings, 0 replies; 131+ messages in thread
From: Kasparek Tomas @ 2009-06-25  5:55 UTC (permalink / raw)
  To: linux-nfs, linux-kernel; +Cc: Trond.Myklebust

On Wed, Apr 22, 2009 at 07:27:07PM +0200, Kasparek Tomas wrote:
> I got another client lockup today. It was a desktop so I have some more
> dmesg warnings about soft lockup caused probably by network cable unplug
> (but hopefully still showing what happens in rpciod) on
> 
> http://merlin.fit.vutbr.cz/tmp/nfs/pckas-dmesg
> 
> I can check with top, that rpciod was using 100% cpu. I limited the flow
> from client to server with firewall so I was able to save the server and
> get some tcpdump -s0 data (actually RPC null with ERR response from server)
> 
> Just to remind, the client is 2.6.27.21 (i386), the server is 2.6.16.62
> (x86_64).

Hi, I was playing with patches from

http://www.linux-nfs.org/Linux-2.6.x/2.6.27/

and found that

.../fixups_4/linux-2.6.27-001-respond_promptly_to_socket_errors.dif
.../fixups_4/linux-2.6.27-002-respond_promptly_to_socket_errors_2.dif

change the lockup behaviour from long or endless lockups to 1-2 second locks,
and there seem to be fewer situations in which it locks up.

The packet storms have not recurred so far since I switched to the 2.6.27.24
(and .25) kernels, so the problem may also have been solved by some other patch in .24.

Together with the tcp_linger patch, this seems to improve the situation a lot,
to the point where it is possible for me to use 2.6.27.x kernels.

Trond, would it be possible to get the tcp_linger patch and the two patches
above into the 2.6.27.x stable queue so that others get these fixes?

Big thanks to all for your help.

--   

  Tomas Kasparek, PhD student  E-mail: kasparek@fit.vutbr.cz
  CVT FIT VUT Brno, L127       Web:    http://www.fit.vutbr.cz/~kasparek
  Bozetechova 1, 612 66        Fax:    +420 54114-1270
  Brno, Czech Republic         Phone:  +420 54114-1220

  jabber: tomas.kasparek@jabber.cz
  GPG: 2F1E 1AAF FD3B CFA3 1537  63BD DCBE 18FF A035 53BC


^ permalink raw reply	[flat|nested] 131+ messages in thread

* Re: NFS client packet storm on 2.6.27.x
  2009-06-25  5:55                                                                                     ` Kasparek Tomas
@ 2009-07-13 11:12                                                                                       ` Kasparek Tomas
  -1 siblings, 0 replies; 131+ messages in thread
From: Kasparek Tomas @ 2009-07-13 11:12 UTC (permalink / raw)
  To: linux-nfs, linux-kernel, stable

On Thu, Jun 25, 2009 at 07:55:32AM +0200, Kasparek Tomas wrote:
> http://www.linux-nfs.org/Linux-2.6.x/2.6.27/
> 
> .../fixups_4/linux-2.6.27-001-respond_promptly_to_socket_errors.dif
> .../fixups_4/linux-2.6.27-002-respond_promptly_to_socket_errors_2.dif
> 
> ...
> Together with the tcp_linger patch, this seems to improve the situation a
> lot, to the point where it is possible for me to use 2.6.27.x kernels.
> 
> Trond, would it be possible to get the tcp_linger patch and the two patches
> above into the 2.6.27.x stable queue so that others get these fixes?

I have received no response so far. Could someone advise me on what to do to
get the above patches into 2.6.27.x stable? Or is there some reason why this
is not possible at all?

Thanks in advance for any advice or suggestions.

--   

  Tomas Kasparek, PhD student  E-mail: kasparek@fit.vutbr.cz
  CVT FIT VUT Brno, L127       Web:    http://www.fit.vutbr.cz/~kasparek
  Bozetechova 1, 612 66        Fax:    +420 54114-1270
  Brno, Czech Republic         Phone:  +420 54114-1220

  jabber: tomas.kasparek@jabber.cz
  GPG: 2F1E 1AAF FD3B CFA3 1537  63BD DCBE 18FF A035 53BC


^ permalink raw reply	[flat|nested] 131+ messages in thread

* Re: [stable] NFS client packet storm on 2.6.27.x
  2009-07-13 11:12                                                                                       ` Kasparek Tomas
  (?)
@ 2009-07-13 17:20                                                                                       ` Greg KH
  2009-07-13 17:40                                                                                         ` Trond Myklebust
  -1 siblings, 1 reply; 131+ messages in thread
From: Greg KH @ 2009-07-13 17:20 UTC (permalink / raw)
  To: Kasparek Tomas; +Cc: linux-nfs, linux-kernel, stable

On Mon, Jul 13, 2009 at 01:12:15PM +0200, Kasparek Tomas wrote:
> On Thu, Jun 25, 2009 at 07:55:32AM +0200, Kasparek Tomas wrote:
> > http://www.linux-nfs.org/Linux-2.6.x/2.6.27/
> > 
> > .../fixups_4/linux-2.6.27-001-respond_promptly_to_socket_errors.dif
> > .../fixups_4/linux-2.6.27-002-respond_promptly_to_socket_errors_2.dif
> > 
> > ...
> > Together with the tcp_linger patch, this seems to improve the situation a
> > lot, to the point where it is possible for me to use 2.6.27.x kernels.
> > 
> > Trond, would it be possible to get the tcp_linger patch and the two patches
> > above into the 2.6.27.x stable queue so that others get these fixes?
> 
> I have received no response so far. Could someone advise me on what to do to
> get the above patches into 2.6.27.x stable? Or is there some reason why this
> is not possible at all?

Can you backport them and send them to stable@kernel.org with the git
commit ids of the original patches?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 131+ messages in thread

* Re: [stable] NFS client packet storm on 2.6.27.x
  2009-07-13 17:20                                                                                       ` [stable] " Greg KH
@ 2009-07-13 17:40                                                                                         ` Trond Myklebust
       [not found]                                                                                           ` <1247506817.14524.25.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
  2009-07-28 18:31                                                                                             ` Greg KH
  0 siblings, 2 replies; 131+ messages in thread
From: Trond Myklebust @ 2009-07-13 17:40 UTC (permalink / raw)
  To: Greg KH; +Cc: Kasparek Tomas, linux-nfs, linux-kernel, stable

On Mon, 2009-07-13 at 10:20 -0700, Greg KH wrote:
> On Mon, Jul 13, 2009 at 01:12:15PM +0200, Kasparek Tomas wrote:
> > On Thu, Jun 25, 2009 at 07:55:32AM +0200, Kasparek Tomas wrote:
> > > http://www.linux-nfs.org/Linux-2.6.x/2.6.27/
> > > 
> > > .../fixups_4/linux-2.6.27-001-respond_promptly_to_socket_errors.dif
> > > .../fixups_4/linux-2.6.27-002-respond_promptly_to_socket_errors_2.dif
> > > 
> > > ...
> > > Together with the tcp_linger patch, this seems to improve the situation
> > > a lot, to the point where it is possible for me to use 2.6.27.x kernels.
> > > 
> > > Trond, would it be possible to get the tcp_linger patch and the two
> > > patches above into the 2.6.27.x stable queue so that others get these fixes?
> > 
> > I have received no response so far. Could someone advise me on what to do to
> > get the above patches into 2.6.27.x stable? Or is there some reason why this
> > is not possible at all?
> 
> Can you backport them and send them to stable@kernel.org with the git
> commit ids of the original patches?

When testing at a customer site, we recently found that we actually need
4 patches in order to completely fix the storming problem in 2.6.27.y,
and ensure a timely socket reconnect.

commit 15f081ca8ddfe150fb639c591b18944a539da0fc (SUNRPC: Avoid an
unnecessary task reschedule on ENOTCONN)

commit 670f94573104b4a25525d3fcdcd6496c678df172 (SUNRPC: Ensure we set
XPRT_CLOSING only after we've sent a tcp FIN...)

commit 40d2549db5f515e415894def98b49db7d4c56714 (SUNRPC: Don't
disconnect if a connection is still in progress.)

and finally

commit f75e6745aa3084124ae1434fd7629853bdaf6798 (SUNRPC: Fix the problem
of EADDRNOTAVAIL syslog floods on reconnect)

The first three need to be applied to kernels 2.6.27.y to 2.6.29.y,
while the last needs to be applied to 2.6.27.y to 2.6.30.y.

I'll ensure the patches get sent to stable@kernel.org...

   Trond


^ permalink raw reply	[flat|nested] 131+ messages in thread
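[Editorial note: the backport workflow Greg asks for above is commonly done with `git cherry-pick -x`, which appends the upstream commit id to the backported commit's changelog, i.e. the "git commit ids of the original patches" that stable@kernel.org wants. The snippet below is a self-contained toy demonstration of that mechanism, not the actual SUNRPC backport; the repository layout, branch name, and file name are invented for the demo.]

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.email demo@example.com
git config user.name "Demo User"

echo base > xprt.c
git add xprt.c
git commit -q -m "base"

git branch stable                 # stand-in for the 2.6.27.y stable tree

echo fix >> xprt.c                # an "upstream" fix on the main branch
git commit -q -am "SUNRPC: example fix"
fix=$(git rev-parse HEAD)

git checkout -q stable
git cherry-pick -x "$fix"         # backport; -x records the upstream id

git log -1 --format=%B            # body ends with "(cherry picked from commit <id>)"
```

In the real case one would cherry-pick the four commits Trond lists onto a 2.6.27.y branch, resolve any conflicts by hand, and send the result with git format-patch / git send-email.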

* Re: [stable] NFS client packet storm on 2.6.27.x
       [not found]                                                                                           ` <1247506817.14524.25.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
@ 2009-07-24  8:54                                                                                             ` Kasparek Tomas
  0 siblings, 0 replies; 131+ messages in thread
From: Kasparek Tomas @ 2009-07-24  8:54 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: linux-nfs

On Mon, Jul 13, 2009 at 01:40:17PM -0400, Trond Myklebust wrote:
> > > I have received no response so far. Could someone advise me on what to do
> > > to get the above patches into 2.6.27.x stable? Or is there some reason why
> > > this is not possible at all?
> > 
> > Can you backport them and send them to stable@kernel.org with the git
> > commit ids of the original patches?
> 
> When testing at a customer site, we recently found that we actually need
> 4 patches in order to completely fix the storming problem in 2.6.27.y,
> and ensure a timely socket reconnect.
> 
> commit 15f081ca8ddfe150fb639c591b18944a539da0fc (SUNRPC: Avoid an
> unnecessary task reschedule on ENOTCONN)
> 
> commit 670f94573104b4a25525d3fcdcd6496c678df172 (SUNRPC: Ensure we set
> XPRT_CLOSING only after we've sent a tcp FIN...)
> 
> commit 40d2549db5f515e415894def98b49db7d4c56714 (SUNRPC: Don't
> disconnect if a connection is still in progress.)
> 
> and finally
> 
> commit f75e6745aa3084124ae1434fd7629853bdaf6798 (SUNRPC: Fix the problem
> of EADDRNOTAVAIL syslog floods on reconnect)

Hi,

I checked those patches; all except f75e67... apply with only an offset, but
the last one is a bit more tricky. Do you already have these patches
backported for 2.6.27.x? If so, can you please send them to linux-nfs or
directly to me so I can test them?

Thanks in advance

--   

  Tomas Kasparek, PhD student  E-mail: kasparek@fit.vutbr.cz
  CVT FIT VUT Brno, L127       Web:    http://www.fit.vutbr.cz/~kasparek
  Bozetechova 1, 612 66        Fax:    +420 54114-1270
  Brno, Czech Republic         Phone:  +420 54114-1220

  jabber: tomas.kasparek-2ASvDZBniIelVyrhU4qvOw@public.gmane.org
  GPG: 2F1E 1AAF FD3B CFA3 1537  63BD DCBE 18FF A035 53BC


^ permalink raw reply	[flat|nested] 131+ messages in thread

* Re: [stable] NFS client packet storm on 2.6.27.x
       [not found]                                                                                           ` <1247506817.14524.25.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
@ 2009-07-28 18:31                                                                                             ` Greg KH
  0 siblings, 0 replies; 131+ messages in thread
From: Greg KH @ 2009-07-28 18:31 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: Kasparek Tomas, linux-nfs, linux-kernel, stable

On Mon, Jul 13, 2009 at 01:40:17PM -0400, Trond Myklebust wrote:
> On Mon, 2009-07-13 at 10:20 -0700, Greg KH wrote:
> > On Mon, Jul 13, 2009 at 01:12:15PM +0200, Kasparek Tomas wrote:
> > > On Thu, Jun 25, 2009 at 07:55:32AM +0200, Kasparek Tomas wrote:
> > > > http://www.linux-nfs.org/Linux-2.6.x/2.6.27/
> > > > 
> > > > .../fixups_4/linux-2.6.27-001-respond_promptly_to_socket_errors.dif
> > > > .../fixups_4/linux-2.6.27-002-respond_promptly_to_socket_errors_2.dif
> > > > 
> > > > ...
> > > > Together with the tcp_linger patch, this seems to improve the situation
> > > > a lot, to the point where it is possible for me to use 2.6.27.x kernels.
> > > > 
> > > > Trond, would it be possible to get the tcp_linger patch and the two
> > > > patches above into the 2.6.27.x stable queue so that others get these fixes?
> > > 
> > > I have received no response so far. Could someone advise me on what to do to
> > > get the above patches into 2.6.27.x stable? Or is there some reason why this
> > > is not possible at all?
> > 
> > Can you backport them and send them to stable@kernel.org with the git
> > commit ids of the original patches?
> 
> When testing at a customer site, we recently found that we actually need
> 4 patches in order to completely fix the storming problem in 2.6.27.y,
> and ensure a timely socket reconnect.
> 
> commit 15f081ca8ddfe150fb639c591b18944a539da0fc (SUNRPC: Avoid an
> unnecessary task reschedule on ENOTCONN)
> 
> commit 670f94573104b4a25525d3fcdcd6496c678df172 (SUNRPC: Ensure we set
> XPRT_CLOSING only after we've sent a tcp FIN...)
> 
> commit 40d2549db5f515e415894def98b49db7d4c56714 (SUNRPC: Don't
> disconnect if a connection is still in progress.)

I've now applied all of these to the .27 stable tree.

> and finally
> 
> commit f75e6745aa3084124ae1434fd7629853bdaf6798 (SUNRPC: Fix the problem
> of EADDRNOTAVAIL syslog floods on reconnect)
> 
> The first three need to be applied to kernels 2.6.27.y to 2.6.29.y,
> while the last needs to be applied to 2.6.27.y to 2.6.30.y.

The last one is in .30, so it doesn't need to go there again :)

But this last one doesn't apply at all to the .27 stable tree.  Care to
refresh it and send it to stable@kernel.org if you think it is also
needed?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 131+ messages in thread

end of thread, other threads:[~2009-07-28 18:57 UTC | newest]

Thread overview: 131+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2008-10-17 12:32 [PATCH] NFS regression in 2.6.26?, "task blocked for more than 120 seconds" Max Kellermann
2008-10-17 14:33 ` Glauber Costa
2008-10-20  6:51   ` Max Kellermann
2008-10-20  7:43     ` Ian Campbell
2008-10-20 13:15     ` Glauber Costa
2008-10-20 14:12       ` Max Kellermann
2008-10-20 14:34         ` Cyrill Gorcunov
2008-10-20 14:21       ` Cyrill Gorcunov
2009-05-22 20:59     ` H. Peter Anvin
2009-05-25 13:12       ` Max Kellermann
2008-10-20  6:27 ` Ian Campbell
2008-11-01 11:45   ` Ian Campbell
2008-11-01 13:41     ` Trond Myklebust
2008-11-02 14:40       ` Ian Campbell
2008-11-07  2:12         ` kenneth johansson
2008-11-04 19:10       ` Ian Campbell
2008-11-25  7:09       ` Ian Campbell
2008-11-25 13:28         ` Trond Myklebust
2008-11-25 13:28           ` Trond Myklebust
2008-11-25 13:38           ` Ian Campbell
2008-11-25 13:38             ` Ian Campbell
2008-11-25 13:57             ` Trond Myklebust
2008-11-25 13:57               ` Trond Myklebust
2008-11-25 14:04               ` Ian Campbell
2008-11-25 14:04                 ` Ian Campbell
2008-11-26 22:12                 ` Ian Campbell
2008-11-26 22:12                   ` Ian Campbell
2008-12-01  0:17                   ` [PATCH 0/3] " Trond Myklebust
2008-12-01  0:17                     ` Trond Myklebust
2008-12-01  0:18                     ` [PATCH 1/3] SUNRPC: Ensure the server closes sockets in a timely fashion Trond Myklebust
2008-12-01  0:18                       ` Trond Myklebust
2008-12-17 15:27                       ` Tom Tucker
2008-12-17 15:27                         ` Tom Tucker
2008-12-17 18:08                         ` Trond Myklebust
2008-12-17 18:08                           ` Trond Myklebust
2008-12-17 18:59                           ` Tom Tucker
2008-12-17 18:59                             ` Tom Tucker
2008-12-01  0:19                     ` [PATCH 2/3] SUNRPC: We only need to call svc_delete_xprt() once Trond Myklebust
2008-12-01  0:20                     ` [PATCH 3/3] SUNRPC: svc_xprt_enqueue should not refuse to enqueue 'XPT_DEAD' transports Trond Myklebust
2008-12-01  0:20                       ` Trond Myklebust
2008-12-17 15:35                       ` Tom Tucker
2008-12-17 19:07                         ` Trond Myklebust
2008-12-17 19:07                           ` Trond Myklebust
2008-12-23 14:49                           ` Tom Tucker
2008-12-23 14:49                             ` Tom Tucker
2008-12-23 23:39                             ` Tom Tucker
2008-12-23 23:39                               ` Tom Tucker
     [not found]                           ` <1229540877.7257.97.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
2009-01-02 21:44                             ` Tom Tucker
2009-01-04 19:12                               ` Trond Myklebust
     [not found]                                 ` <1231096358.7363.6.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
2009-01-04 19:25                                   ` Trond Myklebust
2009-01-05  3:33                                   ` Tom Tucker
     [not found]                                 ` <1231097131.7363.11.camel@heimdal.trondhjem.org>
     [not found]                                   ` <1231097131.7363.11.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
2009-01-05  3:33                                     ` Tom Tucker
2009-01-05 17:04                                     ` Tom Tucker
2009-01-05 17:13                                       ` Trond Myklebust
     [not found]                                         ` <1231175613.7127.6.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
2009-01-05 19:33                                           ` Tom Tucker
2009-01-05 19:51                                             ` Trond Myklebust
     [not found]                                               ` <1231185115.7127.28.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
2009-01-05 20:13                                                 ` Tom Tucker
2009-01-05 20:41                                                 ` Tom Tucker
2009-01-05 20:48                                                   ` Trond Myklebust
     [not found]                                                     ` <1231188518.7127.30.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
2009-01-05 21:10                                                       ` Tom Tucker
2008-12-01  0:29                     ` [PATCH 0/3] NFS regression in 2.6.26?, "task blocked for more than 120 seconds" Trond Myklebust
2008-12-02 15:22                       ` Kasparek Tomas
2008-12-02 15:37                         ` Trond Myklebust
2008-12-02 16:26                           ` Kasparek Tomas
2008-12-02 18:10                             ` Trond Myklebust
     [not found]                               ` <1228241407.3090.7.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
2008-12-04 10:23                                 ` Kasparek Tomas
     [not found]                                   ` <1229284201.6463.98.camel@heimdal.trondhjem.org>
     [not found]                                     ` <1229284201.6463.98.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
2008-12-16 12:05                                       ` Kasparek Tomas
2008-12-16 12:10                                         ` Kasparek Tomas
2008-12-16 12:59                                           ` Trond Myklebust
2008-12-23 22:34                                         ` Trond Myklebust
     [not found]                                           ` <1230071647.17701.27.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
2009-01-05 12:18                                             ` Kasparek Tomas
2009-01-09 14:56                                             ` Kasparek Tomas
2009-01-09 17:59                                               ` Trond Myklebust
     [not found]                                                 ` <1231523966.7179.67.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
2009-01-10 10:24                                                   ` Kasparek Tomas
2009-01-10 16:00                                                     ` Trond Myklebust
     [not found]                                                       ` <20090112090404.GL47559@fit.vutbr.cz>
     [not found]                                                         ` <1231782009.7322.12.camel@heimdal.trondhjem.org>
     [not found]                                                           ` <1231809446.7322.17.camel@heimdal.trondhjem.org>
     [not found]                                                             ` <1231809446.7322.17.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
2009-01-13 15:22                                                               ` Kasparek Tomas
2009-01-16 10:48                                                                 ` Kasparek Tomas
2009-01-18 13:08                                                                   ` Kasparek Tomas
2009-01-20 15:03                                                                     ` Kasparek Tomas
2009-01-20 15:32                                                                       ` Trond Myklebust
     [not found]                                                                         ` <1232465547.7055.3.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
2009-01-28  8:18                                                                           ` Kasparek Tomas
2009-02-06  6:35                                                                             ` Kasparek Tomas
2009-02-10  7:55                                                                               ` Kasparek Tomas
2009-03-03 12:08                                                                         ` Kasparek Tomas
2009-03-03 14:16                                                                           ` Trond Myklebust
     [not found]                                                                             ` <1236089767.9631.4.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
2009-03-25  8:46                                                                               ` Kasparek Tomas
2009-04-18  5:17                                                                               ` Kasparek Tomas
2009-04-22 17:27                                                                                 ` NFS client packet storm on 2.6.27.x Kasparek Tomas
2009-04-29 12:12                                                                                   ` Steve Dickson
2009-04-29 14:57                                                                                     ` Kasparek Tomas
2009-06-25  5:55                                                                                   ` Kasparek Tomas
2009-07-13 11:12                                                                                     ` Kasparek Tomas
2009-07-13 17:20                                                                                       ` [stable] " Greg KH
2009-07-13 17:40                                                                                         ` Trond Myklebust
     [not found]                                                                                           ` <1247506817.14524.25.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
2009-07-24  8:54                                                                                             ` Kasparek Tomas
2009-07-28 18:31                                                                                           ` Greg KH
2008-12-01 22:09                     ` [PATCH 0/3] NFS regression in 2.6.26?, "task blocked for more than 120 seconds" Ian Campbell
2008-12-06 12:16                       ` Ian Campbell
2008-12-14 18:24                         ` Ian Campbell
2008-12-16 17:55                           ` J. Bruce Fields
2008-12-16 18:39                             ` Ian Campbell
2009-01-07 22:21                               ` J. Bruce Fields
2009-01-08 18:20                                 ` J. Bruce Fields
2009-01-08 21:22                                   ` Ian Campbell
2009-01-08 21:26                                     ` J. Bruce Fields
2009-01-12  9:46                                       ` Ian Campbell
2009-01-22  8:27                                       ` Ian Campbell
2009-01-22 16:44                                         ` J. Bruce Fields
2008-11-26  9:16             ` [PATCH] NFS regression in 2.6.26?, Tomas Kasparek