* Nasty regression from .27.7 to .27.8: idle samba goes crazy
@ 2008-12-08  6:18 Holger Hoffstaette
  2008-12-08  7:34 ` Rafael J. Wysocki
  0 siblings, 1 reply; 21+ messages in thread
From: Holger Hoffstaette @ 2008-12-08  6:18 UTC (permalink / raw)
  To: linux-kernel


Hi,

I just encountered a nasty symptom for the second time that has started to
occur after updating my home server from vanilla 2.6.27.7 to .8 (same
config).

A while after disconnecting a samba client, the smbd samba server
process goes crazy and consumes 100% CPU. From that time on it is
unkillable (kill -9 returns but the process continues to run). The only
recourse is reboot, which works without problem (i.e. unmounting the
served filesystems is apparently possible?). I tried to attach to the
process with gdb but that just hung.

The system is a generic old single-core P4 box with a single SATA drive,
Gentoo userland and Samba is 3.0.33 (in async mode). The kernel has no
patches or binary drivers. It has been rock solid before the update and
shows no other signs of weirdness in logs or otherwise. I downgraded to .7
for now and will see what happens, but since it worked before I am certain
that this is a regression in the .8 release.

The only commonality is a log entry by samba that seems to correlate with
both occurrences:

[2008/12/08 01:02:52, 0] lib/util_sock.c:read_data(534)
  read_data: read failure for 4 bytes to client 192.168.100.128. Error = No route to host

.128 is the Windows client machine (connected via a stable GigE link),
which I shut down pretty much exactly 30 minutes before that (any 30
minute timeouts in the kernel/network stack?). Both instances of these log
entries correlate with the CPU spikes which I noticed in my MRTG graphs.
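
The log line above comes from a "read exactly N bytes" helper reporting the
errno of a failed read(). A minimal sketch of that pattern follows; the name
read_exact() and its return convention are hypothetical and this is not
Samba's actual read_data() code. On Linux, strerror(EHOSTUNREACH) produces
the "No route to host" text seen in the log.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not Samba's read_data()): read exactly `len` bytes.
 * Returns len on success, a short count if the peer closed the connection
 * early, or -1 on error with errno preserved for the caller.
 */
static ssize_t read_exact(int fd, void *buf, size_t len)
{
	size_t done = 0;

	assert(buf != NULL);
	while (done < len) {
		ssize_t n = read(fd, (char *)buf + done, len - done);

		if (n > 0) {
			done += (size_t)n;
			continue;
		}
		if (n == 0)
			break;			/* EOF: peer closed the connection */
		if (errno == EINTR)
			continue;		/* interrupted by a signal, retry */

		/* Report the failure, keeping errno intact across fprintf(). */
		int saved = errno;
		fprintf(stderr, "read failure for %zu bytes: %s\n",
			len - done, strerror(saved));
		errno = saved;
		return -1;
	}
	return (ssize_t)done;
}
```

A caller would typically treat both -1 and a short count as a reason to drop
the client connection.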

Any suspects or ideas?

thanks
Holger




* Re: Nasty regression from .27.7 to .27.8: idle samba goes crazy
  2008-12-08  6:18 Nasty regression from .27.7 to .27.8: idle samba goes crazy Holger Hoffstaette
@ 2008-12-08  7:34 ` Rafael J. Wysocki
  2008-12-08  8:07   ` Holger Hoffstaette
  2008-12-08 16:32   ` Stefan Richter
  0 siblings, 2 replies; 21+ messages in thread
From: Rafael J. Wysocki @ 2008-12-08  7:34 UTC (permalink / raw)
  To: Holger Hoffstaette; +Cc: linux-kernel, Greg KH, stable

On Monday, 8 of December 2008, Holger Hoffstaette wrote:
> 
> Hi,
> 
> I just encountered a nasty symptom for the second time that has started to
> occur after updating my home server from vanilla 2.6.27.7 to .8 (same
> config).
> 
> A while after disconnecting a samba client, the smbd samba server
> process goes crazy and consumes 100% CPU. From that time on it is
> unkillable (kill -9 returns but the process continues to run). The only
> recourse is reboot, which works without problem (i.e. unmounting the
> served filesystems is apparently possible?). I tried to attach to the
> process with gdb but that just hung.
> 
> The system is a generic old single-core P4 box with a single SATA drive,
> Gentoo userland and Samba is 3.0.33 (in async mode). The kernel has no
> patches or binary drivers. It has been rock solid before the update and
> shows no other signs of weirdness in logs or otherwise. I downgraded to .7
> for now and will see what happens, but since it worked before I am certain
> that this is a regression in the .8 release.
> 
> The only commonality is a log entry by samba that seems to correlate with
> both occurrences:
> 
> [2008/12/08 01:02:52, 0] lib/util_sock.c:read_data(534)
>   read_data: read failure for 4 bytes to client 192.168.100.128. Error = No route to host
> 
> .128 is the Windows client machine (connected via a stable GigE link),
> which I shut down pretty much exactly 30 minutes before that (any 30
> minute timeouts in the kernel/network stack?). Both instances of these log
> entries correlate with the CPU spikes which I noticed in my MRTG graphs.
> 
> Any suspects or ideas?

Please bisect.

Thanks,
Rafael


* Re: Nasty regression from .27.7 to .27.8: idle samba goes crazy
  2008-12-08  7:34 ` Rafael J. Wysocki
@ 2008-12-08  8:07   ` Holger Hoffstaette
  2008-12-08 16:46     ` Stefan Richter
  2008-12-08 16:32   ` Stefan Richter
  1 sibling, 1 reply; 21+ messages in thread
From: Holger Hoffstaette @ 2008-12-08  8:07 UTC (permalink / raw)
  To: linux-kernel

On Mon, 08 Dec 2008 08:34:22 +0100, Rafael J. Wysocki wrote:

> On Monday, 8 of December 2008, Holger Hoffstaette wrote:
>> 
>> A while after disconnecting a samba client, the smbd samba server
>> process goes crazy and consumes 100% CPU. From that time on it is
>> unkillable (kill -9 returns but the process continues to run). The only
>> recourse is reboot, which works without problem (i.e. unmounting the
>> served filesystems is apparently possible?). I tried to attach to the
>> process with gdb but that just hung.
>> [..]
> 
> Please bisect.

I would love to try, but this is my "production server" (i.e. I need it
for real work) and I'll be traveling the next few days. I will try to
bisect after that (if nobody else has any ideas) but will have to make
sure the bug is actually reproducible after the timeout - for now I only
observed it by accident (via mrtg).
In the meantime maybe someone else will observe it as well.

thanks
Holger




* Re: Nasty regression from .27.7 to .27.8: idle samba goes crazy
  2008-12-08  7:34 ` Rafael J. Wysocki
  2008-12-08  8:07   ` Holger Hoffstaette
@ 2008-12-08 16:32   ` Stefan Richter
  2008-12-08 18:46     ` Rafael J. Wysocki
  1 sibling, 1 reply; 21+ messages in thread
From: Stefan Richter @ 2008-12-08 16:32 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Holger Hoffstaette, linux-kernel, Greg KH, stable

Rafael J. Wysocki wrote:
> Please bisect.

Why should he bisect before the developers who added networking related
patches to .8 attempted to reproduce the bug, let alone looked at the
report?

Actually he "bisected" it already to the diff of .7->.8.

(Maybe it is not a networking bug, but that's where it makes most sense
to start to look.)
-- 
Stefan Richter
-=====-==--- ==-- -=---
http://arcgraph.de/sr/


* Re: Nasty regression from .27.7 to .27.8: idle samba goes crazy
  2008-12-08  8:07   ` Holger Hoffstaette
@ 2008-12-08 16:46     ` Stefan Richter
  2008-12-08 19:19       ` Stefan Richter
  0 siblings, 1 reply; 21+ messages in thread
From: Stefan Richter @ 2008-12-08 16:46 UTC (permalink / raw)
  To: netdev
  Cc: Holger Hoffstaette, linux-kernel, Rafael J. Wysocki, Greg KH, stable

Holger Hoffstaette wrote at LKML:
> On Mon, 08 Dec 2008 08:34:22 +0100, Rafael J. Wysocki wrote:
> 
>> On Monday, 8 of December 2008, Holger Hoffstaette wrote:
>>> Hi,
>>> 
>>> I just encountered a nasty symptom for the second time that has started to
>>> occur after updating my home server from vanilla 2.6.27.7 to .8 (same
>>> config).
>>> 
>>> A while after disconnecting a samba client, the smbd samba server
>>> process goes crazy and consumes 100% CPU. From that time on it is
>>> unkillable (kill -9 returns but the process continues to run). The only
>>> recourse is reboot, which works without problem (i.e. unmounting the
>>> served filesystems is apparently possible?). I tried to attach to the
>>> process with gdb but that just hung.
>>> 
>>> The system is a generic old single-core P4 box with a single SATA drive,
>>> Gentoo userland and Samba is 3.0.33 (in async mode). The kernel has no
>>> patches or binary drivers. It has been rock solid before the update and
>>> shows no other signs of weirdness in logs or otherwise. I downgraded to .7
>>> for now and will see what happens, but since it worked before I am certain
>>> that this is a regression in the .8 release.
>>> 
>>> The only commonality is a log entry by samba that seems to correlate with
>>> both occurrences:
>>> 
>>> [2008/12/08 01:02:52, 0] lib/util_sock.c:read_data(534)
>>>   read_data: read failure for 4 bytes to client 192.168.100.128. Error = No route to host
>>> 
>>> .128 is the Windows client machine (connected via a stable GigE link),
>>> which I shut down pretty much exactly 30 minutes before that (any 30
>>> minute timeouts in the kernel/network stack?). Both instances of these log
>>> entries correlate with the CPU spikes which I noticed in my MRTG graphs.
>>> 
>>> Any suspects or ideas?
>>> 
>>> thanks
>>> Holger
>> 
>> Please bisect.
> 
> I would love to try, but this is my "production server" (i.e. I need it
> for real work) and I'll be traveling the next few days. I will try to
> bisect after that (if nobody else has any ideas) but will have to make
> sure the bug is actually reproducible after the timeout - for now I only
> observed it by accident (via mrtg).
> In the meantime maybe someone else will observe it as well.
> 
> thanks
> Holger
> 

Added Cc: netdev, readded all other Cc's, quoted in full for netdev.
Good luck,
-- 
Stefan Richter
-=====-==--- ==-- -=---
http://arcgraph.de/sr/


* Re: Nasty regression from .27.7 to .27.8: idle samba goes crazy
  2008-12-08 16:32   ` Stefan Richter
@ 2008-12-08 18:46     ` Rafael J. Wysocki
  2008-12-08 19:14       ` Stefan Richter
  0 siblings, 1 reply; 21+ messages in thread
From: Rafael J. Wysocki @ 2008-12-08 18:46 UTC (permalink / raw)
  To: Stefan Richter; +Cc: Holger Hoffstaette, linux-kernel, Greg KH, stable

On Monday, 8 of December 2008, Stefan Richter wrote:
> Rafael J. Wysocki wrote:
> > Please bisect.
> 
> Why should he bisect before the developers who added networking related
> patches to .8 attempted to reproduce the bug, let alone looked at the
> report?

Because I think that's the fastest way to turn the attention of the appropriate
people to the problem.

Thanks,
Rafael


* Re: Nasty regression from .27.7 to .27.8: idle samba goes crazy
  2008-12-08 18:46     ` Rafael J. Wysocki
@ 2008-12-08 19:14       ` Stefan Richter
  0 siblings, 0 replies; 21+ messages in thread
From: Stefan Richter @ 2008-12-08 19:14 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Holger Hoffstaette, linux-kernel, Greg KH, stable

Rafael J. Wysocki wrote:
> On Monday, 8 of December 2008, Stefan Richter wrote:
>> Rafael J. Wysocki wrote:
>>> Please bisect.
>> Why should he bisect before the developers who added networking related
>> patches to .8 attempted to reproduce the bug, let alone looked at the
>> report?
> 
> Because I think that's the fastest way to turn the attention of the appropriate
> people to the problem.

All the contributors to 2.6.27.8 are known by name & address, and they
hopefully remember what they put into .8 and can quickly tell whether
there is a chance that it could have something to do with Samba going
into an unkillable busy loop.

So, in case of a bug report which even includes a potential way to
reproduce the issue on generic hardware with common tools, "Could you
bisect?  Meanwhile, let's Cc netdev." sounds better to me than "Please
bisect."
-- 
Stefan Richter
-=====-==--- ==-- -=---
http://arcgraph.de/sr/


* Re: Nasty regression from .27.7 to .27.8: idle samba goes crazy
  2008-12-08 16:46     ` Stefan Richter
@ 2008-12-08 19:19       ` Stefan Richter
  2008-12-08 20:08           ` Holger Hoffstaette
  2008-12-08 22:22         ` Jan Rekorajski
  0 siblings, 2 replies; 21+ messages in thread
From: Stefan Richter @ 2008-12-08 19:19 UTC (permalink / raw)
  To: Holger Hoffstaette
  Cc: netdev, linux-kernel, Rafael J. Wysocki, Greg KH, stable

>>> On Monday, 8 of December 2008, Holger Hoffstaette wrote:
>>>> The system is a generic old single-core P4 box with a single SATA drive,
>>>> Gentoo userland and Samba is 3.0.33 (in async mode). The kernel has no
>>>> patches or binary drivers.

Holger, it may be unrelated to the issue, but to be sure:  Which network
card driver do you use?
-- 
Stefan Richter
-=====-==--- ==-- -=---
http://arcgraph.de/sr/


* Re: Nasty regression from .27.7 to .27.8: idle samba goes crazy
  2008-12-08 19:19       ` Stefan Richter
@ 2008-12-08 20:08           ` Holger Hoffstaette
  2008-12-08 22:22         ` Jan Rekorajski
  1 sibling, 0 replies; 21+ messages in thread
From: Holger Hoffstaette @ 2008-12-08 20:08 UTC (permalink / raw)
  To: linux-kernel; +Cc: netdev

On Mon, 08 Dec 2008 20:19:37 +0100, Stefan Richter wrote:

>>>> On Monday, 8 of December 2008, Holger Hoffstaette wrote:
>>>>> The system is a generic old single-core P4 box with a single SATA
>>>>> drive, Gentoo userland and Samba is 3.0.33 (in async mode). The
>>>>> kernel has no patches or binary drivers.
> 
> Holger, it may be unrelated to the issue, but to be sure:  Which network
> card driver do you use?

e1000 with the older PCI/PCI-X 82545GM rev.04 card in a PCI slot.

thanks,
Holger




* Re: Nasty regression from .27.7 to .27.8: idle samba goes crazy
  2008-12-08 19:19       ` Stefan Richter
  2008-12-08 20:08           ` Holger Hoffstaette
@ 2008-12-08 22:22         ` Jan Rekorajski
  2008-12-09 17:37           ` Chuck Ebbert
  1 sibling, 1 reply; 21+ messages in thread
From: Jan Rekorajski @ 2008-12-08 22:22 UTC (permalink / raw)
  To: linux-kernel
  Cc: Holger Hoffstaette, netdev, Rafael J. Wysocki, Greg KH, stable,
	Stefan Richter

On Mon, 08 Dec 2008, Stefan Richter wrote:

> >>> On Monday, 8 of December 2008, Holger Hoffstaette wrote:
> >>>> The system is a generic old single-core P4 box with a single SATA drive,
> >>>> Gentoo userland and Samba is 3.0.33 (in async mode). The kernel has no
> >>>> patches or binary drivers.
> 
> Holger, it may be unrelated to the issue, but to be sure:  Which network
> card driver do you use?

I think you can safely rule out NIC, I'm also seeing this behaviour on a
brand new server with imap hanging in some busy-loop.
Network card in my case:
Broadcom Corporation NetXtreme II BCM5708 Gigabit Ethernet (rev 12)

What I observed was one CPU doing 100% system work, and the number of
timer interrupts went from 1k per second to 4k (for the whole system).

I didn't report it because I thought one of my own patches was to blame.
Oh, and, unfortunately, I can't bisect, I'm seeing this only on one machine
that has to be running.

Jan
-- 
Jan Rekorajski            |  ALL SUSPECTS ARE GUILTY. PERIOD!
baggins<at>mimuw.edu.pl   |  OTHERWISE THEY WOULDN'T BE SUSPECTS, WOULD THEY?
BOFH, MANIAC              |                   -- TROOPS by Kevin Rubio


* Re: Nasty regression from .27.7 to .27.8: idle samba goes crazy
  2008-12-08 22:22         ` Jan Rekorajski
@ 2008-12-09 17:37           ` Chuck Ebbert
  2008-12-09 19:16             ` Manfred Spraul
  2008-12-10 17:37             ` Manfred Spraul
  0 siblings, 2 replies; 21+ messages in thread
From: Chuck Ebbert @ 2008-12-09 17:37 UTC (permalink / raw)
  To: Jan Rekorajski
  Cc: linux-kernel, Holger Hoffstaette, netdev, Rafael J. Wysocki,
	Greg KH, stable, Stefan Richter, Manfred Spraul

On Mon, 8 Dec 2008 23:22:46 +0100
Jan Rekorajski <baggins@sith.mimuw.edu.pl> wrote:

> On Mon, 08 Dec 2008, Stefan Richter wrote:
> 
> > >>> On Monday, 8 of December 2008, Holger Hoffstaette wrote:
> > >>>> The system is a generic old single-core P4 box with a single SATA drive,
> > >>>> Gentoo userland and Samba is 3.0.33 (in async mode). The kernel has no
> > >>>> patches or binary drivers.
> > 
> > Holger, it may be unrelated to the issue, but to be sure:  Which network
> > card driver do you use?
> 
> I think you can safely rule out NIC, I'm also seeing this behaviour on a
> brand new server with imap hanging in some busy-loop.
> Network card in my case:
> Broadcom Corporation NetXtreme II BCM5708 Gigabit Ethernet (rev 12)
> 
> What I observed was one CPU doing 100% system work, and the number of
> timer interrupts went from 1k per second to 4k (for the whole system).
> 

Try reverting the idr patch that went into 2.6.27.8. It broke DRM in the
Fedora kernel at least.

http://git.kernel.org/?p=linux/kernel/git/stable/stable-queue.git;a=blob_plain;f=releases/2.6.27.8/lib-idr.c-fix-rcu-related-race-with-idr_find.patch;h=b1145766fb9460a0c0285350b49216355c5b4ad8


* Re: Nasty regression from .27.7 to .27.8: idle samba goes crazy
  2008-12-09 17:37           ` Chuck Ebbert
@ 2008-12-09 19:16             ` Manfred Spraul
  2008-12-09 19:30               ` Chuck Ebbert
  2008-12-10 17:37             ` Manfred Spraul
  1 sibling, 1 reply; 21+ messages in thread
From: Manfred Spraul @ 2008-12-09 19:16 UTC (permalink / raw)
  To: Chuck Ebbert
  Cc: Jan Rekorajski, linux-kernel, Holger Hoffstaette, netdev,
	Rafael J. Wysocki, Greg KH, stable, Stefan Richter, Nadia Derbey

Chuck Ebbert wrote:
> Try reverting the idr patch that went into 2.6.27.8. It broke DRM in the
> Fedora kernel at least.
>
>   
What happens?
Does it oops, does one of the BUG() statements trigger?

--
    Manfred


* Re: Nasty regression from .27.7 to .27.8: idle samba goes crazy
  2008-12-09 19:16             ` Manfred Spraul
@ 2008-12-09 19:30               ` Chuck Ebbert
  0 siblings, 0 replies; 21+ messages in thread
From: Chuck Ebbert @ 2008-12-09 19:30 UTC (permalink / raw)
  To: Manfred Spraul
  Cc: Jan Rekorajski, linux-kernel, Holger Hoffstaette, netdev,
	Rafael J. Wysocki, Greg KH, stable, Stefan Richter, Nadia Derbey

On Tue, 09 Dec 2008 20:16:34 +0100
Manfred Spraul <manfred@colorfullife.com> wrote:

> Chuck Ebbert wrote:
> > Try reverting the idr patch that went into 2.6.27.8. It broke DRM in the
> > Fedora kernel at least.
> >
> >   
> What happens?
> Does it oops, does one of the BUG() statements trigger?
> 

It fails in strange ways, e.g. trying to open a DRM device causes it to
disappear. (And DRM is a heavy user of idr.)


* Re: Nasty regression from .27.7 to .27.8: idle samba goes crazy
  2008-12-09 17:37           ` Chuck Ebbert
  2008-12-09 19:16             ` Manfred Spraul
@ 2008-12-10 17:37             ` Manfred Spraul
  2008-12-11 22:54               ` Holger Hoffstätte
                                 ` (2 more replies)
  1 sibling, 3 replies; 21+ messages in thread
From: Manfred Spraul @ 2008-12-10 17:37 UTC (permalink / raw)
  To: Holger Hoffstaette, Stefan Richter; +Cc: linux-kernel, stable

[-- Attachment #1: Type: text/plain, Size: 1057 bytes --]

Chuck Ebbert wrote:
> On Mon, 8 Dec 2008 23:22:46 +0100
> Jan Rekorajski <baggins@sith.mimuw.edu.pl> wrote:
>
>   
>> On Mon, 08 Dec 2008, Stefan Richter wrote:
>>
>>     
>>>>>> On Monday, 8 of December 2008, Holger Hoffstaette wrote:
>>>>>>             
>>>>>>> The system is a generic old single-core P4 box with a single SATA drive,
>>>>>>> Gentoo userland and Samba is 3.0.33 (in async mode). The kernel has no
>>>>>>> patches or binary drivers.
>>>>>>>               
>>> Holger, it may be unrelated to the issue, but to be sure:  Which network
>>> card driver do you use?
>>>       
>> I think you can safely rule out NIC, I'm also seeing this behaviour on a
>> brand new server with imap hanging in some busy-loop.
>> Network card in my case:
>> Broadcom Corporation NetXtreme II BCM5708 Gigabit Ethernet (rev 12)
>>
>> What I observed was one CPU doing 100% system work, and the number of
>> timer interrupts went from 1k per second to 4k (for the whole system).
>>
>>     
Could you try the attached patch?
It should fix the bug.


--
    Manfred

[-- Attachment #2: patch-idr-get_above_int --]
[-- Type: text/plain, Size: 1490 bytes --]

From ae060e0b7bc071bd73dd5319b93c3344d9e10212 Mon Sep 17 00:00:00 2001
From: Manfred Spraul <manfred@colorfullife.com>
To: torvalds@linux-foundation.org
Cc: linux-kernel@vger.kernel.org
Cc: cebbert@redhat.com
Cc: airlied@gmail.com
Cc: akpm@linux-foundation.org
Bcc: manfred@colorfullife.com
Date: Wed, 10 Dec 2008 18:17:06 +0100
Subject: [PATCH] lib/idr.c: Fix bug introduced by RCU fix

The last patch to lib/idr.c caused a bug if idr_get_new_above() was
called on an empty idr:
Usually, nodes stay on the same layer. New layers are added to the top
of the tree.
The exception is idr_get_new_above() on an empty tree: In this case,
the new root node is first added on layer 0, then moved upwards.
p->layer was not updated.

As usual: You shall never rely on the source code comments, they
will only mislead you.

Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
---
 lib/idr.c |    8 +++++++-
 1 files changed, 7 insertions(+), 1 deletions(-)

diff --git a/lib/idr.c b/lib/idr.c
index 7a785a0..1c4f928 100644
--- a/lib/idr.c
+++ b/lib/idr.c
@@ -220,8 +220,14 @@ build_up:
 	 */
 	while ((layers < (MAX_LEVEL - 1)) && (id >= (1 << (layers*IDR_BITS)))) {
 		layers++;
-		if (!p->count)
+		if (!p->count) {
+			/* special case: if the tree is currently empty,
+			 * then we grow the tree by moving the top node
+			 * upwards.
+			 */
+			p->layer++;
 			continue;
+		}
 		if (!(new = get_from_free_list(idp))) {
 			/*
 			 * The allocation failed.  If we built part of
-- 
1.5.6.5
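
The empty-tree case described in the commit message can be illustrated with
a small stand-alone model. This is a sketch, not the kernel's lib/idr.c: the
toy_* names and constants are hypothetical, the model covers only the
empty-tree path of the build_up loop, and it assumes that lookups derive the
range of valid ids from the top node's layer field.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_BITS	5	/* stand-in for IDR_BITS */
#define TOY_MAX_LEVEL	7	/* stand-in for MAX_LEVEL */

struct toy_layer {
	int layer;	/* distance from the leaf level, 0 = leaf */
	int count;	/* number of used slots */
};

/*
 * Model of the build_up loop for an EMPTY tree only: the tree grows by
 * treating the existing (empty) top node as if it sat one level higher.
 * With with_fix == false the node's layer field goes stale, as before
 * the patch. Returns the resulting number of layers.
 */
static int toy_build_up(struct toy_layer *top, int layers, int id,
			bool with_fix)
{
	assert(top != NULL && top->count == 0 && layers >= 1 && id >= 0);
	while (layers < TOY_MAX_LEVEL - 1 &&
	       id >= (1 << (layers * TOY_BITS))) {
		layers++;
		if (with_fix)
			top->layer++;	/* the line the patch adds */
	}
	return layers;
}

/* Assumed lookup-side range check, based on the top node's layer. */
static bool toy_covers(const struct toy_layer *top, int id)
{
	return id < (1 << ((top->layer + 1) * TOY_BITS));
}
```

In this model, requesting id 100 on an empty tree yields two layers either
way, but without the fix the top node still claims layer 0, so a later
lookup of id 100 is rejected as out of range.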



* Re: Nasty regression from .27.7 to .27.8: idle samba goes crazy
  2008-12-10 17:37             ` Manfred Spraul
@ 2008-12-11 22:54               ` Holger Hoffstätte
  2008-12-11 23:40                 ` [stable] " Greg KH
  2008-12-12  1:08               ` Jan Rekorajski
  2008-12-12 14:01               ` Holger Hoffstätte
  2 siblings, 1 reply; 21+ messages in thread
From: Holger Hoffstätte @ 2008-12-11 22:54 UTC (permalink / raw)
  To: Manfred Spraul; +Cc: Stefan Richter, linux-kernel, stable


Dear all -

Thanks for your efforts.

Manfred Spraul wrote:
> Chuck Ebbert wrote:
>> On Mon, 8 Dec 2008 23:22:46 +0100
>> Jan Rekorajski <baggins@sith.mimuw.edu.pl> wrote:
>>
>>  
>>> On Mon, 08 Dec 2008, Stefan Richter wrote:
>>>
>>>    
>>>>>>> On Monday, 8 of December 2008, Holger Hoffstaette wrote:
>>>>>>>            
>>>>>>>> The system is a generic old single-core P4 box with a single
>>>>>>>> SATA drive,
>>>>>>>> Gentoo userland and Samba is 3.0.33 (in async mode). The kernel
>>>>>>>> has no
>>>>>>>> patches or binary drivers.
>>>>>>>>               
>>>> Holger, it may be unrelated to the issue, but to be sure:  Which
>>>> network
>>>> card driver do you use?
>>>>       
>>> I think you can safely rule out NIC, I'm also seeing this behaviour on a
>>> brand new server with imap hanging in some busy-loop.
>>> Network card in my case:
>>> Broadcom Corporation NetXtreme II BCM5708 Gigabit Ethernet (rev 12)
>>>
>>> What I observed was one CPU doing 100% system work, and the number of
>>> timer interrupts went from 1k per second to 4k (for the whole system).
>>>
> Could you try the attached patch?
> It should fix the bug.

I just built 2.6.27.9-rc1 and disconnected the Windowz box several times.
For now smbd does not seem to go into a death spin any more, even though
as far as I can tell .9-rc1 does not contain Manfred's latest patch. Not
sure what that means, if anything.

I'll keep running stable.9-rc1 and see what happens..

thanks all
Holger



* Re: [stable] Nasty regression from .27.7 to .27.8: idle samba goes crazy
  2008-12-11 22:54               ` Holger Hoffstätte
@ 2008-12-11 23:40                 ` Greg KH
  2008-12-12 10:59                   ` Holger Hoffstätte
  0 siblings, 1 reply; 21+ messages in thread
From: Greg KH @ 2008-12-11 23:40 UTC (permalink / raw)
  To: Holger Hoffstätte
  Cc: Manfred Spraul, Stefan Richter, linux-kernel, stable

On Thu, Dec 11, 2008 at 11:54:12PM +0100, Holger Hoffstätte wrote:
> 
> Dear all -
> 
> Thanks for your efforts.
> 
> Manfred Spraul wrote:
> > Chuck Ebbert wrote:
> >> On Mon, 8 Dec 2008 23:22:46 +0100
> >> Jan Rekorajski <baggins@sith.mimuw.edu.pl> wrote:
> >>
> >>  
> >>> On Mon, 08 Dec 2008, Stefan Richter wrote:
> >>>
> >>>    
> >>>>>>> On Monday, 8 of December 2008, Holger Hoffstaette wrote:
> >>>>>>>            
> >>>>>>>> The system is a generic old single-core P4 box with a single
> >>>>>>>> SATA drive,
> >>>>>>>> Gentoo userland and Samba is 3.0.33 (in async mode). The kernel
> >>>>>>>> has no
> >>>>>>>> patches or binary drivers.
> >>>>>>>>               
> >>>> Holger, it may be unrelated to the issue, but to be sure:  Which
> >>>> network
> >>>> card driver do you use?
> >>>>       
> >>> I think you can safely rule out NIC, I'm also seeing this behaviour on a
> >>> brand new server with imap hanging in some busy-loop.
> >>> Network card in my case:
> >>> Broadcom Corporation NetXtreme II BCM5708 Gigabit Ethernet (rev 12)
> >>>
> >>> What I observed was one CPU doing 100% system work, and the number of
> >>> timer interrupts went from 1k per second to 4k (for the whole system).
> >>>
> > Could you try the attached patch?
> > It should fix the bug.
> 
> I just built 2.6.27.9-rc1 and disconnected the Windowz box several times.
> For now smbd does not seem to go into a death spin any more, even though
> as far as I can tell .9-rc1 does not contain Manfred's latest patch. Not
> sure what that means, if anything.

.9-rc1 does contain a cifs patch, so perhaps that resolved the issue for
you.

thanks,

greg k-h


* Re: Nasty regression from .27.7 to .27.8: idle samba goes crazy
  2008-12-10 17:37             ` Manfred Spraul
  2008-12-11 22:54               ` Holger Hoffstätte
@ 2008-12-12  1:08               ` Jan Rekorajski
  2008-12-12 18:26                 ` Jan Rekorajski
  2008-12-12 14:01               ` Holger Hoffstätte
  2 siblings, 1 reply; 21+ messages in thread
From: Jan Rekorajski @ 2008-12-12  1:08 UTC (permalink / raw)
  To: Manfred Spraul; +Cc: Holger Hoffstaette, Stefan Richter, linux-kernel, stable

On Wed, 10 Dec 2008, Manfred Spraul wrote:

>> On Mon, 8 Dec 2008 23:22:46 +0100
>> Jan Rekorajski <baggins@sith.mimuw.edu.pl> wrote:
>>
>>> I think you can safely rule out NIC, I'm also seeing this behaviour on a
>>> brand new server with imap hanging in some busy-loop.
>>> Network card in my case:
>>> Broadcom Corporation NetXtreme II BCM5708 Gigabit Ethernet (rev 12)
>>>
>>> What I observed was one CPU doing 100% system work, and the number of
>>> timer interrupts went from 1k per second to 4k (for the whole system).
>>>
>>>     
> Could you try the attached patch?
> It should fix the bug.

Thank you, I'm currently running 2.6.27.8 with your patch, I'll report
after 12-24 hours.

Jan
-- 
Jan Rekorajski            |  ALL SUSPECTS ARE GUILTY. PERIOD!
baggins<at>mimuw.edu.pl   |  OTHERWISE THEY WOULDN'T BE SUSPECTS, WOULD THEY?
BOFH, MANIAC              |                   -- TROOPS by Kevin Rubio


* Re: [stable] Nasty regression from .27.7 to .27.8: idle samba goes crazy
  2008-12-11 23:40                 ` [stable] " Greg KH
@ 2008-12-12 10:59                   ` Holger Hoffstätte
  0 siblings, 0 replies; 21+ messages in thread
From: Holger Hoffstätte @ 2008-12-12 10:59 UTC (permalink / raw)
  To: Greg KH; +Cc: Manfred Spraul, Stefan Richter, linux-kernel, stable

Greg KH wrote:
> On Thu, Dec 11, 2008 at 11:54:12PM +0100, Holger Hoffstätte wrote:
> [samba's smbd going into spin of death after client disconnect]
> 
> .9-rc1 does contain a cifs patch, so perhaps that resolved the issue for
> you.

I spoke too soon: no, it didn't help as it happened again last night. I
don't see how CIFS could have helped, as userlevel smbd has AFAIK nothing
to do with the CIFS kernel module?

Apparently only pulling the client cable does NOT provoke the bug (tried
several times), whereas putting the client box into sleep mode does
(though that worked at least once as well). This time it also didn't
reboot properly without hard power-off. :/

Now running with Manfred's patch to idr.c; will report back if it happens
again.

Holger


* Re: Nasty regression from .27.7 to .27.8: idle samba goes crazy
  2008-12-10 17:37             ` Manfred Spraul
  2008-12-11 22:54               ` Holger Hoffstätte
  2008-12-12  1:08               ` Jan Rekorajski
@ 2008-12-12 14:01               ` Holger Hoffstätte
  2 siblings, 0 replies; 21+ messages in thread
From: Holger Hoffstätte @ 2008-12-12 14:01 UTC (permalink / raw)
  To: Manfred Spraul; +Cc: Stefan Richter, linux-kernel, stable

Manfred Spraul wrote:
> Could you try the attached patch?
> It should fix the bug.

After applying Manfred's patch to .9-rc1, it *seems* that the problem is
gone. I have put the Windows client to sleep several times and after the
20-something minutes timeout smbd reports the error ("No route to host")
but does not go into a spinloop any more.
I'll continue to test this, but as far as I'm concerned this should go
into stable.9 as well (not sure if it's already in rc2).

thanks!
Holger


* Re: Nasty regression from .27.7 to .27.8: idle samba goes crazy
  2008-12-12  1:08               ` Jan Rekorajski
@ 2008-12-12 18:26                 ` Jan Rekorajski
  0 siblings, 0 replies; 21+ messages in thread
From: Jan Rekorajski @ 2008-12-12 18:26 UTC (permalink / raw)
  To: Manfred Spraul, Stefan Richter, linux-kernel, stable

On Fri, 12 Dec 2008, Jan Rekorajski wrote:

> On Wed, 10 Dec 2008, Manfred Spraul wrote:
> 
> >> On Mon, 8 Dec 2008 23:22:46 +0100
> >> Jan Rekorajski <baggins@sith.mimuw.edu.pl> wrote:
> >>
> >>> I think you can safely rule out NIC, I'm also seeing this behaviour on a
> >>> brand new server with imap hanging in some busy-loop.
> >>> Network card in my case:
> >>> Broadcom Corporation NetXtreme II BCM5708 Gigabit Ethernet (rev 12)
> >>>
> >>> What I observed was one CPU doing 100% system work, and the number of
> >>> timer interrupts went from 1k per second to 4k (for the whole system).
> >>>
> >>>     
> > Could you try the attached patch?
> > It should fix the bug.
> 
> Thank you, I'm currently running 2.6.27.8 with your patch, I'll report
> after 12-24 hours.

top - 19:20:59 up 17:22, 34 users,  load average: 0.34, 0.41, 0.33

So, it seems that your patch cured my problem, as that server couldn't
survive more than 8 hours previously (2-3 was the norm).

Jan
-- 
Jan Rekorajski            |  ALL SUSPECTS ARE GUILTY. PERIOD!
baggins<at>mimuw.edu.pl   |  OTHERWISE THEY WOULDN'T BE SUSPECTS, WOULD THEY?
BOFH, MANIAC              |                   -- TROOPS by Kevin Rubio

