linux-kernel.vger.kernel.org archive mirror
* rsync hangs on RedHat 2.4.2 or stock 2.4.4
@ 2001-06-12 13:59 Jeremy Sanders
  2001-06-12 14:09 ` Rasmus Andersen
                   ` (3 more replies)
  0 siblings, 4 replies; 11+ messages in thread
From: Jeremy Sanders @ 2001-06-12 13:59 UTC (permalink / raw)
  To: linux-kernel; +Cc: rsync-bugs

I'm getting numerous rsync (v2.4.6) problems under Linux 2.4.2 (RedHat
7.1) or stock 2.4.4 on several machines. rsync often hangs copying files
from NFS or local disks to local disks. Strangely the problem is fixed by
stracing one of the three rsync threads!

I've encountered the problem just rsyncing the linux 2.4.4 kernel source
tree to a new (blank) directory.

rsync -raxv /data/jss/sysadmin/linux-2.4.4/linux .

The problem is repeatable with this source tree (which includes binaries)
on several machines (one PII machine and an Athlon). The problem also
exists copying the stock Linux 2.4.5 source tree (download it to reproduce
the problem). It hangs on linux/scripts/ver_linux in that case.

For example:

xpc6:~> rsync -raxv /data/jss/sysadmin/linux-2.4.4/linux /tmp/kernel/
[....]
linux/scripts/tkparse.c
linux/scripts/tkparse.h
linux/scripts/ver_linux
linux/vmlinux
[hangs here, for at least several hours]

(switch to another window)
xpc6:~> ps auxw|grep rsync
jss       3165 10.9  1.7  3144 2272 pts/0    S    14:20   0:19 rsync -raxv /data/jss/sysadmin/linux-2.4.4/linux .
jss       3166  1.1  1.7  3128 2216 pts/0    S    14:20   0:02 rsync -raxv /data/jss/sysadmin/linux-2.4.4/linux .
jss       3167 10.4  1.7  3136 2236 pts/0    S    14:20   0:18 rsync -raxv /data/jss/sysadmin/linux-2.4.4/linux .

xpc6:~> su
[blah]
[root@xpc6 jss]# strace -p 3165
select(0, NULL, NULL, NULL, {0, 10000}) = 0 (Timeout)
gettimeofday({992352238, 401281}, NULL) = 0
wait4(3166, 0xbfffdd80, WNOHANG, NULL)  = 0
gettimeofday({992352238, 401846}, NULL) = 0
gettimeofday({992352238, 402088}, NULL) = 0
select(0, NULL, NULL, NULL, {0, 20000}) = 0 (Timeout)
gettimeofday({992352238, 420838}, NULL) = 0
select(0, NULL, NULL, NULL, {0, 2000})  = 0 (Timeout)
gettimeofday({992352238, 431066}, NULL) = 0
wait4(3166, 0xbfffdd80, WNOHANG, NULL)  = 0
gettimeofday({992352238, 431568}, NULL) = 0
gettimeofday({992352238, 431809}, NULL) = 0
select(0, NULL, NULL, NULL, {0, 20000}) = 0 (Timeout)
[lots more of these]

[root@xpc6 jss]# strace -p 3166
  [program starts working again]
select(2, NULL, [1], NULL, {17, 860000}) = 1 (out [1], left {17, 830000})
write(1, "\27\0\0\tlinux/arch/ia64/sn/io/\n", 27) = 27
select(6, [3 5], NULL, NULL, {60, 0})   = 1 (in [5], left {60, 0})
select(6, [5], NULL, NULL, {60, 0})     = 1 (in [5], left {60, 0})
read(5, "\30\0\0\t", 4)                 = 4
select(6, [5], NULL, NULL, {60, 0})     = 1 (in [5], left {60, 0})
read(5, "linux/arch/ia64/sn/sn1/\n", 24) = 24
select(2, NULL, [1], NULL, {60, 0})     = 1 (out [1], left {60, 0})
write(1, "\30\0\0\tlinux/arch/ia64/sn/sn1/\n", 28) = 28
select(6, [3 5], NULL, NULL, {60, 0})   = 1 (in [5], left {60, 0})
select(6, [5], NULL, NULL, {60, 0})     = 1 (in [5], left {60, 0})
read(5, "\32\0\0\t", 4)                 = 4
select(6, [5], NULL, NULL, {60, 0})     = 1 (in [5], left {60, 0})
read(5, "linux/arch/ia64/sn/tools/\n", 26) = 26
select(2, NULL, [1], NULL, {60, 0})     = 1 (out [1], left {60, 0})
write(1, "\32\0\0\tlinux/arch/ia64/sn/tools/\n", 30) = 30
select(6, [3 5], NULL, NULL, {60, 0})   = 1 (in [5], left {60, 0})
select(6, [5], NULL, NULL, {60, 0})     = 1 (in [5], left {60, 0})
read(5, "\27\0\0\t", 4)                 = 4
select(6, [5], NULL, NULL, {60, 0})     = 1 (in [5], left {60, 0})
[lots more]
[program finishes]
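
The first trace (pid 3165) looks like a process that sleeps briefly via select() with no descriptors and then polls its child with wait4(..., WNOHANG). A minimal Python sketch of that pattern follows. It only illustrates the syscall sequence seen in the trace and is not rsync's code; the names spawn_sleeper and wait_for_child are hypothetical.

```python
import os
import select
import time


def spawn_sleeper(seconds, exit_code):
    """Fork a child that sleeps, then exits with exit_code (hypothetical helper)."""
    pid = os.fork()
    if pid == 0:
        # Child: do nothing useful, then exit without running cleanup handlers.
        time.sleep(seconds)
        os._exit(exit_code)
    return pid


def wait_for_child(pid, interval=0.02, timeout=5.0):
    """Poll a child without blocking, sleeping between polls.

    Returns (exit_code, polls). exit_code is None on timeout or if the
    child was killed by a signal.
    """
    deadline = time.monotonic() + timeout
    polls = 0
    while time.monotonic() < deadline:
        # Same idea as wait4(pid, ..., WNOHANG): returns (0, 0) while the
        # child is still running.
        done_pid, status = os.waitpid(pid, os.WNOHANG)
        polls += 1
        if done_pid == pid:
            code = os.WEXITSTATUS(status) if os.WIFEXITED(status) else None
            return code, polls
        # select() with no descriptors is just a sub-second sleep, as in
        # select(0, NULL, NULL, NULL, {0, 20000}) in the trace.
        select.select([], [], [], interval)
    return None, polls
```

A loop like this never notices a stuck child on its own: as long as the child stays alive, every poll returns 0, which matches the endless wait4/select pairs in the trace.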

Has anyone else encountered this problem? Is it a kernel problem or an
rsync problem?

Please CC answers to me, as I'm not on linux-kernel.

Jeremy

-- 
Jeremy Sanders <jss@ast.cam.ac.uk>  http://www-xray.ast.cam.ac.uk/~jss/
Pembroke College, Cambridge. UK   Institute of Astronomy, Cambridge. UK





* Re: rsync hangs on RedHat 2.4.2 or stock 2.4.4
  2001-06-12 13:59 rsync hangs on RedHat 2.4.2 or stock 2.4.4 Jeremy Sanders
@ 2001-06-12 14:09 ` Rasmus Andersen
  2001-06-12 14:17   ` Disconnect
  2001-06-12 14:47 ` Russell King
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 11+ messages in thread
From: Rasmus Andersen @ 2001-06-12 14:09 UTC (permalink / raw)
  To: Jeremy Sanders; +Cc: linux-kernel, rsync-bugs

On Tue, Jun 12, 2001 at 02:59:12PM +0100, Jeremy Sanders wrote:
> I'm getting numerous rsync (v2.4.6) problems under Linux 2.4.2 (RedHat
> 7.1) or stock 2.4.4 on several machines. rsync often hangs copying files
> from NFS or local disks to local disks. Strangely the problem is fixed by
> stracing one of the three rsync threads!
> 
[...]
> Has anyone else encountered this problem? Is it a kernel problem or an
> rsync problem?

I encountered this exact problem some time ago. There was some
discussion, but in the end the problem was blamed on rsync and nothing
came of it. I'll post a URL to the thread later on when I have the
time to dig it out.

I could swear that during early 2.4.0-testX this was not a problem,
but when I finally made a report about it and tried to go back
through earlier kernels, I could not reproduce. Also, this is
not reproducible under 2.2.X (for me, at least).

Regards,
  Rasmus


* Re: rsync hangs on RedHat 2.4.2 or stock 2.4.4
  2001-06-12 14:09 ` Rasmus Andersen
@ 2001-06-12 14:17   ` Disconnect
  0 siblings, 0 replies; 11+ messages in thread
From: Disconnect @ 2001-06-12 14:17 UTC (permalink / raw)
  To: linux-kernel, rsync-bugs

On Tue, 12 Jun 2001, Rasmus Andersen did have cause to say:

> On Tue, Jun 12, 2001 at 02:59:12PM +0100, Jeremy Sanders wrote:
> > I'm getting numerous rsync (v2.4.6) problems under Linux 2.4.2 (RedHat
> > 7.1) or stock 2.4.4 on several machines. rsync often hangs copying files
> 
> I could swear that during early 2.4.0-testX this was not a problem,
> but when I finally made a report about it and tried to go back
> through earlier kernels, I could not reproduce. Also, this is
> not reproducible under 2.2.X (for me, at least).

Just a 'me too!' but I'm inclined to think 'rsync bug' because it happens
on Redhat+2.4.x, Debian+2.4.x and Debian+2.2.18 - we finally gave up on
rsync for big-stuff-site-to-site and went back to scp. (It was -way-
faster to scp 4 gigs than to rsync the 50 megs or so of changes. It would
run, then freeze (usually at different places - if it froze twice in the
same place we'd just scp the file manually), so we'd wander past and
kill/restart it, repeat. Fastest total was 4 days, where the two of us
checked it every couple of hours over the weekend.)

We're using (or trying to use) it in a real-life big-data environment,
so if you need debuggers, more info, etc., let me know.

(I'm on LKML but not rsync-bugs, so cc me from that side.. thanks!)
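
The manual "wander past and kill/restart it" routine described above could be approximated with a small watchdog. The sketch below is only an assumption about how one might do that: it treats a long silence on the command's output as a stall, which is only meaningful when the command is verbose (such as rsync with -v). The name run_until_stalled is hypothetical, and nothing here fixes the underlying hang.

```python
import os
import select
import subprocess


def run_until_stalled(cmd, stall_seconds):
    """Run cmd and kill it if it prints nothing for stall_seconds.

    Returns (exit_code, stalled). exit_code is None when the command
    was killed because it went silent.
    """
    with subprocess.Popen(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT
    ) as proc:
        fd = proc.stdout.fileno()
        while True:
            # Wait for output, but no longer than the stall limit.
            ready, _, _ = select.select([fd], [], [], stall_seconds)
            if not ready:
                proc.kill()  # leaving the with-block reaps the process
                return None, True
            if not os.read(fd, 65536):
                # Empty read means end of output: the command finished.
                return proc.wait(), False
```

Calling run_until_stalled(["rsync", "-raxv", src, dst], 600) in a retry loop would mimic the manual procedure, where src, dst and the 600-second limit are placeholders to adjust.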

---
-----BEGIN GEEK CODE BLOCK-----
Version: 3.1 [www.ebb.org/ungeek]
GIT/CC/CM/AT d--(-)@ s+:-- a-->? C++++$ ULBS*++++$ P- L+++>+++++ 
E--- W+++ N+@ o+>$ K? w--->+++++ O- M V-- PS+() PE Y+@ PGP++() t
5--- X-- R tv+@ b++++>$ DI++++ D++(+++) G++ e* h(-)* r++ y++
------END GEEK CODE BLOCK------


* Re: rsync hangs on RedHat 2.4.2 or stock 2.4.4
  2001-06-12 13:59 rsync hangs on RedHat 2.4.2 or stock 2.4.4 Jeremy Sanders
  2001-06-12 14:09 ` Rasmus Andersen
@ 2001-06-12 14:47 ` Russell King
  2001-06-12 15:00 ` David S. Miller
  2001-06-19  9:41 ` Jeremy Sanders
  3 siblings, 0 replies; 11+ messages in thread
From: Russell King @ 2001-06-12 14:47 UTC (permalink / raw)
  To: Jeremy Sanders; +Cc: linux-kernel

On Tue, Jun 12, 2001 at 02:59:12PM +0100, Jeremy Sanders wrote:
> I'm getting numerous rsync (v2.4.6) problems under Linux 2.4.2 (RedHat
> 7.1) or stock 2.4.4 on several machines. rsync often hangs copying files
> from NFS or local disks to local disks. Strangely the problem is fixed by
> stracing one of the three rsync threads!

<aol>me too!</aol> but I got shafted because I was using 2.2.15pre13 on
the machine rsync was pushing the data to, and this was the problem.

However, I can confirm that your symptoms are _precisely_ identical to
mine - when rsync locks up, stracing it on the 2.4.2 end causes it to start
up again.

At the time I suggested it was because of a missing wakeup in 2.4.2 kernels,
but I was shouted down for using 2.2.15pre13.  Since then I've seen these
reports appear on lkml several times, each time without a solution or
explanation.

Oh, and yes, we're still using the same setup here at work, and it's running
fine now - no rsync lockups.  I'm not sure why that is. ;(

--
Russell King (rmk@arm.linux.org.uk)                The developer of ARM Linux
             http://www.arm.linux.org.uk/personal/aboutme.html



* Re: rsync hangs on RedHat 2.4.2 or stock 2.4.4
  2001-06-12 13:59 rsync hangs on RedHat 2.4.2 or stock 2.4.4 Jeremy Sanders
  2001-06-12 14:09 ` Rasmus Andersen
  2001-06-12 14:47 ` Russell King
@ 2001-06-12 15:00 ` David S. Miller
  2001-06-12 15:01   ` Jeremy Sanders
                     ` (2 more replies)
  2001-06-19  9:41 ` Jeremy Sanders
  3 siblings, 3 replies; 11+ messages in thread
From: David S. Miller @ 2001-06-12 15:00 UTC (permalink / raw)
  To: Russell King; +Cc: Jeremy Sanders, linux-kernel


Russell King writes:
 > At the time I suggested it was because of a missing wakeup in 2.4.2 kernels,
 > but I was shouted down for using 2.2.15pre13.  Since then I've seen these
 > reports appear on lkml several times, each time without a solution or
 > explanation.
 > 
 > Oh, and yes, we're still using the same setup here at work, and it's running
 > fine now - no rsync lockups.  I'm not sure why that is. ;(

Look everyone, it was determined to be a deadlock because of some
interaction between how rsync sets up its communication channels
with the ssh subprocess, read as: userland bug.

I don't remember if the specific problem was in rsync itself or some
buggy version of ssh.  One can search the list archives to discover
Alexey's full analysis of the problem.  I don't have a URL handy.

Later,
David S. Miller
davem@redhat.com
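
As general background on how pipe setups can deadlock: a pipe holds only a bounded amount of unread data, so a writer blocks once the reader stops draining it, and two processes that are each blocked writing to the other never make progress. The sketch below only measures that bound by using a non-blocking write end. It does not reproduce the rsync hang or the analysis referred to above, and pipe_capacity is a hypothetical name.

```python
import os


def pipe_capacity(chunk=4096):
    """Return how many bytes fit into a pipe that nobody is reading."""
    read_end, write_end = os.pipe()
    # Non-blocking mode turns "would block forever" into an exception,
    # so the full-pipe condition can be observed without hanging.
    os.set_blocking(write_end, False)
    total = 0
    try:
        while True:
            try:
                total += os.write(write_end, b"x" * chunk)
            except BlockingIOError:
                # The pipe is full; a blocking writer would now sleep
                # until the other side reads something.
                return total
    finally:
        os.close(read_end)
        os.close(write_end)
```

With a blocking write end, the final write would simply never return unless the peer read some data, which is the state a deadlocked pair of processes ends up in.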


* Re: rsync hangs on RedHat 2.4.2 or stock 2.4.4
  2001-06-12 15:00 ` David S. Miller
@ 2001-06-12 15:01   ` Jeremy Sanders
  2001-06-12 15:09   ` Stephen Frost
  2001-06-12 15:23   ` Disconnect
  2 siblings, 0 replies; 11+ messages in thread
From: Jeremy Sanders @ 2001-06-12 15:01 UTC (permalink / raw)
  To: David S. Miller; +Cc: Russell King, linux-kernel

On Tue, 12 Jun 2001, David S. Miller wrote:

>
> Russell King writes:
>  > At the time I suggested it was because of a missing wakeup in 2.4.2 kernels,
>  > but I was shouted down for using 2.2.15pre13.  Since then I've seen these
>  > reports appear on lkml several times, each time without a solution or
>  > explanation.
>  >
>  > Oh, and yes, we're still using the same setup here at work, and it's running
>  > fine now - no rsync lockups.  I'm not sure why that is. ;(
>
> Look everyone, it was determined to be a deadlock because of some
> interaction between how rsync sets up its communication channels
> with the ssh subprocess, read as: userland bug.

I'm not using ssh! This is from local disk to local disk!

Jeremy

-- 
Jeremy Sanders <jss@ast.cam.ac.uk>  http://www-xray.ast.cam.ac.uk/~jss/
Pembroke College, Cambridge. UK   Institute of Astronomy, Cambridge. UK



* Re: rsync hangs on RedHat 2.4.2 or stock 2.4.4
  2001-06-12 15:00 ` David S. Miller
  2001-06-12 15:01   ` Jeremy Sanders
@ 2001-06-12 15:09   ` Stephen Frost
  2001-06-12 15:23   ` Disconnect
  2 siblings, 0 replies; 11+ messages in thread
From: Stephen Frost @ 2001-06-12 15:09 UTC (permalink / raw)
  To: David S. Miller; +Cc: Russell King, Jeremy Sanders, linux-kernel


* David S. Miller (davem@redhat.com) wrote:
> 
> Russell King writes:
>  > At the time I suggested it was because of a missing wakeup in 2.4.2 kernels,
>  > but I was shouted down for using 2.2.15pre13.  Since then I've seen these
>  > reports appear on lkml several times, each time without a solution or
>  > explanation.
>  > 
>  > Oh, and yes, we're still using the same setup here at work, and it's running
>  > fine now - no rsync lockups.  I'm not sure why that is. ;(
> 
> Look everyone, it was determined to be a deadlock because of some
> interaction between how rsync sets up its communication channels
> with the ssh subprocess, read as: userland bug.
> 
> I don't remember if the specific problem was in rsync itself or some
> buggy version of ssh.  One can search the list archives to discover
> Alexey's full analysis of the problem.  I don't have a URL handy.

	I have to say I find this likely to be the case for those who are
	having issues with rsync over ssh.  I was recently playing with
	rsync over ssh (newer openssh to older openssh) and was just using
	it as a pass-through to another machine.

	When I replaced ssh with rinetd, everything worked fine.  I haven't
	had a chance yet (though I'd like to) to try with two recent versions
	of ssh but I'm curious what the result will be.  It may be that the
	problem has been fixed in later versions of ssh.

			Stephen



* Re: rsync hangs on RedHat 2.4.2 or stock 2.4.4
  2001-06-12 15:00 ` David S. Miller
  2001-06-12 15:01   ` Jeremy Sanders
  2001-06-12 15:09   ` Stephen Frost
@ 2001-06-12 15:23   ` Disconnect
  2001-06-12 20:22     ` Rasmus Andersen
  2 siblings, 1 reply; 11+ messages in thread
From: Disconnect @ 2001-06-12 15:23 UTC (permalink / raw)
  To: linux-kernel

On Tue, 12 Jun 2001, David S. Miller did have cause to say:

> Look everyone, it was determined to be a deadlock because of some
> interaction between how rsync sets up its communication channels
> with the ssh subprocess, read as: userland bug.

....we're not using ssh. :(

---
-----BEGIN GEEK CODE BLOCK-----
Version: 3.1 [www.ebb.org/ungeek]
GIT/CC/CM/AT d--(-)@ s+:-- a-->? C++++$ ULBS*++++$ P- L+++>+++++ 
E--- W+++ N+@ o+>$ K? w--->+++++ O- M V-- PS+() PE Y+@ PGP++() t
5--- X-- R tv+@ b++++>$ DI++++ D++(+++) G++ e* h(-)* r++ y++
------END GEEK CODE BLOCK------


* Re: rsync hangs on RedHat 2.4.2 or stock 2.4.4
  2001-06-12 15:23   ` Disconnect
@ 2001-06-12 20:22     ` Rasmus Andersen
  0 siblings, 0 replies; 11+ messages in thread
From: Rasmus Andersen @ 2001-06-12 20:22 UTC (permalink / raw)
  To: Disconnect; +Cc: linux-kernel

On Tue, Jun 12, 2001 at 11:23:02AM -0400, Disconnect wrote:
> On Tue, 12 Jun 2001, David S. Miller did have cause to say:
> 
> > Look everyone, it was determined to be a deadlock because of some
> > interaction between how rsync sets up its communication channels
> > with the ssh subprocess, read as: userland bug.
> 
> ....we're not using ssh. :(

Neither am/was I. The rsync is within a single FS.

Aside from that, here is the original bug report by Matthias 
Schniedermeyer: 
http://marc.theaimsgroup.com/?l=linux-kernel&m=98157768131423&w=2
with no reply.

My report:
http://marc.theaimsgroup.com/?l=linux-kernel&m=98262067309185&w=2
with myself replying :) 

Russell King's report: 
http://marc.theaimsgroup.com/?l=linux-kernel&m=98326853429463&w=2
which gave a fair amount of discussion.

Note that in my report I state that the problem cannot be seen
with smaller workloads; I have to try with at least drivers/*
before it shows itself. Hmm, just tried with 2.4.3-ac12 (yeah,
I'm way behind). I have to try the full tree now, drivers/*
won't do it anymore.
-- 
Regards,
        Rasmus(rasmus@jaquet.dk)

"An intellectual is someone who has been educated beyond their intelligence."


* Re: rsync hangs on RedHat 2.4.2 or stock 2.4.4
  2001-06-12 13:59 rsync hangs on RedHat 2.4.2 or stock 2.4.4 Jeremy Sanders
                   ` (2 preceding siblings ...)
  2001-06-12 15:00 ` David S. Miller
@ 2001-06-19  9:41 ` Jeremy Sanders
  2001-06-21 19:38   ` Rasmus Andersen
  3 siblings, 1 reply; 11+ messages in thread
From: Jeremy Sanders @ 2001-06-19  9:41 UTC (permalink / raw)
  To: linux-kernel

I've found a patch which fixes the hanging problem, so I guess it's not
linux-kernel which is at fault. Get it from Wayne Davison at:

 http://www.clari.net/~wayne/rsync-nohang.patch

Jeremy

-- 
Jeremy Sanders <jss@ast.cam.ac.uk>  http://www-xray.ast.cam.ac.uk/~jss/
Pembroke College, Cambridge. UK   Institute of Astronomy, Cambridge. UK




* Re: rsync hangs on RedHat 2.4.2 or stock 2.4.4
  2001-06-19  9:41 ` Jeremy Sanders
@ 2001-06-21 19:38   ` Rasmus Andersen
  0 siblings, 0 replies; 11+ messages in thread
From: Rasmus Andersen @ 2001-06-21 19:38 UTC (permalink / raw)
  To: Jeremy Sanders; +Cc: linux-kernel

On Tue, Jun 19, 2001 at 10:41:51AM +0100, Jeremy Sanders wrote:
> I've found a patch which fixes the hanging problem, so I guess it's not
> linux-kernel which is at fault. Get it from Wayne Davison at:
 
Works for me too. 


-- 
        Rasmus(rasmus@jaquet.dk)

"Men kick friendship around like a football, but it doesn't seem to
 crack. Women treat it like glass and it goes to pieces."
  -- Anne Spencer Morrow Lindbergh
