linux-kernel.vger.kernel.org archive mirror
* NFS insanity
@ 2001-06-20 23:23 Christian Robottom Reis
  2001-06-21 13:59 ` [reiserfs-list] " Chris Mason
  2001-06-21 15:10 ` [NFS] " Trond Myklebust
  0 siblings, 2 replies; 7+ messages in thread
From: Christian Robottom Reis @ 2001-06-20 23:23 UTC (permalink / raw)
  To: NFS; +Cc: linux-kernel, reiserfs-list


I've got an NFS server, version 2.4.4, using reiserfs with trond's NFS
patches and the reiser-2.4.4 nfs patch.

On a client running 2.4.5 with trond's patches and the corresponding
reiser patches, I get the weirdest behaviour:

# on client
cp libgkcontent.so libgkcontent.so.x
diff libgkcontent.so libgkcontent.so.x
# no diff

# on server
diff libgkcontent.so libgkcontent.so.x
Binary files libgkcontent.so and libgkcontent.so.x differ

It _only_ happens in this file of all files I've tried out so far. I'm
trying to get xdelta to show me what's differing so I can see if there's a
pattern or something, but it's awful - data corruption not only possible
but happening. :-)
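
Short of xdelta, a byte-level comparison run on the server is one way to see what differs. The sketch below uses `cmp -l`, which lists the 1-based offset of each differing byte, and then groups those offsets into 3072-byte chunks to show whether the damage lines up with the transfer size. The two small files are hypothetical stand-ins for libgkcontent.so and its copy, and the 3072-byte chunk size is an assumption taken from the rsize/wsize mount options quoted later in this thread.

```shell
# Hypothetical stand-ins for libgkcontent.so and libgkcontent.so.x on the server.
tmpdir=$(mktemp -d)
printf 'abcdefgh' > "$tmpdir/orig"
printf 'abcXefgh' > "$tmpdir/copy"

# cmp -l prints one line per differing byte:
# the 1-based offset, then the byte from each file in octal.
# cmp exits non-zero when the files differ, hence "|| true".
diff_lines=$(cmp -l "$tmpdir/orig" "$tmpdir/copy" || true)

# Count the differing bytes and note where the first one sits.
diff_count=$(printf '%s\n' "$diff_lines" | grep -c . || true)
first_offset=$(printf '%s\n' "$diff_lines" | awk 'NR==1 {print $1}')
echo "differing bytes: $diff_count, first at offset $first_offset"

# Group offsets by 3072-byte chunk (assumed chunk size) to look for a pattern.
chunk_summary=$(printf '%s\n' "$diff_lines" |
    awk 'NF {n[int(($1-1)/3072)]++} END {for (c in n) print "chunk " c ": " n[c]}')
echo "$chunk_summary"

rm -rf "$tmpdir"
```

On the real files, replace the two temporary paths with the original and the copy. Differences that always start or stop on a chunk boundary would point at whole transfers being lost or misplaced rather than at random byte damage.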

I haven't tried remounting yet to see what I get, but I don't see the
problems on unpatched 2.4.2. I'll wait a bit to see if anyone has seen
this. Anyone?

Take care,
--
/\/\ Christian Reis, Senior Engineer, Async Open Source, Brazil
~\/~ http://async.com.br/~kiko/ | [+55 16] 274 4311




* Re: [reiserfs-list] NFS insanity
  2001-06-20 23:23 NFS insanity Christian Robottom Reis
@ 2001-06-21 13:59 ` Chris Mason
  2001-06-21 14:58   ` Trond Myklebust
  2001-06-21 15:10 ` [NFS] " Trond Myklebust
  1 sibling, 1 reply; 7+ messages in thread
From: Chris Mason @ 2001-06-21 13:59 UTC (permalink / raw)
  To: Christian Robottom Reis, NFS; +Cc: linux-kernel, reiserfs-list



On Wednesday, June 20, 2001 08:23:06 PM -0300 Christian Robottom Reis
<kiko@async.com.br> wrote:

> 
> I've got an NFS server, version 2.4.4, using reiserfs with trond's NFS
> patches and the reiser-2.4.4 nfs patch.
> 
> On a client running 2.4.5 with trond's patches and the corresponding
> reiser patches, I get the weirdest behaviour:
> 
> # on client
> cp libgkcontent.so libgkcontent.so.x
> diff libgkcontent.so libgkcontent.so.x
> # no diff
> 
> # on server
> diff libgkcontent.so libgkcontent.so.x
> Binary files libgkcontent.so and libgkcontent.so.x differ
> 
> It _only_ happens in this file of all files I've tried out so far. I'm
> trying to get xdelta to show me what's differing so I can see if there's a
> pattern or something, but it's awful - data corruption not only possible
> but happening. :-)
> 
> I haven't tried remounting yet to see what I get, but I don't see the
> problems on unpatched 2.4.2. I'll wait a bit to see if anyone has seen
> this. Anyone?

Sounds like some of the problems fixed in 2.4.5 and 2.4.6pre kernels, where
NFS data didn't get flushed right away, but I thought that only involved
mmap'd files.

-chris





* Re: [reiserfs-list] NFS insanity
  2001-06-21 13:59 ` [reiserfs-list] " Chris Mason
@ 2001-06-21 14:58   ` Trond Myklebust
  0 siblings, 0 replies; 7+ messages in thread
From: Trond Myklebust @ 2001-06-21 14:58 UTC (permalink / raw)
  To: Chris Mason; +Cc: Christian Robottom Reis, NFS, linux-kernel, reiserfs-list

>>>>> " " == Chris Mason <mason@suse.com> writes:

     > Sounds like some of the problems fixed in 2.4.5 and 2.4.6pre
     > kernels, where NFS data didn't get flushed right away, but I
     > thought that only involved mmap'd files.

Ordinary 'cp' should work fine: it uses 'read' and 'write' - not mmap.

Cheers,
  Trond


* Re: [NFS] NFS insanity
  2001-06-20 23:23 NFS insanity Christian Robottom Reis
  2001-06-21 13:59 ` [reiserfs-list] " Chris Mason
@ 2001-06-21 15:10 ` Trond Myklebust
  2001-06-21 22:43   ` Christian Robottom Reis
  1 sibling, 1 reply; 7+ messages in thread
From: Trond Myklebust @ 2001-06-21 15:10 UTC (permalink / raw)
  To: Christian Robottom Reis; +Cc: NFS, linux-kernel, reiserfs-list

>>>>> " " == Christian Robottom Reis <kiko@async.com.br> writes:

     > It _only_ happens in this file of all files I've tried out so
     > far. I'm trying to get xdelta to show me what's differing so I
     > can see if there's a pattern or something, but it's awful -
     > data corruption not only possible but happening. :-)

     > I haven't tried remounting yet to see what I get, but I don't
     > see the problems on unpatched 2.4.2. I'll wait a bit to see if
     > anyone has seen this. Anyone?

Are you perchance using soft mounts?

Cheers,
  Trond


* Re: [NFS] NFS insanity
  2001-06-21 15:10 ` [NFS] " Trond Myklebust
@ 2001-06-21 22:43   ` Christian Robottom Reis
  2001-06-22 12:58     ` Trond Myklebust
  0 siblings, 1 reply; 7+ messages in thread
From: Christian Robottom Reis @ 2001-06-21 22:43 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: NFS, linux-kernel, reiserfs-list

On 21 Jun 2001, Trond Myklebust wrote:

>      > I haven't tried remounting yet to see what I get, but I don't
>      > see the problems on unpatched 2.4.2. I'll wait a bit to see if
>      > anyone has seen this. Anyone?
>
> Are you perchance using soft mounts?

No:

anthem:/mondo   /mondo  nfs defaults,rsize=3072,wsize=3072,suid,async 0 0

Async is on, but it's there by default IIRC, right?

Take care,
--
/\/\ Christian Reis, Senior Engineer, Async Open Source, Brazil
~\/~ http://async.com.br/~kiko/ | [+55 16] 274 4311



* Re: [NFS] NFS insanity
  2001-06-21 22:43   ` Christian Robottom Reis
@ 2001-06-22 12:58     ` Trond Myklebust
  2001-06-22 15:57       ` Christian Robottom Reis
  0 siblings, 1 reply; 7+ messages in thread
From: Trond Myklebust @ 2001-06-22 12:58 UTC (permalink / raw)
  To: Christian Robottom Reis; +Cc: NFS, linux-kernel, reiserfs-list

>>>>> " " == Christian Robottom Reis <kiko@async.com.br> writes:

     > anthem:/mondo /mondo nfs
     > defaults,rsize=3072,wsize=3072,suid,async 0 0

     > Async is on, but it's there by default IIRC, right?

Nope. The 'async' option is meaningless to the NFS client. Should make
no difference though, as it's never checked.

I'm a bit surprised about your choice of rsize and wsize. Although it
shouldn't make any difference, 3072 is not a natural size on an x86
machine. You usually want something that divides PAGE_CACHE_SIZE=4096.
Furthermore, on the Linux NFS client, any value < PAGE_CACHE_SIZE
means that you use synchronous writes (deferred writes are enabled
with wsize=4096 or greater).
The advantage in this case though, is that it means any error message
that was returned by the server was guaranteed to have been received
by 'cp', because the page was written to the server immediately.
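
The rule described above can be sketched as a small check. This is only an illustration of the behaviour as stated in this message, not the kernel's actual logic; `write_mode` and `divides_page` are hypothetical helper names, and PAGE_CACHE_SIZE is hard-coded to the x86 value of 4096 given above.

```shell
# PAGE_CACHE_SIZE on x86, as given in the message above.
PAGE_CACHE_SIZE=4096

# Hypothetical helper: report how a given wsize would be treated
# under the rule described above (below the page size means synchronous).
write_mode() {
    if [ "$1" -lt "$PAGE_CACHE_SIZE" ]; then
        echo "synchronous"
    else
        echo "deferred"
    fi
}

# Hypothetical helper: does the size divide PAGE_CACHE_SIZE evenly?
divides_page() {
    if [ $((PAGE_CACHE_SIZE % $1)) -eq 0 ]; then
        echo "yes"
    else
        echo "no"
    fi
}

for size in 1024 2048 3072 4096; do
    echo "wsize=$size: $(write_mode "$size") writes, divides page: $(divides_page "$size")"
done
```

Under this reading, 3072 is the odd one out in that list: it falls on the synchronous side of the threshold and also does not divide the page size evenly.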

If I were you therefore, I'd use ethereal or tcpdump to sniff the NFS
traffic and check that the file indeed gets reproduced correctly on
the wire.
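
A capture along these lines might look like the sketch below. The interface name `eth0` and the output file name are assumptions, `anthem` is the server name from the fstab line quoted above, and 2049 is the usual NFS port. The snippet only builds and prints the command; running it normally requires root, and the resulting file can then be opened in ethereal.

```shell
# Assumed values: adjust the interface and file name for the client in question.
iface=eth0
server=anthem
capture=nfs-copy.pcap

# -i selects the interface, -s 0 asks for whole packets rather than
# truncated ones, -w writes the raw packets to a file for later inspection.
capture_cmd="tcpdump -i $iface -s 0 -w $capture host $server and port 2049"
echo "$capture_cmd"
```

Start the capture, repeat the `cp` on the client, stop the capture, and compare the WRITE payloads against the source file.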

Cheers,
  Trond


* Re: [NFS] NFS insanity
  2001-06-22 12:58     ` Trond Myklebust
@ 2001-06-22 15:57       ` Christian Robottom Reis
  0 siblings, 0 replies; 7+ messages in thread
From: Christian Robottom Reis @ 2001-06-22 15:57 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: NFS, linux-kernel, reiserfs-list

On 22 Jun 2001, Trond Myklebust wrote:

> I'm a bit surprised about your choice of rsize and wsize. Although it
> shouldn't make any difference, 3072 is not a natural size on an x86
> machine. You usually want something that divides PAGE_CACHE_SIZE=4096.
> Furthermore, on the Linux NFS client, any value < PAGE_CACHE_SIZE
> means that you use synchronous writes (deferred writes are enabled
> with wsize=4096 or greater).

Trond, your comment was very much appreciated. I got to this value after
stress testing the network install in the office: anything above that
value caused massive collisions on the hub and I just thought it would be
unhealthy to be forcing this sort of bustage onto the wire. Since 1k and 2k
performed worse and 4k caused collisions, I chose 3k. The tests
consisted of doing compiles and simple file operations (reading large mail
folders, in addition), which is what users doing everyday work here and
evaluating the performance of the filesystem look for.

I'm not _entirely_ sure my tests were sane, but is this a reasonable
explanation?

> The advantage in this case though, is that it means any error message
> that was returned by the server was guaranteed to have been received
> by 'cp', because the page was written to the server immediately.

And no error was reported; it was completely silent. I can no longer
reproduce this after the power outage we had yesterday forced a reboot on
the client. *sigh* It would have been nice to find out what it was.

Take care,
--
/\/\ Christian Reis, Senior Engineer, Async Open Source, Brazil
~\/~ http://async.com.br/~kiko/ | [+55 16] 274 4311



