* NFSERR_EAGAIN
@ 2003-07-10 18:29 Danny Smith
  2003-07-10 22:41 ` NFSERR_EAGAIN Trond Myklebust
  2003-07-16 15:53 ` NFSERR_EAGAIN - resolved Danny Smith
  0 siblings, 2 replies; 5+ messages in thread
From: Danny Smith @ 2003-07-10 18:29 UTC (permalink / raw)
  To: nfs; +Cc: Iain Irwin-Powell

I've been trying to resolve some issues we have with a set of systems 
running 2.4.20+NFS_ALL, dual CPU and Gigabit Ethernet. They're talking 
to SGI IRIX servers (6.5.19) and having intermittent problems - 
sometimes an NFS-mounted directory will "disappear", but subsequently 
be accessible. No errors are returned to the shell - the directory just 
appears to have no entries.

This seems (although I don't have proof positive yet, more testing is in 
progress) to coincide with errors in the logs:

Jul  9 14:18:09 trout-node13 kernel: nfs_stat_to_errno: bad nfs status return value: 11

Looking through nfs2xdr.c and nfs.h and googling, it seems that error 
number 11 is not properly defined, but certainly seems to be in use by 
SGI. From nfs2xdr.c:
 
      { NFSERR_NXIO,         ENXIO         },
/*    { NFSERR_EAGAIN,       EAGAIN        }, */
      { NFSERR_ACCES,        EACCES        },

(EAGAIN having value 11)

Does anyone know much about the history of this? Was this removed in 
order to be RFC compliant, or is there a stronger motivation not to have 
this?

It would maybe explain what I'm seeing if SGI are interpreting error 11 
as "Try again" - this mainly happens when server load is high.

Any insight into this would be welcome - meanwhile I'm going to try some 
tests with NFSERR_EAGAIN put back in.
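
For illustration, a minimal userspace sketch of the kind of table lookup discussed above. It is not the kernel's code: every sketch_* name is hypothetical, the tables hold only a few entries, and the fallback to EIO for an unknown status is an assumption of this sketch. The values 6 and 13 follow RFC 1094; 11 is the non-standard value seen in the log.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* NFSv2 status codes. 0, 6 and 13 follow RFC 1094; 11 is NOT in the RFC
 * and is simply the value reported in the log message above. */
enum {
    SKETCH_NFS_OK        = 0,
    SKETCH_NFSERR_NXIO   = 6,
    SKETCH_NFSERR_EAGAIN = 11,  /* non-standard, taken from the thread */
    SKETCH_NFSERR_ACCES  = 13
};

struct sketch_errmap {
    int stat;  /* status as received from the server */
    int err;   /* local errno to report for it */
};

/* Table with the EAGAIN entry left out, mirroring the quoted excerpt. */
static const struct sketch_errmap sketch_tbl_without[] = {
    { SKETCH_NFS_OK,        0      },
    { SKETCH_NFSERR_NXIO,   ENXIO  },
    { SKETCH_NFSERR_ACCES,  EACCES },
    { -1,                   EIO    }  /* terminator doubles as fallback */
};

/* The same table with the entry put back, as in the planned test. */
static const struct sketch_errmap sketch_tbl_with[] = {
    { SKETCH_NFS_OK,        0      },
    { SKETCH_NFSERR_NXIO,   ENXIO  },
    { SKETCH_NFSERR_EAGAIN, EAGAIN },
    { SKETCH_NFSERR_ACCES,  EACCES },
    { -1,                   EIO    }
};

/* Linear search; an unknown status is logged and mapped to the
 * terminator's errno (EIO here, an assumption of this sketch). */
static int sketch_stat_to_errno(const struct sketch_errmap *tbl, int stat)
{
    int i;

    assert(tbl != NULL);
    for (i = 0; tbl[i].stat != -1; i++) {
        if (tbl[i].stat == stat)
            return tbl[i].err;
    }
    fprintf(stderr,
            "sketch_stat_to_errno: bad nfs status return value: %d\n", stat);
    return tbl[i].err;
}
```

With sketch_tbl_without, a status of 11 falls through to the warning and the fallback errno; with sketch_tbl_with, the same status is reported to the caller as EAGAIN.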

Danny

-- 
Danny Smith
Senior Systems Administrator, Cinesite (Europe) Ltd
020 7973 4000 - x4055    /    dannys@cinesite.co.uk




-------------------------------------------------------
This SF.Net email sponsored by: Parasoft
Error proof Web apps, automate testing & more.
Download & eval WebKing and get a free book.
www.parasoft.com/bulletproofapps1
_______________________________________________
NFS maillist  -  NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs


* Re: NFSERR_EAGAIN
  2003-07-10 18:29 NFSERR_EAGAIN Danny Smith
@ 2003-07-10 22:41 ` Trond Myklebust
  2003-07-11  9:55   ` NFSERR_EAGAIN Danny Smith
  2003-07-16 15:53 ` NFSERR_EAGAIN - resolved Danny Smith
  1 sibling, 1 reply; 5+ messages in thread
From: Trond Myklebust @ 2003-07-10 22:41 UTC (permalink / raw)
  To: Danny Smith; +Cc: nfs, Iain Irwin-Powell

>>>>> " " == Danny Smith <dannys@cinesite.co.uk> writes:

     > Looking through nfs2xdr.c and nfs.h and googling, it seems that
     > error number 11 is not properly defined, but certainly seems to
     > be in use by SGI. From nfs2xdr.c:

     >      { NFSERR_NXIO, ENXIO },

     > /* { NFSERR_EAGAIN, EAGAIN }, */
     >       { NFSERR_ACCES, EACCES },

     > (EAGAIN having value 11)

     > Does anyone know much about the history of this? Was this
     > removed in order to be RFC compliant, or is there a stronger
     > motivation not to have this?

Why should we be supporting something which isn't documented in the RFCs?

What is this error anyway? Is it some SGI hack for emulating
NFS3ERR_JUKEBOX under NFSv2?

Cheers,
  Trond



* Re: NFSERR_EAGAIN
  2003-07-10 22:41 ` NFSERR_EAGAIN Trond Myklebust
@ 2003-07-11  9:55   ` Danny Smith
  2003-07-11 10:13     ` NFSERR_EAGAIN Trond Myklebust
  0 siblings, 1 reply; 5+ messages in thread
From: Danny Smith @ 2003-07-11  9:55 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: nfs, Iain Irwin-Powell

Trond Myklebust wrote:

>>>>>> " " == Danny Smith <dannys@cinesite.co.uk> writes:
>
>     > Looking through nfs2xdr.c and nfs.h and googling, it seems that
>     > error number 11 is not properly defined, but certainly seems to
>     > be in use by SGI. From nfs2xdr.c:
>
>     >      { NFSERR_NXIO, ENXIO },
>
>     > /* { NFSERR_EAGAIN, EAGAIN }, */
>     >       { NFSERR_ACCES, EACCES },
>
>     > (EAGAIN having value 11)
>
>     > Does anyone know much about the history of this? Was this
>     > removed in order to be RFC compliant, or is there a stronger
>     > motivation not to have this?
>
>Why should we be supporting something which isn't documented in the RFCs?
>
I really want to know how it got into the source in the first place. Is 
it a feature that's "not part of the standard, but seem to be widely 
used nevertheless" (quoting from nfs.h)?

Searching around shows a (very) few occurrences of this, with Solaris and 
IRIX servers, although often accompanied by other issues.

If the server is broken, but we can work with it anyway without 
upsetting anything else, this would be a "good thing" in my book (of 
course, we would take it up with SGI too).

>What is this error anyway? Is it some SGI hack for emulating
>NFS3ERR_JUKEBOX under NFSv2?
>  
>
Right now, I don't really know, but I'm guessing it's something of the 
sort (seems to make sense with what we're seeing).
I'm trying to get a packet trace to verify this is indeed what is being 
sent - will update when I have further evidence.

Danny

-- 
Danny Smith
Senior Systems Administrator, Cinesite (Europe) Ltd
020 7973 4000 - x4055    /    dannys@cinesite.co.uk





* Re: NFSERR_EAGAIN
  2003-07-11  9:55   ` NFSERR_EAGAIN Danny Smith
@ 2003-07-11 10:13     ` Trond Myklebust
  0 siblings, 0 replies; 5+ messages in thread
From: Trond Myklebust @ 2003-07-11 10:13 UTC (permalink / raw)
  To: Danny Smith; +Cc: nfs, Iain Irwin-Powell

>>>>> " " == Danny Smith <dannys@cinesite.co.uk> writes:

     > I really want to know how it got into the source in the first
     > place?

That's a question for Olaf Kirch. I've never touched that entry.

     > If the server is broken, but we can work with it anyway without
     > upsetting anything else, this would be a "good thing" in my
     > book (of course, we would take it up with SGI too).

Depends what it is. If it *is* jukebox, then we might as well just
recommend that people use NFSv3 (NFSv2 is legacy anyway). Jukebox is
only used for slow media (tape, cd-exchangers,...), so it's never
going to be a common case.
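
As a rough illustration of what "jukebox" handling means for a client: the server answers NFS3ERR_JUKEBOX (10008 in RFC 1813) to say the data is temporarily unavailable, and the client is expected to try again later. The sketch below is hypothetical throughout; the sketch_* names, the bounded retry count, and the wait callback are assumptions, not how any particular client implements it.

```c
#include <assert.h>
#include <stddef.h>

/* Status values as numbered in RFC 1813. */
enum { SKETCH_NFS3_OK = 0, SKETCH_NFS3ERR_JUKEBOX = 10008 };

/* Hypothetical callback: performs one request, returns an NFSv3 status. */
typedef int (*sketch_op_fn)(void *ctx);

/* Retry op while the server answers JUKEBOX, for at most max_tries
 * attempts. wait_fn (may be NULL) is called between attempts so the
 * caller can sleep or back off. Returns the last status seen. */
static int sketch_call_with_jukebox_retry(sketch_op_fn op, void *ctx,
                                          int max_tries,
                                          void (*wait_fn)(int attempt))
{
    int status = SKETCH_NFS3ERR_JUKEBOX;
    int attempt;

    assert(op != NULL && max_tries > 0);
    for (attempt = 0; attempt < max_tries; attempt++) {
        status = op(ctx);
        if (status != SKETCH_NFS3ERR_JUKEBOX)
            break;
        if (wait_fn != NULL && attempt + 1 < max_tries)
            wait_fn(attempt);
    }
    return status;
}

/* Hypothetical stand-in for a slow server: *ctx holds how many more
 * JUKEBOX replies it will give before answering OK. */
static int sketch_fake_op(void *ctx)
{
    int *busy_left = ctx;

    if (*busy_left > 0) {
        (*busy_left)--;
        return SKETCH_NFS3ERR_JUKEBOX;
    }
    return SKETCH_NFS3_OK;
}
```

Calling sketch_call_with_jukebox_retry(sketch_fake_op, &busy, 5, NULL) with busy set to 2 makes three requests and ends with SKETCH_NFS3_OK; if the fake server stays busy longer than max_tries, the JUKEBOX status is handed back to the caller.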

cheers,
  Trond





* Re: NFSERR_EAGAIN - resolved.
  2003-07-10 18:29 NFSERR_EAGAIN Danny Smith
  2003-07-10 22:41 ` NFSERR_EAGAIN Trond Myklebust
@ 2003-07-16 15:53 ` Danny Smith
  1 sibling, 0 replies; 5+ messages in thread
From: Danny Smith @ 2003-07-16 15:53 UTC (permalink / raw)
  To: nfs; +Cc: Iain Irwin-Powell

Danny Smith wrote:

> I've been trying to resolve some issues we have with a set of systems 
> running 2.4.20+NFS_ALL, dual CPU and Gigabit Ethernet. They're talking 
> to SGI IRIX servers (6.5.19) and having intermittent problems - 
> sometimes an NFS-mounted directory will "disappear", but subsequently 
> be accessible. No errors are returned to the shell - the directory just 
> appears to have no entries.
>
> This seems (although I don't have proof positive yet, more testing is 
> in progress) to coincide with errors in the logs:
>
> Jul  9 14:18:09 trout-node13 kernel: nfs_stat_to_errno: bad nfs status return value: 11
>
I've found the cause of these messages - SGI are in the clear - it's not 
coming from their servers.
It's 'amd' (am-utils) which is providing the incorrect responses.

The trigger seems to be an attempt to access an automount on a host 
which is not running an NFS server. amd then sends back a garbled 
response over the loopback interface (after getting several "RPC - 
program not registered" responses).

Whether this is causing our other problems remains to be seen.

Danny

-- 
Danny Smith
Senior Systems Administrator, Cinesite (Europe) Ltd
020 7973 4000 - x4055    /    dannys@cinesite.co.uk



