* SM_UNMON again -> kernel
@ 2003-07-11  6:16 Lawrence Ong
  2003-07-11  9:38 ` Trond Myklebust
  0 siblings, 1 reply; 16+ messages in thread
From: Lawrence Ong @ 2003-07-11  6:16 UTC (permalink / raw)
  To: nfs

Hi Everybody,

If someone can shed some light on why this annoying message keeps
appearing, it would be much appreciated.  I'm sure a lot of other
people out there are getting the same error message.

The Error in Syslog that we continually get is the common:

Received erroneous SM_UNMON request from mymachine for 10.1.1.20

The error does not break NFS or anything.  It's just a rather annoying
message that keeps appearing again and again.  Looking through the
various mailing lists did not help.  None of the solutions offered there
applies to kernels 2.4.19 through 2.4.21 or explains why the error occurs.

SYSTEM INFORMATION
------------------
The system that we're using runs the 2.4.21 kernel with
nfs-utils 1.0.3 on a Debian Woody system.  We have also tested this on
2.4.19 and 2.4.20 with and without Trond's NFS (FIXES/ALL) patches.  We
have also tested the system with nfs-utils version 1.0.0.

We noticed that the error occurs because, in monitor.c, the
value returned by NL_MY_NAME(clnt) is 127.0.0.1.  Is this supposed to be
correct?!

The NFS server has the IP address 10.1.1.20; mymachine is the client
with the IP address 10.1.1.180.  The client mounts the server as such:

10.1.1.20:/mountpath /mountpath nfs rsize=8192,wsize=8192,nfsvers=3,udp,noatime,hard,intr,bg 0 0

STEPPING THROUGH CODE
---------------------
Looking at utils/statd/monitor.c - we have lines that read:

--------------
while ((clnt = nlist_gethost(clnt, mon_name, 0))) {
	if (matchhostname(NL_MY_NAME(clnt), my_name) &&
--------------

This matchhostname never matches, because NL_MY_NAME(clnt) gives 127.0.0.1
but the my_name variable in the sm_unmon_1_svc function is mymachine.
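
To make the mismatch concrete, here is a minimal sketch in C.  It is
not the nfs-utils code: it assumes that matchhostname treats two names
as equal when they are the same string or resolve to a common address,
and it replaces real name resolution with a small in-memory table.
struct host_entry, lookup_addr, names_match and the two tables are
hypothetical names invented for the illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <strings.h>

/* Hypothetical stand-in for one /etc/hosts line: an address and its names. */
struct host_entry {
    const char *addr;
    const char *names[4];   /* NULL-terminated list of names */
};

/* Map a name (or a literal address) to the address of the first entry
 * that lists it.  Returns NULL if nothing matches. */
const char *lookup_addr(const struct host_entry *tab, size_t n,
                        const char *name)
{
    assert(tab != NULL && name != NULL);
    for (size_t i = 0; i < n; i++) {
        if (strcmp(tab[i].addr, name) == 0)
            return tab[i].addr;
        for (size_t j = 0; j < 4 && tab[i].names[j] != NULL; j++)
            if (strcasecmp(tab[i].names[j], name) == 0)
                return tab[i].addr;
    }
    return NULL;
}

/* Simplified model of the comparison (an assumption, see above):
 * identical strings match; otherwise both sides must resolve to the
 * same address. */
int names_match(const struct host_entry *tab, size_t n,
                const char *a, const char *b)
{
    const char *ra, *rb;

    if (strcasecmp(a, b) == 0)
        return 1;
    ra = lookup_addr(tab, n, a);
    rb = lookup_addr(tab, n, b);
    return ra != NULL && rb != NULL && strcmp(ra, rb) == 0;
}

/* mymachine has its own address, separate from the loopback line. */
const struct host_entry hosts_split[] = {
    { "127.0.0.1",  { "localhost", NULL } },
    { "10.1.1.180", { "mymachine", NULL } },
};

/* mymachine is also listed on the loopback line. */
const struct host_entry hosts_merged[] = {
    { "127.0.0.1",  { "localhost", "mymachine", NULL } },
};
```

Under these assumptions, names_match(hosts_split, 2, "127.0.0.1",
"mymachine") returns 0, which corresponds to the failed comparison
described above, while the same call against hosts_merged returns 1.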

THEN LOOKING AT FURTHER CODE
----------------------------

In monitor.c again, in the sm_mon_1_svc function, we have:

struct sm_stat_res *
sm_mon_1_svc(struct mon *argp, struct svc_req *rqstp)
{
        static sm_stat_res result;
        char            *mon_name = argp->mon_id.mon_name,
                        *my_name  = argp->mon_id.my_id.my_name;
...
my_name = "127.0.0.1";

What the?!  my_name is reset to 127.0.0.1 because of CERT Advisory
CA-99.05?  Anyhow, let's ignore that, since it is not the main
reason the error message appears in the first place.  Now we're
back to the kernel, which keeps sending SM_UNMON RPCs to statd.

It looks like this unmon stuff is happening too regularly.  Now I'm
wondering why it is attempting to UNMON so regularly in the first place.

This brings us back to kernel 2.4.19-21.  Why is the kernel's statd client
calling statd so regularly for sm_unmon?!!!  No, i did not stop the
nfs-kernel-server.  No, i'm not mounting and unmounting NFS regularly.

Is there something else i should check?  I'll start looking at the kernel
code while awaiting reply.  Any help/suggestions appreciated.

Cheers,
Lawrence


-------------------------------------------------------
This SF.Net email sponsored by: Parasoft
Error proof Web apps, automate testing & more.
Download & eval WebKing and get a free book.
www.parasoft.com/bulletproofapps1
_______________________________________________
NFS maillist  -  NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs


* Re: SM_UNMON again -> kernel
  2003-07-11  6:16 SM_UNMON again -> kernel Lawrence Ong
@ 2003-07-11  9:38 ` Trond Myklebust
  2003-07-13  3:22   ` Philippe Troin
  2003-07-13  4:25   ` Lawrence Ong
  0 siblings, 2 replies; 16+ messages in thread
From: Trond Myklebust @ 2003-07-11  9:38 UTC (permalink / raw)
  To: Lawrence Ong; +Cc: nfs


Is /etc/hosts consistent? The kernel only sends MON/UNMON requests on
the loopback port, so this suggests that something is translating
127.0.0.1<->"mymachine" instead of to "localhost".

Cheers,
  Trond



* Re: SM_UNMON again -> kernel
  2003-07-11  9:38 ` Trond Myklebust
@ 2003-07-13  3:22   ` Philippe Troin
  2003-07-13 15:25     ` Trond Myklebust
  2003-07-13  4:25   ` Lawrence Ong
  1 sibling, 1 reply; 16+ messages in thread
From: Philippe Troin @ 2003-07-13  3:22 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: Lawrence Ong, nfs

Trond Myklebust <trond.myklebust@fys.uio.no> writes:

> Is /etc/hosts consistent? The kernel only sends MON/UNMON requests on
> the loopback port, so this suggests that something is translating
> 127.0.0.1<->"mymachine" instead of to "localhost".

I've also been seeing this forever... Which also means that locks are
never recovered after a server reboot, which is mildly annoying.

What do you mean by "consistent" /etc/hosts?

% getent hosts localhost
127.0.0.1       localhost
% getent hosts 127.0.0.1
127.0.0.1       localhost
% host localhost
localhost has address 127.0.0.1
% host 127.0.0.1
1.0.0.127.in-addr.arpa domain name pointer localhost.

Is that consistent enough?

Phil.



* Re: SM_UNMON again -> kernel
  2003-07-11  9:38 ` Trond Myklebust
  2003-07-13  3:22   ` Philippe Troin
@ 2003-07-13  4:25   ` Lawrence Ong
  2003-07-14  8:47     ` Trond Myklebust
  1 sibling, 1 reply; 16+ messages in thread
From: Lawrence Ong @ 2003-07-13  4:25 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: nfs

On Fri, Jul 11, 2003 at 11:38:57AM +0200, Trond Myklebust wrote:
> 
> Is /etc/hosts consistent? The kernel only sends MON/UNMON requests on
> the loopback port, so this suggests that something is translating
> 127.0.0.1<->"mymachine" instead of to "localhost".

Consistent?

Well, this is what i have in /etc/hosts

127.0.0.1 localhost
10.1.1.180 mymachine

With some knowledge looking at the monitor.c code, when i changed my
/etc/hosts to:

127.0.0.1 localhost mymachine

The error message disappeared.  I don't think this should be happening
AT ALL.  The statd code section:

/* Match! */
dprintf(L_DEBUG, "UNMONITORING %s for %s",
		mon_name, my_name);

Is entered when i changed /etc/hosts to 127.0.0.1 localhost mymachine.
Then again, it's logical since my_name in statd's code is set to
127.0.0.1 automatically due to CERT CA-99.05.  Maybe it's got to be
rewritten somewhat?

Anyhow, it shows that the kernel is still continually sending out
unmonitor packets to statd at regular intervals.  What the?  Why is
this happening?

Thanks for the help.

Cheers,
Lawrence

> 
> Cheers,
>   Trond



* Re: SM_UNMON again -> kernel
  2003-07-13  3:22   ` Philippe Troin
@ 2003-07-13 15:25     ` Trond Myklebust
  2003-07-13 18:00       ` Philippe Troin
  0 siblings, 1 reply; 16+ messages in thread
From: Trond Myklebust @ 2003-07-13 15:25 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: Lawrence Ong, nfs

>>>>> " " == Philippe Troin <phil@fifi.org> writes:

     > Trond Myklebust <trond.myklebust@fys.uio.no> writes:
    >> Is /etc/hosts consistent? The kernel only sends MON/UNMON
    >> requests on the loopback port, so this suggests that something
    >> is translating 127.0.0.1<->"mymachine" instead of to
    >> "localhost".

     > I've also been seeing this forever... Which also means that
     > locks are never recovered after a server reboot, which is mildly
     > annoying.

The *same* error, with a message that appears not to be originating
from 127.0.0.1?

     > What do you mean by "consistent" /etc/hosts?

I mean crap like

127.0.0.1     localhost mymachine

in /etc/hosts, together with a DNS entry that translates the name
"mymachine" into another IP address.

Believe me. It's been done...
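
One way to spot that kind of setup is to ask the resolver whether the
machine's own name comes back as a loopback address.  The following is
only a sketch: resolves_to_loopback is a hypothetical helper, it looks
at IPv4 results only, and it goes through whatever sources the
resolver is configured to use (hosts file, DNS, ...), so it shows the
combined answer rather than each source separately.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif

#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <netdb.h>

/* Returns 1 if any IPv4 address that `name` resolves to lies in
 * 127.0.0.0/8, 0 if none does, and -1 if the name does not resolve. */
int resolves_to_loopback(const char *name)
{
    struct addrinfo hints, *res, *p;
    int found = 0;

    assert(name != NULL);
    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_INET;       /* IPv4 only in this sketch */
    hints.ai_socktype = SOCK_DGRAM;  /* avoid one result per socket type */

    if (getaddrinfo(name, NULL, &hints, &res) != 0)
        return -1;

    for (p = res; p != NULL; p = p->ai_next) {
        const struct sockaddr_in *sin =
            (const struct sockaddr_in *)p->ai_addr;
        /* The top octet of a loopback address is 127. */
        if ((ntohl(sin->sin_addr.s_addr) >> 24) == 127)
            found = 1;
    }
    freeaddrinfo(res);
    return found;
}
```

Calling resolves_to_loopback("mymachine") on the affected box and
getting 1 would point at an entry like the one above; 0 would suggest
the name maps to a real interface address.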

Cheers,
  Trond



* Re: SM_UNMON again -> kernel
  2003-07-13 15:25     ` Trond Myklebust
@ 2003-07-13 18:00       ` Philippe Troin
  2003-07-13 23:36         ` Lawrence Ong
  2003-07-14  8:56         ` Trond Myklebust
  0 siblings, 2 replies; 16+ messages in thread
From: Philippe Troin @ 2003-07-13 18:00 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: Lawrence Ong, nfs

Trond Myklebust <trond.myklebust@fys.uio.no> writes:

> >>>>> " " == Philippe Troin <phil@fifi.org> writes:
> 
>      > Trond Myklebust <trond.myklebust@fys.uio.no> writes:
>     >> Is /etc/hosts consistent? The kernel only sends MON/UNMON
>     >> requests on the loopback port, so this suggests that something
>     >> is translating 127.0.0.1<->"mymachine" instead of to
>     >> "localhost".
> 
>      > I've also been seeing this forever... Which also means that
> >      > locks are never recovered after a server reboot, which is mildly
>      > annoying.
> 
> The *same* error, with a message that appears not to be originating
> from 127.0.0.1?

Yup.

On machine #1:
ceramic rpc.statd[30693]: Received erroneous SM_UNMON request from ceramic for 216.27.190.149

On machine #2:
tantale rpc.statd[18416]: Received erroneous SM_UNMON request from tantale for 216.27.190.148

>      > What do you mean by "consistent" /etc/hosts?
> 
> I mean crap like
> 
> 127.0.0.1     localhost mymachine
> 
> in /etc/hosts, together with a DNS entry that translates the name
> "mymachine" into another IP address.
> 
> Believe me. It's been done...

Oh, I'm sure about that :-)

Phil.



* Re: SM_UNMON again -> kernel
  2003-07-13 18:00       ` Philippe Troin
@ 2003-07-13 23:36         ` Lawrence Ong
  2003-07-14  8:56         ` Trond Myklebust
  1 sibling, 0 replies; 16+ messages in thread
From: Lawrence Ong @ 2003-07-13 23:36 UTC (permalink / raw)
  To: Philippe Troin; +Cc: Trond Myklebust, nfs

On Sun, Jul 13, 2003 at 11:00:03AM -0700, Philippe Troin wrote:
> > I mean crap like
> > 
> > 127.0.0.1     localhost mymachine
> > 
> > in /etc/hosts, together with a DNS entry that translates the name
> > "mymachine" into another IP address.
> > 
> > Believe me. It's been done...
> 
> Oh, I'm sure about that :-)
> 
> Phil.

The funny thing is that when I PUT in crap like that, the error
message disappeared.  Of course, it's not supposed to be correct.  I just
placed it in there for testing.

Lawrence



* Re: SM_UNMON again -> kernel
  2003-07-13  4:25   ` Lawrence Ong
@ 2003-07-14  8:47     ` Trond Myklebust
  2003-07-14 23:42       ` Lawrence Ong
  0 siblings, 1 reply; 16+ messages in thread
From: Trond Myklebust @ 2003-07-14  8:47 UTC (permalink / raw)
  To: Lawrence Ong; +Cc: Trond Myklebust, nfs

>>>>> " " == Lawrence Ong <lawrence.ong@netregistry.com.au> writes:

     > Anyhow, it shows that the kernel is still continually sending
     > out unmonitor packets to statd at regular intervals.  What the?
     > Why is this happening?

What's wrong with that: I presume you *are* releasing locks every now
and then?

There's no point in monitoring a server on which you're not holding
any locks.
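
As a rough illustration of that reasoning, here is a toy model in C:
monitor a server when the first lock on it is taken, unmonitor when
the last one is released.  struct peer, lock_acquired and
lock_released are hypothetical names; this is not the kernel's lockd
code, and the actual timing of the SM_UNMON call inside the kernel is
not modelled.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical per-server bookkeeping; not the kernel's data structure. */
struct peer {
    const char *name;   /* server being watched */
    int locks_held;     /* locks currently held on that server */
    int monitored;      /* 1 while statd has been asked to monitor it */
};

/* Taking the first lock is the point at which monitoring starts. */
void lock_acquired(struct peer *p)
{
    assert(p != NULL);
    if (p->locks_held++ == 0 && !p->monitored) {
        p->monitored = 1;
        printf("SM_MON   %s\n", p->name);
    }
}

/* Releasing the last lock makes monitoring pointless, so stop it. */
void lock_released(struct peer *p)
{
    assert(p != NULL && p->locks_held > 0);
    if (--p->locks_held == 0 && p->monitored) {
        p->monitored = 0;
        printf("SM_UNMON %s\n", p->name);
    }
}
```

In this model, a client that takes and drops locks from time to time
produces a matching stream of SM_MON and SM_UNMON calls, so periodic
unmonitor requests are expected and not a fault by themselves.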

Cheers,
  Trond



* Re: SM_UNMON again -> kernel
  2003-07-13 18:00       ` Philippe Troin
  2003-07-13 23:36         ` Lawrence Ong
@ 2003-07-14  8:56         ` Trond Myklebust
  2003-07-18  1:08           ` Philippe Troin
  1 sibling, 1 reply; 16+ messages in thread
From: Trond Myklebust @ 2003-07-14  8:56 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: Lawrence Ong, nfs

>>>>> " " == Philippe Troin <phil@fifi.org> writes:

     > On machine #1: ceramic rpc.statd[30693]: Received erroneous
     > SM_UNMON request from ceramic for 216.27.190.149

     > On machine #2: tantale rpc.statd[18416]: Received erroneous
     > SM_UNMON request from tantale for 216.27.190.148

OK. Hang on... The above looks more like stale entries due to somebody
having permanently switched off a server on which ceramic/tantale were
holding locks.

Could you check on 'ceramic', and 'tantale' if they don't have entries
for 216.27.190.149 and 216.27.190.148 respectively in /var/lib/nfs/sm
or /var/lib/nfs/sm.bak?

Another mistake that can cause the above errors is if the sysman has
made those 2 directories NFS shared between different clients. (Not a
good idea...)
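
The check for leftover entries could be done with ls, or with a few
lines of C as sketched here.  list_sm_entries is a hypothetical
helper, and the sketch assumes that statd keeps one directory entry
per monitored host under the directory passed in; it only lists and
counts entries and does not modify anything.

```c
#include <assert.h>
#include <dirent.h>
#include <stdio.h>
#include <string.h>

/* Print and count the entries in a statd state directory such as
 * /var/lib/nfs/sm.  Returns -1 if the directory cannot be opened. */
int list_sm_entries(const char *dir)
{
    DIR *d;
    struct dirent *de;
    int count = 0;

    assert(dir != NULL);
    d = opendir(dir);
    if (d == NULL)
        return -1;

    while ((de = readdir(d)) != NULL) {
        /* Skip the directory's own "." and ".." entries. */
        if (strcmp(de->d_name, ".") == 0 || strcmp(de->d_name, "..") == 0)
            continue;
        printf("%s/%s\n", dir, de->d_name);
        count++;
    }
    closedir(d);
    return count;
}
```

Running list_sm_entries("/var/lib/nfs/sm") and
list_sm_entries("/var/lib/nfs/sm.bak") on each client would show
whether anything for 216.27.190.149 or 216.27.190.148 is still there.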

Cheers,
  Trond



* Re: SM_UNMON again -> kernel
  2003-07-14  8:47     ` Trond Myklebust
@ 2003-07-14 23:42       ` Lawrence Ong
  0 siblings, 0 replies; 16+ messages in thread
From: Lawrence Ong @ 2003-07-14 23:42 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: nfs

On Mon, Jul 14, 2003 at 10:47:52AM +0200, Trond Myklebust wrote:
> >>>>> " " == Lawrence Ong <lawrence.ong@netregistry.com.au> writes:
> 
>      > Anyhow, it shows that the kernel is still continually sending
>      > out unmonitor packets to statd at regular intervals.  What the?
>      > Why is this happening?
> 
> What's wrong with that: I presume you *are* releasing locks every now
> and then?

How often should you be releasing locks?  Right now, on the client and
server, it is happening about every 5 minutes.

> There's no point in monitoring a server on which you're not holding
> any locks.

Well, in that case, the only way to make the erroneous SM_UNMON really go
away is by adding: 127.0.0.1 localhost mymachine to /etc/hosts.
I know you said that's bad, but how else would you do it when the MON
function for nfs-utils actually set the my_name variable to 127.0.0.1?

Maybe i'm missing something, but so far, the only ways i see that the
erroneous message would ever disappear are:

a. By putting in 127.0.0.1 localhost mymachine in /etc/hosts
b. By having the kernel STOP sending out unmonitor packets to statd at
   regular intervals
c. By changing statd so that it does not set my_name to 127.0.0.1
   (breaks fix? for CERT CA-99.05).

If you think that there is another way to stop that erroneous SM_UNMON
error without resorting to any of the ridiculous fixes above, I would be
willing to test it out.  Thanks.

Cheers,
Lawrence



* Re: SM_UNMON again -> kernel
  2003-07-14  8:56         ` Trond Myklebust
@ 2003-07-18  1:08           ` Philippe Troin
  2003-07-19  3:26             ` Philippe Troin
  0 siblings, 1 reply; 16+ messages in thread
From: Philippe Troin @ 2003-07-18  1:08 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: Lawrence Ong, nfs

Trond Myklebust <trond.myklebust@fys.uio.no> writes:

> >>>>> " " == Philippe Troin <phil@fifi.org> writes:
> 
>      > On machine #1: ceramic rpc.statd[30693]: Received erroneous
>      > SM_UNMON request from ceramic for 216.27.190.149
> 
>      > On machine #2: tantale rpc.statd[18416]: Received erroneous
>      > SM_UNMON request from tantale for 216.27.190.148
> 
> OK. Hang on... The above looks more like stale entries due to somebody
> having permanently switched off a server on which ceramic/tantale were
> holding locks.

Nope, the machines have been rebooted...
 
> Could you check on 'ceramic', and 'tantale' if they don't have entries
> for 216.27.190.149 and 216.27.190.148 respectively in /var/lib/nfs/sm
> or /var/lib/nfs/sm.bak?

They have indeed very old files there that were removed.

> Another mistake that can cause the above errors is if the sysman has
> made those 2 directories NFS shared between different clients. (Not a
> good idea...)

No, that's not the case. But I can see why it would wreak havoc.

Anyways, this is what I did:

 - I shut down all boxes interacting via NFS.

 - When booting them up, I switched to single-user mode and:

     * wiped /var/lib/nfs/mtab

     * removed any files in /var/lib/nfs/sm{,.bak}

And we'll see if the message shows up again.

Phil.


-------------------------------------------------------
This SF.net email is sponsored by: VM Ware
With VMware you can run multiple operating systems on a single machine.
WITHOUT REBOOTING! Mix Linux / Windows / Novell virtual machines at the
same time. Free trial click here: http://www.vmware.com/wl/offer/345/0
_______________________________________________
NFS maillist  -  NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs


* Re: SM_UNMON again -> kernel
  2003-07-18  1:08           ` Philippe Troin
@ 2003-07-19  3:26             ` Philippe Troin
  2003-07-19 16:22               ` Christian Reis
  0 siblings, 1 reply; 16+ messages in thread
From: Philippe Troin @ 2003-07-19  3:26 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: nfs

Philippe Troin <phil@fifi.org> writes:

> Trond Myklebust <trond.myklebust@fys.uio.no> writes:
> 
> > >>>>> " " == Philippe Troin <phil@fifi.org> writes:
> > 
> >      > On machine #1: ceramic rpc.statd[30693]: Received erroneous
> >      > SM_UNMON request from ceramic for 216.27.190.149
> > 
> >      > On machine #2: tantale rpc.statd[18416]: Received erroneous
> >      > SM_UNMON request from tantale for 216.27.190.148
> > 
> > OK. Hang on... The above looks more like stale entries due to somebody
> > having permanently switched off a server on which ceramic/tantale were
> > holding locks.
> 
> Nope, the machines have been rebooted...
>  
> > Could you check on 'ceramic', and 'tantale' if they don't have entries
> > for 216.27.190.149 and 216.27.190.148 respectively in /var/lib/nfs/sm
> > or /var/lib/nfs/sm.bak?
> 
> They have indeed very old files there that were removed.
> 
> > Another mistake that can cause the above errors is if the sysman has
> > made those 2 directories NFS shared between different clients. (Not a
> > good idea...)
> 
> No, that's not the case. But I can see why it would wreak havoc.
> 
> Anyways, this is what I did:
> 
>  - I shut down all boxes interacting via NFS.
> 
>  - When booting them up, I switched to single-user mode and:
> 
>      * wiped /var/lib/nfs/mtab
> 
>      * removed any files in /var/lib/nfs/sm{,.bak}
> 
> And we'll see if the message shows up again.

And it did...

I still see the same messages over and over again.

Phil.



* Re: SM_UNMON again -> kernel
  2003-07-19  3:26             ` Philippe Troin
@ 2003-07-19 16:22               ` Christian Reis
  2003-07-20 23:28                 ` Trond Myklebust
  0 siblings, 1 reply; 16+ messages in thread
From: Christian Reis @ 2003-07-19 16:22 UTC (permalink / raw)
  To: Philippe Troin; +Cc: Trond Myklebust, nfs

On Fri, Jul 18, 2003 at 08:26:50PM -0700, Philippe Troin wrote:
> I still see the same messages over and over again.

Just to add a data point, I see the same message periodically on our
diskless network -- all clients are linux 2.4.21 (debian woody), as is
the server (no relevant patches).  However, the IP addresses in the
messages are for the *remote* boxes.

Server (anthem 192.168.99.4) logs:

    Jul 12 17:50:10 anthem rpc.statd[326]: Received erroneous
        SM_UNMON request from anthem for 192.168.99.6
    Jul 12 17:54:37 anthem rpc.statd[326]: Received erroneous
        SM_UNMON request from anthem for 192.168.99.6
    Jul 12 18:07:03 anthem rpc.statd[326]: Received erroneous
        SM_UNMON request from anthem for 192.168.99.6
    Jul 12 18:13:10 anthem rpc.statd[326]: Received erroneous
        SM_UNMON request from anthem for 192.168.99.6

[One] client (manonegra 192.168.99.6) log:

    Jul 18 18:07:33 manonegra rpc.statd[115]: Received
        erroneous SM_UNMON request from manonegra for 192.168.99.4
    Jul 18 18:17:33 manonegra rpc.statd[115]: Received
        erroneous SM_UNMON request from manonegra for 192.168.99.4
    Jul 18 18:35:08 manonegra rpc.statd[115]: Received
        erroneous SM_UNMON request from manonegra for 192.168.99.4
    Jul 18 18:39:22 manonegra rpc.statd[115]: Received
        erroneous SM_UNMON request from manonegra for 192.168.99.4

It goes on forever. There's nothing fishy on the 127.0.0.1 lines in the
hosts file or in DNS. The logs go back for a year now.

If there's one thing special, it's that this server also runs an
iptables firewall that does masquerading and supports connection
tracking. Could it be mangling the lock packets? I recall seeing
something similar a long time ago..

Take care,
--
Christian Reis, Senior Engineer, Async Open Source, Brazil.
http://async.com.br/~kiko/ | [+55 16] 261 2331 | NMFL



* Re: SM_UNMON again -> kernel
  2003-07-19 16:22               ` Christian Reis
@ 2003-07-20 23:28                 ` Trond Myklebust
  2003-07-20 23:45                   ` Christian Reis
  0 siblings, 1 reply; 16+ messages in thread
From: Trond Myklebust @ 2003-07-20 23:28 UTC (permalink / raw)
  To: Christian Reis; +Cc: Philippe Troin, nfs

>>>>> " " == Christian Reis <kiko@async.com.br> writes:

     > If there's one thing special, it's that this server also runs
     > an iptables firewall that does masquerading and supports
     > connection tracking. Could it be mangling the lock packets?

Huh? "Special" sounds like it would only begin to describe such a
setup...



The client has to be able to contact the server for portmapper, lockd,
nfs and statd. Exactly how are you doing this with masquerading
enabled?

In particular, how are you ensuring that the portmapper returns
correct information?

Cheers,
  Trond



* Re: SM_UNMON again -> kernel
  2003-07-20 23:28                 ` Trond Myklebust
@ 2003-07-20 23:45                   ` Christian Reis
  2003-07-21 13:08                     ` Bogdan Costescu
  0 siblings, 1 reply; 16+ messages in thread
From: Christian Reis @ 2003-07-20 23:45 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: Philippe Troin, nfs

On Mon, Jul 21, 2003 at 01:28:40AM +0200, Trond Myklebust wrote:
> >>>>> " " == Christian Reis <kiko@async.com.br> writes:
> 
>      > If there's one thing special, it's that this server also runs
>      > an iptables firewall that does masquerading and supports
>      > connection tracking. Could it be mangling the lock packets?
> 
> Huh? "Special" sounds like it would only begin to describe such a
> setup...

Oh, the masquerading has nothing to do with NFS. The NFS clients and the
server are all on the same local subnet. I just mentioned this to
indicate that something special might be happening on the packet level
(unfrag/frag) because this box is also a firewall, not that the clients
communicate with the server through a firewall.

Take care,
--
Christian Reis, Senior Engineer, Async Open Source, Brazil.
http://async.com.br/~kiko/ | [+55 16] 261 2331 | NMFL



* Re: SM_UNMON again -> kernel
  2003-07-20 23:45                   ` Christian Reis
@ 2003-07-21 13:08                     ` Bogdan Costescu
  0 siblings, 0 replies; 16+ messages in thread
From: Bogdan Costescu @ 2003-07-21 13:08 UTC (permalink / raw)
  To: Christian Reis; +Cc: Trond Myklebust, Philippe Troin, nfs

On Sun, 20 Jul 2003, Christian Reis wrote:

> Oh, the masquerading has nothing to do with NFS.

It might, by error. I don't know if this is relevant, as I seem to have
lost track of this thread, but some keywords prompted me to write this
message...

In some conditions when using iptables, UDP packets get wrongly re-routed
to the loopback interface. I've hit this some months ago with a RedHat 8.0
installation with the current (at that time) 2.4.18-something kernel; I
don't know if things changed in the meantime as I have rewritten the
iptables rules...

It seems that some forms of MASQ/NAT and marking packets (fwmark) don't 
play well together; this is mentioned towards the end of:

http://lartc.org/howto/lartc.netfilter.html

but there was not enough information about what exactly would happen.  So
when I did use them (MASQ and fwmark) together, some UDP packets which had
perfectly good IP headers (source: server-IP, dest: client-IP) would not
make it onto the wire towards the client, but could be captured with
tcpdump attached to "lo". I've searched the netfilter list and came up
with some hits, but there were only questions (so the problem is known!)  
and no solution.

Incidentally, I've noticed this packet loss when trying to mount a NFS
export. I don't remember exactly where the packet loss happened, but it
was always at the same point in the packet exchange (well, out of 5-6
tries that I've made, also with different client OSes) - most probably,
the final packet, the answer to the "MOUNT" request (sorry, fuzzy memory).
If I'd clear the iptables rules on the server, the same mount command
would succeed immediately.  After deleting all the rules that used
"fwmark" (but still using MASQ), everything worked fine.

-- 
Bogdan Costescu

IWR - Interdisziplinaeres Zentrum fuer Wissenschaftliches Rechnen
Universitaet Heidelberg, INF 368, D-69120 Heidelberg, GERMANY
Telephone: +49 6221 54 8869, Telefax: +49 6221 54 8868
E-mail: Bogdan.Costescu@IWR.Uni-Heidelberg.De



