linux-kernel.vger.kernel.org archive mirror
* (no subject)
@ 2003-02-15  1:53 Neil Brown
  2003-02-18  7:10 ` David S. Miller
  0 siblings, 1 reply; 9+ messages in thread
From: Neil Brown @ 2003-02-15  1:53 UTC (permalink / raw)
  To: Herbert Xu; +Cc: linux-kernel

Subject: Re: Routing problem with udp, and a multihomed host in 2.4.20
In-Reply-To: message from Herbert Xu on Saturday February 15
References: <15948.13879.734412.313081@notabene.cse.unsw.edu.au>
	<E18jpaa-0007Rc-00@gondolin.me.apana.org.au>
On Saturday February 15, herbert@gondor.apana.org.au wrote:
> Neil Brown <neilb@cse.unsw.edu.au> wrote:
> > 
> > It turns out that the problem occurs when sendmsg is used to send a
> > UDP packet, and the control information contains
> >              struct in_pktinfo {
> >                  unsigned int   ipi_ifindex;  /* Interface index */
> >                  struct in_addr ipi_spec_dst; /* Local address */
> >                  struct in_addr ipi_addr;     /* Header Destination address */
> >              };
> > specifying the address and interface of the message that we are
> > replying to.
> 
> So your application is forcing the packet to go out on a specific
> interface bypassing the routing table...

No.
My application (which is just using standard rpc server libraries) is
saying
  "This is in reply to a request that came in through a given
  interface".

It is not reasonable to treat that statement as equivalent to:
  "This packet must go out that interface"

which is what appears to be happening.

NeilBrown
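
To make the mechanism under discussion concrete, here is a minimal sketch of how a server might pull the in_pktinfo out of the control data that recvmsg() fills in. The helper name get_pktinfo and the union pktinfo_buf are hypothetical and do not come from the thread or from glibc; the sketch assumes Linux with glibc.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Control buffer sized for exactly one IP_PKTINFO message; the union
 * forces the alignment that the CMSG_* macros expect. */
union pktinfo_buf {
    char buf[CMSG_SPACE(sizeof(struct in_pktinfo))];
    struct cmsghdr align;
};

/* Hypothetical helper (not from the thread): copy the IP_PKTINFO payload
 * out of a msghdr filled in by recvmsg().
 * Returns 1 if an IP_PKTINFO message was found, 0 otherwise. */
static int get_pktinfo(struct msghdr *msg, struct in_pktinfo *out)
{
    struct cmsghdr *c;

    assert(msg != NULL && out != NULL);
    for (c = CMSG_FIRSTHDR(msg); c != NULL; c = CMSG_NXTHDR(msg, c)) {
        if (c->cmsg_level == IPPROTO_IP && c->cmsg_type == IP_PKTINFO &&
            c->cmsg_len >= CMSG_LEN(sizeof(*out))) {
            /* memcpy avoids assuming the payload is suitably aligned */
            memcpy(out, CMSG_DATA(c), sizeof(*out));
            return 1;
        }
    }
    return 0;
}
```

The kernel only delivers this control message if the socket has asked for it, typically with setsockopt(fd, IPPROTO_IP, IP_PKTINFO, &one, sizeof(one)) before the first recvmsg().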


> -- 
> Debian GNU/Linux 3.0 is out! ( http://www.debian.org/ )
> Email:  Herbert Xu 许志壬 <herbert@gondor.apana.org.au>
> Home Page: http://gondor.apana.org.au/~herbert/
> PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


* (no subject)
  2003-02-15  1:53 Neil Brown
@ 2003-02-18  7:10 ` David S. Miller
  2003-02-18 11:00   ` sendmsg and IP_PKTINFO Neil Brown
  0 siblings, 1 reply; 9+ messages in thread
From: David S. Miller @ 2003-02-18  7:10 UTC (permalink / raw)
  To: Neil Brown; +Cc: Herbert Xu, linux-kernel, Alexey N. Kuznetsov

On Fri, 2003-02-14 at 17:53, Neil Brown wrote:
> No.
> My application (which is just using standard rpc server libraries) is
> saying
>   "This is in reply to a request that came in through a given
>   interface".
> 
> It is not reasonable to treat that statement as equivalent to:
>   "This packet must go out that interface"
> 
> which is what appears to be happening.

You misunderstand what this control message knob means during
a sendmsg() then, it means "send this over interface X"

There is no other valid expectation.

I'm curious where you read something that would suggest otherwise
for sendmsg() behavior wrt. ip_pktinfo



* Re: sendmsg and IP_PKTINFO
  2003-02-18  7:10 ` David S. Miller
@ 2003-02-18 11:00   ` Neil Brown
  2003-02-18 16:06     ` kuznet
  2003-02-19  3:52     ` David S. Miller
  0 siblings, 2 replies; 9+ messages in thread
From: Neil Brown @ 2003-02-18 11:00 UTC (permalink / raw)
  To: David S. Miller; +Cc: Herbert Xu, linux-kernel, Alexey N. Kuznetsov

On  February 17, davem@redhat.com wrote:
> On Fri, 2003-02-14 at 17:53, Neil Brown wrote:
> > No.
> > My application (which is just using standard rpc server libraries) is
> > saying
> >   "This is in reply to a request that came in through a given
> >   interface".
> > 
> > It is not reasonable to treat that statement as equivalent to:
> >   "This packet must go out that interface"
> > 
> > which is what appears to be happening.
> 
> You misunderstand what this control message knob means during
> a sendmsg() then, it means "send this over interface X"
> 
> There is no other valid expectation.
> 
> I'm curious where you read something that would suggest otherwise
> for sendmsg() behavior wrt. ip_pktinfo

man 7 ip

on debian (unstable).

I quote:

       IP_PKTINFO
              Pass an IP_PKTINFO ancillary message  that  contains  a  pktinfo
              structure  that  supplies  some  information  about the incoming
              packet.  This only works for  datagram  oriented  sockets.   The
              argument  is a flag that tells the socket whether the IP_PKTINFO
              message should be passed or not. The message itself can only  be
              sent/retrieved as control message with a packet using recvmsg(2)
              or sendmsg(2).

              struct in_pktinfo {
                  unsigned int   ipi_ifindex;  /* Interface index */
                  struct in_addr ipi_spec_dst; /* Local address */
                  struct in_addr ipi_addr;     /* Header Destination address */
              };

              ipi_ifindex is the unique index of the interface the packet  was
              received  on.   ipi_spec_dst  is the local address of the packet
              and ipi_addr is the destination address in  the  packet  header.
              If  IP_PKTINFO  is passed to sendmsg(2) then the outgoing packet
              will be sent over the interface specified  in  ipi_ifindex  with
              the destination address set to ipi_spec_dst


Note that the in_pktinfo is described as "some information about the
incoming packet".  In particular ipi_ifindex is "the unique index of
the interface the packet was received on".

i.e. it is more about the incoming than the outgoing packet.

It does go on to say that the outgoing packet will be sent over the
same interface, however I feel that is an illogical conclusion given
the description of the meaning of the field.

So yes, the current behaviour seems to match part of the
documentation.  However I argue that the documented behaviour is
irrational.
A more rational behaviour is
 "the outgoing packet will be sent over the interface specified in
 ipi_ifindex if that interface has a valid route to the packet's
 destination.  Otherwise normal routing rules apply".

I further argue that this is not only more rational, but is actually
more useful (which is a more telling point).

NeilBrown


* Re: sendmsg and IP_PKTINFO
  2003-02-18 11:00   ` sendmsg and IP_PKTINFO Neil Brown
@ 2003-02-18 16:06     ` kuznet
  2003-02-18 23:35       ` Neil Brown
  2003-02-19  3:52     ` David S. Miller
  1 sibling, 1 reply; 9+ messages in thread
From: kuznet @ 2003-02-18 16:06 UTC (permalink / raw)
  To: Neil Brown; +Cc: davem, herbert, linux-kernel

Hello!

> So yes, the current behaviour seems to match part of the
> documentation.

Good. :-)

>  "the outgoing packet will be sent over the interface specified in
>  ipi_ifindex if that interface has a valid route to the packet's
>  destination.  Otherwise normal routing rules apply".
> 
> I further argue that this is not only more rational, but is actually
> more useful (which is a more telling point).

Either you rely on routing tables, or you do not. The latter is used
when routing tables are not yet set up, or are set up ambiguously,
or when using them just does not make sense, which happens for multicast,
limited broadcast and link-local addresses. That is what ipi_ifindex
does.

I do not see how it is possible to classify a middle way as "rational".
Well, and frankly speaking I do not see how it could be useful.

Alexey



* Re: sendmsg and IP_PKTINFO
  2003-02-18 16:06     ` kuznet
@ 2003-02-18 23:35       ` Neil Brown
  2003-02-18 23:56         ` David S. Miller
  0 siblings, 1 reply; 9+ messages in thread
From: Neil Brown @ 2003-02-18 23:35 UTC (permalink / raw)
  To: kuznet; +Cc: davem, herbert, linux-kernel

On Tuesday February 18, kuznet@ms2.inr.ac.ru wrote:
> Hello!
> 
> > So yes, the current behaviour seems to match part of the
> > documentation.
> 
> Good. :-)
> 
> >  "the outgoing packet will be sent over the interface specified in
> >  ipi_ifindex if that interface has a valid route to the packet's
> >  destination.  Otherwise normal routing rules apply".
> > 
> > I further argue that this is not only more rational, but is actually
> > more useful (which is a more telling point).
> 
> Either you rely on routing tables, or you do not. The latter is used
> when routing tables are not yet set up, or are set up ambiguously,
> or when using them just does not make sense, which happens for multicast,
> limited broadcast and link-local addresses. That is what ipi_ifindex
> does.
> 

Presumably in these "don't rely on routing" cases the application
would set SO_DONTROUTE or MSG_DONTROUTE.
If this flag were set, I wouldn't argue with insisting that the
message goes out the interface specified in ipi_ifindex.

But if not, what then...

There remains the fact that the only documentation I can find about
this(*) describes ipi_ifindex as the interface that a message
*arrived* on, and when specified in a sendmsg call, it is information
about the message that we are replying to.  Interpreting this to mean
that the reply *must* go out that interface still seems wrong to me.

Do you agree with that definition of IP_PKTINFO?  If not, can you
show me some documentation which supports your position?

(*) I looked in the Single Unix Specification and it doesn't mention
PKTINFO at all.  I looked in the RFCs mentioned at the bottom of
 "man 7 ip"
(RFC 1122 and RFC 1812) and they don't say anything about specifying an
interface for outgoing messages - only specifying a source address.


> I do not see how it is possible to classify a middle way as "rational".
> Well, and frankly speaking I do not see how it could be useful.

The "middle way" is that we do want to rely on routing tables (as we
have not set MSG_DONTROUTE) but that the routing tables do not give a
unique route to some addresses - as is certainly possible and even
encouraged by RFC1122 (end of discussion of Strong ES model in
section 3.3.4.2).

With reliable routing that provides multiple routes in some cases, it
seems reasonable, even rational, to accept hints from the application
as to which interface to use.

As for "useful", it is more that I cannot see how the current
behaviour is useful - i.e. how it can be used meaningfully by an
application.

The particular behaviour is:
  If MSG_DONTROUTE is not set, but a non-zero ipi_ifindex is given
  then:
    If there is a known route to the destination address through that
       interface, then use that route.
    If there is no route through that interface to the destination
       address, then treat it as a link-local address.

 When would an application actually want that behaviour?

My current approach to 'fixing' what I perceive as the 'problem' would
be one of:

 1/ If no route is found out the interface, return ENETUNREACH (not
    that ENETUNREACH is listed in the man page for sendmsg :-( ).
    Also change the documentation to make the intent more explicit.
    This would make my current problem clearly an application error and
    I could take it to the maintainers of glibc.

 2/ If MSG_DONTROUTE is not set, then ignore the value passed in
    ipi_ifindex.
    This would slightly weaken the application's options for
    controlling routing, but I don't think the weakening would be
    significant.

 3/ Treat the ipi_ifindex as a hint if MSG_DONTROUTE isn't set.   This
    would be much more invasive to the code, and it is not clear that
    the extra control it provides is needed in practice.


Thank you for your frankness,

NeilBrown
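
The behaviour described above and the three proposed alternatives can be laid side by side in a toy decision function. This is only a model of the prose in this message, assuming MSG_DONTROUTE is not set and that an ipi_ifindex of zero means "no interface requested"; the enum and function names are hypothetical and nothing here is kernel code.

```c
#include <assert.h>
#include <stdbool.h>

/* Which rule set to model (names are illustrative only). */
enum policy {
    POLICY_CURRENT,   /* behaviour described as current in the message */
    POLICY_1_ERROR,   /* proposal 1: fail with ENETUNREACH */
    POLICY_2_IGNORE,  /* proposal 2: ignore ipi_ifindex */
    POLICY_3_HINT     /* proposal 3: treat ipi_ifindex as a hint */
};

enum outcome {
    USE_ROUTING_TABLE,    /* ordinary route lookup, any interface */
    USE_ROUTE_VIA_IF,     /* use the known route through the interface */
    SEND_ON_LINK_VIA_IF,  /* no route: treat destination as link-local */
    FAIL_ENETUNREACH      /* refuse to send */
};

/* Model of a send without MSG_DONTROUTE.
 * ifindex:      value passed in ipi_ifindex (0 = none requested)
 * route_via_if: true if a route to the destination exists through
 *               that interface */
static enum outcome decide(enum policy p, int ifindex, bool route_via_if)
{
    assert(ifindex >= 0);
    if (ifindex == 0 || p == POLICY_2_IGNORE)
        return USE_ROUTING_TABLE;
    if (route_via_if)
        return USE_ROUTE_VIA_IF;

    /* An interface was requested but has no route to the destination. */
    switch (p) {
    case POLICY_CURRENT: return SEND_ON_LINK_VIA_IF;
    case POLICY_1_ERROR: return FAIL_ENETUNREACH;
    default:             return USE_ROUTING_TABLE; /* POLICY_3_HINT */
    }
}
```

The four policies differ only in the last branch, which is the multihomed case that started the thread: a reply whose best route does not lead back through the interface the request arrived on.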


* Re: sendmsg and IP_PKTINFO
  2003-02-18 23:35       ` Neil Brown
@ 2003-02-18 23:56         ` David S. Miller
  0 siblings, 0 replies; 9+ messages in thread
From: David S. Miller @ 2003-02-18 23:56 UTC (permalink / raw)
  To: neilb; +Cc: kuznet, herbert, linux-kernel


All you are showing us Neil is that the documentation
is inaccurate.  That snippet you showed me from manual
pages is wrong about sendmsg semantics.

The ifindex you specify does mean "send out this interface".

It is very surprising that this documentation is wrong since
the likely author (Andi Kleen) is smart enough to read the
actual implementation when he writes these things.

And yes, this means, no accurate documentation exists currently.


* Re: sendmsg and IP_PKTINFO
  2003-02-18 11:00   ` sendmsg and IP_PKTINFO Neil Brown
  2003-02-18 16:06     ` kuznet
@ 2003-02-19  3:52     ` David S. Miller
  2003-02-19  4:13       ` Neil Brown
  1 sibling, 1 reply; 9+ messages in thread
From: David S. Miller @ 2003-02-19  3:52 UTC (permalink / raw)
  To: neilb; +Cc: herbert, linux-kernel, kuznet

   From: Neil Brown <neilb@cse.unsw.edu.au>
   Date: Tue, 18 Feb 2003 22:00:37 +1100
   
   It does go on to say that the outgoing packet will be sent over the
   same interface, however I feel that is an illogical conclusion given
   the description of the meaning of the field.
   
   So yes, the current behaviour seems to match part of the
   documentation.  However I argue that the documented behaviour is
   irrational.

Alexey and myself totally disagree.  We have described for you
the intended purpose of this feature.  Please do not try to use
it in some other way, it may prove to be painful :-)


* Re: sendmsg and IP_PKTINFO
  2003-02-19  4:13       ` Neil Brown
@ 2003-02-19  4:03         ` David S. Miller
  0 siblings, 0 replies; 9+ messages in thread
From: David S. Miller @ 2003-02-19  4:03 UTC (permalink / raw)
  To: neilb; +Cc: herbert, linux-kernel, kuznet

   From: Neil Brown <neilb@cse.unsw.edu.au>
   Date: Wed, 19 Feb 2003 15:13:48 +1100

   On Tuesday February 18, davem@redhat.com wrote:
   > Alexey and myself totally disagree.  We have described for you
   > the intended purpose of this feature.  Please do not try to use
   > it in some other way, it may prove to be painful :-)
   
   Thank you for making that clear.
   
You're very welcome, thank you for tracking all of this down.
   
   Currently the sunrpc/svc_udp.c code asks for an IP_PKTINFO from
   recvmsg, and passes it verbatim down through sendmsg.

And yes, this is buggy.

   My patch checks that the returned data looks believable and, if it
   does, zeros the ipi_ifindex field.

Please note also that ipi_addr is ignored on sendmsg().

You don't have to zero it, this is just a reminder about
what the kernel will do with this thing.
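
Putting the points above together, a reply that should carry a particular source address while leaving the choice of interface to the routing table might build its control message roughly as follows. This is a sketch for Linux with glibc; the helper name set_reply_source is hypothetical, and the use of ipi_spec_dst as the source address is an assumption based on how this thread describes the fields.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: attach one IP_PKTINFO control message to msg that
 * asks for local_addr as the source address and requests no particular
 * interface (ipi_ifindex = 0).
 * buf must be suitably aligned for struct cmsghdr and at least
 * CMSG_SPACE(sizeof(struct in_pktinfo)) bytes long.
 * Returns 0 on success, -1 if the buffer is too small. */
static int set_reply_source(struct msghdr *msg, void *buf, size_t buflen,
                            struct in_addr local_addr)
{
    struct in_pktinfo pi;
    struct cmsghdr *c;

    assert(msg != NULL && buf != NULL);
    if (buflen < CMSG_SPACE(sizeof(pi)))
        return -1;

    memset(buf, 0, CMSG_SPACE(sizeof(pi)));
    msg->msg_control = buf;
    msg->msg_controllen = CMSG_SPACE(sizeof(pi));

    memset(&pi, 0, sizeof(pi));    /* ipi_ifindex = 0, ipi_addr = 0 */
    pi.ipi_spec_dst = local_addr;  /* address the request was sent to */

    c = CMSG_FIRSTHDR(msg);
    c->cmsg_level = IPPROTO_IP;
    c->cmsg_type = IP_PKTINFO;
    c->cmsg_len = CMSG_LEN(sizeof(pi));
    memcpy(CMSG_DATA(c), &pi, sizeof(pi));
    return 0;
}
```

A caller would fill in msg_name and msg_iov as usual and then pass the msghdr to sendmsg(); ipi_addr is left at zero because, as noted above, it is ignored on that path.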


* Re: sendmsg and IP_PKTINFO
  2003-02-19  3:52     ` David S. Miller
@ 2003-02-19  4:13       ` Neil Brown
  2003-02-19  4:03         ` David S. Miller
  0 siblings, 1 reply; 9+ messages in thread
From: Neil Brown @ 2003-02-19  4:13 UTC (permalink / raw)
  To: David S. Miller; +Cc: herbert, linux-kernel, kuznet

On Tuesday February 18, davem@redhat.com wrote:
>    From: Neil Brown <neilb@cse.unsw.edu.au>
>    Date: Tue, 18 Feb 2003 22:00:37 +1100
>    
>    It does go on to say that the outgoing packet will be sent over the
>    same interface, however I feel that is an illogical conclusion given
>    the description of the meaning of the field.
>    
>    So yes, the current behaviour seems to match part of the
>    documentation.  However I argue that the documented behaviour is
>    irrational.
> 
> Alexey and myself totally disagree.  We have described for you
> the intended purpose of this feature.  Please do not try to use
> it in some other way, it may prove to be painful :-)


Thank you for making that clear.

I am currently working towards testing a patch that will fix the
behaviour of glibc.

Currently the sunrpc/svc_udp.c code asks for an IP_PKTINFO from
recvmsg, and passes it verbatim down through sendmsg.
My patch checks that the returned data looks believable and, if it
does, zeros the ipi_ifindex field.

NeilBrown



--- sunrpc/svc_udp.c.orig	2003-02-19 11:25:20.000000000 +1100
+++ sunrpc/svc_udp.c	2003-02-19 14:28:46.000000000 +1100
@@ -256,8 +256,26 @@
       mesgp->msg_controllen = sizeof(xprt->xp_pad)
 			      - sizeof (struct iovec) - sizeof (struct msghdr);
       rlen = recvmsg (xprt->xp_sock, mesgp, 0);
-      if (rlen >= 0)
-	len = mesgp->msg_namelen;
+      if (rlen >= 0) {
+	      struct cmsghdr *cmsg;
+	      len = mesgp->msg_namelen;
+	      cmsg = CMSG_FIRSTHDR(mesgp);
+	      if (cmsg == NULL ||
+		  CMSG_NXTHDR(mesgp, cmsg) != NULL ||
+		  cmsg->cmsg_level != SOL_IP ||
+		  cmsg->cmsg_type != IP_PKTINFO ||
+		  cmsg->cmsg_len != CMSG_LEN(sizeof(struct in_pktinfo))) {
+		      /* Not a simple IP_PKTINFO, ignore it */
+		      mesgp->msg_control = NULL;
+		      mesgp->msg_controllen = 0;
+	      } else {
+		      /* It was a simple IP_PKTINFO, as we expected.
+		       * Discard the interface field.
+		       */
+		      struct in_pktinfo *pkti = (struct in_pktinfo *) CMSG_DATA(cmsg);
+		      pkti->ipi_ifindex = 0;
+	      }
+      }
     }
   else
 #endif
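
Outside the glibc source tree, the same check can be sketched as a standalone function. This is an illustration rather than the patch itself: the name scrub_pktinfo is hypothetical, and the length test compares against CMSG_LEN(...) on the assumption that cmsg_len counts the cmsghdr header as well as the payload.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: prepare control data received by recvmsg() for
 * reuse in sendmsg().  If it is exactly one IP_PKTINFO message, clear
 * ipi_ifindex so the reply is routed normally; otherwise drop the
 * control data altogether.
 * Returns 1 if the pktinfo was kept, 0 if the control data was dropped. */
static int scrub_pktinfo(struct msghdr *msg)
{
    struct cmsghdr *c;
    struct in_pktinfo pi;

    assert(msg != NULL);
    c = CMSG_FIRSTHDR(msg);
    if (c == NULL ||
        CMSG_NXTHDR(msg, c) != NULL ||
        c->cmsg_level != IPPROTO_IP ||
        c->cmsg_type != IP_PKTINFO ||
        c->cmsg_len != CMSG_LEN(sizeof(pi))) {
        /* Not a simple IP_PKTINFO: send without control data. */
        msg->msg_control = NULL;
        msg->msg_controllen = 0;
        return 0;
    }

    /* Copy out, clear the interface index, copy back. */
    memcpy(&pi, CMSG_DATA(c), sizeof(pi));
    pi.ipi_ifindex = 0;
    memcpy(CMSG_DATA(c), &pi, sizeof(pi));
    return 1;
}
```

Calling this between recvmsg() and the matching sendmsg() keeps ipi_spec_dst, so the reply still leaves from the address the request was sent to, while the interface is chosen by the routing table.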

