* [RFC] The Linux kernel IPv6 stack don't follow the RFC 4942 recommendation
@ 2011-11-04 14:46 François-Xavier Le Bail
  2011-11-04 16:24 ` Eric Dumazet
  0 siblings, 1 reply; 11+ messages in thread
From: François-Xavier Le Bail @ 2011-11-04 14:46 UTC (permalink / raw)
  To: netdev

Hi,

I ran some tests on a Linux 3.0 kernel with IPv6 forwarding enabled.

When I ping (ICMPv6 echo request) one of its Subnet-Router anycast addresses
(SRAA, http://tools.ietf.org/html/rfc4291#section-2.6.1),
the Linux kernel replies with a unicast source address, not the anycast one.

When I send an IPv6 UDP packet to a server running on Linux, addressed to one of its SRAAs,
the Linux kernel builds a reply with a unicast source address, not the anycast one.
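
For reference, RFC 4291 section 2.6.1 defines the Subnet-Router anycast
address of a subnet as the subnet prefix followed by all-zero
interface-identifier bits. A minimal sketch of that derivation follows;
the helper name subnet_router_anycast() is hypothetical and not part of
any library.

```c
#include <assert.h>
#include <netinet/in.h>

/* Hypothetical helper: keep the first prefix_len bits of addr and
 * zero every bit after them, which gives the Subnet-Router anycast
 * address of the subnet that addr belongs to. */
static struct in6_addr subnet_router_anycast(const struct in6_addr *addr,
                                             unsigned int prefix_len)
{
	struct in6_addr out = *addr;
	unsigned int i;

	assert(prefix_len <= 128);
	for (i = 0; i < 16; i++) {
		unsigned int bit = i * 8;	/* index of the first bit of byte i */

		if (bit >= prefix_len)
			out.s6_addr[i] = 0;	/* byte lies entirely after the prefix */
		else if (bit + 8 > prefix_len)	/* prefix ends inside this byte */
			out.s6_addr[i] &=
				(unsigned char)(0xff << (8 - (prefix_len - bit)));
	}
	return out;
}
```

Called with 2001::1 and a prefix length of 64, it yields 2001::.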

RFC 4942 states (http://tools.ietf.org/html/rfc4942#section-2.1.6):
2.1.6. Anycast Traffic Identification and Security
[. . .]
   To avoid exposing knowledge about the internal structure of the
   network, it is recommended that anycast servers now take advantage of
   the ability to return responses with the anycast address as the
   source address if possible.

Also, if the source address of the reply differs from the destination address of the request, many applications break.
Please let me know your feedback.
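
To illustrate the kind of breakage meant here: some clients validate the
source of a reply against the address they sent the request to. A minimal
sketch of such a check, assuming a client that keeps the request's
destination around; the function name reply_matches_request() is
hypothetical.

```c
#include <assert.h>
#include <string.h>
#include <netinet/in.h>

/* Hypothetical client-side check: accept a reply only if it comes from
 * the same address and port the request was sent to. Returns 1 on a
 * match, 0 otherwise. */
static int reply_matches_request(const struct sockaddr_in6 *sent_to,
                                 const struct sockaddr_in6 *reply_from)
{
	assert(sent_to && reply_from);
	return sent_to->sin6_port == reply_from->sin6_port &&
	       memcmp(&sent_to->sin6_addr, &reply_from->sin6_addr,
	              sizeof(struct in6_addr)) == 0;
}
```

A UDP socket that has been connect()ed applies an equivalent filter in the
kernel, so a reply sent from the unicast address never reaches a client
that connected to the anycast address.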


Thanks,
Francois-Xavier Le Bail


* Re: [RFC] The Linux kernel IPv6 stack don't follow the RFC 4942 recommendation
  2011-11-04 14:46 [RFC] The Linux kernel IPv6 stack don't follow the RFC 4942 recommendation François-Xavier Le Bail
@ 2011-11-04 16:24 ` Eric Dumazet
  2011-11-05  8:39   ` François-Xavier Le Bail
  0 siblings, 1 reply; 11+ messages in thread
From: Eric Dumazet @ 2011-11-04 16:24 UTC (permalink / raw)
  To: François-Xavier Le Bail; +Cc: netdev

On Friday, 4 November 2011 at 07:46 -0700, François-Xavier Le Bail
wrote:
> Hi,
> 
> I ran some tests on a Linux 3.0 kernel with IPv6 forwarding enabled.
> 
> When I ping (ICMPv6 echo request) one of its Subnet-Router anycast addresses
> (SRAA, http://tools.ietf.org/html/rfc4291#section-2.6.1),
> the Linux kernel replies with a unicast source address, not the anycast one.
> 
> When I send an IPv6 UDP packet to a server running on Linux, addressed to one of its SRAAs,
> the Linux kernel builds a reply with a unicast source address, not the anycast one.
> 

Nothing in the kernel builds a reply to a UDP packet.

I would say the user application is responsible for building an answer and
choosing an appropriate source address.

If your application uses an ANY_ADDR bind, then it must take appropriate
action so that a good source address is used in answers.

For an IPv6 socket, I advise you to take a look at the IPV6_PKTINFO /
IPV6_RECVPKTINFO options.
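
A minimal sketch of the receive side of these options, assuming Linux with
glibc (which declares struct in6_pktinfo only when _GNU_SOURCE is defined).
The helper names enable_recv_pktinfo() and get_in6_pktinfo() are
hypothetical. IPV6_RECVPKTINFO is the socket option that turns delivery on;
IPV6_PKTINFO is the type of the control message that recvmsg() then
returns.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE	/* glibc declares struct in6_pktinfo under this macro */
#endif
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Ask the kernel to attach an IPV6_PKTINFO control message to every
 * datagram received on fd. Returns the setsockopt() result. */
static int enable_recv_pktinfo(int fd)
{
	int on = 1;

	return setsockopt(fd, IPPROTO_IPV6, IPV6_RECVPKTINFO, &on, sizeof(on));
}

/* Copy the IPV6_PKTINFO data (destination address of the datagram and
 * incoming interface index) out of a msghdr filled in by recvmsg().
 * Returns 0 on success, -1 if no such control message is present. */
static int get_in6_pktinfo(struct msghdr *msg, struct in6_pktinfo *out)
{
	struct cmsghdr *c;

	assert(msg && out);
	for (c = CMSG_FIRSTHDR(msg); c; c = CMSG_NXTHDR(msg, c)) {
		if (c->cmsg_level == IPPROTO_IPV6 &&
		    c->cmsg_type == IPV6_PKTINFO) {
			memcpy(out, CMSG_DATA(c), sizeof(*out));
			return 0;
		}
	}
	return -1;
}
```

Checking cmsg_level as well as cmsg_type avoids confusing control messages
from different protocol levels that happen to share a type number.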


> RFC 4942 states (http://tools.ietf.org/html/rfc4942#section-2.1.6):
> 2.1.6. Anycast Traffic Identification and Security
> [. . .]
>    To avoid exposing knowledge about the internal structure of the
>    network, it is recommended that anycast servers now take advantage of
>    the ability to return responses with the anycast address as the
>    source address if possible.
> 
> Also, if the source address of the reply differs from the destination address of the request, many applications break.
> Please let me know your feedback.
> 

'anycast servers' are a combination of kernel and userland parts.


* Re: [RFC] The Linux kernel IPv6 stack don't follow the RFC 4942 recommendation
  2011-11-04 16:24 ` Eric Dumazet
@ 2011-11-05  8:39   ` François-Xavier Le Bail
  2011-11-05  9:20     ` Eric Dumazet
  2011-11-05  9:30     ` Eric Dumazet
  0 siblings, 2 replies; 11+ messages in thread
From: François-Xavier Le Bail @ 2011-11-05  8:39 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netdev

>From: Eric Dumazet <eric.dumazet@gmail.com>
>To: François-Xavier Le Bail <fx.lebail@yahoo.com>
>Sent: Friday, November 4, 2011 5:24 PM
>Subject: Re: [RFC] The Linux kernel IPv6 stack don't follow the RFC 4942 recommendation
>
>On Friday, 4 November 2011 at 07:46 -0700, François-Xavier Le Bail
>wrote:
>> I ran some tests on a Linux 3.0 kernel with IPv6 forwarding enabled.
>> 
>> When I ping (ICMPv6 echo request) one of its Subnet-Router anycast addresses
>> (SRAA, http://tools.ietf.org/html/rfc4291#section-2.6.1),
>> the Linux kernel replies with a unicast source address, not the anycast one.
>> 
>> When I send an IPv6 UDP packet to a server running on Linux, addressed to one of its SRAAs,
>> the Linux kernel builds a reply with a unicast source address, not the anycast one.


Thanks for your answer.

>Nothing in the kernel builds a reply to a UDP packet.

You are right. I meant that in some cases the kernel selects the source address
of the reply.

>I would say the user application is responsible for building an answer and
>choosing an appropriate source address.


OK.


>If your application uses an ANY_ADDR bind, then it must take appropriate
>action so that a good source address is used in answers.
>
>For an IPv6 socket, I advise you to take a look at the IPV6_PKTINFO /
>IPV6_RECVPKTINFO options.


I will study and test these options for my application server.

>> RFC 4942 states (http://tools.ietf.org/html/rfc4942#section-2.1.6):
>> 2.1.6. Anycast Traffic Identification and Security
>> [. . .]
>>    To avoid exposing knowledge about the internal structure of the
>>    network, it is recommended that anycast servers now take advantage of
>>    the ability to return responses with the anycast address as the
>>    source address if possible.
>> 
>> Also, if the source address of the reply differs from the destination address of the request, many applications break.
>> Please let me know your feedback.
>> 
>
>'anycast servers' are a combination of kernel and userland parts.


Agreed, but there remains the case of ICMPv6 echo request/reply, which I think is handled in the kernel.

Francois-Xavier


* Re: [RFC] The Linux kernel IPv6 stack don't follow the RFC 4942 recommendation
  2011-11-05  8:39   ` François-Xavier Le Bail
@ 2011-11-05  9:20     ` Eric Dumazet
  2011-11-18 10:15       ` Bjørn Mork
  2011-11-05  9:30     ` Eric Dumazet
  1 sibling, 1 reply; 11+ messages in thread
From: Eric Dumazet @ 2011-11-05  9:20 UTC (permalink / raw)
  To: François-Xavier Le Bail; +Cc: netdev

On Saturday, 5 November 2011 at 01:39 -0700, François-Xavier Le Bail
wrote:

> Agreed, but there remains the case of ICMPv6 echo request/reply, which I think is handled in the kernel.

Yes, please describe your setup and how to reproduce the problem.


* Re: [RFC] The Linux kernel IPv6 stack don't follow the RFC 4942 recommendation
  2011-11-05  8:39   ` François-Xavier Le Bail
  2011-11-05  9:20     ` Eric Dumazet
@ 2011-11-05  9:30     ` Eric Dumazet
  2011-11-10 10:58       ` François-Xavier Le Bail
  1 sibling, 1 reply; 11+ messages in thread
From: Eric Dumazet @ 2011-11-05  9:30 UTC (permalink / raw)
  To: François-Xavier Le Bail; +Cc: netdev

On Saturday, 5 November 2011 at 01:39 -0700, François-Xavier Le Bail
wrote:

> 
> I will study and test these options for my application server

Here is a sample for the IPv4 case: a udpecho service that uses
IP_PKTINFO and IP_RECVTOS/IP_TOS so that it works on a multihomed machine
and reflects the TOS field as well.

#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <stdlib.h>	/* atoi() */
#include <string.h>
#include <stdio.h>
#include <unistd.h>
#include <arpa/inet.h>

#define PORT 4040

int pktinfo_get(struct msghdr *my_hdr, struct in_pktinfo *pktinfo)
{
	int res = -1;

	if (my_hdr->msg_controllen > 0) {
		struct cmsghdr *get_cmsg;
		for (get_cmsg = CMSG_FIRSTHDR(my_hdr); get_cmsg;
			get_cmsg = CMSG_NXTHDR(my_hdr, get_cmsg)) {
			/* match the protocol level too, not only the type */
			if (get_cmsg->cmsg_level == SOL_IP &&
			    get_cmsg->cmsg_type == IP_PKTINFO) {
				struct in_pktinfo *get_pktinfo = (struct in_pktinfo *)CMSG_DATA(get_cmsg);
				memcpy(pktinfo, get_pktinfo, sizeof(*pktinfo));
				res = 0;
			}
		}
	}
	return res;
}

int tos_get(struct msghdr *my_hdr, unsigned char *tos)
{
	int res = -1;

	if (my_hdr->msg_controllen > 0) {
		struct cmsghdr *get_cmsg;
		for (get_cmsg = CMSG_FIRSTHDR(my_hdr); get_cmsg;
			get_cmsg = CMSG_NXTHDR(my_hdr, get_cmsg)) {
			/* match the protocol level too, not only the type */
			if (get_cmsg->cmsg_level == SOL_IP &&
			    get_cmsg->cmsg_type == IP_TOS) {
				unsigned char *pkttos = (unsigned char *)CMSG_DATA(get_cmsg);
				*tos = *pkttos;
				res = 0;
			}
		}
	}
	return res;
}

int main(int argc, char *argv[])
{
	int fd = socket(AF_INET, SOCK_DGRAM, 0);
	struct sockaddr_in addr, rem_addr;
	int res, on = 1;
	struct msghdr msghdr;
	struct iovec vec[1];
	char cbuf[512];
	char frame[4096];
	struct in_pktinfo pktinfo;
	int c, count = 1000000;
	unsigned char last_tos = 0;

	while ((c = getopt(argc, argv, "c:")) != -1) {
		if (c == 'c')
			count = atoi(optarg);
	}
	memset(&addr, 0, sizeof(addr));
	addr.sin_family = AF_INET;
	addr.sin_port = htons(PORT);
	if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) == -1) {
		perror("bind");
		return 1;
	}
	setsockopt(fd, SOL_IP, IP_PKTINFO, &on, sizeof(on));
	setsockopt(fd, SOL_IP, IP_RECVTOS, &on, sizeof(on));

	while (1) {
		unsigned char tos;

		memset(&msghdr, 0, sizeof(msghdr));
		msghdr.msg_control = cbuf;
		msghdr.msg_controllen = sizeof(cbuf);
		msghdr.msg_iov = vec;
		msghdr.msg_iovlen = 1;
		vec[0].iov_base = frame;
		vec[0].iov_len = sizeof(frame);
		msghdr.msg_name = &rem_addr;
		msghdr.msg_namelen = sizeof(rem_addr);
		res = recvmsg(fd, &msghdr, 0);
		if (res == -1)
			break;
		if (pktinfo_get(&msghdr, &pktinfo) == 0) {
			/* printf("Got IP_PKTINFO dst addr=%s\n", inet_ntoa(pktinfo.ipi_spec_dst)); */
		}
		if (tos_get(&msghdr, &tos) == 0) {
			/* IP_TOS option won't be used in sendmsg(), we must use setsockopt() instead */
			if (tos != last_tos) {
				if (setsockopt(fd, SOL_IP, IP_TOS, &tos, sizeof(tos)) == 0)
					last_tos = tos;
			}
		}
		/* ok, just echo reply this frame.
		 * Using sendmsg() will provide IP_PKTINFO back to kernel
		 * to let it use the 'right' source address
		 * (destination address of the incoming packet)
		 */
		vec[0].iov_len = res;
		sendmsg(fd, &msghdr, 0);
		if (--count == 0)
			break;
	}
	return 0;
}


* Re: [RFC] The Linux kernel IPv6 stack don't follow the RFC 4942 recommendation
  2011-11-05  9:30     ` Eric Dumazet
@ 2011-11-10 10:58       ` François-Xavier Le Bail
  2011-11-10 11:27         ` Eric Dumazet
  2011-11-10 13:25         ` How to get the port values Naveen B N (nbn)
  0 siblings, 2 replies; 11+ messages in thread
From: François-Xavier Le Bail @ 2011-11-10 10:58 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netdev

----- Original Message -----

> From: Eric Dumazet <eric.dumazet@gmail.com>
> To: François-Xavier Le Bail <fx.lebail@yahoo.com>
> Cc: "netdev@vger.kernel.org" <netdev@vger.kernel.org>
> Sent: Saturday, November 5, 2011 10:30 AM
> Subject: Re: [RFC] The Linux kernel IPv6 stack don't follow the RFC 4942 recommendation
> 
> On Saturday, 5 November 2011 at 01:39 -0700, François-Xavier Le Bail
> wrote:
> 
>> 
>>  I will study and test these options for my application server
> 
> Here is a sample for the IPv4 case: a udpecho service that uses
> IP_PKTINFO and IP_RECVTOS/IP_TOS so that it works on a multihomed machine
> and reflects the TOS field as well.
> [. . .]

Hi,

I have updated the code for IPv6.

When a UDP client on another host sends to a unicast address on a multihomed Linux 3.0.0 host, it works.
For example:
setup 2001::1 on eth0, 2a01::1 on eth1.
send to 2001::1, recv from 2001::1.
send to 2a01::1, recv from 2a01::1.

When the UDP client on another host sends to a Subnet-Router anycast address on the multihomed Linux 3.0.0 host, it fails.
Sending to 2001:: or 2a01:: makes the udpecho server print "sendmsg: Invalid argument".

Any idea?

Thanks,
Francois-Xavier

Here is the server code:
----------------------------------------------------------------------
// Sample for the IPv6 case: a udpecho service that uses
// IPV6_RECVPKTINFO so that it works on a multihomed machine.

#define _GNU_SOURCE	/* makes glibc's <netinet/in.h> declare struct in6_pktinfo */
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <stdlib.h>	/* atoi() */
#include <string.h>
#include <stdio.h>
#include <unistd.h>
#include <arpa/inet.h>

#define PORT 4040

int pktinfo_get(struct msghdr *my_hdr, struct in6_pktinfo *pktinfo)
{
    int res = -1;

    fprintf(stderr, "pktinfo_get()\n");
    if (my_hdr->msg_controllen > 0) {
        struct cmsghdr *get_cmsg;
        for (get_cmsg = CMSG_FIRSTHDR(my_hdr); get_cmsg;
            get_cmsg = CMSG_NXTHDR(my_hdr, get_cmsg)) {
            /* match the protocol level too, not only the type */
            if (get_cmsg->cmsg_level == IPPROTO_IPV6 &&
                get_cmsg->cmsg_type == IPV6_PKTINFO) {
                struct in6_pktinfo *get_pktinfo = (struct in6_pktinfo *)CMSG_DATA(get_cmsg);
                memcpy(pktinfo, get_pktinfo, sizeof(*pktinfo));
                res = 0;
            }
        }
    }
    return res;
}

int main(int argc, char *argv[])
{
    int fd = socket(AF_INET6, SOCK_DGRAM, 0);
    struct sockaddr_in6 addr, rem_addr;
    int res, on = 1;
    struct msghdr msghdr;
    struct iovec vec[1];
    char cbuf[512];
    char frame[4096];
    struct in6_pktinfo pktinfo;
    int c, count = 1000000;

    while ((c = getopt(argc, argv, "c:")) != -1) {
        if (c == 'c')
            count = atoi(optarg);
    }
    memset(&addr, 0, sizeof(addr));
    addr.sin6_family = AF_INET6;
    addr.sin6_port = htons(PORT);
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) == -1) {
        perror("bind");
        return 1;
    }
    //setsockopt(fd, IPPROTO_IPV6, IPV6_PKTINFO, &on, sizeof(on));
    setsockopt(fd, IPPROTO_IPV6, IPV6_RECVPKTINFO, &on, sizeof(on));

    while (1) {

        memset(&msghdr, 0, sizeof(msghdr));
        msghdr.msg_control = cbuf;
        msghdr.msg_controllen = sizeof(cbuf);
        msghdr.msg_iov = vec;
        msghdr.msg_iovlen = 1;
        vec[0].iov_base = frame;
        vec[0].iov_len = sizeof(frame);
        msghdr.msg_name = &rem_addr;
        msghdr.msg_namelen = sizeof(rem_addr);
        res = recvmsg(fd, &msghdr, 0);
        if (res == -1)
            break;
        if (pktinfo_get(&msghdr, &pktinfo) == 0) {
            /* To print pktinfo.ipi6_addr here, use inet_ntop(AF_INET6, ...);
             * inet_ntoa() only handles IPv4 addresses. */
        }
        /* ok, just echo reply this frame.
        * Using sendmsg() will provide IPV6_PKTINFO back to kernel
        * to let it use the 'right' source address
        * (destination address of the incoming packet)
        */
        vec[0].iov_len = res;
        res = sendmsg(fd, &msghdr, 0);
        if (res == -1) {
            perror ("sendmsg");
            break;
        }
        if (--count == 0)
            break;
    }
    return 0;
}

----------------------------------------------------------------------
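
The server above hands the control buffer filled in by recvmsg() straight
back to sendmsg(). An alternative is to build the IPV6_PKTINFO control
message explicitly, which makes clear which source address is being
requested. A minimal sketch, assuming Linux with glibc; the helper name
set_in6_pktinfo() is hypothetical. It only constructs the control message:
whether the kernel accepts a given address as source is a separate
question, and on the 3.0 kernel tested above an anycast address is
reported to fail with EINVAL.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE	/* glibc declares struct in6_pktinfo under this macro */
#endif
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Hypothetical helper: attach a single IPV6_PKTINFO control message to
 * msg, requesting src as the source address of the outgoing datagram.
 * An ifindex of 0 leaves the choice of interface to the kernel.
 * cbuf must be suitably aligned for struct cmsghdr.
 * Returns 0 on success, -1 if cbuf is too small. */
static int set_in6_pktinfo(struct msghdr *msg, void *cbuf, size_t cbuf_len,
                           const struct in6_addr *src, unsigned int ifindex)
{
	struct in6_pktinfo info;
	struct cmsghdr *c;

	assert(msg && cbuf && src);
	if (cbuf_len < CMSG_SPACE(sizeof(info)))
		return -1;

	memset(cbuf, 0, cbuf_len);
	msg->msg_control = cbuf;
	msg->msg_controllen = CMSG_SPACE(sizeof(info));

	c = CMSG_FIRSTHDR(msg);
	c->cmsg_level = IPPROTO_IPV6;
	c->cmsg_type = IPV6_PKTINFO;
	c->cmsg_len = CMSG_LEN(sizeof(info));

	memset(&info, 0, sizeof(info));
	info.ipi6_addr = *src;
	info.ipi6_ifindex = ifindex;
	memcpy(CMSG_DATA(c), &info, sizeof(info));
	return 0;
}
```

In the echo loop, this would be called with the ipi6_addr obtained from
pktinfo_get() just before sendmsg().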


* Re: [RFC] The Linux kernel IPv6 stack don't follow the RFC 4942 recommendation
  2011-11-10 10:58       ` François-Xavier Le Bail
@ 2011-11-10 11:27         ` Eric Dumazet
  2011-11-10 12:54           ` François-Xavier Le Bail
  2011-11-10 13:25         ` How to get the port values Naveen B N (nbn)
  1 sibling, 1 reply; 11+ messages in thread
From: Eric Dumazet @ 2011-11-10 11:27 UTC (permalink / raw)
  To: François-Xavier Le Bail; +Cc: netdev

On Thursday, 10 November 2011 at 02:58 -0800, François-Xavier Le Bail
wrote:
> Hi,
> 
> I have updated the code for IPv6.
> 
> When a UDP client on another host sends to a unicast address on a multihomed Linux 3.0.0 host, it works.
> For example:
> setup 2001::1 on eth0, 2a01::1 on eth1.
> send to 2001::1, recv from 2001::1.
> send to 2a01::1, recv from 2a01::1.
> 
> When the UDP client on another host sends to a Subnet-Router anycast address on the multihomed Linux 3.0.0 host, it fails.
> Sending to 2001:: or 2a01:: makes the udpecho server print "sendmsg: Invalid argument".
> 
> Any idea?

Could you describe the setup of this machine?

ip -6 addr
ip -6 ro

...


* Re: [RFC] The Linux kernel IPv6 stack don't follow the RFC 4942 recommendation
  2011-11-10 11:27         ` Eric Dumazet
@ 2011-11-10 12:54           ` François-Xavier Le Bail
  2011-11-10 15:23             ` François-Xavier Le Bail
  0 siblings, 1 reply; 11+ messages in thread
From: François-Xavier Le Bail @ 2011-11-10 12:54 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netdev





----- Original Message -----
> From: Eric Dumazet <eric.dumazet@gmail.com>
> To: François-Xavier Le Bail <fx.lebail@yahoo.com>
> Cc: "netdev@vger.kernel.org" <netdev@vger.kernel.org>
> Sent: Thursday, November 10, 2011 12:27 PM
> Subject: Re: [RFC] The Linux kernel IPv6 stack don't follow the RFC 4942 recommendation
> 
> Could you describe the setup of this machine ?
> 
> ip -6 addr
> ip -6 ro

The server has IPv6 forwarding on.

# ip -6 a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qlen 1000
    inet6 2a01::1/64 scope global 
       valid_lft forever preferred_lft forever
    inet6 fe80::20c:29ff:fecc:bc43/64 scope link 
       valid_lft forever preferred_lft forever
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qlen 1000
    inet6 2001::1/64 scope global 
       valid_lft forever preferred_lft forever
    inet6 fe80::20c:29ff:fecc:bc4d/64 scope link 
       valid_lft forever preferred_lft forever

# ip -6 r
2001::/64 dev eth1  proto kernel  metric 256 
2a01::/64 dev eth0  proto kernel  metric 256 
fe80::/64 dev eth1  proto kernel  metric 256 
fe80::/64 dev eth0  proto kernel  metric 256 
default via 2001::2 dev eth1  metric 1024 

2001::2 is the address of the other (client) host.


* How to get the port values
  2011-11-10 10:58       ` François-Xavier Le Bail
  2011-11-10 11:27         ` Eric Dumazet
@ 2011-11-10 13:25         ` Naveen B N (nbn)
  1 sibling, 0 replies; 11+ messages in thread
From: Naveen B N (nbn) @ 2011-11-10 13:25 UTC (permalink / raw)
  To: François-Xavier Le Bail, Eric Dumazet; +Cc: netdev

Hi All,

How can I get access to the port values from sock *sk in the
function rawv6_sendmsg(), before xfrm_lookup(), in net/ipv6/raw.c,
in the case where the application itself builds the headers [IP, UDP]?
I want traffic on port 500 from my application to bypass IPsec.

Regards
Naveen


* Re: [RFC] The Linux kernel IPv6 stack don't follow the RFC 4942 recommendation
  2011-11-10 12:54           ` François-Xavier Le Bail
@ 2011-11-10 15:23             ` François-Xavier Le Bail
  0 siblings, 0 replies; 11+ messages in thread
From: François-Xavier Le Bail @ 2011-11-10 15:23 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netdev

----- Original Message -----

> From: François-Xavier Le Bail <fx.lebail@yahoo.com>
> To: Eric Dumazet <eric.dumazet@gmail.com>
> Cc: "netdev@vger.kernel.org" <netdev@vger.kernel.org>
> Sent: Thursday, November 10, 2011 1:54 PM
> Subject: Re: [RFC] The Linux kernel IPv6 stack don't follow the RFC 4942 recommendation
> 
> The server has IPv6 forwarding on.
> 
> # ip -6 a
> 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 
>     inet6 ::1/128 scope host 
>        valid_lft forever preferred_lft forever
> 2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qlen 1000
>     inet6 2a01::1/64 scope global 
>        valid_lft forever preferred_lft forever
>     inet6 fe80::20c:29ff:fecc:bc43/64 scope link 
>        valid_lft forever preferred_lft forever
> 3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qlen 1000
>     inet6 2001::1/64 scope global 
>        valid_lft forever preferred_lft forever
>     inet6 fe80::20c:29ff:fecc:bc4d/64 scope link 
>        valid_lft forever preferred_lft forever
> 
> # ip -6 r
> 2001::/64 dev eth1  proto kernel  metric 256 
> 2a01::/64 dev eth0  proto kernel  metric 256 
> fe80::/64 dev eth1  proto kernel  metric 256 
> fe80::/64 dev eth0  proto kernel  metric 256 
> default via 2001::2 dev eth1  metric 1024 
> 
> 2001::2 is the address of the other (client) host.

I put a printk in addrconf.c at the beginning of the ipv6_get_saddr_eval() function.
In the working (unicast) case, nothing is printed.
In the failing (anycast) case, many printk messages appear.

Is there a problem in the RFC 3484 code?
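
For context, RFC 3484 source address selection compares candidate source
addresses against the destination with an ordered list of rules, one of
which (Rule 8) prefers the candidate sharing the longest prefix with the
destination. A minimal sketch of that single comparison follows; the
function name common_prefix_len() is hypothetical and this is not the
kernel's ipv6_get_saddr_eval().

```c
#include <assert.h>
#include <netinet/in.h>

/* Hypothetical helper: number of leading bits that two IPv6 addresses
 * have in common (0 to 128). */
static unsigned int common_prefix_len(const struct in6_addr *a,
                                      const struct in6_addr *b)
{
	unsigned int i, bits = 0;

	assert(a && b);
	for (i = 0; i < 16; i++) {
		unsigned char diff = a->s6_addr[i] ^ b->s6_addr[i];

		if (diff == 0) {
			bits += 8;	/* whole byte matches */
			continue;
		}
		/* count the matching high-order bits of the first differing byte */
		while (!(diff & 0x80)) {
			bits++;
			diff <<= 1;
		}
		break;
	}
	return bits;
}
```

With the addresses from the setup above, 2001::1 and 2001::2 share 126
bits, while 2001::1 and 2a01::1 share only 4.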

Thanks,
Francois-Xavier


* Re: [RFC] The Linux kernel IPv6 stack don't follow the RFC 4942 recommendation
  2011-11-05  9:20     ` Eric Dumazet
@ 2011-11-18 10:15       ` Bjørn Mork
  0 siblings, 0 replies; 11+ messages in thread
From: Bjørn Mork @ 2011-11-18 10:15 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: François-Xavier Le Bail, netdev

Eric Dumazet <eric.dumazet@gmail.com> writes:
> On Saturday, 5 November 2011 at 01:39 -0700, François-Xavier Le Bail
> wrote:
>
>> Agreed, but there remains the case of ICMPv6 echo request/reply, which I think is handled in the kernel.
>
> Yes, please describe your setup and how to reproduce the problem.

I agree with François-Xavier that the current behaviour is confusing. An
anycast address should really not be treated differently from any other
global unicast address by the kernel.


1) The router anycast address does not show up in the list of local
   addresses:

router:~$ ip -6 addr show dev br0.666
15: br0.666@br0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 
    inet6 2001:db8:9:29a::1/64 scope global 
       valid_lft forever preferred_lft forever
    inet6 fe80::215:17ff:fe1e:5e35/64 scope link 
       valid_lft forever preferred_lft forever



2) Pinging the anycast address produces a reply from another
address:

frtest1:~# ping6 -n -c1 2001:db8:9:29a::  
PING 2001:db8:9:29a::(2001:db8:9:29a::) 56 data bytes
64 bytes from 2001:db8:9:29a::1: icmp_seq=1 ttl=64 time=0.275 ms

--- 2001:db8:9:29a:: ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.275/0.275/0.275/0.000 ms
frtest1:~# ping6 -n -c1 2001:db8:9:29a::1  
PING 2001:db8:9:29a::1(2001:db8:9:29a::1) 56 data bytes
64 bytes from 2001:db8:9:29a::1: icmp_seq=1 ttl=64 time=0.229 ms

--- 2001:db8:9:29a::1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.229/0.229/0.229/0.000 ms



3) The previous issue becomes even more obviously buggy if we add a
second address in the same subnet to the router while pinging the
anycast address.  Here I'm doing

  router:/tmp# ip addr add 2001:db8:9:29a::5/64 dev br0.666

on the router while I'm pinging the anycast address from another host:


frtest1:~# ping6 -n  2001:db8:9:29a::  
PING 2001:db8:9:29a::(2001:db8:9:29a::) 56 data bytes
64 bytes from 2001:db8:9:29a::1: icmp_seq=1 ttl=64 time=330 ms
64 bytes from 2001:db8:9:29a::1: icmp_seq=2 ttl=64 time=0.268 ms
64 bytes from 2001:db8:9:29a::1: icmp_seq=3 ttl=64 time=0.234 ms
64 bytes from 2001:db8:9:29a::5: icmp_seq=4 ttl=64 time=0.232 ms
64 bytes from 2001:db8:9:29a::5: icmp_seq=5 ttl=64 time=0.298 ms
64 bytes from 2001:db8:9:29a::5: icmp_seq=6 ttl=64 time=0.279 ms


Now, THAT doesn't look good, does it?


4) Just to add to issue 2: after adding the second address on the
router, we have 3 global unicast addresses in the same subnet. You would
then expect the kernel to either use the destination address of the
incoming requests as source, or always select the same address as
source.

It does neither.  This is inconsistent, unless you treat the router
anycast address as something special.  Which you should not:


frtest1:~# ping6 -n -c1 2001:db8:9:29a::  
PING 2001:db8:9:29a::(2001:db8:9:29a::) 56 data bytes
64 bytes from 2001:db8:9:29a::5: icmp_seq=1 ttl=64 time=337 ms

--- 2001:db8:9:29a:: ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 337.251/337.251/337.251/0.000 ms
frtest1:~# ping6 -n -c1 2001:db8:9:29a::1  
PING 2001:db8:9:29a::1(2001:db8:9:29a::1) 56 data bytes
64 bytes from 2001:db8:9:29a::1: icmp_seq=1 ttl=64 time=0.163 ms

--- 2001:db8:9:29a::1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.163/0.163/0.163/0.000 ms
frtest1:~# ping6 -n -c1 2001:db8:9:29a::5 
PING 2001:db8:9:29a::5(2001:db8:9:29a::5) 56 data bytes
64 bytes from 2001:db8:9:29a::5: icmp_seq=1 ttl=64 time=3.87 ms

--- 2001:db8:9:29a::5 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 3.873/3.873/3.873/0.000 ms





Bjørn
