All of lore.kernel.org
 help / color / mirror / Atom feed
* iproute2 mpls max labels
@ 2016-07-16 18:24 Magnus Bergroth
  2016-07-21 18:53 ` Roopa Prabhu
  0 siblings, 1 reply; 10+ messages in thread
From: Magnus Bergroth @ 2016-07-16 18:24 UTC (permalink / raw)
  To: netdev


Wanted to use more than the default maximum of 8 mpls labels. Max labels
seems to be hardcode to 8 in two places.

--- iproute2-4.6.0/lib/utils.c    2016-05-18 20:56:02.000000000 +0200
+++ iproute2-4.6.0-bergroth/lib/utils.c    2016-07-16 20:12:10.714958071
+0200
@@ -476,7 +476,7 @@
         addr->bytelen = 4;
         addr->bitlen = 20;
         /* How many bytes do I need? */
-        for (i = 0; i < 8; i++) {
+        for (i = 0; i < MPLS_MAX_LABELS; i++) {
             if (ntohl(addr->data[i]) & MPLS_LS_S_MASK) {
                 addr->bytelen = (i + 1)*4;
                 break;


--- iproute2-4.6.0/include/utils.h    2016-05-18 20:56:02.000000000 +0200
+++ iproute2-4.6.0-bergroth/include/utils.h    2016-07-15
11:55:57.297681742 +0200
@@ -54,6 +54,9 @@
 #define NEXT_ARG_FWD() do { argv++; argc--; } while(0)
 #define PREV_ARG() do { argv--; argc++; } while(0)

+/* Maximum number of labels the mpls helpers support */
+#define MPLS_MAX_LABELS 8
+
 typedef struct
 {
     __u16 flags;
@@ -61,7 +64,7 @@
     __s16 bitlen;
     /* These next two fields match rtvia */
     __u16 family;
-    __u32 data[8];
+    __u32 data[MPLS_MAX_LABELS];
 } inet_prefix;

 #define PREFIXLEN_SPECIFIED 1
@@ -88,9 +91,6 @@
 # define AF_MPLS 28
 #endif

-/* Maximum number of labels the mpls helpers support */
-#define MPLS_MAX_LABELS 8
-
 __u32 get_addr32(const char *name);
 int get_addr_1(inet_prefix *dst, const char *arg, int family);
 int get_prefix_1(inet_prefix *dst, char *arg, int family);




Kindly

Magnus Bergroth

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: iproute2 mpls max labels
  2016-07-16 18:24 iproute2 mpls max labels Magnus Bergroth
@ 2016-07-21 18:53 ` Roopa Prabhu
  2016-07-21 19:43   ` Magnus Bergroth
  2016-07-21 20:00   ` Eric W. Biederman
  0 siblings, 2 replies; 10+ messages in thread
From: Roopa Prabhu @ 2016-07-21 18:53 UTC (permalink / raw)
  To: Magnus Bergroth; +Cc: netdev, Robert Shearman, ebiederm

On 7/16/16, 11:24 AM, Magnus Bergroth wrote:
> Wanted to use more than the default maximum of 8 mpls labels. Max labels
> seems to be hardcode to 8 in two places.
>
> --- iproute2-4.6.0/lib/utils.c    2016-05-18 20:56:02.000000000 +0200
> +++ iproute2-4.6.0-bergroth/lib/utils.c    2016-07-16 20:12:10.714958071
> +0200
> @@ -476,7 +476,7 @@
>          addr->bytelen = 4;
>          addr->bitlen = 20;
>          /* How many bytes do I need? */
> -        for (i = 0; i < 8; i++) {
> +        for (i = 0; i < MPLS_MAX_LABELS; i++) {
>              if (ntohl(addr->data[i]) & MPLS_LS_S_MASK) {
>                  addr->bytelen = (i + 1)*4;
>                  break;
>
>
> --- iproute2-4.6.0/include/utils.h    2016-05-18 20:56:02.000000000 +0200
> +++ iproute2-4.6.0-bergroth/include/utils.h    2016-07-15
> 11:55:57.297681742 +0200
> @@ -54,6 +54,9 @@
>  #define NEXT_ARG_FWD() do { argv++; argc--; } while(0)
>  #define PREV_ARG() do { argv--; argc++; } while(0)
>
> +/* Maximum number of labels the mpls helpers support */
> +#define MPLS_MAX_LABELS 8
> +
>  typedef struct
>  {
>      __u16 flags;
> @@ -61,7 +64,7 @@
>      __s16 bitlen;
>      /* These next two fields match rtvia */
>      __u16 family;
> -    __u32 data[8];
> +    __u32 data[MPLS_MAX_LABELS];
>  } inet_prefix;
>
>  #define PREFIXLEN_SPECIFIED 1
> @@ -88,9 +91,6 @@
>  # define AF_MPLS 28
>  #endif
>
> -/* Maximum number of labels the mpls helpers support */
> -#define MPLS_MAX_LABELS 8
> -
>  __u32 get_addr32(const char *name);
>  int get_addr_1(inet_prefix *dst, const char *arg, int family);
>  int get_prefix_1(inet_prefix *dst, char *arg, int family);
>
>
I did not realize it is hardcoded to 8 in iproute2. Because kernel has a hard coded limit of
2.
I think we need to fix it in a few places:
a) we should move the kernel #define to a uapi header file which iproute2 can use
b) there has been a general ask to bump the kernel MAX_LABELS from 2 and I don't see
a problem with it yet. so, we could bump it to 8.

Were you planning to post patches for one or both of the above ?.

I can post them too. Let me know.

Thanks,
Roopa

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: iproute2 mpls max labels
  2016-07-21 18:53 ` Roopa Prabhu
@ 2016-07-21 19:43   ` Magnus Bergroth
  2016-07-23 23:04     ` Roopa Prabhu
  2016-07-21 20:00   ` Eric W. Biederman
  1 sibling, 1 reply; 10+ messages in thread
From: Magnus Bergroth @ 2016-07-21 19:43 UTC (permalink / raw)
  To: Roopa Prabhu; +Cc: netdev, Robert Shearman, ebiederm



> Roopa Prabhu <mailto:roopa@cumulusnetworks.com>
> 21 juli 2016 20:53
> I did not realize it is hardcoded to 8 in iproute2. Because kernel has
> a hard coded limit of
> 2.
> I think we need to fix it in a few places:
> a) we should move the kernel #define to a uapi header file which
> iproute2 can use
> b) there has been a general ask to bump the kernel MAX_LABELS from 2
> and I don't see
> a problem with it yet. so, we could bump it to 8.
>
> Were you planning to post patches for one or both of the above ?.
>
> I can post them too. Let me know.
I would be happy if you could post them both as I very new to the kernel
development and I'm not entirely sure how to do it in the right way.

Kindly

Magnus
>
> Thanks,
> Roopa
>
>
>
>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: iproute2 mpls max labels
  2016-07-21 18:53 ` Roopa Prabhu
  2016-07-21 19:43   ` Magnus Bergroth
@ 2016-07-21 20:00   ` Eric W. Biederman
  2016-07-21 21:08     ` Magnus Bergroth
  2016-07-22  6:16     ` Roopa Prabhu
  1 sibling, 2 replies; 10+ messages in thread
From: Eric W. Biederman @ 2016-07-21 20:00 UTC (permalink / raw)
  To: Roopa Prabhu; +Cc: Magnus Bergroth, netdev, Robert Shearman

Roopa Prabhu <roopa@cumulusnetworks.com> writes:

> On 7/16/16, 11:24 AM, Magnus Bergroth wrote:
>> Wanted to use more than the default maximum of 8 mpls labels. Max labels
>> seems to be hardcode to 8 in two places.
>>
>> --- iproute2-4.6.0/lib/utils.c    2016-05-18 20:56:02.000000000 +0200
>> +++ iproute2-4.6.0-bergroth/lib/utils.c    2016-07-16 20:12:10.714958071
>> +0200
>> @@ -476,7 +476,7 @@
>>          addr->bytelen = 4;
>>          addr->bitlen = 20;
>>          /* How many bytes do I need? */
>> -        for (i = 0; i < 8; i++) {
>> +        for (i = 0; i < MPLS_MAX_LABELS; i++) {
>>              if (ntohl(addr->data[i]) & MPLS_LS_S_MASK) {
>>                  addr->bytelen = (i + 1)*4;
>>                  break;
>>
>>
>> --- iproute2-4.6.0/include/utils.h    2016-05-18 20:56:02.000000000 +0200
>> +++ iproute2-4.6.0-bergroth/include/utils.h    2016-07-15
>> 11:55:57.297681742 +0200
>> @@ -54,6 +54,9 @@
>>  #define NEXT_ARG_FWD() do { argv++; argc--; } while(0)
>>  #define PREV_ARG() do { argv--; argc++; } while(0)
>>
>> +/* Maximum number of labels the mpls helpers support */
>> +#define MPLS_MAX_LABELS 8
>> +
>>  typedef struct
>>  {
>>      __u16 flags;
>> @@ -61,7 +64,7 @@
>>      __s16 bitlen;
>>      /* These next two fields match rtvia */
>>      __u16 family;
>> -    __u32 data[8];
>> +    __u32 data[MPLS_MAX_LABELS];
>>  } inet_prefix;
>>

This structure is not MPLS specific so that is not appropriate to use
MPLS_MAX_LABELS when definiting the structure.  Likewise changing this
structure and limiting the changes to mpls parts of the code is not
appropriate.

>>  #define PREFIXLEN_SPECIFIED 1
>> @@ -88,9 +91,6 @@
>>  # define AF_MPLS 28
>>  #endif
>>
>> -/* Maximum number of labels the mpls helpers support */
>> -#define MPLS_MAX_LABELS 8
>> -
>>  __u32 get_addr32(const char *name);
>>  int get_addr_1(inet_prefix *dst, const char *arg, int family);
>>  int get_prefix_1(inet_prefix *dst, char *arg, int family);
>>
>>
> I did not realize it is hardcoded to 8 in iproute2. Because kernel has a hard coded limit of
> 2.
> I think we need to fix it in a few places:
> a) we should move the kernel #define to a uapi header file which iproute2 can use
> b) there has been a general ask to bump the kernel MAX_LABELS from 2 and I don't see
> a problem with it yet. so, we could bump it to 8.
>
> Were you planning to post patches for one or both of the above ?.
>
> I can post them too. Let me know.

a) I just looked and the kernel netlink protocol does not have a limit.
   The kernel does have a limit but the netlink protocol does not  so
   there is no point in exporting a limit in a uapi header,  it will
   just be out of date and wrong.

b) I can see in principle bumping up the kernels MAX_LABELS past two
   although I haven't heard those requests, or understand the use cases.
   I don't recall seeing any ducumentation on cases where it is
   desirable to push a lot of labels at once.  (Do hardware
   implementations support pushing a lot of labels at once?)

   Bumping past 8 seems quite a lot.  That starts feeling like people
   trying to break other peoples mpls stacks.  That is asking for more
   packet space for labels than ipv6 uses for addresses and ipv6 is way
   oversized.  The commonly agreed wisdom is the world only needs 40 to
   48 bits to route on to reach the entire world.  

   I can completely understand a few specialty labels going beyond what
   is needed for general purpose routing but pushing more that 8 at
   once seems huge.  Especially since you can recirculate packets if
   you really need to and push more labels that way.

   Add to that for a software implementation we have these pesky things
   called cache lines.  I can see in the kernel pushing struct
   mpls_route towards the size of a full cacheline.  Today we are at 52
   bytes not counting the via adress.  With the via address we are at 56
   (ipv4), 58 (ethernet), and 60 (ipv6) bytes.  Which means in we have
   to make the kernel data structures smarter or we risk messing up the
   performance of the common case.

   Also we do need some kind of limit in the kernel to protect against
   insane inputs.
   
   So while I can imagine there are reasonable cases for bumping up the
   maximum number of labels in the kernel I think we need to be smart if
   we ware going to do that.  Which probably means we will want a
   __mpls_nh_label helper function.

Eric

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: iproute2 mpls max labels
  2016-07-21 20:00   ` Eric W. Biederman
@ 2016-07-21 21:08     ` Magnus Bergroth
  2016-07-22 19:24       ` Eric W. Biederman
  2016-07-22  6:16     ` Roopa Prabhu
  1 sibling, 1 reply; 10+ messages in thread
From: Magnus Bergroth @ 2016-07-21 21:08 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: Roopa Prabhu, netdev, Robert Shearman



> Eric W. Biederman <mailto:ebiederm@xmission.com>
> 21 juli 2016 22:00
> Roopa Prabhu <roopa@cumulusnetworks.com> writes:
>
>> On 7/16/16, 11:24 AM, Magnus Bergroth wrote:
>>> Wanted to use more than the default maximum of 8 mpls labels. Max labels
>>> seems to be hardcode to 8 in two places.
>>>
>>> --- iproute2-4.6.0/lib/utils.c    2016-05-18 20:56:02.000000000 +0200
>>> +++ iproute2-4.6.0-bergroth/lib/utils.c    2016-07-16 20:12:10.714958071
>>> +0200
>>> @@ -476,7 +476,7 @@
>>>          addr->bytelen = 4;
>>>          addr->bitlen = 20;
>>>          /* How many bytes do I need? */
>>> -        for (i = 0; i < 8; i++) {
>>> +        for (i = 0; i < MPLS_MAX_LABELS; i++) {
>>>              if (ntohl(addr->data[i]) & MPLS_LS_S_MASK) {
>>>                  addr->bytelen = (i + 1)*4;
>>>                  break;
>>>
>>>
>>> --- iproute2-4.6.0/include/utils.h    2016-05-18 20:56:02.000000000 +0200
>>> +++ iproute2-4.6.0-bergroth/include/utils.h    2016-07-15
>>> 11:55:57.297681742 +0200
>>> @@ -54,6 +54,9 @@
>>>  #define NEXT_ARG_FWD() do { argv++; argc--; } while(0)
>>>  #define PREV_ARG() do { argv--; argc++; } while(0)
>>>
>>> +/* Maximum number of labels the mpls helpers support */
>>> +#define MPLS_MAX_LABELS 8
>>> +
>>>  typedef struct
>>>  {
>>>      __u16 flags;
>>> @@ -61,7 +64,7 @@
>>>      __s16 bitlen;
>>>      /* These next two fields match rtvia */
>>>      __u16 family;
>>> -    __u32 data[8];
>>> +    __u32 data[MPLS_MAX_LABELS];
>>>  } inet_prefix;
>>>
>
> This structure is not MPLS specific so that is not appropriate to use
> MPLS_MAX_LABELS when definiting the structure.  Likewise changing this
> structure and limiting the changes to mpls parts of the code is not
> appropriate.
>
>>>  #define PREFIXLEN_SPECIFIED 1
>>> @@ -88,9 +91,6 @@
>>>  # define AF_MPLS 28
>>>  #endif
>>>
>>> -/* Maximum number of labels the mpls helpers support */
>>> -#define MPLS_MAX_LABELS 8
>>> -
>>>  __u32 get_addr32(const char *name);
>>>  int get_addr_1(inet_prefix *dst, const char *arg, int family);
>>>  int get_prefix_1(inet_prefix *dst, char *arg, int family);
>>>
>>>
>> I did not realize it is hardcoded to 8 in iproute2. Because kernel has a hard coded limit of
>> 2.
>> I think we need to fix it in a few places:
>> a) we should move the kernel #define to a uapi header file which iproute2 can use
>> b) there has been a general ask to bump the kernel MAX_LABELS from 2 and I don't see
>> a problem with it yet. so, we could bump it to 8.
>>
>> Were you planning to post patches for one or both of the above ?.
>>
>> I can post them too. Let me know.
>
> a) I just looked and the kernel netlink protocol does not have a limit.
>    The kernel does have a limit but the netlink protocol does not  so
>    there is no point in exporting a limit in a uapi header,  it will
>    just be out of date and wrong.
>
> b) I can see in principle bumping up the kernels MAX_LABELS past two
>    although I haven't heard those requests, or understand the use cases.
>    I don't recall seeing any ducumentation on cases where it is
>    desirable to push a lot of labels at once.  (Do hardware
>    implementations support pushing a lot of labels at once?)
>
>    Bumping past 8 seems quite a lot.  That starts feeling like people
>    trying to break other peoples mpls stacks.  That is asking for more
>    packet space for labels than ipv6 uses for addresses and ipv6 is way
>    oversized.  The commonly agreed wisdom is the world only needs 40 to
>    48 bits to route on to reach the entire world.  
I think that 8 would be more than enough for most use cases, even 6 or 4
would be sufficient. I'm looking at doing MPLS source routing based on a
label-stack. Each router in the network will get a set of static routes
that pop the label and sends it out to the next router based on the
label that gets poped.  I have no problem compiling a special build with
the MAX_LABELS set to my need. I just noticed that changing only the
MAX_LABELS wasn't enough to get more than 8 labels to work with iproute2
after changing the kernel "MAX_NEW_LABELS 2" in
include/net/mpls_iptunnel.h  and net/mpls/internal.h to a higher number.

<<Magnus

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: iproute2 mpls max labels
  2016-07-21 20:00   ` Eric W. Biederman
  2016-07-21 21:08     ` Magnus Bergroth
@ 2016-07-22  6:16     ` Roopa Prabhu
  2016-07-22 19:20       ` Eric W. Biederman
  1 sibling, 1 reply; 10+ messages in thread
From: Roopa Prabhu @ 2016-07-22  6:16 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: Magnus Bergroth, netdev, Robert Shearman

On 7/21/16, 1:00 PM, Eric W. Biederman wrote:
> Roopa Prabhu <roopa@cumulusnetworks.com> writes:
>
>> On 7/16/16, 11:24 AM, Magnus Bergroth wrote:
>>> Wanted to use more than the default maximum of 8 mpls labels. Max labels
>>> seems to be hardcode to 8 in two places.
>>>
>>> --- iproute2-4.6.0/lib/utils.c    2016-05-18 20:56:02.000000000 +0200
>>> +++ iproute2-4.6.0-bergroth/lib/utils.c    2016-07-16 20:12:10.714958071
>>> +0200
>>> @@ -476,7 +476,7 @@
>>>          addr->bytelen = 4;
>>>          addr->bitlen = 20;
>>>          /* How many bytes do I need? */
>>> -        for (i = 0; i < 8; i++) {
>>> +        for (i = 0; i < MPLS_MAX_LABELS; i++) {
>>>              if (ntohl(addr->data[i]) & MPLS_LS_S_MASK) {
>>>                  addr->bytelen = (i + 1)*4;
>>>                  break;
>>>
>>>
>>> --- iproute2-4.6.0/include/utils.h    2016-05-18 20:56:02.000000000 +0200
>>> +++ iproute2-4.6.0-bergroth/include/utils.h    2016-07-15
>>> 11:55:57.297681742 +0200
>>> @@ -54,6 +54,9 @@
>>>  #define NEXT_ARG_FWD() do { argv++; argc--; } while(0)
>>>  #define PREV_ARG() do { argv--; argc++; } while(0)
>>>
>>> +/* Maximum number of labels the mpls helpers support */
>>> +#define MPLS_MAX_LABELS 8
>>> +
>>>  typedef struct
>>>  {
>>>      __u16 flags;
>>> @@ -61,7 +64,7 @@
>>>      __s16 bitlen;
>>>      /* These next two fields match rtvia */
>>>      __u16 family;
>>> -    __u32 data[8];
>>> +    __u32 data[MPLS_MAX_LABELS];
>>>  } inet_prefix;
>>>
> This structure is not MPLS specific so that is not appropriate to use
> MPLS_MAX_LABELS when definiting the structure.  Likewise changing this
> structure and limiting the changes to mpls parts of the code is not
> appropriate.
>
>>>  #define PREFIXLEN_SPECIFIED 1
>>> @@ -88,9 +91,6 @@
>>>  # define AF_MPLS 28
>>>  #endif
>>>
>>> -/* Maximum number of labels the mpls helpers support */
>>> -#define MPLS_MAX_LABELS 8
>>> -
>>>  __u32 get_addr32(const char *name);
>>>  int get_addr_1(inet_prefix *dst, const char *arg, int family);
>>>  int get_prefix_1(inet_prefix *dst, char *arg, int family);
>>>
>>>
>> I did not realize it is hardcoded to 8 in iproute2. Because kernel has a hard coded limit of
>> 2.
>> I think we need to fix it in a few places:
>> a) we should move the kernel #define to a uapi header file which iproute2 can use
>> b) there has been a general ask to bump the kernel MAX_LABELS from 2 and I don't see
>> a problem with it yet. so, we could bump it to 8.
>>
>> Were you planning to post patches for one or both of the above ?.
>>
>> I can post them too. Let me know.
> a) I just looked and the kernel netlink protocol does not have a limit.
>    The kernel does have a limit but the netlink protocol does not  so
>    there is no point in exporting a limit in a uapi header,  it will
>    just be out of date and wrong.
sure, if you have concerns about making it part of uapi, we can
separately maintain the same limit in iproute2 and kernel.
>
> b) I can see in principle bumping up the kernels MAX_LABELS past two
>    although I haven't heard those requests, or understand the use cases.
>    I don't recall seeing any ducumentation on cases where it is
>    desirable to push a lot of labels at once.  (Do hardware
>    implementations support pushing a lot of labels at once?)
I don't know of any use cases either. But i have received multiple requests
on bumping the current limit of two
>
>    Bumping past 8 seems quite a lot.  That starts feeling like people
>    trying to break other peoples mpls stacks.  That is asking for more
>    packet space for labels than ipv6 uses for addresses and ipv6 is way
>    oversized.  The commonly agreed wisdom is the world only needs 40 to
>    48 bits to route on to reach the entire world.  
>
>    I can completely understand a few specialty labels going beyond what
>    is needed for general purpose routing but pushing more that 8 at
>    once seems huge.  Especially since you can recirculate packets if
>    you really need to and push more labels that way.

I don't think there is an ask for going more than 8. anything greater than
current 2 is good.
>
>    Add to that for a software implementation we have these pesky things
>    called cache lines.  I can see in the kernel pushing struct
>    mpls_route towards the size of a full cacheline.  Today we are at 52
>    bytes not counting the via adress.  With the via address we are at 56
>    (ipv4), 58 (ethernet), and 60 (ipv6) bytes.  Which means in we have
>    to make the kernel data structures smarter or we risk messing up the
>    performance of the common case.
>
>    Also we do need some kind of limit in the kernel to protect against
>    insane inputs.
>    
>    So while I can imagine there are reasonable cases for bumping up the
>    maximum number of labels in the kernel I think we need to be smart if
>    we ware going to do that.  Which probably means we will want a
>    __mpls_nh_label helper function.
>
sure, yes, the current static label array works well for the common case
of 2 labels. does it make sense for it to be configurable
with the default being 2 and max something like 8 ?

Thanks,
Roopa

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: iproute2 mpls max labels
  2016-07-22  6:16     ` Roopa Prabhu
@ 2016-07-22 19:20       ` Eric W. Biederman
  2016-07-23 23:03         ` Roopa Prabhu
  0 siblings, 1 reply; 10+ messages in thread
From: Eric W. Biederman @ 2016-07-22 19:20 UTC (permalink / raw)
  To: Roopa Prabhu; +Cc: Magnus Bergroth, netdev, Robert Shearman

Roopa Prabhu <roopa@cumulusnetworks.com> writes:

> On 7/21/16, 1:00 PM, Eric W. Biederman wrote:
>> Roopa Prabhu <roopa@cumulusnetworks.com> writes:
>>
>>> On 7/16/16, 11:24 AM, Magnus Bergroth wrote:
>>>> Wanted to use more than the default maximum of 8 mpls labels. Max labels
>>>> seems to be hardcode to 8 in two places.
>>>>
>>>> --- iproute2-4.6.0/lib/utils.c    2016-05-18 20:56:02.000000000 +0200
>>>> +++ iproute2-4.6.0-bergroth/lib/utils.c    2016-07-16 20:12:10.714958071
>>>> +0200
>>>> @@ -476,7 +476,7 @@
>>>>          addr->bytelen = 4;
>>>>          addr->bitlen = 20;
>>>>          /* How many bytes do I need? */
>>>> -        for (i = 0; i < 8; i++) {
>>>> +        for (i = 0; i < MPLS_MAX_LABELS; i++) {
>>>>              if (ntohl(addr->data[i]) & MPLS_LS_S_MASK) {
>>>>                  addr->bytelen = (i + 1)*4;
>>>>                  break;
>>>>
>>>>
>>>> --- iproute2-4.6.0/include/utils.h    2016-05-18 20:56:02.000000000 +0200
>>>> +++ iproute2-4.6.0-bergroth/include/utils.h    2016-07-15
>>>> 11:55:57.297681742 +0200
>>>> @@ -54,6 +54,9 @@
>>>>  #define NEXT_ARG_FWD() do { argv++; argc--; } while(0)
>>>>  #define PREV_ARG() do { argv--; argc++; } while(0)
>>>>
>>>> +/* Maximum number of labels the mpls helpers support */
>>>> +#define MPLS_MAX_LABELS 8
>>>> +
>>>>  typedef struct
>>>>  {
>>>>      __u16 flags;
>>>> @@ -61,7 +64,7 @@
>>>>      __s16 bitlen;
>>>>      /* These next two fields match rtvia */
>>>>      __u16 family;
>>>> -    __u32 data[8];
>>>> +    __u32 data[MPLS_MAX_LABELS];
>>>>  } inet_prefix;
>>>>
>> This structure is not MPLS specific so that is not appropriate to use
>> MPLS_MAX_LABELS when definiting the structure.  Likewise changing this
>> structure and limiting the changes to mpls parts of the code is not
>> appropriate.
>>
>>>>  #define PREFIXLEN_SPECIFIED 1
>>>> @@ -88,9 +91,6 @@
>>>>  # define AF_MPLS 28
>>>>  #endif
>>>>
>>>> -/* Maximum number of labels the mpls helpers support */
>>>> -#define MPLS_MAX_LABELS 8
>>>> -
>>>>  __u32 get_addr32(const char *name);
>>>>  int get_addr_1(inet_prefix *dst, const char *arg, int family);
>>>>  int get_prefix_1(inet_prefix *dst, char *arg, int family);
>>>>
>>>>
>>> I did not realize it is hardcoded to 8 in iproute2. Because kernel has a hard coded limit of
>>> 2.
>>> I think we need to fix it in a few places:
>>> a) we should move the kernel #define to a uapi header file which iproute2 can use
>>> b) there has been a general ask to bump the kernel MAX_LABELS from 2 and I don't see
>>> a problem with it yet. so, we could bump it to 8.
>>>
>>> Were you planning to post patches for one or both of the above ?.
>>>
>>> I can post them too. Let me know.
>> a) I just looked and the kernel netlink protocol does not have a limit.
>>    The kernel does have a limit but the netlink protocol does not  so
>>    there is no point in exporting a limit in a uapi header,  it will
>>    just be out of date and wrong.
> sure, if you have concerns about making it part of uapi, we can
> separately maintain the same limit in iproute2 and kernel.

The different tools already have different limits and it is not
a problem.  The important thing is for the userspace tool to have
the larger limit.

>> b) I can see in principle bumping up the kernels MAX_LABELS past two
>>    although I haven't heard those requests, or understand the use cases.
>>    I don't recall seeing any ducumentation on cases where it is
>>    desirable to push a lot of labels at once.  (Do hardware
>>    implementations support pushing a lot of labels at once?)
> I don't know of any use cases either. But i have received multiple requests
> on bumping the current limit of two
>>
>>    Bumping past 8 seems quite a lot.  That starts feeling like people
>>    trying to break other peoples mpls stacks.  That is asking for more
>>    packet space for labels than ipv6 uses for addresses and ipv6 is way
>>    oversized.  The commonly agreed wisdom is the world only needs 40 to
>>    48 bits to route on to reach the entire world.  
>>
>>    I can completely understand a few specialty labels going beyond what
>>    is needed for general purpose routing but pushing more that 8 at
>>    once seems huge.  Especially since you can recirculate packets if
>>    you really need to and push more labels that way.
>
> I don't think there is an ask for going more than 8. anything greater than
> current 2 is good.

Except the patch that got all of this started.

>>    Add to that for a software implementation we have these pesky things
>>    called cache lines.  I can see in the kernel pushing struct
>>    mpls_route towards the size of a full cacheline.  Today we are at 52
>>    bytes not counting the via adress.  With the via address we are at 56
>>    (ipv4), 58 (ethernet), and 60 (ipv6) bytes.  Which means in we have
>>    to make the kernel data structures smarter or we risk messing up the
>>    performance of the common case.
>>
>>    Also we do need some kind of limit in the kernel to protect against
>>    insane inputs.
>>    
>>    So while I can imagine there are reasonable cases for bumping up the
>>    maximum number of labels in the kernel I think we need to be smart if
>>    we ware going to do that.  Which probably means we will want a
>>    __mpls_nh_label helper function.
>>
> sure, yes, the current static label array works well for the common case
> of 2 labels. does it make sense for it to be configurable
> with the default being 2 and max something like 8 ?

We have two structures both with one byte holes:
struct mpls_route { /* next hop label forwarding entry */
	struct rcu_head		rt_rcu;
	u8			rt_protocol;
	u8			rt_payload_type;
	u8			rt_max_alen;
	unsigned int		rt_nhn;
	unsigned int		rt_nhn_alive;
	struct mpls_nh		rt_nh[0];
};

struct mpls_nh { /* next hop label forwarding entry */
	struct net_device __rcu *nh_dev;
	unsigned int		nh_flags;
	u32			nh_label[MAX_NEW_LABELS];
	u8			nh_labels;
	u8			nh_via_alen;
	u8			nh_via_table;
};

If we were to define them as:
struct mpls_route { /* next hop label forwarding entry */
	struct rcu_head		rt_rcu;
	u8			rt_protocol;
	u8			rt_payload_type;
	u8			rt_max_alen;
        u8			rt_max_labels;
	unsigned int		rt_nhn;
	unsigned int		rt_nhn_alive;
	struct mpls_nh		rt_nh[0];
};

struct mpls_nh { /* next hop label forwarding entry */
	struct net_device __rcu *nh_dev;
	unsigned int		nh_flags;
	u8			nh_labels;
	u8			nh_via_alen;
	u8			nh_via_table;
};

static 32 *__mpls_nh_labels(struct mpls_route *rt, struct mpls_nh *nh)
{
	u32 *nh0_labels = PTR_ALIGN((u32 *)&rt->rt_nh[rt->rt_nhn], sizeof(u32));
	int nh_index = nh - rt->rt_nh;

	return nh0_labels + rt->rt_max_labels * nh_index;
}

static u8 *__mpls_nh_via(struct mpls_route *rt, struct mpls_nh *nh)
{
	u8 *nh0_via = PTR_ALIGN((u8 *)(&rt->rt_nh[rt->rt_nhn] + (sizeof(u32) *rt->max_labels * rt->nhn)), VIA_ALEN_ALIGN);
	int nh_index = nh - rt->rt_nh;

	return nh0_via + rt->rt_max_alen * nh_index;
}

Ugh.  I just noticed we have a nasty 4 byte gap in the mpls_route by
having both rt_nhn and rt_nhn_alive in there.  As rt_nh[0] has pointer
alignment.

Anyway something like the above should allow us to remove the limit
of the number of labels from the implementation and still fit everything
in a cache line in the common case, as the change above doesn't take up
any extra space in struct mpls_route.

Then we just pick a reasonable maximum and set MAX_NEW_LABELS to that.
That will change struct mpls_route_config.  So we need a small enough
value that putting struct mpls_route_config continues to make sense.
I propose 8 for MAX_NEW_LABELS after such a change.

It looks pretty straighforward on the kernel side.

Eric

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: iproute2 mpls max labels
  2016-07-21 21:08     ` Magnus Bergroth
@ 2016-07-22 19:24       ` Eric W. Biederman
  0 siblings, 0 replies; 10+ messages in thread
From: Eric W. Biederman @ 2016-07-22 19:24 UTC (permalink / raw)
  To: Magnus Bergroth; +Cc: Roopa Prabhu, netdev, Robert Shearman

Magnus Bergroth <bergroth@nordu.net> writes:

>> Eric W. Biederman <mailto:ebiederm@xmission.com>
>> a) I just looked and the kernel netlink protocol does not have a limit.
>>    The kernel does have a limit but the netlink protocol does not  so
>>    there is no point in exporting a limit in a uapi header,  it will
>>    just be out of date and wrong.
>>
>> b) I can see in principle bumping up the kernels MAX_LABELS past two
>>    although I haven't heard those requests, or understand the use cases.
>>    I don't recall seeing any ducumentation on cases where it is
>>    desirable to push a lot of labels at once.  (Do hardware
>>    implementations support pushing a lot of labels at once?)
>>
>>    Bumping past 8 seems quite a lot.  That starts feeling like people
>>    trying to break other peoples mpls stacks.  That is asking for more
>>    packet space for labels than ipv6 uses for addresses and ipv6 is way
>>    oversized.  The commonly agreed wisdom is the world only needs 40 to
>>    48 bits to route on to reach the entire world.  
> I think that 8 would be more than enough for most use cases, even 6 or 4
> would be sufficient. I'm looking at doing MPLS source routing based on a
> label-stack. Each router in the network will get a set of static routes
> that pop the label and sends it out to the next router based on the
> label that gets poped.  I have no problem compiling a special build with
> the MAX_LABELS set to my need. I just noticed that changing only the
> MAX_LABELS wasn't enough to get more than 8 labels to work with iproute2
> after changing the kernel "MAX_NEW_LABELS 2" in
> include/net/mpls_iptunnel.h  and net/mpls/internal.h to a higher
> number.

At a practical level in general it doesn't make sense to support special
builds.  The code rarely get tested and so bit rots.  In a situation
like this it makes sense to dig in and solve the problem as generally as
you can as long as it doesn't cause problems for anything else.

Eric

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: iproute2 mpls max labels
  2016-07-22 19:20       ` Eric W. Biederman
@ 2016-07-23 23:03         ` Roopa Prabhu
  0 siblings, 0 replies; 10+ messages in thread
From: Roopa Prabhu @ 2016-07-23 23:03 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Magnus Bergroth, netdev, Robert Shearman, olivier.dugeon

On 7/22/16, 12:20 PM, Eric W. Biederman wrote:
> Roopa Prabhu <roopa@cumulusnetworks.com> writes:
>
>> On 7/21/16, 1:00 PM, Eric W. Biederman wrote:
>>> Roopa Prabhu <roopa@cumulusnetworks.com> writes:
>>>
>>>

[snip]
>>> I did not realize it is hardcoded to 8 in iproute2. Because kernel has a hard coded limit of
>>> 2.
>>> I think we need to fix it in a few places:
>>> a) we should move the kernel #define to a uapi header file which iproute2 can use
>>> b) there has been a general ask to bump the kernel MAX_LABELS from 2 and I don't see
>>> a problem with it yet. so, we could bump it to 8.
>>>
>>> Were you planning to post patches for one or both of the above ?.
>>>
>>> I can post them too. Let me know.
>>> a) I just looked and the kernel netlink protocol does not have a limit.
>>>    The kernel does have a limit but the netlink protocol does not  so
>>>    there is no point in exporting a limit in a uapi header,  it will
>>>    just be out of date and wrong.
>> sure, if you have concerns about making it part of uapi, we can
>> separately maintain the same limit in iproute2 and kernel.
> The different tools already have different limits and it is not
> a problem.  The important thing is for the userspace tool to have
> the larger limit.
>>> b) I can see in principle bumping up the kernels MAX_LABELS past two
>>>    although I haven't heard those requests, or understand the use cases.
>>>    I don't recall seeing any ducumentation on cases where it is
>>>    desirable to push a lot of labels at once.  (Do hardware
>>>    implementations support pushing a lot of labels at once?)
>> I don't know of any use cases either. But i have received multiple requests
>> on bumping the current limit of two
>>>    Bumping past 8 seems quite a lot.  That starts feeling like people
>>>    trying to break other peoples mpls stacks.  That is asking for more
>>>    packet space for labels than ipv6 uses for addresses and ipv6 is way
>>>    oversized.  The commonly agreed wisdom is the world only needs 40 to
>>>    48 bits to route on to reach the entire world.  
>>>
>>>    I can completely understand a few specialty labels going beyond what
>>>    is needed for general purpose routing but pushing more that 8 at
>>>    once seems huge.  Especially since you can recirculate packets if
>>>    you really need to and push more labels that way.
>> I don't think there is an ask for going more than 8. anything greater than
>> current 2 is good.
> Except the patch that got all of this started.

ok, missed that. yesterday I also received some info on a segment routing use-case
where there is an ongoing study which is currently leaning towards a max label stack
depth of 17.

>
>>>    Add to that for a software implementation we have these pesky things
>>>    called cache lines.  I can see in the kernel pushing struct
>>>    mpls_route towards the size of a full cacheline.  Today we are at 52
>>>    bytes not counting the via adress.  With the via address we are at 56
>>>    (ipv4), 58 (ethernet), and 60 (ipv6) bytes.  Which means in we have
>>>    to make the kernel data structures smarter or we risk messing up the
>>>    performance of the common case.
>>>
>>>    Also we do need some kind of limit in the kernel to protect against
>>>    insane inputs.
>>>    
>>>    So while I can imagine there are reasonable cases for bumping up the
>>>    maximum number of labels in the kernel I think we need to be smart if
>>>    we ware going to do that.  Which probably means we will want a
>>>    __mpls_nh_label helper function.
>>>
>> sure, yes, the current static label array works well for the common case
>> of 2 labels. does it make sense for it to be configurable
>> with the default being 2 and max something like 8 ?
> We have two structures both with one byte holes:
> struct mpls_route { /* next hop label forwarding entry */
> 	struct rcu_head		rt_rcu;
> 	u8			rt_protocol;
> 	u8			rt_payload_type;
> 	u8			rt_max_alen;
> 	unsigned int		rt_nhn;
> 	unsigned int		rt_nhn_alive;
> 	struct mpls_nh		rt_nh[0];
> };
>
> struct mpls_nh { /* next hop label forwarding entry */
> 	struct net_device __rcu *nh_dev;
> 	unsigned int		nh_flags;
> 	u32			nh_label[MAX_NEW_LABELS];
> 	u8			nh_labels;
> 	u8			nh_via_alen;
> 	u8			nh_via_table;
> };
>
> If we were to define them as:
> struct mpls_route { /* next hop label forwarding entry */
> 	struct rcu_head		rt_rcu;
> 	u8			rt_protocol;
> 	u8			rt_payload_type;
> 	u8			rt_max_alen;
>         u8			rt_max_labels;
> 	unsigned int		rt_nhn;
> 	unsigned int		rt_nhn_alive;
> 	struct mpls_nh		rt_nh[0];
> };
>
> struct mpls_nh { /* next hop label forwarding entry */
> 	struct net_device __rcu *nh_dev;
> 	unsigned int		nh_flags;
> 	u8			nh_labels;
> 	u8			nh_via_alen;
> 	u8			nh_via_table;
> };
>
> static 32 *__mpls_nh_labels(struct mpls_route *rt, struct mpls_nh *nh)
> {
> 	u32 *nh0_labels = PTR_ALIGN((u32 *)&rt->rt_nh[rt->rt_nhn], sizeof(u32));
> 	int nh_index = nh - rt->rt_nh;
>
> 	return nh0_labels + rt->rt_max_labels * nh_index;
> }
>
> static u8 *__mpls_nh_via(struct mpls_route *rt, struct mpls_nh *nh)
> {
> 	u8 *nh0_via = PTR_ALIGN((u8 *)(&rt->rt_nh[rt->rt_nhn] + (sizeof(u32) *rt->max_labels * rt->nhn)), VIA_ALEN_ALIGN);
> 	int nh_index = nh - rt->rt_nh;
>
> 	return nh0_via + rt->rt_max_alen * nh_index;
> }
>
> Ugh.  I just noticed we have a nasty 4 byte gap in the mpls_route by
> having both rt_nhn and rt_nhn_alive in there.  As rt_nh[0] has pointer
> alignment.
>
> Anyway something like the above should allow us to remove the limit
> of the number of labels from the implementation and still fit everything
> in a cache line in the common case, as the change above doesn't take up
> any extra space in struct mpls_route.
>
> Then we just pick a reasonable maximum and set MAX_NEW_LABELS to that.
> That will change struct mpls_route_config.  So we need a small enough
> value that putting struct mpls_route_config continues to make sense.
> I propose 8 for MAX_NEW_LABELS after such a change.
>
> It looks pretty straighforward on the kernel side.
I like it. It follows how via is handled today and I agree seems like the best way to
represent varying number of labels without affecting the common case.

thanks for the suggestion.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: iproute2 mpls max labels
  2016-07-21 19:43   ` Magnus Bergroth
@ 2016-07-23 23:04     ` Roopa Prabhu
  0 siblings, 0 replies; 10+ messages in thread
From: Roopa Prabhu @ 2016-07-23 23:04 UTC (permalink / raw)
  To: Magnus Bergroth; +Cc: netdev, Robert Shearman, ebiederm

On 7/21/16, 12:43 PM, Magnus Bergroth wrote:
>
>> Roopa Prabhu <mailto:roopa@cumulusnetworks.com>
>> 21 juli 2016 20:53
>> I did not realize it is hardcoded to 8 in iproute2. Because kernel has
>> a hard coded limit of
>> 2.
>> I think we need to fix it in a few places:
>> a) we should move the kernel #define to a uapi header file which
>> iproute2 can use
>> b) there has been a general ask to bump the kernel MAX_LABELS from 2
>> and I don't see
>> a problem with it yet. so, we could bump it to 8.
>>
>> Were you planning to post patches for one or both of the above ?.
>>
>> I can post them too. Let me know.
> I would be happy if you could post them both as I very new to the kernel
> development and I'm not entirely sure how to do it in the right way.
>
ok, sure, will fix it.

thanks,
Roopa

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2016-07-23 23:04 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-07-16 18:24 iproute2 mpls max labels Magnus Bergroth
2016-07-21 18:53 ` Roopa Prabhu
2016-07-21 19:43   ` Magnus Bergroth
2016-07-23 23:04     ` Roopa Prabhu
2016-07-21 20:00   ` Eric W. Biederman
2016-07-21 21:08     ` Magnus Bergroth
2016-07-22 19:24       ` Eric W. Biederman
2016-07-22  6:16     ` Roopa Prabhu
2016-07-22 19:20       ` Eric W. Biederman
2016-07-23 23:03         ` Roopa Prabhu

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.