All of lore.kernel.org
 help / color / mirror / Atom feed
* Hardcoded multicast queue length in macvlan.c driver causes poor multicast receive performance
       [not found] <485531aec7e243659ee4e3bb7fa2186d@paneda.se>
@ 2020-11-23 14:22 ` Thomas Karlsson
  2020-11-23 22:30   ` Jakub Kicinski
  0 siblings, 1 reply; 27+ messages in thread
From: Thomas Karlsson @ 2020-11-23 14:22 UTC (permalink / raw)
  To: davem, kuba; +Cc: netdev

Hello,

There is a special queue handling in macvlan.c for broadcast and multicast packages that was arbitrarily set to 1000 in commit 07d92d5cc977a7fe1e683e1d4a6f723f7f2778cb . While this is probably sufficient for most uses cases it is insufficient to support high packet rates. I currently have a setup with 144 000 multicast packets incoming per second (144 different live audio RTP streams) and suffer very frequent packet loss. With unicast this is not an issue and I can in addition to the 144kpps load the macvlan interface with another 450mbit/s using iperf.

In order to verify that the queue is the problem I edited the define to 100000 and recompiled the kernel module. After replacing it with rmmod/insmod I get 0 packet loss (measured over 2 days where I before had losses every other second or so) and can also load an additional 450 mbit/s multicast traffic using iperf without losses. So basically no change in performance between unicast/multicast when it comes to lost packets on my machine.

I think It would be best if this queue length was configurable somehow. Either an option when creating the macvlan (like how bridge/passthrough/etc are set) or at least when loading the module (for instance by using a config in /etc/modprobe.d). One size does not fit all in this situation.


Link to code in question using the define (on master):
https://github.com/torvalds/linux/blob/27bba9c532a8d21050b94224ffd310ad0058c353/drivers/net/macvlan.c#L357 

(re-sent in text/plain instead of html)

Best regards,
Thomas Karlsson

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Hardcoded multicast queue length in macvlan.c driver causes poor multicast receive performance
  2020-11-23 14:22 ` Hardcoded multicast queue length in macvlan.c driver causes poor multicast receive performance Thomas Karlsson
@ 2020-11-23 22:30   ` Jakub Kicinski
  2020-11-25 12:51     ` Thomas Karlsson
                       ` (4 more replies)
  0 siblings, 5 replies; 27+ messages in thread
From: Jakub Kicinski @ 2020-11-23 22:30 UTC (permalink / raw)
  To: Thomas Karlsson; +Cc: davem, netdev

On Mon, 23 Nov 2020 14:22:31 +0000 Thomas Karlsson wrote:
> Hello,
> 
> There is a special queue handling in macvlan.c for broadcast and
> multicast packages that was arbitrarily set to 1000 in commit
> 07d92d5cc977a7fe1e683e1d4a6f723f7f2778cb . While this is probably
> sufficient for most uses cases it is insufficient to support high
> packet rates. I currently have a setup with 144 000 multicast packets
> incoming per second (144 different live audio RTP streams) and suffer
> very frequent packet loss. With unicast this is not an issue and I
> can in addition to the 144kpps load the macvlan interface with
> another 450mbit/s using iperf.
> 
> In order to verify that the queue is the problem I edited the define
> to 100000 and recompiled the kernel module. After replacing it with
> rmmod/insmod I get 0 packet loss (measured over 2 days where I before
> had losses every other second or so) and can also load an additional
> 450 mbit/s multicast traffic using iperf without losses. So basically
> no change in performance between unicast/multicast when it comes to
> lost packets on my machine.
> 
> I think It would be best if this queue length was configurable
> somehow. Either an option when creating the macvlan (like how
> bridge/passthrough/etc are set) or at least when loading the module
> (for instance by using a config in /etc/modprobe.d). One size does
> not fit all in this situation.

The former please. You can add a netlink attribute, should be
reasonably straightforward. The other macvlan attrs are defined
under "MACVLAN section" in if_link.h.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Hardcoded multicast queue length in macvlan.c driver causes poor multicast receive performance
  2020-11-23 22:30   ` Jakub Kicinski
@ 2020-11-25 12:51     ` Thomas Karlsson
  2020-11-25 16:57       ` [PATCH] macvlan: Support for high multicast packet rate Thomas Karlsson
  2020-11-25 16:58       ` Hardcoded multicast queue length in macvlan.c driver causes poor multicast receive performance Jakub Kicinski
  2020-11-30 14:00     ` [PATCH net-next v3] macvlan: Support for high multicast packet rate Thomas Karlsson
                       ` (3 subsequent siblings)
  4 siblings, 2 replies; 27+ messages in thread
From: Thomas Karlsson @ 2020-11-25 12:51 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: davem, netdev



Den 2020-11-23 kl. 23:30, skrev Jakub Kicinski:
> On Mon, 23 Nov 2020 14:22:31 +0000 Thomas Karlsson wrote:
>> Hello,
>>
>> There is a special queue handling in macvlan.c for broadcast and
>> multicast packages that was arbitrarily set to 1000 in commit
>> 07d92d5cc977a7fe1e683e1d4a6f723f7f2778cb . While this is probably
>> sufficient for most uses cases it is insufficient to support high
>> packet rates. I currently have a setup with 144 000 multicast packets
>> incoming per second (144 different live audio RTP streams) and suffer
>> very frequent packet loss. With unicast this is not an issue and I
>> can in addition to the 144kpps load the macvlan interface with
>> another 450mbit/s using iperf.
>>
>> In order to verify that the queue is the problem I edited the define
>> to 100000 and recompiled the kernel module. After replacing it with
>> rmmod/insmod I get 0 packet loss (measured over 2 days where I before
>> had losses every other second or so) and can also load an additional
>> 450 mbit/s multicast traffic using iperf without losses. So basically
>> no change in performance between unicast/multicast when it comes to
>> lost packets on my machine.
>>
>> I think It would be best if this queue length was configurable
>> somehow. Either an option when creating the macvlan (like how
>> bridge/passthrough/etc are set) or at least when loading the module
>> (for instance by using a config in /etc/modprobe.d). One size does
>> not fit all in this situation.
> 
> The former please. You can add a netlink attribute, should be
> reasonably straightforward. The other macvlan attrs are defined
> under "MACVLAN section" in if_link.h.
> 

I did some work towards a patch using the first option,
by adding a netlink attribute in if_link.h as suggested.
I agree that this was reasonably straightforward, until userspace.

In order to use/test my new parameter I need to update iproute2 package
as far as I understand. But then since I use the macvlan with docker
I also need to update the docker macvlan driver to send this new
option to the kernel module.

For this reason I would like to know if you would consider
merging a patch using the module_param(...) variant instead?

I would argue that this still makes the situation better
and resolves the packet-loss issue, although not necessarily
in an optimal way. However, The upside of being able to specify the
parameter on a per macvlan interface level instead of globally is not
that big in this situation. Normally you don't use that much
multicast anyway so it's a parameter that only will be touched by
a very small user base that can understand and handle the implications
of such a global setting.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH] macvlan: Support for high multicast packet rate
  2020-11-25 12:51     ` Thomas Karlsson
@ 2020-11-25 16:57       ` Thomas Karlsson
  2020-11-25 21:55         ` [PATCH net-next v2] " Thomas Karlsson
  2020-11-25 16:58       ` Hardcoded multicast queue length in macvlan.c driver causes poor multicast receive performance Jakub Kicinski
  1 sibling, 1 reply; 27+ messages in thread
From: Thomas Karlsson @ 2020-11-25 16:57 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: davem, netdev, thomas.karlsson

Background:
Broadcast and multicast packages are enqueued for later processing.
This queue was previously hardcoded to 1000 packages.

This proved insufficient for handling very high packet rates.
This resulted in packet drops for multicast.
While at the same time unicast worked fine.

The change:
This patch make the queue len adjustable to accommodate
for environments with very high multicast packet rate.
But still keeps the default value of 1000 unless specified.

The queue len is specified using the bc_queue_len module parameter.

Signed-off-by: Thomas Karlsson <thomas.karlsson@paneda.se>

diff --git a/drivers/net/macvlan.c b/drivers/net/macvlan.c
index c8d803d3616c..5c92ef2db284 100644
--- a/drivers/net/macvlan.c
+++ b/drivers/net/macvlan.c
@@ -12,6 +12,7 @@
 #include <linux/kernel.h>
 #include <linux/types.h>
 #include <linux/module.h>
+#include <linux/moduleparam.h>
 #include <linux/init.h>
 #include <linux/errno.h>
 #include <linux/slab.h>
@@ -35,11 +36,15 @@
 
 #define MACVLAN_HASH_BITS      8
 #define MACVLAN_HASH_SIZE      (1<<MACVLAN_HASH_BITS)
-#define MACVLAN_BC_QUEUE_LEN   1000
+#define MACVLAN_DEFAULT_BC_QUEUE_LEN   1000
 
 #define MACVLAN_F_PASSTHRU     1
 #define MACVLAN_F_ADDRCHANGE   2
 
+static uint bc_queue_len = MACVLAN_DEFAULT_BC_QUEUE_LEN;
+module_param(bc_queue_len, uint, 0444);
+MODULE_PARM_DESC(bc_queue_len, "The maximum length of the broadcast/multicast work queue");
+
 struct macvlan_port {
        struct net_device       *dev;
        struct hlist_head       vlan_hash[MACVLAN_HASH_SIZE];
@@ -354,7 +359,7 @@ static void macvlan_broadcast_enqueue(struct macvlan_port *port,
        MACVLAN_SKB_CB(nskb)->src = src;
 
        spin_lock(&port->bc_queue.lock);
-       if (skb_queue_len(&port->bc_queue) < MACVLAN_BC_QUEUE_LEN) {
+       if (skb_queue_len(&port->bc_queue) < bc_queue_len) {
                if (src)
                        dev_hold(src->dev);
                __skb_queue_tail(&port->bc_queue, nskb);
-- 
2.28.0


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: Hardcoded multicast queue length in macvlan.c driver causes poor multicast receive performance
  2020-11-25 12:51     ` Thomas Karlsson
  2020-11-25 16:57       ` [PATCH] macvlan: Support for high multicast packet rate Thomas Karlsson
@ 2020-11-25 16:58       ` Jakub Kicinski
  2020-11-25 17:12         ` Thomas Karlsson
  1 sibling, 1 reply; 27+ messages in thread
From: Jakub Kicinski @ 2020-11-25 16:58 UTC (permalink / raw)
  To: Thomas Karlsson; +Cc: davem, netdev

On Wed, 25 Nov 2020 13:51:41 +0100 Thomas Karlsson wrote:
> Den 2020-11-23 kl. 23:30, skrev Jakub Kicinski:
> > On Mon, 23 Nov 2020 14:22:31 +0000 Thomas Karlsson wrote:  
> >> Hello,
> >>
> >> There is a special queue handling in macvlan.c for broadcast and
> >> multicast packages that was arbitrarily set to 1000 in commit
> >> 07d92d5cc977a7fe1e683e1d4a6f723f7f2778cb . While this is probably
> >> sufficient for most uses cases it is insufficient to support high
> >> packet rates. I currently have a setup with 144 000 multicast packets
> >> incoming per second (144 different live audio RTP streams) and suffer
> >> very frequent packet loss. With unicast this is not an issue and I
> >> can in addition to the 144kpps load the macvlan interface with
> >> another 450mbit/s using iperf.
> >>
> >> In order to verify that the queue is the problem I edited the define
> >> to 100000 and recompiled the kernel module. After replacing it with
> >> rmmod/insmod I get 0 packet loss (measured over 2 days where I before
> >> had losses every other second or so) and can also load an additional
> >> 450 mbit/s multicast traffic using iperf without losses. So basically
> >> no change in performance between unicast/multicast when it comes to
> >> lost packets on my machine.
> >>
> >> I think It would be best if this queue length was configurable
> >> somehow. Either an option when creating the macvlan (like how
> >> bridge/passthrough/etc are set) or at least when loading the module
> >> (for instance by using a config in /etc/modprobe.d). One size does
> >> not fit all in this situation.  
> > 
> > The former please. You can add a netlink attribute, should be
> > reasonably straightforward. The other macvlan attrs are defined
> > under "MACVLAN section" in if_link.h.
> >   
> 
> I did some work towards a patch using the first option,
> by adding a netlink attribute in if_link.h as suggested.
> I agree that this was reasonably straightforward, until userspace.
> 
> In order to use/test my new parameter I need to update iproute2 package
> as far as I understand. But then since I use the macvlan with docker
> I also need to update the docker macvlan driver to send this new
> option to the kernel module.

I wish I got a cookie every time someone said they can't do the right
thing because they'd have to update $container_system 😔

> For this reason I would like to know if you would consider
> merging a patch using the module_param(...) variant instead?
> 
> I would argue that this still makes the situation better
> and resolves the packet-loss issue, although not necessarily
> in an optimal way. However, The upside of being able to specify the
> parameter on a per macvlan interface level instead of globally is not
> that big in this situation. Normally you don't use that much
> multicast anyway so it's a parameter that only will be touched by
> a very small user base that can understand and handle the implications
> of such a global setting.

How about implementing .changelink in macvlan? That way you could
modify the macvlan device independent of Docker? 

Make sure you only accept changes to the bc queue len if that's the
only one you act on.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Hardcoded multicast queue length in macvlan.c driver causes poor multicast receive performance
  2020-11-25 16:58       ` Hardcoded multicast queue length in macvlan.c driver causes poor multicast receive performance Jakub Kicinski
@ 2020-11-25 17:12         ` Thomas Karlsson
  2020-11-25 18:07           ` Jakub Kicinski
  0 siblings, 1 reply; 27+ messages in thread
From: Thomas Karlsson @ 2020-11-25 17:12 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: davem, netdev

On 2020-11-25 17:58, Jakub Kicinski wrote:
> On Wed, 25 Nov 2020 13:51:41 +0100 Thomas Karlsson wrote:
>> Den 2020-11-23 kl. 23:30, skrev Jakub Kicinski:
>>> On Mon, 23 Nov 2020 14:22:31 +0000 Thomas Karlsson wrote:  
>>>> Hello,
>>>>
>>>> There is a special queue handling in macvlan.c for broadcast and
>>>> multicast packages that was arbitrarily set to 1000 in commit
>>>> 07d92d5cc977a7fe1e683e1d4a6f723f7f2778cb . While this is probably
>>>> sufficient for most uses cases it is insufficient to support high
>>>> packet rates. I currently have a setup with 144 000 multicast packets
>>>> incoming per second (144 different live audio RTP streams) and suffer
>>>> very frequent packet loss. With unicast this is not an issue and I
>>>> can in addition to the 144kpps load the macvlan interface with
>>>> another 450mbit/s using iperf.
>>>>
>>>> In order to verify that the queue is the problem I edited the define
>>>> to 100000 and recompiled the kernel module. After replacing it with
>>>> rmmod/insmod I get 0 packet loss (measured over 2 days where I before
>>>> had losses every other second or so) and can also load an additional
>>>> 450 mbit/s multicast traffic using iperf without losses. So basically
>>>> no change in performance between unicast/multicast when it comes to
>>>> lost packets on my machine.
>>>>
>>>> I think It would be best if this queue length was configurable
>>>> somehow. Either an option when creating the macvlan (like how
>>>> bridge/passthrough/etc are set) or at least when loading the module
>>>> (for instance by using a config in /etc/modprobe.d). One size does
>>>> not fit all in this situation.  
>>>
>>> The former please. You can add a netlink attribute, should be
>>> reasonably straightforward. The other macvlan attrs are defined
>>> under "MACVLAN section" in if_link.h.
>>>   
>>
>> I did some work towards a patch using the first option,
>> by adding a netlink attribute in if_link.h as suggested.
>> I agree that this was reasonably straightforward, until userspace.
>>
>> In order to use/test my new parameter I need to update iproute2 package
>> as far as I understand. But then since I use the macvlan with docker
>> I also need to update the docker macvlan driver to send this new
>> option to the kernel module.
> 
> I wish I got a cookie every time someone said they can't do the right
> thing because they'd have to update $container_system 😔

lol :)

> 
>> For this reason I would like to know if you would consider
>> merging a patch using the module_param(...) variant instead?
>>
>> I would argue that this still makes the situation better
>> and resolves the packet-loss issue, although not necessarily
>> in an optimal way. However, The upside of being able to specify the
>> parameter on a per macvlan interface level instead of globally is not
>> that big in this situation. Normally you don't use that much
>> multicast anyway so it's a parameter that only will be touched by
>> a very small user base that can understand and handle the implications
>> of such a global setting.
> 
> How about implementing .changelink in macvlan? That way you could
> modify the macvlan device independent of Docker? 
> 
> Make sure you only accept changes to the bc queue len if that's the
> only one you act on.
> 

Hmm, I see. You mean that docker can create the interface and then I can
modify it afterwards? That might be a workaround but I just submitted
a patch (like seconds before your message) with the module_param() option
and this was very clean I think. both in how little code that needed to be
changed and in how simple it is to set the option in the target environment.

This is my first time ever attemting a contribution to the kernel so
I'm quite happy to keep it simple like that too :)

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Hardcoded multicast queue length in macvlan.c driver causes poor multicast receive performance
  2020-11-25 17:12         ` Thomas Karlsson
@ 2020-11-25 18:07           ` Jakub Kicinski
  2020-11-25 22:15             ` Thomas Karlsson
  0 siblings, 1 reply; 27+ messages in thread
From: Jakub Kicinski @ 2020-11-25 18:07 UTC (permalink / raw)
  To: Thomas Karlsson; +Cc: davem, netdev

On Wed, 25 Nov 2020 18:12:34 +0100 Thomas Karlsson wrote:
> >> For this reason I would like to know if you would consider
> >> merging a patch using the module_param(...) variant instead?
> >>
> >> I would argue that this still makes the situation better
> >> and resolves the packet-loss issue, although not necessarily
> >> in an optimal way. However, The upside of being able to specify the
> >> parameter on a per macvlan interface level instead of globally is not
> >> that big in this situation. Normally you don't use that much
> >> multicast anyway so it's a parameter that only will be touched by
> >> a very small user base that can understand and handle the implications
> >> of such a global setting.  
> > 
> > How about implementing .changelink in macvlan? That way you could
> > modify the macvlan device independent of Docker? 
> > 
> > Make sure you only accept changes to the bc queue len if that's the
> > only one you act on.
> >   
> 
> Hmm, I see. You mean that docker can create the interface and then I can
> modify it afterwards? That might be a workaround but I just submitted
> a patch (like seconds before your message) with the module_param() option
> and this was very clean I think. both in how little code that needed to be
> changed and in how simple it is to set the option in the target environment.
> 
> This is my first time ever attemting a contribution to the kernel so
> I'm quite happy to keep it simple like that too :)

Module params are highly inflexible, we have a general policy not 
to accept them in the netdev world. There should even be a check 
in our patchwork which should fail here, but it appears that the patch 
did not apply in the first place:

https://patchwork.kernel.org/project/netdevbpf/patch/385b9b4c-25f5-b507-4e69-419883fa8043@paneda.se/

Make sure you're developing on top of this tree:

https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH net-next v2] macvlan: Support for high multicast packet rate
  2020-11-25 16:57       ` [PATCH] macvlan: Support for high multicast packet rate Thomas Karlsson
@ 2020-11-25 21:55         ` Thomas Karlsson
  0 siblings, 0 replies; 27+ messages in thread
From: Thomas Karlsson @ 2020-11-25 21:55 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: davem, netdev, thomas.karlsson

Background:
Broadcast and multicast packages are enqueued for later processing.
This queue was previously hardcoded to 1000.

This proved insufficient for handling very high packet rates.
This resulted in packet drops for multicast.
While at the same time unicast worked fine.

The change:
This patch make the queue len adjustable to accommodate
for environments with very high multicast packet rate.
But still keeps the default value of 1000 unless specified.

The queue len is specified using the bc_queue_len module parameter.

Signed-off-by: Thomas Karlsson <thomas.karlsson@paneda.se>
---
v2: Patch created on top of 'net-next' instead of 'torvalds/linux'

diff --git a/drivers/net/macvlan.c b/drivers/net/macvlan.c
index d9b6c44a5911..ed67fbfff450 100644
--- a/drivers/net/macvlan.c
+++ b/drivers/net/macvlan.c
@@ -12,6 +12,7 @@
 #include <linux/kernel.h>
 #include <linux/types.h>
 #include <linux/module.h>
+#include <linux/moduleparam.h>
 #include <linux/init.h>
 #include <linux/errno.h>
 #include <linux/slab.h>
@@ -35,11 +36,15 @@

 #define MACVLAN_HASH_BITS      8
 #define MACVLAN_HASH_SIZE      (1<<MACVLAN_HASH_BITS)
-#define MACVLAN_BC_QUEUE_LEN   1000
+#define MACVLAN_DEFAULT_BC_QUEUE_LEN   1000

 #define MACVLAN_F_PASSTHRU     1
 #define MACVLAN_F_ADDRCHANGE   2

+static uint bc_queue_len = MACVLAN_DEFAULT_BC_QUEUE_LEN;
+module_param(bc_queue_len, uint, 0444);
+MODULE_PARM_DESC(bc_queue_len, "The maximum length of the broadcast/multicast work queue");
+
 struct macvlan_port {
        struct net_device       *dev;
        struct hlist_head       vlan_hash[MACVLAN_HASH_SIZE];
@@ -354,7 +359,7 @@ static void macvlan_broadcast_enqueue(struct macvlan_port *port,
        MACVLAN_SKB_CB(nskb)->src = src;

        spin_lock(&port->bc_queue.lock);
-       if (skb_queue_len(&port->bc_queue) < MACVLAN_BC_QUEUE_LEN) {
+       if (skb_queue_len(&port->bc_queue) < bc_queue_len) {
                if (src)
                        dev_hold(src->dev);
                __skb_queue_tail(&port->bc_queue, nskb);
--
2.28.0

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: Hardcoded multicast queue length in macvlan.c driver causes poor multicast receive performance
  2020-11-25 18:07           ` Jakub Kicinski
@ 2020-11-25 22:15             ` Thomas Karlsson
  2020-11-25 23:01               ` Jakub Kicinski
  0 siblings, 1 reply; 27+ messages in thread
From: Thomas Karlsson @ 2020-11-25 22:15 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: davem, netdev, thomas.karlsson

On 2020-11-25 19:07, Jakub Kicinski wrote:
> On Wed, 25 Nov 2020 18:12:34 +0100 Thomas Karlsson wrote:
>>>> For this reason I would like to know if you would consider
>>>> merging a patch using the module_param(...) variant instead?
>>>>
>>>> I would argue that this still makes the situation better
>>>> and resolves the packet-loss issue, although not necessarily
>>>> in an optimal way. However, The upside of being able to specify the
>>>> parameter on a per macvlan interface level instead of globally is not
>>>> that big in this situation. Normally you don't use that much
>>>> multicast anyway so it's a parameter that only will be touched by
>>>> a very small user base that can understand and handle the implications
>>>> of such a global setting.  
>>>
>>> How about implementing .changelink in macvlan? That way you could
>>> modify the macvlan device independent of Docker? 
>>>
>>> Make sure you only accept changes to the bc queue len if that's the
>>> only one you act on.
>>>   
>>
>> Hmm, I see. You mean that docker can create the interface and then I can
>> modify it afterwards? That might be a workaround but I just submitted
>> a patch (like seconds before your message) with the module_param() option
>> and this was very clean I think. both in how little code that needed to be
>> changed and in how simple it is to set the option in the target environment.
>>
>> This is my first time ever attemting a contribution to the kernel so
>> I'm quite happy to keep it simple like that too :)
> 
> Module params are highly inflexible, we have a general policy not 
> to accept them in the netdev world.
> 

I see, although the current define seems even less flexible :)
Although, I might not have fully understood the .changelink you suggest.
Is it via the ip link set ... command? Or is there a way to set the parameters
in a more "raw" form that does not require a patch to iproute2 with parameter parsing,
error handing, man pages updates, etc. I feel that I'm getting in over my head here.

I appreciate your feedback!

> There should even be a check> in our patchwork which should fail here, but it appears that the patch 
> did not apply in the first place:
> 
> https://patchwork.kernel.org/project/netdevbpf/patch/385b9b4c-25f5-b507-4e69-419883fa8043@paneda.se/
> 
> Make sure you're developing on top of this tree:
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/
> 

Right, thanks! It's a bit of a learning curve. I had incorrectly done the work on top of torvalds/linux

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Hardcoded multicast queue length in macvlan.c driver causes poor multicast receive performance
  2020-11-25 22:15             ` Thomas Karlsson
@ 2020-11-25 23:01               ` Jakub Kicinski
  2020-11-26 20:00                 ` Thomas Karlsson
  0 siblings, 1 reply; 27+ messages in thread
From: Jakub Kicinski @ 2020-11-25 23:01 UTC (permalink / raw)
  To: Thomas Karlsson; +Cc: davem, netdev

On Wed, 25 Nov 2020 23:15:39 +0100 Thomas Karlsson wrote:
> >> This is my first time ever attemting a contribution to the kernel so
> >> I'm quite happy to keep it simple like that too :)  
> > 
> > Module params are highly inflexible, we have a general policy not 
> > to accept them in the netdev world.
> 
> I see, although the current define seems even less flexible :)

Just to be clear - the module parameter is a no-go. 
No point discussing it.

> Although, I might not have fully understood the .changelink you suggest.
> Is it via the ip link set ... command? 

Yes.

> Or is there a way to set the parameters in a more "raw" form that
> does not require a patch to iproute2 with parameter parsing, error
> handing, man pages updates, etc. I feel that I'm getting in over my
> head here.

We're here to assist! Netlink takes a little bit of effort 
to comprehend but it's very simple once you get the mechanics!

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Hardcoded multicast queue length in macvlan.c driver causes poor multicast receive performance
  2020-11-25 23:01               ` Jakub Kicinski
@ 2020-11-26 20:00                 ` Thomas Karlsson
  2020-11-27 17:27                   ` Jakub Kicinski
  0 siblings, 1 reply; 27+ messages in thread
From: Thomas Karlsson @ 2020-11-26 20:00 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: davem, netdev

On 2020-11-26 00:01, Jakub Kicinski wrote:
> On Wed, 25 Nov 2020 23:15:39 +0100 Thomas Karlsson wrote:
>>>> This is my first time ever attemting a contribution to the kernel so
>>>> I'm quite happy to keep it simple like that too :)  
>>>
>>> Module params are highly inflexible, we have a general policy not 
>>> to accept them in the netdev world.
>>
>> I see, although the current define seems even less flexible :)
> 
> Just to be clear - the module parameter is a no-go. 
> No point discussing it.

Got it!

> 
>> Although, I might not have fully understood the .changelink you suggest.
>> Is it via the ip link set ... command? 
> 
> Yes.
> 
>> Or is there a way to set the parameters in a more "raw" form that
>> does not require a patch to iproute2 with parameter parsing, error
>> handing, man pages updates, etc. I feel that I'm getting in over my
>> head here.
> 
> We're here to assist! Netlink takes a little bit of effort 
> to comprehend but it's very simple once you get the mechanics!
> 

Thanks for the encouragement, I have been able to build iproute2 today and
I am successfully communicating with the driver now being able to set and retrieve my queue len!

As I'm working on this I do got a question. I placed the bc_queue_len into the struct macvlan_port *port
since that is where the bc_queue is located today. But when I change and retrieve the queue from userspace I realize
that all macvlan interfaces that share the same physical lowerdev uses the same port structure and thus
the same bc_queue_len.

It confused me at first and I'm not sure if that is how it should be. I expected the driver to have different
bc_queues for all macvlan interfaces no matter which lowerdev they were using but obviously that is not the case.

It may be a bit confusing to change bc_queue_len on one macvlan and see that the change was applied to more than one.

But I'm not sure if I should just move bc_queue_len to the struct macvlan_dev either. because then different macvlans will use different queue lengths while they still use the same queue. Which may also be considered a bit illogical

Let me know what you prefer here!

Thanks!





^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Hardcoded multicast queue length in macvlan.c driver causes poor multicast receive performance
  2020-11-26 20:00                 ` Thomas Karlsson
@ 2020-11-27 17:27                   ` Jakub Kicinski
  2020-11-27 23:13                     ` Thomas Karlsson
  0 siblings, 1 reply; 27+ messages in thread
From: Jakub Kicinski @ 2020-11-27 17:27 UTC (permalink / raw)
  To: Thomas Karlsson
  Cc: davem, netdev, jiri, kaber, edumazet, vyasevic, alexander.duyck

On Thu, 26 Nov 2020 21:00:46 +0100 Thomas Karlsson wrote:
> On 2020-11-26 00:01, Jakub Kicinski wrote:
> > On Wed, 25 Nov 2020 23:15:39 +0100 Thomas Karlsson wrote:  
> >> Or is there a way to set the parameters in a more "raw" form that
> >> does not require a patch to iproute2 with parameter parsing, error
> >> handing, man pages updates, etc. I feel that I'm getting in over my
> >> head here.  
> > 
> > We're here to assist! Netlink takes a little bit of effort 
> > to comprehend but it's very simple once you get the mechanics!
> >   
> 
> Thanks for the encouragement, I have been able to build iproute2 today and
> I am successfully communicating with the driver now being able to set and retrieve my queue len!
> 
> As I'm working on this I do got a question. I placed the bc_queue_len into the struct macvlan_port *port
> since that is where the bc_queue is located today. But when I change and retrieve the queue from userspace I realize
> that all macvlan interfaces that share the same physical lowerdev uses the same port structure and thus
> the same bc_queue_len.

Indeed looks like its an ingress attribute.

> It confused me at first and I'm not sure if that is how it should be. I expected the driver to have different
> bc_queues for all macvlan interfaces no matter which lowerdev they were using but obviously that is not the case.
> 
> It may be a bit confusing to change bc_queue_len on one macvlan and see that the change was applied to more than one.
> 
> But I'm not sure if I should just move bc_queue_len to the struct macvlan_dev either. because then different macvlans will use different queue lengths while they still use the same queue. Which may also be considered a bit illogical
> 
> Let me know what you prefer here!

I'd record the queue len requested by each interface in their struct
macvlan_dev and then calculate a max over the members to set the actual
value in struct macvlan_port.

Let me CC some extra people, looks like macvlan does not have a
maintainer..

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: Hardcoded multicast queue length in macvlan.c driver causes poor multicast receive performance
  2020-11-27 17:27                   ` Jakub Kicinski
@ 2020-11-27 23:13                     ` Thomas Karlsson
  0 siblings, 0 replies; 27+ messages in thread
From: Thomas Karlsson @ 2020-11-27 23:13 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: davem, netdev, jiri, kaber, edumazet, vyasevic, alexander.duyck

On 2020-11-27 18:27, Jakub Kicinski wrote:
> On Thu, 26 Nov 2020 21:00:46 +0100 Thomas Karlsson wrote:
>> On 2020-11-26 00:01, Jakub Kicinski wrote:
>>> On Wed, 25 Nov 2020 23:15:39 +0100 Thomas Karlsson wrote:  
>>>> Or is there a way to set the parameters in a more "raw" form that
>>>> does not require a patch to iproute2 with parameter parsing, error
>>>> handing, man pages updates, etc. I feel that I'm getting in over my
>>>> head here.  
>>>
>>> We're here to assist! Netlink takes a little bit of effort 
>>> to comprehend but it's very simple once you get the mechanics!
>>>   
>>
>> Thanks for the encouragement, I have been able to build iproute2 today and
>> I am successfully communicating with the driver now being able to set and retrieve my queue len!
>>
>> As I'm working on this I do got a question. I placed the bc_queue_len into the struct macvlan_port *port
>> since that is where the bc_queue is located today. But when I change and retrieve the queue from userspace I realize
>> that all macvlan interfaces that share the same physical lowerdev uses the same port structure and thus
>> the same bc_queue_len.
> 
> Indeed looks like its an ingress attribute.
> 
>> It confused me at first and I'm not sure if that is how it should be. I expected the driver to have different
>> bc_queues for all macvlan interfaces no matter which lowerdev they were using but obviously that is not the case.
>>
>> It may be a bit confusing to change bc_queue_len on one macvlan and see that the change was applied to more than one.
>>
>> But I'm not sure if I should just move bc_queue_len to the struct macvlan_dev either. because then different macvlans will use different queue lengths while they still use the same queue. Which may also be considered a bit illogical
>>
>> Let me know what you prefer here!
> 
> I'd record the queue len requested by each interface in their struct
> macvlan_dev and then calculate a max over the members to set the actual
> value in struct macvlan_port.

That sounds like a good approach to me. I can see that for example mtu is
handled in a similar aggregated way. I'll give it a try.
Thanks!

> 
> Let me CC some extra people, looks like macvlan does not have a
> maintainer..
> 

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH net-next v3] macvlan: Support for high multicast packet rate
  2020-11-23 22:30   ` Jakub Kicinski
  2020-11-25 12:51     ` Thomas Karlsson
@ 2020-11-30 14:00     ` Thomas Karlsson
  2020-12-01 19:11       ` Jakub Kicinski
  2020-11-30 14:23     ` [PATCH iproute2-next v1] iplink macvlan: Added bcqueuelen parameter Thomas Karlsson
                       ` (2 subsequent siblings)
  4 siblings, 1 reply; 27+ messages in thread
From: Thomas Karlsson @ 2020-11-30 14:00 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: davem, netdev, jiri, kaber, edumazet, vyasevic, alexander.duyck,
	thomas.karlsson

Background:
Broadcast and multicast packages are enqueued for later processing.
This queue was previously hardcoded to 1000.

This proved insufficient for handling very high packet rates.
This resulted in packet drops for multicast.
While at the same time unicast worked fine.

The change:
This patch make the queue length adjustable to accommodate
for environments with very high multicast packet rate.
But still keeps the default value of 1000 unless specified.

The queue length is specified as a request per macvlan
using the IFLA_MACVLAN_BC_QUEUE_LEN parameter.

The actual used queue length will then be the maximum of
any macvlan connected to the same port. The actual used
queue length for the port can be retrieved (read only)
by the IFLA_MACVLAN_BC_QUEUE_LEN_USED parameter for verification.

This will be followed up by a patch to iproute2
in order to adjust the parameter from userspace.

Signed-off-by: Thomas Karlsson <thomas.karlsson@paneda.se>
---

v3 switched to using netlink attributes

v1/2 used module_param

 drivers/net/macvlan.c              | 44 ++++++++++++++++++++++++++++--
 include/linux/if_macvlan.h         |  1 +
 include/uapi/linux/if_link.h       |  2 ++
 tools/include/uapi/linux/if_link.h |  2 ++
 4 files changed, 47 insertions(+), 2 deletions(-)

diff --git a/drivers/net/macvlan.c b/drivers/net/macvlan.c
index d9b6c44a5911..b8197761248e 100644
--- a/drivers/net/macvlan.c
+++ b/drivers/net/macvlan.c
@@ -35,7 +35,7 @@
 
 #define MACVLAN_HASH_BITS	8
 #define MACVLAN_HASH_SIZE	(1<<MACVLAN_HASH_BITS)
-#define MACVLAN_BC_QUEUE_LEN	1000
+#define MACVLAN_DEFAULT_BC_QUEUE_LEN	1000
 
 #define MACVLAN_F_PASSTHRU	1
 #define MACVLAN_F_ADDRCHANGE	2
@@ -46,6 +46,7 @@ struct macvlan_port {
 	struct list_head	vlans;
 	struct sk_buff_head	bc_queue;
 	struct work_struct	bc_work;
+	u32			bc_queue_len_used;
 	u32			flags;
 	int			count;
 	struct hlist_head	vlan_source_hash[MACVLAN_HASH_SIZE];
@@ -67,6 +68,7 @@ struct macvlan_skb_cb {
 #define MACVLAN_SKB_CB(__skb) ((struct macvlan_skb_cb *)&((__skb)->cb[0]))
 
 static void macvlan_port_destroy(struct net_device *dev);
+static void update_port_bc_queue_len(struct macvlan_port *port);
 
 static inline bool macvlan_passthru(const struct macvlan_port *port)
 {
@@ -354,7 +356,7 @@ static void macvlan_broadcast_enqueue(struct macvlan_port *port,
 	MACVLAN_SKB_CB(nskb)->src = src;
 
 	spin_lock(&port->bc_queue.lock);
-	if (skb_queue_len(&port->bc_queue) < MACVLAN_BC_QUEUE_LEN) {
+	if (skb_queue_len(&port->bc_queue) < port->bc_queue_len_used) {
 		if (src)
 			dev_hold(src->dev);
 		__skb_queue_tail(&port->bc_queue, nskb);
@@ -1218,6 +1220,7 @@ static int macvlan_port_create(struct net_device *dev)
 	for (i = 0; i < MACVLAN_HASH_SIZE; i++)
 		INIT_HLIST_HEAD(&port->vlan_source_hash[i]);
 
+	port->bc_queue_len_used = MACVLAN_DEFAULT_BC_QUEUE_LEN;
 	skb_queue_head_init(&port->bc_queue);
 	INIT_WORK(&port->bc_work, macvlan_process_broadcast);
 
@@ -1486,6 +1489,12 @@ int macvlan_common_newlink(struct net *src_net, struct net_device *dev,
 			goto destroy_macvlan_port;
 	}
 
+	vlan->bc_queue_len_requested = MACVLAN_DEFAULT_BC_QUEUE_LEN;
+	if (data && data[IFLA_MACVLAN_BC_QUEUE_LEN])
+		vlan->bc_queue_len_requested = nla_get_u32(data[IFLA_MACVLAN_BC_QUEUE_LEN]);
+	if (vlan->bc_queue_len_requested > port->bc_queue_len_used)
+		port->bc_queue_len_used = vlan->bc_queue_len_requested;
+
 	err = register_netdevice(dev);
 	if (err < 0)
 		goto destroy_macvlan_port;
@@ -1529,6 +1538,7 @@ void macvlan_dellink(struct net_device *dev, struct list_head *head)
 	if (vlan->mode == MACVLAN_MODE_SOURCE)
 		macvlan_flush_sources(vlan->port, vlan);
 	list_del_rcu(&vlan->list);
+	update_port_bc_queue_len(vlan->port);
 	unregister_netdevice_queue(dev, head);
 	netdev_upper_dev_unlink(vlan->lowerdev, dev);
 }
@@ -1572,6 +1582,15 @@ static int macvlan_changelink(struct net_device *dev,
 		}
 		vlan->flags = flags;
 	}
+
+	if (data && data[IFLA_MACVLAN_BC_QUEUE_LEN_USED])
+		return -EINVAL; /* Trying to set a read only attribute */
+
+	if (data && data[IFLA_MACVLAN_BC_QUEUE_LEN]) {
+		vlan->bc_queue_len_requested = nla_get_u32(data[IFLA_MACVLAN_BC_QUEUE_LEN]);
+		update_port_bc_queue_len(vlan->port);
+	}
+
 	if (set_mode)
 		vlan->mode = mode;
 	if (data && data[IFLA_MACVLAN_MACADDR_MODE]) {
@@ -1602,6 +1621,8 @@ static size_t macvlan_get_size(const struct net_device *dev)
 		+ nla_total_size(2) /* IFLA_MACVLAN_FLAGS */
 		+ nla_total_size(4) /* IFLA_MACVLAN_MACADDR_COUNT */
 		+ macvlan_get_size_mac(vlan) /* IFLA_MACVLAN_MACADDR */
+		+ nla_total_size(4) /* IFLA_MACVLAN_BC_QUEUE_LEN */
+		+ nla_total_size(4) /* IFLA_MACVLAN_BC_QUEUE_LEN_USED */
 		);
 }
 
@@ -1625,6 +1646,7 @@ static int macvlan_fill_info(struct sk_buff *skb,
 				const struct net_device *dev)
 {
 	struct macvlan_dev *vlan = netdev_priv(dev);
+	struct macvlan_port *port = vlan->port;
 	int i;
 	struct nlattr *nest;
 
@@ -1645,6 +1667,10 @@ static int macvlan_fill_info(struct sk_buff *skb,
 		}
 		nla_nest_end(skb, nest);
 	}
+	if (nla_put_u32(skb, IFLA_MACVLAN_BC_QUEUE_LEN, vlan->bc_queue_len_requested))
+		goto nla_put_failure;
+	if (nla_put_u32(skb, IFLA_MACVLAN_BC_QUEUE_LEN_USED, port->bc_queue_len_used))
+		goto nla_put_failure;
 	return 0;
 
 nla_put_failure:
@@ -1658,6 +1684,8 @@ static const struct nla_policy macvlan_policy[IFLA_MACVLAN_MAX + 1] = {
 	[IFLA_MACVLAN_MACADDR] = { .type = NLA_BINARY, .len = MAX_ADDR_LEN },
 	[IFLA_MACVLAN_MACADDR_DATA] = { .type = NLA_NESTED },
 	[IFLA_MACVLAN_MACADDR_COUNT] = { .type = NLA_U32 },
+	[IFLA_MACVLAN_BC_QUEUE_LEN] = { .type = NLA_U32 },
+	[IFLA_MACVLAN_BC_QUEUE_LEN_USED] = { .type = NLA_U32 },
 };
 
 int macvlan_link_register(struct rtnl_link_ops *ops)
@@ -1688,6 +1716,18 @@ static struct rtnl_link_ops macvlan_link_ops = {
 	.priv_size      = sizeof(struct macvlan_dev),
 };
 
+static void update_port_bc_queue_len(struct macvlan_port *port)
+{
+	struct macvlan_dev *vlan;
+	u32 max_bc_queue_len_requested = 0;
+
+	list_for_each_entry_rcu(vlan, &port->vlans, list) {
+		if (vlan->bc_queue_len_requested > max_bc_queue_len_requested)
+			max_bc_queue_len_requested = vlan->bc_queue_len_requested;
+	}
+	port->bc_queue_len_used = max_bc_queue_len_requested;
+}
+
 static int macvlan_device_event(struct notifier_block *unused,
 				unsigned long event, void *ptr)
 {
diff --git a/include/linux/if_macvlan.h b/include/linux/if_macvlan.h
index a367ead4bf4b..c3923fdbe1f0 100644
--- a/include/linux/if_macvlan.h
+++ b/include/linux/if_macvlan.h
@@ -30,6 +30,7 @@ struct macvlan_dev {
 	enum macvlan_mode	mode;
 	u16			flags;
 	unsigned int		macaddr_count;
+	u32			bc_queue_len_requested;
 #ifdef CONFIG_NET_POLL_CONTROLLER
 	struct netpoll		*netpoll;
 #endif
diff --git a/include/uapi/linux/if_link.h b/include/uapi/linux/if_link.h
index c4b23f06f69e..874cc12a34d9 100644
--- a/include/uapi/linux/if_link.h
+++ b/include/uapi/linux/if_link.h
@@ -588,6 +588,8 @@ enum {
 	IFLA_MACVLAN_MACADDR,
 	IFLA_MACVLAN_MACADDR_DATA,
 	IFLA_MACVLAN_MACADDR_COUNT,
+	IFLA_MACVLAN_BC_QUEUE_LEN,
+	IFLA_MACVLAN_BC_QUEUE_LEN_USED,
 	__IFLA_MACVLAN_MAX,
 };
 
diff --git a/tools/include/uapi/linux/if_link.h b/tools/include/uapi/linux/if_link.h
index 781e482dc499..d208b2af697f 100644
--- a/tools/include/uapi/linux/if_link.h
+++ b/tools/include/uapi/linux/if_link.h
@@ -409,6 +409,8 @@ enum {
 	IFLA_MACVLAN_MACADDR,
 	IFLA_MACVLAN_MACADDR_DATA,
 	IFLA_MACVLAN_MACADDR_COUNT,
+	IFLA_MACVLAN_BC_QUEUE_LEN,
+	IFLA_MACVLAN_BC_QUEUE_LEN_USED,
 	__IFLA_MACVLAN_MAX,
 };
 
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH iproute2-next v1] iplink macvlan: Added bcqueuelen parameter
  2020-11-23 22:30   ` Jakub Kicinski
  2020-11-25 12:51     ` Thomas Karlsson
  2020-11-30 14:00     ` [PATCH net-next v3] macvlan: Support for high multicast packet rate Thomas Karlsson
@ 2020-11-30 14:23     ` Thomas Karlsson
  2020-12-10 16:07       ` Thomas Karlsson
  2020-12-02 18:49     ` [PATCH net-next v4] macvlan: Support for high multicast packet rate Thomas Karlsson
  2020-12-14 10:42     ` [PATCH iproute2-next v2] iplink:macvlan: Added bcqueuelen parameter Thomas Karlsson
  4 siblings, 1 reply; 27+ messages in thread
From: Thomas Karlsson @ 2020-11-30 14:23 UTC (permalink / raw)
  To: Jakub Kicinski, stephen
  Cc: davem, netdev, jiri, kaber, edumazet, vyasevic, alexander.duyck,
	thomas.karlsson

This is a follow up patch to iproute2 that allows the user
to set and retrieve the added IFLA_MACVLAN_BC_QUEUE_LEN parameter
via the bcqueuelen command line argument

This controls the requested size of the macvlan queue for broadcast
and multicast packages.

If not specified the driver default (1000) is used.

Note: The request is per macvlan but the actually used queue
length per port is the maximum of any request to any macvlan
connected to the same port.

The used queue length (IFLA_MACVLAN_BC_QUEUE_LEN_USED) is also
retrieved and displayed in order to aid in the understanding
of the setting. However, it cannot be directly set.

Signed-off-by: Thomas Karlsson <thomas.karlsson@paneda.se>
---

v1 Initial version
   Note: This patch first requires that the corresponding
   kernel patch in 0c88607c-1b63-e8b5-8a84-14b63e55e8e2@paneda.se
   to macvlan is merged to be usable.

 include/uapi/linux/if_link.h |  2 ++
 ip/iplink_macvlan.c          | 33 +++++++++++++++++++++++++++++++--
 man/man8/ip-link.8.in        | 33 +++++++++++++++++++++++++++++++++
 3 files changed, 66 insertions(+), 2 deletions(-)

diff --git a/include/uapi/linux/if_link.h b/include/uapi/linux/if_link.h
index 307e5c24..faa90938 100644
--- a/include/uapi/linux/if_link.h
+++ b/include/uapi/linux/if_link.h
@@ -586,6 +586,8 @@ enum {
 	IFLA_MACVLAN_MACADDR,
 	IFLA_MACVLAN_MACADDR_DATA,
 	IFLA_MACVLAN_MACADDR_COUNT,
+	IFLA_MACVLAN_BC_QUEUE_LEN,
+	IFLA_MACVLAN_BC_QUEUE_LEN_USED,
 	__IFLA_MACVLAN_MAX,
 };
 
diff --git a/ip/iplink_macvlan.c b/ip/iplink_macvlan.c
index b966a615..302a3748 100644
--- a/ip/iplink_macvlan.c
+++ b/ip/iplink_macvlan.c
@@ -30,12 +30,13 @@
 static void print_explain(struct link_util *lu, FILE *f)
 {
 	fprintf(f,
-		"Usage: ... %s mode MODE [flag MODE_FLAG] MODE_OPTS\n"
+		"Usage: ... %s mode MODE [flag MODE_FLAG] MODE_OPTS [bcqueuelen BC_QUEUE_LEN]\n"
 		"\n"
 		"MODE: private | vepa | bridge | passthru | source\n"
 		"MODE_FLAG: null | nopromisc\n"
 		"MODE_OPTS: for mode \"source\":\n"
-		"\tmacaddr { { add | del } <macaddr> | set [ <macaddr> [ <macaddr>  ... ] ] | flush }\n",
+		"\tmacaddr { { add | del } <macaddr> | set [ <macaddr> [ <macaddr>  ... ] ] | flush }\n"
+		"BC_QUEUE_LEN: Length of the rx queue for broadcast/multicast: [0-4294967295]\n",
 		lu->id
 	);
 }
@@ -62,6 +63,14 @@ static int flag_arg(const char *arg)
 	return -1;
 }
 
+static int bc_queue_len_arg(const char *arg)
+{
+	fprintf(stderr,
+		"Error: argument of \"bcqueuelen\" must be a positive integer [0-4294967295], not \"%s\"\n",
+		arg);
+	return -1;
+}
+
 static int macvlan_parse_opt(struct link_util *lu, int argc, char **argv,
 			  struct nlmsghdr *n)
 {
@@ -150,6 +159,14 @@ static int macvlan_parse_opt(struct link_util *lu, int argc, char **argv,
 		} else if (matches(*argv, "nopromisc") == 0) {
 			flags |= MACVLAN_FLAG_NOPROMISC;
 			has_flags = 1;
+		} else if (matches(*argv, "bcqueuelen") == 0) {
+			__u32 bc_queue_len;
+			NEXT_ARG();
+			
+			if (get_u32(&bc_queue_len, *argv, 0)) {
+				return bc_queue_len_arg(*argv);
+			}
+			addattr32(n, 1024, IFLA_MACVLAN_BC_QUEUE_LEN, bc_queue_len);
 		} else if (matches(*argv, "help") == 0) {
 			explain(lu);
 			return -1;
@@ -212,6 +229,18 @@ static void macvlan_print_opt(struct link_util *lu, FILE *f, struct rtattr *tb[]
 	if (flags & MACVLAN_FLAG_NOPROMISC)
 		print_bool(PRINT_ANY, "nopromisc", "nopromisc ", true);
 
+	if (tb[IFLA_MACVLAN_BC_QUEUE_LEN] &&
+		RTA_PAYLOAD(tb[IFLA_MACVLAN_BC_QUEUE_LEN]) >= sizeof(__u32)) {
+		__u32 bc_queue_len = rta_getattr_u32(tb[IFLA_MACVLAN_BC_QUEUE_LEN]);
+		print_luint(PRINT_ANY, "bcqueuelen", "bcqueuelen %lu ", bc_queue_len);
+	}
+
+	if (tb[IFLA_MACVLAN_BC_QUEUE_LEN_USED] &&
+		RTA_PAYLOAD(tb[IFLA_MACVLAN_BC_QUEUE_LEN_USED]) >= sizeof(__u32)) {
+		__u32 bc_queue_len = rta_getattr_u32(tb[IFLA_MACVLAN_BC_QUEUE_LEN_USED]);
+		print_luint(PRINT_ANY, "usedbcqueuelen", "usedbcqueuelen %lu ", bc_queue_len);
+	}
+
 	/* in source mode, there are more options to print */
 
 	if (mode != MACVLAN_MODE_SOURCE)
diff --git a/man/man8/ip-link.8.in b/man/man8/ip-link.8.in
index 1ff01744..3516765a 100644
--- a/man/man8/ip-link.8.in
+++ b/man/man8/ip-link.8.in
@@ -1352,6 +1352,7 @@ the following additional arguments are supported:
 .BR type " { " macvlan " | " macvtap " } "
 .BR mode " { " private " | " vepa " | " bridge " | " passthru
 .RB " [ " nopromisc " ] | " source " } "
+.RB " [ " bcqueuelen " { " LENGTH " } ] "
 
 .in +8
 .sp
@@ -1395,6 +1396,18 @@ against source mac address from received frames on underlying interface. This
 allows creating mac based VLAN associations, instead of standard port or tag
 based. The feature is useful to deploy 802.1x mac based behavior,
 where drivers of underlying interfaces doesn't allows that.
+
+.BR bcqueuelen " { " LENGTH " } "
+- Set the length of the RX queue used to process broadcast and multicast packets.
+.BR LENGTH " must be a positive integer in the range [0-4294967295]."
+Setting a length of 0 will effectively drop all broadcast/multicast traffic.
+If not specified the macvlan driver default (1000) is used.
+Note that all macvlans that share the same underlying device are using the same
+.RB "queue. The parameter here is a " request ", the actual queue length used"
+will be the maximum length that any macvlan interface has requested.
+When listing device parameters both the bcqueuelen parameter
+as well as the actual used bcqueuelen are listed to better help
+the user understand the setting.
 .in -8
 
 .TP
@@ -2451,6 +2464,26 @@ Commands:
 .sp
 .in -8
 
+Update the broadcast/multicast queue length.
+
+.B "ip link set type { macvlan | macvap } "
+[
+.BI bcqueuelen "  LENGTH  "
+]
+
+.in +8
+.BI bcqueuelen " LENGTH "
+- Set the length of the RX queue used to process broadcast and multicast packets.
+.IR LENGTH " must be a positive integer in the range [0-4294967295]."
+Setting a length of 0 will effectively drop all broadcast/multicast traffic.
+If not specified the macvlan driver default (1000) is used.
+Note that all macvlans that share the same underlying device are using the same
+.RB "queue. The parameter here is a " request ", the actual queue length used"
+will be the maximum length that any macvlan interface has requested.
+When listing device parameters both the bcqueuelen parameter
+as well as the actual used bcqueuelen are listed to better help
+the user understand the setting.
+.in -8
 
 .SS  ip link show - display device attributes
 
-- 
2.29.2



^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [PATCH net-next v3] macvlan: Support for high multicast packet rate
  2020-11-30 14:00     ` [PATCH net-next v3] macvlan: Support for high multicast packet rate Thomas Karlsson
@ 2020-12-01 19:11       ` Jakub Kicinski
  2020-12-02 11:28         ` Thomas Karlsson
  0 siblings, 1 reply; 27+ messages in thread
From: Jakub Kicinski @ 2020-12-01 19:11 UTC (permalink / raw)
  To: Thomas Karlsson
  Cc: davem, netdev, jiri, kaber, edumazet, vyasevic, alexander.duyck

On Mon, 30 Nov 2020 15:00:43 +0100 Thomas Karlsson wrote:
> Background:
> Broadcast and multicast packages are enqueued for later processing.
> This queue was previously hardcoded to 1000.
> 
> This proved insufficient for handling very high packet rates.
> This resulted in packet drops for multicast.
> While at the same time unicast worked fine.
> 
> The change:
> This patch make the queue length adjustable to accommodate
> for environments with very high multicast packet rate.
> But still keeps the default value of 1000 unless specified.
> 
> The queue length is specified as a request per macvlan
> using the IFLA_MACVLAN_BC_QUEUE_LEN parameter.
> 
> The actual used queue length will then be the maximum of
> any macvlan connected to the same port. The actual used
> queue length for the port can be retrieved (read only)
> by the IFLA_MACVLAN_BC_QUEUE_LEN_USED parameter for verification.
> 
> This will be followed up by a patch to iproute2
> in order to adjust the parameter from userspace.
> 
> Signed-off-by: Thomas Karlsson <thomas.karlsson@paneda.se>

Looks good! Minor nits below:

> @@ -1218,6 +1220,7 @@ static int macvlan_port_create(struct net_device *dev)
>  	for (i = 0; i < MACVLAN_HASH_SIZE; i++)
>  		INIT_HLIST_HEAD(&port->vlan_source_hash[i]);
>  
> +	port->bc_queue_len_used = MACVLAN_DEFAULT_BC_QUEUE_LEN;

Should this be inited to 0? Otherwise if the first link asks for lower
queue len than the default it will not get set, right?

>  	skb_queue_head_init(&port->bc_queue);
>  	INIT_WORK(&port->bc_work, macvlan_process_broadcast);
>  
> @@ -1486,6 +1489,12 @@ int macvlan_common_newlink(struct net *src_net, struct net_device *dev,
>  			goto destroy_macvlan_port;
>  	}
>  
> +	vlan->bc_queue_len_requested = MACVLAN_DEFAULT_BC_QUEUE_LEN;
> +	if (data && data[IFLA_MACVLAN_BC_QUEUE_LEN])
> +		vlan->bc_queue_len_requested = nla_get_u32(data[IFLA_MACVLAN_BC_QUEUE_LEN]);
> +	if (vlan->bc_queue_len_requested > port->bc_queue_len_used)
> +		port->bc_queue_len_used = vlan->bc_queue_len_requested;

Or perhaps we should just call update_port_bc_queue_len() here?

>  	err = register_netdevice(dev);
>  	if (err < 0)
>  		goto destroy_macvlan_port;

> @@ -1658,6 +1684,8 @@ static const struct nla_policy macvlan_policy[IFLA_MACVLAN_MAX + 1] = {
>  	[IFLA_MACVLAN_MACADDR] = { .type = NLA_BINARY, .len = MAX_ADDR_LEN },
>  	[IFLA_MACVLAN_MACADDR_DATA] = { .type = NLA_NESTED },
>  	[IFLA_MACVLAN_MACADDR_COUNT] = { .type = NLA_U32 },
> +	[IFLA_MACVLAN_BC_QUEUE_LEN] = { .type = NLA_U32 },
> +	[IFLA_MACVLAN_BC_QUEUE_LEN_USED] = { .type = NLA_U32 },

This is an input policy, so you can set type to NLA_REJECT and you
won't have to check if it's set on input.

>  };
>  
>  int macvlan_link_register(struct rtnl_link_ops *ops)
> @@ -1688,6 +1716,18 @@ static struct rtnl_link_ops macvlan_link_ops = {
>  	.priv_size      = sizeof(struct macvlan_dev),
>  };
>  
> +static void update_port_bc_queue_len(struct macvlan_port *port)
> +{
> +	struct macvlan_dev *vlan;
> +	u32 max_bc_queue_len_requested = 0;

Please reorder so that the vars are longest line to shortest.

> +	list_for_each_entry_rcu(vlan, &port->vlans, list) {

I don't think you need the _rcu() flavor here, this is always called
from the configuration paths holding RTNL lock, right?

> +		if (vlan->bc_queue_len_requested > max_bc_queue_len_requested)
> +			max_bc_queue_len_requested = vlan->bc_queue_len_requested;
> +	}
> +	port->bc_queue_len_used = max_bc_queue_len_requested;
> +}
> +
>  static int macvlan_device_event(struct notifier_block *unused,
>  				unsigned long event, void *ptr)
>  {

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH net-next v3] macvlan: Support for high multicast packet rate
  2020-12-01 19:11       ` Jakub Kicinski
@ 2020-12-02 11:28         ` Thomas Karlsson
  2020-12-02 16:44           ` Jakub Kicinski
  0 siblings, 1 reply; 27+ messages in thread
From: Thomas Karlsson @ 2020-12-02 11:28 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: davem, netdev, jiri, kaber, edumazet, vyasevic, alexander.duyck

On 2020-12-01 20:11, Jakub Kicinski wrote:
> On Mon, 30 Nov 2020 15:00:43 +0100 Thomas Karlsson wrote:
>> Background:
>> Broadcast and multicast packages are enqueued for later processing.
>> This queue was previously hardcoded to 1000.
>>
>> This proved insufficient for handling very high packet rates.
>> This resulted in packet drops for multicast.
>> While at the same time unicast worked fine.
>>
>> The change:
>> This patch make the queue length adjustable to accommodate
>> for environments with very high multicast packet rate.
>> But still keeps the default value of 1000 unless specified.
>>
>> The queue length is specified as a request per macvlan
>> using the IFLA_MACVLAN_BC_QUEUE_LEN parameter.
>>
>> The actual used queue length will then be the maximum of
>> any macvlan connected to the same port. The actual used
>> queue length for the port can be retrieved (read only)
>> by the IFLA_MACVLAN_BC_QUEUE_LEN_USED parameter for verification.
>>
>> This will be followed up by a patch to iproute2
>> in order to adjust the parameter from userspace.
>>
>> Signed-off-by: Thomas Karlsson <thomas.karlsson@paneda.se>
> 
> Looks good! Minor nits below:

:)

> 
>> @@ -1218,6 +1220,7 @@ static int macvlan_port_create(struct net_device *dev)
>>  	for (i = 0; i < MACVLAN_HASH_SIZE; i++)
>>  		INIT_HLIST_HEAD(&port->vlan_source_hash[i]);
>>  
>> +	port->bc_queue_len_used = MACVLAN_DEFAULT_BC_QUEUE_LEN;
> 
> Should this be inited to 0? Otherwise if the first link asks for lower
> queue len than the default it will not get set, right?

Indeed, looks you are right, see also below

 
>>  	skb_queue_head_init(&port->bc_queue);
>>  	INIT_WORK(&port->bc_work, macvlan_process_broadcast);
>>  
>> @@ -1486,6 +1489,12 @@ int macvlan_common_newlink(struct net *src_net, struct net_device *dev,
>>  			goto destroy_macvlan_port;
>>  	}
>>  
>> +	vlan->bc_queue_len_requested = MACVLAN_DEFAULT_BC_QUEUE_LEN;
>> +	if (data && data[IFLA_MACVLAN_BC_QUEUE_LEN])
>> +		vlan->bc_queue_len_requested = nla_get_u32(data[IFLA_MACVLAN_BC_QUEUE_LEN]);
>> +	if (vlan->bc_queue_len_requested > port->bc_queue_len_used)
>> +		port->bc_queue_len_used = vlan->bc_queue_len_requested;
> 
> Or perhaps we should just call update_port_bc_queue_len() here?

That would also have prevented the above bug... So yes, I think that is better
to keep the logic only in one place. I'll change to that.

 
>>  	err = register_netdevice(dev);
>>  	if (err < 0)
>>  		goto destroy_macvlan_port;
> 
>> @@ -1658,6 +1684,8 @@ static const struct nla_policy macvlan_policy[IFLA_MACVLAN_MAX + 1] = {
>>  	[IFLA_MACVLAN_MACADDR] = { .type = NLA_BINARY, .len = MAX_ADDR_LEN },
>>  	[IFLA_MACVLAN_MACADDR_DATA] = { .type = NLA_NESTED },
>>  	[IFLA_MACVLAN_MACADDR_COUNT] = { .type = NLA_U32 },
>> +	[IFLA_MACVLAN_BC_QUEUE_LEN] = { .type = NLA_U32 },
>> +	[IFLA_MACVLAN_BC_QUEUE_LEN_USED] = { .type = NLA_U32 },
> 
> This is an input policy, so you can set type to NLA_REJECT and you
> won't have to check if it's set on input.
> 

Great!

>>  };
>>  
>>  int macvlan_link_register(struct rtnl_link_ops *ops)
>> @@ -1688,6 +1716,18 @@ static struct rtnl_link_ops macvlan_link_ops = {
>>  	.priv_size      = sizeof(struct macvlan_dev),
>>  };
>>  
>> +static void update_port_bc_queue_len(struct macvlan_port *port)
>> +{
>> +	struct macvlan_dev *vlan;
>> +	u32 max_bc_queue_len_requested = 0;
> 
> Please reorder so that the vars are longest line to shortest.
> 
got it

>> +	list_for_each_entry_rcu(vlan, &port->vlans, list) {
> 
> I don't think you need the _rcu() flavor here, this is always called
> from the configuration paths holding RTNL lock, right?
> 

To be honest, what to use/not to use when traversing the list was what caused me the most
doubt/trouble of the patch :)

I sort of assumed that there must be some outer synchronisation that prevented
two or more concurrent calls to new/delte/change link. but wasn't sure how
and where that synchonisation took place. Now that I have googled RTLN lock I understand
that part much better.

The main reason I went with _rcu was because the existing code is using list_del_rcu and
list_add_tail_rcu when modifying the list as well as _rcu when accessing/traversing (in some places).
So I figured if they needed the _rcu variants I too would need that.

But from a closer inspection I think in that situation it is only needed because the list is accessed
from for example macvlan_handle_frame (obviously not protected by the RTLN lock) using _rcu version
and under the rcu_read_lock as protection. So then it must also be updated with _rcu 
in all places of course. Even if all the updates are done under the RTNL lock.

This was a long ramble :)
But thanks, I think I understand the synchronisation mechanism in the kernel a bit better now!

As I'm only calling my function from the netlink configuration functions under RTLN lock
It should be safe to drop the _rcu version as you say, because the list is only
modified in those functions too. Great!


>> +		if (vlan->bc_queue_len_requested > max_bc_queue_len_requested)
>> +			max_bc_queue_len_requested = vlan->bc_queue_len_requested;
>> +	}
>> +	port->bc_queue_len_used = max_bc_queue_len_requested;
>> +}
>> +
>>  static int macvlan_device_event(struct notifier_block *unused,
>>  				unsigned long event, void *ptr)
>>  {


I also noticed I got a few line length warnings in patchworks but none when I ran the ./scrips/checkpatch.pl
locally. So is the net tree using strict 80 chars? I would prefer not to introduce extra line breaks
on those lines as I think it will hurt readability but of course I will if needed.

I will publish a v4 later today.

/Thomas

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH net-next v3] macvlan: Support for high multicast packet rate
  2020-12-02 11:28         ` Thomas Karlsson
@ 2020-12-02 16:44           ` Jakub Kicinski
  0 siblings, 0 replies; 27+ messages in thread
From: Jakub Kicinski @ 2020-12-02 16:44 UTC (permalink / raw)
  To: Thomas Karlsson
  Cc: davem, netdev, jiri, kaber, edumazet, vyasevic, alexander.duyck

On Wed, 2 Dec 2020 12:28:47 +0100 Thomas Karlsson wrote:
> >> +		if (vlan->bc_queue_len_requested > max_bc_queue_len_requested)
> >> +			max_bc_queue_len_requested = vlan->bc_queue_len_requested;
> >> +	}
> >> +	port->bc_queue_len_used = max_bc_queue_len_requested;
> >> +}
> >> +
> >>  static int macvlan_device_event(struct notifier_block *unused,
> >>  				unsigned long event, void *ptr)
> >>  {  
> 
> I also noticed I got a few line length warnings in patchworks but none when I ran the ./scrips/checkpatch.pl
> locally. So is the net tree using strict 80 chars? I would prefer not to introduce extra line breaks
> on those lines as I think it will hurt readability but of course I will if needed.

I run checkpatch with --max-line-length=80, I think the 80 char
limitation is quite reasonable and leads to more readable code.

In your case I'd do s/bc_queue_len_requested/bc_queue_len_req/.
But it's up to you.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH net-next v4] macvlan: Support for high multicast packet rate
  2020-11-23 22:30   ` Jakub Kicinski
                       ` (2 preceding siblings ...)
  2020-11-30 14:23     ` [PATCH iproute2-next v1] iplink macvlan: Added bcqueuelen parameter Thomas Karlsson
@ 2020-12-02 18:49     ` Thomas Karlsson
  2020-12-03 16:20       ` Jakub Kicinski
  2020-12-14 10:42     ` [PATCH iproute2-next v2] iplink:macvlan: Added bcqueuelen parameter Thomas Karlsson
  4 siblings, 1 reply; 27+ messages in thread
From: Thomas Karlsson @ 2020-12-02 18:49 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Thomas Karlsson, netdev, davem, jiri, kaber, edumazet, vyasevic,
	alexander.duyck

Background:
Broadcast and multicast packages are enqueued for later processing.
This queue was previously hardcoded to 1000.

This proved insufficient for handling very high packet rates.
This resulted in packet drops for multicast.
While at the same time unicast worked fine.

The change:
This patch make the queue length adjustable to accommodate
for environments with very high multicast packet rate.
But still keeps the default value of 1000 unless specified.

The queue length is specified as a request per macvlan
using the IFLA_MACVLAN_BC_QUEUE_LEN parameter.

The actual used queue length will then be the maximum of
any macvlan connected to the same port. The actual used
queue length for the port can be retrieved (read only)
by the IFLA_MACVLAN_BC_QUEUE_LEN_USED parameter for verification.

This will be followed up by a patch to iproute2
in order to adjust the parameter from userspace.

Signed-off-by: Thomas Karlsson <thomas.karlsson@paneda.se>
---

v4 Updated after review (see interdiff for full details):
	- Initialize bc_queue_len_used to 0 when creating the port.
	- only change bc_queue_len_used from update_port_bc_queue_len()
	- Use NLA_REJECT for IFLA_MACVLAN_BC_QUEUE_LEN_USED and removed custom reject code
	- Use list_for_each_entry instead of list_for_each_entry_rcu
	- misc renaming/restructure to better match coding style
v3 switched to using netlink attributes
v1/2 used module_param

Interdiff against v3:
  diff --git a/drivers/net/macvlan.c b/drivers/net/macvlan.c
  index b8197761248e..fb51329f8964 100644
  --- a/drivers/net/macvlan.c
  +++ b/drivers/net/macvlan.c
  @@ -1220,7 +1220,7 @@ static int macvlan_port_create(struct net_device *dev)
   	for (i = 0; i < MACVLAN_HASH_SIZE; i++)
   		INIT_HLIST_HEAD(&port->vlan_source_hash[i]);
   
  -	port->bc_queue_len_used = MACVLAN_DEFAULT_BC_QUEUE_LEN;
  +	port->bc_queue_len_used = 0;
   	skb_queue_head_init(&port->bc_queue);
   	INIT_WORK(&port->bc_work, macvlan_process_broadcast);
   
  @@ -1489,11 +1489,9 @@ int macvlan_common_newlink(struct net *src_net, struct net_device *dev,
   			goto destroy_macvlan_port;
   	}
   
  -	vlan->bc_queue_len_requested = MACVLAN_DEFAULT_BC_QUEUE_LEN;
  +	vlan->bc_queue_len_req = MACVLAN_DEFAULT_BC_QUEUE_LEN;
   	if (data && data[IFLA_MACVLAN_BC_QUEUE_LEN])
  -		vlan->bc_queue_len_requested = nla_get_u32(data[IFLA_MACVLAN_BC_QUEUE_LEN]);
  -	if (vlan->bc_queue_len_requested > port->bc_queue_len_used)
  -		port->bc_queue_len_used = vlan->bc_queue_len_requested;
  +		vlan->bc_queue_len_req = nla_get_u32(data[IFLA_MACVLAN_BC_QUEUE_LEN]);
   
   	err = register_netdevice(dev);
   	if (err < 0)
  @@ -1505,6 +1503,7 @@ int macvlan_common_newlink(struct net *src_net, struct net_device *dev,
   		goto unregister_netdev;
   
   	list_add_tail_rcu(&vlan->list, &port->vlans);
  +	update_port_bc_queue_len(vlan->port);
   	netif_stacked_transfer_operstate(lowerdev, dev);
   	linkwatch_fire_event(dev);
   
  @@ -1583,11 +1582,8 @@ static int macvlan_changelink(struct net_device *dev,
   		vlan->flags = flags;
   	}
   
  -	if (data && data[IFLA_MACVLAN_BC_QUEUE_LEN_USED])
  -		return -EINVAL; /* Trying to set a read only attribute */
  -
   	if (data && data[IFLA_MACVLAN_BC_QUEUE_LEN]) {
  -		vlan->bc_queue_len_requested = nla_get_u32(data[IFLA_MACVLAN_BC_QUEUE_LEN]);
  +		vlan->bc_queue_len_req = nla_get_u32(data[IFLA_MACVLAN_BC_QUEUE_LEN]);
   		update_port_bc_queue_len(vlan->port);
   	}
   
  @@ -1667,7 +1663,7 @@ static int macvlan_fill_info(struct sk_buff *skb,
   		}
   		nla_nest_end(skb, nest);
   	}
  -	if (nla_put_u32(skb, IFLA_MACVLAN_BC_QUEUE_LEN, vlan->bc_queue_len_requested))
  +	if (nla_put_u32(skb, IFLA_MACVLAN_BC_QUEUE_LEN, vlan->bc_queue_len_req))
   		goto nla_put_failure;
   	if (nla_put_u32(skb, IFLA_MACVLAN_BC_QUEUE_LEN_USED, port->bc_queue_len_used))
   		goto nla_put_failure;
  @@ -1685,7 +1681,7 @@ static const struct nla_policy macvlan_policy[IFLA_MACVLAN_MAX + 1] = {
   	[IFLA_MACVLAN_MACADDR_DATA] = { .type = NLA_NESTED },
   	[IFLA_MACVLAN_MACADDR_COUNT] = { .type = NLA_U32 },
   	[IFLA_MACVLAN_BC_QUEUE_LEN] = { .type = NLA_U32 },
  -	[IFLA_MACVLAN_BC_QUEUE_LEN_USED] = { .type = NLA_U32 },
  +	[IFLA_MACVLAN_BC_QUEUE_LEN_USED] = { .type = NLA_REJECT },
   };
   
   int macvlan_link_register(struct rtnl_link_ops *ops)
  @@ -1718,14 +1714,14 @@ static struct rtnl_link_ops macvlan_link_ops = {
   
   static void update_port_bc_queue_len(struct macvlan_port *port)
   {
  +	u32 max_bc_queue_len_req = 0;
   	struct macvlan_dev *vlan;
  -	u32 max_bc_queue_len_requested = 0;
   
  -	list_for_each_entry_rcu(vlan, &port->vlans, list) {
  -		if (vlan->bc_queue_len_requested > max_bc_queue_len_requested)
  -			max_bc_queue_len_requested = vlan->bc_queue_len_requested;
  +	list_for_each_entry(vlan, &port->vlans, list) {
  +		if (vlan->bc_queue_len_req > max_bc_queue_len_req)
  +			max_bc_queue_len_req = vlan->bc_queue_len_req;
   	}
  -	port->bc_queue_len_used = max_bc_queue_len_requested;
  +	port->bc_queue_len_used = max_bc_queue_len_req;
   }
   
   static int macvlan_device_event(struct notifier_block *unused,
  diff --git a/include/linux/if_macvlan.h b/include/linux/if_macvlan.h
  index c3923fdbe1f0..96556c64c95d 100644
  --- a/include/linux/if_macvlan.h
  +++ b/include/linux/if_macvlan.h
  @@ -30,7 +30,7 @@ struct macvlan_dev {
   	enum macvlan_mode	mode;
   	u16			flags;
   	unsigned int		macaddr_count;
  -	u32			bc_queue_len_requested;
  +	u32			bc_queue_len_req;
   #ifdef CONFIG_NET_POLL_CONTROLLER
   	struct netpoll		*netpoll;
   #endif

 drivers/net/macvlan.c              | 40 ++++++++++++++++++++++++++++--
 include/linux/if_macvlan.h         |  1 +
 include/uapi/linux/if_link.h       |  2 ++
 tools/include/uapi/linux/if_link.h |  2 ++
 4 files changed, 43 insertions(+), 2 deletions(-)

diff --git a/drivers/net/macvlan.c b/drivers/net/macvlan.c
index d9b6c44a5911..fb51329f8964 100644
--- a/drivers/net/macvlan.c
+++ b/drivers/net/macvlan.c
@@ -35,7 +35,7 @@
 
 #define MACVLAN_HASH_BITS	8
 #define MACVLAN_HASH_SIZE	(1<<MACVLAN_HASH_BITS)
-#define MACVLAN_BC_QUEUE_LEN	1000
+#define MACVLAN_DEFAULT_BC_QUEUE_LEN	1000
 
 #define MACVLAN_F_PASSTHRU	1
 #define MACVLAN_F_ADDRCHANGE	2
@@ -46,6 +46,7 @@ struct macvlan_port {
 	struct list_head	vlans;
 	struct sk_buff_head	bc_queue;
 	struct work_struct	bc_work;
+	u32			bc_queue_len_used;
 	u32			flags;
 	int			count;
 	struct hlist_head	vlan_source_hash[MACVLAN_HASH_SIZE];
@@ -67,6 +68,7 @@ struct macvlan_skb_cb {
 #define MACVLAN_SKB_CB(__skb) ((struct macvlan_skb_cb *)&((__skb)->cb[0]))
 
 static void macvlan_port_destroy(struct net_device *dev);
+static void update_port_bc_queue_len(struct macvlan_port *port);
 
 static inline bool macvlan_passthru(const struct macvlan_port *port)
 {
@@ -354,7 +356,7 @@ static void macvlan_broadcast_enqueue(struct macvlan_port *port,
 	MACVLAN_SKB_CB(nskb)->src = src;
 
 	spin_lock(&port->bc_queue.lock);
-	if (skb_queue_len(&port->bc_queue) < MACVLAN_BC_QUEUE_LEN) {
+	if (skb_queue_len(&port->bc_queue) < port->bc_queue_len_used) {
 		if (src)
 			dev_hold(src->dev);
 		__skb_queue_tail(&port->bc_queue, nskb);
@@ -1218,6 +1220,7 @@ static int macvlan_port_create(struct net_device *dev)
 	for (i = 0; i < MACVLAN_HASH_SIZE; i++)
 		INIT_HLIST_HEAD(&port->vlan_source_hash[i]);
 
+	port->bc_queue_len_used = 0;
 	skb_queue_head_init(&port->bc_queue);
 	INIT_WORK(&port->bc_work, macvlan_process_broadcast);
 
@@ -1486,6 +1489,10 @@ int macvlan_common_newlink(struct net *src_net, struct net_device *dev,
 			goto destroy_macvlan_port;
 	}
 
+	vlan->bc_queue_len_req = MACVLAN_DEFAULT_BC_QUEUE_LEN;
+	if (data && data[IFLA_MACVLAN_BC_QUEUE_LEN])
+		vlan->bc_queue_len_req = nla_get_u32(data[IFLA_MACVLAN_BC_QUEUE_LEN]);
+
 	err = register_netdevice(dev);
 	if (err < 0)
 		goto destroy_macvlan_port;
@@ -1496,6 +1503,7 @@ int macvlan_common_newlink(struct net *src_net, struct net_device *dev,
 		goto unregister_netdev;
 
 	list_add_tail_rcu(&vlan->list, &port->vlans);
+	update_port_bc_queue_len(vlan->port);
 	netif_stacked_transfer_operstate(lowerdev, dev);
 	linkwatch_fire_event(dev);
 
@@ -1529,6 +1537,7 @@ void macvlan_dellink(struct net_device *dev, struct list_head *head)
 	if (vlan->mode == MACVLAN_MODE_SOURCE)
 		macvlan_flush_sources(vlan->port, vlan);
 	list_del_rcu(&vlan->list);
+	update_port_bc_queue_len(vlan->port);
 	unregister_netdevice_queue(dev, head);
 	netdev_upper_dev_unlink(vlan->lowerdev, dev);
 }
@@ -1572,6 +1581,12 @@ static int macvlan_changelink(struct net_device *dev,
 		}
 		vlan->flags = flags;
 	}
+
+	if (data && data[IFLA_MACVLAN_BC_QUEUE_LEN]) {
+		vlan->bc_queue_len_req = nla_get_u32(data[IFLA_MACVLAN_BC_QUEUE_LEN]);
+		update_port_bc_queue_len(vlan->port);
+	}
+
 	if (set_mode)
 		vlan->mode = mode;
 	if (data && data[IFLA_MACVLAN_MACADDR_MODE]) {
@@ -1602,6 +1617,8 @@ static size_t macvlan_get_size(const struct net_device *dev)
 		+ nla_total_size(2) /* IFLA_MACVLAN_FLAGS */
 		+ nla_total_size(4) /* IFLA_MACVLAN_MACADDR_COUNT */
 		+ macvlan_get_size_mac(vlan) /* IFLA_MACVLAN_MACADDR */
+		+ nla_total_size(4) /* IFLA_MACVLAN_BC_QUEUE_LEN */
+		+ nla_total_size(4) /* IFLA_MACVLAN_BC_QUEUE_LEN_USED */
 		);
 }
 
@@ -1625,6 +1642,7 @@ static int macvlan_fill_info(struct sk_buff *skb,
 				const struct net_device *dev)
 {
 	struct macvlan_dev *vlan = netdev_priv(dev);
+	struct macvlan_port *port = vlan->port;
 	int i;
 	struct nlattr *nest;
 
@@ -1645,6 +1663,10 @@ static int macvlan_fill_info(struct sk_buff *skb,
 		}
 		nla_nest_end(skb, nest);
 	}
+	if (nla_put_u32(skb, IFLA_MACVLAN_BC_QUEUE_LEN, vlan->bc_queue_len_req))
+		goto nla_put_failure;
+	if (nla_put_u32(skb, IFLA_MACVLAN_BC_QUEUE_LEN_USED, port->bc_queue_len_used))
+		goto nla_put_failure;
 	return 0;
 
 nla_put_failure:
@@ -1658,6 +1680,8 @@ static const struct nla_policy macvlan_policy[IFLA_MACVLAN_MAX + 1] = {
 	[IFLA_MACVLAN_MACADDR] = { .type = NLA_BINARY, .len = MAX_ADDR_LEN },
 	[IFLA_MACVLAN_MACADDR_DATA] = { .type = NLA_NESTED },
 	[IFLA_MACVLAN_MACADDR_COUNT] = { .type = NLA_U32 },
+	[IFLA_MACVLAN_BC_QUEUE_LEN] = { .type = NLA_U32 },
+	[IFLA_MACVLAN_BC_QUEUE_LEN_USED] = { .type = NLA_REJECT },
 };
 
 int macvlan_link_register(struct rtnl_link_ops *ops)
@@ -1688,6 +1712,18 @@ static struct rtnl_link_ops macvlan_link_ops = {
 	.priv_size      = sizeof(struct macvlan_dev),
 };
 
+static void update_port_bc_queue_len(struct macvlan_port *port)
+{
+	u32 max_bc_queue_len_req = 0;
+	struct macvlan_dev *vlan;
+
+	list_for_each_entry(vlan, &port->vlans, list) {
+		if (vlan->bc_queue_len_req > max_bc_queue_len_req)
+			max_bc_queue_len_req = vlan->bc_queue_len_req;
+	}
+	port->bc_queue_len_used = max_bc_queue_len_req;
+}
+
 static int macvlan_device_event(struct notifier_block *unused,
 				unsigned long event, void *ptr)
 {
diff --git a/include/linux/if_macvlan.h b/include/linux/if_macvlan.h
index a367ead4bf4b..96556c64c95d 100644
--- a/include/linux/if_macvlan.h
+++ b/include/linux/if_macvlan.h
@@ -30,6 +30,7 @@ struct macvlan_dev {
 	enum macvlan_mode	mode;
 	u16			flags;
 	unsigned int		macaddr_count;
+	u32			bc_queue_len_req;
 #ifdef CONFIG_NET_POLL_CONTROLLER
 	struct netpoll		*netpoll;
 #endif
diff --git a/include/uapi/linux/if_link.h b/include/uapi/linux/if_link.h
index c4b23f06f69e..874cc12a34d9 100644
--- a/include/uapi/linux/if_link.h
+++ b/include/uapi/linux/if_link.h
@@ -588,6 +588,8 @@ enum {
 	IFLA_MACVLAN_MACADDR,
 	IFLA_MACVLAN_MACADDR_DATA,
 	IFLA_MACVLAN_MACADDR_COUNT,
+	IFLA_MACVLAN_BC_QUEUE_LEN,
+	IFLA_MACVLAN_BC_QUEUE_LEN_USED,
 	__IFLA_MACVLAN_MAX,
 };
 
diff --git a/tools/include/uapi/linux/if_link.h b/tools/include/uapi/linux/if_link.h
index 781e482dc499..d208b2af697f 100644
--- a/tools/include/uapi/linux/if_link.h
+++ b/tools/include/uapi/linux/if_link.h
@@ -409,6 +409,8 @@ enum {
 	IFLA_MACVLAN_MACADDR,
 	IFLA_MACVLAN_MACADDR_DATA,
 	IFLA_MACVLAN_MACADDR_COUNT,
+	IFLA_MACVLAN_BC_QUEUE_LEN,
+	IFLA_MACVLAN_BC_QUEUE_LEN_USED,
 	__IFLA_MACVLAN_MAX,
 };
 
-- 
2.29.2

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [PATCH net-next v4] macvlan: Support for high multicast packet rate
  2020-12-02 18:49     ` [PATCH net-next v4] macvlan: Support for high multicast packet rate Thomas Karlsson
@ 2020-12-03 16:20       ` Jakub Kicinski
  2020-12-03 16:53         ` Thomas Karlsson
  0 siblings, 1 reply; 27+ messages in thread
From: Jakub Kicinski @ 2020-12-03 16:20 UTC (permalink / raw)
  To: Thomas Karlsson
  Cc: netdev, davem, jiri, kaber, edumazet, vyasevic, alexander.duyck

On Wed, 2 Dec 2020 19:49:58 +0100 Thomas Karlsson wrote:
> Background:
> Broadcast and multicast packages are enqueued for later processing.
> This queue was previously hardcoded to 1000.
> 
> This proved insufficient for handling very high packet rates.
> This resulted in packet drops for multicast.
> While at the same time unicast worked fine.
> 
> The change:
> This patch make the queue length adjustable to accommodate
> for environments with very high multicast packet rate.
> But still keeps the default value of 1000 unless specified.
> 
> The queue length is specified as a request per macvlan
> using the IFLA_MACVLAN_BC_QUEUE_LEN parameter.
> 
> The actual used queue length will then be the maximum of
> any macvlan connected to the same port. The actual used
> queue length for the port can be retrieved (read only)
> by the IFLA_MACVLAN_BC_QUEUE_LEN_USED parameter for verification.
> 
> This will be followed up by a patch to iproute2
> in order to adjust the parameter from userspace.
> 
> Signed-off-by: Thomas Karlsson <thomas.karlsson@paneda.se>

> @@ -1658,6 +1680,8 @@ static const struct nla_policy macvlan_policy[IFLA_MACVLAN_MAX + 1] = {
>  	[IFLA_MACVLAN_MACADDR] = { .type = NLA_BINARY, .len = MAX_ADDR_LEN },
>  	[IFLA_MACVLAN_MACADDR_DATA] = { .type = NLA_NESTED },
>  	[IFLA_MACVLAN_MACADDR_COUNT] = { .type = NLA_U32 },
> +	[IFLA_MACVLAN_BC_QUEUE_LEN] = { .type = NLA_U32 },

I wonder whether we should require that the queue_len is > 0? Is there 
a valid use case for the queue to be completely disabled? If you agree
please follow up with a simple patch which adds a NLA_POLICY_MIN() here.

Applied to net-next, thank you!

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH net-next v4] macvlan: Support for high multicast packet rate
  2020-12-03 16:20       ` Jakub Kicinski
@ 2020-12-03 16:53         ` Thomas Karlsson
  0 siblings, 0 replies; 27+ messages in thread
From: Thomas Karlsson @ 2020-12-03 16:53 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: netdev, davem, jiri, kaber, edumazet, vyasevic, alexander.duyck

> On Wed, 2 Dec 2020 19:49:58 +0100 Thomas Karlsson wrote:
>> Background:
>> Broadcast and multicast packages are enqueued for later processing.
>> This queue was previously hardcoded to 1000.
>>
>> This proved insufficient for handling very high packet rates.
>> This resulted in packet drops for multicast.
>> While at the same time unicast worked fine.
>>
>> The change:
>> This patch make the queue length adjustable to accommodate
>> for environments with very high multicast packet rate.
>> But still keeps the default value of 1000 unless specified.
>>
>> The queue length is specified as a request per macvlan
>> using the IFLA_MACVLAN_BC_QUEUE_LEN parameter.
>>
>> The actual used queue length will then be the maximum of
>> any macvlan connected to the same port. The actual used
>> queue length for the port can be retrieved (read only)
>> by the IFLA_MACVLAN_BC_QUEUE_LEN_USED parameter for verification.
>>
>> This will be followed up by a patch to iproute2
>> in order to adjust the parameter from userspace.
>>
>> Signed-off-by: Thomas Karlsson <thomas.karlsson@paneda.se>
> 
>> @@ -1658,6 +1680,8 @@ static const struct nla_policy macvlan_policy[IFLA_MACVLAN_MAX + 1] = {
>>  	[IFLA_MACVLAN_MACADDR] = { .type = NLA_BINARY, .len = MAX_ADDR_LEN },
>>  	[IFLA_MACVLAN_MACADDR_DATA] = { .type = NLA_NESTED },
>>  	[IFLA_MACVLAN_MACADDR_COUNT] = { .type = NLA_U32 },
>> +	[IFLA_MACVLAN_BC_QUEUE_LEN] = { .type = NLA_U32 },
> 
> I wonder whether we should require that the queue_len is > 0? Is there 
> a valid use case for the queue to be completely disabled? If you agree
> please follow up with a simple patch which adds a NLA_POLICY_MIN() here.

Well, I did consider if I should put in a limit like that but then came
to the conclusion that is probably wise not to make any assumption on everyone else's
use cases. I can't really think of a reason why you want to disable the queue completely.
However, I think it still makes sense to allow that in case someone somewhere in the future
does find that useful.

> Applied to net-next, thank you!
> 

Awesome, thanks!

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH iproute2-next v1] iplink macvlan: Added bcqueuelen parameter
  2020-11-30 14:23     ` [PATCH iproute2-next v1] iplink macvlan: Added bcqueuelen parameter Thomas Karlsson
@ 2020-12-10 16:07       ` Thomas Karlsson
  2020-12-11  0:36         ` Jakub Kicinski
  0 siblings, 1 reply; 27+ messages in thread
From: Thomas Karlsson @ 2020-12-10 16:07 UTC (permalink / raw)
  To: Jakub Kicinski, stephen
  Cc: davem, netdev, jiri, kaber, edumazet, vyasevic, alexander.duyck

On 2020-11-30 15:23, Thomas Karlsson wrote:
> This is a follow up patch to iproute2 that allows the user
> to set and retrieve the added IFLA_MACVLAN_BC_QUEUE_LEN parameter
> via the bcqueuelen command line argument
> 
> 
> v1 Initial version
>    Note: This patch first requires that the corresponding
>    kernel patch in 0c88607c-1b63-e8b5-8a84-14b63e55e8e2@paneda.se
>    to macvlan is merged to be usable.

Just to follow up so this one isn't forgotten. The macvlan patch was merged
into net-next a week ago, commit d4bff72c8401e6f56194ecf455db70ebc22929e2

So this patch should be ready for review/incusion I think.

(Only sending this message since I noticed the patch was archived in patchworks).

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH iproute2-next v1] iplink macvlan: Added bcqueuelen parameter
  2020-12-10 16:07       ` Thomas Karlsson
@ 2020-12-11  0:36         ` Jakub Kicinski
  0 siblings, 0 replies; 27+ messages in thread
From: Jakub Kicinski @ 2020-12-11  0:36 UTC (permalink / raw)
  To: Thomas Karlsson
  Cc: stephen, davem, netdev, jiri, kaber, edumazet, vyasevic, alexander.duyck

On Thu, 10 Dec 2020 17:07:51 +0100 Thomas Karlsson wrote:
> On 2020-11-30 15:23, Thomas Karlsson wrote:
> > This is a follow up patch to iproute2 that allows the user
> > to set and retrieve the added IFLA_MACVLAN_BC_QUEUE_LEN parameter
> > via the bcqueuelen command line argument
> > 
> > 
> > v1 Initial version
> >    Note: This patch first requires that the corresponding
> >    kernel patch in 0c88607c-1b63-e8b5-8a84-14b63e55e8e2@paneda.se
> >    to macvlan is merged to be usable.  
> 
> Just to follow up so this one isn't forgotten. The macvlan patch was merged
> into net-next a week ago, commit d4bff72c8401e6f56194ecf455db70ebc22929e2
> 
> So this patch should be ready for review/incusion I think.
> 
> (Only sending this message since I noticed the patch was archived in patchworks).

Best if you repost it if it got lost in PW. Please make sure to CC David
Ahern who maintainers iproute2-next.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH iproute2-next v2] iplink:macvlan: Added bcqueuelen parameter
  2020-11-23 22:30   ` Jakub Kicinski
                       ` (3 preceding siblings ...)
  2020-12-02 18:49     ` [PATCH net-next v4] macvlan: Support for high multicast packet rate Thomas Karlsson
@ 2020-12-14 10:42     ` Thomas Karlsson
  2020-12-14 17:03       ` David Ahern
  2020-12-16  4:07       ` David Ahern
  4 siblings, 2 replies; 27+ messages in thread
From: Thomas Karlsson @ 2020-12-14 10:42 UTC (permalink / raw)
  To: netdev, dsahern; +Cc: davem, Jakub Kicinski, stephen, kuznet

This patch allows the user to set and retrieve the
IFLA_MACVLAN_BC_QUEUE_LEN parameter via the bcqueuelen
command line argument

This parameter controls the requested size of the queue for
broadcast and multicast packages in the macvlan driver.

If not specified, the driver default (1000) will be used.

Note: The request is per macvlan but the actually used queue
length per port is the maximum of any request to any macvlan
connected to the same port.

For this reason, the used queue length IFLA_MACVLAN_BC_QUEUE_LEN_USED
is also retrieved and displayed in order to aid in the understanding
of the setting. However, it can of course not be directly set.

Signed-off-by: Thomas Karlsson <thomas.karlsson@paneda.se>
---

Note: This patch controls the parameter added in net-next
with commit d4bff72c8401e6f56194ecf455db70ebc22929e2

v2 Rebased on origin/main
v1 Initial version

 ip/iplink_macvlan.c   | 33 +++++++++++++++++++++++++++++++--
 man/man8/ip-link.8.in | 33 +++++++++++++++++++++++++++++++++
 2 files changed, 64 insertions(+), 2 deletions(-)

diff --git a/ip/iplink_macvlan.c b/ip/iplink_macvlan.c
index b966a615..302a3748 100644
--- a/ip/iplink_macvlan.c
+++ b/ip/iplink_macvlan.c
@@ -30,12 +30,13 @@
 static void print_explain(struct link_util *lu, FILE *f)
 {
 	fprintf(f,
-		"Usage: ... %s mode MODE [flag MODE_FLAG] MODE_OPTS\n"
+		"Usage: ... %s mode MODE [flag MODE_FLAG] MODE_OPTS [bcqueuelen BC_QUEUE_LEN]\n"
 		"\n"
 		"MODE: private | vepa | bridge | passthru | source\n"
 		"MODE_FLAG: null | nopromisc\n"
 		"MODE_OPTS: for mode \"source\":\n"
-		"\tmacaddr { { add | del } <macaddr> | set [ <macaddr> [ <macaddr>  ... ] ] | flush }\n",
+		"\tmacaddr { { add | del } <macaddr> | set [ <macaddr> [ <macaddr>  ... ] ] | flush }\n"
+		"BC_QUEUE_LEN: Length of the rx queue for broadcast/multicast: [0-4294967295]\n",
 		lu->id
 	);
 }
@@ -62,6 +63,14 @@ static int flag_arg(const char *arg)
 	return -1;
 }
 
+static int bc_queue_len_arg(const char *arg)
+{
+	fprintf(stderr,
+		"Error: argument of \"bcqueuelen\" must be a positive integer [0-4294967295], not \"%s\"\n",
+		arg);
+	return -1;
+}
+
 static int macvlan_parse_opt(struct link_util *lu, int argc, char **argv,
 			  struct nlmsghdr *n)
 {
@@ -150,6 +159,14 @@ static int macvlan_parse_opt(struct link_util *lu, int argc, char **argv,
 		} else if (matches(*argv, "nopromisc") == 0) {
 			flags |= MACVLAN_FLAG_NOPROMISC;
 			has_flags = 1;
+		} else if (matches(*argv, "bcqueuelen") == 0) {
+			__u32 bc_queue_len;
+			NEXT_ARG();
+			
+			if (get_u32(&bc_queue_len, *argv, 0)) {
+				return bc_queue_len_arg(*argv);
+			}
+			addattr32(n, 1024, IFLA_MACVLAN_BC_QUEUE_LEN, bc_queue_len);
 		} else if (matches(*argv, "help") == 0) {
 			explain(lu);
 			return -1;
@@ -212,6 +229,18 @@ static void macvlan_print_opt(struct link_util *lu, FILE *f, struct rtattr *tb[]
 	if (flags & MACVLAN_FLAG_NOPROMISC)
 		print_bool(PRINT_ANY, "nopromisc", "nopromisc ", true);
 
+	if (tb[IFLA_MACVLAN_BC_QUEUE_LEN] &&
+		RTA_PAYLOAD(tb[IFLA_MACVLAN_BC_QUEUE_LEN]) >= sizeof(__u32)) {
+		__u32 bc_queue_len = rta_getattr_u32(tb[IFLA_MACVLAN_BC_QUEUE_LEN]);
+		print_luint(PRINT_ANY, "bcqueuelen", "bcqueuelen %lu ", bc_queue_len);
+	}
+
+	if (tb[IFLA_MACVLAN_BC_QUEUE_LEN_USED] &&
+		RTA_PAYLOAD(tb[IFLA_MACVLAN_BC_QUEUE_LEN_USED]) >= sizeof(__u32)) {
+		__u32 bc_queue_len = rta_getattr_u32(tb[IFLA_MACVLAN_BC_QUEUE_LEN_USED]);
+		print_luint(PRINT_ANY, "usedbcqueuelen", "usedbcqueuelen %lu ", bc_queue_len);
+	}
+
 	/* in source mode, there are more options to print */
 
 	if (mode != MACVLAN_MODE_SOURCE)
diff --git a/man/man8/ip-link.8.in b/man/man8/ip-link.8.in
index 1ff01744..3516765a 100644
--- a/man/man8/ip-link.8.in
+++ b/man/man8/ip-link.8.in
@@ -1352,6 +1352,7 @@ the following additional arguments are supported:
 .BR type " { " macvlan " | " macvtap " } "
 .BR mode " { " private " | " vepa " | " bridge " | " passthru
 .RB " [ " nopromisc " ] | " source " } "
+.RB " [ " bcqueuelen " { " LENGTH " } ] "
 
 .in +8
 .sp
@@ -1395,6 +1396,18 @@ against source mac address from received frames on underlying interface. This
 allows creating mac based VLAN associations, instead of standard port or tag
 based. The feature is useful to deploy 802.1x mac based behavior,
 where drivers of underlying interfaces doesn't allows that.
+
+.BR bcqueuelen " { " LENGTH " } "
+- Set the length of the RX queue used to process broadcast and multicast packets.
+.BR LENGTH " must be a positive integer in the range [0-4294967295]."
+Setting a length of 0 will effectively drop all broadcast/multicast traffic.
+If not specified the macvlan driver default (1000) is used.
+Note that all macvlans that share the same underlying device are using the same
+.RB "queue. The parameter here is a " request ", the actual queue length used"
+will be the maximum length that any macvlan interface has requested.
+When listing device parameters both the bcqueuelen parameter
+as well as the actual used bcqueuelen are listed to better help
+the user understand the setting.
 .in -8
 
 .TP
@@ -2451,6 +2464,26 @@ Commands:
 .sp
 .in -8
 
+Update the broadcast/multicast queue length.
+
+.B "ip link set type { macvlan | macvap } "
+[
+.BI bcqueuelen "  LENGTH  "
+]
+
+.in +8
+.BI bcqueuelen " LENGTH "
+- Set the length of the RX queue used to process broadcast and multicast packets.
+.IR LENGTH " must be a positive integer in the range [0-4294967295]."
+Setting a length of 0 will effectively drop all broadcast/multicast traffic.
+If not specified the macvlan driver default (1000) is used.
+Note that all macvlans that share the same underlying device are using the same
+.RB "queue. The parameter here is a " request ", the actual queue length used"
+will be the maximum length that any macvlan interface has requested.
+When listing device parameters both the bcqueuelen parameter
+as well as the actual used bcqueuelen are listed to better help
+the user understand the setting.
+.in -8
 
 .SS  ip link show - display device attributes
 
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [PATCH iproute2-next v2] iplink:macvlan: Added bcqueuelen parameter
  2020-12-14 10:42     ` [PATCH iproute2-next v2] iplink:macvlan: Added bcqueuelen parameter Thomas Karlsson
@ 2020-12-14 17:03       ` David Ahern
  2020-12-14 17:50         ` Thomas Karlsson
  2020-12-16  4:07       ` David Ahern
  1 sibling, 1 reply; 27+ messages in thread
From: David Ahern @ 2020-12-14 17:03 UTC (permalink / raw)
  To: Thomas Karlsson, netdev, dsahern; +Cc: davem, Jakub Kicinski, stephen, kuznet

On 12/14/20 3:42 AM, Thomas Karlsson wrote:
> This patch allows the user to set and retrieve the
> IFLA_MACVLAN_BC_QUEUE_LEN parameter via the bcqueuelen
> command line argument
> 
> This parameter controls the requested size of the queue for
> broadcast and multicast packages in the macvlan driver.
> 
> If not specified, the driver default (1000) will be used.
> 
> Note: The request is per macvlan but the actually used queue
> length per port is the maximum of any request to any macvlan
> connected to the same port.
> 
> For this reason, the used queue length IFLA_MACVLAN_BC_QUEUE_LEN_USED
> is also retrieved and displayed in order to aid in the understanding
> of the setting. However, it can of course not be directly set.
> 
> Signed-off-by: Thomas Karlsson <thomas.karlsson@paneda.se>
> ---
> 
> Note: This patch controls the parameter added in net-next
> with commit d4bff72c8401e6f56194ecf455db70ebc22929e2
> 
> v2 Rebased on origin/main
> v1 Initial version
> 
>  ip/iplink_macvlan.c   | 33 +++++++++++++++++++++++++++++++--
>  man/man8/ip-link.8.in | 33 +++++++++++++++++++++++++++++++++
>  2 files changed, 64 insertions(+), 2 deletions(-)
> 
> diff --git a/ip/iplink_macvlan.c b/ip/iplink_macvlan.c
> index b966a615..302a3748 100644
> --- a/ip/iplink_macvlan.c
> +++ b/ip/iplink_macvlan.c
> @@ -30,12 +30,13 @@
>  static void print_explain(struct link_util *lu, FILE *f)
>  {
>  	fprintf(f,
> -		"Usage: ... %s mode MODE [flag MODE_FLAG] MODE_OPTS\n"
> +		"Usage: ... %s mode MODE [flag MODE_FLAG] MODE_OPTS [bcqueuelen BC_QUEUE_LEN]\n"
>  		"\n"
>  		"MODE: private | vepa | bridge | passthru | source\n"
>  		"MODE_FLAG: null | nopromisc\n"
>  		"MODE_OPTS: for mode \"source\":\n"
> -		"\tmacaddr { { add | del } <macaddr> | set [ <macaddr> [ <macaddr>  ... ] ] | flush }\n",
> +		"\tmacaddr { { add | del } <macaddr> | set [ <macaddr> [ <macaddr>  ... ] ] | flush }\n"
> +		"BC_QUEUE_LEN: Length of the rx queue for broadcast/multicast: [0-4294967295]\n",

Are we really allowing a BC queue up to 4G? seems a bit much. is a u16
and 64k not more than sufficient?


>  		lu->id
>  	);
>  }
> @@ -62,6 +63,14 @@ static int flag_arg(const char *arg)
>  	return -1;
>  }
>  
> +static int bc_queue_len_arg(const char *arg)
> +{
> +	fprintf(stderr,
> +		"Error: argument of \"bcqueuelen\" must be a positive integer [0-4294967295], not \"%s\"\n",
> +		arg);
> +	return -1;
> +}
> +
>  static int macvlan_parse_opt(struct link_util *lu, int argc, char **argv,
>  			  struct nlmsghdr *n)
>  {
> @@ -150,6 +159,14 @@ static int macvlan_parse_opt(struct link_util *lu, int argc, char **argv,
>  		} else if (matches(*argv, "nopromisc") == 0) {
>  			flags |= MACVLAN_FLAG_NOPROMISC;
>  			has_flags = 1;
> +		} else if (matches(*argv, "bcqueuelen") == 0) {
> +			__u32 bc_queue_len;
> +			NEXT_ARG();
> +			
> +			if (get_u32(&bc_queue_len, *argv, 0)) {
> +				return bc_queue_len_arg(*argv);
> +			}
> +			addattr32(n, 1024, IFLA_MACVLAN_BC_QUEUE_LEN, bc_queue_len);
>  		} else if (matches(*argv, "help") == 0) {
>  			explain(lu);
>  			return -1;
> @@ -212,6 +229,18 @@ static void macvlan_print_opt(struct link_util *lu, FILE *f, struct rtattr *tb[]
>  	if (flags & MACVLAN_FLAG_NOPROMISC)
>  		print_bool(PRINT_ANY, "nopromisc", "nopromisc ", true);
>  
> +	if (tb[IFLA_MACVLAN_BC_QUEUE_LEN] &&
> +		RTA_PAYLOAD(tb[IFLA_MACVLAN_BC_QUEUE_LEN]) >= sizeof(__u32)) {
> +		__u32 bc_queue_len = rta_getattr_u32(tb[IFLA_MACVLAN_BC_QUEUE_LEN]);
> +		print_luint(PRINT_ANY, "bcqueuelen", "bcqueuelen %lu ", bc_queue_len);
> +	}
> +
> +	if (tb[IFLA_MACVLAN_BC_QUEUE_LEN_USED] &&
> +		RTA_PAYLOAD(tb[IFLA_MACVLAN_BC_QUEUE_LEN_USED]) >= sizeof(__u32)) {
> +		__u32 bc_queue_len = rta_getattr_u32(tb[IFLA_MACVLAN_BC_QUEUE_LEN_USED]);
> +		print_luint(PRINT_ANY, "usedbcqueuelen", "usedbcqueuelen %lu ", bc_queue_len);
> +	}
> +
>  	/* in source mode, there are more options to print */
>  
>  	if (mode != MACVLAN_MODE_SOURCE)
> diff --git a/man/man8/ip-link.8.in b/man/man8/ip-link.8.in
> index 1ff01744..3516765a 100644
> --- a/man/man8/ip-link.8.in
> +++ b/man/man8/ip-link.8.in
> @@ -1352,6 +1352,7 @@ the following additional arguments are supported:
>  .BR type " { " macvlan " | " macvtap " } "
>  .BR mode " { " private " | " vepa " | " bridge " | " passthru
>  .RB " [ " nopromisc " ] | " source " } "
> +.RB " [ " bcqueuelen " { " LENGTH " } ] "
>  
>  .in +8
>  .sp
> @@ -1395,6 +1396,18 @@ against source mac address from received frames on underlying interface. This
>  allows creating mac based VLAN associations, instead of standard port or tag
>  based. The feature is useful to deploy 802.1x mac based behavior,
>  where drivers of underlying interfaces doesn't allows that.
> +
> +.BR bcqueuelen " { " LENGTH " } "
> +- Set the length of the RX queue used to process broadcast and multicast packets.
> +.BR LENGTH " must be a positive integer in the range [0-4294967295]."
> +Setting a length of 0 will effectively drop all broadcast/multicast traffic.
> +If not specified the macvlan driver default (1000) is used.
> +Note that all macvlans that share the same underlying device are using the same
> +.RB "queue. The parameter here is a " request ", the actual queue length used"
> +will be the maximum length that any macvlan interface has requested.
> +When listing device parameters both the bcqueuelen parameter
> +as well as the actual used bcqueuelen are listed to better help
> +the user understand the setting.
>  .in -8
>  
>  .TP
> @@ -2451,6 +2464,26 @@ Commands:
>  .sp
>  .in -8
>  
> +Update the broadcast/multicast queue length.
> +
> +.B "ip link set type { macvlan | macvap } "
> +[
> +.BI bcqueuelen "  LENGTH  "
> +]
> +
> +.in +8
> +.BI bcqueuelen " LENGTH "
> +- Set the length of the RX queue used to process broadcast and multicast packets.
> +.IR LENGTH " must be a positive integer in the range [0-4294967295]."
> +Setting a length of 0 will effectively drop all broadcast/multicast traffic.
> +If not specified the macvlan driver default (1000) is used.
> +Note that all macvlans that share the same underlying device are using the same
> +.RB "queue. The parameter here is a " request ", the actual queue length used"
> +will be the maximum length that any macvlan interface has requested.
> +When listing device parameters both the bcqueuelen parameter
> +as well as the actual used bcqueuelen are listed to better help
> +the user understand the setting.
> +.in -8
>  
>  .SS  ip link show - display device attributes
>  
> 


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH iproute2-next v2] iplink:macvlan: Added bcqueuelen parameter
  2020-12-14 17:03       ` David Ahern
@ 2020-12-14 17:50         ` Thomas Karlsson
  0 siblings, 0 replies; 27+ messages in thread
From: Thomas Karlsson @ 2020-12-14 17:50 UTC (permalink / raw)
  To: David Ahern, netdev; +Cc: davem, Jakub Kicinski, stephen, kuznet

On 2020-12-14 18:03, David Ahern wrote:
> On 12/14/20 3:42 AM, Thomas Karlsson wrote:
>> This patch allows the user to set and retrieve the
>> IFLA_MACVLAN_BC_QUEUE_LEN parameter via the bcqueuelen
>> command line argument
>>
>> This parameter controls the requested size of the queue for
>> broadcast and multicast packages in the macvlan driver.
>>
>> If not specified, the driver default (1000) will be used.
>>
>> Note: The request is per macvlan but the actually used queue
>> length per port is the maximum of any request to any macvlan
>> connected to the same port.
>>
>> For this reason, the used queue length IFLA_MACVLAN_BC_QUEUE_LEN_USED
>> is also retrieved and displayed in order to aid in the understanding
>> of the setting. However, it can of course not be directly set.
>>
>> Signed-off-by: Thomas Karlsson <thomas.karlsson@paneda.se>
>> ---
>>
>> Note: This patch controls the parameter added in net-next
>> with commit d4bff72c8401e6f56194ecf455db70ebc22929e2
>>
>> v2 Rebased on origin/main
>> v1 Initial version
>>
>>  ip/iplink_macvlan.c   | 33 +++++++++++++++++++++++++++++++--
>>  man/man8/ip-link.8.in | 33 +++++++++++++++++++++++++++++++++
>>  2 files changed, 64 insertions(+), 2 deletions(-)
>>
>> diff --git a/ip/iplink_macvlan.c b/ip/iplink_macvlan.c
>> index b966a615..302a3748 100644
>> --- a/ip/iplink_macvlan.c
>> +++ b/ip/iplink_macvlan.c
>> @@ -30,12 +30,13 @@
>>  static void print_explain(struct link_util *lu, FILE *f)
>>  {
>>  	fprintf(f,
>> -		"Usage: ... %s mode MODE [flag MODE_FLAG] MODE_OPTS\n"
>> +		"Usage: ... %s mode MODE [flag MODE_FLAG] MODE_OPTS [bcqueuelen BC_QUEUE_LEN]\n"
>>  		"\n"
>>  		"MODE: private | vepa | bridge | passthru | source\n"
>>  		"MODE_FLAG: null | nopromisc\n"
>>  		"MODE_OPTS: for mode \"source\":\n"
>> -		"\tmacaddr { { add | del } <macaddr> | set [ <macaddr> [ <macaddr>  ... ] ] | flush }\n",
>> +		"\tmacaddr { { add | del } <macaddr> | set [ <macaddr> [ <macaddr>  ... ] ] | flush }\n"
>> +		"BC_QUEUE_LEN: Length of the rx queue for broadcast/multicast: [0-4294967295]\n",
> 
> Are we really allowing a BC queue up to 4G? seems a bit much. is a u16
> and 64k not more than sufficient?
> 
> 

No, 64k is not sufficient when you have very high packet rate and very small packages.
I can easily see myself retrieving more than 300 kpps and 65k is then only just around
200 ms of buffer head-room, which I don't consider much saftey margin at all, especially if
the incoming data is not perfectly spaced out but rather comes in bursts.
If you start adding 10Gbps cards in the mix you could get 10 times that and the buffer
would be down to only 20ms. And we would get back to the situation where unicast works
fine but multicast does not.

So for sure a larger max than 64k is needed.

The reason I didn't add an upper limit beside the u32 was that I didn't want to pick an
arbitrary limit that works for me now but maybe not support someone elses use case later.

I'm now looking at net.core.netdev_max_backlog for inspiration. Which, at least on my system,
seems to have a limit of 2147483647 (signed 32 bit int). So perhaps this setting could be
aligned with that number since the settings are sort of related instead of using the full range
if you prefer that?

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH iproute2-next v2] iplink:macvlan: Added bcqueuelen parameter
  2020-12-14 10:42     ` [PATCH iproute2-next v2] iplink:macvlan: Added bcqueuelen parameter Thomas Karlsson
  2020-12-14 17:03       ` David Ahern
@ 2020-12-16  4:07       ` David Ahern
  1 sibling, 0 replies; 27+ messages in thread
From: David Ahern @ 2020-12-16  4:07 UTC (permalink / raw)
  To: Thomas Karlsson, netdev, dsahern; +Cc: davem, Jakub Kicinski, stephen, kuznet

On 12/14/20 3:42 AM, Thomas Karlsson wrote:
> This patch allows the user to set and retrieve the
> IFLA_MACVLAN_BC_QUEUE_LEN parameter via the bcqueuelen
> command line argument
> 
> This parameter controls the requested size of the queue for
> broadcast and multicast packages in the macvlan driver.
> 
> If not specified, the driver default (1000) will be used.
> 
> Note: The request is per macvlan but the actually used queue
> length per port is the maximum of any request to any macvlan
> connected to the same port.
> 
> For this reason, the used queue length IFLA_MACVLAN_BC_QUEUE_LEN_USED
> is also retrieved and displayed in order to aid in the understanding
> of the setting. However, it can of course not be directly set.
> 
> Signed-off-by: Thomas Karlsson <thomas.karlsson@paneda.se>
> ---
> 
> Note: This patch controls the parameter added in net-next
> with commit d4bff72c8401e6f56194ecf455db70ebc22929e2
> 
> v2 Rebased on origin/main
> v1 Initial version
> 
>  ip/iplink_macvlan.c   | 33 +++++++++++++++++++++++++++++++--
>  man/man8/ip-link.8.in | 33 +++++++++++++++++++++++++++++++++
>  2 files changed, 64 insertions(+), 2 deletions(-)
> 

applied to iproute2-next.

^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2020-12-16  4:09 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <485531aec7e243659ee4e3bb7fa2186d@paneda.se>
2020-11-23 14:22 ` Hardcoded multicast queue length in macvlan.c driver causes poor multicast receive performance Thomas Karlsson
2020-11-23 22:30   ` Jakub Kicinski
2020-11-25 12:51     ` Thomas Karlsson
2020-11-25 16:57       ` [PATCH] macvlan: Support for high multicast packet rate Thomas Karlsson
2020-11-25 21:55         ` [PATCH net-next v2] " Thomas Karlsson
2020-11-25 16:58       ` Hardcoded multicast queue length in macvlan.c driver causes poor multicast receive performance Jakub Kicinski
2020-11-25 17:12         ` Thomas Karlsson
2020-11-25 18:07           ` Jakub Kicinski
2020-11-25 22:15             ` Thomas Karlsson
2020-11-25 23:01               ` Jakub Kicinski
2020-11-26 20:00                 ` Thomas Karlsson
2020-11-27 17:27                   ` Jakub Kicinski
2020-11-27 23:13                     ` Thomas Karlsson
2020-11-30 14:00     ` [PATCH net-next v3] macvlan: Support for high multicast packet rate Thomas Karlsson
2020-12-01 19:11       ` Jakub Kicinski
2020-12-02 11:28         ` Thomas Karlsson
2020-12-02 16:44           ` Jakub Kicinski
2020-11-30 14:23     ` [PATCH iproute2-next v1] iplink macvlan: Added bcqueuelen parameter Thomas Karlsson
2020-12-10 16:07       ` Thomas Karlsson
2020-12-11  0:36         ` Jakub Kicinski
2020-12-02 18:49     ` [PATCH net-next v4] macvlan: Support for high multicast packet rate Thomas Karlsson
2020-12-03 16:20       ` Jakub Kicinski
2020-12-03 16:53         ` Thomas Karlsson
2020-12-14 10:42     ` [PATCH iproute2-next v2] iplink:macvlan: Added bcqueuelen parameter Thomas Karlsson
2020-12-14 17:03       ` David Ahern
2020-12-14 17:50         ` Thomas Karlsson
2020-12-16  4:07       ` David Ahern

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.