* switchdev and VLAN ranges @ 2015-10-09 23:30 Vivien Didelot 2015-10-10 4:22 ` Scott Feldman 2015-10-10 7:49 ` Elad Raz 0 siblings, 2 replies; 34+ messages in thread From: Vivien Didelot @ 2015-10-09 23:30 UTC (permalink / raw) To: Jiri Pirko, Scott Feldman; +Cc: netdev, stephen, Florian Fainelli, Andrew Lunn Hi All, I understand that specifying a VLAN range on the command line is nice for the user, and it makes no big deal for software implementation. However, AFAICT a VLAN range does not make sense at all for hardware such as Ethernet switch chips. Am I wrong? I would suggest to make switchdev directly answer to a bridge request that the operation is not supported when the user asks for a VLAN range. That way, we can simply use a single "vid" member in struct switchdev_obj_port_vlan instead of "vid_begin" and "vid_end" and thus avoid making drivers heavier with iteration loops on such range. I have two concerns in mind: a) if we imagine that drivers like Rocker allocate memory in the prepare phase for each VID, preparing a range like 100-4000 would definitely not be recommended. b) imagine that you have two Linux bridges on a switch, one using the hardware VLAN 100. If you request the VLAN range 99-101 for the other bridge members, it is not possible for the driver to say "I can accelerate VLAN 99 and 101, but not 100". It must return OPNOTSUPP for the whole range. That's why I think that avoiding VLAN range at the switchdev level would be a good idea. Thanks, -v ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: switchdev and VLAN ranges 2015-10-09 23:30 switchdev and VLAN ranges Vivien Didelot @ 2015-10-10 4:22 ` Scott Feldman 2015-10-10 16:33 ` Vivien Didelot 2015-10-10 7:49 ` Elad Raz 1 sibling, 1 reply; 34+ messages in thread From: Scott Feldman @ 2015-10-10 4:22 UTC (permalink / raw) To: Vivien Didelot Cc: Jiri Pirko, netdev, stephen, Florian Fainelli, Andrew Lunn, Roopa Prabhu On Fri, Oct 9, 2015 at 4:30 PM, Vivien Didelot <vivien.didelot@savoirfairelinux.com> wrote: > Hi All, > > I understand that specifying a VLAN range on the command line is nice > for the user, and it makes no big deal for software implementation. [Adding Roopa, since she did the original vlan range support in the kernel/iproute2] > However, AFAICT a VLAN range does not make sense at all for hardware > such as Ethernet switch chips. Am I wrong? > > I would suggest to make switchdev directly answer to a bridge request > that the operation is not supported when the user asks for a VLAN range. > > That way, we can simply use a single "vid" member in struct > switchdev_obj_port_vlan instead of "vid_begin" and "vid_end" and thus > avoid making drivers heavier with iteration loops on such range. > > I have two concerns in mind: > > a) if we imagine that drivers like Rocker allocate memory in the prepare > phase for each VID, preparing a range like 100-4000 would definitely not > be recommended. This call should be in process context so it doesn't seem to terrible for the driver to take its time to reserve/allocate resources in prepare phase, even for a vlan range. I think I'm missing your point. > b) imagine that you have two Linux bridges on a switch, one using the > hardware VLAN 100. If you request the VLAN range 99-101 for the other > bridge members, it is not possible for the driver to say "I can > accelerate VLAN 99 and 101, but not 100". It must return OPNOTSUPP for > the whole range. Well, it probably should return -ERANGE to indicate the range can't be added, but that's an aside. The reason why vlan ranges need to work down to the switchdev driver is, from the user's perspective, it's an all-or-nothing request from the user to add the vlan range to the device. So we need to ask the driver in the prepare phase, "can you support this range, completely?", and if yes, then commit it as a whole. The netlink response back to the user isn't equipped to describe what subset of the range was added, and what subset was not. > That's why I think that avoiding VLAN range at the switchdev level would > be a good idea. As a general rule with switchdev, we've tried to keep the user's experience the same when using {Linux} as a soft switch/router vs. using {Linux + offload device} as a hard switch/router. So if native Linux supports some operation, for example vlan ranges, then we should try to extend that to the offload model. In other words, we don't want to re-train the user when moving from soft switch to hard switch! But there are physical limitations when dealing with an offload device.... Anyway, with your vlan range example, we've got a case where each soft bridge has an independent vlan set, and the vlan sets between soft bridges can overlap. For the (typical) hard switch, there is one vlan set for the whole switch, and trying to overlay the soft bridges' (overlapping) vlan sets on the hard switch fails. That failure is reported to the user. We tried, but due to offload device limitations, we can't support that operation. Of course, if the vlan sets didn't overlap, then we don't have a problem. This will not be the only case where something we can do on a soft Linux switch/router can't be offloaded to some physical offload device. But I think the philosophy has been to try offload what we can, up to the point of failure. In some cases, we can mask that failure from the user by falling back to soft-switch only, but in other cases the failure will pop up right in the user's face, like in your example. One idea to help mitigate the user's confusion would be to limit the number of bridges overlaid on the device to just one. Our drivers know when ports are enslaved to bridges, so is there something we can do there to fail the enslave on a second bridge? Exercise left to the reader. If we had that, now vlan ranges work 1:1 with soft Linux because both soft bridge and device have single vlan set. Sorry for the long-winded response. -scott ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: switchdev and VLAN ranges 2015-10-10 4:22 ` Scott Feldman @ 2015-10-10 16:33 ` Vivien Didelot 2015-10-10 18:10 ` Florian Fainelli 0 siblings, 1 reply; 34+ messages in thread From: Vivien Didelot @ 2015-10-10 16:33 UTC (permalink / raw) To: Scott Feldman Cc: Jiri Pirko, netdev, stephen, Florian Fainelli, Andrew Lunn, Roopa Prabhu On Oct. Friday 09 (41) 09:22 PM, Scott Feldman wrote: > On Fri, Oct 9, 2015 at 4:30 PM, Vivien Didelot > <vivien.didelot@savoirfairelinux.com> wrote: > > Hi All, > > > > I understand that specifying a VLAN range on the command line is nice > > for the user, and it makes no big deal for software implementation. > > [Adding Roopa, since she did the original vlan range support in the > kernel/iproute2] > > > However, AFAICT a VLAN range does not make sense at all for hardware > > such as Ethernet switch chips. Am I wrong? > > > > I would suggest to make switchdev directly answer to a bridge request > > that the operation is not supported when the user asks for a VLAN range. > > > > That way, we can simply use a single "vid" member in struct > > switchdev_obj_port_vlan instead of "vid_begin" and "vid_end" and thus > > avoid making drivers heavier with iteration loops on such range. > > > > I have two concerns in mind: > > > > a) if we imagine that drivers like Rocker allocate memory in the prepare > > phase for each VID, preparing a range like 100-4000 would definitely not > > be recommended. > > This call should be in process context so it doesn't seem to terrible > for the driver to take its time to reserve/allocate resources in > prepare phase, even for a vlan range. I think I'm missing your point. I'll try to give a concrete example. If I want to use the prepare phase in the mv88e6xxx driver, I will allocate a struct mv88e6xxx_vtu_entry (basically a row in the hardware VLAN table) in the prepare phase before programming it in the commit phase. With this VLAN range, the driver will allocate 3900 struct. This seems pointless for drivers and error-prone. > > b) imagine that you have two Linux bridges on a switch, one using the > > hardware VLAN 100. If you request the VLAN range 99-101 for the other > > bridge members, it is not possible for the driver to say "I can > > accelerate VLAN 99 and 101, but not 100". It must return OPNOTSUPP for > > the whole range. > > Well, it probably should return -ERANGE to indicate the range can't be > added, but that's an aside. > > The reason why vlan ranges need to work down to the switchdev driver > is, from the user's perspective, it's an all-or-nothing request from > the user to add the vlan range to the device. So we need to ask the > driver in the prepare phase, "can you support this range, > completely?", and if yes, then commit it as a whole. The netlink > response back to the user isn't equipped to describe what subset of > the range was added, and what subset was not. I understand the all-or-nothing request, and I don't want to re-train the user. I am basically suggesting that switchdev returns -ERANGE in its code, if the device is a switch port and a VLAN range is requested. I mean, I like the per-port paradigm of Linux, but an infrastructure such as switchdev should not just forward user request to the device, but also protect it and push consistent calls to the hardware. > > That's why I think that avoiding VLAN range at the switchdev level would > > be a good idea. > > As a general rule with switchdev, we've tried to keep the user's > experience the same when using {Linux} as a soft switch/router vs. > using {Linux + offload device} as a hard switch/router. So if native > Linux supports some operation, for example vlan ranges, then we should > try to extend that to the offload model. In other words, we don't > want to re-train the user when moving from soft switch to hard switch! > But there are physical limitations when dealing with an offload > device.... > > Anyway, with your vlan range example, we've got a case where each soft > bridge has an independent vlan set, and the vlan sets between soft > bridges can overlap. For the (typical) hard switch, there is one vlan > set for the whole switch, and trying to overlay the soft bridges' > (overlapping) vlan sets on the hard switch fails. That failure is > reported to the user. We tried, but due to offload device > limitations, we can't support that operation. Of course, if the vlan > sets didn't overlap, then we don't have a problem. > > This will not be the only case where something we can do on a soft > Linux switch/router can't be offloaded to some physical offload > device. But I think the philosophy has been to try offload what we > can, up to the point of failure. In some cases, we can mask that > failure from the user by falling back to soft-switch only, but in > other cases the failure will pop up right in the user's face, like in > your example. > > One idea to help mitigate the user's confusion would be to limit the > number of bridges overlaid on the device to just one. Our drivers > know when ports are enslaved to bridges, so is there something we can > do there to fail the enslave on a second bridge? Exercise left to the > reader. If we had that, now vlan ranges work 1:1 with soft Linux > because both soft bridge and device have single vlan set. > > Sorry for the long-winded response. I'm fine with the multiple software bridges on top of hardware switches. IIRC, Andrew is actually using this feature. The user should be able to request mixing of any ports like (s)he wants. A use case I can imagine is a bridge on a router between a WAN interface and a few switch ports, and an isolated group of ports (e.g. sandbox). What I have done in net/dsa/slave.c:dsa_bridge_check_vlan_range() should ideally be moved up to switchdev (this code prevents overlapping of VLAN between bridges in DSA switches). Thanks, -v ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: switchdev and VLAN ranges 2015-10-10 16:33 ` Vivien Didelot @ 2015-10-10 18:10 ` Florian Fainelli 2015-10-10 19:47 ` Vivien Didelot 0 siblings, 1 reply; 34+ messages in thread From: Florian Fainelli @ 2015-10-10 18:10 UTC (permalink / raw) To: Vivien Didelot Cc: Scott Feldman, Jiri Pirko, netdev, stephen, Andrew Lunn, Roopa Prabhu 2015-10-10 9:33 GMT-07:00 Vivien Didelot <vivien.didelot@savoirfairelinux.com>: > On Oct. Friday 09 (41) 09:22 PM, Scott Feldman wrote: >> On Fri, Oct 9, 2015 at 4:30 PM, Vivien Didelot >> <vivien.didelot@savoirfairelinux.com> wrote: >> > Hi All, >> > >> > I understand that specifying a VLAN range on the command line is nice >> > for the user, and it makes no big deal for software implementation. >> >> [Adding Roopa, since she did the original vlan range support in the >> kernel/iproute2] >> >> > However, AFAICT a VLAN range does not make sense at all for hardware >> > such as Ethernet switch chips. Am I wrong? >> > >> > I would suggest to make switchdev directly answer to a bridge request >> > that the operation is not supported when the user asks for a VLAN range. >> > >> > That way, we can simply use a single "vid" member in struct >> > switchdev_obj_port_vlan instead of "vid_begin" and "vid_end" and thus >> > avoid making drivers heavier with iteration loops on such range. >> > >> > I have two concerns in mind: >> > >> > a) if we imagine that drivers like Rocker allocate memory in the prepare >> > phase for each VID, preparing a range like 100-4000 would definitely not >> > be recommended. >> >> This call should be in process context so it doesn't seem to terrible >> for the driver to take its time to reserve/allocate resources in >> prepare phase, even for a vlan range. I think I'm missing your point. > > I'll try to give a concrete example. If I want to use the prepare phase > in the mv88e6xxx driver, I will allocate a struct mv88e6xxx_vtu_entry > (basically a row in the hardware VLAN table) in the prepare phase before > programming it in the commit phase. > > With this VLAN range, the driver will allocate 3900 struct. This seems > pointless for drivers and error-prone. We could amortize this allocation cost by having a way to tell the bridge VLAN filtering code, once associated with a switchdev driver, that this driver needs X amount of bytes for its private VLAN entry, but that sounds overkill. If the switch driver needs to maintain bookeeping information about VLANs, we would have to allocate this data somewhere, so we might as well have the switch driver do it? > >> > b) imagine that you have two Linux bridges on a switch, one using the >> > hardware VLAN 100. If you request the VLAN range 99-101 for the other >> > bridge members, it is not possible for the driver to say "I can >> > accelerate VLAN 99 and 101, but not 100". It must return OPNOTSUPP for >> > the whole range. >> >> Well, it probably should return -ERANGE to indicate the range can't be >> added, but that's an aside. >> >> The reason why vlan ranges need to work down to the switchdev driver >> is, from the user's perspective, it's an all-or-nothing request from >> the user to add the vlan range to the device. So we need to ask the >> driver in the prepare phase, "can you support this range, >> completely?", and if yes, then commit it as a whole. The netlink >> response back to the user isn't equipped to describe what subset of >> the range was added, and what subset was not. > > I understand the all-or-nothing request, and I don't want to re-train > the user. I am basically suggesting that switchdev returns -ERANGE in > its code, if the device is a switch port and a VLAN range is requested. Denying VLAN ranges in switchdev would at best, be something that should be negotiated with the underlying switch driver. There could be cases where it is desirable to push VLAN ranges to batch expensive HW operations for instance, or just because you can do it and this maps nicely to your HW is nice to have. Do you have something else in mind? > > I mean, I like the per-port paradigm of Linux, but an infrastructure > such as switchdev should not just forward user request to the device, > but also protect it and push consistent calls to the hardware. > >> > That's why I think that avoiding VLAN range at the switchdev level would >> > be a good idea. >> >> As a general rule with switchdev, we've tried to keep the user's >> experience the same when using {Linux} as a soft switch/router vs. >> using {Linux + offload device} as a hard switch/router. So if native >> Linux supports some operation, for example vlan ranges, then we should >> try to extend that to the offload model. In other words, we don't >> want to re-train the user when moving from soft switch to hard switch! >> But there are physical limitations when dealing with an offload >> device.... >> >> Anyway, with your vlan range example, we've got a case where each soft >> bridge has an independent vlan set, and the vlan sets between soft >> bridges can overlap. For the (typical) hard switch, there is one vlan >> set for the whole switch, and trying to overlay the soft bridges' >> (overlapping) vlan sets on the hard switch fails. That failure is >> reported to the user. We tried, but due to offload device >> limitations, we can't support that operation. Of course, if the vlan >> sets didn't overlap, then we don't have a problem. >> >> This will not be the only case where something we can do on a soft >> Linux switch/router can't be offloaded to some physical offload >> device. But I think the philosophy has been to try offload what we >> can, up to the point of failure. In some cases, we can mask that >> failure from the user by falling back to soft-switch only, but in >> other cases the failure will pop up right in the user's face, like in >> your example. >> >> One idea to help mitigate the user's confusion would be to limit the >> number of bridges overlaid on the device to just one. Our drivers >> know when ports are enslaved to bridges, so is there something we can >> do there to fail the enslave on a second bridge? Exercise left to the >> reader. If we had that, now vlan ranges work 1:1 with soft Linux >> because both soft bridge and device have single vlan set. >> >> Sorry for the long-winded response. > > I'm fine with the multiple software bridges on top of hardware switches. > IIRC, Andrew is actually using this feature. The user should be able to > request mixing of any ports like (s)he wants. > > A use case I can imagine is a bridge on a router between a WAN interface > and a few switch ports, and an isolated group of ports (e.g. sandbox). That use case can be solved by having a single bridge entity spanning an entire switch device, and configuring different PVIDs on e.g: ports 2-3-4 (WAN) and ports 0-1 (sandbox). > > What I have done in net/dsa/slave.c:dsa_bridge_check_vlan_range() should > ideally be moved up to switchdev (this code prevents overlapping of VLAN > between bridges in DSA switches). As we discussed before over IRC, checking for overlapping VLANs is something that may have to be switch driver specific at some point. For instance, some switches do a double tag VLAN tag normalization whenever a packet is ingressed, which means that as long as your are not doing double VLAN tagging (that would be fun to support, but even there), you could dedicate an outer VLAN tag per bridge instance, and still allow the same inner VLAN tags to be configured by an user. The switch would still guarantee proper isolation/switching/individual or shared FDBs for these logical domains. -- Florian ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: switchdev and VLAN ranges 2015-10-10 18:10 ` Florian Fainelli @ 2015-10-10 19:47 ` Vivien Didelot 0 siblings, 0 replies; 34+ messages in thread From: Vivien Didelot @ 2015-10-10 19:47 UTC (permalink / raw) To: Florian Fainelli Cc: Scott Feldman, Jiri Pirko, netdev, stephen, Andrew Lunn, Roopa Prabhu On Oct. Saturday 10 (41) 11:10 AM, Florian Fainelli wrote: > 2015-10-10 9:33 GMT-07:00 Vivien Didelot <vivien.didelot@savoirfairelinux.com>: > > On Oct. Friday 09 (41) 09:22 PM, Scott Feldman wrote: > >> On Fri, Oct 9, 2015 at 4:30 PM, Vivien Didelot > >> <vivien.didelot@savoirfairelinux.com> wrote: > >> > Hi All, > >> > > >> > I understand that specifying a VLAN range on the command line is nice > >> > for the user, and it makes no big deal for software implementation. > >> > >> [Adding Roopa, since she did the original vlan range support in the > >> kernel/iproute2] > >> > >> > However, AFAICT a VLAN range does not make sense at all for hardware > >> > such as Ethernet switch chips. Am I wrong? > >> > > >> > I would suggest to make switchdev directly answer to a bridge request > >> > that the operation is not supported when the user asks for a VLAN range. > >> > > >> > That way, we can simply use a single "vid" member in struct > >> > switchdev_obj_port_vlan instead of "vid_begin" and "vid_end" and thus > >> > avoid making drivers heavier with iteration loops on such range. > >> > > >> > I have two concerns in mind: > >> > > >> > a) if we imagine that drivers like Rocker allocate memory in the prepare > >> > phase for each VID, preparing a range like 100-4000 would definitely not > >> > be recommended. > >> > >> This call should be in process context so it doesn't seem to terrible > >> for the driver to take its time to reserve/allocate resources in > >> prepare phase, even for a vlan range. I think I'm missing your point. > > > > I'll try to give a concrete example. If I want to use the prepare phase > > in the mv88e6xxx driver, I will allocate a struct mv88e6xxx_vtu_entry > > (basically a row in the hardware VLAN table) in the prepare phase before > > programming it in the commit phase. > > > > With this VLAN range, the driver will allocate 3900 struct. This seems > > pointless for drivers and error-prone. > > We could amortize this allocation cost by having a way to tell the > bridge VLAN filtering code, once associated with a switchdev driver, > that this driver needs X amount of bytes for its private VLAN entry, > but that sounds overkill. If the switch driver needs to maintain > bookeeping information about VLANs, we would have to allocate this > data somewhere, so we might as well have the switch driver do it? You are right, the allocation would have to be done anyway. I am considering caching the VLAN table in mv88e6xxx sometimes. > > > >> > b) imagine that you have two Linux bridges on a switch, one using the > >> > hardware VLAN 100. If you request the VLAN range 99-101 for the other > >> > bridge members, it is not possible for the driver to say "I can > >> > accelerate VLAN 99 and 101, but not 100". It must return OPNOTSUPP for > >> > the whole range. > >> > >> Well, it probably should return -ERANGE to indicate the range can't be > >> added, but that's an aside. > >> > >> The reason why vlan ranges need to work down to the switchdev driver > >> is, from the user's perspective, it's an all-or-nothing request from > >> the user to add the vlan range to the device. So we need to ask the > >> driver in the prepare phase, "can you support this range, > >> completely?", and if yes, then commit it as a whole. The netlink > >> response back to the user isn't equipped to describe what subset of > >> the range was added, and what subset was not. > > > > I understand the all-or-nothing request, and I don't want to re-train > > the user. I am basically suggesting that switchdev returns -ERANGE in > > its code, if the device is a switch port and a VLAN range is requested. > > Denying VLAN ranges in switchdev would at best, be something that > should be negotiated with the underlying switch driver. There could be > cases where it is desirable to push VLAN ranges to batch expensive HW > operations for instance, or just because you can do it and this maps > nicely to your HW is nice to have. Do you have something else in > mind? I don't really have something else in mind. I was just worried that we are pushing features to drivers based on potential (but not existent AFAICT) hardware features. I am curious to find an hardware that might program a batch of VLAN in a single shot, especially on a per-port basis as Linux does. If someone could point me an example of such hardware, I would gladly stop questioning the need for VLAN ranges in drivers. > > > > I mean, I like the per-port paradigm of Linux, but an infrastructure > > such as switchdev should not just forward user request to the device, > > but also protect it and push consistent calls to the hardware. > > > >> > That's why I think that avoiding VLAN range at the switchdev level would > >> > be a good idea. > >> > >> As a general rule with switchdev, we've tried to keep the user's > >> experience the same when using {Linux} as a soft switch/router vs. > >> using {Linux + offload device} as a hard switch/router. So if native > >> Linux supports some operation, for example vlan ranges, then we should > >> try to extend that to the offload model. In other words, we don't > >> want to re-train the user when moving from soft switch to hard switch! > >> But there are physical limitations when dealing with an offload > >> device.... > >> > >> Anyway, with your vlan range example, we've got a case where each soft > >> bridge has an independent vlan set, and the vlan sets between soft > >> bridges can overlap. For the (typical) hard switch, there is one vlan > >> set for the whole switch, and trying to overlay the soft bridges' > >> (overlapping) vlan sets on the hard switch fails. That failure is > >> reported to the user. We tried, but due to offload device > >> limitations, we can't support that operation. Of course, if the vlan > >> sets didn't overlap, then we don't have a problem. > >> > >> This will not be the only case where something we can do on a soft > >> Linux switch/router can't be offloaded to some physical offload > >> device. But I think the philosophy has been to try offload what we > >> can, up to the point of failure. In some cases, we can mask that > >> failure from the user by falling back to soft-switch only, but in > >> other cases the failure will pop up right in the user's face, like in > >> your example. > >> > >> One idea to help mitigate the user's confusion would be to limit the > >> number of bridges overlaid on the device to just one. Our drivers > >> know when ports are enslaved to bridges, so is there something we can > >> do there to fail the enslave on a second bridge? Exercise left to the > >> reader. If we had that, now vlan ranges work 1:1 with soft Linux > >> because both soft bridge and device have single vlan set. > >> > >> Sorry for the long-winded response. > > > > I'm fine with the multiple software bridges on top of hardware switches. > > IIRC, Andrew is actually using this feature. The user should be able to > > request mixing of any ports like (s)he wants. > > > > A use case I can imagine is a bridge on a router between a WAN interface > > and a few switch ports, and an isolated group of ports (e.g. sandbox). > > That use case can be solved by having a single bridge entity spanning > an entire switch device, and configuring different PVIDs on e.g: ports > 2-3-4 (WAN) and ports 0-1 (sandbox). That's true. The number of software bridge groups on top of an hardware switch is not really the issue here. If Linux and the bridge code has an explicit support for hardware switch, then mapping a single software bridge group on a switch chip would be really consistent. Otherwise, it doesn't add much value. > > > > > What I have done in net/dsa/slave.c:dsa_bridge_check_vlan_range() should > > ideally be moved up to switchdev (this code prevents overlapping of VLAN > > between bridges in DSA switches). > > As we discussed before over IRC, checking for overlapping VLANs is > something that may have to be switch driver specific at some point. > > For instance, some switches do a double tag VLAN tag normalization > whenever a packet is ingressed, which means that as long as your are > not doing double VLAN tagging (that would be fun to support, but even > there), you could dedicate an outer VLAN tag per bridge instance, and > still allow the same inner VLAN tags to be configured by an user. The > switch would still guarantee proper isolation/switching/individual or > shared FDBs for these logical domains. Indeed, you can do internal hacks in the driver to support other features. But do we really want to do that? Wouldn't it be better to keep simple and explicit support? Like, VLAN is VLAN, double tagging is double tagging, etc. So we would avoid confusion in implementations. DSA and its "hardware bridging" code -- which is a hack using port-based VLAN maps until full 802.1Q support was added -- is good an example of confusion that such flexibility might bring. Thanks, -v ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: switchdev and VLAN ranges 2015-10-09 23:30 switchdev and VLAN ranges Vivien Didelot 2015-10-10 4:22 ` Scott Feldman @ 2015-10-10 7:49 ` Elad Raz 2015-10-10 10:36 ` Nikolay Aleksandrov 1 sibling, 1 reply; 34+ messages in thread From: Elad Raz @ 2015-10-10 7:49 UTC (permalink / raw) To: Vivien Didelot Cc: Jiri Pirko, Scott Feldman, netdev, stephen, Florian Fainelli, Andrew Lunn, Ido Schimmel, Or Gerlitz > On Oct 10, 2015, at 2:30 AM, Vivien Didelot <vivien.didelot@savoirfairelinux.com> wrote: > > I have two concerns in mind: > > a) if we imagine that drivers like Rocker allocate memory in the prepare > phase for each VID, preparing a range like 100-4000 would definitely not > be recommended. > > b) imagine that you have two Linux bridges on a switch, one using the > hardware VLAN 100. If you request the VLAN range 99-101 for the other > bridge members, it is not possible for the driver to say "I can > accelerate VLAN 99 and 101, but not 100". It must return OPNOTSUPP for > the whole range. Another concern I have with vid_being..vid_end range is the “flags”. Where flags can be BRIDGE_VLAN_INFO_PVID. There is no sense having more than one VLAN as a PVID. This leave the HW vendor the choice which VLAN id they will use as the PVID. ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: switchdev and VLAN ranges 2015-10-10 7:49 ` Elad Raz @ 2015-10-10 10:36 ` Nikolay Aleksandrov 2015-10-11 7:12 ` Jiri Pirko 0 siblings, 1 reply; 34+ messages in thread From: Nikolay Aleksandrov @ 2015-10-10 10:36 UTC (permalink / raw) To: Elad Raz, Vivien Didelot Cc: Jiri Pirko, Scott Feldman, netdev, stephen, Florian Fainelli, Andrew Lunn, Ido Schimmel, Or Gerlitz On 10/10/2015 09:49 AM, Elad Raz wrote: > >> On Oct 10, 2015, at 2:30 AM, Vivien Didelot <vivien.didelot@savoirfairelinux.com> wrote: >> >> I have two concerns in mind: >> >> a) if we imagine that drivers like Rocker allocate memory in the prepare >> phase for each VID, preparing a range like 100-4000 would definitely not >> be recommended. >> >> b) imagine that you have two Linux bridges on a switch, one using the >> hardware VLAN 100. If you request the VLAN range 99-101 for the other >> bridge members, it is not possible for the driver to say "I can >> accelerate VLAN 99 and 101, but not 100". It must return OPNOTSUPP for >> the whole range. > > Another concern I have with vid_being..vid_end range is the “flags”. Where flags can be BRIDGE_VLAN_INFO_PVID. > There is no sense having more than one VLAN as a PVID. > This leave the HW vendor the choice which VLAN id they will use as the PVID. > iproute2 doesn't allow to do it but I can see that someone can actually make it so the flags for the range have it and it doesn't look correct. Perhaps we need something like the patch below to enforce this from kernel-side. diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c index d78b4429505a..02b17b53e9a6 100644 --- a/net/bridge/br_netlink.c +++ b/net/bridge/br_netlink.c @@ -524,6 +524,9 @@ static int br_afspec(struct net_bridge *br, if (vinfo_start) return -EINVAL; vinfo_start = vinfo; + /* don't allow range of pvids */ + if (vinfo_start->flags & BRIDGE_VLAN_INFO_PVID) + return -EINVAL; continue; } ^ permalink raw reply related [flat|nested] 34+ messages in thread
* Re: switchdev and VLAN ranges 2015-10-10 10:36 ` Nikolay Aleksandrov @ 2015-10-11 7:12 ` Jiri Pirko 2015-10-11 10:49 ` [Bridge] " Nikolay Aleksandrov 2015-10-11 22:41 ` switchdev and VLAN ranges Vivien Didelot 0 siblings, 2 replies; 34+ messages in thread From: Jiri Pirko @ 2015-10-11 7:12 UTC (permalink / raw) To: Nikolay Aleksandrov Cc: Elad Raz, Vivien Didelot, Scott Feldman, netdev, stephen, Florian Fainelli, Andrew Lunn, Ido Schimmel, Or Gerlitz Sat, Oct 10, 2015 at 12:36:26PM CEST, nikolay@cumulusnetworks.com wrote: >On 10/10/2015 09:49 AM, Elad Raz wrote: >> >>> On Oct 10, 2015, at 2:30 AM, Vivien Didelot <vivien.didelot@savoirfairelinux.com> wrote: >>> >>> I have two concerns in mind: >>> >>> a) if we imagine that drivers like Rocker allocate memory in the prepare >>> phase for each VID, preparing a range like 100-4000 would definitely not >>> be recommended. >>> >>> b) imagine that you have two Linux bridges on a switch, one using the >>> hardware VLAN 100. If you request the VLAN range 99-101 for the other >>> bridge members, it is not possible for the driver to say "I can >>> accelerate VLAN 99 and 101, but not 100". It must return OPNOTSUPP for >>> the whole range. >> >> Another concern I have with vid_being..vid_end range is the “flags”. Where flags can be BRIDGE_VLAN_INFO_PVID. >> There is no sense having more than one VLAN as a PVID. >> This leave the HW vendor the choice which VLAN id they will use as the PVID. >> > >iproute2 doesn't allow to do it but I can see that someone can actually make it >so the flags for the range have it and it doesn't look correct. Perhaps we need >something like the patch below to enforce this from kernel-side. > > >diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c >index d78b4429505a..02b17b53e9a6 100644 >--- a/net/bridge/br_netlink.c >+++ b/net/bridge/br_netlink.c >@@ -524,6 +524,9 @@ static int br_afspec(struct net_bridge *br, > if (vinfo_start) > return -EINVAL; > vinfo_start = vinfo; >+ /* don't allow range of pvids */ >+ if (vinfo_start->flags & BRIDGE_VLAN_INFO_PVID) >+ return -EINVAL; > continue; > } > Looks correct to me. Could you please submit this properly? Thanks! ^ permalink raw reply [flat|nested] 34+ messages in thread
* [PATCH net-next] bridge: vlan: enforce no pvid flag in vlan ranges 2015-10-11 7:12 ` Jiri Pirko @ 2015-10-11 10:49 ` Nikolay Aleksandrov 2015-10-11 22:41 ` switchdev and VLAN ranges Vivien Didelot 1 sibling, 0 replies; 34+ messages in thread From: Nikolay Aleksandrov @ 2015-10-11 10:49 UTC (permalink / raw) To: netdev; +Cc: jiri, Nikolay Aleksandrov, bridge, eladr, davem From: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Currently it's possible for someone to send a vlan range to the kernel with the pvid flag set which will result in the pvid bouncing from a vlan to vlan and isn't correct, it also introduces problems for hardware where it doesn't make sense having more than 1 pvid. iproute2 already enforces this, so let's enforce it on kernel-side as well. Reported-by: Elad Raz <eladr@mellanox.com> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> --- net/bridge/br_netlink.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c index d78b4429505a..02b17b53e9a6 100644 --- a/net/bridge/br_netlink.c +++ b/net/bridge/br_netlink.c @@ -524,6 +524,9 @@ static int br_afspec(struct net_bridge *br, if (vinfo_start) return -EINVAL; vinfo_start = vinfo; + /* don't allow range of pvids */ + if (vinfo_start->flags & BRIDGE_VLAN_INFO_PVID) + return -EINVAL; continue; } -- 2.4.3 ^ permalink raw reply related [flat|nested] 34+ messages in thread
* [Bridge] [PATCH net-next] bridge: vlan: enforce no pvid flag in vlan ranges @ 2015-10-11 10:49 ` Nikolay Aleksandrov 0 siblings, 0 replies; 34+ messages in thread From: Nikolay Aleksandrov @ 2015-10-11 10:49 UTC (permalink / raw) To: netdev; +Cc: jiri, Nikolay Aleksandrov, bridge, eladr, davem From: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Currently it's possible for someone to send a vlan range to the kernel with the pvid flag set which will result in the pvid bouncing from a vlan to vlan and isn't correct, it also introduces problems for hardware where it doesn't make sense having more than 1 pvid. iproute2 already enforces this, so let's enforce it on kernel-side as well. Reported-by: Elad Raz <eladr@mellanox.com> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> --- net/bridge/br_netlink.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c index d78b4429505a..02b17b53e9a6 100644 --- a/net/bridge/br_netlink.c +++ b/net/bridge/br_netlink.c @@ -524,6 +524,9 @@ static int br_afspec(struct net_bridge *br, if (vinfo_start) return -EINVAL; vinfo_start = vinfo; + /* don't allow range of pvids */ + if (vinfo_start->flags & BRIDGE_VLAN_INFO_PVID) + return -EINVAL; continue; } -- 2.4.3 ^ permalink raw reply related [flat|nested] 34+ messages in thread
* Re: [PATCH net-next] bridge: vlan: enforce no pvid flag in vlan ranges 2015-10-11 10:49 ` [Bridge] " Nikolay Aleksandrov (?) @ 2015-10-11 14:13 ` Jiri Pirko -1 siblings, 0 replies; 34+ messages in thread From: Jiri Pirko @ 2015-10-11 14:13 UTC (permalink / raw) To: Nikolay Aleksandrov Cc: netdev, stephen, bridge, davem, eladr, Nikolay Aleksandrov Sun, Oct 11, 2015 at 12:49:56PM CEST, razor@blackwall.org wrote: >From: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> > >Currently it's possible for someone to send a vlan range to the kernel >with the pvid flag set which will result in the pvid bouncing from a >vlan to vlan and isn't correct, it also introduces problems for hardware >where it doesn't make sense having more than 1 pvid. iproute2 already >enforces this, so let's enforce it on kernel-side as well. > >Reported-by: Elad Raz <eladr@mellanox.com> >Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Acked-by: Jiri Pirko <jiri@mellanox.com> ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH net-next] bridge: vlan: enforce no pvid flag in vlan ranges 2015-10-11 10:49 ` [Bridge] " Nikolay Aleksandrov @ 2015-10-13 2:59 ` David Miller -1 siblings, 0 replies; 34+ messages in thread From: David Miller @ 2015-10-13 2:59 UTC (permalink / raw) To: razor; +Cc: netdev, jiri, stephen, bridge, eladr, nikolay From: Nikolay Aleksandrov <razor@blackwall.org> Date: Sun, 11 Oct 2015 12:49:56 +0200 > From: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> > > Currently it's possible for someone to send a vlan range to the kernel > with the pvid flag set which will result in the pvid bouncing from a > vlan to vlan and isn't correct, it also introduces problems for hardware > where it doesn't make sense having more than 1 pvid. iproute2 already > enforces this, so let's enforce it on kernel-side as well. > > Reported-by: Elad Raz <eladr@mellanox.com> > Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Applied, thanks. ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [Bridge] [PATCH net-next] bridge: vlan: enforce no pvid flag in vlan ranges @ 2015-10-13 2:59 ` David Miller 0 siblings, 0 replies; 34+ messages in thread From: David Miller @ 2015-10-13 2:59 UTC (permalink / raw) To: razor; +Cc: jiri, nikolay, netdev, bridge, eladr From: Nikolay Aleksandrov <razor@blackwall.org> Date: Sun, 11 Oct 2015 12:49:56 +0200 > From: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> > > Currently it's possible for someone to send a vlan range to the kernel > with the pvid flag set which will result in the pvid bouncing from a > vlan to vlan and isn't correct, it also introduces problems for hardware > where it doesn't make sense having more than 1 pvid. iproute2 already > enforces this, so let's enforce it on kernel-side as well. > > Reported-by: Elad Raz <eladr@mellanox.com> > Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Applied, thanks. ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: switchdev and VLAN ranges 2015-10-11 7:12 ` Jiri Pirko 2015-10-11 10:49 ` [Bridge] " Nikolay Aleksandrov @ 2015-10-11 22:41 ` Vivien Didelot 2015-10-12 0:13 ` Nikolay Aleksandrov 1 sibling, 1 reply; 34+ messages in thread From: Vivien Didelot @ 2015-10-11 22:41 UTC (permalink / raw) To: Jiri Pirko Cc: Nikolay Aleksandrov, Elad Raz, Scott Feldman, netdev, stephen, Florian Fainelli, Andrew Lunn, Ido Schimmel, Or Gerlitz On Oct. Sunday 11 (41) 09:12 AM, Jiri Pirko wrote: > Sat, Oct 10, 2015 at 12:36:26PM CEST, nikolay@cumulusnetworks.com wrote: > >On 10/10/2015 09:49 AM, Elad Raz wrote: > >> > >>> On Oct 10, 2015, at 2:30 AM, Vivien Didelot <vivien.didelot@savoirfairelinux.com> wrote: > >>> > >>> I have two concerns in mind: > >>> > >>> a) if we imagine that drivers like Rocker allocate memory in the prepare > >>> phase for each VID, preparing a range like 100-4000 would definitely not > >>> be recommended. > >>> > >>> b) imagine that you have two Linux bridges on a switch, one using the > >>> hardware VLAN 100. If you request the VLAN range 99-101 for the other > >>> bridge members, it is not possible for the driver to say "I can > >>> accelerate VLAN 99 and 101, but not 100". It must return OPNOTSUPP for > >>> the whole range. > >> > >> Another concern I have with vid_being..vid_end range is the “flags”. Where flags can be BRIDGE_VLAN_INFO_PVID. > >> There is no sense having more than one VLAN as a PVID. > >> This leave the HW vendor the choice which VLAN id they will use as the PVID. > >> > > > >iproute2 doesn't allow to do it but I can see that someone can actually make it > >so the flags for the range have it and it doesn't look correct. Perhaps we need > >something like the patch below to enforce this from kernel-side. > > > > > >diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c > >index d78b4429505a..02b17b53e9a6 100644 > >--- a/net/bridge/br_netlink.c > >+++ b/net/bridge/br_netlink.c > >@@ -524,6 +524,9 @@ static int br_afspec(struct net_bridge *br, > > if (vinfo_start) > > return -EINVAL; > > vinfo_start = vinfo; > >+ /* don't allow range of pvids */ > >+ if (vinfo_start->flags & BRIDGE_VLAN_INFO_PVID) > >+ return -EINVAL; > > continue; > > } > > > > Looks correct to me. Could you please submit this properly? Thanks! The above patch is correct, but we only solve part of the problem, since the range and bridge flags are exposed by switchdev_obj_port_vlan as is. Thanks, -v ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: switchdev and VLAN ranges 2015-10-11 22:41 ` switchdev and VLAN ranges Vivien Didelot @ 2015-10-12 0:13 ` Nikolay Aleksandrov 2015-10-12 5:14 ` Scott Feldman 0 siblings, 1 reply; 34+ messages in thread From: Nikolay Aleksandrov @ 2015-10-12 0:13 UTC (permalink / raw) To: Vivien Didelot, Jiri Pirko Cc: Elad Raz, Scott Feldman, netdev, stephen, Florian Fainelli, Andrew Lunn, Ido Schimmel, Or Gerlitz On 10/12/2015 12:41 AM, Vivien Didelot wrote: > On Oct. Sunday 11 (41) 09:12 AM, Jiri Pirko wrote: >> Sat, Oct 10, 2015 at 12:36:26PM CEST, nikolay@cumulusnetworks.com wrote: >>> On 10/10/2015 09:49 AM, Elad Raz wrote: >>>> >>>>> On Oct 10, 2015, at 2:30 AM, Vivien Didelot <vivien.didelot@savoirfairelinux.com> wrote: >>>>> >>>>> I have two concerns in mind: >>>>> >>>>> a) if we imagine that drivers like Rocker allocate memory in the prepare >>>>> phase for each VID, preparing a range like 100-4000 would definitely not >>>>> be recommended. >>>>> >>>>> b) imagine that you have two Linux bridges on a switch, one using the >>>>> hardware VLAN 100. If you request the VLAN range 99-101 for the other >>>>> bridge members, it is not possible for the driver to say "I can >>>>> accelerate VLAN 99 and 101, but not 100". It must return OPNOTSUPP for >>>>> the whole range. >>>> >>>> Another concern I have with vid_being..vid_end range is the “flags”. Where flags can be BRIDGE_VLAN_INFO_PVID. >>>> There is no sense having more than one VLAN as a PVID. >>>> This leave the HW vendor the choice which VLAN id they will use as the PVID. >>>> >>> >>> iproute2 doesn't allow to do it but I can see that someone can actually make it >>> so the flags for the range have it and it doesn't look correct. Perhaps we need >>> something like the patch below to enforce this from kernel-side. >>> >>> >>> diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c >>> index d78b4429505a..02b17b53e9a6 100644 >>> --- a/net/bridge/br_netlink.c >>> +++ b/net/bridge/br_netlink.c >>> @@ -524,6 +524,9 @@ static int br_afspec(struct net_bridge *br, >>> if (vinfo_start) >>> return -EINVAL; >>> vinfo_start = vinfo; >>> + /* don't allow range of pvids */ >>> + if (vinfo_start->flags & BRIDGE_VLAN_INFO_PVID) >>> + return -EINVAL; >>> continue; >>> } >>> >> >> Looks correct to me. Could you please submit this properly? Thanks! > > The above patch is correct, but we only solve part of the problem, since > the range and bridge flags are exposed by switchdev_obj_port_vlan as is. > > Thanks, > -v > Yes, the above fixes the bridge side. About the switchdev side it seems like it's up to the switchdev driver to do the right thing in its switchdev_ops. I took a quick look at DSA and it seems correct, the flag isn't saved and on dump request the flags are generated so it shouldn't be possible to export multiple pvids. But switchdev_port_br_afspec() seems problematic, in fact I don't even see a vlan id check, i.e. ==0 || >= VLAN_N_MASK. Of course, I might be totally off point as I'm not that familiar with switchdev and it's very late. :-) But maybe it needs something like: diff --git a/net/switchdev/switchdev.c b/net/switchdev/switchdev.c index 6e4a4f9ad927..3dd52a53867f 100644 --- a/net/switchdev/switchdev.c +++ b/net/switchdev/switchdev.c @@ -16,6 +16,7 @@ #include <linux/notifier.h> #include <linux/netdevice.h> #include <linux/if_bridge.h> +#include <linux/if_vlan.h> #include <linux/list.h> #include <net/ip_fib.h> #include <net/switchdev.h> @@ -716,10 +717,14 @@ static int switchdev_port_br_afspec(struct net_device *dev, return -EINVAL; vinfo = nla_data(attr); vlan.flags = vinfo->flags; + if (!vinfo->vid || vinfo->vid >= VLAN_VID_MASK) + return -EINVAL; if (vinfo->flags & BRIDGE_VLAN_INFO_RANGE_BEGIN) { if (vlan.vid_begin) return -EINVAL; vlan.vid_begin = vinfo->vid; + if (vlan.flags & BRIDGE_VLAN_INFO_PVID) + return -EINVAL; } else if (vinfo->flags & BRIDGE_VLAN_INFO_RANGE_END) { if (!vlan.vid_begin) return -EINVAL; ^ permalink raw reply related [flat|nested] 34+ messages in thread
* Re: switchdev and VLAN ranges 2015-10-12 0:13 ` Nikolay Aleksandrov @ 2015-10-12 5:14 ` Scott Feldman 2015-10-12 10:15 ` Nikolay Aleksandrov 2015-10-12 12:01 ` [PATCH net-next] switchdev: enforce no pvid flag in vlan ranges Nikolay Aleksandrov 0 siblings, 2 replies; 34+ messages in thread From: Scott Feldman @ 2015-10-12 5:14 UTC (permalink / raw) To: Nikolay Aleksandrov Cc: Vivien Didelot, Jiri Pirko, Elad Raz, netdev, stephen, Florian Fainelli, Andrew Lunn, Ido Schimmel, Or Gerlitz On Sun, Oct 11, 2015 at 5:13 PM, Nikolay Aleksandrov <nikolay@cumulusnetworks.com> wrote: > On 10/12/2015 12:41 AM, Vivien Didelot wrote: >> On Oct. Sunday 11 (41) 09:12 AM, Jiri Pirko wrote: >>> Sat, Oct 10, 2015 at 12:36:26PM CEST, nikolay@cumulusnetworks.com wrote: >>>> On 10/10/2015 09:49 AM, Elad Raz wrote: >>>>> >>>>>> On Oct 10, 2015, at 2:30 AM, Vivien Didelot <vivien.didelot@savoirfairelinux.com> wrote: >>>>>> >>>>>> I have two concerns in mind: >>>>>> >>>>>> a) if we imagine that drivers like Rocker allocate memory in the prepare >>>>>> phase for each VID, preparing a range like 100-4000 would definitely not >>>>>> be recommended. >>>>>> >>>>>> b) imagine that you have two Linux bridges on a switch, one using the >>>>>> hardware VLAN 100. If you request the VLAN range 99-101 for the other >>>>>> bridge members, it is not possible for the driver to say "I can >>>>>> accelerate VLAN 99 and 101, but not 100". It must return OPNOTSUPP for >>>>>> the whole range. >>>>> >>>>> Another concern I have with vid_being..vid_end range is the “flags”. Where flags can be BRIDGE_VLAN_INFO_PVID. >>>>> There is no sense having more than one VLAN as a PVID. >>>>> This leave the HW vendor the choice which VLAN id they will use as the PVID. >>>>> >>>> >>>> iproute2 doesn't allow to do it but I can see that someone can actually make it >>>> so the flags for the range have it and it doesn't look correct. Perhaps we need >>>> something like the patch below to enforce this from kernel-side. >>>> >>>> >>>> diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c >>>> index d78b4429505a..02b17b53e9a6 100644 >>>> --- a/net/bridge/br_netlink.c >>>> +++ b/net/bridge/br_netlink.c >>>> @@ -524,6 +524,9 @@ static int br_afspec(struct net_bridge *br, >>>> if (vinfo_start) >>>> return -EINVAL; >>>> vinfo_start = vinfo; >>>> + /* don't allow range of pvids */ >>>> + if (vinfo_start->flags & BRIDGE_VLAN_INFO_PVID) >>>> + return -EINVAL; >>>> continue; >>>> } >>>> >>> >>> Looks correct to me. Could you please submit this properly? Thanks! >> >> The above patch is correct, but we only solve part of the problem, since >> the range and bridge flags are exposed by switchdev_obj_port_vlan as is. >> >> Thanks, >> -v >> > > Yes, the above fixes the bridge side. About the switchdev side it seems like it's > up to the switchdev driver to do the right thing in its switchdev_ops. I took a > quick look at DSA and it seems correct, the flag isn't saved and on dump request > the flags are generated so it shouldn't be possible to export multiple pvids. > But switchdev_port_br_afspec() seems problematic, in fact I don't even see a vlan > id check, i.e. ==0 || >= VLAN_N_MASK. > Of course, I might be totally off point as I'm not that familiar with switchdev and > it's very late. :-) > But maybe it needs something like: > > diff --git a/net/switchdev/switchdev.c b/net/switchdev/switchdev.c > index 6e4a4f9ad927..3dd52a53867f 100644 > --- a/net/switchdev/switchdev.c > +++ b/net/switchdev/switchdev.c > @@ -16,6 +16,7 @@ > #include <linux/notifier.h> > #include <linux/netdevice.h> > #include <linux/if_bridge.h> > +#include <linux/if_vlan.h> > #include <linux/list.h> > #include <net/ip_fib.h> > #include <net/switchdev.h> > @@ -716,10 +717,14 @@ static int switchdev_port_br_afspec(struct net_device *dev, > return -EINVAL; > vinfo = nla_data(attr); > vlan.flags = vinfo->flags; > + if (!vinfo->vid || vinfo->vid >= VLAN_VID_MASK) > + return -EINVAL; > if (vinfo->flags & BRIDGE_VLAN_INFO_RANGE_BEGIN) { > if (vlan.vid_begin) > return -EINVAL; > vlan.vid_begin = vinfo->vid; > + if (vlan.flags & BRIDGE_VLAN_INFO_PVID) > + return -EINVAL; > } else if (vinfo->flags & BRIDGE_VLAN_INFO_RANGE_END) { > if (!vlan.vid_begin) > return -EINVAL; This (and you other patch) seem right to me, if we're going to block setting PVID when specifying a vlan range. Would you mind combining and resending both patches as one as a proper patch? ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: switchdev and VLAN ranges 2015-10-12 5:14 ` Scott Feldman @ 2015-10-12 10:15 ` Nikolay Aleksandrov 2015-10-12 12:01 ` [PATCH net-next] switchdev: enforce no pvid flag in vlan ranges Nikolay Aleksandrov 1 sibling, 0 replies; 34+ messages in thread From: Nikolay Aleksandrov @ 2015-10-12 10:15 UTC (permalink / raw) To: Scott Feldman Cc: Vivien Didelot, Jiri Pirko, Elad Raz, netdev, stephen, Florian Fainelli, Andrew Lunn, Ido Schimmel, Or Gerlitz On 10/12/2015 07:14 AM, Scott Feldman wrote: > On Sun, Oct 11, 2015 at 5:13 PM, Nikolay Aleksandrov > <nikolay@cumulusnetworks.com> wrote: >> On 10/12/2015 12:41 AM, Vivien Didelot wrote: >>> On Oct. Sunday 11 (41) 09:12 AM, Jiri Pirko wrote: >>>> Sat, Oct 10, 2015 at 12:36:26PM CEST, nikolay@cumulusnetworks.com wrote: >>>>> On 10/10/2015 09:49 AM, Elad Raz wrote: >>>>>> >>>>>>> On Oct 10, 2015, at 2:30 AM, Vivien Didelot <vivien.didelot@savoirfairelinux.com> wrote: >>>>>>> >>>>>>> I have two concerns in mind: >>>>>>> >>>>>>> a) if we imagine that drivers like Rocker allocate memory in the prepare >>>>>>> phase for each VID, preparing a range like 100-4000 would definitely not >>>>>>> be recommended. >>>>>>> >>>>>>> b) imagine that you have two Linux bridges on a switch, one using the >>>>>>> hardware VLAN 100. If you request the VLAN range 99-101 for the other >>>>>>> bridge members, it is not possible for the driver to say "I can >>>>>>> accelerate VLAN 99 and 101, but not 100". It must return OPNOTSUPP for >>>>>>> the whole range. >>>>>> >>>>>> Another concern I have with vid_being..vid_end range is the “flags”. Where flags can be BRIDGE_VLAN_INFO_PVID. >>>>>> There is no sense having more than one VLAN as a PVID. >>>>>> This leave the HW vendor the choice which VLAN id they will use as the PVID. >>>>>> >>>>> >>>>> iproute2 doesn't allow to do it but I can see that someone can actually make it >>>>> so the flags for the range have it and it doesn't look correct. Perhaps we need >>>>> something like the patch below to enforce this from kernel-side. >>>>> >>>>> >>>>> diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c >>>>> index d78b4429505a..02b17b53e9a6 100644 >>>>> --- a/net/bridge/br_netlink.c >>>>> +++ b/net/bridge/br_netlink.c >>>>> @@ -524,6 +524,9 @@ static int br_afspec(struct net_bridge *br, >>>>> if (vinfo_start) >>>>> return -EINVAL; >>>>> vinfo_start = vinfo; >>>>> + /* don't allow range of pvids */ >>>>> + if (vinfo_start->flags & BRIDGE_VLAN_INFO_PVID) >>>>> + return -EINVAL; >>>>> continue; >>>>> } >>>>> >>>> >>>> Looks correct to me. Could you please submit this properly? Thanks! >>> >>> The above patch is correct, but we only solve part of the problem, since >>> the range and bridge flags are exposed by switchdev_obj_port_vlan as is. >>> >>> Thanks, >>> -v >>> >> >> Yes, the above fixes the bridge side. About the switchdev side it seems like it's >> up to the switchdev driver to do the right thing in its switchdev_ops. I took a >> quick look at DSA and it seems correct, the flag isn't saved and on dump request >> the flags are generated so it shouldn't be possible to export multiple pvids. >> But switchdev_port_br_afspec() seems problematic, in fact I don't even see a vlan >> id check, i.e. ==0 || >= VLAN_N_MASK. >> Of course, I might be totally off point as I'm not that familiar with switchdev and >> it's very late. :-) >> But maybe it needs something like: >> >> diff --git a/net/switchdev/switchdev.c b/net/switchdev/switchdev.c >> index 6e4a4f9ad927..3dd52a53867f 100644 >> --- a/net/switchdev/switchdev.c >> +++ b/net/switchdev/switchdev.c >> @@ -16,6 +16,7 @@ >> #include <linux/notifier.h> >> #include <linux/netdevice.h> >> #include <linux/if_bridge.h> >> +#include <linux/if_vlan.h> >> #include <linux/list.h> >> #include <net/ip_fib.h> >> #include <net/switchdev.h> >> @@ -716,10 +717,14 @@ static int switchdev_port_br_afspec(struct net_device *dev, >> return -EINVAL; >> vinfo = nla_data(attr); >> vlan.flags = vinfo->flags; >> + if (!vinfo->vid || vinfo->vid >= VLAN_VID_MASK) >> + return -EINVAL; >> if (vinfo->flags & BRIDGE_VLAN_INFO_RANGE_BEGIN) { >> if (vlan.vid_begin) >> return -EINVAL; >> vlan.vid_begin = vinfo->vid; >> + if (vlan.flags & BRIDGE_VLAN_INFO_PVID) >> + return -EINVAL; >> } else if (vinfo->flags & BRIDGE_VLAN_INFO_RANGE_END) { >> if (!vlan.vid_begin) >> return -EINVAL; > > This (and you other patch) seem right to me, if we're going to block > setting PVID when specifying a vlan range. Would you mind combining > and resending both patches as one as a proper patch? > Thanks for the review, I'll prepare a small set as I'd like to keep these separate since they touch two different subsystems and will re-post. I'll target net-next with the pvid range change and -net with the vlan range check patch. Does this sound okay ? Thanks, Nik ^ permalink raw reply [flat|nested] 34+ messages in thread
* [PATCH net-next] switchdev: enforce no pvid flag in vlan ranges 2015-10-12 5:14 ` Scott Feldman 2015-10-12 10:15 ` Nikolay Aleksandrov @ 2015-10-12 12:01 ` Nikolay Aleksandrov 2015-10-12 12:11 ` Elad Raz ` (3 more replies) 1 sibling, 4 replies; 34+ messages in thread From: Nikolay Aleksandrov @ 2015-10-12 12:01 UTC (permalink / raw) To: netdev; +Cc: sfeldma, jiri, vivien.didelot, davem, Nikolay Aleksandrov From: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> We shouldn't allow BRIDGE_VLAN_INFO_PVID flag in VLAN ranges. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> --- net/switchdev/switchdev.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/net/switchdev/switchdev.c b/net/switchdev/switchdev.c index 6e4a4f9ad927..256c596de896 100644 --- a/net/switchdev/switchdev.c +++ b/net/switchdev/switchdev.c @@ -720,6 +720,9 @@ static int switchdev_port_br_afspec(struct net_device *dev, if (vlan.vid_begin) return -EINVAL; vlan.vid_begin = vinfo->vid; + /* don't allow range of pvids */ + if (vlan.flags & BRIDGE_VLAN_INFO_PVID) + return -EINVAL; } else if (vinfo->flags & BRIDGE_VLAN_INFO_RANGE_END) { if (!vlan.vid_begin) return -EINVAL; -- 2.4.3 ^ permalink raw reply related [flat|nested] 34+ messages in thread
* Re: [PATCH net-next] switchdev: enforce no pvid flag in vlan ranges 2015-10-12 12:01 ` [PATCH net-next] switchdev: enforce no pvid flag in vlan ranges Nikolay Aleksandrov @ 2015-10-12 12:11 ` Elad Raz 2015-10-12 12:17 ` Jiri Pirko ` (2 subsequent siblings) 3 siblings, 0 replies; 34+ messages in thread From: Elad Raz @ 2015-10-12 12:11 UTC (permalink / raw) To: Nikolay Aleksandrov Cc: netdev, Scott Feldman, Jiri Pirko, Vivien Didelot, davem, Nikolay Aleksandrov > On Oct 12, 2015, at 3:01 PM, Nikolay Aleksandrov <razor@blackwall.org> wrote: > > From: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> > > We shouldn't allow BRIDGE_VLAN_INFO_PVID flag in VLAN ranges. > > Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> > --- > net/switchdev/switchdev.c | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/net/switchdev/switchdev.c b/net/switchdev/switchdev.c > index 6e4a4f9ad927..256c596de896 100644 > --- a/net/switchdev/switchdev.c > +++ b/net/switchdev/switchdev.c > @@ -720,6 +720,9 @@ static int switchdev_port_br_afspec(struct net_device *dev, > if (vlan.vid_begin) > return -EINVAL; > vlan.vid_begin = vinfo->vid; > + /* don't allow range of pvids */ > + if (vlan.flags & BRIDGE_VLAN_INFO_PVID) > + return -EINVAL; > } else if (vinfo->flags & BRIDGE_VLAN_INFO_RANGE_END) { > if (!vlan.vid_begin) > return -EINVAL; Acked-by: Elad Raz <eladr@mellanox.com> > -- > 2.4.3 > > -- > To unsubscribe from this list: send the line "unsubscribe netdev" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH net-next] switchdev: enforce no pvid flag in vlan ranges 2015-10-12 12:01 ` [PATCH net-next] switchdev: enforce no pvid flag in vlan ranges Nikolay Aleksandrov 2015-10-12 12:11 ` Elad Raz @ 2015-10-12 12:17 ` Jiri Pirko 2015-10-12 17:36 ` Vivien Didelot 2015-10-13 11:42 ` David Miller 3 siblings, 0 replies; 34+ messages in thread From: Jiri Pirko @ 2015-10-12 12:17 UTC (permalink / raw) To: Nikolay Aleksandrov Cc: netdev, sfeldma, vivien.didelot, davem, Nikolay Aleksandrov Mon, Oct 12, 2015 at 02:01:39PM CEST, razor@blackwall.org wrote: >From: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> > >We shouldn't allow BRIDGE_VLAN_INFO_PVID flag in VLAN ranges. > >Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Acked-by: Jiri Pirko <jiri@mellanox.com> ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH net-next] switchdev: enforce no pvid flag in vlan ranges 2015-10-12 12:01 ` [PATCH net-next] switchdev: enforce no pvid flag in vlan ranges Nikolay Aleksandrov 2015-10-12 12:11 ` Elad Raz 2015-10-12 12:17 ` Jiri Pirko @ 2015-10-12 17:36 ` Vivien Didelot 2015-10-13 6:13 ` Scott Feldman 2015-10-13 8:31 ` Ido Schimmel 2015-10-13 11:42 ` David Miller 3 siblings, 2 replies; 34+ messages in thread From: Vivien Didelot @ 2015-10-12 17:36 UTC (permalink / raw) To: Nikolay Aleksandrov; +Cc: netdev, sfeldma, jiri, davem, Nikolay Aleksandrov Hi guys, On Oct. Monday 12 (42) 02:01 PM, Nikolay Aleksandrov wrote: > From: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> > > We shouldn't allow BRIDGE_VLAN_INFO_PVID flag in VLAN ranges. > > Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> > --- > net/switchdev/switchdev.c | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/net/switchdev/switchdev.c b/net/switchdev/switchdev.c > index 6e4a4f9ad927..256c596de896 100644 > --- a/net/switchdev/switchdev.c > +++ b/net/switchdev/switchdev.c > @@ -720,6 +720,9 @@ static int switchdev_port_br_afspec(struct net_device *dev, > if (vlan.vid_begin) > return -EINVAL; > vlan.vid_begin = vinfo->vid; > + /* don't allow range of pvids */ > + if (vlan.flags & BRIDGE_VLAN_INFO_PVID) > + return -EINVAL; > } else if (vinfo->flags & BRIDGE_VLAN_INFO_RANGE_END) { > if (!vlan.vid_begin) > return -EINVAL; > -- > 2.4.3 > Yes the patch looks good, but it is a minor check though. I hope the subject of this thread is making sense. VLAN ranges seem to have been included for an UX purpose (so commands look like Cisco IOS). We don't want to change any existing interface, so we pushed that down to drivers, with the only valid reason that, maybe one day, an hardware can be capable of programming a range on a per-port basis. So what happens is that we'll add some code to fix and check non-sense (e.g. range + PVID) in switchdev, bridge, and I'm sure we are missing other spots. Sorry for being insistent, but this still doesn't look right to me. It seems like we are bloating bridge, switchdev and drivers for the only reason to maintain a kernel support for something like: # for i in $(seq 100 4000); do bridge vlan add vid $i dev swp0; done Thanks, -v ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH net-next] switchdev: enforce no pvid flag in vlan ranges 2015-10-12 17:36 ` Vivien Didelot @ 2015-10-13 6:13 ` Scott Feldman 2015-10-13 8:31 ` Ido Schimmel 1 sibling, 0 replies; 34+ messages in thread From: Scott Feldman @ 2015-10-13 6:13 UTC (permalink / raw) To: Vivien Didelot Cc: Nikolay Aleksandrov, Netdev, Jiří Pírko, David S. Miller, Nikolay Aleksandrov, Roopa Prabhu On Mon, Oct 12, 2015 at 10:36 AM, Vivien Didelot <vivien.didelot@savoirfairelinux.com> wrote: > Hi guys, > > On Oct. Monday 12 (42) 02:01 PM, Nikolay Aleksandrov wrote: >> From: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> >> >> We shouldn't allow BRIDGE_VLAN_INFO_PVID flag in VLAN ranges. >> >> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> >> --- >> net/switchdev/switchdev.c | 3 +++ >> 1 file changed, 3 insertions(+) >> >> diff --git a/net/switchdev/switchdev.c b/net/switchdev/switchdev.c >> index 6e4a4f9ad927..256c596de896 100644 >> --- a/net/switchdev/switchdev.c >> +++ b/net/switchdev/switchdev.c >> @@ -720,6 +720,9 @@ static int switchdev_port_br_afspec(struct net_device *dev, >> if (vlan.vid_begin) >> return -EINVAL; >> vlan.vid_begin = vinfo->vid; >> + /* don't allow range of pvids */ >> + if (vlan.flags & BRIDGE_VLAN_INFO_PVID) >> + return -EINVAL; >> } else if (vinfo->flags & BRIDGE_VLAN_INFO_RANGE_END) { >> if (!vlan.vid_begin) >> return -EINVAL; >> -- >> 2.4.3 >> > > Yes the patch looks good, but it is a minor check though. I hope the > subject of this thread is making sense. > > VLAN ranges seem to have been included for an UX purpose (so commands > look like Cisco IOS). We don't want to change any existing interface, so > we pushed that down to drivers, with the only valid reason that, maybe > one day, an hardware can be capable of programming a range on a per-port > basis. > > So what happens is that we'll add some code to fix and check non-sense > (e.g. range + PVID) in switchdev, bridge, and I'm sure we are missing > other spots. > > Sorry for being insistent, but this still doesn't look right to me. > > It seems like we are bloating bridge, switchdev and drivers for the only > reason to maintain a kernel support for something like: > > # for i in $(seq 100 4000); do bridge vlan add vid $i dev swp0; done [Adding Roopa] (Roopa, correct me if I'm wrong). Ref: commit # bdced7ef IIRC, vlan ranges were introduced to reduce the volume of netlink traffic when doing a command like your's above. Your command would create 3901 netlink msgs with a single IFLA_BRIDGE_VLAN_INFO, sent to the kernel and then echoed back to user-space. With vlan ranges, the cmd "bridge vlan add vid 100-4000 dev swp0" would result in one netlink msg with one IFLA_BRIDGE_VLAN_INFO. Seems there were problems with a getlink dump over-running the netlink msg (not using NLM_MULTI)? Maybe Roopa can expand on the problem that was solved. So the bridge command, and everything below it, has this feature. We can't undo it...it's in the wild, so to speak And, as mentioned before, it's and all-or-nothing command, and the underlying switchdev or driver code needs to deal with overlapping vlan maps. ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH net-next] switchdev: enforce no pvid flag in vlan ranges 2015-10-12 17:36 ` Vivien Didelot 2015-10-13 6:13 ` Scott Feldman @ 2015-10-13 8:31 ` Ido Schimmel 2015-10-13 14:32 ` Vivien Didelot 1 sibling, 1 reply; 34+ messages in thread From: Ido Schimmel @ 2015-10-13 8:31 UTC (permalink / raw) To: Vivien Didelot Cc: Nikolay Aleksandrov, netdev, sfeldma, jiri, davem, Nikolay Aleksandrov, eladr Mon, Oct 12, 2015 at 08:36:25PM IDT, vivien.didelot@savoirfairelinux.com wrote: >Hi guys, > >On Oct. Monday 12 (42) 02:01 PM, Nikolay Aleksandrov wrote: >> From: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> >> >> We shouldn't allow BRIDGE_VLAN_INFO_PVID flag in VLAN ranges. >> >> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> >> --- >> net/switchdev/switchdev.c | 3 +++ >> 1 file changed, 3 insertions(+) >> >> diff --git a/net/switchdev/switchdev.c b/net/switchdev/switchdev.c >> index 6e4a4f9ad927..256c596de896 100644 >> --- a/net/switchdev/switchdev.c >> +++ b/net/switchdev/switchdev.c >> @@ -720,6 +720,9 @@ static int switchdev_port_br_afspec(struct net_device *dev, >> if (vlan.vid_begin) >> return -EINVAL; >> vlan.vid_begin = vinfo->vid; >> + /* don't allow range of pvids */ >> + if (vlan.flags & BRIDGE_VLAN_INFO_PVID) >> + return -EINVAL; >> } else if (vinfo->flags & BRIDGE_VLAN_INFO_RANGE_END) { >> if (!vlan.vid_begin) >> return -EINVAL; >> -- >> 2.4.3 >> > >Yes the patch looks good, but it is a minor check though. I hope the >subject of this thread is making sense. > >VLAN ranges seem to have been included for an UX purpose (so commands >look like Cisco IOS). We don't want to change any existing interface, so >we pushed that down to drivers, with the only valid reason that, maybe >one day, an hardware can be capable of programming a range on a per-port >basis. Hi, That's actually what we are doing in mlxsw. We can do up to 256 entries in one go. We've yet to submit this part. > >So what happens is that we'll add some code to fix and check non-sense >(e.g. range + PVID) in switchdev, bridge, and I'm sure we are missing >other spots. > >Sorry for being insistent, but this still doesn't look right to me. > >It seems like we are bloating bridge, switchdev and drivers for the only >reason to maintain a kernel support for something like: > > # for i in $(seq 100 4000); do bridge vlan add vid $i dev swp0; done > >Thanks, >-v >-- >To unsubscribe from this list: send the line "unsubscribe netdev" in >the body of a message to majordomo@vger.kernel.org >More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH net-next] switchdev: enforce no pvid flag in vlan ranges 2015-10-13 8:31 ` Ido Schimmel @ 2015-10-13 14:32 ` Vivien Didelot 2015-10-14 6:14 ` Ido Schimmel 0 siblings, 1 reply; 34+ messages in thread From: Vivien Didelot @ 2015-10-13 14:32 UTC (permalink / raw) To: Ido Schimmel Cc: Nikolay Aleksandrov, netdev, sfeldma, jiri, davem, Nikolay Aleksandrov, eladr On Oct. Tuesday 13 (42) 11:31 AM, Ido Schimmel wrote: > Mon, Oct 12, 2015 at 08:36:25PM IDT, vivien.didelot@savoirfairelinux.com wrote: > >Hi guys, > > > >On Oct. Monday 12 (42) 02:01 PM, Nikolay Aleksandrov wrote: > >> From: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> > >> > >> We shouldn't allow BRIDGE_VLAN_INFO_PVID flag in VLAN ranges. > >> > >> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> > >> --- > >> net/switchdev/switchdev.c | 3 +++ > >> 1 file changed, 3 insertions(+) > >> > >> diff --git a/net/switchdev/switchdev.c b/net/switchdev/switchdev.c > >> index 6e4a4f9ad927..256c596de896 100644 > >> --- a/net/switchdev/switchdev.c > >> +++ b/net/switchdev/switchdev.c > >> @@ -720,6 +720,9 @@ static int switchdev_port_br_afspec(struct net_device *dev, > >> if (vlan.vid_begin) > >> return -EINVAL; > >> vlan.vid_begin = vinfo->vid; > >> + /* don't allow range of pvids */ > >> + if (vlan.flags & BRIDGE_VLAN_INFO_PVID) > >> + return -EINVAL; > >> } else if (vinfo->flags & BRIDGE_VLAN_INFO_RANGE_END) { > >> if (!vlan.vid_begin) > >> return -EINVAL; > >> -- > >> 2.4.3 > >> > > > >Yes the patch looks good, but it is a minor check though. I hope the > >subject of this thread is making sense. > > > >VLAN ranges seem to have been included for an UX purpose (so commands > >look like Cisco IOS). We don't want to change any existing interface, so > >we pushed that down to drivers, with the only valid reason that, maybe > >one day, an hardware can be capable of programming a range on a per-port > >basis. > Hi, > > That's actually what we are doing in mlxsw. We can do up to 256 entries in > one go. We've yet to submit this part. Perfect Ido, thanks for pointing this out! I'm OK with the range then. So there is now a very last question in my head for this, which is more a matter of kernel design. Should the user be aware of such underlying support? In other words, would it make sense to do this in a driver: foo_port_vlan_add(struct net_device *dev, struct switchdev_obj_port_vlan *vlan) { if (vlan->vid_begin != vlan->vid_end) return -ENOTSUPP; /* or something more relevant for user */ return foo_port_single_vlan_add(dev, vlan->vid_begin); } So drivers keep being simple, and we can easily propagate the fact that one-or-all VLAN is not supportable, vs. the VLAN feature itself is not implemented and must be done in software. I'm not sure how transparent the hardware must be to the user. (the problem of one VLAN unsupported in a range is still something we need to address). > > > > >So what happens is that we'll add some code to fix and check non-sense > >(e.g. range + PVID) in switchdev, bridge, and I'm sure we are missing > >other spots. > > > >Sorry for being insistent, but this still doesn't look right to me. > > > >It seems like we are bloating bridge, switchdev and drivers for the only > >reason to maintain a kernel support for something like: > > > > # for i in $(seq 100 4000); do bridge vlan add vid $i dev swp0; done Thanks, -v ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH net-next] switchdev: enforce no pvid flag in vlan ranges 2015-10-13 14:32 ` Vivien Didelot @ 2015-10-14 6:14 ` Ido Schimmel 2015-10-14 15:25 ` Vivien Didelot 0 siblings, 1 reply; 34+ messages in thread From: Ido Schimmel @ 2015-10-14 6:14 UTC (permalink / raw) To: Vivien Didelot Cc: Nikolay Aleksandrov, netdev, sfeldma, jiri, davem, Nikolay Aleksandrov, eladr Tue, Oct 13, 2015 at 05:32:26PM IDT, vivien.didelot@savoirfairelinux.com wrote: >On Oct. Tuesday 13 (42) 11:31 AM, Ido Schimmel wrote: >> Mon, Oct 12, 2015 at 08:36:25PM IDT, vivien.didelot@savoirfairelinux.com wrote: >> >Hi guys, >> > >> >On Oct. Monday 12 (42) 02:01 PM, Nikolay Aleksandrov wrote: >> >> From: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> >> >> >> >> We shouldn't allow BRIDGE_VLAN_INFO_PVID flag in VLAN ranges. >> >> >> >> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> >> >> --- >> >> net/switchdev/switchdev.c | 3 +++ >> >> 1 file changed, 3 insertions(+) >> >> >> >> diff --git a/net/switchdev/switchdev.c b/net/switchdev/switchdev.c >> >> index 6e4a4f9ad927..256c596de896 100644 >> >> --- a/net/switchdev/switchdev.c >> >> +++ b/net/switchdev/switchdev.c >> >> @@ -720,6 +720,9 @@ static int switchdev_port_br_afspec(struct net_device *dev, >> >> if (vlan.vid_begin) >> >> return -EINVAL; >> >> vlan.vid_begin = vinfo->vid; >> >> + /* don't allow range of pvids */ >> >> + if (vlan.flags & BRIDGE_VLAN_INFO_PVID) >> >> + return -EINVAL; >> >> } else if (vinfo->flags & BRIDGE_VLAN_INFO_RANGE_END) { >> >> if (!vlan.vid_begin) >> >> return -EINVAL; >> >> -- >> >> 2.4.3 >> >> >> > >> >Yes the patch looks good, but it is a minor check though. I hope the >> >subject of this thread is making sense. >> > >> >VLAN ranges seem to have been included for an UX purpose (so commands >> >look like Cisco IOS). We don't want to change any existing interface, so >> >we pushed that down to drivers, with the only valid reason that, maybe >> >one day, an hardware can be capable of programming a range on a per-port >> >basis. >> Hi, >> >> That's actually what we are doing in mlxsw. We can do up to 256 entries in >> one go. We've yet to submit this part. > >Perfect Ido, thanks for pointing this out! I'm OK with the range then. > >So there is now a very last question in my head for this, which is more >a matter of kernel design. Should the user be aware of such underlying >support? In other words, would it make sense to do this in a driver: > > foo_port_vlan_add(struct net_device *dev, > struct switchdev_obj_port_vlan *vlan) > { > if (vlan->vid_begin != vlan->vid_end) > return -ENOTSUPP; /* or something more relevant for user */ > > return foo_port_single_vlan_add(dev, vlan->vid_begin); > } > >So drivers keep being simple, and we can easily propagate the fact that >one-or-all VLAN is not supportable, vs. the VLAN feature itself is not >implemented and must be done in software. I think that if you want to keep it simple, then Scott's advice from the previous thread is the most appropriate one. I believe the hardware you are using is simply not meant to support multiple 802.1Q bridges. Trying to go around this will simply result in weird behavior (such as not supporting VLAN ranges), which is only sacrificed to support something a user doesn't even require (given the fact he's aware the hardware is meant to support only one 802.1Q bridge). What do you think? > >I'm not sure how transparent the hardware must be to the user. > >(the problem of one VLAN unsupported in a range is still something we >need to address). > >> >> > >> >So what happens is that we'll add some code to fix and check non-sense >> >(e.g. range + PVID) in switchdev, bridge, and I'm sure we are missing >> >other spots. >> > >> >Sorry for being insistent, but this still doesn't look right to me. >> > >> >It seems like we are bloating bridge, switchdev and drivers for the only >> >reason to maintain a kernel support for something like: >> > >> > # for i in $(seq 100 4000); do bridge vlan add vid $i dev swp0; done > >Thanks, >-v ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH net-next] switchdev: enforce no pvid flag in vlan ranges 2015-10-14 6:14 ` Ido Schimmel @ 2015-10-14 15:25 ` Vivien Didelot 2015-10-14 17:14 ` Scott Feldman 0 siblings, 1 reply; 34+ messages in thread From: Vivien Didelot @ 2015-10-14 15:25 UTC (permalink / raw) To: Ido Schimmel Cc: Nikolay Aleksandrov, netdev, sfeldma, jiri, davem, Nikolay Aleksandrov, eladr On Oct. Wednesday 14 (42) 09:14 AM, Ido Schimmel wrote: > Tue, Oct 13, 2015 at 05:32:26PM IDT, vivien.didelot@savoirfairelinux.com wrote: > >On Oct. Tuesday 13 (42) 11:31 AM, Ido Schimmel wrote: > >> Mon, Oct 12, 2015 at 08:36:25PM IDT, vivien.didelot@savoirfairelinux.com wrote: > >> >Hi guys, > >> > > >> >On Oct. Monday 12 (42) 02:01 PM, Nikolay Aleksandrov wrote: > >> >> From: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> > >> >> > >> >> We shouldn't allow BRIDGE_VLAN_INFO_PVID flag in VLAN ranges. > >> >> > >> >> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> > >> >> --- > >> >> net/switchdev/switchdev.c | 3 +++ > >> >> 1 file changed, 3 insertions(+) > >> >> > >> >> diff --git a/net/switchdev/switchdev.c b/net/switchdev/switchdev.c > >> >> index 6e4a4f9ad927..256c596de896 100644 > >> >> --- a/net/switchdev/switchdev.c > >> >> +++ b/net/switchdev/switchdev.c > >> >> @@ -720,6 +720,9 @@ static int switchdev_port_br_afspec(struct net_device *dev, > >> >> if (vlan.vid_begin) > >> >> return -EINVAL; > >> >> vlan.vid_begin = vinfo->vid; > >> >> + /* don't allow range of pvids */ > >> >> + if (vlan.flags & BRIDGE_VLAN_INFO_PVID) > >> >> + return -EINVAL; > >> >> } else if (vinfo->flags & BRIDGE_VLAN_INFO_RANGE_END) { > >> >> if (!vlan.vid_begin) > >> >> return -EINVAL; > >> >> -- > >> >> 2.4.3 > >> >> > >> > > >> >Yes the patch looks good, but it is a minor check though. I hope the > >> >subject of this thread is making sense. > >> > > >> >VLAN ranges seem to have been included for an UX purpose (so commands > >> >look like Cisco IOS). We don't want to change any existing interface, so > >> >we pushed that down to drivers, with the only valid reason that, maybe > >> >one day, an hardware can be capable of programming a range on a per-port > >> >basis. > >> Hi, > >> > >> That's actually what we are doing in mlxsw. We can do up to 256 entries in > >> one go. We've yet to submit this part. > > > >Perfect Ido, thanks for pointing this out! I'm OK with the range then. > > > >So there is now a very last question in my head for this, which is more > >a matter of kernel design. Should the user be aware of such underlying > >support? In other words, would it make sense to do this in a driver: > > > > foo_port_vlan_add(struct net_device *dev, > > struct switchdev_obj_port_vlan *vlan) > > { > > if (vlan->vid_begin != vlan->vid_end) > > return -ENOTSUPP; /* or something more relevant for user */ > > > > return foo_port_single_vlan_add(dev, vlan->vid_begin); > > } > > > >So drivers keep being simple, and we can easily propagate the fact that > >one-or-all VLAN is not supportable, vs. the VLAN feature itself is not > >implemented and must be done in software. > I think that if you want to keep it simple, then Scott's advice from the > previous thread is the most appropriate one. I believe the hardware you > are using is simply not meant to support multiple 802.1Q bridges. You mean allowing only one Linux bridge over an hardware switch? It would for sure simplify how, as developers and users, we represent a physical switch. But I am not sure how to achieve that and I don't have strong opinions on this TBH. > Trying to go around this will simply result in weird behavior (such as > not supporting VLAN ranges), which is only sacrificed to support > something a user doesn't even require (given the fact he's aware the > hardware is meant to support only one 802.1Q bridge). > > What do you think? I think mapping one Linux bridge to one physical switch (and thus have real bridge device) makes a lot of sense. But since Linux soft bridges are not stackable (yet?), how do we join 2 distinct interfaces then? e.g. eth1 is WAN and switch ports behind eth0. > > > >I'm not sure how transparent the hardware must be to the user. > > > >(the problem of one VLAN unsupported in a range is still something we > >need to address). > > > >> > >> > > >> >So what happens is that we'll add some code to fix and check non-sense > >> >(e.g. range + PVID) in switchdev, bridge, and I'm sure we are missing > >> >other spots. > >> > > >> >Sorry for being insistent, but this still doesn't look right to me. > >> > > >> >It seems like we are bloating bridge, switchdev and drivers for the only > >> >reason to maintain a kernel support for something like: > >> > > >> > # for i in $(seq 100 4000); do bridge vlan add vid $i dev swp0; done Thanks, -v ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH net-next] switchdev: enforce no pvid flag in vlan ranges 2015-10-14 15:25 ` Vivien Didelot @ 2015-10-14 17:14 ` Scott Feldman 2015-10-14 17:42 ` Ido Schimmel 0 siblings, 1 reply; 34+ messages in thread From: Scott Feldman @ 2015-10-14 17:14 UTC (permalink / raw) To: Vivien Didelot Cc: Ido Schimmel, Nikolay Aleksandrov, Netdev, Jiří Pírko, David S. Miller, Nikolay Aleksandrov, Elad Raz On Wed, Oct 14, 2015 at 8:25 AM, Vivien Didelot <vivien.didelot@savoirfairelinux.com> wrote: > On Oct. Wednesday 14 (42) 09:14 AM, Ido Schimmel wrote: >> Tue, Oct 13, 2015 at 05:32:26PM IDT, vivien.didelot@savoirfairelinux.com wrote: >> >On Oct. Tuesday 13 (42) 11:31 AM, Ido Schimmel wrote: >> >> Mon, Oct 12, 2015 at 08:36:25PM IDT, vivien.didelot@savoirfairelinux.com wrote: >> >> >Hi guys, >> >> > >> >> >On Oct. Monday 12 (42) 02:01 PM, Nikolay Aleksandrov wrote: >> >> >> From: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> >> >> >> >> >> >> We shouldn't allow BRIDGE_VLAN_INFO_PVID flag in VLAN ranges. >> >> >> >> >> >> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> >> >> >> --- >> >> >> net/switchdev/switchdev.c | 3 +++ >> >> >> 1 file changed, 3 insertions(+) >> >> >> >> >> >> diff --git a/net/switchdev/switchdev.c b/net/switchdev/switchdev.c >> >> >> index 6e4a4f9ad927..256c596de896 100644 >> >> >> --- a/net/switchdev/switchdev.c >> >> >> +++ b/net/switchdev/switchdev.c >> >> >> @@ -720,6 +720,9 @@ static int switchdev_port_br_afspec(struct net_device *dev, >> >> >> if (vlan.vid_begin) >> >> >> return -EINVAL; >> >> >> vlan.vid_begin = vinfo->vid; >> >> >> + /* don't allow range of pvids */ >> >> >> + if (vlan.flags & BRIDGE_VLAN_INFO_PVID) >> >> >> + return -EINVAL; >> >> >> } else if (vinfo->flags & BRIDGE_VLAN_INFO_RANGE_END) { >> >> >> if (!vlan.vid_begin) >> >> >> return -EINVAL; >> >> >> -- >> >> >> 2.4.3 >> >> >> >> >> > >> >> >Yes the patch looks good, but it is a minor check though. I hope the >> >> >subject of this thread is making sense. >> >> > >> >> >VLAN ranges seem to have been included for an UX purpose (so commands >> >> >look like Cisco IOS). We don't want to change any existing interface, so >> >> >we pushed that down to drivers, with the only valid reason that, maybe >> >> >one day, an hardware can be capable of programming a range on a per-port >> >> >basis. >> >> Hi, >> >> >> >> That's actually what we are doing in mlxsw. We can do up to 256 entries in >> >> one go. We've yet to submit this part. >> > >> >Perfect Ido, thanks for pointing this out! I'm OK with the range then. >> > >> >So there is now a very last question in my head for this, which is more >> >a matter of kernel design. Should the user be aware of such underlying >> >support? In other words, would it make sense to do this in a driver: >> > >> > foo_port_vlan_add(struct net_device *dev, >> > struct switchdev_obj_port_vlan *vlan) >> > { >> > if (vlan->vid_begin != vlan->vid_end) >> > return -ENOTSUPP; /* or something more relevant for user */ >> > >> > return foo_port_single_vlan_add(dev, vlan->vid_begin); >> > } >> > >> >So drivers keep being simple, and we can easily propagate the fact that >> >one-or-all VLAN is not supportable, vs. the VLAN feature itself is not >> >implemented and must be done in software. >> I think that if you want to keep it simple, then Scott's advice from the >> previous thread is the most appropriate one. I believe the hardware you >> are using is simply not meant to support multiple 802.1Q bridges. > > You mean allowing only one Linux bridge over an hardware switch? > > It would for sure simplify how, as developers and users, we represent a > physical switch. But I am not sure how to achieve that and I don't have > strong opinions on this TBH. Hi Vivien, I think it's possible to keep switch ports on just one bridge if we do a little bit of work on the NETDEV_CHANGEUPPER notifier. This will give you the driver-level control you want. Do you have time to investigate? The idea is: 1) In your driver's handler for NETDEV_CHANGEUPPER, if switch port is being added to a second bridge,then return NOTIFY_BAD. Your driver needs to track the bridge count. 2) In __netdev_upper_dev_link(), check the return code from the call_netdevice_notifiers_info(NETDEV_CHANGEUPPER, ...) call, and if NOTIFY_BAD, abort the linking operation (goto rollback_xxx). -scott ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH net-next] switchdev: enforce no pvid flag in vlan ranges 2015-10-14 17:14 ` Scott Feldman @ 2015-10-14 17:42 ` Ido Schimmel 2015-10-14 18:51 ` Vivien Didelot 2015-10-15 2:58 ` Scott Feldman 0 siblings, 2 replies; 34+ messages in thread From: Ido Schimmel @ 2015-10-14 17:42 UTC (permalink / raw) To: Scott Feldman Cc: Vivien Didelot, Nikolay Aleksandrov, Netdev, Jiří Pírko, David S. Miller, Nikolay Aleksandrov, Elad Raz Wed, Oct 14, 2015 at 08:14:24PM IDT, sfeldma@gmail.com wrote: >On Wed, Oct 14, 2015 at 8:25 AM, Vivien Didelot ><vivien.didelot@savoirfairelinux.com> wrote: >> On Oct. Wednesday 14 (42) 09:14 AM, Ido Schimmel wrote: >>> Tue, Oct 13, 2015 at 05:32:26PM IDT, vivien.didelot@savoirfairelinux.com wrote: >>> >On Oct. Tuesday 13 (42) 11:31 AM, Ido Schimmel wrote: >>> >> Mon, Oct 12, 2015 at 08:36:25PM IDT, vivien.didelot@savoirfairelinux.com wrote: >>> >> >Hi guys, >>> >> > >>> >> >On Oct. Monday 12 (42) 02:01 PM, Nikolay Aleksandrov wrote: >>> >> >> From: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> >>> >> >> >>> >> >> We shouldn't allow BRIDGE_VLAN_INFO_PVID flag in VLAN ranges. >>> >> >> >>> >> >> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> >>> >> >> --- >>> >> >> net/switchdev/switchdev.c | 3 +++ >>> >> >> 1 file changed, 3 insertions(+) >>> >> >> >>> >> >> diff --git a/net/switchdev/switchdev.c b/net/switchdev/switchdev.c >>> >> >> index 6e4a4f9ad927..256c596de896 100644 >>> >> >> --- a/net/switchdev/switchdev.c >>> >> >> +++ b/net/switchdev/switchdev.c >>> >> >> @@ -720,6 +720,9 @@ static int switchdev_port_br_afspec(struct net_device *dev, >>> >> >> if (vlan.vid_begin) >>> >> >> return -EINVAL; >>> >> >> vlan.vid_begin = vinfo->vid; >>> >> >> + /* don't allow range of pvids */ >>> >> >> + if (vlan.flags & BRIDGE_VLAN_INFO_PVID) >>> >> >> + return -EINVAL; >>> >> >> } else if (vinfo->flags & BRIDGE_VLAN_INFO_RANGE_END) { >>> >> >> if (!vlan.vid_begin) >>> >> >> return -EINVAL; >>> >> >> -- >>> >> >> 2.4.3 >>> >> >> >>> >> > >>> >> >Yes the patch looks good, but it is a minor check though. I hope the >>> >> >subject of this thread is making sense. >>> >> > >>> >> >VLAN ranges seem to have been included for an UX purpose (so commands >>> >> >look like Cisco IOS). We don't want to change any existing interface, so >>> >> >we pushed that down to drivers, with the only valid reason that, maybe >>> >> >one day, an hardware can be capable of programming a range on a per-port >>> >> >basis. >>> >> Hi, >>> >> >>> >> That's actually what we are doing in mlxsw. We can do up to 256 entries in >>> >> one go. We've yet to submit this part. >>> > >>> >Perfect Ido, thanks for pointing this out! I'm OK with the range then. >>> > >>> >So there is now a very last question in my head for this, which is more >>> >a matter of kernel design. Should the user be aware of such underlying >>> >support? In other words, would it make sense to do this in a driver: >>> > >>> > foo_port_vlan_add(struct net_device *dev, >>> > struct switchdev_obj_port_vlan *vlan) >>> > { >>> > if (vlan->vid_begin != vlan->vid_end) >>> > return -ENOTSUPP; /* or something more relevant for user */ >>> > >>> > return foo_port_single_vlan_add(dev, vlan->vid_begin); >>> > } >>> > >>> >So drivers keep being simple, and we can easily propagate the fact that >>> >one-or-all VLAN is not supportable, vs. the VLAN feature itself is not >>> >implemented and must be done in software. >>> I think that if you want to keep it simple, then Scott's advice from the >>> previous thread is the most appropriate one. I believe the hardware you >>> are using is simply not meant to support multiple 802.1Q bridges. >> >> You mean allowing only one Linux bridge over an hardware switch? >> >> It would for sure simplify how, as developers and users, we represent a >> physical switch. But I am not sure how to achieve that and I don't have >> strong opinions on this TBH. > >Hi Vivien, I think it's possible to keep switch ports on just one >bridge if we do a little bit of work on the NETDEV_CHANGEUPPER >notifier. This will give you the driver-level control you want. Do >you have time to investigate? The idea is: > >1) In your driver's handler for NETDEV_CHANGEUPPER, if switch port is >being added to a second bridge,then return NOTIFY_BAD. Your driver >needs to track the bridge count. > >2) In __netdev_upper_dev_link(), check the return code from the >call_netdevice_notifiers_info(NETDEV_CHANGEUPPER, ...) call, and if >NOTIFY_BAD, abort the linking operation (goto rollback_xxx). > Hi, We are doing something similar in mlxsw (not upstream yet). Jiri introduced PRE_CHANGEUPPER, which is called from the function you mentioned, but before the linking operation (so that you don't need to rollback). If the notification is about a linking operation and the master is a bridge different than the current one, then NOTIFY_BAD is returned. Vivien, regarding your WAN interface question, this is something we currently don't do. We don't even flood traffic from bridged ports to CPU (although we can), as it can saturate the bus. Only control traffic is supposed to go there. ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH net-next] switchdev: enforce no pvid flag in vlan ranges 2015-10-14 17:42 ` Ido Schimmel @ 2015-10-14 18:51 ` Vivien Didelot 2015-10-14 22:08 ` Florian Fainelli 2015-10-15 2:58 ` Scott Feldman 1 sibling, 1 reply; 34+ messages in thread From: Vivien Didelot @ 2015-10-14 18:51 UTC (permalink / raw) To: Ido Schimmel Cc: Scott Feldman, Nikolay Aleksandrov, Netdev, Jiří Pírko, David S. Miller, Nikolay Aleksandrov, Elad Raz On Oct. Wednesday 14 (42) 08:42 PM, Ido Schimmel wrote: > Wed, Oct 14, 2015 at 08:14:24PM IDT, sfeldma@gmail.com wrote: > >On Wed, Oct 14, 2015 at 8:25 AM, Vivien Didelot > ><vivien.didelot@savoirfairelinux.com> wrote: > >> On Oct. Wednesday 14 (42) 09:14 AM, Ido Schimmel wrote: > >>> Tue, Oct 13, 2015 at 05:32:26PM IDT, vivien.didelot@savoirfairelinux.com wrote: > >>> >On Oct. Tuesday 13 (42) 11:31 AM, Ido Schimmel wrote: > >>> >> Mon, Oct 12, 2015 at 08:36:25PM IDT, vivien.didelot@savoirfairelinux.com wrote: > >>> >> >Hi guys, > >>> >> > > >>> >> >On Oct. Monday 12 (42) 02:01 PM, Nikolay Aleksandrov wrote: > >>> >> >> From: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> > >>> >> >> > >>> >> >> We shouldn't allow BRIDGE_VLAN_INFO_PVID flag in VLAN ranges. > >>> >> >> > >>> >> >> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> > >>> >> >> --- > >>> >> >> net/switchdev/switchdev.c | 3 +++ > >>> >> >> 1 file changed, 3 insertions(+) > >>> >> >> > >>> >> >> diff --git a/net/switchdev/switchdev.c b/net/switchdev/switchdev.c > >>> >> >> index 6e4a4f9ad927..256c596de896 100644 > >>> >> >> --- a/net/switchdev/switchdev.c > >>> >> >> +++ b/net/switchdev/switchdev.c > >>> >> >> @@ -720,6 +720,9 @@ static int switchdev_port_br_afspec(struct net_device *dev, > >>> >> >> if (vlan.vid_begin) > >>> >> >> return -EINVAL; > >>> >> >> vlan.vid_begin = vinfo->vid; > >>> >> >> + /* don't allow range of pvids */ > >>> >> >> + if (vlan.flags & BRIDGE_VLAN_INFO_PVID) > >>> >> >> + return -EINVAL; > >>> >> >> } else if (vinfo->flags & BRIDGE_VLAN_INFO_RANGE_END) { > >>> >> >> if (!vlan.vid_begin) > >>> >> >> return -EINVAL; > >>> >> >> -- > >>> >> >> 2.4.3 > >>> >> >> > >>> >> > > >>> >> >Yes the patch looks good, but it is a minor check though. I hope the > >>> >> >subject of this thread is making sense. > >>> >> > > >>> >> >VLAN ranges seem to have been included for an UX purpose (so commands > >>> >> >look like Cisco IOS). We don't want to change any existing interface, so > >>> >> >we pushed that down to drivers, with the only valid reason that, maybe > >>> >> >one day, an hardware can be capable of programming a range on a per-port > >>> >> >basis. > >>> >> Hi, > >>> >> > >>> >> That's actually what we are doing in mlxsw. We can do up to 256 entries in > >>> >> one go. We've yet to submit this part. > >>> > > >>> >Perfect Ido, thanks for pointing this out! I'm OK with the range then. > >>> > > >>> >So there is now a very last question in my head for this, which is more > >>> >a matter of kernel design. Should the user be aware of such underlying > >>> >support? In other words, would it make sense to do this in a driver: > >>> > > >>> > foo_port_vlan_add(struct net_device *dev, > >>> > struct switchdev_obj_port_vlan *vlan) > >>> > { > >>> > if (vlan->vid_begin != vlan->vid_end) > >>> > return -ENOTSUPP; /* or something more relevant for user */ > >>> > > >>> > return foo_port_single_vlan_add(dev, vlan->vid_begin); > >>> > } > >>> > > >>> >So drivers keep being simple, and we can easily propagate the fact that > >>> >one-or-all VLAN is not supportable, vs. the VLAN feature itself is not > >>> >implemented and must be done in software. > >>> I think that if you want to keep it simple, then Scott's advice from the > >>> previous thread is the most appropriate one. I believe the hardware you > >>> are using is simply not meant to support multiple 802.1Q bridges. > >> > >> You mean allowing only one Linux bridge over an hardware switch? > >> > >> It would for sure simplify how, as developers and users, we represent a > >> physical switch. But I am not sure how to achieve that and I don't have > >> strong opinions on this TBH. > > > >Hi Vivien, I think it's possible to keep switch ports on just one > >bridge if we do a little bit of work on the NETDEV_CHANGEUPPER > >notifier. This will give you the driver-level control you want. Do > >you have time to investigate? The idea is: > > > >1) In your driver's handler for NETDEV_CHANGEUPPER, if switch port is > >being added to a second bridge,then return NOTIFY_BAD. Your driver > >needs to track the bridge count. > > > >2) In __netdev_upper_dev_link(), check the return code from the > >call_netdevice_notifiers_info(NETDEV_CHANGEUPPER, ...) call, and if > >NOTIFY_BAD, abort the linking operation (goto rollback_xxx). > > > Hi, > > We are doing something similar in mlxsw (not upstream yet). Jiri > introduced PRE_CHANGEUPPER, which is called from the function you > mentioned, but before the linking operation (so that you don't need to > rollback). > > If the notification is about a linking operation and the master is a > bridge different than the current one, then NOTIFY_BAD is returned. Great, I'll wait for this then. Scott, this is another good reason why we definitely need a simple struct device per switch chip. In addition to the port net_device registration, the netdev notifier is another exact same piece of code that both Rocker and DSA implement. > Vivien, regarding your WAN interface question, this is something we > currently don't do. We don't even flood traffic from bridged ports > to CPU (although we can), as it can saturate the bus. Only control > traffic is supposed to go there. I kinda answered it myself: a Linux bridge needs to remain a user abstraction of a logical group of net_device. In other words, we must allow physical distinct ports under the same bridge. Below is an example of a custom router with 2 chained switch chips sw0 and sw1, and what usage I believe we expect: [ Linux soft bridge "br0" which can accelerate VLAN, STP, etc. ] (CPU) (WAN) [ sw0p0 sw0p1 sw0p2 ] [ sw1p0 sw1p1 sw1p2 sw1p3 ] [ eth0 ] [ eth1 ] `--DSA--' `-------' Thanks, -v ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH net-next] switchdev: enforce no pvid flag in vlan ranges 2015-10-14 18:51 ` Vivien Didelot @ 2015-10-14 22:08 ` Florian Fainelli 2015-10-15 0:07 ` Vivien Didelot 0 siblings, 1 reply; 34+ messages in thread From: Florian Fainelli @ 2015-10-14 22:08 UTC (permalink / raw) To: Vivien Didelot, Ido Schimmel Cc: Scott Feldman, Nikolay Aleksandrov, Netdev, Jiří Pírko, David S. Miller, Nikolay Aleksandrov, Elad Raz On 14/10/15 11:51, Vivien Didelot wrote: > On Oct. Wednesday 14 (42) 08:42 PM, Ido Schimmel wrote: >> Wed, Oct 14, 2015 at 08:14:24PM IDT, sfeldma@gmail.com wrote: >>> On Wed, Oct 14, 2015 at 8:25 AM, Vivien Didelot >>> <vivien.didelot@savoirfairelinux.com> wrote: >>>> On Oct. Wednesday 14 (42) 09:14 AM, Ido Schimmel wrote: >>>>> Tue, Oct 13, 2015 at 05:32:26PM IDT, vivien.didelot@savoirfairelinux.com wrote: >>>>>> On Oct. Tuesday 13 (42) 11:31 AM, Ido Schimmel wrote: >>>>>>> Mon, Oct 12, 2015 at 08:36:25PM IDT, vivien.didelot@savoirfairelinux.com wrote: >>>>>>>> Hi guys, >>>>>>>> >>>>>>>> On Oct. Monday 12 (42) 02:01 PM, Nikolay Aleksandrov wrote: >>>>>>>>> From: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> >>>>>>>>> >>>>>>>>> We shouldn't allow BRIDGE_VLAN_INFO_PVID flag in VLAN ranges. >>>>>>>>> >>>>>>>>> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> >>>>>>>>> --- >>>>>>>>> net/switchdev/switchdev.c | 3 +++ >>>>>>>>> 1 file changed, 3 insertions(+) >>>>>>>>> >>>>>>>>> diff --git a/net/switchdev/switchdev.c b/net/switchdev/switchdev.c >>>>>>>>> index 6e4a4f9ad927..256c596de896 100644 >>>>>>>>> --- a/net/switchdev/switchdev.c >>>>>>>>> +++ b/net/switchdev/switchdev.c >>>>>>>>> @@ -720,6 +720,9 @@ static int switchdev_port_br_afspec(struct net_device *dev, >>>>>>>>> if (vlan.vid_begin) >>>>>>>>> return -EINVAL; >>>>>>>>> vlan.vid_begin = vinfo->vid; >>>>>>>>> + /* don't allow range of pvids */ >>>>>>>>> + if (vlan.flags & BRIDGE_VLAN_INFO_PVID) >>>>>>>>> + return -EINVAL; >>>>>>>>> } else if (vinfo->flags & BRIDGE_VLAN_INFO_RANGE_END) { >>>>>>>>> if (!vlan.vid_begin) >>>>>>>>> return -EINVAL; >>>>>>>>> -- >>>>>>>>> 2.4.3 >>>>>>>>> >>>>>>>> >>>>>>>> Yes the patch looks good, but it is a minor check though. I hope the >>>>>>>> subject of this thread is making sense. >>>>>>>> >>>>>>>> VLAN ranges seem to have been included for an UX purpose (so commands >>>>>>>> look like Cisco IOS). We don't want to change any existing interface, so >>>>>>>> we pushed that down to drivers, with the only valid reason that, maybe >>>>>>>> one day, an hardware can be capable of programming a range on a per-port >>>>>>>> basis. >>>>>>> Hi, >>>>>>> >>>>>>> That's actually what we are doing in mlxsw. We can do up to 256 entries in >>>>>>> one go. We've yet to submit this part. >>>>>> >>>>>> Perfect Ido, thanks for pointing this out! I'm OK with the range then. >>>>>> >>>>>> So there is now a very last question in my head for this, which is more >>>>>> a matter of kernel design. Should the user be aware of such underlying >>>>>> support? In other words, would it make sense to do this in a driver: >>>>>> >>>>>> foo_port_vlan_add(struct net_device *dev, >>>>>> struct switchdev_obj_port_vlan *vlan) >>>>>> { >>>>>> if (vlan->vid_begin != vlan->vid_end) >>>>>> return -ENOTSUPP; /* or something more relevant for user */ >>>>>> >>>>>> return foo_port_single_vlan_add(dev, vlan->vid_begin); >>>>>> } >>>>>> >>>>>> So drivers keep being simple, and we can easily propagate the fact that >>>>>> one-or-all VLAN is not supportable, vs. the VLAN feature itself is not >>>>>> implemented and must be done in software. >>>>> I think that if you want to keep it simple, then Scott's advice from the >>>>> previous thread is the most appropriate one. I believe the hardware you >>>>> are using is simply not meant to support multiple 802.1Q bridges. >>>> >>>> You mean allowing only one Linux bridge over an hardware switch? >>>> >>>> It would for sure simplify how, as developers and users, we represent a >>>> physical switch. But I am not sure how to achieve that and I don't have >>>> strong opinions on this TBH. >>> >>> Hi Vivien, I think it's possible to keep switch ports on just one >>> bridge if we do a little bit of work on the NETDEV_CHANGEUPPER >>> notifier. This will give you the driver-level control you want. Do >>> you have time to investigate? The idea is: >>> >>> 1) In your driver's handler for NETDEV_CHANGEUPPER, if switch port is >>> being added to a second bridge,then return NOTIFY_BAD. Your driver >>> needs to track the bridge count. >>> >>> 2) In __netdev_upper_dev_link(), check the return code from the >>> call_netdevice_notifiers_info(NETDEV_CHANGEUPPER, ...) call, and if >>> NOTIFY_BAD, abort the linking operation (goto rollback_xxx). >>> >> Hi, >> >> We are doing something similar in mlxsw (not upstream yet). Jiri >> introduced PRE_CHANGEUPPER, which is called from the function you >> mentioned, but before the linking operation (so that you don't need to >> rollback). >> >> If the notification is about a linking operation and the master is a >> bridge different than the current one, then NOTIFY_BAD is returned. > > Great, I'll wait for this then. > > Scott, this is another good reason why we definitely need a simple > struct device per switch chip. In addition to the port net_device > registration, the netdev notifier is another exact same piece of code > that both Rocker and DSA implement. > >> Vivien, regarding your WAN interface question, this is something we >> currently don't do. We don't even flood traffic from bridged ports >> to CPU (although we can), as it can saturate the bus. Only control >> traffic is supposed to go there. > > I kinda answered it myself: a Linux bridge needs to remain a user > abstraction of a logical group of net_device. In other words, we must > allow physical distinct ports under the same bridge. > > Below is an example of a custom router with 2 chained switch chips sw0 > and sw1, and what usage I believe we expect: > > [ Linux soft bridge "br0" which can accelerate VLAN, STP, etc. ] > (CPU) (WAN) > [ sw0p0 sw0p1 sw0p2 ] [ sw1p0 sw1p1 sw1p2 sw1p3 ] [ eth0 ] [ eth1 ] > `--DSA--' `-------' Your use case is something that is fairly common with PON/GPON devices, but AFAIR they typically implement this by having two sets of bridges, one which spans the WAN interface and a bunch of other ports, and another bridge which is LAN only (few ports + Wi-Fi typically). Usually this is under the same physical switch though, so this is all about partitioning physical ports and physical interfaces under logical groups. By definition the WAN and LAN domains are separate logical and broadcast domains, with separate admission control rules, STP etc. I do not think your "br0" example should span the WAN interface in this case. Also, with eth0 being the conduit interface, it cannot be allowed to be a bridge member, that's something I ended up fixing in the bridge layer, otherwise untagging does not work, but that is nitpicking. It still makes sense though allow the creation of a "br0" device which spans the entire set of ports and interfaces (except eth0), but not name eth1 "WAN", just treat it as an extension in that case. Sorry if that seems like f'ing flies, but having concise example should help make progress on these design issues ;) -- Florian ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH net-next] switchdev: enforce no pvid flag in vlan ranges 2015-10-14 22:08 ` Florian Fainelli @ 2015-10-15 0:07 ` Vivien Didelot 0 siblings, 0 replies; 34+ messages in thread From: Vivien Didelot @ 2015-10-15 0:07 UTC (permalink / raw) To: Florian Fainelli Cc: Ido Schimmel, Scott Feldman, Nikolay Aleksandrov, Netdev, Jiří Pírko, David S. Miller, Nikolay Aleksandrov, Elad Raz On Oct. Wednesday 14 (42) 03:08 PM, Florian Fainelli wrote: > On 14/10/15 11:51, Vivien Didelot wrote: > > On Oct. Wednesday 14 (42) 08:42 PM, Ido Schimmel wrote: > >> Wed, Oct 14, 2015 at 08:14:24PM IDT, sfeldma@gmail.com wrote: > >>> On Wed, Oct 14, 2015 at 8:25 AM, Vivien Didelot > >>> <vivien.didelot@savoirfairelinux.com> wrote: > >>>> On Oct. Wednesday 14 (42) 09:14 AM, Ido Schimmel wrote: > >>>>> Tue, Oct 13, 2015 at 05:32:26PM IDT, vivien.didelot@savoirfairelinux.com wrote: > >>>>>> On Oct. Tuesday 13 (42) 11:31 AM, Ido Schimmel wrote: > >>>>>>> Mon, Oct 12, 2015 at 08:36:25PM IDT, vivien.didelot@savoirfairelinux.com wrote: > >>>>>>>> Hi guys, > >>>>>>>> > >>>>>>>> On Oct. Monday 12 (42) 02:01 PM, Nikolay Aleksandrov wrote: > >>>>>>>>> From: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> > >>>>>>>>> > >>>>>>>>> We shouldn't allow BRIDGE_VLAN_INFO_PVID flag in VLAN ranges. > >>>>>>>>> > >>>>>>>>> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> > >>>>>>>>> --- > >>>>>>>>> net/switchdev/switchdev.c | 3 +++ > >>>>>>>>> 1 file changed, 3 insertions(+) > >>>>>>>>> > >>>>>>>>> diff --git a/net/switchdev/switchdev.c b/net/switchdev/switchdev.c > >>>>>>>>> index 6e4a4f9ad927..256c596de896 100644 > >>>>>>>>> --- a/net/switchdev/switchdev.c > >>>>>>>>> +++ b/net/switchdev/switchdev.c > >>>>>>>>> @@ -720,6 +720,9 @@ static int switchdev_port_br_afspec(struct net_device *dev, > >>>>>>>>> if (vlan.vid_begin) > >>>>>>>>> return -EINVAL; > >>>>>>>>> vlan.vid_begin = vinfo->vid; > >>>>>>>>> + /* don't allow range of pvids */ > >>>>>>>>> + if (vlan.flags & BRIDGE_VLAN_INFO_PVID) > >>>>>>>>> + return -EINVAL; > >>>>>>>>> } else if (vinfo->flags & BRIDGE_VLAN_INFO_RANGE_END) { > >>>>>>>>> if (!vlan.vid_begin) > >>>>>>>>> return -EINVAL; > >>>>>>>>> -- > >>>>>>>>> 2.4.3 > >>>>>>>>> > >>>>>>>> > >>>>>>>> Yes the patch looks good, but it is a minor check though. I hope the > >>>>>>>> subject of this thread is making sense. > >>>>>>>> > >>>>>>>> VLAN ranges seem to have been included for an UX purpose (so commands > >>>>>>>> look like Cisco IOS). We don't want to change any existing interface, so > >>>>>>>> we pushed that down to drivers, with the only valid reason that, maybe > >>>>>>>> one day, an hardware can be capable of programming a range on a per-port > >>>>>>>> basis. > >>>>>>> Hi, > >>>>>>> > >>>>>>> That's actually what we are doing in mlxsw. We can do up to 256 entries in > >>>>>>> one go. We've yet to submit this part. > >>>>>> > >>>>>> Perfect Ido, thanks for pointing this out! I'm OK with the range then. > >>>>>> > >>>>>> So there is now a very last question in my head for this, which is more > >>>>>> a matter of kernel design. Should the user be aware of such underlying > >>>>>> support? In other words, would it make sense to do this in a driver: > >>>>>> > >>>>>> foo_port_vlan_add(struct net_device *dev, > >>>>>> struct switchdev_obj_port_vlan *vlan) > >>>>>> { > >>>>>> if (vlan->vid_begin != vlan->vid_end) > >>>>>> return -ENOTSUPP; /* or something more relevant for user */ > >>>>>> > >>>>>> return foo_port_single_vlan_add(dev, vlan->vid_begin); > >>>>>> } > >>>>>> > >>>>>> So drivers keep being simple, and we can easily propagate the fact that > >>>>>> one-or-all VLAN is not supportable, vs. the VLAN feature itself is not > >>>>>> implemented and must be done in software. > >>>>> I think that if you want to keep it simple, then Scott's advice from the > >>>>> previous thread is the most appropriate one. I believe the hardware you > >>>>> are using is simply not meant to support multiple 802.1Q bridges. > >>>> > >>>> You mean allowing only one Linux bridge over an hardware switch? > >>>> > >>>> It would for sure simplify how, as developers and users, we represent a > >>>> physical switch. But I am not sure how to achieve that and I don't have > >>>> strong opinions on this TBH. > >>> > >>> Hi Vivien, I think it's possible to keep switch ports on just one > >>> bridge if we do a little bit of work on the NETDEV_CHANGEUPPER > >>> notifier. This will give you the driver-level control you want. Do > >>> you have time to investigate? The idea is: > >>> > >>> 1) In your driver's handler for NETDEV_CHANGEUPPER, if switch port is > >>> being added to a second bridge,then return NOTIFY_BAD. Your driver > >>> needs to track the bridge count. > >>> > >>> 2) In __netdev_upper_dev_link(), check the return code from the > >>> call_netdevice_notifiers_info(NETDEV_CHANGEUPPER, ...) call, and if > >>> NOTIFY_BAD, abort the linking operation (goto rollback_xxx). > >>> > >> Hi, > >> > >> We are doing something similar in mlxsw (not upstream yet). Jiri > >> introduced PRE_CHANGEUPPER, which is called from the function you > >> mentioned, but before the linking operation (so that you don't need to > >> rollback). > >> > >> If the notification is about a linking operation and the master is a > >> bridge different than the current one, then NOTIFY_BAD is returned. > > > > Great, I'll wait for this then. > > > > Scott, this is another good reason why we definitely need a simple > > struct device per switch chip. In addition to the port net_device > > registration, the netdev notifier is another exact same piece of code > > that both Rocker and DSA implement. > > > >> Vivien, regarding your WAN interface question, this is something we > >> currently don't do. We don't even flood traffic from bridged ports > >> to CPU (although we can), as it can saturate the bus. Only control > >> traffic is supposed to go there. > > > > I kinda answered it myself: a Linux bridge needs to remain a user > > abstraction of a logical group of net_device. In other words, we must > > allow physical distinct ports under the same bridge. > > > > Below is an example of a custom router with 2 chained switch chips sw0 > > and sw1, and what usage I believe we expect: > > > > [ Linux soft bridge "br0" which can accelerate VLAN, STP, etc. ] > > (CPU) (WAN) > > [ sw0p0 sw0p1 sw0p2 ] [ sw1p0 sw1p1 sw1p2 sw1p3 ] [ eth0 ] [ eth1 ] > > `--DSA--' `-------' > > Your use case is something that is fairly common with PON/GPON devices, > but AFAIR they typically implement this by having two sets of bridges, > one which spans the WAN interface and a bunch of other ports, and > another bridge which is LAN only (few ports + Wi-Fi typically). Usually > this is under the same physical switch though, so this is all about > partitioning physical ports and physical interfaces under logical groups. > > By definition the WAN and LAN domains are separate logical and broadcast > domains, with separate admission control rules, STP etc. I do not think > your "br0" example should span the WAN interface in this case. Also, > with eth0 being the conduit interface, it cannot be allowed to be a > bridge member, that's something I ended up fixing in the bridge layer, > otherwise untagging does not work, but that is nitpicking. > > It still makes sense though allow the creation of a "br0" device which > spans the entire set of ports and interfaces (except eth0), but not name > eth1 "WAN", just treat it as an extension in that case. > > Sorry if that seems like f'ing flies, but having concise example should > help make progress on these design issues ;) You are right, my drawing wasn't that correct. What you just described is more accurate. Thanks, -v ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH net-next] switchdev: enforce no pvid flag in vlan ranges 2015-10-14 17:42 ` Ido Schimmel 2015-10-14 18:51 ` Vivien Didelot @ 2015-10-15 2:58 ` Scott Feldman 2015-10-15 7:28 ` Ido Schimmel 1 sibling, 1 reply; 34+ messages in thread From: Scott Feldman @ 2015-10-15 2:58 UTC (permalink / raw) To: Ido Schimmel Cc: Vivien Didelot, Nikolay Aleksandrov, Netdev, Jiří Pírko, David S. Miller, Nikolay Aleksandrov, Elad Raz On Wed, Oct 14, 2015 at 10:42 AM, Ido Schimmel <idosch@mellanox.com> wrote: > Wed, Oct 14, 2015 at 08:14:24PM IDT, sfeldma@gmail.com wrote: >>On Wed, Oct 14, 2015 at 8:25 AM, Vivien Didelot >><vivien.didelot@savoirfairelinux.com> wrote: >>> On Oct. Wednesday 14 (42) 09:14 AM, Ido Schimmel wrote: >>>> Tue, Oct 13, 2015 at 05:32:26PM IDT, vivien.didelot@savoirfairelinux.com wrote: >>>> >On Oct. Tuesday 13 (42) 11:31 AM, Ido Schimmel wrote: >>>> >> Mon, Oct 12, 2015 at 08:36:25PM IDT, vivien.didelot@savoirfairelinux.com wrote: >>>> >> >Hi guys, >>>> >> > >>>> >> >On Oct. Monday 12 (42) 02:01 PM, Nikolay Aleksandrov wrote: >>>> >> >> From: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> >>>> >> >> >>>> >> >> We shouldn't allow BRIDGE_VLAN_INFO_PVID flag in VLAN ranges. >>>> >> >> >>>> >> >> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> >>>> >> >> --- >>>> >> >> net/switchdev/switchdev.c | 3 +++ >>>> >> >> 1 file changed, 3 insertions(+) >>>> >> >> >>>> >> >> diff --git a/net/switchdev/switchdev.c b/net/switchdev/switchdev.c >>>> >> >> index 6e4a4f9ad927..256c596de896 100644 >>>> >> >> --- a/net/switchdev/switchdev.c >>>> >> >> +++ b/net/switchdev/switchdev.c >>>> >> >> @@ -720,6 +720,9 @@ static int switchdev_port_br_afspec(struct net_device *dev, >>>> >> >> if (vlan.vid_begin) >>>> >> >> return -EINVAL; >>>> >> >> vlan.vid_begin = vinfo->vid; >>>> >> >> + /* don't allow range of pvids */ >>>> >> >> + if (vlan.flags & BRIDGE_VLAN_INFO_PVID) >>>> >> >> + return -EINVAL; >>>> >> >> } else if (vinfo->flags & BRIDGE_VLAN_INFO_RANGE_END) { >>>> >> >> if (!vlan.vid_begin) >>>> >> >> return -EINVAL; >>>> >> >> -- >>>> >> >> 2.4.3 >>>> >> >> >>>> >> > >>>> >> >Yes the patch looks good, but it is a minor check though. I hope the >>>> >> >subject of this thread is making sense. >>>> >> > >>>> >> >VLAN ranges seem to have been included for an UX purpose (so commands >>>> >> >look like Cisco IOS). We don't want to change any existing interface, so >>>> >> >we pushed that down to drivers, with the only valid reason that, maybe >>>> >> >one day, an hardware can be capable of programming a range on a per-port >>>> >> >basis. >>>> >> Hi, >>>> >> >>>> >> That's actually what we are doing in mlxsw. We can do up to 256 entries in >>>> >> one go. We've yet to submit this part. >>>> > >>>> >Perfect Ido, thanks for pointing this out! I'm OK with the range then. >>>> > >>>> >So there is now a very last question in my head for this, which is more >>>> >a matter of kernel design. Should the user be aware of such underlying >>>> >support? In other words, would it make sense to do this in a driver: >>>> > >>>> > foo_port_vlan_add(struct net_device *dev, >>>> > struct switchdev_obj_port_vlan *vlan) >>>> > { >>>> > if (vlan->vid_begin != vlan->vid_end) >>>> > return -ENOTSUPP; /* or something more relevant for user */ >>>> > >>>> > return foo_port_single_vlan_add(dev, vlan->vid_begin); >>>> > } >>>> > >>>> >So drivers keep being simple, and we can easily propagate the fact that >>>> >one-or-all VLAN is not supportable, vs. the VLAN feature itself is not >>>> >implemented and must be done in software. >>>> I think that if you want to keep it simple, then Scott's advice from the >>>> previous thread is the most appropriate one. I believe the hardware you >>>> are using is simply not meant to support multiple 802.1Q bridges. >>> >>> You mean allowing only one Linux bridge over an hardware switch? >>> >>> It would for sure simplify how, as developers and users, we represent a >>> physical switch. But I am not sure how to achieve that and I don't have >>> strong opinions on this TBH. >> >>Hi Vivien, I think it's possible to keep switch ports on just one >>bridge if we do a little bit of work on the NETDEV_CHANGEUPPER >>notifier. This will give you the driver-level control you want. Do >>you have time to investigate? The idea is: >> >>1) In your driver's handler for NETDEV_CHANGEUPPER, if switch port is >>being added to a second bridge,then return NOTIFY_BAD. Your driver >>needs to track the bridge count. >> >>2) In __netdev_upper_dev_link(), check the return code from the >>call_netdevice_notifiers_info(NETDEV_CHANGEUPPER, ...) call, and if >>NOTIFY_BAD, abort the linking operation (goto rollback_xxx). >> > Hi, > > We are doing something similar in mlxsw (not upstream yet). Jiri > introduced PRE_CHANGEUPPER, which is called from the function you > mentioned, but before the linking operation (so that you don't need to > rollback). Oh, cool. > If the notification is about a linking operation and the master is a > bridge different than the current one, then NOTIFY_BAD is returned. So you're wanting to restrict to just one bridge also? Or is NOTIFY_BAD returned for some other reason? I guess I should be patient and wait for the patch. ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH net-next] switchdev: enforce no pvid flag in vlan ranges 2015-10-15 2:58 ` Scott Feldman @ 2015-10-15 7:28 ` Ido Schimmel 0 siblings, 0 replies; 34+ messages in thread From: Ido Schimmel @ 2015-10-15 7:28 UTC (permalink / raw) To: Scott Feldman Cc: Vivien Didelot, Nikolay Aleksandrov, Netdev, Jiří Pírko, David S. Miller, Nikolay Aleksandrov, Elad Raz Thu, Oct 15, 2015 at 05:58:33AM IDT, sfeldma@gmail.com wrote: >On Wed, Oct 14, 2015 at 10:42 AM, Ido Schimmel <idosch@mellanox.com> wrote: >> Wed, Oct 14, 2015 at 08:14:24PM IDT, sfeldma@gmail.com wrote: >>>On Wed, Oct 14, 2015 at 8:25 AM, Vivien Didelot >>><vivien.didelot@savoirfairelinux.com> wrote: >>>> On Oct. Wednesday 14 (42) 09:14 AM, Ido Schimmel wrote: >>>>> Tue, Oct 13, 2015 at 05:32:26PM IDT, vivien.didelot@savoirfairelinux.com wrote: >>>>> >On Oct. Tuesday 13 (42) 11:31 AM, Ido Schimmel wrote: >>>>> >> Mon, Oct 12, 2015 at 08:36:25PM IDT, vivien.didelot@savoirfairelinux.com wrote: >>>>> >> >Hi guys, >>>>> >> > >>>>> >> >On Oct. Monday 12 (42) 02:01 PM, Nikolay Aleksandrov wrote: >>>>> >> >> From: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> >>>>> >> >> >>>>> >> >> We shouldn't allow BRIDGE_VLAN_INFO_PVID flag in VLAN ranges. >>>>> >> >> >>>>> >> >> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> >>>>> >> >> --- >>>>> >> >> net/switchdev/switchdev.c | 3 +++ >>>>> >> >> 1 file changed, 3 insertions(+) >>>>> >> >> >>>>> >> >> diff --git a/net/switchdev/switchdev.c b/net/switchdev/switchdev.c >>>>> >> >> index 6e4a4f9ad927..256c596de896 100644 >>>>> >> >> --- a/net/switchdev/switchdev.c >>>>> >> >> +++ b/net/switchdev/switchdev.c >>>>> >> >> @@ -720,6 +720,9 @@ static int switchdev_port_br_afspec(struct net_device *dev, >>>>> >> >> if (vlan.vid_begin) >>>>> >> >> return -EINVAL; >>>>> >> >> vlan.vid_begin = vinfo->vid; >>>>> >> >> + /* don't allow range of pvids */ >>>>> >> >> + if (vlan.flags & BRIDGE_VLAN_INFO_PVID) >>>>> >> >> + return -EINVAL; >>>>> >> >> } else if (vinfo->flags & BRIDGE_VLAN_INFO_RANGE_END) { >>>>> >> >> if (!vlan.vid_begin) >>>>> >> >> return -EINVAL; >>>>> >> >> -- >>>>> >> >> 2.4.3 >>>>> >> >> >>>>> >> > >>>>> >> >Yes the patch looks good, but it is a minor check though. I hope the >>>>> >> >subject of this thread is making sense. >>>>> >> > >>>>> >> >VLAN ranges seem to have been included for an UX purpose (so commands >>>>> >> >look like Cisco IOS). We don't want to change any existing interface, so >>>>> >> >we pushed that down to drivers, with the only valid reason that, maybe >>>>> >> >one day, an hardware can be capable of programming a range on a per-port >>>>> >> >basis. >>>>> >> Hi, >>>>> >> >>>>> >> That's actually what we are doing in mlxsw. We can do up to 256 entries in >>>>> >> one go. We've yet to submit this part. >>>>> > >>>>> >Perfect Ido, thanks for pointing this out! I'm OK with the range then. >>>>> > >>>>> >So there is now a very last question in my head for this, which is more >>>>> >a matter of kernel design. Should the user be aware of such underlying >>>>> >support? In other words, would it make sense to do this in a driver: >>>>> > >>>>> > foo_port_vlan_add(struct net_device *dev, >>>>> > struct switchdev_obj_port_vlan *vlan) >>>>> > { >>>>> > if (vlan->vid_begin != vlan->vid_end) >>>>> > return -ENOTSUPP; /* or something more relevant for user */ >>>>> > >>>>> > return foo_port_single_vlan_add(dev, vlan->vid_begin); >>>>> > } >>>>> > >>>>> >So drivers keep being simple, and we can easily propagate the fact that >>>>> >one-or-all VLAN is not supportable, vs. the VLAN feature itself is not >>>>> >implemented and must be done in software. >>>>> I think that if you want to keep it simple, then Scott's advice from the >>>>> previous thread is the most appropriate one. I believe the hardware you >>>>> are using is simply not meant to support multiple 802.1Q bridges. >>>> >>>> You mean allowing only one Linux bridge over an hardware switch? >>>> >>>> It would for sure simplify how, as developers and users, we represent a >>>> physical switch. But I am not sure how to achieve that and I don't have >>>> strong opinions on this TBH. >>> >>>Hi Vivien, I think it's possible to keep switch ports on just one >>>bridge if we do a little bit of work on the NETDEV_CHANGEUPPER >>>notifier. This will give you the driver-level control you want. Do >>>you have time to investigate? The idea is: >>> >>>1) In your driver's handler for NETDEV_CHANGEUPPER, if switch port is >>>being added to a second bridge,then return NOTIFY_BAD. Your driver >>>needs to track the bridge count. >>> >>>2) In __netdev_upper_dev_link(), check the return code from the >>>call_netdevice_notifiers_info(NETDEV_CHANGEUPPER, ...) call, and if >>>NOTIFY_BAD, abort the linking operation (goto rollback_xxx). >>> >> Hi, >> >> We are doing something similar in mlxsw (not upstream yet). Jiri >> introduced PRE_CHANGEUPPER, which is called from the function you >> mentioned, but before the linking operation (so that you don't need to >> rollback). > >Oh, cool. > >> If the notification is about a linking operation and the master is a >> bridge different than the current one, then NOTIFY_BAD is returned. > >So you're wanting to restrict to just one bridge also? Or is >NOTIFY_BAD returned for some other reason? I guess I should be >patient and wait for the patch. Yes, currently that's what we are doing. ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH net-next] switchdev: enforce no pvid flag in vlan ranges 2015-10-12 12:01 ` [PATCH net-next] switchdev: enforce no pvid flag in vlan ranges Nikolay Aleksandrov ` (2 preceding siblings ...) 2015-10-12 17:36 ` Vivien Didelot @ 2015-10-13 11:42 ` David Miller 3 siblings, 0 replies; 34+ messages in thread From: David Miller @ 2015-10-13 11:42 UTC (permalink / raw) To: razor; +Cc: netdev, sfeldma, jiri, vivien.didelot, nikolay From: Nikolay Aleksandrov <razor@blackwall.org> Date: Mon, 12 Oct 2015 14:01:39 +0200 > From: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> > > We shouldn't allow BRIDGE_VLAN_INFO_PVID flag in VLAN ranges. > > Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Applied, thanks. ^ permalink raw reply [flat|nested] 34+ messages in thread
end of thread, other threads:[~2015-10-15 7:28 UTC | newest] Thread overview: 34+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2015-10-09 23:30 switchdev and VLAN ranges Vivien Didelot 2015-10-10 4:22 ` Scott Feldman 2015-10-10 16:33 ` Vivien Didelot 2015-10-10 18:10 ` Florian Fainelli 2015-10-10 19:47 ` Vivien Didelot 2015-10-10 7:49 ` Elad Raz 2015-10-10 10:36 ` Nikolay Aleksandrov 2015-10-11 7:12 ` Jiri Pirko 2015-10-11 10:49 ` [PATCH net-next] bridge: vlan: enforce no pvid flag in vlan ranges Nikolay Aleksandrov 2015-10-11 10:49 ` [Bridge] " Nikolay Aleksandrov 2015-10-11 14:13 ` Jiri Pirko 2015-10-13 2:59 ` David Miller 2015-10-13 2:59 ` [Bridge] " David Miller 2015-10-11 22:41 ` switchdev and VLAN ranges Vivien Didelot 2015-10-12 0:13 ` Nikolay Aleksandrov 2015-10-12 5:14 ` Scott Feldman 2015-10-12 10:15 ` Nikolay Aleksandrov 2015-10-12 12:01 ` [PATCH net-next] switchdev: enforce no pvid flag in vlan ranges Nikolay Aleksandrov 2015-10-12 12:11 ` Elad Raz 2015-10-12 12:17 ` Jiri Pirko 2015-10-12 17:36 ` Vivien Didelot 2015-10-13 6:13 ` Scott Feldman 2015-10-13 8:31 ` Ido Schimmel 2015-10-13 14:32 ` Vivien Didelot 2015-10-14 6:14 ` Ido Schimmel 2015-10-14 15:25 ` Vivien Didelot 2015-10-14 17:14 ` Scott Feldman 2015-10-14 17:42 ` Ido Schimmel 2015-10-14 18:51 ` Vivien Didelot 2015-10-14 22:08 ` Florian Fainelli 2015-10-15 0:07 ` Vivien Didelot 2015-10-15 2:58 ` Scott Feldman 2015-10-15 7:28 ` Ido Schimmel 2015-10-13 11:42 ` David Miller
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.