* Re: [net-next PATCH v1 01/11] net: flow_table: create interface for hw match/action tables @ 2015-01-07 1:14 Alexei Starovoitov 2015-01-07 5:37 ` John Fastabend 0 siblings, 1 reply; 21+ messages in thread From: Alexei Starovoitov @ 2015-01-07 1:14 UTC (permalink / raw) To: John Fastabend Cc: Thomas Graf, Scott Feldman, Jiří Pírko, Jamal Hadi Salim, simon.horman, netdev, David S. Miller, Andy Gospodarek On Wed, Dec 31, 2014 at 11:45 AM, John Fastabend <john.fastabend@gmail.com> wrote: > + * [NET_FLOW_TABLE_IDENTIFIER_TYPE] > + * [NET_FLOW_TABLE_IDENTIFIER] > + * [NET_FLOW_TABLE_TABLES] > + * [NET_FLOW_TABLE] > + * [NET_FLOW_TABLE_ATTR_NAME] > + * [NET_FLOW_TABLE_ATTR_UID] > + * [NET_FLOW_TABLE_ATTR_SOURCE] > + * [NET_FLOW_TABLE_ATTR_SIZE] ... > + * Header definitions used to define headers with user friendly > + * names. > + * > + * [NET_FLOW_TABLE_HEADERS] > + * [NET_FLOW_HEADER] > + * [NET_FLOW_HEADER_ATTR_NAME] > + * [NET_FLOW_HEADER_ATTR_UID] > + * [NET_FLOW_HEADER_ATTR_FIELDS] > + * [NET_FLOW_HEADER_ATTR_FIELD] > + * [NET_FLOW_FIELD_ATTR_NAME] > + * [NET_FLOW_FIELD_ATTR_UID] > + * [NET_FLOW_FIELD_ATTR_BITWIDTH] > + * [NET_FLOW_HEADER_ATTR_FIELD] > + * [...] > + * [...] > + * Action definitions supported by tables > + * > + * [NET_FLOW_TABLE_ACTIONS] > + * [NET_FLOW_TABLE_ATTR_ACTIONS] > + * [NET_FLOW_ACTION] > + * [NET_FLOW_ACTION_ATTR_NAME] > + * [NET_FLOW_ACTION_ATTR_UID] > + * [NET_FLOW_ACTION_ATTR_SIGNATURE] > + * [NET_FLOW_ACTION_ARG] .. > + * Get Table Graph <Reply> description > + * > + * [NET_FLOW_TABLE_TABLE_GRAPH] > + * [TABLE_GRAPH_NODE] > + * [TABLE_GRAPH_NODE_UID] > + * [TABLE_GRAPH_NODE_JUMP] I think NET_FLOW prefix everywhere is too verbose. Especially since you've missed it in the above 3. and in patch 2 it is: NET_FLOW_FLOW which is kinda awkward. Can you abbreviate it to NFL_ or something else ? I couldn't find get_headers() and get_header_graph() implementation on rocker side ? Could you describe how put_header_graph() will look like? When it comes to parsing I'm assuming that hw will fall into N categories: - that has get_headers() and get_header_graph() only which would mean fixed parser - above plus put_header_graph() which will allow to rearrange some fixed sized headers ? - above plus put_header() ? I'm having a hard time envisioning how that would look like. - ... ? also can we change a name from add_flow to add_entry or add_rule ? I think 'rule' fits better, since rule = field_ref+action and one real TCP flow may need multiple rules inserted into table, right? The whole thing can still be called 'flow API'... will there be a put_table_graph() ? probably not, right? since as soon as HW supports 'goto' aciton, the meaning of table_graph is lost and it's actually just a set of disconnected tables and the way to jump from one into another is through 'goto'. I think OVS guys are quiet, since they're skeptical that headers+header_graph approach can work? Would be great if they can share the experience... ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [net-next PATCH v1 01/11] net: flow_table: create interface for hw match/action tables 2015-01-07 1:14 [net-next PATCH v1 01/11] net: flow_table: create interface for hw match/action tables Alexei Starovoitov @ 2015-01-07 5:37 ` John Fastabend 0 siblings, 0 replies; 21+ messages in thread From: John Fastabend @ 2015-01-07 5:37 UTC (permalink / raw) To: Alexei Starovoitov Cc: Thomas Graf, Scott Feldman, Jiří Pírko, Jamal Hadi Salim, simon.horman, netdev, David S. Miller, Andy Gospodarek On 01/06/2015 05:14 PM, Alexei Starovoitov wrote: > On Wed, Dec 31, 2014 at 11:45 AM, John Fastabend > <john.fastabend@gmail.com> wrote: >> + * [NET_FLOW_TABLE_IDENTIFIER_TYPE] >> + * [NET_FLOW_TABLE_IDENTIFIER] >> + * [NET_FLOW_TABLE_TABLES] >> + * [NET_FLOW_TABLE] >> + * [NET_FLOW_TABLE_ATTR_NAME] >> + * [NET_FLOW_TABLE_ATTR_UID] >> + * [NET_FLOW_TABLE_ATTR_SOURCE] >> + * [NET_FLOW_TABLE_ATTR_SIZE] > ... >> + * Header definitions used to define headers with user friendly >> + * names. >> + * >> + * [NET_FLOW_TABLE_HEADERS] >> + * [NET_FLOW_HEADER] >> + * [NET_FLOW_HEADER_ATTR_NAME] >> + * [NET_FLOW_HEADER_ATTR_UID] >> + * [NET_FLOW_HEADER_ATTR_FIELDS] >> + * [NET_FLOW_HEADER_ATTR_FIELD] >> + * [NET_FLOW_FIELD_ATTR_NAME] >> + * [NET_FLOW_FIELD_ATTR_UID] >> + * [NET_FLOW_FIELD_ATTR_BITWIDTH] >> + * [NET_FLOW_HEADER_ATTR_FIELD] >> + * [...] >> + * [...] >> + * Action definitions supported by tables >> + * >> + * [NET_FLOW_TABLE_ACTIONS] >> + * [NET_FLOW_TABLE_ATTR_ACTIONS] >> + * [NET_FLOW_ACTION] >> + * [NET_FLOW_ACTION_ATTR_NAME] >> + * [NET_FLOW_ACTION_ATTR_UID] >> + * [NET_FLOW_ACTION_ATTR_SIGNATURE] >> + * [NET_FLOW_ACTION_ARG] > .. >> + * Get Table Graph <Reply> description >> + * >> + * [NET_FLOW_TABLE_TABLE_GRAPH] >> + * [TABLE_GRAPH_NODE] >> + * [TABLE_GRAPH_NODE_UID] >> + * [TABLE_GRAPH_NODE_JUMP] > > I think NET_FLOW prefix everywhere is too verbose. > Especially since you've missed it in the above 3. > and in patch 2 it is: > NET_FLOW_FLOW > which is kinda awkward. > Can you abbreviate it to NFL_ or something else ? hmm I'm open for a better name, NFL_ might work but seems a bit cryptic to me. Maybe it is better than NET_FLOW. Anyone other suggestions? > > I couldn't find get_headers() and get_header_graph() > implementation on rocker side ? It is in patch [net-next PATCH v1 04/11] rocker: add pipeline model for rocker switch +#ifdef CONFIG_NET_FLOW_TABLES + .ndo_flow_get_tables = rocker_get_tables, + .ndo_flow_get_headers = rocker_get_headers, + .ndo_flow_get_actions = rocker_get_actions, + .ndo_flow_get_tbl_graph = rocker_get_tgraph, + .ndo_flow_get_hdr_graph = rocker_get_hgraph, +#endif although v2 will address some good feedback and clean it up a bit. > Could you describe how put_header_graph() will look like? The signature for get_hdrs and get_hdrs_graph in my latest deck (I'll push shortly still need to sort out the caching for get_flows) look like this, struct net_flow_hdr **(*ndo_flow_get_hdrs)(struct net_device *dev); struct net_flow_hdr_node **(*ndo_flow_get_hdr_graph)(struct net_device *dev) I could then use the following signatures for put hdrs, int (*ndo_flow_put_hdrs)(struct net_device *dev, struct net_flow_hdr **hdrs); int (*ndo_flow_put_hdrs_graph(struct net_device *dev, struct net_flow_hdr_graph **graph); If the user supplies a new set of hdrs via put_hdrs it would invalidate the hdrs graph though so we could either smash those two operations into one or require both to occur when the device is down but not allow it to come up without a graph operation. Currently my preference is to smash the above two ops to this, > int (*ndo_flow_put_hdrs)(struct net_device *dev, struct net_flow_hdr **hdrs, struct net_flow_hdr_graph **graph); I've gone back and forth on this, doing updates to the hdrs/graph while the device is online doesn't seem practical. I don't have access to any devices that would support this. If your device is a software one though (futuristic eBPF-OVS) it may make some sense. > When it comes to parsing I'm assuming that hw will fall > into N categories: > - that has get_headers() and get_header_graph() only > which would mean fixed parser right this is rocker and what the initial series is trying to address. > - above plus put_header_graph() which will allow to > rearrange some fixed sized headers ? OK but I'm not sure where/if these devices exist. Maybe your thinking of a software dataplane case? Would get_headers return some headers/fields but not include them in the graph and then expect the user to build a graph with them if the user needs them. Are there restrictions on how the graph can be built out? I guess I'm working with the assumption that the device returns a complete parse graph for all combinations it supports. Are there really devices that could only support certain combinations and then if you shuffled the headers graph around support others? I'm just not aware of any device. > - above plus put_header() ? > I'm having a hard time envisioning how that would > look like. This case makes more sense to me. The user supplies the definition of the headers and the graph showing how they are related and the driver can program the parser to work correctly. This implies a flexible parser but I think some devices could support this. You would need some attributes to define depth of the parser and such restrictions if the device has restrictions on the parsers that can be supported. Maybe one concrete example would be to introduce a header tag that was previously unknown to the device. You could define it using the header/fields (bit/length/offset) notation and then give the graph to let the parser "know" where to expect it. Finally this could be passed to the driver and the parser could be generated. To be honest though I would really be happy getting the 1st option working. > - ... ? > > also can we change a name from add_flow > to add_entry or add_rule ? > I think 'rule' fits better, since rule = field_ref+action > and one real TCP flow may need multiple rules > inserted into table, right? > The whole thing can still be called 'flow API'... add_rule/del_rule fine by me. > > will there be a put_table_graph() ? > probably not, right? since as soon as HW supports > 'goto' aciton, the meaning of table_graph is lost and > it's actually just a set of disconnected tables and the > way to jump from one into another is through 'goto'. hmm I have support in another tree for create/destroy table. This allows users to create tables and destroy them. . I think the table_graph is still relevant its just the graph is completely connected. I'm not sure you will really see hardware like this anytime soon though :) or maybe I just mean I haven't seen any. > > I think OVS guys are quiet, since they're skeptical > that headers+header_graph approach can work? > Would be great if they can share the experience... > hmm maybe but I could define all the headers OVS supports using this. And then define a linear array of tables. It might be an interesting exercise to build this on top of rocker. -- John Fastabend Intel Corporation ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [net-next PATCH v1 01/11] net: flow_table: create interface for hw match/action tables @ 2015-01-07 21:17 Alexei Starovoitov 2015-01-07 22:00 ` John Fastabend 0 siblings, 1 reply; 21+ messages in thread From: Alexei Starovoitov @ 2015-01-07 21:17 UTC (permalink / raw) To: John Fastabend Cc: Thomas Graf, Scott Feldman, Jiří Pírko, Jamal Hadi Salim, simon.horman, netdev, David S. Miller, Andy Gospodarek On Tue, Jan 6, 2015 at 9:37 PM, John Fastabend <john.fastabend@gmail.com> wrote: >> - above plus put_header_graph() which will allow to >> rearrange some fixed sized headers ? > > OK but I'm not sure where/if these devices exist. Maybe your > thinking of a software dataplane case? Would get_headers return > some headers/fields but not include them in the graph and then > expect the user to build a graph with them if the user needs > them. Are there restrictions on how the graph can be built > out? I guess I'm working with the assumption that the device > returns a complete parse graph for all combinations it supports. ahh. I thought that get_hdr_graph() will return one that is currently configured and put_hdr_graph() will try to configure new sequence of headers. I think returning all possible combinations is not practical, since number of such combinations can be very large for some hw. Also it seems that 4/11 patch and rocker_header_nodes[] in particular describing one graph instead of all possible? >> - above plus put_header() ? >> I'm having a hard time envisioning how that would >> look like. > > This case makes more sense to me. The user supplies the definition > of the headers and the graph showing how they are related and the > driver can program the parser to work correctly. yes, assuming that put_hdr_graph() programs one sequence of jumping through hdrs... but I think it's also fine if you do one put_hdrs_and_graph() function as you described. > To be honest though I would really be happy getting the 1st option > working. agree. as long as we don't screw up get*() semantics that prevent clean put*() logic :) To illustrate my point: if hw parser can parse 2 vlans and there is no way to configure it to do zero, one or three, it's perfectly fine for put_hdr_graph() to fail when it tries to configure something different. But if hw can be configured to do 1 vlan or 2 vlans, it would be overkill to pass both graphs in get(). Just pass one that is currently active and let put() try things? I think get_hdrs() on its own is good enough indication to the user what hw is capable of and hdr_graph is just a jump table between them. If hw can parse vxlan without vxlan extensions it will be clearly seen in get_hdrs, so no point trying to do put_hdrs() with some new definition of vxlan unless parser is fully programmable. that's where I was going with my category 2 where only put_hdr_graph() exists... imo it will fit trident and alta models ? Personally I believe that we should design this API with as much as possible real hw in mind. rocker can support different models of hw... ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [net-next PATCH v1 01/11] net: flow_table: create interface for hw match/action tables 2015-01-07 21:17 Alexei Starovoitov @ 2015-01-07 22:00 ` John Fastabend 0 siblings, 0 replies; 21+ messages in thread From: John Fastabend @ 2015-01-07 22:00 UTC (permalink / raw) To: Alexei Starovoitov Cc: Thomas Graf, Scott Feldman, Jiří Pírko, Jamal Hadi Salim, simon.horman, netdev, David S. Miller, Andy Gospodarek On 01/07/2015 01:17 PM, Alexei Starovoitov wrote: > On Tue, Jan 6, 2015 at 9:37 PM, John Fastabend <john.fastabend@gmail.com> wrote: >>> - above plus put_header_graph() which will allow to >>> rearrange some fixed sized headers ? >> >> OK but I'm not sure where/if these devices exist. Maybe your >> thinking of a software dataplane case? Would get_headers return >> some headers/fields but not include them in the graph and then >> expect the user to build a graph with them if the user needs >> them. Are there restrictions on how the graph can be built >> out? I guess I'm working with the assumption that the device >> returns a complete parse graph for all combinations it supports. > > ahh. I thought that get_hdr_graph() will return one > that is currently configured and put_hdr_graph() > will try to configure new sequence of headers. > I think returning all possible combinations is not practical, > since number of such combinations can be very large for > some hw. Agree here I think it should return the currently configured and active hdr graph. Just to be clear I had assumed that any driver that supported put_header_graph would also support a put_headers call. basically your case 3 below. > Also it seems that 4/11 patch and rocker_header_nodes[] > in particular describing one graph instead of > all possible? It returns the one and only graph rocker supports now or at least the graph of supported headers as I read the rocker code. As rocker becomes more flexible I would expect this to grow including tunnels, stacked headers, tcp, etc. > >>> - above plus put_header() ? >>> I'm having a hard time envisioning how that would >>> look like. >> >> This case makes more sense to me. The user supplies the definition >> of the headers and the graph showing how they are related and the >> driver can program the parser to work correctly. > > yes, assuming that put_hdr_graph() programs one > sequence of jumping through hdrs... > but I think it's also fine if you do one put_hdrs_and_graph() > function as you described. > >> To be honest though I would really be happy getting the 1st option >> working. > > agree. > as long as we don't screw up get*() semantics that > prevent clean put*() logic :) > To illustrate my point: > if hw parser can parse 2 vlans and there is > no way to configure it to do zero, one or three, it's perfectly > fine for put_hdr_graph() to fail when it tries to configure > something different. > But if hw can be configured to do 1 vlan or 2 vlans, it > would be overkill to pass both graphs in get(). > Just pass one that is currently active and let put() try things? This is what I intended. I think it is good enough. > I think get_hdrs() on its own is good enough indication > to the user what hw is capable of and hdr_graph is > just a jump table between them. If hw can parse vxlan > without vxlan extensions it will be clearly seen in get_hdrs, > so no point trying to do put_hdrs() with some new > definition of vxlan unless parser is fully programmable. > that's where I was going with my category 2 where > only put_hdr_graph() exists... imo it will fit trident > and alta models ? > Personally I believe that we should design this API > with as much as possible real hw in mind. > rocker can support different models of hw... > Yep. Which is why at some point I would like to program up a couple other "worlds" for rocker that have different pipelines. This would allow experimenting with more then the current static model rocker uses. -- John Fastabend Intel Corporation ^ permalink raw reply [flat|nested] 21+ messages in thread
* [net-next PATCH v1 00/11] A flow API @ 2014-12-31 19:45 John Fastabend 2014-12-31 19:45 ` [net-next PATCH v1 01/11] net: flow_table: create interface for hw match/action tables John Fastabend 0 siblings, 1 reply; 21+ messages in thread From: John Fastabend @ 2014-12-31 19:45 UTC (permalink / raw) To: tgraf, sfeldma, jiri, jhs, simon.horman; +Cc: netdev, davem, andy So... I could continue to mull over this and tweak bits and pieces here and there but I decided its best to get a wider group of folks looking at it and hopefulyl with any luck using it so here it is. This set creates a new netlink family and set of messages to configure flow tables in hardware. I tried to make the commit messages reasonably verbose at least in the flow_table patches. What we get at the end of this series is a working API to get device capabilities and program flows using the rocker switch. I created a user space tool 'flow' that I use to configure and query the devices it is posted here, https://github.com/jrfastab/iprotue2-flow-tool For now it is a stand-alone tool but once the kernel bits get sorted out (I'm guessing there will need to be a few versions of this series to get it right) I would like to port it into the iproute2 package. This way we can keep all of our tooling in one package see 'bridge' for example. As far as testing, I've tested various combinations of tables and rules on the rocker switch and it seems to work. I have not tested 100% of the rocker code paths though. It would be great to get some sort of automated framework around the API to do this. I don't think should gate the inclusion of the API though. I could use some help reviewing, (a) error paths and netlink validation code paths (b) Break down of structures vs netlink attributes. I am trying to balance flexibility given by having netlinnk TLV attributes vs conciseness. So some things are passed as structures. (c) are there any devices that have pipelines that we can't represent with this API? It would be good to know about these so we can design it in probably in a future series. For some examples and maybe a bit more illustrative description I posted a quickly typed up set of notes on github io pages. Here we can show the description along with images produced by the flow tool showing the pipeline. Once we settle a bit more on the API we should probably do a clean up of this and other threads happening and commit something to the Documentation directory. http://jrfastab.github.io/jekyll/update/2014/12/21/flow-api.html Finally I have more patches to add support for creating and destroying tables. This allows users to define the pipeline at runtime rather than statically as rocker does now. After this set gets some traction I'll look at pushing them in a next round. However it likely requires adding another "world" to rocker. Another piece that I want to add is a description of the actions and metadata. This way user space can "learn" what an action is and how metadata interacts with the system. This work is under development. Thanks! Any comments/feedback always welcome. And also thanks to everyone who helped with this flow API so far. All the folks at Dusseldorf LPC, OVS summit Santa Clara, P4 authors for some inspiration, the collection of IETF FoRCES documents I mulled over, Netfilter workshop where I started to realize fixing ethtool was most likely not going to work, etc. --- John Fastabend (11): net: flow_table: create interface for hw match/action tables net: flow_table: add flow, delete flow net: flow_table: add apply action argument to tables rocker: add pipeline model for rocker switch net: rocker: add set flow rules net: rocker: add group_id slices and drop explicit goto net: rocker: add multicast path to bridging net: rocker: add get flow API operation net: rocker: add cookie to group acls and use flow_id to set cookie net: rocker: have flow api calls set cookie value net: rocker: implement delete flow routine drivers/net/ethernet/rocker/rocker.c | 1641 +++++++++++++++++++++++++ drivers/net/ethernet/rocker/rocker_pipeline.h | 793 ++++++++++++ include/linux/if_flow.h | 115 ++ include/linux/netdevice.h | 20 include/uapi/linux/if_flow.h | 413 ++++++ net/Kconfig | 7 net/core/Makefile | 1 net/core/flow_table.c | 1339 ++++++++++++++++++++ 8 files changed, 4312 insertions(+), 17 deletions(-) create mode 100644 drivers/net/ethernet/rocker/rocker_pipeline.h create mode 100644 include/linux/if_flow.h create mode 100644 include/uapi/linux/if_flow.h create mode 100644 net/core/flow_table.c -- Signature ^ permalink raw reply [flat|nested] 21+ messages in thread
* [net-next PATCH v1 01/11] net: flow_table: create interface for hw match/action tables 2014-12-31 19:45 [net-next PATCH v1 00/11] A flow API John Fastabend @ 2014-12-31 19:45 ` John Fastabend 2014-12-31 20:10 ` John Fastabend ` (2 more replies) 0 siblings, 3 replies; 21+ messages in thread From: John Fastabend @ 2014-12-31 19:45 UTC (permalink / raw) To: tgraf, sfeldma, jiri, jhs, simon.horman; +Cc: netdev, davem, andy Currently, we do not have an interface to query hardware and learn the capabilities of the device. This makes it very difficult to use hardware flow tables. At the moment the only interface we have to work with hardware flow tables is ethtool. This has many deficiencies, first its ioctl based making it difficult to use in systems that need to monitor interfaces because there is no support for multicast, notifiers, etc. The next big gap is it doesn't support querying devices for capabilities. The only way to learn hardware entries is by doing a "try and see" operation. An error perhaps indicating the device can not support your request but could be possibly for other reasons. Maybe a table is full for example. The existing flow interface only supports a single ingress table which is sufficient for some of the existing NIC host interfaces but limiting for more advanced NIC interfaces and switch devices. Also it is not extensible without recompiling both drivers and core interfaces. It may be possible to reprogram a device with additional header types, new protocols, whatever and it would be great if the flow table infrastructure can handle this. So this patch scraps the ethtool flow classifier interface and creates a new flow table interface. It is expected that device that support the existing ethtool interface today can support both interfaces without too much difficulty. I did a proof point on the ixgbe driver. Only choosing ixgbe because I have a 82599 10Gbps device in my development system. A more thorough implementation was done for the rocker switch showing how to use the interface. In this patch we create interfaces to get the headers a device supports, the actions it supports, a header graph showing the relationship between headers the device supports, the tables supported by the device and how they are connected. This patch _only_ provides the get routines in an attempt to make the patch sequence manageable. get_headers : report a set of headers/fields the device supports. These are specified as length/offsets so we can support standard protocols or vendor specific headers. This is more flexible then bitmasks of pre-defined packet types. In 'tc' for example I may use u32 to match on proprietary or vendor specific fields. A bitmask approach does not allow for this, but defining the fields as a set of offsets and lengths allows for this. A device that supports Openflow version 1.x for example could provide the set of field/offsets that are equivelent to the specification. One property of this type of interface is I don't have to rebuild my kernel/driver header interfaces, etc to support the latest and greatest trendy protocol foo. For some types of metadata the device understands we also use header fields to represent these. One example of this is we may have an ingress_port metadata field to report the port a packet was received on. At the moment we expect the metadata fields to be defined outside the interface. We can standardize on common ones such "ingress_port" across devices. Some examples of outside definitions specifying metadata might be OVS, internal definitions like skb->mark, or some FoRCES definitions. get_header_graph : Simply providing a header/field offset I support is not sufficient to learn how many nested 802.1Q tags I can support and other similar cases where the ordering of headers matters. So we use this operation to query the device for a header graph showing how the headers need to be related. With this operation and the 'get_headers' operation you can interrogate the driver with questions like "do you support Q'in'Q?", "how many VLAN tags can I nest before the parser breaks?", "Do you support MPLS?", "How about Foo Header in a VXLAN tunnel?". get_actions : Report a list of actions supported by the device along with the arguments they take. So "drop_packet" action takes no arguments and "set_field" action takes two arguments a field and value. This suffers again from being slightly opaque. Meaning if a device reports back action "foo_bar" with three arguments how do I as a consumer of this "know" what that action is? The easy thing to do is punt on it and say it should be described outside the driver somewhere. OVS for example defines a set of actions. If my FoRCeS quick read is correct they define actions using text in the messaging interface. A follow up patch series could use a description language to describe actions. Possibly using something from eBPF or nftables for example. This patch will not try to solve the isuse now and expect actions are defined outside the API or are well known. get_tables : Hardware may support one or more tables. Each table supports a set of matches and a set of actions. The match fields supported are defined above by the 'get_headers' operations. Similarly the actions supported are defined by the 'get_actions' operation. This allows the hardware to report several tables all with distinct capabilities. Tables also have table attributes used to describe features of the table. Because netlink messages are TLV based we can easily add new table attribues as needed. Currently a table has two attributes size and source. The size indicates how many "slots" are in the table for flow entries. One caveat here is a rule in the flow table may consume multiple slots in the table. We deal with this in a subsequent patch. The source field is used to indicate table boundaries where actions are applied. A table with the same source value will not "see" actions from tables with the same source. An example where this is relavent would be to have an action to re-write the destiniation IP address of a packet. If you have a match rule in a table with the same source that matches on the new IP address it will not be hit. However if it is in a table with a different source value _and_ in another table that gets applied the rule will be hit. See the next operatoin for querying table ordering. Some basic hardware may only support a single table which simplifies some things. But even the simple 10/40Gbps NICs support multiple tables and different tables depending on ingress/egress. get_table_graph : When a device supports multiple tables we need to identify how the tables are connected when each table is executed. To do this we provide a table graph which gives the pipeline of the device. The graph gives nodes representing each table and the edges indicate the criteria to progress to the next flow table. There are examples of this type of thing in both FoRCES and OVS. OVS prescribes a set of tables reachable with goto actions and FoRCES a slightly more flexible arrangement. In software tc's u32 classifier allows "linking" hash tables together. The OVS dataplane with the support of 'goto' action is completely connected. Without the 'goto' action the tables are progressed linearly. By querying the graph from hardware we can "learn" what table flows are supported and map them into software. We also provide a bit to indicate if the node is a root node of the ingress pipeline or egress pipeline. This is used on devices that have different pipelines for ingres and egress. This appears to be fairly common for devices. The realtek chip presented at LPC in Dusseldorf for example appeared to have a separate ingress/egress pipeline. With these five operations software can learn what types of fields the hardware flow table supports and how they are arranged. Subsequent patches will address programming the flow tables. Signed-off-by: John Fastabend <john.r.fastabend@intel.com> --- include/linux/if_flow.h | 93 +++++ include/linux/netdevice.h | 12 + include/uapi/linux/if_flow.h | 363 ++++++++++++++++++ net/Kconfig | 7 net/core/Makefile | 1 net/core/flow_table.c | 837 ++++++++++++++++++++++++++++++++++++++++++ 6 files changed, 1313 insertions(+) create mode 100644 include/linux/if_flow.h create mode 100644 include/uapi/linux/if_flow.h create mode 100644 net/core/flow_table.c diff --git a/include/linux/if_flow.h b/include/linux/if_flow.h new file mode 100644 index 0000000..1b6c1ea --- /dev/null +++ b/include/linux/if_flow.h @@ -0,0 +1,93 @@ +/* + * include/linux/net/if_flow.h - Flow table interface for Switch devices + * Copyright (c) 2014 John Fastabend <john.r.fastabend@intel.com> + * + * + * This program is free software; you can redistribute it and/or modify it + * under the terms and conditions of the GNU General Public License, + * version 2, as published by the Free Software Foundation. + * + * This program is distributed in the hope it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for + * more details. + * + * The full GNU General Public License is included in this distribution in + * the file called "COPYING". + * + * Author: John Fastabend <john.r.fastabend@intel.com> + */ + +#ifndef _IF_FLOW_H +#define _IF_FLOW_H + +#include <uapi/linux/if_flow.h> + +/** + * @struct net_flow_header + * @brief defines a match (header/field) an endpoint can use + * + * @uid unique identifier for header + * @field_sz number of fields are in the set + * @fields the set of fields in the net_flow_header + */ +struct net_flow_header { + char name[NET_FLOW_NAMSIZ]; + int uid; + int field_sz; + struct net_flow_field *fields; +}; + +/** + * @struct net_flow_action + * @brief a description of a endpoint defined action + * + * @name printable name + * @uid unique action identifier + * @types NET_FLOW_ACTION_TYPE_NULL terminated list of action types + */ +struct net_flow_action { + char name[NET_FLOW_NAMSIZ]; + int uid; + struct net_flow_action_arg *args; +}; + +/** + * @struct net_flow_table + * @brief define flow table with supported match/actions + * + * @uid unique identifier for table + * @source uid of parent table + * @size max number of entries for table or -1 for unbounded + * @matches null terminated set of supported match types given by match uid + * @actions null terminated set of supported action types given by action uid + * @flows set of flows + */ +struct net_flow_table { + char name[NET_FLOW_NAMSIZ]; + int uid; + int source; + int size; + struct net_flow_field_ref *matches; + int *actions; +}; + +/* net_flow_hdr_node: node in a header graph of header fields. + * + * @uid : unique id of the graph node + * @flwo_header_ref : identify the hdrs that can handled by this node + * @net_flow_jump_table : give a case jump statement + */ +struct net_flow_hdr_node { + char name[NET_FLOW_NAMSIZ]; + int uid; + int *hdrs; + struct net_flow_jump_table *jump; +}; + +struct net_flow_tbl_node { + int uid; + __u32 flags; + struct net_flow_jump_table *jump; +}; +#endif diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h index 29c92ee..3c3c856 100644 --- a/include/linux/netdevice.h +++ b/include/linux/netdevice.h @@ -52,6 +52,11 @@ #include <linux/neighbour.h> #include <uapi/linux/netdevice.h> +#ifdef CONFIG_NET_FLOW_TABLES +#include <linux/if_flow.h> +#include <uapi/linux/if_flow.h> +#endif + struct netpoll_info; struct device; struct phy_device; @@ -1186,6 +1191,13 @@ struct net_device_ops { int (*ndo_switch_port_stp_update)(struct net_device *dev, u8 state); #endif +#ifdef CONFIG_NET_FLOW_TABLES + struct net_flow_action **(*ndo_flow_get_actions)(struct net_device *dev); + struct net_flow_table **(*ndo_flow_get_tables)(struct net_device *dev); + struct net_flow_header **(*ndo_flow_get_headers)(struct net_device *dev); + struct net_flow_hdr_node **(*ndo_flow_get_hdr_graph)(struct net_device *dev); + struct net_flow_tbl_node **(*ndo_flow_get_tbl_graph)(struct net_device *dev); +#endif }; /** diff --git a/include/uapi/linux/if_flow.h b/include/uapi/linux/if_flow.h new file mode 100644 index 0000000..2acdb38 --- /dev/null +++ b/include/uapi/linux/if_flow.h @@ -0,0 +1,363 @@ +/* + * include/uapi/linux/if_flow.h - Flow table interface for Switch devices + * Copyright (c) 2014 John Fastabend <john.r.fastabend@intel.com> + * + * + * This program is free software; you can redistribute it and/or modify it + * under the terms and conditions of the GNU General Public License, + * version 2, as published by the Free Software Foundation. + * + * This program is distributed in the hope it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for + * more details. + * + * The full GNU General Public License is included in this distribution in + * the file called "COPYING". + * + * Author: John Fastabend <john.r.fastabend@intel.com> + */ + +/* Netlink description: + * + * Table definition used to describe running tables. The following + * describes the netlink message returned from a flow API messages. + * + * Flow table definitions used to define tables. + * + * [NET_FLOW_TABLE_IDENTIFIER_TYPE] + * [NET_FLOW_TABLE_IDENTIFIER] + * [NET_FLOW_TABLE_TABLES] + * [NET_FLOW_TABLE] + * [NET_FLOW_TABLE_ATTR_NAME] + * [NET_FLOW_TABLE_ATTR_UID] + * [NET_FLOW_TABLE_ATTR_SOURCE] + * [NET_FLOW_TABLE_ATTR_SIZE] + * [NET_FLOW_TABLE_ATTR_MATCHES] + * [NET_FLOW_FIELD_REF] + * [NET_FLOW_FIELD_REF] + * [...] + * [...] + * [NET_FLOW_TABLE_ATTR_ACTIONS] + * [NET_FLOW_ACTION] + * [NET_FLOW_ACTION_ATTR_NAME] + * [NET_FLOW_ACTION_ATTR_UID] + * [NET_FLOW_ACTION_ATTR_SIGNATURE] + * [NET_FLOW_ACTION_ARG] + * [NET_FLOW_ACTION_ARG] + * [...] + * [NET_FLOW_ACTION] + * [...] + * [...] + * [NET_FLOW_TABLE] + * [...] + * + * Header definitions used to define headers with user friendly + * names. + * + * [NET_FLOW_TABLE_HEADERS] + * [NET_FLOW_HEADER] + * [NET_FLOW_HEADER_ATTR_NAME] + * [NET_FLOW_HEADER_ATTR_UID] + * [NET_FLOW_HEADER_ATTR_FIELDS] + * [NET_FLOW_HEADER_ATTR_FIELD] + * [NET_FLOW_FIELD_ATTR_NAME] + * [NET_FLOW_FIELD_ATTR_UID] + * [NET_FLOW_FIELD_ATTR_BITWIDTH] + * [NET_FLOW_HEADER_ATTR_FIELD] + * [...] + * [...] + * [NET_FLOW_HEADER] + * [...] + * [...] + * + * Action definitions supported by tables + * + * [NET_FLOW_TABLE_ACTIONS] + * [NET_FLOW_TABLE_ATTR_ACTIONS] + * [NET_FLOW_ACTION] + * [NET_FLOW_ACTION_ATTR_NAME] + * [NET_FLOW_ACTION_ATTR_UID] + * [NET_FLOW_ACTION_ATTR_SIGNATURE] + * [NET_FLOW_ACTION_ARG] + * [NET_FLOW_ACTION_ARG] + * [...] + * [NET_FLOW_ACTION] + * [...] + * + * Parser definition used to unambiguously define match headers. + * + * [NET_FLOW_TABLE_PARSE_GRAPH] + * + * Primitive Type descriptions + * + * Get Table Graph <Request> only requires msg preamble. + * + * Get Table Graph <Reply> description + * + * [NET_FLOW_TABLE_TABLE_GRAPH] + * [TABLE_GRAPH_NODE] + * [TABLE_GRAPH_NODE_UID] + * [TABLE_GRAPH_NODE_JUMP] + * [NET_FLOW_JUMP_TABLE_ENTRY] + * [NET_FLOW_JUMP_TABLE_ENTRY] + * [...] + * [TABLE_GRAPH_NODE] + * [..] + */ + +#ifndef _UAPI_LINUX_IF_FLOW +#define _UAPI_LINUX_IF_FLOW + +#include <linux/types.h> +#include <linux/netlink.h> +#include <linux/if.h> + +#define NET_FLOW_NAMSIZ 80 + +/** + * @struct net_flow_fields + * @brief defines a field in a header + */ +struct net_flow_field { + char name[NET_FLOW_NAMSIZ]; + int uid; + int bitwidth; +}; + +enum { + NET_FLOW_FIELD_UNSPEC, + NET_FLOW_FIELD, + __NET_FLOW_FIELD_MAX, +}; +#define NET_FLOW_FIELD_MAX (__NET_FLOW_FIELD_MAX - 1) + +enum { + NET_FLOW_FIELD_ATTR_UNSPEC, + NET_FLOW_FIELD_ATTR_NAME, + NET_FLOW_FIELD_ATTR_UID, + NET_FLOW_FIELD_ATTR_BITWIDTH, + __NET_FLOW_FIELD_ATTR_MAX, +}; +#define NET_FLOW_FIELD_ATTR_MAX (__NET_FLOW_FIELD_ATTR_MAX - 1) + +enum { + NET_FLOW_HEADER_UNSPEC, + NET_FLOW_HEADER, + __NET_FLOW_HEADER_MAX, +}; +#define NET_FLOW_HEADER_MAX (__NET_FLOW_HEADER_MAX - 1) + +enum { + NET_FLOW_HEADER_ATTR_UNSPEC, + NET_FLOW_HEADER_ATTR_NAME, + NET_FLOW_HEADER_ATTR_UID, + NET_FLOW_HEADER_ATTR_FIELDS, + __NET_FLOW_HEADER_ATTR_MAX, +}; +#define NET_FLOW_HEADER_ATTR_MAX (__NET_FLOW_HEADER_ATTR_MAX - 1) + +enum { + NET_FLOW_MASK_TYPE_UNSPEC, + NET_FLOW_MASK_TYPE_EXACT, + NET_FLOW_MASK_TYPE_LPM, +}; + +/** + * @struct net_flow_field_ref + * @brief uniquely identify field as header:field tuple + */ +struct net_flow_field_ref { + int instance; + int header; + int field; + int mask_type; + int type; + union { /* Are these all the required data types */ + __u8 value_u8; + __u16 value_u16; + __u32 value_u32; + __u64 value_u64; + }; + union { /* Are these all the required data types */ + __u8 mask_u8; + __u16 mask_u16; + __u32 mask_u32; + __u64 mask_u64; + }; +}; + +enum { + NET_FLOW_FIELD_REF_UNSPEC, + NET_FLOW_FIELD_REF, + __NET_FLOW_FIELD_REF_MAX, +}; +#define NET_FLOW_FIELD_REF_MAX (__NET_FLOW_FIELD_REF_MAX - 1) + +enum { + NET_FLOW_FIELD_REF_ATTR_TYPE_UNSPEC, + NET_FLOW_FIELD_REF_ATTR_TYPE_U8, + NET_FLOW_FIELD_REF_ATTR_TYPE_U16, + NET_FLOW_FIELD_REF_ATTR_TYPE_U32, + NET_FLOW_FIELD_REF_ATTR_TYPE_U64, + /* Need more types for ether.addrs, ip.addrs, ... */ +}; + +enum net_flow_action_arg_type { + NET_FLOW_ACTION_ARG_TYPE_NULL, + NET_FLOW_ACTION_ARG_TYPE_U8, + NET_FLOW_ACTION_ARG_TYPE_U16, + NET_FLOW_ACTION_ARG_TYPE_U32, + NET_FLOW_ACTION_ARG_TYPE_U64, + __NET_FLOW_ACTION_ARG_TYPE_VAL_MAX, +}; + +struct net_flow_action_arg { + char name[NET_FLOW_NAMSIZ]; + enum net_flow_action_arg_type type; + union { + __u8 value_u8; + __u16 value_u16; + __u32 value_u32; + __u64 value_u64; + }; +}; + +enum { + NET_FLOW_ACTION_ARG_UNSPEC, + NET_FLOW_ACTION_ARG, + __NET_FLOW_ACTION_ARG_MAX, +}; +#define NET_FLOW_ACTION_ARG_MAX (__NET_FLOW_ACTION_ARG_MAX - 1) + +enum { + NET_FLOW_ACTION_UNSPEC, + NET_FLOW_ACTION, + __NET_FLOW_ACTION_MAX, +}; +#define NET_FLOW_ACTION_MAX (__NET_FLOW_ACTION_MAX - 1) + +enum { + NET_FLOW_ACTION_ATTR_UNSPEC, + NET_FLOW_ACTION_ATTR_NAME, + NET_FLOW_ACTION_ATTR_UID, + NET_FLOW_ACTION_ATTR_SIGNATURE, + __NET_FLOW_ACTION_ATTR_MAX, +}; +#define NET_FLOW_ACTION_ATTR_MAX (__NET_FLOW_ACTION_ATTR_MAX - 1) + +enum { + NET_FLOW_ACTION_SET_UNSPEC, + NET_FLOW_ACTION_SET_ACTIONS, + __NET_FLOW_ACTION_SET_MAX, +}; +#define NET_FLOW_ACTION_SET_MAX (__NET_FLOW_ACTION_SET_MAX - 1) + +enum { + NET_FLOW_TABLE_UNSPEC, + NET_FLOW_TABLE, + __NET_FLOW_TABLE_MAX, +}; +#define NET_FLOW_TABLE_MAX (__NET_FLOW_TABLE_MAX - 1) + +enum { + NET_FLOW_TABLE_ATTR_UNSPEC, + NET_FLOW_TABLE_ATTR_NAME, + NET_FLOW_TABLE_ATTR_UID, + NET_FLOW_TABLE_ATTR_SOURCE, + NET_FLOW_TABLE_ATTR_SIZE, + NET_FLOW_TABLE_ATTR_MATCHES, + NET_FLOW_TABLE_ATTR_ACTIONS, + __NET_FLOW_TABLE_ATTR_MAX, +}; +#define NET_FLOW_TABLE_ATTR_MAX (__NET_FLOW_TABLE_ATTR_MAX - 1) + +struct net_flow_jump_table { + struct net_flow_field_ref field; + int node; /* <0 is a parser error */ +}; + +#define NET_FLOW_JUMP_TABLE_DONE -1 + +enum { + NET_FLOW_JUMP_TABLE_ENTRY_UNSPEC, + NET_FLOW_JUMP_TABLE_ENTRY, + __NET_FLOW_JUMP_TABLE_ENTRY_MAX, +}; + +enum { + NET_FLOW_HEADER_NODE_HDRS_UNSPEC, + NET_FLOW_HEADER_NODE_HDRS_VALUE, + __NET_FLOW_HEADER_NODE_HDRS_MAX, +}; +#define NET_FLOW_HEADER_NODE_HDRS_MAX (__NET_FLOW_HEADER_NODE_HDRS_MAX - 1) + +enum { + NET_FLOW_HEADER_NODE_UNSPEC, + NET_FLOW_HEADER_NODE_NAME, + NET_FLOW_HEADER_NODE_UID, + NET_FLOW_HEADER_NODE_HDRS, + NET_FLOW_HEADER_NODE_JUMP, + __NET_FLOW_HEADER_NODE_MAX, +}; +#define NET_FLOW_HEADER_NODE_MAX (__NET_FLOW_HEADER_NODE_MAX - 1) + +enum { + NET_FLOW_HEADER_GRAPH_UNSPEC, + NET_FLOW_HEADER_GRAPH_NODE, + __NET_FLOW_HEADER_GRAPH_MAX, +}; +#define NET_FLOW_HEADER_GRAPH_MAX (__NET_FLOW_HEADER_GRAPH_MAX - 1) + +#define NET_FLOW_TABLE_EGRESS_ROOT 1 +#define NET_FLOW_TABLE_INGRESS_ROOT 2 + +enum { + NET_FLOW_TABLE_GRAPH_NODE_UNSPEC, + NET_FLOW_TABLE_GRAPH_NODE_UID, + NET_FLOW_TABLE_GRAPH_NODE_FLAGS, + NET_FLOW_TABLE_GRAPH_NODE_JUMP, + __NET_FLOW_TABLE_GRAPH_NODE_MAX, +}; +#define NET_FLOW_TABLE_GRAPH_NODE_MAX (__NET_FLOW_TABLE_GRAPH_NODE_MAX - 1) + +enum { + NET_FLOW_TABLE_GRAPH_UNSPEC, + NET_FLOW_TABLE_GRAPH_NODE, + __NET_FLOW_TABLE_GRAPH_MAX, +}; +#define NET_FLOW_TABLE_GRAPH_MAX (__NET_FLOW_TABLE_GRAPH_MAX - 1) + +enum { + NET_FLOW_IDENTIFIER_IFINDEX, /* net_device ifindex */ +}; + +enum { + NET_FLOW_UNSPEC, + NET_FLOW_IDENTIFIER_TYPE, + NET_FLOW_IDENTIFIER, + + NET_FLOW_TABLES, + NET_FLOW_HEADERS, + NET_FLOW_ACTIONS, + NET_FLOW_HEADER_GRAPH, + NET_FLOW_TABLE_GRAPH, + + __NET_FLOW_MAX, + NET_FLOW_MAX = (__NET_FLOW_MAX - 1), +}; + +enum { + NET_FLOW_TABLE_CMD_GET_TABLES, + NET_FLOW_TABLE_CMD_GET_HEADERS, + NET_FLOW_TABLE_CMD_GET_ACTIONS, + NET_FLOW_TABLE_CMD_GET_HDR_GRAPH, + NET_FLOW_TABLE_CMD_GET_TABLE_GRAPH, + + __NET_FLOW_CMD_MAX, + NET_FLOW_CMD_MAX = (__NET_FLOW_CMD_MAX - 1), +}; + +#define NET_FLOW_GENL_NAME "net_flow_table" +#define NET_FLOW_GENL_VERSION 0x1 +#endif /* _UAPI_LINUX_IF_FLOW */ diff --git a/net/Kconfig b/net/Kconfig index ff9ffc1..8380bfe 100644 --- a/net/Kconfig +++ b/net/Kconfig @@ -293,6 +293,13 @@ config NET_FLOW_LIMIT with many clients some protection against DoS by a single (spoofed) flow that greatly exceeds average workload. +config NET_FLOW_TABLES + boolean "Support network flow tables" + ---help--- + This feature provides an interface for device drivers to report + flow tables and supported matches and actions. If you do not + want to support hardware offloads for flow tables, say N here. + menu "Network testing" config NET_PKTGEN diff --git a/net/core/Makefile b/net/core/Makefile index 235e6c5..1eea785 100644 --- a/net/core/Makefile +++ b/net/core/Makefile @@ -23,3 +23,4 @@ obj-$(CONFIG_NETWORK_PHY_TIMESTAMPING) += timestamping.o obj-$(CONFIG_NET_PTP_CLASSIFY) += ptp_classifier.o obj-$(CONFIG_CGROUP_NET_PRIO) += netprio_cgroup.o obj-$(CONFIG_CGROUP_NET_CLASSID) += netclassid_cgroup.o +obj-$(CONFIG_NET_FLOW_TABLES) += flow_table.o diff --git a/net/core/flow_table.c b/net/core/flow_table.c new file mode 100644 index 0000000..ec3f06d --- /dev/null +++ b/net/core/flow_table.c @@ -0,0 +1,837 @@ +/* + * include/uapi/linux/if_flow.h - Flow table interface for Switch devices + * Copyright (c) 2014 John Fastabend <john.r.fastabend@intel.com> + * + * + * This program is free software; you can redistribute it and/or modify it + * under the terms and conditions of the GNU General Public License, + * version 2, as published by the Free Software Foundation. + * + * This program is distributed in the hope it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for + * more details. + * + * The full GNU General Public License is included in this distribution in + * the file called "COPYING". + * + * Author: John Fastabend <john.r.fastabend@intel.com> + */ + +#include <uapi/linux/if_flow.h> +#include <linux/if_flow.h> +#include <linux/if_bridge.h> +#include <linux/types.h> +#include <net/netlink.h> +#include <net/genetlink.h> +#include <net/rtnetlink.h> +#include <linux/module.h> + +static struct genl_family net_flow_nl_family = { + .id = GENL_ID_GENERATE, + .name = NET_FLOW_GENL_NAME, + .version = NET_FLOW_GENL_VERSION, + .maxattr = NET_FLOW_MAX, + .netnsok = true, +}; + +static struct net_device *net_flow_get_dev(struct genl_info *info) +{ + struct net *net = genl_info_net(info); + int type, ifindex; + + if (!info->attrs[NET_FLOW_IDENTIFIER_TYPE] || + !info->attrs[NET_FLOW_IDENTIFIER]) + return NULL; + + type = nla_get_u32(info->attrs[NET_FLOW_IDENTIFIER_TYPE]); + switch (type) { + case NET_FLOW_IDENTIFIER_IFINDEX: + ifindex = nla_get_u32(info->attrs[NET_FLOW_IDENTIFIER]); + break; + default: + return NULL; + } + + return dev_get_by_index(net, ifindex); +} + +static int net_flow_put_act_types(struct sk_buff *skb, + struct net_flow_action_arg *args) +{ + int i, err; + + for (i = 0; args[i].type; i++) { + err = nla_put(skb, NET_FLOW_ACTION_ARG, + sizeof(struct net_flow_action_arg), &args[i]); + if (err) + return -EMSGSIZE; + } + return 0; +} + +static const +struct nla_policy net_flow_action_policy[NET_FLOW_ACTION_ATTR_MAX + 1] = { + [NET_FLOW_ACTION_ATTR_NAME] = {.type = NLA_STRING, + .len = NET_FLOW_NAMSIZ-1 }, + [NET_FLOW_ACTION_ATTR_UID] = {.type = NLA_U32 }, + [NET_FLOW_ACTION_ATTR_SIGNATURE] = {.type = NLA_NESTED }, +}; + +static int net_flow_put_action(struct sk_buff *skb, struct net_flow_action *a) +{ + struct net_flow_action_arg *this; + struct nlattr *nest; + int err, args = 0; + + if (a->name && nla_put_string(skb, NET_FLOW_ACTION_ATTR_NAME, a->name)) + return -EMSGSIZE; + + if (nla_put_u32(skb, NET_FLOW_ACTION_ATTR_UID, a->uid)) + return -EMSGSIZE; + + if (!a->args) + return 0; + + for (this = &a->args[0]; strlen(this->name) > 0; this++) + args++; + + if (args) { + nest = nla_nest_start(skb, NET_FLOW_ACTION_ATTR_SIGNATURE); + if (!nest) + goto nest_put_failure; + + err = net_flow_put_act_types(skb, a->args); + if (err) { + nla_nest_cancel(skb, nest); + return err; + } + nla_nest_end(skb, nest); + } + + return 0; +nest_put_failure: + return -EMSGSIZE; +} + +static int net_flow_put_actions(struct sk_buff *skb, + struct net_flow_action **acts) +{ + struct nlattr *actions; + int err, i; + + actions = nla_nest_start(skb, NET_FLOW_ACTIONS); + if (!actions) + return -EMSGSIZE; + + for (i = 0; acts[i]->uid; i++) { + struct nlattr *action = nla_nest_start(skb, NET_FLOW_ACTION); + + if (!action) + goto action_put_failure; + + err = net_flow_put_action(skb, acts[i]); + if (err) + goto action_put_failure; + nla_nest_end(skb, action); + } + nla_nest_end(skb, actions); + + return 0; +action_put_failure: + nla_nest_cancel(skb, actions); + return -EMSGSIZE; +} + +struct sk_buff *net_flow_build_actions_msg(struct net_flow_action **a, + struct net_device *dev, + u32 portid, int seq, u8 cmd) +{ + struct genlmsghdr *hdr; + struct sk_buff *skb; + int err = -ENOBUFS; + + skb = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL); + if (!skb) + return ERR_PTR(-ENOBUFS); + + hdr = genlmsg_put(skb, portid, seq, &net_flow_nl_family, 0, cmd); + if (!hdr) + goto out; + + if (nla_put_u32(skb, + NET_FLOW_IDENTIFIER_TYPE, + NET_FLOW_IDENTIFIER_IFINDEX) || + nla_put_u32(skb, NET_FLOW_IDENTIFIER, dev->ifindex)) { + err = -ENOBUFS; + goto out; + } + + err = net_flow_put_actions(skb, a); + if (err < 0) + goto out; + + err = genlmsg_end(skb, hdr); + if (err < 0) + goto out; + + return skb; +out: + nlmsg_free(skb); + return ERR_PTR(err); +} + +static int net_flow_cmd_get_actions(struct sk_buff *skb, + struct genl_info *info) +{ + struct net_flow_action **a; + struct net_device *dev; + struct sk_buff *msg; + + dev = net_flow_get_dev(info); + if (!dev) + return -EINVAL; + + if (!dev->netdev_ops->ndo_flow_get_actions) { + dev_put(dev); + return -EOPNOTSUPP; + } + + a = dev->netdev_ops->ndo_flow_get_actions(dev); + if (!a) + return -EBUSY; + + msg = net_flow_build_actions_msg(a, dev, + info->snd_portid, + info->snd_seq, + NET_FLOW_TABLE_CMD_GET_ACTIONS); + dev_put(dev); + + if (IS_ERR(msg)) + return PTR_ERR(msg); + + return genlmsg_reply(msg, info); +} + +static int net_flow_put_table(struct net_device *dev, + struct sk_buff *skb, + struct net_flow_table *t) +{ + struct nlattr *matches, *actions; + int i; + + if (nla_put_string(skb, NET_FLOW_TABLE_ATTR_NAME, t->name) || + nla_put_u32(skb, NET_FLOW_TABLE_ATTR_UID, t->uid) || + nla_put_u32(skb, NET_FLOW_TABLE_ATTR_SOURCE, t->source) || + nla_put_u32(skb, NET_FLOW_TABLE_ATTR_SIZE, t->size)) + return -EMSGSIZE; + + matches = nla_nest_start(skb, NET_FLOW_TABLE_ATTR_MATCHES); + if (!matches) + return -EMSGSIZE; + + for (i = 0; t->matches[i].instance; i++) + nla_put(skb, NET_FLOW_FIELD_REF, + sizeof(struct net_flow_field_ref), + &t->matches[i]); + nla_nest_end(skb, matches); + + actions = nla_nest_start(skb, NET_FLOW_TABLE_ATTR_ACTIONS); + if (!actions) + return -EMSGSIZE; + + for (i = 0; t->actions[i]; i++) { + if (nla_put_u32(skb, + NET_FLOW_ACTION_ATTR_UID, + t->actions[i])) { + nla_nest_cancel(skb, actions); + return -EMSGSIZE; + } + } + nla_nest_end(skb, actions); + + return 0; +} + +static int net_flow_put_tables(struct net_device *dev, + struct sk_buff *skb, + struct net_flow_table **tables) +{ + struct nlattr *nest, *t; + int i, err = 0; + + nest = nla_nest_start(skb, NET_FLOW_TABLES); + if (!nest) + return -EMSGSIZE; + + for (i = 0; tables[i]->uid; i++) { + t = nla_nest_start(skb, NET_FLOW_TABLE); + if (!t) { + err = -EMSGSIZE; + goto errout; + } + + err = net_flow_put_table(dev, skb, tables[i]); + if (err) { + nla_nest_cancel(skb, t); + goto errout; + } + nla_nest_end(skb, t); + } + nla_nest_end(skb, nest); + return 0; +errout: + nla_nest_cancel(skb, nest); + return err; +} + +static struct sk_buff *net_flow_build_tables_msg(struct net_flow_table **t, + struct net_device *dev, + u32 portid, int seq, u8 cmd) +{ + struct genlmsghdr *hdr; + struct sk_buff *skb; + int err = -ENOBUFS; + + skb = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL); + if (!skb) + return ERR_PTR(-ENOBUFS); + + hdr = genlmsg_put(skb, portid, seq, &net_flow_nl_family, 0, cmd); + if (!hdr) + goto out; + + if (nla_put_u32(skb, + NET_FLOW_IDENTIFIER_TYPE, + NET_FLOW_IDENTIFIER_IFINDEX) || + nla_put_u32(skb, NET_FLOW_IDENTIFIER, dev->ifindex)) { + err = -ENOBUFS; + goto out; + } + + err = net_flow_put_tables(dev, skb, t); + if (err < 0) + goto out; + + err = genlmsg_end(skb, hdr); + if (err < 0) + goto out; + + return skb; +out: + nlmsg_free(skb); + return ERR_PTR(err); +} + +static int net_flow_cmd_get_tables(struct sk_buff *skb, + struct genl_info *info) +{ + struct net_flow_table **tables; + struct net_device *dev; + struct sk_buff *msg; + + dev = net_flow_get_dev(info); + if (!dev) + return -EINVAL; + + if (!dev->netdev_ops->ndo_flow_get_tables) { + dev_put(dev); + return -EOPNOTSUPP; + } + + tables = dev->netdev_ops->ndo_flow_get_tables(dev); + if (!tables) /* transient failure should always have some table */ + return -EBUSY; + + msg = net_flow_build_tables_msg(tables, dev, + info->snd_portid, + info->snd_seq, + NET_FLOW_TABLE_CMD_GET_TABLES); + dev_put(dev); + + if (IS_ERR(msg)) + return PTR_ERR(msg); + + return genlmsg_reply(msg, info); +} + +static +int net_flow_put_fields(struct sk_buff *skb, const struct net_flow_header *h) +{ + struct net_flow_field *f; + int count = h->field_sz; + struct nlattr *field; + + for (f = h->fields; count; count--, f++) { + field = nla_nest_start(skb, NET_FLOW_FIELD); + if (!field) + goto field_put_failure; + + if (nla_put_string(skb, NET_FLOW_FIELD_ATTR_NAME, f->name) || + nla_put_u32(skb, NET_FLOW_FIELD_ATTR_UID, f->uid) || + nla_put_u32(skb, NET_FLOW_FIELD_ATTR_BITWIDTH, f->bitwidth)) + goto out; + + nla_nest_end(skb, field); + } + + return 0; +out: + nla_nest_cancel(skb, field); +field_put_failure: + return -EMSGSIZE; +} + +static int net_flow_put_headers(struct sk_buff *skb, + struct net_flow_header **headers) +{ + struct nlattr *nest, *hdr, *fields; + struct net_flow_header *h; + int i, err; + + nest = nla_nest_start(skb, NET_FLOW_HEADERS); + if (!nest) + return -EMSGSIZE; + + for (i = 0; headers[i]->uid; i++) { + err = -EMSGSIZE; + h = headers[i]; + + hdr = nla_nest_start(skb, NET_FLOW_HEADER); + if (!hdr) + goto hdr_put_failure; + + if (nla_put_string(skb, NET_FLOW_HEADER_ATTR_NAME, h->name) || + nla_put_u32(skb, NET_FLOW_HEADER_ATTR_UID, h->uid)) + goto attr_put_failure; + + fields = nla_nest_start(skb, NET_FLOW_HEADER_ATTR_FIELDS); + if (!fields) + goto attr_put_failure; + + err = net_flow_put_fields(skb, h); + if (err) + goto fields_put_failure; + + nla_nest_end(skb, fields); + + nla_nest_end(skb, hdr); + } + nla_nest_end(skb, nest); + + return 0; +fields_put_failure: + nla_nest_cancel(skb, fields); +attr_put_failure: + nla_nest_cancel(skb, hdr); +hdr_put_failure: + nla_nest_cancel(skb, nest); + return err; +} + +static struct sk_buff *net_flow_build_headers_msg(struct net_flow_header **h, + struct net_device *dev, + u32 portid, int seq, u8 cmd) +{ + struct genlmsghdr *hdr; + struct sk_buff *skb; + int err = -ENOBUFS; + + skb = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL); + if (!skb) + return ERR_PTR(-ENOBUFS); + + hdr = genlmsg_put(skb, portid, seq, &net_flow_nl_family, 0, cmd); + if (!hdr) + goto out; + + if (nla_put_u32(skb, + NET_FLOW_IDENTIFIER_TYPE, + NET_FLOW_IDENTIFIER_IFINDEX) || + nla_put_u32(skb, NET_FLOW_IDENTIFIER, dev->ifindex)) { + err = -ENOBUFS; + goto out; + } + + err = net_flow_put_headers(skb, h); + if (err < 0) + goto out; + + err = genlmsg_end(skb, hdr); + if (err < 0) + goto out; + + return skb; +out: + nlmsg_free(skb); + return ERR_PTR(err); +} + +static int net_flow_cmd_get_headers(struct sk_buff *skb, + struct genl_info *info) +{ + struct net_flow_header **h; + struct net_device *dev; + struct sk_buff *msg; + + dev = net_flow_get_dev(info); + if (!dev) + return -EINVAL; + + if (!dev->netdev_ops->ndo_flow_get_headers) { + dev_put(dev); + return -EOPNOTSUPP; + } + + h = dev->netdev_ops->ndo_flow_get_headers(dev); + if (!h) + return -EBUSY; + + msg = net_flow_build_headers_msg(h, dev, + info->snd_portid, + info->snd_seq, + NET_FLOW_TABLE_CMD_GET_HEADERS); + dev_put(dev); + + if (IS_ERR(msg)) + return PTR_ERR(msg); + + return genlmsg_reply(msg, info); +} + +static int net_flow_put_header_node(struct sk_buff *skb, + struct net_flow_hdr_node *node) +{ + struct nlattr *hdrs, *jumps; + int i, err; + + if (nla_put_string(skb, NET_FLOW_HEADER_NODE_NAME, node->name) || + nla_put_u32(skb, NET_FLOW_HEADER_NODE_UID, node->uid)) + return -EMSGSIZE; + + /* Insert the set of headers that get extracted at this node */ + hdrs = nla_nest_start(skb, NET_FLOW_HEADER_NODE_HDRS); + if (!hdrs) + return -EMSGSIZE; + for (i = 0; node->hdrs[i]; i++) { + if (nla_put_u32(skb, NET_FLOW_HEADER_NODE_HDRS_VALUE, + node->hdrs[i])) { + nla_nest_cancel(skb, hdrs); + return -EMSGSIZE; + } + } + nla_nest_end(skb, hdrs); + + /* Then give the jump table to find next header node in graph */ + jumps = nla_nest_start(skb, NET_FLOW_HEADER_NODE_JUMP); + if (!jumps) + return -EMSGSIZE; + + for (i = 0; node->jump[i].node; i++) { + err = nla_put(skb, NET_FLOW_JUMP_TABLE_ENTRY, + sizeof(struct net_flow_jump_table), + &node->jump[i]); + if (err) { + nla_nest_cancel(skb, jumps); + return -EMSGSIZE; + } + } + nla_nest_end(skb, jumps); + + return 0; +} + +static int net_flow_put_header_graph(struct sk_buff *skb, + struct net_flow_hdr_node **g) +{ + struct nlattr *nodes, *node; + int err, i; + + nodes = nla_nest_start(skb, NET_FLOW_HEADER_GRAPH); + if (!nodes) + return -EMSGSIZE; + + for (i = 0; g[i]->uid; i++) { + node = nla_nest_start(skb, NET_FLOW_HEADER_GRAPH_NODE); + if (!node) { + err = -EMSGSIZE; + goto nodes_put_error; + } + + err = net_flow_put_header_node(skb, g[i]); + if (err) + goto node_put_error; + + nla_nest_end(skb, node); + } + + nla_nest_end(skb, nodes); + return 0; +node_put_error: + nla_nest_cancel(skb, node); +nodes_put_error: + nla_nest_cancel(skb, nodes); + return err; +} + +static +struct sk_buff *net_flow_build_header_graph_msg(struct net_flow_hdr_node **g, + struct net_device *dev, + u32 portid, int seq, u8 cmd) +{ + struct genlmsghdr *hdr; + struct sk_buff *skb; + int err = -ENOBUFS; + + skb = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL); + if (!skb) + return ERR_PTR(-ENOBUFS); + + hdr = genlmsg_put(skb, portid, seq, &net_flow_nl_family, 0, cmd); + if (!hdr) + goto out; + + if (nla_put_u32(skb, + NET_FLOW_IDENTIFIER_TYPE, + NET_FLOW_IDENTIFIER_IFINDEX) || + nla_put_u32(skb, NET_FLOW_IDENTIFIER, dev->ifindex)) { + err = -ENOBUFS; + goto out; + } + + err = net_flow_put_header_graph(skb, g); + if (err < 0) + goto out; + + err = genlmsg_end(skb, hdr); + if (err < 0) + goto out; + + return skb; +out: + nlmsg_free(skb); + return ERR_PTR(err); +} + +static int net_flow_cmd_get_header_graph(struct sk_buff *skb, + struct genl_info *info) +{ + struct net_flow_hdr_node **h; + struct net_device *dev; + struct sk_buff *msg; + + dev = net_flow_get_dev(info); + if (!dev) + return -EINVAL; + + if (!dev->netdev_ops->ndo_flow_get_hdr_graph) { + dev_put(dev); + return -EOPNOTSUPP; + } + + h = dev->netdev_ops->ndo_flow_get_hdr_graph(dev); + if (!h) + return -EBUSY; + + msg = net_flow_build_header_graph_msg(h, dev, + info->snd_portid, + info->snd_seq, + NET_FLOW_TABLE_CMD_GET_HDR_GRAPH); + dev_put(dev); + + if (IS_ERR(msg)) + return PTR_ERR(msg); + + return genlmsg_reply(msg, info); +} + +static int net_flow_put_table_node(struct sk_buff *skb, + struct net_flow_tbl_node *node) +{ + struct nlattr *nest, *jump; + int i, err = -EMSGSIZE; + + nest = nla_nest_start(skb, NET_FLOW_TABLE_GRAPH_NODE); + if (!nest) + return err; + + if (nla_put_u32(skb, NET_FLOW_TABLE_GRAPH_NODE_UID, node->uid) || + nla_put_u32(skb, NET_FLOW_TABLE_GRAPH_NODE_FLAGS, node->flags)) + goto node_put_failure; + + jump = nla_nest_start(skb, NET_FLOW_TABLE_GRAPH_NODE_JUMP); + if (!jump) + goto node_put_failure; + + for (i = 0; node->jump[i].node; i++) { + err = nla_put(skb, NET_FLOW_JUMP_TABLE_ENTRY, + sizeof(struct net_flow_jump_table), + &node->jump[i]); + if (err) + goto jump_put_failure; + } + + nla_nest_end(skb, jump); + nla_nest_end(skb, nest); + return 0; +jump_put_failure: + nla_nest_cancel(skb, jump); +node_put_failure: + nla_nest_cancel(skb, nest); + return err; +} + +static int net_flow_put_table_graph(struct sk_buff *skb, + struct net_flow_tbl_node **nodes) +{ + struct nlattr *graph; + int err, i = 0; + + graph = nla_nest_start(skb, NET_FLOW_TABLE_GRAPH); + if (!graph) + return -EMSGSIZE; + + for (i = 0; nodes[i]->uid; i++) { + err = net_flow_put_table_node(skb, nodes[i]); + if (err) { + nla_nest_cancel(skb, graph); + return -EMSGSIZE; + } + } + + nla_nest_end(skb, graph); + return 0; +} + +static +struct sk_buff *net_flow_build_graph_msg(struct net_flow_tbl_node **g, + struct net_device *dev, + u32 portid, int seq, u8 cmd) +{ + struct genlmsghdr *hdr; + struct sk_buff *skb; + int err = -ENOBUFS; + + skb = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL); + if (!skb) + return ERR_PTR(-ENOBUFS); + + hdr = genlmsg_put(skb, portid, seq, &net_flow_nl_family, 0, cmd); + if (!hdr) + goto out; + + if (nla_put_u32(skb, + NET_FLOW_IDENTIFIER_TYPE, + NET_FLOW_IDENTIFIER_IFINDEX) || + nla_put_u32(skb, NET_FLOW_IDENTIFIER, dev->ifindex)) { + err = -ENOBUFS; + goto out; + } + + err = net_flow_put_table_graph(skb, g); + if (err < 0) + goto out; + + err = genlmsg_end(skb, hdr); + if (err < 0) + goto out; + + return skb; +out: + nlmsg_free(skb); + return ERR_PTR(err); +} + +static int net_flow_cmd_get_table_graph(struct sk_buff *skb, + struct genl_info *info) +{ + struct net_flow_tbl_node **g; + struct net_device *dev; + struct sk_buff *msg; + + dev = net_flow_get_dev(info); + if (!dev) + return -EINVAL; + + if (!dev->netdev_ops->ndo_flow_get_tbl_graph) { + dev_put(dev); + return -EOPNOTSUPP; + } + + g = dev->netdev_ops->ndo_flow_get_tbl_graph(dev); + if (!g) + return -EBUSY; + + msg = net_flow_build_graph_msg(g, dev, + info->snd_portid, + info->snd_seq, + NET_FLOW_TABLE_CMD_GET_TABLE_GRAPH); + dev_put(dev); + + if (IS_ERR(msg)) + return PTR_ERR(msg); + + return genlmsg_reply(msg, info); +} + +static const struct nla_policy net_flow_cmd_policy[NET_FLOW_MAX + 1] = { + [NET_FLOW_IDENTIFIER_TYPE] = {.type = NLA_U32, }, + [NET_FLOW_IDENTIFIER] = {.type = NLA_U32, }, + [NET_FLOW_TABLES] = {.type = NLA_NESTED, }, + [NET_FLOW_HEADERS] = {.type = NLA_NESTED, }, + [NET_FLOW_ACTIONS] = {.type = NLA_NESTED, }, + [NET_FLOW_HEADER_GRAPH] = {.type = NLA_NESTED, }, + [NET_FLOW_TABLE_GRAPH] = {.type = NLA_NESTED, }, +}; + +static const struct genl_ops net_flow_table_nl_ops[] = { + { + .cmd = NET_FLOW_TABLE_CMD_GET_TABLES, + .doit = net_flow_cmd_get_tables, + .policy = net_flow_cmd_policy, + .flags = GENL_ADMIN_PERM, + }, + { + .cmd = NET_FLOW_TABLE_CMD_GET_HEADERS, + .doit = net_flow_cmd_get_headers, + .policy = net_flow_cmd_policy, + .flags = GENL_ADMIN_PERM, + }, + { + .cmd = NET_FLOW_TABLE_CMD_GET_ACTIONS, + .doit = net_flow_cmd_get_actions, + .policy = net_flow_cmd_policy, + .flags = GENL_ADMIN_PERM, + }, + { + .cmd = NET_FLOW_TABLE_CMD_GET_HDR_GRAPH, + .doit = net_flow_cmd_get_header_graph, + .policy = net_flow_cmd_policy, + .flags = GENL_ADMIN_PERM, + }, + { + .cmd = NET_FLOW_TABLE_CMD_GET_TABLE_GRAPH, + .doit = net_flow_cmd_get_table_graph, + .policy = net_flow_cmd_policy, + .flags = GENL_ADMIN_PERM, + }, +}; + +static int __init net_flow_nl_module_init(void) +{ + return genl_register_family_with_ops(&net_flow_nl_family, + net_flow_table_nl_ops); +} + +static void net_flow_nl_module_fini(void) +{ + genl_unregister_family(&net_flow_nl_family); +} + +module_init(net_flow_nl_module_init); +module_exit(net_flow_nl_module_fini); + +MODULE_LICENSE("GPL v2"); +MODULE_AUTHOR("John Fastabend <john.r.fastabend@intel.com>"); +MODULE_DESCRIPTION("Netlink interface to Flow Tables"); +MODULE_ALIAS_GENL_FAMILY(NET_FLOW_GENL_NAME); ^ permalink raw reply related [flat|nested] 21+ messages in thread
* Re: [net-next PATCH v1 01/11] net: flow_table: create interface for hw match/action tables 2014-12-31 19:45 ` [net-next PATCH v1 01/11] net: flow_table: create interface for hw match/action tables John Fastabend @ 2014-12-31 20:10 ` John Fastabend 2015-01-04 11:12 ` Thomas Graf 2015-01-06 5:25 ` Scott Feldman 2 siblings, 0 replies; 21+ messages in thread From: John Fastabend @ 2014-12-31 20:10 UTC (permalink / raw) To: tgraf, sfeldma, jiri, jhs, simon.horman; +Cc: netdev, davem, andy On 12/31/2014 11:45 AM, John Fastabend wrote: > Currently, we do not have an interface to query hardware and learn > the capabilities of the device. This makes it very difficult to use > hardware flow tables. > oops missed a few dev_put calls so at least need a new rev for this. I'll wait a few days for feedback though. [...] > + > +static int net_flow_cmd_get_actions(struct sk_buff *skb, > + struct genl_info *info) > +{ > + struct net_flow_action **a; > + struct net_device *dev; > + struct sk_buff *msg; > + > + dev = net_flow_get_dev(info); > + if (!dev) > + return -EINVAL; > + > + if (!dev->netdev_ops->ndo_flow_get_actions) { > + dev_put(dev); > + return -EOPNOTSUPP; > + } > + > + a = dev->netdev_ops->ndo_flow_get_actions(dev); > + if (!a) missing dev_put(dev) here. > + return -EBUSY; > + > + msg = net_flow_build_actions_msg(a, dev, > + info->snd_portid, > + info->snd_seq, > + NET_FLOW_TABLE_CMD_GET_ACTIONS); > + dev_put(dev); > + > + if (IS_ERR(msg)) > + return PTR_ERR(msg); > + > + return genlmsg_reply(msg, info); > +} > + > +static int net_flow_put_table(struct net_device *dev, > + struct sk_buff *skb, > + struct net_flow_table *t) > +{ > + struct nlattr *matches, *actions; > + int i; > + > + if (nla_put_string(skb, NET_FLOW_TABLE_ATTR_NAME, t->name) || > + nla_put_u32(skb, NET_FLOW_TABLE_ATTR_UID, t->uid) || > + nla_put_u32(skb, NET_FLOW_TABLE_ATTR_SOURCE, t->source) || > + nla_put_u32(skb, NET_FLOW_TABLE_ATTR_SIZE, t->size)) > + return -EMSGSIZE; > + > + matches = nla_nest_start(skb, NET_FLOW_TABLE_ATTR_MATCHES); > + if (!matches) > + return -EMSGSIZE; > + > + for (i = 0; t->matches[i].instance; i++) > + nla_put(skb, NET_FLOW_FIELD_REF, > + sizeof(struct net_flow_field_ref), > + &t->matches[i]); need to check the return codes here. > + nla_nest_end(skb, matches); > + > + actions = nla_nest_start(skb, NET_FLOW_TABLE_ATTR_ACTIONS); > + if (!actions) > + return -EMSGSIZE; > + > + for (i = 0; t->actions[i]; i++) { > + if (nla_put_u32(skb, > + NET_FLOW_ACTION_ATTR_UID, > + t->actions[i])) { > + nla_nest_cancel(skb, actions); > + return -EMSGSIZE; > + } remembered to do the check here though ;) > + } > + nla_nest_end(skb, actions); > + > + return 0; > +} > + [...] > + > +static struct sk_buff *net_flow_build_tables_msg(struct net_flow_table **t, > + struct net_device *dev, > + u32 portid, int seq, u8 cmd) > +{ > + struct genlmsghdr *hdr; > + struct sk_buff *skb; > + int err = -ENOBUFS; > + > + skb = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL); > + if (!skb) > + return ERR_PTR(-ENOBUFS); > + > + hdr = genlmsg_put(skb, portid, seq, &net_flow_nl_family, 0, cmd); > + if (!hdr) > + goto out; > + > + if (nla_put_u32(skb, > + NET_FLOW_IDENTIFIER_TYPE, > + NET_FLOW_IDENTIFIER_IFINDEX) || > + nla_put_u32(skb, NET_FLOW_IDENTIFIER, dev->ifindex)) { > + err = -ENOBUFS; > + goto out; > + } > + > + err = net_flow_put_tables(dev, skb, t); > + if (err < 0) > + goto out; > + > + err = genlmsg_end(skb, hdr); > + if (err < 0) > + goto out; > + > + return skb; > +out: > + nlmsg_free(skb); > + return ERR_PTR(err); > +} > + > +static int net_flow_cmd_get_tables(struct sk_buff *skb, > + struct genl_info *info) > +{ > + struct net_flow_table **tables; > + struct net_device *dev; > + struct sk_buff *msg; > + > + dev = net_flow_get_dev(info); > + if (!dev) > + return -EINVAL; > + > + if (!dev->netdev_ops->ndo_flow_get_tables) { > + dev_put(dev); > + return -EOPNOTSUPP; > + } > + > + tables = dev->netdev_ops->ndo_flow_get_tables(dev); > + if (!tables) /* transient failure should always have some table */ need dev_put() > + return -EBUSY; > + > + msg = net_flow_build_tables_msg(tables, dev, > + info->snd_portid, > + info->snd_seq, > + NET_FLOW_TABLE_CMD_GET_TABLES); > + dev_put(dev); > + > + if (IS_ERR(msg)) > + return PTR_ERR(msg); > + > + return genlmsg_reply(msg, info); > +} > + [...] > + > +static int net_flow_put_headers(struct sk_buff *skb, > + struct net_flow_header **headers) > +{ > + struct nlattr *nest, *hdr, *fields; > + struct net_flow_header *h; > + int i, err; > + > + nest = nla_nest_start(skb, NET_FLOW_HEADERS); > + if (!nest) > + return -EMSGSIZE; > + > + for (i = 0; headers[i]->uid; i++) { > + err = -EMSGSIZE; > + h = headers[i]; > + > + hdr = nla_nest_start(skb, NET_FLOW_HEADER); > + if (!hdr) > + goto hdr_put_failure; > + > + if (nla_put_string(skb, NET_FLOW_HEADER_ATTR_NAME, h->name) || > + nla_put_u32(skb, NET_FLOW_HEADER_ATTR_UID, h->uid)) > + goto attr_put_failure; > + > + fields = nla_nest_start(skb, NET_FLOW_HEADER_ATTR_FIELDS); > + if (!fields) > + goto attr_put_failure; > + > + err = net_flow_put_fields(skb, h); > + if (err) > + goto fields_put_failure; > + > + nla_nest_end(skb, fields); > + can remove this new line I think it doesn't add much. > + nla_nest_end(skb, hdr); > + } > + nla_nest_end(skb, nest); > + > + return 0; > +fields_put_failure: > + nla_nest_cancel(skb, fields); > +attr_put_failure: > + nla_nest_cancel(skb, hdr); > +hdr_put_failure: > + nla_nest_cancel(skb, nest); > + return err; > +} > + [...] > + > +static int net_flow_cmd_get_headers(struct sk_buff *skb, > + struct genl_info *info) > +{ > + struct net_flow_header **h; > + struct net_device *dev; > + struct sk_buff *msg; > + > + dev = net_flow_get_dev(info); > + if (!dev) > + return -EINVAL; > + > + if (!dev->netdev_ops->ndo_flow_get_headers) { > + dev_put(dev); > + return -EOPNOTSUPP; > + } > + > + h = dev->netdev_ops->ndo_flow_get_headers(dev); > + if (!h) dev_put again > + return -EBUSY; > + > + msg = net_flow_build_headers_msg(h, dev, > + info->snd_portid, > + info->snd_seq, > + NET_FLOW_TABLE_CMD_GET_HEADERS); > + dev_put(dev); > + > + if (IS_ERR(msg)) > + return PTR_ERR(msg); > + > + return genlmsg_reply(msg, info); > +} > + [...] > + > +static int net_flow_cmd_get_header_graph(struct sk_buff *skb, > + struct genl_info *info) > +{ > + struct net_flow_hdr_node **h; > + struct net_device *dev; > + struct sk_buff *msg; > + > + dev = net_flow_get_dev(info); > + if (!dev) > + return -EINVAL; > + > + if (!dev->netdev_ops->ndo_flow_get_hdr_graph) { > + dev_put(dev); > + return -EOPNOTSUPP; > + } > + > + h = dev->netdev_ops->ndo_flow_get_hdr_graph(dev); > + if (!h) dev_put() seems I copy/pasted the same template for each cmd. > + return -EBUSY; > + > + msg = net_flow_build_header_graph_msg(h, dev, > + info->snd_portid, > + info->snd_seq, > + NET_FLOW_TABLE_CMD_GET_HDR_GRAPH); > + dev_put(dev); > + > + if (IS_ERR(msg)) > + return PTR_ERR(msg); > + > + return genlmsg_reply(msg, info); > +} > + [...] > + > +static int net_flow_cmd_get_table_graph(struct sk_buff *skb, > + struct genl_info *info) > +{ > + struct net_flow_tbl_node **g; > + struct net_device *dev; > + struct sk_buff *msg; > + > + dev = net_flow_get_dev(info); > + if (!dev) > + return -EINVAL; > + > + if (!dev->netdev_ops->ndo_flow_get_tbl_graph) { > + dev_put(dev); > + return -EOPNOTSUPP; > + } > + > + g = dev->netdev_ops->ndo_flow_get_tbl_graph(dev); > + if (!g) dev_put > + return -EBUSY; > + [...] -- John Fastabend Intel Corporation ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [net-next PATCH v1 01/11] net: flow_table: create interface for hw match/action tables 2014-12-31 19:45 ` [net-next PATCH v1 01/11] net: flow_table: create interface for hw match/action tables John Fastabend 2014-12-31 20:10 ` John Fastabend @ 2015-01-04 11:12 ` Thomas Graf 2015-01-05 18:59 ` John Fastabend 2015-01-06 5:25 ` Scott Feldman 2 siblings, 1 reply; 21+ messages in thread From: Thomas Graf @ 2015-01-04 11:12 UTC (permalink / raw) To: John Fastabend; +Cc: sfeldma, jiri, jhs, simon.horman, netdev, davem, andy On 12/31/14 at 11:45am, John Fastabend wrote: Impressive work John, some minor nits below. In general this looks great. How large could tables grow? Any risk one of the nested attribtues could exceed 16K in size because of a very large parse graph? Not a problem if we account for it and allow for jumbo attributes. > + > +/** > + * @struct net_flow_header > + * @brief defines a match (header/field) an endpoint can use > + * > + * @uid unique identifier for header > + * @field_sz number of fields are in the set > + * @fields the set of fields in the net_flow_header FWIW, name is not documented. > + */ > +struct net_flow_header { > + char name[NET_FLOW_NAMSIZ]; > + int uid; > + int field_sz; > + struct net_flow_field *fields; > +}; > + > + > +/** > + * @struct net_flow_table > + * @brief define flow table with supported match/actions > + * > + * @uid unique identifier for table > + * @source uid of parent table > + * @size max number of entries for table or -1 for unbounded > + * @matches null terminated set of supported match types given by match uid > + * @actions null terminated set of supported action types given by action uid > + * @flows set of flows name not documented, flows seems to be leftover > + */ > +struct net_flow_table { > + char name[NET_FLOW_NAMSIZ]; > + int uid; > + int source; > + int size; > + struct net_flow_field_ref *matches; > + int *actions; > +}; > + > +/* net_flow_hdr_node: node in a header graph of header fields. > + * > + * @uid : unique id of the graph node > + * @flwo_header_ref : identify the hdrs that can handled by this node > + * @net_flow_jump_table : give a case jump statement > + */ needs more work too ;) > +struct net_flow_hdr_node { > + char name[NET_FLOW_NAMSIZ]; > + int uid; > + int *hdrs; > + struct net_flow_jump_table *jump; > +}; > + */ > + > +/* Netlink description: > + * > + * Table definition used to describe running tables. The following > + * describes the netlink message returned from a flow API messages. > + * > + * Flow table definitions used to define tables. > + * > + * [NET_FLOW_TABLE_IDENTIFIER_TYPE] > + * [NET_FLOW_TABLE_IDENTIFIER] > + * [NET_FLOW_TABLE_TABLES] > + * [NET_FLOW_TABLE] > + * [NET_FLOW_TABLE_ATTR_NAME] > + * [NET_FLOW_TABLE_ATTR_UID] > + * [NET_FLOW_TABLE_ATTR_SOURCE] > + * [NET_FLOW_TABLE_ATTR_SIZE] > + * [NET_FLOW_TABLE_ATTR_MATCHES] The tabs and spaces mix make the indentation wrong in the patch, it looks correct unquoted though but consistency would make this perfect. > +#ifndef _UAPI_LINUX_IF_FLOW > +#define _UAPI_LINUX_IF_FLOW > + > +#include <linux/types.h> > +#include <linux/netlink.h> > +#include <linux/if.h> > + > +#define NET_FLOW_NAMSIZ 80 Did you consider allocating the memory for names? I don't have a grasp for the typical number of net_flow_* instances in memory yet. > +/** > + * @struct net_flow_field_ref > + * @brief uniquely identify field as header:field tuple > + */ > +struct net_flow_field_ref { > + int instance; > + int header; > + int field; > + int mask_type; > + int type; > + union { /* Are these all the required data types */ > + __u8 value_u8; > + __u16 value_u16; > + __u32 value_u32; > + __u64 value_u64; > + }; > + union { /* Are these all the required data types */ > + __u8 mask_u8; > + __u16 mask_u16; > + __u32 mask_u32; > + __u64 mask_u64; > + }; > +}; Does it make sense to write this as follows? union { struct { __u8 value_u8; __u8 mask_u8; }; struct { __u16 value_u16; __u16 mask_u16; }; ... }; > +#define NET_FLOW_TABLE_EGRESS_ROOT 1 > +#define NET_FLOW_TABLE_INGRESS_ROOT 2 Tab/space mix. > +struct sk_buff *net_flow_build_actions_msg(struct net_flow_action **a, > + struct net_device *dev, > + u32 portid, int seq, u8 cmd) > +{ > + struct genlmsghdr *hdr; > + struct sk_buff *skb; > + int err = -ENOBUFS; > + > + skb = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL); genlmsg_new(GENLMSG_DEFAULT_SIZE, GFP_KERNEL); > +static int net_flow_put_table(struct net_device *dev, > + struct sk_buff *skb, > + struct net_flow_table *t) > +{ > + struct nlattr *matches, *actions; > + int i; > + > + if (nla_put_string(skb, NET_FLOW_TABLE_ATTR_NAME, t->name) || > + nla_put_u32(skb, NET_FLOW_TABLE_ATTR_UID, t->uid) || > + nla_put_u32(skb, NET_FLOW_TABLE_ATTR_SOURCE, t->source) || > + nla_put_u32(skb, NET_FLOW_TABLE_ATTR_SIZE, t->size)) > + return -EMSGSIZE; > + > + matches = nla_nest_start(skb, NET_FLOW_TABLE_ATTR_MATCHES); > + if (!matches) > + return -EMSGSIZE; > + > + for (i = 0; t->matches[i].instance; i++) > + nla_put(skb, NET_FLOW_FIELD_REF, > + sizeof(struct net_flow_field_ref), > + &t->matches[i]); Unhandled nla_put() error > +static struct sk_buff *net_flow_build_tables_msg(struct net_flow_table **t, > + struct net_device *dev, > + u32 portid, int seq, u8 cmd) > +{ > + struct genlmsghdr *hdr; > + struct sk_buff *skb; > + int err = -ENOBUFS; > + > + skb = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL); genlmsg_new(GENLMSG_DEFAULT_SIZE, GFP_KERNEL); > +static int net_flow_put_headers(struct sk_buff *skb, > + struct net_flow_header **headers) > +{ > + struct nlattr *nest, *hdr, *fields; > + struct net_flow_header *h; > + int i, err; > + > + nest = nla_nest_start(skb, NET_FLOW_HEADERS); > + if (!nest) > + return -EMSGSIZE; > + > + for (i = 0; headers[i]->uid; i++) { > + err = -EMSGSIZE; > + h = headers[i]; > + > + hdr = nla_nest_start(skb, NET_FLOW_HEADER); > + if (!hdr) > + goto hdr_put_failure; > + > + if (nla_put_string(skb, NET_FLOW_HEADER_ATTR_NAME, h->name) || > + nla_put_u32(skb, NET_FLOW_HEADER_ATTR_UID, h->uid)) > + goto attr_put_failure; > + > + fields = nla_nest_start(skb, NET_FLOW_HEADER_ATTR_FIELDS); > + if (!fields) > + goto attr_put_failure; You can jump to hdr_put_failure right away and get rid of the attr_put_failure target as you cancel that nest anyway. You can apply this comment to several other places as well if you want. > + > + err = net_flow_put_fields(skb, h); > + if (err) > + goto fields_put_failure; > + > + nla_nest_end(skb, fields); > + > + nla_nest_end(skb, hdr); > + } > + nla_nest_end(skb, nest); > + > + return 0; > +fields_put_failure: > + nla_nest_cancel(skb, fields); > +attr_put_failure: > + nla_nest_cancel(skb, hdr); > +hdr_put_failure: > + nla_nest_cancel(skb, nest); > + return err; > +} > + > +static struct sk_buff *net_flow_build_headers_msg(struct net_flow_header **h, > + struct net_device *dev, > + u32 portid, int seq, u8 cmd) > +{ > + struct genlmsghdr *hdr; > + struct sk_buff *skb; > + int err = -ENOBUFS; > + > + skb = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL); > genlmsg_new(GENLMSG_DEFAULT_SIZE, GFP_KERNEL); > +static > +struct sk_buff *net_flow_build_graph_msg(struct net_flow_tbl_node **g, > + struct net_device *dev, > + u32 portid, int seq, u8 cmd) > +{ > + struct genlmsghdr *hdr; > + struct sk_buff *skb; > + int err = -ENOBUFS; > + > + skb = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL); > genlmsg_new(GENLMSG_DEFAULT_SIZE, GFP_KERNEL); ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [net-next PATCH v1 01/11] net: flow_table: create interface for hw match/action tables 2015-01-04 11:12 ` Thomas Graf @ 2015-01-05 18:59 ` John Fastabend 2015-01-05 21:48 ` Thomas Graf ` (3 more replies) 0 siblings, 4 replies; 21+ messages in thread From: John Fastabend @ 2015-01-05 18:59 UTC (permalink / raw) To: Thomas Graf; +Cc: sfeldma, jiri, jhs, simon.horman, netdev, davem, andy On 01/04/2015 03:12 AM, Thomas Graf wrote: > On 12/31/14 at 11:45am, John Fastabend wrote: > > Impressive work John, some minor nits below. In general this looks > great. How large could tables grow? Any risk one of the nested > attribtues could exceed 16K in size because of a very large parse > graph? Not a problem if we account for it and allow for jumbo > attributes. > hmm it sounds large to me but maybe if you have an NPU that is trying to parse into application data it could happen. What does it take to allow for jumbo attributes? >> + >> +/** >> + * @struct net_flow_header >> + * @brief defines a match (header/field) an endpoint can use >> + * >> + * @uid unique identifier for header >> + * @field_sz number of fields are in the set >> + * @fields the set of fields in the net_flow_header > > FWIW, name is not documented. thanks fixed up documentation and spacing for v2. [...] >> +#ifndef _UAPI_LINUX_IF_FLOW >> +#define _UAPI_LINUX_IF_FLOW >> + >> +#include <linux/types.h> >> +#include <linux/netlink.h> >> +#include <linux/if.h> >> + >> +#define NET_FLOW_NAMSIZ 80 > > Did you consider allocating the memory for names? I don't have a grasp > for the typical number of net_flow_* instances in memory yet. > <100k in the devices I have. Maybe Simon can pitch in what is typical on the NPUs I'm not sure about them. Rocker tables can grow as large as needed at the moment. Allocating the memory may help I'll go ahead and give it a try. >> +/** >> + * @struct net_flow_field_ref >> + * @brief uniquely identify field as header:field tuple >> + */ >> +struct net_flow_field_ref { >> + int instance; >> + int header; >> + int field; >> + int mask_type; >> + int type; >> + union { /* Are these all the required data types */ >> + __u8 value_u8; >> + __u16 value_u16; >> + __u32 value_u32; >> + __u64 value_u64; >> + }; >> + union { /* Are these all the required data types */ >> + __u8 mask_u8; >> + __u16 mask_u16; >> + __u32 mask_u32; >> + __u64 mask_u64; >> + }; >> +}; > > Does it make sense to write this as follows? Yes. I'll make this update it helps make it clear value/mask pairs are needed. > > union { > struct { > __u8 value_u8; > __u8 mask_u8; > }; > struct { > __u16 value_u16; > __u16 mask_u16; > }; > ... > }; > >> +#define NET_FLOW_TABLE_EGRESS_ROOT 1 >> +#define NET_FLOW_TABLE_INGRESS_ROOT 2 > > Tab/space mix. > yep fixed. >> +struct sk_buff *net_flow_build_actions_msg(struct net_flow_action **a, >> + struct net_device *dev, >> + u32 portid, int seq, u8 cmd) >> +{ >> + struct genlmsghdr *hdr; >> + struct sk_buff *skb; >> + int err = -ENOBUFS; >> + >> + skb = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL); > > genlmsg_new(GENLMSG_DEFAULT_SIZE, GFP_KERNEL); fixed along with the other cases. > >> +static int net_flow_put_table(struct net_device *dev, >> + struct sk_buff *skb, >> + struct net_flow_table *t) >> +{ >> + struct nlattr *matches, *actions; >> + int i; >> + >> + if (nla_put_string(skb, NET_FLOW_TABLE_ATTR_NAME, t->name) || >> + nla_put_u32(skb, NET_FLOW_TABLE_ATTR_UID, t->uid) || >> + nla_put_u32(skb, NET_FLOW_TABLE_ATTR_SOURCE, t->source) || >> + nla_put_u32(skb, NET_FLOW_TABLE_ATTR_SIZE, t->size)) >> + return -EMSGSIZE; >> + >> + matches = nla_nest_start(skb, NET_FLOW_TABLE_ATTR_MATCHES); >> + if (!matches) >> + return -EMSGSIZE; >> + >> + for (i = 0; t->matches[i].instance; i++) >> + nla_put(skb, NET_FLOW_FIELD_REF, >> + sizeof(struct net_flow_field_ref), >> + &t->matches[i]); > > Unhandled nla_put() error > thanks. [...] >> +static int net_flow_put_headers(struct sk_buff *skb, >> + struct net_flow_header **headers) >> +{ >> + struct nlattr *nest, *hdr, *fields; >> + struct net_flow_header *h; >> + int i, err; >> + >> + nest = nla_nest_start(skb, NET_FLOW_HEADERS); >> + if (!nest) >> + return -EMSGSIZE; >> + >> + for (i = 0; headers[i]->uid; i++) { >> + err = -EMSGSIZE; >> + h = headers[i]; >> + >> + hdr = nla_nest_start(skb, NET_FLOW_HEADER); >> + if (!hdr) >> + goto hdr_put_failure; >> + >> + if (nla_put_string(skb, NET_FLOW_HEADER_ATTR_NAME, h->name) || >> + nla_put_u32(skb, NET_FLOW_HEADER_ATTR_UID, h->uid)) >> + goto attr_put_failure; >> + >> + fields = nla_nest_start(skb, NET_FLOW_HEADER_ATTR_FIELDS); >> + if (!fields) >> + goto attr_put_failure; > > You can jump to hdr_put_failure right away and get rid of the > attr_put_failure target as you cancel that nest anyway. You can apply > this comment to several other places as well if you want. > OK so to simplify the error paths we only need to cancel the outer most nested attribute. I'll do this transformation. .John -- John Fastabend Intel Corporation ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [net-next PATCH v1 01/11] net: flow_table: create interface for hw match/action tables 2015-01-05 18:59 ` John Fastabend @ 2015-01-05 21:48 ` Thomas Graf 2015-01-05 23:29 ` John Fastabend ` (2 subsequent siblings) 3 siblings, 0 replies; 21+ messages in thread From: Thomas Graf @ 2015-01-05 21:48 UTC (permalink / raw) To: John Fastabend; +Cc: sfeldma, jiri, jhs, simon.horman, netdev, davem, andy On 01/05/15 at 10:59am, John Fastabend wrote: > On 01/04/2015 03:12 AM, Thomas Graf wrote: > >On 12/31/14 at 11:45am, John Fastabend wrote: > > > >Impressive work John, some minor nits below. In general this looks > >great. How large could tables grow? Any risk one of the nested > >attribtues could exceed 16K in size because of a very large parse > >graph? Not a problem if we account for it and allow for jumbo > >attributes. > > > > hmm it sounds large to me but maybe if you have an NPU that is trying > to parse into application data it could happen. > > What does it take to allow for jumbo attributes? We basically need to make user space aware of a new nlattr header to be expected for certain attributes. We can reserve the 2nd bit of the type to indicate a 32bit length field following the current header. We can only do this for new attributes as its not backwards compatible so we need to think about this before we start exposing them. I can send a patch introducing them in the next few days if you want as it seems you'll have to respin this again anyway. > >You can jump to hdr_put_failure right away and get rid of the > >attr_put_failure target as you cancel that nest anyway. You can apply > >this comment to several other places as well if you want. > > > > OK so to simplify the error paths we only need to cancel the outer most > nested attribute. I'll do this transformation. It's a matter of style. I'm fine either way. Personally I prefer the single abort error target. ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [net-next PATCH v1 01/11] net: flow_table: create interface for hw match/action tables 2015-01-05 18:59 ` John Fastabend 2015-01-05 21:48 ` Thomas Graf @ 2015-01-05 23:29 ` John Fastabend 2015-01-06 0:45 ` John Fastabend 2015-01-07 10:07 ` Or Gerlitz 3 siblings, 0 replies; 21+ messages in thread From: John Fastabend @ 2015-01-05 23:29 UTC (permalink / raw) To: Thomas Graf; +Cc: sfeldma, jiri, jhs, simon.horman, netdev, davem, andy On 01/05/2015 10:59 AM, John Fastabend wrote: [...] >>> +#ifndef _UAPI_LINUX_IF_FLOW >>> +#define _UAPI_LINUX_IF_FLOW >>> + >>> +#include <linux/types.h> >>> +#include <linux/netlink.h> >>> +#include <linux/if.h> >>> + >>> +#define NET_FLOW_NAMSIZ 80 >> >> Did you consider allocating the memory for names? I don't have a grasp >> for the typical number of net_flow_* instances in memory yet. >> > > <100k in the devices I have. Maybe Simon can pitch in what is typical > on the NPUs I'm not sure about them. > > Rocker tables can grow as large as needed at the moment. > > Allocating the memory may help I'll go ahead and give it a try. > One issue with breaking this is up is a couple structures are being passed as attributes with name[] as a field. I think its best to break these up passing empty arrays seems to be ugly at best. So I'll need to adjust some of the messaging as well. -- John Fastabend Intel Corporation ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [net-next PATCH v1 01/11] net: flow_table: create interface for hw match/action tables 2015-01-05 18:59 ` John Fastabend 2015-01-05 21:48 ` Thomas Graf 2015-01-05 23:29 ` John Fastabend @ 2015-01-06 0:45 ` John Fastabend 2015-01-06 1:09 ` Simon Horman 2015-01-07 10:07 ` Or Gerlitz 3 siblings, 1 reply; 21+ messages in thread From: John Fastabend @ 2015-01-06 0:45 UTC (permalink / raw) To: Thomas Graf; +Cc: sfeldma, jiri, jhs, simon.horman, netdev, davem, andy [...] >>> +/** >>> + * @struct net_flow_field_ref >>> + * @brief uniquely identify field as header:field tuple >>> + */ >>> +struct net_flow_field_ref { >>> + int instance; >>> + int header; >>> + int field; >>> + int mask_type; >>> + int type; >>> + union { /* Are these all the required data types */ >>> + __u8 value_u8; >>> + __u16 value_u16; >>> + __u32 value_u32; >>> + __u64 value_u64; >>> + }; >>> + union { /* Are these all the required data types */ >>> + __u8 mask_u8; >>> + __u16 mask_u16; >>> + __u32 mask_u32; >>> + __u64 mask_u64; >>> + }; >>> +}; >> >> Does it make sense to write this as follows? > > Yes. I'll make this update it helps make it clear value/mask pairs are > needed. > >> >> union { >> struct { >> __u8 value_u8; >> __u8 mask_u8; >> }; >> struct { >> __u16 value_u16; >> __u16 mask_u16; >> }; >> ... >> }; Another thought is to pull this entirely out of the structure and hide it from the UAPI so we can add more value/mask types as needed without having to spin versions of net_flow_field_ref. On the other hand I've been able to fit all my fields in these types so far and I can't think of any additions we need at the moment. >> >>> +#define NET_FLOW_TABLE_EGRESS_ROOT 1 >>> +#define NET_FLOW_TABLE_INGRESS_ROOT 2 >> >> Tab/space mix. >> > [...] -- John Fastabend Intel Corporation ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [net-next PATCH v1 01/11] net: flow_table: create interface for hw match/action tables 2015-01-06 0:45 ` John Fastabend @ 2015-01-06 1:09 ` Simon Horman 2015-01-06 1:19 ` John Fastabend 0 siblings, 1 reply; 21+ messages in thread From: Simon Horman @ 2015-01-06 1:09 UTC (permalink / raw) To: John Fastabend; +Cc: Thomas Graf, sfeldma, jiri, jhs, netdev, davem, andy On Mon, Jan 05, 2015 at 04:45:50PM -0800, John Fastabend wrote: > [...] > > >>>+/** > >>>+ * @struct net_flow_field_ref > >>>+ * @brief uniquely identify field as header:field tuple > >>>+ */ > >>>+struct net_flow_field_ref { > >>>+ int instance; > >>>+ int header; > >>>+ int field; > >>>+ int mask_type; > >>>+ int type; > >>>+ union { /* Are these all the required data types */ > >>>+ __u8 value_u8; > >>>+ __u16 value_u16; > >>>+ __u32 value_u32; > >>>+ __u64 value_u64; > >>>+ }; > >>>+ union { /* Are these all the required data types */ > >>>+ __u8 mask_u8; > >>>+ __u16 mask_u16; > >>>+ __u32 mask_u32; > >>>+ __u64 mask_u64; > >>>+ }; > >>>+}; > >> > >>Does it make sense to write this as follows? > > > >Yes. I'll make this update it helps make it clear value/mask pairs are > >needed. > > > >> > >>union { > >> struct { > >> __u8 value_u8; > >> __u8 mask_u8; > >> }; > >> struct { > >> __u16 value_u16; > >> __u16 mask_u16; > >> }; > >> ... > >>}; > > Another thought is to pull this entirely out of the structure and hide > it from the UAPI so we can add more value/mask types as needed without > having to spin versions of net_flow_field_ref. On the other hand I've > been able to fit all my fields in these types so far and I can't think > of any additions we need at the moment. FWIW, I think it would be cleaner to break both field_ref and action_args out into attributes and not expose the structures to user-space. But perhaps there is an advantage to dealing with structures directly that I am missing. ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [net-next PATCH v1 01/11] net: flow_table: create interface for hw match/action tables 2015-01-06 1:09 ` Simon Horman @ 2015-01-06 1:19 ` John Fastabend 2015-01-06 2:05 ` Simon Horman 0 siblings, 1 reply; 21+ messages in thread From: John Fastabend @ 2015-01-06 1:19 UTC (permalink / raw) To: Simon Horman; +Cc: Thomas Graf, sfeldma, jiri, jhs, netdev, davem, andy On 01/05/2015 05:09 PM, Simon Horman wrote: > On Mon, Jan 05, 2015 at 04:45:50PM -0800, John Fastabend wrote: >> [...] >> >>>>> +/** >>>>> + * @struct net_flow_field_ref >>>>> + * @brief uniquely identify field as header:field tuple >>>>> + */ >>>>> +struct net_flow_field_ref { >>>>> + int instance; >>>>> + int header; >>>>> + int field; >>>>> + int mask_type; >>>>> + int type; >>>>> + union { /* Are these all the required data types */ >>>>> + __u8 value_u8; >>>>> + __u16 value_u16; >>>>> + __u32 value_u32; >>>>> + __u64 value_u64; >>>>> + }; >>>>> + union { /* Are these all the required data types */ >>>>> + __u8 mask_u8; >>>>> + __u16 mask_u16; >>>>> + __u32 mask_u32; >>>>> + __u64 mask_u64; >>>>> + }; >>>>> +}; >>>> >>>> Does it make sense to write this as follows? >>> >>> Yes. I'll make this update it helps make it clear value/mask pairs are >>> needed. >>> >>>> >>>> union { >>>> struct { >>>> __u8 value_u8; >>>> __u8 mask_u8; >>>> }; >>>> struct { >>>> __u16 value_u16; >>>> __u16 mask_u16; >>>> }; >>>> ... >>>> }; >> >> Another thought is to pull this entirely out of the structure and hide >> it from the UAPI so we can add more value/mask types as needed without >> having to spin versions of net_flow_field_ref. On the other hand I've >> been able to fit all my fields in these types so far and I can't think >> of any additions we need at the moment. > > FWIW, I think it would be cleaner to break both field_ref and action_args > out into attributes and not expose the structures to user-space. But > perhaps there is an advantage to dealing with structures directly that > I am missing. > I came to the same conclusion just now as well. I'm reworking it now for v2. -- John Fastabend Intel Corporation ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [net-next PATCH v1 01/11] net: flow_table: create interface for hw match/action tables 2015-01-06 1:19 ` John Fastabend @ 2015-01-06 2:05 ` Simon Horman 2015-01-06 2:54 ` Simon Horman 0 siblings, 1 reply; 21+ messages in thread From: Simon Horman @ 2015-01-06 2:05 UTC (permalink / raw) To: John Fastabend; +Cc: Thomas Graf, sfeldma, jiri, jhs, netdev, davem, andy On Mon, Jan 05, 2015 at 05:19:26PM -0800, John Fastabend wrote: > On 01/05/2015 05:09 PM, Simon Horman wrote: > >On Mon, Jan 05, 2015 at 04:45:50PM -0800, John Fastabend wrote: > >>[...] > >> > >>>>>+/** > >>>>>+ * @struct net_flow_field_ref > >>>>>+ * @brief uniquely identify field as header:field tuple > >>>>>+ */ > >>>>>+struct net_flow_field_ref { > >>>>>+ int instance; > >>>>>+ int header; > >>>>>+ int field; > >>>>>+ int mask_type; > >>>>>+ int type; > >>>>>+ union { /* Are these all the required data types */ > >>>>>+ __u8 value_u8; > >>>>>+ __u16 value_u16; > >>>>>+ __u32 value_u32; > >>>>>+ __u64 value_u64; > >>>>>+ }; > >>>>>+ union { /* Are these all the required data types */ > >>>>>+ __u8 mask_u8; > >>>>>+ __u16 mask_u16; > >>>>>+ __u32 mask_u32; > >>>>>+ __u64 mask_u64; > >>>>>+ }; > >>>>>+}; > >>>> > >>>>Does it make sense to write this as follows? > >>> > >>>Yes. I'll make this update it helps make it clear value/mask pairs are > >>>needed. > >>> > >>>> > >>>>union { > >>>> struct { > >>>> __u8 value_u8; > >>>> __u8 mask_u8; > >>>> }; > >>>> struct { > >>>> __u16 value_u16; > >>>> __u16 mask_u16; > >>>> }; > >>>> ... > >>>>}; > >> > >>Another thought is to pull this entirely out of the structure and hide > >>it from the UAPI so we can add more value/mask types as needed without > >>having to spin versions of net_flow_field_ref. On the other hand I've > >>been able to fit all my fields in these types so far and I can't think > >>of any additions we need at the moment. > > > >FWIW, I think it would be cleaner to break both field_ref and action_args > >out into attributes and not expose the structures to user-space. But > >perhaps there is an advantage to dealing with structures directly that > >I am missing. > > > > I came to the same conclusion just now as well. I'm reworking it now > for v2. Thanks. BTW, I think there are a few problems with net_flow_put_flow_action(). I am not quite to the bottom of it but it seems that: * It loops over a->args[i] and then calls net_flow_put_act_types() which performs a similar loop. This outer-loop appears to be incorrect. * It passes a[i].args instead of a->args[i] to net_flow_put_act_types() I can post a fix once I've got it working to my satisfaction. But if you are reworking that code anyway perhaps it is easier for you to handle it then. ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [net-next PATCH v1 01/11] net: flow_table: create interface for hw match/action tables 2015-01-06 2:05 ` Simon Horman @ 2015-01-06 2:54 ` Simon Horman 2015-01-06 3:31 ` John Fastabend 0 siblings, 1 reply; 21+ messages in thread From: Simon Horman @ 2015-01-06 2:54 UTC (permalink / raw) To: John Fastabend; +Cc: Thomas Graf, sfeldma, jiri, jhs, netdev, davem, andy On Tue, Jan 06, 2015 at 11:05:14AM +0900, Simon Horman wrote: > On Mon, Jan 05, 2015 at 05:19:26PM -0800, John Fastabend wrote: > > On 01/05/2015 05:09 PM, Simon Horman wrote: > > >On Mon, Jan 05, 2015 at 04:45:50PM -0800, John Fastabend wrote: > > >>[...] > > >> > > >>>>>+/** > > >>>>>+ * @struct net_flow_field_ref > > >>>>>+ * @brief uniquely identify field as header:field tuple > > >>>>>+ */ > > >>>>>+struct net_flow_field_ref { > > >>>>>+ int instance; > > >>>>>+ int header; > > >>>>>+ int field; > > >>>>>+ int mask_type; > > >>>>>+ int type; > > >>>>>+ union { /* Are these all the required data types */ > > >>>>>+ __u8 value_u8; > > >>>>>+ __u16 value_u16; > > >>>>>+ __u32 value_u32; > > >>>>>+ __u64 value_u64; > > >>>>>+ }; > > >>>>>+ union { /* Are these all the required data types */ > > >>>>>+ __u8 mask_u8; > > >>>>>+ __u16 mask_u16; > > >>>>>+ __u32 mask_u32; > > >>>>>+ __u64 mask_u64; > > >>>>>+ }; > > >>>>>+}; > > >>>> > > >>>>Does it make sense to write this as follows? > > >>> > > >>>Yes. I'll make this update it helps make it clear value/mask pairs are > > >>>needed. > > >>> > > >>>> > > >>>>union { > > >>>> struct { > > >>>> __u8 value_u8; > > >>>> __u8 mask_u8; > > >>>> }; > > >>>> struct { > > >>>> __u16 value_u16; > > >>>> __u16 mask_u16; > > >>>> }; > > >>>> ... > > >>>>}; > > >> > > >>Another thought is to pull this entirely out of the structure and hide > > >>it from the UAPI so we can add more value/mask types as needed without > > >>having to spin versions of net_flow_field_ref. On the other hand I've > > >>been able to fit all my fields in these types so far and I can't think > > >>of any additions we need at the moment. > > > > > >FWIW, I think it would be cleaner to break both field_ref and action_args > > >out into attributes and not expose the structures to user-space. But > > >perhaps there is an advantage to dealing with structures directly that > > >I am missing. > > > > > > > I came to the same conclusion just now as well. I'm reworking it now > > for v2. > > Thanks. > > BTW, I think there are a few problems with net_flow_put_flow_action(). > > I am not quite to the bottom of it but it seems that: > * It loops over a->args[i] and then calls net_flow_put_act_types() > which performs a similar loop. This outer-loop appears to be incorrect. > * It passes a[i].args instead of a->args[i] to net_flow_put_act_types() > > I can post a fix once I've got it working to my satisfaction. > But if you are reworking that code anyway perhaps it is easier for > you to handle it then. FWIW this got the current scheme working for me: diff --git a/net/core/flow_table.c b/net/core/flow_table.c index 5dbdc13..598afa2 100644 --- a/net/core/flow_table.c +++ b/net/core/flow_table.c @@ -946,7 +946,7 @@ static int net_flow_put_flow_action(struct sk_buff *skb, struct net_flow_action *a) { struct nlattr *action, *sigs; - int i, err = 0; + int err = 0; action = nla_nest_start(skb, NET_FLOW_ACTION); if (!action) @@ -958,21 +958,19 @@ static int net_flow_put_flow_action(struct sk_buff *skb, if (!a->args) goto done; - for (i = 0; a->args[i].type; i++) { - sigs = nla_nest_start(skb, NET_FLOW_ACTION_ATTR_SIGNATURE); - if (!sigs) { - nla_nest_cancel(skb, action); - return -EMSGSIZE; - } + sigs = nla_nest_start(skb, NET_FLOW_ACTION_ATTR_SIGNATURE); + if (!sigs) { + nla_nest_cancel(skb, action); + return -EMSGSIZE; + } - err = net_flow_put_act_types(skb, a[i].args); - if (err) { - nla_nest_cancel(skb, sigs); - nla_nest_cancel(skb, action); - return err; - } - nla_nest_end(skb, sigs); + err = net_flow_put_act_types(skb, a->args); + if (err) { + nla_nest_cancel(skb, sigs); + nla_nest_cancel(skb, action); + return err; } + nla_nest_end(skb, sigs); done: nla_nest_end(skb, action); @@ -1103,6 +1101,7 @@ static int net_flow_get_action(struct net_flow_action *a, struct nlattr *attr) } a->args[count] = *(struct net_flow_action_arg *)nla_data(args); + count++; } return 0; } ^ permalink raw reply related [flat|nested] 21+ messages in thread
* Re: [net-next PATCH v1 01/11] net: flow_table: create interface for hw match/action tables 2015-01-06 2:54 ` Simon Horman @ 2015-01-06 3:31 ` John Fastabend 0 siblings, 0 replies; 21+ messages in thread From: John Fastabend @ 2015-01-06 3:31 UTC (permalink / raw) To: Simon Horman; +Cc: Thomas Graf, sfeldma, jiri, jhs, netdev, davem, andy [...] >> >> BTW, I think there are a few problems with net_flow_put_flow_action(). >> >> I am not quite to the bottom of it but it seems that: >> * It loops over a->args[i] and then calls net_flow_put_act_types() >> which performs a similar loop. This outer-loop appears to be incorrect. >> * It passes a[i].args instead of a->args[i] to net_flow_put_act_types() >> >> I can post a fix once I've got it working to my satisfaction. >> But if you are reworking that code anyway perhaps it is easier for >> you to handle it then. > > FWIW this got the current scheme working for me: > Thanks Simon. I'll roll this in as well. > diff --git a/net/core/flow_table.c b/net/core/flow_table.c > index 5dbdc13..598afa2 100644 > --- a/net/core/flow_table.c > +++ b/net/core/flow_table.c > @@ -946,7 +946,7 @@ static int net_flow_put_flow_action(struct sk_buff *skb, > struct net_flow_action *a) > { > struct nlattr *action, *sigs; > - int i, err = 0; > + int err = 0; > > action = nla_nest_start(skb, NET_FLOW_ACTION); > if (!action) > @@ -958,21 +958,19 @@ static int net_flow_put_flow_action(struct sk_buff *skb, > if (!a->args) > goto done; > > - for (i = 0; a->args[i].type; i++) { > - sigs = nla_nest_start(skb, NET_FLOW_ACTION_ATTR_SIGNATURE); > - if (!sigs) { > - nla_nest_cancel(skb, action); > - return -EMSGSIZE; > - } > + sigs = nla_nest_start(skb, NET_FLOW_ACTION_ATTR_SIGNATURE); > + if (!sigs) { > + nla_nest_cancel(skb, action); > + return -EMSGSIZE; > + } > > - err = net_flow_put_act_types(skb, a[i].args); > - if (err) { > - nla_nest_cancel(skb, sigs); > - nla_nest_cancel(skb, action); > - return err; > - } > - nla_nest_end(skb, sigs); > + err = net_flow_put_act_types(skb, a->args); > + if (err) { > + nla_nest_cancel(skb, sigs); > + nla_nest_cancel(skb, action); > + return err; > } > + nla_nest_end(skb, sigs); > > done: > nla_nest_end(skb, action); > @@ -1103,6 +1101,7 @@ static int net_flow_get_action(struct net_flow_action *a, struct nlattr *attr) > } > > a->args[count] = *(struct net_flow_action_arg *)nla_data(args); > + count++; > } > return 0; > } > -- John Fastabend Intel Corporation ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [net-next PATCH v1 01/11] net: flow_table: create interface for hw match/action tables 2015-01-05 18:59 ` John Fastabend ` (2 preceding siblings ...) 2015-01-06 0:45 ` John Fastabend @ 2015-01-07 10:07 ` Or Gerlitz 2015-01-07 16:35 ` John Fastabend 3 siblings, 1 reply; 21+ messages in thread From: Or Gerlitz @ 2015-01-07 10:07 UTC (permalink / raw) To: John Fastabend Cc: Thomas Graf, Scott Feldman, Jiří Pírko, Jamal Hadi Salim, simon.horman, Linux Netdev List, David Miller, Andy Gospodarek On Mon, Jan 5, 2015 at 8:59 PM, John Fastabend <john.fastabend@gmail.com> wrote: >>> +struct sk_buff *net_flow_build_actions_msg(struct net_flow_action **a, >>> + struct net_device *dev, >>> + u32 portid, int seq, u8 cmd) >>> +{ >>> + struct genlmsghdr *hdr; >>> + struct sk_buff *skb; >>> + int err = -ENOBUFS; >>> + >>> + skb = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL); >> >> >> genlmsg_new(GENLMSG_DEFAULT_SIZE, GFP_KERNEL); > > > fixed along with the other cases. small nit here, net_flow_build_actions_msg can be made static, it's called only from within this file few more nits... checkpatch --strict produces bunch of "CHECK: Please use a blank line after function/struct/union/enum declarations" comments, I guess worth fixing too. ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [net-next PATCH v1 01/11] net: flow_table: create interface for hw match/action tables 2015-01-07 10:07 ` Or Gerlitz @ 2015-01-07 16:35 ` John Fastabend 0 siblings, 0 replies; 21+ messages in thread From: John Fastabend @ 2015-01-07 16:35 UTC (permalink / raw) To: Or Gerlitz Cc: Thomas Graf, Scott Feldman, Jiří Pírko, Jamal Hadi Salim, simon.horman, Linux Netdev List, David Miller, Andy Gospodarek On 01/07/2015 02:07 AM, Or Gerlitz wrote: > On Mon, Jan 5, 2015 at 8:59 PM, John Fastabend <john.fastabend@gmail.com> wrote: >>>> +struct sk_buff *net_flow_build_actions_msg(struct net_flow_action **a, >>>> + struct net_device *dev, >>>> + u32 portid, int seq, u8 cmd) >>>> +{ >>>> + struct genlmsghdr *hdr; >>>> + struct sk_buff *skb; >>>> + int err = -ENOBUFS; >>>> + >>>> + skb = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL); >>> >>> >>> genlmsg_new(GENLMSG_DEFAULT_SIZE, GFP_KERNEL); >> >> >> fixed along with the other cases. > > small nit here, net_flow_build_actions_msg can be made static, it's > called only from within this file > > few more nits... checkpatch --strict produces bunch of "CHECK: Please > use a blank line after function/struct/union/enum declarations" > comments, I guess worth fixing too. > Thanks. will fix in v2. -- John Fastabend Intel Corporation ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [net-next PATCH v1 01/11] net: flow_table: create interface for hw match/action tables 2014-12-31 19:45 ` [net-next PATCH v1 01/11] net: flow_table: create interface for hw match/action tables John Fastabend 2014-12-31 20:10 ` John Fastabend 2015-01-04 11:12 ` Thomas Graf @ 2015-01-06 5:25 ` Scott Feldman 2015-01-06 6:04 ` John Fastabend 2 siblings, 1 reply; 21+ messages in thread From: Scott Feldman @ 2015-01-06 5:25 UTC (permalink / raw) To: John Fastabend Cc: Thomas Graf, Jiří Pírko, Jamal Hadi Salim, simon.horman, Netdev, David S. Miller, Andy Gospodarek Nice work John. Some nits inline... On Wed, Dec 31, 2014 at 11:45 AM, John Fastabend <john.fastabend@gmail.com> wrote: > > diff --git a/include/linux/if_flow.h b/include/linux/if_flow.h > new file mode 100644 > index 0000000..1b6c1ea > --- /dev/null > +++ b/include/linux/if_flow.h > @@ -0,0 +1,93 @@ > +/* > + * include/linux/net/if_flow.h - Flow table interface for Switch devices > + * Copyright (c) 2014 John Fastabend <john.r.fastabend@intel.com> > + * > + * > + * This program is free software; you can redistribute it and/or modify it > + * under the terms and conditions of the GNU General Public License, > + * version 2, as published by the Free Software Foundation. > + * > + * This program is distributed in the hope it will be useful, but WITHOUT > + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or > + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for > + * more details. > + * > + * The full GNU General Public License is included in this distribution in > + * the file called "COPYING". > + * > + * Author: John Fastabend <john.r.fastabend@intel.com> > + */ > + > +#ifndef _IF_FLOW_H > +#define _IF_FLOW_H > + > +#include <uapi/linux/if_flow.h> > + > +/** > + * @struct net_flow_header > + * @brief defines a match (header/field) an endpoint can use > + * > + * @uid unique identifier for header > + * @field_sz number of fields are in the set > + * @fields the set of fields in the net_flow_header > + */ > +struct net_flow_header { > + char name[NET_FLOW_NAMSIZ]; > + int uid; > + int field_sz; > + struct net_flow_field *fields; > +}; > + > +/** > + * @struct net_flow_action > + * @brief a description of a endpoint defined action > + * > + * @name printable name > + * @uid unique action identifier > + * @types NET_FLOW_ACTION_TYPE_NULL terminated list of action types s/types/args? > + */ > +struct net_flow_action { > + char name[NET_FLOW_NAMSIZ]; > + int uid; > + struct net_flow_action_arg *args; > +}; > + > +/** > + * @struct net_flow_table > + * @brief define flow table with supported match/actions > + * > + * @uid unique identifier for table > + * @source uid of parent table Is parent table the table previous in the pipeline? If so, what if you can get to table from N different parent tables, what goes in source? > + * @size max number of entries for table or -1 for unbounded > + * @matches null terminated set of supported match types given by match uid > + * @actions null terminated set of supported action types given by action uid > + * @flows set of flows > + */ > +struct net_flow_table { > + char name[NET_FLOW_NAMSIZ]; > + int uid; > + int source; > + int size; > + struct net_flow_field_ref *matches; > + int *actions; > +}; > + > +/* net_flow_hdr_node: node in a header graph of header fields. > + * > + * @uid : unique id of the graph node > + * @flwo_header_ref : identify the hdrs that can handled by this node s/flwo_header_ref/hdrs? > + * @net_flow_jump_table : give a case jump statement s/net_flow_jump_table/jump > + */ > +struct net_flow_hdr_node { > + char name[NET_FLOW_NAMSIZ]; > + int uid; > + int *hdrs; > + struct net_flow_jump_table *jump; > +}; > + > +struct net_flow_tbl_node { > + int uid; > + __u32 flags; > + struct net_flow_jump_table *jump; > +}; > +#endif > diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h > index 29c92ee..3c3c856 100644 > --- a/include/linux/netdevice.h > +++ b/include/linux/netdevice.h > @@ -52,6 +52,11 @@ > #include <linux/neighbour.h> > #include <uapi/linux/netdevice.h> > > +#ifdef CONFIG_NET_FLOW_TABLES > +#include <linux/if_flow.h> > +#include <uapi/linux/if_flow.h> linux/if_flow.h already included uapi file > +#endif > + > struct netpoll_info; > struct device; > struct phy_device; > @@ -1186,6 +1191,13 @@ struct net_device_ops { > int (*ndo_switch_port_stp_update)(struct net_device *dev, > u8 state); > #endif > +#ifdef CONFIG_NET_FLOW_TABLES > + struct net_flow_action **(*ndo_flow_get_actions)(struct net_device *dev); > + struct net_flow_table **(*ndo_flow_get_tables)(struct net_device *dev); > + struct net_flow_header **(*ndo_flow_get_headers)(struct net_device *dev); > + struct net_flow_hdr_node **(*ndo_flow_get_hdr_graph)(struct net_device *dev); hdr or header? pick one, probably hdr. > + struct net_flow_tbl_node **(*ndo_flow_get_tbl_graph)(struct net_device *dev); move this up next to get_tables > +#endif > }; > > /** > diff --git a/include/uapi/linux/if_flow.h b/include/uapi/linux/if_flow.h > new file mode 100644 > index 0000000..2acdb38 > --- /dev/null > +++ b/include/uapi/linux/if_flow.h > @@ -0,0 +1,363 @@ > +/* > + * include/uapi/linux/if_flow.h - Flow table interface for Switch devices > + * Copyright (c) 2014 John Fastabend <john.r.fastabend@intel.com> > + * > + * > + * This program is free software; you can redistribute it and/or modify it > + * under the terms and conditions of the GNU General Public License, > + * version 2, as published by the Free Software Foundation. > + * > + * This program is distributed in the hope it will be useful, but WITHOUT > + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or > + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for > + * more details. > + * > + * The full GNU General Public License is included in this distribution in > + * the file called "COPYING". > + * > + * Author: John Fastabend <john.r.fastabend@intel.com> > + */ > + > +/* Netlink description: > + * > + * Table definition used to describe running tables. The following > + * describes the netlink message returned from a flow API messages. message? > + > +enum { > + NET_FLOW_MASK_TYPE_UNSPEC, > + NET_FLOW_MASK_TYPE_EXACT, > + NET_FLOW_MASK_TYPE_LPM, As discussed in another thread, need third mask type that's not LPM; e.g. 0b0101. > +#define NET_FLOW_TABLE_GRAPH_NODE_MAX (__NET_FLOW_TABLE_GRAPH_NODE_MAX - 1) > + > +enum { > + NET_FLOW_TABLE_GRAPH_UNSPEC, > + NET_FLOW_TABLE_GRAPH_NODE, > + __NET_FLOW_TABLE_GRAPH_MAX, > +}; > +#define NET_FLOW_TABLE_GRAPH_MAX (__NET_FLOW_TABLE_GRAPH_MAX - 1) > + > +enum { > + NET_FLOW_IDENTIFIER_IFINDEX, /* net_device ifindex */ Maybe add an NET_FLOW_IDENTIFIER_UNSPEC so NET_FLOW_IDENTIFIER_IFINDEX isn't zero. > diff --git a/net/Kconfig b/net/Kconfig > index ff9ffc1..8380bfe 100644 > --- a/net/Kconfig > +++ b/net/Kconfig > @@ -293,6 +293,13 @@ config NET_FLOW_LIMIT > with many clients some protection against DoS by a single (spoofed) > flow that greatly exceeds average workload. > > +config NET_FLOW_TABLES > + boolean "Support network flow tables" > + ---help--- > + This feature provides an interface for device drivers to report > + flow tables and supported matches and actions. If you do not > + want to support hardware offloads for flow tables, say N here. > + > menu "Network testing" > > config NET_PKTGEN > diff --git a/net/core/Makefile b/net/core/Makefile > index 235e6c5..1eea785 100644 > --- a/net/core/Makefile > +++ b/net/core/Makefile > @@ -23,3 +23,4 @@ obj-$(CONFIG_NETWORK_PHY_TIMESTAMPING) += timestamping.o > obj-$(CONFIG_NET_PTP_CLASSIFY) += ptp_classifier.o > obj-$(CONFIG_CGROUP_NET_PRIO) += netprio_cgroup.o > obj-$(CONFIG_CGROUP_NET_CLASSID) += netclassid_cgroup.o > +obj-$(CONFIG_NET_FLOW_TABLES) += flow_table.o > diff --git a/net/core/flow_table.c b/net/core/flow_table.c > new file mode 100644 > index 0000000..ec3f06d > --- /dev/null > +++ b/net/core/flow_table.c > @@ -0,0 +1,837 @@ > +/* > + * include/uapi/linux/if_flow.h - Flow table interface for Switch devices > + * Copyright (c) 2014 John Fastabend <john.r.fastabend@intel.com> > + * > + * > + * This program is free software; you can redistribute it and/or modify it > + * under the terms and conditions of the GNU General Public License, > + * version 2, as published by the Free Software Foundation. > + * > + * This program is distributed in the hope it will be useful, but WITHOUT > + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or > + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for > + * more details. > + * > + * The full GNU General Public License is included in this distribution in > + * the file called "COPYING". > + * > + * Author: John Fastabend <john.r.fastabend@intel.com> > + */ > + > +#include <uapi/linux/if_flow.h> > +#include <linux/if_flow.h> > +#include <linux/if_bridge.h> > +#include <linux/types.h> > +#include <net/netlink.h> > +#include <net/genetlink.h> > +#include <net/rtnetlink.h> > +#include <linux/module.h> > + > +static struct genl_family net_flow_nl_family = { > + .id = GENL_ID_GENERATE, > + .name = NET_FLOW_GENL_NAME, > + .version = NET_FLOW_GENL_VERSION, > + .maxattr = NET_FLOW_MAX, > + .netnsok = true, > +}; > + > +static struct net_device *net_flow_get_dev(struct genl_info *info) > +{ > + struct net *net = genl_info_net(info); > + int type, ifindex; > + > + if (!info->attrs[NET_FLOW_IDENTIFIER_TYPE] || > + !info->attrs[NET_FLOW_IDENTIFIER]) > + return NULL; > + > + type = nla_get_u32(info->attrs[NET_FLOW_IDENTIFIER_TYPE]); > + switch (type) { > + case NET_FLOW_IDENTIFIER_IFINDEX: > + ifindex = nla_get_u32(info->attrs[NET_FLOW_IDENTIFIER]); > + break; > + default: > + return NULL; > + } > + > + return dev_get_by_index(net, ifindex); > +} > + > +static int net_flow_put_act_types(struct sk_buff *skb, > + struct net_flow_action_arg *args) > +{ > + int i, err; > + > + for (i = 0; args[i].type; i++) { > + err = nla_put(skb, NET_FLOW_ACTION_ARG, > + sizeof(struct net_flow_action_arg), &args[i]); > + if (err) > + return -EMSGSIZE; > + } > + return 0; > +} > + > +static const > +struct nla_policy net_flow_action_policy[NET_FLOW_ACTION_ATTR_MAX + 1] = { > + [NET_FLOW_ACTION_ATTR_NAME] = {.type = NLA_STRING, > + .len = NET_FLOW_NAMSIZ-1 }, > + [NET_FLOW_ACTION_ATTR_UID] = {.type = NLA_U32 }, > + [NET_FLOW_ACTION_ATTR_SIGNATURE] = {.type = NLA_NESTED }, > +}; > + > +static int net_flow_put_action(struct sk_buff *skb, struct net_flow_action *a) > +{ > + struct net_flow_action_arg *this; > + struct nlattr *nest; > + int err, args = 0; > + > + if (a->name && nla_put_string(skb, NET_FLOW_ACTION_ATTR_NAME, a->name)) > + return -EMSGSIZE; > + > + if (nla_put_u32(skb, NET_FLOW_ACTION_ATTR_UID, a->uid)) > + return -EMSGSIZE; > + > + if (!a->args) > + return 0; > + > + for (this = &a->args[0]; strlen(this->name) > 0; this++) > + args++; > + Since you only need to know that there are > 0 args, but don't need the actual count, can you simplify test with something like: bool has_args = strlen(a->args->name) > 0; or bool has_args = !!a->args->type; > + if (args) { > + nest = nla_nest_start(skb, NET_FLOW_ACTION_ATTR_SIGNATURE); > + if (!nest) > + goto nest_put_failure; Maybe just return -EMSGSIZE here and skip goto. > + > + err = net_flow_put_act_types(skb, a->args); > + if (err) { > + nla_nest_cancel(skb, nest); > + return err; > + } > + nla_nest_end(skb, nest); > + } > + > + return 0; > +nest_put_failure: > + return -EMSGSIZE; > +} > + > +static int net_flow_put_actions(struct sk_buff *skb, > + struct net_flow_action **acts) > +{ > + struct nlattr *actions; > + int err, i; > + > + actions = nla_nest_start(skb, NET_FLOW_ACTIONS); > + if (!actions) > + return -EMSGSIZE; > + > + for (i = 0; acts[i]->uid; i++) { Using for(act = acts; act->udi; act++) will make code a little nicer. > + struct nlattr *action = nla_nest_start(skb, NET_FLOW_ACTION); > + > + if (!action) > + goto action_put_failure; > + > + err = net_flow_put_action(skb, acts[i]); > + if (err) > + goto action_put_failure; > + nla_nest_end(skb, action); > + } > + nla_nest_end(skb, actions); > + > + return 0; > +action_put_failure: > + nla_nest_cancel(skb, actions); > + return -EMSGSIZE; > +} > + > +struct sk_buff *net_flow_build_actions_msg(struct net_flow_action **a, > + struct net_device *dev, > + u32 portid, int seq, u8 cmd) > +{ > + struct genlmsghdr *hdr; > + struct sk_buff *skb; > + int err = -ENOBUFS; > + > + skb = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL); > + if (!skb) > + return ERR_PTR(-ENOBUFS); > + > + hdr = genlmsg_put(skb, portid, seq, &net_flow_nl_family, 0, cmd); > + if (!hdr) > + goto out; > + > + if (nla_put_u32(skb, > + NET_FLOW_IDENTIFIER_TYPE, > + NET_FLOW_IDENTIFIER_IFINDEX) || > + nla_put_u32(skb, NET_FLOW_IDENTIFIER, dev->ifindex)) { > + err = -ENOBUFS; > + goto out; > + } > + > + err = net_flow_put_actions(skb, a); > + if (err < 0) > + goto out; > + > + err = genlmsg_end(skb, hdr); > + if (err < 0) > + goto out; > + > + return skb; > +out: > + nlmsg_free(skb); > + return ERR_PTR(err); > +} > + > +static int net_flow_cmd_get_actions(struct sk_buff *skb, > + struct genl_info *info) > +{ > + struct net_flow_action **a; > + struct net_device *dev; > + struct sk_buff *msg; > + > + dev = net_flow_get_dev(info); > + if (!dev) > + return -EINVAL; > + > + if (!dev->netdev_ops->ndo_flow_get_actions) { > + dev_put(dev); > + return -EOPNOTSUPP; > + } > + > + a = dev->netdev_ops->ndo_flow_get_actions(dev); > + if (!a) > + return -EBUSY; Is it assumed ndo_flow_get_actions() returns a pointer to a static list of actions? What if the device wants to give up a dynamic list of actions? I'm trying to understand the lifetime of pointer 'a'. What would cause -EBUSY condition? > + > + msg = net_flow_build_actions_msg(a, dev, > + info->snd_portid, > + info->snd_seq, > + NET_FLOW_TABLE_CMD_GET_ACTIONS); > + dev_put(dev); > + > + if (IS_ERR(msg)) > + return PTR_ERR(msg); > + > + return genlmsg_reply(msg, info); > +} > + > +static int net_flow_put_table(struct net_device *dev, > + struct sk_buff *skb, > + struct net_flow_table *t) > +{ > + struct nlattr *matches, *actions; > + int i; > + > + if (nla_put_string(skb, NET_FLOW_TABLE_ATTR_NAME, t->name) || > + nla_put_u32(skb, NET_FLOW_TABLE_ATTR_UID, t->uid) || > + nla_put_u32(skb, NET_FLOW_TABLE_ATTR_SOURCE, t->source) || > + nla_put_u32(skb, NET_FLOW_TABLE_ATTR_SIZE, t->size)) > + return -EMSGSIZE; > + > + matches = nla_nest_start(skb, NET_FLOW_TABLE_ATTR_MATCHES); > + if (!matches) > + return -EMSGSIZE; > + > + for (i = 0; t->matches[i].instance; i++) pointer-based loop better than i-based? my personal preference, i guess. > + nla_put(skb, NET_FLOW_FIELD_REF, > + sizeof(struct net_flow_field_ref), > + &t->matches[i]); > + nla_nest_end(skb, matches); > + ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [net-next PATCH v1 01/11] net: flow_table: create interface for hw match/action tables 2015-01-06 5:25 ` Scott Feldman @ 2015-01-06 6:04 ` John Fastabend 2015-01-06 6:40 ` Scott Feldman 0 siblings, 1 reply; 21+ messages in thread From: John Fastabend @ 2015-01-06 6:04 UTC (permalink / raw) To: Scott Feldman Cc: Thomas Graf, Jiří Pírko, Jamal Hadi Salim, simon.horman, Netdev, David S. Miller, Andy Gospodarek [...] >> +/** >> + * @struct net_flow_action >> + * @brief a description of a endpoint defined action >> + * >> + * @name printable name >> + * @uid unique action identifier >> + * @types NET_FLOW_ACTION_TYPE_NULL terminated list of action types > > s/types/args? > yep typo fixed in upcoming v2. >> + */ >> +struct net_flow_action { >> + char name[NET_FLOW_NAMSIZ]; >> + int uid; >> + struct net_flow_action_arg *args; >> +}; >> + >> +/** >> + * @struct net_flow_table >> + * @brief define flow table with supported match/actions >> + * >> + * @uid unique identifier for table >> + * @source uid of parent table > > Is parent table the table previous in the pipeline? If so, what if > you can get to table from N different parent tables, what goes in > source? No, you can get the layout of tables from the table graph ops. Source is used when a single tcam or other implementation mechanism is sliced into a set of tables. The current rocker world doesn't use this very much at the moment because its static and I just assumed every table came out of the same virtual hardware namespace. A simple example world would be to come up with a set of large virtual TCAMs. Any given TCAM maybe sliced into a set of tables. Users may organize these either via some out of band configuration at init or power on time. In the rocker case we could specify this when we load qemu. For now it is just informational. But if we start allowing users to create delete tables at runtime it is important to "know" where the slices are being allocated/free'd from. The source gives you this information. The hardware devices I'm working on have multiple sources we can allocate/free tables from. The source values would provide a way to track down which tables are in which hardware namespaces. Hope that helps? > >> + * @size max number of entries for table or -1 for unbounded >> + * @matches null terminated set of supported match types given by match uid >> + * @actions null terminated set of supported action types given by action uid >> + * @flows set of flows >> + */ >> +struct net_flow_table { >> + char name[NET_FLOW_NAMSIZ]; >> + int uid; >> + int source; >> + int size; >> + struct net_flow_field_ref *matches; >> + int *actions; >> +}; >> + >> +/* net_flow_hdr_node: node in a header graph of header fields. >> + * >> + * @uid : unique id of the graph node >> + * @flwo_header_ref : identify the hdrs that can handled by this node > > s/flwo_header_ref/hdrs? > >> + * @net_flow_jump_table : give a case jump statement > > s/net_flow_jump_table/jump yep thanks. > >> + */ >> +struct net_flow_hdr_node { >> + char name[NET_FLOW_NAMSIZ]; >> + int uid; >> + int *hdrs; >> + struct net_flow_jump_table *jump; >> +}; >> + >> +struct net_flow_tbl_node { >> + int uid; >> + __u32 flags; >> + struct net_flow_jump_table *jump; >> +}; >> +#endif >> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h >> index 29c92ee..3c3c856 100644 >> --- a/include/linux/netdevice.h >> +++ b/include/linux/netdevice.h >> @@ -52,6 +52,11 @@ >> #include <linux/neighbour.h> >> #include <uapi/linux/netdevice.h> >> >> +#ifdef CONFIG_NET_FLOW_TABLES >> +#include <linux/if_flow.h> >> +#include <uapi/linux/if_flow.h> > > linux/if_flow.h already included uapi file > fixed. >> +#endif >> + >> struct netpoll_info; >> struct device; >> struct phy_device; >> @@ -1186,6 +1191,13 @@ struct net_device_ops { >> int (*ndo_switch_port_stp_update)(struct net_device *dev, >> u8 state); >> #endif >> +#ifdef CONFIG_NET_FLOW_TABLES >> + struct net_flow_action **(*ndo_flow_get_actions)(struct net_device *dev); >> + struct net_flow_table **(*ndo_flow_get_tables)(struct net_device *dev); >> + struct net_flow_header **(*ndo_flow_get_headers)(struct net_device *dev); >> + struct net_flow_hdr_node **(*ndo_flow_get_hdr_graph)(struct net_device *dev); > > hdr or header? pick one, probably hdr. hdr is shorter and doesn't lose any clarity IMO I'll use net_flow_hdr and net_flow_hdr_node > >> + struct net_flow_tbl_node **(*ndo_flow_get_tbl_graph)(struct net_device *dev); > > move this up next to get_tables sure also what do you think tbl instead of table. > >> +#endif >> }; >> >> /** >> diff --git a/include/uapi/linux/if_flow.h b/include/uapi/linux/if_flow.h >> new file mode 100644 >> index 0000000..2acdb38 >> --- /dev/null >> +++ b/include/uapi/linux/if_flow.h >> @@ -0,0 +1,363 @@ >> +/* >> + * include/uapi/linux/if_flow.h - Flow table interface for Switch devices >> + * Copyright (c) 2014 John Fastabend <john.r.fastabend@intel.com> >> + * >> + * >> + * This program is free software; you can redistribute it and/or modify it >> + * under the terms and conditions of the GNU General Public License, >> + * version 2, as published by the Free Software Foundation. >> + * >> + * This program is distributed in the hope it will be useful, but WITHOUT >> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or >> + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for >> + * more details. >> + * >> + * The full GNU General Public License is included in this distribution in >> + * the file called "COPYING". >> + * >> + * Author: John Fastabend <john.r.fastabend@intel.com> >> + */ >> + >> +/* Netlink description: >> + * >> + * Table definition used to describe running tables. The following >> + * describes the netlink message returned from a flow API messages. > > message? > That sentence is a bit awkward all around. Changed it to "The following describes the netlink format used by the flow API." maybe that is better. > >> + >> +enum { >> + NET_FLOW_MASK_TYPE_UNSPEC, >> + NET_FLOW_MASK_TYPE_EXACT, >> + NET_FLOW_MASK_TYPE_LPM, > > As discussed in another thread, need third mask type that's not LPM; > e.g. 0b0101. > yep. > >> +#define NET_FLOW_TABLE_GRAPH_NODE_MAX (__NET_FLOW_TABLE_GRAPH_NODE_MAX - 1) >> + >> +enum { >> + NET_FLOW_TABLE_GRAPH_UNSPEC, >> + NET_FLOW_TABLE_GRAPH_NODE, >> + __NET_FLOW_TABLE_GRAPH_MAX, >> +}; >> +#define NET_FLOW_TABLE_GRAPH_MAX (__NET_FLOW_TABLE_GRAPH_MAX - 1) >> + >> +enum { >> + NET_FLOW_IDENTIFIER_IFINDEX, /* net_device ifindex */ > > Maybe add an NET_FLOW_IDENTIFIER_UNSPEC so NET_FLOW_IDENTIFIER_IFINDEX > isn't zero. > agreed I tend to like being able to test things with if (foo) { ... } [...] >> + >> +static int net_flow_put_action(struct sk_buff *skb, struct net_flow_action *a) >> +{ >> + struct net_flow_action_arg *this; >> + struct nlattr *nest; >> + int err, args = 0; >> + >> + if (a->name && nla_put_string(skb, NET_FLOW_ACTION_ATTR_NAME, a->name)) >> + return -EMSGSIZE; >> + >> + if (nla_put_u32(skb, NET_FLOW_ACTION_ATTR_UID, a->uid)) >> + return -EMSGSIZE; >> + >> + if (!a->args) >> + return 0; >> + >> + for (this = &a->args[0]; strlen(this->name) > 0; this++) >> + args++; >> + > > Since you only need to know that there are > 0 args, but don't need > the actual count, can you simplify test with something like: > good catch, this is a hold over from some code I rewrote I'll clean this up like, static int net_flow_put_action(struct sk_buff *skb, struct net_flow_action *a) { struct net_flow_action_arg *this; struct nlattr *nest; int err; if (a->name && nla_put_string(skb, NET_FLOW_ACTION_ATTR_NAME, a->name)) return -EMSGSIZE; if (nla_put_u32(skb, NET_FLOW_ACTION_ATTR_UID, a->uid)) return -EMSGSIZE; if (a->args && a->args[0].type) { nest = nla_nest_start(skb, NET_FLOW_ACTION_ATTR_SIGNATURE); if (!nest) return -EMSGSIZE; err = net_flow_put_act_types(skb, a->args); if (err) { nla_nest_cancel(skb, nest); return err; } nla_nest_end(skb, nest); } return 0; } I think that should probably work. Of course I'll compile it and test it. > bool has_args = strlen(a->args->name) > 0; > > or > > bool has_args = !!a->args->type; > >> + if (args) { >> + nest = nla_nest_start(skb, NET_FLOW_ACTION_ATTR_SIGNATURE); >> + if (!nest) >> + goto nest_put_failure; > > Maybe just return -EMSGSIZE here and skip goto. > >> + >> + err = net_flow_put_act_types(skb, a->args); >> + if (err) { >> + nla_nest_cancel(skb, nest); >> + return err; >> + } >> + nla_nest_end(skb, nest); >> + } >> + >> + return 0; >> +nest_put_failure: >> + return -EMSGSIZE; >> +} >> + >> +static int net_flow_put_actions(struct sk_buff *skb, >> + struct net_flow_action **acts) >> +{ >> + struct nlattr *actions; >> + int err, i; >> + >> + actions = nla_nest_start(skb, NET_FLOW_ACTIONS); >> + if (!actions) >> + return -EMSGSIZE; >> + >> + for (i = 0; acts[i]->uid; i++) { > > Using for(act = acts; act->udi; act++) will make code a little nicer. > not entirely convinced its any nicer that way but sure I'll convert it. [...] >> +static int net_flow_cmd_get_actions(struct sk_buff *skb, >> + struct genl_info *info) >> +{ >> + struct net_flow_action **a; >> + struct net_device *dev; >> + struct sk_buff *msg; >> + >> + dev = net_flow_get_dev(info); >> + if (!dev) >> + return -EINVAL; >> + >> + if (!dev->netdev_ops->ndo_flow_get_actions) { >> + dev_put(dev); >> + return -EOPNOTSUPP; >> + } >> + >> + a = dev->netdev_ops->ndo_flow_get_actions(dev); >> + if (!a) >> + return -EBUSY; > > Is it assumed ndo_flow_get_actions() returns a pointer to a static > list of actions? What if the device wants to give up a dynamic list > of actions? I'm trying to understand the lifetime of pointer 'a'. > What would cause -EBUSY condition? > Ah this is a good point. At the moment if a driver dynamically changes a structure then its going to break because there is no locking involved. I think the best way to do this is to use RCU here. We can return rcu dereferenced pointers and then drivers will need to wait a grace period before free'ing the old pointer. To simplify drivers we can do this from helper calls and document the semantics. Currently rocker is static so we don't have any issues. If no one minds I would like to do this in a follow up series. >> + >> + msg = net_flow_build_actions_msg(a, dev, >> + info->snd_portid, >> + info->snd_seq, >> + NET_FLOW_TABLE_CMD_GET_ACTIONS); >> + dev_put(dev); >> + >> + if (IS_ERR(msg)) >> + return PTR_ERR(msg); >> + >> + return genlmsg_reply(msg, info); >> +} >> + >> +static int net_flow_put_table(struct net_device *dev, >> + struct sk_buff *skb, >> + struct net_flow_table *t) >> +{ >> + struct nlattr *matches, *actions; >> + int i; >> + >> + if (nla_put_string(skb, NET_FLOW_TABLE_ATTR_NAME, t->name) || >> + nla_put_u32(skb, NET_FLOW_TABLE_ATTR_UID, t->uid) || >> + nla_put_u32(skb, NET_FLOW_TABLE_ATTR_SOURCE, t->source) || >> + nla_put_u32(skb, NET_FLOW_TABLE_ATTR_SIZE, t->size)) >> + return -EMSGSIZE; >> + >> + matches = nla_nest_start(skb, NET_FLOW_TABLE_ATTR_MATCHES); >> + if (!matches) >> + return -EMSGSIZE; >> + >> + for (i = 0; t->matches[i].instance; i++) > > pointer-based loop better than i-based? my personal preference, i guess. hmm I guess I tended to write these with indices. I might leave them for now but can change them if the consensus is pointer loops are easier to read. > >> + nla_put(skb, NET_FLOW_FIELD_REF, >> + sizeof(struct net_flow_field_ref), >> + &t->matches[i]); >> + nla_nest_end(skb, matches); >> + -- John Fastabend Intel Corporation ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [net-next PATCH v1 01/11] net: flow_table: create interface for hw match/action tables 2015-01-06 6:04 ` John Fastabend @ 2015-01-06 6:40 ` Scott Feldman 0 siblings, 0 replies; 21+ messages in thread From: Scott Feldman @ 2015-01-06 6:40 UTC (permalink / raw) To: John Fastabend Cc: Thomas Graf, Jiří Pírko, Jamal Hadi Salim, simon.horman, Netdev, David S. Miller, Andy Gospodarek On Mon, Jan 5, 2015 at 10:04 PM, John Fastabend <john.fastabend@gmail.com> wrote: >>> + * @uid unique identifier for table >>> + * @source uid of parent table >> >> >> Is parent table the table previous in the pipeline? If so, what if >> you can get to table from N different parent tables, what goes in >> source? > > > No, you can get the layout of tables from the table graph ops. > > Source is used when a single tcam or other implementation mechanism > is sliced into a set of tables. The current rocker world doesn't use > this very much at the moment because its static and I just assumed > every table came out of the same virtual hardware namespace. > > A simple example world would be to come up with a set of large virtual > TCAMs. Any given TCAM maybe sliced into a set of tables. Users may > organize these either via some out of band configuration at init or > power on time. In the rocker case we could specify this when we load > qemu. For now it is just informational. But if we start allowing users > to create delete tables at runtime it is important to "know" where the > slices are being allocated/free'd from. The source gives you this > information. > > The hardware devices I'm working on have multiple sources we can > allocate/free tables from. The source values would provide a way to > track down which tables are in which hardware namespaces. > > Hope that helps? Got it, thanks. Can source be encoded in tbl_id? >> hdr or header? pick one, probably hdr. > > > hdr is shorter and doesn't lose any clarity IMO I'll use net_flow_hdr > and net_flow_hdr_node > >> >>> + struct net_flow_tbl_node **(*ndo_flow_get_tbl_graph)(struct >>> net_device *dev); >> >> >> move this up next to get_tables > > > sure also what do you think tbl instead of table. +1 for tbl. ^ permalink raw reply [flat|nested] 21+ messages in thread
end of thread, other threads:[~2015-01-07 22:00 UTC | newest] Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2015-01-07 1:14 [net-next PATCH v1 01/11] net: flow_table: create interface for hw match/action tables Alexei Starovoitov 2015-01-07 5:37 ` John Fastabend -- strict thread matches above, loose matches on Subject: below -- 2015-01-07 21:17 Alexei Starovoitov 2015-01-07 22:00 ` John Fastabend 2014-12-31 19:45 [net-next PATCH v1 00/11] A flow API John Fastabend 2014-12-31 19:45 ` [net-next PATCH v1 01/11] net: flow_table: create interface for hw match/action tables John Fastabend 2014-12-31 20:10 ` John Fastabend 2015-01-04 11:12 ` Thomas Graf 2015-01-05 18:59 ` John Fastabend 2015-01-05 21:48 ` Thomas Graf 2015-01-05 23:29 ` John Fastabend 2015-01-06 0:45 ` John Fastabend 2015-01-06 1:09 ` Simon Horman 2015-01-06 1:19 ` John Fastabend 2015-01-06 2:05 ` Simon Horman 2015-01-06 2:54 ` Simon Horman 2015-01-06 3:31 ` John Fastabend 2015-01-07 10:07 ` Or Gerlitz 2015-01-07 16:35 ` John Fastabend 2015-01-06 5:25 ` Scott Feldman 2015-01-06 6:04 ` John Fastabend 2015-01-06 6:40 ` Scott Feldman
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.