netfilter-devel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net-next v4 0/4] netlink: Add nftables spec w/ multi messages
@ 2024-04-18 10:47 Donald Hunter
  2024-04-18 10:47 ` [PATCH net-next v4 1/4] doc/netlink/specs: Add draft nftables spec Donald Hunter
                   ` (4 more replies)
  0 siblings, 5 replies; 12+ messages in thread
From: Donald Hunter @ 2024-04-18 10:47 UTC (permalink / raw)
  To: netdev, Jakub Kicinski, David S. Miller, Eric Dumazet,
	Paolo Abeni, Jiri Pirko, Jacob Keller, Pablo Neira Ayuso,
	Jozsef Kadlecsik, netfilter-devel, coreteam
  Cc: donald.hunter, Donald Hunter

This series adds a ynl spec for nftables and extends ynl with a --multi
command line option that makes it possible to send transactional batches
for nftables.

This series includes a patch for nfnetlink which adds ACK processing for
batch begin/end messages. If you'd prefer that to be sent separately to
nf-next then I can do so, but I included it here so that it gets seen in
context.

An example of usage is:

./tools/net/ynl/cli.py \
 --spec Documentation/netlink/specs/nftables.yaml \
 --multi batch-begin '{"res-id": 10}' \
 --multi newtable '{"name": "test", "nfgen-family": 1}' \
 --multi newchain '{"name": "chain", "table": "test", "nfgen-family": 1}' \
 --multi batch-end '{"res-id": 10}'
[None, None, None, None]

It can also be used for bundling get requests:

./tools/net/ynl/cli.py \
 --spec Documentation/netlink/specs/nftables.yaml \
 --multi gettable '{"name": "test", "nfgen-family": 1}' \
 --multi getchain '{"name": "chain", "table": "test", "nfgen-family": 1}' \
 --output-json
[{"name": "test", "use": 1, "handle": 1, "flags": [],
 "nfgen-family": 1, "version": 0, "res-id": 2},
 {"table": "test", "name": "chain", "handle": 1, "use": 0,
 "nfgen-family": 1, "version": 0, "res-id": 2}]

There are 2 issues that may be worth resolving:

 - ynl reports errors by raising an NlError exception so only the first
   error gets reported. This could be changed to add errors to the list
   of responses so that multiple errors could be reported.

 - If any message does not get a response (e.g. batch-begin w/o patch 2)
   then ynl waits indefinitely. A recv timeout could be added which
   would allow ynl to terminate.

v3 -> v4:
 - fix the underlying extack decoding bug and drop the
   workaround patch

v2 -> v3:
 - update the ynl multi code to match the latest return value
   semantics of ynl
 - add a ynl patch to handle acks that use req_value
 - update the nfnetlink patch based on feedback from Pablo
 - move nfnetlink patch to end of series

v1 -> v2:
 - add a patch to nfnetlink to process ACKs for batch begin/end
 - handle multiple responses correctly

Donald Hunter (4):
  doc/netlink/specs: Add draft nftables spec
  tools/net/ynl: Fix extack decoding for directional ops
  tools/net/ynl: Add multi message support to ynl
  netfilter: nfnetlink: Handle ACK flags for batch messages

 Documentation/netlink/specs/nftables.yaml | 1264 +++++++++++++++++++++
 net/netfilter/nfnetlink.c                 |    5 +
 tools/net/ynl/cli.py                      |   25 +-
 tools/net/ynl/lib/ynl.py                  |   82 +-
 4 files changed, 1346 insertions(+), 30 deletions(-)
 create mode 100644 Documentation/netlink/specs/nftables.yaml

-- 
2.44.0


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH net-next v4 1/4] doc/netlink/specs: Add draft nftables spec
  2024-04-18 10:47 [PATCH net-next v4 0/4] netlink: Add nftables spec w/ multi messages Donald Hunter
@ 2024-04-18 10:47 ` Donald Hunter
  2024-04-18 10:47 ` [PATCH net-next v4 2/4] tools/net/ynl: Fix extack decoding for directional ops Donald Hunter
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 12+ messages in thread
From: Donald Hunter @ 2024-04-18 10:47 UTC (permalink / raw)
  To: netdev, Jakub Kicinski, David S. Miller, Eric Dumazet,
	Paolo Abeni, Jiri Pirko, Jacob Keller, Pablo Neira Ayuso,
	Jozsef Kadlecsik, netfilter-devel, coreteam
  Cc: donald.hunter, Donald Hunter

Add a spec for nftables that has nearly complete coverage of the ops,
but limited coverage of rule types and subexpressions.

Signed-off-by: Donald Hunter <donald.hunter@gmail.com>
---
 Documentation/netlink/specs/nftables.yaml | 1264 +++++++++++++++++++++
 1 file changed, 1264 insertions(+)
 create mode 100644 Documentation/netlink/specs/nftables.yaml

diff --git a/Documentation/netlink/specs/nftables.yaml b/Documentation/netlink/specs/nftables.yaml
new file mode 100644
index 000000000000..dff2a18f3d90
--- /dev/null
+++ b/Documentation/netlink/specs/nftables.yaml
@@ -0,0 +1,1264 @@
+# SPDX-License-Identifier: ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)
+
+name: nftables
+protocol: netlink-raw
+protonum: 12
+
+doc:
+  Netfilter nftables configuration over netlink.
+
+definitions:
+  -
+    name: nfgenmsg
+    type: struct
+    members:
+      -
+        name: nfgen-family
+        type: u8
+      -
+        name: version
+        type: u8
+      -
+        name: res-id
+        byte-order: big-endian
+        type: u16
+  -
+    name: meta-keys
+    type: enum
+    entries:
+      - len
+      - protocol
+      - priority
+      - mark
+      - iif
+      - oif
+      - iifname
+      - oifname
+      - iftype
+      - oiftype
+      - skuid
+      - skgid
+      - nftrace
+      - rtclassid
+      - secmark
+      - nfproto
+      - l4-proto
+      - bri-iifname
+      - bri-oifname
+      - pkttype
+      - cpu
+      - iifgroup
+      - oifgroup
+      - cgroup
+      - prandom
+      - secpath
+      - iifkind
+      - oifkind
+      - bri-iifpvid
+      - bri-iifvproto
+      - time-ns
+      - time-day
+      - time-hour
+      - sdif
+      - sdifname
+      - bri-broute
+  -
+    name: cmp-ops
+    type: enum
+    entries:
+      - eq
+      - neq
+      - lt
+      - lte
+      - gt
+      - gte
+  -
+    name: object-type
+    type: enum
+    entries:
+      - unspec
+      - counter
+      - quota
+      - ct-helper
+      - limit
+      - connlimit
+      - tunnel
+      - ct-timeout
+      - secmark
+      - ct-expect
+      - synproxy
+  -
+    name: nat-range-flags
+    type: flags
+    entries:
+      - map-ips
+      - proto-specified
+      - proto-random
+      - persistent
+      - proto-random-fully
+      - proto-offset
+      - netmap
+  -
+    name: table-flags
+    type: flags
+    entries:
+      - dormant
+      - owner
+      - persist
+  -
+    name: chain-flags
+    type: flags
+    entries:
+      - base
+      - hw-offload
+      - binding
+  -
+    name: set-flags
+    type: flags
+    entries:
+      - anonymous
+      - constant
+      - interval
+      - map
+      - timeout
+      - eval
+      - object
+      - concat
+      - expr
+
+attribute-sets:
+  -
+    name: empty-attrs
+    attributes:
+      -
+        name: name
+        type: string
+  -
+    name: batch-attrs
+    attributes:
+      -
+        name: genid
+        type: u32
+        byte-order: big-endian
+  -
+    name: table-attrs
+    attributes:
+      -
+        name: name
+        type: string
+        doc: name of the table
+      -
+        name: flags
+        type: u32
+        byte-order: big-endian
+        doc: bitmask of flags
+        enum: table-flags
+        enum-as-flags: true
+      -
+        name: use
+        type: u32
+        byte-order: big-endian
+        doc: number of chains in this table
+      -
+        name: handle
+        type: u64
+        byte-order: big-endian
+        doc: numeric handle of the table
+      -
+        name: userdata
+        type: binary
+        doc: user data
+  -
+    name: chain-attrs
+    attributes:
+      -
+        name: table
+        type: string
+        doc: name of the table containing the chain
+      -
+        name: handle
+        type: u64
+        byte-order: big-endian
+        doc: numeric handle of the chain
+      -
+        name: name
+        type: string
+        doc: name of the chain
+      -
+        name: hook
+        type: nest
+        nested-attributes: nft-hook-attrs
+        doc: hook specification for basechains
+      -
+        name: policy
+        type: u32
+        byte-order: big-endian
+        doc: numeric policy of the chain
+      -
+        name: use
+        type: u32
+        byte-order: big-endian
+        doc: number of references to this chain
+      -
+        name: type
+        type: string
+        doc: type name of the chain
+      -
+        name: counters
+        type: nest
+        nested-attributes: nft-counter-attrs
+        doc: counter specification of the chain
+      -
+        name: flags
+        type: u32
+        byte-order: big-endian
+        doc: chain flags
+        enum: chain-flags
+        enum-as-flags: true
+      -
+        name: id
+        type: u32
+        byte-order: big-endian
+        doc: uniquely identifies a chain in a transaction
+      -
+        name: userdata
+        type: binary
+        doc: user data
+  -
+    name: counter-attrs
+    attributes:
+      -
+        name: bytes
+        type: u64
+        byte-order: big-endian
+      -
+        name: packets
+        type: u64
+        byte-order: big-endian
+      -
+        name: pad
+        type: pad
+  -
+    name: nft-hook-attrs
+    attributes:
+      -
+        name: num
+        type: u32
+        byte-order: big-endian
+      -
+        name: priority
+        type: s32
+        byte-order: big-endian
+      -
+        name: dev
+        type: string
+        doc: net device name
+      -
+        name: devs
+        type: nest
+        nested-attributes: hook-dev-attrs
+        doc: list of net devices
+  -
+    name: hook-dev-attrs
+    attributes:
+      -
+        name: name
+        type: string
+        multi-attr: true
+  -
+    name: nft-counter-attrs
+    attributes:
+      -
+        name: bytes
+        type: u64
+      -
+        name: packets
+        type: u64
+  -
+    name: rule-attrs
+    attributes:
+      -
+        name: table
+        type: string
+        doc: name of the table containing the rule
+      -
+        name: chain
+        type: string
+        doc: name of the chain containing the rule
+      -
+        name: handle
+        type: u64
+        byte-order: big-endian
+        doc: numeric handle of the rule
+      -
+        name: expressions
+        type: nest
+        nested-attributes: expr-list-attrs
+        doc: list of expressions
+      -
+        name: compat
+        type: nest
+        nested-attributes: rule-compat-attrs
+        doc: compatibility specifications of the rule
+      -
+        name: position
+        type: u64
+        byte-order: big-endian
+        doc: numeric handle of the previous rule
+      -
+        name: userdata
+        type: binary
+        doc: user data
+      -
+        name: id
+        type: u32
+        doc: uniquely identifies a rule in a transaction
+      -
+        name: position-id
+        type: u32
+        doc: transaction unique identifier of the previous rule
+      -
+        name: chain-id
+        type: u32
+        doc: add the rule to chain by ID, alternative to chain name
+  -
+    name: expr-list-attrs
+    attributes:
+      -
+        name: elem
+        type: nest
+        nested-attributes: expr-attrs
+        multi-attr: true
+  -
+    name: expr-attrs
+    attributes:
+      -
+        name: name
+        type: string
+        doc: name of the expression type
+      -
+        name: data
+        type: sub-message
+        sub-message: expr-ops
+        selector: name
+        doc: type specific data
+  -
+    name: rule-compat-attrs
+    attributes:
+      -
+        name: proto
+        type: binary
+        doc: numeric value of the handled protocol
+      -
+        name: flags
+        type: binary
+        doc: bitmask of flags
+  -
+    name: set-attrs
+    attributes:
+      -
+        name: table
+        type: string
+        doc: table name
+      -
+        name: name
+        type: string
+        doc: set name
+      -
+        name: flags
+        type: u32
+        enum: set-flags
+        byte-order: big-endian
+        doc: bitmask of enum nft_set_flags
+      -
+        name: key-type
+        type: u32
+        byte-order: big-endian
+        doc: key data type, informational purpose only
+      -
+        name: key-len
+        type: u32
+        byte-order: big-endian
+        doc: key data length
+      -
+        name: data-type
+        type: u32
+        byte-order: big-endian
+        doc: mapping data type
+      -
+        name: data-len
+        type: u32
+        byte-order: big-endian
+        doc: mapping data length
+      -
+        name: policy
+        type: u32
+        byte-order: big-endian
+        doc: selection policy
+      -
+        name: desc
+        type: nest
+        nested-attributes: set-desc-attrs
+        doc: set description
+      -
+        name: id
+        type: u32
+        doc: uniquely identifies a set in a transaction
+      -
+        name: timeout
+        type: u64
+        doc: default timeout value
+      -
+        name: gc-interval
+        type: u32
+        doc: garbage collection interval
+      -
+        name: userdata
+        type: binary
+        doc: user data
+      -
+        name: pad
+        type: pad
+      -
+        name: obj-type
+        type: u32
+        byte-order: big-endian
+        doc: stateful object type
+      -
+        name: handle
+        type: u64
+        byte-order: big-endian
+        doc: set handle
+      -
+        name: expr
+        type: nest
+        nested-attributes: expr-attrs
+        doc: set expression
+        multi-attr: true
+      -
+        name: expressions
+        type: nest
+        nested-attributes: set-list-attrs
+        doc: list of expressions
+  -
+    name: set-desc-attrs
+    attributes:
+      -
+        name: size
+        type: u32
+        byte-order: big-endian
+        doc: number of elements in set
+      -
+        name: concat
+        type: nest
+        nested-attributes: set-desc-concat-attrs
+        doc: description of field concatenation
+        multi-attr: true
+  -
+    name: set-desc-concat-attrs
+    attributes:
+      -
+        name: elem
+        type: nest
+        nested-attributes: set-field-attrs
+  -
+    name: set-field-attrs
+    attributes:
+      -
+        name: len
+        type: u32
+        byte-order: big-endian
+  -
+    name: set-list-attrs
+    attributes:
+      -
+        name: elem
+        type: nest
+        nested-attributes: expr-attrs
+        multi-attr: true
+  -
+    name: setelem-attrs
+    attributes:
+      -
+        name: key
+        type: nest
+        nested-attributes: data-attrs
+        doc: key value
+      -
+        name: data
+        type: nest
+        nested-attributes: data-attrs
+        doc: data value of mapping
+      -
+        name: flags
+        type: binary
+        doc: bitmask of nft_set_elem_flags
+      -
+        name: timeout
+        type: u64
+        doc: timeout value
+      -
+        name: expiration
+        type: u64
+        doc: expiration time
+      -
+        name: userdata
+        type: binary
+        doc: user data
+      -
+        name: expr
+        type: nest
+        nested-attributes: expr-attrs
+        doc: expression
+      -
+        name: objref
+        type: string
+        doc: stateful object reference
+      -
+        name: key-end
+        type: nest
+        nested-attributes: data-attrs
+        doc: closing key value
+      -
+        name: expressions
+        type: nest
+        nested-attributes: expr-list-attrs
+        doc: list of expressions
+  -
+    name: setelem-list-elem-attrs
+    attributes:
+      -
+        name: elem
+        type: nest
+        nested-attributes: setelem-attrs
+        multi-attr: true
+  -
+    name: setelem-list-attrs
+    attributes:
+      -
+        name: table
+        type: string
+      -
+        name: set
+        type: string
+      -
+        name: elements
+        type: nest
+        nested-attributes: setelem-list-elem-attrs
+      -
+        name: set-id
+        type: u32
+  -
+    name: gen-attrs
+    attributes:
+      -
+        name: id
+        type: u32
+        byte-order: big-endian
+        doc: ruleset generation id
+      -
+        name: proc-pid
+        type: u32
+        byte-order: big-endian
+      -
+        name: proc-name
+        type: string
+  -
+    name: obj-attrs
+    attributes:
+      -
+        name: table
+        type: string
+        doc: name of the table containing the expression
+      -
+        name: name
+        type: string
+        doc: name of this expression type
+      -
+        name: type
+        type: u32
+        enum: object-type
+        byte-order: big-endian
+        doc: stateful object type
+      -
+        name: data
+        type: sub-message
+        sub-message: obj-data
+        selector: type
+        doc: stateful object data
+      -
+        name: use
+        type: u32
+        byte-order: big-endian
+        doc: number of references to this expression
+      -
+        name: handle
+        type: u64
+        byte-order: big-endian
+        doc: object handle
+      -
+        name: pad
+        type: pad
+      -
+        name: userdata
+        type: binary
+        doc: user data
+  -
+    name: quota-attrs
+    attributes:
+      -
+        name: bytes
+        type: u64
+        byte-order: big-endian
+      -
+        name: flags # TODO
+        type: u32
+        byte-order: big-endian
+      -
+        name: pad
+        type: pad
+      -
+        name: consumed
+        type: u64
+        byte-order: big-endian
+  -
+    name: flowtable-attrs
+    attributes:
+      -
+        name: table
+        type: string
+      -
+        name: name
+        type: string
+      -
+        name: hook
+        type: nest
+        nested-attributes: flowtable-hook-attrs
+      -
+        name: use
+        type: u32
+        byte-order: big-endian
+      -
+        name: handle
+        type: u64
+        byte-order: big-endian
+      -
+        name: pad
+        type: pad
+      -
+        name: flags
+        type: u32
+        byte-order: big-endian
+  -
+    name: flowtable-hook-attrs
+    attributes:
+      -
+        name: num
+        type: u32
+        byte-order: big-endian
+      -
+        name: priority
+        type: u32
+        byte-order: big-endian
+      -
+        name: devs
+        type: nest
+        nested-attributes: hook-dev-attrs
+  -
+    name: expr-cmp-attrs
+    attributes:
+      -
+        name: sreg
+        type: u32
+        byte-order: big-endian
+      -
+        name: op
+        type: u32
+        byte-order: big-endian
+        enum: cmp-ops
+      -
+        name: data
+        type: nest
+        nested-attributes: data-attrs
+  -
+    name: data-attrs
+    attributes:
+      -
+        name: value
+        type: binary
+        # sub-type: u8
+      -
+        name: verdict
+        type: nest
+        nested-attributes: verdict-attrs
+  -
+    name: verdict-attrs
+    attributes:
+      -
+        name: code
+        type: u32
+        byte-order: big-endian
+      -
+        name: chain
+        type: string
+      -
+        name: chain-id
+        type: u32
+  -
+    name: expr-counter-attrs
+    attributes:
+      -
+        name: bytes
+        type: u64
+        doc: Number of bytes
+      -
+        name: packets
+        type: u64
+        doc: Number of packets
+      -
+        name: pad
+        type: pad
+  -
+    name: expr-flow-offload-attrs
+    attributes:
+      -
+        name: name
+        type: string
+        doc: Flow offload table name
+  -
+    name: expr-immediate-attrs
+    attributes:
+      -
+        name: dreg
+        type: u32
+        byte-order: big-endian
+      -
+        name: data
+        type: nest
+        nested-attributes: data-attrs
+  -
+    name: expr-meta-attrs
+    attributes:
+      -
+        name: dreg
+        type: u32
+        byte-order: big-endian
+      -
+        name: key
+        type: u32
+        byte-order: big-endian
+        enum: meta-keys
+      -
+        name: sreg
+        type: u32
+        byte-order: big-endian
+  -
+    name: expr-nat-attrs
+    attributes:
+      -
+        name: type
+        type: u32
+        byte-order: big-endian
+      -
+        name: family
+        type: u32
+        byte-order: big-endian
+      -
+        name: reg-addr-min
+        type: u32
+        byte-order: big-endian
+      -
+        name: reg-addr-max
+        type: u32
+        byte-order: big-endian
+      -
+        name: reg-proto-min
+        type: u32
+        byte-order: big-endian
+      -
+        name: reg-proto-max
+        type: u32
+        byte-order: big-endian
+      -
+        name: flags
+        type: u32
+        byte-order: big-endian
+        enum: nat-range-flags
+        enum-as-flags: true
+  -
+    name: expr-payload-attrs
+    attributes:
+      -
+        name: dreg
+        type: u32
+        byte-order: big-endian
+      -
+        name: base
+        type: u32
+        byte-order: big-endian
+      -
+        name: offset
+        type: u32
+        byte-order: big-endian
+      -
+        name: len
+        type: u32
+        byte-order: big-endian
+      -
+        name: sreg
+        type: u32
+        byte-order: big-endian
+      -
+        name: csum-type
+        type: u32
+        byte-order: big-endian
+      -
+        name: csum-offset
+        type: u32
+        byte-order: big-endian
+      -
+        name: csum-flags
+        type: u32
+        byte-order: big-endian
+  -
+    name: expr-tproxy-attrs
+    attributes:
+      -
+        name: family
+        type: u32
+        byte-order: big-endian
+      -
+        name: reg-addr
+        type: u32
+        byte-order: big-endian
+      -
+        name: reg-port
+        type: u32
+        byte-order: big-endian
+
+sub-messages:
+  -
+    name: expr-ops
+    formats:
+      -
+        value: bitwise # TODO
+      -
+        value: cmp
+        attribute-set: expr-cmp-attrs
+      -
+        value: counter
+        attribute-set: expr-counter-attrs
+      -
+        value: ct # TODO
+      -
+        value: flow_offload
+        attribute-set: expr-flow-offload-attrs
+      -
+        value: immediate
+        attribute-set: expr-immediate-attrs
+      -
+        value: lookup # TODO
+      -
+        value: meta
+        attribute-set: expr-meta-attrs
+      -
+        value: nat
+        attribute-set: expr-nat-attrs
+      -
+        value: payload
+        attribute-set: expr-payload-attrs
+      -
+        value: tproxy
+        attribute-set: expr-tproxy-attrs
+  -
+    name: obj-data
+    formats:
+      -
+        value: counter
+        attribute-set: counter-attrs
+      -
+        value: quota
+        attribute-set: quota-attrs
+
+operations:
+  enum-model: directional
+  list:
+    -
+      name: batch-begin
+      doc: Start a batch of operations
+      attribute-set: batch-attrs
+      fixed-header: nfgenmsg
+      do:
+        request:
+          value: 0x10
+          attributes:
+            - genid
+        reply:
+          value: 0x10
+          attributes:
+            - genid
+    -
+      name: batch-end
+      doc: Finish a batch of operations
+      attribute-set: batch-attrs
+      fixed-header: nfgenmsg
+      do:
+        request:
+          value: 0x11
+          attributes:
+            - genid
+    -
+      name: newtable
+      doc: Create a new table.
+      attribute-set: table-attrs
+      fixed-header: nfgenmsg
+      do:
+        request:
+          value: 0xa00
+          attributes:
+            - name
+    -
+      name: gettable
+      doc: Get / dump tables.
+      attribute-set: table-attrs
+      fixed-header: nfgenmsg
+      do:
+        request:
+          value: 0xa01
+          attributes:
+            - name
+        reply:
+          value: 0xa00
+          attributes:
+            - name
+    -
+      name: deltable
+      doc: Delete an existing table.
+      attribute-set: table-attrs
+      fixed-header: nfgenmsg
+      do:
+        request:
+          value: 0xa02
+          attributes:
+            - name
+    -
+      name: destroytable
+      doc: Delete an existing table with destroy semantics (ignoring ENOENT errors).
+      attribute-set: table-attrs
+      fixed-header: nfgenmsg
+      do:
+        request:
+          value: 0xa1a
+          attributes:
+            - name
+    -
+      name: newchain
+      doc: Create a new chain.
+      attribute-set: chain-attrs
+      fixed-header: nfgenmsg
+      do:
+        request:
+          value: 0xa03
+          attributes:
+            - name
+    -
+      name: getchain
+      doc: Get / dump chains.
+      attribute-set: chain-attrs
+      fixed-header: nfgenmsg
+      do:
+        request:
+          value: 0xa04
+          attributes:
+            - name
+        reply:
+          value: 0xa03
+          attributes:
+            - name
+    -
+      name: delchain
+      doc: Delete an existing chain.
+      attribute-set: chain-attrs
+      fixed-header: nfgenmsg
+      do:
+        request:
+          value: 0xa05
+          attributes:
+            - name
+    -
+      name: destroychain
+      doc: Delete an existing chain with destroy semantics (ignoring ENOENT errors).
+      attribute-set: chain-attrs
+      fixed-header: nfgenmsg
+      do:
+        request:
+          value: 0xa1b
+          attributes:
+            - name
+    -
+      name: newrule
+      doc: Create a new rule.
+      attribute-set: rule-attrs
+      fixed-header: nfgenmsg
+      do:
+        request:
+          value: 0xa06
+          attributes:
+            - name
+    -
+      name: getrule
+      doc: Get / dump rules.
+      attribute-set: rule-attrs
+      fixed-header: nfgenmsg
+      do:
+        request:
+          value: 0xa07
+          attributes:
+            - name
+        reply:
+          value: 0xa06
+          attributes:
+            - name
+    -
+      name: getrule-reset
+      doc: Get / dump rules and reset stateful expressions.
+      attribute-set: rule-attrs
+      fixed-header: nfgenmsg
+      do:
+        request:
+          value: 0xa19
+          attributes:
+            - name
+        reply:
+          value: 0xa06
+          attributes:
+            - name
+    -
+      name: delrule
+      doc: Delete an existing rule.
+      attribute-set: rule-attrs
+      fixed-header: nfgenmsg
+      do:
+        request:
+          value: 0xa08
+          attributes:
+            - name
+    -
+      name: destroyrule
+      doc: Delete an existing rule with destroy semantics (ignoring ENOENT errors).
+      attribute-set: rule-attrs
+      fixed-header: nfgenmsg
+      do:
+        request:
+          value: 0xa1c
+          attributes:
+            - name
+    -
+      name: newset
+      doc: Create a new set.
+      attribute-set: set-attrs
+      fixed-header: nfgenmsg
+      do:
+        request:
+          value: 0xa09
+          attributes:
+            - name
+    -
+      name: getset
+      doc: Get / dump sets.
+      attribute-set: set-attrs
+      fixed-header: nfgenmsg
+      do:
+        request:
+          value: 0xa0a
+          attributes:
+            - name
+        reply:
+          value: 0xa09
+          attributes:
+            - name
+    -
+      name: delset
+      doc: Delete an existing set.
+      attribute-set: set-attrs
+      fixed-header: nfgenmsg
+      do:
+        request:
+          value: 0xa0b
+          attributes:
+            - name
+    -
+      name: destroyset
+      doc: Delete an existing set with destroy semantics (ignoring ENOENT errors).
+      attribute-set: set-attrs
+      fixed-header: nfgenmsg
+      do:
+        request:
+          value: 0xa1d
+          attributes:
+            - name
+    -
+      name: newsetelem
+      doc: Create a new set element.
+      attribute-set: setelem-list-attrs
+      fixed-header: nfgenmsg
+      do:
+        request:
+          value: 0xa0c
+          attributes:
+            - name
+    -
+      name: getsetelem
+      doc: Get / dump set elements.
+      attribute-set: setelem-list-attrs
+      fixed-header: nfgenmsg
+      do:
+        request:
+          value: 0xa0d
+          attributes:
+            - name
+        reply:
+          value: 0xa0c
+          attributes:
+            - name
+    -
+      name: getsetelem-reset
+      doc: Get / dump set elements and reset stateful expressions.
+      attribute-set: setelem-list-attrs
+      fixed-header: nfgenmsg
+      do:
+        request:
+          value: 0xa21
+          attributes:
+            - name
+        reply:
+          value: 0xa0c
+          attributes:
+            - name
+    -
+      name: delsetelem
+      doc: Delete an existing set element.
+      attribute-set: setelem-list-attrs
+      fixed-header: nfgenmsg
+      do:
+        request:
+          value: 0xa0e
+          attributes:
+            - name
+    -
+      name: destroysetelem
+      doc: Delete an existing set element with destroy semantics.
+      attribute-set: setelem-list-attrs
+      fixed-header: nfgenmsg
+      do:
+        request:
+          value: 0xa1e
+          attributes:
+            - name
+    -
+      name: getgen
+      doc: Get / dump rule-set generation.
+      attribute-set: gen-attrs
+      fixed-header: nfgenmsg
+      do:
+        request:
+          value: 0xa10
+          attributes:
+            - name
+        reply:
+          value: 0xa0f
+          attributes:
+            - name
+    -
+      name: newobj
+      doc: Create a new stateful object.
+      attribute-set: obj-attrs
+      fixed-header: nfgenmsg
+      do:
+        request:
+          value: 0xa12
+          attributes:
+            - name
+    -
+      name: getobj
+      doc: Get / dump stateful objects.
+      attribute-set: obj-attrs
+      fixed-header: nfgenmsg
+      do:
+        request:
+          value: 0xa13
+          attributes:
+            - name
+        reply:
+          value: 0xa12
+          attributes:
+            - name
+    -
+      name: delobj
+      doc: Delete an existing stateful object.
+      attribute-set: obj-attrs
+      fixed-header: nfgenmsg
+      do:
+        request:
+          value: 0xa14
+          attributes:
+            - name
+    -
+      name: destroyobj
+      doc: Delete an existing stateful object with destroy semantics.
+      attribute-set: obj-attrs
+      fixed-header: nfgenmsg
+      do:
+        request:
+          value: 0xa1f
+          attributes:
+            - name
+    -
+      name: newflowtable
+      doc: Create a new flow table.
+      attribute-set: flowtable-attrs
+      fixed-header: nfgenmsg
+      do:
+        request:
+          value: 0xa16
+          attributes:
+            - name
+    -
+      name: getflowtable
+      doc: Get / dump flow tables.
+      attribute-set: flowtable-attrs
+      fixed-header: nfgenmsg
+      do:
+        request:
+          value: 0xa17
+          attributes:
+            - name
+        reply:
+          value: 0xa16
+          attributes:
+            - name
+    -
+      name: delflowtable
+      doc: Delete an existing flow table.
+      attribute-set: flowtable-attrs
+      fixed-header: nfgenmsg
+      do:
+        request:
+          value: 0xa18
+          attributes:
+            - name
+    -
+      name: destroyflowtable
+      doc: Delete an existing flow table with destroy semantics.
+      attribute-set: flowtable-attrs
+      fixed-header: nfgenmsg
+      do:
+        request:
+          value: 0xa20
+          attributes:
+            - name
+
+mcast-groups:
+  list:
+    -
+      name: mgmt
-- 
2.44.0


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH net-next v4 2/4] tools/net/ynl: Fix extack decoding for directional ops
  2024-04-18 10:47 [PATCH net-next v4 0/4] netlink: Add nftables spec w/ multi messages Donald Hunter
  2024-04-18 10:47 ` [PATCH net-next v4 1/4] doc/netlink/specs: Add draft nftables spec Donald Hunter
@ 2024-04-18 10:47 ` Donald Hunter
  2024-04-18 10:47 ` [PATCH net-next v4 3/4] tools/net/ynl: Add multi message support to ynl Donald Hunter
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 12+ messages in thread
From: Donald Hunter @ 2024-04-18 10:47 UTC (permalink / raw)
  To: netdev, Jakub Kicinski, David S. Miller, Eric Dumazet,
	Paolo Abeni, Jiri Pirko, Jacob Keller, Pablo Neira Ayuso,
	Jozsef Kadlecsik, netfilter-devel, coreteam
  Cc: donald.hunter, Donald Hunter

NetlinkProtocol.decode() was looking up ops by response value which breaks
when it is used for extack decoding of directional ops. Instead, pass
the op to decode().

Signed-off-by: Donald Hunter <donald.hunter@gmail.com>
---
 tools/net/ynl/lib/ynl.py | 14 ++++++--------
 1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/tools/net/ynl/lib/ynl.py b/tools/net/ynl/lib/ynl.py
index a67f7b6fef92..a3ec7a56180a 100644
--- a/tools/net/ynl/lib/ynl.py
+++ b/tools/net/ynl/lib/ynl.py
@@ -386,12 +386,9 @@ class NetlinkProtocol:
     def _decode(self, nl_msg):
         return nl_msg
 
-    def decode(self, ynl, nl_msg):
+    def decode(self, ynl, nl_msg, op):
         msg = self._decode(nl_msg)
-        fixed_header_size = 0
-        if ynl:
-            op = ynl.rsp_by_value[msg.cmd()]
-            fixed_header_size = ynl._struct_size(op.fixed_header)
+        fixed_header_size = ynl._struct_size(op.fixed_header)
         msg.raw_attrs = NlAttrs(msg.raw, fixed_header_size)
         return msg
 
@@ -797,7 +794,7 @@ class YnlFamily(SpecFamily):
         if 'bad-attr-offs' not in extack:
             return
 
-        msg = self.nlproto.decode(self, NlMsg(request, 0, op.attr_set))
+        msg = self.nlproto.decode(self, NlMsg(request, 0, op.attr_set), op)
         offset = self.nlproto.msghdr_size() + self._struct_size(op.fixed_header)
         path = self._decode_extack_path(msg.raw_attrs, op.attr_set, offset,
                                         extack['bad-attr-offs'])
@@ -922,7 +919,8 @@ class YnlFamily(SpecFamily):
                     print("Netlink done while checking for ntf!?")
                     continue
 
-                decoded = self.nlproto.decode(self, nl_msg)
+                op = self.rsp_by_value[nl_msg.cmd()]
+                decoded = self.nlproto.decode(self, nl_msg, op)
                 if decoded.cmd() not in self.async_msg_ids:
                     print("Unexpected msg id done while checking for ntf", decoded)
                     continue
@@ -979,7 +977,7 @@ class YnlFamily(SpecFamily):
                     done = True
                     break
 
-                decoded = self.nlproto.decode(self, nl_msg)
+                decoded = self.nlproto.decode(self, nl_msg, op)
 
                 # Check if this is a reply to our request
                 if nl_msg.nl_seq != req_seq or decoded.cmd() != op.rsp_value:
-- 
2.44.0


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH net-next v4 3/4] tools/net/ynl: Add multi message support to ynl
  2024-04-18 10:47 [PATCH net-next v4 0/4] netlink: Add nftables spec w/ multi messages Donald Hunter
  2024-04-18 10:47 ` [PATCH net-next v4 1/4] doc/netlink/specs: Add draft nftables spec Donald Hunter
  2024-04-18 10:47 ` [PATCH net-next v4 2/4] tools/net/ynl: Fix extack decoding for directional ops Donald Hunter
@ 2024-04-18 10:47 ` Donald Hunter
  2024-04-18 10:47 ` [PATCH net-next v4 4/4] netfilter: nfnetlink: Handle ACK flags for batch messages Donald Hunter
  2024-04-23  0:53 ` [PATCH net-next v4 0/4] netlink: Add nftables spec w/ multi messages patchwork-bot+netdevbpf
  4 siblings, 0 replies; 12+ messages in thread
From: Donald Hunter @ 2024-04-18 10:47 UTC (permalink / raw)
  To: netdev, Jakub Kicinski, David S. Miller, Eric Dumazet,
	Paolo Abeni, Jiri Pirko, Jacob Keller, Pablo Neira Ayuso,
	Jozsef Kadlecsik, netfilter-devel, coreteam
  Cc: donald.hunter, Donald Hunter

Add a "--multi <do-op> <json>" command line to ynl that makes it
possible to add several operations to a single netlink request payload.
The --multi command line option is repeated for each operation.

This is used by the nftables family for transaction batches. For
example:

./tools/net/ynl/cli.py \
 --spec Documentation/netlink/specs/nftables.yaml \
 --multi batch-begin '{"res-id": 10}' \
 --multi newtable '{"name": "test", "nfgen-family": 1}' \
 --multi newchain '{"name": "chain", "table": "test", "nfgen-family": 1}' \
 --multi batch-end '{"res-id": 10}'
[None, None, None, None]

It can also be used for bundling get requests:

./tools/net/ynl/cli.py \
 --spec Documentation/netlink/specs/nftables.yaml \
 --multi gettable '{"name": "test", "nfgen-family": 1}' \
 --multi getchain '{"name": "chain", "table": "test", "nfgen-family": 1}' \
 --output-json
[{"name": "test", "use": 1, "handle": 1, "flags": [],
 "nfgen-family": 1, "version": 0, "res-id": 2},
 {"table": "test", "name": "chain", "handle": 1, "use": 0,
 "nfgen-family": 1, "version": 0, "res-id": 2}]

Signed-off-by: Donald Hunter <donald.hunter@gmail.com>
---
 tools/net/ynl/cli.py     | 25 +++++++++++++--
 tools/net/ynl/lib/ynl.py | 68 +++++++++++++++++++++++++++++-----------
 2 files changed, 71 insertions(+), 22 deletions(-)

diff --git a/tools/net/ynl/cli.py b/tools/net/ynl/cli.py
index f131e33ac3ee..058926d69ef0 100755
--- a/tools/net/ynl/cli.py
+++ b/tools/net/ynl/cli.py
@@ -19,13 +19,28 @@ class YnlEncoder(json.JSONEncoder):
 
 
 def main():
-    parser = argparse.ArgumentParser(description='YNL CLI sample')
+    description = """
+    YNL CLI utility - a general purpose netlink utility that uses YAML
+    specs to drive protocol encoding and decoding.
+    """
+    epilog = """
+    The --multi option can be repeated to include several do operations
+    in the same netlink payload.
+    """
+
+    parser = argparse.ArgumentParser(description=description,
+                                     epilog=epilog)
     parser.add_argument('--spec', dest='spec', type=str, required=True)
     parser.add_argument('--schema', dest='schema', type=str)
     parser.add_argument('--no-schema', action='store_true')
     parser.add_argument('--json', dest='json_text', type=str)
-    parser.add_argument('--do', dest='do', type=str)
-    parser.add_argument('--dump', dest='dump', type=str)
+
+    group = parser.add_mutually_exclusive_group()
+    group.add_argument('--do', dest='do', metavar='DO-OPERATION', type=str)
+    group.add_argument('--multi', dest='multi', nargs=2, action='append',
+                       metavar=('DO-OPERATION', 'JSON_TEXT'), type=str)
+    group.add_argument('--dump', dest='dump', metavar='DUMP-OPERATION', type=str)
+
     parser.add_argument('--sleep', dest='sleep', type=int)
     parser.add_argument('--subscribe', dest='ntf', type=str)
     parser.add_argument('--replace', dest='flags', action='append_const',
@@ -73,6 +88,10 @@ def main():
         if args.dump:
             reply = ynl.dump(args.dump, attrs)
             output(reply)
+        if args.multi:
+            ops = [ (item[0], json.loads(item[1]), args.flags or []) for item in args.multi ]
+            reply = ynl.do_multi(ops)
+            output(reply)
     except NlError as e:
         print(e)
         exit(1)
diff --git a/tools/net/ynl/lib/ynl.py b/tools/net/ynl/lib/ynl.py
index a3ec7a56180a..ea48f83c2e84 100644
--- a/tools/net/ynl/lib/ynl.py
+++ b/tools/net/ynl/lib/ynl.py
@@ -938,16 +938,11 @@ class YnlFamily(SpecFamily):
 
       return op['do']['request']['attributes'].copy()
 
-    def _op(self, method, vals, flags=None, dump=False):
-        op = self.ops[method]
-
+    def _encode_message(self, op, vals, flags, req_seq):
         nl_flags = Netlink.NLM_F_REQUEST | Netlink.NLM_F_ACK
         for flag in flags or []:
             nl_flags |= flag
-        if dump:
-            nl_flags |= Netlink.NLM_F_DUMP
 
-        req_seq = random.randint(1024, 65535)
         msg = self.nlproto.message(nl_flags, op.req_value, 1, req_seq)
         if op.fixed_header:
             msg += self._encode_struct(op.fixed_header, vals)
@@ -955,18 +950,36 @@ class YnlFamily(SpecFamily):
         for name, value in vals.items():
             msg += self._add_attr(op.attr_set.name, name, value, search_attrs)
         msg = _genl_msg_finalize(msg)
+        return msg
 
-        self.sock.send(msg, 0)
+    def _ops(self, ops):
+        reqs_by_seq = {}
+        req_seq = random.randint(1024, 65535)
+        payload = b''
+        for (method, vals, flags) in ops:
+            op = self.ops[method]
+            msg = self._encode_message(op, vals, flags, req_seq)
+            reqs_by_seq[req_seq] = (op, msg, flags)
+            payload += msg
+            req_seq += 1
+
+        self.sock.send(payload, 0)
 
         done = False
         rsp = []
+        op_rsp = []
         while not done:
             reply = self.sock.recv(self._recv_size)
             nms = NlMsgs(reply, attr_space=op.attr_set)
             self._recv_dbg_print(reply, nms)
             for nl_msg in nms:
-                if nl_msg.extack:
-                    self._decode_extack(msg, op, nl_msg.extack)
+                if nl_msg.nl_seq in reqs_by_seq:
+                    (op, req_msg, req_flags) = reqs_by_seq[nl_msg.nl_seq]
+                    if nl_msg.extack:
+                        self._decode_extack(req_msg, op, nl_msg.extack)
+                else:
+                    op = self.rsp_by_value[nl_msg.cmd()]
+                    req_flags = []
 
                 if nl_msg.error:
                     raise NlError(nl_msg)
@@ -974,13 +987,25 @@ class YnlFamily(SpecFamily):
                     if nl_msg.extack:
                         print("Netlink warning:")
                         print(nl_msg)
-                    done = True
+
+                    if Netlink.NLM_F_DUMP in req_flags:
+                        rsp.append(op_rsp)
+                    elif not op_rsp:
+                        rsp.append(None)
+                    elif len(op_rsp) == 1:
+                        rsp.append(op_rsp[0])
+                    else:
+                        rsp.append(op_rsp)
+                    op_rsp = []
+
+                    del reqs_by_seq[nl_msg.nl_seq]
+                    done = len(reqs_by_seq) == 0
                     break
 
                 decoded = self.nlproto.decode(self, nl_msg, op)
 
                 # Check if this is a reply to our request
-                if nl_msg.nl_seq != req_seq or decoded.cmd() != op.rsp_value:
+                if nl_msg.nl_seq not in reqs_by_seq or decoded.cmd() != op.rsp_value:
                     if decoded.cmd() in self.async_msg_ids:
                         self.handle_ntf(decoded)
                         continue
@@ -991,18 +1016,23 @@ class YnlFamily(SpecFamily):
                 rsp_msg = self._decode(decoded.raw_attrs, op.attr_set.name)
                 if op.fixed_header:
                     rsp_msg.update(self._decode_struct(decoded.raw, op.fixed_header))
-                rsp.append(rsp_msg)
+                op_rsp.append(rsp_msg)
 
-        if dump:
-            return rsp
-        if not rsp:
-            return None
-        if len(rsp) == 1:
-            return rsp[0]
         return rsp
 
+    def _op(self, method, vals, flags=None, dump=False):
+        req_flags = flags or []
+        if dump:
+            req_flags.append(Netlink.NLM_F_DUMP)
+
+        ops = [(method, vals, req_flags)]
+        return self._ops(ops)[0]
+
     def do(self, method, vals, flags=None):
         return self._op(method, vals, flags)
 
     def dump(self, method, vals):
-        return self._op(method, vals, [], dump=True)
+        return self._op(method, vals, dump=True)
+
+    def do_multi(self, ops):
+        return self._ops(ops)
-- 
2.44.0


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH net-next v4 4/4] netfilter: nfnetlink: Handle ACK flags for batch messages
  2024-04-18 10:47 [PATCH net-next v4 0/4] netlink: Add nftables spec w/ multi messages Donald Hunter
                   ` (2 preceding siblings ...)
  2024-04-18 10:47 ` [PATCH net-next v4 3/4] tools/net/ynl: Add multi message support to ynl Donald Hunter
@ 2024-04-18 10:47 ` Donald Hunter
  2024-04-18 16:30   ` Pablo Neira Ayuso
  2024-04-23  0:53 ` [PATCH net-next v4 0/4] netlink: Add nftables spec w/ multi messages patchwork-bot+netdevbpf
  4 siblings, 1 reply; 12+ messages in thread
From: Donald Hunter @ 2024-04-18 10:47 UTC (permalink / raw)
  To: netdev, Jakub Kicinski, David S. Miller, Eric Dumazet,
	Paolo Abeni, Jiri Pirko, Jacob Keller, Pablo Neira Ayuso,
	Jozsef Kadlecsik, netfilter-devel, coreteam
  Cc: donald.hunter, Donald Hunter

The NLM_F_ACK flag is ignored for nfnetlink batch begin and end
messages. This is a problem for ynl which wants to receive an ack for
every message it sends, not just the commands in between the begin/end
messages.

Add processing for ACKs for begin/end messages and provide responses
when requested.

I have checked that iproute2, pyroute2 and systemd are unaffected by
this change since none of them use NLM_F_ACK for batch begin/end.

Signed-off-by: Donald Hunter <donald.hunter@gmail.com>
---
 net/netfilter/nfnetlink.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/net/netfilter/nfnetlink.c b/net/netfilter/nfnetlink.c
index c9fbe0f707b5..4abf660c7baf 100644
--- a/net/netfilter/nfnetlink.c
+++ b/net/netfilter/nfnetlink.c
@@ -427,6 +427,9 @@ static void nfnetlink_rcv_batch(struct sk_buff *skb, struct nlmsghdr *nlh,
 
 	nfnl_unlock(subsys_id);
 
+	if (nlh->nlmsg_flags & NLM_F_ACK)
+		nfnl_err_add(&err_list, nlh, 0, &extack);
+
 	while (skb->len >= nlmsg_total_size(0)) {
 		int msglen, type;
 
@@ -573,6 +576,8 @@ static void nfnetlink_rcv_batch(struct sk_buff *skb, struct nlmsghdr *nlh,
 		} else if (err) {
 			ss->abort(net, oskb, NFNL_ABORT_NONE);
 			netlink_ack(oskb, nlmsg_hdr(oskb), err, NULL);
+		} else if (nlh->nlmsg_flags & NLM_F_ACK) {
+			nfnl_err_add(&err_list, nlh, 0, &extack);
 		}
 	} else {
 		enum nfnl_abort_action abort_action;
-- 
2.44.0


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH net-next v4 4/4] netfilter: nfnetlink: Handle ACK flags for batch messages
  2024-04-18 10:47 ` [PATCH net-next v4 4/4] netfilter: nfnetlink: Handle ACK flags for batch messages Donald Hunter
@ 2024-04-18 16:30   ` Pablo Neira Ayuso
  2024-04-18 16:51     ` Jakub Kicinski
  2024-04-19 11:20     ` Donald Hunter
  0 siblings, 2 replies; 12+ messages in thread
From: Pablo Neira Ayuso @ 2024-04-18 16:30 UTC (permalink / raw)
  To: Donald Hunter
  Cc: netdev, Jakub Kicinski, David S. Miller, Eric Dumazet,
	Paolo Abeni, Jiri Pirko, Jacob Keller, Jozsef Kadlecsik,
	netfilter-devel, coreteam, donald.hunter

Hi Donald,

Apologies for a bit late jumping back on this, it took me a while.

On Thu, Apr 18, 2024 at 11:47:37AM +0100, Donald Hunter wrote:
> The NLM_F_ACK flag is ignored for nfnetlink batch begin and end
> messages. This is a problem for ynl which wants to receive an ack for
> every message it sends, not just the commands in between the begin/end
> messages.

Just a side note: Turning on NLM_F_ACK for every message fills up the
receiver buffer very quickly, leading to ENOBUFS. Netlink, in general,
supports batching (with non-atomic semantics), I did not look at ynl
in detail, if it does send() + recv() for each message for other
subsystem then fine, but if it uses batching to amortize the cost of
the syscall then this explicit ACK could be an issue with very large
batches.

Out of curiosity: Why does the tool need an explicit ack for each
command? As mentioned above, this consumes a lot netlink bandwidth.

> Add processing for ACKs for begin/end messages and provide responses
> when requested.
> 
> I have checked that iproute2, pyroute2 and systemd are unaffected by
> this change since none of them use NLM_F_ACK for batch begin/end.

nitpick: Quick grep shows me iproute2 does not use the nfnetlink
subsystem? If I am correct, maybe remove this.

Thanks!

One more comment below.

> Signed-off-by: Donald Hunter <donald.hunter@gmail.com>
> ---
>  net/netfilter/nfnetlink.c | 5 +++++
>  1 file changed, 5 insertions(+)
> 
> diff --git a/net/netfilter/nfnetlink.c b/net/netfilter/nfnetlink.c
> index c9fbe0f707b5..4abf660c7baf 100644
> --- a/net/netfilter/nfnetlink.c
> +++ b/net/netfilter/nfnetlink.c
> @@ -427,6 +427,9 @@ static void nfnetlink_rcv_batch(struct sk_buff *skb, struct nlmsghdr *nlh,
>  
>  	nfnl_unlock(subsys_id);
>  
> +	if (nlh->nlmsg_flags & NLM_F_ACK)
> +		nfnl_err_add(&err_list, nlh, 0, &extack);
> +
>  	while (skb->len >= nlmsg_total_size(0)) {
>  		int msglen, type;
>  
> @@ -573,6 +576,8 @@ static void nfnetlink_rcv_batch(struct sk_buff *skb, struct nlmsghdr *nlh,
>  		} else if (err) {
>  			ss->abort(net, oskb, NFNL_ABORT_NONE);
>  			netlink_ack(oskb, nlmsg_hdr(oskb), err, NULL);
> +		} else if (nlh->nlmsg_flags & NLM_F_ACK) {
> +			nfnl_err_add(&err_list, nlh, 0, &extack);
>  		}
>  	} else {
>  		enum nfnl_abort_action abort_action;
> -- 
> 2.44.0
> 

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH net-next v4 4/4] netfilter: nfnetlink: Handle ACK flags for batch messages
  2024-04-18 16:30   ` Pablo Neira Ayuso
@ 2024-04-18 16:51     ` Jakub Kicinski
  2024-04-22 20:33       ` Jakub Kicinski
  2024-04-19 11:20     ` Donald Hunter
  1 sibling, 1 reply; 12+ messages in thread
From: Jakub Kicinski @ 2024-04-18 16:51 UTC (permalink / raw)
  To: Pablo Neira Ayuso
  Cc: Donald Hunter, netdev, David S. Miller, Eric Dumazet,
	Paolo Abeni, Jiri Pirko, Jacob Keller, Jozsef Kadlecsik,
	netfilter-devel, coreteam, donald.hunter

On Thu, 18 Apr 2024 18:30:55 +0200 Pablo Neira Ayuso wrote:
> Out of curiosity: Why does the tool need an explicit ack for each
> command? As mentioned above, this consumes a lot netlink bandwidth.

I think that the tool is sort of besides the point, it's just a PoC.
The point is that we're trying to describe netlink protocols in machine
readable fashion. Which in turn makes it possible to write netlink
binding generators in any language, like modern RPC frameworks.
For that to work we need protocol basics to be followed.

That's not to say that we're going to force all netlink families to
change to follow extra new rules. Just those that want to be accessed
via the bindings.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH net-next v4 4/4] netfilter: nfnetlink: Handle ACK flags for batch messages
  2024-04-18 16:30   ` Pablo Neira Ayuso
  2024-04-18 16:51     ` Jakub Kicinski
@ 2024-04-19 11:20     ` Donald Hunter
  2024-04-22 22:55       ` Pablo Neira Ayuso
  1 sibling, 1 reply; 12+ messages in thread
From: Donald Hunter @ 2024-04-19 11:20 UTC (permalink / raw)
  To: Pablo Neira Ayuso
  Cc: netdev, Jakub Kicinski, David S. Miller, Eric Dumazet,
	Paolo Abeni, Jiri Pirko, Jacob Keller, Jozsef Kadlecsik,
	netfilter-devel, coreteam, donald.hunter

Pablo Neira Ayuso <pablo@netfilter.org> writes:

> Hi Donald,
>
> Apologies for a bit late jumping back on this, it took me a while.
>
> On Thu, Apr 18, 2024 at 11:47:37AM +0100, Donald Hunter wrote:
>> The NLM_F_ACK flag is ignored for nfnetlink batch begin and end
>> messages. This is a problem for ynl which wants to receive an ack for
>> every message it sends, not just the commands in between the begin/end
>> messages.
>
> Just a side note: Turning on NLM_F_ACK for every message fills up the
> receiver buffer very quickly, leading to ENOBUFS. Netlink, in general,
> supports batching (with non-atomic semantics), I did not look at ynl
> in detail, if it does send() + recv() for each message for other
> subsystem then fine, but if it uses batching to amortize the cost of
> the syscall then this explicit ACK could be an issue with very large
> batches.

ynl is batching the messages for send() and will accept batches from
recv() but nfnetlink is sending each ack separately. It is using
netlink_ack() which uses a new skb for each message, for example:

sudo strace -e sendto,recvfrom ./tools/net/ynl/cli.py \
     --spec Documentation/netlink/specs/nftables.yaml \
     --multi batch-begin '{"res-id": 10}' \
     --multi newtable '{"name": "test", "nfgen-family": 1}' \
     --multi newchain '{"name": "chain", "table": "test", "nfgen-family": 1}' \
     --multi batch-end '{"res-id": 10}'
sendto(3, [[{nlmsg_len=20, nlmsg_type=0x10 /* NLMSG_??? */, nlmsg_flags=NLM_F_REQUEST|NLM_F_ACK, nlmsg_seq=28254, nlmsg_pid=0}, "\x00\x00\x00\x0a"], [{nlmsg_len=32, nlmsg_type=0xa00 /* NLMSG_??? */, nlmsg_flags=NLM_F_REQUEST|NLM_F_ACK, nlmsg_seq=28255, nlmsg_pid=0}, "\x01\x00\x00\x00\x09\x00\x01\x00\x74\x65\x73\x74\x00\x00\x00\x00"], [{nlmsg_len=44, nlmsg_type=0xa03 /* NLMSG_??? */, nlmsg_flags=NLM_F_REQUEST|NLM_F_ACK, nlmsg_seq=28256, nlmsg_pid=0}, "\x01\x00\x00\x00\x0a\x00\x03\x00\x63\x68\x61\x69\x6e\x00\x00\x00\x09\x00\x01\x00\x74\x65\x73\x74\x00\x00\x00\x00"], [{nlmsg_len=20, nlmsg_type=0x11 /* NLMSG_??? */, nlmsg_flags=NLM_F_REQUEST|NLM_F_ACK, nlmsg_seq=28257, nlmsg_pid=0}, "\x00\x00\x00\x0a"]], 116, 0, NULL, 0) = 116
recvfrom(3, [{nlmsg_len=36, nlmsg_type=NLMSG_ERROR, nlmsg_flags=NLM_F_CAPPED, nlmsg_seq=28254, nlmsg_pid=997}, {error=0, msg={nlmsg_len=20, nlmsg_type=NFNL_MSG_BATCH_BEGIN, nlmsg_flags=NLM_F_REQUEST|NLM_F_ACK, nlmsg_seq=28254, nlmsg_pid=0}}], 131072, 0, NULL, NULL) = 36
recvfrom(3, [{nlmsg_len=36, nlmsg_type=NLMSG_ERROR, nlmsg_flags=NLM_F_CAPPED, nlmsg_seq=28255, nlmsg_pid=997}, {error=0, msg={nlmsg_len=32, nlmsg_type=NFNL_SUBSYS_NFTABLES<<8|NFT_MSG_NEWTABLE, nlmsg_flags=NLM_F_REQUEST|NLM_F_ACK, nlmsg_seq=28255, nlmsg_pid=0}}], 131072, 0, NULL, NULL) = 36
recvfrom(3, [{nlmsg_len=36, nlmsg_type=NLMSG_ERROR, nlmsg_flags=NLM_F_CAPPED, nlmsg_seq=28256, nlmsg_pid=997}, {error=0, msg={nlmsg_len=44, nlmsg_type=NFNL_SUBSYS_NFTABLES<<8|NFT_MSG_NEWCHAIN, nlmsg_flags=NLM_F_REQUEST|NLM_F_ACK, nlmsg_seq=28256, nlmsg_pid=0}}], 131072, 0, NULL, NULL) = 36
recvfrom(3, [{nlmsg_len=36, nlmsg_type=NLMSG_ERROR, nlmsg_flags=NLM_F_CAPPED, nlmsg_seq=28257, nlmsg_pid=997}, {error=0, msg={nlmsg_len=20, nlmsg_type=NFNL_MSG_BATCH_END, nlmsg_flags=NLM_F_REQUEST|NLM_F_ACK, nlmsg_seq=28257, nlmsg_pid=0}}], 131072, 0, NULL, NULL) = 36

> Out of curiosity: Why does the tool need an explicit ack for each
> command? As mentioned above, this consumes a lot netlink bandwidth.

For the ynl python library, I guess it was a design choice to request an
ack for each command.

Since the Netlink API allows a user to request acks, it seems necessary
to be consistent about providing them. Otherwise we'd need to extend the
netlink message specs to say which messages are ack capable and which
are not.

>> Add processing for ACKs for begin/end messages and provide responses
>> when requested.
>> 
>> I have checked that iproute2, pyroute2 and systemd are unaffected by
>> this change since none of them use NLM_F_ACK for batch begin/end.
>
> nitpick: Quick grep shows me iproute2 does not use the nfnetlink
> subsystem? If I am correct, maybe remove this.

Yeah, my mistake. I did check iproute2 but didn't mean to add it to the
list. For nft, NFNL_MSG_BATCH_* usage is contained in libnftnl from what
I can see. I'll update the patch.

> Thanks!
>
> One more comment below.

Did you miss adding a comment?

>
>> Signed-off-by: Donald Hunter <donald.hunter@gmail.com>
>> ---
>>  net/netfilter/nfnetlink.c | 5 +++++
>>  1 file changed, 5 insertions(+)
>> 
>> diff --git a/net/netfilter/nfnetlink.c b/net/netfilter/nfnetlink.c
>> index c9fbe0f707b5..4abf660c7baf 100644
>> --- a/net/netfilter/nfnetlink.c
>> +++ b/net/netfilter/nfnetlink.c
>> @@ -427,6 +427,9 @@ static void nfnetlink_rcv_batch(struct sk_buff *skb, struct nlmsghdr *nlh,
>>  
>>  	nfnl_unlock(subsys_id);
>>  
>> +	if (nlh->nlmsg_flags & NLM_F_ACK)
>> +		nfnl_err_add(&err_list, nlh, 0, &extack);
>> +
>>  	while (skb->len >= nlmsg_total_size(0)) {
>>  		int msglen, type;
>>  
>> @@ -573,6 +576,8 @@ static void nfnetlink_rcv_batch(struct sk_buff *skb, struct nlmsghdr *nlh,
>>  		} else if (err) {
>>  			ss->abort(net, oskb, NFNL_ABORT_NONE);
>>  			netlink_ack(oskb, nlmsg_hdr(oskb), err, NULL);
>> +		} else if (nlh->nlmsg_flags & NLM_F_ACK) {
>> +			nfnl_err_add(&err_list, nlh, 0, &extack);
>>  		}
>>  	} else {
>>  		enum nfnl_abort_action abort_action;
>> -- 
>> 2.44.0
>> 

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH net-next v4 4/4] netfilter: nfnetlink: Handle ACK flags for batch messages
  2024-04-18 16:51     ` Jakub Kicinski
@ 2024-04-22 20:33       ` Jakub Kicinski
  2024-04-22 22:56         ` Pablo Neira Ayuso
  0 siblings, 1 reply; 12+ messages in thread
From: Jakub Kicinski @ 2024-04-22 20:33 UTC (permalink / raw)
  To: Pablo Neira Ayuso
  Cc: Donald Hunter, netdev, David S. Miller, Eric Dumazet,
	Paolo Abeni, Jiri Pirko, Jacob Keller, Jozsef Kadlecsik,
	netfilter-devel, coreteam, donald.hunter

On Thu, 18 Apr 2024 09:51:53 -0700 Jakub Kicinski wrote:
> On Thu, 18 Apr 2024 18:30:55 +0200 Pablo Neira Ayuso wrote:
> > Out of curiosity: Why does the tool need an explicit ack for each
> > command? As mentioned above, this consumes a lot netlink bandwidth.  
> 
> I think that the tool is sort of besides the point, it's just a PoC.
> The point is that we're trying to describe netlink protocols in machine
> readable fashion. Which in turn makes it possible to write netlink
> binding generators in any language, like modern RPC frameworks.
> For that to work we need protocol basics to be followed.
> 
> That's not to say that we're going to force all netlink families to
> change to follow extra new rules. Just those that want to be accessed
> via the bindings.

Pablo, any thoughts? Convinced? Given this touches YNL in significant
ways I'd prefer to merge it to net-next.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH net-next v4 4/4] netfilter: nfnetlink: Handle ACK flags for batch messages
  2024-04-19 11:20     ` Donald Hunter
@ 2024-04-22 22:55       ` Pablo Neira Ayuso
  0 siblings, 0 replies; 12+ messages in thread
From: Pablo Neira Ayuso @ 2024-04-22 22:55 UTC (permalink / raw)
  To: Donald Hunter
  Cc: netdev, Jakub Kicinski, David S. Miller, Eric Dumazet,
	Paolo Abeni, Jiri Pirko, Jacob Keller, Jozsef Kadlecsik,
	netfilter-devel, coreteam, donald.hunter

On Fri, Apr 19, 2024 at 12:20:25PM +0100, Donald Hunter wrote:
> Pablo Neira Ayuso <pablo@netfilter.org> writes:
> 
> > Hi Donald,
> >
> > Apologies for a bit late jumping back on this, it took me a while.
> >
> > On Thu, Apr 18, 2024 at 11:47:37AM +0100, Donald Hunter wrote:
> >> The NLM_F_ACK flag is ignored for nfnetlink batch begin and end
> >> messages. This is a problem for ynl which wants to receive an ack for
> >> every message it sends, not just the commands in between the begin/end
> >> messages.
> >
> > Just a side note: Turning on NLM_F_ACK for every message fills up the
> > receiver buffer very quickly, leading to ENOBUFS. Netlink, in general,
> > supports batching (with non-atomic semantics), I did not look at ynl
> > in detail, if it does send() + recv() for each message for other
> > subsystem then fine, but if it uses batching to amortize the cost of
> > the syscall then this explicit ACK could be an issue with very large
> > batches.
> 
> ynl is batching the messages for send() and will accept batches from
> recv() but nfnetlink is sending each ack separately.

Yes, like it happens with other existing netlink interfaces when
batching is used, nfnetlink is no different in such case.

> It is using netlink_ack() which uses a new skb for each message, for
> example:
> 
> sudo strace -e sendto,recvfrom ./tools/net/ynl/cli.py \
>      --spec Documentation/netlink/specs/nftables.yaml \
>      --multi batch-begin '{"res-id": 10}' \
>      --multi newtable '{"name": "test", "nfgen-family": 1}' \
>      --multi newchain '{"name": "chain", "table": "test", "nfgen-family": 1}' \
>      --multi batch-end '{"res-id": 10}'
> sendto(3, [[{nlmsg_len=20, nlmsg_type=0x10 /* NLMSG_??? */, nlmsg_flags=NLM_F_REQUEST|NLM_F_ACK, nlmsg_seq=28254, nlmsg_pid=0}, "\x00\x00\x00\x0a"], [{nlmsg_len=32, nlmsg_type=0xa00 /* NLMSG_??? */, nlmsg_flags=NLM_F_REQUEST|NLM_F_ACK, nlmsg_seq=28255, nlmsg_pid=0}, "\x01\x00\x00\x00\x09\x00\x01\x00\x74\x65\x73\x74\x00\x00\x00\x00"], [{nlmsg_len=44, nlmsg_type=0xa03 /* NLMSG_??? */, nlmsg_flags=NLM_F_REQUEST|NLM_F_ACK, nlmsg_seq=28256, nlmsg_pid=0}, "\x01\x00\x00\x00\x0a\x00\x03\x00\x63\x68\x61\x69\x6e\x00\x00\x00\x09\x00\x01\x00\x74\x65\x73\x74\x00\x00\x00\x00"], [{nlmsg_len=20, nlmsg_type=0x11 /* NLMSG_??? */, nlmsg_flags=NLM_F_REQUEST|NLM_F_ACK, nlmsg_seq=28257, nlmsg_pid=0}, "\x00\x00\x00\x0a"]], 116, 0, NULL, 0) = 116
> recvfrom(3, [{nlmsg_len=36, nlmsg_type=NLMSG_ERROR, nlmsg_flags=NLM_F_CAPPED, nlmsg_seq=28254, nlmsg_pid=997}, {error=0, msg={nlmsg_len=20, nlmsg_type=NFNL_MSG_BATCH_BEGIN, nlmsg_flags=NLM_F_REQUEST|NLM_F_ACK, nlmsg_seq=28254, nlmsg_pid=0}}], 131072, 0, NULL, NULL) = 36
> recvfrom(3, [{nlmsg_len=36, nlmsg_type=NLMSG_ERROR, nlmsg_flags=NLM_F_CAPPED, nlmsg_seq=28255, nlmsg_pid=997}, {error=0, msg={nlmsg_len=32, nlmsg_type=NFNL_SUBSYS_NFTABLES<<8|NFT_MSG_NEWTABLE, nlmsg_flags=NLM_F_REQUEST|NLM_F_ACK, nlmsg_seq=28255, nlmsg_pid=0}}], 131072, 0, NULL, NULL) = 36
> recvfrom(3, [{nlmsg_len=36, nlmsg_type=NLMSG_ERROR, nlmsg_flags=NLM_F_CAPPED, nlmsg_seq=28256, nlmsg_pid=997}, {error=0, msg={nlmsg_len=44, nlmsg_type=NFNL_SUBSYS_NFTABLES<<8|NFT_MSG_NEWCHAIN, nlmsg_flags=NLM_F_REQUEST|NLM_F_ACK, nlmsg_seq=28256, nlmsg_pid=0}}], 131072, 0, NULL, NULL) = 36
> recvfrom(3, [{nlmsg_len=36, nlmsg_type=NLMSG_ERROR, nlmsg_flags=NLM_F_CAPPED, nlmsg_seq=28257, nlmsg_pid=997}, {error=0, msg={nlmsg_len=20, nlmsg_type=NFNL_MSG_BATCH_END, nlmsg_flags=NLM_F_REQUEST|NLM_F_ACK, nlmsg_seq=28257, nlmsg_pid=0}}], 131072, 0, NULL, NULL) = 36
> 
> > Out of curiosity: Why does the tool need an explicit ack for each
> > command? As mentioned above, this consumes a lot netlink bandwidth.
> 
> For the ynl python library, I guess it was a design choice to request an
> ack for each command.
> 
> Since the Netlink API allows a user to request acks, it seems necessary
> to be consistent about providing them. Otherwise we'd need to extend the
> netlink message specs to say which messages are ack capable and which
> are not.

I see.

> >> Add processing for ACKs for begin/end messages and provide responses
> >> when requested.
> >> 
> >> I have checked that iproute2, pyroute2 and systemd are unaffected by
> >> this change since none of them use NLM_F_ACK for batch begin/end.
> >
> > nitpick: Quick grep shows me iproute2 does not use the nfnetlink
> > subsystem? If I am correct, maybe remove this.
> 
> Yeah, my mistake. I did check iproute2 but didn't mean to add it to the
> list. For nft, NFNL_MSG_BATCH_* usage is contained in libnftnl from what
> I can see. I'll update the patch.
> 
> > Thanks!
> >
> > One more comment below.
> 
> Did you miss adding a comment?

No more comments, thanks.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH net-next v4 4/4] netfilter: nfnetlink: Handle ACK flags for batch messages
  2024-04-22 20:33       ` Jakub Kicinski
@ 2024-04-22 22:56         ` Pablo Neira Ayuso
  0 siblings, 0 replies; 12+ messages in thread
From: Pablo Neira Ayuso @ 2024-04-22 22:56 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Donald Hunter, netdev, David S. Miller, Eric Dumazet,
	Paolo Abeni, Jiri Pirko, Jacob Keller, Jozsef Kadlecsik,
	netfilter-devel, coreteam, donald.hunter

On Mon, Apr 22, 2024 at 01:33:28PM -0700, Jakub Kicinski wrote:
> On Thu, 18 Apr 2024 09:51:53 -0700 Jakub Kicinski wrote:
> > On Thu, 18 Apr 2024 18:30:55 +0200 Pablo Neira Ayuso wrote:
> > > Out of curiosity: Why does the tool need an explicit ack for each
> > > command? As mentioned above, this consumes a lot netlink bandwidth.  
> > 
> > I think that the tool is sort of besides the point, it's just a PoC.
> > The point is that we're trying to describe netlink protocols in machine
> > readable fashion. Which in turn makes it possible to write netlink
> > binding generators in any language, like modern RPC frameworks.
> > For that to work we need protocol basics to be followed.
> > 
> > That's not to say that we're going to force all netlink families to
> > change to follow extra new rules. Just those that want to be accessed
> > via the bindings.
> 
> Pablo, any thoughts? Convinced? Given this touches YNL in significant
> ways I'd prefer to merge it to net-next.

No objections, thanks for asking.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH net-next v4 0/4] netlink: Add nftables spec w/ multi messages
  2024-04-18 10:47 [PATCH net-next v4 0/4] netlink: Add nftables spec w/ multi messages Donald Hunter
                   ` (3 preceding siblings ...)
  2024-04-18 10:47 ` [PATCH net-next v4 4/4] netfilter: nfnetlink: Handle ACK flags for batch messages Donald Hunter
@ 2024-04-23  0:53 ` patchwork-bot+netdevbpf
  4 siblings, 0 replies; 12+ messages in thread
From: patchwork-bot+netdevbpf @ 2024-04-23  0:53 UTC (permalink / raw)
  To: Donald Hunter
  Cc: netdev, kuba, davem, edumazet, pabeni, jiri, jacob.e.keller,
	pablo, kadlec, netfilter-devel, coreteam, donald.hunter

Hello:

This series was applied to netdev/net-next.git (main)
by Jakub Kicinski <kuba@kernel.org>:

On Thu, 18 Apr 2024 11:47:33 +0100 you wrote:
> This series adds a ynl spec for nftables and extends ynl with a --multi
> command line option that makes it possible to send transactional batches
> for nftables.
> 
> This series includes a patch for nfnetlink which adds ACK processing for
> batch begin/end messages. If you'd prefer that to be sent separately to
> nf-next then I can do so, but I included it here so that it gets seen in
> context.
> 
> [...]

Here is the summary with links:
  - [net-next,v4,1/4] doc/netlink/specs: Add draft nftables spec
    https://git.kernel.org/netdev/net-next/c/1ee731687137
  - [net-next,v4,2/4] tools/net/ynl: Fix extack decoding for directional ops
    https://git.kernel.org/netdev/net-next/c/0a966d606c68
  - [net-next,v4,3/4] tools/net/ynl: Add multi message support to ynl
    https://git.kernel.org/netdev/net-next/c/ba8be00f68f5
  - [net-next,v4,4/4] netfilter: nfnetlink: Handle ACK flags for batch messages
    https://git.kernel.org/netdev/net-next/c/bf2ac490d28c

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2024-04-23  0:53 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-04-18 10:47 [PATCH net-next v4 0/4] netlink: Add nftables spec w/ multi messages Donald Hunter
2024-04-18 10:47 ` [PATCH net-next v4 1/4] doc/netlink/specs: Add draft nftables spec Donald Hunter
2024-04-18 10:47 ` [PATCH net-next v4 2/4] tools/net/ynl: Fix extack decoding for directional ops Donald Hunter
2024-04-18 10:47 ` [PATCH net-next v4 3/4] tools/net/ynl: Add multi message support to ynl Donald Hunter
2024-04-18 10:47 ` [PATCH net-next v4 4/4] netfilter: nfnetlink: Handle ACK flags for batch messages Donald Hunter
2024-04-18 16:30   ` Pablo Neira Ayuso
2024-04-18 16:51     ` Jakub Kicinski
2024-04-22 20:33       ` Jakub Kicinski
2024-04-22 22:56         ` Pablo Neira Ayuso
2024-04-19 11:20     ` Donald Hunter
2024-04-22 22:55       ` Pablo Neira Ayuso
2024-04-23  0:53 ` [PATCH net-next v4 0/4] netlink: Add nftables spec w/ multi messages patchwork-bot+netdevbpf

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).