Xtables2 A7 spec draft

From: Jan Engelhardt @ 2011-02-02 22:04 UTC (permalink / raw)
  To: Netfilter Developer Mailing List

I am posting the Xtables2 Netlink interface specification, draft 7
for comments.

Further documentation and the toolchain around it are available 
through the project page at

	http://jengelh.medozas.de/projects/xtables/

 * User Documentation Chapter 1: Architectural Differences
 * Developer Documentation Part 1: Netlink interface (WIP)
   This is copied below to facilitate inline replies
--8<--

Netlink interface

1 Concepts

This section is non-normative. It shows the flow of thought, 
gives the reasons why the specification was conceived the way it 
is, and points out where the component problems are.

1.1 Nesting representation

The common element in Xtables is the ruleset, represented as a 
tree structure with ordering constraints at some levels:

ruleset (unordered tables)
 \__ table (unordered chains)
 |    \__ chain (ordered rules)
 |    |    \__ rule (ordered actions)
 |    |    |    \__ match (unordered data)
 |    |    |    |    \__ config-data
 |    |    |    |    |    \__ bin params
 |    |    |    |    \__ state-data
 |    |    |    |         \__ nlattrs
 |    |    |    \__ match...
 |    |    |    \__ target (unordered data)
 |    |    |    |    \__ config-data
 |    |    |    \__ target...
 |    |    |    \__ verdict...
 |    |    \__ rule...
 |    \__ chain...
 \__ table...

As a more concrete example, here is a small ruleset encoded in 
XML (just one of many possible representations):

<table>
  <chain name="INPUT">
    <rule idx="1">
      <match acidx="1" name="hashlimit" rev="1" csize="120">
        <config-data>...</config-data>
        <state-data>...</state-data>
      </match>
      <target acidx="2" name="TOS" rev="1">
        ...
      </target>
      <verdict acidx="3" name="ACCEPT" />
    </rule>
  </chain>
</table>

There are different ways to encode such a tree structure into a 
serialized stream. In many Netlink protocols, child attributes 
are encapsulated (a.k.a. “nested”, though we will avoid this 
term to prevent double use) and treated as a whole as the 
parent's opaque data. They cannot be told apart from normal data. 
(Like writing “<chain> &lt;rule&gt; ... &lt;/rule&gt; </chain>” in 
XML.) We will call this format “Encapsulated Encoding”.

To encode an attribute's length, struct nlattr only has a 16-bit 
field, which means the attribute header plus payload is limited 
to 64 KB. This limit is easily exceeded with the encapsulated 
encoding, because a chain attribute, for example, has to hold all 
rules of that chain. The problem is aggravated by the kernel's 
Netlink handler only allocating sk_buffs a page size worth, which 
leaves little room for extension data. In the worst case, the 
usable payload for attributes is only around 3600 bytes. Given 
that xt_u32's private data block is 1984 bytes already, you 
won't be able to fit two -m u32 invocations nested in a single 
rule into a dump.
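
The arithmetic behind this claim can be sketched as follows. The 
helper names are made up for illustration; the 4-byte header and 
4-byte alignment mirror NLA_HDRLEN and NLA_ALIGNTO from 
<linux/netlink.h>, and the 3600-byte budget is the worst-case 
figure quoted above, passed in as a parameter.

```c
#include <stddef.h>

#define SZ_NLA_HDRLEN 4	/* same value as NLA_HDRLEN in <linux/netlink.h> */

/* Round len up to the 4-byte Netlink attribute alignment. */
static size_t sz_align(size_t len)
{
	return (len + 3) & ~(size_t)3;
}

/* On-wire size of one attribute carrying payload_len bytes of payload. */
static size_t sz_attr(size_t payload_len)
{
	return SZ_NLA_HDRLEN + sz_align(payload_len);
}

/* Size of a rule attribute that encapsulates n blobs of blob_len bytes. */
static size_t sz_encap_rule(size_t n, size_t blob_len)
{
	return sz_attr(n * sz_attr(blob_len));
}

/* Nonzero if such a rule fits into budget bytes of attribute space. */
static int sz_fits(size_t n, size_t blob_len, size_t budget)
{
	return sz_encap_rule(n, blob_len) <= budget;
}
```

With these helpers, sz_encap_rule(2, 1984) comes to 3980 bytes, 
which is over the 3600-byte worst case, while a single blob 
(1992 bytes) still fits.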

Certain voices in the community call for obsoleting such data 
blobs and replacing them with Netlink attributes; there are no 
objections to doing so. However, the problem of size-limited 
sk_buffs applies to opaque data of any kind, and Netlink 
attributes fall within that.

The Xtables2 Netlink protocol encodes each node of information as 
a standalone attribute that is appended (a.k.a. “chained”) to the 
data stream; this is to be called Flat Encoding. Avoiding 
encapsulated attributes makes it possible to split messages at 
much finer levels, and gives attributes that happen to use opaque 
data a maximally-sized buffer.

1.2 Nest markers

Since Netlink messages do have a 32-bit quantity to store the 
message length, rulesets of roughly up to 4 GB are possible, 
which is currently regarded as sufficient. The largest (while 
still being meaningful) rulesets seen to date in the industry 
weighed in at approximately 150 MB.

Whereas encapsulated attribute encoding automatically provided 
for boundaries, this is realized using dummy attributes in the 
chained approach. The start of a nesting level can be implicitly 
represented by the presence of the attribute that would have 
otherwise been used for encapsulated nesting. For declaring an 
end of a nest level, an extra attribute is needed:

• “chain { rule; rule; ... }” ⇔ CHAIN RULE RULE ... STOP

1.3 Attribute limitations in nfnetlink

Netlink, being just a base protocol, does not specify what comes 
after the nlmsghdr, or how it is ordered. This is left up to the 
subprotocols based on Netlink. nfnetlink has two effective 
shortcomings (due to its parser) that should be kept in mind:

• Attribute ordering is ignored and lost

• No support for more than one attribute with the same type 
  within a message

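/* Simplified nfnetlink parser: attributes are indexed by type only */
struct nlattr **tb;
nla_for_each_attr(attr, head, ...)
        tb[nla_type(attr)] = attr; /* a repeated type overwrites the earlier entry */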

This kills the idea of being able to do, for example, a table 
replace, in a single Netlink request message. This is like having 
to split an XML file at every tag simply because two tags can 
carry the same attribute. So Netlink requests have to be broken 
down into many many tiny parts and extra state has to be kept 
around in the kernel.

put_header(msg, NFXTM_TABLE_REPLACE);
foreach (rule)
        put(msg, rule);
send(sock, msg);

will become

put_header(msg, NFXTM_TABLE_REPLACE);
send(sock, msg);
foreach (rule) {
        clean(msg);
        put_header(msg, NFXTM_RULE_DATA);
        put(msg, rule);
        send(sock, msg);
}
clean(msg);
put_header(msg, NFXTM_COMMIT);
send(sock, msg);

or worse. In other words, the fact that the kernel side will use 
a temporary table (an implementation detail) will be exposed to 
userspace, which is bad too.
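
The effect of the type-indexed parser described in this section 
can be reproduced in a few lines of userspace C. This is only a 
sketch: struct tb_attr and tb_parse() are hypothetical stand-ins, 
not kernel API, and the payload is reduced to a single int.

```c
#include <stddef.h>

#define TB_MAXTYPE 3

struct tb_attr {
	unsigned short type;	/* attribute type, 1..TB_MAXTYPE */
	int value;		/* stand-in for the payload */
};

/*
 * Index attributes by type, the way a tb[]-style parser does.
 * Returns the number of attributes that were overwritten (lost).
 */
static unsigned int tb_parse(const struct tb_attr *attrs, size_t n,
			     const struct tb_attr *tb[TB_MAXTYPE + 1])
{
	unsigned int lost = 0;
	size_t i;

	for (i = 0; i <= TB_MAXTYPE; ++i)
		tb[i] = NULL;
	for (i = 0; i < n; ++i) {
		if (attrs[i].type > TB_MAXTYPE)
			continue;	/* unknown type: skipped */
		if (tb[attrs[i].type] != NULL)
			++lost;		/* earlier attribute of this type is dropped */
		tb[attrs[i].type] = &attrs[i];
	}
	return lost;
}
```

Feeding it the sequence (type 2, type 1, type 2) leaves only the 
second type-2 attribute reachable, and the relative order of 
type 1 and type 2 can no longer be recovered from tb[].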

1.4 Summary of transform

Essentially there is a 1:1 transform on the XML-like tree shown 
above, to:

NFXTM_CHAIN_ENTRY<name=INPUT,usertid=1>
  NFXTM_RULE_ENTRY<idx=1,usertid=1>
    NFXTM_MATCH_ENTRY<acidx=1,name=hashlimit,rev=1,usertid=2>
      NFXTM_CONFIG_DATA
        NFXTM_ARB_DATA<whatever>
        NFXTM_ARB_DATA<more arbitrary data>
      NFXTM_STOP
      NFXTM_STATE_DATA
        NFXTM_ATTR_DATA<nlattrs>
        NFXTM_ATTR_DATA<more nlattrs>
      NFXTM_STOP
    NFXTM_STOP
    NFXTM_TARGET_ENTRY<acidx=2,name=TOS,rev=1,usertid=3>
      ...
    NFXTM_STOP
    NFXTM_VERDICT_ENTRY<acidx=3,name=ACCEPT,usertid=3>
    NFXTM_STOP
  NFXTM_STOP
NFXTM_STOP
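
A receiver of such a flat stream has to track the nesting level 
itself. The following is a minimal sketch of that bookkeeping, 
assuming that every *_ENTRY, NFXTM_CONFIG_DATA and 
NFXTM_STATE_DATA message opens a level and that NFXTM_ARB_DATA 
and NFXTM_ATTR_DATA are leaves. The enum is hypothetical and 
stands in for the real message types.

```c
#include <stddef.h>

enum nest_kind {
	NEST_STOP,	/* NFXTM_STOP */
	NEST_OPEN,	/* any *_ENTRY, NFXTM_CONFIG_DATA, NFXTM_STATE_DATA */
	NEST_LEAF,	/* NFXTM_ARB_DATA, NFXTM_ATTR_DATA */
};

/*
 * Walk a flat message stream and track the nesting depth.
 * Returns the maximum depth reached, or -1 if a STOP has no matching
 * nest start or if the stream ends inside an open nest.
 */
static int nest_max_depth(const enum nest_kind *msg, size_t n)
{
	int depth = 0, max = 0;
	size_t i;

	for (i = 0; i < n; ++i) {
		if (msg[i] == NEST_OPEN) {
			if (++depth > max)
				max = depth;
		} else if (msg[i] == NEST_STOP) {
			if (--depth < 0)
				return -1;
		}
	}
	return depth == 0 ? max : -1;
}
```

For the stream shown above (with the elided target contents left 
out), the deepest level is 4: chain, rule, match, config/state 
data.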

1.5 Extra sequence numbers

Netlink also does not specify any message ordering, though it 
does provide an nlmsg_seq field with which message order can at 
least be determined. The problem is that nothing specifies what 
nlmsg_seq should be in reply messages. It is assumed that the 
sequence number is linked, i. e. that a reply's number should be 
the same as the request's number, to do message matching (vague 
hint by netlink(7) manpage).

Even if that were decidedly so, that brings along a problem. In 
NLM_F_MULTI-style dumps, all messages would have the same 
nlmsg_seq. To counter this, multi messages will have an 
NFXT-specific sequence counter (NFXTA_SEQNO) in addition, 
especially since ordering is so much more crucial in Xtables than 
it is in other parts of networking.

1.6 Improved granularity error reporting

Xtables extensions as of Linux 2.6.37 can only return system 
error codes back to userspace in case there is a problem. The 
most common occurrences are, for example, ENOMEM (“Memory 
allocation failure” / “Out of memory”), and the dreaded EINVAL 
(“Invalid argument”). Best practice at the moment is to printk a 
string to the kernel log with further information detailing the 
circumstances that caused the EINVAL. In the light of this 
overload of EINVAL, an improved error reporting scheme is sought. 
(Other networking subsystems also suffer from this problem.)

By suggestion of Jozsef Kadlecsik, the Xtables2 protocol reports 
three kinds of errors:

• General/standard (integer) error codes, where there is no point 
  in (or no way of) specifying the nature of the error exactly. 
  As in the ENOMEM example: it is needless to report which new 
  data field could not be allocated.

• General Xtables2 error codes (largely replaces EINVAL sites) in 
  integer form, similar to errno. Use cases include:

  – chain for a requested operation does not exist

  – an extension is used from a hook it is not supposed to be

• Free-form string. Standalone, or in addition to the above.
  It is impossible to provision error numbers for extensions, 
  especially those that are out-of-tree. The problems caused by 
  forcing a component to reuse another component's error code 
  space can be seen in the overuse of EINVAL. We are aware that 
  raw strings in kernel modules can hinder internationalization, 
  but it is seen as the better choice over awkward error codes 
  that convey nothing. It is also expected that strings do not 
  change that often.

The three error types will be conveyed by three distinct 
attributes: NFXTA_ERRNO (generic error codes), NFXTA_XTERRNO (xt2 
error codes), and NFXTA_ERRSTR (free-form string).
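
Since each of the three attributes is optional, a client has to 
cope with any combination of them. A minimal sketch of how a 
userspace tool might render such a report is given here; struct 
err_report and err_format() are hypothetical names, and the output 
format is an arbitrary choice.

```c
#include <stdio.h>
#include <string.h>

struct err_report {
	int has_errno, has_xterrno;	/* presence flags, attributes are optional */
	int sys_errno;			/* NFXTA_ERRNO payload */
	int xt_errno;			/* NFXTA_XTERRNO payload */
	const char *errstr;		/* NFXTA_ERRSTR payload, or NULL */
};

/* Append piece to buf (capacity size), separated by a single space. */
static void err_append(char *buf, size_t size, const char *piece)
{
	size_t used = strlen(buf);

	if (used > 0 && used + 1 < size) {
		buf[used++] = ' ';
		buf[used] = '\0';
	}
	strncat(buf, piece, size - used - 1);
}

/* Render whichever of the three error attributes are present. */
static void err_format(const struct err_report *e, char *buf, size_t size)
{
	char num[32];

	if (size == 0)
		return;
	buf[0] = '\0';
	if (e->has_errno) {
		snprintf(num, sizeof(num), "errno=%d", e->sys_errno);
		err_append(buf, size, num);
	}
	if (e->has_xterrno) {
		snprintf(num, sizeof(num), "xterrno=%d", e->xt_errno);
		err_append(buf, size, num);
	}
	if (e->errstr != NULL)
		err_append(buf, size, e->errstr);
}
```

A real tool would likely translate the numeric codes to text 
first; the sketch only shows that the three sources are 
independent of each other.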

  Error pointer

Once a table/chain splice request has been finalized, 
xt_check_{match,target} is run, which can return:

• chain name, rule index, match/target index, NFXTE_*/custom 
  string

  Line number

I noticed Jozsef has added a line number attribute in ipset 
version 5 to help users locate errors. Because of its apparent 
value, such an attribute is also specified for xtnetlink:

A request message can contain a “ping attribute”, NFXTA_USERTID, 
which xtnetlink may keep track of and which may be reported back 
verbatim in case an error occurred. It may be used to represent 
the source line, or any other number.

• For the tree example in section 1, the ruleset file would be:

  -A INPUT \
  -m hashlimit ... \
  -j TOS ... -j ACCEPT

1.7 Multi-type responses

Using multi-type responses provides for a seemingly shorter reply 
(in at least one case) than not doing so:

• ⇒ NFXTM_CHAIN_DUMP<NFXTA_NAME>
  ⇐ NFXTM_RULE_START<>
  ⇐ NFXTM_EMATCH<NFXTA_NAME, NFXTA_REVISION, NFXTA_DATA>
  ⇐ NFXTM_EMATCH<NFXTA_NAME, NFXTA_REVISION, NFXTA_DATA>
  ⇐ NFXTM_ETARGET<NFXTA_NAME, NFXTA_REVISION, NFXTA_DATA>
  ⇐ NFXTM_ETARGET<NFXTA_NAME, NFXTA_REVISION, NFXTA_DATA>
  ⇐ NFXTM_RULE_END<>
  ⇐ NFXTM_RULE_START<>
  ⇐ NFXTM_ETARGET<NFXTA_VERDICT>
  ⇐ NFXTM_RULE_END<>
  ⇐ NLMSG_DONE

• ⇒ CHAIN_DUMP<NFXTA_NAME>
  ⇐ CHAIN_DUMP<NFXTA_RULE_START>
  ⇐ CHAIN_DUMP<NFXTA_MATCH_START>
  ⇐ CHAIN_DUMP<NFXTA_NAME, NFXTA_REVISION, NFXTA_DATA>
  ⇐ CHAIN_DUMP<NFXTA_MATCH_END>
  ⇐ CHAIN_DUMP<NFXTA_MATCH_START>
  ⇐ CHAIN_DUMP<NFXTA_NAME, NFXTA_REVISION, NFXTA_DATA>
  ⇐ CHAIN_DUMP<NFXTA_MATCH_END>
  ⇐ CHAIN_DUMP<NFXTA_TARGET_START>
  ⇐ CHAIN_DUMP<NFXTA_NAME, NFXTA_REVISION, NFXTA_DATA>
  ⇐ CHAIN_DUMP<NFXTA_TARGET_END>
  ⇐ CHAIN_DUMP<NFXTA_TARGET_START>
  ⇐ CHAIN_DUMP<NFXTA_NAME, NFXTA_REVISION, NFXTA_DATA>
  ⇐ CHAIN_DUMP<NFXTA_TARGET_END>
  ⇐ CHAIN_DUMP<NFXTA_RULE_END>
  ⇐ CHAIN_DUMP<NFXTA_RULE_START>
  ⇐ CHAIN_DUMP<NFXTA_TARGET_START>
  ⇐ CHAIN_DUMP<NFXTA_VERDICT>
  ⇐ CHAIN_DUMP<NFXTA_TARGET_END>
  ⇐ CHAIN_DUMP<NFXTA_RULE_END>
  ⇐ NLMSG_DONE
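
The difference in reply length between the two variants can be 
put into numbers. The sketch below counts reply messages under 
the assumption, read off the two listings, that the multi-type 
form needs one message per action and the single-type form three 
(start marker, data, end marker), plus two rule markers per rule 
and one NLMSG_DONE. The function names are made up.

```c
#include <stddef.h>

/* Reply count when every action is one typed message. */
static size_t cnt_multi_type(const unsigned int *actions_per_rule, size_t rules)
{
	size_t total = 1;	/* trailing NLMSG_DONE */
	size_t i;

	for (i = 0; i < rules; ++i)
		total += 2 + actions_per_rule[i];	/* rule start/end + actions */
	return total;
}

/* Reply count when every action needs start, data and end messages. */
static size_t cnt_single_type(const unsigned int *actions_per_rule, size_t rules)
{
	size_t total = 1;	/* trailing NLMSG_DONE */
	size_t i;

	for (i = 0; i < rules; ++i)
		total += 2 + 3 * (size_t)actions_per_rule[i];
	return total;
}
```

For the example chain (one rule with four actions, one rule with 
a single verdict) this gives 10 replies against 20.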

2 General use

2.1 Socket

Xtables2 is made available through an nfnetlink socket. 
Specifically, this is a Netlink socket of type NETLINK_NETFILTER, 
with which messages are exchanged that are tagged having Xtables 
as the subsystem.

#include <sys/socket.h>
#include <linux/netlink.h>

struct nlmsghdr nlmsg;
int nf_socket = socket(AF_NETLINK, SOCK_RAW, NETLINK_NETFILTER);
/* nfnetlink: subsystem in the upper byte, message type in the lower */
nlmsg.nlmsg_type = (NFNL_SUBSYS_XTABLES << 8) | xt_msg_type;
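
The split of nlmsg_type into subsystem and message number can be 
sketched with three small helpers. They are hypothetical names 
that do the same as the shift-and-mask convention used by 
nfnetlink; the subsystem number is passed in as a parameter 
because this draft does not assign a value to 
NFNL_SUBSYS_XTABLES.

```c
#include <stdint.h>

/* Compose an nfnetlink nlmsg_type from subsystem and message number. */
static uint16_t typ_compose(uint8_t subsys, uint8_t msg)
{
	return (uint16_t)((subsys << 8) | msg);
}

/* Upper byte: the nfnetlink subsystem. */
static uint8_t typ_subsys(uint16_t type)
{
	return (uint8_t)(type >> 8);
}

/* Lower byte: the message number within the subsystem. */
static uint8_t typ_msg(uint16_t type)
{
	return (uint8_t)(type & 0xff);
}
```

With a placeholder subsystem number of 0x0c, NFXTM_ABORT (2) 
would travel as nlmsg_type 0x0c02.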

2.2 Message format

All messages transmitted over the Netlink socket are to have the 
base struct nlmsghdr header, followed by a struct nfgenmsg header 
as mandated by nfnetlink. The .nfgen_family member is always set 
to NFPROTO_UNSPEC. The .version member denotes the format of the 
byte stream following nfgenmsg; this is currently version 0. The 
.res_id member is unused.
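
A minimal sketch of how a client might lay out the two headers 
in a buffer follows. hdr_put() is a hypothetical helper; the 
choice of NLM_F_REQUEST | NLM_F_ACK as flags is an assumption 
based on the “Standard Netlink ACK” responses specified later, 
and the literal 0 for the family is the value of NFPROTO_UNSPEC.

```c
#include <string.h>
#include <linux/netlink.h>
#include <linux/netfilter/nfnetlink.h>

/*
 * Fill buf with an nlmsghdr followed by an nfgenmsg and return the
 * total message length, or 0 if buf is too small. No attributes are
 * appended. buf must be suitably aligned for struct nlmsghdr.
 */
static size_t hdr_put(void *buf, size_t size, unsigned short type,
		      unsigned int seq)
{
	struct nlmsghdr *nlh = buf;
	struct nfgenmsg *nfg;

	if (size < NLMSG_LENGTH(sizeof(*nfg)))
		return 0;
	memset(buf, 0, NLMSG_LENGTH(sizeof(*nfg)));
	nlh->nlmsg_len   = NLMSG_LENGTH(sizeof(*nfg));
	nlh->nlmsg_type  = type;
	nlh->nlmsg_flags = NLM_F_REQUEST | NLM_F_ACK;	/* assumption */
	nlh->nlmsg_seq   = seq;

	nfg = NLMSG_DATA(nlh);
	nfg->nfgen_family = 0;	/* NFPROTO_UNSPEC */
	nfg->version      = 0;	/* current byte stream format */
	nfg->res_id       = 0;	/* unused */
	return nlh->nlmsg_len;
}
```

The resulting message is 20 bytes long: 16 bytes of nlmsghdr 
plus 4 bytes of nfgenmsg.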

3 Attributes

The meaning of attributes depends upon the message and logical 
nesting level in which they appear. Their type however remains 
the same, such that a single Netlink attribute validation policy 
object (struct nla_policy) can be used for all message types.

A table of all known attributes:

+--------+-------------------+---------------+-----------------+--------------------------------------+
| Value  | Mnemonic          |    C type     | NLA type        | Notes                                |
+--------+-------------------+---------------+-----------------+--------------------------------------+
+--------+-------------------+---------------+-----------------+--------------------------------------+
|   1    | NFXTA_SEQNO       | unsigned int  | NLA_U32         | Section 1.5                          |
+--------+-------------------+---------------+-----------------+--------------------------------------+
|  tba   | NFXTA_ERRNO       |     int       | NLA_U32         | Generic system errno (Exxx)          |
+--------+-------------------+---------------+-----------------+--------------------------------------+
|  ...   | NFXTA_XTERRNO     |     int       | NLA_U32         | NFXT errno (NFXTE_*)                 |
+--------+-------------------+---------------+-----------------+--------------------------------------+
|        | NFXTA_ERRSTR      |   char []     | NLA_NUL_STRING  | Arbitrary                            |
+--------+-------------------+---------------+-----------------+--------------------------------------+
|        | NFXTA_USERTID     | unsigned int  | NLA_U32         | Arbitrary, retained verbatim         |
+--------+-------------------+---------------+-----------------+--------------------------------------+
|        | NFXTA_CHAIN_NAME  |   char []     | NLA_NUL_STRING  |                                      |
+--------+-------------------+---------------+-----------------+--------------------------------------+
|        | NFXTA_RULE_IDX    | unsigned int  | NLA_U32         |                                      |
+--------+-------------------+---------------+-----------------+--------------------------------------+
|        | NFXTA_ACTION_IDX  | unsigned int  | NLA_U32         |                                      |
+--------+-------------------+---------------+-----------------+--------------------------------------+
|        | NFXTA_NAME        |   char []     | NLA_NUL_STRING  |                                      |
+--------+-------------------+---------------+-----------------+--------------------------------------+
|        | NFXTA_REVISION    |   uint8_t     | NLA_U32         |                                      |
+--------+-------------------+---------------+-----------------+--------------------------------------+
|        | NFXTA_HOOKNUM     | unsigned int  | NLA_U32         |                                      |
+--------+-------------------+---------------+-----------------+--------------------------------------+
|        | NFXTA_PRIORITY    |     int       | NLA_U32         |                                      |
+--------+-------------------+---------------+-----------------+--------------------------------------+
|        | NFXTA_NFPROTO     |   uint8_t     | NLA_U32         |                                      |
+--------+-------------------+---------------+-----------------+--------------------------------------+
|        | NFXTA_OFFSET      | unsigned int  | NLA_U32         |                                      |
+--------+-------------------+---------------+-----------------+--------------------------------------+
|        | NFXTA_LENGTH      |    size_t     | NLA_U32         |                                      |
+--------+-------------------+---------------+-----------------+--------------------------------------+
|        | NFXTA_HOOKMASK    | unsigned int  | NLA_U32         |                                      |
+--------+-------------------+---------------+-----------------+--------------------------------------+
|        | NFXTA_SIZE        |    size_t     | NLA_U32         |                                      |
+--------+-------------------+---------------+-----------------+--------------------------------------+
|        | NFXTA_NEW_NAME    |   char []     | NLA_NUL_STRING  |                                      |
+--------+-------------------+---------------+-----------------+--------------------------------------+

The kernel ignores attributes of type 0 during validation, so 
that value was left unused.

4 Error types

+--------+---------------------+-------------------------------------------+
| Value  | Mnemonic            | Description                               |
+--------+---------------------+-------------------------------------------+
+--------+---------------------+-------------------------------------------+
|   0    | NFXTE_SUCCESS       | No error                                  |
+--------+---------------------+-------------------------------------------+
|   1    | NFXTE_CHAIN_EXIST   | Chain already exists                      |
+--------+---------------------+-------------------------------------------+
|   2    | NFXTE_CHAIN_NOENT   | Chain does not exist                      |
+--------+---------------------+-------------------------------------------+
|   3    | NFXTE_RULESET_LOOP  | Ruleset contains a loop                   |
+--------+---------------------+-------------------------------------------+
|   4    | NFXTE_EXT_HOOKMASK  | Rule invoked from incompatible hook       |
+--------+---------------------+-------------------------------------------+
|        | NFXTE_PROMO_STATUS  | Promotion/demotion state already achieved |
+--------+---------------------+-------------------------------------------+
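
A userspace tool will want to turn these codes into text. The 
following sketch shows one way to do it; the XTE_* names and 
xte_str() are hypothetical, and the value 5 for the promotion 
status code is an assumption, since the table does not assign a 
number to it yet.

```c
#include <stddef.h>

enum xte_code {
	XTE_SUCCESS      = 0,
	XTE_CHAIN_EXIST  = 1,
	XTE_CHAIN_NOENT  = 2,
	XTE_RULESET_LOOP = 3,
	XTE_EXT_HOOKMASK = 4,
	XTE_PROMO_STATUS = 5,	/* assumption: not numbered in the table */
};

/* Map an NFXTA_XTERRNO value to the description from the table. */
static const char *xte_str(unsigned int code)
{
	static const char *const text[] = {
		[XTE_SUCCESS]      = "No error",
		[XTE_CHAIN_EXIST]  = "Chain already exists",
		[XTE_CHAIN_NOENT]  = "Chain does not exist",
		[XTE_RULESET_LOOP] = "Ruleset contains a loop",
		[XTE_EXT_HOOKMASK] = "Rule invoked from incompatible hook",
		[XTE_PROMO_STATUS] = "Promotion/demotion state already achieved",
	};

	if (code >= sizeof(text) / sizeof(text[0]))
		return "Unknown Xtables2 error";
	return text[code];
}
```

Codes outside the table yield a generic string, so that a newer 
kernel with additional codes does not break an older client.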

5 Message types

+------+-----------------------+----------------+---------------------------------------------+
| ID   | Mnemonic              |      Dir       | Notes                                       |
+------+-----------------------+----------------+---------------------------------------------+
+------+-----------------------+----------------+---------------------------------------------+
|  0   | NFXTM_STOP            |     both       | End of logical nesting level or transaction |
+------+-----------------------+----------------+---------------------------------------------+
|  1   | NFXTM_ERROR           |     k→u        | Kills transactions (but not dumps)          |
+------+-----------------------+----------------+---------------------------------------------+
|  2   | NFXTM_ABORT           |     u→k        | Abort transaction                           |
+------+-----------------------+----------------+---------------------------------------------+
| tba  | NFXTM_CHAIN_NEW       |     u→k        |                                             |
+------+-----------------------+----------------+---------------------------------------------+
| ...  | NFXTM_CHAIN_DEL       |     u→k        |                                             |
+------+-----------------------+----------------+---------------------------------------------+
|      | NFXTM_CHAIN_MOVE      |     u→k        |                                             |
+------+-----------------------+----------------+---------------------------------------------+
|      | NFXTM_CHAIN_PROMOTE   |     u→k        |                                             |
+------+-----------------------+----------------+---------------------------------------------+
|      | NFXTM_CHAIN_DEMOTE    |     u→k        |                                             |
+------+-----------------------+----------------+---------------------------------------------+
|      | NFXTM_TABLE_DUMP      |     u→k        | Dump start                                  |
+------+-----------------------+----------------+---------------------------------------------+
|      | NFXTM_CHAIN_ENTRY     |     both       | Nest start                                  |
+------+-----------------------+----------------+---------------------------------------------+
|      | NFXTM_RULE_ENTRY      |     both       | Nest start                                  |
+------+-----------------------+----------------+---------------------------------------------+
|      | NFXTM_MATCH_ENTRY     |     both       | Nest start                                  |
+------+-----------------------+----------------+---------------------------------------------+
|      | NFXTM_TARGET_ENTRY    |     both       | Nest start                                  |
+------+-----------------------+----------------+---------------------------------------------+
|      | NFXTM_VERDICT_ENTRY   |     both       | Nest start                                  |
+------+-----------------------+----------------+---------------------------------------------+
|      | NFXTM_JUMP_ENTRY      |     both       | Nest start                                  |
+------+-----------------------+----------------+---------------------------------------------+
|      | NFXTM_GOTO_ENTRY      |     both       | Nest start                                  |
+------+-----------------------+----------------+---------------------------------------------+
|      | NFXTM_CONFIG_DATA     |     both       | Nest start                                  |
+------+-----------------------+----------------+---------------------------------------------+
|      | NFXTM_STATE_DATA      |     both       | Nest start                                  |
+------+-----------------------+----------------+---------------------------------------------+
|      | NFXTM_ARB_DATA        |     both       | Arbitrary data                              |
+------+-----------------------+----------------+---------------------------------------------+
|      | NFXTM_ATTR_DATA       |     both       | Attribute list                              |
+------+-----------------------+----------------+---------------------------------------------+
|      | NFXTM_CHAIN_SPLICE    |     u→k        | Transaction start, nest start               |
+------+-----------------------+----------------+---------------------------------------------+
|      | NFXTM_TABLE_REPLACE   |     u→k        | Transaction start, nest start               |
+------+-----------------------+----------------+---------------------------------------------+
|      | NFXTM_IDENTIFY        |     both       | Dump start                                  |
+------+-----------------------+----------------+---------------------------------------------+
|      | NFXTM_IDMATCH_ENTRY   |     k→u        |                                             |
+------+-----------------------+----------------+---------------------------------------------+
|      | NFXTM_IDTARGET_ENTRY  |     k→u        |                                             |
+------+-----------------------+----------------+---------------------------------------------+
|      | NFXTM_EVENT           |     k→u        |                                             |
+------+-----------------------+----------------+---------------------------------------------+

5.1 End of nest level / transaction commit

NFXTM_STOP is used to end a nesting level as started by, for 
example, NFXTM_RULE_ENTRY.

It is also used to finish (commit) a transaction, such as with 
NFXTM_TABLE_REPLACE.

Request NFXTM_STOP:

• No attributes.

Response:

• Standard Netlink ACK.

5.2 Error report

xtnetlink uses NFXTM_ERROR to report back detailed errors on 
actions.

Possible attributes:

• NFXTA_ERRNO: generic error, using system-level errno codes 
  (ENOMEM, etc.)

• NFXTA_XTERRNO: xtnetlink error, see section 4

• NFXTA_ERRSTR: free-form error string provided by extensions

• NFXTA_USERTID: user token received earlier is echoed back for 
  reference (may be used for things like line numbers)

• NFXTA_CHAIN_NAME: name of chain whose processing caused the 
  error

• NFXTA_RULE_IDX: index to rule (0-based) that caused the error

• NFXTA_ACTION_IDX: index to match/target/verdict (0-based) in 
  the particular rule that caused the error

(RFC:) Should outstanding transaction be terminated?

When NFXTM_ERROR is sent in an NLM_F_MULTI dump stream, an 
NLMSG_DONE message will still follow.

5.3 Transaction termination

NFXTM_ABORT can be used to abort a transaction as started by, for 
example, NFXTM_TABLE_REPLACE.

Request NFXTM_ABORT:

• No attributes.

Response:

• Standard Netlink ACK.

5.4 Chain creation

Request NFXTM_CHAIN_NEW:

• Attribute NFXTA_NAME: name of the new chain.

Response:

• Standard Netlink ACK, or NFXTM_ERROR:

  – ENOMEM: Out of memory

  – NFXTE_CHAIN_EXIST: Chain already exists

5.5 Chain deletion

Request NFXTM_CHAIN_DEL with attributes:

• NFXTA_NAME attribute carrying the name of the chain to delete

Response:

• Standard Netlink ACK, or NFXTM_ERROR:

  – NFXTE_CHAIN_NOENT: Chain does not exist.

Notes:

The chain is automatically demoted.

5.6 Chain renaming

Request:

• Type: NFXTM_CHAIN_MOVE

• Attributes: NFXTA_NAME (old name), NFXTA_NEW_NAME (new chain 
  name).

Response:

• Standard Netlink ACK, or NFXTM_ERROR:

  – NFXTE_CHAIN_NOENT: Source chain does not exist

  – NFXTE_CHAIN_EXIST: Target chain already exists

5.7 Promotion to base chain

Sets the specified chain up as an entry point from Netfilter 
proper. (It does this by creating an appropriate nf_hook.)

Request:

• Type: NFXTM_CHAIN_PROMOTE

• Attributes: NFXTA_NAME, NFXTA_HOOKNUM (NF_INET_*/NF_ARP_*/
  NF_BR_*), NFXTA_PRIORITY, NFXTA_NFPROTO (one of the NFPROTO_* 
  constants)

Response:

• Standard Netlink ACK, or NFXTM_ERROR:

  – NFXTE_CHAIN_NOENT: The specified chain does not exist.

  – NFXTE_PROMO_STATUS: Already promoted.

  – NFXTE_RULESET_LOOP: There is a loop in the rule tree, which 
    is not allowed.

  – NFXTE_EXT_HOOKMASK: One or more extensions are used from a 
    hook that they do not support being invoked from.

Example:

• Turn the chain named “filter/ipv6/INPUT” into the equivalent of 
  the classic INPUT hook in the filter table: 
  NFXTA_NAME=“filter/ipv6/INPUT”, 
  NFXTA_HOOKNUM=NF_INET_LOCAL_IN (1), 
  NFXTA_PRIORITY=0, NFXTA_NFPROTO=NFPROTO_IPV6 (10).
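
How the four attributes of this example might be laid out on the 
wire is sketched here. The attribute numbers are placeholders 
(the attribute table lists them as “tba”; the values assume 
numbering in table order), struct attr_hdr mirrors the layout of 
struct nlattr, and attr_put() is a hypothetical helper.

```c
#include <stdint.h>
#include <string.h>

/* Assumption: placeholder numbers, derived from the table order. */
enum {
	ATTR_NFXTA_NAME     = 9,
	ATTR_NFXTA_HOOKNUM  = 11,
	ATTR_NFXTA_PRIORITY = 12,
	ATTR_NFXTA_NFPROTO  = 13,
};

struct attr_hdr {		/* same layout as struct nlattr */
	uint16_t nla_len;	/* header + payload, without padding */
	uint16_t nla_type;
};

/*
 * Append one attribute at offset off. Returns the new offset (always
 * nonzero on success), or 0 if the attribute does not fit.
 */
static size_t attr_put(unsigned char *buf, size_t size, size_t off,
		       uint16_t type, const void *data, uint16_t len)
{
	struct attr_hdr hdr;
	size_t total;

	if (len > UINT16_MAX - sizeof(hdr))
		return 0;	/* 16-bit length field would overflow */
	total = (sizeof(hdr) + len + 3) & ~(size_t)3;
	if (off > size || total > size - off)
		return 0;
	hdr.nla_len  = (uint16_t)(sizeof(hdr) + len);
	hdr.nla_type = type;
	memset(buf + off, 0, total);	/* also zeroes the padding */
	memcpy(buf + off, &hdr, sizeof(hdr));
	memcpy(buf + off + sizeof(hdr), data, len);
	return off + total;
}
```

Encoding the chain name (18 bytes including the trailing NUL) 
followed by three 32-bit values takes 24 + 8 + 8 + 8 = 48 bytes 
of attribute space.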

5.8 Demotion from base chain

Removes the nf_hook.

Request:

• Type: NFXTM_CHAIN_DEMOTE

• Attributes: NFXTA_NAME

Response:

• Standard Netlink ACK, or NFXTM_ERROR:

  – NFXTE_CHAIN_NOENT: The specified chain does not exist.

  – NFXTE_PROMO_STATUS: Already demoted.

5.9 Implementation Identification (debug)

First and foremost a debug command, and to get something 
(table/chain-independent) that users can glare at (they love 
doing that).

Request:

• nlmsg_type = NFXTM_IDENTIFY;

Multiple message response:

• An NFXTM_IDENTIFY message containing:

  – An NFXTA_NAME attribute giving the name of the 
    implementation/patchset.

• Zero or more NFXTM_IDMATCH_ENTRY messages, giving 
  metainformation about the loaded match extensions. Each message 
  contains four attributes:

  – An NFXTA_NAME attribute for the name of the extension.

  – An NFXTA_REVISION attribute to denote the version of the 
    extension's parameter protocol.

  – An NFXTA_SIZE attribute for the size of its per-instance data 
    block.

  – An NFXTA_HOOKMASK attribute for the bitmap of hooks the 
    extensions may be used from.

• Zero or more NFXTM_IDTARGET_ENTRY messages, giving 
  metainformation about the loaded target extensions:

  – attributes like NFXTM_IDMATCH_ENTRY.

• NLMSG_DONE message.

5.10 Rule dump

Atomic dump of entire table/ruleset, or a single chain, with or 
without rules.

Request:

• nlmsg_type = NFXTM_TABLE_DUMP;

• NFXTA_NAME attribute specifying the name of the chain to dump. 
  Absence of attribute dumps entire table.

• NFXTA_RULE_IDX attribute specifying the particular rule 
  (1-based index) to dump. Absence of attribute dumps entire 
  chain. Use 0 to only get a chain list.

Multi Response:

• Zero or more chains, represented by the start marker message 
  NFXTM_CHAIN_ENTRY and the end marker NFXTM_STOP. The 
  NFXTM_CHAIN_ENTRY message may have NFXTA_HOOKNUM, 
  NFXTA_PRIORITY and NFXTA_NFPROTO attributes if it is a base 
  chain.

• Zero or more rules within NFXTM_CHAIN_ENTRY .. NFXTM_STOP, 
  represented by the start marker message NFXTM_RULE_ENTRY and 
  the end marker NFXTM_STOP.

• Zero or more actions within NFXTM_RULE_ENTRY .. NFXTM_STOP, 
  represented by the start marker message NFXTM_MATCH_ENTRY, 
  NFXTM_TARGET_ENTRY, NFXTM_VERDICT_ENTRY, NFXTM_JUMP_ENTRY or 
  NFXTM_GOTO_ENTRY and the end marker NFXTM_STOP.

• Zero or more config data messages within NFXTM_MATCH_ENTRY or 
  NFXTM_TARGET_ENTRY.

• Zero or more state data messages within NFXTM_MATCH_ENTRY or 
  NFXTM_TARGET_ENTRY.

(See section 1.4 for an example.)

Errors:

• If an error occurs during dump, an NFXTM_ERROR message is 
  emitted into the stream and the dump will then immediately 
  terminate with a standard NLMSG_DONE message. No NFXTM_STOP 
  messages will be emitted if the dump stopped in the middle of 
  a nesting level.

5.11 Table replace

Atomic replacement of an entire table/ruleset.

1. User sends NFXTM_TABLE_REPLACE request. The state is 
  remembered per client socket.

2. Within this transaction, the following commands operate on a 
  temporary table: NFXTM_CHAIN_NEW, NFXTM_CHAIN_DEL, 
  NFXTM_CHAIN_MOVE, NFXTM_CHAIN_SPLICE.

3. End transaction with NFXTM_STOP, or abort with NFXTM_ABORT.
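
The per-socket state implied by these three steps can be 
sketched as a small state machine. All names are hypothetical; 
rejecting a nested NFXTM_TABLE_REPLACE and rejecting a 
transaction-level NFXTM_STOP or NFXTM_ABORT outside a 
transaction are assumptions, and the nested stop that ends a 
chain splice is left out for brevity.

```c
enum txn_state { TXN_IDLE, TXN_REPLACING };

enum txn_cmd {
	TXN_TABLE_REPLACE,	/* NFXTM_TABLE_REPLACE */
	TXN_CHAIN_OP,		/* NFXTM_CHAIN_NEW/DEL/MOVE/SPLICE */
	TXN_STOP,		/* NFXTM_STOP at transaction level */
	TXN_ABORT,		/* NFXTM_ABORT */
};

enum txn_effect {
	TXN_USE_LIVE,	/* operate on the live table */
	TXN_USE_TEMP,	/* operate on the temporary table */
	TXN_COMMIT,	/* temporary table replaces the live one */
	TXN_DISCARD,	/* temporary table is thrown away */
	TXN_REJECT,	/* command not valid in this state */
};

/* Advance the per-socket state and tell the caller what to do. */
static enum txn_effect txn_step(enum txn_state *st, enum txn_cmd cmd)
{
	switch (cmd) {
	case TXN_TABLE_REPLACE:
		if (*st == TXN_REPLACING)
			return TXN_REJECT;	/* assumption: no nesting */
		*st = TXN_REPLACING;
		return TXN_USE_TEMP;
	case TXN_CHAIN_OP:
		return *st == TXN_REPLACING ? TXN_USE_TEMP : TXN_USE_LIVE;
	case TXN_STOP:
	case TXN_ABORT:
		if (*st != TXN_REPLACING)
			return TXN_REJECT;	/* assumption: nothing to end */
		*st = TXN_IDLE;
		return cmd == TXN_STOP ? TXN_COMMIT : TXN_DISCARD;
	}
	return TXN_REJECT;
}
```

The same chain command thus lands on the live table or on the 
temporary table depending only on whether the socket has an open 
table replace.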

5.12 Chain splicing (add/delete rules)

Chain splicing does a bulk deletion of zero or more consecutive 
rules, followed by a bulk insertion of zero or more consecutive 
rules, all done in an atomic fashion. It operates similar to 
Perl's splice function on arrays.

The user starts a transaction with a NFXTM_CHAIN_SPLICE request, 
supplying the name of the chain that is to be modified in a 
NFXTA_NAME attribute. xtnetlink will take the read lock on the 
table to prevent a table replace operation from interfering, and 
will take the write lock on the chain.

1. While in this context, higher-level transactions like 
  NFXTM_TABLE_REPLACE, are rejected.

2. Send new rules (ordered list).

3. End transaction with NFXTM_STOP, or abort transaction entirely 
  with NFXTM_ABORT.

New rules:

1. Send NFXTM_RULE_ENTRY. Must occur within the context of 
  NFXTM_CHAIN_SPLICE.

2. NFXTM_STOP. This ends the current rule.


Request:

• NFXTA_NAME: Name of the chain to modify.

• NFXTA_OFFSET: Index of entry where operation should start.

• NFXTA_LENGTH: Number of entries starting from offset that 
  should be removed. May be zero or more.

• Zero or more rules.

Response:

• Standard ACK.

• or detailed error code.
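
The splice semantics (remove NFXTA_LENGTH entries at 
NFXTA_OFFSET, then insert the new rules at the same position) 
can be illustrated on a plain array. This is only a sketch: rules 
are reduced to int identifiers, the offset is taken as 0-based, 
which is an assumption, and splice_rules() is a hypothetical 
name. A kernel implementation would have to build the new rule 
array separately and swap it in to remain atomic.

```c
#include <string.h>

/*
 * Remove del entries starting at off from rules[0..*count) and insert
 * the ins entries of add[] in their place. cap is the array capacity.
 * Returns 0 on success, -1 if the arguments are out of range.
 */
static int splice_rules(int *rules, size_t *count, size_t cap,
			size_t off, size_t del, const int *add, size_t ins)
{
	size_t tail;

	if (off > *count || del > *count - off)
		return -1;	/* range to delete is outside the chain */
	if (*count - del > cap || ins > cap - (*count - del))
		return -1;	/* result would not fit */

	/* Move the rules behind the deleted range to their new place. */
	tail = *count - off - del;
	memmove(rules + off + ins, rules + off + del, tail * sizeof(*rules));
	if (ins > 0)
		memcpy(rules + off, add, ins * sizeof(*rules));
	*count = *count - del + ins;
	return 0;
}
```

With offset 1, length 2 and three new rules, the chain 
(10, 20, 30, 40) becomes (10, 50, 60, 70, 40). A length of zero 
gives a pure insertion, and zero new rules a pure deletion.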
