netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Vlad Buslov <vladbu@mellanox.com>
To: Pablo Neira Ayuso <pablo@netfilter.org>
Cc: "netfilter-devel@vger.kernel.org"
	<netfilter-devel@vger.kernel.org>,
	"davem@davemloft.net" <davem@davemloft.net>,
	"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
	"jakub.kicinski@netronome.com" <jakub.kicinski@netronome.com>,
	"jiri@resnulli.us" <jiri@resnulli.us>,
	Saeed Mahameed <saeedm@mellanox.com>,
	"vishal@chelsio.com" <vishal@chelsio.com>,
	Vlad Buslov <vladbu@mellanox.com>
Subject: Re: [PATCH net-next,v2 3/4] net: flow_offload: mangle action at byte level
Date: Wed, 4 Sep 2019 12:48:49 +0000	[thread overview]
Message-ID: <vbfimq88bx5.fsf@mellanox.com> (raw)
In-Reply-To: <20190903164513.15462-4-pablo@netfilter.org>


On Tue 03 Sep 2019 at 19:45, Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> The flow mangle action is originally modeled after the tc pedit action,
> this has a number of shortcomings:
>
> 1) The tc pedit offset must be set on the 32-bits boundaries. Many
>    protocol header field offsets are not aligned to 32-bits, eg. port
>    destination, port source and ethernet destination. This patch adjusts
>    the offset accordingly and trim off length in these case, so drivers get
>    an exact offset and length to the header fields.
>
> 2) The maximum mangle length is one word of 32-bits, hence you need to
>    up to four actions to mangle an IPv6 address. This patch coalesces
>    consecutive tc pedit actions into one single action so drivers can
>    configure the IPv6 mangling in one go. Ethernet address fields now
>    require one single action instead of two too.
>
> The following drivers have been updated accordingly to use this new
> mangle action layout:
>
> 1) The cxgb4 driver does not need to split protocol field matching
>    larger than one 32-bit words into multiple definitions. Instead one
>    single definition per protocol field is enough. Checking for
>    transport protocol ports is also simplified.
>
> 2) The mlx5 driver logic to disallow IPv4 ttl and IPv6 hoplimit fields
>    becomes more simple too.
>
> 3) The nfp driver uses the nfp_fl_set_helper() function to configure the
>    payload mangling. The memchr_inv() function is used to check for
>    proper initialization of the value and mask. The driver has been
>    updated to refer to the exact protocol header offsets too.
>
> As a result, this patch reduces code complexity on the driver side at
> the cost of adding ~100 LOC at the core to perform offset and length
> adjustment; and to coalesce consecutive actions.
>
> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
> ---
>  .../net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.c   | 162 +++++-----------
>  .../net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.h   |  40 ++--
>  drivers/net/ethernet/mellanox/mlx5/core/en_tc.c    |  90 +++------
>  drivers/net/ethernet/netronome/nfp/flower/action.c | 203 ++++++++++-----------
>  include/net/flow_offload.h                         |   7 +-
>  net/sched/cls_api.c                                | 145 ++++++++++++---
>  6 files changed, 309 insertions(+), 338 deletions(-)

[...]

> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
> index f29895b3a947..b7b88bc22cf7 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
> @@ -2201,19 +2201,24 @@ static int pedit_header_offsets[] = {
>
>  #define pedit_header(_ph, _htype) ((void *)(_ph) + pedit_header_offsets[_htype])
>
> -static int set_pedit_val(u8 hdr_type, u32 mask, u32 val, u32 offset,
> +static int set_pedit_val(u8 hdr_type, const struct flow_action_entry *act,
>  			 struct pedit_headers_action *hdrs)
>  {
> -	u32 *curr_pmask, *curr_pval;
> +	u32 offset = act->mangle.offset;
> +	u8 *curr_pmask, *curr_pval;
> +	int i;
>
> -	curr_pmask = (u32 *)(pedit_header(&hdrs->masks, hdr_type) + offset);
> -	curr_pval  = (u32 *)(pedit_header(&hdrs->vals, hdr_type) + offset);
> +	curr_pmask = (u8 *)(pedit_header(&hdrs->masks, hdr_type) + offset);
> +	curr_pval  = (u8 *)(pedit_header(&hdrs->vals, hdr_type) + offset);
>
> -	if (*curr_pmask & mask)  /* disallow acting twice on the same location */
> -		goto out_err;
> +	for (i = 0; i < act->mangle.len; i++) {
> +		/* disallow acting twice on the same location */
> +		if (curr_pmask[i] & act->mangle.mask[i])
> +			goto out_err;
>
> -	*curr_pmask |= mask;
> -	*curr_pval  |= val;
> +		curr_pmask[i] |= act->mangle.mask[i];
> +		curr_pval[i] |= act->mangle.val[i];
> +	}
>
>  	return 0;
>
> @@ -2487,7 +2492,6 @@ static int parse_tc_pedit_action(struct mlx5e_priv *priv,
>  {
>  	u8 cmd = (act->id == FLOW_ACTION_MANGLE) ? 0 : 1;
>  	int err = -EOPNOTSUPP;
> -	u32 mask, val, offset;
>  	u8 htype;
>
>  	htype = act->mangle.htype;
> @@ -2504,11 +2508,7 @@ static int parse_tc_pedit_action(struct mlx5e_priv *priv,
>  		goto out_err;
>  	}
>
> -	mask = act->mangle.mask;
> -	val = act->mangle.val;
> -	offset = act->mangle.offset;
> -
> -	err = set_pedit_val(htype, mask, val, offset, &hdrs[cmd]);
> +	err = set_pedit_val(htype, act, &hdrs[cmd]);
>  	if (err)
>  		goto out_err;
>
> @@ -2589,50 +2589,18 @@ static bool csum_offload_supported(struct mlx5e_priv *priv,
>  	return true;
>  }
>
> -struct ip_ttl_word {
> -	__u8	ttl;
> -	__u8	protocol;
> -	__sum16	check;
> -};
> -
> -struct ipv6_hoplimit_word {
> -	__be16	payload_len;
> -	__u8	nexthdr;
> -	__u8	hop_limit;
> -};
> -
>  static bool is_action_keys_supported(const struct flow_action_entry *act)
>  {
> -	u32 mask, offset;
> -	u8 htype;
> +	u32 offset = act->mangle.offset;
> +	u8 htype = act->mangle.htype;
>
> -	htype = act->mangle.htype;
> -	offset = act->mangle.offset;
> -	mask = act->mangle.mask;
> -	/* For IPv4 & IPv6 header check 4 byte word,
> -	 * to determine that modified fields
> -	 * are NOT ttl & hop_limit only.
> -	 */
> -	if (htype == FLOW_ACT_MANGLE_HDR_TYPE_IP4) {
> -		struct ip_ttl_word *ttl_word =
> -			(struct ip_ttl_word *)&mask;
> -
> -		if (offset != offsetof(struct iphdr, ttl) ||
> -		    ttl_word->protocol ||
> -		    ttl_word->check) {
> -			return true;
> -		}
> -	} else if (htype == FLOW_ACT_MANGLE_HDR_TYPE_IP6) {
> -		struct ipv6_hoplimit_word *hoplimit_word =
> -			(struct ipv6_hoplimit_word *)&mask;
> -
> -		if (offset != offsetof(struct ipv6hdr, payload_len) ||
> -		    hoplimit_word->payload_len ||
> -		    hoplimit_word->nexthdr) {
> -			return true;
> -		}
> -	}
> -	return false;
> +	if ((htype == FLOW_ACT_MANGLE_HDR_TYPE_IP4 &&
> +	     offset == offsetof(struct iphdr, ttl)) ||
> +	    (htype == FLOW_ACT_MANGLE_HDR_TYPE_IP6 &&
> +	     offset == offsetof(struct ipv6hdr, hop_limit)))
> +		return false;
> +
> +	return true;
>  }

With this change is_action_keys_supported() incorrectly returns true for
non-IP{4|6} mangles. I guess naming of the functions doesn't help
because it should be something like is_action_iphdr_keys_supported()...

Anyway, this results following rule to be incorrectly rejected by
driver:

tc filter add dev ens1f0_0 protocol ip parent ffff: prio 3
flower dst_mac e4:1d:2d:fd:8b:02 skip_sw
action pedit ex munge eth src set 11:22:33:44:55:66 munge eth dst set
       aa:bb:cc:dd:ee:ff pipe
action csum ip pipe
action tunnel_key set id 98 src_ip 2.2.2.2 dst_ip 2.2.2.3 dst_port 1234
action mirred egress redirect dev vxlan1

The pedit action is rejected by conditional that follows the loop in
modify_header_match_supported() which calls is_action_keys_supported().
With this change modify_ip_header==true (even though the pedit only
modifies eth header), which causes failure because ip proto is not
supported:

Error: mlx5_core: can't offload re-write of non TCP/UDP.
ERROR: [ 3345.830338] can't offload re-write of ip proto 0

  reply	other threads:[~2019-09-04 12:48 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-09-03 16:45 [PATCH net-next,v2 0/4] flow_offload: update mangle action representation Pablo Neira Ayuso
2019-09-03 16:45 ` [PATCH net-next,v2 1/4] net: flow_offload: flip mangle action mask Pablo Neira Ayuso
2019-09-03 16:45 ` [PATCH net-next,v2 2/4] net: flow_offload: bitwise AND on mangle action value field Pablo Neira Ayuso
2019-09-03 16:45 ` [PATCH net-next,v2 3/4] net: flow_offload: mangle action at byte level Pablo Neira Ayuso
2019-09-04 12:48   ` Vlad Buslov [this message]
2019-09-03 16:45 ` [PATCH net-next,v2 4/4] netfilter: nft_payload: packet mangling offload support Pablo Neira Ayuso

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=vbfimq88bx5.fsf@mellanox.com \
    --to=vladbu@mellanox.com \
    --cc=davem@davemloft.net \
    --cc=jakub.kicinski@netronome.com \
    --cc=jiri@resnulli.us \
    --cc=netdev@vger.kernel.org \
    --cc=netfilter-devel@vger.kernel.org \
    --cc=pablo@netfilter.org \
    --cc=saeedm@mellanox.com \
    --cc=vishal@chelsio.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).