Netdev Archive on lore.kernel.org
 help / color / Atom feed
From: John Hurley <john.hurley@netronome.com>
To: David Ahern <dsahern@gmail.com>
Cc: Linux Netdev List <netdev@vger.kernel.org>,
	David Miller <davem@davemloft.net>,
	Jiri Pirko <jiri@mellanox.com>,
	Cong Wang <xiyou.wangcong@gmail.com>,
	Davide Caratti <dcaratti@redhat.com>,
	Simon Horman <simon.horman@netronome.com>,
	Jakub Kicinski <jakub.kicinski@netronome.com>,
	oss-drivers@netronome.com
Subject: Re: [PATCH net-next v3 1/2] net: sched: add mpls manipulation actions to TC
Date: Sat, 15 Jun 2019 00:22:52 +0100
Message-ID: <CAK+XE=nTcd+oB=UCe-gHMs7XNtS1w8GUWMwxgGQFtzp5CRz+LA@mail.gmail.com> (raw)
In-Reply-To: <24000ac2-a8b6-9908-d8a9-67a66f03b26d@gmail.com>

On Fri, Jun 14, 2019 at 6:22 PM David Ahern <dsahern@gmail.com> wrote:
>
> On 6/14/19 8:58 AM, John Hurley wrote:
> > Currently, TC offers the ability to match on the MPLS fields of a packet
> > through the use of the flow_dissector_key_mpls struct. However, as yet, TC
> > actions do not allow the modification or manipulation of such fields.
> >
> > Add a new module that registers TC action ops to allow manipulation of
> > MPLS. This includes the ability to push and pop headers as well as modify
> > the contents of new or existing headers. A further action to decrement the
> > TTL field of an MPLS header is also provided.
>
> you have this limited to a single mpls label. It would be more flexible
> to allow push/pop of N-labels (push probably being the most useful).
>

Hi David.
Multiple push and pop actions can be done by piping them together.
E.g. for a flower filter that pushes 2 labels to an IP packet you can do:

tc filter add dev eth0 protocol ip parent ffff: \
        flower \
        action mpls push protocol mpls_mc label 10 \
        action mpls push protocol mpls_mc label 20 \
        action mirred egress redirect dev eth1


> more below
>
> > diff --git a/include/uapi/linux/tc_act/tc_mpls.h b/include/uapi/linux/tc_act/tc_mpls.h
> > new file mode 100644
> > index 0000000..6e8907b
> > --- /dev/null
> > +++ b/include/uapi/linux/tc_act/tc_mpls.h
> > @@ -0,0 +1,32 @@
> > +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
> > +/* Copyright (C) 2019 Netronome Systems, Inc. */
> > +
> > +#ifndef __LINUX_TC_MPLS_H
> > +#define __LINUX_TC_MPLS_H
> > +
> > +#include <linux/pkt_cls.h>
> > +
> > +#define TCA_MPLS_ACT_POP     1
> > +#define TCA_MPLS_ACT_PUSH    2
> > +#define TCA_MPLS_ACT_MODIFY  3
> > +#define TCA_MPLS_ACT_DEC_TTL 4
> > +
> > +struct tc_mpls {
> > +     tc_gen;
> > +     int m_action;
> > +};
> > +
> > +enum {
> > +     TCA_MPLS_UNSPEC,
> > +     TCA_MPLS_TM,
> > +     TCA_MPLS_PARMS,
> > +     TCA_MPLS_PAD,
> > +     TCA_MPLS_PROTO,
> > +     TCA_MPLS_LABEL,
> > +     TCA_MPLS_TC,
> > +     TCA_MPLS_TTL,
> > +     __TCA_MPLS_MAX,
> > +};
> > +#define TCA_MPLS_MAX (__TCA_MPLS_MAX - 1)
> > +
> > +#endif
>
> Adding comments to those attributes for userspace writers would be very
> helpful. See what I did for the nexthop API -
> include/uapi/linux/nexthop.h. My goal there was to document that API
> enough that someone adding support to a routing suite (e.g., FRR) never
> had to ask a question about the API.
>

nice example.
I can add similar

>
> > diff --git a/net/sched/act_mpls.c b/net/sched/act_mpls.c
> > new file mode 100644
> > index 0000000..828a8d9
> > --- /dev/null
> > +++ b/net/sched/act_mpls.c
> ...
>
> > +     switch (p->tcfm_action) {
> > +     case TCA_MPLS_ACT_POP:
> > +             if (unlikely(!eth_p_mpls(skb->protocol)))
> > +                     goto out;
> > +
> > +             if (unlikely(skb_ensure_writable(skb, ETH_HLEN + MPLS_HLEN)))
> > +                     goto drop;
> > +
> > +             skb_postpull_rcsum(skb, mpls_hdr(skb), MPLS_HLEN);
> > +             memmove(skb->data + MPLS_HLEN, skb->data, ETH_HLEN);
> > +
> > +             __skb_pull(skb, MPLS_HLEN);
> > +             skb_reset_mac_header(skb);
> > +             skb_set_network_header(skb, ETH_HLEN);
> > +
> > +             tcf_mpls_set_eth_type(skb, p->tcfm_proto);
> > +             skb->protocol = p->tcfm_proto;
>
> This is pop of a single label. It may or may not be the bottom label, so
> it seems like this should handle both cases and may depend on the
> packet. If it is the bottom label, then letting the user specify next
> seems correct but it is not then the protocol needs to stay MPLS.
>

Yes, the user is expected to indicate the next protocol after the pop
even if another mpls label is next.
We're trying to cater for supporting mpls_uc and mpls_mc ethtypes.
So you could in theory pop the top mpls unicast header and set the
next to multicast.
We expect the user to know what the next header is so enforce that
they give that information.
Do you agree with this or should we add more checks around the BoS bit?

>
> > +             break;
> > +     case TCA_MPLS_ACT_PUSH:
> > +             if (unlikely(skb_cow_head(skb, MPLS_HLEN)))
> > +                     goto drop;
> > +
> > +             skb_push(skb, MPLS_HLEN);
> > +             memmove(skb->data, skb->data + MPLS_HLEN, ETH_HLEN);
> > +             skb_reset_mac_header(skb);
> > +             skb_set_network_header(skb, ETH_HLEN);
> > +
> > +             lse = mpls_hdr(skb);
> > +             lse->label_stack_entry = 0;
> > +             tcf_mpls_mod_lse(lse, p, !eth_p_mpls(skb->protocol));
> > +             skb_postpush_rcsum(skb, lse, MPLS_HLEN);
> > +
> > +             tcf_mpls_set_eth_type(skb, p->tcfm_proto);
> > +             skb->protocol = p->tcfm_proto;
>
> similarly here. If you are pushing one or more labels why allow the user
> to specify the protocol? It should be set to ETH_P_MPLS_UC.
>

Again, this caters for setting ETH_P_MPLS_MC.
For push, the setting of an ethtype is not enforced and UC is used as
the default.

> > +             break;
> > +     case TCA_MPLS_ACT_MODIFY:
> > +             if (unlikely(!eth_p_mpls(skb->protocol)))
> > +                     goto out;
> > +
> > +             if (unlikely(skb_ensure_writable(skb, ETH_HLEN + MPLS_HLEN)))
> > +                     goto drop;
> > +
> > +             lse = mpls_hdr(skb);
> > +             skb_postpull_rcsum(skb, lse, MPLS_HLEN);
> > +             tcf_mpls_mod_lse(lse, p, false);
> > +             skb_postpush_rcsum(skb, lse, MPLS_HLEN);
> > +             break;
> > +     case TCA_MPLS_ACT_DEC_TTL:
> > +             if (unlikely(!eth_p_mpls(skb->protocol)))
> > +                     goto out;
> > +
> > +             if (unlikely(skb_ensure_writable(skb, ETH_HLEN + MPLS_HLEN)))
> > +                     goto drop;
> > +
> > +             lse = mpls_hdr(skb);
> > +             temp_lse = be32_to_cpu(lse->label_stack_entry);
> > +             ttl = (temp_lse & MPLS_LS_TTL_MASK) >> MPLS_LS_TTL_SHIFT;
> > +             if (!--ttl)
> > +                     goto drop;
> > +
> > +             temp_lse &= ~MPLS_LS_TTL_MASK;
> > +             temp_lse |= ttl << MPLS_LS_TTL_SHIFT;
> > +             skb_postpull_rcsum(skb, lse, MPLS_HLEN);
> > +             lse->label_stack_entry = cpu_to_be32(temp_lse);
> > +             skb_postpush_rcsum(skb, lse, MPLS_HLEN);
> > +             break;
> > +     default:
> > +             WARN_ONCE(1, "Invalid MPLS action\n");
> > +     }
> > +
> > +out:
> > +     if (skb_at_tc_ingress(skb))
> > +             skb_pull_rcsum(skb, skb->mac_len);
> > +
> > +     return ret;
> > +
> > +drop:
> > +     qstats_drop_inc(this_cpu_ptr(m->common.cpu_qstats));
> > +     return TC_ACT_SHOT;
> > +}
> > +
> > +static int valid_label(const struct nlattr *attr,
> > +                    struct netlink_ext_ack *extack)
> > +{
> > +     const u32 *label = nla_data(attr);
> > +
> > +     if (!*label || *label & ~MPLS_LABEL_MASK) {
> > +             NL_SET_ERR_MSG_MOD(extack, "MPLS label out of range");
> > +             return -EINVAL;
> > +     }
> > +
> > +     return 0;
> > +}
> > +
> > +static const struct nla_policy mpls_policy[TCA_MPLS_MAX + 1] = {
> > +     [TCA_MPLS_PARMS]        = NLA_POLICY_EXACT_LEN(sizeof(struct tc_mpls)),
> > +     [TCA_MPLS_PROTO]        = { .type = NLA_U16 },
> > +     [TCA_MPLS_LABEL]        = NLA_POLICY_VALIDATE_FN(NLA_U32, valid_label),
> > +     [TCA_MPLS_TC]           = NLA_POLICY_RANGE(NLA_U8, 0, 7),
> > +     [TCA_MPLS_TTL]          = NLA_POLICY_MIN(NLA_U8, 1),
> > +};
>
> Since this is a new policy, set .strict_start_type in the
> TCA_MPLS_UNSPEC entry:
>    [TCA_MPLS_UNSPEC] = { .strict_start_type = TCA_MPLS_UNSPEC + 1 },
>

ack

>
> > +
> > +static int tcf_mpls_init(struct net *net, struct nlattr *nla,
> > +                      struct nlattr *est, struct tc_action **a,
> > +                      int ovr, int bind, bool rtnl_held,
> > +                      struct tcf_proto *tp, struct netlink_ext_ack *extack)
> > +{
> > +     struct tc_action_net *tn = net_generic(net, mpls_net_id);
> > +     struct nlattr *tb[TCA_MPLS_MAX + 1];
> > +     struct tcf_chain *goto_ch = NULL;
> > +     struct tcf_mpls_params *p;
> > +     struct tc_mpls *parm;
> > +     bool exists = false;
> > +     struct tcf_mpls *m;
> > +     int ret = 0, err;
> > +     u8 mpls_ttl = 0;
> > +
> > +     if (!nla) {
> > +             NL_SET_ERR_MSG_MOD(extack, "missing netlink attributes");
> > +             return -EINVAL;
> > +     }
> > +
> > +     err = nla_parse_nested(tb, TCA_MPLS_MAX, nla, mpls_policy, extack);
> > +     if (err < 0)
> > +             return err;
> > +
> > +     if (!tb[TCA_MPLS_PARMS]) {
> > +             NL_SET_ERR_MSG_MOD(extack, "no MPLS params");
> > +             return -EINVAL;
> > +     }
> > +     parm = nla_data(tb[TCA_MPLS_PARMS]);
> > +
> > +     /* Verify parameters against action type. */
> > +     switch (parm->m_action) {
> > +     case TCA_MPLS_ACT_POP:
> > +             if (!tb[TCA_MPLS_PROTO] ||
> > +                 !eth_proto_is_802_3(nla_get_be16(tb[TCA_MPLS_PROTO]))) {
> > +                     NL_SET_ERR_MSG_MOD(extack, "MPLS POP: invalid proto");
>
> Please spell out the messages:
>   Invalid protocol
>
> > +                     return -EINVAL;
> > +             }
> > +             if (tb[TCA_MPLS_LABEL] || tb[TCA_MPLS_TTL] || tb[TCA_MPLS_TC]) {
> > +                     NL_SET_ERR_MSG_MOD(extack, "MPLS POP: unsupported attrs");
>
> Be more specific with the error message:
>     LABEL, TTL and ??? attributes can not be used with pop action.
>
> > +                     return -EINVAL;
> > +             }
> > +             break;
> > +     case TCA_MPLS_ACT_DEC_TTL:
> > +             if (tb[TCA_MPLS_PROTO] || tb[TCA_MPLS_LABEL] ||
> > +                 tb[TCA_MPLS_TTL] || tb[TCA_MPLS_TC]) {
> > +                     NL_SET_ERR_MSG_MOD(extack, "MPLS DEC TTL: unsupported attrs");
>
> same here and other extack messages here

ack to the extack comments.
Let me sort.
Thanks for input!

>

  reply index

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-06-14 14:58 [PATCH net-next v3 0/2] Add MPLS " John Hurley
2019-06-14 14:58 ` [PATCH net-next v3 1/2] net: sched: add mpls manipulation " John Hurley
2019-06-14 17:22   ` David Ahern
2019-06-14 23:22     ` John Hurley [this message]
2019-06-14 23:44       ` David Ahern
2019-06-14 14:58 ` [PATCH net-next v3 2/2] selftests: tc-tests: actions: add MPLS tests John Hurley

Reply instructions:

You may reply publically to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAK+XE=nTcd+oB=UCe-gHMs7XNtS1w8GUWMwxgGQFtzp5CRz+LA@mail.gmail.com' \
    --to=john.hurley@netronome.com \
    --cc=davem@davemloft.net \
    --cc=dcaratti@redhat.com \
    --cc=dsahern@gmail.com \
    --cc=jakub.kicinski@netronome.com \
    --cc=jiri@mellanox.com \
    --cc=netdev@vger.kernel.org \
    --cc=oss-drivers@netronome.com \
    --cc=simon.horman@netronome.com \
    --cc=xiyou.wangcong@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Netdev Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/netdev/0 netdev/git/0.git
	git clone --mirror https://lore.kernel.org/netdev/1 netdev/git/1.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 netdev netdev/ https://lore.kernel.org/netdev \
		netdev@vger.kernel.org netdev@archiver.kernel.org
	public-inbox-index netdev


Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.netdev


AGPL code for this site: git clone https://public-inbox.org/ public-inbox