All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jakub Kicinski <kubakici@wp.pl>
To: Ido Schimmel <idosch@idosch.org>
Cc: Or Gerlitz <gerlitz.or@gmail.com>,
	David Miller <davem@davemloft.net>,
	Jamal Hadi Salim <jhs@mojatatu.com>,
	Jiri Pirko <jiri@mellanox.com>,
	John Fastabend <john.r.fastabend@intel.com>,
	Roi Dayan <roid@mellanox.com>,
	Linux Netdev List <netdev@vger.kernel.org>,
	Hadar Hen Zion <hadarh@mellanox.com>, Amir Vadai <amir@vadai.me>,
	Ido Schimmel <idosch@mellanox.com>
Subject: Re: [PATCH net-next V3 3/7] net/sched: Reflect HW offload status
Date: Thu, 16 Feb 2017 20:23:37 -0800	[thread overview]
Message-ID: <20170216202337.45ffb3d4@cakuba.netronome.com> (raw)
In-Reply-To: <20170216081744.GA8800@splinter.mtl.com>

On Thu, 16 Feb 2017 10:17:44 +0200, Ido Schimmel wrote:
> On Wed, Feb 15, 2017 at 10:28:21AM -0800, Jakub Kicinski wrote:
> > What worries me is that the moment we started offloading packet
> > modification we run at the risk of modifying packets twice.  This used
> > to be a problem only for eBPF but now mlx5 can also offload things like
> > ttl decrement.  For filters which have no skip_* flags set and get
> > offloaded if packet doesn't get redirected or modified significantly it
> > will match the filter both in HW and on the host and therefore have the
> > actions applied twice.  And it will get counted twice.  (It seems nobody
> > ever raised this so perhaps I'm mistaken in thinking that this can
> > happen?)  
> 
> FWIW, we already have that problem with bridge offload. There are some
> packets that you can easily forward in hardware, but still wants the
> software bridge to receive a copy. IGMP queries for example. These
> should be flooded to all ports in the bridge, so we do the forwarding in
> hardware, but send a copy to the bridge driver for it to mark the
> receiving port as an mrouter port.
> 
> To prevent the packet from being flooded twice, we set
> 'skb->offload_fwd_mark' inside the driver and have the bridge driver
> check it during its egress check. It's a bit more involved if you've
> several ASICs in the same bridge, but that's the gist. See commit
> 6bc506b4fb06 ("bridge: switchdev: Add forward mark support for stacked
> devices") for more details.

Thanks for mentioning 'skb->offload_fwd_mark'.  It took me a bit to wrap 
my head around it :)
 
> > Back to your patch set, I was hoping we will be able to use the new
> > IN_HW flag to skip filters in software even if they don't have skip_sw
> > set.  If we need to eject actions from HW based on external events, that
> > obviously complicates things.  Three trivial solutions to the problem
> > I could think of are:  
> 
> [...]
> 
> >  - use one of recently freed skb TC bits to mark packets which were
> >    supposed to be processed in HW by could not as needing software
> >    fallback (I think this could work for you without parsing the packet
> >    in the driver, you could replace the tunnel action with mark action
> >    and leave the matching rule in HW classifier/TCAM; for BPF I have a
> >    descriptor flag telling me if offloaded BPF completed successfully).  
> 
> This is similar to what I described above. The tricky part is correctly
> marking the packet. In mlxsw, for each received packet we get the trap
> ID in the DMA descriptor, so we can easily determine whether we should
> set 'skb->offload_fwd_mark' or not.

After trying to compare in bridging offload to TC I'm beginning to
think that skb bit may not actually work in TC case.  If I understand
correctly the bridge forwarding is all or nothing, while there may be
multiple TC rules and single packet may have had some of them applied
but not others*.  And because the TC rules have to be attached directly
to the egress port to get offloaded today we don't have the stacked
device problem your commit solves.  We could try to make drivers mark
packets with id of last executed filter... but that would take a lot of
skb space :S  Luckily AFAIK TC filters still can't be shared across
devices so perhaps making the driver update TC on which filters are in
hardware is not such a bad idea.

* which makes me think about the fact that our TC offloads today don't
  respect ordering of rules in TC.  Ugh.

  reply	other threads:[~2017-02-17  4:24 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-02-15  8:52 [PATCH net-next V3 0/7] net/sched: Reflect HW offload status in classifiers Or Gerlitz
2017-02-15  8:52 ` [PATCH net-next V3 1/7] net/sched: cls_flower: Properly handle classifier flags dumping Or Gerlitz
2017-02-15  8:52 ` [PATCH net-next V3 2/7] net/sched: cls_matchall: Dump the classifier flags Or Gerlitz
2017-02-15  8:52 ` [PATCH net-next V3 3/7] net/sched: Reflect HW offload status Or Gerlitz
2017-02-15 16:19   ` Or Gerlitz
2017-02-15 18:28     ` Jakub Kicinski
2017-02-15 21:56       ` Or Gerlitz
2017-02-15 23:07         ` Jakub Kicinski
2017-02-16  8:17       ` Ido Schimmel
2017-02-17  4:23         ` Jakub Kicinski [this message]
2017-02-15  8:52 ` [PATCH net-next V3 4/7] net/sched: cls_flower: " Or Gerlitz
2017-02-15  8:52 ` [PATCH net-next V3 5/7] net/sched: cls_matchall: Reflect HW offloading status Or Gerlitz
2017-02-15 18:15   ` David Miller
2017-02-15 21:40     ` Or Gerlitz
2017-02-15  8:52 ` [PATCH net-next V3 6/7] net/sched: cls_u32: Reflect HW offload status Or Gerlitz
2017-02-15 18:16   ` David Miller
2017-02-15  8:52 ` [PATCH net-next V3 7/7] net/sched: cls_bpf: " Or Gerlitz
2017-02-15 18:16   ` David Miller

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170216202337.45ffb3d4@cakuba.netronome.com \
    --to=kubakici@wp.pl \
    --cc=amir@vadai.me \
    --cc=davem@davemloft.net \
    --cc=gerlitz.or@gmail.com \
    --cc=hadarh@mellanox.com \
    --cc=idosch@idosch.org \
    --cc=idosch@mellanox.com \
    --cc=jhs@mojatatu.com \
    --cc=jiri@mellanox.com \
    --cc=john.r.fastabend@intel.com \
    --cc=netdev@vger.kernel.org \
    --cc=roid@mellanox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.