From: Toshiaki Makita <toshiaki.makita1@gmail.com>
To: Hanlin Shi <hanlins@vmware.com>,
	"netdev@vger.kernel.org" <netdev@vger.kernel.org>
Cc: Cheng-Chun William Tu <tuc@vmware.com>
Subject: Re: Veth pair swallow packets for XDP_TX operation
Date: Fri, 17 Jan 2020 15:00:53 +0900
Message-ID: <d2dffea2-13d9-72a2-a89c-354b6403da54@gmail.com>
In-Reply-To: <68645457-3A77-4AC2-A033-F09DB5AEE6F8@vmware.com>

Please avoid top-posting on the netdev mailing list.

On 2020/01/17 7:54, Hanlin Shi wrote:
> Hi Toshiaki,
> 
> Thanks for your advice, and now it's working as expected in my environment. However, I still have concerns about this issue. Is this dummy interface approach a short-term workaround?

This is a long-standing problem and should be fixed in some way, but that is not easy.

Your packets were dropped because the peer device had not prepared the
resources necessary to receive XDP frames. The resource allocation is triggered
by attaching a (possibly dummy) XDP program, which is unfortunately unintuitive.
Typically this kind of problem happens when other devices redirect frames to
some device with XDP_REDIRECT. If the redirect target device has not prepared
the necessary resources, redirected frames will be dropped. This is a common
issue with other XDP drivers, and the netdev community is looking for the right
solution.

For veth there may be one more option: attaching an XDP program could also
trigger allocation of the "peer" resources. But this means we need to allocate
resources on both ends even when only one of them attaches XDP. This is not
necessary when the program only does XDP_DROP or XDP_PASS, so I'm not sure this
is a good idea.

Anyway, with the current behavior the peer (i.e. container host) needs to
explicitly allow XDP_TX by attaching some program on the host side.
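
For reference, a minimal sketch of such a dummy program is below. It only
returns XDP_PASS, so attaching it on the host-side veth should do nothing
beyond making the driver allocate its XDP receive resources. The function name
xdp_dummy_pass and the local SEC() definition are my own choices for
illustration (libbpf's bpf_helpers.h normally provides SEC()); they are not
taken from this thread.

```c
/* Sketch of a "dummy" XDP program for the host-side veth peer.
 * It passes every frame to the normal network stack unchanged. */
#include <linux/bpf.h>

/* Local stand-in for libbpf's SEC() macro, so that no libbpf headers
 * are needed (assumption: a GCC- or clang-compatible compiler). */
#ifndef SEC
#define SEC(name) __attribute__((section(name), used))
#endif

SEC("xdp")
int xdp_dummy_pass(struct xdp_md *ctx)
{
	(void)ctx;		/* the frame is never inspected */
	return XDP_PASS;	/* hand the frame to the stack as usual */
}

/* License string placed in the "license" section of the object file. */
char _license[] SEC("license") = "GPL";
```

Assuming the file is saved as xdp_dummy.c (a hypothetical name), something like
"clang -O2 -target bpf -c xdp_dummy.c -o xdp_dummy.o" followed by
"ip link set dev <host-side veth> xdp obj xdp_dummy.o sec xdp" should attach it
in native mode; the device name is a placeholder for your setup.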

> The behavior of native XDP is different from generic XDP, which could cause confusion for developers.

Native XDP is generally hard to set up, which is one of the reasons why generic XDP was introduced.

> Also, I'm planning to load the XDP program in a container (specifically, a Kubernetes pod), and I'm not sure whether it's feasible for me to access the veth peer that is connected to the bridge (Linux bridge or OVS).

So the veth devices will be created by CNI plugins? Then basically your CNI plugin needs to
attach an XDP program on the host side if you want to allow XDP_TX in containers.

> 
> I wonder whether it would be OK to have a fix such that, if no XDP program is found on the veth peer, we fall back to a dummy XDP_PASS behavior, just like what you demonstrated? If needed I can help with the fix.

I proposed a similar workaround when I introduced veth native XDP, but it was rejected.
If we do not allocate additional resources on the peer, we need to use the legacy data path,
which does not have a bulk interface, and that lowers XDP_TX performance.
That would be a harder problem to fix than dropping...

Toshiaki Makita
