netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Joseph, Jithu" <jithu.joseph@intel.com>
To: "kuba@kernel.org" <kuba@kernel.org>,
	"Nguyen, Anthony L" <anthony.l.nguyen@intel.com>
Cc: "Karlsson, Magnus" <magnus.karlsson@intel.com>,
	"davem@davemloft.net" <davem@davemloft.net>,
	"Gomes, Vinicius" <vinicius.gomes@intel.com>,
	"Fijalkowski, Maciej" <maciej.fijalkowski@intel.com>,
	"sassmann@redhat.com" <sassmann@redhat.com>,
	"Lifshits, Vitaly" <vitaly.lifshits@intel.com>,
	"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
	"Neftin, Sasha" <sasha.neftin@intel.com>,
	"Desouza, Ederson" <ederson.desouza@intel.com>,
	"bjorn.topel@intel.com" <bjorn.topel@intel.com>,
	"dvorax.fuxbrumer@linux.intel.com"
	<dvorax.fuxbrumer@linux.intel.com>
Subject: Re: [PATCH net-next 8/9] igc: Enable RX via AF_XDP zero-copy
Date: Wed, 14 Apr 2021 23:14:04 +0000	[thread overview]
Message-ID: <fcd46fb09a08af36b7c34693f4e687d2c9ca2422.camel@intel.com> (raw)
In-Reply-To: <20210409173604.217406b6@kicinski-fedora-pc1c0hjn.dhcp.thefacebook.com>

Hi Jakub, 
 
Apologies for the delay, I am looking into this as the original
developer Andre is no-longer with Intel. I really appreciate your
review feedback.

(I removed Andre's and Vedang's email from the cc list as they are
bouncing and have added a couple of Intel folks) 

Pardon me if  I have not  understood your questions precisely or if
some of the replies are not concise (I am still understanding XDP flow
patterns.) 

I  see that lot of the design patterns followed by this patch series,
follow the approaches from other Intel drivers like (ice, ixgbe, i140e)

On Fri, 2021-04-09 at 17:36 -0700, Jakub Kicinski wrote:
> On Fri,  9 Apr 2021 09:43:50 -0700 Tony Nguyen wrote:
> > From: Andre Guedes <andre.guedes@intel.com>
> > 
> > Add support for receiving packets via AF_XDP zero-copy mechanism.
> > 
> > Add a new flag to 'enum igc_ring_flags_t' to indicate the ring has
> > AF_XDP zero-copy enabled so proper ring setup is carried out during
> > ring
> > configuration in igc_configure_rx_ring().
> > 
> > RX buffers can now be allocated via the shared pages mechanism
> > (default
> > behavior of the driver) or via xsk pool (when AF_XDP zero-copy is
> > enabled) so a union is added to the 'struct igc_rx_buffer' to cover
> > both
> > cases.
> > 
> > When AF_XDP zero-copy is enabled, rx buffers are allocated from the
> > xsk
> > pool using the new helper igc_alloc_rx_buffers_zc() which is the
> > counterpart of igc_alloc_rx_buffers().
> > 
> > Likewise other Intel drivers that support AF_XDP zero-copy, in igc
> > we
> > have a dedicated path for cleaning up rx irqs when zero-copy is
> > enabled.
> > This avoids adding too many checks within igc_clean_rx_irq(),
> > resulting
> > in a more readable and efficient code since this function is called
> > from
> > the hot-path of the driver.
> > +static struct sk_buff *igc_construct_skb_zc(struct igc_ring *ring,
> > +					    struct xdp_buff *xdp)
> > +{
> > +	unsigned int metasize = xdp->data - xdp->data_meta;
> > +	unsigned int datasize = xdp->data_end - xdp->data;
> > +	struct sk_buff *skb;
> > +
> > +	skb = __napi_alloc_skb(&ring->q_vector->napi,
> > +			       xdp->data_end - xdp->data_hard_start,
> > +			       GFP_ATOMIC | __GFP_NOWARN);
> > +	if (unlikely(!skb))
> > +		return NULL;
> > +
> > +	skb_reserve(skb, xdp->data - xdp->data_hard_start);
> > +	memcpy(__skb_put(skb, datasize), xdp->data, datasize);
> > +	if (metasize)
> > +		skb_metadata_set(skb, metasize);
> 
> But you haven't actually copied the matadata into the skb,
> the metadata is before xdp->data, right?

Today the igc driver doesn’t add any metadata (except for hw time
stamps explained later) . So for most part, xdp->data and xdp-
>data_meta point to the same address . That could be why in this
initial implementation we are not copying  the metadata into skb (as
the driver doesn’t add any).  

If the XDP program adds some metadata before xdp->data (and  xdp-
>data_meta reflects this), that is NOT copied into the SKB as you
mentioned .   Is the expectation that meta_data (if any added by the
bpf program) , should also be copied to the skb  in this XDP_PASS flow
? If so I can revise this patch to do that. 

If h/w time-stamp is added by the NIC, then metasize will be non zero
(as  xdp->data is advanced by the driver ) .  h/w ts  is still copied
into "skb_hwtstamps(skb)->hwtstamp" by  the caller of this function
igc_dispatch_skb_zc()  . Do you still want it to be copied into
__skb_put(skb, ) area too ? 

> 
> > +	return skb;
> > +}
> > +static int igc_xdp_enable_pool(struct igc_adapter *adapter,
> > +			       struct xsk_buff_pool *pool, u16
> > queue_id)
> > +{
> > +	struct net_device *ndev = adapter->netdev;
> > +	struct device *dev = &adapter->pdev->dev;
> > +	struct igc_ring *rx_ring;
> > +	struct napi_struct *napi;
> > +	bool needs_reset;
> > +	u32 frame_size;
> > +	int err;
> > +
> > +	if (queue_id >= adapter->num_rx_queues)
> > +		return -EINVAL;
> > +
> > +	frame_size = xsk_pool_get_rx_frame_size(pool);
> > +	if (frame_size < ETH_FRAME_LEN + VLAN_HLEN * 2) {
> > +		/* When XDP is enabled, the driver doesn't support
> > frames that
> > +		 * span over multiple buffers. To avoid that, we check
> > if xsk
> > +		 * frame size is big enough to fit the max ethernet
> > frame size
> > +		 * + vlan double tagging.
> > +		 */
> > +		return -EOPNOTSUPP;
> > +	}
> > +
> > +	err = xsk_pool_dma_map(pool, dev, IGC_RX_DMA_ATTR);
> > +	if (err) {
> > +		netdev_err(ndev, "Failed to map xsk pool\n");
> > +		return err;
> > +	}
> > +
> > +	needs_reset = netif_running(adapter->netdev) &&
> > igc_xdp_is_enabled(adapter);
> > +
> > +	rx_ring = adapter->rx_ring[queue_id];
> > +	napi = &rx_ring->q_vector->napi;
> > +
> > +	if (needs_reset) {
> > +		igc_disable_rx_ring(rx_ring);
> > +		napi_disable(napi);
> > +	}
> > +
> > +	set_bit(IGC_RING_FLAG_AF_XDP_ZC, &rx_ring->flags);
> > +
> > +	if (needs_reset) {
> > +		napi_enable(napi);
> > +		igc_enable_rx_ring(rx_ring);
> > +
> > +		err = igc_xsk_wakeup(ndev, queue_id, XDP_WAKEUP_RX);
> > +		if (err)
> > +			return err;
> 
> No need for an unwind path here?
> Does something call XDP_SETUP_XSK_POOL(NULL) on failure
> automagically?

I think we should add a xsk_pool_dma_unmap() in this failure path
?  Did I understand you correctly ?

> 
> > +	}
> > +
> > +	return 0;
> > +}

Thanks
Jithu

  reply	other threads:[~2021-04-14 23:14 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-04-09 16:43 [PATCH net-next 0/9][pull request] 1GbE Intel Wired LAN Driver Updates 2021-04-09 Tony Nguyen
2021-04-09 16:43 ` [PATCH net-next 1/9] igc: Move igc_xdp_is_enabled() Tony Nguyen
2021-04-09 16:43 ` [PATCH net-next 2/9] igc: Refactor __igc_xdp_run_prog() Tony Nguyen
2021-04-09 16:43 ` [PATCH net-next 3/9] igc: Refactor igc_clean_rx_ring() Tony Nguyen
2021-04-09 16:43 ` [PATCH net-next 4/9] igc: Refactor XDP rxq info registration Tony Nguyen
2021-04-09 16:43 ` [PATCH net-next 5/9] igc: Introduce TX/RX stats helpers Tony Nguyen
2021-04-09 16:43 ` [PATCH net-next 6/9] igc: Introduce igc_unmap_tx_buffer() helper Tony Nguyen
2021-04-09 16:43 ` [PATCH net-next 7/9] igc: Replace IGC_TX_FLAGS_XDP flag by an enum Tony Nguyen
2021-04-09 16:43 ` [PATCH net-next 8/9] igc: Enable RX via AF_XDP zero-copy Tony Nguyen
2021-04-10  0:36   ` Jakub Kicinski
2021-04-14 23:14     ` Joseph, Jithu [this message]
2021-04-14 23:25       ` Jakub Kicinski
2021-04-14 23:59         ` Joseph, Jithu
2021-04-15  0:31           ` Jakub Kicinski
2021-04-09 16:43 ` [PATCH net-next 9/9] igc: Enable TX " Tony Nguyen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=fcd46fb09a08af36b7c34693f4e687d2c9ca2422.camel@intel.com \
    --to=jithu.joseph@intel.com \
    --cc=anthony.l.nguyen@intel.com \
    --cc=bjorn.topel@intel.com \
    --cc=davem@davemloft.net \
    --cc=dvorax.fuxbrumer@linux.intel.com \
    --cc=ederson.desouza@intel.com \
    --cc=kuba@kernel.org \
    --cc=maciej.fijalkowski@intel.com \
    --cc=magnus.karlsson@intel.com \
    --cc=netdev@vger.kernel.org \
    --cc=sasha.neftin@intel.com \
    --cc=sassmann@redhat.com \
    --cc=vinicius.gomes@intel.com \
    --cc=vitaly.lifshits@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).