All of lore.kernel.org
 help / color / mirror / Atom feed
From: Willy Tarreau <w@1wt.eu>
To: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Cc: netdev@vger.kernel.org,
	Gregory CLEMENT <gregory.clement@free-electrons.com>,
	Eric Dumazet <edumazet@google.com>
Subject: Re: [PATCH] net: convert mvneta to build_skb()
Date: Fri, 5 Jul 2013 09:43:30 +0200	[thread overview]
Message-ID: <20130705074330.GB25188@1wt.eu> (raw)
In-Reply-To: <20130705091752.675874b5@skate>

Hi Thomas,

On Fri, Jul 05, 2013 at 09:17:52AM +0200, Thomas Petazzoni wrote:
> > --- a/drivers/net/ethernet/marvell/mvneta.c
> > +++ b/drivers/net/ethernet/marvell/mvneta.c
> > @@ -225,6 +225,7 @@ struct mvneta_stats {
> >  
> >  struct mvneta_port {
> >  	int pkt_size;
> > +	unsigned int frag_size;
> >  	void __iomem *base;
> >  	struct mvneta_rx_queue *rxqs;
> >  	struct mvneta_tx_queue *txqs;
> 
> Thanks Willy. Sorry for asking such a stupid question, but I'm not very
> familiar with how this mechanism works. Can you explain why a single
> 'frag_size' field per port is sufficient? My concern is that this
> frag_size seems to be a per-packet information, but we have potentially
> multiple packets being received, and multiple RX queues. Is one single
> 'frag_size' per network interface sufficient?

I had the exact same question when Eric sent me an experimental patch to
do this on mv64xxx_eth a few months ago. Then I checked how the frag_size
is computed. As you can see below, it does not depend on a per-packet size
but on a per-port size which is in fact the MTU ("pkt_size" is the misleading
part here) :

      skb_size = SKB_DATA_ALIGN(pp->pkt_size + MVNETA_MH_SIZE + NET_SKB_PAD) +
                 SKB_DATA_ALIGN(sizeof(struct skb_shared_info));

So if skb_size depends solely on pp (struct mvneta_port), then it makes
sense to have frag_size stored at the same place.

In practice we don't really need to store the frag_size in the struct, we
just need to know if the data were allocated using netdev_alloc_frag()
or kmalloc() to know how to free them. So a single bit would be enough,
and I thought about just doing a +1 on the pointer when we need to free
using kmalloc(). But that would add unneeded extra work in the fast path
to fix the pointers. And since we need to pass frag_size to build_skb()
it was a reasonable solution in my opinion.

> For example, in mvneta_rx_refill(), you store the skb_size in
> pp->frag_size, and then you later re-use it in mvneta_rxq_drop_pkts.
> What guarantees you that mvneta_rx_refill() hasn't be called in the
> mean time for a different packet, in a different rxq, for the same
> network interface, and the value of pp->frag_size has been overridden?

It's not a problem since the refill() applies pp->pkt_size which doesn't
change between calls. It's only changed in mvneta_change_mtu() which
first stops the device. So I think it's safe.

> Again, sorry for this stupid question, this is maybe just some basic
> networking knowledge that I'm missing here, but I thought it would be
> good to ask anyway.

No it's neither stupid nor basic. It's a bit tricky to understand first
but it looks really nice once you get it. I think I will study the ability
to use +/-1 on the pointer and see if we can get rid of frag_size (eg: tg3
doesn't store it).

Hoping this clarifies some points,
Willy

  reply	other threads:[~2013-07-05  7:43 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-07-04 17:35 [PATCH] net: convert mvneta to build_skb() Willy Tarreau
2013-07-04 21:31 ` David Miller
2013-07-04 22:12   ` Willy Tarreau
2013-07-05  7:17 ` Thomas Petazzoni
2013-07-05  7:43   ` Willy Tarreau [this message]
2013-07-05  7:50     ` Thomas Petazzoni
2013-07-05  8:09       ` Willy Tarreau
2013-07-15 14:34 ` Thomas Petazzoni
2013-07-15 15:12   ` Willy Tarreau
2013-07-15 15:23     ` Thomas Petazzoni
2013-07-15 15:30       ` Willy Tarreau
2013-07-15 15:35         ` Thomas Petazzoni
2013-07-15 15:52           ` Florian Fainelli
2013-07-15 17:01             ` Willy Tarreau
2013-07-15 19:44             ` Thomas Petazzoni
2013-07-15 23:02               ` Florian Fainelli

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20130705074330.GB25188@1wt.eu \
    --to=w@1wt.eu \
    --cc=edumazet@google.com \
    --cc=gregory.clement@free-electrons.com \
    --cc=netdev@vger.kernel.org \
    --cc=thomas.petazzoni@free-electrons.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.