netdev.vger.kernel.org archive mirror
From: Eric Dumazet <eric.dumazet@gmail.com>
To: Jesper Dangaard Brouer <brouer@redhat.com>
Cc: lsf@lists.linux-foundation.org, linux-mm <linux-mm@kvack.org>,
	James Bottomley <James.Bottomley@HansenPartnership.com>,
	"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
	Tom Herbert <tom@herbertland.com>,
	Alexei Starovoitov <alexei.starovoitov@gmail.com>,
	Brenden Blanco <bblanco@plumgrid.com>,
	lsf-pc@lists.linux-foundation.org
Subject: Re: [LSF/MM TOPIC] Generic page-pool recycle facility?
Date: Thu, 07 Apr 2016 08:18:29 -0700	[thread overview]
Message-ID: <1460042309.6473.414.camel@edumazet-glaptop3.roam.corp.google.com> (raw)
In-Reply-To: <20160407161715.52635cac@redhat.com>

On Thu, 2016-04-07 at 16:17 +0200, Jesper Dangaard Brouer wrote:
> (Topic proposal for MM-summit)
> 
> Network Interface Cards (NIC) drivers, and increasing speeds stress
> the page-allocator (and DMA APIs).  A number of driver specific
> open-coded approaches exists that work-around these bottlenecks in the
> page allocator and DMA APIs. E.g. open-coded recycle mechanisms, and
> allocating larger pages and handing-out page "fragments".
> 
> I'm proposing a generic page-pool recycle facility, that can cover the
> driver use-cases, increase performance and open up for zero-copy RX.
> 
> 
> The basic performance problem is that pages (containing packets at RX)
> are cycled through the page allocator (freed at TX DMA completion
> time).  While a system in a steady state, could avoid calling the page
> allocator, when having a pool of pages equal to the size of the RX
> ring plus the number of outstanding frames in the TX ring (waiting for
> DMA completion).


We certainly used this at Google for quite a while.

The thing is: in steady state, the number of pages sitting in TX queues
is lower than the number of pages that were allocated for RX queues.

The page allocator is hardly hit once you have big enough RX ring
buffers (nothing fancy, simply the default number of slots).

The 'hard coded' part is actually quite small:

if (page_count(page) != 1) {
	/* We are not the exclusive owner: release our reference and
	 * allocate a replacement, preferring __GFP_COLD pages.
	 * (Error handling here depends on the driver.)
	 */
	put_page(page);
	page = __dev_alloc_page(GFP_ATOMIC | __GFP_COLD);
	if (unlikely(!page))
		return NULL;
}
page_ref_inc(page);

The problem with a 'pool' is that it matches a router workload, not a
host one.

With the existing code, new pages are automatically allocated on demand
if, say, previous pages are still held by skbs sitting in socket receive
queues and consumers are slow to react to the presence of this data.

But in most cases (steady state), the reference on the page is released
by the application reading the data before the driver cycles through the
RX ring buffer, so the driver only increments the page count.

I also played with grouping pages into the same 2MB huge page, but got
mixed results.


Thread overview: 35+ messages
     [not found] <1460034425.20949.7.camel@HansenPartnership.com>
2016-04-07 14:17 ` [LSF/MM TOPIC] Generic page-pool recycle facility? Jesper Dangaard Brouer
2016-04-07 14:38   ` [Lsf-pc] " Christoph Hellwig
2016-04-07 15:11     ` [Lsf] " Bart Van Assche
2016-04-10 18:45       ` Sagi Grimberg
2016-04-11 21:41         ` Jesper Dangaard Brouer
2016-04-11 22:02           ` Alexander Duyck
2016-04-12  6:28             ` Jesper Dangaard Brouer
2016-04-12 15:37               ` Alexander Duyck
2016-04-11 22:21           ` Alexei Starovoitov
2016-04-12  6:16             ` Jesper Dangaard Brouer
2016-04-12 17:20               ` Alexei Starovoitov
2016-04-07 15:48     ` Chuck Lever
2016-04-07 16:14       ` [Lsf-pc] [Lsf] " Rik van Riel
2016-04-07 19:43         ` [Lsf] [Lsf-pc] " Jesper Dangaard Brouer
2016-04-07 15:18   ` Eric Dumazet [this message]
2016-04-09  9:11     ` [Lsf] " Jesper Dangaard Brouer
2016-04-09 12:34       ` Eric Dumazet
2016-04-11 20:23         ` Jesper Dangaard Brouer
2016-04-11 21:27           ` Eric Dumazet
2016-04-07 19:48   ` Waskiewicz, PJ
2016-04-07 20:38     ` Jesper Dangaard Brouer
2016-04-08 16:12       ` Alexander Duyck
2016-04-11  8:58   ` [Lsf-pc] " Mel Gorman
2016-04-11 12:26     ` Jesper Dangaard Brouer
2016-04-11 13:08       ` Mel Gorman
2016-04-11 16:19         ` [Lsf] " Jesper Dangaard Brouer
2016-04-11 16:53           ` Eric Dumazet
2016-04-11 19:47             ` Jesper Dangaard Brouer
2016-04-11 21:14               ` Eric Dumazet
2016-04-11 18:07           ` Mel Gorman
2016-04-11 19:26             ` Jesper Dangaard Brouer
2016-04-11 16:20         ` Matthew Wilcox
2016-04-11 17:46           ` Thadeu Lima de Souza Cascardo
2016-04-11 18:37             ` Jesper Dangaard Brouer
2016-04-11 18:53               ` Bart Van Assche
