All of lore.kernel.org
 help / color / mirror / Atom feed
From: David Laight <David.Laight@ACULAB.COM>
To: 'Alexander Duyck' <alexander.duyck@gmail.com>,
	Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Jesper Dangaard Brouer <brouer@redhat.com>,
	Eric Dumazet <eric.dumazet@gmail.com>,
	Brenden Blanco <bblanco@plumgrid.com>,
	David Miller <davem@davemloft.net>,
	Netdev <netdev@vger.kernel.org>,
	Jamal Hadi Salim <jhs@mojatatu.com>,
	Saeed Mahameed <saeedm@dev.mellanox.co.il>,
	"Martin KaFai Lau" <kafai@fb.com>, Ari Saha <as754m@att.com>,
	Or Gerlitz <gerlitz.or@gmail.com>,
	john fastabend <john.fastabend@gmail.com>,
	"Hannes Frederic Sowa" <hannes@stressinduktion.org>,
	Thomas Graf <tgraf@suug.ch>, "Tom Herbert" <tom@herbertland.com>,
	Daniel Borkmann <daniel@iogearbox.net>,
	"Tariq Toukan" <ttoukan.linux@gmail.com>,
	Mel Gorman <mgorman@techsingularity.net>,
	linux-mm <linux-mm@kvack.org>
Subject: RE: order-0 vs order-N driver allocation. Was: [PATCH v10 07/12] net/mlx4_en: add page recycle to prepare rx ring for tx support
Date: Fri, 5 Aug 2016 15:33:27 +0000	[thread overview]
Message-ID: <063D6719AE5E284EB5DD2968C1650D6D5F50B52E@AcuExch.aculab.com> (raw)
In-Reply-To: <CAKgT0Uc0=10xhcJJ+55rBv=YNPgPLmHb8x82CKbj+N895JQY5Q@mail.gmail.com>

From: Alexander Duyck
> Sent: 05 August 2016 16:15
...
> >
> > interesting idea. Like dma_map 1GB region and then allocate
> > pages from it only? but the rest of the kernel won't be able
> > to use them? so only some smaller region then? or it will be
> > a boot time flag to reserve this pseudo-huge page?
> 
> Yeah, something like that.  If we were already talking about
> allocating a pool of pages it might make sense to just setup something
> like this where you could reserve a 1GB region for a single 10G device
> for instance.  Then it would make the whole thing much easier to deal
> with since you would have a block of memory that should perform very
> well in terms of DMA accesses.

ISTM that the main kernel allocator ought to be keeping a cache
of pages that are mapped into the various IOMMU.
This might be a per-driver cache, but could be much wider.

Then if some code wants such a page it can be allocated one that is
already mapped.
Under memory pressure the pages could then be reused for other purposes.

...
> In the Intel drivers for instance if the frame
> size is less than 256 bytes we just copy the whole thing out since it
> is cheaper to just extend the header copy rather than taking the extra
> hit for get_page/put_page.

How fast is 'rep movsb' (on cached addresses) on recent x86 cpu?
It might actually be worth unconditionally copying the entire frame
on those cpus.

A long time ago we found the breakeven point for the copy to be about
1kb on sparc mbus/sbus systems - and that might not have been aligning
the copy.

	David


WARNING: multiple messages have this Message-ID (diff)
From: David Laight <David.Laight@ACULAB.COM>
To: 'Alexander Duyck' <alexander.duyck@gmail.com>,
	Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Jesper Dangaard Brouer <brouer@redhat.com>,
	Eric Dumazet <eric.dumazet@gmail.com>,
	Brenden Blanco <bblanco@plumgrid.com>,
	David Miller <davem@davemloft.net>,
	Netdev <netdev@vger.kernel.org>,
	Jamal Hadi Salim <jhs@mojatatu.com>,
	Saeed Mahameed <saeedm@dev.mellanox.co.il>,
	Martin KaFai Lau <kafai@fb.com>, Ari Saha <as754m@att.com>,
	Or Gerlitz <gerlitz.or@gmail.com>,
	john fastabend <john.fastabend@gmail.com>,
	Hannes Frederic Sowa <hannes@stressinduktion.org>,
	Thomas Graf <tgraf@suug.ch>, Tom Herbert <tom@herbertland.com>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Tariq Toukan <ttoukan.linux@gmail.com>,
	Mel Gorman <mgorman@techsingularity.net>,
	linux-mm <linux-mm@kvack.org>
Subject: RE: order-0 vs order-N driver allocation. Was: [PATCH v10 07/12] net/mlx4_en: add page recycle to prepare rx ring for tx support
Date: Fri, 5 Aug 2016 15:33:27 +0000	[thread overview]
Message-ID: <063D6719AE5E284EB5DD2968C1650D6D5F50B52E@AcuExch.aculab.com> (raw)
In-Reply-To: <CAKgT0Uc0=10xhcJJ+55rBv=YNPgPLmHb8x82CKbj+N895JQY5Q@mail.gmail.com>

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset="utf-8", Size: 1704 bytes --]

From: Alexander Duyck
> Sent: 05 August 2016 16:15
...
> >
> > interesting idea. Like dma_map 1GB region and then allocate
> > pages from it only? but the rest of the kernel won't be able
> > to use them? so only some smaller region then? or it will be
> > a boot time flag to reserve this pseudo-huge page?
> 
> Yeah, something like that.  If we were already talking about
> allocating a pool of pages it might make sense to just setup something
> like this where you could reserve a 1GB region for a single 10G device
> for instance.  Then it would make the whole thing much easier to deal
> with since you would have a block of memory that should perform very
> well in terms of DMA accesses.

ISTM that the main kernel allocator ought to be keeping a cache
of pages that are mapped into the various IOMMU.
This might be a per-driver cache, but could be much wider.

Then if some code wants such a page it can be allocated one that is
already mapped.
Under memory pressure the pages could then be reused for other purposes.

...
> In the Intel drivers for instance if the frame
> size is less than 256 bytes we just copy the whole thing out since it
> is cheaper to just extend the header copy rather than taking the extra
> hit for get_page/put_page.

How fast is 'rep movsb' (on cached addresses) on recent x86 cpu?
It might actually be worth unconditionally copying the entire frame
on those cpus.

A long time ago we found the breakeven point for the copy to be about
1kb on sparc mbus/sbus systems - and that might not have been aligning
the copy.

	David

N‹§²æìr¸›zǧu©ž²Æ {\b­†éì¹»\x1c®&Þ–)îÆi¢žØ^n‡r¶‰šŽŠÝ¢j$½§$¢¸\x05¢¹¨­è§~Š'.)îÄÃ,yèm¶ŸÿÃ\f%Š{±šj+ƒðèž×¦j)Z†·Ÿ

  reply	other threads:[~2016-08-05 15:36 UTC|newest]

Thread overview: 59+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-07-19 19:16 [PATCH v10 00/12] Add driver bpf hook for early packet drop and forwarding Brenden Blanco
2016-07-19 19:16 ` [PATCH v10 01/12] bpf: add bpf_prog_add api for bulk prog refcnt Brenden Blanco
2016-07-19 21:46   ` Alexei Starovoitov
2016-07-19 19:16 ` [PATCH v10 02/12] bpf: add XDP prog type for early driver filter Brenden Blanco
2016-07-19 21:33   ` Alexei Starovoitov
2016-07-19 19:16 ` [PATCH v10 03/12] net: add ndo to setup/query xdp prog in adapter rx Brenden Blanco
2016-07-19 19:16 ` [PATCH v10 04/12] rtnl: add option for setting link xdp prog Brenden Blanco
2016-07-20  8:38   ` Daniel Borkmann
2016-07-20 17:35     ` Brenden Blanco
2016-07-19 19:16 ` [PATCH v10 05/12] net/mlx4_en: add support for fast rx drop bpf program Brenden Blanco
2016-07-19 21:41   ` Alexei Starovoitov
2016-07-20  9:07   ` Daniel Borkmann
2016-07-20 17:33     ` Brenden Blanco
2016-07-24 11:56   ` Jesper Dangaard Brouer
2016-07-24 16:57   ` Tom Herbert
2016-07-24 20:34     ` Daniel Borkmann
2016-07-19 19:16 ` [PATCH v10 06/12] Add sample for adding simple drop program to link Brenden Blanco
2016-07-19 21:44   ` Alexei Starovoitov
2016-07-19 19:16 ` [PATCH v10 07/12] net/mlx4_en: add page recycle to prepare rx ring for tx support Brenden Blanco
2016-07-19 21:49   ` Alexei Starovoitov
2016-07-25  7:35   ` Eric Dumazet
2016-08-03 17:45     ` order-0 vs order-N driver allocation. Was: " Alexei Starovoitov
2016-08-04 16:19       ` Jesper Dangaard Brouer
2016-08-05  0:30         ` Alexander Duyck
2016-08-05  3:55           ` Alexei Starovoitov
2016-08-05 15:15             ` Alexander Duyck
2016-08-05 15:33               ` David Laight [this message]
2016-08-05 15:33                 ` David Laight
2016-08-05 16:00                 ` Alexander Duyck
2016-08-05 16:00                   ` Alexander Duyck
2016-08-05  7:15         ` Eric Dumazet
2016-08-05  7:15           ` Eric Dumazet
2016-08-08  2:15           ` Alexei Starovoitov
2016-08-08  2:15             ` Alexei Starovoitov
2016-08-08  8:01             ` Jesper Dangaard Brouer
2016-08-08 18:34               ` Alexei Starovoitov
2016-08-09 12:14                 ` Jesper Dangaard Brouer
2016-07-19 19:16 ` [PATCH v10 08/12] bpf: add XDP_TX xdp_action for direct forwarding Brenden Blanco
2016-07-19 21:53   ` Alexei Starovoitov
2016-07-19 19:16 ` [PATCH v10 09/12] net/mlx4_en: break out tx_desc write into separate function Brenden Blanco
2016-07-19 19:16 ` [PATCH v10 10/12] net/mlx4_en: add xdp forwarding and data write support Brenden Blanco
2016-07-19 19:16 ` [PATCH v10 11/12] bpf: enable direct packet data write for xdp progs Brenden Blanco
2016-07-19 21:59   ` Alexei Starovoitov
2016-07-19 19:16 ` [PATCH v10 12/12] bpf: add sample for xdp forwarding and rewrite Brenden Blanco
2016-07-19 22:05   ` Alexei Starovoitov
2016-07-20 17:38     ` Brenden Blanco
2016-07-27 18:25     ` Jesper Dangaard Brouer
2016-08-03 17:01   ` Tom Herbert
2016-08-03 17:11     ` Alexei Starovoitov
2016-08-03 17:29       ` Tom Herbert
2016-08-03 18:29         ` David Miller
2016-08-03 18:29         ` Brenden Blanco
2016-08-03 18:31           ` David Miller
2016-08-03 19:06           ` Tom Herbert
2016-08-03 22:36             ` Alexei Starovoitov
2016-08-03 23:18               ` Daniel Borkmann
2016-07-20  5:09 ` [PATCH v10 00/12] Add driver bpf hook for early packet drop and forwarding David Miller
     [not found]   ` <6a09ce5d-f902-a576-e44e-8e1e111ae26b@gmail.com>
2016-07-20 14:08     ` Brenden Blanco
2016-07-20 19:14     ` David Miller

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=063D6719AE5E284EB5DD2968C1650D6D5F50B52E@AcuExch.aculab.com \
    --to=david.laight@aculab.com \
    --cc=alexander.duyck@gmail.com \
    --cc=alexei.starovoitov@gmail.com \
    --cc=as754m@att.com \
    --cc=bblanco@plumgrid.com \
    --cc=brouer@redhat.com \
    --cc=daniel@iogearbox.net \
    --cc=davem@davemloft.net \
    --cc=eric.dumazet@gmail.com \
    --cc=gerlitz.or@gmail.com \
    --cc=hannes@stressinduktion.org \
    --cc=jhs@mojatatu.com \
    --cc=john.fastabend@gmail.com \
    --cc=kafai@fb.com \
    --cc=linux-mm@kvack.org \
    --cc=mgorman@techsingularity.net \
    --cc=netdev@vger.kernel.org \
    --cc=saeedm@dev.mellanox.co.il \
    --cc=tgraf@suug.ch \
    --cc=tom@herbertland.com \
    --cc=ttoukan.linux@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.