From: Kevin Hao <haokexin@gmail.com>
To: Alexander Duyck <alexander.duyck@gmail.com>
Cc: "David S . Miller" <davem@davemloft.net>,
	Jakub Kicinski <kuba@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Netdev <netdev@vger.kernel.org>, linux-mm <linux-mm@kvack.org>,
	Vlastimil Babka <vbabka@suse.cz>,
	Eric Dumazet <edumazet@google.com>
Subject: Re: [PATCH net-next v2 1/4] mm: page_frag: Introduce page_frag_alloc_align()
Date: Thu, 4 Feb 2021 14:40:52 +0800
Message-ID: <20210204064052.GA76441@pek-khao-d2.corp.ad.wrs.com>
In-Reply-To: <CAKgT0Uf2BJ-EHF+Cp+Jp4121xH3ei_L9ZCE1TFVPJVp4Ru9O0w@mail.gmail.com>

On Tue, Feb 02, 2021 at 08:19:54AM -0800, Alexander Duyck wrote:
> On Sat, Jan 30, 2021 at 11:54 PM Kevin Hao <haokexin@gmail.com> wrote:
> >
> > The current implementation of page_frag_alloc() makes no alignment
> > guarantee for the returned buffer address. Some hardware, however,
> > requires DMA buffers to be aligned, so drivers that use
> > page_frag_alloc() buffers for DMA on such hardware have to resort to
> > a workaround like this:
> >     buf = page_frag_alloc(really_needed_size + align);
> >     buf = PTR_ALIGN(buf, align);
> >
> > This code is ugly and wastes a lot of memory when the buffers are
> > used for TX/RX in a network driver. So introduce
> > page_frag_alloc_align() to make sure that an aligned buffer address is
> > returned.
> >
> > Signed-off-by: Kevin Hao <haokexin@gmail.com>
> > Acked-by: Vlastimil Babka <vbabka@suse.cz>
> > ---
> > v2:
> >   - Inline page_frag_alloc()
> >   - Adopt Vlastimil's suggestion and add his Acked-by
> >
> >  include/linux/gfp.h | 12 ++++++++++--
> >  mm/page_alloc.c     |  8 +++++---
> >  2 files changed, 15 insertions(+), 5 deletions(-)
> >
> > diff --git a/include/linux/gfp.h b/include/linux/gfp.h
> > index 6e479e9c48ce..39f4b3070d09 100644
> > --- a/include/linux/gfp.h
> > +++ b/include/linux/gfp.h
> > @@ -583,8 +583,16 @@ extern void free_pages(unsigned long addr, unsigned int order);
> >
> >  struct page_frag_cache;
> >  extern void __page_frag_cache_drain(struct page *page, unsigned int count);
> > -extern void *page_frag_alloc(struct page_frag_cache *nc,
> > -                            unsigned int fragsz, gfp_t gfp_mask);
> > +extern void *page_frag_alloc_align(struct page_frag_cache *nc,
> > +                                  unsigned int fragsz, gfp_t gfp_mask,
> > +                                  int align);
> > +
> > +static inline void *page_frag_alloc(struct page_frag_cache *nc,
> > +                            unsigned int fragsz, gfp_t gfp_mask)
> > +{
> > +       return page_frag_alloc_align(nc, fragsz, gfp_mask, 0);
> > +}
> > +
> >  extern void page_frag_free(void *addr);
> >
> >  #define __free_page(page) __free_pages((page), 0)
> > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > index 519a60d5b6f7..4667e7b6993b 100644
> > --- a/mm/page_alloc.c
> > +++ b/mm/page_alloc.c
> > @@ -5137,8 +5137,8 @@ void __page_frag_cache_drain(struct page *page, unsigned int count)
> >  }
> >  EXPORT_SYMBOL(__page_frag_cache_drain);
> >
> > -void *page_frag_alloc(struct page_frag_cache *nc,
> > -                     unsigned int fragsz, gfp_t gfp_mask)
> > +void *page_frag_alloc_align(struct page_frag_cache *nc,
> > +                     unsigned int fragsz, gfp_t gfp_mask, int align)
> 
> I would make "align" unsigned since really we are using it as a mask.
> Actually passing it as a mask might be even better. More on that
> below.
> 
> >  {
> >         unsigned int size = PAGE_SIZE;
> >         struct page *page;
> > @@ -5190,11 +5190,13 @@ void *page_frag_alloc(struct page_frag_cache *nc,
> >         }
> >
> >         nc->pagecnt_bias--;
> > +       if (align)
> > +               offset = ALIGN_DOWN(offset, align);
> >         nc->offset = offset;
> >
> >         return nc->va + offset;
> >  }
> > -EXPORT_SYMBOL(page_frag_alloc);
> > +EXPORT_SYMBOL(page_frag_alloc_align);
> >
> >  /*
> >   * Frees a page fragment allocated out of either a compound or order 0 page.
> 
> Rather than using the conditional branch it might be better to just do
> "offset &= align_mask". Then you would be adding at most 1 instruction
> which can likely occur in parallel with the other work that is going
> on versus the conditional branch which requires a test, jump, and then
> the 3 alignment instructions to do the subtraction, inversion, and
> AND.

On arm64, the conditional version:

	if (align)
		offset = ALIGN_DOWN(offset, align);

compiles to:

	4b1503e2        neg     w2, w21
	710002bf        cmp     w21, #0x0
	0a020082        and     w2, w4, w2
	1a841044        csel    w4, w2, w4, ne  // ne = any

while the mask version:

	offset &= align_mask

compiles to:

	0a0402a4        and     w4, w21, w4

So yes, using the align mask saves 3 instructions.
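
For reference, a minimal userspace sketch of why the two forms compute the same offset. It assumes "align" is either 0 or a power of two. The helper names (align_offset_branch, align_offset_mask, align_to_mask) are made up for illustration and are not part of the patch.

```c
/* Illustrative helpers only; none of these names exist in the kernel. */

/* Branch form: round offset down to a multiple of align, if align != 0.
 * For a power-of-two align, ALIGN_DOWN(offset, align) reduces to
 * offset & -align, which matches the neg + and pair in the disassembly. */
static unsigned int align_offset_branch(unsigned int offset, unsigned int align)
{
	if (align)
		offset &= -align;
	return offset;
}

/* Mask form: a single AND, no test and no conditional select. */
static unsigned int align_offset_mask(unsigned int offset, unsigned int align_mask)
{
	return offset & align_mask;
}

/* Convert an alignment into the equivalent mask:
 * ~0 leaves the offset untouched (the unaligned case),
 * -align clears the low bits for a power-of-two align. */
static unsigned int align_to_mask(unsigned int align)
{
	return align ? -align : ~0u;
}
```

With offset = 4000 and align = 64, both forms give 3968. With align = 0 (mask ~0), both leave the offset at 4000.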

> 
> However, it would ripple through the other patches, as you would also
> need to update your other patches to assume ~0 in the unaligned case.
> With your masked cases, though, you could just use the negative
> alignment value to generate your mask, which would likely be taken care
> of by the compiler.

Will do.
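
A rough sketch of how the callers could look once the allocator takes a mask. This is a toy userspace model under stated assumptions, not the kernel implementation: struct toy_frag_cache, toy_frag_alloc_mask() and the two wrappers are hypothetical names, the cache is a fixed 4096-byte buffer, and "align" must be a power of two.

```c
#include <stddef.h>

#define TOY_CACHE_SIZE 4096u

/* Toy stand-in for a page fragment cache: fragments are carved from the
 * top of the buffer downwards, as the offset handling in the patch does. */
struct toy_frag_cache {
	_Alignas(TOY_CACHE_SIZE) unsigned char buf[TOY_CACHE_SIZE];
	unsigned int offset;	/* start of the most recent fragment */
};

static void toy_cache_init(struct toy_frag_cache *c)
{
	c->offset = TOY_CACHE_SIZE;
}

/* Core allocator: the alignment arrives as a mask, so no branch is needed. */
static void *toy_frag_alloc_mask(struct toy_frag_cache *c,
				 unsigned int fragsz, unsigned int align_mask)
{
	unsigned int offset;

	if (fragsz > c->offset)
		return NULL;	/* toy model: no refill, just fail */

	offset = c->offset - fragsz;
	offset &= align_mask;	/* rounding down can only move the start lower */
	c->offset = offset;

	return c->buf + offset;
}

/* Unaligned case: pass ~0 so the AND is a no-op. */
static void *toy_frag_alloc(struct toy_frag_cache *c, unsigned int fragsz)
{
	return toy_frag_alloc_mask(c, fragsz, ~0u);
}

/* Aligned case: the negated alignment is the mask (power-of-two align). */
static void *toy_frag_alloc_align(struct toy_frag_cache *c,
				  unsigned int fragsz, unsigned int align)
{
	return toy_frag_alloc_mask(c, fragsz, -align);
}
```

Because the buffer itself is aligned to its size, an aligned offset yields an aligned address, much like a page-aligned fragment page would.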

Thanks,
Kevin

