All of lore.kernel.org
 help / color / mirror / Atom feed
From: Catalin Marinas <catalin.marinas@arm.com>
To: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	Arnd Bergmann <arnd@arndb.de>, Will Deacon <will@kernel.org>,
	Marc Zyngier <maz@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Herbert Xu <herbert@gondor.apana.org.au>,
	Ard Biesheuvel <ardb@kernel.org>, Christoph Hellwig <hch@lst.de>,
	Isaac Manjarres <isaacmanjarres@google.com>,
	Saravana Kannan <saravanak@google.com>,
	linux-mm@kvack.org, linux-arm-kernel@lists.infradead.org
Subject: Re: [PATCH v2 2/2] treewide: Add the __GFP_PACKED flag to several non-DMA kmalloc() allocations
Date: Wed, 26 Oct 2022 18:09:19 +0100	[thread overview]
Message-ID: <Y1lpv5/hr+TXxAqd@arm.com> (raw)
In-Reply-To: <Y1kvGFAUjAx726oa@kroah.com>

On Wed, Oct 26, 2022 at 02:59:04PM +0200, Greg Kroah-Hartman wrote:
> On Wed, Oct 26, 2022 at 10:48:58AM +0100, Catalin Marinas wrote:
> > On Wed, Oct 26, 2022 at 08:50:07AM +0200, Greg Kroah-Hartman wrote:
> > > On Tue, Oct 25, 2022 at 09:52:47PM +0100, Catalin Marinas wrote:
> > > > --- a/lib/kasprintf.c
> > > > +++ b/lib/kasprintf.c
> > > > @@ -22,7 +22,7 @@ char *kvasprintf(gfp_t gfp, const char *fmt, va_list ap)
> > > >  	first = vsnprintf(NULL, 0, fmt, aq);
> > > >  	va_end(aq);
> > > >  
> > > > -	p = kmalloc_track_caller(first+1, gfp);
> > > > +	p = kmalloc_track_caller(first+1, gfp | __GFP_PACKED);
> > > 
> > > How do we know this is going to be small?
> > 
> > We don't need to know it's small. If it's over 96 bytes on arm64, it
> > goes in the kmalloc-128 cache or higher. It can even use the kmalloc-192
> > cache that's not aligned to ARCH_DMA_MINALIGN (128). That's why I'd
> > avoid GFP_TINY as this flag is not about size but rather alignment (e.g.
> > 192 may not be DMA safe but it's larger than 128).
> > 
> > That said, I should try to identify sizes > 128 and <= 192 and pass such
> > flag.
> 
> What if the flag is used for large sizes, what will happen?  In other
> words, why would you ever NOT want to use this?  DMA is a big issue, but
> then we should flip that around and explicitly mark the times we want
> DMA, not not-want DMA as "not want" is by far the most common, right?

Indeed, flipping these flags is the ideal solution. It's just tracking
them down and I'm not sure coccinelle on its own can handle it (maybe it
does). As an example of what needs changing:

----------------------8<-------------------------
diff --git a/drivers/infiniband/hw/irdma/osdep.h b/drivers/infiniband/hw/irdma/osdep.h
index fc1ba2a3e6fb..8ba94d563db3 100644
--- a/drivers/infiniband/hw/irdma/osdep.h
+++ b/drivers/infiniband/hw/irdma/osdep.h
@@ -16,7 +16,7 @@ struct irdma_dma_info {
 };

 struct irdma_dma_mem {
-	void *va;
+	void __dma *va;
 	dma_addr_t pa;
 	u32 size;
 } __packed;
diff --git a/drivers/infiniband/hw/irdma/puda.c b/drivers/infiniband/hw/irdma/puda.c
index 4ec9639f1bdb..ab15c5e812d0 100644
--- a/drivers/infiniband/hw/irdma/puda.c
+++ b/drivers/infiniband/hw/irdma/puda.c
@@ -143,13 +143,13 @@ static struct irdma_puda_buf *irdma_puda_alloc_buf(struct irdma_sc_dev *dev,
 	struct irdma_virt_mem buf_mem;

 	buf_mem.size = sizeof(struct irdma_puda_buf);
-	buf_mem.va = kzalloc(buf_mem.size, GFP_KERNEL);
+	buf_mem.va = dma_kzalloc(buf_mem.size, GFP_KERNEL);
 	if (!buf_mem.va)
 		return NULL;

 	buf = buf_mem.va;
 	buf->mem.size = len;
-	buf->mem.va = kzalloc(buf->mem.size, GFP_KERNEL);
+	buf->mem.va = dma_kzalloc(buf->mem.size, GFP_KERNEL);
 	if (!buf->mem.va)
 		goto free_virt;
 	buf->mem.pa = dma_map_single(dev->hw->device, buf->mem.va,
diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
index 0ee20b764000..8476e6609f35 100644
--- a/include/linux/dma-mapping.h
+++ b/include/linux/dma-mapping.h
@@ -324,7 +324,7 @@ static inline void dma_free_noncoherent(struct device *dev, size_t size,
 	dma_free_pages(dev, size, virt_to_page(vaddr), dma_handle, dir);
 }

-static inline dma_addr_t dma_map_single_attrs(struct device *dev, void *ptr,
+static inline dma_addr_t dma_map_single_attrs(struct device *dev, void __dma *ptr,
 		size_t size, enum dma_data_direction dir, unsigned long attrs)
 {
 	/* DMA must never operate on areas that might be remapped. */
----------------------8<-------------------------

Basically any pointer type passed to dma_map_single() would need the
__dma attribute. Once that's done, the next step is changing the
allocator from kmalloc() to a new dma_kmalloc(). There are other places
where the pointer gets assigned the value of another pointer (e.g.
skb->data), so the origin pointer need to inherit the __dma attribute
(and its original allocator changed).

The scatterlist API may need changing slightly as it works on pages +
offsets.

> In other words, I'm leaning hard on the "mark allocations that need DMA
> memory as needing DMA memory" option.  Yes it will be more work
> initially, but I bet a lot if not most, of them can be caught with
> coccinelle scripts.  And then enforced with sparse to ensure they don't
> break in the future.

I'll read a bit more on this.

> That would provide a huge hint to the allocator as to what is needed,
> while this "packed" really doesn't express the intent here at all.  Then
> if your allocator/arch has explicit requirements for DMA memory, it can
> enforce them and not just hope and guess that people mark things as "not
> dma".

"Packed" is more about saying "I don't need the ARCH_KMALLOC_MINALIGN
alignment". That's why I went for it instead of GFP_NODMA (which usually
leads to your question of why not doing it the other way around).

-- 
Catalin


WARNING: multiple messages have this Message-ID (diff)
From: Catalin Marinas <catalin.marinas@arm.com>
To: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	Arnd Bergmann <arnd@arndb.de>, Will Deacon <will@kernel.org>,
	Marc Zyngier <maz@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Herbert Xu <herbert@gondor.apana.org.au>,
	Ard Biesheuvel <ardb@kernel.org>, Christoph Hellwig <hch@lst.de>,
	Isaac Manjarres <isaacmanjarres@google.com>,
	Saravana Kannan <saravanak@google.com>,
	linux-mm@kvack.org, linux-arm-kernel@lists.infradead.org
Subject: Re: [PATCH v2 2/2] treewide: Add the __GFP_PACKED flag to several non-DMA kmalloc() allocations
Date: Wed, 26 Oct 2022 18:09:19 +0100	[thread overview]
Message-ID: <Y1lpv5/hr+TXxAqd@arm.com> (raw)
In-Reply-To: <Y1kvGFAUjAx726oa@kroah.com>

On Wed, Oct 26, 2022 at 02:59:04PM +0200, Greg Kroah-Hartman wrote:
> On Wed, Oct 26, 2022 at 10:48:58AM +0100, Catalin Marinas wrote:
> > On Wed, Oct 26, 2022 at 08:50:07AM +0200, Greg Kroah-Hartman wrote:
> > > On Tue, Oct 25, 2022 at 09:52:47PM +0100, Catalin Marinas wrote:
> > > > --- a/lib/kasprintf.c
> > > > +++ b/lib/kasprintf.c
> > > > @@ -22,7 +22,7 @@ char *kvasprintf(gfp_t gfp, const char *fmt, va_list ap)
> > > >  	first = vsnprintf(NULL, 0, fmt, aq);
> > > >  	va_end(aq);
> > > >  
> > > > -	p = kmalloc_track_caller(first+1, gfp);
> > > > +	p = kmalloc_track_caller(first+1, gfp | __GFP_PACKED);
> > > 
> > > How do we know this is going to be small?
> > 
> > We don't need to know it's small. If it's over 96 bytes on arm64, it
> > goes in the kmalloc-128 cache or higher. It can even use the kmalloc-192
> > cache that's not aligned to ARCH_DMA_MINALIGN (128). That's why I'd
> > avoid GFP_TINY as this flag is not about size but rather alignment (e.g.
> > 192 may not be DMA safe but it's larger than 128).
> > 
> > That said, I should try to identify sizes > 128 and <= 192 and pass such
> > flag.
> 
> What if the flag is used for large sizes, what will happen?  In other
> words, why would you ever NOT want to use this?  DMA is a big issue, but
> then we should flip that around and explicitly mark the times we want
> DMA, not not-want DMA as "not want" is by far the most common, right?

Indeed, flipping these flags is the ideal solution. It's just tracking
them down and I'm not sure coccinelle on its own can handle it (maybe it
does). As an example of what needs changing:

----------------------8<-------------------------
diff --git a/drivers/infiniband/hw/irdma/osdep.h b/drivers/infiniband/hw/irdma/osdep.h
index fc1ba2a3e6fb..8ba94d563db3 100644
--- a/drivers/infiniband/hw/irdma/osdep.h
+++ b/drivers/infiniband/hw/irdma/osdep.h
@@ -16,7 +16,7 @@ struct irdma_dma_info {
 };

 struct irdma_dma_mem {
-	void *va;
+	void __dma *va;
 	dma_addr_t pa;
 	u32 size;
 } __packed;
diff --git a/drivers/infiniband/hw/irdma/puda.c b/drivers/infiniband/hw/irdma/puda.c
index 4ec9639f1bdb..ab15c5e812d0 100644
--- a/drivers/infiniband/hw/irdma/puda.c
+++ b/drivers/infiniband/hw/irdma/puda.c
@@ -143,13 +143,13 @@ static struct irdma_puda_buf *irdma_puda_alloc_buf(struct irdma_sc_dev *dev,
 	struct irdma_virt_mem buf_mem;

 	buf_mem.size = sizeof(struct irdma_puda_buf);
-	buf_mem.va = kzalloc(buf_mem.size, GFP_KERNEL);
+	buf_mem.va = dma_kzalloc(buf_mem.size, GFP_KERNEL);
 	if (!buf_mem.va)
 		return NULL;

 	buf = buf_mem.va;
 	buf->mem.size = len;
-	buf->mem.va = kzalloc(buf->mem.size, GFP_KERNEL);
+	buf->mem.va = dma_kzalloc(buf->mem.size, GFP_KERNEL);
 	if (!buf->mem.va)
 		goto free_virt;
 	buf->mem.pa = dma_map_single(dev->hw->device, buf->mem.va,
diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
index 0ee20b764000..8476e6609f35 100644
--- a/include/linux/dma-mapping.h
+++ b/include/linux/dma-mapping.h
@@ -324,7 +324,7 @@ static inline void dma_free_noncoherent(struct device *dev, size_t size,
 	dma_free_pages(dev, size, virt_to_page(vaddr), dma_handle, dir);
 }

-static inline dma_addr_t dma_map_single_attrs(struct device *dev, void *ptr,
+static inline dma_addr_t dma_map_single_attrs(struct device *dev, void __dma *ptr,
 		size_t size, enum dma_data_direction dir, unsigned long attrs)
 {
 	/* DMA must never operate on areas that might be remapped. */
----------------------8<-------------------------

Basically any pointer type passed to dma_map_single() would need the
__dma attribute. Once that's done, the next step is changing the
allocator from kmalloc() to a new dma_kmalloc(). There are other places
where the pointer gets assigned the value of another pointer (e.g.
skb->data), so the origin pointer need to inherit the __dma attribute
(and its original allocator changed).

The scatterlist API may need changing slightly as it works on pages +
offsets.

> In other words, I'm leaning hard on the "mark allocations that need DMA
> memory as needing DMA memory" option.  Yes it will be more work
> initially, but I bet a lot if not most, of them can be caught with
> coccinelle scripts.  And then enforced with sparse to ensure they don't
> break in the future.

I'll read a bit more on this.

> That would provide a huge hint to the allocator as to what is needed,
> while this "packed" really doesn't express the intent here at all.  Then
> if your allocator/arch has explicit requirements for DMA memory, it can
> enforce them and not just hope and guess that people mark things as "not
> dma".

"Packed" is more about saying "I don't need the ARCH_KMALLOC_MINALIGN
alignment". That's why I went for it instead of GFP_NODMA (which usually
leads to your question of why not doing it the other way around).

-- 
Catalin

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

  reply	other threads:[~2022-10-26 17:09 UTC|newest]

Thread overview: 74+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-10-25 20:52 [PATCH v2 0/2] mm: Allow kmalloc() allocations below ARCH_KMALLOC_MINALIGN Catalin Marinas
2022-10-25 20:52 ` Catalin Marinas
2022-10-25 20:52 ` [PATCH v2 1/2] mm: slab: Introduce __GFP_PACKED for smaller kmalloc() alignments Catalin Marinas
2022-10-25 20:52   ` Catalin Marinas
2022-10-26  6:39   ` Greg Kroah-Hartman
2022-10-26  6:39     ` Greg Kroah-Hartman
2022-10-26  8:39     ` Catalin Marinas
2022-10-26  8:39       ` Catalin Marinas
2022-10-26  9:49       ` Greg Kroah-Hartman
2022-10-26  9:49         ` Greg Kroah-Hartman
2022-10-26  9:58         ` Catalin Marinas
2022-10-26  9:58           ` Catalin Marinas
2022-10-27 12:11   ` Hyeonggon Yoo
2022-10-27 12:11     ` Hyeonggon Yoo
2022-10-28  7:32     ` Catalin Marinas
2022-10-28  7:32       ` Catalin Marinas
2022-10-25 20:52 ` [PATCH v2 2/2] treewide: Add the __GFP_PACKED flag to several non-DMA kmalloc() allocations Catalin Marinas
2022-10-25 20:52   ` Catalin Marinas
2022-10-26  6:50   ` Greg Kroah-Hartman
2022-10-26  6:50     ` Greg Kroah-Hartman
2022-10-26  9:48     ` Catalin Marinas
2022-10-26  9:48       ` Catalin Marinas
2022-10-26 12:59       ` Greg Kroah-Hartman
2022-10-26 12:59         ` Greg Kroah-Hartman
2022-10-26 17:09         ` Catalin Marinas [this message]
2022-10-26 17:09           ` Catalin Marinas
2022-10-26 17:21           ` Greg Kroah-Hartman
2022-10-26 17:21             ` Greg Kroah-Hartman
2022-10-26 17:46       ` Linus Torvalds
2022-10-26 17:46         ` Linus Torvalds
2022-10-27 22:29         ` Catalin Marinas
2022-10-27 22:29           ` Catalin Marinas
2022-10-28  9:37           ` Greg Kroah-Hartman
2022-10-28  9:37             ` Greg Kroah-Hartman
2022-10-28  9:37             ` Greg Kroah-Hartman
2022-10-28  9:37               ` Greg Kroah-Hartman
2022-10-30  8:47               ` Christoph Hellwig
2022-10-30  8:47                 ` Christoph Hellwig
2022-10-30  9:02                 ` Greg Kroah-Hartman
2022-10-30  9:02                   ` Greg Kroah-Hartman
2022-10-30  9:13                   ` Christoph Hellwig
2022-10-30  9:13                     ` Christoph Hellwig
2022-10-30 16:43                     ` Catalin Marinas
2022-10-30 16:43                       ` Catalin Marinas
2022-11-01 10:59                       ` Christoph Hellwig
2022-11-01 10:59                         ` Christoph Hellwig
2022-11-01 17:19                         ` Catalin Marinas
2022-11-01 17:19                           ` Catalin Marinas
2022-11-01 17:24                           ` Christoph Hellwig
2022-11-01 17:24                             ` Christoph Hellwig
2022-11-01 17:32                             ` Catalin Marinas
2022-11-01 17:32                               ` Catalin Marinas
2022-11-01 17:39                               ` Christoph Hellwig
2022-11-01 17:39                                 ` Christoph Hellwig
2022-11-01 19:10                                 ` Isaac Manjarres
2022-11-01 19:10                                   ` Isaac Manjarres
2022-11-02 11:05                                   ` Catalin Marinas
2022-11-02 11:05                                     ` Catalin Marinas
2022-11-02 20:50                                     ` Isaac Manjarres
2022-11-02 20:50                                       ` Isaac Manjarres
2022-11-01 18:14                           ` Robin Murphy
2022-11-01 18:14                             ` Robin Murphy
2022-11-02 13:10                             ` Catalin Marinas
2022-11-02 13:10                               ` Catalin Marinas
2022-10-30  8:46           ` Christoph Hellwig
2022-10-30  8:46             ` Christoph Hellwig
2022-10-30  8:44         ` Christoph Hellwig
2022-10-30  8:44           ` Christoph Hellwig
2022-11-03 16:15       ` Vlastimil Babka
2022-11-03 16:15         ` Vlastimil Babka
2022-11-03 18:03         ` Catalin Marinas
2022-11-03 18:03           ` Catalin Marinas
2022-10-26  6:54 ` [PATCH v2 0/2] mm: Allow kmalloc() allocations below ARCH_KMALLOC_MINALIGN Greg Kroah-Hartman
2022-10-26  6:54   ` Greg Kroah-Hartman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Y1lpv5/hr+TXxAqd@arm.com \
    --to=catalin.marinas@arm.com \
    --cc=akpm@linux-foundation.org \
    --cc=ardb@kernel.org \
    --cc=arnd@arndb.de \
    --cc=gregkh@linuxfoundation.org \
    --cc=hch@lst.de \
    --cc=herbert@gondor.apana.org.au \
    --cc=isaacmanjarres@google.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-mm@kvack.org \
    --cc=maz@kernel.org \
    --cc=saravanak@google.com \
    --cc=torvalds@linux-foundation.org \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.