From: Catalin Marinas <catalin.marinas@arm.com> To: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org>, Arnd Bergmann <arnd@arndb.de>, Will Deacon <will@kernel.org>, Marc Zyngier <maz@kernel.org>, Andrew Morton <akpm@linux-foundation.org>, Herbert Xu <herbert@gondor.apana.org.au>, Ard Biesheuvel <ardb@kernel.org>, Christoph Hellwig <hch@lst.de>, Isaac Manjarres <isaacmanjarres@google.com>, Saravana Kannan <saravanak@google.com>, linux-mm@kvack.org, linux-arm-kernel@lists.infradead.org Subject: Re: [PATCH v2 2/2] treewide: Add the __GFP_PACKED flag to several non-DMA kmalloc() allocations Date: Wed, 26 Oct 2022 18:09:19 +0100 [thread overview] Message-ID: <Y1lpv5/hr+TXxAqd@arm.com> (raw) In-Reply-To: <Y1kvGFAUjAx726oa@kroah.com> On Wed, Oct 26, 2022 at 02:59:04PM +0200, Greg Kroah-Hartman wrote: > On Wed, Oct 26, 2022 at 10:48:58AM +0100, Catalin Marinas wrote: > > On Wed, Oct 26, 2022 at 08:50:07AM +0200, Greg Kroah-Hartman wrote: > > > On Tue, Oct 25, 2022 at 09:52:47PM +0100, Catalin Marinas wrote: > > > > --- a/lib/kasprintf.c > > > > +++ b/lib/kasprintf.c > > > > @@ -22,7 +22,7 @@ char *kvasprintf(gfp_t gfp, const char *fmt, va_list ap) > > > > first = vsnprintf(NULL, 0, fmt, aq); > > > > va_end(aq); > > > > > > > > - p = kmalloc_track_caller(first+1, gfp); > > > > + p = kmalloc_track_caller(first+1, gfp | __GFP_PACKED); > > > > > > How do we know this is going to be small? > > > > We don't need to know it's small. If it's over 96 bytes on arm64, it > > goes in the kmalloc-128 cache or higher. It can even use the kmalloc-192 > > cache that's not aligned to ARCH_DMA_MINALIGN (128). That's why I'd > > avoid GFP_TINY as this flag is not about size but rather alignment (e.g. > > 192 may not be DMA safe but it's larger than 128). > > > > That said, I should try to identify sizes > 128 and <= 192 and pass such > > flag. > > What if the flag is used for large sizes, what will happen? In other > words, why would you ever NOT want to use this? DMA is a big issue, but > then we should flip that around and explicitly mark the times we want > DMA, not not-want DMA as "not want" is by far the most common, right? Indeed, flipping these flags is the ideal solution. It's just tracking them down and I'm not sure coccinelle on its own can handle it (maybe it does). As an example of what needs changing: ----------------------8<------------------------- diff --git a/drivers/infiniband/hw/irdma/osdep.h b/drivers/infiniband/hw/irdma/osdep.h index fc1ba2a3e6fb..8ba94d563db3 100644 --- a/drivers/infiniband/hw/irdma/osdep.h +++ b/drivers/infiniband/hw/irdma/osdep.h @@ -16,7 +16,7 @@ struct irdma_dma_info { }; struct irdma_dma_mem { - void *va; + void __dma *va; dma_addr_t pa; u32 size; } __packed; diff --git a/drivers/infiniband/hw/irdma/puda.c b/drivers/infiniband/hw/irdma/puda.c index 4ec9639f1bdb..ab15c5e812d0 100644 --- a/drivers/infiniband/hw/irdma/puda.c +++ b/drivers/infiniband/hw/irdma/puda.c @@ -143,13 +143,13 @@ static struct irdma_puda_buf *irdma_puda_alloc_buf(struct irdma_sc_dev *dev, struct irdma_virt_mem buf_mem; buf_mem.size = sizeof(struct irdma_puda_buf); - buf_mem.va = kzalloc(buf_mem.size, GFP_KERNEL); + buf_mem.va = dma_kzalloc(buf_mem.size, GFP_KERNEL); if (!buf_mem.va) return NULL; buf = buf_mem.va; buf->mem.size = len; - buf->mem.va = kzalloc(buf->mem.size, GFP_KERNEL); + buf->mem.va = dma_kzalloc(buf->mem.size, GFP_KERNEL); if (!buf->mem.va) goto free_virt; buf->mem.pa = dma_map_single(dev->hw->device, buf->mem.va, diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h index 0ee20b764000..8476e6609f35 100644 --- a/include/linux/dma-mapping.h +++ b/include/linux/dma-mapping.h @@ -324,7 +324,7 @@ static inline void dma_free_noncoherent(struct device *dev, size_t size, dma_free_pages(dev, size, virt_to_page(vaddr), dma_handle, dir); } -static inline dma_addr_t dma_map_single_attrs(struct device *dev, void *ptr, +static inline dma_addr_t dma_map_single_attrs(struct device *dev, void __dma *ptr, size_t size, enum dma_data_direction dir, unsigned long attrs) { /* DMA must never operate on areas that might be remapped. */ ----------------------8<------------------------- Basically any pointer type passed to dma_map_single() would need the __dma attribute. Once that's done, the next step is changing the allocator from kmalloc() to a new dma_kmalloc(). There are other places where the pointer gets assigned the value of another pointer (e.g. skb->data), so the origin pointer need to inherit the __dma attribute (and its original allocator changed). The scatterlist API may need changing slightly as it works on pages + offsets. > In other words, I'm leaning hard on the "mark allocations that need DMA > memory as needing DMA memory" option. Yes it will be more work > initially, but I bet a lot if not most, of them can be caught with > coccinelle scripts. And then enforced with sparse to ensure they don't > break in the future. I'll read a bit more on this. > That would provide a huge hint to the allocator as to what is needed, > while this "packed" really doesn't express the intent here at all. Then > if your allocator/arch has explicit requirements for DMA memory, it can > enforce them and not just hope and guess that people mark things as "not > dma". "Packed" is more about saying "I don't need the ARCH_KMALLOC_MINALIGN alignment". That's why I went for it instead of GFP_NODMA (which usually leads to your question of why not doing it the other way around). -- Catalin
WARNING: multiple messages have this Message-ID (diff)
From: Catalin Marinas <catalin.marinas@arm.com> To: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org>, Arnd Bergmann <arnd@arndb.de>, Will Deacon <will@kernel.org>, Marc Zyngier <maz@kernel.org>, Andrew Morton <akpm@linux-foundation.org>, Herbert Xu <herbert@gondor.apana.org.au>, Ard Biesheuvel <ardb@kernel.org>, Christoph Hellwig <hch@lst.de>, Isaac Manjarres <isaacmanjarres@google.com>, Saravana Kannan <saravanak@google.com>, linux-mm@kvack.org, linux-arm-kernel@lists.infradead.org Subject: Re: [PATCH v2 2/2] treewide: Add the __GFP_PACKED flag to several non-DMA kmalloc() allocations Date: Wed, 26 Oct 2022 18:09:19 +0100 [thread overview] Message-ID: <Y1lpv5/hr+TXxAqd@arm.com> (raw) In-Reply-To: <Y1kvGFAUjAx726oa@kroah.com> On Wed, Oct 26, 2022 at 02:59:04PM +0200, Greg Kroah-Hartman wrote: > On Wed, Oct 26, 2022 at 10:48:58AM +0100, Catalin Marinas wrote: > > On Wed, Oct 26, 2022 at 08:50:07AM +0200, Greg Kroah-Hartman wrote: > > > On Tue, Oct 25, 2022 at 09:52:47PM +0100, Catalin Marinas wrote: > > > > --- a/lib/kasprintf.c > > > > +++ b/lib/kasprintf.c > > > > @@ -22,7 +22,7 @@ char *kvasprintf(gfp_t gfp, const char *fmt, va_list ap) > > > > first = vsnprintf(NULL, 0, fmt, aq); > > > > va_end(aq); > > > > > > > > - p = kmalloc_track_caller(first+1, gfp); > > > > + p = kmalloc_track_caller(first+1, gfp | __GFP_PACKED); > > > > > > How do we know this is going to be small? > > > > We don't need to know it's small. If it's over 96 bytes on arm64, it > > goes in the kmalloc-128 cache or higher. It can even use the kmalloc-192 > > cache that's not aligned to ARCH_DMA_MINALIGN (128). That's why I'd > > avoid GFP_TINY as this flag is not about size but rather alignment (e.g. > > 192 may not be DMA safe but it's larger than 128). > > > > That said, I should try to identify sizes > 128 and <= 192 and pass such > > flag. > > What if the flag is used for large sizes, what will happen? In other > words, why would you ever NOT want to use this? DMA is a big issue, but > then we should flip that around and explicitly mark the times we want > DMA, not not-want DMA as "not want" is by far the most common, right? Indeed, flipping these flags is the ideal solution. It's just tracking them down and I'm not sure coccinelle on its own can handle it (maybe it does). As an example of what needs changing: ----------------------8<------------------------- diff --git a/drivers/infiniband/hw/irdma/osdep.h b/drivers/infiniband/hw/irdma/osdep.h index fc1ba2a3e6fb..8ba94d563db3 100644 --- a/drivers/infiniband/hw/irdma/osdep.h +++ b/drivers/infiniband/hw/irdma/osdep.h @@ -16,7 +16,7 @@ struct irdma_dma_info { }; struct irdma_dma_mem { - void *va; + void __dma *va; dma_addr_t pa; u32 size; } __packed; diff --git a/drivers/infiniband/hw/irdma/puda.c b/drivers/infiniband/hw/irdma/puda.c index 4ec9639f1bdb..ab15c5e812d0 100644 --- a/drivers/infiniband/hw/irdma/puda.c +++ b/drivers/infiniband/hw/irdma/puda.c @@ -143,13 +143,13 @@ static struct irdma_puda_buf *irdma_puda_alloc_buf(struct irdma_sc_dev *dev, struct irdma_virt_mem buf_mem; buf_mem.size = sizeof(struct irdma_puda_buf); - buf_mem.va = kzalloc(buf_mem.size, GFP_KERNEL); + buf_mem.va = dma_kzalloc(buf_mem.size, GFP_KERNEL); if (!buf_mem.va) return NULL; buf = buf_mem.va; buf->mem.size = len; - buf->mem.va = kzalloc(buf->mem.size, GFP_KERNEL); + buf->mem.va = dma_kzalloc(buf->mem.size, GFP_KERNEL); if (!buf->mem.va) goto free_virt; buf->mem.pa = dma_map_single(dev->hw->device, buf->mem.va, diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h index 0ee20b764000..8476e6609f35 100644 --- a/include/linux/dma-mapping.h +++ b/include/linux/dma-mapping.h @@ -324,7 +324,7 @@ static inline void dma_free_noncoherent(struct device *dev, size_t size, dma_free_pages(dev, size, virt_to_page(vaddr), dma_handle, dir); } -static inline dma_addr_t dma_map_single_attrs(struct device *dev, void *ptr, +static inline dma_addr_t dma_map_single_attrs(struct device *dev, void __dma *ptr, size_t size, enum dma_data_direction dir, unsigned long attrs) { /* DMA must never operate on areas that might be remapped. */ ----------------------8<------------------------- Basically any pointer type passed to dma_map_single() would need the __dma attribute. Once that's done, the next step is changing the allocator from kmalloc() to a new dma_kmalloc(). There are other places where the pointer gets assigned the value of another pointer (e.g. skb->data), so the origin pointer need to inherit the __dma attribute (and its original allocator changed). The scatterlist API may need changing slightly as it works on pages + offsets. > In other words, I'm leaning hard on the "mark allocations that need DMA > memory as needing DMA memory" option. Yes it will be more work > initially, but I bet a lot if not most, of them can be caught with > coccinelle scripts. And then enforced with sparse to ensure they don't > break in the future. I'll read a bit more on this. > That would provide a huge hint to the allocator as to what is needed, > while this "packed" really doesn't express the intent here at all. Then > if your allocator/arch has explicit requirements for DMA memory, it can > enforce them and not just hope and guess that people mark things as "not > dma". "Packed" is more about saying "I don't need the ARCH_KMALLOC_MINALIGN alignment". That's why I went for it instead of GFP_NODMA (which usually leads to your question of why not doing it the other way around). -- Catalin _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
next prev parent reply other threads:[~2022-10-26 17:09 UTC|newest] Thread overview: 74+ messages / expand[flat|nested] mbox.gz Atom feed top 2022-10-25 20:52 [PATCH v2 0/2] mm: Allow kmalloc() allocations below ARCH_KMALLOC_MINALIGN Catalin Marinas 2022-10-25 20:52 ` Catalin Marinas 2022-10-25 20:52 ` [PATCH v2 1/2] mm: slab: Introduce __GFP_PACKED for smaller kmalloc() alignments Catalin Marinas 2022-10-25 20:52 ` Catalin Marinas 2022-10-26 6:39 ` Greg Kroah-Hartman 2022-10-26 6:39 ` Greg Kroah-Hartman 2022-10-26 8:39 ` Catalin Marinas 2022-10-26 8:39 ` Catalin Marinas 2022-10-26 9:49 ` Greg Kroah-Hartman 2022-10-26 9:49 ` Greg Kroah-Hartman 2022-10-26 9:58 ` Catalin Marinas 2022-10-26 9:58 ` Catalin Marinas 2022-10-27 12:11 ` Hyeonggon Yoo 2022-10-27 12:11 ` Hyeonggon Yoo 2022-10-28 7:32 ` Catalin Marinas 2022-10-28 7:32 ` Catalin Marinas 2022-10-25 20:52 ` [PATCH v2 2/2] treewide: Add the __GFP_PACKED flag to several non-DMA kmalloc() allocations Catalin Marinas 2022-10-25 20:52 ` Catalin Marinas 2022-10-26 6:50 ` Greg Kroah-Hartman 2022-10-26 6:50 ` Greg Kroah-Hartman 2022-10-26 9:48 ` Catalin Marinas 2022-10-26 9:48 ` Catalin Marinas 2022-10-26 12:59 ` Greg Kroah-Hartman 2022-10-26 12:59 ` Greg Kroah-Hartman 2022-10-26 17:09 ` Catalin Marinas [this message] 2022-10-26 17:09 ` Catalin Marinas 2022-10-26 17:21 ` Greg Kroah-Hartman 2022-10-26 17:21 ` Greg Kroah-Hartman 2022-10-26 17:46 ` Linus Torvalds 2022-10-26 17:46 ` Linus Torvalds 2022-10-27 22:29 ` Catalin Marinas 2022-10-27 22:29 ` Catalin Marinas 2022-10-28 9:37 ` Greg Kroah-Hartman 2022-10-28 9:37 ` Greg Kroah-Hartman 2022-10-28 9:37 ` Greg Kroah-Hartman 2022-10-28 9:37 ` Greg Kroah-Hartman 2022-10-30 8:47 ` Christoph Hellwig 2022-10-30 8:47 ` Christoph Hellwig 2022-10-30 9:02 ` Greg Kroah-Hartman 2022-10-30 9:02 ` Greg Kroah-Hartman 2022-10-30 9:13 ` Christoph Hellwig 2022-10-30 9:13 ` Christoph Hellwig 2022-10-30 16:43 ` Catalin Marinas 2022-10-30 16:43 ` Catalin Marinas 2022-11-01 10:59 ` Christoph Hellwig 2022-11-01 10:59 ` Christoph Hellwig 2022-11-01 17:19 ` Catalin Marinas 2022-11-01 17:19 ` Catalin Marinas 2022-11-01 17:24 ` Christoph Hellwig 2022-11-01 17:24 ` Christoph Hellwig 2022-11-01 17:32 ` Catalin Marinas 2022-11-01 17:32 ` Catalin Marinas 2022-11-01 17:39 ` Christoph Hellwig 2022-11-01 17:39 ` Christoph Hellwig 2022-11-01 19:10 ` Isaac Manjarres 2022-11-01 19:10 ` Isaac Manjarres 2022-11-02 11:05 ` Catalin Marinas 2022-11-02 11:05 ` Catalin Marinas 2022-11-02 20:50 ` Isaac Manjarres 2022-11-02 20:50 ` Isaac Manjarres 2022-11-01 18:14 ` Robin Murphy 2022-11-01 18:14 ` Robin Murphy 2022-11-02 13:10 ` Catalin Marinas 2022-11-02 13:10 ` Catalin Marinas 2022-10-30 8:46 ` Christoph Hellwig 2022-10-30 8:46 ` Christoph Hellwig 2022-10-30 8:44 ` Christoph Hellwig 2022-10-30 8:44 ` Christoph Hellwig 2022-11-03 16:15 ` Vlastimil Babka 2022-11-03 16:15 ` Vlastimil Babka 2022-11-03 18:03 ` Catalin Marinas 2022-11-03 18:03 ` Catalin Marinas 2022-10-26 6:54 ` [PATCH v2 0/2] mm: Allow kmalloc() allocations below ARCH_KMALLOC_MINALIGN Greg Kroah-Hartman 2022-10-26 6:54 ` Greg Kroah-Hartman
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=Y1lpv5/hr+TXxAqd@arm.com \ --to=catalin.marinas@arm.com \ --cc=akpm@linux-foundation.org \ --cc=ardb@kernel.org \ --cc=arnd@arndb.de \ --cc=gregkh@linuxfoundation.org \ --cc=hch@lst.de \ --cc=herbert@gondor.apana.org.au \ --cc=isaacmanjarres@google.com \ --cc=linux-arm-kernel@lists.infradead.org \ --cc=linux-mm@kvack.org \ --cc=maz@kernel.org \ --cc=saravanak@google.com \ --cc=torvalds@linux-foundation.org \ --cc=will@kernel.org \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: linkBe sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.