linux-kernel.vger.kernel.org archive mirror
* [PATCH 00/22] Don't use kmalloc() with GFP_DMA
@ 2022-02-19  0:51 Baoquan He
  2022-02-19  0:52 ` [PATCH 01/22] parisc: pci-dma: remove stale code and comment Baoquan He
                   ` (22 more replies)
  0 siblings, 23 replies; 74+ messages in thread
From: Baoquan He @ 2022-02-19  0:51 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, akpm, hch, cl, 42.hyeyoo, penberg, rientjes,
	iamjoonsoo.kim, vbabka, David.Laight, david, herbert, davem,
	linux-crypto, steffen.klassert, netdev, hca, gor, agordeev,
	borntraeger, svens, linux-s390, michael, linux-i2c, wsa

Let's replace kmalloc(GFP_DMA) with other interfaces. This is the first step
towards removing dma-kmalloc support from the kernel (meaning that, if
everything goes well, kmalloc(GFP_DMA) can no longer be used to allocate
buffers in the future).

This series includes the changes below, which are the easier ones to
recognise and make.

1) Remove GFP_DMA from calls to dma_alloc_wc/coherent(), dma_pool_alloc(),
   and dmam_alloc_coherent(), where specifying GFP_DMA is redundant.
2) Replace the kmalloc(GFP_DMA)/dma_map_xxxx() pair with
   dma_alloc_noncoherent().
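
As a rough illustration of why item 1) is safe, below is a minimal userspace
sketch, not kernel code. It assumes a simplified model in which the DMA
allocation helpers derive the memory zone from the device's coherent DMA mask
rather than from the caller's gfp flags. All toy_* names are hypothetical, and
the 24-bit ZONE_DMA width used in the usage note is an assumption (as on x86).

```c
#include <stdint.h>

/* Toy model only: these names are hypothetical, not kernel symbols. */
enum toy_zone { TOY_ZONE_DMA, TOY_ZONE_DMA32, TOY_ZONE_NORMAL };

/* Mask with the low n bits set, similar in spirit to DMA_BIT_MASK(n). */
static uint64_t toy_bit_mask(unsigned int n)
{
	return n >= 64 ? ~0ULL : (1ULL << n) - 1;
}

/*
 * Pick the least restrictive zone the device can still address.
 * zone_dma_bits is the assumed address width covered by ZONE_DMA.
 * Note that no gfp flag from the caller takes part in the decision.
 */
static enum toy_zone toy_zone_for_mask(uint64_t coherent_mask,
				       unsigned int zone_dma_bits)
{
	if (coherent_mask <= toy_bit_mask(zone_dma_bits))
		return TOY_ZONE_DMA;
	if (coherent_mask <= toy_bit_mask(32))
		return TOY_ZONE_DMA32;
	return TOY_ZONE_NORMAL;
}
```

In this model, toy_zone_for_mask(toy_bit_mask(24), 24) selects the DMA zone
and toy_zone_for_mask(toy_bit_mask(64), 24) selects the normal zone. The
caller never has to name a zone itself, which is the sense in which passing
GFP_DMA to these helpers is redundant.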

Next, we plan to investigate how to handle the places listed below. We
first need to figure out whether they really need buffers from ZONE_DMA and,
if so, how to convert them to other interfaces. This needs help from
maintainers, experts on the sub-components, code contributors, or anyone
who knows them well. E.g. for s390 and crypto, we need guidance and help.

1) kmalloc(GFP_DMA) on the s390 platform, under arch/s390 and drivers/s390;
2) kmalloc(GFP_DMA) in drivers/crypto;
3) kmalloc(GFP_DMA) in network drivers under drivers/net, e.g. skb
   buffers requested from the DMA zone;
4) kmalloc(GFP_DMA) in device register control, e.g. using regmap or devres
   to read/write registers where memory from ZONE_DMA is required, e.g.
   i2c, spi.

For this first patch series, thanks to Hyeonggon for helping with the
review and for great suggestions on improving the patches. We will work
together to continue with the next steps.

Any comment, thought, or suggestion is welcome and appreciated,
including but not limited to:
1) whether we should remove dma-kmalloc support from the kernel;
2) why kmalloc(GFP_DMA) is needed in a certain place, i.e. why memory from
   ZONE_DMA has to be requested in that case;
3) how to replace it with other interfaces in any place you are familiar
   with.

===========================Background information=======================
Prelude:
Earlier, allocation failures were observed when calling kmalloc() with
GFP_DMA. Such a call requests a slab page from the DMA zone even though the
zone has no managed pages at all. This happens because in the current kernel
the dma-kmalloc caches are created as long as CONFIG_ZONE_DMA is enabled,
while the kdump kernel on x86_64 has had no managed pages in the DMA zone
since the commit below. The details of this kdump issue can be found in
reference link (a).

	commit 6f599d84231f ("x86/kdump: Always reserve the low 1M when the crashkernel option is specified")

To make the root cause and the fix clear, many reviewers contributed their
thoughts and suggestions in the thread of patchset v3 (b). Finally,
Hyeonggon summarized what we can do for now to work around the kdump issue,
and the further action to get rid of dma-kmalloc, whose existence is not
reasonable (please see Hyeonggon's reply in reference (c)).
Quoting Hyeonggon's words here:
~~~~
What about one of those?:

    1) Do not call warn_alloc in page allocator if will always fail
    to allocate ZONE_DMA pages.

    2) let's check all callers of kmalloc with GFP_DMA
    if they really need GFP_DMA flag and replace those by DMA API or
    just remove GFP_DMA from kmalloc()

    3) Drop support for allocating DMA memory from slab allocator
    (as Christoph Hellwig said) and convert them to use DMA32
    and see what happens
~~~~

Then Christoph acked Hyeonggon's conclusion, and said "This is the right
thing to do, but it will take a while." (See reference link (d).)


==========Reference links=======
(a) v4 post including the details of kdump issue:
https://lore.kernel.org/all/20211223094435.248523-1-bhe@redhat.com/T/#u

(b) v3 post including many reviewers' comments:
https://lore.kernel.org/all/20211213122712.23805-1-bhe@redhat.com/T/#u

(c) Hyeonggon's mail concluding the solution:
https://lore.kernel.org/all/20211215044818.GB1097530@odroid/T/#u

(d) Christoph acked the plan in this mail:
https://lore.kernel.org/all/20211215072710.GA3010@lst.de/T/#u

Baoquan He (21):
  parisc: pci-dma: remove stale code and comment
  gpu: ipu-v3: Don't use GFP_DMA when calling dma_alloc_coherent()
  drm/sti: Don't use GFP_DMA when calling dma_alloc_wc()
  sound: n64: Don't use GFP_DMA when calling dma_alloc_coherent()
  fbdev: da8xx: Don't use GFP_DMA when calling dma_alloc_coherent()
  fbdev: mx3fb: Don't use GFP_DMA when calling dma_alloc_wc()
  usb: gadget: lpc32xx_udc: Don't use GFP_DMA when calling
    dma_alloc_coherent()
  usb: cdns3: Don't use GFP_DMA when calling dma_alloc_coherent()
  uio: pruss: Don't use GFP_DMA when calling dma_alloc_coherent()
  staging: emxx_udc: Don't use GFP_DMA when calling dma_alloc_coherent()
  staging: emxx_udc: Don't use GFP_DMA when calling dma_alloc_coherent()
  spi: atmel: Don't use GFP_DMA when calling dma_alloc_coherent()
  spi: spi-ti-qspi: Don't use GFP_DMA when calling dma_alloc_coherent()
  usb: cdns3: Don't use GFP_DMA when calling dma_pool_alloc()
  usb: udc: lpc32xx: Don't use GFP_DMA when calling dma_pool_alloc()
  net: marvell: prestera: Don't use GFP_DMA when calling
    dma_pool_alloc()
  net: ethernet: mtk-star-emac: Don't use GFP_DMA when calling
    dmam_alloc_coherent()
  ethernet: rocker: Use dma_alloc_noncoherent() for dma buffer
  HID: intel-ish-hid: Use dma_alloc_noncoherent() for dma buffer
  mmc: wbsd: Use dma_alloc_noncoherent() for dma buffer
  mtd: rawnand: Use dma_alloc_noncoherent() for dma buffer

Hyeonggon Yoo (1):
  net: moxa: Don't use GFP_DMA when calling dma_alloc_coherent()

 arch/parisc/kernel/pci-dma.c                  |  8 ---
 drivers/gpu/drm/sti/sti_cursor.c              |  4 +-
 drivers/gpu/drm/sti/sti_hqvdp.c               |  2 +-
 drivers/gpu/ipu-v3/ipu-image-convert.c        |  2 +-
 drivers/hid/intel-ish-hid/ishtp-fw-loader.c   | 23 +++-----
 drivers/mmc/host/wbsd.c                       | 45 +++-----------
 drivers/mtd/nand/raw/marvell_nand.c           | 55 ++++++++++-------
 .../ethernet/marvell/prestera/prestera_rxtx.c |  2 +-
 drivers/net/ethernet/mediatek/mtk_star_emac.c |  2 +-
 drivers/net/ethernet/moxa/moxart_ether.c      |  4 +-
 drivers/net/ethernet/rocker/rocker_main.c     | 59 ++++++++-----------
 drivers/spi/spi-atmel.c                       |  4 +-
 drivers/spi/spi-ti-qspi.c                     |  2 +-
 drivers/staging/emxx_udc/emxx_udc.c           |  2 +-
 drivers/staging/media/imx/imx-media-utils.c   |  2 +-
 drivers/uio/uio_pruss.c                       |  2 +-
 drivers/usb/cdns3/cdns3-gadget.c              |  4 +-
 drivers/usb/gadget/udc/lpc32xx_udc.c          |  4 +-
 drivers/video/fbdev/da8xx-fb.c                |  4 +-
 drivers/video/fbdev/fsl-diu-fb.c              |  2 +-
 drivers/video/fbdev/mx3fb.c                   |  2 +-
 sound/mips/snd-n64.c                          |  2 +-
 22 files changed, 97 insertions(+), 139 deletions(-)

-- 
2.17.2



* [PATCH 01/22] parisc: pci-dma: remove stale code and comment
  2022-02-19  0:51 [PATCH 00/22] Don't use kmalloc() with GFP_DMA Baoquan He
@ 2022-02-19  0:52 ` Baoquan He
  2022-02-19  7:07   ` Christoph Hellwig
  2022-02-19  0:52 ` [PATCH 02/22] net: moxa: Don't use GFP_DMA when calling dma_alloc_coherent() Baoquan He
                   ` (21 subsequent siblings)
  22 siblings, 1 reply; 74+ messages in thread
From: Baoquan He @ 2022-02-19  0:52 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, akpm, hch, cl, 42.hyeyoo, penberg, rientjes,
	iamjoonsoo.kim, vbabka, David.Laight, david, herbert, davem,
	linux-crypto, steffen.klassert, netdev, hca, gor, agordeev,
	borntraeger, svens, linux-s390, michael, linux-i2c, wsa

The gfp assignment was compiled out long ago and, together with the code
comment, this shows it has not been needed since then. Let's remove the
whole ifdeffery block so that searching for GFP_DMA no longer points here.

Signed-off-by: Baoquan He <bhe@redhat.com>
---
 arch/parisc/kernel/pci-dma.c | 8 --------
 1 file changed, 8 deletions(-)

diff --git a/arch/parisc/kernel/pci-dma.c b/arch/parisc/kernel/pci-dma.c
index 36a57aa38e87..6c7c6314ef33 100644
--- a/arch/parisc/kernel/pci-dma.c
+++ b/arch/parisc/kernel/pci-dma.c
@@ -417,14 +417,6 @@ void *arch_dma_alloc(struct device *dev, size_t size,
 	map_uncached_pages(vaddr, size, paddr);
 	*dma_handle = (dma_addr_t) paddr;
 
-#if 0
-/* This probably isn't needed to support EISA cards.
-** ISA cards will certainly only support 24-bit DMA addressing.
-** Not clear if we can, want, or need to support ISA.
-*/
-	if (!dev || *dev->coherent_dma_mask < 0xffffffff)
-		gfp |= GFP_DMA;
-#endif
 	return (void *)vaddr;
 }
 
-- 
2.17.2



* [PATCH 02/22] net: moxa: Don't use GFP_DMA when calling dma_alloc_coherent()
  2022-02-19  0:51 [PATCH 00/22] Don't use kmalloc() with GFP_DMA Baoquan He
  2022-02-19  0:52 ` [PATCH 01/22] parisc: pci-dma: remove stale code and comment Baoquan He
@ 2022-02-19  0:52 ` Baoquan He
  2022-02-19  7:07   ` Christoph Hellwig
  2022-02-19  0:52 ` [PATCH 03/22] gpu: ipu-v3: " Baoquan He
                   ` (20 subsequent siblings)
  22 siblings, 1 reply; 74+ messages in thread
From: Baoquan He @ 2022-02-19  0:52 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, akpm, hch, cl, 42.hyeyoo, penberg, rientjes,
	iamjoonsoo.kim, vbabka, David.Laight, david, herbert, davem,
	linux-crypto, steffen.klassert, netdev, hca, gor, agordeev,
	borntraeger, svens, linux-s390, michael, linux-i2c, wsa

From: Hyeonggon Yoo <42.hyeyoo@gmail.com>

dma_alloc_coherent() allocates dma buffer with device's addressing
limitation in mind. It's redundant to specify GFP_DMA when calling
dma_alloc_coherent().

Signed-off-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Signed-off-by: Baoquan He <bhe@redhat.com>
---
 drivers/net/ethernet/moxa/moxart_ether.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/moxa/moxart_ether.c b/drivers/net/ethernet/moxa/moxart_ether.c
index 15179b9529e1..8fc2c2e71c2d 100644
--- a/drivers/net/ethernet/moxa/moxart_ether.c
+++ b/drivers/net/ethernet/moxa/moxart_ether.c
@@ -495,7 +495,7 @@ static int moxart_mac_probe(struct platform_device *pdev)
 
 	priv->tx_desc_base = dma_alloc_coherent(&pdev->dev, TX_REG_DESC_SIZE *
 						TX_DESC_NUM, &priv->tx_base,
-						GFP_DMA | GFP_KERNEL);
+						GFP_KERNEL);
 	if (!priv->tx_desc_base) {
 		ret = -ENOMEM;
 		goto init_fail;
@@ -503,7 +503,7 @@ static int moxart_mac_probe(struct platform_device *pdev)
 
 	priv->rx_desc_base = dma_alloc_coherent(&pdev->dev, RX_REG_DESC_SIZE *
 						RX_DESC_NUM, &priv->rx_base,
-						GFP_DMA | GFP_KERNEL);
+						GFP_KERNEL);
 	if (!priv->rx_desc_base) {
 		ret = -ENOMEM;
 		goto init_fail;
-- 
2.17.2



* [PATCH 03/22] gpu: ipu-v3: Don't use GFP_DMA when calling dma_alloc_coherent()
  2022-02-19  0:51 [PATCH 00/22] Don't use kmalloc() with GFP_DMA Baoquan He
  2022-02-19  0:52 ` [PATCH 01/22] parisc: pci-dma: remove stale code and comment Baoquan He
  2022-02-19  0:52 ` [PATCH 02/22] net: moxa: Don't use GFP_DMA when calling dma_alloc_coherent() Baoquan He
@ 2022-02-19  0:52 ` Baoquan He
  2022-02-19  7:07   ` Christoph Hellwig
  2022-02-19  0:52 ` [PATCH 04/22] drm/sti: Don't use GFP_DMA when calling dma_alloc_wc() Baoquan He
                   ` (19 subsequent siblings)
  22 siblings, 1 reply; 74+ messages in thread
From: Baoquan He @ 2022-02-19  0:52 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, akpm, hch, cl, 42.hyeyoo, penberg, rientjes,
	iamjoonsoo.kim, vbabka, David.Laight, david, herbert, davem,
	linux-crypto, steffen.klassert, netdev, hca, gor, agordeev,
	borntraeger, svens, linux-s390, michael, linux-i2c, wsa

dma_alloc_coherent() allocates dma buffer with device's addressing
limitation in mind. It's redundant to specify GFP_DMA when calling
dma_alloc_coherent().

[ 42.hyeyoo@gmail.com: Update changelog ]

Signed-off-by: Baoquan He <bhe@redhat.com>
Acked-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
---
 drivers/gpu/ipu-v3/ipu-image-convert.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/ipu-v3/ipu-image-convert.c b/drivers/gpu/ipu-v3/ipu-image-convert.c
index aa1d4b6d278f..1bd3eff2cf47 100644
--- a/drivers/gpu/ipu-v3/ipu-image-convert.c
+++ b/drivers/gpu/ipu-v3/ipu-image-convert.c
@@ -382,7 +382,7 @@ static int alloc_dma_buf(struct ipu_image_convert_priv *priv,
 {
 	buf->len = PAGE_ALIGN(size);
 	buf->virt = dma_alloc_coherent(priv->ipu->dev, buf->len, &buf->phys,
-				       GFP_DMA | GFP_KERNEL);
+				       GFP_KERNEL);
 	if (!buf->virt) {
 		dev_err(priv->ipu->dev, "failed to alloc dma buffer\n");
 		return -ENOMEM;
-- 
2.17.2



* [PATCH 04/22] drm/sti: Don't use GFP_DMA when calling dma_alloc_wc()
  2022-02-19  0:51 [PATCH 00/22] Don't use kmalloc() with GFP_DMA Baoquan He
                   ` (2 preceding siblings ...)
  2022-02-19  0:52 ` [PATCH 03/22] gpu: ipu-v3: " Baoquan He
@ 2022-02-19  0:52 ` Baoquan He
  2022-02-19  7:08   ` Christoph Hellwig
  2022-02-19  0:52 ` [PATCH 05/22] sound: n64: Don't use GFP_DMA when calling dma_alloc_coherent() Baoquan He
                   ` (18 subsequent siblings)
  22 siblings, 1 reply; 74+ messages in thread
From: Baoquan He @ 2022-02-19  0:52 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, akpm, hch, cl, 42.hyeyoo, penberg, rientjes,
	iamjoonsoo.kim, vbabka, David.Laight, david, herbert, davem,
	linux-crypto, steffen.klassert, netdev, hca, gor, agordeev,
	borntraeger, svens, linux-s390, michael, linux-i2c, wsa

dma_alloc_wc() allocates dma buffer with device's addressing
limitation in mind. It's redundant to specify GFP_DMA when calling
dma_alloc_wc().

[ 42.hyeyoo@gmail.com: Update changelog ]

Signed-off-by: Baoquan He <bhe@redhat.com>
Acked-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
---
 drivers/gpu/drm/sti/sti_cursor.c | 4 ++--
 drivers/gpu/drm/sti/sti_hqvdp.c  | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/sti/sti_cursor.c b/drivers/gpu/drm/sti/sti_cursor.c
index 1d6051b4f6fe..d1123dc09d25 100644
--- a/drivers/gpu/drm/sti/sti_cursor.c
+++ b/drivers/gpu/drm/sti/sti_cursor.c
@@ -235,7 +235,7 @@ static int sti_cursor_atomic_check(struct drm_plane *drm_plane,
 		cursor->pixmap.base = dma_alloc_wc(cursor->dev,
 						   cursor->pixmap.size,
 						   &cursor->pixmap.paddr,
-						   GFP_KERNEL | GFP_DMA);
+						   GFP_KERNEL);
 		if (!cursor->pixmap.base) {
 			DRM_ERROR("Failed to allocate memory for pixmap\n");
 			return -EINVAL;
@@ -375,7 +375,7 @@ struct drm_plane *sti_cursor_create(struct drm_device *drm_dev,
 	/* Allocate clut buffer */
 	size = 0x100 * sizeof(unsigned short);
 	cursor->clut = dma_alloc_wc(dev, size, &cursor->clut_paddr,
-				    GFP_KERNEL | GFP_DMA);
+				    GFP_KERNEL);
 
 	if (!cursor->clut) {
 		DRM_ERROR("Failed to allocate memory for cursor clut\n");
diff --git a/drivers/gpu/drm/sti/sti_hqvdp.c b/drivers/gpu/drm/sti/sti_hqvdp.c
index 3c61ba8b43e0..324e9dc238e4 100644
--- a/drivers/gpu/drm/sti/sti_hqvdp.c
+++ b/drivers/gpu/drm/sti/sti_hqvdp.c
@@ -857,7 +857,7 @@ static void sti_hqvdp_init(struct sti_hqvdp *hqvdp)
 	size = NB_VDP_CMD * sizeof(struct sti_hqvdp_cmd);
 	hqvdp->hqvdp_cmd = dma_alloc_wc(hqvdp->dev, size,
 					&dma_addr,
-					GFP_KERNEL | GFP_DMA);
+					GFP_KERNEL);
 	if (!hqvdp->hqvdp_cmd) {
 		DRM_ERROR("Failed to allocate memory for VDP cmd\n");
 		return;
-- 
2.17.2



* [PATCH 05/22] sound: n64: Don't use GFP_DMA when calling dma_alloc_coherent()
  2022-02-19  0:51 [PATCH 00/22] Don't use kmalloc() with GFP_DMA Baoquan He
                   ` (3 preceding siblings ...)
  2022-02-19  0:52 ` [PATCH 04/22] drm/sti: Don't use GFP_DMA when calling dma_alloc_wc() Baoquan He
@ 2022-02-19  0:52 ` Baoquan He
  2022-02-19  7:08   ` Christoph Hellwig
  2022-02-19  0:52 ` [PATCH 06/22] fbdev: da8xx: " Baoquan He
                   ` (17 subsequent siblings)
  22 siblings, 1 reply; 74+ messages in thread
From: Baoquan He @ 2022-02-19  0:52 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, akpm, hch, cl, 42.hyeyoo, penberg, rientjes,
	iamjoonsoo.kim, vbabka, David.Laight, david, herbert, davem,
	linux-crypto, steffen.klassert, netdev, hca, gor, agordeev,
	borntraeger, svens, linux-s390, michael, linux-i2c, wsa

dma_alloc_coherent() allocates dma buffer with device's addressing
limitation in mind. It's redundant to specify GFP_DMA when calling
dma_alloc_coherent().

[ 42.hyeyoo@gmail.com: Update changelog ]

Signed-off-by: Baoquan He <bhe@redhat.com>
Acked-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
---
 sound/mips/snd-n64.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/sound/mips/snd-n64.c b/sound/mips/snd-n64.c
index 463a6fe589eb..20386a855191 100644
--- a/sound/mips/snd-n64.c
+++ b/sound/mips/snd-n64.c
@@ -305,7 +305,7 @@ static int __init n64audio_probe(struct platform_device *pdev)
 	priv->card = card;
 
 	priv->ring_base = dma_alloc_coherent(card->dev, 32 * 1024, &priv->ring_base_dma,
-					     GFP_DMA|GFP_KERNEL);
+					     GFP_KERNEL);
 	if (!priv->ring_base) {
 		err = -ENOMEM;
 		goto fail_card;
-- 
2.17.2



* [PATCH 06/22] fbdev: da8xx: Don't use GFP_DMA when calling dma_alloc_coherent()
  2022-02-19  0:51 [PATCH 00/22] Don't use kmalloc() with GFP_DMA Baoquan He
                   ` (4 preceding siblings ...)
  2022-02-19  0:52 ` [PATCH 05/22] sound: n64: Don't use GFP_DMA when calling dma_alloc_coherent() Baoquan He
@ 2022-02-19  0:52 ` Baoquan He
  2022-02-19  7:08   ` Christoph Hellwig
  2022-02-19  0:52 ` [PATCH 07/22] fbdev: mx3fb: Don't use GFP_DMA when calling dma_alloc_wc() Baoquan He
                   ` (16 subsequent siblings)
  22 siblings, 1 reply; 74+ messages in thread
From: Baoquan He @ 2022-02-19  0:52 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, akpm, hch, cl, 42.hyeyoo, penberg, rientjes,
	iamjoonsoo.kim, vbabka, David.Laight, david, herbert, davem,
	linux-crypto, steffen.klassert, netdev, hca, gor, agordeev,
	borntraeger, svens, linux-s390, michael, linux-i2c, wsa

dma_alloc_coherent() allocates dma buffer with device's addressing
limitation in mind. It's redundant to specify GFP_DMA when calling
dma_alloc_coherent().

[ 42.hyeyoo@gmail.com: Update changelog ]

Signed-off-by: Baoquan He <bhe@redhat.com>
Acked-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
---
 drivers/video/fbdev/da8xx-fb.c   | 4 ++--
 drivers/video/fbdev/fsl-diu-fb.c | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/video/fbdev/da8xx-fb.c b/drivers/video/fbdev/da8xx-fb.c
index 005ac3c17aa1..7cb7e63117c9 100644
--- a/drivers/video/fbdev/da8xx-fb.c
+++ b/drivers/video/fbdev/da8xx-fb.c
@@ -1426,7 +1426,7 @@ static int fb_probe(struct platform_device *device)
 	par->vram_virt = dmam_alloc_coherent(par->dev,
 					     par->vram_size,
 					     &par->vram_phys,
-					     GFP_KERNEL | GFP_DMA);
+					     GFP_KERNEL);
 	if (!par->vram_virt) {
 		dev_err(&device->dev,
 			"GLCD: kmalloc for frame buffer failed\n");
@@ -1446,7 +1446,7 @@ static int fb_probe(struct platform_device *device)
 	/* allocate palette buffer */
 	par->v_palette_base = dmam_alloc_coherent(par->dev, PALETTE_SIZE,
 						  &par->p_palette_base,
-						  GFP_KERNEL | GFP_DMA);
+						  GFP_KERNEL);
 	if (!par->v_palette_base) {
 		dev_err(&device->dev,
 			"GLCD: kmalloc for palette buffer failed\n");
diff --git a/drivers/video/fbdev/fsl-diu-fb.c b/drivers/video/fbdev/fsl-diu-fb.c
index e332017c6af6..a79fa162a5d1 100644
--- a/drivers/video/fbdev/fsl-diu-fb.c
+++ b/drivers/video/fbdev/fsl-diu-fb.c
@@ -1692,7 +1692,7 @@ static int fsl_diu_probe(struct platform_device *pdev)
 	int ret;
 
 	data = dmam_alloc_coherent(&pdev->dev, sizeof(struct fsl_diu_data),
-				   &dma_addr, GFP_DMA | __GFP_ZERO);
+				   &dma_addr, __GFP_ZERO);
 	if (!data)
 		return -ENOMEM;
 	data->dma_addr = dma_addr;
-- 
2.17.2



* [PATCH 07/22] fbdev: mx3fb: Don't use GFP_DMA when calling dma_alloc_wc()
  2022-02-19  0:51 [PATCH 00/22] Don't use kmalloc() with GFP_DMA Baoquan He
                   ` (5 preceding siblings ...)
  2022-02-19  0:52 ` [PATCH 06/22] fbdev: da8xx: " Baoquan He
@ 2022-02-19  0:52 ` Baoquan He
  2022-02-19  7:08   ` Christoph Hellwig
  2022-02-19  0:52 ` [PATCH 08/22] usb: gadget: lpc32xx_udc: Don't use GFP_DMA when calling dma_alloc_coherent() Baoquan He
                   ` (15 subsequent siblings)
  22 siblings, 1 reply; 74+ messages in thread
From: Baoquan He @ 2022-02-19  0:52 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, akpm, hch, cl, 42.hyeyoo, penberg, rientjes,
	iamjoonsoo.kim, vbabka, David.Laight, david, herbert, davem,
	linux-crypto, steffen.klassert, netdev, hca, gor, agordeev,
	borntraeger, svens, linux-s390, michael, linux-i2c, wsa

dma_alloc_wc() allocates dma buffer with device's addressing
limitation in mind. It's redundant to specify GFP_DMA when calling
dma_alloc_wc().

[ 42.hyeyoo@gmail.com: Update changelog ]

Signed-off-by: Baoquan He <bhe@redhat.com>
Acked-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
---
 drivers/video/fbdev/mx3fb.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/video/fbdev/mx3fb.c b/drivers/video/fbdev/mx3fb.c
index fabb271337ed..dc0b13d9e8b7 100644
--- a/drivers/video/fbdev/mx3fb.c
+++ b/drivers/video/fbdev/mx3fb.c
@@ -1335,7 +1335,7 @@ static int mx3fb_map_video_memory(struct fb_info *fbi, unsigned int mem_len,
 	dma_addr_t addr;
 
 	fbi->screen_base = dma_alloc_wc(fbi->device, mem_len, &addr,
-					GFP_DMA | GFP_KERNEL);
+					GFP_KERNEL);
 
 	if (!fbi->screen_base) {
 		dev_err(fbi->device, "Cannot allocate %u bytes framebuffer memory\n",
-- 
2.17.2



* [PATCH 08/22] usb: gadget: lpc32xx_udc: Don't use GFP_DMA when calling dma_alloc_coherent()
  2022-02-19  0:51 [PATCH 00/22] Don't use kmalloc() with GFP_DMA Baoquan He
                   ` (6 preceding siblings ...)
  2022-02-19  0:52 ` [PATCH 07/22] fbdev: mx3fb: Don't use GFP_DMA when calling dma_alloc_wc() Baoquan He
@ 2022-02-19  0:52 ` Baoquan He
  2022-02-19  7:09   ` Christoph Hellwig
  2022-02-19  0:52 ` [PATCH 09/22] usb: cdns3: " Baoquan He
                   ` (14 subsequent siblings)
  22 siblings, 1 reply; 74+ messages in thread
From: Baoquan He @ 2022-02-19  0:52 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, akpm, hch, cl, 42.hyeyoo, penberg, rientjes,
	iamjoonsoo.kim, vbabka, David.Laight, david, herbert, davem,
	linux-crypto, steffen.klassert, netdev, hca, gor, agordeev,
	borntraeger, svens, linux-s390, michael, linux-i2c, wsa

dma_alloc_coherent() allocates dma buffer with device's addressing
limitation in mind. It's redundant to specify GFP_DMA when calling
dma_alloc_coherent().

[ 42.hyeyoo@gmail.com: Update changelog ]

Signed-off-by: Baoquan He <bhe@redhat.com>
Acked-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
---
 drivers/usb/gadget/udc/lpc32xx_udc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/usb/gadget/udc/lpc32xx_udc.c b/drivers/usb/gadget/udc/lpc32xx_udc.c
index a25d01c89564..bcba5f9bc5a3 100644
--- a/drivers/usb/gadget/udc/lpc32xx_udc.c
+++ b/drivers/usb/gadget/udc/lpc32xx_udc.c
@@ -3080,7 +3080,7 @@ static int lpc32xx_udc_probe(struct platform_device *pdev)
 	/* Allocate memory for the UDCA */
 	udc->udca_v_base = dma_alloc_coherent(&pdev->dev, UDCA_BUFF_SIZE,
 					      &dma_handle,
-					      (GFP_KERNEL | GFP_DMA));
+					      GFP_KERNEL);
 	if (!udc->udca_v_base) {
 		dev_err(udc->dev, "error getting UDCA region\n");
 		retval = -ENOMEM;
-- 
2.17.2



* [PATCH 09/22] usb: cdns3: Don't use GFP_DMA when calling dma_alloc_coherent()
  2022-02-19  0:51 [PATCH 00/22] Don't use kmalloc() with GFP_DMA Baoquan He
                   ` (7 preceding siblings ...)
  2022-02-19  0:52 ` [PATCH 08/22] usb: gadget: lpc32xx_udc: Don't use GFP_DMA when calling dma_alloc_coherent() Baoquan He
@ 2022-02-19  0:52 ` Baoquan He
  2022-02-19  7:09   ` Christoph Hellwig
  2022-02-19  0:52 ` [PATCH 10/22] uio: pruss: " Baoquan He
                   ` (13 subsequent siblings)
  22 siblings, 1 reply; 74+ messages in thread
From: Baoquan He @ 2022-02-19  0:52 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, akpm, hch, cl, 42.hyeyoo, penberg, rientjes,
	iamjoonsoo.kim, vbabka, David.Laight, david, herbert, davem,
	linux-crypto, steffen.klassert, netdev, hca, gor, agordeev,
	borntraeger, svens, linux-s390, michael, linux-i2c, wsa

dma_alloc_coherent() allocates dma buffer with device's addressing
limitation in mind. It's redundant to specify GFP_DMA when calling
dma_alloc_coherent(). Replace it with GFP_KERNEL.

[ 42.hyeyoo@gmail.com: Update changelog ]

Signed-off-by: Baoquan He <bhe@redhat.com>
Acked-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
---
 drivers/usb/cdns3/cdns3-gadget.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/usb/cdns3/cdns3-gadget.c b/drivers/usb/cdns3/cdns3-gadget.c
index f9af7ebe003d..c0937d3d663f 100644
--- a/drivers/usb/cdns3/cdns3-gadget.c
+++ b/drivers/usb/cdns3/cdns3-gadget.c
@@ -3203,7 +3203,7 @@ static int cdns3_gadget_start(struct cdns *cdns)
 
 	/* allocate memory for setup packet buffer */
 	priv_dev->setup_buf = dma_alloc_coherent(priv_dev->sysdev, 8,
-						 &priv_dev->setup_dma, GFP_DMA);
+						 &priv_dev->setup_dma, GFP_KERNEL);
 	if (!priv_dev->setup_buf) {
 		ret = -ENOMEM;
 		goto err2;
-- 
2.17.2



* [PATCH 10/22] uio: pruss: Don't use GFP_DMA when calling dma_alloc_coherent()
  2022-02-19  0:51 [PATCH 00/22] Don't use kmalloc() with GFP_DMA Baoquan He
                   ` (8 preceding siblings ...)
  2022-02-19  0:52 ` [PATCH 09/22] usb: cdns3: " Baoquan He
@ 2022-02-19  0:52 ` Baoquan He
  2022-02-19  7:09   ` Christoph Hellwig
  2022-02-19  0:52 ` [PATCH 11/22] staging: emxx_udc: " Baoquan He
                   ` (12 subsequent siblings)
  22 siblings, 1 reply; 74+ messages in thread
From: Baoquan He @ 2022-02-19  0:52 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, akpm, hch, cl, 42.hyeyoo, penberg, rientjes,
	iamjoonsoo.kim, vbabka, David.Laight, david, herbert, davem,
	linux-crypto, steffen.klassert, netdev, hca, gor, agordeev,
	borntraeger, svens, linux-s390, michael, linux-i2c, wsa

dma_alloc_coherent() allocates dma buffer with device's addressing
limitation in mind. It's redundant to specify GFP_DMA when calling
dma_alloc_coherent().

[ 42.hyeyoo@gmail.com: Update changelog ]

Signed-off-by: Baoquan He <bhe@redhat.com>
Acked-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
---
 drivers/uio/uio_pruss.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/uio/uio_pruss.c b/drivers/uio/uio_pruss.c
index e9096f53b4cc..1de39875d436 100644
--- a/drivers/uio/uio_pruss.c
+++ b/drivers/uio/uio_pruss.c
@@ -168,7 +168,7 @@ static int pruss_probe(struct platform_device *pdev)
 	}
 
 	gdev->ddr_vaddr = dma_alloc_coherent(dev, extram_pool_sz,
-				&(gdev->ddr_paddr), GFP_KERNEL | GFP_DMA);
+				&(gdev->ddr_paddr), GFP_KERNEL);
 	if (!gdev->ddr_vaddr) {
 		dev_err(dev, "Could not allocate external memory\n");
 		ret = -ENOMEM;
-- 
2.17.2



* [PATCH 11/22] staging: emxx_udc: Don't use GFP_DMA when calling dma_alloc_coherent()
  2022-02-19  0:51 [PATCH 00/22] Don't use kmalloc() with GFP_DMA Baoquan He
                   ` (9 preceding siblings ...)
  2022-02-19  0:52 ` [PATCH 10/22] uio: pruss: " Baoquan He
@ 2022-02-19  0:52 ` Baoquan He
  2022-02-19  6:51   ` Wolfram Sang
  2022-02-19  7:09   ` Christoph Hellwig
  2022-02-19  0:52 ` [PATCH 12/22] " Baoquan He
                   ` (11 subsequent siblings)
  22 siblings, 2 replies; 74+ messages in thread
From: Baoquan He @ 2022-02-19  0:52 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, akpm, hch, cl, 42.hyeyoo, penberg, rientjes,
	iamjoonsoo.kim, vbabka, David.Laight, david, herbert, davem,
	linux-crypto, steffen.klassert, netdev, hca, gor, agordeev,
	borntraeger, svens, linux-s390, michael, linux-i2c, wsa

dma_alloc_coherent() allocates dma buffer with device's addressing
limitation in mind. It's redundant to specify GFP_DMA when calling
dma_alloc_coherent().

[ 42.hyeyoo@gmail.com: Update changelog ]

Signed-off-by: Baoquan He <bhe@redhat.com>
Acked-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
---
 drivers/staging/media/imx/imx-media-utils.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/staging/media/imx/imx-media-utils.c b/drivers/staging/media/imx/imx-media-utils.c
index 94bc866ca28c..043281ec2e9d 100644
--- a/drivers/staging/media/imx/imx-media-utils.c
+++ b/drivers/staging/media/imx/imx-media-utils.c
@@ -588,7 +588,7 @@ int imx_media_alloc_dma_buf(struct device *dev,
 
 	buf->len = PAGE_ALIGN(size);
 	buf->virt = dma_alloc_coherent(dev, buf->len, &buf->phys,
-				       GFP_DMA | GFP_KERNEL);
+				       GFP_KERNEL);
 	if (!buf->virt) {
 		dev_err(dev, "%s: failed\n", __func__);
 		return -ENOMEM;
-- 
2.17.2



* [PATCH 12/22] staging: emxx_udc: Don't use GFP_DMA when calling dma_alloc_coherent()
  2022-02-19  0:51 [PATCH 00/22] Don't use kmalloc() with GFP_DMA Baoquan He
                   ` (10 preceding siblings ...)
  2022-02-19  0:52 ` [PATCH 11/22] staging: emxx_udc: " Baoquan He
@ 2022-02-19  0:52 ` Baoquan He
  2022-02-19  7:10   ` Christoph Hellwig
  2022-02-19  0:52 ` [PATCH 13/22] spi: atmel: " Baoquan He
                   ` (10 subsequent siblings)
  22 siblings, 1 reply; 74+ messages in thread
From: Baoquan He @ 2022-02-19  0:52 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, akpm, hch, cl, 42.hyeyoo, penberg, rientjes,
	iamjoonsoo.kim, vbabka, David.Laight, david, herbert, davem,
	linux-crypto, steffen.klassert, netdev, hca, gor, agordeev,
	borntraeger, svens, linux-s390, michael, linux-i2c, wsa

dma_alloc_coherent() allocates dma buffer with device's addressing
limitation in mind. It's redundant to specify GFP_DMA when calling
dma_alloc_coherent().

[ 42.hyeyoo@gmail.com: Update changelog ]

Signed-off-by: Baoquan He <bhe@redhat.com>
Acked-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
---
 drivers/staging/emxx_udc/emxx_udc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/staging/emxx_udc/emxx_udc.c b/drivers/staging/emxx_udc/emxx_udc.c
index b6abd3770e81..673f8de50213 100644
--- a/drivers/staging/emxx_udc/emxx_udc.c
+++ b/drivers/staging/emxx_udc/emxx_udc.c
@@ -2593,7 +2593,7 @@ static int nbu2ss_ep_queue(struct usb_ep *_ep,
 		if (!ep->virt_buf)
 			ep->virt_buf = dma_alloc_coherent(udc->dev, PAGE_SIZE,
 							  &ep->phys_buf,
-							  GFP_ATOMIC | GFP_DMA);
+							  GFP_ATOMIC);
 		if (ep->epnum > 0)  {
 			if (ep->direct == USB_DIR_IN)
 				memcpy(ep->virt_buf, req->req.buf,
-- 
2.17.2



* [PATCH 13/22] spi: atmel: Don't use GFP_DMA when calling dma_alloc_coherent()
  2022-02-19  0:51 [PATCH 00/22] Don't use kmalloc() with GFP_DMA Baoquan He
                   ` (11 preceding siblings ...)
  2022-02-19  0:52 ` [PATCH 12/22] " Baoquan He
@ 2022-02-19  0:52 ` Baoquan He
  2022-02-19  7:10   ` Christoph Hellwig
  2022-02-19  0:52 ` [PATCH 14/22] spi: spi-ti-qspi: " Baoquan He
                   ` (9 subsequent siblings)
  22 siblings, 1 reply; 74+ messages in thread
From: Baoquan He @ 2022-02-19  0:52 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, akpm, hch, cl, 42.hyeyoo, penberg, rientjes,
	iamjoonsoo.kim, vbabka, David.Laight, david, herbert, davem,
	linux-crypto, steffen.klassert, netdev, hca, gor, agordeev,
	borntraeger, svens, linux-s390, michael, linux-i2c, wsa

dma_alloc_coherent() allocates a DMA buffer with the device's addressing
limitations in mind. It's redundant to specify GFP_DMA when calling
dma_alloc_coherent().

[ 42.hyeyoo@gmail.com: Update changelog ]

Signed-off-by: Baoquan He <bhe@redhat.com>
Acked-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
---
 drivers/spi/spi-atmel.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/spi/spi-atmel.c b/drivers/spi/spi-atmel.c
index 9e300a932699..271dacf3b7d2 100644
--- a/drivers/spi/spi-atmel.c
+++ b/drivers/spi/spi-atmel.c
@@ -1516,14 +1516,14 @@ static int atmel_spi_probe(struct platform_device *pdev)
 		as->addr_rx_bbuf = dma_alloc_coherent(&pdev->dev,
 						      SPI_MAX_DMA_XFER,
 						      &as->dma_addr_rx_bbuf,
-						      GFP_KERNEL | GFP_DMA);
+						      GFP_KERNEL);
 		if (!as->addr_rx_bbuf) {
 			as->use_dma = false;
 		} else {
 			as->addr_tx_bbuf = dma_alloc_coherent(&pdev->dev,
 					SPI_MAX_DMA_XFER,
 					&as->dma_addr_tx_bbuf,
-					GFP_KERNEL | GFP_DMA);
+					GFP_KERNEL);
 			if (!as->addr_tx_bbuf) {
 				as->use_dma = false;
 				dma_free_coherent(&pdev->dev, SPI_MAX_DMA_XFER,
-- 
2.17.2


^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH 14/22] spi: spi-ti-qspi: Don't use GFP_DMA when calling dma_alloc_coherent()
  2022-02-19  0:51 [PATCH 00/22] Don't use kmalloc() with GFP_DMA Baoquan He
                   ` (12 preceding siblings ...)
  2022-02-19  0:52 ` [PATCH 13/22] spi: atmel: " Baoquan He
@ 2022-02-19  0:52 ` Baoquan He
  2022-02-19  7:12   ` Christoph Hellwig
  2022-02-19  0:52 ` [PATCH 15/22] usb: cdns3: Don't use GFP_DMA32 when calling dma_pool_alloc() Baoquan He
                   ` (8 subsequent siblings)
  22 siblings, 1 reply; 74+ messages in thread
From: Baoquan He @ 2022-02-19  0:52 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, akpm, hch, cl, 42.hyeyoo, penberg, rientjes,
	iamjoonsoo.kim, vbabka, David.Laight, david, herbert, davem,
	linux-crypto, steffen.klassert, netdev, hca, gor, agordeev,
	borntraeger, svens, linux-s390, michael, linux-i2c, wsa

dma_alloc_coherent() allocates a DMA buffer with the device's addressing
limitations in mind. It's redundant to specify GFP_DMA when calling
dma_alloc_coherent().

[ 42.hyeyoo@gmail.com: Update changelog ]

Signed-off-by: Baoquan He <bhe@redhat.com>
Acked-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
---
 drivers/spi/spi-ti-qspi.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/spi/spi-ti-qspi.c b/drivers/spi/spi-ti-qspi.c
index e06aafe169e0..6c4a76a7a4b3 100644
--- a/drivers/spi/spi-ti-qspi.c
+++ b/drivers/spi/spi-ti-qspi.c
@@ -867,7 +867,7 @@ static int ti_qspi_probe(struct platform_device *pdev)
 	qspi->rx_bb_addr = dma_alloc_coherent(qspi->dev,
 					      QSPI_DMA_BUFFER_SIZE,
 					      &qspi->rx_bb_dma_addr,
-					      GFP_KERNEL | GFP_DMA);
+					      GFP_KERNEL);
 	if (!qspi->rx_bb_addr) {
 		dev_err(qspi->dev,
 			"dma_alloc_coherent failed, using PIO mode\n");
-- 
2.17.2


^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH 15/22] usb: cdns3: Don't use GFP_DMA32 when calling dma_pool_alloc()
  2022-02-19  0:51 [PATCH 00/22] Don't use kmalloc() with GFP_DMA Baoquan He
                   ` (13 preceding siblings ...)
  2022-02-19  0:52 ` [PATCH 14/22] spi: spi-ti-qspi: " Baoquan He
@ 2022-02-19  0:52 ` Baoquan He
  2022-02-19  7:13   ` Christoph Hellwig
  2022-02-19  0:52 ` [PATCH 16/22] usb: udc: lpc32xx: Don't use GFP_DMA " Baoquan He
                   ` (7 subsequent siblings)
  22 siblings, 1 reply; 74+ messages in thread
From: Baoquan He @ 2022-02-19  0:52 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, akpm, hch, cl, 42.hyeyoo, penberg, rientjes,
	iamjoonsoo.kim, vbabka, David.Laight, david, herbert, davem,
	linux-crypto, steffen.klassert, netdev, hca, gor, agordeev,
	borntraeger, svens, linux-s390, michael, linux-i2c, wsa

dma_pool_alloc() uses dma_alloc_coherent() to pre-allocate the DMA buffer,
so it's redundant to specify GFP_DMA32 when calling it.

Signed-off-by: Baoquan He <bhe@redhat.com>
---
 drivers/usb/cdns3/cdns3-gadget.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/usb/cdns3/cdns3-gadget.c b/drivers/usb/cdns3/cdns3-gadget.c
index c0937d3d663f..6afac25ff2c7 100644
--- a/drivers/usb/cdns3/cdns3-gadget.c
+++ b/drivers/usb/cdns3/cdns3-gadget.c
@@ -220,7 +220,7 @@ int cdns3_allocate_trb_pool(struct cdns3_endpoint *priv_ep)
 
 	if (!priv_ep->trb_pool) {
 		priv_ep->trb_pool = dma_pool_alloc(priv_dev->eps_dma_pool,
-						   GFP_DMA32 | GFP_ATOMIC,
+						   GFP_ATOMIC,
 						   &priv_ep->trb_pool_dma);
 
 		if (!priv_ep->trb_pool)
-- 
2.17.2


^ permalink raw reply related	[flat|nested] 74+ messages in thread
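
As a rough illustration of why zone flags add nothing at
dma_pool_alloc() time, here is a userspace model of a block pool,
assuming (as the changelog says) that the pool's backing memory comes
from a coherent allocation. Each block handed out is just an offset into
that backing memory, so its bus address is fixed by where the backing
memory already lives. All sk_/SK_ names, sizes and the bus address are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SK_POOL_BLOCKS 4
#define SK_BLOCK_SIZE  64

/* A pool carved out of one backing buffer obtained up front. */
struct sk_pool {
	unsigned char backing[SK_POOL_BLOCKS * SK_BLOCK_SIZE]; /* stands in for coherent memory */
	uint64_t base_dma;                  /* made-up bus address of backing[0] */
	unsigned char used[SK_POOL_BLOCKS]; /* 1 if the block is handed out */
};

/* Hand out one block and its bus address; NULL when the pool is exhausted. */
static void *sk_pool_alloc(struct sk_pool *pool, uint64_t *handle)
{
	for (size_t i = 0; i < SK_POOL_BLOCKS; i++) {
		if (!pool->used[i]) {
			pool->used[i] = 1;
			*handle = pool->base_dma + i * SK_BLOCK_SIZE;
			return &pool->backing[i * SK_BLOCK_SIZE];
		}
	}
	return NULL;
}

/* Return a block to the pool. */
static void sk_pool_free(struct sk_pool *pool, void *vaddr)
{
	size_t off = (size_t)((unsigned char *)vaddr - pool->backing);

	assert(off % SK_BLOCK_SIZE == 0 && off < sizeof(pool->backing));
	pool->used[off / SK_BLOCK_SIZE] = 0;
}
```

Note that sk_pool_alloc() takes no zone argument at all: in this model
there is nothing left for such a flag to influence.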

* [PATCH 16/22] usb: udc: lpc32xx: Don't use GFP_DMA when calling dma_pool_alloc()
  2022-02-19  0:51 [PATCH 00/22] Don't use kmalloc() with GFP_DMA Baoquan He
                   ` (14 preceding siblings ...)
  2022-02-19  0:52 ` [PATCH 15/22] usb: cdns3: Don't use GFP_DMA32 when calling dma_pool_alloc() Baoquan He
@ 2022-02-19  0:52 ` Baoquan He
  2022-02-19  7:13   ` Christoph Hellwig
  2022-02-19  0:52 ` [PATCH 17/22] net: marvell: prestera: " Baoquan He
                   ` (6 subsequent siblings)
  22 siblings, 1 reply; 74+ messages in thread
From: Baoquan He @ 2022-02-19  0:52 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, akpm, hch, cl, 42.hyeyoo, penberg, rientjes,
	iamjoonsoo.kim, vbabka, David.Laight, david, herbert, davem,
	linux-crypto, steffen.klassert, netdev, hca, gor, agordeev,
	borntraeger, svens, linux-s390, michael, linux-i2c, wsa

dma_pool_alloc() uses dma_alloc_coherent() to pre-allocate the DMA buffer,
so it's redundant to specify GFP_DMA when calling it.

Signed-off-by: Baoquan He <bhe@redhat.com>
---
 drivers/usb/gadget/udc/lpc32xx_udc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/usb/gadget/udc/lpc32xx_udc.c b/drivers/usb/gadget/udc/lpc32xx_udc.c
index bcba5f9bc5a3..d234de1c62b3 100644
--- a/drivers/usb/gadget/udc/lpc32xx_udc.c
+++ b/drivers/usb/gadget/udc/lpc32xx_udc.c
@@ -922,7 +922,7 @@ static struct lpc32xx_usbd_dd_gad *udc_dd_alloc(struct lpc32xx_udc *udc)
 	dma_addr_t			dma;
 	struct lpc32xx_usbd_dd_gad	*dd;
 
-	dd = dma_pool_alloc(udc->dd_cache, GFP_ATOMIC | GFP_DMA, &dma);
+	dd = dma_pool_alloc(udc->dd_cache, GFP_ATOMIC, &dma);
 	if (dd)
 		dd->this_dma = dma;
 
-- 
2.17.2


^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH 17/22] net: marvell: prestera: Don't use GFP_DMA when calling dma_pool_alloc()
  2022-02-19  0:51 [PATCH 00/22] Don't use kmalloc() with GFP_DMA Baoquan He
                   ` (15 preceding siblings ...)
  2022-02-19  0:52 ` [PATCH 16/22] usb: udc: lpc32xx: Don't use GFP_DMA " Baoquan He
@ 2022-02-19  0:52 ` Baoquan He
  2022-02-19  4:54   ` Jakub Kicinski
  2022-02-19  7:13   ` Christoph Hellwig
  2022-02-19  0:52 ` [PATCH 18/22] net: ethernet: mtk-star-emac: Don't use GFP_DMA when calling dmam_alloc_coherent() Baoquan He
                   ` (5 subsequent siblings)
  22 siblings, 2 replies; 74+ messages in thread
From: Baoquan He @ 2022-02-19  0:52 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, akpm, hch, cl, 42.hyeyoo, penberg, rientjes,
	iamjoonsoo.kim, vbabka, David.Laight, david, herbert, davem,
	linux-crypto, steffen.klassert, netdev, hca, gor, agordeev,
	borntraeger, svens, linux-s390, michael, linux-i2c, wsa

dma_pool_alloc() uses dma_alloc_coherent() to pre-allocate the DMA buffer,
so it's redundant to specify GFP_DMA when calling it.

Signed-off-by: Baoquan He <bhe@redhat.com>
---
 drivers/net/ethernet/marvell/prestera/prestera_rxtx.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/marvell/prestera/prestera_rxtx.c b/drivers/net/ethernet/marvell/prestera/prestera_rxtx.c
index e452cdeaf703..9f32dcabefb9 100644
--- a/drivers/net/ethernet/marvell/prestera/prestera_rxtx.c
+++ b/drivers/net/ethernet/marvell/prestera/prestera_rxtx.c
@@ -116,7 +116,7 @@ static int prestera_sdma_buf_init(struct prestera_sdma *sdma,
 	struct prestera_sdma_desc *desc;
 	dma_addr_t dma;
 
-	desc = dma_pool_alloc(sdma->desc_pool, GFP_DMA | GFP_KERNEL, &dma);
+	desc = dma_pool_alloc(sdma->desc_pool, GFP_KERNEL, &dma);
 	if (!desc)
 		return -ENOMEM;
 
-- 
2.17.2


^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH 18/22] net: ethernet: mtk-star-emac: Don't use GFP_DMA when calling dmam_alloc_coherent()
  2022-02-19  0:51 [PATCH 00/22] Don't use kmalloc() with GFP_DMA Baoquan He
                   ` (16 preceding siblings ...)
  2022-02-19  0:52 ` [PATCH 17/22] net: marvell: prestera: " Baoquan He
@ 2022-02-19  0:52 ` Baoquan He
  2022-02-19  7:13   ` Christoph Hellwig
  2022-02-19  0:52 ` [PATCH 19/22] ethernet: rocker: Use dma_alloc_noncoherent() for dma buffer Baoquan He
                   ` (4 subsequent siblings)
  22 siblings, 1 reply; 74+ messages in thread
From: Baoquan He @ 2022-02-19  0:52 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, akpm, hch, cl, 42.hyeyoo, penberg, rientjes,
	iamjoonsoo.kim, vbabka, David.Laight, david, herbert, davem,
	linux-crypto, steffen.klassert, netdev, hca, gor, agordeev,
	borntraeger, svens, linux-s390, michael, linux-i2c, wsa

dmam_alloc_coherent() uses struct dma_devres to manage data, and calls
dma_alloc_attrs() to allocate coherent DMA memory, so it's redundant
to specify GFP_DMA when calling it.

Signed-off-by: Baoquan He <bhe@redhat.com>
---
 drivers/net/ethernet/mediatek/mtk_star_emac.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mediatek/mtk_star_emac.c b/drivers/net/ethernet/mediatek/mtk_star_emac.c
index 89ca7960b225..55b95f51ac75 100644
--- a/drivers/net/ethernet/mediatek/mtk_star_emac.c
+++ b/drivers/net/ethernet/mediatek/mtk_star_emac.c
@@ -1533,7 +1533,7 @@ static int mtk_star_probe(struct platform_device *pdev)
 
 	priv->ring_base = dmam_alloc_coherent(dev, MTK_STAR_DMA_SIZE,
 					      &priv->dma_addr,
-					      GFP_KERNEL | GFP_DMA);
+					      GFP_KERNEL);
 	if (!priv->ring_base)
 		return -ENOMEM;
 
-- 
2.17.2


^ permalink raw reply related	[flat|nested] 74+ messages in thread
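
The "managed" part of dmam_alloc_coherent() can be pictured with a small
userspace model: each allocation is recorded against the device and
released in one sweep later, most recent first. This is only a sketch of
the devres idea under that assumption; the sk_ names are hypothetical
and plain calloc() stands in for the coherent allocator.

```c
#include <assert.h>
#include <stdlib.h>

/* One tracked resource, linked into the owning device's list. */
struct sk_devres {
	void *data;
	struct sk_devres *next;
};

struct sk_device {
	struct sk_devres *res; /* most recently added first */
	int live;              /* number of tracked allocations */
};

/* Allocate memory and record it against the device. */
static void *sk_managed_alloc(struct sk_device *dev, size_t size)
{
	struct sk_devres *dr = malloc(sizeof(*dr));

	if (!dr)
		return NULL;
	dr->data = calloc(1, size);
	if (!dr->data) {
		free(dr);
		return NULL;
	}
	dr->next = dev->res;
	dev->res = dr;
	dev->live++;
	return dr->data;
}

/* Release everything recorded against the device, newest first. */
static void sk_release_all(struct sk_device *dev)
{
	while (dev->res) {
		struct sk_devres *dr = dev->res;

		dev->res = dr->next;
		free(dr->data);
		free(dr);
		dev->live--;
	}
	assert(dev->live == 0);
}
```

A probe function written against this model needs no explicit free on
its error paths, which matches how mtk_star_probe() simply returns
-ENOMEM in the hunk above.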

* [PATCH 19/22] ethernet: rocker: Use dma_alloc_noncoherent() for dma buffer
  2022-02-19  0:51 [PATCH 00/22] Don't use kmalloc() with GFP_DMA Baoquan He
                   ` (17 preceding siblings ...)
  2022-02-19  0:52 ` [PATCH 18/22] net: ethernet: mtk-star-emac: Don't use GFP_DMA when calling dmam_alloc_coherent() Baoquan He
@ 2022-02-19  0:52 ` Baoquan He
  2022-02-19  7:14   ` Christoph Hellwig
  2022-02-19  0:52 ` [PATCH 20/22] HID: intel-ish-hid: " Baoquan He
                   ` (3 subsequent siblings)
  22 siblings, 1 reply; 74+ messages in thread
From: Baoquan He @ 2022-02-19  0:52 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, akpm, hch, cl, 42.hyeyoo, penberg, rientjes,
	iamjoonsoo.kim, vbabka, David.Laight, david, herbert, davem,
	linux-crypto, steffen.klassert, netdev, hca, gor, agordeev,
	borntraeger, svens, linux-s390, michael, linux-i2c, wsa

Use dma_alloc_noncoherent() instead of kzalloc(GFP_DMA) followed by
dma_map_single() to get the DMA buffer.

[ 42.hyeyoo@gmail.com: Use dma_alloc_noncoherent() instead of
  __get_free_pages.

  Fix memory leak. ]

Signed-off-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Cc: davem@davemloft.net
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: netdev@vger.kernel.org
---
 drivers/net/ethernet/rocker/rocker_main.c | 59 +++++++++--------------
 1 file changed, 23 insertions(+), 36 deletions(-)

diff --git a/drivers/net/ethernet/rocker/rocker_main.c b/drivers/net/ethernet/rocker/rocker_main.c
index 3fcea211716c..b23dd9b70d8d 100644
--- a/drivers/net/ethernet/rocker/rocker_main.c
+++ b/drivers/net/ethernet/rocker/rocker_main.c
@@ -193,20 +193,17 @@ static int rocker_dma_test_offset(const struct rocker *rocker,
 	int i;
 	int err;
 
-	alloc = kzalloc(ROCKER_TEST_DMA_BUF_SIZE * 2 + offset,
-			GFP_KERNEL | GFP_DMA);
+	alloc = dma_alloc_noncoherent(&pdev->dev,
+				ROCKER_TEST_DMA_BUF_SIZE * 2 + offset,
+				&dma_handle,
+				DMA_BIDIRECTIONAL,
+				GFP_KERNEL);
 	if (!alloc)
 		return -ENOMEM;
+
 	buf = alloc + offset;
 	expect = buf + ROCKER_TEST_DMA_BUF_SIZE;
 
-	dma_handle = dma_map_single(&pdev->dev, buf, ROCKER_TEST_DMA_BUF_SIZE,
-				    DMA_BIDIRECTIONAL);
-	if (dma_mapping_error(&pdev->dev, dma_handle)) {
-		err = -EIO;
-		goto free_alloc;
-	}
-
 	rocker_write64(rocker, TEST_DMA_ADDR, dma_handle);
 	rocker_write32(rocker, TEST_DMA_SIZE, ROCKER_TEST_DMA_BUF_SIZE);
 
@@ -215,14 +212,14 @@ static int rocker_dma_test_offset(const struct rocker *rocker,
 				  dma_handle, buf, expect,
 				  ROCKER_TEST_DMA_BUF_SIZE);
 	if (err)
-		goto unmap;
+		goto free;
 
 	memset(expect, 0, ROCKER_TEST_DMA_BUF_SIZE);
 	err = rocker_dma_test_one(rocker, wait, ROCKER_TEST_DMA_CTRL_CLEAR,
 				  dma_handle, buf, expect,
 				  ROCKER_TEST_DMA_BUF_SIZE);
 	if (err)
-		goto unmap;
+		goto free;
 
 	prandom_bytes(buf, ROCKER_TEST_DMA_BUF_SIZE);
 	for (i = 0; i < ROCKER_TEST_DMA_BUF_SIZE; i++)
@@ -231,14 +228,11 @@ static int rocker_dma_test_offset(const struct rocker *rocker,
 				  dma_handle, buf, expect,
 				  ROCKER_TEST_DMA_BUF_SIZE);
 	if (err)
-		goto unmap;
-
-unmap:
-	dma_unmap_single(&pdev->dev, dma_handle, ROCKER_TEST_DMA_BUF_SIZE,
-			 DMA_BIDIRECTIONAL);
-free_alloc:
-	kfree(alloc);
+		goto free;
 
+free:
+	dma_free_noncoherent(&pdev->dev, ROCKER_TEST_DMA_BUF_SIZE * 2 + offset,
+			     alloc, dma_handle, DMA_BIDIRECTIONAL);
 	return err;
 }
 
@@ -500,20 +494,13 @@ static int rocker_dma_ring_bufs_alloc(const struct rocker *rocker,
 		dma_addr_t dma_handle;
 		char *buf;
 
-		buf = kzalloc(buf_size, GFP_KERNEL | GFP_DMA);
+		buf = dma_alloc_noncoherent(&pdev->dev, buf_size,
+				&dma_handle, direction, GFP_KERNEL);
 		if (!buf) {
 			err = -ENOMEM;
 			goto rollback;
 		}
 
-		dma_handle = dma_map_single(&pdev->dev, buf, buf_size,
-					    direction);
-		if (dma_mapping_error(&pdev->dev, dma_handle)) {
-			kfree(buf);
-			err = -EIO;
-			goto rollback;
-		}
-
 		desc_info->data = buf;
 		desc_info->data_size = buf_size;
 		dma_unmap_addr_set(desc_info, mapaddr, dma_handle);
@@ -526,11 +513,10 @@ static int rocker_dma_ring_bufs_alloc(const struct rocker *rocker,
 rollback:
 	for (i--; i >= 0; i--) {
 		const struct rocker_desc_info *desc_info = &info->desc_info[i];
-
-		dma_unmap_single(&pdev->dev,
-				 dma_unmap_addr(desc_info, mapaddr),
-				 desc_info->data_size, direction);
-		kfree(desc_info->data);
+		dma_free_noncoherent(&pdev->dev, desc_info->data_size,
+				     desc_info->data,
+				     dma_unmap_addr(desc_info, mapaddr),
+				     direction);
 	}
 	return err;
 }
@@ -548,10 +534,11 @@ static void rocker_dma_ring_bufs_free(const struct rocker *rocker,
 
 		desc->buf_addr = 0;
 		desc->buf_size = 0;
-		dma_unmap_single(&pdev->dev,
-				 dma_unmap_addr(desc_info, mapaddr),
-				 desc_info->data_size, direction);
-		kfree(desc_info->data);
+		dma_free_noncoherent(&pdev->dev,
+				     desc_info->data_size,
+				     desc_info->data,
+				     dma_unmap_addr(desc_info, mapaddr),
+				     direction);
 	}
 }
 
-- 
2.17.2


^ permalink raw reply related	[flat|nested] 74+ messages in thread
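
The rollback loop in rocker_dma_ring_bufs_alloc() frees only the buffers
that were successfully allocated before the failing index. A userspace
sketch of the same "for (i--; i >= 0; i--)" pattern follows; calloc()
stands in for the DMA allocator, and sk_fail_at and the other sk_ names
are hypothetical test scaffolding.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

#define SK_RING_SIZE 8

static int sk_fail_at = -1; /* index at which allocation is forced to fail */
static int sk_outstanding;  /* buffers currently allocated */

static void *sk_buf_alloc(int index, size_t size)
{
	void *p;

	if (index == sk_fail_at)
		return NULL;
	p = calloc(1, size);
	if (p)
		sk_outstanding++;
	return p;
}

static void sk_buf_free(void *p)
{
	assert(p != NULL);
	free(p);
	sk_outstanding--;
}

/* Allocate every ring buffer, or none: undo partial work on failure. */
static int sk_ring_bufs_alloc(void *bufs[SK_RING_SIZE], size_t size)
{
	int i;

	for (i = 0; i < SK_RING_SIZE; i++) {
		bufs[i] = sk_buf_alloc(i, size);
		if (!bufs[i])
			goto rollback;
	}
	return 0;

rollback:
	/* i is the failing index; walk back over the ones that succeeded. */
	for (i--; i >= 0; i--) {
		sk_buf_free(bufs[i]);
		bufs[i] = NULL;
	}
	return -ENOMEM;
}
```

With a single allocation call per buffer there is one failure point per
iteration, so the loop body needs only one matching free call.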

* [PATCH 20/22] HID: intel-ish-hid: Use dma_alloc_noncoherent() for dma buffer
  2022-02-19  0:51 [PATCH 00/22] Don't use kmalloc() with GFP_DMA Baoquan He
                   ` (18 preceding siblings ...)
  2022-02-19  0:52 ` [PATCH 19/22] ethernet: rocker: Use dma_alloc_noncoherent() for dma buffer Baoquan He
@ 2022-02-19  0:52 ` Baoquan He
  2022-02-19  7:14   ` Christoph Hellwig
  2022-02-19  0:52 ` [PATCH 21/22] mmc: wbsd: " Baoquan He
                   ` (2 subsequent siblings)
  22 siblings, 1 reply; 74+ messages in thread
From: Baoquan He @ 2022-02-19  0:52 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, akpm, hch, cl, 42.hyeyoo, penberg, rientjes,
	iamjoonsoo.kim, vbabka, David.Laight, david, herbert, davem,
	linux-crypto, steffen.klassert, netdev, hca, gor, agordeev,
	borntraeger, svens, linux-s390, michael, linux-i2c, wsa

GFP_DMA32 is an illegal flag to pass when calling kmalloc(); please see
the GFP_SLAB_BUG_MASK definition.

Allocating a DMA buffer using kmalloc() is not recommended. Use
dma_alloc_noncoherent() instead. The DMA API will assume the device has
a 32-bit addressing limitation when allocating the buffer.

[ 42.hyeyoo@gmail.com: Use dma_alloc_noncoherent() instead of
  __get_free_pages ]

Signed-off-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: christian.koenig@amd.com
Cc: linux-input@vger.kernel.org
---
 drivers/hid/intel-ish-hid/ishtp-fw-loader.c | 23 +++++++--------------
 1 file changed, 8 insertions(+), 15 deletions(-)

diff --git a/drivers/hid/intel-ish-hid/ishtp-fw-loader.c b/drivers/hid/intel-ish-hid/ishtp-fw-loader.c
index e24988586710..3be1e3329962 100644
--- a/drivers/hid/intel-ish-hid/ishtp-fw-loader.c
+++ b/drivers/hid/intel-ish-hid/ishtp-fw-loader.c
@@ -661,21 +661,15 @@ static int ish_fw_xfer_direct_dma(struct ishtp_cl_data *client_data,
 	 */
 	payload_max_size &= ~(L1_CACHE_BYTES - 1);
 
-	dma_buf = kmalloc(payload_max_size, GFP_KERNEL | GFP_DMA32);
+	dma_buf = dma_alloc_noncoherent(devc, get_order(payload_max_size),
+				        &dma_buf_phy, DMA_TO_DEVICE,
+					GFP_KERNEL);
 	if (!dma_buf) {
+		dev_err(cl_data_to_dev(client_data), "DMA alloc failed\n");
 		client_data->flag_retry = true;
 		return -ENOMEM;
 	}
 
-	dma_buf_phy = dma_map_single(devc, dma_buf, payload_max_size,
-				     DMA_TO_DEVICE);
-	if (dma_mapping_error(devc, dma_buf_phy)) {
-		dev_err(cl_data_to_dev(client_data), "DMA map failed\n");
-		client_data->flag_retry = true;
-		rv = -ENOMEM;
-		goto end_err_dma_buf_release;
-	}
-
 	ldr_xfer_dma_frag.fragment.hdr.command = LOADER_CMD_XFER_FRAGMENT;
 	ldr_xfer_dma_frag.fragment.xfer_mode = LOADER_XFER_MODE_DIRECT_DMA;
 	ldr_xfer_dma_frag.ddr_phys_addr = (u64)dma_buf_phy;
@@ -725,15 +719,14 @@ static int ish_fw_xfer_direct_dma(struct ishtp_cl_data *client_data,
 		fragment_offset += fragment_size;
 	}
 
-	dma_unmap_single(devc, dma_buf_phy, payload_max_size, DMA_TO_DEVICE);
-	kfree(dma_buf);
+	dma_free_noncoherent(devc, get_order(payload_max_size), dma_buf,
+			     dma_buf_phy, DMA_TO_DEVICE);
 	return 0;
 
 end_err_resp_buf_release:
 	/* Free ISH buffer if not done already, in error case */
-	dma_unmap_single(devc, dma_buf_phy, payload_max_size, DMA_TO_DEVICE);
-end_err_dma_buf_release:
-	kfree(dma_buf);
+	dma_free_noncoherent(devc, get_order(payload_max_size), dma_buf,
+			     dma_buf_phy, DMA_TO_DEVICE);
 	return rv;
 }
 
-- 
2.17.2


^ permalink raw reply related	[flat|nested] 74+ messages in thread
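
One detail in the hunks above deserves a second look:
dma_alloc_noncoherent() and dma_free_noncoherent() take their size
argument in bytes, while get_order() returns a page order, so passing
get_order(payload_max_size) looks like it would request far fewer bytes
than intended. The userspace sketch below models get_order() to show the
gap. SK_PAGE_SIZE is an assumed 4 KiB page and the sk_ helpers are
hypothetical, not the kernel's implementation.

```c
#include <assert.h>
#include <stddef.h>

#define SK_PAGE_SIZE 4096u /* assumed page size for this sketch */

/* Smallest order such that (SK_PAGE_SIZE << order) >= size. */
static unsigned int sk_get_order(size_t size)
{
	size_t pages = (size + SK_PAGE_SIZE - 1) / SK_PAGE_SIZE;
	unsigned int order = 0;

	assert(size > 0);
	while (((size_t)1 << order) < pages)
		order++;
	return order;
}

/* Bytes covered by an allocation of the given order. */
static size_t sk_order_to_bytes(unsigned int order)
{
	return (size_t)SK_PAGE_SIZE << order;
}
```

For a 16 KiB payload sk_get_order() yields 2, so using that value as a
byte count would ask for 2 bytes rather than 16384.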

* [PATCH 21/22] mmc: wbsd: Use dma_alloc_noncoherent() for dma buffer
  2022-02-19  0:51 [PATCH 00/22] Don't use kmalloc() with GFP_DMA Baoquan He
                   ` (19 preceding siblings ...)
  2022-02-19  0:52 ` [PATCH 20/22] HID: intel-ish-hid: " Baoquan He
@ 2022-02-19  0:52 ` Baoquan He
  2022-02-19  7:17   ` Christoph Hellwig
  2022-02-19  0:52 ` [PATCH 22/22] mtd: rawnand: Use dma_alloc_noncoherent() for dma buffer Baoquan He
  2022-02-21 13:57 ` [PATCH 00/22] Don't use kmalloc() with GFP_DMA Heiko Carstens
  22 siblings, 1 reply; 74+ messages in thread
From: Baoquan He @ 2022-02-19  0:52 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, akpm, hch, cl, 42.hyeyoo, penberg, rientjes,
	iamjoonsoo.kim, vbabka, David.Laight, david, herbert, davem,
	linux-crypto, steffen.klassert, netdev, hca, gor, agordeev,
	borntraeger, svens, linux-s390, michael, linux-i2c, wsa

Use dma_alloc_noncoherent() instead of kmalloc(GFP_DMA) followed by
dma_map_single() to get the DMA buffer.

[ 42.hyeyoo@gmail.com: Only keep label free.

  Remove unnecessary alignment checks; alignment is guaranteed by the
  DMA API. Just use GFP_KERNEL as it's called in sleepable context.

  Specify its DMA capability using dma_set_mask_and_coherent() ]

Signed-off-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Cc: Pierre Ossman <pierre@ossman.eu>
Cc: Ulf Hansson <ulf.hansson@linaro.org>
Cc: linux-mmc@vger.kernel.org
---
 drivers/mmc/host/wbsd.c | 45 +++++++++--------------------------------
 1 file changed, 9 insertions(+), 36 deletions(-)

diff --git a/drivers/mmc/host/wbsd.c b/drivers/mmc/host/wbsd.c
index 67ecd342fe5f..50b0197583c7 100644
--- a/drivers/mmc/host/wbsd.c
+++ b/drivers/mmc/host/wbsd.c
@@ -1366,55 +1366,28 @@ static void wbsd_request_dma(struct wbsd_host *host, int dma)
 	if (request_dma(dma, DRIVER_NAME))
 		goto err;
 
+	dma_set_mask_and_coherent(mmc_dev(host->mmc), DMA_BIT_MASK(24));
+
 	/*
 	 * We need to allocate a special buffer in
 	 * order for ISA to be able to DMA to it.
 	 */
-	host->dma_buffer = kmalloc(WBSD_DMA_SIZE,
-		GFP_NOIO | GFP_DMA | __GFP_RETRY_MAYFAIL | __GFP_NOWARN);
+	host->dma_buffer = dma_alloc_noncoherent(mmc_dev(host->mmc),
+					WBSD_DMA_SIZE, &host->dma_addr,
+					DMA_BIDIRECTIONAL,
+					GFP_KERNEL);
 	if (!host->dma_buffer)
 		goto free;
 
-	/*
-	 * Translate the address to a physical address.
-	 */
-	host->dma_addr = dma_map_single(mmc_dev(host->mmc), host->dma_buffer,
-		WBSD_DMA_SIZE, DMA_BIDIRECTIONAL);
-	if (dma_mapping_error(mmc_dev(host->mmc), host->dma_addr))
-		goto kfree;
-
-	/*
-	 * ISA DMA must be aligned on a 64k basis.
-	 */
-	if ((host->dma_addr & 0xffff) != 0)
-		goto unmap;
-	/*
-	 * ISA cannot access memory above 16 MB.
-	 */
-	else if (host->dma_addr >= 0x1000000)
-		goto unmap;
-
 	host->dma = dma;
 
 	return;
 
-unmap:
-	/*
-	 * If we've gotten here then there is some kind of alignment bug
-	 */
-	BUG_ON(1);
-
-	dma_unmap_single(mmc_dev(host->mmc), host->dma_addr,
-		WBSD_DMA_SIZE, DMA_BIDIRECTIONAL);
-	host->dma_addr = 0;
-
-kfree:
-	kfree(host->dma_buffer);
-	host->dma_buffer = NULL;
-
 free:
+	dma_free_noncoherent(mmc_dev(host->mmc), WBSD_DMA_SIZE, host->dma_buffer,
+			     host->dma_addr, DMA_BIDIRECTIONAL);
+	host->dma_buffer = NULL;
 	free_dma(dma);
-
 err:
 	pr_warn(DRIVER_NAME ": Unable to allocate DMA %d - falling back on FIFO\n",
 		dma);
-- 
2.17.2


^ permalink raw reply related	[flat|nested] 74+ messages in thread
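
For reference, the two ISA constraints that the removed checks enforced
(64 KiB alignment and an address below 16 MiB) can be written as a
single predicate. The sketch below is a userspace restatement of those
removed lines, plus the value of the 24-bit mask the patch now sets. The
sk_ names are hypothetical, and the sketch makes no claim about what the
DMA API itself guarantees.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Same shape as the kernel's DMA_BIT_MASK(): the low n bits set. */
#define SK_DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))

/* The 24-bit mask covers exactly the first 16 MiB of bus addresses. */
static uint64_t sk_isa_dma_mask(void)
{
	uint64_t mask = SK_DMA_BIT_MASK(24);

	assert(mask == 0xffffffULL);
	return mask;
}

/* Mirrors the checks removed from wbsd_request_dma(). */
static bool sk_isa_dma_addr_ok(uint64_t dma_addr)
{
	if ((dma_addr & 0xffff) != 0)
		return false; /* not aligned on a 64 KiB boundary */
	if (dma_addr >= 0x1000000)
		return false; /* at or above 16 MiB */
	return true;
}
```

The mask expresses only the 16 MiB limit; the 64 KiB alignment is a
separate condition in this predicate.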

* [PATCH 22/22] mtd: rawnand: Use dma_alloc_noncoherent() for dma buffer
  2022-02-19  0:51 [PATCH 00/22] Don't use kmalloc() with GFP_DMA Baoquan He
                   ` (20 preceding siblings ...)
  2022-02-19  0:52 ` [PATCH 21/22] mmc: wbsd: " Baoquan He
@ 2022-02-19  0:52 ` Baoquan He
  2022-02-19  7:19   ` Christoph Hellwig
  2022-02-21 13:57 ` [PATCH 00/22] Don't use kmalloc() with GFP_DMA Heiko Carstens
  22 siblings, 1 reply; 74+ messages in thread
From: Baoquan He @ 2022-02-19  0:52 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, akpm, hch, cl, 42.hyeyoo, penberg, rientjes,
	iamjoonsoo.kim, vbabka, David.Laight, david, herbert, davem,
	linux-crypto, steffen.klassert, netdev, hca, gor, agordeev,
	borntraeger, svens, linux-s390, michael, linux-i2c, wsa

Use dma_alloc_noncoherent() instead of directly allocating the buffer
with kmalloc(GFP_DMA). The DMA API will try to allocate the buffer
according to the device's addressing limitations.

[ 42.hyeyoo@gmail.com: Use dma_alloc_noncoherent() instead of
  __get_free_page() and update changelog.

  As it does not allocate high order buffers, allocate buffer
  when needed and free after DMA. ]

Signed-off-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Cc: Miquel Raynal <miquel.raynal@bootlin.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Vignesh Raghavendra <vigneshr@ti.com>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: christian.koenig@amd.com
Cc: linux-mtd@lists.infradead.org

---
 drivers/mtd/nand/raw/marvell_nand.c | 55 ++++++++++++++++++-----------
 1 file changed, 34 insertions(+), 21 deletions(-)

diff --git a/drivers/mtd/nand/raw/marvell_nand.c b/drivers/mtd/nand/raw/marvell_nand.c
index 2455a581fd70..c0b64a7e50af 100644
--- a/drivers/mtd/nand/raw/marvell_nand.c
+++ b/drivers/mtd/nand/raw/marvell_nand.c
@@ -860,26 +860,45 @@ static int marvell_nfc_xfer_data_dma(struct marvell_nfc *nfc,
 	struct dma_async_tx_descriptor *tx;
 	struct scatterlist sg;
 	dma_cookie_t cookie;
-	int ret;
+	dma_addr_t dma_handle;
+	int ret = 0;
 
 	marvell_nfc_enable_dma(nfc);
+
+	/*
+	 * DMA must act on length multiple of 32 and this length may be
+	 * bigger than the destination buffer. Use this buffer instead
+	 * for DMA transfers and then copy the desired amount of data to
+	 * the provided buffer.
+	 */
+	nfc->dma_buf = dma_alloc_noncoherent(nfc->dev, MAX_CHUNK_SIZE,
+						&dma_handle,
+						direction,
+						GFP_ATOMIC);
+	if (!nfc->dma_buf) {
+		ret = -ENOMEM;
+		goto out;
+	}
+
+
 	/* Prepare the DMA transfer */
-	sg_init_one(&sg, nfc->dma_buf, dma_len);
-	dma_map_sg(nfc->dma_chan->device->dev, &sg, 1, direction);
-	tx = dmaengine_prep_slave_sg(nfc->dma_chan, &sg, 1,
+	tx = dmaengine_prep_slave_single(nfc->dma_chan, dma_handle, dma_len,
 				     direction == DMA_FROM_DEVICE ?
 				     DMA_DEV_TO_MEM : DMA_MEM_TO_DEV,
 				     DMA_PREP_INTERRUPT);
 	if (!tx) {
 		dev_err(nfc->dev, "Could not prepare DMA S/G list\n");
-		return -ENXIO;
+		ret = -ENXIO;
+		goto free;
 	}
 
 	/* Do the task and wait for it to finish */
 	cookie = dmaengine_submit(tx);
 	ret = dma_submit_error(cookie);
-	if (ret)
-		return -EIO;
+	if (ret) {
+		ret = -EIO;
+		goto free;
+	}
 
 	dma_async_issue_pending(nfc->dma_chan);
 	ret = marvell_nfc_wait_cmdd(nfc->selected_chip);
@@ -889,10 +908,16 @@ static int marvell_nfc_xfer_data_dma(struct marvell_nfc *nfc,
 		dev_err(nfc->dev, "Timeout waiting for DMA (status: %d)\n",
 			dmaengine_tx_status(nfc->dma_chan, cookie, NULL));
 		dmaengine_terminate_all(nfc->dma_chan);
-		return -ETIMEDOUT;
+		ret = -ETIMEDOUT;
+		goto free;
 	}
 
-	return 0;
+free:
+	dma_free_noncoherent(nfc->dev, MAX_CHUNK_SIZE, nfc->dma_buf,
+			     dma_handle, direction);
+
+out:
+	return ret;
 }
 
 static int marvell_nfc_xfer_data_in_pio(struct marvell_nfc *nfc, u8 *in,
@@ -2814,18 +2839,6 @@ static int marvell_nfc_init_dma(struct marvell_nfc *nfc)
 		goto release_channel;
 	}
 
-	/*
-	 * DMA must act on length multiple of 32 and this length may be
-	 * bigger than the destination buffer. Use this buffer instead
-	 * for DMA transfers and then copy the desired amount of data to
-	 * the provided buffer.
-	 */
-	nfc->dma_buf = kmalloc(MAX_CHUNK_SIZE, GFP_KERNEL | GFP_DMA);
-	if (!nfc->dma_buf) {
-		ret = -ENOMEM;
-		goto release_channel;
-	}
-
 	nfc->use_dma = true;
 
 	return 0;
-- 
2.17.2


^ permalink raw reply related	[flat|nested] 74+ messages in thread
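
The comment moved by this patch describes a bounce-buffer scheme: the
DMA length is rounded up to a multiple of 32, the transfer targets a
scratch buffer large enough for that length, and only the requested
bytes are copied out. A userspace sketch of that arithmetic follows.
SK_MAX_CHUNK and the sk_ helpers are hypothetical, and a loop stands in
for the device filling the buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SK_DMA_ALIGN 32u
#define SK_MAX_CHUNK 256u /* hypothetical; stands in for MAX_CHUNK_SIZE */

/* Round len up to the next multiple of SK_DMA_ALIGN. */
static size_t sk_dma_len(size_t len)
{
	return (len + SK_DMA_ALIGN - 1) & ~(size_t)(SK_DMA_ALIGN - 1);
}

/* Pretend device: fills the bounce buffer with dma_len bytes. */
static void sk_fake_dma_read(unsigned char *bounce, size_t dma_len)
{
	for (size_t i = 0; i < dma_len; i++)
		bounce[i] = (unsigned char)i;
}

/* Read len bytes into dst via the bounce buffer; -1 if it cannot fit. */
static int sk_read_chunk(unsigned char *dst, size_t len)
{
	unsigned char bounce[SK_MAX_CHUNK];
	size_t dma_len = sk_dma_len(len);

	assert(dma_len >= len);
	if (dma_len > sizeof(bounce))
		return -1;
	sk_fake_dma_read(bounce, dma_len);
	memcpy(dst, bounce, len); /* copy only what the caller asked for */
	return 0;
}
```

Because the copy uses len rather than dma_len, the destination is never
written past the caller's requested size even though the transfer
itself is longer.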

* Re: [PATCH 17/22] net: marvell: prestera: Don't use GFP_DMA when calling dma_pool_alloc()
  2022-02-19  0:52 ` [PATCH 17/22] net: marvell: prestera: " Baoquan He
@ 2022-02-19  4:54   ` Jakub Kicinski
  2022-02-20  2:06     ` Baoquan He
  2022-02-19  7:13   ` Christoph Hellwig
  1 sibling, 1 reply; 74+ messages in thread
From: Jakub Kicinski @ 2022-02-19  4:54 UTC (permalink / raw)
  To: Baoquan He
  Cc: linux-kernel, linux-mm, akpm, hch, cl, 42.hyeyoo, penberg,
	rientjes, iamjoonsoo.kim, vbabka, David.Laight, david, herbert,
	davem, linux-crypto, steffen.klassert, netdev, hca, gor,
	agordeev, borntraeger, svens, linux-s390, michael, linux-i2c,
	wsa

On Sat, 19 Feb 2022 08:52:16 +0800 Baoquan He wrote:
> dma_pool_alloc() uses dma_alloc_coherent() to pre-allocate the DMA buffer,
> so it's redundant to specify GFP_DMA when calling it.
> 
> Signed-off-by: Baoquan He <bhe@redhat.com>

This and the other two netdev patches in the series are perfectly
reasonable cleanups even outside of the larger context.

Please repost those separately and make sure you CC the maintainers
of the drivers.

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 11/22] staging: emxx_udc: Don't use GFP_DMA when calling dma_alloc_coherent()
  2022-02-19  0:52 ` [PATCH 11/22] staging: emxx_udc: " Baoquan He
@ 2022-02-19  6:51   ` Wolfram Sang
  2022-02-20  1:55     ` Baoquan He
  2022-02-19  7:09   ` Christoph Hellwig
  1 sibling, 1 reply; 74+ messages in thread
From: Wolfram Sang @ 2022-02-19  6:51 UTC (permalink / raw)
  To: Baoquan He
  Cc: linux-kernel, linux-mm, akpm, hch, cl, 42.hyeyoo, penberg,
	rientjes, iamjoonsoo.kim, vbabka, David.Laight, david, herbert,
	davem, linux-crypto, steffen.klassert, netdev, hca, gor,
	agordeev, borntraeger, svens, linux-s390, michael, linux-i2c

[-- Attachment #1: Type: text/plain, Size: 111 bytes --]


> --- a/drivers/staging/media/imx/imx-media-utils.c

$subject says 'emxx_udc' instead of 'imx: media-utils'.


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 01/22] parisc: pci-dma: remove stale code and comment
  2022-02-19  0:52 ` [PATCH 01/22] parisc: pci-dma: remove stale code and comment Baoquan He
@ 2022-02-19  7:07   ` Christoph Hellwig
  0 siblings, 0 replies; 74+ messages in thread
From: Christoph Hellwig @ 2022-02-19  7:07 UTC (permalink / raw)
  To: Baoquan He
  Cc: linux-kernel, linux-mm, akpm, hch, cl, 42.hyeyoo, penberg,
	rientjes, iamjoonsoo.kim, vbabka, David.Laight, david, herbert,
	davem, linux-crypto, steffen.klassert, netdev, hca, gor,
	agordeev, borntraeger, svens, linux-s390, michael, linux-i2c,
	wsa

On Sat, Feb 19, 2022 at 08:52:00AM +0800, Baoquan He wrote:
> The gfp assignment has been commented out in ancient times, combined with
> the code comment, obviously it's not needed since then. Let's remove the
> whole ifdeffery block so that GFP_DMA searching won't point to this.

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 02/22] net: moxa: Don't use GFP_DMA when calling dma_alloc_coherent()
  2022-02-19  0:52 ` [PATCH 02/22] net: moxa: Don't use GFP_DMA when calling dma_alloc_coherent() Baoquan He
@ 2022-02-19  7:07   ` Christoph Hellwig
  0 siblings, 0 replies; 74+ messages in thread
From: Christoph Hellwig @ 2022-02-19  7:07 UTC (permalink / raw)
  To: Baoquan He
  Cc: linux-kernel, linux-mm, akpm, hch, cl, 42.hyeyoo, penberg,
	rientjes, iamjoonsoo.kim, vbabka, David.Laight, david, herbert,
	davem, linux-crypto, steffen.klassert, netdev, hca, gor,
	agordeev, borntraeger, svens, linux-s390, michael, linux-i2c,
	wsa

On Sat, Feb 19, 2022 at 08:52:01AM +0800, Baoquan He wrote:
> From: Hyeonggon Yoo <42.hyeyoo@gmail.com>
> 
> dma_alloc_coherent() allocates a DMA buffer with the device's addressing
> limitations in mind. It's redundant to specify GFP_DMA when calling
> dma_alloc_coherent().

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 03/22] gpu: ipu-v3: Don't use GFP_DMA when calling dma_alloc_coherent()
  2022-02-19  0:52 ` [PATCH 03/22] gpu: ipu-v3: " Baoquan He
@ 2022-02-19  7:07   ` Christoph Hellwig
  0 siblings, 0 replies; 74+ messages in thread
From: Christoph Hellwig @ 2022-02-19  7:07 UTC (permalink / raw)
  To: Baoquan He
  Cc: linux-kernel, linux-mm, akpm, hch, cl, 42.hyeyoo, penberg,
	rientjes, iamjoonsoo.kim, vbabka, David.Laight, david, herbert,
	davem, linux-crypto, steffen.klassert, netdev, hca, gor,
	agordeev, borntraeger, svens, linux-s390, michael, linux-i2c,
	wsa

On Sat, Feb 19, 2022 at 08:52:02AM +0800, Baoquan He wrote:
> dma_alloc_coherent() allocates dma buffer with device's addressing
> limitation in mind. It's redundant to specify GFP_DMA when calling
> dma_alloc_coherent().

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>

* Re: [PATCH 04/22] drm/sti: Don't use GFP_DMA when calling dma_alloc_wc()
  2022-02-19  0:52 ` [PATCH 04/22] drm/sti: Don't use GFP_DMA when calling dma_alloc_wc() Baoquan He
@ 2022-02-19  7:08   ` Christoph Hellwig
  0 siblings, 0 replies; 74+ messages in thread
From: Christoph Hellwig @ 2022-02-19  7:08 UTC (permalink / raw)
  To: Baoquan He
  Cc: linux-kernel, linux-mm, akpm, hch, cl, 42.hyeyoo, penberg,
	rientjes, iamjoonsoo.kim, vbabka, David.Laight, david, herbert,
	davem, linux-crypto, steffen.klassert, netdev, hca, gor,
	agordeev, borntraeger, svens, linux-s390, michael, linux-i2c,
	wsa

On Sat, Feb 19, 2022 at 08:52:03AM +0800, Baoquan He wrote:
> dma_alloc_wc() allocates dma buffer with device's addressing
> limitation in mind. It's redundant to specify GFP_DMA when calling
> dma_alloc_wc().

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>

* Re: [PATCH 05/22] sound: n64: Don't use GFP_DMA when calling dma_alloc_coherent()
  2022-02-19  0:52 ` [PATCH 05/22] sound: n64: Don't use GFP_DMA when calling dma_alloc_coherent() Baoquan He
@ 2022-02-19  7:08   ` Christoph Hellwig
  0 siblings, 0 replies; 74+ messages in thread
From: Christoph Hellwig @ 2022-02-19  7:08 UTC (permalink / raw)
  To: Baoquan He
  Cc: linux-kernel, linux-mm, akpm, hch, cl, 42.hyeyoo, penberg,
	rientjes, iamjoonsoo.kim, vbabka, David.Laight, david, herbert,
	davem, linux-crypto, steffen.klassert, netdev, hca, gor,
	agordeev, borntraeger, svens, linux-s390, michael, linux-i2c,
	wsa

On Sat, Feb 19, 2022 at 08:52:04AM +0800, Baoquan He wrote:
> dma_alloc_coherent() allocates dma buffer with device's addressing
> limitation in mind. It's redundant to specify GFP_DMA when calling
> dma_alloc_coherent().
> 
> [ 42.hyeyoo@gmail.com: Update changelog ]
> 
> Signed-off-by: Baoquan He <bhe@redhat.com>
> Acked-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>

* Re: [PATCH 06/22] fbdev: da8xx: Don't use GFP_DMA when calling dma_alloc_coherent()
  2022-02-19  0:52 ` [PATCH 06/22] fbdev: da8xx: " Baoquan He
@ 2022-02-19  7:08   ` Christoph Hellwig
  0 siblings, 0 replies; 74+ messages in thread
From: Christoph Hellwig @ 2022-02-19  7:08 UTC (permalink / raw)
  To: Baoquan He
  Cc: linux-kernel, linux-mm, akpm, hch, cl, 42.hyeyoo, penberg,
	rientjes, iamjoonsoo.kim, vbabka, David.Laight, david, herbert,
	davem, linux-crypto, steffen.klassert, netdev, hca, gor,
	agordeev, borntraeger, svens, linux-s390, michael, linux-i2c,
	wsa

On Sat, Feb 19, 2022 at 08:52:05AM +0800, Baoquan He wrote:
> dma_alloc_coherent() allocates dma buffer with device's addressing
> limitation in mind. It's redundant to specify GFP_DMA when calling
> dma_alloc_coherent().

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>

* Re: [PATCH 07/22] fbdev: mx3fb: Don't use GFP_DMA when calling dma_alloc_wc()
  2022-02-19  0:52 ` [PATCH 07/22] fbdev: mx3fb: Don't use GFP_DMA when calling dma_alloc_wc() Baoquan He
@ 2022-02-19  7:08   ` Christoph Hellwig
  0 siblings, 0 replies; 74+ messages in thread
From: Christoph Hellwig @ 2022-02-19  7:08 UTC (permalink / raw)
  To: Baoquan He
  Cc: linux-kernel, linux-mm, akpm, hch, cl, 42.hyeyoo, penberg,
	rientjes, iamjoonsoo.kim, vbabka, David.Laight, david, herbert,
	davem, linux-crypto, steffen.klassert, netdev, hca, gor,
	agordeev, borntraeger, svens, linux-s390, michael, linux-i2c,
	wsa

On Sat, Feb 19, 2022 at 08:52:06AM +0800, Baoquan He wrote:
> dma_alloc_coherent() allocates dma buffer with device's addressing
> limitation in mind. It's redundant to specify GFP_DMA when calling
> dma_alloc_coherent().

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>

* Re: [PATCH 08/22] usb: gadget: lpc32xx_udc: Don't use GFP_DMA when calling dma_alloc_coherent()
  2022-02-19  0:52 ` [PATCH 08/22] usb: gadget: lpc32xx_udc: Don't use GFP_DMA when calling dma_alloc_coherent() Baoquan He
@ 2022-02-19  7:09   ` Christoph Hellwig
  0 siblings, 0 replies; 74+ messages in thread
From: Christoph Hellwig @ 2022-02-19  7:09 UTC (permalink / raw)
  To: Baoquan He
  Cc: linux-kernel, linux-mm, akpm, hch, cl, 42.hyeyoo, penberg,
	rientjes, iamjoonsoo.kim, vbabka, David.Laight, david, herbert,
	davem, linux-crypto, steffen.klassert, netdev, hca, gor,
	agordeev, borntraeger, svens, linux-s390, michael, linux-i2c,
	wsa

On Sat, Feb 19, 2022 at 08:52:07AM +0800, Baoquan He wrote:
> dma_alloc_coherent() allocates dma buffer with device's addressing
> limitation in mind. It's redundant to specify GFP_DMA when calling
> dma_alloc_coherent().

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>

* Re: [PATCH 09/22] usb: cdns3: Don't use GFP_DMA when calling dma_alloc_coherent()
  2022-02-19  0:52 ` [PATCH 09/22] usb: cdns3: " Baoquan He
@ 2022-02-19  7:09   ` Christoph Hellwig
  0 siblings, 0 replies; 74+ messages in thread
From: Christoph Hellwig @ 2022-02-19  7:09 UTC (permalink / raw)
  To: Baoquan He
  Cc: linux-kernel, linux-mm, akpm, hch, cl, 42.hyeyoo, penberg,
	rientjes, iamjoonsoo.kim, vbabka, David.Laight, david, herbert,
	davem, linux-crypto, steffen.klassert, netdev, hca, gor,
	agordeev, borntraeger, svens, linux-s390, michael, linux-i2c,
	wsa

On Sat, Feb 19, 2022 at 08:52:08AM +0800, Baoquan He wrote:
> dma_alloc_coherent() allocates dma buffer with device's addressing
> limitation in mind. It's redundant to specify GFP_DMA when calling
> dma_alloc_coherent(). Replace it with GFP_KERNEL.

Please avoid the overly long line. The rest looks good.

* Re: [PATCH 10/22] uio: pruss: Don't use GFP_DMA when calling dma_alloc_coherent()
  2022-02-19  0:52 ` [PATCH 10/22] uio: pruss: " Baoquan He
@ 2022-02-19  7:09   ` Christoph Hellwig
  0 siblings, 0 replies; 74+ messages in thread
From: Christoph Hellwig @ 2022-02-19  7:09 UTC (permalink / raw)
  To: Baoquan He
  Cc: linux-kernel, linux-mm, akpm, hch, cl, 42.hyeyoo, penberg,
	rientjes, iamjoonsoo.kim, vbabka, David.Laight, david, herbert,
	davem, linux-crypto, steffen.klassert, netdev, hca, gor,
	agordeev, borntraeger, svens, linux-s390, michael, linux-i2c,
	wsa

On Sat, Feb 19, 2022 at 08:52:09AM +0800, Baoquan He wrote:
> dma_alloc_coherent() allocates dma buffer with device's addressing
> limitation in mind. It's redundant to specify GFP_DMA when calling
> dma_alloc_coherent().

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

* Re: [PATCH 11/22] staging: emxx_udc: Don't use GFP_DMA when calling dma_alloc_coherent()
  2022-02-19  0:52 ` [PATCH 11/22] staging: emxx_udc: " Baoquan He
  2022-02-19  6:51   ` Wolfram Sang
@ 2022-02-19  7:09   ` Christoph Hellwig
  1 sibling, 0 replies; 74+ messages in thread
From: Christoph Hellwig @ 2022-02-19  7:09 UTC (permalink / raw)
  To: Baoquan He
  Cc: linux-kernel, linux-mm, akpm, hch, cl, 42.hyeyoo, penberg,
	rientjes, iamjoonsoo.kim, vbabka, David.Laight, david, herbert,
	davem, linux-crypto, steffen.klassert, netdev, hca, gor,
	agordeev, borntraeger, svens, linux-s390, michael, linux-i2c,
	wsa

On Sat, Feb 19, 2022 at 08:52:10AM +0800, Baoquan He wrote:
> dma_alloc_coherent() allocates dma buffer with device's addressing
> limitation in mind. It's redundant to specify GFP_DMA when calling
> dma_alloc_coherent().

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>

* Re: [PATCH 12/22] staging: emxx_udc: Don't use GFP_DMA when calling dma_alloc_coherent()
  2022-02-19  0:52 ` [PATCH 12/22] " Baoquan He
@ 2022-02-19  7:10   ` Christoph Hellwig
  0 siblings, 0 replies; 74+ messages in thread
From: Christoph Hellwig @ 2022-02-19  7:10 UTC (permalink / raw)
  To: Baoquan He
  Cc: linux-kernel, linux-mm, akpm, hch, cl, 42.hyeyoo, penberg,
	rientjes, iamjoonsoo.kim, vbabka, David.Laight, david, herbert,
	davem, linux-crypto, steffen.klassert, netdev, hca, gor,
	agordeev, borntraeger, svens, linux-s390, michael, linux-i2c,
	wsa

On Sat, Feb 19, 2022 at 08:52:11AM +0800, Baoquan He wrote:
> dma_alloc_coherent() allocates dma buffer with device's addressing
> limitation in mind. It's redundant to specify GFP_DMA when calling
> dma_alloc_coherent().

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>

* Re: [PATCH 13/22] spi: atmel: Don't use GFP_DMA when calling dma_alloc_coherent()
  2022-02-19  0:52 ` [PATCH 13/22] spi: atmel: " Baoquan He
@ 2022-02-19  7:10   ` Christoph Hellwig
  0 siblings, 0 replies; 74+ messages in thread
From: Christoph Hellwig @ 2022-02-19  7:10 UTC (permalink / raw)
  To: Baoquan He
  Cc: linux-kernel, linux-mm, akpm, hch, cl, 42.hyeyoo, penberg,
	rientjes, iamjoonsoo.kim, vbabka, David.Laight, david, herbert,
	davem, linux-crypto, steffen.klassert, netdev, hca, gor,
	agordeev, borntraeger, svens, linux-s390, michael, linux-i2c,
	wsa

On Sat, Feb 19, 2022 at 08:52:12AM +0800, Baoquan He wrote:
> dma_alloc_coherent() allocates dma buffer with device's addressing
> limitation in mind. It's redundant to specify GFP_DMA when calling
> dma_alloc_coherent().

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>

* Re: [PATCH 14/22] spi: spi-ti-qspi: Don't use GFP_DMA when calling dma_alloc_coherent()
  2022-02-19  0:52 ` [PATCH 14/22] spi: spi-ti-qspi: " Baoquan He
@ 2022-02-19  7:12   ` Christoph Hellwig
  0 siblings, 0 replies; 74+ messages in thread
From: Christoph Hellwig @ 2022-02-19  7:12 UTC (permalink / raw)
  To: Baoquan He
  Cc: linux-kernel, linux-mm, akpm, hch, cl, 42.hyeyoo, penberg,
	rientjes, iamjoonsoo.kim, vbabka, David.Laight, david, herbert,
	davem, linux-crypto, steffen.klassert, netdev, hca, gor,
	agordeev, borntraeger, svens, linux-s390, michael, linux-i2c,
	wsa

On Sat, Feb 19, 2022 at 08:52:13AM +0800, Baoquan He wrote:
> dma_alloc_coherent() allocates dma buffer with device's addressing
> limitation in mind. It's redundant to specify GFP_DMA when calling
> dma_alloc_coherent().

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

* Re: [PATCH 15/22] usb: cdns3: Don't use GFP_DMA32 when calling dma_pool_alloc()
  2022-02-19  0:52 ` [PATCH 15/22] usb: cdns3: Don't use GFP_DMA32 when calling dma_pool_alloc() Baoquan He
@ 2022-02-19  7:13   ` Christoph Hellwig
  0 siblings, 0 replies; 74+ messages in thread
From: Christoph Hellwig @ 2022-02-19  7:13 UTC (permalink / raw)
  To: Baoquan He
  Cc: linux-kernel, linux-mm, akpm, hch, cl, 42.hyeyoo, penberg,
	rientjes, iamjoonsoo.kim, vbabka, David.Laight, david, herbert,
	davem, linux-crypto, steffen.klassert, netdev, hca, gor,
	agordeev, borntraeger, svens, linux-s390, michael, linux-i2c,
	wsa

On Sat, Feb 19, 2022 at 08:52:14AM +0800, Baoquan He wrote:
> dma_pool_alloc() uses dma_alloc_coherent() to pre-allocate DMA buffer,
> so it's redundant to specify GFP_DMA32 when calling.

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

* Re: [PATCH 16/22] usb: udc: lpc32xx: Don't use GFP_DMA when calling dma_pool_alloc()
  2022-02-19  0:52 ` [PATCH 16/22] usb: udc: lpc32xx: Don't use GFP_DMA " Baoquan He
@ 2022-02-19  7:13   ` Christoph Hellwig
  0 siblings, 0 replies; 74+ messages in thread
From: Christoph Hellwig @ 2022-02-19  7:13 UTC (permalink / raw)
  To: Baoquan He
  Cc: linux-kernel, linux-mm, akpm, hch, cl, 42.hyeyoo, penberg,
	rientjes, iamjoonsoo.kim, vbabka, David.Laight, david, herbert,
	davem, linux-crypto, steffen.klassert, netdev, hca, gor,
	agordeev, borntraeger, svens, linux-s390, michael, linux-i2c,
	wsa

On Sat, Feb 19, 2022 at 08:52:15AM +0800, Baoquan He wrote:
> dma_pool_alloc() uses dma_alloc_coherent() to pre-allocate DMA buffer,
> so it's redundant to specify GFP_DMA when calling.

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

* Re: [PATCH 17/22] net: marvell: prestera: Don't use GFP_DMA when calling dma_pool_alloc()
  2022-02-19  0:52 ` [PATCH 17/22] net: marvell: prestera: " Baoquan He
  2022-02-19  4:54   ` Jakub Kicinski
@ 2022-02-19  7:13   ` Christoph Hellwig
  1 sibling, 0 replies; 74+ messages in thread
From: Christoph Hellwig @ 2022-02-19  7:13 UTC (permalink / raw)
  To: Baoquan He
  Cc: linux-kernel, linux-mm, akpm, hch, cl, 42.hyeyoo, penberg,
	rientjes, iamjoonsoo.kim, vbabka, David.Laight, david, herbert,
	davem, linux-crypto, steffen.klassert, netdev, hca, gor,
	agordeev, borntraeger, svens, linux-s390, michael, linux-i2c,
	wsa

On Sat, Feb 19, 2022 at 08:52:16AM +0800, Baoquan He wrote:
> dma_pool_alloc() uses dma_alloc_coherent() to pre-allocate DMA buffer,
> so it's redundant to specify GFP_DMA when calling.

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>

* Re: [PATCH 18/22] net: ethernet: mtk-star-emac: Don't use GFP_DMA when calling dmam_alloc_coherent()
  2022-02-19  0:52 ` [PATCH 18/22] net: ethernet: mtk-star-emac: Don't use GFP_DMA when calling dmam_alloc_coherent() Baoquan He
@ 2022-02-19  7:13   ` Christoph Hellwig
  0 siblings, 0 replies; 74+ messages in thread
From: Christoph Hellwig @ 2022-02-19  7:13 UTC (permalink / raw)
  To: Baoquan He
  Cc: linux-kernel, linux-mm, akpm, hch, cl, 42.hyeyoo, penberg,
	rientjes, iamjoonsoo.kim, vbabka, David.Laight, david, herbert,
	davem, linux-crypto, steffen.klassert, netdev, hca, gor,
	agordeev, borntraeger, svens, linux-s390, michael, linux-i2c,
	wsa

On Sat, Feb 19, 2022 at 08:52:17AM +0800, Baoquan He wrote:
> dmam_alloc_coherent() uses struct dma_devres to manage data, and calls
> dma_alloc_attrs() to allocate coherent DMA memory, so it's redundant
> to specify GFP_DMA when calling.

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

* Re: [PATCH 19/22] ethernet: rocker: Use dma_alloc_noncoherent() for dma buffer
  2022-02-19  0:52 ` [PATCH 19/22] ethernet: rocker: Use dma_alloc_noncoherent() for dma buffer Baoquan He
@ 2022-02-19  7:14   ` Christoph Hellwig
  0 siblings, 0 replies; 74+ messages in thread
From: Christoph Hellwig @ 2022-02-19  7:14 UTC (permalink / raw)
  To: Baoquan He
  Cc: linux-kernel, linux-mm, akpm, hch, cl, 42.hyeyoo, penberg,
	rientjes, iamjoonsoo.kim, vbabka, David.Laight, david, herbert,
	davem, linux-crypto, steffen.klassert, netdev, hca, gor,
	agordeev, borntraeger, svens, linux-s390, michael, linux-i2c,
	wsa

On Sat, Feb 19, 2022 at 08:52:18AM +0800, Baoquan He wrote:
> Use dma_alloc_noncoherent() instead to get the DMA buffer.
> 
> [ 42.hyeyoo@gmail.com: Use dma_alloc_noncoherent() instead of
>   __get_free_pages.

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

* Re: [PATCH 20/22] HID: intel-ish-hid: Use dma_alloc_noncoherent() for dma buffer
  2022-02-19  0:52 ` [PATCH 20/22] HID: intel-ish-hid: " Baoquan He
@ 2022-02-19  7:14   ` Christoph Hellwig
  0 siblings, 0 replies; 74+ messages in thread
From: Christoph Hellwig @ 2022-02-19  7:14 UTC (permalink / raw)
  To: Baoquan He
  Cc: linux-kernel, linux-mm, akpm, hch, cl, 42.hyeyoo, penberg,
	rientjes, iamjoonsoo.kim, vbabka, David.Laight, david, herbert,
	davem, linux-crypto, steffen.klassert, netdev, hca, gor,
	agordeev, borntraeger, svens, linux-s390, michael, linux-i2c,
	wsa

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

* Re: [PATCH 21/22] mmc: wbsd: Use dma_alloc_noncoherent() for dma buffer
  2022-02-19  0:52 ` [PATCH 21/22] mmc: wbsd: " Baoquan He
@ 2022-02-19  7:17   ` Christoph Hellwig
  2022-02-20  8:40     ` Baoquan He
  0 siblings, 1 reply; 74+ messages in thread
From: Christoph Hellwig @ 2022-02-19  7:17 UTC (permalink / raw)
  To: Baoquan He
  Cc: linux-kernel, linux-mm, akpm, hch, cl, 42.hyeyoo, penberg,
	rientjes, iamjoonsoo.kim, vbabka, David.Laight, david, herbert,
	davem, linux-crypto, steffen.klassert, netdev, hca, gor,
	agordeev, borntraeger, svens, linux-s390, michael, linux-i2c,
	wsa

On Sat, Feb 19, 2022 at 08:52:20AM +0800, Baoquan He wrote:
>  	if (request_dma(dma, DRIVER_NAME))
>  		goto err;
>  
> +	dma_set_mask_and_coherent(mmc_dev(host->mmc), DMA_BIT_MASK(24));

This also sets the streaming mask, but the driver doesn't seem to make
use of that.  Please document it in the commit log.

Also, setting masks smaller than 32 bits can fail, so this should have
error handling.
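
To make the two points above concrete, here is a minimal userspace sketch
(not kernel code). It assumes a simplified model: struct fake_dev, the
fake_*() functions and the platform_min_mask field are hypothetical
stand-ins, and only the DMA_BIT_MASK() macro mirrors the kernel's
definition. It shows that one combined call updates both the streaming and
the coherent mask, and that a narrow mask can be rejected, so the return
value has to be checked.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Same shape as the kernel macro: a mask with the low n bits set. */
#define DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))

/* Hypothetical stand-in for the two masks a device carries. */
struct fake_dev {
	uint64_t streaming_mask;    /* used by dma_map_*() style mappings */
	uint64_t coherent_mask;     /* used by dma_alloc_*() style allocations */
	uint64_t platform_min_mask; /* assumption: narrowest mask the platform can serve */
};

/*
 * Model of a combined setter: updates both masks, or fails with -EIO
 * and leaves the device untouched if the mask is too narrow.
 */
static int fake_set_mask_and_coherent(struct fake_dev *dev, uint64_t mask)
{
	assert(dev != NULL);

	if (mask < dev->platform_min_mask)
		return -EIO;
	dev->streaming_mask = mask;
	dev->coherent_mask = mask;
	return 0;
}

/* Probe-style caller: propagate the error instead of ignoring it. */
static int fake_probe(struct fake_dev *dev)
{
	int ret = fake_set_mask_and_coherent(dev, DMA_BIT_MASK(24));

	if (ret)
		return ret; /* a real driver would unwind here, e.g. "goto err" */
	return 0;
}
```

In this model DMA_BIT_MASK(24) covers the low 16 MiB, which is the ISA-style
limit the wbsd patch asks for.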

* Re: [PATCH 22/22] mtd: rawnand: Use dma_alloc_noncoherent() for dma buffer
  2022-02-19  0:52 ` [PATCH 22/22] mtd: rawnand: Use dma_alloc_noncoherent() for dma buffer Baoquan He
@ 2022-02-19  7:19   ` Christoph Hellwig
  2022-02-19 11:18     ` Hyeonggon Yoo
  0 siblings, 1 reply; 74+ messages in thread
From: Christoph Hellwig @ 2022-02-19  7:19 UTC (permalink / raw)
  To: Baoquan He
  Cc: linux-kernel, linux-mm, akpm, hch, cl, 42.hyeyoo, penberg,
	rientjes, iamjoonsoo.kim, vbabka, David.Laight, david, herbert,
	davem, linux-crypto, steffen.klassert, netdev, hca, gor,
	agordeev, borntraeger, svens, linux-s390, michael, linux-i2c,
	wsa

On Sat, Feb 19, 2022 at 08:52:21AM +0800, Baoquan He wrote:
> Use dma_alloc_noncoherent() instead of directly allocating buffer
> from kmalloc with GFP_DMA. DMA API will try to allocate buffer
> depending on devices addressing limitation.

I think it would be better to still allocate the buffer at allocation
time and then just transfer ownership using dma_sync_single* in the I/O
path to avoid the GFP_ATOMIC allocation.

* Re: [PATCH 22/22] mtd: rawnand: Use dma_alloc_noncoherent() for dma buffer
  2022-02-19  7:19   ` Christoph Hellwig
@ 2022-02-19 11:18     ` Hyeonggon Yoo
  2022-02-22  8:46       ` Christoph Hellwig
  0 siblings, 1 reply; 74+ messages in thread
From: Hyeonggon Yoo @ 2022-02-19 11:18 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Baoquan He, linux-kernel, linux-mm, akpm, cl, penberg, rientjes,
	iamjoonsoo.kim, vbabka, David.Laight, david, herbert, davem,
	linux-crypto, steffen.klassert, netdev, hca, gor, agordeev,
	borntraeger, svens, linux-s390, michael, linux-i2c, wsa

On Sat, Feb 19, 2022 at 08:19:00AM +0100, Christoph Hellwig wrote:
> On Sat, Feb 19, 2022 at 08:52:21AM +0800, Baoquan He wrote:
> > Use dma_alloc_noncoherent() instead of directly allocating buffer
> > from kmalloc with GFP_DMA. DMA API will try to allocate buffer
> > depending on devices addressing limitation.
> 
> I think it would be better to still allocate the buffer at allocation
> time and then just transfer ownership using dma_sync_single* in the I/O
> path to avoid the GFP_ATOMIC allocation.

This driver allocates the buffer at the initialization step and maps the buffer
for DMA_TO_DEVICE and DMA_FROM_DEVICE when processing I/O.

But after converting this driver to use dma_alloc_noncoherent(), remapping a
dma_alloc_noncoherent()-ed buffer is strange, so I just made it allocate
the buffer in the I/O path.

At this point I thought we need an API that allocates based on an
address bit mask (like dma_alloc_noncoherent()) but does not map the buffer
to a DMA address. __get_free_pages/kmalloc(GFP_DMA) has been so confusing.

Hmm, for this specific case, what about allocating two buffers
for DMA_TO_DEVICE and DMA_FROM_DEVICE at initialization time?

Thanks,
Hyeonggon


* Re: [PATCH 11/22] staging: emxx_udc: Don't use GFP_DMA when calling dma_alloc_coherent()
  2022-02-19  6:51   ` Wolfram Sang
@ 2022-02-20  1:55     ` Baoquan He
  0 siblings, 0 replies; 74+ messages in thread
From: Baoquan He @ 2022-02-20  1:55 UTC (permalink / raw)
  To: Wolfram Sang, linux-kernel, linux-mm, akpm, hch, cl, 42.hyeyoo,
	penberg, rientjes, iamjoonsoo.kim, vbabka, David.Laight, david,
	herbert, davem, linux-crypto, steffen.klassert, netdev, hca, gor,
	agordeev, borntraeger, svens, linux-s390, michael, linux-i2c

On 02/19/22 at 07:51am, Wolfram Sang wrote:
> 
> > --- a/drivers/staging/media/imx/imx-media-utils.c
> 
> $subject says 'emxx_udc' instead of 'imx: media-utils'.

Ah, good catch. It was probably wrongly copied from patch 12; will fix
it, thanks.


* Re: [PATCH 17/22] net: marvell: prestera: Don't use GFP_DMA when calling dma_pool_alloc()
  2022-02-19  4:54   ` Jakub Kicinski
@ 2022-02-20  2:06     ` Baoquan He
  0 siblings, 0 replies; 74+ messages in thread
From: Baoquan He @ 2022-02-20  2:06 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: linux-kernel, linux-mm, akpm, hch, cl, 42.hyeyoo, penberg,
	rientjes, iamjoonsoo.kim, vbabka, David.Laight, david, herbert,
	davem, linux-crypto, steffen.klassert, netdev, hca, gor,
	agordeev, borntraeger, svens, linux-s390, michael, linux-i2c,
	wsa

On 02/18/22 at 08:54pm, Jakub Kicinski wrote:
> On Sat, 19 Feb 2022 08:52:16 +0800 Baoquan He wrote:
> > dma_pool_alloc() uses dma_alloc_coherent() to pre-allocate DMA buffer,
> > so it's redundant to specify GFP_DMA when calling.
> > 
> > Signed-off-by: Baoquan He <bhe@redhat.com>
> 
> This and the other two netdev patches in the series are perfectly
> reasonable cleanups even outside of the larger context.
> 
> Please repost those separately and make sure you CC the maintainers
> of the drivers.

Thanks for reviewing. I am not familiar with the netdev and network patch
posting rules. There are 4 patches altogether related to netdev, as below;
I will repost them to the relevant netdev mailing list and maintainers.

[PATCH 19/22] ethernet: rocker: Use dma_alloc_noncoherent() for dma buffer
[PATCH 18/22] net: ethernet: mtk-star-emac: Don't use GFP_DMA when calling dmam_alloc_coherent()
[PATCH 17/22] net: marvell: prestera: Don't use GFP_DMA when calling dma_pool_alloc()
[PATCH 02/22] net: moxa: Don't use GFP_DMA when calling dma_alloc_coherent()


* Re: [PATCH 21/22] mmc: wbsd: Use dma_alloc_noncoherent() for dma buffer
  2022-02-19  7:17   ` Christoph Hellwig
@ 2022-02-20  8:40     ` Baoquan He
  2022-02-22  8:45       ` Christoph Hellwig
  0 siblings, 1 reply; 74+ messages in thread
From: Baoquan He @ 2022-02-20  8:40 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-kernel, linux-mm, akpm, cl, 42.hyeyoo, penberg, rientjes,
	iamjoonsoo.kim, vbabka, David.Laight, david, herbert, davem,
	linux-crypto, steffen.klassert, netdev, hca, gor, agordeev,
	borntraeger, svens, linux-s390, michael, linux-i2c, wsa

On 02/19/22 at 08:17am, Christoph Hellwig wrote:
> On Sat, Feb 19, 2022 at 08:52:20AM +0800, Baoquan He wrote:
> >  	if (request_dma(dma, DRIVER_NAME))
> >  		goto err;
> >  
> > +	dma_set_mask_and_coherent(mmc_dev(host->mmc), DMA_BIT_MASK(24));
> 
> This also sets the streaming mask, but the driver doesn't seem to make
> use of that.  Please document it in the commit log.

Thanks for reviewing. I will change it to dma_set_mask(), and describe
this change in patch log.

> 
> Also setting smaller than 32 bit masks can fail, so this should have
> error handling.

OK, will check and add error handling.


* Re: [PATCH 00/22] Don't use kmalloc() with GFP_DMA
  2022-02-19  0:51 [PATCH 00/22] Don't use kmalloc() with GFP_DMA Baoquan He
                   ` (21 preceding siblings ...)
  2022-02-19  0:52 ` [PATCH 22/22] mtd: rawnand: Use dma_alloc_noncoherent() for dma buffer Baoquan He
@ 2022-02-21 13:57 ` Heiko Carstens
  2022-02-22  8:44   ` Christoph Hellwig
  22 siblings, 1 reply; 74+ messages in thread
From: Heiko Carstens @ 2022-02-21 13:57 UTC (permalink / raw)
  To: Baoquan He
  Cc: linux-kernel, linux-mm, akpm, hch, cl, 42.hyeyoo, penberg,
	rientjes, iamjoonsoo.kim, vbabka, David.Laight, david, herbert,
	davem, linux-crypto, steffen.klassert, netdev, gor, agordeev,
	borntraeger, svens, linux-s390, michael, linux-i2c, wsa,
	Halil Pasic, Vineeth Vijayan

On Sat, Feb 19, 2022 at 08:51:59AM +0800, Baoquan He wrote:
> Let's replace it with other ways. This is the first step towards
> removing dma-kmalloc support in kernel (Means that if everything
> is going well, we can't use kmalloc(GFP_DMA) to allocate buffer in the
> future).
...
> 
> Next, plan to investigate how we should handle places as below. We
> firstly need figure out whether they really need buffer from ZONE_DMA.
> If yes, how to change them with other ways. This need help from
> maintainers, experts from sub-components and code contributors or anyone
> knowing them well. E.g s390 and crypto, we need guidance and help.
> 
> 1) Kmalloc(GFP_DMA) in s390 platform, under arch/s390 and drivers/s390;

So, s390 partially requires GFP_DMA allocations for memory areas which
are required by the hardware to be below 2GB. There is not necessarily
a device associated when this is required. E.g. some legacy "diagnose"
calls require buffers to be below 2GB.

How should something like this be handled? I'd guess that the
dma_alloc API is not the right thing to use in such cases. Of course
we could say, let's waste memory and use full pages instead, however
I'm not sure this is a good idea.

s390 drivers could probably be converted to the dma_alloc API, even though
that would cause quite some code churn.

> For this first patch series, thanks to Hyeonggon for helping
> reviewing and great suggestions on patch improving. We will work
> together to continue the next steps of work.
> 
> Any comment, thought, or suggestion is welcome and appreciated,
> including but not limited to:
> 1) whether we should remove dma-kmalloc support in kernel();

The question is: what would this buy us? As stated above I'd assume
this comes with quite some code churn, so there should be a good
reason to do this.

From this cover letter I only get that there was a problem with kdump
on x86, and this has been fixed. So why this extra effort?

>     3) Drop support for allocating DMA memory from slab allocator
>     (as Christoph Hellwig said) and convert them to use DMA32
>     and see what happens

Can you please clarify what "convert to DMA32" means? I would assume
this does _not_ mean that passing GFP_DMA32 to slab allocator would
work then?

btw. there are actually two kmalloc allocations which pass GFP_DMA32;
I guess this is broken(?):

drivers/hid/intel-ish-hid/ishtp-fw-loader.c:    dma_buf = kmalloc(payload_max_size, GFP_KERNEL | GFP_DMA32);
drivers/media/test-drivers/vivid/vivid-osd.c:   dev->video_vbase = kzalloc(dev->video_buffer_size, GFP_KERNEL | GFP_DMA32);

* Re: [PATCH 00/22] Don't use kmalloc() with GFP_DMA
  2022-02-21 13:57 ` [PATCH 00/22] Don't use kmalloc() with GFP_DMA Heiko Carstens
@ 2022-02-22  8:44   ` Christoph Hellwig
  2022-02-22 13:12     ` Baoquan He
  2022-02-23 19:18     ` Heiko Carstens
  0 siblings, 2 replies; 74+ messages in thread
From: Christoph Hellwig @ 2022-02-22  8:44 UTC (permalink / raw)
  To: Heiko Carstens
  Cc: Baoquan He, linux-kernel, linux-mm, akpm, hch, cl, 42.hyeyoo,
	penberg, rientjes, iamjoonsoo.kim, vbabka, David.Laight, david,
	herbert, davem, linux-crypto, steffen.klassert, netdev, gor,
	agordeev, borntraeger, svens, linux-s390, michael, linux-i2c,
	wsa, Halil Pasic, Vineeth Vijayan

On Mon, Feb 21, 2022 at 02:57:34PM +0100, Heiko Carstens wrote:
> > 1) Kmalloc(GFP_DMA) in s390 platform, under arch/s390 and drivers/s390;
> 
> So, s390 partially requires GFP_DMA allocations for memory areas which
> are required by the hardware to be below 2GB. There is not necessarily
> a device associated when this is required. E.g. some legacy "diagnose"
> calls require buffers to be below 2GB.
> 
> How should something like this be handled? I'd guess that the
> dma_alloc API is not the right thing to use in such cases. Of course
> we could say, let's waste memory and use full pages instead, however
> I'm not sure this is a good idea.

Yeah, I don't think the DMA API is the right thing for that.  This
is one of the very rare cases where a raw allocation makes sense.

That being said being able to drop kmalloc support for GFP_DMA would
be really useful. How much memory would we waste if switching to the
page allocator?
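
As a rough way to frame that question, the per-allocation overhead can be
estimated with simple arithmetic. The sketch below is userspace C, assumes
4 KiB pages and power-of-two page orders, and uses hypothetical helper names
(order_for, page_alloc_waste). It says nothing about how many such
allocations s390 actually makes, which is what the real answer depends on.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SHIFT 12                       /* assumption: 4 KiB pages */
#define PAGE_SIZE  ((size_t)1 << PAGE_SHIFT)

/* Smallest order such that (PAGE_SIZE << order) >= size, for size > 0. */
static unsigned int order_for(size_t size)
{
	unsigned int order = 0;

	assert(size > 0);
	while ((PAGE_SIZE << order) < size)
		order++;
	return order;
}

/* Bytes left unused when a buffer of 'size' bytes is given whole pages. */
static size_t page_alloc_waste(size_t size)
{
	return (PAGE_SIZE << order_for(size)) - size;
}
```

Under these assumptions a 64-byte buffer occupies a full 4096-byte page and
leaves 4032 bytes unused, whereas a slab cache would pack many such objects
into one page.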

> s390 drivers could probably be converted to the dma_alloc API, even though
> that would cause quite some code churn.

I think that would be a very good thing to have.

> > For this first patch series, thanks to Hyeonggon for helping
> > reviewing and great suggestions on patch improving. We will work
> > together to continue the next steps of work.
> > 
> > Any comment, thought, or suggestion is welcome and appreciated,
> > including but not limited to:
> > 1) whether we should remove dma-kmalloc support in kernel();
> 
> The question is: what would this buy us? As stated above I'd assume
> this comes with quite some code churn, so there should be a good
> reason to do this.

There are two steps here.  One is to remove GFP_DMA support from
kmalloc, which would help to clean up the slab allocator(s) very nicely,
as at that point it can stop being zone aware entirely.

The long term goal is to remove ZONE_DMA entirely at least for
architectures that only use the small 16MB ISA-style one.  It can
then be replaced with for example a CMA area and fall into a movable
zone.  I'd have to prototype this first and see how it applies to the
s390 case.  It might not be worth it and maybe we should replace
ZONE_DMA and ZONE_DMA32 with a ZONE_LIMITED for those use cases as
the amount covered tends to not be totally out of line with what we
built the zone infrastructure for.

> From this cover letter I only get that there was a problem with kdump
> on x86, and this has been fixed. So why this extra effort?
> 
> >     3) Drop support for allocating DMA memory from slab allocator
> >     (as Christoph Hellwig said) and convert them to use DMA32
> >     and see what happens
> 
> Can you please clarify what "convert to DMA32" means? I would assume
> this does _not_ mean that passing GFP_DMA32 to slab allocator would
> work then?

I'm really not sure what this means.

> 
> btw. there are actually two kmalloc allocations which pass GFP_DMA32;
> I guess this is broken(?):
> 
> drivers/hid/intel-ish-hid/ishtp-fw-loader.c:    dma_buf = kmalloc(payload_max_size, GFP_KERNEL | GFP_DMA32);
> drivers/media/test-drivers/vivid/vivid-osd.c:   dev->video_vbase = kzalloc(dev->video_buffer_size, GFP_KERNEL | GFP_DMA32);

Yes, this is completely broken.

* Re: [PATCH 21/22] mmc: wbsd: Use dma_alloc_noncoherent() for dma buffer
  2022-02-20  8:40     ` Baoquan He
@ 2022-02-22  8:45       ` Christoph Hellwig
  2022-02-22  9:14         ` Baoquan He
  0 siblings, 1 reply; 74+ messages in thread
From: Christoph Hellwig @ 2022-02-22  8:45 UTC (permalink / raw)
  To: Baoquan He
  Cc: Christoph Hellwig, linux-kernel, linux-mm, akpm, cl, 42.hyeyoo,
	penberg, rientjes, iamjoonsoo.kim, vbabka, David.Laight, david,
	herbert, davem, linux-crypto, steffen.klassert, netdev, hca, gor,
	agordeev, borntraeger, svens, linux-s390, michael, linux-i2c,
	wsa

On Sun, Feb 20, 2022 at 04:40:44PM +0800, Baoquan He wrote:
> On 02/19/22 at 08:17am, Christoph Hellwig wrote:
> > On Sat, Feb 19, 2022 at 08:52:20AM +0800, Baoquan He wrote:
> > >  	if (request_dma(dma, DRIVER_NAME))
> > >  		goto err;
> > >  
> > > +	dma_set_mask_and_coherent(mmc_dev(host->mmc), DMA_BIT_MASK(24));
> > 
> > This also sets the streaming mask, but the driver doesn't seem to make
> > use of that.  Please document it in the commit log.
> 
> Thanks for reviewing. I will change it to dma_set_mask(), and describe
> this change in patch log.

No, if you change it, it should be dma_set_coherent_mask only as it is
not using streaming mappings.  I suspect dma_set_mask_and_coherent is
the right thing if the driver ever wants to use streaming mapping,
it would just need to be documented in the commit message.
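
The difference between the two setters can be pictured with a small
userspace model. This is only a sketch: struct toy_dev and the toy_*()
helpers are hypothetical stand-ins, not kernel code, and the real setters
also check whether the platform supports the requested mask. DMA_BIT_MASK()
is written the way the kernel defines it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Same shape as the kernel's DMA_BIT_MASK() macro. */
#define DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))

/* Hypothetical stand-in for the two mask fields of struct device. */
struct toy_dev {
	uint64_t streaming_mask;	/* models *dev->dma_mask */
	uint64_t coherent_mask;		/* models dev->coherent_dma_mask */
};

/* Models dma_set_coherent_mask(): only the allocation mask changes. */
static void toy_set_coherent_mask(struct toy_dev *dev, uint64_t mask)
{
	dev->coherent_mask = mask;
}

/* Models dma_set_mask_and_coherent(): both masks change. */
static void toy_set_mask_and_coherent(struct toy_dev *dev, uint64_t mask)
{
	dev->streaming_mask = mask;
	dev->coherent_mask = mask;
}

/* True if the buffer [addr, addr + size) is reachable under the mask. */
static bool toy_addr_ok(uint64_t mask, uint64_t addr, uint64_t size)
{
	assert(size > 0);
	return addr + size - 1 <= mask;
}
```

With DMA_BIT_MASK(24) the highest reachable byte is 0xFFFFFF, i.e. the
16MB ISA limit that this driver needs for its allocation.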

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 22/22] mtd: rawnand: Use dma_alloc_noncoherent() for dma buffer
  2022-02-19 11:18     ` Hyeonggon Yoo
@ 2022-02-22  8:46       ` Christoph Hellwig
  2022-02-22  9:06         ` David Laight
  0 siblings, 1 reply; 74+ messages in thread
From: Christoph Hellwig @ 2022-02-22  8:46 UTC (permalink / raw)
  To: Hyeonggon Yoo
  Cc: Christoph Hellwig, Baoquan He, linux-kernel, linux-mm, akpm, cl,
	penberg, rientjes, iamjoonsoo.kim, vbabka, David.Laight, david,
	herbert, davem, linux-crypto, steffen.klassert, netdev, hca, gor,
	agordeev, borntraeger, svens, linux-s390, michael, linux-i2c,
	wsa

On Sat, Feb 19, 2022 at 11:18:24AM +0000, Hyeonggon Yoo wrote:
> > I think it would be better to still allocate the buffer at allocation
> > time and then just transfer ownership using dma_sync_single* in the I/O
> > path to avoid the GFP_ATOMIC allocation.
> 
> This driver allocates the buffer at initialization step and maps the buffer
> for DMA_TO_DEVICE and DMA_FROM_DEVICE when processing IO.
> 
> But after converting this driver to use dma_alloc_noncoherent(), remapping
> a dma_alloc_noncoherent()-ed buffer is strange, so I just made it allocate
> the buffer in the IO path.

You should not remap it.  Just use dma_sync_single* to transfer ownership.

> Hmm.. for this specific case, What about allocating two buffers
> for DMA_TO_DEVICE and DMA_FROM_DEVICE at initialization time?

That will work, but I don't see the benefit as you'd still need to call
dma_sync_single* before and after each data transfer.
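
As a rough illustration of "transfer ownership" with a single long-lived
buffer, here is a userspace model. It is a sketch under assumptions: the
toy_*() names and struct toy_buf are hypothetical, and the real
dma_sync_single_for_device()/dma_sync_single_for_cpu() calls do cache
maintenance rather than flip a flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum toy_owner { TOY_OWNER_CPU, TOY_OWNER_DEVICE };

/* Hypothetical model of one DMA buffer allocated at init time. */
struct toy_buf {
	enum toy_owner owner;
	unsigned int transfers;	/* completed device transfers */
};

/* Models dma_sync_single_for_device(): hand the buffer to the device. */
static void toy_sync_for_device(struct toy_buf *buf)
{
	assert(buf != NULL);
	buf->owner = TOY_OWNER_DEVICE;
}

/* Models dma_sync_single_for_cpu(): hand the buffer back to the CPU. */
static void toy_sync_for_cpu(struct toy_buf *buf)
{
	assert(buf != NULL);
	buf->owner = TOY_OWNER_CPU;
}

/* The CPU may only touch the buffer while it owns it. */
static bool toy_cpu_may_access(const struct toy_buf *buf)
{
	return buf->owner == TOY_OWNER_CPU;
}

/* One I/O: no allocation, no remapping, just two ownership transfers. */
static bool toy_do_transfer(struct toy_buf *buf)
{
	if (!toy_cpu_may_access(buf))
		return false;	/* a transfer is still in flight */
	/* the CPU would fill the buffer here */
	toy_sync_for_device(buf);
	/* the device would do its DMA here */
	buf->transfers++;
	toy_sync_for_cpu(buf);
	return toy_cpu_may_access(buf);
}
```

The I/O path then never allocates, so no GFP_ATOMIC allocation is needed
there.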

^ permalink raw reply	[flat|nested] 74+ messages in thread

* RE: [PATCH 22/22] mtd: rawnand: Use dma_alloc_noncoherent() for dma buffer
  2022-02-22  8:46       ` Christoph Hellwig
@ 2022-02-22  9:06         ` David Laight
  2022-02-22 13:16           ` 'Christoph Hellwig'
  0 siblings, 1 reply; 74+ messages in thread
From: David Laight @ 2022-02-22  9:06 UTC (permalink / raw)
  To: 'Christoph Hellwig', Hyeonggon Yoo
  Cc: Baoquan He, linux-kernel, linux-mm, akpm, cl, penberg, rientjes,
	iamjoonsoo.kim, vbabka, david, herbert, davem, linux-crypto,
	steffen.klassert, netdev, hca, gor, agordeev, borntraeger, svens,
	linux-s390, michael, linux-i2c, wsa

From: Christoph Hellwig
> Sent: 22 February 2022 08:47
...
> > Hmm.. for this specific case, What about allocating two buffers
> > for DMA_TO_DEVICE and DMA_FROM_DEVICE at initialization time?
> 
> That will work, but I don't see the benefit as you'd still need to call
> dma_sync_single* before and after each data transfer.

For systems with an iommu that should save all the iommu setup
for every transfer.
I'd also guess that it saves worrying about the error path
when the dma_map fails (eg because the iommu has no space).

OTOH the driver would be 'hogging' iommu space, so maybe
allocate during open() (or equivalent).

	David



^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 21/22] mmc: wbsd: Use dma_alloc_noncoherent() for dma buffer
  2022-02-22  8:45       ` Christoph Hellwig
@ 2022-02-22  9:14         ` Baoquan He
  2022-02-22 13:11           ` Christoph Hellwig
  0 siblings, 1 reply; 74+ messages in thread
From: Baoquan He @ 2022-02-22  9:14 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-kernel, linux-mm, akpm, cl, 42.hyeyoo, penberg, rientjes,
	iamjoonsoo.kim, vbabka, David.Laight, david, herbert, davem,
	linux-crypto, steffen.klassert, netdev, hca, gor, agordeev,
	borntraeger, svens, linux-s390, michael, linux-i2c, wsa

On 02/22/22 at 09:45am, Christoph Hellwig wrote:
> On Sun, Feb 20, 2022 at 04:40:44PM +0800, Baoquan He wrote:
> > On 02/19/22 at 08:17am, Christoph Hellwig wrote:
> > > On Sat, Feb 19, 2022 at 08:52:20AM +0800, Baoquan He wrote:
> > > >  	if (request_dma(dma, DRIVER_NAME))
> > > >  		goto err;
> > > >  
> > > > +	dma_set_mask_and_coherent(mmc_dev(host->mmc), DMA_BIT_MASK(24));
> > > 
> > > This also sets the streaming mask, but the driver doesn't seem to make
> > > use of that.  Please document it in the commit log.
> > 
> > Thanks for reviewing. I will change it to dma_set_mask(), and describe
> > this change in patch log.
> 
> No, if you change it, it should be dma_set_coherent_mask only as it is
> not using streaming mappings.  I suspect dma_set_mask_and_coherent is
> the right thing if the driver ever wants to use streaming mapping,
> it would just need to be documented in the commit message.

It will serve the later dma_alloc_noncoherent() call; shouldn't that be a
streaming mapping?


^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 21/22] mmc: wbsd: Use dma_alloc_noncoherent() for dma buffer
  2022-02-22  9:14         ` Baoquan He
@ 2022-02-22 13:11           ` Christoph Hellwig
  2022-02-22 13:40             ` Baoquan He
                               ` (2 more replies)
  0 siblings, 3 replies; 74+ messages in thread
From: Christoph Hellwig @ 2022-02-22 13:11 UTC (permalink / raw)
  To: Baoquan He
  Cc: Christoph Hellwig, linux-kernel, linux-mm, akpm, cl, 42.hyeyoo,
	penberg, rientjes, iamjoonsoo.kim, vbabka, David.Laight, david,
	herbert, davem, linux-crypto, steffen.klassert, netdev, hca, gor,
	agordeev, borntraeger, svens, linux-s390, michael, linux-i2c,
	wsa

On Tue, Feb 22, 2022 at 05:14:16PM +0800, Baoquan He wrote:
> > No, if you change it, it should be dma_set_coherent_mask only as it is
> > not using streaming mappings.  I suspect dma_set_mask_and_coherent is
> > the right thing if the driver ever wants to use streaming mapping,
> > it would just need to be documented in the commit message.
> 
> It will serve the later dma_alloc_noncoherent() call; shouldn't that be a
> streaming mapping?

No, that also looks at the coherent mask.  Which is a bit misnamed these
days, it really should be the alloc mask.
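
Put as a lookup, the rule stated here is roughly the following. This is a
userspace sketch only: struct toy_dev, enum toy_api and toy_mask_for() are
hypothetical names used for illustration, not kernel interfaces.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the two mask fields of struct device. */
struct toy_dev {
	uint64_t streaming_mask;	/* models *dev->dma_mask */
	uint64_t coherent_mask;		/* models dev->coherent_dma_mask */
};

/* The API families discussed in this thread. */
enum toy_api {
	TOY_MAP_SINGLE,		/* dma_map_single() and friends */
	TOY_ALLOC_COHERENT,	/* dma_alloc_coherent() */
	TOY_ALLOC_NONCOHERENT,	/* dma_alloc_noncoherent() */
};

/* Return the mask that limits the addresses used by the given API. */
static uint64_t toy_mask_for(const struct toy_dev *dev, enum toy_api api)
{
	assert(dev != NULL);

	switch (api) {
	case TOY_MAP_SINGLE:
		return dev->streaming_mask;
	case TOY_ALLOC_COHERENT:
	case TOY_ALLOC_NONCOHERENT:
		/* every allocating API looks at the "alloc mask" */
		return dev->coherent_mask;
	}
	assert(0 && "unknown API");
	return 0;
}
```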

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 00/22] Don't use kmalloc() with GFP_DMA
  2022-02-22  8:44   ` Christoph Hellwig
@ 2022-02-22 13:12     ` Baoquan He
  2022-02-22 13:26       ` Baoquan He
  2022-02-23 19:18     ` Heiko Carstens
  1 sibling, 1 reply; 74+ messages in thread
From: Baoquan He @ 2022-02-22 13:12 UTC (permalink / raw)
  To: Heiko Carstens, Christoph Hellwig
  Cc: linux-kernel, linux-mm, akpm, cl, 42.hyeyoo, penberg, rientjes,
	iamjoonsoo.kim, vbabka, David.Laight, david, herbert, davem,
	linux-crypto, steffen.klassert, netdev, gor, agordeev,
	borntraeger, svens, linux-s390, michael, linux-i2c, wsa,
	Halil Pasic, Vineeth Vijayan, x86

On 02/22/22 at 09:44am, Christoph Hellwig wrote:
> On Mon, Feb 21, 2022 at 02:57:34PM +0100, Heiko Carstens wrote:
> > > 1) Kmalloc(GFP_DMA) in s390 platform, under arch/s390 and drivers/s390;
> > 
> > So, s390 partially requires GFP_DMA allocations for memory areas which
> > are required by the hardware to be below 2GB. There is not necessarily
> > a device associated when this is required. E.g. some legacy "diagnose"
> > calls require buffers to be below 2GB.
> > 
> > How should something like this be handled? I'd guess that the
> > dma_alloc API is not the right thing to use in such cases. Of course
> > we could say, let's waste memory and use full pages instead, however
> > I'm not sure this is a good idea.
> 
> Yeah, I don't think the DMA API is the right thing for that.  This
> is one of the very rare cases where a raw allocation makes sense.
> 
> That being said being able to drop kmalloc support for GFP_DMA would
> be really useful. How much memory would we waste if switching to the
> page allocator?
> 
> > s390 drivers could probably converted to dma_alloc API, even though
> > that would cause quite some code churn.
> 
> I think that would be a very good thing to have.
> 
> > > For this first patch series, thanks to Hyeonggon for helping
> > > reviewing and great suggestions on patch improving. We will work
> > > together to continue the next steps of work.
> > > 
> > > Any comment, thought, or suggestion is welcome and appreciated,
> > > including but not limited to:
> > > 1) whether we should remove dma-kmalloc support in kernel();
> > 
> > The question is: what would this buy us? As stated above I'd assume
> > this comes with quite some code churn, so there should be a good
> > reason to do this.
> 
> There is two steps here.  One is to remove GFP_DMA support from
> kmalloc, which would help to cleanup the slab allocator(s) very nicely,
> as at that point it can stop to be zone aware entirely.
> 
> The long term goal is to remove ZONE_DMA entirely at least for
> architectures that only use the small 16MB ISA-style one.  It can
> then be replaced with for example a CMA area and fall into a movable
> zone.  I'd have to prototype this first and see how it applies to the
> s390 case.  It might not be worth it and maybe we should replace
> ZONE_DMA and ZONE_DMA32 with a ZONE_LIMITED for those use cases as
> the amount covered tends to not be totally out of line for what we
> built the zone infrastructure.
> 
> > From this cover letter I only get that there was a problem with kdump
> > on x86, and this has been fixed. So why this extra effort?
> > 
> > >     3) Drop support for allocating DMA memory from slab allocator
> > >     (as Christoph Hellwig said) and convert them to use DMA32
> > >     and see what happens
> > 
> > Can you please clarify what "convert to DMA32" means? I would assume
> > this does _not_ mean that passing GFP_DMA32 to slab allocator would
> > work then?
> 
> I'm really not sure what this means.

Thanks a lot to Heiko for valuable input, it's very helpful. And thanks
a lot to Christoph for explaining.

I guess this "convert to DMA32" is similar to "replace ZONE_DMA and
ZONE_DMA32 with a ZONE_LIMITED".

When I used 'git grep "GFP_DMA\>"' to find all the places specifying GFP_DMA,
I noticed that the main use of kmalloc(GFP_DMA) is to get memory below some
address limit rather than to allocate DMA buffers. Below is what I collected
when explaining the earlier kdump issue. It helps explain why kmalloc(GFP_DMA)
is useful on ARCHes w/o ZONE_DMA32, but doesn't make sense on x86_64, which
has both zone DMA and DMA32. The 16M ZONE_DMA is only for very rarely used
legacy ISA devices, yet most PCI device drivers that support 32-bit addressing
abuse kmalloc(GFP_DMA) to get DMA buffers from zone DMA.
That is obviously unsafe and unreasonable.

RISC-V, which doesn't carry the burden of legacy ISA devices, can get by
with only a DMA32 zone. ARM64 has also been adjusted to populate only one
zone below 4G when not on Raspberry Pi. Using kmalloc(GFP_DMA) causes them no
inconvenience. If we finally end up with a dma32-kmalloc, the name may need to
be considered carefully, but it seems acceptable. We just need to pick
out the ISA device drivers and handle their 24-bit DMA addressing properly.

For this patchset, I only picked the places where GFP_DMA is
redundant and can be removed directly, and the places where a
kmalloc(GFP_DMA)/dma_map_xxx() pair can be replaced with the dma_alloc_xxx()
API without wasting much memory. I also have patches converting
kmalloc(GFP_DMA) to alloc_pages(GFP_DMA) where it is not easy to switch to
dma_alloc_xxx(), but Hyeonggon suggested not adding them to this series.
I will continue investigating the remaining places to see whether and how we
can convert them.

=============================
ARCH which has DMA32
            ZONE_DMA       ZONE_DMA32
arm64       0~X            X~4G    (X comes from ACPI or DT; otherwise X is 4G by default and DMA32 is empty)
ia64        None           0~4G
mips        0 or 0~16M     X~4G    (zone DMA is empty on SGI_IP22 or SGI_IP28, otherwise 16M by default like i386)
riscv       None           0~4G
x86_64      0~16M          16M~4G


=============================
ARCH which has no DMA32
            ZONE_DMA
alpha       0~16M, or empty if IOMMU enabled
arm         0~X (X is reported by fdt, 4G by default)
m68k        0~total memory
microblaze  0~total low memory
powerpc     0~2G
s390        0~2G
sparc       0~total low memory
i386        0~16M
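
To make two rows of the table concrete, here is a small userspace sketch
that classifies a physical address the way the x86_64 and s390 rows
describe. The toy_*() helpers and TOY_* names are hypothetical and only
encode the boundaries listed above; they are not how the kernel computes
zones.

```c
#include <assert.h>
#include <stdint.h>

enum toy_zone { TOY_ZONE_DMA, TOY_ZONE_DMA32, TOY_ZONE_NORMAL };

#define TOY_MiB(n) ((uint64_t)(n) << 20)
#define TOY_GiB(n) ((uint64_t)(n) << 30)

/* x86_64 row: ZONE_DMA is 0~16M, ZONE_DMA32 is 16M~4G. */
static enum toy_zone toy_x86_64_zone(uint64_t phys)
{
	if (phys < TOY_MiB(16))
		return TOY_ZONE_DMA;
	if (phys < TOY_GiB(4))
		return TOY_ZONE_DMA32;
	return TOY_ZONE_NORMAL;
}

/* s390 row: ZONE_DMA is 0~2G and there is no ZONE_DMA32. */
static enum toy_zone toy_s390_zone(uint64_t phys)
{
	return phys < TOY_GiB(2) ? TOY_ZONE_DMA : TOY_ZONE_NORMAL;
}

/* True if phys is addressable by a device limited to `bits` address bits. */
static int toy_fits_bits(uint64_t phys, unsigned int bits)
{
	assert(bits >= 1 && bits <= 64);
	return bits == 64 || phys < (1ULL << bits);
}
```

A device with 32-bit addressing can reach everything below 4G, so on
x86_64 it has no need to squeeze its buffers into the 16M zone that only
24-bit ISA devices depend on.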

> 
> > 
> > btw. there are actually two kmalloc allocations which pass GFP_DMA32;
> > I guess this is broken(?):
> > 
> > drivers/hid/intel-ish-hid/ishtp-fw-loader.c:    dma_buf = kmalloc(payload_max_size, GFP_KERNEL | GFP_DMA32);
> > drivers/media/test-drivers/vivid/vivid-osd.c:   dev->video_vbase = kzalloc(dev->video_buffer_size, GFP_KERNEL | GFP_DMA32);
> 
> Yes, this is completely broken.
> 


^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 22/22] mtd: rawnand: Use dma_alloc_noncoherent() for dma buffer
  2022-02-22  9:06         ` David Laight
@ 2022-02-22 13:16           ` 'Christoph Hellwig'
  0 siblings, 0 replies; 74+ messages in thread
From: 'Christoph Hellwig' @ 2022-02-22 13:16 UTC (permalink / raw)
  To: David Laight
  Cc: 'Christoph Hellwig',
	Hyeonggon Yoo, Baoquan He, linux-kernel, linux-mm, akpm, cl,
	penberg, rientjes, iamjoonsoo.kim, vbabka, david, herbert, davem,
	linux-crypto, steffen.klassert, netdev, hca, gor, agordeev,
	borntraeger, svens, linux-s390, michael, linux-i2c, wsa

On Tue, Feb 22, 2022 at 09:06:48AM +0000, David Laight wrote:
> From: Christoph Hellwig
> > Sent: 22 February 2022 08:47
> ...
> > > Hmm.. for this specific case, What about allocating two buffers
> > > for DMA_TO_DEVICE and DMA_FROM_DEVICE at initialization time?
> > 
> > That will work, but I don't see the benefit as you'd still need to call
> > dma_sync_single* before and after each data transfer.
> 
> For systems with an iommu that should save all the iommu setup
> for every transfer.

So does allocating a single buffer as in the patch we are replying to.

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 00/22] Don't use kmalloc() with GFP_DMA
  2022-02-22 13:12     ` Baoquan He
@ 2022-02-22 13:26       ` Baoquan He
  0 siblings, 0 replies; 74+ messages in thread
From: Baoquan He @ 2022-02-22 13:26 UTC (permalink / raw)
  To: Heiko Carstens, Christoph Hellwig
  Cc: linux-kernel, linux-mm, akpm, cl, 42.hyeyoo, penberg, rientjes,
	iamjoonsoo.kim, vbabka, David.Laight, david, herbert, davem,
	linux-crypto, steffen.klassert, netdev, gor, agordeev,
	borntraeger, svens, linux-s390, michael, linux-i2c, wsa,
	Halil Pasic, Vineeth Vijayan, x86

On 02/22/22 at 09:12pm, Baoquan He wrote:
> On 02/22/22 at 09:44am, Christoph Hellwig wrote:
> > On Mon, Feb 21, 2022 at 02:57:34PM +0100, Heiko Carstens wrote:
> > > > 1) Kmalloc(GFP_DMA) in s390 platform, under arch/s390 and drivers/s390;
> > > 
> > > So, s390 partially requires GFP_DMA allocations for memory areas which
> > > are required by the hardware to be below 2GB. There is not necessarily
> > > a device associated when this is required. E.g. some legacy "diagnose"
> > > calls require buffers to be below 2GB.
> > > 
> > > How should something like this be handled? I'd guess that the
> > > dma_alloc API is not the right thing to use in such cases. Of course
> > > we could say, let's waste memory and use full pages instead, however
> > > I'm not sure this is a good idea.
> > 
> > Yeah, I don't think the DMA API is the right thing for that.  This
> > is one of the very rare cases where a raw allocation makes sense.
> > 
> > That being said being able to drop kmalloc support for GFP_DMA would
> > be really useful. How much memory would we waste if switching to the
> > page allocator?
> > 
> > > s390 drivers could probably converted to dma_alloc API, even though
> > > that would cause quite some code churn.
> > 
> > I think that would be a very good thing to have.
> > 
> > > > For this first patch series, thanks to Hyeonggon for helping
> > > > reviewing and great suggestions on patch improving. We will work
> > > > together to continue the next steps of work.
> > > > 
> > > > Any comment, thought, or suggestion is welcome and appreciated,
> > > > including but not limited to:
> > > > 1) whether we should remove dma-kmalloc support in kernel();
> > > 
> > > The question is: what would this buy us? As stated above I'd assume
> > > this comes with quite some code churn, so there should be a good
> > > reason to do this.
> > 
> > There is two steps here.  One is to remove GFP_DMA support from
> > kmalloc, which would help to cleanup the slab allocator(s) very nicely,
> > as at that point it can stop to be zone aware entirely.
> > 
> > The long term goal is to remove ZONE_DMA entirely at least for
> > architectures that only use the small 16MB ISA-style one.  It can
> > then be replaced with for example a CMA area and fall into a movable
> > zone.  I'd have to prototype this first and see how it applies to the
> > s390 case.  It might not be worth it and maybe we should replace
> > ZONE_DMA and ZONE_DMA32 with a ZONE_LIMITED for those use cases as
> > the amount covered tends to not be totally out of line for what we
> > built the zone infrastructure.
> > 
> > > From this cover letter I only get that there was a problem with kdump
> > > on x86, and this has been fixed. So why this extra effort?
> > > 
> > > >     3) Drop support for allocating DMA memory from slab allocator
> > > >     (as Christoph Hellwig said) and convert them to use DMA32
> > > >     and see what happens
> > > 
> > > Can you please clarify what "convert to DMA32" means? I would assume
> > > this does _not_ mean that passing GFP_DMA32 to slab allocator would
> > > work then?
> > 
> > I'm really not sure what this means.
> 
> Thanks a lot to Heiko for valuable input, it's very helpful. And thanks
> a lot to Christoph for explaining.
> 
> I guess this "convert to DMA32" is similar to "replace ZONE_DMA and
> ZONE_DMA32 with a ZONE_LIMITED".

By the way, when I searched for SLAB_CACHE_DMA32, another zone-aware
slab flag, I found that not everyone abuses kmalloc(GFP_DMA). There are
two places where kmem_cache_create(SLAB_CACHE_DMA32) is called to create
a slab cache that takes its memory from zone DMA32. The authors of that
code clearly know the slab allocator well: they use a dma32 slab cache to
get memory under 4G.

drivers/firmware/google/gsmi.c : gsmi_init()
drivers/iommu/io-pgtable-arm-v7s.c: arm_v7s_alloc_pgtable()

> 
> When I used 'git grep "GFP_DMA\>"' to find all the places specifying GFP_DMA,
> I noticed that the main use of kmalloc(GFP_DMA) is to get memory below some
> address limit rather than to allocate DMA buffers. Below is what I collected
> when explaining the earlier kdump issue. It helps explain why kmalloc(GFP_DMA)
> is useful on ARCHes w/o ZONE_DMA32, but doesn't make sense on x86_64, which
> has both zone DMA and DMA32. The 16M ZONE_DMA is only for very rarely used
> legacy ISA devices, yet most PCI device drivers that support 32-bit addressing
> abuse kmalloc(GFP_DMA) to get DMA buffers from zone DMA.
> That is obviously unsafe and unreasonable.
> 
> RISC-V, which doesn't carry the burden of legacy ISA devices, can get by
> with only a DMA32 zone. ARM64 has also been adjusted to populate only one
> zone below 4G when not on Raspberry Pi. Using kmalloc(GFP_DMA) causes them no
> inconvenience. If we finally end up with a dma32-kmalloc, the name may need to
> be considered carefully, but it seems acceptable. We just need to pick
> out the ISA device drivers and handle their 24-bit DMA addressing properly.
> 
> For this patchset, I only picked the places where GFP_DMA is
> redundant and can be removed directly, and the places where a
> kmalloc(GFP_DMA)/dma_map_xxx() pair can be replaced with the dma_alloc_xxx()
> API without wasting much memory. I also have patches converting
> kmalloc(GFP_DMA) to alloc_pages(GFP_DMA) where it is not easy to switch to
> dma_alloc_xxx(), but Hyeonggon suggested not adding them to this series.
> I will continue investigating the remaining places to see whether and how we
> can convert them.
> 
> =============================
> ARCH which has DMA32
>             ZONE_DMA       ZONE_DMA32
> arm64       0~X            X~4G    (X comes from ACPI or DT; otherwise X is 4G by default and DMA32 is empty)
> ia64        None           0~4G
> mips        0 or 0~16M     X~4G    (zone DMA is empty on SGI_IP22 or SGI_IP28, otherwise 16M by default like i386)
> riscv       None           0~4G
> x86_64      0~16M          16M~4G
> 
> 
> =============================
> ARCH which has no DMA32
>             ZONE_DMA
> alpha       0~16M, or empty if IOMMU enabled
> arm         0~X (X is reported by fdt, 4G by default)
> m68k        0~total memory
> microblaze  0~total low memory
> powerpc     0~2G
> s390        0~2G
> sparc       0~total low memory
> i386        0~16M
> 
> > 
> > > 
> > > btw. there are actually two kmalloc allocations which pass GFP_DMA32;
> > > I guess this is broken(?):
> > > 
> > > drivers/hid/intel-ish-hid/ishtp-fw-loader.c:    dma_buf = kmalloc(payload_max_size, GFP_KERNEL | GFP_DMA32);
> > > drivers/media/test-drivers/vivid/vivid-osd.c:   dev->video_vbase = kzalloc(dev->video_buffer_size, GFP_KERNEL | GFP_DMA32);
> > 
> > Yes, this is completely broken.
> > 
> 


^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 21/22] mmc: wbsd: Use dma_alloc_noncoherent() for dma buffer
  2022-02-22 13:11           ` Christoph Hellwig
@ 2022-02-22 13:40             ` Baoquan He
  2022-02-22 13:41             ` [PATCH 1/2] dma-mapping: check dma_mask for streaming mapping allocs Baoquan He
  2022-02-22 13:42             ` [PATCH 2/2] kernel/dma: rename dma_alloc_direct and dma_map_direct Baoquan He
  2 siblings, 0 replies; 74+ messages in thread
From: Baoquan He @ 2022-02-22 13:40 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-kernel, linux-mm, akpm, cl, 42.hyeyoo, penberg, rientjes,
	iamjoonsoo.kim, vbabka, David.Laight, david, herbert, davem,
	linux-crypto, steffen.klassert, netdev, hca, gor, agordeev,
	borntraeger, svens, linux-s390, michael, linux-i2c, wsa

On 02/22/22 at 02:11pm, Christoph Hellwig wrote:
> On Tue, Feb 22, 2022 at 05:14:16PM +0800, Baoquan He wrote:
> > > No, if you change it, it should be dma_set_coherent_mask only as it is
> > > not using streaming mappings.  I suspect dma_set_mask_and_coherent is
> > > the right thing if the driver ever wants to use streaming mapping,
> > > it would just need to be documented in the commit message.
> > 
> > > It will serve the later dma_alloc_noncoherent() call; shouldn't that be a
> > > streaming mapping?
> 
> No, that also looks at the coherent mask.  Which is a bit misnamed these
> days, it really should be the alloc mask.

I noticed the misnaming and have drafted two patches; please help check
whether they are needed.


^ permalink raw reply	[flat|nested] 74+ messages in thread

* [PATCH 1/2] dma-mapping: check dma_mask for streaming mapping allocs
  2022-02-22 13:11           ` Christoph Hellwig
  2022-02-22 13:40             ` Baoquan He
@ 2022-02-22 13:41             ` Baoquan He
  2022-02-22 15:59               ` Christoph Hellwig
  2022-02-22 13:42             ` [PATCH 2/2] kernel/dma: rename dma_alloc_direct and dma_map_direct Baoquan He
  2 siblings, 1 reply; 74+ messages in thread
From: Baoquan He @ 2022-02-22 13:41 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-kernel, linux-mm, akpm, cl, 42.hyeyoo, penberg, rientjes,
	iamjoonsoo.kim, vbabka, David.Laight, david, herbert, davem,
	linux-crypto, steffen.klassert, netdev, hca, gor, agordeev,
	borntraeger, svens, linux-s390, michael, linux-i2c, wsa

For the newly added streaming mapping APIs, the internal core function
__dma_alloc_pages() should check dev->dma_mask, not
dev->coherent_dma_mask, which is for coherent mappings.

Meanwhile, just filter out __GFP_DMA, __GFP_DMA32 and __GFP_HIGHMEM from
the gfp flags instead of failing the allocation. This change makes it
consistent with coherent mapping allocs.

Signed-off-by: Baoquan He <bhe@redhat.com>
---
 kernel/dma/mapping.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/kernel/dma/mapping.c b/kernel/dma/mapping.c
index 9478eccd1c8e..e66847aeac67 100644
--- a/kernel/dma/mapping.c
+++ b/kernel/dma/mapping.c
@@ -543,10 +543,11 @@ static struct page *__dma_alloc_pages(struct device *dev, size_t size,
 {
 	const struct dma_map_ops *ops = get_dma_ops(dev);
 
-	if (WARN_ON_ONCE(!dev->coherent_dma_mask))
-		return NULL;
-	if (WARN_ON_ONCE(gfp & (__GFP_DMA | __GFP_DMA32 | __GFP_HIGHMEM)))
-		return NULL;
+	if (WARN_ON_ONCE(!dev->dma_mask))
+		return NULL;
+
+	/* let the implementation decide on the zone to allocate from: */
+        gfp &= ~(__GFP_DMA | __GFP_DMA32 | __GFP_HIGHMEM);
 
 	size = PAGE_ALIGN(size);
 	if (dma_alloc_direct(dev, ops))
-- 
2.31.1
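
The gfp change in the patch above can be modeled in userspace as follows.
This is only a sketch: the TOY_GFP_* values are illustrative bits (the real
flags are defined by the kernel's gfp headers), TOY_GFP_OTHER is a
hypothetical non-zone flag, and the toy_*() helpers are not kernel
functions.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative flag bits standing in for the kernel's zone modifiers. */
#define TOY_GFP_DMA	0x01u
#define TOY_GFP_HIGHMEM	0x02u
#define TOY_GFP_DMA32	0x04u
#define TOY_GFP_OTHER	0x100u	/* hypothetical non-zone flag */

#define TOY_GFP_ZONE_BITS (TOY_GFP_DMA | TOY_GFP_HIGHMEM | TOY_GFP_DMA32)

/* Before the patch: any zone flag makes __dma_alloc_pages() give up. */
static bool toy_old_accepts(unsigned int gfp)
{
	return (gfp & TOY_GFP_ZONE_BITS) == 0;
}

/* After the patch: zone flags are stripped and the rest is kept. */
static unsigned int toy_new_filter(unsigned int gfp)
{
	unsigned int filtered = gfp & ~TOY_GFP_ZONE_BITS;

	assert((filtered & TOY_GFP_ZONE_BITS) == 0);
	return filtered;
}
```

A caller that still passes a zone flag would therefore get memory chosen by
the DMA mask instead of a NULL return.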


^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH 2/2] kernel/dma: rename dma_alloc_direct and dma_map_direct
  2022-02-22 13:11           ` Christoph Hellwig
  2022-02-22 13:40             ` Baoquan He
  2022-02-22 13:41             ` [PATCH 1/2] dma-mapping: check dma_mask for streaming mapping allocs Baoquan He
@ 2022-02-22 13:42             ` Baoquan He
  2022-02-22 15:59               ` Christoph Hellwig
  2 siblings, 1 reply; 74+ messages in thread
From: Baoquan He @ 2022-02-22 13:42 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-kernel, linux-mm, akpm, cl, 42.hyeyoo, penberg, rientjes,
	iamjoonsoo.kim, vbabka, David.Laight, david, herbert, davem,
	linux-crypto, steffen.klassert, netdev, hca, gor, agordeev,
	borntraeger, svens, linux-s390, michael, linux-i2c, wsa

In the old DMA mapping code, coherent mapping uses dma_alloc_coherent() to
allocate and map a DMA buffer, while streaming mapping can only get
memory from the slab or buddy allocator and then map it with
dma_map_single(). In that situation, dma_alloc_direct() checks for a direct
mapping for coherent DMA, and dma_map_direct() checks for a direct mapping
for streaming DMA.

However, several new APIs have been added for streaming mapping, e.g.
dma_alloc_pages(). These new APIs take care of both DMA buffer allocation
and mapping, similar to dma_alloc_coherent(). So we should rename both
helpers to reflect their real intention and avoid confusion.

       dma_alloc_direct()  ==>  dma_coherent_direct()
       dma_map_direct()    ==>  dma_streaming_direct()

Signed-off-by: Baoquan He <bhe@redhat.com>
---
 kernel/dma/mapping.c | 44 ++++++++++++++++++++++----------------------
 1 file changed, 22 insertions(+), 22 deletions(-)

diff --git a/kernel/dma/mapping.c b/kernel/dma/mapping.c
index e66847aeac67..2835b08e96c6 100644
--- a/kernel/dma/mapping.c
+++ b/kernel/dma/mapping.c
@@ -127,13 +127,13 @@ static bool dma_go_direct(struct device *dev, dma_addr_t mask,
  * This allows IOMMU drivers to set a bypass mode if the DMA mask is large
  * enough.
  */
-static inline bool dma_alloc_direct(struct device *dev,
+static inline bool dma_coherent_direct(struct device *dev,
 		const struct dma_map_ops *ops)
 {
 	return dma_go_direct(dev, dev->coherent_dma_mask, ops);
 }
 
-static inline bool dma_map_direct(struct device *dev,
+static inline bool dma_streaming_direct(struct device *dev,
 		const struct dma_map_ops *ops)
 {
 	return dma_go_direct(dev, *dev->dma_mask, ops);
@@ -151,7 +151,7 @@ dma_addr_t dma_map_page_attrs(struct device *dev, struct page *page,
 	if (WARN_ON_ONCE(!dev->dma_mask))
 		return DMA_MAPPING_ERROR;
 
-	if (dma_map_direct(dev, ops) ||
+	if (dma_streaming_direct(dev, ops) ||
 	    arch_dma_map_page_direct(dev, page_to_phys(page) + offset + size))
 		addr = dma_direct_map_page(dev, page, offset, size, dir, attrs);
 	else
@@ -168,7 +168,7 @@ void dma_unmap_page_attrs(struct device *dev, dma_addr_t addr, size_t size,
 	const struct dma_map_ops *ops = get_dma_ops(dev);
 
 	BUG_ON(!valid_dma_direction(dir));
-	if (dma_map_direct(dev, ops) ||
+	if (dma_streaming_direct(dev, ops) ||
 	    arch_dma_unmap_page_direct(dev, addr + size))
 		dma_direct_unmap_page(dev, addr, size, dir, attrs);
 	else if (ops->unmap_page)
@@ -188,7 +188,7 @@ static int __dma_map_sg_attrs(struct device *dev, struct scatterlist *sg,
 	if (WARN_ON_ONCE(!dev->dma_mask))
 		return 0;
 
-	if (dma_map_direct(dev, ops) ||
+	if (dma_streaming_direct(dev, ops) ||
 	    arch_dma_map_sg_direct(dev, sg, nents))
 		ents = dma_direct_map_sg(dev, sg, nents, dir, attrs);
 	else
@@ -277,7 +277,7 @@ void dma_unmap_sg_attrs(struct device *dev, struct scatterlist *sg,
 
 	BUG_ON(!valid_dma_direction(dir));
 	debug_dma_unmap_sg(dev, sg, nents, dir);
-	if (dma_map_direct(dev, ops) ||
+	if (dma_streaming_direct(dev, ops) ||
 	    arch_dma_unmap_sg_direct(dev, sg, nents))
 		dma_direct_unmap_sg(dev, sg, nents, dir, attrs);
 	else if (ops->unmap_sg)
@@ -296,7 +296,7 @@ dma_addr_t dma_map_resource(struct device *dev, phys_addr_t phys_addr,
 	if (WARN_ON_ONCE(!dev->dma_mask))
 		return DMA_MAPPING_ERROR;
 
-	if (dma_map_direct(dev, ops))
+	if (dma_streaming_direct(dev, ops))
 		addr = dma_direct_map_resource(dev, phys_addr, size, dir, attrs);
 	else if (ops->map_resource)
 		addr = ops->map_resource(dev, phys_addr, size, dir, attrs);
@@ -312,7 +312,7 @@ void dma_unmap_resource(struct device *dev, dma_addr_t addr, size_t size,
 	const struct dma_map_ops *ops = get_dma_ops(dev);
 
 	BUG_ON(!valid_dma_direction(dir));
-	if (!dma_map_direct(dev, ops) && ops->unmap_resource)
+	if (!dma_streaming_direct(dev, ops) && ops->unmap_resource)
 		ops->unmap_resource(dev, addr, size, dir, attrs);
 	debug_dma_unmap_resource(dev, addr, size, dir);
 }
@@ -324,7 +324,7 @@ void dma_sync_single_for_cpu(struct device *dev, dma_addr_t addr, size_t size,
 	const struct dma_map_ops *ops = get_dma_ops(dev);
 
 	BUG_ON(!valid_dma_direction(dir));
-	if (dma_map_direct(dev, ops))
+	if (dma_streaming_direct(dev, ops))
 		dma_direct_sync_single_for_cpu(dev, addr, size, dir);
 	else if (ops->sync_single_for_cpu)
 		ops->sync_single_for_cpu(dev, addr, size, dir);
@@ -338,7 +338,7 @@ void dma_sync_single_for_device(struct device *dev, dma_addr_t addr,
 	const struct dma_map_ops *ops = get_dma_ops(dev);
 
 	BUG_ON(!valid_dma_direction(dir));
-	if (dma_map_direct(dev, ops))
+	if (dma_streaming_direct(dev, ops))
 		dma_direct_sync_single_for_device(dev, addr, size, dir);
 	else if (ops->sync_single_for_device)
 		ops->sync_single_for_device(dev, addr, size, dir);
@@ -352,7 +352,7 @@ void dma_sync_sg_for_cpu(struct device *dev, struct scatterlist *sg,
 	const struct dma_map_ops *ops = get_dma_ops(dev);
 
 	BUG_ON(!valid_dma_direction(dir));
-	if (dma_map_direct(dev, ops))
+	if (dma_streaming_direct(dev, ops))
 		dma_direct_sync_sg_for_cpu(dev, sg, nelems, dir);
 	else if (ops->sync_sg_for_cpu)
 		ops->sync_sg_for_cpu(dev, sg, nelems, dir);
@@ -366,7 +366,7 @@ void dma_sync_sg_for_device(struct device *dev, struct scatterlist *sg,
 	const struct dma_map_ops *ops = get_dma_ops(dev);
 
 	BUG_ON(!valid_dma_direction(dir));
-	if (dma_map_direct(dev, ops))
+	if (dma_streaming_direct(dev, ops))
 		dma_direct_sync_sg_for_device(dev, sg, nelems, dir);
 	else if (ops->sync_sg_for_device)
 		ops->sync_sg_for_device(dev, sg, nelems, dir);
@@ -391,7 +391,7 @@ int dma_get_sgtable_attrs(struct device *dev, struct sg_table *sgt,
 {
 	const struct dma_map_ops *ops = get_dma_ops(dev);
 
-	if (dma_alloc_direct(dev, ops))
+	if (dma_streaming_direct(dev, ops))
 		return dma_direct_get_sgtable(dev, sgt, cpu_addr, dma_addr,
 				size, attrs);
 	if (!ops->get_sgtable)
@@ -430,7 +430,7 @@ bool dma_can_mmap(struct device *dev)
 {
 	const struct dma_map_ops *ops = get_dma_ops(dev);
 
-	if (dma_alloc_direct(dev, ops))
+	if (dma_coherent_direct(dev, ops))
 		return dma_direct_can_mmap(dev);
 	return ops->mmap != NULL;
 }
@@ -455,7 +455,7 @@ int dma_mmap_attrs(struct device *dev, struct vm_area_struct *vma,
 {
 	const struct dma_map_ops *ops = get_dma_ops(dev);
 
-	if (dma_alloc_direct(dev, ops))
+	if (dma_coherent_direct(dev, ops))
 		return dma_direct_mmap(dev, vma, cpu_addr, dma_addr, size,
 				attrs);
 	if (!ops->mmap)
@@ -468,7 +468,7 @@ u64 dma_get_required_mask(struct device *dev)
 {
 	const struct dma_map_ops *ops = get_dma_ops(dev);
 
-	if (dma_alloc_direct(dev, ops))
+	if (dma_streaming_direct(dev, ops))
 		return dma_direct_get_required_mask(dev);
 	if (ops->get_required_mask)
 		return ops->get_required_mask(dev);
@@ -499,7 +499,7 @@ void *dma_alloc_attrs(struct device *dev, size_t size, dma_addr_t *dma_handle,
 	/* let the implementation decide on the zone to allocate from: */
 	flag &= ~(__GFP_DMA | __GFP_DMA32 | __GFP_HIGHMEM);
 
-	if (dma_alloc_direct(dev, ops))
+	if (dma_coherent_direct(dev, ops))
 		cpu_addr = dma_direct_alloc(dev, size, dma_handle, flag, attrs);
 	else if (ops->alloc)
 		cpu_addr = ops->alloc(dev, size, dma_handle, flag, attrs);
@@ -531,7 +531,7 @@ void dma_free_attrs(struct device *dev, size_t size, void *cpu_addr,
 		return;
 
 	debug_dma_free_coherent(dev, size, cpu_addr, dma_handle);
-	if (dma_alloc_direct(dev, ops))
+	if (dma_coherent_direct(dev, ops))
 		dma_direct_free(dev, size, cpu_addr, dma_handle, attrs);
 	else if (ops->free)
 		ops->free(dev, size, cpu_addr, dma_handle, attrs);
@@ -550,7 +550,7 @@ static struct page *__dma_alloc_pages(struct device *dev, size_t size,
         gfp &= ~(__GFP_DMA | __GFP_DMA32 | __GFP_HIGHMEM);
 
 	size = PAGE_ALIGN(size);
-	if (dma_alloc_direct(dev, ops))
+	if (dma_streaming_direct(dev, ops))
 		return dma_direct_alloc_pages(dev, size, dma_handle, dir, gfp);
 	if (!ops->alloc_pages)
 		return NULL;
@@ -574,7 +574,7 @@ static void __dma_free_pages(struct device *dev, size_t size, struct page *page,
 	const struct dma_map_ops *ops = get_dma_ops(dev);
 
 	size = PAGE_ALIGN(size);
-	if (dma_alloc_direct(dev, ops))
+	if (dma_streaming_direct(dev, ops))
 		dma_direct_free_pages(dev, size, page, dma_handle, dir);
 	else if (ops->free_pages)
 		ops->free_pages(dev, size, page, dma_handle, dir);
@@ -769,7 +769,7 @@ size_t dma_max_mapping_size(struct device *dev)
 	const struct dma_map_ops *ops = get_dma_ops(dev);
 	size_t size = SIZE_MAX;
 
-	if (dma_map_direct(dev, ops))
+	if (dma_streaming_direct(dev, ops))
 		size = dma_direct_max_mapping_size(dev);
 	else if (ops && ops->max_mapping_size)
 		size = ops->max_mapping_size(dev);
@@ -782,7 +782,7 @@ bool dma_need_sync(struct device *dev, dma_addr_t dma_addr)
 {
 	const struct dma_map_ops *ops = get_dma_ops(dev);
 
-	if (dma_map_direct(dev, ops))
+	if (dma_streaming_direct(dev, ops))
 		return dma_direct_need_sync(dev, dma_addr);
 	return ops->sync_single_for_cpu || ops->sync_single_for_device;
 }
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 74+ messages in thread

* Re: [PATCH 1/2] dma-mapping: check dma_mask for streaming mapping allocs
  2022-02-22 13:41             ` [PATCH 1/2] dma-mapping: check dma_mask for streaming mapping allocs Baoquan He
@ 2022-02-22 15:59               ` Christoph Hellwig
  2022-02-23  0:28                 ` Baoquan He
  0 siblings, 1 reply; 74+ messages in thread
From: Christoph Hellwig @ 2022-02-22 15:59 UTC (permalink / raw)
  To: Baoquan He
  Cc: Christoph Hellwig, linux-kernel, linux-mm, akpm, cl, 42.hyeyoo,
	penberg, rientjes, iamjoonsoo.kim, vbabka, David.Laight, david,
	herbert, davem, linux-crypto, steffen.klassert, netdev, hca, gor,
	agordeev, borntraeger, svens, linux-s390, michael, linux-i2c,
	wsa

On Tue, Feb 22, 2022 at 09:41:43PM +0800, Baoquan He wrote:
> For newly added streaming mapping APIs, the internal core function
> __dma_alloc_pages() should check dev->dma_mask, but not
> dev->coherent_dma_mask which is for coherent mapping.

No, this is wrong.  dev->coherent_dma_mask is and should be used here.

>
> 
> Meanwhile, just filter out gfp flags if they are any of
> __GFP_DMA, __GFP_DMA32 and __GFP_HIGHMEM, but not fail it. This change
> makes it consistent with coherent mapping allocs.

This is wrong as well.  We want to eventually fail dma_alloc_coherent
for these, too.  It just needs more work.

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 2/2] kernel/dma: rename dma_alloc_direct and dma_map_direct
  2022-02-22 13:42             ` [PATCH 2/2] kernel/dma: rename dma_alloc_direct and dma_map_direct Baoquan He
@ 2022-02-22 15:59               ` Christoph Hellwig
  0 siblings, 0 replies; 74+ messages in thread
From: Christoph Hellwig @ 2022-02-22 15:59 UTC (permalink / raw)
  To: Baoquan He
  Cc: Christoph Hellwig, linux-kernel, linux-mm, akpm, cl, 42.hyeyoo,
	penberg, rientjes, iamjoonsoo.kim, vbabka, David.Laight, david,
	herbert, davem, linux-crypto, steffen.klassert, netdev, hca, gor,
	agordeev, borntraeger, svens, linux-s390, michael, linux-i2c,
	wsa

On Tue, Feb 22, 2022 at 09:42:22PM +0800, Baoquan He wrote:
> In the old dma mapping, coherent mapping uses dma_alloc_coherent() to
> allocate DMA buffer and mapping; while streaming mapping can only get
> memory from slab or buddy allocator, then map with dma_map_single().
> In that situation, dma_alloc_direct() checks a direct mapping for
> coherent DMA, dma_map_direct() checks a direct mapping for streaming
> DMA.
> 
> However, several new APIs have been added for streaming mapping, e.g
> dma_alloc_pages(). These new APIs take care of DMA buffer allocating
> and mapping which are similar with dma_alloc_coherent(). So we should
> rename both of them to reflect their real intention to avoid confusion.
> 
>        dma_alloc_direct()  ==>  dma_coherent_direct()
>        dma_map_direct()    ==>  dma_streaming_direct()

No, these new names are highly misleading.

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 1/2] dma-mapping: check dma_mask for streaming mapping allocs
  2022-02-22 15:59               ` Christoph Hellwig
@ 2022-02-23  0:28                 ` Baoquan He
  2022-02-23 14:25                   ` Christoph Hellwig
  0 siblings, 1 reply; 74+ messages in thread
From: Baoquan He @ 2022-02-23  0:28 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-kernel, linux-mm, akpm, cl, 42.hyeyoo, penberg, rientjes,
	iamjoonsoo.kim, vbabka, David.Laight, david, herbert, davem,
	linux-crypto, steffen.klassert, netdev, hca, gor, agordeev,
	borntraeger, svens, linux-s390, michael, linux-i2c, wsa

On 02/22/22 at 04:59pm, Christoph Hellwig wrote:
> On Tue, Feb 22, 2022 at 09:41:43PM +0800, Baoquan He wrote:
> > For newly added streaming mapping APIs, the internal core function
> > __dma_alloc_pages() should check dev->dma_mask, but not
> > dev->coherent_dma_mask which is for coherent mapping.
> 
> No, this is wrong.  dev->coherent_dma_mask is and should be used here.

Could you explain why this is wrong? According to
Documentation/core-api/dma-api.rst and the DMA code, __dma_alloc_pages() is
the core function behind dma_alloc_pages()/dma_alloc_noncoherent(), which are
obviously streaming mappings, so why do we need to check
dev->coherent_dma_mask here? Is it because dev->coherent_dma_mask is a subset
of dev->dma_mask, so it is safer to use dev->coherent_dma_mask in these
places? This is confusing. I talked to Hyeonggon in private mail, and he has
the same feeling.

> 
> >
> > 
> > Meanwhile, just filter out gfp flags if they are any of
> > __GFP_DMA, __GFP_DMA32 and __GFP_HIGHMEM, but not fail it. This change
> > makes it consistent with coherent mapping allocs.
> 
> This is wrong as well.  We want to eventually fail dma_alloc_coherent
> for these, too.  It just needs more work.
> 


^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 1/2] dma-mapping: check dma_mask for streaming mapping allocs
  2022-02-23  0:28                 ` Baoquan He
@ 2022-02-23 14:25                   ` Christoph Hellwig
  2022-02-23 14:57                     ` David Laight
  2022-02-24 14:11                     ` Baoquan He
  0 siblings, 2 replies; 74+ messages in thread
From: Christoph Hellwig @ 2022-02-23 14:25 UTC (permalink / raw)
  To: Baoquan He
  Cc: Christoph Hellwig, linux-kernel, linux-mm, akpm, cl, 42.hyeyoo,
	penberg, rientjes, iamjoonsoo.kim, vbabka, David.Laight, david,
	herbert, davem, linux-crypto, steffen.klassert, netdev, hca, gor,
	agordeev, borntraeger, svens, linux-s390, michael, linux-i2c,
	wsa

On Wed, Feb 23, 2022 at 08:28:13AM +0800, Baoquan He wrote:
> Could you tell more why this is wrong? According to
> Documentation/core-api/dma-api.rst and DMA code, __dma_alloc_pages() is
> the core function of dma_alloc_pages()/dma_alloc_noncoherent() which are
> obviously streaming mapping,

Why are they "obviously" streaming mappings?

> why do we need to check
> dev->coherent_dma_mask here? Because dev->coherent_dma_mask is the subset
> of dev->dma_mask, it's safer to use dev->coherent_dma_mask in these
> places? This is confusing, I talked to Hyeonggon in private mail, he has
> the same feeling.

Think of the coherent_dma_mask as dma_alloc_mask.  It is the mask for the
DMA memory allocator.  dma_mask is the mask for the dma_map_* routines.

^ permalink raw reply	[flat|nested] 74+ messages in thread

* RE: [PATCH 1/2] dma-mapping: check dma_mask for streaming mapping allocs
  2022-02-23 14:25                   ` Christoph Hellwig
@ 2022-02-23 14:57                     ` David Laight
  2022-02-24 14:11                     ` Baoquan He
  1 sibling, 0 replies; 74+ messages in thread
From: David Laight @ 2022-02-23 14:57 UTC (permalink / raw)
  To: 'Christoph Hellwig', Baoquan He
  Cc: linux-kernel, linux-mm, akpm, cl, 42.hyeyoo, penberg, rientjes,
	iamjoonsoo.kim, vbabka, david, herbert, davem, linux-crypto,
	steffen.klassert, netdev, hca, gor, agordeev, borntraeger, svens,
	linux-s390, michael, linux-i2c, wsa

From: Christoph Hellwig
> Sent: 23 February 2022 14:26
> 
> On Wed, Feb 23, 2022 at 08:28:13AM +0800, Baoquan He wrote:
> > Could you tell more why this is wrong? According to
> > Documentation/core-api/dma-api.rst and DMA code, __dma_alloc_pages() is
> > the core function of dma_alloc_pages()/dma_alloc_noncoherent() which are
> > obviously streaming mapping,
> 
> Why are they "obviously" streaming mappings?
> 
> > why do we need to check
> > dev->coherent_dma_mask here? Because dev->coherent_dma_mask is the subset
> > of dev->dma_mask, it's safer to use dev->coherent_dma_mask in these
> > places? This is confusing, I talked to Hyeonggon in private mail, he has
> > the same feeling.
> 
> Think of the coherent_dma_mask as dma_alloc_mask.  It is the mask for the
> DMA memory allocator.  dma_mask is the mask for the dma_map_* routines.

I suspect it is all to allow for things like:
- A 64bit system with memory above 4G.
- A device that can only generate 32bit addresses.
- Some feature of the memory system (or bus bridges) that restricts
  cache snooping to the low 1G of address space.

So dma_alloc_coherent() has to allocate memory below 1G.
The dma_map functions have to use bounce-buffers for addresses
  above 4G.
dma_alloc_noncoherent() can allocate anything below 4G and so
  avoid bounce buffers later on.
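
The scenario above can be sketched as a small userspace C model. This is
only an illustration under stated assumptions: struct toy_dev and the
toy_*() helpers are hypothetical names, not kernel APIs. It assumes a device
with a 32-bit dma_mask and a 30-bit (1G) coherent_dma_mask, and follows the
reading given earlier in the thread that coherent_dma_mask limits the DMA
memory allocator while dma_mask limits the dma_map_* routines. It does not
model dma_alloc_noncoherent(), whose mask is the point under discussion.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the two masks kept per device. */
struct toy_dev {
	uint64_t dma_mask;          /* limit applied to dma_map_*() style mappings */
	uint64_t coherent_dma_mask; /* limit applied to the DMA memory allocator */
};

/* Build a mask with the low 'bits' bits set, in the spirit of DMA_BIT_MASK(). */
static uint64_t toy_bit_mask(unsigned int bits)
{
	assert(bits >= 1 && bits <= 64);
	return bits == 64 ? ~0ULL : (1ULL << bits) - 1;
}

/* True if the whole buffer [addr, addr + size) lies at or below 'mask'. */
static bool toy_addr_fits(uint64_t addr, uint64_t size, uint64_t mask)
{
	assert(size > 0);
	if (addr > mask)
		return false;
	/* Written this way so that addr + size cannot overflow. */
	return size - 1 <= mask - addr;
}

/* A mapping needs a bounce buffer when the device cannot address the buffer. */
static bool toy_map_needs_bounce(const struct toy_dev *dev,
				 uint64_t addr, uint64_t size)
{
	return !toy_addr_fits(addr, size, dev->dma_mask);
}

/* An allocation is acceptable only if it fits under the allocator mask. */
static bool toy_alloc_ok(const struct toy_dev *dev,
			 uint64_t addr, uint64_t size)
{
	return toy_addr_fits(addr, size, dev->coherent_dma_mask);
}
```

With dma_mask = toy_bit_mask(32) and coherent_dma_mask = toy_bit_mask(30),
a page at 2G is rejected by toy_alloc_ok() but can be mapped without
bouncing, while a page at 4G needs a bounce buffer.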

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)


^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 00/22] Don't use kmalloc() with GFP_DMA
  2022-02-22  8:44   ` Christoph Hellwig
  2022-02-22 13:12     ` Baoquan He
@ 2022-02-23 19:18     ` Heiko Carstens
  2022-02-24  6:33       ` Christoph Hellwig
  1 sibling, 1 reply; 74+ messages in thread
From: Heiko Carstens @ 2022-02-23 19:18 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Baoquan He, linux-kernel, linux-mm, akpm, cl, 42.hyeyoo, penberg,
	rientjes, iamjoonsoo.kim, vbabka, David.Laight, david, herbert,
	davem, linux-crypto, steffen.klassert, netdev, gor, agordeev,
	borntraeger, svens, linux-s390, michael, linux-i2c, wsa,
	Halil Pasic, Vineeth Vijayan

On Tue, Feb 22, 2022 at 09:44:22AM +0100, Christoph Hellwig wrote:
> On Mon, Feb 21, 2022 at 02:57:34PM +0100, Heiko Carstens wrote:
> > > 1) Kmalloc(GFP_DMA) in s390 platform, under arch/s390 and drivers/s390;
> > 
> > So, s390 partially requires GFP_DMA allocations for memory areas which
> > are required by the hardware to be below 2GB. There is not necessarily
> > a device associated when this is required. E.g. some legacy "diagnose"
> > calls require buffers to be below 2GB.
> > 
> > How should something like this be handled? I'd guess that the
> > dma_alloc API is not the right thing to use in such cases. Of course
> > we could say, let's waste memory and use full pages instead, however
> > I'm not sure this is a good idea.
> 
> Yeah, I don't think the DMA API is the right thing for that.  This
> is one of the very rare cases where a raw allocation makes sense.
> 
> That being said being able to drop kmalloc support for GFP_DMA would
> be really useful. How much memory would we waste if switching to the
> page allocator?

At first glance this would not waste much memory, since most callers
seem to allocate such memory pieces only temporarily.

> > The question is: what would this buy us? As stated above I'd assume
> > this comes with quite some code churn, so there should be a good
> > reason to do this.
> 
> There are two steps here.  One is to remove GFP_DMA support from
> kmalloc, which would help to cleanup the slab allocator(s) very nicely,
> as at that point it can stop to be zone aware entirely.

Well, looking at slub.c it looks like there is only a very minimal
maintenance burden for GFP_DMA/GFP_DMA32 support.

> The long term goal is to remove ZONE_DMA entirely at least for
> architectures that only use the small 16MB ISA-style one.  It can
> then be replaced with for example a CMA area and fall into a movable
> zone.  I'd have to prototype this first and see how it applies to the
> s390 case.  It might not be worth it and maybe we should replace
> ZONE_DMA and ZONE_DMA32 with a ZONE_LIMITED for those use cases as
> the amount covered tends to not be totally out of line for what we
> built the zone infrastructure.

So probably I'm missing something; but for small systems where we
would only have ZONE_DMA, how would a CMA area within this zone
improve things?

If I'm not mistaken then the page allocator will not fall back to any
CMA area for GFP_KERNEL allocations. That is: we would somehow need to
find "the right size" for the CMA area, depending on memory size. This
looks like a new problem class which currently does not exist.

Besides that we would also not have all the debugging options provided
by the slab allocator anymore.

Anyway, maybe it would make more sense if you would send your patch
and then we can see where we would end up.

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 00/22] Don't use kmalloc() with GFP_DMA
  2022-02-23 19:18     ` Heiko Carstens
@ 2022-02-24  6:33       ` Christoph Hellwig
  0 siblings, 0 replies; 74+ messages in thread
From: Christoph Hellwig @ 2022-02-24  6:33 UTC (permalink / raw)
  To: Heiko Carstens
  Cc: Christoph Hellwig, Baoquan He, linux-kernel, linux-mm, akpm, cl,
	42.hyeyoo, penberg, rientjes, iamjoonsoo.kim, vbabka,
	David.Laight, david, herbert, davem, linux-crypto,
	steffen.klassert, netdev, gor, agordeev, borntraeger, svens,
	linux-s390, michael, linux-i2c, wsa, Halil Pasic,
	Vineeth Vijayan

On Wed, Feb 23, 2022 at 08:18:08PM +0100, Heiko Carstens wrote:
> > The long term goal is to remove ZONE_DMA entirely at least for
> > architectures that only use the small 16MB ISA-style one.  It can
> > then be replaced with for example a CMA area and fall into a movable
> > zone.  I'd have to prototype this first and see how it applies to the
> > s390 case.  It might not be worth it and maybe we should replace
> > ZONE_DMA and ZONE_DMA32 with a ZONE_LIMITED for those use cases as
> > the amount covered tends to not be totally out of line for what we
> > built the zone infrastructure.
> 
> So probably I'm missing something; but for small systems where we
> would only have ZONE_DMA, how would a CMA area within this zone
> improve things?

It would not, but more importantly we would not need it at all.  The
thinking here is really about the nasty 16MB ISA-style zone DMA.
A 31-bit zone is something rather different.

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 1/2] dma-mapping: check dma_mask for streaming mapping allocs
  2022-02-23 14:25                   ` Christoph Hellwig
  2022-02-23 14:57                     ` David Laight
@ 2022-02-24 14:11                     ` Baoquan He
  2022-02-24 14:27                       ` David Laight
  1 sibling, 1 reply; 74+ messages in thread
From: Baoquan He @ 2022-02-24 14:11 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-kernel, linux-mm, akpm, cl, 42.hyeyoo, penberg, rientjes,
	iamjoonsoo.kim, vbabka, David.Laight, david, herbert, davem,
	linux-crypto, steffen.klassert, netdev, hca, gor, agordeev,
	borntraeger, svens, linux-s390, michael, linux-i2c, wsa

On 02/23/22 at 03:25pm, Christoph Hellwig wrote:
> On Wed, Feb 23, 2022 at 08:28:13AM +0800, Baoquan He wrote:
> > Could you tell more why this is wrong? According to
> > Documentation/core-api/dma-api.rst and DMA code, __dma_alloc_pages() is
> > the core function of dma_alloc_pages()/dma_alloc_noncoherent() which are
> > obviously streaming mapping,
> 
> Why are they "obviously" streaming mappings?

Because they are obviously not coherent mappings?

As I understand it, there are two kinds of DMA mapping: coherent
mapping (which is also persistent mapping) and streaming mapping. A
coherent mapping is set up during driver init and released during
driver de-init, while a streaming mapping is set up whenever it is
needed and released after use.

Are we going to add another kind of mapping? One that is not a streaming
mapping, but uses dev->coherent_dma_mask just because it goes through the
dma_alloc_xxx() API?

> 
> > why do we need to check
> > dev->coherent_dma_mask here? Because dev->coherent_dma_mask is the subset
> > of dev->dma_mask, it's safer to use dev->coherent_dma_mask in these
> > places? This is confusing, I talked to Hyeonggon in private mail, he has
> > the same feeling.
> 
> Think of the coherent_dma_mask as dma_alloc_mask.  It is the mask for the
> DMA memory allocator.  dma_mask is the mask for the dma_map_* routines.

I will check the code further. This may need to be noted in the docs, e.g.
dma-api.rst or dma-api-howto.rst.

If you have guidance, I can try to add some words to make this clear, or
leave it to people who know this well. I believe it would be very
helpful for understanding the DMA API.


^ permalink raw reply	[flat|nested] 74+ messages in thread

* RE: [PATCH 1/2] dma-mapping: check dma_mask for streaming mapping allocs
  2022-02-24 14:11                     ` Baoquan He
@ 2022-02-24 14:27                       ` David Laight
  2022-02-25 15:39                         ` 'Baoquan He'
  0 siblings, 1 reply; 74+ messages in thread
From: David Laight @ 2022-02-24 14:27 UTC (permalink / raw)
  To: 'Baoquan He', Christoph Hellwig
  Cc: linux-kernel, linux-mm, akpm, cl, 42.hyeyoo, penberg, rientjes,
	iamjoonsoo.kim, vbabka, david, herbert, davem, linux-crypto,
	steffen.klassert, netdev, hca, gor, agordeev, borntraeger, svens,
	linux-s390, michael, linux-i2c, wsa

From: Baoquan He
> Sent: 24 February 2022 14:11
...
> With my understanding, there are two kinds of DMA mapping, coherent
> mapping (which is also persistent mapping), and streaming mapping. The
> coherent mapping will be handled during driver init, and released during
> driver de-init. While streaming mapping will be done when needed at any
> time, and released after usage.

The lifetime has absolutely nothing to do with it.

It is all about how the DMA cycles (from the device) interact with
(or more don't interact with) the cpu memory cache.

For coherent mapping the cpu and device can write to (different)
words in the same cache line at the same time, and both will see
both updates.
On some systems this can only be achieved by making the memory
uncached - which significantly slows down cpu access.

For non-coherent (streaming) mapping the cpu writes back and/or
invalidates the data cache so that the dma read cycles from memory
read the correct data and the cpu re-reads the cache line after
the dma has completed.
They are only really suitable for data buffers.
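
The non-coherent case described above can be sketched as a toy userspace
model. Everything here is an assumption made for illustration: struct
toy_buf and the toy_*() helpers are hypothetical, a second array stands in
for the CPU data cache, and invalidation is modelled as a re-fetch from
memory. In the kernel the corresponding steps are done by calls such as
dma_sync_single_for_device() and dma_sync_single_for_cpu().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_BUF_SIZE 16

/* One buffer seen through two views that can disagree until synced. */
struct toy_buf {
	unsigned char ram[TOY_BUF_SIZE];   /* what the device sees via DMA */
	unsigned char cache[TOY_BUF_SIZE]; /* what the CPU sees via its cache */
};

static void toy_init(struct toy_buf *b)
{
	memset(b, 0, sizeof(*b));
}

/* A CPU store lands in the cache and is not yet visible to the device. */
static void toy_cpu_write(struct toy_buf *b, size_t off, unsigned char v)
{
	assert(off < TOY_BUF_SIZE);
	b->cache[off] = v;
}

static unsigned char toy_cpu_read(const struct toy_buf *b, size_t off)
{
	assert(off < TOY_BUF_SIZE);
	return b->cache[off];
}

/* A device DMA write goes straight to memory, bypassing the CPU cache. */
static void toy_dev_write(struct toy_buf *b, size_t off, unsigned char v)
{
	assert(off < TOY_BUF_SIZE);
	b->ram[off] = v;
}

static unsigned char toy_dev_read(const struct toy_buf *b, size_t off)
{
	assert(off < TOY_BUF_SIZE);
	return b->ram[off];
}

/* Write back dirty data so that DMA reads from memory see CPU stores. */
static void toy_sync_for_device(struct toy_buf *b)
{
	memcpy(b->ram, b->cache, TOY_BUF_SIZE);
}

/* Invalidate so that the CPU re-reads what the device wrote to memory. */
static void toy_sync_for_cpu(struct toy_buf *b)
{
	memcpy(b->cache, b->ram, TOY_BUF_SIZE);
}
```

In this model a CPU write is invisible to toy_dev_read() until
toy_sync_for_device() runs, and a device write is invisible to
toy_cpu_read() until toy_sync_for_cpu() runs. A coherent mapping would
behave as if both views were always the same array.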

	David



^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 1/2] dma-mapping: check dma_mask for streaming mapping allocs
  2022-02-24 14:27                       ` David Laight
@ 2022-02-25 15:39                         ` 'Baoquan He'
  0 siblings, 0 replies; 74+ messages in thread
From: 'Baoquan He' @ 2022-02-25 15:39 UTC (permalink / raw)
  To: David Laight
  Cc: Christoph Hellwig, linux-kernel, linux-mm, akpm, cl, 42.hyeyoo,
	penberg, rientjes, iamjoonsoo.kim, vbabka, david, herbert, davem,
	linux-crypto, steffen.klassert, netdev, hca, gor, agordeev,
	borntraeger, svens, linux-s390, michael, linux-i2c, wsa

On 02/24/22 at 02:27pm, David Laight wrote:
> From: Baoquan He
> > Sent: 24 February 2022 14:11
> ...
> > With my understanding, there are two kinds of DMA mapping, coherent
> > mapping (which is also persistent mapping), and streaming mapping. The
> > coherent mapping will be handled during driver init, and released during
> > driver de-init. While streaming mapping will be done when needed at any
> > time, and released after usage.
> 
> The lifetime has absolutely nothing to do with it.
> 
> It is all about how the DMA cycles (from the device) interact with
> (or more don't interact with) the cpu memory cache.
> 
> For coherent mapping the cpu and device can write to (different)
> words in the same cache line at the same time, and both will see
> both updates.
> On some systems this can only be achieved by making the memory
> uncached - which significantly slows down cpu access.
> 
> For non-coherent (streaming) mapping the cpu writes back and/or
> invalidates the data cache so that the dma read cycles from memory
> read the correct data and the cpu re-reads the cache line after
> the dma has completed.
> They are only really suitable for data buffers.

Thanks for the valuable input. I agree that lifetime is not something we can
rely on to tell them apart. But then how do we explain that
dma_alloc_noncoherent() is not a streaming mapping? Which kind of DMA mapping
is it?

I may be missing something important that is obvious to other people; I will
make time to check.


^ permalink raw reply	[flat|nested] 74+ messages in thread

end of thread, other threads:[~2022-02-25 15:39 UTC | newest]

Thread overview: 74+ messages
2022-02-19  0:51 [PATCH 00/22] Don't use kmalloc() with GFP_DMA Baoquan He
2022-02-19  0:52 ` [PATCH 01/22] parisc: pci-dma: remove stale code and comment Baoquan He
2022-02-19  7:07   ` Christoph Hellwig
2022-02-19  0:52 ` [PATCH 02/22] net: moxa: Don't use GFP_DMA when calling dma_alloc_coherent() Baoquan He
2022-02-19  7:07   ` Christoph Hellwig
2022-02-19  0:52 ` [PATCH 03/22] gpu: ipu-v3: " Baoquan He
2022-02-19  7:07   ` Christoph Hellwig
2022-02-19  0:52 ` [PATCH 04/22] drm/sti: Don't use GFP_DMA when calling dma_alloc_wc() Baoquan He
2022-02-19  7:08   ` Christoph Hellwig
2022-02-19  0:52 ` [PATCH 05/22] sound: n64: Don't use GFP_DMA when calling dma_alloc_coherent() Baoquan He
2022-02-19  7:08   ` Christoph Hellwig
2022-02-19  0:52 ` [PATCH 06/22] fbdev: da8xx: " Baoquan He
2022-02-19  7:08   ` Christoph Hellwig
2022-02-19  0:52 ` [PATCH 07/22] fbdev: mx3fb: Don't use GFP_DMA when calling dma_alloc_wc() Baoquan He
2022-02-19  7:08   ` Christoph Hellwig
2022-02-19  0:52 ` [PATCH 08/22] usb: gadget: lpc32xx_udc: Don't use GFP_DMA when calling dma_alloc_coherent() Baoquan He
2022-02-19  7:09   ` Christoph Hellwig
2022-02-19  0:52 ` [PATCH 09/22] usb: cdns3: " Baoquan He
2022-02-19  7:09   ` Christoph Hellwig
2022-02-19  0:52 ` [PATCH 10/22] uio: pruss: " Baoquan He
2022-02-19  7:09   ` Christoph Hellwig
2022-02-19  0:52 ` [PATCH 11/22] staging: emxx_udc: " Baoquan He
2022-02-19  6:51   ` Wolfram Sang
2022-02-20  1:55     ` Baoquan He
2022-02-19  7:09   ` Christoph Hellwig
2022-02-19  0:52 ` [PATCH 12/22] " Baoquan He
2022-02-19  7:10   ` Christoph Hellwig
2022-02-19  0:52 ` [PATCH 13/22] spi: atmel: " Baoquan He
2022-02-19  7:10   ` Christoph Hellwig
2022-02-19  0:52 ` [PATCH 14/22] spi: spi-ti-qspi: " Baoquan He
2022-02-19  7:12   ` Christoph Hellwig
2022-02-19  0:52 ` [PATCH 15/22] usb: cdns3: Don't use GFP_DMA32 when calling dma_pool_alloc() Baoquan He
2022-02-19  7:13   ` Christoph Hellwig
2022-02-19  0:52 ` [PATCH 16/22] usb: udc: lpc32xx: Don't use GFP_DMA " Baoquan He
2022-02-19  7:13   ` Christoph Hellwig
2022-02-19  0:52 ` [PATCH 17/22] net: marvell: prestera: " Baoquan He
2022-02-19  4:54   ` Jakub Kicinski
2022-02-20  2:06     ` Baoquan He
2022-02-19  7:13   ` Christoph Hellwig
2022-02-19  0:52 ` [PATCH 18/22] net: ethernet: mtk-star-emac: Don't use GFP_DMA when calling dmam_alloc_coherent() Baoquan He
2022-02-19  7:13   ` Christoph Hellwig
2022-02-19  0:52 ` [PATCH 19/22] ethernet: rocker: Use dma_alloc_noncoherent() for dma buffer Baoquan He
2022-02-19  7:14   ` Christoph Hellwig
2022-02-19  0:52 ` [PATCH 20/22] HID: intel-ish-hid: " Baoquan He
2022-02-19  7:14   ` Christoph Hellwig
2022-02-19  0:52 ` [PATCH 21/22] mmc: wbsd: " Baoquan He
2022-02-19  7:17   ` Christoph Hellwig
2022-02-20  8:40     ` Baoquan He
2022-02-22  8:45       ` Christoph Hellwig
2022-02-22  9:14         ` Baoquan He
2022-02-22 13:11           ` Christoph Hellwig
2022-02-22 13:40             ` Baoquan He
2022-02-22 13:41             ` [PATCH 1/2] dma-mapping: check dma_mask for streaming mapping allocs Baoquan He
2022-02-22 15:59               ` Christoph Hellwig
2022-02-23  0:28                 ` Baoquan He
2022-02-23 14:25                   ` Christoph Hellwig
2022-02-23 14:57                     ` David Laight
2022-02-24 14:11                     ` Baoquan He
2022-02-24 14:27                       ` David Laight
2022-02-25 15:39                         ` 'Baoquan He'
2022-02-22 13:42             ` [PATCH 2/2] kernel/dma: rename dma_alloc_direct and dma_map_direct Baoquan He
2022-02-22 15:59               ` Christoph Hellwig
2022-02-19  0:52 ` [PATCH 22/22] mtd: rawnand: Use dma_alloc_noncoherent() for dma buffer Baoquan He
2022-02-19  7:19   ` Christoph Hellwig
2022-02-19 11:18     ` Hyeonggon Yoo
2022-02-22  8:46       ` Christoph Hellwig
2022-02-22  9:06         ` David Laight
2022-02-22 13:16           ` 'Christoph Hellwig'
2022-02-21 13:57 ` [PATCH 00/22] Don't use kmalloc() with GFP_DMA Heiko Carstens
2022-02-22  8:44   ` Christoph Hellwig
2022-02-22 13:12     ` Baoquan He
2022-02-22 13:26       ` Baoquan He
2022-02-23 19:18     ` Heiko Carstens
2022-02-24  6:33       ` Christoph Hellwig
