* [PATCH 0/2] ALSA: memalloc: Fix for Xen PV and non-IOMMU systems
@ 2023-01-24  9:27 Takashi Iwai
  2023-01-24  9:27 ` [PATCH 1/2] ALSA: memalloc: Explicit SG-allocations " Takashi Iwai
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Takashi Iwai @ 2023-01-24  9:27 UTC (permalink / raw)
  To: alsa-devel; +Cc: Marek Marczykowski-Górecki

Hi,

this is a patch series to address the recent regression in the
SG-buffer memory allocation on Xen PV (and possibly non-IOMMU)
systems.  We switched to dma_alloc_noncontiguous() in the hope that it
would handle everything correctly, but it turned out that this doesn't
always work.  So this takes one step back: use explicit SG-buffer
allocations with dma_alloc_coherent() calls again, in a somewhat more
optimized way, and apply them only on those systems.


Takashi

===

Takashi Iwai (2):
  ALSA: memalloc: Explicit SG-allocations for Xen PV and non-IOMMU
    systems
  ALSA: memalloc: Use coherent DMA allocation for fallback again

 sound/core/memalloc.c | 68 +++++++++++++++++++++++++++++++++----------
 1 file changed, 53 insertions(+), 15 deletions(-)

-- 
2.35.3



* [PATCH 1/2] ALSA: memalloc: Explicit SG-allocations for Xen PV and non-IOMMU systems
  2023-01-24  9:27 [PATCH 0/2] ALSA: memalloc: Fix for Xen PV and non-IOMMU systems Takashi Iwai
@ 2023-01-24  9:27 ` Takashi Iwai
  2023-01-24  9:27 ` [PATCH 2/2] ALSA: memalloc: Use coherent DMA allocation for fallback again Takashi Iwai
  2023-01-25 15:29 ` [PATCH 0/2] ALSA: memalloc: Fix for Xen PV and non-IOMMU systems Takashi Iwai
  2 siblings, 0 replies; 5+ messages in thread
From: Takashi Iwai @ 2023-01-24  9:27 UTC (permalink / raw)
  To: alsa-devel; +Cc: Marek Marczykowski-Górecki

For non-IOMMU systems, it makes little sense (and may even be buggy)
to use dma_alloc_noncontiguous() for the x86 SG-buffer allocation, as
it always results in large contiguous pages instead of an SG buffer.
Likewise, on Xen PV this doesn't work well, since the pages allocated
there aren't guaranteed to be coherent.

This patch is a first step toward addressing those problems.  It
changes the memalloc helper so that such platforms go through the
explicit SG-page allocation via the existing fallback code instead of
dma_alloc_noncontiguous().
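
In condensed form, the decision this patch introduces in
snd_dma_noncontig_alloc() looks roughly like the sketch below; this is
only a simplified paraphrase of the hunk that follows (error handling
and the WC/non-WC type switch are omitted):

	if (cpu_feature_enabled(X86_FEATURE_XENPV) ||	/* Xen PV: pages may not be coherent */
	    !get_dma_ops(dmab->dev.dev)) {		/* direct DMA, no IOMMU: noncontig alloc
							 * would give contiguous pages anyway */
		/* use the explicit SG-page fallback allocator instead */
		return snd_dma_sg_fallback_alloc(dmab, size);
	}
	/* otherwise keep using dma_alloc_noncontiguous() as before */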

Fixes: a8d302a0b770 ("ALSA: memalloc: Revive x86-specific WC page allocations again")
Fixes: 9736a325137b ("ALSA: memalloc: Don't fall back for SG-buffer with IOMMU")
Reported-and-tested-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Link: https://lore.kernel.org/r/87tu256lqs.wl-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
---
 sound/core/memalloc.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/sound/core/memalloc.c b/sound/core/memalloc.c
index 81025f50a542..30c9ad192986 100644
--- a/sound/core/memalloc.c
+++ b/sound/core/memalloc.c
@@ -541,10 +541,9 @@ static void *snd_dma_noncontig_alloc(struct snd_dma_buffer *dmab, size_t size)
 	struct sg_table *sgt;
 	void *p;
 
-	sgt = dma_alloc_noncontiguous(dmab->dev.dev, size, dmab->dev.dir,
-				      DEFAULT_GFP, 0);
 #ifdef CONFIG_SND_DMA_SGBUF
-	if (!sgt && !get_dma_ops(dmab->dev.dev)) {
+	if (cpu_feature_enabled(X86_FEATURE_XENPV) ||
+	    !get_dma_ops(dmab->dev.dev)) {
 		if (dmab->dev.type == SNDRV_DMA_TYPE_DEV_WC_SG)
 			dmab->dev.type = SNDRV_DMA_TYPE_DEV_WC_SG_FALLBACK;
 		else
@@ -552,6 +551,8 @@ static void *snd_dma_noncontig_alloc(struct snd_dma_buffer *dmab, size_t size)
 		return snd_dma_sg_fallback_alloc(dmab, size);
 	}
 #endif
+	sgt = dma_alloc_noncontiguous(dmab->dev.dev, size, dmab->dev.dir,
+				      DEFAULT_GFP, 0);
 	if (!sgt)
 		return NULL;
 
-- 
2.35.3



* [PATCH 2/2] ALSA: memalloc: Use coherent DMA allocation for fallback again
  2023-01-24  9:27 [PATCH 0/2] ALSA: memalloc: Fix for Xen PV and non-IOMMU systems Takashi Iwai
  2023-01-24  9:27 ` [PATCH 1/2] ALSA: memalloc: Explicit SG-allocations " Takashi Iwai
@ 2023-01-24  9:27 ` Takashi Iwai
  2023-01-25 15:29 ` [PATCH 0/2] ALSA: memalloc: Fix for Xen PV and non-IOMMU systems Takashi Iwai
  2 siblings, 0 replies; 5+ messages in thread
From: Takashi Iwai @ 2023-01-24  9:27 UTC (permalink / raw)
  To: alsa-devel; +Cc: Marek Marczykowski-Górecki

Commit a8d302a0b770 ("ALSA: memalloc: Revive x86-specific WC page
allocations again") switched the memory allocation for the fallback
cases of the noncontig type to the standard alloc_pages*(), whereas
dma_alloc_coherent() had been used before.  The reason was that the
page address derived from the virtual pointer returned by
dma_alloc_coherent() can't be used on IOMMU systems.  Meanwhile, the
later commit 9736a325137b ("ALSA: memalloc: Don't fall back for
SG-buffer with IOMMU") explicitly disabled the fallback allocation for
IOMMU systems altogether; that is, using dma_alloc_coherent() should
be OK again.

Now we've received reports that the current fallback page allocation
causes a regression on Xen (and possibly other) systems: the sound
disappears partially or completely.  So it's time to switch back to
dma_alloc_coherent().

This patch returns to dma_alloc_coherent() for the fallback
allocations.  Unlike the previous implementation, the allocation is
done in a more optimized way that tries larger chunks first.  Also,
the proper DMA address obtained from dma_alloc_coherent() is returned.

The page count of each chunk is stored in the lower bits of the
chunk's first DMA address; the get_addr memalloc op masks those bits
off and recalculates the proper address in return.
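
As a rough illustration of the packing arithmetic only (not part of
the patch), here is a minimal stand-alone model; it assumes 4 KiB
pages, models dma_addr_t as a plain 64-bit integer, and the helper
names pack_addr()/get_addr() are made up for the example:

	#include <stdint.h>
	#include <stdio.h>

	#define PAGE_SHIFT	12
	#define PAGE_SIZE	(UINT64_C(1) << PAGE_SHIFT)
	#define PAGE_MASK	(~(PAGE_SIZE - 1))

	/* store the chunk's page count in the low bits of its
	 * (page-aligned) first DMA address */
	static uint64_t pack_addr(uint64_t dma_addr, uint64_t npages)
	{
		return (dma_addr & PAGE_MASK) | npages;
	}

	/* what the get_addr op does: mask the count off and re-apply
	 * the in-page offset */
	static uint64_t get_addr(uint64_t packed, uint64_t offset)
	{
		return (packed & PAGE_MASK) | (offset & ~PAGE_MASK);
	}

	int main(void)
	{
		/* 4-page chunk at DMA address 0x12345000 */
		uint64_t packed = pack_addr(0x12345000, 4);

		/* prints 4 */
		printf("npages = %llu\n",
		       (unsigned long long)(packed & ~PAGE_MASK));
		/* prints 0x12345234 */
		printf("addr   = %#llx\n",
		       (unsigned long long)get_addr(packed, 0x234));
		return 0;
	}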

Fixes: a8d302a0b770 ("ALSA: memalloc: Revive x86-specific WC page allocations again")
Fixes: 9736a325137b ("ALSA: memalloc: Don't fall back for SG-buffer with IOMMU")
Reported-and-tested-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Link: https://lore.kernel.org/r/87tu256lqs.wl-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
---
 sound/core/memalloc.c | 61 ++++++++++++++++++++++++++++++++++---------
 1 file changed, 49 insertions(+), 12 deletions(-)

diff --git a/sound/core/memalloc.c b/sound/core/memalloc.c
index 30c9ad192986..5fe21c0080dc 100644
--- a/sound/core/memalloc.c
+++ b/sound/core/memalloc.c
@@ -720,17 +720,31 @@ static const struct snd_malloc_ops snd_dma_sg_wc_ops = {
 struct snd_dma_sg_fallback {
 	size_t count;
 	struct page **pages;
+	/* DMA address array; the first page contains #pages in ~PAGE_MASK */
+	dma_addr_t *addrs;
 };
 
 static void __snd_dma_sg_fallback_free(struct snd_dma_buffer *dmab,
 				       struct snd_dma_sg_fallback *sgbuf)
 {
-	bool wc = dmab->dev.type == SNDRV_DMA_TYPE_DEV_WC_SG_FALLBACK;
-	size_t i;
-
-	for (i = 0; i < sgbuf->count && sgbuf->pages[i]; i++)
-		do_free_pages(page_address(sgbuf->pages[i]), PAGE_SIZE, wc);
+	size_t i, size;
+
+	if (sgbuf->pages && sgbuf->addrs) {
+		i = 0;
+		while (i < sgbuf->count) {
+			if (!sgbuf->pages[i] || !sgbuf->addrs[i])
+				break;
+			size = sgbuf->addrs[i] & ~PAGE_MASK;
+			if (WARN_ON(!size))
+				break;
+			dma_free_coherent(dmab->dev.dev, size << PAGE_SHIFT,
+					  page_address(sgbuf->pages[i]),
+					  sgbuf->addrs[i] & PAGE_MASK);
+			i += size;
+		}
+	}
 	kvfree(sgbuf->pages);
+	kvfree(sgbuf->addrs);
 	kfree(sgbuf);
 }
 
@@ -739,9 +753,9 @@ static void *snd_dma_sg_fallback_alloc(struct snd_dma_buffer *dmab, size_t size)
 	struct snd_dma_sg_fallback *sgbuf;
 	struct page **pagep, *curp;
 	size_t chunk, npages;
+	dma_addr_t *addrp;
 	dma_addr_t addr;
 	void *p;
-	bool wc = dmab->dev.type == SNDRV_DMA_TYPE_DEV_WC_SG_FALLBACK;
 
 	sgbuf = kzalloc(sizeof(*sgbuf), GFP_KERNEL);
 	if (!sgbuf)
@@ -749,14 +763,16 @@ static void *snd_dma_sg_fallback_alloc(struct snd_dma_buffer *dmab, size_t size)
 	size = PAGE_ALIGN(size);
 	sgbuf->count = size >> PAGE_SHIFT;
 	sgbuf->pages = kvcalloc(sgbuf->count, sizeof(*sgbuf->pages), GFP_KERNEL);
-	if (!sgbuf->pages)
+	sgbuf->addrs = kvcalloc(sgbuf->count, sizeof(*sgbuf->addrs), GFP_KERNEL);
+	if (!sgbuf->pages || !sgbuf->addrs)
 		goto error;
 
 	pagep = sgbuf->pages;
-	chunk = size;
+	addrp = sgbuf->addrs;
+	chunk = (PAGE_SIZE - 1) << PAGE_SHIFT; /* to fit in low bits in addrs */
 	while (size > 0) {
 		chunk = min(size, chunk);
-		p = do_alloc_pages(dmab->dev.dev, chunk, &addr, wc);
+		p = dma_alloc_coherent(dmab->dev.dev, chunk, &addr, DEFAULT_GFP);
 		if (!p) {
 			if (chunk <= PAGE_SIZE)
 				goto error;
@@ -768,17 +784,25 @@ static void *snd_dma_sg_fallback_alloc(struct snd_dma_buffer *dmab, size_t size)
 		size -= chunk;
 		/* fill pages */
 		npages = chunk >> PAGE_SHIFT;
+		*addrp = npages; /* store in lower bits */
 		curp = virt_to_page(p);
-		while (npages--)
+		while (npages--) {
 			*pagep++ = curp++;
+			*addrp++ |= addr;
+			addr += PAGE_SIZE;
+		}
 	}
 
 	p = vmap(sgbuf->pages, sgbuf->count, VM_MAP, PAGE_KERNEL);
 	if (!p)
 		goto error;
+
+	if (dmab->dev.type == SNDRV_DMA_TYPE_DEV_WC_SG_FALLBACK)
+		set_pages_array_wc(sgbuf->pages, sgbuf->count);
+
 	dmab->private_data = sgbuf;
 	/* store the first page address for convenience */
-	dmab->addr = snd_sgbuf_get_addr(dmab, 0);
+	dmab->addr = sgbuf->addrs[0] & PAGE_MASK;
 	return p;
 
  error:
@@ -788,10 +812,23 @@ static void *snd_dma_sg_fallback_alloc(struct snd_dma_buffer *dmab, size_t size)
 
 static void snd_dma_sg_fallback_free(struct snd_dma_buffer *dmab)
 {
+	struct snd_dma_sg_fallback *sgbuf = dmab->private_data;
+
 	vunmap(dmab->area);
+	if (dmab->dev.type == SNDRV_DMA_TYPE_DEV_WC_SG_FALLBACK)
+		set_pages_array_wb(sgbuf->pages, sgbuf->count);
 	__snd_dma_sg_fallback_free(dmab, dmab->private_data);
 }
 
+static dma_addr_t snd_dma_sg_fallback_get_addr(struct snd_dma_buffer *dmab,
+					       size_t offset)
+{
+	struct snd_dma_sg_fallback *sgbuf = dmab->private_data;
+	size_t index = offset >> PAGE_SHIFT;
+
+	return (sgbuf->addrs[index] & PAGE_MASK) | (offset & ~PAGE_MASK);
+}
+
 static int snd_dma_sg_fallback_mmap(struct snd_dma_buffer *dmab,
 				    struct vm_area_struct *area)
 {
@@ -806,8 +843,8 @@ static const struct snd_malloc_ops snd_dma_sg_fallback_ops = {
 	.alloc = snd_dma_sg_fallback_alloc,
 	.free = snd_dma_sg_fallback_free,
 	.mmap = snd_dma_sg_fallback_mmap,
+	.get_addr = snd_dma_sg_fallback_get_addr,
 	/* reuse vmalloc helpers */
-	.get_addr = snd_dma_vmalloc_get_addr,
 	.get_page = snd_dma_vmalloc_get_page,
 	.get_chunk_size = snd_dma_vmalloc_get_chunk_size,
 };
-- 
2.35.3



* Re: [PATCH 0/2] ALSA: memalloc: Fix for Xen PV and non-IOMMU systems
  2023-01-24  9:27 [PATCH 0/2] ALSA: memalloc: Fix for Xen PV and non-IOMMU systems Takashi Iwai
  2023-01-24  9:27 ` [PATCH 1/2] ALSA: memalloc: Explicit SG-allocations " Takashi Iwai
  2023-01-24  9:27 ` [PATCH 2/2] ALSA: memalloc: Use coherent DMA allocation for fallback again Takashi Iwai
@ 2023-01-25 15:29 ` Takashi Iwai
  2023-01-25 15:33   ` Marek Marczykowski-Górecki
  2 siblings, 1 reply; 5+ messages in thread
From: Takashi Iwai @ 2023-01-25 15:29 UTC (permalink / raw)
  To: alsa-devel; +Cc: Marek Marczykowski-Górecki

On Tue, 24 Jan 2023 10:27:42 +0100,
Takashi Iwai wrote:
> 
> Hi,
> 
> this is a patch series to address the recent regression in the
> SG-buffer memory allocation on Xen PV (and possibly non-IOMMU)
> systems.  We switched to dma_alloc_noncontiguous() in the hope that it
> would handle everything correctly, but it turned out that this doesn't
> always work.  So this takes one step back: use explicit SG-buffer
> allocations with dma_alloc_coherent() calls again, in a somewhat more
> optimized way, and apply them only on those systems.

It seems that the second patch causes a problem; at least I see
occasional Oopses on my system with the patch applied.  Let's scrap it.

I'll resubmit the fix.  Marek, could you try it then and report back
whether it still works and doesn't break things again?


thanks,

Takashi


* Re: [PATCH 0/2] ALSA: memalloc: Fix for Xen PV and non-IOMMU systems
  2023-01-25 15:29 ` [PATCH 0/2] ALSA: memalloc: Fix for Xen PV and non-IOMMU systems Takashi Iwai
@ 2023-01-25 15:33   ` Marek Marczykowski-Górecki
  0 siblings, 0 replies; 5+ messages in thread
From: Marek Marczykowski-Górecki @ 2023-01-25 15:33 UTC (permalink / raw)
  To: Takashi Iwai; +Cc: alsa-devel


On Wed, Jan 25, 2023 at 04:29:19PM +0100, Takashi Iwai wrote:
> On Tue, 24 Jan 2023 10:27:42 +0100,
> Takashi Iwai wrote:
> > 
> > Hi,
> > 
> > this is a patch series to address the recent regression in the
> > SG-buffer memory allocation on Xen PV (and possibly non-IOMMU)
> > systems.  We switched to dma_alloc_noncontiguous() in the hope that it
> > would handle everything correctly, but it turned out that this doesn't
> > always work.  So this takes one step back: use explicit SG-buffer
> > allocations with dma_alloc_coherent() calls again, in a somewhat more
> > optimized way, and apply them only on those systems.
> 
> It seems that the second patch causes a problem; at least I see
> occasional Oopses on my system with the patch applied.  Let's scrap it.
> 
> I'll resubmit the fix.  Marek, could you try it then and report back
> whether it still works and doesn't break things again?

Sure, just cc me.

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab


