* Re: Intel HD Audio: sound stops working in Xen PV dom0 in >=5.17
       [not found] <Y5KPAs6f7S2dEoxR@mail-itl>
@ 2022-12-09  8:10 ` Takashi Iwai
  2022-12-09 12:40   ` Marek Marczykowski-Górecki
  0 siblings, 1 reply; 25+ messages in thread
From: Takashi Iwai @ 2022-12-09  8:10 UTC (permalink / raw)
  To: Marek Marczykowski-Górecki; +Cc: alsa-devel, Harald Arnesen, Alex Xu

On Fri, 09 Dec 2022 02:27:30 +0100,
Marek Marczykowski-Górecki wrote:
> 
> Hi,
> 
> Under Xen PV dom0, with Linux >= 5.17, sound stops working after a few
> hours. pavucontrol still shows meter bars moving, but the speakers
> remain silent. At least on some occasions I see the following message in
> dmesg:
> 
>   [ 2142.484553] snd_hda_intel 0000:00:1f.3: Unstable LPIB (18144 >= 6396); disabling LPIB delay counting
> 
> I'm not sure if that happens before sound stops working, after, or if
> it's related at all, but that's pretty much the only sound-related error
> I found in the logs.
> When the issue happens, on rare occasions it starts working again later
> for a short time, but generally the fix is to reboot. Reloading all
> snd_* modules (surprisingly) does not help. I don't know what exactly
> triggers the issue; sometimes it happens after a short time, like 15 minutes
> of uptime, but usually after several hours. I guess it depends on the usage
> pattern, but I haven't spotted any specific relation.
> 
> I managed to bisect it to this commit:
> 
>     2c95b92ecd92e784785b1db8cccc4f0f2bfa850c is the first bad commit
>     commit 2c95b92ecd92e784785b1db8cccc4f0f2bfa850c
>     Author: Takashi Iwai <tiwai@suse.de>
>     Date:   Tue Nov 16 08:33:58 2021 +0100
> 
>         ALSA: memalloc: Unify x86 SG-buffer handling (take#3)
>         
>         This is a second attempt to unify the x86-specific SG-buffer handling
>         code with the new standard non-contiguous page handler.
>         
>         The first try (in commit 2d9ea39917a4) failed due to the wrong page
>         and address calculations, hence reverted.  (And the second try failed
>         due to a copy&paste error.)  Now it's corrected with the previous fix
>         for noncontig pages, and the proper sg page iteration by this patch.
>         
>         After the migration, SNDRV_DMA_TYPE_DMA_SG becomes identical with
>         SNDRV_DMA_TYPE_NONCONTIG on x86, while others still fall back to
>         SNDRV_DMA_TYPE_DEV.
>         
>         Tested-by: Alex Xu (Hello71) <alex_y_xu@yahoo.ca>
>         Tested-by: Harald Arnesen <harald@skogtun.org>
>         Link: https://lore.kernel.org/r/20211017074859.24112-4-tiwai@suse.de
>         Link: https://lore.kernel.org/r/20211109062235.22310-1-tiwai@suse.de
>         Link: https://lore.kernel.org/r/20211116073358.19741-1-tiwai@suse.de
>         Signed-off-by: Takashi Iwai <tiwai@suse.de>
> 
>      include/sound/memalloc.h |  14 ++--
>      sound/core/Makefile      |   1 -
>      sound/core/memalloc.c    |  53 ++++++++++++-
>      sound/core/sgbuf.c       | 201 -----------------------------------------------
>      4 files changed, 56 insertions(+), 213 deletions(-)
>      delete mode 100644 sound/core/sgbuf.c
> 
> I've seen further follow ups to this commit, but I still observe this
> issue on Linux 6.0.8.
> 
> I have observed this issue on a KBL-based system, but I've also got
> reports from users of other platforms (including as old as Sandy Bridge).
> 
> I tried to include all relevant information above, but some more details
> can be found in the original report at
> https://github.com/QubesOS/qubes-issues/issues/7465
> 
> Any ideas?

Hm, is it specific to Xen, i.e. if you run the normal kernel on the
same machine, does it still work?

Anyway, please check the behavior with 6.1-rc8 + the commit
cc26516374065a34e10c9a8bf3e940e42cd96e2a
    ALSA: memalloc: Allocate more contiguous pages for fallback case
from the for-next branch of my sound git tree (which will be in 6.2-rc1).

If the problem persists, another thing to check is whether the hack below
works.


thanks,

Takashi

-- 8< --
--- a/sound/pci/hda/hda_intel.c
+++ b/sound/pci/hda/hda_intel.c
@@ -1808,9 +1808,16 @@ static int azx_create(struct snd_card *card, struct pci_dev *pci,
 	if (err < 0)
 		return err;
 
+#if 0
 	/* use the non-cached pages in non-snoop mode */
 	if (!azx_snoop(chip))
 		azx_bus(chip)->dma_type = SNDRV_DMA_TYPE_DEV_WC_SG;
+#else
+	if (!azx_snoop(chip))
+		azx_bus(chip)->dma_type = SNDRV_DMA_TYPE_DEV_SG;
+	else
+		azx_bus(chip)->dma_type = SNDRV_DMA_TYPE_DEV;
+#endif
 
 	if (chip->driver_type == AZX_DRIVER_NVIDIA) {
 		dev_dbg(chip->card->dev, "Enable delay in RIRB handling\n");


* Re: Intel HD Audio: sound stops working in Xen PV dom0 in >=5.17
  2022-12-09  8:10 ` Intel HD Audio: sound stops working in Xen PV dom0 in >=5.17 Takashi Iwai
@ 2022-12-09 12:40   ` Marek Marczykowski-Górecki
  2022-12-10  1:00     ` Marek Marczykowski-Górecki
  0 siblings, 1 reply; 25+ messages in thread
From: Marek Marczykowski-Górecki @ 2022-12-09 12:40 UTC (permalink / raw)
  To: Takashi Iwai; +Cc: alsa-devel, Harald Arnesen, Alex Xu

[-- Attachment #1: Type: text/plain, Size: 5525 bytes --]

On Fri, Dec 09, 2022 at 09:10:19AM +0100, Takashi Iwai wrote:
> On Fri, 09 Dec 2022 02:27:30 +0100,
> Marek Marczykowski-Górecki wrote:
> > 
> > Hi,
> > 
> > Under Xen PV dom0, with Linux >= 5.17, sound stops working after few
> > hours. pavucontrol still shows meter bars moving, but the speakers
> > remain silent. At least on some occasions I see the following message in
> > dmesg:
> > 
> >   [ 2142.484553] snd_hda_intel 0000:00:1f.3: Unstable LPIB (18144 >= 6396); disabling LPIB delay counting
> > 
> > I'm not sure if that happens before sound stops working, after, of if
> > it's related at all, but that's pretty much the only sound-related error
> > I found in logs.
> > When the issue happens, on rare occasions it starts working again later
> > for a short time, but generally the fix is to reboot. Reloading all
> > snd_* modules (surprisingly) do not help. I don't know what exactly
> > triggers the issue, sometimes is happen after short time like 15 minutes
> > uptime, but usually after several hours. I guess it depends on usage
> > pattern, but I haven't spotted any specific relation.
> > 
> > I managed to bisect it to this commit:
> > 
> >     2c95b92ecd92e784785b1db8cccc4f0f2bfa850c is the first bad commit
> >     commit 2c95b92ecd92e784785b1db8cccc4f0f2bfa850c
> >     Author: Takashi Iwai <tiwai@suse.de>
> >     Date:   Tue Nov 16 08:33:58 2021 +0100
> > 
> >         ALSA: memalloc: Unify x86 SG-buffer handling (take#3)
> >         
> >         This is a second attempt to unify the x86-specific SG-buffer handling
> >         code with the new standard non-contiguous page handler.
> >         
> >         The first try (in commit 2d9ea39917a4) failed due to the wrong page
> >         and address calculations, hence reverted.  (And the second try failed
> >         due to a copy&paste error.)  Now it's corrected with the previous fix
> >         for noncontig pages, and the proper sg page iteration by this patch.
> >         
> >         After the migration, SNDRV_DMA_TYPE_DMA_SG becomes identical with
> >         SNDRV_DMA_TYPE_NONCONTIG on x86, while others still fall back to
> >         SNDRV_DMA_TYPE_DEV.
> >         
> >         Tested-by: Alex Xu (Hello71) <alex_y_xu@yahoo.ca>
> >         Tested-by: Harald Arnesen <harald@skogtun.org>
> >         Link: https://lore.kernel.org/r/20211017074859.24112-4-tiwai@suse.de
> >         Link: https://lore.kernel.org/r/20211109062235.22310-1-tiwai@suse.de
> >         Link: https://lore.kernel.org/r/20211116073358.19741-1-tiwai@suse.de
> >         Signed-off-by: Takashi Iwai <tiwai@suse.de>
> > 
> >      include/sound/memalloc.h |  14 ++--
> >      sound/core/Makefile      |   1 -
> >      sound/core/memalloc.c    |  53 ++++++++++++-
> >      sound/core/sgbuf.c       | 201 -----------------------------------------------
> >      4 files changed, 56 insertions(+), 213 deletions(-)
> >      delete mode 100644 sound/core/sgbuf.c
> > 
> > I've seen further follow ups to this commit, but I still observe this
> > issue on Linux 6.0.8.
> > 
> > I have observed this issue on KBL-based system, but I've got reports
> > also from users of other platforms (including as old as Sandy Bridge).
> > 
> > I tried to include all relevant information above, but some more details
> > can be found at original report at
> > https://github.com/QubesOS/qubes-issues/issues/7465
> > 
> > Any ideas?
> 
> Hm, is it specific to Xen, i.e. if you run the normal kernel on the
> same machine, does it still work?

I don't know if that's specific to Xen, but I assume that if it weren't,
there would be a lot more bug reports. I can't think of any other
relevant difference. Unfortunately, I can't run Linux without Xen on
this system long enough to confirm.


> In anyway, please check the behavior with 6.1-rc8 + the commit
> cc26516374065a34e10c9a8bf3e940e42cd96e2a
>     ALSA: memalloc: Allocate more contiguous pages for fallback case
> from for-next of my sound git tree (which will be in 6.2-rc1).

Looking at the mentioned commits, there is one specific aspect of Xen PV
that may be relevant: it configures PAT differently than native Linux.
Theoretically Linux adapts automatically, and using the proper API (like
set_memory_wc()) should just work, but at least for the i915 driver it
causes issues (not fully tracked down yet). That bug report includes
some more background:
https://lore.kernel.org/intel-gfx/Y5Hst0bCxQDTN7lK@mail-itl/

Anyway, I have tested it on a Xen modified to set up PAT the same way as
native Linux, and the audio issue is still there.

> If the problem persists, another thing to check is the hack below
> works.

Thanks, I'll check both and report back.

> thanks,
> 
> Takashi
> 
> -- 8< --
> --- a/sound/pci/hda/hda_intel.c
> +++ b/sound/pci/hda/hda_intel.c
> @@ -1808,9 +1808,16 @@ static int azx_create(struct snd_card *card, struct pci_dev *pci,
>  	if (err < 0)
>  		return err;
>  
> +#if 0
>  	/* use the non-cached pages in non-snoop mode */
>  	if (!azx_snoop(chip))
>  		azx_bus(chip)->dma_type = SNDRV_DMA_TYPE_DEV_WC_SG;
> +#else
> +	if (!azx_snoop(chip))
> +		azx_bus(chip)->dma_type = SNDRV_DMA_TYPE_DEV_SG;
> +	else
> +		azx_bus(chip)->dma_type = SNDRV_DMA_TYPE_DEV;
> +#endif
>  
>  	if (chip->driver_type == AZX_DRIVER_NVIDIA) {
>  		dev_dbg(chip->card->dev, "Enable delay in RIRB handling\n");

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab


* Re: Intel HD Audio: sound stops working in Xen PV dom0 in >=5.17
  2022-12-09 12:40   ` Marek Marczykowski-Górecki
@ 2022-12-10  1:00     ` Marek Marczykowski-Górecki
  2022-12-10 16:17       ` Marek Marczykowski-Górecki
  0 siblings, 1 reply; 25+ messages in thread
From: Marek Marczykowski-Górecki @ 2022-12-10  1:00 UTC (permalink / raw)
  To: Takashi Iwai; +Cc: alsa-devel, Harald Arnesen, Alex Xu

[-- Attachment #1: Type: text/plain, Size: 2470 bytes --]

On Fri, Dec 09, 2022 at 01:40:15PM +0100, Marek Marczykowski-Górecki wrote:
> On Fri, Dec 09, 2022 at 09:10:19AM +0100, Takashi Iwai wrote:
> > On Fri, 09 Dec 2022 02:27:30 +0100,
> > Marek Marczykowski-Górecki wrote:
> > > 
> > > Hi,
> > > 
> > > Under Xen PV dom0, with Linux >= 5.17, sound stops working after few
> > > hours. pavucontrol still shows meter bars moving, but the speakers
> > > remain silent. At least on some occasions I see the following message in
> > > dmesg:
> > > 
> > >   [ 2142.484553] snd_hda_intel 0000:00:1f.3: Unstable LPIB (18144 >= 6396); disabling LPIB delay counting

Hit the issue again; this message did not appear in the log (or at least
not yet).
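
For readers unfamiliar with the quoted dmesg line, the following is a minimal user-space sketch of the kind of sanity check that could print it. It assumes, based only on the two numbers in the log line, that a delay derived from the LPIB position is compared against one period and that LPIB delay counting is switched off once the delay reaches that size. The struct, its fields and lpib_delay_bytes() are hypothetical names for illustration, not the snd_hda_intel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, simplified per-stream state (not the kernel's struct). */
struct lpib_stream {
	unsigned int bufsize;         /* ring buffer size in bytes */
	unsigned int period_bytes;    /* one period in bytes */
	int delay_negative_threshold; /* small negative delays clamp to 0 */
	bool playback;                /* true: playback, false: capture */
	bool count_lpib_delay;        /* cleared once LPIB looks unstable */
};

/*
 * Return the delay in bytes between the reported position and the
 * LPIB position, or 0 if the value is implausible (>= one period).
 */
static int lpib_delay_bytes(struct lpib_stream *s, unsigned int pos,
			    unsigned int lpib_pos)
{
	int delay;

	assert(s != NULL);

	delay = s->playback ? (int)pos - (int)lpib_pos
			    : (int)lpib_pos - (int)pos;
	if (delay < 0) {
		if (delay >= s->delay_negative_threshold)
			delay = 0;
		else
			delay += (int)s->bufsize; /* wrapped around the ring */
	}

	/* Assumed rule: a delay of a full period or more is not credible. */
	if (delay >= (int)s->period_bytes) {
		printf("Unstable LPIB (%d >= %u); disabling LPIB delay counting\n",
		       delay, s->period_bytes);
		s->count_lpib_delay = false;
		delay = 0;
	}
	return delay;
}
```

In this model, a 6396-byte period with a position of 20000 and an LPIB of 1856 gives a delay of 18144 bytes, which trips the check and prints the same text as in the report. If the real check works along these lines, the message only says that the LPIB reading was implausible once, which would fit the observation that the failure can also occur without it.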

(...)

> > In anyway, please check the behavior with 6.1-rc8 + the commit
> > cc26516374065a34e10c9a8bf3e940e42cd96e2a
> >     ALSA: memalloc: Allocate more contiguous pages for fallback case
> > from for-next of my sound git tree (which will be in 6.2-rc1).

This did not help.

> Looking at the mentioned commits, there is one specific aspect of Xen PV
> that may be relevant. It configures PAT differently than native Linux.
> Theoretically Linux adapts automatically and using proper API (like
> set_memory_wc()) should just work, but at least for i915 driver it
> causes issues (not fully tracked down yet). Details about that bug
> report include some more background:
> https://lore.kernel.org/intel-gfx/Y5Hst0bCxQDTN7lK@mail-itl/
> 
> Anyway, I have tested it on a Xen modified to setup PAT the same way as
> native Linux and the audio issue is still there.
> 
> > If the problem persists, another thing to check is the hack below
> > works.

Trying this one now.

> > -- 8< --
> > --- a/sound/pci/hda/hda_intel.c
> > +++ b/sound/pci/hda/hda_intel.c
> > @@ -1808,9 +1808,16 @@ static int azx_create(struct snd_card *card, struct pci_dev *pci,
> >  	if (err < 0)
> >  		return err;
> >  
> > +#if 0
> >  	/* use the non-cached pages in non-snoop mode */
> >  	if (!azx_snoop(chip))
> >  		azx_bus(chip)->dma_type = SNDRV_DMA_TYPE_DEV_WC_SG;
> > +#else
> > +	if (!azx_snoop(chip))
> > +		azx_bus(chip)->dma_type = SNDRV_DMA_TYPE_DEV_SG;
> > +	else
> > +		azx_bus(chip)->dma_type = SNDRV_DMA_TYPE_DEV;
> > +#endif
> >  
> >  	if (chip->driver_type == AZX_DRIVER_NVIDIA) {
> >  		dev_dbg(chip->card->dev, "Enable delay in RIRB handling\n");

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab


* Re: Intel HD Audio: sound stops working in Xen PV dom0 in >=5.17
  2022-12-10  1:00     ` Marek Marczykowski-Górecki
@ 2022-12-10 16:17       ` Marek Marczykowski-Górecki
  2022-12-20  4:43         ` Marek Marczykowski-Górecki
  2022-12-22  8:09         ` Takashi Iwai
  0 siblings, 2 replies; 25+ messages in thread
From: Marek Marczykowski-Górecki @ 2022-12-10 16:17 UTC (permalink / raw)
  To: Takashi Iwai; +Cc: alsa-devel, Harald Arnesen, Alex Xu


[-- Attachment #1.1: Type: text/plain, Size: 3235 bytes --]

On Sat, Dec 10, 2022 at 02:00:06AM +0100, Marek Marczykowski-Górecki wrote:
> On Fri, Dec 09, 2022 at 01:40:15PM +0100, Marek Marczykowski-Górecki wrote:
> > On Fri, Dec 09, 2022 at 09:10:19AM +0100, Takashi Iwai wrote:
> > > On Fri, 09 Dec 2022 02:27:30 +0100,
> > > Marek Marczykowski-Górecki wrote:
> > > > 
> > > > Hi,
> > > > 
> > > > Under Xen PV dom0, with Linux >= 5.17, sound stops working after few
> > > > hours. pavucontrol still shows meter bars moving, but the speakers
> > > > remain silent. At least on some occasions I see the following message in
> > > > dmesg:
> > > > 
> > > >   [ 2142.484553] snd_hda_intel 0000:00:1f.3: Unstable LPIB (18144 >= 6396); disabling LPIB delay counting
> 
> Hit the issue again, this message did not appear in the log (or at least
> not yet).
> 
> (...)
> 
> > > In anyway, please check the behavior with 6.1-rc8 + the commit
> > > cc26516374065a34e10c9a8bf3e940e42cd96e2a
> > >     ALSA: memalloc: Allocate more contiguous pages for fallback case
> > > from for-next of my sound git tree (which will be in 6.2-rc1).
> 
> This did not helped.
> 
> > Looking at the mentioned commits, there is one specific aspect of Xen PV
> > that may be relevant. It configures PAT differently than native Linux.
> > Theoretically Linux adapts automatically and using proper API (like
> > set_memory_wc()) should just work, but at least for i915 driver it
> > causes issues (not fully tracked down yet). Details about that bug
> > report include some more background:
> > https://lore.kernel.org/intel-gfx/Y5Hst0bCxQDTN7lK@mail-itl/
> > 
> > Anyway, I have tested it on a Xen modified to setup PAT the same way as
> > native Linux and the audio issue is still there.
> > 
> > > If the problem persists, another thing to check is the hack below
> > > works.
> 
> Trying this one now.

And this one didn't either :/

When it stopped working, I did two things:
1. switched audio profiles ("configuration" tab in pavucontrol) several
times; this on its own did not help
2. reloaded the sound-related modules, but did not load them all back (see
the attached lists, before and after).

After this, it worked again for a few minutes. Not sure if, or which of, the
above actions were relevant, though...

Another observation: when it stops working, it's never during
playback. It's always that, at some point, starting an audio stream
results in silence.

> > > -- 8< --
> > > --- a/sound/pci/hda/hda_intel.c
> > > +++ b/sound/pci/hda/hda_intel.c
> > > @@ -1808,9 +1808,16 @@ static int azx_create(struct snd_card *card, struct pci_dev *pci,
> > >  	if (err < 0)
> > >  		return err;
> > >  
> > > +#if 0
> > >  	/* use the non-cached pages in non-snoop mode */
> > >  	if (!azx_snoop(chip))
> > >  		azx_bus(chip)->dma_type = SNDRV_DMA_TYPE_DEV_WC_SG;
> > > +#else
> > > +	if (!azx_snoop(chip))
> > > +		azx_bus(chip)->dma_type = SNDRV_DMA_TYPE_DEV_SG;
> > > +	else
> > > +		azx_bus(chip)->dma_type = SNDRV_DMA_TYPE_DEV;
> > > +#endif
> > >  
> > >  	if (chip->driver_type == AZX_DRIVER_NVIDIA) {
> > >  		dev_dbg(chip->card->dev, "Enable delay in RIRB handling\n");

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab

[-- Attachment #1.2: snd-mods-list-after.txt --]
[-- Type: text/plain, Size: 1077 bytes --]

snd_hda_codec_hdmi     86016  1
snd_ctl_led            24576  0
snd_hda_codec_conexant    32768  1
snd_hda_codec_generic    98304  1 snd_hda_codec_conexant
ledtrig_audio          16384  2 snd_ctl_led,snd_hda_codec_generic
snd_hda_intel          61440  5
snd_intel_dspcfg       36864  1 snd_hda_intel
snd_intel_sdw_acpi     20480  1 snd_intel_dspcfg
snd_hda_codec         184320  4 snd_hda_codec_generic,snd_hda_codec_conexant,snd_hda_codec_hdmi,snd_hda_intel
snd_hda_core          118784  5 snd_hda_codec_generic,snd_hda_codec_conexant,snd_hda_codec_hdmi,snd_hda_intel,snd_hda_codec
snd_hwdep              16384  1 snd_hda_codec
snd_seq                94208  0
snd_seq_device         16384  1 snd_seq
snd_pcm               151552  5 snd_hda_codec_hdmi,snd_hda_intel,snd_hda_codec,snd_hda_core
snd_timer              49152  3 snd_seq,snd_pcm
snd                   126976  19 snd_ctl_led,snd_hda_codec_generic,snd_seq,snd_hda_codec_conexant,snd_seq_device,snd_hda_codec_hdmi,snd_hwdep,snd_hda_intel,snd_hda_codec,snd_timer,snd_pcm
soundcore              16384  2 snd_ctl_led,snd

[-- Attachment #1.3: snd-mods-list-before.txt --]
[-- Type: text/plain, Size: 2848 bytes --]

snd_sof_pci_intel_skl    16384  0
snd_sof_intel_hda_common   217088  1 snd_sof_pci_intel_skl
soundwire_intel        53248  1 snd_sof_intel_hda_common
snd_sof_intel_hda      20480  1 snd_sof_intel_hda_common
snd_sof_pci            24576  2 snd_sof_intel_hda_common,snd_sof_pci_intel_skl
snd_sof_xtensa_dsp     20480  1 snd_sof_intel_hda_common
snd_sof               339968  2 snd_sof_pci,snd_sof_intel_hda_common
snd_sof_utils          20480  1 snd_sof
snd_soc_avs           172032  0
snd_soc_hda_codec      28672  1 snd_soc_avs
snd_soc_skl           217088  0
snd_soc_hdac_hda       28672  2 snd_sof_intel_hda_common,snd_soc_skl
snd_hda_ext_core       36864  5 snd_soc_avs,snd_soc_hda_codec,snd_sof_intel_hda_common,snd_soc_hdac_hda,snd_soc_skl
snd_soc_sst_ipc        20480  1 snd_soc_skl
snd_soc_sst_dsp        36864  1 snd_soc_skl
snd_soc_acpi_intel_match    73728  3 snd_sof_intel_hda_common,snd_soc_skl,snd_sof_pci_intel_skl
snd_soc_acpi           16384  3 snd_soc_acpi_intel_match,snd_sof_intel_hda_common,snd_soc_skl
snd_hda_codec_hdmi     86016  1
snd_soc_core          393216  7 snd_soc_avs,snd_soc_hda_codec,soundwire_intel,snd_sof,snd_sof_intel_hda_common,snd_soc_hdac_hda,snd_soc_skl
snd_ctl_led            24576  0
snd_compress           28672  1 snd_soc_core
ac97_bus               16384  1 snd_soc_core
snd_hda_codec_conexant    32768  1
snd_pcm_dmaengine      16384  1 snd_soc_core
snd_hda_codec_generic    98304  1 snd_hda_codec_conexant
snd_hda_intel          61440  2
snd_intel_dspcfg       36864  5 snd_soc_avs,snd_hda_intel,snd_sof,snd_sof_intel_hda_common,snd_soc_skl
snd_intel_sdw_acpi     20480  2 snd_sof_intel_hda_common,snd_intel_dspcfg
snd_hda_codec         184320  9 snd_hda_codec_generic,snd_hda_codec_conexant,snd_soc_avs,snd_hda_codec_hdmi,snd_soc_hda_codec,snd_hda_intel,snd_soc_hdac_hda,snd_soc_skl,snd_sof_intel_hda
snd_hda_core          118784  12 snd_hda_codec_generic,snd_hda_codec_conexant,snd_soc_avs,snd_hda_codec_hdmi,snd_soc_hda_codec,snd_hda_intel,snd_hda_ext_core,snd_hda_codec,snd_sof_intel_hda_common,snd_soc_hdac_hda,snd_soc_skl,snd_sof_intel_hda
snd_hwdep              16384  1 snd_hda_codec
snd_seq                94208  0
snd_seq_device         16384  1 snd_seq
snd_pcm               151552  13 snd_soc_avs,snd_hda_codec_hdmi,snd_hda_intel,snd_hda_codec,soundwire_intel,snd_sof,snd_sof_intel_hda_common,snd_compress,snd_soc_core,snd_sof_utils,snd_soc_skl,snd_hda_core,snd_pcm_dmaengine
snd_timer              49152  2 snd_seq,snd_pcm
ledtrig_audio          16384  3 snd_ctl_led,snd_hda_codec_generic,thinkpad_acpi
snd                   126976  19 snd_ctl_led,snd_hda_codec_generic,snd_seq,snd_hda_codec_conexant,snd_seq_device,snd_hda_codec_hdmi,snd_hwdep,snd_hda_intel,snd_hda_codec,snd_sof,snd_timer,snd_compress,thinkpad_acpi,snd_soc_core,snd_pcm
soundcore              16384  2 snd_ctl_led,snd


* Re: Intel HD Audio: sound stops working in Xen PV dom0 in >=5.17
  2022-12-10 16:17       ` Marek Marczykowski-Górecki
@ 2022-12-20  4:43         ` Marek Marczykowski-Górecki
  2022-12-22  8:09         ` Takashi Iwai
  1 sibling, 0 replies; 25+ messages in thread
From: Marek Marczykowski-Górecki @ 2022-12-20  4:43 UTC (permalink / raw)
  To: Takashi Iwai; +Cc: alsa-devel, Harald Arnesen, Alex Xu

[-- Attachment #1: Type: text/plain, Size: 2759 bytes --]

On Sat, Dec 10, 2022 at 05:17:42PM +0100, Marek Marczykowski-Górecki wrote:
> On Sat, Dec 10, 2022 at 02:00:06AM +0100, Marek Marczykowski-Górecki wrote:
> > On Fri, Dec 09, 2022 at 01:40:15PM +0100, Marek Marczykowski-Górecki wrote:
> > > On Fri, Dec 09, 2022 at 09:10:19AM +0100, Takashi Iwai wrote:
> > > > On Fri, 09 Dec 2022 02:27:30 +0100,
> > > > Marek Marczykowski-Górecki wrote:
> > > > > 
> > > > > Hi,
> > > > > 
> > > > > Under Xen PV dom0, with Linux >= 5.17, sound stops working after few
> > > > > hours. pavucontrol still shows meter bars moving, but the speakers
> > > > > remain silent. At least on some occasions I see the following message in
> > > > > dmesg:
> > > > > 
> > > > >   [ 2142.484553] snd_hda_intel 0000:00:1f.3: Unstable LPIB (18144 >= 6396); disabling LPIB delay counting
> > 
> > Hit the issue again, this message did not appear in the log (or at least
> > not yet).
> > 
> > (...)
> > 
> > > > In anyway, please check the behavior with 6.1-rc8 + the commit
> > > > cc26516374065a34e10c9a8bf3e940e42cd96e2a
> > > >     ALSA: memalloc: Allocate more contiguous pages for fallback case
> > > > from for-next of my sound git tree (which will be in 6.2-rc1).
> > 
> > This did not helped.
> > 
> > > Looking at the mentioned commits, there is one specific aspect of Xen PV
> > > that may be relevant. It configures PAT differently than native Linux.
> > > Theoretically Linux adapts automatically and using proper API (like
> > > set_memory_wc()) should just work, but at least for i915 driver it
> > > causes issues (not fully tracked down yet). Details about that bug
> > > report include some more background:
> > > https://lore.kernel.org/intel-gfx/Y5Hst0bCxQDTN7lK@mail-itl/
> > > 
> > > Anyway, I have tested it on a Xen modified to setup PAT the same way as
> > > native Linux and the audio issue is still there.
> > > 
> > > > If the problem persists, another thing to check is the hack below
> > > > works.
> > 
> > Trying this one now.
> 
> And this one didn't either :/
> 
> When it stopped working, I did two things:
> 1. switched audio profiles ("configuration" tab in pavucontrol) several
> times; this on its own did not helped
> 2. reloaded sound related modules, but did not loaded them all back (see
> attached list before and after).
> 
> After this, it worked again for a few minutes. Not sure if/which the above
> actions were relevant, tho...
> 
> Another observation: when it stops working, it's never during a
> playback. It's always that at some point starting an audio stream
> results in a silence.

Any other ideas? Or maybe there is another patch I should try?

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab


* Re: Intel HD Audio: sound stops working in Xen PV dom0 in >=5.17
  2022-12-10 16:17       ` Marek Marczykowski-Górecki
  2022-12-20  4:43         ` Marek Marczykowski-Górecki
@ 2022-12-22  8:09         ` Takashi Iwai
  2022-12-27 15:26           ` Marek Marczykowski-Górecki
  1 sibling, 1 reply; 25+ messages in thread
From: Takashi Iwai @ 2022-12-22  8:09 UTC (permalink / raw)
  To: Marek Marczykowski-Górecki; +Cc: alsa-devel, Harald Arnesen, Alex Xu

On Sat, 10 Dec 2022 17:17:42 +0100,
Marek Marczykowski-Górecki wrote:
> 
> On Sat, Dec 10, 2022 at 02:00:06AM +0100, Marek Marczykowski-Górecki wrote:
> > On Fri, Dec 09, 2022 at 01:40:15PM +0100, Marek Marczykowski-Górecki wrote:
> > > On Fri, Dec 09, 2022 at 09:10:19AM +0100, Takashi Iwai wrote:
> > > > On Fri, 09 Dec 2022 02:27:30 +0100,
> > > > Marek Marczykowski-Górecki wrote:
> > > > > 
> > > > > Hi,
> > > > > 
> > > > > Under Xen PV dom0, with Linux >= 5.17, sound stops working after few
> > > > > hours. pavucontrol still shows meter bars moving, but the speakers
> > > > > remain silent. At least on some occasions I see the following message in
> > > > > dmesg:
> > > > > 
> > > > >   [ 2142.484553] snd_hda_intel 0000:00:1f.3: Unstable LPIB (18144 >= 6396); disabling LPIB delay counting
> > 
> > Hit the issue again, this message did not appear in the log (or at least
> > not yet).
> > 
> > (...)
> > 
> > > > In anyway, please check the behavior with 6.1-rc8 + the commit
> > > > cc26516374065a34e10c9a8bf3e940e42cd96e2a
> > > >     ALSA: memalloc: Allocate more contiguous pages for fallback case
> > > > from for-next of my sound git tree (which will be in 6.2-rc1).
> > 
> > This did not helped.
> > 
> > > Looking at the mentioned commits, there is one specific aspect of Xen PV
> > > that may be relevant. It configures PAT differently than native Linux.
> > > Theoretically Linux adapts automatically and using proper API (like
> > > set_memory_wc()) should just work, but at least for i915 driver it
> > > causes issues (not fully tracked down yet). Details about that bug
> > > report include some more background:
> > > https://lore.kernel.org/intel-gfx/Y5Hst0bCxQDTN7lK@mail-itl/
> > > 
> > > Anyway, I have tested it on a Xen modified to setup PAT the same way as
> > > native Linux and the audio issue is still there.
> > > 
> > > > If the problem persists, another thing to check is the hack below
> > > > works.
> > 
> > Trying this one now.
> 
> And this one didn't either :/

(Sorry for the late reply; I've been off for the last few weeks.)

I think the hack doesn't affect the PCM buffer pages, only the BDL
pages.  Could you check the patch below instead?
It'll disable the SG-buffer handling on x86 completely.


thanks,

Takashi

-- 8< --
--- a/sound/core/Kconfig
+++ b/sound/core/Kconfig
@@ -225,8 +225,8 @@ config SND_VMASTER
 	bool
 
 config SND_DMA_SGBUF
-	def_bool y
-	depends on X86
+	def_bool n
+#	depends on X86
 
 config SND_CTL_LED
 	tristate
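
As a rough illustration of what this Kconfig change is expected to do, here is a small user-space model of the fallback described in the bisected commit message quoted earlier in the thread: with SG-buffer support enabled the SG type maps to the non-contiguous handler, and without it SG requests fall back to the plain device type. The enum, its values and resolve_dma_type() are hypothetical names for this sketch, and the write-combined fallback in particular is an assumption, not something stated in the thread.

```c
#include <stdbool.h>

/* Illustrative tags only; names and values are not the kernel's. */
enum dma_type {
	DMA_TYPE_DEV,       /* contiguous pages from the DMA API */
	DMA_TYPE_DEV_WC,    /* contiguous, write-combined */
	DMA_TYPE_NONCONTIG, /* non-contiguous pages (SG handling) */
	DMA_TYPE_DEV_SG,    /* driver asks for an SG buffer */
	DMA_TYPE_DEV_WC_SG  /* driver asks for a write-combined SG buffer */
};

/*
 * Map the type a driver requests to the type that is actually used,
 * depending on whether SG-buffer support is built in.
 */
static enum dma_type resolve_dma_type(enum dma_type requested,
				      bool sgbuf_enabled)
{
	switch (requested) {
	case DMA_TYPE_DEV_SG:
		/* From the commit message: NONCONTIG on x86, else DEV. */
		return sgbuf_enabled ? DMA_TYPE_NONCONTIG : DMA_TYPE_DEV;
	case DMA_TYPE_DEV_WC_SG:
		/* Assumption: falls back to the non-SG write-combined type. */
		return sgbuf_enabled ? DMA_TYPE_DEV_WC_SG : DMA_TYPE_DEV_WC;
	default:
		/* Non-SG requests are unaffected by the option. */
		return requested;
	}
}
```

Under this model, setting SND_DMA_SGBUF to n means every SG request made by the driver, including the one for the PCM buffer pages, ends up on a contiguous allocation, which is why the patch takes the SG path out of the picture entirely rather than only for the BDL pages.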


* Re: Intel HD Audio: sound stops working in Xen PV dom0 in >=5.17
  2022-12-22  8:09         ` Takashi Iwai
@ 2022-12-27 15:26           ` Marek Marczykowski-Górecki
  2023-01-16 15:55             ` Takashi Iwai
  0 siblings, 1 reply; 25+ messages in thread
From: Marek Marczykowski-Górecki @ 2022-12-27 15:26 UTC (permalink / raw)
  To: Takashi Iwai; +Cc: alsa-devel, Harald Arnesen, Alex Xu

[-- Attachment #1: Type: text/plain, Size: 3017 bytes --]

On Thu, Dec 22, 2022 at 09:09:15AM +0100, Takashi Iwai wrote:
> On Sat, 10 Dec 2022 17:17:42 +0100,
> Marek Marczykowski-Górecki wrote:
> > 
> > On Sat, Dec 10, 2022 at 02:00:06AM +0100, Marek Marczykowski-Górecki wrote:
> > > On Fri, Dec 09, 2022 at 01:40:15PM +0100, Marek Marczykowski-Górecki wrote:
> > > > On Fri, Dec 09, 2022 at 09:10:19AM +0100, Takashi Iwai wrote:
> > > > > On Fri, 09 Dec 2022 02:27:30 +0100,
> > > > > Marek Marczykowski-Górecki wrote:
> > > > > > 
> > > > > > Hi,
> > > > > > 
> > > > > > Under Xen PV dom0, with Linux >= 5.17, sound stops working after few
> > > > > > hours. pavucontrol still shows meter bars moving, but the speakers
> > > > > > remain silent. At least on some occasions I see the following message in
> > > > > > dmesg:
> > > > > > 
> > > > > >   [ 2142.484553] snd_hda_intel 0000:00:1f.3: Unstable LPIB (18144 >= 6396); disabling LPIB delay counting
> > > 
> > > Hit the issue again, this message did not appear in the log (or at least
> > > not yet).
> > > 
> > > (...)
> > > 
> > > > > In anyway, please check the behavior with 6.1-rc8 + the commit
> > > > > cc26516374065a34e10c9a8bf3e940e42cd96e2a
> > > > >     ALSA: memalloc: Allocate more contiguous pages for fallback case
> > > > > from for-next of my sound git tree (which will be in 6.2-rc1).
> > > 
> > > This did not helped.
> > > 
> > > > Looking at the mentioned commits, there is one specific aspect of Xen PV
> > > > that may be relevant. It configures PAT differently than native Linux.
> > > > Theoretically Linux adapts automatically and using proper API (like
> > > > set_memory_wc()) should just work, but at least for i915 driver it
> > > > causes issues (not fully tracked down yet). Details about that bug
> > > > report include some more background:
> > > > https://lore.kernel.org/intel-gfx/Y5Hst0bCxQDTN7lK@mail-itl/
> > > > 
> > > > Anyway, I have tested it on a Xen modified to setup PAT the same way as
> > > > native Linux and the audio issue is still there.
> > > > 
> > > > > If the problem persists, another thing to check is the hack below
> > > > > works.
> > > 
> > > Trying this one now.
> > 
> > And this one didn't either :/
> 
> (Sorry for the late reply, as I've been off in the last weeks.)
> 
> I think the hack doesn't influence on the PCM buffer pages, but only
> about BDL pages.  Could you check the patch below instead?
> It'll disable the SG-buffer handling on x86 completely. 

This seems to "fix" the issue, thanks!
I guess I'll run it this way for now, but a proper solution would be
nice. Let me know if I can collect any more info that would help with
that.

> -- 8< --
> --- a/sound/core/Kconfig
> +++ b/sound/core/Kconfig
> @@ -225,8 +225,8 @@ config SND_VMASTER
>  	bool
>  
>  config SND_DMA_SGBUF
> -	def_bool y
> -	depends on X86
> +	def_bool n
> +#	depends on X86
>  
>  config SND_CTL_LED
>  	tristate

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab


* Re: Intel HD Audio: sound stops working in Xen PV dom0 in >=5.17
  2022-12-27 15:26           ` Marek Marczykowski-Górecki
@ 2023-01-16 15:55             ` Takashi Iwai
  2023-01-17  7:58               ` Takashi Iwai
  0 siblings, 1 reply; 25+ messages in thread
From: Takashi Iwai @ 2023-01-16 15:55 UTC (permalink / raw)
  To: Marek Marczykowski-Górecki; +Cc: alsa-devel, Harald Arnesen, Alex Xu

On Tue, 27 Dec 2022 16:26:54 +0100,
Marek Marczykowski-Górecki wrote:
> 
> On Thu, Dec 22, 2022 at 09:09:15AM +0100, Takashi Iwai wrote:
> > On Sat, 10 Dec 2022 17:17:42 +0100,
> > Marek Marczykowski-Górecki wrote:
> > > 
> > > On Sat, Dec 10, 2022 at 02:00:06AM +0100, Marek Marczykowski-Górecki wrote:
> > > > On Fri, Dec 09, 2022 at 01:40:15PM +0100, Marek Marczykowski-Górecki wrote:
> > > > > On Fri, Dec 09, 2022 at 09:10:19AM +0100, Takashi Iwai wrote:
> > > > > > On Fri, 09 Dec 2022 02:27:30 +0100,
> > > > > > Marek Marczykowski-Górecki wrote:
> > > > > > > 
> > > > > > > Hi,
> > > > > > > 
> > > > > > > Under Xen PV dom0, with Linux >= 5.17, sound stops working after few
> > > > > > > hours. pavucontrol still shows meter bars moving, but the speakers
> > > > > > > remain silent. At least on some occasions I see the following message in
> > > > > > > dmesg:
> > > > > > > 
> > > > > > >   [ 2142.484553] snd_hda_intel 0000:00:1f.3: Unstable LPIB (18144 >= 6396); disabling LPIB delay counting
> > > > 
> > > > Hit the issue again, this message did not appear in the log (or at least
> > > > not yet).
> > > > 
> > > > (...)
> > > > 
> > > > > > In anyway, please check the behavior with 6.1-rc8 + the commit
> > > > > > cc26516374065a34e10c9a8bf3e940e42cd96e2a
> > > > > >     ALSA: memalloc: Allocate more contiguous pages for fallback case
> > > > > > from for-next of my sound git tree (which will be in 6.2-rc1).
> > > > 
> > > > This did not helped.
> > > > 
> > > > > Looking at the mentioned commits, there is one specific aspect of Xen PV
> > > > > that may be relevant. It configures PAT differently than native Linux.
> > > > > Theoretically Linux adapts automatically and using proper API (like
> > > > > set_memory_wc()) should just work, but at least for i915 driver it
> > > > > causes issues (not fully tracked down yet). Details about that bug
> > > > > report include some more background:
> > > > > https://lore.kernel.org/intel-gfx/Y5Hst0bCxQDTN7lK@mail-itl/
> > > > > 
> > > > > Anyway, I have tested it on a Xen modified to setup PAT the same way as
> > > > > native Linux and the audio issue is still there.
> > > > > 
> > > > > > If the problem persists, another thing to check is the hack below
> > > > > > works.
> > > > 
> > > > Trying this one now.
> > > 
> > > And this one didn't either :/
> > 
> > (Sorry for the late reply, as I've been off in the last weeks.)
> > 
> > I think the hack doesn't influence on the PCM buffer pages, but only
> > about BDL pages.  Could you check the patch below instead?
> > It'll disable the SG-buffer handling on x86 completely. 
> 
> This seems to "fix" the issue, thanks!
> I guess I'll run it this way for now, but a proper solution would be
> nice. Let me know if I can collect any more info that would help with
> that.

Then it seems we need to go back to coherent memory allocation for
the fallback SG cases.  It was changed because using
dma_alloc_coherent() caused a problem with retrieving the page
addresses on IOMMU systems, but since commit 9736a325137b we
essentially avoid the fallback when an IOMMU is used, so it should be
fine again.

Let me know whether a patch like the one below works for you instead
of the previous hack that disables SG-buffer (note: totally untested!)


thanks,

Takashi

-- 8< --
--- a/sound/core/memalloc.c
+++ b/sound/core/memalloc.c
@@ -719,17 +719,30 @@ static const struct snd_malloc_ops snd_dma_sg_wc_ops = {
 struct snd_dma_sg_fallback {
 	size_t count;
 	struct page **pages;
+	dma_addr_t *addrs;
 };
 
 static void __snd_dma_sg_fallback_free(struct snd_dma_buffer *dmab,
 				       struct snd_dma_sg_fallback *sgbuf)
 {
-	bool wc = dmab->dev.type == SNDRV_DMA_TYPE_DEV_WC_SG_FALLBACK;
-	size_t i;
-
-	for (i = 0; i < sgbuf->count && sgbuf->pages[i]; i++)
-		do_free_pages(page_address(sgbuf->pages[i]), PAGE_SIZE, wc);
+	size_t i, size;
+
+	if (sgbuf->pages && sgbuf->addrs) {
+		i = 0;
+		while (i < sgbuf->count) {
+			if (!sgbuf->pages[i] || !sgbuf->addrs[i])
+				break;
+			size = sgbuf->addrs[i] & ~PAGE_MASK;
+			if (!WARN_ON(size))
+				break;
+			dma_free_coherent(dmab->dev.dev, size,
+					  page_address(sgbuf->pages[i]),
+					  sgbuf->addrs[i] & PAGE_MASK);
+			i += size;
+		}
+	}
 	kvfree(sgbuf->pages);
+	kvfree(sgbuf->addrs);
 	kfree(sgbuf);
 }
 
@@ -738,9 +751,8 @@ static void *snd_dma_sg_fallback_alloc(struct snd_dma_buffer *dmab, size_t size)
 	struct snd_dma_sg_fallback *sgbuf;
 	struct page **pagep, *curp;
 	size_t chunk, npages;
-	dma_addr_t addr;
+	dma_addr_t *addrp;
 	void *p;
-	bool wc = dmab->dev.type == SNDRV_DMA_TYPE_DEV_WC_SG_FALLBACK;
 
 	sgbuf = kzalloc(sizeof(*sgbuf), GFP_KERNEL);
 	if (!sgbuf)
@@ -748,14 +760,16 @@ static void *snd_dma_sg_fallback_alloc(struct snd_dma_buffer *dmab, size_t size)
 	size = PAGE_ALIGN(size);
 	sgbuf->count = size >> PAGE_SHIFT;
 	sgbuf->pages = kvcalloc(sgbuf->count, sizeof(*sgbuf->pages), GFP_KERNEL);
-	if (!sgbuf->pages)
+	sgbuf->addrs = kvcalloc(sgbuf->count, sizeof(*sgbuf->addrs), GFP_KERNEL);
+	if (!sgbuf->pages || !sgbuf->addrs)
 		goto error;
 
 	pagep = sgbuf->pages;
-	chunk = size;
+	addrp = sgbuf->addrs;
+	chunk = PAGE_SIZE * (PAGE_SIZE - 1); /* to fit in low bits in addrs */
 	while (size > 0) {
 		chunk = min(size, chunk);
-		p = do_alloc_pages(dmab->dev.dev, chunk, &addr, wc);
+		p = dma_alloc_coherent(dmab->dev.dev, chunk, addrp, DEFAULT_GFP);
 		if (!p) {
 			if (chunk <= PAGE_SIZE)
 				goto error;
@@ -767,6 +781,8 @@ static void *snd_dma_sg_fallback_alloc(struct snd_dma_buffer *dmab, size_t size)
 		size -= chunk;
 		/* fill pages */
 		npages = chunk >> PAGE_SHIFT;
+		*addrp |= npages; /* store in lower bits */
+		addrp += npages;
 		curp = virt_to_page(p);
 		while (npages--)
 			*pagep++ = curp++;
@@ -775,6 +791,10 @@ static void *snd_dma_sg_fallback_alloc(struct snd_dma_buffer *dmab, size_t size)
 	p = vmap(sgbuf->pages, sgbuf->count, VM_MAP, PAGE_KERNEL);
 	if (!p)
 		goto error;
+
+	if (dmab->dev.type == SNDRV_DMA_TYPE_DEV_WC_SG_FALLBACK)
+		set_pages_array_wc(sgbuf->pages, sgbuf->count);
+
 	dmab->private_data = sgbuf;
 	/* store the first page address for convenience */
 	dmab->addr = snd_sgbuf_get_addr(dmab, 0);
@@ -787,7 +807,11 @@ static void *snd_dma_sg_fallback_alloc(struct snd_dma_buffer *dmab, size_t size)
 
 static void snd_dma_sg_fallback_free(struct snd_dma_buffer *dmab)
 {
+	struct snd_dma_sg_fallback *sgbuf = dmab->private_data;
+
 	vunmap(dmab->area);
+	if (dmab->dev.type == SNDRV_DMA_TYPE_DEV_WC_SG_FALLBACK)
+		set_pages_array_wb(sgbuf->pages, sgbuf->count);
 	__snd_dma_sg_fallback_free(dmab, dmab->private_data);
 }
 

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Intel HD Audio: sound stops working in Xen PV dom0 in >=5.17
  2023-01-16 15:55             ` Takashi Iwai
@ 2023-01-17  7:58               ` Takashi Iwai
  2023-01-17 11:36                 ` Marek Marczykowski-Górecki
  0 siblings, 1 reply; 25+ messages in thread
From: Takashi Iwai @ 2023-01-17  7:58 UTC (permalink / raw)
  To: Marek Marczykowski-Górecki; +Cc: alsa-devel, Harald Arnesen, Alex Xu

On Mon, 16 Jan 2023 16:55:11 +0100,
Takashi Iwai wrote:
> 
> On Tue, 27 Dec 2022 16:26:54 +0100,
> Marek Marczykowski-Górecki wrote:
> > 
> > On Thu, Dec 22, 2022 at 09:09:15AM +0100, Takashi Iwai wrote:
> > > On Sat, 10 Dec 2022 17:17:42 +0100,
> > > Marek Marczykowski-Górecki wrote:
> > > > 
> > > > On Sat, Dec 10, 2022 at 02:00:06AM +0100, Marek Marczykowski-Górecki wrote:
> > > > > On Fri, Dec 09, 2022 at 01:40:15PM +0100, Marek Marczykowski-Górecki wrote:
> > > > > > On Fri, Dec 09, 2022 at 09:10:19AM +0100, Takashi Iwai wrote:
> > > > > > > On Fri, 09 Dec 2022 02:27:30 +0100,
> > > > > > > Marek Marczykowski-Górecki wrote:
> > > > > > > > 
> > > > > > > > Hi,
> > > > > > > > 
> > > > > > > > Under Xen PV dom0, with Linux >= 5.17, sound stops working after few
> > > > > > > > hours. pavucontrol still shows meter bars moving, but the speakers
> > > > > > > > remain silent. At least on some occasions I see the following message in
> > > > > > > > dmesg:
> > > > > > > > 
> > > > > > > >   [ 2142.484553] snd_hda_intel 0000:00:1f.3: Unstable LPIB (18144 >= 6396); disabling LPIB delay counting
> > > > > 
> > > > > Hit the issue again, this message did not appear in the log (or at least
> > > > > not yet).
> > > > > 
> > > > > (...)
> > > > > 
> > > > > > > In anyway, please check the behavior with 6.1-rc8 + the commit
> > > > > > > cc26516374065a34e10c9a8bf3e940e42cd96e2a
> > > > > > >     ALSA: memalloc: Allocate more contiguous pages for fallback case
> > > > > > > from for-next of my sound git tree (which will be in 6.2-rc1).
> > > > > 
> > > > > This did not helped.
> > > > > 
> > > > > > Looking at the mentioned commits, there is one specific aspect of Xen PV
> > > > > > that may be relevant. It configures PAT differently than native Linux.
> > > > > > Theoretically Linux adapts automatically and using proper API (like
> > > > > > set_memory_wc()) should just work, but at least for i915 driver it
> > > > > > causes issues (not fully tracked down yet). Details about that bug
> > > > > > report include some more background:
> > > > > > https://lore.kernel.org/intel-gfx/Y5Hst0bCxQDTN7lK@mail-itl/
> > > > > > 
> > > > > > Anyway, I have tested it on a Xen modified to setup PAT the same way as
> > > > > > native Linux and the audio issue is still there.
> > > > > > 
> > > > > > > If the problem persists, another thing to check is the hack below
> > > > > > > works.
> > > > > 
> > > > > Trying this one now.
> > > > 
> > > > And this one didn't either :/
> > > 
> > > (Sorry for the late reply, as I've been off in the last weeks.)
> > > 
> > > I think the hack doesn't influence on the PCM buffer pages, but only
> > > about BDL pages.  Could you check the patch below instead?
> > > It'll disable the SG-buffer handling on x86 completely. 
> > 
> > This seems to "fix" the issue, thanks!
> > I guess I'll run it this way for now, but a proper solution would be
> > nice. Let me know if I can collect any more info that would help with
> > that.
> 
> Then it seems we need to go back to coherent memory allocation for
> the fallback SG cases.  It was changed because using
> dma_alloc_coherent() caused a problem with retrieving the page
> addresses on IOMMU systems, but since commit 9736a325137b we
> essentially avoid the fallback when an IOMMU is used, so it should be
> fine again.
> 
> Let me know whether a patch like the one below works for you instead
> of the previous hack that disables SG-buffer (note: totally untested!)

Gah, there was an obvious typo, scratch that.

Below is a proper patch.  Please try this one instead.


thanks,

Takashi

-- 8< --
From: Takashi Iwai <tiwai@suse.de>
Subject: [PATCH] ALSA: memalloc: Use coherent DMA allocation for fallback again

We switched the memory allocation for the fallback cases of the
noncontig type to the standard alloc_pages*() in commit a8d302a0b770
("ALSA: memalloc: Revive x86-specific WC page allocations again"),
while we had used dma_alloc_coherent() in the past.  The reason was
that the page address retrieved from the virtual pointer returned by
dma_alloc_coherent() can't be used on IOMMU systems.  Meanwhile, we
explicitly disabled the fallback allocation for IOMMU systems in
commit 9736a325137b ("ALSA: memalloc: Don't fall back for SG-buffer
with IOMMU") after the commit above; that is, using
dma_alloc_coherent() should be OK again.

Now we've received reports that the current fallback page allocation
caused a regression on Xen (and maybe other) systems: the sound
disappears partially or completely.  Further investigation showed
that this can be worked around by using dma_alloc_coherent() pages.
So, it's time to take it back.

This patch switches back to dma_alloc_coherent() for the fallback
allocations.  Unlike the previous implementation, the allocation is
implemented in a more optimized way to try larger chunks.  The page
count of each chunk is stored in the lower bits of the addresses.

Fixes: a8d302a0b770 ("ALSA: memalloc: Revive x86-specific WC page allocations again")
Fixes: 9736a325137b ("ALSA: memalloc: Don't fall back for SG-buffer with IOMMU")
Link: https://lore.kernel.org/r/87tu256lqs.wl-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
---
 sound/core/memalloc.c | 44 +++++++++++++++++++++++++++++++++----------
 1 file changed, 34 insertions(+), 10 deletions(-)

diff --git a/sound/core/memalloc.c b/sound/core/memalloc.c
index 81025f50a542..dff07cd6f209 100644
--- a/sound/core/memalloc.c
+++ b/sound/core/memalloc.c
@@ -719,17 +719,30 @@ static const struct snd_malloc_ops snd_dma_sg_wc_ops = {
 struct snd_dma_sg_fallback {
 	size_t count;
 	struct page **pages;
+	dma_addr_t *addrs;
 };
 
 static void __snd_dma_sg_fallback_free(struct snd_dma_buffer *dmab,
 				       struct snd_dma_sg_fallback *sgbuf)
 {
-	bool wc = dmab->dev.type == SNDRV_DMA_TYPE_DEV_WC_SG_FALLBACK;
-	size_t i;
-
-	for (i = 0; i < sgbuf->count && sgbuf->pages[i]; i++)
-		do_free_pages(page_address(sgbuf->pages[i]), PAGE_SIZE, wc);
+	size_t i, size;
+
+	if (sgbuf->pages && sgbuf->addrs) {
+		i = 0;
+		while (i < sgbuf->count) {
+			if (!sgbuf->pages[i] || !sgbuf->addrs[i])
+				break;
+			size = sgbuf->addrs[i] & ~PAGE_MASK;
+			if (WARN_ON(!size))
+				break;
+			dma_free_coherent(dmab->dev.dev, size,
+					  page_address(sgbuf->pages[i]),
+					  sgbuf->addrs[i] & PAGE_MASK);
+			i += size;
+		}
+	}
 	kvfree(sgbuf->pages);
+	kvfree(sgbuf->addrs);
 	kfree(sgbuf);
 }
 
@@ -738,9 +751,8 @@ static void *snd_dma_sg_fallback_alloc(struct snd_dma_buffer *dmab, size_t size)
 	struct snd_dma_sg_fallback *sgbuf;
 	struct page **pagep, *curp;
 	size_t chunk, npages;
-	dma_addr_t addr;
+	dma_addr_t *addrp;
 	void *p;
-	bool wc = dmab->dev.type == SNDRV_DMA_TYPE_DEV_WC_SG_FALLBACK;
 
 	sgbuf = kzalloc(sizeof(*sgbuf), GFP_KERNEL);
 	if (!sgbuf)
@@ -748,14 +760,16 @@ static void *snd_dma_sg_fallback_alloc(struct snd_dma_buffer *dmab, size_t size)
 	size = PAGE_ALIGN(size);
 	sgbuf->count = size >> PAGE_SHIFT;
 	sgbuf->pages = kvcalloc(sgbuf->count, sizeof(*sgbuf->pages), GFP_KERNEL);
-	if (!sgbuf->pages)
+	sgbuf->addrs = kvcalloc(sgbuf->count, sizeof(*sgbuf->addrs), GFP_KERNEL);
+	if (!sgbuf->pages || !sgbuf->addrs)
 		goto error;
 
 	pagep = sgbuf->pages;
-	chunk = size;
+	addrp = sgbuf->addrs;
+	chunk = (PAGE_SIZE - 1) << PAGE_SHIFT; /* to fit in low bits in addrs */
 	while (size > 0) {
 		chunk = min(size, chunk);
-		p = do_alloc_pages(dmab->dev.dev, chunk, &addr, wc);
+		p = dma_alloc_coherent(dmab->dev.dev, chunk, addrp, DEFAULT_GFP);
 		if (!p) {
 			if (chunk <= PAGE_SIZE)
 				goto error;
@@ -767,6 +781,8 @@ static void *snd_dma_sg_fallback_alloc(struct snd_dma_buffer *dmab, size_t size)
 		size -= chunk;
 		/* fill pages */
 		npages = chunk >> PAGE_SHIFT;
+		*addrp |= npages; /* store in lower bits */
+		addrp += npages;
 		curp = virt_to_page(p);
 		while (npages--)
 			*pagep++ = curp++;
@@ -775,6 +791,10 @@ static void *snd_dma_sg_fallback_alloc(struct snd_dma_buffer *dmab, size_t size)
 	p = vmap(sgbuf->pages, sgbuf->count, VM_MAP, PAGE_KERNEL);
 	if (!p)
 		goto error;
+
+	if (dmab->dev.type == SNDRV_DMA_TYPE_DEV_WC_SG_FALLBACK)
+		set_pages_array_wc(sgbuf->pages, sgbuf->count);
+
 	dmab->private_data = sgbuf;
 	/* store the first page address for convenience */
 	dmab->addr = snd_sgbuf_get_addr(dmab, 0);
@@ -787,7 +807,11 @@ static void *snd_dma_sg_fallback_alloc(struct snd_dma_buffer *dmab, size_t size)
 
 static void snd_dma_sg_fallback_free(struct snd_dma_buffer *dmab)
 {
+	struct snd_dma_sg_fallback *sgbuf = dmab->private_data;
+
 	vunmap(dmab->area);
+	if (dmab->dev.type == SNDRV_DMA_TYPE_DEV_WC_SG_FALLBACK)
+		set_pages_array_wb(sgbuf->pages, sgbuf->count);
 	__snd_dma_sg_fallback_free(dmab, dmab->private_data);
 }
 
-- 
2.35.3
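
Editorial illustration (not part of the patch above): the commit message
says the page count is stored in the lower bits of the addresses.  The
userspace sketch below shows that packing idea in isolation, assuming
4 KiB pages.  The helper names pack_chunk(), chunk_addr(), chunk_pages()
and walk_chunks() are hypothetical and do not exist in sound/core/memalloc.c;
the local PAGE_* macros only mimic the kernel ones.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 4 KiB pages; these mimic the kernel macros for illustration. */
#define PAGE_SHIFT 12
#define PAGE_SIZE  ((uint64_t)1 << PAGE_SHIFT)
#define PAGE_MASK  (~(PAGE_SIZE - 1))

/*
 * Pack a page-aligned address and a chunk length in pages into one word.
 * The count must be in 1..PAGE_SIZE-1 so it fits in the low PAGE_SHIFT bits,
 * which is why the patch caps a chunk at (PAGE_SIZE - 1) << PAGE_SHIFT bytes.
 */
static uint64_t pack_chunk(uint64_t addr, size_t npages)
{
	assert((addr & ~PAGE_MASK) == 0);
	assert(npages > 0 && npages < PAGE_SIZE);
	return addr | npages;
}

/* Recover the page-aligned address from a packed entry. */
static uint64_t chunk_addr(uint64_t packed)
{
	return packed & PAGE_MASK;
}

/* Recover the chunk length in pages from a packed entry. */
static size_t chunk_pages(uint64_t packed)
{
	return (size_t)(packed & ~PAGE_MASK);
}

/*
 * Walk a per-page array in which only the first slot of each chunk is
 * filled, hopping from chunk head to chunk head.  Returns the total number
 * of bytes covered, or stops early at the first empty slot.
 */
static size_t walk_chunks(const uint64_t *addrs, size_t count)
{
	size_t i = 0, bytes = 0;

	while (i < count) {
		size_t npages = chunk_pages(addrs[i]);

		if (!npages)
			break;
		bytes += npages << PAGE_SHIFT;	/* pages -> bytes */
		i += npages;
	}
	return bytes;
}
```

In this sketch the stored value is a page count, so it is shifted by
PAGE_SHIFT wherever a byte size is needed; a byte-sized interface such as
dma_free_coherent() would need the same conversion.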


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: Intel HD Audio: sound stops working in Xen PV dom0 in >=5.17
  2023-01-17  7:58               ` Takashi Iwai
@ 2023-01-17 11:36                 ` Marek Marczykowski-Górecki
  2023-01-17 14:21                   ` Marek Marczykowski-Górecki
  0 siblings, 1 reply; 25+ messages in thread
From: Marek Marczykowski-Górecki @ 2023-01-17 11:36 UTC (permalink / raw)
  To: Takashi Iwai; +Cc: alsa-devel, Harald Arnesen, Alex Xu

[-- Attachment #1: Type: text/plain, Size: 3977 bytes --]

On Tue, Jan 17, 2023 at 08:58:57AM +0100, Takashi Iwai wrote:
> On Mon, 16 Jan 2023 16:55:11 +0100,
> Takashi Iwai wrote:
> > 
> > On Tue, 27 Dec 2022 16:26:54 +0100,
> > Marek Marczykowski-Górecki wrote:
> > > 
> > > On Thu, Dec 22, 2022 at 09:09:15AM +0100, Takashi Iwai wrote:
> > > > On Sat, 10 Dec 2022 17:17:42 +0100,
> > > > Marek Marczykowski-Górecki wrote:
> > > > > 
> > > > > On Sat, Dec 10, 2022 at 02:00:06AM +0100, Marek Marczykowski-Górecki wrote:
> > > > > > On Fri, Dec 09, 2022 at 01:40:15PM +0100, Marek Marczykowski-Górecki wrote:
> > > > > > > On Fri, Dec 09, 2022 at 09:10:19AM +0100, Takashi Iwai wrote:
> > > > > > > > On Fri, 09 Dec 2022 02:27:30 +0100,
> > > > > > > > Marek Marczykowski-Górecki wrote:
> > > > > > > > > 
> > > > > > > > > Hi,
> > > > > > > > > 
> > > > > > > > > Under Xen PV dom0, with Linux >= 5.17, sound stops working after few
> > > > > > > > > hours. pavucontrol still shows meter bars moving, but the speakers
> > > > > > > > > remain silent. At least on some occasions I see the following message in
> > > > > > > > > dmesg:
> > > > > > > > > 
> > > > > > > > >   [ 2142.484553] snd_hda_intel 0000:00:1f.3: Unstable LPIB (18144 >= 6396); disabling LPIB delay counting
> > > > > > 
> > > > > > Hit the issue again, this message did not appear in the log (or at least
> > > > > > not yet).
> > > > > > 
> > > > > > (...)
> > > > > > 
> > > > > > > > In anyway, please check the behavior with 6.1-rc8 + the commit
> > > > > > > > cc26516374065a34e10c9a8bf3e940e42cd96e2a
> > > > > > > >     ALSA: memalloc: Allocate more contiguous pages for fallback case
> > > > > > > > from for-next of my sound git tree (which will be in 6.2-rc1).
> > > > > > 
> > > > > > This did not helped.
> > > > > > 
> > > > > > > Looking at the mentioned commits, there is one specific aspect of Xen PV
> > > > > > > that may be relevant. It configures PAT differently than native Linux.
> > > > > > > Theoretically Linux adapts automatically and using proper API (like
> > > > > > > set_memory_wc()) should just work, but at least for i915 driver it
> > > > > > > causes issues (not fully tracked down yet). Details about that bug
> > > > > > > report include some more background:
> > > > > > > https://lore.kernel.org/intel-gfx/Y5Hst0bCxQDTN7lK@mail-itl/
> > > > > > > 
> > > > > > > Anyway, I have tested it on a Xen modified to setup PAT the same way as
> > > > > > > native Linux and the audio issue is still there.
> > > > > > > 
> > > > > > > > If the problem persists, another thing to check is the hack below
> > > > > > > > works.
> > > > > > 
> > > > > > Trying this one now.
> > > > > 
> > > > > And this one didn't either :/
> > > > 
> > > > (Sorry for the late reply, as I've been off in the last weeks.)
> > > > 
> > > > I think the hack doesn't influence on the PCM buffer pages, but only
> > > > about BDL pages.  Could you check the patch below instead?
> > > > It'll disable the SG-buffer handling on x86 completely. 
> > > 
> > > This seems to "fix" the issue, thanks!
> > > I guess I'll run it this way for now, but a proper solution would be
> > > nice. Let me know if I can collect any more info that would help with
> > > that.
> > 
> > Then it seems we need to go back to coherent memory allocation for
> > the fallback SG cases.  It was changed because using
> > dma_alloc_coherent() caused a problem with retrieving the page
> > addresses on IOMMU systems, but since commit 9736a325137b we
> > essentially avoid the fallback when an IOMMU is used, so it should be
> > fine again.
> > 
> > Let me know whether a patch like the one below works for you instead
> > of the previous hack that disables SG-buffer (note: totally untested!)
> 
> Gah, there was an obvious typo, scratch that.
> 
> Below is a proper patch.  Please try this one instead.

Thanks, I'll give it a try.

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Intel HD Audio: sound stops working in Xen PV dom0 in >=5.17
  2023-01-17 11:36                 ` Marek Marczykowski-Górecki
@ 2023-01-17 14:21                   ` Marek Marczykowski-Górecki
  2023-01-17 14:33                     ` Takashi Iwai
  0 siblings, 1 reply; 25+ messages in thread
From: Marek Marczykowski-Górecki @ 2023-01-17 14:21 UTC (permalink / raw)
  To: Takashi Iwai; +Cc: alsa-devel, Harald Arnesen, Alex Xu

[-- Attachment #1: Type: text/plain, Size: 4302 bytes --]

On Tue, Jan 17, 2023 at 12:36:28PM +0100, Marek Marczykowski-Górecki wrote:
> On Tue, Jan 17, 2023 at 08:58:57AM +0100, Takashi Iwai wrote:
> > On Mon, 16 Jan 2023 16:55:11 +0100,
> > Takashi Iwai wrote:
> > > 
> > > On Tue, 27 Dec 2022 16:26:54 +0100,
> > > Marek Marczykowski-Górecki wrote:
> > > > 
> > > > On Thu, Dec 22, 2022 at 09:09:15AM +0100, Takashi Iwai wrote:
> > > > > On Sat, 10 Dec 2022 17:17:42 +0100,
> > > > > Marek Marczykowski-Górecki wrote:
> > > > > > 
> > > > > > On Sat, Dec 10, 2022 at 02:00:06AM +0100, Marek Marczykowski-Górecki wrote:
> > > > > > > On Fri, Dec 09, 2022 at 01:40:15PM +0100, Marek Marczykowski-Górecki wrote:
> > > > > > > > On Fri, Dec 09, 2022 at 09:10:19AM +0100, Takashi Iwai wrote:
> > > > > > > > > On Fri, 09 Dec 2022 02:27:30 +0100,
> > > > > > > > > Marek Marczykowski-Górecki wrote:
> > > > > > > > > > 
> > > > > > > > > > Hi,
> > > > > > > > > > 
> > > > > > > > > > Under Xen PV dom0, with Linux >= 5.17, sound stops working after few
> > > > > > > > > > hours. pavucontrol still shows meter bars moving, but the speakers
> > > > > > > > > > remain silent. At least on some occasions I see the following message in
> > > > > > > > > > dmesg:
> > > > > > > > > > 
> > > > > > > > > >   [ 2142.484553] snd_hda_intel 0000:00:1f.3: Unstable LPIB (18144 >= 6396); disabling LPIB delay counting
> > > > > > > 
> > > > > > > Hit the issue again, this message did not appear in the log (or at least
> > > > > > > not yet).
> > > > > > > 
> > > > > > > (...)
> > > > > > > 
> > > > > > > > > In anyway, please check the behavior with 6.1-rc8 + the commit
> > > > > > > > > cc26516374065a34e10c9a8bf3e940e42cd96e2a
> > > > > > > > >     ALSA: memalloc: Allocate more contiguous pages for fallback case
> > > > > > > > > from for-next of my sound git tree (which will be in 6.2-rc1).
> > > > > > > 
> > > > > > > This did not helped.
> > > > > > > 
> > > > > > > > Looking at the mentioned commits, there is one specific aspect of Xen PV
> > > > > > > > that may be relevant. It configures PAT differently than native Linux.
> > > > > > > > Theoretically Linux adapts automatically and using proper API (like
> > > > > > > > set_memory_wc()) should just work, but at least for i915 driver it
> > > > > > > > causes issues (not fully tracked down yet). Details about that bug
> > > > > > > > report include some more background:
> > > > > > > > https://lore.kernel.org/intel-gfx/Y5Hst0bCxQDTN7lK@mail-itl/
> > > > > > > > 
> > > > > > > > Anyway, I have tested it on a Xen modified to setup PAT the same way as
> > > > > > > > native Linux and the audio issue is still there.
> > > > > > > > 
> > > > > > > > > If the problem persists, another thing to check is the hack below
> > > > > > > > > works.
> > > > > > > 
> > > > > > > Trying this one now.
> > > > > > 
> > > > > > And this one didn't either :/
> > > > > 
> > > > > (Sorry for the late reply, as I've been off in the last weeks.)
> > > > > 
> > > > > I think the hack doesn't influence on the PCM buffer pages, but only
> > > > > about BDL pages.  Could you check the patch below instead?
> > > > > It'll disable the SG-buffer handling on x86 completely. 
> > > > 
> > > > This seems to "fix" the issue, thanks!
> > > > I guess I'll run it this way for now, but a proper solution would be
> > > > nice. Let me know if I can collect any more info that would help with
> > > > that.
> > > 
> > > Then it seems we need to go back to coherent memory allocation for
> > > the fallback SG cases.  It was changed because using
> > > dma_alloc_coherent() caused a problem with retrieving the page
> > > addresses on IOMMU systems, but since commit 9736a325137b we
> > > essentially avoid the fallback when an IOMMU is used, so it should be
> > > fine again.
> > > 
> > > Let me know whether a patch like the one below works for you instead
> > > of the previous hack that disables SG-buffer (note: totally untested!)
> > 
> > Gah, there was an obvious typo, scratch that.
> > 
> > Below is a proper patch.  Please try this one instead.
> 
> Thanks, I'll give it a try.

Unfortunately, it doesn't help, it stopped working again, after about 3h
uptime.

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Intel HD Audio: sound stops working in Xen PV dom0 in >=5.17
  2023-01-17 14:21                   ` Marek Marczykowski-Górecki
@ 2023-01-17 14:33                     ` Takashi Iwai
  2023-01-17 16:49                       ` Marek Marczykowski-Górecki
  0 siblings, 1 reply; 25+ messages in thread
From: Takashi Iwai @ 2023-01-17 14:33 UTC (permalink / raw)
  To: Marek Marczykowski-Górecki; +Cc: alsa-devel, Harald Arnesen, Alex Xu

On Tue, 17 Jan 2023 15:21:23 +0100,
Marek Marczykowski-Górecki wrote:
> 
> On Tue, Jan 17, 2023 at 12:36:28PM +0100, Marek Marczykowski-Górecki wrote:
> > On Tue, Jan 17, 2023 at 08:58:57AM +0100, Takashi Iwai wrote:
> > > On Mon, 16 Jan 2023 16:55:11 +0100,
> > > Takashi Iwai wrote:
> > > > 
> > > > On Tue, 27 Dec 2022 16:26:54 +0100,
> > > > Marek Marczykowski-Górecki wrote:
> > > > > 
> > > > > On Thu, Dec 22, 2022 at 09:09:15AM +0100, Takashi Iwai wrote:
> > > > > > On Sat, 10 Dec 2022 17:17:42 +0100,
> > > > > > Marek Marczykowski-Górecki wrote:
> > > > > > > 
> > > > > > > On Sat, Dec 10, 2022 at 02:00:06AM +0100, Marek Marczykowski-Górecki wrote:
> > > > > > > > On Fri, Dec 09, 2022 at 01:40:15PM +0100, Marek Marczykowski-Górecki wrote:
> > > > > > > > > On Fri, Dec 09, 2022 at 09:10:19AM +0100, Takashi Iwai wrote:
> > > > > > > > > > On Fri, 09 Dec 2022 02:27:30 +0100,
> > > > > > > > > > Marek Marczykowski-Górecki wrote:
> > > > > > > > > > > 
> > > > > > > > > > > Hi,
> > > > > > > > > > > 
> > > > > > > > > > > Under Xen PV dom0, with Linux >= 5.17, sound stops working after few
> > > > > > > > > > > hours. pavucontrol still shows meter bars moving, but the speakers
> > > > > > > > > > > remain silent. At least on some occasions I see the following message in
> > > > > > > > > > > dmesg:
> > > > > > > > > > > 
> > > > > > > > > > >   [ 2142.484553] snd_hda_intel 0000:00:1f.3: Unstable LPIB (18144 >= 6396); disabling LPIB delay counting
> > > > > > > > 
> > > > > > > > Hit the issue again, this message did not appear in the log (or at least
> > > > > > > > not yet).
> > > > > > > > 
> > > > > > > > (...)
> > > > > > > > 
> > > > > > > > > > In anyway, please check the behavior with 6.1-rc8 + the commit
> > > > > > > > > > cc26516374065a34e10c9a8bf3e940e42cd96e2a
> > > > > > > > > >     ALSA: memalloc: Allocate more contiguous pages for fallback case
> > > > > > > > > > from for-next of my sound git tree (which will be in 6.2-rc1).
> > > > > > > > 
> > > > > > > > This did not helped.
> > > > > > > > 
> > > > > > > > > Looking at the mentioned commits, there is one specific aspect of Xen PV
> > > > > > > > > that may be relevant. It configures PAT differently than native Linux.
> > > > > > > > > Theoretically Linux adapts automatically and using proper API (like
> > > > > > > > > set_memory_wc()) should just work, but at least for i915 driver it
> > > > > > > > > causes issues (not fully tracked down yet). Details about that bug
> > > > > > > > > report include some more background:
> > > > > > > > > https://lore.kernel.org/intel-gfx/Y5Hst0bCxQDTN7lK@mail-itl/
> > > > > > > > > 
> > > > > > > > > Anyway, I have tested it on a Xen modified to setup PAT the same way as
> > > > > > > > > native Linux and the audio issue is still there.
> > > > > > > > > 
> > > > > > > > > > If the problem persists, another thing to check is the hack below
> > > > > > > > > > works.
> > > > > > > > 
> > > > > > > > Trying this one now.
> > > > > > > 
> > > > > > > And this one didn't either :/
> > > > > > 
> > > > > > (Sorry for the late reply, as I've been off in the last weeks.)
> > > > > > 
> > > > > > I think the hack doesn't influence on the PCM buffer pages, but only
> > > > > > about BDL pages.  Could you check the patch below instead?
> > > > > > It'll disable the SG-buffer handling on x86 completely. 
> > > > > 
> > > > > This seems to "fix" the issue, thanks!
> > > > > I guess I'll run it this way for now, but a proper solution would be
> > > > > nice. Let me know if I can collect any more info that would help with
> > > > > that.
> > > > 
> > > > Then it seems we need to go back to coherent memory allocation for
> > > > the fallback SG cases.  It was changed because using
> > > > dma_alloc_coherent() caused a problem with retrieving the page
> > > > addresses on IOMMU systems, but since commit 9736a325137b we
> > > > essentially avoid the fallback when an IOMMU is used, so it should be
> > > > fine again.
> > > > 
> > > > Let me know whether a patch like the one below works for you instead
> > > > of the previous hack that disables SG-buffer (note: totally untested!)
> > > 
> > > Gah, there was an obvious typo, scratch that.
> > > 
> > > Below is a proper patch.  Please try this one instead.
> > 
> > Thanks, I'll give it a try.
> 
> Unfortunately, it doesn't help, it stopped working again, after about 3h
> uptime.

Aha, then it might rather be the other way round:
dma_alloc_noncontiguous() doesn't work properly on Xen.

Could you try the patch below instead of the previous one?


Takashi

-- 8< --
--- a/sound/core/memalloc.c
+++ b/sound/core/memalloc.c
@@ -538,11 +538,11 @@ static const struct snd_malloc_ops snd_dma_wc_ops = {
  */
 static void *snd_dma_noncontig_alloc(struct snd_dma_buffer *dmab, size_t size)
 {
-	struct sg_table *sgt;
+	struct sg_table *sgt = NULL;
 	void *p;
 
-	sgt = dma_alloc_noncontiguous(dmab->dev.dev, size, dmab->dev.dir,
-				      DEFAULT_GFP, 0);
+	// sgt = dma_alloc_noncontiguous(dmab->dev.dev, size, dmab->dev.dir,
+	//			      DEFAULT_GFP, 0);
 #ifdef CONFIG_SND_DMA_SGBUF
 	if (!sgt && !get_dma_ops(dmab->dev.dev)) {
 		if (dmab->dev.type == SNDRV_DMA_TYPE_DEV_WC_SG)

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Intel HD Audio: sound stops working in Xen PV dom0 in >=5.17
  2023-01-17 14:33                     ` Takashi Iwai
@ 2023-01-17 16:49                       ` Marek Marczykowski-Górecki
  2023-01-17 16:52                         ` Takashi Iwai
  0 siblings, 1 reply; 25+ messages in thread
From: Marek Marczykowski-Górecki @ 2023-01-17 16:49 UTC (permalink / raw)
  To: Takashi Iwai; +Cc: alsa-devel, Harald Arnesen, Alex Xu

[-- Attachment #1: Type: text/plain, Size: 5012 bytes --]

On Tue, Jan 17, 2023 at 03:33:42PM +0100, Takashi Iwai wrote:
> On Tue, 17 Jan 2023 15:21:23 +0100,
> Marek Marczykowski-Górecki wrote:
> > 
> > On Tue, Jan 17, 2023 at 12:36:28PM +0100, Marek Marczykowski-Górecki wrote:
> > > On Tue, Jan 17, 2023 at 08:58:57AM +0100, Takashi Iwai wrote:
> > > > On Mon, 16 Jan 2023 16:55:11 +0100,
> > > > Takashi Iwai wrote:
> > > > > 
> > > > > On Tue, 27 Dec 2022 16:26:54 +0100,
> > > > > Marek Marczykowski-Górecki wrote:
> > > > > > 
> > > > > > On Thu, Dec 22, 2022 at 09:09:15AM +0100, Takashi Iwai wrote:
> > > > > > > On Sat, 10 Dec 2022 17:17:42 +0100,
> > > > > > > Marek Marczykowski-Górecki wrote:
> > > > > > > > 
> > > > > > > > On Sat, Dec 10, 2022 at 02:00:06AM +0100, Marek Marczykowski-Górecki wrote:
> > > > > > > > > On Fri, Dec 09, 2022 at 01:40:15PM +0100, Marek Marczykowski-Górecki wrote:
> > > > > > > > > > On Fri, Dec 09, 2022 at 09:10:19AM +0100, Takashi Iwai wrote:
> > > > > > > > > > > On Fri, 09 Dec 2022 02:27:30 +0100,
> > > > > > > > > > > Marek Marczykowski-Górecki wrote:
> > > > > > > > > > > > 
> > > > > > > > > > > > Hi,
> > > > > > > > > > > > 
> > > > > > > > > > > > Under Xen PV dom0, with Linux >= 5.17, sound stops working after few
> > > > > > > > > > > > hours. pavucontrol still shows meter bars moving, but the speakers
> > > > > > > > > > > > remain silent. At least on some occasions I see the following message in
> > > > > > > > > > > > dmesg:
> > > > > > > > > > > > 
> > > > > > > > > > > >   [ 2142.484553] snd_hda_intel 0000:00:1f.3: Unstable LPIB (18144 >= 6396); disabling LPIB delay counting
> > > > > > > > > 
> > > > > > > > > Hit the issue again, this message did not appear in the log (or at least
> > > > > > > > > not yet).
> > > > > > > > > 
> > > > > > > > > (...)
> > > > > > > > > 
> > > > > > > > > > > In anyway, please check the behavior with 6.1-rc8 + the commit
> > > > > > > > > > > cc26516374065a34e10c9a8bf3e940e42cd96e2a
> > > > > > > > > > >     ALSA: memalloc: Allocate more contiguous pages for fallback case
> > > > > > > > > > > from for-next of my sound git tree (which will be in 6.2-rc1).
> > > > > > > > > 
> > > > > > > > > This did not helped.
> > > > > > > > > 
> > > > > > > > > > Looking at the mentioned commits, there is one specific aspect of Xen PV
> > > > > > > > > > that may be relevant. It configures PAT differently than native Linux.
> > > > > > > > > > Theoretically Linux adapts automatically and using proper API (like
> > > > > > > > > > set_memory_wc()) should just work, but at least for i915 driver it
> > > > > > > > > > causes issues (not fully tracked down yet). Details about that bug
> > > > > > > > > > report include some more background:
> > > > > > > > > > https://lore.kernel.org/intel-gfx/Y5Hst0bCxQDTN7lK@mail-itl/
> > > > > > > > > > 
> > > > > > > > > > Anyway, I have tested it on a Xen modified to setup PAT the same way as
> > > > > > > > > > native Linux and the audio issue is still there.
> > > > > > > > > > 
> > > > > > > > > > > If the problem persists, another thing to check is the hack below
> > > > > > > > > > > works.
> > > > > > > > > 
> > > > > > > > > Trying this one now.
> > > > > > > > 
> > > > > > > > And this one didn't either :/
> > > > > > > 
> > > > > > > (Sorry for the late reply, as I've been off in the last weeks.)
> > > > > > > 
> > > > > > > I think the hack doesn't influence on the PCM buffer pages, but only
> > > > > > > about BDL pages.  Could you check the patch below instead?
> > > > > > > It'll disable the SG-buffer handling on x86 completely. 
> > > > > > 
> > > > > > This seems to "fix" the issue, thanks!
> > > > > > I guess I'll run it this way for now, but a proper solution would be
> > > > > > nice. Let me know if I can collect any more info that would help with
> > > > > > that.
> > > > > 
> > > > > Then we seem to go back again with the coherent memory allocation for
> > > > > the fallback sg cases.  It was changed because the use of
> > > > > dma_alloc_coherent() caused a problem with IOMMU case for retrieving
> > > > > the page addresses, but since the commit 9736a325137b, we essentially
> > > > > avoid the fallback when IOMMU is used, so it should be fine again.
> > > > > 
> > > > > Let me know if the patch like below works for you instead of the
> > > > > previous hack to disable SG-buffer (note: totally untested!)
> > > > 
> > > > Gah, there was an obvious typo, scratch that.
> > > > 
> > > > Below is a proper patch.  Please try this one instead.
> > > 
> > > Thanks, I'll give it a try.
> > 
> > Unfortunately, it doesn't help, it stopped working again, after about 3h
> > uptime.
> 
> Aha, then it might be rather other way round;
> dma_alloc_noncontiguous() doesn't work on Xen properly.
> 
> Could you try the one below instead of the previous?

Unfortunately, this one doesn't fix it either :/

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab

* Re: Intel HD Audio: sound stops working in Xen PV dom0 in >=5.17
  2023-01-17 16:49                       ` Marek Marczykowski-Górecki
@ 2023-01-17 16:52                         ` Takashi Iwai
  2023-01-17 20:34                           ` Marek Marczykowski-Górecki
  0 siblings, 1 reply; 25+ messages in thread
From: Takashi Iwai @ 2023-01-17 16:52 UTC (permalink / raw)
  To: Marek Marczykowski-Górecki; +Cc: alsa-devel, Harald Arnesen, Alex Xu

On Tue, 17 Jan 2023 17:49:28 +0100,
Marek Marczykowski-Górecki wrote:
> 
> On Tue, Jan 17, 2023 at 03:33:42PM +0100, Takashi Iwai wrote:
> > On Tue, 17 Jan 2023 15:21:23 +0100,
> > Marek Marczykowski-Górecki wrote:
> > > 
> > > On Tue, Jan 17, 2023 at 12:36:28PM +0100, Marek Marczykowski-Górecki wrote:
> > > > On Tue, Jan 17, 2023 at 08:58:57AM +0100, Takashi Iwai wrote:
> > > > > On Mon, 16 Jan 2023 16:55:11 +0100,
> > > > > Takashi Iwai wrote:
> > > > > > 
> > > > > > On Tue, 27 Dec 2022 16:26:54 +0100,
> > > > > > Marek Marczykowski-Górecki wrote:
> > > > > > > 
> > > > > > > On Thu, Dec 22, 2022 at 09:09:15AM +0100, Takashi Iwai wrote:
> > > > > > > > On Sat, 10 Dec 2022 17:17:42 +0100,
> > > > > > > > Marek Marczykowski-Górecki wrote:
> > > > > > > > > 
> > > > > > > > > On Sat, Dec 10, 2022 at 02:00:06AM +0100, Marek Marczykowski-Górecki wrote:
> > > > > > > > > > On Fri, Dec 09, 2022 at 01:40:15PM +0100, Marek Marczykowski-Górecki wrote:
> > > > > > > > > > > On Fri, Dec 09, 2022 at 09:10:19AM +0100, Takashi Iwai wrote:
> > > > > > > > > > > > On Fri, 09 Dec 2022 02:27:30 +0100,
> > > > > > > > > > > > Marek Marczykowski-Górecki wrote:
> > > > > > > > > > > > > 
> > > > > > > > > > > > > Hi,
> > > > > > > > > > > > > 
> > > > > > > > > > > > > Under Xen PV dom0, with Linux >= 5.17, sound stops working after few
> > > > > > > > > > > > > hours. pavucontrol still shows meter bars moving, but the speakers
> > > > > > > > > > > > > remain silent. At least on some occasions I see the following message in
> > > > > > > > > > > > > dmesg:
> > > > > > > > > > > > > 
> > > > > > > > > > > > >   [ 2142.484553] snd_hda_intel 0000:00:1f.3: Unstable LPIB (18144 >= 6396); disabling LPIB delay counting
> > > > > > > > > > 
> > > > > > > > > > Hit the issue again, this message did not appear in the log (or at least
> > > > > > > > > > not yet).
> > > > > > > > > > 
> > > > > > > > > > (...)
> > > > > > > > > > 
> > > > > > > > > > > > In anyway, please check the behavior with 6.1-rc8 + the commit
> > > > > > > > > > > > cc26516374065a34e10c9a8bf3e940e42cd96e2a
> > > > > > > > > > > >     ALSA: memalloc: Allocate more contiguous pages for fallback case
> > > > > > > > > > > > from for-next of my sound git tree (which will be in 6.2-rc1).
> > > > > > > > > > 
> > > > > > > > > > This did not helped.
> > > > > > > > > > 
> > > > > > > > > > > Looking at the mentioned commits, there is one specific aspect of Xen PV
> > > > > > > > > > > that may be relevant. It configures PAT differently than native Linux.
> > > > > > > > > > > Theoretically Linux adapts automatically and using proper API (like
> > > > > > > > > > > set_memory_wc()) should just work, but at least for i915 driver it
> > > > > > > > > > > causes issues (not fully tracked down yet). Details about that bug
> > > > > > > > > > > report include some more background:
> > > > > > > > > > > https://lore.kernel.org/intel-gfx/Y5Hst0bCxQDTN7lK@mail-itl/
> > > > > > > > > > > 
> > > > > > > > > > > Anyway, I have tested it on a Xen modified to setup PAT the same way as
> > > > > > > > > > > native Linux and the audio issue is still there.
> > > > > > > > > > > 
> > > > > > > > > > > > If the problem persists, another thing to check is the hack below
> > > > > > > > > > > > works.
> > > > > > > > > > 
> > > > > > > > > > Trying this one now.
> > > > > > > > > 
> > > > > > > > > And this one didn't either :/
> > > > > > > > 
> > > > > > > > (Sorry for the late reply, as I've been off in the last weeks.)
> > > > > > > > 
> > > > > > > > I think the hack doesn't influence on the PCM buffer pages, but only
> > > > > > > > about BDL pages.  Could you check the patch below instead?
> > > > > > > > It'll disable the SG-buffer handling on x86 completely. 
> > > > > > > 
> > > > > > > This seems to "fix" the issue, thanks!
> > > > > > > I guess I'll run it this way for now, but a proper solution would be
> > > > > > > nice. Let me know if I can collect any more info that would help with
> > > > > > > that.
> > > > > > 
> > > > > > Then we seem to go back again with the coherent memory allocation for
> > > > > > the fallback sg cases.  It was changed because the use of
> > > > > > dma_alloc_coherent() caused a problem with IOMMU case for retrieving
> > > > > > the page addresses, but since the commit 9736a325137b, we essentially
> > > > > > avoid the fallback when IOMMU is used, so it should be fine again.
> > > > > > 
> > > > > > Let me know if the patch like below works for you instead of the
> > > > > > previous hack to disable SG-buffer (note: totally untested!)
> > > > > 
> > > > > Gah, there was an obvious typo, scratch that.
> > > > > 
> > > > > Below is a proper patch.  Please try this one instead.
> > > > 
> > > > Thanks, I'll give it a try.
> > > 
> > > Unfortunately, it doesn't help, it stopped working again, after about 3h
> > > uptime.
> > 
> > Aha, then it might be rather other way round;
> > dma_alloc_noncontiguous() doesn't work on Xen properly.
> > 
> > Could you try the one below instead of the previous?
> 
> Unfortunately, this one doesn't fix it either :/

Hmm.  Then how about applying both of the last two patches?  The last
one enforces the fallback allocation and the previous one uses
dma_alloc_coherent().  Together they should essentially revert to the
old behavior.


Takashi

* Re: Intel HD Audio: sound stops working in Xen PV dom0 in >=5.17
  2023-01-17 16:52                         ` Takashi Iwai
@ 2023-01-17 20:34                           ` Marek Marczykowski-Górecki
  2023-01-18  8:59                             ` Takashi Iwai
  0 siblings, 1 reply; 25+ messages in thread
From: Marek Marczykowski-Górecki @ 2023-01-17 20:34 UTC (permalink / raw)
  To: Takashi Iwai; +Cc: alsa-devel, Harald Arnesen, Alex Xu

On Tue, Jan 17, 2023 at 05:52:25PM +0100, Takashi Iwai wrote:
> On Tue, 17 Jan 2023 17:49:28 +0100,
> Marek Marczykowski-Górecki wrote:
> > 
> > On Tue, Jan 17, 2023 at 03:33:42PM +0100, Takashi Iwai wrote:
> > > On Tue, 17 Jan 2023 15:21:23 +0100,
> > > Marek Marczykowski-Górecki wrote:
> > > > 
> > > > On Tue, Jan 17, 2023 at 12:36:28PM +0100, Marek Marczykowski-Górecki wrote:
> > > > > On Tue, Jan 17, 2023 at 08:58:57AM +0100, Takashi Iwai wrote:
> > > > > > On Mon, 16 Jan 2023 16:55:11 +0100,
> > > > > > Takashi Iwai wrote:
> > > > > > > 
> > > > > > > On Tue, 27 Dec 2022 16:26:54 +0100,
> > > > > > > Marek Marczykowski-Górecki wrote:
> > > > > > > > 
> > > > > > > > On Thu, Dec 22, 2022 at 09:09:15AM +0100, Takashi Iwai wrote:
> > > > > > > > > On Sat, 10 Dec 2022 17:17:42 +0100,
> > > > > > > > > Marek Marczykowski-Górecki wrote:
> > > > > > > > > > 
> > > > > > > > > > On Sat, Dec 10, 2022 at 02:00:06AM +0100, Marek Marczykowski-Górecki wrote:
> > > > > > > > > > > On Fri, Dec 09, 2022 at 01:40:15PM +0100, Marek Marczykowski-Górecki wrote:
> > > > > > > > > > > > On Fri, Dec 09, 2022 at 09:10:19AM +0100, Takashi Iwai wrote:
> > > > > > > > > > > > > On Fri, 09 Dec 2022 02:27:30 +0100,
> > > > > > > > > > > > > Marek Marczykowski-Górecki wrote:
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > Hi,
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > Under Xen PV dom0, with Linux >= 5.17, sound stops working after few
> > > > > > > > > > > > > > hours. pavucontrol still shows meter bars moving, but the speakers
> > > > > > > > > > > > > > remain silent. At least on some occasions I see the following message in
> > > > > > > > > > > > > > dmesg:
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > >   [ 2142.484553] snd_hda_intel 0000:00:1f.3: Unstable LPIB (18144 >= 6396); disabling LPIB delay counting
> > > > > > > > > > > 
> > > > > > > > > > > Hit the issue again, this message did not appear in the log (or at least
> > > > > > > > > > > not yet).
> > > > > > > > > > > 
> > > > > > > > > > > (...)
> > > > > > > > > > > 
> > > > > > > > > > > > > In anyway, please check the behavior with 6.1-rc8 + the commit
> > > > > > > > > > > > > cc26516374065a34e10c9a8bf3e940e42cd96e2a
> > > > > > > > > > > > >     ALSA: memalloc: Allocate more contiguous pages for fallback case
> > > > > > > > > > > > > from for-next of my sound git tree (which will be in 6.2-rc1).
> > > > > > > > > > > 
> > > > > > > > > > > This did not helped.
> > > > > > > > > > > 
> > > > > > > > > > > > Looking at the mentioned commits, there is one specific aspect of Xen PV
> > > > > > > > > > > > that may be relevant. It configures PAT differently than native Linux.
> > > > > > > > > > > > Theoretically Linux adapts automatically and using proper API (like
> > > > > > > > > > > > set_memory_wc()) should just work, but at least for i915 driver it
> > > > > > > > > > > > causes issues (not fully tracked down yet). Details about that bug
> > > > > > > > > > > > report include some more background:
> > > > > > > > > > > > https://lore.kernel.org/intel-gfx/Y5Hst0bCxQDTN7lK@mail-itl/
> > > > > > > > > > > > 
> > > > > > > > > > > > Anyway, I have tested it on a Xen modified to setup PAT the same way as
> > > > > > > > > > > > native Linux and the audio issue is still there.
> > > > > > > > > > > > 
> > > > > > > > > > > > > If the problem persists, another thing to check is the hack below
> > > > > > > > > > > > > works.
> > > > > > > > > > > 
> > > > > > > > > > > Trying this one now.
> > > > > > > > > > 
> > > > > > > > > > And this one didn't either :/
> > > > > > > > > 
> > > > > > > > > (Sorry for the late reply, as I've been off in the last weeks.)
> > > > > > > > > 
> > > > > > > > > I think the hack doesn't influence on the PCM buffer pages, but only
> > > > > > > > > about BDL pages.  Could you check the patch below instead?
> > > > > > > > > It'll disable the SG-buffer handling on x86 completely. 
> > > > > > > > 
> > > > > > > > This seems to "fix" the issue, thanks!
> > > > > > > > I guess I'll run it this way for now, but a proper solution would be
> > > > > > > > nice. Let me know if I can collect any more info that would help with
> > > > > > > > that.
> > > > > > > 
> > > > > > > Then we seem to go back again with the coherent memory allocation for
> > > > > > > the fallback sg cases.  It was changed because the use of
> > > > > > > dma_alloc_coherent() caused a problem with IOMMU case for retrieving
> > > > > > > the page addresses, but since the commit 9736a325137b, we essentially
> > > > > > > avoid the fallback when IOMMU is used, so it should be fine again.
> > > > > > > 
> > > > > > > Let me know if the patch like below works for you instead of the
> > > > > > > previous hack to disable SG-buffer (note: totally untested!)
> > > > > > 
> > > > > > Gah, there was an obvious typo, scratch that.
> > > > > > 
> > > > > > Below is a proper patch.  Please try this one instead.
> > > > > 
> > > > > Thanks, I'll give it a try.
> > > > 
> > > > Unfortunately, it doesn't help, it stopped working again, after about 3h
> > > > uptime.
> > > 
> > > Aha, then it might be rather other way round;
> > > dma_alloc_noncontiguous() doesn't work on Xen properly.
> > > 
> > > Could you try the one below instead of the previous?
> > 
> > Unfortunately, this one doesn't fix it either :/
> 
> Hmm.  Then how about applying both of the last two patches?  The last
> one to enforce the fallback allocation and the previous one to use
> dma_alloc_coherent().  It should be essentially reverting to the old
> way.

Oh, I only noticed now: with the last patch applied, audio fails to
initialize. I don't see any obvious errors in dmesg, but when I try
aplay, I get:

    ALSA lib pcm_direct.c:1284:(snd1_pcm_direct_initialize_slave) unable to install hw params
    ALSA lib pcm_dmix.c:1087:(snd_pcm_dmix_open) unable to initialize slave
    aplay: main:830: audio open error: Cannot allocate memory

# dmesg |grep snd
[   21.947940] snd_hda_intel 0000:00:1f.3: bound 0000:00:02.0 (ops i915_audio_component_bind_ops [i915])
[   21.985769] snd_hda_codec_conexant hdaudioC0D0: CX8200: BIOS auto-probing.
[   21.987956] snd_hda_codec_conexant hdaudioC0D0: vmaster hook already present before cdev!
[   21.990073] snd_hda_codec_conexant hdaudioC0D0: autoconfig for CX8200: line_outs=1 (0x17/0x0/0x0/0x0/0x0) type:speaker
[   21.991126] snd_hda_codec_conexant hdaudioC0D0:    speaker_outs=0 (0x0/0x0/0x0/0x0/0x0)
[   21.992188] snd_hda_codec_conexant hdaudioC0D0:    hp_outs=1 (0x16/0x0/0x0/0x0/0x0)
[   21.993234] snd_hda_codec_conexant hdaudioC0D0:    mono: mono_out=0x0
[   21.994274] snd_hda_codec_conexant hdaudioC0D0:    inputs:
[   22.000517] snd_hda_codec_conexant hdaudioC0D0:      Internal Mic=0x1a
[   22.001586] snd_hda_codec_conexant hdaudioC0D0:      Mic=0x19


-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab

* Re: Intel HD Audio: sound stops working in Xen PV dom0 in >=5.17
  2023-01-17 20:34                           ` Marek Marczykowski-Górecki
@ 2023-01-18  8:59                             ` Takashi Iwai
  2023-01-18 10:39                               ` Marek Marczykowski-Górecki
  0 siblings, 1 reply; 25+ messages in thread
From: Takashi Iwai @ 2023-01-18  8:59 UTC (permalink / raw)
  To: Marek Marczykowski-Górecki; +Cc: alsa-devel, Harald Arnesen, Alex Xu

On Tue, 17 Jan 2023 21:34:11 +0100,
Marek Marczykowski-Górecki wrote:
> 
> On Tue, Jan 17, 2023 at 05:52:25PM +0100, Takashi Iwai wrote:
> > On Tue, 17 Jan 2023 17:49:28 +0100,
> > Marek Marczykowski-Górecki wrote:
> > > 
> > > On Tue, Jan 17, 2023 at 03:33:42PM +0100, Takashi Iwai wrote:
> > > > On Tue, 17 Jan 2023 15:21:23 +0100,
> > > > Marek Marczykowski-Górecki wrote:
> > > > > 
> > > > > On Tue, Jan 17, 2023 at 12:36:28PM +0100, Marek Marczykowski-Górecki wrote:
> > > > > > On Tue, Jan 17, 2023 at 08:58:57AM +0100, Takashi Iwai wrote:
> > > > > > > On Mon, 16 Jan 2023 16:55:11 +0100,
> > > > > > > Takashi Iwai wrote:
> > > > > > > > 
> > > > > > > > On Tue, 27 Dec 2022 16:26:54 +0100,
> > > > > > > > Marek Marczykowski-Górecki wrote:
> > > > > > > > > 
> > > > > > > > > On Thu, Dec 22, 2022 at 09:09:15AM +0100, Takashi Iwai wrote:
> > > > > > > > > > On Sat, 10 Dec 2022 17:17:42 +0100,
> > > > > > > > > > Marek Marczykowski-Górecki wrote:
> > > > > > > > > > > 
> > > > > > > > > > > On Sat, Dec 10, 2022 at 02:00:06AM +0100, Marek Marczykowski-Górecki wrote:
> > > > > > > > > > > > On Fri, Dec 09, 2022 at 01:40:15PM +0100, Marek Marczykowski-Górecki wrote:
> > > > > > > > > > > > > On Fri, Dec 09, 2022 at 09:10:19AM +0100, Takashi Iwai wrote:
> > > > > > > > > > > > > > On Fri, 09 Dec 2022 02:27:30 +0100,
> > > > > > > > > > > > > > Marek Marczykowski-Górecki wrote:
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > Hi,
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > Under Xen PV dom0, with Linux >= 5.17, sound stops working after few
> > > > > > > > > > > > > > > hours. pavucontrol still shows meter bars moving, but the speakers
> > > > > > > > > > > > > > > remain silent. At least on some occasions I see the following message in
> > > > > > > > > > > > > > > dmesg:
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > >   [ 2142.484553] snd_hda_intel 0000:00:1f.3: Unstable LPIB (18144 >= 6396); disabling LPIB delay counting
> > > > > > > > > > > > 
> > > > > > > > > > > > Hit the issue again, this message did not appear in the log (or at least
> > > > > > > > > > > > not yet).
> > > > > > > > > > > > 
> > > > > > > > > > > > (...)
> > > > > > > > > > > > 
> > > > > > > > > > > > > > In anyway, please check the behavior with 6.1-rc8 + the commit
> > > > > > > > > > > > > > cc26516374065a34e10c9a8bf3e940e42cd96e2a
> > > > > > > > > > > > > >     ALSA: memalloc: Allocate more contiguous pages for fallback case
> > > > > > > > > > > > > > from for-next of my sound git tree (which will be in 6.2-rc1).
> > > > > > > > > > > > 
> > > > > > > > > > > > This did not helped.
> > > > > > > > > > > > 
> > > > > > > > > > > > > Looking at the mentioned commits, there is one specific aspect of Xen PV
> > > > > > > > > > > > > that may be relevant. It configures PAT differently than native Linux.
> > > > > > > > > > > > > Theoretically Linux adapts automatically and using proper API (like
> > > > > > > > > > > > > set_memory_wc()) should just work, but at least for i915 driver it
> > > > > > > > > > > > > causes issues (not fully tracked down yet). Details about that bug
> > > > > > > > > > > > > report include some more background:
> > > > > > > > > > > > > https://lore.kernel.org/intel-gfx/Y5Hst0bCxQDTN7lK@mail-itl/
> > > > > > > > > > > > > 
> > > > > > > > > > > > > Anyway, I have tested it on a Xen modified to setup PAT the same way as
> > > > > > > > > > > > > native Linux and the audio issue is still there.
> > > > > > > > > > > > > 
> > > > > > > > > > > > > > If the problem persists, another thing to check is the hack below
> > > > > > > > > > > > > > works.
> > > > > > > > > > > > 
> > > > > > > > > > > > Trying this one now.
> > > > > > > > > > > 
> > > > > > > > > > > And this one didn't either :/
> > > > > > > > > > 
> > > > > > > > > > (Sorry for the late reply, as I've been off in the last weeks.)
> > > > > > > > > > 
> > > > > > > > > > I think the hack doesn't influence on the PCM buffer pages, but only
> > > > > > > > > > about BDL pages.  Could you check the patch below instead?
> > > > > > > > > > It'll disable the SG-buffer handling on x86 completely. 
> > > > > > > > > 
> > > > > > > > > This seems to "fix" the issue, thanks!
> > > > > > > > > I guess I'll run it this way for now, but a proper solution would be
> > > > > > > > > nice. Let me know if I can collect any more info that would help with
> > > > > > > > > that.
> > > > > > > > 
> > > > > > > > Then we seem to go back again with the coherent memory allocation for
> > > > > > > > the fallback sg cases.  It was changed because the use of
> > > > > > > > dma_alloc_coherent() caused a problem with IOMMU case for retrieving
> > > > > > > > the page addresses, but since the commit 9736a325137b, we essentially
> > > > > > > > avoid the fallback when IOMMU is used, so it should be fine again.
> > > > > > > > 
> > > > > > > > Let me know if the patch like below works for you instead of the
> > > > > > > > previous hack to disable SG-buffer (note: totally untested!)
> > > > > > > 
> > > > > > > Gah, there was an obvious typo, scratch that.
> > > > > > > 
> > > > > > > Below is a proper patch.  Please try this one instead.
> > > > > > 
> > > > > > Thanks, I'll give it a try.
> > > > > 
> > > > > Unfortunately, it doesn't help, it stopped working again, after about 3h
> > > > > uptime.
> > > > 
> > > > Aha, then it might be rather other way round;
> > > > dma_alloc_noncontiguous() doesn't work on Xen properly.
> > > > 
> > > > Could you try the one below instead of the previous?
> > > 
> > > Unfortunately, this one doesn't fix it either :/
> > 
> > Hmm.  Then how about applying both of the last two patches?  The last
> > one to enforce the fallback allocation and the previous one to use
> > dma_alloc_coherent().  It should be essentially reverting to the old
> > way.
> 
> Oh, I noticed only now: the last patch made it fail to initialize.

By "last patch", do you mean the patch that enforces the fallback
allocation?

> I
> don't see obvious errors in dmesg, but when trying aplay, I get:
> 
>     ALSA lib pcm_direct.c:1284:(snd1_pcm_direct_initialize_slave) unable to install hw params
>     ALSA lib pcm_dmix.c:1087:(snd_pcm_dmix_open) unable to initialize slave
>     aplay: main:830: audio open error: Cannot allocate memory

It's -ENOMEM, so it must come from there.  Does it always appear?  If
yes, your system uses an IOMMU, and the patch intentionally makes the
allocation always return NULL.

If that's the case, the problem is that the IOMMU path doesn't handle
coherent memory properly on Xen.

Please check more explicitly whether the get_dma_ops(dmab->dev.dev)
call in snd_dma_noncontig_alloc() returns NULL or not.
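
For illustration, here is a minimal userspace sketch of the branch in
question. It assumes the structure visible in the test patch earlier in
this thread. The struct layouts and the model_* and report() names are
hypothetical stand-ins, not kernel definitions, and the final "return
NULL" outcome is an assumption about the code after the #ifdef block.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for the kernel types; only what the model needs. */
struct dma_map_ops { int unused; };
struct device { const struct dma_map_ops *dma_ops; };

/* Possible outcomes of the modelled allocation. */
enum alloc_path { PATH_NONCONTIG, PATH_FALLBACK, PATH_ENOMEM };

/* Models get_dma_ops(): NULL means no dma_map_ops are attached. */
static const struct dma_map_ops *model_get_dma_ops(const struct device *dev)
{
	return dev->dma_ops;
}

/*
 * Models the decision in snd_dma_noncontig_alloc().  sgt_ok says whether
 * dma_alloc_noncontiguous() returned a table; the test patch forces it to 0.
 * Assumption: when neither path applies, the caller ends up with -ENOMEM.
 */
static enum alloc_path model_noncontig_alloc(const struct device *dev,
					     int sgt_ok)
{
	if (sgt_ok)
		return PATH_NONCONTIG;
	if (!model_get_dma_ops(dev))
		return PATH_FALLBACK;
	return PATH_ENOMEM;
}

/* Prints what a debug message next to the real check might report. */
static void report(const char *name, const struct device *dev, int sgt_ok)
{
	static const char *const names[] = { "noncontig", "fallback", "ENOMEM" };

	printf("%s: get_dma_ops %s NULL, path=%s\n", name,
	       model_get_dma_ops(dev) ? "!=" : "==",
	       names[model_noncontig_alloc(dev, sgt_ok)]);
}
```

If the device has dma ops set while the noncontiguous allocation is
forced off, the model ends in the ENOMEM case, which would match the
aplay error quoted above.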


thanks,

Takashi

* Re: Intel HD Audio: sound stops working in Xen PV dom0 in >=5.17
  2023-01-18  8:59                             ` Takashi Iwai
@ 2023-01-18 10:39                               ` Marek Marczykowski-Górecki
  2023-01-18 12:39                                 ` Takashi Iwai
  0 siblings, 1 reply; 25+ messages in thread
From: Marek Marczykowski-Górecki @ 2023-01-18 10:39 UTC (permalink / raw)
  To: Takashi Iwai; +Cc: alsa-devel, Harald Arnesen, Alex Xu

On Wed, Jan 18, 2023 at 09:59:26AM +0100, Takashi Iwai wrote:
> On Tue, 17 Jan 2023 21:34:11 +0100,
> Marek Marczykowski-Górecki wrote:
> > 
> > On Tue, Jan 17, 2023 at 05:52:25PM +0100, Takashi Iwai wrote:
> > > On Tue, 17 Jan 2023 17:49:28 +0100,
> > > Marek Marczykowski-Górecki wrote:
> > > > 
> > > > On Tue, Jan 17, 2023 at 03:33:42PM +0100, Takashi Iwai wrote:
> > > > > On Tue, 17 Jan 2023 15:21:23 +0100,
> > > > > Marek Marczykowski-Górecki wrote:
> > > > > > 
> > > > > > On Tue, Jan 17, 2023 at 12:36:28PM +0100, Marek Marczykowski-Górecki wrote:
> > > > > > > On Tue, Jan 17, 2023 at 08:58:57AM +0100, Takashi Iwai wrote:
> > > > > > > > On Mon, 16 Jan 2023 16:55:11 +0100,
> > > > > > > > Takashi Iwai wrote:
> > > > > > > > > 
> > > > > > > > > On Tue, 27 Dec 2022 16:26:54 +0100,
> > > > > > > > > Marek Marczykowski-Górecki wrote:
> > > > > > > > > > 
> > > > > > > > > > On Thu, Dec 22, 2022 at 09:09:15AM +0100, Takashi Iwai wrote:
> > > > > > > > > > > On Sat, 10 Dec 2022 17:17:42 +0100,
> > > > > > > > > > > Marek Marczykowski-Górecki wrote:
> > > > > > > > > > > > 
> > > > > > > > > > > > On Sat, Dec 10, 2022 at 02:00:06AM +0100, Marek Marczykowski-Górecki wrote:
> > > > > > > > > > > > > On Fri, Dec 09, 2022 at 01:40:15PM +0100, Marek Marczykowski-Górecki wrote:
> > > > > > > > > > > > > > On Fri, Dec 09, 2022 at 09:10:19AM +0100, Takashi Iwai wrote:
> > > > > > > > > > > > > > > On Fri, 09 Dec 2022 02:27:30 +0100,
> > > > > > > > > > > > > > > Marek Marczykowski-Górecki wrote:
> > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > Hi,
> > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > Under Xen PV dom0, with Linux >= 5.17, sound stops working after few
> > > > > > > > > > > > > > > > hours. pavucontrol still shows meter bars moving, but the speakers
> > > > > > > > > > > > > > > > remain silent. At least on some occasions I see the following message in
> > > > > > > > > > > > > > > > dmesg:
> > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > >   [ 2142.484553] snd_hda_intel 0000:00:1f.3: Unstable LPIB (18144 >= 6396); disabling LPIB delay counting
> > > > > > > > > > > > > 
> > > > > > > > > > > > > Hit the issue again, this message did not appear in the log (or at least
> > > > > > > > > > > > > not yet).
> > > > > > > > > > > > > 
> > > > > > > > > > > > > (...)
> > > > > > > > > > > > > 
> > > > > > > > > > > > > > > In anyway, please check the behavior with 6.1-rc8 + the commit
> > > > > > > > > > > > > > > cc26516374065a34e10c9a8bf3e940e42cd96e2a
> > > > > > > > > > > > > > >     ALSA: memalloc: Allocate more contiguous pages for fallback case
> > > > > > > > > > > > > > > from for-next of my sound git tree (which will be in 6.2-rc1).
> > > > > > > > > > > > > 
> > > > > > > > > > > > > This did not helped.
> > > > > > > > > > > > > 
> > > > > > > > > > > > > > Looking at the mentioned commits, there is one specific aspect of Xen PV
> > > > > > > > > > > > > > that may be relevant. It configures PAT differently than native Linux.
> > > > > > > > > > > > > > Theoretically Linux adapts automatically and using proper API (like
> > > > > > > > > > > > > > set_memory_wc()) should just work, but at least for i915 driver it
> > > > > > > > > > > > > > causes issues (not fully tracked down yet). Details about that bug
> > > > > > > > > > > > > > report include some more background:
> > > > > > > > > > > > > > https://lore.kernel.org/intel-gfx/Y5Hst0bCxQDTN7lK@mail-itl/
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > Anyway, I have tested it on a Xen modified to setup PAT the same way as
> > > > > > > > > > > > > > native Linux and the audio issue is still there.
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > If the problem persists, another thing to check is the hack below
> > > > > > > > > > > > > > > works.
> > > > > > > > > > > > > 
> > > > > > > > > > > > > Trying this one now.
> > > > > > > > > > > > 
> > > > > > > > > > > > And this one didn't either :/
> > > > > > > > > > > 
> > > > > > > > > > > (Sorry for the late reply, as I've been off in the last weeks.)
> > > > > > > > > > > 
> > > > > > > > > > > I think the hack doesn't influence on the PCM buffer pages, but only
> > > > > > > > > > > about BDL pages.  Could you check the patch below instead?
> > > > > > > > > > > It'll disable the SG-buffer handling on x86 completely. 
> > > > > > > > > > 
> > > > > > > > > > This seems to "fix" the issue, thanks!
> > > > > > > > > > I guess I'll run it this way for now, but a proper solution would be
> > > > > > > > > > nice. Let me know if I can collect any more info that would help with
> > > > > > > > > > that.
> > > > > > > > > 
> > > > > > > > > Then we seem to go back again with the coherent memory allocation for
> > > > > > > > > the fallback sg cases.  It was changed because the use of
> > > > > > > > > dma_alloc_coherent() caused a problem with IOMMU case for retrieving
> > > > > > > > > the page addresses, but since the commit 9736a325137b, we essentially
> > > > > > > > > avoid the fallback when IOMMU is used, so it should be fine again.
> > > > > > > > > 
> > > > > > > > > Let me know if the patch like below works for you instead of the
> > > > > > > > > previous hack to disable SG-buffer (note: totally untested!)
> > > > > > > > 
> > > > > > > > Gah, there was an obvious typo, scratch that.
> > > > > > > > 
> > > > > > > > Below is a proper patch.  Please try this one instead.
> > > > > > > 
> > > > > > > Thanks, I'll give it a try.
> > > > > > 
> > > > > > Unfortunately, it doesn't help, it stopped working again, after about 3h
> > > > > > uptime.
> > > > > 
> > > > > Aha, then it might be rather other way round;
> > > > > dma_alloc_noncontiguous() doesn't work on Xen properly.
> > > > > 
> > > > > Could you try the one below instead of the previous?
> > > > 
> > > > Unfortunately, this one doesn't fix it either :/
> > > 
> > > Hmm.  Then how about applying both of the last two patches?  The last
> > > one to enforce the fallback allocation and the previous one to use
> > > dma_alloc_coherent().  It should be essentially reverting to the old
> > > way.
> > 
> > Oh, I noticed only now: the last patch made it fail to initialize.
> 
> The "last patch" means the patch to enforce the fallback allocation?

Yes, the one about dma_alloc_noncontiguous().

> > I
> > don't see obvious errors in dmesg, but when trying aplay, I get:
> > 
> >     ALSA lib pcm_direct.c:1284:(snd1_pcm_direct_initialize_slave) unable to install hw params
> >     ALSA lib pcm_dmix.c:1087:(snd_pcm_dmix_open) unable to initialize slave
> >     aplay: main:830: audio open error: Cannot allocate memory
> 
> It's -ENOMEM, so it must be from there.  Does it appear always?  If
> yes, your system is with IOMMU, and the patch made return always NULL
> intentionally.

While the system does have an IOMMU, it isn't configured by Linux, but by
Xen. And it maps all the memory that Linux sees.

> If that's the case, the problem is that IOMMU doesn't handle the
> coherent memory on Xen.
> 
> Please check more explicitly, whether get_dma_ops(dmab->dev.dev) call
> in snd_dma_noncontig_alloc() returns NULL or not.

Will do.

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab


* Re: Intel HD Audio: sound stops working in Xen PV dom0 in >=5.17
  2023-01-18 10:39                               ` Marek Marczykowski-Górecki
@ 2023-01-18 12:39                                 ` Takashi Iwai
  2023-01-20  1:10                                   ` Marek Marczykowski-Górecki
  0 siblings, 1 reply; 25+ messages in thread
From: Takashi Iwai @ 2023-01-18 12:39 UTC (permalink / raw)
  To: Marek Marczykowski-Górecki; +Cc: alsa-devel, Harald Arnesen, Alex Xu

On Wed, 18 Jan 2023 11:39:18 +0100,
Marek Marczykowski-Górecki wrote:
> 
> On Wed, Jan 18, 2023 at 09:59:26AM +0100, Takashi Iwai wrote:
> > On Tue, 17 Jan 2023 21:34:11 +0100,
> > Marek Marczykowski-Górecki wrote:
> > > 
> > > On Tue, Jan 17, 2023 at 05:52:25PM +0100, Takashi Iwai wrote:
> > > > Hmm.  Then how about applying both of the last two patches?  The last
> > > > one to enforce the fallback allocation and the previous one to use
> > > > dma_alloc_coherent().  It should be essentially reverting to the old
> > > > way.
> > > 
> > > Oh, I noticed only now: the last patch made it fail to initialize.
> > 
> > The "last patch" means the patch to enforce the fallback allocation?
> 
> Yes, the one about dma_alloc_noncontiguous().
> 
> > > I
> > > don't see obvious errors in dmesg, but when trying aplay, I get:
> > > 
> > >     ALSA lib pcm_direct.c:1284:(snd1_pcm_direct_initialize_slave) unable to install hw params
> > >     ALSA lib pcm_dmix.c:1087:(snd_pcm_dmix_open) unable to initialize slave
> > >     aplay: main:830: audio open error: Cannot allocate memory
> > 
> > It's -ENOMEM, so it must be from there.  Does it appear always?  If
> > yes, your system is with IOMMU, and the patch made return always NULL
> > intentionally.
> 
> While the system does have an IOMMU, it isn't configured by Linux, but by
> Xen. And it maps all the memory that Linux sees.
> 
> > If that's the case, the problem is that IOMMU doesn't handle the
> > coherent memory on Xen.
> > 
> > Please check more explicitly, whether get_dma_ops(dmab->dev.dev) call
> > in snd_dma_noncontig_alloc() returns NULL or not.
> 
> Will do.

If get_dma_ops() is non-NULL, it means we need some Xen-specific
workaround to avoid using dma_alloc_noncontiguous().
What's the best way to see whether the driver is running on Xen PV?

Meanwhile, it would be helpful if you could try the combo of my last
two patches, too.  It should work, and if it doesn't, it implies that
we're looking in the wrong place.


thanks,

Takashi


* Re: Intel HD Audio: sound stops working in Xen PV dom0 in >=5.17
  2023-01-18 12:39                                 ` Takashi Iwai
@ 2023-01-20  1:10                                   ` Marek Marczykowski-Górecki
  2023-01-20  2:24                                     ` Marek Marczykowski-Górecki
  0 siblings, 1 reply; 25+ messages in thread
From: Marek Marczykowski-Górecki @ 2023-01-20  1:10 UTC (permalink / raw)
  To: Takashi Iwai; +Cc: alsa-devel, Harald Arnesen, Alex Xu

On Wed, Jan 18, 2023 at 01:39:56PM +0100, Takashi Iwai wrote:
> On Wed, 18 Jan 2023 11:39:18 +0100,
> Marek Marczykowski-Górecki wrote:
> > 
> > On Wed, Jan 18, 2023 at 09:59:26AM +0100, Takashi Iwai wrote:
> > > On Tue, 17 Jan 2023 21:34:11 +0100,
> > > Marek Marczykowski-Górecki wrote:
> > > > 
> > > > On Tue, Jan 17, 2023 at 05:52:25PM +0100, Takashi Iwai wrote:
> > > > > Hmm.  Then how about applying both of the last two patches?  The last
> > > > > one to enforce the fallback allocation and the previous one to use
> > > > > dma_alloc_coherent().  It should be essentially reverting to the old
> > > > > way.
> > > > 
> > > > Oh, I noticed only now: the last patch made it fail to initialize.
> > > 
> > > The "last patch" means the patch to enforce the fallback allocation?
> > 
> > Yes, the one about dma_alloc_noncontiguous().
> > 
> > > > I
> > > > don't see obvious errors in dmesg, but when trying aplay, I get:
> > > > 
> > > >     ALSA lib pcm_direct.c:1284:(snd1_pcm_direct_initialize_slave) unable to install hw params
> > > >     ALSA lib pcm_dmix.c:1087:(snd_pcm_dmix_open) unable to initialize slave
> > > >     aplay: main:830: audio open error: Cannot allocate memory
> > > 
> > > It's -ENOMEM, so it must be from there.  Does it appear always?  If
> > > yes, your system is with IOMMU, and the patch made return always NULL
> > > intentionally.
> > 
> > While the system does have an IOMMU, it isn't configured by Linux, but by
> > Xen. And it maps all the memory that Linux sees.
> > 
> > > If that's the case, the problem is that IOMMU doesn't handle the
> > > coherent memory on Xen.
> > > 
> > > Please check more explicitly, whether get_dma_ops(dmab->dev.dev) call
> > > in snd_dma_noncontig_alloc() returns NULL or not.
> > 
> > Will do.
> 
> If get_dma_ops() is non-NULL, 

Yes, it's non-NULL.

> it means we need some Xen-specific
> workaround to avoid using dma_alloc_noncontiguous().
> What's the best way to see whether the driver is running on Xen PV?

Usually it's this: cpu_feature_enabled(X86_FEATURE_XENPV)
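
As a rough illustration of how such a check could gate the allocation
path, here is a userspace model in C.  It is only a sketch under stated
assumptions: running_on_xen_pv() is a hypothetical stand-in for
cpu_feature_enabled(X86_FEATURE_XENPV), and the enum and function names
are invented for this example.  None of it is the actual memalloc.c
code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical labels for the two allocation strategies discussed here. */
enum sg_alloc_path {
	SG_PATH_NONCONTIG,	/* models the dma_alloc_noncontiguous() route */
	SG_PATH_FALLBACK,	/* models the SG fallback route */
};

/* Stand-in for the CPU feature bit; a real kernel would not use a global. */
bool xen_pv_flag;

/* Hypothetical wrapper; in the kernel this would be the feature test. */
bool running_on_xen_pv(void)
{
	return xen_pv_flag;
}

/* Skip the noncontiguous allocator entirely when running on Xen PV. */
enum sg_alloc_path choose_sg_path(void)
{
	if (running_on_xen_pv())
		return SG_PATH_FALLBACK;
	return SG_PATH_NONCONTIG;
}

/* Exercise both settings of the flag. */
void check_both_cases(void)
{
	xen_pv_flag = false;
	assert(choose_sg_path() == SG_PATH_NONCONTIG);
	xen_pv_flag = true;
	assert(choose_sg_path() == SG_PATH_FALLBACK);
}
```

The point of deciding up front, rather than after a failed allocation,
is that on Xen PV the noncontiguous allocation does not fail; it seems
to succeed and only misbehaves later.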

> Meanwhile, it would be helpful if you could try the combo of my last
> two patches, too.  It should work, and if it doesn't, it implies that
> we're looking in the wrong place.

It doesn't work, because the last of them causes "Cannot allocate memory".
I'm trying now with this on top:

---8<---
diff --git a/sound/core/memalloc.c b/sound/core/memalloc.c
index 97d7b8106869..e927d18d1ebb 100644
--- a/sound/core/memalloc.c
+++ b/sound/core/memalloc.c
@@ -545,7 +545,7 @@ static void *snd_dma_noncontig_alloc(struct snd_dma_buffer *dmab, size_t size)
 	// sgt = dma_alloc_noncontiguous(dmab->dev.dev, size, dmab->dev.dir,
 	//	      DEFAULT_GFP, 0);
 #ifdef CONFIG_SND_DMA_SGBUF
-	if (!sgt && !get_dma_ops(dmab->dev.dev)) {
+	if (!sgt) { // && !get_dma_ops(dmab->dev.dev)) {
 		if (dmab->dev.type == SNDRV_DMA_TYPE_DEV_WC_SG)
 			dmab->dev.type = SNDRV_DMA_TYPE_DEV_WC_SG_FALLBACK;
 		else
---8<---
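
To spell out what the hunk above changes, here is a small userspace
model in C.  It is an illustration only: the struct and function names
are invented, and just the two boolean conditions are taken from the
diff.

```c
#include <assert.h>
#include <stdbool.h>

/* Inputs to the decision, reduced to two booleans for illustration. */
struct alloc_state {
	bool sgt_is_null;	/* no SG table came back from the allocator */
	bool has_dma_ops;	/* get_dma_ops(dev) returned non-NULL */
};

/* Condition before the change: if (!sgt && !get_dma_ops(dev)) */
bool takes_fallback_before(struct alloc_state s)
{
	return s.sgt_is_null && !s.has_dma_ops;
}

/* Condition after the change: if (!sgt) */
bool takes_fallback_after(struct alloc_state s)
{
	return s.sgt_is_null;
}

/* The case in this thread: sgt left NULL, dma ops present. */
void check_reported_case(void)
{
	struct alloc_state s = { .sgt_is_null = true, .has_dma_ops = true };

	assert(!takes_fallback_before(s));	/* fallback skipped, NULL returned */
	assert(takes_fallback_after(s));	/* fallback path is entered */
}
```

With the noncontiguous call commented out, sgt is always NULL here, so
the old condition skips the fallback whenever dma ops are present, which
matches the "Cannot allocate memory" error.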


-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab


* Re: Intel HD Audio: sound stops working in Xen PV dom0 in >=5.17
  2023-01-20  1:10                                   ` Marek Marczykowski-Górecki
@ 2023-01-20  2:24                                     ` Marek Marczykowski-Górecki
  2023-01-20  7:26                                       ` Takashi Iwai
  0 siblings, 1 reply; 25+ messages in thread
From: Marek Marczykowski-Górecki @ 2023-01-20  2:24 UTC (permalink / raw)
  To: Takashi Iwai; +Cc: alsa-devel, Harald Arnesen, Alex Xu

On Fri, Jan 20, 2023 at 02:10:37AM +0100, Marek Marczykowski-Górecki wrote:
> On Wed, Jan 18, 2023 at 01:39:56PM +0100, Takashi Iwai wrote:
> > On Wed, 18 Jan 2023 11:39:18 +0100,
> > Marek Marczykowski-Górecki wrote:
> > > 
> > > On Wed, Jan 18, 2023 at 09:59:26AM +0100, Takashi Iwai wrote:
> > > > On Tue, 17 Jan 2023 21:34:11 +0100,
> > > > Marek Marczykowski-Górecki wrote:
> > > > > 
> > > > > On Tue, Jan 17, 2023 at 05:52:25PM +0100, Takashi Iwai wrote:
> > > > > > Hmm.  Then how about applying both of the last two patches?  The last
> > > > > > one to enforce the fallback allocation and the previous one to use
> > > > > > dma_alloc_coherent().  It should be essentially reverting to the old
> > > > > > way.
> > > > > 
> > > > > Oh, I noticed only now: the last patch made it fail to initialize.
> > > > 
> > > > The "last patch" means the patch to enforce the fallback allocation?
> > > 
> > > Yes, the one about dma_alloc_noncontiguous().
> > > 
> > > > > I
> > > > > don't see obvious errors in dmesg, but when trying aplay, I get:
> > > > > 
> > > > >     ALSA lib pcm_direct.c:1284:(snd1_pcm_direct_initialize_slave) unable to install hw params
> > > > >     ALSA lib pcm_dmix.c:1087:(snd_pcm_dmix_open) unable to initialize slave
> > > > >     aplay: main:830: audio open error: Cannot allocate memory
> > > > 
> > > > It's -ENOMEM, so it must be from there.  Does it appear always?  If
> > > > yes, your system is with IOMMU, and the patch made return always NULL
> > > > intentionally.
> > > 
> > > While the system does have an IOMMU, it isn't configured by Linux, but by
> > > Xen. And it maps all the memory that Linux sees.
> > > 
> > > > If that's the case, the problem is that IOMMU doesn't handle the
> > > > coherent memory on Xen.
> > > > 
> > > > Please check more explicitly, whether get_dma_ops(dmab->dev.dev) call
> > > > in snd_dma_noncontig_alloc() returns NULL or not.
> > > 
> > > Will do.
> > 
> > If get_dma_ops() is non-NULL, 
> 
> Yes, it's non-NULL.
> 
> > it means we need some Xen-specific
> > workaround to avoid using dma_alloc_noncontiguous().
> > What's the best way to see whether the driver is running on Xen PV?
> 
> Usually it's this: cpu_feature_enabled(X86_FEATURE_XENPV)
> 
> > Meanwhile, it would be helpful if you could try the combo of my last
> > two patches, too.  It should work, and if it doesn't, it implies that
> > we're looking in the wrong place.
> 
> It doesn't work, because the last of them causes "Cannot allocate memory".
> I'm trying now with this on top:
> 
> ---8<---
> diff --git a/sound/core/memalloc.c b/sound/core/memalloc.c
> index 97d7b8106869..e927d18d1ebb 100644
> --- a/sound/core/memalloc.c
> +++ b/sound/core/memalloc.c
> @@ -545,7 +545,7 @@ static void *snd_dma_noncontig_alloc(struct snd_dma_buffer *dmab, size_t size)
>  	// sgt = dma_alloc_noncontiguous(dmab->dev.dev, size, dmab->dev.dir,
>  	//	      DEFAULT_GFP, 0);
>  #ifdef CONFIG_SND_DMA_SGBUF
> -	if (!sgt && !get_dma_ops(dmab->dev.dev)) {
> +	if (!sgt) { // && !get_dma_ops(dmab->dev.dev)) {
>  		if (dmab->dev.type == SNDRV_DMA_TYPE_DEV_WC_SG)
>  			dmab->dev.type = SNDRV_DMA_TYPE_DEV_WC_SG_FALLBACK;
>  		else
> ---8<---

Unfortunately, the above doesn't help. I no longer get an error, but
there is no sound output either (even though pavucontrol says I should
hear it). So it's like the original issue, only without any delay:
the output is silent right from the start.

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab


* Re: Intel HD Audio: sound stops working in Xen PV dom0 in >=5.17
  2023-01-20  2:24                                     ` Marek Marczykowski-Górecki
@ 2023-01-20  7:26                                       ` Takashi Iwai
  2023-01-20 12:11                                         ` Marek Marczykowski-Górecki
  0 siblings, 1 reply; 25+ messages in thread
From: Takashi Iwai @ 2023-01-20  7:26 UTC (permalink / raw)
  To: Marek Marczykowski-Górecki; +Cc: alsa-devel, Harald Arnesen, Alex Xu

On Fri, 20 Jan 2023 03:24:30 +0100,
Marek Marczykowski-Górecki wrote:
> 
> On Fri, Jan 20, 2023 at 02:10:37AM +0100, Marek Marczykowski-Górecki wrote:
> > On Wed, Jan 18, 2023 at 01:39:56PM +0100, Takashi Iwai wrote:
> > > On Wed, 18 Jan 2023 11:39:18 +0100,
> > > Marek Marczykowski-Górecki wrote:
> > > > 
> > > > On Wed, Jan 18, 2023 at 09:59:26AM +0100, Takashi Iwai wrote:
> > > > > On Tue, 17 Jan 2023 21:34:11 +0100,
> > > > > Marek Marczykowski-Górecki wrote:
> > > > > > 
> > > > > > On Tue, Jan 17, 2023 at 05:52:25PM +0100, Takashi Iwai wrote:
> > > > > > > On Tue, 17 Jan 2023 17:49:28 +0100,
> > > > > > > Marek Marczykowski-Górecki wrote:
> > > > > > > > 
> > > > > > > > On Tue, Jan 17, 2023 at 03:33:42PM +0100, Takashi Iwai wrote:
> > > > > > > > > On Tue, 17 Jan 2023 15:21:23 +0100,
> > > > > > > > > Marek Marczykowski-Górecki wrote:
> > > > > > > > > > 
> > > > > > > > > > On Tue, Jan 17, 2023 at 12:36:28PM +0100, Marek Marczykowski-Górecki wrote:
> > > > > > > > > > > On Tue, Jan 17, 2023 at 08:58:57AM +0100, Takashi Iwai wrote:
> > > > > > > > > > > > On Mon, 16 Jan 2023 16:55:11 +0100,
> > > > > > > > > > > > Takashi Iwai wrote:
> > > > > > > > > > > > > 
> > > > > > > > > > > > > On Tue, 27 Dec 2022 16:26:54 +0100,
> > > > > > > > > > > > > Marek Marczykowski-Górecki wrote:
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > On Thu, Dec 22, 2022 at 09:09:15AM +0100, Takashi Iwai wrote:
> > > > > > > > > > > > > > > On Sat, 10 Dec 2022 17:17:42 +0100,
> > > > > > > > > > > > > > > Marek Marczykowski-Górecki wrote:
> > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > On Sat, Dec 10, 2022 at 02:00:06AM +0100, Marek Marczykowski-Górecki wrote:
> > > > > > > > > > > > > > > > > On Fri, Dec 09, 2022 at 01:40:15PM +0100, Marek Marczykowski-Górecki wrote:
> > > > > > > > > > > > > > > > > > On Fri, Dec 09, 2022 at 09:10:19AM +0100, Takashi Iwai wrote:
> > > > > > > > > > > > > > > > > > > On Fri, 09 Dec 2022 02:27:30 +0100,
> > > > > > > > > > > > > > > > > > > Marek Marczykowski-Górecki wrote:
> > > > > > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > > > > > Hi,
> > > > > > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > > > > > Under Xen PV dom0, with Linux >= 5.17, sound stops working after a few
> > > > > > > > > > > > > > > > > > > > hours. pavucontrol still shows meter bars moving, but the speakers
> > > > > > > > > > > > > > > > > > > > remain silent. At least on some occasions I see the following message in
> > > > > > > > > > > > > > > > > > > > dmesg:
> > > > > > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > > > > >   [ 2142.484553] snd_hda_intel 0000:00:1f.3: Unstable LPIB (18144 >= 6396); disabling LPIB delay counting
> > > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > > Hit the issue again, this message did not appear in the log (or at least
> > > > > > > > > > > > > > > > > not yet).
> > > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > > (...)
> > > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > > > > Anyway, please check the behavior with 6.1-rc8 + the commit
> > > > > > > > > > > > > > > > > > > cc26516374065a34e10c9a8bf3e940e42cd96e2a
> > > > > > > > > > > > > > > > > > >     ALSA: memalloc: Allocate more contiguous pages for fallback case
> > > > > > > > > > > > > > > > > > > from for-next of my sound git tree (which will be in 6.2-rc1).
> > > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > > This did not help.
> > > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > > > Looking at the mentioned commits, there is one specific aspect of Xen PV
> > > > > > > > > > > > > > > > > > that may be relevant. It configures PAT differently than native Linux.
> > > > > > > > > > > > > > > > > > Theoretically Linux adapts automatically and using proper API (like
> > > > > > > > > > > > > > > > > > set_memory_wc()) should just work, but at least for i915 driver it
> > > > > > > > > > > > > > > > > > causes issues (not fully tracked down yet). Details about that bug
> > > > > > > > > > > > > > > > > > report include some more background:
> > > > > > > > > > > > > > > > > > https://lore.kernel.org/intel-gfx/Y5Hst0bCxQDTN7lK@mail-itl/
> > > > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > > > Anyway, I have tested it on a Xen modified to set up PAT the same way as
> > > > > > > > > > > > > > > > > > native Linux and the audio issue is still there.
> > > > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > > > > If the problem persists, another thing to check is whether the hack
> > > > > > > > > > > > > > > > > > > below works.
> > > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > > Trying this one now.
> > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > And this one didn't either :/
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > (Sorry for the late reply, as I've been off in the last weeks.)
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > I think the hack doesn't affect the PCM buffer pages, but only
> > > > > > > > > > > > > > > the BDL pages.  Could you check the patch below instead?
> > > > > > > > > > > > > > > It'll disable the SG-buffer handling on x86 completely. 
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > This seems to "fix" the issue, thanks!
> > > > > > > > > > > > > > I guess I'll run it this way for now, but a proper solution would be
> > > > > > > > > > > > > > nice. Let me know if I can collect any more info that would help with
> > > > > > > > > > > > > > that.
> > > > > > > > > > > > > 
> > > > > > > > > > > > > Then we seem to go back again with the coherent memory allocation for
> > > > > > > > > > > > > the fallback sg cases.  It was changed because the use of
> > > > > > > > > > > > > dma_alloc_coherent() caused a problem with IOMMU case for retrieving
> > > > > > > > > > > > > the page addresses, but since the commit 9736a325137b, we essentially
> > > > > > > > > > > > > avoid the fallback when IOMMU is used, so it should be fine again.
> > > > > > > > > > > > > 
> > > > > > > > > > > > > Let me know if the patch like below works for you instead of the
> > > > > > > > > > > > > previous hack to disable SG-buffer (note: totally untested!)
> > > > > > > > > > > > 
> > > > > > > > > > > > Gah, there was an obvious typo, scratch that.
> > > > > > > > > > > > 
> > > > > > > > > > > > Below is a proper patch.  Please try this one instead.
> > > > > > > > > > > 
> > > > > > > > > > > Thanks, I'll give it a try.
> > > > > > > > > > 
> > > > > > > > > > Unfortunately, it doesn't help, it stopped working again, after about 3h
> > > > > > > > > > uptime.
> > > > > > > > > 
> > > > > > > > > Aha, then it might rather be the other way round;
> > > > > > > > > dma_alloc_noncontiguous() doesn't work on Xen properly.
> > > > > > > > > 
> > > > > > > > > Could you try the one below instead of the previous?
> > > > > > > > 
> > > > > > > > Unfortunately, this one doesn't fix it either :/
> > > > > > > 
> > > > > > > Hmm.  Then how about applying both of the last two patches?  The last
> > > > > > > one to enforce the fallback allocation and the previous one to use
> > > > > > > dma_alloc_coherent().  It should be essentially reverting to the old
> > > > > > > way.
> > > > > > 
> > > > > > Oh, I noticed only now: the last patch made it fail to initialize.
> > > > > 
> > > > > The "last patch" means the patch to enforce the fallback allocation?
> > > > 
> > > > Yes, the one about dma_alloc_noncontiguous().
> > > > 
> > > > > > I
> > > > > > don't see obvious errors in dmesg, but when trying aplay, I get:
> > > > > > 
> > > > > >     ALSA lib pcm_direct.c:1284:(snd1_pcm_direct_initialize_slave) unable to install hw params
> > > > > >     ALSA lib pcm_dmix.c:1087:(snd_pcm_dmix_open) unable to initialize slave
> > > > > >     aplay: main:830: audio open error: Cannot allocate memory
> > > > > 
> > > > > It's -ENOMEM, so it must be from there.  Does it always appear?  If
> > > > > yes, your system has an IOMMU, and the patch intentionally made it
> > > > > always return NULL.
> > > > 
> > > > While the system does have an IOMMU, it isn't configured by Linux, but by
> > > > Xen. And it maps all the memory that Linux sees.
> > > > 
> > > > > If that's the case, the problem is that IOMMU doesn't handle the
> > > > > coherent memory on Xen.
> > > > > 
> > > > > Please check more explicitly, whether get_dma_ops(dmab->dev.dev) call
> > > > > in snd_dma_noncontig_alloc() returns NULL or not.
> > > > 
> > > > Will do.
> > > 
> > > If get_dma_ops() is non-NULL, 
> > 
> > Yes, it's non-NULL.
> > 
> > > it means we need some Xen-specific
> > > workaround not to use dma_alloc_noncontiguous().
> > > What's the best way to see whether the driver is running on Xen PV?
> > 
> > Usually it's this: cpu_feature_enabled(X86_FEATURE_XENPV)
> > 
> > > Meanwhile, it's helpful if you can try the combo of my last two
> > > patches, too.  It should work, and if it doesn't, it implies that
> > > we're looking at a wrong place.
> > 
> > It doesn't because the last of them causes "Cannot allocate memory".
> > I'm trying now with this on top:
> > 
> > ---8<---
> > diff --git a/sound/core/memalloc.c b/sound/core/memalloc.c
> > index 97d7b8106869..e927d18d1ebb 100644
> > --- a/sound/core/memalloc.c
> > +++ b/sound/core/memalloc.c
> > @@ -545,7 +545,7 @@ static void *snd_dma_noncontig_alloc(struct snd_dma_buffer *dmab, size_t size)
> >  	// sgt = dma_alloc_noncontiguous(dmab->dev.dev, size, dmab->dev.dir,
> >  	//	      DEFAULT_GFP, 0);
> >  #ifdef CONFIG_SND_DMA_SGBUF
> > -	if (!sgt && !get_dma_ops(dmab->dev.dev)) {
> > +	if (!sgt) { // && !get_dma_ops(dmab->dev.dev)) {
> >  		if (dmab->dev.type == SNDRV_DMA_TYPE_DEV_WC_SG)
> >  			dmab->dev.type = SNDRV_DMA_TYPE_DEV_WC_SG_FALLBACK;
> >  		else
> > ---8<---
> 
> Unfortunately, the above doesn't help. I mean, I don't get an error
> anymore, but no sound output either (even though pavucontrol says I
> should hear it). So, it's like the original issue, but without any
> delay, just straight from the start.

Hmm, it's the result with the combination of both patches, right?
What I meant as the combo is something like below.


Takashi

-- 8< --
--- a/sound/core/memalloc.c
+++ b/sound/core/memalloc.c
@@ -15,6 +15,7 @@
 #include <linux/vmalloc.h>
 #ifdef CONFIG_X86
 #include <asm/set_memory.h>
+#include <xen/xen.h>
 #endif
 #include <sound/memalloc.h>
 #include "memalloc_local.h"
@@ -541,10 +542,8 @@ static void *snd_dma_noncontig_alloc(struct snd_dma_buffer *dmab, size_t size)
 	struct sg_table *sgt;
 	void *p;
 
-	sgt = dma_alloc_noncontiguous(dmab->dev.dev, size, dmab->dev.dir,
-				      DEFAULT_GFP, 0);
 #ifdef CONFIG_SND_DMA_SGBUF
-	if (!sgt && !get_dma_ops(dmab->dev.dev)) {
+	if (xen_domain() || !get_dma_ops(dmab->dev.dev)) {
 		if (dmab->dev.type == SNDRV_DMA_TYPE_DEV_WC_SG)
 			dmab->dev.type = SNDRV_DMA_TYPE_DEV_WC_SG_FALLBACK;
 		else
@@ -552,6 +551,8 @@ static void *snd_dma_noncontig_alloc(struct snd_dma_buffer *dmab, size_t size)
 		return snd_dma_sg_fallback_alloc(dmab, size);
 	}
 #endif
+	sgt = dma_alloc_noncontiguous(dmab->dev.dev, size, dmab->dev.dir,
+				      DEFAULT_GFP, 0);
 	if (!sgt)
 		return NULL;
 
@@ -719,17 +720,30 @@ static const struct snd_malloc_ops snd_dma_sg_wc_ops = {
 struct snd_dma_sg_fallback {
 	size_t count;
 	struct page **pages;
+	dma_addr_t *addrs;
 };
 
 static void __snd_dma_sg_fallback_free(struct snd_dma_buffer *dmab,
 				       struct snd_dma_sg_fallback *sgbuf)
 {
-	bool wc = dmab->dev.type == SNDRV_DMA_TYPE_DEV_WC_SG_FALLBACK;
-	size_t i;
-
-	for (i = 0; i < sgbuf->count && sgbuf->pages[i]; i++)
-		do_free_pages(page_address(sgbuf->pages[i]), PAGE_SIZE, wc);
+	size_t i, size;
+
+	if (sgbuf->pages && sgbuf->addrs) {
+		i = 0;
+		while (i < sgbuf->count) {
+			if (!sgbuf->pages[i] || !sgbuf->addrs[i])
+				break;
+			size = sgbuf->addrs[i] & ~PAGE_MASK;
+			if (WARN_ON(!size))
+				break;
+			dma_free_coherent(dmab->dev.dev, size << PAGE_SHIFT,
+					  page_address(sgbuf->pages[i]),
+					  sgbuf->addrs[i] & PAGE_MASK);
+			i += size;
+		}
+	}
 	kvfree(sgbuf->pages);
+	kvfree(sgbuf->addrs);
 	kfree(sgbuf);
 }
 
@@ -738,9 +752,8 @@ static void *snd_dma_sg_fallback_alloc(struct snd_dma_buffer *dmab, size_t size)
 	struct snd_dma_sg_fallback *sgbuf;
 	struct page **pagep, *curp;
 	size_t chunk, npages;
-	dma_addr_t addr;
+	dma_addr_t *addrp;
 	void *p;
-	bool wc = dmab->dev.type == SNDRV_DMA_TYPE_DEV_WC_SG_FALLBACK;
 
 	sgbuf = kzalloc(sizeof(*sgbuf), GFP_KERNEL);
 	if (!sgbuf)
@@ -748,14 +761,16 @@ static void *snd_dma_sg_fallback_alloc(struct snd_dma_buffer *dmab, size_t size)
 	size = PAGE_ALIGN(size);
 	sgbuf->count = size >> PAGE_SHIFT;
 	sgbuf->pages = kvcalloc(sgbuf->count, sizeof(*sgbuf->pages), GFP_KERNEL);
-	if (!sgbuf->pages)
+	sgbuf->addrs = kvcalloc(sgbuf->count, sizeof(*sgbuf->addrs), GFP_KERNEL);
+	if (!sgbuf->pages || !sgbuf->addrs)
 		goto error;
 
 	pagep = sgbuf->pages;
-	chunk = size;
+	addrp = sgbuf->addrs;
+	chunk = (PAGE_SIZE - 1) << PAGE_SHIFT; /* to fit in low bits in addrs */
 	while (size > 0) {
 		chunk = min(size, chunk);
-		p = do_alloc_pages(dmab->dev.dev, chunk, &addr, wc);
+		p = dma_alloc_coherent(dmab->dev.dev, chunk, addrp, DEFAULT_GFP);
 		if (!p) {
 			if (chunk <= PAGE_SIZE)
 				goto error;
@@ -767,6 +782,8 @@ static void *snd_dma_sg_fallback_alloc(struct snd_dma_buffer *dmab, size_t size)
 		size -= chunk;
 		/* fill pages */
 		npages = chunk >> PAGE_SHIFT;
+		*addrp |= npages; /* store in lower bits */
+		addrp += npages;
 		curp = virt_to_page(p);
 		while (npages--)
 			*pagep++ = curp++;
@@ -775,6 +792,10 @@ static void *snd_dma_sg_fallback_alloc(struct snd_dma_buffer *dmab, size_t size)
 	p = vmap(sgbuf->pages, sgbuf->count, VM_MAP, PAGE_KERNEL);
 	if (!p)
 		goto error;
+
+	if (dmab->dev.type == SNDRV_DMA_TYPE_DEV_WC_SG_FALLBACK)
+		set_pages_array_wc(sgbuf->pages, sgbuf->count);
+
 	dmab->private_data = sgbuf;
 	/* store the first page address for convenience */
 	dmab->addr = snd_sgbuf_get_addr(dmab, 0);
@@ -787,7 +808,11 @@ static void *snd_dma_sg_fallback_alloc(struct snd_dma_buffer *dmab, size_t size)
 
 static void snd_dma_sg_fallback_free(struct snd_dma_buffer *dmab)
 {
+	struct snd_dma_sg_fallback *sgbuf = dmab->private_data;
+
 	vunmap(dmab->area);
+	if (dmab->dev.type == SNDRV_DMA_TYPE_DEV_WC_SG_FALLBACK)
+		set_pages_array_wb(sgbuf->pages, sgbuf->count);
 	__snd_dma_sg_fallback_free(dmab, dmab->private_data);
 }
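
The fallback path in the patch above records each chunk only once: the
DMA address returned for a chunk is page-aligned, so its low bits are free
to hold the chunk's page count, and only the first addrs[] slot of each
chunk is filled.  Below is a minimal user-space sketch of that bookkeeping,
not kernel code.  The DEMO_* constants and the pack_addr(), packed_base(),
packed_npages() and walk_chunks() helpers are hypothetical names made up
for illustration, and a 4 KiB page size is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for PAGE_SHIFT / PAGE_SIZE / PAGE_MASK (4 KiB pages assumed). */
#define DEMO_PAGE_SHIFT 12
#define DEMO_PAGE_SIZE  ((uint64_t)1 << DEMO_PAGE_SHIFT)
#define DEMO_PAGE_MASK  (~(DEMO_PAGE_SIZE - 1))

/* The page count must fit in the low bits, so a chunk has at most PAGE_SIZE - 1 pages. */
#define DEMO_MAX_CHUNK_PAGES (DEMO_PAGE_SIZE - 1)

/* Pack a page-aligned address and a page count into a single word. */
static uint64_t pack_addr(uint64_t aligned_addr, size_t npages)
{
	assert((aligned_addr & ~DEMO_PAGE_MASK) == 0);
	assert(npages > 0 && npages <= DEMO_MAX_CHUNK_PAGES);
	return aligned_addr | npages;
}

/* Recover the page-aligned address from a packed word. */
static uint64_t packed_base(uint64_t packed)
{
	return packed & DEMO_PAGE_MASK;
}

/* Recover the page count from a packed word. */
static size_t packed_npages(uint64_t packed)
{
	return (size_t)(packed & ~DEMO_PAGE_MASK);
}

/*
 * Walk a per-page array in which only the first slot of each chunk is
 * filled.  Returns the number of bytes covered and stores the number of
 * chunks in *nchunks; the walk stops at the first empty slot.
 */
static size_t walk_chunks(const uint64_t *addrs, size_t count, size_t *nchunks)
{
	size_t i = 0, bytes = 0;

	*nchunks = 0;
	while (i < count) {
		size_t npages = packed_npages(addrs[i]);

		if (!npages)
			break;
		bytes += npages << DEMO_PAGE_SHIFT; /* bytes, not pages */
		(*nchunks)++;
		i += npages; /* jump to the first slot of the next chunk */
	}
	return bytes;
}
```

Because the count has to fit below the page size, one chunk can cover at
most PAGE_SIZE - 1 pages, which matches the cap of
(PAGE_SIZE - 1) << PAGE_SHIFT on chunk in the patch.  The walker converts
the stored page count back to bytes before treating it as a size.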
 

* Re: Intel HD Audio: sound stops working in Xen PV dom0 in >=5.17
  2023-01-20  7:26                                       ` Takashi Iwai
@ 2023-01-20 12:11                                         ` Marek Marczykowski-Górecki
  2023-01-20 13:19                                           ` Takashi Iwai
  0 siblings, 1 reply; 25+ messages in thread
From: Marek Marczykowski-Górecki @ 2023-01-20 12:11 UTC (permalink / raw)
  To: Takashi Iwai; +Cc: alsa-devel, Harald Arnesen, Alex Xu

On Fri, Jan 20, 2023 at 08:26:09AM +0100, Takashi Iwai wrote:
> Hmm, it's the result with the combination of both patches, right?

Yes.

> What I meant as the combo is something like below.

Something like this, yes.

BTW, xen_domain() will also return true in PVH/HVM domains, which should
not need any of this special treatment. It's only PV that is weird.
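
To illustrate the difference, here is a simplified user-space model of the
two checks, not the kernel's implementation.  The demo_* names and the enum
are hypothetical; in the kernel the corresponding tests would be
xen_domain() versus xen_pv_domain() (or
cpu_feature_enabled(X86_FEATURE_XENPV), as mentioned earlier in the
thread), and PVH is lumped together with HVM here purely for brevity.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified model of the guest type; hypothetical, not the kernel's enum. */
enum demo_guest { DEMO_NATIVE, DEMO_XEN_PV, DEMO_XEN_HVM_OR_PVH };

/* Models a xen_domain()-style test: true for every kind of Xen guest. */
static bool demo_xen_domain(enum demo_guest g)
{
	return g != DEMO_NATIVE;
}

/* Models a PV-only test: true only for Xen PV. */
static bool demo_xen_pv_domain(enum demo_guest g)
{
	return g == DEMO_XEN_PV;
}

/* Condition as written in the combo patch: any Xen guest, or no dma_ops. */
static bool fallback_if_any_xen(enum demo_guest g, bool has_dma_ops)
{
	return demo_xen_domain(g) || !has_dma_ops;
}

/* Narrower variant: only PV guests (or devices without dma_ops) fall back. */
static bool fallback_if_pv_only(enum demo_guest g, bool has_dma_ops)
{
	return demo_xen_pv_domain(g) || !has_dma_ops;
}

/* Tiny self-check showing where the two conditions disagree. */
static void demo_check(void)
{
	assert(fallback_if_any_xen(DEMO_XEN_HVM_OR_PVH, true));
	assert(!fallback_if_pv_only(DEMO_XEN_HVM_OR_PVH, true));
}
```

With the broader check, an HVM or PVH guest whose device has dma_ops would
be forced onto the fallback allocator; with the PV-only check it would keep
using the regular path.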

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab


* Re: Intel HD Audio: sound stops working in Xen PV dom0 in >=5.17
  2023-01-20 12:11                                         ` Marek Marczykowski-Górecki
@ 2023-01-20 13:19                                           ` Takashi Iwai
  2023-01-23 21:31                                             ` Marek Marczykowski-Górecki
  0 siblings, 1 reply; 25+ messages in thread
From: Takashi Iwai @ 2023-01-20 13:19 UTC (permalink / raw)
  To: Marek Marczykowski-Górecki; +Cc: alsa-devel, Harald Arnesen, Alex Xu

On Fri, 20 Jan 2023 13:11:34 +0100,
Marek Marczykowski-Górecki wrote:
> 
> On Fri, Jan 20, 2023 at 08:26:09AM +0100, Takashi Iwai wrote:
> > On Fri, 20 Jan 2023 03:24:30 +0100,
> > Marek Marczykowski-Górecki wrote:
> > > 
> > > On Fri, Jan 20, 2023 at 02:10:37AM +0100, Marek Marczykowski-Górecki wrote:
> > > > On Wed, Jan 18, 2023 at 01:39:56PM +0100, Takashi Iwai wrote:
> > > > > On Wed, 18 Jan 2023 11:39:18 +0100,
> > > > > Marek Marczykowski-Górecki wrote:
> > > > > > 
> > > > > > On Wed, Jan 18, 2023 at 09:59:26AM +0100, Takashi Iwai wrote:
> > > > > > > On Tue, 17 Jan 2023 21:34:11 +0100,
> > > > > > > Marek Marczykowski-Górecki wrote:
> > > > > > > > 
> > > > > > > > On Tue, Jan 17, 2023 at 05:52:25PM +0100, Takashi Iwai wrote:
> > > > > > > > > On Tue, 17 Jan 2023 17:49:28 +0100,
> > > > > > > > > Marek Marczykowski-Górecki wrote:
> > > > > > > > > > 
> > > > > > > > > > On Tue, Jan 17, 2023 at 03:33:42PM +0100, Takashi Iwai wrote:
> > > > > > > > > > > On Tue, 17 Jan 2023 15:21:23 +0100,
> > > > > > > > > > > Marek Marczykowski-Górecki wrote:
> > > > > > > > > > > > 
> > > > > > > > > > > > On Tue, Jan 17, 2023 at 12:36:28PM +0100, Marek Marczykowski-Górecki wrote:
> > > > > > > > > > > > > On Tue, Jan 17, 2023 at 08:58:57AM +0100, Takashi Iwai wrote:
> > > > > > > > > > > > > > On Mon, 16 Jan 2023 16:55:11 +0100,
> > > > > > > > > > > > > > Takashi Iwai wrote:
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > On Tue, 27 Dec 2022 16:26:54 +0100,
> > > > > > > > > > > > > > > Marek Marczykowski-Górecki wrote:
> > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > On Thu, Dec 22, 2022 at 09:09:15AM +0100, Takashi Iwai wrote:
> > > > > > > > > > > > > > > > > On Sat, 10 Dec 2022 17:17:42 +0100,
> > > > > > > > > > > > > > > > > Marek Marczykowski-Górecki wrote:
> > > > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > > > On Sat, Dec 10, 2022 at 02:00:06AM +0100, Marek Marczykowski-Górecki wrote:
> > > > > > > > > > > > > > > > > > > On Fri, Dec 09, 2022 at 01:40:15PM +0100, Marek Marczykowski-Górecki wrote:
> > > > > > > > > > > > > > > > > > > > On Fri, Dec 09, 2022 at 09:10:19AM +0100, Takashi Iwai wrote:
> > > > > > > > > > > > > > > > > > > > > On Fri, 09 Dec 2022 02:27:30 +0100,
> > > > > > > > > > > > > > > > > > > > > Marek Marczykowski-Górecki wrote:
> > > > > > > > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > > > > > > > Hi,
> > > > > > > > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > > > > > > > Under Xen PV dom0, with Linux >= 5.17, sound stops working after a few
> > > > > > > > > > > > > > > > > > > > > > hours. pavucontrol still shows meter bars moving, but the speakers
> > > > > > > > > > > > > > > > > > > > > > remain silent. At least on some occasions I see the following message in
> > > > > > > > > > > > > > > > > > > > > > dmesg:
> > > > > > > > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > > > > > > >   [ 2142.484553] snd_hda_intel 0000:00:1f.3: Unstable LPIB (18144 >= 6396); disabling LPIB delay counting
> > > > > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > > > > Hit the issue again, this message did not appear in the log (or at least
> > > > > > > > > > > > > > > > > > > not yet).
> > > > > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > > > > (...)
> > > > > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > > > > > > Anyway, please check the behavior with 6.1-rc8 + the commit
> > > > > > > > > > > > > > > > > > > > > cc26516374065a34e10c9a8bf3e940e42cd96e2a
> > > > > > > > > > > > > > > > > > > > >     ALSA: memalloc: Allocate more contiguous pages for fallback case
> > > > > > > > > > > > > > > > > > > > > from for-next of my sound git tree (which will be in 6.2-rc1).
> > > > > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > > > > This did not help.
> > > > > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > > > > > Looking at the mentioned commits, there is one specific aspect of Xen PV
> > > > > > > > > > > > > > > > > > > > that may be relevant. It configures PAT differently than native Linux.
> > > > > > > > > > > > > > > > > > > > Theoretically Linux adapts automatically and using proper API (like
> > > > > > > > > > > > > > > > > > > > set_memory_wc()) should just work, but at least for i915 driver it
> > > > > > > > > > > > > > > > > > > > causes issues (not fully tracked down yet). Details about that bug
> > > > > > > > > > > > > > > > > > > > report include some more background:
> > > > > > > > > > > > > > > > > > > > https://lore.kernel.org/intel-gfx/Y5Hst0bCxQDTN7lK@mail-itl/
> > > > > > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > > > > > Anyway, I have tested it on a Xen modified to set up PAT the same way as
> > > > > > > > > > > > > > > > > > > > native Linux and the audio issue is still there.
> > > > > > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > > > > > > If the problem persists, another thing to check is whether the hack
> > > > > > > > > > > > > > > > > > > > > below works.
> > > > > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > > > > Trying this one now.
> > > > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > > > And this one didn't either :/
> > > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > > (Sorry for the late reply, as I've been off in the last weeks.)
> > > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > > I think the hack doesn't affect the PCM buffer pages, but only
> > > > > > > > > > > > > > > > > the BDL pages.  Could you check the patch below instead?
> > > > > > > > > > > > > > > > > It'll disable the SG-buffer handling on x86 completely. 
> > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > This seems to "fix" the issue, thanks!
> > > > > > > > > > > > > > > > I guess I'll run it this way for now, but a proper solution would be
> > > > > > > > > > > > > > > > nice. Let me know if I can collect any more info that would help with
> > > > > > > > > > > > > > > > that.
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > Then we seem to go back again with the coherent memory allocation for
> > > > > > > > > > > > > > > the fallback sg cases.  It was changed because the use of
> > > > > > > > > > > > > > > dma_alloc_coherent() caused a problem with IOMMU case for retrieving
> > > > > > > > > > > > > > > the page addresses, but since the commit 9736a325137b, we essentially
> > > > > > > > > > > > > > > avoid the fallback when IOMMU is used, so it should be fine again.
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > Let me know if the patch like below works for you instead of the
> > > > > > > > > > > > > > > previous hack to disable SG-buffer (note: totally untested!)
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > Gah, there was an obvious typo, scratch that.
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > Below is a proper patch.  Please try this one instead.
> > > > > > > > > > > > > 
> > > > > > > > > > > > > Thanks, I'll give it a try.
> > > > > > > > > > > > 
> > > > > > > > > > > > Unfortunately, it doesn't help, it stopped working again, after about 3h
> > > > > > > > > > > > uptime.
> > > > > > > > > > > 
> > > > > > > > > > > Aha, then it might rather be the other way round;
> > > > > > > > > > > dma_alloc_noncontiguous() doesn't work on Xen properly.
> > > > > > > > > > > 
> > > > > > > > > > > Could you try the one below instead of the previous?
> > > > > > > > > > 
> > > > > > > > > > Unfortunately, this one doesn't fix it either :/
> > > > > > > > > 
> > > > > > > > > Hmm.  Then how about applying both of the last two patches?  The last
> > > > > > > > > one to enforce the fallback allocation and the previous one to use
> > > > > > > > > dma_alloc_coherent().  It should be essentially reverting to the old
> > > > > > > > > way.
> > > > > > > > 
> > > > > > > > Oh, I noticed only now: the last patch made it fail to initialize.
> > > > > > > 
> > > > > > > The "last patch" means the patch to enforce the fallback allocation?
> > > > > > 
> > > > > > Yes, the one about dma_alloc_noncontiguous().
> > > > > > 
> > > > > > > > I
> > > > > > > > don't see obvious errors in dmesg, but when trying aplay, I get:
> > > > > > > > 
> > > > > > > >     ALSA lib pcm_direct.c:1284:(snd1_pcm_direct_initialize_slave) unable to install hw params
> > > > > > > >     ALSA lib pcm_dmix.c:1087:(snd_pcm_dmix_open) unable to initialize slave
> > > > > > > >     aplay: main:830: audio open error: Cannot allocate memory
> > > > > > > 
> > > > > > > It's -ENOMEM, so it must be from there.  Does it always appear?  If
> > > > > > > yes, your system has an IOMMU, and the patch intentionally made the
> > > > > > > allocation always return NULL.
> > > > > > 
> > > > > > While the system does have an IOMMU, it isn't configured by Linux, but
> > > > > > by Xen. And it maps all the memory that Linux sees.
> > > > > > 
> > > > > > > If that's the case, the problem is that IOMMU doesn't handle the
> > > > > > > coherent memory on Xen.
> > > > > > > 
> > > > > > > Please check more explicitly, whether get_dma_ops(dmab->dev.dev) call
> > > > > > > in snd_dma_noncontig_alloc() returns NULL or not.
> > > > > > 
> > > > > > Will do.
> > > > > 
> > > > > If get_dma_ops() is non-NULL, 
> > > > 
> > > > Yes, it's non-NULL.
> > > > 
> > > > > it means we need some Xen-specific
> > > > > workaround not to use dma_alloc_noncontiguous().
> > > > > What's the best way to see whether the driver is running on Xen PV?
> > > > 
> > > > Usually it's this: cpu_feature_enabled(X86_FEATURE_XENPV)
> > > > 
> > > > > Meanwhile, it's helpful if you can try the combo of my last two
> > > > > patches, too.  It should work, and if it doesn't, it implies that
> > > > > we're looking at a wrong place.
> > > > 
> > > > It doesn't because the last of them causes "Cannot allocate memory".
> > > > I'm trying now with this on top:
> > > > 
> > > > ---8<---
> > > > diff --git a/sound/core/memalloc.c b/sound/core/memalloc.c
> > > > index 97d7b8106869..e927d18d1ebb 100644
> > > > --- a/sound/core/memalloc.c
> > > > +++ b/sound/core/memalloc.c
> > > > @@ -545,7 +545,7 @@ static void *snd_dma_noncontig_alloc(struct snd_dma_buffer *dmab, size_t size)
> > > >  	// sgt = dma_alloc_noncontiguous(dmab->dev.dev, size, dmab->dev.dir,
> > > >  	//	      DEFAULT_GFP, 0);
> > > >  #ifdef CONFIG_SND_DMA_SGBUF
> > > > -	if (!sgt && !get_dma_ops(dmab->dev.dev)) {
> > > > +	if (!sgt) { // && !get_dma_ops(dmab->dev.dev)) {
> > > >  		if (dmab->dev.type == SNDRV_DMA_TYPE_DEV_WC_SG)
> > > >  			dmab->dev.type = SNDRV_DMA_TYPE_DEV_WC_SG_FALLBACK;
> > > >  		else
> > > > ---8<---
> > > 
> > > Unfortunately, the above doesn't help. I mean, I don't get an error
> > > anymore, but no sound output either (even though pavucontrol says I
> > > should hear it). So, it's like the original issue, but without any
> > > delay, just straight from the start.
> > 
> > Hmm, it's the result with the combination of both patches, right?
> 
> Yes.
> 
> > What I meant as the combo is something like below.
> 
> Something like this, yes.

It's puzzling, then.  The patch changes the allocation to use
dma_alloc_coherent(), which is also what the Kconfig hack you've
tested does.  One possibly significant difference is the use of the
DMA address.

> BTW, xen_domain() will also return true on PVH/HVM domain, which should
> not need any of this special treatment. It's PV that is weird.

OK, then it would be overkill.

Below is another try: it changes how the DMA buffer address is
used.  Let's keep our fingers crossed.


thanks,

Takashi

-- 8< --
--- a/sound/core/memalloc.c
+++ b/sound/core/memalloc.c
@@ -541,10 +541,9 @@ static void *snd_dma_noncontig_alloc(struct snd_dma_buffer *dmab, size_t size)
 	struct sg_table *sgt;
 	void *p;
 
-	sgt = dma_alloc_noncontiguous(dmab->dev.dev, size, dmab->dev.dir,
-				      DEFAULT_GFP, 0);
 #ifdef CONFIG_SND_DMA_SGBUF
-	if (!sgt && !get_dma_ops(dmab->dev.dev)) {
+	if (cpu_feature_enabled(X86_FEATURE_XENPV) ||
+	    !get_dma_ops(dmab->dev.dev)) {
 		if (dmab->dev.type == SNDRV_DMA_TYPE_DEV_WC_SG)
 			dmab->dev.type = SNDRV_DMA_TYPE_DEV_WC_SG_FALLBACK;
 		else
@@ -552,6 +551,8 @@ static void *snd_dma_noncontig_alloc(struct snd_dma_buffer *dmab, size_t size)
 		return snd_dma_sg_fallback_alloc(dmab, size);
 	}
 #endif
+	sgt = dma_alloc_noncontiguous(dmab->dev.dev, size, dmab->dev.dir,
+				      DEFAULT_GFP, 0);
 	if (!sgt)
 		return NULL;
 
@@ -719,17 +720,30 @@ static const struct snd_malloc_ops snd_dma_sg_wc_ops = {
 struct snd_dma_sg_fallback {
 	size_t count;
 	struct page **pages;
+	dma_addr_t *addrs;
 };
 
 static void __snd_dma_sg_fallback_free(struct snd_dma_buffer *dmab,
 				       struct snd_dma_sg_fallback *sgbuf)
 {
-	bool wc = dmab->dev.type == SNDRV_DMA_TYPE_DEV_WC_SG_FALLBACK;
-	size_t i;
-
-	for (i = 0; i < sgbuf->count && sgbuf->pages[i]; i++)
-		do_free_pages(page_address(sgbuf->pages[i]), PAGE_SIZE, wc);
+	size_t i, size;
+
+	if (sgbuf->pages && sgbuf->addrs) {
+		i = 0;
+		while (i < sgbuf->count) {
+			if (!sgbuf->pages[i] || !sgbuf->addrs[i])
+				break;
+			size = sgbuf->addrs[i] & ~PAGE_MASK;
+			if (WARN_ON(!size))
+				break;
+			dma_free_coherent(dmab->dev.dev, size << PAGE_SHIFT,
+					  page_address(sgbuf->pages[i]),
+					  sgbuf->addrs[i] & PAGE_MASK);
+			i += size;
+		}
+	}
 	kvfree(sgbuf->pages);
+	kvfree(sgbuf->addrs);
 	kfree(sgbuf);
 }
 
@@ -738,9 +752,9 @@ static void *snd_dma_sg_fallback_alloc(struct snd_dma_buffer *dmab, size_t size)
 	struct snd_dma_sg_fallback *sgbuf;
 	struct page **pagep, *curp;
 	size_t chunk, npages;
+	dma_addr_t *addrp;
 	dma_addr_t addr;
 	void *p;
-	bool wc = dmab->dev.type == SNDRV_DMA_TYPE_DEV_WC_SG_FALLBACK;
 
 	sgbuf = kzalloc(sizeof(*sgbuf), GFP_KERNEL);
 	if (!sgbuf)
@@ -748,14 +762,16 @@ static void *snd_dma_sg_fallback_alloc(struct snd_dma_buffer *dmab, size_t size)
 	size = PAGE_ALIGN(size);
 	sgbuf->count = size >> PAGE_SHIFT;
 	sgbuf->pages = kvcalloc(sgbuf->count, sizeof(*sgbuf->pages), GFP_KERNEL);
-	if (!sgbuf->pages)
+	sgbuf->addrs = kvcalloc(sgbuf->count, sizeof(*sgbuf->addrs), GFP_KERNEL);
+	if (!sgbuf->pages || !sgbuf->addrs)
 		goto error;
 
 	pagep = sgbuf->pages;
-	chunk = size;
+	addrp = sgbuf->addrs;
+	chunk = (PAGE_SIZE - 1) << PAGE_SHIFT; /* to fit in low bits in addrs */
 	while (size > 0) {
 		chunk = min(size, chunk);
-		p = do_alloc_pages(dmab->dev.dev, chunk, &addr, wc);
+		p = dma_alloc_coherent(dmab->dev.dev, chunk, &addr, DEFAULT_GFP);
 		if (!p) {
 			if (chunk <= PAGE_SIZE)
 				goto error;
@@ -767,17 +783,25 @@ static void *snd_dma_sg_fallback_alloc(struct snd_dma_buffer *dmab, size_t size)
 		size -= chunk;
 		/* fill pages */
 		npages = chunk >> PAGE_SHIFT;
+		*addrp = npages; /* store in lower bits */
 		curp = virt_to_page(p);
-		while (npages--)
+		while (npages--) {
 			*pagep++ = curp++;
+			*addrp++ |= addr;
+			addr += PAGE_SIZE;
+		}
 	}
 
 	p = vmap(sgbuf->pages, sgbuf->count, VM_MAP, PAGE_KERNEL);
 	if (!p)
 		goto error;
+
+	if (dmab->dev.type == SNDRV_DMA_TYPE_DEV_WC_SG_FALLBACK)
+		set_pages_array_wc(sgbuf->pages, sgbuf->count);
+
 	dmab->private_data = sgbuf;
 	/* store the first page address for convenience */
-	dmab->addr = snd_sgbuf_get_addr(dmab, 0);
+	dmab->addr = sgbuf->addrs[0] & PAGE_MASK;
 	return p;
 
  error:
@@ -787,10 +811,23 @@ static void *snd_dma_sg_fallback_alloc(struct snd_dma_buffer *dmab, size_t size)
 
 static void snd_dma_sg_fallback_free(struct snd_dma_buffer *dmab)
 {
+	struct snd_dma_sg_fallback *sgbuf = dmab->private_data;
+
 	vunmap(dmab->area);
+	if (dmab->dev.type == SNDRV_DMA_TYPE_DEV_WC_SG_FALLBACK)
+		set_pages_array_wb(sgbuf->pages, sgbuf->count);
 	__snd_dma_sg_fallback_free(dmab, dmab->private_data);
 }
 
+static dma_addr_t snd_dma_sg_fallback_get_addr(struct snd_dma_buffer *dmab,
+					       size_t offset)
+{
+	struct snd_dma_sg_fallback *sgbuf = dmab->private_data;
+	size_t index = offset >> PAGE_SHIFT;
+
+	return (sgbuf->addrs[index] & PAGE_MASK) | (offset & ~PAGE_MASK);
+}
+
 static int snd_dma_sg_fallback_mmap(struct snd_dma_buffer *dmab,
 				    struct vm_area_struct *area)
 {
@@ -805,8 +842,8 @@ static const struct snd_malloc_ops snd_dma_sg_fallback_ops = {
 	.alloc = snd_dma_sg_fallback_alloc,
 	.free = snd_dma_sg_fallback_free,
 	.mmap = snd_dma_sg_fallback_mmap,
+	.get_addr = snd_dma_sg_fallback_get_addr,
 	/* reuse vmalloc helpers */
-	.get_addr = snd_dma_vmalloc_get_addr,
 	.get_page = snd_dma_vmalloc_get_page,
 	.get_chunk_size = snd_dma_vmalloc_get_chunk_size,
 };
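
The address bookkeeping in the patch above can be illustrated with a small user-space sketch. It is not the kernel code: the DEMO_* macros and demo_* functions are hypothetical stand-ins, 4 KiB pages are assumed, and plain integers replace real DMA allocations. It only shows the idea of keeping a chunk's page count in the low bits of its first page-aligned address, looking up an address by byte offset, and walking the chunks as the free path does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for PAGE_SHIFT/PAGE_SIZE/PAGE_MASK (4 KiB pages assumed). */
#define DEMO_PAGE_SHIFT 12
#define DEMO_PAGE_SIZE  ((uint64_t)1 << DEMO_PAGE_SHIFT)
#define DEMO_PAGE_MASK  (~(DEMO_PAGE_SIZE - 1))

/*
 * Record one chunk of npages pages starting at the page-aligned dma_addr.
 * Only the first entry of the chunk carries the page count in its low bits;
 * npages must stay below DEMO_PAGE_SIZE so that it fits there.
 */
static void demo_fill_chunk(uint64_t *addrs, uint64_t dma_addr, size_t npages)
{
	size_t i;

	assert(npages > 0 && npages < DEMO_PAGE_SIZE);
	assert((dma_addr & ~DEMO_PAGE_MASK) == 0);
	for (i = 0; i < npages; i++) {
		addrs[i] = dma_addr | (i == 0 ? (uint64_t)npages : 0);
		dma_addr += DEMO_PAGE_SIZE;
	}
}

/* Translate a byte offset into an address: page base plus in-page offset. */
static uint64_t demo_get_addr(const uint64_t *addrs, size_t offset)
{
	size_t index = offset >> DEMO_PAGE_SHIFT;

	return (addrs[index] & DEMO_PAGE_MASK) | (offset & ~DEMO_PAGE_MASK);
}

/*
 * Walk the table chunk by chunk, as a free path would: read the page count
 * from the low bits, convert it to bytes, then skip that many entries.
 */
static size_t demo_count_chunks(const uint64_t *addrs, size_t count,
				size_t *total_bytes)
{
	size_t i = 0, chunks = 0;

	*total_bytes = 0;
	while (i < count) {
		size_t npages = (size_t)(addrs[i] & ~DEMO_PAGE_MASK);

		if (!npages)
			break;	/* not the start of a chunk: stop */
		*total_bytes += npages << DEMO_PAGE_SHIFT;
		chunks++;
		i += npages;
	}
	return chunks;
}
```

In this sketch the low bits hold a page count, so a caller needing a length in bytes has to shift it by the page shift, as demo_count_chunks() does.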


* Re: Intel HD Audio: sound stops working in Xen PV dom0 in >=5.17
  2023-01-20 13:19                                           ` Takashi Iwai
@ 2023-01-23 21:31                                             ` Marek Marczykowski-Górecki
  2023-01-24  9:24                                               ` Takashi Iwai
  0 siblings, 1 reply; 25+ messages in thread
From: Marek Marczykowski-Górecki @ 2023-01-23 21:31 UTC (permalink / raw)
  To: Takashi Iwai; +Cc: alsa-devel, Harald Arnesen, Alex Xu

On Fri, Jan 20, 2023 at 02:19:08PM +0100, Takashi Iwai wrote:
> On Fri, 20 Jan 2023 13:11:34 +0100,
> Marek Marczykowski-Górecki wrote:
> > 
> It's puzzling, then.  The patch changes the allocation to use
> dma_alloc_coherent(), which is also what the Kconfig hack you've
> tested does.  One possibly significant difference is the use of the
> DMA address.
> 
> > BTW, xen_domain() will also return true on PVH/HVM domain, which should
> > not need any of this special treatment. It's PV that is weird.
> 
> OK, then it would be overkill.
> 
> Below is another try: it changes how the DMA buffer address is
> used.  Let's keep our fingers crossed.

3 days of uptime and it still works!

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab



* Re: Intel HD Audio: sound stops working in Xen PV dom0 in >=5.17
  2023-01-23 21:31                                             ` Marek Marczykowski-Górecki
@ 2023-01-24  9:24                                               ` Takashi Iwai
  0 siblings, 0 replies; 25+ messages in thread
From: Takashi Iwai @ 2023-01-24  9:24 UTC (permalink / raw)
  To: Marek Marczykowski-Górecki; +Cc: alsa-devel, Harald Arnesen, Alex Xu

On Mon, 23 Jan 2023 22:31:39 +0100,
Marek Marczykowski-Górecki wrote:
> 
> On Fri, Jan 20, 2023 at 02:19:08PM +0100, Takashi Iwai wrote:
> > On Fri, 20 Jan 2023 13:11:34 +0100,
> > Marek Marczykowski-Górecki wrote:
> > > 
> > It's puzzling, then.  The patch changes the allocation to use
> > dma_alloc_coherent(), which is also what the Kconfig hack you've
> > tested does.  One possibly significant difference is the use of the
> > DMA address.
> > 
> > > BTW, xen_domain() will also return true on PVH/HVM domain, which should
> > > not need any of this special treatment. It's PV that is weird.
> > 
> > OK, then it would be overkill.
> > 
> > Below is another try: it changes how the DMA buffer address is
> > used.  Let's keep our fingers crossed.
> 
> 3 days of uptime and it still works!

Great, I'm going to submit the proper patches, then.
Thanks for your patient testing!


Takashi

Thread overview: 25+ messages
     [not found] <Y5KPAs6f7S2dEoxR@mail-itl>
2022-12-09  8:10 ` Intel HD Audio: sound stops working in Xen PV dom0 in >=5.17 Takashi Iwai
2022-12-09 12:40   ` Marek Marczykowski-Górecki
2022-12-10  1:00     ` Marek Marczykowski-Górecki
2022-12-10 16:17       ` Marek Marczykowski-Górecki
2022-12-20  4:43         ` Marek Marczykowski-Górecki
2022-12-22  8:09         ` Takashi Iwai
2022-12-27 15:26           ` Marek Marczykowski-Górecki
2023-01-16 15:55             ` Takashi Iwai
2023-01-17  7:58               ` Takashi Iwai
2023-01-17 11:36                 ` Marek Marczykowski-Górecki
2023-01-17 14:21                   ` Marek Marczykowski-Górecki
2023-01-17 14:33                     ` Takashi Iwai
2023-01-17 16:49                       ` Marek Marczykowski-Górecki
2023-01-17 16:52                         ` Takashi Iwai
2023-01-17 20:34                           ` Marek Marczykowski-Górecki
2023-01-18  8:59                             ` Takashi Iwai
2023-01-18 10:39                               ` Marek Marczykowski-Górecki
2023-01-18 12:39                                 ` Takashi Iwai
2023-01-20  1:10                                   ` Marek Marczykowski-Górecki
2023-01-20  2:24                                     ` Marek Marczykowski-Górecki
2023-01-20  7:26                                       ` Takashi Iwai
2023-01-20 12:11                                         ` Marek Marczykowski-Górecki
2023-01-20 13:19                                           ` Takashi Iwai
2023-01-23 21:31                                             ` Marek Marczykowski-Górecki
2023-01-24  9:24                                               ` Takashi Iwai
