* [6.2][regression] after commit ffcb754584603adf7039d7972564fbf6febdc542 all sound devices disappeared (due BUG at mm/page_alloc.c:3592!)
@ 2022-12-15 14:17 Mikhail Gavrilov
  2022-12-15 18:23   ` Kai Vehmanen
                   ` (3 more replies)
  0 siblings, 4 replies; 15+ messages in thread
From: Mikhail Gavrilov @ 2022-12-15 14:17 UTC (permalink / raw)
  To: Takashi Iwai, alsa-devel, hch, m.szyprowski, Linux List Kernel Mailing

Hi,
The kernel 6.2 preparation cycle has begun and yesterday after the
kernel was updated on my Fedora Rawhide all audio devices disappeared.

The backtrace of the issue looks like:
[  133.033269] page:00000000e4a2c44b refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x207490
[  133.033353] head:00000000e4a2c44b order:2 compound_mapcount:0 subpages_mapcount:0 compound_pincount:0
[  133.033360] flags: 0x17ffffc0010000(head|node=0|zone=2|lastcpupid=0x1fffff)
[  133.033369] raw: 0017ffffc0010000 0000000000000000 dead000000000122 0000000000000000
[  133.033376] raw: 0000000000000000 0000000000000000 00000001ffffffff 0000000000000000
[  133.033381] page dumped because: VM_BUG_ON_PAGE(PageCompound(page))
[  133.033392] ------------[ cut here ]------------
[  133.033397] kernel BUG at mm/page_alloc.c:3592!
[  133.033406] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
[  133.033410] CPU: 22 PID: 1673 Comm: wireplumber Tainted: G        W    L    -------  --- 6.2.0-0.rc0.20221214gite2ca6ba6ba01.3.fc38.x86_64 #1
[  133.033415] Hardware name: System manufacturer System Product Name/ROG STRIX X570-I GAMING, BIOS 4408 10/28/2022
[  133.033417] RIP: 0010:split_page+0xa2/0x160
[  133.033425] Code: 00 48 83 c7 40 48 39 d7 75 d7 0f 1f 44 00 00 89 ee 48 89 df 5b 5d e9 2d fe 06 00 48 c7 c6 d8 ca 9a 95 48 89 df e8 8e 77 fc ff <0f> 0b 48 89 f8 f7 c7 ff 0f 00 00 0f 85 7a ff ff ff 48 8b 17 f7 c2
[  133.033428] RSP: 0018:ffff9f5645177b98 EFLAGS: 00010286
[  133.033432] RAX: 0000000000000037 RBX: ffffeb89c81d2400 RCX: 0000000000000000
[  133.033435] RDX: 0000000000000001 RSI: ffffffff959f0673 RDI: 00000000ffffffff
[  133.033438] RBP: 0000000000000002 R08: 0000000000000000 R09: ffff9f5645177a08
[  133.033440] R10: 0000000000000003 R11: ffff8d032e2fffe8 R12: 0000000000000007
[  133.033442] R13: 0000000000000004 R14: 0000000000000000 R15: 0000000000000001
[  133.033445] FS:  00007f7e55702800(0000) GS:ffff8d02e8200000(0000) knlGS:0000000000000000
[  133.033448] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  133.033450] CR2: 00007f7e556cb000 CR3: 00000001f604e000 CR4: 0000000000350ee0
[  133.033453] Call Trace:
[  133.033455]  <TASK>
[  133.033458]  __iommu_dma_alloc_noncontiguous.constprop.0+0x2de/0x3e0
[  133.033468]  ? rcu_read_lock_sched_held+0x3f/0x80
[  133.033475]  iommu_dma_alloc_noncontiguous+0x66/0xb0
[  133.033481]  dma_alloc_noncontiguous+0x54/0x1a0
[  133.033489]  snd_dma_noncontig_alloc+0x25/0x120 [snd_pcm]
[  133.033505]  snd_dma_sg_wc_alloc+0x13/0xb0 [snd_pcm]
[  133.033519]  snd_dma_alloc_dir_pages+0x50/0x90 [snd_pcm]
[  133.033532]  do_alloc_pages+0x49/0xa0 [snd_pcm]
[  133.033546]  snd_pcm_lib_malloc_pages+0xf1/0x1e0 [snd_pcm]
[  133.033560]  snd_pcm_hw_params+0x57f/0x620 [snd_pcm]
[  133.033576]  snd_pcm_common_ioctl+0x1e4/0x12a0 [snd_pcm]
[  133.033595]  snd_pcm_ioctl+0x23/0x40 [snd_pcm]
[  133.033607]  __x64_sys_ioctl+0x90/0xd0
[  133.033613]  do_syscall_64+0x5b/0x80
[  133.033618]  ? do_syscall_64+0x67/0x80
[  133.033622]  ? lockdep_hardirqs_on+0x7d/0x100
[  133.033627]  ? do_syscall_64+0x67/0x80
[  133.033630]  ? do_syscall_64+0x67/0x80
[  133.033633]  ? do_syscall_64+0x67/0x80
[  133.033636]  ? do_syscall_64+0x67/0x80
[  133.033640]  ? lockdep_hardirqs_on+0x7d/0x100
[  133.033644]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
[  133.033648] RIP: 0033:0x7f7e55b5f65f
[  133.033671] Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00 0f 05 <89> c2 3d 00 f0 ff ff 77 18 48 8b 44 24 18 64 48 2b 04 25 28 00 00
[  133.033674] RSP: 002b:00007ffd24c51ec0 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[  133.033678] RAX: ffffffffffffffda RBX: 00007ffd24c520f0 RCX: 00007f7e55b5f65f
[  133.033681] RDX: 00007ffd24c520f0 RSI: 00000000c2604111 RDI: 0000000000000023
[  133.033683] RBP: 0000556c04c4ff60 R08: 0000000000000000 R09: 0000000000000000
[  133.033685] R10: 0000000000000004 R11: 0000000000000246 R12: 0000556c04c4fee0
[  133.033688] R13: 00007ffd24c52360 R14: 00007ffd24c527b0 R15: 00007ffd24c520f0
[  133.033696]  </TASK>
[  133.033698] Modules linked in: snd_seq_dummy snd_hrtimer nf_conntrack_netbios_ns nf_conntrack_broadcast nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set nf_tables nfnetlink qrtr bnep sunrpc binfmt_misc iwlmvm hid_logitech_hidpp btusb btrtl btbcm snd_seq_midi snd_seq_midi_event btintel btmtk snd_usb_audio bluetooth snd_usbmidi_lib iwlwifi xpad snd_rawmidi ff_memless mc intel_rapl_msr joydev intel_rapl_common snd_hda_codec_realtek edac_mce_amd snd_hda_codec_generic snd_hda_codec_hdmi mt76x2u snd_hda_intel kvm_amd snd_intel_dspcfg mt76x2_common snd_intel_sdw_acpi mt76x02_usb snd_hda_codec asus_ec_sensors mt76_usb kvm vfat snd_hda_core fat mt76x02_lib snd_hwdep eeepc_wmi mt76 snd_seq asus_wmi ledtrig_audio snd_seq_device irqbypass sparse_keymap snd_pcm rapl platform_profile wmi_bmof pcspkr snd_timer mac80211 k10temp snd i2c_piix4 soundcore libarc4 acpi_cpufreq
[  133.033777]  cfg80211 hid_logitech_dj rfkill zram amdgpu drm_ttm_helper ttm crct10dif_pclmul crc32_pclmul video crc32c_intel polyval_clmulni iommu_v2 gpu_sched polyval_generic drm_buddy nvme ucsi_ccg drm_display_helper typec_ucsi ghash_clmulni_intel ccp igb sha512_ssse3 typec nvme_core sp5100_tco cec dca nvme_common wmi ip6_tables ip_tables fuse
[  133.033832] ---[ end trace 0000000000000000 ]---

I bisected the problematic commit and found this:
ffcb754584603adf7039d7972564fbf6febdc542 is the first bad commit
commit ffcb754584603adf7039d7972564fbf6febdc542
Author: Christoph Hellwig <hch@lst.de>
Date:   Wed Nov 9 08:37:17 2022 +0100

    dma-mapping: reject __GFP_COMP in dma_alloc_attrs

    DMA allocations can never be turned back into a page pointer, so
    requesting compound pages doesn't make sense and it can't even be
    supported at all by various backends.

    Reject __GFP_COMP with a warning in dma_alloc_attrs, and stop clearing
    the flag in the arm dma ops and dma-iommu.

    Signed-off-by: Christoph Hellwig <hch@lst.de>
    Acked-by: Marek Szyprowski <m.szyprowski@samsung.com>

 arch/arm/mm/dma-mapping.c | 17 -----------------
 drivers/iommu/dma-iommu.c |  3 ---
 kernel/dma/mapping.c      |  8 ++++++++
 3 files changed, 8 insertions(+), 20 deletions(-)

Reverting this commit and rebuilding the kernel confirmed the
bisection result.

I hope my report helps fix the problem quickly.

Full kernel log is here: https://pastebin.com/5hsuhifY

-- 
Best Regards,
Mike Gavrilov.

* Re: [6.2][regression] after commit ffcb754584603adf7039d7972564fbf6febdc542 all sound devices disappeared (due BUG at mm/page_alloc.c:3592!)
  2022-12-15 14:17 [6.2][regression] after commit ffcb754584603adf7039d7972564fbf6febdc542 all sound devices disappeared (due BUG at mm/page_alloc.c:3592!) Mikhail Gavrilov
@ 2022-12-15 18:23   ` Kai Vehmanen
  2022-12-16  0:06 ` Joan Bruguera
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 15+ messages in thread
From: Kai Vehmanen @ 2022-12-15 18:23 UTC (permalink / raw)
  To: Mikhail Gavrilov
  Cc: Takashi Iwai, alsa-devel, hch, m.szyprowski, Linux List Kernel Mailing

Hi,

On Thu, 15 Dec 2022, Mikhail Gavrilov wrote:

> The kernel 6.2 preparation cycle has begun and yesterday after the
> kernel was updated on my Fedora Rawhide all audio devices disappeared.

I can confirm this breaks audio in our SOF tests if I cherry-pick the
identified patch ffcb754584603a to the sound tree. This affects audio on a
very large number of x86 systems.

Br, Kai

* Re: [6.2][regression] after commit ffcb754584603adf7039d7972564fbf6febdc542 all sound devices disappeared (due BUG at mm/page_alloc.c:3592!)
  2022-12-15 14:17 [6.2][regression] after commit ffcb754584603adf7039d7972564fbf6febdc542 all sound devices disappeared (due BUG at mm/page_alloc.c:3592!) Mikhail Gavrilov
  2022-12-15 18:23   ` Kai Vehmanen
@ 2022-12-16  0:06 ` Joan Bruguera
  2022-12-16  6:46   ` Christoph Hellwig
  2022-12-22 12:17 ` [6.2][regression] after commit ffcb754584603adf7039d7972564fbf6febdc542 all sound devices disappeared (due BUG at mm/page_alloc.c:3592!) #forregzbot Thorsten Leemhuis
  3 siblings, 0 replies; 15+ messages in thread
From: Joan Bruguera @ 2022-12-16  0:06 UTC (permalink / raw)
  To: tiwai, linux-kernel; +Cc: Joan Bruguera

The one passing the __GFP_COMP flag appears to be sound/core/memalloc.c,
see also commit e529d3507a93d3c9528580081bbaf931a50de154.
Removing the flags also fixes the sound issues and warnings for me.

*Resent with fixed Message-ID - sorry!

* Re: [6.2][regression] after commit ffcb754584603adf7039d7972564fbf6febdc542 all sound devices disappeared (due BUG at mm/page_alloc.c:3592!)
  2022-12-15 14:17 [6.2][regression] after commit ffcb754584603adf7039d7972564fbf6febdc542 all sound devices disappeared (due BUG at mm/page_alloc.c:3592!) Mikhail Gavrilov
@ 2022-12-16  6:46   ` Christoph Hellwig
  2022-12-16  0:06 ` Joan Bruguera
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 15+ messages in thread
From: Christoph Hellwig @ 2022-12-16  6:46 UTC (permalink / raw)
  To: Mikhail Gavrilov
  Cc: Takashi Iwai, alsa-devel, hch, m.szyprowski,
	Linux List Kernel Mailing, Robin Murphy, iommu

Ok, it seems like the sound noncontig alloc code that I already
commented on as potentially bogus GFP_COMP mapping trips this.  I think
for now the right thing would be to revert the hunk in dma-iommu.c
(see patch below).  The other thing to try would be to remove both
uses of GFP_COMP in sound/core/memalloc.c, which should have the same
effect.

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 9297b741f5e80e..f798c44e090337 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -744,9 +744,6 @@ static struct page **__iommu_dma_alloc_pages(struct device *dev,
 	/* IOMMU can map any pages, so himem can also be used here */
 	gfp |= __GFP_NOWARN | __GFP_HIGHMEM;
 
-	/* It makes no sense to muck about with huge pages */
-	gfp &= ~__GFP_COMP;
-
 	while (count) {
 		struct page *page = NULL;
 		unsigned int order_size;


* Re: [6.2][regression] after commit ffcb754584603adf7039d7972564fbf6febdc542 all sound devices disappeared (due BUG at mm/page_alloc.c:3592!)
  2022-12-16  6:46   ` Christoph Hellwig
@ 2022-12-16 11:40     ` Robin Murphy
  -1 siblings, 0 replies; 15+ messages in thread
From: Robin Murphy @ 2022-12-16 11:40 UTC (permalink / raw)
  To: Christoph Hellwig, Mikhail Gavrilov
  Cc: Takashi Iwai, alsa-devel, m.szyprowski, Linux List Kernel Mailing, iommu

On 2022-12-16 06:46, Christoph Hellwig wrote:
> Ok, it seems like the sound noncontig alloc code that I already
> commented on as potentially bogus GFP_COMP mapping trips this.  I think
> for now the right thing would be to revert the hunk in dma-iommu.c
> (see patch below).  The other thing to try would be to remove both
> uses of GFP_COMP in sound/core/memalloc.c, which should have the same
> effect.

Or we explicitly strip the flag in dma_alloc_noncontiguous() (and maybe 
dma_alloc_pages() as well) for consistency with dma_alloc_attrs(). That 
seems like it might be the most robust option.

Robin.

> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
> index 9297b741f5e80e..f798c44e090337 100644
> --- a/drivers/iommu/dma-iommu.c
> +++ b/drivers/iommu/dma-iommu.c
> @@ -744,9 +744,6 @@ static struct page **__iommu_dma_alloc_pages(struct device *dev,
>   	/* IOMMU can map any pages, so himem can also be used here */
>   	gfp |= __GFP_NOWARN | __GFP_HIGHMEM;
>   
> -	/* It makes no sense to muck about with huge pages */
> -	gfp &= ~__GFP_COMP;
> -
>   	while (count) {
>   		struct page *page = NULL;
>   		unsigned int order_size;
> 

* Re: [6.2][regression] after commit ffcb754584603adf7039d7972564fbf6febdc542 all sound devices disappeared (due BUG at mm/page_alloc.c:3592!)
  2022-12-16 11:40     ` Robin Murphy
@ 2022-12-16 12:15       ` Christoph Hellwig
  -1 siblings, 0 replies; 15+ messages in thread
From: Christoph Hellwig @ 2022-12-16 12:15 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Christoph Hellwig, Mikhail Gavrilov, Takashi Iwai, alsa-devel,
	m.szyprowski, Linux List Kernel Mailing, iommu

On Fri, Dec 16, 2022 at 11:40:57AM +0000, Robin Murphy wrote:
> On 2022-12-16 06:46, Christoph Hellwig wrote:
>> Ok, it seems like the sound noncontig alloc code that I already
>> commented on as potentially bogus GFP_COMP mapping trips this.  I think
>> for now the right thing would be to revert the hunk in dma-iommu.c
>> (see patch below).  The other thing to try would be to remove both
>> uses of GFP_COMP in sound/core/memalloc.c, which should have the same
>> effect.
>
> Or we explicitly strip the flag in dma_alloc_noncontiguous() (and maybe 
> dma_alloc_pages() as well) for consistency with dma_alloc_attrs(). That 
> seems like it might be the most robust option.

In the long run warning there and returning an error seems like the
right thing to do, yes. I'm just a little worried about doing this right
now after the merge window.
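
The "warn and return an error" policy can be sketched in userspace C as follows. This is an illustration under stated assumptions, not the dma-mapping code: `model_dma_alloc`, `model_warn_on_once`, `model_warn_count` and the `MODEL_GFP_COMP` bit value are hypothetical names invented for the example, and malloc() stands in for the real allocator.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical flag bit for this model only; NOT the kernel's value. */
#define MODEL_GFP_COMP (1u << 0)

/* Counts emitted warnings so callers can observe the warn-once behavior. */
static unsigned int model_warn_count;

/* Userspace stand-in for a warn-once check: prints once, returns cond. */
static bool model_warn_on_once(bool cond, const char *what)
{
	static bool warned;

	if (cond && !warned) {
		warned = true;
		model_warn_count++;
		fprintf(stderr, "model WARNING: %s\n", what);
	}
	return cond;
}

/* Hypothetical entry point: refuse compound requests instead of passing them on. */
static void *model_dma_alloc(size_t size, unsigned int gfp)
{
	assert(size > 0);
	if (model_warn_on_once((gfp & MODEL_GFP_COMP) != 0,
			       "compound flag passed to DMA allocator"))
		return NULL;
	return malloc(size);
}
```

The trade-off discussed in the thread shows up here: rejecting makes a misbehaving caller fail loudly, whereas silently masking the flag keeps that caller working.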

* Re: [6.2][regression] after commit ffcb754584603adf7039d7972564fbf6febdc542 all sound devices disappeared (due BUG at mm/page_alloc.c:3592!)
  2022-12-16 12:15       ` Christoph Hellwig
@ 2022-12-16 12:52         ` Robin Murphy
  -1 siblings, 0 replies; 15+ messages in thread
From: Robin Murphy @ 2022-12-16 12:52 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Mikhail Gavrilov, Takashi Iwai, alsa-devel, m.szyprowski,
	Linux List Kernel Mailing, iommu

On 2022-12-16 12:15, Christoph Hellwig wrote:
> On Fri, Dec 16, 2022 at 11:40:57AM +0000, Robin Murphy wrote:
>> On 2022-12-16 06:46, Christoph Hellwig wrote:
>>> Ok, it seems like the sound noncontig alloc code that I already
>>> commented on as potentially bogus GFP_COMP mapping trips this.  I think
>>> for now the right thing would be to revert the hunk in dma-iommu.c
>>> (see patch below).  The other thing to try would be to remove both
>>> uses of GFP_COMP in sound/core/memalloc.c, which should have the same
>>> effect.
>>
>> Or we explicitly strip the flag in dma_alloc_noncontiguous() (and maybe
>> dma_alloc_pages() as well) for consistency with dma_alloc_attrs(). That
>> seems like it might be the most robust option.
>
> In the long run warning there and returning an error seems like the
> right thing to do, yes. I'm just a little worried about doing this right
> now after the merge window.

Fair point, I guess nobody else actually implements 
dma_alloc_noncontiguous(), and dma_alloc_pages() seems a bit of a grey 
area since it is more of an explicit page allocator. So yeah, just 
restoring iommu-dma (perhaps with a mild VM_WARN_ON?) seems like a 
sufficiently safe and sensible fix for the short term. You can have my 
pre-emptive ack for that.

Cheers,
Robin.

* Re: [6.2][regression] after commit ffcb754584603adf7039d7972564fbf6febdc542 all sound devices disappeared (due BUG at mm/page_alloc.c:3592!) #forregzbot
  2022-12-15 14:17 [6.2][regression] after commit ffcb754584603adf7039d7972564fbf6febdc542 all sound devices disappeared (due BUG at mm/page_alloc.c:3592!) Mikhail Gavrilov
                   ` (2 preceding siblings ...)
  2022-12-16  6:46   ` Christoph Hellwig
@ 2022-12-22 12:17 ` Thorsten Leemhuis
  2022-12-24  7:20   ` Thorsten Leemhuis
  3 siblings, 1 reply; 15+ messages in thread
From: Thorsten Leemhuis @ 2022-12-22 12:17 UTC (permalink / raw)
  To: alsa-devel, Linux List Kernel Mailing, regressions

[Note: this mail contains only information for Linux kernel regression
tracking. Mails like these contain '#forregzbot' in the subject to make
them easy to spot and filter out. The author also tried to remove most
or all individuals from the list of recipients to spare them the hassle.]

On 15.12.22 15:17, Mikhail Gavrilov wrote:
> Hi,
> The kernel 6.2 preparation cycle has begun and yesterday after the
> kernel was updated on my Fedora Rawhide all audio devices disappeared.

Thanks for the report. To be sure the issue below doesn't fall through
the cracks unnoticed, I'm adding it to regzbot, my Linux kernel regression
tracking bot:

#regzbot ^introduced ffcb754584603adf
#regzbot title dma-mapping: audio devices disappeared
#regzbot monitor:
https://lore.kernel.org/all/20221220082009.569785-1-hch@lst.de/
#regzbot fix: dma-mapping: reject GFP_COMP for noncohernt allocaions
#regzbot ignore-activity

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)

P.S.: As the Linux kernel's regression tracker I deal with a lot of
reports and sometimes miss something important when writing mails like
this. If that's the case here, don't hesitate to tell me in a public
reply, it's in everyone's interest to set the public record straight.




* Re: [6.2][regression] after commit ffcb754584603adf7039d7972564fbf6febdc542 all sound devices disappeared (due BUG at mm/page_alloc.c:3592!) #forregzbot
  2022-12-22 12:17 ` [6.2][regression] after commit ffcb754584603adf7039d7972564fbf6febdc542 all sound devices disappeared (due BUG at mm/page_alloc.c:3592!) #forregzbot Thorsten Leemhuis
@ 2022-12-24  7:20   ` Thorsten Leemhuis
  0 siblings, 0 replies; 15+ messages in thread
From: Thorsten Leemhuis @ 2022-12-24  7:20 UTC (permalink / raw)
  To: alsa-devel, Linux List Kernel Mailing, regressions



On 22.12.22 13:17, Thorsten Leemhuis wrote:
> [Note: this mail contains only information for Linux kernel regression
> tracking. Mails like these contain '#forregzbot' in the subject to make
> them easy to spot and filter out. The author also tried to remove most
> or all individuals from the list of recipients to spare them the hassle.]
> 
> On 15.12.22 15:17, Mikhail Gavrilov wrote:
>> Hi,
>> The kernel 6.2 preparation cycle has begun and yesterday after the
>> kernel was updated on my Fedora Rawhide all audio devices disappeared.
> 
> Thanks for the report. To be sure the issue below doesn't fall through
> the cracks unnoticed, I'm adding it to regzbot, my Linux kernel regression
> tracking bot:
> 
> #regzbot ^introduced ffcb754584603adf
> #regzbot title dma-mapping: audio devices disappeared
> #regzbot monitor:
> https://lore.kernel.org/all/20221220082009.569785-1-hch@lst.de/
> #regzbot fix: dma-mapping: reject GFP_COMP for noncohernt allocaions

The typo in the subject of the fix was fixed, hence this is needed:

#regzbot fix: 3622b86f49f8

* Re: [6.2][regression] after commit ffcb754584603adf7039d7972564fbf6febdc542 all sound devices disappeared (due BUG at mm/page_alloc.c:3592!)
       [not found] <CABXGCsPnpu0TGHnvXM1we7q1t3tJAOYW2rA=>
@ 2022-12-15 23:52 ` Joan Bruguera
  0 siblings, 0 replies; 15+ messages in thread
From: Joan Bruguera @ 2022-12-15 23:52 UTC (permalink / raw)
  To: tiwai, linux-kernel; +Cc: Joan Bruguera

The one passing the __GFP_COMP flag appears to be sound/core/memalloc.c,
see also commit e529d3507a93d3c9528580081bbaf931a50de154.
Removing the flags also fixes the sound issues and warnings for me.
