All of lore.kernel.org
 help / color / mirror / Atom feed
* aconnect occasionally causes kernel oopses
@ 2021-08-01 18:27 folkert
  2021-08-02  6:16 ` Takashi Iwai
  0 siblings, 1 reply; 17+ messages in thread
From: folkert @ 2021-08-01 18:27 UTC (permalink / raw)
  To: alsa-devel

[zo aug  1 20:24:51 2021] BUG: unable to handle page fault for address: 00003c366982f750
[zo aug  1 20:24:51 2021] #PF: supervisor read access in kernel mode
[zo aug  1 20:24:51 2021] #PF: error_code(0x0000) - not-present page
[zo aug  1 20:24:51 2021] PGD 0 P4D 0 
[zo aug  1 20:24:51 2021] Oops: 0000 [#1] SMP NOPTI
[zo aug  1 20:24:51 2021] CPU: 2 PID: 59575 Comm: aconnect Not tainted 5.11.0-25-generic #27-Ubuntu
[zo aug  1 20:24:51 2021] Hardware name: HP HP Laptop 17-by2xxx/865A, BIOS F.58 06/16/2020
[zo aug  1 20:24:51 2021] RIP: 0010:check_and_subscribe_port+0x151/0x210 [snd_seq]
[zo aug  1 20:24:51 2021] Code: 49 39 c5 0f 84 39 ff ff ff 41 0f b6 3e eb 0c 48 8b 00 49 39 c5 0f 84 27 ff ff ff 48 8d 70 b0 48 8d 48 a0 45 84 c0 48 0f 45 ce <40> 3a 39 75 e0 0f b6 51 01 41 38 56 01 75 d6 0f b6 51 02 41 38 56
[zo aug  1 20:24:51 2021] RSP: 0018:ffffa57dc9e03cf8 EFLAGS: 00010246 [zo aug  1 20:24:51 2021] RAX: 00003c366982f7b0 RBX: 0000000000000000 RCX: 00003c366982f750
[zo aug  1 20:24:51 2021] RDX: 0000000000000089 RSI: 00003c366982f760 RDI: 0000000000000080
[zo aug  1 20:24:51 2021] RBP: ffffa57dc9e03d38 R08: 0000000000000000 R09: ffff8945226312c0
[zo aug  1 20:24:51 2021] R10: ffff89458b7dda00 R11: ffff89460f08d270 R12: ffff894522631200
[zo aug  1 20:24:51 2021] R13: ffff8945226312c0 R14: ffff894506dd2080 R15: ffff8945226312d8
[zo aug  1 20:24:51 2021] FS:  00007f7fd9633480(0000) GS:ffff894756500000(0000) knlGS:0000000000000000
[zo aug  1 20:24:51 2021] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[zo aug  1 20:24:51 2021] CR2: 00003c366982f750 CR3: 00000001d8318006 CR4: 00000000003706e0
[zo aug  1 20:24:51 2021] Call Trace:
[zo aug  1 20:24:51 2021]  snd_seq_port_connect+0x113/0x180 [snd_seq]
[zo aug  1 20:24:51 2021]  snd_seq_ioctl_subscribe_port+0xb9/0x180 [snd_seq]
[zo aug  1 20:24:51 2021]  ? snd_seq_port_get_subscription+0xbb/0xd0 [snd_seq]
[zo aug  1 20:24:51 2021]  ? __check_object_size.part.0+0x3a/0x150
[zo aug  1 20:24:51 2021]  snd_seq_ioctl+0xe8/0x1b0 [snd_seq]
[zo aug  1 20:24:51 2021]  __x64_sys_ioctl+0x91/0xc0
[zo aug  1 20:24:51 2021]  do_syscall_64+0x38/0x90
[zo aug  1 20:24:51 2021]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[zo aug  1 20:24:51 2021] RIP: 0033:0x7f7fd98b8ecb
[zo aug  1 20:24:51 2021] Code: ff ff ff 85 c0 79 8b 49 c7 c4 ff ff ff ff 5b 5d 4c 89 e0 41 5c c3 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 6d 1f 0d 00 f7 d8 64 89 01 48
[zo aug  1 20:24:51 2021] RSP: 002b:00007ffdc3e3f038 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[zo aug  1 20:24:51 2021] RAX: ffffffffffffffda RBX: 00007f7fd998b354 RCX: 00007f7fd98b8ecb
[zo aug  1 20:24:51 2021] RDX: 00007ffdc3e3f050 RSI: 0000000040505330 RDI: 0000000000000003
[zo aug  1 20:24:51 2021] RBP: 00007ffdc3e3f120 R08: 00000000ffffffff R09: 0000000000000000
[zo aug  1 20:24:51 2021] R10: 00007f7fd993dac0 R11: 0000000000000246 R12: 00007ffdc3e3f050
[zo aug  1 20:24:51 2021] R13: 00007ffdc3e3f0de R14: 0000000000000000 R15: 00007ffdc3e3f0dc
[zo aug  1 20:24:51 2021] Modules linked in: snd_hrtimer snd_usb_audio snd_usbmidi_lib snd_seq_dummy ch341 usbserial ccm xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp nft_compat nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nft_counter nf_tables libcrc32c nfnetlink bridge stp llc rfcomm cmac algif_hash algif_skcipher af_alg bnep binfmt_misc nls_iso8859_1 snd_hda_codec_hdmi snd_sof_pci snd_sof_intel_hda_common snd_hda_codec_realtek snd_soc_hdac_hda snd_hda_codec_generic snd_sof_intel_hda snd_sof_intel_byt snd_sof_intel_ipc snd_sof snd_sof_xtensa_dsp snd_hda_ext_core snd_soc_acpi_intel_match snd_soc_acpi ledtrig_audio snd_hda_intel snd_intel_dspcfg soundwire_intel soundwire_generic_allocation soundwire_cadence snd_hda_codec snd_hda_core snd_hwdep soundwire_bus snd_soc_core snd_compress ac97_bus snd_pcm_dmaengine snd_pcm mei_hdcp rtw88_8723de rtw88_8723d snd_seq_midi x86_pkg_temp_thermal intel_powerclamp snd_seq_midi_event coretemp intel_rapl_msr rtw88_pci [zo aug  1 20:24:51 2021]  kvm_intel snd_rawmidi rtw88_core kvm uvcvideo videobuf2_vmalloc rapl intel_cstate mac80211 videobuf2_memops snd_seq joydev btusb btrtl videobuf2_v4l2 snd_seq_device snd_timer btbcm input_leds btintel videobuf2_common videodev hp_wmi efi_pstore mc sparse_keymap serio_raw snd wmi_bmof intel_wmi_thunderbolt cfg80211 mei_me bluetooth soundcore ee1004 libarc4 processor_thermal_device processor_thermal_rfim processor_thermal_mbox mei ecdh_generic processor_thermal_rapl intel_rapl_common ecc intel_pch_thermal int340x_thermal_zone intel_soc_dts_iosf acpi_pad hp_wireless int3400_thermal acpi_tad acpi_thermal_rel mac_hid sch_fq_codel msr parport_pc ppdev lp parport sunrpc ip_tables x_tables autofs4 dm_crypt i915 crct10dif_pclmul crc32_pclmul i2c_algo_bit ghash_clmulni_intel aesni_intel drm_kms_helper crypto_simd syscopyarea cryptd sysfillrect glue_helper sysimgblt fb_sys_fops cec psmouse nvme rc_core intel_lpss_pci intel_lpss nvme_core i2c_i801 r8169 idma64 i2c_smbus ahci realtek
[zo aug  1 20:24:51 2021]  libahci drm virt_dma xhci_pci xhci_pci_renesas wmi pinctrl_cannonlake video
[zo aug  1 20:24:51 2021] CR2: 00003c366982f750
[zo aug  1 20:24:51 2021] ---[ end trace 000787e3cd09f39d ]---
[zo aug  1 20:24:52 2021] RIP: 0010:check_and_subscribe_port+0x151/0x210 [snd_seq]
[zo aug  1 20:24:52 2021] Code: 49 39 c5 0f 84 39 ff ff ff 41 0f b6 3e eb 0c 48 8b 00 49 39 c5 0f 84 27 ff ff ff 48 8d 70 b0 48 8d 48 a0 45 84 c0 48 0f 45 ce <40> 3a 39 75 e0 0f b6 51 01 41 38 56 01 75 d6 0f b6 51 02 41 38 56
[zo aug  1 20:24:52 2021] RSP: 0018:ffffa57dc9e03cf8 EFLAGS: 00010246
[zo aug  1 20:24:52 2021] RAX: 00003c366982f7b0 RBX: 0000000000000000 RCX: 00003c366982f750
[zo aug  1 20:24:52 2021] RDX: 0000000000000089 RSI: 00003c366982f760 RDI: 0000000000000080
[zo aug  1 20:24:52 2021] RBP: ffffa57dc9e03d38 R08: 0000000000000000 R09: ffff8945226312c0
[zo aug  1 20:24:52 2021] R10: ffff89458b7dda00 R11: ffff89460f08d270 R12: ffff894522631200
[zo aug  1 20:24:52 2021] R13: ffff8945226312c0 R14: ffff894506dd2080 R15: ffff8945226312d8
[zo aug  1 20:24:52 2021] FS:  00007f7fd9633480(0000) GS:ffff894756500000(0000) knlGS:0000000000000000
[zo aug  1 20:24:52 2021] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[zo aug  1 20:24:52 2021] CR2: 00003c366982f750 CR3: 00000001d8318006 CR4: 00000000003706e0

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: aconnect occasionally causes kernel oopses
  2021-08-01 18:27 aconnect occasionally causes kernel oopses folkert
@ 2021-08-02  6:16 ` Takashi Iwai
  2021-08-02  6:18   ` folkert
  0 siblings, 1 reply; 17+ messages in thread
From: Takashi Iwai @ 2021-08-02  6:16 UTC (permalink / raw)
  To: folkert; +Cc: alsa-devel

In which situation?


Takashi

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: aconnect occasionally causes kernel oopses
  2021-08-02  6:16 ` Takashi Iwai
@ 2021-08-02  6:18   ` folkert
  2021-08-02  6:59     ` Takashi Iwai
  0 siblings, 1 reply; 17+ messages in thread
From: folkert @ 2021-08-02  6:18 UTC (permalink / raw)
  To: Takashi Iwai; +Cc: alsa-devel

> In which situation?

I was testing something that listens for alsa events and for that I ran
a continuous loop that did:

while true
do
	aconnect 128:1 14:0
	aconnect -d 128:1 14:0
	aconnect -d 128:2 128:1
	aconnect 128:2 128:1
done

I ran 5 instances in parallel.

14 is midi through
128 is rtpmidi

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: aconnect occasionally causes kernel oopses
  2021-08-02  6:18   ` folkert
@ 2021-08-02  6:59     ` Takashi Iwai
  2021-08-02  9:10       ` folkert
  0 siblings, 1 reply; 17+ messages in thread
From: Takashi Iwai @ 2021-08-02  6:59 UTC (permalink / raw)
  To: folkert; +Cc: alsa-devel

On Mon, 02 Aug 2021 08:18:45 +0200,
folkert wrote:
> 
> > In which situation?
> 
> I was testing something that listens for alsa events and for that I ran
> a continuous loop that did:
> 
> while true
> do
> 	aconnect 128:1 14:0
> 	aconnect -d 128:1 14:0
> 	aconnect -d 128:2 128:1
> 	aconnect 128:2 128:1
> done
> 
> I ran 5 instances in parallel.
> 
> 14 is midi through
> 128 is rtpmidi

So rtpmidi process keeps running during the loop, that is, it's only
about connection and disconnection, right?
Also, you're listening to an event during that -- but how?

In general when you report a bug, more detailed description about the
test itself would be really helpful.  At best, provide a PoC, which
makes the debug much easier.  Posting only the stack trace without any
other information is, OTOH, almost pointless.  One can't decode the
address for a binary built for a different system easily, either.


thanks,

Takashi

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: aconnect occasionally causes kernel oopses
  2021-08-02  6:59     ` Takashi Iwai
@ 2021-08-02  9:10       ` folkert
  2021-08-02 15:11         ` Takashi Iwai
  0 siblings, 1 reply; 17+ messages in thread
From: folkert @ 2021-08-02  9:10 UTC (permalink / raw)
  To: Takashi Iwai; +Cc: alsa-devel

> > > In which situation?
> > 
> > I was testing something that listens for alsa events and for that I ran
> > a continuous loop that did:
> > 
> > while true
> > do
> > 	aconnect 128:1 14:0
> > 	aconnect -d 128:1 14:0
> > 	aconnect -d 128:2 128:1
> > 	aconnect 128:2 128:1
> > done
> > 
> > I ran 5 instances in parallel.
> > 
> > 14 is midi through
> > 128 is rtpmidi
> 
> So rtpmidi process keeps running during the loop, that is, it's only
> about connection and disconnection, right?
> Also, you're listening to an event during that -- but how?

I tried it again but with a simpler setup:

I've got these devices:

root@lappiemctopface:~# aplaymidi -l
 Port    Client name                      Port name
 14:0    Midi Through                     Midi Through Port-0
130:0    FLUID Synth (17032)              Synth input port (17032:0)
131:0    VMPK Input                       in
root@lappiemctopface:~# arecordmidi -l
 Port    Client name                      Port name
 14:0    Midi Through                     Midi Through Port-0
132:0    VMPK Output                      out

I run this in 3x parallel:

while true
do
        aconnect 132:0 130:0
        aconnect -d 132:0 130:0
done

and then in less than a minute I get a backtrace.

[ma aug  2 11:05:13 2021] Call Trace:
[ma aug  2 11:05:13 2021]  ? snd_seq_deliver_event+0x38/0x90 [snd_seq]
[ma aug  2 11:05:13 2021]  ? snd_seq_kernel_client_dispatch+0x72/0x90 [snd_seq]
[ma aug  2 11:05:13 2021]  kfree+0x3bc/0x3e0
[ma aug  2 11:05:13 2021]  ? snd_seq_port_disconnect+0x10c/0x140 [snd_seq]
[ma aug  2 11:05:13 2021]  snd_seq_port_disconnect+0x10c/0x140 [snd_seq]
[ma aug  2 11:05:13 2021]  snd_seq_ioctl_unsubscribe_port+0xb9/0x180 [snd_seq]
[ma aug  2 11:05:13 2021]  ? snd_seq_port_get_subscription+0xbb/0xd0 [snd_seq]
[ma aug  2 11:05:13 2021]  ? __check_object_size.part.0+0x3a/0x150
[ma aug  2 11:05:13 2021]  snd_seq_ioctl+0xe8/0x1b0 [snd_seq]
[ma aug  2 11:05:13 2021]  __x64_sys_ioctl+0x91/0xc0
[ma aug  2 11:05:13 2021]  do_syscall_64+0x38/0x90
[ma aug  2 11:05:13 2021]  entry_SYSCALL_64_after_hwframe+0x44/0xa9


Using:
fluidsynth                                 2.1.7-1
vmpk                                       0.7.2-1build1


On: 5.11.0-25-generic (ubuntu) on 2 cores, 2 threads/core cpu (64b).


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: aconnect occasionally causes kernel oopses
  2021-08-02  9:10       ` folkert
@ 2021-08-02 15:11         ` Takashi Iwai
  2021-08-02 15:21           ` folkert
  0 siblings, 1 reply; 17+ messages in thread
From: Takashi Iwai @ 2021-08-02 15:11 UTC (permalink / raw)
  To: folkert; +Cc: alsa-devel

On Mon, 02 Aug 2021 11:10:12 +0200,
folkert wrote:
> 
> > > > In which situation?
> > > 
> > > I was testing something that listens for alsa events and for that I ran
> > > a continuous loop that did:
> > > 
> > > while true
> > > do
> > > 	aconnect 128:1 14:0
> > > 	aconnect -d 128:1 14:0
> > > 	aconnect -d 128:2 128:1
> > > 	aconnect 128:2 128:1
> > > done
> > > 
> > > I ran 5 instances in parallel.
> > > 
> > > 14 is midi through
> > > 128 is rtpmidi
> > 
> > So rtpmidi process keeps running during the loop, that is, it's only
> > about connection and disconnection, right?
> > Also, you're listening to an event during that -- but how?
> 
> I tried it again but with a simpler setup:
> 
> I've got these devices:
> 
> root@lappiemctopface:~# aplaymidi -l
>  Port    Client name                      Port name
>  14:0    Midi Through                     Midi Through Port-0
> 130:0    FLUID Synth (17032)              Synth input port (17032:0)
> 131:0    VMPK Input                       in
> root@lappiemctopface:~# arecordmidi -l
>  Port    Client name                      Port name
>  14:0    Midi Through                     Midi Through Port-0
> 132:0    VMPK Output                      out
> 
> I run this in 3x parallel:
> 
> while true
> do
>         aconnect 132:0 130:0
>         aconnect -d 132:0 130:0
> done
> 
> and then in less than a minute I get a backtrace.

OK, thanks.  That's more promising.

Does this happen if you do reconnect of kernel sequencer client?
You can use snd-virmidi as well as snd-dummy.

I'm asking it because it'll simplify the test a lot, which will be
almost self-contained.


Takashi

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: aconnect occasionally causes kernel oopses
  2021-08-02 15:11         ` Takashi Iwai
@ 2021-08-02 15:21           ` folkert
  2021-08-02 17:12             ` Takashi Iwai
  0 siblings, 1 reply; 17+ messages in thread
From: folkert @ 2021-08-02 15:21 UTC (permalink / raw)
  To: Takashi Iwai; +Cc: alsa-devel

[ kernel bug if repeatingly aconnect'ing midi devices ]

> Does this happen if you do reconnect of kernel sequencer client?
> You can use snd-virmidi as well as snd-dummy.
> I'm asking it because it'll simplify the test a lot, which will be
> almost self-contained.

Like this?

root@lappiemctopface:~# aplaymidi -l
 Port    Client name                      Port name
 14:0    Midi Through                     Midi Through Port-0
 20:0    Virtual Raw MIDI 1-0             VirMIDI 1-0
 21:0    Virtual Raw MIDI 1-1             VirMIDI 1-1
 22:0    Virtual Raw MIDI 1-2             VirMIDI 1-2
 23:0    Virtual Raw MIDI 1-3             VirMIDI 1-3
128:0    rtpmidi lappiemctopface          Network
128:1    rtpmidi lappiemctopface          metronoom
128:2    rtpmidi lappiemctopface          AppleMidi2IPMidiBridge
128:3    rtpmidi lappiemctopface          oensoens
130:0    FLUID Synth (11462)              Synth input port (11462:0)

and then:

root@lappiemctopface:~# cat test.sh 
while true
do
	aconnect 20:0 21:0
	aconnect -d 20:0 21:0
done

root@lappiemctopface:~# for i in `seq 0 3` ; do (./test.sh &) ; done

This hard locks-up my laptop: it doesn't even respond to capslock (led
on/off) anymore nor the ctrl+prtscr+alt+b combination.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: aconnect occasionally causes kernel oopses
  2021-08-02 15:21           ` folkert
@ 2021-08-02 17:12             ` Takashi Iwai
  2021-08-02 19:53               ` folkert
  0 siblings, 1 reply; 17+ messages in thread
From: Takashi Iwai @ 2021-08-02 17:12 UTC (permalink / raw)
  To: folkert; +Cc: alsa-devel

On Mon, 02 Aug 2021 17:21:17 +0200,
folkert wrote:
> 
> [ kernel bug if repeatingly aconnect'ing midi devices ]
> 
> > Does this happen if you do reconnect of kernel sequencer client?
> > You can use snd-virmidi as well as snd-dummy.
> > I'm asking it because it'll simplify the test a lot, which will be
> > almost self-contained.
> 
> Like this?
> 
> root@lappiemctopface:~# aplaymidi -l
>  Port    Client name                      Port name
>  14:0    Midi Through                     Midi Through Port-0
>  20:0    Virtual Raw MIDI 1-0             VirMIDI 1-0
>  21:0    Virtual Raw MIDI 1-1             VirMIDI 1-1
>  22:0    Virtual Raw MIDI 1-2             VirMIDI 1-2
>  23:0    Virtual Raw MIDI 1-3             VirMIDI 1-3
> 128:0    rtpmidi lappiemctopface          Network
> 128:1    rtpmidi lappiemctopface          metronoom
> 128:2    rtpmidi lappiemctopface          AppleMidi2IPMidiBridge
> 128:3    rtpmidi lappiemctopface          oensoens
> 130:0    FLUID Synth (11462)              Synth input port (11462:0)
> 
> and then:
> 
> root@lappiemctopface:~# cat test.sh 
> while true
> do
> 	aconnect 20:0 21:0
> 	aconnect -d 20:0 21:0
> done
> 
> root@lappiemctopface:~# for i in `seq 0 3` ; do (./test.sh &) ; done
> 
> This hard locks-up my laptop: it doesn't even respond to capslock (led
> on/off) anymore nor the ctrl+prtscr+alt+b combination.

Thanks, that's really helpful!
I see the possible race now.

Could you try the quick fix below?


Takashi

---
--- a/sound/core/seq/seq_ports.c
+++ b/sound/core/seq/seq_ports.c
@@ -200,6 +200,17 @@ get_subscriber(struct list_head *p, bool is_src)
 		return list_entry(p, struct snd_seq_subscribers, dest_list);
 }
 
+static void subscriber_get(struct snd_seq_subscribers *subs)
+{
+	atomic_inc(&subs->ref_count);
+}
+
+static void subscriber_put(struct snd_seq_subscribers *subs)
+{
+	if (atomic_dec_and_test(&subs->ref_count))
+		kfree(subs);
+}
+
 /*
  * remove all subscribers on the list
  * this is called from port_delete, for each src and dest list.
@@ -228,8 +239,7 @@ static void clear_subscriber_list(struct snd_seq_client *client,
 			 * we decrease the counter, and when both ports are deleted
 			 * remove the subscriber info
 			 */
-			if (atomic_dec_and_test(&subs->ref_count))
-				kfree(subs);
+			subscriber_put(subs);
 			continue;
 		}
 
@@ -489,6 +499,8 @@ static int check_and_subscribe_port(struct snd_seq_client *client,
 			s = get_subscriber(p, is_src);
 			if (match_subs_info(&subs->info, &s->info))
 				goto __error;
+			if (!subs->ready)
+				goto __error;
 		}
 	}
 
@@ -505,7 +517,7 @@ static int check_and_subscribe_port(struct snd_seq_client *client,
 	else
 		list_add_tail(&subs->dest_list, &grp->list_head);
 	grp->exclusive = exclusive;
-	atomic_inc(&subs->ref_count);
+	subscriber_get(subs);
 	write_unlock_irq(&grp->list_lock);
 	err = 0;
 
@@ -536,6 +548,7 @@ static void delete_and_unsubscribe_port(struct snd_seq_client *client,
 	if (!empty)
 		unsubscribe_port(client, port, grp, &subs->info, ack);
 	up_write(&grp->list_mutex);
+	subscriber_put(subs);
 }
 
 /* connect two ports */
@@ -555,7 +568,7 @@ int snd_seq_port_connect(struct snd_seq_client *connector,
 		return -ENOMEM;
 
 	subs->info = *info;
-	atomic_set(&subs->ref_count, 0);
+	atomic_set(&subs->ref_count, 1);
 	INIT_LIST_HEAD(&subs->src_list);
 	INIT_LIST_HEAD(&subs->dest_list);
 
@@ -572,13 +585,14 @@ int snd_seq_port_connect(struct snd_seq_client *connector,
 	if (err < 0)
 		goto error_dest;
 
+	subs->ready = true;
 	return 0;
 
  error_dest:
 	delete_and_unsubscribe_port(src_client, src_port, subs, true,
 				    connector->number != src_client->number);
  error:
-	kfree(subs);
+	subscriber_put(subs);
 	return err;
 }
 
@@ -597,8 +611,8 @@ int snd_seq_port_disconnect(struct snd_seq_client *connector,
 	down_write(&src->list_mutex);
 	/* look for the connection */
 	list_for_each_entry(subs, &src->list_head, src_list) {
-		if (match_subs_info(info, &subs->info)) {
-			atomic_dec(&subs->ref_count); /* mark as not ready */
+		if (match_subs_info(info, &subs->info) && subs->ready) {
+			subs->ready = false;
 			err = 0;
 			break;
 		}
@@ -611,7 +625,7 @@ int snd_seq_port_disconnect(struct snd_seq_client *connector,
 				    connector->number != src_client->number);
 	delete_and_unsubscribe_port(dest_client, dest_port, subs, false,
 				    connector->number != dest_client->number);
-	kfree(subs);
+	subscriber_put(subs);
 	return 0;
 }
 
diff --git a/sound/core/seq/seq_ports.h b/sound/core/seq/seq_ports.h
index b1f2c4943174..d9644d5f109e 100644
--- a/sound/core/seq/seq_ports.h
+++ b/sound/core/seq/seq_ports.h
@@ -30,6 +30,7 @@ struct snd_seq_subscribers {
 	struct list_head src_list;	/* link of sources */
 	struct list_head dest_list;	/* link of destinations */
 	atomic_t ref_count;
+	bool ready;
 };
 
 struct snd_seq_port_subs_info {

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: aconnect occasionally causes kernel oopses
  2021-08-02 17:12             ` Takashi Iwai
@ 2021-08-02 19:53               ` folkert
  2021-08-03  6:55                 ` Takashi Iwai
  0 siblings, 1 reply; 17+ messages in thread
From: folkert @ 2021-08-02 19:53 UTC (permalink / raw)
  To: Takashi Iwai; +Cc: alsa-devel

> > [ kernel bug if repeatingly aconnect'ing midi devices ]
> > 
> > > Does this happen if you do reconnect of kernel sequencer client?
> > > You can use snd-virmidi as well as snd-dummy.
> > > I'm asking it because it'll simplify the test a lot, which will be
> > > almost self-contained.
> > 
> > Like this?
> > 
> > root@lappiemctopface:~# aplaymidi -l
> >  Port    Client name                      Port name
> >  14:0    Midi Through                     Midi Through Port-0
> >  20:0    Virtual Raw MIDI 1-0             VirMIDI 1-0
> >  21:0    Virtual Raw MIDI 1-1             VirMIDI 1-1
> >  22:0    Virtual Raw MIDI 1-2             VirMIDI 1-2
> >  23:0    Virtual Raw MIDI 1-3             VirMIDI 1-3
> > 128:0    rtpmidi lappiemctopface          Network
> > 128:1    rtpmidi lappiemctopface          metronoom
> > 128:2    rtpmidi lappiemctopface          AppleMidi2IPMidiBridge
> > 128:3    rtpmidi lappiemctopface          oensoens
> > 130:0    FLUID Synth (11462)              Synth input port (11462:0)
> > 
> > and then:
> > 
> > root@lappiemctopface:~# cat test.sh 
> > while true
> > do
> > 	aconnect 20:0 21:0
> > 	aconnect -d 20:0 21:0
> > done
> > 
> > root@lappiemctopface:~# for i in `seq 0 3` ; do (./test.sh &) ; done
> > 
> > This hard locks-up my laptop: it doesn't even respond to capslock (led
> > on/off) anymore nor the ctrl+prtscr+alt+b combination.
> 
> Thanks, that's really helpful!
> I see the possible race now.

> Could you try the quick fix below?

Something may have changed:

folkert@oensoens:~$ aplaymidi -l
ALSA lib seq_hw.c:466:(snd_seq_hw_open) open /dev/snd/seq failed: Permission denied
Cannot open sequencer - Permission denied

???

Otoh it now runs fine from what I've could test.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: aconnect occasionally causes kernel oopses
  2021-08-02 19:53               ` folkert
@ 2021-08-03  6:55                 ` Takashi Iwai
  2021-08-03  7:40                   ` folkert
  2021-08-03 10:12                   ` folkert
  0 siblings, 2 replies; 17+ messages in thread
From: Takashi Iwai @ 2021-08-03  6:55 UTC (permalink / raw)
  To: folkert; +Cc: alsa-devel

On Mon, 02 Aug 2021 21:53:49 +0200,
folkert wrote:
> 
> > > [ kernel bug if repeatingly aconnect'ing midi devices ]
> > > 
> > > > Does this happen if you do reconnect of kernel sequencer client?
> > > > You can use snd-virmidi as well as snd-dummy.
> > > > I'm asking it because it'll simplify the test a lot, which will be
> > > > almost self-contained.
> > > 
> > > Like this?
> > > 
> > > root@lappiemctopface:~# aplaymidi -l
> > >  Port    Client name                      Port name
> > >  14:0    Midi Through                     Midi Through Port-0
> > >  20:0    Virtual Raw MIDI 1-0             VirMIDI 1-0
> > >  21:0    Virtual Raw MIDI 1-1             VirMIDI 1-1
> > >  22:0    Virtual Raw MIDI 1-2             VirMIDI 1-2
> > >  23:0    Virtual Raw MIDI 1-3             VirMIDI 1-3
> > > 128:0    rtpmidi lappiemctopface          Network
> > > 128:1    rtpmidi lappiemctopface          metronoom
> > > 128:2    rtpmidi lappiemctopface          AppleMidi2IPMidiBridge
> > > 128:3    rtpmidi lappiemctopface          oensoens
> > > 130:0    FLUID Synth (11462)              Synth input port (11462:0)
> > > 
> > > and then:
> > > 
> > > root@lappiemctopface:~# cat test.sh 
> > > while true
> > > do
> > > 	aconnect 20:0 21:0
> > > 	aconnect -d 20:0 21:0
> > > done
> > > 
> > > root@lappiemctopface:~# for i in `seq 0 3` ; do (./test.sh &) ; done
> > > 
> > > This hard locks-up my laptop: it doesn't even respond to capslock (led
> > > on/off) anymore nor the ctrl+prtscr+alt+b combination.
> > 
> > Thanks, that's really helpful!
> > I see the possible race now.
> 
> > Could you try the quick fix below?
> 
> Something may have changed:
> 
> folkert@oensoens:~$ aplaymidi -l
> ALSA lib seq_hw.c:466:(snd_seq_hw_open) open /dev/snd/seq failed: Permission denied
> Cannot open sequencer - Permission denied
> 
> ???

Hrm, that's unexpected.

Meanwhile, I reconsidered the fix and a better idea came after the
sleep.  Below is the new fix patch.  Could you give it a try instead
of the previous one?


thanks,

Takashi

-- 8< --
From: Takashi Iwai <tiwai@suse.de>
Subject: [PATCH] ALSA: seq: Fix racy deletion of subscriber

It turned out that the current implementation of the port subscription
is racy.  The subscription contains two linked lists, and we have to
add to or delete from both lists.  Since both connection and
disconnection procedures perform the same order for those two lists
(i.e. src list, then dest list), when a deletion happens during a
connection procedure, the src list may be deleted before the dest list
addition completes, and this may lead to a use-after-free or an Oops,
even though the access to both lists are protected via mutex.

The simple workaround for this race is to change the access order for
the disconnection, namely, dest list, then src list.  This assures
that the connection has been established when disconnecting, and also
the concurrent deletion can be avoided.

Reported-by: folkert <folkert@vanheusden.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20210801182754.GP890690@belle.intranet.vanheusden.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
---
 sound/core/seq/seq_ports.c | 39 ++++++++++++++++++++++++++------------
 1 file changed, 27 insertions(+), 12 deletions(-)

diff --git a/sound/core/seq/seq_ports.c b/sound/core/seq/seq_ports.c
index b9c2ce2b8d5a..84d78630463e 100644
--- a/sound/core/seq/seq_ports.c
+++ b/sound/core/seq/seq_ports.c
@@ -514,10 +514,11 @@ static int check_and_subscribe_port(struct snd_seq_client *client,
 	return err;
 }
 
-static void delete_and_unsubscribe_port(struct snd_seq_client *client,
-					struct snd_seq_client_port *port,
-					struct snd_seq_subscribers *subs,
-					bool is_src, bool ack)
+/* called with grp->list_mutex held */
+static void __delete_and_unsubscribe_port(struct snd_seq_client *client,
+					  struct snd_seq_client_port *port,
+					  struct snd_seq_subscribers *subs,
+					  bool is_src, bool ack)
 {
 	struct snd_seq_port_subs_info *grp;
 	struct list_head *list;
@@ -525,7 +526,6 @@ static void delete_and_unsubscribe_port(struct snd_seq_client *client,
 
 	grp = is_src ? &port->c_src : &port->c_dest;
 	list = is_src ? &subs->src_list : &subs->dest_list;
-	down_write(&grp->list_mutex);
 	write_lock_irq(&grp->list_lock);
 	empty = list_empty(list);
 	if (!empty)
@@ -535,6 +535,18 @@ static void delete_and_unsubscribe_port(struct snd_seq_client *client,
 
 	if (!empty)
 		unsubscribe_port(client, port, grp, &subs->info, ack);
+}
+
+static void delete_and_unsubscribe_port(struct snd_seq_client *client,
+					struct snd_seq_client_port *port,
+					struct snd_seq_subscribers *subs,
+					bool is_src, bool ack)
+{
+	struct snd_seq_port_subs_info *grp;
+
+	grp = is_src ? &port->c_src : &port->c_dest;
+	down_write(&grp->list_mutex);
+	__delete_and_unsubscribe_port(client, port, subs, is_src, ack);
 	up_write(&grp->list_mutex);
 }
 
@@ -590,27 +602,30 @@ int snd_seq_port_disconnect(struct snd_seq_client *connector,
 			    struct snd_seq_client_port *dest_port,
 			    struct snd_seq_port_subscribe *info)
 {
-	struct snd_seq_port_subs_info *src = &src_port->c_src;
+	struct snd_seq_port_subs_info *dest = &dest_port->c_dest;
 	struct snd_seq_subscribers *subs;
 	int err = -ENOENT;
 
-	down_write(&src->list_mutex);
+	/* always start from deleting the dest port for avoiding concurrent
+	 * deletions
+	 */
+	down_write(&dest->list_mutex);
 	/* look for the connection */
-	list_for_each_entry(subs, &src->list_head, src_list) {
+	list_for_each_entry(subs, &dest->list_head, dest_list) {
 		if (match_subs_info(info, &subs->info)) {
-			atomic_dec(&subs->ref_count); /* mark as not ready */
+			__delete_and_unsubscribe_port(dest_client, dest_port,
+						      subs, false,
+						      connector->number != dest_client->number);
 			err = 0;
 			break;
 		}
 	}
-	up_write(&src->list_mutex);
+	up_write(&dest->list_mutex);
 	if (err < 0)
 		return err;
 
 	delete_and_unsubscribe_port(src_client, src_port, subs, true,
 				    connector->number != src_client->number);
-	delete_and_unsubscribe_port(dest_client, dest_port, subs, false,
-				    connector->number != dest_client->number);
 	kfree(subs);
 	return 0;
 }
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: aconnect occasionally causes kernel oopses
  2021-08-03  6:55                 ` Takashi Iwai
@ 2021-08-03  7:40                   ` folkert
  2021-08-03  7:43                     ` Takashi Iwai
  2021-08-03 10:12                   ` folkert
  1 sibling, 1 reply; 17+ messages in thread
From: folkert @ 2021-08-03  7:40 UTC (permalink / raw)
  To: Takashi Iwai; +Cc: alsa-devel

Hi,

To which kernel version should I apply it?
Because:

folkert@oensoens:~/linux-5.11.0$ patch -p1 < ~/alsa-patch.diff 
patching file sound/core/seq/seq_ports.c
patching file sound/core/seq/seq_ports.h
patching file sound/core/seq/seq_ports.c
Hunk #1 succeeded at 526 (offset 12 lines).
Hunk #2 succeeded at 538 (offset 12 lines).
Hunk #3 succeeded at 547 with fuzz 2 (offset 12 lines).
Hunk #4 FAILED at 602.
1 out of 4 hunks FAILED -- saving rejects to file sound/core/seq/seq_ports.c.rej

On Tue, Aug 03, 2021 at 08:55:36AM +0200, Takashi Iwai wrote:
> On Mon, 02 Aug 2021 21:53:49 +0200,
> folkert wrote:
> > 
> > > > [ kernel bug if repeatingly aconnect'ing midi devices ]
> > > > 
> > > > > Does this happen if you do reconnect of kernel sequencer client?
> > > > > You can use snd-virmidi as well as snd-dummy.
> > > > > I'm asking it because it'll simplify the test a lot, which will be
> > > > > almost self-contained.
> > > > 
> > > > Like this?
> > > > 
> > > > root@lappiemctopface:~# aplaymidi -l
> > > >  Port    Client name                      Port name
> > > >  14:0    Midi Through                     Midi Through Port-0
> > > >  20:0    Virtual Raw MIDI 1-0             VirMIDI 1-0
> > > >  21:0    Virtual Raw MIDI 1-1             VirMIDI 1-1
> > > >  22:0    Virtual Raw MIDI 1-2             VirMIDI 1-2
> > > >  23:0    Virtual Raw MIDI 1-3             VirMIDI 1-3
> > > > 128:0    rtpmidi lappiemctopface          Network
> > > > 128:1    rtpmidi lappiemctopface          metronoom
> > > > 128:2    rtpmidi lappiemctopface          AppleMidi2IPMidiBridge
> > > > 128:3    rtpmidi lappiemctopface          oensoens
> > > > 130:0    FLUID Synth (11462)              Synth input port (11462:0)
> > > > 
> > > > and then:
> > > > 
> > > > root@lappiemctopface:~# cat test.sh 
> > > > while true
> > > > do
> > > > 	aconnect 20:0 21:0
> > > > 	aconnect -d 20:0 21:0
> > > > done
> > > > 
> > > > root@lappiemctopface:~# for i in `seq 0 3` ; do (./test.sh &) ; done
> > > > 
> > > > This hard locks-up my laptop: it doesn't even respond to capslock (led
> > > > on/off) anymore nor the ctrl+prtscr+alt+b combination.
> > > 
> > > Thanks, that's really helpful!
> > > I see the possible race now.
> > 
> > > Could you try the quick fix below?
> > 
> > Something may have changed:
> > 
> > folkert@oensoens:~$ aplaymidi -l
> > ALSA lib seq_hw.c:466:(snd_seq_hw_open) open /dev/snd/seq failed: Permission denied
> > Cannot open sequencer - Permission denied
> > 
> > ???
> 
> Hrm, that's unexpected.
> 
> Meanwhile, I reconsidered the fix and a better idea came after the
> sleep.  Below is the new fix patch.  Could you give it a try instead
> of the previous one?
> 
> 
> thanks,
> 
> Takashi
> 
> -- 8< --
> From: Takashi Iwai <tiwai@suse.de>
> Subject: [PATCH] ALSA: seq: Fix racy deletion of subscriber
> 
> It turned out that the current implementation of the port subscription
> is racy.  The subscription contains two linked lists, and we have to
> add to or delete from both lists.  Since both connection and
> disconnection procedures perform the same order for those two lists
> (i.e. src list, then dest list), when a deletion happens during a
> connection procedure, the src list may be deleted before the dest list
> addition completes, and this may lead to a use-after-free or an Oops,
> even though the access to both lists are protected via mutex.
> 
> The simple workaround for this race is to change the access order for
> the disconnection, namely, dest list, then src list.  This assures
> that the connection has been established when disconnecting, and also
> the concurrent deletion can be avoided.
> 
> Reported-by: folkert <folkert@vanheusden.com>
> Cc: <stable@vger.kernel.org>
> Link: https://lore.kernel.org/r/20210801182754.GP890690@belle.intranet.vanheusden.com
> Signed-off-by: Takashi Iwai <tiwai@suse.de>
> ---
>  sound/core/seq/seq_ports.c | 39 ++++++++++++++++++++++++++------------
>  1 file changed, 27 insertions(+), 12 deletions(-)
> 
> diff --git a/sound/core/seq/seq_ports.c b/sound/core/seq/seq_ports.c
> index b9c2ce2b8d5a..84d78630463e 100644
> --- a/sound/core/seq/seq_ports.c
> +++ b/sound/core/seq/seq_ports.c
> @@ -514,10 +514,11 @@ static int check_and_subscribe_port(struct snd_seq_client *client,
>  	return err;
>  }
>  
> -static void delete_and_unsubscribe_port(struct snd_seq_client *client,
> -					struct snd_seq_client_port *port,
> -					struct snd_seq_subscribers *subs,
> -					bool is_src, bool ack)
> +/* called with grp->list_mutex held */
> +static void __delete_and_unsubscribe_port(struct snd_seq_client *client,
> +					  struct snd_seq_client_port *port,
> +					  struct snd_seq_subscribers *subs,
> +					  bool is_src, bool ack)
>  {
>  	struct snd_seq_port_subs_info *grp;
>  	struct list_head *list;
> @@ -525,7 +526,6 @@ static void delete_and_unsubscribe_port(struct snd_seq_client *client,
>  
>  	grp = is_src ? &port->c_src : &port->c_dest;
>  	list = is_src ? &subs->src_list : &subs->dest_list;
> -	down_write(&grp->list_mutex);
>  	write_lock_irq(&grp->list_lock);
>  	empty = list_empty(list);
>  	if (!empty)
> @@ -535,6 +535,18 @@ static void delete_and_unsubscribe_port(struct snd_seq_client *client,
>  
>  	if (!empty)
>  		unsubscribe_port(client, port, grp, &subs->info, ack);
> +}
> +
> +static void delete_and_unsubscribe_port(struct snd_seq_client *client,
> +					struct snd_seq_client_port *port,
> +					struct snd_seq_subscribers *subs,
> +					bool is_src, bool ack)
> +{
> +	struct snd_seq_port_subs_info *grp;
> +
> +	grp = is_src ? &port->c_src : &port->c_dest;
> +	down_write(&grp->list_mutex);
> +	__delete_and_unsubscribe_port(client, port, subs, is_src, ack);
>  	up_write(&grp->list_mutex);
>  }
>  
> @@ -590,27 +602,30 @@ int snd_seq_port_disconnect(struct snd_seq_client *connector,
>  			    struct snd_seq_client_port *dest_port,
>  			    struct snd_seq_port_subscribe *info)
>  {
> -	struct snd_seq_port_subs_info *src = &src_port->c_src;
> +	struct snd_seq_port_subs_info *dest = &dest_port->c_dest;
>  	struct snd_seq_subscribers *subs;
>  	int err = -ENOENT;
>  
> -	down_write(&src->list_mutex);
> +	/* always start from deleting the dest port for avoiding concurrent
> +	 * deletions
> +	 */
> +	down_write(&dest->list_mutex);
>  	/* look for the connection */
> -	list_for_each_entry(subs, &src->list_head, src_list) {
> +	list_for_each_entry(subs, &dest->list_head, dest_list) {
>  		if (match_subs_info(info, &subs->info)) {
> -			atomic_dec(&subs->ref_count); /* mark as not ready */
> +			__delete_and_unsubscribe_port(dest_client, dest_port,
> +						      subs, false,
> +						      connector->number != dest_client->number);
>  			err = 0;
>  			break;
>  		}
>  	}
> -	up_write(&src->list_mutex);
> +	up_write(&dest->list_mutex);
>  	if (err < 0)
>  		return err;
>  
>  	delete_and_unsubscribe_port(src_client, src_port, subs, true,
>  				    connector->number != src_client->number);
> -	delete_and_unsubscribe_port(dest_client, dest_port, subs, false,
> -				    connector->number != dest_client->number);
>  	kfree(subs);
>  	return 0;
>  }
> -- 
> 2.26.2


Folkert van Heusden

-- 
MultiTail is a versatile tool for watching logfiles and output of
commands. Filtering, coloring, merging, diff-view, etc.
http://www.vanheusden.com/multitail/
----------------------------------------------------------------------
Phone: +31-6-41278122, PGP-key: 1F28D8AE, www.vanheusden.com

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: aconnect occasionally causes kernel oopses
  2021-08-03  7:40                   ` folkert
@ 2021-08-03  7:43                     ` Takashi Iwai
  2021-08-03  7:44                       ` folkert
  0 siblings, 1 reply; 17+ messages in thread
From: Takashi Iwai @ 2021-08-03  7:43 UTC (permalink / raw)
  To: folkert; +Cc: alsa-devel

On Tue, 03 Aug 2021 09:40:50 +0200,
folkert wrote:
> 
> Hi,
> 
> To which kernel version should I apply it?
> Because:
> 
> folkert@oensoens:~/linux-5.11.0$ patch -p1 < ~/alsa-patch.diff 
> patching file sound/core/seq/seq_ports.c
> patching file sound/core/seq/seq_ports.h
> patching file sound/core/seq/seq_ports.c
> Hunk #1 succeeded at 526 (offset 12 lines).
> Hunk #2 succeeded at 538 (offset 12 lines).
> Hunk #3 succeeded at 547 with fuzz 2 (offset 12 lines).
> Hunk #4 FAILED at 602.
> 1 out of 4 hunks FAILED -- saving rejects to file sound/core/seq/seq_ports.c.rej

Did you scratch the previous patch?  AFAIK, the latest patch should be
applicable to 5.11 and older cleanly.


Takashi

> 
> On Tue, Aug 03, 2021 at 08:55:36AM +0200, Takashi Iwai wrote:
> > On Mon, 02 Aug 2021 21:53:49 +0200,
> > folkert wrote:
> > > 
> > > > > [ kernel bug if repeatingly aconnect'ing midi devices ]
> > > > > 
> > > > > > Does this happen if you do reconnect of kernel sequencer client?
> > > > > > You can use snd-virmidi as well as snd-dummy.
> > > > > > I'm asking it because it'll simplify the test a lot, which will be
> > > > > > almost self-contained.
> > > > > 
> > > > > Like this?
> > > > > 
> > > > > root@lappiemctopface:~# aplaymidi -l
> > > > >  Port    Client name                      Port name
> > > > >  14:0    Midi Through                     Midi Through Port-0
> > > > >  20:0    Virtual Raw MIDI 1-0             VirMIDI 1-0
> > > > >  21:0    Virtual Raw MIDI 1-1             VirMIDI 1-1
> > > > >  22:0    Virtual Raw MIDI 1-2             VirMIDI 1-2
> > > > >  23:0    Virtual Raw MIDI 1-3             VirMIDI 1-3
> > > > > 128:0    rtpmidi lappiemctopface          Network
> > > > > 128:1    rtpmidi lappiemctopface          metronoom
> > > > > 128:2    rtpmidi lappiemctopface          AppleMidi2IPMidiBridge
> > > > > 128:3    rtpmidi lappiemctopface          oensoens
> > > > > 130:0    FLUID Synth (11462)              Synth input port (11462:0)
> > > > > 
> > > > > and then:
> > > > > 
> > > > > root@lappiemctopface:~# cat test.sh 
> > > > > while true
> > > > > do
> > > > > 	aconnect 20:0 21:0
> > > > > 	aconnect -d 20:0 21:0
> > > > > done
> > > > > 
> > > > > root@lappiemctopface:~# for i in `seq 0 3` ; do (./test.sh &) ; done
> > > > > 
> > > > > This hard locks-up my laptop: it doesn't even respond to capslock (led
> > > > > on/off) anymore nor the ctrl+prtscr+alt+b combination.
> > > > 
> > > > Thanks, that's really helpful!
> > > > I see the possible race now.
> > > 
> > > > Could you try the quick fix below?
> > > 
> > > Something may have changed:
> > > 
> > > folkert@oensoens:~$ aplaymidi -l
> > > ALSA lib seq_hw.c:466:(snd_seq_hw_open) open /dev/snd/seq failed: Permission denied
> > > Cannot open sequencer - Permission denied
> > > 
> > > ???
> > 
> > Hrm, that's unexpected.
> > 
> > Meanwhile, I reconsidered the fix and a better idea came after the
> > sleep.  Below is the new fix patch.  Could you give it a try instead
> > of the previous one?
> > 
> > 
> > thanks,
> > 
> > Takashi
> > 
> > -- 8< --
> > From: Takashi Iwai <tiwai@suse.de>
> > Subject: [PATCH] ALSA: seq: Fix racy deletion of subscriber
> > 
> > It turned out that the current implementation of the port subscription
> > is racy.  The subscription contains two linked lists, and we have to
> > add to or delete from both lists.  Since both connection and
> > disconnection procedures perform the same order for those two lists
> > (i.e. src list, then dest list), when a deletion happens during a
> > connection procedure, the src list may be deleted before the dest list
> > addition completes, and this may lead to a use-after-free or an Oops,
> > even though the access to both lists are protected via mutex.
> > 
> > The simple workaround for this race is to change the access order for
> > the disconnection, namely, dest list, then src list.  This assures
> > that the connection has been established when disconnecting, and also
> > the concurrent deletion can be avoided.
> > 
> > Reported-by: folkert <folkert@vanheusden.com>
> > Cc: <stable@vger.kernel.org>
> > Link: https://lore.kernel.org/r/20210801182754.GP890690@belle.intranet.vanheusden.com
> > Signed-off-by: Takashi Iwai <tiwai@suse.de>
> > ---
> >  sound/core/seq/seq_ports.c | 39 ++++++++++++++++++++++++++------------
> >  1 file changed, 27 insertions(+), 12 deletions(-)
> > 
> > diff --git a/sound/core/seq/seq_ports.c b/sound/core/seq/seq_ports.c
> > index b9c2ce2b8d5a..84d78630463e 100644
> > --- a/sound/core/seq/seq_ports.c
> > +++ b/sound/core/seq/seq_ports.c
> > @@ -514,10 +514,11 @@ static int check_and_subscribe_port(struct snd_seq_client *client,
> >  	return err;
> >  }
> >  
> > -static void delete_and_unsubscribe_port(struct snd_seq_client *client,
> > -					struct snd_seq_client_port *port,
> > -					struct snd_seq_subscribers *subs,
> > -					bool is_src, bool ack)
> > +/* called with grp->list_mutex held */
> > +static void __delete_and_unsubscribe_port(struct snd_seq_client *client,
> > +					  struct snd_seq_client_port *port,
> > +					  struct snd_seq_subscribers *subs,
> > +					  bool is_src, bool ack)
> >  {
> >  	struct snd_seq_port_subs_info *grp;
> >  	struct list_head *list;
> > @@ -525,7 +526,6 @@ static void delete_and_unsubscribe_port(struct snd_seq_client *client,
> >  
> >  	grp = is_src ? &port->c_src : &port->c_dest;
> >  	list = is_src ? &subs->src_list : &subs->dest_list;
> > -	down_write(&grp->list_mutex);
> >  	write_lock_irq(&grp->list_lock);
> >  	empty = list_empty(list);
> >  	if (!empty)
> > @@ -535,6 +535,18 @@ static void delete_and_unsubscribe_port(struct snd_seq_client *client,
> >  
> >  	if (!empty)
> >  		unsubscribe_port(client, port, grp, &subs->info, ack);
> > +}
> > +
> > +static void delete_and_unsubscribe_port(struct snd_seq_client *client,
> > +					struct snd_seq_client_port *port,
> > +					struct snd_seq_subscribers *subs,
> > +					bool is_src, bool ack)
> > +{
> > +	struct snd_seq_port_subs_info *grp;
> > +
> > +	grp = is_src ? &port->c_src : &port->c_dest;
> > +	down_write(&grp->list_mutex);
> > +	__delete_and_unsubscribe_port(client, port, subs, is_src, ack);
> >  	up_write(&grp->list_mutex);
> >  }
> >  
> > @@ -590,27 +602,30 @@ int snd_seq_port_disconnect(struct snd_seq_client *connector,
> >  			    struct snd_seq_client_port *dest_port,
> >  			    struct snd_seq_port_subscribe *info)
> >  {
> > -	struct snd_seq_port_subs_info *src = &src_port->c_src;
> > +	struct snd_seq_port_subs_info *dest = &dest_port->c_dest;
> >  	struct snd_seq_subscribers *subs;
> >  	int err = -ENOENT;
> >  
> > -	down_write(&src->list_mutex);
> > +	/* always start from deleting the dest port for avoiding concurrent
> > +	 * deletions
> > +	 */
> > +	down_write(&dest->list_mutex);
> >  	/* look for the connection */
> > -	list_for_each_entry(subs, &src->list_head, src_list) {
> > +	list_for_each_entry(subs, &dest->list_head, dest_list) {
> >  		if (match_subs_info(info, &subs->info)) {
> > -			atomic_dec(&subs->ref_count); /* mark as not ready */
> > +			__delete_and_unsubscribe_port(dest_client, dest_port,
> > +						      subs, false,
> > +						      connector->number != dest_client->number);
> >  			err = 0;
> >  			break;
> >  		}
> >  	}
> > -	up_write(&src->list_mutex);
> > +	up_write(&dest->list_mutex);
> >  	if (err < 0)
> >  		return err;
> >  
> >  	delete_and_unsubscribe_port(src_client, src_port, subs, true,
> >  				    connector->number != src_client->number);
> > -	delete_and_unsubscribe_port(dest_client, dest_port, subs, false,
> > -				    connector->number != dest_client->number);
> >  	kfree(subs);
> >  	return 0;
> >  }
> > -- 
> > 2.26.2
> 
> 
> Folkert van Heusden
> 
> -- 
> MultiTail is a versatile tool for watching logfiles and output of
> commands. Filtering, coloring, merging, diff-view, etc.
> http://www.vanheusden.com/multitail/
> ----------------------------------------------------------------------
> Phone: +31-6-41278122, PGP-key: 1F28D8AE, www.vanheusden.com
> 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: aconnect occasionally causes kernel oopses
  2021-08-03  7:43                     ` Takashi Iwai
@ 2021-08-03  7:44                       ` folkert
  2021-08-03  7:49                         ` Takashi Iwai
  0 siblings, 1 reply; 17+ messages in thread
From: folkert @ 2021-08-03  7:44 UTC (permalink / raw)
  To: Takashi Iwai; +Cc: alsa-devel

> > To which kernel version should I apply it?
> > Because:
> > 
> > folkert@oensoens:~/linux-5.11.0$ patch -p1 < ~/alsa-patch.diff 
> > patching file sound/core/seq/seq_ports.c
> > patching file sound/core/seq/seq_ports.h
> > patching file sound/core/seq/seq_ports.c
> > Hunk #1 succeeded at 526 (offset 12 lines).
> > Hunk #2 succeeded at 538 (offset 12 lines).
> > Hunk #3 succeeded at 547 with fuzz 2 (offset 12 lines).
> > Hunk #4 FAILED at 602.
> > 1 out of 4 hunks FAILED -- saving rejects to file sound/core/seq/seq_ports.c.rej
> 
> Did you scratch the previous patch?  AFAIK, the latest patch should be
> applicable to 5.11 and older cleanly.

I did.
It also fails on 5.13.7.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: aconnect occasionally causes kernel oopses
  2021-08-03  7:44                       ` folkert
@ 2021-08-03  7:49                         ` Takashi Iwai
  2021-08-03  7:52                           ` folkert
  0 siblings, 1 reply; 17+ messages in thread
From: Takashi Iwai @ 2021-08-03  7:49 UTC (permalink / raw)
  To: folkert; +Cc: alsa-devel

[-- Attachment #1: Type: text/plain, Size: 983 bytes --]

On Tue, 03 Aug 2021 09:44:51 +0200,
folkert wrote:
> 
> > > To which kernel version should I apply it?
> > > Because:
> > > 
> > > folkert@oensoens:~/linux-5.11.0$ patch -p1 < ~/alsa-patch.diff 
> > > patching file sound/core/seq/seq_ports.c
> > > patching file sound/core/seq/seq_ports.h
> > > patching file sound/core/seq/seq_ports.c
> > > Hunk #1 succeeded at 526 (offset 12 lines).
> > > Hunk #2 succeeded at 538 (offset 12 lines).
> > > Hunk #3 succeeded at 547 with fuzz 2 (offset 12 lines).
> > > Hunk #4 FAILED at 602.
> > > 1 out of 4 hunks FAILED -- saving rejects to file sound/core/seq/seq_ports.c.rej
> > 
> > Did you scratch the previous patch?  AFAIK, the latest patch should be
> > applicable to 5.11 and older cleanly.
> 
> I did.
> It also fails on 5.13.7.

Weird, here it's applied cleanly on my machine with every kernel
version.  Double-check whether you really have any other changes.
I attach the patch again but with an attachment just to be sure.


Takashi


[-- Attachment #2: 0001-ALSA-seq-Fix-racy-deletion-of-subscriber.patch --]
[-- Type: application/octet-stream, Size: 4266 bytes --]

From a474c4b6224556b39e20a3d13bdb56b5e6984f78 Mon Sep 17 00:00:00 2001
From: Takashi Iwai <tiwai@suse.de>
Date: Tue, 3 Aug 2021 08:43:21 +0200
Subject: [PATCH] ALSA: seq: Fix racy deletion of subscriber

It turned out that the current implementation of the port subscription
is racy.  The subscription contains two linked lists, and we have to
add to or delete from both lists.  Since both connection and
disconnection procedures perform the same order for those two lists
(i.e. src list, then dest list), when a deletion happens during a
connection procedure, the src list may be deleted before the dest list
addition completes, and this may lead to a use-after-free or an Oops,
even though the access to both lists are protected via mutex.

The simple workaround for this race is to change the access order for
the disconnection, namely, dest list, then src list.  This assures
that the connection has been established when disconnecting, and also
the concurrent deletion can be avoided.

Reported-by: folkert <folkert@vanheusden.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20210801182754.GP890690@belle.intranet.vanheusden.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
---
 sound/core/seq/seq_ports.c | 39 ++++++++++++++++++++++++++------------
 1 file changed, 27 insertions(+), 12 deletions(-)

diff --git a/sound/core/seq/seq_ports.c b/sound/core/seq/seq_ports.c
index b9c2ce2b8d5a..84d78630463e 100644
--- a/sound/core/seq/seq_ports.c
+++ b/sound/core/seq/seq_ports.c
@@ -514,10 +514,11 @@ static int check_and_subscribe_port(struct snd_seq_client *client,
 	return err;
 }
 
-static void delete_and_unsubscribe_port(struct snd_seq_client *client,
-					struct snd_seq_client_port *port,
-					struct snd_seq_subscribers *subs,
-					bool is_src, bool ack)
+/* called with grp->list_mutex held */
+static void __delete_and_unsubscribe_port(struct snd_seq_client *client,
+					  struct snd_seq_client_port *port,
+					  struct snd_seq_subscribers *subs,
+					  bool is_src, bool ack)
 {
 	struct snd_seq_port_subs_info *grp;
 	struct list_head *list;
@@ -525,7 +526,6 @@ static void delete_and_unsubscribe_port(struct snd_seq_client *client,
 
 	grp = is_src ? &port->c_src : &port->c_dest;
 	list = is_src ? &subs->src_list : &subs->dest_list;
-	down_write(&grp->list_mutex);
 	write_lock_irq(&grp->list_lock);
 	empty = list_empty(list);
 	if (!empty)
@@ -535,6 +535,18 @@ static void delete_and_unsubscribe_port(struct snd_seq_client *client,
 
 	if (!empty)
 		unsubscribe_port(client, port, grp, &subs->info, ack);
+}
+
+static void delete_and_unsubscribe_port(struct snd_seq_client *client,
+					struct snd_seq_client_port *port,
+					struct snd_seq_subscribers *subs,
+					bool is_src, bool ack)
+{
+	struct snd_seq_port_subs_info *grp;
+
+	grp = is_src ? &port->c_src : &port->c_dest;
+	down_write(&grp->list_mutex);
+	__delete_and_unsubscribe_port(client, port, subs, is_src, ack);
 	up_write(&grp->list_mutex);
 }
 
@@ -590,27 +602,30 @@ int snd_seq_port_disconnect(struct snd_seq_client *connector,
 			    struct snd_seq_client_port *dest_port,
 			    struct snd_seq_port_subscribe *info)
 {
-	struct snd_seq_port_subs_info *src = &src_port->c_src;
+	struct snd_seq_port_subs_info *dest = &dest_port->c_dest;
 	struct snd_seq_subscribers *subs;
 	int err = -ENOENT;
 
-	down_write(&src->list_mutex);
+	/* always start from deleting the dest port for avoiding concurrent
+	 * deletions
+	 */
+	down_write(&dest->list_mutex);
 	/* look for the connection */
-	list_for_each_entry(subs, &src->list_head, src_list) {
+	list_for_each_entry(subs, &dest->list_head, dest_list) {
 		if (match_subs_info(info, &subs->info)) {
-			atomic_dec(&subs->ref_count); /* mark as not ready */
+			__delete_and_unsubscribe_port(dest_client, dest_port,
+						      subs, false,
+						      connector->number != dest_client->number);
 			err = 0;
 			break;
 		}
 	}
-	up_write(&src->list_mutex);
+	up_write(&dest->list_mutex);
 	if (err < 0)
 		return err;
 
 	delete_and_unsubscribe_port(src_client, src_port, subs, true,
 				    connector->number != src_client->number);
-	delete_and_unsubscribe_port(dest_client, dest_port, subs, false,
-				    connector->number != dest_client->number);
 	kfree(subs);
 	return 0;
 }
-- 
2.26.2


[-- Attachment #3: Type: text/plain, Size: 1 bytes --]



^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: aconnect occasionally causes kernel oopses
  2021-08-03  7:49                         ` Takashi Iwai
@ 2021-08-03  7:52                           ` folkert
  0 siblings, 0 replies; 17+ messages in thread
From: folkert @ 2021-08-03  7:52 UTC (permalink / raw)
  To: Takashi Iwai; +Cc: alsa-devel

> > I did.
> > It also fails on 5.13.7.
> 
> Weird, here it's applied cleanly on my machine with every kernel
> version.  Double-check whether you really have any other changes.
> I attach the patch again but with an attachment just to be sure.

It now applies fine!
Strange. Most likely I made a mistake before. Anyway, it is now
compiling.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: aconnect occasionally causes kernel oopses
  2021-08-03  6:55                 ` Takashi Iwai
  2021-08-03  7:40                   ` folkert
@ 2021-08-03 10:12                   ` folkert
  2021-08-03 11:42                     ` Takashi Iwai
  1 sibling, 1 reply; 17+ messages in thread
From: folkert @ 2021-08-03 10:12 UTC (permalink / raw)
  To: Takashi Iwai; +Cc: alsa-devel

> > folkert@oensoens:~$ aplaymidi -l
> > ALSA lib seq_hw.c:466:(snd_seq_hw_open) open /dev/snd/seq failed: Permission denied
> > Cannot open sequencer - Permission denied
> > 
> > ???
> 
> Hrm, that's unexpected.

New system, account was not in the audio-group.

> Meanwhile, I reconsidered the fix and a better idea came after the
> sleep.  Below is the new fix patch.  Could you give it a try instead
> of the previous one?

Works fine as well: 7 concurrent script-instances for half an hour and
nothing logged in dmesg.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: aconnect occasionally causes kernel oopses
  2021-08-03 10:12                   ` folkert
@ 2021-08-03 11:42                     ` Takashi Iwai
  0 siblings, 0 replies; 17+ messages in thread
From: Takashi Iwai @ 2021-08-03 11:42 UTC (permalink / raw)
  To: folkert; +Cc: alsa-devel

On Tue, 03 Aug 2021 12:12:34 +0200,
folkert wrote:
> 
> > > folkert@oensoens:~$ aplaymidi -l
> > > ALSA lib seq_hw.c:466:(snd_seq_hw_open) open /dev/snd/seq failed: Permission denied
> > > Cannot open sequencer - Permission denied
> > > 
> > > ???
> > 
> > Hrm, that's unexpected.
> 
> New system, account was not in the audio-group.
> 
> > Meanwhile, I reconsidered the fix and a better idea came after the
> > sleep.  Below is the new fix patch.  Could you give it a try instead
> > of the previous one?
> 
> Works fine as well: 7 concurrent script-instances for half an hour and
> nothing logged in dmesg.

Great, I'm going to submit and merge the fix soon.


Thanks!

Takashi

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2021-08-03 11:43 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-08-01 18:27 aconnect occasionally causes kernel oopses folkert
2021-08-02  6:16 ` Takashi Iwai
2021-08-02  6:18   ` folkert
2021-08-02  6:59     ` Takashi Iwai
2021-08-02  9:10       ` folkert
2021-08-02 15:11         ` Takashi Iwai
2021-08-02 15:21           ` folkert
2021-08-02 17:12             ` Takashi Iwai
2021-08-02 19:53               ` folkert
2021-08-03  6:55                 ` Takashi Iwai
2021-08-03  7:40                   ` folkert
2021-08-03  7:43                     ` Takashi Iwai
2021-08-03  7:44                       ` folkert
2021-08-03  7:49                         ` Takashi Iwai
2021-08-03  7:52                           ` folkert
2021-08-03 10:12                   ` folkert
2021-08-03 11:42                     ` Takashi Iwai

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.