All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] hid-sony: Prevent crash when rumble effects are still loaded at USB disconnect
@ 2016-05-29 12:11 Manuel Reimer
  2016-05-29 17:11 ` Cameron Gutman
  0 siblings, 1 reply; 13+ messages in thread
From: Manuel Reimer @ 2016-05-29 12:11 UTC (permalink / raw)
  To: linux-input; +Cc: jikos

Hello,

I had a deeper look at the kernel panic, happening if there are rumble 
effects loaded and the USB plug is pulled.

The reason for this is similar to the one, I fixed in uinput some days ago.

In "sony_remove" the memory for "output_report_dmabuf" is freed.
Then, a few lines later, "hid_hw_stop" is called.
This now tries to cleanup and causes ff_memless to try to send out a new 
rumble event which should turn both motor speeds to zero.
To get this processed, "sc->send_output_report" is called, which, for 
example, ends up in "dualshock4_send_output_report".
This function will now use the, already freed, "output_report_dmabuf".

My patch zeroes out "output_report_dmabuf" after freeing and checks for 
this in the "send_output_report" functions. There may be other ways to 
fix this, so please tell me if you prefer some other way.

I've added a one-line comment above the memory pointer check, as, in my 
opinion, it is not obvious what is happening here.

Signed-off-by: Manuel Reimer <mail@m-reimer.de>

--- a/drivers/hid/hid-sony.c	2016-05-13 16:13:00.339346161 +0200
+++ b/drivers/hid/hid-sony.c	2016-05-29 13:54:25.452029787 +0200
@@ -1809,6 +1809,10 @@ static void sixaxis_send_output_report(s
  		(struct sixaxis_output_report *)sc->output_report_dmabuf;
  	int n;

+	/* If called via hid_hw_stop, then our memory is already gone! */
+	if (!report)
+		return;
+
  	/* Initialize the report with default values */
  	memcpy(report, &default_report, sizeof(struct sixaxis_output_report));

@@ -1853,6 +1857,10 @@ static void dualshock4_send_output_repor
  	__u8 *buf = sc->output_report_dmabuf;
  	int offset;

+	/* If called via hid_hw_stop, then our memory is already gone! */
+	if (!buf)
+		return;
+
  	if (sc->quirks & DUALSHOCK4_CONTROLLER_USB) {
  		memset(buf, 0, DS4_REPORT_0x05_SIZE);
  		buf[0] = 0x05;
@@ -1899,6 +1907,10 @@ static void motion_send_output_report(st
  	struct motion_output_report_02 *report =
  		(struct motion_output_report_02 *)sc->output_report_dmabuf;

+	/* If called via hid_hw_stop, then our memory is already gone! */
+	if (!report)
+		return;
+
  	memset(report, 0, MOTION_REPORT_0x02_SIZE);

  	report->type = 0x02; /* set leds */
@@ -2426,6 +2438,7 @@ static void sony_remove(struct hid_devic
  	sony_cancel_work_sync(sc);

  	kfree(sc->output_report_dmabuf);
+	sc->output_report_dmabuf = NULL;

  	sony_remove_dev_list(sc);


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] hid-sony: Prevent crash when rumble effects are still loaded at USB disconnect
  2016-05-29 12:11 [PATCH] hid-sony: Prevent crash when rumble effects are still loaded at USB disconnect Manuel Reimer
@ 2016-05-29 17:11 ` Cameron Gutman
  2016-05-30  4:45   ` mail
  2016-05-30 19:15   ` Manuel Reimer
  0 siblings, 2 replies; 13+ messages in thread
From: Cameron Gutman @ 2016-05-29 17:11 UTC (permalink / raw)
  To: Manuel Reimer; +Cc: linux-input, jikos


> On May 29, 2016, at 7:11 AM, Manuel Reimer <mail+linux-input@m-reimer.de> wrote:
> 
> Hello,
> 
> I had a deeper look at the kernel panic, happening if there are rumble effects loaded and the USB plug is pulled.
> 
> The reason for this is similar to the one, I fixed in uinput some days ago.
> 
> In "sony_remove" the memory for "output_report_dmabuf" is freed.
> Then, a few lines later, "hid_hw_stop" is called.
> This now tries to cleanup and causes ff_memless to try to send out a new rumble event which should turn both motor speeds to zero.
> To get this processed, "sc->send_output_report" is called, which, for example, ends up in "dualshock4_send_output_report".
> This function will now use the, already freed, "output_report_dmabuf".
> 
> My patch zeroes out "output_report_dmabuf" after freeing and checks for this in the "send_output_report" functions. There may be other ways to fix this, so please tell me if you prefer some other way.
> 
> I've added a one-line comment above the memory pointer check, as, in my opinion, it is not obvious what is happening here.
> 
> Signed-off-by: Manuel Reimer <mail@m-reimer.de>
> 
> --- a/drivers/hid/hid-sony.c	2016-05-13 16:13:00.339346161 +0200
> +++ b/drivers/hid/hid-sony.c	2016-05-29 13:54:25.452029787 +0200
> @@ -1809,6 +1809,10 @@ static void sixaxis_send_output_report(s
> 		(struct sixaxis_output_report *)sc->output_report_dmabuf;
> 	int n;
> 
> +	/* If called via hid_hw_stop, then our memory is already gone! */
> +	if (!report)
> +		return;
> +
> 	/* Initialize the report with default values */
> 	memcpy(report, &default_report, sizeof(struct sixaxis_output_report));
> 
> @@ -1853,6 +1857,10 @@ static void dualshock4_send_output_repor
> 	__u8 *buf = sc->output_report_dmabuf;
> 	int offset;
> 
> +	/* If called via hid_hw_stop, then our memory is already gone! */
> +	if (!buf)
> +		return;
> +
> 	if (sc->quirks & DUALSHOCK4_CONTROLLER_USB) {
> 		memset(buf, 0, DS4_REPORT_0x05_SIZE);
> 		buf[0] = 0x05;
> @@ -1899,6 +1907,10 @@ static void motion_send_output_report(st
> 	struct motion_output_report_02 *report =
> 		(struct motion_output_report_02 *)sc->output_report_dmabuf;
> 
> +	/* If called via hid_hw_stop, then our memory is already gone! */
> +	if (!report)
> +		return;
> +
> 	memset(report, 0, MOTION_REPORT_0x02_SIZE);
> 
> 	report->type = 0x02; /* set leds */
> @@ -2426,6 +2438,7 @@ static void sony_remove(struct hid_devic
> 	sony_cancel_work_sync(sc);
> 
> 	kfree(sc->output_report_dmabuf);
> +	sc->output_report_dmabuf = NULL;

What prevents one of the send_output_report() functions from accessing sc->output_report_dmabuf after it’s been passed to kfree() but before you’ve set it to NULL? Why not simply wait to free output_report_dmabuf until after hid_hw_stop returns?

> 
> 	sony_remove_dev_list(sc);
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-input" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

--
To unsubscribe from this list: send the line "unsubscribe linux-input" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] hid-sony: Prevent crash when rumble effects are still loaded at USB disconnect
  2016-05-29 17:11 ` Cameron Gutman
@ 2016-05-30  4:45   ` mail
  2016-05-30 19:15   ` Manuel Reimer
  1 sibling, 0 replies; 13+ messages in thread
From: mail @ 2016-05-30  4:45 UTC (permalink / raw)
  To: Cameron Gutman, Manuel Reimer; +Cc: linux-input, jikos

Am 29. Mai 2016 19:11:45 MESZ, schrieb Cameron Gutman <aicommander@gmail.com>:
>
>> On May 29, 2016, at 7:11 AM, Manuel Reimer
><mail+linux-input@m-reimer.de> wrote:
>> 
>> Hello,
>> 
>> I had a deeper look at the kernel panic, happening if there are
>rumble effects loaded and the USB plug is pulled.
>> 
>> The reason for this is similar to the one, I fixed in uinput some
>days ago.
>> 
>> In "sony_remove" the memory for "output_report_dmabuf" is freed.
>> Then, a few lines later, "hid_hw_stop" is called.
>> This now tries to cleanup and causes ff_memless to try to send out a
>new rumble event which should turn both motor speeds to zero.
>> To get this processed, "sc->send_output_report" is called, which, for
>example, ends up in "dualshock4_send_output_report".
>> This function will now use the, already freed,
>"output_report_dmabuf".
>> 
>> My patch zeroes out "output_report_dmabuf" after freeing and checks
>for this in the "send_output_report" functions. There may be other ways
>to fix this, so please tell me if you prefer some other way.
>> 
>> I've added a one-line comment above the memory pointer check, as, in
>my opinion, it is not obvious what is happening here.
>> 
>> Signed-off-by: Manuel Reimer <mail@m-reimer.de>
>> 
>> --- a/drivers/hid/hid-sony.c	2016-05-13 16:13:00.339346161 +0200
>> +++ b/drivers/hid/hid-sony.c	2016-05-29 13:54:25.452029787 +0200
>> @@ -1809,6 +1809,10 @@ static void sixaxis_send_output_report(s
>> 		(struct sixaxis_output_report *)sc->output_report_dmabuf;
>> 	int n;
>> 
>> +	/* If called via hid_hw_stop, then our memory is already gone! */
>> +	if (!report)
>> +		return;
>> +
>> 	/* Initialize the report with default values */
>> 	memcpy(report, &default_report, sizeof(struct
>sixaxis_output_report));
>> 
>> @@ -1853,6 +1857,10 @@ static void dualshock4_send_output_repor
>> 	__u8 *buf = sc->output_report_dmabuf;
>> 	int offset;
>> 
>> +	/* If called via hid_hw_stop, then our memory is already gone! */
>> +	if (!buf)
>> +		return;
>> +
>> 	if (sc->quirks & DUALSHOCK4_CONTROLLER_USB) {
>> 		memset(buf, 0, DS4_REPORT_0x05_SIZE);
>> 		buf[0] = 0x05;
>> @@ -1899,6 +1907,10 @@ static void motion_send_output_report(st
>> 	struct motion_output_report_02 *report =
>> 		(struct motion_output_report_02 *)sc->output_report_dmabuf;
>> 
>> +	/* If called via hid_hw_stop, then our memory is already gone! */
>> +	if (!report)
>> +		return;
>> +
>> 	memset(report, 0, MOTION_REPORT_0x02_SIZE);
>> 
>> 	report->type = 0x02; /* set leds */
>> @@ -2426,6 +2438,7 @@ static void sony_remove(struct hid_devic
>> 	sony_cancel_work_sync(sc);
>> 
>> 	kfree(sc->output_report_dmabuf);
>> +	sc->output_report_dmabuf = NULL;
>
>What prevents one of the send_output_report() functions from accessing
>sc->output_report_dmabuf after it’s been passed to kfree() but before
>you’ve set it to NULL? Why not simply wait to free output_report_dmabuf
>until after hid_hw_stop returns?


Good point! But I'm unsure how these workqueues work in kernel. Isn't it possible that send_output_report is called after sony_remove already finished executing (still a task in queue) and so the already freed memory is accessed, again?

>
>> 
>> 	sony_remove_dev_list(sc);
>> 
>> --
>> To unsubscribe from this list: send the line "unsubscribe
>linux-input" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html


--
To unsubscribe from this list: send the line "unsubscribe linux-input" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] hid-sony: Prevent crash when rumble effects are still loaded at USB disconnect
  2016-05-29 17:11 ` Cameron Gutman
  2016-05-30  4:45   ` mail
@ 2016-05-30 19:15   ` Manuel Reimer
  2016-06-02 17:53     ` [PATCH v2] " Manuel Reimer
  1 sibling, 1 reply; 13+ messages in thread
From: Manuel Reimer @ 2016-05-30 19:15 UTC (permalink / raw)
  To: Cameron Gutman; +Cc: linux-input, jikos

On 05/29/2016 07:11 PM, Cameron Gutman wrote:
> What prevents one of the send_output_report() functions from accessing sc->output_report_dmabuf after it’s been passed to kfree() but before you’ve set it to NULL? Why not simply wait to free output_report_dmabuf until after hid_hw_stop returns?

Just moving the kfree to the end doesn't seem to be a good idea.

I did the following:

--- a/drivers/hid/hid-sony.c	2016-05-13 16:13:00.339346161 +0200
+++ b/drivers/hid/hid-sony.c	2016-05-30 20:28:39.897849164 +0200
@@ -2423,15 +2423,15 @@ static void sony_remove(struct hid_devic
  		sony_battery_remove(sc);
  	}

-	sony_cancel_work_sync(sc);
-
-	kfree(sc->output_report_dmabuf);
-
  	sony_remove_dev_list(sc);

  	sony_release_device_id(sc);

  	hid_hw_stop(hdev);
+
+	sony_cancel_work_sync(sc);
+
+	kfree(sc->output_report_dmabuf);
  }

Idea was: We know that hid_hw_stop will schedule work into our queue. So 
it seems to be a good idea to move sony_cancel_work_sync below that, too 
and kfree after that.

But now I have:

[  391.170091] BUG: unable to handle kernel NULL pointer dereference at 
0000000000000030
[  391.303404] IP: [<ffffffffa0387eaf>] ml_effect_timer+0x1f/0x170 
[ff_memless]
[  391.303404] PGD 3d3a5067 PUD 3d3a4067 PMD 0
[  391.303404] Oops: 0000 [#1] PREEMPT SMP
[  391.303404] Modules linked in: hid_sony ff_memless cfg80211 rfkill 
crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 
lrw gf128mul glue_helper ablk_helper cryptd input_leds psmouse ppdev 
acpi_cpufreq pcspkr led_class joydev serio_raw mousedev tpm_tis tpm 
parport_pc mac_hid fjes parport intel_agp intel_gtt e1000 battery video 
evdev button processor i2c_piix4 ac sch_fq_codel vboxvideo(O) ttm 
drm_kms_helper drm syscopyarea sysfillrect sysimgblt fb_sys_fops 
vboxsf(O) vboxguest(O) ip_tables x_tables xfs libcrc32c hid_generic 
usbhid hid sr_mod cdrom sd_mod ata_generic pata_acpi atkbd libps2 
crc32c_intel ahci libahci ohci_pci ohci_hcd ata_piix usbcore usb_common 
libata scsi_mod i8042 serio
[  391.303404] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G           O 
4.5.4-1-ARCH #1
[  391.303404] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS 
VirtualBox 12/01/2006
[  391.303404] task: ffffffff81811500 ti: ffffffff81800000 task.ti: 
ffffffff81800000
[  391.303404] RIP: 0010:[<ffffffffa0387eaf>]  [<ffffffffa0387eaf>] 
ml_effect_timer+0x1f/0x170 [ff_memless]
[  391.303404] RSP: 0018:ffff88003fc03e60  EFLAGS: 00010246
[  421.393551] ata2: lost interrupt (Status 0x58)
[  421.393601] ata2: drained 8 bytes to clear DRQ
[  421.393611] ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 
frozen
[  421.393623] ata2.00: cmd a0/00:00:00:08:00/00:00:00:00:00/a0 tag 0 
pio 16392 in
[  421.393623]          opcode=0x4a 4a 01 00 00 10 00 00 00 08 00res 
40/00:02:00:08:00/00:00:00:00:00/a0 Emask 0x4 (timeout)
[  421.393627] ata2.00: status: { DRDY }
[  421.393789] ata2: soft resetting link
[  421.550864] ata2.00: configured for UDMA/33
[  426.550330] ata2.00: qc timeout (cmd 0xa0)
[  426.550342] ata2.00: TEST_UNIT_READY failed (err_mask=0x5)
[  426.550519] ata2: soft resetting link
[  426.705035] ata2.00: configured for UDMA/33
[  431.703725] ata2.00: qc timeout (cmd 0xa0)
[  431.703736] ata2.00: TEST_UNIT_READY failed (err_mask=0x5)
[  431.703744] ata2.00: limiting speed to UDMA/33:PIO3
[  431.703994] ata2: soft resetting link
[  431.858270] ata2.00: configured for UDMA/33
[  436.856832] ata2.00: qc timeout (cmd 0xa0)
[  436.856840] ata2.00: TEST_UNIT_READY failed (err_mask=0x5)
[  436.856846] ata2.00: disabled
[  436.857022] ata2: soft resetting link
[  437.010167] ata2: EH complete
[  421.547203] RAX: 0000000000000000 RBX: ffff88003d08f000 RCX: 
ffff88003fc03ed8
[  421.547203] RDX: ffff88003d08f000 RSI: ffffffffa0387e90 RDI: 
ffff88003d08f000
[  421.547203] RBP: ffff88003fc03e78 R08: ffff88003ceaee00 R09: 
0000000000000000
[  421.547203] R10: 0000000000000020 R11: 0000000000000000 R12: 
ffff880037a47710
[  421.547203] R13: 0000000000000101 R14: ffffffffa0387e90 R15: 
ffff88003d08f000
[  421.547203] FS:  00007fba4be63700(0000) GS:ffff88003fc00000(0000) 
knlGS:0000000000000000
[  421.547203] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[  421.547203] CR2: 0000000000000030 CR3: 000000003d3a2000 CR4: 
00000000000406f0
[  421.547203] Stack:
[  421.547203]  ffff88003fc0dbc0 ffff880037a47710 0000000000000101 
ffff88003fc03eb0
[  421.547203]  ffffffff810e3cc5 ffff88003fc0dbc0 ffff880037a47710 
ffffffffa0387e90
[  421.547203]  0000000000000000 0000000000000001 ffff88003fc03f10 
ffffffff810e4016
[  421.547203] Call Trace:
[  421.547203]  <IRQ>
[  421.547203]  [<ffffffff810e3cc5>] call_timer_fn+0x35/0x150
[  421.547203]  [<ffffffffa0387e90>] ? ml_ff_playback+0x100/0x100 
[ff_memless]
[  421.547203]  [<ffffffff810e4016>] run_timer_softirq+0x236/0x2e0
[  421.547203]  [<ffffffff8107d3e6>] __do_softirq+0xe6/0x2f0
[  421.547203]  [<ffffffff8107d763>] irq_exit+0xa3/0xb0
[  421.547203]  [<ffffffff815b8112>] smp_apic_timer_interrupt+0x42/0x50
[  421.547203]  [<ffffffff815b63f2>] apic_timer_interrupt+0x82/0x90
[  421.547203]  <EOI>
[  421.547203]  [<ffffffff8105f896>] ? native_safe_halt+0x6/0x10
[  421.547203]  [<ffffffff81021c30>] default_idle+0x20/0x110
[  421.547203]  [<ffffffff8102246f>] arch_cpu_idle+0xf/0x20
[  421.547203]  [<ffffffff810baf0a>] default_idle_call+0x2a/0x40
[  421.547203]  [<ffffffff810bb264>] cpu_startup_entry+0x344/0x3b0
[  421.547203]  [<ffffffff815a8c19>] rest_init+0x89/0x90
[  421.547203]  [<ffffffff8190c012>] start_kernel+0x46a/0x48b
[  421.547203]  [<ffffffff8190b120>] ? early_idt_handler_array+0x120/0x120
[  421.547203]  [<ffffffff8190b332>] x86_64_start_reservations+0x2a/0x2c
[  421.547203]  [<ffffffff8190b480>] x86_64_start_kernel+0x14c/0x16f
[  421.547203] Code: a0 e8 06 fa f7 e0 eb a3 0f 1f 40 00 0f 1f 44 00 00 
55 f6 05 d5 11 00 00 04 48 89 e5 41 55 41 54 53 48 8b 87 f8 00 00 00 48 
89 fb <4c> 8b 68 30 75 2c 48 81 c3 10 02 00 00 48 89 df e8 bc d1 22 e1
[  421.547203] RIP  [<ffffffffa0387eaf>] ml_effect_timer+0x1f/0x170 
[ff_memless]
[  421.547203]  RSP <ffff88003fc03e60>
[  421.547203] CR2: 0000000000000030
[  421.547203] ---[ end trace 7a664affc45468fa ]---
[  421.547203] Kernel panic - not syncing: Fatal exception in interrupt
[  421.547203] Kernel Offset: disabled
[  421.547203] ---[ end Kernel panic - not syncing: Fatal exception in 
interrupt

This was some kind of "brute force attempt" to check the stability.

I opened fftest, then I started effect "4" over and over again to get 
the queue in ff_memless filled up. Then I pulled the USB plug.

It may now actually be possible that the memory has been freed again and 
some event in the queue kicked in after that. It may also be possible 
that the hid driver subsystem doesn't really like this case, that the 
device is stopped, but someone still sends to it.

Same with xpad driver:
[   64.957349] xpad 1-2:1.0: xpad_try_sending_next_out_packet - 
usb_submit_urb failed with result -19
[   66.148685] xpad 1-2:1.0: xpad_try_sending_next_out_packet - 
usb_submit_urb failed with result -19
[   66.292505] xpad 1-2:1.0: xpad_try_sending_next_out_packet - 
usb_submit_urb failed with result -19
[   66.336855] xpad 1-2:1.0: xpad_try_sending_next_out_packet - 
usb_submit_urb failed with result -19



Manuel
--
To unsubscribe from this list: send the line "unsubscribe linux-input" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2] hid-sony: Prevent crash when rumble effects are still loaded at USB disconnect
  2016-05-30 19:15   ` Manuel Reimer
@ 2016-06-02 17:53     ` Manuel Reimer
  2016-06-05 12:59       ` [PATCH v3] " Manuel Reimer
  0 siblings, 1 reply; 13+ messages in thread
From: Manuel Reimer @ 2016-06-02 17:53 UTC (permalink / raw)
  To: Cameron Gutman; +Cc: linux-input, jikos

On 05/30/2016 09:15 PM, Manuel Reimer wrote:
> It may now actually be possible that the memory has been freed again and
> some event in the queue kicked in after that. It may also be possible
> that the hid driver subsystem doesn't really like this case, that the
> device is stopped, but someone still sends to it.

It definitively seems to be a bad idea to continue to interact with the 
HID device as soon as hid_hw_stop was called.

"Somewhere" while hid_hw_stop is running, the cleanup of loaded force 
feedback events is happening. I don't know if schedule_work will run the 
work asynchronously. If so, then calls could be triggered even after 
hid_hw_stop is already finished.

One possible option would be to add a input_unregister_device into 
sony_remove right before the call to "sony_cancel_work_sync". A call to 
input_unregister_device triggers the cleanup, too and so allows to move 
this to a point earlier in the shutdown procedure. But I don't know if 
this is actually a good idea to do this here.

At the end of this mail is a second attempt for a fix. The idea of this 
is, that after "sony_cancel_work_sync" was called, the worker itself 
shouldn't be able to take any new work. To do this, I set 
worker_initialized back to zero and check for this in a newly introduced 
function sony_schedule_work. This seems to fix the problem. If I pull 
the plug while effects are loaded or playing, then I get one or more of 
my added error messages, but, so far, no more kernel panic.

So finally I am at a point where I need feedback to continue my work. 
Would be great if someone could offer some help.

Signed-off-by: Manuel Reimer <mail@m-reimer.de>

--- a/drivers/hid/hid-sony.c	2016-05-13 16:13:00.339346161 +0200
+++ b/drivers/hid/hid-sony.c	2016-06-02 19:33:59.621212336 +0200
@@ -1051,6 +1051,14 @@ struct sony_sc {
  	__u8 led_count;
  };

+static inline void sony_schedule_work(struct sony_sc *sc)
+{
+	if (sc->worker_initialized)
+		schedule_work(&sc->state_worker);
+	else
+		printk(KERN_ERR "hid-sony: Sending to uninitialized device failed!\n");
+}
+
  static __u8 *sixaxis_fixup(struct hid_device *hdev, __u8 *rdesc,
  			     unsigned int *rsize)
  {
@@ -1542,7 +1550,7 @@ static void buzz_set_leds(struct sony_sc
  static void sony_set_leds(struct sony_sc *sc)
  {
  	if (!(sc->quirks & BUZZ_CONTROLLER))
-		schedule_work(&sc->state_worker);
+		sony_schedule_work(sc);
  	else
  		buzz_set_leds(sc);
  }
@@ -1653,7 +1661,7 @@ static int sony_led_blink_set(struct led
  		new_off != drv_data->led_delay_off[n]) {
  		drv_data->led_delay_on[n] = new_on;
  		drv_data->led_delay_off[n] = new_off;
-		schedule_work(&drv_data->state_worker);
+		sony_schedule_work(drv_data);
  	}

  	return 0;
@@ -1963,7 +1971,7 @@ static int sony_play_effect(struct input
  	sc->left = effect->u.rumble.strong_magnitude / 256;
  	sc->right = effect->u.rumble.weak_magnitude / 256;

-	schedule_work(&sc->state_worker);
+	sony_schedule_work(sc);
  	return 0;
  }

@@ -2265,8 +2273,10 @@ static inline void sony_init_output_repo

  static inline void sony_cancel_work_sync(struct sony_sc *sc)
  {
-	if (sc->worker_initialized)
+	if (sc->worker_initialized) {
+		sc->worker_initialized = 0;
  		cancel_work_sync(&sc->state_worker);
+	}
  }

  static int sony_probe(struct hid_device *hdev, const struct 
hid_device_id *id)

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v3] hid-sony: Prevent crash when rumble effects are still loaded at USB disconnect
  2016-06-02 17:53     ` [PATCH v2] " Manuel Reimer
@ 2016-06-05 12:59       ` Manuel Reimer
  2016-06-07  5:38         ` Cameron Gutman
  0 siblings, 1 reply; 13+ messages in thread
From: Manuel Reimer @ 2016-06-05 12:59 UTC (permalink / raw)
  To: Cameron Gutman; +Cc: linux-input, jikos

Hello,

in the meantime, I got private mail with the suggestion, that I should 
have a look at spinlocks for my last change, so this is what I came up with:




--- a/drivers/hid/hid-sony.c	2016-05-13 16:13:00.339346161 +0200
+++ b/drivers/hid/hid-sony.c	2016-06-03 16:56:13.642143935 +0200
@@ -1023,6 +1023,7 @@ static DEFINE_IDA(sony_device_id_allocat

  struct sony_sc {
  	spinlock_t lock;
+	spinlock_t worker_initialized_lock;
  	struct list_head list_node;
  	struct hid_device *hdev;
  	struct led_classdev *leds[MAX_LEDS];
@@ -1051,6 +1052,20 @@ struct sony_sc {
  	__u8 led_count;
  };

+static inline void sony_schedule_work(struct sony_sc *sc)
+{
+	unsigned long flags;
+
+	spin_lock_irqsave(&sc->worker_initialized_lock, flags);
+
+	if (sc->worker_initialized)
+		schedule_work(&sc->state_worker);
+	else
+		printk(KERN_ERR "hid-sony: Sending to uninitialized device failed!\n");
+
+	spin_unlock_irqrestore(&sc->worker_initialized_lock, flags);
+}
+
  static __u8 *sixaxis_fixup(struct hid_device *hdev, __u8 *rdesc,
  			     unsigned int *rsize)
  {
@@ -1542,7 +1557,7 @@ static void buzz_set_leds(struct sony_sc
  static void sony_set_leds(struct sony_sc *sc)
  {
  	if (!(sc->quirks & BUZZ_CONTROLLER))
-		schedule_work(&sc->state_worker);
+		sony_schedule_work(sc);
  	else
  		buzz_set_leds(sc);
  }
@@ -1653,7 +1668,7 @@ static int sony_led_blink_set(struct led
  		new_off != drv_data->led_delay_off[n]) {
  		drv_data->led_delay_on[n] = new_on;
  		drv_data->led_delay_off[n] = new_off;
-		schedule_work(&drv_data->state_worker);
+		sony_schedule_work(drv_data);
  	}

  	return 0;
@@ -1963,7 +1978,7 @@ static int sony_play_effect(struct input
  	sc->left = effect->u.rumble.strong_magnitude / 256;
  	sc->right = effect->u.rumble.weak_magnitude / 256;

-	schedule_work(&sc->state_worker);
+	sony_schedule_work(sc);
  	return 0;
  }

@@ -2255,18 +2270,30 @@ static void sony_release_device_id(struc
  static inline void sony_init_output_report(struct sony_sc *sc,
  				void(*send_output_report)(struct sony_sc*))
  {
+	unsigned long flags;
+
  	sc->send_output_report = send_output_report;

+	spin_lock_irqsave(&sc->worker_initialized_lock, flags);
+
  	if (!sc->worker_initialized)
  		INIT_WORK(&sc->state_worker, sony_state_worker);

  	sc->worker_initialized = 1;
+
+	spin_unlock_irqrestore(&sc->worker_initialized_lock, flags);
  }

  static inline void sony_cancel_work_sync(struct sony_sc *sc)
  {
-	if (sc->worker_initialized)
+	unsigned long flags;
+
+	spin_lock_irqsave(&sc->worker_initialized_lock, flags);
+	if (sc->worker_initialized) {
+		sc->worker_initialized = 0;
  		cancel_work_sync(&sc->state_worker);
+	}
+	spin_unlock_irqrestore(&sc->worker_initialized_lock, flags);
  }

  static int sony_probe(struct hid_device *hdev, const struct 
hid_device_id *id)
@@ -2283,6 +2310,7 @@ static int sony_probe(struct hid_device
  	}

  	spin_lock_init(&sc->lock);
+	spin_lock_init(&sc->worker_initialized_lock);

  	sc->quirks = quirks;
  	hid_set_drvdata(hdev, sc);






Result:





[   67.956856] BUG: unable to handle kernel NULL pointer dereference at 
00000000000000d8
[   68.380234] IP: [<ffffffffa036933d>] sony_play_effect+0x2d/0x90 
[hid_sony]
[   68.380234] PGD 0
[   68.380234] Oops: 0002 [#1] PREEMPT SMP
[   68.380234] Modules linked in: hid_sony ff_memless cfg80211 rfkill 
crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 
ppdev lrw gf128mul glue_helper ablk_helper cryptd psmouse input_leds 
joydev led_class acpi_cpufreq serio_raw mousedev pcspkr mac_hid 
i2c_piix4 e1000 intel_agp intel_gtt tpm_tis tpm fjes video battery 
parport_pc parport button processor ac evdev sch_fq_codel vboxvideo(O) 
ttm drm_kms_helper drm syscopyarea sysfillrect sysimgblt fb_sys_fops 
vboxsf(O) vboxguest(O) ip_tables x_tables xfs libcrc32c hid_generic 
usbhid hid sr_mod cdrom sd_mod ata_generic pata_acpi atkbd libps2 
ohci_pci ohci_hcd ata_piix ahci libahci usbcore crc32c_intel usb_common 
libata scsi_mod i8042 serio
[   68.380234] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G           O 
4.5.4-1-ARCH #1
[   68.380234] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS 
VirtualBox 12/01/2006
[   68.380234] task: ffffffff81811500 ti: ffffffff81800000 task.ti: 
ffffffff81800000
[   68.380234] RIP: 0010:[<ffffffffa036933d>]  [<ffffffffa036933d>] 
sony_play_effect+0x2d/0x90 [hid_sony]
[   68.380234] RSP: 0018:ffff88003fc03da0  EFLAGS: 00010046
[   68.380234] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 
000000000000ffff
[   68.380234] RDX: ffff88003fc03df0 RSI: 0000000000000000 RDI: 
ffff88003d09d800
[   68.380234] RBP: ffff88003fc03db8 R08: ffff880037bf3100 R09: 
0000000000000000
[   68.380234] R10: 0000000000000020 R11: 0000000000000000 R12: 
0000000000000004
[   68.380234] R13: ffff880039980c00 R14: ffff88003fc03df0 R15: 
00000000000000b4
[   68.380234] FS:  00007f1a3b54a780(0000) GS:ffff88003fc00000(0000) 
knlGS:0000000000000000
[   68.380234] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[   68.380234] CR2: 00000000000000d8 CR3: 000000003ceb7000 CR4: 
00000000000406f0
[   68.380234] Stack:
[   68.380234]  0000000000000010 ffff880039980f08 ffff880039980c00 
ffff88003fc03e50
[   68.380234]  ffffffffa036475a 0000000000000002 0000000000015740 
0000000000000046
[   68.380234]  ffff880039980c08 000000000000ffff 0000000000000050 
0000000000000000
[   68.380234] Call Trace:
[   68.380234]  <IRQ>
[   68.380234]  [<ffffffffa036475a>] ml_play_effects+0x10a/0x700 
[ff_memless]
[   68.380234]  [<ffffffffa0364e90>] ? ml_ff_playback+0x100/0x100 
[ff_memless]
[   68.380234]  [<ffffffffa0364ecf>] ml_effect_timer+0x3f/0x170 [ff_memless]
[   68.380234]  [<ffffffff810e3cc5>] call_timer_fn+0x35/0x150
[   68.380234]  [<ffffffffa0364e90>] ? ml_ff_playback+0x100/0x100 
[ff_memless]
[   68.380234]  [<ffffffff810e4016>] run_timer_softirq+0x236/0x2e0
[   68.380234]  [<ffffffff8107d3e6>] __do_softirq+0xe6/0x2f0
[   68.380234]  [<ffffffff8107d763>] irq_exit+0xa3/0xb0
[   68.380234]  [<ffffffff815b8112>] smp_apic_timer_interrupt+0x42/0x50
[   68.380234]  [<ffffffff815b63f2>] apic_timer_interrupt+0x82/0x90
[   68.380234]  <EOI>
[   68.380234]  [<ffffffff8105f896>] ? native_safe_halt+0x6/0x10
[   68.380234]  [<ffffffff81021c30>] default_idle+0x20/0x110
[   68.380234]  [<ffffffff8102246f>] arch_cpu_idle+0xf/0x20
[   68.380234]  [<ffffffff810baf0a>] default_idle_call+0x2a/0x40
[   68.380234]  [<ffffffff810bb264>] cpu_startup_entry+0x344/0x3b0
[   68.380234]  [<ffffffff815a8c19>] rest_init+0x89/0x90
[   68.380234]  [<ffffffff8190c012>] start_kernel+0x46a/0x48b
[   68.380234]  [<ffffffff8190b120>] ? early_idt_handler_array+0x120/0x120
[   68.380234]  [<ffffffff8190b332>] x86_64_start_reservations+0x2a/0x2c
[   68.380234]  [<ffffffff8190b480>] x86_64_start_kernel+0x14c/0x16f
[   68.380234] Code: 44 00 00 66 83 3a 50 74 03 31 c0 c3 55 48 89 e5 41 
55 41 54 53 48 8b 87 e8 02 00 00 48 8b 98 88 19 00 00 0f b6 42 11 4c 8d 
63 04 <88> 83 d8 00 00 00 0f b6 42 13 4c 89 e7 88 83 d9 00 00 00 e8 2b
[   68.380234] RIP  [<ffffffffa036933d>] sony_play_effect+0x2d/0x90 
[hid_sony]
[   68.380234]  RSP <ffff88003fc03da0>
[   68.380234] CR2: 00000000000000d8
[   68.380234] ---[ end trace 3e201de0e22c4c6e ]---
[   68.380234] Kernel panic - not syncing: Fatal exception in interrupt
[   68.380234] Kernel Offset: disabled
[   68.380234] ---[ end Kernel panic - not syncing: Fatal exception in 
interrupt





So either the underlying bug is still there or my usage of spinlocks is 
totally wrong.

I would be really happy to get some feedback....

Thank you very much in advance.

Manuel

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v3] hid-sony: Prevent crash when rumble effects are still loaded at USB disconnect
  2016-06-05 12:59       ` [PATCH v3] " Manuel Reimer
@ 2016-06-07  5:38         ` Cameron Gutman
  2016-06-07 15:55           ` Manuel Reimer
  0 siblings, 1 reply; 13+ messages in thread
From: Cameron Gutman @ 2016-06-07  5:38 UTC (permalink / raw)
  To: Manuel Reimer; +Cc: linux-input, jikos

This looks like the ff-memless effect timer is going off and invoking your effect callback
after sony_remove() has returned. No amount of locking is going to fix a callback to a
device that has been freed.

I took a look in ff-memless.c and I did not see any obvious code path where the effect
timer is stopped when the device is destroyed. That would explain the crashes you are
getting. Maybe I am missing something though.

Someone with some experience in the force feedback code could probably be more helpful.

I tried for a little while to reproduce this bug with fftest and xpad and never managed it.
What were you doing to trigger the bug?


Cameron

> On Jun 5, 2016, at 7:59 AM, Manuel Reimer <mail+linux-input@m-reimer.de> wrote:
> 
> Hello,
> 
> in the meantime, I got private mail with the suggestion, that I should have a look at spinlocks for my last change, so this is what I came up with:
> 
> 
> 
> 
> --- a/drivers/hid/hid-sony.c	2016-05-13 16:13:00.339346161 +0200
> +++ b/drivers/hid/hid-sony.c	2016-06-03 16:56:13.642143935 +0200
> @@ -1023,6 +1023,7 @@ static DEFINE_IDA(sony_device_id_allocat
> 
> struct sony_sc {
> 	spinlock_t lock;
> +	spinlock_t worker_initialized_lock;
> 	struct list_head list_node;
> 	struct hid_device *hdev;
> 	struct led_classdev *leds[MAX_LEDS];
> @@ -1051,6 +1052,20 @@ struct sony_sc {
> 	__u8 led_count;
> };
> 
> +static inline void sony_schedule_work(struct sony_sc *sc)
> +{
> +	unsigned long flags;
> +
> +	spin_lock_irqsave(&sc->worker_initialized_lock, flags);
> +
> +	if (sc->worker_initialized)
> +		schedule_work(&sc->state_worker);
> +	else
> +		printk(KERN_ERR "hid-sony: Sending to uninitialized device failed!\n");
> +
> +	spin_unlock_irqrestore(&sc->worker_initialized_lock, flags);
> +}
> +
> static __u8 *sixaxis_fixup(struct hid_device *hdev, __u8 *rdesc,
> 			     unsigned int *rsize)
> {
> @@ -1542,7 +1557,7 @@ static void buzz_set_leds(struct sony_sc
> static void sony_set_leds(struct sony_sc *sc)
> {
> 	if (!(sc->quirks & BUZZ_CONTROLLER))
> -		schedule_work(&sc->state_worker);
> +		sony_schedule_work(sc);
> 	else
> 		buzz_set_leds(sc);
> }
> @@ -1653,7 +1668,7 @@ static int sony_led_blink_set(struct led
> 		new_off != drv_data->led_delay_off[n]) {
> 		drv_data->led_delay_on[n] = new_on;
> 		drv_data->led_delay_off[n] = new_off;
> -		schedule_work(&drv_data->state_worker);
> +		sony_schedule_work(drv_data);
> 	}
> 
> 	return 0;
> @@ -1963,7 +1978,7 @@ static int sony_play_effect(struct input
> 	sc->left = effect->u.rumble.strong_magnitude / 256;
> 	sc->right = effect->u.rumble.weak_magnitude / 256;
> 
> -	schedule_work(&sc->state_worker);
> +	sony_schedule_work(sc);
> 	return 0;
> }
> 
> @@ -2255,18 +2270,30 @@ static void sony_release_device_id(struc
> static inline void sony_init_output_report(struct sony_sc *sc,
> 				void(*send_output_report)(struct sony_sc*))
> {
> +	unsigned long flags;
> +
> 	sc->send_output_report = send_output_report;
> 
> +	spin_lock_irqsave(&sc->worker_initialized_lock, flags);
> +
> 	if (!sc->worker_initialized)
> 		INIT_WORK(&sc->state_worker, sony_state_worker);
> 
> 	sc->worker_initialized = 1;
> +
> +	spin_unlock_irqrestore(&sc->worker_initialized_lock, flags);
> }
> 
> static inline void sony_cancel_work_sync(struct sony_sc *sc)
> {
> -	if (sc->worker_initialized)
> +	unsigned long flags;
> +
> +	spin_lock_irqsave(&sc->worker_initialized_lock, flags);
> +	if (sc->worker_initialized) {
> +		sc->worker_initialized = 0;
> 		cancel_work_sync(&sc->state_worker);
> +	}
> +	spin_unlock_irqrestore(&sc->worker_initialized_lock, flags);
> }
> 
> static int sony_probe(struct hid_device *hdev, const struct hid_device_id *id)
> @@ -2283,6 +2310,7 @@ static int sony_probe(struct hid_device
> 	}
> 
> 	spin_lock_init(&sc->lock);
> +	spin_lock_init(&sc->worker_initialized_lock);
> 
> 	sc->quirks = quirks;
> 	hid_set_drvdata(hdev, sc);
> 
> 
> 
> 
> 
> 
> Result:
> 
> 
> 
> 
> 
> [   67.956856] BUG: unable to handle kernel NULL pointer dereference at 00000000000000d8
> [   68.380234] IP: [<ffffffffa036933d>] sony_play_effect+0x2d/0x90 [hid_sony]
> [   68.380234] PGD 0
> [   68.380234] Oops: 0002 [#1] PREEMPT SMP
> [   68.380234] Modules linked in: hid_sony ff_memless cfg80211 rfkill crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 ppdev lrw gf128mul glue_helper ablk_helper cryptd psmouse input_leds joydev led_class acpi_cpufreq serio_raw mousedev pcspkr mac_hid i2c_piix4 e1000 intel_agp intel_gtt tpm_tis tpm fjes video battery parport_pc parport button processor ac evdev sch_fq_codel vboxvideo(O) ttm drm_kms_helper drm syscopyarea sysfillrect sysimgblt fb_sys_fops vboxsf(O) vboxguest(O) ip_tables x_tables xfs libcrc32c hid_generic usbhid hid sr_mod cdrom sd_mod ata_generic pata_acpi atkbd libps2 ohci_pci ohci_hcd ata_piix ahci libahci usbcore crc32c_intel usb_common libata scsi_mod i8042 serio
> [   68.380234] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G           O 4.5.4-1-ARCH #1
> [   68.380234] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
> [   68.380234] task: ffffffff81811500 ti: ffffffff81800000 task.ti: ffffffff81800000
> [   68.380234] RIP: 0010:[<ffffffffa036933d>]  [<ffffffffa036933d>] sony_play_effect+0x2d/0x90 [hid_sony]
> [   68.380234] RSP: 0018:ffff88003fc03da0  EFLAGS: 00010046
> [   68.380234] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 000000000000ffff
> [   68.380234] RDX: ffff88003fc03df0 RSI: 0000000000000000 RDI: ffff88003d09d800
> [   68.380234] RBP: ffff88003fc03db8 R08: ffff880037bf3100 R09: 0000000000000000
> [   68.380234] R10: 0000000000000020 R11: 0000000000000000 R12: 0000000000000004
> [   68.380234] R13: ffff880039980c00 R14: ffff88003fc03df0 R15: 00000000000000b4
> [   68.380234] FS:  00007f1a3b54a780(0000) GS:ffff88003fc00000(0000) knlGS:0000000000000000
> [   68.380234] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> [   68.380234] CR2: 00000000000000d8 CR3: 000000003ceb7000 CR4: 00000000000406f0
> [   68.380234] Stack:
> [   68.380234]  0000000000000010 ffff880039980f08 ffff880039980c00 ffff88003fc03e50
> [   68.380234]  ffffffffa036475a 0000000000000002 0000000000015740 0000000000000046
> [   68.380234]  ffff880039980c08 000000000000ffff 0000000000000050 0000000000000000
> [   68.380234] Call Trace:
> [   68.380234]  <IRQ>
> [   68.380234]  [<ffffffffa036475a>] ml_play_effects+0x10a/0x700 [ff_memless]
> [   68.380234]  [<ffffffffa0364e90>] ? ml_ff_playback+0x100/0x100 [ff_memless]
> [   68.380234]  [<ffffffffa0364ecf>] ml_effect_timer+0x3f/0x170 [ff_memless]
> [   68.380234]  [<ffffffff810e3cc5>] call_timer_fn+0x35/0x150
> [   68.380234]  [<ffffffffa0364e90>] ? ml_ff_playback+0x100/0x100 [ff_memless]
> [   68.380234]  [<ffffffff810e4016>] run_timer_softirq+0x236/0x2e0
> [   68.380234]  [<ffffffff8107d3e6>] __do_softirq+0xe6/0x2f0
> [   68.380234]  [<ffffffff8107d763>] irq_exit+0xa3/0xb0
> [   68.380234]  [<ffffffff815b8112>] smp_apic_timer_interrupt+0x42/0x50
> [   68.380234]  [<ffffffff815b63f2>] apic_timer_interrupt+0x82/0x90
> [   68.380234]  <EOI>
> [   68.380234]  [<ffffffff8105f896>] ? native_safe_halt+0x6/0x10
> [   68.380234]  [<ffffffff81021c30>] default_idle+0x20/0x110
> [   68.380234]  [<ffffffff8102246f>] arch_cpu_idle+0xf/0x20
> [   68.380234]  [<ffffffff810baf0a>] default_idle_call+0x2a/0x40
> [   68.380234]  [<ffffffff810bb264>] cpu_startup_entry+0x344/0x3b0
> [   68.380234]  [<ffffffff815a8c19>] rest_init+0x89/0x90
> [   68.380234]  [<ffffffff8190c012>] start_kernel+0x46a/0x48b
> [   68.380234]  [<ffffffff8190b120>] ? early_idt_handler_array+0x120/0x120
> [   68.380234]  [<ffffffff8190b332>] x86_64_start_reservations+0x2a/0x2c
> [   68.380234]  [<ffffffff8190b480>] x86_64_start_kernel+0x14c/0x16f
> [   68.380234] Code: 44 00 00 66 83 3a 50 74 03 31 c0 c3 55 48 89 e5 41 55 41 54 53 48 8b 87 e8 02 00 00 48 8b 98 88 19 00 00 0f b6 42 11 4c 8d 63 04 <88> 83 d8 00 00 00 0f b6 42 13 4c 89 e7 88 83 d9 00 00 00 e8 2b
> [   68.380234] RIP  [<ffffffffa036933d>] sony_play_effect+0x2d/0x90 [hid_sony]
> [   68.380234]  RSP <ffff88003fc03da0>
> [   68.380234] CR2: 00000000000000d8
> [   68.380234] ---[ end trace 3e201de0e22c4c6e ]---
> [   68.380234] Kernel panic - not syncing: Fatal exception in interrupt
> [   68.380234] Kernel Offset: disabled
> [   68.380234] ---[ end Kernel panic - not syncing: Fatal exception in interrupt
> 
> 
> 
> 
> 
> So either the underlying bug is still there or my usage of spinlocks is totally wrong.
> 
> I would be really happy to get some feedback....
> 
> Thank you very much in advance.
> 
> Manuel


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v3] hid-sony: Prevent crash when rumble effects are still loaded at USB disconnect
  2016-06-07  5:38         ` Cameron Gutman
@ 2016-06-07 15:55           ` Manuel Reimer
  2016-06-11 10:00             ` Manuel Reimer
  0 siblings, 1 reply; 13+ messages in thread
From: Manuel Reimer @ 2016-06-07 15:55 UTC (permalink / raw)
  To: Cameron Gutman; +Cc: linux-input, jikos

Hello,

this may be true. I didn't expect that this is actually possible. For a 
kernel newbie it is very difficult to know where new threads are started 
and in which cases they are stopped.

The only thing, that happens in the "destroy" callback of ff-memless, 
is, again, that kernel memory is freed. If timers really keep running, 
then this would be a second possibility for a "use after free" bug. I 
think even if the timers are deleted before this, there is still a small 
chance that the timer callback already was triggered at the point where 
kfree is called.


To reproduce this, I did the following:

- Open fftest
--> Now, with ff-memless devices, we can use effect 4 or 5
- Start several effects. Meaning: Press 4 + Enter and/or 5 + Enter 
repeatedly several times.
- Finally pull the USB plug as fast as possible

It doesn't happen every time and may need some retries. I don't know if 
it is actually possible to get the crash using the xpad driver.

Manuel


On 06/07/2016 07:38 AM, Cameron Gutman wrote:
> This looks like the ff-memless effect timer is going off and invoking your effect callback
> after sony_remove() has returned. No amount of locking is going to fix a callback to a
> device that has been freed.
>
> I took a look in ff-memless.c and I did not see any obvious code path where the effect
> timer is stopped when the device is destroyed. That would explain the crashes you are
> getting. Maybe I am missing something though.
>
> Someone with some experience in the force feedback code could probably be more helpful.
>
> I tried for a little while to reproduce this bug with fftest and xpad and never managed it.
> What were you doing to trigger the bug?
>
>
> Cameron
>
>> On Jun 5, 2016, at 7:59 AM, Manuel Reimer <mail+linux-input@m-reimer.de> wrote:
>>
>> Hello,
>>
>> in the meantime, I got private mail with the suggestion, that I should have a look at spinlocks for my last change, so this is what I came up with:
>>
>>
>>
>>
>> --- a/drivers/hid/hid-sony.c	2016-05-13 16:13:00.339346161 +0200
>> +++ b/drivers/hid/hid-sony.c	2016-06-03 16:56:13.642143935 +0200
>> @@ -1023,6 +1023,7 @@ static DEFINE_IDA(sony_device_id_allocat
>>
>> struct sony_sc {
>> 	spinlock_t lock;
>> +	spinlock_t worker_initialized_lock;
>> 	struct list_head list_node;
>> 	struct hid_device *hdev;
>> 	struct led_classdev *leds[MAX_LEDS];
>> @@ -1051,6 +1052,20 @@ struct sony_sc {
>> 	__u8 led_count;
>> };
>>
>> +static inline void sony_schedule_work(struct sony_sc *sc)
>> +{
>> +	unsigned long flags;
>> +
>> +	spin_lock_irqsave(&sc->worker_initialized_lock, flags);
>> +
>> +	if (sc->worker_initialized)
>> +		schedule_work(&sc->state_worker);
>> +	else
>> +		printk(KERN_ERR "hid-sony: Sending to uninitialized device failed!\n");
>> +
>> +	spin_unlock_irqrestore(&sc->worker_initialized_lock, flags);
>> +}
>> +
>> static __u8 *sixaxis_fixup(struct hid_device *hdev, __u8 *rdesc,
>> 			     unsigned int *rsize)
>> {
>> @@ -1542,7 +1557,7 @@ static void buzz_set_leds(struct sony_sc
>> static void sony_set_leds(struct sony_sc *sc)
>> {
>> 	if (!(sc->quirks & BUZZ_CONTROLLER))
>> -		schedule_work(&sc->state_worker);
>> +		sony_schedule_work(sc);
>> 	else
>> 		buzz_set_leds(sc);
>> }
>> @@ -1653,7 +1668,7 @@ static int sony_led_blink_set(struct led
>> 		new_off != drv_data->led_delay_off[n]) {
>> 		drv_data->led_delay_on[n] = new_on;
>> 		drv_data->led_delay_off[n] = new_off;
>> -		schedule_work(&drv_data->state_worker);
>> +		sony_schedule_work(drv_data);
>> 	}
>>
>> 	return 0;
>> @@ -1963,7 +1978,7 @@ static int sony_play_effect(struct input
>> 	sc->left = effect->u.rumble.strong_magnitude / 256;
>> 	sc->right = effect->u.rumble.weak_magnitude / 256;
>>
>> -	schedule_work(&sc->state_worker);
>> +	sony_schedule_work(sc);
>> 	return 0;
>> }
>>
>> @@ -2255,18 +2270,30 @@ static void sony_release_device_id(struc
>> static inline void sony_init_output_report(struct sony_sc *sc,
>> 				void(*send_output_report)(struct sony_sc*))
>> {
>> +	unsigned long flags;
>> +
>> 	sc->send_output_report = send_output_report;
>>
>> +	spin_lock_irqsave(&sc->worker_initialized_lock, flags);
>> +
>> 	if (!sc->worker_initialized)
>> 		INIT_WORK(&sc->state_worker, sony_state_worker);
>>
>> 	sc->worker_initialized = 1;
>> +
>> +	spin_unlock_irqrestore(&sc->worker_initialized_lock, flags);
>> }
>>
>> static inline void sony_cancel_work_sync(struct sony_sc *sc)
>> {
>> -	if (sc->worker_initialized)
>> +	unsigned long flags;
>> +
>> +	spin_lock_irqsave(&sc->worker_initialized_lock, flags);
>> +	if (sc->worker_initialized) {
>> +		sc->worker_initialized = 0;
>> 		cancel_work_sync(&sc->state_worker);
>> +	}
>> +	spin_unlock_irqrestore(&sc->worker_initialized_lock, flags);
>> }
>>
>> static int sony_probe(struct hid_device *hdev, const struct hid_device_id *id)
>> @@ -2283,6 +2310,7 @@ static int sony_probe(struct hid_device
>> 	}
>>
>> 	spin_lock_init(&sc->lock);
>> +	spin_lock_init(&sc->worker_initialized_lock);
>>
>> 	sc->quirks = quirks;
>> 	hid_set_drvdata(hdev, sc);
>>
>>
>>
>>
>>
>>
>> Result:
>>
>>
>>
>>
>>
>> [   67.956856] BUG: unable to handle kernel NULL pointer dereference at 00000000000000d8
>> [   68.380234] IP: [<ffffffffa036933d>] sony_play_effect+0x2d/0x90 [hid_sony]
>> [   68.380234] PGD 0
>> [   68.380234] Oops: 0002 [#1] PREEMPT SMP
>> [   68.380234] Modules linked in: hid_sony ff_memless cfg80211 rfkill crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 ppdev lrw gf128mul glue_helper ablk_helper cryptd psmouse input_leds joydev led_class acpi_cpufreq serio_raw mousedev pcspkr mac_hid i2c_piix4 e1000 intel_agp intel_gtt tpm_tis tpm fjes video battery parport_pc parport button processor ac evdev sch_fq_codel vboxvideo(O) ttm drm_kms_helper drm syscopyarea sysfillrect sysimgblt fb_sys_fops vboxsf(O) vboxguest(O) ip_tables x_tables xfs libcrc32c hid_generic usbhid hid sr_mod cdrom sd_mod ata_generic pata_acpi atkbd libps2 ohci_pci ohci_hcd ata_piix ahci libahci usbcore crc32c_intel usb_common libata scsi_mod i8042 serio
>> [   68.380234] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G           O 4.5.4-1-ARCH #1
>> [   68.380234] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
>> [   68.380234] task: ffffffff81811500 ti: ffffffff81800000 task.ti: ffffffff81800000
>> [   68.380234] RIP: 0010:[<ffffffffa036933d>]  [<ffffffffa036933d>] sony_play_effect+0x2d/0x90 [hid_sony]
>> [   68.380234] RSP: 0018:ffff88003fc03da0  EFLAGS: 00010046
>> [   68.380234] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 000000000000ffff
>> [   68.380234] RDX: ffff88003fc03df0 RSI: 0000000000000000 RDI: ffff88003d09d800
>> [   68.380234] RBP: ffff88003fc03db8 R08: ffff880037bf3100 R09: 0000000000000000
>> [   68.380234] R10: 0000000000000020 R11: 0000000000000000 R12: 0000000000000004
>> [   68.380234] R13: ffff880039980c00 R14: ffff88003fc03df0 R15: 00000000000000b4
>> [   68.380234] FS:  00007f1a3b54a780(0000) GS:ffff88003fc00000(0000) knlGS:0000000000000000
>> [   68.380234] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
>> [   68.380234] CR2: 00000000000000d8 CR3: 000000003ceb7000 CR4: 00000000000406f0
>> [   68.380234] Stack:
>> [   68.380234]  0000000000000010 ffff880039980f08 ffff880039980c00 ffff88003fc03e50
>> [   68.380234]  ffffffffa036475a 0000000000000002 0000000000015740 0000000000000046
>> [   68.380234]  ffff880039980c08 000000000000ffff 0000000000000050 0000000000000000
>> [   68.380234] Call Trace:
>> [   68.380234]  <IRQ>
>> [   68.380234]  [<ffffffffa036475a>] ml_play_effects+0x10a/0x700 [ff_memless]
>> [   68.380234]  [<ffffffffa0364e90>] ? ml_ff_playback+0x100/0x100 [ff_memless]
>> [   68.380234]  [<ffffffffa0364ecf>] ml_effect_timer+0x3f/0x170 [ff_memless]
>> [   68.380234]  [<ffffffff810e3cc5>] call_timer_fn+0x35/0x150
>> [   68.380234]  [<ffffffffa0364e90>] ? ml_ff_playback+0x100/0x100 [ff_memless]
>> [   68.380234]  [<ffffffff810e4016>] run_timer_softirq+0x236/0x2e0
>> [   68.380234]  [<ffffffff8107d3e6>] __do_softirq+0xe6/0x2f0
>> [   68.380234]  [<ffffffff8107d763>] irq_exit+0xa3/0xb0
>> [   68.380234]  [<ffffffff815b8112>] smp_apic_timer_interrupt+0x42/0x50
>> [   68.380234]  [<ffffffff815b63f2>] apic_timer_interrupt+0x82/0x90
>> [   68.380234]  <EOI>
>> [   68.380234]  [<ffffffff8105f896>] ? native_safe_halt+0x6/0x10
>> [   68.380234]  [<ffffffff81021c30>] default_idle+0x20/0x110
>> [   68.380234]  [<ffffffff8102246f>] arch_cpu_idle+0xf/0x20
>> [   68.380234]  [<ffffffff810baf0a>] default_idle_call+0x2a/0x40
>> [   68.380234]  [<ffffffff810bb264>] cpu_startup_entry+0x344/0x3b0
>> [   68.380234]  [<ffffffff815a8c19>] rest_init+0x89/0x90
>> [   68.380234]  [<ffffffff8190c012>] start_kernel+0x46a/0x48b
>> [   68.380234]  [<ffffffff8190b120>] ? early_idt_handler_array+0x120/0x120
>> [   68.380234]  [<ffffffff8190b332>] x86_64_start_reservations+0x2a/0x2c
>> [   68.380234]  [<ffffffff8190b480>] x86_64_start_kernel+0x14c/0x16f
>> [   68.380234] Code: 44 00 00 66 83 3a 50 74 03 31 c0 c3 55 48 89 e5 41 55 41 54 53 48 8b 87 e8 02 00 00 48 8b 98 88 19 00 00 0f b6 42 11 4c 8d 63 04 <88> 83 d8 00 00 00 0f b6 42 13 4c 89 e7 88 83 d9 00 00 00 e8 2b
>> [   68.380234] RIP  [<ffffffffa036933d>] sony_play_effect+0x2d/0x90 [hid_sony]
>> [   68.380234]  RSP <ffff88003fc03da0>
>> [   68.380234] CR2: 00000000000000d8
>> [   68.380234] ---[ end trace 3e201de0e22c4c6e ]---
>> [   68.380234] Kernel panic - not syncing: Fatal exception in interrupt
>> [   68.380234] Kernel Offset: disabled
>> [   68.380234] ---[ end Kernel panic - not syncing: Fatal exception in interrupt
>>
>>
>>
>>
>>
>> So either the underlying bug is still there or my usage of spinlocks is totally wrong.
>>
>> I would be really happy to get some feedback....
>>
>> Thank you very much in advance.
>>
>> Manuel
>


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v3] hid-sony: Prevent crash when rumble effects are still loaded at USB disconnect
  2016-06-07 15:55           ` Manuel Reimer
@ 2016-06-11 10:00             ` Manuel Reimer
  2016-06-11 17:37               ` Cameron Gutman
  0 siblings, 1 reply; 13+ messages in thread
From: Manuel Reimer @ 2016-06-11 10:00 UTC (permalink / raw)
  To: Cameron Gutman; +Cc: linux-input, jikos

Hello,

I did some more testing. Now I added printk messages to start and end of 
ml_effect_timer and to hl_ff_destroy. Result:

[  513.493511] ml_effect_timer start
[  513.746964] ml_effect_timer end
[  515.107003] hid-sony: Sending to uninitialized device failed!
[  515.333520] hid-sony: Sending to uninitialized device failed!
[  515.415381] hid-sony: Sending to uninitialized device failed!
[  520.476860] ml_effect_timer start
[  520.677003] BUG: unable to handle kernel NULL pointer dereference at 
00000000000000d8

The hid-sony messages are created by my last patch to fix the hid-sony 
driver. They show that some sending attempts have been cancelled, as the 
device is about to be destroyed.

Quite some time after that there in fact is another attempt to call 
ml_effect_timer, so the timer still was active. Tomorrow I'll add 
additional printk lines to the hid-sony destroy function to see if this 
finished executing before this unwanted timer call arrives.

This also shows that ml_ff_destroy is not the right place to cancel the 
timer. ml_ff_destroy is called as soon as I exit fftest. It is not 
called at all on USB disconnect.

I now guess this can also be reproduced with the xpad driver, but it 
requires some fiddling with fftest. It took me ten minutes this time to 
get the bug triggered. I think the way to trigger the bug is to start 
effect 5 and shortly after that effect 4. With some luck the USB plug is 
pulled before event 4 is actually started.

Manuel

On 06/07/2016 05:55 PM, Manuel Reimer wrote:
> Hello,
>
> this may be true. I didn't expect that this is actually possible. For a
> kernel newbie it is very difficult to know where new threads are started
> and in which cases they are stopped.
>
> The only thing, that happens in the "destroy" callback of ff-memless,
> is, again, that kernel memory is freed. If timers really keep running,
> then this would be a second possibility for a "use after free" bug. I
> think even if the timers are deleted before this, there is still a small
> chance that the timer callback already was triggered at the point where
> kfree is called.
>
>
> To reproduce this, I did the following:
>
> - Open fftest
> --> Now, with ff-memless devices, we can use effect 4 or 5
> - Start several effects. Meaning: Press 4 + Enter and/or 5 + Enter
> repeatedly several times.
> - Finally pull the USB plug as fast as possible
>
> It doesn't happen every time and may need some retries. I don't know if
> it is actually possible to get the crash using the xpad driver.
>
> Manuel
>
>
> On 06/07/2016 07:38 AM, Cameron Gutman wrote:
>> This looks like the ff-memless effect timer is going off and invoking
>> your effect callback
>> after sony_remove() has returned. No amount of locking is going to fix
>> a callback to a
>> device that has been freed.
>>
>> I took a look in ff-memless.c and I did not see any obvious code path
>> where the effect
>> timer is stopped when the device is destroyed. That would explain the
>> crashes you are
>> getting. Maybe I am missing something though.
>>
>> Someone with some experience in the force feedback code could probably
>> be more helpful.
>>
>> I tried for a little while to reproduce this bug with fftest and xpad
>> and never managed it.
>> What were you doing to trigger the bug?
>>
>>
>> Cameron
>>
>>> On Jun 5, 2016, at 7:59 AM, Manuel Reimer
>>> <mail+linux-input@m-reimer.de> wrote:
>>>
>>> Hello,
>>>
>>> in the meantime, I got private mail with the suggestion, that I
>>> should have a look at spinlocks for my last change, so this is what I
>>> came up with:
>>>
>>>
>>>
>>>
>>> --- a/drivers/hid/hid-sony.c    2016-05-13 16:13:00.339346161 +0200
>>> +++ b/drivers/hid/hid-sony.c    2016-06-03 16:56:13.642143935 +0200
>>> @@ -1023,6 +1023,7 @@ static DEFINE_IDA(sony_device_id_allocat
>>>
>>> struct sony_sc {
>>>     spinlock_t lock;
>>> +    spinlock_t worker_initialized_lock;
>>>     struct list_head list_node;
>>>     struct hid_device *hdev;
>>>     struct led_classdev *leds[MAX_LEDS];
>>> @@ -1051,6 +1052,20 @@ struct sony_sc {
>>>     __u8 led_count;
>>> };
>>>
>>> +static inline void sony_schedule_work(struct sony_sc *sc)
>>> +{
>>> +    unsigned long flags;
>>> +
>>> +    spin_lock_irqsave(&sc->worker_initialized_lock, flags);
>>> +
>>> +    if (sc->worker_initialized)
>>> +        schedule_work(&sc->state_worker);
>>> +    else
>>> +        printk(KERN_ERR "hid-sony: Sending to uninitialized device
>>> failed!\n");
>>> +
>>> +    spin_unlock_irqrestore(&sc->worker_initialized_lock, flags);
>>> +}
>>> +
>>> static __u8 *sixaxis_fixup(struct hid_device *hdev, __u8 *rdesc,
>>>                  unsigned int *rsize)
>>> {
>>> @@ -1542,7 +1557,7 @@ static void buzz_set_leds(struct sony_sc
>>> static void sony_set_leds(struct sony_sc *sc)
>>> {
>>>     if (!(sc->quirks & BUZZ_CONTROLLER))
>>> -        schedule_work(&sc->state_worker);
>>> +        sony_schedule_work(sc);
>>>     else
>>>         buzz_set_leds(sc);
>>> }
>>> @@ -1653,7 +1668,7 @@ static int sony_led_blink_set(struct led
>>>         new_off != drv_data->led_delay_off[n]) {
>>>         drv_data->led_delay_on[n] = new_on;
>>>         drv_data->led_delay_off[n] = new_off;
>>> -        schedule_work(&drv_data->state_worker);
>>> +        sony_schedule_work(drv_data);
>>>     }
>>>
>>>     return 0;
>>> @@ -1963,7 +1978,7 @@ static int sony_play_effect(struct input
>>>     sc->left = effect->u.rumble.strong_magnitude / 256;
>>>     sc->right = effect->u.rumble.weak_magnitude / 256;
>>>
>>> -    schedule_work(&sc->state_worker);
>>> +    sony_schedule_work(sc);
>>>     return 0;
>>> }
>>>
>>> @@ -2255,18 +2270,30 @@ static void sony_release_device_id(struc
>>> static inline void sony_init_output_report(struct sony_sc *sc,
>>>                 void(*send_output_report)(struct sony_sc*))
>>> {
>>> +    unsigned long flags;
>>> +
>>>     sc->send_output_report = send_output_report;
>>>
>>> +    spin_lock_irqsave(&sc->worker_initialized_lock, flags);
>>> +
>>>     if (!sc->worker_initialized)
>>>         INIT_WORK(&sc->state_worker, sony_state_worker);
>>>
>>>     sc->worker_initialized = 1;
>>> +
>>> +    spin_unlock_irqrestore(&sc->worker_initialized_lock, flags);
>>> }
>>>
>>> static inline void sony_cancel_work_sync(struct sony_sc *sc)
>>> {
>>> -    if (sc->worker_initialized)
>>> +    unsigned long flags;
>>> +
>>> +    spin_lock_irqsave(&sc->worker_initialized_lock, flags);
>>> +    if (sc->worker_initialized) {
>>> +        sc->worker_initialized = 0;
>>>         cancel_work_sync(&sc->state_worker);
>>> +    }
>>> +    spin_unlock_irqrestore(&sc->worker_initialized_lock, flags);
>>> }
>>>
>>> static int sony_probe(struct hid_device *hdev, const struct
>>> hid_device_id *id)
>>> @@ -2283,6 +2310,7 @@ static int sony_probe(struct hid_device
>>>     }
>>>
>>>     spin_lock_init(&sc->lock);
>>> +    spin_lock_init(&sc->worker_initialized_lock);
>>>
>>>     sc->quirks = quirks;
>>>     hid_set_drvdata(hdev, sc);
>>>
>>>
>>>
>>>
>>>
>>>
>>> Result:
>>>
>>>
>>>
>>>
>>>
>>> [   67.956856] BUG: unable to handle kernel NULL pointer dereference
>>> at 00000000000000d8
>>> [   68.380234] IP: [<ffffffffa036933d>] sony_play_effect+0x2d/0x90
>>> [hid_sony]
>>> [   68.380234] PGD 0
>>> [   68.380234] Oops: 0002 [#1] PREEMPT SMP
>>> [   68.380234] Modules linked in: hid_sony ff_memless cfg80211 rfkill
>>> crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel
>>> aes_x86_64 ppdev lrw gf128mul glue_helper ablk_helper cryptd psmouse
>>> input_leds joydev led_class acpi_cpufreq serio_raw mousedev pcspkr
>>> mac_hid i2c_piix4 e1000 intel_agp intel_gtt tpm_tis tpm fjes video
>>> battery parport_pc parport button processor ac evdev sch_fq_codel
>>> vboxvideo(O) ttm drm_kms_helper drm syscopyarea sysfillrect sysimgblt
>>> fb_sys_fops vboxsf(O) vboxguest(O) ip_tables x_tables xfs libcrc32c
>>> hid_generic usbhid hid sr_mod cdrom sd_mod ata_generic pata_acpi
>>> atkbd libps2 ohci_pci ohci_hcd ata_piix ahci libahci usbcore
>>> crc32c_intel usb_common libata scsi_mod i8042 serio
>>> [   68.380234] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G           O
>>> 4.5.4-1-ARCH #1
>>> [   68.380234] Hardware name: innotek GmbH VirtualBox/VirtualBox,
>>> BIOS VirtualBox 12/01/2006
>>> [   68.380234] task: ffffffff81811500 ti: ffffffff81800000 task.ti:
>>> ffffffff81800000
>>> [   68.380234] RIP: 0010:[<ffffffffa036933d>]  [<ffffffffa036933d>]
>>> sony_play_effect+0x2d/0x90 [hid_sony]
>>> [   68.380234] RSP: 0018:ffff88003fc03da0  EFLAGS: 00010046
>>> [   68.380234] RAX: 0000000000000000 RBX: 0000000000000000 RCX:
>>> 000000000000ffff
>>> [   68.380234] RDX: ffff88003fc03df0 RSI: 0000000000000000 RDI:
>>> ffff88003d09d800
>>> [   68.380234] RBP: ffff88003fc03db8 R08: ffff880037bf3100 R09:
>>> 0000000000000000
>>> [   68.380234] R10: 0000000000000020 R11: 0000000000000000 R12:
>>> 0000000000000004
>>> [   68.380234] R13: ffff880039980c00 R14: ffff88003fc03df0 R15:
>>> 00000000000000b4
>>> [   68.380234] FS:  00007f1a3b54a780(0000) GS:ffff88003fc00000(0000)
>>> knlGS:0000000000000000
>>> [   68.380234] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
>>> [   68.380234] CR2: 00000000000000d8 CR3: 000000003ceb7000 CR4:
>>> 00000000000406f0
>>> [   68.380234] Stack:
>>> [   68.380234]  0000000000000010 ffff880039980f08 ffff880039980c00
>>> ffff88003fc03e50
>>> [   68.380234]  ffffffffa036475a 0000000000000002 0000000000015740
>>> 0000000000000046
>>> [   68.380234]  ffff880039980c08 000000000000ffff 0000000000000050
>>> 0000000000000000
>>> [   68.380234] Call Trace:
>>> [   68.380234]  <IRQ>
>>> [   68.380234]  [<ffffffffa036475a>] ml_play_effects+0x10a/0x700
>>> [ff_memless]
>>> [   68.380234]  [<ffffffffa0364e90>] ? ml_ff_playback+0x100/0x100
>>> [ff_memless]
>>> [   68.380234]  [<ffffffffa0364ecf>] ml_effect_timer+0x3f/0x170
>>> [ff_memless]
>>> [   68.380234]  [<ffffffff810e3cc5>] call_timer_fn+0x35/0x150
>>> [   68.380234]  [<ffffffffa0364e90>] ? ml_ff_playback+0x100/0x100
>>> [ff_memless]
>>> [   68.380234]  [<ffffffff810e4016>] run_timer_softirq+0x236/0x2e0
>>> [   68.380234]  [<ffffffff8107d3e6>] __do_softirq+0xe6/0x2f0
>>> [   68.380234]  [<ffffffff8107d763>] irq_exit+0xa3/0xb0
>>> [   68.380234]  [<ffffffff815b8112>] smp_apic_timer_interrupt+0x42/0x50
>>> [   68.380234]  [<ffffffff815b63f2>] apic_timer_interrupt+0x82/0x90
>>> [   68.380234]  <EOI>
>>> [   68.380234]  [<ffffffff8105f896>] ? native_safe_halt+0x6/0x10
>>> [   68.380234]  [<ffffffff81021c30>] default_idle+0x20/0x110
>>> [   68.380234]  [<ffffffff8102246f>] arch_cpu_idle+0xf/0x20
>>> [   68.380234]  [<ffffffff810baf0a>] default_idle_call+0x2a/0x40
>>> [   68.380234]  [<ffffffff810bb264>] cpu_startup_entry+0x344/0x3b0
>>> [   68.380234]  [<ffffffff815a8c19>] rest_init+0x89/0x90
>>> [   68.380234]  [<ffffffff8190c012>] start_kernel+0x46a/0x48b
>>> [   68.380234]  [<ffffffff8190b120>] ?
>>> early_idt_handler_array+0x120/0x120
>>> [   68.380234]  [<ffffffff8190b332>] x86_64_start_reservations+0x2a/0x2c
>>> [   68.380234]  [<ffffffff8190b480>] x86_64_start_kernel+0x14c/0x16f
>>> [   68.380234] Code: 44 00 00 66 83 3a 50 74 03 31 c0 c3 55 48 89 e5
>>> 41 55 41 54 53 48 8b 87 e8 02 00 00 48 8b 98 88 19 00 00 0f b6 42 11
>>> 4c 8d 63 04 <88> 83 d8 00 00 00 0f b6 42 13 4c 89 e7 88 83 d9 00 00
>>> 00 e8 2b
>>> [   68.380234] RIP  [<ffffffffa036933d>] sony_play_effect+0x2d/0x90
>>> [hid_sony]
>>> [   68.380234]  RSP <ffff88003fc03da0>
>>> [   68.380234] CR2: 00000000000000d8
>>> [   68.380234] ---[ end trace 3e201de0e22c4c6e ]---
>>> [   68.380234] Kernel panic - not syncing: Fatal exception in interrupt
>>> [   68.380234] Kernel Offset: disabled
>>> [   68.380234] ---[ end Kernel panic - not syncing: Fatal exception
>>> in interrupt
>>>
>>>
>>>
>>>
>>>
>>> So either the underlying bug is still there or my usage of spinlocks
>>> is totally wrong.
>>>
>>> I would be really happy to get some feedback....
>>>
>>> Thank you very much in advance.
>>>
>>> Manuel
>>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-input" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v3] hid-sony: Prevent crash when rumble effects are still loaded at USB disconnect
  2016-06-11 10:00             ` Manuel Reimer
@ 2016-06-11 17:37               ` Cameron Gutman
  2016-06-11 19:15                 ` Cameron Gutman
  0 siblings, 1 reply; 13+ messages in thread
From: Cameron Gutman @ 2016-06-11 17:37 UTC (permalink / raw)
  To: Manuel Reimer; +Cc: linux-input, jikos



On 06/11/2016 05:00 AM, Manuel Reimer wrote:
> Hello,
> 
> I did some more testing. Now I added printk messages to start and end of ml_effect_timer and to hl_ff_destroy. Result:
> 
> [  513.493511] ml_effect_timer start
> [  513.746964] ml_effect_timer end
> [  515.107003] hid-sony: Sending to uninitialized device failed!
> [  515.333520] hid-sony: Sending to uninitialized device failed!
> [  515.415381] hid-sony: Sending to uninitialized device failed!
> [  520.476860] ml_effect_timer start
> [  520.677003] BUG: unable to handle kernel NULL pointer dereference at 00000000000000d8
> 
> The hid-sony messages are created by my last patch to fix the hid-sony driver. They show that some sending attempts have been cancelled, as the device is about to be destroyed.
> 
> Quite some time after that there in fact is another attempt to call ml_effect_timer, so the timer still was active. Tomorrow I'll add additional printk lines to the hid-sony destroy function to see if this finished executing before this unwanted timer call arrives.
> 
> This also shows that ml_ff_destroy is not the right place to cancel the timer. ml_ff_destroy is called as soon as I exit fftest. It is not called at all on USB disconnect.
> 
> I now guess this can also be reproduced with the xpad driver, but it requires some fiddling with fftest. It took me ten minutes this time to get the bug triggered. I think the way to trigger the bug is to start effect 5 and shortly after that effect 4. With some luck the USB plug is pulled before event 4 is actually started.
> 

Can you try applying the following patch on a clean source tree and see if it resolves your issue?

diff --git a/drivers/input/ff-memless.c b/drivers/input/ff-memless.c
index fcc6c33..6366e9a 100644
--- a/drivers/input/ff-memless.c
+++ b/drivers/input/ff-memless.c
@@ -501,6 +501,7 @@ static void ml_ff_destroy(struct ff_device *ff)
 {
        struct ml_device *ml = ff->private;
 
+       del_timer_sync(&ml->timer);
        kfree(ml->private);
 }


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH v3] hid-sony: Prevent crash when rumble effects are still loaded at USB disconnect
  2016-06-11 17:37               ` Cameron Gutman
@ 2016-06-11 19:15                 ` Cameron Gutman
  2016-06-12 10:01                   ` Manuel Reimer
  0 siblings, 1 reply; 13+ messages in thread
From: Cameron Gutman @ 2016-06-11 19:15 UTC (permalink / raw)
  To: Manuel Reimer; +Cc: linux-input, jikos, dmitry.torokhov



On 06/11/2016 12:37 PM, Cameron Gutman wrote:
> 
> 
> On 06/11/2016 05:00 AM, Manuel Reimer wrote:
>> Hello,
>>
>> I did some more testing. Now I added printk messages to start and end of ml_effect_timer and to hl_ff_destroy. Result:
>>
>> [  513.493511] ml_effect_timer start
>> [  513.746964] ml_effect_timer end
>> [  515.107003] hid-sony: Sending to uninitialized device failed!
>> [  515.333520] hid-sony: Sending to uninitialized device failed!
>> [  515.415381] hid-sony: Sending to uninitialized device failed!
>> [  520.476860] ml_effect_timer start
>> [  520.677003] BUG: unable to handle kernel NULL pointer dereference at 00000000000000d8
>>
>> The hid-sony messages are created by my last patch to fix the hid-sony driver. They show that some sending attempts have been cancelled, as the device is about to be destroyed.
>>
>> Quite some time after that there in fact is another attempt to call ml_effect_timer, so the timer still was active. Tomorrow I'll add additional printk lines to the hid-sony destroy function to see if this finished executing before this unwanted timer call arrives.
>>
>> This also shows that ml_ff_destroy is not the right place to cancel the timer. ml_ff_destroy is called as soon as I exit fftest. It is not called at all on USB disconnect.
>>
>> I now guess this can also be reproduced with the xpad driver, but it requires some fiddling with fftest. It took me ten minutes this time to get the bug triggered. I think the way to trigger the bug is to start effect 5 and shortly after that effect 4. With some luck the USB plug is pulled before event 4 is actually started.
>>
> 
> Can you try applying the following patch on a clean source tree and see if it resolves your issue?
> 
> diff --git a/drivers/input/ff-memless.c b/drivers/input/ff-memless.c
> index fcc6c33..6366e9a 100644
> --- a/drivers/input/ff-memless.c
> +++ b/drivers/input/ff-memless.c
> @@ -501,6 +501,7 @@ static void ml_ff_destroy(struct ff_device *ff)
>  {
>         struct ml_device *ml = ff->private;
>  
> +       del_timer_sync(&ml->timer);
>         kfree(ml->private);
>  }
> 

Sorry, I misremembered what I thought I read in your email - you said you didn't see a call to ml_ff_destroy()
in the disconnect case.

Though that is quite strange to me, as the comments on ff_device indicate that it should be the right spot:

483  * @destroy: called by input core when parent input device is being
484  *      destroyed

The input core will call input_ff_destroy() immediately when an input device's release function is called, which
happens when input_unregister_device() is called. Xpad calls that almost immediately when xpad_disconnect() is
called. The HID case is a bit more complex but seems to make sense too: sony_remove() -> hid_hw_stop() ->
hid_disconnect() -> hidinput_disconnect() -> input_unregister_device().

My concern is that, as we thought before, hid_hw_stop() is being called too late in the teardown process in
the hid-sony module. This means input_ff_destroy() doesn't get called until after all the state needed to
send FF events is already freed, leaving the window where you oops before your ml_ff_destroy() printk fires.

I want to be sure we're on the right track though. Can you please resolve the address in the RIP register to a
line number in hid-sony.c?


Cameron

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v3] hid-sony: Prevent crash when rumble effects are still loaded at USB disconnect
  2016-06-11 19:15                 ` Cameron Gutman
@ 2016-06-12 10:01                   ` Manuel Reimer
       [not found]                     ` <20160612155643.d2218913a4f9ff42dec938e3@ao2.it>
  0 siblings, 1 reply; 13+ messages in thread
From: Manuel Reimer @ 2016-06-12 10:01 UTC (permalink / raw)
  To: Cameron Gutman; +Cc: linux-input, jikos, dmitry.torokhov

On 06/11/2016 09:15 PM, Cameron Gutman wrote:
> Sorry, I misremembered what I thought I read in your email - you said you didn't see a call to ml_ff_destroy()
> in the disconnect case.

I also don't see this call with the xpad driver!

> Though that is quite strange to me, as the comments on ff_device indicate that it should be the right spot:
>
> 483  * @destroy: called by input core when parent input device is being
> 484  *      destroyed

What does "input device is being destroyed" actually mean in this case?

I tried with the xpad driver and as long as fftest is running (it will 
keep running until it detects that the device has been disconnected) 
there is no ml_ff_destroy event. So, if I want, there can be hours 
between actual USB disconnect and ml_ff_destroy.

So what does it mean to the kernel that fftest is keeping an open file 
handle to this input device? Is it really gone if there is at least one 
process still holding a reference to it?


And. It actually seems possible to trigger the bug with the xpad driver. 
I didn't have the crash, but there is a good chance to get it. This is 
what I managed to get:

[  267.006873] ml_effect_timer start
[  267.256399] ml_effect_timer end
[  268.286814] xpad 1-2:1.0: xpad_try_sending_next_out_packet - 
usb_submit_urb failed with result -19
[  268.331112] xpad 1-2:1.0: xpad_try_sending_next_out_packet - 
usb_submit_urb failed with result -19
[  268.384689] xpad 1-2:1.0: xpad_try_sending_next_out_packet - 
usb_submit_urb failed with result -19
[  268.452146] xpad 1-2:1.0: xpad_try_sending_next_out_packet - 
usb_submit_urb failed with result -19
[  273.813537] ml_effect_timer start
[  273.816845] usb ȵ�<: xpad_try_sending_next_out_packet - 
usb_submit_urb failed with result -19
[  274.400668] ml_effect_timer end


Starting with 268.286814 the USB device is gone! That's why the xpad 
driver is throwing errors.

Some time later at 273.813537 (USB device is gone!) there is a new call 
to the timer callback. "Unfortunately" no crash.

> I want to be sure we're on the right track though. Can you please resolve the address in the RIP register to a
> line number in hid-sony.c?

Can you point me to some instructions on how to do this? The only way to 
debug things, I currently know, is the "good old print(f/k)" debugging...

Manuel
--
To unsubscribe from this list: send the line "unsubscribe linux-input" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v3] hid-sony: Prevent crash when rumble effects are still loaded at USB disconnect
       [not found]                     ` <20160612155643.d2218913a4f9ff42dec938e3@ao2.it>
@ 2016-06-14 18:31                       ` Manuel Reimer
  0 siblings, 0 replies; 13+ messages in thread
From: Manuel Reimer @ 2016-06-14 18:31 UTC (permalink / raw)
  To: Antonio Ospite; +Cc: Cameron Gutman, linux-input, jikos, dmitry.torokhov

On 06/12/2016 03:56 PM, Antonio Ospite wrote:
> On Sun, 12 Jun 2016 12:01:31 +0200
> Manuel Reimer <mail+linux-input@m-reimer.de> wrote:
>
>> On 06/11/2016 09:15 PM, Cameron Gutman wrote:
> [...]
>>> I want to be sure we're on the right track though. Can you please resolve the address in the RIP register to a
>>> line number in hid-sony.c?
>>
>> Can you point me to some instructions on how to do this? The only way to
>> debug things, I currently know, is the "good old print(f/k)" debugging...
>>
>
> The return addresses are in the trace from the Oops, you can resolve
> them with objdump, see:
> https://ao2.it/en/blog/2010/11/02/bug-hunting-linux-kernel-land-unpretentious-primer
>
> Ciao,
>    Antonio

Thank you for this link and sorry for the late response.

This is my latest crash (abbreviated):

[  183.865163] ml_ff_playback start. Effect: 1 Value: 1
[  184.897065] ml_ff_playback start. Effect: 0 Value: 0
[  184.900069] hid-sony: Sending to uninitialized device failed!
[  184.956432] ml_ff_playback start. Effect: 1 Value: 0
[  184.957933] hid-sony: Sending to uninitialized device failed!
[  185.023015] ml_ff_playback start. Effect: 2 Value: 0
[  185.026143] hid-sony: Sending to uninitialized device failed!
[  190.863508] ml_effect_timer start
[  190.866803] BUG: unable to handle kernel NULL pointer dereference at 
00000000000000d8
[  190.866803] IP: [<ffffffffa037f33d>] sony_play_effect+0x2d/0x90 
[hid_sony]

Also contains some of my printk messages, so we now can actually see why 
the last sending attempts are happening (attempt to stop the effects).

My sony_play_effect is at 0x310 and with 0x2d added we are at 0x33d.

In my case:
      339:	4c 8d 63 04          	lea    0x4(%rbx),%r12
sony_play_effect():
/home/manuel/kernel/4.5.4/src/b/drivers/hid/hid-sony.c:1978
      33d:	88 83 d8 00 00 00    	mov    %al,0xd8(%rbx)
/home/manuel/kernel/4.5.4/src/b/drivers/hid/hid-sony.c:1979
	sc->right = effect->u.rumble.weak_magnitude / 256;
      343:	0f b6 42 13          	movzbl 0x13(%rdx),%eax
sony_schedule_work():
/home/manuel/kernel/4.5.4/src/b/drivers/hid/hid-sony.c:1059


So if I interpret this right, then the source for the crash is the 
assignment of the weak magnitude value to sc->right? Maybe sc already 
freed/zeroed?

Manuel

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2016-06-14 18:31 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-05-29 12:11 [PATCH] hid-sony: Prevent crash when rumble effects are still loaded at USB disconnect Manuel Reimer
2016-05-29 17:11 ` Cameron Gutman
2016-05-30  4:45   ` mail
2016-05-30 19:15   ` Manuel Reimer
2016-06-02 17:53     ` [PATCH v2] " Manuel Reimer
2016-06-05 12:59       ` [PATCH v3] " Manuel Reimer
2016-06-07  5:38         ` Cameron Gutman
2016-06-07 15:55           ` Manuel Reimer
2016-06-11 10:00             ` Manuel Reimer
2016-06-11 17:37               ` Cameron Gutman
2016-06-11 19:15                 ` Cameron Gutman
2016-06-12 10:01                   ` Manuel Reimer
     [not found]                     ` <20160612155643.d2218913a4f9ff42dec938e3@ao2.it>
2016-06-14 18:31                       ` Manuel Reimer

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.