* [2.6.36-rc4/HEAD] unable to handle kernel NULL pointer dereference (plist_add)
From: Simon Kirby @ 2010-09-13  7:00 UTC (permalink / raw)
  To: linux-pm, linux-kernel

Hi!

At first I thought I was hitting a suspend bug, but it was because I had an
rmmod/modprobe of snd_ice1724 in my "gosleep" script to work around my
sound sounding crackly after resume.  I can reproduce this same crash
simply by using sound, rmmod snd_ice1724, modprobe snd_ice1724, and then
using sound again.

I git-bisected to:
82f682514a5df89ffb3890627eebf0897b7a84ec is the first bad commit
commit 82f682514a5df89ffb3890627eebf0897b7a84ec
Author: James Bottomley <James.Bottomley@suse.de>
Date:   Mon Jul 5 22:53:06 2010 +0200

    pm_qos: Get rid of the allocation in pm_qos_add_request()

    All current users of pm_qos_add_request() have the ability to supply
    the memory required by the pm_qos routines, so make them do this and
    eliminate the kmalloc() with pm_qos_add_request().  This has the
    double benefit of making the call never fail and allowing it to be
    called from atomic context.

    Signed-off-by: James Bottomley <James.Bottomley@suse.de>
    Signed-off-by: mark gross <markgross@thegnar.org>
    Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>

:040000 040000 0370d037cfe56ae4079bc47713b98c5961de258f 92d6437da54649496c84bacc16ef4b17f8087bbd M      drivers
:040000 040000 6ec551133758633de70e97b4a41d893a5c050e74 4ac17ea227c07d9cf7dae297a40bbbcc374c232f M      include
:040000 040000 d0bb4b15e77a0976940af48588ef0e9b9fd7590b ac682974217c7e75a771b92c8efa7c83659d2916 M      kernel
:040000 040000 9ab6e37ec622354cb93abc2b7424b10263bcd007 011680bad3da4b60aebb659eaeb85a037e2513f3 M      sound

...but I can't revert this over HEAD due to conflicts and I can't see
what's wrong in the diff.  2.6.35 is fine.

Simon-

[   61.234230] ICE1724 0000:01:06.0: PCI INT A disabled
[   61.240010] ICE1724 0000:01:06.0: PCI INT A -> GSI 21 (level, low) -> IRQ 21
[   62.156008] BUG: unable to handle kernel NULL pointer dereference at (null)
[   62.156008] IP: [<ffffffff81379706>] plist_add+0x36/0xa0
[   62.156008] PGD 1ad813067 PUD 1ad989067 PMD 0 
[   62.156008] Oops: 0000 [#1] SMP 
[   62.156008] last sysfs file: /sys/devices/pci0000:00/0000:00:14.4/0000:01:06.0/sound/card0/uevent
[   62.156118] CPU 1 
[   62.156147] Modules linked in: snd_ice1724 sco bnep rfcomm l2cap bluetooth ppdev hwmon_vid usb_storage tun i2c_viapro snd_rawmidi snd_ice17xx_ak4xxx snd_ac97_codec ac97_bus snd_ak4xxx_adda snd_ak4114 snd_pt2258 snd_i2c parport_pc r8169 parport snd_ak4113 k10temp [last unloaded: snd_ice1724]
[   62.156849] 
[   62.156925] Pid: 3053, comm: mplayer Not tainted 2.6.36-rc4-oof+ #6 M4A79T Deluxe/System Product Name
[   62.157058] RIP: 0010:[<ffffffff81379706>]  [<ffffffff81379706>] plist_add+0x36/0xa0
[   62.157217] RSP: 0018:ffff8801ac7f3cd8  EFLAGS: 00010002
[   62.157297] RAX: fffffffffffffff8 RBX: ffff8801a93eae40 RCX: ffff8801ac48e048
[   62.157386] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8801a93eae40
[   62.157475] RBP: ffff8801ac7f3cf8 R08: 0000000000005356 R09: 0000000000004000
[   62.157564] R10: 0000000000000000 R11: 00000000fffffffe R12: ffffffff8195a2a0
[   62.157590] R13: ffff8801a93eae58 R14: 0000000000003e80 R15: ffffffff8195a2b0
[   62.157590] FS:  00007f79749f5860(0000) GS:ffff880001c80000(0000) knlGS:0000000000000000
[   62.157590] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   62.157590] CR2: 0000000000000000 CR3: 00000001aef63000 CR4: 00000000000006e0
[   62.157590] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   62.157590] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[   62.157590] Process mplayer (pid: 3053, threadinfo ffff8801ac7f2000, task ffff8801ac535cd0)
[   62.157590] Stack:
[   62.157590]  ffff8801ac7f3d18 ffffffff8195a2a0 ffff8801a93eae40 00000000ffffffff
[   62.157590] <0> ffff8801ac7f3d48 ffffffff8105f35b ffff880100000000 0000000000000292
[   62.157590] <0> ffff8801ac7f3d58 ffff8801a93eae40 ffff8801ad849400 ffff8801ad849c00
[   62.157590] Call Trace:
[   62.157590]  [<ffffffff8105f35b>] update_target+0x12b/0x140
[   62.157590]  [<ffffffff8105f5aa>] pm_qos_add_request+0x4a/0x80
[   62.157590]  [<ffffffff8157e8a4>] snd_pcm_hw_params+0x2d4/0x3a0
[   62.157590]  [<ffffffff8157ed41>] snd_pcm_common_ioctl1+0xb1/0xb90
[   62.157590]  [<ffffffff810bcbc2>] ? handle_mm_fault+0x192/0xa50
[   62.157590]  [<ffffffff8157facd>] snd_pcm_playback_ioctl1+0x3d/0x220
[   62.157590]  [<ffffffff81698b9f>] ? do_page_fault+0x17f/0x440
[   62.157590]  [<ffffffff8158035d>] snd_pcm_playback_ioctl+0x3d/0x50
[   62.157590]  [<ffffffff810e8787>] do_vfs_ioctl+0x97/0x530
[   62.157590]  [<ffffffff810c493c>] ? do_mmap_pgoff+0x34c/0x3a0
[   62.157590]  [<ffffffff810e8c6a>] sys_ioctl+0x4a/0x80
[   62.157590]  [<ffffffff81002ceb>] system_call_fastpath+0x16/0x1b
[   62.157590] Code: 54 49 89 f4 53 48 89 fb 48 83 ec 08 4c 3b 6f 18 75 69 49 8b 04 24 48 83 e8 08 eb 0f 90 8b 30 39 33 7c 18 66 90 74 4e 48 8d 42 f8 <48> 8b 50 08 48 8d 48 08 4c 39 e1 0f 18 0a 75 e2 48 8b 50 10 48 
[   62.157590] RIP  [<ffffffff81379706>] plist_add+0x36/0xa0
[   62.157590]  RSP <ffff8801ac7f3cd8>
[   62.157590] CR2: 0000000000000000
[   62.157590] ---[ end trace 853ed59ac5273c1f ]---

* Re: [linux-pm] [2.6.36-rc4/HEAD] unable to handle kernel NULL pointer dereference (plist_add)
From: Rafael J. Wysocki @ 2010-09-14 20:38 UTC (permalink / raw)
  To: Simon Kirby
  Cc: linux-pm, linux-kernel, James Bottomley, mark gross, Takashi Iwai

On Monday, September 13, 2010, Simon Kirby wrote:
> Hi!
> 
> At first I thought I was hitting a suspend bug, but it was because I had an
> rmmod/modprobe of snd_ice1724 in my "gosleep" script to work around my
> sound sounding crackly after resume.  I can reproduce this same crash
> simply by using sound, rmmod snd_ice1724, modprobe snd_ice1724, and then
> using sound again.
> 
> I git-bisected to:
> 82f682514a5df89ffb3890627eebf0897b7a84ec is the first bad commit
> commit 82f682514a5df89ffb3890627eebf0897b7a84ec
> Author: James Bottomley <James.Bottomley@suse.de>
> Date:   Mon Jul 5 22:53:06 2010 +0200
> 
>     pm_qos: Get rid of the allocation in pm_qos_add_request()
> 
>     All current users of pm_qos_add_request() have the ability to supply
>     the memory required by the pm_qos routines, so make them do this and
>     eliminate the kmalloc() with pm_qos_add_request().  This has the
>     double benefit of making the call never fail and allowing it to be
>     called from atomic context.
> 
>     Signed-off-by: James Bottomley <James.Bottomley@suse.de>
>     Signed-off-by: mark gross <markgross@thegnar.org>
>     Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
> 
> :040000 040000 0370d037cfe56ae4079bc47713b98c5961de258f 92d6437da54649496c84bacc16ef4b17f8087bbd M      drivers
> :040000 040000 6ec551133758633de70e97b4a41d893a5c050e74 4ac17ea227c07d9cf7dae297a40bbbcc374c232f M      include
> :040000 040000 d0bb4b15e77a0976940af48588ef0e9b9fd7590b ac682974217c7e75a771b92c8efa7c83659d2916 M      kernel
> :040000 040000 9ab6e37ec622354cb93abc2b7424b10263bcd007 011680bad3da4b60aebb659eaeb85a037e2513f3 M      sound
> 
> ...but I can't revert this over HEAD due to conflicts and I can't see
> what's wrong in the diff.  2.6.35 is fine.
> 
> Simon-
> 
> [   61.234230] ICE1724 0000:01:06.0: PCI INT A disabled
> [   61.240010] ICE1724 0000:01:06.0: PCI INT A -> GSI 21 (level, low) -> IRQ 21
> [   62.156008] BUG: unable to handle kernel NULL pointer dereference at (null)
> [   62.156008] IP: [<ffffffff81379706>] plist_add+0x36/0xa0
> [   62.156008] PGD 1ad813067 PUD 1ad989067 PMD 0 
> [   62.156008] Oops: 0000 [#1] SMP 
> [   62.156008] last sysfs file: /sys/devices/pci0000:00/0000:00:14.4/0000:01:06.0/sound/card0/uevent
> [   62.156118] CPU 1 
> [   62.156147] Modules linked in: snd_ice1724 sco bnep rfcomm l2cap bluetooth ppdev hwmon_vid usb_storage tun i2c_viapro snd_rawmidi snd_ice17xx_ak4xxx snd_ac97_codec ac97_bus snd_ak4xxx_adda snd_ak4114 snd_pt2258 snd_i2c parport_pc r8169 parport snd_ak4113 k10temp [last unloaded: snd_ice1724]
> [   62.156849] 
> [   62.156925] Pid: 3053, comm: mplayer Not tainted 2.6.36-rc4-oof+ #6 M4A79T Deluxe/System Product Name
> [   62.157058] RIP: 0010:[<ffffffff81379706>]  [<ffffffff81379706>] plist_add+0x36/0xa0
> [   62.157217] RSP: 0018:ffff8801ac7f3cd8  EFLAGS: 00010002
> [   62.157297] RAX: fffffffffffffff8 RBX: ffff8801a93eae40 RCX: ffff8801ac48e048
> [   62.157386] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8801a93eae40
> [   62.157475] RBP: ffff8801ac7f3cf8 R08: 0000000000005356 R09: 0000000000004000
> [   62.157564] R10: 0000000000000000 R11: 00000000fffffffe R12: ffffffff8195a2a0
> [   62.157590] R13: ffff8801a93eae58 R14: 0000000000003e80 R15: ffffffff8195a2b0
> [   62.157590] FS:  00007f79749f5860(0000) GS:ffff880001c80000(0000) knlGS:0000000000000000
> [   62.157590] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [   62.157590] CR2: 0000000000000000 CR3: 00000001aef63000 CR4: 00000000000006e0
> [   62.157590] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [   62.157590] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> [   62.157590] Process mplayer (pid: 3053, threadinfo ffff8801ac7f2000, task ffff8801ac535cd0)
> [   62.157590] Stack:
> [   62.157590]  ffff8801ac7f3d18 ffffffff8195a2a0 ffff8801a93eae40 00000000ffffffff
> [   62.157590] <0> ffff8801ac7f3d48 ffffffff8105f35b ffff880100000000 0000000000000292
> [   62.157590] <0> ffff8801ac7f3d58 ffff8801a93eae40 ffff8801ad849400 ffff8801ad849c00
> [   62.157590] Call Trace:
> [   62.157590]  [<ffffffff8105f35b>] update_target+0x12b/0x140
> [   62.157590]  [<ffffffff8105f5aa>] pm_qos_add_request+0x4a/0x80
> [   62.157590]  [<ffffffff8157e8a4>] snd_pcm_hw_params+0x2d4/0x3a0
> [   62.157590]  [<ffffffff8157ed41>] snd_pcm_common_ioctl1+0xb1/0xb90
> [   62.157590]  [<ffffffff810bcbc2>] ? handle_mm_fault+0x192/0xa50
> [   62.157590]  [<ffffffff8157facd>] snd_pcm_playback_ioctl1+0x3d/0x220
> [   62.157590]  [<ffffffff81698b9f>] ? do_page_fault+0x17f/0x440
> [   62.157590]  [<ffffffff8158035d>] snd_pcm_playback_ioctl+0x3d/0x50
> [   62.157590]  [<ffffffff810e8787>] do_vfs_ioctl+0x97/0x530
> [   62.157590]  [<ffffffff810c493c>] ? do_mmap_pgoff+0x34c/0x3a0
> [   62.157590]  [<ffffffff810e8c6a>] sys_ioctl+0x4a/0x80
> [   62.157590]  [<ffffffff81002ceb>] system_call_fastpath+0x16/0x1b
> [   62.157590] Code: 54 49 89 f4 53 48 89 fb 48 83 ec 08 4c 3b 6f 18 75 69 49 8b 04 24 48 83 e8 08 eb 0f 90 8b 30 39 33 7c 18 66 90 74 4e 48 8d 42 f8 <48> 8b 50 08 48 8d 48 08 4c 39 e1 0f 18 0a 75 e2 48 8b 50 10 48 
> [   62.157590] RIP  [<ffffffff81379706>] plist_add+0x36/0xa0
> [   62.157590]  RSP <ffff8801ac7f3cd8>
> [   62.157590] CR2: 0000000000000000
> [   62.157590] ---[ end trace 853ed59ac5273c1f ]---

Hmm, interesting.  This looks like a plist corruption to me, but can you please
check (using gdb) what line of code corresponds to the address
plist_add+0x36/0xa0 ?

Thanks,
Rafael
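
As a hedged aid to that gdb check: decoding the Code: bytes by hand suggests
the faulting instruction (<48> 8b 50 08) is "mov 0x8(%rax),%rdx", preceded by
"lea -0x8(%rdx),%rax" (48 8d 42 f8).  With RDX = 0 and RAX = fffffffffffffff8
in the dump, that would be consistent with a list walk that followed a NULL
next pointer, subtracted the member offset to get the containing entry, and
then loaded that entry's own link from address 0 (CR2).  The userspace sketch
below reproduces only that arithmetic.  The struct layout and the helper names
are assumptions made for illustration, not the kernel's definitions, and the
decoding itself is worth confirming with gdb.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins; the layout is an assumption, not the kernel's. */
struct list_head {
	struct list_head *next, *prev;
};

struct plist_node_model {
	int prio;
	struct list_head prio_list;	/* link used by the sorted walk */
	struct list_head node_list;
};

/*
 * Address a container_of()-style lookup yields for the entry whose
 * prio_list link lives at 'link'.  Done on integers so that a NULL
 * link can be modelled without dereferencing anything.
 */
static uintptr_t entry_addr_from_link(uintptr_t link)
{
	return link - offsetof(struct plist_node_model, prio_list);
}

/* Address that is read when the walk fetches entry->prio_list.next. */
static uintptr_t next_load_addr(uintptr_t entry)
{
	return entry + offsetof(struct plist_node_model, prio_list)
		     + offsetof(struct list_head, next);
}
```

Under these assumptions, entry_addr_from_link(0) wraps around to just below
zero (fffffffffffffff8 on a 64-bit build where the link sits at offset 8), and
next_load_addr() of that value comes back to 0, which is the pattern RAX and
CR2 show in the oops.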

* Re: [linux-pm] [2.6.36-rc4/HEAD] unable to handle kernel NULL pointer dereference (plist_add)
From: Rafael J. Wysocki @ 2010-09-14 22:06 UTC (permalink / raw)
  To: Simon Kirby
  Cc: linux-pm, linux-kernel, James Bottomley, mark gross, Takashi Iwai

On Tuesday, September 14, 2010, Rafael J. Wysocki wrote:
> On Monday, September 13, 2010, Simon Kirby wrote:
> > Hi!
> > 
> > At first I thought I was hitting a suspend bug, but it was because I had an
> > rmmod/modprobe of snd_ice1724 in my "gosleep" script to work around my
> > sound sounding crackly after resume.  I can reproduce this same crash
> > simply by using sound, rmmod snd_ice1724, modprobe snd_ice1724, and then
> > using sound again.
> > 
> > I git-bisected to:
> > 82f682514a5df89ffb3890627eebf0897b7a84ec is the first bad commit
> > commit 82f682514a5df89ffb3890627eebf0897b7a84ec
> > Author: James Bottomley <James.Bottomley@suse.de>
> > Date:   Mon Jul 5 22:53:06 2010 +0200
> > 
> >     pm_qos: Get rid of the allocation in pm_qos_add_request()
> > 
> >     All current users of pm_qos_add_request() have the ability to supply
> >     the memory required by the pm_qos routines, so make them do this and
> >     eliminate the kmalloc() with pm_qos_add_request().  This has the
> >     double benefit of making the call never fail and allowing it to be
> >     called from atomic context.
> > 
> >     Signed-off-by: James Bottomley <James.Bottomley@suse.de>
> >     Signed-off-by: mark gross <markgross@thegnar.org>
> >     Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
> > 
> > :040000 040000 0370d037cfe56ae4079bc47713b98c5961de258f 92d6437da54649496c84bacc16ef4b17f8087bbd M      drivers
> > :040000 040000 6ec551133758633de70e97b4a41d893a5c050e74 4ac17ea227c07d9cf7dae297a40bbbcc374c232f M      include
> > :040000 040000 d0bb4b15e77a0976940af48588ef0e9b9fd7590b ac682974217c7e75a771b92c8efa7c83659d2916 M      kernel
> > :040000 040000 9ab6e37ec622354cb93abc2b7424b10263bcd007 011680bad3da4b60aebb659eaeb85a037e2513f3 M      sound
> > 
> > ...but I can't revert this over HEAD due to conflicts and I can't see
> > what's wrong in the diff.  2.6.35 is fine.
> > 
> > Simon-
> > 
> > [   61.234230] ICE1724 0000:01:06.0: PCI INT A disabled
> > [   61.240010] ICE1724 0000:01:06.0: PCI INT A -> GSI 21 (level, low) -> IRQ 21
> > [   62.156008] BUG: unable to handle kernel NULL pointer dereference at (null)
> > [   62.156008] IP: [<ffffffff81379706>] plist_add+0x36/0xa0
> > [   62.156008] PGD 1ad813067 PUD 1ad989067 PMD 0 
> > [   62.156008] Oops: 0000 [#1] SMP 
> > [   62.156008] last sysfs file: /sys/devices/pci0000:00/0000:00:14.4/0000:01:06.0/sound/card0/uevent
> > [   62.156118] CPU 1 
> > [   62.156147] Modules linked in: snd_ice1724 sco bnep rfcomm l2cap bluetooth ppdev hwmon_vid usb_storage tun i2c_viapro snd_rawmidi snd_ice17xx_ak4xxx snd_ac97_codec ac97_bus snd_ak4xxx_adda snd_ak4114 snd_pt2258 snd_i2c parport_pc r8169 parport snd_ak4113 k10temp [last unloaded: snd_ice1724]
> > [   62.156849] 
> > [   62.156925] Pid: 3053, comm: mplayer Not tainted 2.6.36-rc4-oof+ #6 M4A79T Deluxe/System Product Name
> > [   62.157058] RIP: 0010:[<ffffffff81379706>]  [<ffffffff81379706>] plist_add+0x36/0xa0
> > [   62.157217] RSP: 0018:ffff8801ac7f3cd8  EFLAGS: 00010002
> > [   62.157297] RAX: fffffffffffffff8 RBX: ffff8801a93eae40 RCX: ffff8801ac48e048
> > [   62.157386] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8801a93eae40
> > [   62.157475] RBP: ffff8801ac7f3cf8 R08: 0000000000005356 R09: 0000000000004000
> > [   62.157564] R10: 0000000000000000 R11: 00000000fffffffe R12: ffffffff8195a2a0
> > [   62.157590] R13: ffff8801a93eae58 R14: 0000000000003e80 R15: ffffffff8195a2b0
> > [   62.157590] FS:  00007f79749f5860(0000) GS:ffff880001c80000(0000) knlGS:0000000000000000
> > [   62.157590] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [   62.157590] CR2: 0000000000000000 CR3: 00000001aef63000 CR4: 00000000000006e0
> > [   62.157590] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > [   62.157590] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> > [   62.157590] Process mplayer (pid: 3053, threadinfo ffff8801ac7f2000, task ffff8801ac535cd0)
> > [   62.157590] Stack:
> > [   62.157590]  ffff8801ac7f3d18 ffffffff8195a2a0 ffff8801a93eae40 00000000ffffffff
> > [   62.157590] <0> ffff8801ac7f3d48 ffffffff8105f35b ffff880100000000 0000000000000292
> > [   62.157590] <0> ffff8801ac7f3d58 ffff8801a93eae40 ffff8801ad849400 ffff8801ad849c00
> > [   62.157590] Call Trace:
> > [   62.157590]  [<ffffffff8105f35b>] update_target+0x12b/0x140
> > [   62.157590]  [<ffffffff8105f5aa>] pm_qos_add_request+0x4a/0x80
> > [   62.157590]  [<ffffffff8157e8a4>] snd_pcm_hw_params+0x2d4/0x3a0
> > [   62.157590]  [<ffffffff8157ed41>] snd_pcm_common_ioctl1+0xb1/0xb90
> > [   62.157590]  [<ffffffff810bcbc2>] ? handle_mm_fault+0x192/0xa50
> > [   62.157590]  [<ffffffff8157facd>] snd_pcm_playback_ioctl1+0x3d/0x220
> > [   62.157590]  [<ffffffff81698b9f>] ? do_page_fault+0x17f/0x440
> > [   62.157590]  [<ffffffff8158035d>] snd_pcm_playback_ioctl+0x3d/0x50
> > [   62.157590]  [<ffffffff810e8787>] do_vfs_ioctl+0x97/0x530
> > [   62.157590]  [<ffffffff810c493c>] ? do_mmap_pgoff+0x34c/0x3a0
> > [   62.157590]  [<ffffffff810e8c6a>] sys_ioctl+0x4a/0x80
> > [   62.157590]  [<ffffffff81002ceb>] system_call_fastpath+0x16/0x1b
> > [   62.157590] Code: 54 49 89 f4 53 48 89 fb 48 83 ec 08 4c 3b 6f 18 75 69 49 8b 04 24 48 83 e8 08 eb 0f 90 8b 30 39 33 7c 18 66 90 74 4e 48 8d 42 f8 <48> 8b 50 08 48 8d 48 08 4c 39 e1 0f 18 0a 75 e2 48 8b 50 10 48 
> > [   62.157590] RIP  [<ffffffff81379706>] plist_add+0x36/0xa0
> > [   62.157590]  RSP <ffff8801ac7f3cd8>
> > [   62.157590] CR2: 0000000000000000
> > [   62.157590] ---[ end trace 853ed59ac5273c1f ]---
> 
> Hmm, interesting.  This looks like a plist corruption to me, but can you please
> check (using gdb) what line of code corresponds to the address
> plist_add+0x36/0xa0 ?

Well, my current theory is that the list member of latency_pm_qos_req in
struct snd_pcm_substream gets corrupted because of the changed size of the
structure.  Then, deleting it from the plist corrupts the plist itself, which
only turns up when we try to add a new item to it.
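
One way to picture the kind of failure this theory describes, as a userspace
sketch and not a diagnosis: a node embedded in a larger structure has its
memory changed while the list still points at it, and nothing goes wrong until
the next walk over the list.  The sketch simplifies the theory (it zeroes the
node instead of modelling the deletion step), and every name in it is
hypothetical.  The walk reports a NULL link instead of dereferencing it, which
is the point where the kernel would oops.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct list_head {
	struct list_head *next, *prev;
};

/* Hypothetical stand-in for a request embedded in a larger structure. */
struct request_model {
	int prio;
	struct list_head link;
};

static void list_init(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

static void list_add_tail_model(struct list_head *new, struct list_head *head)
{
	struct list_head *prev = head->prev;

	new->next = head;
	new->prev = prev;
	prev->next = new;
	head->prev = new;
}

/* Walk as a sorted insert would: 0 if intact, -1 on a NULL link. */
static int walk_is_intact(const struct list_head *head)
{
	const struct list_head *pos;

	for (pos = head->next; pos != head; pos = pos->next)
		if (pos == NULL)
			return -1;
	return 0;
}

/* Link a node, clobber its memory while it is linked, walk again. */
static int demo_stale_node(void)
{
	struct list_head head;
	struct request_model req = { .prio = 10 };

	list_init(&head);
	list_add_tail_model(&req.link, &head);
	if (walk_is_intact(&head) != 0)
		return 1;			/* not expected */

	memset(&req, 0, sizeof(req));		/* assumed corruption */
	return walk_is_intact(&head);		/* head still points at req */
}
```

In this model the list head itself is never touched; the damage sits in the
node, so it only becomes visible to whoever walks the list next.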

If that's really the case, the (untested!) patch below should help (unless I broke it).

Thanks,
Rafael


---
 include/sound/pcm.h     |    2 +-
 sound/core/pcm_native.c |   40 +++++++++++++++++++++++++++++++++++-----
 2 files changed, 36 insertions(+), 6 deletions(-)

Index: linux-2.6/include/sound/pcm.h
===================================================================
--- linux-2.6.orig/include/sound/pcm.h
+++ linux-2.6/include/sound/pcm.h
@@ -370,7 +370,7 @@ struct snd_pcm_substream {
 	int number;
 	char name[32];			/* substream name */
 	int stream;			/* stream (direction) */
-	struct pm_qos_request_list latency_pm_qos_req; /* pm_qos request */
+	struct pm_qos_request_list *latency_pm_qos_req; /* pm_qos request */
 	size_t buffer_bytes_max;	/* limit ring buffer size */
 	struct snd_dma_buffer dma_buffer;
 	unsigned int dma_buf_id;
Index: linux-2.6/sound/core/pcm_native.c
===================================================================
--- linux-2.6.orig/sound/core/pcm_native.c
+++ linux-2.6/sound/core/pcm_native.c
@@ -369,6 +369,38 @@ static int period_to_usecs(struct snd_pc
 	return usecs;
 }
 
+static void snd_pcm_add_pm_qos_request(struct snd_pcm_substream *substream,
+				       int usecs)
+{
+	struct pm_qos_request_list *pm_qos_req;
+
+	if (substream->latency_pm_qos_req)
+		return;
+
+	pm_qos_req = kzalloc(sizeof(*pm_qos_req), GFP_KERNEL);
+	if (!pm_qos_req) {
+		snd_printd("failure to allocate PM QoS request object\n");
+		return;
+	}
+
+	pm_qos_add_request(pm_qos_req, PM_QOS_CPU_DMA_LATENCY, usecs);
+	substream->latency_pm_qos_req = pm_qos_req;
+}
+
+static void snd_pcm_remove_pm_qos_request(struct snd_pcm_substream *substream)
+{
+	struct pm_qos_request_list *pm_qos_req = substream->latency_pm_qos_req;
+
+	if (!pm_qos_req)
+		return;
+
+	if (pm_qos_request_active(pm_qos_req))
+		pm_qos_remove_request(pm_qos_req);
+	kfree(pm_qos_req);
+
+	substream->latency_pm_qos_req = NULL;
+}
+
 static int snd_pcm_hw_params(struct snd_pcm_substream *substream,
 			     struct snd_pcm_hw_params *params)
 {
@@ -451,11 +483,9 @@ static int snd_pcm_hw_params(struct snd_
 	snd_pcm_timer_resolution_change(substream);
 	runtime->status->state = SNDRV_PCM_STATE_SETUP;
 
-	if (pm_qos_request_active(&substream->latency_pm_qos_req))
-		pm_qos_remove_request(&substream->latency_pm_qos_req);
+	snd_pcm_remove_pm_qos_request(substream);
 	if ((usecs = period_to_usecs(runtime)) >= 0)
-		pm_qos_add_request(&substream->latency_pm_qos_req,
-				   PM_QOS_CPU_DMA_LATENCY, usecs);
+		snd_pcm_add_pm_qos_request(substream, usecs);
 	return 0;
  _error:
 	/* hardware might be unuseable from this time,
@@ -510,7 +540,7 @@ static int snd_pcm_hw_free(struct snd_pc
 	if (substream->ops->hw_free)
 		result = substream->ops->hw_free(substream);
 	runtime->status->state = SNDRV_PCM_STATE_OPEN;
-	pm_qos_remove_request(&substream->latency_pm_qos_req);
+	snd_pcm_remove_pm_qos_request(substream);
 	return result;
 }
 

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [2.6.36-rc4/HEAD] unable to handle kernel NULL pointer dereference (plist_add)
  2010-09-14 20:38 ` [linux-pm] " Rafael J. Wysocki
  2010-09-14 22:06   ` Rafael J. Wysocki
@ 2010-09-14 22:06   ` Rafael J. Wysocki
  2010-09-15 23:46   ` [2.6.36-rc4/HEAD] unable to handle kernel NULL pointer dereference?(plist_add) Simon Kirby
  2010-09-15 23:46   ` [linux-pm] " Simon Kirby
  3 siblings, 0 replies; 22+ messages in thread
From: Rafael J. Wysocki @ 2010-09-14 22:06 UTC (permalink / raw)
  To: Simon Kirby
  Cc: Takashi Iwai, linux-pm, James Bottomley, linux-kernel, mark gross

On Tuesday, September 14, 2010, Rafael J. Wysocki wrote:
> On Monday, September 13, 2010, Simon Kirby wrote:
> > Hi!
> > 
> > At first I thought I hitting a suspend bug, but it was because I had an
> > rmmod/modprobe of snd_ice1724 in my "gosleep" script to work around my
> > sound sounding crackly after resume.  I can reproduce this same crash
> > simply by using sound, rmmod snd_ice1724, modprobe snd_ice1724, and then
> > using sound again.
> > 
> > I git-bisected to:
> > 82f682514a5df89ffb3890627eebf0897b7a84ec is the first bad commit
> > commit 82f682514a5df89ffb3890627eebf0897b7a84ec
> > Author: James Bottomley <James.Bottomley@suse.de>
> > Date:   Mon Jul 5 22:53:06 2010 +0200
> > 
> >     pm_qos: Get rid of the allocation in pm_qos_add_request()
> > 
> >     All current users of pm_qos_add_request() have the ability to supply
> >     the memory required by the pm_qos routines, so make them do this and
> >     eliminate the kmalloc() with pm_qos_add_request().  This has the
> >     double benefit of making the call never fail and allowing it to be
> >     called from atomic context.
> > 
> >     Signed-off-by: James Bottomley <James.Bottomley@suse.de>
> >     Signed-off-by: mark gross <markgross@thegnar.org>
> >     Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
> > 
> > :040000 040000 0370d037cfe56ae4079bc47713b98c5961de258f 92d6437da54649496c84bacc16ef4b17f8087bbd M      drivers
> > :040000 040000 6ec551133758633de70e97b4a41d893a5c050e74 4ac17ea227c07d9cf7dae297a40bbbcc374c232f M      include
> > :040000 040000 d0bb4b15e77a0976940af48588ef0e9b9fd7590b ac682974217c7e75a771b92c8efa7c83659d2916 M      kernel
> > :040000 040000 9ab6e37ec622354cb93abc2b7424b10263bcd007 011680bad3da4b60aebb659eaeb85a037e2513f3 M      sound
> > 
> > ...but I can't revert this over HEAD due to conflicts and I can't see
> > what's wrong in the diff.  2.6.35 is fine.
> > 
> > Simon-
> > 
> > [   61.234230] ICE1724 0000:01:06.0: PCI INT A disabled
> > [   61.240010] ICE1724 0000:01:06.0: PCI INT A -> GSI 21 (level, low) -> IRQ 21
> > [   62.156008] BUG: unable to handle kernel NULL pointer dereference at (null)
> > [   62.156008] IP: [<ffffffff81379706>] plist_add+0x36/0xa0
> > [   62.156008] PGD 1ad813067 PUD 1ad989067 PMD 0 
> > [   62.156008] Oops: 0000 [#1] SMP 
> > [   62.156008] last sysfs file: /sys/devices/pci0000:00/0000:00:14.4/0000:01:06.0/sound/card0/uevent
> > [   62.156118] CPU 1 
> > [   62.156147] Modules linked in: snd_ice1724 sco bnep rfcomm l2cap bluetooth ppdev hwmon_vid usb_storage tun i2c_viapro snd_rawmidi snd_ice17xx_ak4xxx snd_ac97_codec ac97_bus snd_ak4xxx_adda snd_ak4114 snd_pt2258 snd_i2c parport_pc r8169 parport snd_ak4113 k10temp [last unloaded: snd_ice1724]
> > [   62.156849] 
> > [   62.156925] Pid: 3053, comm: mplayer Not tainted 2.6.36-rc4-oof+ #6 M4A79T Deluxe/System Product Name
> > [   62.157058] RIP: 0010:[<ffffffff81379706>]  [<ffffffff81379706>] plist_add+0x36/0xa0
> > [   62.157217] RSP: 0018:ffff8801ac7f3cd8  EFLAGS: 00010002
> > [   62.157297] RAX: fffffffffffffff8 RBX: ffff8801a93eae40 RCX: ffff8801ac48e048
> > [   62.157386] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8801a93eae40
> > [   62.157475] RBP: ffff8801ac7f3cf8 R08: 0000000000005356 R09: 0000000000004000
> > [   62.157564] R10: 0000000000000000 R11: 00000000fffffffe R12: ffffffff8195a2a0
> > [   62.157590] R13: ffff8801a93eae58 R14: 0000000000003e80 R15: ffffffff8195a2b0
> > [   62.157590] FS:  00007f79749f5860(0000) GS:ffff880001c80000(0000) knlGS:0000000000000000
> > [   62.157590] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [   62.157590] CR2: 0000000000000000 CR3: 00000001aef63000 CR4: 00000000000006e0
> > [   62.157590] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > [   62.157590] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> > [   62.157590] Process mplayer (pid: 3053, threadinfo ffff8801ac7f2000, task ffff8801ac535cd0)
> > [   62.157590] Stack:
> > [   62.157590]  ffff8801ac7f3d18 ffffffff8195a2a0 ffff8801a93eae40 00000000ffffffff
> > [   62.157590] <0> ffff8801ac7f3d48 ffffffff8105f35b ffff880100000000 0000000000000292
> > [   62.157590] <0> ffff8801ac7f3d58 ffff8801a93eae40 ffff8801ad849400 ffff8801ad849c00
> > [   62.157590] Call Trace:
> > [   62.157590]  [<ffffffff8105f35b>] update_target+0x12b/0x140
> > [   62.157590]  [<ffffffff8105f5aa>] pm_qos_add_request+0x4a/0x80
> > [   62.157590]  [<ffffffff8157e8a4>] snd_pcm_hw_params+0x2d4/0x3a0
> > [   62.157590]  [<ffffffff8157ed41>] snd_pcm_common_ioctl1+0xb1/0xb90
> > [   62.157590]  [<ffffffff810bcbc2>] ? handle_mm_fault+0x192/0xa50
> > [   62.157590]  [<ffffffff8157facd>] snd_pcm_playback_ioctl1+0x3d/0x220
> > [   62.157590]  [<ffffffff81698b9f>] ? do_page_fault+0x17f/0x440
> > [   62.157590]  [<ffffffff8158035d>] snd_pcm_playback_ioctl+0x3d/0x50
> > [   62.157590]  [<ffffffff810e8787>] do_vfs_ioctl+0x97/0x530
> > [   62.157590]  [<ffffffff810c493c>] ? do_mmap_pgoff+0x34c/0x3a0
> > [   62.157590]  [<ffffffff810e8c6a>] sys_ioctl+0x4a/0x80
> > [   62.157590]  [<ffffffff81002ceb>] system_call_fastpath+0x16/0x1b
> > [   62.157590] Code: 54 49 89 f4 53 48 89 fb 48 83 ec 08 4c 3b 6f 18 75 69 49 8b 04 24 48 83 e8 08 eb 0f 90 8b 30 39 33 7c 18 66 90 74 4e 48 8d 42 f8 <48> 8b 50 08 48 8d 48 08 4c 39 e1 0f 18 0a 75 e2 48 8b 50 10 48 
> > [   62.157590] RIP  [<ffffffff81379706>] plist_add+0x36/0xa0
> > [   62.157590]  RSP <ffff8801ac7f3cd8>
> > [   62.157590] CR2: 0000000000000000
> > [   62.157590] ---[ end trace 853ed59ac5273c1f ]---
> 
> Hmm, interesting.  This looks like a plist corruption to me, but can you please
> check (using gdb) what line of code corresponds to the address
> plist_add+0x36/0xa0 ?

Well, my current theory is that the list member of latency_pm_qos_req in
struct snd_pcm_substream gets corrupted because of the changed size of the
structure.  Then, deleting it from the plist corrupts the plist itself, which
only turns up when we try to add a new item to it.

If that's really the case, the (untested!) patch below should help (unless I broke it).

Thanks,
Rafael


---
 include/sound/pcm.h     |    2 +-
 sound/core/pcm_native.c |   40 +++++++++++++++++++++++++++++++++++-----
 2 files changed, 36 insertions(+), 6 deletions(-)

Index: linux-2.6/include/sound/pcm.h
===================================================================
--- linux-2.6.orig/include/sound/pcm.h
+++ linux-2.6/include/sound/pcm.h
@@ -370,7 +370,7 @@ struct snd_pcm_substream {
 	int number;
 	char name[32];			/* substream name */
 	int stream;			/* stream (direction) */
-	struct pm_qos_request_list latency_pm_qos_req; /* pm_qos request */
+	struct pm_qos_request_list *latency_pm_qos_req; /* pm_qos request */
 	size_t buffer_bytes_max;	/* limit ring buffer size */
 	struct snd_dma_buffer dma_buffer;
 	unsigned int dma_buf_id;
Index: linux-2.6/sound/core/pcm_native.c
===================================================================
--- linux-2.6.orig/sound/core/pcm_native.c
+++ linux-2.6/sound/core/pcm_native.c
@@ -369,6 +369,38 @@ static int period_to_usecs(struct snd_pc
 	return usecs;
 }
 
+static void snd_pcm_add_pm_qos_request(struct snd_pcm_substream *substream,
+				       int usecs)
+{
+	struct pm_qos_request_list *pm_qos_req;
+
+	if (substream->latency_pm_qos_req)
+		return;
+
+	pm_qos_req = kzalloc(sizeof(*pm_qos_req), GFP_KERNEL);
+	if (!pm_qos_req) {
+		snd_printd("failure to allocate PM QoS request object");
+		return;
+	}
+
+	pm_qos_add_request(pm_qos_req, PM_QOS_CPU_DMA_LATENCY, usecs);
+	substream->latency_pm_qos_req = pm_qos_req;
+}
+
+static void snd_pcm_remove_pm_qos_request(struct snd_pcm_substream *substream)
+{
+	struct pm_qos_request_list *pm_qos_req = substream->latency_pm_qos_req;
+
+	if (!pm_qos_req)
+		return;
+
+	if (pm_qos_request_active(pm_qos_req))
+		pm_qos_remove_request(pm_qos_req);
+	kfree(pm_qos_req);
+
+	substream->latency_pm_qos_req = NULL;
+}
+
 static int snd_pcm_hw_params(struct snd_pcm_substream *substream,
 			     struct snd_pcm_hw_params *params)
 {
@@ -451,11 +483,9 @@ static int snd_pcm_hw_params(struct snd_
 	snd_pcm_timer_resolution_change(substream);
 	runtime->status->state = SNDRV_PCM_STATE_SETUP;
 
-	if (pm_qos_request_active(&substream->latency_pm_qos_req))
-		pm_qos_remove_request(&substream->latency_pm_qos_req);
+	snd_pcm_remove_pm_qos_request(substream);
 	if ((usecs = period_to_usecs(runtime)) >= 0)
-		pm_qos_add_request(&substream->latency_pm_qos_req,
-				   PM_QOS_CPU_DMA_LATENCY, usecs);
+		snd_pcm_add_pm_qos_request(substream, usecs);
 	return 0;
  _error:
 	/* hardware might be unuseable from this time,
@@ -510,7 +540,7 @@ static int snd_pcm_hw_free(struct snd_pc
 	if (substream->ops->hw_free)
 		result = substream->ops->hw_free(substream);
 	runtime->status->state = SNDRV_PCM_STATE_OPEN;
-	pm_qos_remove_request(&substream->latency_pm_qos_req);
+	snd_pcm_remove_pm_qos_request(substream);
 	return result;
 }
 

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [linux-pm] [2.6.36-rc4/HEAD] unable to handle kernel NULL pointer  dereference (plist_add)
  2010-09-14 22:06   ` Rafael J. Wysocki
  2010-09-15  8:29     ` Takashi Iwai
@ 2010-09-15  8:29     ` Takashi Iwai
  2010-09-15 13:05       ` mark gross
                         ` (3 more replies)
  1 sibling, 4 replies; 22+ messages in thread
From: Takashi Iwai @ 2010-09-15  8:29 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Simon Kirby, linux-pm, linux-kernel, James Bottomley, mark gross

At Wed, 15 Sep 2010 00:06:26 +0200,
Rafael J. Wysocki wrote:
> 
> On Tuesday, September 14, 2010, Rafael J. Wysocki wrote:
> > On Monday, September 13, 2010, Simon Kirby wrote:
> > > Hi!
> > > 
> > > At first I thought I hitting a suspend bug, but it was because I had an
> > > rmmod/modprobe of snd_ice1724 in my "gosleep" script to work around my
> > > sound sounding crackly after resume.  I can reproduce this same crash
> > > simply by using sound, rmmod snd_ice1724, modprobe snd_ice1724, and then
> > > using sound again.
> > > 
> > > I git-bisected to:
> > > 82f682514a5df89ffb3890627eebf0897b7a84ec is the first bad commit
> > > commit 82f682514a5df89ffb3890627eebf0897b7a84ec
> > > Author: James Bottomley <James.Bottomley@suse.de>
> > > Date:   Mon Jul 5 22:53:06 2010 +0200
> > > 
> > >     pm_qos: Get rid of the allocation in pm_qos_add_request()
> > > 
> > >     All current users of pm_qos_add_request() have the ability to supply
> > >     the memory required by the pm_qos routines, so make them do this and
> > >     eliminate the kmalloc() with pm_qos_add_request().  This has the
> > >     double benefit of making the call never fail and allowing it to be
> > >     called from atomic context.
> > > 
> > >     Signed-off-by: James Bottomley <James.Bottomley@suse.de>
> > >     Signed-off-by: mark gross <markgross@thegnar.org>
> > >     Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
> > > 
> > > :040000 040000 0370d037cfe56ae4079bc47713b98c5961de258f 92d6437da54649496c84bacc16ef4b17f8087bbd M      drivers
> > > :040000 040000 6ec551133758633de70e97b4a41d893a5c050e74 4ac17ea227c07d9cf7dae297a40bbbcc374c232f M      include
> > > :040000 040000 d0bb4b15e77a0976940af48588ef0e9b9fd7590b ac682974217c7e75a771b92c8efa7c83659d2916 M      kernel
> > > :040000 040000 9ab6e37ec622354cb93abc2b7424b10263bcd007 011680bad3da4b60aebb659eaeb85a037e2513f3 M      sound
> > > 
> > > ...but I can't revert this over HEAD due to conflicts and I can't see
> > > what's wrong in the diff.  2.6.35 is fine.
> > > 
> > > Simon-
> > > 
> > > [   61.234230] ICE1724 0000:01:06.0: PCI INT A disabled
> > > [   61.240010] ICE1724 0000:01:06.0: PCI INT A -> GSI 21 (level, low) -> IRQ 21
> > > [   62.156008] BUG: unable to handle kernel NULL pointer dereference at (null)
> > > [   62.156008] IP: [<ffffffff81379706>] plist_add+0x36/0xa0
> > > [   62.156008] PGD 1ad813067 PUD 1ad989067 PMD 0 
> > > [   62.156008] Oops: 0000 [#1] SMP 
> > > [   62.156008] last sysfs file: /sys/devices/pci0000:00/0000:00:14.4/0000:01:06.0/sound/card0/uevent
> > > [   62.156118] CPU 1 
> > > [   62.156147] Modules linked in: snd_ice1724 sco bnep rfcomm l2cap bluetooth ppdev hwmon_vid usb_storage tun i2c_viapro snd_rawmidi snd_ice17xx_ak4xxx snd_ac97_codec ac97_bus snd_ak4xxx_adda snd_ak4114 snd_pt2258 snd_i2c parport_pc r8169 parport snd_ak4113 k10temp [last unloaded: snd_ice1724]
> > > [   62.156849] 
> > > [   62.156925] Pid: 3053, comm: mplayer Not tainted 2.6.36-rc4-oof+ #6 M4A79T Deluxe/System Product Name
> > > [   62.157058] RIP: 0010:[<ffffffff81379706>]  [<ffffffff81379706>] plist_add+0x36/0xa0
> > > [   62.157217] RSP: 0018:ffff8801ac7f3cd8  EFLAGS: 00010002
> > > [   62.157297] RAX: fffffffffffffff8 RBX: ffff8801a93eae40 RCX: ffff8801ac48e048
> > > [   62.157386] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8801a93eae40
> > > [   62.157475] RBP: ffff8801ac7f3cf8 R08: 0000000000005356 R09: 0000000000004000
> > > [   62.157564] R10: 0000000000000000 R11: 00000000fffffffe R12: ffffffff8195a2a0
> > > [   62.157590] R13: ffff8801a93eae58 R14: 0000000000003e80 R15: ffffffff8195a2b0
> > > [   62.157590] FS:  00007f79749f5860(0000) GS:ffff880001c80000(0000) knlGS:0000000000000000
> > > [   62.157590] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > [   62.157590] CR2: 0000000000000000 CR3: 00000001aef63000 CR4: 00000000000006e0
> > > [   62.157590] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > > [   62.157590] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> > > [   62.157590] Process mplayer (pid: 3053, threadinfo ffff8801ac7f2000, task ffff8801ac535cd0)
> > > [   62.157590] Stack:
> > > [   62.157590]  ffff8801ac7f3d18 ffffffff8195a2a0 ffff8801a93eae40 00000000ffffffff
> > > [   62.157590] <0> ffff8801ac7f3d48 ffffffff8105f35b ffff880100000000 0000000000000292
> > > [   62.157590] <0> ffff8801ac7f3d58 ffff8801a93eae40 ffff8801ad849400 ffff8801ad849c00
> > > [   62.157590] Call Trace:
> > > [   62.157590]  [<ffffffff8105f35b>] update_target+0x12b/0x140
> > > [   62.157590]  [<ffffffff8105f5aa>] pm_qos_add_request+0x4a/0x80
> > > [   62.157590]  [<ffffffff8157e8a4>] snd_pcm_hw_params+0x2d4/0x3a0
> > > [   62.157590]  [<ffffffff8157ed41>] snd_pcm_common_ioctl1+0xb1/0xb90
> > > [   62.157590]  [<ffffffff810bcbc2>] ? handle_mm_fault+0x192/0xa50
> > > [   62.157590]  [<ffffffff8157facd>] snd_pcm_playback_ioctl1+0x3d/0x220
> > > [   62.157590]  [<ffffffff81698b9f>] ? do_page_fault+0x17f/0x440
> > > [   62.157590]  [<ffffffff8158035d>] snd_pcm_playback_ioctl+0x3d/0x50
> > > [   62.157590]  [<ffffffff810e8787>] do_vfs_ioctl+0x97/0x530
> > > [   62.157590]  [<ffffffff810c493c>] ? do_mmap_pgoff+0x34c/0x3a0
> > > [   62.157590]  [<ffffffff810e8c6a>] sys_ioctl+0x4a/0x80
> > > [   62.157590]  [<ffffffff81002ceb>] system_call_fastpath+0x16/0x1b
> > > [   62.157590] Code: 54 49 89 f4 53 48 89 fb 48 83 ec 08 4c 3b 6f 18 75 69 49 8b 04 24 48 83 e8 08 eb 0f 90 8b 30 39 33 7c 18 66 90 74 4e 48 8d 42 f8 <48> 8b 50 08 48 8d 48 08 4c 39 e1 0f 18 0a 75 e2 48 8b 50 10 48 
> > > [   62.157590] RIP  [<ffffffff81379706>] plist_add+0x36/0xa0
> > > [   62.157590]  RSP <ffff8801ac7f3cd8>
> > > [   62.157590] CR2: 0000000000000000
> > > [   62.157590] ---[ end trace 853ed59ac5273c1f ]---
> > 
> > Hmm, interesting.  This looks like a plist corruption to me, but can you please
> > check (using gdb) what line of code corresponds to the address
> > plist_add+0x36/0xa0 ?
> 
> Well, my current theory is that the list member of latency_pm_qos_req in
> struct snd_pcm_substream gets corrupted because of the changed size of the
> structure.  Then, deleting it from the plist corrupts the plist itself, which
> only turns up when we try to add a new item to it.
> 
> If that's really the case, the (untested!) patch below should help (unless I broke it).

Could you first try the patch in kernel bugzilla 17922?
The problem might be an unreleased object, which is mostly triggered
only via OSS emulation.
    https://bugzilla.kernel.org/show_bug.cgi?id=17922


thanks,

Takashi

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [linux-pm] [2.6.36-rc4/HEAD] unable to handle kernel NULL pointer dereference (plist_add)
  2010-09-15  8:29     ` [linux-pm] " Takashi Iwai
@ 2010-09-15 13:05       ` mark gross
  2010-09-15 13:12         ` Takashi Iwai
  2010-09-15 13:12         ` Takashi Iwai
  2010-09-15 13:05       ` mark gross
                         ` (2 subsequent siblings)
  3 siblings, 2 replies; 22+ messages in thread
From: mark gross @ 2010-09-15 13:05 UTC (permalink / raw)
  To: Takashi Iwai
  Cc: Rafael J. Wysocki, Simon Kirby, linux-pm, linux-kernel,
	James Bottomley, mark gross

On Wed, Sep 15, 2010 at 10:29:54AM +0200, Takashi Iwai wrote:
> At Wed, 15 Sep 2010 00:06:26 +0200,
> Rafael J. Wysocki wrote:
> > 
> > On Tuesday, September 14, 2010, Rafael J. Wysocki wrote:
> > > On Monday, September 13, 2010, Simon Kirby wrote:
> > > > Hi!
> > > > 
> > > > At first I thought I hitting a suspend bug, but it was because I had an
> > > > rmmod/modprobe of snd_ice1724 in my "gosleep" script to work around my
> > > > sound sounding crackly after resume.  I can reproduce this same crash
> > > > simply by using sound, rmmod snd_ice1724, modprobe snd_ice1724, and then
> > > > using sound again.
> > > > 
> > > > I git-bisected to:
> > > > 82f682514a5df89ffb3890627eebf0897b7a84ec is the first bad commit
> > > > commit 82f682514a5df89ffb3890627eebf0897b7a84ec
> > > > Author: James Bottomley <James.Bottomley@suse.de>
> > > > Date:   Mon Jul 5 22:53:06 2010 +0200
> > > > 
> > > >     pm_qos: Get rid of the allocation in pm_qos_add_request()
> > > > 
> > > >     All current users of pm_qos_add_request() have the ability to supply
> > > >     the memory required by the pm_qos routines, so make them do this and
> > > >     eliminate the kmalloc() with pm_qos_add_request().  This has the
> > > >     double benefit of making the call never fail and allowing it to be
> > > >     called from atomic context.
> > > > 
> > > >     Signed-off-by: James Bottomley <James.Bottomley@suse.de>
> > > >     Signed-off-by: mark gross <markgross@thegnar.org>
> > > >     Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
> > > > 
> > > > :040000 040000 0370d037cfe56ae4079bc47713b98c5961de258f 92d6437da54649496c84bacc16ef4b17f8087bbd M      drivers
> > > > :040000 040000 6ec551133758633de70e97b4a41d893a5c050e74 4ac17ea227c07d9cf7dae297a40bbbcc374c232f M      include
> > > > :040000 040000 d0bb4b15e77a0976940af48588ef0e9b9fd7590b ac682974217c7e75a771b92c8efa7c83659d2916 M      kernel
> > > > :040000 040000 9ab6e37ec622354cb93abc2b7424b10263bcd007 011680bad3da4b60aebb659eaeb85a037e2513f3 M      sound
> > > > 
> > > > ...but I can't revert this over HEAD due to conflicts and I can't see
> > > > what's wrong in the diff.  2.6.35 is fine.
> > > > 
> > > > Simon-
> > > > 
> > > > [   61.234230] ICE1724 0000:01:06.0: PCI INT A disabled
> > > > [   61.240010] ICE1724 0000:01:06.0: PCI INT A -> GSI 21 (level, low) -> IRQ 21
> > > > [   62.156008] BUG: unable to handle kernel NULL pointer dereference at (null)
> > > > [   62.156008] IP: [<ffffffff81379706>] plist_add+0x36/0xa0
> > > > [   62.156008] PGD 1ad813067 PUD 1ad989067 PMD 0 
> > > > [   62.156008] Oops: 0000 [#1] SMP 
> > > > [   62.156008] last sysfs file: /sys/devices/pci0000:00/0000:00:14.4/0000:01:06.0/sound/card0/uevent
> > > > [   62.156118] CPU 1 
> > > > [   62.156147] Modules linked in: snd_ice1724 sco bnep rfcomm l2cap bluetooth ppdev hwmon_vid usb_storage tun i2c_viapro snd_rawmidi snd_ice17xx_ak4xxx snd_ac97_codec ac97_bus snd_ak4xxx_adda snd_ak4114 snd_pt2258 snd_i2c parport_pc r8169 parport snd_ak4113 k10temp [last unloaded: snd_ice1724]
> > > > [   62.156849] 
> > > > [   62.156925] Pid: 3053, comm: mplayer Not tainted 2.6.36-rc4-oof+ #6 M4A79T Deluxe/System Product Name
> > > > [   62.157058] RIP: 0010:[<ffffffff81379706>]  [<ffffffff81379706>] plist_add+0x36/0xa0
> > > > [   62.157217] RSP: 0018:ffff8801ac7f3cd8  EFLAGS: 00010002
> > > > [   62.157297] RAX: fffffffffffffff8 RBX: ffff8801a93eae40 RCX: ffff8801ac48e048
> > > > [   62.157386] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8801a93eae40
> > > > [   62.157475] RBP: ffff8801ac7f3cf8 R08: 0000000000005356 R09: 0000000000004000
> > > > [   62.157564] R10: 0000000000000000 R11: 00000000fffffffe R12: ffffffff8195a2a0
> > > > [   62.157590] R13: ffff8801a93eae58 R14: 0000000000003e80 R15: ffffffff8195a2b0
> > > > [   62.157590] FS:  00007f79749f5860(0000) GS:ffff880001c80000(0000) knlGS:0000000000000000
> > > > [   62.157590] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > > [   62.157590] CR2: 0000000000000000 CR3: 00000001aef63000 CR4: 00000000000006e0
> > > > [   62.157590] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > > > [   62.157590] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> > > > [   62.157590] Process mplayer (pid: 3053, threadinfo ffff8801ac7f2000, task ffff8801ac535cd0)
> > > > [   62.157590] Stack:
> > > > [   62.157590]  ffff8801ac7f3d18 ffffffff8195a2a0 ffff8801a93eae40 00000000ffffffff
> > > > [   62.157590] <0> ffff8801ac7f3d48 ffffffff8105f35b ffff880100000000 0000000000000292
> > > > [   62.157590] <0> ffff8801ac7f3d58 ffff8801a93eae40 ffff8801ad849400 ffff8801ad849c00
> > > > [   62.157590] Call Trace:
> > > > [   62.157590]  [<ffffffff8105f35b>] update_target+0x12b/0x140
> > > > [   62.157590]  [<ffffffff8105f5aa>] pm_qos_add_request+0x4a/0x80
> > > > [   62.157590]  [<ffffffff8157e8a4>] snd_pcm_hw_params+0x2d4/0x3a0
> > > > [   62.157590]  [<ffffffff8157ed41>] snd_pcm_common_ioctl1+0xb1/0xb90
> > > > [   62.157590]  [<ffffffff810bcbc2>] ? handle_mm_fault+0x192/0xa50
> > > > [   62.157590]  [<ffffffff8157facd>] snd_pcm_playback_ioctl1+0x3d/0x220
> > > > [   62.157590]  [<ffffffff81698b9f>] ? do_page_fault+0x17f/0x440
> > > > [   62.157590]  [<ffffffff8158035d>] snd_pcm_playback_ioctl+0x3d/0x50
> > > > [   62.157590]  [<ffffffff810e8787>] do_vfs_ioctl+0x97/0x530
> > > > [   62.157590]  [<ffffffff810c493c>] ? do_mmap_pgoff+0x34c/0x3a0
> > > > [   62.157590]  [<ffffffff810e8c6a>] sys_ioctl+0x4a/0x80
> > > > [   62.157590]  [<ffffffff81002ceb>] system_call_fastpath+0x16/0x1b
> > > > [   62.157590] Code: 54 49 89 f4 53 48 89 fb 48 83 ec 08 4c 3b 6f 18 75 69 49 8b 04 24 48 83 e8 08 eb 0f 90 8b 30 39 33 7c 18 66 90 74 4e 48 8d 42 f8 <48> 8b 50 08 48 8d 48 08 4c 39 e1 0f 18 0a 75 e2 48 8b 50 10 48 
> > > > [   62.157590] RIP  [<ffffffff81379706>] plist_add+0x36/0xa0
> > > > [   62.157590]  RSP <ffff8801ac7f3cd8>
> > > > [   62.157590] CR2: 0000000000000000
> > > > [   62.157590] ---[ end trace 853ed59ac5273c1f ]---
> > > 
> > > Hmm, interesting.  This looks like a plist corruption to me, but can you please
> > > check (using gdb) what line of code corresponds to the address
> > > plist_add+0x36/0xa0 ?
> > 
> > Well, my current theory is that the list member of latency_pm_qos_req in
> > struct snd_pcm_substream gets corrupted because of the changed size of the
> > structure.  Then, deleting it from the plist corrupts the plist itself, which
> > only turns up when we try to add a new item to it.
> > 
> > If that's really the case, the (untested!) patch below should help (unless I broke it).
> 
> Could you try first the patch in kernel bugzilla 17922?
> This might be an unreleased object, mostly only triggered via OSS
> emulation.
>     https://bugzilla.kernel.org/show_bug.cgi?id=17922
this patch inline:
diff --git a/sound/core/pcm_native.c b/sound/core/pcm_native.c
index 134fc6c..6f630e0 100644
--- a/sound/core/pcm_native.c
+++ b/sound/core/pcm_native.c
@@ -510,7 +510,8 @@ static int snd_pcm_hw_free(struct snd_pcm_substream *substream)
 	if (substream->ops->hw_free)
 		result = substream->ops->hw_free(substream);
 	runtime->status->state = SNDRV_PCM_STATE_OPEN;
-	pm_qos_remove_request(&substream->latency_pm_qos_req);
+	if (pm_qos_request_active(&substream->latency_pm_qos_req))
+		pm_qos_remove_request(&substream->latency_pm_qos_req);
 	return result;
 }
 
@@ -1989,6 +1990,8 @@ void snd_pcm_release_substream(struct snd_pcm_substream *substream)
 	if (substream->hw_opened) {
 		if (substream->ops->hw_free != NULL)
 			substream->ops->hw_free(substream);
+		if (pm_qos_request_active(&substream->latency_pm_qos_req))
+			pm_qos_remove_request(&substream->latency_pm_qos_req);
 		substream->ops->close(substream);
 		substream->hw_opened = 0;
 	}

But the code in pm_qos_remove_request() only pulls an element out of the
list if it's active, so your remove-if-active check shouldn't be needed.
You should just remove the request.  I'm guessing this change to
pcm_native.c makes the bug go away.  It doesn't feel right to me.

Also, I don't think pm_qos_request_active() should be external.
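
A minimal user-space sketch of the two points above, not the kernel
implementation: a remove routine that only unlinks a request when it is
active (so callers need no guard), with the activity test kept file-local.
All sketch_* names are hypothetical, and the model assumes the request
starts out zero-initialized.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical intrusive list node; stands in for the kernel's plist. */
struct sketch_node {
	struct sketch_node *prev, *next;
};

/* Hypothetical request object embedded by the caller (no allocation). */
struct sketch_request {
	struct sketch_node link;
	int on_list;	/* nonzero only between add and remove */
};

static void sketch_list_init(struct sketch_node *head)
{
	head->prev = head->next = head;
}

/* File-local activity test: not exposed to callers in this sketch. */
static int sketch_is_active(const struct sketch_request *r)
{
	return r->on_list;
}

static void sketch_add(struct sketch_node *head, struct sketch_request *r)
{
	r->link.next = head->next;
	r->link.prev = head;
	head->next->prev = &r->link;
	head->next = &r->link;
	r->on_list = 1;
}

/* Unlinks only when active; otherwise a silent no-op. */
static void sketch_remove(struct sketch_request *r)
{
	if (!sketch_is_active(r))
		return;
	r->link.prev->next = r->link.next;
	r->link.next->prev = r->link.prev;
	r->link.prev = r->link.next = NULL;
	r->on_list = 0;
}

/* Walks the list and counts the linked requests. */
static size_t sketch_count(const struct sketch_node *head)
{
	size_t n = 0;
	const struct sketch_node *p;

	for (p = head->next; p != head; p = p->next)
		n++;
	return n;
}
```

In this model, calling sketch_remove() on a request that was never added,
or twice in a row, leaves the list untouched.  The flag is only
trustworthy if the request memory was cleared before first use.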

--mark

> 
> 
> thanks,
> 
> Takashi

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [linux-pm] [2.6.36-rc4/HEAD] unable to handle kernel NULL pointer dereference (plist_add)
  2010-09-15 13:05       ` mark gross
@ 2010-09-15 13:12         ` Takashi Iwai
  2010-09-15 13:12         ` Takashi Iwai
  1 sibling, 0 replies; 22+ messages in thread
From: Takashi Iwai @ 2010-09-15 13:12 UTC (permalink / raw)
  To: markgross
  Cc: Rafael J. Wysocki, Simon Kirby, linux-pm, linux-kernel, James Bottomley

At Wed, 15 Sep 2010 06:05:16 -0700,
mark gross wrote:
> 
> On Wed, Sep 15, 2010 at 10:29:54AM +0200, Takashi Iwai wrote:
> > At Wed, 15 Sep 2010 00:06:26 +0200,
> > Rafael J. Wysocki wrote:
> > > 
> > > On Tuesday, September 14, 2010, Rafael J. Wysocki wrote:
> > > > On Monday, September 13, 2010, Simon Kirby wrote:
> > > > > Hi!
> > > > > 
> > > > > > At first I thought I was hitting a suspend bug, but it was because I had an
> > > > > rmmod/modprobe of snd_ice1724 in my "gosleep" script to work around my
> > > > > sound sounding crackly after resume.  I can reproduce this same crash
> > > > > simply by using sound, rmmod snd_ice1724, modprobe snd_ice1724, and then
> > > > > using sound again.
> > > > > 
> > > > > I git-bisected to:
> > > > > 82f682514a5df89ffb3890627eebf0897b7a84ec is the first bad commit
> > > > > commit 82f682514a5df89ffb3890627eebf0897b7a84ec
> > > > > Author: James Bottomley <James.Bottomley@suse.de>
> > > > > Date:   Mon Jul 5 22:53:06 2010 +0200
> > > > > 
> > > > >     pm_qos: Get rid of the allocation in pm_qos_add_request()
> > > > > 
> > > > >     All current users of pm_qos_add_request() have the ability to supply
> > > > >     the memory required by the pm_qos routines, so make them do this and
> > > > >     eliminate the kmalloc() with pm_qos_add_request().  This has the
> > > > >     double benefit of making the call never fail and allowing it to be
> > > > >     called from atomic context.
> > > > > 
> > > > >     Signed-off-by: James Bottomley <James.Bottomley@suse.de>
> > > > >     Signed-off-by: mark gross <markgross@thegnar.org>
> > > > >     Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
> > > > > 
> > > > > :040000 040000 0370d037cfe56ae4079bc47713b98c5961de258f 92d6437da54649496c84bacc16ef4b17f8087bbd M      drivers
> > > > > :040000 040000 6ec551133758633de70e97b4a41d893a5c050e74 4ac17ea227c07d9cf7dae297a40bbbcc374c232f M      include
> > > > > :040000 040000 d0bb4b15e77a0976940af48588ef0e9b9fd7590b ac682974217c7e75a771b92c8efa7c83659d2916 M      kernel
> > > > > :040000 040000 9ab6e37ec622354cb93abc2b7424b10263bcd007 011680bad3da4b60aebb659eaeb85a037e2513f3 M      sound
> > > > > 
> > > > > ...but I can't revert this over HEAD due to conflicts and I can't see
> > > > > what's wrong in the diff.  2.6.35 is fine.
> > > > > 
> > > > > Simon-
> > > > > 
> > > > > [   61.234230] ICE1724 0000:01:06.0: PCI INT A disabled
> > > > > [   61.240010] ICE1724 0000:01:06.0: PCI INT A -> GSI 21 (level, low) -> IRQ 21
> > > > > [   62.156008] BUG: unable to handle kernel NULL pointer dereference at (null)
> > > > > [   62.156008] IP: [<ffffffff81379706>] plist_add+0x36/0xa0
> > > > > [   62.156008] PGD 1ad813067 PUD 1ad989067 PMD 0 
> > > > > [   62.156008] Oops: 0000 [#1] SMP 
> > > > > [   62.156008] last sysfs file: /sys/devices/pci0000:00/0000:00:14.4/0000:01:06.0/sound/card0/uevent
> > > > > [   62.156118] CPU 1 
> > > > > [   62.156147] Modules linked in: snd_ice1724 sco bnep rfcomm l2cap bluetooth ppdev hwmon_vid usb_storage tun i2c_viapro snd_rawmidi snd_ice17xx_ak4xxx snd_ac97_codec ac97_bus snd_ak4xxx_adda snd_ak4114 snd_pt2258 snd_i2c parport_pc r8169 parport snd_ak4113 k10temp [last unloaded: snd_ice1724]
> > > > > [   62.156849] 
> > > > > [   62.156925] Pid: 3053, comm: mplayer Not tainted 2.6.36-rc4-oof+ #6 M4A79T Deluxe/System Product Name
> > > > > [   62.157058] RIP: 0010:[<ffffffff81379706>]  [<ffffffff81379706>] plist_add+0x36/0xa0
> > > > > [   62.157217] RSP: 0018:ffff8801ac7f3cd8  EFLAGS: 00010002
> > > > > [   62.157297] RAX: fffffffffffffff8 RBX: ffff8801a93eae40 RCX: ffff8801ac48e048
> > > > > [   62.157386] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8801a93eae40
> > > > > [   62.157475] RBP: ffff8801ac7f3cf8 R08: 0000000000005356 R09: 0000000000004000
> > > > > [   62.157564] R10: 0000000000000000 R11: 00000000fffffffe R12: ffffffff8195a2a0
> > > > > [   62.157590] R13: ffff8801a93eae58 R14: 0000000000003e80 R15: ffffffff8195a2b0
> > > > > [   62.157590] FS:  00007f79749f5860(0000) GS:ffff880001c80000(0000) knlGS:0000000000000000
> > > > > [   62.157590] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > > > [   62.157590] CR2: 0000000000000000 CR3: 00000001aef63000 CR4: 00000000000006e0
> > > > > [   62.157590] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > > > > [   62.157590] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> > > > > [   62.157590] Process mplayer (pid: 3053, threadinfo ffff8801ac7f2000, task ffff8801ac535cd0)
> > > > > [   62.157590] Stack:
> > > > > [   62.157590]  ffff8801ac7f3d18 ffffffff8195a2a0 ffff8801a93eae40 00000000ffffffff
> > > > > [   62.157590] <0> ffff8801ac7f3d48 ffffffff8105f35b ffff880100000000 0000000000000292
> > > > > [   62.157590] <0> ffff8801ac7f3d58 ffff8801a93eae40 ffff8801ad849400 ffff8801ad849c00
> > > > > [   62.157590] Call Trace:
> > > > > [   62.157590]  [<ffffffff8105f35b>] update_target+0x12b/0x140
> > > > > [   62.157590]  [<ffffffff8105f5aa>] pm_qos_add_request+0x4a/0x80
> > > > > [   62.157590]  [<ffffffff8157e8a4>] snd_pcm_hw_params+0x2d4/0x3a0
> > > > > [   62.157590]  [<ffffffff8157ed41>] snd_pcm_common_ioctl1+0xb1/0xb90
> > > > > [   62.157590]  [<ffffffff810bcbc2>] ? handle_mm_fault+0x192/0xa50
> > > > > [   62.157590]  [<ffffffff8157facd>] snd_pcm_playback_ioctl1+0x3d/0x220
> > > > > [   62.157590]  [<ffffffff81698b9f>] ? do_page_fault+0x17f/0x440
> > > > > [   62.157590]  [<ffffffff8158035d>] snd_pcm_playback_ioctl+0x3d/0x50
> > > > > [   62.157590]  [<ffffffff810e8787>] do_vfs_ioctl+0x97/0x530
> > > > > [   62.157590]  [<ffffffff810c493c>] ? do_mmap_pgoff+0x34c/0x3a0
> > > > > [   62.157590]  [<ffffffff810e8c6a>] sys_ioctl+0x4a/0x80
> > > > > [   62.157590]  [<ffffffff81002ceb>] system_call_fastpath+0x16/0x1b
> > > > > [   62.157590] Code: 54 49 89 f4 53 48 89 fb 48 83 ec 08 4c 3b 6f 18 75 69 49 8b 04 24 48 83 e8 08 eb 0f 90 8b 30 39 33 7c 18 66 90 74 4e 48 8d 42 f8 <48> 8b 50 08 48 8d 48 08 4c 39 e1 0f 18 0a 75 e2 48 8b 50 10 48 
> > > > > [   62.157590] RIP  [<ffffffff81379706>] plist_add+0x36/0xa0
> > > > > [   62.157590]  RSP <ffff8801ac7f3cd8>
> > > > > [   62.157590] CR2: 0000000000000000
> > > > > [   62.157590] ---[ end trace 853ed59ac5273c1f ]---
> > > > 
> > > > Hmm, interesting.  This looks like a plist corruption to me, but can you please
> > > > check (using gdb) what line of code corresponds to the address
> > > > plist_add+0x36/0xa0 ?
> > > 
> > > Well, my current theory is that the list member of latency_pm_qos_req in
> > > struct snd_pcm_substream gets corrupted because of the changed size of the
> > > structure.  Then, deleting it from the plist corrupts the plist itself, which
> > > only turns up when we try to add a new item to it.
> > > 
> > > If that's really the case, the (untested!) patch below should help (unless I broke it).
> > 
> > Could you try first the patch in kernel bugzilla 17922?
> > This might be an unreleased object, mostly only triggered via OSS
> > emulation.
> >     https://bugzilla.kernel.org/show_bug.cgi?id=17922
> this patch inline:
> diff --git a/sound/core/pcm_native.c b/sound/core/pcm_native.c
> index 134fc6c..6f630e0 100644
> --- a/sound/core/pcm_native.c
> +++ b/sound/core/pcm_native.c
> @@ -510,7 +510,8 @@ static int snd_pcm_hw_free(struct snd_pcm_substream *substream)
>  	if (substream->ops->hw_free)
>  		result = substream->ops->hw_free(substream);
>  	runtime->status->state = SNDRV_PCM_STATE_OPEN;
> -	pm_qos_remove_request(&substream->latency_pm_qos_req);
> +	if (pm_qos_request_active(&substream->latency_pm_qos_req))
> +		pm_qos_remove_request(&substream->latency_pm_qos_req);
>  	return result;
>  }
> 
> @@ -1989,6 +1990,8 @@ void snd_pcm_release_substream(struct snd_pcm_substream *substream)
>  	if (substream->hw_opened) {
>  		if (substream->ops->hw_free != NULL)
>  			substream->ops->hw_free(substream);
> +		if (pm_qos_request_active(&substream->latency_pm_qos_req))
> +			pm_qos_remove_request(&substream->latency_pm_qos_req);
>  		substream->ops->close(substream);
>  		substream->hw_opened = 0;
>  	}
> 
> But the code in pm_qos_remove_request() only pulls an element out of the
> list if it's active, so your remove-if-active check shouldn't be needed.

No, it'll spew warning messages when a non-active request is passed.

>  You should just remove the request.

At least in the second hunk, the request hasn't always been added, so it
must be checked beforehand.

Or I'll happily remove the check once pm_qos_remove_request() no longer
complains about a non-active request.
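
To illustrate the point being argued (a sketch under stated assumptions,
not the actual pm_qos code): if the remove routine complains whenever it
is handed a non-active request, a caller that may run before any add has
to test first.  The demo_* names and the warning counter are hypothetical
stand-ins.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical user-space stand-in for a request; not the kernel struct. */
struct demo_request {
	bool active;	/* set by add, cleared by remove */
	int value;
};

/* Counts how often remove was called on a non-active request. */
static int demo_warnings;

static void demo_add_request(struct demo_request *req, int value)
{
	req->value = value;
	req->active = true;
}

static bool demo_request_active(const struct demo_request *req)
{
	return req->active;
}

/* Assumed behaviour: warn and bail out when the request is not active. */
static void demo_remove_request(struct demo_request *req)
{
	if (!demo_request_active(req)) {
		demo_warnings++;
		fprintf(stderr, "demo: remove called for non-active request\n");
		return;
	}
	req->active = false;
}

/* Release path that may run without a prior add, so it checks first. */
static void demo_release(struct demo_request *req)
{
	if (demo_request_active(req))
		demo_remove_request(req);
}
```

With the guard in demo_release(), releasing a request that was never
added produces no warning; calling demo_remove_request() directly on the
same request bumps the counter.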

> I'm guessing this change to
> pcm_native.c makes the bug go away.  It doesn't feel right to me.
> 
> Also, I don't think pm_qos_request_active should be external.


Takashi

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [linux-pm] [2.6.36-rc4/HEAD] unable to handle kernel NULL pointer dereference (plist_add)
  2010-09-15  8:29     ` [linux-pm] " Takashi Iwai
  2010-09-15 13:05       ` mark gross
  2010-09-15 13:05       ` mark gross
@ 2010-09-15 13:17       ` mark gross
  2010-09-16  0:23         ` Simon Kirby
  2010-09-16  0:23         ` [linux-pm] " Simon Kirby
  2010-09-15 13:17       ` mark gross
  3 siblings, 2 replies; 22+ messages in thread
From: mark gross @ 2010-09-15 13:17 UTC (permalink / raw)
  To: Takashi Iwai
  Cc: Rafael J. Wysocki, Simon Kirby, linux-pm, linux-kernel,
	James Bottomley, mark gross

On Wed, Sep 15, 2010 at 10:29:54AM +0200, Takashi Iwai wrote:
> At Wed, 15 Sep 2010 00:06:26 +0200,
> Rafael J. Wysocki wrote:
> > 
> > On Tuesday, September 14, 2010, Rafael J. Wysocki wrote:
> > > On Monday, September 13, 2010, Simon Kirby wrote:
> > > > Hi!
> > > > 
> > > > At first I thought I was hitting a suspend bug, but it was because I had an
> > > > rmmod/modprobe of snd_ice1724 in my "gosleep" script to work around my
> > > > sound sounding crackly after resume.  I can reproduce this same crash
> > > > simply by using sound, rmmod snd_ice1724, modprobe snd_ice1724, and then
> > > > using sound again.
> > > > 
> > > > I git-bisected to:
> > > > 82f682514a5df89ffb3890627eebf0897b7a84ec is the first bad commit
> > > > commit 82f682514a5df89ffb3890627eebf0897b7a84ec
> > > > Author: James Bottomley <James.Bottomley@suse.de>
> > > > Date:   Mon Jul 5 22:53:06 2010 +0200
> > > > 
> > > >     pm_qos: Get rid of the allocation in pm_qos_add_request()
> > > > 
> > > >     All current users of pm_qos_add_request() have the ability to supply
> > > >     the memory required by the pm_qos routines, so make them do this and
> > > >     eliminate the kmalloc() with pm_qos_add_request().  This has the
> > > >     double benefit of making the call never fail and allowing it to be
> > > >     called from atomic context.
> > > > 
> > > >     Signed-off-by: James Bottomley <James.Bottomley@suse.de>
> > > >     Signed-off-by: mark gross <markgross@thegnar.org>
> > > >     Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
> > > > 
> > > > :040000 040000 0370d037cfe56ae4079bc47713b98c5961de258f 92d6437da54649496c84bacc16ef4b17f8087bbd M      drivers
> > > > :040000 040000 6ec551133758633de70e97b4a41d893a5c050e74 4ac17ea227c07d9cf7dae297a40bbbcc374c232f M      include
> > > > :040000 040000 d0bb4b15e77a0976940af48588ef0e9b9fd7590b ac682974217c7e75a771b92c8efa7c83659d2916 M      kernel
> > > > :040000 040000 9ab6e37ec622354cb93abc2b7424b10263bcd007 011680bad3da4b60aebb659eaeb85a037e2513f3 M      sound
> > > > 
> > > > ...but I can't revert this over HEAD due to conflicts and I can't see
> > > > what's wrong in the diff.  2.6.35 is fine.
> > > > 
> > > > Simon-
> > > > 
> > > > [   61.234230] ICE1724 0000:01:06.0: PCI INT A disabled
> > > > [   61.240010] ICE1724 0000:01:06.0: PCI INT A -> GSI 21 (level, low) -> IRQ 21
> > > > [   62.156008] BUG: unable to handle kernel NULL pointer dereference at (null)
> > > > [   62.156008] IP: [<ffffffff81379706>] plist_add+0x36/0xa0
> > > > [   62.156008] PGD 1ad813067 PUD 1ad989067 PMD 0 
> > > > [   62.156008] Oops: 0000 [#1] SMP 
> > > > [   62.156008] last sysfs file: /sys/devices/pci0000:00/0000:00:14.4/0000:01:06.0/sound/card0/uevent
> > > > [   62.156118] CPU 1 
> > > > [   62.156147] Modules linked in: snd_ice1724 sco bnep rfcomm l2cap bluetooth ppdev hwmon_vid usb_storage tun i2c_viapro snd_rawmidi snd_ice17xx_ak4xxx snd_ac97_codec ac97_bus snd_ak4xxx_adda snd_ak4114 snd_pt2258 snd_i2c parport_pc r8169 parport snd_ak4113 k10temp [last unloaded: snd_ice1724]
> > > > [   62.156849] 
> > > > [   62.156925] Pid: 3053, comm: mplayer Not tainted 2.6.36-rc4-oof+ #6 M4A79T Deluxe/System Product Name
> > > > [   62.157058] RIP: 0010:[<ffffffff81379706>]  [<ffffffff81379706>] plist_add+0x36/0xa0
> > > > [   62.157217] RSP: 0018:ffff8801ac7f3cd8  EFLAGS: 00010002
> > > > [   62.157297] RAX: fffffffffffffff8 RBX: ffff8801a93eae40 RCX: ffff8801ac48e048
> > > > [   62.157386] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8801a93eae40
> > > > [   62.157475] RBP: ffff8801ac7f3cf8 R08: 0000000000005356 R09: 0000000000004000
> > > > [   62.157564] R10: 0000000000000000 R11: 00000000fffffffe R12: ffffffff8195a2a0
> > > > [   62.157590] R13: ffff8801a93eae58 R14: 0000000000003e80 R15: ffffffff8195a2b0
> > > > [   62.157590] FS:  00007f79749f5860(0000) GS:ffff880001c80000(0000) knlGS:0000000000000000
> > > > [   62.157590] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > > [   62.157590] CR2: 0000000000000000 CR3: 00000001aef63000 CR4: 00000000000006e0
> > > > [   62.157590] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > > > [   62.157590] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> > > > [   62.157590] Process mplayer (pid: 3053, threadinfo ffff8801ac7f2000, task ffff8801ac535cd0)
> > > > [   62.157590] Stack:
> > > > [   62.157590]  ffff8801ac7f3d18 ffffffff8195a2a0 ffff8801a93eae40 00000000ffffffff
> > > > [   62.157590] <0> ffff8801ac7f3d48 ffffffff8105f35b ffff880100000000 0000000000000292
> > > > [   62.157590] <0> ffff8801ac7f3d58 ffff8801a93eae40 ffff8801ad849400 ffff8801ad849c00
> > > > [   62.157590] Call Trace:
> > > > [   62.157590]  [<ffffffff8105f35b>] update_target+0x12b/0x140
> > > > [   62.157590]  [<ffffffff8105f5aa>] pm_qos_add_request+0x4a/0x80
> > > > [   62.157590]  [<ffffffff8157e8a4>] snd_pcm_hw_params+0x2d4/0x3a0
> > > > [   62.157590]  [<ffffffff8157ed41>] snd_pcm_common_ioctl1+0xb1/0xb90
> > > > [   62.157590]  [<ffffffff810bcbc2>] ? handle_mm_fault+0x192/0xa50
> > > > [   62.157590]  [<ffffffff8157facd>] snd_pcm_playback_ioctl1+0x3d/0x220
> > > > [   62.157590]  [<ffffffff81698b9f>] ? do_page_fault+0x17f/0x440
> > > > [   62.157590]  [<ffffffff8158035d>] snd_pcm_playback_ioctl+0x3d/0x50
> > > > [   62.157590]  [<ffffffff810e8787>] do_vfs_ioctl+0x97/0x530
> > > > [   62.157590]  [<ffffffff810c493c>] ? do_mmap_pgoff+0x34c/0x3a0
> > > > [   62.157590]  [<ffffffff810e8c6a>] sys_ioctl+0x4a/0x80
> > > > [   62.157590]  [<ffffffff81002ceb>] system_call_fastpath+0x16/0x1b
> > > > [   62.157590] Code: 54 49 89 f4 53 48 89 fb 48 83 ec 08 4c 3b 6f 18 75 69 49 8b 04 24 48 83 e8 08 eb 0f 90 8b 30 39 33 7c 18 66 90 74 4e 48 8d 42 f8 <48> 8b 50 08 48 8d 48 08 4c 39 e1 0f 18 0a 75 e2 48 8b 50 10 48 
> > > > [   62.157590] RIP  [<ffffffff81379706>] plist_add+0x36/0xa0
> > > > [   62.157590]  RSP <ffff8801ac7f3cd8>
> > > > [   62.157590] CR2: 0000000000000000
> > > > [   62.157590] ---[ end trace 853ed59ac5273c1f ]---
> > > 
> > > Hmm, interesting.  This looks like a plist corruption to me, but can you please
> > > check (using gdb) what line of code corresponds to the address
> > > plist_add+0x36/0xa0 ?
> > 
> > Well, my current theory is that the list member of latency_pm_qos_req in
> > struct snd_pcm_substream gets corrupted because of the changed size of the
> > structure.  Then, deleting it from the plist corrupts the plist itself, which
> > only turns up when we try to add a new item to it.
> > 
> > If that's really the case, the (untested!) patch below should help (unless I broke it).
> 
> Could you try first the patch in kernel bugzilla 17922?
> This might be an unreleased object, mostly only triggered via OSS
> emulation.
>     https://bugzilla.kernel.org/show_bug.cgi?id=17922

Adding the pm_qos_remove_request() call in snd_pcm_release_substream() may
be all that's needed to fix the problem.  If snd_pcm unloads its module
without removing the qos request from the list, then the very next
pm_qos operation that requires a list walk, such as adding or removing an
element, would trigger a failure.

I.e., the second hunk of the patch in the above bugzilla entry should be
all that is needed.
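
To make the failure mode concrete, here is a minimal userspace sketch (not
kernel code).  It assumes a plain circular doubly linked list in place of
the plist, and the names struct substream, hw_params(), release() and
list_head_sane() are hypothetical stand-ins, not the ALSA or pm_qos API.
The memset() with 0x6b only mimics what slab poisoning does to freed
memory; the storage itself stays valid, so the sketch has no real
use-after-free.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for the list node embedded in a pm_qos request. */
struct node {
	struct node *next, *prev;
};

/* Hypothetical owner that embeds its request, as the substream does. */
struct substream {
	struct node req;
	bool req_active;
};

static void list_init(struct node *head)
{
	head->next = head->prev = head;
}

static void list_add_tail(struct node *n, struct node *head)
{
	assert(n != NULL && head != NULL);
	n->next = head;
	n->prev = head->prev;
	head->prev->next = n;
	head->prev = n;
}

static void list_del(struct node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	n->next = n->prev = n;
}

/* Models the add-request step: link the embedded node into the shared list. */
static void hw_params(struct substream *s, struct node *head)
{
	if (s->req_active)
		list_del(&s->req);
	list_add_tail(&s->req, head);
	s->req_active = true;
}

/*
 * Models the release path.  With remove_request == false the node is
 * still linked when its memory goes away, so the list head keeps
 * pointing at it.
 */
static void release(struct substream *s, bool remove_request)
{
	if (remove_request && s->req_active) {
		list_del(&s->req);
		s->req_active = false;
	}
	/* Mimic slab poisoning of the freed node (assumption of the sketch). */
	memset(&s->req, 0x6b, sizeof(s->req));
}

/* Cheap consistency check on the head, loosely in the spirit of a list checker. */
static bool list_head_sane(const struct node *head)
{
	const struct node *first = head->next;

	if (first == head)
		return head->prev == head;
	return first->prev == head;
}
```

Calling hw_params() and then release(&s, false) leaves list_head_sane()
false, because the head still points at the poisoned node.  With
release(&s, true) the head points back at itself, which is the state the
remove call in the release path is meant to restore.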

--mark

> 
> 
> thanks,
> 
> Takashi

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [linux-pm] [2.6.36-rc4/HEAD] unable to handle kernel NULL pointer dereference (plist_add)
  2010-09-14 20:38 ` [linux-pm] " Rafael J. Wysocki
                     ` (2 preceding siblings ...)
  2010-09-15 23:46   ` [2.6.36-rc4/HEAD] unable to handle kernel NULL pointer dereference (plist_add) Simon Kirby
@ 2010-09-15 23:46   ` Simon Kirby
  2010-09-16 18:26     ` Rafael J. Wysocki
  2010-09-16 18:26     ` [linux-pm] " Rafael J. Wysocki
  3 siblings, 2 replies; 22+ messages in thread
From: Simon Kirby @ 2010-09-15 23:46 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: linux-pm, linux-kernel, James Bottomley, mark gross, Takashi Iwai

On Tue, Sep 14, 2010 at 10:38:25PM +0200, Rafael J. Wysocki wrote:

> Hmm, interesting.  This looks like a plist corruption to me, but can you please
> check (using gdb) what line of code corresponds to the address
> plist_add+0x36/0xa0 ?

I ended up rebuilding since then, and I enabled a bunch of debugging
stuff.  Does this help make it more obvious?  I'll try your other patch
tonight, but I still don't get what's wrong with the existing code.

Simon-

[51198.357666] ICE1724 0000:01:06.0: PCI INT A disabled
[51198.380010] ICE1724 0000:01:06.0: PCI INT A -> GSI 21 (level, low) -> IRQ 21
[51199.893821] ------------[ cut here ]------------
[51199.893821] WARNING: at lib/plist.c:40 plist_check_list+0x62/0xe0()
[51199.893821] Hardware name: System Product Name
[51199.893821] top: ffffffff819c7da0, n: ffff8801ac029288, p: ffff8801ac029288
[51199.893821] prev: ffffffff819c7da0, n: ffff8801ac029288, p: ffff8801ac029288
[51199.893821] next: ffff8801ac029288, n: 6b6b6b6b6b6b6b6b, p: 6b6b6b6b6b6b6b6b
[51199.893821] Modules linked in: snd_ice1724 sco bnep rfcomm l2cap bluetooth ppdev hwmon_vid usb_storage tun i2c_viapro snd_rawmidi snd_ice17xx_ak4xxx snd_ac97_codec ac97_bus snd_ak4xxx_adda snd_ak4114 snd_pt2258 parport_pc snd_i2c parport snd_ak4113 k10temp r8169 [last unloaded: snd_ice1724]
[51199.893821] Pid: 4392, comm: mplayer Not tainted 2.6.36-rc4-oofdbg+ #8
[51199.893821] Call Trace:
[51199.893821]  [<ffffffff8103ffca>] warn_slowpath_common+0x7a/0xb0
[51199.893821]  [<ffffffff810400a1>] warn_slowpath_fmt+0x41/0x50
[51199.893821]  [<ffffffff813a2632>] plist_check_list+0x62/0xe0
[51199.893821]  [<ffffffff813a26ec>] plist_check_head+0x3c/0xa0
[51199.893821]  [<ffffffff813a280d>] plist_add+0x1d/0xb0
[51199.893821]  [<ffffffff81067376>] update_target+0x146/0x160
[51199.893821]  [<ffffffff810675ca>] pm_qos_add_request+0x5a/0x90
[51199.893821]  [<ffffffff815b1c64>] snd_pcm_hw_params+0x2e4/0x3b0
[51199.893821]  [<ffffffff815b2121>] snd_pcm_common_ioctl1+0xb1/0xbe0
[51199.893821]  [<ffffffff8106798b>] ? local_clock+0x4b/0x60
[51199.893821]  [<ffffffff81073115>] ? lock_release_holdtime+0x35/0x180
[51199.893821]  [<ffffffff815b2f5d>] snd_pcm_playback_ioctl1+0x3d/0x280
[51199.893821]  [<ffffffff81065fce>] ? up_read+0x1e/0x40
[51199.893821]  [<ffffffff816dbd0c>] ? do_page_fault+0x18c/0x450
[51199.893821]  [<ffffffff815b390d>] snd_pcm_playback_ioctl+0x3d/0x50
[51199.893821]  [<ffffffff81101d71>] do_vfs_ioctl+0xa1/0x5a0
[51199.893821]  [<ffffffff810f2b84>] ? fget_light+0x124/0x2d0
[51199.893821]  [<ffffffff811022ba>] sys_ioctl+0x4a/0x80
[51199.893821]  [<ffffffff81002d6b>] system_call_fastpath+0x16/0x1b
[51199.893821] ---[ end trace 6aefebd043d8473f ]---
[51199.893821] general protection fault: 0000 [#1] SMP
[51199.893821] last sysfs file: /sys/devices/pci0000:00/0000:00:14.4/0000:01:06.0/sound/card0/uevent
[51199.893821] CPU 1
[51199.893821] Modules linked in: snd_ice1724 sco bnep rfcomm l2cap bluetooth ppdev hwmon_vid usb_storage tun i2c_viapro snd_rawmidi snd_ice17xx_ak4xxx snd_ac97_codec ac97_bus snd_ak4xxx_adda snd_ak4114 snd_pt2258 parport_pc snd_i2c parport snd_ak4113 k10temp r8169 [last unloaded: snd_ice1724]
[51199.893821]
[51199.893821] Pid: 4392, comm: mplayer Tainted: G        W   2.6.36-rc4-oofdbg+ #8 M4A79T Deluxe/System Product Name
[51199.893821] RIP: 0010:[<ffffffff813a2647>]  [<ffffffff813a2647>] plist_check_list+0x77/0xe0
[51199.893821] RSP: 0018:ffff880179309c58  EFLAGS: 00010016
[51199.893821] RAX: 0000000000000000 RBX: 6b6b6b6b6b6b6b6b RCX: 0000000000000000
[51199.893821] RDX: ffff880008800000 RSI: 0000000000000001 RDI: 0000000000000009
[51199.893821] RBP: ffff880179309ca8 R08: 0000000000000001 R09: 0000000000000000
[51199.893821] R10: 0000000000000001 R11: 0000000000000001 R12: ffff8801ac029288
[51199.893821] R13: ffffffff819c7da0 R14: ffff8801a5550040 R15: ffffffff819c7db0
[51199.893821] FS:  00007f559b19c860(0000) GS:ffff880008800000(0000) knlGS:00000000f76066c0
[51199.893821] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[51199.893821] CR2: 00007f55994f6030 CR3: 00000001a3b5a000 CR4: 00000000000006e0
[51199.893821] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[51199.893821] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[51199.893821] Process mplayer (pid: 4392, threadinfo ffff880179308000, task ffff8801a8525000)
[51199.893821] Stack:
[51199.893821]  ffffffff819c7da0 ffff8801ac029288 ffff8801ac029288 ffff8801ac029288
[51199.893821] <0> 6b6b6b6b6b6b6b6b 6b6b6b6b6b6b6b6b ffff880179309ca8 ffffffff819c7da0
[51199.893821] <0> ffff8801a5550040 ffff8801a5550058 ffff880179309cc8 ffffffff813a26ec
[51199.893821] Call Trace:
[51199.893821]  [<ffffffff813a26ec>] plist_check_head+0x3c/0xa0
[51199.893821]  [<ffffffff813a280d>] plist_add+0x1d/0xb0
[51199.893821]  [<ffffffff81067376>] update_target+0x146/0x160
[51199.893821]  [<ffffffff810675ca>] pm_qos_add_request+0x5a/0x90
[51199.893821]  [<ffffffff815b1c64>] snd_pcm_hw_params+0x2e4/0x3b0
[51199.893821]  [<ffffffff815b2121>] snd_pcm_common_ioctl1+0xb1/0xbe0
[51199.893821]  [<ffffffff8106798b>] ? local_clock+0x4b/0x60
[51199.893821]  [<ffffffff81073115>] ? lock_release_holdtime+0x35/0x180
[51199.893821]  [<ffffffff815b2f5d>] snd_pcm_playback_ioctl1+0x3d/0x280
[51199.893821]  [<ffffffff81065fce>] ? up_read+0x1e/0x40
[51199.893821]  [<ffffffff816dbd0c>] ? do_page_fault+0x18c/0x450
[51199.893821]  [<ffffffff815b390d>] snd_pcm_playback_ioctl+0x3d/0x50
[51199.893821]  [<ffffffff81101d71>] do_vfs_ioctl+0xa1/0x5a0
[51199.893821]  [<ffffffff810f2b84>] ? fget_light+0x124/0x2d0
[51199.893821]  [<ffffffff811022ba>] sys_ioctl+0x4a/0x80
[51199.893821]  [<ffffffff81002d6b>] system_call_fastpath+0x16/0x1b
[51199.893821] Code: 4c 89 4c 24 10 48 89 44 24 20 31 c0 4c 89 64 24 08 e8 2e da c9 ff 4d 39 e5 75 0c eb 66 0f 1f 80 00 00 00 00 49 89 dc 49 8b 1c 24 <48> 8b 43 08 49 39 c4 74 4a 48 89 44 24 28 48 8b 03 4c 89 e9 48
[51199.893821] RIP  [<ffffffff813a2647>] plist_check_list+0x77/0xe0
[51199.893821]  RSP <ffff880179309c58>
[51199.893821] ---[ end trace 6aefebd043d84740 ]---
[51938.896002] kmemleak: 1 new suspected memory leaks (see /sys/kernel/debug/kmemleak)
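
A note on reading the trace above: the n/p values 6b6b6b6b6b6b6b6b (also
in RBX) match the byte the slab allocator writes over freed objects when
poisoning is enabled (POISON_FREE, 0x6b, in include/linux/poison.h).
That would be consistent with a list node that was freed while still
linked, though the trace alone does not prove it.  A small sketch of the
check, with a hypothetical helper name:

```c
#include <stdbool.h>
#include <stdint.h>

/* Value used by slab poisoning for freed memory (include/linux/poison.h). */
#define POISON_FREE 0x6b

/*
 * Hypothetical helper: returns true when every byte of a 64-bit value
 * taken from an oops (register or pointer field) is the freed-memory
 * poison byte.
 */
static bool looks_like_freed_slab_poison(uint64_t value)
{
	for (int i = 0; i < 8; i++) {
		if (((value >> (8 * i)) & 0xff) != POISON_FREE)
			return false;
	}
	return true;
}
```

Applied to the values in the warning, the "n" and "p" fields of the "next"
entry pass this check, while the node address ffff8801ac029288 does not.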

incidentally, kmemleak output:

unreferenced object 0xffff8801aea4a000 (size 232):
  comm "swapper", pid 1, jiffies 4294893947 (age 78417.732s)
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<ffffffff816c02d5>] kmemleak_alloc+0x25/0x50
    [<ffffffff810ea042>] kmem_cache_alloc+0x112/0x1a0
    [<ffffffff815d1495>] __alloc_skb+0x45/0x160
    [<ffffffff81c5e139>] llc_station_init+0xf1/0x13f
    [<ffffffff81c5dfa7>] llc2_init+0x2b/0xcc
    [<ffffffff810001de>] do_one_initcall+0x3e/0x170
    [<ffffffff81c316cc>] kernel_init+0x143/0x1cc
    [<ffffffff81003b14>] kernel_thread_helper+0x4/0x10
    [<ffffffffffffffff>] 0xffffffffffffffff
unreferenced object 0xffff8801ac658ff8 (size 512):
  comm "swapper", pid 1, jiffies 4294893947 (age 78417.736s)
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<ffffffff816c02d5>] kmemleak_alloc+0x25/0x50
    [<ffffffff810ea86b>] __kmalloc_track_caller+0x12b/0x250
    [<ffffffff815d14c2>] __alloc_skb+0x72/0x160
    [<ffffffff81c5e139>] llc_station_init+0xf1/0x13f
    [<ffffffff81c5dfa7>] llc2_init+0x2b/0xcc
    [<ffffffff810001de>] do_one_initcall+0x3e/0x170
    [<ffffffff81c316cc>] kernel_init+0x143/0x1cc
    [<ffffffff81003b14>] kernel_thread_helper+0x4/0x10
    [<ffffffffffffffff>] 0xffffffffffffffff
unreferenced object 0xffff8801a8c5af18 (size 1024):
  comm "mplayer", pid 4392, jiffies 4307692268 (age 27224.632s)
  hex dump (first 32 bytes):
    00 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00  ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<ffffffff816c02d5>] kmemleak_alloc+0x25/0x50
    [<ffffffff810ea86b>] __kmalloc_track_caller+0x12b/0x250
    [<ffffffff810cd75b>] memdup_user+0x2b/0x90
    [<ffffffff815b20ff>] snd_pcm_common_ioctl1+0x8f/0xbe0
    [<ffffffff815b2f5d>] snd_pcm_playback_ioctl1+0x3d/0x280
    [<ffffffff815b390d>] snd_pcm_playback_ioctl+0x3d/0x50
    [<ffffffff81101d71>] do_vfs_ioctl+0xa1/0x5a0
    [<ffffffff811022ba>] sys_ioctl+0x4a/0x80
    [<ffffffff81002d6b>] system_call_fastpath+0x16/0x1b
    [<ffffffffffffffff>] 0xffffffffffffffff

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [linux-pm] [2.6.36-rc4/HEAD] unable to handle kernel NULL pointer dereference (plist_add)
  2010-09-15 13:17       ` [linux-pm] " mark gross
  2010-09-16  0:23         ` Simon Kirby
@ 2010-09-16  0:23         ` Simon Kirby
  2010-09-16  2:52           ` mark gross
  2010-09-16  2:52           ` [linux-pm] " mark gross
  1 sibling, 2 replies; 22+ messages in thread
From: Simon Kirby @ 2010-09-16  0:23 UTC (permalink / raw)
  To: Rafael J. Wysocki, mark gross
  Cc: Takashi Iwai, linux-pm, linux-kernel, James Bottomley

On Wed, Sep 15, 2010 at 06:17:12AM -0700, mark gross wrote:

> On Wed, Sep 15, 2010 at 10:29:54AM +0200, Takashi Iwai wrote:
> > At Wed, 15 Sep 2010 00:06:26 +0200,
> > Rafael J. Wysocki wrote:
> > > 
> > > On Tuesday, September 14, 2010, Rafael J. Wysocki wrote:
> > > > On Monday, September 13, 2010, Simon Kirby wrote:
> > > > > Hi!
> > > > > 
> > > > > At first I thought I was hitting a suspend bug, but it was because I had an
> > > > > rmmod/modprobe of snd_ice1724 in my "gosleep" script to work around my
> > > > > sound sounding crackly after resume.  I can reproduce this same crash
> > > > > simply by using sound, rmmod snd_ice1724, modprobe snd_ice1724, and then
> > > > > using sound again.
> > > > > 
> > > > > I git-bisected to:
> > > > > 82f682514a5df89ffb3890627eebf0897b7a84ec is the first bad commit
> > > > > commit 82f682514a5df89ffb3890627eebf0897b7a84ec
> > > > > Author: James Bottomley <James.Bottomley@suse.de>
> > > > > Date:   Mon Jul 5 22:53:06 2010 +0200
> > > > > 
> > > > >     pm_qos: Get rid of the allocation in pm_qos_add_request()
> > > > > 
> > > > >     All current users of pm_qos_add_request() have the ability to supply
> > > > >     the memory required by the pm_qos routines, so make them do this and
> > > > >     eliminate the kmalloc() with pm_qos_add_request().  This has the
> > > > >     double benefit of making the call never fail and allowing it to be
> > > > >     called from atomic context.
> > > > > 
> > > > >     Signed-off-by: James Bottomley <James.Bottomley@suse.de>
> > > > >     Signed-off-by: mark gross <markgross@thegnar.org>
> > > > >     Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
> > > > > 
> > > > > :040000 040000 0370d037cfe56ae4079bc47713b98c5961de258f 92d6437da54649496c84bacc16ef4b17f8087bbd M      drivers
> > > > > :040000 040000 6ec551133758633de70e97b4a41d893a5c050e74 4ac17ea227c07d9cf7dae297a40bbbcc374c232f M      include
> > > > > :040000 040000 d0bb4b15e77a0976940af48588ef0e9b9fd7590b ac682974217c7e75a771b92c8efa7c83659d2916 M      kernel
> > > > > :040000 040000 9ab6e37ec622354cb93abc2b7424b10263bcd007 011680bad3da4b60aebb659eaeb85a037e2513f3 M      sound
> > > > > 
> > > > > ...but I can't revert this over HEAD due to conflicts and I can't see
> > > > > what's wrong in the diff.  2.6.35 is fine.
> > > > > 
> > > > > Simon-
> > > > > 
> > > > > [   61.234230] ICE1724 0000:01:06.0: PCI INT A disabled
> > > > > [   61.240010] ICE1724 0000:01:06.0: PCI INT A -> GSI 21 (level, low) -> IRQ 21
> > > > > [   62.156008] BUG: unable to handle kernel NULL pointer dereference at (null)
> > > > > [   62.156008] IP: [<ffffffff81379706>] plist_add+0x36/0xa0
> > > > > [   62.156008] PGD 1ad813067 PUD 1ad989067 PMD 0 
> > > > > [   62.156008] Oops: 0000 [#1] SMP 
> > > > > [   62.156008] last sysfs file: /sys/devices/pci0000:00/0000:00:14.4/0000:01:06.0/sound/card0/uevent
> > > > > [   62.156118] CPU 1 
> > > > > [   62.156147] Modules linked in: snd_ice1724 sco bnep rfcomm l2cap bluetooth ppdev hwmon_vid usb_storage tun i2c_viapro snd_rawmidi snd_ice17xx_ak4xxx snd_ac97_codec ac97_bus snd_ak4xxx_adda snd_ak4114 snd_pt2258 snd_i2c parport_pc r8169 parport snd_ak4113 k10temp [last unloaded: snd_ice1724]
> > > > > [   62.156849] 
> > > > > [   62.156925] Pid: 3053, comm: mplayer Not tainted 2.6.36-rc4-oof+ #6 M4A79T Deluxe/System Product Name
> > > > > [   62.157058] RIP: 0010:[<ffffffff81379706>]  [<ffffffff81379706>] plist_add+0x36/0xa0
> > > > > [   62.157217] RSP: 0018:ffff8801ac7f3cd8  EFLAGS: 00010002
> > > > > [   62.157297] RAX: fffffffffffffff8 RBX: ffff8801a93eae40 RCX: ffff8801ac48e048
> > > > > [   62.157386] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8801a93eae40
> > > > > [   62.157475] RBP: ffff8801ac7f3cf8 R08: 0000000000005356 R09: 0000000000004000
> > > > > [   62.157564] R10: 0000000000000000 R11: 00000000fffffffe R12: ffffffff8195a2a0
> > > > > [   62.157590] R13: ffff8801a93eae58 R14: 0000000000003e80 R15: ffffffff8195a2b0
> > > > > [   62.157590] FS:  00007f79749f5860(0000) GS:ffff880001c80000(0000) knlGS:0000000000000000
> > > > > [   62.157590] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > > > [   62.157590] CR2: 0000000000000000 CR3: 00000001aef63000 CR4: 00000000000006e0
> > > > > [   62.157590] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > > > > [   62.157590] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> > > > > [   62.157590] Process mplayer (pid: 3053, threadinfo ffff8801ac7f2000, task ffff8801ac535cd0)
> > > > > [   62.157590] Stack:
> > > > > [   62.157590]  ffff8801ac7f3d18 ffffffff8195a2a0 ffff8801a93eae40 00000000ffffffff
> > > > > [   62.157590] <0> ffff8801ac7f3d48 ffffffff8105f35b ffff880100000000 0000000000000292
> > > > > [   62.157590] <0> ffff8801ac7f3d58 ffff8801a93eae40 ffff8801ad849400 ffff8801ad849c00
> > > > > [   62.157590] Call Trace:
> > > > > [   62.157590]  [<ffffffff8105f35b>] update_target+0x12b/0x140
> > > > > [   62.157590]  [<ffffffff8105f5aa>] pm_qos_add_request+0x4a/0x80
> > > > > [   62.157590]  [<ffffffff8157e8a4>] snd_pcm_hw_params+0x2d4/0x3a0
> > > > > [   62.157590]  [<ffffffff8157ed41>] snd_pcm_common_ioctl1+0xb1/0xb90
> > > > > [   62.157590]  [<ffffffff810bcbc2>] ? handle_mm_fault+0x192/0xa50
> > > > > [   62.157590]  [<ffffffff8157facd>] snd_pcm_playback_ioctl1+0x3d/0x220
> > > > > [   62.157590]  [<ffffffff81698b9f>] ? do_page_fault+0x17f/0x440
> > > > > [   62.157590]  [<ffffffff8158035d>] snd_pcm_playback_ioctl+0x3d/0x50
> > > > > [   62.157590]  [<ffffffff810e8787>] do_vfs_ioctl+0x97/0x530
> > > > > [   62.157590]  [<ffffffff810c493c>] ? do_mmap_pgoff+0x34c/0x3a0
> > > > > [   62.157590]  [<ffffffff810e8c6a>] sys_ioctl+0x4a/0x80
> > > > > [   62.157590]  [<ffffffff81002ceb>] system_call_fastpath+0x16/0x1b
> > > > > [   62.157590] Code: 54 49 89 f4 53 48 89 fb 48 83 ec 08 4c 3b 6f 18 75 69 49 8b 04 24 48 83 e8 08 eb 0f 90 8b 30 39 33 7c 18 66 90 74 4e 48 8d 42 f8 <48> 8b 50 08 48 8d 48 08 4c 39 e1 0f 18 0a 75 e2 48 8b 50 10 48 
> > > > > [   62.157590] RIP  [<ffffffff81379706>] plist_add+0x36/0xa0
> > > > > [   62.157590]  RSP <ffff8801ac7f3cd8>
> > > > > [   62.157590] CR2: 0000000000000000
> > > > > [   62.157590] ---[ end trace 853ed59ac5273c1f ]---
> > > > 
> > > > Hmm, interesting.  This looks like a plist corruption to me, but can you please
> > > > check (using gdb) what line of code corresponds to the address
> > > > plist_add+0x36/0xa0 ?
> > > 
> > > Well, my current theory is that the list member of latency_pm_qos_req in
> > > struct snd_pcm_substream gets corrupted because of the changed size of the
> > > structure.  Then, deleting it from the plist corrupts the plist itself, which
> > > only turns up when we try to add a new item to it.
> > > 
> > > If that's really the case, the (untested!) patch below should help (unless I broke it).
> > 
> > Could you try first the patch in kernel bugzilla 17922?
> > This might be an unreleased object, mostly only triggered via OSS
> > emulation.
> >     https://bugzilla.kernel.org/show_bug.cgi?id=17922
> 
> Adding the pm_qos_remove_request() call in snd_pcm_release_substream() may
> be all that's needed to fix the problem.  If snd_pcm unloads its module
> without removing the qos request from the list, then the very next pm_qos
> operation that requires a list walk, such as adding or removing an element,
> will trigger a failure.
> 
> i.e. the second hunk of the change in the above bugzilla should be all
> that is needed.

Confirmed: I can't reproduce an Oops with the second hunk applied.  My
snd_pcm_release_substream() looks a bit different than that diff, though. 
Here is a diff against git HEAD I pulled yesterday.  The lines may best
be inserted in a slightly different place.

Simon-

diff --git a/sound/core/pcm_native.c b/sound/core/pcm_native.c
index 134fc6c..d4eb2ef 100644
--- a/sound/core/pcm_native.c
+++ b/sound/core/pcm_native.c
@@ -1992,6 +1992,8 @@ void snd_pcm_release_substream(struct snd_pcm_substream *substream)
 		substream->ops->close(substream);
 		substream->hw_opened = 0;
 	}
+	if (pm_qos_request_active(&substream->latency_pm_qos_req))
+		pm_qos_remove_request(&substream->latency_pm_qos_req);
 	if (substream->pcm_release) {
 		substream->pcm_release(substream);
 		substream->pcm_release = NULL;
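
As a rough illustration of why the guard in the hunk above matters, here is a
userspace sketch.  It assumes a plain circular list rather than the kernel's
plist, and every name in it (struct request, struct stream, request_add(),
request_remove(), stream_release(), open_use_release()) is made up for the
example and is not kernel API.  The point is only that a request embedded in
the stream has to be unlinked from the global list before the stream's memory
goes away.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Minimal circular doubly linked list, loosely modelled on list_head. */
struct node {
	struct node *prev, *next;
};

/* Hypothetical stand-in for a caller-owned pm_qos request. */
struct request {
	struct node link;
	int active;	/* plays the role of pm_qos_request_active() */
	int value;
};

/* Hypothetical stand-in for a substream with an embedded request. */
struct stream {
	struct request latency_req;
};

/* Global list of requests; an empty list points at itself. */
static struct node requests = { &requests, &requests };

/* Append the request at the tail of the global list. */
static void request_add(struct request *req, int value)
{
	req->value = value;
	req->link.next = &requests;
	req->link.prev = requests.prev;
	requests.prev->next = &req->link;
	requests.prev = &req->link;
	req->active = 1;
}

/* Unlink the request and mark it inactive. */
static void request_remove(struct request *req)
{
	req->link.prev->next = req->link.next;
	req->link.next->prev = req->link.prev;
	req->link.prev = req->link.next = NULL;
	req->active = 0;
}

/* Walk the whole list, as any later add or remove would have to. */
static size_t request_count(void)
{
	size_t n = 0;

	for (struct node *p = requests.next; p != &requests; p = p->next)
		n++;
	return n;
}

/* Release path with the guard: unlink first, then free the memory. */
static void stream_release(struct stream *s)
{
	if (s->latency_req.active)
		request_remove(&s->latency_req);
	free(s);
}

/* Open a stream, register a request, release it; return the list length. */
static size_t open_use_release(int value)
{
	struct stream *s = calloc(1, sizeof(*s));

	assert(s != NULL);
	request_add(&s->latency_req, value);
	stream_release(s);
	return request_count();
}
```

Calling open_use_release() twice in a row leaves the list empty both times.
Without the if block in stream_release(), the first call would leave a node
pointing into freed memory, and the walk in the second call would trip over
it, which is roughly the situation the oops suggests.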

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [linux-pm] [2.6.36-rc4/HEAD] unable to handle kernel NULL pointer dereference (plist_add)
  2010-09-16  0:23         ` [linux-pm] " Simon Kirby
  2010-09-16  2:52           ` mark gross
@ 2010-09-16  2:52           ` mark gross
  1 sibling, 0 replies; 22+ messages in thread
From: mark gross @ 2010-09-16  2:52 UTC (permalink / raw)
  To: Simon Kirby
  Cc: Rafael J. Wysocki, mark gross, Takashi Iwai, linux-pm,
	linux-kernel, James Bottomley

On Wed, Sep 15, 2010 at 05:23:03PM -0700, Simon Kirby wrote:
> On Wed, Sep 15, 2010 at 06:17:12AM -0700, mark gross wrote:
> 
> > On Wed, Sep 15, 2010 at 10:29:54AM +0200, Takashi Iwai wrote:
> > > At Wed, 15 Sep 2010 00:06:26 +0200,
> > > Rafael J. Wysocki wrote:
> > > > 
> > > > On Tuesday, September 14, 2010, Rafael J. Wysocki wrote:
> > > > > On Monday, September 13, 2010, Simon Kirby wrote:
> > > > > > Hi!
> > > > > > 
> > > > > > > At first I thought I was hitting a suspend bug, but it was because I had an
> > > > > > rmmod/modprobe of snd_ice1724 in my "gosleep" script to work around my
> > > > > > sound sounding crackly after resume.  I can reproduce this same crash
> > > > > > simply by using sound, rmmod snd_ice1724, modprobe snd_ice1724, and then
> > > > > > using sound again.
> > > > > > 
> > > > > > I git-bisected to:
> > > > > > 82f682514a5df89ffb3890627eebf0897b7a84ec is the first bad commit
> > > > > > commit 82f682514a5df89ffb3890627eebf0897b7a84ec
> > > > > > Author: James Bottomley <James.Bottomley@suse.de>
> > > > > > Date:   Mon Jul 5 22:53:06 2010 +0200
> > > > > > 
> > > > > >     pm_qos: Get rid of the allocation in pm_qos_add_request()
> > > > > > 
> > > > > >     All current users of pm_qos_add_request() have the ability to supply
> > > > > >     the memory required by the pm_qos routines, so make them do this and
> > > > > >     eliminate the kmalloc() with pm_qos_add_request().  This has the
> > > > > >     double benefit of making the call never fail and allowing it to be
> > > > > >     called from atomic context.
> > > > > > 
> > > > > >     Signed-off-by: James Bottomley <James.Bottomley@suse.de>
> > > > > >     Signed-off-by: mark gross <markgross@thegnar.org>
> > > > > >     Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
> > > > > > 
> > > > > > :040000 040000 0370d037cfe56ae4079bc47713b98c5961de258f 92d6437da54649496c84bacc16ef4b17f8087bbd M      drivers
> > > > > > :040000 040000 6ec551133758633de70e97b4a41d893a5c050e74 4ac17ea227c07d9cf7dae297a40bbbcc374c232f M      include
> > > > > > :040000 040000 d0bb4b15e77a0976940af48588ef0e9b9fd7590b ac682974217c7e75a771b92c8efa7c83659d2916 M      kernel
> > > > > > :040000 040000 9ab6e37ec622354cb93abc2b7424b10263bcd007 011680bad3da4b60aebb659eaeb85a037e2513f3 M      sound
> > > > > > 
> > > > > > ...but I can't revert this over HEAD due to conflicts and I can't see
> > > > > > what's wrong in the diff.  2.6.35 is fine.
> > > > > > 
> > > > > > Simon-
> > > > > > 
> > > > > > [   61.234230] ICE1724 0000:01:06.0: PCI INT A disabled
> > > > > > [   61.240010] ICE1724 0000:01:06.0: PCI INT A -> GSI 21 (level, low) -> IRQ 21
> > > > > > [   62.156008] BUG: unable to handle kernel NULL pointer dereference at (null)
> > > > > > [   62.156008] IP: [<ffffffff81379706>] plist_add+0x36/0xa0
> > > > > > [   62.156008] PGD 1ad813067 PUD 1ad989067 PMD 0 
> > > > > > [   62.156008] Oops: 0000 [#1] SMP 
> > > > > > [   62.156008] last sysfs file: /sys/devices/pci0000:00/0000:00:14.4/0000:01:06.0/sound/card0/uevent
> > > > > > [   62.156118] CPU 1 
> > > > > > [   62.156147] Modules linked in: snd_ice1724 sco bnep rfcomm l2cap bluetooth ppdev hwmon_vid usb_storage tun i2c_viapro snd_rawmidi snd_ice17xx_ak4xxx snd_ac97_codec ac97_bus snd_ak4xxx_adda snd_ak4114 snd_pt2258 snd_i2c parport_pc r8169 parport snd_ak4113 k10temp [last unloaded: snd_ice1724]
> > > > > > [   62.156849] 
> > > > > > [   62.156925] Pid: 3053, comm: mplayer Not tainted 2.6.36-rc4-oof+ #6 M4A79T Deluxe/System Product Name
> > > > > > [   62.157058] RIP: 0010:[<ffffffff81379706>]  [<ffffffff81379706>] plist_add+0x36/0xa0
> > > > > > [   62.157217] RSP: 0018:ffff8801ac7f3cd8  EFLAGS: 00010002
> > > > > > [   62.157297] RAX: fffffffffffffff8 RBX: ffff8801a93eae40 RCX: ffff8801ac48e048
> > > > > > [   62.157386] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8801a93eae40
> > > > > > [   62.157475] RBP: ffff8801ac7f3cf8 R08: 0000000000005356 R09: 0000000000004000
> > > > > > [   62.157564] R10: 0000000000000000 R11: 00000000fffffffe R12: ffffffff8195a2a0
> > > > > > [   62.157590] R13: ffff8801a93eae58 R14: 0000000000003e80 R15: ffffffff8195a2b0
> > > > > > [   62.157590] FS:  00007f79749f5860(0000) GS:ffff880001c80000(0000) knlGS:0000000000000000
> > > > > > [   62.157590] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > > > > [   62.157590] CR2: 0000000000000000 CR3: 00000001aef63000 CR4: 00000000000006e0
> > > > > > [   62.157590] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > > > > > [   62.157590] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> > > > > > [   62.157590] Process mplayer (pid: 3053, threadinfo ffff8801ac7f2000, task ffff8801ac535cd0)
> > > > > > [   62.157590] Stack:
> > > > > > [   62.157590]  ffff8801ac7f3d18 ffffffff8195a2a0 ffff8801a93eae40 00000000ffffffff
> > > > > > [   62.157590] <0> ffff8801ac7f3d48 ffffffff8105f35b ffff880100000000 0000000000000292
> > > > > > [   62.157590] <0> ffff8801ac7f3d58 ffff8801a93eae40 ffff8801ad849400 ffff8801ad849c00
> > > > > > [   62.157590] Call Trace:
> > > > > > [   62.157590]  [<ffffffff8105f35b>] update_target+0x12b/0x140
> > > > > > [   62.157590]  [<ffffffff8105f5aa>] pm_qos_add_request+0x4a/0x80
> > > > > > [   62.157590]  [<ffffffff8157e8a4>] snd_pcm_hw_params+0x2d4/0x3a0
> > > > > > [   62.157590]  [<ffffffff8157ed41>] snd_pcm_common_ioctl1+0xb1/0xb90
> > > > > > [   62.157590]  [<ffffffff810bcbc2>] ? handle_mm_fault+0x192/0xa50
> > > > > > [   62.157590]  [<ffffffff8157facd>] snd_pcm_playback_ioctl1+0x3d/0x220
> > > > > > [   62.157590]  [<ffffffff81698b9f>] ? do_page_fault+0x17f/0x440
> > > > > > [   62.157590]  [<ffffffff8158035d>] snd_pcm_playback_ioctl+0x3d/0x50
> > > > > > [   62.157590]  [<ffffffff810e8787>] do_vfs_ioctl+0x97/0x530
> > > > > > [   62.157590]  [<ffffffff810c493c>] ? do_mmap_pgoff+0x34c/0x3a0
> > > > > > [   62.157590]  [<ffffffff810e8c6a>] sys_ioctl+0x4a/0x80
> > > > > > [   62.157590]  [<ffffffff81002ceb>] system_call_fastpath+0x16/0x1b
> > > > > > [   62.157590] Code: 54 49 89 f4 53 48 89 fb 48 83 ec 08 4c 3b 6f 18 75 69 49 8b 04 24 48 83 e8 08 eb 0f 90 8b 30 39 33 7c 18 66 90 74 4e 48 8d 42 f8 <48> 8b 50 08 48 8d 48 08 4c 39 e1 0f 18 0a 75 e2 48 8b 50 10 48 
> > > > > > [   62.157590] RIP  [<ffffffff81379706>] plist_add+0x36/0xa0
> > > > > > [   62.157590]  RSP <ffff8801ac7f3cd8>
> > > > > > [   62.157590] CR2: 0000000000000000
> > > > > > [   62.157590] ---[ end trace 853ed59ac5273c1f ]---
> > > > > 
> > > > > Hmm, interesting.  This looks like a plist corruption to me, but can you please
> > > > > check (using gdb) what line of code corresponds to the address
> > > > > plist_add+0x36/0xa0 ?
> > > > 
> > > > Well, my current theory is that the list member of latency_pm_qos_req in
> > > > struct snd_pcm_substream gets corrupted because of the changed size of the
> > > > structure.  Then, deleting it from the plist corrupts the plist itself, which
> > > > only turns up when we try to add a new item to it.
> > > > 
> > > > If that's really the case, the (untested!) patch below should help (unless I broke it).
> > > 
> > > Could you try first the patch in kernel bugzilla 17922?
> > > This might be an unreleased object, mostly only triggered via OSS
> > > emulation.
> > >     https://bugzilla.kernel.org/show_bug.cgi?id=17922
> > 
> > > Adding the pm_qos_remove_request() call in snd_pcm_release_substream() may
> > > be all that's needed to fix the problem.  If snd_pcm unloads its module
> > > without removing the qos request from the list, then the very next pm_qos
> > > operation that requires a list walk, such as adding or removing an element,
> > > will trigger a failure.
> > 
> > i.e. the second hunk of the change in the above bugzilla should be all
> > that is needed.
> 
> Confirmed: I can't reproduce an Oops with the second hunk applied.  My
> snd_pcm_release_substream() looks a bit different than that diff, though. 
> Here is a diff against git HEAD I pulled yesterday.  The lines may best
> be inserted in a slightly different place.
> 
> Simon-
> 
> diff --git a/sound/core/pcm_native.c b/sound/core/pcm_native.c
> index 134fc6c..d4eb2ef 100644
> --- a/sound/core/pcm_native.c
> +++ b/sound/core/pcm_native.c
> @@ -1992,6 +1992,8 @@ void snd_pcm_release_substream(struct snd_pcm_substream *substream)
>  		substream->ops->close(substream);
>  		substream->hw_opened = 0;
>  	}
> +	if (pm_qos_request_active(&substream->latency_pm_qos_req))
> +		pm_qos_remove_request(&substream->latency_pm_qos_req);
>  	if (substream->pcm_release) {
>  		substream->pcm_release(substream);
>  		substream->pcm_release = NULL;

yay!

sorry for the trouble.

--mark


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [2.6.36-rc4/HEAD] unable to handle kernel NULL pointer dereference (plist_add)
  2010-09-16  0:23         ` [linux-pm] " Simon Kirby
@ 2010-09-16  2:52           ` mark gross
  2010-09-16  2:52           ` [linux-pm] " mark gross
  1 sibling, 0 replies; 22+ messages in thread
From: mark gross @ 2010-09-16  2:52 UTC (permalink / raw)
  To: Simon Kirby
  Cc: mark gross, Takashi Iwai, linux-kernel, James Bottomley, linux-pm

On Wed, Sep 15, 2010 at 05:23:03PM -0700, Simon Kirby wrote:
> On Wed, Sep 15, 2010 at 06:17:12AM -0700, mark gross wrote:
> 
> > On Wed, Sep 15, 2010 at 10:29:54AM +0200, Takashi Iwai wrote:
> > > At Wed, 15 Sep 2010 00:06:26 +0200,
> > > Rafael J. Wysocki wrote:
> > > > 
> > > > On Tuesday, September 14, 2010, Rafael J. Wysocki wrote:
> > > > > On Monday, September 13, 2010, Simon Kirby wrote:
> > > > > > Hi!
> > > > > > 
> > > > > > At first I thought I hitting a suspend bug, but it was because I had an
> > > > > > rmmod/modprobe of snd_ice1724 in my "gosleep" script to work around my
> > > > > > sound sounding crackly after resume.  I can reproduce this same crash
> > > > > > simply by using sound, rmmod snd_ice1724, modprobe snd_ice1724, and then
> > > > > > using sound again.
> > > > > > 
> > > > > > I git-bisected to:
> > > > > > 82f682514a5df89ffb3890627eebf0897b7a84ec is the first bad commit
> > > > > > commit 82f682514a5df89ffb3890627eebf0897b7a84ec
> > > > > > Author: James Bottomley <James.Bottomley@suse.de>
> > > > > > Date:   Mon Jul 5 22:53:06 2010 +0200
> > > > > > 
> > > > > >     pm_qos: Get rid of the allocation in pm_qos_add_request()
> > > > > > 
> > > > > >     All current users of pm_qos_add_request() have the ability to supply
> > > > > >     the memory required by the pm_qos routines, so make them do this and
> > > > > >     eliminate the kmalloc() with pm_qos_add_request().  This has the
> > > > > >     double benefit of making the call never fail and allowing it to be
> > > > > >     called from atomic context.
> > > > > > 
> > > > > >     Signed-off-by: James Bottomley <James.Bottomley@suse.de>
> > > > > >     Signed-off-by: mark gross <markgross@thegnar.org>
> > > > > >     Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
> > > > > > 
> > > > > > :040000 040000 0370d037cfe56ae4079bc47713b98c5961de258f 92d6437da54649496c84bacc16ef4b17f8087bbd M      drivers
> > > > > > :040000 040000 6ec551133758633de70e97b4a41d893a5c050e74 4ac17ea227c07d9cf7dae297a40bbbcc374c232f M      include
> > > > > > :040000 040000 d0bb4b15e77a0976940af48588ef0e9b9fd7590b ac682974217c7e75a771b92c8efa7c83659d2916 M      kernel
> > > > > > :040000 040000 9ab6e37ec622354cb93abc2b7424b10263bcd007 011680bad3da4b60aebb659eaeb85a037e2513f3 M      sound
> > > > > > 
> > > > > > ...but I can't revert this over HEAD due to conflicts and I can't see
> > > > > > what's wrong in the diff.  2.6.35 is fine.
> > > > > > 
> > > > > > Simon-
> > > > > > 
> > > > > > [   61.234230] ICE1724 0000:01:06.0: PCI INT A disabled
> > > > > > [   61.240010] ICE1724 0000:01:06.0: PCI INT A -> GSI 21 (level, low) -> IRQ 21
> > > > > > [   62.156008] BUG: unable to handle kernel NULL pointer dereference at (null)
> > > > > > [   62.156008] IP: [<ffffffff81379706>] plist_add+0x36/0xa0
> > > > > > [   62.156008] PGD 1ad813067 PUD 1ad989067 PMD 0 
> > > > > > [   62.156008] Oops: 0000 [#1] SMP 
> > > > > > [   62.156008] last sysfs file: /sys/devices/pci0000:00/0000:00:14.4/0000:01:06.0/sound/card0/uevent
> > > > > > [   62.156118] CPU 1 
> > > > > > [   62.156147] Modules linked in: snd_ice1724 sco bnep rfcomm l2cap bluetooth ppdev hwmon_vid usb_storage tun i2c_viapro snd_rawmidi snd_ice17xx_ak4xxx snd_ac97_codec ac97_bus snd_ak4xxx_adda snd_ak4114 snd_pt2258 snd_i2c parport_pc r8169 parport snd_ak4113 k10temp [last unloaded: snd_ice1724]
> > > > > > [   62.156849] 
> > > > > > [   62.156925] Pid: 3053, comm: mplayer Not tainted 2.6.36-rc4-oof+ #6 M4A79T Deluxe/System Product Name
> > > > > > [   62.157058] RIP: 0010:[<ffffffff81379706>]  [<ffffffff81379706>] plist_add+0x36/0xa0
> > > > > > [   62.157217] RSP: 0018:ffff8801ac7f3cd8  EFLAGS: 00010002
> > > > > > [   62.157297] RAX: fffffffffffffff8 RBX: ffff8801a93eae40 RCX: ffff8801ac48e048
> > > > > > [   62.157386] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8801a93eae40
> > > > > > [   62.157475] RBP: ffff8801ac7f3cf8 R08: 0000000000005356 R09: 0000000000004000
> > > > > > [   62.157564] R10: 0000000000000000 R11: 00000000fffffffe R12: ffffffff8195a2a0
> > > > > > [   62.157590] R13: ffff8801a93eae58 R14: 0000000000003e80 R15: ffffffff8195a2b0
> > > > > > [   62.157590] FS:  00007f79749f5860(0000) GS:ffff880001c80000(0000) knlGS:0000000000000000
> > > > > > [   62.157590] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > > > > [   62.157590] CR2: 0000000000000000 CR3: 00000001aef63000 CR4: 00000000000006e0
> > > > > > [   62.157590] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > > > > > [   62.157590] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> > > > > > [   62.157590] Process mplayer (pid: 3053, threadinfo ffff8801ac7f2000, task ffff8801ac535cd0)
> > > > > > [   62.157590] Stack:
> > > > > > [   62.157590]  ffff8801ac7f3d18 ffffffff8195a2a0 ffff8801a93eae40 00000000ffffffff
> > > > > > [   62.157590] <0> ffff8801ac7f3d48 ffffffff8105f35b ffff880100000000 0000000000000292
> > > > > > [   62.157590] <0> ffff8801ac7f3d58 ffff8801a93eae40 ffff8801ad849400 ffff8801ad849c00
> > > > > > [   62.157590] Call Trace:
> > > > > > [   62.157590]  [<ffffffff8105f35b>] update_target+0x12b/0x140
> > > > > > [   62.157590]  [<ffffffff8105f5aa>] pm_qos_add_request+0x4a/0x80
> > > > > > [   62.157590]  [<ffffffff8157e8a4>] snd_pcm_hw_params+0x2d4/0x3a0
> > > > > > [   62.157590]  [<ffffffff8157ed41>] snd_pcm_common_ioctl1+0xb1/0xb90
> > > > > > [   62.157590]  [<ffffffff810bcbc2>] ? handle_mm_fault+0x192/0xa50
> > > > > > [   62.157590]  [<ffffffff8157facd>] snd_pcm_playback_ioctl1+0x3d/0x220
> > > > > > [   62.157590]  [<ffffffff81698b9f>] ? do_page_fault+0x17f/0x440
> > > > > > [   62.157590]  [<ffffffff8158035d>] snd_pcm_playback_ioctl+0x3d/0x50
> > > > > > [   62.157590]  [<ffffffff810e8787>] do_vfs_ioctl+0x97/0x530
> > > > > > [   62.157590]  [<ffffffff810c493c>] ? do_mmap_pgoff+0x34c/0x3a0
> > > > > > [   62.157590]  [<ffffffff810e8c6a>] sys_ioctl+0x4a/0x80
> > > > > > [   62.157590]  [<ffffffff81002ceb>] system_call_fastpath+0x16/0x1b
> > > > > > [   62.157590] Code: 54 49 89 f4 53 48 89 fb 48 83 ec 08 4c 3b 6f 18 75 69 49 8b 04 24 48 83 e8 08 eb 0f 90 8b 30 39 33 7c 18 66 90 74 4e 48 8d 42 f8 <48> 8b 50 08 48 8d 48 08 4c 39 e1 0f 18 0a 75 e2 48 8b 50 10 48 
> > > > > > [   62.157590] RIP  [<ffffffff81379706>] plist_add+0x36/0xa0
> > > > > > [   62.157590]  RSP <ffff8801ac7f3cd8>
> > > > > > [   62.157590] CR2: 0000000000000000
> > > > > > [   62.157590] ---[ end trace 853ed59ac5273c1f ]---
> > > > > 
> > > > > Hmm, interesting.  This looks like a plist corruption to me, but can you please
> > > > > check (using gdb) what line of code corresponds to the address
> > > > > plist_add+0x36/0xa0 ?
> > > > 
> > > > Well, my current theory is that the list member of latency_pm_qos_req in
> > > > struct snd_pcm_substream gets corrupted because of the changed size of the
> > > > structure.  Then, deleting it from the plist corrupts the plist itself, which
> > > > only turns up when we try to add a new item to it.
> > > > 
> > > > If that's really the case, the (untested!) patch below should help (unless I broke it).
> > > 
> > > Could you try first the patch in kernel bugzilla 17922?
> > > This might be an unreleased object, mostly only triggered via OSS
> > > emulation.
> > >     https://bugzilla.kernel.org/show_bug.cgi?id=17922
> > 
> > Adding the pm_qos_remove_request() call in snd_pcm_release_substream() may
> > be all that's needed to fix the problem.  If snd_pcm unloads its module
> > without removing the qos request from the list, then the very next
> > pm_qos operation that requires a list walk, such as adding or removing
> > an element, would trigger a failure.
> > 
> > i.e. the second hunk of the change in the above bugzilla should be all
> > that is needed.
> 
> Confirmed: I can't reproduce an Oops with the second hunk applied.  My
> snd_pcm_release_substream() looks a bit different than that diff, though. 
> Here is a diff against git HEAD I pulled yesterday.  The lines may best
> be inserted in a slightly different place.
> 
> Simon-
> 
> diff --git a/sound/core/pcm_native.c b/sound/core/pcm_native.c
> index 134fc6c..d4eb2ef 100644
> --- a/sound/core/pcm_native.c
> +++ b/sound/core/pcm_native.c
> @@ -1992,6 +1992,8 @@ void snd_pcm_release_substream(struct snd_pcm_substream *substream)
>  		substream->ops->close(substream);
>  		substream->hw_opened = 0;
>  	}
> +	if (pm_qos_request_active(&substream->latency_pm_qos_req))
> +		pm_qos_remove_request(&substream->latency_pm_qos_req);
>  	if (substream->pcm_release) {
>  		substream->pcm_release(substream);
>  		substream->pcm_release = NULL;

yay!

sorry for the trouble.

--mark
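
For readers following the thread, here is a minimal userspace sketch of the
pattern in Simon's diff above.  It assumes a plain circular doubly linked
list in place of the kernel plist, and every name in it (struct qos_req,
qos_add, qos_remove, qos_count, release_substream, reload_cycle) is a
hypothetical stand-in, not the kernel API.  It only illustrates why a
request embedded in the substream has to be unlinked on release.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's list node and pm_qos request. */
struct node {
	struct node *prev, *next;
};

struct qos_req {
	struct node link;
	bool active;		/* models pm_qos_request_active() */
};

/* Global list head, analogous to a shared pm_qos request list. */
static struct node qos_list = { &qos_list, &qos_list };

static void qos_add(struct qos_req *req)
{
	/* Tail insert: this writes through pointers owned by earlier requests. */
	req->link.prev = qos_list.prev;
	req->link.next = &qos_list;
	qos_list.prev->next = &req->link;
	qos_list.prev = &req->link;
	req->active = true;
}

static void qos_remove(struct qos_req *req)
{
	req->link.prev->next = req->link.next;
	req->link.next->prev = req->link.prev;
	req->link.prev = req->link.next = NULL;
	req->active = false;
}

static size_t qos_count(void)
{
	size_t n = 0;

	for (struct node *p = qos_list.next; p != &qos_list; p = p->next)
		n++;
	return n;
}

/* Models a substream that embeds its request instead of allocating it. */
struct substream {
	struct qos_req latency_req;
};

/* Release path with the guard from the diff: unlink before the memory goes away. */
static void release_substream(struct substream *s)
{
	if (s->latency_req.active)
		qos_remove(&s->latency_req);
}

/* Returns the number of list entries left after two open/release rounds. */
static size_t reload_cycle(void)
{
	struct substream first = { 0 };
	struct substream second = { 0 };

	qos_add(&first.latency_req);
	release_substream(&first);	/* the list no longer points into `first` */

	qos_add(&second.latency_req);
	assert(qos_count() == 1);
	release_substream(&second);
	return qos_count();
}
```

Leaving out the release_substream(&first) call models the behaviour before
the fix: the shared list keeps a pointer into storage that is about to go
away, and the next add or remove walks through it, which is the situation
mark describes above.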


* Re: [linux-pm] [2.6.36-rc4/HEAD] unable to handle kernel NULL pointer dereference (plist_add)
  2010-09-15 23:46   ` [linux-pm] " Simon Kirby
  2010-09-16 18:26     ` Rafael J. Wysocki
@ 2010-09-16 18:26     ` Rafael J. Wysocki
  1 sibling, 0 replies; 22+ messages in thread
From: Rafael J. Wysocki @ 2010-09-16 18:26 UTC (permalink / raw)
  To: Simon Kirby
  Cc: linux-pm, linux-kernel, James Bottomley, mark gross, Takashi Iwai

On Thursday, September 16, 2010, Simon Kirby wrote:
> On Tue, Sep 14, 2010 at 10:38:25PM +0200, Rafael J. Wysocki wrote:
> 
> > Hmm, interesting.  This looks like a plist corruption to me, but can you please
> > check (using gdb) what line of code corresponds to the address
> > plist_add+0x36/0xa0 ?
> 
> I ended up rebuilding since then, and I enabled a bunch of debugging
> stuff.  Does this help make it more obvious?  I'll try your other patch
> tonight, but I still don't get what's wrong with the existing code.

That's not necessary, since the https://bugzilla.kernel.org/show_bug.cgi?id=17922
patch works for you.

Thanks,
Rafael


Thread overview: 22+ messages
2010-09-13  7:00 [2.6.36-rc4/HEAD] unable to handle kernel NULL pointer dereference (plist_add) Simon Kirby
2010-09-13  7:00 ` Simon Kirby
2010-09-14 20:38 ` [linux-pm] " Rafael J. Wysocki
2010-09-14 22:06   ` Rafael J. Wysocki
2010-09-15  8:29     ` Takashi Iwai
2010-09-15  8:29     ` [linux-pm] " Takashi Iwai
2010-09-15 13:05       ` mark gross
2010-09-15 13:12         ` Takashi Iwai
2010-09-15 13:12         ` Takashi Iwai
2010-09-15 13:05       ` mark gross
2010-09-15 13:17       ` [linux-pm] " mark gross
2010-09-16  0:23         ` Simon Kirby
2010-09-16  0:23         ` [linux-pm] " Simon Kirby
2010-09-16  2:52           ` mark gross
2010-09-16  2:52           ` [linux-pm] " mark gross
2010-09-15 13:17       ` mark gross
2010-09-14 22:06   ` Rafael J. Wysocki
2010-09-15 23:46   ` [2.6.36-rc4/HEAD] unable to handle kernel NULL pointer dereference (plist_add) Simon Kirby
2010-09-15 23:46   ` [linux-pm] " Simon Kirby
2010-09-16 18:26     ` Rafael J. Wysocki
2010-09-16 18:26     ` [linux-pm] " Rafael J. Wysocki
2010-09-14 20:38 ` [2.6.36-rc4/HEAD] unable to handle kernel NULL pointer dereference (plist_add) Rafael J. Wysocki
