Takashi Iwai wrote, on 05/25/12 18:06:
> At Fri, 25 May 2012 17:33:11 +0200,
> Jörg-Volker Peetz wrote:
>>
>> Hello,
>>
>> Takashi Iwai wrote, on 05/25/12 09:25:
>>> At Wed, 23 May 2012 13:26:57 -0700,
>>> Tejun Heo wrote:
>>>>
>>>> Cc'ing Takashi. Hi!
>>>
>>> Also Cc'ed Fengguang, who worked on ELD stuff.
>>>
>>>> On Wed, May 23, 2012 at 09:56:36PM +0200, Jörg-Volker Peetz wrote:
>>>>> May 23 21:32:33 hostname kernel: XXX delayed_work_timer_fn: cwq
>>>>> (null), fn=hdmi_repoll_eld
>>>>
>>>> So, we have the winner.
>>>>
>>>> Takashi, sound/pci/hda/patch_hdmi.c::hdmi_repoll_eld() is causing
>>>> workqueue code dereference %NULL pointer. It *looks* like something
>>>> is corrupting the work item while it's queued. It could be a
>>>> workqueue bug but I don't think that's likely - the code has been
>>>> stable for quite some time now. I glanced through the code and
>>>> nothing stands out. Does something ring a bell?
>>>
>>> I also don't know of this problem. My initial thought was that the
>>> work struct placed right after sink_eld in struct hdmi_spec_per_pin is
>>> overwritten wrongly by reading some ELD data. But I failed to spot
>>> out the bug...
>>>
>>> Reading back through the thread, the problem seems triggered via usb
>>> video cam. I wonder how this is connected to the HDMI audio.
>>>
>>> To get things straight: does this bug happen even without HDMI, DP or
>>> DVI cable plugged, i.e. only with the laptop without connecting to the
>>> external digital output?
>>>
>> yes it happens without any HDMI cable plugged. The notebook is only connected to
>> an ethernet cable and the power cable. I'll append /var/log/dmesg, it also
>> contains the kernel command line with "radeon.audio=1".
>>
>> The computer has two graphic chips:
>> ATI Mobility Radeon HD 4200 integrated graphics (non-free firmware R600_rlc.bin)
>> ATI Mobility Radeon HD 5470 graphic (512MB) (non-free firmware CEDAR_*.bin)
>> During booting, the discrete GPU is switched off using vga switcheroo:
>>
>> $ mount -t debugfs none /sys/kernel/debug
>> $ echo -n OFF > /sys/kernel/debug/vgaswitcheroo/switch
>
> This explains the codec stall, at least. Disabling the D-GPU also
> disables the HD-audio controller. Once when it's disabled, even
> accessing the PCI may trigger an Oops. It's a known problem.
>
> The support of vga-switcheroo for HD-audio was recently added, and I
> sent a pull request to Linus today. Try the latest Linus tree and
> pull sound git tree hda-switcheroo tag onto it:
> git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound.git tags/hda-switcheroo
>
I will try that and report the result. Is it ok if I use the patch of Tejun on
top of this in order to avoid a freeze?

> I'm not sure whether this is related with the workq Oops, though.
> At least, you can try without disabling D-GPU to check whether you see
> the same workq problem.
>
Simply switching on the discrete GPU with

$ echo -n ON > /sys/kernel/debug/vgaswitcheroo/switch

after it has been switched off results in the same oops and the output of
alsa-info.sh differs only in a few lines (see the attached diff-file).

>
>> For the sound kernel module the following options are set in
>> /etc/modprobe.d/alsa-base.conf:
>>
>> options snd-hda-intel model=hp-dv7-4000 enable_msi=1
>>
>>>
>>>>> (without line-break).
>>>>>
>>>>> By the way, don't know if this is related, I have a phenomenon with a spurious
>>>>> interrupt with every linux version I've used before on this notebook. Half a
>>>>> minute after starting the system the computer produces approx. 220 lines like
>>>>>
>>>>> ... kernel: hda-intel: spurious response 0x0:0x0, last cmd=0x170503
>>>>>
>>>>> Now with 3.4.0, I see an additional message right before (the minute before) the
>>>>> "XXX ..." line:
>>>>>
>>>>> ...kernel: hda_intel: azx_get_response timeout, switching to single_cmd mode:
>>>>> last cmd=0x003f0900
>>>>
>>>> These too seem to be for you, Takashi. :)
>>>
>>> This means essentially the codec communication got stalled. This is a
>>> bad signal. It happens often with a wrong HD-audio verb, but often
>>> with a bad IRQ, whatever.
>>>
>>> I'd need alsa-info.sh output (run with --no-upload option) for further
>>> analysis.
>>>
>>>
>>> thanks,
>>>
>>> Takashi
>>
>> My first try to run the alsa-info.sh script with the plain 3.4 kernel produced
>> the same kernel oops freezing the notebook (and /tmp is mounted on tmpfs).
>> Therefore I applied the patch from Tejun to produce a usable output.
>> I attach it also. As you will notice, it contains the line beginning with "XXX"
>> due to Tejun's patch.
>
> Get alsa-info.sh without disabling D-GPU if you run it on 3.4 or
> earlier kernel.
>
For the case without mounting debugfs and, thus, both GPUs active, the output of
alsa-info.sh is also attached. It doesn't trigger the oops and the viewer for
the built-in USB-camera works also without triggering the oops.

>
> thanks,
>
> Takashi

-- 
Best regards,
Jörg-Volker.