From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1754012AbdLTAwI (ORCPT ); Tue, 19 Dec 2017 19:52:08 -0500 Received: from shells.gnugeneration.com ([66.240.222.126]:40270 "EHLO shells.gnugeneration.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753762AbdLTAwG (ORCPT ); Tue, 19 Dec 2017 19:52:06 -0500 Date: Tue, 19 Dec 2017 16:59:44 -0800 From: vcaputo@pengaru.com To: Thomas Gleixner Cc: Pavel Machek , linux-kernel , Takashi Iwai Subject: Re: thinkpad x60: sound problems in 4.15-rc1 was Re: thinkpad x60: sound problems in 4.14.0-next-20171114 Message-ID: <20171220005943.qkxn4kmbxkiiq4lp@shells.gnugeneration.com> References: <20171127184411.GA22165@amd> <20171214095730.GA6150@amd> <20171219020658.qpvxx7fqvrisezs6@shells.gnugeneration.com> <20171219045415.46croaosad656d5v@shells.gnugeneration.com> <20171219231256.dpizizz55o27op3s@shells.gnugeneration.com> <20171219232212.GA5177@amd> <20171220003610.vogk3wrke47drsfw@shells.gnugeneration.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: NeoMutt/20170113 (1.7.2) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Wed, Dec 20, 2017 at 01:33:45AM +0100, Thomas Gleixner wrote: > On Tue, 19 Dec 2017, vcaputo@pengaru.com wrote: > > On Wed, Dec 20, 2017 at 12:22:12AM +0100, Pavel Machek wrote: > > > You forgot to mention commit id :-). > > > > > > > That is very strange, anyhow: > > > > commit fdba46ffb4c203b6e6794163493fd310f98bb4be > > Author: Thomas Gleixner > > Date: Wed Sep 13 23:29:27 2017 +0200 > > > > x86/apic: Get rid of multi CPU affinity > > > > > > Will try reverting soon, just a bit busy today out in the desert and the sun > > is going down so my solar panel is useless. > > The revert is not trivial. > > What is the exact problem and how do you reproduce that? > Dang. Ostensibly the problem is audio playback looping what seems to be stuck in a DMA buffer when I pause the audio. I also saw once during the bisect on a 'bad' commit, a reproduction of this, which hung everything doing IO until it the ata1 reset happened: [ 36.606657] Monitor-Mwait will be used to enter C-3 state [ 36.628663] do_IRQ: 0.35 No irq handler for vector [ 37.875724] do_IRQ: 0.194 No irq handler for vector [ 69.099090] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen [ 69.099165] ata1.00: failed command: FLUSH CACHE EXT [ 69.099211] ata1.00: cmd ea/00:00:00:00:00/00:00:00:00:00/a0 tag 6 res 40/00:01:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout) [ 69.099285] ata1.00: status: { DRDY } [ 69.099309] ata1: hard resetting link [ 69.406185] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300) [ 69.409255] ata1.00: ACPI cmd ef/02:00:00:00:00:a0 (SET FEATURES) succeeded [ 69.409259] ata1.00: ACPI cmd f5/00:00:00:00:00:a0 (SECURITY FREEZE LOCK) filtered out [ 69.409261] ata1.00: ACPI cmd ef/10:03:00:00:00:a0 (SET FEATURES) filtered out [ 69.409433] ata1.00: supports DRM functions and may not be fully accessible [ 69.409819] ata1.00: NCQ Send/Recv Log not supported [ 69.411644] ata1.00: ACPI cmd ef/02:00:00:00:00:a0 (SET FEATURES) succeeded [ 69.411649] ata1.00: ACPI cmd f5/00:00:00:00:00:a0 (SECURITY FREEZE LOCK) filtered out [ 69.411654] ata1.00: ACPI cmd ef/10:03:00:00:00:a0 (SET FEATURES) filtered out [ 69.411842] ata1.00: supports DRM functions and may not be fully accessible [ 69.412242] ata1.00: NCQ Send/Recv Log not supported [ 69.412719] ata1.00: configured for UDMA/133 [ 69.412721] ata1.00: retrying FLUSH 0xea Emask 0x4 [ 69.412964] ata1: EH complete But that is more difficult to reproduce, it doesn't seem to happen regularly. Infact, I thought that was independent from the audio problem at first but now it's become clear they're all related. The 'bad' commits always showed the 'do_IRQ: No irq handler for vector' lines, sometimes they appear for the first time on the console when I yanked the power cord while playing music in the quick repro cycle without running X or anything. What I've been doing to reproduce this is simply boot to multi-user.target, login, run `cmus`, and play a song. Next pull the AC power from the thinkpad x61s. Immediately the audio gets messed up, pausing the audio doesn't pause it, it just loops the last tiny buffer contents. That's all I've got right now, but it doesn't seem to be limited to the audio problem as shown by the ata1 reset. Also I've searched through all my stored journals for anything resembling that ata1 problem, and there's not a single occurrence going back ~300 boots across a handful of kernel versions. If you'd like to prep a patch for me to test, I'm happy to test, but I'm not sure how online I'll be for the next 24 hours. It's some kind of miracle we're communicating as-is, I'm in the middle of the damn desert but somehow there's a cell signal. Thanks, Vito Caputo