From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Date: Tue, 14 Feb 2017 17:34:37 +0100
From: "hch@lst.de"
To: Dexuan Cui
Cc: "hch@lst.de", Jens Axboe, Bart Van Assche, "hare@suse.com", "hare@suse.de", "Martin K. Petersen", "linux-kernel@vger.kernel.org", "linux-block@vger.kernel.org", "jth@kernel.org", Nick Meier, "Alex Ng (LIS)", Long Li, "Adrian Suhov (Cloudbase Solutions SRL)", "Chris Valean (Cloudbase Solutions SRL)"
Subject: Re: Boot regression (was "Re: [PATCH] genhd: Do not hold event lock when scheduling workqueue elements")
Message-ID: <20170214163437.GA23956@lst.de>
References: <20170208180314.GA17838@lst.de> <20170209130800.GA12057@lst.de> <20170214134736.GA19620@lst.de> <20170214142837.GB20706@lst.de> <20170214145101.GA21427@lst.de>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
In-Reply-To:
List-ID:

> I tested today's linux-next (next-20170214) + the 2 patches just now and got
> a weird result:
> sometimes the VM stills hung with a new calltrace (BUG: spinlock bad
> magic) , but sometimes the VM did boot up despite the new calltrace!
>
> Attached is the log of a "good" boot.
>
> It looks we have a memory corruption issue somewhere...

Yes.

> Actually previously I saw the "BUG: spinlock bad magic" message once, but I
> couldn't repro it later, so I didn't mention it to you.

Interesting.

>
> The good news is that now I can repro the "spinlock bad magic" message
> every time.
> I tried to dig into this by enabling Kernel hacking -> Memory debugging,
> but didn't find anything abnormal.
> Is it possible that the SCSI layer passes a wrong memory address?

It's possible, but this looks like it might be a different issue.
A few questions on the dmesg:

[ 6.208794] sd 2:0:0:0: [storvsc] Sense Key : Illegal Request [current]
[ 6.209447] sd 2:0:0:0: [storvsc] Add. Sense: Invalid command operation code
[ 6.210043] sd 3:0:0:0: [storvsc] Sense Key : Illegal Request [current]
[ 6.210618] sd 3:0:0:0: [storvsc] Add. Sense: Invalid command operation code
[ 6.212272] sd 2:0:0:0: [storvsc] Sense Key : Illegal Request [current]
[ 6.212897] sd 2:0:0:0: [storvsc] Add. Sense: Invalid command operation code
[ 6.213474] sd 3:0:0:0: [storvsc] Sense Key : Illegal Request [current]
[ 6.214051] sd 3:0:0:0: [storvsc] Add. Sense: Invalid command operation code

I didn't see anything like this in the other logs. Are these messages
something usual on HyperV VMs?

[ 6.358405] XFS (sdb1): Mounting V5 Filesystem
[ 6.404478] XFS (sdb1): Ending clean mount
[ 7.535174] BUG: spinlock bad magic on CPU#0, swapper/0/0
[ 7.536807] lock: host_ts+0x30/0xffffffffffffe1a0 [hv_utils], .magic: 00000000, .owner: /-1, .owner_cpu: 0
[ 7.538436] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.10.0-rc8-next-20170214+ #1
[ 7.539142] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090006 04/28/2016
[ 7.539142] Call Trace:
[ 7.539142]
[ 7.539142] dump_stack+0x63/0x82
[ 7.539142] spin_dump+0x78/0xc0
[ 7.539142] do_raw_spin_lock+0xfd/0x160
[ 7.539142] _raw_spin_lock_irqsave+0x4c/0x60
[ 7.539142] ? timesync_onchannelcallback+0x153/0x220 [hv_utils]
[ 7.539142] timesync_onchannelcallback+0x153/0x220 [hv_utils]

Can you resolve this address using gdb to a line of code? Once inside
gdb do:

l *(timesync_onchannelcallback+0x153)
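
For anyone repeating this lookup on other frames, the symbol and offset can be pulled out of a call trace line mechanically. The following is only a sketch: the helper name gdb_list_command and the object path drivers/hv/hv_utils.ko are assumptions for illustration, and gdb can only map the address to a source line if the module was built with debug info (CONFIG_DEBUG_INFO).

```python
import re

# Matches a frame such as "timesync_onchannelcallback+0x153/0x220 [hv_utils]".
# The "/0x220" part is the function size and is not needed for the lookup.
FRAME_RE = re.compile(
    r"(?P<symbol>[A-Za-z_][\w.]*)\+(?P<offset>0x[0-9a-fA-F]+)/0x[0-9a-fA-F]+"
)


def gdb_list_command(frame, objfile):
    """Build a non-interactive gdb command line for one call trace frame.

    `frame` is a line from the trace; `objfile` is the path to the module
    or vmlinux that contains the symbol (hypothetical path in the example).
    """
    match = FRAME_RE.search(frame)
    if match is None:
        raise ValueError("no symbol+offset found in: %r" % frame)
    expr = "list *({}+{})".format(match.group("symbol"), match.group("offset"))
    # -batch exits after running the commands given with -ex.
    return ["gdb", "-batch", "-ex", expr, objfile]


if __name__ == "__main__":
    line = "[ 7.539142] timesync_onchannelcallback+0x153/0x220 [hv_utils]"
    # Assumed location of the built module inside the kernel build tree.
    print(" ".join(gdb_list_command(line, "drivers/hv/hv_utils.ko")))
```

The returned list can be passed to subprocess.run() as-is, or printed and pasted into a shell with the -ex expression quoted.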