From: Anchal Agarwal
To: Boris Ostrovsky
CC: "tglx@linutronix.de", "mingo@redhat.com", "bp@alien8.de",
	"hpa@zytor.com", "x86@kernel.org", "jgross@suse.com",
	"linux-pm@vger.kernel.org", "linux-mm@kvack.org", "Kamata, Munehisa",
	"sstabellini@kernel.org", "konrad.wilk@oracle.com",
	"roger.pau@citrix.com", "axboe@kernel.dk", "davem@davemloft.net",
	"rjw@rjwysocki.net", "len.brown@intel.com", "pavel@ucw.cz",
	"peterz@infradead.org", "Valentin, Eduardo", "Singh, Balbir",
	"xen-devel@lists.xenproject.org", "vkuznets@redhat.com",
	"netdev@vger.kernel.org", "linux-kernel@vger.kernel.org",
	"Woodhouse, David", "benh@kernel.crashing.org"
Subject: Re: [PATCH v2 00/11] Fix PM hibernation in Xen guests
Date: Thu, 16 Jul 2020 23:28:12 +0000
Message-ID: <20200716232812.GA26338@dev-dsk-anchalag-2a-9c2d1d96.us-west-2.amazon.com>
References: <324020A7-996F-4CF8-A2F4-46957CEA5F0C@amazon.com>
	<20200715194933.GA17938@dev-dsk-anchalag-2a-9c2d1d96.us-west-2.amazon.com>
	<6145a0d9-fd4e-a739-407e-97f7261eecd8@oracle.com>
In-Reply-To: <6145a0d9-fd4e-a739-407e-97f7261eecd8@oracle.com>

On Wed, Jul 15, 2020 at 04:49:57PM -0400, Boris Ostrovsky wrote:
> On 7/15/20 3:49 PM, Anchal Agarwal wrote:
> > On Mon, Jul 13, 2020 at 03:43:33PM -0400, Boris Ostrovsky wrote:
> >> On 7/10/20 2:17 PM, Agarwal, Anchal wrote:
> >>> Gentle ping on this series.
> >>
> >> Have you tested save/restore?
> >>
> > No, not with the last few series. But a good point, I will test that and get
> > back to you. Do you see anything specific in the series that suggests otherwise?
>
>
> root@ovs104> xl save pvh saved
> Saving to saved new xl format (info 0x3/0x0/1699)
> xc: info: Saving domain 3, type x86 HVM
> xc: Frames: 1044480/1044480 100%
> xc: End of stream: 0/0 0%
> root@ovs104> xl restore saved
> Loading new save file saved (new xl fmt info 0x3/0x0/1699)
> Savefile contains xl domain config in JSON format
> Parsing config from
> xc: info: Found x86 HVM domain from Xen 4.13
> xc: info: Restoring domain
> xc: info: Restore successful
> xc: info: XenStore: mfn 0xfeffc, dom 0, evt 1
> xc: info: Console: mfn 0xfefff, dom 0, evt 2
> root@ovs104> xl console pvh
> [  139.943872] ------------[ cut here ]------------
> [  139.943872] kernel BUG at arch/x86/xen/enlighten.c:205!
> [  139.943872] invalid opcode: 0000 [#1] SMP PTI
> [  139.943872] CPU: 0 PID: 11 Comm: migration/0 Not tainted 5.8.0-rc5 #26
> [  139.943872] RIP: 0010:xen_vcpu_setup+0x16d/0x180
> [  139.943872] Code: 4a 8b 14 f5 40 c9 1b 82 48 89 d8 48 89 2c 02 8b 05 a4 d4 40 01 85 c0 0f 85 15 ff ff ff 4a 8b 04 f5 40 c9 1b 82 e9 f4 fe ff ff <0f> 0b b8 ed ff ff ff e9 14 ff ff ff e8 12 4f 86 00 66 90 66 66 66
> [  139.943872] RSP: 0018:ffffc9000006bdb0 EFLAGS: 00010046
> [  139.943872] RAX: 0000000000000000 RBX: ffffc9000014fe00 RCX: 0000000000000000
> [  139.943872] RDX: ffff88803fc00000 RSI: 0000000000016128 RDI: 0000000000000000
> [  139.943872] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
> [  139.943872] R10: ffffffff826174a0 R11: ffffc9000006bcb4 R12: 0000000000016120
> [  139.943872] R13: 0000000000016120 R14: 0000000000016128 R15: 0000000000000000
> [  139.943872] FS: 0000000000000000(0000) GS:ffff88803fc00000(0000) knlGS:0000000000000000
> [  139.943872] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  139.943872] CR2: 00007f704be8b000 CR3: 000000003901a004 CR4: 00000000000606f0
> [  139.943872] Call Trace:
> [  139.943872]  ? __kmalloc+0x167/0x260
> [  139.943872]  ? xen_manage_runstate_time+0x14a/0x170
> [  139.943872]  xen_vcpu_restore+0x134/0x170
> [  139.943872]  xen_hvm_post_suspend+0x1d/0x30
> [  139.943872]  xen_arch_post_suspend+0x13/0x30
> [  139.943872]  xen_suspend+0x87/0x190
> [  139.943872]  multi_cpu_stop+0x6d/0x110
> [  139.943872]  ? stop_machine_yield+0x10/0x10
> [  139.943872]  cpu_stopper_thread+0x47/0x100
> [  139.943872]  smpboot_thread_fn+0xc5/0x160
> [  139.943872]  ? sort_range+0x20/0x20
> [  139.943872]  kthread+0xfe/0x140
> [  139.943872]  ? kthread_park+0x90/0x90
> [  139.943872]  ret_from_fork+0x22/0x30
> [  139.943872] Modules linked in:
> [  139.943872] ---[ end trace 74716859a6b4f0a8 ]---
> [  139.943872] RIP: 0010:xen_vcpu_setup+0x16d/0x180
> [  139.943872] Code: 4a 8b 14 f5 40 c9 1b 82 48 89 d8 48 89 2c 02 8b 05 a4 d4 40 01 85 c0 0f 85 15 ff ff ff 4a 8b 04 f5 40 c9 1b 82 e9 f4 fe ff ff <0f> 0b b8 ed ff ff ff e9 14 ff ff ff e8 12 4f 86 00 66 90 66 66 66
> [  139.943872] RSP: 0018:ffffc9000006bdb0 EFLAGS: 00010046
> [  139.943872] RAX: 0000000000000000 RBX: ffffc9000014fe00 RCX: 0000000000000000
> [  139.943872] RDX: ffff88803fc00000 RSI: 0000000000016128 RDI: 0000000000000000
> [  139.943872] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
> [  139.943872] R10: ffffffff826174a0 R11: ffffc9000006bcb4 R12: 0000000000016120
> [  139.943872] R13: 0000000000016120 R14: 0000000000016128 R15: 0000000000000000
> [  139.943872] FS: 0000000000000000(0000) GS:ffff88803fc00000(0000) knlGS:0000000000000000
> [  139.943872] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  139.943872] CR2: 00007f704be8b000 CR3: 000000003901a004 CR4: 00000000000606f0
> [  139.943872] Kernel panic - not syncing: Fatal exception
> [  139.943872] Shutting down cpus with NMI
> [  143.927559] Kernel Offset: disabled
> root@ovs104>

I think I may have found a bug. There were no issues with the V1 series that I
sent; however, there are issues with V2. I tested both series and found xl
save/restore to be working in V1 but not in V2. I should have tested it.
Anyway, it looks like the issue comes from executing the syscore ops registered
for the hibernation use case during the call to xen_suspend().
I remember your earlier comment asking why we need to check the xen_suspend
mode in xen_syscore_suspend [patch-004]. I removed that check based on my
theoretical understanding of your suggestion: since the lock_system_sleep()
lock is taken, we cannot initiate hibernation. What I skipped checking is the
part of the code where, during xen_suspend(), all registered syscore_ops
suspend callbacks are called. Hence the ones registered for PM hibernation are
called as well.
With no check on the suspend mode there, the callback does not return early
from the function, even though these callbacks should never be executed in the
Xen suspend case.

I will restore part of that check from Patch-004 of V1 and send an updated
patch with the fix.

Thanks,
Anchal
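
For illustration only, here is a minimal user-space sketch of the mode check
described above. It is not the code from the series: every name in it
(suspend_mode, current_mode, hibernation_syscore_suspend,
run_all_suspend_callbacks, hibernation_work_done) is a hypothetical stand-in,
and the loop only models the idea that a suspend path runs every registered
callback regardless of why the system is suspending.

```c
#include <stddef.h>

/* Hypothetical stand-ins; not the kernel's or the series' definitions. */
enum suspend_mode { MODE_XEN_SUSPEND, MODE_PM_HIBERNATION };

static enum suspend_mode current_mode;
static int hibernation_work_done;	/* counts how often the real work ran */

/* Callback registered for the PM hibernation use case. */
static int hibernation_syscore_suspend(void)
{
	/* The mode check: do nothing unless we are actually hibernating. */
	if (current_mode != MODE_PM_HIBERNATION)
		return 0;

	hibernation_work_done++;
	return 0;
}

typedef int (*suspend_cb)(void);

static suspend_cb registered_ops[] = { hibernation_syscore_suspend };

/* Models a suspend path that calls every registered callback. */
static int run_all_suspend_callbacks(enum suspend_mode mode)
{
	size_t i;

	current_mode = mode;
	for (i = 0; i < sizeof(registered_ops) / sizeof(registered_ops[0]); i++) {
		int ret = registered_ops[i]();

		if (ret)
			return ret;	/* stop at the first failing callback */
	}
	return 0;
}
```

Calling run_all_suspend_callbacks(MODE_XEN_SUSPEND) leaves
hibernation_work_done untouched, while
run_all_suspend_callbacks(MODE_PM_HIBERNATION) increments it. Without the check
at the top of the callback, both calls would run the hibernation-only work.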