From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.0 required=3.0 tests=DKIM_INVALID,DKIM_SIGNED, HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS, USER_AGENT_SANE_1 autolearn=no autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 972B8C33CAF for ; Mon, 13 Jan 2020 13:08:17 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 6382C2081E for ; Mon, 13 Jan 2020 13:08:17 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (1024-bit key) header.d=citrix.com header.i=@citrix.com header.b="PFWqB/E/" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728731AbgAMNIQ (ORCPT ); Mon, 13 Jan 2020 08:08:16 -0500 Received: from esa3.hc3370-68.iphmx.com ([216.71.145.155]:9696 "EHLO esa3.hc3370-68.iphmx.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726593AbgAMNIP (ORCPT ); Mon, 13 Jan 2020 08:08:15 -0500 X-Greylist: delayed 426 seconds by postgrey-1.27 at vger.kernel.org; Mon, 13 Jan 2020 08:08:14 EST DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=citrix.com; s=securemail; t=1578920896; h=subject:to:cc:references:from:message-id:date: mime-version:in-reply-to:content-transfer-encoding; bh=ngkACCPTrWi5McT5n6PAauLZLAB+ALgfEQVDPw6WcrY=; b=PFWqB/E/GXwB/X0Oeh5Gm4ZrjxrrPZ9s2K2I4SfqTm38VFngAotQiKRm rQAsW88fU8Ap5uI4NAdSkQiHpwGPiyC19RKZbytVr2p/MjAR0ozCHdaiQ 8i4nAiPKOCYP7Y9cYlK56IpGF6d0RXqhjf6VAwTSs3itJaU184uIsg4ed w=; Authentication-Results: esa3.hc3370-68.iphmx.com; dkim=none (message not signed) header.i=none; spf=None smtp.pra=andrew.cooper3@citrix.com; spf=Pass smtp.mailfrom=Andrew.Cooper3@citrix.com; spf=None smtp.helo=postmaster@mail.citrix.com Received-SPF: None (esa3.hc3370-68.iphmx.com: no sender authenticity information available from domain of andrew.cooper3@citrix.com) identity=pra; client-ip=162.221.158.21; receiver=esa3.hc3370-68.iphmx.com; envelope-from="Andrew.Cooper3@citrix.com"; x-sender="andrew.cooper3@citrix.com"; x-conformance=sidf_compatible Received-SPF: Pass (esa3.hc3370-68.iphmx.com: domain of Andrew.Cooper3@citrix.com designates 162.221.158.21 as permitted sender) identity=mailfrom; client-ip=162.221.158.21; receiver=esa3.hc3370-68.iphmx.com; envelope-from="Andrew.Cooper3@citrix.com"; x-sender="Andrew.Cooper3@citrix.com"; x-conformance=sidf_compatible; x-record-type="v=spf1"; x-record-text="v=spf1 ip4:209.167.231.154 ip4:178.63.86.133 ip4:195.66.111.40/30 ip4:85.115.9.32/28 ip4:199.102.83.4 ip4:192.28.146.160 ip4:192.28.146.107 ip4:216.52.6.88 ip4:216.52.6.188 ip4:162.221.158.21 ip4:162.221.156.83 ip4:168.245.78.127 ~all" Received-SPF: None (esa3.hc3370-68.iphmx.com: no sender authenticity information available from domain of postmaster@mail.citrix.com) identity=helo; client-ip=162.221.158.21; receiver=esa3.hc3370-68.iphmx.com; envelope-from="Andrew.Cooper3@citrix.com"; x-sender="postmaster@mail.citrix.com"; x-conformance=sidf_compatible IronPort-SDR: GlauahcLPcHiN+06JBb+FFer7fhPNPvyqF8yyWZbPPRkFWWBNJGIFBvz8S9tRXAkU1/OKSXilb N87abC6Y4uVDvvv6XHlqkQvbGa2yqAh/Zy3zCUL8DN5Q89i90pTh+XBvuk5sMsHe51PcK1oND1 AfmXW8js1o5PHvxveLH0UTha768NDUThPbfKNdX+wVzVxbtMXVjDGi6ILPC+ZeDOAg8q7Y6e89 y8ulSfoUwjf57z9mM1ueqD6xvKVBchjvg86GkD4dK+2YeTImmj/FVhP+jPxHpCTNTPOadGpw53 qoo= X-SBRS: 2.7 X-MesageID: 10820025 X-Ironport-Server: esa3.hc3370-68.iphmx.com X-Remote-IP: 162.221.158.21 X-Policy: $RELAYED X-IronPort-AV: E=Sophos;i="5.69,429,1571716800"; d="scan'208";a="10820025" Subject: Re: [Xen-devel] [RFC PATCH V2 11/11] x86: tsc: avoid system instability in hibernation To: "Singh, Balbir" , "peterz@infradead.org" , "Valentin, Eduardo" CC: "konrad.wilk@oracle.co" , "x86@kernel.org" , "len.brown@intel.com" , "linux-mm@kvack.org" , "pavel@ucw.cz" , "hpa@zytor.com" , "boris.ostrovsky@oracle.com" , "sstabellini@kernel.org" , "fllinden@amaozn.com" , "Kamata, Munehisa" , "mingo@redhat.com" , "xen-devel@lists.xenproject.org" , "axboe@kernel.dk" , "linux-pm@vger.kernel.org" , "Agarwal, Anchal" , "bp@alien8.de" , "tglx@linutronix.de" , "jgross@suse.com" , "netdev@vger.kernel.org" , "Woodhouse@dev-dsk-anchalag-2a-9c2d1d96.us-west-2.amazon.com" , "rjw@rjwysocki.net" , "linux-kernel@vger.kernel.org" , "vkuznets@redhat.com" , "davem@davemloft.net" , "Woodhouse, David" , "roger.pau@citrix.com" References: <20200107234526.GA19034@dev-dsk-anchalag-2a-9c2d1d96.us-west-2.amazon.com> <20200108105011.GY2827@hirez.programming.kicks-ass.net> <20200110153520.GC8214@u40b0340c692b58f6553c.ant.amazon.com> <20200113101609.GT2844@hirez.programming.kicks-ass.net> <857b42b2e86b2ae09a23f488daada3b1b2836116.camel@amazon.com> From: Andrew Cooper Openpgp: preference=signencrypt Autocrypt: addr=andrew.cooper3@citrix.com; prefer-encrypt=mutual; keydata= mQINBFLhNn8BEADVhE+Hb8i0GV6mihnnr/uiQQdPF8kUoFzCOPXkf7jQ5sLYeJa0cQi6Penp VtiFYznTairnVsN5J+ujSTIb+OlMSJUWV4opS7WVNnxHbFTPYZVQ3erv7NKc2iVizCRZ2Kxn srM1oPXWRic8BIAdYOKOloF2300SL/bIpeD+x7h3w9B/qez7nOin5NzkxgFoaUeIal12pXSR Q354FKFoy6Vh96gc4VRqte3jw8mPuJQpfws+Pb+swvSf/i1q1+1I4jsRQQh2m6OTADHIqg2E ofTYAEh7R5HfPx0EXoEDMdRjOeKn8+vvkAwhviWXTHlG3R1QkbE5M/oywnZ83udJmi+lxjJ5 YhQ5IzomvJ16H0Bq+TLyVLO/VRksp1VR9HxCzItLNCS8PdpYYz5TC204ViycobYU65WMpzWe LFAGn8jSS25XIpqv0Y9k87dLbctKKA14Ifw2kq5OIVu2FuX+3i446JOa2vpCI9GcjCzi3oHV e00bzYiHMIl0FICrNJU0Kjho8pdo0m2uxkn6SYEpogAy9pnatUlO+erL4LqFUO7GXSdBRbw5 gNt25XTLdSFuZtMxkY3tq8MFss5QnjhehCVPEpE6y9ZjI4XB8ad1G4oBHVGK5LMsvg22PfMJ ISWFSHoF/B5+lHkCKWkFxZ0gZn33ju5n6/FOdEx4B8cMJt+cWwARAQABtClBbmRyZXcgQ29v cGVyIDxhbmRyZXcuY29vcGVyM0BjaXRyaXguY29tPokCOgQTAQgAJAIbAwULCQgHAwUVCgkI CwUWAgMBAAIeAQIXgAUCWKD95wIZAQAKCRBlw/kGpdefoHbdD/9AIoR3k6fKl+RFiFpyAhvO 59ttDFI7nIAnlYngev2XUR3acFElJATHSDO0ju+hqWqAb8kVijXLops0gOfqt3VPZq9cuHlh IMDquatGLzAadfFx2eQYIYT+FYuMoPZy/aTUazmJIDVxP7L383grjIkn+7tAv+qeDfE+txL4 SAm1UHNvmdfgL2/lcmL3xRh7sub3nJilM93RWX1Pe5LBSDXO45uzCGEdst6uSlzYR/MEr+5Z JQQ32JV64zwvf/aKaagSQSQMYNX9JFgfZ3TKWC1KJQbX5ssoX/5hNLqxMcZV3TN7kU8I3kjK mPec9+1nECOjjJSO/h4P0sBZyIUGfguwzhEeGf4sMCuSEM4xjCnwiBwftR17sr0spYcOpqET ZGcAmyYcNjy6CYadNCnfR40vhhWuCfNCBzWnUW0lFoo12wb0YnzoOLjvfD6OL3JjIUJNOmJy RCsJ5IA/Iz33RhSVRmROu+TztwuThClw63g7+hoyewv7BemKyuU6FTVhjjW+XUWmS/FzknSi dAG+insr0746cTPpSkGl3KAXeWDGJzve7/SBBfyznWCMGaf8E2P1oOdIZRxHgWj0zNr1+ooF /PzgLPiCI4OMUttTlEKChgbUTQ+5o0P080JojqfXwbPAyumbaYcQNiH1/xYbJdOFSiBv9rpt TQTBLzDKXok86LkCDQRS4TZ/ARAAkgqudHsp+hd82UVkvgnlqZjzz2vyrYfz7bkPtXaGb9H4 Rfo7mQsEQavEBdWWjbga6eMnDqtu+FC+qeTGYebToxEyp2lKDSoAsvt8w82tIlP/EbmRbDVn 7bhjBlfRcFjVYw8uVDPptT0TV47vpoCVkTwcyb6OltJrvg/QzV9f07DJswuda1JH3/qvYu0p vjPnYvCq4NsqY2XSdAJ02HrdYPFtNyPEntu1n1KK+gJrstjtw7KsZ4ygXYrsm/oCBiVW/OgU g/XIlGErkrxe4vQvJyVwg6YH653YTX5hLLUEL1NS4TCo47RP+wi6y+TnuAL36UtK/uFyEuPy wwrDVcC4cIFhYSfsO0BumEI65yu7a8aHbGfq2lW251UcoU48Z27ZUUZd2Dr6O/n8poQHbaTd 6bJJSjzGGHZVbRP9UQ3lkmkmc0+XCHmj5WhwNNYjgbbmML7y0fsJT5RgvefAIFfHBg7fTY/i kBEimoUsTEQz+N4hbKwo1hULfVxDJStE4sbPhjbsPCrlXf6W9CxSyQ0qmZ2bXsLQYRj2xqd1 bpA+1o1j2N4/au1R/uSiUFjewJdT/LX1EklKDcQwpk06Af/N7VZtSfEJeRV04unbsKVXWZAk uAJyDDKN99ziC0Wz5kcPyVD1HNf8bgaqGDzrv3TfYjwqayRFcMf7xJaL9xXedMcAEQEAAYkC HwQYAQgACQUCUuE2fwIbDAAKCRBlw/kGpdefoG4XEACD1Qf/er8EA7g23HMxYWd3FXHThrVQ HgiGdk5Yh632vjOm9L4sd/GCEACVQKjsu98e8o3ysitFlznEns5EAAXEbITrgKWXDDUWGYxd pnjj2u+GkVdsOAGk0kxczX6s+VRBhpbBI2PWnOsRJgU2n10PZ3mZD4Xu9kU2IXYmuW+e5KCA vTArRUdCrAtIa1k01sPipPPw6dfxx2e5asy21YOytzxuWFfJTGnVxZZSCyLUO83sh6OZhJkk b9rxL9wPmpN/t2IPaEKoAc0FTQZS36wAMOXkBh24PQ9gaLJvfPKpNzGD8XWR5HHF0NLIJhgg 4ZlEXQ2fVp3XrtocHqhu4UZR4koCijgB8sB7Tb0GCpwK+C4UePdFLfhKyRdSXuvY3AHJd4CP 4JzW0Bzq/WXY3XMOzUTYApGQpnUpdOmuQSfpV9MQO+/jo7r6yPbxT7CwRS5dcQPzUiuHLK9i nvjREdh84qycnx0/6dDroYhp0DFv4udxuAvt1h4wGwTPRQZerSm4xaYegEFusyhbZrI0U9tJ B8WrhBLXDiYlyJT6zOV2yZFuW47VrLsjYnHwn27hmxTC/7tvG3euCklmkn9Sl9IAKFu29RSo d5bD8kMSCYsTqtTfT6W4A3qHGvIDta3ptLYpIAOD2sY3GYq2nf3Bbzx81wZK14JdDDHUX2Rs 6+ahAA== Message-ID: <7bb967ca-2a91-6397-9c0a-6eafd43c83ed@citrix.com> Date: Mon, 13 Jan 2020 13:01:02 +0000 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.9.0 MIME-Version: 1.0 In-Reply-To: <857b42b2e86b2ae09a23f488daada3b1b2836116.camel@amazon.com> Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 8bit Content-Language: en-GB X-ClientProxiedBy: AMSPEX02CAS01.citrite.net (10.69.22.112) To AMSPEX02CL01.citrite.net (10.69.22.125) Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 13/01/2020 11:43, Singh, Balbir wrote: > On Mon, 2020-01-13 at 11:16 +0100, Peter Zijlstra wrote: >> On Fri, Jan 10, 2020 at 07:35:20AM -0800, Eduardo Valentin wrote: >>> Hey Peter, >>> >>> On Wed, Jan 08, 2020 at 11:50:11AM +0100, Peter Zijlstra wrote: >>>> On Tue, Jan 07, 2020 at 11:45:26PM +0000, Anchal Agarwal wrote: >>>>> From: Eduardo Valentin >>>>> >>>>> System instability are seen during resume from hibernation when system >>>>> is under heavy CPU load. This is due to the lack of update of sched >>>>> clock data, and the scheduler would then think that heavy CPU hog >>>>> tasks need more time in CPU, causing the system to freeze >>>>> during the unfreezing of tasks. For example, threaded irqs, >>>>> and kernel processes servicing network interface may be delayed >>>>> for several tens of seconds, causing the system to be unreachable. >>>>> The fix for this situation is to mark the sched clock as unstable >>>>> as early as possible in the resume path, leaving it unstable >>>>> for the duration of the resume process. This will force the >>>>> scheduler to attempt to align the sched clock across CPUs using >>>>> the delta with time of day, updating sched clock data. In a post >>>>> hibernation event, we can then mark the sched clock as stable >>>>> again, avoiding unnecessary syncs with time of day on systems >>>>> in which TSC is reliable. >>>> This makes no frigging sense what so bloody ever. If the clock is >>>> stable, we don't care about sched_clock_data. When it is stable you get >>>> a linear function of the TSC without complicated bits on. >>>> >>>> When it is unstable, only then do we care about the sched_clock_data. >>>> >>> Yeah, maybe what is not clear here is that we covering for situation >>> where clock stability changes over time, e.g. at regular boot clock is >>> stable, hibernation happens, then restore happens in a non-stable clock. >> Still confused, who marks the thing unstable? The patch seems to suggest >> you do yourself, but it is not at all clear why. >> >> If TSC really is unstable, then it needs to remain unstable. If the TSC >> really is stable then there is no point in marking is unstable. >> >> Either way something is off, and you're not telling me what. >> > Hi, Peter > > For your original comment, just wanted to clarify the following: > > 1. After hibernation, the machine can be resumed on a different but compatible > host (these are VM images hibernated) > 2. This means the clock between host1 and host2 can/will be different The guests TSC value is part of all save/migrate/resume state.  Given this bug, I presume you've actually discarded all register state on hibernate, and the TSC is starting again from 0? The frequency of the new TSC might very likely be different, but the scale/offset in the paravirtual clock information should let Linux's view of time stay consistent. > In your comments are you making the assumption that the host(s) is/are the > same? Just checking the assumptions being made and being on the same page with > them. TSCs are a massive source of "fun".  I'm not surprised that there are yet more bugs around. Does anyone actually know what does/should happen to the real TSC on native S4?  The default course of action should be for virtualisation to follow suit. ~Andrew