From: boris.ostrovsky@oracle.com
To: Anchal Agarwal <anchalag@amazon.com>
Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de,
	hpa@zytor.com, x86@kernel.org, jgross@suse.com,
	linux-pm@vger.kernel.org, linux-mm@kvack.org, kamatam@amazon.com,
	sstabellini@kernel.org, konrad.wilk@oracle.com,
	roger.pau@citrix.com, axboe@kernel.dk, davem@davemloft.net,
	rjw@rjwysocki.net, len.brown@intel.com, pavel@ucw.cz,
	peterz@infradead.org, eduval@amazon.com, sblbir@amazon.com,
	xen-devel@lists.xenproject.org, vkuznets@redhat.com,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
	dwmw@amazon.co.uk, benh@kernel.crashing.org
Subject: Re: [PATCH v3 01/11] xen/manage: keep track of the on-going suspend mode
Date: Mon, 14 Sep 2020 20:24:22 -0400
Message-ID: <e9b94104-d20a-b6b2-cbe0-f79b1ed09c98@oracle.com>
In-Reply-To: <20200914214754.GA19975@dev-dsk-anchalag-2a-9c2d1d96.us-west-2.amazon.com>


On 9/14/20 5:47 PM, Anchal Agarwal wrote:
> On Sun, Sep 13, 2020 at 11:43:30AM -0400, boris.ostrovsky@oracle.com wrote:
>>
>> On 8/21/20 6:25 PM, Anchal Agarwal wrote:
>>> Though acquiring pm_mutex is still the right thing to do, we may
>>> see a deadlock if PM hibernation is interrupted by Xen suspend.
>>> PM hibernation depends on the xenwatch thread to process xenbus state
>>> transactions, but the thread will sleep waiting for pm_mutex, which is
>>> already held by the PM hibernation context in this scenario. Xen shutdown
>>> code may need some changes to avoid the issue.
>>
>>
>> Is it Xen's shutdown or suspend code that needs to address this? (Or I
>> may not understand what the problem is that you are describing)
>>
> It's the Xen suspend code, I think. If we do not take the system_transition_mutex
> in do_suspend, then if hibernation is triggered in parallel with Xen suspend there
> could be issues.


But you *are* taking this mutex to avoid this exact race, aren't you?


> Now this is still theoretical in my case and I haven't been able
> to reproduce such a race. So the approach the original author took was to take
> this lock, which to me seems right.
> And it's Xen suspend and not Xen shutdown. So basically, if this scenario
> happens, I am of the view that one or the other will fail to occur; the question
> is then how we recover from it or avoid it altogether.
>
> Does that answer your question?
>


>>> +
>>> +static int xen_setup_pm_notifier(void)
>>> +{
>>> +     if (!xen_hvm_domain() || xen_initial_domain())
>>> +             return -ENODEV;
>>
>> I don't think this works anymore.
> What do you mean?
> The first check is for Xen domain types and the other is for architecture support.
> The reason I put this check here is that I wanted to segregate the two.
> I do not want to register this notifier at all for !hvm guests, or if it's
> an initial control domain.
> The ARM check only lands in the notifier because, once the hibernate() API is
> called, it calls pm_notifier_call_chain for PM_HIBERNATION_PREPARE, and this will
> fail for aarch64.
> Once we have support for aarch64 this notifier can go away altogether.
>
> Is there any other reason I may be missing why we should move this check to
> the notifier?


Not registering this notifier is equivalent to having it return NOTIFY_OK.


In your earlier versions just returning NOTIFY_OK was not sufficient for
hibernation to proceed since the notifier would also need to set
suspend_mode appropriately. But now your notifier essentially filters
out unsupported configurations. And so if it is not called your
configuration (e.g. PV domain) will be considered supported.
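
To make the difference concrete, here is a minimal userspace sketch (not
kernel code). The names sketch_guest, sketch_check_at_registration and
sketch_check_in_notifier are hypothetical, and the booleans merely stand in
for xen_hvm_domain(), xen_initial_domain() and IS_ENABLED(CONFIG_ARM64). It
assumes, as described above, that a notifier which was never registered
cannot veto anything:

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-ins for the kernel's NOTIFY_OK / NOTIFY_BAD; names are hypothetical. */
enum sketch_verdict { SKETCH_OK, SKETCH_BAD };

/* Hypothetical guest description replacing the real Xen/arch predicates. */
struct sketch_guest {
	bool hvm;     /* stands in for xen_hvm_domain() */
	bool initial; /* stands in for xen_initial_domain() */
	bool arm64;   /* stands in for IS_ENABLED(CONFIG_ARM64) */
};

/* What the notifier itself does in the quoted patch: reject ARM64 only. */
static enum sketch_verdict sketch_notifier(const struct sketch_guest *g)
{
	return g->arm64 ? SKETCH_BAD : SKETCH_OK;
}

/*
 * Check done at registration time: an unregistered notifier is never
 * called, so nothing objects and the request is treated as supported.
 */
static enum sketch_verdict sketch_check_at_registration(const struct sketch_guest *g)
{
	bool registered = g->hvm && !g->initial;

	return registered ? sketch_notifier(g) : SKETCH_OK;
}

/* Check done inside the notifier: unsupported guests are vetoed. */
static enum sketch_verdict sketch_check_in_notifier(const struct sketch_guest *g)
{
	if (!g->hvm || g->initial)
		return SKETCH_BAD;
	return sketch_notifier(g);
}

/* Usage: a PV guest slips through in the first variant only. */
void sketch_self_check(void)
{
	struct sketch_guest pv = { .hvm = false, .initial = false, .arm64 = false };

	assert(sketch_check_at_registration(&pv) == SKETCH_OK);
	assert(sketch_check_in_notifier(&pv) == SKETCH_BAD);
}
```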


>> In the past your notifier would set suspend_mode (or something) but now
>> it really doesn't do anything except report an error in some (ARM) cases.
>>
>> So I think you should move this check into the notifier.
>> (And BTW I still think PM_SUSPEND_PREPARE should return an error too.
>> The fact that we are using "suspend" in xen routine names is irrelevant)
>>
> I may have sent a "not-updated" version of the notifier's function change.
>
> +    switch (pm_event) {
> +    case PM_HIBERNATION_PREPARE:
> +        /* Guest hibernation is not supported for aarch64 currently */
> +        if (IS_ENABLED(CONFIG_ARM64)) {
> +            ret = NOTIFY_BAD;
> +            break;
> +        }
> +    case PM_RESTORE_PREPARE:
> +    case PM_POST_RESTORE:
> +    case PM_POST_HIBERNATION:
> +    default:
> +        ret = NOTIFY_OK;
> +    }


There is no difference on x86 between this code and what you sent
earlier. In both instances PM_SUSPEND_PREPARE will return NOTIFY_OK.


On ARM this code will allow suspend to proceed (which is not what we want).
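
A small userspace sketch of that control flow (not kernel code) may help.
The demo_* enums are hypothetical stand-ins for the PM event codes and the
notifier return values, and is_arm64 replaces IS_ENABLED(CONFIG_ARM64).
demo_quoted_notifier mirrors the switch quoted above; demo_strict_notifier
is only one possible variant that also rejects suspend, not the final patch:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the PM event codes. */
enum demo_event {
	DEMO_HIBERNATION_PREPARE,
	DEMO_POST_HIBERNATION,
	DEMO_SUSPEND_PREPARE,
	DEMO_POST_SUSPEND,
	DEMO_RESTORE_PREPARE,
	DEMO_POST_RESTORE,
};

/* Hypothetical stand-ins for NOTIFY_OK / NOTIFY_BAD. */
enum demo_ret { DEMO_NOTIFY_OK, DEMO_NOTIFY_BAD };

/* Same control flow as the quoted switch statement. */
static enum demo_ret demo_quoted_notifier(enum demo_event ev, bool is_arm64)
{
	enum demo_ret ret;

	switch (ev) {
	case DEMO_HIBERNATION_PREPARE:
		if (is_arm64) {
			ret = DEMO_NOTIFY_BAD;
			break;
		}
		/* fall through */
	case DEMO_RESTORE_PREPARE:
	case DEMO_POST_RESTORE:
	case DEMO_POST_HIBERNATION:
	default:
		/* DEMO_SUSPEND_PREPARE also lands here, on every architecture. */
		ret = DEMO_NOTIFY_OK;
	}
	return ret;
}

/* Assumed variant: veto suspend everywhere, hibernation on ARM64 only. */
static enum demo_ret demo_strict_notifier(enum demo_event ev, bool is_arm64)
{
	switch (ev) {
	case DEMO_SUSPEND_PREPARE:
		return DEMO_NOTIFY_BAD;
	case DEMO_HIBERNATION_PREPARE:
		return is_arm64 ? DEMO_NOTIFY_BAD : DEMO_NOTIFY_OK;
	default:
		return DEMO_NOTIFY_OK;
	}
}

/* Usage: suspend is accepted by the quoted switch, even on ARM64. */
void demo_self_check(void)
{
	assert(demo_quoted_notifier(DEMO_SUSPEND_PREPARE, true) == DEMO_NOTIFY_OK);
	assert(demo_strict_notifier(DEMO_SUSPEND_PREPARE, true) == DEMO_NOTIFY_BAD);
}
```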


-boris


>
> With the above path PM_SUSPEND_PREPARE will go away altogether. Does that
> resolve this issue? I wanted to get rid of all SUSPEND_* as they are clearly not
> needed here.
> The only reason I kept it there is so that if someone tries to trigger hibernation
> on ARM instances they get an error, as I am not sure about the current
> behavior. There may be a better way to not invoke hibernation on ARM DomUs and
> get rid of this block altogether.
>
> Again, sorry for sending in the half-baked fix. My workspace switch may have
> caused the error.
>>
>>
>> -boris
>>
> Anchal
>>
>>> +     return register_pm_notifier(&xen_pm_notifier_block);
>>> +}
>>> +
