linux-hyperv.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Dexuan Cui <decui@microsoft.com>
To: vkuznets <vkuznets@redhat.com>
Cc: KY Srinivasan <kys@microsoft.com>,
	Haiyang Zhang <haiyangz@microsoft.com>,
	Stephen Hemminger <sthemmin@microsoft.com>,
	"sashal@kernel.org" <sashal@kernel.org>,
	"linux-hyperv@vger.kernel.org" <linux-hyperv@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Michael Kelley <mikelley@microsoft.com>
Subject: RE: [PATCH 1/3] hv_utils: Add the support of hibernation
Date: Thu, 19 Sep 2019 06:34:28 +0000	[thread overview]
Message-ID: <PU1P153MB01694ABFED7ADAE8B40A65F7BF890@PU1P153MB0169.APCP153.PROD.OUTLOOK.COM> (raw)
In-Reply-To: <87pnk0bpe8.fsf@vitty.brq.redhat.com>

> From: Vitaly Kuznetsov <vkuznets@redhat.com>
> Sent: Monday, September 16, 2019 1:46 AM
> Dexuan Cui <decui@microsoft.com> writes:
> 
> >> From: Vitaly Kuznetsov <vkuznets@redhat.com>
> >> Sent: Thursday, September 12, 2019 9:37 AM
> >
> >> > +static int util_suspend(struct hv_device *dev)
> >> > +{
> >> > +	struct hv_util_service *srv = hv_get_drvdata(dev);
> >> > +
> >> > +	if (srv->util_cancel_work)
> >> > +		srv->util_cancel_work();
> >> > +
> >> > +	vmbus_close(dev->channel);
> >>
> >> And what happens if you receive a real reply from userspace after you
> >> close the channel? You've only cancelled timeout work so the driver
> >> will not try to reply by itself but this doesn't mean we won't try to
> >> write to the channel on legitimate replies from daemons.
> >>
> >> I think you need to block all userspace requests (hang in kernel until
> >> util_resume()).
> >>
> >> While I'm not sure we can't get away without it but I'd suggest we add a
> >> new HVUTIL_SUSPENDED state to the hvutil state machine.
> >> Vitaly
> >
> > When we reach util_suspend(), all the userspace processes have been
> > frozen: see kernel/power/hibernate.c: hibernate() -> freeze_processes(),
> > so here we can not receive a reply from the userspace daemon.
> >
> 
> Let's add a WARN() or something then as if this ever happens this is
> going to be realy tricky to catch.

Looking at the path hibernate() -> freeze_processes() -> 
try_to_freeze_tasks(true) -> freeze_task() -> fake_signal_wake_up(), I'm
sure when try_to_freeze_tasks(true) returns 0, all the user-space processes
must be frozen in do_signal() -> get_signal() -> try_to_freeze() -> ... -> 
__refrigerator().

hibernate () -> hibernation_snapshot () -> dpm_suspend() -> ... -> 
util_suspend() only runs after hibernate() -> freeze_processes(), so I'm
pretty sure we're safe here.

> > However, it looks there is indeed some tricky corner cases we need to deal
> > with: in util_resume(), before we call vmbus_open(), all the userspace
> > processes are still frozen, and the userspace daemon (e.g. hv_kvp_daemon)
> > can be in any of these states:
> >
> > 1. the driver has not buffered any message for the daemon. This is good.
> >
> > 2. the driver has buffered a message for the daemon, and
> > kvp_transaction.state is HVUTIL_USERSPACE_REQ. Later, the kvp daemon
> > writes the response to the driver, and in kvp_on_msg()
> > kvp_transaction.state is moved to HVUTIL_USERSPACE_RECV, but since
> > cancel_delayed_work_sync(&kvp_timeout_work) is false (the work has
> > been cancelled by util_suspend()), the driver reports nothing to the host,
> > which is good as the VM has undergone a hibernation event and IMO the
> > response may be stale and I believe the host is not expecting this
> > response from the VM at all (the host side application that requested the
> > KVP must have failed or timed out), but the bad thing is: the "state"
> > remains in HVUTIL_USERSPACE_RECV, preventing
> > hv_kvp_onchannelcallback() from reading from the channel ringbuffer.
> >
> > I suggest we work around this race condition by the below patch:
> >
> > --- a/drivers/hv/hv_kvp.c
> > +++ b/drivers/hv/hv_kvp.c
> > @@ -250,8 +250,8 @@ static int kvp_on_msg(void *msg, int len)
> >          */
> >         if (cancel_delayed_work_sync(&kvp_timeout_work)) {
> >                 kvp_respond_to_host(message, error);
> > -               hv_poll_channel(kvp_transaction.recv_channel,
> kvp_poll_wrapper);
> >         }
> > +       hv_poll_channel(kvp_transaction.recv_channel, kvp_poll_wrapper);
> >
> >         return 0;
> >  }
> >
> > How do you like this?
> >
> 
> Is it safe to call hv_poll_channel() simultaneously on several CPUs? It
> seems it is as we're doing
> 
> smp_call_function_single(channel->target_cpu, cb, channel, true);

Looks safe to me. The code has been there for years. :-)
 
> (I'm asking because if it's not, than doing what you suggest will open
> the following window: timeout function kvp_timeout_func() is already
> running but the daemon is trying to reply at the same moment).
> 
> > I can imagine there is still a small chance that the state machine can run
> > out of order, and the kvp daemon may exit due to the return values of
> > read()/write() being -EINVAL, but the chance should be small enough in
> > practice, and IMO the issue even exists in the current code when
> > hibernation is not involved, e.g. kvp_timeout_func() and kvp_on_msg()
> > can run concurrently; if kvp_on_msg() runs slowly due to some reason
> > (e.g. the kvp daemon is stopped as I'm gdb'ing it), kvp_timeout_func()
> > fires and moves the state to HVUTIL_READY; later kvp_on_msg() starts
> > to run and returns -EINVAL, and the kvp daemon will exit().
> >
> > IMHO here it looks extremely difficult to make things flawless (if that's
> > even possible), so it's necessary to ask the daemons to auto-restart once
> > they exit() unexpectedly. This can be achieved by configuring systemd
> > properly for the kvp/vss/fcopy services.
> 
> I think we can also teach daemons to ignore -EINVAL or switch to
> something like -EAGAIN in non-fatal cases.

Maybe the driver should return 0 rather than -EINVAL for the kvp daemon
in some scenarios, since kvp is never 100% reliable.

> > I'm not sure introducing a HVUTIL_SUSPENDED state would solve all
> > of the corner cases, but I'm sure that would further complicate the
> > current code, at least to me. :-)
> >
> 
> Maybe, if we don't need it than we don't. Basically, I see the only
> advantage in having such state: it makes our tricks to support
> hibernation more visible in the code.
> Vitaly

Let me think about this. 

BTW, for vss, maybe the VM should not hibernate if there is a backup 
ongoing? -- if the file system is frozen by hv_vss_daemon, and the VM
hibernates, then when the VM resumes back, it's almost always true that 
the VM won't receive the host's VSS_OP_THAW request, and the VM will
end up in an unusable state.

Thanks,
-- Dexuan


  reply	other threads:[~2019-09-19  6:34 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-09-11 23:38 [PATCH 0/3] Enhance hv_utils to support hibernation Dexuan Cui
2019-09-11 23:38 ` [PATCH 1/3] hv_utils: Add the support of hibernation Dexuan Cui
2019-09-12 16:36   ` Vitaly Kuznetsov
2019-09-13 19:15     ` Dexuan Cui
2019-09-16  8:45       ` Vitaly Kuznetsov
2019-09-19  6:34         ` Dexuan Cui [this message]
2019-09-19 10:27           ` Vitaly Kuznetsov
2019-09-21  7:26             ` Dexuan Cui
2019-09-11 23:38 ` [PATCH 2/3] hv_utils: Support host-initiated hibernation request Dexuan Cui
2019-09-12 16:26   ` Vitaly Kuznetsov
2019-09-13 16:42     ` Dexuan Cui
2019-09-11 23:39 ` [PATCH 3/3] hv_utils: Support host-initiated restart request Dexuan Cui

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=PU1P153MB01694ABFED7ADAE8B40A65F7BF890@PU1P153MB0169.APCP153.PROD.OUTLOOK.COM \
    --to=decui@microsoft.com \
    --cc=haiyangz@microsoft.com \
    --cc=kys@microsoft.com \
    --cc=linux-hyperv@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mikelley@microsoft.com \
    --cc=sashal@kernel.org \
    --cc=sthemmin@microsoft.com \
    --cc=vkuznets@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).