Date: Thu, 20 Feb 2020 09:39:04 +0100
From: Roger Pau Monné <roger.pau@citrix.com>
To: Anchal Agarwal
Subject: Re: [RFC PATCH v3 06/12] xen-blkfront: add callbacks for PM suspend and hibernation
Message-ID: <20200220083904.GI4679@Air-de-Roger>
References: <890c404c585d7790514527f0c021056a7be6e748.1581721799.git.anchalag@amazon.com>
 <20200217100509.GE4679@Air-de-Roger>
 <20200217230553.GA8100@dev-dsk-anchalag-2a-9c2d1d96.us-west-2.amazon.com>
 <20200218091611.GN4679@Air-de-Roger>
 <20200219180424.GA17584@dev-dsk-anchalag-2a-9c2d1d96.us-west-2.amazon.com>
In-Reply-To: <20200219180424.GA17584@dev-dsk-anchalag-2a-9c2d1d96.us-west-2.amazon.com>

Thanks for this work, please see below.

On Wed, Feb 19, 2020 at 06:04:24PM +0000, Anchal Agarwal wrote:
> On Tue, Feb 18, 2020 at 10:16:11AM +0100, Roger Pau Monné wrote:
> > On Mon, Feb 17, 2020 at 11:05:53PM +0000, Anchal Agarwal wrote:
> > > On Mon, Feb 17, 2020 at 11:05:09AM +0100, Roger Pau Monné wrote:
> > > > On Fri, Feb 14, 2020 at 11:25:34PM +0000, Anchal Agarwal wrote:
> > > > > From: Munehisa Kamata
> > > >
> > > > > Add freeze, thaw and restore callbacks for PM suspend and
> > > > > hibernation support. All frontend drivers that need to use
> > > > > PM_HIBERNATION/PM_SUSPEND events need to implement these
> > > > > xenbus_driver callbacks. The freeze handler stops the block-layer
> > > > > queue and disconnects the frontend from the backend while freeing
> > > > > ring_info and associated resources. The restore handler
> > > > > re-allocates ring_info and re-connects to the backend, so the rest
> > > > > of the kernel can continue to use the block device transparently.
> > > > > Also, the handlers are used for both PM suspend and hibernation so
> > > > > that we can keep the existing suspend/resume callbacks for Xen
> > > > > suspend without modification. Before disconnecting from the
> > > > > backend, we need to prevent any new IO from being queued and wait
> > > > > for existing IO to complete.
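For readers following the thread, the hooks being discussed hang off
struct xenbus_driver roughly as in the sketch below. The
.freeze/.thaw/.restore members are what this series proposes to add;
the remaining fields follow the existing upstream blkfront driver, so
treat the exact wiring as an approximation of the patch rather than the
patch itself:

static struct xenbus_driver blkfront_driver = {
	.ids              = blkfront_ids,
	.probe            = blkfront_probe,
	.remove           = blkfront_remove,
	.resume           = blkfront_resume,	/* Xen (xenstore) initiated */
	.otherend_changed = blkback_changed,
	.is_ready         = blkfront_is_ready,
	/* Proposed PM hooks (sketch): used for suspend and hibernation. */
	.freeze           = blkfront_freeze,
	.thaw             = blkfront_thaw,
	.restore          = blkfront_restore,
};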
> > > >
> > > > This is different from Xen (xenstore) initiated suspension, as in
> > > > that case Linux doesn't flush the rings or disconnect from the
> > > > backend.
> > > Yes, AFAIK in Xen initiated suspension the backend takes care of it.
> >
> > No, in Xen initiated suspension the backend doesn't take care of
> > flushing the rings; the frontend has a shadow copy of the ring
> > contents and it re-issues the requests on resume.
> >
> Yes, I meant suspension in general, where both xenstore and the backend
> know the system is going into suspension, not the flushing of rings.

The backend has no idea the guest is going to be suspended. Backend
code is completely agnostic to suspension/resume.

> That happens in the frontend when the backend indicates that its state
> is closing, and so on. I may have written it in the wrong context.

I'm afraid I don't fully understand this last sentence.

> > > > > +static int blkfront_freeze(struct xenbus_device *dev)
> > > > > +{
> > > > > +	unsigned int i;
> > > > > +	struct blkfront_info *info = dev_get_drvdata(&dev->dev);
> > > > > +	struct blkfront_ring_info *rinfo;
> > > > > +	/* This would be a reasonable timeout as used in xenbus_dev_shutdown() */
> > > > > +	unsigned int timeout = 5 * HZ;
> > > > > +	int err = 0;
> > > > > +
> > > > > +	info->connected = BLKIF_STATE_FREEZING;
> > > > > +
> > > > > +	blk_mq_freeze_queue(info->rq);
> > > > > +	blk_mq_quiesce_queue(info->rq);
> > > > > +
> > > > > +	for (i = 0; i < info->nr_rings; i++) {
> > > > > +		rinfo = &info->rinfo[i];
> > > > > +
> > > > > +		gnttab_cancel_free_callback(&rinfo->callback);
> > > > > +		flush_work(&rinfo->work);
> > > > > +	}
> > > > > +
> > > > > +	/* Kick the backend to disconnect */
> > > > > +	xenbus_switch_state(dev, XenbusStateClosing);
> > > >
> > > > Are you sure this is safe?
> > > >
> > > In my testing, running multiple fio jobs and other test scenarios
> > > running a memory loader work fine. I did not come across a scenario
> > > that would have failed resume due to blkfront issues; can you
> > > suggest some?
> >
> > AFAICT you don't wait for the in-flight requests to be finished, and
> > just rely on blkback to finish processing those. I'm not sure all
> > blkback implementations out there can guarantee that.
> >
> > The approach used by Xen initiated suspension is to re-issue the
> > in-flight requests when resuming. I have to admit I don't think this
> > is the best approach, but I would like to keep both the Xen and the PM
> > initiated suspension using the same logic, and hence I would request
> > that you try to re-use the existing resume logic (blkfront_resume).
> >
> > > > I don't think you wait for all requests pending on the ring to be
> > > > finished by the backend, and hence you might lose requests as the
> > > > ones on the ring would not be re-issued by blkfront_restore AFAICT.
> > > >
> > > AFAIU, blk_mq_freeze_queue/blk_mq_quiesce_queue should take care of
> > > there being no used requests on the shared ring. Also, we want to
> > > pause the queue and flush all the pending requests in the shared
> > > ring before disconnecting from the backend.
> >
> > Oh, so blk_mq_freeze_queue does wait for in-flight requests to be
> > finished. I guess it's fine then.
> >
> Ok.
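To make the drain/restart pairing concrete, a minimal sketch follows,
relying on the semantics established above (blk_mq_freeze_queue()
blocks new I/O and waits for in-flight requests to finish;
blk_mq_quiesce_queue() waits for ongoing dispatches to drain). The
helper names are illustrative, not taken from the patch:

static void blkfront_drain_queue(struct blkfront_info *info)
{
	/* Gate new requests and wait for all in-flight ones to complete. */
	blk_mq_freeze_queue(info->rq);
	/* Wait for concurrent ->queue_rq() dispatches to finish, so nothing
	 * touches the shared ring while the frontend disconnects. */
	blk_mq_quiesce_queue(info->rq);
}

static void blkfront_restart_queue(struct blkfront_info *info)
{
	/* Undo in reverse order once the ring has been reconnected. */
	blk_mq_unquiesce_queue(info->rq);
	blk_mq_unfreeze_queue(info->rq);
}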
> > > Quiescing the queue seemed a better option here as we want to make
> > > sure ongoing request dispatches are totally drained.
> > > I should admit that some of these notions are borrowed from how nvme
> > > freeze/unfreeze is done, although it's not an apples-to-apples
> > > comparison.
> >
> > That's fine, but I would still like to request that you use the same
> > logic (as much as possible) for both the Xen and the PM initiated
> > suspension.
> >
> > So you either apply this freeze/unfreeze to the Xen suspension (and
> > drop the re-issuing of requests on resume) or adopt the same approach
> > as the Xen initiated suspension. Keeping two completely different
> > approaches to suspension/resume in blkfront is not suitable long term.
> >
> I agree with you that an overhaul of Xen suspend/resume wrt blkfront is
> a good idea; however, IMO that is work for the future and this patch
> series should not be blocked on it. What do you think?

It's not so much that I think an overhaul of suspend/resume in
blkfront is needed; it's just that I don't want to have two completely
different suspend/resume paths inside blkfront.

So from my PoV the right solution is to either use the same code (as
much as possible) as is currently used by Xen initiated suspend/resume,
or to also switch Xen initiated suspension to use the newly introduced
code.

Having two different approaches to suspend/resume in the same driver
is a recipe for disaster IMO: it adds complexity by forcing developers
to take into account two different suspend/resume approaches when
there's no need for it.

Thanks, Roger.