From: Ross Lagerwall <ross.lagerwall@citrix.com>
To: Boris Ostrovsky <boris.ostrovsky@oracle.com>,
Juergen Gross <jgross@suse.com>
Cc: xen-devel@lists.xenproject.org
Subject: Re: [PATCH v2] xenbus: Avoid deadlock during suspend due to open transactions
Date: Wed, 22 May 2019 16:25:52 +0100 [thread overview]
Message-ID: <56ce57f7-8c99-7c16-c997-97bf1e7773bf@citrix.com> (raw)
In-Reply-To: <20190513135635.22406-1-ross.lagerwall@citrix.com>
Ping?
On 5/13/19 2:56 PM, Ross Lagerwall wrote:
> During a suspend/resume, the xenwatch thread waits for all outstanding
> xenstore requests and transactions to complete. This does not work
> correctly for transactions started by userspace because it waits for
> them to complete after freezing userspace threads which means the
> transactions have no way of completing, resulting in a deadlock. This is
> trivial to reproduce by running this script and then suspending the VM:
>
> import pyxs, time
> c = pyxs.client.Client(xen_bus_path="/dev/xen/xenbus")
> c.connect()
> c.transaction()
> time.sleep(3600)
>
> Even if this deadlock were resolved, misbehaving userspace should not
> prevent a VM from being migrated. So, instead of waiting for these
> transactions to complete before suspending, store the current generation
> id for each transaction when it is started. The global generation id is
> incremented during resume. If the caller commits the transaction and the
> generation id does not match the current generation id, return EAGAIN so
> that they try again. If the transaction was instead discarded, return OK
> since no changes were made anyway.
>
> This only affects users of the xenbus file interface. In-kernel users of
> xenbus are assumed to be well-behaved and complete all transactions
> before freezing.
>
> Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
> ---
>
> Changed in v2: rewrote according to Juergen's suggestion.
>
> drivers/xen/xenbus/xenbus.h | 3 +++
> drivers/xen/xenbus/xenbus_dev_frontend.c | 18 ++++++++++++++++++
> drivers/xen/xenbus/xenbus_xs.c | 7 +++++--
> 3 files changed, 26 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/xen/xenbus/xenbus.h b/drivers/xen/xenbus/xenbus.h
> index 092981171df1..d75a2385b37c 100644
> --- a/drivers/xen/xenbus/xenbus.h
> +++ b/drivers/xen/xenbus/xenbus.h
> @@ -83,6 +83,7 @@ struct xb_req_data {
> int num_vecs;
> int err;
> enum xb_req_state state;
> + bool user_req;
> void (*cb)(struct xb_req_data *);
> void *par;
> };
> @@ -133,4 +134,6 @@ void xenbus_ring_ops_init(void);
> int xenbus_dev_request_and_reply(struct xsd_sockmsg *msg, void *par);
> void xenbus_dev_queue_reply(struct xb_req_data *req);
>
> +extern unsigned int xb_dev_generation_id;
> +
> #endif
> diff --git a/drivers/xen/xenbus/xenbus_dev_frontend.c b/drivers/xen/xenbus/xenbus_dev_frontend.c
> index 0782ff3c2273..39c63152a358 100644
> --- a/drivers/xen/xenbus/xenbus_dev_frontend.c
> +++ b/drivers/xen/xenbus/xenbus_dev_frontend.c
> @@ -62,6 +62,8 @@
>
> #include "xenbus.h"
>
> +unsigned int xb_dev_generation_id;
> +
> /*
> * An element of a list of outstanding transactions, for which we're
> * still waiting a reply.
> @@ -69,6 +71,7 @@
> struct xenbus_transaction_holder {
> struct list_head list;
> struct xenbus_transaction handle;
> + unsigned int generation_id;
> };
>
> /*
> @@ -441,6 +444,7 @@ static int xenbus_write_transaction(unsigned msg_type,
> rc = -ENOMEM;
> goto out;
> }
> + trans->generation_id = xb_dev_generation_id;
> list_add(&trans->list, &u->transactions);
> } else if (msg->hdr.tx_id != 0 &&
> !xenbus_get_transaction(u, msg->hdr.tx_id))
> @@ -449,6 +453,20 @@ static int xenbus_write_transaction(unsigned msg_type,
> !(msg->hdr.len == 2 &&
> (!strcmp(msg->body, "T") || !strcmp(msg->body, "F"))))
> return xenbus_command_reply(u, XS_ERROR, "EINVAL");
> + else if (msg_type == XS_TRANSACTION_END) {
> + trans = xenbus_get_transaction(u, msg->hdr.tx_id);
> + if (trans && trans->generation_id != xb_dev_generation_id) {
> + list_del(&trans->list);
> + kfree(trans);
> + if (!strcmp(msg->body, "T"))
> + return xenbus_command_reply(u, XS_ERROR,
> + "EAGAIN");
> + else
> + return xenbus_command_reply(u,
> + XS_TRANSACTION_END,
> + "OK");
> + }
> + }
>
> rc = xenbus_dev_request_and_reply(&msg->hdr, u);
> if (rc && trans) {
> diff --git a/drivers/xen/xenbus/xenbus_xs.c b/drivers/xen/xenbus/xenbus_xs.c
> index 49a3874ae6bb..ddc18da61834 100644
> --- a/drivers/xen/xenbus/xenbus_xs.c
> +++ b/drivers/xen/xenbus/xenbus_xs.c
> @@ -105,6 +105,7 @@ static void xs_suspend_enter(void)
>
> static void xs_suspend_exit(void)
> {
> + xb_dev_generation_id++;
> spin_lock(&xs_state_lock);
> xs_suspend_active--;
> spin_unlock(&xs_state_lock);
> @@ -125,7 +126,7 @@ static uint32_t xs_request_enter(struct xb_req_data *req)
> spin_lock(&xs_state_lock);
> }
>
> - if (req->type == XS_TRANSACTION_START)
> + if (req->type == XS_TRANSACTION_START && !req->user_req)
> xs_state_users++;
> xs_state_users++;
> rq_id = xs_request_id++;
> @@ -140,7 +141,7 @@ void xs_request_exit(struct xb_req_data *req)
> spin_lock(&xs_state_lock);
> xs_state_users--;
> if ((req->type == XS_TRANSACTION_START && req->msg.type == XS_ERROR) ||
> - (req->type == XS_TRANSACTION_END &&
> + (req->type == XS_TRANSACTION_END && !req->user_req &&
> !WARN_ON_ONCE(req->msg.type == XS_ERROR &&
> !strcmp(req->body, "ENOENT"))))
> xs_state_users--;
> @@ -286,6 +287,7 @@ int xenbus_dev_request_and_reply(struct xsd_sockmsg *msg, void *par)
> req->num_vecs = 1;
> req->cb = xenbus_dev_queue_reply;
> req->par = par;
> + req->user_req = true;
>
> xs_send(req, msg);
>
> @@ -313,6 +315,7 @@ static void *xs_talkv(struct xenbus_transaction t,
> req->vec = iovec;
> req->num_vecs = num_vecs;
> req->cb = xs_wake_up;
> + req->user_req = false;
>
> msg.req_id = 0;
> msg.tx_id = t.id;
>
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
WARNING: multiple messages have this Message-ID (diff)
From: Ross Lagerwall <ross.lagerwall@citrix.com>
To: Boris Ostrovsky <boris.ostrovsky@oracle.com>,
Juergen Gross <jgross@suse.com>
Cc: xen-devel@lists.xenproject.org
Subject: Re: [Xen-devel] [PATCH v2] xenbus: Avoid deadlock during suspend due to open transactions
Date: Wed, 22 May 2019 16:25:52 +0100 [thread overview]
Message-ID: <56ce57f7-8c99-7c16-c997-97bf1e7773bf@citrix.com> (raw)
Message-ID: <20190522152552.LwkdqmuuDCKViQP06nC6YEN1_eTQ-RJB9yPWEkcDlec@z> (raw)
In-Reply-To: <20190513135635.22406-1-ross.lagerwall@citrix.com>
Ping?
On 5/13/19 2:56 PM, Ross Lagerwall wrote:
> During a suspend/resume, the xenwatch thread waits for all outstanding
> xenstore requests and transactions to complete. This does not work
> correctly for transactions started by userspace because it waits for
> them to complete after freezing userspace threads which means the
> transactions have no way of completing, resulting in a deadlock. This is
> trivial to reproduce by running this script and then suspending the VM:
>
> import pyxs, time
> c = pyxs.client.Client(xen_bus_path="/dev/xen/xenbus")
> c.connect()
> c.transaction()
> time.sleep(3600)
>
> Even if this deadlock were resolved, misbehaving userspace should not
> prevent a VM from being migrated. So, instead of waiting for these
> transactions to complete before suspending, store the current generation
> id for each transaction when it is started. The global generation id is
> incremented during resume. If the caller commits the transaction and the
> generation id does not match the current generation id, return EAGAIN so
> that they try again. If the transaction was instead discarded, return OK
> since no changes were made anyway.
>
> This only affects users of the xenbus file interface. In-kernel users of
> xenbus are assumed to be well-behaved and complete all transactions
> before freezing.
>
> Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
> ---
>
> Changed in v2: rewrote according to Juergen's suggestion.
>
> drivers/xen/xenbus/xenbus.h | 3 +++
> drivers/xen/xenbus/xenbus_dev_frontend.c | 18 ++++++++++++++++++
> drivers/xen/xenbus/xenbus_xs.c | 7 +++++--
> 3 files changed, 26 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/xen/xenbus/xenbus.h b/drivers/xen/xenbus/xenbus.h
> index 092981171df1..d75a2385b37c 100644
> --- a/drivers/xen/xenbus/xenbus.h
> +++ b/drivers/xen/xenbus/xenbus.h
> @@ -83,6 +83,7 @@ struct xb_req_data {
> int num_vecs;
> int err;
> enum xb_req_state state;
> + bool user_req;
> void (*cb)(struct xb_req_data *);
> void *par;
> };
> @@ -133,4 +134,6 @@ void xenbus_ring_ops_init(void);
> int xenbus_dev_request_and_reply(struct xsd_sockmsg *msg, void *par);
> void xenbus_dev_queue_reply(struct xb_req_data *req);
>
> +extern unsigned int xb_dev_generation_id;
> +
> #endif
> diff --git a/drivers/xen/xenbus/xenbus_dev_frontend.c b/drivers/xen/xenbus/xenbus_dev_frontend.c
> index 0782ff3c2273..39c63152a358 100644
> --- a/drivers/xen/xenbus/xenbus_dev_frontend.c
> +++ b/drivers/xen/xenbus/xenbus_dev_frontend.c
> @@ -62,6 +62,8 @@
>
> #include "xenbus.h"
>
> +unsigned int xb_dev_generation_id;
> +
> /*
> * An element of a list of outstanding transactions, for which we're
> * still waiting a reply.
> @@ -69,6 +71,7 @@
> struct xenbus_transaction_holder {
> struct list_head list;
> struct xenbus_transaction handle;
> + unsigned int generation_id;
> };
>
> /*
> @@ -441,6 +444,7 @@ static int xenbus_write_transaction(unsigned msg_type,
> rc = -ENOMEM;
> goto out;
> }
> + trans->generation_id = xb_dev_generation_id;
> list_add(&trans->list, &u->transactions);
> } else if (msg->hdr.tx_id != 0 &&
> !xenbus_get_transaction(u, msg->hdr.tx_id))
> @@ -449,6 +453,20 @@ static int xenbus_write_transaction(unsigned msg_type,
> !(msg->hdr.len == 2 &&
> (!strcmp(msg->body, "T") || !strcmp(msg->body, "F"))))
> return xenbus_command_reply(u, XS_ERROR, "EINVAL");
> + else if (msg_type == XS_TRANSACTION_END) {
> + trans = xenbus_get_transaction(u, msg->hdr.tx_id);
> + if (trans && trans->generation_id != xb_dev_generation_id) {
> + list_del(&trans->list);
> + kfree(trans);
> + if (!strcmp(msg->body, "T"))
> + return xenbus_command_reply(u, XS_ERROR,
> + "EAGAIN");
> + else
> + return xenbus_command_reply(u,
> + XS_TRANSACTION_END,
> + "OK");
> + }
> + }
>
> rc = xenbus_dev_request_and_reply(&msg->hdr, u);
> if (rc && trans) {
> diff --git a/drivers/xen/xenbus/xenbus_xs.c b/drivers/xen/xenbus/xenbus_xs.c
> index 49a3874ae6bb..ddc18da61834 100644
> --- a/drivers/xen/xenbus/xenbus_xs.c
> +++ b/drivers/xen/xenbus/xenbus_xs.c
> @@ -105,6 +105,7 @@ static void xs_suspend_enter(void)
>
> static void xs_suspend_exit(void)
> {
> + xb_dev_generation_id++;
> spin_lock(&xs_state_lock);
> xs_suspend_active--;
> spin_unlock(&xs_state_lock);
> @@ -125,7 +126,7 @@ static uint32_t xs_request_enter(struct xb_req_data *req)
> spin_lock(&xs_state_lock);
> }
>
> - if (req->type == XS_TRANSACTION_START)
> + if (req->type == XS_TRANSACTION_START && !req->user_req)
> xs_state_users++;
> xs_state_users++;
> rq_id = xs_request_id++;
> @@ -140,7 +141,7 @@ void xs_request_exit(struct xb_req_data *req)
> spin_lock(&xs_state_lock);
> xs_state_users--;
> if ((req->type == XS_TRANSACTION_START && req->msg.type == XS_ERROR) ||
> - (req->type == XS_TRANSACTION_END &&
> + (req->type == XS_TRANSACTION_END && !req->user_req &&
> !WARN_ON_ONCE(req->msg.type == XS_ERROR &&
> !strcmp(req->body, "ENOENT"))))
> xs_state_users--;
> @@ -286,6 +287,7 @@ int xenbus_dev_request_and_reply(struct xsd_sockmsg *msg, void *par)
> req->num_vecs = 1;
> req->cb = xenbus_dev_queue_reply;
> req->par = par;
> + req->user_req = true;
>
> xs_send(req, msg);
>
> @@ -313,6 +315,7 @@ static void *xs_talkv(struct xenbus_transaction t,
> req->vec = iovec;
> req->num_vecs = num_vecs;
> req->cb = xs_wake_up;
> + req->user_req = false;
>
> msg.req_id = 0;
> msg.tx_id = t.id;
>
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
next prev parent reply other threads:[~2019-05-22 15:25 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-05-13 13:56 [PATCH v2] xenbus: Avoid deadlock during suspend due to open transactions Ross Lagerwall
2019-05-13 13:56 ` [Xen-devel] " Ross Lagerwall
2019-05-22 15:25 ` Ross Lagerwall [this message]
2019-05-22 15:25 ` Ross Lagerwall
2019-05-27 5:11 ` Juergen Gross
2019-05-27 5:11 ` [Xen-devel] " Juergen Gross
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=56ce57f7-8c99-7c16-c997-97bf1e7773bf@citrix.com \
--to=ross.lagerwall@citrix.com \
--cc=boris.ostrovsky@oracle.com \
--cc=jgross@suse.com \
--cc=xen-devel@lists.xenproject.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).