All of lore.kernel.org
 help / color / mirror / Atom feed
* bug in hvmloader xenbus_shutdown logic
@ 2013-12-04  4:25 Liuqiming (John)
  2013-12-04  9:49 ` Ian Campbell
  0 siblings, 1 reply; 29+ messages in thread
From: Liuqiming (John) @ 2013-12-04  4:25 UTC (permalink / raw)
  To: xen-devel; +Cc: Yanqiangjun

Hi,

A couple of months ago, I run into a xenstored IO ring issue ( see http://lists.xen.org/archives/html/xen-devel/2013-07/msg00134.html) and at that time we add a protection on oxenstored side.

Somehow I meet this problem again. A suse11 hvm domain start and hang with error:

    XENBUS request ring is not quiescent (00000010:00000000)!
    XENBUS response ring is not quiescent (00000000:00000013): fixing up

It happens when pv-driver init xenbus modules

...
int xb_init_comms(void)
{
	struct xenstore_domain_interface *intf = xen_store_interface;
	int err;

	if (intf->req_prod != intf->req_cons)
		pr_err("XENBUS request ring is not quiescent "
		       "(%08x:%08x)!\n", intf->req_cons, intf->req_prod);

	if (intf->rsp_prod != intf->rsp_cons) {
		pr_warning("XENBUS response ring is not quiescent"
			   " (%08x:%08x): fixing up\n",
			   intf->rsp_cons, intf->rsp_prod);
		/* breaks kdump */
		if (!reset_devices)
			intf->rsp_cons = intf->rsp_prod;
	}
...

With the protection logic, oxenstored will treat this domain as " malicious domains", but apparently it's not.

And this time I review all the related source code, and found that it may be caused by a bug in hvmloader.

In hvmloader after all the setup, there will be a xenbus_shutdown call which will clear up the ring,

void xenbus_shutdown(void)
{
...
    /* We zero out the whole ring -- the backend can handle this, and it's 
     * not going to surprise any frontends since it's equivalent to never 
     * having used the rings. */
    memset(rings, 0, sizeof *rings); // <-- clear it here
...
} 

And  memset is implemented as this 
void *
memset(void *s, int c, unsigned n)
{
    uint8_t b = (uint8_t) c;
    uint8_t *p = (uint8_t *)s;
    int i;
    for ( i = 0; i < n; i++ )
        *p++ = b;
    return s;
}

The ring structure is put in a shared page and definition is 
struct xenstore_domain_interface
{
    char req[XENSTORE_RING_SIZE]; /* Requests to xenstore daemon. */
    char rsp[XENSTORE_RING_SIZE]; /* Replies and async watch events. */
    XENSTORE_RING_IDX req_cons, req_prod;
    XENSTORE_RING_IDX rsp_cons, rsp_prod;
};

So here is the problem:
memset can not set all the page to zero in an atomic way, and during the clear up process oxenstored may access this ring.
So memset set every 8bit in this page to zero one by one and when req_cons set to zero oxenstored read req_cons and req_prod to compare, 
After that oxenstored will process as a normal incoming request and memset will zero out the left page. 
Finally oxenstored will increase req_cons but req_prod will remain as 0 and that status will be treated as a "bad connection".

To fix that we need to make memset atomicly (clear up req_cons and req_prod the same time at least) or we can make oxenstored do not to access the ring when hvmloader clearing up.

Any suggestion?

^ permalink raw reply	[flat|nested] 29+ messages in thread

end of thread, other threads:[~2014-09-24 13:36 UTC | newest]

Thread overview: 29+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-12-04  4:25 bug in hvmloader xenbus_shutdown logic Liuqiming (John)
2013-12-04  9:49 ` Ian Campbell
2013-12-05  2:08   ` Liuqiming (John)
2013-12-05  9:36     ` David Scott
2013-12-05  9:43       ` Paul Durrant
2013-12-05 11:10       ` Ian Campbell
2013-12-06  3:53         ` Liuqiming (John)
2014-06-16 13:43           ` Dave Scott
2014-06-16 13:57             ` Paul Durrant
2014-06-25 21:15               ` [PATCH RFC] RFC: extend the xenstore ring with a 'closing' signal David Scott
2014-06-26  9:05                 ` Paul Durrant
2014-06-26  9:45                   ` Dave Scott
2014-06-30 14:42                     ` [PATCH v2] " David Scott
2014-07-03 15:15                       ` [PATCH v3] " David Scott
2014-07-04  8:21                         ` Paul Durrant
2014-07-04 10:00                           ` Dave Scott
2014-07-04  9:59                         ` Ian Campbell
2014-07-04 10:19                           ` Dave Scott
2014-07-04 11:27                             ` Ian Campbell
2014-09-03 16:25                         ` [PATCH v4] xenstore: " David Scott
2014-09-10 13:35                           ` Ian Campbell
2014-09-10 14:33                             ` Dave Scott
     [not found]                           ` <DF76B30A-D122-4600-987E-6BBD66CFFF73@citrix.com>
2014-09-22 16:38                             ` Ian Jackson
2014-09-24  9:06                               ` Dave Scott
2014-09-24 13:36                           ` Jon Ludlam
2014-07-04 11:02                       ` [PATCH v2] RFC: " Ian Jackson
2014-07-02 12:32                 ` [PATCH RFC] " Andrew Cooper
2014-07-02 16:47                   ` Dave Scott
2014-07-02 16:52                     ` Andrew Cooper

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.