* stub domain crash related to bind_interdomain
From: Sarah Newman @ 2017-06-19 23:39 UTC
  To: xen-devel

I have gotten messages like this sporadically in the qemu-dm log for stub domains, both at domain start and domain reboot:

evtchn_open() -> 7
ERROR: bind_interdomain failed with rc=-22xenevtchn_bind_interdomain(121, 0) = -22
bind interdomain ioctl error 22
Unable to find x86 CPU definition
close(0)

The failing remote port is not always 0, but it usually is.

We recently upgraded to Xen 4.8.1. Looking at the code, the most likely cause of the failure seems to be this check in
xen/common/event_channel.c:evtchn_bind_interdomain:

    if ( (rchn->state != ECS_UNBOUND) ||
         (rchn->u.unbound.remote_domid != ld->domain_id) )
        ERROR_EXIT_DOM(-EINVAL, rd);

But I don't know how this could happen.
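
To make the condition concrete, here is a minimal standalone model of that check. The types are simplified stand-ins, not the real Xen structures, and check_remote_channel is a hypothetical name used only for illustration. It shows that -EINVAL (the -22 in the log, on Linux) results either when the remote channel is not in the unbound state or when it was allocated for a different domain than the one trying to bind.

```c
#include <errno.h>
#include <stdint.h>

/* Simplified stand-ins for illustration; not the real Xen definitions. */
enum ecs_state { ECS_FREE, ECS_RESERVED, ECS_UNBOUND, ECS_INTERDOMAIN };

struct evtchn_model {
    enum ecs_state state;
    uint16_t unbound_remote_domid; /* meaningful only when state == ECS_UNBOUND */
};

/*
 * Hypothetical helper mirroring the quoted condition: the remote channel
 * must be unbound AND must have been allocated for the binding domain.
 * Returns 0 if the bind could proceed, -EINVAL otherwise.
 */
int check_remote_channel(const struct evtchn_model *rchn, uint16_t local_domid)
{
    if (rchn->state != ECS_UNBOUND ||
        rchn->unbound_remote_domid != local_domid)
        return -EINVAL;
    return 0;
}
```

In this model a channel in any state other than ECS_UNBOUND fails the check regardless of the domain ID, so the same return code covers several distinct causes.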

Was there a fix for this since the 4.8.1 release that I missed? I could not find anything related in the xen-devel archives since April 10
or in the git repositories.

Please keep me CC'ed.

Thanks, Sarah

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


* Re: stub domain crash related to bind_interdomain
From: Jan Beulich @ 2017-06-20  8:24 UTC
  To: Sarah Newman; +Cc: xen-devel

>>> On 20.06.17 at 01:39, <srn@prgmr.com> wrote:
> I have gotten messages like this sporadically in the qemu-dm log for stub 
> domains, both at domain start and domain reboot:
> 
> evtchn_open() -> 7
> ERROR: bind_interdomain failed with rc=-22xenevtchn_bind_interdomain(121, 0) 
> = -22
> bind interdomain ioctl error 22
> Unable to find x86 CPU definition
> close(0)
> 
> It is not always remote port 0 that fails but typically is so.

But I'm afraid this is a relevant distinction, and hence you may be
seeing two different issues. Have you been able to find out where
that remote port is coming from? I ask because port 0 is never a
valid one (see evtchn_init() setting it to ECS_RESERVED).
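
One way to keep the two cases apart while collecting logs is sketched below. This is only an illustration: bind_triage and triage_bind are hypothetical names, not part of Xen or qemu-dm, and the only fact it relies on is that port 0 is reserved and so can never be a legitimate remote port.

```c
#include <stdint.h>

/* Hypothetical labels for log triage; none of these names exist in Xen or qemu-dm. */
enum bind_triage {
    BIND_SUCCEEDED,
    BIND_PORT_ZERO,     /* reserved port 0 was passed: the port was probably never filled in */
    BIND_PORT_REJECTED  /* a non-zero port was passed, but the bind still failed */
};

/* Classify one bind attempt from the remote port that was used and the return code. */
enum bind_triage triage_bind(uint32_t remote_port, int rc)
{
    if (rc == 0)
        return BIND_SUCCEEDED;
    return remote_port == 0 ? BIND_PORT_ZERO : BIND_PORT_REJECTED;
}
```

Applied to the log above, triage_bind(0, -22) would land in BIND_PORT_ZERO, while a failure with any other remote port would be counted separately.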

Jan



* Re: stub domain crash related to bind_interdomain
From: Sarah Newman @ 2017-06-20 17:36 UTC
  To: Jan Beulich; +Cc: xen-devel

On 06/20/2017 01:24 AM, Jan Beulich wrote:
>>>> On 20.06.17 at 01:39, <srn@prgmr.com> wrote:
>> I have gotten messages like this sporadically in the qemu-dm log for stub 
>> domains, both at domain start and domain reboot:
>>
>> evtchn_open() -> 7
>> ERROR: bind_interdomain failed with rc=-22xenevtchn_bind_interdomain(121, 0) 
>> = -22
>> bind interdomain ioctl error 22
>> Unable to find x86 CPU definition
>> close(0)
>>
>> It is not always remote port 0 that fails but typically is so.
> 
> But I'm afraid this is a relevant distinction, and hence you may be
> seeing two different issues. Have you been able to find out where
> that remote port is coming from? I ask because port 0 is never a
> valid one (see evtchn_init() setting it to ECS_RESERVED).

By inspection, I think it is shared_page->vcpu_ioreq[i].vp_eport, used in
helper2.c:cpu_x86_init. Otherwise I would expect to see another message like

xc_evtchn_bind_interdomain(21, 3) = 0

first, and I only see one message from xc_evtchn_bind_interdomain.

I think it should be reproducible within a few hundred reboots.

--Sarah


* Re: stub domain crash related to bind_interdomain
From: Jan Beulich @ 2017-06-21  6:37 UTC
  To: Sarah Newman; +Cc: xen-devel

>>> On 20.06.17 at 19:36, <srn@prgmr.com> wrote:
> On 06/20/2017 01:24 AM, Jan Beulich wrote:
>>>>> On 20.06.17 at 01:39, <srn@prgmr.com> wrote:
>>> I have gotten messages like this sporadically in the qemu-dm log for stub 
>>> domains, both at domain start and domain reboot:
>>>
>>> evtchn_open() -> 7
>>> ERROR: bind_interdomain failed with rc=-22xenevtchn_bind_interdomain(121, 0)
>>> = -22
>>> bind interdomain ioctl error 22
>>> Unable to find x86 CPU definition
>>> close(0)
>>>
>>> It is not always remote port 0 that fails but typically is so.
>> 
>> But I'm afraid this is a relevant distinction, and hence you may be
>> seeing two different issues. Have you been able to find out where
>> that remote port is coming from? I ask because port 0 is never a
>> valid one (see evtchn_init() setting it to ECS_RESERVED).
> 
> By inspection I think it is
> shared_page->vcpu_ioreq[i].vp_eport used in helper2.c:cpu_x86_init because 
> otherwise I should see another message like
> 
> xc_evtchn_bind_interdomain(21, 3) = 0
> first, and I only see one message from xc_evtchn_bind_interdomain.

So perhaps a race between the setting up of that field and its
consumption for binding? With most of the involved code in qemu
being the same between use in Dom0 and in stubdom, it may
simply be a race that happens to never be lost in the former
case (and as you say it's rare enough in the latter). OTOH I'm
not sure qemu-dm uses multiple threads in the first place, and if
it doesn't I can't see ways for such an occasionally lost race. In
any event - I'm not a qemu-dm specialist at all, so I'll defer
further analysis on that side to people who are.
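
If it does turn out to be such a race, a consumer could in principle treat a zero port as "not set up yet" and re-read the field a bounded number of times before binding. The sketch below only illustrates that idea and is not a proposed patch: read_port_fn, wait_for_valid_port, fake_page and fake_read_port are all hypothetical names, and whether retrying is appropriate at all depends on who fills in the field and when, which is not established here.

```c
#include <stdint.h>

/* Hypothetical callback that returns the current value of the port field. */
typedef uint32_t (*read_port_fn)(void *ctx);

/*
 * Re-read the port up to max_attempts times. Port 0 is reserved, so a zero
 * value is taken to mean "not populated yet". Returns the first non-zero
 * port seen, or 0 if the field never became valid.
 */
uint32_t wait_for_valid_port(read_port_fn read_port, void *ctx,
                             unsigned int max_attempts)
{
    for (unsigned int i = 0; i < max_attempts; i++) {
        uint32_t port = read_port(ctx);
        if (port != 0)
            return port;
    }
    return 0;
}

/* Fake field for demonstration: reads as 0 a few times, then yields the port. */
struct fake_page {
    unsigned int reads_until_ready;
    uint32_t port;
};

uint32_t fake_read_port(void *ctx)
{
    struct fake_page *p = ctx;
    if (p->reads_until_ready > 0) {
        p->reads_until_ready--;
        return 0;
    }
    return p->port;
}
```

A caller would bind only if the returned port is non-zero, and would otherwise report that the field was never populated instead of passing port 0 to the hypervisor. A real consumer would also need to pause or yield between reads.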

Jan

