xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed
From: "Roger Pau Monné" <roger.pau@citrix.com>
To: James Dingwall <james-xen@dingwall.me.uk>
Cc: <xen-devel@lists.xenproject.org>
Subject: Re: possible kernel/libxl race with xl network-attach
Date: Fri, 21 Jan 2022 15:00:29 +0100	[thread overview]
Message-ID: <Yeq8fdOWLngcjDH8@Air-de-Roger> (raw)
In-Reply-To: <1671ad485c35fb86d758951889cfa05d@dingwall.me.uk>

On Fri, Jan 21, 2022 at 01:34:54PM +0000, James Dingwall wrote:
> On 2022-01-13 16:11, Roger Pau Monné wrote:
> > On Thu, Jan 13, 2022 at 11:19:46AM +0000, James Dingwall wrote:
> > > 
> > > I have been trying to debug a problem where a vif with the backend
> > > in a
> > > driver domain is added to dom0.  Intermittently the hotplug script is
> > > not invoked by libxl (running as xl devd) in the driver domain.  By
> > > enabling some debug for the driver domain kernel and libxl I have
> > > these
> > > messages:
> > > 
> > > driver domain kernel (Ubuntu 5.4.0-92-generic):
> > > 
> > > [Thu Jan 13 01:39:31 2022] [1408] 564: vif vif-0-0 vif0.0:
> > > Successfully created xenvif
> > > [Thu Jan 13 01:39:31 2022] [26] 583: xen_netback:frontend_changed:
> > > /local/domain/0/device/vif/0 -> Initialising
> > > [Thu Jan 13 01:39:31 2022] [26] 470:
> > > xen_netback:backend_switch_state: backend/vif/0/0 -> InitWait
> > > [Thu Jan 13 01:39:31 2022] [26] 583: xen_netback:frontend_changed:
> > > /local/domain/0/device/vif/0 -> Connected
> > > [Thu Jan 13 01:39:31 2022] vif vif-0-0 vif0.0: Guest Rx ready
> > > [Thu Jan 13 01:39:31 2022] [26] 470:
> > > xen_netback:backend_switch_state: backend/vif/0/0 -> Connected
> > > 
> > > xl devd (Xen 4.14.3):
> > > 
> > > 2022-01-13 01:39:31 UTC libxl: debug:
> > > libxl_event.c:750:watchfd_callback: watch w=0x7ffd416b0528
> > > wpath=/local/domain/2/backend token=3/0: event
> > > epath=/local/domain/2/backend/vif/0/0/state
> > > 2022-01-13 01:39:31 UTC libxl: debug:
> > > libxl_event.c:2445:libxl__nested_ao_create: ao 0x5633ac569700:
> > > nested ao, parent 0x5633ac567f90
> > > 2022-01-13 01:39:31 UTC libxl: debug:
> > > libxl_event.c:750:watchfd_callback: watch w=0x5633ac569180
> > > wpath=/local/domain/2/backend/vif/0/0/state token=2/1: event
> > > epath=/local/domain/2/backend/vif/0/0/state
> > > 2022-01-13 01:39:31 UTC libxl: debug:
> > > libxl_event.c:1055:devstate_callback: backend
> > > /local/domain/2/backend/vif/0/0/state wanted state 2 still waiting
> > > state 4
> > > 2022-01-13 01:39:31 UTC libxl: debug:
> > > libxl_event.c:750:watchfd_callback: watch w=0x7ffd416b0528
> > > wpath=/local/domain/2/backend token=3/0: event
> > > epath=/local/domain/2/backend/vif/0/0/state
> > > 2022-01-13 01:39:31 UTC libxl: debug:
> > > libxl_event.c:2445:libxl__nested_ao_create: ao 0x5633ac56a220:
> > > nested ao, parent 0x5633ac567f90
> > > 2022-01-13 01:39:31 UTC libxl: debug:
> > > libxl_event.c:750:watchfd_callback: watch w=0x5633ac569180
> > > wpath=/local/domain/2/backend/vif/0/0/state token=2/1: event
> > > epath=/local/domain/2/backend/vif/0/0/state
> > > 2022-01-13 01:39:31 UTC libxl: debug:
> > > libxl_event.c:1055:devstate_callback: backend
> > > /local/domain/2/backend/vif/0/0/state wanted state 2 still waiting
> > > state 4
> > > 2022-01-13 01:39:51 UTC libxl: debug:
> > > libxl_aoutils.c:88:xswait_timeout_callback: backend
> > > /local/domain/2/backend/vif/0/0/state (hoping for state change to
> > > 2): xswait timeout (path=/local/domain/2/backend/vif/0/0/state)
> > > 2022-01-13 01:39:51 UTC libxl: debug:
> > > libxl_event.c:850:libxl__ev_xswatch_deregister: watch
> > > w=0x5633ac569180 wpath=/local/domain/2/backend/vif/0/0/state
> > > token=2/1: deregister slotnum=2
> > > 2022-01-13 01:39:51 UTC libxl: debug:
> > > libxl_event.c:1039:devstate_callback: backend
> > > /local/domain/2/backend/vif/0/0/state wanted state 2  timed out
> > > 2022-01-13 01:39:51 UTC libxl: debug:
> > > libxl_event.c:864:libxl__ev_xswatch_deregister: watch
> > > w=0x5633ac569180: deregister unregistered
> > > 2022-01-13 01:39:51 UTC libxl: debug:
> > > libxl_device.c:1092:device_backend_callback: calling
> > > device_backend_cleanup
> > > 2022-01-13 01:39:51 UTC libxl: debug:
> > > libxl_event.c:864:libxl__ev_xswatch_deregister: watch
> > > w=0x5633ac569180: deregister unregistered
> > > 2022-01-13 01:39:51 UTC libxl: error:
> > > libxl_device.c:1105:device_backend_callback: unable to add device
> > > with path /local/domain/2/backend/vif/0/0
> > > 2022-01-13 01:39:51 UTC libxl: debug:
> > > libxl_event.c:864:libxl__ev_xswatch_deregister: watch
> > > w=0x5633ac569280: deregister unregistered
> > > 2022-01-13 01:39:51 UTC libxl: debug:
> > > libxl_device.c:1470:device_complete: device
> > > /local/domain/2/backend/vif/0/0 add failed
> > > 2022-01-13 01:39:51 UTC libxl: debug:
> > > libxl_event.c:2035:libxl__ao__destroy: ao 0x5633ac568f30: destroy
> > > 
> > > the xenstore content for the backend:
> > > 
> > > # xenstore-ls /local/domain/2/backend/vif/0
> > > 0 = ""
> > >  frontend = "/local/domain/0/device/vif/0"
> > >  frontend-id = "0"
> > >  online = "1"
> > >  state = "4"
> > >  script = "/etc/xen/scripts/vif-zynstra"
> > >  vifname = "dom0.0"
> > >  mac = "00:16:3e:6c:de:82"
> > >  bridge = "cluster"
> > >  handle = "0"
> > >  type = "vif"
> > >  feature-sg = "1"
> > >  feature-gso-tcpv4 = "1"
> > >  feature-gso-tcpv6 = "1"
> > >  feature-ipv6-csum-offload = "1"
> > >  feature-rx-copy = "1"
> > >  feature-rx-flip = "0"
> > >  feature-multicast-control = "1"
> > >  feature-dynamic-multicast-control = "1"
> > >  feature-split-event-channels = "1"
> > >  multi-queue-max-queues = "2"
> > >  feature-ctrl-ring = "1"
> > >  hotplug-status = "connected"

The relevant point here is that `hotplug-status = "connected"` in the
backend xenstore nodes, and that's set by the hotplug script.

Having hotplug-status == "connected" will allow the backend to proceed
to the connected state, so there's some component in your system that
sets this xenstore node before `xl devd` get a change to run the
hotplug script, hence me asking for any udev rules in the previous
email.

If it's not an udev rule then I'm lost. Do you have any modifications
to the Xen tools that could set this xenstore node?

Regards, Roger.


  reply	other threads:[~2022-01-21 14:01 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-01-13 11:19 possible kernel/libxl race with xl network-attach James Dingwall
2022-01-13 16:11 ` Roger Pau Monné
2022-01-21 13:34   ` James Dingwall
2022-01-21 14:00     ` Roger Pau Monné [this message]
2022-01-21 15:05       ` James Dingwall
2022-01-24  9:07         ` Roger Pau Monné
2022-01-24 13:38           ` James Dingwall

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Yeq8fdOWLngcjDH8@Air-de-Roger \
    --to=roger.pau@citrix.com \
    --cc=james-xen@dingwall.me.uk \
    --cc=xen-devel@lists.xenproject.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).