From: Paul Durrant <xadimgnik@gmail.com>
To: Michael Brown <mcb30@ipxe.org>, Wei Liu <wei.liu@kernel.org>,
xen-devel@lists.xenproject.org, netdev@vger.kernel.org,
Paul Durrant <pdurrant@amazon.com>
Subject: Re: xen-netback hotplug-status regression bug
Date: Tue, 13 Apr 2021 11:55:36 +0100
Message-ID: <54659eec-e315-5dc5-1578-d91633a80077@xen.org>
In-Reply-To: <58ccc3b7-9ccb-b9bf-84e7-4a023ccb5c56@ipxe.org>
On 13/04/2021 11:48, Michael Brown wrote:
> On 13/04/2021 08:12, Paul Durrant wrote:
>>> If the frontend subsequently disconnects and reconnects (e.g.
>>> transitions through Closed->Initialising->Connected) then:
>>>
>>> - Nothing recreates "hotplug-status"
>>>
>>> - When the frontend re-enters Connected state, connect() sets up a
>>> watch on "hotplug-status" again
>>>
>>> - The callback hotplug_status_changed() is never triggered, and so
>>> the backend device never transitions to Connected state.
>>
>> That's not how I read it. Given that "hotplug-status" is removed by
>> the call to hotplug_status_changed() then the next call to connect()
>> should fail to register the watch and 'have_hotplug_status_watch'
>> should be 0. Thus backend_switch_state() should not defer the
>> transition to XenbusStateConnected in any subsequent interaction with
>> the frontend.
>
> Thank you for the reply. I've tested and confirmed my initial
> hypothesis: the call to xenbus_watch_pathfmt() succeeds even if the node
> does not exist.
>
> I confirmed this with ftrace using:
>
> cd /sys/kernel/debug/tracing
> echo function_graph > current_tracer
> echo set_backend_state > set_ftrace_filter
> echo xenbus_watch_pathfmt >> set_ftrace_filter
> echo register_xenbus_watch >> set_ftrace_filter
> echo xenbus_dev_fatal >> set_ftrace_filter
>
> On the second time that the frontend transitions to Connected, this
> produced the trace:
>
> set_backend_state [xen_netback]() {
> register_xenbus_watch();
> register_xenbus_watch();
> xenbus_watch_pathfmt() {
> register_xenbus_watch();
> }
> }
>
> which seems to confirm that the error path in xenbus_watch_path() is
> *not* taken, i.e. that the call to register_xenbus_watch() succeeded
> even though the node did not exist.
>
>
> Other observations also seem to confirm this behaviour:
>
> - Running "xenstore ls" in dom0 confirms that on the second frontend
> transition to Connected, the frontend state is indeed Connected (4) but
> the backend state remains in InitWait (2)
>
> - Running "xenstore watch
> /local/domain/0/backend/vif/<domU>/0/hotplug-status" *before* starting
> the domU confirms that it is possible to create a watch on a node that
> does not (yet) exist, and that the watch *is* notified when the node is
> later created.
>
>> Are you seeing the watch successfully re-registered even though the
>> node does not exist? Perhaps there has been a change in xenstore
>> behaviour?
>
> So, the TL;DR is that yes, the watch does successfully register even
> though the node does not exist.
>
> From a quick look through the xenstored source, it looks as though the
> only check on the node name is the call to is_valid_nodename(), which
> seems to perform a syntactic validity check only. I can't immediately
> find any commit that would have changed this behaviour.
>
Ok, so it sounds like this was probably my misunderstanding of xenstore
semantics in the first place (although I'm sure I remember watch
registration failing for non-existent nodes at some point in the past...
that may have been with a non-upstream version of oxenstored though).
Anyway... a reasonable fix would therefore be to read the node first and
only register the watch if it does exist.
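
As a rough illustration of that idea, here is a user-space toy model, not
the netback code. Every toy_* name is made up for this sketch, and the
only behaviour it assumes is what the thread reports: watch registration
succeeds whether or not the node exists, and the backend defers the switch
to Connected while 'have_hotplug_status_watch' is set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of one backend and its "hotplug-status" node.
 * Hypothetical names throughout; none of this is a kernel or
 * xenstored API. */
struct toy_backend {
    bool node_exists;               /* is "hotplug-status" present?          */
    bool have_hotplug_status_watch; /* mirrors the flag named in the thread  */
};

/* Models the behaviour confirmed above: registering a watch succeeds
 * regardless of whether the node exists. Returns 0 on success. */
static int toy_register_watch(struct toy_backend *be)
{
    assert(be != NULL);
    return 0;
}

/* connect() as in the bug report: always register the watch. */
static void toy_connect_unconditional(struct toy_backend *be)
{
    if (toy_register_watch(be) == 0)
        be->have_hotplug_status_watch = true;
}

/* connect() with the suggested fix: check the node first and only
 * register the watch if it exists. */
static void toy_connect_checked(struct toy_backend *be)
{
    be->have_hotplug_status_watch = false;
    if (be->node_exists && toy_register_watch(be) == 0)
        be->have_hotplug_status_watch = true;
}

/* True if the backend would hold off the switch to Connected while
 * waiting for a watch event. */
static bool toy_defers_connected(const struct toy_backend *be)
{
    return be->have_hotplug_status_watch;
}
```

With node_exists == false (the reconnect case), the unconditional variant
leaves the backend waiting on an event that never arrives, whereas the
checked variant leaves the flag clear so the transition is not deferred.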
Paul
> Thanks,
>
> Michael