* vm_event regression in 4.7
@ 2016-02-03  0:51 Tamas K Lengyel
  2016-02-03  1:00 ` Andrew Cooper
  0 siblings, 1 reply; 6+ messages in thread
From: Tamas K Lengyel @ 2016-02-03  0:51 UTC (permalink / raw)
  To: Xen-devel



Hello all,
with the latest master branch of Xen there is a regression enabling
vm_event on a domain. If an event listener was previously active on the
domain it is now not possible to reenable events as the domctl returns
-EINVAL. The problem seems to stem from activating the magic page for
vm_event using prepare_ring_for_helper as it returns NULL. Further looking
into where things go wrong within that function it seems the page type
returned by __get_gfn_type_access is p2m_ram_logdirty with an invalid mfn
(0xffffffffffffffff) and then it hits "Error path: not a suitable GFN at
all".

Can anyone point me to which change or what may be causing this?

Thanks,
Tamas


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


* Re: vm_event regression in 4.7
  2016-02-03  0:51 vm_event regression in 4.7 Tamas K Lengyel
@ 2016-02-03  1:00 ` Andrew Cooper
  2016-02-03  1:32   ` Tamas K Lengyel
  0 siblings, 1 reply; 6+ messages in thread
From: Andrew Cooper @ 2016-02-03  1:00 UTC (permalink / raw)
  To: Tamas K Lengyel, Xen-devel



On 03/02/2016 00:51, Tamas K Lengyel wrote:
> Hello all,
> with the latest master branch of Xen there is a regression enabling
> vm_event on a domain. If an event listener was previously active on
> the domain it is now not possible to reenable events as the domctl
> returns -EINVAL. The problem seems to stem from activating the magic
> page for vm_event using prepare_ring_for_helper as it returns NULL.
> Further looking into where things go wrong within that function it
> seems the page type returned by __get_gfn_type_access is
> p2m_ram_logdirty with an invalid mfn (0xffffffffffffffff) and then it
> hits "Error path: not a suitable GFN at all".
>
> Can anyone point me to which change or what may be causing this?

Did the previous event listener replace the page it stole from guest
physmap for ring purposes when it exited?

That error specifically means that the gfn chosen for the ring was not
present when prepare_ring_for_helper() was called.

A first gut feeling would point to the changes in HVM domain
construction stemming from the DMLite work, but if event listening works
the first time and then fails, the magic page was suitably present
the first time around.

~Andrew


* Re: vm_event regression in 4.7
  2016-02-03  1:00 ` Andrew Cooper
@ 2016-02-03  1:32   ` Tamas K Lengyel
  2016-02-03 10:35     ` Andrew Cooper
  0 siblings, 1 reply; 6+ messages in thread
From: Tamas K Lengyel @ 2016-02-03  1:32 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel



On Tue, Feb 2, 2016 at 6:00 PM, Andrew Cooper <andrew.cooper3@citrix.com>
wrote:

> On 03/02/2016 00:51, Tamas K Lengyel wrote:
>
> Hello all,
> with the latest master branch of Xen there is a regression enabling
> vm_event on a domain. If an event listener was previously active on the
> domain it is now not possible to reenable events as the domctl returns
> -EINVAL. The problem seems to stem from activating the magic page for
> vm_event using prepare_ring_for_helper as it returns NULL. Further looking
> into where things go wrong within that function it seems the page type
> returned by __get_gfn_type_access is p2m_ram_logdirty with an invalid mfn
> (0xffffffffffffffff) and then it hits "Error path: not a suitable GFN at
> all".
>
> Can anyone point me to which change or what may be causing this?
>
>
> Did the previous event listener replace the page it stole from guest
> physmap for ring purposes when it exited?
>

Ah, here is what seems to be the problem. Previously it was not required to
do this during teardown. What we had was libxc would check if it can map
the ring page with xc_map_foreign_pages, and it would repopulate the page
if it failed before running xc_vm_event_enable. However, now it seems
xc_map_foreign_pages returns non-NULL the second time around as well, even
though the page is not in the physmap. If I force libxc to run
populate_physmap then I can get vm_event to initialize properly again. So
the change seems to relate somehow to the behavior of xc_map_foreign_pages.
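
The flow described above can be sketched with a toy model. This is only
an illustration under stated assumptions, not the libxc code: every
toy_* name, the struct, and the map_reports_failure switch are
hypothetical stand-ins for the map call, populate_physmap and the
hypervisor-side check. It shows why a map call that returns non-NULL
for a missing page skips the repopulate step and ends in -EINVAL.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of these names exist in libxc or Xen. */
#define TOY_NR_GFNS 8

struct toy_domain {
    bool present[TOY_NR_GFNS];  /* is a page backing this gfn in the physmap? */
    bool map_reports_failure;   /* does the map helper report a missing page? */
    char backing[TOY_NR_GFNS];  /* dummy storage so we can hand out pointers */
};

/* Stand-in for the foreign-map call: NULL means "could not map". */
static void *toy_map(struct toy_domain *d, unsigned int gfn)
{
    assert(gfn < TOY_NR_GFNS);
    if (!d->present[gfn] && d->map_reports_failure)
        return NULL;            /* caller can see that the gfn is a hole */
    return &d->backing[gfn];    /* non-NULL even if the gfn is a hole */
}

/* Stand-in for populate_physmap: put a page back at gfn. */
static int toy_populate(struct toy_domain *d, unsigned int gfn)
{
    assert(gfn < TOY_NR_GFNS);
    d->present[gfn] = true;
    return 0;
}

/* Stand-in for the hypervisor check: the ring gfn must be present. */
static int toy_hv_enable(const struct toy_domain *d, unsigned int gfn)
{
    return d->present[gfn] ? 0 : -EINVAL;
}

/* A listener that takes the ring page out of the physmap and does not
 * put it back on exit. */
static void toy_listener_exit(struct toy_domain *d, unsigned int gfn)
{
    d->present[gfn] = false;
}

/* Map first; repopulate only if the map attempt reports a failure. */
static int toy_vm_event_enable(struct toy_domain *d, unsigned int gfn)
{
    void *ring = toy_map(d, gfn);

    if (ring == NULL) {
        if (toy_populate(d, gfn) != 0)
            return -1;
        ring = toy_map(d, gfn);
        if (ring == NULL)
            return -1;
    }
    return toy_hv_enable(d, gfn);
}
```

In this model, forcing toy_populate() before toy_hv_enable() makes the
second enable succeed again, which mirrors the workaround mentioned
above.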

Tamas


* Re: vm_event regression in 4.7
  2016-02-03  1:32   ` Tamas K Lengyel
@ 2016-02-03 10:35     ` Andrew Cooper
  2016-02-05 20:34       ` Tamas K Lengyel
  0 siblings, 1 reply; 6+ messages in thread
From: Andrew Cooper @ 2016-02-03 10:35 UTC (permalink / raw)
  To: Tamas K Lengyel; +Cc: Ian Campbell, Xen-devel



On 03/02/16 01:32, Tamas K Lengyel wrote:
>
>
> On Tue, Feb 2, 2016 at 6:00 PM, Andrew Cooper
> <andrew.cooper3@citrix.com <mailto:andrew.cooper3@citrix.com>> wrote:
>
>     On 03/02/2016 00:51, Tamas K Lengyel wrote:
>>     Hello all,
>>     with the latest master branch of Xen there is a regression
>>     enabling vm_event on a domain. If an event listener was
>>     previously active on the domain it is now not possible to
>>     reenable events as the domctl returns -EINVAL. The problem seems
>>     to stem from activating the magic page for vm_event using
>>     prepare_ring_for_helper as it returns NULL. Further looking into
>>     where things go wrong within that function it seems the page type
>>     returned by __get_gfn_type_access is p2m_ram_logdirty with an
>>     invalid mfn (0xffffffffffffffff) and then it hits "Error path:
>>     not a suitable GFN at all".
>>
>>     Can anyone point me to which change or what may be causing this?
>
>     Did the previous event listener replace the page it stole from
>     guest physmap for ring purposes when it exited?
>
>
> Ah, here is what seems to be the problem. Previously it was not
> required to do this during teardown. What we had was libxc would check
> if it can map the ring page with xc_map_foreign_pages, and it would
> repopulate the page if it failed before running xc_vm_event_enable.
> However, now it seems xc_map_foreign_pages returns non-NULL the second
> time around as well, even though the page is not in the physmap.

This is the bug then.  If there isn't a page in the physmap,
xc_map_foreign_pages() should indicate an error.

> If I force libxc to run populate_physmap then I can get vm_event to
> initialize properly again. So the change seems to relate somehow to the
> behavior of xc_map_foreign_pages.

This seems likely due to the splitting out of libxenforeignmem from
libxc, which included the merging of about four almost identical
map_foreign_$FOO() functions into one.  It is likely that there is a
subtle change in behaviour on an error path.
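
As a generic illustration of how merging near-identical wrappers can
change an error path, here is a minimal sketch. It is not the
libxenforeignmem or libxc code: all toy_* names are hypothetical, and
the assumption is simply that a shared primitive reports failures per
page while callers only test the returned pointer.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names throughout; this is not a real Xen interface. */
static char toy_area[4];

/* Shared primitive: records a per-page status in err[] and returns a
 * non-NULL base address even when some pages could not be mapped. */
static void *toy_map_bulk(const bool *present, size_t n, int *err)
{
    if (n > sizeof(toy_area))
        return NULL;
    for (size_t i = 0; i < n; i++)
        err[i] = present[i] ? 0 : -ENOENT;
    return toy_area;
}

/* Wrapper that folds any per-page error into a NULL return. */
static void *toy_map_strict(const bool *present, size_t n)
{
    int err[sizeof(toy_area)];
    void *p = toy_map_bulk(present, n, err);

    if (p == NULL)
        return NULL;
    for (size_t i = 0; i < n; i++)
        if (err[i] != 0)
            return NULL;
    return p;
}

/* Wrapper that never inspects err[]: a missing page still yields a
 * non-NULL pointer, so a caller testing only for NULL sees success. */
static void *toy_map_lax(const bool *present, size_t n)
{
    int err[sizeof(toy_area)];

    return toy_map_bulk(present, n, err);
}
```

A caller written against the strict wrapper keeps compiling unchanged
if it is switched to the lax one, but its "did the map fail?" test
silently stops firing.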

~Andrew


* Re: vm_event regression in 4.7
  2016-02-03 10:35     ` Andrew Cooper
@ 2016-02-05 20:34       ` Tamas K Lengyel
  2016-02-05 21:08         ` Tamas K Lengyel
  0 siblings, 1 reply; 6+ messages in thread
From: Tamas K Lengyel @ 2016-02-05 20:34 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Ian Campbell, Xen-devel



On Wed, Feb 3, 2016 at 3:35 AM, Andrew Cooper <andrew.cooper3@citrix.com>
wrote:

> On 03/02/16 01:32, Tamas K Lengyel wrote:
>
>
>
> On Tue, Feb 2, 2016 at 6:00 PM, Andrew Cooper <andrew.cooper3@citrix.com>
> wrote:
>
>> On 03/02/2016 00:51, Tamas K Lengyel wrote:
>>
>> Hello all,
>> with the latest master branch of Xen there is a regression enabling
>> vm_event on a domain. If an event listener was previously active on the
>> domain it is now not possible to reenable events as the domctl returns
>> -EINVAL. The problem seems to stem from activating the magic page for
>> vm_event using prepare_ring_for_helper as it returns NULL. Further looking
>> into where things go wrong within that function it seems the page type
>> returned by __get_gfn_type_access is p2m_ram_logdirty with an invalid mfn
>> (0xffffffffffffffff) and then it hits "Error path: not a suitable GFN at
>> all".
>>
>> Can anyone point me to which change or what may be causing this?
>>
>>
>> Did the previous event listener replace the page it stole from guest
>> physmap for ring purposes when it exited?
>>
>
> Ah, here is what seems to be the problem. Previously it was not required
> to do this during teardown. What we had was libxc would check if it can map
> the ring page with xc_map_foreign_pages, and it would repopulate the page
> if it failed before running xc_vm_event_enable. However, now it seems
> xc_map_foreign_pages returns non-NULL the second time around as well, even
> though the page is not in the physmap.
>
>
> This is the bug then.  If there isn't a page in the physmap,
> xc_map_foreign_pages() should indicate an error.
>
> If I force libxc to run populate_physmap then I can get vm_event to
> initialize properly again. So the change seems to relate somehow to the
> behavior of xc_map_foreign_pages.
>
>
> This seems likely due to the splitting out of libxenforeignmem from libxc,
> which included the merging of about four almost identical map_foreign_$FOO()
> functions into one.  It is likely that there is a subtle change in
> behaviour on an error path.
>

I've added a bunch of debug messages and it gets all the way down to
IOCTL_PRIVCMD_MMAPBATCH_V2 without an error in
tools/libs/foreignmemory/linux.c. That ioctl returns 0 too, so I'm not sure
where the error comes from. Compared to the flow in Xen 4.6 I don't really
see what changed.

Tamas


* Re: vm_event regression in 4.7
  2016-02-05 20:34       ` Tamas K Lengyel
@ 2016-02-05 21:08         ` Tamas K Lengyel
  0 siblings, 0 replies; 6+ messages in thread
From: Tamas K Lengyel @ 2016-02-05 21:08 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Ian Campbell, Xen-devel



On Fri, Feb 5, 2016 at 1:34 PM, Tamas K Lengyel <tamas.k.lengyel@gmail.com>
wrote:

>
>
>
> On Wed, Feb 3, 2016 at 3:35 AM, Andrew Cooper <andrew.cooper3@citrix.com>
> wrote:
>
>> On 03/02/16 01:32, Tamas K Lengyel wrote:
>>
>>
>>
>> On Tue, Feb 2, 2016 at 6:00 PM, Andrew Cooper <andrew.cooper3@citrix.com>
>> wrote:
>>
>>> On 03/02/2016 00:51, Tamas K Lengyel wrote:
>>>
>>> Hello all,
>>> with the latest master branch of Xen there is a regression enabling
>>> vm_event on a domain. If an event listener was previously active on the
>>> domain it is now not possible to reenable events as the domctl returns
>>> -EINVAL. The problem seems to stem from activating the magic page for
>>> vm_event using prepare_ring_for_helper as it returns NULL. Further looking
>>> into where things go wrong within that function it seems the page type
>>> returned by __get_gfn_type_access is p2m_ram_logdirty with an invalid mfn
>>> (0xffffffffffffffff) and then it hits "Error path: not a suitable GFN at
>>> all".
>>>
>>> Can anyone point me to which change or what may be causing this?
>>>
>>>
>>> Did the previous event listener replace the page it stole from guest
>>> physmap for ring purposes when it exited?
>>>
>>
>> Ah, here is what seems to be the problem. Previously it was not required
>> to do this during teardown. What we had was libxc would check if it can map
>> the ring page with xc_map_foreign_pages, and it would repopulate the page
>> if it failed before running xc_vm_event_enable. However, now it seems
>> xc_map_foreign_pages returns non-NULL the second time around as well, even
>> though the page is not in the physmap.
>>
>>
>> This is the bug then.  If there isn't a page in the physmap,
>> xc_map_foreign_pages() should indicate an error.
>>
>> If I force libxc to run populate_physmap then I can get vm_event to
>> initialize properly again. So the change seems to relate somehow to the
>> behavior of xc_map_foreign_pages.
>>
>>
>> This seems likely due to the splitting out of libxenforeignmem from
>> libxc, which included the merging of about four almost identical
>> map_foreign_$FOO() functions into one.  It is likely that there is a subtle
>> change in behaviour on an error path.
>>
>
> I've added a bunch of debug messages and it gets all the way down to
> IOCTL_PRIVCMD_MMAPBATCH_V2 without an error in
> tools/libs/foreignmemory/linux.c. That ioctl returns 0 too, so I'm not sure
> where the error comes from. Compared to the flow in Xen 4.6 I don't really
> see what changed.
>
>
Never mind, found it. The commit "b701ccc8 tools: Remove
xc_map_foreign_batch" caused the regression.

Tamas

