xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed
* [Xen-devel] UBSAN report in find_next_bit()
@ 2019-06-24 16:24 Andrew Cooper
  2019-06-25  9:38 ` Jan Beulich
  2019-06-25 10:12 ` Julien Grall
  0 siblings, 2 replies; 4+ messages in thread
From: Andrew Cooper @ 2019-06-24 16:24 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Wei Liu, George Dunlap, Tim Deegan,
	Julien Grall, Jan Beulich, Roger Pau Monne

Hello,

One UBSAN report on x86 is:

(XEN)
================================================================================
(XEN) UBSAN: Undefined behaviour in
/local/xen.git/xen/include/xen/nodemask.h:220:9
(XEN) shift exponent 64 is too large for 64-bit type 'long unsigned int'
(XEN) ----[ Xen-4.13-unstable  x86_64  debug=y   Not tainted ]----
(XEN) CPU:    0
(XEN) RIP:    e008:[<ffff82d0802fdea1>] ubsan.c#ubsan_epilogue+0xa/0xd2
(XEN) RFLAGS: 0000000000010082   CONTEXT: hypervisor
(XEN) rax: 0000000000000000   rbx: ffff82d080e6fa68   rcx: 0000000000000000
(XEN) rdx: ffff82d080e6ffd0   rsi: 000000000000000a   rdi: ffff82d080e6fa68
(XEN) rbp: ffff82d080e6fa00   rsp: ffff82d080e6f9f0   r8:  00000000ffffffff
(XEN) r9:  0000000000000000   r10: ffff82d080e6fa10   r11: ffff82d08084e287
(XEN) r12: 0000000000000040   r13: ffffffffffffffff   r14: ffff82d080c1a62e
(XEN) r15: ffff82d080c1a610   cr0: 000000008005003b   cr4: 00000000001526e0
(XEN) cr3: 000000003fc16000   cr2: 0000000000000000
(XEN) fsb: 0000000000000000   gsb: 0000000000000000   gss: 0000000000000000
(XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: 0000   cs: e008
(XEN) Xen code around <ffff82d0802fdea1> (ubsan.c#ubsan_epilogue+0xa/0xd2):
(XEN)  89 e5 41 54 53 48 89 fb <0f> 0b 48 8d 3d 1e 5b 53 00 e8 f0 29 00
00 48 85
(XEN) Xen stack trace from rsp=ffff82d080e6f9f0:
(XEN)    0000000000000000 0000000000000040 ffff82d080e6faa0 ffff82d0802fed4d
(XEN)    ffff82d08084ee5f 3434373634343831 3535393037333730 0000000035313631
(XEN)    ffff83003ee38000 0000000000000001 0000000000003436 ffff82d080e6faa0
(XEN)    ffff82d0803008ec 0000000000000018 ffff82d080e6fab0 0000000000000202
(XEN)    ffff82e000000100 0000000000000040 0000000000000000 ffff83003ee38000
(XEN)    0000000000000001 0000000000000000 ffff82d080e6fb90 ffff82d08026f835
(XEN)    ffff82d080f7b7d8 ffff82d080f7b7d8 ffff82d080f7b3d8 00007d2f7f084e28
(XEN)    00000000000002f8 0000000000000140 0000002100000000 0000000000000000
(XEN)    ffff83003ee38000 0000000100000028 0000000000000000 00000000000001f8
(XEN)    0000000000000200 0000000000000000 00ff01d000000001 ffff82d0802cd54d
(XEN)    ffff82d080f7b7d8 0000000000000200 ffff82d080c1a600 0000000000000001
(XEN)    ffff83003ee38000 0000003f00000000 ffffffffffffffff 0000000000000028
(XEN)    0000000000000001 ffff83003ee38000 0000000000000001 0000000000000000
(XEN)    ffff82d080e6fc40 ffff82d08027550b 0000000500000006 0000000000000080
(XEN)    0000000000000080 0000000000000001 0000000000000090 ffff83003ee38000
(XEN)    ffff82d080e6fc18 ffff82d000000021 0000000000000000 0000000000000000
(XEN)    0000000000000020 0000000000000000 ffff83003ee2b600 ffff83003ee380f0
(XEN)    0000000000000000 0000000000000028 0000000000000021 ffff83003ee38000
(XEN)    0000000000000000 0000000000000000 ffff82d080e6fc78 ffff82d080276a40
(XEN)    ffff83003ee38000 ffff820060001000 0000000000000001 0000000000000000
(XEN) Xen call trace:
(XEN)    [<ffff82d0802fdea1>] ubsan.c#ubsan_epilogue+0xa/0xd2
(XEN)    [<ffff82d0802fed4d>] __ubsan_handle_shift_out_of_bounds+0xd5/0x1cd
(XEN)    [<ffff82d08026f835>] page_alloc.c#get_free_buddy+0xd0c/0xd92
(XEN)    [<ffff82d08027550b>] page_alloc.c#alloc_heap_pages+0xda/0x151b
(XEN)    [<ffff82d080276a40>] alloc_domheap_pages+0xf4/0x223
(XEN)    [<ffff82d0803bb359>] create_perdomain_mapping+0xc9/0xc9a
(XEN)    [<ffff82d08037b516>] mapcache_domain_init+0xdc/0x15a
(XEN)    [<ffff82d08036f468>] arch_domain_create+0x744/0x801
(XEN)    [<ffff82d08021cf6c>] domain_create+0x8f6/0xe84
(XEN)    [<ffff82d080a54cdb>] __start_xen+0x5a0b/0x63e5
(XEN)    [<ffff82d0802000f3>] __high_start+0x53/0x55
(XEN)
(XEN)
================================================================================

There is a separate bug/misfeature with d->node_affinity being fully set
which is why we end up looping over all bits in a nodemask_t.  I'll fix
this separately, but the issue in question is:

> else if ( (node = next_node(node, nodemask)) >= MAX_NUMNODES )
>     node = first_node(nodemask);

On x86, MAX_NUMNODES is 64, and this part of get_free_buddy() loops over
nodes {0..63}.  next_node() expands to find_next_bit(..., node+1) which
passes offset == size on the final iteration.

find_next_bit() has an optimisation for bitmaps of 64 or fewer bits
which does:

> else if ( __builtin_constant_p(size) && s__ <= BITS_PER_LONG )
>     r__ = o__ + __scanbit(*(const unsigned long *)(a__) >> o__, s__);

UBSAN takes objection to the shift, which in this case is a shift by 64.

The code in __find_next_bit() makes it clear that offset == size is a
valid condition, which would suggest that the bug is with the optimisation.

However, this conclusion contradicts the views of c/s b20079da9 which
decided that offset == size is not a valid condition.

ARM64's find_next_bit() explicitly copes with offset >= size, and while
I don't speak ARM asm well enough to work out whether
_find_first_bit_le() copes with offset == size, the vgic.c code
definitely expects it to function in this way.

As a result, I think the reasoning in c/s b20079da9 is false, and that
change needs re-adjusting.  I also think that x86's optimisation for
size == 64 should be considered buggy and fixed.  TBH, I'm not sure the
optimisation is worthwhile having in the first place.

Thoughts,

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [Xen-devel] UBSAN report in find_next_bit()
  2019-06-24 16:24 [Xen-devel] UBSAN report in find_next_bit() Andrew Cooper
@ 2019-06-25  9:38 ` Jan Beulich
  2019-06-25 10:14   ` Julien Grall
  2019-06-25 10:12 ` Julien Grall
  1 sibling, 1 reply; 4+ messages in thread
From: Jan Beulich @ 2019-06-25  9:38 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Stefano Stabellini, WeiLiu, George Dunlap, Tim Deegan,
	Julien Grall, xen-devel, Roger Pau Monne

>>> On 24.06.19 at 18:24, <andrew.cooper3@citrix.com> wrote:
>> else if ( (node = next_node(node, nodemask)) >= MAX_NUMNODES )
>>     node = first_node(nodemask);
> 
> On x86, MAX_NUMNODES is 64, and this part of get_free_buddy() loops over
> nodes {0..63}.  next_node() expands to find_next_bit(..., node+1) which
> passes offset == size on the final iteration.
> 
> find_next_bit() has an optimisation for bitmaps of 64 or fewer bits
> which does:
> 
>> else if ( __builtin_constant_p(size) && s__ <= BITS_PER_LONG )
>>     r__ = o__ + __scanbit(*(const unsigned long *)(a__) >> o__, s__);
> 
> UBSAN takes objection to the shift, which in this case is a shift by 64.
> 
> The code in __find_next_bit() makes it clear that offset == size is a
> valid condition, which would suggest that the bug is with the optimisation.

Oh, in particular the ASSERT() there is indeed very clear.

> However, this conclusion contradicts the views of c/s b20079da9 which
> decided that offset == size is not a valid condition.

And that was based on how x86'es find_next{,_zero}_bit() as well
as ...

> ARM64's find_next_bit() explicitly copes with offset >= size, and while
> I don't speak ARM asm well enough to work out whether
> _find_first_bit_le() copes with offset == size, the vgic.c code
> definitely expects it to function in this way.

... Arm32's _find_next{,_zero}_bit_le. You've named the issue the x86
logic has. Arm32's, afaict, will read one byte past the array when offset
and size match and are a multiple of 8.

> As a result, I think the reasoning in c/s b20079da9 is false, and that
> change needs re-adjusting.  I also think that x86's optimisation for
> size == 64 should be considered buggy and fixed.  TBH, I'm not sure the
> optimisation is worthwhile having in the first place.

The question though is whether, alongside offset == size potentially
being meant to be valid, offset > size is to be treated like such, too.

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [Xen-devel] UBSAN report in find_next_bit()
  2019-06-24 16:24 [Xen-devel] UBSAN report in find_next_bit() Andrew Cooper
  2019-06-25  9:38 ` Jan Beulich
@ 2019-06-25 10:12 ` Julien Grall
  1 sibling, 0 replies; 4+ messages in thread
From: Julien Grall @ 2019-06-25 10:12 UTC (permalink / raw)
  To: Andrew Cooper, xen-devel
  Cc: Stefano Stabellini, Wei Liu, George Dunlap, Tim Deegan,
	Jan Beulich, Roger Pau Monne

Hi Andrew,

On 24/06/2019 17:24, Andrew Cooper wrote:
> ARM64's find_next_bit() explicitly copes with offset >= size, and while
> I don't speak ARM asm well enough to work out whether
> _find_first_bit_le() copes with offset == size, the vgic.c code
> definitely expects it to function in this way.

I looked at the instance of find_* in arch/arm/vgic.c. AFAICT, all of them will 
always be called with offset < size. Could you point which one you think will 
relies on offset == size?

Cheers,

-- 
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [Xen-devel] UBSAN report in find_next_bit()
  2019-06-25  9:38 ` Jan Beulich
@ 2019-06-25 10:14   ` Julien Grall
  0 siblings, 0 replies; 4+ messages in thread
From: Julien Grall @ 2019-06-25 10:14 UTC (permalink / raw)
  To: Jan Beulich, Andrew Cooper
  Cc: Stefano Stabellini, WeiLiu, George Dunlap, Tim Deegan, xen-devel,
	Roger Pau Monne

Hi Jan,

On 25/06/2019 10:38, Jan Beulich wrote:
>>>> On 24.06.19 at 18:24, <andrew.cooper3@citrix.com> wrote:
>> ARM64's find_next_bit() explicitly copes with offset >= size, and while
>> I don't speak ARM asm well enough to work out whether
>> _find_first_bit_le() copes with offset == size, the vgic.c code
>> definitely expects it to function in this way.
> 
> ... Arm32's _find_next{,_zero}_bit_le. You've named the issue the x86
> logic has. Arm32's, afaict, will read one byte past the array when offset
> and size match and are a multiple of 8.

It took me a bit to get my head around as the code is quite convoluted. But I 
agree with you here, arm32 find_* does not cope with offset == size.

Cheers,

-- 
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2019-06-25 10:14 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-06-24 16:24 [Xen-devel] UBSAN report in find_next_bit() Andrew Cooper
2019-06-25  9:38 ` Jan Beulich
2019-06-25 10:14   ` Julien Grall
2019-06-25 10:12 ` Julien Grall

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).