* [Qemu-devel] Call for testing QEMU aarch64-linux-user emulation
@ 2014-02-17 13:40 Alex Bennée
  2014-02-24 13:01 ` Janne Grunau
  2014-02-24 20:58 ` Dann Frazier
  0 siblings, 2 replies; 27+ messages in thread
From: Alex Bennée @ 2014-02-17 13:40 UTC (permalink / raw)
  To: linaro-dev, linaro-toolchain
  Cc: Peter Maydell, Michael Matz, Alexander Graf, qemu-devel,
	Wook Wookey, Christoffer Dall

Hi,

After a solid few months of work the QEMU master branch [1] has now reached
instruction feature parity with the suse-1.6 [6] tree that a lot of people
have been using to build various aarch64 binaries. In addition to the
SUSE work we have fixed numerous edge cases and finished off classes of
instructions. All instructions have been verified with Peter's RISU
random instruction testing tool. I have also built and run many
packages as well as built gcc and passed most of the aarch64 specific tests.

I've tested against the following aarch64 rootfs:
    * SUSE [2]
    * Debian [3]
    * Ubuntu Saucy [4]

In my tree the remaining insns that the GCC aarch64 tests need to
implement are:
    FRECPE
    FRECPX
    CLS (2 misc variant)
    CLZ (2 misc variant)
    FSQRT
    FRINTZ
    FCVTZS

Which I'm currently working through now. However for most build tasks I
expect the instructions in master [1] will be enough.

If you want the latest instructions working their way to mainline you
are free to use my tree [5] which currently has:

* Additional NEON/SIMD instructions
* sendmsg syscall
* Improved helper scripts for setting up binfmt_misc
* The ability to set QEMU_LOG_FILENAME to /path/to/something-%d.log
  - this is useful when tests are failing N-levels deep as %d is
    replaced with the pid
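
For illustration only, here is a minimal Python sketch of the %d
substitution described above. It is not taken from the QEMU source:
expand_log_filename is a hypothetical helper, and the template path is
just the placeholder from the list item.

```python
import os


def expand_log_filename(template: str, pid: int) -> str:
    """Return template with the first "%d" replaced by pid.

    Illustrative stand-in for the behaviour described in the mail,
    not QEMU's actual implementation.
    """
    return template.replace("%d", str(pid), 1)


if __name__ == "__main__":
    # Each emulated process would expand the same template with its own
    # pid, so nested test runs end up writing to separate files.
    print(expand_log_filename("/path/to/something-%d.log", os.getpid()))
```

With one file per pid, the log of a failing child can be read on its
own instead of being interleaved with its parents' output.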

Feedback I'm interested in
==========================

* Any instruction failure (please include the log line with the
  unsupported message)
* Any aarch64-specific failures (i.e. not generic QEMU threading flakiness).

If you need to catch me in real time I'm available on #qemu (stsquad)
and #linaro-virtualization (ajb-linaro).

Many thanks to the SUSE guys for getting the aarch64 train rolling. I
hope you're happy with the final result ;-)

Cheers,

--
Alex Bennée
QEMU/KVM Hacker for Linaro

[1] git://git.qemu.org/qemu.git master
[2] http://download.opensuse.org/ports/aarch64/distribution/13.1/appliances/openSUSE-13.1-ARM-JeOS.aarch64-rootfs.aarch64-1.12.1-Build32.1.tbz
[3] http://people.debian.org/~wookey/bootstrap/rootfs/debian-unstable-arm64.tar.gz
[4] http://people.debian.org/~wookey/bootstrap/rootfs/saucy-arm64.tar.gz
[5] https://github.com/stsquad/qemu/tree/ajb-a64-working
[6] https://github.com/susematz/qemu/tree/aarch64-1.6

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Qemu-devel] Call for testing QEMU aarch64-linux-user emulation
  2014-02-17 13:40 [Qemu-devel] Call for testing QEMU aarch64-linux-user emulation Alex Bennée
@ 2014-02-24 13:01 ` Janne Grunau
  2014-02-25 15:54   ` Alex Bennée
  2014-02-24 20:58 ` Dann Frazier
  1 sibling, 1 reply; 27+ messages in thread
From: Janne Grunau @ 2014-02-24 13:01 UTC (permalink / raw)
  To: Alex Bennée
  Cc: Peter Maydell, linaro-dev, Michael Matz, Alexander Graf,
	linaro-toolchain, qemu-devel, Wook Wookey, Christoffer Dall

Hi,

On 2014-02-17 13:40:00 +0000, Alex Bennée wrote:
> 
> After a solid few months of work the QEMU master branch [1] has now reached
> instruction feature parity with the suse-1.6 [6] tree that a lot of people
> have been using to build various aarch64 binaries. In addition to the
> SUSE work we have fixed numerous edge cases and finished off classes of
> instructions. All instructions have been verified with Peter's RISU
> random instruction testing tool. I have also built and run many
> packages as well as built gcc and passed most of the aarch64 specific tests.
> 
> I've tested against the following aarch64 rootfs:
>     * SUSE [2]
>     * Debian [3]
>     * Ubuntu Saucy [4]

I'm running Libav's test suite (https://fate.libav.org/?arch=aarch64&comment=qemu)
using a Gentoo crossdev sysroot.

> In my tree the remaining insns that the GCC aarch64 tests need to
> implement are:
>     FRECPE
>     FRECPX
>     CLS (2 misc variant)
>     CLZ (2 misc variant)
>     FSQRT
>     FRINTZ
>     FCVTZS
> 
> Which I'm currently working through now. However for most build tasks I
> expect the instructions in master [1] will be enough.

Qemu master is enough to pass the tests with libav built with gcc 4.8.2,
clang 3.3 and 3.4 (clang 3.4 build only with -O1, it fails otherwise).

> Feedback I'm interested in
> ==========================
> 
> * Any instruction failure (please include the log line with the
>   unsupported message)

Neon support is not complete enough to run the hand written neon
assembler optimizations in libav. Currently failing on narrowing shifts.

> * Any aarch64-specific failures (i.e. not generic QEMU threading flakiness).

Just as a note, qemu-arm64 from the suse tree didn't show any threading
issues with the libav test suite while qemu-aarch64 from master failed
with a probability of ~10% running the same binary.
 
> Many thanks to the SUSE guys for getting the aarch64 train rolling. I
> hope you're happy with the final result ;-)

I'm happy, qemu master runs the libav test suite around 30% faster.
Thanks to everyone involved.

Janne


* Re: [Qemu-devel] Call for testing QEMU aarch64-linux-user emulation
  2014-02-17 13:40 [Qemu-devel] Call for testing QEMU aarch64-linux-user emulation Alex Bennée
  2014-02-24 13:01 ` Janne Grunau
@ 2014-02-24 20:58 ` Dann Frazier
  2014-02-25  8:39   ` Alex Bennée
  1 sibling, 1 reply; 27+ messages in thread
From: Dann Frazier @ 2014-02-24 20:58 UTC (permalink / raw)
  To: Alex Bennée
  Cc: Peter Maydell, linaro-dev, Michael Matz, Alexander Graf,
	linaro-toolchain, qemu-devel, Wook Wookey, Christoffer Dall

On Mon, Feb 17, 2014 at 6:40 AM, Alex Bennée <alex.bennee@linaro.org> wrote:
> Hi,

Thanks to all involved for your work here!

> After a solid few months of work the QEMU master branch [1] has now reached
> instruction feature parity with the suse-1.6 [6] tree that a lot of people
> have been using to build various aarch64 binaries. In addition to the
> SUSE work we have fixed numerous edge cases and finished off classes of
> instructions. All instructions have been verified with Peter's RISU
> random instruction testing tool. I have also built and run many
> packages as well as built gcc and passed most of the aarch64 specific tests.
>
> I've tested against the following aarch64 rootfs:
>     * SUSE [2]
>     * Debian [3]
>     * Ubuntu Saucy [4]

fyi, I've been doing my testing with Ubuntu Trusty.

> In my tree the remaining insns that the GCC aarch64 tests need to
> implement are:
>     FRECPE
>     FRECPX
>     CLS (2 misc variant)
>     CLZ (2 misc variant)
>     FSQRT
>     FRINTZ
>     FCVTZS
>
> Which I'm currently working through now. However for most build tasks I
> expect the instructions in master [1] will be enough.
>
> If you want the latest instructions working their way to mainline you
> are free to use my tree [5] which currently has:
>
> * Additional NEON/SIMD instructions
> * sendmsg syscall
> * Improved helper scripts for setting up binfmt_misc
> * The ability to set QEMU_LOG_FILENAME to /path/to/something-%d.log
>   - this is useful when tests are failing N-levels deep as %d is
>     replaced with the pid
>
> Feedback I'm interested in
> ==========================
>
> * Any instruction failure (please include the log line with the
>   unsupported message)
> * Any aarch64-specific failures (i.e. not generic QEMU threading flakiness).

I'm not sure if this qualifies as generic QEMU threading flakiness or not. I've
found a couple of conditions that cause master to core dump fairly
reliably, while the aarch64-1.6 branch seems to consistently work
fine.

 1) dh_fixperms is a script that commonly runs at the end of a package build.
     It's basically doing a `find | xargs chmod`.
 2) debootstrap --second-stage
     This is used to configure an arm64 chroot that was built using
     debootstrap on a non-native host. It is basically invoking a bunch of
     shell scripts (postinst, etc). When it blows up, the stack consistently
     looks like this:

Core was generated by `/usr/bin/qemu-aarch64-static /bin/sh -e /debootstrap/debootstrap --second-stage'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x0000000060058e55 in memcpy (__len=8, __src=0x7fff62ae34e0, __dest=0x400082c330) at /usr/include/x86_64-linux-gnu/bits/string3.h:51
51  return __builtin___memcpy_chk (__dest, __src, __len, __bos0 (__dest));
(gdb) bt
#0  0x0000000060058e55 in memcpy (__len=8, __src=0x7fff62ae34e0, __dest=0x400082c330) at /usr/include/x86_64-linux-gnu/bits/string3.h:51
#1  stq_p (v=274886476624, ptr=0x400082c330) at /mnt/qemu.upstream/include/qemu/bswap.h:280
#2  stq_le_p (v=274886476624, ptr=0x400082c330) at /mnt/qemu.upstream/include/qemu/bswap.h:315
#3  target_setup_sigframe (set=0x7fff62ae3530, env=0x62d9c678, sf=0x400082b0d0) at /mnt/qemu.upstream/linux-user/signal.c:1167
#4  target_setup_frame (usig=usig@entry=17, ka=ka@entry=0x604ec1e0 <sigact_table+512>, info=info@entry=0x0, set=set@entry=0x7fff62ae3530, env=env@entry=0x62d9c678) at /mnt/qemu.upstream/linux-user/signal.c:1286
#5  0x0000000060059f46 in setup_frame (env=0x62d9c678, set=0x7fff62ae3530, ka=0x604ec1e0 <sigact_table+512>, sig=17) at /mnt/qemu.upstream/linux-user/signal.c:1322
#6  process_pending_signals (cpu_env=cpu_env@entry=0x62d9c678) at /mnt/qemu.upstream/linux-user/signal.c:5747
#7  0x0000000060056e60 in cpu_loop (env=env@entry=0x62d9c678) at /mnt/qemu.upstream/linux-user/main.c:1082
#8  0x0000000060005079 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at /mnt/qemu.upstream/linux-user/main.c:4374

There are some pretty large differences between these trees with
respect to signal syscalls - is that the likely culprit?

 -dann



> If you need to catch me in real time I'm available on #qemu (stsquad)
> and #linaro-virtualization (ajb-linaro).
>
> Many thanks to the SUSE guys for getting the aarch64 train rolling. I
> hope you're happy with the final result ;-)
>
> Cheers,
>
> --
> Alex Bennée
> QEMU/KVM Hacker for Linaro
>
> [1] git://git.qemu.org/qemu.git master
> [2] http://download.opensuse.org/ports/aarch64/distribution/13.1/appliances/openSUSE-13.1-ARM-JeOS.aarch64-rootfs.aarch64-1.12.1-Build32.1.tbz
> [3] http://people.debian.org/~wookey/bootstrap/rootfs/debian-unstable-arm64.tar.gz
> [4] http://people.debian.org/~wookey/bootstrap/rootfs/saucy-arm64.tar.gz
> [5] https://github.com/stsquad/qemu/tree/ajb-a64-working
> [6] https://github.com/susematz/qemu/tree/aarch64-1.6
>


* Re: [Qemu-devel] Call for testing QEMU aarch64-linux-user emulation
  2014-02-24 20:58 ` Dann Frazier
@ 2014-02-25  8:39   ` Alex Bennée
  2014-02-25  8:49     ` Andreas Färber
                       ` (2 more replies)
  0 siblings, 3 replies; 27+ messages in thread
From: Alex Bennée @ 2014-02-25  8:39 UTC (permalink / raw)
  To: Dann Frazier
  Cc: Peter Maydell, linaro-dev, Michael Matz, Alexander Graf,
	linaro-toolchain, qemu-devel, Wook Wookey, Alex Bennée,
	Christoffer Dall


Dann Frazier <dann.frazier@canonical.com> writes:

> On Mon, Feb 17, 2014 at 6:40 AM, Alex Bennée <alex.bennee@linaro.org> wrote:
>> Hi,
>
> Thanks to all involved for your work here!
>
>> After a solid few months of work the QEMU master branch [1] has now reached
>> instruction feature parity with the suse-1.6 [6] tree that a lot of people
>> have been using to build various aarch64 binaries. In addition to the
<snip>
>>
>> I've tested against the following aarch64 rootfs:
>>     * SUSE [2]
>>     * Debian [3]
>>     * Ubuntu Saucy [4]
>
> fyi, I've been doing my testing with Ubuntu Trusty.

Good stuff, I shall see if I can set one up. Is the package coverage
between trusty and saucy much different? I noticed for example I
couldn't find zile and various build-deps for llvm.

<snip>
>>
>> Feedback I'm interested in
>> ==========================
>>
>> * Any instruction failure (please include the log line with the
>>   unsupported message)
>> * Any aarch64-specific failures (i.e. not generic QEMU threading flakiness).
>
> I'm not sure if this qualifies as generic QEMU threading flakiness or not. I've
> found a couple of conditions that cause master to core dump fairly
> reliably, while the aarch64-1.6 branch seems to consistently work
> fine.
>
>  1) dh_fixperms is a script that commonly runs at the end of a package build.
>      It's basically doing a `find | xargs chmod`.
>  2) debootstrap --second-stage
>      This is used to configure an arm64 chroot that was built using
>      debootstrap on a non-native host. It is basically invoking a bunch of
>      shell scripts (postinst, etc). When it blows up, the stack consistently
>      looks like this:
>
> Core was generated by `/usr/bin/qemu-aarch64-static /bin/sh -e
> /debootstrap/debootstrap --second-stage'.
> Program terminated with signal SIGSEGV, Segmentation fault.
> #0  0x0000000060058e55 in memcpy (__len=8, __src=0x7fff62ae34e0,
> __dest=0x400082c330) at
> /usr/include/x86_64-linux-gnu/bits/string3.h:51
> 51  return __builtin___memcpy_chk (__dest, __src, __len, __bos0 (__dest));
> (gdb) bt
> #0  0x0000000060058e55 in memcpy (__len=8, __src=0x7fff62ae34e0,
> __dest=0x400082c330) at
> /usr/include/x86_64-linux-gnu/bits/string3.h:51
> #1  stq_p (v=274886476624, ptr=0x400082c330) at
> /mnt/qemu.upstream/include/qemu/bswap.h:280
> #2  stq_le_p (v=274886476624, ptr=0x400082c330) at
> /mnt/qemu.upstream/include/qemu/bswap.h:315
> #3  target_setup_sigframe (set=0x7fff62ae3530, env=0x62d9c678,
> sf=0x400082b0d0) at /mnt/qemu.upstream/linux-user/signal.c:1167
> #4  target_setup_frame (usig=usig@entry=17, ka=ka@entry=0x604ec1e0
> <sigact_table+512>, info=info@entry=0x0, set=set@entry=0x7fff62ae3530,
> env=env@entry=0x62d9c678)
>     at /mnt/qemu.upstream/linux-user/signal.c:1286
> #5  0x0000000060059f46 in setup_frame (env=0x62d9c678,
> set=0x7fff62ae3530, ka=0x604ec1e0 <sigact_table+512>, sig=17) at
> /mnt/qemu.upstream/linux-user/signal.c:1322
> #6  process_pending_signals (cpu_env=cpu_env@entry=0x62d9c678) at
> /mnt/qemu.upstream/linux-user/signal.c:5747
> #7  0x0000000060056e60 in cpu_loop (env=env@entry=0x62d9c678) at
> /mnt/qemu.upstream/linux-user/main.c:1082
> #8  0x0000000060005079 in main (argc=<optimized out>, argv=<optimized
> out>, envp=<optimized out>) at
> /mnt/qemu.upstream/linux-user/main.c:4374
>
> There are some pretty large differences between these trees with
> respect to signal syscalls - is that the likely culprit?

Quite likely. We explicitly concentrated on the aarch64-specific
instruction emulation leaving more generic patches to flow in from SUSE
as they matured.

I guess it's time to go through the remaining patches and see what's up-streamable.

Alex/Michael,

Are any of these patches in flight now?

Cheers,

--
Alex Bennée
QEMU/KVM Hacker for Linaro


* Re: [Qemu-devel] Call for testing QEMU aarch64-linux-user emulation
  2014-02-25  8:39   ` Alex Bennée
@ 2014-02-25  8:49     ` Andreas Färber
  2014-02-25 13:33       ` Michael Matz
  2014-02-26 22:06     ` Dann Frazier
  2014-03-09 23:37     ` Dann Frazier
  2 siblings, 1 reply; 27+ messages in thread
From: Andreas Färber @ 2014-02-25  8:49 UTC (permalink / raw)
  To: Alex Bennée
  Cc: Peter Maydell, linaro-dev, Dann Frazier, Michael Matz,
	Alexander Graf, linaro-toolchain, qemu-devel, Wook Wookey,
	Christoffer Dall

Am 25.02.2014 09:39, schrieb Alex Bennée:
> 
> Dann Frazier <dann.frazier@canonical.com> writes:
> 
>> On Mon, Feb 17, 2014 at 6:40 AM, Alex Bennée <alex.bennee@linaro.org> wrote:
>>> Hi,
>>
>> Thanks to all involved for your work here!
>>
>>> After a solid few months of work the QEMU master branch [1] has now reached
>>> instruction feature parity with the suse-1.6 [6] tree that a lot of people
>>> have been using to build various aarch64 binaries. In addition to the
> <snip>
>>>
>>> I've tested against the following aarch64 rootfs:
>>>     * SUSE [2]
>>>     * Debian [3]
>>>     * Ubuntu Saucy [4]
>>
>> fyi, I've been doing my testing with Ubuntu Trusty.
> 
> Good stuff, I shall see if I can set one up. Is the package coverage
> between trusty and saucy much different? I noticed for example I
> couldn't find zile and various build-deps for llvm.
> 
> <snip>
>>>
>>> Feedback I'm interested in
>>> ==========================
>>>
>>> * Any instruction failure (please include the log line with the
>>>   unsupported message)
>>> * Any aarch64-specific failures (i.e. not generic QEMU threading flakiness).
>>
>> I'm not sure if this qualifies as generic QEMU threading flakiness or not. I've
>> found a couple of conditions that cause master to core dump fairly
>> reliably, while the aarch64-1.6 branch seems to consistently work
>> fine.
>>
>>  1) dh_fixperms is a script that commonly runs at the end of a package build.
>>      It's basically doing a `find | xargs chmod`.
>>  2) debootstrap --second-stage
>>      This is used to configure an arm64 chroot that was built using
>>      debootstrap on a non-native host. It is basically invoking a bunch of
>>      shell scripts (postinst, etc). When it blows up, the stack consistently
>>      looks like this:
>>
>> Core was generated by `/usr/bin/qemu-aarch64-static /bin/sh -e
>> /debootstrap/debootstrap --second-stage'.
>> Program terminated with signal SIGSEGV, Segmentation fault.
>> #0  0x0000000060058e55 in memcpy (__len=8, __src=0x7fff62ae34e0,
>> __dest=0x400082c330) at
>> /usr/include/x86_64-linux-gnu/bits/string3.h:51
>> 51  return __builtin___memcpy_chk (__dest, __src, __len, __bos0 (__dest));
>> (gdb) bt
>> #0  0x0000000060058e55 in memcpy (__len=8, __src=0x7fff62ae34e0,
>> __dest=0x400082c330) at
>> /usr/include/x86_64-linux-gnu/bits/string3.h:51
>> #1  stq_p (v=274886476624, ptr=0x400082c330) at
>> /mnt/qemu.upstream/include/qemu/bswap.h:280
>> #2  stq_le_p (v=274886476624, ptr=0x400082c330) at
>> /mnt/qemu.upstream/include/qemu/bswap.h:315
>> #3  target_setup_sigframe (set=0x7fff62ae3530, env=0x62d9c678,
>> sf=0x400082b0d0) at /mnt/qemu.upstream/linux-user/signal.c:1167
>> #4  target_setup_frame (usig=usig@entry=17, ka=ka@entry=0x604ec1e0
>> <sigact_table+512>, info=info@entry=0x0, set=set@entry=0x7fff62ae3530,
>> env=env@entry=0x62d9c678)
>>     at /mnt/qemu.upstream/linux-user/signal.c:1286
>> #5  0x0000000060059f46 in setup_frame (env=0x62d9c678,
>> set=0x7fff62ae3530, ka=0x604ec1e0 <sigact_table+512>, sig=17) at
>> /mnt/qemu.upstream/linux-user/signal.c:1322
>> #6  process_pending_signals (cpu_env=cpu_env@entry=0x62d9c678) at
>> /mnt/qemu.upstream/linux-user/signal.c:5747
>> #7  0x0000000060056e60 in cpu_loop (env=env@entry=0x62d9c678) at
>> /mnt/qemu.upstream/linux-user/main.c:1082
>> #8  0x0000000060005079 in main (argc=<optimized out>, argv=<optimized
>> out>, envp=<optimized out>) at
>> /mnt/qemu.upstream/linux-user/main.c:4374
>>
>> There are some pretty large differences between these trees with
>> respect to signal syscalls - is that the likely culprit?
> 
> Quite likely. We explicitly concentrated on the aarch64-specific
> instruction emulation leaving more generic patches to flow in from SUSE
> as they matured.
> 
> I guess it's time to go through the remaining patches and see what's up-streamable.
> 
> Alex/Michael,
> 
> Are any of these patches in flight now?

I don't think so, Alex seems to hate cleaning that stuff up... :P

Compare https://github.com/openSUSE/qemu/commits/opensuse-1.7 for our
general queue. We have patches adding locking to TCG, and there's a hack
pinning the CPU somewhere.

Regards,
Andreas

-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg


* Re: [Qemu-devel] Call for testing QEMU aarch64-linux-user emulation
  2014-02-25  8:49     ` Andreas Färber
@ 2014-02-25 13:33       ` Michael Matz
  2014-02-25 13:46         ` Peter Maydell
  0 siblings, 1 reply; 27+ messages in thread
From: Michael Matz @ 2014-02-25 13:33 UTC (permalink / raw)
  To: Andreas Färber
  Cc: Peter Maydell, linaro-dev, Dann Frazier, Alexander Graf,
	linaro-toolchain, qemu-devel, Wook Wookey, Alex Bennée,
	Christoffer Dall

Hi,

On Tue, 25 Feb 2014, Andreas Färber wrote:

> >> There are some pretty large differences between these trees with 
> >> respect to signal syscalls - is that the likely culprit?
> > 
> > Quite likely. We explicitly concentrated on the aarch64-specific
> > instruction emulation leaving more generic patches to flow in from 
> > SUSE as they matured.
> > 
> > I guess it's time to go through the remaining patches and see what's 
> > up-streamable.
> > 
> > Alex/Michael,
> > 
> > Are any of these patches in flight now?
> 
> I don't think so, Alex seems to hate cleaning that stuff up... :P
> 
> Compare https://github.com/openSUSE/qemu/commits/opensuse-1.7 for our
> general queue. We have patches adding locking to TCG, and there's a hack
> pinning the CPU somewhere.

The locking and pinning is all wrong (resp. overbroad).  The aarch64-1.6 
branch contains better implementations for that and some actual fixes for 
aarch64 userspace.

Somehow I don't find the time to go through our patches in linux-user and 
submit them.  The biggest road-block is that signal vs syscall handling is 
fundamentally broken in linux-user and it's unfixable without 
assembler implementations of the syscall caller.  That is also still 
broken on the suse branch where I tried various ways to fix that until 
coming to that conclusion.


Ciao,
Michael.


* Re: [Qemu-devel] Call for testing QEMU aarch64-linux-user emulation
  2014-02-25 13:33       ` Michael Matz
@ 2014-02-25 13:46         ` Peter Maydell
  2014-02-25 14:56           ` Michael Matz
  0 siblings, 1 reply; 27+ messages in thread
From: Peter Maydell @ 2014-02-25 13:46 UTC (permalink / raw)
  To: Michael Matz
  Cc: linaro-dev, Dann Frazier, Alexander Graf, linaro-toolchain,
	qemu-devel, Wook Wookey, Alex Bennée, Andreas Färber,
	Christoffer Dall

On 25 February 2014 13:33, Michael Matz <matz@suse.de> wrote:
> The biggest road-block is that signal vs syscall handling is
> fundamentally broken in linux-user and it's unfixable without
> assembler implementations of the syscall caller.

I'm not entirely sure it's possible to fix even with
hand-rolled assembly, to be honest.

However there are a bunch of bugfixes in your tree
which it would be really nice to see upstreamed:
the sendmmsg patch, for instance. We can at least
get the aarch64 support to the same level as the
32 bit arm linux-user setup, which is genuinely
useful to people despite the well known races and
locking issues.

thanks
-- PMM


* Re: [Qemu-devel] Call for testing QEMU aarch64-linux-user emulation
  2014-02-25 13:46         ` Peter Maydell
@ 2014-02-25 14:56           ` Michael Matz
  2014-02-28 14:12             ` Alex Bennée
  0 siblings, 1 reply; 27+ messages in thread
From: Michael Matz @ 2014-02-25 14:56 UTC (permalink / raw)
  To: Peter Maydell
  Cc: linaro-dev, Dann Frazier, Alexander Graf, linaro-toolchain,
	qemu-devel, Wook Wookey, Alex Bennée, Andreas Färber,
	Christoffer Dall

Hi,

On Tue, 25 Feb 2014, Peter Maydell wrote:

> On 25 February 2014 13:33, Michael Matz <matz@suse.de> wrote:
> > The biggest road-block is that signal vs syscall handling is
> > fundamentally broken in linux-user and it's unfixable without
> > assembler implementations of the syscall caller.
> 
> I'm not entirely sure it's possible to fix even with
> hand-rolled assembly, to be honest.

I am fairly sure.  The problem is "simply" to detect whether the signal
arrived while inside the kernel (doing the syscall's job) or while still, or
already, outside it. This structure helps with that:

before:
    setup args and stuff for syscall to do
atsys:
    syscall insn (single insn!)
after:
    mov return, return-register-per-psABI
realafter:
    rest of stuff

When a signal arrives you look at the return address the kernel puts into 
the siginfo.  Several cases:

* before <= retaddr < atsys:
  syscall hasn't yet started, so break syscall sequence, handle signal in 
  main loop, redo the syscall.
* atsys == retaddr
  syscall has started and the kernel wants to restart it after sighandler
  returns, _or_ syscall was just about to be started.  No matter what,
  the right thing to do is to actually do the syscall (again) after 
  handling the signal.  So break syscall sequence, handle signal in main
  loop, (re)do the syscall.
* after <= retaddr < realafter:
  syscall is complete but return value not yet in some variable but still 
  in register (or other post-syscall work that still needs doing isn't
  complete yet); nothing interesting to do, just let it continue with the 
  syscall sequence, handle signal in main loop after that one returned.
* retaddr any other value:
  uninteresting; actually I'm not sure we'd need the distinction between 
  after and realafter.  Handle signal as usual in main loop.

The important thing for qemu is to know precisely if the signal arrived 
before the syscall was started (or is to be restarted), or after it 
returned, and for that the compiler must not be allowed to insert any code 
between atsys and after.
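
The case analysis above can be sketched as a small decision function.
This is only an illustration in Python with made-up addresses, not code
from either tree: classify_signal and the Action names are hypothetical,
and a real implementation would compare the interrupted address against
labels exported by the assembler stub.

```python
from enum import Enum


class Action(Enum):
    RESTART_SYSCALL = "break the sequence, handle the signal, (re)do the syscall"
    FINISH_SEQUENCE = "let the sequence finish, then handle the signal"
    HANDLE_AS_USUAL = "handle the signal in the main loop as usual"


def classify_signal(retaddr, before, atsys, after, realafter):
    """Map the interrupted address onto the cases listed above."""
    if before <= retaddr <= atsys:
        # Syscall not yet entered, or we sit exactly on the syscall insn
        # (about to start or to be restarted): both get the same treatment.
        return Action.RESTART_SYSCALL
    if after <= retaddr < realafter:
        # Syscall done, result still being moved out of the register.
        return Action.FINISH_SEQUENCE
    return Action.HANDLE_AS_USUAL


if __name__ == "__main__":
    # Hypothetical layout of the stub; the addresses are invented.
    labels = dict(before=0x1000, atsys=0x1010, after=0x1012, realafter=0x1018)
    for addr in (0x1004, 0x1010, 0x1012, 0x2000):
        print(hex(addr), classify_signal(addr, **labels).name)
```

Because atsys is a single instruction, no valid return address can fall
strictly between atsys and after, which is why the sketch needs no case
for that range.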

> However there are a bunch of bugfixes in your tree
> which it would be really nice to see upstreamed:
> the sendmmsg patch, for instance. We can at least
> get the aarch64 support to the same level as the
> 32 bit arm linux-user setup, which is genuinely
> useful to people despite the well known races and
> locking issues.

Yeah.


Ciao,
Michael.


* Re: [Qemu-devel] Call for testing QEMU aarch64-linux-user emulation
  2014-02-24 13:01 ` Janne Grunau
@ 2014-02-25 15:54   ` Alex Bennée
  2014-02-25 17:11     ` Janne Grunau
  0 siblings, 1 reply; 27+ messages in thread
From: Alex Bennée @ 2014-02-25 15:54 UTC (permalink / raw)
  To: Janne Grunau
  Cc: Peter Maydell, linaro-dev, Michael Matz, Alexander Graf,
	linaro-toolchain, qemu-devel, Wook Wookey, Christoffer Dall


Janne Grunau <j@jannau.net> writes:

> Hi,
>
> On 2014-02-17 13:40:00 +0000, Alex Bennée wrote:
>> 
<snip>
>
>> In my tree the remaining insns that the GCC aarch64 tests need to
>> implement are:
>>     FRECPE
>>     FRECPX
>>     CLS (2 misc variant)
>>     CLZ (2 misc variant)
>>     FSQRT

My GitHub tree now has fixes for the above as well as all pending A64
pull requests. The commits are:

34bbde5 * target-arm: A64: add remaining CLS/Z vector ops
96d890c * target-arm: A64: add FSQRT to C3.6.17 (two misc)
997e712 * target-arm: A64: fix bug in add_sub_ext an handling rn
c37ba93 * target-arm: A64: Add last AdvSIMD Integer to FP ops
1e35ff3 * target-arm: A64: Implement AdvSIMD reciprocal ops
46ec7d9 * peter/a64-working target-arm: A64: Implement PMULL instruction

>>     FRINTZ
>>     FCVTZS
>> 
>> Which I'm currently working through now. However for most build tasks I
>> expect the instructions in master [1] will be enough.
>
> Qemu master is enough to pass the tests with libav built with gcc 4.8.2,
> clang 3.3 and 3.4 (clang 3.4 build only with -O1, it fails otherwise).
>
>> Feedback I'm interested in
>> ==========================
>> 
>> * Any instruction failure (please include the log line with the
>>   unsupported message)
>
> Neon support is not complete enough to run the hand written neon
> assembler optimizations in libav. Currently failing on narrowing shifts.
<snip>

Have you got the log file "unsupported" line? I seem to recall you did
ping me but maybe it was just on IRC? I just want to make sure I
do the right ones. I'm working on this now.


Cheers,

--
Alex Bennée
QEMU/KVM Hacker for Linaro


* Re: [Qemu-devel] Call for testing QEMU aarch64-linux-user emulation
  2014-02-25 15:54   ` Alex Bennée
@ 2014-02-25 17:11     ` Janne Grunau
  2014-03-06 11:40       ` Alex Bennée
  0 siblings, 1 reply; 27+ messages in thread
From: Janne Grunau @ 2014-02-25 17:11 UTC (permalink / raw)
  To: Alex Bennée
  Cc: Peter Maydell, linaro-dev, Michael Matz, Alexander Graf,
	linaro-toolchain, qemu-devel, Wook Wookey, Christoffer Dall

On 2014-02-25 15:54:37 +0000, Alex Bennée wrote:
>
> >> Feedback I'm interested in
> >> ==========================
> >>
> >> * Any instruction failure (please include the log line with the
> >>   unsupported message)
> >
> > Neon support is not complete enough to run the hand written neon
> > assembler optimizations in libav. Currently failing on narrowing shifts.
> <snip>
>
> Have you got the log file "unsupported" line? I seem to recall you did
> ping me but maybe it was just on IRC? I just want to make sure I
> do the right ones. I'm working on this now.

We spoke on IRC about it. A quick test commenting unsupported
instructions out revealed that rshrn/2, sqrshrun and shrn/2 are
the only NEON instructions used in libav still missing support
in qemu master. Unsupported lines from qemu master 0459650d94d1
below.

target-arm/translate-a64.c:6884: unsupported instruction encoding 0x0f0a8e10 at pc=00000000008632c8
target-arm/translate-a64.c:6884: unsupported instruction encoding 0x2f0b8f9c at pc=0000000000865764
target-arm/translate-a64.c:6884: unsupported instruction encoding 0x0f0a8610 at pc=0000000000863afc
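
For anyone collecting such lines from a larger log, a minimal Python
sketch for pulling out the encodings might look like the following. The
regular expression is written against the three lines above only, and
parse_unsupported and unique_encodings are hypothetical helper names,
not part of QEMU.

```python
import re

# Matches lines of the form shown above:
#   <file>:<line>: unsupported instruction encoding 0x... at pc=...
UNSUPPORTED_RE = re.compile(
    r"^(?P<src>[^:\s]+):(?P<line>\d+): unsupported instruction encoding "
    r"(?P<encoding>0x[0-9a-fA-F]+) at pc=(?P<pc>[0-9a-fA-F]+)$"
)


def parse_unsupported(line):
    """Return (encoding, pc) as integers, or None if the line does not match."""
    m = UNSUPPORTED_RE.match(line.strip())
    if m is None:
        return None
    return int(m.group("encoding"), 16), int(m.group("pc"), 16)


def unique_encodings(lines):
    """Return each distinct encoding once, in order of first appearance."""
    seen = []
    for line in lines:
        parsed = parse_unsupported(line)
        if parsed is not None and parsed[0] not in seen:
            seen.append(parsed[0])
    return seen


if __name__ == "__main__":
    sample = [
        "target-arm/translate-a64.c:6884: unsupported instruction encoding 0x0f0a8e10 at pc=00000000008632c8",
        "target-arm/translate-a64.c:6884: unsupported instruction encoding 0x2f0b8f9c at pc=0000000000865764",
        "target-arm/translate-a64.c:6884: unsupported instruction encoding 0x0f0a8610 at pc=0000000000863afc",
    ]
    for enc in unique_encodings(sample):
        print(f"{enc:#010x}")
```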

Janne


* Re: [Qemu-devel] Call for testing QEMU aarch64-linux-user emulation
  2014-02-25  8:39   ` Alex Bennée
  2014-02-25  8:49     ` Andreas Färber
@ 2014-02-26 22:06     ` Dann Frazier
  2014-02-27 13:20       ` Michael Matz
  2014-03-09 23:37     ` Dann Frazier
  2 siblings, 1 reply; 27+ messages in thread
From: Dann Frazier @ 2014-02-26 22:06 UTC (permalink / raw)
  To: Alex Bennée
  Cc: Peter Maydell, linaro-dev, Michael Matz, Alexander Graf,
	linaro-toolchain, qemu-devel, Wook Wookey, Christoffer Dall

On Tue, Feb 25, 2014 at 1:39 AM, Alex Bennée <alex.bennee@linaro.org> wrote:
>
> Dann Frazier <dann.frazier@canonical.com> writes:
>
>> On Mon, Feb 17, 2014 at 6:40 AM, Alex Bennée <alex.bennee@linaro.org> wrote:
>>> Hi,
>>
>> Thanks to all involved for your work here!
>>
>>> After a solid few months of work the QEMU master branch [1] has now reached
>>> instruction feature parity with the suse-1.6 [6] tree that a lot of people
>>> have been using to build various aarch64 binaries. In addition to the
> <snip>
>>>
>>> I've tested against the following aarch64 rootfs:
>>>     * SUSE [2]
>>>     * Debian [3]
>>>     * Ubuntu Saucy [4]
>>
>> fyi, I've been doing my testing with Ubuntu Trusty.
>
> Good stuff, I shall see if I can set one up. Is the package coverage
> between trusty and saucy much different? I noticed for example I
> couldn't find zile and various build-deps for llvm.
>
> <snip>
>>>
>>> Feedback I'm interested in
>>> ==========================
>>>
>>> * Any instruction failure (please include the log line with the
>>>   unsupported message)
>>> * Any aarch64-specific failures (i.e. not generic QEMU threading flakiness).
>>
>> I'm not sure if this qualifies as generic QEMU threading flakiness or not. I've
>> found a couple of conditions that cause master to core dump fairly
>> reliably, while the aarch64-1.6 branch seems to consistently work
>> fine.
>>
>>  1) dh_fixperms is a script that commonly runs at the end of a package build.
>>      It's basically doing a `find | xargs chmod`.
>>  2) debootstrap --second-stage
>>      This is used to configure an arm64 chroot that was built using
>>      debootstrap on a non-native host. It is basically invoking a bunch of
>>      shell scripts (postinst, etc). When it blows up, the stack consistently
>>      looks like this:

I've narrowed down the changes that seem to prevent both types of
segfaults to the following changes that introduce a wrapper around
sigprocmask:

https://github.com/susematz/qemu/commit/f1542ae9fe10d5a241fc2624ecaef5f0948e3472
https://github.com/susematz/qemu/commit/4e5e1607758841c760cda4652b0ee7a6bc6eb79d
https://github.com/susematz/qemu/commit/63eb8d3ea58f58d5857153b0c632def1bbd05781

I'm not sure if this is a real fix or just papering over my issue -
but either way, are these changes reasonable for upstream submission?

 -dann


>> Core was generated by `/usr/bin/qemu-aarch64-static /bin/sh -e
>> /debootstrap/debootstrap --second-stage'.
>> Program terminated with signal SIGSEGV, Segmentation fault.
>> #0  0x0000000060058e55 in memcpy (__len=8, __src=0x7fff62ae34e0,
>> __dest=0x400082c330) at
>> /usr/include/x86_64-linux-gnu/bits/string3.h:51
>> 51  return __builtin___memcpy_chk (__dest, __src, __len, __bos0 (__dest));
>> (gdb) bt
>> #0  0x0000000060058e55 in memcpy (__len=8, __src=0x7fff62ae34e0,
>> __dest=0x400082c330) at
>> /usr/include/x86_64-linux-gnu/bits/string3.h:51
>> #1  stq_p (v=274886476624, ptr=0x400082c330) at
>> /mnt/qemu.upstream/include/qemu/bswap.h:280
>> #2  stq_le_p (v=274886476624, ptr=0x400082c330) at
>> /mnt/qemu.upstream/include/qemu/bswap.h:315
>> #3  target_setup_sigframe (set=0x7fff62ae3530, env=0x62d9c678,
>> sf=0x400082b0d0) at /mnt/qemu.upstream/linux-user/signal.c:1167
>> #4  target_setup_frame (usig=usig@entry=17, ka=ka@entry=0x604ec1e0
>> <sigact_table+512>, info=info@entry=0x0, set=set@entry=0x7fff62ae3530,
>> env=env@entry=0x62d9c678)
>>     at /mnt/qemu.upstream/linux-user/signal.c:1286
>> #5  0x0000000060059f46 in setup_frame (env=0x62d9c678,
>> set=0x7fff62ae3530, ka=0x604ec1e0 <sigact_table+512>, sig=17) at
>> /mnt/qemu.upstream/linux-user/signal.c:1322
>> #6  process_pending_signals (cpu_env=cpu_env@entry=0x62d9c678) at
>> /mnt/qemu.upstream/linux-user/signal.c:5747
>> #7  0x0000000060056e60 in cpu_loop (env=env@entry=0x62d9c678) at
>> /mnt/qemu.upstream/linux-user/main.c:1082
>> #8  0x0000000060005079 in main (argc=<optimized out>, argv=<optimized
>> out>, envp=<optimized out>) at
>> /mnt/qemu.upstream/linux-user/main.c:4374
>>
>> There are some pretty large differences between these trees with
>> respect to signal syscalls - is that the likely culprit?
>
> Quite likely. We explicitly concentrated on the aarch64-specific
> instruction emulation leaving more generic patches to flow in from SUSE
> as they matured.
>
> I guess it's time to go through the remaining patches and see what's up-streamable.
>
> Alex/Michael,
>
> Are any of these patches in flight now?
>
> Cheers,
>
> --
> Alex Bennée
> QEMU/KVM Hacker for Linaro
>
>

* Re: [Qemu-devel] Call for testing QEMU aarch64-linux-user emulation
  2014-02-26 22:06     ` Dann Frazier
@ 2014-02-27 13:20       ` Michael Matz
  2014-02-27 19:47         ` Dann Frazier
  2014-03-14 14:20         ` Peter Maydell
  0 siblings, 2 replies; 27+ messages in thread
From: Michael Matz @ 2014-02-27 13:20 UTC (permalink / raw)
  To: Dann Frazier
  Cc: Peter Maydell, linaro-dev, qemu-devel, linaro-toolchain,
	Alexander Graf, Wook Wookey, Alex Bennée, Christoffer Dall

Hi,

On Wed, 26 Feb 2014, Dann Frazier wrote:

> I've narrowed down the changes that seem to prevent both types of
> segfaults to the following changes that introduce a wrapper around
> sigprocmask:
> 
> https://github.com/susematz/qemu/commit/f1542ae9fe10d5a241fc2624ecaef5f0948e3472
> https://github.com/susematz/qemu/commit/4e5e1607758841c760cda4652b0ee7a6bc6eb79d
> https://github.com/susematz/qemu/commit/63eb8d3ea58f58d5857153b0c632def1bbd05781
> 
> I'm not sure if this is a real fix or just papering over my issue -

It's fixing the issue, but strictly speaking it introduces a QoI (quality of
implementation) problem.  SIGSEGV must not be controllable by the guest; it
needs to be always deliverable to qemu.  That is what's fixed.

The QoI problem introduced is that with the implementation as is, the 
fiddling with SIGSEGV is detectable by the guest.  E.g. if it installs a 
segv handler, blocks segv, then forces a segfault, checks that it didn't 
arrive, then unblocks segv and checks that it now arrives, such testcase 
would be able to detect that in fact it couldn't block SIGSEGV.

Luckily for us, the effect of blocking SIGSEGV and then generating one in
other ways than kill/sigqueue/raise (e.g. by writing to NULL) is
undefined, so in practice that QoI issue doesn't matter.

To fix also that latter part it'd need a further per-thread flag 
segv_blocked_p which needs to be checked before actually delivering a 
guest-directed SIGSEGV (in comparison to a qemu-directed SEGV), and 
otherwise requeue it.  That's made a bit complicated when the SEGV was 
process-directed (not thread-directed) because in that case it needs to be 
delivered as long as there's _any_ thread which has it unblocked.  So 
given the above undefinedness for sane uses of SEGVs it didn't seem worth 
the complication of having an undetectable virtualization of SIGSEGV.
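
To illustrate the well-defined half of that distinction (a blocked signal
generated via kill/raise simply stays pending until it is unblocked), here
is a minimal sketch. It is not QEMU code: the function name is hypothetical
and SIGUSR1 is used as a stand-in for SIGSEGV so the demo stays harmless.

```python
import signal
import threading


def blocked_signal_demo(sig=signal.SIGUSR1):
    """Block `sig`, send it to this thread, then unblock it.

    Returns (was_pending, runs_while_blocked, runs_after_unblock).
    Hypothetical helper for illustration; SIGUSR1 stands in for SIGSEGV.
    """
    runs = []
    previous = signal.signal(sig, lambda signum, frame: runs.append(signum))
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, {sig})
    try:
        # Thread-directed send: the kill/raise style of generation, whose
        # behaviour while blocked is well defined.
        signal.pthread_kill(threading.get_ident(), sig)
        was_pending = sig in signal.sigpending()
        runs_while_blocked = len(runs)
        # Unblocking delivers the pending signal; CPython runs the
        # Python-level handler before pthread_sigmask() returns.
        signal.pthread_sigmask(signal.SIG_UNBLOCK, {sig})
        runs_after_unblock = len(runs)
    finally:
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)
        signal.signal(sig, previous if previous is not None else signal.SIG_DFL)
    return was_pending, runs_while_blocked, runs_after_unblock
```

A guest test case of the kind described above would check exactly this
sequence (pending while blocked, handler not run, handler run once after
unblocking); it must be called from the main thread of a POSIX process.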

> but either way, are these changes reasonable for upstream submission?

IIRC the first two commits (from Alex Barcelo) were submitted once in the 
past, but fell through the cracks.


Ciao,
Michael.

* Re: [Qemu-devel] Call for testing QEMU aarch64-linux-user emulation
  2014-02-27 13:20       ` Michael Matz
@ 2014-02-27 19:47         ` Dann Frazier
  2014-03-14 14:20         ` Peter Maydell
  1 sibling, 0 replies; 27+ messages in thread
From: Dann Frazier @ 2014-02-27 19:47 UTC (permalink / raw)
  To: Michael Matz
  Cc: Peter Maydell, linaro-dev, Alex Barcelo, qemu-devel,
	linaro-toolchain, Alexander Graf, Wook Wookey, Alex Bennée,
	Christoffer Dall

[Adding Alex Barcelo to the CC]

On Thu, Feb 27, 2014 at 6:20 AM, Michael Matz <matz@suse.de> wrote:
> Hi,
>
> On Wed, 26 Feb 2014, Dann Frazier wrote:
>
>> I've narrowed down the changes that seem to prevent both types of
>> segfaults to the following changes that introduce a wrapper around
>> sigprocmask:
>>
>> https://github.com/susematz/qemu/commit/f1542ae9fe10d5a241fc2624ecaef5f0948e3472
>> https://github.com/susematz/qemu/commit/4e5e1607758841c760cda4652b0ee7a6bc6eb79d
>> https://github.com/susematz/qemu/commit/63eb8d3ea58f58d5857153b0c632def1bbd05781
>>
>> I'm not sure if this is a real fix or just papering over my issue -
>
> It's fixing the issue, but strictly speaking introduces a QoI problem.
> SIGSEGV must not be controllable by the guest, it needs to be always
> deliverable to qemu; that is what's fixed.
>
> The QoI problem introduced is that with the implementation as is, the
> fiddling with SIGSEGV is detectable by the guest.  E.g. if it installs a
> segv handler, blocks segv, then forces a segfault, checks that it didn't
> arrive, then unblocks segv and checks that it now arrives, such testcase
> would be able to detect that in fact it couldn't block SIGSEGV.
>
> Luckily for us, the effect of blocking SIGSEGV and then generating one in
> other ways than kill/sigqueue/raise (e.g. by writing to NULL) is
> undefined, so in practice that QoI issue doesn't matter.
>
> To fix also that latter part it'd need a further per-thread flag
> segv_blocked_p which needs to be checked before actually delivering a
> guest-directed SIGSEGV (in comparison to a qemu-directed SEGV), and
> otherwise requeue it.  That's made a bit complicated when the SEGV was
> process-directed (not thread-directed) because in that case it needs to be
> delivered as long as there's _any_ thread which has it unblocked.  So
> given the above undefinedness for sane uses of SEGVs it didn't seem worth
> the complication of having an undetectable virtualization of SIGSEGV.

Thanks for the explanation.

>> but either way, are these changes reasonable for upstream submission?
>
> IIRC the first two commits (from Alex Barcelo) were submitted once in the
> past, but fell through the cracks.

Alex: are you interested in resubmitting these - or would you like me
to attempt to on your behalf?

 -dann

>
> Ciao,
> Michael.

* Re: [Qemu-devel] Call for testing QEMU aarch64-linux-user emulation
  2014-02-25 14:56           ` Michael Matz
@ 2014-02-28 14:12             ` Alex Bennée
  2014-02-28 14:21               ` Peter Maydell
  0 siblings, 1 reply; 27+ messages in thread
From: Alex Bennée @ 2014-02-28 14:12 UTC (permalink / raw)
  To: Michael Matz
  Cc: Peter Maydell, linaro-dev, Dann Frazier, Alexander Graf,
	linaro-toolchain, qemu-devel, Wook Wookey, Andreas Färber,
	Christoffer Dall


Michael Matz <matz@suse.de> writes:

> Hi,
>
> On Tue, 25 Feb 2014, Peter Maydell wrote:
>
>> On 25 February 2014 13:33, Michael Matz <matz@suse.de> wrote:
>> > The biggest road-block is that signal vs syscall handling is
>> > fundamentally broken in linux-user and it's unfixable without
>> > assembler implementations of the syscall caller.
>> 
>> I'm not entirely sure it's possible to fix even with
>> hand-rolled assembly, to be honest.
>
> I am fairly sure.  The problem is "simply" to detect if the signal arrived 
> while inside the kernel (doing the syscall's job) or still or already
> outside. This structure helps with that:
<snip>

Is this "simply" a case of having a precise state in/around syscalls?

AIUI we already have such a mechanism for dealing with faults in
translated code so this is all aimed at when an asynchronous signal
arrives somewhere in QEMU's own code. So this can be:

* the execution/translation loop
* a helper function
* a syscall (helper jump out of execution/translation loop?)

I wonder if it would be possible to defer the handing of the signal back
to the process until we know we are precise?

-- 
Alex Bennée
Finding this all eerily familiar.

* Re: [Qemu-devel] Call for testing QEMU aarch64-linux-user emulation
  2014-02-28 14:12             ` Alex Bennée
@ 2014-02-28 14:21               ` Peter Maydell
  2014-02-28 14:27                 ` Alexander Graf
  0 siblings, 1 reply; 27+ messages in thread
From: Peter Maydell @ 2014-02-28 14:21 UTC (permalink / raw)
  To: Alex Bennée
  Cc: linaro-dev, Dann Frazier, Michael Matz, Alexander Graf,
	linaro-toolchain, qemu-devel, Wook Wookey, Andreas Färber,
	Christoffer Dall

On 28 February 2014 14:12, Alex Bennée <alex.bennee@linaro.org> wrote:
> Is this "simply" a case of having a precise state in/around syscalls?

No.

> AIUI we already have such a mechanism for dealing with faults in
> translated code so this is all aimed at when an asynchronous signal
> arrives somewhere in QEMU's own code.

The major problem is that system calls are supposed to be
atomic wrt signals, ie for the guest we must appear to
either take the signal first, or have the syscall return
EINTR, or take it after. Further, we mustn't make a host
syscall that is supposed to be interrupted by a signal if
the signal has already arrived, because we'll hang.

http://lists.gnu.org/archive/html/qemu-devel/2011-12/msg00384.html
has a fuller description of the problem, though note that
my analysis of the solution is incorrect. I think Michael's
right that we could deal with this if we had known native
asm for the syscall sequence. (We probably want to separate
out the interruptible syscalls so we can continue to use
straightforward "just call libc" code for the bulk of them
which are non-interruptible.)

thanks
-- PMM

* Re: [Qemu-devel] Call for testing QEMU aarch64-linux-user emulation
  2014-02-28 14:21               ` Peter Maydell
@ 2014-02-28 14:27                 ` Alexander Graf
  2014-02-28 14:49                   ` Peter Maydell
  0 siblings, 1 reply; 27+ messages in thread
From: Alexander Graf @ 2014-02-28 14:27 UTC (permalink / raw)
  To: Peter Maydell
  Cc: linaro-dev, Dann Frazier, Michael Matz, qemu-devel,
	linaro-toolchain, Wook Wookey, Alex Bennée,
	Andreas Färber, Christoffer Dall



> On 28.02.2014 at 22:21, Peter Maydell <peter.maydell@linaro.org> wrote:
> 
>> On 28 February 2014 14:12, Alex Bennée <alex.bennee@linaro.org> wrote:
>> Is this "simply" a case of having a precise state in/around syscalls?
> 
> No.
> 
>> AIUI we already have such a mechanism for dealing with faults in
>> translated code so this is all aimed at when an asynchronous signal
>> arrives somewhere in QEMU's own code.
> 
> The major problem is that system calls are supposed to be
> atomic wrt signals, ie for the guest we must appear to
> either take the signal first, or have the syscall return
> EINTR, or take it after. Further, we mustn't make a host
> syscall that is supposed to be interrupted by a signal if
> the signal has already arrived, because we'll hang.
> 
> http://lists.gnu.org/archive/html/qemu-devel/2011-12/msg00384.html
> has a fuller description of the problem, though note that
> my analysis of the solution is incorrect. I think Michael's
> right that we could deal with this if we had known native
> asm for the syscall sequence. (We probably want to separate
> out the interruptible syscalls so we can continue to use
> straightforward "just call libc" code for the bulk of them
> which are non-interruptible.)

Could we check the instruction at the signaling pc and check
if it's a known syscall instruction? No need to replace glibc
wrappers then.

Alex

> 
> thanks
> -- PMM

* Re: [Qemu-devel] Call for testing QEMU aarch64-linux-user emulation
  2014-02-28 14:27                 ` Alexander Graf
@ 2014-02-28 14:49                   ` Peter Maydell
  2014-02-28 17:08                     ` Alex Bennée
  0 siblings, 1 reply; 27+ messages in thread
From: Peter Maydell @ 2014-02-28 14:49 UTC (permalink / raw)
  To: Alexander Graf
  Cc: linaro-dev, Dann Frazier, Michael Matz, qemu-devel,
	linaro-toolchain, Wook Wookey, Alex Bennée,
	Andreas Färber, Christoffer Dall

On 28 February 2014 14:27, Alexander Graf <agraf@suse.de> wrote:
> Could we check the instruction at the signaling pc and check
> if it's a known syscall instruction? No need to replace glibc
> wrappers then.

No, because the behaviour we want for "started handling
syscall in qemu" through to "PC anything up to but not
including the syscall insn" is "back out and take signal
then try again", which means we need to be able to unwind
anything we were doing. If we (effectively) longjmp out of
the middle of glibc we're liable to leave locked mutexes
and otherwise mess up glibc internals. Also we need to be
able to distinguish "not got to syscall insn yet" from
"after syscall insn", which isn't possible to determine
if all you have is "PC is inside glibc but not actually
at the syscall insn".

There really aren't all that many interruptible syscalls,
though, so we can probably live with handrolling those.

thanks
-- PMM

* Re: [Qemu-devel] Call for testing QEMU aarch64-linux-user emulation
  2014-02-28 14:49                   ` Peter Maydell
@ 2014-02-28 17:08                     ` Alex Bennée
  2014-02-28 17:17                       ` Peter Maydell
  0 siblings, 1 reply; 27+ messages in thread
From: Alex Bennée @ 2014-02-28 17:08 UTC (permalink / raw)
  To: Peter Maydell
  Cc: linaro-dev, Dann Frazier, Michael Matz, Alexander Graf,
	linaro-toolchain, qemu-devel, Wook Wookey, Andreas Färber,
	Christoffer Dall


Peter Maydell <peter.maydell@linaro.org> writes:

> On 28 February 2014 14:27, Alexander Graf <agraf@suse.de> wrote:
>> Could we check the instruction at the signaling pc and check
>> if it's a known syscall instruction? No need to replace glibc
>> wrappers then.
>
> No, because the behaviour we want for "started handling
> syscall in qemu" through to "PC anything up to but not
> including the syscall insn" is "back out and take signal
> then try again", which means we need to be able to unwind
> anything we were doing. If we (effectively) longjmp out of
> the middle of glibc we're liable to leave locked mutexes
> and otherwise mess up glibc internals.

The other option is roll the real PC forward until you know you are at a
point that everything is in a known state - in this case a labelled
syscall instruction. You can achieve this with a host interpreter (which
would be a lot of work to add to QEMU) or maybe achieve the same magic
with ptrace?

If you really want to avoid too much messing about you mask off all your
signals until you really know you can do something about them.

It goes without saying I hope that any serious attempt to fix this needs
a decent set of test cases because the edge cases are numerous.

-- 
Alex Bennée

* Re: [Qemu-devel] Call for testing QEMU aarch64-linux-user emulation
  2014-02-28 17:08                     ` Alex Bennée
@ 2014-02-28 17:17                       ` Peter Maydell
  0 siblings, 0 replies; 27+ messages in thread
From: Peter Maydell @ 2014-02-28 17:17 UTC (permalink / raw)
  To: Alex Bennée
  Cc: linaro-dev, Dann Frazier, Michael Matz, Alexander Graf,
	linaro-toolchain, qemu-devel, Wook Wookey, Andreas Färber,
	Christoffer Dall

On 28 February 2014 17:08, Alex Bennée <alex.bennee@linaro.org> wrote:
>
> Peter Maydell <peter.maydell@linaro.org> writes:
>
>> On 28 February 2014 14:27, Alexander Graf <agraf@suse.de> wrote:
>>> Could we check the instruction at the signaling pc and check
>>> if it's a known syscall instruction? No need to replace glibc
>>> wrappers then.
>>
>> No, because the behaviour we want for "started handling
>> syscall in qemu" through to "PC anything up to but not
>> including the syscall insn" is "back out and take signal
>> then try again", which means we need to be able to unwind
>> anything we were doing. If we (effectively) longjmp out of
>> the middle of glibc we're liable to leave locked mutexes
>> and otherwise mess up glibc internals.
>
> The other option is roll the real PC forward until you know you are at a
> point that everything is in a known state - in this case a labelled
> syscall instruction.

I don't see how rolling the host PC forward would work.
We can't take the guest signal where we are, we can't
go forward because that would imply executing the host
syscall (which might now block): the only thing we
can do is roll back to a point where we can make it
appear we hadn't executed the guest syscall insn yet,
and then take the guest signal.

Masking signals doesn't work in general because you
need the signal to be unblocked while you do the
host syscall (so it can correctly return EINTR if
the signal comes in while it's doing stuff), and
there's no way to atomically unblock-and-do-syscall
(and certainly no way to do that if your syscall is
buried inside glibc).
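
The window described above can be made visible with a small demo. This is
only a sketch under stated assumptions, not QEMU code: the function name is
hypothetical, SIGUSR1 stands in for a guest signal, and select() with a
short timeout stands in for an interruptible host syscall. The signal is
already pending when it is unblocked, so it is consumed before the wait
starts and the wait is never interrupted.

```python
import select
import signal
import threading
import time


def unblock_then_wait(timeout=0.05, sig=signal.SIGUSR1):
    """Unblock an already-pending signal, then enter a blocking wait.

    Returns (handler_runs_before_wait, seconds_waited).
    Hypothetical helper for illustration only.
    """
    runs = []
    previous = signal.signal(sig, lambda signum, frame: runs.append(signum))
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, {sig})
    try:
        # The signal arrives while it is still masked.
        signal.pthread_kill(threading.get_ident(), sig)
        # Step 1: unblock. The handler runs here, before the wait begins.
        signal.pthread_sigmask(signal.SIG_UNBLOCK, {sig})
        runs_before_wait = len(runs)
        # Step 2: the blocking call. Nothing is left to interrupt it, so it
        # sleeps for the full timeout instead of returning early.
        start = time.monotonic()
        select.select([], [], [], timeout)
        waited = time.monotonic() - start
    finally:
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)
        signal.signal(sig, previous if previous is not None else signal.SIG_DFL)
    return runs_before_wait, waited
```

With no timeout the second step would block indefinitely, which is the
hang scenario: the signal was taken between the unblock and the syscall,
so the syscall cannot report EINTR for it.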

thanks
-- PMM

* Re: [Qemu-devel] Call for testing QEMU aarch64-linux-user emulation
  2014-02-25 17:11     ` Janne Grunau
@ 2014-03-06 11:40       ` Alex Bennée
  2014-03-06 16:04         ` Janne Grunau
  0 siblings, 1 reply; 27+ messages in thread
From: Alex Bennée @ 2014-03-06 11:40 UTC (permalink / raw)
  To: Janne Grunau
  Cc: Peter Maydell, linaro-dev, Michael Matz, Alexander Graf,
	linaro-toolchain, qemu-devel, Wook Wookey, Christoffer Dall


Janne Grunau <j@jannau.net> writes:

> On 2014-02-25 15:54:37 +0000, Alex Bennée wrote:
>>
<snip>
>> Have you got the log file "unsupported" line? I seem to recall you did
>> ping me but maybe it was just on IRC? I just want to make sure I
>> do the right ones. I'm working on this now.
>
> We spoke on irc about it. A quick test commenting unsupported
> instructions out revealed that rshrn/2, sqrshrun and shrn/2 are
> the only NEON instructions used in libav still missing support
> in qemu master. Unsupported lines from qemu master 0459650d94d1
> below.
>
> target-arm/translate-a64.c:6884: unsupported instruction encoding 0x0f0a8e10 at pc=00000000008632c8
> target-arm/translate-a64.c:6884: unsupported instruction encoding 0x2f0b8f9c at pc=0000000000865764
> target-arm/translate-a64.c:6884: unsupported instruction encoding 0x0f0a8610 at pc=0000000000863afc
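
For anyone wanting to map such encodings back to mnemonics by hand, here is
a minimal sketch of a decoder for just the narrowing right shifts in the
A64 "Advanced SIMD shift by immediate" group. The function and table names
are hypothetical, it is not how QEMU's translate-a64.c is organised, and it
deliberately ignores every other opcode in the group.

```python
# (U bit, opcode field) -> mnemonic, narrowing right shifts only.
NARROWING_SHIFTS = {
    (0, 0b10000): "shrn",
    (0, 0b10001): "rshrn",
    (1, 0b10000): "sqshrun",
    (1, 0b10001): "sqrshrun",
}


def decode_narrowing_shift(insn):
    """Decode a 32-bit A64 word; return assembly text or None if unhandled."""
    # Fixed bits of the group: bit 31 = 0, bits 28..23 = 011110, bit 10 = 1.
    if (insn >> 31) & 1 or (insn >> 23) & 0x3F != 0b011110:
        return None
    if not (insn >> 10) & 1:
        return None
    q = (insn >> 30) & 1
    u = (insn >> 29) & 1
    immh = (insn >> 19) & 0xF
    immb = (insn >> 16) & 0x7
    opcode = (insn >> 11) & 0x1F
    rn = (insn >> 5) & 0x1F
    rd = insn & 0x1F
    name = NARROWING_SHIFTS.get((u, opcode))
    # immh == 0 belongs to a different group; immh with bit 3 set is reserved
    # for narrowing shifts.
    if name is None or immh == 0 or immh & 0b1000:
        return None
    esize = 8 << (immh.bit_length() - 1)         # destination element size
    shift = 2 * esize - ((immh << 3) | immb)     # right-shift amount
    src = {8: "8h", 16: "4s", 32: "2d"}[esize]
    dst = {8: ("8b", "16b"), 16: ("4h", "8h"), 32: ("2s", "4s")}[esize][q]
    suffix = "2" if q else ""                    # Q=1 writes the upper half
    return f"{name}{suffix} v{rd}.{dst}, v{rn}.{src}, #{shift}"
```

Fed the three encodings from the log, this gives "rshrn v16.8b, v16.8h, #6",
"sqrshrun v28.8b, v28.8h, #5" and "shrn v16.8b, v16.8h, #6", which matches
the list of missing instructions reported above.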

I've just pushed support for the various shrn opcodes to:

https://github.com/stsquad/qemu/tree/ajb-a64-working

I suspect if libav uses them heavily there could be some optimisation
to be made as the narrow operations make heavy use of helpers to do the
saturation stuff.

-- 
Alex Bennée

* Re: [Qemu-devel] Call for testing QEMU aarch64-linux-user emulation
  2014-03-06 11:40       ` Alex Bennée
@ 2014-03-06 16:04         ` Janne Grunau
  0 siblings, 0 replies; 27+ messages in thread
From: Janne Grunau @ 2014-03-06 16:04 UTC (permalink / raw)
  To: Alex Bennée
  Cc: Peter Maydell, linaro-dev, Michael Matz, Alexander Graf,
	linaro-toolchain, qemu-devel, Wook Wookey, Christoffer Dall

On 2014-03-06 11:40:47 +0000, Alex Bennée wrote:
> 
> Janne Grunau <j@jannau.net> writes:
> 
> > On 2014-02-25 15:54:37 +0000, Alex Bennée wrote:
> >>
> <snip>
> >> Have you got the log file "unsupported" line? I seem to recall you did
> >> ping me but maybe it was just on IRC? I just want to make sure I
> >> do the right ones. I'm working on this now.
> >
> > We spoke on irc about it. A quick test commenting unsupported
> > instructions out revealed that rshrn/2, sqrshrun and shrn/2 are
> > the only NEON instructions used in libav still missing support
> > in qemu master. Unsupported lines from qemu master 0459650d94d1
> > below.
> >
> > target-arm/translate-a64.c:6884: unsupported instruction encoding 0x0f0a8e10 at pc=00000000008632c8
> > target-arm/translate-a64.c:6884: unsupported instruction encoding 0x2f0b8f9c at pc=0000000000865764
> > target-arm/translate-a64.c:6884: unsupported instruction encoding 0x0f0a8610 at pc=0000000000863afc
> 
> I've just pushed support for the various shrn opcodes to:
> 
> https://github.com/stsquad/qemu/tree/ajb-a64-working

Thanks, just testing it and it seems to work as expected.

> I suspect if libav uses them heavily there could be some optimisation
> to be made as the narrow operations make heavy use of helpers to do the
> saturation stuff.

The saturating shift is not that heavily used and I don't care much as
long as qemu is an order of magnitude faster than ARM's foundation model
and much easier to handle than Apple hardware.

Janne

* Re: [Qemu-devel] Call for testing QEMU aarch64-linux-user emulation
  2014-02-25  8:39   ` Alex Bennée
  2014-02-25  8:49     ` Andreas Färber
  2014-02-26 22:06     ` Dann Frazier
@ 2014-03-09 23:37     ` Dann Frazier
  2014-03-09 23:51       ` Peter Maydell
  2 siblings, 1 reply; 27+ messages in thread
From: Dann Frazier @ 2014-03-09 23:37 UTC (permalink / raw)
  To: Alex Bennée
  Cc: Peter Maydell, linaro-dev, Michael Matz, Alexander Graf,
	linaro-toolchain, qemu-devel, Wook Wookey, Christoffer Dall

On Tue, Feb 25, 2014 at 1:39 AM, Alex Bennée <alex.bennee@linaro.org> wrote:
>
> Dann Frazier <dann.frazier@canonical.com> writes:
>
>> On Mon, Feb 17, 2014 at 6:40 AM, Alex Bennée <alex.bennee@linaro.org> wrote:
>>> Hi,
>>
>> Thanks to all involved for your work here!
>>
>>> After a solid few months of work the QEMU master branch [1] has now reached
>>> instruction feature parity with the suse-1.6 [6] tree that a lot of people
>>> have been using to build various aarch64 binaries. In addition to the
> <snip>
>>>
>>> I've tested against the following aarch64 rootfs:
>>>     * SUSE [2]
>>>     * Debian [3]
>>>     * Ubuntu Saucy [4]
>>
>> fyi, I've been doing my testing with Ubuntu Trusty.
>
> Good stuff, I shall see if I can set one up. Is the package coverage
> between trusty and saucy much different? I noticed for example I
> couldn't find zile and various build-deps for llvm.

Oops, must've missed this question before. I'd say in general they
should be quite similar, but there are obviously exceptions (zile is
one). I'm not sure why it was omitted.

Also - I've found an issue with running OpenJDK in the latest upstream git:

root@server-75e0210e-4f99-4c86-9277-3201ab7b6afd:/root# java
#
[thread 274902467056 also had an error]# A fatal error has been
detected by the Java Runtime Environment:

#
#  SIGSEGV (0xb) at pc=0x0000000000000000, pid=9960, tid=297441608176
#
# JRE version: OpenJDK Runtime Environment (7.0_51-b31) (build 1.7.0_51-b31)
# Java VM: OpenJDK 64-Bit Server VM (25.0-b59 mixed mode linux-aarch64
compressed oops)
# Problematic frame:
# qemu: uncaught target signal 6 (Aborted) - core dumped
Aborted (core dumped)

This is openjdk-7-jre-headless, version 7u51-2.4.5-1ubuntu1. I'm not
sure what debug info would be useful here, but let me know and I can
collect it.

* Re: [Qemu-devel] Call for testing QEMU aarch64-linux-user emulation
  2014-03-09 23:37     ` Dann Frazier
@ 2014-03-09 23:51       ` Peter Maydell
  2014-03-10 11:28         ` Alex Bennée
  0 siblings, 1 reply; 27+ messages in thread
From: Peter Maydell @ 2014-03-09 23:51 UTC (permalink / raw)
  To: Dann Frazier
  Cc: linaro-dev, Michael Matz, Alexander Graf, linaro-toolchain,
	qemu-devel, Wook Wookey, Alex Bennée, Christoffer Dall

On 9 March 2014 23:37, Dann Frazier <dann.frazier@canonical.com> wrote:
> Also - I've found an issue with running OpenJDK in the latest upstream git:
>
> root@server-75e0210e-4f99-4c86-9277-3201ab7b6afd:/root# java
> #
> [thread 274902467056 also had an error]# A fatal error has been
> detected by the Java Runtime Environment:
>
> #
> #  SIGSEGV (0xb) at pc=0x0000000000000000, pid=9960, tid=297441608176

Not sure there's much point looking very deeply into this.
Java programs are threaded, threaded programs don't work
under QEMU => don't try to run Java under QEMU :-)

thanks
-- PMM

* Re: [Qemu-devel] Call for testing QEMU aarch64-linux-user emulation
  2014-03-09 23:51       ` Peter Maydell
@ 2014-03-10 11:28         ` Alex Bennée
  2014-03-10 11:45           ` Peter Maydell
  2014-03-10 13:56           ` Michael Matz
  0 siblings, 2 replies; 27+ messages in thread
From: Alex Bennée @ 2014-03-10 11:28 UTC (permalink / raw)
  To: Peter Maydell
  Cc: linaro-dev, Dann Frazier, Michael Matz, Alexander Graf,
	linaro-toolchain, qemu-devel, Wook Wookey, Christoffer Dall


Peter Maydell <peter.maydell@linaro.org> writes:

> On 9 March 2014 23:37, Dann Frazier <dann.frazier@canonical.com> wrote:
>> Also - I've found an issue with running OpenJDK in the latest upstream git:
>>
>> root@server-75e0210e-4f99-4c86-9277-3201ab7b6afd:/root# java
>> #
>> [thread 274902467056 also had an error]# A fatal error has been
>> detected by the Java Runtime Environment:
>>
>> #
>> #  SIGSEGV (0xb) at pc=0x0000000000000000, pid=9960, tid=297441608176
>
> Not sure there's much point looking very deeply into this.
> Java programs are threaded, threaded programs don't work
> under QEMU => don't try to run Java under QEMU :-)

Having said that I'm sure there was another SIGILL related crash on
Launchpad and I think we would be interested in those. Is Java really
that buggy under QEMU just because of threading?

>
> thanks
> -- PMM

-- 
Alex Bennée

* Re: [Qemu-devel] Call for testing QEMU aarch64-linux-user emulation
  2014-03-10 11:28         ` Alex Bennée
@ 2014-03-10 11:45           ` Peter Maydell
  2014-03-10 13:56           ` Michael Matz
  1 sibling, 0 replies; 27+ messages in thread
From: Peter Maydell @ 2014-03-10 11:45 UTC (permalink / raw)
  To: Alex Bennée
  Cc: linaro-dev, Dann Frazier, Michael Matz, Alexander Graf,
	linaro-toolchain, qemu-devel, Wook Wookey, Christoffer Dall

On 10 March 2014 11:28, Alex Bennée <alex.bennee@linaro.org> wrote:
>
> Peter Maydell <peter.maydell@linaro.org> writes:
>
>> On 9 March 2014 23:37, Dann Frazier <dann.frazier@canonical.com> wrote:
>>> Also - I've found an issue with running OpenJDK in the latest upstream git:
>>>
>>> root@server-75e0210e-4f99-4c86-9277-3201ab7b6afd:/root# java
>>> #
>>> [thread 274902467056 also had an error]# A fatal error has been
>>> detected by the Java Runtime Environment:
>>>
>>> #
>>> #  SIGSEGV (0xb) at pc=0x0000000000000000, pid=9960, tid=297441608176
>>
>> Not sure there's much point looking very deeply into this.
>> Java programs are threaded, threaded programs don't work
>> under QEMU => don't try to run Java under QEMU :-)
>
> Having said that I'm sure there was another SIGILL related crash on
> Launchpad and I think we would be interested in those. Is Java really
> that buggy under QEMU just because of threading?

My experience is that typically it's "take a day investigating
what's going wrong in a complicated and hard to debug guest
program, determine that it's the same cluster of issues we
already know about and have no plan to fix". So 99% of the
time it's just not worth investigating if the guest program
uses threads at all.

thanks
-- PMM

* Re: [Qemu-devel] Call for testing QEMU aarch64-linux-user emulation
  2014-03-10 11:28         ` Alex Bennée
  2014-03-10 11:45           ` Peter Maydell
@ 2014-03-10 13:56           ` Michael Matz
  1 sibling, 0 replies; 27+ messages in thread
From: Michael Matz @ 2014-03-10 13:56 UTC (permalink / raw)
  To: Alex Bennée
  Cc: Peter Maydell, linaro-dev, Dann Frazier, qemu-devel,
	linaro-toolchain, Alexander Graf, Wook Wookey, Christoffer Dall

Hi,

On Mon, 10 Mar 2014, Alex Bennée wrote:

> > Not sure there's much point looking very deeply into this. Java 
> > programs are threaded, threaded programs don't work under QEMU => 
> > don't try to run Java under QEMU :-)
> 
> Having said that I'm sure there was another SIGILL related crash on 
> Launchpad and I think we would be interested in those. Is Java really
> that buggy under QEMU just because of threading?

Generally speaking, yes.  I've never seen problems with openjdk (with the
suse tree), so the segfault above might also be related to the segfault
handling for read-only data segments containing code (the signal
trampoline on stack), for which the patches were recently proposed
upstream.


Ciao,
Michael.

* Re: [Qemu-devel] Call for testing QEMU aarch64-linux-user emulation
  2014-02-27 13:20       ` Michael Matz
  2014-02-27 19:47         ` Dann Frazier
@ 2014-03-14 14:20         ` Peter Maydell
  1 sibling, 0 replies; 27+ messages in thread
From: Peter Maydell @ 2014-03-14 14:20 UTC (permalink / raw)
  To: Michael Matz
  Cc: linaro-dev, Dann Frazier, Alexander Graf, linaro-toolchain,
	qemu-devel, Wook Wookey, Alex Bennée, Christoffer Dall

On 27 February 2014 13:20, Michael Matz <matz@suse.de> wrote:
> On Wed, 26 Feb 2014, Dann Frazier wrote:
>> I've narrowed down the changes that seem to prevent both types of
>> segfaults to the following changes that introduce a wrapper around
>> sigprocmask:
>>
>> https://github.com/susematz/qemu/commit/f1542ae9fe10d5a241fc2624ecaef5f0948e3472
>> https://github.com/susematz/qemu/commit/4e5e1607758841c760cda4652b0ee7a6bc6eb79d
>> https://github.com/susematz/qemu/commit/63eb8d3ea58f58d5857153b0c632def1bbd05781
>>
>> I'm not sure if this is a real fix or just papering over my issue -

I've cleaned up these a bit (and added a bunch of missing
wrapper calls) and am testing them now. I'll post them to
qemu-devel when that's done.

> It's fixing the issue, but strictly speaking introduces a QoI problem.
> SIGSEGV must not be controllable by the guest, it needs to be always
> deliverable to qemu; that is what's fixed.
>
> The QoI problem introduced is that with the implementation as is, the
> fiddling with SIGSEGV is detectable by the guest.  E.g. if it installs a
> segv handler, blocks segv, then forces a segfault, checks that it didn't
> arrive, then unblocks segv and checks that it now arrives, such testcase
> would be able to detect that in fact it couldn't block SIGSEGV.

My rework opts to track with a flag whether SIGSEGV is blocked
by the guest or not; this is fairly straightforward.
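
A minimal sketch of the flag-tracking idea, for illustration only. This
is not the actual patch: guest_sigprocmask() and guest_segv_blocked are
hypothetical names, and it assumes a POSIX host with C11 thread-local
storage. The wrapper records what the guest asked for, but strips
SIGSEGV from the mask handed to the host so that QEMU itself can always
receive it:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-thread flag: does the guest believe SIGSEGV is blocked? */
static _Thread_local bool guest_segv_blocked;

/* Hypothetical wrapper around sigprocmask() for guest-requested mask changes. */
int guest_sigprocmask(int how, const sigset_t *set, sigset_t *oldset)
{
    bool was_blocked = guest_segv_blocked;
    sigset_t host_set;
    const sigset_t *host_arg = NULL;

    if (set) {
        bool names_segv = sigismember(set, SIGSEGV) == 1;

        /* Update the guest's view of SIGSEGV. */
        switch (how) {
        case SIG_BLOCK:
            guest_segv_blocked = was_blocked || names_segv;
            break;
        case SIG_UNBLOCK:
            guest_segv_blocked = was_blocked && !names_segv;
            break;
        case SIG_SETMASK:
            guest_segv_blocked = names_segv;
            break;
        default:
            errno = EINVAL;
            return -1;
        }

        /* The host mask must never contain SIGSEGV. */
        host_set = *set;
        sigdelset(&host_set, SIGSEGV);
        assert(sigismember(&host_set, SIGSEGV) == 0);
        host_arg = &host_set;
    }

    if (sigprocmask(how, host_arg, oldset) != 0) {
        guest_segv_blocked = was_blocked;   /* roll back on failure */
        return -1;
    }

    /* Report the guest's previous view, not the host's. */
    if (oldset) {
        if (was_blocked) {
            sigaddset(oldset, SIGSEGV);
        } else {
            sigdelset(oldset, SIGSEGV);
        }
    }
    return 0;
}
```

With something like this, a guest that blocks SIGSEGV and then reads its
mask back sees SIGSEGV as blocked, while the host thread's real mask
leaves it deliverable.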

> To fix also that latter part it'd need a further per-thread flag
> segv_blocked_p which needs to be checked before actually delivering a
> guest-directed SIGSEGV (in comparison to a qemu-directed SEGV), and
> otherwise requeue it.  That's made a bit complicated when the SEGV was
> process-directed (not thread-directed) because in that case it needs to be
> delivered as long as there's _any_ thread which has it unblocked.  So
> given the above undefinedness for sane uses of SEGVs it didn't seem worth
> the complication of having an undetectable virtualization of SIGSEGV.

I don't bother to emulate to this level though -- if we get a
guest-directed SIGSEGV and the target has it blocked then we assume it
was a from-the-kernel SEGV (i.e. the guest dereferenced NULL or similar) and
treat it as the kernel force_sig_info() does: behave as if the
default SEGV handler was installed. This seems a reasonable compromise
since I assume people don't typically go around sending manual SEGVs
to other processes and threads.
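
Roughly, the delivery decision looks like the sketch below. This is an
illustration under assumptions, not code from the series: the enum and
guest_segv_action() are hypothetical names, and the two booleans stand
in for whatever per-thread state actually records the guest's mask and
handler table.

```c
#include <assert.h>
#include <signal.h>
#include <stdbool.h>

/* Hypothetical outcome of delivering a guest-directed signal. */
enum segv_action {
    SEGV_RUN_GUEST_HANDLER,   /* invoke the handler the guest installed */
    SEGV_DEFAULT_ACTION       /* behave as if SIG_DFL were installed */
};

/*
 * Decide what to do with a guest-directed SIGSEGV.
 * If the guest has it blocked we assume it came from a real fault and,
 * like the kernel's force_sig_info(), fall back to the default action
 * rather than requeueing it until the guest unblocks it.
 */
enum segv_action guest_segv_action(int sig, bool guest_blocked,
                                   bool guest_has_handler)
{
    assert(sig == SIGSEGV);   /* this sketch only covers SIGSEGV */

    if (guest_blocked || !guest_has_handler) {
        return SEGV_DEFAULT_ACTION;
    }
    return SEGV_RUN_GUEST_HANDLER;
}
```

The cost of this shortcut is that a SIGSEGV sent manually (e.g. via
kill()) while the guest has it blocked takes the default action instead
of staying pending, which is the case assumed above to be rare.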

thanks
-- PMM

Thread overview: 27+ messages
2014-02-17 13:40 [Qemu-devel] Call for testing QEMU aarch64-linux-user emulation Alex Bennée
2014-02-24 13:01 ` Janne Grunau
2014-02-25 15:54   ` Alex Bennée
2014-02-25 17:11     ` Janne Grunau
2014-03-06 11:40       ` Alex Bennée
2014-03-06 16:04         ` Janne Grunau
2014-02-24 20:58 ` Dann Frazier
2014-02-25  8:39   ` Alex Bennée
2014-02-25  8:49     ` Andreas Färber
2014-02-25 13:33       ` Michael Matz
2014-02-25 13:46         ` Peter Maydell
2014-02-25 14:56           ` Michael Matz
2014-02-28 14:12             ` Alex Bennée
2014-02-28 14:21               ` Peter Maydell
2014-02-28 14:27                 ` Alexander Graf
2014-02-28 14:49                   ` Peter Maydell
2014-02-28 17:08                     ` Alex Bennée
2014-02-28 17:17                       ` Peter Maydell
2014-02-26 22:06     ` Dann Frazier
2014-02-27 13:20       ` Michael Matz
2014-02-27 19:47         ` Dann Frazier
2014-03-14 14:20         ` Peter Maydell
2014-03-09 23:37     ` Dann Frazier
2014-03-09 23:51       ` Peter Maydell
2014-03-10 11:28         ` Alex Bennée
2014-03-10 11:45           ` Peter Maydell
2014-03-10 13:56           ` Michael Matz
