All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] tpm-tis-test and tpm-crb-test crash on OSX
@ 2018-03-16 13:12 Peter Maydell
  2018-03-16 13:24 ` Peter Maydell
  0 siblings, 1 reply; 11+ messages in thread
From: Peter Maydell @ 2018-03-16 13:12 UTC (permalink / raw)
  To: QEMU Developers; +Cc: Stefan Berger, Daniel P. Berrange

On OSX host, I noticed that tpm-tis-test and tpm-crb-test
both crash on OSX, hitting an error_abort case:

(lldb) run
Process 65115 launched:
'/Users/pm215/src/qemu-for-merges/build/all/tests/tpm-tis-test'
(x86_64)
/i386/tpm-tis/test_check_localities: OK
/i386/tpm-tis/test_check_access_reg: OK
/i386/tpm-tis/test_check_access_reg_seize: OK
/i386/tpm-tis/test_check_access_reg_release: OK
/i386/tpm-tis/test_check_transmit: OK
Unexpected error in qio_channel_socket_readv() at
/Users/pm215/src/qemu-for-merges/io/channel-socket.c:494:
Unable to read from socket: Bad file descriptor

Here's a backtrace from tpm-tis-test:

* thread #2: tid = 0xb504cf, 0x00007fff7eb96e3e
libsystem_kernel.dylib`__pthread_kill + 10, stop reason = signal
SIGABRT
  * frame #0: 0x00007fff7eb96e3e libsystem_kernel.dylib`__pthread_kill + 10
    frame #1: 0x00007fff7ecd5150 libsystem_pthread.dylib`pthread_kill + 333
    frame #2: 0x00007fff7eaf3312 libsystem_c.dylib`abort + 127
    frame #3: 0x0000000100043431 tpm-tis-test`error_setv [inlined]
error_handle_fatal(errp=<unavailable>) + 43 at error.c:38
    frame #4: 0x0000000100043406
tpm-tis-test`error_setv(errp=<unavailable>, src=<unavailable>,
line=<unavailable>, func=<unavailable>,
err_class=ERROR_CLASS_GENERIC_ERROR, fmt=<unavailable>,
ap=<unavailable>, suffix=<unavailable>) + 246 at error.c:71
    frame #5: 0x00000001000435db
tpm-tis-test`error_setg_errno_internal(errp=0x000000010006c930,
src="/Users/pm215/src/qemu-for-merges/io/channel-socket.c", line=494,
func="qio_channel_socket_readv", os_errno=<unavailable>, fmt="Unable
to read from socket") + 219 at error.c:111
    frame #6: 0x0000000100007ba5
tpm-tis-test`qio_channel_socket_readv(ioc=<unavailable>,
iov=<unavailable>, niov=<unavailable>, fds=0x0000000000000000,
nfds=0x0000000000000000, errp=0x000000010006c930) + 341 at
channel-socket.c:493
    frame #7: 0x0000000100004717 tpm-tis-test`qio_channel_read
[inlined] qio_channel_readv_full(ioc=0x0000000100501f80,
iov=<unavailable>, niov=1, fds=<unavailable>, nfds=<unavailable>,
errp=0x000000010006c930) + 62 at channel.c:65
    frame #8: 0x00000001000046d9
tpm-tis-test`qio_channel_read(ioc=0x0000000100501f80,
buf=<unavailable>, buflen=<unavailable>, errp=<unavailable>) + 41 at
channel.c:216
    frame #9: 0x0000000100003dd1
tpm-tis-test`tpm_emu_tpm_thread(data=0x00007ffeefbff0e8) + 241 at
tpm-emu.c:41
    frame #10: 0x00000001001b2ec0
libglib-2.0.0.dylib`g_thread_create_proxy + 191
    frame #11: 0x00007fff7ecd26c1 libsystem_pthread.dylib`_pthread_body + 340
    frame #12: 0x00007fff7ecd256d libsystem_pthread.dylib`_pthread_start + 377
    frame #13: 0x00007fff7ecd1c5d libsystem_pthread.dylib`thread_start + 13

Slightly alarmingly, this doesn't result in 'make check' failing.
I think this is because the error only happens in the test's cleanup,
after all its test cases have reported a pass to the test runner.

thanks
-- PMM

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [Qemu-devel] tpm-tis-test and tpm-crb-test crash on OSX
  2018-03-16 13:12 [Qemu-devel] tpm-tis-test and tpm-crb-test crash on OSX Peter Maydell
@ 2018-03-16 13:24 ` Peter Maydell
  2018-03-16 13:27   ` Daniel P. Berrangé
  0 siblings, 1 reply; 11+ messages in thread
From: Peter Maydell @ 2018-03-16 13:24 UTC (permalink / raw)
  To: QEMU Developers; +Cc: Stefan Berger, Daniel P. Berrange

On 16 March 2018 at 13:12, Peter Maydell <peter.maydell@linaro.org> wrote:
> On OSX host, I noticed that tpm-tis-test and tpm-crb-test
> both crash on OSX, hitting an error_abort case:
>
> (lldb) run
> Process 65115 launched:
> '/Users/pm215/src/qemu-for-merges/build/all/tests/tpm-tis-test'
> (x86_64)
> /i386/tpm-tis/test_check_localities: OK
> /i386/tpm-tis/test_check_access_reg: OK
> /i386/tpm-tis/test_check_access_reg_seize: OK
> /i386/tpm-tis/test_check_access_reg_release: OK
> /i386/tpm-tis/test_check_transmit: OK
> Unexpected error in qio_channel_socket_readv() at
> /Users/pm215/src/qemu-for-merges/io/channel-socket.c:494:
> Unable to read from socket: Bad file descriptor
>
> Here's a backtrace from tpm-tis-test:

Dan suggested a race condition, which prompted me to get an
all-threads backtrace:

  thread #1: tid = 0xb50f19, 0x00007fff7eb97502
libsystem_kernel.dylib`__wait4 + 10, queue = 'com.apple.main-thread'
    frame #0: 0x00007fff7eb97502 libsystem_kernel.dylib`__wait4 + 10
    frame #1: 0x000000010001b303 tpm-tis-test`qtest_quit [inlined]
kill_qemu(s=<unavailable>) + 99 at libqtest.c:107
    frame #2: 0x000000010001b2df
tpm-tis-test`qtest_quit(s=0x0000000100404c60) + 63 at libqtest.c:280
    frame #3: 0x0000000100001bd1 tpm-tis-test`main [inlined] qtest_end
+ 9 at libqtest.h:555
    frame #4: 0x0000000100001bc8 tpm-tis-test`main(argc=<unavailable>,
argv=<unavailable>) + 520 at tpm-tis-test.c:477
    frame #5: 0x00007fff7ea47115 libdyld.dylib`start + 1
    frame #6: 0x00007fff7ea47115 libdyld.dylib`start + 1

  thread #3: tid = 0xb50f4a, 0x00007fff7eb977d2
libsystem_kernel.dylib`close + 10
    frame #0: 0x00007fff7eb977d2 libsystem_kernel.dylib`close + 10
    frame #1: 0x0000000100007def
tpm-tis-test`qio_channel_socket_close(ioc=<unavailable>,
errp=0x000000010006c930) + 63 at channel-socket.c:693
    frame #2: 0x00000001000039f9
tpm-tis-test`tpm_emu_ctrl_thread(data=0x00007ffeefbff0e8) + 713 at
tpm-emu.c:128
    frame #3: 0x00000001001b2ec0 libglib-2.0.0.dylib`g_thread_create_proxy + 191
    frame #4: 0x00007fff7ecd26c1 libsystem_pthread.dylib`_pthread_body + 340
    frame #5: 0x00007fff7ecd256d libsystem_pthread.dylib`_pthread_start + 377
    frame #6: 0x00007fff7ecd1c5d libsystem_pthread.dylib`thread_start + 13

* thread #2: tid = 0xb50f50, 0x00007fff7eb96e3e
libsystem_kernel.dylib`__pthread_kill + 10
  * frame #0: 0x00007fff7eb96e3e libsystem_kernel.dylib`__pthread_kill + 10
    frame #1: 0x00007fff7ecd5150 libsystem_pthread.dylib`pthread_kill + 333
    frame #2: 0x00007fff7eaf3312 libsystem_c.dylib`abort + 127
    frame #3: 0x0000000100043431 tpm-tis-test`error_setv [inlined]
error_handle_fatal(errp=<unavailable>) + 43 at error.c:38
    frame #4: 0x0000000100043406
tpm-tis-test`error_setv(errp=<unavailable>, src=<unavailable>,
line=<unavailable>, func=<unavailable>,
err_class=ERROR_CLASS_GENERIC_ERROR, fmt=<unavailable>,
ap=<unavailable>, suffix=<unavailable>) + 246 at error.c:71
    frame #5: 0x00000001000435db
tpm-tis-test`error_setg_errno_internal(errp=0x000000010006c930,
src="/Users/pm215/src/qemu-for-merges/io/channel-socket.c", line=494,
func="qio_channel_socket_readv", os_errno=<unavailable>, fmt="Unable
to read from socket") + 219 at error.c:111
    frame #6: 0x0000000100007ba5
tpm-tis-test`qio_channel_socket_readv(ioc=<unavailable>,
iov=<unavailable>, niov=<unavailable>, fds=0x0000000000000000,
nfds=0x0000000000000000, errp=0x000000010006c930) + 341 at
channel-socket.c:493
    frame #7: 0x0000000100004717 tpm-tis-test`qio_channel_read
[inlined] qio_channel_readv_full(ioc=0x00000001007006b0,
iov=<unavailable>, niov=1, fds=<unavailable>, nfds=<unavailable>,
errp=0x000000010006c930) + 62 at channel.c:65
    frame #8: 0x00000001000046d9
tpm-tis-test`qio_channel_read(ioc=0x00000001007006b0,
buf=<unavailable>, buflen=<unavailable>, errp=<unavailable>) + 41 at
channel.c:216
    frame #9: 0x0000000100003dd1
tpm-tis-test`tpm_emu_tpm_thread(data=0x00007ffeefbff0e8) + 241 at
tpm-emu.c:41
    frame #10: 0x00000001001b2ec0
libglib-2.0.0.dylib`g_thread_create_proxy + 191
    frame #11: 0x00007fff7ecd26c1 libsystem_pthread.dylib`_pthread_body + 340
    frame #12: 0x00007fff7ecd256d libsystem_pthread.dylib`_pthread_start + 377
    frame #13: 0x00007fff7ecd1c5d libsystem_pthread.dylib`thread_start + 13


My guess is that the problem here is that the tpm_emu_ctrl_thread (thread 3) is
forcibly closing the channel, which causes the tpm_emu_thread (thread 2)
to abort because its read returned an error.

thanks
-- PMM

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [Qemu-devel] tpm-tis-test and tpm-crb-test crash on OSX
  2018-03-16 13:24 ` Peter Maydell
@ 2018-03-16 13:27   ` Daniel P. Berrangé
  2018-03-16 13:37     ` Marc-André Lureau
  0 siblings, 1 reply; 11+ messages in thread
From: Daniel P. Berrangé @ 2018-03-16 13:27 UTC (permalink / raw)
  To: Peter Maydell; +Cc: QEMU Developers, Stefan Berger

On Fri, Mar 16, 2018 at 01:24:53PM +0000, Peter Maydell wrote:
> On 16 March 2018 at 13:12, Peter Maydell <peter.maydell@linaro.org> wrote:
> > On OSX host, I noticed that tpm-tis-test and tpm-crb-test
> > both crash on OSX, hitting an error_abort case:
> >
> > (lldb) run
> > Process 65115 launched:
> > '/Users/pm215/src/qemu-for-merges/build/all/tests/tpm-tis-test'
> > (x86_64)
> > /i386/tpm-tis/test_check_localities: OK
> > /i386/tpm-tis/test_check_access_reg: OK
> > /i386/tpm-tis/test_check_access_reg_seize: OK
> > /i386/tpm-tis/test_check_access_reg_release: OK
> > /i386/tpm-tis/test_check_transmit: OK
> > Unexpected error in qio_channel_socket_readv() at
> > /Users/pm215/src/qemu-for-merges/io/channel-socket.c:494:
> > Unable to read from socket: Bad file descriptor
> >
> > Here's a backtrace from tpm-tis-test:
> 
> Dan suggested a race condition, which prompted me to get an
> all-threads backtrace:
> 
>   thread #1: tid = 0xb50f19, 0x00007fff7eb97502
> libsystem_kernel.dylib`__wait4 + 10, queue = 'com.apple.main-thread'
>     frame #0: 0x00007fff7eb97502 libsystem_kernel.dylib`__wait4 + 10
>     frame #1: 0x000000010001b303 tpm-tis-test`qtest_quit [inlined]
> kill_qemu(s=<unavailable>) + 99 at libqtest.c:107
>     frame #2: 0x000000010001b2df
> tpm-tis-test`qtest_quit(s=0x0000000100404c60) + 63 at libqtest.c:280
>     frame #3: 0x0000000100001bd1 tpm-tis-test`main [inlined] qtest_end
> + 9 at libqtest.h:555
>     frame #4: 0x0000000100001bc8 tpm-tis-test`main(argc=<unavailable>,
> argv=<unavailable>) + 520 at tpm-tis-test.c:477
>     frame #5: 0x00007fff7ea47115 libdyld.dylib`start + 1
>     frame #6: 0x00007fff7ea47115 libdyld.dylib`start + 1
> 
>   thread #3: tid = 0xb50f4a, 0x00007fff7eb977d2
> libsystem_kernel.dylib`close + 10
>     frame #0: 0x00007fff7eb977d2 libsystem_kernel.dylib`close + 10
>     frame #1: 0x0000000100007def
> tpm-tis-test`qio_channel_socket_close(ioc=<unavailable>,
> errp=0x000000010006c930) + 63 at channel-socket.c:693
>     frame #2: 0x00000001000039f9
> tpm-tis-test`tpm_emu_ctrl_thread(data=0x00007ffeefbff0e8) + 713 at
> tpm-emu.c:128
>     frame #3: 0x00000001001b2ec0 libglib-2.0.0.dylib`g_thread_create_proxy + 191
>     frame #4: 0x00007fff7ecd26c1 libsystem_pthread.dylib`_pthread_body + 340
>     frame #5: 0x00007fff7ecd256d libsystem_pthread.dylib`_pthread_start + 377
>     frame #6: 0x00007fff7ecd1c5d libsystem_pthread.dylib`thread_start + 13
> 
> * thread #2: tid = 0xb50f50, 0x00007fff7eb96e3e
> libsystem_kernel.dylib`__pthread_kill + 10
>   * frame #0: 0x00007fff7eb96e3e libsystem_kernel.dylib`__pthread_kill + 10
>     frame #1: 0x00007fff7ecd5150 libsystem_pthread.dylib`pthread_kill + 333
>     frame #2: 0x00007fff7eaf3312 libsystem_c.dylib`abort + 127
>     frame #3: 0x0000000100043431 tpm-tis-test`error_setv [inlined]
> error_handle_fatal(errp=<unavailable>) + 43 at error.c:38
>     frame #4: 0x0000000100043406
> tpm-tis-test`error_setv(errp=<unavailable>, src=<unavailable>,
> line=<unavailable>, func=<unavailable>,
> err_class=ERROR_CLASS_GENERIC_ERROR, fmt=<unavailable>,
> ap=<unavailable>, suffix=<unavailable>) + 246 at error.c:71
>     frame #5: 0x00000001000435db
> tpm-tis-test`error_setg_errno_internal(errp=0x000000010006c930,
> src="/Users/pm215/src/qemu-for-merges/io/channel-socket.c", line=494,
> func="qio_channel_socket_readv", os_errno=<unavailable>, fmt="Unable
> to read from socket") + 219 at error.c:111
>     frame #6: 0x0000000100007ba5
> tpm-tis-test`qio_channel_socket_readv(ioc=<unavailable>,
> iov=<unavailable>, niov=<unavailable>, fds=0x0000000000000000,
> nfds=0x0000000000000000, errp=0x000000010006c930) + 341 at
> channel-socket.c:493
>     frame #7: 0x0000000100004717 tpm-tis-test`qio_channel_read
> [inlined] qio_channel_readv_full(ioc=0x00000001007006b0,
> iov=<unavailable>, niov=1, fds=<unavailable>, nfds=<unavailable>,
> errp=0x000000010006c930) + 62 at channel.c:65
>     frame #8: 0x00000001000046d9
> tpm-tis-test`qio_channel_read(ioc=0x00000001007006b0,
> buf=<unavailable>, buflen=<unavailable>, errp=<unavailable>) + 41 at
> channel.c:216
>     frame #9: 0x0000000100003dd1
> tpm-tis-test`tpm_emu_tpm_thread(data=0x00007ffeefbff0e8) + 241 at
> tpm-emu.c:41
>     frame #10: 0x00000001001b2ec0
> libglib-2.0.0.dylib`g_thread_create_proxy + 191
>     frame #11: 0x00007fff7ecd26c1 libsystem_pthread.dylib`_pthread_body + 340
>     frame #12: 0x00007fff7ecd256d libsystem_pthread.dylib`_pthread_start + 377
>     frame #13: 0x00007fff7ecd1c5d libsystem_pthread.dylib`thread_start + 13
> 
> 
> My guess is that the problem here is that the tpm_emu_ctrl_thread (thread 3) is
> forcibly closing the channel, which causes the tpm_emu_thread (thread 2)
> to abort because its read returned an error.

At least the tpm_emu_tpm_thread() there is only something in the test
suite, so the real system emulator code isn't at risk of crashing.

Feels like the thread simply should *not* use error_abort, and instead
have a more graceful way to exit when the socket closes

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [Qemu-devel] tpm-tis-test and tpm-crb-test crash on OSX
  2018-03-16 13:27   ` Daniel P. Berrangé
@ 2018-03-16 13:37     ` Marc-André Lureau
  2018-03-16 13:41       ` Marc-André Lureau
  2018-03-16 13:43       ` Daniel P. Berrangé
  0 siblings, 2 replies; 11+ messages in thread
From: Marc-André Lureau @ 2018-03-16 13:37 UTC (permalink / raw)
  To: Daniel P. Berrangé; +Cc: Peter Maydell, QEMU Developers, Stefan Berger

Hi

On Fri, Mar 16, 2018 at 2:27 PM, Daniel P. Berrangé <berrange@redhat.com> wrote:
> On Fri, Mar 16, 2018 at 01:24:53PM +0000, Peter Maydell wrote:
>> On 16 March 2018 at 13:12, Peter Maydell <peter.maydell@linaro.org> wrote:
>> > On OSX host, I noticed that tpm-tis-test and tpm-crb-test
>> > both crash on OSX, hitting an error_abort case:
>> >
>> > (lldb) run
>> > Process 65115 launched:
>> > '/Users/pm215/src/qemu-for-merges/build/all/tests/tpm-tis-test'
>> > (x86_64)
>> > /i386/tpm-tis/test_check_localities: OK
>> > /i386/tpm-tis/test_check_access_reg: OK
>> > /i386/tpm-tis/test_check_access_reg_seize: OK
>> > /i386/tpm-tis/test_check_access_reg_release: OK
>> > /i386/tpm-tis/test_check_transmit: OK
>> > Unexpected error in qio_channel_socket_readv() at
>> > /Users/pm215/src/qemu-for-merges/io/channel-socket.c:494:
>> > Unable to read from socket: Bad file descriptor
>> >
>> > Here's a backtrace from tpm-tis-test:
>>
>> Dan suggested a race condition, which prompted me to get an
>> all-threads backtrace:
>>
>>   thread #1: tid = 0xb50f19, 0x00007fff7eb97502
>> libsystem_kernel.dylib`__wait4 + 10, queue = 'com.apple.main-thread'
>>     frame #0: 0x00007fff7eb97502 libsystem_kernel.dylib`__wait4 + 10
>>     frame #1: 0x000000010001b303 tpm-tis-test`qtest_quit [inlined]
>> kill_qemu(s=<unavailable>) + 99 at libqtest.c:107
>>     frame #2: 0x000000010001b2df
>> tpm-tis-test`qtest_quit(s=0x0000000100404c60) + 63 at libqtest.c:280
>>     frame #3: 0x0000000100001bd1 tpm-tis-test`main [inlined] qtest_end
>> + 9 at libqtest.h:555
>>     frame #4: 0x0000000100001bc8 tpm-tis-test`main(argc=<unavailable>,
>> argv=<unavailable>) + 520 at tpm-tis-test.c:477
>>     frame #5: 0x00007fff7ea47115 libdyld.dylib`start + 1
>>     frame #6: 0x00007fff7ea47115 libdyld.dylib`start + 1
>>
>>   thread #3: tid = 0xb50f4a, 0x00007fff7eb977d2
>> libsystem_kernel.dylib`close + 10
>>     frame #0: 0x00007fff7eb977d2 libsystem_kernel.dylib`close + 10
>>     frame #1: 0x0000000100007def
>> tpm-tis-test`qio_channel_socket_close(ioc=<unavailable>,
>> errp=0x000000010006c930) + 63 at channel-socket.c:693
>>     frame #2: 0x00000001000039f9
>> tpm-tis-test`tpm_emu_ctrl_thread(data=0x00007ffeefbff0e8) + 713 at
>> tpm-emu.c:128
>>     frame #3: 0x00000001001b2ec0 libglib-2.0.0.dylib`g_thread_create_proxy + 191
>>     frame #4: 0x00007fff7ecd26c1 libsystem_pthread.dylib`_pthread_body + 340
>>     frame #5: 0x00007fff7ecd256d libsystem_pthread.dylib`_pthread_start + 377
>>     frame #6: 0x00007fff7ecd1c5d libsystem_pthread.dylib`thread_start + 13
>>
>> * thread #2: tid = 0xb50f50, 0x00007fff7eb96e3e
>> libsystem_kernel.dylib`__pthread_kill + 10
>>   * frame #0: 0x00007fff7eb96e3e libsystem_kernel.dylib`__pthread_kill + 10
>>     frame #1: 0x00007fff7ecd5150 libsystem_pthread.dylib`pthread_kill + 333
>>     frame #2: 0x00007fff7eaf3312 libsystem_c.dylib`abort + 127
>>     frame #3: 0x0000000100043431 tpm-tis-test`error_setv [inlined]
>> error_handle_fatal(errp=<unavailable>) + 43 at error.c:38
>>     frame #4: 0x0000000100043406
>> tpm-tis-test`error_setv(errp=<unavailable>, src=<unavailable>,
>> line=<unavailable>, func=<unavailable>,
>> err_class=ERROR_CLASS_GENERIC_ERROR, fmt=<unavailable>,
>> ap=<unavailable>, suffix=<unavailable>) + 246 at error.c:71
>>     frame #5: 0x00000001000435db
>> tpm-tis-test`error_setg_errno_internal(errp=0x000000010006c930,
>> src="/Users/pm215/src/qemu-for-merges/io/channel-socket.c", line=494,
>> func="qio_channel_socket_readv", os_errno=<unavailable>, fmt="Unable
>> to read from socket") + 219 at error.c:111
>>     frame #6: 0x0000000100007ba5
>> tpm-tis-test`qio_channel_socket_readv(ioc=<unavailable>,
>> iov=<unavailable>, niov=<unavailable>, fds=0x0000000000000000,
>> nfds=0x0000000000000000, errp=0x000000010006c930) + 341 at
>> channel-socket.c:493
>>     frame #7: 0x0000000100004717 tpm-tis-test`qio_channel_read
>> [inlined] qio_channel_readv_full(ioc=0x00000001007006b0,
>> iov=<unavailable>, niov=1, fds=<unavailable>, nfds=<unavailable>,
>> errp=0x000000010006c930) + 62 at channel.c:65
>>     frame #8: 0x00000001000046d9
>> tpm-tis-test`qio_channel_read(ioc=0x00000001007006b0,
>> buf=<unavailable>, buflen=<unavailable>, errp=<unavailable>) + 41 at
>> channel.c:216
>>     frame #9: 0x0000000100003dd1
>> tpm-tis-test`tpm_emu_tpm_thread(data=0x00007ffeefbff0e8) + 241 at
>> tpm-emu.c:41
>>     frame #10: 0x00000001001b2ec0
>> libglib-2.0.0.dylib`g_thread_create_proxy + 191
>>     frame #11: 0x00007fff7ecd26c1 libsystem_pthread.dylib`_pthread_body + 340
>>     frame #12: 0x00007fff7ecd256d libsystem_pthread.dylib`_pthread_start + 377
>>     frame #13: 0x00007fff7ecd1c5d libsystem_pthread.dylib`thread_start + 13
>>
>>
>> My guess is that the problem here is that the tpm_emu_ctrl_thread (thread 3) is
>> forcibly closing the channel, which causes the tpm_emu_thread (thread 2)
>> to abort because its read returned an error.
>
> At least the tpm_emu_tpm_thread() there is only something in the test
> suite, so the real system emulator code isn't at risk of crashing.
>
> Feels like the thread simply should *not* use error_abort, and instead
> have a more graceful way to exit when the socket closes
>

The code expects the read() to return 0 on disconnect, not an error.
Apparently this works on !osx. Should we adapt qio-channel-socket to
return 0 in this case on osx too?

thanks

-- 
Marc-André Lureau

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [Qemu-devel] tpm-tis-test and tpm-crb-test crash on OSX
  2018-03-16 13:37     ` Marc-André Lureau
@ 2018-03-16 13:41       ` Marc-André Lureau
  2018-03-16 13:45         ` Stefan Berger
  2018-03-16 13:43       ` Daniel P. Berrangé
  1 sibling, 1 reply; 11+ messages in thread
From: Marc-André Lureau @ 2018-03-16 13:41 UTC (permalink / raw)
  To: Daniel P. Berrangé; +Cc: Peter Maydell, QEMU Developers, Stefan Berger

Hi

On Fri, Mar 16, 2018 at 2:37 PM, Marc-André Lureau
<marcandre.lureau@gmail.com> wrote:
> Hi
>
> On Fri, Mar 16, 2018 at 2:27 PM, Daniel P. Berrangé <berrange@redhat.com> wrote:
>> On Fri, Mar 16, 2018 at 01:24:53PM +0000, Peter Maydell wrote:
>>> On 16 March 2018 at 13:12, Peter Maydell <peter.maydell@linaro.org> wrote:
>>> > On OSX host, I noticed that tpm-tis-test and tpm-crb-test
>>> > both crash on OSX, hitting an error_abort case:
>>> >
>>> > (lldb) run
>>> > Process 65115 launched:
>>> > '/Users/pm215/src/qemu-for-merges/build/all/tests/tpm-tis-test'
>>> > (x86_64)
>>> > /i386/tpm-tis/test_check_localities: OK
>>> > /i386/tpm-tis/test_check_access_reg: OK
>>> > /i386/tpm-tis/test_check_access_reg_seize: OK
>>> > /i386/tpm-tis/test_check_access_reg_release: OK
>>> > /i386/tpm-tis/test_check_transmit: OK
>>> > Unexpected error in qio_channel_socket_readv() at
>>> > /Users/pm215/src/qemu-for-merges/io/channel-socket.c:494:
>>> > Unable to read from socket: Bad file descriptor
>>> >
>>> > Here's a backtrace from tpm-tis-test:
>>>
>>> Dan suggested a race condition, which prompted me to get an
>>> all-threads backtrace:
>>>
>>>   thread #1: tid = 0xb50f19, 0x00007fff7eb97502
>>> libsystem_kernel.dylib`__wait4 + 10, queue = 'com.apple.main-thread'
>>>     frame #0: 0x00007fff7eb97502 libsystem_kernel.dylib`__wait4 + 10
>>>     frame #1: 0x000000010001b303 tpm-tis-test`qtest_quit [inlined]
>>> kill_qemu(s=<unavailable>) + 99 at libqtest.c:107
>>>     frame #2: 0x000000010001b2df
>>> tpm-tis-test`qtest_quit(s=0x0000000100404c60) + 63 at libqtest.c:280
>>>     frame #3: 0x0000000100001bd1 tpm-tis-test`main [inlined] qtest_end
>>> + 9 at libqtest.h:555
>>>     frame #4: 0x0000000100001bc8 tpm-tis-test`main(argc=<unavailable>,
>>> argv=<unavailable>) + 520 at tpm-tis-test.c:477
>>>     frame #5: 0x00007fff7ea47115 libdyld.dylib`start + 1
>>>     frame #6: 0x00007fff7ea47115 libdyld.dylib`start + 1
>>>
>>>   thread #3: tid = 0xb50f4a, 0x00007fff7eb977d2
>>> libsystem_kernel.dylib`close + 10
>>>     frame #0: 0x00007fff7eb977d2 libsystem_kernel.dylib`close + 10
>>>     frame #1: 0x0000000100007def
>>> tpm-tis-test`qio_channel_socket_close(ioc=<unavailable>,
>>> errp=0x000000010006c930) + 63 at channel-socket.c:693
>>>     frame #2: 0x00000001000039f9
>>> tpm-tis-test`tpm_emu_ctrl_thread(data=0x00007ffeefbff0e8) + 713 at
>>> tpm-emu.c:128
>>>     frame #3: 0x00000001001b2ec0 libglib-2.0.0.dylib`g_thread_create_proxy + 191
>>>     frame #4: 0x00007fff7ecd26c1 libsystem_pthread.dylib`_pthread_body + 340
>>>     frame #5: 0x00007fff7ecd256d libsystem_pthread.dylib`_pthread_start + 377
>>>     frame #6: 0x00007fff7ecd1c5d libsystem_pthread.dylib`thread_start + 13
>>>
>>> * thread #2: tid = 0xb50f50, 0x00007fff7eb96e3e
>>> libsystem_kernel.dylib`__pthread_kill + 10
>>>   * frame #0: 0x00007fff7eb96e3e libsystem_kernel.dylib`__pthread_kill + 10
>>>     frame #1: 0x00007fff7ecd5150 libsystem_pthread.dylib`pthread_kill + 333
>>>     frame #2: 0x00007fff7eaf3312 libsystem_c.dylib`abort + 127
>>>     frame #3: 0x0000000100043431 tpm-tis-test`error_setv [inlined]
>>> error_handle_fatal(errp=<unavailable>) + 43 at error.c:38
>>>     frame #4: 0x0000000100043406
>>> tpm-tis-test`error_setv(errp=<unavailable>, src=<unavailable>,
>>> line=<unavailable>, func=<unavailable>,
>>> err_class=ERROR_CLASS_GENERIC_ERROR, fmt=<unavailable>,
>>> ap=<unavailable>, suffix=<unavailable>) + 246 at error.c:71
>>>     frame #5: 0x00000001000435db
>>> tpm-tis-test`error_setg_errno_internal(errp=0x000000010006c930,
>>> src="/Users/pm215/src/qemu-for-merges/io/channel-socket.c", line=494,
>>> func="qio_channel_socket_readv", os_errno=<unavailable>, fmt="Unable
>>> to read from socket") + 219 at error.c:111
>>>     frame #6: 0x0000000100007ba5
>>> tpm-tis-test`qio_channel_socket_readv(ioc=<unavailable>,
>>> iov=<unavailable>, niov=<unavailable>, fds=0x0000000000000000,
>>> nfds=0x0000000000000000, errp=0x000000010006c930) + 341 at
>>> channel-socket.c:493
>>>     frame #7: 0x0000000100004717 tpm-tis-test`qio_channel_read
>>> [inlined] qio_channel_readv_full(ioc=0x00000001007006b0,
>>> iov=<unavailable>, niov=1, fds=<unavailable>, nfds=<unavailable>,
>>> errp=0x000000010006c930) + 62 at channel.c:65
>>>     frame #8: 0x00000001000046d9
>>> tpm-tis-test`qio_channel_read(ioc=0x00000001007006b0,
>>> buf=<unavailable>, buflen=<unavailable>, errp=<unavailable>) + 41 at
>>> channel.c:216
>>>     frame #9: 0x0000000100003dd1
>>> tpm-tis-test`tpm_emu_tpm_thread(data=0x00007ffeefbff0e8) + 241 at
>>> tpm-emu.c:41
>>>     frame #10: 0x00000001001b2ec0
>>> libglib-2.0.0.dylib`g_thread_create_proxy + 191
>>>     frame #11: 0x00007fff7ecd26c1 libsystem_pthread.dylib`_pthread_body + 340
>>>     frame #12: 0x00007fff7ecd256d libsystem_pthread.dylib`_pthread_start + 377
>>>     frame #13: 0x00007fff7ecd1c5d libsystem_pthread.dylib`thread_start + 13
>>>
>>>
>>> My guess is that the problem here is that the tpm_emu_ctrl_thread (thread 3) is
>>> forcibly closing the channel, which causes the tpm_emu_thread (thread 2)
>>> to abort because its read returned an error.
>>
>> At least the tpm_emu_tpm_thread() there is only something in the test
>> suite, so the real system emulator code isn't at risk of crashing.
>>
>> Feels like the thread simply should *not* use error_abort, and instead
>> have a more graceful way to exit when the socket closes
>>
>
> The code expects the read() to return 0 on disconnect, not an error.
> Apparently this works on !osx. Should we adapt qio-channel-socket to
> return 0 in this case on osx too?

Oh I see, it calls close() on the same end, that's not correct. I
wonder if shutdown would be better. Other suggestions?

-- 
Marc-André Lureau

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [Qemu-devel] tpm-tis-test and tpm-crb-test crash on OSX
  2018-03-16 13:37     ` Marc-André Lureau
  2018-03-16 13:41       ` Marc-André Lureau
@ 2018-03-16 13:43       ` Daniel P. Berrangé
  1 sibling, 0 replies; 11+ messages in thread
From: Daniel P. Berrangé @ 2018-03-16 13:43 UTC (permalink / raw)
  To: Marc-André Lureau; +Cc: Peter Maydell, QEMU Developers, Stefan Berger

On Fri, Mar 16, 2018 at 02:37:16PM +0100, Marc-André Lureau wrote:
> Hi
> 
> On Fri, Mar 16, 2018 at 2:27 PM, Daniel P. Berrangé <berrange@redhat.com> wrote:
> > On Fri, Mar 16, 2018 at 01:24:53PM +0000, Peter Maydell wrote:
> >> On 16 March 2018 at 13:12, Peter Maydell <peter.maydell@linaro.org> wrote:
> >> > On OSX host, I noticed that tpm-tis-test and tpm-crb-test
> >> > both crash on OSX, hitting an error_abort case:
> >> >
> >> > (lldb) run
> >> > Process 65115 launched:
> >> > '/Users/pm215/src/qemu-for-merges/build/all/tests/tpm-tis-test'
> >> > (x86_64)
> >> > /i386/tpm-tis/test_check_localities: OK
> >> > /i386/tpm-tis/test_check_access_reg: OK
> >> > /i386/tpm-tis/test_check_access_reg_seize: OK
> >> > /i386/tpm-tis/test_check_access_reg_release: OK
> >> > /i386/tpm-tis/test_check_transmit: OK
> >> > Unexpected error in qio_channel_socket_readv() at
> >> > /Users/pm215/src/qemu-for-merges/io/channel-socket.c:494:
> >> > Unable to read from socket: Bad file descriptor
> >> >
> >> > Here's a backtrace from tpm-tis-test:
> >>
> >> Dan suggested a race condition, which prompted me to get an
> >> all-threads backtrace:
> >>
> >>   thread #1: tid = 0xb50f19, 0x00007fff7eb97502
> >> libsystem_kernel.dylib`__wait4 + 10, queue = 'com.apple.main-thread'
> >>     frame #0: 0x00007fff7eb97502 libsystem_kernel.dylib`__wait4 + 10
> >>     frame #1: 0x000000010001b303 tpm-tis-test`qtest_quit [inlined]
> >> kill_qemu(s=<unavailable>) + 99 at libqtest.c:107
> >>     frame #2: 0x000000010001b2df
> >> tpm-tis-test`qtest_quit(s=0x0000000100404c60) + 63 at libqtest.c:280
> >>     frame #3: 0x0000000100001bd1 tpm-tis-test`main [inlined] qtest_end
> >> + 9 at libqtest.h:555
> >>     frame #4: 0x0000000100001bc8 tpm-tis-test`main(argc=<unavailable>,
> >> argv=<unavailable>) + 520 at tpm-tis-test.c:477
> >>     frame #5: 0x00007fff7ea47115 libdyld.dylib`start + 1
> >>     frame #6: 0x00007fff7ea47115 libdyld.dylib`start + 1
> >>
> >>   thread #3: tid = 0xb50f4a, 0x00007fff7eb977d2
> >> libsystem_kernel.dylib`close + 10
> >>     frame #0: 0x00007fff7eb977d2 libsystem_kernel.dylib`close + 10
> >>     frame #1: 0x0000000100007def
> >> tpm-tis-test`qio_channel_socket_close(ioc=<unavailable>,
> >> errp=0x000000010006c930) + 63 at channel-socket.c:693
> >>     frame #2: 0x00000001000039f9
> >> tpm-tis-test`tpm_emu_ctrl_thread(data=0x00007ffeefbff0e8) + 713 at
> >> tpm-emu.c:128
> >>     frame #3: 0x00000001001b2ec0 libglib-2.0.0.dylib`g_thread_create_proxy + 191
> >>     frame #4: 0x00007fff7ecd26c1 libsystem_pthread.dylib`_pthread_body + 340
> >>     frame #5: 0x00007fff7ecd256d libsystem_pthread.dylib`_pthread_start + 377
> >>     frame #6: 0x00007fff7ecd1c5d libsystem_pthread.dylib`thread_start + 13
> >>
> >> * thread #2: tid = 0xb50f50, 0x00007fff7eb96e3e
> >> libsystem_kernel.dylib`__pthread_kill + 10
> >>   * frame #0: 0x00007fff7eb96e3e libsystem_kernel.dylib`__pthread_kill + 10
> >>     frame #1: 0x00007fff7ecd5150 libsystem_pthread.dylib`pthread_kill + 333
> >>     frame #2: 0x00007fff7eaf3312 libsystem_c.dylib`abort + 127
> >>     frame #3: 0x0000000100043431 tpm-tis-test`error_setv [inlined]
> >> error_handle_fatal(errp=<unavailable>) + 43 at error.c:38
> >>     frame #4: 0x0000000100043406
> >> tpm-tis-test`error_setv(errp=<unavailable>, src=<unavailable>,
> >> line=<unavailable>, func=<unavailable>,
> >> err_class=ERROR_CLASS_GENERIC_ERROR, fmt=<unavailable>,
> >> ap=<unavailable>, suffix=<unavailable>) + 246 at error.c:71
> >>     frame #5: 0x00000001000435db
> >> tpm-tis-test`error_setg_errno_internal(errp=0x000000010006c930,
> >> src="/Users/pm215/src/qemu-for-merges/io/channel-socket.c", line=494,
> >> func="qio_channel_socket_readv", os_errno=<unavailable>, fmt="Unable
> >> to read from socket") + 219 at error.c:111
> >>     frame #6: 0x0000000100007ba5
> >> tpm-tis-test`qio_channel_socket_readv(ioc=<unavailable>,
> >> iov=<unavailable>, niov=<unavailable>, fds=0x0000000000000000,
> >> nfds=0x0000000000000000, errp=0x000000010006c930) + 341 at
> >> channel-socket.c:493
> >>     frame #7: 0x0000000100004717 tpm-tis-test`qio_channel_read
> >> [inlined] qio_channel_readv_full(ioc=0x00000001007006b0,
> >> iov=<unavailable>, niov=1, fds=<unavailable>, nfds=<unavailable>,
> >> errp=0x000000010006c930) + 62 at channel.c:65
> >>     frame #8: 0x00000001000046d9
> >> tpm-tis-test`qio_channel_read(ioc=0x00000001007006b0,
> >> buf=<unavailable>, buflen=<unavailable>, errp=<unavailable>) + 41 at
> >> channel.c:216
> >>     frame #9: 0x0000000100003dd1
> >> tpm-tis-test`tpm_emu_tpm_thread(data=0x00007ffeefbff0e8) + 241 at
> >> tpm-emu.c:41
> >>     frame #10: 0x00000001001b2ec0
> >> libglib-2.0.0.dylib`g_thread_create_proxy + 191
> >>     frame #11: 0x00007fff7ecd26c1 libsystem_pthread.dylib`_pthread_body + 340
> >>     frame #12: 0x00007fff7ecd256d libsystem_pthread.dylib`_pthread_start + 377
> >>     frame #13: 0x00007fff7ecd1c5d libsystem_pthread.dylib`thread_start + 13
> >>
> >>
> >> My guess is that the problem here is that the tpm_emu_ctrl_thread (thread 3) is
> >> forcibly closing the channel, which causes the tpm_emu_thread (thread 2)
> >> to abort because its read returned an error.
> >
> > At least the tpm_emu_tpm_thread() there is only something in the test
> > suite, so the real system emulator code isn't at risk of crashing.
> >
> > Feels like the thread simply should *not* use error_abort, and instead
> > have a more graceful way to exit when the socket closes
> >
> 
> The code expects the read() to return 0 on disconnect, not an error.
> Apparently this works on !osx. Should we adapt qio-channel-socket to
> return 0 in this case on osx too?

I don't think that's the cause there - we're getting -1, because we're
calling read() on a file descriptor that has alrady been closed by
the other thread. Even on Linux that won't return 0.

I suspect the problem is that the two threads involved are simply
schedled in a different way on Linux vs OS-X so the race only hits
on one.

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [Qemu-devel] tpm-tis-test and tpm-crb-test crash on OSX
  2018-03-16 13:41       ` Marc-André Lureau
@ 2018-03-16 13:45         ` Stefan Berger
  2018-03-16 14:08           ` Stefan Berger
  2018-03-16 14:18           ` Daniel P. Berrangé
  0 siblings, 2 replies; 11+ messages in thread
From: Stefan Berger @ 2018-03-16 13:45 UTC (permalink / raw)
  To: Marc-André Lureau, Daniel P. Berrangé
  Cc: Peter Maydell, QEMU Developers

On 03/16/2018 09:41 AM, Marc-André Lureau wrote:
> Hi
>
> On Fri, Mar 16, 2018 at 2:37 PM, Marc-André Lureau
> <marcandre.lureau@gmail.com> wrote:
>> Hi
>>
>> On Fri, Mar 16, 2018 at 2:27 PM, Daniel P. Berrangé <berrange@redhat.com> wrote:
>>> On Fri, Mar 16, 2018 at 01:24:53PM +0000, Peter Maydell wrote:
>>>> On 16 March 2018 at 13:12, Peter Maydell <peter.maydell@linaro.org> wrote:
>>>>> On OSX host, I noticed that tpm-tis-test and tpm-crb-test
>>>>> both crash on OSX, hitting an error_abort case:
>>>>>
>>>>> (lldb) run
>>>>> Process 65115 launched:
>>>>> '/Users/pm215/src/qemu-for-merges/build/all/tests/tpm-tis-test'
>>>>> (x86_64)
>>>>> /i386/tpm-tis/test_check_localities: OK
>>>>> /i386/tpm-tis/test_check_access_reg: OK
>>>>> /i386/tpm-tis/test_check_access_reg_seize: OK
>>>>> /i386/tpm-tis/test_check_access_reg_release: OK
>>>>> /i386/tpm-tis/test_check_transmit: OK
>>>>> Unexpected error in qio_channel_socket_readv() at
>>>>> /Users/pm215/src/qemu-for-merges/io/channel-socket.c:494:
>>>>> Unable to read from socket: Bad file descriptor
>>>>>
>>>>> Here's a backtrace from tpm-tis-test:
>>>> Dan suggested a race condition, which prompted me to get an
>>>> all-threads backtrace:
>>>>
>>>>    thread #1: tid = 0xb50f19, 0x00007fff7eb97502
>>>> libsystem_kernel.dylib`__wait4 + 10, queue = 'com.apple.main-thread'
>>>>      frame #0: 0x00007fff7eb97502 libsystem_kernel.dylib`__wait4 + 10
>>>>      frame #1: 0x000000010001b303 tpm-tis-test`qtest_quit [inlined]
>>>> kill_qemu(s=<unavailable>) + 99 at libqtest.c:107
>>>>      frame #2: 0x000000010001b2df
>>>> tpm-tis-test`qtest_quit(s=0x0000000100404c60) + 63 at libqtest.c:280
>>>>      frame #3: 0x0000000100001bd1 tpm-tis-test`main [inlined] qtest_end
>>>> + 9 at libqtest.h:555
>>>>      frame #4: 0x0000000100001bc8 tpm-tis-test`main(argc=<unavailable>,
>>>> argv=<unavailable>) + 520 at tpm-tis-test.c:477
>>>>      frame #5: 0x00007fff7ea47115 libdyld.dylib`start + 1
>>>>      frame #6: 0x00007fff7ea47115 libdyld.dylib`start + 1
>>>>
>>>>    thread #3: tid = 0xb50f4a, 0x00007fff7eb977d2
>>>> libsystem_kernel.dylib`close + 10
>>>>      frame #0: 0x00007fff7eb977d2 libsystem_kernel.dylib`close + 10
>>>>      frame #1: 0x0000000100007def
>>>> tpm-tis-test`qio_channel_socket_close(ioc=<unavailable>,
>>>> errp=0x000000010006c930) + 63 at channel-socket.c:693
>>>>      frame #2: 0x00000001000039f9
>>>> tpm-tis-test`tpm_emu_ctrl_thread(data=0x00007ffeefbff0e8) + 713 at
>>>> tpm-emu.c:128
>>>>      frame #3: 0x00000001001b2ec0 libglib-2.0.0.dylib`g_thread_create_proxy + 191
>>>>      frame #4: 0x00007fff7ecd26c1 libsystem_pthread.dylib`_pthread_body + 340
>>>>      frame #5: 0x00007fff7ecd256d libsystem_pthread.dylib`_pthread_start + 377
>>>>      frame #6: 0x00007fff7ecd1c5d libsystem_pthread.dylib`thread_start + 13
>>>>
>>>> * thread #2: tid = 0xb50f50, 0x00007fff7eb96e3e
>>>> libsystem_kernel.dylib`__pthread_kill + 10
>>>>    * frame #0: 0x00007fff7eb96e3e libsystem_kernel.dylib`__pthread_kill + 10
>>>>      frame #1: 0x00007fff7ecd5150 libsystem_pthread.dylib`pthread_kill + 333
>>>>      frame #2: 0x00007fff7eaf3312 libsystem_c.dylib`abort + 127
>>>>      frame #3: 0x0000000100043431 tpm-tis-test`error_setv [inlined]
>>>> error_handle_fatal(errp=<unavailable>) + 43 at error.c:38
>>>>      frame #4: 0x0000000100043406
>>>> tpm-tis-test`error_setv(errp=<unavailable>, src=<unavailable>,
>>>> line=<unavailable>, func=<unavailable>,
>>>> err_class=ERROR_CLASS_GENERIC_ERROR, fmt=<unavailable>,
>>>> ap=<unavailable>, suffix=<unavailable>) + 246 at error.c:71
>>>>      frame #5: 0x00000001000435db
>>>> tpm-tis-test`error_setg_errno_internal(errp=0x000000010006c930,
>>>> src="/Users/pm215/src/qemu-for-merges/io/channel-socket.c", line=494,
>>>> func="qio_channel_socket_readv", os_errno=<unavailable>, fmt="Unable
>>>> to read from socket") + 219 at error.c:111
>>>>      frame #6: 0x0000000100007ba5
>>>> tpm-tis-test`qio_channel_socket_readv(ioc=<unavailable>,
>>>> iov=<unavailable>, niov=<unavailable>, fds=0x0000000000000000,
>>>> nfds=0x0000000000000000, errp=0x000000010006c930) + 341 at
>>>> channel-socket.c:493
>>>>      frame #7: 0x0000000100004717 tpm-tis-test`qio_channel_read
>>>> [inlined] qio_channel_readv_full(ioc=0x00000001007006b0,
>>>> iov=<unavailable>, niov=1, fds=<unavailable>, nfds=<unavailable>,
>>>> errp=0x000000010006c930) + 62 at channel.c:65
>>>>      frame #8: 0x00000001000046d9
>>>> tpm-tis-test`qio_channel_read(ioc=0x00000001007006b0,
>>>> buf=<unavailable>, buflen=<unavailable>, errp=<unavailable>) + 41 at
>>>> channel.c:216
>>>>      frame #9: 0x0000000100003dd1
>>>> tpm-tis-test`tpm_emu_tpm_thread(data=0x00007ffeefbff0e8) + 241 at
>>>> tpm-emu.c:41
>>>>      frame #10: 0x00000001001b2ec0
>>>> libglib-2.0.0.dylib`g_thread_create_proxy + 191
>>>>      frame #11: 0x00007fff7ecd26c1 libsystem_pthread.dylib`_pthread_body + 340
>>>>      frame #12: 0x00007fff7ecd256d libsystem_pthread.dylib`_pthread_start + 377
>>>>      frame #13: 0x00007fff7ecd1c5d libsystem_pthread.dylib`thread_start + 13
>>>>
>>>>
>>>> My guess is that the problem here is that the tpm_emu_ctrl_thread (thread 3) is
>>>> forcibly closing the channel, which causes the tpm_emu_thread (thread 2)
>>>> to abort because its read returned an error.
>>> At least the tpm_emu_tpm_thread() there is only something in the test
>>> suite, so the real system emulator code isn't at risk of crashing.
>>>
>>> Feels like the thread simply should *not* use error_abort, and instead
>>> have a more graceful way to exit when the socket closes
>>>
>> The code expects the read() to return 0 on disconnect, not an error.
>> Apparently this works on !osx. Should we adapt qio-channel-socket to
>> return 0 in this case on osx too?
> Oh I see, it calls close() on the same end, that's not correct. I
> wonder if shutdown would be better. Other suggestions?
>
We could send the thread a special message, like 0xff ff ff ff, and that 
terminates it...

     Stefan

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [Qemu-devel] tpm-tis-test and tpm-crb-test crash on OSX
  2018-03-16 13:45         ` Stefan Berger
@ 2018-03-16 14:08           ` Stefan Berger
  2018-03-16 14:09             ` Marc-André Lureau
  2018-03-16 14:18           ` Daniel P. Berrangé
  1 sibling, 1 reply; 11+ messages in thread
From: Stefan Berger @ 2018-03-16 14:08 UTC (permalink / raw)
  To: Marc-André Lureau, Daniel P. Berrangé
  Cc: Peter Maydell, QEMU Developers

On 03/16/2018 09:45 AM, Stefan Berger wrote:
> On 03/16/2018 09:41 AM, Marc-André Lureau wrote:
>> Hi
>>
>> On Fri, Mar 16, 2018 at 2:37 PM, Marc-André Lureau
>> <marcandre.lureau@gmail.com> wrote:
>>> Hi
>>>
>>> On Fri, Mar 16, 2018 at 2:27 PM, Daniel P. Berrangé 
>>> <berrange@redhat.com> wrote:
>>>> On Fri, Mar 16, 2018 at 01:24:53PM +0000, Peter Maydell wrote:
>>>>> On 16 March 2018 at 13:12, Peter Maydell 
>>>>> <peter.maydell@linaro.org> wrote:
>>>>>> On OSX host, I noticed that tpm-tis-test and tpm-crb-test
>>>>>> both crash on OSX, hitting an error_abort case:
>>>>>>
>>>>>> (lldb) run
>>>>>> Process 65115 launched:
>>>>>> '/Users/pm215/src/qemu-for-merges/build/all/tests/tpm-tis-test'
>>>>>> (x86_64)
>>>>>> /i386/tpm-tis/test_check_localities: OK
>>>>>> /i386/tpm-tis/test_check_access_reg: OK
>>>>>> /i386/tpm-tis/test_check_access_reg_seize: OK
>>>>>> /i386/tpm-tis/test_check_access_reg_release: OK
>>>>>> /i386/tpm-tis/test_check_transmit: OK
>>>>>> Unexpected error in qio_channel_socket_readv() at
>>>>>> /Users/pm215/src/qemu-for-merges/io/channel-socket.c:494:
>>>>>> Unable to read from socket: Bad file descriptor
>>>>>>
>>>>>> Here's a backtrace from tpm-tis-test:
>>>>> Dan suggested a race condition, which prompted me to get an
>>>>> all-threads backtrace:
>>>>>
>>>>>    thread #1: tid = 0xb50f19, 0x00007fff7eb97502
>>>>> libsystem_kernel.dylib`__wait4 + 10, queue = 'com.apple.main-thread'
>>>>>      frame #0: 0x00007fff7eb97502 libsystem_kernel.dylib`__wait4 + 10
>>>>>      frame #1: 0x000000010001b303 tpm-tis-test`qtest_quit [inlined]
>>>>> kill_qemu(s=<unavailable>) + 99 at libqtest.c:107
>>>>>      frame #2: 0x000000010001b2df
>>>>> tpm-tis-test`qtest_quit(s=0x0000000100404c60) + 63 at libqtest.c:280
>>>>>      frame #3: 0x0000000100001bd1 tpm-tis-test`main [inlined] 
>>>>> qtest_end
>>>>> + 9 at libqtest.h:555
>>>>>      frame #4: 0x0000000100001bc8 
>>>>> tpm-tis-test`main(argc=<unavailable>,
>>>>> argv=<unavailable>) + 520 at tpm-tis-test.c:477
>>>>>      frame #5: 0x00007fff7ea47115 libdyld.dylib`start + 1
>>>>>      frame #6: 0x00007fff7ea47115 libdyld.dylib`start + 1
>>>>>
>>>>>    thread #3: tid = 0xb50f4a, 0x00007fff7eb977d2
>>>>> libsystem_kernel.dylib`close + 10
>>>>>      frame #0: 0x00007fff7eb977d2 libsystem_kernel.dylib`close + 10
>>>>>      frame #1: 0x0000000100007def
>>>>> tpm-tis-test`qio_channel_socket_close(ioc=<unavailable>,
>>>>> errp=0x000000010006c930) + 63 at channel-socket.c:693
>>>>>      frame #2: 0x00000001000039f9
>>>>> tpm-tis-test`tpm_emu_ctrl_thread(data=0x00007ffeefbff0e8) + 713 at
>>>>> tpm-emu.c:128
>>>>>      frame #3: 0x00000001001b2ec0 
>>>>> libglib-2.0.0.dylib`g_thread_create_proxy + 191
>>>>>      frame #4: 0x00007fff7ecd26c1 
>>>>> libsystem_pthread.dylib`_pthread_body + 340
>>>>>      frame #5: 0x00007fff7ecd256d 
>>>>> libsystem_pthread.dylib`_pthread_start + 377
>>>>>      frame #6: 0x00007fff7ecd1c5d 
>>>>> libsystem_pthread.dylib`thread_start + 13
>>>>>
>>>>> * thread #2: tid = 0xb50f50, 0x00007fff7eb96e3e
>>>>> libsystem_kernel.dylib`__pthread_kill + 10
>>>>>    * frame #0: 0x00007fff7eb96e3e 
>>>>> libsystem_kernel.dylib`__pthread_kill + 10
>>>>>      frame #1: 0x00007fff7ecd5150 
>>>>> libsystem_pthread.dylib`pthread_kill + 333
>>>>>      frame #2: 0x00007fff7eaf3312 libsystem_c.dylib`abort + 127
>>>>>      frame #3: 0x0000000100043431 tpm-tis-test`error_setv [inlined]
>>>>> error_handle_fatal(errp=<unavailable>) + 43 at error.c:38
>>>>>      frame #4: 0x0000000100043406
>>>>> tpm-tis-test`error_setv(errp=<unavailable>, src=<unavailable>,
>>>>> line=<unavailable>, func=<unavailable>,
>>>>> err_class=ERROR_CLASS_GENERIC_ERROR, fmt=<unavailable>,
>>>>> ap=<unavailable>, suffix=<unavailable>) + 246 at error.c:71
>>>>>      frame #5: 0x00000001000435db
>>>>> tpm-tis-test`error_setg_errno_internal(errp=0x000000010006c930,
>>>>> src="/Users/pm215/src/qemu-for-merges/io/channel-socket.c", line=494,
>>>>> func="qio_channel_socket_readv", os_errno=<unavailable>, fmt="Unable
>>>>> to read from socket") + 219 at error.c:111
>>>>>      frame #6: 0x0000000100007ba5
>>>>> tpm-tis-test`qio_channel_socket_readv(ioc=<unavailable>,
>>>>> iov=<unavailable>, niov=<unavailable>, fds=0x0000000000000000,
>>>>> nfds=0x0000000000000000, errp=0x000000010006c930) + 341 at
>>>>> channel-socket.c:493
>>>>>      frame #7: 0x0000000100004717 tpm-tis-test`qio_channel_read
>>>>> [inlined] qio_channel_readv_full(ioc=0x00000001007006b0,
>>>>> iov=<unavailable>, niov=1, fds=<unavailable>, nfds=<unavailable>,
>>>>> errp=0x000000010006c930) + 62 at channel.c:65
>>>>>      frame #8: 0x00000001000046d9
>>>>> tpm-tis-test`qio_channel_read(ioc=0x00000001007006b0,
>>>>> buf=<unavailable>, buflen=<unavailable>, errp=<unavailable>) + 41 at
>>>>> channel.c:216
>>>>>      frame #9: 0x0000000100003dd1
>>>>> tpm-tis-test`tpm_emu_tpm_thread(data=0x00007ffeefbff0e8) + 241 at
>>>>> tpm-emu.c:41
>>>>>      frame #10: 0x00000001001b2ec0
>>>>> libglib-2.0.0.dylib`g_thread_create_proxy + 191
>>>>>      frame #11: 0x00007fff7ecd26c1 
>>>>> libsystem_pthread.dylib`_pthread_body + 340
>>>>>      frame #12: 0x00007fff7ecd256d 
>>>>> libsystem_pthread.dylib`_pthread_start + 377
>>>>>      frame #13: 0x00007fff7ecd1c5d 
>>>>> libsystem_pthread.dylib`thread_start + 13
>>>>>
>>>>>
>>>>> My guess is that the problem here is that the tpm_emu_ctrl_thread 
>>>>> (thread 3) is
>>>>> forcibly closing the channel, which causes the tpm_emu_thread 
>>>>> (thread 2)
>>>>> to abort because its read returned an error.
>>>> At least the tpm_emu_tpm_thread() there is only something in the test
>>>> suite, so the real system emulator code isn't at risk of crashing.
>>>>
>>>> Feels like the thread simply should *not* use error_abort, and instead
>>>> have a more graceful way to exit when the socket closes
>>>>
>>> The code expects the read() to return 0 on disconnect, not an error.
>>> Apparently this works on !osx. Should we adapt qio-channel-socket to
>>> return 0 in this case on osx too?
>> Oh I see, it calls close() on the same end, that's not correct. I
>> wonder if shutdown would be better. Other suggestions?
>>
> We could send the thread a special message, like 0xff ff ff ff, and 
> that terminates it...

... wrong end of socket, so doesn't work. Other way would be to pass a 
pipe to the TPM emulator thread and have it poll on the pipefd and the 
channelfd and terminate upon pipefd reception...


>
>     Stefan
>
>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [Qemu-devel] tpm-tis-test and tpm-crb-test crash on OSX
  2018-03-16 14:08           ` Stefan Berger
@ 2018-03-16 14:09             ` Marc-André Lureau
  2018-03-16 14:35               ` Stefan Berger
  0 siblings, 1 reply; 11+ messages in thread
From: Marc-André Lureau @ 2018-03-16 14:09 UTC (permalink / raw)
  To: Stefan Berger; +Cc: Daniel P. Berrangé, Peter Maydell, QEMU Developers

Hi

On Fri, Mar 16, 2018 at 3:08 PM, Stefan Berger
<stefanb@linux.vnet.ibm.com> wrote:
> On 03/16/2018 09:45 AM, Stefan Berger wrote:
>>
>> On 03/16/2018 09:41 AM, Marc-André Lureau wrote:
>>>
>>> Hi
>>>
>>> On Fri, Mar 16, 2018 at 2:37 PM, Marc-André Lureau
>>> <marcandre.lureau@gmail.com> wrote:
>>>>
>>>> Hi
>>>>
>>>> On Fri, Mar 16, 2018 at 2:27 PM, Daniel P. Berrangé
>>>> <berrange@redhat.com> wrote:
>>>>>
>>>>> On Fri, Mar 16, 2018 at 01:24:53PM +0000, Peter Maydell wrote:
>>>>>>
>>>>>> On 16 March 2018 at 13:12, Peter Maydell <peter.maydell@linaro.org>
>>>>>> wrote:
>>>>>>>
>>>>>>> On OSX host, I noticed that tpm-tis-test and tpm-crb-test
>>>>>>> both crash on OSX, hitting an error_abort case:
>>>>>>>
>>>>>>> (lldb) run
>>>>>>> Process 65115 launched:
>>>>>>> '/Users/pm215/src/qemu-for-merges/build/all/tests/tpm-tis-test'
>>>>>>> (x86_64)
>>>>>>> /i386/tpm-tis/test_check_localities: OK
>>>>>>> /i386/tpm-tis/test_check_access_reg: OK
>>>>>>> /i386/tpm-tis/test_check_access_reg_seize: OK
>>>>>>> /i386/tpm-tis/test_check_access_reg_release: OK
>>>>>>> /i386/tpm-tis/test_check_transmit: OK
>>>>>>> Unexpected error in qio_channel_socket_readv() at
>>>>>>> /Users/pm215/src/qemu-for-merges/io/channel-socket.c:494:
>>>>>>> Unable to read from socket: Bad file descriptor
>>>>>>>
>>>>>>> Here's a backtrace from tpm-tis-test:
>>>>>>
>>>>>> Dan suggested a race condition, which prompted me to get an
>>>>>> all-threads backtrace:
>>>>>>
>>>>>>    thread #1: tid = 0xb50f19, 0x00007fff7eb97502
>>>>>> libsystem_kernel.dylib`__wait4 + 10, queue = 'com.apple.main-thread'
>>>>>>      frame #0: 0x00007fff7eb97502 libsystem_kernel.dylib`__wait4 + 10
>>>>>>      frame #1: 0x000000010001b303 tpm-tis-test`qtest_quit [inlined]
>>>>>> kill_qemu(s=<unavailable>) + 99 at libqtest.c:107
>>>>>>      frame #2: 0x000000010001b2df
>>>>>> tpm-tis-test`qtest_quit(s=0x0000000100404c60) + 63 at libqtest.c:280
>>>>>>      frame #3: 0x0000000100001bd1 tpm-tis-test`main [inlined]
>>>>>> qtest_end
>>>>>> + 9 at libqtest.h:555
>>>>>>      frame #4: 0x0000000100001bc8
>>>>>> tpm-tis-test`main(argc=<unavailable>,
>>>>>> argv=<unavailable>) + 520 at tpm-tis-test.c:477
>>>>>>      frame #5: 0x00007fff7ea47115 libdyld.dylib`start + 1
>>>>>>      frame #6: 0x00007fff7ea47115 libdyld.dylib`start + 1
>>>>>>
>>>>>>    thread #3: tid = 0xb50f4a, 0x00007fff7eb977d2
>>>>>> libsystem_kernel.dylib`close + 10
>>>>>>      frame #0: 0x00007fff7eb977d2 libsystem_kernel.dylib`close + 10
>>>>>>      frame #1: 0x0000000100007def
>>>>>> tpm-tis-test`qio_channel_socket_close(ioc=<unavailable>,
>>>>>> errp=0x000000010006c930) + 63 at channel-socket.c:693
>>>>>>      frame #2: 0x00000001000039f9
>>>>>> tpm-tis-test`tpm_emu_ctrl_thread(data=0x00007ffeefbff0e8) + 713 at
>>>>>> tpm-emu.c:128
>>>>>>      frame #3: 0x00000001001b2ec0
>>>>>> libglib-2.0.0.dylib`g_thread_create_proxy + 191
>>>>>>      frame #4: 0x00007fff7ecd26c1
>>>>>> libsystem_pthread.dylib`_pthread_body + 340
>>>>>>      frame #5: 0x00007fff7ecd256d
>>>>>> libsystem_pthread.dylib`_pthread_start + 377
>>>>>>      frame #6: 0x00007fff7ecd1c5d libsystem_pthread.dylib`thread_start
>>>>>> + 13
>>>>>>
>>>>>> * thread #2: tid = 0xb50f50, 0x00007fff7eb96e3e
>>>>>> libsystem_kernel.dylib`__pthread_kill + 10
>>>>>>    * frame #0: 0x00007fff7eb96e3e
>>>>>> libsystem_kernel.dylib`__pthread_kill + 10
>>>>>>      frame #1: 0x00007fff7ecd5150 libsystem_pthread.dylib`pthread_kill
>>>>>> + 333
>>>>>>      frame #2: 0x00007fff7eaf3312 libsystem_c.dylib`abort + 127
>>>>>>      frame #3: 0x0000000100043431 tpm-tis-test`error_setv [inlined]
>>>>>> error_handle_fatal(errp=<unavailable>) + 43 at error.c:38
>>>>>>      frame #4: 0x0000000100043406
>>>>>> tpm-tis-test`error_setv(errp=<unavailable>, src=<unavailable>,
>>>>>> line=<unavailable>, func=<unavailable>,
>>>>>> err_class=ERROR_CLASS_GENERIC_ERROR, fmt=<unavailable>,
>>>>>> ap=<unavailable>, suffix=<unavailable>) + 246 at error.c:71
>>>>>>      frame #5: 0x00000001000435db
>>>>>> tpm-tis-test`error_setg_errno_internal(errp=0x000000010006c930,
>>>>>> src="/Users/pm215/src/qemu-for-merges/io/channel-socket.c", line=494,
>>>>>> func="qio_channel_socket_readv", os_errno=<unavailable>, fmt="Unable
>>>>>> to read from socket") + 219 at error.c:111
>>>>>>      frame #6: 0x0000000100007ba5
>>>>>> tpm-tis-test`qio_channel_socket_readv(ioc=<unavailable>,
>>>>>> iov=<unavailable>, niov=<unavailable>, fds=0x0000000000000000,
>>>>>> nfds=0x0000000000000000, errp=0x000000010006c930) + 341 at
>>>>>> channel-socket.c:493
>>>>>>      frame #7: 0x0000000100004717 tpm-tis-test`qio_channel_read
>>>>>> [inlined] qio_channel_readv_full(ioc=0x00000001007006b0,
>>>>>> iov=<unavailable>, niov=1, fds=<unavailable>, nfds=<unavailable>,
>>>>>> errp=0x000000010006c930) + 62 at channel.c:65
>>>>>>      frame #8: 0x00000001000046d9
>>>>>> tpm-tis-test`qio_channel_read(ioc=0x00000001007006b0,
>>>>>> buf=<unavailable>, buflen=<unavailable>, errp=<unavailable>) + 41 at
>>>>>> channel.c:216
>>>>>>      frame #9: 0x0000000100003dd1
>>>>>> tpm-tis-test`tpm_emu_tpm_thread(data=0x00007ffeefbff0e8) + 241 at
>>>>>> tpm-emu.c:41
>>>>>>      frame #10: 0x00000001001b2ec0
>>>>>> libglib-2.0.0.dylib`g_thread_create_proxy + 191
>>>>>>      frame #11: 0x00007fff7ecd26c1
>>>>>> libsystem_pthread.dylib`_pthread_body + 340
>>>>>>      frame #12: 0x00007fff7ecd256d
>>>>>> libsystem_pthread.dylib`_pthread_start + 377
>>>>>>      frame #13: 0x00007fff7ecd1c5d
>>>>>> libsystem_pthread.dylib`thread_start + 13
>>>>>>
>>>>>>
>>>>>> My guess is that the problem here is that the tpm_emu_ctrl_thread
>>>>>> (thread 3) is
>>>>>> forcibly closing the channel, which causes the tpm_emu_thread (thread
>>>>>> 2)
>>>>>> to abort because its read returned an error.
>>>>>
>>>>> At least the tpm_emu_tpm_thread() there is only something in the test
>>>>> suite, so the real system emulator code isn't at risk of crashing.
>>>>>
>>>>> Feels like the thread simply should *not* use error_abort, and instead
>>>>> have a more graceful way to exit when the socket closes
>>>>>
>>>> The code expects the read() to return 0 on disconnect, not an error.
>>>> Apparently this works on !osx. Should we adapt qio-channel-socket to
>>>> return 0 in this case on osx too?
>>>
>>> Oh I see, it calls close() on the same end, that's not correct. I
>>> wonder if shutdown would be better. Other suggestions?
>>>
>> We could send the thread a special message, like 0xff ff ff ff, and that
>> terminates it...
>
>
> ... wrong end of socket, so doesn't work. Other way would be to pass a pipe
> to the TPM emulator thread and have it poll on the pipefd and the channelfd
> and terminate upon pipefd reception...
>

Or close the other end on tpm_emulator_shutdown(), I am trying this
approach now.



-- 
Marc-André Lureau

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [Qemu-devel] tpm-tis-test and tpm-crb-test crash on OSX
  2018-03-16 13:45         ` Stefan Berger
  2018-03-16 14:08           ` Stefan Berger
@ 2018-03-16 14:18           ` Daniel P. Berrangé
  1 sibling, 0 replies; 11+ messages in thread
From: Daniel P. Berrangé @ 2018-03-16 14:18 UTC (permalink / raw)
  To: Stefan Berger; +Cc: Marc-André Lureau, Peter Maydell, QEMU Developers

On Fri, Mar 16, 2018 at 09:45:59AM -0400, Stefan Berger wrote:
> On 03/16/2018 09:41 AM, Marc-André Lureau wrote:
> > Hi
> > 
> > On Fri, Mar 16, 2018 at 2:37 PM, Marc-André Lureau
> > <marcandre.lureau@gmail.com> wrote:
> > > Hi
> > > 
> > > On Fri, Mar 16, 2018 at 2:27 PM, Daniel P. Berrangé <berrange@redhat.com> wrote:
> > > > At least the tpm_emu_tpm_thread() there is only something in the test
> > > > suite, so the real system emulator code isn't at risk of crashing.
> > > > 
> > > > Feels like the thread simply should *not* use error_abort, and instead
> > > > have a more graceful way to exit when the socket closes
> > > > 
> > > The code expects the read() to return 0 on disconnect, not an error.
> > > Apparently this works on !osx. Should we adapt qio-channel-socket to
> > > return 0 in this case on osx too?
> > Oh I see, it calls close() on the same end, that's not correct. I
> > wonder if shutdown would be better. Other suggestions?
> > 
> We could send the thread a special message, like 0xff ff ff ff, and that
> terminates it...

If there's no resource cleanup in this test suite thread to worry about
could just  pthread_cancel() it

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [Qemu-devel] tpm-tis-test and tpm-crb-test crash on OSX
  2018-03-16 14:09             ` Marc-André Lureau
@ 2018-03-16 14:35               ` Stefan Berger
  0 siblings, 0 replies; 11+ messages in thread
From: Stefan Berger @ 2018-03-16 14:35 UTC (permalink / raw)
  To: Marc-André Lureau; +Cc: Peter Maydell, QEMU Developers

On 03/16/2018 10:09 AM, Marc-André Lureau wrote:
> Hi
>
> On Fri, Mar 16, 2018 at 3:08 PM, Stefan Berger
> <stefanb@linux.vnet.ibm.com> wrote:
>> On 03/16/2018 09:45 AM, Stefan Berger wrote:
>>> On 03/16/2018 09:41 AM, Marc-André Lureau wrote:
>>>> Hi
>>>>
>>>> On Fri, Mar 16, 2018 at 2:37 PM, Marc-André Lureau
>>>> <marcandre.lureau@gmail.com> wrote:
>>>>> Hi
>>>>>
>>>>> On Fri, Mar 16, 2018 at 2:27 PM, Daniel P. Berrangé
>>>>> <berrange@redhat.com> wrote:
>>>>>> On Fri, Mar 16, 2018 at 01:24:53PM +0000, Peter Maydell wrote:
>>>>>>> On 16 March 2018 at 13:12, Peter Maydell <peter.maydell@linaro.org>
>>>>>>> wrote:
>>>>>>>> On OSX host, I noticed that tpm-tis-test and tpm-crb-test
>>>>>>>> both crash on OSX, hitting an error_abort case:
>>>>>>>>
>>>>>>>> (lldb) run
>>>>>>>> Process 65115 launched:
>>>>>>>> '/Users/pm215/src/qemu-for-merges/build/all/tests/tpm-tis-test'
>>>>>>>> (x86_64)
>>>>>>>> /i386/tpm-tis/test_check_localities: OK
>>>>>>>> /i386/tpm-tis/test_check_access_reg: OK
>>>>>>>> /i386/tpm-tis/test_check_access_reg_seize: OK
>>>>>>>> /i386/tpm-tis/test_check_access_reg_release: OK
>>>>>>>> /i386/tpm-tis/test_check_transmit: OK
>>>>>>>> Unexpected error in qio_channel_socket_readv() at
>>>>>>>> /Users/pm215/src/qemu-for-merges/io/channel-socket.c:494:
>>>>>>>> Unable to read from socket: Bad file descriptor
>>>>>>>>
>>>>>>>> Here's a backtrace from tpm-tis-test:
>>>>>>> Dan suggested a race condition, which prompted me to get an
>>>>>>> all-threads backtrace:
>>>>>>>
>>>>>>>     thread #1: tid = 0xb50f19, 0x00007fff7eb97502
>>>>>>> libsystem_kernel.dylib`__wait4 + 10, queue = 'com.apple.main-thread'
>>>>>>>       frame #0: 0x00007fff7eb97502 libsystem_kernel.dylib`__wait4 + 10
>>>>>>>       frame #1: 0x000000010001b303 tpm-tis-test`qtest_quit [inlined]
>>>>>>> kill_qemu(s=<unavailable>) + 99 at libqtest.c:107
>>>>>>>       frame #2: 0x000000010001b2df
>>>>>>> tpm-tis-test`qtest_quit(s=0x0000000100404c60) + 63 at libqtest.c:280
>>>>>>>       frame #3: 0x0000000100001bd1 tpm-tis-test`main [inlined]
>>>>>>> qtest_end
>>>>>>> + 9 at libqtest.h:555
>>>>>>>       frame #4: 0x0000000100001bc8
>>>>>>> tpm-tis-test`main(argc=<unavailable>,
>>>>>>> argv=<unavailable>) + 520 at tpm-tis-test.c:477
>>>>>>>       frame #5: 0x00007fff7ea47115 libdyld.dylib`start + 1
>>>>>>>       frame #6: 0x00007fff7ea47115 libdyld.dylib`start + 1
>>>>>>>
>>>>>>>     thread #3: tid = 0xb50f4a, 0x00007fff7eb977d2
>>>>>>> libsystem_kernel.dylib`close + 10
>>>>>>>       frame #0: 0x00007fff7eb977d2 libsystem_kernel.dylib`close + 10
>>>>>>>       frame #1: 0x0000000100007def
>>>>>>> tpm-tis-test`qio_channel_socket_close(ioc=<unavailable>,
>>>>>>> errp=0x000000010006c930) + 63 at channel-socket.c:693
>>>>>>>       frame #2: 0x00000001000039f9
>>>>>>> tpm-tis-test`tpm_emu_ctrl_thread(data=0x00007ffeefbff0e8) + 713 at
>>>>>>> tpm-emu.c:128
>>>>>>>       frame #3: 0x00000001001b2ec0
>>>>>>> libglib-2.0.0.dylib`g_thread_create_proxy + 191
>>>>>>>       frame #4: 0x00007fff7ecd26c1
>>>>>>> libsystem_pthread.dylib`_pthread_body + 340
>>>>>>>       frame #5: 0x00007fff7ecd256d
>>>>>>> libsystem_pthread.dylib`_pthread_start + 377
>>>>>>>       frame #6: 0x00007fff7ecd1c5d libsystem_pthread.dylib`thread_start
>>>>>>> + 13
>>>>>>>
>>>>>>> * thread #2: tid = 0xb50f50, 0x00007fff7eb96e3e
>>>>>>> libsystem_kernel.dylib`__pthread_kill + 10
>>>>>>>     * frame #0: 0x00007fff7eb96e3e
>>>>>>> libsystem_kernel.dylib`__pthread_kill + 10
>>>>>>>       frame #1: 0x00007fff7ecd5150 libsystem_pthread.dylib`pthread_kill
>>>>>>> + 333
>>>>>>>       frame #2: 0x00007fff7eaf3312 libsystem_c.dylib`abort + 127
>>>>>>>       frame #3: 0x0000000100043431 tpm-tis-test`error_setv [inlined]
>>>>>>> error_handle_fatal(errp=<unavailable>) + 43 at error.c:38
>>>>>>>       frame #4: 0x0000000100043406
>>>>>>> tpm-tis-test`error_setv(errp=<unavailable>, src=<unavailable>,
>>>>>>> line=<unavailable>, func=<unavailable>,
>>>>>>> err_class=ERROR_CLASS_GENERIC_ERROR, fmt=<unavailable>,
>>>>>>> ap=<unavailable>, suffix=<unavailable>) + 246 at error.c:71
>>>>>>>       frame #5: 0x00000001000435db
>>>>>>> tpm-tis-test`error_setg_errno_internal(errp=0x000000010006c930,
>>>>>>> src="/Users/pm215/src/qemu-for-merges/io/channel-socket.c", line=494,
>>>>>>> func="qio_channel_socket_readv", os_errno=<unavailable>, fmt="Unable
>>>>>>> to read from socket") + 219 at error.c:111
>>>>>>>       frame #6: 0x0000000100007ba5
>>>>>>> tpm-tis-test`qio_channel_socket_readv(ioc=<unavailable>,
>>>>>>> iov=<unavailable>, niov=<unavailable>, fds=0x0000000000000000,
>>>>>>> nfds=0x0000000000000000, errp=0x000000010006c930) + 341 at
>>>>>>> channel-socket.c:493
>>>>>>>       frame #7: 0x0000000100004717 tpm-tis-test`qio_channel_read
>>>>>>> [inlined] qio_channel_readv_full(ioc=0x00000001007006b0,
>>>>>>> iov=<unavailable>, niov=1, fds=<unavailable>, nfds=<unavailable>,
>>>>>>> errp=0x000000010006c930) + 62 at channel.c:65
>>>>>>>       frame #8: 0x00000001000046d9
>>>>>>> tpm-tis-test`qio_channel_read(ioc=0x00000001007006b0,
>>>>>>> buf=<unavailable>, buflen=<unavailable>, errp=<unavailable>) + 41 at
>>>>>>> channel.c:216
>>>>>>>       frame #9: 0x0000000100003dd1
>>>>>>> tpm-tis-test`tpm_emu_tpm_thread(data=0x00007ffeefbff0e8) + 241 at
>>>>>>> tpm-emu.c:41
>>>>>>>       frame #10: 0x00000001001b2ec0
>>>>>>> libglib-2.0.0.dylib`g_thread_create_proxy + 191
>>>>>>>       frame #11: 0x00007fff7ecd26c1
>>>>>>> libsystem_pthread.dylib`_pthread_body + 340
>>>>>>>       frame #12: 0x00007fff7ecd256d
>>>>>>> libsystem_pthread.dylib`_pthread_start + 377
>>>>>>>       frame #13: 0x00007fff7ecd1c5d
>>>>>>> libsystem_pthread.dylib`thread_start + 13
>>>>>>>
>>>>>>>
>>>>>>> My guess is that the problem here is that the tpm_emu_ctrl_thread
>>>>>>> (thread 3) is
>>>>>>> forcibly closing the channel, which causes the tpm_emu_thread (thread
>>>>>>> 2)
>>>>>>> to abort because its read returned an error.
>>>>>> At least the tpm_emu_tpm_thread() there is only something in the test
>>>>>> suite, so the real system emulator code isn't at risk of crashing.
>>>>>>
>>>>>> Feels like the thread simply should *not* use error_abort, and instead
>>>>>> have a more graceful way to exit when the socket closes
>>>>>>
>>>>> The code expects the read() to return 0 on disconnect, not an error.
>>>>> Apparently this works on !osx. Should we adapt qio-channel-socket to
>>>>> return 0 in this case on osx too?
>>>> Oh I see, it calls close() on the same end, that's not correct. I
>>>> wonder if shutdown would be better. Other suggestions?
>>>>
>>> We could send the thread a special message, like 0xff ff ff ff, and that
>>> terminates it...
>>
>> ... wrong end of socket, so doesn't work. Other way would be to pass a pipe
>> to the TPM emulator thread and have it poll on the pipefd and the channelfd
>> and terminate upon pipefd reception...
>>
> Or close the other end on tpm_emulator_shutdown(), I am trying this
> approach now.

We have this here in the tpm_emulator_finalize():

static void tpm_emulator_inst_finalize(Object *obj)
{
     TPMEmulator *tpm_emu = TPM_EMULATOR(obj);

     tpm_emulator_shutdown(tpm_emu);

     printf("Dereferencing data_ioc\n");
     object_unref(OBJECT(tpm_emu->data_ioc));
     printf("DONE Dereferencing data_ioc\n");

     qemu_chr_fe_deinit(&tpm_emu->ctrl_chr, false);

object_unref() seems to close the socket. So maybe reordering the join 
and channel_close() does the trick?!I don't have OsX...

diff --git a/tests/tpm-emu.c b/tests/tpm-emu.c
index 4dada76..6fc4e3d 100644
--- a/tests/tpm-emu.c
+++ b/tests/tpm-emu.c
@@ -125,8 +125,8 @@ void *tpm_emu_ctrl_thread(void *data)
          case CMD_SHUTDOWN: {
              ptm_res res = 0;
              qio_channel_write(ioc, (char *)&res, sizeof(res), 
&error_abort);
-            qio_channel_close(s->tpm_ioc, &error_abort);
              g_thread_join(s->emu_tpm_thread);
+            qio_channel_close(s->tpm_ioc, &error_abort);
              break;
          }
          case CMD_STOP: {


    Stefan

^ permalink raw reply related	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2018-03-16 14:35 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-03-16 13:12 [Qemu-devel] tpm-tis-test and tpm-crb-test crash on OSX Peter Maydell
2018-03-16 13:24 ` Peter Maydell
2018-03-16 13:27   ` Daniel P. Berrangé
2018-03-16 13:37     ` Marc-André Lureau
2018-03-16 13:41       ` Marc-André Lureau
2018-03-16 13:45         ` Stefan Berger
2018-03-16 14:08           ` Stefan Berger
2018-03-16 14:09             ` Marc-André Lureau
2018-03-16 14:35               ` Stefan Berger
2018-03-16 14:18           ` Daniel P. Berrangé
2018-03-16 13:43       ` Daniel P. Berrangé

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.