qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* logfile issue
@ 2019-10-31 17:24 Robert Foley
  2019-11-04 12:13 ` Alex Bennée
  0 siblings, 1 reply; 3+ messages in thread
From: Robert Foley @ 2019-10-31 17:24 UTC (permalink / raw)
  To: qemu-devel; +Cc: Richard Henderson, Alex Bennée, Peter Puhov

We hit an issue when trying to change the log file from the monitor
console.  The root of the issue here is that the qemu_logfile handle
is not thread safe.  So when we try to close the file, we end up with
a seg fault.  The full analysis is below along with some possible
solutions.
Will plan to post a patch soon, but any comments or opinions on our
proposed solution would be appreciated.  Thanks.

The version of QEMU we are using is: master as of about Oct 15,
9020e9526cd08c4dc99d54dba48730de2908c970.

This is what we did to reproduce the issue.
First we enable logging and select the log file.
(qemu) log in_asm,out_asm,op
(qemu) logfile asm.out

Then we start this command in the guest.  This just keeps the guest
performing operations that result in logging to be constantly
generated.
$ for i in {0..1000}; do ls -l; done

Next we switch to the monitor console and change the file.
(qemu) logfile asm_new.log

This action causes a seg fault.  Please see the stack trace (below).

The code, which changes the log file unconditionally
(qemu_set_log_filename()), closes the qemu_logfile, sets it to NULL,
and then opens the new file.
Since the file handle is still in use, we end up with a seg fault when
the code which is trying to log ends up using a NULL file handle.

We are considering a few solutions.

A straightforward solution would be to simply prevent the file from
being changed while logging is enabled.  In other words, force the
user to first disable logging before changing the log file.
This solution seems to cover the general case.  However, if a user
were to disable logging and change the log file in quick succession,
we would still be subject to a similar race.  A log call could still
make it through the logging enable check and proceed to use a file
handle that gets changed to NULL.

Another option is to add a mutex to prevent the qemu_logfile handle
from being changed while it is in use.  This certainly works and has
the advantage of being fairly straightforward.  Also we are thinking
that since the mutex would only be used when logging is enabled it has
the advantage of not having an effect on the normal case performance.
Another option is to implement a simple atomic ref count and prevent
the file from being changed while there are outstanding references.

We are leaning towards the mutex option, and plan to post a patch
soon, but would appreciate comments or opinions on this solution.

Thanks,
Rob Foley

stack trace
==========================================
Thread 10 "qemu-system-aar" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0xffff113f9d90 (LWP 9493)] __flockfile
(stream=0x0) at ../sysdeps/pthread/flockfile.c:27
27 ../sysdeps/pthread/flockfile.c: No such file or directory.
(gdb) bt
#0  __flockfile (stream=0x0) at ../sysdeps/pthread/flockfile.c:27
#1  0x0000aaaae0fac8b8 in qemu_flockfile (f=<optimized out>) at
/home/rob/qemu/qemu_unchanged/include/sysemu/os-posix.h:87
#2  qemu_log_lock () at /home/rob/qemu/qemu_unchanged/include/qemu/log.h:57
#3  translator_loop (ops=0xaaaae17f1348 <aarch64_translator_ops>,
db=0xffff113f9088, db@entry=0xffff113f9098,
cpu=cpu@entry=0xaaab0a52bc50,
    tb=tb@entry=0xffff4c92d000 <code_gen_buffer+814927796>,
max_insns=max_insns@entry=512) at
/home/rob/qemu/qemu_unchanged/accel/tcg/translator.c:121
#4  0x0000aaaae10c1c18 in gen_intermediate_code
(cpu=cpu@entry=0xaaab0a52bc50, tb=tb@entry=0xffff4c92d000
<code_gen_buffer+814927796>, max_insns=max_insns@entry=512)
    at /home/rob/qemu/qemu_unchanged/target/arm/translate.c:11320
#5  0x0000aaaae0fab248 in tb_gen_code (cpu=0xaaab0a52bc50,
cpu@entry=0xffffabe2a000, pc=187650897458448, cs_base=65536,
flags=43690, cflags=-16252928, cflags@entry=524288)
    at /home/rob/qemu/qemu_unchanged/accel/tcg/translate-all.c:1738
#6  0x0000aaaae0fa8e74 in tb_find (cf_mask=524288, tb_exit=0,
last_tb=0xffff4c92cc40 <code_gen_buffer+814926836>,
cpu=0xffffabe2a000)
    at /home/rob/qemu/qemu_unchanged/accel/tcg/cpu-exec.c:408
#7  cpu_exec (cpu=0xffffabe2a000, cpu@entry=0xaaab0a52bc50) at
/home/rob/qemu/qemu_unchanged/accel/tcg/cpu-exec.c:730
#8  0x0000aaaae0f6de24 in tcg_cpu_exec (cpu=0xaaab0a52bc50) at
/home/rob/qemu/qemu_unchanged/cpus.c:1454
#9  0x0000aaaae0f70908 in qemu_tcg_cpu_thread_fn (arg=0xaaab0a52bc50)
at /home/rob/qemu/qemu_unchanged/cpus.c:1762
#10 0x0000aaaae145bd38 in qemu_thread_start (args=<optimized out>) at
/home/rob/qemu/qemu_unchanged/util/qemu-thread-posix.c:519
#11 0x0000ffffabe0a088 in start_thread (arg=0xffffcc20410f) at
pthread_create.c:463
#12 0x0000ffffabd7a4ec in thread_start () at
../sysdeps/unix/sysv/linux/aarch64/clone.S:78


^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: logfile issue
  2019-10-31 17:24 logfile issue Robert Foley
@ 2019-11-04 12:13 ` Alex Bennée
  2019-11-06 15:46   ` Robert Foley
  0 siblings, 1 reply; 3+ messages in thread
From: Alex Bennée @ 2019-11-04 12:13 UTC (permalink / raw)
  To: Robert Foley; +Cc: Peter Puhov, Richard Henderson, qemu-devel


Robert Foley <robert.foley@linaro.org> writes:

> We hit an issue when trying to change the log file from the monitor
> console.  The root of the issue here is that the qemu_logfile handle
> is not thread safe.  So when we try to close the file, we end up with
> a seg fault.  The full analysis is below along with some possible
> solutions.
> Will plan to post a patch soon, but any comments or opinions on our
> proposed solution would be appreciated.  Thanks.
>
> The version of QEMU we are using is: master as of about Oct 15,
> 9020e9526cd08c4dc99d54dba48730de2908c970.
>
> This is what we did to reproduce the issue.
> First we enable logging and select the log file.
> (qemu) log in_asm,out_asm,op
> (qemu) logfile asm.out
>
> Then we start this command in the guest.  This just keeps the guest
> performing operations that result in logging to be constantly
> generated.
> $ for i in {0..1000}; do ls -l; done
>
> Next we switch to the monitor console and change the file.
> (qemu) logfile asm_new.log
>
> This action causes a seg fault.  Please see the stack trace (below).
>
> The code, which changes the log file unconditionally
> (qemu_set_log_filename()), closes the qemu_logfile, sets it to NULL,
> and then opens the new file.
> Since the file handle is still in use, we end up with a seg fault when
> the code which is trying to log ends up using a NULL file handle.

Yep, good catch.

> We are considering a few solutions.
>
> A straightforward solution would be to simply prevent the file from
> being changed while logging is enabled.  In other words, force the
> user to first disable logging before changing the log file.
> This solution seems to cover the general case.  However, if a user
> were to disable logging and change the log file in quick succession,
> we would still be subject to a similar race.  A log call could still
> make it through the logging enable check and proceed to use a file
> handle that gets changed to NULL.
>
> Another option is to add a mutex to prevent the qemu_logfile handle
> from being changed while it is in use.  This certainly works and has
> the advantage of being fairly straightforward.  Also we are thinking
> that since the mutex would only be used when logging is enabled it has
> the advantage of not having an effect on the normal case performance.
> Another option is to implement a simple atomic ref count and prevent
> the file from being changed while there are outstanding references.

The mutex is the simplest but I wonder if it show up on heavy usage? The
log file is already doing file locking for the areas that want
contiguous log statements.

I wonder if using RCU to swap out the new and old FD would make things a
bit smoother here? You would have to tweak the qemu_log_lock/unlock
functions to preserve the current FD around the lock and call_rcu a
freeing function when a new handle is set.

> We are leaning towards the mutex option, and plan to post a patch
> soon, but would appreciate comments or opinions on this solution.
>
> Thanks,
> Rob Foley
>
> stack trace
> ==========================================
> Thread 10 "qemu-system-aar" received signal SIGSEGV, Segmentation fault.
> [Switching to Thread 0xffff113f9d90 (LWP 9493)] __flockfile
> (stream=0x0) at ../sysdeps/pthread/flockfile.c:27
> 27 ../sysdeps/pthread/flockfile.c: No such file or directory.
> (gdb) bt
> #0  __flockfile (stream=0x0) at ../sysdeps/pthread/flockfile.c:27
> #1  0x0000aaaae0fac8b8 in qemu_flockfile (f=<optimized out>) at
> /home/rob/qemu/qemu_unchanged/include/sysemu/os-posix.h:87
> #2  qemu_log_lock () at /home/rob/qemu/qemu_unchanged/include/qemu/log.h:57
> #3  translator_loop (ops=0xaaaae17f1348 <aarch64_translator_ops>,
> db=0xffff113f9088, db@entry=0xffff113f9098,
> cpu=cpu@entry=0xaaab0a52bc50,
>     tb=tb@entry=0xffff4c92d000 <code_gen_buffer+814927796>,
> max_insns=max_insns@entry=512) at
> /home/rob/qemu/qemu_unchanged/accel/tcg/translator.c:121
> #4  0x0000aaaae10c1c18 in gen_intermediate_code
> (cpu=cpu@entry=0xaaab0a52bc50, tb=tb@entry=0xffff4c92d000
> <code_gen_buffer+814927796>, max_insns=max_insns@entry=512)
>     at /home/rob/qemu/qemu_unchanged/target/arm/translate.c:11320
> #5  0x0000aaaae0fab248 in tb_gen_code (cpu=0xaaab0a52bc50,
> cpu@entry=0xffffabe2a000, pc=187650897458448, cs_base=65536,
> flags=43690, cflags=-16252928, cflags@entry=524288)
>     at /home/rob/qemu/qemu_unchanged/accel/tcg/translate-all.c:1738
> #6  0x0000aaaae0fa8e74 in tb_find (cf_mask=524288, tb_exit=0,
> last_tb=0xffff4c92cc40 <code_gen_buffer+814926836>,
> cpu=0xffffabe2a000)
>     at /home/rob/qemu/qemu_unchanged/accel/tcg/cpu-exec.c:408
> #7  cpu_exec (cpu=0xffffabe2a000, cpu@entry=0xaaab0a52bc50) at
> /home/rob/qemu/qemu_unchanged/accel/tcg/cpu-exec.c:730
> #8  0x0000aaaae0f6de24 in tcg_cpu_exec (cpu=0xaaab0a52bc50) at
> /home/rob/qemu/qemu_unchanged/cpus.c:1454
> #9  0x0000aaaae0f70908 in qemu_tcg_cpu_thread_fn (arg=0xaaab0a52bc50)
> at /home/rob/qemu/qemu_unchanged/cpus.c:1762
> #10 0x0000aaaae145bd38 in qemu_thread_start (args=<optimized out>) at
> /home/rob/qemu/qemu_unchanged/util/qemu-thread-posix.c:519
> #11 0x0000ffffabe0a088 in start_thread (arg=0xffffcc20410f) at
> pthread_create.c:463
> #12 0x0000ffffabd7a4ec in thread_start () at
> ../sysdeps/unix/sysv/linux/aarch64/clone.S:78


--
Alex Bennée


^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: logfile issue
  2019-11-04 12:13 ` Alex Bennée
@ 2019-11-06 15:46   ` Robert Foley
  0 siblings, 0 replies; 3+ messages in thread
From: Robert Foley @ 2019-11-06 15:46 UTC (permalink / raw)
  To: Alex Bennée; +Cc: Peter Puhov, Richard Henderson, qemu-devel

On Mon, 4 Nov 2019 at 07:13, Alex Bennée <alex.bennee@linaro.org> wrote:

> I wonder if using RCU to swap out the new and old FD would make things a
> bit smoother here? You would have to tweak the qemu_log_lock/unlock
> functions to preserve the current FD around the lock and call_rcu a
> freeing function when a new handle is set.

Thanks for your comments on this.
We agree that using RCU is a great solution here.  We have this patch
well underway and plan to post a patch in the next few days.

-Rob

On Mon, 4 Nov 2019 at 07:13, Alex Bennée <alex.bennee@linaro.org> wrote:
>
>
> Robert Foley <robert.foley@linaro.org> writes:
>
> > We hit an issue when trying to change the log file from the monitor
> > console.  The root of the issue here is that the qemu_logfile handle
> > is not thread safe.  So when we try to close the file, we end up with
> > a seg fault.  The full analysis is below along with some possible
> > solutions.
> > Will plan to post a patch soon, but any comments or opinions on our
> > proposed solution would be appreciated.  Thanks.
> >
> > The version of QEMU we are using is: master as of about Oct 15,
> > 9020e9526cd08c4dc99d54dba48730de2908c970.
> >
> > This is what we did to reproduce the issue.
> > First we enable logging and select the log file.
> > (qemu) log in_asm,out_asm,op
> > (qemu) logfile asm.out
> >
> > Then we start this command in the guest.  This just keeps the guest
> > performing operations that result in logging to be constantly
> > generated.
> > $ for i in {0..1000}; do ls -l; done
> >
> > Next we switch to the monitor console and change the file.
> > (qemu) logfile asm_new.log
> >
> > This action causes a seg fault.  Please see the stack trace (below).
> >
> > The code, which changes the log file unconditionally
> > (qemu_set_log_filename()), closes the qemu_logfile, sets it to NULL,
> > and then opens the new file.
> > Since the file handle is still in use, we end up with a seg fault when
> > the code which is trying to log ends up using a NULL file handle.
>
> Yep, good catch.
>
> > We are considering a few solutions.
> >
> > A straightforward solution would be to simply prevent the file from
> > being changed while logging is enabled.  In other words, force the
> > user to first disable logging before changing the log file.
> > This solution seems to cover the general case.  However, if a user
> > were to disable logging and change the log file in quick succession,
> > we would still be subject to a similar race.  A log call could still
> > make it through the logging enable check and proceed to use a file
> > handle that gets changed to NULL.
> >
> > Another option is to add a mutex to prevent the qemu_logfile handle
> > from being changed while it is in use.  This certainly works and has
> > the advantage of being fairly straightforward.  Also we are thinking
> > that since the mutex would only be used when logging is enabled it has
> > the advantage of not having an effect on the normal case performance.
> > Another option is to implement a simple atomic ref count and prevent
> > the file from being changed while there are outstanding references.
>
> The mutex is the simplest but I wonder if it show up on heavy usage? The
> log file is already doing file locking for the areas that want
> contiguous log statements.
>
> I wonder if using RCU to swap out the new and old FD would make things a
> bit smoother here? You would have to tweak the qemu_log_lock/unlock
> functions to preserve the current FD around the lock and call_rcu a
> freeing function when a new handle is set.
>
> > We are leaning towards the mutex option, and plan to post a patch
> > soon, but would appreciate comments or opinions on this solution.
> >
> > Thanks,
> > Rob Foley
> >
> > stack trace
> > ==========================================
> > Thread 10 "qemu-system-aar" received signal SIGSEGV, Segmentation fault.
> > [Switching to Thread 0xffff113f9d90 (LWP 9493)] __flockfile
> > (stream=0x0) at ../sysdeps/pthread/flockfile.c:27
> > 27 ../sysdeps/pthread/flockfile.c: No such file or directory.
> > (gdb) bt
> > #0  __flockfile (stream=0x0) at ../sysdeps/pthread/flockfile.c:27
> > #1  0x0000aaaae0fac8b8 in qemu_flockfile (f=<optimized out>) at
> > /home/rob/qemu/qemu_unchanged/include/sysemu/os-posix.h:87
> > #2  qemu_log_lock () at /home/rob/qemu/qemu_unchanged/include/qemu/log.h:57
> > #3  translator_loop (ops=0xaaaae17f1348 <aarch64_translator_ops>,
> > db=0xffff113f9088, db@entry=0xffff113f9098,
> > cpu=cpu@entry=0xaaab0a52bc50,
> >     tb=tb@entry=0xffff4c92d000 <code_gen_buffer+814927796>,
> > max_insns=max_insns@entry=512) at
> > /home/rob/qemu/qemu_unchanged/accel/tcg/translator.c:121
> > #4  0x0000aaaae10c1c18 in gen_intermediate_code
> > (cpu=cpu@entry=0xaaab0a52bc50, tb=tb@entry=0xffff4c92d000
> > <code_gen_buffer+814927796>, max_insns=max_insns@entry=512)
> >     at /home/rob/qemu/qemu_unchanged/target/arm/translate.c:11320
> > #5  0x0000aaaae0fab248 in tb_gen_code (cpu=0xaaab0a52bc50,
> > cpu@entry=0xffffabe2a000, pc=187650897458448, cs_base=65536,
> > flags=43690, cflags=-16252928, cflags@entry=524288)
> >     at /home/rob/qemu/qemu_unchanged/accel/tcg/translate-all.c:1738
> > #6  0x0000aaaae0fa8e74 in tb_find (cf_mask=524288, tb_exit=0,
> > last_tb=0xffff4c92cc40 <code_gen_buffer+814926836>,
> > cpu=0xffffabe2a000)
> >     at /home/rob/qemu/qemu_unchanged/accel/tcg/cpu-exec.c:408
> > #7  cpu_exec (cpu=0xffffabe2a000, cpu@entry=0xaaab0a52bc50) at
> > /home/rob/qemu/qemu_unchanged/accel/tcg/cpu-exec.c:730
> > #8  0x0000aaaae0f6de24 in tcg_cpu_exec (cpu=0xaaab0a52bc50) at
> > /home/rob/qemu/qemu_unchanged/cpus.c:1454
> > #9  0x0000aaaae0f70908 in qemu_tcg_cpu_thread_fn (arg=0xaaab0a52bc50)
> > at /home/rob/qemu/qemu_unchanged/cpus.c:1762
> > #10 0x0000aaaae145bd38 in qemu_thread_start (args=<optimized out>) at
> > /home/rob/qemu/qemu_unchanged/util/qemu-thread-posix.c:519
> > #11 0x0000ffffabe0a088 in start_thread (arg=0xffffcc20410f) at
> > pthread_create.c:463
> > #12 0x0000ffffabd7a4ec in thread_start () at
> > ../sysdeps/unix/sysv/linux/aarch64/clone.S:78
>
>
> --
> Alex Bennée


^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2019-11-06 15:50 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-10-31 17:24 logfile issue Robert Foley
2019-11-04 12:13 ` Alex Bennée
2019-11-06 15:46   ` Robert Foley

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).