* qemu-kvm-1.0.1 - unable to exit if vcpu is in infinite loop
@ 2012-06-28 13:05 Peter Lieven
2012-06-28 13:25 ` Jan Kiszka
2012-08-06 15:11 ` [Qemu-devel] " Stefan Hajnoczi
0 siblings, 2 replies; 33+ messages in thread
From: Peter Lieven @ 2012-06-28 13:05 UTC (permalink / raw)
To: qemu-devel, kvm
Hi,
I debugged my initial problem further and found that the main thread is
stuck in pause_all_vcpus() on reset or quit commands in the monitor if
one cpu is stuck in the do-while loop in kvm_cpu_exec. If I modify the
condition from while (ret == 0)
to while ((ret == 0) && !env->stop); it works, but is this the right fix?
The "quit" command seems to work, but on "reset" the VM enters the pause state.
Thanks,
Peter
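
The loop-condition change described in this message can be pictured with a minimal C sketch. This is not the qemu-kvm source: struct vcpu_sketch, run_guest_once() and exec_loop_sketch() are hypothetical names, and the simulated guest never exits on its own. Only the condition (ret == 0) && !stop is taken from the message.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the per-vcpu state; only the stop flag
 * mentioned in the message is modelled. */
struct vcpu_sketch {
    volatile bool stop;  /* set when a pause of this vcpu is requested */
    int runs;            /* number of simulated guest entries so far */
};

/* Hypothetical stand-in for one guest entry. It always returns 0, i.e.
 * the guest keeps spinning. To keep the sketch single-threaded, the
 * pause request "arrives" once `stop_after` entries have happened. */
static int run_guest_once(struct vcpu_sketch *cpu, int stop_after)
{
    cpu->runs++;
    if (cpu->runs >= stop_after) {
        cpu->stop = true;
    }
    return 0;
}

/* With the plain `while (ret == 0)` condition this loop would never end
 * for a guest that keeps returning 0. The extra `!cpu->stop` test is the
 * change proposed in the message. Returns the number of guest entries. */
int exec_loop_sketch(struct vcpu_sketch *cpu, int stop_after)
{
    int ret;

    do {
        ret = run_guest_once(cpu, stop_after);
    } while (ret == 0 && !cpu->stop);

    return cpu->runs;
}
```

Calling exec_loop_sketch() on a zero-initialised struct vcpu_sketch returns as soon as the simulated pause request is seen. Whether testing the stop flag in the real loop is the right fix is exactly the open question of the thread.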
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: qemu-kvm-1.0.1 - unable to exit if vcpu is in infinite loop
2012-06-28 13:05 qemu-kvm-1.0.1 - unable to exit if vcpu is in infinite loop Peter Lieven
@ 2012-06-28 13:25 ` Jan Kiszka
2012-06-28 15:02 ` Peter Lieven
2012-08-06 15:11 ` [Qemu-devel] " Stefan Hajnoczi
1 sibling, 1 reply; 33+ messages in thread
From: Jan Kiszka @ 2012-06-28 13:25 UTC (permalink / raw)
To: Peter Lieven; +Cc: qemu-devel, kvm
On 2012-06-28 15:05, Peter Lieven wrote:
> Hi,
>
> i debugged my initial problem further and found out that the problem
> happens to be that
> the main thread is stuck in pause_all_vcpus() on reset or quit commands
> in the monitor
> if one cpu is stuck in the do-while loop kvm_cpu_exec. If I modify the
> condition from while (ret == 0)
> to while ((ret == 0) && !env->stop); it works, but is this the right fix?
> "Quit" command seems to work, but on "Reset" the VM enterns pause state.
Before entering the wait loop in pause_all_vcpus, there are kicks sent
to all vcpus. Now we need to find out why some of those kicks apparently
don't reach the destination.
Again:
- on which host kernels does this occur, and which change may have
changed it?
- with which qemu-kvm version is it reproducible, and which commit
introduced or fixed it?
I failed reproducing so far.
Jan
--
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: qemu-kvm-1.0.1 - unable to exit if vcpu is in infinite loop
2012-06-28 13:25 ` Jan Kiszka
@ 2012-06-28 15:02 ` Peter Lieven
2012-06-28 15:22 ` Jan Kiszka
0 siblings, 1 reply; 33+ messages in thread
From: Peter Lieven @ 2012-06-28 15:02 UTC (permalink / raw)
To: Jan Kiszka; +Cc: qemu-devel, kvm
On 28.06.2012 15:25, Jan Kiszka wrote:
> On 2012-06-28 15:05, Peter Lieven wrote:
>> Hi,
>>
>> i debugged my initial problem further and found out that the problem
>> happens to be that
>> the main thread is stuck in pause_all_vcpus() on reset or quit commands
>> in the monitor
>> if one cpu is stuck in the do-while loop kvm_cpu_exec. If I modify the
>> condition from while (ret == 0)
>> to while ((ret == 0)&& !env->stop); it works, but is this the right fix?
>> "Quit" command seems to work, but on "Reset" the VM enterns pause state.
> Before entering the wait loop in pause_all_vcpus, there are kicks sent
> to all vcpus. Now we need to find out why some of those kicks apparently
> don't reach the destination.
Can you explain briefly what exactly these kicks do? Do these kicks lead
to leaving kernel mode and returning to userspace?
> Again:
> - on which host kernels does this occur, and which change may have
> changed it?
I do not see it in 3.0.0 and have also not seen it in 2.6.38. Both are
the mainline 64-bit ubuntu-server kernels (for oneiric and natty,
respectively).
If I compile a more recent kvm-kmod (3.3 or 3.4) on these machines,
the problem appears.
> - with which qemu-kvm version is it reproducible, and which commit
> introduced or fixed it?
qemu-kvm-1.0.1 from sourceforge. To get into the scenario it
is not sufficient to boot from an empty harddisk. To reproduce,
I have to use a live CD like ubuntu-server 12.04 and choose to
boot from the first harddisk. I think the isolinux loader does
not check for a valid boot sector and just executes what is found
in sector 0. This leads to the MMIO reads I posted and 100%
CPU load (mostly spent in the kernel). At that time the monitor/QMP
is still responsive. If I send a command that pauses all vcpus,
the first cpu keeps looping in kvm_cpu_exec while the main thread
is waiting. At that point the monitor stops responding.
I have also seen this issue on very old Windows 2000 servers
where the system fails to power off and is just halted. Maybe
this is also a busy loop.
I will try to bisect this asap and let you know. Maybe the above
info already helps you to reproduce.
thanks,
peter
> I failed reproducing so far.
>
> Jan
>
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: qemu-kvm-1.0.1 - unable to exit if vcpu is in infinite loop
2012-06-28 15:02 ` Peter Lieven
@ 2012-06-28 15:22 ` Jan Kiszka
2012-06-28 16:29 ` Peter Lieven
0 siblings, 1 reply; 33+ messages in thread
From: Jan Kiszka @ 2012-06-28 15:22 UTC (permalink / raw)
To: Peter Lieven; +Cc: qemu-devel, kvm
On 2012-06-28 17:02, Peter Lieven wrote:
> On 28.06.2012 15:25, Jan Kiszka wrote:
>> On 2012-06-28 15:05, Peter Lieven wrote:
>>> Hi,
>>>
>>> i debugged my initial problem further and found out that the problem
>>> happens to be that
>>> the main thread is stuck in pause_all_vcpus() on reset or quit commands
>>> in the monitor
>>> if one cpu is stuck in the do-while loop kvm_cpu_exec. If I modify the
>>> condition from while (ret == 0)
>>> to while ((ret == 0)&& !env->stop); it works, but is this the right fix?
>>> "Quit" command seems to work, but on "Reset" the VM enterns pause state.
>> Before entering the wait loop in pause_all_vcpus, there are kicks sent
>> to all vcpus. Now we need to find out why some of those kicks apparently
>> don't reach the destination.
> can you explain shot what exactly these kicks do? does these kicks lead
> to leaving the kernel mode and returning to userspace?
Yes. A signal is sent, and KVM returns from the guest to userspace on
pending signals.
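
As a userspace-only illustration of the "signal makes a blocked call return" idea, here is a minimal sketch using plain POSIX calls. It does not touch KVM: sigsuspend() stands in for the blocking guest entry, and kick_handler(), blocked_worker() and kick_demo() are hypothetical names. SIGUSR1 is kept blocked outside the wait so that a kick sent before the worker starts waiting stays pending instead of being lost.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>
#include <signal.h>
#include <string.h>

/* The handler only needs to exist; its job is to interrupt the wait. */
static void kick_handler(int sig)
{
    (void)sig;
}

/* Worker: stands in for a vcpu thread sitting in a blocking call. */
static void *blocked_worker(void *arg)
{
    int *saw_eintr = arg;
    sigset_t wait_mask;

    /* Mask used during the wait: the inherited mask minus SIGUSR1. */
    pthread_sigmask(SIG_SETMASK, NULL, &wait_mask);
    sigdelset(&wait_mask, SIGUSR1);

    /* Atomically unblock SIGUSR1 and sleep until a handler has run. */
    if (sigsuspend(&wait_mask) == -1 && errno == EINTR) {
        *saw_eintr = 1;
    }
    return NULL;
}

/* Returns 1 if the worker was kicked out of its wait, -1 on setup error. */
int kick_demo(void)
{
    struct sigaction sa;
    sigset_t block, old;
    pthread_t tid;
    int saw_eintr = 0;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = kick_handler;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, NULL) != 0) {
        return -1;
    }

    /* Block SIGUSR1 first so the new thread inherits the blocked mask. */
    sigemptyset(&block);
    sigaddset(&block, SIGUSR1);
    pthread_sigmask(SIG_BLOCK, &block, &old);

    if (pthread_create(&tid, NULL, blocked_worker, &saw_eintr) != 0) {
        pthread_sigmask(SIG_SETMASK, &old, NULL);
        return -1;
    }

    pthread_kill(tid, SIGUSR1);   /* the "kick" */
    pthread_join(tid, NULL);

    pthread_sigmask(SIG_SETMASK, &old, NULL);
    return saw_eintr;
}
```

The choice of SIGUSR1 here only mirrors the patch title mentioned later in the thread; how qemu-kvm actually installs and masks its kick signal is not shown in this discussion.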
>> Again:
>> - on which host kernels does this occur, and which change may have
>> changed it?
> I do not see it in 3.0.0 and have also not seen it in 2.6.38. both
> the mainline 64-bit ubuntu-server kernels (for natty / oneiric
> respectively).
> If I compile a more recent kvm-kmod 3.3 or 3.4 on these machines,
> it is no longer working.
I was asking for kernel 3.3 or 3.4 without kvm-kmod.
>> - with which qemu-kvm version is it reproducible, and which commit
>> introduced or fixed it?
> qemu-kvm-1.0.1 from sourceforge. to get into the scenario it
> is not sufficient to boot from an empty harddisk. to reproduce
Please also try qemu-kvm git to see if something fixed it there.
> i have use a live cd like ubuntu-server 12.04 and choose to
> boot from the first harddisk. i think the isolinux loader does
> not check for a valid bootsector and just executes what is found
> in sector 0. this leads to the mmio reads i posted and 100%
> cpu load (most spent in kernel). at that time the monitor/qmp
> is still responsible. if i sent a command that pauses all vcpus,
> the first cpu is looping in kvm_cpu_exec and the main thread
> is waiting. at that time the monitor stops responding.
> i have also seen this issue on very old windows 2000 servers
> where the system fails to power off and is just halted. maybe
> this is also a busy loop.
>
> i will try to bisect this asap and let you know, maybe the above
> info helps you already to reproduce.
OK, thanks,
Jan
--
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: qemu-kvm-1.0.1 - unable to exit if vcpu is in infinite loop
2012-06-28 15:22 ` Jan Kiszka
@ 2012-06-28 16:29 ` Peter Lieven
2012-06-28 16:32 ` Avi Kivity
0 siblings, 1 reply; 33+ messages in thread
From: Peter Lieven @ 2012-06-28 16:29 UTC (permalink / raw)
To: Jan Kiszka; +Cc: qemu-devel, kvm
On 28.06.2012 17:22, Jan Kiszka wrote:
> On 2012-06-28 17:02, Peter Lieven wrote:
>> On 28.06.2012 15:25, Jan Kiszka wrote:
>>> On 2012-06-28 15:05, Peter Lieven wrote:
>>>> Hi,
>>>>
>>>> i debugged my initial problem further and found out that the problem
>>>> happens to be that
>>>> the main thread is stuck in pause_all_vcpus() on reset or quit commands
>>>> in the monitor
>>>> if one cpu is stuck in the do-while loop kvm_cpu_exec. If I modify the
>>>> condition from while (ret == 0)
>>>> to while ((ret == 0)&& !env->stop); it works, but is this the right fix?
>>>> "Quit" command seems to work, but on "Reset" the VM enterns pause state.
>>> Before entering the wait loop in pause_all_vcpus, there are kicks sent
>>> to all vcpus. Now we need to find out why some of those kicks apparently
>>> don't reach the destination.
>> can you explain shot what exactly these kicks do? does these kicks lead
>> to leaving the kernel mode and returning to userspace?
> Yes. A signal is sent, and KVM returns from the guest to userspace on
> pending signals.
Is there a description available of how exactly this process works?
thanks
peter
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: qemu-kvm-1.0.1 - unable to exit if vcpu is in infinite loop
2012-06-28 16:29 ` Peter Lieven
@ 2012-06-28 16:32 ` Avi Kivity
2012-06-28 19:27 ` Peter Lieven
0 siblings, 1 reply; 33+ messages in thread
From: Avi Kivity @ 2012-06-28 16:32 UTC (permalink / raw)
To: Peter Lieven; +Cc: Jan Kiszka, qemu-devel, kvm
On 06/28/2012 07:29 PM, Peter Lieven wrote:
>> Yes. A signal is sent, and KVM returns from the guest to userspace on
>> pending signals.
> is there a description available how this process exactly works?
The kernel part is in vcpu_enter_guest(), see the check for
signal_pending(). But this hasn't seen changes for quite a long while.
--
error compiling committee.c: too many arguments to function
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: qemu-kvm-1.0.1 - unable to exit if vcpu is in infinite loop
2012-06-28 16:32 ` Avi Kivity
@ 2012-06-28 19:27 ` Peter Lieven
2012-07-01 8:19 ` [Qemu-devel] " Avi Kivity
0 siblings, 1 reply; 33+ messages in thread
From: Peter Lieven @ 2012-06-28 19:27 UTC (permalink / raw)
To: Avi Kivity; +Cc: Jan Kiszka, qemu-devel, kvm
On 28.06.2012 at 18:32, Avi Kivity wrote:
> On 06/28/2012 07:29 PM, Peter Lieven wrote:
>>> Yes. A signal is sent, and KVM returns from the guest to userspace on
>>> pending signals.
>
>> is there a description available how this process exactly works?
>
> The kernel part is in vcpu_enter_guest(), see the check for
> signal_pending(). But this hasn't seen changes for quite a long while.
Thank you, I will have a look. I noticed a few patches that were submitted
during the last year; maybe one of them is related:
Switch SIG_IPI to SIGUSR1
Fix signal handling of SIG_IPI when io-thread is enabled
The first commit mentions a "32-on-64-bit Linux kernel bug".
Is there any reference to that?
Thank you,
Peter
>
> --
> error compiling committee.c: too many arguments to function
>
>
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: qemu-kvm-1.0.1 - unable to exit if vcpu is in infinite loop
2012-06-28 19:27 ` Peter Lieven
@ 2012-07-01 8:19 ` Avi Kivity
0 siblings, 0 replies; 33+ messages in thread
From: Avi Kivity @ 2012-07-01 8:19 UTC (permalink / raw)
To: Peter Lieven; +Cc: Jan Kiszka, qemu-devel, kvm
On 06/28/2012 10:27 PM, Peter Lieven wrote:
>
> Am 28.06.2012 um 18:32 schrieb Avi Kivity:
>
>> On 06/28/2012 07:29 PM, Peter Lieven wrote:
>>>> Yes. A signal is sent, and KVM returns from the guest to userspace on
>>>> pending signals.
>>
>>> is there a description available how this process exactly works?
>>
>> The kernel part is in vcpu_enter_guest(), see the check for
>> signal_pending(). But this hasn't seen changes for quite a long while.
>
> Thank you, i will have a look. I noticed a few patches that where submitted
> during the last year, maybe one of them is related:
>
> Switch SIG_IPI to SIGUSR1
> Fix signal handling of SIG_IPI when io-thread is enabled
>
> In the first commit there is mentioned a "32-on-64-bit Linux kernel bug"
> is there any reference to that?
http://web.archiveorange.com/archive/v/1XS1vwGSFLyYygwTXg1K. Are you
running 32-on-64?
--
error compiling committee.c: too many arguments to function
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: qemu-kvm-1.0.1 - unable to exit if vcpu is in infinite loop
2012-07-01 8:19 ` [Qemu-devel] " Avi Kivity
@ 2012-07-01 19:18 ` Peter Lieven
-1 siblings, 0 replies; 33+ messages in thread
From: Peter Lieven @ 2012-07-01 19:18 UTC (permalink / raw)
To: Avi Kivity; +Cc: Jan Kiszka, qemu-devel, kvm
On 01.07.2012 at 10:19, Avi Kivity wrote:
> On 06/28/2012 10:27 PM, Peter Lieven wrote:
>>
>> Am 28.06.2012 um 18:32 schrieb Avi Kivity:
>>
>>> On 06/28/2012 07:29 PM, Peter Lieven wrote:
>>>>> Yes. A signal is sent, and KVM returns from the guest to userspace on
>>>>> pending signals.
>>>
>>>> is there a description available how this process exactly works?
>>>
>>> The kernel part is in vcpu_enter_guest(), see the check for
>>> signal_pending(). But this hasn't seen changes for quite a long while.
>>
>> Thank you, i will have a look. I noticed a few patches that where submitted
>> during the last year, maybe one of them is related:
>>
>> Switch SIG_IPI to SIGUSR1
>> Fix signal handling of SIG_IPI when io-thread is enabled
>>
>> In the first commit there is mentioned a "32-on-64-bit Linux kernel bug"
>> is there any reference to that?
>
>
> http://web.archiveorange.com/archive/v/1XS1vwGSFLyYygwTXg1K. Are you
> running 32-on-64?
I think the issue occurs when running a 32-bit guest on a 64-bit system. Afaik, the
isolinux loader where I see the race is 32-bit, although it is on a 64-bit Ubuntu LTS
CD image. The second case where I have seen the race is on shutdown of a
Windows 2000 Server, which is also 32-bit.
Peter
>
>
> --
> error compiling committee.c: too many arguments to function
>
>
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: qemu-kvm-1.0.1 - unable to exit if vcpu is in infinite loop
2012-07-01 19:18 ` [Qemu-devel] " Peter Lieven
@ 2012-07-02 7:05 ` Jan Kiszka
-1 siblings, 0 replies; 33+ messages in thread
From: Jan Kiszka @ 2012-07-02 7:05 UTC (permalink / raw)
To: Peter Lieven; +Cc: Avi Kivity, kvm, qemu-devel
On 2012-07-01 21:18, Peter Lieven wrote:
>
> Am 01.07.2012 um 10:19 schrieb Avi Kivity:
>
>> On 06/28/2012 10:27 PM, Peter Lieven wrote:
>>>
>>> Am 28.06.2012 um 18:32 schrieb Avi Kivity:
>>>
>>>> On 06/28/2012 07:29 PM, Peter Lieven wrote:
>>>>>> Yes. A signal is sent, and KVM returns from the guest to userspace on
>>>>>> pending signals.
>>>>
>>>>> is there a description available how this process exactly works?
>>>>
>>>> The kernel part is in vcpu_enter_guest(), see the check for
>>>> signal_pending(). But this hasn't seen changes for quite a long while.
>>>
>>> Thank you, i will have a look. I noticed a few patches that where submitted
>>> during the last year, maybe one of them is related:
>>>
>>> Switch SIG_IPI to SIGUSR1
>>> Fix signal handling of SIG_IPI when io-thread is enabled
>>>
>>> In the first commit there is mentioned a "32-on-64-bit Linux kernel bug"
>>> is there any reference to that?
>>
>>
>> http://web.archiveorange.com/archive/v/1XS1vwGSFLyYygwTXg1K. Are you
>> running 32-on-64?
>
> I think the issue occurs when running a 32-bit guest on a 64-bit system. Afaik, the
> isolinux loader where is see the race is 32-bit altough it is a 64-bit ubuntu lts
> cd image. The second case where i have seen the race is on shutdown of a
> Windows 2000 Server which is also 32-bit.
"32-on-64" particularly means using a 32-bit QEMU[-kvm] binary on a
64-bit host kernel. What does "file qemu-system-x86_64" report about yours?
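
The 32-bit versus 64-bit distinction that "file" reports for an ELF binary comes from the EI_CLASS byte of the ELF identification header. A minimal sketch of that check follows; elf_class_bits() is a hypothetical helper name, and the caller is assumed to have already read the first bytes of the binary into a buffer.

```c
#include <stddef.h>

/* Inspect the ELF identification bytes at the start of a file image.
 * Returns 32 or 64 for a valid ELF header, 0 for anything else. */
int elf_class_bits(const unsigned char *buf, size_t len)
{
    /* Magic: 0x7f 'E' 'L' 'F', followed by the EI_CLASS byte at index 4. */
    if (len < 5 || buf[0] != 0x7f || buf[1] != 'E' ||
        buf[2] != 'L' || buf[3] != 'F') {
        return 0;
    }

    switch (buf[4]) {
    case 1:             /* ELFCLASS32 */
        return 32;
    case 2:             /* ELFCLASS64 */
        return 64;
    default:            /* ELFCLASSNONE or unknown */
        return 0;
    }
}
```

Feeding it the first five bytes of the qemu-system-x86_64 binary would answer the question above: a result of 32 on a 64-bit host kernel would be the "32-on-64" case.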
Jan
--
Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
Corporate Competence Center Embedded Linux
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: qemu-kvm-1.0.1 - unable to exit if vcpu is in infinite loop
2012-07-02 7:05 ` [Qemu-devel] " Jan Kiszka
@ 2012-07-02 8:12 ` Peter Lieven
-1 siblings, 0 replies; 33+ messages in thread
From: Peter Lieven @ 2012-07-02 8:12 UTC (permalink / raw)
To: Jan Kiszka; +Cc: Avi Kivity, qemu-devel, kvm
On 02.07.2012 09:05, Jan Kiszka wrote:
> On 2012-07-01 21:18, Peter Lieven wrote:
>> Am 01.07.2012 um 10:19 schrieb Avi Kivity:
>>
>>> On 06/28/2012 10:27 PM, Peter Lieven wrote:
>>>> Am 28.06.2012 um 18:32 schrieb Avi Kivity:
>>>>
>>>>> On 06/28/2012 07:29 PM, Peter Lieven wrote:
>>>>>>> Yes. A signal is sent, and KVM returns from the guest to userspace on
>>>>>>> pending signals.
>>>>>> is there a description available how this process exactly works?
>>>>> The kernel part is in vcpu_enter_guest(), see the check for
>>>>> signal_pending(). But this hasn't seen changes for quite a long while.
>>>> Thank you, i will have a look. I noticed a few patches that where submitted
>>>> during the last year, maybe one of them is related:
>>>>
>>>> Switch SIG_IPI to SIGUSR1
>>>> Fix signal handling of SIG_IPI when io-thread is enabled
>>>>
>>>> In the first commit there is mentioned a "32-on-64-bit Linux kernel bug"
>>>> is there any reference to that?
>>>
>>> http://web.archiveorange.com/archive/v/1XS1vwGSFLyYygwTXg1K. Are you
>>> running 32-on-64?
>> I think the issue occurs when running a 32-bit guest on a 64-bit system. Afaik, the
>> isolinux loader where is see the race is 32-bit altough it is a 64-bit ubuntu lts
>> cd image. The second case where i have seen the race is on shutdown of a
>> Windows 2000 Server which is also 32-bit.
> "32-on-64" particularly means using a 32-bit QEMU[-kvm] binary on a
> 64-bit host kernel. What does "file qemu-system-x86_64" report about yours?
It's a custom build on 64-bit Linux, as a 64-bit application. I will
continue trying to find out today what's going wrong. Any help or hints
appreciated ;-)
Thanks,
Peter
> Jan
>
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: qemu-kvm-1.0.1 - unable to exit if vcpu is in infinite loop
2012-06-28 13:05 qemu-kvm-1.0.1 - unable to exit if vcpu is in infinite loop Peter Lieven
@ 2012-08-06 15:11 ` Stefan Hajnoczi
2012-08-06 15:11 ` [Qemu-devel] " Stefan Hajnoczi
1 sibling, 0 replies; 33+ messages in thread
From: Stefan Hajnoczi @ 2012-08-06 15:11 UTC (permalink / raw)
To: Peter Lieven; +Cc: qemu-devel, kvm, Jan Kiszka, Avi Kivity
On Thu, Jun 28, 2012 at 2:05 PM, Peter Lieven <pl@dlhnet.de> wrote:
> i debugged my initial problem further and found out that the problem happens
> to be that
> the main thread is stuck in pause_all_vcpus() on reset or quit commands in
> the monitor
> if one cpu is stuck in the do-while loop kvm_cpu_exec. If I modify the
> condition from while (ret == 0)
> to while ((ret == 0) && !env->stop); it works, but is this the right fix?
> "Quit" command seems to work, but on "Reset" the VM enterns pause state.
I think I'm hitting something similar. I installed a F17 amd64 guest
(3.5 kernel) but before booting entered the GRUB boot menu edit mode.
The guest seemed unresponsive so I switched to the monitor, which also
froze shortly afterwards. The VNC screen ended up being all black.
qemu-kvm.git/master 3e4305694fd891b69e4450e59ec4c65420907ede
Linux 3.2.0-3-amd64 from Debian testing
$ qemu-system-x86_64 -enable-kvm -m 1024 -smp 2 \
    -drive if=virtio,cache=none,file=f17.img,aio=native -serial stdio
(gdb) thread apply all bt
Thread 3 (Thread 0x7f8008e23700 (LWP 367)):
#0 0x00007f800f891727 in ioctl () at ../sysdeps/unix/syscall-template.S:82
#1 0x00007f80137b92c9 in kvm_vcpu_ioctl
(env=env@entry=0x7f8015b49640, type=type@entry=44672)
at /home/stefanha/qemu-kvm/kvm-all.c:1619
#2 0x00007f80137b93fe in kvm_cpu_exec (env=env@entry=0x7f8015b49640)
at /home/stefanha/qemu-kvm/kvm-all.c:1506
#3 0x00007f8013766f31 in qemu_kvm_cpu_thread_fn (arg=0x7f8015b49640)
at /home/stefanha/qemu-kvm/cpus.c:756
#4 0x00007f800fb4db50 in start_thread (arg=<optimized out>) at
pthread_create.c:304
#5 0x00007f800f8986dd in clone () at
../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#6 0x0000000000000000 in ?? ()
This vcpu is still executing guest code and I've seen it successfully
dispatching I/O. The problem is it's missing the exit_request...
Thread 2 (Thread 0x7f8008622700 (LWP 368)):
#0 pthread_cond_wait@@GLIBC_2.3.2 ()
at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
#1 0x00007f801372b229 in qemu_cond_wait (cond=<optimized out>,
mutex=mutex@entry=0x7f80144367c0) at qemu-thread-posix.c:113
#2 0x00007f8013766eff in qemu_kvm_wait_io_event (env=<optimized out>)
at /home/stefanha/qemu-kvm/cpus.c:724
#3 qemu_kvm_cpu_thread_fn (arg=0x7f8015b67450) at
/home/stefanha/qemu-kvm/cpus.c:761
#4 0x00007f800fb4db50 in start_thread (arg=<optimized out>) at
pthread_create.c:304
#5 0x00007f800f8986dd in clone () at
../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#6 0x0000000000000000 in ?? ()
No problems here.
Thread 1 (Thread 0x7f801347b8c0 (LWP 365)):
#0 pthread_cond_wait@@GLIBC_2.3.2 ()
at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
#1 0x00007f801372b229 in qemu_cond_wait (cond=cond@entry=0x7f801402fd80,
mutex=mutex@entry=0x7f80144367c0) at qemu-thread-posix.c:113
#2 0x00007f8013768949 in pause_all_vcpus () at
/home/stefanha/qemu-kvm/cpus.c:962
#3 0x00007f80136028c8 in main (argc=<optimized out>, argv=<optimized out>,
envp=<optimized out>) at /home/stefanha/qemu-kvm/vl.c:3695
We're deadlocked in pause_all_vcpus(), waiting for vcpu #0 to pause.
Unfortunately vcpu #0 has ->exit_request=0 although ->stop=1.
Here are the vcpus:
(gdb) p first_cpu
$6 = (struct CPUX86State *) 0x7f8015b49640
(gdb) p first_cpu->next_cpu
$7 = (struct CPUX86State *) 0x7f8015b67450
(gdb) p first_cpu->next_cpu->next_cpu
$8 = (struct CPUX86State *) 0x0
(gdb) p first_cpu->stop
$9 = 1
(gdb) p first_cpu->stopped
$10 = 0
(gdb) p first_cpu->exit_request
$11 = 0
:(
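
One way to picture the state shown above (stop=1, stopped=0, exit_request=0) is a small pthread model of the pause handshake. This is a sketch under stated assumptions, not the cpus.c code: struct pause_sketch, vcpu_worker() and pause_handshake_sketch() are hypothetical names, a condition variable stands in for the guest run, and a timeout is added only so the stuck case can be observed instead of hanging.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

struct pause_sketch {
    pthread_mutex_t lock;
    pthread_cond_t kick_cond;   /* wakes the worker out of its "guest" loop */
    pthread_cond_t pause_cond;  /* worker tells the requester it has stopped */
    bool stop;                  /* pause requested */
    bool stopped;               /* pause acknowledged */
    bool exit_request;          /* what actually ends the "guest" loop */
};

static void *vcpu_worker(void *arg)
{
    struct pause_sketch *s = arg;

    pthread_mutex_lock(&s->lock);
    while (!s->exit_request) {          /* stands in for running the guest */
        pthread_cond_wait(&s->kick_cond, &s->lock);
    }
    if (s->stop) {
        s->stopped = true;
        pthread_cond_signal(&s->pause_cond);
    }
    pthread_mutex_unlock(&s->lock);
    return NULL;
}

/* Returns 1 if the worker acknowledged the pause before the timeout,
 * 0 if the requester would have waited forever, -1 on setup error. */
int pause_handshake_sketch(bool kick_sets_exit_request, long timeout_ms)
{
    struct pause_sketch s;
    struct timespec deadline;
    pthread_t tid;
    int rc = 0, acked;

    pthread_mutex_init(&s.lock, NULL);
    pthread_cond_init(&s.kick_cond, NULL);
    pthread_cond_init(&s.pause_cond, NULL);
    s.stop = s.stopped = s.exit_request = false;

    if (pthread_create(&tid, NULL, vcpu_worker, &s) != 0) {
        return -1;
    }

    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec += timeout_ms / 1000;
    deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec++;
        deadline.tv_nsec -= 1000000000L;
    }

    pthread_mutex_lock(&s.lock);
    s.stop = true;
    if (kick_sets_exit_request) {       /* the kick that went missing */
        s.exit_request = true;
        pthread_cond_signal(&s.kick_cond);
    }
    while (!s.stopped && rc != ETIMEDOUT) {
        rc = pthread_cond_timedwait(&s.pause_cond, &s.lock, &deadline);
    }
    acked = s.stopped ? 1 : 0;

    /* Cleanup: release the worker so it can be joined in both cases. */
    s.exit_request = true;
    pthread_cond_signal(&s.kick_cond);
    pthread_mutex_unlock(&s.lock);
    pthread_join(tid, NULL);

    pthread_cond_destroy(&s.pause_cond);
    pthread_cond_destroy(&s.kick_cond);
    pthread_mutex_destroy(&s.lock);
    return acked;
}
```

With kick_sets_exit_request set to false the requester times out, which corresponds to the main thread waiting in pause_all_vcpus() while the vcpu never leaves its loop. Why exit_request stayed 0 in the real process is not explained by this model.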
This isn't easy to reproduce. I tried entering the GRUB boot menu
again and there was no deadlock.
Stefan
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [Qemu-devel] qemu-kvm-1.0.1 - unable to exit if vcpu is in infinite loop
@ 2012-08-06 15:11 ` Stefan Hajnoczi
0 siblings, 0 replies; 33+ messages in thread
From: Stefan Hajnoczi @ 2012-08-06 15:11 UTC (permalink / raw)
To: Peter Lieven; +Cc: Jan Kiszka, qemu-devel, kvm, Avi Kivity
On Thu, Jun 28, 2012 at 2:05 PM, Peter Lieven <pl@dlhnet.de> wrote:
> i debugged my initial problem further and found out that the problem happens
> to be that
> the main thread is stuck in pause_all_vcpus() on reset or quit commands in
> the monitor
> if one cpu is stuck in the do-while loop kvm_cpu_exec. If I modify the
> condition from while (ret == 0)
> to while ((ret == 0) && !env->stop); it works, but is this the right fix?
> "Quit" command seems to work, but on "Reset" the VM enterns pause state.
I think I'm hitting something similar. I installed a F17 amd64 guest
(3.5 kernel) but before booting entered the GRUB boot menu edit mode.
The guest seemed unresponsive so I switched to the monitor, which also
froze shortly afterwards. The VNC screen ended up being all black.
qemu-kvm.git/master 3e4305694fd891b69e4450e59ec4c65420907ede
Linux 3.2.0-3-amd64 from Debian testing
$ qemu-system-x86_64 -enable-kvm -m 1024 -smp 2 -drive
if=virtio,cache=none,file=f17.img,aio=native -serial stdio
(gdb) thread apply all bt
Thread 3 (Thread 0x7f8008e23700 (LWP 367)):
#0 0x00007f800f891727 in ioctl () at ../sysdeps/unix/syscall-template.S:82
#1 0x00007f80137b92c9 in kvm_vcpu_ioctl
(env=env@entry=0x7f8015b49640, type=type@entry=44672)
at /home/stefanha/qemu-kvm/kvm-all.c:1619
#2 0x00007f80137b93fe in kvm_cpu_exec (env=env@entry=0x7f8015b49640)
at /home/stefanha/qemu-kvm/kvm-all.c:1506
#3 0x00007f8013766f31 in qemu_kvm_cpu_thread_fn (arg=0x7f8015b49640)
at /home/stefanha/qemu-kvm/cpus.c:756
#4 0x00007f800fb4db50 in start_thread (arg=<optimized out>) at
pthread_create.c:304
#5 0x00007f800f8986dd in clone () at
../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#6 0x0000000000000000 in ?? ()
This vcpu is still executing guest code and I've seen it successfully
dispatching I/O. The problem is it's missing the exit_request...
Thread 2 (Thread 0x7f8008622700 (LWP 368)):
#0 pthread_cond_wait@@GLIBC_2.3.2 ()
at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
#1 0x00007f801372b229 in qemu_cond_wait (cond=<optimized out>,
mutex=mutex@entry=0x7f80144367c0) at qemu-thread-posix.c:113
#2 0x00007f8013766eff in qemu_kvm_wait_io_event (env=<optimized out>)
at /home/stefanha/qemu-kvm/cpus.c:724
#3 qemu_kvm_cpu_thread_fn (arg=0x7f8015b67450) at
/home/stefanha/qemu-kvm/cpus.c:761
#4 0x00007f800fb4db50 in start_thread (arg=<optimized out>) at
pthread_create.c:304
#5 0x00007f800f8986dd in clone () at
../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#6 0x0000000000000000 in ?? ()
No problems here.
Thread 1 (Thread 0x7f801347b8c0 (LWP 365)):
#0 pthread_cond_wait@@GLIBC_2.3.2 ()
at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
#1 0x00007f801372b229 in qemu_cond_wait (cond=cond@entry=0x7f801402fd80,
mutex=mutex@entry=0x7f80144367c0) at qemu-thread-posix.c:113
#2 0x00007f8013768949 in pause_all_vcpus () at
/home/stefanha/qemu-kvm/cpus.c:962
#3 0x00007f80136028c8 in main (argc=<optimized out>, argv=<optimized out>,
envp=<optimized out>) at /home/stefanha/qemu-kvm/vl.c:3695
We're deadlocked in pause_all_vcpus(), waiting for vcpu #0 to pause.
Unfortunately vcpu #0 has ->exit_request=0 although ->stop=1.
Here are the vcpus:
(gdb) p first_cpu
$6 = (struct CPUX86State *) 0x7f8015b49640
(gdb) p first_cpu->next_cpu
$7 = (struct CPUX86State *) 0x7f8015b67450
(gdb) p first_cpu->next_cpu->next_cpu
$8 = (struct CPUX86State *) 0x0
(gdb) p first_cpu->stop
$9 = 1
(gdb) p first_cpu->stopped
$10 = 0
(gdb) p first_cpu->exit_request
$11 = 0
:(
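To make the hang condition concrete, here is a minimal model of the pause handshake as I understand it. The struct and function names are made up for illustration and are not the QEMU source; only the stop, stopped and next_cpu fields mirror what gdb printed above. I am also assuming the second vcpu has already acknowledged the stop, since its thread is parked in qemu_kvm_wait_io_event().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative stand-in for the three CPUState fields inspected in gdb. */
struct vcpu_model {
    int stop;                     /* main thread asked this vcpu to pause */
    int stopped;                  /* vcpu thread acknowledged the request */
    struct vcpu_model *next_cpu;  /* singly linked list, NULL-terminated */
};

/* Condition the main thread waits for: every vcpu has acknowledged. */
static bool all_vcpus_paused_model(const struct vcpu_model *first)
{
    const struct vcpu_model *cpu;

    for (cpu = first; cpu != NULL; cpu = cpu->next_cpu) {
        if (!cpu->stopped) {
            return false;
        }
    }
    return true;
}

/* What a vcpu thread does once it has left guest mode and sees the request. */
static void vcpu_ack_stop_model(struct vcpu_model *cpu)
{
    assert(cpu != NULL);
    if (cpu->stop) {
        cpu->stop = 0;
        cpu->stopped = 1;
    }
}
```

With vcpu #0 at stop=1 and stopped=0, the predicate stays false until that thread runs the acknowledge step, and it cannot do that while it sits in the ioctl shown in Thread 3 (type=44672 is 0xAE80, i.e. KVM_RUN).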
This isn't easy to reproduce. I tried entering the GRUB boot menu
again and there was no deadlock.
Stefan
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: qemu-kvm-1.0.1 - unable to exit if vcpu is in infinite loop
2012-08-06 15:11 ` [Qemu-devel] " Stefan Hajnoczi
@ 2012-08-17 13:11 ` Jan Kiszka
-1 siblings, 0 replies; 33+ messages in thread
From: Jan Kiszka @ 2012-08-17 13:11 UTC (permalink / raw)
To: Stefan Hajnoczi; +Cc: Peter Lieven, qemu-devel, kvm, Avi Kivity
On 2012-08-06 17:11, Stefan Hajnoczi wrote:
> On Thu, Jun 28, 2012 at 2:05 PM, Peter Lieven <pl@dlhnet.de> wrote:
>> i debugged my initial problem further and found out that the problem happens
>> to be that
>> the main thread is stuck in pause_all_vcpus() on reset or quit commands in
>> the monitor
>> if one cpu is stuck in the do-while loop kvm_cpu_exec. If I modify the
>> condition from while (ret == 0)
>> to while ((ret == 0) && !env->stop); it works, but is this the right fix?
>> "Quit" command seems to work, but on "Reset" the VM enters pause state.
>
> I think I'm hitting something similar. I installed a F17 amd64 guest
> (3.5 kernel) but before booting entered the GRUB boot menu edit mode.
> The guest seemed unresponsive so I switched to the monitor, which also
> froze shortly afterwards. The VNC screen ended up being all black.
>
> qemu-kvm.git/master 3e4305694fd891b69e4450e59ec4c65420907ede
> Linux 3.2.0-3-amd64 from Debian testing
>
> $ qemu-system-x86_64 -enable-kvm -m 1024 -smp 2 -drive
> if=virtio,cache=none,file=f17.img,aio=native -serial stdio
>
> (gdb) thread apply all bt
>
> Thread 3 (Thread 0x7f8008e23700 (LWP 367)):
> #0 0x00007f800f891727 in ioctl () at ../sysdeps/unix/syscall-template.S:82
> #1 0x00007f80137b92c9 in kvm_vcpu_ioctl
> (env=env@entry=0x7f8015b49640, type=type@entry=44672)
> at /home/stefanha/qemu-kvm/kvm-all.c:1619
> #2 0x00007f80137b93fe in kvm_cpu_exec (env=env@entry=0x7f8015b49640)
> at /home/stefanha/qemu-kvm/kvm-all.c:1506
> #3 0x00007f8013766f31 in qemu_kvm_cpu_thread_fn (arg=0x7f8015b49640)
> at /home/stefanha/qemu-kvm/cpus.c:756
> #4 0x00007f800fb4db50 in start_thread (arg=<optimized out>) at
> pthread_create.c:304
> #5 0x00007f800f8986dd in clone () at
> ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
> #6 0x0000000000000000 in ?? ()
>
> This vcpu is still executing guest code and I've seen it successfully
> dispatching I/O. The problem is it's missing the exit_request...
>
> Thread 2 (Thread 0x7f8008622700 (LWP 368)):
> #0 pthread_cond_wait@@GLIBC_2.3.2 ()
> at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
> #1 0x00007f801372b229 in qemu_cond_wait (cond=<optimized out>,
> mutex=mutex@entry=0x7f80144367c0) at qemu-thread-posix.c:113
> #2 0x00007f8013766eff in qemu_kvm_wait_io_event (env=<optimized out>)
> at /home/stefanha/qemu-kvm/cpus.c:724
> #3 qemu_kvm_cpu_thread_fn (arg=0x7f8015b67450) at
> /home/stefanha/qemu-kvm/cpus.c:761
> #4 0x00007f800fb4db50 in start_thread (arg=<optimized out>) at
> pthread_create.c:304
> #5 0x00007f800f8986dd in clone () at
> ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
> #6 0x0000000000000000 in ?? ()
>
> No problems here.
>
> Thread 1 (Thread 0x7f801347b8c0 (LWP 365)):
> #0 pthread_cond_wait@@GLIBC_2.3.2 ()
> at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
> #1 0x00007f801372b229 in qemu_cond_wait (cond=cond@entry=0x7f801402fd80,
> mutex=mutex@entry=0x7f80144367c0) at qemu-thread-posix.c:113
> #2 0x00007f8013768949 in pause_all_vcpus () at
> /home/stefanha/qemu-kvm/cpus.c:962
> #3 0x00007f80136028c8 in main (argc=<optimized out>, argv=<optimized out>,
> envp=<optimized out>) at /home/stefanha/qemu-kvm/vl.c:3695
>
> We're deadlocked in pause_all_vcpus(), waiting for vcpu #0 to pause.
> Unfortunately vcpu #0 has ->exit_request=0 although ->stop=1.
>
> Here are the vcpus:
>
> (gdb) p first_cpu
> $6 = (struct CPUX86State *) 0x7f8015b49640
> (gdb) p first_cpu->next_cpu
> $7 = (struct CPUX86State *) 0x7f8015b67450
> (gdb) p first_cpu->next_cpu->next_cpu
> $8 = (struct CPUX86State *) 0x0
>
> (gdb) p first_cpu->stop
> $9 = 1
> (gdb) p first_cpu->stopped
> $10 = 0
> (gdb) p first_cpu->exit_request
> $11 = 0
CPUState::exit_request is only set on specific synchronous events, see
target-i386/kvm.c.
More interesting is CPUState::thread_kicked. If it's set, qemu_cpu_kick
will skip the kicking via a signal. Maybe there is some race. Let me
think about such possibilities again...
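To spell out why a stale thread_kicked would matter, here is a simplified model of the kick path. It is a sketch and not the code in cpus.c: the struct, the function names and the signals_sent counter are invented for illustration, and the point at which the flag gets cleared is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the relevant per-vcpu state. */
struct kick_model {
    bool thread_kicked;  /* a kick is believed to be in flight */
    int signals_sent;    /* counts simulated signal deliveries */
};

/* Kick side: returns true only if a signal was actually sent. */
static bool kick_model_send(struct kick_model *cpu)
{
    assert(cpu != NULL);
    if (cpu->thread_kicked) {
        return false;        /* kick skipped, flag says one is pending */
    }
    cpu->signals_sent++;     /* stands in for signalling the vcpu thread */
    cpu->thread_kicked = true;
    return true;
}

/* vcpu side: assumed to run once the kick has been handled. */
static void kick_model_consumed(struct kick_model *cpu)
{
    assert(cpu != NULL);
    cpu->thread_kicked = false;
}
```

If the vcpu thread re-enters the guest while the flag is still set, every later kick takes the early return and no signal arrives to force an exit.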
Jan
>
> :(
>
> This isn't easy to reproduce. I tried entering the GRUB boot menu
> again and there was no deadlock.
>
> Stefan
>
--
Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
Corporate Competence Center Embedded Linux
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: qemu-kvm-1.0.1 - unable to exit if vcpu is in infinite loop
2012-08-17 13:11 ` [Qemu-devel] " Jan Kiszka
@ 2012-08-17 14:36 ` Jan Kiszka
-1 siblings, 0 replies; 33+ messages in thread
From: Jan Kiszka @ 2012-08-17 14:36 UTC (permalink / raw)
To: Stefan Hajnoczi; +Cc: Paolo Bonzini, Peter Lieven, qemu-devel, kvm, Avi Kivity
On 2012-08-17 15:11, Jan Kiszka wrote:
> On 2012-08-06 17:11, Stefan Hajnoczi wrote:
>> On Thu, Jun 28, 2012 at 2:05 PM, Peter Lieven <pl@dlhnet.de> wrote:
>>> i debugged my initial problem further and found out that the problem happens
>>> to be that
>>> the main thread is stuck in pause_all_vcpus() on reset or quit commands in
>>> the monitor
>>> if one cpu is stuck in the do-while loop kvm_cpu_exec. If I modify the
>>> condition from while (ret == 0)
>>> to while ((ret == 0) && !env->stop); it works, but is this the right fix?
>>> "Quit" command seems to work, but on "Reset" the VM enters pause state.
>>
>> I think I'm hitting something similar. I installed a F17 amd64 guest
>> (3.5 kernel) but before booting entered the GRUB boot menu edit mode.
>> The guest seemed unresponsive so I switched to the monitor, which also
>> froze shortly afterwards. The VNC screen ended up being all black.
>>
>> qemu-kvm.git/master 3e4305694fd891b69e4450e59ec4c65420907ede
>> Linux 3.2.0-3-amd64 from Debian testing
>>
>> $ qemu-system-x86_64 -enable-kvm -m 1024 -smp 2 -drive
>> if=virtio,cache=none,file=f17.img,aio=native -serial stdio
>>
>> (gdb) thread apply all bt
>>
>> Thread 3 (Thread 0x7f8008e23700 (LWP 367)):
>> #0 0x00007f800f891727 in ioctl () at ../sysdeps/unix/syscall-template.S:82
>> #1 0x00007f80137b92c9 in kvm_vcpu_ioctl
>> (env=env@entry=0x7f8015b49640, type=type@entry=44672)
>> at /home/stefanha/qemu-kvm/kvm-all.c:1619
>> #2 0x00007f80137b93fe in kvm_cpu_exec (env=env@entry=0x7f8015b49640)
>> at /home/stefanha/qemu-kvm/kvm-all.c:1506
>> #3 0x00007f8013766f31 in qemu_kvm_cpu_thread_fn (arg=0x7f8015b49640)
>> at /home/stefanha/qemu-kvm/cpus.c:756
>> #4 0x00007f800fb4db50 in start_thread (arg=<optimized out>) at
>> pthread_create.c:304
>> #5 0x00007f800f8986dd in clone () at
>> ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
>> #6 0x0000000000000000 in ?? ()
>>
>> This vcpu is still executing guest code and I've seen it successfully
>> dispatching I/O. The problem is it's missing the exit_request...
>>
>> Thread 2 (Thread 0x7f8008622700 (LWP 368)):
>> #0 pthread_cond_wait@@GLIBC_2.3.2 ()
>> at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
>> #1 0x00007f801372b229 in qemu_cond_wait (cond=<optimized out>,
>> mutex=mutex@entry=0x7f80144367c0) at qemu-thread-posix.c:113
>> #2 0x00007f8013766eff in qemu_kvm_wait_io_event (env=<optimized out>)
>> at /home/stefanha/qemu-kvm/cpus.c:724
>> #3 qemu_kvm_cpu_thread_fn (arg=0x7f8015b67450) at
>> /home/stefanha/qemu-kvm/cpus.c:761
>> #4 0x00007f800fb4db50 in start_thread (arg=<optimized out>) at
>> pthread_create.c:304
>> #5 0x00007f800f8986dd in clone () at
>> ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
>> #6 0x0000000000000000 in ?? ()
>>
>> No problems here.
>>
>> Thread 1 (Thread 0x7f801347b8c0 (LWP 365)):
>> #0 pthread_cond_wait@@GLIBC_2.3.2 ()
>> at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
>> #1 0x00007f801372b229 in qemu_cond_wait (cond=cond@entry=0x7f801402fd80,
>> mutex=mutex@entry=0x7f80144367c0) at qemu-thread-posix.c:113
>> #2 0x00007f8013768949 in pause_all_vcpus () at
>> /home/stefanha/qemu-kvm/cpus.c:962
>> #3 0x00007f80136028c8 in main (argc=<optimized out>, argv=<optimized out>,
>> envp=<optimized out>) at /home/stefanha/qemu-kvm/vl.c:3695
>>
>> We're deadlocked in pause_all_vcpus(), waiting for vcpu #0 to pause.
>> Unfortunately vcpu #0 has ->exit_request=0 although ->stop=1.
>>
>> Here are the vcpus:
>>
>> (gdb) p first_cpu
>> $6 = (struct CPUX86State *) 0x7f8015b49640
>> (gdb) p first_cpu->next_cpu
>> $7 = (struct CPUX86State *) 0x7f8015b67450
>> (gdb) p first_cpu->next_cpu->next_cpu
>> $8 = (struct CPUX86State *) 0x0
>>
>> (gdb) p first_cpu->stop
>> $9 = 1
>> (gdb) p first_cpu->stopped
>> $10 = 0
>> (gdb) p first_cpu->exit_request
>> $11 = 0
>
> CPUState::exit_request is only set on specific synchronous events, see
> target-i386/kvm.c.
>
> More interesting is CPUState::thread_kicked. If it's set, qemu_cpu_kick
> will skip the kicking via a signal. Maybe there is some race. Let me
> think about such possibilities again...
diff --git a/cpus.c b/cpus.c
index e476a3c..30f3228 100644
--- a/cpus.c
+++ b/cpus.c
@@ -726,6 +726,9 @@ static void qemu_kvm_wait_io_event(CPUArchState *env)
     }

     qemu_kvm_eat_signals(env);
+    /* Ensure that checking env->stop cannot overtake signal processing so
+     * that we lose the latter without stopping. */
+    smp_rmb();
     qemu_wait_io_event_common(env);
 }
Can anyone imagine that such a barrier may actually be required? If it
is currently possible that env->stop is evaluated before we called into
sigtimedwait in qemu_kvm_eat_signals, then we could actually eat the
signal without properly processing its reason (stop).
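The ordering I am worried about can be written down as a deterministic model. This is only an illustration of the hypothesis, not evidence that it happens: all names are invented, and the callback stands in for the main thread running in the window between the two steps.

```c
#include <assert.h>
#include <stddef.h>

struct stop_model {
    int stop;            /* set by the main thread before it kicks */
    int signal_pending;  /* kick signal sent but not yet eaten */
};

typedef void (*interleave_fn)(struct stop_model *);

/* Main thread side: request a stop, then kick. */
static void request_stop_model(struct stop_model *m)
{
    m->stop = 1;
    m->signal_pending = 1;
}

/* Hypothesized bad order: stop is sampled before the signal is eaten. */
static int vcpu_check_early_read(struct stop_model *m, interleave_fn other)
{
    int stop_seen;

    assert(m != NULL);
    stop_seen = m->stop;     /* stale if the request arrives right after */
    if (other != NULL) {
        other(m);            /* main thread runs inside the window */
    }
    m->signal_pending = 0;   /* signal eaten */
    return stop_seen;        /* 0 means the stop request was missed */
}

/* Intended order: eat the signal first, then look at stop. */
static int vcpu_check_after_eat(struct stop_model *m, interleave_fn other)
{
    assert(m != NULL);
    if (other != NULL) {
        other(m);
    }
    m->signal_pending = 0;
    return m->stop;
}
```

In the early-read variant the signal is gone and stop is set, yet the vcpu acts on the old value and goes back into the guest. That end state matches stop=1, stopped=0 from the gdb session.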
Jan
--
Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
Corporate Competence Center Embedded Linux
^ permalink raw reply related [flat|nested] 33+ messages in thread
* Re: qemu-kvm-1.0.1 - unable to exit if vcpu is in infinite loop
2012-08-17 14:36 ` [Qemu-devel] " Jan Kiszka
@ 2012-08-17 14:41 ` Jan Kiszka
-1 siblings, 0 replies; 33+ messages in thread
From: Jan Kiszka @ 2012-08-17 14:41 UTC (permalink / raw)
To: Stefan Hajnoczi; +Cc: Peter Lieven, qemu-devel, kvm, Avi Kivity, Paolo Bonzini
On 2012-08-17 16:36, Jan Kiszka wrote:
> On 2012-08-17 15:11, Jan Kiszka wrote:
>> On 2012-08-06 17:11, Stefan Hajnoczi wrote:
>>> On Thu, Jun 28, 2012 at 2:05 PM, Peter Lieven <pl@dlhnet.de> wrote:
>>>> i debugged my initial problem further and found out that the problem happens
>>>> to be that
>>>> the main thread is stuck in pause_all_vcpus() on reset or quit commands in
>>>> the monitor
>>>> if one cpu is stuck in the do-while loop kvm_cpu_exec. If I modify the
>>>> condition from while (ret == 0)
>>>> to while ((ret == 0) && !env->stop); it works, but is this the right fix?
>>>> "Quit" command seems to work, but on "Reset" the VM enters pause state.
>>>
>>> I think I'm hitting something similar. I installed a F17 amd64 guest
>>> (3.5 kernel) but before booting entered the GRUB boot menu edit mode.
>>> The guest seemed unresponsive so I switched to the monitor, which also
>>> froze shortly afterwards. The VNC screen ended up being all black.
>>>
>>> qemu-kvm.git/master 3e4305694fd891b69e4450e59ec4c65420907ede
>>> Linux 3.2.0-3-amd64 from Debian testing
>>>
>>> $ qemu-system-x86_64 -enable-kvm -m 1024 -smp 2 -drive
>>> if=virtio,cache=none,file=f17.img,aio=native -serial stdio
>>>
>>> (gdb) thread apply all bt
>>>
>>> Thread 3 (Thread 0x7f8008e23700 (LWP 367)):
>>> #0 0x00007f800f891727 in ioctl () at ../sysdeps/unix/syscall-template.S:82
>>> #1 0x00007f80137b92c9 in kvm_vcpu_ioctl
>>> (env=env@entry=0x7f8015b49640, type=type@entry=44672)
>>> at /home/stefanha/qemu-kvm/kvm-all.c:1619
>>> #2 0x00007f80137b93fe in kvm_cpu_exec (env=env@entry=0x7f8015b49640)
>>> at /home/stefanha/qemu-kvm/kvm-all.c:1506
>>> #3 0x00007f8013766f31 in qemu_kvm_cpu_thread_fn (arg=0x7f8015b49640)
>>> at /home/stefanha/qemu-kvm/cpus.c:756
>>> #4 0x00007f800fb4db50 in start_thread (arg=<optimized out>) at
>>> pthread_create.c:304
>>> #5 0x00007f800f8986dd in clone () at
>>> ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
>>> #6 0x0000000000000000 in ?? ()
>>>
>>> This vcpu is still executing guest code and I've seen it successfully
>>> dispatching I/O. The problem is it's missing the exit_request...
>>>
>>> Thread 2 (Thread 0x7f8008622700 (LWP 368)):
>>> #0 pthread_cond_wait@@GLIBC_2.3.2 ()
>>> at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
>>> #1 0x00007f801372b229 in qemu_cond_wait (cond=<optimized out>,
>>> mutex=mutex@entry=0x7f80144367c0) at qemu-thread-posix.c:113
>>> #2 0x00007f8013766eff in qemu_kvm_wait_io_event (env=<optimized out>)
>>> at /home/stefanha/qemu-kvm/cpus.c:724
>>> #3 qemu_kvm_cpu_thread_fn (arg=0x7f8015b67450) at
>>> /home/stefanha/qemu-kvm/cpus.c:761
>>> #4 0x00007f800fb4db50 in start_thread (arg=<optimized out>) at
>>> pthread_create.c:304
>>> #5 0x00007f800f8986dd in clone () at
>>> ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
>>> #6 0x0000000000000000 in ?? ()
>>>
>>> No problems here.
>>>
>>> Thread 1 (Thread 0x7f801347b8c0 (LWP 365)):
>>> #0 pthread_cond_wait@@GLIBC_2.3.2 ()
>>> at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
>>> #1 0x00007f801372b229 in qemu_cond_wait (cond=cond@entry=0x7f801402fd80,
>>> mutex=mutex@entry=0x7f80144367c0) at qemu-thread-posix.c:113
>>> #2 0x00007f8013768949 in pause_all_vcpus () at
>>> /home/stefanha/qemu-kvm/cpus.c:962
>>> #3 0x00007f80136028c8 in main (argc=<optimized out>, argv=<optimized out>,
>>> envp=<optimized out>) at /home/stefanha/qemu-kvm/vl.c:3695
>>>
>>> We're deadlocked in pause_all_vcpus(), waiting for vcpu #0 to pause.
>>> Unfortunately vcpu #0 has ->exit_request=0 although ->stop=1.
>>>
>>> Here are the vcpus:
>>>
>>> (gdb) p first_cpu
>>> $6 = (struct CPUX86State *) 0x7f8015b49640
>>> (gdb) p first_cpu->next_cpu
>>> $7 = (struct CPUX86State *) 0x7f8015b67450
>>> (gdb) p first_cpu->next_cpu->next_cpu
>>> $8 = (struct CPUX86State *) 0x0
>>>
>>> (gdb) p first_cpu->stop
>>> $9 = 1
>>> (gdb) p first_cpu->stopped
>>> $10 = 0
>>> (gdb) p first_cpu->exit_request
>>> $11 = 0
>>
>> CPUState::exit_request is only set on specific synchronous events, see
>> target-i386/kvm.c.
>>
>> More interesting is CPUState::thread_kicked. If it's set, qemu_cpu_kick
>> will skip the kicking via a signal. Maybe there is some race. Let me
>> think about such possibilities again...
>
> diff --git a/cpus.c b/cpus.c
> index e476a3c..30f3228 100644
> --- a/cpus.c
> +++ b/cpus.c
> @@ -726,6 +726,9 @@ static void qemu_kvm_wait_io_event(CPUArchState *env)
>      }
>
>      qemu_kvm_eat_signals(env);
> +    /* Ensure that checking env->stop cannot overtake signal processing so
> +     * that we lose the latter without stopping. */
> +    smp_rmb();

rmb is nonsense. Should be a plain barrier() - if at all.

>      qemu_wait_io_event_common(env);
>  }
>
> Can anyone imagine that such a barrier may actually be required? If it
> is currently possible that env->stop is evaluated before we called into
> sigtimedwait in qemu_kvm_eat_signals, then we could actually eat the
> signal without properly processing its reason (stop).
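For reference, a plain barrier() in the usual sense is a compiler-only barrier: it emits no instruction and only stops the compiler from reusing or reordering memory accesses across it. A minimal sketch, assuming GCC or Clang inline-asm syntax, with the macro, struct and function names invented for illustration:

```c
#include <assert.h>
#include <stddef.h>

/* Compiler-only barrier: no CPU fence, just a "memory" clobber. */
#define compiler_barrier_sketch() __asm__ __volatile__("" ::: "memory")

struct flags_sketch {
    int stop;
    int stopped;
};

/* Placeholder for draining pending signals; does nothing in this model. */
static void eat_signals_stub(struct flags_sketch *cpu)
{
    (void)cpu;
}

/* Returns 1 if a stop request was seen and acknowledged, 0 otherwise. */
static int wait_io_event_sketch(struct flags_sketch *cpu)
{
    assert(cpu != NULL);
    eat_signals_stub(cpu);
    compiler_barrier_sketch();  /* cpu->stop must be loaded after this point */
    if (cpu->stop) {
        cpu->stop = 0;
        cpu->stopped = 1;
        return 1;
    }
    return 0;
}
```

A call into a function the compiler cannot see through, such as a system call wrapper, presumably already has the same effect on reloads, hence the "if at all".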
Jan
--
Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
Corporate Competence Center Embedded Linux
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: qemu-kvm-1.0.1 - unable to exit if vcpu is in infinite loop
2012-08-17 14:41 ` [Qemu-devel] " Jan Kiszka
@ 2012-08-17 15:04 ` Jan Kiszka
-1 siblings, 0 replies; 33+ messages in thread
From: Jan Kiszka @ 2012-08-17 15:04 UTC (permalink / raw)
To: Stefan Hajnoczi; +Cc: Peter Lieven, qemu-devel, kvm, Avi Kivity, Paolo Bonzini
On 2012-08-17 16:41, Jan Kiszka wrote:
> On 2012-08-17 16:36, Jan Kiszka wrote:
>> On 2012-08-17 15:11, Jan Kiszka wrote:
>>> On 2012-08-06 17:11, Stefan Hajnoczi wrote:
>>>> On Thu, Jun 28, 2012 at 2:05 PM, Peter Lieven <pl@dlhnet.de> wrote:
>>>>> I debugged my initial problem further and found out that the problem happens
>>>>> to be that
>>>>> the main thread is stuck in pause_all_vcpus() on reset or quit commands in
>>>>> the monitor
>>>>> if one cpu is stuck in the do-while loop kvm_cpu_exec. If I modify the
>>>>> condition from while (ret == 0)
>>>>> to while ((ret == 0) && !env->stop); it works, but is this the right fix?
>>>>> "Quit" command seems to work, but on "Reset" the VM enters pause state.
>>>>
>>>> I think I'm hitting something similar. I installed a F17 amd64 guest
>>>> (3.5 kernel) but before booting entered the GRUB boot menu edit mode.
>>>> The guest seemed unresponsive so I switched to the monitor, which also
>>>> froze shortly afterwards. The VNC screen ended up being all black.
>>>>
>>>> qemu-kvm.git/master 3e4305694fd891b69e4450e59ec4c65420907ede
>>>> Linux 3.2.0-3-amd64 from Debian testing
>>>>
>>>> $ qemu-system-x86_64 -enable-kvm -m 1024 -smp 2 -drive
>>>> if=virtio,cache=none,file=f17.img,aio=native -serial stdio
>>>>
>>>> (gdb) thread apply all bt
>>>>
>>>> Thread 3 (Thread 0x7f8008e23700 (LWP 367)):
>>>> #0 0x00007f800f891727 in ioctl () at ../sysdeps/unix/syscall-template.S:82
>>>> #1 0x00007f80137b92c9 in kvm_vcpu_ioctl
>>>> (env=env@entry=0x7f8015b49640, type=type@entry=44672)
>>>> at /home/stefanha/qemu-kvm/kvm-all.c:1619
>>>> #2 0x00007f80137b93fe in kvm_cpu_exec (env=env@entry=0x7f8015b49640)
>>>> at /home/stefanha/qemu-kvm/kvm-all.c:1506
>>>> #3 0x00007f8013766f31 in qemu_kvm_cpu_thread_fn (arg=0x7f8015b49640)
>>>> at /home/stefanha/qemu-kvm/cpus.c:756
>>>> #4 0x00007f800fb4db50 in start_thread (arg=<optimized out>) at
>>>> pthread_create.c:304
>>>> #5 0x00007f800f8986dd in clone () at
>>>> ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
>>>> #6 0x0000000000000000 in ?? ()
>>>>
>>>> This vcpu is still executing guest code and I've seen it successfully
>>>> dispatching I/O. The problem is it's missing the exit_request...
>>>>
>>>> Thread 2 (Thread 0x7f8008622700 (LWP 368)):
>>>> #0 pthread_cond_wait@@GLIBC_2.3.2 ()
>>>> at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
>>>> #1 0x00007f801372b229 in qemu_cond_wait (cond=<optimized out>,
>>>> mutex=mutex@entry=0x7f80144367c0) at qemu-thread-posix.c:113
>>>> #2 0x00007f8013766eff in qemu_kvm_wait_io_event (env=<optimized out>)
>>>> at /home/stefanha/qemu-kvm/cpus.c:724
>>>> #3 qemu_kvm_cpu_thread_fn (arg=0x7f8015b67450) at
>>>> /home/stefanha/qemu-kvm/cpus.c:761
>>>> #4 0x00007f800fb4db50 in start_thread (arg=<optimized out>) at
>>>> pthread_create.c:304
>>>> #5 0x00007f800f8986dd in clone () at
>>>> ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
>>>> #6 0x0000000000000000 in ?? ()
>>>>
>>>> No problems here.
>>>>
>>>> Thread 1 (Thread 0x7f801347b8c0 (LWP 365)):
>>>> #0 pthread_cond_wait@@GLIBC_2.3.2 ()
>>>> at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
>>>> #1 0x00007f801372b229 in qemu_cond_wait (cond=cond@entry=0x7f801402fd80,
>>>> mutex=mutex@entry=0x7f80144367c0) at qemu-thread-posix.c:113
>>>> #2 0x00007f8013768949 in pause_all_vcpus () at
>>>> /home/stefanha/qemu-kvm/cpus.c:962
>>>> #3 0x00007f80136028c8 in main (argc=<optimized out>, argv=<optimized out>,
>>>> envp=<optimized out>) at /home/stefanha/qemu-kvm/vl.c:3695
>>>>
>>>> We're deadlocked in pause_all_vcpus(), waiting for vcpu #0 to pause.
>>>> Unfortunately vcpu #0 has ->exit_request=0 although ->stop=1.
>>>>
>>>> Here are the vcpus:
>>>>
>>>> (gdb) p first_cpu
>>>> $6 = (struct CPUX86State *) 0x7f8015b49640
>>>> (gdb) p first_cpu->next_cpu
>>>> $7 = (struct CPUX86State *) 0x7f8015b67450
>>>> (gdb) p first_cpu->next_cpu->next_cpu
>>>> $8 = (struct CPUX86State *) 0x0
>>>>
>>>> (gdb) p first_cpu->stop
>>>> $9 = 1
>>>> (gdb) p first_cpu->stopped
>>>> $10 = 0
>>>> (gdb) p first_cpu->exit_request
>>>> $11 = 0
>>>
>>> CPUState::exit_request is only set on specific synchronous events, see
>>> target-i386/kvm.c.
>>>
>>> More interesting is CPUState::thread_kicked. If it's set, qemu_cpu_kick
>>> will skip the kicking via a signal. Maybe there is some race. Let me
>>> think about such possibilities again...
>>
>> diff --git a/cpus.c b/cpus.c
>> index e476a3c..30f3228 100644
>> --- a/cpus.c
>> +++ b/cpus.c
>> @@ -726,6 +726,9 @@ static void qemu_kvm_wait_io_event(CPUArchState *env)
>> }
>>
>> qemu_kvm_eat_signals(env);
>> + /* Ensure that checking env->stop cannot overtake signal processing so
>> + * that we lose the latter without stopping. */
>> + smp_rmb();
>
> rmb is nonsense. Should be a plain barrier() - if at all.
>
>> qemu_wait_io_event_common(env);
>> }
>>
>> Can anyone imagine that such a barrier may actually be required? If it
>> is currently possible that env->stop is evaluated before we called into
>> sigtimedwait in qemu_kvm_eat_signals, then we could actually eat the
>> signal without properly processing its reason (stop).
Should not be required (TM): Both signal eating / stop checking and stop
setting / signal generation happen under the BQL, thus the ordering
cannot make a difference here.
Don't see where we could lose a signal. Maybe due to a subtle memory
corruption that sets thread_kicked to non-zero, preventing the kicking
this way.
Jan
--
Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
Corporate Competence Center Embedded Linux
^ permalink raw reply [flat|nested] 33+ messages in thread
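The locking argument above can be illustrated with a small pthread sketch, assuming one global mutex standing in for the BQL. The names big_lock, struct vcpu, vcpu_wait_for_stop and pause_vcpu are hypothetical and this is not the QEMU code. Because the stop flag is written and read only while the mutex is held, the waiter sees the flag no matter which side takes the lock first.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Stands in for the global mutex (BQL) in this sketch. */
pthread_mutex_t big_lock = PTHREAD_MUTEX_INITIALIZER;

struct vcpu {
    pthread_cond_t wake;    /* stands in for the kick */
    pthread_cond_t paused;  /* signalled once the vcpu has stopped */
    bool stop;              /* stop request, set by the main thread */
    bool stopped;           /* acknowledgement, set by the vcpu thread */
};

void vcpu_init(struct vcpu *cpu)
{
    pthread_cond_init(&cpu->wake, NULL);
    pthread_cond_init(&cpu->paused, NULL);
    cpu->stop = false;
    cpu->stopped = false;
}

/* vcpu thread: check the stop flag only while holding the lock. */
void *vcpu_wait_for_stop(void *arg)
{
    struct vcpu *cpu = arg;

    pthread_mutex_lock(&big_lock);
    while (!cpu->stop) {
        pthread_cond_wait(&cpu->wake, &big_lock);
    }
    cpu->stop = false;
    cpu->stopped = true;
    pthread_cond_signal(&cpu->paused);
    pthread_mutex_unlock(&big_lock);
    return NULL;
}

/* Main thread: set the flag and kick under the same lock, then wait. */
void pause_vcpu(struct vcpu *cpu)
{
    pthread_mutex_lock(&big_lock);
    cpu->stop = true;
    pthread_cond_signal(&cpu->wake);
    while (!cpu->stopped) {
        pthread_cond_wait(&cpu->paused, &big_lock);
    }
    pthread_mutex_unlock(&big_lock);
}
```

If pause_vcpu() runs first, the vcpu thread finds stop already set and never sleeps; if the vcpu thread runs first, the condition variable wakes it. Neither ordering loses the request, which is the point of the "under the BQL" remark.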
* Re: qemu-kvm-1.0.1 - unable to exit if vcpu is in infinite loop
2012-08-17 15:04 ` [Qemu-devel] " Jan Kiszka
@ 2012-08-19 9:42 ` Avi Kivity
-1 siblings, 0 replies; 33+ messages in thread
From: Avi Kivity @ 2012-08-19 9:42 UTC (permalink / raw)
To: Jan Kiszka; +Cc: Stefan Hajnoczi, Peter Lieven, qemu-devel, kvm, Paolo Bonzini
On 08/17/2012 06:04 PM, Jan Kiszka wrote:
>
>>> Can anyone imagine that such a barrier may actually be required? If it
>>> is currently possible that env->stop is evaluated before we called into
>>> sigtimedwait in qemu_kvm_eat_signals, then we could actually eat the
>>> signal without properly processing its reason (stop).
>
> Should not be required (TM): Both signal eating / stop checking and stop
> setting / signal generation happen under the BQL, thus the ordering
> cannot make a difference here.
Agree.
> Don't see where we could lose a signal. Maybe due to a subtle memory
> corruption that sets thread_kicked to non-zero, preventing the kicking
> this way.
Cannot be ruled out, yet too much of a coincidence.
Could be a kernel bug (either in kvm or elsewhere), we've had several
before in this area.
Is this reproducible?
--
error compiling committee.c: too many arguments to function
^ permalink raw reply [flat|nested] 33+ messages in thread
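To make the thread_kicked hypothesis concrete: if a kick is skipped whenever the flag is already set, a flag that is stuck at non-zero silently suppresses every later kick. A minimal sketch under that assumption. The struct and the functions cpu_kick and cpu_kick_handled are hypothetical models, not the QEMU implementation, and the signal is replaced by a counter.

```c
#include <stdbool.h>

struct vcpu {
    bool thread_kicked;  /* true while a kick is considered pending */
    int signals_sent;    /* counts simulated signal deliveries */
};

/* Send a kick unless one is already marked as pending. */
void cpu_kick(struct vcpu *cpu)
{
    if (!cpu->thread_kicked) {
        cpu->signals_sent++;        /* stands in for signalling the vcpu thread */
        cpu->thread_kicked = true;
    }
}

/* Called by the vcpu thread once it has processed the kick. */
void cpu_kick_handled(struct vcpu *cpu)
{
    cpu->thread_kicked = false;
}
```

With a zeroed struct, two back-to-back calls to cpu_kick() deliver one signal. With thread_kicked preset to true and never cleared, cpu_kick() delivers none, which would leave a waiter in pause_all_vcpus() blocked as seen in the backtrace.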
* Re: qemu-kvm-1.0.1 - unable to exit if vcpu is in infinite loop
2012-08-19 9:42 ` [Qemu-devel] " Avi Kivity
@ 2012-08-21 7:21 ` Jan Kiszka
-1 siblings, 0 replies; 33+ messages in thread
From: Jan Kiszka @ 2012-08-21 7:21 UTC (permalink / raw)
To: Avi Kivity; +Cc: Stefan Hajnoczi, Peter Lieven, qemu-devel, kvm, Paolo Bonzini
On 2012-08-19 11:42, Avi Kivity wrote:
> On 08/17/2012 06:04 PM, Jan Kiszka wrote:
>>
>>>> Can anyone imagine that such a barrier may actually be required? If it
>>>> is currently possible that env->stop is evaluated before we called into
>>>> sigtimedwait in qemu_kvm_eat_signals, then we could actually eat the
>>>> signal without properly processing its reason (stop).
>>
>> Should not be required (TM): Both signal eating / stop checking and stop
>> setting / signal generation happen under the BQL, thus the ordering
>> cannot make a difference here.
>
> Agree.
>
>
>> Don't see where we could lose a signal. Maybe due to a subtle memory
>> corruption that sets thread_kicked to non-zero, preventing the kicking
>> this way.
>
> Cannot be ruled out, yet too much of a coincidence.
>
> Could be a kernel bug (either in kvm or elsewhere), we've had several
> before in this area.
>
> Is this reproducible?
Not for me. Stefan only hit it very rarely, Peter obviously more easily.
Jan
--
Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
Corporate Competence Center Embedded Linux
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: qemu-kvm-1.0.1 - unable to exit if vcpu is in infinite loop
2012-08-21 7:21 ` [Qemu-devel] " Jan Kiszka
@ 2012-08-21 8:23 ` Stefan Hajnoczi
-1 siblings, 0 replies; 33+ messages in thread
From: Stefan Hajnoczi @ 2012-08-21 8:23 UTC (permalink / raw)
To: Avi Kivity; +Cc: Peter Lieven, qemu-devel, kvm, Paolo Bonzini, Jan Kiszka
On Tue, Aug 21, 2012 at 8:21 AM, Jan Kiszka <jan.kiszka@siemens.com> wrote:
> On 2012-08-19 11:42, Avi Kivity wrote:
>> On 08/17/2012 06:04 PM, Jan Kiszka wrote:
>>>
>>>>> Can anyone imagine that such a barrier may actually be required? If it
>>>>> is currently possible that env->stop is evaluated before we called into
>>>>> sigtimedwait in qemu_kvm_eat_signals, then we could actually eat the
>>>>> signal without properly processing its reason (stop).
>>>
>>> Should not be required (TM): Both signal eating / stop checking and stop
>>> setting / signal generation happen under the BQL, thus the ordering
>>> cannot make a difference here.
>>
>> Agree.
>>
>>
>>> Don't see where we could lose a signal. Maybe due to a subtle memory
>>> corruption that sets thread_kicked to non-zero, preventing the kicking
>>> this way.
>>
>> Cannot be ruled out, yet too much of a coincidence.
>>
>> Could be a kernel bug (either in kvm or elsewhere), we've had several
>> before in this area.
>>
>> Is this reproducible?
>
> Not for me. Stefan only hit it very rarely, Peter obviously more easily.
I have only hit this once and was not able to reproduce it.
Stefan
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: qemu-kvm-1.0.1 - unable to exit if vcpu is in infinite loop
2012-08-21 8:23 ` [Qemu-devel] " Stefan Hajnoczi
@ 2012-08-22 12:52 ` Peter Lieven
-1 siblings, 0 replies; 33+ messages in thread
From: Peter Lieven @ 2012-08-22 12:52 UTC (permalink / raw)
To: Stefan Hajnoczi; +Cc: Paolo Bonzini, Jan Kiszka, Avi Kivity, kvm, qemu-devel
On 08/21/12 10:23, Stefan Hajnoczi wrote:
> On Tue, Aug 21, 2012 at 8:21 AM, Jan Kiszka<jan.kiszka@siemens.com> wrote:
>> On 2012-08-19 11:42, Avi Kivity wrote:
>>> On 08/17/2012 06:04 PM, Jan Kiszka wrote:
>>>>>> Can anyone imagine that such a barrier may actually be required? If it
>>>>>> is currently possible that env->stop is evaluated before we called into
>>>>>> sigtimedwait in qemu_kvm_eat_signals, then we could actually eat the
>>>>>> signal without properly processing its reason (stop).
>>>> Should not be required (TM): Both signal eating / stop checking and stop
>>>> setting / signal generation happen under the BQL, thus the ordering
>>>> cannot make a difference here.
>>> Agree.
>>>
>>>
>>>> Don't see where we could lose a signal. Maybe due to a subtle memory
>>>> corruption that sets thread_kicked to non-zero, preventing the kicking
>>>> this way.
>>> Cannot be ruled out, yet too much of a coincidence.
>>>
>>> Could be a kernel bug (either in kvm or elsewhere), we've had several
>>> before in this area.
>>>
>>> Is this reproducible?
>> Not for me. Stefan only hit it very rarely, Peter obviously more easily.
> I have only hit this once and was not able to reproduce it.
For me it was very reproducible, but my issue was fixed by:
http://www.mail-archive.com/kvm@vger.kernel.org/msg70908.html
Never seen this since then,
Peter
> Stefan
^ permalink raw reply [flat|nested] 33+ messages in thread
Thread overview: 33+ messages
2012-06-28 13:05 qemu-kvm-1.0.1 - unable to exit if vcpu is in infinite loop Peter Lieven
2012-06-28 13:25 ` Jan Kiszka
2012-06-28 15:02 ` Peter Lieven
2012-06-28 15:22 ` Jan Kiszka
2012-06-28 16:29 ` Peter Lieven
2012-06-28 16:32 ` Avi Kivity
2012-06-28 19:27 ` Peter Lieven
2012-07-01 8:19 ` Avi Kivity
2012-07-01 8:19 ` [Qemu-devel] " Avi Kivity
2012-07-01 19:18 ` Peter Lieven
2012-07-01 19:18 ` [Qemu-devel] " Peter Lieven
2012-07-02 7:05 ` Jan Kiszka
2012-07-02 7:05 ` [Qemu-devel] " Jan Kiszka
2012-07-02 8:12 ` Peter Lieven
2012-07-02 8:12 ` [Qemu-devel] " Peter Lieven
2012-08-06 15:11 ` Stefan Hajnoczi
2012-08-06 15:11 ` [Qemu-devel] " Stefan Hajnoczi
2012-08-17 13:11 ` Jan Kiszka
2012-08-17 13:11 ` [Qemu-devel] " Jan Kiszka
2012-08-17 14:36 ` Jan Kiszka
2012-08-17 14:36 ` [Qemu-devel] " Jan Kiszka
2012-08-17 14:41 ` Jan Kiszka
2012-08-17 14:41 ` [Qemu-devel] " Jan Kiszka
2012-08-17 15:04 ` Jan Kiszka
2012-08-17 15:04 ` [Qemu-devel] " Jan Kiszka
2012-08-19 9:42 ` Avi Kivity
2012-08-19 9:42 ` [Qemu-devel] " Avi Kivity
2012-08-21 7:21 ` Jan Kiszka
2012-08-21 7:21 ` [Qemu-devel] " Jan Kiszka
2012-08-21 8:23 ` Stefan Hajnoczi
2012-08-21 8:23 ` [Qemu-devel] " Stefan Hajnoczi
2012-08-22 12:52 ` Peter Lieven
2012-08-22 12:52 ` [Qemu-devel] " Peter Lieven