linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [Regression, bisected] a3e06bbe8445f57eb949e6474c5a9b30f24d2057: KVM: emulate lapic tsc deadline timer for guest"
@ 2011-12-12  0:32 Alexey Zaytsev
  2011-12-12  6:13 ` Liu, Jinsong
  0 siblings, 1 reply; 36+ messages in thread
From: Alexey Zaytsev @ 2011-12-12  0:32 UTC (permalink / raw)
  To: Kernel development list
  Cc: Liu Jinsong, Jan Kiszka, Marcelo Tosatti, Avi Kivity

Hi.

After a recent change, qemu --enable-kvm (both 0.15.92 and 1.0) fails
to boot illumos. The OS gets stuck pretty late in the boot process,
without any hints from the guest. MDB (the illumos kenrnel debugger)
failed to clarify the situation at once, the kernel seems to be stuck
idling. I've bisected the problem to commit
a3e06bbe8445f57eb949e6474c5a9b30f24d2057. Please let me know if you
need me to collect any debug information or test any patches.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* RE: [Regression, bisected] a3e06bbe8445f57eb949e6474c5a9b30f24d2057: KVM: emulate lapic tsc deadline timer for guest"
  2011-12-12  0:32 [Regression, bisected] a3e06bbe8445f57eb949e6474c5a9b30f24d2057: KVM: emulate lapic tsc deadline timer for guest" Alexey Zaytsev
@ 2011-12-12  6:13 ` Liu, Jinsong
  2011-12-14  9:37   ` Alexey Zaytsev
  0 siblings, 1 reply; 36+ messages in thread
From: Liu, Jinsong @ 2011-12-12  6:13 UTC (permalink / raw)
  To: Alexey Zaytsev, Kernel development list
  Cc: Jan Kiszka, Marcelo Tosatti, Avi Kivity

Alexey Zaytsev wrote:
> Hi.
> 
> After a recent change, qemu --enable-kvm (both 0.15.92 and 1.0) fails
> to boot illumos. The OS gets stuck pretty late in the boot process,
> without any hints from the guest. MDB (the illumos kenrnel debugger)
> failed to clarify the situation at once, the kernel seems to be stuck
> idling. I've bisected the problem to commit
> a3e06bbe8445f57eb949e6474c5a9b30f24d2057. Please let me know if you
> need me to collect any debug information or test any patches.

Alexey,

Does illumos use tsc deadline timer? and do you run it at Intel platform?

If yes, would you please help me to collect debug information by add some printk points at kvm tsc deadline timer logic? currently I totally have no clue to figure out what the issue root from :)
and, would you please check illumos tsc deadline timer logic by confirm whether illumos
1). Enumerate tsc deadline timer capability by CPUID;
2). Enable tsc deadline timer mode by lapic MMIO;
3). Start tsc deadline timer by WRMSR;

Thanks,
Jinsong

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [Regression, bisected] a3e06bbe8445f57eb949e6474c5a9b30f24d2057: KVM: emulate lapic tsc deadline timer for guest"
  2011-12-12  6:13 ` Liu, Jinsong
@ 2011-12-14  9:37   ` Alexey Zaytsev
  2011-12-20  8:43     ` Alexey Zaytsev
  0 siblings, 1 reply; 36+ messages in thread
From: Alexey Zaytsev @ 2011-12-14  9:37 UTC (permalink / raw)
  To: Liu, Jinsong
  Cc: Kernel development list, Jan Kiszka, Marcelo Tosatti, Avi Kivity,
	Garrett D'Amore

On Mon, Dec 12, 2011 at 10:13, Liu, Jinsong <jinsong.liu@intel.com> wrote:
> Alexey Zaytsev wrote:
>> Hi.
>>
>> After a recent change, qemu --enable-kvm (both 0.15.92 and 1.0) fails
>> to boot illumos. The OS gets stuck pretty late in the boot process,
>> without any hints from the guest. MDB (the illumos kenrnel debugger)
>> failed to clarify the situation at once, the kernel seems to be stuck
>> idling. I've bisected the problem to commit
>> a3e06bbe8445f57eb949e6474c5a9b30f24d2057. Please let me know if you
>> need me to collect any debug information or test any patches.
>
> Alexey,
>
> Does illumos use tsc deadline timer? and do you run it at Intel platform?
>
> If yes, would you please help me to collect debug information by add some printk points at kvm tsc deadline timer logic? currently I totally have no clue to figure out what the issue root from :)
> and, would you please check illumos tsc deadline timer logic by confirm whether illumos
> 1). Enumerate tsc deadline timer capability by CPUID;
> 2). Enable tsc deadline timer mode by lapic MMIO;
> 3). Start tsc deadline timer by WRMSR;

I think, here's what's going on. We get into
kvm_get_lapic_tscdeadline_msr/kvm_set_lapic_tscdeadline_msr, but qemu
does not use the KVM_CREATE_IRQCHIP ioctl, so we are running without
vcpu->arch.apic and the functions do nothing. And the Linux guest just
handles broken tsc a lot  better, so it survives. Both guests seem to
work fine with qemu-kvm, which uses the ioctl.

>
> Thanks,
> Jinsong

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [Regression, bisected] a3e06bbe8445f57eb949e6474c5a9b30f24d2057: KVM: emulate lapic tsc deadline timer for guest"
  2011-12-14  9:37   ` Alexey Zaytsev
@ 2011-12-20  8:43     ` Alexey Zaytsev
  2011-12-20  8:53       ` Liu, Jinsong
  0 siblings, 1 reply; 36+ messages in thread
From: Alexey Zaytsev @ 2011-12-20  8:43 UTC (permalink / raw)
  To: Liu, Jinsong
  Cc: Kernel development list, Jan Kiszka, Marcelo Tosatti, Avi Kivity,
	Garrett D'Amore

On Wed, Dec 14, 2011 at 11:37, Alexey Zaytsev
<alexey.zaytsev@nexenta.com> wrote:
> On Mon, Dec 12, 2011 at 10:13, Liu, Jinsong <jinsong.liu@intel.com> wrote:
>> Alexey Zaytsev wrote:
>>> Hi.
>>>
>>> After a recent change, qemu --enable-kvm (both 0.15.92 and 1.0) fails
>>> to boot illumos. The OS gets stuck pretty late in the boot process,
>>> without any hints from the guest. MDB (the illumos kenrnel debugger)
>>> failed to clarify the situation at once, the kernel seems to be stuck
>>> idling. I've bisected the problem to commit
>>> a3e06bbe8445f57eb949e6474c5a9b30f24d2057. Please let me know if you
>>> need me to collect any debug information or test any patches.
>>
>> Alexey,
>>
>> Does illumos use tsc deadline timer? and do you run it at Intel platform?
>>
>> If yes, would you please help me to collect debug information by add some printk points at kvm tsc deadline timer logic? currently I totally have no clue to figure out what the issue root from :)
>> and, would you please check illumos tsc deadline timer logic by confirm whether illumos
>> 1). Enumerate tsc deadline timer capability by CPUID;
>> 2). Enable tsc deadline timer mode by lapic MMIO;
>> 3). Start tsc deadline timer by WRMSR;
>
> I think, here's what's going on. We get into
> kvm_get_lapic_tscdeadline_msr/kvm_set_lapic_tscdeadline_msr, but qemu
> does not use the KVM_CREATE_IRQCHIP ioctl, so we are running without
> vcpu->arch.apic and the functions do nothing. And the Linux guest just
> handles broken tsc a lot  better, so it survives. Both guests seem to
> work fine with qemu-kvm, which uses the ioctl.
>

Hi. Do you need any more info on thins?

^ permalink raw reply	[flat|nested] 36+ messages in thread

* RE: [Regression, bisected] a3e06bbe8445f57eb949e6474c5a9b30f24d2057: KVM: emulate lapic tsc deadline timer for guest"
  2011-12-20  8:43     ` Alexey Zaytsev
@ 2011-12-20  8:53       ` Liu, Jinsong
  2011-12-20  8:58         ` Alexey Zaytsev
  0 siblings, 1 reply; 36+ messages in thread
From: Liu, Jinsong @ 2011-12-20  8:53 UTC (permalink / raw)
  To: Alexey Zaytsev
  Cc: Kernel development list, Jan Kiszka, Marcelo Tosatti, Avi Kivity,
	Garrett D'Amore

Alexey Zaytsev wrote:
> On Wed, Dec 14, 2011 at 11:37, Alexey Zaytsev
> <alexey.zaytsev@nexenta.com> wrote:
>> On Mon, Dec 12, 2011 at 10:13, Liu, Jinsong <jinsong.liu@intel.com>
>> wrote: 
>>> Alexey Zaytsev wrote:
>>>> Hi.
>>>> 
>>>> After a recent change, qemu --enable-kvm (both 0.15.92 and 1.0)
>>>> fails to boot illumos. The OS gets stuck pretty late in the boot
>>>> process, without any hints from the guest. MDB (the illumos
>>>> kenrnel debugger) failed to clarify the situation at once, the
>>>> kernel seems to be stuck idling. I've bisected the problem to
>>>> commit a3e06bbe8445f57eb949e6474c5a9b30f24d2057. Please let me
>>>> know if you need me to collect any debug information or test any
>>>> patches. 
>>> 
>>> Alexey,
>>> 
>>> Does illumos use tsc deadline timer? and do you run it at Intel
>>> platform? 
>>> 
>>> If yes, would you please help me to collect debug information by
>>> add some printk points at kvm tsc deadline timer logic? currently I
>>> totally have no clue to figure out what the issue root from :) and,
>>> would you please check illumos tsc deadline timer logic by confirm
>>> whether illumos 1). Enumerate tsc deadline timer capability by
>>> CPUID; 2). Enable tsc deadline timer mode by lapic MMIO; 3). Start
>>> tsc deadline timer by WRMSR;  
>> 
>> I think, here's what's going on. We get into
>> kvm_get_lapic_tscdeadline_msr/kvm_set_lapic_tscdeadline_msr, but qemu
>> does not use the KVM_CREATE_IRQCHIP ioctl, so we are running without
>> vcpu->arch.apic and the functions do nothing. And the Linux guest
>> just handles broken tsc a lot  better, so it survives. Both guests
>> seem to work fine with qemu-kvm, which uses the ioctl.
>> 
> 
> Hi. Do you need any more info on thins?

Seems illumos works fine w/ qemu-kvm, that's enough. Maybe you need to wait qemu-kvm sync into qemu?

Thanks,
Jinsong

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [Regression, bisected] a3e06bbe8445f57eb949e6474c5a9b30f24d2057: KVM: emulate lapic tsc deadline timer for guest"
  2011-12-20  8:53       ` Liu, Jinsong
@ 2011-12-20  8:58         ` Alexey Zaytsev
  2011-12-20  9:26           ` Liu, Jinsong
  0 siblings, 1 reply; 36+ messages in thread
From: Alexey Zaytsev @ 2011-12-20  8:58 UTC (permalink / raw)
  To: Liu, Jinsong
  Cc: Kernel development list, Jan Kiszka, Marcelo Tosatti, Avi Kivity,
	Garrett D'Amore, Linus Torvalds

On Tue, Dec 20, 2011 at 10:53, Liu, Jinsong <jinsong.liu@intel.com> wrote:
> Alexey Zaytsev wrote:
>> On Wed, Dec 14, 2011 at 11:37, Alexey Zaytsev
>> <alexey.zaytsev@nexenta.com> wrote:
>>> On Mon, Dec 12, 2011 at 10:13, Liu, Jinsong <jinsong.liu@intel.com>
>>> wrote:
>>>> Alexey Zaytsev wrote:
>>>>> Hi.
>>>>>
>>>>> After a recent change, qemu --enable-kvm (both 0.15.92 and 1.0)
>>>>> fails to boot illumos. The OS gets stuck pretty late in the boot
>>>>> process, without any hints from the guest. MDB (the illumos
>>>>> kenrnel debugger) failed to clarify the situation at once, the
>>>>> kernel seems to be stuck idling. I've bisected the problem to
>>>>> commit a3e06bbe8445f57eb949e6474c5a9b30f24d2057. Please let me
>>>>> know if you need me to collect any debug information or test any
>>>>> patches.
>>>>
>>>> Alexey,
>>>>
>>>> Does illumos use tsc deadline timer? and do you run it at Intel
>>>> platform?
>>>>
>>>> If yes, would you please help me to collect debug information by
>>>> add some printk points at kvm tsc deadline timer logic? currently I
>>>> totally have no clue to figure out what the issue root from :) and,
>>>> would you please check illumos tsc deadline timer logic by confirm
>>>> whether illumos 1). Enumerate tsc deadline timer capability by
>>>> CPUID; 2). Enable tsc deadline timer mode by lapic MMIO; 3). Start
>>>> tsc deadline timer by WRMSR;
>>>
>>> I think, here's what's going on. We get into
>>> kvm_get_lapic_tscdeadline_msr/kvm_set_lapic_tscdeadline_msr, but qemu
>>> does not use the KVM_CREATE_IRQCHIP ioctl, so we are running without
>>> vcpu->arch.apic and the functions do nothing. And the Linux guest
>>> just handles broken tsc a lot  better, so it survives. Both guests
>>> seem to work fine with qemu-kvm, which uses the ioctl.
>>>
>>
>> Hi. Do you need any more info on thins?
>
> Seems illumos works fine w/ qemu-kvm, that's enough. Maybe you need to wait qemu-kvm sync into qemu?
>

Qemu-kvm is fine for me, but this is a regression and should be fixed.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* RE: [Regression, bisected] a3e06bbe8445f57eb949e6474c5a9b30f24d2057: KVM: emulate lapic tsc deadline timer for guest"
  2011-12-20  8:58         ` Alexey Zaytsev
@ 2011-12-20  9:26           ` Liu, Jinsong
  2011-12-20  9:48             ` Avi Kivity
  2011-12-20  9:51             ` Alexey Zaytsev
  0 siblings, 2 replies; 36+ messages in thread
From: Liu, Jinsong @ 2011-12-20  9:26 UTC (permalink / raw)
  To: Alexey Zaytsev
  Cc: Kernel development list, Jan Kiszka, Marcelo Tosatti, Avi Kivity,
	Garrett D'Amore, Linus Torvalds

Alexey Zaytsev wrote:
> On Tue, Dec 20, 2011 at 10:53, Liu, Jinsong <jinsong.liu@intel.com>
> wrote: 
>> Alexey Zaytsev wrote:
>>> On Wed, Dec 14, 2011 at 11:37, Alexey Zaytsev
>>> <alexey.zaytsev@nexenta.com> wrote:
>>>> On Mon, Dec 12, 2011 at 10:13, Liu, Jinsong <jinsong.liu@intel.com>
>>>> wrote:
>>>>> Alexey Zaytsev wrote:
>>>>>> Hi.
>>>>>> 
>>>>>> After a recent change, qemu --enable-kvm (both 0.15.92 and 1.0)
>>>>>> fails to boot illumos. The OS gets stuck pretty late in the boot
>>>>>> process, without any hints from the guest. MDB (the illumos
>>>>>> kenrnel debugger) failed to clarify the situation at once, the
>>>>>> kernel seems to be stuck idling. I've bisected the problem to
>>>>>> commit a3e06bbe8445f57eb949e6474c5a9b30f24d2057. Please let me
>>>>>> know if you need me to collect any debug information or test any
>>>>>> patches.
>>>>> 
>>>>> Alexey,
>>>>> 
>>>>> Does illumos use tsc deadline timer? and do you run it at Intel
>>>>> platform? 
>>>>> 
>>>>> If yes, would you please help me to collect debug information by
>>>>> add some printk points at kvm tsc deadline timer logic? currently
>>>>> I totally have no clue to figure out what the issue root from :)
>>>>> and, would you please check illumos tsc deadline timer logic by
>>>>> confirm whether illumos 1). Enumerate tsc deadline timer
>>>>> capability by CPUID; 2). Enable tsc deadline timer mode by lapic
>>>>> MMIO; 3). Start tsc deadline timer by WRMSR;
>>>> 
>>>> I think, here's what's going on. We get into
>>>> kvm_get_lapic_tscdeadline_msr/kvm_set_lapic_tscdeadline_msr, but
>>>> qemu does not use the KVM_CREATE_IRQCHIP ioctl, so we are running
>>>> without vcpu->arch.apic and the functions do nothing. And the
>>>> Linux guest just handles broken tsc a lot  better, so it survives.
>>>> Both guests seem to work fine with qemu-kvm, which uses the ioctl.
>>>> 
>>> 
>>> Hi. Do you need any more info on thins?
>> 
>> Seems illumos works fine w/ qemu-kvm, that's enough. Maybe you need
>> to wait qemu-kvm sync into qemu? 
>> 
> 
> Qemu-kvm is fine for me, but this is a regression and should be fixed.

Sorry, I haven't your environment to debug illumos regression.
I don't think it directly related to kvm tsc deadline timer, and I think it's no value to do it now, since 
for linux guest it works fine w/ both qemu & qemu-kvm,
for illumos guest you just need wait qemu-kvm sync into qemu.

Thanks,
Jinsong

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [Regression, bisected] a3e06bbe8445f57eb949e6474c5a9b30f24d2057: KVM: emulate lapic tsc deadline timer for guest"
  2011-12-20  9:26           ` Liu, Jinsong
@ 2011-12-20  9:48             ` Avi Kivity
  2011-12-20  9:51             ` Alexey Zaytsev
  1 sibling, 0 replies; 36+ messages in thread
From: Avi Kivity @ 2011-12-20  9:48 UTC (permalink / raw)
  To: Liu, Jinsong
  Cc: Alexey Zaytsev, Kernel development list, Jan Kiszka,
	Marcelo Tosatti, Garrett D'Amore, Linus Torvalds

On 12/20/2011 11:26 AM, Liu, Jinsong wrote:
> > 
> > Qemu-kvm is fine for me, but this is a regression and should be fixed.
>
> Sorry, I haven't your environment to debug illumos regression.
> I don't think it directly related to kvm tsc deadline timer, and I think it's no value to do it now, since 
> for linux guest it works fine w/ both qemu & qemu-kvm,
> for illumos guest you just need wait qemu-kvm sync into qemu.

No, we need to understand what's going on here, instead of sweeping it
under the rug.

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [Regression, bisected] a3e06bbe8445f57eb949e6474c5a9b30f24d2057: KVM: emulate lapic tsc deadline timer for guest"
  2011-12-20  9:26           ` Liu, Jinsong
  2011-12-20  9:48             ` Avi Kivity
@ 2011-12-20  9:51             ` Alexey Zaytsev
  2011-12-20 18:58               ` Linus Torvalds
  2011-12-20 19:05               ` [Regression, bisected] a3e06bbe8445f57eb949e6474c5a9b30f24d2057: KVM: emulate lapic tsc deadline timer for guest" Liu, Jinsong
  1 sibling, 2 replies; 36+ messages in thread
From: Alexey Zaytsev @ 2011-12-20  9:51 UTC (permalink / raw)
  To: Liu, Jinsong
  Cc: Kernel development list, Jan Kiszka, Marcelo Tosatti, Avi Kivity,
	Garrett D'Amore, Linus Torvalds

On Tue, Dec 20, 2011 at 11:26, Liu, Jinsong <jinsong.liu@intel.com> wrote:
> Alexey Zaytsev wrote:
>> On Tue, Dec 20, 2011 at 10:53, Liu, Jinsong <jinsong.liu@intel.com>
>> wrote:
>>> Alexey Zaytsev wrote:
>>>> On Wed, Dec 14, 2011 at 11:37, Alexey Zaytsev
>>>> <alexey.zaytsev@nexenta.com> wrote:
>>>>> On Mon, Dec 12, 2011 at 10:13, Liu, Jinsong <jinsong.liu@intel.com>
>>>>> wrote:
>>>>>> Alexey Zaytsev wrote:
>>>>>>> Hi.
>>>>>>>
>>>>>>> After a recent change, qemu --enable-kvm (both 0.15.92 and 1.0)
>>>>>>> fails to boot illumos. The OS gets stuck pretty late in the boot
>>>>>>> process, without any hints from the guest. MDB (the illumos
>>>>>>> kenrnel debugger) failed to clarify the situation at once, the
>>>>>>> kernel seems to be stuck idling. I've bisected the problem to
>>>>>>> commit a3e06bbe8445f57eb949e6474c5a9b30f24d2057. Please let me
>>>>>>> know if you need me to collect any debug information or test any
>>>>>>> patches.
>>>>>>
>>>>>> Alexey,
>>>>>>
>>>>>> Does illumos use tsc deadline timer? and do you run it at Intel
>>>>>> platform?
>>>>>>
>>>>>> If yes, would you please help me to collect debug information by
>>>>>> add some printk points at kvm tsc deadline timer logic? currently
>>>>>> I totally have no clue to figure out what the issue root from :)
>>>>>> and, would you please check illumos tsc deadline timer logic by
>>>>>> confirm whether illumos 1). Enumerate tsc deadline timer
>>>>>> capability by CPUID; 2). Enable tsc deadline timer mode by lapic
>>>>>> MMIO; 3). Start tsc deadline timer by WRMSR;
>>>>>
>>>>> I think, here's what's going on. We get into
>>>>> kvm_get_lapic_tscdeadline_msr/kvm_set_lapic_tscdeadline_msr, but
>>>>> qemu does not use the KVM_CREATE_IRQCHIP ioctl, so we are running
>>>>> without vcpu->arch.apic and the functions do nothing. And the
>>>>> Linux guest just handles broken tsc a lot  better, so it survives.
>>>>> Both guests seem to work fine with qemu-kvm, which uses the ioctl.
>>>>>
>>>>
>>>> Hi. Do you need any more info on thins?
>>>
>>> Seems illumos works fine w/ qemu-kvm, that's enough. Maybe you need
>>> to wait qemu-kvm sync into qemu?
>>>
>>
>> Qemu-kvm is fine for me, but this is a regression and should be fixed.
>
> Sorry, I haven't your environment to debug illumos regression.
> I don't think it directly related to kvm tsc deadline timer, and I think it's no value to do it now, since
> for linux guest it works fine w/ both qemu & qemu-kvm,
> for illumos guest you just need wait qemu-kvm sync into qemu.
>

Let me clarify the situation.

Before this commit, the tsc was advertised in cpuid, and it was
handled, if I understand things correctly, by qemu.
After this commit, the tsc is advertised in cpuid, and is handled in
the kernel, but only after qemu issues KVM_CREATE_IRQCHIP. If it does
not issue the ioctl, the kernel just discards any wrmsrs done to the
tsc. This does not look like an Illumos problem to me. Linux guests
kind of work here, because they are  prepared to work on utterly
broken hardware. Good for you, but please don't break less-prepared
guests.

You can just run a Linux guest in qemu, and see what happens in
kvm_get_lapic_tscdeadline_msr and kvm_set_lapic_tscdeadline_msr.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [Regression, bisected] a3e06bbe8445f57eb949e6474c5a9b30f24d2057: KVM: emulate lapic tsc deadline timer for guest"
  2011-12-20  9:51             ` Alexey Zaytsev
@ 2011-12-20 18:58               ` Linus Torvalds
  2011-12-20 19:21                 ` Liu, Jinsong
  2011-12-20 23:04                 ` Jan Kiszka
  2011-12-20 19:05               ` [Regression, bisected] a3e06bbe8445f57eb949e6474c5a9b30f24d2057: KVM: emulate lapic tsc deadline timer for guest" Liu, Jinsong
  1 sibling, 2 replies; 36+ messages in thread
From: Linus Torvalds @ 2011-12-20 18:58 UTC (permalink / raw)
  To: Alexey Zaytsev
  Cc: Liu, Jinsong, Kernel development list, Jan Kiszka,
	Marcelo Tosatti, Avi Kivity, Garrett D'Amore

On Tue, Dec 20, 2011 at 1:51 AM, Alexey Zaytsev
<alexey.zaytsev@gmail.com> wrote:
>
> Let me clarify the situation.
>
> Before this commit, the tsc was advertised in cpuid, and it was
> handled, if I understand things correctly, by qemu.
> After this commit, the tsc is advertised in cpuid, and is handled in
> the kernel, but only after qemu issues KVM_CREATE_IRQCHIP. If it does
> not issue the ioctl, the kernel just discards any wrmsrs done to the
> tsc. This does not look like an Illumos problem to me. Linux guests
> kind of work here, because they are  prepared to work on utterly
> broken hardware. Good for you, but please don't break less-prepared
> guests.

Yes. This is a regression, and needs to be fixed.

Liu, if you don't have time to debug it, we'll just revert the commit.
It's that easy. Regressions are not allowed. There are no excuses.

In particular, saying "just wait for qemu-kvm" is not an acceptable
answer, because the point is that things *used* to work, and they
broke. No "change it to work with the new kernel" allowed, except for
some *very* rare critical circumstances (usually "major security-bug
that we had to fix, and people who relied on it are thus out of
luck").

Commit a3e06bbe8445 still seems to revert cleanly, so that is the easy option.

That said, it sounds like maybe another solution is to start with the
TSC_DEADLINE timer bit in cpuid cleared, and only setting it after the
KVM_CREATE_IRQCHIP ioctl.

In fact, the patch is clearly buggy, in that it apparently doesn't
emulate TSC_DEADLINE correctly and natively on its own.

Jan, Marcelo, Avi - is there a quick fix, or should I just revert?

And please don't *EVER* tell people that they should just work around
regressions. Regressions are absolutely unacceptable. Kernel people
need to understand that.

                   Linus

^ permalink raw reply	[flat|nested] 36+ messages in thread

* RE: [Regression, bisected] a3e06bbe8445f57eb949e6474c5a9b30f24d2057: KVM: emulate lapic tsc deadline timer for guest"
  2011-12-20  9:51             ` Alexey Zaytsev
  2011-12-20 18:58               ` Linus Torvalds
@ 2011-12-20 19:05               ` Liu, Jinsong
  2011-12-21  9:20                 ` Avi Kivity
  1 sibling, 1 reply; 36+ messages in thread
From: Liu, Jinsong @ 2011-12-20 19:05 UTC (permalink / raw)
  To: Alexey Zaytsev
  Cc: Kernel development list, Jan Kiszka, Marcelo Tosatti, Avi Kivity,
	Garrett D'Amore, Linus Torvalds

Alexey Zaytsev wrote:
> On Tue, Dec 20, 2011 at 11:26, Liu, Jinsong <jinsong.liu@intel.com>
> wrote: 
>> Alexey Zaytsev wrote:
>>> On Tue, Dec 20, 2011 at 10:53, Liu, Jinsong <jinsong.liu@intel.com>
>>> wrote:
>>>> Alexey Zaytsev wrote:
>>>>> On Wed, Dec 14, 2011 at 11:37, Alexey Zaytsev
>>>>> <alexey.zaytsev@nexenta.com> wrote:
>>>>>> On Mon, Dec 12, 2011 at 10:13, Liu, Jinsong
>>>>>> <jinsong.liu@intel.com> wrote:
>>>>>>> Alexey Zaytsev wrote:
>>>>>>>> Hi.
>>>>>>>> 
>>>>>>>> After a recent change, qemu --enable-kvm (both 0.15.92 and 1.0)
>>>>>>>> fails to boot illumos. The OS gets stuck pretty late in the
>>>>>>>> boot process, without any hints from the guest. MDB (the
>>>>>>>> illumos kenrnel debugger) failed to clarify the situation at
>>>>>>>> once, the kernel seems to be stuck idling. I've bisected the
>>>>>>>> problem to commit a3e06bbe8445f57eb949e6474c5a9b30f24d2057.
>>>>>>>> Please let me know if you need me to collect any debug
>>>>>>>> information or test any patches.
>>>>>>> 
>>>>>>> Alexey,
>>>>>>> 
>>>>>>> Does illumos use tsc deadline timer? and do you run it at Intel
>>>>>>> platform? 
>>>>>>> 
>>>>>>> If yes, would you please help me to collect debug information by
>>>>>>> add some printk points at kvm tsc deadline timer logic?
>>>>>>> currently I totally have no clue to figure out what the issue
>>>>>>> root from :) and, would you please check illumos tsc deadline
>>>>>>> timer logic by confirm whether illumos 1). Enumerate tsc
>>>>>>> deadline timer capability by CPUID; 2). Enable tsc deadline
>>>>>>> timer mode by lapic MMIO; 3). Start tsc deadline timer by WRMSR;
>>>>>> 
>>>>>> I think, here's what's going on. We get into
>>>>>> kvm_get_lapic_tscdeadline_msr/kvm_set_lapic_tscdeadline_msr, but
>>>>>> qemu does not use the KVM_CREATE_IRQCHIP ioctl, so we are running
>>>>>> without vcpu->arch.apic and the functions do nothing. And the
>>>>>> Linux guest just handles broken tsc a lot  better, so it
>>>>>> survives. Both guests seem to work fine with qemu-kvm, which
>>>>>> uses the ioctl. 
>>>>>> 
>>>>> 
>>>>> Hi. Do you need any more info on thins?
>>>> 
>>>> Seems illumos works fine w/ qemu-kvm, that's enough. Maybe you need
>>>> to wait qemu-kvm sync into qemu?
>>>> 
>>> 
>>> Qemu-kvm is fine for me, but this is a regression and should be
>>> fixed. 
>> 
>> Sorry, I haven't your environment to debug illumos regression.
>> I don't think it directly related to kvm tsc deadline timer, and I
>> think it's no value to do it now, since for linux guest it works
>> fine w/ both qemu & qemu-kvm, 
>> for illumos guest you just need wait qemu-kvm sync into qemu.
>> 
> 
> Let me clarify the situation.
> 
> Before this commit, the tsc was advertised in cpuid, and it was
> handled, if I understand things correctly, by qemu.
> After this commit, the tsc is advertised in cpuid, and is handled in
> the kernel, but only after qemu issues KVM_CREATE_IRQCHIP. If it does
> not issue the ioctl, the kernel just discards any wrmsrs done to the
> tsc. This does not look like an Illumos problem to me. Linux guests
> kind of work here, because they are  prepared to work on utterly
> broken hardware. Good for you, but please don't break less-prepared
> guests.
> 
> You can just run a Linux guest in qemu, and see what happens in
> kvm_get_lapic_tscdeadline_msr and kvm_set_lapic_tscdeadline_msr.

I double checked kvm tsc deadline timer patch, make sure your issue not *directly* related to the patch. I guess it maybe a qemu bug (which not exist at Avi's qemu-kvm).
Can you tell me where do you pull qemu? I used Avi's qemu-kvm tree, but can try your environment.

and, would you please tell me
1). which emulated pic/ioapic/lapic logic do you use, qemu or of kvm?
2). has illumos enabled tsc deadline timer, and it wrmsr MSR_IA32_TSCDEADLINE (0x6e0)?

Thanks,
Jinsong

^ permalink raw reply	[flat|nested] 36+ messages in thread

* RE: [Regression, bisected] a3e06bbe8445f57eb949e6474c5a9b30f24d2057: KVM: emulate lapic tsc deadline timer for guest"
  2011-12-20 18:58               ` Linus Torvalds
@ 2011-12-20 19:21                 ` Liu, Jinsong
  2011-12-20 19:47                   ` Alexey Zaytsev
  2011-12-21  9:18                   ` Avi Kivity
  2011-12-20 23:04                 ` Jan Kiszka
  1 sibling, 2 replies; 36+ messages in thread
From: Liu, Jinsong @ 2011-12-20 19:21 UTC (permalink / raw)
  To: Linus Torvalds, Alexey Zaytsev
  Cc: Kernel development list, Jan Kiszka, Marcelo Tosatti, Avi Kivity,
	Garrett D'Amore

Linus Torvalds wrote:
> On Tue, Dec 20, 2011 at 1:51 AM, Alexey Zaytsev
> <alexey.zaytsev@gmail.com> wrote:
>> 
>> Let me clarify the situation.
>> 
>> Before this commit, the tsc was advertised in cpuid, and it was
>> handled, if I understand things correctly, by qemu.
>> After this commit, the tsc is advertised in cpuid, and is handled in
>> the kernel, but only after qemu issues KVM_CREATE_IRQCHIP. If it does
>> not issue the ioctl, the kernel just discards any wrmsrs done to the
>> tsc. This does not look like an Illumos problem to me. Linux guests
>> kind of work here, because they are  prepared to work on utterly
>> broken hardware. Good for you, but please don't break less-prepared
>> guests.
> 
> Yes. This is a regression, and needs to be fixed.
> 
> Liu, if you don't have time to debug it, we'll just revert the commit.
> It's that easy. Regressions are not allowed. There are no excuses.
> 
> In particular, saying "just wait for qemu-kvm" is not an acceptable
> answer, because the point is that things *used* to work, and they
> broke. No "change it to work with the new kernel" allowed, except for
> some *very* rare critical circumstances (usually "major security-bug
> that we had to fix, and people who relied on it are thus out of
> luck").
> 
> Commit a3e06bbe8445 still seems to revert cleanly, so that is the
> easy option. 
> 
> That said, it sounds like maybe another solution is to start with the
> TSC_DEADLINE timer bit in cpuid cleared, and only setting it after the
> KVM_CREATE_IRQCHIP ioctl.
> 
> In fact, the patch is clearly buggy, in that it apparently doesn't
> emulate TSC_DEADLINE correctly and natively on its own.
> 
> Jan, Marcelo, Avi - is there a quick fix, or should I just revert?
> 
> And please don't *EVER* tell people that they should just work around
> regressions. Regressions are absolutely unacceptable. Kernel people
> need to understand that.
> 
>                    Linus

Yes, my fault to say 'walk around' before knowing Alex's issue clearly.

After Alex send his last email to clarify the situation, I have checked the bug.
Basically it caused from
1. qemu didn't issue KVM_CREATE_IRQCHIP, hence irqchip_in_kernel(kvm) fail when setup vcpu lapic logic at kvm_arch_vcpu_init();
2. tsc deadline work based on vcpu lapic, hence break illumos;

A fix is to update cpuid, as you said, setting it after KVM_CREATE_IRQCHIP. I just wonder is there any better solution? so I ask Alex his environment to setup at my side to do more test.
If you think kvm tsc deadline timer patch itself not clean, please tell me.

Thanks,
Jinsong

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [Regression, bisected] a3e06bbe8445f57eb949e6474c5a9b30f24d2057: KVM: emulate lapic tsc deadline timer for guest"
  2011-12-20 19:21                 ` Liu, Jinsong
@ 2011-12-20 19:47                   ` Alexey Zaytsev
  2011-12-20 20:19                     ` Liu, Jinsong
  2011-12-21  9:18                   ` Avi Kivity
  1 sibling, 1 reply; 36+ messages in thread
From: Alexey Zaytsev @ 2011-12-20 19:47 UTC (permalink / raw)
  To: Liu, Jinsong
  Cc: Kernel development list, Jan Kiszka, Marcelo Tosatti, Avi Kivity,
	Garrett D'Amore

On Tue, Dec 20, 2011 at 21:21, Liu, Jinsong <jinsong.liu@intel.com> wrote:
> Linus Torvalds wrote:
>> On Tue, Dec 20, 2011 at 1:51 AM, Alexey Zaytsev
>> <alexey.zaytsev@gmail.com> wrote:
>>>
>>> Let me clarify the situation.
>>>
>>> Before this commit, the tsc was advertised in cpuid, and it was
>>> handled, if I understand things correctly, by qemu.
>>> After this commit, the tsc is advertised in cpuid, and is handled in
>>> the kernel, but only after qemu issues KVM_CREATE_IRQCHIP. If it does
>>> not issue the ioctl, the kernel just discards any wrmsrs done to the
>>> tsc. This does not look like an Illumos problem to me. Linux guests
>>> kind of work here, because they are  prepared to work on utterly
>>> broken hardware. Good for you, but please don't break less-prepared
>>> guests.
>>
>> Yes. This is a regression, and needs to be fixed.
>>
>> Liu, if you don't have time to debug it, we'll just revert the commit.
>> It's that easy. Regressions are not allowed. There are no excuses.
>>
>> In particular, saying "just wait for qemu-kvm" is not an acceptable
>> answer, because the point is that things *used* to work, and they
>> broke. No "change it to work with the new kernel" allowed, except for
>> some *very* rare critical circumstances (usually "major security-bug
>> that we had to fix, and people who relied on it are thus out of
>> luck").
>>
>> Commit a3e06bbe8445 still seems to revert cleanly, so that is the
>> easy option.
>>
>> That said, it sounds like maybe another solution is to start with the
>> TSC_DEADLINE timer bit in cpuid cleared, and only setting it after the
>> KVM_CREATE_IRQCHIP ioctl.
>>
>> In fact, the patch is clearly buggy, in that it apparently doesn't
>> emulate TSC_DEADLINE correctly and natively on its own.
>>
>> Jan, Marcelo, Avi - is there a quick fix, or should I just revert?
>>
>> And please don't *EVER* tell people that they should just work around
>> regressions. Regressions are absolutely unacceptable. Kernel people
>> need to understand that.
>>
>>                    Linus
>
> Yes, my fault to say 'walk around' before knowing Alex's issue clearly.
>
> After Alex send his last email to clarify the situation, I have checked the bug.
> Basically it caused from
> 1. qemu didn't issue KVM_CREATE_IRQCHIP, hence irqchip_in_kernel(kvm) fail when setup vcpu lapic logic at kvm_arch_vcpu_init();
> 2. tsc deadline work based on vcpu lapic, hence break illumos;
>
> A fix is to update cpuid, as you said, setting it after KVM_CREATE_IRQCHIP. I just wonder is there any better solution? so I ask Alex his environment to setup at my side to do more test.
> If you think kvm tsc deadline timer patch itself not clean, please tell me.

[Removed Linus from CC]

If your internet connection is good enough, I could just pass you the
disk image. It should be around 5g when compressed. Otherwise, you can
download an openindiana disk image from http://openindiana.org. You
probably want the text-only "server" one. The TSC code has changed
since that release, but the old one does not work as well.

Do I get it right that Linux does not use the tsc deadline timer? I've
added printks to the
kvm_get_lapic_tscdeadline_msr/kvm_set_lapic_tscdeadline_msr functions,
and I swear I've seen them trigger a few times with the Linux guest.
But I can't find the code that is issuing the msr reads/writes in the
Linux kernel.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* RE: [Regression, bisected] a3e06bbe8445f57eb949e6474c5a9b30f24d2057: KVM: emulate lapic tsc deadline timer for guest"
  2011-12-20 19:47                   ` Alexey Zaytsev
@ 2011-12-20 20:19                     ` Liu, Jinsong
  2011-12-20 20:22                       ` Alexey Zaytsev
  0 siblings, 1 reply; 36+ messages in thread
From: Liu, Jinsong @ 2011-12-20 20:19 UTC (permalink / raw)
  To: Alexey Zaytsev
  Cc: Kernel development list, Jan Kiszka, Marcelo Tosatti, Avi Kivity,
	Garrett D'Amore

Alexey Zaytsev wrote:
> On Tue, Dec 20, 2011 at 21:21, Liu, Jinsong <jinsong.liu@intel.com>
> wrote: 
>> Linus Torvalds wrote:
>>> On Tue, Dec 20, 2011 at 1:51 AM, Alexey Zaytsev
>>> <alexey.zaytsev@gmail.com> wrote:
>>>> 
>>>> Let me clarify the situation.
>>>> 
>>>> Before this commit, the tsc was advertised in cpuid, and it was
>>>> handled, if I understand things correctly, by qemu.
>>>> After this commit, the tsc is advertised in cpuid, and is handled
>>>> in the kernel, but only after qemu issues KVM_CREATE_IRQCHIP. If
>>>> it does not issue the ioctl, the kernel just discards any wrmsrs
>>>> done to the tsc. This does not look like an Illumos problem to me.
>>>> Linux guests kind of work here, because they are  prepared to work
>>>> on utterly broken hardware. Good for you, but please don't break
>>>> less-prepared guests.
>>> 
>>> Yes. This is a regression, and needs to be fixed.
>>> 
>>> Liu, if you don't have time to debug it, we'll just revert the
>>> commit. It's that easy. Regressions are not allowed. There are no
>>> excuses. 
>>> 
>>> In particular, saying "just wait for qemu-kvm" is not an acceptable
>>> answer, because the point is that things *used* to work, and they
>>> broke. No "change it to work with the new kernel" allowed, except
>>> for some *very* rare critical circumstances (usually "major
>>> security-bug that we had to fix, and people who relied on it are
>>> thus out of luck"). 
>>> 
>>> Commit a3e06bbe8445 still seems to revert cleanly, so that is the
>>> easy option. 
>>> 
>>> That said, it sounds like maybe another solution is to start with
>>> the TSC_DEADLINE timer bit in cpuid cleared, and only setting it
>>> after the KVM_CREATE_IRQCHIP ioctl. 
>>> 
>>> In fact, the patch is clearly buggy, in that it apparently doesn't
>>> emulate TSC_DEADLINE correctly and natively on its own.
>>> 
>>> Jan, Marcelo, Avi - is there a quick fix, or should I just revert?
>>> 
>>> And please don't *EVER* tell people that they should just work
>>> around regressions. Regressions are absolutely unacceptable. Kernel
>>> people need to understand that. 
>>> 
>>>                    Linus
>> 
>> Yes, my fault to say 'walk around' before knowing Alex's issue
>> clearly. 
>> 
>> After Alex send his last email to clarify the situation, I have
>> checked the bug. 
>> Basically it caused from
>> 1. qemu didn't issue KVM_CREATE_IRQCHIP, hence
>> irqchip_in_kernel(kvm) fail when setup vcpu lapic logic at
>> kvm_arch_vcpu_init();  
>> 2. tsc deadline work based on vcpu lapic, hence break illumos;
>> 
>> A fix is to update cpuid, as you said, setting it after
>> KVM_CREATE_IRQCHIP. I just wonder is there any better solution? so I
>> ask Alex his environment to setup at my side to do more test. If you
>> think kvm tsc deadline timer patch itself not clean, please tell me.
> 
> [Removed Linus from CC]
> 
> If your internet connection is good enough, I could just pass you the
> disk image. It should be around 5g when compressed. Otherwise, you can
> download an openindiana disk image from http://openindiana.org. You
> probably want the text-only "server" one. The TSC code has changed
> since that release, but the old one does not work as well.

I mean I will build environment at my side with your qemu version (where do you pull from? commit number?) --> just to verify whether we can solve it at qemu side, or, solve it by cpuid at kvm side.
I will present a patch to fix it and you can test at your environment.

> 
> Do I get it right that Linux does not use the tsc deadline timer? I've
> added printks to the
> kvm_get_lapic_tscdeadline_msr/kvm_set_lapic_tscdeadline_msr functions,
> and I swear I've seen them trigger a few times with the Linux guest.
> But I can't find the code that is issuing the msr reads/writes in the
> Linux kernel.

I will check it tomorrow, too late now.
and would you please tell me, does illumos use tsc deadline timer? I lack this basic information.

Thanks,
Jinsong

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [Regression, bisected] a3e06bbe8445f57eb949e6474c5a9b30f24d2057: KVM: emulate lapic tsc deadline timer for guest"
  2011-12-20 20:19                     ` Liu, Jinsong
@ 2011-12-20 20:22                       ` Alexey Zaytsev
  2011-12-20 20:26                         ` Liu, Jinsong
  0 siblings, 1 reply; 36+ messages in thread
From: Alexey Zaytsev @ 2011-12-20 20:22 UTC (permalink / raw)
  To: Liu, Jinsong
  Cc: Kernel development list, Jan Kiszka, Marcelo Tosatti, Avi Kivity,
	Garrett D'Amore

On Tue, Dec 20, 2011 at 22:19, Liu, Jinsong <jinsong.liu@intel.com> wrote:
> Alexey Zaytsev wrote:
>> On Tue, Dec 20, 2011 at 21:21, Liu, Jinsong <jinsong.liu@intel.com>
>> wrote:
>>> Linus Torvalds wrote:
>>>> On Tue, Dec 20, 2011 at 1:51 AM, Alexey Zaytsev
>>>> <alexey.zaytsev@gmail.com> wrote:
>>>>>
>>>>> Let me clarify the situation.
>>>>>
>>>>> Before this commit, the tsc was advertised in cpuid, and it was
>>>>> handled, if I understand things correctly, by qemu.
>>>>> After this commit, the tsc is advertised in cpuid, and is handled
>>>>> in the kernel, but only after qemu issues KVM_CREATE_IRQCHIP. If
>>>>> it does not issue the ioctl, the kernel just discards any wrmsrs
>>>>> done to the tsc. This does not look like an Illumos problem to me.
>>>>> Linux guests kind of work here, because they are  prepared to work
>>>>> on utterly broken hardware. Good for you, but please don't break
>>>>> less-prepared guests.
>>>>
>>>> Yes. This is a regression, and needs to be fixed.
>>>>
>>>> Liu, if you don't have time to debug it, we'll just revert the
>>>> commit. It's that easy. Regressions are not allowed. There are no
>>>> excuses.
>>>>
>>>> In particular, saying "just wait for qemu-kvm" is not an acceptable
>>>> answer, because the point is that things *used* to work, and they
>>>> broke. No "change it to work with the new kernel" allowed, except
>>>> for some *very* rare critical circumstances (usually "major
>>>> security-bug that we had to fix, and people who relied on it are
>>>> thus out of luck").
>>>>
>>>> Commit a3e06bbe8445 still seems to revert cleanly, so that is the
>>>> easy option.
>>>>
>>>> That said, it sounds like maybe another solution is to start with
>>>> the TSC_DEADLINE timer bit in cpuid cleared, and only setting it
>>>> after the KVM_CREATE_IRQCHIP ioctl.
>>>>
>>>> In fact, the patch is clearly buggy, in that it apparently doesn't
>>>> emulate TSC_DEADLINE correctly and natively on its own.
>>>>
>>>> Jan, Marcelo, Avi - is there a quick fix, or should I just revert?
>>>>
>>>> And please don't *EVER* tell people that they should just work
>>>> around regressions. Regressions are absolutely unacceptable. Kernel
>>>> people need to understand that.
>>>>
>>>>                    Linus
>>>
>>> Yes, my fault to say 'walk around' before knowing Alex's issue
>>> clearly.
>>>
>>> After Alex send his last email to clarify the situation, I have
>>> checked the bug.
>>> Basically it caused from
>>> 1. qemu didn't issue KVM_CREATE_IRQCHIP, hence
>>> irqchip_in_kernel(kvm) fail when setup vcpu lapic logic at
>>> kvm_arch_vcpu_init();
>>> 2. tsc deadline work based on vcpu lapic, hence break illumos;
>>>
>>> A fix is to update cpuid, as you said, setting it after
>>> KVM_CREATE_IRQCHIP. I just wonder is there any better solution? so I
>>> ask Alex his environment to setup at my side to do more test. If you
>>> think kvm tsc deadline timer patch itself not clean, please tell me.
>>
>> [Removed Linus from CC]
>>
>> If your internet connection is good enough, I could just pass you the
>> disk image. It should be around 5g when compressed. Otherwise, you can
>> download an openindiana disk image from http://openindiana.org. You
>> probably want the text-only "server" one. The TSC code has changed
>> since that release, but the old one does not work as well.
>
> I mean I will build environment at my side with your qemu version (where do you pull from? commit number?) --> just to verify whether we can solve it at qemu side, or, solve it by cpuid at kvm side.
> I will present a patch to fix it and you can test at your environment.
>
>>
>> Do I get it right that Linux does not use the tsc deadline timer? I've
>> added printks to the
>> kvm_get_lapic_tscdeadline_msr/kvm_set_lapic_tscdeadline_msr functions,
>> and I swear I've seen them trigger a few times with the Linux guest.
>> But I can't find the code that is issuing the msr reads/writes in the
>> Linux kernel.
>
> I will check it tomorrow, too late now.
> and would you please tell me, does illumos use tsc deadline timer? I lack this basic information.

It does, if available. Please check
https://github.com/illumos/illumos-gate/blob/master/usr/src/uts/i86pc/io/pcplusmp/apic_timer.c#L290


>
> Thanks,
> Jinsong

^ permalink raw reply	[flat|nested] 36+ messages in thread

* RE: [Regression, bisected] a3e06bbe8445f57eb949e6474c5a9b30f24d2057: KVM: emulate lapic tsc deadline timer for guest"
  2011-12-20 20:22                       ` Alexey Zaytsev
@ 2011-12-20 20:26                         ` Liu, Jinsong
  2011-12-20 20:36                           ` Alexey Zaytsev
  0 siblings, 1 reply; 36+ messages in thread
From: Liu, Jinsong @ 2011-12-20 20:26 UTC (permalink / raw)
  To: Alexey Zaytsev
  Cc: Kernel development list, Jan Kiszka, Marcelo Tosatti, Avi Kivity,
	Garrett D'Amore

Alexey Zaytsev wrote:
> On Tue, Dec 20, 2011 at 22:19, Liu, Jinsong <jinsong.liu@intel.com>
> wrote: 
>> Alexey Zaytsev wrote:
>>> On Tue, Dec 20, 2011 at 21:21, Liu, Jinsong <jinsong.liu@intel.com>
>>> wrote:
>>>> Linus Torvalds wrote:
>>>>> On Tue, Dec 20, 2011 at 1:51 AM, Alexey Zaytsev
>>>>> <alexey.zaytsev@gmail.com> wrote:
>>>>>> 
>>>>>> Let me clarify the situation.
>>>>>> 
>>>>>> Before this commit, the tsc was advertised in cpuid, and it was
>>>>>> handled, if I understand things correctly, by qemu.
>>>>>> After this commit, the tsc is advertised in cpuid, and is handled
>>>>>> in the kernel, but only after qemu issues KVM_CREATE_IRQCHIP. If
>>>>>> it does not issue the ioctl, the kernel just discards any wrmsrs
>>>>>> done to the tsc. This does not look like an Illumos problem to
>>>>>> me. Linux guests kind of work here, because they are  prepared
>>>>>> to work on utterly broken hardware. Good for you, but please
>>>>>> don't break less-prepared guests.
>>>>> 
>>>>> Yes. This is a regression, and needs to be fixed.
>>>>> 
>>>>> Liu, if you don't have time to debug it, we'll just revert the
>>>>> commit. It's that easy. Regressions are not allowed. There are no
>>>>> excuses. 
>>>>> 
>>>>> In particular, saying "just wait for qemu-kvm" is not an
>>>>> acceptable answer, because the point is that things *used* to
>>>>> work, and they broke. No "change it to work with the new kernel"
>>>>> allowed, except for some *very* rare critical circumstances
>>>>> (usually "major security-bug that we had to fix, and people who
>>>>> relied on it are thus out of luck"). 
>>>>> 
>>>>> Commit a3e06bbe8445 still seems to revert cleanly, so that is the
>>>>> easy option. 
>>>>> 
>>>>> That said, it sounds like maybe another solution is to start with
>>>>> the TSC_DEADLINE timer bit in cpuid cleared, and only setting it
>>>>> after the KVM_CREATE_IRQCHIP ioctl.
>>>>> 
>>>>> In fact, the patch is clearly buggy, in that it apparently doesn't
>>>>> emulate TSC_DEADLINE correctly and natively on its own.
>>>>> 
>>>>> Jan, Marcelo, Avi - is there a quick fix, or should I just revert?
>>>>> 
>>>>> And please don't *EVER* tell people that they should just work
>>>>> around regressions. Regressions are absolutely unacceptable.
>>>>> Kernel people need to understand that.
>>>>> 
>>>>>                    Linus
>>>> 
>>>> Yes, my fault to say 'walk around' before knowing Alex's issue
>>>> clearly. 
>>>> 
>>>> After Alex send his last email to clarify the situation, I have
>>>> checked the bug. Basically it caused from
>>>> 1. qemu didn't issue KVM_CREATE_IRQCHIP, hence
>>>> irqchip_in_kernel(kvm) fail when setup vcpu lapic logic at
>>>> kvm_arch_vcpu_init(); 
>>>> 2. tsc deadline work based on vcpu lapic, hence break illumos;
>>>> 
>>>> A fix is to update cpuid, as you said, setting it after
>>>> KVM_CREATE_IRQCHIP. I just wonder is there any better solution? so
>>>> I ask Alex his environment to setup at my side to do more test. If
>>>> you think kvm tsc deadline timer patch itself not clean, please
>>>> tell me. 
>>> 
>>> [Removed Linus from CC]
>>> 
>>> If your internet connection is good enough, I could just pass you
>>> the disk image. It should be around 5g when compressed. Otherwise,
>>> you can download an openindiana disk image from
>>> http://openindiana.org. You probably want the text-only "server"
>>> one. The TSC code has changed since that release, but the old one
>>> does not work as well. 
>> 
>> I mean I will build environment at my side with your qemu version
>> (where do you pull from? commit number?) --> just to verify whether
>> we can solve it at qemu side, or, solve it by cpuid at kvm side. I
>> will present a patch to fix it and you can test at your environment.
>> 
>>> 
>>> Do I get it right that Linux does not use the tsc deadline timer?
>>> I've added printks to the
>>> kvm_get_lapic_tscdeadline_msr/kvm_set_lapic_tscdeadline_msr
>>> functions, and I swear I've seen them trigger a few times with the
>>> Linux guest. But I can't find the code that is issuing the msr
>>> reads/writes in the Linux kernel. 
>> 
>> I will check it tomorrow, too late now.
>> and would you please tell me, does illumos use tsc deadline timer? I
>> lack this basic information. 
> 
> It does, if available. Please check
> https://github.com/illumos/illumos-gate/blob/master/usr/src/uts/i86pc/io/pcplusmp/apic_timer.c#L290
> 

and where do you pull qemu? and commit number? I usually use Avi's qemu-kvm.

Thanks,
Jinsong

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [Regression, bisected] a3e06bbe8445f57eb949e6474c5a9b30f24d2057: KVM: emulate lapic tsc deadline timer for guest"
  2011-12-20 20:26                         ` Liu, Jinsong
@ 2011-12-20 20:36                           ` Alexey Zaytsev
  2011-12-20 20:44                             ` Liu, Jinsong
  0 siblings, 1 reply; 36+ messages in thread
From: Alexey Zaytsev @ 2011-12-20 20:36 UTC (permalink / raw)
  To: Liu, Jinsong
  Cc: Kernel development list, Jan Kiszka, Marcelo Tosatti, Avi Kivity,
	Garrett D'Amore

On Tue, Dec 20, 2011 at 22:26, Liu, Jinsong <jinsong.liu@intel.com> wrote:
> Alexey Zaytsev wrote:
>> On Tue, Dec 20, 2011 at 22:19, Liu, Jinsong <jinsong.liu@intel.com>
>> wrote:
>>> Alexey Zaytsev wrote:
>>>> On Tue, Dec 20, 2011 at 21:21, Liu, Jinsong <jinsong.liu@intel.com>
>>>> wrote:
>>>>> Linus Torvalds wrote:
>>>>>> On Tue, Dec 20, 2011 at 1:51 AM, Alexey Zaytsev
>>>>>> <alexey.zaytsev@gmail.com> wrote:
>>>>>>>
>>>>>>> Let me clarify the situation.
>>>>>>>
>>>>>>> Before this commit, the tsc was advertised in cpuid, and it was
>>>>>>> handled, if I understand things correctly, by qemu.
>>>>>>> After this commit, the tsc is advertised in cpuid, and is handled
>>>>>>> in the kernel, but only after qemu issues KVM_CREATE_IRQCHIP. If
>>>>>>> it does not issue the ioctl, the kernel just discards any wrmsrs
>>>>>>> done to the tsc. This does not look like an Illumos problem to
>>>>>>> me. Linux guests kind of work here, because they are  prepared
>>>>>>> to work on utterly broken hardware. Good for you, but please
>>>>>>> don't break less-prepared guests.
>>>>>>
>>>>>> Yes. This is a regression, and needs to be fixed.
>>>>>>
>>>>>> Liu, if you don't have time to debug it, we'll just revert the
>>>>>> commit. It's that easy. Regressions are not allowed. There are no
>>>>>> excuses.
>>>>>>
>>>>>> In particular, saying "just wait for qemu-kvm" is not an
>>>>>> acceptable answer, because the point is that things *used* to
>>>>>> work, and they broke. No "change it to work with the new kernel"
>>>>>> allowed, except for some *very* rare critical circumstances
>>>>>> (usually "major security-bug that we had to fix, and people who
>>>>>> relied on it are thus out of luck").
>>>>>>
>>>>>> Commit a3e06bbe8445 still seems to revert cleanly, so that is the
>>>>>> easy option.
>>>>>>
>>>>>> That said, it sounds like maybe another solution is to start with
>>>>>> the TSC_DEADLINE timer bit in cpuid cleared, and only setting it
>>>>>> after the KVM_CREATE_IRQCHIP ioctl.
>>>>>>
>>>>>> In fact, the patch is clearly buggy, in that it apparently doesn't
>>>>>> emulate TSC_DEADLINE correctly and natively on its own.
>>>>>>
>>>>>> Jan, Marcelo, Avi - is there a quick fix, or should I just revert?
>>>>>>
>>>>>> And please don't *EVER* tell people that they should just work
>>>>>> around regressions. Regressions are absolutely unacceptable.
>>>>>> Kernel people need to understand that.
>>>>>>
>>>>>>                    Linus
>>>>>
>>>>> Yes, my fault to say 'walk around' before knowing Alex's issue
>>>>> clearly.
>>>>>
>>>>> After Alex send his last email to clarify the situation, I have
>>>>> checked the bug. Basically it caused from
>>>>> 1. qemu didn't issue KVM_CREATE_IRQCHIP, hence
>>>>> irqchip_in_kernel(kvm) fail when setup vcpu lapic logic at
>>>>> kvm_arch_vcpu_init();
>>>>> 2. tsc deadline work based on vcpu lapic, hence break illumos;
>>>>>
>>>>> A fix is to update cpuid, as you said, setting it after
>>>>> KVM_CREATE_IRQCHIP. I just wonder is there any better solution? so
>>>>> I ask Alex his environment to setup at my side to do more test. If
>>>>> you think kvm tsc deadline timer patch itself not clean, please
>>>>> tell me.
>>>>
>>>> [Removed Linus from CC]
>>>>
>>>> If your internet connection is good enough, I could just pass you
>>>> the disk image. It should be around 5g when compressed. Otherwise,
>>>> you can download an openindiana disk image from
>>>> http://openindiana.org. You probably want the text-only "server"
>>>> one. The TSC code has changed since that release, but the old one
>>>> does not work as well.
>>>
>>> I mean I will build environment at my side with your qemu version
>>> (where do you pull from? commit number?) --> just to verify whether
>>> we can solve it at qemu side, or, solve it by cpuid at kvm side. I
>>> will present a patch to fix it and you can test at your environment.
>>>
>>>>
>>>> Do I get it right that Linux does not use the tsc deadline timer?
>>>> I've added printks to the
>>>> kvm_get_lapic_tscdeadline_msr/kvm_set_lapic_tscdeadline_msr
>>>> functions, and I swear I've seen them trigger a few times with the
>>>> Linux guest. But I can't find the code that is issuing the msr
>>>> reads/writes in the Linux kernel.
>>>
>>> I will check it tomorrow, too late now.
>>> and would you please tell me, does illumos use tsc deadline timer? I
>>> lack this basic information.
>>
>> It does, if available. Please check
>> https://github.com/illumos/illumos-gate/blob/master/usr/src/uts/i86pc/io/pcplusmp/apic_timer.c#L290
>>
>
> and where do you pull qemu? and commit number? I usually use Avi's qemu-kvm.

>From git://git.qemu.org/qemu.git, commit
62ba9f3662c5481f31160daf25474c8f22d6b3d2
Qemu 0.15.92 exhibited the same behavior.


>
> Thanks,
> Jinsong

^ permalink raw reply	[flat|nested] 36+ messages in thread

* RE: [Regression, bisected] a3e06bbe8445f57eb949e6474c5a9b30f24d2057: KVM: emulate lapic tsc deadline timer for guest"
  2011-12-20 20:36                           ` Alexey Zaytsev
@ 2011-12-20 20:44                             ` Liu, Jinsong
  0 siblings, 0 replies; 36+ messages in thread
From: Liu, Jinsong @ 2011-12-20 20:44 UTC (permalink / raw)
  To: Alexey Zaytsev
  Cc: Kernel development list, Jan Kiszka, Marcelo Tosatti, Avi Kivity,
	Garrett D'Amore

Alexey Zaytsev wrote:
> On Tue, Dec 20, 2011 at 22:26, Liu, Jinsong <jinsong.liu@intel.com>
> wrote: 
>> Alexey Zaytsev wrote:
>>> On Tue, Dec 20, 2011 at 22:19, Liu, Jinsong <jinsong.liu@intel.com>
>>> wrote:
>>>> Alexey Zaytsev wrote:
>>>>> On Tue, Dec 20, 2011 at 21:21, Liu, Jinsong
>>>>> <jinsong.liu@intel.com> wrote:
>>>>>> Linus Torvalds wrote:
>>>>>>> On Tue, Dec 20, 2011 at 1:51 AM, Alexey Zaytsev
>>>>>>> <alexey.zaytsev@gmail.com> wrote:
>>>>>>>> 
>>>>>>>> Let me clarify the situation.
>>>>>>>> 
>>>>>>>> Before this commit, the tsc was advertised in cpuid, and it was
>>>>>>>> handled, if I understand things correctly, by qemu.
>>>>>>>> After this commit, the tsc is advertised in cpuid, and is
>>>>>>>> handled in the kernel, but only after qemu issues
>>>>>>>> KVM_CREATE_IRQCHIP. If it does not issue the ioctl, the kernel
>>>>>>>> just discards any wrmsrs done to the tsc. This does not look
>>>>>>>> like an Illumos problem to me. Linux guests kind of work here,
>>>>>>>> because they are  prepared to work on utterly broken hardware.
>>>>>>>> Good for you, but please don't break less-prepared guests.
>>>>>>> 
>>>>>>> Yes. This is a regression, and needs to be fixed.
>>>>>>> 
>>>>>>> Liu, if you don't have time to debug it, we'll just revert the
>>>>>>> commit. It's that easy. Regressions are not allowed. There are
>>>>>>> no excuses. 
>>>>>>> 
>>>>>>> In particular, saying "just wait for qemu-kvm" is not an
>>>>>>> acceptable answer, because the point is that things *used* to
>>>>>>> work, and they broke. No "change it to work with the new kernel"
>>>>>>> allowed, except for some *very* rare critical circumstances
>>>>>>> (usually "major security-bug that we had to fix, and people who
>>>>>>> relied on it are thus out of luck").
>>>>>>> 
>>>>>>> Commit a3e06bbe8445 still seems to revert cleanly, so that is
>>>>>>> the easy option. 
>>>>>>> 
>>>>>>> That said, it sounds like maybe another solution is to start
>>>>>>> with the TSC_DEADLINE timer bit in cpuid cleared, and only
>>>>>>> setting it after the KVM_CREATE_IRQCHIP ioctl.
>>>>>>> 
>>>>>>> In fact, the patch is clearly buggy, in that it apparently
>>>>>>> doesn't emulate TSC_DEADLINE correctly and natively on its own.
>>>>>>> 
>>>>>>> Jan, Marcelo, Avi - is there a quick fix, or should I just
>>>>>>> revert? 
>>>>>>> 
>>>>>>> And please don't *EVER* tell people that they should just work
>>>>>>> around regressions. Regressions are absolutely unacceptable.
>>>>>>> Kernel people need to understand that.
>>>>>>> 
>>>>>>>                    Linus
>>>>>> 
>>>>>> Yes, my fault to say 'walk around' before knowing Alex's issue
>>>>>> clearly. 
>>>>>> 
>>>>>> After Alex send his last email to clarify the situation, I have
>>>>>> checked the bug. Basically it caused from
>>>>>> 1. qemu didn't issue KVM_CREATE_IRQCHIP, hence
>>>>>> irqchip_in_kernel(kvm) fail when setup vcpu lapic logic at
>>>>>> kvm_arch_vcpu_init(); 
>>>>>> 2. tsc deadline work based on vcpu lapic, hence break illumos;
>>>>>> 
>>>>>> A fix is to update cpuid, as you said, setting it after
>>>>>> KVM_CREATE_IRQCHIP. I just wonder is there any better solution?
>>>>>> so I ask Alex his environment to setup at my side to do more
>>>>>> test. If you think kvm tsc deadline timer patch itself not
>>>>>> clean, please tell me.
>>>>> 
>>>>> [Removed Linus from CC]
>>>>> 
>>>>> If your internet connection is good enough, I could just pass you
>>>>> the disk image. It should be around 5g when compressed. Otherwise,
>>>>> you can download an openindiana disk image from
>>>>> http://openindiana.org. You probably want the text-only "server"
>>>>> one. The TSC code has changed since that release, but the old one
>>>>> does not work as well.
>>>> 
>>>> I mean I will build environment at my side with your qemu version
>>>> (where do you pull from? commit number?) --> just to verify whether
>>>> we can solve it at qemu side, or, solve it by cpuid at kvm side. I
>>>> will present a patch to fix it and you can test at your
>>>> environment. 
>>>> 
>>>>> 
>>>>> Do I get it right that Linux does not use the tsc deadline timer?
>>>>> I've added printks to the
>>>>> kvm_get_lapic_tscdeadline_msr/kvm_set_lapic_tscdeadline_msr
>>>>> functions, and I swear I've seen them trigger a few times with
>>>>> the Linux guest. But I can't find the code that is issuing the
>>>>> msr reads/writes in the Linux kernel. 
>>>> 
>>>> I will check it tomorrow, too late now.
>>>> and would you please tell me, does illumos use tsc deadline timer?
>>>> I lack this basic information.
>>> 
>>> It does, if available. Please check
>>> https://github.com/illumos/illumos-gate/blob/master/usr/src/uts/i86pc/io/pcplusmp/apic_timer.c#L290
>>> 
>> 
>> and where do you pull qemu? and commit number? I usually use Avi's
>> qemu-kvm. 
> 
> From git://git.qemu.org/qemu.git, commit
> 62ba9f3662c5481f31160daf25474c8f22d6b3d2
> Qemu 0.15.92 exhibited the same behavior.
> 

OK, thanks. I will fix it tomorrow.

Thanks,
Jinsong


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [Regression, bisected] a3e06bbe8445f57eb949e6474c5a9b30f24d2057: KVM: emulate lapic tsc deadline timer for guest"
  2011-12-20 18:58               ` Linus Torvalds
  2011-12-20 19:21                 ` Liu, Jinsong
@ 2011-12-20 23:04                 ` Jan Kiszka
  2011-12-21 10:10                   ` [PATCH] KVM: x86: Prevent exposing TSC deadline timer feature in the absence of in-kernel APIC Jan Kiszka
  1 sibling, 1 reply; 36+ messages in thread
From: Jan Kiszka @ 2011-12-20 23:04 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Alexey Zaytsev, Liu, Jinsong, Kernel development list,
	Marcelo Tosatti, Avi Kivity, Garrett D'Amore

On 2011-12-20 19:58, Linus Torvalds wrote:
> On Tue, Dec 20, 2011 at 1:51 AM, Alexey Zaytsev
> <alexey.zaytsev@gmail.com> wrote:
>>
>> Let me clarify the situation.
>>
>> Before this commit, the tsc was advertised in cpuid, and it was
>> handled, if I understand things correctly, by qemu.
>> After this commit, the tsc is advertised in cpuid, and is handled in
>> the kernel, but only after qemu issues KVM_CREATE_IRQCHIP. If it does
>> not issue the ioctl, the kernel just discards any wrmsrs done to the
>> tsc. This does not look like an Illumos problem to me. Linux guests
>> kind of work here, because they are  prepared to work on utterly
>> broken hardware. Good for you, but please don't break less-prepared
>> guests.
> 
> Yes. This is a regression, and needs to be fixed.
> 
> Liu, if you don't have time to debug it, we'll just revert the commit.
> It's that easy. Regressions are not allowed. There are no excuses.
> 
> In particular, saying "just wait for qemu-kvm" is not an acceptable
> answer, because the point is that things *used* to work, and they
> broke. No "change it to work with the new kernel" allowed, except for
> some *very* rare critical circumstances (usually "major security-bug
> that we had to fix, and people who relied on it are thus out of
> luck").
> 
> Commit a3e06bbe8445 still seems to revert cleanly, so that is the easy option.
> 
> That said, it sounds like maybe another solution is to start with the
> TSC_DEADLINE timer bit in cpuid cleared, and only setting it after the
> KVM_CREATE_IRQCHIP ioctl.

KVM cpuid discovery can (and is) unfortunately be done before user space
decides about creating an in-kernel irqchip or not. But the patch in
question suggests to user space that it's optimal and perfectly fine to
expose TSC_DEADLINE_TIMER to the guest - obviously a wrong suggestion in
case user space does not use the in-kernel irqchip and does not
implement the deadline timer in its own APIC model.

So the only proper solution I see is to drop that feature flag exposure
from kvm_update_cpuid. Instead, KVM should indicate to *well informed*
(i.e. updated) user space differently that there is now deadline timer
support in the in-kernel APIC model. That different channel should be a
KVM_CAP flag that user space can easily check and then merge the
necessary flag on its own.

> 
> In fact, the patch is clearly buggy, in that it apparently doesn't
> emulate TSC_DEADLINE correctly and natively on its own.
> 
> Jan, Marcelo, Avi - is there a quick fix, or should I just revert?

Maybe fixing can be done quickly. In any case, the current ABI should
not be exposed to user space, even at the price of postponing it's
introduction. Just my 2 cents, though.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [Regression, bisected] a3e06bbe8445f57eb949e6474c5a9b30f24d2057: KVM: emulate lapic tsc deadline timer for guest"
  2011-12-20 19:21                 ` Liu, Jinsong
  2011-12-20 19:47                   ` Alexey Zaytsev
@ 2011-12-21  9:18                   ` Avi Kivity
  1 sibling, 0 replies; 36+ messages in thread
From: Avi Kivity @ 2011-12-21  9:18 UTC (permalink / raw)
  To: Liu, Jinsong
  Cc: Linus Torvalds, Alexey Zaytsev, Kernel development list,
	Jan Kiszka, Marcelo Tosatti, Garrett D'Amore

On 12/20/2011 09:21 PM, Liu, Jinsong wrote:
> Yes, my fault to say 'walk around' before knowing Alex's issue clearly.
>
> After Alex send his last email to clarify the situation, I have checked the bug.
> Basically it caused from
> 1. qemu didn't issue KVM_CREATE_IRQCHIP, hence irqchip_in_kernel(kvm) fail when setup vcpu lapic logic at kvm_arch_vcpu_init();
> 2. tsc deadline work based on vcpu lapic, hence break illumos;
>
> A fix is to update cpuid, as you said, setting it after KVM_CREATE_IRQCHIP. 

Wait, qemu set up a vcpu with TSC deadline timer in cpuid, but with
userspace apic?  Then it was clearly lying, since qemu doesn't implement
the TSC deadline timer.

> I just wonder is there any better solution? so I ask Alex his environment to setup at my side to do more test.
> If you think kvm tsc deadline timer patch itself not clean, please tell me.
>

Alexey, did you use -cpu host?

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [Regression, bisected] a3e06bbe8445f57eb949e6474c5a9b30f24d2057: KVM: emulate lapic tsc deadline timer for guest"
  2011-12-20 19:05               ` [Regression, bisected] a3e06bbe8445f57eb949e6474c5a9b30f24d2057: KVM: emulate lapic tsc deadline timer for guest" Liu, Jinsong
@ 2011-12-21  9:20                 ` Avi Kivity
  0 siblings, 0 replies; 36+ messages in thread
From: Avi Kivity @ 2011-12-21  9:20 UTC (permalink / raw)
  To: Liu, Jinsong
  Cc: Alexey Zaytsev, Kernel development list, Jan Kiszka,
	Marcelo Tosatti, Garrett D'Amore, Linus Torvalds

On 12/20/2011 09:05 PM, Liu, Jinsong wrote:
> I double checked kvm tsc deadline timer patch, make sure your issue not *directly* related to the patch. I guess it maybe a qemu bug (which not exist at Avi's qemu-kvm).
> Can you tell me where do you pull qemu? I used Avi's qemu-kvm tree, but can try your environment.
>
>

It should happen in qemu-kvm if you use -no-kvm-irqchip.

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH] KVM: x86: Prevent exposing TSC deadline timer feature in the absence of in-kernel APIC
  2011-12-20 23:04                 ` Jan Kiszka
@ 2011-12-21 10:10                   ` Jan Kiszka
  2011-12-21 10:25                     ` [PATCH v2] " Jan Kiszka
  0 siblings, 1 reply; 36+ messages in thread
From: Jan Kiszka @ 2011-12-21 10:10 UTC (permalink / raw)
  To: Linus Torvalds, Alexey Zaytsev, Liu, Jinsong, Avi Kivity
  Cc: Kernel development list, Marcelo Tosatti, Garrett D'Amore

On 2011-12-21 00:04, Jan Kiszka wrote:
> On 2011-12-20 19:58, Linus Torvalds wrote:
>> On Tue, Dec 20, 2011 at 1:51 AM, Alexey Zaytsev
>> <alexey.zaytsev@gmail.com> wrote:
>>>
>>> Let me clarify the situation.
>>>
>>> Before this commit, the tsc was advertised in cpuid, and it was
>>> handled, if I understand things correctly, by qemu.
>>> After this commit, the tsc is advertised in cpuid, and is handled in
>>> the kernel, but only after qemu issues KVM_CREATE_IRQCHIP. If it does
>>> not issue the ioctl, the kernel just discards any wrmsrs done to the
>>> tsc. This does not look like an Illumos problem to me. Linux guests
>>> kind of work here, because they are  prepared to work on utterly
>>> broken hardware. Good for you, but please don't break less-prepared
>>> guests.
>>
>> Yes. This is a regression, and needs to be fixed.
>>
>> Liu, if you don't have time to debug it, we'll just revert the commit.
>> It's that easy. Regressions are not allowed. There are no excuses.
>>
>> In particular, saying "just wait for qemu-kvm" is not an acceptable
>> answer, because the point is that things *used* to work, and they
>> broke. No "change it to work with the new kernel" allowed, except for
>> some *very* rare critical circumstances (usually "major security-bug
>> that we had to fix, and people who relied on it are thus out of
>> luck").
>>
>> Commit a3e06bbe8445 still seems to revert cleanly, so that is the easy option.
>>
>> That said, it sounds like maybe another solution is to start with the
>> TSC_DEADLINE timer bit in cpuid cleared, and only setting it after the
>> KVM_CREATE_IRQCHIP ioctl.
> 
> KVM cpuid discovery can (and is) unfortunately be done before user space
> decides about creating an in-kernel irqchip or not. But the patch in
> question suggests to user space that it's optimal and perfectly fine to
> expose TSC_DEADLINE_TIMER to the guest - obviously a wrong suggestion in
> case user space does not use the in-kernel irqchip and does not
> implement the deadline timer in its own APIC model.
> 
> So the only proper solution I see is to drop that feature flag exposure
> from kvm_update_cpuid. Instead, KVM should indicate to *well informed*
> (i.e. updated) user space differently that there is now deadline timer
> support in the in-kernel APIC model. That different channel should be a
> KVM_CAP flag that user space can easily check and then merge the
> necessary flag on its own.
> 
>>
>> In fact, the patch is clearly buggy, in that it apparently doesn't
>> emulate TSC_DEADLINE correctly and natively on its own.
>>
>> Jan, Marcelo, Avi - is there a quick fix, or should I just revert?
> 
> Maybe fixing can be done quickly. In any case, the current ABI should
> not be exposed to user space, even at the price of postponing it's
> introduction. Just my 2 cents, though.

Was obviously too late for me. Here is a more accurate code analysis and
fix proposal:

kvm_update_cpuid is not involved in KVM_GET_SUPPORTED_CPUID as I
thought, thus there is no ABI issue. It should just be trivial bug in
setting the feature flag. Does this help? Untested!

Jan

-------8<-------

We must not report the TSC deadline timer feature on our own when user
space provides the APIC as we have no clue about its features.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
---
 arch/x86/kvm/cpuid.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 230f713..28615f2 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -40,7 +40,7 @@ void kvm_update_cpuid(struct kvm_vcpu *vcpu)
 			best->ecx |= bit(X86_FEATURE_OSXSAVE);
 	}
 
-	if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL &&
+	if (apic && boot_cpu_data.x86_vendor == X86_VENDOR_INTEL &&
 		best->function == 0x1) {
 		best->ecx |= bit(X86_FEATURE_TSC_DEADLINE_TIMER);
 		timer_mode_mask = 3 << 17;
-- 
1.7.3.4

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH v2] KVM: x86: Prevent exposing TSC deadline timer feature in the absence of in-kernel APIC
  2011-12-21 10:10                   ` [PATCH] KVM: x86: Prevent exposing TSC deadline timer feature in the absence of in-kernel APIC Jan Kiszka
@ 2011-12-21 10:25                     ` Jan Kiszka
  2011-12-21 10:35                       ` Avi Kivity
  2011-12-21 10:41                       ` Alexey Zaytsev
  0 siblings, 2 replies; 36+ messages in thread
From: Jan Kiszka @ 2011-12-21 10:25 UTC (permalink / raw)
  To: Linus Torvalds, Alexey Zaytsev, Liu, Jinsong, Avi Kivity
  Cc: Kernel development list, Marcelo Tosatti, Garrett D'Amore, kvm

We must not report the TSC deadline timer feature on our own when user
space provides the APIC as we have no clue about its features.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
---

Changes in v2:
 - reorganized code to make the flow clearer

 arch/x86/kvm/cpuid.c |   18 ++++++++----------
 1 files changed, 8 insertions(+), 10 deletions(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 230f713..b7cdd3b 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -27,7 +27,6 @@ void kvm_update_cpuid(struct kvm_vcpu *vcpu)
 {
 	struct kvm_cpuid_entry2 *best;
 	struct kvm_lapic *apic = vcpu->arch.apic;
-	u32 timer_mode_mask;
 
 	best = kvm_find_cpuid_entry(vcpu, 1, 0);
 	if (!best)
@@ -40,15 +39,14 @@ void kvm_update_cpuid(struct kvm_vcpu *vcpu)
 			best->ecx |= bit(X86_FEATURE_OSXSAVE);
 	}
 
-	if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL &&
-		best->function == 0x1) {
-		best->ecx |= bit(X86_FEATURE_TSC_DEADLINE_TIMER);
-		timer_mode_mask = 3 << 17;
-	} else
-		timer_mode_mask = 1 << 17;
-
-	if (apic)
-		apic->lapic_timer.timer_mode_mask = timer_mode_mask;
+	if (apic) {
+		if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL &&
+		    best->function == 0x1) {
+			best->ecx |= bit(X86_FEATURE_TSC_DEADLINE_TIMER);
+			apic->lapic_timer.timer_mode_mask = 3 << 17;
+		} else
+			apic->lapic_timer.timer_mode_mask = 1 << 17;
+	}
 
 	kvm_pmu_cpuid_update(vcpu);
 }
-- 
1.7.3.4

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* Re: [PATCH v2] KVM: x86: Prevent exposing TSC deadline timer feature in the absence of in-kernel APIC
  2011-12-21 10:25                     ` [PATCH v2] " Jan Kiszka
@ 2011-12-21 10:35                       ` Avi Kivity
  2011-12-21 10:41                         ` Jan Kiszka
  2011-12-22 15:41                         ` Liu, Jinsong
  2011-12-21 10:41                       ` Alexey Zaytsev
  1 sibling, 2 replies; 36+ messages in thread
From: Avi Kivity @ 2011-12-21 10:35 UTC (permalink / raw)
  To: Jan Kiszka
  Cc: Linus Torvalds, Alexey Zaytsev, Liu, Jinsong,
	Kernel development list, Marcelo Tosatti, Garrett D'Amore,
	kvm

On 12/21/2011 12:25 PM, Jan Kiszka wrote:
> We must not report the TSC deadline timer feature on our own when user
> space provides the APIC as we have no clue about its features.

We must not report the TSC deadline timer feature on our own, period. 
We should just update the timer mode mask there.  Don't know how this
slipped through review.

I think your original idea was correct.  Add a new KVM_CAP for the tsc
deadline timer.  Userspace can add the bit to cpuid if either it
implements the feature in a userspace apic, or if it finds the new
capability and uses the kernel apic.

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v2] KVM: x86: Prevent exposing TSC deadline timer feature in the absence of in-kernel APIC
  2011-12-21 10:35                       ` Avi Kivity
@ 2011-12-21 10:41                         ` Jan Kiszka
  2011-12-21 10:44                           ` Avi Kivity
  2011-12-22 15:41                         ` Liu, Jinsong
  1 sibling, 1 reply; 36+ messages in thread
From: Jan Kiszka @ 2011-12-21 10:41 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Linus Torvalds, Alexey Zaytsev, Liu, Jinsong,
	Kernel development list, Marcelo Tosatti, Garrett D'Amore,
	kvm

On 2011-12-21 11:35, Avi Kivity wrote:
> On 12/21/2011 12:25 PM, Jan Kiszka wrote:
>> We must not report the TSC deadline timer feature on our own when user
>> space provides the APIC as we have no clue about its features.
> 
> We must not report the TSC deadline timer feature on our own, period. 
> We should just update the timer mode mask there.  Don't know how this
> slipped through review.
>
> I think your original idea was correct.  Add a new KVM_CAP for the tsc
> deadline timer.  Userspace can add the bit to cpuid if either it
> implements the feature in a userspace apic, or if it finds the new
> capability and uses the kernel apic.

Right, we do need some control for user space to keep the feature
disabled when migrating from an older host.

However, there is also the timer_mode_mask which requires tuning in
addition to the cpuid flag.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v2] KVM: x86: Prevent exposing TSC deadline timer feature in the absence of in-kernel APIC
  2011-12-21 10:25                     ` [PATCH v2] " Jan Kiszka
  2011-12-21 10:35                       ` Avi Kivity
@ 2011-12-21 10:41                       ` Alexey Zaytsev
  1 sibling, 0 replies; 36+ messages in thread
From: Alexey Zaytsev @ 2011-12-21 10:41 UTC (permalink / raw)
  To: Jan Kiszka
  Cc: Linus Torvalds, Liu, Jinsong, Avi Kivity,
	Kernel development list, Marcelo Tosatti, Garrett D'Amore,
	kvm

On Wed, Dec 21, 2011 at 12:25, Jan Kiszka <jan.kiszka@siemens.com> wrote:
> We must not report the TSC deadline timer feature on our own when user
> space provides the APIC as we have no clue about its features.
>
> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
> ---
>
> Changes in v2:
>  - reorganized code to make the flow clearer

I've seen Avi's comment below, but just for the note, it works fine
when applied to arch/x86/kvm/x86.c.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v2] KVM: x86: Prevent exposing TSC deadline timer feature in the absence of in-kernel APIC
  2011-12-21 10:41                         ` Jan Kiszka
@ 2011-12-21 10:44                           ` Avi Kivity
  2011-12-21 11:28                             ` Jan Kiszka
  0 siblings, 1 reply; 36+ messages in thread
From: Avi Kivity @ 2011-12-21 10:44 UTC (permalink / raw)
  To: Jan Kiszka
  Cc: Linus Torvalds, Alexey Zaytsev, Liu, Jinsong,
	Kernel development list, Marcelo Tosatti, Garrett D'Amore,
	kvm

On 12/21/2011 12:41 PM, Jan Kiszka wrote:
> On 2011-12-21 11:35, Avi Kivity wrote:
> > On 12/21/2011 12:25 PM, Jan Kiszka wrote:
> >> We must not report the TSC deadline timer feature on our own when user
> >> space provides the APIC as we have no clue about its features.
> > 
> > We must not report the TSC deadline timer feature on our own, period. 
> > We should just update the timer mode mask there.  Don't know how this
> > slipped through review.
> >
> > I think your original idea was correct.  Add a new KVM_CAP for the tsc
> > deadline timer.  Userspace can add the bit to cpuid if either it
> > implements the feature in a userspace apic, or if it finds the new
> > capability and uses the kernel apic.
>
> Right, we do need some control for user space to keep the feature
> disabled when migrating from an older host.
>
> However, there is also the timer_mode_mask which requires tuning in
> addition to the cpuid flag.
>

timer_mode_mask can just be slaved to the bit (as received by
KVM_SET_CPUID); that's exactly kvm_update_cpuid()'s role.

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v2] KVM: x86: Prevent exposing TSC deadline timer feature in the absence of in-kernel APIC
  2011-12-21 10:44                           ` Avi Kivity
@ 2011-12-21 11:28                             ` Jan Kiszka
  2011-12-21 11:45                               ` Avi Kivity
  0 siblings, 1 reply; 36+ messages in thread
From: Jan Kiszka @ 2011-12-21 11:28 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Linus Torvalds, Alexey Zaytsev, Liu, Jinsong,
	Kernel development list, Marcelo Tosatti, Garrett D'Amore,
	kvm

On 2011-12-21 11:44, Avi Kivity wrote:
> On 12/21/2011 12:41 PM, Jan Kiszka wrote:
>> On 2011-12-21 11:35, Avi Kivity wrote:
>>> On 12/21/2011 12:25 PM, Jan Kiszka wrote:
>>>> We must not report the TSC deadline timer feature on our own when user
>>>> space provides the APIC as we have no clue about its features.
>>>
>>> We must not report the TSC deadline timer feature on our own, period. 
>>> We should just update the timer mode mask there.  Don't know how this
>>> slipped through review.
>>>
>>> I think your original idea was correct.  Add a new KVM_CAP for the tsc
>>> deadline timer.  Userspace can add the bit to cpuid if either it
>>> implements the feature in a userspace apic, or if it finds the new
>>> capability and uses the kernel apic.
>>
>> Right, we do need some control for user space to keep the feature
>> disabled when migrating from an older host.
>>
>> However, there is also the timer_mode_mask which requires tuning in
>> addition to the cpuid flag.
>>
> 
> timer_mode_mask can just be slaved to the bit (as received by
> KVM_SET_CPUID); that's exactly kvm_update_cpuid()'s role.

Like

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 230f713..89b02bf 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -27,7 +27,6 @@ void kvm_update_cpuid(struct kvm_vcpu *vcpu)
 {
 	struct kvm_cpuid_entry2 *best;
 	struct kvm_lapic *apic = vcpu->arch.apic;
-	u32 timer_mode_mask;

 	best = kvm_find_cpuid_entry(vcpu, 1, 0);
 	if (!best)
@@ -40,15 +39,12 @@ void kvm_update_cpuid(struct kvm_vcpu *vcpu)
 			best->ecx |= bit(X86_FEATURE_OSXSAVE);
 	}

-	if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL &&
-		best->function == 0x1) {
-		best->ecx |= bit(X86_FEATURE_TSC_DEADLINE_TIMER);
-		timer_mode_mask = 3 << 17;
-	} else
-		timer_mode_mask = 1 << 17;
-
-	if (apic)
-		apic->lapic_timer.timer_mode_mask = timer_mode_mask;
+	if (apic) {
+		if (best->ecx & bit(X86_FEATURE_TSC_DEADLINE_TIMER))
+			apic->lapic_timer.timer_mode_mask = 3 << 17;
+		else
+			apic->lapic_timer.timer_mode_mask = 1 << 17;
+	}

 	kvm_pmu_cpuid_update(vcpu);
 }


+ the KVM_CAP thing?

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* Re: [PATCH v2] KVM: x86: Prevent exposing TSC deadline timer feature in the absence of in-kernel APIC
  2011-12-21 11:28                             ` Jan Kiszka
@ 2011-12-21 11:45                               ` Avi Kivity
  2011-12-21 11:58                                 ` Jan Kiszka
  0 siblings, 1 reply; 36+ messages in thread
From: Avi Kivity @ 2011-12-21 11:45 UTC (permalink / raw)
  To: Jan Kiszka
  Cc: Linus Torvalds, Alexey Zaytsev, Liu, Jinsong,
	Kernel development list, Marcelo Tosatti, Garrett D'Amore,
	kvm

On 12/21/2011 01:28 PM, Jan Kiszka wrote:
> > 
> > timer_mode_mask can just be slaved to the bit (as received by
> > KVM_SET_CPUID); that's exactly kvm_update_cpuid()'s role.
>
> Like

Ugh, the facebook generation invades lkml.

> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index 230f713..89b02bf 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -27,7 +27,6 @@ void kvm_update_cpuid(struct kvm_vcpu *vcpu)
>  {
>  	struct kvm_cpuid_entry2 *best;
>  	struct kvm_lapic *apic = vcpu->arch.apic;
> -	u32 timer_mode_mask;
>
>  	best = kvm_find_cpuid_entry(vcpu, 1, 0);
>  	if (!best)
> @@ -40,15 +39,12 @@ void kvm_update_cpuid(struct kvm_vcpu *vcpu)
>  			best->ecx |= bit(X86_FEATURE_OSXSAVE);
>  	}
>
> -	if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL &&
> -		best->function == 0x1) {
> -		best->ecx |= bit(X86_FEATURE_TSC_DEADLINE_TIMER);
> -		timer_mode_mask = 3 << 17;
> -	} else
> -		timer_mode_mask = 1 << 17;
> -
> -	if (apic)
> -		apic->lapic_timer.timer_mode_mask = timer_mode_mask;
> +	if (apic) {
> +		if (best->ecx & bit(X86_FEATURE_TSC_DEADLINE_TIMER))
> +			apic->lapic_timer.timer_mode_mask = 3 << 17;
> +		else
> +			apic->lapic_timer.timer_mode_mask = 1 << 17;
> +	}
>
>  	kvm_pmu_cpuid_update(vcpu);
>  }
>
>
> + the KVM_CAP thing?
>

+1

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v2] KVM: x86: Prevent exposing TSC deadline timer feature in the absence of in-kernel APIC
  2011-12-21 11:45                               ` Avi Kivity
@ 2011-12-21 11:58                                 ` Jan Kiszka
  2011-12-21 12:02                                   ` Avi Kivity
  0 siblings, 1 reply; 36+ messages in thread
From: Jan Kiszka @ 2011-12-21 11:58 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Linus Torvalds, Alexey Zaytsev, Liu, Jinsong,
	Kernel development list, Marcelo Tosatti, Garrett D'Amore,
	kvm

On 2011-12-21 12:45, Avi Kivity wrote:
> On 12/21/2011 01:28 PM, Jan Kiszka wrote:
>>>
>>> timer_mode_mask can just be slaved to the bit (as received by
>>> KVM_SET_CPUID); that's exactly kvm_update_cpuid()'s role.
>>
>> Like
> 
> Ugh, the facebook generation invades lkml.

I'm not on facebook. Does anyone sell shirts with that?

> 
>> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
>> index 230f713..89b02bf 100644
>> --- a/arch/x86/kvm/cpuid.c
>> +++ b/arch/x86/kvm/cpuid.c
>> @@ -27,7 +27,6 @@ void kvm_update_cpuid(struct kvm_vcpu *vcpu)
>>  {
>>  	struct kvm_cpuid_entry2 *best;
>>  	struct kvm_lapic *apic = vcpu->arch.apic;
>> -	u32 timer_mode_mask;
>>
>>  	best = kvm_find_cpuid_entry(vcpu, 1, 0);
>>  	if (!best)
>> @@ -40,15 +39,12 @@ void kvm_update_cpuid(struct kvm_vcpu *vcpu)
>>  			best->ecx |= bit(X86_FEATURE_OSXSAVE);
>>  	}
>>
>> -	if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL &&
>> -		best->function == 0x1) {
>> -		best->ecx |= bit(X86_FEATURE_TSC_DEADLINE_TIMER);
>> -		timer_mode_mask = 3 << 17;
>> -	} else
>> -		timer_mode_mask = 1 << 17;
>> -
>> -	if (apic)
>> -		apic->lapic_timer.timer_mode_mask = timer_mode_mask;
>> +	if (apic) {
>> +		if (best->ecx & bit(X86_FEATURE_TSC_DEADLINE_TIMER))
>> +			apic->lapic_timer.timer_mode_mask = 3 << 17;
>> +		else
>> +			apic->lapic_timer.timer_mode_mask = 1 << 17;
>> +	}
>>
>>  	kvm_pmu_cpuid_update(vcpu);
>>  }
>>
>>
>> + the KVM_CAP thing?
>>
> 
> +1

Ah, the adult crowd that trusts the other side.

OK, will see that I can cook up a proper patch later.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v2] KVM: x86: Prevent exposing TSC deadline timer feature in the absence of in-kernel APIC
  2011-12-21 11:58                                 ` Jan Kiszka
@ 2011-12-21 12:02                                   ` Avi Kivity
  0 siblings, 0 replies; 36+ messages in thread
From: Avi Kivity @ 2011-12-21 12:02 UTC (permalink / raw)
  To: Jan Kiszka
  Cc: Linus Torvalds, Alexey Zaytsev, Liu, Jinsong,
	Kernel development list, Marcelo Tosatti, Garrett D'Amore,
	kvm

On 12/21/2011 01:58 PM, Jan Kiszka wrote:
> > +1
>
> Ah, the adult crowd that trusts the other side.

It's 80% gnome-shell rants and similar, don't bother.

> OK, will see that I can cook up a proper patch later.
>

Thanks.

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 36+ messages in thread

* RE: [PATCH v2] KVM: x86: Prevent exposing TSC deadline timer feature in the absence of in-kernel APIC
  2011-12-21 10:35                       ` Avi Kivity
  2011-12-21 10:41                         ` Jan Kiszka
@ 2011-12-22 15:41                         ` Liu, Jinsong
  2011-12-25 12:37                           ` Avi Kivity
  1 sibling, 1 reply; 36+ messages in thread
From: Liu, Jinsong @ 2011-12-22 15:41 UTC (permalink / raw)
  To: Avi Kivity, Jan Kiszka
  Cc: Linus Torvalds, Alexey Zaytsev, Kernel development list,
	Marcelo Tosatti, Garrett D'Amore, kvm

Avi Kivity wrote:
> On 12/21/2011 12:25 PM, Jan Kiszka wrote:
>> We must not report the TSC deadline timer feature on our own when
>> user space provides the APIC as we have no clue about its features.
> 
> We must not report the TSC deadline timer feature on our own, period.
> We should just update the timer mode mask there.  Don't know how this
> slipped through review.
> 
> I think your original idea was correct.  Add a new KVM_CAP for the tsc
> deadline timer.  Userspace can add the bit to cpuid if either it
> implements the feature in a userspace apic, or if it finds the new
> capability and uses the kernel apic.

Is it necessary to use KVM_CAP? If I didn't misunderstand, the KVM_CAP sulotion would be:
1. qemu get kvm tsc deadline timer capability by KVM_CAP_...;
2. qemu add cpuid bit
        if ((guest use qemu apic && qemu emualte tsc deadline timer) || 
           (guest use kvm apic && kvm emulate tsc deadline timer (KVM_CAP)))
3. qemu ioctl KVM_SET_CPUID2
4. kvm expose the feature to guest by saving it at vcpu->arch.cpuid_entries, 
seems it's logically redundant.

Jan's patch v2 is a straight forward and simple fix. in the patch
    if (apic) { ... }
means apic (and then its sub-logic tsc deadline timer) emulated by kvm, that's enough:
if quest use kvm apic, it's OK to add cpuid bit and expose to guest;
if guest don't use kvm apic, it will not touch cpuid bit;

Thanks,
Jinsong

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v2] KVM: x86: Prevent exposing TSC deadline timer feature in the absence of in-kernel APIC
  2011-12-22 15:41                         ` Liu, Jinsong
@ 2011-12-25 12:37                           ` Avi Kivity
  2011-12-26  8:11                             ` Liu, Jinsong
  0 siblings, 1 reply; 36+ messages in thread
From: Avi Kivity @ 2011-12-25 12:37 UTC (permalink / raw)
  To: Liu, Jinsong
  Cc: Jan Kiszka, Linus Torvalds, Alexey Zaytsev,
	Kernel development list, Marcelo Tosatti, Garrett D'Amore,
	kvm

On 12/22/2011 05:41 PM, Liu, Jinsong wrote:
> Avi Kivity wrote:
> > On 12/21/2011 12:25 PM, Jan Kiszka wrote:
> >> We must not report the TSC deadline timer feature on our own when
> >> user space provides the APIC as we have no clue about its features.
> > 
> > We must not report the TSC deadline timer feature on our own, period.
> > We should just update the timer mode mask there.  Don't know how this
> > slipped through review.
> > 
> > I think your original idea was correct.  Add a new KVM_CAP for the tsc
> > deadline timer.  Userspace can add the bit to cpuid if either it
> > implements the feature in a userspace apic, or if it finds the new
> > capability and uses the kernel apic.
>
> Is it necessary to use KVM_CAP? If I didn't misunderstand, the KVM_CAP sulotion would be:
> 1. qemu get kvm tsc deadline timer capability by KVM_CAP_...;
> 2. qemu add cpuid bit
>         if ((guest use qemu apic && qemu emualte tsc deadline timer) || 
>            (guest use kvm apic && kvm emulate tsc deadline timer (KVM_CAP)))
> 3. qemu ioctl KVM_SET_CPUID2
> 4. kvm expose the feature to guest by saving it at vcpu->arch.cpuid_entries, 

Correct.

> seems it's logically redundant.

What's logically redundant?

> Jan's patch v2 is a straight forward and simple fix. in the patch
>     if (apic) { ... }
> means apic (and then its sub-logic tsc deadline timer) emulated by kvm, that's enough:
> if quest use kvm apic, it's OK to add cpuid bit and expose to guest;
> if guest don't use kvm apic, it will not touch cpuid bit;
>

It breaks live migration: if you start a guest on a TSC-deadline capable
host kernel, and migrate it to a TSC-deadline incapable host kernel, you
end up with a broken guest.

More broadly, kvm never exposes features transparently to the guest, it
always passes them to userspace first, so userspace controls the ABI
exposed to the guest.  This prevents the following scenario:

- a guest is started on some hardware, which doesn't support some cpuid
feature (say AVX for example)
- the guest or one of its applications are broken wrt AVX, but because
the feature is not exposed, it works correctly
- the host hardware is upgraded to one which supports AVX
- the guest is now broken

(the downside is that guests run slower than they would with automatic
feature exposure, but that's better than breaking them)

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 36+ messages in thread

* RE: [PATCH v2] KVM: x86: Prevent exposing TSC deadline timer feature in the absence of in-kernel APIC
  2011-12-25 12:37                           ` Avi Kivity
@ 2011-12-26  8:11                             ` Liu, Jinsong
  2011-12-26 11:35                               ` Avi Kivity
  0 siblings, 1 reply; 36+ messages in thread
From: Liu, Jinsong @ 2011-12-26  8:11 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Jan Kiszka, Linus Torvalds, Alexey Zaytsev,
	Kernel development list, Marcelo Tosatti, Garrett D'Amore,
	kvm, Li, Susie

Avi Kivity wrote:
> On 12/22/2011 05:41 PM, Liu, Jinsong wrote:
>> Avi Kivity wrote:
>>> On 12/21/2011 12:25 PM, Jan Kiszka wrote:
>>>> We must not report the TSC deadline timer feature on our own when
>>>> user space provides the APIC as we have no clue about its features.
>>> 
>>> We must not report the TSC deadline timer feature on our own,
>>> period. We should just update the timer mode mask there.  Don't
>>> know how this slipped through review. 
>>> 
>>> I think your original idea was correct.  Add a new KVM_CAP for the
>>> tsc deadline timer.  Userspace can add the bit to cpuid if either it
>>> implements the feature in a userspace apic, or if it finds the new
>>> capability and uses the kernel apic.
>> 
>> Is it necessary to use KVM_CAP? If I didn't misunderstand, the
>> KVM_CAP sulotion would be: 
>> 1. qemu get kvm tsc deadline timer capability by KVM_CAP_...;
>> 2. qemu add cpuid bit
>>         if ((guest use qemu apic && qemu emualte tsc deadline timer)
>>            || (guest use kvm apic && kvm emulate tsc deadline timer
>> (KVM_CAP))) 
>> 3. qemu ioctl KVM_SET_CPUID2
>> 4. kvm expose the feature to guest by saving it at
>> vcpu->arch.cpuid_entries, 
> 
> Correct.
> 
>> seems it's logically redundant.
> 
> What's logically redundant?
> 
>> Jan's patch v2 is a straight forward and simple fix. in the patch   
>> if (apic) { ... } means apic (and then its sub-logic tsc deadline
>> timer) emulated by kvm, that's enough: if quest use kvm apic, it's
>> OK to add cpuid bit and expose to guest; 
>> if guest don't use kvm apic, it will not touch cpuid bit;
>> 
> 
> It breaks live migration: if you start a guest on a TSC-deadline
> capable host kernel, and migrate it to a TSC-deadline incapable host
> kernel, you end up with a broken guest.
> 
> More broadly, kvm never exposes features transparently to the guest,
> it always passes them to userspace first, so userspace controls the
> ABI exposed to the guest.  This prevents the following scenario:

Do you mean, by the method qemu control cpuid exposing, it can avoid live migration broken issue by
1. user probe the lowest ability host of whole pool where vm may live migrate;
2. only if the lowest ablility host support the feature can user enable the feature when boot a vm;
3. if the lowest ability host didn't support the feature (say tsc deadline timer as example), user disable the feature when boot a vm;
In this way, live migration wouldn't be broken. Right?

or, do you mean qemu-kvm solve live migration broken issue by some other method?

> 
> - a guest is started on some hardware, which doesn't support some
> cpuid feature (say AVX for example)
> - the guest or one of its applications are broken wrt AVX, but because
> the feature is not exposed, it works correctly
> - the host hardware is upgraded to one which supports AVX
> - the guest is now broken

You mean, live migrate from 'old' (which doesn't support the feature) platform to 'new' platform would broken?

Thanks,
Jinsong

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v2] KVM: x86: Prevent exposing TSC deadline timer feature in the absence of in-kernel APIC
  2011-12-26  8:11                             ` Liu, Jinsong
@ 2011-12-26 11:35                               ` Avi Kivity
  2011-12-26 14:23                                 ` Liu, Jinsong
  0 siblings, 1 reply; 36+ messages in thread
From: Avi Kivity @ 2011-12-26 11:35 UTC (permalink / raw)
  To: Liu, Jinsong
  Cc: Jan Kiszka, Linus Torvalds, Alexey Zaytsev,
	Kernel development list, Marcelo Tosatti, Garrett D'Amore,
	kvm, Li, Susie

On 12/26/2011 10:11 AM, Liu, Jinsong wrote:
> > 
> > It breaks live migration: if you start a guest on a TSC-deadline
> > capable host kernel, and migrate it to a TSC-deadline incapable host
> > kernel, you end up with a broken guest.
> > 
> > More broadly, kvm never exposes features transparently to the guest,
> > it always passes them to userspace first, so userspace controls the
> > ABI exposed to the guest.  This prevents the following scenario:
>
> Do you mean, by the method qemu control cpuid exposing, it can avoid live migration broken issue by
> 1. user probe the lowest ability host of whole pool where vm may live migrate;
> 2. only if the lowest ablility host support the feature can user enable the feature when boot a vm;
> 3. if the lowest ability host didn't support the feature (say tsc deadline timer as example), user disable the feature when boot a vm;
> In this way, live migration wouldn't be broken. Right?

Right.

> or, do you mean qemu-kvm solve live migration broken issue by some other method?

The method you outlined, or any other method, such as partitioning the
cluster according to hardware capabilities.

>
> > 
> > - a guest is started on some hardware, which doesn't support some
> > cpuid feature (say AVX for example)
> > - the guest or one of its applications are broken wrt AVX, but because
> > the feature is not exposed, it works correctly
> > - the host hardware is upgraded to one which supports AVX
> > - the guest is now broken
>
> You mean, live migrate from 'old' (which doesn't support the feature) platform to 'new' platform would broken?

Live migration, or even just a guest restart on updated hardware.

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 36+ messages in thread

* RE: [PATCH v2] KVM: x86: Prevent exposing TSC deadline timer feature in the absence of in-kernel APIC
  2011-12-26 11:35                               ` Avi Kivity
@ 2011-12-26 14:23                                 ` Liu, Jinsong
  0 siblings, 0 replies; 36+ messages in thread
From: Liu, Jinsong @ 2011-12-26 14:23 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Jan Kiszka, Linus Torvalds, Alexey Zaytsev,
	Kernel development list, Marcelo Tosatti, Garrett D'Amore,
	kvm, Li, Susie

Avi Kivity wrote:
> On 12/26/2011 10:11 AM, Liu, Jinsong wrote:
>>> 
>>> It breaks live migration: if you start a guest on a TSC-deadline
>>> capable host kernel, and migrate it to a TSC-deadline incapable host
>>> kernel, you end up with a broken guest.
>>> 
>>> More broadly, kvm never exposes features transparently to the guest,
>>> it always passes them to userspace first, so userspace controls the
>>> ABI exposed to the guest.  This prevents the following scenario:
>> 
>> Do you mean, by the method qemu control cpuid exposing, it can avoid
>> live migration broken issue by 
>> 1. user probe the lowest ability host of whole pool where vm may
>> live migrate; 
>> 2. only if the lowest ablility host support the feature can user
>> enable the feature when boot a vm; 
>> 3. if the lowest ability host didn't support the feature (say tsc
>> deadline timer as example), user disable the feature when boot a vm;
>> In this way, live migration wouldn't be broken. Right? 
> 
> Right.

Thanks Avi, for your detailed explanation, fix a long misunderstanding I had for live migration.

Best Regards,
Jinsong

> 
>> or, do you mean qemu-kvm solve live migration broken issue by some
>> other method? 
> 
> The method you outlined, or any other method, such as partitioning the
> cluster according to hardware capabilities.
> 
>> 
>>> 
>>> - a guest is started on some hardware, which doesn't support some
>>> cpuid feature (say AVX for example)
>>> - the guest or one of its applications are broken wrt AVX, but
>>> because the feature is not exposed, it works correctly
>>> - the host hardware is upgraded to one which supports AVX
>>> - the guest is now broken
>> 
>> You mean, live migrate from 'old' (which doesn't support the
>> feature) platform to 'new' platform would broken? 
> 
> Live migration, or even just a guest restart on updated hardware.


^ permalink raw reply	[flat|nested] 36+ messages in thread

end of thread, other threads:[~2011-12-26 14:23 UTC | newest]

Thread overview: 36+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-12-12  0:32 [Regression, bisected] a3e06bbe8445f57eb949e6474c5a9b30f24d2057: KVM: emulate lapic tsc deadline timer for guest" Alexey Zaytsev
2011-12-12  6:13 ` Liu, Jinsong
2011-12-14  9:37   ` Alexey Zaytsev
2011-12-20  8:43     ` Alexey Zaytsev
2011-12-20  8:53       ` Liu, Jinsong
2011-12-20  8:58         ` Alexey Zaytsev
2011-12-20  9:26           ` Liu, Jinsong
2011-12-20  9:48             ` Avi Kivity
2011-12-20  9:51             ` Alexey Zaytsev
2011-12-20 18:58               ` Linus Torvalds
2011-12-20 19:21                 ` Liu, Jinsong
2011-12-20 19:47                   ` Alexey Zaytsev
2011-12-20 20:19                     ` Liu, Jinsong
2011-12-20 20:22                       ` Alexey Zaytsev
2011-12-20 20:26                         ` Liu, Jinsong
2011-12-20 20:36                           ` Alexey Zaytsev
2011-12-20 20:44                             ` Liu, Jinsong
2011-12-21  9:18                   ` Avi Kivity
2011-12-20 23:04                 ` Jan Kiszka
2011-12-21 10:10                   ` [PATCH] KVM: x86: Prevent exposing TSC deadline timer feature in the absence of in-kernel APIC Jan Kiszka
2011-12-21 10:25                     ` [PATCH v2] " Jan Kiszka
2011-12-21 10:35                       ` Avi Kivity
2011-12-21 10:41                         ` Jan Kiszka
2011-12-21 10:44                           ` Avi Kivity
2011-12-21 11:28                             ` Jan Kiszka
2011-12-21 11:45                               ` Avi Kivity
2011-12-21 11:58                                 ` Jan Kiszka
2011-12-21 12:02                                   ` Avi Kivity
2011-12-22 15:41                         ` Liu, Jinsong
2011-12-25 12:37                           ` Avi Kivity
2011-12-26  8:11                             ` Liu, Jinsong
2011-12-26 11:35                               ` Avi Kivity
2011-12-26 14:23                                 ` Liu, Jinsong
2011-12-21 10:41                       ` Alexey Zaytsev
2011-12-20 19:05               ` [Regression, bisected] a3e06bbe8445f57eb949e6474c5a9b30f24d2057: KVM: emulate lapic tsc deadline timer for guest" Liu, Jinsong
2011-12-21  9:20                 ` Avi Kivity

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).