* The "memory" test is failing in the kvm-unit-tests CI
@ 2023-03-29  9:50 Thomas Huth
  2023-03-29 19:11 ` Sean Christopherson
  0 siblings, 1 reply; 5+ messages in thread
From: Thomas Huth @ 2023-03-29  9:50 UTC (permalink / raw)
  To: Paolo Bonzini, KVM; +Cc: Cole Robinson, Sean Christopherson


  Hi,

I noticed that in recent builds, the "memory" test started failing in the
kvm-unit-tests CI. After doing some experiments, I think it is more likely
related to the environment than to a recent change in the k-u-t sources.

It used to work fine with commit 2480430a here in January:

  https://gitlab.com/kvm-unit-tests/kvm-unit-tests/-/jobs/3613156199#L2873

I've now re-run the CI with the same commit 2480430a here, and it fails:

  https://gitlab.com/thuth/kvm-unit-tests/-/jobs/4022074711#L2733

Does anybody have an idea what could be causing this regression? The build 
in January used 7.0.0-12.fc37, the new build used 7.0.0-15.fc37, could that 
be related? Or maybe a different kernel version?

  Thomas



* Re: The "memory" test is failing in the kvm-unit-tests CI
  2023-03-29  9:50 The "memory" test is failing in the kvm-unit-tests CI Thomas Huth
@ 2023-03-29 19:11 ` Sean Christopherson
  2023-03-30  7:59   ` Thomas Huth
  0 siblings, 1 reply; 5+ messages in thread
From: Sean Christopherson @ 2023-03-29 19:11 UTC (permalink / raw)
  To: Thomas Huth; +Cc: Paolo Bonzini, KVM, Cole Robinson

On Wed, Mar 29, 2023, Thomas Huth wrote:
> 
>  Hi,
> 
> I noticed that in recent builds, the "memory" test started failing in the
> kvm-unit-test CI. After doing some experiments, I think it might rather be
> related to the environment than to a recent change in the k-u-t sources.
> 
> It used to work fine with commit 2480430a here in January:
> 
>  https://gitlab.com/kvm-unit-tests/kvm-unit-tests/-/jobs/3613156199#L2873
> 
> Now I've re-run the CI with the same commit 2480430a here and it is failing now:
> 
>  https://gitlab.com/thuth/kvm-unit-tests/-/jobs/4022074711#L2733

Can you provide the logs from the failing test, and/or the build artifacts?  I
tried, and failed, to find them on Gitlab.

> Does anybody have an idea what could be causing this regression? The build
> in January used 7.0.0-12.fc37, the new build used 7.0.0-15.fc37, could that
> be related? Or maybe a different kernel version?

Nothing jumps to mind.  Triaging this without at least the logs is going to be
painful.


* Re: The "memory" test is failing in the kvm-unit-tests CI
  2023-03-29 19:11 ` Sean Christopherson
@ 2023-03-30  7:59   ` Thomas Huth
  2023-03-30 19:37     ` Sean Christopherson
  0 siblings, 1 reply; 5+ messages in thread
From: Thomas Huth @ 2023-03-30  7:59 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: Paolo Bonzini, KVM, Cole Robinson

On 29/03/2023 21.11, Sean Christopherson wrote:
> On Wed, Mar 29, 2023, Thomas Huth wrote:
>>
>>   Hi,
>>
>> I noticed that in recent builds, the "memory" test started failing in the
>> kvm-unit-test CI. After doing some experiments, I think it might rather be
>> related to the environment than to a recent change in the k-u-t sources.
>>
>> It used to work fine with commit 2480430a here in January:
>>
>>   https://gitlab.com/kvm-unit-tests/kvm-unit-tests/-/jobs/3613156199#L2873
>>
>> Now I've re-run the CI with the same commit 2480430a here and it is failing now:
>>
>>   https://gitlab.com/thuth/kvm-unit-tests/-/jobs/4022074711#L2733
> 
> Can you provide the logs from the failing test, and/or the build artifacts?  I
> tried, and failed, to find them on Gitlab.

Yes, that's still missing in the CI scripts ... I'll try to come up with a 
patch that provides the logs as artifacts.

Meanwhile, here's a run with a manual "cat logs/memory.log":

https://gitlab.com/thuth/kvm-unit-tests/-/jobs/4029213352#L2726

Seems like these are the failing memory tests:

FAIL: clflushopt (ABSENT)
FAIL: clwb (ABSENT)

  Thomas


>> Does anybody have an idea what could be causing this regression? The build
>> in January used 7.0.0-12.fc37, the new build used 7.0.0-15.fc37, could that
>> be related? Or maybe a different kernel version?
> 
> Nothing jumps to mind.  Triaging this without at least the logs is going to be
> painful.
> 



* Re: The "memory" test is failing in the kvm-unit-tests CI
  2023-03-30  7:59   ` Thomas Huth
@ 2023-03-30 19:37     ` Sean Christopherson
  2023-04-03  8:23       ` Thomas Huth
  0 siblings, 1 reply; 5+ messages in thread
From: Sean Christopherson @ 2023-03-30 19:37 UTC (permalink / raw)
  To: Thomas Huth; +Cc: Paolo Bonzini, KVM, Cole Robinson

On Thu, Mar 30, 2023, Thomas Huth wrote:
> On 29/03/2023 21.11, Sean Christopherson wrote:
> > On Wed, Mar 29, 2023, Thomas Huth wrote:
> > > 
> > >   Hi,
> > > 
> > > I noticed that in recent builds, the "memory" test started failing in the
> > > kvm-unit-test CI. After doing some experiments, I think it might rather be
> > > related to the environment than to a recent change in the k-u-t sources.
> > > 
> > > It used to work fine with commit 2480430a here in January:
> > > 
> > >   https://gitlab.com/kvm-unit-tests/kvm-unit-tests/-/jobs/3613156199#L2873
> > > 
> > > Now I've re-run the CI with the same commit 2480430a here and it is failing now:
> > > 
> > >   https://gitlab.com/thuth/kvm-unit-tests/-/jobs/4022074711#L2733
> > 
> > Can you provide the logs from the failing test, and/or the build artifacts?  I
> > tried, and failed, to find them on Gitlab.
> 
> Yes, that's still missing in the CI scripts ... I'll try to come up with a
> patch that provides the logs as artifacts.
> 
> Meanwhile, here's a run with a manual "cat logs/memory.log":
> 
> https://gitlab.com/thuth/kvm-unit-tests/-/jobs/4029213352#L2726
> 
> Seems like these are the failing memory tests:
> 
> FAIL: clflushopt (ABSENT)
> FAIL: clwb (ABSENT)

More than likely what is happening is that the platform supports CLFLUSHOPT and
CLWB (possibly even via a ucode patch update), but the CPUID bits are not being
enumerated to the guest.  Neither VMX nor SVM has intercept controls for the
instructions, so KVM has no way to enforce the guest's CPUID model.  E.g.
the failures can be reproduced by manually hiding the features:

  rkt ./x86/run x86/memory.flat -smp 1 -cpu max,-clflushopt,-clwb

This isn't a KVM bug because of the virtualization hole.  And really, the test
itself is bogus when running on KVM precisely because of said hole (similar holes
exist for all the other instructions in the test).

The test appears to have been added for QEMU's TCG, which makes sense as there
shouldn't be any virtualization holes in a pure emulation environment.

That said, it is interesting that the test is suddenly failing, as it means
something is buggy.  If you can run commands on the host, check for host support
via /proc/cpuinfo.  If those come back negative (no support), then it would appear
that hardware or the host kernel is in a bad/unexpected state.

  grep -q clflushopt /proc/cpuinfo 
  grep -q clwb /proc/cpuinfo 
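
Since "grep -q" prints nothing and reports only through its exit status, a
small wrapper makes the result visible.  A minimal sketch, assuming a Linux
host; the helper name has_cpu_flag and its optional file argument are
illustrative only and not part of kvm-unit-tests:

```shell
#!/bin/sh
# has_cpu_flag FLAG [FILE]: succeed if FLAG appears as a whole word on a
# "flags" line of FILE (default: /proc/cpuinfo).
# Hypothetical helper name, not part of kvm-unit-tests.
has_cpu_flag() {
    grep -E '^flags[[:space:]]*:' "${2:-/proc/cpuinfo}" 2>/dev/null |
        grep -qw -- "$1"
}

# Report each of the two features from the failing test.
for flag in clflushopt clwb; do
    if has_cpu_flag "$flag"; then
        echo "$flag: present"
    else
        echo "$flag: absent"
    fi
done
```

The whole-word match (-w) keeps a search for "clflush" from also matching
"clflushopt".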



* Re: The "memory" test is failing in the kvm-unit-tests CI
  2023-03-30 19:37     ` Sean Christopherson
@ 2023-04-03  8:23       ` Thomas Huth
  0 siblings, 0 replies; 5+ messages in thread
From: Thomas Huth @ 2023-04-03  8:23 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: Paolo Bonzini, KVM, Cole Robinson

On 30/03/2023 21.37, Sean Christopherson wrote:
> On Thu, Mar 30, 2023, Thomas Huth wrote:
>> On 29/03/2023 21.11, Sean Christopherson wrote:
>>> On Wed, Mar 29, 2023, Thomas Huth wrote:
>>>>
>>>>    Hi,
>>>>
>>>> I noticed that in recent builds, the "memory" test started failing in the
>>>> kvm-unit-test CI. After doing some experiments, I think it might rather be
>>>> related to the environment than to a recent change in the k-u-t sources.
>>>>
>>>> It used to work fine with commit 2480430a here in January:
>>>>
>>>>    https://gitlab.com/kvm-unit-tests/kvm-unit-tests/-/jobs/3613156199#L2873
>>>>
>>>> Now I've re-run the CI with the same commit 2480430a here and it is failing now:
>>>>
>>>>    https://gitlab.com/thuth/kvm-unit-tests/-/jobs/4022074711#L2733
>>>
>>> Can you provide the logs from the failing test, and/or the build artifacts?  I
>>> tried, and failed, to find them on Gitlab.
>>
>> Yes, that's still missing in the CI scripts ... I'll try to come up with a
>> patch that provides the logs as artifacts.
>>
>> Meanwhile, here's a run with a manual "cat logs/memory.log":
>>
>> https://gitlab.com/thuth/kvm-unit-tests/-/jobs/4029213352#L2726
>>
>> Seems like these are the failing memory tests:
>>
>> FAIL: clflushopt (ABSENT)
>> FAIL: clwb (ABSENT)
> 
> More than likely what is happening is that the platform supports CLFLUSHOPT and
> CLWB (possibly even via a ucode patch update), but the CPUID bits are not being
> enumerated to the guest.  Neither VMX nor SVM has intercept controls for the
> instructions, so KVM has no way to enforce the guest's CPUID model.  E.g.
> the failures can be reproduced by manually hiding the features:
> 
>    rkt ./x86/run x86/memory.flat -smp 1 -cpu max,-clflushopt,-clwb
> 
> This isn't a KVM bug because of the virtualization hole.  And really, the test
> itself is bogus when running on KVM precisely because of said hole (similar holes
> exist for all the other instructions in the test).
>
> The test appears to have been added for QEMU's TCG, which makes sense as there
> shouldn't be any virtualization holes in a pure emulation environment.
> 
> That said, it is interesting that the test is suddenly failing, as it means
> something is buggy.  If you can run commands on the host, check for host support
> via /proc/cpuinfo.  If those come back negative (no support), then it would appear
> that hardware or the host kernel is in a bad/unexpected state.
> 
>    grep -q clflushopt /proc/cpuinfo
>    grep -q clwb /proc/cpuinfo

I dumped the cpuinfo here:

  https://cirrus-ci.com/task/4861043097206784?logs=main#L22

And indeed, clflushopt and clwb do not show up. It's a nested setup, so I 
guess the flags have been disabled on the L0 host already.

I guess there's not much we can do here except disabling the "memory" test 
on cirrus-CI now...
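
Rather than disabling the test unconditionally, a CI step could decide at
run time.  This is only a rough sketch under assumptions: the helper name
missing_cpu_flags is made up for illustration, and how the "SKIP"/"RUN"
decision would be wired into the actual CI scripts is left open:

```shell
#!/bin/sh
# missing_cpu_flags FILE FLAG...: print every FLAG that does not appear as a
# whole word in FILE, one per line.
# Hypothetical helper, not part of kvm-unit-tests or its CI scripts.
missing_cpu_flags() {
    file=$1
    shift
    for flag in "$@"; do
        grep -qw -- "$flag" "$file" 2>/dev/null || printf '%s\n' "$flag"
    done
}

# Decide whether the "memory" test is meaningful on this host.
missing=$(missing_cpu_flags /proc/cpuinfo clflushopt clwb | tr '\n' ' ')
if [ -n "$missing" ]; then
    echo "SKIP memory: host CPU lacks ${missing% }"
else
    echo "RUN memory"
fi
```

If /proc/cpuinfo cannot be read, every flag is reported as missing, so the
sketch errs on the side of skipping.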

  Thomas





