All of lore.kernel.org
 help / color / mirror / Atom feed
From: Reiji Watanabe <reijiw@google.com>
To: Alexandru Elisei <alexandru.elisei@arm.com>
Cc: Marc Zyngier <maz@kernel.org>,
	kvmarm@lists.cs.columbia.edu, kvm@vger.kernel.org,
	Linux ARM <linux-arm-kernel@lists.infradead.org>,
	James Morse <james.morse@arm.com>,
	Suzuki K Poulose <suzuki.poulose@arm.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Will Deacon <will@kernel.org>, Peter Shier <pshier@google.com>,
	Ricardo Koller <ricarkol@google.com>,
	Oliver Upton <oupton@google.com>,
	Jing Zhang <jingzhangos@google.com>,
	Raghavendra Rao Anata <rananta@google.com>
Subject: Re: [PATCH 1/2] KVM: arm64: mixed-width check should be skipped for uninitialized vCPUs
Date: Wed, 12 Jan 2022 21:33:21 -0800	[thread overview]
Message-ID: <CAAeT=Fx62SZJvQYfFWj2sA-pcvrghuh8dSGuAaAtYWnJ5T5HCA@mail.gmail.com> (raw)
In-Reply-To: <Yd2sLbiw/XPCZe7q@monolith.localdoman>

Hi Marc and Alex,

On Tue, Jan 11, 2022 at 8:11 AM Alexandru Elisei
<alexandru.elisei@arm.com> wrote:
>
> Hi Marc,
>
> On Tue, Jan 11, 2022 at 01:30:41PM +0000, Marc Zyngier wrote:
> > On Tue, 11 Jan 2022 07:37:57 +0000,
> > Reiji Watanabe <reijiw@google.com> wrote:
> > >
> > > Hi Alex,
> > >
> > > On Mon, Jan 10, 2022 at 2:48 AM Alexandru Elisei
> > > <alexandru.elisei@arm.com> wrote:
> > > >
> > > > Hi Reiji,
> > > >
> > > > On Sun, Jan 09, 2022 at 09:40:41PM -0800, Reiji Watanabe wrote:
> > > > > vcpu_allowed_register_width() checks if all the VCPUs are either
> > > > > all 32bit or all 64bit.  Since the checking is done even for vCPUs
> > > > > that are not initialized (KVM_ARM_VCPU_INIT has not been done) yet,
> > > > > the non-initialized vCPUs are erroneously treated as 64bit vCPU,
> > > > > which causes the function to incorrectly detect a mixed-width VM.
> > > > >
> > > > > Fix vcpu_allowed_register_width() to skip the check for vCPUs that
> > > > > are not initialized yet.
> > > > >
> > > > > Fixes: 66e94d5cafd4 ("KVM: arm64: Prevent mixed-width VM creation")
> > > > > Signed-off-by: Reiji Watanabe <reijiw@google.com>
> > > > > ---
> > > > >  arch/arm64/kvm/reset.c | 11 +++++++++++
> > > > >  1 file changed, 11 insertions(+)
> > > > >
> > > > > diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c
> > > > > index 426bd7fbc3fd..ef78bbc7566a 100644
> > > > > --- a/arch/arm64/kvm/reset.c
> > > > > +++ b/arch/arm64/kvm/reset.c
> > > > > @@ -180,8 +180,19 @@ static bool vcpu_allowed_register_width(struct kvm_vcpu *vcpu)
> > > > >       if (kvm_has_mte(vcpu->kvm) && is32bit)
> > > > >               return false;
> > > > >
> > > > > +     /*
> > > > > +      * Make sure vcpu->arch.target setting is visible from others so
> > > > > +      * that the width consistency checking between two vCPUs is done
> > > > > +      * by at least one of them at KVM_ARM_VCPU_INIT.
> > > > > +      */
> > > > > +     smp_mb();
> > > >
> > > > From ARM DDI 0487G.a, page B2-146 ("Data Memory Barrier (DMB)"):
> > > >
> > > > "The DMB instruction is a memory barrier instruction that ensures the relative
> > > > order of memory accesses before the barrier with memory accesses after the
> > > > barrier."
> > > >
> > > > I'm going to assume from the comment that you are referring to completion of
> > > > memory accesses ("Make sure [..] is visible from others"). Please correct me if
> > > > I am wrong. In this case, DMB ensures ordering of memory accesses with regards
> > > > to writes and reads, not *completion*.  Have a look at
> > > > tools/memory-model/litmus-tests/MP+fencewmbonceonce+fencermbonceonce.litmus for
> > > > the classic message passing example as an example of memory ordering.
> > > > Message passing and other patterns are also explained in ARM DDI 0487G.a, page
> > > > K11-8363.
> > > >
> > > > I'm not saying that your approach is incorrect, but the commit message should
> > > > explain what memory accesses are being ordered relative to each other and why.
> > >
> > > Thank you so much for the review.
> > > What I meant with the comment was:
> > > ---
> > >   DMB is used to make sure that writing @vcpu->arch.target, which is done
> > >   by kvm_vcpu_set_target() before getting here, is visible to other PEs
> > >   before the following kvm_for_each_vcpu iteration reads the other vCPUs'
> > >   target field.
> > > ---
> > > Did the comment become more clear ?? (Or do I use DMB incorrectly ?)
> > >
> > > > > +
> > > > >       /* Check that the vcpus are either all 32bit or all 64bit */
> > > > >       kvm_for_each_vcpu(i, tmp, vcpu->kvm) {
> > > > > +             /* Skip if KVM_ARM_VCPU_INIT is not done for the vcpu yet */
> > > > > +             if (tmp->arch.target == -1)
> > > > > +                     continue;
> > >
> > > I just noticed DMB(ishld) is needed here to assure ordering between
> > > reading tmp->arch.target and reading vcpu->arch.features for this fix.
> > > Similarly, kvm_vcpu_set_target() needs DMB(ishst) to assure ordering
> > > between writing vcpu->arch.features and writing vcpu->arch.target...
> > > I am going to fix them in the v2 series.
> >
> > Yes, you'd need at least this, and preferably in their smp_rmb/wmb
> > variants.
> >
> > However, this looks like a pretty fragile construct, as there are
> > multiple paths where we can change target (including some error
> > paths from the run loop).
> >
> > I'd rather all changes to target and the feature bits happen under the
> > kvm->lock, and take that lock when checking for consistency in
> > vcpu_allowed_register_width(), as this isn't a fast path. I wrote the
> > following, which is obviously incomplete and as usual untested.
>
> I think this is the better approach, because we also want to make sure that
> a PE observes changes to target and features as soon as they have been
> made, to avoid situations where one PE sets the target and the 32bit
> feature, and another PE reads the old values and skips the check, in which
> case memory ordering is not enough.

Thank you for the comments and the suggestion.
I will look into fixing this based on the suggestion.

Thanks,
Reiji




>
> Thanks,
> Alex
>
> >
> > Thanks,
> >
> >       M.
> >
> > diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> > index e4727dc771bf..42f2ab80646c 100644
> > --- a/arch/arm64/kvm/arm.c
> > +++ b/arch/arm64/kvm/arm.c
> > @@ -1061,7 +1061,8 @@ int kvm_vm_ioctl_irq_line(struct kvm *kvm, struct kvm_irq_level *irq_level,
> >  static int kvm_vcpu_set_target(struct kvm_vcpu *vcpu,
> >                              const struct kvm_vcpu_init *init)
> >  {
> > -     unsigned int i, ret;
> > +     unsigned int i;
> > +     int ret = 0;
> >       u32 phys_target = kvm_target_cpu();
> >
> >       if (init->target != phys_target)
> > @@ -1074,32 +1075,46 @@ static int kvm_vcpu_set_target(struct kvm_vcpu *vcpu,
> >       if (vcpu->arch.target != -1 && vcpu->arch.target != init->target)
> >               return -EINVAL;
> >
> > +     /* Hazard against a concurent check of the target in kvm_reset_vcpu() */
> > +     mutex_lock(&vcpu->kvm->lock);
> > +
> >       /* -ENOENT for unknown features, -EINVAL for invalid combinations. */
> >       for (i = 0; i < sizeof(init->features) * 8; i++) {
> >               bool set = (init->features[i / 32] & (1 << (i % 32)));
> >
> > -             if (set && i >= KVM_VCPU_MAX_FEATURES)
> > -                     return -ENOENT;
> > +             if (set && i >= KVM_VCPU_MAX_FEATURES) {
> > +                     ret = -ENOENT;
> > +                     break;
> > +             }
> >
> >               /*
> >                * Secondary and subsequent calls to KVM_ARM_VCPU_INIT must
> >                * use the same feature set.
> >                */
> >               if (vcpu->arch.target != -1 && i < KVM_VCPU_MAX_FEATURES &&
> > -                 test_bit(i, vcpu->arch.features) != set)
> > -                     return -EINVAL;
> > +                 test_bit(i, vcpu->arch.features) != set) {
> > +                     ret = -EINVAL;
> > +                     break;
> > +             }
> >
> >               if (set)
> >                       set_bit(i, vcpu->arch.features);
> >       }
> >
> > -     vcpu->arch.target = phys_target;
> > +     if (!ret)
> > +             vcpu->arch.target = phys_target;
> > +
> > +     mutex_unlock(&vcpu->kvm->lock);
> > +     if (ret)
> > +             return ret;
> >
> >       /* Now we know what it is, we can reset it. */
> >       ret = kvm_reset_vcpu(vcpu);
> >       if (ret) {
> > +             mutex_lock(&vcpu->kvm->lock);
> >               vcpu->arch.target = -1;
> >               bitmap_zero(vcpu->arch.features, KVM_VCPU_MAX_FEATURES);
> > +             mutex_unlock(&vcpu->kvm->lock);
> >       }
> >
> >       return ret;
> > diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c
> > index ef78bbc7566a..fae88a703140 100644
> > --- a/arch/arm64/kvm/reset.c
> > +++ b/arch/arm64/kvm/reset.c
> > @@ -180,13 +180,6 @@ static bool vcpu_allowed_register_width(struct kvm_vcpu *vcpu)
> >       if (kvm_has_mte(vcpu->kvm) && is32bit)
> >               return false;
> >
> > -     /*
> > -      * Make sure vcpu->arch.target setting is visible from others so
> > -      * that the width consistency checking between two vCPUs is done
> > -      * by at least one of them at KVM_ARM_VCPU_INIT.
> > -      */
> > -     smp_mb();
> > -
> >       /* Check that the vcpus are either all 32bit or all 64bit */
> >       kvm_for_each_vcpu(i, tmp, vcpu->kvm) {
> >               /* Skip if KVM_ARM_VCPU_INIT is not done for the vcpu yet */
> > @@ -222,14 +215,19 @@ static bool vcpu_allowed_register_width(struct kvm_vcpu *vcpu)
> >  int kvm_reset_vcpu(struct kvm_vcpu *vcpu)
> >  {
> >       struct vcpu_reset_state reset_state;
> > -     int ret;
> > +     int ret = -EINVAL;
> >       bool loaded;
> >       u32 pstate;
> >
> >       mutex_lock(&vcpu->kvm->lock);
> > -     reset_state = vcpu->arch.reset_state;
> > -     WRITE_ONCE(vcpu->arch.reset_state.reset, false);
> > +     if (vcpu_allowed_register_width(vcpu)) {
> > +             reset_state = vcpu->arch.reset_state;
> > +             WRITE_ONCE(vcpu->arch.reset_state.reset, false);
> > +             ret = 0;
> > +     }
> >       mutex_unlock(&vcpu->kvm->lock);
> > +     if (ret)
> > +             goto out;
> >
> >       /* Reset PMU outside of the non-preemptible section */
> >       kvm_pmu_vcpu_reset(vcpu);
> > @@ -257,11 +255,6 @@ int kvm_reset_vcpu(struct kvm_vcpu *vcpu)
> >               }
> >       }
> >
> > -     if (!vcpu_allowed_register_width(vcpu)) {
> > -             ret = -EINVAL;
> > -             goto out;
> > -     }
> > -
> >       switch (vcpu->arch.target) {
> >       default:
> >               if (test_bit(KVM_ARM_VCPU_EL1_32BIT, vcpu->arch.features)) {
> >
> > --
> > Without deviation from the norm, progress is not possible.

WARNING: multiple messages have this Message-ID (diff)
From: Reiji Watanabe <reijiw@google.com>
To: Alexandru Elisei <alexandru.elisei@arm.com>
Cc: kvm@vger.kernel.org, Marc Zyngier <maz@kernel.org>,
	Peter Shier <pshier@google.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Will Deacon <will@kernel.org>,
	kvmarm@lists.cs.columbia.edu,
	Linux ARM <linux-arm-kernel@lists.infradead.org>
Subject: Re: [PATCH 1/2] KVM: arm64: mixed-width check should be skipped for uninitialized vCPUs
Date: Wed, 12 Jan 2022 21:33:21 -0800	[thread overview]
Message-ID: <CAAeT=Fx62SZJvQYfFWj2sA-pcvrghuh8dSGuAaAtYWnJ5T5HCA@mail.gmail.com> (raw)
In-Reply-To: <Yd2sLbiw/XPCZe7q@monolith.localdoman>

Hi Marc and Alex,

On Tue, Jan 11, 2022 at 8:11 AM Alexandru Elisei
<alexandru.elisei@arm.com> wrote:
>
> Hi Marc,
>
> On Tue, Jan 11, 2022 at 01:30:41PM +0000, Marc Zyngier wrote:
> > On Tue, 11 Jan 2022 07:37:57 +0000,
> > Reiji Watanabe <reijiw@google.com> wrote:
> > >
> > > Hi Alex,
> > >
> > > On Mon, Jan 10, 2022 at 2:48 AM Alexandru Elisei
> > > <alexandru.elisei@arm.com> wrote:
> > > >
> > > > Hi Reiji,
> > > >
> > > > On Sun, Jan 09, 2022 at 09:40:41PM -0800, Reiji Watanabe wrote:
> > > > > vcpu_allowed_register_width() checks if all the VCPUs are either
> > > > > all 32bit or all 64bit.  Since the checking is done even for vCPUs
> > > > > that are not initialized (KVM_ARM_VCPU_INIT has not been done) yet,
> > > > > the non-initialized vCPUs are erroneously treated as 64bit vCPU,
> > > > > which causes the function to incorrectly detect a mixed-width VM.
> > > > >
> > > > > Fix vcpu_allowed_register_width() to skip the check for vCPUs that
> > > > > are not initialized yet.
> > > > >
> > > > > Fixes: 66e94d5cafd4 ("KVM: arm64: Prevent mixed-width VM creation")
> > > > > Signed-off-by: Reiji Watanabe <reijiw@google.com>
> > > > > ---
> > > > >  arch/arm64/kvm/reset.c | 11 +++++++++++
> > > > >  1 file changed, 11 insertions(+)
> > > > >
> > > > > diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c
> > > > > index 426bd7fbc3fd..ef78bbc7566a 100644
> > > > > --- a/arch/arm64/kvm/reset.c
> > > > > +++ b/arch/arm64/kvm/reset.c
> > > > > @@ -180,8 +180,19 @@ static bool vcpu_allowed_register_width(struct kvm_vcpu *vcpu)
> > > > >       if (kvm_has_mte(vcpu->kvm) && is32bit)
> > > > >               return false;
> > > > >
> > > > > +     /*
> > > > > +      * Make sure vcpu->arch.target setting is visible from others so
> > > > > +      * that the width consistency checking between two vCPUs is done
> > > > > +      * by at least one of them at KVM_ARM_VCPU_INIT.
> > > > > +      */
> > > > > +     smp_mb();
> > > >
> > > > From ARM DDI 0487G.a, page B2-146 ("Data Memory Barrier (DMB)"):
> > > >
> > > > "The DMB instruction is a memory barrier instruction that ensures the relative
> > > > order of memory accesses before the barrier with memory accesses after the
> > > > barrier."
> > > >
> > > > I'm going to assume from the comment that you are referring to completion of
> > > > memory accesses ("Make sure [..] is visible from others"). Please correct me if
> > > > I am wrong. In this case, DMB ensures ordering of memory accesses with regards
> > > > to writes and reads, not *completion*.  Have a look at
> > > > tools/memory-model/litmus-tests/MP+fencewmbonceonce+fencermbonceonce.litmus for
> > > > the classic message passing example as an example of memory ordering.
> > > > Message passing and other patterns are also explained in ARM DDI 0487G.a, page
> > > > K11-8363.
> > > >
> > > > I'm not saying that your approach is incorrect, but the commit message should
> > > > explain what memory accesses are being ordered relative to each other and why.
> > >
> > > Thank you so much for the review.
> > > What I meant with the comment was:
> > > ---
> > >   DMB is used to make sure that writing @vcpu->arch.target, which is done
> > >   by kvm_vcpu_set_target() before getting here, is visible to other PEs
> > >   before the following kvm_for_each_vcpu iteration reads the other vCPUs'
> > >   target field.
> > > ---
> > > Did the comment become more clear ?? (Or do I use DMB incorrectly ?)
> > >
> > > > > +
> > > > >       /* Check that the vcpus are either all 32bit or all 64bit */
> > > > >       kvm_for_each_vcpu(i, tmp, vcpu->kvm) {
> > > > > +             /* Skip if KVM_ARM_VCPU_INIT is not done for the vcpu yet */
> > > > > +             if (tmp->arch.target == -1)
> > > > > +                     continue;
> > >
> > > I just noticed DMB(ishld) is needed here to assure ordering between
> > > reading tmp->arch.target and reading vcpu->arch.features for this fix.
> > > Similarly, kvm_vcpu_set_target() needs DMB(ishst) to assure ordering
> > > between writing vcpu->arch.features and writing vcpu->arch.target...
> > > I am going to fix them in the v2 series.
> >
> > Yes, you'd need at least this, and preferably in their smp_rmb/wmb
> > variants.
> >
> > However, this looks like a pretty fragile construct, as there are
> > multiple paths where we can change target (including some error
> > paths from the run loop).
> >
> > I'd rather all changes to target and the feature bits happen under the
> > kvm->lock, and take that lock when checking for consistency in
> > vcpu_allowed_register_width(), as this isn't a fast path. I wrote the
> > following, which is obviously incomplete and as usual untested.
>
> I think this is the better approach, because we also want to make sure that
> a PE observes changes to target and features as soon as they have been
> made, to avoid situations where one PE sets the target and the 32bit
> feature, and another PE reads the old values and skips the check, in which
> case memory ordering is not enough.

Thank you for the comments and the suggestion.
I will look into fixing this based on the suggestion.

Thanks,
Reiji




>
> Thanks,
> Alex
>
> >
> > Thanks,
> >
> >       M.
> >
> > diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> > index e4727dc771bf..42f2ab80646c 100644
> > --- a/arch/arm64/kvm/arm.c
> > +++ b/arch/arm64/kvm/arm.c
> > @@ -1061,7 +1061,8 @@ int kvm_vm_ioctl_irq_line(struct kvm *kvm, struct kvm_irq_level *irq_level,
> >  static int kvm_vcpu_set_target(struct kvm_vcpu *vcpu,
> >                              const struct kvm_vcpu_init *init)
> >  {
> > -     unsigned int i, ret;
> > +     unsigned int i;
> > +     int ret = 0;
> >       u32 phys_target = kvm_target_cpu();
> >
> >       if (init->target != phys_target)
> > @@ -1074,32 +1075,46 @@ static int kvm_vcpu_set_target(struct kvm_vcpu *vcpu,
> >       if (vcpu->arch.target != -1 && vcpu->arch.target != init->target)
> >               return -EINVAL;
> >
> > +     /* Hazard against a concurent check of the target in kvm_reset_vcpu() */
> > +     mutex_lock(&vcpu->kvm->lock);
> > +
> >       /* -ENOENT for unknown features, -EINVAL for invalid combinations. */
> >       for (i = 0; i < sizeof(init->features) * 8; i++) {
> >               bool set = (init->features[i / 32] & (1 << (i % 32)));
> >
> > -             if (set && i >= KVM_VCPU_MAX_FEATURES)
> > -                     return -ENOENT;
> > +             if (set && i >= KVM_VCPU_MAX_FEATURES) {
> > +                     ret = -ENOENT;
> > +                     break;
> > +             }
> >
> >               /*
> >                * Secondary and subsequent calls to KVM_ARM_VCPU_INIT must
> >                * use the same feature set.
> >                */
> >               if (vcpu->arch.target != -1 && i < KVM_VCPU_MAX_FEATURES &&
> > -                 test_bit(i, vcpu->arch.features) != set)
> > -                     return -EINVAL;
> > +                 test_bit(i, vcpu->arch.features) != set) {
> > +                     ret = -EINVAL;
> > +                     break;
> > +             }
> >
> >               if (set)
> >                       set_bit(i, vcpu->arch.features);
> >       }
> >
> > -     vcpu->arch.target = phys_target;
> > +     if (!ret)
> > +             vcpu->arch.target = phys_target;
> > +
> > +     mutex_unlock(&vcpu->kvm->lock);
> > +     if (ret)
> > +             return ret;
> >
> >       /* Now we know what it is, we can reset it. */
> >       ret = kvm_reset_vcpu(vcpu);
> >       if (ret) {
> > +             mutex_lock(&vcpu->kvm->lock);
> >               vcpu->arch.target = -1;
> >               bitmap_zero(vcpu->arch.features, KVM_VCPU_MAX_FEATURES);
> > +             mutex_unlock(&vcpu->kvm->lock);
> >       }
> >
> >       return ret;
> > diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c
> > index ef78bbc7566a..fae88a703140 100644
> > --- a/arch/arm64/kvm/reset.c
> > +++ b/arch/arm64/kvm/reset.c
> > @@ -180,13 +180,6 @@ static bool vcpu_allowed_register_width(struct kvm_vcpu *vcpu)
> >       if (kvm_has_mte(vcpu->kvm) && is32bit)
> >               return false;
> >
> > -     /*
> > -      * Make sure vcpu->arch.target setting is visible from others so
> > -      * that the width consistency checking between two vCPUs is done
> > -      * by at least one of them at KVM_ARM_VCPU_INIT.
> > -      */
> > -     smp_mb();
> > -
> >       /* Check that the vcpus are either all 32bit or all 64bit */
> >       kvm_for_each_vcpu(i, tmp, vcpu->kvm) {
> >               /* Skip if KVM_ARM_VCPU_INIT is not done for the vcpu yet */
> > @@ -222,14 +215,19 @@ static bool vcpu_allowed_register_width(struct kvm_vcpu *vcpu)
> >  int kvm_reset_vcpu(struct kvm_vcpu *vcpu)
> >  {
> >       struct vcpu_reset_state reset_state;
> > -     int ret;
> > +     int ret = -EINVAL;
> >       bool loaded;
> >       u32 pstate;
> >
> >       mutex_lock(&vcpu->kvm->lock);
> > -     reset_state = vcpu->arch.reset_state;
> > -     WRITE_ONCE(vcpu->arch.reset_state.reset, false);
> > +     if (vcpu_allowed_register_width(vcpu)) {
> > +             reset_state = vcpu->arch.reset_state;
> > +             WRITE_ONCE(vcpu->arch.reset_state.reset, false);
> > +             ret = 0;
> > +     }
> >       mutex_unlock(&vcpu->kvm->lock);
> > +     if (ret)
> > +             goto out;
> >
> >       /* Reset PMU outside of the non-preemptible section */
> >       kvm_pmu_vcpu_reset(vcpu);
> > @@ -257,11 +255,6 @@ int kvm_reset_vcpu(struct kvm_vcpu *vcpu)
> >               }
> >       }
> >
> > -     if (!vcpu_allowed_register_width(vcpu)) {
> > -             ret = -EINVAL;
> > -             goto out;
> > -     }
> > -
> >       switch (vcpu->arch.target) {
> >       default:
> >               if (test_bit(KVM_ARM_VCPU_EL1_32BIT, vcpu->arch.features)) {
> >
> > --
> > Without deviation from the norm, progress is not possible.
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

WARNING: multiple messages have this Message-ID (diff)
From: Reiji Watanabe <reijiw@google.com>
To: Alexandru Elisei <alexandru.elisei@arm.com>
Cc: Marc Zyngier <maz@kernel.org>,
	kvmarm@lists.cs.columbia.edu, kvm@vger.kernel.org,
	Linux ARM <linux-arm-kernel@lists.infradead.org>,
	James Morse <james.morse@arm.com>,
	 Suzuki K Poulose <suzuki.poulose@arm.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Will Deacon <will@kernel.org>, Peter Shier <pshier@google.com>,
	 Ricardo Koller <ricarkol@google.com>,
	Oliver Upton <oupton@google.com>,
	 Jing Zhang <jingzhangos@google.com>,
	Raghavendra Rao Anata <rananta@google.com>
Subject: Re: [PATCH 1/2] KVM: arm64: mixed-width check should be skipped for uninitialized vCPUs
Date: Wed, 12 Jan 2022 21:33:21 -0800	[thread overview]
Message-ID: <CAAeT=Fx62SZJvQYfFWj2sA-pcvrghuh8dSGuAaAtYWnJ5T5HCA@mail.gmail.com> (raw)
In-Reply-To: <Yd2sLbiw/XPCZe7q@monolith.localdoman>

Hi Marc and Alex,

On Tue, Jan 11, 2022 at 8:11 AM Alexandru Elisei
<alexandru.elisei@arm.com> wrote:
>
> Hi Marc,
>
> On Tue, Jan 11, 2022 at 01:30:41PM +0000, Marc Zyngier wrote:
> > On Tue, 11 Jan 2022 07:37:57 +0000,
> > Reiji Watanabe <reijiw@google.com> wrote:
> > >
> > > Hi Alex,
> > >
> > > On Mon, Jan 10, 2022 at 2:48 AM Alexandru Elisei
> > > <alexandru.elisei@arm.com> wrote:
> > > >
> > > > Hi Reiji,
> > > >
> > > > On Sun, Jan 09, 2022 at 09:40:41PM -0800, Reiji Watanabe wrote:
> > > > > vcpu_allowed_register_width() checks if all the VCPUs are either
> > > > > all 32bit or all 64bit.  Since the checking is done even for vCPUs
> > > > > that are not initialized (KVM_ARM_VCPU_INIT has not been done) yet,
> > > > > the non-initialized vCPUs are erroneously treated as 64bit vCPU,
> > > > > which causes the function to incorrectly detect a mixed-width VM.
> > > > >
> > > > > Fix vcpu_allowed_register_width() to skip the check for vCPUs that
> > > > > are not initialized yet.
> > > > >
> > > > > Fixes: 66e94d5cafd4 ("KVM: arm64: Prevent mixed-width VM creation")
> > > > > Signed-off-by: Reiji Watanabe <reijiw@google.com>
> > > > > ---
> > > > >  arch/arm64/kvm/reset.c | 11 +++++++++++
> > > > >  1 file changed, 11 insertions(+)
> > > > >
> > > > > diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c
> > > > > index 426bd7fbc3fd..ef78bbc7566a 100644
> > > > > --- a/arch/arm64/kvm/reset.c
> > > > > +++ b/arch/arm64/kvm/reset.c
> > > > > @@ -180,8 +180,19 @@ static bool vcpu_allowed_register_width(struct kvm_vcpu *vcpu)
> > > > >       if (kvm_has_mte(vcpu->kvm) && is32bit)
> > > > >               return false;
> > > > >
> > > > > +     /*
> > > > > +      * Make sure vcpu->arch.target setting is visible from others so
> > > > > +      * that the width consistency checking between two vCPUs is done
> > > > > +      * by at least one of them at KVM_ARM_VCPU_INIT.
> > > > > +      */
> > > > > +     smp_mb();
> > > >
> > > > From ARM DDI 0487G.a, page B2-146 ("Data Memory Barrier (DMB)"):
> > > >
> > > > "The DMB instruction is a memory barrier instruction that ensures the relative
> > > > order of memory accesses before the barrier with memory accesses after the
> > > > barrier."
> > > >
> > > > I'm going to assume from the comment that you are referring to completion of
> > > > memory accesses ("Make sure [..] is visible from others"). Please correct me if
> > > > I am wrong. In this case, DMB ensures ordering of memory accesses with regards
> > > > to writes and reads, not *completion*.  Have a look at
> > > > tools/memory-model/litmus-tests/MP+fencewmbonceonce+fencermbonceonce.litmus for
> > > > the classic message passing example as an example of memory ordering.
> > > > Message passing and other patterns are also explained in ARM DDI 0487G.a, page
> > > > K11-8363.
> > > >
> > > > I'm not saying that your approach is incorrect, but the commit message should
> > > > explain what memory accesses are being ordered relative to each other and why.
> > >
> > > Thank you so much for the review.
> > > What I meant with the comment was:
> > > ---
> > >   DMB is used to make sure that writing @vcpu->arch.target, which is done
> > >   by kvm_vcpu_set_target() before getting here, is visible to other PEs
> > >   before the following kvm_for_each_vcpu iteration reads the other vCPUs'
> > >   target field.
> > > ---
> > > Did the comment become more clear ?? (Or do I use DMB incorrectly ?)
> > >
> > > > > +
> > > > >       /* Check that the vcpus are either all 32bit or all 64bit */
> > > > >       kvm_for_each_vcpu(i, tmp, vcpu->kvm) {
> > > > > +             /* Skip if KVM_ARM_VCPU_INIT is not done for the vcpu yet */
> > > > > +             if (tmp->arch.target == -1)
> > > > > +                     continue;
> > >
> > > I just noticed DMB(ishld) is needed here to assure ordering between
> > > reading tmp->arch.target and reading vcpu->arch.features for this fix.
> > > Similarly, kvm_vcpu_set_target() needs DMB(ishst) to assure ordering
> > > between writing vcpu->arch.features and writing vcpu->arch.target...
> > > I am going to fix them in the v2 series.
> >
> > Yes, you'd need at least this, and preferably in their smp_rmb/wmb
> > variants.
> >
> > However, this looks like a pretty fragile construct, as there are
> > multiple paths where we can change target (including some error
> > paths from the run loop).
> >
> > I'd rather all changes to target and the feature bits happen under the
> > kvm->lock, and take that lock when checking for consistency in
> > vcpu_allowed_register_width(), as this isn't a fast path. I wrote the
> > following, which is obviously incomplete and as usual untested.
>
> I think this is the better approach, because we also want to make sure that
> a PE observes changes to target and features as soon as they have been
> made, to avoid situations where one PE sets the target and the 32bit
> feature, and another PE reads the old values and skips the check, in which
> case memory ordering is not enough.

Thank you for the comments and the suggestion.
I will look into fixing this based on the suggestion.

Thanks,
Reiji




>
> Thanks,
> Alex
>
> >
> > Thanks,
> >
> >       M.
> >
> > diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> > index e4727dc771bf..42f2ab80646c 100644
> > --- a/arch/arm64/kvm/arm.c
> > +++ b/arch/arm64/kvm/arm.c
> > @@ -1061,7 +1061,8 @@ int kvm_vm_ioctl_irq_line(struct kvm *kvm, struct kvm_irq_level *irq_level,
> >  static int kvm_vcpu_set_target(struct kvm_vcpu *vcpu,
> >                              const struct kvm_vcpu_init *init)
> >  {
> > -     unsigned int i, ret;
> > +     unsigned int i;
> > +     int ret = 0;
> >       u32 phys_target = kvm_target_cpu();
> >
> >       if (init->target != phys_target)
> > @@ -1074,32 +1075,46 @@ static int kvm_vcpu_set_target(struct kvm_vcpu *vcpu,
> >       if (vcpu->arch.target != -1 && vcpu->arch.target != init->target)
> >               return -EINVAL;
> >
> > +     /* Hazard against a concurent check of the target in kvm_reset_vcpu() */
> > +     mutex_lock(&vcpu->kvm->lock);
> > +
> >       /* -ENOENT for unknown features, -EINVAL for invalid combinations. */
> >       for (i = 0; i < sizeof(init->features) * 8; i++) {
> >               bool set = (init->features[i / 32] & (1 << (i % 32)));
> >
> > -             if (set && i >= KVM_VCPU_MAX_FEATURES)
> > -                     return -ENOENT;
> > +             if (set && i >= KVM_VCPU_MAX_FEATURES) {
> > +                     ret = -ENOENT;
> > +                     break;
> > +             }
> >
> >               /*
> >                * Secondary and subsequent calls to KVM_ARM_VCPU_INIT must
> >                * use the same feature set.
> >                */
> >               if (vcpu->arch.target != -1 && i < KVM_VCPU_MAX_FEATURES &&
> > -                 test_bit(i, vcpu->arch.features) != set)
> > -                     return -EINVAL;
> > +                 test_bit(i, vcpu->arch.features) != set) {
> > +                     ret = -EINVAL;
> > +                     break;
> > +             }
> >
> >               if (set)
> >                       set_bit(i, vcpu->arch.features);
> >       }
> >
> > -     vcpu->arch.target = phys_target;
> > +     if (!ret)
> > +             vcpu->arch.target = phys_target;
> > +
> > +     mutex_unlock(&vcpu->kvm->lock);
> > +     if (ret)
> > +             return ret;
> >
> >       /* Now we know what it is, we can reset it. */
> >       ret = kvm_reset_vcpu(vcpu);
> >       if (ret) {
> > +             mutex_lock(&vcpu->kvm->lock);
> >               vcpu->arch.target = -1;
> >               bitmap_zero(vcpu->arch.features, KVM_VCPU_MAX_FEATURES);
> > +             mutex_unlock(&vcpu->kvm->lock);
> >       }
> >
> >       return ret;
> > diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c
> > index ef78bbc7566a..fae88a703140 100644
> > --- a/arch/arm64/kvm/reset.c
> > +++ b/arch/arm64/kvm/reset.c
> > @@ -180,13 +180,6 @@ static bool vcpu_allowed_register_width(struct kvm_vcpu *vcpu)
> >       if (kvm_has_mte(vcpu->kvm) && is32bit)
> >               return false;
> >
> > -     /*
> > -      * Make sure vcpu->arch.target setting is visible from others so
> > -      * that the width consistency checking between two vCPUs is done
> > -      * by at least one of them at KVM_ARM_VCPU_INIT.
> > -      */
> > -     smp_mb();
> > -
> >       /* Check that the vcpus are either all 32bit or all 64bit */
> >       kvm_for_each_vcpu(i, tmp, vcpu->kvm) {
> >               /* Skip if KVM_ARM_VCPU_INIT is not done for the vcpu yet */
> > @@ -222,14 +215,19 @@ static bool vcpu_allowed_register_width(struct kvm_vcpu *vcpu)
> >  int kvm_reset_vcpu(struct kvm_vcpu *vcpu)
> >  {
> >       struct vcpu_reset_state reset_state;
> > -     int ret;
> > +     int ret = -EINVAL;
> >       bool loaded;
> >       u32 pstate;
> >
> >       mutex_lock(&vcpu->kvm->lock);
> > -     reset_state = vcpu->arch.reset_state;
> > -     WRITE_ONCE(vcpu->arch.reset_state.reset, false);
> > +     if (vcpu_allowed_register_width(vcpu)) {
> > +             reset_state = vcpu->arch.reset_state;
> > +             WRITE_ONCE(vcpu->arch.reset_state.reset, false);
> > +             ret = 0;
> > +     }
> >       mutex_unlock(&vcpu->kvm->lock);
> > +     if (ret)
> > +             goto out;
> >
> >       /* Reset PMU outside of the non-preemptible section */
> >       kvm_pmu_vcpu_reset(vcpu);
> > @@ -257,11 +255,6 @@ int kvm_reset_vcpu(struct kvm_vcpu *vcpu)
> >               }
> >       }
> >
> > -     if (!vcpu_allowed_register_width(vcpu)) {
> > -             ret = -EINVAL;
> > -             goto out;
> > -     }
> > -
> >       switch (vcpu->arch.target) {
> >       default:
> >               if (test_bit(KVM_ARM_VCPU_EL1_32BIT, vcpu->arch.features)) {
> >
> > --
> > Without deviation from the norm, progress is not possible.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

  reply	other threads:[~2022-01-13  5:33 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-01-10  5:40 [PATCH 1/2] KVM: arm64: mixed-width check should be skipped for uninitialized vCPUs Reiji Watanabe
2022-01-10  5:40 ` Reiji Watanabe
2022-01-10  5:40 ` Reiji Watanabe
2022-01-10  5:40 ` [PATCH 2/2] KVM: arm64: selftests: Introduce vcpu_width_config Reiji Watanabe
2022-01-10  5:40   ` Reiji Watanabe
2022-01-10  5:40   ` Reiji Watanabe
2022-01-11  9:47   ` Andrew Jones
2022-01-11  9:47     ` Andrew Jones
2022-01-11  9:47     ` Andrew Jones
2022-01-10 10:48 ` [PATCH 1/2] KVM: arm64: mixed-width check should be skipped for uninitialized vCPUs Alexandru Elisei
2022-01-10 10:48   ` Alexandru Elisei
2022-01-10 10:48   ` Alexandru Elisei
2022-01-11  7:37   ` Reiji Watanabe
2022-01-11  7:37     ` Reiji Watanabe
2022-01-11  7:37     ` Reiji Watanabe
2022-01-11 13:30     ` Marc Zyngier
2022-01-11 13:30       ` Marc Zyngier
2022-01-11 13:30       ` Marc Zyngier
2022-01-11 16:11       ` Alexandru Elisei
2022-01-11 16:11         ` Alexandru Elisei
2022-01-11 16:11         ` Alexandru Elisei
2022-01-13  5:33         ` Reiji Watanabe [this message]
2022-01-13  5:33           ` Reiji Watanabe
2022-01-13  5:33           ` Reiji Watanabe

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAAeT=Fx62SZJvQYfFWj2sA-pcvrghuh8dSGuAaAtYWnJ5T5HCA@mail.gmail.com' \
    --to=reijiw@google.com \
    --cc=alexandru.elisei@arm.com \
    --cc=james.morse@arm.com \
    --cc=jingzhangos@google.com \
    --cc=kvm@vger.kernel.org \
    --cc=kvmarm@lists.cs.columbia.edu \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=maz@kernel.org \
    --cc=oupton@google.com \
    --cc=pbonzini@redhat.com \
    --cc=pshier@google.com \
    --cc=rananta@google.com \
    --cc=ricarkol@google.com \
    --cc=suzuki.poulose@arm.com \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.