All of lore.kernel.org
 help / color / mirror / Atom feed
From: Paul Durrant <Paul.Durrant@citrix.com>
To: 'Jan Beulich' <JBeulich@suse.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>,
	Wei Liu <wei.liu2@citrix.com>,
	Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>,
	Andrew Cooper <Andrew.Cooper3@citrix.com>,
	"Tim (Xen.org)" <tim@xen.org>,
	George Dunlap <George.Dunlap@citrix.com>,
	Julien Grall <julien.grall@arm.com>,
	xen-devel <xen-devel@lists.xenproject.org>,
	Ian Jackson <Ian.Jackson@citrix.com>,
	Roger Pau Monne <roger.pau@citrix.com>
Subject: Re: [PATCH v3 8/9] viridian: add implementation of synthetic timers
Date: Wed, 6 Mar 2019 11:23:32 +0000	[thread overview]
Message-ID: <e4806e589a84416c989389e97b58dd0b@AMSPEX02CL02.citrite.net> (raw)
In-Reply-To: <5C7401940200007800219EEC@prv1-mh.provo.novell.com>

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 25 February 2019 14:54
> To: Paul Durrant <Paul.Durrant@citrix.com>
> Cc: Julien Grall <julien.grall@arm.com>; Andrew Cooper <Andrew.Cooper3@citrix.com>; Roger Pau Monne
> <roger.pau@citrix.com>; Wei Liu <wei.liu2@citrix.com>; George Dunlap <George.Dunlap@citrix.com>; Ian
> Jackson <Ian.Jackson@citrix.com>; Stefano Stabellini <sstabellini@kernel.org>; xen-devel <xen-
> devel@lists.xenproject.org>; Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>; Tim (Xen.org)
> <tim@xen.org>
> Subject: Re: [PATCH v3 8/9] viridian: add implementation of synthetic timers
> 
> >>> On 31.01.19 at 11:47, <paul.durrant@citrix.com> wrote:
> > --- a/xen/arch/x86/hvm/viridian/synic.c
> > +++ b/xen/arch/x86/hvm/viridian/synic.c
> > @@ -329,7 +329,53 @@ void viridian_synic_domain_deinit(struct domain *d)
> >
> >  void viridian_synic_poll_messages(struct vcpu *v)
> >  {
> > -    /* There are currently no message sources */
> > +    viridian_time_poll_timers(v);
> > +}
> > +
> > +bool viridian_synic_deliver_timer_msg(struct vcpu *v, unsigned int sintx,
> > +                                      unsigned int index,
> > +                                      int64_t expiration, int64_t delivery)
> > +{
> > +    const union viridian_sint_msr *vs = &v->arch.hvm.viridian->sint[sintx];
> > +    HV_MESSAGE *msg = v->arch.hvm.viridian->simp.ptr;
> > +    struct {
> > +        uint32_t TimerIndex;
> > +        uint32_t Reserved;
> > +        uint64_t ExpirationTime;
> > +        uint64_t DeliveryTime;
> > +    } payload = {
> > +        .TimerIndex = index,
> > +        .ExpirationTime = expiration,
> > +        .DeliveryTime = delivery,
> > +    };
> > +
> > +    if ( test_bit(sintx, &v->arch.hvm.viridian->msg_pending) )
> > +        return false;
> > +
> > +    BUILD_BUG_ON(sizeof(*msg) != HV_MESSAGE_SIZE);
> > +    msg += sintx;
> > +
> > +    /*
> > +     * To avoid using an atomic test-and-set this function must be called
> > +     * in context of the vcpu receiving the message.
> > +     */
> > +    ASSERT(v == current);
> > +    if ( msg->Header.MessageType != HvMessageTypeNone )
> > +    {
> > +        msg->Header.MessageFlags.MessagePending = 1;
> > +        set_bit(sintx, &v->arch.hvm.viridian->msg_pending);
> 
> As per the comment above this is always in context of the subject
> vCPU. It looks to me as if this was also the case for the two
> clear_bit() on the field in the prior patch. If so, all three could be
> the non-atomic variants instead.

The only slight subtlety I think is the one in the wrmsr function, which can be called in context of a domain restore. I think it's still ok for it to be non-atomic in this case but I'll assert (v = current || !v->running), which I think should cover it.

> 
> > +        return false;
> > +    }
> > +
> > +    msg->Header.MessageType = HvMessageTimerExpired;
> > +    msg->Header.MessageFlags.MessagePending = 0;
> > +    msg->Header.PayloadSize = sizeof(payload);
> > +    memcpy(msg->Payload, &payload, sizeof(payload));
> > +
> > +    if ( !vs->fields.mask )
> > +        vlapic_set_irq(vcpu_vlapic(v), vs->fields.vector, 0);
> 
> If this wasn't with v == current, I think you'd also need a barrier
> here. Could you extend the comment above to also mention this
> aspect?

Ok.

> 
> > @@ -118,14 +119,237 @@ static int64_t time_ref_count(struct domain *d)
> >      return raw_trc_val(d) + trc->off;
> >  }
> >
> > +static int64_t time_now(struct domain *d)
> 
> Why would this return a signed value? And can't the function
> parameter be const?

The function parameter can be const, but I think the result needs to be signed for the missed ticks logic in start_timer() to work correctly.

> 
> > +{
> > +    const struct viridian_page *rt = &d->arch.hvm.viridian->reference_tsc;
> > +    HV_REFERENCE_TSC_PAGE *p = rt->ptr;
> > +    uint32_t start, end;
> > +    __int128_t tsc;
> > +    __int128_t scale;
> 
> I don't think you need both of them be 128 bits wide. I also don't
> see why either would want to be of a signed type.

The spec says (as in the comment below):

"The partition reference time is computed by the following formula:

ReferenceTime = ((VirtualTsc * TscScale) >> 64) + TscOffset

The multiplication is a 64 bit multiplication, which results in a 128 bit number which is then shifted 64 times to the right to obtain the high 64 bits.TscScale"

Again, I'm using signed arithmetic as I think it's necessary for the missed ticks logic to work correctly in the event of an overflow.

> 
> > +    int64_t offset;
> > +
> > +    /*
> > +     * If the reference TSC page is not enabled, or has been invalidated
> > +     * fall back to the partition reference counter.
> > +     */
> > +    if ( !p || !p->TscSequence )
> > +        return time_ref_count(d);
> > +
> > +    /*
> > +     * The following sampling algorithm for tsc, scale and offset is
> > +     * documented in the specifiction.
> > +     */
> > +    start = p->TscSequence;
> > +
> > +    do {
> > +        tsc = rdtsc();
> > +        scale = p->TscScale;
> > +        offset = p->TscOffset;
> > +
> > +        smp_mb();
> > +        end = p->TscSequence;
> 
> Why is this a full barrier, rather than just a read one? And don't you need
> to add a counterpart in update_reference_tsc()?

Yes, a read barrier is enough with the counterpart write barrier added.

> 
> > +    } while (end != start);
> 
> update_reference_tsc() increments TscSequence. If end doesn't match
> start at this point, you're entering a near infinite loop here as long as
> you don't update start inside the loop. I also think that there's a
> second read barrier needed between this initial reading of the sequence
> number and the reading of the actual values.

Yes, the start value should be inside the loop of course.

> 
> > +    /*
> > +     * The specification says: "The partition reference time is computed
> > +     * by the following formula:
> > +     *
> > +     * ReferenceTime = ((VirtualTsc * TscScale) >> 64) + TscOffset
> > +     *
> > +     * The multiplication is a 64 bit multiplication, which results in a
> > +     * 128 bit number which is then shifted 64 times to the right to obtain
> > +     * the high 64 bits."
> > +     */
> > +    return ((tsc * scale) >> 64) + offset;
> > +}
> > +
> > +static void stop_stimer(struct viridian_stimer *vs)
> > +{
> > +    struct vcpu *v = vs->v;
> 
> const?

Ok.

> 
> > +    unsigned int stimerx = vs - &v->arch.hvm.viridian->stimer[0];
> > +
> > +    if ( !vs->started )
> > +        return;
> > +
> > +    stop_timer(&vs->timer);
> > +    clear_bit(stimerx, &v->arch.hvm.viridian->stimer_pending);
> > +    vs->started = false;
> > +}
> > +
> > +static void stimer_expire(void *data)
> > +{
> > +    struct viridian_stimer *vs = data;
> 
> const?

Ok.

> 
> > +    struct vcpu *v = vs->v;
> > +    unsigned int stimerx = vs - &v->arch.hvm.viridian->stimer[0];
> > +
> > +    if ( !vs->config.fields.enabled )
> > +        return;
> > +
> > +    set_bit(stimerx, &v->arch.hvm.viridian->stimer_pending);
> > +    vcpu_kick(v);
> > +}
> > +
> > +static void start_stimer(struct viridian_stimer *vs)
> > +{
> > +    struct vcpu *v = vs->v;
> > +    unsigned int stimerx = vs - &v->arch.hvm.viridian->stimer[0];
> > +    int64_t now = time_now(v->domain);
> > +    s_time_t timeout;
> > +
> > +    if ( !test_and_set_bit(stimerx, &v->arch.hvm.viridian->stimer_enabled) )
> > +        printk(XENLOG_G_INFO "%pv: VIRIDIAN STIMER%u: enabled\n", v,
> > +               stimerx);
> > +
> > +    if ( vs->config.fields.periodic )
> > +    {
> > +        unsigned int missed = 0;
> > +        int64_t next;
> > +
> > +        /*
> > +         * The specification says that if the timer is lazy then we
> > +         * skip over any missed expirations so we can treat this case
> > +         * as the same as if the timer is currently stopped, i.e. we
> > +         * just schedule expiration to be 'count' ticks from now.
> > +         */
> > +        if ( !vs->started || vs->config.fields.lazy )
> > +        {
> > +            next = now + vs->count;
> > +        }
> 
> Unnecessary braces.

Yes.

> 
> > @@ -149,6 +373,57 @@ int viridian_time_wrmsr(struct vcpu *v, uint32_t idx, uint64_t val)
> >          }
> >          break;
> >
> > +    case HV_X64_MSR_TIME_REF_COUNT:
> > +        return X86EMUL_EXCEPTION;
> 
> Isn't this an unrelated change?

It is. I'll call it out in the comment comment.

> 
> > +    case HV_X64_MSR_STIMER0_CONFIG:
> > +    case HV_X64_MSR_STIMER1_CONFIG:
> > +    case HV_X64_MSR_STIMER2_CONFIG:
> > +    case HV_X64_MSR_STIMER3_CONFIG:
> > +    {
> > +        unsigned int stimerx = (idx - HV_X64_MSR_STIMER0_CONFIG) / 2;
> > +        struct viridian_stimer *vs = &v->arch.hvm.viridian->stimer[stimerx];
> > +
> > +        if ( !(viridian_feature_mask(d) & HVMPV_stimer) )
> > +            return X86EMUL_EXCEPTION;
> > +
> > +        stop_stimer(vs);
> > +
> > +        vs->config.raw = val;
> > +
> > +        if ( !vs->config.fields.sintx )
> > +            vs->config.fields.enabled = 0;
> > +
> > +        if ( vs->config.fields.enabled )
> > +            start_stimer(vs);
> > +
> > +        break;
> > +    }
> > +    case HV_X64_MSR_STIMER0_COUNT:
> 
> Missing blank line again (and also further down here as well as in the
> rdmsr code).
> 

Ok. TBH I've always thought the normal style was to omit the blank line if the case statement has braces.

> > +    case HV_X64_MSR_STIMER1_COUNT:
> > +    case HV_X64_MSR_STIMER2_COUNT:
> > +    case HV_X64_MSR_STIMER3_COUNT:
> > +    {
> > +        unsigned int stimerx = (idx - HV_X64_MSR_STIMER0_COUNT) / 2;
> > +        struct viridian_stimer *vs = &v->arch.hvm.viridian->stimer[stimerx];
> > +
> > +        if ( !(viridian_feature_mask(d) & HVMPV_stimer) )
> > +            return X86EMUL_EXCEPTION;
> > +
> > +        stop_stimer(vs);
> > +
> > +        vs->count = val;
> > +
> > +        if ( !vs->count  )
> 
> Any reason you don't use val here (which the compiler likely will do
> anyway)?

Not particularly, I just think it reads better and is more consistent with other code.

> 
> > @@ -201,6 +476,32 @@ int viridian_time_rdmsr(const struct vcpu *v, uint32_t idx, uint64_t *val)
> >          break;
> >      }
> >
> > +    case HV_X64_MSR_STIMER0_CONFIG:
> > +    case HV_X64_MSR_STIMER1_CONFIG:
> > +    case HV_X64_MSR_STIMER2_CONFIG:
> > +    case HV_X64_MSR_STIMER3_CONFIG:
> > +    {
> > +        unsigned int stimerx = (idx - HV_X64_MSR_STIMER0_CONFIG) / 2;
> > +
> > +        if ( !(viridian_feature_mask(d) & HVMPV_stimer) )
> > +            return X86EMUL_EXCEPTION;
> > +
> > +        *val = v->arch.hvm.viridian->stimer[stimerx].config.raw;
> 
> While more noticeable here and ...
> 
> > +        break;
> > +    }
> > +    case HV_X64_MSR_STIMER0_COUNT:
> > +    case HV_X64_MSR_STIMER1_COUNT:
> > +    case HV_X64_MSR_STIMER2_COUNT:
> > +    case HV_X64_MSR_STIMER3_COUNT:
> > +    {
> > +        unsigned int stimerx = (idx - HV_X64_MSR_STIMER0_COUNT) / 2;
> > +
> > +        if ( !(viridian_feature_mask(d) & HVMPV_stimer) )
> > +            return X86EMUL_EXCEPTION;
> > +
> > +        *val = v->arch.hvm.viridian->stimer[stimerx].count;
> 
> ... here, array_access_nospec() are probably needed not just here,
> but also in the wrmsr logic.

Really? stimerx is calculated based on hitting the case statement in the first place.

> 
> > --- a/xen/include/asm-x86/hvm/viridian.h
> > +++ b/xen/include/asm-x86/hvm/viridian.h
> > @@ -40,6 +40,33 @@ union viridian_sint_msr
> >      } fields;
> >  };
> >
> > +union viridian_stimer_config_msr
> > +{
> > +    uint64_t raw;
> > +    struct
> > +    {
> > +        uint64_t enabled:1;
> > +        uint64_t periodic:1;
> > +        uint64_t lazy:1;
> > +        uint64_t auto_enable:1;
> > +        uint64_t vector:8;
> > +        uint64_t direct_mode:1;
> > +        uint64_t reserved_zero1:3;
> > +        uint64_t sintx:4;
> > +        uint64_t reserved_zero2:44;
> > +    } fields;
> > +};
> > +
> > +struct viridian_stimer {
> > +    struct vcpu *v;
> 
> Isn't a full 8-byte pointer a little too much overhead here? You could
> instead store the timer index ...

I think I need it in stimer_expire() which can be called in any vcpu context IIUC.

> 
> > +    struct timer timer;
> > +    union viridian_stimer_config_msr config;
> > +    uint64_t count;
> > +    int64_t expiration;
> > +    s_time_t timeout;
> > +    bool started;
> 
> ... in a field using the 7-byte padding here, and use container_of()
> to get at the outer structure.

That would get me as far as viridian_vcpu, but there's no pointer to struct vcpu in there, and I need one to call vcpu_kick().

  Paul

> 
> Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

  reply	other threads:[~2019-03-06 11:23 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-01-31 10:47 [PATCH v3 0/9] viridian: implement more enlightenments Paul Durrant
2019-01-31 10:47 ` [PATCH v3 1/9] viridian: add init hooks Paul Durrant
2019-01-31 10:47 ` [PATCH v3 2/9] viridian: separately allocate domain and vcpu structures Paul Durrant
2019-01-31 10:47 ` [PATCH v3 3/9] viridian: extend init/deinit hooks into synic and time modules Paul Durrant
2019-01-31 14:45   ` Wei Liu
2019-01-31 10:47 ` [PATCH v3 4/9] viridian: add missing context save helpers " Paul Durrant
2019-01-31 10:47 ` [PATCH v3 5/9] viridian: use viridian_map/unmap_guest_page() for reference tsc page Paul Durrant
2019-01-31 10:47 ` [PATCH v3 6/9] viridian: add implementation of synthetic interrupt MSRs Paul Durrant
2019-01-31 14:46   ` Wei Liu
2019-02-25 14:12   ` Jan Beulich
2019-03-04 17:01     ` Paul Durrant
2019-01-31 10:47 ` [PATCH v3 7/9] viridian: stop directly calling viridian_time_ref_count_freeze/thaw() Paul Durrant
2019-01-31 10:47 ` [PATCH v3 8/9] viridian: add implementation of synthetic timers Paul Durrant
2019-01-31 14:46   ` Wei Liu
2019-02-25 14:54   ` Jan Beulich
2019-03-06 11:23     ` Paul Durrant [this message]
2019-03-06 11:47       ` Paul Durrant
2019-03-06 12:59         ` Jan Beulich
2019-03-06 13:03           ` Paul Durrant
2019-03-06 12:56       ` Jan Beulich
2019-03-06 13:05         ` Paul Durrant
2019-03-06 13:09           ` Paul Durrant
2019-03-06 14:38             ` Jan Beulich
2019-03-06 14:41               ` Paul Durrant
2019-01-31 10:47 ` [PATCH v3 9/9] viridian: add implementation of the HvSendSyntheticClusterIpi hypercall Paul Durrant
2019-01-31 14:46   ` Wei Liu
2019-02-25 15:00   ` Jan Beulich

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=e4806e589a84416c989389e97b58dd0b@AMSPEX02CL02.citrite.net \
    --to=paul.durrant@citrix.com \
    --cc=Andrew.Cooper3@citrix.com \
    --cc=George.Dunlap@citrix.com \
    --cc=Ian.Jackson@citrix.com \
    --cc=JBeulich@suse.com \
    --cc=julien.grall@arm.com \
    --cc=konrad.wilk@oracle.com \
    --cc=roger.pau@citrix.com \
    --cc=sstabellini@kernel.org \
    --cc=tim@xen.org \
    --cc=wei.liu2@citrix.com \
    --cc=xen-devel@lists.xenproject.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.