linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Ryan Thibodeaux <thibodux@gmail.com>
To: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Dario Faggioli <dfaggioli@suse.com>,
	luca abeni <luca.abeni@santannapisa.it>,
	xen-devel@lists.xenproject.org, linux-kernel@vger.kernel.org,
	oleksandr_andrushchenko@epam.com, tglx@linutronix.de,
	jgross@suse.com, ryan.thibodeaux@starlab.io
Subject: Re: [PATCH] x86/xen: Add "xen_timer_slop" command line option
Date: Wed, 27 Mar 2019 10:59:58 -0400	[thread overview]
Message-ID: <20190327145958.GA12371@centos-dev.localdomain> (raw)
In-Reply-To: <04ee9e67-5720-72df-9e2e-2ba42febf90f@oracle.com>

On Wed, Mar 27, 2019 at 10:46:21AM -0400, Boris Ostrovsky wrote:
> On 3/27/19 6:00 AM, Ryan Thibodeaux wrote:
> > On Tue, Mar 26, 2019 at 07:21:31PM -0400, Boris Ostrovsky wrote:
> >> On 3/26/19 5:13 AM, Dario Faggioli wrote:
> >>> On Mon, 2019-03-25 at 09:43 -0400, Boris Ostrovsky wrote:
> >>>> On 3/25/19 8:05 AM, luca abeni wrote:
> >>>>> The picture shows the latencies measured with an unpatched guest
> >>>>> kernel
> >>>>> and with a guest kernel having TIMER_SLOP set to 1000 (arbitrary
> >>>>> small
> >>>>> value :).
> >>>>> All the experiments have been performed booting the hypervisor with
> >>>>> a
> >>>>> small timer_slop (the hypervisor's one) value. So, they show that
> >>>>> decreasing the hypervisor's timer_slop is not enough to measure low
> >>>>> latencies with cyclictest.
> >>>> I have a couple of questions:
> >>>> * Does it make sense to make this a tunable for other clockevent
> >>>> devices
> >>>> as well?
> >>>>
> >>> So, AFAIUI, the thing is as follows. In clockevents_program_event(), we
> >>> keep the delta between now and the next timer event within
> >>> dev->max_delta_ns and dev->min_delta_ns:
> >>>
> >>>   delta = min(delta, (int64_t) dev->max_delta_ns);
> >>>   delta = max(delta, (int64_t) dev->min_delta_ns);
> >>>
> >>> For Xen (well, for the Xen clock) we have:
> >>>
> >>>   .max_delta_ns = 0xffffffff,
> >>>   .min_delta_ns = TIMER_SLOP,
> >>>
> >>> which means a guest can't ask for a timer to fire earlier than 100us
> >>> ahead, which is a bit too coarse, especially on contemporary hardware.
> >>>
> >>> For "lapic_deadline" (which was what was in use in KVM guests, in our
> >>> experiments) we have:
> >>>
> >>>   lapic_clockevent.max_delta_ns = clockevent_delta2ns(0x7FFFFF, &lapic_clockevent);
> >>>   lapic_clockevent.min_delta_ns = clockevent_delta2ns(0xF, &lapic_clockevent);
> >>>
> >>> Which means max is 0x7FFFFF device ticks, and min is 0xF.
> >>> clockevent_delta2ns() does the conversion from ticks to ns, basing on
> >>> the results of the APIC calibration process. It calls cev_delta2ns()
> >>> which does some scaling, shifting, divs, etc, and, at the very end,
> >>> this:
> >>>
> >>>   /* Deltas less than 1usec are pointless noise */
> >>>   return clc > 1000 ? clc : 1000;
> >>>
> >>> So, as Ryan is also saying, the actual minimum, in this case, depends
> >>> on hardware, with a sanity check of "never below 1us" (which is quite
> >>> smaller than 100us!)
> >>>
> >>> Of course, the actual granularity depends on hardware in the Xen case
> >>> as well, but that is handled in Xen itself. And we have mechanisms in
> >>> place in there to avoid timer interrupt storms (like, ahem, the Xen's
> >>> 'timer_slop' boot parameter... :-P)
> >>>
> >>> And this is basically why I was also thinking we can/should lower the
> >>> default value of TIMER_SLOP, here in the Xen clock implementation in
> >>> Linux.
> >> What do you think would be a sane value? 10us? Should we then still keep
> >> this patch?
> >>
> >> My concern would be that if we change the current value and it turns out
> >> to be very wrong we'd then have no recourse.
> >>
> >>
> >> -boris
> >>
> > Speaking out of turn but as a participant in this thread, I would not
> > assume to change the default value for all cases without significant
> > testing by the community, touching a variety of configurations.
> >
> > It feels like changing the default has a non-trivial amount of 
> > unknowns that would need to be addressed.
> >
> > Not surprisingly, I am biased to the approach of my patch which
> > does not change the default but offers flexibility to all.
> 
> 
> If we are to change the default it would be good to at least collect
> some data on distribution of delta values in
> clockevents_program_event(). But as I said, I'd keep the patch.
> 
> Also, as far as the comment describing TIMER_SLOP, I agree that it is
> rather misleading.
> 
> I can replace it with /* Minimum amount of time until next clock event
> fires */, I  can do it while committing so no need to resend.
> 
> -boris

I like that. Thanks Boris!

- Ryan

  reply	other threads:[~2019-03-27 15:00 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-03-22 18:29 [PATCH] x86/xen: Add "xen_timer_slop" command line option thibodux
2019-03-22 22:10 ` Boris Ostrovsky
2019-03-23  2:58   ` Dario Faggioli
2019-03-23 10:41     ` luca abeni
2019-03-25 12:05       ` luca abeni
2019-03-25 13:43         ` Boris Ostrovsky
2019-03-25 14:07           ` luca abeni
2019-03-25 14:11           ` Ryan Thibodeaux
2019-03-25 17:36             ` Ryan Thibodeaux
2019-03-25 18:31             ` Boris Ostrovsky
2019-03-26  9:13           ` Dario Faggioli
2019-03-26 11:12             ` luca abeni
2019-03-26 11:41               ` Ryan Thibodeaux
2019-03-26 23:21             ` Boris Ostrovsky
2019-03-27 10:00               ` Ryan Thibodeaux
2019-03-27 14:46                 ` Boris Ostrovsky
2019-03-27 14:59                   ` Ryan Thibodeaux [this message]
2019-03-27 15:19                   ` Dario Faggioli
2019-03-23 12:00   ` Ryan Thibodeaux
2019-03-24 18:07     ` Boris Ostrovsky
2019-03-25 10:36       ` Dario Faggioli
2019-04-24 18:47 ` Boris Ostrovsky

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190327145958.GA12371@centos-dev.localdomain \
    --to=thibodux@gmail.com \
    --cc=boris.ostrovsky@oracle.com \
    --cc=dfaggioli@suse.com \
    --cc=jgross@suse.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=luca.abeni@santannapisa.it \
    --cc=oleksandr_andrushchenko@epam.com \
    --cc=ryan.thibodeaux@starlab.io \
    --cc=tglx@linutronix.de \
    --cc=xen-devel@lists.xenproject.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).