* Re: [PATCH] x86/xen: Add "xen_timer_slop" command line option
2019-03-25 13:43 ` Boris Ostrovsky
@ 2019-03-25 14:07 ` luca abeni
2019-03-25 14:07 ` luca abeni
` (4 subsequent siblings)
5 siblings, 0 replies; 45+ messages in thread
From: luca abeni @ 2019-03-25 14:07 UTC (permalink / raw)
To: Boris Ostrovsky
Cc: jgross, oleksandr_andrushchenko, linux-kernel, Dario Faggioli,
thibodux, ryan.thibodeaux, xen-devel, tglx
Hi,
On Mon, 25 Mar 2019 09:43:20 -0400
Boris Ostrovsky <boris.ostrovsky@oracle.com> wrote:
[...]
> > http://retis.santannapisa.it/luca/XenTimers/
> > (there also is a link to the scripts to be used for reproducing the
> > results). The latencies have been measured by running cyclictest in
> > the guest (see the scripts for details).
> >
> > The picture shows the latencies measured with an unpatched guest
> > kernel and with a guest kernel having TIMER_SLOP set to 1000
> > (arbitrary small value :).
> > All the experiments have been performed booting the hypervisor with
> > a small timer_slop (the hypervisor's one) value. So, they show that
> > decreasing the hypervisor's timer_slop is not enough to measure low
> > latencies with cyclictest.
>
>
>
> I have a couple of questions:
> * Does it make sense to make this a tunable for other clockevent
> devices as well?
> * This patch adjusts min value. Could max value (ever) need a similar
> adjustment?
Sorry, I do not know much about clockevent devices, so I have no answers
to these questions...
What I can say is that when I repeated the cyclictest experiments on
VMs using a different clockevent device (lapic), I did not measure large
latencies.
So, I guess the "lapic" clockevent device already defaults to a smaller
min value (I am not sure about other clockevent devices, as I do not
know how to test them).
Luca
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: [PATCH] x86/xen: Add "xen_timer_slop" command line option
2019-03-25 13:43 ` Boris Ostrovsky
2019-03-25 14:07 ` luca abeni
2019-03-25 14:07 ` luca abeni
@ 2019-03-25 14:11 ` Ryan Thibodeaux
2019-03-25 17:36 ` Ryan Thibodeaux
` (3 more replies)
2019-03-25 14:11 ` Ryan Thibodeaux
` (2 subsequent siblings)
5 siblings, 4 replies; 45+ messages in thread
From: Ryan Thibodeaux @ 2019-03-25 14:11 UTC (permalink / raw)
To: Boris Ostrovsky
Cc: luca abeni, Dario Faggioli, xen-devel, linux-kernel,
oleksandr_andrushchenko, tglx, jgross, ryan.thibodeaux
On Mon, Mar 25, 2019 at 09:43:20AM -0400, Boris Ostrovsky wrote:
> On 3/25/19 8:05 AM, luca abeni wrote:
> > Hi all,
> >
> > On Sat, 23 Mar 2019 11:41:51 +0100
> > luca abeni <luca.abeni@santannapisa.it> wrote:
> > [...]
> >>>> Is there any data that shows effects of using this new parameter?
> >>>>
> >>> Yes, I've done some research and experiments on this. I did it
> >>> together with a friend, which I'm Cc-ing, as I'm not sure we're
> >>> ready/capable to share the results, yet (Luca?).
> >> I think we can easily share the experimental data (cyclictest output
> >> and plots).
> >>
> >> Moreover, we can share the scripts and tools for running the
> >> experiments (so, everyone can easily reproduce the numbers by simply
> >> typing "make" and waiting for some time :)
> >>
> >>
> >> I'll try to package the results and the scripts/tools this evening,
> >> and I'll send them.
> > Sorry for the delay. I put some quick results here:
> > http://retis.santannapisa.it/luca/XenTimers/
> > (there also is a link to the scripts to be used for reproducing the
> > results). The latencies have been measured by running cyclictest in the
> > guest (see the scripts for details).
> >
> > The picture shows the latencies measured with an unpatched guest kernel
> > and with a guest kernel having TIMER_SLOP set to 1000 (arbitrary small
> > value :).
> > All the experiments have been performed booting the hypervisor with a
> > small timer_slop (the hypervisor's one) value. So, they show that
> > decreasing the hypervisor's timer_slop is not enough to measure low
> > latencies with cyclictest.
>
>
>
> I have a couple of questions:
> * Does it make sense to make this a tunable for other clockevent devices
> as well?
I gather that would be on a case-by-case basis for very specific
ones.
For many timers in the kernel, the minimums are determined by the
actual hardware backing the timer, and the minimum can be
adjusted by the clockevent code. This case is special since it
is entirely a software-based implementation in the kernel, where
the actual timer implementation is in the Xen hypervisor.
> * This patch adjusts min value. Could max value (ever) need a similar
> adjustment?
I cannot think of a case where that would be helpful, but I
cannot rule it out or speak as an authority.
- Ryan
* Re: [PATCH] x86/xen: Add "xen_timer_slop" command line option
2019-03-25 14:11 ` Ryan Thibodeaux
@ 2019-03-25 17:36 ` Ryan Thibodeaux
2019-03-25 17:36 ` Ryan Thibodeaux
` (2 subsequent siblings)
3 siblings, 0 replies; 45+ messages in thread
From: Ryan Thibodeaux @ 2019-03-25 17:36 UTC (permalink / raw)
To: Boris Ostrovsky
Cc: jgross, oleksandr_andrushchenko, luca abeni, linux-kernel,
Dario Faggioli, ryan.thibodeaux, xen-devel, tglx
On Mon, Mar 25, 2019 at 10:11:38AM -0400, Ryan Thibodeaux wrote:
> > > [...]
> > >>>> Is there any data that shows effects of using this new parameter?
> > >>>>
Continuing with the experimental data conversation (thanks to Luca and
Dario for being so generous), I am providing more results from quick
tests this morning.
I ran the same sequence of tests four times with the same hardware,
hypervisor, and Linux guest setup. The only changes between runs were
the slop settings in Xen and Linux. This was on a
build of Xen 4.10 and a Linux guest running the current Xen
tip.git kernel with my patch.
For each sequence, I ran two variations of cyclictest on an isolated
processor. The first test used an interval of 50 microseconds and the
second test used an interval of 1000 microseconds, passing the "-i50"
and "-i1000" arguments to cyclictest respectively.
The variations of the sequences are as follows:
#1 - default slops: Xen@50000, Linux@100000
#2 - lowering Linux: Xen@50000, Linux@5000
#3 - lowering Xen: Xen@5000, Linux@100000
#4 - lowering both: Xen@5000, Linux@5000
The cleaned-up test output is below, showing only the total
stats for each run and the number of spikes, i.e., samples that went
over 100 microseconds. I do not record each sample value like
Luca and Dario, because I want to eliminate as many variables as
possible, e.g., the overhead of writing out raw results.
Looking at the results, you can see that lowering only the Linux
slop (with my proposed patch) does reduce the overall latency stats for
the shorter interval, and it especially lowers the spikes, in both
cases. Even in test #3, where the Xen slop was lowered, the spikes
remain a problem at the default Linux slop.
Reiterating what Luca and Dario said, lowering both slops is the
way to get consistent results for both interval configurations.
Note: even better stats can likely be achieved with more tuning
and using the RT patchset. These results were just focusing on
a non-specialized configuration.
...
##############################
# Timer Slop: Xen (default, 50000) | Guest (default, 100000)
# Cyclictest Interval (-i50)
Min: 62
Avg: 127
Max: 212
Spikes (over 100): 3892034
# Cyclictest Interval (-i1000)
Min: 24
Avg: 45
Max: 156
Spikes (over 100): 27
##############################
# Timer Slop: Xen (default, 50000) | Guest (5000)
# Cyclictest Interval (-i50)
Min: 25
Avg: 78
Max: 230
Spikes (over 100): 274549
# Cyclictest Interval (-i1000)
Min: 37
Avg: 45
Max: 82
Spikes (over 100): 0
##############################
# Timer Slop: Xen (5000) | Guest (default, 100000)
# Cyclictest Interval (-i50)
Min: 61
Avg: 126
Max: 226
Spikes (over 100): 3877860
# Cyclictest Interval (-i1000)
Min: 37
Avg: 45
Max: 74
Spikes (over 100): 0
##############################
# Timer Slop: Xen (5000) | Guest (5000)
# Cyclictest Interval (-i50)
Min: 13
Avg: 30
Max: 150
Spikes (over 100): 120
# Cyclictest Interval (-i1000)
Min: 37
Avg: 45
Max: 97
Spikes (over 100): 0
...
- Ryan
* Re: [PATCH] x86/xen: Add "xen_timer_slop" command line option
2019-03-25 14:11 ` Ryan Thibodeaux
2019-03-25 17:36 ` Ryan Thibodeaux
2019-03-25 17:36 ` Ryan Thibodeaux
@ 2019-03-25 18:31 ` Boris Ostrovsky
2019-03-25 18:31 ` Boris Ostrovsky
3 siblings, 0 replies; 45+ messages in thread
From: Boris Ostrovsky @ 2019-03-25 18:31 UTC (permalink / raw)
To: Ryan Thibodeaux
Cc: jgross, oleksandr_andrushchenko, luca abeni, linux-kernel,
Dario Faggioli, ryan.thibodeaux, xen-devel, tglx
On 3/25/19 10:11 AM, Ryan Thibodeaux wrote:
> On Mon, Mar 25, 2019 at 09:43:20AM -0400, Boris Ostrovsky wrote:
>> On 3/25/19 8:05 AM, luca abeni wrote:
>>> Hi all,
>>>
>>> On Sat, 23 Mar 2019 11:41:51 +0100
>>> luca abeni <luca.abeni@santannapisa.it> wrote:
>>> [...]
>>>>>> Is there any data that shows effects of using this new parameter?
>>>>>>
>>>>> Yes, I've done some research and experiments on this. I did it
>>>>> together with a friend, which I'm Cc-ing, as I'm not sure we're
>>>>> ready/capable to share the results, yet (Luca?).
>>>> I think we can easily share the experimental data (cyclictest output
>>>> and plots).
>>>>
>>>> Moreover, we can share the scripts and tools for running the
>>>> experiments (so, everyone can easily reproduce the numbers by simply
>>>> typing "make" and waiting for some time :)
>>>>
>>>>
>>>> I'll try to package the results and the scripts/tools this evening,
>>>> and I'll send them.
>>> Sorry for the delay. I put some quick results here:
>>> http://retis.santannapisa.it/luca/XenTimers/
>>> (there also is a link to the scripts to be used for reproducing the
>>> results). The latencies have been measured by running cyclictest in the
>>> guest (see the scripts for details).
>>>
>>> The picture shows the latencies measured with an unpatched guest kernel
>>> and with a guest kernel having TIMER_SLOP set to 1000 (arbitrary small
>>> value :).
>>> All the experiments have been performed booting the hypervisor with a
>>> small timer_slop (the hypervisor's one) value. So, they show that
>>> decreasing the hypervisor's timer_slop is not enough to measure low
>>> latencies with cyclictest.
>>
>>
>> I have a couple of questions:
>> * Does it make sense to make this a tunable for other clockevent devices
>> as well?
> I gather that would be on a case-by-case basis for very specific
> ones.
>
> For many timers in the kernel, the minimums are determined by the
> actual hardware backing the timer, and the minimum can be
> adjusted by the clockevent code. This case is special since it
> is entirely a software-based implementation in the kernel, where
> the actual timer implementation is in the Xen hypervisor.
>
>> * This patch adjusts min value. Could max value (ever) need a similar
>> adjustment?
> I cannot think of such a case where that would be helpful, but I
> cannot rule that out or speak as an authority.
I am asking mostly because you are introducing a new interface, and I
don't want it to change in the future. I suppose if we later decide to
add control for the max value, we could just expand your current
proposal to xen_timer_slop=[min],[max] and keep it backward-compatible.
For the patch:
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
* Re: [PATCH] x86/xen: Add "xen_timer_slop" command line option
2019-03-25 13:43 ` Boris Ostrovsky
` (3 preceding siblings ...)
2019-03-25 14:11 ` Ryan Thibodeaux
@ 2019-03-26 9:13 ` Dario Faggioli
2019-03-26 9:13 ` Dario Faggioli
5 siblings, 0 replies; 45+ messages in thread
From: Dario Faggioli @ 2019-03-26 9:13 UTC (permalink / raw)
To: Boris Ostrovsky, luca abeni
Cc: jgross, oleksandr_andrushchenko, linux-kernel, thibodux,
ryan.thibodeaux, xen-devel, tglx
On Mon, 2019-03-25 at 09:43 -0400, Boris Ostrovsky wrote:
> On 3/25/19 8:05 AM, luca abeni wrote:
> >
> > The picture shows the latencies measured with an unpatched guest
> > kernel
> > and with a guest kernel having TIMER_SLOP set to 1000 (arbitrary
> > small
> > value :).
> > All the experiments have been performed booting the hypervisor with
> > a
> > small timer_slop (the hypervisor's one) value. So, they show that
> > decreasing the hypervisor's timer_slop is not enough to measure low
> > latencies with cyclictest.
>
> I have a couple of questions:
> * Does it make sense to make this a tunable for other clockevent
> devices
> as well?
>
So, AFAIUI, the thing is as follows. In clockevents_program_event(), we
keep the delta between now and the next timer event within
dev->max_delta_ns and dev->min_delta_ns:
delta = min(delta, (int64_t) dev->max_delta_ns);
delta = max(delta, (int64_t) dev->min_delta_ns);
For Xen (well, for the Xen clock) we have:
.max_delta_ns = 0xffffffff,
.min_delta_ns = TIMER_SLOP,
which means a guest can't ask for a timer to fire earlier than 100us
ahead, which is a bit too coarse, especially on contemporary hardware.
For "lapic_deadline" (which was what was in use in KVM guests, in our
experiments) we have:
lapic_clockevent.max_delta_ns = clockevent_delta2ns(0x7FFFFF, &lapic_clockevent);
lapic_clockevent.min_delta_ns = clockevent_delta2ns(0xF, &lapic_clockevent);
Which means max is 0x7FFFFF device ticks, and min is 0xF.
clockevent_delta2ns() does the conversion from ticks to ns, based on
the results of the APIC calibration process. It calls cev_delta2ns(),
which does some scaling, shifting, divisions, etc., and, at the very
end, this:
this:
/* Deltas less than 1usec are pointless noise */
return clc > 1000 ? clc : 1000;
So, as Ryan is also saying, the actual minimum in this case depends
on the hardware, with a sanity check of "never below 1us" (which is much
smaller than 100us!)
Of course, the actual granularity depends on hardware in the Xen case
as well, but that is handled in Xen itself. And we have mechanisms in
place in there to avoid timer interrupt storms (like, ahem, the Xen's
'timer_slop' boot parameter... :-P)
And this is basically why I was also thinking we can/should lower the
default value of TIMER_SLOP, here in the Xen clock implementation in
Linux.
> * This patch adjusts min value. Could max value (ever) need a similar
> adjustment?
>
Well, for Xen, it's already 0xffffffff. I don't see use cases where one
would want a smaller max. Wanting a higher max *might* be of some
interest, e.g., for power management: if the first timer event is 1min
ahead, you don't want to be woken up every (if my math is right) 4
secs.
But we'd have to see if that actually works, not to mention that 4 secs
is already large enough, IMHO, that it's unlikely we'll be really
sleeping for that much time without having to wake up for one reason or
another. :-)
Regards,
Dario
--
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Software Engineer @ SUSE https://www.suse.com/
* Re: [PATCH] x86/xen: Add "xen_timer_slop" command line option
2019-03-26 9:13 ` Dario Faggioli
@ 2019-03-26 11:12 ` luca abeni
2019-03-26 11:41 ` Ryan Thibodeaux
2019-03-26 11:41 ` Ryan Thibodeaux
2019-03-26 11:12 ` luca abeni
` (2 subsequent siblings)
3 siblings, 2 replies; 45+ messages in thread
From: luca abeni @ 2019-03-26 11:12 UTC (permalink / raw)
To: Dario Faggioli
Cc: Boris Ostrovsky, thibodux, xen-devel, linux-kernel,
oleksandr_andrushchenko, tglx, jgross, ryan.thibodeaux
Hi all,
On Tue, 26 Mar 2019 10:13:32 +0100
Dario Faggioli <dfaggioli@suse.com> wrote:
> On Mon, 2019-03-25 at 09:43 -0400, Boris Ostrovsky wrote:
> > On 3/25/19 8:05 AM, luca abeni wrote:
> > >
> > > The picture shows the latencies measured with an unpatched guest
> > > kernel
> > > and with a guest kernel having TIMER_SLOP set to 1000 (arbitrary
> > > small
> > > value :).
> > > All the experiments have been performed booting the hypervisor
> > > with a
> > > small timer_slop (the hypervisor's one) value. So, they show that
> > > decreasing the hypervisor's timer_slop is not enough to measure
> > > low latencies with cyclictest.
> >
> > I have a couple of questions:
> > * Does it make sense to make this a tunable for other clockevent
> > devices
> > as well?
> >
> So, AFAIUI, the thing is as follows. In clockevents_program_event(),
> we keep the delta between now and the next timer event within
> dev->max_delta_ns and dev->min_delta_ns:
>
> delta = min(delta, (int64_t) dev->max_delta_ns);
> delta = max(delta, (int64_t) dev->min_delta_ns);
>
> For Xen (well, for the Xen clock) we have:
>
> .max_delta_ns = 0xffffffff,
> .min_delta_ns = TIMER_SLOP,
>
> which means a guest can't ask for a timer to fire earlier than 100us
[...]
I know this is not directly related to the current discussion, but
these days I had another look at the code, and...
The comment for TIMER_SLOP in arch/x86/xen/time.c says:
/* Xen may fire a timer up to this many ns early */
Isn't the comment wrong? Shouldn't it be "...many ns late" instead of
"early"?
Thanks,
Luca
* Re: [PATCH] x86/xen: Add "xen_timer_slop" command line option
2019-03-26 11:12 ` luca abeni
@ 2019-03-26 11:41 ` Ryan Thibodeaux
2019-03-26 11:41 ` Ryan Thibodeaux
1 sibling, 0 replies; 45+ messages in thread
From: Ryan Thibodeaux @ 2019-03-26 11:41 UTC (permalink / raw)
To: luca abeni
Cc: jgross, oleksandr_andrushchenko, linux-kernel, Dario Faggioli,
ryan.thibodeaux, xen-devel, Boris Ostrovsky, tglx
On Tue, Mar 26, 2019 at 12:12:56PM +0100, luca abeni wrote:
> Hi all,
>
> On Tue, 26 Mar 2019 10:13:32 +0100
> Dario Faggioli <dfaggioli@suse.com> wrote:
>
> > On Mon, 2019-03-25 at 09:43 -0400, Boris Ostrovsky wrote:
> > > On 3/25/19 8:05 AM, luca abeni wrote:
> > > >
> > > > The picture shows the latencies measured with an unpatched guest
> > > > kernel
> > > > and with a guest kernel having TIMER_SLOP set to 1000 (arbitrary
> > > > small
> > > > value :).
> > > > All the experiments have been performed booting the hypervisor
> > > > with a
> > > > small timer_slop (the hypervisor's one) value. So, they show that
> > > > decreasing the hypervisor's timer_slop is not enough to measure
> > > > low latencies with cyclictest.
> > >
> > > I have a couple of questions:
> > > * Does it make sense to make this a tunable for other clockevent
> > > devices
> > > as well?
> > >
> > So, AFAIUI, the thing is as follows. In clockevents_program_event(),
> > we keep the delta between now and the next timer event within
> > dev->max_delta_ns and dev->min_delta_ns:
> >
> > delta = min(delta, (int64_t) dev->max_delta_ns);
> > delta = max(delta, (int64_t) dev->min_delta_ns);
> >
> > For Xen (well, for the Xen clock) we have:
> >
> > .max_delta_ns = 0xffffffff,
> > .min_delta_ns = TIMER_SLOP,
> >
> > which means a guest can't ask for a timer to fire earlier than 100us
> [...]
>
> I know this is not fully related with the current discussion, but in
> these days I had a look at the code again, and...
> The comment for TIMER_SLOP in arch/x86/xen/time.c says:
> /* Xen may fire a timer up to this many ns early */
>
> Isn't the comment wrong? shouldn't it be "...many ns late" instead of
> "early"?
I would say it is something else entirely.
If you look at "clockevents_program_event()" in kernel/time/clockevents.c,
you see that the min_delta_ns value sets the limit, or granularity, of the
clock's sleep time.
Basically, it is the minimum amount of sleep one can set for the next
event for the clock in question.
- Ryan
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
* Re: [PATCH] x86/xen: Add "xen_timer_slop" command line option
2019-03-26 11:12 ` luca abeni
2019-03-26 11:41 ` Ryan Thibodeaux
@ 2019-03-26 11:41 ` Ryan Thibodeaux
1 sibling, 0 replies; 45+ messages in thread
From: Ryan Thibodeaux @ 2019-03-26 11:41 UTC (permalink / raw)
To: luca abeni
Cc: Dario Faggioli, Boris Ostrovsky, xen-devel, linux-kernel,
oleksandr_andrushchenko, tglx, jgross, ryan.thibodeaux
On Tue, Mar 26, 2019 at 12:12:56PM +0100, luca abeni wrote:
> Hi all,
>
> On Tue, 26 Mar 2019 10:13:32 +0100
> Dario Faggioli <dfaggioli@suse.com> wrote:
>
> > On Mon, 2019-03-25 at 09:43 -0400, Boris Ostrovsky wrote:
> > > On 3/25/19 8:05 AM, luca abeni wrote:
> > > >
> > > > The picture shows the latencies measured with an unpatched guest
> > > > kernel
> > > > and with a guest kernel having TIMER_SLOP set to 1000 (arbitrary
> > > > small
> > > > value :).
> > > > All the experiments have been performed booting the hypervisor
> > > > with a
> > > > small timer_slop (the hypervisor's one) value. So, they show that
> > > > decreasing the hypervisor's timer_slop is not enough to measure
> > > > low latencies with cyclictest.
> > >
> > > I have a couple of questions:
> > > * Does it make sense to make this a tunable for other clockevent
> > > devices
> > > as well?
> > >
> > So, AFAIUI, the thing is as follows. In clockevents_program_event(),
> > we keep the delta between now and the next timer event within
> > dev->max_delta_ns and dev->min_delta_ns:
> >
> > delta = min(delta, (int64_t) dev->max_delta_ns);
> > delta = max(delta, (int64_t) dev->min_delta_ns);
> >
> > For Xen (well, for the Xen clock) we have:
> >
> > .max_delta_ns = 0xffffffff,
> > .min_delta_ns = TIMER_SLOP,
> >
> > which means a guest can't ask for a timer to fire earlier than 100us
> [...]
>
> I know this is not fully related with the current discussion, but in
> these days I had a look at the code again, and...
> The comment for TIMER_SLOP in arch/x86/xen/time.c says:
> /* Xen may fire a timer up to this many ns early */
>
> Isn't the comment wrong? shouldn't it be "...many ns late" instead of
> "early"?
I would say it is something else entirely.
If you look at "clockevents_program_event()" in kernel/time/clockevents.c,
you see that the min_delta_ns value sets the limit, or granularity, of the
clock's sleep time.
Basically, it is the minimum amount of sleep one can set for the next
event for the clock in question.
- Ryan
* Re: [PATCH] x86/xen: Add "xen_timer_slop" command line option
2019-03-26 9:13 ` Dario Faggioli
2019-03-26 11:12 ` luca abeni
@ 2019-03-26 11:12 ` luca abeni
2019-03-26 23:21 ` Boris Ostrovsky
2019-03-26 23:21 ` Boris Ostrovsky
3 siblings, 0 replies; 45+ messages in thread
From: luca abeni @ 2019-03-26 11:12 UTC (permalink / raw)
To: Dario Faggioli
Cc: jgross, oleksandr_andrushchenko, linux-kernel, thibodux,
ryan.thibodeaux, xen-devel, Boris Ostrovsky, tglx
Hi all,
On Tue, 26 Mar 2019 10:13:32 +0100
Dario Faggioli <dfaggioli@suse.com> wrote:
> On Mon, 2019-03-25 at 09:43 -0400, Boris Ostrovsky wrote:
> > On 3/25/19 8:05 AM, luca abeni wrote:
> > >
> > > The picture shows the latencies measured with an unpatched guest
> > > kernel
> > > and with a guest kernel having TIMER_SLOP set to 1000 (arbitrary
> > > small
> > > value :).
> > > All the experiments have been performed booting the hypervisor
> > > with a
> > > small timer_slop (the hypervisor's one) value. So, they show that
> > > decreasing the hypervisor's timer_slop is not enough to measure
> > > low latencies with cyclictest.
> >
> > I have a couple of questions:
> > * Does it make sense to make this a tunable for other clockevent
> > devices
> > as well?
> >
> So, AFAIUI, the thing is as follows. In clockevents_program_event(),
> we keep the delta between now and the next timer event within
> dev->max_delta_ns and dev->min_delta_ns:
>
> delta = min(delta, (int64_t) dev->max_delta_ns);
> delta = max(delta, (int64_t) dev->min_delta_ns);
>
> For Xen (well, for the Xen clock) we have:
>
> .max_delta_ns = 0xffffffff,
> .min_delta_ns = TIMER_SLOP,
>
> which means a guest can't ask for a timer to fire earlier than 100us
[...]
I know this is not directly related to the current discussion, but
these days I had another look at the code, and...
The comment for TIMER_SLOP in arch/x86/xen/time.c says:
/* Xen may fire a timer up to this many ns early */
Isn't the comment wrong? Shouldn't it be "...many ns late" instead of
"early"?
Thanks,
Luca
* Re: [PATCH] x86/xen: Add "xen_timer_slop" command line option
2019-03-26 9:13 ` Dario Faggioli
2019-03-26 11:12 ` luca abeni
2019-03-26 11:12 ` luca abeni
@ 2019-03-26 23:21 ` Boris Ostrovsky
2019-03-27 10:00 ` Ryan Thibodeaux
2019-03-27 10:00 ` Ryan Thibodeaux
2019-03-26 23:21 ` Boris Ostrovsky
3 siblings, 2 replies; 45+ messages in thread
From: Boris Ostrovsky @ 2019-03-26 23:21 UTC (permalink / raw)
To: Dario Faggioli, luca abeni
Cc: thibodux, xen-devel, linux-kernel, oleksandr_andrushchenko, tglx,
jgross, ryan.thibodeaux
On 3/26/19 5:13 AM, Dario Faggioli wrote:
> On Mon, 2019-03-25 at 09:43 -0400, Boris Ostrovsky wrote:
>> On 3/25/19 8:05 AM, luca abeni wrote:
>>> The picture shows the latencies measured with an unpatched guest
>>> kernel
>>> and with a guest kernel having TIMER_SLOP set to 1000 (arbitrary
>>> small
>>> value :).
>>> All the experiments have been performed booting the hypervisor with
>>> a
>>> small timer_slop (the hypervisor's one) value. So, they show that
>>> decreasing the hypervisor's timer_slop is not enough to measure low
>>> latencies with cyclictest.
>> I have a couple of questions:
>> * Does it make sense to make this a tunable for other clockevent
>> devices
>> as well?
>>
> So, AFAIUI, the thing is as follows. In clockevents_program_event(), we
> keep the delta between now and the next timer event within
> dev->max_delta_ns and dev->min_delta_ns:
>
> delta = min(delta, (int64_t) dev->max_delta_ns);
> delta = max(delta, (int64_t) dev->min_delta_ns);
>
> For Xen (well, for the Xen clock) we have:
>
> .max_delta_ns = 0xffffffff,
> .min_delta_ns = TIMER_SLOP,
>
> which means a guest can't ask for a timer to fire earlier than 100us
> ahead, which is a bit too coarse, especially on contemporary hardware.
>
> For "lapic_deadline" (which was what was in use in KVM guests, in our
> experiments) we have:
>
> lapic_clockevent.max_delta_ns = clockevent_delta2ns(0x7FFFFF, &lapic_clockevent);
> lapic_clockevent.min_delta_ns = clockevent_delta2ns(0xF, &lapic_clockevent);
>
> Which means max is 0x7FFFFF device ticks, and min is 0xF.
> clockevent_delta2ns() does the conversion from ticks to ns, based on
> the results of the APIC calibration process. It calls cev_delta2ns()
> which does some scaling, shifting, divs, etc, and, at the very end,
> this:
>
> /* Deltas less than 1usec are pointless noise */
> return clc > 1000 ? clc : 1000;
>
> So, as Ryan is also saying, the actual minimum, in this case, depends
> on hardware, with a sanity check of "never below 1us" (which is quite
> smaller than 100us!)
>
> Of course, the actual granularity depends on hardware in the Xen case
> as well, but that is handled in Xen itself. And we have mechanisms in
> place in there to avoid timer interrupt storms (like, ahem, the Xen's
> 'timer_slop' boot parameter... :-P)
>
> And this is basically why I was also thinking we can/should lower the
> default value of TIMER_SLOP, here in the Xen clock implementation in
> Linux.
What do you think would be a sane value? 10us? Should we then still keep
this patch?
My concern would be that if we change the current value and it turns out
to be very wrong, we'd then have no recourse.
-boris
* Re: [PATCH] x86/xen: Add "xen_timer_slop" command line option
2019-03-26 23:21 ` Boris Ostrovsky
@ 2019-03-27 10:00 ` Ryan Thibodeaux
2019-03-27 10:00 ` Ryan Thibodeaux
1 sibling, 0 replies; 45+ messages in thread
From: Ryan Thibodeaux @ 2019-03-27 10:00 UTC (permalink / raw)
To: Boris Ostrovsky
Cc: jgross, oleksandr_andrushchenko, luca abeni, linux-kernel,
Dario Faggioli, ryan.thibodeaux, xen-devel, tglx
On Tue, Mar 26, 2019 at 07:21:31PM -0400, Boris Ostrovsky wrote:
> On 3/26/19 5:13 AM, Dario Faggioli wrote:
> > On Mon, 2019-03-25 at 09:43 -0400, Boris Ostrovsky wrote:
> >> On 3/25/19 8:05 AM, luca abeni wrote:
> >>> The picture shows the latencies measured with an unpatched guest
> >>> kernel
> >>> and with a guest kernel having TIMER_SLOP set to 1000 (arbitrary
> >>> small
> >>> value :).
> >>> All the experiments have been performed booting the hypervisor with
> >>> a
> >>> small timer_slop (the hypervisor's one) value. So, they show that
> >>> decreasing the hypervisor's timer_slop is not enough to measure low
> >>> latencies with cyclictest.
> >> I have a couple of questions:
> >> * Does it make sense to make this a tunable for other clockevent
> >> devices
> >> as well?
> >>
> > So, AFAIUI, the thing is as follows. In clockevents_program_event(), we
> > keep the delta between now and the next timer event within
> > dev->max_delta_ns and dev->min_delta_ns:
> >
> > delta = min(delta, (int64_t) dev->max_delta_ns);
> > delta = max(delta, (int64_t) dev->min_delta_ns);
> >
> > For Xen (well, for the Xen clock) we have:
> >
> > .max_delta_ns = 0xffffffff,
> > .min_delta_ns = TIMER_SLOP,
> >
> > which means a guest can't ask for a timer to fire earlier than 100us
> > ahead, which is a bit too coarse, especially on contemporary hardware.
> >
> > For "lapic_deadline" (which was what was in use in KVM guests, in our
> > experiments) we have:
> >
> > lapic_clockevent.max_delta_ns = clockevent_delta2ns(0x7FFFFF, &lapic_clockevent);
> > lapic_clockevent.min_delta_ns = clockevent_delta2ns(0xF, &lapic_clockevent);
> >
> > Which means max is 0x7FFFFF device ticks, and min is 0xF.
> > clockevent_delta2ns() does the conversion from ticks to ns, based on
> > the results of the APIC calibration process. It calls cev_delta2ns()
> > which does some scaling, shifting, divs, etc, and, at the very end,
> > this:
> >
> > /* Deltas less than 1usec are pointless noise */
> > return clc > 1000 ? clc : 1000;
> >
> > So, as Ryan is also saying, the actual minimum, in this case, depends
> > on hardware, with a sanity check of "never below 1us" (which is quite
> > smaller than 100us!)
> >
> > Of course, the actual granularity depends on hardware in the Xen case
> > as well, but that is handled in Xen itself. And we have mechanisms in
> > place in there to avoid timer interrupt storms (like, ahem, the Xen's
> > 'timer_slop' boot parameter... :-P)
> >
> > And this is basically why I was also thinking we can/should lower the
> > default value of TIMER_SLOP, here in the Xen clock implementation in
> > Linux.
>
> What do you think would be a sane value? 10us? Should we then still keep
> this patch?
>
> My concern would be that if we change the current value and it turns out
> to be very wrong we'd then have no recourse.
>
>
> -boris
>
Speaking out of turn, but as a participant in this thread: I would not
presume to change the default value for all cases without significant
testing by the community, touching a variety of configurations.
It feels like changing the default has a non-trivial number of
unknowns that would need to be addressed.
Not surprisingly, I am biased toward the approach of my patch, which
does not change the default but offers flexibility to all.
- Ryan
* Re: [PATCH] x86/xen: Add "xen_timer_slop" command line option
2019-03-26 23:21 ` Boris Ostrovsky
2019-03-27 10:00 ` Ryan Thibodeaux
@ 2019-03-27 10:00 ` Ryan Thibodeaux
2019-03-27 14:46 ` Boris Ostrovsky
2019-03-27 14:46 ` Boris Ostrovsky
1 sibling, 2 replies; 45+ messages in thread
From: Ryan Thibodeaux @ 2019-03-27 10:00 UTC (permalink / raw)
To: Boris Ostrovsky
Cc: Dario Faggioli, luca abeni, xen-devel, linux-kernel,
oleksandr_andrushchenko, tglx, jgross, ryan.thibodeaux
On Tue, Mar 26, 2019 at 07:21:31PM -0400, Boris Ostrovsky wrote:
> On 3/26/19 5:13 AM, Dario Faggioli wrote:
> > On Mon, 2019-03-25 at 09:43 -0400, Boris Ostrovsky wrote:
> >> On 3/25/19 8:05 AM, luca abeni wrote:
> >>> The picture shows the latencies measured with an unpatched guest
> >>> kernel
> >>> and with a guest kernel having TIMER_SLOP set to 1000 (arbitrary
> >>> small
> >>> value :).
> >>> All the experiments have been performed booting the hypervisor with
> >>> a
> >>> small timer_slop (the hypervisor's one) value. So, they show that
> >>> decreasing the hypervisor's timer_slop is not enough to measure low
> >>> latencies with cyclictest.
> >> I have a couple of questions:
> >> * Does it make sense to make this a tunable for other clockevent
> >> devices
> >> as well?
> >>
> > So, AFAIUI, the thing is as follows. In clockevents_program_event(), we
> > keep the delta between now and the next timer event within
> > dev->max_delta_ns and dev->min_delta_ns:
> >
> > delta = min(delta, (int64_t) dev->max_delta_ns);
> > delta = max(delta, (int64_t) dev->min_delta_ns);
> >
> > For Xen (well, for the Xen clock) we have:
> >
> > .max_delta_ns = 0xffffffff,
> > .min_delta_ns = TIMER_SLOP,
> >
> > which means a guest can't ask for a timer to fire earlier than 100us
> > ahead, which is a bit too coarse, especially on contemporary hardware.
> >
> > For "lapic_deadline" (which was what was in use in KVM guests, in our
> > experiments) we have:
> >
> > lapic_clockevent.max_delta_ns = clockevent_delta2ns(0x7FFFFF, &lapic_clockevent);
> > lapic_clockevent.min_delta_ns = clockevent_delta2ns(0xF, &lapic_clockevent);
> >
> > Which means max is 0x7FFFFF device ticks, and min is 0xF.
> > clockevent_delta2ns() does the conversion from ticks to ns, based on
> > the results of the APIC calibration process. It calls cev_delta2ns()
> > which does some scaling, shifting, divs, etc, and, at the very end,
> > this:
> >
> > /* Deltas less than 1usec are pointless noise */
> > return clc > 1000 ? clc : 1000;
> >
> > So, as Ryan is also saying, the actual minimum, in this case, depends
> > on hardware, with a sanity check of "never below 1us" (which is quite
> > smaller than 100us!)
> >
> > Of course, the actual granularity depends on hardware in the Xen case
> > as well, but that is handled in Xen itself. And we have mechanisms in
> > place in there to avoid timer interrupt storms (like, ahem, the Xen's
> > 'timer_slop' boot parameter... :-P)
> >
> > And this is basically why I was also thinking we can/should lower the
> > default value of TIMER_SLOP, here in the Xen clock implementation in
> > Linux.
>
> What do you think would be a sane value? 10us? Should we then still keep
> this patch?
>
> My concern would be that if we change the current value and it turns out
> to be very wrong we'd then have no recourse.
>
>
> -boris
>
Speaking out of turn, but as a participant in this thread: I would not
presume to change the default value for all cases without significant
testing by the community, touching a variety of configurations.
It feels like changing the default has a non-trivial number of
unknowns that would need to be addressed.
Not surprisingly, I am biased toward the approach of my patch, which
does not change the default but offers flexibility to all.
- Ryan
* Re: [PATCH] x86/xen: Add "xen_timer_slop" command line option
2019-03-27 10:00 ` Ryan Thibodeaux
@ 2019-03-27 14:46 ` Boris Ostrovsky
2019-03-27 14:59 ` Ryan Thibodeaux
` (3 more replies)
2019-03-27 14:46 ` Boris Ostrovsky
1 sibling, 4 replies; 45+ messages in thread
From: Boris Ostrovsky @ 2019-03-27 14:46 UTC (permalink / raw)
To: Ryan Thibodeaux
Cc: Dario Faggioli, luca abeni, xen-devel, linux-kernel,
oleksandr_andrushchenko, tglx, jgross, ryan.thibodeaux
On 3/27/19 6:00 AM, Ryan Thibodeaux wrote:
> On Tue, Mar 26, 2019 at 07:21:31PM -0400, Boris Ostrovsky wrote:
>> On 3/26/19 5:13 AM, Dario Faggioli wrote:
>>> On Mon, 2019-03-25 at 09:43 -0400, Boris Ostrovsky wrote:
>>>> On 3/25/19 8:05 AM, luca abeni wrote:
>>>>> The picture shows the latencies measured with an unpatched guest
>>>>> kernel
>>>>> and with a guest kernel having TIMER_SLOP set to 1000 (arbitrary
>>>>> small
>>>>> value :).
>>>>> All the experiments have been performed booting the hypervisor with
>>>>> a
>>>>> small timer_slop (the hypervisor's one) value. So, they show that
>>>>> decreasing the hypervisor's timer_slop is not enough to measure low
>>>>> latencies with cyclictest.
>>>> I have a couple of questions:
>>>> * Does it make sense to make this a tunable for other clockevent
>>>> devices
>>>> as well?
>>>>
>>> So, AFAIUI, the thing is as follows. In clockevents_program_event(), we
>>> keep the delta between now and the next timer event within
>>> dev->max_delta_ns and dev->min_delta_ns:
>>>
>>> delta = min(delta, (int64_t) dev->max_delta_ns);
>>> delta = max(delta, (int64_t) dev->min_delta_ns);
>>>
>>> For Xen (well, for the Xen clock) we have:
>>>
>>> .max_delta_ns = 0xffffffff,
>>> .min_delta_ns = TIMER_SLOP,
>>>
>>> which means a guest can't ask for a timer to fire earlier than 100us
>>> ahead, which is a bit too coarse, especially on contemporary hardware.
>>>
>>> For "lapic_deadline" (which was what was in use in KVM guests, in our
>>> experiments) we have:
>>>
>>> lapic_clockevent.max_delta_ns = clockevent_delta2ns(0x7FFFFF, &lapic_clockevent);
>>> lapic_clockevent.min_delta_ns = clockevent_delta2ns(0xF, &lapic_clockevent);
>>>
>>> Which means max is 0x7FFFFF device ticks, and min is 0xF.
>>> clockevent_delta2ns() does the conversion from ticks to ns, based on
>>> the results of the APIC calibration process. It calls cev_delta2ns()
>>> which does some scaling, shifting, divs, etc, and, at the very end,
>>> this:
>>>
>>> /* Deltas less than 1usec are pointless noise */
>>> return clc > 1000 ? clc : 1000;
>>>
>>> So, as Ryan is also saying, the actual minimum, in this case, depends
>>> on hardware, with a sanity check of "never below 1us" (which is quite
>>> smaller than 100us!)
>>>
>>> Of course, the actual granularity depends on hardware in the Xen case
>>> as well, but that is handled in Xen itself. And we have mechanisms in
>>> place in there to avoid timer interrupt storms (like, ahem, the Xen's
>>> 'timer_slop' boot parameter... :-P)
>>>
>>> And this is basically why I was also thinking we can/should lower the
>>> default value of TIMER_SLOP, here in the Xen clock implementation in
>>> Linux.
>> What do you think would be a sane value? 10us? Should we then still keep
>> this patch?
>>
>> My concern would be that if we change the current value and it turns out
>> to be very wrong we'd then have no recourse.
>>
>>
>> -boris
>>
> Speaking out of turn but as a participant in this thread, I would not
> assume to change the default value for all cases without significant
> testing by the community, touching a variety of configurations.
>
> It feels like changing the default has a non-trivial amount of
> unknowns that would need to be addressed.
>
> Not surprisingly, I am biased to the approach of my patch which
> does not change the default but offers flexibility to all.
If we are to change the default, it would be good to at least collect
some data on the distribution of delta values in
clockevents_program_event(). But as I said, I'd keep the patch.
Also, as far as the comment describing TIMER_SLOP goes, I agree that it
is rather misleading.
I can replace it with /* Minimum amount of time until next clock event
fires */; I can do that while committing, so no need to resend.
-boris
* Re: [PATCH] x86/xen: Add "xen_timer_slop" command line option
2019-03-27 14:46 ` Boris Ostrovsky
@ 2019-03-27 14:59 ` Ryan Thibodeaux
2019-03-27 14:59 ` Ryan Thibodeaux
` (2 subsequent siblings)
3 siblings, 0 replies; 45+ messages in thread
From: Ryan Thibodeaux @ 2019-03-27 14:59 UTC (permalink / raw)
To: Boris Ostrovsky
Cc: jgross, oleksandr_andrushchenko, luca abeni, linux-kernel,
Dario Faggioli, ryan.thibodeaux, xen-devel, tglx
On Wed, Mar 27, 2019 at 10:46:21AM -0400, Boris Ostrovsky wrote:
> On 3/27/19 6:00 AM, Ryan Thibodeaux wrote:
> > On Tue, Mar 26, 2019 at 07:21:31PM -0400, Boris Ostrovsky wrote:
> >> On 3/26/19 5:13 AM, Dario Faggioli wrote:
> >>> On Mon, 2019-03-25 at 09:43 -0400, Boris Ostrovsky wrote:
> >>>> On 3/25/19 8:05 AM, luca abeni wrote:
> >>>>> The picture shows the latencies measured with an unpatched guest
> >>>>> kernel
> >>>>> and with a guest kernel having TIMER_SLOP set to 1000 (arbitrary
> >>>>> small
> >>>>> value :).
> >>>>> All the experiments have been performed booting the hypervisor with
> >>>>> a
> >>>>> small timer_slop (the hypervisor's one) value. So, they show that
> >>>>> decreasing the hypervisor's timer_slop is not enough to measure low
> >>>>> latencies with cyclictest.
> >>>> I have a couple of questions:
> >>>> * Does it make sense to make this a tunable for other clockevent
> >>>> devices
> >>>> as well?
> >>>>
> >>> So, AFAIUI, the thing is as follows. In clockevents_program_event(), we
> >>> keep the delta between now and the next timer event within
> >>> dev->max_delta_ns and dev->min_delta_ns:
> >>>
> >>> delta = min(delta, (int64_t) dev->max_delta_ns);
> >>> delta = max(delta, (int64_t) dev->min_delta_ns);
> >>>
> >>> For Xen (well, for the Xen clock) we have:
> >>>
> >>> .max_delta_ns = 0xffffffff,
> >>> .min_delta_ns = TIMER_SLOP,
> >>>
> >>> which means a guest can't ask for a timer to fire earlier than 100us
> >>> ahead, which is a bit too coarse, especially on contemporary hardware.
> >>>
> >>> For "lapic_deadline" (which was what was in use in KVM guests, in our
> >>> experiments) we have:
> >>>
> >>> lapic_clockevent.max_delta_ns = clockevent_delta2ns(0x7FFFFF, &lapic_clockevent);
> >>> lapic_clockevent.min_delta_ns = clockevent_delta2ns(0xF, &lapic_clockevent);
> >>>
> >>> Which means max is 0x7FFFFF device ticks, and min is 0xF.
> >>> clockevent_delta2ns() does the conversion from ticks to ns, based on
> >>> the results of the APIC calibration process. It calls cev_delta2ns()
> >>> which does some scaling, shifting, divs, etc, and, at the very end,
> >>> this:
> >>>
> >>> /* Deltas less than 1usec are pointless noise */
> >>> return clc > 1000 ? clc : 1000;
> >>>
> >>> So, as Ryan is also saying, the actual minimum, in this case, depends
> >>> on hardware, with a sanity check of "never below 1us" (which is quite
> >>> smaller than 100us!)
> >>>
> >>> Of course, the actual granularity depends on hardware in the Xen case
> >>> as well, but that is handled in Xen itself. And we have mechanisms in
> >>> place in there to avoid timer interrupt storms (like, ahem, the Xen's
> >>> 'timer_slop' boot parameter... :-P)
> >>>
> >>> And this is basically why I was also thinking we can/should lower the
> >>> default value of TIMER_SLOP, here in the Xen clock implementation in
> >>> Linux.
> >> What do you think would be a sane value? 10us? Should we then still keep
> >> this patch?
> >>
> >> My concern would be that if we change the current value and it turns out
> >> to be very wrong we'd then have no recourse.
> >>
> >>
> >> -boris
> >>
> > Speaking out of turn but as a participant in this thread, I would not
> > assume to change the default value for all cases without significant
> > testing by the community, touching a variety of configurations.
> >
> > It feels like changing the default has a non-trivial amount of
> > unknowns that would need to be addressed.
> >
> > Not surprisingly, I am biased to the approach of my patch which
> > does not change the default but offers flexibility to all.
>
>
> If we are to change the default it would be good to at least collect
> some data on distribution of delta values in
> clockevents_program_event(). But as I said, I'd keep the patch.
>
> Also, as far as the comment describing TIMER_SLOP, I agree that it is
> rather misleading.
>
> I can replace it with /* Minimum amount of time until next clock event
> fires */, I can do it while committing so no need to resend.
>
> -boris
I like that. Thanks Boris!
- Ryan
* Re: [PATCH] x86/xen: Add "xen_timer_slop" command line option
2019-03-27 14:46 ` Boris Ostrovsky
2019-03-27 14:59 ` Ryan Thibodeaux
@ 2019-03-27 14:59 ` Ryan Thibodeaux
2019-03-27 15:19 ` Dario Faggioli
2019-03-27 15:19 ` Dario Faggioli
3 siblings, 0 replies; 45+ messages in thread
From: Ryan Thibodeaux @ 2019-03-27 14:59 UTC (permalink / raw)
To: Boris Ostrovsky
Cc: Dario Faggioli, luca abeni, xen-devel, linux-kernel,
oleksandr_andrushchenko, tglx, jgross, ryan.thibodeaux
On Wed, Mar 27, 2019 at 10:46:21AM -0400, Boris Ostrovsky wrote:
> On 3/27/19 6:00 AM, Ryan Thibodeaux wrote:
> > On Tue, Mar 26, 2019 at 07:21:31PM -0400, Boris Ostrovsky wrote:
> >> On 3/26/19 5:13 AM, Dario Faggioli wrote:
> >>> On Mon, 2019-03-25 at 09:43 -0400, Boris Ostrovsky wrote:
> >>>> On 3/25/19 8:05 AM, luca abeni wrote:
> >>>>> The picture shows the latencies measured with an unpatched guest
> >>>>> kernel
> >>>>> and with a guest kernel having TIMER_SLOP set to 1000 (arbitrary
> >>>>> small
> >>>>> value :).
> >>>>> All the experiments have been performed booting the hypervisor with
> >>>>> a
> >>>>> small timer_slop (the hypervisor's one) value. So, they show that
> >>>>> decreasing the hypervisor's timer_slop is not enough to measure low
> >>>>> latencies with cyclictest.
> >>>> I have a couple of questions:
> >>>> * Does it make sense to make this a tunable for other clockevent
> >>>> devices
> >>>> as well?
> >>>>
> >>> So, AFAIUI, the thing is as follows. In clockevents_program_event(), we
> >>> keep the delta between now and the next timer event within
> >>> dev->max_delta_ns and dev->min_delta_ns:
> >>>
> >>> delta = min(delta, (int64_t) dev->max_delta_ns);
> >>> delta = max(delta, (int64_t) dev->min_delta_ns);
> >>>
> >>> For Xen (well, for the Xen clock) we have:
> >>>
> >>> .max_delta_ns = 0xffffffff,
> >>> .min_delta_ns = TIMER_SLOP,
> >>>
> >>> which means a guest can't ask for a timer to fire earlier than 100us
> >>> ahead, which is a bit too coarse, especially on contemporary hardware.
> >>>
> >>> For "lapic_deadline" (which was what was in use in KVM guests, in our
> >>> experiments) we have:
> >>>
> >>> lapic_clockevent.max_delta_ns = clockevent_delta2ns(0x7FFFFF, &lapic_clockevent);
> >>> lapic_clockevent.min_delta_ns = clockevent_delta2ns(0xF, &lapic_clockevent);
> >>>
> >>> Which means max is 0x7FFFFF device ticks, and min is 0xF.
> >>> clockevent_delta2ns() does the conversion from ticks to ns, based on
> >>> the results of the APIC calibration process. It calls cev_delta2ns()
> >>> which does some scaling, shifting, divs, etc, and, at the very end,
> >>> this:
> >>>
> >>> /* Deltas less than 1usec are pointless noise */
> >>> return clc > 1000 ? clc : 1000;
> >>>
> >>> So, as Ryan is also saying, the actual minimum, in this case, depends
> >>> on hardware, with a sanity check of "never below 1us" (which is much
> >>> smaller than 100us!)
> >>>
> >>> Of course, the actual granularity depends on hardware in the Xen case
> >>> as well, but that is handled in Xen itself. And we have mechanisms in
> >>> place in there to avoid timer interrupt storms (like, ahem, Xen's
> >>> 'timer_slop' boot parameter... :-P)
> >>>
> >>> And this is basically why I was also thinking we can/should lower the
> >>> default value of TIMER_SLOP, here in the Xen clock implementation in
> >>> Linux.
> >> What do you think would be a sane value? 10us? Should we then still keep
> >> this patch?
> >>
> >> My concern would be that if we change the current value and it turns out
> >> to be very wrong we'd then have no recourse.
> >>
> >>
> >> -boris
> >>
> > Speaking out of turn but as a participant in this thread, I would not
> > assume to change the default value for all cases without significant
> > testing by the community, touching a variety of configurations.
> >
> > It feels like changing the default has a non-trivial amount of
> > unknowns that would need to be addressed.
> >
> > Not surprisingly, I am biased to the approach of my patch which
> > does not change the default but offers flexibility to all.
>
>
> If we are to change the default it would be good to at least collect
> some data on distribution of delta values in
> clockevents_program_event(). But as I said, I'd keep the patch.
>
> Also, as far as the comment describing TIMER_SLOP, I agree that it is
> rather misleading.
>
> I can replace it with /* Minimum amount of time until next clock event
> fires */. I can do it while committing, so no need to resend.
>
> -boris
I like that. Thanks Boris!
- Ryan
^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: [PATCH] x86/xen: Add "xen_timer_slop" command line option
2019-03-27 14:46 ` Boris Ostrovsky
2019-03-27 14:59 ` Ryan Thibodeaux
2019-03-27 14:59 ` Ryan Thibodeaux
@ 2019-03-27 15:19 ` Dario Faggioli
2019-03-27 15:19 ` Dario Faggioli
3 siblings, 0 replies; 45+ messages in thread
From: Dario Faggioli @ 2019-03-27 15:19 UTC (permalink / raw)
To: Boris Ostrovsky, Ryan Thibodeaux
Cc: jgross, oleksandr_andrushchenko, luca abeni, linux-kernel,
ryan.thibodeaux, xen-devel, tglx
On Wed, 2019-03-27 at 10:46 -0400, Boris Ostrovsky wrote:
> On 3/27/19 6:00 AM, Ryan Thibodeaux wrote:
> > On Tue, Mar 26, 2019 at 07:21:31PM -0400, Boris Ostrovsky wrote:
> > > On 3/26/19 5:13 AM, Dario Faggioli wrote:
> > > >
> > > > And this is basically why I was also thinking we can/should
> > > > lower the
> > > > default value of TIMER_SLOP, here in the Xen clock
> > > > implementation in
> > > > Linux.
> > > What do you think would be a sane value? 10us? Should we then
> > > still keep
> > > this patch?
> > >
> > > My concern would be that if we change the current value and it
> > > turns out
> > > to be very wrong we'd then have no recourse.
> > >
> > Speaking out of turn but as a participant in this thread, I would
> > not
> > assume to change the default value for all cases without
> > significant
> > testing by the community, touching a variety of configurations.
> >
> If we are to change the default it would be good to at least collect
> some data on distribution of delta values in
> clockevents_program_event(). But as I said, I'd keep the patch.
>
I would definitely take/keep this patch. Choosing a more sane (IMO)
default and making things flexible and configurable are not mutually
exclusive things. :-)
I think that having this set to 100us stands in the way of a lot of
people wanting to do time-sensitive stuff in Xen VMs. I'd at least
halve that to 50us, but 10us is even better.
But sure we can do this at a later point. And even at that point, a
patch like this is valuable, because there might be people who,
probably after some testing of their own setup, want to lower it even
further.
> Also, as far as the comment describing TIMER_SLOP, I agree that it is
> rather misleading.
>
> I can replace it with /* Minimum amount of time until next clock
> event
> fires */. I can do it while committing, so no need to resend.
>
Yeah, much better, yes.
Dario
--
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Software Engineer @ SUSE https://www.suse.com/
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: [PATCH] x86/xen: Add "xen_timer_slop" command line option
2019-03-27 10:00 ` Ryan Thibodeaux
2019-03-27 14:46 ` Boris Ostrovsky
@ 2019-03-27 14:46 ` Boris Ostrovsky
1 sibling, 0 replies; 45+ messages in thread
From: Boris Ostrovsky @ 2019-03-27 14:46 UTC (permalink / raw)
To: Ryan Thibodeaux
Cc: jgross, oleksandr_andrushchenko, luca abeni, linux-kernel,
Dario Faggioli, ryan.thibodeaux, xen-devel, tglx
On 3/27/19 6:00 AM, Ryan Thibodeaux wrote:
> On Tue, Mar 26, 2019 at 07:21:31PM -0400, Boris Ostrovsky wrote:
>> On 3/26/19 5:13 AM, Dario Faggioli wrote:
>>> On Mon, 2019-03-25 at 09:43 -0400, Boris Ostrovsky wrote:
>>>> On 3/25/19 8:05 AM, luca abeni wrote:
>>>>> The picture shows the latencies measured with an unpatched guest
>>>>> kernel
>>>>> and with a guest kernel having TIMER_SLOP set to 1000 (arbitrary
>>>>> small
>>>>> value :).
>>>>> All the experiments have been performed booting the hypervisor with
>>>>> a
>>>>> small timer_slop (the hypervisor's one) value. So, they show that
>>>>> decreasing the hypervisor's timer_slop is not enough to measure low
>>>>> latencies with cyclictest.
>>>> I have a couple of questions:
>>>> * Does it make sense to make this a tunable for other clockevent
>>>> devices
>>>> as well?
>>>>
>>> So, AFAIUI, the thing is as follows. In clockevents_program_event(), we
>>> keep the delta between now and the next timer event within
>>> dev->max_delta_ns and dev->min_delta_ns:
>>>
>>> delta = min(delta, (int64_t) dev->max_delta_ns);
>>> delta = max(delta, (int64_t) dev->min_delta_ns);
>>>
>>> For Xen (well, for the Xen clock) we have:
>>>
>>> .max_delta_ns = 0xffffffff,
>>> .min_delta_ns = TIMER_SLOP,
>>>
>>> which means a guest can't ask for a timer to fire earlier than 100us
>>> ahead, which is a bit too coarse, especially on contemporary hardware.
>>>
>>> For "lapic_deadline" (which was what was in use in KVM guests, in our
>>> experiments) we have:
>>>
>>> lapic_clockevent.max_delta_ns = clockevent_delta2ns(0x7FFFFF, &lapic_clockevent);
>>> lapic_clockevent.min_delta_ns = clockevent_delta2ns(0xF, &lapic_clockevent);
>>>
>>> Which means max is 0x7FFFFF device ticks, and min is 0xF.
>>> clockevent_delta2ns() does the conversion from ticks to ns, based on
>>> the results of the APIC calibration process. It calls cev_delta2ns()
>>> which does some scaling, shifting, divs, etc, and, at the very end,
>>> this:
>>>
>>> /* Deltas less than 1usec are pointless noise */
>>> return clc > 1000 ? clc : 1000;
>>>
>>> So, as Ryan is also saying, the actual minimum, in this case, depends
>>> on hardware, with a sanity check of "never below 1us" (which is much
>>> smaller than 100us!)
>>>
>>> Of course, the actual granularity depends on hardware in the Xen case
>>> as well, but that is handled in Xen itself. And we have mechanisms in
>>> place in there to avoid timer interrupt storms (like, ahem, Xen's
>>> 'timer_slop' boot parameter... :-P)
>>>
>>> And this is basically why I was also thinking we can/should lower the
>>> default value of TIMER_SLOP, here in the Xen clock implementation in
>>> Linux.
>> What do you think would be a sane value? 10us? Should we then still keep
>> this patch?
>>
>> My concern would be that if we change the current value and it turns out
>> to be very wrong we'd then have no recourse.
>>
>>
>> -boris
>>
> Speaking out of turn but as a participant in this thread, I would not
> assume to change the default value for all cases without significant
> testing by the community, touching a variety of configurations.
>
> It feels like changing the default has a non-trivial amount of
> unknowns that would need to be addressed.
>
> Not surprisingly, I am biased to the approach of my patch which
> does not change the default but offers flexibility to all.
If we are to change the default it would be good to at least collect
some data on distribution of delta values in
clockevents_program_event(). But as I said, I'd keep the patch.
Also, as far as the comment describing TIMER_SLOP, I agree that it is
rather misleading.
I can replace it with /* Minimum amount of time until next clock event
fires */. I can do it while committing, so no need to resend.
-boris
^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: [PATCH] x86/xen: Add "xen_timer_slop" command line option
2019-03-26 9:13 ` Dario Faggioli
` (2 preceding siblings ...)
2019-03-26 23:21 ` Boris Ostrovsky
@ 2019-03-26 23:21 ` Boris Ostrovsky
3 siblings, 0 replies; 45+ messages in thread
From: Boris Ostrovsky @ 2019-03-26 23:21 UTC (permalink / raw)
To: Dario Faggioli, luca abeni
Cc: jgross, oleksandr_andrushchenko, linux-kernel, thibodux,
ryan.thibodeaux, xen-devel, tglx
On 3/26/19 5:13 AM, Dario Faggioli wrote:
> On Mon, 2019-03-25 at 09:43 -0400, Boris Ostrovsky wrote:
>> On 3/25/19 8:05 AM, luca abeni wrote:
>>> The picture shows the latencies measured with an unpatched guest
>>> kernel
>>> and with a guest kernel having TIMER_SLOP set to 1000 (arbitrary
>>> small
>>> value :).
>>> All the experiments have been performed booting the hypervisor with
>>> a
>>> small timer_slop (the hypervisor's one) value. So, they show that
>>> decreasing the hypervisor's timer_slop is not enough to measure low
>>> latencies with cyclictest.
>> I have a couple of questions:
>> * Does it make sense to make this a tunable for other clockevent
>> devices
>> as well?
>>
> So, AFAIUI, the thing is as follows. In clockevents_program_event(), we
> keep the delta between now and the next timer event within
> dev->max_delta_ns and dev->min_delta_ns:
>
> delta = min(delta, (int64_t) dev->max_delta_ns);
> delta = max(delta, (int64_t) dev->min_delta_ns);
>
> For Xen (well, for the Xen clock) we have:
>
> .max_delta_ns = 0xffffffff,
> .min_delta_ns = TIMER_SLOP,
>
> which means a guest can't ask for a timer to fire earlier than 100us
> ahead, which is a bit too coarse, especially on contemporary hardware.
>
> For "lapic_deadline" (which was what was in use in KVM guests, in our
> experiments) we have:
>
> lapic_clockevent.max_delta_ns = clockevent_delta2ns(0x7FFFFF, &lapic_clockevent);
> lapic_clockevent.min_delta_ns = clockevent_delta2ns(0xF, &lapic_clockevent);
>
> Which means max is 0x7FFFFF device ticks, and min is 0xF.
> clockevent_delta2ns() does the conversion from ticks to ns, based on
> the results of the APIC calibration process. It calls cev_delta2ns()
> which does some scaling, shifting, divs, etc, and, at the very end,
> this:
>
> /* Deltas less than 1usec are pointless noise */
> return clc > 1000 ? clc : 1000;
>
> So, as Ryan is also saying, the actual minimum, in this case, depends
> on hardware, with a sanity check of "never below 1us" (which is much
> smaller than 100us!)
>
> Of course, the actual granularity depends on hardware in the Xen case
> as well, but that is handled in Xen itself. And we have mechanisms in
> place in there to avoid timer interrupt storms (like, ahem, Xen's
> 'timer_slop' boot parameter... :-P)
>
> And this is basically why I was also thinking we can/should lower the
> default value of TIMER_SLOP, here in the Xen clock implementation in
> Linux.
What do you think would be a sane value? 10us? Should we then still keep
this patch?
My concern would be that if we change the current value and it turns out
to be very wrong we'd then have no recourse.
-boris
^ permalink raw reply [flat|nested] 45+ messages in thread