* [Qemu-devel] ppc icount questions
@ 2018-01-12 16:19 Steven Seeger
  2018-01-12 16:52 ` Paolo Bonzini
  0 siblings, 1 reply; 7+ messages in thread
From: Steven Seeger @ 2018-01-12 16:19 UTC (permalink / raw)
  To: qemu-devel

Hi guys. I'm the poster on the qemu-discuss list about some technical icount 
questions and was told to come over here to qemu-devel.

My scenario: x86-64 host running qemu/ppc-softmmu with an unmodified ppc750 cpu 
and a custom board target with chipset I implemented.

I am trying to use icount to get virtual time to increase based on CPU 
instructions executed and not host time. So, if I have a register in a device 
model that is implemented with sleep(1); I would expect virtual time to think 
only a single instruction (or small group of instructions) passed with the 
register access even though real time has stalled a whole second.
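
To make that expectation concrete, here is a minimal sketch of the idealized
model, assuming a fixed "-icount shift=N" where each guest instruction costs
2^N ns of virtual time. The function names are hypothetical and are not QEMU
APIs. Host time spent inside the device model is deliberately not an input.

```c
#include <assert.h>
#include <stdint.h>

/* Virtual nanoseconds elapsed after `insns` guest instructions, assuming a
 * fixed shift: each instruction costs 2^shift ns of virtual time. */
int64_t icount_to_virtual_ns(int64_t insns, unsigned shift)
{
    assert(insns >= 0 && shift < 32);
    return insns << shift;
}

/* Hypothetical slow register access: however long the host blocks in the
 * device model, only the one instruction doing the access is counted. */
int64_t virtual_ns_after_slow_access(int64_t insns_before, unsigned shift)
{
    int64_t insns_after = insns_before + 1;   /* the load/store itself */
    return icount_to_virtual_ns(insns_after, shift);
}
```

With shift=3, a register access after 1000 instructions moves virtual time
from 8000 ns to 8008 ns in this model, regardless of how long the host slept.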

When using icount shift=auto, I see behavior where my UART character TX 
interrupt (I had to add a character tx timer to serial.c because WindRiver's 
UART code stops at 11 characters and waits for an interrupt that never comes 
in qemu's impossibly-fast UART) fires every 40ms of virtual time instead of 
every 87 microseconds of virtual time. After the bootup characters fly by, 
more interrupts are turned on and the behavior changes. (I tend to see a 
character come every 120-155 microseconds of virtual time.) 

With icount sleep=off, I see the UART interrupts happen much faster on bootup, 
but their timing is still imprecise.

My goal is to have QEMU respond deterministically to timer events and also 
execute instructions with time increasing as a proportion of those executed 
instructions.

A good example of this would be that say I have an interrupt that occurs every 
second. If I were to print out the virtual time that interrupt occurs in the 
device model, I should see a time of:

1.000000
2.000000
3.000000
4.000000

etc

Instead, I see:

1.000000
2.000013
3.000074
4.000022

When the timer function is called in the device model, I arm the timer again 
with qemu_get_clock_ns(QEMU_CLOCK_VIRTUAL) + 1000000000ULL and expect this 
time to be exactly 1 second of virtual time later.
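
One thing worth separating out here is the re-arm arithmetic itself. The
sketch below contrasts re-arming relative to the current clock reading with
re-arming relative to the deadline that just expired. It assumes the callback
may run slightly after its deadline; the helper names are hypothetical, and in
a real device model the result would be handed to timer_mod().

```c
#include <assert.h>
#include <stdint.h>

#define PERIOD_NS 1000000000LL   /* 1 second of virtual time */

/* Re-arm relative to "now": any latency between the deadline and the
 * callback actually running is carried into the next deadline. */
int64_t next_deadline_from_now(int64_t now_ns)
{
    assert(now_ns >= 0);
    return now_ns + PERIOD_NS;
}

/* Re-arm relative to the deadline that just expired: callback latency
 * affects when this tick is observed, not where the next one lands. */
int64_t next_deadline_from_previous(int64_t expired_deadline_ns)
{
    assert(expired_deadline_ns >= 0);
    return expired_deadline_ns + PERIOD_NS;
}
```

If the callback for the 1.000000 s deadline observes the clock at 1.000013 s,
the first form schedules the next tick for 2.000013 s while the second keeps
it at 2.000000 s. This removes accumulated drift only; it does not by itself
make the callback run exactly at the deadline.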

Either the virtual time is increasing without instructions executing or the 
granularity of when the timer is serviced relative to virtual time is not 
exact. I think the latter is happening. Is this because a tcg codeblock must 
execute completely and this causes increases in virtual time based on the 
number of instructions in that block, and the number of instructions varies?

I looked at Aaron Larson's post at
http://lists.nongnu.org/archive/html/qemu-discuss/2017-01/msg00022.html
and this did not work for me. In fact, I never see warp_start be anything 
other than -1 during the length of time I tested it.

Thanks for your help or any feedback.

Steven

* Re: [Qemu-devel] ppc icount questions
  2018-01-12 16:19 [Qemu-devel] ppc icount questions Steven Seeger
@ 2018-01-12 16:52 ` Paolo Bonzini
  2018-01-12 17:12   ` Steven Seeger
  0 siblings, 1 reply; 7+ messages in thread
From: Paolo Bonzini @ 2018-01-12 16:52 UTC (permalink / raw)
  To: steven.seeger, qemu-devel

On 12/01/2018 17:19, Steven Seeger wrote:
> A good example of this would be that say I have an interrupt that occurs every 
> second. If I were to print out the virtual time that interrupt occurs in the 
> device model, I should see a time of:
> 
> 1.000000
> 2.000000
> 3.000000
> 4.000000
> 
> etc
> 
> Instead, I see:
> 
> 1.000000
> 2.000013
> 3.000074
> 4.000022

What is the guest doing in the meanwhile?

> When the timer function is called in the device model, I arm the timer again 
> with qemu_get_clock_ns(QEMU_CLOCK_VIRTUAL) + 1000000000ULL and expect this 
> time to be exactly 1 second of virtual time later.
> 
> Either the virtual time is increasing without instructions executing or the 
> granularity of when the timer is serviced relative to virtual time is not 
> exact. I think the latter is happening. Is this because a tcg codeblock must 
> execute completely and this causes increases in virtual time based on the 
> number of instructions in that block, and the number of instructions varies?

virtual time increases only when instructions are executed, or when the
CPUs are idle (in the latter case, behavior depends on "-icount sleep":
if on, it increases at the same pace as real time, if off, it jumps
immediately to the next deadline).
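
A minimal sketch of the idle case, as a toy model rather than QEMU's actual
code: the function name and parameters are hypothetical, and the clamp to the
deadline under sleep=on is a simplifying assumption (the timer fires at the
deadline and the CPU stops being idle).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Virtual time (ns) after the vCPU has been idle for `host_ns_elapsed`
 * of real time, under the two "-icount sleep" settings. */
int64_t virtual_ns_after_idle(int64_t virtual_now, int64_t next_deadline,
                              int64_t host_ns_elapsed, bool sleep_on)
{
    assert(next_deadline >= virtual_now && host_ns_elapsed >= 0);
    if (!sleep_on) {
        /* sleep=off: jump immediately to the next timer deadline. */
        return next_deadline;
    }
    /* sleep=on: advance at the pace of real time until the deadline. */
    int64_t t = virtual_now + host_ns_elapsed;
    return t < next_deadline ? t : next_deadline;
}
```

In both modes the idle period adds virtual time without any instructions
being executed, which is why the instruction count alone does not determine
QEMU_CLOCK_VIRTUAL for a guest that idles.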

Paolo

* Re: [Qemu-devel] ppc icount questions
  2018-01-12 16:52 ` Paolo Bonzini
@ 2018-01-12 17:12   ` Steven Seeger
  2018-01-12 17:19     ` Paolo Bonzini
  0 siblings, 1 reply; 7+ messages in thread
From: Steven Seeger @ 2018-01-12 17:12 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel

On Friday, January 12, 2018 11:52:57 AM EST Paolo Bonzini wrote:
> What is the guest doing in the meanwhile?

The guest is running vxWorks with several threads. The CPU does idle at times. 

> virtual time increases only when instructions are executed, or when the
> CPUs are idle (in the latter case, behavior depends on "-icount sleep":
> if on, it increases at the same pace as real time, if off, it jumps
> immediately to the next deadline).

If we jump to the next available deadline, won't that run faster than 
realtime? The preferred goal here is to run realtime (sleep as appropriate) 
but slow down if the guest or model world requires too many host resources. 
But, the desire would be to maintain proportionality between number of 
instructions executed and increase in virtual time. 

One of the things happening in the guest code is there is a once-per-second 
interrupt and a once-per-10ms interrupt that the software expects to see in-
phase with each other. If not, then errors occur. I am seeing errors when I do 
more work in the device model. However, even with this extra work disabled I 
still do not see the timer granularity I expect.

Sorry for all the questions.

Steven

* Re: [Qemu-devel] ppc icount questions
  2018-01-12 17:12   ` Steven Seeger
@ 2018-01-12 17:19     ` Paolo Bonzini
  2018-01-12 18:03       ` Steven Seeger
  0 siblings, 1 reply; 7+ messages in thread
From: Paolo Bonzini @ 2018-01-12 17:19 UTC (permalink / raw)
  To: steven.seeger; +Cc: qemu-devel

On 12/01/2018 18:12, Steven Seeger wrote:
> On Friday, January 12, 2018 11:52:57 AM EST Paolo Bonzini wrote:
>> What is the guest doing in the meanwhile?
> 
> The guest is running vxWorks with several threads. The CPU does idle at times. 
> 
>> virtual time increases only when instructions are executed, or when the
>> CPUs are idle (in the latter case, behavior depends on "-icount sleep":
>> if on, it increases at the same pace as real time, if off, it jumps
>> immediately to the next deadline).
> 
> If we jump to the next available deadline, won't that run faster than 
> realtime?

Correct.  I mentioned it because you also had "-icount sleep=off" in
your previous message.

> The preferred goal here is to run realtime (sleep as appropriate) 
> but slow down if the guest or model world requires too many host resources. 
> But, the desire would be to maintain proportionality between number of 
> instructions executed and increase in virtual time. 

Note that in general you'll have different paces when the CPU is idle
and when it is not (because it's unlikely that emulation speed is
exactly 10^9/2^shift; "-icount shift=auto" achieves what you want but
loses more in determinism).  This won't be visible if the guest is
mostly idle though.
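
The arithmetic behind "10^9/2^shift" can be sketched as follows. The function
names are hypothetical, and the host execution rates in the usage note are
made-up numbers for illustration, not measurements.

```c
#include <assert.h>
#include <stdint.h>

/* Guest instructions per second of virtual time for a fixed shift:
 * 10^9 ns per second divided by 2^shift ns per instruction. */
int64_t insns_per_virtual_second(unsigned shift)
{
    assert(shift < 30);
    return 1000000000LL >> shift;
}

/* Pace of virtual time relative to real time while the CPU is busy, in
 * percent, given how many guest instructions the host really executes
 * per second. 100 means virtual time tracks real time exactly. */
int64_t busy_pace_percent(int64_t host_insns_per_sec, unsigned shift)
{
    assert(host_insns_per_sec >= 0);
    return host_insns_per_sec * 100 / insns_per_virtual_second(shift);
}
```

With shift=3 the guest is modelled at 125,000,000 instructions per second. A
host that emulates 250,000,000 instructions per second would run virtual time
at 200% of real time while busy, and one that emulates 62,500,000 would run
it at 50%, whereas idle periods with sleep=on run at 100%.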

> One of the things happening in the guest code is there is a once-per-second 
> interrupt and a once-per-10ms interrupt that the software expects to see in-
> phase with each other. If not, then errors occur. I am seeing errors when I do 
> more work in the device model. However, even with this extra work disabled I 
> still do not see the timer granularity I expect.

That's probably because the CPU runs in the background while the timers
run.  So QEMU_CLOCK_VIRTUAL is _not_ latched while the timers run.
Would that explain it?

Paolo

* Re: [Qemu-devel] ppc icount questions
  2018-01-12 17:19     ` Paolo Bonzini
@ 2018-01-12 18:03       ` Steven Seeger
  2018-01-12 18:11         ` Paolo Bonzini
  0 siblings, 1 reply; 7+ messages in thread
From: Steven Seeger @ 2018-01-12 18:03 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel

On Friday, January 12, 2018 12:19:22 PM EST Paolo Bonzini wrote:
> 
> Correct.  I mentioned it because you also had "-icount sleep=off" in
> your previous message.

Yes I have tried both. With sleep=off, I get the faster interrupt response 
(better granularity) but with sleep=on, it is poor. Again, a timer that should 
fire every 87us fires every 0.040000 seconds (always this precision) while 
guest is booting up and loading applications (very little guest idle time)

> Note that in general you'll have different paces when the CPU is idle
> and when it is not (because it's unlikely that emulation speed is
> exactly 10^9/2^shift; "-icount shift=auto" achieves what you want but
> loses more in determinism).  This won't be visible if the guest is
> mostly idle though.

It seems to me that if the TCG keeps track of number of instructions, we 
should be able to tie this to virtual timer increase. However it seems this is 
not the case. There's still some processing of a notion of "time" even when 
icount is used. We can be deterministic but only to some granularity and I 
can't seem to figure out where that is set.

> That's probably because the CPU runs in the background while the timers
> run.  So QEMU_CLOCK_VIRTUAL is _not_ latched while the timers run.
> Would that explain it?

Yes that would explain it. QEMU_CLOCK_VIRTUAL should increase with number of 
executed instructions, but it seems as I said above that this is still 
factoring time in somewhere. Even though time is a factor (the host must be 
able to wake up deterministically to handle the next timer deadline in the 
guest) surely the concept of QEMU_CLOCK_VIRTUAL as tied to number of executed 
instructions could remain stable.

Perhaps this is the case and I am doing something wrong somewhere.

I can obtain "sort-of" decent results by using QEMU_CLOCK_VIRTUAL_RT for my tx 
char timer in serial.c which allows fast bootup and approximately deterministic 
virtual time later on in execution, but I still have issues with the number of 
cpu instructions executed varying between timer responses.

With an interrupt every 1 second, and an interrupt every 10 ms, I would expect 
the vxWorks guest to respond to these interrupts with a rather accurate delay 
between them at the time the 10ms and 1 second interrupt occur at "the same 
time."

Steven

* Re: [Qemu-devel] ppc icount questions
  2018-01-12 18:03       ` Steven Seeger
@ 2018-01-12 18:11         ` Paolo Bonzini
  2018-01-12 18:35           ` Steven Seeger
  0 siblings, 1 reply; 7+ messages in thread
From: Paolo Bonzini @ 2018-01-12 18:11 UTC (permalink / raw)
  To: steven.seeger; +Cc: qemu-devel

On 12/01/2018 19:03, Steven Seeger wrote:
>> That's probably because the CPU runs in the background while the timers
>> run.  So QEMU_CLOCK_VIRTUAL is _not_ latched while the timers run.
>> Would that explain it?
> 
> Yes that would explain it. QEMU_CLOCK_VIRTUAL should increase with number of 
> executed instructions, but it seems as I said above that this is still 
> factoring time in somewhere. Even though time is a factor (the host must be 
> able to wake up deterministically to handle the next timer deadline in the 
> guest) surely the concept of QEMU_CLOCK_VIRTUAL as tied to number of executed 
> instructions could remain stable.

I think this is the issue:

     I/O thread                    vCPU thread
 -----------------------------------------------------------------------
                                   executes 1,000,000,000-th instruction
                                   wakes up I/O thread
     finds 1st timer
     runs 1st timer
                                   executes 1,000 instructions
----------- QEMU_CLOCK_VIRTUAL now is 1,000,001,000 --------------------
     1st timer finishes
                                   executes 10,000 instructions
----------- QEMU_CLOCK_VIRTUAL now is 1,000,011,000 --------------------
     runs 2nd timer
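
The sequence in the diagram can be replayed as a deterministic toy model,
assuming shift=0 so that one instruction equals one nanosecond. Everything
here is hypothetical, including the "latched" mode, which is not an existing
QEMU option; it only shows what the timers would observe if the clock were
frozen at the value it had when the I/O thread woke up.

```c
#include <assert.h>
#include <stdint.h>

/* Instruction counter advanced by the modelled vCPU thread. With shift=0
 * it doubles as the virtual clock in nanoseconds. */
typedef struct {
    int64_t insns;
} VcpuModel;

void vcpu_run(VcpuModel *cpu, int64_t n)
{
    assert(n >= 0);
    cpu->insns += n;
}

/* Replay the diagram and record the clock value each timer observes.
 * latched == 0: timers read the live counter (the behaviour described).
 * latched != 0: timers read a snapshot taken at wake-up (hypothetical). */
void replay_diagram(int latched, int64_t seen[2])
{
    VcpuModel cpu = { 0 };

    vcpu_run(&cpu, 1000000000LL);   /* reaches the 1,000,000,000-th insn */
    int64_t snapshot = cpu.insns;   /* I/O thread is woken up here */

    vcpu_run(&cpu, 1000);           /* vCPU keeps running during timer 1 */
    seen[0] = latched ? snapshot : cpu.insns;

    vcpu_run(&cpu, 10000);          /* ... and before timer 2 starts */
    seen[1] = latched ? snapshot : cpu.insns;
}
```

With the live counter the two timers observe 1,000,001,000 and 1,000,011,000;
with the snapshot both observe 1,000,000,000. In the real two-thread case the
1,000 and 10,000 are not fixed, since they depend on host scheduling.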

> I can obtain "sort-of" decent results by using QEMU_CLOCK_VIRTUAL_RT for my tx 
> char timer in serial.c which allows fast bootup and approximately deterministic 
> virtual time later on in execution, but I still have issues with the number of 
> cpu instructions executed varying between timer responses.

QEMU_CLOCK_VIRTUAL_RT is for internal use (by -icount sleep, -icount
shift=auto, etc.).  You almost certainly don't need it.

Paolo

> With an interrupt every 1 second, and an interrupt every 10 ms, I would expect 
> the vxWorks guest to respond to these interrupts with a rather accurate delay 
> between them at the time the 10ms and 1 second interrupt occur at "the same 
> time."

* Re: [Qemu-devel] ppc icount questions
  2018-01-12 18:11         ` Paolo Bonzini
@ 2018-01-12 18:35           ` Steven Seeger
  0 siblings, 0 replies; 7+ messages in thread
From: Steven Seeger @ 2018-01-12 18:35 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel

> I think this is the issue:
> 
>      I/O thread                    vCPU thread
>  -----------------------------------------------------------------------
>                                    executes 1,000,000,000-th instruction
>                                    wakes up I/O thread
>      finds 1st timer
>      runs 1st timer
>                                    executes 1,000 instructions
> ----------- QEMU_CLOCK_VIRTUAL now is 1,000,001,000 --------------------
>      1st timer finishes
>                                    executes 10,000 instructions
> ----------- QEMU_CLOCK_VIRTUAL now is 1,000,011,000 --------------------
>      runs 2nd timer

I would agree this is the issue. I was thinking that the timer ran in the same 
thread as the CPU (thus preventing the two from running at the same time) but 
I guess this is not true. There must be some sync point, because taking too 
long to finish the timer makes things stall (or that may just be due to 
causing a delay in delivery of the next interrupt.)

So I guess what I am looking for is a way to ensure the two run mutually 
exclusive of each other. I know from other systems that we can run all this in 
a single thread (hardware models and guest CPU) so it should be possible to do 
in QEMU as well.
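
As a rough illustration of what I mean by single-threaded, here is a toy
co-simulation loop, not QEMU's main loop: the CPU is "executed" only up to the
next timer deadline, then the device model runs while the CPU is stopped. The
struct and function names are hypothetical, and one instruction is assumed to
equal one unit of virtual time.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_FIRES 8

typedef struct {
    int64_t now;                  /* virtual time == instructions executed */
    int64_t deadline;             /* next timer expiry */
    int64_t period;               /* timer period */
    int64_t fired_at[MAX_FIRES];  /* virtual time seen by each callback */
    int     nfired;
} SimModel;

/* Alternate strictly between CPU execution and timer callbacks. */
void sim_run(SimModel *s, int64_t until)
{
    assert(s->period > 0);
    while (s->now < until && s->nfired < MAX_FIRES) {
        /* Execute up to the deadline (or the end of the run), no further. */
        int64_t stop = s->deadline < until ? s->deadline : until;
        s->now = stop;
        if (s->now == s->deadline) {
            /* The CPU is stopped here, so the clock cannot move. */
            s->fired_at[s->nfired++] = s->now;
            s->deadline += s->period;
        }
    }
}
```

Starting from now=0 with deadline=1000 and period=1000, running until 3500
records callbacks at exactly 1000, 2000 and 3000. The cost of this scheme is
that a slow device model stalls the CPU for its whole duration.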

Steven
