All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH] rtc: placing RTC memory region outside BQL
@ 2018-02-01 13:47 Gonglei
  2018-02-01 14:23 ` Paolo Bonzini
  0 siblings, 1 reply; 9+ messages in thread
From: Gonglei @ 2018-02-01 13:47 UTC (permalink / raw)
  To: qemu-devel; +Cc: pbonzini, weidong.huang, Gonglei

As windows guest use rtc as the clock source device,
and access rtc frequently. Let's move the rtc memory
region outside BQL to decrease overhead for windows guests.

strace -tt -p $1 -c -o result_$1.log &
sleep $2
pid=$(pidof strace)
kill $pid
cat result_$1.log

Before appling this change:

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 93.87    0.119070          30      4000           ppoll
  3.27    0.004148           2      2038           ioctl
  2.66    0.003370           2      2014           futex
  0.09    0.000113           1       106           read
  0.09    0.000109           1       104           io_getevents
  0.02    0.000029           1        30           poll
  0.00    0.000000           0         1           write
------ ----------- ----------- --------- --------- ----------------
100.00    0.126839                  8293           total

After appling the change:

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 92.86    0.067441          16      4094           ppoll
  4.85    0.003522           2      2136           ioctl
  1.17    0.000850           4       189           futex
  0.54    0.000395           2       202           read
  0.52    0.000379           2       202           io_getevents
  0.05    0.000037           1        30           poll
------ ----------- ----------- --------- --------- ----------------
100.00    0.072624                  6853           total

The futex call number decreases ~90.6% on an idle windows 7 guest.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
---
 hw/timer/mc146818rtc.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/hw/timer/mc146818rtc.c b/hw/timer/mc146818rtc.c
index 35a05a6..d9d99c5 100644
--- a/hw/timer/mc146818rtc.c
+++ b/hw/timer/mc146818rtc.c
@@ -986,6 +986,7 @@ static void rtc_realizefn(DeviceState *dev, Error **errp)
     qemu_register_suspend_notifier(&s->suspend_notifier);
 
     memory_region_init_io(&s->io, OBJECT(s), &cmos_ops, s, "rtc", 2);
+    memory_region_clear_global_locking(&s->io);
     isa_register_ioport(isadev, &s->io, base);
 
     qdev_set_legacy_instance_id(dev, base, 3);
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [Qemu-devel] [PATCH] rtc: placing RTC memory region outside BQL
  2018-02-01 13:47 [Qemu-devel] [PATCH] rtc: placing RTC memory region outside BQL Gonglei
@ 2018-02-01 14:23 ` Paolo Bonzini
  2018-02-04  6:20   ` Gonglei (Arei)
  2018-02-04 18:02   ` Peter Maydell
  0 siblings, 2 replies; 9+ messages in thread
From: Paolo Bonzini @ 2018-02-01 14:23 UTC (permalink / raw)
  To: Gonglei, qemu-devel; +Cc: weidong.huang

On 01/02/2018 08:47, Gonglei wrote:
> As windows guest use rtc as the clock source device,
> and access rtc frequently. Let's move the rtc memory
> region outside BQL to decrease overhead for windows guests.
> 
> strace -tt -p $1 -c -o result_$1.log &
> sleep $2
> pid=$(pidof strace)
> kill $pid
> cat result_$1.log
> 
> Before appling this change:
> 
> % time     seconds  usecs/call     calls    errors syscall
> ------ ----------- ----------- --------- --------- ----------------
>  93.87    0.119070          30      4000           ppoll
>   3.27    0.004148           2      2038           ioctl
>   2.66    0.003370           2      2014           futex
>   0.09    0.000113           1       106           read
>   0.09    0.000109           1       104           io_getevents
>   0.02    0.000029           1        30           poll
>   0.00    0.000000           0         1           write
> ------ ----------- ----------- --------- --------- ----------------
> 100.00    0.126839                  8293           total
> 
> After appling the change:
> 
> % time     seconds  usecs/call     calls    errors syscall
> ------ ----------- ----------- --------- --------- ----------------
>  92.86    0.067441          16      4094           ppoll
>   4.85    0.003522           2      2136           ioctl
>   1.17    0.000850           4       189           futex
>   0.54    0.000395           2       202           read
>   0.52    0.000379           2       202           io_getevents
>   0.05    0.000037           1        30           poll
> ------ ----------- ----------- --------- --------- ----------------
> 100.00    0.072624                  6853           total
> 
> The futex call number decreases ~90.6% on an idle windows 7 guest.
> 
> Signed-off-by: Gonglei <arei.gonglei@huawei.com>
> ---
>  hw/timer/mc146818rtc.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/hw/timer/mc146818rtc.c b/hw/timer/mc146818rtc.c
> index 35a05a6..d9d99c5 100644
> --- a/hw/timer/mc146818rtc.c
> +++ b/hw/timer/mc146818rtc.c
> @@ -986,6 +986,7 @@ static void rtc_realizefn(DeviceState *dev, Error **errp)
>      qemu_register_suspend_notifier(&s->suspend_notifier);
>  
>      memory_region_init_io(&s->io, OBJECT(s), &cmos_ops, s, "rtc", 2);
> +    memory_region_clear_global_locking(&s->io);
>      isa_register_ioport(isadev, &s->io, base);
>  
>      qdev_set_legacy_instance_id(dev, base, 3);
> 

This is not enough, you need to add a new lock or something like that.
Otherwise two vCPUs can access the RTC together and make a mess.

Paolo

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Qemu-devel] [PATCH] rtc: placing RTC memory region outside BQL
  2018-02-01 14:23 ` Paolo Bonzini
@ 2018-02-04  6:20   ` Gonglei (Arei)
  2018-02-04 18:02   ` Peter Maydell
  1 sibling, 0 replies; 9+ messages in thread
From: Gonglei (Arei) @ 2018-02-04  6:20 UTC (permalink / raw)
  To: Paolo Bonzini, qemu-devel; +Cc: Huangweidong (C)

> -----Original Message-----
> From: Paolo Bonzini [mailto:pbonzini@redhat.com]
> Sent: Thursday, February 01, 2018 10:24 PM
> To: Gonglei (Arei); qemu-devel@nongnu.org
> Cc: Huangweidong (C)
> Subject: Re: [PATCH] rtc: placing RTC memory region outside BQL
> 
> On 01/02/2018 08:47, Gonglei wrote:
> > As windows guest use rtc as the clock source device,
> > and access rtc frequently. Let's move the rtc memory
> > region outside BQL to decrease overhead for windows guests.
> >
> > strace -tt -p $1 -c -o result_$1.log &
> > sleep $2
> > pid=$(pidof strace)
> > kill $pid
> > cat result_$1.log
> >
> > Before appling this change:
> >
> > % time     seconds  usecs/call     calls    errors syscall
> > ------ ----------- ----------- --------- --------- ----------------
> >  93.87    0.119070          30      4000           ppoll
> >   3.27    0.004148           2      2038           ioctl
> >   2.66    0.003370           2      2014           futex
> >   0.09    0.000113           1       106           read
> >   0.09    0.000109           1       104           io_getevents
> >   0.02    0.000029           1        30           poll
> >   0.00    0.000000           0         1           write
> > ------ ----------- ----------- --------- --------- ----------------
> > 100.00    0.126839                  8293           total
> >
> > After appling the change:
> >
> > % time     seconds  usecs/call     calls    errors syscall
> > ------ ----------- ----------- --------- --------- ----------------
> >  92.86    0.067441          16      4094           ppoll
> >   4.85    0.003522           2      2136           ioctl
> >   1.17    0.000850           4       189           futex
> >   0.54    0.000395           2       202           read
> >   0.52    0.000379           2       202           io_getevents
> >   0.05    0.000037           1        30           poll
> > ------ ----------- ----------- --------- --------- ----------------
> > 100.00    0.072624                  6853           total
> >
> > The futex call number decreases ~90.6% on an idle windows 7 guest.
> >
> > Signed-off-by: Gonglei <arei.gonglei@huawei.com>
> > ---
> >  hw/timer/mc146818rtc.c | 1 +
> >  1 file changed, 1 insertion(+)
> >
> > diff --git a/hw/timer/mc146818rtc.c b/hw/timer/mc146818rtc.c
> > index 35a05a6..d9d99c5 100644
> > --- a/hw/timer/mc146818rtc.c
> > +++ b/hw/timer/mc146818rtc.c
> > @@ -986,6 +986,7 @@ static void rtc_realizefn(DeviceState *dev, Error
> **errp)
> >      qemu_register_suspend_notifier(&s->suspend_notifier);
> >
> >      memory_region_init_io(&s->io, OBJECT(s), &cmos_ops, s, "rtc", 2);
> > +    memory_region_clear_global_locking(&s->io);
> >      isa_register_ioport(isadev, &s->io, base);
> >
> >      qdev_set_legacy_instance_id(dev, base, 3);
> >
> 
> This is not enough, you need to add a new lock or something like that.
> Otherwise two vCPUs can access the RTC together and make a mess.
> 

Hi Paolo,

Yes, that's true, although I have not encountered any problems yet.
Let me enhance it in v2.

Thanks,
-Gonglei

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Qemu-devel] [PATCH] rtc: placing RTC memory region outside BQL
  2018-02-01 14:23 ` Paolo Bonzini
  2018-02-04  6:20   ` Gonglei (Arei)
@ 2018-02-04 18:02   ` Peter Maydell
  2018-02-05 14:04     ` Paolo Bonzini
  1 sibling, 1 reply; 9+ messages in thread
From: Peter Maydell @ 2018-02-04 18:02 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Gonglei, QEMU Developers, Huangweidong (C)

On 1 February 2018 at 14:23, Paolo Bonzini <pbonzini@redhat.com> wrote:
> On 01/02/2018 08:47, Gonglei wrote:
>> diff --git a/hw/timer/mc146818rtc.c b/hw/timer/mc146818rtc.c
>> index 35a05a6..d9d99c5 100644
>> --- a/hw/timer/mc146818rtc.c
>> +++ b/hw/timer/mc146818rtc.c
>> @@ -986,6 +986,7 @@ static void rtc_realizefn(DeviceState *dev, Error **errp)
>>      qemu_register_suspend_notifier(&s->suspend_notifier);
>>
>>      memory_region_init_io(&s->io, OBJECT(s), &cmos_ops, s, "rtc", 2);
>> +    memory_region_clear_global_locking(&s->io);
>>      isa_register_ioport(isadev, &s->io, base);
>>
>>      qdev_set_legacy_instance_id(dev, base, 3);
>>
>
> This is not enough, you need to add a new lock or something like that.
> Otherwise two vCPUs can access the RTC together and make a mess.

Do you also need to do something to take the global lock before
raising the outbound IRQ line (since it might be connected to a device
that does need the global lock), or am I confused ?

thanks
-- PMM

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Qemu-devel] [PATCH] rtc: placing RTC memory region outside BQL
  2018-02-04 18:02   ` Peter Maydell
@ 2018-02-05 14:04     ` Paolo Bonzini
  2018-02-06  8:24       ` Gonglei (Arei)
  0 siblings, 1 reply; 9+ messages in thread
From: Paolo Bonzini @ 2018-02-05 14:04 UTC (permalink / raw)
  To: Peter Maydell; +Cc: Gonglei, QEMU Developers, Huangweidong (C)

On 04/02/2018 19:02, Peter Maydell wrote:
> On 1 February 2018 at 14:23, Paolo Bonzini <pbonzini@redhat.com> wrote:
>> On 01/02/2018 08:47, Gonglei wrote:
>>> diff --git a/hw/timer/mc146818rtc.c b/hw/timer/mc146818rtc.c
>>> index 35a05a6..d9d99c5 100644
>>> --- a/hw/timer/mc146818rtc.c
>>> +++ b/hw/timer/mc146818rtc.c
>>> @@ -986,6 +986,7 @@ static void rtc_realizefn(DeviceState *dev, Error **errp)
>>>      qemu_register_suspend_notifier(&s->suspend_notifier);
>>>
>>>      memory_region_init_io(&s->io, OBJECT(s), &cmos_ops, s, "rtc", 2);
>>> +    memory_region_clear_global_locking(&s->io);
>>>      isa_register_ioport(isadev, &s->io, base);
>>>
>>>      qdev_set_legacy_instance_id(dev, base, 3);
>>>
>>
>> This is not enough, you need to add a new lock or something like that.
>> Otherwise two vCPUs can access the RTC together and make a mess.
> 
> Do you also need to do something to take the global lock before
> raising the outbound IRQ line (since it might be connected to a device
> that does need the global lock), or am I confused ?

Yes, that's a good point.  Most of the time the IRQ line is raised in a
timer, but not always.

Paolo

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Qemu-devel] [PATCH] rtc: placing RTC memory region outside BQL
  2018-02-05 14:04     ` Paolo Bonzini
@ 2018-02-06  8:24       ` Gonglei (Arei)
  2018-02-06  9:49         ` Peter Maydell
  0 siblings, 1 reply; 9+ messages in thread
From: Gonglei (Arei) @ 2018-02-06  8:24 UTC (permalink / raw)
  To: Paolo Bonzini, Peter Maydell; +Cc: QEMU Developers, Huangweidong (C)


> -----Original Message-----
> From: Paolo Bonzini [mailto:pbonzini@redhat.com]
> Sent: Monday, February 05, 2018 10:04 PM
> To: Peter Maydell
> Cc: Gonglei (Arei); QEMU Developers; Huangweidong (C)
> Subject: Re: [Qemu-devel] [PATCH] rtc: placing RTC memory region outside BQL
> 
> On 04/02/2018 19:02, Peter Maydell wrote:
> > On 1 February 2018 at 14:23, Paolo Bonzini <pbonzini@redhat.com> wrote:
> >> On 01/02/2018 08:47, Gonglei wrote:
> >>> diff --git a/hw/timer/mc146818rtc.c b/hw/timer/mc146818rtc.c
> >>> index 35a05a6..d9d99c5 100644
> >>> --- a/hw/timer/mc146818rtc.c
> >>> +++ b/hw/timer/mc146818rtc.c
> >>> @@ -986,6 +986,7 @@ static void rtc_realizefn(DeviceState *dev, Error
> **errp)
> >>>      qemu_register_suspend_notifier(&s->suspend_notifier);
> >>>
> >>>      memory_region_init_io(&s->io, OBJECT(s), &cmos_ops, s, "rtc", 2);
> >>> +    memory_region_clear_global_locking(&s->io);
> >>>      isa_register_ioport(isadev, &s->io, base);
> >>>
> >>>      qdev_set_legacy_instance_id(dev, base, 3);
> >>>
> >>
> >> This is not enough, you need to add a new lock or something like that.
> >> Otherwise two vCPUs can access the RTC together and make a mess.
> >
> > Do you also need to do something to take the global lock before
> > raising the outbound IRQ line (since it might be connected to a device
> > that does need the global lock), or am I confused ?
> 
> Yes, that's a good point.  Most of the time the IRQ line is raised in a
> timer, but not always.
> 
So, taking BQL is necessary, and what we can do is trying our best to narrow
down the process of locking ? For example, do the following wrapping:

static void rtc_rasie_irq(RTCState *s)
{
    qemu_mutex_lock_iothread();
    qemu_irq_raise(s->irq);
    qemu_mutex_unlock_iothread();
}

static void rtc_lower_irq(RTCState *s)
{
    qemu_mutex_lock_iothread();
    qemu_irq_lower(s->irq);
    qemu_mutex_unlock_iothread();
}

Thanks,
-Gonglei

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Qemu-devel] [PATCH] rtc: placing RTC memory region outside BQL
  2018-02-06  8:24       ` Gonglei (Arei)
@ 2018-02-06  9:49         ` Peter Maydell
  2018-02-06 13:05           ` Gonglei (Arei)
  0 siblings, 1 reply; 9+ messages in thread
From: Peter Maydell @ 2018-02-06  9:49 UTC (permalink / raw)
  To: Gonglei (Arei); +Cc: Paolo Bonzini, QEMU Developers, Huangweidong (C)

On 6 February 2018 at 08:24, Gonglei (Arei) <arei.gonglei@huawei.com> wrote:
> So, taking BQL is necessary, and what we can do is trying our best to narrow
> down the process of locking ? For example, do the following wrapping:
>
> static void rtc_rasie_irq(RTCState *s)
> {
>     qemu_mutex_lock_iothread();
>     qemu_irq_raise(s->irq);
>     qemu_mutex_unlock_iothread();
> }
>
> static void rtc_lower_irq(RTCState *s)
> {
>     qemu_mutex_lock_iothread();
>     qemu_irq_lower(s->irq);
>     qemu_mutex_unlock_iothread();
> }

If you do that you'll also need to be careful about not calling
those functions from contexts where you already hold the iothread
mutex (eg timer callbacks), since you can't lock a mutex you
already have locked.

thanks
-- PMM

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Qemu-devel] [PATCH] rtc: placing RTC memory region outside BQL
  2018-02-06  9:49         ` Peter Maydell
@ 2018-02-06 13:05           ` Gonglei (Arei)
  0 siblings, 0 replies; 9+ messages in thread
From: Gonglei (Arei) @ 2018-02-06 13:05 UTC (permalink / raw)
  To: Peter Maydell; +Cc: Paolo Bonzini, QEMU Developers, Huangweidong (C)


> -----Original Message-----
> From: Peter Maydell [mailto:peter.maydell@linaro.org]
> Sent: Tuesday, February 06, 2018 5:49 PM
> To: Gonglei (Arei)
> Cc: Paolo Bonzini; QEMU Developers; Huangweidong (C)
> Subject: Re: [Qemu-devel] [PATCH] rtc: placing RTC memory region outside BQL
> 
> On 6 February 2018 at 08:24, Gonglei (Arei) <arei.gonglei@huawei.com>
> wrote:
> > So, taking BQL is necessary, and what we can do is trying our best to narrow
> > down the process of locking ? For example, do the following wrapping:
> >
> > static void rtc_rasie_irq(RTCState *s)
> > {
> >     qemu_mutex_lock_iothread();
> >     qemu_irq_raise(s->irq);
> >     qemu_mutex_unlock_iothread();
> > }
> >
> > static void rtc_lower_irq(RTCState *s)
> > {
> >     qemu_mutex_lock_iothread();
> >     qemu_irq_lower(s->irq);
> >     qemu_mutex_unlock_iothread();
> > }
> 
> If you do that you'll also need to be careful about not calling
> those functions from contexts where you already hold the iothread
> mutex (eg timer callbacks), since you can't lock a mutex you
> already have locked.
> 
Exactly, all contexts caused by the main process. :)
Three timers callbacks, calling rtc_reset().

Thanks,
-Gonglei

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Qemu-devel] [PATCH] rtc: placing RTC memory region outside BQL
@ 2018-02-01 13:51 Gonglei
  0 siblings, 0 replies; 9+ messages in thread
From: Gonglei @ 2018-02-01 13:51 UTC (permalink / raw)
  To: qemu-devel; +Cc: pbonzini, weidong.huang, Gonglei

As windows guest use rtc as the clock source device,
and access rtc frequently. Let's move the rtc memory
region outside BQL to decrease overhead for windows guests.

$ cat strace_c.sh
strace -tt -p $1 -c -o result_$1.log &
sleep $2
pid=$(pidof strace)
kill $pid
cat result_$1.log

Before appling this change:
$ ./strace_c.sh 10528 30
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 93.87    0.119070          30      4000           ppoll
  3.27    0.004148           2      2038           ioctl
  2.66    0.003370           2      2014           futex
  0.09    0.000113           1       106           read
  0.09    0.000109           1       104           io_getevents
  0.02    0.000029           1        30           poll
  0.00    0.000000           0         1           write
------ ----------- ----------- --------- --------- ----------------
100.00    0.126839                  8293           total

After appling the change:
$ ./strace_c.sh 23829 30
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 92.86    0.067441          16      4094           ppoll
  4.85    0.003522           2      2136           ioctl
  1.17    0.000850           4       189           futex
  0.54    0.000395           2       202           read
  0.52    0.000379           2       202           io_getevents
  0.05    0.000037           1        30           poll
------ ----------- ----------- --------- --------- ----------------
100.00    0.072624                  6853           total

The futex call number decreases ~90.6% on an idle windows 7 guest.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
---
 hw/timer/mc146818rtc.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/hw/timer/mc146818rtc.c b/hw/timer/mc146818rtc.c
index 35a05a6..d9d99c5 100644
--- a/hw/timer/mc146818rtc.c
+++ b/hw/timer/mc146818rtc.c
@@ -986,6 +986,7 @@ static void rtc_realizefn(DeviceState *dev, Error **errp)
     qemu_register_suspend_notifier(&s->suspend_notifier);
 
     memory_region_init_io(&s->io, OBJECT(s), &cmos_ops, s, "rtc", 2);
+    memory_region_clear_global_locking(&s->io);
     isa_register_ioport(isadev, &s->io, base);
 
     qdev_set_legacy_instance_id(dev, base, 3);
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2018-02-06 13:05 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-02-01 13:47 [Qemu-devel] [PATCH] rtc: placing RTC memory region outside BQL Gonglei
2018-02-01 14:23 ` Paolo Bonzini
2018-02-04  6:20   ` Gonglei (Arei)
2018-02-04 18:02   ` Peter Maydell
2018-02-05 14:04     ` Paolo Bonzini
2018-02-06  8:24       ` Gonglei (Arei)
2018-02-06  9:49         ` Peter Maydell
2018-02-06 13:05           ` Gonglei (Arei)
2018-02-01 13:51 Gonglei

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.