xen-devel.lists.xenproject.org archive mirror
From: Andrii Anisov <andrii.anisov@gmail.com>
To: Volodymyr Babchuk <Volodymyr_Babchuk@epam.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>,
	Andrii Anisov <Andrii_Anisov@epam.com>, Wei Liu <wl@xen.org>,
	Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>,
	George Dunlap <george.dunlap@eu.citrix.com>,
	Andrew Cooper <andrew.cooper3@citrix.com>,
	Tim Deegan <tim@xen.org>, Dario Faggioli <dfaggioli@suse.com>,
	Julien Grall <julien.grall@arm.com>,
	Jan Beulich <jbeulich@suse.com>,
	"xen-devel@lists.xenproject.org" <xen-devel@lists.xenproject.org>,
	Ian Jackson <ian.jackson@eu.citrix.com>
Subject: Re: [Xen-devel] [RFC 1/9] schedule: Introduce per-pcpu time accounting
Date: Thu, 12 Sep 2019 13:26:01 +0300
Message-ID: <fa467ad5-c12f-c621-1a57-570c0a87a4a0@gmail.com>
In-Reply-To: <87o8zqg19z.fsf@epam.com>

Hello Volodymyr,

On 11.09.19 21:01, Volodymyr Babchuk wrote:
>> Introduce per-pcpu time accounting which includes the following states:
>>
>> TACC_HYP - the pcpu executes hypervisor code like softirq processing
>>             (including scheduling), tasklets and context switches
>> TACC_GUEST - the pcpu executes guest code
>> TACC_IDLE - the low-power state of the pcpu
> Is it really low-power?

It is rather a matter of scheduling design. It differs from OS to OS, and even from arch to arch. See [1].
Personally, I tend to treat only the low-power state as true idle.
As a bad (IMO) example I can give the current Xen mainline: pretty heavy tasks can be performed by the idle vcpu, and they are accounted as idle. This may mislead, for example, a cpufreq governor.
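
A minimal sketch of why this matters, assuming a governor that derives load from the share of non-idle time in a sampling window. The helper name `busy_percent` and the nanosecond units are illustrative assumptions, not anything from Xen or from this series:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: busy share of a sampling window, in percent. */
static unsigned int busy_percent(int64_t window_ns, int64_t idle_ns)
{
    /* The idle time can never exceed the window it was sampled in. */
    assert(window_ns > 0 && idle_ns >= 0 && idle_ns <= window_ns);

    return (unsigned int)(((window_ns - idle_ns) * 100) / window_ns);
}
```

With a window of 1000 ns of which 900 ns is reported as idle, the helper yields 10. If 600 ns of that "idle" time was really work done by the idle vcpu, the value the governor should have seen is `busy_percent(1000, 300)`, that is 70.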

>> TACC_IRQ - the pcpu performs interrupts processing, without separation to
>>             guest or hypervisor interrupts
> I think, word "distinguishing" would be better than "separation"

Why so?

> 
>> TACC_GSYNC - the pcpu executes hypervisor code to process synchronous trap
>>               from the guest. E.g. hypercall processing or io emulation.
>>
>> Currently, the only reentrant state is TACC_IRQ. It is assumed that no changes
>> to a state other than TACC_IRQ can happen until we return from nested
>> interrupts. IRQ time is accounted differently from the other states:
>> it is accumulated between the transitions of the other states, and is
>> subtracted from the old state when a state transition is calculated.
>>
>> Signed-off-by: Andrii Anisov <andrii_anisov@epam.com>
>> ---
>>   xen/common/schedule.c   | 81 +++++++++++++++++++++++++++++++++++++++++++++++++
>>   xen/include/xen/sched.h | 27 +++++++++++++++++
>>   2 files changed, 108 insertions(+)
>>
>> diff --git a/xen/common/schedule.c b/xen/common/schedule.c
>> index 7b71581..6dd6603 100644
>> --- a/xen/common/schedule.c
>> +++ b/xen/common/schedule.c
>> @@ -1539,6 +1539,87 @@ static void schedule(void)
>>       context_switch(prev, next);
>>   }
>>   
>> +DEFINE_PER_CPU(struct tacc, tacc);
>> +
>> +static void tacc_state_change(enum TACC_STATES new_state)
>> +{
>> +    s_time_t now, delta;
>> +    struct tacc* tacc = &this_cpu(tacc);
>> +    unsigned long flags;
>> +
>> +    local_irq_save(flags);
>> +
>> +    now = NOW();
>> +    delta = now - tacc->state_entry_time;
>> +
>> +    /* We do not expect states reenterability (at least through this function)*/
>> +    ASSERT(new_state != tacc->state);
>> +
>> +    tacc->state_time[tacc->state] += delta - tacc->irq_time;
>> +    tacc->state_time[TACC_IRQ] += tacc->irq_time;
>> +    tacc->irq_time = 0;
>> +    tacc->state = new_state;
>> +    tacc->state_entry_time = now;
>> +
>> +    local_irq_restore(flags);
>> +}
>> +
>> +void tacc_hyp(int place)
> I believe, you want some enum for this "place" parameter type
>> +{
>> +//    printk("\ttacc_hyp %u, place %d\n", smp_processor_id(), place);
> Please, don't push commented-out code. BTW, I think it is possible to
> add some TACC_DEBUG facilities to enable/disable these traces at
> compile time.
> 
> Also, it looks like you don't use the "place" parameter at all.

Since this is an RFC, I allowed myself to leave my debug code in place. I hope it is not confusing.
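
For reference, a minimal sketch of what the suggested compile-time switch and an enum for "place" could look like. The macro `tacc_dbg`, the `enum tacc_place` values and the counter `hyp_entries` are all hypothetical, and `printf` stands in for `printk` so that the fragment builds outside the hypervisor:

```c
#include <assert.h>
#include <stdio.h>

/* Traces compile to nothing unless TACC_DEBUG is defined at build time. */
#ifdef TACC_DEBUG
#define tacc_dbg(...) printf(__VA_ARGS__)
#else
#define tacc_dbg(...) ((void)0)
#endif

/* Hypothetical call sites; a real list would come from the arch code. */
enum tacc_place {
    TACC_PLACE_TRAP_ENTRY,
    TACC_PLACE_TRAP_EXIT,
    TACC_PLACE_MAX
};

/* Stands in for the effect of tacc_state_change(TACC_HYP). */
static int hyp_entries;

static void tacc_hyp_sketch(enum tacc_place place)
{
    assert(place < TACC_PLACE_MAX);
    (void)place; /* silences the unused warning when traces are disabled */

    tacc_dbg("\ttacc_hyp, place %d\n", (int)place);
    hyp_entries++;
}
```

Building with `-DTACC_DEBUG` enables the trace; without it the call sites stay in the source but generate no code.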

> 
> Lastly, I believe that this function (and other similar functions below)
> can be defined as "static inline" in a header file

Not this time. They are mostly called from asm (at least now).

> 
>> +    tacc_state_change(TACC_HYP);
>> +}
>> +
>> +void tacc_guest(int place)
>> +{
>> +//    printk("\ttacc_guest %u, place %d\n", smp_processor_id(), place);
>> +    tacc_state_change(TACC_GUEST);
>> +}
>> +
>> +void tacc_idle(int place)
>> +{
>> +//    printk("\tidle cpu %u, place %d\n", smp_processor_id(), place);
>> +    tacc_state_change(TACC_IDLE);
>> +}
>> +
>> +void tacc_gsync(int place)
>> +{
>> +//    printk("\ttacc_gsync %u, place %d\n", smp_processor_id(), place);
>> +    tacc_state_change(TACC_GSYNC);
>> +}
>> +
>> +void tacc_irq_enter(int place)
>> +{
>> +    struct tacc* tacc = &this_cpu(tacc);
>> +
>> +//    printk("\ttacc_irq_enter %u, place %d, cnt %d\n", smp_processor_id(), place, this_cpu(tacc).irq_cnt);
>> +    ASSERT(!local_irq_is_enabled());
>> +    ASSERT(tacc->irq_cnt >= 0);
> You can make irq_cnt unsigned and drop this assert.

No. Otherwise one might get the call sequence wrong when using this on a different arch, and get no notice from the debug assertion.
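
A small sketch of the arithmetic behind this, assuming one unbalanced exit call slips through on some port. The function names are hypothetical and only model the counter, not the real tacc code:

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Signed counter after one exit too many: the value goes negative. */
static int signed_cnt_after_extra_exit(void)
{
    int cnt = 0;

    cnt--;
    return cnt;
}

/* Unsigned counter after one exit too many: it wraps to UINT_MAX. */
static unsigned int unsigned_cnt_after_extra_exit(void)
{
    unsigned int cnt = 0;

    cnt--; /* well-defined wraparound for unsigned types */
    return cnt;
}

/* Same condition as the ASSERT in tacc_irq_enter(). */
static bool entry_check_passes(int cnt)
{
    return cnt >= 0;
}
```

The signed value fails the entry check, so a debug build reports the bad sequence. The unsigned value is just a very large nesting depth, and a `>= 0` check on it is always true.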

> 
>> +
>> +    if ( tacc->irq_cnt == 0 )
>> +    {
>> +        tacc->irq_enter_time = NOW();
>> +    }
> Coding style:
> 
> ---
> Braces should be omitted for blocks with a single statement. e.g.,
> 
> if ( condition )
>      single_statement();
> ---
> 

OK.

>> +
>> +    tacc->irq_cnt++;
>> +}
>> +
>> +void tacc_irq_exit(int place)
>> +{
>> +    struct tacc* tacc = &this_cpu(tacc);
>> +
>> +//    printk("\ttacc_irq_exit %u, place %d, cnt %d\n", smp_processor_id(), place, tacc->irq_cnt);
>> +    ASSERT(!local_irq_is_enabled());
>> +    ASSERT(tacc->irq_cnt > 0);
>> +    if ( tacc->irq_cnt == 1 )
>> +    {
>> +        tacc->irq_time = NOW() - tacc->irq_enter_time;
>> +        tacc->irq_enter_time = 0;
>> +    }
>> +
>> +    tacc->irq_cnt--;
> What if an IRQ arrives right after this? I believe you will lose
> some of the accumulated time.

See ASSERT(!local_irq_is_enabled()) above.
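
To illustrate, here is a standalone model of the accounting with the clock passed in explicitly instead of NOW(). It is a sketch, not the patch: the `model_*` names and `struct tacc_model` are hypothetical, and both IRQ hooks are assumed to run with interrupts masked, as the assertions in the patch require. One deliberate difference is that the sketch adds to `irq_time` on the outermost exit where the quoted hunk assigns, so that several interrupts between two state changes are all counted, as the commit message describes.

```c
#include <assert.h>
#include <stdint.h>

enum tacc_state {
    TACC_HYP, TACC_GUEST, TACC_IDLE, TACC_IRQ, TACC_GSYNC, TACC_STATES_MAX
};

struct tacc_model {
    int64_t state_time[TACC_STATES_MAX];
    int64_t state_entry_time;
    enum tacc_state state;
    int64_t irq_enter_time;
    int64_t irq_time;   /* IRQ time since the last state change */
    int irq_cnt;        /* nesting depth */
};

static void model_irq_enter(struct tacc_model *t, int64_t now)
{
    assert(t->irq_cnt >= 0);
    if ( t->irq_cnt == 0 )
        t->irq_enter_time = now;    /* only the outermost entry is stamped */
    t->irq_cnt++;
}

static void model_irq_exit(struct tacc_model *t, int64_t now)
{
    assert(t->irq_cnt > 0);
    if ( t->irq_cnt == 1 )
        t->irq_time += now - t->irq_enter_time; /* assumption: accumulate */
    t->irq_cnt--;
}

static void model_state_change(struct tacc_model *t,
                               enum tacc_state new_state, int64_t now)
{
    int64_t delta = now - t->state_entry_time;

    assert(new_state != t->state);
    /* IRQ time is taken out of the old state and booked separately. */
    t->state_time[t->state] += delta - t->irq_time;
    t->state_time[TACC_IRQ] += t->irq_time;
    t->irq_time = 0;
    t->state = new_state;
    t->state_entry_time = now;
}
```

Starting in TACC_HYP at time 0, a nested interrupt spanning 10..20 followed by a second one spanning 30..35 and a switch to TACC_GUEST at 100 books 15 units to TACC_IRQ and 85 to TACC_HYP.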

> 
>> +}
>> +
>>   void context_saved(struct vcpu *prev)
>>   {
>>       /* Clear running flag /after/ writing context to memory. */
>> diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
>> index e3601c1..04a8724 100644
>> --- a/xen/include/xen/sched.h
>> +++ b/xen/include/xen/sched.h
>> @@ -1002,6 +1002,33 @@ extern void dump_runq(unsigned char key);
>>   
>>   void arch_do_physinfo(struct xen_sysctl_physinfo *pi);
>>   
>> +enum TACC_STATES {
> If I remember correctly, enum names should be in lower case

Ugh...

> 
>> +    TACC_HYP = 0,
>> +    TACC_GUEST = 1,
>> +    TACC_IDLE = 2,
>> +    TACC_IRQ = 3,
>> +    TACC_GSYNC = 4,
>> +    TACC_STATES_MAX
>> +};
>> +
>> +struct tacc
>> +{
>> +    s_time_t state_time[TACC_STATES_MAX];
>> +    s_time_t state_entry_time;
>> +    int state;
> enum, maybe?

Maybe.

> 
>> +
>> +    s_time_t guest_time;
>> +
>> +    s_time_t irq_enter_time;
>> +    s_time_t irq_time;
>> +    int irq_cnt;
>> +};
>> +
>> +DECLARE_PER_CPU(struct tacc, tacc);
>> +
>> +void tacc_hyp(int place);
>> +void tacc_idle(int place);
> What about functions from sched.c? Should they be declared there?

Maybe.

> 
>> +
>>   #endif /* __SCHED_H__ */
>>   
>>   /*
> 
> 

[1] https://elixir.bootlin.com/linux/latest/source/kernel/sched/cputime.c#L429

-- 
Sincerely,
Andrii Anisov.
