From: Valentin Schneider <firstname.lastname@example.org>
To: Parth Shah <email@example.com>, Patrick Bellasi <firstname.lastname@example.org>
Cc: Peter Zijlstra <email@example.com>, Subhra Mazumdar <firstname.lastname@example.org>, email@example.com, firstname.lastname@example.org, email@example.com, firstname.lastname@example.org, email@example.com, firstname.lastname@example.org, email@example.com, firstname.lastname@example.org, email@example.com, firstname.lastname@example.org
Subject: Re: [RFC PATCH 1/9] sched,cgroup: Add interface for latency-nice
Date: Fri, 6 Sep 2019 23:50:58 +0100
Message-ID: <email@example.com>
In-Reply-To: <firstname.lastname@example.org>

On 06/09/2019 18:10, Parth Shah wrote:
> Right, CPU capacity can solve the problem of indicating thermal throttling to the scheduler.
> AFAIU, the patchset from Thara changes CPU capacity to reflect the thermal headroom of the CPU.
> This is a nice mitigation, but:
> 1. Sometimes a single task is responsible for the thermal heat-up of the core; reducing the
>    CPU capacity of all the CPUs in the core is not optimal when just moving that single
>    task to another core would keep us within the thermal headroom. This is important
>    especially for servers, where there can be up to 8 threads per core.
> 2. Given the implementation in those patches and their integration with EAS, it seems difficult
>    to adapt them to servers, where CPU capacity itself is in doubt:
>    https://lkml.org/lkml/2019/5/15/1402
>

I'd nuance this to *SMT* capacity (which isn't just a server thing). The
problem is that it's difficult to come up with a sensible scheme to describe
the base capacity of a single logical CPU. But yeah, valid point.

>>
>> For active balance, we actually already have a condition that moves a task
>> to a less capacity-pressured CPU (although it is somewhat specific). So if
>> thermal pressure follows that task (e.g. it's doing tons of vector/float),
>> it will be rotated around.
>
> Agree.
> But this should break in certain conditions, e.g. when we have multiple tasks
> in a core with almost equal utilization, among which only one is doing vector
> operations. If the capacity is reduced, the load balancer can pick and move
> any of those tasks with equal probability.

Right, if/when we get things like per-unit signals (wasn't there something
about tracking AVX usage a few months back?) then we'll be able to make more
informed decisions; for now we'll need some handholding (read: task
classification).

>>
>> However there should be a point made on latency vs throughput. If you
>> care about latency you probably do not want to active balance your task. If
>
> Can you please elaborate on why not to consider active balance for latency-sensitive tasks?
> Because sometimes finding a thermally cool core is beneficial, e.g. when the
> Turbo frequency range is around 20% above the rated one.

This goes back to my reply to Patrick further up the thread. Right now,
active balance can happen just because we've been imbalanced for some time
and have repeatedly failed to migrate anything. After 3 (IIRC) successive
failed attempts, we'll active balance the running task of the remote rq we
decided was busiest.

If that happens to be a latency-sensitive task, that's not great: active
balancing means stopping that task's execution, so we're going to add some
latency to this latency-sensitive task. My proposal was to further ratelimit
active balance (e.g. require more failed attempts) when the task that would
be preempted is latency-sensitive.

My point is: if that task is doing fine where it is, why preempt it? That
just introduces latency, IMO (keeping in mind that those balance attempts
can happen despite there being no thermal pressure at all). If you care
about performance (e.g. a minimum level of throughput), to me that is a
separate (though perhaps not entirely distinct) property.

>> you care about throughput, it should be specified in some way (util-clamp
>> says hello!).
>>
>
> Yes, I do care about both latency and throughput. :-)

Don't we all!

> But I'm wondering how uclamp can solve the problem for throughput.
> If I make the thermally hot tasks appear bigger than the other tasks, then
> reducing the CPU capacity can allow such tasks to move around the chip.
> But this requires the utilization value to be relatively large compared to
> the other tasks in the core. Alternatively, the other tasks' uclamp.max can
> be lowered to make such a task rotate.
> If I got it right, this would be a difficult uclamp usecase from a user
> perspective, right? I feel like I'm missing something here.

Hmm, perhaps I was jumping the gun here. What I was getting at is that if you
have something like misfit, which migrates tasks to CPUs of higher capacity
than the one they are currently on, you could use uclamp to flag them.

You could translate your throughput requirement into a uclamp.min of e.g.
80%, and if the CPU capacity goes below that (or within some margin of it),
you'd try to migrate the task to a CPU of higher capacity (i.e. one that is
not, or less, thermally pressured). This doesn't have to involve your less
throughput-sensitive tasks, since you would only tag and take action on the
throughput-sensitive ones.