From: Marco Elver <elver@google.com> To: elver@google.com, Peter Zijlstra <peterz@infradead.org>, Frederic Weisbecker <frederic@kernel.org>, Ingo Molnar <mingo@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de>, Arnaldo Carvalho de Melo <acme@kernel.org>, Mark Rutland <mark.rutland@arm.com>, Alexander Shishkin <alexander.shishkin@linux.intel.com>, Jiri Olsa <jolsa@redhat.com>, Namhyung Kim <namhyung@kernel.org>, Dmitry Vyukov <dvyukov@google.com>, Michael Ellerman <mpe@ellerman.id.au>, linuxppc-dev@lists.ozlabs.org, linux-perf-users@vger.kernel.org, x86@kernel.org, linux-sh@vger.kernel.org, kasan-dev@googlegroups.com, linux-kernel@vger.kernel.org Subject: [PATCH v2 07/13] perf/hw_breakpoint: Remove useless code related to flexible breakpoints Date: Tue, 28 Jun 2022 11:58:27 +0200 [thread overview] Message-ID: <20220628095833.2579903-8-elver@google.com> (raw) In-Reply-To: <20220628095833.2579903-1-elver@google.com> Flexible breakpoints have never been implemented, with bp_cpuinfo::flexible always being 0. Unfortunately, they still occupy 4 bytes in each bp_cpuinfo and bp_busy_slots, as well as computing the max flexible count in fetch_bp_busy_slots(). This again causes suboptimal code generation, when we always know that `!!slots.flexible` will be 0. Just get rid of the flexible "placeholder" and remove all real code related to it. Make a note in the comment related to the constraints algorithm but don't remove them from the algorithm, so that if in future flexible breakpoints need supporting, it should be trivial to revive them (along with reverting this change). Signed-off-by: Marco Elver <elver@google.com> --- v2: * Also remove struct bp_busy_slots, and simplify functions. --- kernel/events/hw_breakpoint.c | 57 +++++++++++------------------------ 1 file changed, 17 insertions(+), 40 deletions(-) diff --git a/kernel/events/hw_breakpoint.c b/kernel/events/hw_breakpoint.c index a124786e3ade..63e39dc836bd 100644 --- a/kernel/events/hw_breakpoint.c +++ b/kernel/events/hw_breakpoint.c @@ -45,8 +45,6 @@ struct bp_cpuinfo { #else unsigned int *tsk_pinned; #endif - /* Number of non-pinned cpu/task breakpoints in a cpu */ - unsigned int flexible; /* XXX: placeholder, see fetch_this_slot() */ }; static DEFINE_PER_CPU(struct bp_cpuinfo, bp_cpuinfo[TYPE_MAX]); @@ -67,12 +65,6 @@ static const struct rhashtable_params task_bps_ht_params = { static bool constraints_initialized __ro_after_init; -/* Gather the number of total pinned and un-pinned bp in a cpuset */ -struct bp_busy_slots { - unsigned int pinned; - unsigned int flexible; -}; - /* Serialize accesses to the above constraints */ static DEFINE_MUTEX(nr_bp_mutex); @@ -190,14 +182,14 @@ static const struct cpumask *cpumask_of_bp(struct perf_event *bp) } /* - * Report the number of pinned/un-pinned breakpoints we have in - * a given cpu (cpu > -1) or in all of them (cpu = -1). + * Returns the max pinned breakpoint slots in a given + * CPU (cpu > -1) or across all of them (cpu = -1). */ -static void -fetch_bp_busy_slots(struct bp_busy_slots *slots, struct perf_event *bp, - enum bp_type_idx type) +static int +max_bp_pinned_slots(struct perf_event *bp, enum bp_type_idx type) { const struct cpumask *cpumask = cpumask_of_bp(bp); + int pinned_slots = 0; int cpu; for_each_cpu(cpu, cpumask) { @@ -210,24 +202,10 @@ fetch_bp_busy_slots(struct bp_busy_slots *slots, struct perf_event *bp, else nr += task_bp_pinned(cpu, bp, type); - if (nr > slots->pinned) - slots->pinned = nr; - - nr = info->flexible; - if (nr > slots->flexible) - slots->flexible = nr; + pinned_slots = max(nr, pinned_slots); } -} -/* - * For now, continue to consider flexible as pinned, until we can - * ensure no flexible event can ever be scheduled before a pinned event - * in a same cpu. - */ -static void -fetch_this_slot(struct bp_busy_slots *slots, int weight) -{ - slots->pinned += weight; + return pinned_slots; } /* @@ -298,7 +276,12 @@ __weak void arch_unregister_hw_breakpoint(struct perf_event *bp) } /* - * Constraints to check before allowing this new breakpoint counter: + * Constraints to check before allowing this new breakpoint counter. + * + * Note: Flexible breakpoints are currently unimplemented, but outlined in the + * below algorithm for completeness. The implementation treats flexible as + * pinned due to no guarantee that we currently always schedule flexible events + * before a pinned event in a same CPU. * * == Non-pinned counter == (Considered as pinned for now) * @@ -340,8 +323,8 @@ __weak void arch_unregister_hw_breakpoint(struct perf_event *bp) */ static int __reserve_bp_slot(struct perf_event *bp, u64 bp_type) { - struct bp_busy_slots slots = {0}; enum bp_type_idx type; + int max_pinned_slots; int weight; int ret; @@ -357,15 +340,9 @@ static int __reserve_bp_slot(struct perf_event *bp, u64 bp_type) type = find_slot_idx(bp_type); weight = hw_breakpoint_weight(bp); - fetch_bp_busy_slots(&slots, bp, type); - /* - * Simulate the addition of this breakpoint to the constraints - * and see the result. - */ - fetch_this_slot(&slots, weight); - - /* Flexible counters need to keep at least one slot */ - if (slots.pinned + (!!slots.flexible) > hw_breakpoint_slots_cached(type)) + /* Check if this new breakpoint can be satisfied across all CPUs. */ + max_pinned_slots = max_bp_pinned_slots(bp, type) + weight; + if (max_pinned_slots > hw_breakpoint_slots_cached(type)) return -ENOSPC; ret = arch_reserve_bp_slot(bp); -- 2.37.0.rc0.161.g10f37bed90-goog
WARNING: multiple messages have this Message-ID (diff)
From: Marco Elver <elver@google.com> To: elver@google.com, Peter Zijlstra <peterz@infradead.org>, Frederic Weisbecker <frederic@kernel.org>, Ingo Molnar <mingo@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com>, linux-sh@vger.kernel.org, Alexander Shishkin <alexander.shishkin@linux.intel.com>, x86@kernel.org, linuxppc-dev@lists.ozlabs.org, Arnaldo Carvalho de Melo <acme@kernel.org>, linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org, kasan-dev@googlegroups.com, Namhyung Kim <namhyung@kernel.org>, Thomas Gleixner <tglx@linutronix.de>, Jiri Olsa <jolsa@redhat.com>, Dmitry Vyukov <dvyukov@google.com> Subject: [PATCH v2 07/13] perf/hw_breakpoint: Remove useless code related to flexible breakpoints Date: Tue, 28 Jun 2022 11:58:27 +0200 [thread overview] Message-ID: <20220628095833.2579903-8-elver@google.com> (raw) In-Reply-To: <20220628095833.2579903-1-elver@google.com> Flexible breakpoints have never been implemented, with bp_cpuinfo::flexible always being 0. Unfortunately, they still occupy 4 bytes in each bp_cpuinfo and bp_busy_slots, as well as computing the max flexible count in fetch_bp_busy_slots(). This again causes suboptimal code generation, when we always know that `!!slots.flexible` will be 0. Just get rid of the flexible "placeholder" and remove all real code related to it. Make a note in the comment related to the constraints algorithm but don't remove them from the algorithm, so that if in future flexible breakpoints need supporting, it should be trivial to revive them (along with reverting this change). Signed-off-by: Marco Elver <elver@google.com> --- v2: * Also remove struct bp_busy_slots, and simplify functions. --- kernel/events/hw_breakpoint.c | 57 +++++++++++------------------------ 1 file changed, 17 insertions(+), 40 deletions(-) diff --git a/kernel/events/hw_breakpoint.c b/kernel/events/hw_breakpoint.c index a124786e3ade..63e39dc836bd 100644 --- a/kernel/events/hw_breakpoint.c +++ b/kernel/events/hw_breakpoint.c @@ -45,8 +45,6 @@ struct bp_cpuinfo { #else unsigned int *tsk_pinned; #endif - /* Number of non-pinned cpu/task breakpoints in a cpu */ - unsigned int flexible; /* XXX: placeholder, see fetch_this_slot() */ }; static DEFINE_PER_CPU(struct bp_cpuinfo, bp_cpuinfo[TYPE_MAX]); @@ -67,12 +65,6 @@ static const struct rhashtable_params task_bps_ht_params = { static bool constraints_initialized __ro_after_init; -/* Gather the number of total pinned and un-pinned bp in a cpuset */ -struct bp_busy_slots { - unsigned int pinned; - unsigned int flexible; -}; - /* Serialize accesses to the above constraints */ static DEFINE_MUTEX(nr_bp_mutex); @@ -190,14 +182,14 @@ static const struct cpumask *cpumask_of_bp(struct perf_event *bp) } /* - * Report the number of pinned/un-pinned breakpoints we have in - * a given cpu (cpu > -1) or in all of them (cpu = -1). + * Returns the max pinned breakpoint slots in a given + * CPU (cpu > -1) or across all of them (cpu = -1). */ -static void -fetch_bp_busy_slots(struct bp_busy_slots *slots, struct perf_event *bp, - enum bp_type_idx type) +static int +max_bp_pinned_slots(struct perf_event *bp, enum bp_type_idx type) { const struct cpumask *cpumask = cpumask_of_bp(bp); + int pinned_slots = 0; int cpu; for_each_cpu(cpu, cpumask) { @@ -210,24 +202,10 @@ fetch_bp_busy_slots(struct bp_busy_slots *slots, struct perf_event *bp, else nr += task_bp_pinned(cpu, bp, type); - if (nr > slots->pinned) - slots->pinned = nr; - - nr = info->flexible; - if (nr > slots->flexible) - slots->flexible = nr; + pinned_slots = max(nr, pinned_slots); } -} -/* - * For now, continue to consider flexible as pinned, until we can - * ensure no flexible event can ever be scheduled before a pinned event - * in a same cpu. - */ -static void -fetch_this_slot(struct bp_busy_slots *slots, int weight) -{ - slots->pinned += weight; + return pinned_slots; } /* @@ -298,7 +276,12 @@ __weak void arch_unregister_hw_breakpoint(struct perf_event *bp) } /* - * Constraints to check before allowing this new breakpoint counter: + * Constraints to check before allowing this new breakpoint counter. + * + * Note: Flexible breakpoints are currently unimplemented, but outlined in the + * below algorithm for completeness. The implementation treats flexible as + * pinned due to no guarantee that we currently always schedule flexible events + * before a pinned event in a same CPU. * * == Non-pinned counter == (Considered as pinned for now) * @@ -340,8 +323,8 @@ __weak void arch_unregister_hw_breakpoint(struct perf_event *bp) */ static int __reserve_bp_slot(struct perf_event *bp, u64 bp_type) { - struct bp_busy_slots slots = {0}; enum bp_type_idx type; + int max_pinned_slots; int weight; int ret; @@ -357,15 +340,9 @@ static int __reserve_bp_slot(struct perf_event *bp, u64 bp_type) type = find_slot_idx(bp_type); weight = hw_breakpoint_weight(bp); - fetch_bp_busy_slots(&slots, bp, type); - /* - * Simulate the addition of this breakpoint to the constraints - * and see the result. - */ - fetch_this_slot(&slots, weight); - - /* Flexible counters need to keep at least one slot */ - if (slots.pinned + (!!slots.flexible) > hw_breakpoint_slots_cached(type)) + /* Check if this new breakpoint can be satisfied across all CPUs. */ + max_pinned_slots = max_bp_pinned_slots(bp, type) + weight; + if (max_pinned_slots > hw_breakpoint_slots_cached(type)) return -ENOSPC; ret = arch_reserve_bp_slot(bp); -- 2.37.0.rc0.161.g10f37bed90-goog
next prev parent reply other threads:[~2022-06-28 9:59 UTC|newest] Thread overview: 64+ messages / expand[flat|nested] mbox.gz Atom feed top 2022-06-28 9:58 [PATCH v2 00/13] perf/hw_breakpoint: Optimize for thousands of tasks Marco Elver 2022-06-28 9:58 ` Marco Elver 2022-06-28 9:58 ` [PATCH v2 01/13] perf/hw_breakpoint: Add KUnit test for constraints accounting Marco Elver 2022-06-28 9:58 ` Marco Elver 2022-06-28 12:53 ` Dmitry Vyukov 2022-06-28 12:53 ` Dmitry Vyukov 2022-06-28 13:26 ` Marco Elver 2022-06-28 13:26 ` Marco Elver 2022-06-28 9:58 ` [PATCH v2 02/13] perf/hw_breakpoint: Clean up headers Marco Elver 2022-06-28 9:58 ` Marco Elver 2022-06-28 9:58 ` [PATCH v2 03/13] perf/hw_breakpoint: Optimize list of per-task breakpoints Marco Elver 2022-06-28 9:58 ` Marco Elver 2022-06-28 13:08 ` Dmitry Vyukov 2022-06-28 13:08 ` Dmitry Vyukov 2022-06-28 14:53 ` Marco Elver 2022-06-28 14:53 ` Marco Elver 2022-06-28 15:27 ` Dmitry Vyukov 2022-06-28 15:27 ` Dmitry Vyukov 2022-06-28 9:58 ` [PATCH v2 04/13] perf/hw_breakpoint: Mark data __ro_after_init Marco Elver 2022-06-28 9:58 ` Marco Elver 2022-06-28 9:58 ` [PATCH v2 05/13] perf/hw_breakpoint: Optimize constant number of breakpoint slots Marco Elver 2022-06-28 9:58 ` Marco Elver 2022-06-28 9:58 ` [PATCH v2 06/13] perf/hw_breakpoint: Make hw_breakpoint_weight() inlinable Marco Elver 2022-06-28 9:58 ` Marco Elver 2022-06-28 13:16 ` Dmitry Vyukov 2022-06-28 13:16 ` Dmitry Vyukov 2022-06-28 9:58 ` Marco Elver [this message] 2022-06-28 9:58 ` [PATCH v2 07/13] perf/hw_breakpoint: Remove useless code related to flexible breakpoints Marco Elver 2022-06-28 13:18 ` Dmitry Vyukov 2022-06-28 13:18 ` Dmitry Vyukov 2022-06-28 9:58 ` [PATCH v2 08/13] powerpc/hw_breakpoint: Avoid relying on caller synchronization Marco Elver 2022-06-28 9:58 ` Marco Elver 2022-06-28 13:21 ` Dmitry Vyukov 2022-06-28 13:21 ` Dmitry Vyukov 2022-07-01 8:54 ` Christophe Leroy 2022-07-01 8:54 ` Christophe Leroy 2022-07-01 9:41 ` Marco Elver 2022-07-01 9:41 ` Marco Elver 2022-07-01 10:15 ` Christophe Leroy 2022-07-01 10:15 ` Christophe Leroy 2022-06-28 9:58 ` [PATCH v2 09/13] locking/percpu-rwsem: Add percpu_is_write_locked() and percpu_is_read_locked() Marco Elver 2022-06-28 9:58 ` Marco Elver 2022-06-28 14:44 ` Dmitry Vyukov 2022-06-28 14:44 ` Dmitry Vyukov 2022-06-28 9:58 ` [PATCH v2 10/13] perf/hw_breakpoint: Reduce contention with large number of tasks Marco Elver 2022-06-28 9:58 ` Marco Elver 2022-06-28 14:45 ` Dmitry Vyukov 2022-06-28 14:45 ` Dmitry Vyukov 2022-06-28 9:58 ` [PATCH v2 11/13] perf/hw_breakpoint: Introduce bp_slots_histogram Marco Elver 2022-06-28 9:58 ` Marco Elver 2022-06-28 14:52 ` Dmitry Vyukov 2022-06-28 14:52 ` Dmitry Vyukov 2022-06-28 9:58 ` [PATCH v2 12/13] perf/hw_breakpoint: Optimize max_bp_pinned_slots() for CPU-independent task targets Marco Elver 2022-06-28 9:58 ` Marco Elver 2022-06-28 15:41 ` Dmitry Vyukov 2022-06-28 15:41 ` Dmitry Vyukov 2022-06-28 9:58 ` [PATCH v2 13/13] perf/hw_breakpoint: Optimize toggle_bp_slot() " Marco Elver 2022-06-28 9:58 ` Marco Elver 2022-06-28 10:54 ` Marco Elver 2022-06-28 10:54 ` Marco Elver 2022-06-28 15:45 ` Dmitry Vyukov 2022-06-28 15:45 ` Dmitry Vyukov 2022-06-28 16:00 ` Marco Elver 2022-06-28 16:00 ` Marco Elver
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=20220628095833.2579903-8-elver@google.com \ --to=elver@google.com \ --cc=acme@kernel.org \ --cc=alexander.shishkin@linux.intel.com \ --cc=dvyukov@google.com \ --cc=frederic@kernel.org \ --cc=jolsa@redhat.com \ --cc=kasan-dev@googlegroups.com \ --cc=linux-kernel@vger.kernel.org \ --cc=linux-perf-users@vger.kernel.org \ --cc=linux-sh@vger.kernel.org \ --cc=linuxppc-dev@lists.ozlabs.org \ --cc=mark.rutland@arm.com \ --cc=mingo@kernel.org \ --cc=mpe@ellerman.id.au \ --cc=namhyung@kernel.org \ --cc=peterz@infradead.org \ --cc=tglx@linutronix.de \ --cc=x86@kernel.org \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: linkBe sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.