Date: Mon, 31 Jan 2022 18:16:03 -0300
From: Arnaldo Carvalho de Melo
To:
Alexey Bayduraev
Cc: Jiri Olsa, Namhyung Kim, Alexander Shishkin, Peter Zijlstra,
	Ingo Molnar, linux-kernel, Andi Kleen, Adrian Hunter,
	Alexander Antonov, Alexei Budankov, Riccardo Mancini
Subject: Re: [PATCH v13 01/16] perf record: Introduce thread affinity and mmap masks
References: <9042bf7daf988e17e17e6acbf5d29590bde869cd.1642440724.git.alexey.v.bayduraev@linux.intel.com>

Em Mon, Jan 31, 2022 at 06:00:31PM -0300, Arnaldo Carvalho de Melo escreveu:
> Em Mon, Jan 17, 2022 at 09:34:21PM +0300, Alexey Bayduraev escreveu:
> > Introduce affinity and mmap thread masks. Thread affinity mask
> > defines CPUs that a thread is allowed to run on. Thread maps
> > mask defines mmap data buffers the thread serves to stream
> > profiling data from.
> >
> > Acked-by: Andi Kleen
> > Acked-by: Namhyung Kim
> > Reviewed-by: Riccardo Mancini
> > Tested-by: Riccardo Mancini
> > Signed-off-by: Alexey Bayduraev
>
> Some simplifications I added here while reviewing this patchkit:

But then, why allocate these even without using them? I.e. the init
should be left for when we are sure that we'll actually use this, i.e.
when the user asks for parallel mode.

We already have lots of needless initializations, reading of files that
may not be needed, so we should avoid doing things till we really know
that we'll use those allocations, readings, etc.

Anyway, continuing to review, will leave what I have at a separate
branch so that we can continue from there.
- Arnaldo

> diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
> index 41998f2140cd5119..53b88c8600624237 100644
> --- a/tools/perf/builtin-record.c
> +++ b/tools/perf/builtin-record.c
> @@ -2213,35 +2213,33 @@ static int record__parse_affinity(const struct option *opt, const char *str, int
>  
>  static int record__mmap_cpu_mask_alloc(struct mmap_cpu_mask *mask, int nr_bits)
>  {
> -	mask->nbits = nr_bits;
>  	mask->bits = bitmap_zalloc(mask->nbits);
>  	if (!mask->bits)
>  		return -ENOMEM;
>  
> +	mask->nbits = nr_bits;
>  	return 0;
>  }
>  
>  static void record__mmap_cpu_mask_free(struct mmap_cpu_mask *mask)
>  {
>  	bitmap_free(mask->bits);
> +	mask->bits = NULL;
>  	mask->nbits = 0;
>  }
>  
>  static int record__thread_mask_alloc(struct thread_mask *mask, int nr_bits)
>  {
> -	int ret;
> +	int ret = record__mmap_cpu_mask_alloc(&mask->maps, nr_bits);
>  
> -	ret = record__mmap_cpu_mask_alloc(&mask->maps, nr_bits);
>  	if (ret) {
>  		mask->affinity.bits = NULL;
>  		return ret;
>  	}
>  
>  	ret = record__mmap_cpu_mask_alloc(&mask->affinity, nr_bits);
> -	if (ret) {
> +	if (ret)
>  		record__mmap_cpu_mask_free(&mask->maps);
> -		mask->maps.bits = NULL;
> -	}
>  
>  	return ret;
>  }
> @@ -2733,18 +2731,14 @@ struct option *record_options = __record_options;
>  
>  static void record__mmap_cpu_mask_init(struct mmap_cpu_mask *mask, struct perf_cpu_map *cpus)
>  {
> -	int c;
> -
> -	for (c = 0; c < cpus->nr; c++)
> +	for (int c = 0; c < cpus->nr; c++)
>  		set_bit(cpus->map[c].cpu, mask->bits);
>  }
>  
>  static void record__free_thread_masks(struct record *rec, int nr_threads)
>  {
> -	int t;
> -
>  	if (rec->thread_masks)
> -		for (t = 0; t < nr_threads; t++)
> +		for (int t = 0; t < nr_threads; t++)
>  			record__thread_mask_free(&rec->thread_masks[t]);
>  
>  	zfree(&rec->thread_masks);
> @@ -2752,7 +2746,7 @@ static void record__free_thread_masks(struct record *rec, int nr_threads)
>  
>  static int record__alloc_thread_masks(struct record *rec, int nr_threads, int nr_bits)
>  {
> -	int t, ret;
> +	int ret;
>  
>  	rec->thread_masks = zalloc(nr_threads * sizeof(*(rec->thread_masks)));
>  	if (!rec->thread_masks) {
> @@ -2760,7 +2754,7 @@ static int record__alloc_thread_masks(struct record *rec, int nr_threads, int nr
>  		return -ENOMEM;
>  	}
>  
> -	for (t = 0; t < nr_threads; t++) {
> +	for (int t = 0; t < nr_threads; t++) {
>  		ret = record__thread_mask_alloc(&rec->thread_masks[t], nr_bits);
>  		if (ret) {
>  			pr_err("Failed to allocate thread masks[%d]\n", t);
> @@ -2778,9 +2772,7 @@ static int record__alloc_thread_masks(struct record *rec, int nr_threads, int nr
>  
>  static int record__init_thread_default_masks(struct record *rec, struct perf_cpu_map *cpus)
>  {
> -	int ret;
> -
> -	ret = record__alloc_thread_masks(rec, 1, cpu__max_cpu().cpu);
> +	int ret = record__alloc_thread_masks(rec, 1, cpu__max_cpu().cpu);
>  	if (ret)
>  		return ret;
>  
> 
> 
> > ---
> >  tools/perf/builtin-record.c | 123 ++++++++++++++++++++++++++++++++++++
> >  1 file changed, 123 insertions(+)
> > 
> > diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
> > index bb716c953d02..41998f2140cd 100644
> > --- a/tools/perf/builtin-record.c
> > +++ b/tools/perf/builtin-record.c
> > @@ -87,6 +87,11 @@ struct switch_output {
> >  	int		 cur_file;
> >  };
> >  
> > +struct thread_mask {
> > +	struct mmap_cpu_mask	maps;
> > +	struct mmap_cpu_mask	affinity;
> > +};
> > +
> >  struct record {
> >  	struct perf_tool	tool;
> >  	struct record_opts	opts;
> > @@ -112,6 +117,8 @@ struct record {
> >  	struct mmap_cpu_mask	affinity_mask;
> >  	unsigned long		output_max_size;	/* = 0: unlimited */
> >  	struct perf_debuginfod	debuginfod;
> > +	int			nr_threads;
> > +	struct thread_mask	*thread_masks;
> >  };
> >  
> >  static volatile int done;
> > @@ -2204,6 +2211,47 @@ static int record__parse_affinity(const struct option *opt, const char *str, int
> >  	return 0;
> >  }
> >  
> > +static int record__mmap_cpu_mask_alloc(struct mmap_cpu_mask *mask, int nr_bits)
> > +{
> > +	mask->nbits = nr_bits;
> > +	mask->bits = bitmap_zalloc(mask->nbits);
> > +
> > +	if (!mask->bits)
> > +		return -ENOMEM;
> > +
> > +	return 0;
> > +}
> > +
> > +static void record__mmap_cpu_mask_free(struct mmap_cpu_mask *mask)
> > +{
> > +	bitmap_free(mask->bits);
> > +	mask->nbits = 0;
> > +}
> > +
> > +static int record__thread_mask_alloc(struct thread_mask *mask, int nr_bits)
> > +{
> > +	int ret;
> > +
> > +	ret = record__mmap_cpu_mask_alloc(&mask->maps, nr_bits);
> > +	if (ret) {
> > +		mask->affinity.bits = NULL;
> > +		return ret;
> > +	}
> > +
> > +	ret = record__mmap_cpu_mask_alloc(&mask->affinity, nr_bits);
> > +	if (ret) {
> > +		record__mmap_cpu_mask_free(&mask->maps);
> > +		mask->maps.bits = NULL;
> > +	}
> > +
> > +	return ret;
> > +}
> > +
> > +static void record__thread_mask_free(struct thread_mask *mask)
> > +{
> > +	record__mmap_cpu_mask_free(&mask->maps);
> > +	record__mmap_cpu_mask_free(&mask->affinity);
> > +}
> > +
> >  static int parse_output_max_size(const struct option *opt,
> >  				 const char *str, int unset)
> >  {
> > @@ -2683,6 +2731,73 @@ static struct option __record_options[] = {
> >  
> >  struct option *record_options = __record_options;
> >  
> > +static void record__mmap_cpu_mask_init(struct mmap_cpu_mask *mask, struct perf_cpu_map *cpus)
> > +{
> > +	int c;
> > +
> > +	for (c = 0; c < cpus->nr; c++)
> > +		set_bit(cpus->map[c].cpu, mask->bits);
> > +}
> > +
> > +static void record__free_thread_masks(struct record *rec, int nr_threads)
> > +{
> > +	int t;
> > +
> > +	if (rec->thread_masks)
> > +		for (t = 0; t < nr_threads; t++)
> > +			record__thread_mask_free(&rec->thread_masks[t]);
> > +
> > +	zfree(&rec->thread_masks);
> > +}
> > +
> > +static int record__alloc_thread_masks(struct record *rec, int nr_threads, int nr_bits)
> > +{
> > +	int t, ret;
> > +
> > +	rec->thread_masks = zalloc(nr_threads * sizeof(*(rec->thread_masks)));
> > +	if (!rec->thread_masks) {
> > +		pr_err("Failed to allocate thread masks\n");
> > +		return -ENOMEM;
> > +	}
> > +
> > +	for (t = 0; t < nr_threads; t++) {
> > +		ret = record__thread_mask_alloc(&rec->thread_masks[t], nr_bits);
> > +		if (ret) {
> > +			pr_err("Failed to allocate thread masks[%d]\n", t);
> > +			goto out_free;
> > +		}
> > +	}
> > +
> > +	return 0;
> > +
> > +out_free:
> > +	record__free_thread_masks(rec, nr_threads);
> > +
> > +	return ret;
> > +}
> > +
> > +static int record__init_thread_default_masks(struct record *rec, struct perf_cpu_map *cpus)
> > +{
> > +	int ret;
> > +
> > +	ret = record__alloc_thread_masks(rec, 1, cpu__max_cpu().cpu);
> > +	if (ret)
> > +		return ret;
> > +
> > +	record__mmap_cpu_mask_init(&rec->thread_masks->maps, cpus);
> > +
> > +	rec->nr_threads = 1;
> > +
> > +	return 0;
> > +}
> > +
> > +static int record__init_thread_masks(struct record *rec)
> > +{
> > +	struct perf_cpu_map *cpus = rec->evlist->core.cpus;
> > +
> > +	return record__init_thread_default_masks(rec, cpus);
> > +}
> > +
> >  int cmd_record(int argc, const char **argv)
> >  {
> >  	int err;
> > @@ -2948,6 +3063,12 @@ int cmd_record(int argc, const char **argv)
> >  		goto out;
> >  	}
> >  
> > +	err = record__init_thread_masks(rec);
> > +	if (err) {
> > +		pr_err("Failed to initialize parallel data streaming masks\n");
> > +		goto out;
> > +	}
> > +
> >  	if (rec->opts.nr_cblocks > nr_cblocks_max)
> >  		rec->opts.nr_cblocks = nr_cblocks_max;
> >  	pr_debug("nr_cblocks: %d\n", rec->opts.nr_cblocks);
> > @@ -2966,6 +3087,8 @@ int cmd_record(int argc, const char **argv)
> >  	symbol__exit();
> >  	auxtrace_record__free(rec->itr);
> > out_opts:
> > +	record__free_thread_masks(rec, rec->nr_threads);
> > +	rec->nr_threads = 0;
> >  	evlist__close_control(rec->opts.ctl_fd, rec->opts.ctl_fd_ack, &rec->opts.ctl_fd_close);
> >  	return err;
> > }
> > -- 
> > 2.19.0
> 
> -- 
> 
> - Arnaldo

-- 

- Arnaldo