From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1756082AbdIHOS6 (ORCPT ); Fri, 8 Sep 2017 10:18:58 -0400 Received: from mail.kernel.org ([198.145.29.99]:40086 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753183AbdIHOS4 (ORCPT ); Fri, 8 Sep 2017 10:18:56 -0400 DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 09B2A22B21 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=kernel.org Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=acme@kernel.org Date: Fri, 8 Sep 2017 11:18:53 -0300 From: Arnaldo Carvalho de Melo To: kan.liang@intel.com Cc: peterz@infradead.org, mingo@redhat.com, linux-kernel@vger.kernel.org, jolsa@kernel.org, namhyung@kernel.org, adrian.hunter@intel.com, lukasz.odzioba@intel.com, ak@linux.intel.com Subject: Re: [PATCH RFC 01/10] perf tools: hashtable for machine threads Message-ID: <20170908141853.GF11725@kernel.org> References: <1504806954-150842-1-git-send-email-kan.liang@intel.com> <1504806954-150842-2-git-send-email-kan.liang@intel.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1504806954-150842-2-git-send-email-kan.liang@intel.com> X-Url: http://acmel.wordpress.com User-Agent: Mutt/1.8.3 (2017-05-23) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Em Thu, Sep 07, 2017 at 10:55:45AM -0700, kan.liang@intel.com escreveu: > From: Kan Liang > > To process any events, it needs to find the thread in the machine first. > The machine maintains a rb tree to store all threads. The rb tree is > protected by a rw lock. > It is not a problem for current perf which serially processing events. > However, it will have scalability performance issue to process events in > parallel, especially on a heave load system which have many threads. > > Introduce a hashtable to divide the big rb tree into many samll rb tree > for threads. The index is thread id % hashtable size. It can reduce the > lock contention. > +++ b/tools/perf/util/machine.h > @@ -23,6 +23,17 @@ extern const char *ref_reloc_sym_names[]; > > struct vdso_info; > > +#define MACHINE_TH_TABLE_BITS 8 > +#define MACHINE_TH_TABLE_SIZE (1 << MACHINE_TH_TABLE_BITS) > + > +struct machine_th { > + struct rb_root threads; > + pthread_rwlock_t threads_lock; > + unsigned int nr_threads; > + struct list_head dead_threads; > + struct thread *last_match; > +}; > + Call it just "threads", no need to call it then threads->threads, but threads->entries, also no threads->threads_lock, but threads->lock, threads->deads, threads->nr. MACHINE_TH_TABLE_SIZE -> THREADS__TABLE_SIZE, etc. > struct machine { > struct rb_node rb_node; > pid_t pid; > @@ -30,11 +41,7 @@ struct machine { > bool comm_exec; > bool kptr_restrict_warned; > char *root_dir; > - struct rb_root threads; > - pthread_rwlock_t threads_lock; > - unsigned int nr_threads; > - struct list_head dead_threads; > - struct thread *last_match; > + struct machine_th threads[MACHINE_TH_TABLE_SIZE]; > struct vdso_info *vdso_info; > struct perf_env *env; > struct dsos dsos; > @@ -49,6 +56,12 @@ struct machine { > }; > > static inline > +struct machine_th *machine_thread(struct machine *machine, pid_t tid) We separate the class name (machine) from the method name (thread) using double underscores, i.e. the above becomes: static inline threads *machine__threads(struct machine *machine, pid_t tid) > +{ > + return &machine->threads[tid % MACHINE_TH_TABLE_SIZE]; > +} > + > +static inline > struct map *__machine__kernel_map(struct machine *machine, enum map_type type) > { > return machine->vmlinux_maps[type]; > diff --git a/tools/perf/util/rb_resort.h b/tools/perf/util/rb_resort.h > index 808cc45..bbd78fa 100644 > --- a/tools/perf/util/rb_resort.h > +++ b/tools/perf/util/rb_resort.h > @@ -143,7 +143,8 @@ struct __name##_sorted *__name = __name##_sorted__new > __ilist->rblist.nr_entries) > > /* For 'struct machine->threads' */ > -#define DECLARE_RESORT_RB_MACHINE_THREADS(__name, __machine) \ > - DECLARE_RESORT_RB(__name)(&__machine->threads, __machine->nr_threads) > +#define DECLARE_RESORT_RB_MACHINE_THREADS(__name, __machine, tid) \ > + DECLARE_RESORT_RB(__name)(&__machine->threads[tid].threads, \ > + __machine->threads[tid].nr_threads) > > #endif /* _PERF_RESORT_RB_H_ */ > -- > 2.5.5