All of lore.kernel.org
 help / color / mirror / Atom feed
From: Arnaldo Carvalho de Melo <acme@kernel.org>
To: Ian Rogers <irogers@google.com>
Cc: "Peter Zijlstra" <peterz@infradead.org>,
	"Ingo Molnar" <mingo@redhat.com>,
	"Namhyung Kim" <namhyung@kernel.org>,
	"Mark Rutland" <mark.rutland@arm.com>,
	"Alexander Shishkin" <alexander.shishkin@linux.intel.com>,
	"Jiri Olsa" <jolsa@kernel.org>,
	"Adrian Hunter" <adrian.hunter@intel.com>,
	"James Clark" <james.clark@arm.com>,
	"Athira Rajeev" <atrajeev@linux.vnet.ibm.com>,
	"Colin Ian King" <colin.i.king@gmail.com>,
	"Ahelenia Ziemiańska" <nabijaczleweli@nabijaczleweli.xyz>,
	"Leo Yan" <leo.yan@linux.dev>, "Song Liu" <song@kernel.org>,
	"Miguel Ojeda" <ojeda@kernel.org>,
	"Liam Howlett" <liam.howlett@oracle.com>,
	"Ilkka Koskinen" <ilkka@os.amperecomputing.com>,
	"Ben Gainey" <ben.gainey@arm.com>,
	"K Prateek Nayak" <kprateek.nayak@amd.com>,
	"Kan Liang" <kan.liang@linux.intel.com>,
	"Yanteng Si" <siyanteng@loongson.cn>,
	"Ravi Bangoria" <ravi.bangoria@amd.com>,
	"Sun Haiyong" <sunhaiyong@loongson.cn>,
	"Changbin Du" <changbin.du@huawei.com>,
	"Masami Hiramatsu" <mhiramat@kernel.org>,
	zhaimingbing <zhaimingbing@cmss.chinamobile.com>,
	"Paran Lee" <p4ranlee@gmail.com>, "Li Dong" <lidong@vivo.com>,
	elfring@users.sourceforge.net, "Andi Kleen" <ak@linux.intel.com>,
	"Markus Elfring" <Markus.Elfring@web.de>,
	"Chengen Du" <chengen.du@canonical.com>,
	linux-perf-users@vger.kernel.org, linux-kernel@vger.kernel.org,
	bpf@vger.kernel.org
Subject: Re: [PATCH v2 01/13] perf dso: Reorder variables to save space in struct dso
Date: Thu, 21 Mar 2024 17:28:09 -0300	[thread overview]
Message-ID: <ZfyYWcu0E-Pcn6Rf@x1> (raw)
In-Reply-To: <20240321160300.1635121-2-irogers@google.com>

On Thu, Mar 21, 2024 at 09:02:48AM -0700, Ian Rogers wrote:
> Save 40 bytes and move from 8 to 7 cache lines. Make variable dwfl
> dependent on being a powerpc build. Squeeze bits of int/enum types
> when appropriate. Remove holes/padding by reordering variables.

Thanks, applied.

- Arnaldo
 
> Before:
> ```
> struct dso {
>         struct mutex               lock;                 /*     0    40 */
>         struct list_head           node;                 /*    40    16 */
>         struct rb_node             rb_node __attribute__((__aligned__(8))); /*    56    24 */
>         /* --- cacheline 1 boundary (64 bytes) was 16 bytes ago --- */
>         struct rb_root *           root;                 /*    80     8 */
>         struct rb_root_cached      symbols;              /*    88    16 */
>         struct symbol * *          symbol_names;         /*   104     8 */
>         size_t                     symbol_names_len;     /*   112     8 */
>         struct rb_root_cached      inlined_nodes;        /*   120    16 */
>         /* --- cacheline 2 boundary (128 bytes) was 8 bytes ago --- */
>         struct rb_root_cached      srclines;             /*   136    16 */
>         struct {
>                 u64                addr;                 /*   152     8 */
>                 struct symbol *    symbol;               /*   160     8 */
>         } last_find_result;                              /*   152    16 */
>         void *                     a2l;                  /*   168     8 */
>         char *                     symsrc_filename;      /*   176     8 */
>         unsigned int               a2l_fails;            /*   184     4 */
>         enum dso_space_type        kernel;               /*   188     4 */
>         /* --- cacheline 3 boundary (192 bytes) --- */
>         _Bool                      is_kmod;              /*   192     1 */
> 
>         /* XXX 3 bytes hole, try to pack */
> 
>         enum dso_swap_type         needs_swap;           /*   196     4 */
>         enum dso_binary_type       symtab_type;          /*   200     4 */
>         enum dso_binary_type       binary_type;          /*   204     4 */
>         enum dso_load_errno        load_errno;           /*   208     4 */
>         u8                         adjust_symbols:1;     /*   212: 0  1 */
>         u8                         has_build_id:1;       /*   212: 1  1 */
>         u8                         header_build_id:1;    /*   212: 2  1 */
>         u8                         has_srcline:1;        /*   212: 3  1 */
>         u8                         hit:1;                /*   212: 4  1 */
>         u8                         annotate_warned:1;    /*   212: 5  1 */
>         u8                         auxtrace_warned:1;    /*   212: 6  1 */
>         u8                         short_name_allocated:1; /*   212: 7  1 */
>         u8                         long_name_allocated:1; /*   213: 0  1 */
>         u8                         is_64_bit:1;          /*   213: 1  1 */
> 
>         /* XXX 6 bits hole, try to pack */
> 
>         _Bool                      sorted_by_name;       /*   214     1 */
>         _Bool                      loaded;               /*   215     1 */
>         u8                         rel;                  /*   216     1 */
> 
>         /* XXX 7 bytes hole, try to pack */
> 
>         struct build_id            bid;                  /*   224    32 */
>         /* --- cacheline 4 boundary (256 bytes) --- */
>         u64                        text_offset;          /*   256     8 */
>         u64                        text_end;             /*   264     8 */
>         const char  *              short_name;           /*   272     8 */
>         const char  *              long_name;            /*   280     8 */
>         u16                        long_name_len;        /*   288     2 */
>         u16                        short_name_len;       /*   290     2 */
> 
>         /* XXX 4 bytes hole, try to pack */
> 
>         void *                     dwfl;                 /*   296     8 */
>         struct auxtrace_cache *    auxtrace_cache;       /*   304     8 */
>         int                        comp;                 /*   312     4 */
> 
>         /* XXX 4 bytes hole, try to pack */
> 
>         /* --- cacheline 5 boundary (320 bytes) --- */
>         struct {
>                 struct rb_root     cache;                /*   320     8 */
>                 int                fd;                   /*   328     4 */
>                 int                status;               /*   332     4 */
>                 u32                status_seen;          /*   336     4 */
> 
>                 /* XXX 4 bytes hole, try to pack */
> 
>                 u64                file_size;            /*   344     8 */
>                 struct list_head   open_entry;           /*   352    16 */
>                 u64                elf_base_addr;        /*   368     8 */
>                 u64                debug_frame_offset;   /*   376     8 */
>                 /* --- cacheline 6 boundary (384 bytes) --- */
>                 u64                eh_frame_hdr_addr;    /*   384     8 */
>                 u64                eh_frame_hdr_offset;  /*   392     8 */
>         } data;                                          /*   320    80 */
>         struct {
>                 u32                id;                   /*   400     4 */
>                 u32                sub_id;               /*   404     4 */
>                 struct perf_env *  env;                  /*   408     8 */
>         } bpf_prog;                                      /*   400    16 */
>         union {
>                 void *             priv;                 /*   416     8 */
>                 u64                db_id;                /*   416     8 */
>         };                                               /*   416     8 */
>         struct nsinfo *            nsinfo;               /*   424     8 */
>         struct dso_id              id;                   /*   432    24 */
>         /* --- cacheline 7 boundary (448 bytes) was 8 bytes ago --- */
>         refcount_t                 refcnt;               /*   456     4 */
>         char                       name[];               /*   460     0 */
> 
>         /* size: 464, cachelines: 8, members: 49 */
>         /* sum members: 440, holes: 4, sum holes: 18 */
>         /* sum bitfield members: 10 bits, bit holes: 1, sum bit holes: 6 bits */
>         /* padding: 4 */
>         /* forced alignments: 1 */
>         /* last cacheline: 16 bytes */
> } __attribute__((__aligned__(8)));
> ```
> 
> After:
> ```
> struct dso {
>         struct mutex               lock;                 /*     0    40 */
>         struct list_head           node;                 /*    40    16 */
>         struct rb_node             rb_node __attribute__((__aligned__(8))); /*    56    24 */
>         /* --- cacheline 1 boundary (64 bytes) was 16 bytes ago --- */
>         struct rb_root *           root;                 /*    80     8 */
>         struct rb_root_cached      symbols;              /*    88    16 */
>         struct symbol * *          symbol_names;         /*   104     8 */
>         size_t                     symbol_names_len;     /*   112     8 */
>         struct rb_root_cached      inlined_nodes;        /*   120    16 */
>         /* --- cacheline 2 boundary (128 bytes) was 8 bytes ago --- */
>         struct rb_root_cached      srclines;             /*   136    16 */
>         struct {
>                 u64                addr;                 /*   152     8 */
>                 struct symbol *    symbol;               /*   160     8 */
>         } last_find_result;                              /*   152    16 */
>         struct build_id            bid;                  /*   168    32 */
>         /* --- cacheline 3 boundary (192 bytes) was 8 bytes ago --- */
>         u64                        text_offset;          /*   200     8 */
>         u64                        text_end;             /*   208     8 */
>         const char  *              short_name;           /*   216     8 */
>         const char  *              long_name;            /*   224     8 */
>         void *                     a2l;                  /*   232     8 */
>         char *                     symsrc_filename;      /*   240     8 */
>         struct nsinfo *            nsinfo;               /*   248     8 */
>         /* --- cacheline 4 boundary (256 bytes) --- */
>         struct auxtrace_cache *    auxtrace_cache;       /*   256     8 */
>         union {
>                 void *             priv;                 /*   264     8 */
>                 u64                db_id;                /*   264     8 */
>         };                                               /*   264     8 */
>         struct {
>                 struct perf_env *  env;                  /*   272     8 */
>                 u32                id;                   /*   280     4 */
>                 u32                sub_id;               /*   284     4 */
>         } bpf_prog;                                      /*   272    16 */
>         struct {
>                 struct rb_root     cache;                /*   288     8 */
>                 struct list_head   open_entry;           /*   296    16 */
>                 u64                file_size;            /*   312     8 */
>                 /* --- cacheline 5 boundary (320 bytes) --- */
>                 u64                elf_base_addr;        /*   320     8 */
>                 u64                debug_frame_offset;   /*   328     8 */
>                 u64                eh_frame_hdr_addr;    /*   336     8 */
>                 u64                eh_frame_hdr_offset;  /*   344     8 */
>                 int                fd;                   /*   352     4 */
>                 int                status;               /*   356     4 */
>                 u32                status_seen;          /*   360     4 */
>         } data;                                          /*   288    80 */
> 
>         /* XXX last struct has 4 bytes of padding */
> 
>         struct dso_id              id;                   /*   368    24 */
>         /* --- cacheline 6 boundary (384 bytes) was 8 bytes ago --- */
>         unsigned int               a2l_fails;            /*   392     4 */
>         int                        comp;                 /*   396     4 */
>         refcount_t                 refcnt;               /*   400     4 */
>         enum dso_load_errno        load_errno;           /*   404     4 */
>         u16                        long_name_len;        /*   408     2 */
>         u16                        short_name_len;       /*   410     2 */
>         enum dso_binary_type       symtab_type:8;        /*   412: 0  4 */
>         enum dso_binary_type       binary_type:8;        /*   412: 8  4 */
>         enum dso_space_type        kernel:2;             /*   412:16  4 */
>         enum dso_swap_type         needs_swap:2;         /*   412:18  4 */
> 
>         /* Bitfield combined with next fields */
> 
>         _Bool                      is_kmod:1;            /*   414: 4  1 */
>         u8                         adjust_symbols:1;     /*   414: 5  1 */
>         u8                         has_build_id:1;       /*   414: 6  1 */
>         u8                         header_build_id:1;    /*   414: 7  1 */
>         u8                         has_srcline:1;        /*   415: 0  1 */
>         u8                         hit:1;                /*   415: 1  1 */
>         u8                         annotate_warned:1;    /*   415: 2  1 */
>         u8                         auxtrace_warned:1;    /*   415: 3  1 */
>         u8                         short_name_allocated:1; /*   415: 4  1 */
>         u8                         long_name_allocated:1; /*   415: 5  1 */
>         u8                         is_64_bit:1;          /*   415: 6  1 */
> 
>         /* XXX 1 bit hole, try to pack */
> 
>         _Bool                      sorted_by_name;       /*   416     1 */
>         _Bool                      loaded;               /*   417     1 */
>         u8                         rel;                  /*   418     1 */
>         char                       name[];               /*   419     0 */
> 
>         /* size: 424, cachelines: 7, members: 48 */
>         /* sum members: 415 */
>         /* sum bitfield members: 31 bits, bit holes: 1, sum bit holes: 1 bits */
>         /* padding: 5 */
>         /* paddings: 1, sum paddings: 4 */
>         /* forced alignments: 1 */
>         /* last cacheline: 40 bytes */
> } __attribute__((__aligned__(8)));
> ```
> 
> Signed-off-by: Ian Rogers <irogers@google.com>
> ---
>  tools/perf/util/dso.h | 84 +++++++++++++++++++++----------------------
>  1 file changed, 42 insertions(+), 42 deletions(-)
> 
> diff --git a/tools/perf/util/dso.h b/tools/perf/util/dso.h
> index 2cdcd1e2ef8b..17dab230a2ca 100644
> --- a/tools/perf/util/dso.h
> +++ b/tools/perf/util/dso.h
> @@ -161,66 +161,66 @@ struct dso {
>  		u64		addr;
>  		struct symbol	*symbol;
>  	} last_find_result;
> -	void		 *a2l;
> -	char		 *symsrc_filename;
> -	unsigned int	 a2l_fails;
> -	enum dso_space_type	kernel;
> -	bool			is_kmod;
> -	enum dso_swap_type	needs_swap;
> -	enum dso_binary_type	symtab_type;
> -	enum dso_binary_type	binary_type;
> -	enum dso_load_errno	load_errno;
> -	u8		 adjust_symbols:1;
> -	u8		 has_build_id:1;
> -	u8		 header_build_id:1;
> -	u8		 has_srcline:1;
> -	u8		 hit:1;
> -	u8		 annotate_warned:1;
> -	u8		 auxtrace_warned:1;
> -	u8		 short_name_allocated:1;
> -	u8		 long_name_allocated:1;
> -	u8		 is_64_bit:1;
> -	bool		 sorted_by_name;
> -	bool		 loaded;
> -	u8		 rel;
>  	struct build_id	 bid;
>  	u64		 text_offset;
>  	u64		 text_end;
>  	const char	 *short_name;
>  	const char	 *long_name;
> -	u16		 long_name_len;
> -	u16		 short_name_len;
> +	void		 *a2l;
> +	char		 *symsrc_filename;
> +#if defined(__powerpc__)
>  	void		*dwfl;			/* DWARF debug info */
> +#endif
> +	struct nsinfo	*nsinfo;
>  	struct auxtrace_cache *auxtrace_cache;
> -	int		 comp;
> -
> +	union { /* Tool specific area */
> +		void	 *priv;
> +		u64	 db_id;
> +	};
> +	/* bpf prog information */
> +	struct {
> +		struct perf_env	*env;
> +		u32		id;
> +		u32		sub_id;
> +	} bpf_prog;
>  	/* dso data file */
>  	struct {
>  		struct rb_root	 cache;
> -		int		 fd;
> -		int		 status;
> -		u32		 status_seen;
> -		u64		 file_size;
>  		struct list_head open_entry;
> +		u64		 file_size;
>  		u64		 elf_base_addr;
>  		u64		 debug_frame_offset;
>  		u64		 eh_frame_hdr_addr;
>  		u64		 eh_frame_hdr_offset;
> +		int		 fd;
> +		int		 status;
> +		u32		 status_seen;
>  	} data;
> -	/* bpf prog information */
> -	struct {
> -		u32		id;
> -		u32		sub_id;
> -		struct perf_env	*env;
> -	} bpf_prog;
> -
> -	union { /* Tool specific area */
> -		void	 *priv;
> -		u64	 db_id;
> -	};
> -	struct nsinfo	*nsinfo;
>  	struct dso_id	 id;
> +	unsigned int	 a2l_fails;
> +	int		 comp;
>  	refcount_t	 refcnt;
> +	enum dso_load_errno	load_errno;
> +	u16		 long_name_len;
> +	u16		 short_name_len;
> +	enum dso_binary_type	symtab_type:8;
> +	enum dso_binary_type	binary_type:8;
> +	enum dso_space_type	kernel:2;
> +	enum dso_swap_type	needs_swap:2;
> +	bool			is_kmod:1;
> +	u8		 adjust_symbols:1;
> +	u8		 has_build_id:1;
> +	u8		 header_build_id:1;
> +	u8		 has_srcline:1;
> +	u8		 hit:1;
> +	u8		 annotate_warned:1;
> +	u8		 auxtrace_warned:1;
> +	u8		 short_name_allocated:1;
> +	u8		 long_name_allocated:1;
> +	u8		 is_64_bit:1;
> +	bool		 sorted_by_name;
> +	bool		 loaded;
> +	u8		 rel;
>  	char		 name[];
>  };
>  
> -- 
> 2.44.0.396.g6e790dbe36-goog
> 

  reply	other threads:[~2024-03-21 20:28 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-03-21 16:02 [PATCH v2 00/13] dso/dsos memory savings and clean up Ian Rogers
2024-03-21 16:02 ` [PATCH v2 01/13] perf dso: Reorder variables to save space in struct dso Ian Rogers
2024-03-21 20:28   ` Arnaldo Carvalho de Melo [this message]
2024-03-21 16:02 ` [PATCH v2 02/13] perf dsos: Attempt to better abstract dsos internals Ian Rogers
2024-03-21 16:02 ` [PATCH v2 03/13] perf dsos: Tidy reference counting and locking Ian Rogers
2024-03-21 16:02 ` [PATCH v2 04/13] perf dsos: Add dsos__for_each_dso Ian Rogers
2024-03-22 20:43   ` Namhyung Kim
2024-03-22 20:54     ` Namhyung Kim
2024-03-21 16:02 ` [PATCH v2 05/13] perf dso: Move dso functions out of dsos Ian Rogers
2024-03-21 16:02 ` [PATCH v2 06/13] perf dsos: Switch more loops to dsos__for_each_dso Ian Rogers
2024-03-21 16:02 ` [PATCH v2 07/13] perf dsos: Switch backing storage to array from rbtree/list Ian Rogers
2024-03-21 16:02 ` [PATCH v2 08/13] perf dsos: Remove __dsos__addnew Ian Rogers
2024-03-21 16:02 ` [PATCH v2 09/13] perf dsos: Remove __dsos__findnew_link_by_longname_id Ian Rogers
2024-03-21 16:02 ` [PATCH v2 10/13] perf dsos: Switch hand code to bsearch Ian Rogers
2024-03-21 16:02 ` [PATCH v2 11/13] perf dso: Add reference count checking and accessor functions Ian Rogers
2024-03-21 16:02 ` [PATCH v2 12/13] perf dso: Reference counting related fixes Ian Rogers
2024-03-25 17:22   ` Markus Elfring
2024-03-21 16:03 ` [PATCH v2 13/13] perf dso: Use container_of to avoid a pointer in dso_data Ian Rogers
2024-03-25 21:03 ` [PATCH v2 00/13] dso/dsos memory savings and clean up Namhyung Kim

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZfyYWcu0E-Pcn6Rf@x1 \
    --to=acme@kernel.org \
    --cc=Markus.Elfring@web.de \
    --cc=adrian.hunter@intel.com \
    --cc=ak@linux.intel.com \
    --cc=alexander.shishkin@linux.intel.com \
    --cc=atrajeev@linux.vnet.ibm.com \
    --cc=ben.gainey@arm.com \
    --cc=bpf@vger.kernel.org \
    --cc=changbin.du@huawei.com \
    --cc=chengen.du@canonical.com \
    --cc=colin.i.king@gmail.com \
    --cc=elfring@users.sourceforge.net \
    --cc=ilkka@os.amperecomputing.com \
    --cc=irogers@google.com \
    --cc=james.clark@arm.com \
    --cc=jolsa@kernel.org \
    --cc=kan.liang@linux.intel.com \
    --cc=kprateek.nayak@amd.com \
    --cc=leo.yan@linux.dev \
    --cc=liam.howlett@oracle.com \
    --cc=lidong@vivo.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-perf-users@vger.kernel.org \
    --cc=mark.rutland@arm.com \
    --cc=mhiramat@kernel.org \
    --cc=mingo@redhat.com \
    --cc=nabijaczleweli@nabijaczleweli.xyz \
    --cc=namhyung@kernel.org \
    --cc=ojeda@kernel.org \
    --cc=p4ranlee@gmail.com \
    --cc=peterz@infradead.org \
    --cc=ravi.bangoria@amd.com \
    --cc=siyanteng@loongson.cn \
    --cc=song@kernel.org \
    --cc=sunhaiyong@loongson.cn \
    --cc=zhaimingbing@cmss.chinamobile.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.