* [Buildroot] [RFCv1 0/4] Generating a graph of the size installed by each package
@ 2014-06-07 21:46 Thomas Petazzoni
  2014-06-07 21:46 ` [Buildroot] [RFCv1 1/4] toolchain-external: split target installation from staging installation Thomas Petazzoni
                   ` (5 more replies)
  0 siblings, 6 replies; 27+ messages in thread
From: Thomas Petazzoni @ 2014-06-07 21:46 UTC (permalink / raw)
  To: buildroot

Hello,

I gave a training this week, and one of the questions I got was how to
analyze the size of the things that are present on the root
filesystem. And I thought that Buildroot was lacking a tool to help
with this. Therefore, the following set of commits implements a script
that generates a pie chart of the size contribution of each package to
the target root filesystem.

To see an example of the generated pie chart, see:

  http://free-electrons.com/~thomas/pub/buildroot/graph-size.pdf

The implementation consists of adding a global instrumentation hook
that registers which files are installed by each package. A limitation
of the current implementation is that when a file is installed by a
package A and then overridden by package B, the mechanism will assume
the file was installed by package A. Suggestions are welcome on how to
solve this in a reasonably simple way.

The size contribution of each package is not computed directly in the
global instrumentation hook, because all the stripping and cleanup
work takes place in target-finalize, after all packages are
installed. Therefore, the hook only registers the path of the files
that are installed.
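
The before/after snapshot idea can be sketched in Python roughly as
follows. This is only an illustration of the approach, not the hook
itself (which is written in make and shell in patch 2/4); the names
list_files() and new_files() are made up for this example, and Python 3
is assumed.

```python
import os

def list_files(target_dir):
    """Return the set of regular files under target_dir, relative to it."""
    found = set()
    for root, _, files in os.walk(target_dir):
        for name in files:
            path = os.path.join(root, name)
            # Like 'find . -type f', skip symbolic links
            if os.path.islink(path):
                continue
            found.add(os.path.relpath(path, target_dir))
    return found

def new_files(before, after):
    """Files present after install-target that were not there before."""
    return sorted(after - before)
```

Only paths are recorded at this point; sizes are measured later, once
target-finalize has stripped and cleaned up the files.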

The 'graph-size' make target then runs a Python script that looks at
all files in $(TARGET_DIR) and is able to find out, thanks to the data
registered by the global instrumentation hook, to which package each
file belongs. Using this, it is quite simple to generate a pie chart.
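
As a rough sketch of that attribution step (assuming Python 3, and
using a reverse file-to-package map rather than the per-package list
scan done by find_file() in the posted script), it could look like the
code below. build_owner_map() and package_sizes() are hypothetical
names used only for this illustration.

```python
import os

def build_owner_map(pkgdict):
    """Invert {package: [files]} into {file: package} for direct lookups."""
    owner = {}
    for pkg, files in pkgdict.items():
        for f in files:
            # A path normally appears in one list only; keep the first seen
            owner.setdefault(f, pkg)
    return owner

def package_sizes(target_dir, pkgdict):
    """Sum the size of every regular file in target_dir per owning package."""
    owner = build_owner_map(pkgdict)
    sizes = {}
    for root, _, files in os.walk(target_dir):
        for name in files:
            path = os.path.join(root, name)
            if os.path.islink(path):
                continue
            rel = os.path.relpath(path, target_dir)
            # Files not found in any list are grouped under "unknown"
            pkg = owner.get(rel, "unknown")
            sizes[pkg] = sizes.get(pkg, 0) + os.path.getsize(path)
    return sizes
```

The resulting dictionary maps each package name to a byte count, which
is the kind of data a pie chart can be drawn from.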

Comments, ideas, suggestions are welcome.

Thomas

Thomas Petazzoni (4):
  toolchain-external: split target installation from staging
    installation
  pkg-generic: add step_pkg_size global instrumentation hook
  support/scripts: add graph-size script
  Makefile: implement a graph-size target

 Makefile                                           |   6 +
 package/pkg-generic.mk                             |  24 +++
 support/scripts/graph-size                         | 164 +++++++++++++++++++++
 toolchain/toolchain-external/toolchain-external.mk |  36 ++++-
 4 files changed, 223 insertions(+), 7 deletions(-)
 create mode 100755 support/scripts/graph-size

-- 
2.0.0


* [Buildroot] [RFCv1 1/4] toolchain-external: split target installation from staging installation
  2014-06-07 21:46 [Buildroot] [RFCv1 0/4] Generating a graph of the size installed by each package Thomas Petazzoni
@ 2014-06-07 21:46 ` Thomas Petazzoni
  2014-06-09 21:49   ` Yann E. MORIN
  2014-06-07 21:46 ` [Buildroot] [RFCv1 2/4] pkg-generic: add step_pkg_size global instrumentation hook Thomas Petazzoni
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 27+ messages in thread
From: Thomas Petazzoni @ 2014-06-07 21:46 UTC (permalink / raw)
  To: buildroot

Currently, all the installation work of the toolchain-external package
is done during the install-staging step. However, in order to be able
to properly collect the size added by each package to the target
filesystem, we need to make sure that toolchain-external installs its
files to $(TARGET_DIR) during the install-target step.

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
---
 toolchain/toolchain-external/toolchain-external.mk | 36 +++++++++++++++++-----
 1 file changed, 29 insertions(+), 7 deletions(-)

diff --git a/toolchain/toolchain-external/toolchain-external.mk b/toolchain/toolchain-external/toolchain-external.mk
index c73cc4a..45926cf 100644
--- a/toolchain/toolchain-external/toolchain-external.mk
+++ b/toolchain/toolchain-external/toolchain-external.mk
@@ -551,7 +551,7 @@ endif
 #                       considered when searching libraries for copy
 #                       to the target filesystem.
 
-define TOOLCHAIN_EXTERNAL_INSTALL_CORE
+define TOOLCHAIN_EXTERNAL_INSTALL_TARGET_LIBS
 	$(Q)SYSROOT_DIR="$(call toolchain_find_sysroot,$(TOOLCHAIN_EXTERNAL_CC))" ; \
 	if test -z "$${SYSROOT_DIR}" ; then \
 		@echo "External toolchain doesn't support --sysroot. Cannot use." ; \
@@ -576,8 +576,6 @@ define TOOLCHAIN_EXTERNAL_INSTALL_CORE
 			$(call copy_toolchain_lib_root,$${ARCH_SYSROOT_DIR},$${SUPPORT_LIB_DIR},$${ARCH_LIB_DIR},$$libs,/usr/lib); \
 		done ; \
 	fi ; \
-	$(call MESSAGE,"Copying external toolchain sysroot to staging...") ; \
-	$(call copy_toolchain_sysroot,$${SYSROOT_DIR},$${ARCH_SYSROOT_DIR},$${ARCH_SUBDIR},$${ARCH_LIB_DIR},$${SUPPORT_LIB_DIR}) ; \
 	if test "$(BR2_TOOLCHAIN_EXTERNAL_GDB_SERVER_COPY)" = "y"; then \
 		$(call MESSAGE,"Copying gdbserver") ; \
 		gdbserver_found=0 ; \
@@ -595,6 +593,26 @@ define TOOLCHAIN_EXTERNAL_INSTALL_CORE
 	fi
 endef
 
+define TOOLCHAIN_EXTERNAL_INSTALL_SYSROOT_LIBS
+	$(Q)SYSROOT_DIR="$(call toolchain_find_sysroot,$(TOOLCHAIN_EXTERNAL_CC))" ; \
+	if test -z "$${SYSROOT_DIR}" ; then \
+		@echo "External toolchain doesn't support --sysroot. Cannot use." ; \
+		exit 1 ; \
+	fi ; \
+	ARCH_SYSROOT_DIR="$(call toolchain_find_sysroot,$(TOOLCHAIN_EXTERNAL_CC) $(TOOLCHAIN_EXTERNAL_CFLAGS))" ; \
+	ARCH_LIB_DIR="$(call toolchain_find_libdir,$(TOOLCHAIN_EXTERNAL_CC) $(TOOLCHAIN_EXTERNAL_CFLAGS))" ; \
+	SUPPORT_LIB_DIR="" ; \
+	if test `find $${ARCH_SYSROOT_DIR} -name 'libstdc++.a' | wc -l` -eq 0 ; then \
+		LIBSTDCPP_A_LOCATION=$$(LANG=C $(TOOLCHAIN_EXTERNAL_CC) $(TOOLCHAIN_EXTERNAL_CFLAGS) -print-file-name=libstdc++.a) ; \
+		if [ -e "$${LIBSTDCPP_A_LOCATION}" ]; then \
+			SUPPORT_LIB_DIR=`readlink -f $${LIBSTDCPP_A_LOCATION} | sed -r -e 's:libstdc\+\+\.a::'` ; \
+		fi ; \
+	fi ; \
+	ARCH_SUBDIR=`echo $${ARCH_SYSROOT_DIR} | sed -r -e "s:^$${SYSROOT_DIR}(.*)/$$:\1:"` ; \
+	$(call MESSAGE,"Copying external toolchain sysroot to staging...") ; \
+	$(call copy_toolchain_sysroot,$${SYSROOT_DIR},$${ARCH_SYSROOT_DIR},$${ARCH_SUBDIR},$${ARCH_LIB_DIR},$${SUPPORT_LIB_DIR})
+endef
+
 # Special installation target used on the Blackfin architecture when
 # FDPIC is not the primary binary format being used, but the user has
 # nonetheless requested the installation of the FDPIC libraries to the
@@ -685,15 +703,19 @@ define TOOLCHAIN_EXTERNAL_INSTALL_GDBINIT
 	fi
 endef
 
+define TOOLCHAIN_EXTERNAL_INSTALL_STAGING_CMDS
+	$(TOOLCHAIN_EXTERNAL_INSTALL_SYSROOT_LIBS)
+	$(TOOLCHAIN_EXTERNAL_INSTALL_WRAPPER)
+	$(TOOLCHAIN_EXTERNAL_INSTALL_GDBINIT)
+endef
+
 # Even though we're installing things in both the staging, the host
 # and the target directory, we do everything within the
 # install-staging step, arbitrarily.
-define TOOLCHAIN_EXTERNAL_INSTALL_STAGING_CMDS
-	$(TOOLCHAIN_EXTERNAL_INSTALL_CORE)
+define TOOLCHAIN_EXTERNAL_INSTALL_TARGET_CMDS
+	$(TOOLCHAIN_EXTERNAL_INSTALL_TARGET_LIBS)
 	$(TOOLCHAIN_EXTERNAL_INSTALL_BFIN_FDPIC)
 	$(TOOLCHAIN_EXTERNAL_INSTALL_BFIN_FLAT)
-	$(TOOLCHAIN_EXTERNAL_INSTALL_WRAPPER)
-	$(TOOLCHAIN_EXTERNAL_INSTALL_GDBINIT)
 endef
 
 $(eval $(generic-package))
-- 
2.0.0


* [Buildroot] [RFCv1 2/4] pkg-generic: add step_pkg_size global instrumentation hook
  2014-06-07 21:46 [Buildroot] [RFCv1 0/4] Generating a graph of the size installed by each package Thomas Petazzoni
  2014-06-07 21:46 ` [Buildroot] [RFCv1 1/4] toolchain-external: split target installation from staging installation Thomas Petazzoni
@ 2014-06-07 21:46 ` Thomas Petazzoni
  2014-06-08  2:56   ` Baruch Siach
                     ` (2 more replies)
  2014-06-07 21:46 ` [Buildroot] [RFCv1 3/4] support/scripts: add graph-size script Thomas Petazzoni
                   ` (3 subsequent siblings)
  5 siblings, 3 replies; 27+ messages in thread
From: Thomas Petazzoni @ 2014-06-07 21:46 UTC (permalink / raw)
  To: buildroot

This patch adds a global instrumentation hook that collects the list
of files installed in $(TARGET_DIR) by each package, and stores this
list into a file called $(BUILD_DIR)/<pkgname>.filelist. It can later
be used to determine the size contribution of each package to the
target root filesystem.

The only limitation is that if a file is installed by a package A, and
then overridden by a file from package B, the file will only be listed
in $(BUILD_DIR)/A.filelist, as that is the first time we see the
file.

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
---
 package/pkg-generic.mk | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

diff --git a/package/pkg-generic.mk b/package/pkg-generic.mk
index 5116ed9..069653e 100644
--- a/package/pkg-generic.mk
+++ b/package/pkg-generic.mk
@@ -55,6 +55,30 @@ define step_time
 endef
 GLOBAL_INSTRUMENTATION_HOOKS += step_time
 
+# Package size steps
+define step_pkg_size_start
+	echo "PKG SIZE START $(1)"
+	(cd $(TARGET_DIR) ; find . -type f) | sort > \
+		$(BUILD_DIR)/$(1).tmp_filelist_before
+endef
+
+define step_pkg_size_end
+	echo "PKG SIZE END $(1)"
+	(cd $(TARGET_DIR); find . -type f) | sort > \
+		$(BUILD_DIR)/$(1).tmp_filelist_after
+	diff -u $(BUILD_DIR)/$(1).tmp_filelist_before $(BUILD_DIR)/$(1).tmp_filelist_after | \
+		grep '^\+\./' | sed 's%^\+%%' > $(BUILD_DIR)/$(1).filelist
+	$(RM) -f $(BUILD_DIR)/$(1).tmp_filelist_before \
+		$(BUILD_DIR)/$(1).tmp_filelist_after
+endef
+
+define step_pkg_size
+	$(if $(filter install-target,$(2)),\
+		$(if $(filter start,$(1)),$(call step_pkg_size_start,$(3))) \
+		$(if $(filter end,$(1)),$(call step_pkg_size_end,$(3))))
+endef
+GLOBAL_INSTRUMENTATION_HOOKS += step_pkg_size
+
 # User-supplied script
 define step_user
 	@$(foreach user_hook, $(BR2_INSTRUMENTATION_SCRIPTS), \
-- 
2.0.0


* [Buildroot] [RFCv1 3/4] support/scripts: add graph-size script
  2014-06-07 21:46 [Buildroot] [RFCv1 0/4] Generating a graph of the size installed by each package Thomas Petazzoni
  2014-06-07 21:46 ` [Buildroot] [RFCv1 1/4] toolchain-external: split target installation from staging installation Thomas Petazzoni
  2014-06-07 21:46 ` [Buildroot] [RFCv1 2/4] pkg-generic: add step_pkg_size global instrumentation hook Thomas Petazzoni
@ 2014-06-07 21:46 ` Thomas Petazzoni
  2014-06-09 22:06   ` Yann E. MORIN
  2014-06-07 21:46 ` [Buildroot] [RFCv1 4/4] Makefile: implement a graph-size target Thomas Petazzoni
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 27+ messages in thread
From: Thomas Petazzoni @ 2014-06-07 21:46 UTC (permalink / raw)
  To: buildroot

This new script uses the data collected by the step_pkg_size
instrumentation hook to generate a pie chart of the size contribution
of each package to the target root filesystem. To achieve this, it
looks at each file in $(TARGET_DIR), and using the <pkgname>.filelist
information collected by the step_pkg_size hook, it determines to
which package the file belongs. It is therefore able to give the size
installed by each package.

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
---
 support/scripts/graph-size | 164 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 164 insertions(+)
 create mode 100755 support/scripts/graph-size

diff --git a/support/scripts/graph-size b/support/scripts/graph-size
new file mode 100755
index 0000000..5f7fe58
--- /dev/null
+++ b/support/scripts/graph-size
@@ -0,0 +1,164 @@
+#!/usr/bin/env python
+
+# Copyright (C) 2014 by Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+# General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write to the Free Software
+# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+
+# This script draws a pie chart of the size used by each package in
+# the target filesystem.
+
+import sys
+import os.path
+import argparse
+import matplotlib.font_manager as fm
+import matplotlib.pyplot as plt
+
+colors = ['#e60004', '#009836', '#2e1d86', '#ffed00',
+          '#0068b5', '#f28e00', '#940084', '#97c000']
+
+#
+# This function parses one .filelist file (which lists the files
+# installed by a particular package), and returns a Python list of the
+# file paths installed by the package.
+#
+# pkgf: path to the .filelist file
+#
+def handle_pkg(pkgf):
+    files = []
+    with open(pkgf) as f:
+        for l in f.readlines():
+            files.append(l.strip().replace("./", "", 1))
+    return files
+
+#
+# This function returns the list of files present in the skeleton,
+# with the exception of the .empty files. It is used to create a fake
+# entry in the dictionary of files installed by each package, emulating
+# the presence of a skeleton package.
+#
+def handle_skeleton():
+    skeleton_files = []
+    for root, _, files in os.walk("system/skeleton"):
+        for f in files:
+            if f == ".empty":
+                continue
+            frelpath = os.path.relpath(os.path.join(root, f), "system/skeleton")
+            skeleton_files.append(frelpath)
+    return skeleton_files
+
+#
+# This function returns a dict where each key is the name of a
+# package, and the value is a list of the files installed by this
+# package.
+#
+# builddir: path to the Buildroot output directory
+#
+def build_package_dict(builddir):
+    pkgdict = {}
+    # Parse all the .filelist files generated by the package
+    # installation process
+    filelist = os.listdir(os.path.join(builddir, "build"))
+    for file in filelist:
+        (_, ext) = os.path.splitext(file)
+        if ext != ".filelist":
+            continue
+        pkgname = file.replace(".filelist", "")
+        pkgdict[pkgname] = handle_pkg(os.path.join(builddir, "build", file))
+    # Add a special fake entry for the files installed by the skeleton
+    pkgdict['skeleton'] = handle_skeleton()
+    return pkgdict
+
+#
+# This function looks into the 'pkgdict' dictionary (as generated by
+# build_package_dict) to see which package has installed the file 'f',
+# and returns the name of that package.
+#
+def find_file(pkgdict, f):
+    for (p, flist) in pkgdict.iteritems():
+        if f in flist:
+            return p
+    return None
+
+#
+# This function builds a dictionary that contains the name of a package
+# as key, and the size of the files installed by this package as the
+# value.
+#
+def build_package_size(builddir):
+    pkgdict = build_package_dict(builddir)
+    pkgsize = {}
+
+    for root, _, files in os.walk(os.path.join(builddir, "target")):
+        for f in files:
+            fpath = os.path.join(root, f)
+            if os.path.islink(fpath):
+                continue
+            frelpath = os.path.relpath(fpath, os.path.join(builddir, "target"))
+            pkg = find_file(pkgdict, frelpath)
+            if pkg is None:
+                print "WARNING: %s is not part of any package" % frelpath
+                pkg = "unknown"
+            if pkg in pkgsize:
+                pkgsize[pkg] += os.path.getsize(fpath)
+            else:
+                pkgsize[pkg] = os.path.getsize(fpath)
+
+    return pkgsize
+
+#
+# Given a dict returned by build_package_size(), this function
+# generates a pie chart of the size installed by each package.
+#
+def draw_graph(pkgsize, outputf):
+    total = 0
+    for (p, sz) in pkgsize.iteritems():
+        total += sz
+    labels = []
+    values = []
+    other_value = 0
+    for (p, sz) in pkgsize.iteritems():
+        if sz < (total * 0.01):
+            other_value += sz
+        else:
+            labels.append(p)
+            values.append(sz)
+    labels.append("Other")
+    values.append(other_value)
+
+    plt.figure()
+    patches, texts, autotexts = plt.pie(values, labels=labels,
+                                        autopct='%1.1f%%', shadow=True,
+                                        colors=colors)
+    # Reduce text size
+    proptease = fm.FontProperties()
+    proptease.set_size('xx-small')
+    plt.setp(autotexts, fontproperties=proptease)
+    plt.setp(texts, fontproperties=proptease)
+
+    plt.title('Size per package')
+    plt.savefig(outputf)
+
+parser = argparse.ArgumentParser(description='Draw a pie chart of the size per package')
+
+parser.add_argument("--builddir", '-i', metavar="BUILDDIR",
+                    help="Buildroot output directory")
+parser.add_argument("--output", '-o', metavar="OUTPUT", required=True,
+                    help="Output file (.pdf or .png extension)")
+args = parser.parse_args()
+
+
+ps = build_package_size(args.builddir)
+draw_graph(ps, args.output)
+
-- 
2.0.0


* [Buildroot] [RFCv1 4/4] Makefile: implement a graph-size target
  2014-06-07 21:46 [Buildroot] [RFCv1 0/4] Generating a graph of the size installed by each package Thomas Petazzoni
                   ` (2 preceding siblings ...)
  2014-06-07 21:46 ` [Buildroot] [RFCv1 3/4] support/scripts: add graph-size script Thomas Petazzoni
@ 2014-06-07 21:46 ` Thomas Petazzoni
  2014-06-09 22:28   ` Yann E. MORIN
  2014-06-07 21:54 ` [Buildroot] [RFCv1 0/4] Generating a graph of the size installed by each package Will Wagner
  2014-06-24 13:05 ` Luca Ceresoli
  5 siblings, 1 reply; 27+ messages in thread
From: Thomas Petazzoni @ 2014-06-07 21:46 UTC (permalink / raw)
  To: buildroot

Like we have graph-build and graph-depends, this commit implements a
graph-size target to generate the corresponding graph.

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
---
 Makefile | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/Makefile b/Makefile
index 0b4264a..fee6b46 100644
--- a/Makefile
+++ b/Makefile
@@ -677,6 +677,11 @@ graph-depends:
 	|tee $(O)/graphs/$(@).dot \
 	|dot -T$(BR_GRAPH_OUT) -o $(O)/graphs/$(@).$(BR_GRAPH_OUT)
 
+graph-size:
+	@$(INSTALL) -d $(O)/graphs
+	@cd "$(TOPDIR)"; \
+	./support/scripts/graph-size --builddir $(O) --output $(O)/graphs/$(@).$(BR_GRAPH_OUT)
+
 else # ifeq ($(BR2_HAVE_DOT_CONFIG),y)
 
 all: menuconfig
@@ -892,6 +897,7 @@ endif
 	@echo '  manual-epub            - build manual in ePub'
 	@echo '  graph-build            - generate graphs of the build times'
 	@echo '  graph-depends          - generate graph of the dependency tree'
+	@echo '  graph-size             - generate graph of the filesystem size'
 	@echo
 	@echo 'Miscellaneous:'
 	@echo '  source                 - download all sources needed for offline-build'
-- 
2.0.0


* [Buildroot] [RFCv1 0/4] Generating a graph of the size installed by each package
  2014-06-07 21:46 [Buildroot] [RFCv1 0/4] Generating a graph of the size installed by each package Thomas Petazzoni
                   ` (3 preceding siblings ...)
  2014-06-07 21:46 ` [Buildroot] [RFCv1 4/4] Makefile: implement a graph-size target Thomas Petazzoni
@ 2014-06-07 21:54 ` Will Wagner
  2014-06-08  7:42   ` Thomas Petazzoni
  2014-06-24 13:05 ` Luca Ceresoli
  5 siblings, 1 reply; 27+ messages in thread
From: Will Wagner @ 2014-06-07 21:54 UTC (permalink / raw)
  To: buildroot

On 07/06/2014 22:46, Thomas Petazzoni wrote:
> Hello,
>
> I gave a training this week, and one of the questions I got was how to
> analyze the size of the things that are present on the root
> filesystem. And I thought that Buildroot was lacking a tool to help
> with this. Therefore, the following set of commits implements a script
> that generates a pie chart of the size contribution of each package to
> the target root filesystem.

This would certainly be useful in Buildroot; your idea looks good.

>
> To see an example of the generated pie chart, see:
>
>    http://free-electrons.com/~thomas/pub/buildroot/graph-size.pdf
>
> The implementation consists of adding a global instrumentation hook
> that registers which files are installed by each package. A limitation
> of the current implementation is that when a file is installed by a
> package A and then overridden by package B, the mechanism will assume
> the file was installed by package A. Suggestions are welcome on how to
> solve this in a reasonably simple way.

Could you not record the timestamp the first time you see the file, and
check whether it has changed at the end? I appreciate that a bash script
to do this is going to be pretty hard.

Regards
Will

-- 
------------------------------------------------------------------------
Will Wagner                                     will_wagner at carallon.com
Development Manager                      Office Tel: +44 (0)20 7371 2032
Carallon Ltd, Studio G20, Shepherds Building, Rockley Rd, London W14 0DA
------------------------------------------------------------------------


* [Buildroot] [RFCv1 2/4] pkg-generic: add step_pkg_size global instrumentation hook
  2014-06-07 21:46 ` [Buildroot] [RFCv1 2/4] pkg-generic: add step_pkg_size global instrumentation hook Thomas Petazzoni
@ 2014-06-08  2:56   ` Baruch Siach
  2014-06-08  8:19     ` Thomas Petazzoni
  2014-06-09 22:02   ` Yann E. MORIN
  2014-06-24 16:36   ` Arnout Vandecappelle
  2 siblings, 1 reply; 27+ messages in thread
From: Baruch Siach @ 2014-06-08  2:56 UTC (permalink / raw)
  To: buildroot

Hi Thomas,

On Sat, Jun 07, 2014 at 11:46:05PM +0200, Thomas Petazzoni wrote:
> This patch adds a global instrumentation hook that collects the list
> of files installed in $(TARGET_DIR) by each package, and stores this
> list into a file called $(BUILD_DIR)/<pkgname>.filelist. It can later
> be used to determine the size contribution of each package to the
> target root filesystem.

How does this play with parallel build? Is install-target guaranteed to run 
sequentially for each package?

baruch

> The only limitation is that if a file is installed by a package A, and
> then overridden by a file from package B, the file will only be listed
> in $(BUILD_DIR)/A.filelist, as that is the first time we see the
> file.
> 
> Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
> ---
>  package/pkg-generic.mk | 24 ++++++++++++++++++++++++
>  1 file changed, 24 insertions(+)
> 
> diff --git a/package/pkg-generic.mk b/package/pkg-generic.mk
> index 5116ed9..069653e 100644
> --- a/package/pkg-generic.mk
> +++ b/package/pkg-generic.mk
> @@ -55,6 +55,30 @@ define step_time
>  endef
>  GLOBAL_INSTRUMENTATION_HOOKS += step_time
>  
> +# Package size steps
> +define step_pkg_size_start
> +	echo "PKG SIZE START $(1)"
> +	(cd $(TARGET_DIR) ; find . -type f) | sort > \
> +		$(BUILD_DIR)/$(1).tmp_filelist_before
> +endef
> +
> +define step_pkg_size_end
> +	echo "PKG SIZE END $(1)"
> +	(cd $(TARGET_DIR); find . -type f) | sort > \
> +		$(BUILD_DIR)/$(1).tmp_filelist_after
> +	diff -u $(BUILD_DIR)/$(1).tmp_filelist_before $(BUILD_DIR)/$(1).tmp_filelist_after | \
> +		grep '^\+\./' | sed 's%^\+%%' > $(BUILD_DIR)/$(1).filelist
> +	$(RM) -f $(BUILD_DIR)/$(1).tmp_filelist_before \
> +		$(BUILD_DIR)/$(1).tmp_filelist_after
> +endef
> +
> +define step_pkg_size
> +	$(if $(filter install-target,$(2)),\
> +		$(if $(filter start,$(1)),$(call step_pkg_size_start,$(3))) \
> +		$(if $(filter end,$(1)),$(call step_pkg_size_end,$(3))))
> +endef
> +GLOBAL_INSTRUMENTATION_HOOKS += step_pkg_size
> +
>  # User-supplied script
>  define step_user
>  	@$(foreach user_hook, $(BR2_INSTRUMENTATION_SCRIPTS), \

-- 
     http://baruch.siach.name/blog/                  ~. .~   Tk Open Systems
=}------------------------------------------------ooO--U--Ooo------------{=
   - baruch at tkos.co.il - tel: +972.2.679.5364, http://www.tkos.co.il -


* [Buildroot] [RFCv1 0/4] Generating a graph of the size installed by each package
  2014-06-07 21:54 ` [Buildroot] [RFCv1 0/4] Generating a graph of the size installed by each package Will Wagner
@ 2014-06-08  7:42   ` Thomas Petazzoni
  0 siblings, 0 replies; 27+ messages in thread
From: Thomas Petazzoni @ 2014-06-08  7:42 UTC (permalink / raw)
  To: buildroot

Dear Will Wagner,

On Sat, 07 Jun 2014 22:54:38 +0100, Will Wagner wrote:

> > The implementation consists of adding a global instrumentation hook
> > that registers which files are installed by each package. A limitation
> > of the current implementation is that when a file is installed by a
> > package A and then overridden by package B, the mechanism will assume
> > the file was installed by package A. Suggestions are welcome on how to
> > solve this in a reasonably simple way.
> 
> Can you not record the timestamp the first time you see the file and see 
> if it has changed at the end. I appreciate a bash script to do this is 
> going to be pretty hard.

Depends on what you call "at the end". The problem is that between the
end of the install-target step of a given package, and the end of the
entire build of the system, most of the files in $(TARGET_DIR) are
modified due to stripping. Therefore, their timestamp changes.

The only thing I could think of would be based on SHA1: in the global
instrumentation hook, instead of just doing a "find" on all files in
$(TARGET_DIR) and then diffing the result before/after the installation,
we would include the SHA1 of each file. The diff would then indicate when
a file has changed, so we would be able to detect that. However, this
would mean that a file would be mentioned in two .filelist files, and
since the Python script has no idea in which order the packages were
built, it does not know which of the two packages won for the
installation of the given file. It could be based on the timestamp of the
.filelist file, though. It starts to get a bit tricky, as you can see :)
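
To make the SHA1 idea a bit more concrete, here is a minimal Python 3
sketch. It is not part of the posted series; snapshot(),
installed_or_changed() and resolve_owner() are hypothetical names, and
the "latest .filelist timestamp wins" rule is only the tie-break
floated above, not something that has been implemented or tested.

```python
import hashlib
import os

def snapshot(target_dir):
    """Map each regular file (relative path) to the SHA1 of its content."""
    result = {}
    for root, _, files in os.walk(target_dir):
        for name in files:
            path = os.path.join(root, name)
            if os.path.islink(path):
                continue
            with open(path, "rb") as f:
                digest = hashlib.sha1(f.read()).hexdigest()
            result[os.path.relpath(path, target_dir)] = digest
    return result

def installed_or_changed(before, after):
    """Paths that are new, or whose content changed, between two snapshots."""
    return sorted(p for p, h in after.items() if before.get(p) != h)

def resolve_owner(claims):
    """Pick the owner among (package, filelist_mtime) claims: latest wins."""
    return max(claims, key=lambda c: c[1])[0]
```

Note that hashing every file in the target directory around each
install-target step has a cost that a plain "find" does not have.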

Thanks for your feedback!

Thomas
-- 
Thomas Petazzoni, CTO, Free Electrons
Embedded Linux, Kernel and Android engineering
http://free-electrons.com


* [Buildroot] [RFCv1 2/4] pkg-generic: add step_pkg_size global instrumentation hook
  2014-06-08  2:56   ` Baruch Siach
@ 2014-06-08  8:19     ` Thomas Petazzoni
  0 siblings, 0 replies; 27+ messages in thread
From: Thomas Petazzoni @ 2014-06-08  8:19 UTC (permalink / raw)
  To: buildroot

Dear Baruch Siach,

On Sun, 8 Jun 2014 05:56:29 +0300, Baruch Siach wrote:

> On Sat, Jun 07, 2014 at 11:46:05PM +0200, Thomas Petazzoni wrote:
> > This patch adds a global instrumentation hook that collects the list
> > of files installed in $(TARGET_DIR) by each package, and stores this
> > list into a file called $(BUILD_DIR)/<pkgname>.filelist. It can later
> > be used to determine the size contribution of each package to the
> > target root filesystem.
> 
> How does this play with parallel build? Is install-target guaranteed to run 
> sequentially for each package?

It clearly doesn't work with top-level parallel builds. The
mechanism assumes that between the beginning of an install-target step
and its end, nothing else runs and installs stuff in $(TARGET_DIR).
This is true for sequential builds, but false for top-level parallel
builds.
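
A small simulation may help to see why. The sketch below (Python 3,
purely illustrative, with made-up package and file names) replays
start/install/end events against an in-memory set standing in for
$(TARGET_DIR) and applies the same before/after diff as the hook.

```python
def attribute(events):
    """Replay 'start'/'install'/'end' events and apply the snapshot diff.

    events is a list of (action, package, path) tuples; path is None for
    'start' and 'end'. Returns {package: sorted list of attributed files}.
    """
    target = set()   # current content of the simulated target directory
    before = {}      # per-package snapshot taken at 'start'
    result = {}
    for action, pkg, path in events:
        if action == "start":
            before[pkg] = set(target)
        elif action == "install":
            target.add(path)
        elif action == "end":
            result[pkg] = sorted(target - before[pkg])
    return result

sequential = [
    ("start", "A", None), ("install", "A", "bin/a"), ("end", "A", None),
    ("start", "B", None), ("install", "B", "bin/b"), ("end", "B", None),
]
interleaved = [
    ("start", "A", None), ("start", "B", None),
    ("install", "A", "bin/a"), ("install", "B", "bin/b"),
    ("end", "A", None), ("end", "B", None),
]
```

With the sequential event list each package is credited with its own
file only; with the interleaved list both A and B end up credited with
both bin/a and bin/b.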

Thomas
-- 
Thomas Petazzoni, CTO, Free Electrons
Embedded Linux, Kernel and Android engineering
http://free-electrons.com


* [Buildroot] [RFCv1 1/4] toolchain-external: split target installation from staging installation
  2014-06-07 21:46 ` [Buildroot] [RFCv1 1/4] toolchain-external: split target installation from staging installation Thomas Petazzoni
@ 2014-06-09 21:49   ` Yann E. MORIN
  2014-06-10  8:04     ` Thomas Petazzoni
  0 siblings, 1 reply; 27+ messages in thread
From: Yann E. MORIN @ 2014-06-09 21:49 UTC (permalink / raw)
  To: buildroot

Thomas, All,

On 2014-06-07 23:46 +0200, Thomas Petazzoni spake thusly:
> Currently, all the installation work of the toolchain-external package
> is done during the install-staging step. However, in order to be able
> to properly collect the size added by each package to the target
> filesystem, we need to make sure that toolchain-external installs its
> files to $(TARGET_DIR) during the install-target step.
> 
> Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>

Tested with an ARM CodeSourcery toolchain, and a custom toolchain.

Reviewed-by: "Yann E. MORIN" <yann.morin.1998@free.fr>
Tested-by: "Yann E. MORIN" <yann.morin.1998@free.fr>

But see my comment below...

> ---
>  toolchain/toolchain-external/toolchain-external.mk | 36 +++++++++++++++++-----
>  1 file changed, 29 insertions(+), 7 deletions(-)
> 
> diff --git a/toolchain/toolchain-external/toolchain-external.mk b/toolchain/toolchain-external/toolchain-external.mk
> index c73cc4a..45926cf 100644
> --- a/toolchain/toolchain-external/toolchain-external.mk
> +++ b/toolchain/toolchain-external/toolchain-external.mk
> @@ -551,7 +551,7 @@ endif
>  #                       considered when searching libraries for copy
>  #                       to the target filesystem.
>  
> -define TOOLCHAIN_EXTERNAL_INSTALL_CORE
> +define TOOLCHAIN_EXTERNAL_INSTALL_TARGET_LIBS
>  	$(Q)SYSROOT_DIR="$(call toolchain_find_sysroot,$(TOOLCHAIN_EXTERNAL_CC))" ; \
>  	if test -z "$${SYSROOT_DIR}" ; then \
>  		@echo "External toolchain doesn't support --sysroot. Cannot use." ; \
> @@ -576,8 +576,6 @@ define TOOLCHAIN_EXTERNAL_INSTALL_CORE
>  			$(call copy_toolchain_lib_root,$${ARCH_SYSROOT_DIR},$${SUPPORT_LIB_DIR},$${ARCH_LIB_DIR},$$libs,/usr/lib); \
>  		done ; \
>  	fi ; \
> -	$(call MESSAGE,"Copying external toolchain sysroot to staging...") ; \
> -	$(call copy_toolchain_sysroot,$${SYSROOT_DIR},$${ARCH_SYSROOT_DIR},$${ARCH_SUBDIR},$${ARCH_LIB_DIR},$${SUPPORT_LIB_DIR}) ; \
>  	if test "$(BR2_TOOLCHAIN_EXTERNAL_GDB_SERVER_COPY)" = "y"; then \
>  		$(call MESSAGE,"Copying gdbserver") ; \
>  		gdbserver_found=0 ; \
> @@ -595,6 +593,26 @@ define TOOLCHAIN_EXTERNAL_INSTALL_CORE
>  	fi
>  endef
>  
> +define TOOLCHAIN_EXTERNAL_INSTALL_SYSROOT_LIBS

Maybe the code from here...

> +	$(Q)SYSROOT_DIR="$(call toolchain_find_sysroot,$(TOOLCHAIN_EXTERNAL_CC))" ; \
> +	if test -z "$${SYSROOT_DIR}" ; then \
> +		@echo "External toolchain doesn't support --sysroot. Cannot use." ; \
> +		exit 1 ; \
> +	fi ; \
> +	ARCH_SYSROOT_DIR="$(call toolchain_find_sysroot,$(TOOLCHAIN_EXTERNAL_CC) $(TOOLCHAIN_EXTERNAL_CFLAGS))" ; \
> +	ARCH_LIB_DIR="$(call toolchain_find_libdir,$(TOOLCHAIN_EXTERNAL_CC) $(TOOLCHAIN_EXTERNAL_CFLAGS))" ; \
> +	SUPPORT_LIB_DIR="" ; \
> +	if test `find $${ARCH_SYSROOT_DIR} -name 'libstdc++.a' | wc -l` -eq 0 ; then \
> +		LIBSTDCPP_A_LOCATION=$$(LANG=C $(TOOLCHAIN_EXTERNAL_CC) $(TOOLCHAIN_EXTERNAL_CFLAGS) -print-file-name=libstdc++.a) ; \
> +		if [ -e "$${LIBSTDCPP_A_LOCATION}" ]; then \
> +			SUPPORT_LIB_DIR=`readlink -f $${LIBSTDCPP_A_LOCATION} | sed -r -e 's:libstdc\+\+\.a::'` ; \
> +		fi ; \
> +	fi ; \
> +	ARCH_SUBDIR=`echo $${ARCH_SYSROOT_DIR} | sed -r -e "s:^$${SYSROOT_DIR}(.*)/$$:\1:"` ; \

... to here could be moved to a common function, so it can be shared
between the staging and target functions?

Otherwise, I'm afraid we might miss changing one or the other if we need
to make some changes.

But granted, that might not be that easy... In which case a big fat
warning above each block, as a reminder to keep the other in sync, would
be needed.

Regards,
Yann E. MORIN.

> +	$(call MESSAGE,"Copying external toolchain sysroot to staging...") ; \
> +	$(call copy_toolchain_sysroot,$${SYSROOT_DIR},$${ARCH_SYSROOT_DIR},$${ARCH_SUBDIR},$${ARCH_LIB_DIR},$${SUPPORT_LIB_DIR})
> +endef
> +
>  # Special installation target used on the Blackfin architecture when
>  # FDPIC is not the primary binary format being used, but the user has
>  # nonetheless requested the installation of the FDPIC libraries to the
> @@ -685,15 +703,19 @@ define TOOLCHAIN_EXTERNAL_INSTALL_GDBINIT
>  	fi
>  endef
>  
> +define TOOLCHAIN_EXTERNAL_INSTALL_STAGING_CMDS
> +	$(TOOLCHAIN_EXTERNAL_INSTALL_SYSROOT_LIBS)
> +	$(TOOLCHAIN_EXTERNAL_INSTALL_WRAPPER)
> +	$(TOOLCHAIN_EXTERNAL_INSTALL_GDBINIT)
> +endef
> +
>  # Even though we're installing things in both the staging, the host
>  # and the target directory, we do everything within the
>  # install-staging step, arbitrarily.
> -define TOOLCHAIN_EXTERNAL_INSTALL_STAGING_CMDS
> -	$(TOOLCHAIN_EXTERNAL_INSTALL_CORE)
> +define TOOLCHAIN_EXTERNAL_INSTALL_TARGET_CMDS
> +	$(TOOLCHAIN_EXTERNAL_INSTALL_TARGET_LIBS)
>  	$(TOOLCHAIN_EXTERNAL_INSTALL_BFIN_FDPIC)
>  	$(TOOLCHAIN_EXTERNAL_INSTALL_BFIN_FLAT)
> -	$(TOOLCHAIN_EXTERNAL_INSTALL_WRAPPER)
> -	$(TOOLCHAIN_EXTERNAL_INSTALL_GDBINIT)
>  endef
>  
>  $(eval $(generic-package))
> -- 
> 2.0.0
> 
> _______________________________________________
> buildroot mailing list
> buildroot at busybox.net
> http://lists.busybox.net/mailman/listinfo/buildroot

-- 
.-----------------.--------------------.------------------.--------------------.
|  Yann E. MORIN  | Real-Time Embedded | /"\ ASCII RIBBON | Erics' conspiracy: |
| +33 662 376 056 | Software  Designer | \ / CAMPAIGN     |  ___               |
| +33 223 225 172 `------------.-------:  X  AGAINST      |  \e/  There is no  |
| http://ymorin.is-a-geek.org/ | _/*\_ | / \ HTML MAIL    |   v   conspiracy.  |
'------------------------------^-------^------------------^--------------------'


* [Buildroot] [RFCv1 2/4] pkg-generic: add step_pkg_size global instrumentation hook
  2014-06-07 21:46 ` [Buildroot] [RFCv1 2/4] pkg-generic: add step_pkg_size global instrumentation hook Thomas Petazzoni
  2014-06-08  2:56   ` Baruch Siach
@ 2014-06-09 22:02   ` Yann E. MORIN
  2014-06-10 16:42     ` Jérôme Pouiller
       [not found]     ` <3156840.4l9buZIenR@sagittea>
  2014-06-24 16:36   ` Arnout Vandecappelle
  2 siblings, 2 replies; 27+ messages in thread
From: Yann E. MORIN @ 2014-06-09 22:02 UTC (permalink / raw)
  To: buildroot

Thomas, All,

On 2014-06-07 23:46 +0200, Thomas Petazzoni spake thusly:
> This patch adds a global instrumentation hook that collects the list
> of files installed in $(TARGET_DIR) by each package, and stores this
> list into a file called $(BUILD_DIR)/<pkgname>.filelist. It can later
> be used to determine the size contribution of each package to the
> target root filesystem.
> 
> The only limitation is that if a file is installed by a package A, and
> then overridden by a file from package B, the file will only be listed
> in $(BUILD_DIR)/A.filelist as it is the first time we will see the
> file.

If we really wanted to account for the real package, we'd have to
somehow notice that a package did change the content of a file.

So, we would need to run sha1sum on all the files in the pre-step and
the post step. Any differing line would mean a new file, or a changed
file.

See below for a proposed storage solution.
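
A minimal sketch of that checksum comparison, in Python 3. It is only an
illustration, not part of the patch: the helper names snapshot() and
new_or_changed() are hypothetical, and the whole file is read into memory
for simplicity.

```python
import hashlib
import os


def snapshot(target_dir):
    """Map each regular file under target_dir to the SHA-1 of its content."""
    result = {}
    for root, _, files in os.walk(target_dir):
        for name in files:
            path = os.path.join(root, name)
            if os.path.islink(path):
                continue  # skip symlinks, as 'find -type f' would
            with open(path, "rb") as f:
                digest = hashlib.sha1(f.read()).hexdigest()
            result[os.path.relpath(path, target_dir)] = digest
    return result


def new_or_changed(before, after):
    """Return the sorted paths that are new, or whose checksum differs."""
    return sorted(p for p, digest in after.items() if before.get(p) != digest)
```

Taking one snapshot in the pre-step and one in the post-step, then calling
new_or_changed(), would list both newly installed and overwritten files,
whereas comparing file names alone only catches the new ones.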

> Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
> ---
>  package/pkg-generic.mk | 24 ++++++++++++++++++++++++
>  1 file changed, 24 insertions(+)
> 
> diff --git a/package/pkg-generic.mk b/package/pkg-generic.mk
> index 5116ed9..069653e 100644
> --- a/package/pkg-generic.mk
> +++ b/package/pkg-generic.mk
> @@ -55,6 +55,30 @@ define step_time
>  endef
>  GLOBAL_INSTRUMENTATION_HOOKS += step_time
>  
> +# Package size steps
> +define step_pkg_size_start
> +	echo "PKG SIZE START $(1)"
> +	(cd $(TARGET_DIR) ; find . -type f) | sort > \
> +		$(BUILD_DIR)/$(1).tmp_filelist_before

At first, I wondered if we should not store those files in the packages'
own build directory (along with the .stamp files.)

But then I went back to thinking about the second-package-to-install-a-file
issue raised in the commit log.

So, say we are able to determine which files a package installs or
modifies (using the sha1, for example). Then we could just store that
list in a single file, which gets appended to package after package,
and whose format would be:

package-name <TAB> path/to/file
package-name <TAB> path/to/other/file
other-package <TAB> path/to/third/file
other-package <TAB> path/to/file             <-- override

That way, the Python script has only one file to scan, which is sorted
by build order, and the script can detect overwritten files, and even
report them, while still accounting the size to the real package that
installed the file that will end up in the target.

Of course, using sha1 would slow the build quite a bit.

Thoughts?
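
To make the idea concrete, here is a rough Python 3 sketch of how a script
could read such a list. It assumes one "package<TAB>path" record per line,
in build order; the function name parse_manifest() is hypothetical.

```python
def parse_manifest(lines):
    """Parse 'package<TAB>path' lines, given in build order.

    Returns (owner, overrides): owner maps each path to the last package
    that installed it; overrides lists (path, previous_pkg, new_pkg).
    """
    owner = {}
    overrides = []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # tolerate blank lines
        pkg, path = line.split("\t", 1)
        previous = owner.get(path)
        if previous is not None and previous != pkg:
            # A later package replaced a file installed by an earlier one
            overrides.append((path, previous, pkg))
        owner[path] = pkg
    return owner, overrides
```

With the four example lines above, path/to/file would end up owned by
other-package, and the override would be reported once.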

Regards,
Yann E. MORIN.

> +endef
> +
> +define step_pkg_size_end
> +	echo "PKG SIZE END $(1)"
> +	(cd $(TARGET_DIR); find . -type f) | sort > \
> +		$(BUILD_DIR)/$(1).tmp_filelist_after
> +	diff -u $(BUILD_DIR)/$(1).tmp_filelist_before $(BUILD_DIR)/$(1).tmp_filelist_after | \
> +		grep '^\+\./' | sed 's%^\+%%' > $(BUILD_DIR)/$(1).filelist
> +	$(RM) -f $(BUILD_DIR)/$(1).tmp_filelist_before \
> +		$(BUILD_DIR)/$(1).tmp_filelist_after
> +endef
> +
> +define step_pkg_size
> +	$(if $(filter install-target,$(2)),\
> +		$(if $(filter start,$(1)),$(call step_pkg_size_start,$(3))) \
> +		$(if $(filter end,$(1)),$(call step_pkg_size_end,$(3))))
> +endef
> +GLOBAL_INSTRUMENTATION_HOOKS += step_pkg_size
> +
>  # User-supplied script
>  define step_user
>  	@$(foreach user_hook, $(BR2_INSTRUMENTATION_SCRIPTS), \
> -- 
> 2.0.0
> 

* [Buildroot] [RFCv1 3/4] support/scripts: add graph-size script
  2014-06-07 21:46 ` [Buildroot] [RFCv1 3/4] support/scripts: add graph-size script Thomas Petazzoni
@ 2014-06-09 22:06   ` Yann E. MORIN
  0 siblings, 0 replies; 27+ messages in thread
From: Yann E. MORIN @ 2014-06-09 22:06 UTC (permalink / raw)
  To: buildroot

Thomas, All,

On 2014-06-07 23:46 +0200, Thomas Petazzoni spake thusly:
> This new script uses the data collected by the step_pkg_size
> instrumentation hook to generate a pie chart of the size contribution
> of each package to the target root filesystem. To achieve this, it
> looks at each file in $(TARGET_DIR), and using the <pkgname>.filelist
> information collected by the step_pkg_size hook, it determines to
> which package the file belongs. It is therefore able to give the size
> installed by each package.
> 
> Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>

Well, I won't really review this Python code, since I can't be said to
be a Python expert! ;-)

However, with my proposal in patch 2/4, that script might need some big
overhaul, so I doubt it would be a useful review anyway.

Still, with my proposal, we should pass the list file as an argument to
this script. That would allow one to store the list file, and regenerate
the graphs later on.
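
Something along these lines, as a Python 3 sketch only. The --filelist
option and the parse_args() helper are hypothetical names; --builddir and
--output mirror the options of the script below.

```python
import argparse


def parse_args(argv):
    """Parse the command line of a graph-size-like script."""
    parser = argparse.ArgumentParser(
        description="Draw a pie chart of the size installed by each package")
    # Hypothetical option: the single list file proposed in patch 2/4
    parser.add_argument("--filelist", "-f", metavar="FILELIST", required=True,
                        help="File listing 'package<TAB>path' records")
    parser.add_argument("--builddir", "-i", metavar="BUILDDIR",
                        help="Buildroot output directory")
    parser.add_argument("--output", "-o", metavar="OUTPUT", required=True,
                        help="Output file (.pdf or .png extension)")
    return parser.parse_args(argv)
```

Passing the argument vector explicitly (parse_args(sys.argv[1:]) in the
real script) keeps the parsing easy to exercise on its own.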

Regards,
Yann E. MORIN.

> ---
>  support/scripts/graph-size | 164 +++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 164 insertions(+)
>  create mode 100755 support/scripts/graph-size
> 
> diff --git a/support/scripts/graph-size b/support/scripts/graph-size
> new file mode 100755
> index 0000000..5f7fe58
> --- /dev/null
> +++ b/support/scripts/graph-size
> @@ -0,0 +1,164 @@
> +#!/usr/bin/env python
> +
> +# Copyright (C) 2014 by Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
> +
> +# This program is free software; you can redistribute it and/or modify
> +# it under the terms of the GNU General Public License as published by
> +# the Free Software Foundation; either version 2 of the License, or
> +# (at your option) any later version.
> +#
> +# This program is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> +# General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with this program; if not, write to the Free Software
> +# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
> +
> +# This script draws a pie chart of the size used by each package in
> +# the target filesystem.
> +
> +import sys
> +import os.path
> +import argparse
> +import matplotlib.font_manager as fm
> +import matplotlib.pyplot as plt
> +
> +colors = ['#e60004', '#009836', '#2e1d86', '#ffed00',
> +          '#0068b5', '#f28e00', '#940084', '#97c000']
> +
> +#
> +# This function parses one .filelist file (which lists the files
> +# installed by a particular package), and returns a Python list of the
> +# file paths installed by the package.
> +#
> +# pkgf: path to the .filelist file
> +#
> +def handle_pkg(pkgf):
> +    files = []
> +    with open(pkgf) as f:
> +        for l in f.readlines():
> +            files.append(l.strip().replace("./", ""))
> +    return files
> +
> +#
> +# This function returns the list of files present in the skeleton,
> +# with the exception of the .empty files. It is used to create a fake
> +# entry in the dictionary of files installed by each package, emulated
> +# the presence of a skeleton package.
> +#
> +def handle_skeleton():
> +    skeleton_files = []
> +    for root, _, files in os.walk("system/skeleton"):
> +        for f in files:
> +            if f == ".empty":
> +                continue
> +            frelpath = os.path.relpath(os.path.join(root, f), "system/skeleton")
> +            skeleton_files.append(frelpath)
> +    return skeleton_files
> +
> +#
> +# This function returns a dict where each key is the name of a
> +# package, and the value is a list of the files installed by this
> +# package.
> +#
> +# builddir: path to the Buildroot output directory
> +#
> +def build_package_dict(builddir):
> +    pkgdict = {}
> +    # Parse all the .filelist files generated by the package
> +    # installation process
> +    filelist = os.listdir(os.path.join(builddir, "build"))
> +    for file in filelist:
> +        (_, ext) = os.path.splitext(file)
> +        if ext != ".filelist":
> +            continue
> +        pkgname = file.replace(".filelist", "")
> +        pkgdict[pkgname] = handle_pkg(os.path.join(builddir, "build", file))
> +    # Add a special fake entry for the files installed by the skeleton
> +    pkgdict['skeleton'] = handle_skeleton()
> +    return pkgdict
> +
> +#
> +# This function looks into the 'pkgdict' dictionary (as generated by
> +# build_package_dict) to see which package has installed the file 'f',
> +# and returns the name of that package.
> +#
> +def find_file(pkgdict, f):
> +    for (p, flist) in pkgdict.iteritems():
> +        if f in flist:
> +            return p
> +    return None
> +
> +#
> +# This function build a dictionary that contains the name of a package
> +# as key, and the size of the files installed by this package as the
> +# value.
> +#
> +def build_package_size(builddir):
> +    pkgdict = build_package_dict(builddir)
> +    pkgsize = {}
> +
> +    for root, _, files in os.walk(os.path.join(builddir, "target")):
> +        for f in files:
> +            fpath = os.path.join(root, f)
> +            if os.path.islink(fpath):
> +                continue
> +            frelpath = os.path.relpath(fpath, os.path.join(builddir, "target"))
> +            pkg = find_file(pkgdict, frelpath)
> +            if pkg is None:
> +                print "WARNING: %s is not part of any package" % frelpath
> +                pkg = "unknown"
> +            if pkg in pkgsize:
> +                pkgsize[pkg] += os.path.getsize(fpath)
> +            else:
> +                pkgsize[pkg] = os.path.getsize(fpath)
> +
> +    return pkgsize
> +
> +#
> +# Given a dict returned by build_package_size(), this function
> +# generates a pie chart of the size installed by each package.
> +#
> +def draw_graph(pkgsize, outputf):
> +    total = 0
> +    for (p, sz) in pkgsize.iteritems():
> +        total += sz
> +    labels = []
> +    values = []
> +    other_value = 0
> +    for (p, sz) in pkgsize.iteritems():
> +        if sz < (total * 0.01):
> +            other_value += sz
> +        else:
> +            labels.append(p)
> +            values.append(sz)
> +    labels.append("Other")
> +    values.append(other_value)
> +
> +    plt.figure()
> +    patches, texts, autotexts = plt.pie(values, labels=labels,
> +                                        autopct='%1.1f%%', shadow=True,
> +                                        colors=colors)
> +    # Reduce text size
> +    proptease = fm.FontProperties()
> +    proptease.set_size('xx-small')
> +    plt.setp(autotexts, fontproperties=proptease)
> +    plt.setp(texts, fontproperties=proptease)
> +
> +    plt.title('Size per package')
> +    plt.savefig(outputf)
> +
> +parser = argparse.ArgumentParser(description='Draw build time graphs')
> +
> +parser.add_argument("--builddir", '-i', metavar="BUILDDIR",
> +                    help="Buildroot output directory")
> +parser.add_argument("--output", '-o', metavar="OUTPUT", required=True,
> +                    help="Output file (.pdf or .png extension)")
> +args = parser.parse_args()
> +
> +
> +ps = build_package_size(args.builddir)
> +draw_graph(ps, args.output)
> +
> -- 
> 2.0.0
> 

* [Buildroot] [RFCv1 4/4] Makefile: implement a graph-size target
  2014-06-07 21:46 ` [Buildroot] [RFCv1 4/4] Makefile: implement a graph-size target Thomas Petazzoni
@ 2014-06-09 22:28   ` Yann E. MORIN
  0 siblings, 0 replies; 27+ messages in thread
From: Yann E. MORIN @ 2014-06-09 22:28 UTC (permalink / raw)
  To: buildroot

Thomas, All,

On 2014-06-07 23:46 +0200, Thomas Petazzoni spake thusly:
> Like we have graph-build and graph-depends, this commit implements a
> graph-size target to generate the corresponding graph.

Do not forget about the manual! ;-)

Regards,
Yann E. MORIN.

> Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
> ---
>  Makefile | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/Makefile b/Makefile
> index 0b4264a..fee6b46 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -677,6 +677,11 @@ graph-depends:
>  	|tee $(O)/graphs/$(@).dot \
>  	|dot -T$(BR_GRAPH_OUT) -o $(O)/graphs/$(@).$(BR_GRAPH_OUT)
>  
> +graph-size:
> +	@$(INSTALL) -d $(O)/graphs
> +	@cd "$(TOPDIR)"; \
> +	./support/scripts/graph-size --builddir $(O) --output $(O)/graphs/$(@).$(BR_GRAPH_OUT)
> +
>  else # ifeq ($(BR2_HAVE_DOT_CONFIG),y)
>  
>  all: menuconfig
> @@ -892,6 +897,7 @@ endif
>  	@echo '  manual-epub            - build manual in ePub'
>  	@echo '  graph-build            - generate graphs of the build times'
>  	@echo '  graph-depends          - generate graph of the dependency tree'
> +	@echo '  graph-size             - generate graph of the filesystem size'
>  	@echo
>  	@echo 'Miscellaneous:'
>  	@echo '  source                 - download all sources needed for offline-build'
> -- 
> 2.0.0
> 

* [Buildroot] [RFCv1 1/4] toolchain-external: split target installation from staging installation
  2014-06-09 21:49   ` Yann E. MORIN
@ 2014-06-10  8:04     ` Thomas Petazzoni
  2014-06-10 16:49       ` Yann E. MORIN
  0 siblings, 1 reply; 27+ messages in thread
From: Thomas Petazzoni @ 2014-06-10  8:04 UTC (permalink / raw)
  To: buildroot

Dear Yann E. MORIN,

On Mon, 9 Jun 2014 23:49:28 +0200, Yann E. MORIN wrote:

> > +define TOOLCHAIN_EXTERNAL_INSTALL_SYSROOT_LIBS
> 
> Maybe the code from here...
> 
> > +	$(Q)SYSROOT_DIR="$(call toolchain_find_sysroot,$(TOOLCHAIN_EXTERNAL_CC))" ; \
> > +	if test -z "$${SYSROOT_DIR}" ; then \
> > +		@echo "External toolchain doesn't support --sysroot. Cannot use." ; \
> > +		exit 1 ; \
> > +	fi ; \
> > +	ARCH_SYSROOT_DIR="$(call toolchain_find_sysroot,$(TOOLCHAIN_EXTERNAL_CC) $(TOOLCHAIN_EXTERNAL_CFLAGS))" ; \
> > +	ARCH_LIB_DIR="$(call toolchain_find_libdir,$(TOOLCHAIN_EXTERNAL_CC) $(TOOLCHAIN_EXTERNAL_CFLAGS))" ; \
> > +	SUPPORT_LIB_DIR="" ; \
> > +	if test `find $${ARCH_SYSROOT_DIR} -name 'libstdc++.a' | wc -l` -eq 0 ; then \
> > +		LIBSTDCPP_A_LOCATION=$$(LANG=C $(TOOLCHAIN_EXTERNAL_CC) $(TOOLCHAIN_EXTERNAL_CFLAGS) -print-file-name=libstdc++.a) ; \
> > +		if [ -e "$${LIBSTDCPP_A_LOCATION}" ]; then \
> > +			SUPPORT_LIB_DIR=`readlink -f $${LIBSTDCPP_A_LOCATION} | sed -r -e 's:libstdc\+\+\.a::'` ; \
> > +		fi ; \
> > +	fi ; \
> > +	ARCH_SUBDIR=`echo $${ARCH_SYSROOT_DIR} | sed -r -e "s:^$${SYSROOT_DIR}(.*)/$$:\1:"` ; \
> 
> ... to here could be moved to a common function, so it can be shared
> between the staging and target functions?

How do you suggest this to be done? The problem is that we need those
variables to be defined within the shell block that follows. I don't
see any easy way to factorize that. Or maybe I should just take this
opportunity, and move some of this crap into a helper shell script,
which will avoid these horrible shell blocks with lots of quoting and
backslashes.

I'll take a look at this.

Best regards,

Thomas
-- 
Thomas Petazzoni, CTO, Free Electrons
Embedded Linux, Kernel and Android engineering
http://free-electrons.com


* [Buildroot] [RFCv1 2/4] pkg-generic: add step_pkg_size global instrumentation hook
  2014-06-09 22:02   ` Yann E. MORIN
@ 2014-06-10 16:42     ` Jérôme Pouiller
       [not found]     ` <3156840.4l9buZIenR@sagittea>
  1 sibling, 0 replies; 27+ messages in thread
From: Jérôme Pouiller @ 2014-06-10 16:42 UTC (permalink / raw)
  To: buildroot

Hello Yann,

On Tuesday 10 June 2014 00:02:41 Yann E. MORIN wrote:
> Thomas, All,
> 
> On 2014-06-07 23:46 +0200, Thomas Petazzoni spake thusly:
> > This patch adds a global instrumentation hook that collects the list
> > of files installed in $(TARGET_DIR) by each package, and stores this
> > list into a file called $(BUILD_DIR)/<pkgname>.filelist. It can later
> > be used to determine the size contribution of each package to the
> > target root filesystem.
> > 
> > The only limitation is that if a file is installed by a package A, and
> > then overridden by a file from package B, the file will only be listed
> > in $(BUILD_DIR)/A.filelist as it is the first time we will see the
> > file.
> 
> If we really wanted to account for the real package, we'd have to
> somehow notice that a package did change the content of a file.
> 
> So, we would need to run sha1sum on all the files in the pre-step and
> the post step. Any differing line would mean a new file, or a changed
> file.
I did something similar in the past. I used inotify to follow the
modifications made to TARGET_DIR. It was fast and detected overwritten files.

Stopping the inotify process when the build was interrupted was a little
tricky, but the results were correct.

-- 
Jérôme Pouiller, Sysmic
Embedded Linux specialist
http://www.sysmic.fr


* [Buildroot] [RFCv1 1/4] toolchain-external: split target installation from staging installation
  2014-06-10  8:04     ` Thomas Petazzoni
@ 2014-06-10 16:49       ` Yann E. MORIN
  0 siblings, 0 replies; 27+ messages in thread
From: Yann E. MORIN @ 2014-06-10 16:49 UTC (permalink / raw)
  To: buildroot

Thomas, All,

On 2014-06-10 10:04 +0200, Thomas Petazzoni spake thusly:
> On Mon, 9 Jun 2014 23:49:28 +0200, Yann E. MORIN wrote:
> > > +define TOOLCHAIN_EXTERNAL_INSTALL_SYSROOT_LIBS
> > 
> > Maybe the code from here...
> > 
> > > +	$(Q)SYSROOT_DIR="$(call toolchain_find_sysroot,$(TOOLCHAIN_EXTERNAL_CC))" ; \
> > > +	if test -z "$${SYSROOT_DIR}" ; then \
> > > +		@echo "External toolchain doesn't support --sysroot. Cannot use." ; \
> > > +		exit 1 ; \
> > > +	fi ; \
> > > +	ARCH_SYSROOT_DIR="$(call toolchain_find_sysroot,$(TOOLCHAIN_EXTERNAL_CC) $(TOOLCHAIN_EXTERNAL_CFLAGS))" ; \
> > > +	ARCH_LIB_DIR="$(call toolchain_find_libdir,$(TOOLCHAIN_EXTERNAL_CC) $(TOOLCHAIN_EXTERNAL_CFLAGS))" ; \
> > > +	SUPPORT_LIB_DIR="" ; \
> > > +	if test `find $${ARCH_SYSROOT_DIR} -name 'libstdc++.a' | wc -l` -eq 0 ; then \
> > > +		LIBSTDCPP_A_LOCATION=$$(LANG=C $(TOOLCHAIN_EXTERNAL_CC) $(TOOLCHAIN_EXTERNAL_CFLAGS) -print-file-name=libstdc++.a) ; \
> > > +		if [ -e "$${LIBSTDCPP_A_LOCATION}" ]; then \
> > > +			SUPPORT_LIB_DIR=`readlink -f $${LIBSTDCPP_A_LOCATION} | sed -r -e 's:libstdc\+\+\.a::'` ; \
> > > +		fi ; \
> > > +	fi ; \
> > > +	ARCH_SUBDIR=`echo $${ARCH_SYSROOT_DIR} | sed -r -e "s:^$${SYSROOT_DIR}(.*)/$$:\1:"` ; \
> > 
> > ... to here could be moved to a common function, so it can be shared
> > between the staging and target functions?
> 
> How do you suggest this to be done? The problem is that we need those
> variables to be defined within the shell block that follows. I don't
> see any easy way to factorize that. Or maybe I should just take this
> opportunity, and move some of this crap into a helper shell script,
> which will avoid these horrible shell blocks with lots of quoting and
> backslashes.

Hey, I said: "that might not be that easy..." ;-)

The fact is, I was just pointing out the code duplication. That code is
not trivial, and there is an opportunity for those two parts to diverge
if we are not careful.

I don't know how we could do that sanely (without too many double- or
quadruple-dollar signs all over...) I can have a look at it, if you want.

Maybe a big fat comment is all we really need here.

In the end, this is just an RFC, and there's room for improvements. ;-)
But that sure would be a very nice addition to Buildroot! :-)

Regards,
Yann E. MORIN.


* [Buildroot] [RFCv1 2/4] pkg-generic: add step_pkg_size global instrumentation hook
       [not found]     ` <3156840.4l9buZIenR@sagittea>
@ 2014-06-10 16:58       ` Yann E. MORIN
  2014-06-10 17:37         ` Jérôme Pouiller
  0 siblings, 1 reply; 27+ messages in thread
From: Yann E. MORIN @ 2014-06-10 16:58 UTC (permalink / raw)
  To: buildroot

Jérôme, All,

On 2014-06-10 18:20 +0200, Jérôme Pouiller spake thusly:
> On Tuesday 10 June 2014 00:02:41 Yann E. MORIN wrote:
> > Thomas, All,
> > 
> > On 2014-06-07 23:46 +0200, Thomas Petazzoni spake thusly:
> > > This patch adds a global instrumentation hook that collects the list
> > > of files installed in $(TARGET_DIR) by each package, and stores this
> > > list into a file called $(BUILD_DIR)/<pkgname>.filelist. It can later
> > > be used to determine the size contribution of each package to the
> > > target root filesystem.
> > > 
> > > The only limitation is that if a file is installed by a package A, and
> > > then overridden by a file from package B, the file will only be listed
> > > in $(BUILD_DIR)/A.filelist as it is the first time we will see the
> > > file.
> > 
> > If we really wanted to account for the real package, we'd have to
> > somehow notice that a package did change the content of a file.
> > 
> > So, we would need to run sha1sum on all the files in the pre-step and
> > the post step. Any differing line would mean a new file, or a changed
> > file.
> I did something similar in the past. I used inotify to follow the
> modifications made to TARGET_DIR. It was fast and detected overwritten files.

I've been playing with inotify too, and it is not really up to the task.
I was able to overload the "wait queue" very easily, and missed some
events (sometimes a lot of them), when the build was creating/changing a
lot of files in rapid-fire.

I was able to overcome the "wait-queue" limitation by increasing the
number of queueable events, but that requires root access, and not
everyone can be root on their machine (think shared build farms). The
defaults are also quite low.
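
For reference, the current limits can be inspected without root. The
Python 3 sketch below assumes that the "wait queue" above corresponds to
the max_queued_events knob under /proc/sys/fs/inotify on Linux; the helper
name read_inotify_limit() is hypothetical.

```python
import os


def read_inotify_limit(name, base="/proc/sys/fs/inotify"):
    """Return the integer value of an inotify sysctl, or None if unreadable."""
    path = os.path.join(base, name)
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        # Not on Linux, no such knob, or unexpected content
        return None
```

Calling read_inotify_limit("max_queued_events") would show how many events
can be queued before the kernel starts dropping them; raising that value
is the part that needs root.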

> Handling build interruption was a little tricky, but the results were correct.

That's also an issue I see, and I don't think we want to go into too
complex a setting.

Regards,
Yann E. MORIN.


* [Buildroot] [RFCv1 2/4] pkg-generic: add step_pkg_size global instrumentation hook
  2014-06-10 16:58       ` Yann E. MORIN
@ 2014-06-10 17:37         ` Jérôme Pouiller
  0 siblings, 0 replies; 27+ messages in thread
From: Jérôme Pouiller @ 2014-06-10 17:37 UTC (permalink / raw)
  To: buildroot

On Tuesday 10 June 2014 18:58:40 Yann E. MORIN wrote:
> Jérôme, All,
> 
> On 2014-06-10 18:20 +0200, Jérôme Pouiller spake thusly:
> > On Tuesday 10 June 2014 00:02:41 Yann E. MORIN wrote:
> > > Thomas, All,
> > > 
> > > On 2014-06-07 23:46 +0200, Thomas Petazzoni spake thusly:
> > > > This patch adds a global instrumentation hook that collects the list
> > > > of files installed in $(TARGET_DIR) by each package, and stores this
> > > > list into a file called $(BUILD_DIR)/<pkgname>.filelist. It can later
> > > > be used to determine the size contribution of each package to the
> > > > target root filesystem.
> > > > 
> > > > The only limitation is that if a file is installed by a package A, and
> > > > then overridden by a file from package B, the file will only be listed
> > > > in $(BUILD_DIR)/A.filelist as it is the first time we will see the
> > > > file.
> > > 
> > > If we really wanted to account for the real package, we'd have to
> > > somehow notice that a package did change the content of a file.
> > > 
> > > So, we would need to run sha1sum on all the files in the pre-step and
> > > the post step. Any differing line would mean a new file, or a changed
> > > file.
> > 
> > I did something similar in the past. I used inotify to follow the
> > modifications made to TARGET_DIR. It was fast and detected overwritten files.
> 
> I've been playing with inotify too, and it is not really up to the task.
> I was able to overload the "wait queue" very easily, and missed some
> events (sometimes a lot of them), when the build was creating/changing a
> lot of files in rapid-fire.
> 
> I was able to overcome the "wait-queue" limitation by increasing the
> number of queueable events, but that requires root access, but not
> everyone can be root on their machine (think shared build farms), and
> the defaults are quite low.
Strange: I watched read accesses to the staging directory and did not
notice this issue. However, I only ran one build at a time.


> > Handling build interruption was a little tricky, but the results were correct.
> 
> That's also an issue I see, and I don't think we want to go into too
> complex a setting.
From a certain point of view, it is not much more complex than computing
file checksums.


* [Buildroot] [RFCv1 0/4] Generating a graph of the size installed by each package
  2014-06-07 21:46 [Buildroot] [RFCv1 0/4] Generating a graph of the size installed by each package Thomas Petazzoni
                   ` (4 preceding siblings ...)
  2014-06-07 21:54 ` [Buildroot] [RFCv1 0/4] Generating a graph of the size installed by each package Will Wagner
@ 2014-06-24 13:05 ` Luca Ceresoli
  2014-06-24 16:26   ` Yann E. MORIN
  2014-06-24 16:31   ` Arnout Vandecappelle
  5 siblings, 2 replies; 27+ messages in thread
From: Luca Ceresoli @ 2014-06-24 13:05 UTC (permalink / raw)
  To: buildroot

Dear Thomas,

sorry for jumping in late, but I have a question about this patchset.

Thomas Petazzoni wrote:
> Hello,
>
> I gave a training this week, and one of the question I had was how to
> analyze the size of the things that are present on the root
> filesystem. And I thought that Buildroot was lacking a tool to help
> with this. Therefore, the following set of commits implement a script
> that generates a pie chart of the size contribution of each package to
> the target root filesystem.

Nice feature!

>
> To see an example of the generated pie chart, see:
>
>    http://free-electrons.com/~thomas/pub/buildroot/graph-size.pdf
>
> The implementation consists in adding a global instrumentation hook
> that registers which files are installed by each package. A limitation
> of the current implementation is that when a file is installed by a
> package A and then overridden by package B, the mechanism will assume
> the file was installed by package A. Suggestions are welcome on how to
> solve this in a reasonably simple way.

It's very well possible that I'm missing something, but I don't get why
you need to save the list of all installed files.

Can't you just save the whole rootfs size before and after installation?
It can simply be computed by 'du -bs $(TARGET_DIR)', and it's way
easier to parse later. It would also take into account the change of
size for overwritten files, for free.

Of course your approach collects more information, but I don't see
this extra information used in the final graph.
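
As a rough Python 3 illustration of that before/after measurement: the
helper tree_size() is a hypothetical name, and it only sums regular files,
so it approximates 'du -bs' rather than reproducing it.

```python
import os


def tree_size(top):
    """Sum the apparent size, in bytes, of all regular files under top."""
    total = 0
    for root, _, files in os.walk(top):
        for name in files:
            path = os.path.join(root, name)
            if not os.path.islink(path):
                total += os.path.getsize(path)
    return total
```

Calling tree_size() on the target directory before and after a package is
installed and subtracting the two gives that package's contribution,
overwritten files included. The delta reflects sizes at install time, that
is, before the stripping and cleanup done in target-finalize.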

-- 
Luca


* [Buildroot] [RFCv1 0/4] Generating a graph of the size installed by each package
  2014-06-24 13:05 ` Luca Ceresoli
@ 2014-06-24 16:26   ` Yann E. MORIN
  2014-06-24 16:31   ` Arnout Vandecappelle
  1 sibling, 0 replies; 27+ messages in thread
From: Yann E. MORIN @ 2014-06-24 16:26 UTC (permalink / raw)
  To: buildroot

Luca, All,

On 2014-06-24 15:05 +0200, Luca Ceresoli spake thusly:
> Thomas Petazzoni wrote:
[--SNIP--]
> >To see an example of the generated pie chart, see:
> >
> >   http://free-electrons.com/~thomas/pub/buildroot/graph-size.pdf
> >
> >The implementation consists in adding a global instrumentation hook
> >that registers which files are installed by each package. A limitation
> >of the current implementation is that when a file is installed by a
> >package A and then overriden by package B, the mechanism will assume
> >the file was installed by package A. Suggestions to welcome on how to
> >solve this in a reasonably simple way.
> 
> It's very well possible that I'm missing something, but I don't get why
> you need to save the list of all installed files.
> 
> Can't you just save the whole rootfs size before and after installation?
> It can simply be computed by 'du -bs $(TARGET_DIR)', and it's way
> easier to parse later. It would also take into account the change of
> size for overwritten files, for free.
> 
> Of course your approach collects more information, but I don't see
> these extra info used in the final graph.

What is important is not the installed size, but the actual size on the
target.

The installed binaries are not stripped at install time, but later, just
before making the target images.

Thus, you want to wait until just before generating the target image, but
just after stripping the binaries, to compute the actual size of each
package.

So, what the patches basically do is:
  - record for each package what files are installed
  - just after stripping, get the size of each file,
  - assign the size to the package that installed that file

Of course, as Thomas says, if two packages install the same file, the
first package will be credited with the size of this file, not the last
package.
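
To make the limitation concrete, here is a minimal sketch of the
recording step, assuming the file lists are compared as plain sets of
paths (which is roughly what diffing two sorted 'find -type f' listings
amounts to). The function name and the two packages A and B are
hypothetical.

```python
def record_filelist(before, after):
    """Return the paths present after the install step but not before it."""
    return sorted(set(after) - set(before))


# Hypothetical scenario: A installs ./bin/tool, then B overwrites
# ./bin/tool and also adds ./bin/extra.
listing_empty = []
listing_after_a = ["./bin/tool"]
listing_after_b = ["./bin/extra", "./bin/tool"]

filelists = {
    "A": record_filelist(listing_empty, listing_after_a),
    "B": record_filelist(listing_after_a, listing_after_b),
}
print(filelists)
```

B's list contains only ./bin/extra: the overwritten ./bin/tool already
existed in the "before" listing, so it stays attributed to A.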

Overcoming this would not be trivial (but not impossible either). This
can be refined in later iterations of the patchset; we just need to
come up with a simple-enough heuristic. Storing the sha1 of each file
before install, and comparing after install, is one solution, but it is a
bit more involved than just diffing the output of 'find -type f'.
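
A rough sketch of that sha1 idea, under assumptions: `snapshot` and
`changed_files` are hypothetical helpers, and whole files are read into
memory for brevity. A file overwritten with byte-identical content would
still go unnoticed.

```python
import hashlib
import os


def snapshot(root):
    """Map each regular file (path relative to root) to its SHA-1 digest."""
    digests = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            with open(path, "rb") as fh:
                digest = hashlib.sha1(fh.read()).hexdigest()
            digests[os.path.relpath(path, root)] = digest
    return digests


def changed_files(before, after):
    """Paths that are new, or whose content differs, between two snapshots."""
    return sorted(p for p, d in after.items() if before.get(p) != d)
```

Taking one snapshot before and one after a package's target install
would credit that package with both the files it adds and the files it
overwrites, at the cost of hashing the whole tree twice per package.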

Regards,
Yann E. MORIN.

-- 
.-----------------.--------------------.------------------.--------------------.
|  Yann E. MORIN  | Real-Time Embedded | /"\ ASCII RIBBON | Erics' conspiracy: |
| +33 662 376 056 | Software  Designer | \ / CAMPAIGN     |  ___               |
| +33 223 225 172 `------------.-------:  X  AGAINST      |  \e/  There is no  |
| http://ymorin.is-a-geek.org/ | _/*\_ | / \ HTML MAIL    |   v   conspiracy.  |
'------------------------------^-------^------------------^--------------------'

* [Buildroot] [RFCv1 0/4] Generating a graph of the size installed by each package
  2014-06-24 13:05 ` Luca Ceresoli
  2014-06-24 16:26   ` Yann E. MORIN
@ 2014-06-24 16:31   ` Arnout Vandecappelle
  2014-06-24 16:42     ` Thomas Petazzoni
  2014-06-24 19:54     ` Luca Ceresoli
  1 sibling, 2 replies; 27+ messages in thread
From: Arnout Vandecappelle @ 2014-06-24 16:31 UTC (permalink / raw)
  To: buildroot

On 24/06/14 15:05, Luca Ceresoli wrote:
[snip]
>> The implementation consists in adding a global instrumentation hook
>> that registers which files are installed by each package. A limitation
>> of the current implementation is that when a file is installed by a
>> package A and then overriden by package B, the mechanism will assume
>> the file was installed by package A. Suggestions to welcome on how to
>> solve this in a reasonably simple way.
> 
> It's very well possible that I'm missing something, but I don't get why
> you need to save the list of all installed files.
> 
> Can't you just save the whole rootfs size before and after installation?
> It can simply be computed by 'du -bs $(TARGET_DIR)', and it's way
> easier to parse later. It would also take into account the change of
> size for overwritten files, for free.
> 
> Of course your approach collects more information, but I don't see
> these extra info used in the final graph.

 The reason is explained a bit further in that cover letter: we only do
stripping and removing of redundant stuff in the finalize step, so calculating
the size before that doesn't make much sense.

 The alternative of repeating the finalize step after each package doesn't sound
very attractive either.

 Regards,
 Arnout


-- 
Arnout Vandecappelle                          arnout at mind be
Senior Embedded Software Architect            +32-16-286500
Essensium/Mind                                http://www.mind.be
G.Geenslaan 9, 3001 Leuven, Belgium           BE 872 984 063 RPR Leuven
LinkedIn profile: http://www.linkedin.com/in/arnoutvandecappelle
GPG fingerprint:  7CB5 E4CC 6C2E EFD4 6E3D A754 F963 ECAB 2450 2F1F

* [Buildroot] [RFCv1 2/4] pkg-generic: add step_pkg_size global instrumentation hook
  2014-06-07 21:46 ` [Buildroot] [RFCv1 2/4] pkg-generic: add step_pkg_size global instrumentation hook Thomas Petazzoni
  2014-06-08  2:56   ` Baruch Siach
  2014-06-09 22:02   ` Yann E. MORIN
@ 2014-06-24 16:36   ` Arnout Vandecappelle
  2014-06-24 16:41     ` Thomas Petazzoni
  2014-06-24 16:53     ` Yann E. MORIN
  2 siblings, 2 replies; 27+ messages in thread
From: Arnout Vandecappelle @ 2014-06-24 16:36 UTC (permalink / raw)
  To: buildroot

On 07/06/14 23:46, Thomas Petazzoni wrote:
> This patch adds a global instrumentation hook that collects the list
> of files installed in $(TARGET_DIR) by each package, and stores this
> list into a file called $(BUILD_DIR)/<pkgname>.filelist. It can later
> be used to determine the size contribution of each package to the
> target root filesystem.
> 
> The only limitation is that if a file is installed by a package A, and
> then overriden by a file from package B, the file will only be listed
> in $(BUILD_DIR)/A.filelist as it is the first time we will see the
> file.
> 
> Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
> ---
>  package/pkg-generic.mk | 24 ++++++++++++++++++++++++
>  1 file changed, 24 insertions(+)
> 
> diff --git a/package/pkg-generic.mk b/package/pkg-generic.mk
> index 5116ed9..069653e 100644
> --- a/package/pkg-generic.mk
> +++ b/package/pkg-generic.mk
> @@ -55,6 +55,30 @@ define step_time
>  endef
>  GLOBAL_INSTRUMENTATION_HOOKS += step_time
>  
> +# Package size steps
> +define step_pkg_size_start
> +	echo "PKG SIZE START $(1)"
> +	(cd $(TARGET_DIR) ; find . -type f) | sort > \
> +		$(BUILD_DIR)/$(1).tmp_filelist_before
> +endef
> +
> +define step_pkg_size_end
> +	echo "PKG SIZE END $(1)"
> +	(cd $(TARGET_DIR); find . -type f) | sort > \
> +		$(BUILD_DIR)/$(1).tmp_filelist_after
> +	diff -u $(BUILD_DIR)/$(1).tmp_filelist_before $(BUILD_DIR)/$(1).tmp_filelist_after | \
> +		grep '^\+\./' | sed 's%^\+%%' > $(BUILD_DIR)/$(1).filelist
> +	$(RM) -f $(BUILD_DIR)/$(1).tmp_filelist_before \
> +		$(BUILD_DIR)/$(1).tmp_filelist_after
> +endef
> +
> +define step_pkg_size
> +	$(if $(filter install-target,$(2)),\
> +		$(if $(filter start,$(1)),$(call step_pkg_size_start,$(3))) \
> +		$(if $(filter end,$(1)),$(call step_pkg_size_end,$(3))))
> +endef
> +GLOBAL_INSTRUMENTATION_HOOKS += step_pkg_size

 Since these instrumentation steps are relatively expensive (especially for
large builds with many small packages), I would prefer to only enable this after
setting some environment variable or config option or similar.

 Regards,
 Arnout

> +
>  # User-supplied script
>  define step_user
>  	@$(foreach user_hook, $(BR2_INSTRUMENTATION_SCRIPTS), \
> 


-- 
Arnout Vandecappelle                          arnout at mind be
Senior Embedded Software Architect            +32-16-286500
Essensium/Mind                                http://www.mind.be
G.Geenslaan 9, 3001 Leuven, Belgium           BE 872 984 063 RPR Leuven
LinkedIn profile: http://www.linkedin.com/in/arnoutvandecappelle
GPG fingerprint:  7CB5 E4CC 6C2E EFD4 6E3D A754 F963 ECAB 2450 2F1F

* [Buildroot] [RFCv1 2/4] pkg-generic: add step_pkg_size global instrumentation hook
  2014-06-24 16:36   ` Arnout Vandecappelle
@ 2014-06-24 16:41     ` Thomas Petazzoni
  2014-06-24 16:53     ` Yann E. MORIN
  1 sibling, 0 replies; 27+ messages in thread
From: Thomas Petazzoni @ 2014-06-24 16:41 UTC (permalink / raw)
  To: buildroot

Dear Arnout Vandecappelle,

On Tue, 24 Jun 2014 18:36:58 +0200, Arnout Vandecappelle wrote:

> > +# Package size steps
> > +define step_pkg_size_start
> > +	echo "PKG SIZE START $(1)"
> > +	(cd $(TARGET_DIR) ; find . -type f) | sort > \
> > +		$(BUILD_DIR)/$(1).tmp_filelist_before
> > +endef
> > +
> > +define step_pkg_size_end
> > +	echo "PKG SIZE END $(1)"
> > +	(cd $(TARGET_DIR); find . -type f) | sort > \
> > +		$(BUILD_DIR)/$(1).tmp_filelist_after
> > +	diff -u $(BUILD_DIR)/$(1).tmp_filelist_before $(BUILD_DIR)/$(1).tmp_filelist_after | \
> > +		grep '^\+\./' | sed 's%^\+%%' > $(BUILD_DIR)/$(1).filelist
> > +	$(RM) -f $(BUILD_DIR)/$(1).tmp_filelist_before \
> > +		$(BUILD_DIR)/$(1).tmp_filelist_after
> > +endef
> > +
> > +define step_pkg_size
> > +	$(if $(filter install-target,$(2)),\
> > +		$(if $(filter start,$(1)),$(call step_pkg_size_start,$(3))) \
> > +		$(if $(filter end,$(1)),$(call step_pkg_size_end,$(3))))
> > +endef
> > +GLOBAL_INSTRUMENTATION_HOOKS += step_pkg_size
> 
>  Since these instrumentation steps are relatively expensive (especially for
> large builds with many small packages), I would prefer to only enable this after
> setting some environment variable or config option or similar.

Agreed, I'll add this in the next iteration of the patch series.

Thomas
-- 
Thomas Petazzoni, CTO, Free Electrons
Embedded Linux, Kernel and Android engineering
http://free-electrons.com

* [Buildroot] [RFCv1 0/4] Generating a graph of the size installed by each package
  2014-06-24 16:31   ` Arnout Vandecappelle
@ 2014-06-24 16:42     ` Thomas Petazzoni
  2014-06-24 19:54     ` Luca Ceresoli
  1 sibling, 0 replies; 27+ messages in thread
From: Thomas Petazzoni @ 2014-06-24 16:42 UTC (permalink / raw)
  To: buildroot

Luca, Arnout,

On Tue, 24 Jun 2014 18:31:30 +0200, Arnout Vandecappelle wrote:

> > Of course your approach collects more information, but I don't see
> > these extra info used in the final graph.
> 
>  The reason is explained a bit further in that cover letter: we only do
> stripping and removing of redundant stuff in the finalize step, so calculating
> the size before that doesn't make much sense.

Exactly.

>  The alternative of repeating the finalize step after each package doesn't sound
> very attractive either.

Indeed. Doing all the stripping and removal of unneeded stuff involves
a lot of 'find' invocations on the entire output/target directory,
which we probably don't want to repeat over and over again at the end
of the target installation step of each package.

Best regards,

Thomas
-- 
Thomas Petazzoni, CTO, Free Electrons
Embedded Linux, Kernel and Android engineering
http://free-electrons.com

* [Buildroot] [RFCv1 2/4] pkg-generic: add step_pkg_size global instrumentation hook
  2014-06-24 16:36   ` Arnout Vandecappelle
  2014-06-24 16:41     ` Thomas Petazzoni
@ 2014-06-24 16:53     ` Yann E. MORIN
  1 sibling, 0 replies; 27+ messages in thread
From: Yann E. MORIN @ 2014-06-24 16:53 UTC (permalink / raw)
  To: buildroot

Arnout, All,

On 2014-06-24 18:36 +0200, Arnout Vandecappelle spake thusly:
> On 07/06/14 23:46, Thomas Petazzoni wrote:
> > This patch adds a global instrumentation hook that collects the list
> > of files installed in $(TARGET_DIR) by each package, and stores this
> > list into a file called $(BUILD_DIR)/<pkgname>.filelist. It can later
> > be used to determine the size contribution of each package to the
> > target root filesystem.
> > 
> > The only limitation is that if a file is installed by a package A, and
> > then overriden by a file from package B, the file will only be listed
> > in $(BUILD_DIR)/A.filelist as it is the first time we will see the
> > file.
> > 
> > Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
> > ---
> >  package/pkg-generic.mk | 24 ++++++++++++++++++++++++
> >  1 file changed, 24 insertions(+)
> > 
> > diff --git a/package/pkg-generic.mk b/package/pkg-generic.mk
> > index 5116ed9..069653e 100644
> > --- a/package/pkg-generic.mk
> > +++ b/package/pkg-generic.mk
> > @@ -55,6 +55,30 @@ define step_time
> >  endef
> >  GLOBAL_INSTRUMENTATION_HOOKS += step_time
> >  
> > +# Package size steps
> > +define step_pkg_size_start
> > +	echo "PKG SIZE START $(1)"
> > +	(cd $(TARGET_DIR) ; find . -type f) | sort > \
> > +		$(BUILD_DIR)/$(1).tmp_filelist_before
> > +endef
> > +
> > +define step_pkg_size_end
> > +	echo "PKG SIZE END $(1)"
> > +	(cd $(TARGET_DIR); find . -type f) | sort > \
> > +		$(BUILD_DIR)/$(1).tmp_filelist_after
> > +	diff -u $(BUILD_DIR)/$(1).tmp_filelist_before $(BUILD_DIR)/$(1).tmp_filelist_after | \
> > +		grep '^\+\./' | sed 's%^\+%%' > $(BUILD_DIR)/$(1).filelist
> > +	$(RM) -f $(BUILD_DIR)/$(1).tmp_filelist_before \
> > +		$(BUILD_DIR)/$(1).tmp_filelist_after
> > +endef
> > +
> > +define step_pkg_size
> > +	$(if $(filter install-target,$(2)),\
> > +		$(if $(filter start,$(1)),$(call step_pkg_size_start,$(3))) \
> > +		$(if $(filter end,$(1)),$(call step_pkg_size_end,$(3))))
> > +endef
> > +GLOBAL_INSTRUMENTATION_HOOKS += step_pkg_size
> 
>  Since these instrumentation steps are relatively expensive (especially for
> large builds with many small packages), I would prefer to only enable this after
> setting some environment variable or config option or similar.

Ideally, I would like an entry in the menuconfig (probably in the "build
options" sub-menu):

    [*] Graphs and instrumentation  --->
        *** Graphs ***
        [ ] Packages' installed size
        [ ] Packages' build time
        [ ] Packages' dependencies
        [ ] Global dependencies
        *** Instrumentation ***
        [ ] Check RPATH
        [ ] Check for overwritten files

Thus, all relevant information would be automatically generated. Of
course, some can be generated on-demand (eg. the dependencies graphs),
while others 

(Note: the rpath check I'm currently working on, and the overwriting
check would be relatively easy to implement I guess; but that's just an
example.)

Regards,
Yann E. MORIN.

-- 
.-----------------.--------------------.------------------.--------------------.
|  Yann E. MORIN  | Real-Time Embedded | /"\ ASCII RIBBON | Erics' conspiracy: |
| +33 662 376 056 | Software  Designer | \ / CAMPAIGN     |  ___               |
| +33 223 225 172 `------------.-------:  X  AGAINST      |  \e/  There is no  |
| http://ymorin.is-a-geek.org/ | _/*\_ | / \ HTML MAIL    |   v   conspiracy.  |
'------------------------------^-------^------------------^--------------------'

* [Buildroot] [RFCv1 0/4] Generating a graph of the size installed by each package
  2014-06-24 16:31   ` Arnout Vandecappelle
  2014-06-24 16:42     ` Thomas Petazzoni
@ 2014-06-24 19:54     ` Luca Ceresoli
  2014-06-24 20:11       ` Thomas Petazzoni
  1 sibling, 1 reply; 27+ messages in thread
From: Luca Ceresoli @ 2014-06-24 19:54 UTC (permalink / raw)
  To: buildroot

Dear Arnout, Yann, Thomas,

Arnout Vandecappelle wrote:
> On 24/06/14 15:05, Luca Ceresoli wrote:
> [snip]
>>> The implementation consists in adding a global instrumentation hook
>>> that registers which files are installed by each package. A limitation
>>> of the current implementation is that when a file is installed by a
>>> package A and then overriden by package B, the mechanism will assume
>>> the file was installed by package A. Suggestions to welcome on how to
>>> solve this in a reasonably simple way.
>>
>> It's very well possible that I'm missing something, but I don't get why
>> you need to save the list of all installed files.
>>
>> Can't you just save the whole rootfs size before and after installation?
>> It can simply be computed by 'du -bs $(TARGET_DIR)', and it's way
>> easier to parse later. It would also take into account the change of
>> size for overwritten files, for free.
>>
>> Of course your approach collects more information, but I don't see
>> these extra info used in the final graph.
>
>   The reason is explained a bit further in that cover letter: we only do
> stripping and removing of redundant stuff in the finalize step, so calculating
> the size before that doesn't make much sense.

Aaah, yes, stripping! Of course! I was sure there is a good reason...

I'll try to connect my brain to my fingers next time (or at least read
the email properly)...

Thanks for having gently replied instead of simply blaming me (which
would have made a lot of sense indeed!).

-- 
Luca

* [Buildroot] [RFCv1 0/4] Generating a graph of the size installed by each package
  2014-06-24 19:54     ` Luca Ceresoli
@ 2014-06-24 20:11       ` Thomas Petazzoni
  0 siblings, 0 replies; 27+ messages in thread
From: Thomas Petazzoni @ 2014-06-24 20:11 UTC (permalink / raw)
  To: buildroot

Dear Luca Ceresoli,

On Tue, 24 Jun 2014 21:54:17 +0200, Luca Ceresoli wrote:

> >   The reason is explained a bit further in that cover letter: we only do
> > stripping and removing of redundant stuff in the finalize step, so calculating
> > the size before that doesn't make much sense.
> 
> Aaah, yes, stripping! Of course! I was sure there is a good reason...

Well really it's not only stripping. It's all the things that
target-finalize is doing: stripping of course, but also removing .h
files, .a files, documentation, unneeded locales, and more.

Therefore, my approach was rather to look only at the files that are in
output/target/ (and their size) at the end of the build, and then try to
match them to the packages they come from.
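
As a rough illustration of that matching step (not the actual
graph-size script), the sketch below assumes one <pkgname>.filelist per
package in the build directory, with one './'-prefixed path per line as
produced by the hook. The helper names and the "unknown" bucket for
unmatched files are assumptions.

```python
import os
from collections import defaultdict

SUFFIX = ".filelist"


def load_filelists(build_dir):
    """Map each recorded path to the package whose file list names it."""
    owner = {}
    for entry in sorted(os.listdir(build_dir)):
        if not entry.endswith(SUFFIX):
            continue
        pkg = entry[: -len(SUFFIX)]
        with open(os.path.join(build_dir, entry)) as fh:
            for line in fh:
                path = line.rstrip("\n")
                if path:
                    # "./bin/foo" and "bin/foo" normalize to the same key.
                    owner.setdefault(os.path.normpath(path), pkg)
    return owner


def package_sizes(target_dir, owner):
    """Sum the size of files still present in target_dir, per package."""
    sizes = defaultdict(int)
    for dirpath, _dirnames, filenames in os.walk(target_dir):
        for name in filenames:
            full = os.path.join(dirpath, name)
            if os.path.islink(full) or not os.path.isfile(full):
                continue
            rel = os.path.normpath(os.path.relpath(full, target_dir))
            # Files no package claimed go into a catch-all bucket.
            sizes[owner.get(rel, "unknown")] += os.lstat(full).st_size
    return dict(sizes)
```

Because the walk starts from what is actually left in the target
directory, anything removed by target-finalize (headers, static
libraries, documentation) simply never contributes to a package's total.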

Best regards,

Thomas
-- 
Thomas Petazzoni, CTO, Free Electrons
Embedded Linux, Kernel and Android engineering
http://free-electrons.com

Thread overview: 27+ messages
2014-06-07 21:46 [Buildroot] [RFCv1 0/4] Generating a graph of the size installed by each package Thomas Petazzoni
2014-06-07 21:46 ` [Buildroot] [RFCv1 1/4] toolchain-external: split target installation from staging installation Thomas Petazzoni
2014-06-09 21:49   ` Yann E. MORIN
2014-06-10  8:04     ` Thomas Petazzoni
2014-06-10 16:49       ` Yann E. MORIN
2014-06-07 21:46 ` [Buildroot] [RFCv1 2/4] pkg-generic: add step_pkg_size global instrumentation hook Thomas Petazzoni
2014-06-08  2:56   ` Baruch Siach
2014-06-08  8:19     ` Thomas Petazzoni
2014-06-09 22:02   ` Yann E. MORIN
2014-06-10 16:42     ` Jérôme Pouiller
     [not found]     ` <3156840.4l9buZIenR@sagittea>
2014-06-10 16:58       ` Yann E. MORIN
2014-06-10 17:37         ` Jérôme Pouiller
2014-06-24 16:36   ` Arnout Vandecappelle
2014-06-24 16:41     ` Thomas Petazzoni
2014-06-24 16:53     ` Yann E. MORIN
2014-06-07 21:46 ` [Buildroot] [RFCv1 3/4] support/scripts: add graph-size script Thomas Petazzoni
2014-06-09 22:06   ` Yann E. MORIN
2014-06-07 21:46 ` [Buildroot] [RFCv1 4/4] Makefile: implement a graph-size target Thomas Petazzoni
2014-06-09 22:28   ` Yann E. MORIN
2014-06-07 21:54 ` [Buildroot] [RFCv1 0/4] Generating a graph of the size installed by each package Will Wagner
2014-06-08  7:42   ` Thomas Petazzoni
2014-06-24 13:05 ` Luca Ceresoli
2014-06-24 16:26   ` Yann E. MORIN
2014-06-24 16:31   ` Arnout Vandecappelle
2014-06-24 16:42     ` Thomas Petazzoni
2014-06-24 19:54     ` Luca Ceresoli
2014-06-24 20:11       ` Thomas Petazzoni
