All of lore.kernel.org
 help / color / mirror / Atom feed
* [Buildroot] [PATCH v4 2015.08 0/4] Generate package size statistics
@ 2015-05-25 21:56 Thomas Petazzoni
  2015-05-25 21:56 ` [Buildroot] [PATCH v4 2015.08 1/4] pkg-generic: add step_pkg_size global instrumentation hook Thomas Petazzoni
                   ` (4 more replies)
  0 siblings, 5 replies; 20+ messages in thread
From: Thomas Petazzoni @ 2015-05-25 21:56 UTC (permalink / raw)
  To: buildroot

Hello,

Here is the fourth iteration of the patches adding a 'size-stats'
feature, which mainly allows to generate a pie chart of the size
contribution of each package to the overall root filesystem size.

This code is also available at:

  http://git.free-electrons.com/users/thomas-petazzoni/buildroot/log/?h=size-stats-v4

Remaining questions:

 - How to handle files added by post-build script and rootfs overlays?
   Handling rootfs overlays like the skeleton is done today is
   possible, but it doesn't work for post-build scripts. One
   possibility is to use the script suggested by Romain Naour: to add
   an explicit call to step_pkg_size_start before copying the
   overlays, and a call to step_pkg_size_end after copying the
   overlays, with a fake package name:

	$(call step_pkg_size_start,rootfs-overlay)
	@$(foreach d, $(call qstrip,$(BR2_ROOTFS_OVERLAY)), \
		$(call MESSAGE,"Copying overlay $(d)"); \
		rsync -a --ignore-times $(RSYNC_VCS_EXCLUSIONS) \
			--chmod=Du+w --exclude .empty --exclude '*~' \
			$(d)/ $(TARGET_DIR)$(sep))
	$(call step_pkg_size_end,rootfs-overlay)

   We could do the same for the post-build script and the skeleton,
   and reduce the complexity in the size-stats script itself.

   What do you think?

Changes since v3:

 - Remove the BR2_COLLECT_FILE_SIZE_STATS option, and collect the
   necessary data unconditionally. This was suggested by Yann
   E. Morin, who did some testing of the performance impact, and found
   that it was so insignificant that it is negligible. See Yann's
   analysis at
   http://lists.busybox.net/pipermail/buildroot/2015-February/119431.html.

 - Store the temporary files listing the files before/after inside
   each package build directory, rather than in $(BUILD_DIR)
   directly. This was suggested by Yann E. Morin.

   They are now named .br_filelist_before and .br_filelist_after. They
   are also no longer removed after the step_pkg_size_end hook, as
   they don't consume that much space and could be useful for
   analysis. If anyone finds that annoying, I'll be happy to change
   this back to remove them.

 - As suggested by Arnout, make a number of improvements to the
   size-stats Python script:

   * Use collections.defaultdict()

   * Use a dict of tuples (package name, file size) for the filesdict,
     instead of a dict of dict containing just two entries.

   * Factorize code between build_skeleton_dict() and
     build_package_dict() into a add_file() function.

   * Use [2:] instead of .replace("./", "") to remove the leading "./"
     in relative file paths returned by find.

   * Many other small things: one digit for percentages, use sum()
     when possible, rename pkgdict to filesdict where it makes sense,
     etc.

 - In the main Makefile:

   * Use mkdir -p instead of install -d, as suggested by Arnout.

   * Use $(Q) instead of @, as suggested by Arnout

   * Do not cd into $(CONFIG_DIR), as it would break out of tree
     build, since the path to the skeleton is passed as a relative
     path.

 - Add a new patch updating the manual with a new section describing
   this feature.

Changes since v2:

 - Use print0 in find and -0 in xargs to support spaces in filenames,
   just in case. Suggested by J?r?me Pouiller.

 - Support custom skeleton locations, instead of assuming the skeleton
   comes from system/skeleton.

 - Fix the size-stats target to actually call the size-stats script,
   and not graph-size, which was the old name of the script in
   previous versions of the series.

 - Fix usage of the size-stats target in out of tree build
   situations. For consistency reasons, we chose to mimic what
   graph-depends already does, even if size-stats does not necessarily
   need to be executed from within CONFIG_DIR.

 - Add one patch to make sure that 'make clean' removes the graphs/
   subdirectory in the output directory, which wasn't done today. This
   issue isn't specific to size-stats, and already existed with
   graph-build and graph-depends.

Changes since v1:

 - Improve the logic to properly take into account packages that
   overwrite files installed by other packages.

 - Make the new pkg_step hook optional.

 - Extend the facility to not only generate a graph, but also CSV
   files. As a consequence, everything was renamed from 'graph-size'
   to 'size-stats'.

Best regards,

Thomas

Thomas Petazzoni (4):
  pkg-generic: add step_pkg_size global instrumentation hook
  support/scripts: add size-stats script
  Makefile: implement a size-stats target
  docs/manual: add section about size graphing

 Makefile                     |   9 ++
 docs/manual/common-usage.txt |  39 +++++++
 package/pkg-generic.mk       |  32 ++++++
 support/scripts/size-stats   | 238 +++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 318 insertions(+)
 create mode 100755 support/scripts/size-stats

-- 
2.1.0

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Buildroot] [PATCH v4 2015.08 1/4] pkg-generic: add step_pkg_size global instrumentation hook
  2015-05-25 21:56 [Buildroot] [PATCH v4 2015.08 0/4] Generate package size statistics Thomas Petazzoni
@ 2015-05-25 21:56 ` Thomas Petazzoni
  2015-05-28  3:29   ` Ryan Barnett
  2015-07-11 10:46   ` Romain Naour
  2015-05-25 21:56 ` [Buildroot] [PATCH v4 2015.08 2/4] support/scripts: add size-stats script Thomas Petazzoni
                   ` (3 subsequent siblings)
  4 siblings, 2 replies; 20+ messages in thread
From: Thomas Petazzoni @ 2015-05-25 21:56 UTC (permalink / raw)
  To: buildroot

This patch adds a global instrumentation hook that collects the list
of files installed in $(TARGET_DIR) by each package, and stores this
list into a file called $(BUILD_DIR)/packages-file-list.txt. It can
later be used to determine the size contribution of each package to
the target root filesystem.

Note that in order to detect if a file installed by one package is
later overriden by another package, we calculate the md5 of installed
files and compare them at each installation of a new package.

This commit also adds a Config.in option to enable the collection of
this data, as calculating the md5 of all installed files at the
beginning and end of the installation of each package can be
considered a time-consuming process which maybe some users will not be
willing to suffer from.

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
---
 package/pkg-generic.mk | 32 ++++++++++++++++++++++++++++++++
 1 file changed, 32 insertions(+)

diff --git a/package/pkg-generic.mk b/package/pkg-generic.mk
index d5b29f0..944b0fc 100644
--- a/package/pkg-generic.mk
+++ b/package/pkg-generic.mk
@@ -55,6 +55,38 @@ define step_time
 endef
 GLOBAL_INSTRUMENTATION_HOOKS += step_time
 
+# Hooks to collect statistics about installed files
+
+# This hook will be called before the target installation of a
+# package. We store in a file named .br_filelist_before the list of
+# files currently installed in the target. Note that the MD5 is also
+# stored, in order to identify if the files are overwritten.
+define step_pkg_size_start
+	(cd $(TARGET_DIR) ; find . -type f -print0 | xargs -0 md5sum) | sort > \
+		$($(PKG)_DIR)/.br_filelist_before
+endef
+
+# This hook will be called after the target installation of a
+# package. We store in a file named .br_filelist_after the list of
+# files (and their MD5) currently installed in the target. We then do
+# a diff with the .br_filelist_before to compute the list of files
+# installed by this package.
+define step_pkg_size_end
+	(cd $(TARGET_DIR); find . -type f -print0 | xargs -0 md5sum) | sort > \
+		$($(PKG)_DIR)/.br_filelist_after
+	comm -13 $($(PKG)_DIR)/.br_filelist_before $($(PKG)_DIR)/.br_filelist_after | \
+		while read hash file ; do \
+			echo "$(1),$${file}" >> $(BUILD_DIR)/packages-file-list.txt ; \
+		done
+endef
+
+define step_pkg_size
+	$(if $(filter install-target,$(2)),\
+		$(if $(filter start,$(1)),$(call step_pkg_size_start,$(3))) \
+		$(if $(filter end,$(1)),$(call step_pkg_size_end,$(3))))
+endef
+GLOBAL_INSTRUMENTATION_HOOKS += step_pkg_size
+
 # User-supplied script
 ifneq ($(BR2_INSTRUMENTATION_SCRIPTS),)
 define step_user
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [Buildroot] [PATCH v4 2015.08 2/4] support/scripts: add size-stats script
  2015-05-25 21:56 [Buildroot] [PATCH v4 2015.08 0/4] Generate package size statistics Thomas Petazzoni
  2015-05-25 21:56 ` [Buildroot] [PATCH v4 2015.08 1/4] pkg-generic: add step_pkg_size global instrumentation hook Thomas Petazzoni
@ 2015-05-25 21:56 ` Thomas Petazzoni
  2015-05-28  3:18   ` Ryan Barnett
  2015-06-03 15:50   ` Clayton Shotwell
  2015-05-25 21:56 ` [Buildroot] [PATCH v4 2015.08 3/4] Makefile: implement a size-stats target Thomas Petazzoni
                   ` (2 subsequent siblings)
  4 siblings, 2 replies; 20+ messages in thread
From: Thomas Petazzoni @ 2015-05-25 21:56 UTC (permalink / raw)
  To: buildroot

This new script uses the data collected by the step_pkg_size
instrumentation hook to generate a pie chart of the size contribution
of each package to the target root filesystem, and two CSV files with
statistics about the package size and file size. To achieve this, it
looks at each file in $(TARGET_DIR), and using the
packages-file-list.txt information collected by the step_pkg_size
hook, it determines to which package the file belongs. It is therefore
able to give the size installed by each package.

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
---
 support/scripts/size-stats | 238 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 238 insertions(+)
 create mode 100755 support/scripts/size-stats

diff --git a/support/scripts/size-stats b/support/scripts/size-stats
new file mode 100755
index 0000000..48a64cd
--- /dev/null
+++ b/support/scripts/size-stats
@@ -0,0 +1,238 @@
+#!/usr/bin/env python
+
+# Copyright (C) 2014 by Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+# General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write to the Free Software
+# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+
+import sys
+import os
+import os.path
+import argparse
+import csv
+import collections
+
+try:
+    import matplotlib.font_manager as fm
+    import matplotlib.pyplot as plt
+except ImportError:
+    sys.stderr.write("You need python-matplotlib to generate the size graph\n")
+    exit(1)
+
+colors = ['#e60004', '#009836', '#2e1d86', '#ffed00',
+          '#0068b5', '#f28e00', '#940084', '#97c000']
+
+#
+# This function adds a new file to 'filesdict', after checking its
+# size. The 'filesdict' contain the relative path of the file as the
+# key, and as the value a tuple containing the name of the package to
+# which the file belongs and the size of the file.
+#
+# filesdict: the dict to which  the file is added
+# relpath: relative path of the file
+# fullpath: absolute path to the file
+# pkg: package to which the file belongs
+#
+def add_file(filesdict, relpath, abspath, pkg):
+    if not os.path.exists(abspath):
+        return
+    if os.path.islink(abspath):
+        return
+    sz = os.stat(abspath).st_size
+    filesdict[relpath] = (pkg, sz)
+
+#
+# This function returns a dict containing as keys the files present in
+# the filesystem skeleton, and as value, the string "skeleton". It is
+# used to simulate a fake "skeleton" package, to assign the files from
+# the skeleton to some package.
+#
+# builddir: path to the Buildroot output directory
+# skeleton_path: path to the rootfs skeleton
+#
+def build_skeleton_dict(builddir, skeleton_path):
+    skeleton_files = {}
+    for root, _, files in os.walk(skeleton_path):
+        for f in files:
+            if f == ".empty":
+                continue
+            frelpath = os.path.relpath(os.path.join(root, f), skeleton_path)
+            # Get the real size of the installed file
+            targetpath = os.path.join(builddir, "target", frelpath)
+            add_file(skeleton_files, frelpath, targetpath, "skeleton")
+    return skeleton_files
+
+#
+# This function returns a dict where each key is the path of a file in
+# the root filesystem, and the value is a tuple containing two
+# elements: the name of the package to which this file belongs and the
+# size of the file.
+#
+# builddir: path to the Buildroot output directory
+#
+def build_package_dict(builddir):
+    filesdict = {}
+    with open(os.path.join(builddir, "build", "packages-file-list.txt")) as filelistf:
+        for l in filelistf.readlines():
+            pkg, fpath = l.split(",")
+            # remove the initial './' in each file path
+            fpath = fpath.strip()[2:]
+            fullpath = os.path.join(builddir, "target", fpath)
+            add_file(filesdict, fpath, fullpath, pkg)
+    return filesdict
+
+#
+# This function builds a dictionary that contains the name of a
+# package as key, and the size of the files installed by this package
+# as the value.
+#
+# filesdict: dictionary with the name of the files as key, and as
+# value a tuple containing the name of the package to which the files
+# belongs, and the size of the file. As returned by
+# build_package_dict.
+#
+# builddir: path to the Buildroot output directory
+#
+def build_package_size(filesdict, builddir):
+    pkgsize = collections.defaultdict(int)
+
+    for root, _, files in os.walk(os.path.join(builddir, "target")):
+        for f in files:
+            fpath = os.path.join(root, f)
+            if os.path.islink(fpath):
+                continue
+            frelpath = os.path.relpath(fpath, os.path.join(builddir, "target"))
+            if not frelpath in filesdict:
+                print("WARNING: %s is not part of any package" % frelpath)
+                pkg = "unknown"
+            else:
+                pkg = filesdict[frelpath][0]
+
+            pkgsize[pkg] += os.path.getsize(fpath)
+
+    return pkgsize
+
+#
+# Given a dict returned by build_package_size(), this function
+# generates a pie chart of the size installed by each package.
+#
+# pkgsize: dictionary with the name of the package as a key, and the
+# size as the value, as returned by build_package_size.
+#
+# outputf: output file for the graph
+#
+def draw_graph(pkgsize, outputf):
+    total = sum(pkgsize.values())
+    labels = []
+    values = []
+    other_value = 0
+    for (p, sz) in pkgsize.items():
+        if sz < (total * 0.01):
+            other_value += sz
+        else:
+            labels.append("%s (%d kB)" % (p, sz / 1000.))
+            values.append(sz)
+    labels.append("Other (%d kB)" % (other_value / 1000.))
+    values.append(other_value)
+
+    plt.figure()
+    patches, texts, autotexts = plt.pie(values, labels=labels,
+                                        autopct='%1.1f%%', shadow=True,
+                                        colors=colors)
+    # Reduce text size
+    proptease = fm.FontProperties()
+    proptease.set_size('xx-small')
+    plt.setp(autotexts, fontproperties=proptease)
+    plt.setp(texts, fontproperties=proptease)
+
+    plt.title('Size per package')
+    plt.savefig(outputf)
+
+#
+# Generate a CSV file with statistics about the size of each file, its
+# size contribution to the package and to the overall system.
+#
+# filesdict: dictionary with the name of the files as key, and as
+# value a tuple containing the name of the package to which the files
+# belongs, and the size of the file. As returned by
+# build_package_dict.
+#
+# pkgsize: dictionary with the name of the package as a key, and the
+# size as the value, as returned by build_package_size.
+#
+# outputf: output CSV file
+#
+def gen_files_csv(filesdict, pkgsizes, outputf):
+    total = 0
+    for (p, sz) in pkgsizes.items():
+        total += sz
+    with open(outputf, 'w') as csvfile:
+        wr = csv.writer(csvfile, delimiter=',', quoting=csv.QUOTE_MINIMAL)
+        wr.writerow(["File name",
+                     "Package name",
+                     "File size",
+                     "Package size",
+                     "File size in package (%)",
+                     "File size in system (%)"])
+        for f, (pkgname, filesize) in filesdict.items():
+            pkgsize = pkgsizes[pkgname]
+            wr.writerow([f, pkgname, filesize, pkgsize,
+                         "%.1f" % (float(filesize) / pkgsize * 100),
+                         "%.1f" % (float(filesize) / total * 100)])
+
+
+#
+# Generate a CSV file with statistics about the size of each package,
+# and their size contribution to the overall system.
+#
+# pkgsize: dictionary with the name of the package as a key, and the
+# size as the value, as returned by build_package_size.
+#
+# outputf: output CSV file
+#
+def gen_packages_csv(pkgsizes, outputf):
+    total = sum(pkgsizes.values())
+    with open(outputf, 'w') as csvfile:
+        wr = csv.writer(csvfile, delimiter=',', quoting=csv.QUOTE_MINIMAL)
+        wr.writerow(["Package name", "Package size", "Package size in system (%)"])
+        for (pkg, size) in pkgsizes.items():
+            wr.writerow([pkg, size, "%.1f" % (float(size) / total * 100)])
+
+parser = argparse.ArgumentParser(description='Draw build time graphs')
+
+parser.add_argument("--builddir", '-i', metavar="BUILDDIR", required=True,
+                    help="Buildroot output directory")
+parser.add_argument("--graph", '-g', metavar="GRAPH",
+                    help="Graph output file (.pdf or .png extension)")
+parser.add_argument("--file-size-csv", '-f', metavar="FILE_SIZE_CSV",
+                    help="CSV output file with file size statistics")
+parser.add_argument("--package-size-csv", '-p', metavar="PKG_SIZE_CSV",
+                    help="CSV output file with package size statistics")
+parser.add_argument("--skeleton-path", '-s', metavar="SKELETON_PATH", required=True,
+                    help="Path to the skeleton used for the system")
+args = parser.parse_args()
+
+# Find out which package installed what files
+pkgdict = build_package_dict(args.builddir)
+pkgdict.update(build_skeleton_dict(args.builddir, args.skeleton_path))
+
+# Collect the size installed by each package
+pkgsize = build_package_size(pkgdict, args.builddir)
+
+if args.graph:
+    draw_graph(pkgsize, args.graph)
+if args.file_size_csv:
+    gen_files_csv(pkgdict, pkgsize, args.file_size_csv)
+if args.package_size_csv:
+    gen_packages_csv(pkgsize, args.package_size_csv)
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [Buildroot] [PATCH v4 2015.08 3/4] Makefile: implement a size-stats target
  2015-05-25 21:56 [Buildroot] [PATCH v4 2015.08 0/4] Generate package size statistics Thomas Petazzoni
  2015-05-25 21:56 ` [Buildroot] [PATCH v4 2015.08 1/4] pkg-generic: add step_pkg_size global instrumentation hook Thomas Petazzoni
  2015-05-25 21:56 ` [Buildroot] [PATCH v4 2015.08 2/4] support/scripts: add size-stats script Thomas Petazzoni
@ 2015-05-25 21:56 ` Thomas Petazzoni
  2015-05-28  3:32   ` Ryan Barnett
  2015-08-20 12:42   ` Ryan Barnett
  2015-05-25 21:56 ` [Buildroot] [PATCH v4 2015.08 4/4] docs/manual: add section about size graphing Thomas Petazzoni
  2015-07-30 21:54 ` [Buildroot] [PATCH v4 2015.08 0/4] Generate package size statistics Ryan Barnett
  4 siblings, 2 replies; 20+ messages in thread
From: Thomas Petazzoni @ 2015-05-25 21:56 UTC (permalink / raw)
  To: buildroot

This commit implements a size-stats target that calls the script of
the same name to generate the graph and CSV files related to package
and file sizes.

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
---
 Makefile | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/Makefile b/Makefile
index cf244d7..c9b1e72 100644
--- a/Makefile
+++ b/Makefile
@@ -666,6 +666,14 @@ graph-depends: graph-depends-requirements
 	|tee $(GRAPHS_DIR)/$(@).dot \
 	|dot $(BR2_GRAPH_DOT_OPTS) -T$(BR_GRAPH_OUT) -o $(GRAPHS_DIR)/$(@).$(BR_GRAPH_OUT)
 
+size-stats:
+	$(Q)mkdir -p $(GRAPHS_DIR)
+	$(Q)$(TOPDIR)/support/scripts/size-stats --builddir $(BASE_DIR) \
+		--graph $(BASE_DIR)/graphs/graph-size.$(BR_GRAPH_OUT) \
+		--file-size-csv $(BASE_DIR)/build/file-size-stats.csv \
+		--package-size-csv $(BASE_DIR)/build/package-size-stats.csv \
+		--skeleton-path $(TARGET_SKELETON)
+
 else # ifeq ($(BR2_HAVE_DOT_CONFIG),y)
 
 all: menuconfig
@@ -883,6 +891,7 @@ endif
 	@echo '  manual-epub            - build manual in ePub'
 	@echo '  graph-build            - generate graphs of the build times'
 	@echo '  graph-depends          - generate graph of the dependency tree'
+	@echo '  size-stats             - generate stats of the filesystem size'
 	@echo '  list-defconfigs        - list all defconfigs (pre-configured minimal systems)'
 	@echo
 	@echo 'Miscellaneous:'
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [Buildroot] [PATCH v4 2015.08 4/4] docs/manual: add section about size graphing
  2015-05-25 21:56 [Buildroot] [PATCH v4 2015.08 0/4] Generate package size statistics Thomas Petazzoni
                   ` (2 preceding siblings ...)
  2015-05-25 21:56 ` [Buildroot] [PATCH v4 2015.08 3/4] Makefile: implement a size-stats target Thomas Petazzoni
@ 2015-05-25 21:56 ` Thomas Petazzoni
  2015-05-28  3:42   ` Ryan Barnett
  2015-07-30 21:54 ` [Buildroot] [PATCH v4 2015.08 0/4] Generate package size statistics Ryan Barnett
  4 siblings, 1 reply; 20+ messages in thread
From: Thomas Petazzoni @ 2015-05-25 21:56 UTC (permalink / raw)
  To: buildroot

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
---
 docs/manual/common-usage.txt | 39 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 39 insertions(+)

diff --git a/docs/manual/common-usage.txt b/docs/manual/common-usage.txt
index 5b27b1f..c7a14ce 100644
--- a/docs/manual/common-usage.txt
+++ b/docs/manual/common-usage.txt
@@ -273,6 +273,45 @@ only other format supported is PNG:
 BR2_GRAPH_OUT=png make graph-build
 ----------------
 
+=== Graphing the filesystem size contribution of packages
+
+When your target system grows, it is sometimes useful to understand
+how much each Buildroot package is contributing to the overall root
+filesystem size. To help with such an analysis, Buildroot collects
+some data about which package installs which file, and can generate
+graphs and CSVs files with the relevant data.
+
+To generate these data after a build, run:
+
+----------------
+make size-stats
+----------------
+
+This will generate:
+
+* +output/graphs/graph-size+, a pie chart of the contribution of each
+  package to the overall root filesystem size
+
+* +output/build/package-size-stats.csv+, a CSV file giving the size
+  contribution of each package to the overall root filesystem size
+
+* +output/build/file-size-stats.csv+, a CSV file giving the size
+  contribution of each installed file to the package it belongs, and
+  to the overall filesystem size.
+
+This +size-stats+ target requires the Python Matplotlib library to be
+installed (+python-matplotlib+ on most distributions), and also the
++argpase+ module if you're using a Python version older than 2.7
+(+python-argparse+ on most distributions).
+
+Just like for the duration graph, a +BR2_GRAPH_OUT+ environment is
+supported to adjust the output file format.
+
+.Note
+The collected filesystem size data is only meaningful after a complete
+clean rebuild. Be sure to run +make clean all+ before using +make
+size-stats+.
+
 include::eclipse-integration.txt[]
 
 include::advanced.txt[]
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [Buildroot] [PATCH v4 2015.08 2/4] support/scripts: add size-stats script
  2015-05-25 21:56 ` [Buildroot] [PATCH v4 2015.08 2/4] support/scripts: add size-stats script Thomas Petazzoni
@ 2015-05-28  3:18   ` Ryan Barnett
  2015-05-28 14:55     ` Matthew Weber
  2015-09-02 21:08     ` Thomas Petazzoni
  2015-06-03 15:50   ` Clayton Shotwell
  1 sibling, 2 replies; 20+ messages in thread
From: Ryan Barnett @ 2015-05-28  3:18 UTC (permalink / raw)
  To: buildroot

Thomas,

On Mon, May 25, 2015 at 4:56 PM, Thomas Petazzoni
<thomas.petazzoni@free-electrons.com> wrote:
> This new script uses the data collected by the step_pkg_size
> instrumentation hook to generate a pie chart of the size contribution
> of each package to the target root filesystem, and two CSV files with
> statistics about the package size and file size. To achieve this, it
> looks at each file in $(TARGET_DIR), and using the
> packages-file-list.txt information collected by the step_pkg_size
> hook, it determines to which package the file belongs. It is therefore
> able to give the size installed by each package.
>
> Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
> ---
>  support/scripts/size-stats | 238 +++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 238 insertions(+)
>  create mode 100755 support/scripts/size-stats

Other than a few minor suggestion below, things look good.

Reviewed-by: Ryan Barnett <ryanbarnett3@gmail.com>
Tested-by: Ryan Barnett <ryanbarnett3@gmail.com>

> diff --git a/support/scripts/size-stats b/support/scripts/size-stats
> new file mode 100755
> index 0000000..48a64cd
> --- /dev/null
> +++ b/support/scripts/size-stats
> @@ -0,0 +1,238 @@

[...]

> +#
> +# This function builds a dictionary that contains the name of a
> +# package as key, and the size of the files installed by this package
> +# as the value.
> +#
> +# filesdict: dictionary with the name of the files as key, and as
> +# value a tuple containing the name of the package to which the files
> +# belongs, and the size of the file. As returned by
> +# build_package_dict.
> +#
> +# builddir: path to the Buildroot output directory
> +#
> +def build_package_size(filesdict, builddir):
> +    pkgsize = collections.defaultdict(int)
> +
> +    for root, _, files in os.walk(os.path.join(builddir, "target")):
> +        for f in files:
> +            fpath = os.path.join(root, f)
> +            if os.path.islink(fpath):
> +                continue
> +            frelpath = os.path.relpath(fpath, os.path.join(builddir, "target"))
> +            if not frelpath in filesdict:
> +                print("WARNING: %s is not part of any package" % frelpath)

Would it be useful to have an exclusion list since this will always be
printed out?

Every time you run 'make clean all size-stats' you will be faced with
warnings such as this:

WARNING: THIS_IS_NOT_YOUR_ROOT_FILESYSTEM is not part of any package
WARNING: etc/ld.so.cache is not part of any package
WARNING: etc/hostname is not part of any package
WARNING: etc/os-release is not part of any package
WARNING: etc/nsswitch.conf is not part of any package
WARNING: etc/ld.so.conf is not part of any package
WARNING: etc/network/interfaces is not part of any package
WARNING: tmp/ldconfig/aux-cache is not part of any package
WARNING: dev/console is not part of any package

Initially when I saw this I didn't do something correct, however, I
quickly released that these are files that are generated by
buildroot's makefiles (such as THIS_IS_NOT_YOUR_ROOT_FILESYSTEM and
etc/hostname). Since this files are generated by buildroot and one
shouldn't be concerned about this files not being a part of any
package. While typing this, would it make sense to create a package
called 'buildroot' whose files are defined statically within this
script?

> +                pkg = "unknown"
> +            else:
> +                pkg = filesdict[frelpath][0]
> +
> +            pkgsize[pkg] += os.path.getsize(fpath)
> +
> +    return pkgsize
> +
> +#
> +# Given a dict returned by build_package_size(), this function
> +# generates a pie chart of the size installed by each package.
> +#
> +# pkgsize: dictionary with the name of the package as a key, and the
> +# size as the value, as returned by build_package_size.
> +#
> +# outputf: output file for the graph
> +#
> +def draw_graph(pkgsize, outputf):
> +    total = sum(pkgsize.values())
> +    labels = []
> +    values = []
> +    other_value = 0
> +    for (p, sz) in pkgsize.items():
> +        if sz < (total * 0.01):
> +            other_value += sz
> +        else:
> +            labels.append("%s (%d kB)" % (p, sz / 1000.))
> +            values.append(sz)
> +    labels.append("Other (%d kB)" % (other_value / 1000.))
> +    values.append(other_value)
> +
> +    plt.figure()
> +    patches, texts, autotexts = plt.pie(values, labels=labels,
> +                                        autopct='%1.1f%%', shadow=True,
> +                                        colors=colors)
> +    # Reduce text size
> +    proptease = fm.FontProperties()
> +    proptease.set_size('xx-small')
> +    plt.setp(autotexts, fontproperties=proptease)
> +    plt.setp(texts, fontproperties=proptease)

Could the total size of filesystem be placed on this graph? I was
thinking maybe at the bottom of the graph in as a subtitle - don't
know if this possible?

Another idea would be the option to specify a chart title. This could
be something that could be used with dependency graph as well or any
other graph generated by buildroot (can't think of any others off at
this moment). However, I would say that should be a future feature
that is implemented.

Thanks,
-Ryan

> +    plt.title('Size per package')
> +    plt.savefig(outputf)

[...]

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Buildroot] [PATCH v4 2015.08 1/4] pkg-generic: add step_pkg_size global instrumentation hook
  2015-05-25 21:56 ` [Buildroot] [PATCH v4 2015.08 1/4] pkg-generic: add step_pkg_size global instrumentation hook Thomas Petazzoni
@ 2015-05-28  3:29   ` Ryan Barnett
  2015-07-11 10:46   ` Romain Naour
  1 sibling, 0 replies; 20+ messages in thread
From: Ryan Barnett @ 2015-05-28  3:29 UTC (permalink / raw)
  To: buildroot

Thomas,

On Mon, May 25, 2015 at 4:56 PM, Thomas Petazzoni
<thomas.petazzoni@free-electrons.com> wrote:
> This patch adds a global instrumentation hook that collects the list
> of files installed in $(TARGET_DIR) by each package, and stores this
> list into a file called $(BUILD_DIR)/packages-file-list.txt. It can
> later be used to determine the size contribution of each package to
> the target root filesystem.
>
> Note that in order to detect if a file installed by one package is
> later overriden by another package, we calculate the md5 of installed
> files and compare them at each installation of a new package.
>
> This commit also adds a Config.in option to enable the collection of
> this data, as calculating the md5 of all installed files at the
> beginning and end of the installation of each package can be
> considered a time-consuming process which maybe some users will not be
> willing to suffer from.
>
> Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>

Reviewed-by: Ryan Barnett <ryanbarnett3@gmail.com>
Test-by: Ryan Barnett <ryanbarnett3@gmail.com>

> ---
>  package/pkg-generic.mk | 32 ++++++++++++++++++++++++++++++++
>  1 file changed, 32 insertions(+)
>
> diff --git a/package/pkg-generic.mk b/package/pkg-generic.mk
> index d5b29f0..944b0fc 100644
> --- a/package/pkg-generic.mk
> +++ b/package/pkg-generic.mk
> @@ -55,6 +55,38 @@ define step_time
>  endef
>  GLOBAL_INSTRUMENTATION_HOOKS += step_time
>
> +# Hooks to collect statistics about installed files
> +
> +# This hook will be called before the target installation of a
> +# package. We store in a file named .br_filelist_before the list of
> +# files currently installed in the target. Note that the MD5 is also
> +# stored, in order to identify if the files are overwritten.
> +define step_pkg_size_start
> +       (cd $(TARGET_DIR) ; find . -type f -print0 | xargs -0 md5sum) | sort > \
> +               $($(PKG)_DIR)/.br_filelist_before
> +endef
> +
> +# This hook will be called after the target installation of a
> +# package. We store in a file named .br_filelist_after the list of
> +# files (and their MD5) currently installed in the target. We then do
> +# a diff with the .br_filelist_before to compute the list of files
> +# installed by this package.
> +define step_pkg_size_end
> +       (cd $(TARGET_DIR); find . -type f -print0 | xargs -0 md5sum) | sort > \
> +               $($(PKG)_DIR)/.br_filelist_after
> +       comm -13 $($(PKG)_DIR)/.br_filelist_before $($(PKG)_DIR)/.br_filelist_after | \
> +               while read hash file ; do \
> +                       echo "$(1),$${file}" >> $(BUILD_DIR)/packages-file-list.txt ; \
> +               done
> +endef
> +
> +define step_pkg_size
> +       $(if $(filter install-target,$(2)),\
> +               $(if $(filter start,$(1)),$(call step_pkg_size_start,$(3))) \
> +               $(if $(filter end,$(1)),$(call step_pkg_size_end,$(3))))
> +endef
> +GLOBAL_INSTRUMENTATION_HOOKS += step_pkg_size
> +
>  # User-supplied script
>  ifneq ($(BR2_INSTRUMENTATION_SCRIPTS),)
>  define step_user

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Buildroot] [PATCH v4 2015.08 3/4] Makefile: implement a size-stats target
  2015-05-25 21:56 ` [Buildroot] [PATCH v4 2015.08 3/4] Makefile: implement a size-stats target Thomas Petazzoni
@ 2015-05-28  3:32   ` Ryan Barnett
  2015-08-20 12:42   ` Ryan Barnett
  1 sibling, 0 replies; 20+ messages in thread
From: Ryan Barnett @ 2015-05-28  3:32 UTC (permalink / raw)
  To: buildroot

Thomas,

On Mon, May 25, 2015 at 4:56 PM, Thomas Petazzoni
<thomas.petazzoni@free-electrons.com> wrote:
> This commit implements a size-stats target that calls the script of
> the same name to generate the graph and CSV files related to package
> and file sizes.
>
> Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>

Reviewed-by: Ryan Barnett <ryanbarnett3@gmail.com>
Tested-by: Ryan Barnett <ryanbarnett3@gmail.com>

> ---
>  Makefile | 9 +++++++++
>  1 file changed, 9 insertions(+)
>
> diff --git a/Makefile b/Makefile
> index cf244d7..c9b1e72 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -666,6 +666,14 @@ graph-depends: graph-depends-requirements
>         |tee $(GRAPHS_DIR)/$(@).dot \
>         |dot $(BR2_GRAPH_DOT_OPTS) -T$(BR_GRAPH_OUT) -o $(GRAPHS_DIR)/$(@).$(BR_GRAPH_OUT)
>
> +size-stats:
> +       $(Q)mkdir -p $(GRAPHS_DIR)
> +       $(Q)$(TOPDIR)/support/scripts/size-stats --builddir $(BASE_DIR) \
> +               --graph $(BASE_DIR)/graphs/graph-size.$(BR_GRAPH_OUT) \
> +               --file-size-csv $(BASE_DIR)/build/file-size-stats.csv \
> +               --package-size-csv $(BASE_DIR)/build/package-size-stats.csv \
> +               --skeleton-path $(TARGET_SKELETON)
> +
>  else # ifeq ($(BR2_HAVE_DOT_CONFIG),y)
>
>  all: menuconfig
> @@ -883,6 +891,7 @@ endif
>         @echo '  manual-epub            - build manual in ePub'
>         @echo '  graph-build            - generate graphs of the build times'
>         @echo '  graph-depends          - generate graph of the dependency tree'
> +       @echo '  size-stats             - generate stats of the filesystem size'
>         @echo '  list-defconfigs        - list all defconfigs (pre-configured minimal systems)'
>         @echo
>         @echo 'Miscellaneous:'
> --
> 2.1.0
>
> _______________________________________________
> buildroot mailing list
> buildroot at busybox.net
> http://lists.busybox.net/mailman/listinfo/buildroot

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Buildroot] [PATCH v4 2015.08 4/4] docs/manual: add section about size graphing
  2015-05-25 21:56 ` [Buildroot] [PATCH v4 2015.08 4/4] docs/manual: add section about size graphing Thomas Petazzoni
@ 2015-05-28  3:42   ` Ryan Barnett
  0 siblings, 0 replies; 20+ messages in thread
From: Ryan Barnett @ 2015-05-28  3:42 UTC (permalink / raw)
  To: buildroot

Thomas,

On Mon, May 25, 2015 at 4:56 PM, Thomas Petazzoni
<thomas.petazzoni@free-electrons.com> wrote:
> Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>

Other than the a few minor suggestions below, things look good.

Reviewed-by: Ryan Barnett <ryanbarnett3@gmail.com>
Tested-by: Ryan Barnett <ryanbarnett3@gmail.com>

> ---
>  docs/manual/common-usage.txt | 39 +++++++++++++++++++++++++++++++++++++++
>  1 file changed, 39 insertions(+)
>
> diff --git a/docs/manual/common-usage.txt b/docs/manual/common-usage.txt
> index 5b27b1f..c7a14ce 100644
> --- a/docs/manual/common-usage.txt
> +++ b/docs/manual/common-usage.txt
> @@ -273,6 +273,45 @@ only other format supported is PNG:
>  BR2_GRAPH_OUT=png make graph-build
>  ----------------
>
> +=== Graphing the filesystem size contribution of packages
> +
> +When your target system grows, it is sometimes useful to understand
> +how much each Buildroot package is contributing to the overall root
> +filesystem size. To help with such an analysis, Buildroot collects
> +some data about which package installs which file, and can generate

Minor nit: the sentence doesn't sound right to me. Maybe reword to this:

data about the files each package install and generates

> +graphs and CSVs files with the relevant data.
> +
> +To generate these data after a build, run:
> +
> +----------------
> +make size-stats
> +----------------
> +
> +This will generate:
> +
> +* +output/graphs/graph-size+, a pie chart of the contribution of each
> +  package to the overall root filesystem size
> +
> +* +output/build/package-size-stats.csv+, a CSV file giving the size
> +  contribution of each package to the overall root filesystem size
> +
> +* +output/build/file-size-stats.csv+, a CSV file giving the size
> +  contribution of each installed file to the package it belongs, and
> +  to the overall filesystem size.
> +
> +This +size-stats+ target requires the Python Matplotlib library to be
> +installed (+python-matplotlib+ on most distributions), and also the
> ++argpase+ module if you're using a Python version older than 2.7
> +(+python-argparse+ on most distributions).
> +
> +Just like for the duration graph, a +BR2_GRAPH_OUT+ environment is
> +supported to adjust the output file format.

Add a link to the section describing BR2_GRAPH_OUT?

Thanks,
-Ryan

> +
> +.Note
> +The collected filesystem size data is only meaningful after a complete
> +clean rebuild. Be sure to run +make clean all+ before using +make
> +size-stats+.
> +
>  include::eclipse-integration.txt[]
>
>  include::advanced.txt[]

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Buildroot] [PATCH v4 2015.08 2/4] support/scripts: add size-stats script
  2015-05-28  3:18   ` Ryan Barnett
@ 2015-05-28 14:55     ` Matthew Weber
  2015-09-02 21:08     ` Thomas Petazzoni
  1 sibling, 0 replies; 20+ messages in thread
From: Matthew Weber @ 2015-05-28 14:55 UTC (permalink / raw)
  To: buildroot

Ryan, Thomas,

On Wed, May 27, 2015 at 10:18 PM, Ryan Barnett <ryanbarnett3@gmail.com> wrote:
> Thomas,
>
> On Mon, May 25, 2015 at 4:56 PM, Thomas Petazzoni
> <thomas.petazzoni@free-electrons.com> wrote:
>> This new script uses the data collected by the step_pkg_size
>> instrumentation hook to generate a pie chart of the size contribution
>> of each package to the target root filesystem, and two CSV files with
>> statistics about the package size and file size. To achieve this, it
>> looks at each file in $(TARGET_DIR), and using the
>> packages-file-list.txt information collected by the step_pkg_size
>> hook, it determines to which package the file belongs. It is therefore
>> able to give the size installed by each package.
>>
>> Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
>> ---
>>  support/scripts/size-stats | 238 +++++++++++++++++++++++++++++++++++++++++++++
>>  1 file changed, 238 insertions(+)
>>  create mode 100755 support/scripts/size-stats
>
> Other than a few minor suggestion below, things look good.
>
> Reviewed-by: Ryan Barnett <ryanbarnett3@gmail.com>
> Tested-by: Ryan Barnett <ryanbarnett3@gmail.com>
>
>> diff --git a/support/scripts/size-stats b/support/scripts/size-stats
>> new file mode 100755
>> index 0000000..48a64cd
>> --- /dev/null
>> +++ b/support/scripts/size-stats
>> @@ -0,0 +1,238 @@
>
> [...]
>
>> +#
>> +# This function builds a dictionary that contains the name of a
>> +# package as key, and the size of the files installed by this package
>> +# as the value.
>> +#
>> +# filesdict: dictionary with the name of the files as key, and as
>> +# value a tuple containing the name of the package to which the files
>> +# belongs, and the size of the file. As returned by
>> +# build_package_dict.
>> +#
>> +# builddir: path to the Buildroot output directory
>> +#
>> +def build_package_size(filesdict, builddir):
>> +    pkgsize = collections.defaultdict(int)
>> +
>> +    for root, _, files in os.walk(os.path.join(builddir, "target")):
>> +        for f in files:
>> +            fpath = os.path.join(root, f)
>> +            if os.path.islink(fpath):
>> +                continue
>> +            frelpath = os.path.relpath(fpath, os.path.join(builddir, "target"))
>> +            if not frelpath in filesdict:
>> +                print("WARNING: %s is not part of any package" % frelpath)
>
> Would it be useful to have an exclusion list since this will always be
> printed out?

Or maybe a single warning that has a path to a file where these are appended to?

<snip?



-- 
Matthew L Weber / Pr Software Engineer
Airborne Information Systems / Security Systems and Software / Secure Platforms
MS 131-100, C Ave NE, Cedar Rapids, IA, 52498, USA
www.rockwellcollins.com

Note: Any Export License Required Information and License Restricted
Third Party Intellectual Property (TPIP) content must be encrypted and
sent to matthew.weber@corp.rockwellcollins.com.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Buildroot] [PATCH v4 2015.08 2/4] support/scripts: add size-stats script
  2015-05-25 21:56 ` [Buildroot] [PATCH v4 2015.08 2/4] support/scripts: add size-stats script Thomas Petazzoni
  2015-05-28  3:18   ` Ryan Barnett
@ 2015-06-03 15:50   ` Clayton Shotwell
  2015-07-11 11:46     ` Romain Naour
  1 sibling, 1 reply; 20+ messages in thread
From: Clayton Shotwell @ 2015-06-03 15:50 UTC (permalink / raw)
  To: buildroot

Thomas,

On Mon, May 25, 2015 at 4:56 PM, Thomas Petazzoni
<thomas.petazzoni@free-electrons.com> wrote:
> This new script uses the data collected by the step_pkg_size
> instrumentation hook to generate a pie chart of the size contribution
> of each package to the target root filesystem, and two CSV files with
> statistics about the package size and file size. To achieve this, it
> looks at each file in $(TARGET_DIR), and using the
> packages-file-list.txt information collected by the step_pkg_size
> hook, it determines to which package the file belongs. It is therefore
> able to give the size installed by each package.
>
> Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
> ---
>  support/scripts/size-stats | 238 +++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 238 insertions(+)
>  create mode 100755 support/scripts/size-stats
>
> diff --git a/support/scripts/size-stats b/support/scripts/size-stats
> new file mode 100755
> index 0000000..48a64cd
> --- /dev/null
> +++ b/support/scripts/size-stats

> +import sys
> +import os
> +import os.path
> +import argparse
> +import csv
> +import collections
> +
> +try:

Would it be possible to add in the following lines here to ensure the
graphing does not try to connect to an X-server?

import matplotlib
matplotlib.use('Agg')

I ran into an issue testing this on my setup where I ssh into my build
server using screen. I found this solution on stack overflow at the
following link.

http://stackoverflow.com/questions/4706451/how-to-save-a-figure-remotely-with-pylab/4706614#4706614

> +    import matplotlib.font_manager as fm
> +    import matplotlib.pyplot as plt
> +except ImportError:
> +    sys.stderr.write("You need python-matplotlib to generate the size graph\n")
> +    exit(1)

Thanks,
Clayton

Clayton Shotwell
Senior Software Engineer, Rockwell Collins
clayton.shotwell at rockwellcollins.com

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Buildroot] [PATCH v4 2015.08 1/4] pkg-generic: add step_pkg_size global instrumentation hook
  2015-05-25 21:56 ` [Buildroot] [PATCH v4 2015.08 1/4] pkg-generic: add step_pkg_size global instrumentation hook Thomas Petazzoni
  2015-05-28  3:29   ` Ryan Barnett
@ 2015-07-11 10:46   ` Romain Naour
  2015-07-11 11:09     ` Thomas Petazzoni
  1 sibling, 1 reply; 20+ messages in thread
From: Romain Naour @ 2015-07-11 10:46 UTC (permalink / raw)
  To: buildroot

Hi Thomas,

Le 25/05/2015 23:56, Thomas Petazzoni a ?crit :
> This patch adds a global instrumentation hook that collects the list
> of files installed in $(TARGET_DIR) by each package, and stores this
> list into a file called $(BUILD_DIR)/packages-file-list.txt. It can
> later be used to determine the size contribution of each package to
> the target root filesystem.
> 
> Note that in order to detect if a file installed by one package is
> later overriden by another package, we calculate the md5 of installed
> files and compare them at each installation of a new package.
> 
> This commit also adds a Config.in option to enable the collection of
> this data, as calculating the md5 of all installed files at the
> beginning and end of the installation of each package can be
> considered a time-consuming process which maybe some users will not be
> willing to suffer from.

The part for Config.in is not in this patch, I seems that you dropped it wile
reworking this series. So the size-stat option can't be disabled actually.

Best regards,
Romain
> 
> Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
> ---
>  package/pkg-generic.mk | 32 ++++++++++++++++++++++++++++++++
>  1 file changed, 32 insertions(+)
> 
> diff --git a/package/pkg-generic.mk b/package/pkg-generic.mk
> index d5b29f0..944b0fc 100644
> --- a/package/pkg-generic.mk
> +++ b/package/pkg-generic.mk
> @@ -55,6 +55,38 @@ define step_time
>  endef
>  GLOBAL_INSTRUMENTATION_HOOKS += step_time
>  
> +# Hooks to collect statistics about installed files
> +
> +# This hook will be called before the target installation of a
> +# package. We store in a file named .br_filelist_before the list of
> +# files currently installed in the target. Note that the MD5 is also
> +# stored, in order to identify if the files are overwritten.
> +define step_pkg_size_start
> +	(cd $(TARGET_DIR) ; find . -type f -print0 | xargs -0 md5sum) | sort > \
> +		$($(PKG)_DIR)/.br_filelist_before
> +endef
> +
> +# This hook will be called after the target installation of a
> +# package. We store in a file named .br_filelist_after the list of
> +# files (and their MD5) currently installed in the target. We then do
> +# a diff with the .br_filelist_before to compute the list of files
> +# installed by this package.
> +define step_pkg_size_end
> +	(cd $(TARGET_DIR); find . -type f -print0 | xargs -0 md5sum) | sort > \
> +		$($(PKG)_DIR)/.br_filelist_after
> +	comm -13 $($(PKG)_DIR)/.br_filelist_before $($(PKG)_DIR)/.br_filelist_after | \
> +		while read hash file ; do \
> +			echo "$(1),$${file}" >> $(BUILD_DIR)/packages-file-list.txt ; \
> +		done
> +endef
> +
> +define step_pkg_size
> +	$(if $(filter install-target,$(2)),\
> +		$(if $(filter start,$(1)),$(call step_pkg_size_start,$(3))) \
> +		$(if $(filter end,$(1)),$(call step_pkg_size_end,$(3))))
> +endef
> +GLOBAL_INSTRUMENTATION_HOOKS += step_pkg_size
> +
>  # User-supplied script
>  ifneq ($(BR2_INSTRUMENTATION_SCRIPTS),)
>  define step_user
> 

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Buildroot] [PATCH v4 2015.08 1/4] pkg-generic: add step_pkg_size global instrumentation hook
  2015-07-11 10:46   ` Romain Naour
@ 2015-07-11 11:09     ` Thomas Petazzoni
  0 siblings, 0 replies; 20+ messages in thread
From: Thomas Petazzoni @ 2015-07-11 11:09 UTC (permalink / raw)
  To: buildroot

Dear Romain Naour,

On Sat, 11 Jul 2015 12:46:54 +0200, Romain Naour wrote:

> > This commit also adds a Config.in option to enable the collection of
> > this data, as calculating the md5 of all installed files at the
> > beginning and end of the installation of each package can be
> > considered a time-consuming process which maybe some users will not be
> > willing to suffer from.
> 
> The part for Config.in is not in this patch, I seems that you dropped it wile
> reworking this series. So the size-stat option can't be disabled actually.

It's actually the commit log that is incorrect: removing the option is
intentional. From the cover letter:

 - Remove the BR2_COLLECT_FILE_SIZE_STATS option, and collect the
   necessary data unconditionally. This was suggested by Yann
   E. Morin, who did some testing of the performance impact, and found
   that it was so insignificant that it is negligible. See Yann's
   analysis at
   http://lists.busybox.net/pipermail/buildroot/2015-February/119431.html.

Thomas
-- 
Thomas Petazzoni, CTO, Free Electrons
Embedded Linux, Kernel and Android engineering
http://free-electrons.com

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Buildroot] [PATCH v4 2015.08 2/4] support/scripts: add size-stats script
  2015-06-03 15:50   ` Clayton Shotwell
@ 2015-07-11 11:46     ` Romain Naour
  0 siblings, 0 replies; 20+ messages in thread
From: Romain Naour @ 2015-07-11 11:46 UTC (permalink / raw)
  To: buildroot

Hi Clayton, Thomas, all

Le 03/06/2015 17:50, Clayton Shotwell a ?crit :
> Thomas,
> 
> On Mon, May 25, 2015 at 4:56 PM, Thomas Petazzoni
> <thomas.petazzoni@free-electrons.com> wrote:
>> This new script uses the data collected by the step_pkg_size
>> instrumentation hook to generate a pie chart of the size contribution
>> of each package to the target root filesystem, and two CSV files with
>> statistics about the package size and file size. To achieve this, it
>> looks at each file in $(TARGET_DIR), and using the
>> packages-file-list.txt information collected by the step_pkg_size
>> hook, it determines to which package the file belongs. It is therefore
>> able to give the size installed by each package.
>>
>> Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
>> ---
>>  support/scripts/size-stats | 238 +++++++++++++++++++++++++++++++++++++++++++++
>>  1 file changed, 238 insertions(+)
>>  create mode 100755 support/scripts/size-stats
>>
>> diff --git a/support/scripts/size-stats b/support/scripts/size-stats
>> new file mode 100755
>> index 0000000..48a64cd
>> --- /dev/null
>> +++ b/support/scripts/size-stats
> 
>> +import sys
>> +import os
>> +import os.path
>> +import argparse
>> +import csv
>> +import collections
>> +
>> +try:
> 
> Would it be possible to add in the following lines here to ensure the
> graphing does not try to connect to an X-server?
> 
> import matplotlib
> matplotlib.use('Agg')
> 
> I ran into an issue testing this on my setup where I ssh into my build
> server using screen. I found this solution on stack overflow at the
> following link.
> 
> http://stackoverflow.com/questions/4706451/how-to-save-a-figure-remotely-with-pylab/4706614#4706614

With Samuel, we are ok to import matplitlib entirely to ensure that it doesn't
try to connect to X-server.
The overhead seems negligible, I not able to see any difference with or without
import matplotlib.

Best regards,
Romain Naour

> 
>> +    import matplotlib.font_manager as fm
>> +    import matplotlib.pyplot as plt
>> +except ImportError:
>> +    sys.stderr.write("You need python-matplotlib to generate the size graph\n")
>> +    exit(1)
> 
> Thanks,
> Clayton
> 
> Clayton Shotwell
> Senior Software Engineer, Rockwell Collins
> clayton.shotwell at rockwellcollins.com
> _______________________________________________
> buildroot mailing list
> buildroot at busybox.net
> http://lists.busybox.net/mailman/listinfo/buildroot
> 

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Buildroot] [PATCH v4 2015.08 0/4] Generate package size statistics
  2015-05-25 21:56 [Buildroot] [PATCH v4 2015.08 0/4] Generate package size statistics Thomas Petazzoni
                   ` (3 preceding siblings ...)
  2015-05-25 21:56 ` [Buildroot] [PATCH v4 2015.08 4/4] docs/manual: add section about size graphing Thomas Petazzoni
@ 2015-07-30 21:54 ` Ryan Barnett
  4 siblings, 0 replies; 20+ messages in thread
From: Ryan Barnett @ 2015-07-30 21:54 UTC (permalink / raw)
  To: buildroot

Hi Thomas,

On Mon, May 25, 2015 at 4:56 PM, Thomas Petazzoni
<thomas.petazzoni@free-electrons.com> wrote:
>
> Thomas Petazzoni (4):
>   pkg-generic: add step_pkg_size global instrumentation hook
>   support/scripts: add size-stats script
>   Makefile: implement a size-stats target
>   docs/manual: add section about size graphing
>
>  Makefile                     |   9 ++
>  docs/manual/common-usage.txt |  39 +++++++
>  package/pkg-generic.mk       |  32 ++++++
>  support/scripts/size-stats   | 238 +++++++++++++++++++++++++++++++++++++++++++
>  4 files changed, 318 insertions(+)
>  create mode 100755 support/scripts/size-stats

We have been rebasing your patch on top of the latest master and since
around July 23rd (that is when we jumped to from June) it doesn't
appear the graph generation is working anymore. I think it has to do
with how you are treating the filesystem skeleton now as a package but
really don't know if that is the case or not.

Here is the error that is happening when we try to get the package stats:

umask 0022 && make -C /tmp/linux/buildroot
O=/tmp/linux/build/test-build/. size-stats
usage: size-stats [-h] --builddir BUILDDIR [--graph GRAPH]
                  [--file-size-csv FILE_SIZE_CSV]
                  [--package-size-csv PKG_SIZE_CSV] --skeleton-path
                  SKELETON_PATH
size-stats: error: argument --skeleton-path/-s: expected one argument
make[4]: *** [size-stats] Error 2
make[3]: *** [_all] Error 2
Failed to graph the package size statistics

Thanks,
-Ryan

-- 
Ryan Barnett / Sr Software Engineer
Airborne Information Systems / Security Systems and Software
MS 131-100, C Ave NE, Cedar Rapids, IA, 52498, USA
ryan.barnett at rockwellcollins.com
www.rockwellcollins.com

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Buildroot] [PATCH v4 2015.08 3/4] Makefile: implement a size-stats target
  2015-05-25 21:56 ` [Buildroot] [PATCH v4 2015.08 3/4] Makefile: implement a size-stats target Thomas Petazzoni
  2015-05-28  3:32   ` Ryan Barnett
@ 2015-08-20 12:42   ` Ryan Barnett
  2015-08-20 20:43     ` Thomas Petazzoni
  1 sibling, 1 reply; 20+ messages in thread
From: Ryan Barnett @ 2015-08-20 12:42 UTC (permalink / raw)
  To: buildroot

Thomas,

On Mon, May 25, 2015 at 4:56 PM, Thomas Petazzoni
<thomas.petazzoni@free-electrons.com> wrote:
> This commit implements a size-stats target that calls the script of
> the same name to generate the graph and CSV files related to package
> and file sizes.
>
> Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
> ---
>  Makefile | 9 +++++++++
>  1 file changed, 9 insertions(+)
>
> diff --git a/Makefile b/Makefile
> index cf244d7..c9b1e72 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -666,6 +666,14 @@ graph-depends: graph-depends-requirements
>         |tee $(GRAPHS_DIR)/$(@).dot \
>         |dot $(BR2_GRAPH_DOT_OPTS) -T$(BR_GRAPH_OUT) -o $(GRAPHS_DIR)/$(@).$(BR_GRAPH_OUT)
>
> +size-stats:
> +       $(Q)mkdir -p $(GRAPHS_DIR)
> +       $(Q)$(TOPDIR)/support/scripts/size-stats --builddir $(BASE_DIR) \
> +               --graph $(BASE_DIR)/graphs/graph-size.$(BR_GRAPH_OUT) \
> +               --file-size-csv $(BASE_DIR)/build/file-size-stats.csv \
> +               --package-size-csv $(BASE_DIR)/build/package-size-stats.csv \
> +               --skeleton-path $(TARGET_SKELETON)

In order to make this patch series to continue to work with the recent
addition of the skeleton package, the --skeleton-path variable needs
to be updated to:

   --skeleton-path $(SKELETON_PATH)

I don't know if you have already updated this or not in your patch
series but this change needs to be in order to have the graph
statistics work.

Thanks,
-Ryan

-- 
Ryan Barnett / Sr Software Engineer
Airborne Information Systems / Security Systems and Software
MS 131-100, C Ave NE, Cedar Rapids, IA, 52498, USA
ryan.barnett at rockwellcollins.com
www.rockwellcollins.com

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Buildroot] [PATCH v4 2015.08 3/4] Makefile: implement a size-stats target
  2015-08-20 12:42   ` Ryan Barnett
@ 2015-08-20 20:43     ` Thomas Petazzoni
  2015-08-21 14:07       ` Ryan Barnett
  0 siblings, 1 reply; 20+ messages in thread
From: Thomas Petazzoni @ 2015-08-20 20:43 UTC (permalink / raw)
  To: buildroot

Ryan,

On Thu, 20 Aug 2015 07:42:39 -0500, Ryan Barnett wrote:

> > +size-stats:
> > +       $(Q)mkdir -p $(GRAPHS_DIR)
> > +       $(Q)$(TOPDIR)/support/scripts/size-stats --builddir $(BASE_DIR) \
> > +               --graph $(BASE_DIR)/graphs/graph-size.$(BR_GRAPH_OUT) \
> > +               --file-size-csv $(BASE_DIR)/build/file-size-stats.csv \
> > +               --package-size-csv $(BASE_DIR)/build/package-size-stats.csv \
> > +               --skeleton-path $(TARGET_SKELETON)
> 
> In order to make this patch series to continue to work with the recent
> addition of the skeleton package, the --skeleton-path variable needs
> to be updated to:
> 
>    --skeleton-path $(SKELETON_PATH)
> 
> I don't know if you have already updated this or not in your patch
> series but this change needs to be in order to have the graph
> statistics work.

I haven't had the time to update the series unfortunately. However, now
that the skeleton is a real package, I believe all the special handling
of the skeleton can be removed from this size-stats mechanism. It will
simply be handled as a normal package.

Best regards,

Thomas
-- 
Thomas Petazzoni, CTO, Free Electrons
Embedded Linux, Kernel and Android engineering
http://free-electrons.com

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Buildroot] [PATCH v4 2015.08 3/4] Makefile: implement a size-stats target
  2015-08-20 20:43     ` Thomas Petazzoni
@ 2015-08-21 14:07       ` Ryan Barnett
  2015-08-23 14:46         ` Thomas Petazzoni
  0 siblings, 1 reply; 20+ messages in thread
From: Ryan Barnett @ 2015-08-21 14:07 UTC (permalink / raw)
  To: buildroot

Thomas,

On Thu, Aug 20, 2015 at 3:43 PM, Thomas Petazzoni
<thomas.petazzoni@free-electrons.com> wrote:
> Ryan,
>
> On Thu, 20 Aug 2015 07:42:39 -0500, Ryan Barnett wrote:
>
>> > +size-stats:
>> > +       $(Q)mkdir -p $(GRAPHS_DIR)
>> > +       $(Q)$(TOPDIR)/support/scripts/size-stats --builddir $(BASE_DIR) \
>> > +               --graph $(BASE_DIR)/graphs/graph-size.$(BR_GRAPH_OUT) \
>> > +               --file-size-csv $(BASE_DIR)/build/file-size-stats.csv \
>> > +               --package-size-csv $(BASE_DIR)/build/package-size-stats.csv \
>> > +               --skeleton-path $(TARGET_SKELETON)
>>
>> In order to make this patch series to continue to work with the recent
>> addition of the skeleton package, the --skeleton-path variable needs
>> to be updated to:
>>
>>    --skeleton-path $(SKELETON_PATH)
>>
>> I don't know if you have already updated this or not in your patch
>> series but this change needs to be in order to have the graph
>> statistics work.
>
> I haven't had the time to update the series unfortunately. However, now
> that the skeleton is a real package, I believe all the special handling
> of the skeleton can be removed from this size-stats mechanism. It will
> simply be handled as a normal package.

Not a problem - we have been pretty thoroughly testing these patches
for the last couple months chasing master. I guess the one thing about
it being handled by being a "package" is there are no files in the
build directory. However, if I remember correctly you size-stats
script monitors the target directory for new files so the skeleton
package would be handled.

Thanks,
-Ryan

-- 
Ryan Barnett / Sr Software Engineer
Airborne Information Systems / Security Systems and Software
MS 131-100, C Ave NE, Cedar Rapids, IA, 52498, USA
ryan.barnett at rockwellcollins.com
www.rockwellcollins.com

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Buildroot] [PATCH v4 2015.08 3/4] Makefile: implement a size-stats target
  2015-08-21 14:07       ` Ryan Barnett
@ 2015-08-23 14:46         ` Thomas Petazzoni
  0 siblings, 0 replies; 20+ messages in thread
From: Thomas Petazzoni @ 2015-08-23 14:46 UTC (permalink / raw)
  To: buildroot

Dear Ryan Barnett,

On Fri, 21 Aug 2015 09:07:26 -0500, Ryan Barnett wrote:

> Not a problem - we have been pretty thoroughly testing these patches
> for the last couple months chasing master. I guess the one thing about
> it being handled by being a "package" is there are no files in the
> build directory. However, if I remember correctly you size-stats
> script monitors the target directory for new files so the skeleton
> package would be handled.

Yes, IIRC the size-stats stuff doesn't take care of anything in the
per-package build directory, it only monitors what gets installed by
each package in the target directory.

Best regards,

Thomas
-- 
Thomas Petazzoni, CTO, Free Electrons
Embedded Linux, Kernel and Android engineering
http://free-electrons.com

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Buildroot] [PATCH v4 2015.08 2/4] support/scripts: add size-stats script
  2015-05-28  3:18   ` Ryan Barnett
  2015-05-28 14:55     ` Matthew Weber
@ 2015-09-02 21:08     ` Thomas Petazzoni
  1 sibling, 0 replies; 20+ messages in thread
From: Thomas Petazzoni @ 2015-09-02 21:08 UTC (permalink / raw)
  To: buildroot

Ryan,

(Yes, I'm replying to a very old e-mail)

On Wed, 27 May 2015 22:18:32 -0500, Ryan Barnett wrote:

> > +def build_package_size(filesdict, builddir):
> > +    pkgsize = collections.defaultdict(int)
> > +
> > +    for root, _, files in os.walk(os.path.join(builddir, "target")):
> > +        for f in files:
> > +            fpath = os.path.join(root, f)
> > +            if os.path.islink(fpath):
> > +                continue
> > +            frelpath = os.path.relpath(fpath, os.path.join(builddir, "target"))
> > +            if not frelpath in filesdict:
> > +                print("WARNING: %s is not part of any package" % frelpath)
> 
> Would it be useful to have an exclusion list since this will always be
> printed out?
> 
> Every time you run 'make clean all size-stats' you will be faced with
> warnings such as this:
> 
> WARNING: THIS_IS_NOT_YOUR_ROOT_FILESYSTEM is not part of any package
> WARNING: etc/ld.so.cache is not part of any package
> WARNING: etc/hostname is not part of any package
> WARNING: etc/os-release is not part of any package
> WARNING: etc/nsswitch.conf is not part of any package
> WARNING: etc/ld.so.conf is not part of any package
> WARNING: etc/network/interfaces is not part of any package
> WARNING: tmp/ldconfig/aux-cache is not part of any package
> WARNING: dev/console is not part of any package
> 
> Initially when I saw this I didn't do something correct, however, I
> quickly released that these are files that are generated by
> buildroot's makefiles (such as THIS_IS_NOT_YOUR_ROOT_FILESYSTEM and
> etc/hostname). Since this files are generated by buildroot and one
> shouldn't be concerned about this files not being a part of any
> package. While typing this, would it make sense to create a package
> called 'buildroot' whose files are defined statically within this
> script?

Since I am not sure how to handle those files yet, I've left this as is
for the moment. In my tests I'm seeing less warnings now:

thomas at skate:~/projets/buildroot (size-stats-v5)$ make size-stats
WARNING: etc/os-release is not part of any package
WARNING: etc/ld.so.conf is not part of any package
WARNING: etc/hostname is not part of any package
WARNING: etc/network/interfaces is not part of any package

Which looks a bit more reasonable.


> > +    # Reduce text size
> > +    proptease = fm.FontProperties()
> > +    proptease.set_size('xx-small')
> > +    plt.setp(autotexts, fontproperties=proptease)
> > +    plt.setp(texts, fontproperties=proptease)
> 
> Could the total size of filesystem be placed on this graph? I was
> thinking maybe at the bottom of the graph in as a subtitle - don't
> know if this possible?

I've implemented this idea, thanks for the suggestion!

> Another idea would be the option to specify a chart title. This could
> be something that could be used with dependency graph as well or any
> other graph generated by buildroot (can't think of any others off at
> this moment). However, I would say that should be a future feature
> that is implemented.

For this one, I'd say we should handle it together with the other
graphs generated by Buildroot, so I've left it on the side for now.

Thanks!

Thomas
-- 
Thomas Petazzoni, CTO, Free Electrons
Embedded Linux, Kernel and Android engineering
http://free-electrons.com

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2015-09-02 21:08 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-05-25 21:56 [Buildroot] [PATCH v4 2015.08 0/4] Generate package size statistics Thomas Petazzoni
2015-05-25 21:56 ` [Buildroot] [PATCH v4 2015.08 1/4] pkg-generic: add step_pkg_size global instrumentation hook Thomas Petazzoni
2015-05-28  3:29   ` Ryan Barnett
2015-07-11 10:46   ` Romain Naour
2015-07-11 11:09     ` Thomas Petazzoni
2015-05-25 21:56 ` [Buildroot] [PATCH v4 2015.08 2/4] support/scripts: add size-stats script Thomas Petazzoni
2015-05-28  3:18   ` Ryan Barnett
2015-05-28 14:55     ` Matthew Weber
2015-09-02 21:08     ` Thomas Petazzoni
2015-06-03 15:50   ` Clayton Shotwell
2015-07-11 11:46     ` Romain Naour
2015-05-25 21:56 ` [Buildroot] [PATCH v4 2015.08 3/4] Makefile: implement a size-stats target Thomas Petazzoni
2015-05-28  3:32   ` Ryan Barnett
2015-08-20 12:42   ` Ryan Barnett
2015-08-20 20:43     ` Thomas Petazzoni
2015-08-21 14:07       ` Ryan Barnett
2015-08-23 14:46         ` Thomas Petazzoni
2015-05-25 21:56 ` [Buildroot] [PATCH v4 2015.08 4/4] docs/manual: add section about size graphing Thomas Petazzoni
2015-05-28  3:42   ` Ryan Barnett
2015-07-30 21:54 ` [Buildroot] [PATCH v4 2015.08 0/4] Generate package size statistics Ryan Barnett

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.