From: Joshua Watt <jpewhacker@gmail.com>
To: bitbake-devel@lists.openembedded.org,
	openembedded-core@lists.openembedded.org
Subject: [RFC v2 00/16] Hash Equivalency Server
Date: Thu,  9 Aug 2018 17:08:24 -0500
Message-ID: <20180809220840.26697-1-JPEWhacker@gmail.com>
In-Reply-To: <20180716203728.23078-1-JPEWhacker@gmail.com>

These patches are a first pass at implementing a hash equivalence server
in bitbake & OE.

Apologies for cross-posting this to both bitbake-devel and
openembedded-core; this work necessarily intertwines both projects, and
you really need to look at both parts to get an idea of what is going
on. For convenience, the bitbake patches are listed first, followed by
the oe-core patches.

The basic premise is that any given task no longer hashes a dependent
task's taskhash to determine its own taskhash, but instead hashes the
dependent task's "dependency ID". (The dependency ID doesn't strictly
need to be a hash, but it is one for consistency. We can have the
discussion as to whether this should be called a "dependency hash" if
anyone wants.) This allows multiple taskhashes to map to the same
dependency ID, meaning that trivial changes to a recipe that would
change the taskhash don't necessarily need to change the dependency ID,
and thus don't need to cause downstream tasks to be rebuilt (with
caveats, see below).
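
To make the premise above concrete, here is a minimal sketch in plain
Python. It is not the siggen code from these patches: task_hash(), the
dep_id_of table, and all of the hash strings are hypothetical names made
up for illustration. It only shows why hashing dependency IDs instead of
taskhashes keeps a downstream taskhash stable.

```python
import hashlib


def task_hash(base_hash, dep_ids):
    """Combine a task's own base hash with identifiers of its dependencies."""
    h = hashlib.sha256(base_hash.encode())
    for dep in sorted(dep_ids):
        h.update(dep.encode())
    return h.hexdigest()


# Hypothetical mapping: two taskhashes of the same upstream task (before
# and after a trivial recipe change) that share one dependency ID.
dep_id_of = {"dep-taskhash-v1": "depid-A", "dep-taskhash-v2": "depid-A"}

# Proposed behavior: the downstream task hashes the dependency ID.
old = task_hash("consumer-base", [dep_id_of["dep-taskhash-v1"]])
new = task_hash("consumer-base", [dep_id_of["dep-taskhash-v2"]])

# Current behavior: the downstream task hashes the upstream taskhash.
legacy_old = task_hash("consumer-base", ["dep-taskhash-v1"])
legacy_new = task_hash("consumer-base", ["dep-taskhash-v2"])
```

With the dependency ID, old and new are identical, so the downstream
task would not need to be rebuilt; with the raw taskhashes, legacy_old
and legacy_new differ.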

In the absence of any interaction by the user, the dependency ID for a
task is just that task's taskhash, which effectively maintains the
current behavior. However, if the user enables the "OEEquivHash"
signature generator, they can direct it to look at a hash equivalency
server (of which a reference implementation is provided). The sstate
code will provide the server with an output hash that it calculates, and
the server will record all tasks with the same output hash as
"equivalent" and report the same dependency ID for them when requested.
When initializing tasks, bitbake can ask the server about the dependency
ID for new tasks it has never seen before and potentially skip
rebuilding, or restore the task from an equivalent sstate file. To
facilitate restoring tasks from sstate, sstate objects are now named
based on the task's dependency ID instead of the taskhash (which, again,
has no effect if the server is not in use).
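
The server behavior described above can be sketched as a small
in-memory model. This is an illustration only, not the hashserv
reference implementation or its protocol: the EquivServer class, its
report() and query() methods, and the choice of using the first
reporter's taskhash as the shared dependency ID are all assumptions.

```python
class EquivServer:
    """Toy model: tasks with the same output hash share a dependency ID."""

    def __init__(self):
        self._by_output = {}  # output hash -> dependency ID
        self._by_task = {}    # taskhash -> dependency ID

    def report(self, taskhash, outhash):
        # Assumption: the first task to report a given output hash
        # donates its taskhash as the dependency ID for that output.
        depid = self._by_output.setdefault(outhash, taskhash)
        self._by_task[taskhash] = depid
        return depid

    def query(self, taskhash):
        # An unknown task falls back to its own taskhash, which matches
        # the behavior when no server is in use.
        return self._by_task.get(taskhash, taskhash)


server = EquivServer()
first = server.report("taskhash-1", "outhash-X")
second = server.report("taskhash-2", "outhash-X")  # same output, new taskhash
```

Here both reports return "taskhash-1", so an sstate object named after
that dependency ID could be reused for the second task, while
server.query() on a taskhash the server has never seen returns that
taskhash unchanged.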

This patchset doesn't make any attempt to dynamically update task
dependency IDs after bitbake initializes the tasks, and as such there
are some cases where this isn't accelerating the build as much as it
possibly could. I think it will be possible to add support for this, but
this preliminary support needs to come first.

Some patches have additional NOTEs that indicate places where I wasn't
sure what to do.

You can also see these patches (and my first attempts at dynamic task
re-hashing) on the "jpew/hash-equivalence" branch in poky-contrib.

As always, thanks for your feedback and time.

VERSION 2:

At the core, this patch does the same thing as V1 with some very minor
tweaks. The main things that have changed are:
 1) Per request, the Hash Equivalence Server reference implementation is
    now based entirely on built-in Python modules and requires no
    external libraries. It also has a wrapper script to launch it
    (bitbake-hashserv) and unit tests.
 2) There is a major rework of persist_data in bitbake. I think these
    patches could be submitted independently, but I doubt anyone is
    clamoring for them. The general gist of them is that there were a
    lot of strange edge cases that I found when using persist_data as an
    IPC mechanism between the main bitbake process and the
    bitbake-worker processes. I went ahead and added extensive unit
    tests for this as well.
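
For readers less familiar with the SQLite side of item 2, the sketch
below shows two of the underlying techniques using only the standard
sqlite3 module: switching a database to write-ahead logging and closing
cursors deterministically. It is not the persist_data code; the
open_wal() helper, the kv table, and the file name are hypothetical.

```python
import contextlib
import os
import sqlite3
import tempfile


def open_wal(path):
    """Open an on-disk SQLite database and request write-ahead logging."""
    conn = sqlite3.connect(path)
    # The pragma returns the journal mode actually in effect.
    mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    return conn, mode


with tempfile.TemporaryDirectory() as tmpdir:
    conn, mode = open_wal(os.path.join(tmpdir, "demo.sqlite3"))
    conn.execute("CREATE TABLE kv (key TEXT PRIMARY KEY NOT NULL, value TEXT)")
    conn.execute("INSERT INTO kv VALUES (?, ?)", ("taskhash", "depid"))
    conn.commit()
    # Close the cursor explicitly so it cannot linger and hold a lock.
    with contextlib.closing(conn.cursor()) as cursor:
        cursor.execute("SELECT value FROM kv WHERE key = ?", ("taskhash",))
        value = cursor.fetchone()[0]
    conn.close()
```

Write-ahead logging lets readers proceed while a writer is active,
which is relevant when several processes share one database file.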

Joshua Watt (16):
  bitbake: fork: Add os.fork() wrappers
  bitbake: persist_data: Fix leaking cursors causing deadlock
  bitbake: persist_data: Add key constraints
  bitbake: persist_data: Enable Write Ahead Log
  bitbake: persist_data: Disable enable_shared_cache
  bitbake: persist_data: Close databases across fork
  bitbake: tests/persist_data: Add tests
  bitbake: bitbake-worker: Pass taskhash as runtask parameter
  bitbake: siggen: Split out stampfile hash fetch
  bitbake: siggen: Split out task depend ID
  bitbake: runqueue: Track task dependency ID
  bitbake: runqueue: Pass dependency ID to task
  bitbake: runqueue: Pass dependency ID to hash validate
  classes/sstate: Handle depid in hash check
  bitbake: hashserv: Add hash equivalence reference server
  sstate: Implement hash equivalence sstate

 bitbake/bin/bitbake-hashserv         |  67 ++++++++
 bitbake/bin/bitbake-selftest         |   3 +
 bitbake/bin/bitbake-worker           |  11 +-
 bitbake/lib/bb/fork.py               |  71 ++++++++
 bitbake/lib/bb/persist_data.py       | 239 ++++++++++++++++++++-------
 bitbake/lib/bb/runqueue.py           |  56 ++++---
 bitbake/lib/bb/siggen.py             |  20 ++-
 bitbake/lib/bb/tests/persist_data.py | 188 +++++++++++++++++++++
 bitbake/lib/hashserv/__init__.py     | 152 +++++++++++++++++
 bitbake/lib/hashserv/tests.py        | 141 ++++++++++++++++
 meta/classes/sstate.bbclass          | 102 +++++++++++-
 meta/conf/bitbake.conf               |   4 +-
 meta/lib/oe/sstatesig.py             | 166 +++++++++++++++++++
 13 files changed, 1117 insertions(+), 103 deletions(-)
 create mode 100755 bitbake/bin/bitbake-hashserv
 create mode 100644 bitbake/lib/bb/fork.py
 create mode 100644 bitbake/lib/bb/tests/persist_data.py
 create mode 100644 bitbake/lib/hashserv/__init__.py
 create mode 100644 bitbake/lib/hashserv/tests.py

-- 
2.17.1



