* [PATCH v2] Improve externalsrc task dependency tracking
@ 2016-02-25 14:29 Markus Lehtonen
  2016-02-25 14:29 ` [PATCH v2] externalsrc.bbclas: remove nostamp from do_configure Markus Lehtonen
  0 siblings, 1 reply; 4+ messages in thread
From: Markus Lehtonen @ 2016-02-25 14:29 UTC
  To: openembedded-core

This refines the first version of the patch by utilizing (or abusing) inline
Python variable expansion. If the source tree is a git repository, the Python
function uses a custom git index file to track any changes in the source tree
and returns only this one index file for bitbake to hash. If the source tree is
not a git repository, it works like the first version of this patch: all files
in the source tree are added as task dependencies (and thus hashed by bitbake).
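
To illustrate the non-git fallback, here is a minimal sketch of what "hash
every file in the tree" amounts to. It is not bitbake's checksum code; the
function name tree_fingerprint() and the use of md5 are assumptions made for
this example only. Top-level entries are found with a '*' glob, which in
Python does not match names starting with a dot, so .git is skipped.

```python
import glob
import hashlib
import os


def tree_fingerprint(src_dir):
    """Return {relative_path: md5 hex digest} for regular files under src_dir.

    Hypothetical helper for illustration. Hidden entries in the root of
    src_dir are skipped because the '*' glob does not match dotfiles.
    """
    digests = {}
    for top in sorted(glob.glob(os.path.join(src_dir, '*'))):
        if os.path.isdir(top):
            paths = [os.path.join(root, name)
                     for root, _dirs, files in os.walk(top)
                     for name in files]
        else:
            paths = [top]
        for path in paths:
            # Ignore broken symlinks and other non-regular files
            if not os.path.isfile(path):
                continue
            with open(path, 'rb') as f:
                digest = hashlib.md5(f.read()).hexdigest()
            digests[os.path.relpath(path, src_dir)] = digest
    return digests
```

Comparing two such fingerprints taken before and after an edit shows which
files would change the task signature. Every file has to be read once, which
is why the first run is the slow one for large trees.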

[YOCTO #8853]

The following changes since commit 205b446f3fc4a9885179a66a8dab9d81bcc63dca:

  uclibc: Do not use immediate expansion operator (2016-02-22 20:42:34 +0000)

are available in the git repository at:

  git://git.openembedded.org/openembedded-core-contrib marquiz/devtool/fixes


Markus Lehtonen (1):
  externalsrc.bbclas: remove nostamp from do_configure

 meta/classes/externalsrc.bbclass | 25 +++++++++++++++++++++++--
 1 file changed, 23 insertions(+), 2 deletions(-)

-- 
2.6.2




* [PATCH v2] externalsrc.bbclas: remove nostamp from do_configure
  2016-02-25 14:29 [PATCH v2] Improve externalsrc task dependency tracking Markus Lehtonen
@ 2016-02-25 14:29 ` Markus Lehtonen
  2016-03-08  5:03   ` Paul Eggleton
  0 siblings, 1 reply; 4+ messages in thread
From: Markus Lehtonen @ 2016-02-25 14:29 UTC
  To: openembedded-core

Be a bit more intelligent than mindlessly re-compiling every time.
Instead of using the 'nostamp' flag for do_compile, run a python function
to get a list of files to add as the 'file-checksums' flag.  The
intention is to only re-run do_compile if something in the source tree
content changes.

This python function, srctree_hash_files(), works differently depending
on whether the source tree is a git repository clone or not. If the
source tree is a git repository, the function runs 'git add .' with a
custom git index file in order to record all changes in the git working
tree. This custom index file is then returned as the file for the task
to depend on. The index file changes whenever anything in the source
tree changes, causing the task to be re-run.

If the source tree is not a git repository, srctree_hash_files() simply
adds the whole source tree as a dependency, causing bitbake to basically
hash every file in it. Hidden files and directories in the source tree
root are ignored by the glob currently used. This has the advantage of
automatically ignoring .git directory, for example.

This method of tracking source tree changes to determine whether a
re-build is needed does not work perfectly, though. Many packages are
built under ${S}, which effectively changes the source tree and causes
some unwanted re-compilations.  However, if do_compile of the recipe
does not produce new/different artefacts on every run (as commonly is
and should be the case) the re-compilation loop stops. Thus, you should
usually see only one re-compilation (if any), after which the source
tree is "stabilized" and no more re-compilations happen.

During the first bitbake run, preparing the task runqueue may take
much longer if the source tree is not a git repository, because all the
files in the source tree are hashed.  Subsequent builds are not
significantly slower because (most) file hashes are found in the cache.

[YOCTO #8853]

Signed-off-by: Markus Lehtonen <markus.lehtonen@linux.intel.com>
---
 meta/classes/externalsrc.bbclass | 25 +++++++++++++++++++++++--
 1 file changed, 23 insertions(+), 2 deletions(-)

diff --git a/meta/classes/externalsrc.bbclass b/meta/classes/externalsrc.bbclass
index b608bd0..4f25bcf 100644
--- a/meta/classes/externalsrc.bbclass
+++ b/meta/classes/externalsrc.bbclass
@@ -85,8 +85,7 @@ python () {
         d.prependVarFlag('do_compile', 'prefuncs', "externalsrc_compile_prefunc ")
         d.prependVarFlag('do_configure', 'prefuncs', "externalsrc_configure_prefunc ")
 
-        # Ensure compilation happens every time
-        d.setVarFlag('do_compile', 'nostamp', '1')
+        d.setVarFlag('do_compile', 'file-checksums', '${@srctree_hash_files(d)}')
 
         # We don't want the workdir to go away
         d.appendVar('RM_WORK_EXCLUDE', ' ' + d.getVar('PN', True))
@@ -125,3 +124,25 @@ python externalsrc_compile_prefunc() {
     # Make it obvious that this is happening, since forgetting about it could lead to much confusion
     bb.plain('NOTE: %s: compiling from external source tree %s' % (d.getVar('PN', True), d.getVar('EXTERNALSRC', True)))
 }
+
+def srctree_hash_files(d):
+    import shutil
+    import subprocess
+
+    s_dir = d.getVar('EXTERNALSRC', True)
+    git_dir = os.path.join(s_dir, '.git')
+    oe_index_file = os.path.join(git_dir, 'oe-devtool-index')
+
+    ret = " "
+    if os.path.exists(git_dir):
+        # Clone index
+        if not os.path.exists(oe_index_file):
+            shutil.copy2(os.path.join(git_dir, 'index'), oe_index_file)
+        # Update our custom index
+        env = os.environ.copy()
+        env['GIT_INDEX_FILE'] = oe_index_file
+        subprocess.check_output(['git', 'add', '.'], cwd=s_dir, env=env)
+        ret = oe_index_file + ':True'
+    else:
+        ret = s_dir + '/*:True'
+    return ret
-- 
2.6.2




* Re: [PATCH v2] externalsrc.bbclas: remove nostamp from do_configure
  2016-02-25 14:29 ` [PATCH v2] externalsrc.bbclas: remove nostamp from do_configure Markus Lehtonen
@ 2016-03-08  5:03   ` Paul Eggleton
  2016-03-22 17:14     ` Markus Lehtonen
  0 siblings, 1 reply; 4+ messages in thread
From: Paul Eggleton @ 2016-03-08  5:03 UTC
  To: Markus Lehtonen; +Cc: openembedded-core

Hi Markus,

So I finally made the time to look at this - sorry for the extreme delay. There 
are a few issues:

1) Unfortunately this clashes with the EXTERNALSRC_SYMLINKS functionality - we 
now create oe-logs and oe-workdir symlinks in the source directory, and these 
will be picked up by the file-checksums resulting in either warnings or errors 
when pseudo.socket goes missing. For git repositories we should probably be 
poking these into .git/info/exclude somehow; but without a git repository I'm 
unsure as to how to exclude them. It could be that we make things easy on 
ourselves and only activate this functionality if the source tree is a git 
repository and just fall back to the old behaviour if it isn't.

2) If the source tree is a git repo then we always only add files to the custom 
index; if you then realise your .gitignore isn't complete and add some items 
to be ignored within it, those items are still in the custom index and thus 
still get incorporated into the signature. Perhaps we need to be doing a git 
reset for that index before git add each time?

3) Even with a git repository and a properly set up .gitignore such that I 
could tell the index file's md5sum wasn't changing, I couldn't seem to get it 
to work - it just built every time as before. I wonder if this has to do with 
the CONFIGURESTAMPFILE functionality, since I noticed it's do_configure 
executing every time.
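
For point 1, the .git/info/exclude idea could look roughly like the sketch
below. This is an illustration only, not devtool code: the function name
add_git_excludes() is hypothetical, and it assumes that .git is a plain
directory (not a "gitdir:" file as used by worktrees and submodules).

```python
import os


def add_git_excludes(src_dir, names=('oe-logs', 'oe-workdir')):
    """Append names to .git/info/exclude unless already listed.

    Hypothetical helper; returns the list of names that were added.
    """
    exclude_file = os.path.join(src_dir, '.git', 'info', 'exclude')
    os.makedirs(os.path.dirname(exclude_file), exist_ok=True)

    content = ''
    if os.path.exists(exclude_file):
        with open(exclude_file) as f:
            content = f.read()

    listed = {line.strip() for line in content.splitlines()}
    missing = [name for name in names if name not in listed]
    if missing:
        with open(exclude_file, 'a') as f:
            # Start on a fresh line if the file lacks a trailing newline
            if content and not content.endswith('\n'):
                f.write('\n')
            f.write(''.join(name + '\n' for name in missing))
    return missing
```

Calling it repeatedly is harmless because names already present are not
written again. It does nothing for source trees that are not git
repositories, which is the case that remains open above.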

Cheers,
Paul

-- 

Paul Eggleton
Intel Open Source Technology Centre



* Re: [PATCH v2] externalsrc.bbclas: remove nostamp from do_configure
  2016-03-08  5:03   ` Paul Eggleton
@ 2016-03-22 17:14     ` Markus Lehtonen
  0 siblings, 0 replies; 4+ messages in thread
From: Markus Lehtonen @ 2016-03-22 17:14 UTC
  To: Paul Eggleton; +Cc: openembedded-core

Hi Paul,



On 08/03/16 07:03, "Paul Eggleton" <paul.eggleton@linux.intel.com> wrote:

>Hi Markus,
>
>So I finally made the time to look at this - sorry for the extreme delay. There 
>are a few issues:

Thank you for the review. I'm sorry about the latest delay on my part; I had
simply missed your email.

I just submitted a new version of the patchset, which should resolve the issues
you were seeing. Note that it now also requires two patches to bitbake.



>1) Unfortunately this clashes with the EXTERNALSRC_SYMLINKS functionality - we 
>now create oe-logs and oe-workdir symlinks in the source directory, and these 
>will be picked up by the file-checksums resulting in either warnings or errors 
>when pseudo.socket goes missing. For git repositories we should probably be 
>poking these into .git/info/exclude somehow; but without a git repository I'm 
>unsure as to how to exclude them. It could be that we make things easy on 
>ourselves and only activate this functionality if the source tree is a git 
>repository and just fall back to the old behaviour if it isn't.

Yes, it does (or did). Generally, I don't like the idea of the build system
dirtying/tampering with the source tree (unless B=S), but I guess there's
not much I can do about that now.

For git trees, the symlinks do not cause any problems for checksumming because
Git handles those. They do make the git tree dirty, though, and devtool should
add them to .git/info/exclude. However, I think that is unrelated to this
patchset and could/should be done in a separate patchset.

My latest iteration fixes the symlink problem for non-Git trees by changing the
bitbake checksumming code so that it does not follow directory symlinks. If that
is not considered feasible, we could try e.g. listing each file in the root
directory as a dependency separately (i.e. without using a glob) and filtering
out symlinks there.
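
A minimal sketch of that alternative, for illustration only: the function
name root_file_checksums() is hypothetical, and the "path:True" entry format
is simply copied from the v2 patch above. Hidden entries are skipped to match
what the '*' glob did.

```python
import os


def root_file_checksums(src_dir):
    """Build a file-checksums style string from the root of src_dir.

    Hypothetical helper: lists each root entry separately instead of using
    a glob, skipping hidden entries and symlinks such as oe-logs and
    oe-workdir.
    """
    entries = []
    for name in sorted(os.listdir(src_dir)):
        path = os.path.join(src_dir, name)
        if name.startswith('.') or os.path.islink(path):
            continue
        entries.append(path + ':True')
    return ' '.join(entries)
```

Note that this only filters symlinks in the root directory; symlinks deeper in
the tree would still be seen by whatever walks the listed directories.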



>2) If the source tree is a git repo then we always only add files to the custom 
>index; if you then realise your .gitignore isn't complete and add some items 
>to be ignored within it, those items are still in the custom index and thus 
>still get incorporated into the signature. Perhaps we need to be doing a git 
>reset for that index before git add each time?

This is now also fixed. In the new version, a fresh copy of the "real" git index
is always used as the base for the custom index, so this shouldn't be an issue
anymore.
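
Roughly, the idea is as in the sketch below. This is not the code from the new
patchset: the function name refresh_custom_index() and the injectable "run"
parameter are assumptions made for this example, and the custom index file
name is taken from the v2 patch.

```python
import os
import shutil
import subprocess


def refresh_custom_index(src_dir, run=subprocess.check_call):
    """Rebuild the custom index from the real one, then stage the work tree.

    Hypothetical helper; 'run' is injectable so the git call can be
    replaced, e.g. in tests.
    """
    git_dir = os.path.join(src_dir, '.git')
    real_index = os.path.join(git_dir, 'index')
    custom_index = os.path.join(git_dir, 'oe-devtool-index')

    # Always start from the real index so that entries staged on an
    # earlier run (e.g. files that are now ignored) do not linger.
    if os.path.exists(real_index):
        shutil.copy2(real_index, custom_index)
    elif os.path.exists(custom_index):
        os.remove(custom_index)

    env = os.environ.copy()
    env['GIT_INDEX_FILE'] = custom_index
    run(['git', 'add', '-A', '.'], cwd=src_dir, env=env)
    return custom_index
```

Because the custom index is recreated on every call, files that were staged
into it before being added to .gitignore no longer influence its contents.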



>3) Even with a git repository and a properly set up .gitignore such that I 
>could tell the index file's md5sum wasn't changing, I couldn't seem to get it 
>to work - it just built every time as before. I wonder if this has to do with 
>the CONFIGURESTAMPFILE functionality, since I noticed it's do_configure 
>executing every time.

Yes, this was caused by the semi-broken CONFIGURESTAMPFILE functionality. It is
now fixed in master.


Thanks,
   Markus



