* [PATCH 0/3] gitsm.py: submodule init and ssh url processing
@ 2019-01-08 23:38 Mark Hatle
  2019-01-08 23:38 ` [PATCH 1/3] gitsm.py: Fix when a submodule is defined, but not initialized Mark Hatle
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Mark Hatle @ 2019-01-08 23:38 UTC (permalink / raw)
  To: bitbake-devel

This patch set covers two different items.  The first is work that started
before Christmas where someone noticed a repository that had a defined
gitmodule, but it was never initialized.  The first patch deals with that
issue.  (Thanks Raphael Lisicki for the fix approach...)

The second set works through the issues that Linus Ziegert and
Krystian Garlinski have brought up on the list.  This fix takes a
different approach than Linus's approach of adding a specific regex.
Instead we look for '://'; if we find it, it's a URL.  Otherwise we
look for ':'; if we find that, it must be 'ssh style'.  Otherwise it
has to be a file URL.  I think this simplifies the overall approach.
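
As a rough illustration of the check order described above, here is a
minimal Python sketch.  The function name classify_submodule_url and the
returned labels are made up for this example; they are not part of
gitsm.py.

```python
def classify_submodule_url(url: str) -> str:
    """Classify a .gitmodules 'url' value (hypothetical helper).

    The checks run in the order given in the text: '://' first,
    then a bare ':', and finally the fallback to a local path.
    """
    if "://" in url:
        return "uri"    # e.g. proto://user:pass@host/path
    if ":" in url:
        return "ssh"    # e.g. user@host:path
    return "file"       # e.g. /path, ./path or ../path


if __name__ == "__main__":
    for candidate in ("git://git.yoctoproject.org/git-submodule-test",
                      "git@github.com:alice/foo",
                      "../foo"):
        print(candidate, "->", classify_submodule_url(candidate))
```

The order matters: a URI such as proto://host/path also contains ':',
so the '://' test has to come first.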

Finally a test case was developed.  See:

http://git.yoctoproject.org/cgit/cgit.cgi/git-submodule-test/tree/.gitmodules?h=ssh-gitsm-tests

for the specific .gitmodules file that was tested.  I would appreciate
it if the people experiencing problems could try out this code and
verify it works in their configurations.

RP - in the last patch, I wasn't sure how to switch which URL is used.
Either the original simplified testing or the more complex ssh based tests.
Any suggestions on a better way to handle this?


Mark Hatle (3):
  gitsm.py: Fix when a submodule is defined, but not initialized
  gitsm.py: Add support for alternative URL formats from submodule files
  tests/fetch.py: Add alternative gitsm test case

 lib/bb/fetch2/gitsm.py | 39 +++++++++++++++++++++++++++++++++------
 lib/bb/tests/fetch.py  |  6 +++++-
 2 files changed, 38 insertions(+), 7 deletions(-)

-- 
1.8.3.1




* [PATCH 1/3] gitsm.py: Fix when a submodule is defined, but not initialized
  2019-01-08 23:38 [PATCH 0/3] gitsm.py: submodule init and ssh url processing Mark Hatle
@ 2019-01-08 23:38 ` Mark Hatle
  2019-01-08 23:38 ` [PATCH 2/3] gitsm.py: Add support for alternative URL formats from submodule files Mark Hatle
  2019-01-08 23:38 ` [PATCH 3/3] tests/fetch.py: Add alternative gitsm test case Mark Hatle
  2 siblings, 0 replies; 7+ messages in thread
From: Mark Hatle @ 2019-01-08 23:38 UTC (permalink / raw)
  To: bitbake-devel

It is possible for a submodule to be defined in the .gitmodules file, but
never initialized in the repository itself.  This shows itself when searching
for the defined module hash: you will get back an empty value.

Similarly we need to identify and skip defined but not initialized submodules
during the unpack stages as well.
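
A minimal sketch of that check, assuming the text returned by
"git ls-tree -z -d <rev> <path>" is already available as a string.  The
helper name submodule_hash and the example path "bitbake" are
hypothetical and only serve the illustration.

```python
from typing import Optional


def submodule_hash(ls_tree_output: str) -> Optional[str]:
    """Return the object hash from ls-tree output, or None if it is empty.

    An empty result is treated as "defined in .gitmodules, but not
    initialized in the repository", so the caller can skip the module.
    """
    if not ls_tree_output:
        return None
    # Fields are "<mode> <type> <object>\t<path>"; the hash is the third one.
    return ls_tree_output.split()[2]


if __name__ == "__main__":
    sample = "160000 commit 0d3ffc14bce95e8b3a21a0a67bfe4c4a96ba6350\tbitbake\0"
    print(submodule_hash(sample))
    print(submodule_hash(""))
```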

Thanks to raphael.lisicki@siemens.com for their help in figuring out how
to resolve this issue.

Additionally a problem was found where, while unlikely, it may be possible
for the wrong revision to have been searched using ls-tree.  This has been
resolved in the update_submodules function by keeping the correct revision
along with the submodule path.

Signed-off-by: Mark Hatle <mark.hatle@windriver.com>
---
 lib/bb/fetch2/gitsm.py | 22 ++++++++++++++++++----
 1 file changed, 18 insertions(+), 4 deletions(-)

diff --git a/lib/bb/fetch2/gitsm.py b/lib/bb/fetch2/gitsm.py
index 35729db..88609f0 100644
--- a/lib/bb/fetch2/gitsm.py
+++ b/lib/bb/fetch2/gitsm.py
@@ -64,6 +64,7 @@ class GitSM(Git):
     def update_submodules(self, ud, d):
         submodules = []
         paths = {}
+        revision = {}
         uris = {}
         local_paths = {}
 
@@ -77,6 +78,7 @@ class GitSM(Git):
             for m, md in self.parse_gitmodules(gitmodules).items():
                 submodules.append(m)
                 paths[m] = md['path']
+                revision[m] = ud.revisions[name]
                 uris[m] = md['url']
                 if uris[m].startswith('..'):
                     newud = copy.copy(ud)
@@ -84,7 +86,12 @@ class GitSM(Git):
                     uris[m] = Git._get_repo_url(self, newud)
 
         for module in submodules:
-            module_hash = runfetchcmd("%s ls-tree -z -d %s %s" % (ud.basecmd, ud.revisions[name], paths[module]), d, quiet=True, workdir=ud.clonedir)
+            module_hash = runfetchcmd("%s ls-tree -z -d %s %s" % (ud.basecmd, revision[module], paths[module]), d, quiet=True, workdir=ud.clonedir)
+
+            if not module_hash:
+                logger.debug(1, "submodule %s is defined, but is not initialized in the repository. Skipping", module)
+                continue
+
             module_hash = module_hash.split()[2]
 
             # Build new SRC_URI
@@ -143,7 +150,7 @@ class GitSM(Git):
         if not ud.shallow or ud.localpath != ud.fullshallow:
             self.update_submodules(ud, d)
 
-    def copy_submodules(self, submodules, ud, destdir, d):
+    def copy_submodules(self, submodules, ud, name, destdir, d):
         if ud.bareclone:
             repo_conf = destdir
         else:
@@ -156,6 +163,13 @@ class GitSM(Git):
             srcpath = os.path.join(ud.clonedir, 'modules', md['path'])
             modpath = os.path.join(repo_conf, 'modules', md['path'])
 
+            # Check if the module is initialized
+            module_hash = runfetchcmd("%s ls-tree -z -d %s %s" % (ud.basecmd, ud.revisions[name], md['path']), d, quiet=True, workdir=ud.clonedir)
+
+            if not module_hash:
+                logger.debug(1, "submodule %s is defined, but is not initialized in the repository. Skipping", module)
+                continue
+
             if os.path.exists(srcpath):
                 if os.path.exists(os.path.join(srcpath, '.git')):
                     srcpath = os.path.join(srcpath, '.git')
@@ -188,7 +202,7 @@ class GitSM(Git):
                 continue
 
             submodules = self.parse_gitmodules(gitmodules)
-            self.copy_submodules(submodules, ud, dest, d)
+            self.copy_submodules(submodules, ud, name, dest, d)
 
     def unpack(self, ud, destdir, d):
         Git.unpack(self, ud, destdir, d)
@@ -211,7 +225,7 @@ class GitSM(Git):
                 continue
 
             submodules = self.parse_gitmodules(gitmodules)
-            self.copy_submodules(submodules, ud, ud.destdir, d)
+            self.copy_submodules(submodules, ud, name, ud.destdir, d)
 
             submodules_queue = [(module, os.path.join(repo_conf, 'modules', md['path'])) for module, md in submodules.items()]
             while len(submodules_queue) != 0:
-- 
1.8.3.1




* [PATCH 2/3] gitsm.py: Add support for alternative URL formats from submodule files
  2019-01-08 23:38 [PATCH 0/3] gitsm.py: submodule init and ssh url processing Mark Hatle
  2019-01-08 23:38 ` [PATCH 1/3] gitsm.py: Fix when a submodule is defined, but not initialized Mark Hatle
@ 2019-01-08 23:38 ` Mark Hatle
  2019-01-09  1:18   ` Olof Johansson
  2019-01-08 23:38 ` [PATCH 3/3] tests/fetch.py: Add alternative gitsm test case Mark Hatle
  2 siblings, 1 reply; 7+ messages in thread
From: Mark Hatle @ 2019-01-08 23:38 UTC (permalink / raw)
  To: bitbake-devel

The following appear to be the git supported formats:

  proto://user:pass@host/path  (URI format)
  user@host:path (SSH format)
  /path or ./path or ../path (local file format)

We adjust the parsing to find out if we have a URI format or not.
When we are NOT in URI format, we do our best to determine SSH or
file format by looking for a ':' in the overall string.  If we find
a ':' we assume SSH format and adjust accordingly.

Note, in SSH format we simply replace the ':' with a '/' when constructing
the URL.  However, if the original path was ":/...", we don't want '//' so
we deal with this corner case as well.
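
To make the corner case concrete, here is a small standalone sketch of
the SSH-format branch only.  The function name scp_style_to_gitsm is
invented for this example; the two replace() calls mirror the ones in
the diff below.

```python
def scp_style_to_gitsm(uri: str) -> str:
    """Turn 'user@host:path' into a gitsm:// URL (hypothetical helper).

    Assumes the caller has already established that 'uri' contains
    ':' but not '://'.
    """
    if ":/" in uri:
        # "host:/abs/path": drop the ':' so we do not end up with '//'.
        return "gitsm://" + uri.replace(":/", "/", 1)
    # "host:rel/path": swap the first ':' for '/'.
    return "gitsm://" + uri.replace(":", "/", 1)


if __name__ == "__main__":
    print(scp_style_to_gitsm("git@github.com:alice/foo"))
    print(scp_style_to_gitsm("alice@example.com:/foo.git"))
```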

Signed-off-by: Mark Hatle <mark.hatle@windriver.com>
---
 lib/bb/fetch2/gitsm.py | 17 +++++++++++++++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/lib/bb/fetch2/gitsm.py b/lib/bb/fetch2/gitsm.py
index 88609f0..0981bba 100644
--- a/lib/bb/fetch2/gitsm.py
+++ b/lib/bb/fetch2/gitsm.py
@@ -95,8 +95,21 @@ class GitSM(Git):
             module_hash = module_hash.split()[2]
 
             # Build new SRC_URI
-            proto = uris[module].split(':', 1)[0]
-            url = uris[module].replace('%s:' % proto, 'gitsm:', 1)
+            if "://" not in uris[module]:
+                # It's ssh if the format does NOT have "://", but has a ':'
+                if ":" in uris[module]:
+                    proto = "ssh"
+                    if ":/" in uris[module]:
+                        url = "gitsm://" + uris[module].replace(':/', '/', 1)
+                    else:
+                        url = "gitsm://" + uris[module].replace(':', '/', 1)
+                else: # Fall back to 'file' if there is no ':'
+                    proto = "file"
+                    url = "gitsm://" + uris[module]
+            else:
+                proto = uris[module].split(':', 1)[0]
+                url = uris[module].replace('%s:' % proto, 'gitsm:', 1)
+
             url += ';protocol=%s' % proto
             url += ";name=%s" % module
             url += ";bareclone=1;nocheckout=1;nobranch=1"
-- 
1.8.3.1




* [PATCH 3/3] tests/fetch.py: Add alternative gitsm test case
  2019-01-08 23:38 [PATCH 0/3] gitsm.py: submodule init and ssh url processing Mark Hatle
  2019-01-08 23:38 ` [PATCH 1/3] gitsm.py: Fix when a submodule is defined, but not initialized Mark Hatle
  2019-01-08 23:38 ` [PATCH 2/3] gitsm.py: Add support for alternative URL formats from submodule files Mark Hatle
@ 2019-01-08 23:38 ` Mark Hatle
  2 siblings, 0 replies; 7+ messages in thread
From: Mark Hatle @ 2019-01-08 23:38 UTC (permalink / raw)
  To: bitbake-devel

In order to test the ssh processing in gitsm, we add an alternative
testcase that can be downloaded from git.yoctoproject.org.  However,
this test case requires (read) access, via ssh, to git.yoctoproject.org.

Signed-off-by: Mark Hatle <mark.hatle@windriver.com>
---
 lib/bb/tests/fetch.py | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/lib/bb/tests/fetch.py b/lib/bb/tests/fetch.py
index 6848095..311c701 100644
--- a/lib/bb/tests/fetch.py
+++ b/lib/bb/tests/fetch.py
@@ -893,7 +893,11 @@ class FetcherNetworkTest(FetcherTest):
 
     @skipIfNoNetwork()
     def test_git_submodule(self):
-        fetcher = bb.fetch.Fetch(["gitsm://git.yoctoproject.org/git-submodule-test;rev=f12e57f2edf0aa534cf1616fa983d165a92b0842"], self.d)
+        # URL with ssh submodules
+        url = "gitsm://git.yoctoproject.org/git-submodule-test;branch=ssh-gitsm-tests;rev=0d3ffc14bce95e8b3a21a0a67bfe4c4a96ba6350"
+        # Original URL (comment this if you have ssh access to git.yoctoproject.org)
+        url = "gitsm://git.yoctoproject.org/git-submodule-test;rev=f12e57f2edf0aa534cf1616fa983d165a92b0842"
+        fetcher = bb.fetch.Fetch([url], self.d)
         fetcher.download()
         # Previous cwd has been deleted
         os.chdir(os.path.dirname(self.unpackdir))
-- 
1.8.3.1




* Re: [PATCH 2/3] gitsm.py: Add support for alternative URL formats from submodule files
  2019-01-08 23:38 ` [PATCH 2/3] gitsm.py: Add support for alternative URL formats from submodule files Mark Hatle
@ 2019-01-09  1:18   ` Olof Johansson
  2019-01-09  1:29     ` Olof Johansson
  0 siblings, 1 reply; 7+ messages in thread
From: Olof Johansson @ 2019-01-09  1:18 UTC (permalink / raw)
  To: Mark Hatle; +Cc: bitbake-devel

On 19-01-08 18:38 -0500, Mark Hatle wrote:
> Note, in SSH format we simply replace the ':' with a '/' when constructing
> the URL.  However, if the original path was ":/...", we don't want '//' so
> we deal with this corner case as well.
...
> +  if ":/" in uris[module]:
> +      url = "gitsm://" + uris[module].replace(':/', '/', 1)
> +  else:
> +      url = "gitsm://" + uris[module].replace(':', '/', 1)

Without a / prefix in the original ssh path, the path becomes
relative to whatever the ssh server wants, usually the user's
$HOME. I.e., they do not refer to the same path on the remote
system.

  alice@example.com:/foo.git -> ssh://alice@example.com/foo.git
  alice@example.com:foo.git -> ssh://alice@example.com/home/alice/foo.git (or something else!)

The path in an ssh:// uri is absolute, so the relative case must
be handled differently. But it's easy, git supports ~ expansion,
like:

  alice@example.com:foo.git -> ssh://alice@example.com/~/foo.git
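
A minimal sketch of that mapping, for illustration only.  The function
name scp_style_to_ssh_url is made up, and whether a given server honors
the '~' form is an assumption that needs to be verified per server.

```python
def scp_style_to_ssh_url(spec: str) -> str:
    """Map 'user@host:path' to an ssh:// URL (hypothetical helper).

    Absolute paths are kept as they are; relative paths are anchored
    at '~' so that they stay relative to the login directory.
    """
    host, path = spec.split(":", 1)
    if path.startswith("/"):
        return "ssh://%s%s" % (host, path)
    return "ssh://%s/~/%s" % (host, path)


if __name__ == "__main__":
    print(scp_style_to_ssh_url("alice@example.com:/foo.git"))
    print(scp_style_to_ssh_url("alice@example.com:foo.git"))
```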

-- 
olofjn



* Re: [PATCH 2/3] gitsm.py: Add support for alternative URL formats from submodule files
  2019-01-09  1:18   ` Olof Johansson
@ 2019-01-09  1:29     ` Olof Johansson
  2019-01-09 16:33       ` Mark Hatle
  0 siblings, 1 reply; 7+ messages in thread
From: Olof Johansson @ 2019-01-09  1:29 UTC (permalink / raw)
  To: Mark Hatle, bitbake-devel

On 19-01-09 02:18 +0100, Olof Johansson wrote:
> The path in an ssh:// uri is absolute, so the relative case must
> be handled differently. But it's easy, git supports ~ expansion,
> like:
> 
>   alice@example.com:foo.git -> ssh://alice@example.com/~/foo.git

Tested some more. This works with git over openssh (possibly only
against unix systems), but not with github it seems, so this
wasn't as general a solution as I hoped. Sorry. I think the issue
is real, but my proposed solution wasn't :(.

And, of course, the github ssh remote specifiers all look like
git@github.com:alice/foo.

-- 
olofjn



* Re: [PATCH 2/3] gitsm.py: Add support for alternative URL formats from submodule files
  2019-01-09  1:29     ` Olof Johansson
@ 2019-01-09 16:33       ` Mark Hatle
  0 siblings, 0 replies; 7+ messages in thread
From: Mark Hatle @ 2019-01-09 16:33 UTC (permalink / raw)
  To: bitbake-devel

On 1/8/19 7:29 PM, Olof Johansson wrote:
> On 19-01-09 02:18 +0100, Olof Johansson wrote:
>> The path in an ssh:// uri is absolute, so the relative case must
>> be handled differently. But it's easy, git supports ~ expansion,
>> like:
>>
>>   alice@example.com:foo.git -> ssh://alice@example.com/~/foo.git
> 
> Tested some more. This works with git over openssh (possibly only
> against unix systems), but not with github it seems, so this
> wasn't as general a solution as I hoped. Sorry. I think the issue
> is real, but my proposed solution wasn't :(.
> 
> And, of course, the github ssh remote specifiers all look like
> git@github.com:alice/foo.
> 

Unfortunately, this is what I observed as well.  In the specific test cases I
had (GitHub and gitolite based), replacing it 'as-is' was the better solution.

The only alternative I can see in this case is to try one fetch (try/catch)
and, if it fails, try the alternative.  If both fail, it becomes a fetch
failure.
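
A rough sketch of that fallback idea, not tied to the real fetcher API.
FetchFailed, fetch_first_working and the fetch callable are all
hypothetical names used only for this illustration.

```python
from typing import Callable, Iterable


class FetchFailed(Exception):
    """Hypothetical error type standing in for a failed fetch."""


def fetch_first_working(candidates: Iterable[str],
                        fetch: Callable[[str], None]) -> str:
    """Try each candidate URL in order and return the first that works.

    Raises FetchFailed only after every candidate has failed.
    """
    errors = []
    for url in candidates:
        try:
            fetch(url)
            return url
        except FetchFailed as exc:
            # Remember the failure and move on to the next form.
            errors.append("%s: %s" % (url, exc))
    raise FetchFailed("all candidate URLs failed: " + "; ".join(errors))
```

The caller would pass both URL forms, for example the 'as-is' form
first and the '~' form second, plus whatever function performs the
actual fetch.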

--Mark


