From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail.windriver.com (mail.windriver.com [147.11.1.11]) by mail.openembedded.org (Postfix) with ESMTP id 990B87C3BE for ; Fri, 25 Jan 2019 16:53:17 +0000 (UTC) Received: from ALA-HCA.corp.ad.wrs.com ([147.11.189.40]) by mail.windriver.com (8.15.2/8.15.1) with ESMTPS id x0PGrG86008881 (version=TLSv1 cipher=AES128-SHA bits=128 verify=FAIL); Fri, 25 Jan 2019 08:53:16 -0800 (PST) Received: from soho-mhatle-m.local (172.25.36.226) by ALA-HCA.corp.ad.wrs.com (147.11.189.50) with Microsoft SMTP Server id 14.3.408.0; Fri, 25 Jan 2019 08:53:15 -0800 To: Ming Liu , Richard Purdie References: <1548413847-32550-1-git-send-email-liu.ming50@gmail.com> <18561a9b63079e5dc2d4b4a8b6b12728475b0402.camel@linuxfoundation.org> <1dbceb752ca54fcbb32c4c4325272b80670693f4.camel@linuxfoundation.org> From: Mark Hatle Openpgp: preference=signencrypt Autocrypt: addr=mark.hatle@windriver.com; prefer-encrypt=mutual; keydata= mQENBFYKxFgBCACt/pzutBp6p/xVKTFJjHbM3KpQKCblyot/YP+bpTr51Hrc5xDXBQhoG7TC aIRvRIvbhEevEQK9y04gW3JK/5lobq5ORebolcsHlYBUvpNeIPjupLQwGvz/TPtrLRNGLqDC rvsM6OA2XbQ2bwzxWaSQS3ImE2O2iXOZn9HhThMGeDB4Nff3fgUvXOTDIrgWOn9K2DgLL7Yc zkUIlFdj+Nraksd/7BSk8oH6tjeBVhFqSFvKta9QxWgdr58oPaTYaW/xNqUjlLrbJuMw/MSe xzuYfdfDfm6J8kRjMOnwQ0n8svJElzqAk+d83ow38gpGQ+LkjGgnf8ZFJ4rUJFADroX3ABEB AAG0JU1hcmsgSGF0bGUgPG1hcmsuaGF0bGVAd2luZHJpdmVyLmNvbT6JATcEEwEIACEFAlYK xFgCGwMFCwkIBwIGFQgJCgsCBBYCAwECHgECF4AACgkQfv796/r0vvlvZAf9Gs+eN320yhRW V/fZCsngKhmOK4v3HrTwFrkSmoD9QHQiE/5IPdNacHwIPwZx07tNBohB8xOeNqCPRYRBwGhA AnxKOPyd0nnm6ZhPzbA57v4x3IGRQr4QzvcBTASJq91l3Ew4lpAslyx5w1DPPqRD7G8ycDKg peKyDwmdkvCunVisSAQI3XIMq2y230biTO98tDPEezg+lg+yTsz9ZT33F5KNuWrpf8VL5fG/ mt+kAv7wtsx/KTRbqhH3iFXF6eBSwMjAfTXFlkLfbM9riJGXrWEl9n2S2R3cDHNHug0lb8f4 whK370KEO4OwRKIYW/VUBmzk5XZUE9DTlDSV8ycsrrkBDQRWCsRYAQgAwK3FuHCE+HW3YWdH PUjeSn5p//xJ57u8g2rng8zm9zNjmYgpPv5UxozaD9i2jf4mlQLHGGOezhHae8K4Nj70oVcv 8AmwcrJa9i9WL1oy/9R3fHMWf/Ctt9VXTO0qlCuq6PDzaUfvsXR61aJIjTKNQTOjCLjY1vXm VSewUgARysmA8WrjTfwGBihMBxAX0+kIjx8nOlam0WvekMBXZ0AbS56oTLRxYao6DI3GeB/N oWPy/5DfuTKaSdM0Pf8al20x9RuNN5/HLMlyDH/k8bIa1xd9aAqW+Feiw5gC107V2E6ULyIy q6em2UrsmIRxrvpHqbNgQKqvTehJ+V/i4g/uOwARAQABiQEfBBgBCAAJBQJWCsRYAhsMAAoJ EH7+/ev69L755XAH/3ZcNhooqd9OBhFkvXm1iWZ8EoC7motWqVn2oEyxoonsg8AD9kFXiN+T dYp7dH99EZu9q4ptj56AXm4uHzOgywL/5/V2TY6twCGAjUGzDjAB5gzoi+JLIBlDiyOip0eL QswIhRk473xy3j8DA4oVamnSPWgyNJ+qsdt37YWDzoDFFvtDoRU7Eb+znfIMDKzlny0XU/8L cW1bNHJlpv/78GPdfP4tjysEd8MuA5jf5o5w4XqcwTqalffEJtQ/s3pbkstEi7qm5uPui5Kt gq6YYLSqcSNe0GWAF9/T+qwyo7burSTxUWCWtMmlXdAQLW9SynLhB3Jbch0nFAh0fCKi6yY= Organization: Wind River Systems Message-ID: Date: Fri, 25 Jan 2019 10:53:14 -0600 User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.14; rv:60.0) Gecko/20100101 Thunderbird/60.4.0 MIME-Version: 1.0 In-Reply-To: Cc: bitbake-devel@lists.openembedded.org, Stefan Agner , Luka Pivk Subject: Re: [PATCH] fetch2/gitsm: fix config file concurrent update race X-BeenThere: bitbake-devel@lists.openembedded.org X-Mailman-Version: 2.1.12 Precedence: list List-Id: Patches and discussion that advance bitbake development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 25 Jan 2019 16:53:17 -0000 Content-Type: text/plain; charset="utf-8" Content-Language: en-US Content-Transfer-Encoding: 8bit On 1/25/19 7:44 AM, Ming Liu wrote: > Hi, Richard: > > Involve in Mark. > > Please correct me if I am wrong, but according to the code, seems these upper > level locks could not protect this race condition, because it's inside the GitSM > unpack function, the call chain is: original code, where the error occurred happened during -fetch-. Somehow, two fetches happened at the same time and tried to modify a shared download location. It was determined, the most likely cause of this was from the previous 'need_update' function, which may or may not have been locking things properly. (That was never determined). The patch below removes that function as no longer necessary. http://git.openembedded.org/bitbake/commit/?id=346338667edca1f58ace769ad417548da2b8d981 Further... all other config calls that happened during fetch were removed in: http://git.openembedded.org/bitbake/commit/?id=610dbee5634677f5055e2b36a3043cd197fb8c51 The config call with a target path of ud.clonedir, specifically: - # Correct the submodule references to the local download version... - runfetchcmd("%(basecmd)s config submodule.%(module)s.url %(url)s" % {'basecmd': ud.basecmd, 'module': module, 'url' : local_paths[module]}, d, workdir=ud.clonedir) were removed.. ensuring that no git config calls during fetch, targeting the clone directory [download dir] can occur. (All other actions happen based on the regular git fetcher invocations.) Thus there are no longer calls to git config during the fetch operation to a shared location. This leaves unpack as the only place that git config is invoked... > Inside GitSM: > ``` > def unpack >   -> for module in submodules: >        -> unpack_submodules  >             -> runfetchcmd >                  -> bb.process.run >                       -> git config submodule > ``` > > since 'git config submodule' run concurrently and try to modify a same git > config file, hence the race might happen, but it's hard to be reproduced since > 'git config submodule' finishes quite quickly. git config runs only under the unpack operations, with a working directory located within the WORKDIR of the recipe. The only way to end up with two processes calling git config, in the same workdir, at the same time is if two unpack tasks are running against the same WORKDIR. This itself is an error, and must never happened for a functioning system. (The only time this -could- happen is if you have a shared workdir, but the locking on the shared workdir itself will prevent this operation. If you recipe is lacking the appropriate locking there it is a bug in the recipe implementation.) --Mark > //Ming Liu > > Richard Purdie > 於 2019年1月25日 週五 下午1:24寫道: > > On Fri, 2019-01-25 at 12:35 +0100, Ming Liu wrote: > > Hi, Stefan: > > > > Could you help confirm this? > > > > Hi, Richard: > > > > May I know how has the issue been fixed? > > > > I updated the tip to: > > ``` > > commit 8c8ecec2a722bc2885e2648d41ac8df07bdf660d > >     gitsmy.py: Fix unpack of submodules of submodules > >      > >     If the submodule is in a subdirectory, it needs to have that > > structure > >     preserved.  This means the unpack path needs to be in the > > 'dirname' of the > >     final path -- since the unpack directory name is specified in the > > URI. > >      > >     Additional specific test cases were added to ensure this is > > working properly > >     based on two recent error reports. > >      > >     Signed-off-by: Mark Hatle > > >     Signed-off-by: Richard Purdie > > > > > ``` > > on top of that commit, I could see there are "git config submodule" > > being called concurrently in different processes, which does have a > > race condition, and will lead to the error observed by Stefan, > > although it's not easy to be reproduced. > > There are locks at a higher level in the fetchers, in > lib/bb/fetch2/__init__.py the download() method has: > > if ud.lockfile: >     lf = bb.utils.lockfile(ud.lockfile) > > and the unpack() method has similar locks. > > Mark and I believe there were some code paths where the locks were > being bypassed in the gitsm fetcher but Mark reworked the code so that > we were both happy that there should always be a lock held. > > We need to fix any races properly ensuring they're covered by the main > fetcher locks, adding new ones would only fix a symptom of the real > problem. > > Cheers, > > Richard > > >