From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail1.windriver.com (mail1.windriver.com [147.11.146.13]) by mail.openembedded.org (Postfix) with ESMTP id AF9E16A441 for ; Mon, 19 Aug 2019 08:33:01 +0000 (UTC)
Received: from ALA-HCB.corp.ad.wrs.com ([147.11.189.41]) by mail1.windriver.com (8.15.2/8.15.1) with ESMTPS id x7J8WxvM012442 (version=TLSv1 cipher=AES128-SHA bits=128 verify=FAIL); Mon, 19 Aug 2019 01:32:59 -0700 (PDT)
Received: from localhost.localdomain (128.224.162.182) by ALA-HCB.corp.ad.wrs.com (147.11.189.41) with Microsoft SMTP Server id 14.3.468.0; Mon, 19 Aug 2019 01:32:58 -0700
To: ,
References: <850ae48669455c75cf34b2306f01add428aa62c0.camel@linuxfoundation.org> <3fb3e0d9-098f-7fdc-5c3c-9501dfe98af4@windriver.com> <352084d7f3f933b692ad60aa8ce50dee9b05c80d.camel@linuxfoundation.org>
From: Robert Yang
Message-ID:
Date: Mon, 19 Aug 2019 16:34:36 +0800
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.8.0
MIME-Version: 1.0
In-Reply-To: <352084d7f3f933b692ad60aa8ce50dee9b05c80d.camel@linuxfoundation.org>
Subject: Re: [PATCH 1/1] bitbake: cookerdata: Avoid double exceptions for bb.fatal()
X-BeenThere: bitbake-devel@lists.openembedded.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id: Patches and discussion that advance bitbake development
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Mon, 19 Aug 2019 08:33:01 -0000
Content-Type: text/plain; charset="utf-8"; format=flowed
Content-Language: en-US
Content-Transfer-Encoding: 7bit

On 8/16/19 7:03 AM, richard.purdie@linuxfoundation.org wrote:
> On Thu, 2019-08-15 at 19:29 +0800, Robert Yang wrote:
>>
>> On 5/14/19 7:02 PM, Robert Yang wrote:
>>>
>>> On 5/12/19 4:28 PM, Richard Purdie wrote:
>>>> On Thu, 2019-05-09 at 16:03 +0800, Robert Yang wrote:
>>>>> The bb.fatal() raises BBHandledException() which causes double
>>>>> exceptions, e.g.:
>>>>>
>>>>> Add 'HOSTTOOLS += "hello"' to conf/local.conf:
>>>>> $ bitbake -p
>>>>> [snip]
>>>>> During handling of the above exception, another exception occurred:
>>>>> [snip]
>>>>> ERROR: The following required tools (as specified by HOSTTOOLS) appear to be
>>>>> unavailable in PATH, please install them in order to proceed:
>>>>> hello
>>>>>
>>>>> Use "raise" rather than "raise bb.BBHandledException" to fix the double
>>>>> exceptions.
>>>>>
>>>>> [YOCTO #13267]
>>>>>
>>>>> Signed-off-by: Robert Yang
>>>>> ---
>>>>>  bitbake/lib/bb/cookerdata.py | 4 +++-
>>>>>  1 file changed, 3 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/bitbake/lib/bb/cookerdata.py b/bitbake/lib/bb/cookerdata.py
>>>>> index f8ae410..585edc5 100644
>>>>> --- a/bitbake/lib/bb/cookerdata.py
>>>>> +++ b/bitbake/lib/bb/cookerdata.py
>>>>> @@ -301,7 +301,9 @@ class CookerDataBuilder(object):
>>>>>              if multiconfig:
>>>>>                  bb.event.fire(bb.event.MultiConfigParsed(self.mcdata), self.data)
>>>>> -        except (SyntaxError, bb.BBHandledException):
>>>>> +        except bb.BBHandledException:
>>>>> +            raise
>>>>> +        except SyntaxError:
>>>>>              raise bb.BBHandledException
>>>>>          except bb.data_smart.ExpansionError as e:
>>>>>              logger.error(str(e))
>>>>
>>>> Hi Robert,
>>>>
>>>> This doesn't sound right, where is this exception being printed a
>>>> second time? The point of "BBHandledException" is to say "don't print
>>>> any further traces for this exception". If this fixes the bug, it means
>>>> something somewhere is printing a trace for a second time when it
>>>> should pass through BBHandledException?
>>>
>>> Hi RP,
>>>
>>> I found another serious problem when tried to raise
>>> BBHandledException.
>>> There is a deadlock when a recipe is failed to parse, e.g.:
>>>
>>> $ echo helloworld >> meta/recipes-extended/bash/bash_4.4.18.bb
>>> $ bitbake -p
>>> meta/recipes-extended/bash/bash_4.4.18.bb:42: unparsed line: 'helloworld'
>>> [hangs]
>>>
>>> Then bitbake hangs, we can use Ctrl-C to break it, but the sub processes
>>> are still existed, and we need kill them manually, otherwise we can't start
>>> bitbake again.
>>
>> BTW, things becomes much better after the following two patches are merged:
>> bitbake: knotty: Fix for the Second Keyboard Interrupt
>> bitbake: cooker: Cleanup the queue before call process.join()
>>
>> Now we hardly can reproduce the problem:
>> echo helloworld >> meta/recipes-extended/bash/bash_4.4.18.bb
>> $ while true; do kill-bb; rm -fr bitbake-cookerdaemon.log tmp/cache/default-glibc/qemux86-64/x86_64/bb_cache.dat* ; bitbake -p; done
>>
>> It's not easy to hang any more, but still hangs sometimes, I tried to debug it,
>> but didn't find the root cause, the ui/knotty.py can't get event from server,
>> and goes into a dead loop.
>>
>>     event = eventHandler.waitEvent(0)
>>     if event is None:
>>         if main.shutdown > 1:
>>             break
>>         termfilter.updateFooter()
>>         event = eventHandler.waitEvent(0.25)
>>         if event is None:
>>             continue
>>
>> The main.shutdown is always 0 when it hangs.
>
> In theory there are timeouts there so it should never hang waiting for
> an event. Is it looping and not getting an event? or is the other end
> disconnected?
>
> I guess the question is what we can do to detect a dead connection, or
> if the server is still alive, why the server is hanging and not sending
> any events?

After more investigation, I found that it can hang in two places. Both are
rare, but they do happen; I can reproduce a hang within 10 minutes using
the following command:

$ while true; do kill-bb; rm -fr bitbake-cookerdaemon.log tmp/cache/default-glibc/qemux86-64/x86_64/bb_cache.dat* ; bitbake -p; done

* Hang #1 in cooker.py:

        # Cleanup the queue before call process.join(), otherwise there might be
        # deadlocks.
        while True:
            try:
                self.result_queue.get(timeout=0.25)
            except queue.Empty:
                break

It hangs at self.result_queue.get(timeout=0.25); the timeout doesn't take
effect here. I tried python 3.5.2 and 3.6.7; the latter is a little better
but still has the problem. I think it's a bug in python3's multiprocessing,
and that we can call self.result_queue.cancel_join_thread() to fix the
problem.

* Hang #2 in cooker.py:

        for process in self.processes:
            if force:
                process.join(.1)
                process.terminate()
            else:
                process.join()

It hangs at process.join(). I added debug code there, and the cause is that
the process is still alive when join() is called on it. I think we can use
a while loop to check whether the process is still alive before join(), and
force the join() after a number of tries.

I will do more testing before sending the patches, to make sure it doesn't
hang over several hours of running.

// Robert

>
> Thanks for the patches, those were tricky issues to track down and
> solve!
>
> Cheers,
>
> Richard
>
>
>
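
For the second hang (the bare process.join() in cooker.py), the
"check whether it is alive, retry, then force" idea could look roughly like
the sketch below. This is only an illustration, not the patch itself: the
helper name join_with_retries and its tries/interval parameters are made up
here, and the values are arbitrary.

```python
def join_with_retries(process, tries=20, interval=0.1):
    """Join a process-like object with bounded waits, forcing it if needed.

    Hypothetical helper (not part of bitbake): `process` only needs
    join(timeout), is_alive() and terminate(), all of which
    multiprocessing.Process provides.
    Returns True if the process exited on its own, False if it was terminated.
    """
    for _ in range(tries):
        # Bounded wait, so one stuck child cannot block shutdown forever.
        process.join(interval)
        if not process.is_alive():
            return True
    # Still alive after all tries: force it, then reap with a bounded join.
    process.terminate()
    process.join(interval)
    return False
```

In the non-force branch of the loop over self.processes, a call such as
join_with_retries(process) would then take the place of process.join().
Whether 20 tries of 0.1s is a sensible budget would need testing.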