From mboxrd@z Thu Jan 1 00:00:00 1970 From: Thomas De Schampheleire Date: Tue, 8 Jan 2019 18:22:47 +0100 Subject: [Buildroot] [PATCH 10/19] support: rewrite check-bin-arch in python In-Reply-To: <20190108163752.GK19623@scaer> References: <638ba4cafb8ab013365ae54a393b49f459bc6e74.1546898693.git.yann.morin.1998@free.fr> <20190108163752.GK19623@scaer> Message-ID: List-Id: MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit To: buildroot@busybox.net El mar., 8 ene. 2019 a las 17:37, Yann E. MORIN () escribi?: > > On 2019-01-08 15:56 +0100, Thomas De Schampheleire spake thusly: > > El lun., 7 ene. 2019 a las 23:05, Yann E. MORIN > > () escribi?: > [--SNIP--] > > > Rewrite that script in python. > [--SNIP--] > > > +def main(): > > > + parser = argparse.ArgumentParser() > > > + parser.add_argument('--package', '-p', metavar='PACKAGE', required=True) > > > + parser.add_argument('--pkg-list', '-l', metavar='PKG_LIST', required=True) > > > + parser.add_argument('--readelf', '-r', metavar='READELF', required=True) > > > + parser.add_argument('--arch', '-a', metavar='ARCH', required=True) > > > + parser.add_argument('--ignore', '-i', metavar='PATH', action='append') > > I think the above 'metavar' options are not necessary, as they are the > > default value. > > I'll check and drop if they indeed are not. > > > > + args = parser.parse_args() > > > + > > > + if args.ignore is not None: > > > + # Ensure we do have single '/' as separators, and that we have a > > > + # leading and a trailing one, then append to the global list. > > > + for pattern in args.ignore: > > > + IGNORES.add(re.sub('/+', '/', '/{0}/'.format(pattern))) > > Note that the global list itself does not have trailing slashes. This > > seems inconsistent to me. > > It is inconsistent, but I kept the existing behaviour intact as much as > possible, so the python script is a drop-in replacement for the shjell > script, with the same semantics. > > > > + ignores_re = set() > > > + for i in IGNORES: > > > + ignores_re.add(re.compile(i)) > > > + > > > + arch_re = re.compile('^ Machine: +(.+)') > > > + > > > + target_dir = os.environ['TARGET_DIR'] > > > + > > > + exit_code = 0 > > > + for record in parse_pkg_file_list(args.pkg_list): > > > + if record['pkg'] != args.package: > > > + continue > > > + > > > + fpath = record['file'] > > > + > > > + ignored = False > > > + for i in ignores_re: > > > + if i.match(fpath): > > > + ignored = True > > > + break > > > + if ignored: > > > + continue > > > > We can never get into this if, right?, because there is already a > > break whenever ignored is set to True. > > Yes we can, because the break only applied to the inner-most loop, which > is the one that iterates over the ignores regexps. Ok, I see. > > > Note that I think that performance will be better with a list > > comprehension instead of explicit for's. Something like (untested): > > > > valid_records = [ record for record in parse_pkg_file_list(args.pkg_list) > > if record['pkg'] == args.package > > and not any(ignore_re.match(record['file']) for ignore_re > > in ignores_re) ] > > Sorry, but this is totally illegible to me. :-D Ok, I won't argue about the fact itself. Just, as a reference, small decomposition: list = [ f(x) for x in otherlist ] is a 'list comprehension' and is a performant way to generate a list without requiring a for loop. f(x) can be any action on x, e.g. x.strip() or more complex expressions involving x. The 'for x in otherlist' can be restricted further with 'if' clauses. In the code I gave above, the base list is the return value of 'parse_pkg_file_list(..)' and it is filtered by two clauses. The first one: record['pkg'] == args.package only preserves entries in the base list that match the package we are interested in. This is the equivalent of: if record['pkg'] != args.package: continue and the second clause: not any(ignore_re.match(record['file']) for ignore_re in ignores_re) Here, any(x) is a function that takes an iterable (x) and returns True if any of the items in 'x' evaluate to True. Similar to any(x) there is also all(x) which only returns True if _all_ items in x evaluate to True. The value passed to 'any' is '(ignore_re.match(record['file']) for ignore_re in ignores_re)' which itself is a type of list comprehension, except that it actually is a generator comprehension. 'generator' is what you called an 'iterator' in another patch: a thing that 'yields' values one by one. The difference does not really matter for this discussion, so the base format is still: f(x) for x in some_iterable Here f(x) is ignore_re.match(..) and x is 'ignore_re' itself. So the following code: valid_records = [ record for record in parse_pkg_file_list(args.pkg_list) if record['pkg'] == args.package and not any(ignore_re.match(record['file']) for ignore_re in ignores_re) ] naturally reads as: valid_records is the list of records returned by parse_pkg_file_list for which the 'pkg' field is the one we want, and not any of our ignore expression matches the 'file' field of the record. I hope it makes things a bit more clear :) /Thomas