From: Trevor Woerner @ 2021-09-04  1:16 UTC
  To: yocto

Yocto Technical Team Minutes, Engineering Sync, for August 24, 2021
archive: https://docs.google.com/document/d/1ly8nyhO14kDNnFcW2QskANXW3ZT7QwKC5wWVDg9dDH4/edit

== disclaimer ==
Best efforts are made to ensure the below is accurate and valid. However,
errors sometimes happen. If any errors or omissions are found, please feel
free to reply to this email with any corrections.

== attendees ==
Trevor Woerner, Stephen Jolley, Peter Kjellerstedt, Randy MacLeod, Armin
Kuster, Jan-Simon Möller, Joshua Watt, Richard Elberger, Scott Murray,
Steve Sakoman, Richard Purdie, Saul Wold, Tim Orling, Alejandro Hernandez,
Bruce Ashfield, Denys Dmytriyenko, Jon Mason, Ross Burton, Trevor Gamblin

== project status ==
- now in feature freeze for 3.4 (honister)
- read-only prserv and switch to asyncio merged
- rust merge is problematic (issues with uninative), will need to be fixed in
    next day or two to make it into 3.4
- glibc 2.34 causes significant issues for pseudo, this will get worse as more
    host distros upgrade
- tune file refactorization merged
- still hoping to get some sbom stuff into 3.4

== discussion ==
RP: we’re now at feature freeze

RP: the asyncio stuff is finally working, thanks Scott

RP: the news isn’t so good with rust - there’s some weird uninative issue
    (something to do with the linker relocations that we do). we were seeing
    issues on debian 8, but it looks like we can reproduce that issue by
    using the buildtools’ extended tarball as the compiler, which also
    provides its own libc, which then seems to cause the problems. i could
    get rid of the relocations that uninative causes, but at a cost of it not
    working with the eSDK, but i decided to ignore that. but even if we do
    that there’s always the relocation issue with the buildtools tarball
    which we can’t avoid. for a while i could reproduce reliably, but then
    it stopped and i can’t reproduce anymore
Randy: i tried reproducing but couldn't. my impression is that the rust
    community is happy with meta-rust and use it for specific use-cases but
    they don’t go beyond that very much (and therefore aren’t seeing
    issues). even if we fixed the things you call blockers, i’d still call
    it beta quality for oe-core if we merge it. do you want to merge it now
    (as beta quality) or wait for the next window?
RP: there’s no winning scenario. if we merge it then i’m signing myself
    up to maintain and fix it (esp before release). on the other hand if we
    push it out then we’ll be in feature freeze and nobody will pay any
    attention to it until later, then other things will bump its priority
    down. i can see that there are some open issues dating back to 2016, that
    obviously nobody cares much about, so pushing it out isn’t going to
    change anything.
Randy: not having rust in is holding back a bunch of things, but i,
    relatively, don’t know rust very well and without the rust community’s
    help i don’t know how to move this forward. ideally someone with rust
    experience could step up; maybe ARM?
Ross: we’d like to see it in core, we’re using it but with meta-rust so
    we’re happy with it so far. my preference would be to hone it and push
    it early in the next release cycle
Randy: schedules are dancing around, so we’ll try to get things moving along

RP: the pseudo glibc problem has me scared. any distro that upgrades to
    glibc 2.34 (natively) will break. we have a ticking timebomb, and it was
    discovered by our toolchain testing (thanks Ross)
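
a minimal sketch, in python, of checking whether a build host is already on
the affected glibc. the helper names (parse_version, libpthread_merged,
host_glibc_version) are made up for illustration and are not part of pseudo,
uninative or bitbake; the 2.34 threshold is the one from the discussion.

```python
import platform


def parse_version(text):
    """Turn a dotted version such as '2.34' into a comparable tuple (2, 34)."""
    return tuple(int(part) for part in text.split(".") if part.isdigit())


def libpthread_merged(glibc_version):
    """True for glibc 2.34 and later, where libpthread and libdl live in libc.so."""
    return parse_version(glibc_version) >= (2, 34)


def host_glibc_version():
    """Return the host glibc version string, or None if the host libc is not glibc."""
    name, version = platform.libc_ver()
    return version if name == "glibc" and version else None


if __name__ == "__main__":
    version = host_glibc_version()
    if version is None:
        print("host libc is not glibc (or could not be detected)")
    elif libpthread_merged(version):
        print(f"glibc {version}: pthread/dl symbols are part of libc.so")
    else:
        print(f"glibc {version}: separate libpthread/libdl")
```
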
RP: we make interesting assumptions with uninative and pseudo. we end up with
    host tools that are linked, potentially, against a newer glibc, therefore
    pseudo has to run as an LD_PRELOAD against multiple libc versions, so if
    it links against a newer one but then has to run against an older one it
    breaks with symbol location problems. we’ve had these issues before, and
    we’ve implemented various fixes. libpseudo only links against libdl and
    libpthread and we can’t get rid of those things (libdl because that’s
    how it works (fundamentally loading libraries dynamically), and threads
    because of the mutex that we use for locking). the release can’t go out
    if, when people upgrade their host systems, it’s going to break; badly.
    we’ve tried every technique that we’ve tried before and then some. in
    2.34 all the symbols are merged back into the main library, so there are
    no libpthread symbols, it’s all part of libc.so. in the past we’ve
    been able to link against uninative 2.33 (libdl and libpthread) and then
    link pseudo-native against those binaries. thereby force-linking against
    older versions using the newer glibc headers (which is horrible). what
    worries me is i’m basically the only one paying attention; i don’t
    even have anyone to bounce ideas off of or talk to about it. so we have a
    solution, it is horrendous, but it’s the only thing we’ve got right
    now. so if there’s anyone who knows about weak linking or strong linking
    or mutex locks without pthreads i’d like to talk to them.
JPEW: would you be opposed to making the direct kernel call to do the locking?
    that would bypass pthreads
RP: i’m not averse to it, you mean the futex calls?
JPEW: yes
RP: i’m not opposed, but i don’t think it’s as simple as making direct
    calls to the kernel. i read up on it but decided implementing our own
    locks wasn’t quite the direction i wanted to take. the number of ways to
    get this wrong is… interesting. 
JPEW: i know the futex call does a million things, and that’s one of the
    problems with it. i wonder if it would be possible to look at the pthreads
    mutex code and copy the parts that deal with futex?
RP: i did think of doing that; just distilling the pthreads code into what
    we need. we just need a very simple lock so it might be possible. may be
    something we need to look at
PeterK: wouldn’t you still need to link against libdl
RP: yes, but the scary stuff that goes on is in pthreads (headers and
    declarations). the libdl stuff is 3 function calls that are plain; no
    dependencies, no crazy symbols, etc. long term, ideally, we’d get rid of
    the libpthread dependency, then libdl should be comparatively simpler
TrevorW: i could take a stab at it, i’ve done dynamic library things before:
    loading a library, looking for a symbol, doing one thing or another based
    on whether it’s found
RP: it’s more complicated than that. what they’ve done in libc is
    there’s now a libdl with weak globbing symbols that redirect the
    previous symbols back to libc, so you only get a libc linkage. i haven’t
    worked out how you’d force it to link to the libdl (which you have to
    do if you run against an older binary). specifying versions is one thing
    (easy to do), specifying the library… there’s no way to specify
    the library, it’s hard-coded at link time… as far as i can tell.
    the other viable solution (instead of the current one which is to use
    an older libc and force the link) my other plan was to create a dummy
    binary to link against that would put the symbols in the right place. so
    we could just take the linker and generate a specially-crafted binary,
    and then use it in the linking process to force libpseudo to look in the
    correct form. however i realized that it was probably easier just for
    testing purposes to download the glibc 2.33 binaries, rather than try to
    create a specially-crafted one. so another thing to potentially look at
    (besides those pre-built 2.33) would be a binary that would do the right
    things. then we could do it as part of the build process. so that could be
    something to look at
TrevorW: my first step would be to reduce the problem to a simple test case
RP: generating a simple test case isn’t so much the issue, it’s
    the fact it only breaks when you have a build within a build.
    but creating a test case would be easy. there is a bugzilla:
    https://bugzilla.yoctoproject.org/show_bug.cgi?id=14521 longer term,
    getting rid of the pthread dependency would be helpful then the libdl
    thing would be relatively simple.
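
the basic mechanics TrevorW describes (load a library, look for a symbol,
branch on whether it is found) can be sketched in python with ctypes. this
assumes a linux host, and load_library/has_symbol are hypothetical helper
names. it only shows the run-time lookup; it does not address the link-time
library binding that RP points out is the hard part.

```python
import ctypes


def load_library(name=None):
    """dlopen() a library by name; None means the running process itself."""
    return ctypes.CDLL(name)


def has_symbol(library, symbol):
    """Report whether `symbol` can be resolved through the loaded `library`."""
    try:
        getattr(library, symbol)  # ctypes performs the dlsym() lookup here
    except AttributeError:
        return False
    return True


if __name__ == "__main__":
    process = load_library()
    for symbol in ("getpid", "pthread_mutex_lock", "dlsym"):
        state = "found" if has_symbol(process, symbol) else "not found"
        print(f"{symbol}: {state}")
```

passing a specific file name such as "libc.so.6" to load_library limits the
lookup to that library and its dependencies, which is one way to see which
library actually provides a given symbol on a particular host.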

RP: i’ve talked about the things i know about which are gating m3, is there
    anything i don’t know about or haven’t mentioned
JPEW: sbom stuff. it’s pretty hands-off, the only thing it touches that
    might affect anyone is the package data extended.
RP: we should try that
JPEW: is everyone okay with it (Saul and Ross)? has anyone had a look at it? is
    it ready to go in? (i know there still are things to add)
Ross: looks good to me, the only thing i would mention is the path that’s
    used, but there’s a fix for that
JPEW: yep
Ross: i haven’t run selftest myself, but i don’t think there’s any
    massive problem with what’s there now
Saul: i agree with Ross, there is one thing, but we can work around it, so
    i’m okay with it going in
RP: Anuj and myself have started and killed loads so quickly recently that the
    AB is keeling over because it can’t delete things fast enough, so it’s
    running out of space
SS: i think i’ve been contributing to that as well this week
JPEW: is it a matter of “rm -fr” being slow
RP: actually when we delete we actually move stuff to a junk area then do the
    actual deletion at idle, but there hasn’t been enough idle lately, so
    it’s running out of disk space
Randy: is this something that TrevorG should look at? i.e. I/O load too high
    meaning builds won’t take place
RP: not sure how we’d go about solving it
TrevorG: i could look at it once i’m done my current stuff
RP: maybe adding a task that runs early in a build that would block the start
    of new builds until a certain amount of resources are available
TrevorG: sounds good
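
a rough sketch, in python, of the two ideas in this exchange: moving things to
a junk area and deleting them later, and holding a new build until enough disk
space is free. this is not the autobuilder's actual code; the function names,
the polling interval and the byte threshold are assumptions made for
illustration.

```python
import os
import shutil
import tempfile
import time
import uuid


def move_to_junk(path, junk_dir):
    """Rename `path` into `junk_dir` (same filesystem) and return the new path."""
    os.makedirs(junk_dir, exist_ok=True)
    base = os.path.basename(os.path.normpath(path))
    target = os.path.join(junk_dir, f"{uuid.uuid4().hex}-{base}")
    os.rename(path, target)  # cheap: nothing is copied or deleted yet
    return target


def purge_junk(junk_dir):
    """Do the real deletion (meant for idle time); return entries removed."""
    removed = 0
    for entry in os.listdir(junk_dir):
        full = os.path.join(junk_dir, entry)
        if os.path.isdir(full) and not os.path.islink(full):
            shutil.rmtree(full)
        else:
            os.remove(full)
        removed += 1
    return removed


def wait_for_free_space(path, min_free_bytes, poll_seconds=30.0, timeout=None):
    """Block until `path` has `min_free_bytes` free; return False on timeout."""
    deadline = None if timeout is None else time.monotonic() + timeout
    while shutil.disk_usage(path).free < min_free_bytes:
        if deadline is not None and time.monotonic() >= deadline:
            return False
        time.sleep(poll_seconds)
    return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        build = os.path.join(root, "build-1234")
        os.makedirs(os.path.join(build, "tmp"))
        junk = os.path.join(root, "junk")
        move_to_junk(build, junk)
        print("entries purged:", purge_junk(junk))
        print("enough space:", wait_for_free_space(root, 1, timeout=0))
```

a gate like wait_for_free_space would run as the early task RP suggests, so
that a burst of cancelled builds cannot start new work faster than the junk
area is emptied.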

JonM: with the last mesa update (2 days ago), anything that doesn’t have
    hard float on arm won’t compile. i don’t know if we’re going to need
    to have that as a requirement. it tries to do neon regardless of anything
    else
RP: is it something they did intentionally, or by mistake?
JonM: according to the mesa build logs, they were trying to speed up their
    build times by using neon instructions. this isn’t a problem even if
    you have semi-modern arm hardware. anything with cortex is going to
    have hard float but we’re blowing up on the armv5 stuff because it’s
    ancient and we’re intentionally using it for the soft-float
RP: is there something in mesa that we can configure to disable this
JonM: don’t think so, it looks like it’s just checking for arm and then
    going ahead and doing it
RP: maybe Ross has a friend or two who we could ask. perhaps ask upstream why
    the change was and if we couldn’t at least configure it
TrevorW: curiously enough i do know of at least 1 armv5 soc that does have
    hard float (or vfp at least) because it is optional. but the vast majority
    of them don’t do hard float. i’m wondering about the pi 0’s and the
    pi 1’s, i believe those are armv6.
JonM: the qemu that we’re using has hard float natively
TrevorW: so you’re saying the pi’s shouldn’t be affected?
JonM: probably not. although it would be affected if you had one of those but
    purposefully disabled hard float. you could configure yourself into a hole
RP: we should figure out if they did this intentionally or not, because it is
    easy to do things like this unintentionally

RP: the tune updates seemed to have gone well
TrevorW: speaking of tunes i did run into one that doesn’t seem happy
    (mips32r2el-24kc)
JonM: i could take a look at it

RP: speaking of older platforms, we’re seeing an issue with serial port
    emulation on qemuppc that is causing lots of problems. paulg is looking
    at it. hopefully we can get a band-aid that will keep the AB happy. i do
    wonder how many people are using ppc, but every time i try to remove it i
    get lots of pushback. it does show the project is multiplatform

PeterK: i did the conversion to the new override syntax the other day, we
    now have a brand new syntax that is used for real overrides and wannabe
    overrides (e.g. FILES:${PN} and RDEPENDS:). these look like real overrides
    but they aren’t. ${PN} has to be first, but with real overrides the
    order doesn't matter. also you can’t say the :append has to be first 
    because it has to come after the override-wannabe
RP: i can see what you’re saying because the ordering is important. it’s
    not fair to call them wannabe overrides because the code does treat them
    as overrides
PeterK: but they, technically, don’t use the override mechanism, so you
    can’t change the order of them
RP: you can, it’s just that they get appended to the overrides variable in
    a limited context. e.g. when it’s writing the pn-package it will have
    ${PN} in overrides, when it’s writing the pn-debug it will put the ${PN}
    in overrides. so they are used as overrides.
PeterK: yea, but there are a lot of places where you do things like
    getvar_foo:${PN} to get these variables
RP: right. it is a compromise. going forward into the future when you do a
    getvar_files:${PN}, behind the scenes we put ${PN} in overrides then
    fetch that variable. in the future we can get creative and use this
    more effectively. i can’t promise you what the future will look like,
    but, code-wise, we had painted ourselves into a corner and we had to do
    something. so i don’t think they should be considered wannabe-overrides,
    they are overrides they’re just used in a slightly different context
    to say “machine override”. i know what you mean about the :append
    being a little bit tricky because i have seen a couple cases where some
    code was using the alternative format you alluded to which doesn’t
    quite make sense, but sorta does. the nice thing is we can at least now
    detect this which gives us more options going forward. this opens up the
    possibility to be more creative in the future, but it’s not like i have
    a concrete plan yet going forward. in my spare time i have been looking
    into the bitbake code, there’s a huge override data variable bitbake
    uses globally and it was hard to tell what was a variable and what was an
    override (e.g. SRC_URI). so we can move things from global scope to local
    scope which will give us a cleaner syntax and make things faster. as a
    worst case, even if there were no parsing advantages, it would at least
    make the syntax cleaner, which i think is a huge win.
TrevorW: any plans to do the corner cases, e.g. layer.conf. these might not be
    overrides in the code, technically, but conceptually they are overrides
RP: i do have a branch where i played with this (making layer.conf variables
    overrides). there are some interesting side effects. yes, they do look
    like overrides, but they aren’t ever used as overrides, which is why
    they weren’t converted, and there would be problems if some of them were
    converted because of the way they get used. the nice thing is that it is
    a very specific namespace. the : change was huge and global, but this is
    localized so it might not be too bad. maybe in the next release. there
    are things to do with collections and things that perhaps could go away.
    nobody today knows what a collection is, it’s only something you’d
    know if you used bitbake 12 years ago.
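
a toy model, in python, of RP's description of per-package variables: the
package name is added to the active overrides in a limited context and the
variable is then fetched as usual. this is not bitbake's implementation;
get_var, get_package_var and the dict-based datastore are hypothetical, the
right-most-wins priority is an assumption of the toy, and :append handling is
left out entirely.

```python
def get_var(datastore, name, overrides=()):
    """Return `name`, preferring the value for the right-most active override."""
    value = datastore.get(name)
    for override in overrides:  # later entries take priority in this toy
        key = f"{name}:{override}"
        if key in datastore:
            value = datastore[key]
    return value


def get_package_var(datastore, name, package, overrides=()):
    """Fetch a per-package variable by treating the package name as an override."""
    return get_var(datastore, name, tuple(overrides) + (package,))


if __name__ == "__main__":
    datastore = {
        "FILES": "",
        "FILES:mypkg": "/usr/bin/tool",
        "FILES:mypkg-dbg": "/usr/bin/.debug",
    }
    for package in ("mypkg", "mypkg-dbg", "mypkg-doc"):
        print(package, "->", repr(get_package_var(datastore, "FILES", package)))
```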

SteveS: there are some updates we’re still waiting for on the AB restart
RP: there are patches to swatbot that need to be applied as well. remind
    MichaelH. it’ll happen as part of the regular maintenance

JPEW: did you want to enable spdx output on the AB?
RP: we should at least have some tests for it
JPEW: there are a couple knobs to balance the time it takes to generate it vs
    the amount of stuff you’re generating
RP: we should at least have something somewhere exercising those
TimO: recipetool/devtool don’t know about the spdx license identifier so
    they failed to pick up the right license for a couple things i was looking
    at recently
RP: please open a bug

TimO: OEHH tomorrow!

RP: there’s a patch on the list, involving changes to glibc testing that
    concerns me. there has always been a dilemma regarding glibc’s testing:
    whether to include as a ptest or run with its own test runner? in other
    words, run it as a special case. and we already have a handful of special
    cases: binutils, gcc, and glibc. they’re big and unwieldy and aren’t
    easy to turn into ptests therefore we did run them standalone. the patch
    enables turning it into a ptest. so we now have the options of running
    it under system emulation using NFS, user-mode emulation, or using
    ptests. i’m worried it enables too many options where we have too many
    half-working solutions.
Randy: for people who are concerned about the integrity of the toolchains it
    sounds like a good idea; more options sounds good
RP: options are good to a point, but if you have two things doing,
    effectively, the same thing, then that can be problematic
Randy: is there a way to run the current glibc tests on a target?
RP: yes, not easy to setup, but can be done (give it an IP address etc)
Randy: maybe give it to the doc person (MO)
RP: there might be other higher priority things for docs right now
TrevorW: are the 2 sets of tests orthogonal?
RP: exact same tests, just run different ways

Denys: OEHH tomorrow, Asia-Pacific, 9pm UTC

Randy: do we have a test suite for self-hosted builds?
RP: Ross’s tests for buildtools is close to that
Randy: how do i find that? do you have a keyword?
RP: the way you would run it is: bitbake buildtools-extended-tarball -c
    testsdk
Ross: it only builds libc, as it depends on how much of the build works
RP: it’s the closest thing we have, it could be easily extended

Denys: nomination period for OE TSC ends today

TrevorW: Joshua: was your video posted?
JPEW: not yet, i think it should be soon
Ross: what i read said it should be soon
RP: something else i heard today says it would be soon, if not today. it was a
    good presentation, thanks Joshua
