On Tue, Mar 5, 2019 at 10:04 PM Dan Rue wrote:

> On Tue, Mar 05, 2019 at 12:14:12PM +0000, Mark Brown wrote:
> > On Mon, Mar 04, 2019 at 01:20:25PM +0000, Guillaume Tucker wrote:
> >
> > > I'm not entirely sure how much flexibility Jenkins can offer in
> > > that respect, but at least having 3 versions of the monitor job
> > > that runs every hour, every day and every week should cover all
> > > the cases. If possible, we may be able to implement something
> > > that dynamically schedules the next check for each branch.
> >
> > One big concern I have with this is latency. One of the common cases
> > where people don't need builds all the time is when they're mainly
> > looking for checks before they submit pull requests. For those if you
> > might have to wait almost a day before we even queue the build it might
> > be a bit of an issue. If it was just a "skip this tree if we built it
> > in the past X time" check that wouldn't be such an issue, it's just if
> > the daily check runs at some fixed time each day or whatever that it
> > might add huge extra latency.
> >
> > Another idea I just thought of but I'm not sure is practical would be to
> > only check some trees if the Jenkins queue is less than some number of
> > builds - that way if we're busy we won't add extra load, but it feels
> > like it's more trouble than it's worth to implement fairly.
>

These are different use-cases, some people have said that they
wanted their branches checked every morning (linusw) or every
Monday (media). For real-time feedback, we need to do something
quite different indeed. With the ability to whitelist some
defconfigs and arches, we could have quick builds for some
maintainers' trees to optimise the turnaround time / test
coverage ratio. And yes, a mechanism to skip builds on arbitrary
criteria would seem to be quite hard to design with fair rules.
And at the end of the day, if we don't have enough build power to
cope with the load, even the best rules would just be moving the
problem somewhere else. What I think we should have is some kind
of OOM system, to track that if the queue keeps increasing from
one day to the next then some builds need to be killed, because
there really isn't any choice in that situation. We're still
running at about 75% of our capacity and builds can be cancelled
manually if anything goes very wrong, so it's not a practical
concern right now.

> I wonder what happens when something doesn't fit in a
> hourly/daily/weekly box. It could also cause a daily/weekly bottleneck
> if they're all scheduled at the same time.
>

Everything kind of fits in an hourly box, because that's a small
enough interval to start building things shortly after a new
revision was pushed. Having longer periods is useful for people
who don't want intermediate versions to go through the test
system, or don't want to set up a branch just for KernelCI.

> Perhaps each tree gets a cooling off period defined in e.g. seconds, and
> it could be defaulted to current behavior of 1 hour. If a tree is
> triggered but its cooling off period hasn't passed, the trigger is
> either ignored or deferred. This would also let us increase the
> frequency of the build trigger to something like every 5 minutes.
>

If we want that kind of speed then we should be using git hooks
to trigger builds directly rather than polling continuously.

> This way, load is evened out a bit (no spikes when 'weekly' or even
> 'hourly' runs), kernelci is more realtime (to Mark's point), and
> configuration is granular and per-tree (we can still offer standard
> cool-off periods, of course).
>

The issue here is that it would be doing the opposite of what
some people want.
For example, when a branch gets updated several times
during a day, some developers don't want the first ones to be
tested but rather let the system wait until the evening before
doing a build so they get the results the next morning (see
Linus' comment on the GPIO thread). Also, periodic checks don't
all have to be at the same time; they could be evened out within
their polling period.

> The only state that has to be tracked is the time of the last build per
> tree, though I have no idea about the ease of implementation.
>

The implementation will depend on a lot of things, i.e. the
build automation tool (Jenkins or other) and the backend /
storage server where previous builds are stored. At the moment
we're keeping a file containing the commit sha from the last time
a branch was sampled; it wouldn't be hard to add date and time
information to that, for example (or get the last modified date
of that file).

The first thing to do imo is to have the ability to specify how
often a tree should be checked in the YAML config file. We could
reuse cron's syntax for that, or have simple but rough values
like "daily" to let the implementation decide when to actually do
the checks.

As mentioned before, in some cases we could be using a hook to
trigger a build directly from the git server every time a commit
is pushed (like a real CI system). We should also have a way to
define that in the YAML config, probably by not setting any
polling interval.

Guillaume
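
To illustrate the last-build tracking and the evened-out polling
discussed above, here is a minimal sketch in Python. It is not the
actual KernelCI implementation: the JSON state file layout and the
names should_build, record_build and poll_offset are all assumptions
made up for this example, and the interval is expressed in seconds.

```python
import hashlib
import json
import time
from pathlib import Path


def should_build(state_file, head_sha, min_interval, now=None):
    """Return True if the branch head is new and the interval has elapsed.

    state_file is a hypothetical per-branch JSON file holding the sha and
    the time of the last build, extending the "file containing the commit
    sha" idea with a timestamp.
    """
    now = time.time() if now is None else now
    path = Path(state_file)
    if path.exists():
        state = json.loads(path.read_text())
        if state["sha"] == head_sha:
            return False  # nothing new since the last build
        if now - state["time"] < min_interval:
            return False  # built too recently, skip or defer
    return True


def record_build(state_file, head_sha, now=None):
    """Store the sha and the time of the build that was just queued."""
    now = time.time() if now is None else now
    Path(state_file).write_text(json.dumps({"sha": head_sha, "time": now}))


def poll_offset(tree_name, period):
    """Stable per-tree offset in [0, period), to spread periodic checks.

    Hashing the tree name keeps daily or weekly trees from all being
    checked at the same moment.
    """
    digest = hashlib.sha256(tree_name.encode()).digest()
    return int.from_bytes(digest[:4], "big") % period
```

A monitor job would call should_build() on each poll and
record_build() once the build is queued. Trees triggered by a git
hook would bypass this check entirely.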