Date: Mon, 18 Sep 2017 08:48:45 +0800
From: Fengguang Wu
To: Linus Torvalds
Cc: LKML
Subject: Re: [linus:master] BUILD REGRESSION 2bd6bf03f4c1c59381d62c61d03f6cc3fe71f66e
Message-ID: <20170918004845.y67x3p3sz2bhusgv@wfg-t540p.sh.intel.com>
References: <59be0fe7.rWX1qt+B4E+6WufZ%fengguang.wu@intel.com>

Hi Linus,

On Sun, Sep 17, 2017 at 08:31:56AM -0700, Linus Torvalds wrote:
>Fengguang,
> it looks like the kernel build robot _only_ tests the actual rc
>kernels, and doesn't bisect down where the error started.

Nah, that's an illusion. :)

This is a per-branch summary report, sent _in addition to_ the
per-bisect reports. The former shows all active (not-yet-fixed)
errors/warnings in the current branch HEAD; the latter shows the
result of one bisect.

Typically, for every error message shown in this summary report, an
individual bisect report has already been sent to the relevant authors
and committers. I'll give concrete examples at the bottom.

>Any change that when it notices an error, it would bisect it, like it
>does for linux-next?

It should already do so -- otherwise it's a bug in the 0day robot.
In fact your tree _implicitly_ receives many more tests than
linux-next, and linux-next receives more tests than other individual
developer trees.

It works like this:

The robot normally tests all pushed branch HEADs of all git trees.
IOW, each git push by you (or anyone else) triggers tests -- except
when the robot occasionally cannot catch up.

The RC kernels effectively receive _many more_ tests, since developers
typically base their git branches on RC releases. So whenever they do
a git push, the tests triggered on their branch HEAD automatically
cover its base RC kernel.

Whenever an error is found in a commit (typically the branch HEAD),
the robot traverses backwards in its git history and tests these
critical points until a GOOD point is found for starting the bisect:

- the branch's BASE commit (typically an RC kernel)
- the official releases (e.g. 4.14 => 4.13 => 4.12 => ...)

We give up when the bug is found to exist in too old a kernel, since
old bugs are likely either uninteresting (no one cares to fix them) or
hard to bisect.

>On Sat, Sep 16, 2017 at 11:02 PM, kbuild test robot
> wrote:
>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
>> 2bd6bf03f4c1c59381d62c61d03f6cc3fe71f66e Linux 4.14-rc1
>>
>> arch/alpha/include/asm/mmu_context.h:160:24: error: invalid type argument of '->' (have 'int')

Error ids grouped by kconfigs:

recent_errors
├── alpha-allmodconfig
│   ├── arch-alpha-include-asm-mmu_context.h:error:implicit-declaration-of-function-task_thread_info
│   └── arch-alpha-include-asm-mmu_context.h:error:invalid-type-argument-of-(have-int-)

The bisect report was sent here:

https://lkml.org/lkml/2017/9/16/187

And a fix was freshly posted here:

https://patchwork.kernel.org/patch/9954963/

├── cris-allyesconfig
│   ├── drivers-tty-serial-8250_core.c:error:unrecognizable-insn:
│   └── drivers-tty-serial-8250_core.c:internal-compiler-error:in-extract_insn-at-recog.c

Bisected and reported here:

https://www.spinics.net/lists/linux-serial/msg27175.html

├── ia64-allmodconfig
│   ├── drivers-clocksource-timer-of.h:error:field-clkevt-has-incomplete-type
│   └── include-linux-kernel.h:error:dereferencing-pointer-to-incomplete-type-struct-clock_event_device

Reported here:

https://www.spinics.net/lists/kernel/msg2556450.html

which may be fixed by this RFC patch:

https://patchwork.kernel.org/patch/9939191/

├── ia64-allyesconfig
│   ├── drivers-clocksource-timer-of.h:error:field-clkevt-has-incomplete-type
│   └── include-linux-kernel.h:error:dereferencing-pointer-to-incomplete-type-struct-clock_event_device

Ditto.

├── mips-jmr3927_defconfig
│   ├── arch-mips-vdso-elf.S:error:march-r3900-requires-mfp32
│   ├── arch-mips-vdso-gettimeofday.c:error:march-r3900-requires-mfp32
│   ├── arch-mips-vdso-sigreturn.S:error:march-r3900-requires-mfp32
│   └── cc1:error:march-r3900-requires-mfp32

That's a rather old bug that I have given up on repeatedly reporting:

https://www.linux-mips.org/archives/linux-mips/2016-03/msg00215.html

├── parisc-allmodconfig
│   └── ERROR:__cmpxchg_u64-drivers-net-ethernet-intel-i40e-i40e.ko-undefined

Reported here:

https://lkml.org/lkml/2017/9/10/100

├── sparc64-allmodconfig
│   ├── arch-sparc-include-asm-mmu_context_64.h:error:implicit-declaration-of-function-per_cpu
│   ├── arch-sparc-include-asm-mmu_context_64.h:error:implicit-declaration-of-function-smp_processor_id
│   ├── arch-sparc-include-asm-mmu_context_64.h:error:per_cpu_secondary_mm-undeclared-(first-use-in-this-function)
│   └── arch-sparc-include-asm-mmu_context_64.h:error:unknown-type-name-per_cpu_secondary_mm

Reported here:

https://lists.01.org/pipermail/kbuild-all/2017-August/037613.html
https://lists.01.org/pipermail/kbuild-all/2017-September/037968.html

And recently fixed here:

https://patchwork.kernel.org/patch/9946375/

└── x86_64-randconfig-s4-09170918
    └── net-netfilter-nf_nat_core.c:note:in-expansion-of-macro-if

Reported here:

https://lkml.org/lkml/2017/9/16/203

As you can see, all the errors mentioned in this summary report have
been individually bisected and reported somewhere before.

Regards,
Fengguang
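
For illustration only, the search for a GOOD starting point described
above (BASE commit first, then official releases from newest to
oldest, giving up on old bugs) can be sketched roughly as follows.
This is not the 0day robot's actual code. The names find_good_point,
bisect, is_good and max_depth are all hypothetical, and the sketch
assumes a simple linear history, which real kernel trees do not have.

```python
from typing import Callable, Optional, Sequence


def find_good_point(
    base: str,
    releases: Sequence[str],
    is_good: Callable[[str], bool],
    max_depth: int = 3,
) -> Optional[str]:
    """Return the first candidate that tests GOOD, or None if we give up.

    Candidates are tried in the order the mail describes: the branch's
    BASE commit first, then official releases from newest to oldest.
    max_depth is a hypothetical cap on how far back we are willing to go.
    """
    candidates = [base, *releases]
    for commit in candidates[:max_depth]:
        if is_good(commit):
            return commit
    return None  # the bug is too old: treated as not worth bisecting


def bisect(commits: Sequence[str], is_good: Callable[[str], bool]) -> str:
    """Return the first BAD commit in a linear, oldest-to-newest history.

    Assumes commits[0] is GOOD and commits[-1] is BAD.
    """
    lo, hi = 0, len(commits) - 1  # invariant: lo is GOOD, hi is BAD
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_good(commits[mid]):
            lo = mid
        else:
            hi = mid
    return commits[hi]


# Toy linear history, oldest first; "c2" is where the (made-up) bug starts.
history = ["v4.12", "c1", "v4.13", "c2", "c3", "v4.14-rc1", "c4", "HEAD"]
first_bad = history.index("c2")


def toy_is_good(commit: str) -> bool:
    return history.index(commit) < first_bad


good = find_good_point("v4.14-rc1", ["v4.13", "v4.12"], toy_is_good)
if good is not None:
    culprit = bisect(history[history.index(good):], toy_is_good)
```

In this toy run the BASE commit v4.14-rc1 already contains the bug, so
the search falls back to v4.13, and the bisect between v4.13 and HEAD
then lands on c2.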