* Regression on next-20170203 spi/for-next 3f87493930a0f qemu on x86_64
@ 2017-02-04 20:05 Luis R. Rodriguez
  2017-02-04 23:17 ` Stephen Rothwell
  2017-02-05  1:22 ` Guenter Roeck
  0 siblings, 2 replies; 9+ messages in thread
From: Luis R. Rodriguez @ 2017-02-04 20:05 UTC
  To: Stephen Rothwell, Mark Brown
  Cc: Fengguang Wu, Guenter Roeck, linux-kernel, X86 ML,
	Andy Lutomirski, Borislav Petkov

I could not boot next-20170203 on my x86_64 qemu instance. It stalls at:

[    0.015549] CPU: Physical Processor ID: 0
[    0.015842] mce: CPU supports 10 MCE banks
[    0.016032] Last level iTLB entries: 4KB 0, 2MB 0, 4MB 0
[    0.016393] Last level dTLB entries: 4KB 0, 2MB 0, 4MB 0, 1GB 0
[    0.016871] Freeing SMP alternatives memory: 24K
[    0.017710] ftrace: allocating 25888 entries in 102 pages
[    0.024102] smpboot: Max logical packages: 4
[    0.024524] x2apic enabled
[    0.024851] Switched APIC routing to physical x2apic.
[    0.025755] ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1

The output up to this point is no different from a functional boot; it
just stalls. I see that next-20170203 booted and worked on other qemu
instances elsewhere, though, so it seems to be something specific to my
configuration and boot. I bisected next-20170203 between its latest
commit and v4.10-rc6 and ended up with this bad commit:

104a519fe1732b4e503ebc7b4ac71b6f0b8a0b62

$ git show 104a519fe1732b4e503ebc7b4ac71b6f0b8a0b62
commit 104a519fe1732b4e503ebc7b4ac71b6f0b8a0b62
Merge: 7c3b1edeee66 3f87493930a0
Author: Stephen Rothwell <sfr@canb.auug.org.au>
Date:   Fri Feb 3 12:30:38 2017 +1100

    Merge remote-tracking branch 'spi/for-next'

I have checked Next/SHA1s and it shows:

mcgrof@piggy ~/linux-next (git::original)$ grep spi Next/SHA1s
spi-nor        dc12bcccadafb5441170e6b7c8a438c91d4f385b
spi        3f87493930a0f934549b04e100ecc2110e4f1efd
hwspinlock    bd5717a4632cdecafe82d03de7dcb3b1876e2828

The commit 3f87493930a0f934549b04e100ecc2110e4f1efd then seems to be
what I need to test. I have cloned Mark's spi tree and tried to boot
the for-next branch (on v4.10-rc1) at
3f87493930a0f934549b04e100ecc2110e4f1efd, and it boots successfully.
This leads me to believe the issue might be related to the merge
conflict resolution done by Stephen, but I wanted to check and ask.
Perhaps there are some specific tests I can run.
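
A side note on the Next/SHA1s lookup above: a plain "grep spi" also
matches spi-nor and hwspinlock. The following is a minimal sketch of an
exact-match lookup, assuming the whitespace-separated "name sha1" layout
visible in the grep output; the function name next_sha1 is hypothetical
and not part of any linux-next tooling.

```shell
# next_sha1 TREE [FILE]: print the SHA1 recorded for exactly TREE.
# Assumes two whitespace-separated columns (tree name, commit id),
# as in the grep output quoted above. FILE defaults to Next/SHA1s.
next_sha1() {
    awk -v tree="$1" '$1 == tree { print $2 }' "${2:-Next/SHA1s}"
}
```

Run from the top of a linux-next checkout, "next_sha1 spi" would print
only the commit recorded for the spi tree.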

The qemu instance I am using:

/usr/local/bin/qemu-system-x86_64 -cpu kvm64 -enable-kvm -m 4096 -smp
4 -netdev vde,sock=/var/run/qemu-vde.ctl,group=kvm,mode=0660,id=vde0
-device e1000,netdev=vde0,mac=52:54:00:12:34:84 -hda
/opt/qemu/debian-x86_64.qcow2 -hdb /opt/qemu/linux-next.qcow2 -monitor
pty -serial stdio -chardev pty,id=ttyS1 -device
isa-serial,chardev=ttyS1 -chardev pty,id=ttyS2 -device
isa-serial,chardev=ttyS2 -nographic -boot order=d

  Luis

* Re: Regression on next-20170203 spi/for-next 3f87493930a0f qemu on x86_64
  2017-02-04 20:05 Regression on next-20170203 spi/for-next 3f87493930a0f qemu on x86_64 Luis R. Rodriguez
@ 2017-02-04 23:17 ` Stephen Rothwell
  2017-02-04 23:27   ` Stephen Rothwell
  2017-02-05 13:59   ` Mark Brown
  2017-02-05  1:22 ` Guenter Roeck
  1 sibling, 2 replies; 9+ messages in thread
From: Stephen Rothwell @ 2017-02-04 23:17 UTC
  To: Luis R. Rodriguez
  Cc: Mark Brown, Fengguang Wu, Guenter Roeck, linux-kernel, X86 ML,
	Andy Lutomirski, Borislav Petkov

Hi Luis,

On Sat, 4 Feb 2017 12:05:42 -0800 "Luis R. Rodriguez" <mcgrof@kernel.org> wrote:
>
> though so it seems something with my configuration and boot. I
> bisected next-20170203 between its latest commit and v4.10-rc6 and
> ended up with this bad commit:
> 
> 104a519fe1732b4e503ebc7b4ac71b6f0b8a0b62
> 
> $ git show 104a519fe1732b4e503ebc7b4ac71b6f0b8a0b62
> commit 104a519fe1732b4e503ebc7b4ac71b6f0b8a0b62
> Merge: 7c3b1edeee66 3f87493930a0
> Author: Stephen Rothwell <sfr@canb.auug.org.au>
> Date:   Fri Feb 3 12:30:38 2017 +1100
> 
>     Merge remote-tracking branch 'spi/for-next'
> 
> I have checked Next/SHA1s and it shows:
> 
> mcgrof@piggy ~/linux-next (git::original)$ grep spi Next/SHA1s
> spi-nor        dc12bcccadafb5441170e6b7c8a438c91d4f385b
> spi        3f87493930a0f934549b04e100ecc2110e4f1efd
> hwspinlock    bd5717a4632cdecafe82d03de7dcb3b1876e2828
> 
> The commit 3f87493930a0f934549b04e100ecc2110e4f1efd then seems to be
> what I need to test. I have cloned Mark's spi tree and just tried to
> boot the for-next branch (on v4.10-rc1) on
> 3f87493930a0f934549b04e100ecc2110e4f1efd, and it boots successfully.
> This would lead me to believe this issue might be related to the merge
> conflict resolution done by Stephen, but wanted to check and ask.
> Perhaps there might be some specific tests I can run.

OK, it is possible that the merge is actually incorrect.  I did *not*
do any manual resolution of that merge and git only reported an
automatic resolution in file drivers/spi/spi-bcm-qspi.c (which looks ok
from a quick glance).

It is always possible that there is some semantic conflict that git
won't see and that didn't also involve a syntactic conflict or a build
failure, e.g. the internal semantics of a function change on one side
of the merge but a new usage expecting the old semantics is introduced
on the other side.
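
The kind of semantic conflict described above can be reproduced in a
throwaway repository. This is only an illustrative sketch: the file,
function and branch names (lib.sh, caller.sh, delay, base, side-a,
side-b) are invented for the example and have nothing to do with the
spi tree or with the actual regression.

```shell
# Build a scratch repository in which two branches merge without any
# textual conflict, yet the merged result misbehaves.
semantic_conflict_demo() {
    repo=$(mktemp -d) || return 1
    (
        cd "$repo" || exit 1
        git init -q .
        git config user.name "Demo"
        git config user.email "demo@example.com"

        # Base: delay() reports a value in seconds.
        echo 'delay() { echo 2; }' > lib.sh
        git add lib.sh
        git commit -q -m "base: delay() returns seconds"
        git branch base

        # Side A changes the internal semantics: milliseconds now.
        git checkout -q -b side-a base
        echo 'delay() { echo 2000; }' > lib.sh
        git commit -q -a -m "side A: delay() returns milliseconds"

        # Side B adds a new caller that still expects seconds.
        git checkout -q -b side-b base
        echo 'echo "sleeping $(delay) seconds"' > caller.sh
        git add caller.sh
        git commit -q -m "side B: new caller of delay()"

        # The sides touch different files, so git merges them silently.
        git merge -q --no-edit side-a || exit 1

        # The merged tree pairs the new delay() with the old expectation.
        sh -c '. ./lib.sh; . ./caller.sh'
    )
    status=$?
    rm -rf "$repo"
    return $status
}
```

Calling semantic_conflict_demo prints "sleeping 2000 seconds": the merge
succeeded without complaint and both sides were correct on their own,
but the combination is wrong.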

-- 
Cheers,
Stephen Rothwell

* Re: Regression on next-20170203 spi/for-next 3f87493930a0f qemu on x86_64
  2017-02-04 23:17 ` Stephen Rothwell
@ 2017-02-04 23:27   ` Stephen Rothwell
  2017-02-05 13:59   ` Mark Brown
  1 sibling, 0 replies; 9+ messages in thread
From: Stephen Rothwell @ 2017-02-04 23:27 UTC
  To: Luis R. Rodriguez
  Cc: Mark Brown, Fengguang Wu, Guenter Roeck, linux-kernel, X86 ML,
	Andy Lutomirski, Borislav Petkov

Hi all,

On Sun, 5 Feb 2017 10:17:29 +1100 Stephen Rothwell <sfr@canb.auug.org.au> wrote:
>
> On Sat, 4 Feb 2017 12:05:42 -0800 "Luis R. Rodriguez" <mcgrof@kernel.org> wrote:
> >
> > though so it seems something with my configuration and boot. I
> > bisected next-20170203 between its latest commit and v4.10-rc6 and
> > ended up with this bad commit:
> > 
> > 104a519fe1732b4e503ebc7b4ac71b6f0b8a0b62
> > 
> > $ git show 104a519fe1732b4e503ebc7b4ac71b6f0b8a0b62
> > commit 104a519fe1732b4e503ebc7b4ac71b6f0b8a0b62
> > Merge: 7c3b1edeee66 3f87493930a0
> > Author: Stephen Rothwell <sfr@canb.auug.org.au>
> > Date:   Fri Feb 3 12:30:38 2017 +1100
> > 
> >     Merge remote-tracking branch 'spi/for-next'
> > 
> > I have checked Next/SHA1s and it shows:
> > 
> > mcgrof@piggy ~/linux-next (git::original)$ grep spi Next/SHA1s
> > spi-nor        dc12bcccadafb5441170e6b7c8a438c91d4f385b
> > spi        3f87493930a0f934549b04e100ecc2110e4f1efd
> > hwspinlock    bd5717a4632cdecafe82d03de7dcb3b1876e2828
> > 
> > The commit 3f87493930a0f934549b04e100ecc2110e4f1efd then seems to be
> > what I need to test. I have cloned Mark's spi tree and just tried to
> > boot the for-next branch (on v4.10-rc1) on
> > 3f87493930a0f934549b04e100ecc2110e4f1efd, and it boots successfully.
> > This would lead me to believe this issue might be related to the merge
> > conflict resolution done by Stephen, but wanted to check and ask.
> > Perhaps there might be some specific tests I can run.  
> 
> OK, it is possible that the merge is actually incorrect.  I did *not*
> do any manual resolution of that merge and git only reported an
> automatic resolution in file drivers/spi/spi-bcm-qspi.c (which looks ok
> from a quick glance).
> 
> It is always possible that there is some semantic conflict that git
> won't see and didn't also involve a syntactic conflict or a build
> failure.  e.g. the internal semantics of a function changes on one side
> of the merge but a new usage expecting the old semantics is introduced
> on the other side.

Just to mention, there was no change to the spi tree between
next-20170202 and next-20170203.  I assume that next-20170202 is fine?
If so, you could try bisecting with next-20170202 as good and
104a519fe1732b4e503ebc7b4ac71b6f0b8a0b62 as bad.  I have no idea if
that sort of bisect will even work, though.

Or if commit 8cfb3801a57a (the merge of the spi tree in next-20170202)
is fine, then you could try using that as your starting good (that will
remove a lot of next-20170202).
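
As a rough sketch, the bisection suggested above could be started as
follows. Note that "git bisect start" takes the bad revision first and
the good one second. The wrapper name start_bisect is hypothetical; it
only exists to make the argument order explicit. When the good revision
is not an ancestor of the bad one (likely here, since each linux-next
release is rebuilt on top of a fresh base), git bisect asks for the
merge base to be tested first.

```shell
# start_bisect GOOD BAD: begin a bisection in the current repository.
start_bisect() {
    good=$1
    bad=$2
    # git bisect start expects <bad> before <good>.
    git bisect start "$bad" "$good"
}
```

For the first suggestion this would be
"start_bisect next-20170202 104a519fe1732b4e503ebc7b4ac71b6f0b8a0b62",
and for the second "start_bisect 8cfb3801a57a
104a519fe1732b4e503ebc7b4ac71b6f0b8a0b62", each followed by
"git bisect good" or "git bisect bad" after every test boot.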

-- 
Cheers,
Stephen Rothwell

* Re: Regression on next-20170203 spi/for-next 3f87493930a0f qemu on x86_64
  2017-02-04 20:05 Regression on next-20170203 spi/for-next 3f87493930a0f qemu on x86_64 Luis R. Rodriguez
  2017-02-04 23:17 ` Stephen Rothwell
@ 2017-02-05  1:22 ` Guenter Roeck
  2017-02-05 10:33   ` Borislav Petkov
  1 sibling, 1 reply; 9+ messages in thread
From: Guenter Roeck @ 2017-02-05  1:22 UTC
  To: Luis R. Rodriguez, Stephen Rothwell, Mark Brown
  Cc: Fengguang Wu, linux-kernel, X86 ML, Andy Lutomirski, Borislav Petkov

On 02/04/2017 12:05 PM, Luis R. Rodriguez wrote:
> I could not boot next-20170203 on my x86_64 qemu instance. It stalls at:
>
> [    0.015549] CPU: Physical Processor ID: 0
> [    0.015842] mce: CPU supports 10 MCE banks
> [    0.016032] Last level iTLB entries: 4KB 0, 2MB 0, 4MB 0
> [    0.016393] Last level dTLB entries: 4KB 0, 2MB 0, 4MB 0, 1GB 0
> [    0.016871] Freeing SMP alternatives memory: 24K
> [    0.017710] ftrace: allocating 25888 entries in 102 pages
> [    0.024102] smpboot: Max logical packages: 4
> [    0.024524] x2apic enabled
> [    0.024851] Switched APIC routing to physical x2apic.
> [    0.025755] ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
>

There may be a problem or change in the timer code. For the most
part the "kvm64" cpu works for me, but I saw this once:

smpboot: Max logical packages: 1
..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
..MP-BIOS bug: 8254 timer not connected to IO-APIC
...trying to set up timer (IRQ0) through the 8259A ...
..... (found apic 0 pin 2) ...
....... failed.
...trying to set up timer as Virtual Wire IRQ...
..... failed.
...trying to set up timer as ExtINT IRQ...
..... failed :(.
Kernel panic - not syncing: IO-APIC + timer doesn't work!  Boot with apic=debug and send a report.  Then try booting with the 'noapic' option.

The "normal" log with -next looks as follows:

smpboot: Max logical packages: 4
..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
..MP-BIOS bug: 8254 timer not connected to IO-APIC
...trying to set up timer (IRQ0) through the 8259A ...
..... (found apic 0 pin 2) ...
....... failed.
...trying to set up timer as Virtual Wire IRQ...
..... failed.
...trying to set up timer as ExtINT IRQ...
..... works.

Upstream (v4.10-rc6-193-ga572a1b99948), the same command yields no error at all:

..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
smpboot: CPU0: Intel Common KVM processor (family: 0xf, model: 0x6, stepping: 0x1)

The "new" error message is also seen with other CPUs, not just with "kvm64".

I am using qemu 2.8.

Maybe we should try to figure out where the new error messages come from?
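
To compare many boots, the outcomes seen in this thread can be told
apart mechanically from a captured serial log. The sketch below is an
assumption-laden helper, not an existing tool: the function name and the
labels it prints are invented, and it matches only the message texts
quoted in this thread. It also assumes the log has no trailing blank
line after the last kernel message.

```shell
# classify_boot_log FILE: print one of panic, stalled, timer-fallback, clean.
classify_boot_log() {
    log=$1
    if grep -q -F 'Kernel panic - not syncing' "$log"; then
        # IO-APIC timer setup failed in every mode.
        echo panic
    elif tail -n 1 "$log" | grep -q -F '..TIMER: vector='; then
        # Output ends at the ..TIMER line, as in the original report.
        echo stalled
    elif grep -q -F '8254 timer not connected to IO-APIC' "$log"; then
        # The "new" error messages appeared but a fallback worked.
        echo timer-fallback
    else
        echo clean
    fi
}
```

For example, "classify_boot_log boot.log" on a log that ends at the
..TIMER line prints "stalled".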

Thanks,
Guenter

* Re: Regression on next-20170203 spi/for-next 3f87493930a0f qemu on x86_64
  2017-02-05  1:22 ` Guenter Roeck
@ 2017-02-05 10:33   ` Borislav Petkov
  2017-02-05 15:00     ` Guenter Roeck
  2017-02-06 17:47     ` Luis R. Rodriguez
  0 siblings, 2 replies; 9+ messages in thread
From: Borislav Petkov @ 2017-02-05 10:33 UTC
  To: Guenter Roeck
  Cc: Luis R. Rodriguez, Stephen Rothwell, Mark Brown, Fengguang Wu,
	linux-kernel, X86 ML, Andy Lutomirski

On Sat, Feb 04, 2017 at 05:22:55PM -0800, Guenter Roeck wrote:
> Upstream (v4.10-rc6-193-ga572a1b99948), the same command yields no error at all:

That's because you tested Linus' merge commit of the branch which fixed that :-)

IOW, the fix should be:

aaaec6fc7554 ("x86/irq: Make irq activate operations symmetric")

It is on its way to stable too, as we speak.

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

* Re: Regression on next-20170203 spi/for-next 3f87493930a0f qemu on x86_64
  2017-02-04 23:17 ` Stephen Rothwell
  2017-02-04 23:27   ` Stephen Rothwell
@ 2017-02-05 13:59   ` Mark Brown
  1 sibling, 0 replies; 9+ messages in thread
From: Mark Brown @ 2017-02-05 13:59 UTC
  To: Stephen Rothwell
  Cc: Luis R. Rodriguez, Fengguang Wu, Guenter Roeck, linux-kernel,
	X86 ML, Andy Lutomirski, Borislav Petkov

On Sun, Feb 05, 2017 at 10:17:29AM +1100, Stephen Rothwell wrote:

> OK, it is possible that the merge is actually incorrect.  I did *not*
> do any manual resolution of that merge and git only reported an
> automatic resolution in file drivers/spi/spi-bcm-qspi.c (which looks ok
> from a quick glance).

That's a driver for Broadcom SoCs which is not going to run on x86 so...

* Re: Regression on next-20170203 spi/for-next 3f87493930a0f qemu on x86_64
  2017-02-05 10:33   ` Borislav Petkov
@ 2017-02-05 15:00     ` Guenter Roeck
  2017-02-06 17:47     ` Luis R. Rodriguez
  1 sibling, 0 replies; 9+ messages in thread
From: Guenter Roeck @ 2017-02-05 15:00 UTC
  To: Borislav Petkov
  Cc: Luis R. Rodriguez, Stephen Rothwell, Mark Brown, Fengguang Wu,
	linux-kernel, X86 ML, Andy Lutomirski

On 02/05/2017 02:33 AM, Borislav Petkov wrote:
> On Sat, Feb 04, 2017 at 05:22:55PM -0800, Guenter Roeck wrote:
>> Upstream (v4.10-rc6-193-ga572a1b99948), the same command yields no error at all:
>
> That's because you tested Linus' merge commit of the branch which fixed that :-)
>
FWIW, I also tested v4.10-rc3, which didn't have the problem either.

> IOW, the fix should be:
>
> aaaec6fc7554 ("x86/irq: Make irq activate operations symmetric")
>
Yes, it does. Let's hope that this fixes Luis' problem as well.

Thanks,
Guenter

* Re: Regression on next-20170203 spi/for-next 3f87493930a0f qemu on x86_64
  2017-02-05 10:33   ` Borislav Petkov
  2017-02-05 15:00     ` Guenter Roeck
@ 2017-02-06 17:47     ` Luis R. Rodriguez
  2017-02-07  0:02       ` Borislav Petkov
  1 sibling, 1 reply; 9+ messages in thread
From: Luis R. Rodriguez @ 2017-02-06 17:47 UTC
  To: Borislav Petkov
  Cc: Guenter Roeck, Luis R. Rodriguez, Stephen Rothwell, Mark Brown,
	Fengguang Wu, linux-kernel, X86 ML, Andy Lutomirski

On Sun, Feb 05, 2017 at 11:33:25AM +0100, Borislav Petkov wrote:
> On Sat, Feb 04, 2017 at 05:22:55PM -0800, Guenter Roeck wrote:
> > Upstream (v4.10-rc6-193-ga572a1b99948), the same command yields no error at all:
> 
> That's because you tested Linus' merge commit of the branch which fixed that :-)
> 
> IOW, the fix should be:
> 
> aaaec6fc7554 ("x86/irq: Make irq activate operations symmetric")
> 
> It is on its way to stable too, as we speak.

I've taken this patch and applied it on top of next-20170203
and confirm it fixes the regression. I've also tested next-20170206
which has the fix and confirm that boots as well.

Do we have any test units which can kick off regularly to test against such
type of regression in the future or is it not worth it?

Thanks Boris!

  Luis

* Re: Regression on next-20170203 spi/for-next 3f87493930a0f qemu on x86_64
  2017-02-06 17:47     ` Luis R. Rodriguez
@ 2017-02-07  0:02       ` Borislav Petkov
  0 siblings, 0 replies; 9+ messages in thread
From: Borislav Petkov @ 2017-02-07  0:02 UTC
  To: Luis R. Rodriguez
  Cc: Guenter Roeck, Stephen Rothwell, Mark Brown, Fengguang Wu,
	linux-kernel, X86 ML, Andy Lutomirski

On Mon, Feb 06, 2017 at 06:47:43PM +0100, Luis R. Rodriguez wrote:
> Do we have any test units which can kick off regularly to test against
> such type of regression in the future or is it not worth it?

Yap, it is called: build a new kernel and boot it on the box :-)

I always try to build and boot all -rcs and tip/master on my boxes and I
do catch a couple of issues almost every week this way.
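
For anyone who wants to script the "build and boot" check, here is a
minimal sketch of a boot smoke test with a time limit. It assumes the
timeout(1) utility from GNU coreutils is available; the function name
boot_reaches is hypothetical. The marker to look for is up to the
caller; "smpboot: CPU0:" is used below only because it is the line that
follows ..TIMER in the working upstream log quoted earlier in the
thread.

```shell
# boot_reaches MARKER SECONDS COMMAND [ARGS...]
# Run COMMAND for at most SECONDS and succeed only if its output
# contains MARKER. A stalled boot is killed by timeout(1) and the
# function returns non-zero.
boot_reaches() {
    marker=$1
    limit=$2
    shift 2
    timeout "$limit" "$@" 2>&1 | grep -q -F -- "$marker"
}
```

COMMAND would be a qemu invocation that prints the guest console on
stdout, such as the one in the first message with -serial stdio. Since
the function exits with 0 on success and 1 on failure, a script wrapping
it could also be handed to "git bisect run".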

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.
