* 2.6.38-rc3 regression on parisc: segfaults
@ 2011-02-01 22:00 Meelis Roos
  2011-02-01 22:12 ` James Bottomley
                   ` (2 more replies)
  0 siblings, 3 replies; 13+ messages in thread
From: Meelis Roos @ 2011-02-01 22:00 UTC
  To: linux-parisc, Linux Kernel list

I have been testing devel kernels on an SMP L1000 successfully up to and
including 2.6.38-rc2-00324-g70d1f36. Testing means booting the new
kernel and running aptitude to update to current Debian unstable.

Now I tried 2.6.38-rc3 and got a crash from aptitude on 2 out of 2
tries. Maybe aptitude was broken in between, but it looks like a kernel
bug. I retried 2.6.38-rc2-00324-g70d1f36 and that seemed to work fine, so
it's more likely a kernel problem.

What additional information can I provide?

[   74.590000]
[   74.590000] do_page_fault() pid=979 command='aptitude' type=15 address=0x0000002d
[   74.590000]
[   74.590000]      YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI
[   74.590000] PSW: 00000000000001001111111100001111 Not tainted
[   74.590000] r00-03  000000ff0004ff0f 000000004027b5ac 00000000405df23b 000000004067e884
[   74.590000] r04-07  000000004067c860 000000004067e6d0 000000004067e880 00000000c014b7d0
[   74.590000] r08-11  0000000000000001 0000000000000001 000000004067c860 0000000041b082c8
[   74.590000] r12-15  000000004067e730 000000004067e6d0 000000004067c860 000000004067c860
[   74.590000] r16-19  000000004067c860 000000004067e060 0000000000000000 000000004067c860
[   74.590000] r20-23  0000000000000229 0000000000000000 0000000000000000 0000000000000000
[   74.590000] r24-27  fffffffffffffff5 ffffffffffffffd3 000000004067e730 00000000004227a4
[   74.590000] r28-31  000000000000002d 0000000000000000 00000000c014b8c0 00000000402688db
[   74.590000] sr00-03  0000000000228800 0000000000228800 0000000000000000 0000000000228800
[   74.590000] sr04-07  0000000000228800 0000000000228800 0000000000228800 0000000000228800
[   74.590000]
[   74.590000]       VZOUICununcqcqcqcqcqcrmunTDVZOUI
[   74.590000] FPSR: 00001000001000100010000000000000
[   74.590000] FPER1: 00000000
[   74.590000] fr00-03  0822200000000000 0000000000000000 0000000000000000 0000000000000000
[   74.590000] fr04-07  0000000a00000000 0000000000000000 0000000000000000 0000000000000000
[   74.590000] fr08-11  0000000000000000 00000000406cf120 00000000401563e8 00000000404c59d8
[   74.590000] fr12-15  000000000804000f 000000000800000f 00000000401563e8 00000000ffc60460
[   74.590000] fr16-19  00000000406cf120 0000000040639d54 0000000000000046 0000000040599294
[   74.590000] fr20-23  00000000ffc60348 00000000406dd920 0000000000000038 4038000000000000
[   74.590000] fr24-27  0000000000000000 0000000000000000 3ff0000000000000 412e848c00000000
[   74.590000] fr28-31  0000000040599250 00000000ffc60357 00000000ffc60357 00000000405dfba8
[   74.590000]
[   74.590000] IASQ: 0000000000228800 0000000000228800 IAOQ: 00000000405df25b 00000000405df25f
[   74.590000]  IIR: 0f80108b    ISR: 0000000000228800  IOR: 000000000000002d
[   74.590000]  CPU:        0   CR30: 00000000fe050000 CR31: 0000000000008020
[   74.590000]  ORIG_R28: 0000000000000080
[   74.590000]  IAOQ[0]: 00000000405df25b
[   74.590000]  IAOQ[1]: 00000000405df25f
[   74.590000]  RP(r2): 00000000405df23b
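
The PSW line in the dump can be read mechanically by lining it up with the
flag header printed above it. A minimal Python sketch, assuming the header
carries exactly one letter per PSW bit (which is how the dump appears to be
laid out); the function name set_flags is illustrative only:

```python
# Header row and PSW value copied from the register dump above.
FLAGS = "YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI"
PSW = "00000000000001001111111100001111"


def set_flags(flags, bits):
    """Return the header letters whose corresponding PSW bit is 1."""
    if len(flags) != len(bits):
        raise ValueError("flag header and PSW must have the same width")
    return "".join(name for name, bit in zip(flags, bits) if bit == "1")


print(set_flags(FLAGS, PSW))
```

As far as I read the PA-RISC PSW definition, the letters that come out (C, the
eight carry/borrow bits, Q, P, D, I) are what one would expect for an ordinary
user process with address translation and interrupts enabled, so the PSW
itself does not look suspicious.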


-- 
Meelis Roos (mroos@linux.ee)

* Re: 2.6.38-rc3 regression on parisc: segfaults
  2011-02-01 22:00 2.6.38-rc3 regression on parisc: segfaults Meelis Roos
@ 2011-02-01 22:12 ` James Bottomley
  2011-02-03 22:36     ` Meelis Roos
  2011-02-01 22:16   ` Carlos O'Donell
  2011-02-03  2:24 ` John David Anglin
  2 siblings, 1 reply; 13+ messages in thread
From: James Bottomley @ 2011-02-01 22:12 UTC
  To: Meelis Roos; +Cc: linux-parisc, Linux Kernel list

On Wed, 2011-02-02 at 00:00 +0200, Meelis Roos wrote:
> I have been testing devel kernels on an SMP L1000 successfully up to and
> including 2.6.38-rc2-00324-g70d1f36. Testing means booting the new
> kernel and running aptitude to update to current Debian unstable.
>
> Now I tried 2.6.38-rc3 and got a crash from aptitude on 2 out of 2
> tries. Maybe aptitude was broken in between, but it looks like a kernel
> bug. I retried 2.6.38-rc2-00324-g70d1f36 and that seemed to work fine, so
> it's more likely a kernel problem.
> 
> What additional information can I provide?

Probably a bisection, if you could.  There have been no parisc patches
between -rc2 and -rc3, so it's coming from outside the architecture.

Thanks,

James



* Re: 2.6.38-rc3 regression on parisc: segfaults
@ 2011-02-01 22:16   ` Carlos O'Donell
  0 siblings, 0 replies; 13+ messages in thread
From: Carlos O'Donell @ 2011-02-01 22:16 UTC
  To: Meelis Roos; +Cc: linux-parisc, Linux Kernel list

On Tue, Feb 1, 2011 at 5:00 PM, Meelis Roos <mroos@linux.ee> wrote:
> I have been testing devel kernels on an SMP L1000 successfully up to and
> including 2.6.38-rc2-00324-g70d1f36. Testing means booting the new
> kernel and running aptitude to update to current Debian unstable.
>
> Now I tried 2.6.38-rc3 and got a crash from aptitude on 2 out of 2
> tries. Maybe aptitude was broken in between, but it looks like a kernel
> bug. I retried 2.6.38-rc2-00324-g70d1f36 and that seemed to work fine, so
> it's more likely a kernel problem.
>
> What additional information can I provide?
>
> [   74.590000]
> [   74.590000] do_page_fault() pid=979 command='aptitude' type=15 address=0x0000002d
> [   74.590000]
> [   74.590000]      YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI
> [   74.590000] PSW: 00000000000001001111111100001111 Not tainted
> [   74.590000] r00-03  000000ff0004ff0f 000000004027b5ac 00000000405df23b 000000004067e884
> [   74.590000] r04-07  000000004067c860 000000004067e6d0 000000004067e880 00000000c014b7d0
> [   74.590000] r08-11  0000000000000001 0000000000000001 000000004067c860 0000000041b082c8
> [   74.590000] r12-15  000000004067e730 000000004067e6d0 000000004067c860 000000004067c860
> [   74.590000] r16-19  000000004067c860 000000004067e060 0000000000000000 000000004067c860
> [   74.590000] r20-23  0000000000000229 0000000000000000 0000000000000000 0000000000000000
> [   74.590000] r24-27  fffffffffffffff5 ffffffffffffffd3 000000004067e730 00000000004227a4
> [   74.590000] r28-31  000000000000002d 0000000000000000 00000000c014b8c0 00000000402688db
> [   74.590000] sr00-03  0000000000228800 0000000000228800 0000000000000000 0000000000228800
> [   74.590000] sr04-07  0000000000228800 0000000000228800 0000000000228800 0000000000228800
> [   74.590000]
> [   74.590000]       VZOUICununcqcqcqcqcqcrmunTDVZOUI
> [   74.590000] FPSR: 00001000001000100010000000000000
> [   74.590000] FPER1: 00000000
> [   74.590000] fr00-03  0822200000000000 0000000000000000 0000000000000000 0000000000000000
> [   74.590000] fr04-07  0000000a00000000 0000000000000000 0000000000000000 0000000000000000
> [   74.590000] fr08-11  0000000000000000 00000000406cf120 00000000401563e8 00000000404c59d8
> [   74.590000] fr12-15  000000000804000f 000000000800000f 00000000401563e8 00000000ffc60460
> [   74.590000] fr16-19  00000000406cf120 0000000040639d54 0000000000000046 0000000040599294
> [   74.590000] fr20-23  00000000ffc60348 00000000406dd920 0000000000000038 4038000000000000
> [   74.590000] fr24-27  0000000000000000 0000000000000000 3ff0000000000000 412e848c00000000
> [   74.590000] fr28-31  0000000040599250 00000000ffc60357 00000000ffc60357 00000000405dfba8
> [   74.590000]
> [   74.590000] IASQ: 0000000000228800 0000000000228800 IAOQ: 00000000405df25b 00000000405df25f
> [   74.590000]  IIR: 0f80108b    ISR: 0000000000228800  IOR: 000000000000002d
> [   74.590000]  CPU:        0   CR30: 00000000fe050000 CR31: 0000000000008020
> [   74.590000]  ORIG_R28: 0000000000000080
> [   74.590000]  IAOQ[0]: 00000000405df25b
> [   74.590000]  IAOQ[1]: 00000000405df25f
> [   74.590000]  RP(r2): 00000000405df23b

The rp (return pointer) is pointing back into what appears to be a
shared library (always loaded around 0x4???????).

The iir (interrupting instruction register) is instruction "0:   0f 80
10 8b     ldw 0(ret0),r11" (you can do this yourself with "disasm"
from http://cvs.parisc-linux.org/build-tools/disasm?revision=1.1&view=markup).

You can see that ret0 (r28) is indeed 0x2d (the address of the fault), and
loading from 0x0 + 0x2d will cause a fault and kill your program.
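
The decode above can also be reproduced without the disasm tool. A minimal
Python sketch, assuming the PA-RISC short-displacement load layout (MSB
first: op(6) b(5) im5(5) s(2) a(1) 1 cc(2) ext4(4) m(1) t(5)); the helper
names are illustrative, and the layout is my reading of the instruction
format rather than something stated in this thread:

```python
IIR = 0x0F80108B  # interrupting instruction from the dump
R28 = 0x2D        # ret0 (r28) value from the dump


def field(word, lsb, width):
    """Extract `width` bits whose least significant bit sits at `lsb`."""
    return (word >> lsb) & ((1 << width) - 1)


def decode_short_load(word):
    """Split a 32-bit word into the raw fields of a short-displacement load."""
    return {
        "op": field(word, 26, 6),      # major opcode (0x03 = indexed/short memory ops)
        "base": field(word, 21, 5),    # base register number
        "im5": field(word, 16, 5),     # raw 5-bit displacement field
        "ext4": field(word, 6, 4),     # sub-opcode (2 is taken to mean a word load)
        "target": field(word, 0, 5),   # destination register number
    }


fields = decode_short_load(IIR)
# The raw im5 field is 0 here, so no sign handling is needed for this word.
effective = R28 + fields["im5"]
print(fields, hex(effective))
```

The base register comes out as 28 (ret0), the target as 11 and the
displacement as 0, so the effective address is just the value of r28, which
matches both the IOR and the fault address reported by do_page_fault().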

However, the failure probably happened earlier.

As James says, you should try to bisect exactly which commit caused the failure.

Cheers,
Carlos.

* Re: 2.6.38-rc3 regression on parisc: segfaults
  2011-02-01 22:00 2.6.38-rc3 regression on parisc: segfaults Meelis Roos
  2011-02-01 22:12 ` James Bottomley
  2011-02-01 22:16   ` Carlos O'Donell
@ 2011-02-03  2:24 ` John David Anglin
  2011-02-03  7:03   ` Meelis Roos
  2011-02-04 10:11   ` Meelis Roos
  2 siblings, 2 replies; 13+ messages in thread
From: John David Anglin @ 2011-02-03  2:24 UTC
  To: Meelis Roos; +Cc: linux-parisc, linux-kernel

> I have been testing devel kernels on an SMP L1000 successfully up to and
> including 2.6.38-rc2-00324-g70d1f36. Testing means booting the new
> kernel and running aptitude to update to current Debian unstable.
>
> Now I tried 2.6.38-rc3 and got a crash from aptitude on 2 out of 2
> tries. Maybe aptitude was broken in between, but it looks like a kernel
> bug. I retried 2.6.38-rc2-00324-g70d1f36 and that seemed to work fine, so
> it's more likely a kernel problem.

If aptitude fails consistently, it should be possible to debug or
isolate to a particular kernel change.  Usually, SMP segvs don't
provide much information as to the cause of the problem.  strace
output and a gdb backtrace would be useful.

I have seen improved SMP stability building with GCC 4.5.3 (try a
recent snap).  This fixes an asm/branch problem.  It seems like James'
flush patch hasn't been pulled.

Dave
-- 
J. David Anglin                                  dave.anglin@nrc-cnrc.gc.ca
National Research Council of Canada              (613) 990-0752 (FAX: 952-6602)

* Re: 2.6.38-rc3 regression on parisc: segfaults
  2011-02-03  2:24 ` John David Anglin
@ 2011-02-03  7:03   ` Meelis Roos
  2011-02-04 10:11   ` Meelis Roos
  1 sibling, 0 replies; 13+ messages in thread
From: Meelis Roos @ 2011-02-03  7:03 UTC
  To: John David Anglin; +Cc: linux-parisc, linux-kernel

> If aptitude fails consistently, it should be possible to debug or
> isolate to a particular kernel change.  Usually, SMP segvs don't
> provide much information as to the cause of the problem.  strace
> output and a gdb backtrace would be useful.

It's not failing consistently - it fails in different places. I'm bisecting
now.

-- 
Meelis Roos (mroos@linux.ee)

* Re: 2.6.38-rc3 regression on parisc: segfaults
@ 2011-02-03 22:36     ` Meelis Roos
  0 siblings, 0 replies; 13+ messages in thread
From: Meelis Roos @ 2011-02-03 22:36 UTC
  To: James Bottomley; +Cc: linux-parisc, Linux Kernel list

> > I have been testing devel kernels on an SMP L1000 successfully up to and
> > including 2.6.38-rc2-00324-g70d1f36. Testing means booting the new
> > kernel and running aptitude to update to current Debian unstable.
> >
> > Now I tried 2.6.38-rc3 and got a crash from aptitude on 2 out of 2
> > tries. Maybe aptitude was broken in between, but it looks like a kernel
> > bug. I retried 2.6.38-rc2-00324-g70d1f36 and that seemed to work fine, so
> > it's more likely a kernel problem.
> > 
> > What additional information can I provide?
> 
> Probably a bisection, if you could.  There have been no parisc patches
> between -rc2 and -rc3, so it's coming from outside the architecture.

The result is strange :(

6b28405395f7ec492ea69f541cc774adcb9e00ca is the first bad commit
commit 6b28405395f7ec492ea69f541cc774adcb9e00ca
Author: Axel Köllhofer <AxelKoellhofer@web.de>
Date:   Sat Jan 22 14:33:50 2011 -0600

    staging: r8712u: Add new device IDs

    This patch adds several new device ids to the r8712u staging driver.
    The new ids were retrieved from latest vendor driver (v2.6.6.0.20101111)
    downloadable from www.realtek.com.tw

    Signed-off-by: Axel Koellhofer <AxelKoellhofer@web.de>
    Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
    Cc: Stable <stable@kernel.org> [2.6.37]
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

:040000 040000 185c3d2c1e98cc99009bfb772ed0779410784110 f5aa903931116f28f803003f594fec3b2a29a6f6 M      drivers

Seems absolutely unrelated - I do not have staging enabled, so no
CONFIG_R8712U either.

The "bad" bisects were clearly bad and failed quickly during the aptitude
list update, but the good ones might have needed more stress... or it is
some alignment-like problem. Will try again starting from these bad
bisects to narrow it down, and stress the seemingly good ones better.
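
How much extra stress a seemingly good kernel needs can be estimated roughly.
A minimal Python sketch, assuming each test run on a truly bad kernel fails
independently with some probability p_fail; the probabilities below are
hypothetical and were not measured in this thread, and the function name is
illustrative:

```python
import math


def runs_needed(p_fail, max_false_good=0.05):
    """Clean runs to require before marking a commit good in the bisection.

    Assumes every run on a bad kernel fails independently with probability
    p_fail, so n clean runs in a row occur with probability (1 - p_fail)**n.
    Returns the smallest n for which that probability is <= max_false_good.
    """
    if not 0.0 < p_fail < 1.0:
        raise ValueError("p_fail must be strictly between 0 and 1")
    return math.ceil(math.log(max_false_good) / math.log(1.0 - p_fail))


for p in (0.9, 0.5, 0.2):
    print(p, runs_needed(p))
```

Under this assumption a bug that shows up in only one run out of five needs
many more clean runs per step than one that shows up nearly every time, so a
single quiet run is weak evidence for a "good" verdict.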

-- 
Meelis Roos (mroos@linux.ee)

* Re: 2.6.38-rc3 regression on parisc: segfaults
  2011-02-03  2:24 ` John David Anglin
  2011-02-03  7:03   ` Meelis Roos
@ 2011-02-04 10:11   ` Meelis Roos
  2011-02-04 15:07     ` John David Anglin
  1 sibling, 1 reply; 13+ messages in thread
From: Meelis Roos @ 2011-02-04 10:11 UTC
  To: John David Anglin; +Cc: linux-parisc, linux-kernel

> If aptitude fails consistently, it should be possible to debug or
> isolate to a particular kernel change.  Usually, SMP segvs don't
> provide much information as to the cause of the problem.  strace
> output and a gdb backtrace would be useful.

strace works but does not tell me much:

2349  _newselect(1, [0], NULL, NULL, {0, 0}) = 0 (Timeout)
2349  rt_sigaction(SIGTSTP, {0x40664bea, [RT_1 RT_4 RT_5 RT_7 RT_11 RT_12 RT_15 RT_16 RT_18 RT_26], SA_RESTART}, NULL, 8) = 0
2349  futex(0x458e0858, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x458e0850, {FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT, 1}) = -1 ENOSYS (Function not implemented)
2349  futex(0x458e0858, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
2363  <... futex resumed> )             = 0
2363  futex(0x458e0850, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
2349  <... futex resumed> )             = 1
2349  futex(0x458e0850, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
2363  <... futex resumed> )             = 0
2363  futex(0x458e0850, FUTEX_WAKE_PRIVATE, 1) = 0
2363  futex(0x458e0820, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
2349  <... futex resumed> )             = 1
2349  futex(0x458e0820, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
2363  <... futex resumed> )             = 0
2363  futex(0x458e0820, FUTEX_WAKE_PRIVATE, 1) = 0
2363  rename("/var/lib/apt/lists/partial/ftp.ee.debian.org_debian_dists_unstable_main_source_Sources.diff_2011-02-02-0207.41.decomp", "/var/lib/apt/lists/ftp.ee.debian.org_debian_dists_unstable_main_source_Sources.ed") = 0
2363  stat64("/usr/lib/apt/methods/rred", {st_mode=0, st_size=0, ...}) = 0
2363  pipe([26, 28])                    = 0
2363  pipe([30, 31])                    = 0
2363  fcntl64(26, F_SETFD, FD_CLOEXEC)  = 0
2363  fcntl64(28, F_SETFD, FD_CLOEXEC)  = 0
2363  fcntl64(30, F_SETFD, FD_CLOEXEC)  = 0
2363  fcntl64(31, F_SETFD, FD_CLOEXEC)  = 0
2363  clone( <unfinished ...>
2349  <... futex resumed> )             = 1
2363  <... clone resumed> child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x460df4a8) = 2372
2349  --- SIGSEGV (Segmentation fault) @ 0 (0) ---
2372  rt_sigaction(SIGPIPE, {SIG_DFL, [], SA_RESTART},  <unfinished ...>
2349  write(1, "\33[56;1H\33[34h\33[?25h", 18 <unfinished ...>

Something futex-related. Full log temporarily available at
http://www.cs.ut.ee/~mroos/aptitude-strace.txt
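
To see what the crashing process did last, the interleaved strace -f log can
be filtered by PID. A minimal Python sketch, assuming the "PID  event" line
layout shown above; the sample is a small subset of the excerpt and the
function name is illustrative only:

```python
import re

# A few lines copied from the strace excerpt above.
SAMPLE = """\
2363  fcntl64(31, F_SETFD, FD_CLOEXEC)  = 0
2363  clone( <unfinished ...>
2349  <... futex resumed> )             = 1
2363  <... clone resumed> child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x460df4a8) = 2372
2349  --- SIGSEGV (Segmentation fault) @ 0 (0) ---
2372  rt_sigaction(SIGPIPE, {SIG_DFL, [], SA_RESTART},  <unfinished ...>
"""

LINE_RE = re.compile(r"^(\d+)\s+(.*)$")


def events_before_signal(log_text, signal="SIGSEGV", context=3):
    """Return (pid, last events of that pid) for the first `signal` in the log."""
    history = {}  # pid -> events seen so far for that pid
    for raw in log_text.splitlines():
        match = LINE_RE.match(raw)
        if not match:
            continue  # skip lines that do not start with a PID
        pid, event = int(match.group(1)), match.group(2)
        if event.startswith("--- " + signal):
            return pid, history.get(pid, [])[-context:]
        history.setdefault(pid, []).append(event)
    return None, []


pid, last_events = events_before_signal(SAMPLE)
print(pid, last_events)
```

On the sample, the process that receives SIGSEGV is 2349 and its last
recorded event is a resumed futex call, while the clone() belongs to the
other thread, 2363.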

gdb does not seem to work well:

root@hernes:~# gdb aptitude
GNU gdb (GDB) 7.2-debian
Copyright (C) 2010 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "hppa-linux-gnu".
For bug reporting instructions, please see:<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /usr/bin/aptitude...(no debugging symbols found)...done.
(gdb) run
Starting program: /usr/bin/aptitude
[Thread debugging using libthread_db enabled]
warning: Can't attach LWP 1075813436: No such process
/tmp/buildd/gdb-7.2/gdb/linux-thread-db.c:392: internal-error: thread_get_info_callback: Assertion `inout->thread_info != NULL' failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.

-- 
Meelis Roos (mroos@linux.ee)

* Re: 2.6.38-rc3 regression on parisc: segfaults
  2011-02-04 10:11   ` Meelis Roos
@ 2011-02-04 15:07     ` John David Anglin
  2011-02-04 15:20         ` Carlos O'Donell
  0 siblings, 1 reply; 13+ messages in thread
From: John David Anglin @ 2011-02-04 15:07 UTC
  To: Meelis Roos; +Cc: linux-parisc, linux-kernel

On Fri, 04 Feb 2011, Meelis Roos wrote:

> 2363  clone( <unfinished ...>
> 2349  <... futex resumed> )             = 1
> 2363  <... clone resumed> child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x460df4a8) = 2372
> 2349  --- SIGSEGV (Segmentation fault) @ 0 (0) ---
> 2372  rt_sigaction(SIGPIPE, {SIG_DFL, [], SA_RESTART},  <unfinished ...>
> 2349  write(1, "\33[56;1H\33[34h\33[?25h", 18 <unfinished ...>
> 
> Something futex-related. Full log temporarily available at
> http://www.cs.ut.ee/~mroos/aptitude-strace.txt

This is possibly the infamous COW bug.

> gdb does not seem to work well:

I think the segv is in the dynamic loader.  Try running gdb on the dynamic
loader with aptitude as the run argument.  I also suggest adding
/usr/lib/debug to LD_LIBRARY_PATH.

Dave
-- 
J. David Anglin                                  dave.anglin@nrc-cnrc.gc.ca
National Research Council of Canada              (613) 990-0752 (FAX: 952-6602)

* Re: 2.6.38-rc3 regression on parisc: segfaults
@ 2011-02-04 15:20         ` Carlos O'Donell
  0 siblings, 0 replies; 13+ messages in thread
From: Carlos O'Donell @ 2011-02-04 15:20 UTC
  To: John David Anglin; +Cc: Meelis Roos, linux-parisc, linux-kernel

On Fri, Feb 4, 2011 at 10:07 AM, John David Anglin
<dave@hiauly1.hia.nrc.ca> wrote:
> On Fri, 04 Feb 2011, Meelis Roos wrote:
>
>> 2363  clone( <unfinished ...>
>> 2349  <... futex resumed> )             = 1
>> 2363  <... clone resumed> child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x460df4a8) = 2372
>> 2349  --- SIGSEGV (Segmentation fault) @ 0 (0) ---
>> 2372  rt_sigaction(SIGPIPE, {SIG_DFL, [], SA_RESTART},  <unfinished ...>
>> 2349  write(1, "\33[56;1H\33[34h\33[?25h", 18 <unfinished ...>
>>
>> Something futex-related. Full log temporarily available at
>> http://www.cs.ut.ee/~mroos/aptitude-strace.txt
>
> This is possibly the infamous COW bug.

The COW bug that is triggered by a COW from an LWS-CAS? The solution
to which is to use locks around the LWS-CAS even on UP? I'd forgotten
about this issue actually, I should push that patch out to James.

Cheers,
Carlos.

* Re: 2.6.38-rc3 regression on parisc: segfaults
  2011-02-04 15:20         ` Carlos O'Donell
  (?)
@ 2011-02-04 16:17         ` John David Anglin
  -1 siblings, 0 replies; 13+ messages in thread
From: John David Anglin @ 2011-02-04 16:17 UTC
  To: Carlos O'Donell; +Cc: dave.anglin, mroos, linux-parisc, linux-kernel

> > This is possibly the infamous COW bug.
> 
> The COW bug that is triggered by a COW from an LWS-CAS? The solution
> to which is to use locks around the LWS-CAS even on UP? I'd forgotten
> about this issue actually, I should push that patch out to James.

I was actually thinking of the fork/clone race, for which there are various
testcases on the wiki.  The aptitude segv is with an SMP kernel, so I don't
think this is the UP LWS-CAS issue.  However, I agree you should push
the change.

Dave
-- 
J. David Anglin                                  dave.anglin@nrc-cnrc.gc.ca
National Research Council of Canada              (613) 990-0752 (FAX: 952-6602)
