All of lore.kernel.org
 help / color / mirror / Atom feed
* 2.6.36 regression : swap support broken
@ 2010-12-12 18:38 Guy Martin
  2010-12-12 19:21 ` John David Anglin
  2010-12-23  2:19 ` John David Anglin
  0 siblings, 2 replies; 10+ messages in thread
From: Guy Martin @ 2010-12-12 18:38 UTC (permalink / raw)
  To: linux-parisc; +Cc: eike-kernel


Hi all,


Recently, swap got broken. When it has to be used, the kernel issue the
following line numerous times :
swap_free: Bad swap offset entry 00100009

It eventually panic with useless backtrace. Note that the offset entry
is always the same number.

This can be reproduced easily by booting with mem=64m and start
compiling something like the kernel.


I've been bisecting this and I found out it has been introduced in
4e60c86bd9e5a7110ed28874d0b6592186550ae8 and to be precise, the
following change in include/linux/mmdebug.h :
-#define VM_BUG_ON(cond) do { } while (0)
+#define VM_BUG_ON(cond) do { (void)(cond); } while (0)


While the above change looks harmless, it introduces a slight change in
mm/swap_state.c:154 in add_to_swap() : VM_BUG_ON(!PageUptodate(page));
If you comment out that line, the problem is gone.

PageUptodate() will call smp_rmb() being the memory barrier instruction.

Those tests were done with linux-2.6.36 and gcc-4.4.4.
Having the feeling that this might have been a gcc bug, I recompiled
the whole kernel with gcc-4.2.4 and the problem was gone.

Looking at the assembly output for add_to_swap(), there is a slight
difference.

- gcc-4.4.4 output :

add_to_swap:
        .PROC
        .CALLINFO FRAME=64,CALLS,SAVE_RP,ENTRY_GR=4
        .ENTRY
        stw %r2,-20(%r30)
        stw,ma %r4,64(%r30)
        stw %r3,-60(%r30)
        ldw 0(%r26),%r28
        copy %r26,%r3
        ldw 0(%r26),%r28
        bb,>=,n %r28,28,.L61
.L61:   
        b,l get_swap_page,%r2
        ...


- gcc-4.2.4 output :

add_to_swap:
        .PROC
        .CALLINFO FRAME=64,CALLS,SAVE_RP,ENTRY_GR=4
        .ENTRY
        stw %r2,-20(%r30)
        stw,ma %r4,64(%r30)
        stw %r3,-60(%r30)
        ldw 0(%r26),%r19
        copy %r26,%r3
        ldw 0(%r26),%r28
        extrw,s,>= %r28,28,1,%r0
.L76:   
        b,l get_swap_page,%r2
        ...

As you can see, gcc-4.2 uses extrw while gcc-4.4 uses bb.

What's the next step in order to troubleshoot this ?
Is this really a gcc bug or is the problem elsewhere ?



HTH,
  Guy



^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: 2.6.36 regression : swap support broken
  2010-12-12 18:38 2.6.36 regression : swap support broken Guy Martin
@ 2010-12-12 19:21 ` John David Anglin
       [not found]   ` <20101212203322.21537607@borg.bxl.tuxicoman.be>
  2010-12-23  2:19 ` John David Anglin
  1 sibling, 1 reply; 10+ messages in thread
From: John David Anglin @ 2010-12-12 19:21 UTC (permalink / raw)
  To: Guy Martin; +Cc: linux-parisc, eike-kernel

> I've been bisecting this and I found out it has been introduced in
> 4e60c86bd9e5a7110ed28874d0b6592186550ae8 and to be precise, the
> following change in include/linux/mmdebug.h :
> -#define VM_BUG_ON(cond) do { } while (0)
> +#define VM_BUG_ON(cond) do { (void)(cond); } while (0)
> 
> 
> While the above change looks harmless, it introduces a slight change in
> mm/swap_state.c:154 in add_to_swap() : VM_BUG_ON(!PageUptodate(page));
> If you comment out that line, the problem is gone.
> 
> PageUptodate() will call smp_rmb() being the memory barrier instruction.
> 
> Those tests were done with linux-2.6.36 and gcc-4.4.4.
> Having the feeling that this might have been a gcc bug, I recompiled
> the whole kernel with gcc-4.2.4 and the problem was gone.
> 
> Looking at the assembly output for add_to_swap(), there is a slight
> difference.
> 
> - gcc-4.4.4 output :
> 
> add_to_swap:
>         .PROC
>         .CALLINFO FRAME=64,CALLS,SAVE_RP,ENTRY_GR=4
>         .ENTRY
>         stw %r2,-20(%r30)
>         stw,ma %r4,64(%r30)
>         stw %r3,-60(%r30)
>         ldw 0(%r26),%r28
>         copy %r26,%r3
>         ldw 0(%r26),%r28
>         bb,>=,n %r28,28,.L61
> .L61:   
>         b,l get_swap_page,%r2
>         ...

Hmmm, I didn't think we allowed call instructions in the delay slot of
a branch.  Even though nullified, I think there were issues on some
processors.  Is an asm involved?

> - gcc-4.2.4 output :
> 
> add_to_swap:
>         .PROC
>         .CALLINFO FRAME=64,CALLS,SAVE_RP,ENTRY_GR=4
>         .ENTRY
>         stw %r2,-20(%r30)
>         stw,ma %r4,64(%r30)
>         stw %r3,-60(%r30)
>         ldw 0(%r26),%r19
>         copy %r26,%r3
>         ldw 0(%r26),%r28
>         extrw,s,>= %r28,28,1,%r0
> .L76:   
>         b,l get_swap_page,%r2
>         ...

This is almost certainly wrong as it will skip the call when bit 28 is zero.

Would you please send preprocessed source?

Dave
-- 
J. David Anglin                                  dave.anglin@nrc-cnrc.gc.ca
National Research Council of Canada              (613) 990-0752 (FAX: 952-6602)

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: 2.6.36 regression : swap support broken
       [not found]   ` <20101212203322.21537607@borg.bxl.tuxicoman.be>
@ 2010-12-12 20:22     ` John David Anglin
  2010-12-12 22:52       ` John David Anglin
  0 siblings, 1 reply; 10+ messages in thread
From: John David Anglin @ 2010-12-12 20:22 UTC (permalink / raw)
  To: Guy Martin; +Cc: linux-parisc, eike-kernel

On Sun, 12 Dec 2010, Guy Martin wrote:

> On Sun, 12 Dec 2010 14:21:47 -0500 (EST)
> "John David Anglin" <dave@hiauly1.hia.nrc.ca> wrote:
> 
> > Hmmm, I didn't think we allowed call instructions in the delay slot of
> > a branch.  Even though nullified, I think there were issues on some
> > processors.  Is an asm involved?
> 
> I forgot to say that I've been testing this on my C3600.
> What do you mean by "Is an asm involved?" ?

Internally, the PA backend keeps track of "insn" types using an attribute
mechanism.  This is what keeps calls out of the delay slot of branches, etc.

asms can do stuff that GCC doesn't know about and this can cause
wrong code issues.  I was just speculating.

Dave
-- 
J. David Anglin                                  dave.anglin@nrc-cnrc.gc.ca
National Research Council of Canada              (613) 990-0752 (FAX: 952-6602)

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: 2.6.36 regression : swap support broken
  2010-12-12 20:22     ` John David Anglin
@ 2010-12-12 22:52       ` John David Anglin
  2010-12-12 23:55         ` John David Anglin
  0 siblings, 1 reply; 10+ messages in thread
From: John David Anglin @ 2010-12-12 22:52 UTC (permalink / raw)
  To: Guy Martin; +Cc: linux-parisc, eike-kernel

On Sun, 12 Dec 2010, John David Anglin wrote:

> On Sun, 12 Dec 2010, Guy Martin wrote:
> 
> > On Sun, 12 Dec 2010 14:21:47 -0500 (EST)
> > "John David Anglin" <dave@hiauly1.hia.nrc.ca> wrote:
> > 
> > > Hmmm, I didn't think we allowed call instructions in the delay slot of
> > > a branch.  Even though nullified, I think there were issues on some
> > > processors.  Is an asm involved?
> > 
> > I forgot to say that I've been testing this on my C3600.
> > What do you mean by "Is an asm involved?" ?
> 
> Internally, the PA backend keeps track of "insn" types using an attribute
> mechanism.  This is what keeps calls out of the delay slot of branches, etc.
> 
> asms can do stuff that GCC doesn't know about and this can cause
> wrong code issues.  I was just speculating.

This is a GCC PA bug.  The code generation is wrong for this RTL:

(jump_insn 13 10 14 (set (pc)
        (if_then_else (eq (zero_extract:SI (reg:SI 28 %r28 [orig:100 D.22506 ] [100])
		    (const_int 1 [0x1])
		    (const_int 28 [0x1c]))
		(const_int 0 [0]))
	    (label_ref 16)
	    (pc))) include/linux/page-flags.h:297 29 {*pa.md:1610}
     (expr_list:REG_BR_PRED (const_int 5 [0x5])
        (expr_list:REG_DEAD (reg:SI 28 %r28 [orig:100 D.22506 ] [100])
	    (expr_list:REG_BR_PROB (const_int 5000 [0x1388])
		(nil))))
 -> 16)

(note 14 13 15 [bb 3] NOTE_INSN_BASIC_BLOCK)

(insn:TI 15 14 16 (parallel [
            (asm_operands/v ("") ("") 0 []
		 []
		 [] mm/swap_state.c:185)
	    (clobber (mem:BLK (scratch) [0 A8]))
	]) include/linux/page-flags.h:298 -1
    (nil))

(code_label 16 15 17 15 "" [1 uses])

(note 17 16 19 [bb 4] NOTE_INSN_BASIC_BLOCK)

(note 19 17 107 NOTE_INSN_DELETED)

(insn 107 19 122 (sequence [
            (call_insn:TI 18 19 4 (parallel [
			(set (reg:SI 28 %r28)
			    (call (mem:SI (symbol_ref/v:SI ("@get_swap_page") [flags 0x41]  <function_decl 0x40ae6280 get_swap_page>) [0 S4 A32])
...

The output_bb code used for jump_insn 13 is confused by the memory
clobber asm.  There should be a nop after the bb instruction as a
conditional branch to the following instruction is asking for disaster.
Some situations where a nop is needed are already covered, but you
have found a new one.

I created a GCC PR:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46915

This may affect other conditional branches...

Dave
-- 
J. David Anglin                                  dave.anglin@nrc-cnrc.gc.ca
National Research Council of Canada              (613) 990-0752 (FAX: 952-6602)

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: 2.6.36 regression : swap support broken
  2010-12-12 22:52       ` John David Anglin
@ 2010-12-12 23:55         ` John David Anglin
  2010-12-13  0:03           ` John David Anglin
  0 siblings, 1 reply; 10+ messages in thread
From: John David Anglin @ 2010-12-12 23:55 UTC (permalink / raw)
  To: dave.anglin; +Cc: gmsoft, linux-parisc, eike-kernel

> (insn:TI 15 14 16 (parallel [
>             (asm_operands/v ("") ("") 0 []
> 		 []
> 		 [] mm/swap_state.c:185)
> 	    (clobber (mem:BLK (scratch) [0 A8]))
> 	]) include/linux/page-flags.h:298 -1
>     (nil))

I think the way to avoid this in the kernel is to stick a "nop" in
the memory barrier.  That will avoid a zero length asm and branching
into the delay slot.

Dave
-- 
J. David Anglin                                  dave.anglin@nrc-cnrc.gc.ca
National Research Council of Canada              (613) 990-0752 (FAX: 952-6602)

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: 2.6.36 regression : swap support broken
  2010-12-12 23:55         ` John David Anglin
@ 2010-12-13  0:03           ` John David Anglin
  2010-12-13 12:52             ` Carlos O'Donell
  0 siblings, 1 reply; 10+ messages in thread
From: John David Anglin @ 2010-12-13  0:03 UTC (permalink / raw)
  To: John David Anglin; +Cc: dave.anglin, gmsoft, linux-parisc, eike-kernel

> > (insn:TI 15 14 16 (parallel [
> >             (asm_operands/v ("") ("") 0 []
> > 		 []
> > 		 [] mm/swap_state.c:185)
> > 	    (clobber (mem:BLK (scratch) [0 A8]))
> > 	]) include/linux/page-flags.h:298 -1
> >     (nil))
> 
> I think the way to avoid this in the kernel is to stick a "nop" in
> the memory barrier.  That will avoid a zero length asm and branching
> into the delay slot.

A further reason is GCC does not know the length of asm insns.  So,
using a zero length asm on parisc is asking for trouble.

Dave
-- 
J. David Anglin                                  dave.anglin@nrc-cnrc.gc.ca
National Research Council of Canada              (613) 990-0752 (FAX: 952-6602)

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: 2.6.36 regression : swap support broken
  2010-12-13  0:03           ` John David Anglin
@ 2010-12-13 12:52             ` Carlos O'Donell
  2010-12-13 13:06               ` John David Anglin
  0 siblings, 1 reply; 10+ messages in thread
From: Carlos O'Donell @ 2010-12-13 12:52 UTC (permalink / raw)
  To: John David Anglin; +Cc: dave.anglin, gmsoft, linux-parisc, eike-kernel

On Sun, Dec 12, 2010 at 7:03 PM, John David Anglin
<dave@hiauly1.hia.nrc.ca> wrote:
>> > (insn:TI 15 14 16 (parallel [
>> > =A0 =A0 =A0 =A0 =A0 =A0 (asm_operands/v ("") ("") 0 []
>> > =A0 =A0 =A0 =A0 =A0 =A0 =A0[]
>> > =A0 =A0 =A0 =A0 =A0 =A0 =A0[] mm/swap_state.c:185)
>> > =A0 =A0 =A0 =A0 (clobber (mem:BLK (scratch) [0 A8]))
>> > =A0 =A0 ]) include/linux/page-flags.h:298 -1
>> > =A0 =A0 (nil))
>>
>> I think the way to avoid this in the kernel is to stick a "nop" in
>> the memory barrier. =A0That will avoid a zero length asm and branchi=
ng
>> into the delay slot.
>
> A further reason is GCC does not know the length of asm insns. =A0So,
> using a zero length asm on parisc is asking for trouble.

The problem is that many applications (including glibc IIRC) expect a
zero length asm volatile to work correctly. Althought it's asking for
trouble, I think it's one of these expected things that just has to
work :-)

Cheers,
Carlos.
--
To unsubscribe from this list: send the line "unsubscribe linux-parisc"=
 in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: 2.6.36 regression : swap support broken
  2010-12-13 12:52             ` Carlos O'Donell
@ 2010-12-13 13:06               ` John David Anglin
  2010-12-13 14:47                 ` Carlos O'Donell
  0 siblings, 1 reply; 10+ messages in thread
From: John David Anglin @ 2010-12-13 13:06 UTC (permalink / raw)
  To: Carlos O'Donell; +Cc: dave.anglin, gmsoft, linux-parisc, eike-kernel

> >> I think the way to avoid this in the kernel is to stick a "nop" in
> >> the memory barrier. =A0That will avoid a zero length asm and branching
> >> into the delay slot.

This adds about 17000 odd nops ;(

> > A further reason is GCC does not know the length of asm insns. =A0So,
> > using a zero length asm on parisc is asking for trouble.
> 
> The problem is that many applications (including glibc IIRC) expect a
> zero length asm volatile to work correctly. Althought it's asking for
> trouble, I think it's one of these expected things that just has to
> work :-)

Most targets don't have to deal with delay branch sequences and the
fact that the hardware can't deal with a branch to the delay slot.

As I have learned this morning, the length calculated for asms is just
an estimate.  It can be confused by comments, etc.  So, gcc is going to
have to add a nop when it finds an asm after a branch because the size
estimate is unreliable.

I had tried to fix this before but missed the asm_operands case.  The
branch type selection (useskip) also needs to check for asms ;(

Dave
-- 
J. David Anglin                                  dave.anglin@nrc-cnrc.gc.ca
National Research Council of Canada              (613) 990-0752 (FAX: 952-6602)

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: 2.6.36 regression : swap support broken
  2010-12-13 13:06               ` John David Anglin
@ 2010-12-13 14:47                 ` Carlos O'Donell
  0 siblings, 0 replies; 10+ messages in thread
From: Carlos O'Donell @ 2010-12-13 14:47 UTC (permalink / raw)
  To: John David Anglin; +Cc: dave.anglin, gmsoft, linux-parisc, eike-kernel

On Mon, Dec 13, 2010 at 8:06 AM, John David Anglin
<dave@hiauly1.hia.nrc.ca> wrote:
>> >> I think the way to avoid this in the kernel is to stick a "nop" i=
n
>> >> the memory barrier. =3DA0That will avoid a zero length asm and br=
anching
>> >> into the delay slot.
>
> This adds about 17000 odd nops ;(
>
>> > A further reason is GCC does not know the length of asm insns. =3D=
A0So,
>> > using a zero length asm on parisc is asking for trouble.
>>
>> The problem is that many applications (including glibc IIRC) expect =
a
>> zero length asm volatile to work correctly. Althought it's asking fo=
r
>> trouble, I think it's one of these expected things that just has to
>> work :-)
>
> Most targets don't have to deal with delay branch sequences and the
> fact that the hardware can't deal with a branch to the delay slot.
>
> As I have learned this morning, the length calculated for asms is jus=
t
> an estimate. =A0It can be confused by comments, etc. =A0So, gcc is go=
ing to
> have to add a nop when it finds an asm after a branch because the siz=
e
> estimate is unreliable.
>
> I had tried to fix this before but missed the asm_operands case. =A0T=
he
> branch type selection (useskip) also needs to check for asms ;(

=46or the record I have noticed at least 3 failures in branch delay
slots over the past few years, all fixed by -fno-dbr (sp?), so it's
quite possible these were the same problem?

Cheers,
Carlos.
--
To unsubscribe from this list: send the line "unsubscribe linux-parisc"=
 in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: 2.6.36 regression : swap support broken
  2010-12-12 18:38 2.6.36 regression : swap support broken Guy Martin
  2010-12-12 19:21 ` John David Anglin
@ 2010-12-23  2:19 ` John David Anglin
  1 sibling, 0 replies; 10+ messages in thread
From: John David Anglin @ 2010-12-23  2:19 UTC (permalink / raw)
  To: Guy Martin; +Cc: linux-parisc, eike-kernel

On Sun, 12 Dec 2010, Guy Martin wrote:

> Recently, swap got broken. When it has to be used, the kernel issue the
> following line numerous times :
> swap_free: Bad swap offset entry 00100009

The incorrect handling of conditional branches followed by an (empty) asm
is now fixed in the gcc tree on all active branches (4.3 through to head).

The 4.2 branch is closed.  So, I wouldn't recommend using it or any earlier
branch of gcc for kernel builds.

Dave
-- 
J. David Anglin                                  dave.anglin@nrc-cnrc.gc.ca
National Research Council of Canada              (613) 990-0752 (FAX: 952-6602)

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2010-12-23  2:19 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-12-12 18:38 2.6.36 regression : swap support broken Guy Martin
2010-12-12 19:21 ` John David Anglin
     [not found]   ` <20101212203322.21537607@borg.bxl.tuxicoman.be>
2010-12-12 20:22     ` John David Anglin
2010-12-12 22:52       ` John David Anglin
2010-12-12 23:55         ` John David Anglin
2010-12-13  0:03           ` John David Anglin
2010-12-13 12:52             ` Carlos O'Donell
2010-12-13 13:06               ` John David Anglin
2010-12-13 14:47                 ` Carlos O'Donell
2010-12-23  2:19 ` John David Anglin

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.