All of lore.kernel.org
 help / color / mirror / Atom feed
* Re: SMTC support status in latest git head.
@ 2010-12-16 15:37 ` STUART VENTERS
  0 siblings, 0 replies; 68+ messages in thread
From: STUART VENTERS @ 2010-12-16 15:37 UTC (permalink / raw)
  To: kevink, anoop.pa; +Cc: linux-mips, Anoop_P.A

[-- Attachment #1: Type: text/plain, Size: 347 bytes --]

Two other possible clues:

The EVP is clear in the MVPControl register.
   Does this say that only VPE0, T0 gets to run?

Also the EXCPT bits in VPEControl for VPE1 indicate a Gating Storage Exception dispatch.
   But that seems to conflict the EVP bit above.

Perhaps these are an artifact of getting to a good state to dump things out.

[-- Attachment #2: Type: text/html, Size: 966 bytes --]

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: SMTC support status in latest git head.
@ 2010-12-16 15:37 ` STUART VENTERS
  0 siblings, 0 replies; 68+ messages in thread
From: STUART VENTERS @ 2010-12-16 15:37 UTC (permalink / raw)
  To: kevink, anoop.pa; +Cc: linux-mips, Anoop_P.A

[-- Attachment #1: Type: text/plain, Size: 347 bytes --]

Two other possible clues:

The EVP is clear in the MVPControl register.
   Does this say that only VPE0, T0 gets to run?

Also the EXCPT bits in VPEControl for VPE1 indicate a Gating Storage Exception dispatch.
   But that seems to conflict the EVP bit above.

Perhaps these are an artifact of getting to a good state to dump things out.

[-- Attachment #2: Type: text/html, Size: 966 bytes --]

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: SMTC support status in latest git head.
       [not found] ` <4D0A677C.6040104@paralogos.com>
@ 2010-12-16 19:58   ` Kevin D. Kissell
  2010-12-17 21:35     ` Kevin D. Kissell
  0 siblings, 1 reply; 68+ messages in thread
From: Kevin D. Kissell @ 2010-12-16 19:58 UTC (permalink / raw)
  To: STUART VENTERS; +Cc: anoop.pa, linux-mips, Anoop_P.A

Ralf tells me that this message got blocked by the LMO server due to 
HTML content.
So here it is again, textier.

On 12/16/10 11:24, Kevin D. Kissell wrote:
 > On 12/16/10 07:37, STUART VENTERS wrote:
 >
 > Two other possible clues:
 >
 > The EVP is clear in the MVPControl register.
 > Does this say that only VPE0, T0 gets to run?

That's correct.  In the maxtcs=1/maxvpes=1 boot state, it wouldn't 
matter.  It's just possible that setting EVP is conditional on more than 
one VPE being used, but that's not the way I remember it.

 > Also the EXCPT bits in VPEControl for VPE1 indicate a Gating Storage 
Exception dispatch.
 > But that seems to conflict the EVP bit above.

I don't have a copy of the ASE spec handy to see whether those bits have 
a defined power-on value, but particularly if maxvpes=1 was set at boot 
time, I would expect VPE1's registers to be in a partly random power-up 
state.

 > Perhaps these are an artifact of getting to a good state to dump 
things out.

As per my previous mail, I looked at the MT register dump source, and it 
really does pull values directly
out of registers and doesn't depend on having a sane kernel stack 
frame.  The exceptions to that rule
are the reported values for TCStatus of the executing TC, which is based 
on the perhaps-now-broken
assumption that local_irq_save(flags) stores the *entire* pre-invocation 
value of the TCStatus register
in the flags variable, and MVPcontrol, which is based on the assumption 
that dvpe() returns the pre-invocation
value of MVPcontrol.  Break those assumptions, and you'll get 
inconsistent state dumps like this,
and very possibly incorrect execution.   Particularly if what was done 
was that effectively replaces
the SMTC-specific implementation of local_irq_save()/local_irq_restore() 
with something that uses
the generic MIPS32R2 atomic interrupt enable/disable instructions.  That 
would have been a *very* bad idea...

              Regards,

              Kevin K.

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: SMTC support status in latest git head.
  2010-12-16 19:58   ` Kevin D. Kissell
@ 2010-12-17 21:35     ` Kevin D. Kissell
  2010-12-20 10:44       ` Anoop P A
  0 siblings, 1 reply; 68+ messages in thread
From: Kevin D. Kissell @ 2010-12-17 21:35 UTC (permalink / raw)
  To: anoop.pa; +Cc: STUART VENTERS, linux-mips, Anoop_P.A

So, Anoop, if you get a minute for this any time in the next day or so 
(after which I'll have very limited net access until next year), could 
you please do an <mumble>-mips<mumble>-objdump --disassemble of your 
kernel image (or even just the mips-mt.o module) from a failing kernel 
build and post the disassembly of mips_mt_regdump()?  The confirmation 
or refutation of the theory about local_irq_save() no longer being built 
correctly for SMTC would be within the first few instructions...

/K.


On 12/16/10 11:58, Kevin D. Kissell wrote:
> Ralf tells me that this message got blocked by the LMO server due to 
> HTML content.
> So here it is again, textier.
>
> On 12/16/10 11:24, Kevin D. Kissell wrote:
> > On 12/16/10 07:37, STUART VENTERS wrote:
> >
> > Two other possible clues:
> >
> > The EVP is clear in the MVPControl register.
> > Does this say that only VPE0, T0 gets to run?
>
> That's correct.  In the maxtcs=1/maxvpes=1 boot state, it wouldn't 
> matter.  It's just possible that setting EVP is conditional on more 
> than one VPE being used, but that's not the way I remember it.
>
> > Also the EXCPT bits in VPEControl for VPE1 indicate a Gating Storage 
> Exception dispatch.
> > But that seems to conflict the EVP bit above.
>
> I don't have a copy of the ASE spec handy to see whether those bits 
> have a defined power-on value, but particularly if maxvpes=1 was set 
> at boot time, I would expect VPE1's registers to be in a partly random 
> power-up state.
>
> > Perhaps these are an artifact of getting to a good state to dump 
> things out.
>
> As per my previous mail, I looked at the MT register dump source, and 
> it really does pull values directly
> out of registers and doesn't depend on having a sane kernel stack 
> frame.  The exceptions to that rule
> are the reported values for TCStatus of the executing TC, which is 
> based on the perhaps-now-broken
> assumption that local_irq_save(flags) stores the *entire* 
> pre-invocation value of the TCStatus register
> in the flags variable, and MVPcontrol, which is based on the 
> assumption that dvpe() returns the pre-invocation
> value of MVPcontrol.  Break those assumptions, and you'll get 
> inconsistent state dumps like this,
> and very possibly incorrect execution.   Particularly if what was done 
> was that effectively replaces
> the SMTC-specific implementation of 
> local_irq_save()/local_irq_restore() with something that uses
> the generic MIPS32R2 atomic interrupt enable/disable instructions.  
> That would have been a *very* bad idea...
>
>              Regards,
>
>              Kevin K.
>
>

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: SMTC support status in latest git head.
  2010-12-17 21:35     ` Kevin D. Kissell
@ 2010-12-20 10:44       ` Anoop P A
       [not found]         ` <4D10F7A9.1020306@paralogos.com>
  0 siblings, 1 reply; 68+ messages in thread
From: Anoop P A @ 2010-12-20 10:44 UTC (permalink / raw)
  To: Kevin D. Kissell; +Cc: STUART VENTERS, linux-mips, Anoop_P.A

Hi Kevin,

Please find disassembly  for mips_mt_reg_dump

Thanks
Anoop

Disassembly of section .text:

00000000 <mips_mt_regdump>:
  0:   27bdffb8        addiu   sp,sp,-72
  4:   00802821        move    a1,a0
  8:   afbf0044        sw      ra,68(sp)
  c:   afbe0040        sw      s8,64(sp)
 10:   afb7003c        sw      s7,60(sp)
 14:   afb60038        sw      s6,56(sp)
 18:   afb50034        sw      s5,52(sp)
 1c:   afb40030        sw      s4,48(sp)
 20:   afb3002c        sw      s3,44(sp)
 24:   afb20028        sw      s2,40(sp)
 28:   afb10024        sw      s1,36(sp)
 2c:   afb00020        sw      s0,32(sp)
 30:   40141001        mfc0    s4,c0_tcstatus
 34:   36810400        ori     at,s4,0x400
 38:   40811001        mtc0    at,c0_tcstatus
 3c:   32940400        andi    s4,s4,0x400
 40:   000000c0        ehb
 44:   41610001        dvpe    at
 48:   0020a821        move    s5,at
 4c:   000000c0        ehb
 50:   3c020000        lui     v0,0x0
 54:   24420060        addiu   v0,v0,96
 58:   00400408        jr.hb   v0
 5c:   00000000        nop
 60:   3c040000        lui     a0,0x0
 64:   24840000        addiu   a0,a0,0
 68:   0c000000        jal     0 <mips_mt_regdump>
 6c:   afa50010        sw      a1,16(sp)
 70:   3c040000        lui     a0,0x0
 74:   0c000000        jal     0 <mips_mt_regdump>
 78:   24840000        addiu   a0,a0,0
 7c:   8fa50010        lw      a1,16(sp)
 80:   3c040000        lui     a0,0x0
 84:   0c000000        jal     0 <mips_mt_regdump>
 88:   24840000        addiu   a0,a0,0
 8c:   3c040000        lui     a0,0x0
 90:   24840000        addiu   a0,a0,0
 94:   0c000000        jal     0 <mips_mt_regdump>
 98:   02a02821        move    a1,s5
 9c:   40110002        mfc0    s1,c0_mvpconf0
 a0:   3c040000        lui     a0,0x0
 a4:   02202821        move    a1,s1
 a8:   0c000000        jal     0 <mips_mt_regdump>
 ac:   24840000        addiu   a0,a0,0
 b0:   3c040000        lui     a0,0x0
 b4:   0c000000        jal     0 <mips_mt_regdump>
 b8:   24840000        addiu   a0,a0,0
 bc:   7e331a80        ext     s3,s1,0xa,0x4
 c0:   3c090000        lui     t1,0x0
 c4:   323100ff        andi    s1,s1,0xff
 c8:   3c080000        lui     t0,0x0
 cc:   3c030000        lui     v1,0x0
 d0:   3c1e0000        lui     s8,0x0
 d4:   3c170000        lui     s7,0x0
 d8:   3c160000        lui     s6,0x0
 dc:   3c0a0000        lui     t2,0x0
 e0:   26730001        addiu   s3,s3,1
 e4:   26310001        addiu   s1,s1,1
 e8:   00008021        move    s0,zero
 ec:   2412ff00        li      s2,-256
 f0:   25290000        addiu   t1,t1,0
 f4:   25080000        addiu   t0,t0,0
 f8:   24630000        addiu   v1,v1,0
 fc:   27de0000        addiu   s8,s8,0
 100:   26f70000        addiu   s7,s7,0
 104:   26d60000        addiu   s6,s6,0
 108:   254a0000        addiu   t2,t2,0
 10c:   00001021        move    v0,zero
 110:   40040801        mfc0    a0,c0_vpecontrol
 114:   00922024        and     a0,a0,s2
 118:   00442025        or      a0,v0,a0
 11c:   40840801        mtc0    a0,c0_vpecontrol
 120:   000000c0        ehb
 124:   41020802        mftc0   at,c0_tcbind
 128:   00202021        move    a0,at
 12c:   24420001        addiu   v0,v0,1
 130:   3084000f        andi    a0,a0,0xf
 134:   12040031        beq     s0,a0,1fc <mips_mt_regdump+0x1fc>
 138:   0051282a        slt     a1,v0,s1
 13c:   14a0fff4        bnez    a1,110 <mips_mt_regdump+0x110>
 140:   00000000        nop
 144:   26100001        addiu   s0,s0,1
 148:   0213102a        slt     v0,s0,s3
 14c:   1440fff0        bnez    v0,110 <mips_mt_regdump+0x110>
 150:   00001021        move    v0,zero
 154:   3c040000        lui     a0,0x0
 158:   24840000        addiu   a0,a0,0
 15c:   3c1e0000        lui     s8,0x0
 160:   3c170000        lui     s7,0x0
 164:   3c160000        lui     s6,0x0
 168:   3c130000        lui     s3,0x0
 16c:   0c000000        jal     0 <mips_mt_regdump>
 170:   3c120000        lui     s2,0x0
 174:   00008021        move    s0,zero
 178:   27de0000        addiu   s8,s8,0
 17c:   26f70000        addiu   s7,s7,0
 180:   26d60000        addiu   s6,s6,0
 184:   26730000        addiu   s3,s3,0
 188:   26520000        addiu   s2,s2,0
 18c:   40020801        mfc0    v0,c0_vpecontrol
 190:   2403ff00        li      v1,-256
 194:   00431024        and     v0,v0,v1
 198:   02021025        or      v0,s0,v0
 19c:   40820801        mtc0    v0,c0_vpecontrol
 1a0:   000000c0        ehb
 1a4:   41020802        mftc0   at,c0_tcbind
 1a8:   00201821        move    v1,at
 1ac:   40021002        mfc0    v0,c0_tcbind
 1b0:   1062003f        beq     v1,v0,2b0 <mips_mt_regdump+0x2b0>
 1b4:   00000000        nop
 1b8:   41020804        mftc0   at,c0_tchalt
 1bc:   00201821        move    v1,at
 1c0:   24020001        li      v0,1
 1c4:   00400821        move    at,v0
 1c8:   41811004        mttc0   at,c0_tchalt
 1cc:   41020801        mftc0   at,c0_tcstatus
 1d0:   00203021        move    a2,at
 1d4:   3c040000        lui     a0,0x0
 1d8:   02002821        move    a1,s0
 1dc:   24840000        addiu   a0,a0,0
 1e0:   afa3001c        sw      v1,28(sp)
 1e4:   0c000000        jal     0 <mips_mt_regdump>
 1e8:   afa60010        sw      a2,16(sp)
 1ec:   8fa60010        lw      a2,16(sp)
 1f0:   8fa3001c        lw      v1,28(sp)
 1f4:   080000b2        j       2c8 <mips_mt_regdump+0x2c8>
 1f8:   00c02821        move    a1,a2
 1fc:   01202021        move    a0,t1
 200:   02002821        move    a1,s0
 204:   afa3001c        sw      v1,28(sp)
 208:   afa80014        sw      t0,20(sp)
 20c:   afa90010        sw      t1,16(sp)
 210:   0c000000        jal     0 <mips_mt_regdump>
 214:   afaa0018        sw      t2,24(sp)
 218:   41010801        mftc0   at,c0_vpecontrol
 21c:   00202821        move    a1,at
 220:   8fa80014        lw      t0,20(sp)
 224:   0c000000        jal     0 <mips_mt_regdump>
 228:   01002021        move    a0,t0
 22c:   41010802        mftc0   at,c0_vpeconf0
 230:   00202821        move    a1,at
 234:   8fa3001c        lw      v1,28(sp)
 238:   0c000000        jal     0 <mips_mt_regdump>
 23c:   00602021        move    a0,v1
 240:   410c0800        mftc0   at,c0_status
 244:   00203021        move    a2,at
 248:   03c02021        move    a0,s8
 24c:   0c000000        jal     0 <mips_mt_regdump>
 250:   02002821        move    a1,s0
 254:   410e0800        mftc0   at,c0_epc
 258:   00203021        move    a2,at
 25c:   410e0800        mftc0   at,c0_epc
 260:   00203821        move    a3,at
 264:   02e02021        move    a0,s7
 268:   0c000000        jal     0 <mips_mt_regdump>
 26c:   02002821        move    a1,s0
 270:   410d0800        mftc0   at,c0_cause
 274:   00203021        move    a2,at
 278:   02c02021        move    a0,s6
 27c:   0c000000        jal     0 <mips_mt_regdump>
 280:   02002821        move    a1,s0
 284:   41100807        mftc0   at,$16,7
 288:   00203021        move    a2,at
 28c:   8faa0018        lw      t2,24(sp)
 290:   02002821        move    a1,s0
 294:   0c000000        jal     0 <mips_mt_regdump>
 298:   01402021        move    a0,t2
 29c:   8fa3001c        lw      v1,28(sp)
 2a0:   8fa80014        lw      t0,20(sp)
 2a4:   8fa90010        lw      t1,16(sp)
 2a8:   08000051        j       144 <mips_mt_regdump+0x144>
 2ac:   8faa0018        lw      t2,24(sp)
 2b0:   3c040000        lui     a0,0x0
 2b4:   02002821        move    a1,s0
 2b8:   0c000000        jal     0 <mips_mt_regdump>
 2bc:   24840000        addiu   a0,a0,0
 2c0:   00001821        move    v1,zero
 2c4:   02802821        move    a1,s4
 2c8:   03c02021        move    a0,s8
 2cc:   0c000000        jal     0 <mips_mt_regdump>
 2d0:   afa3001c        sw      v1,28(sp)
 2d4:   41020802        mftc0   at,c0_tcbind
 2d8:   00202821        move    a1,at
 2dc:   0c000000        jal     0 <mips_mt_regdump>
 2e0:   02e02021        move    a0,s7
 2e4:   41020803        mftc0   at,c0_tcrestart
 2e8:   00202821        move    a1,at
 2ec:   41020803        mftc0   at,c0_tcrestart
 2f0:   00203021        move    a2,at
 2f4:   0c000000        jal     0 <mips_mt_regdump>
 2f8:   02c02021        move    a0,s6
 2fc:   8fa3001c        lw      v1,28(sp)
 300:   02602021        move    a0,s3
 304:   0c000000        jal     0 <mips_mt_regdump>
 308:   00602821        move    a1,v1
 30c:   41020805        mftc0   at,c0_tccontext
 310:   00202821        move    a1,at
 314:   0c000000        jal     0 <mips_mt_regdump>
 318:   02402021        move    a0,s2
 31c:   8fa3001c        lw      v1,28(sp)
 320:   14600003        bnez    v1,330 <mips_mt_regdump+0x330>
 324:   00001021        move    v0,zero
 328:   00400821        move    at,v0
 32c:   41811004        mttc0   at,c0_tchalt
 330:   26100001        addiu   s0,s0,1
 334:   0211102a        slt     v0,s0,s1
 338:   1440ff94        bnez    v0,18c <mips_mt_regdump+0x18c>
 33c:   00000000        nop
 340:   0c000000        jal     0 <mips_mt_regdump>
 344:   32b50001        andi    s5,s5,0x1
 348:   3c040000        lui     a0,0x0
 34c:   0c000000        jal     0 <mips_mt_regdump>
 350:   24840000        addiu   a0,a0,0
 354:   12a00004        beqz    s5,368 <mips_mt_regdump+0x368>
 358:   32820400        andi    v0,s4,0x400
 35c:   41600021        evpe
 360:   000000c0        ehb
 364:   32820400        andi    v0,s4,0x400
 368:   14400003        bnez    v0,378 <mips_mt_regdump+0x378>
 36c:   00000000        nop
 370:   0c000000        jal     0 <mips_mt_regdump>
 374:   00000000        nop
 378:   40011001        mfc0    at,c0_tcstatus
 37c:   32940400        andi    s4,s4,0x400
 380:   34210400        ori     at,at,0x400
 384:   38210400        xori    at,at,0x400
 388:   0281a025        or      s4,s4,at
 38c:   40941001        mtc0    s4,c0_tcstatus
 390:   000000c0        ehb
 394:   8fbf0044        lw      ra,68(sp)
 398:   8fbe0040        lw      s8,64(sp)
 39c:   8fb7003c        lw      s7,60(sp)
 3a0:   8fb60038        lw      s6,56(sp)
 3a4:   8fb50034        lw      s5,52(sp)
 3a8:   8fb40030        lw      s4,48(sp)
 3ac:   8fb3002c        lw      s3,44(sp)
 3b0:   8fb20028        lw      s2,40(sp)
 3b4:   8fb10024        lw      s1,36(sp)
 3b8:   8fb00020        lw      s0,32(sp)
 3bc:   03e00008        jr      ra
 3c0:   27bd0048        addiu   sp,sp,72


On Sat, Dec 18, 2010 at 3:05 AM, Kevin D. Kissell <kevink@paralogos.com> wrote:
> So, Anoop, if you get a minute for this any time in the next day or so
> (after which I'll have very limited net access until next year), could you
> please do an <mumble>-mips<mumble>-objdump --disassemble of your kernel
> image (or even just the mips-mt.o module) from a failing kernel build and
> post the disassembly of mips_mt_regdump()?  The confirmation or refutation
> of the theory about local_irq_save() no longer being built correctly for
> SMTC would be within the first few instructions...
>
> /K.
>
>
> On 12/16/10 11:58, Kevin D. Kissell wrote:
>>
>> Ralf tells me that this message got blocked by the LMO server due to HTML
>> content.
>> So here it is again, textier.
>>
>> On 12/16/10 11:24, Kevin D. Kissell wrote:
>> > On 12/16/10 07:37, STUART VENTERS wrote:
>> >
>> > Two other possible clues:
>> >
>> > The EVP is clear in the MVPControl register.
>> > Does this say that only VPE0, T0 gets to run?
>>
>> That's correct.  In the maxtcs=1/maxvpes=1 boot state, it wouldn't matter.
>>  It's just possible that setting EVP is conditional on more than one VPE
>> being used, but that's not the way I remember it.
>>
>> > Also the EXCPT bits in VPEControl for VPE1 indicate a Gating Storage
>> > Exception dispatch.
>> > But that seems to conflict the EVP bit above.
>>
>> I don't have a copy of the ASE spec handy to see whether those bits have a
>> defined power-on value, but particularly if maxvpes=1 was set at boot time,
>> I would expect VPE1's registers to be in a partly random power-up state.
>>
>> > Perhaps these are an artifact of getting to a good state to dump things
>> > out.
>>
>> As per my previous mail, I looked at the MT register dump source, and it
>> really does pull values directly
>> out of registers and doesn't depend on having a sane kernel stack frame.
>>  The exceptions to that rule
>> are the reported values for TCStatus of the executing TC, which is based
>> on the perhaps-now-broken
>> assumption that local_irq_save(flags) stores the *entire* pre-invocation
>> value of the TCStatus register
>> in the flags variable, and MVPcontrol, which is based on the assumption
>> that dvpe() returns the pre-invocation
>> value of MVPcontrol.  Break those assumptions, and you'll get inconsistent
>> state dumps like this,
>> and very possibly incorrect execution.   Particularly if what was done was
>> that effectively replaces
>> the SMTC-specific implementation of local_irq_save()/local_irq_restore()
>> with something that uses
>> the generic MIPS32R2 atomic interrupt enable/disable instructions.  That
>> would have been a *very* bad idea...
>>
>>             Regards,
>>
>>             Kevin K.
>>
>>
>
>

^ permalink raw reply	[flat|nested] 68+ messages in thread

* RE: SMTC support status in latest git head.
@ 2010-12-21 20:06             ` Anoop P.A.
  0 siblings, 0 replies; 68+ messages in thread
From: Anoop P.A. @ 2010-12-21 20:06 UTC (permalink / raw)
  To: Kevin D. Kissell, Anoop P A; +Cc: STUART VENTERS, linux-mips


OK. I will check it.

BTW following patch is responsible for irq change.

http://git.linux-mips.org/?p=linux.git;a=commitdiff;h=df9ee29270c11dba7d0fe0b83ce47a4d8e8d2101

Thanks
Anoop
________________________________________
From: Kevin D. Kissell [mailto:kevink@paralogos.com] 
Sent: Wednesday, December 22, 2010 12:23 AM
To: Anoop P A
Cc: STUART VENTERS; linux-mips@linux-mips.org; Anoop P.A.
Subject: Re: SMTC support status in latest git head.

OK, I see why the MT register dump isn't giving us useful information.  It's not clear that it's at the root of your functional problems, though.  Apparently, somebody decided that it was unwholesome to propagate anything other than the previous interrupt enable state in the flags variable passed between irq_save() and irq_restore().  I agree philosophically, but it does break the MT register dump function.  And I'm quite sure that there were other bits of SMTC code that knew that it was a TCStatus value, at least in the earliest versions of the code.  I'm not a gitweb power user,  but I haven't been able to figure out how to determine when the "andi \\result 0x400" on or about line 138 of irqflags.h (at least that's where it is in the head of tree) was checked-in.  If it's at the boundary between working and non-working versions for SMTC, it might be the cause of the problems, but it may well not be responsible for anything other than the problem with reporting the value in the MT register dump - which really ought to be fixed.

I'm in a small village in France for the holidays with no git/build system at my disposal, but I think that if you were to tweak mips-mt.c at line 103 to change
the

        tcstatval = flags; /* And pre-dump TCStatus is flags */

        

        to something more like

        

        /* Pre-dump TCStatus Interrupt Inhibit bit is in flags variable
        */

        tcstatval = (read_c0_tcstatus() & ~0x400) | flags;

        

        should fix the dump.

            Regards,

            Kevin K.

On 12/20/10 2:44 AM, Anoop P A wrote: 
Hi Kevin,

Please find disassembly  for mips_mt_reg_dump

Thanks
Anoop

Disassembly of section .text:

00000000 <mips_mt_regdump>:
  0:   27bdffb8        addiu   sp,sp,-72
  4:   00802821        move    a1,a0
  8:   afbf0044        sw      ra,68(sp)
  c:   afbe0040        sw      s8,64(sp)
 10:   afb7003c        sw      s7,60(sp)
 14:   afb60038        sw      s6,56(sp)
 18:   afb50034        sw      s5,52(sp)
 1c:   afb40030        sw      s4,48(sp)
 20:   afb3002c        sw      s3,44(sp)
 24:   afb20028        sw      s2,40(sp)
 28:   afb10024        sw      s1,36(sp)
 2c:   afb00020        sw      s0,32(sp)
 30:   40141001        mfc0    s4,c0_tcstatus
 34:   36810400        ori     at,s4,0x400
 38:   40811001        mtc0    at,c0_tcstatus
 3c:   32940400        andi    s4,s4,0x400
 40:   000000c0        ehb
 44:   41610001        dvpe    at
 48:   0020a821        move    s5,at
 4c:   000000c0        ehb
 50:   3c020000        lui     v0,0x0
 54:   24420060        addiu   v0,v0,96
 58:   00400408        jr.hb   v0
 5c:   00000000        nop
 60:   3c040000        lui     a0,0x0
 64:   24840000        addiu   a0,a0,0
 68:   0c000000        jal     0 <mips_mt_regdump>
 6c:   afa50010        sw      a1,16(sp)
 70:   3c040000        lui     a0,0x0
 74:   0c000000        jal     0 <mips_mt_regdump>
 78:   24840000        addiu   a0,a0,0
 7c:   8fa50010        lw      a1,16(sp)
 80:   3c040000        lui     a0,0x0
 84:   0c000000        jal     0 <mips_mt_regdump>
 88:   24840000        addiu   a0,a0,0
 8c:   3c040000        lui     a0,0x0
 90:   24840000        addiu   a0,a0,0
 94:   0c000000        jal     0 <mips_mt_regdump>
 98:   02a02821        move    a1,s5
 9c:   40110002        mfc0    s1,c0_mvpconf0
 a0:   3c040000        lui     a0,0x0
 a4:   02202821        move    a1,s1
 a8:   0c000000        jal     0 <mips_mt_regdump>
 ac:   24840000        addiu   a0,a0,0
 b0:   3c040000        lui     a0,0x0
 b4:   0c000000        jal     0 <mips_mt_regdump>
 b8:   24840000        addiu   a0,a0,0
 bc:   7e331a80        ext     s3,s1,0xa,0x4
 c0:   3c090000        lui     t1,0x0
 c4:   323100ff        andi    s1,s1,0xff
 c8:   3c080000        lui     t0,0x0
 cc:   3c030000        lui     v1,0x0
 d0:   3c1e0000        lui     s8,0x0
 d4:   3c170000        lui     s7,0x0
 d8:   3c160000        lui     s6,0x0
 dc:   3c0a0000        lui     t2,0x0
 e0:   26730001        addiu   s3,s3,1
 e4:   26310001        addiu   s1,s1,1
 e8:   00008021        move    s0,zero
 ec:   2412ff00        li      s2,-256
 f0:   25290000        addiu   t1,t1,0
 f4:   25080000        addiu   t0,t0,0
 f8:   24630000        addiu   v1,v1,0
 fc:   27de0000        addiu   s8,s8,0
 100:   26f70000        addiu   s7,s7,0
 104:   26d60000        addiu   s6,s6,0
 108:   254a0000        addiu   t2,t2,0
 10c:   00001021        move    v0,zero
 110:   40040801        mfc0    a0,c0_vpecontrol
 114:   00922024        and     a0,a0,s2
 118:   00442025        or      a0,v0,a0
 11c:   40840801        mtc0    a0,c0_vpecontrol
 120:   000000c0        ehb
 124:   41020802        mftc0   at,c0_tcbind
 128:   00202021        move    a0,at
 12c:   24420001        addiu   v0,v0,1
 130:   3084000f        andi    a0,a0,0xf
 134:   12040031        beq     s0,a0,1fc <mips_mt_regdump+0x1fc>
 138:   0051282a        slt     a1,v0,s1
 13c:   14a0fff4        bnez    a1,110 <mips_mt_regdump+0x110>
 140:   00000000        nop
 144:   26100001        addiu   s0,s0,1
 148:   0213102a        slt     v0,s0,s3
 14c:   1440fff0        bnez    v0,110 <mips_mt_regdump+0x110>
 150:   00001021        move    v0,zero
 154:   3c040000        lui     a0,0x0
 158:   24840000        addiu   a0,a0,0
 15c:   3c1e0000        lui     s8,0x0
 160:   3c170000        lui     s7,0x0
 164:   3c160000        lui     s6,0x0
 168:   3c130000        lui     s3,0x0
 16c:   0c000000        jal     0 <mips_mt_regdump>
 170:   3c120000        lui     s2,0x0
 174:   00008021        move    s0,zero
 178:   27de0000        addiu   s8,s8,0
 17c:   26f70000        addiu   s7,s7,0
 180:   26d60000        addiu   s6,s6,0
 184:   26730000        addiu   s3,s3,0
 188:   26520000        addiu   s2,s2,0
 18c:   40020801        mfc0    v0,c0_vpecontrol
 190:   2403ff00        li      v1,-256
 194:   00431024        and     v0,v0,v1
 198:   02021025        or      v0,s0,v0
 19c:   40820801        mtc0    v0,c0_vpecontrol
 1a0:   000000c0        ehb
 1a4:   41020802        mftc0   at,c0_tcbind
 1a8:   00201821        move    v1,at
 1ac:   40021002        mfc0    v0,c0_tcbind
 1b0:   1062003f        beq     v1,v0,2b0 <mips_mt_regdump+0x2b0>
 1b4:   00000000        nop
 1b8:   41020804        mftc0   at,c0_tchalt
 1bc:   00201821        move    v1,at
 1c0:   24020001        li      v0,1
 1c4:   00400821        move    at,v0
 1c8:   41811004        mttc0   at,c0_tchalt
 1cc:   41020801        mftc0   at,c0_tcstatus
 1d0:   00203021        move    a2,at
 1d4:   3c040000        lui     a0,0x0
 1d8:   02002821        move    a1,s0
 1dc:   24840000        addiu   a0,a0,0
 1e0:   afa3001c        sw      v1,28(sp)
 1e4:   0c000000        jal     0 <mips_mt_regdump>
 1e8:   afa60010        sw      a2,16(sp)
 1ec:   8fa60010        lw      a2,16(sp)
 1f0:   8fa3001c        lw      v1,28(sp)
 1f4:   080000b2        j       2c8 <mips_mt_regdump+0x2c8>
 1f8:   00c02821        move    a1,a2
 1fc:   01202021        move    a0,t1
 200:   02002821        move    a1,s0
 204:   afa3001c        sw      v1,28(sp)
 208:   afa80014        sw      t0,20(sp)
 20c:   afa90010        sw      t1,16(sp)
 210:   0c000000        jal     0 <mips_mt_regdump>
 214:   afaa0018        sw      t2,24(sp)
 218:   41010801        mftc0   at,c0_vpecontrol
 21c:   00202821        move    a1,at
 220:   8fa80014        lw      t0,20(sp)
 224:   0c000000        jal     0 <mips_mt_regdump>
 228:   01002021        move    a0,t0
 22c:   41010802        mftc0   at,c0_vpeconf0
 230:   00202821        move    a1,at
 234:   8fa3001c        lw      v1,28(sp)
 238:   0c000000        jal     0 <mips_mt_regdump>
 23c:   00602021        move    a0,v1
 240:   410c0800        mftc0   at,c0_status
 244:   00203021        move    a2,at
 248:   03c02021        move    a0,s8
 24c:   0c000000        jal     0 <mips_mt_regdump>
 250:   02002821        move    a1,s0
 254:   410e0800        mftc0   at,c0_epc
 258:   00203021        move    a2,at
 25c:   410e0800        mftc0   at,c0_epc
 260:   00203821        move    a3,at
 264:   02e02021        move    a0,s7
 268:   0c000000        jal     0 <mips_mt_regdump>
 26c:   02002821        move    a1,s0
 270:   410d0800        mftc0   at,c0_cause
 274:   00203021        move    a2,at
 278:   02c02021        move    a0,s6
 27c:   0c000000        jal     0 <mips_mt_regdump>
 280:   02002821        move    a1,s0
 284:   41100807        mftc0   at,$16,7
 288:   00203021        move    a2,at
 28c:   8faa0018        lw      t2,24(sp)
 290:   02002821        move    a1,s0
 294:   0c000000        jal     0 <mips_mt_regdump>
 298:   01402021        move    a0,t2
 29c:   8fa3001c        lw      v1,28(sp)
 2a0:   8fa80014        lw      t0,20(sp)
 2a4:   8fa90010        lw      t1,16(sp)
 2a8:   08000051        j       144 <mips_mt_regdump+0x144>
 2ac:   8faa0018        lw      t2,24(sp)
 2b0:   3c040000        lui     a0,0x0
 2b4:   02002821        move    a1,s0
 2b8:   0c000000        jal     0 <mips_mt_regdump>
 2bc:   24840000        addiu   a0,a0,0
 2c0:   00001821        move    v1,zero
 2c4:   02802821        move    a1,s4
 2c8:   03c02021        move    a0,s8
 2cc:   0c000000        jal     0 <mips_mt_regdump>
 2d0:   afa3001c        sw      v1,28(sp)
 2d4:   41020802        mftc0   at,c0_tcbind
 2d8:   00202821        move    a1,at
 2dc:   0c000000        jal     0 <mips_mt_regdump>
 2e0:   02e02021        move    a0,s7
 2e4:   41020803        mftc0   at,c0_tcrestart
 2e8:   00202821        move    a1,at
 2ec:   41020803        mftc0   at,c0_tcrestart
 2f0:   00203021        move    a2,at
 2f4:   0c000000        jal     0 <mips_mt_regdump>
 2f8:   02c02021        move    a0,s6
 2fc:   8fa3001c        lw      v1,28(sp)
 300:   02602021        move    a0,s3
 304:   0c000000        jal     0 <mips_mt_regdump>
 308:   00602821        move    a1,v1
 30c:   41020805        mftc0   at,c0_tccontext
 310:   00202821        move    a1,at
 314:   0c000000        jal     0 <mips_mt_regdump>
 318:   02402021        move    a0,s2
 31c:   8fa3001c        lw      v1,28(sp)
 320:   14600003        bnez    v1,330 <mips_mt_regdump+0x330>
 324:   00001021        move    v0,zero
 328:   00400821        move    at,v0
 32c:   41811004        mttc0   at,c0_tchalt
 330:   26100001        addiu   s0,s0,1
 334:   0211102a        slt     v0,s0,s1
 338:   1440ff94        bnez    v0,18c <mips_mt_regdump+0x18c>
 33c:   00000000        nop
 340:   0c000000        jal     0 <mips_mt_regdump>
 344:   32b50001        andi    s5,s5,0x1
 348:   3c040000        lui     a0,0x0
 34c:   0c000000        jal     0 <mips_mt_regdump>
 350:   24840000        addiu   a0,a0,0
 354:   12a00004        beqz    s5,368 <mips_mt_regdump+0x368>
 358:   32820400        andi    v0,s4,0x400
 35c:   41600021        evpe
 360:   000000c0        ehb
 364:   32820400        andi    v0,s4,0x400
 368:   14400003        bnez    v0,378 <mips_mt_regdump+0x378>
 36c:   00000000        nop
 370:   0c000000        jal     0 <mips_mt_regdump>
 374:   00000000        nop
 378:   40011001        mfc0    at,c0_tcstatus
 37c:   32940400        andi    s4,s4,0x400
 380:   34210400        ori     at,at,0x400
 384:   38210400        xori    at,at,0x400
 388:   0281a025        or      s4,s4,at
 38c:   40941001        mtc0    s4,c0_tcstatus
 390:   000000c0        ehb
 394:   8fbf0044        lw      ra,68(sp)
 398:   8fbe0040        lw      s8,64(sp)
 39c:   8fb7003c        lw      s7,60(sp)
 3a0:   8fb60038        lw      s6,56(sp)
 3a4:   8fb50034        lw      s5,52(sp)
 3a8:   8fb40030        lw      s4,48(sp)
 3ac:   8fb3002c        lw      s3,44(sp)
 3b0:   8fb20028        lw      s2,40(sp)
 3b4:   8fb10024        lw      s1,36(sp)
 3b8:   8fb00020        lw      s0,32(sp)
 3bc:   03e00008        jr      ra
 3c0:   27bd0048        addiu   sp,sp,72


On Sat, Dec 18, 2010 at 3:05 AM, Kevin D. Kissell <kevink@paralogos.com> wrote:
So, Anoop, if you get a minute for this any time in the next day or so
(after which I'll have very limited net access until next year), could you
please do an <mumble>-mips<mumble>-objdump --disassemble of your kernel
image (or even just the mips-mt.o module) from a failing kernel build and
post the disassembly of mips_mt_regdump()?  The confirmation or refutation
of the theory about local_irq_save() no longer being built correctly for
SMTC would be within the first few instructions...

/K.


On 12/16/10 11:58, Kevin D. Kissell wrote:
Ralf tells me that this message got blocked by the LMO server due to HTML
content.
So here it is again, textier.

On 12/16/10 11:24, Kevin D. Kissell wrote:
On 12/16/10 07:37, STUART VENTERS wrote:

Two other possible clues:

The EVP is clear in the MVPControl register.
Does this say that only VPE0, T0 gets to run?
That's correct.  In the maxtcs=1/maxvpes=1 boot state, it wouldn't matter.
 It's just possible that setting EVP is conditional on more than one VPE
being used, but that's not the way I remember it.

Also the EXCPT bits in VPEControl for VPE1 indicate a Gating Storage
Exception dispatch.
But that seems to conflict the EVP bit above.
I don't have a copy of the ASE spec handy to see whether those bits have a
defined power-on value, but particularly if maxvpes=1 was set at boot time,
I would expect VPE1's registers to be in a partly random power-up state.

Perhaps these are an artifact of getting to a good state to dump things
out.
As per my previous mail, I looked at the MT register dump source, and it
really does pull values directly
out of registers and doesn't depend on having a sane kernel stack frame.
 The exceptions to that rule
are the reported values for TCStatus of the executing TC, which is based
on the perhaps-now-broken
assumption that local_irq_save(flags) stores the *entire* pre-invocation
value of the TCStatus register
in the flags variable, and MVPcontrol, which is based on the assumption
that dvpe() returns the pre-invocation
value of MVPcontrol.  Break those assumptions, and you'll get inconsistent
state dumps like this,
and very possibly incorrect execution.   Particularly if what was done was
that effectively replaces
the SMTC-specific implementation of local_irq_save()/local_irq_restore()
with something that uses
the generic MIPS32R2 atomic interrupt enable/disable instructions.  That
would have been a *very* bad idea...

            Regards,

            Kevin K.

^ permalink raw reply	[flat|nested] 68+ messages in thread

* RE: SMTC support status in latest git head.
@ 2010-12-21 20:06             ` Anoop P.A.
  0 siblings, 0 replies; 68+ messages in thread
From: Anoop P.A. @ 2010-12-21 20:06 UTC (permalink / raw)
  To: Kevin D. Kissell, Anoop P A; +Cc: STUART VENTERS, linux-mips


OK. I will check it.

BTW following patch is responsible for irq change.

http://git.linux-mips.org/?p=linux.git;a=commitdiff;h=df9ee29270c11dba7d0fe0b83ce47a4d8e8d2101

Thanks
Anoop
________________________________________
From: Kevin D. Kissell [mailto:kevink@paralogos.com] 
Sent: Wednesday, December 22, 2010 12:23 AM
To: Anoop P A
Cc: STUART VENTERS; linux-mips@linux-mips.org; Anoop P.A.
Subject: Re: SMTC support status in latest git head.

OK, I see why the MT register dump isn't giving us useful information.  It's not clear that it's at the root of your functional problems, though.  Apparently, somebody decided that it was unwholesome to propagate anything other than the previous interrupt enable state in the flags variable passed between irq_save() and irq_restore().  I agree philosophically, but it does break the MT register dump function.  And I'm quite sure that there were other bits of SMTC code that knew that it was a TCStatus value, at least in the earliest versions of the code.  I'm not a gitweb power user,  but I haven't been able to figure out how to determine when the "andi \\result 0x400" on or about line 138 of irqflags.h (at least that's where it is in the head of tree) was checked-in.  If it's at the boundary between working and non-working versions for SMTC, it might be the cause of the problems, but it may well not be responsible for anything other than the problem with reporting the value in the MT register dump - which really ought to be fixed.

I'm in a small village in France for the holidays with no git/build system at my disposal, but I think that if you were to tweak mips-mt.c at line 103 to change
the

        tcstatval = flags; /* And pre-dump TCStatus is flags */

        

        to something more like

        

        /* Pre-dump TCStatus Interrupt Inhibit bit is in flags variable
        */

        tcstatval = (read_c0_tcstatus() & ~0x400) | flags;

        

        should fix the dump.

            Regards,

            Kevin K.

On 12/20/10 2:44 AM, Anoop P A wrote: 
Hi Kevin,

Please find disassembly  for mips_mt_reg_dump

Thanks
Anoop

Disassembly of section .text:

00000000 <mips_mt_regdump>:
  0:   27bdffb8        addiu   sp,sp,-72
  4:   00802821        move    a1,a0
  8:   afbf0044        sw      ra,68(sp)
  c:   afbe0040        sw      s8,64(sp)
 10:   afb7003c        sw      s7,60(sp)
 14:   afb60038        sw      s6,56(sp)
 18:   afb50034        sw      s5,52(sp)
 1c:   afb40030        sw      s4,48(sp)
 20:   afb3002c        sw      s3,44(sp)
 24:   afb20028        sw      s2,40(sp)
 28:   afb10024        sw      s1,36(sp)
 2c:   afb00020        sw      s0,32(sp)
 30:   40141001        mfc0    s4,c0_tcstatus
 34:   36810400        ori     at,s4,0x400
 38:   40811001        mtc0    at,c0_tcstatus
 3c:   32940400        andi    s4,s4,0x400
 40:   000000c0        ehb
 44:   41610001        dvpe    at
 48:   0020a821        move    s5,at
 4c:   000000c0        ehb
 50:   3c020000        lui     v0,0x0
 54:   24420060        addiu   v0,v0,96
 58:   00400408        jr.hb   v0
 5c:   00000000        nop
 60:   3c040000        lui     a0,0x0
 64:   24840000        addiu   a0,a0,0
 68:   0c000000        jal     0 <mips_mt_regdump>
 6c:   afa50010        sw      a1,16(sp)
 70:   3c040000        lui     a0,0x0
 74:   0c000000        jal     0 <mips_mt_regdump>
 78:   24840000        addiu   a0,a0,0
 7c:   8fa50010        lw      a1,16(sp)
 80:   3c040000        lui     a0,0x0
 84:   0c000000        jal     0 <mips_mt_regdump>
 88:   24840000        addiu   a0,a0,0
 8c:   3c040000        lui     a0,0x0
 90:   24840000        addiu   a0,a0,0
 94:   0c000000        jal     0 <mips_mt_regdump>
 98:   02a02821        move    a1,s5
 9c:   40110002        mfc0    s1,c0_mvpconf0
 a0:   3c040000        lui     a0,0x0
 a4:   02202821        move    a1,s1
 a8:   0c000000        jal     0 <mips_mt_regdump>
 ac:   24840000        addiu   a0,a0,0
 b0:   3c040000        lui     a0,0x0
 b4:   0c000000        jal     0 <mips_mt_regdump>
 b8:   24840000        addiu   a0,a0,0
 bc:   7e331a80        ext     s3,s1,0xa,0x4
 c0:   3c090000        lui     t1,0x0
 c4:   323100ff        andi    s1,s1,0xff
 c8:   3c080000        lui     t0,0x0
 cc:   3c030000        lui     v1,0x0
 d0:   3c1e0000        lui     s8,0x0
 d4:   3c170000        lui     s7,0x0
 d8:   3c160000        lui     s6,0x0
 dc:   3c0a0000        lui     t2,0x0
 e0:   26730001        addiu   s3,s3,1
 e4:   26310001        addiu   s1,s1,1
 e8:   00008021        move    s0,zero
 ec:   2412ff00        li      s2,-256
 f0:   25290000        addiu   t1,t1,0
 f4:   25080000        addiu   t0,t0,0
 f8:   24630000        addiu   v1,v1,0
 fc:   27de0000        addiu   s8,s8,0
 100:   26f70000        addiu   s7,s7,0
 104:   26d60000        addiu   s6,s6,0
 108:   254a0000        addiu   t2,t2,0
 10c:   00001021        move    v0,zero
 110:   40040801        mfc0    a0,c0_vpecontrol
 114:   00922024        and     a0,a0,s2
 118:   00442025        or      a0,v0,a0
 11c:   40840801        mtc0    a0,c0_vpecontrol
 120:   000000c0        ehb
 124:   41020802        mftc0   at,c0_tcbind
 128:   00202021        move    a0,at
 12c:   24420001        addiu   v0,v0,1
 130:   3084000f        andi    a0,a0,0xf
 134:   12040031        beq     s0,a0,1fc <mips_mt_regdump+0x1fc>
 138:   0051282a        slt     a1,v0,s1
 13c:   14a0fff4        bnez    a1,110 <mips_mt_regdump+0x110>
 140:   00000000        nop
 144:   26100001        addiu   s0,s0,1
 148:   0213102a        slt     v0,s0,s3
 14c:   1440fff0        bnez    v0,110 <mips_mt_regdump+0x110>
 150:   00001021        move    v0,zero
 154:   3c040000        lui     a0,0x0
 158:   24840000        addiu   a0,a0,0
 15c:   3c1e0000        lui     s8,0x0
 160:   3c170000        lui     s7,0x0
 164:   3c160000        lui     s6,0x0
 168:   3c130000        lui     s3,0x0
 16c:   0c000000        jal     0 <mips_mt_regdump>
 170:   3c120000        lui     s2,0x0
 174:   00008021        move    s0,zero
 178:   27de0000        addiu   s8,s8,0
 17c:   26f70000        addiu   s7,s7,0
 180:   26d60000        addiu   s6,s6,0
 184:   26730000        addiu   s3,s3,0
 188:   26520000        addiu   s2,s2,0
 18c:   40020801        mfc0    v0,c0_vpecontrol
 190:   2403ff00        li      v1,-256
 194:   00431024        and     v0,v0,v1
 198:   02021025        or      v0,s0,v0
 19c:   40820801        mtc0    v0,c0_vpecontrol
 1a0:   000000c0        ehb
 1a4:   41020802        mftc0   at,c0_tcbind
 1a8:   00201821        move    v1,at
 1ac:   40021002        mfc0    v0,c0_tcbind
 1b0:   1062003f        beq     v1,v0,2b0 <mips_mt_regdump+0x2b0>
 1b4:   00000000        nop
 1b8:   41020804        mftc0   at,c0_tchalt
 1bc:   00201821        move    v1,at
 1c0:   24020001        li      v0,1
 1c4:   00400821        move    at,v0
 1c8:   41811004        mttc0   at,c0_tchalt
 1cc:   41020801        mftc0   at,c0_tcstatus
 1d0:   00203021        move    a2,at
 1d4:   3c040000        lui     a0,0x0
 1d8:   02002821        move    a1,s0
 1dc:   24840000        addiu   a0,a0,0
 1e0:   afa3001c        sw      v1,28(sp)
 1e4:   0c000000        jal     0 <mips_mt_regdump>
 1e8:   afa60010        sw      a2,16(sp)
 1ec:   8fa60010        lw      a2,16(sp)
 1f0:   8fa3001c        lw      v1,28(sp)
 1f4:   080000b2        j       2c8 <mips_mt_regdump+0x2c8>
 1f8:   00c02821        move    a1,a2
 1fc:   01202021        move    a0,t1
 200:   02002821        move    a1,s0
 204:   afa3001c        sw      v1,28(sp)
 208:   afa80014        sw      t0,20(sp)
 20c:   afa90010        sw      t1,16(sp)
 210:   0c000000        jal     0 <mips_mt_regdump>
 214:   afaa0018        sw      t2,24(sp)
 218:   41010801        mftc0   at,c0_vpecontrol
 21c:   00202821        move    a1,at
 220:   8fa80014        lw      t0,20(sp)
 224:   0c000000        jal     0 <mips_mt_regdump>
 228:   01002021        move    a0,t0
 22c:   41010802        mftc0   at,c0_vpeconf0
 230:   00202821        move    a1,at
 234:   8fa3001c        lw      v1,28(sp)
 238:   0c000000        jal     0 <mips_mt_regdump>
 23c:   00602021        move    a0,v1
 240:   410c0800        mftc0   at,c0_status
 244:   00203021        move    a2,at
 248:   03c02021        move    a0,s8
 24c:   0c000000        jal     0 <mips_mt_regdump>
 250:   02002821        move    a1,s0
 254:   410e0800        mftc0   at,c0_epc
 258:   00203021        move    a2,at
 25c:   410e0800        mftc0   at,c0_epc
 260:   00203821        move    a3,at
 264:   02e02021        move    a0,s7
 268:   0c000000        jal     0 <mips_mt_regdump>
 26c:   02002821        move    a1,s0
 270:   410d0800        mftc0   at,c0_cause
 274:   00203021        move    a2,at
 278:   02c02021        move    a0,s6
 27c:   0c000000        jal     0 <mips_mt_regdump>
 280:   02002821        move    a1,s0
 284:   41100807        mftc0   at,$16,7
 288:   00203021        move    a2,at
 28c:   8faa0018        lw      t2,24(sp)
 290:   02002821        move    a1,s0
 294:   0c000000        jal     0 <mips_mt_regdump>
 298:   01402021        move    a0,t2
 29c:   8fa3001c        lw      v1,28(sp)
 2a0:   8fa80014        lw      t0,20(sp)
 2a4:   8fa90010        lw      t1,16(sp)
 2a8:   08000051        j       144 <mips_mt_regdump+0x144>
 2ac:   8faa0018        lw      t2,24(sp)
 2b0:   3c040000        lui     a0,0x0
 2b4:   02002821        move    a1,s0
 2b8:   0c000000        jal     0 <mips_mt_regdump>
 2bc:   24840000        addiu   a0,a0,0
 2c0:   00001821        move    v1,zero
 2c4:   02802821        move    a1,s4
 2c8:   03c02021        move    a0,s8
 2cc:   0c000000        jal     0 <mips_mt_regdump>
 2d0:   afa3001c        sw      v1,28(sp)
 2d4:   41020802        mftc0   at,c0_tcbind
 2d8:   00202821        move    a1,at
 2dc:   0c000000        jal     0 <mips_mt_regdump>
 2e0:   02e02021        move    a0,s7
 2e4:   41020803        mftc0   at,c0_tcrestart
 2e8:   00202821        move    a1,at
 2ec:   41020803        mftc0   at,c0_tcrestart
 2f0:   00203021        move    a2,at
 2f4:   0c000000        jal     0 <mips_mt_regdump>
 2f8:   02c02021        move    a0,s6
 2fc:   8fa3001c        lw      v1,28(sp)
 300:   02602021        move    a0,s3
 304:   0c000000        jal     0 <mips_mt_regdump>
 308:   00602821        move    a1,v1
 30c:   41020805        mftc0   at,c0_tccontext
 310:   00202821        move    a1,at
 314:   0c000000        jal     0 <mips_mt_regdump>
 318:   02402021        move    a0,s2
 31c:   8fa3001c        lw      v1,28(sp)
 320:   14600003        bnez    v1,330 <mips_mt_regdump+0x330>
 324:   00001021        move    v0,zero
 328:   00400821        move    at,v0
 32c:   41811004        mttc0   at,c0_tchalt
 330:   26100001        addiu   s0,s0,1
 334:   0211102a        slt     v0,s0,s1
 338:   1440ff94        bnez    v0,18c <mips_mt_regdump+0x18c>
 33c:   00000000        nop
 340:   0c000000        jal     0 <mips_mt_regdump>
 344:   32b50001        andi    s5,s5,0x1
 348:   3c040000        lui     a0,0x0
 34c:   0c000000        jal     0 <mips_mt_regdump>
 350:   24840000        addiu   a0,a0,0
 354:   12a00004        beqz    s5,368 <mips_mt_regdump+0x368>
 358:   32820400        andi    v0,s4,0x400
 35c:   41600021        evpe
 360:   000000c0        ehb
 364:   32820400        andi    v0,s4,0x400
 368:   14400003        bnez    v0,378 <mips_mt_regdump+0x378>
 36c:   00000000        nop
 370:   0c000000        jal     0 <mips_mt_regdump>
 374:   00000000        nop
 378:   40011001        mfc0    at,c0_tcstatus
 37c:   32940400        andi    s4,s4,0x400
 380:   34210400        ori     at,at,0x400
 384:   38210400        xori    at,at,0x400
 388:   0281a025        or      s4,s4,at
 38c:   40941001        mtc0    s4,c0_tcstatus
 390:   000000c0        ehb
 394:   8fbf0044        lw      ra,68(sp)
 398:   8fbe0040        lw      s8,64(sp)
 39c:   8fb7003c        lw      s7,60(sp)
 3a0:   8fb60038        lw      s6,56(sp)
 3a4:   8fb50034        lw      s5,52(sp)
 3a8:   8fb40030        lw      s4,48(sp)
 3ac:   8fb3002c        lw      s3,44(sp)
 3b0:   8fb20028        lw      s2,40(sp)
 3b4:   8fb10024        lw      s1,36(sp)
 3b8:   8fb00020        lw      s0,32(sp)
 3bc:   03e00008        jr      ra
 3c0:   27bd0048        addiu   sp,sp,72


On Sat, Dec 18, 2010 at 3:05 AM, Kevin D. Kissell <kevink@paralogos.com> wrote:
So, Anoop, if you get a minute for this any time in the next day or so
(after which I'll have very limited net access until next year), could you
please do an <mumble>-mips<mumble>-objdump --disassemble of your kernel
image (or even just the mips-mt.o module) from a failing kernel build and
post the disassembly of mips_mt_regdump()?  The confirmation or refutation
of the theory about local_irq_save() no longer being built correctly for
SMTC would be within the first few instructions...

/K.


On 12/16/10 11:58, Kevin D. Kissell wrote:
Ralf tells me that this message got blocked by the LMO server due to HTML
content.
So here it is again, textier.

On 12/16/10 11:24, Kevin D. Kissell wrote:
On 12/16/10 07:37, STUART VENTERS wrote:

Two other possible clues:

The EVP is clear in the MVPControl register.
Does this say that only VPE0, T0 gets to run?
That's correct.  In the maxtcs=1/maxvpes=1 boot state, it wouldn't matter.
 It's just possible that setting EVP is conditional on more than one VPE
being used, but that's not the way I remember it.

Also the EXCPT bits in VPEControl for VPE1 indicate a Gating Storage
Exception dispatch.
But that seems to conflict the EVP bit above.
I don't have a copy of the ASE spec handy to see whether those bits have a
defined power-on value, but particularly if maxvpes=1 was set at boot time,
I would expect VPE1's registers to be in a partly random power-up state.

Perhaps these are an artifact of getting to a good state to dump things
out.
As per my previous mail, I looked at the MT register dump source, and it
really does pull values directly
out of registers and doesn't depend on having a sane kernel stack frame.
 The exceptions to that rule
are the reported values for TCStatus of the executing TC, which is based
on the perhaps-now-broken
assumption that local_irq_save(flags) stores the *entire* pre-invocation
value of the TCStatus register
in the flags variable, and MVPcontrol, which is based on the assumption
that dvpe() returns the pre-invocation
value of MVPcontrol.  Break those assumptions, and you'll get inconsistent
state dumps like this,
and very possibly incorrect execution.   Particularly if what was done was
that effectively replaces
the SMTC-specific implementation of local_irq_save()/local_irq_restore()
with something that uses
the generic MIPS32R2 atomic interrupt enable/disable instructions.  That
would have been a *very* bad idea...

            Regards,

            Kevin K.

^ permalink raw reply	[flat|nested] 68+ messages in thread

* RE: SMTC support status in latest git head.
@ 2010-12-21 20:29               ` Anoop P.A.
  0 siblings, 0 replies; 68+ messages in thread
From: Anoop P.A. @ 2010-12-21 20:29 UTC (permalink / raw)
  To: Anoop P.A., Kevin D. Kissell, Anoop P A; +Cc: STUART VENTERS, linux-mips

Sorry I misunderstood file. git blame shows that "andi" is around for quite sometime .

49a89efb include/asm-mips/irqflags.h      (Ralf Baechle     2007-10-11 23:46:15 +0100 128) __asm__(
df9ee292 arch/mips/include/asm/irqflags.h (David Howells    2010-10-07 14:08:55 +0100 129)      "       .macro  arch_local_irq_save result
ff88f8a3 include/asm-mips/interrupt.h     (Ralf Baechle     2005-07-12 14:54:31 +0000 130)      "       .set    push
ff88f8a3 include/asm-mips/interrupt.h     (Ralf Baechle     2005-07-12 14:54:31 +0000 131)      "       .set    reorder
ff88f8a3 include/asm-mips/interrupt.h     (Ralf Baechle     2005-07-12 14:54:31 +0000 132)      "       .set    noat
41c594ab include/asm-mips/interrupt.h     (Ralf Baechle     2006-04-05 09:45:45 +0100 133) #ifdef CONFIG_MIPS_MT_SMTC
41c594ab include/asm-mips/interrupt.h     (Ralf Baechle     2006-04-05 09:45:45 +0100 134)      "       mfc0    \\result, $2, 1
41c594ab include/asm-mips/interrupt.h     (Ralf Baechle     2006-04-05 09:45:45 +0100 135)      "       ori     $1, \\result, 0x400
41c594ab include/asm-mips/interrupt.h     (Ralf Baechle     2006-04-05 09:45:45 +0100 136)      "       .set    noreorder
41c594ab include/asm-mips/interrupt.h     (Ralf Baechle     2006-04-05 09:45:45 +0100 137)      "       mtc0    $1, $2, 1
41c594ab include/asm-mips/interrupt.h     (Ralf Baechle     2006-04-05 09:45:45 +0100 138)      "       andi    \\result, \\result, 0x400
41c594ab include/asm-mips/interrupt.h     (Ralf Baechle     2006-04-05 09:45:45 +0100 139) #elif defined(CONFIG_CPU_MIPSR2)
ff88f8a3 include/asm-mips/interrupt.h     (Ralf Baechle     2005-07-12 14:54:31 +0000 140)      "       di      \\result
15265251 include/asm-mips/interrupt.h     (Maxime Bizon     2005-12-20 06:32:19 +0100 141)      "       andi    \\result, 1
ff88f8a3 include/asm-mips/interrupt.h     (Ralf Baechle     2005-07-12 14:54:31 +0000 142) #else
ff88f8a3 include/asm-mips/interrupt.h     (Ralf Baechle     2005-07-12 14:54:31 +0000 143)      "       mfc0    \\result, $12
c226f260 include/asm-mips/interrupt.h     (Atsushi Nemoto   2006-02-03 01:34:01 +0900 144)      "       ori     $1, \\result, 0x1f
c226f260 include/asm-mips/interrupt.h     (Atsushi Nemoto   2006-02-03 01:34:01 +0900 145)      "       xori    $1, 0x1f
ff88f8a3 include/asm-mips/interrupt.h     (Ralf Baechle     2005-07-12 14:54:31 +0000 146)      "       .set    noreorder
ff88f8a3 include/asm-mips/interrupt.h     (Ralf Baechle     2005-07-12 14:54:31 +0000 147)      "       mtc0    $1, $12
ff88f8a3 include/asm-mips/interrupt.h     (Ralf Baechle     2005-07-12 14:54:31 +0000 148) #endif
ff88f8a3 include/asm-mips/interrupt.h     (Ralf Baechle     2005-07-12 14:54:31 +0000 149)      "       irq_disable_hazard
ff88f8a3 include/asm-mips/interrupt.h     (Ralf Baechle     2005-07-12 14:54:31 +0000 150)      "       .set    pop
ff88f8a3 include/asm-mips/interrupt.h     (Ralf Baechle     2005-07-12 14:54:31 +0000 151)      "       .endm
^1da177e include/asm-mips/interrupt.h     (Linus Torvalds   2005-04-16 15:20:36 -0700 152)

> -----Original Message-----
> From: linux-mips-bounce@linux-mips.org [mailto:linux-mips-bounce@linux-
> mips.org] On Behalf Of Anoop P.A.
> Sent: Wednesday, December 22, 2010 1:37 AM
> To: Kevin D. Kissell; Anoop P A
> Cc: STUART VENTERS; linux-mips@linux-mips.org
> Subject: RE: SMTC support status in latest git head.
> 
> 
> OK. I will check it.
> 
> BTW following patch is responsible for irq change.
> 
> http://git.linux-
> mips.org/?p=linux.git;a=commitdiff;h=df9ee29270c11dba7d0fe0b83ce47a4d8e8d2
> 101
> 
> Thanks
> Anoop
> ________________________________________
> From: Kevin D. Kissell [mailto:kevink@paralogos.com]
> Sent: Wednesday, December 22, 2010 12:23 AM
> To: Anoop P A
> Cc: STUART VENTERS; linux-mips@linux-mips.org; Anoop P.A.
> Subject: Re: SMTC support status in latest git head.
> 
> OK, I see why the MT register dump isn't giving us useful information.
> It's not clear that it's at the root of your functional problems, though.
> Apparently, somebody decided that it was unwholesome to propagate anything
> other than the previous interrupt enable state in the flags variable
> passed between irq_save() and irq_restore().  I agree philosophically, but
> it does break the MT register dump function.  And I'm quite sure that
> there were other bits of SMTC code that knew that it was a TCStatus value,
> at least in the earliest versions of the code.  I'm not a gitweb power
> user,  but I haven't been able to figure out how to determine when the
> "andi \\result 0x400" on or about line 138 of irqflags.h (at least that's
> where it is in the head of tree) was checked-in.  If it's at the boundary
> between working and non-working versions for SMTC, it might be the cause
> of the problems, but it may well not be responsible for anything other
> than the problem with reporting the value in
>  the MT register dump - which really ought to be fixed.
> 
> I'm in a small village in France for the holidays with no git/build system
> at my disposal, but I think that if you were to tweak mips-mt.c at line
> 103 to change
> the
> 
>         tcstatval = flags; /* And pre-dump TCStatus is flags */
> 
> 
> 
>         to something more like
> 
> 
> 
>         /* Pre-dump TCStatus Interrupt Inhibit bit is in flags variable
>         */
> 
>         tcstatval = (read_c0_tcstatus() & ~0x400) | flags;
> 
> 
> 
>         should fix the dump.
> 
>             Regards,
> 
>             Kevin K.
> 
> On 12/20/10 2:44 AM, Anoop P A wrote:
> Hi Kevin,
> 
> Please find disassembly  for mips_mt_reg_dump
> 
> Thanks
> Anoop
> 
> Disassembly of section .text:
> 
> 00000000 <mips_mt_regdump>:
>   0:   27bdffb8        addiu   sp,sp,-72
>   4:   00802821        move    a1,a0
>   8:   afbf0044        sw      ra,68(sp)
>   c:   afbe0040        sw      s8,64(sp)
>  10:   afb7003c        sw      s7,60(sp)
>  14:   afb60038        sw      s6,56(sp)
>  18:   afb50034        sw      s5,52(sp)
>  1c:   afb40030        sw      s4,48(sp)
>  20:   afb3002c        sw      s3,44(sp)
>  24:   afb20028        sw      s2,40(sp)
>  28:   afb10024        sw      s1,36(sp)
>  2c:   afb00020        sw      s0,32(sp)
>  30:   40141001        mfc0    s4,c0_tcstatus
>  34:   36810400        ori     at,s4,0x400
>  38:   40811001        mtc0    at,c0_tcstatus
>  3c:   32940400        andi    s4,s4,0x400
>  40:   000000c0        ehb
>  44:   41610001        dvpe    at
>  48:   0020a821        move    s5,at
>  4c:   000000c0        ehb
>  50:   3c020000        lui     v0,0x0
>  54:   24420060        addiu   v0,v0,96
>  58:   00400408        jr.hb   v0
>  5c:   00000000        nop
>  60:   3c040000        lui     a0,0x0
>  64:   24840000        addiu   a0,a0,0
>  68:   0c000000        jal     0 <mips_mt_regdump>
>  6c:   afa50010        sw      a1,16(sp)
>  70:   3c040000        lui     a0,0x0
>  74:   0c000000        jal     0 <mips_mt_regdump>
>  78:   24840000        addiu   a0,a0,0
>  7c:   8fa50010        lw      a1,16(sp)
>  80:   3c040000        lui     a0,0x0
>  84:   0c000000        jal     0 <mips_mt_regdump>
>  88:   24840000        addiu   a0,a0,0
>  8c:   3c040000        lui     a0,0x0
>  90:   24840000        addiu   a0,a0,0
>  94:   0c000000        jal     0 <mips_mt_regdump>
>  98:   02a02821        move    a1,s5
>  9c:   40110002        mfc0    s1,c0_mvpconf0
>  a0:   3c040000        lui     a0,0x0
>  a4:   02202821        move    a1,s1
>  a8:   0c000000        jal     0 <mips_mt_regdump>
>  ac:   24840000        addiu   a0,a0,0
>  b0:   3c040000        lui     a0,0x0
>  b4:   0c000000        jal     0 <mips_mt_regdump>
>  b8:   24840000        addiu   a0,a0,0
>  bc:   7e331a80        ext     s3,s1,0xa,0x4
>  c0:   3c090000        lui     t1,0x0
>  c4:   323100ff        andi    s1,s1,0xff
>  c8:   3c080000        lui     t0,0x0
>  cc:   3c030000        lui     v1,0x0
>  d0:   3c1e0000        lui     s8,0x0
>  d4:   3c170000        lui     s7,0x0
>  d8:   3c160000        lui     s6,0x0
>  dc:   3c0a0000        lui     t2,0x0
>  e0:   26730001        addiu   s3,s3,1
>  e4:   26310001        addiu   s1,s1,1
>  e8:   00008021        move    s0,zero
>  ec:   2412ff00        li      s2,-256
>  f0:   25290000        addiu   t1,t1,0
>  f4:   25080000        addiu   t0,t0,0
>  f8:   24630000        addiu   v1,v1,0
>  fc:   27de0000        addiu   s8,s8,0
>  100:   26f70000        addiu   s7,s7,0
>  104:   26d60000        addiu   s6,s6,0
>  108:   254a0000        addiu   t2,t2,0
>  10c:   00001021        move    v0,zero
>  110:   40040801        mfc0    a0,c0_vpecontrol
>  114:   00922024        and     a0,a0,s2
>  118:   00442025        or      a0,v0,a0
>  11c:   40840801        mtc0    a0,c0_vpecontrol
>  120:   000000c0        ehb
>  124:   41020802        mftc0   at,c0_tcbind
>  128:   00202021        move    a0,at
>  12c:   24420001        addiu   v0,v0,1
>  130:   3084000f        andi    a0,a0,0xf
>  134:   12040031        beq     s0,a0,1fc <mips_mt_regdump+0x1fc>
>  138:   0051282a        slt     a1,v0,s1
>  13c:   14a0fff4        bnez    a1,110 <mips_mt_regdump+0x110>
>  140:   00000000        nop
>  144:   26100001        addiu   s0,s0,1
>  148:   0213102a        slt     v0,s0,s3
>  14c:   1440fff0        bnez    v0,110 <mips_mt_regdump+0x110>
>  150:   00001021        move    v0,zero
>  154:   3c040000        lui     a0,0x0
>  158:   24840000        addiu   a0,a0,0
>  15c:   3c1e0000        lui     s8,0x0
>  160:   3c170000        lui     s7,0x0
>  164:   3c160000        lui     s6,0x0
>  168:   3c130000        lui     s3,0x0
>  16c:   0c000000        jal     0 <mips_mt_regdump>
>  170:   3c120000        lui     s2,0x0
>  174:   00008021        move    s0,zero
>  178:   27de0000        addiu   s8,s8,0
>  17c:   26f70000        addiu   s7,s7,0
>  180:   26d60000        addiu   s6,s6,0
>  184:   26730000        addiu   s3,s3,0
>  188:   26520000        addiu   s2,s2,0
>  18c:   40020801        mfc0    v0,c0_vpecontrol
>  190:   2403ff00        li      v1,-256
>  194:   00431024        and     v0,v0,v1
>  198:   02021025        or      v0,s0,v0
>  19c:   40820801        mtc0    v0,c0_vpecontrol
>  1a0:   000000c0        ehb
>  1a4:   41020802        mftc0   at,c0_tcbind
>  1a8:   00201821        move    v1,at
>  1ac:   40021002        mfc0    v0,c0_tcbind
>  1b0:   1062003f        beq     v1,v0,2b0 <mips_mt_regdump+0x2b0>
>  1b4:   00000000        nop
>  1b8:   41020804        mftc0   at,c0_tchalt
>  1bc:   00201821        move    v1,at
>  1c0:   24020001        li      v0,1
>  1c4:   00400821        move    at,v0
>  1c8:   41811004        mttc0   at,c0_tchalt
>  1cc:   41020801        mftc0   at,c0_tcstatus
>  1d0:   00203021        move    a2,at
>  1d4:   3c040000        lui     a0,0x0
>  1d8:   02002821        move    a1,s0
>  1dc:   24840000        addiu   a0,a0,0
>  1e0:   afa3001c        sw      v1,28(sp)
>  1e4:   0c000000        jal     0 <mips_mt_regdump>
>  1e8:   afa60010        sw      a2,16(sp)
>  1ec:   8fa60010        lw      a2,16(sp)
>  1f0:   8fa3001c        lw      v1,28(sp)
>  1f4:   080000b2        j       2c8 <mips_mt_regdump+0x2c8>
>  1f8:   00c02821        move    a1,a2
>  1fc:   01202021        move    a0,t1
>  200:   02002821        move    a1,s0
>  204:   afa3001c        sw      v1,28(sp)
>  208:   afa80014        sw      t0,20(sp)
>  20c:   afa90010        sw      t1,16(sp)
>  210:   0c000000        jal     0 <mips_mt_regdump>
>  214:   afaa0018        sw      t2,24(sp)
>  218:   41010801        mftc0   at,c0_vpecontrol
>  21c:   00202821        move    a1,at
>  220:   8fa80014        lw      t0,20(sp)
>  224:   0c000000        jal     0 <mips_mt_regdump>
>  228:   01002021        move    a0,t0
>  22c:   41010802        mftc0   at,c0_vpeconf0
>  230:   00202821        move    a1,at
>  234:   8fa3001c        lw      v1,28(sp)
>  238:   0c000000        jal     0 <mips_mt_regdump>
>  23c:   00602021        move    a0,v1
>  240:   410c0800        mftc0   at,c0_status
>  244:   00203021        move    a2,at
>  248:   03c02021        move    a0,s8
>  24c:   0c000000        jal     0 <mips_mt_regdump>
>  250:   02002821        move    a1,s0
>  254:   410e0800        mftc0   at,c0_epc
>  258:   00203021        move    a2,at
>  25c:   410e0800        mftc0   at,c0_epc
>  260:   00203821        move    a3,at
>  264:   02e02021        move    a0,s7
>  268:   0c000000        jal     0 <mips_mt_regdump>
>  26c:   02002821        move    a1,s0
>  270:   410d0800        mftc0   at,c0_cause
>  274:   00203021        move    a2,at
>  278:   02c02021        move    a0,s6
>  27c:   0c000000        jal     0 <mips_mt_regdump>
>  280:   02002821        move    a1,s0
>  284:   41100807        mftc0   at,$16,7
>  288:   00203021        move    a2,at
>  28c:   8faa0018        lw      t2,24(sp)
>  290:   02002821        move    a1,s0
>  294:   0c000000        jal     0 <mips_mt_regdump>
>  298:   01402021        move    a0,t2
>  29c:   8fa3001c        lw      v1,28(sp)
>  2a0:   8fa80014        lw      t0,20(sp)
>  2a4:   8fa90010        lw      t1,16(sp)
>  2a8:   08000051        j       144 <mips_mt_regdump+0x144>
>  2ac:   8faa0018        lw      t2,24(sp)
>  2b0:   3c040000        lui     a0,0x0
>  2b4:   02002821        move    a1,s0
>  2b8:   0c000000        jal     0 <mips_mt_regdump>
>  2bc:   24840000        addiu   a0,a0,0
>  2c0:   00001821        move    v1,zero
>  2c4:   02802821        move    a1,s4
>  2c8:   03c02021        move    a0,s8
>  2cc:   0c000000        jal     0 <mips_mt_regdump>
>  2d0:   afa3001c        sw      v1,28(sp)
>  2d4:   41020802        mftc0   at,c0_tcbind
>  2d8:   00202821        move    a1,at
>  2dc:   0c000000        jal     0 <mips_mt_regdump>
>  2e0:   02e02021        move    a0,s7
>  2e4:   41020803        mftc0   at,c0_tcrestart
>  2e8:   00202821        move    a1,at
>  2ec:   41020803        mftc0   at,c0_tcrestart
>  2f0:   00203021        move    a2,at
>  2f4:   0c000000        jal     0 <mips_mt_regdump>
>  2f8:   02c02021        move    a0,s6
>  2fc:   8fa3001c        lw      v1,28(sp)
>  300:   02602021        move    a0,s3
>  304:   0c000000        jal     0 <mips_mt_regdump>
>  308:   00602821        move    a1,v1
>  30c:   41020805        mftc0   at,c0_tccontext
>  310:   00202821        move    a1,at
>  314:   0c000000        jal     0 <mips_mt_regdump>
>  318:   02402021        move    a0,s2
>  31c:   8fa3001c        lw      v1,28(sp)
>  320:   14600003        bnez    v1,330 <mips_mt_regdump+0x330>
>  324:   00001021        move    v0,zero
>  328:   00400821        move    at,v0
>  32c:   41811004        mttc0   at,c0_tchalt
>  330:   26100001        addiu   s0,s0,1
>  334:   0211102a        slt     v0,s0,s1
>  338:   1440ff94        bnez    v0,18c <mips_mt_regdump+0x18c>
>  33c:   00000000        nop
>  340:   0c000000        jal     0 <mips_mt_regdump>
>  344:   32b50001        andi    s5,s5,0x1
>  348:   3c040000        lui     a0,0x0
>  34c:   0c000000        jal     0 <mips_mt_regdump>
>  350:   24840000        addiu   a0,a0,0
>  354:   12a00004        beqz    s5,368 <mips_mt_regdump+0x368>
>  358:   32820400        andi    v0,s4,0x400
>  35c:   41600021        evpe
>  360:   000000c0        ehb
>  364:   32820400        andi    v0,s4,0x400
>  368:   14400003        bnez    v0,378 <mips_mt_regdump+0x378>
>  36c:   00000000        nop
>  370:   0c000000        jal     0 <mips_mt_regdump>
>  374:   00000000        nop
>  378:   40011001        mfc0    at,c0_tcstatus
>  37c:   32940400        andi    s4,s4,0x400
>  380:   34210400        ori     at,at,0x400
>  384:   38210400        xori    at,at,0x400
>  388:   0281a025        or      s4,s4,at
>  38c:   40941001        mtc0    s4,c0_tcstatus
>  390:   000000c0        ehb
>  394:   8fbf0044        lw      ra,68(sp)
>  398:   8fbe0040        lw      s8,64(sp)
>  39c:   8fb7003c        lw      s7,60(sp)
>  3a0:   8fb60038        lw      s6,56(sp)
>  3a4:   8fb50034        lw      s5,52(sp)
>  3a8:   8fb40030        lw      s4,48(sp)
>  3ac:   8fb3002c        lw      s3,44(sp)
>  3b0:   8fb20028        lw      s2,40(sp)
>  3b4:   8fb10024        lw      s1,36(sp)
>  3b8:   8fb00020        lw      s0,32(sp)
>  3bc:   03e00008        jr      ra
>  3c0:   27bd0048        addiu   sp,sp,72
> 
> 
> On Sat, Dec 18, 2010 at 3:05 AM, Kevin D. Kissell <kevink@paralogos.com>
> wrote:
> So, Anoop, if you get a minute for this any time in the next day or so
> (after which I'll have very limited net access until next year), could you
> please do an <mumble>-mips<mumble>-objdump --disassemble of your kernel
> image (or even just the mips-mt.o module) from a failing kernel build and
> post the disassembly of mips_mt_regdump()?  The confirmation or refutation
> of the theory about local_irq_save() no longer being built correctly for
> SMTC would be within the first few instructions...
> 
> /K.
> 
> 
> On 12/16/10 11:58, Kevin D. Kissell wrote:
> Ralf tells me that this message got blocked by the LMO server due to HTML
> content.
> So here it is again, textier.
> 
> On 12/16/10 11:24, Kevin D. Kissell wrote:
> On 12/16/10 07:37, STUART VENTERS wrote:
> 
> Two other possible clues:
> 
> The EVP is clear in the MVPControl register.
> Does this say that only VPE0, T0 gets to run?
> That's correct.  In the maxtcs=1/maxvpes=1 boot state, it wouldn't matter.
>  It's just possible that setting EVP is conditional on more than one VPE
> being used, but that's not the way I remember it.
> 
> Also the EXCPT bits in VPEControl for VPE1 indicate a Gating Storage
> Exception dispatch.
> But that seems to conflict the EVP bit above.
> I don't have a copy of the ASE spec handy to see whether those bits have a
> defined power-on value, but particularly if maxvpes=1 was set at boot
> time,
> I would expect VPE1's registers to be in a partly random power-up state.
> 
> Perhaps these are an artifact of getting to a good state to dump things
> out.
> As per my previous mail, I looked at the MT register dump source, and it
> really does pull values directly
> out of registers and doesn't depend on having a sane kernel stack frame.
>  The exceptions to that rule
> are the reported values for TCStatus of the executing TC, which is based
> on the perhaps-now-broken
> assumption that local_irq_save(flags) stores the *entire* pre-invocation
> value of the TCStatus register
> in the flags variable, and MVPcontrol, which is based on the assumption
> that dvpe() returns the pre-invocation
> value of MVPcontrol.  Break those assumptions, and you'll get inconsistent
> state dumps like this,
> and very possibly incorrect execution.   Particularly if what was done was
> that effectively replaces
> the SMTC-specific implementation of local_irq_save()/local_irq_restore()
> with something that uses
> the generic MIPS32R2 atomic interrupt enable/disable instructions.  That
> would have been a *very* bad idea...
> 
>             Regards,
> 
>             Kevin K.
> 
> 
> 

^ permalink raw reply	[flat|nested] 68+ messages in thread

* RE: SMTC support status in latest git head.
@ 2010-12-21 20:29               ` Anoop P.A.
  0 siblings, 0 replies; 68+ messages in thread
From: Anoop P.A. @ 2010-12-21 20:29 UTC (permalink / raw)
  To: Anoop P.A., Kevin D. Kissell, Anoop P A; +Cc: STUART VENTERS, linux-mips

Sorry I misunderstood file. git blame shows that "andi" is around for quite sometime .

49a89efb include/asm-mips/irqflags.h      (Ralf Baechle     2007-10-11 23:46:15 +0100 128) __asm__(
df9ee292 arch/mips/include/asm/irqflags.h (David Howells    2010-10-07 14:08:55 +0100 129)      "       .macro  arch_local_irq_save result
ff88f8a3 include/asm-mips/interrupt.h     (Ralf Baechle     2005-07-12 14:54:31 +0000 130)      "       .set    push
ff88f8a3 include/asm-mips/interrupt.h     (Ralf Baechle     2005-07-12 14:54:31 +0000 131)      "       .set    reorder
ff88f8a3 include/asm-mips/interrupt.h     (Ralf Baechle     2005-07-12 14:54:31 +0000 132)      "       .set    noat
41c594ab include/asm-mips/interrupt.h     (Ralf Baechle     2006-04-05 09:45:45 +0100 133) #ifdef CONFIG_MIPS_MT_SMTC
41c594ab include/asm-mips/interrupt.h     (Ralf Baechle     2006-04-05 09:45:45 +0100 134)      "       mfc0    \\result, $2, 1
41c594ab include/asm-mips/interrupt.h     (Ralf Baechle     2006-04-05 09:45:45 +0100 135)      "       ori     $1, \\result, 0x400
41c594ab include/asm-mips/interrupt.h     (Ralf Baechle     2006-04-05 09:45:45 +0100 136)      "       .set    noreorder
41c594ab include/asm-mips/interrupt.h     (Ralf Baechle     2006-04-05 09:45:45 +0100 137)      "       mtc0    $1, $2, 1
41c594ab include/asm-mips/interrupt.h     (Ralf Baechle     2006-04-05 09:45:45 +0100 138)      "       andi    \\result, \\result, 0x400
41c594ab include/asm-mips/interrupt.h     (Ralf Baechle     2006-04-05 09:45:45 +0100 139) #elif defined(CONFIG_CPU_MIPSR2)
ff88f8a3 include/asm-mips/interrupt.h     (Ralf Baechle     2005-07-12 14:54:31 +0000 140)      "       di      \\result
15265251 include/asm-mips/interrupt.h     (Maxime Bizon     2005-12-20 06:32:19 +0100 141)      "       andi    \\result, 1
ff88f8a3 include/asm-mips/interrupt.h     (Ralf Baechle     2005-07-12 14:54:31 +0000 142) #else
ff88f8a3 include/asm-mips/interrupt.h     (Ralf Baechle     2005-07-12 14:54:31 +0000 143)      "       mfc0    \\result, $12
c226f260 include/asm-mips/interrupt.h     (Atsushi Nemoto   2006-02-03 01:34:01 +0900 144)      "       ori     $1, \\result, 0x1f
c226f260 include/asm-mips/interrupt.h     (Atsushi Nemoto   2006-02-03 01:34:01 +0900 145)      "       xori    $1, 0x1f
ff88f8a3 include/asm-mips/interrupt.h     (Ralf Baechle     2005-07-12 14:54:31 +0000 146)      "       .set    noreorder
ff88f8a3 include/asm-mips/interrupt.h     (Ralf Baechle     2005-07-12 14:54:31 +0000 147)      "       mtc0    $1, $12
ff88f8a3 include/asm-mips/interrupt.h     (Ralf Baechle     2005-07-12 14:54:31 +0000 148) #endif
ff88f8a3 include/asm-mips/interrupt.h     (Ralf Baechle     2005-07-12 14:54:31 +0000 149)      "       irq_disable_hazard
ff88f8a3 include/asm-mips/interrupt.h     (Ralf Baechle     2005-07-12 14:54:31 +0000 150)      "       .set    pop
ff88f8a3 include/asm-mips/interrupt.h     (Ralf Baechle     2005-07-12 14:54:31 +0000 151)      "       .endm
^1da177e include/asm-mips/interrupt.h     (Linus Torvalds   2005-04-16 15:20:36 -0700 152)

> -----Original Message-----
> From: linux-mips-bounce@linux-mips.org [mailto:linux-mips-bounce@linux-
> mips.org] On Behalf Of Anoop P.A.
> Sent: Wednesday, December 22, 2010 1:37 AM
> To: Kevin D. Kissell; Anoop P A
> Cc: STUART VENTERS; linux-mips@linux-mips.org
> Subject: RE: SMTC support status in latest git head.
> 
> 
> OK. I will check it.
> 
> BTW following patch is responsible for irq change.
> 
> http://git.linux-
> mips.org/?p=linux.git;a=commitdiff;h=df9ee29270c11dba7d0fe0b83ce47a4d8e8d2
> 101
> 
> Thanks
> Anoop
> ________________________________________
> From: Kevin D. Kissell [mailto:kevink@paralogos.com]
> Sent: Wednesday, December 22, 2010 12:23 AM
> To: Anoop P A
> Cc: STUART VENTERS; linux-mips@linux-mips.org; Anoop P.A.
> Subject: Re: SMTC support status in latest git head.
> 
> OK, I see why the MT register dump isn't giving us useful information.
> It's not clear that it's at the root of your functional problems, though.
> Apparently, somebody decided that it was unwholesome to propagate anything
> other than the previous interrupt enable state in the flags variable
> passed between irq_save() and irq_restore().  I agree philosophically, but
> it does break the MT register dump function.  And I'm quite sure that
> there were other bits of SMTC code that knew that it was a TCStatus value,
> at least in the earliest versions of the code.  I'm not a gitweb power
> user,  but I haven't been able to figure out how to determine when the
> "andi \\result 0x400" on or about line 138 of irqflags.h (at least that's
> where it is in the head of tree) was checked-in.  If it's at the boundary
> between working and non-working versions for SMTC, it might be the cause
> of the problems, but it may well not be responsible for anything other
> than the problem with reporting the value in
>  the MT register dump - which really ought to be fixed.
> 
> I'm in a small village in France for the holidays with no git/build system
> at my disposal, but I think that if you were to tweak mips-mt.c at line
> 103 to change
> the
> 
>         tcstatval = flags; /* And pre-dump TCStatus is flags */
> 
> 
> 
>         to something more like
> 
> 
> 
>         /* Pre-dump TCStatus Interrupt Inhibit bit is in flags variable
>         */
> 
>         tcstatval = (read_c0_tcstatus() & ~0x400) | flags;
> 
> 
> 
>         should fix the dump.
> 
>             Regards,
> 
>             Kevin K.
> 
> On 12/20/10 2:44 AM, Anoop P A wrote:
> Hi Kevin,
> 
> Please find disassembly  for mips_mt_reg_dump
> 
> Thanks
> Anoop
> 
> Disassembly of section .text:
> 
> 00000000 <mips_mt_regdump>:
>   0:   27bdffb8        addiu   sp,sp,-72
>   4:   00802821        move    a1,a0
>   8:   afbf0044        sw      ra,68(sp)
>   c:   afbe0040        sw      s8,64(sp)
>  10:   afb7003c        sw      s7,60(sp)
>  14:   afb60038        sw      s6,56(sp)
>  18:   afb50034        sw      s5,52(sp)
>  1c:   afb40030        sw      s4,48(sp)
>  20:   afb3002c        sw      s3,44(sp)
>  24:   afb20028        sw      s2,40(sp)
>  28:   afb10024        sw      s1,36(sp)
>  2c:   afb00020        sw      s0,32(sp)
>  30:   40141001        mfc0    s4,c0_tcstatus
>  34:   36810400        ori     at,s4,0x400
>  38:   40811001        mtc0    at,c0_tcstatus
>  3c:   32940400        andi    s4,s4,0x400
>  40:   000000c0        ehb
>  44:   41610001        dvpe    at
>  48:   0020a821        move    s5,at
>  4c:   000000c0        ehb
>  50:   3c020000        lui     v0,0x0
>  54:   24420060        addiu   v0,v0,96
>  58:   00400408        jr.hb   v0
>  5c:   00000000        nop
>  60:   3c040000        lui     a0,0x0
>  64:   24840000        addiu   a0,a0,0
>  68:   0c000000        jal     0 <mips_mt_regdump>
>  6c:   afa50010        sw      a1,16(sp)
>  70:   3c040000        lui     a0,0x0
>  74:   0c000000        jal     0 <mips_mt_regdump>
>  78:   24840000        addiu   a0,a0,0
>  7c:   8fa50010        lw      a1,16(sp)
>  80:   3c040000        lui     a0,0x0
>  84:   0c000000        jal     0 <mips_mt_regdump>
>  88:   24840000        addiu   a0,a0,0
>  8c:   3c040000        lui     a0,0x0
>  90:   24840000        addiu   a0,a0,0
>  94:   0c000000        jal     0 <mips_mt_regdump>
>  98:   02a02821        move    a1,s5
>  9c:   40110002        mfc0    s1,c0_mvpconf0
>  a0:   3c040000        lui     a0,0x0
>  a4:   02202821        move    a1,s1
>  a8:   0c000000        jal     0 <mips_mt_regdump>
>  ac:   24840000        addiu   a0,a0,0
>  b0:   3c040000        lui     a0,0x0
>  b4:   0c000000        jal     0 <mips_mt_regdump>
>  b8:   24840000        addiu   a0,a0,0
>  bc:   7e331a80        ext     s3,s1,0xa,0x4
>  c0:   3c090000        lui     t1,0x0
>  c4:   323100ff        andi    s1,s1,0xff
>  c8:   3c080000        lui     t0,0x0
>  cc:   3c030000        lui     v1,0x0
>  d0:   3c1e0000        lui     s8,0x0
>  d4:   3c170000        lui     s7,0x0
>  d8:   3c160000        lui     s6,0x0
>  dc:   3c0a0000        lui     t2,0x0
>  e0:   26730001        addiu   s3,s3,1
>  e4:   26310001        addiu   s1,s1,1
>  e8:   00008021        move    s0,zero
>  ec:   2412ff00        li      s2,-256
>  f0:   25290000        addiu   t1,t1,0
>  f4:   25080000        addiu   t0,t0,0
>  f8:   24630000        addiu   v1,v1,0
>  fc:   27de0000        addiu   s8,s8,0
>  100:   26f70000        addiu   s7,s7,0
>  104:   26d60000        addiu   s6,s6,0
>  108:   254a0000        addiu   t2,t2,0
>  10c:   00001021        move    v0,zero
>  110:   40040801        mfc0    a0,c0_vpecontrol
>  114:   00922024        and     a0,a0,s2
>  118:   00442025        or      a0,v0,a0
>  11c:   40840801        mtc0    a0,c0_vpecontrol
>  120:   000000c0        ehb
>  124:   41020802        mftc0   at,c0_tcbind
>  128:   00202021        move    a0,at
>  12c:   24420001        addiu   v0,v0,1
>  130:   3084000f        andi    a0,a0,0xf
>  134:   12040031        beq     s0,a0,1fc <mips_mt_regdump+0x1fc>
>  138:   0051282a        slt     a1,v0,s1
>  13c:   14a0fff4        bnez    a1,110 <mips_mt_regdump+0x110>
>  140:   00000000        nop
>  144:   26100001        addiu   s0,s0,1
>  148:   0213102a        slt     v0,s0,s3
>  14c:   1440fff0        bnez    v0,110 <mips_mt_regdump+0x110>
>  150:   00001021        move    v0,zero
>  154:   3c040000        lui     a0,0x0
>  158:   24840000        addiu   a0,a0,0
>  15c:   3c1e0000        lui     s8,0x0
>  160:   3c170000        lui     s7,0x0
>  164:   3c160000        lui     s6,0x0
>  168:   3c130000        lui     s3,0x0
>  16c:   0c000000        jal     0 <mips_mt_regdump>
>  170:   3c120000        lui     s2,0x0
>  174:   00008021        move    s0,zero
>  178:   27de0000        addiu   s8,s8,0
>  17c:   26f70000        addiu   s7,s7,0
>  180:   26d60000        addiu   s6,s6,0
>  184:   26730000        addiu   s3,s3,0
>  188:   26520000        addiu   s2,s2,0
>  18c:   40020801        mfc0    v0,c0_vpecontrol
>  190:   2403ff00        li      v1,-256
>  194:   00431024        and     v0,v0,v1
>  198:   02021025        or      v0,s0,v0
>  19c:   40820801        mtc0    v0,c0_vpecontrol
>  1a0:   000000c0        ehb
>  1a4:   41020802        mftc0   at,c0_tcbind
>  1a8:   00201821        move    v1,at
>  1ac:   40021002        mfc0    v0,c0_tcbind
>  1b0:   1062003f        beq     v1,v0,2b0 <mips_mt_regdump+0x2b0>
>  1b4:   00000000        nop
>  1b8:   41020804        mftc0   at,c0_tchalt
>  1bc:   00201821        move    v1,at
>  1c0:   24020001        li      v0,1
>  1c4:   00400821        move    at,v0
>  1c8:   41811004        mttc0   at,c0_tchalt
>  1cc:   41020801        mftc0   at,c0_tcstatus
>  1d0:   00203021        move    a2,at
>  1d4:   3c040000        lui     a0,0x0
>  1d8:   02002821        move    a1,s0
>  1dc:   24840000        addiu   a0,a0,0
>  1e0:   afa3001c        sw      v1,28(sp)
>  1e4:   0c000000        jal     0 <mips_mt_regdump>
>  1e8:   afa60010        sw      a2,16(sp)
>  1ec:   8fa60010        lw      a2,16(sp)
>  1f0:   8fa3001c        lw      v1,28(sp)
>  1f4:   080000b2        j       2c8 <mips_mt_regdump+0x2c8>
>  1f8:   00c02821        move    a1,a2
>  1fc:   01202021        move    a0,t1
>  200:   02002821        move    a1,s0
>  204:   afa3001c        sw      v1,28(sp)
>  208:   afa80014        sw      t0,20(sp)
>  20c:   afa90010        sw      t1,16(sp)
>  210:   0c000000        jal     0 <mips_mt_regdump>
>  214:   afaa0018        sw      t2,24(sp)
>  218:   41010801        mftc0   at,c0_vpecontrol
>  21c:   00202821        move    a1,at
>  220:   8fa80014        lw      t0,20(sp)
>  224:   0c000000        jal     0 <mips_mt_regdump>
>  228:   01002021        move    a0,t0
>  22c:   41010802        mftc0   at,c0_vpeconf0
>  230:   00202821        move    a1,at
>  234:   8fa3001c        lw      v1,28(sp)
>  238:   0c000000        jal     0 <mips_mt_regdump>
>  23c:   00602021        move    a0,v1
>  240:   410c0800        mftc0   at,c0_status
>  244:   00203021        move    a2,at
>  248:   03c02021        move    a0,s8
>  24c:   0c000000        jal     0 <mips_mt_regdump>
>  250:   02002821        move    a1,s0
>  254:   410e0800        mftc0   at,c0_epc
>  258:   00203021        move    a2,at
>  25c:   410e0800        mftc0   at,c0_epc
>  260:   00203821        move    a3,at
>  264:   02e02021        move    a0,s7
>  268:   0c000000        jal     0 <mips_mt_regdump>
>  26c:   02002821        move    a1,s0
>  270:   410d0800        mftc0   at,c0_cause
>  274:   00203021        move    a2,at
>  278:   02c02021        move    a0,s6
>  27c:   0c000000        jal     0 <mips_mt_regdump>
>  280:   02002821        move    a1,s0
>  284:   41100807        mftc0   at,$16,7
>  288:   00203021        move    a2,at
>  28c:   8faa0018        lw      t2,24(sp)
>  290:   02002821        move    a1,s0
>  294:   0c000000        jal     0 <mips_mt_regdump>
>  298:   01402021        move    a0,t2
>  29c:   8fa3001c        lw      v1,28(sp)
>  2a0:   8fa80014        lw      t0,20(sp)
>  2a4:   8fa90010        lw      t1,16(sp)
>  2a8:   08000051        j       144 <mips_mt_regdump+0x144>
>  2ac:   8faa0018        lw      t2,24(sp)
>  2b0:   3c040000        lui     a0,0x0
>  2b4:   02002821        move    a1,s0
>  2b8:   0c000000        jal     0 <mips_mt_regdump>
>  2bc:   24840000        addiu   a0,a0,0
>  2c0:   00001821        move    v1,zero
>  2c4:   02802821        move    a1,s4
>  2c8:   03c02021        move    a0,s8
>  2cc:   0c000000        jal     0 <mips_mt_regdump>
>  2d0:   afa3001c        sw      v1,28(sp)
>  2d4:   41020802        mftc0   at,c0_tcbind
>  2d8:   00202821        move    a1,at
>  2dc:   0c000000        jal     0 <mips_mt_regdump>
>  2e0:   02e02021        move    a0,s7
>  2e4:   41020803        mftc0   at,c0_tcrestart
>  2e8:   00202821        move    a1,at
>  2ec:   41020803        mftc0   at,c0_tcrestart
>  2f0:   00203021        move    a2,at
>  2f4:   0c000000        jal     0 <mips_mt_regdump>
>  2f8:   02c02021        move    a0,s6
>  2fc:   8fa3001c        lw      v1,28(sp)
>  300:   02602021        move    a0,s3
>  304:   0c000000        jal     0 <mips_mt_regdump>
>  308:   00602821        move    a1,v1
>  30c:   41020805        mftc0   at,c0_tccontext
>  310:   00202821        move    a1,at
>  314:   0c000000        jal     0 <mips_mt_regdump>
>  318:   02402021        move    a0,s2
>  31c:   8fa3001c        lw      v1,28(sp)
>  320:   14600003        bnez    v1,330 <mips_mt_regdump+0x330>
>  324:   00001021        move    v0,zero
>  328:   00400821        move    at,v0
>  32c:   41811004        mttc0   at,c0_tchalt
>  330:   26100001        addiu   s0,s0,1
>  334:   0211102a        slt     v0,s0,s1
>  338:   1440ff94        bnez    v0,18c <mips_mt_regdump+0x18c>
>  33c:   00000000        nop
>  340:   0c000000        jal     0 <mips_mt_regdump>
>  344:   32b50001        andi    s5,s5,0x1
>  348:   3c040000        lui     a0,0x0
>  34c:   0c000000        jal     0 <mips_mt_regdump>
>  350:   24840000        addiu   a0,a0,0
>  354:   12a00004        beqz    s5,368 <mips_mt_regdump+0x368>
>  358:   32820400        andi    v0,s4,0x400
>  35c:   41600021        evpe
>  360:   000000c0        ehb
>  364:   32820400        andi    v0,s4,0x400
>  368:   14400003        bnez    v0,378 <mips_mt_regdump+0x378>
>  36c:   00000000        nop
>  370:   0c000000        jal     0 <mips_mt_regdump>
>  374:   00000000        nop
>  378:   40011001        mfc0    at,c0_tcstatus
>  37c:   32940400        andi    s4,s4,0x400
>  380:   34210400        ori     at,at,0x400
>  384:   38210400        xori    at,at,0x400
>  388:   0281a025        or      s4,s4,at
>  38c:   40941001        mtc0    s4,c0_tcstatus
>  390:   000000c0        ehb
>  394:   8fbf0044        lw      ra,68(sp)
>  398:   8fbe0040        lw      s8,64(sp)
>  39c:   8fb7003c        lw      s7,60(sp)
>  3a0:   8fb60038        lw      s6,56(sp)
>  3a4:   8fb50034        lw      s5,52(sp)
>  3a8:   8fb40030        lw      s4,48(sp)
>  3ac:   8fb3002c        lw      s3,44(sp)
>  3b0:   8fb20028        lw      s2,40(sp)
>  3b4:   8fb10024        lw      s1,36(sp)
>  3b8:   8fb00020        lw      s0,32(sp)
>  3bc:   03e00008        jr      ra
>  3c0:   27bd0048        addiu   sp,sp,72
> 
> 
> On Sat, Dec 18, 2010 at 3:05 AM, Kevin D. Kissell <kevink@paralogos.com>
> wrote:
> So, Anoop, if you get a minute for this any time in the next day or so
> (after which I'll have very limited net access until next year), could you
> please do an <mumble>-mips<mumble>-objdump --disassemble of your kernel
> image (or even just the mips-mt.o module) from a failing kernel build and
> post the disassembly of mips_mt_regdump()?  The confirmation or refutation
> of the theory about local_irq_save() no longer being built correctly for
> SMTC would be within the first few instructions...
> 
> /K.
> 
> 
> On 12/16/10 11:58, Kevin D. Kissell wrote:
> Ralf tells me that this message got blocked by the LMO server due to HTML
> content.
> So here it is again, textier.
> 
> On 12/16/10 11:24, Kevin D. Kissell wrote:
> On 12/16/10 07:37, STUART VENTERS wrote:
> 
> Two other possible clues:
> 
> The EVP is clear in the MVPControl register.
> Does this say that only VPE0, T0 gets to run?
> That's correct.  In the maxtcs=1/maxvpes=1 boot state, it wouldn't matter.
>  It's just possible that setting EVP is conditional on more than one VPE
> being used, but that's not the way I remember it.
> 
> Also the EXCPT bits in VPEControl for VPE1 indicate a Gating Storage
> Exception dispatch.
> But that seems to conflict the EVP bit above.
> I don't have a copy of the ASE spec handy to see whether those bits have a
> defined power-on value, but particularly if maxvpes=1 was set at boot
> time,
> I would expect VPE1's registers to be in a partly random power-up state.
> 
> Perhaps these are an artifact of getting to a good state to dump things
> out.
> As per my previous mail, I looked at the MT register dump source, and it
> really does pull values directly
> out of registers and doesn't depend on having a sane kernel stack frame.
>  The exceptions to that rule
> are the reported values for TCStatus of the executing TC, which is based
> on the perhaps-now-broken
> assumption that local_irq_save(flags) stores the *entire* pre-invocation
> value of the TCStatus register
> in the flags variable, and MVPcontrol, which is based on the assumption
> that dvpe() returns the pre-invocation
> value of MVPcontrol.  Break those assumptions, and you'll get inconsistent
> state dumps like this,
> and very possibly incorrect execution.   Particularly if what was done was
> that effectively replaces
> the SMTC-specific implementation of local_irq_save()/local_irq_restore()
> with something that uses
> the generic MIPS32R2 atomic interrupt enable/disable instructions.  That
> would have been a *very* bad idea...
> 
>             Regards,
> 
>             Kevin K.
> 
> 
> 

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: SMTC support status in latest git head.
  2010-12-21 20:29               ` Anoop P.A.
  (?)
@ 2010-12-22 10:27               ` Kevin D. Kissell
  2010-12-22 11:35                 ` Anoop P A
  -1 siblings, 1 reply; 68+ messages in thread
From: Kevin D. Kissell @ 2010-12-22 10:27 UTC (permalink / raw)
  To: Anoop P.A.; +Cc: Anoop P A, STUART VENTERS, linux-mips

 > Sorry I misunderstood file. git blame shows that "andi" is around for 
quite
 > some time.

I've never used git blame, so I don't know how far it can be trusted, 
but if that change was made in 2006, that would predate the major 
breakage by several
years.  So my suggestion from yesterday is a reasonable one:

 > I think that if you were to tweak mips-mt.c at line 103 to change
 > the
 >
 >        tcstatval = flags; /* And pre-dump TCStatus is flags */
 >
 > to something more like
 >
 > /* Pre-dump TCStatus Interrupt Inhibit bit is in flags variable */
 > tcstatval = (read_c0_tcstatus() & ~0x400) | flags;
 >
 > should fix the dump.

With that patch, if you re-run the experiment of hang-breakout-dump, we 
might be able to deduce something.

Ralf wrote to me independently to say that my message from yesterday 
with that suggestion and some other commentary got eaten once again by 
the LMO mail forwarder because of the HTML content.  With all due 
respect, I'm using a very standard open-source mail client (Thunderbird) 
with a very normal option (reply to text with text, HTML with HTML).  
Perhaps it it's the LMO mail system that needs to change, and not the 
mail configurations of the whole LMO community.

             Regards,

             Kevin K.

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: SMTC support status in latest git head.
  2010-12-22 10:27               ` Kevin D. Kissell
@ 2010-12-22 11:35                 ` Anoop P A
  2010-12-22 11:37                   ` Kevin D. Kissell
  0 siblings, 1 reply; 68+ messages in thread
From: Anoop P A @ 2010-12-22 11:35 UTC (permalink / raw)
  To: Kevin D. Kissell; +Cc: Anoop P.A., STUART VENTERS, linux-mips

On Wed, 2010-12-22 at 02:27 -0800, Kevin D. Kissell wrote:
> > Sorry I misunderstood file. git blame shows that "andi" is around for 
> quite
>  > some time.
> 
> I've never used git blame, so I don't know how far it can be trusted, 
> but if that change was made in 2006, that would predate the major 
> breakage by several
> years.  So my suggestion from yesterday is a reasonable one:
That change is present in booting 2.6.32 kernel.Corresponding patch can
be found in gitweb .
http://git.linux-mips.org/?p=linux.git;a=commitdiff;h=41c594ab65fc89573af296d192aa5235d09717ab#patch39

> 
>  > I think that if you were to tweak mips-mt.c at line 103 to change
>  > the
>  >
>  >        tcstatval = flags; /* And pre-dump TCStatus is flags */
>  >
>  > to something more like
>  >
>  > /* Pre-dump TCStatus Interrupt Inhibit bit is in flags variable */
>  > tcstatval = (read_c0_tcstatus() & ~0x400) | flags;
>  >
>  > should fix the dump.
> 
> With that patch, if you re-run the experiment of hang-breakout-dump, we 
> might be able to deduce something.
Here is the dump with the patch. 

[    0.000000] Calibrating delay loop... === MIPS MT State Dump ===
[    0.000000] -- Global State --
[    0.000000]    MVPControl Passed: 00000000
[    0.000000]    MVPControl Read: 00000000
[    0.000000]    MVPConf0 : a8008406
[    0.000000] -- per-VPE State --
[    0.000000]   VPE 0
[    0.000000]    VPEControl : 00000000
[    0.000000]    VPEConf0 : 800f0003
[    0.000000]    VPE0.Status : 11004001
[    0.000000]    VPE0.EPC : 80100000 _stext+0x0/0x10
[    0.000000]    VPE0.Cause : e080407c
[    0.000000]    VPE0.Config7 : 00010000
[    0.000000]   VPE 1
[    0.000000]    VPEControl : 00030000
[    0.000000]    VPEConf0 : 800f0000
[    0.000000]    VPE1.Status : 00407904
[    0.000000]    VPE1.EPC : fffdffff 0xfffdffff
[    0.000000]    VPE1.Cause : 4000027c
[    0.000000]    VPE1.Config7 : 00010000
[    0.000000] -- per-TC State --
[    0.000000]   TC 0 (current TC with VPE EPC above)
[    0.000000]    TCStatus : 11004001
[    0.000000]    TCBind : 00000000
[    0.000000]    TCRestart : 803fc408 printk+0x10/0x30
[    0.000000]    TCHalt : 00000000
[    0.000000]    TCContext : 00000000
[    0.000000]   TC 1
[    0.000000]    TCStatus : 00000000
[    0.000000]    TCBind : 00200001
[    0.000000]    TCRestart : 3ffffffe 0x3ffffffe
[    0.000000]    TCHalt : 00000001
[    0.000000]    TCContext : efffffff
[    0.000000]   TC 2
[    0.000000]    TCStatus : 00000000
[    0.000000]    TCBind : 00400001
[    0.000000]    TCRestart : ffffffee 0xffffffee
[    0.000000]    TCHalt : 00000001
[    0.000000]    TCContext : efffffbf
[    0.000000]   TC 3
[    0.000000]    TCStatus : 00000000
[    0.000000]    TCBind : 00600001
[    0.000000]    TCRestart : ffe00200 0xffe00200
[    0.000000]    TCHalt : 00000001
[    0.000000]    TCContext : 7fffb77f
[    0.000000]   TC 4
[    0.000000]    TCStatus : 00000000
[    0.000000]    TCBind : 00800001
[    0.000000]    TCRestart : ffe00200 0xffe00200
[    0.000000]    TCHalt : 00000001
[    0.000000]    TCContext : 7ffdf736
[    0.000000]   TC 5
[    0.000000]    TCStatus : 00000000
[    0.000000]    TCBind : 00a00001
[    0.000000]    TCRestart : ffe00200 0xffe00200
[    0.000000]    TCHalt : 00000001
[    0.000000]    TCContext : ee5ffff7
[    0.000000]   TC 6
[    0.000000]    TCStatus : 00000000
[    0.000000]    TCBind : 00c00001
[    0.000000]    TCRestart : f7ff7ffe 0xf7ff7ffe
[    0.000000]    TCHalt : 00000001
[    0.000000]    TCContext : e6fffffb
[    0.000000] Counter Interrupts taken per CPU (TC)
[    0.000000] 0: 0
[    0.000000] 1: 0
[    0.000000] Self-IPI invocations:
[    0.000000] 0: 0
[    0.000000] 1: 0
[    0.000000] IPIQ[0]: head = 0x0, tail = 0x0, depth = 0
[    0.000000] IPIQ[1]: head = 0x0, tail = 0x0, depth = 0
[    0.000000] 0 Recoveries of "stolen" FPU
[    0.000000] ===========================
[    0.010000] === MIPS MT State Dump ===
[    0.010000] -- Global State --
[    0.010000]    MVPControl Passed: 00000000
[    0.010000]    MVPControl Read: 00000000
[    0.010000]    MVPConf0 : a8008406
[    0.010000] -- per-VPE State --
[    0.010000]   VPE 0
[    0.010000]    VPEControl : 00000000
[    0.010000]    VPEConf0 : 800f0003
[    0.010000]    VPE0.Status : 18004000
[    0.010000]    VPE0.EPC : 8010c9b4 mips_mt_regdump+0x3a4/0x3d4
[    0.010000]    VPE0.Cause : 50804000
[    0.010000]    VPE0.Config7 : 00010000
[    0.010000]   VPE 1
[    0.010000]    VPEControl : 00030000
[    0.010000]    VPEConf0 : 800f0000
[    0.010000]    VPE1.Status : 00407904
[    0.010000]    VPE1.EPC : fffdffff 0xfffdffff
[    0.010000]    VPE1.Cause : 4000027c
[    0.010000]    VPE1.Config7 : 00010000
[    0.010000] -- per-TC State --
[    0.010000]   TC 0 (current TC with VPE EPC above)
[    0.010000]    TCStatus : 18004000
[    0.010000]    TCBind : 00000000
[    0.010000]    TCRestart : 803fc408 printk+0x10/0x30
[    0.010000]    TCHalt : 00000000
[    0.010000]    TCContext : 00000000
[    0.010000]   TC 1
[    0.010000]    TCStatus : 00000000
[    0.010000]    TCBind : 00200001
[    0.010000]    TCRestart : 3ffffffe 0x3ffffffe
[    0.010000]    TCHalt : 00000001
[    0.010000]    TCContext : efffffff
[    0.010000]   TC 2
[    0.010000]    TCStatus : 00000000
[    0.010000]    TCBind : 00400001
[    0.010000]    TCRestart : ffffffee 0xffffffee
[    0.010000]    TCHalt : 00000001
[    0.010000]    TCContext : efffffbf
[    0.010000]   TC 3
[    0.010000]    TCStatus : 00000000
[    0.010000]    TCBind : 00600001
[    0.010000]    TCRestart : ffe00200 0xffe00200
[    0.010000]    TCHalt : 00000001
[    0.010000]    TCContext : 7fffb77f
[    0.010000]   TC 4
[    0.010000]    TCStatus : 00000000
[    0.010000]    TCBind : 00800001
[    0.010000]    TCRestart : ffe00200 0xffe00200
[    0.010000]    TCHalt : 00000001
[    0.010000]    TCContext : 7ffdf736
[    0.010000]   TC 5
[    0.010000]    TCStatus : 00000000
[    0.010000]    TCBind : 00a00001
[    0.010000]    TCRestart : ffe00200 0xffe00200
[    0.010000]    TCHalt : 00000001
[    0.010000]    TCContext : ee5ffff7
[    0.010000]   TC 6
[    0.010000]    TCStatus : 00000000
[    0.010000]    TCBind : 00c00001
[    0.010000]    TCRestart : f7ff7ffe 0xf7ff7ffe
[    0.010000]    TCHalt : 00000001
[    0.010000]    TCContext : e6fffffb
[    0.010000] Counter Interrupts taken per CPU (TC)
[    0.010000] 0: 0
[    0.010000] 1: 0
[    0.010000] Self-IPI invocations:
[    0.010000] 0: 0
[    0.010000] 1: 0
[    0.010000] IPIQ[0]: head = 0x0, tail = 0x0, depth = 0
[    0.010000] IPIQ[1]: head = 0x0, tail = 0x0, depth = 0
[    0.010000] 0 Recoveries of "stolen" FPU
[    0.010000] ===========================

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: SMTC support status in latest git head.
  2010-12-22 11:35                 ` Anoop P A
@ 2010-12-22 11:37                   ` Kevin D. Kissell
  2010-12-22 11:51                     ` Anoop P A
  0 siblings, 1 reply; 68+ messages in thread
From: Kevin D. Kissell @ 2010-12-22 11:37 UTC (permalink / raw)
  To: Anoop P A; +Cc: Anoop P.A., STUART VENTERS, linux-mips

Thanks.  This is indeed strange.  The VPE0 Status and TC0 TCStatus/Cause 
all indicate that interrupts are enabled and not inhibited at the per-TC 
level, and the presumed timer interrupt, in the 0x4000 bit, is present 
and not masked-off.  Logically, the system must be entering (and 
exiting) the interrupt handler, yet the timer calibration isn't 
completing.  That leaves more complex possible explanations for failure, 
most of which would fall into two categories:

1)  The platform interrupt handler is failing to decode the event 
properly as a timer event.
2)  Despite there being only one TC active, the calibration code is 
waiting for some handshake from another "CPU"

To test the first, you might consider adding a kprintf() to the case of 
a "spurious" timer-like interrupt being detected and ignored...

             Regards,

             Kevin K.

On 12/22/10 3:35 AM, Anoop P A wrote:
> On Wed, 2010-12-22 at 02:27 -0800, Kevin D. Kissell wrote:
>>> Sorry I misunderstood file. git blame shows that "andi" is around for
>> quite
>>   >  some time.
>>
>> I've never used git blame, so I don't know how far it can be trusted,
>> but if that change was made in 2006, that would predate the major
>> breakage by several
>> years.  So my suggestion from yesterday is a reasonable one:
> That change is present in booting 2.6.32 kernel.Corresponding patch can
> be found in gitweb .
> http://git.linux-mips.org/?p=linux.git;a=commitdiff;h=41c594ab65fc89573af296d192aa5235d09717ab#patch39
>
>>   >  I think that if you were to tweak mips-mt.c at line 103 to change
>>   >  the
>>   >
>>   >         tcstatval = flags; /* And pre-dump TCStatus is flags */
>>   >
>>   >  to something more like
>>   >
>>   >  /* Pre-dump TCStatus Interrupt Inhibit bit is in flags variable */
>>   >  tcstatval = (read_c0_tcstatus()&  ~0x400) | flags;
>>   >
>>   >  should fix the dump.
>>
>> With that patch, if you re-run the experiment of hang-breakout-dump, we
>> might be able to deduce something.
> Here is the dump with the patch.
>
> [    0.000000] Calibrating delay loop... === MIPS MT State Dump ===
> [    0.000000] -- Global State --
> [    0.000000]    MVPControl Passed: 00000000
> [    0.000000]    MVPControl Read: 00000000
> [    0.000000]    MVPConf0 : a8008406
> [    0.000000] -- per-VPE State --
> [    0.000000]   VPE 0
> [    0.000000]    VPEControl : 00000000
> [    0.000000]    VPEConf0 : 800f0003
> [    0.000000]    VPE0.Status : 11004001
> [    0.000000]    VPE0.EPC : 80100000 _stext+0x0/0x10
> [    0.000000]    VPE0.Cause : e080407c
> [    0.000000]    VPE0.Config7 : 00010000
> [    0.000000]   VPE 1
> [    0.000000]    VPEControl : 00030000
> [    0.000000]    VPEConf0 : 800f0000
> [    0.000000]    VPE1.Status : 00407904
> [    0.000000]    VPE1.EPC : fffdffff 0xfffdffff
> [    0.000000]    VPE1.Cause : 4000027c
> [    0.000000]    VPE1.Config7 : 00010000
> [    0.000000] -- per-TC State --
> [    0.000000]   TC 0 (current TC with VPE EPC above)
> [    0.000000]    TCStatus : 11004001
> [    0.000000]    TCBind : 00000000
> [    0.000000]    TCRestart : 803fc408 printk+0x10/0x30
> [    0.000000]    TCHalt : 00000000
> [    0.000000]    TCContext : 00000000
> [    0.000000]   TC 1
> [    0.000000]    TCStatus : 00000000
> [    0.000000]    TCBind : 00200001
> [    0.000000]    TCRestart : 3ffffffe 0x3ffffffe
> [    0.000000]    TCHalt : 00000001
> [    0.000000]    TCContext : efffffff
> [    0.000000]   TC 2
> [    0.000000]    TCStatus : 00000000
> [    0.000000]    TCBind : 00400001
> [    0.000000]    TCRestart : ffffffee 0xffffffee
> [    0.000000]    TCHalt : 00000001
> [    0.000000]    TCContext : efffffbf
> [    0.000000]   TC 3
> [    0.000000]    TCStatus : 00000000
> [    0.000000]    TCBind : 00600001
> [    0.000000]    TCRestart : ffe00200 0xffe00200
> [    0.000000]    TCHalt : 00000001
> [    0.000000]    TCContext : 7fffb77f
> [    0.000000]   TC 4
> [    0.000000]    TCStatus : 00000000
> [    0.000000]    TCBind : 00800001
> [    0.000000]    TCRestart : ffe00200 0xffe00200
> [    0.000000]    TCHalt : 00000001
> [    0.000000]    TCContext : 7ffdf736
> [    0.000000]   TC 5
> [    0.000000]    TCStatus : 00000000
> [    0.000000]    TCBind : 00a00001
> [    0.000000]    TCRestart : ffe00200 0xffe00200
> [    0.000000]    TCHalt : 00000001
> [    0.000000]    TCContext : ee5ffff7
> [    0.000000]   TC 6
> [    0.000000]    TCStatus : 00000000
> [    0.000000]    TCBind : 00c00001
> [    0.000000]    TCRestart : f7ff7ffe 0xf7ff7ffe
> [    0.000000]    TCHalt : 00000001
> [    0.000000]    TCContext : e6fffffb
> [    0.000000] Counter Interrupts taken per CPU (TC)
> [    0.000000] 0: 0
> [    0.000000] 1: 0
> [    0.000000] Self-IPI invocations:
> [    0.000000] 0: 0
> [    0.000000] 1: 0
> [    0.000000] IPIQ[0]: head = 0x0, tail = 0x0, depth = 0
> [    0.000000] IPIQ[1]: head = 0x0, tail = 0x0, depth = 0
> [    0.000000] 0 Recoveries of "stolen" FPU
> [    0.000000] ===========================
> [    0.010000] === MIPS MT State Dump ===
> [    0.010000] -- Global State --
> [    0.010000]    MVPControl Passed: 00000000
> [    0.010000]    MVPControl Read: 00000000
> [    0.010000]    MVPConf0 : a8008406
> [    0.010000] -- per-VPE State --
> [    0.010000]   VPE 0
> [    0.010000]    VPEControl : 00000000
> [    0.010000]    VPEConf0 : 800f0003
> [    0.010000]    VPE0.Status : 18004000
> [    0.010000]    VPE0.EPC : 8010c9b4 mips_mt_regdump+0x3a4/0x3d4
> [    0.010000]    VPE0.Cause : 50804000
> [    0.010000]    VPE0.Config7 : 00010000
> [    0.010000]   VPE 1
> [    0.010000]    VPEControl : 00030000
> [    0.010000]    VPEConf0 : 800f0000
> [    0.010000]    VPE1.Status : 00407904
> [    0.010000]    VPE1.EPC : fffdffff 0xfffdffff
> [    0.010000]    VPE1.Cause : 4000027c
> [    0.010000]    VPE1.Config7 : 00010000
> [    0.010000] -- per-TC State --
> [    0.010000]   TC 0 (current TC with VPE EPC above)
> [    0.010000]    TCStatus : 18004000
> [    0.010000]    TCBind : 00000000
> [    0.010000]    TCRestart : 803fc408 printk+0x10/0x30
> [    0.010000]    TCHalt : 00000000
> [    0.010000]    TCContext : 00000000
> [    0.010000]   TC 1
> [    0.010000]    TCStatus : 00000000
> [    0.010000]    TCBind : 00200001
> [    0.010000]    TCRestart : 3ffffffe 0x3ffffffe
> [    0.010000]    TCHalt : 00000001
> [    0.010000]    TCContext : efffffff
> [    0.010000]   TC 2
> [    0.010000]    TCStatus : 00000000
> [    0.010000]    TCBind : 00400001
> [    0.010000]    TCRestart : ffffffee 0xffffffee
> [    0.010000]    TCHalt : 00000001
> [    0.010000]    TCContext : efffffbf
> [    0.010000]   TC 3
> [    0.010000]    TCStatus : 00000000
> [    0.010000]    TCBind : 00600001
> [    0.010000]    TCRestart : ffe00200 0xffe00200
> [    0.010000]    TCHalt : 00000001
> [    0.010000]    TCContext : 7fffb77f
> [    0.010000]   TC 4
> [    0.010000]    TCStatus : 00000000
> [    0.010000]    TCBind : 00800001
> [    0.010000]    TCRestart : ffe00200 0xffe00200
> [    0.010000]    TCHalt : 00000001
> [    0.010000]    TCContext : 7ffdf736
> [    0.010000]   TC 5
> [    0.010000]    TCStatus : 00000000
> [    0.010000]    TCBind : 00a00001
> [    0.010000]    TCRestart : ffe00200 0xffe00200
> [    0.010000]    TCHalt : 00000001
> [    0.010000]    TCContext : ee5ffff7
> [    0.010000]   TC 6
> [    0.010000]    TCStatus : 00000000
> [    0.010000]    TCBind : 00c00001
> [    0.010000]    TCRestart : f7ff7ffe 0xf7ff7ffe
> [    0.010000]    TCHalt : 00000001
> [    0.010000]    TCContext : e6fffffb
> [    0.010000] Counter Interrupts taken per CPU (TC)
> [    0.010000] 0: 0
> [    0.010000] 1: 0
> [    0.010000] Self-IPI invocations:
> [    0.010000] 0: 0
> [    0.010000] 1: 0
> [    0.010000] IPIQ[0]: head = 0x0, tail = 0x0, depth = 0
> [    0.010000] IPIQ[1]: head = 0x0, tail = 0x0, depth = 0
> [    0.010000] 0 Recoveries of "stolen" FPU
> [    0.010000] ===========================
>
>
>

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: SMTC support status in latest git head.
  2010-12-22 11:37                   ` Kevin D. Kissell
@ 2010-12-22 11:51                     ` Anoop P A
  2010-12-22 13:03                       ` Kevin D. Kissell
  0 siblings, 1 reply; 68+ messages in thread
From: Anoop P A @ 2010-12-22 11:51 UTC (permalink / raw)
  To: Kevin D. Kissell; +Cc: Anoop P.A., STUART VENTERS, linux-mips

On Wed, 2010-12-22 at 03:37 -0800, Kevin D. Kissell wrote:
> Thanks.  This is indeed strange.  The VPE0 Status and TC0 TCStatus/Cause 
> all indicate that interrupts are enabled and not inhibited at the per-TC 
> level, and the presumed timer interrupt, in the 0x4000 bit, is present 
> and not masked-off.  Logically, the system must be entering (and 
> exiting) the interrupt handler, yet the timer calibration isn't 
> completing.  That leaves more complex possible explanations for failure, 
> most of which would fall into two categories:
> 
> 1)  The platform interrupt handler is failing to decode the event 
> properly as a timer event.
> 2)  Despite there being only one TC active, the calibration code is 
> waiting for some handshake from another "CPU"
> 
> To test the first, you might consider adding a kprintf() to the case of 
> a "spurious" timer-like interrupt being detected and ignored...

I have tried it . only one interrupt is coming and platform handler
detect it as timer interrupt and acknowledges properly . you can see a
time stamp change in the logs.

> 
>              Regards,
> 
>              Kevin K.
> 
> On 12/22/10 3:35 AM, Anoop P A wrote:
> > On Wed, 2010-12-22 at 02:27 -0800, Kevin D. Kissell wrote:
> >>> Sorry I misunderstood file. git blame shows that "andi" is around for
> >> quite
> >>   >  some time.
> >>
> >> I've never used git blame, so I don't know how far it can be trusted,
> >> but if that change was made in 2006, that would predate the major
> >> breakage by several
> >> years.  So my suggestion from yesterday is a reasonable one:
> > That change is present in booting 2.6.32 kernel.Corresponding patch can
> > be found in gitweb .
> > http://git.linux-mips.org/?p=linux.git;a=commitdiff;h=41c594ab65fc89573af296d192aa5235d09717ab#patch39
> >
> >>   >  I think that if you were to tweak mips-mt.c at line 103 to change
> >>   >  the
> >>   >
> >>   >         tcstatval = flags; /* And pre-dump TCStatus is flags */
> >>   >
> >>   >  to something more like
> >>   >
> >>   >  /* Pre-dump TCStatus Interrupt Inhibit bit is in flags variable */
> >>   >  tcstatval = (read_c0_tcstatus()&  ~0x400) | flags;
> >>   >
> >>   >  should fix the dump.
> >>
> >> With that patch, if you re-run the experiment of hang-breakout-dump, we
> >> might be able to deduce something.
> > Here is the dump with the patch.
> >
> > [    0.000000] Calibrating delay loop... === MIPS MT State Dump ===
> > [    0.000000] -- Global State --
> > [    0.000000]    MVPControl Passed: 00000000
> > [    0.000000]    MVPControl Read: 00000000
> > [    0.000000]    MVPConf0 : a8008406
> > [    0.000000] -- per-VPE State --
> > [    0.000000]   VPE 0
> > [    0.000000]    VPEControl : 00000000
> > [    0.000000]    VPEConf0 : 800f0003
> > [    0.000000]    VPE0.Status : 11004001
> > [    0.000000]    VPE0.EPC : 80100000 _stext+0x0/0x10
> > [    0.000000]    VPE0.Cause : e080407c
> > [    0.000000]    VPE0.Config7 : 00010000
> > [    0.000000]   VPE 1
> > [    0.000000]    VPEControl : 00030000
> > [    0.000000]    VPEConf0 : 800f0000
> > [    0.000000]    VPE1.Status : 00407904
> > [    0.000000]    VPE1.EPC : fffdffff 0xfffdffff
> > [    0.000000]    VPE1.Cause : 4000027c
> > [    0.000000]    VPE1.Config7 : 00010000
> > [    0.000000] -- per-TC State --
> > [    0.000000]   TC 0 (current TC with VPE EPC above)
> > [    0.000000]    TCStatus : 11004001
> > [    0.000000]    TCBind : 00000000
> > [    0.000000]    TCRestart : 803fc408 printk+0x10/0x30
> > [    0.000000]    TCHalt : 00000000
> > [    0.000000]    TCContext : 00000000
> > [    0.000000]   TC 1
> > [    0.000000]    TCStatus : 00000000
> > [    0.000000]    TCBind : 00200001
> > [    0.000000]    TCRestart : 3ffffffe 0x3ffffffe
> > [    0.000000]    TCHalt : 00000001
> > [    0.000000]    TCContext : efffffff
> > [    0.000000]   TC 2
> > [    0.000000]    TCStatus : 00000000
> > [    0.000000]    TCBind : 00400001
> > [    0.000000]    TCRestart : ffffffee 0xffffffee
> > [    0.000000]    TCHalt : 00000001
> > [    0.000000]    TCContext : efffffbf
> > [    0.000000]   TC 3
> > [    0.000000]    TCStatus : 00000000
> > [    0.000000]    TCBind : 00600001
> > [    0.000000]    TCRestart : ffe00200 0xffe00200
> > [    0.000000]    TCHalt : 00000001
> > [    0.000000]    TCContext : 7fffb77f
> > [    0.000000]   TC 4
> > [    0.000000]    TCStatus : 00000000
> > [    0.000000]    TCBind : 00800001
> > [    0.000000]    TCRestart : ffe00200 0xffe00200
> > [    0.000000]    TCHalt : 00000001
> > [    0.000000]    TCContext : 7ffdf736
> > [    0.000000]   TC 5
> > [    0.000000]    TCStatus : 00000000
> > [    0.000000]    TCBind : 00a00001
> > [    0.000000]    TCRestart : ffe00200 0xffe00200
> > [    0.000000]    TCHalt : 00000001
> > [    0.000000]    TCContext : ee5ffff7
> > [    0.000000]   TC 6
> > [    0.000000]    TCStatus : 00000000
> > [    0.000000]    TCBind : 00c00001
> > [    0.000000]    TCRestart : f7ff7ffe 0xf7ff7ffe
> > [    0.000000]    TCHalt : 00000001
> > [    0.000000]    TCContext : e6fffffb
> > [    0.000000] Counter Interrupts taken per CPU (TC)
> > [    0.000000] 0: 0
> > [    0.000000] 1: 0
> > [    0.000000] Self-IPI invocations:
> > [    0.000000] 0: 0
> > [    0.000000] 1: 0
> > [    0.000000] IPIQ[0]: head = 0x0, tail = 0x0, depth = 0
> > [    0.000000] IPIQ[1]: head = 0x0, tail = 0x0, depth = 0
> > [    0.000000] 0 Recoveries of "stolen" FPU
> > [    0.000000] ===========================
> > [    0.010000] === MIPS MT State Dump ===
> > [    0.010000] -- Global State --
> > [    0.010000]    MVPControl Passed: 00000000
> > [    0.010000]    MVPControl Read: 00000000
> > [    0.010000]    MVPConf0 : a8008406
> > [    0.010000] -- per-VPE State --
> > [    0.010000]   VPE 0
> > [    0.010000]    VPEControl : 00000000
> > [    0.010000]    VPEConf0 : 800f0003
> > [    0.010000]    VPE0.Status : 18004000
> > [    0.010000]    VPE0.EPC : 8010c9b4 mips_mt_regdump+0x3a4/0x3d4
> > [    0.010000]    VPE0.Cause : 50804000
> > [    0.010000]    VPE0.Config7 : 00010000
> > [    0.010000]   VPE 1
> > [    0.010000]    VPEControl : 00030000
> > [    0.010000]    VPEConf0 : 800f0000
> > [    0.010000]    VPE1.Status : 00407904
> > [    0.010000]    VPE1.EPC : fffdffff 0xfffdffff
> > [    0.010000]    VPE1.Cause : 4000027c
> > [    0.010000]    VPE1.Config7 : 00010000
> > [    0.010000] -- per-TC State --
> > [    0.010000]   TC 0 (current TC with VPE EPC above)
> > [    0.010000]    TCStatus : 18004000
> > [    0.010000]    TCBind : 00000000
> > [    0.010000]    TCRestart : 803fc408 printk+0x10/0x30
> > [    0.010000]    TCHalt : 00000000
> > [    0.010000]    TCContext : 00000000
> > [    0.010000]   TC 1
> > [    0.010000]    TCStatus : 00000000
> > [    0.010000]    TCBind : 00200001
> > [    0.010000]    TCRestart : 3ffffffe 0x3ffffffe
> > [    0.010000]    TCHalt : 00000001
> > [    0.010000]    TCContext : efffffff
> > [    0.010000]   TC 2
> > [    0.010000]    TCStatus : 00000000
> > [    0.010000]    TCBind : 00400001
> > [    0.010000]    TCRestart : ffffffee 0xffffffee
> > [    0.010000]    TCHalt : 00000001
> > [    0.010000]    TCContext : efffffbf
> > [    0.010000]   TC 3
> > [    0.010000]    TCStatus : 00000000
> > [    0.010000]    TCBind : 00600001
> > [    0.010000]    TCRestart : ffe00200 0xffe00200
> > [    0.010000]    TCHalt : 00000001
> > [    0.010000]    TCContext : 7fffb77f
> > [    0.010000]   TC 4
> > [    0.010000]    TCStatus : 00000000
> > [    0.010000]    TCBind : 00800001
> > [    0.010000]    TCRestart : ffe00200 0xffe00200
> > [    0.010000]    TCHalt : 00000001
> > [    0.010000]    TCContext : 7ffdf736
> > [    0.010000]   TC 5
> > [    0.010000]    TCStatus : 00000000
> > [    0.010000]    TCBind : 00a00001
> > [    0.010000]    TCRestart : ffe00200 0xffe00200
> > [    0.010000]    TCHalt : 00000001
> > [    0.010000]    TCContext : ee5ffff7
> > [    0.010000]   TC 6
> > [    0.010000]    TCStatus : 00000000
> > [    0.010000]    TCBind : 00c00001
> > [    0.010000]    TCRestart : f7ff7ffe 0xf7ff7ffe
> > [    0.010000]    TCHalt : 00000001
> > [    0.010000]    TCContext : e6fffffb
> > [    0.010000] Counter Interrupts taken per CPU (TC)
> > [    0.010000] 0: 0
> > [    0.010000] 1: 0
> > [    0.010000] Self-IPI invocations:
> > [    0.010000] 0: 0
> > [    0.010000] 1: 0
> > [    0.010000] IPIQ[0]: head = 0x0, tail = 0x0, depth = 0
> > [    0.010000] IPIQ[1]: head = 0x0, tail = 0x0, depth = 0
> > [    0.010000] 0 Recoveries of "stolen" FPU
> > [    0.010000] ===========================
> >
> >
> >
> 

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: SMTC support status in latest git head.
  2010-12-22 11:51                     ` Anoop P A
@ 2010-12-22 13:03                       ` Kevin D. Kissell
  2010-12-22 16:34                           ` STUART VENTERS
  2010-12-23 21:09                           ` STUART VENTERS
  0 siblings, 2 replies; 68+ messages in thread
From: Kevin D. Kissell @ 2010-12-22 13:03 UTC (permalink / raw)
  To: Anoop P A; +Cc: Anoop P.A., STUART VENTERS, linux-mips

On 12/22/10 3:51 AM, Anoop P A wrote:
> On Wed, 2010-12-22 at 03:37 -0800, Kevin D. Kissell wrote:
>> Thanks.  This is indeed strange.  The VPE0 Status and TC0 TCStatus/Cause
>> all indicate that interrupts are enabled and not inhibited at the per-TC
>> level, and the presumed timer interrupt, in the 0x4000 bit, is present
>> and not masked-off.  Logically, the system must be entering (and
>> exiting) the interrupt handler, yet the timer calibration isn't
>> completing.  That leaves more complex possible explanations for failure,
>> most of which would fall into two categories:
>>
>> 1)  The platform interrupt handler is failing to decode the event
>> properly as a timer event.
>> 2)  Despite there being only one TC active, the calibration code is
>> waiting for some handshake from another "CPU"
>>
>> To test the first, you might consider adding a kprintf() to the case of
>> a "spurious" timer-like interrupt being detected and ignored...
> I have tried it . only one interrupt is coming and platform handler
> detect it as timer interrupt and acknowledges properly . you can see a
> time stamp change in the logs.
That's really strange.  And your timer interrupt is definitely on the 
interrupt that corresponds to the 0x4000 mask?

I may have written the MT spec and the original SMTC code, but I don't 
have a copy of the spec, and it's been a few years, and I can't 
interpret the MVP and VPE control/config values. But I just don't see 
how the processor could not be taking more interrupts.  Stuart did 
decode the global/VPE state enough to observe that global multithreaded 
execution wasn't enabled, which is indeed strange - it shouldn't matter 
for single-TC execution, but I don't recall there being any special-case 
in the SMTC initialization that bypassed that enable.  That makes me 
suspect that maybe someone changed the initialization sequence in a way 
that bypasses one of the canonical initialization steps in a way that 
would break SMTC, but I don't know why that would result in the 
interrupt behavior you observe.

It might be yet another blind alley, but could you add/arm diagnostic 
output for each of the initialization functions in smtc.c?

Ah, yes, and one other thing.  You should add a dump of ErrorEPC to the 
MT register dump.  I did it for myself once upon a time when I was 
confronted with a similar mystery, but never filed a patch.  If you're 
breaking in with NMI, that could help identify more precisely where it's 
locking up.

You really ought to try to borrow an EJTAG probe.  It would save us both 
a lot of time.  And my time to trouble-shoot this with you is limited.

             Regards,

             Kevin K.

^ permalink raw reply	[flat|nested] 68+ messages in thread

* RE: SMTC support status in latest git head.
@ 2010-12-22 16:34                           ` STUART VENTERS
  0 siblings, 0 replies; 68+ messages in thread
From: STUART VENTERS @ 2010-12-22 16:34 UTC (permalink / raw)
  To: Anoop P A; +Cc: Anoop P.A., linux-mips, Kevin D. Kissell

Anoop,

Nothing jumps out to me in the new set of register values.

It might be worth dumping all the CP0 registers?
   I'm especially interested in the Config3 to see the VEIC bit.
   The timer registers might be useful as well.

Regards,

Stuart

    

-----Original Message-----
From: Kevin D. Kissell [mailto:kevink@paralogos.com]
Sent: Wednesday, December 22, 2010 7:03 AM
To: Anoop P A
Cc: Anoop P.A.; STUART VENTERS; linux-mips@linux-mips.org
Subject: Re: SMTC support status in latest git head.


On 12/22/10 3:51 AM, Anoop P A wrote:
> On Wed, 2010-12-22 at 03:37 -0800, Kevin D. Kissell wrote:
>> Thanks.  This is indeed strange.  The VPE0 Status and TC0 TCStatus/Cause
>> all indicate that interrupts are enabled and not inhibited at the per-TC
>> level, and the presumed timer interrupt, in the 0x4000 bit, is present
>> and not masked-off.  Logically, the system must be entering (and
>> exiting) the interrupt handler, yet the timer calibration isn't
>> completing.  That leaves more complex possible explanations for failure,
>> most of which would fall into two categories:
>>
>> 1)  The platform interrupt handler is failing to decode the event
>> properly as a timer event.
>> 2)  Despite there being only one TC active, the calibration code is
>> waiting for some handshake from another "CPU"
>>
>> To test the first, you might consider adding a kprintf() to the case of
>> a "spurious" timer-like interrupt being detected and ignored...
> I have tried it . only one interrupt is coming and platform handler
> detect it as timer interrupt and acknowledges properly . you can see a
> time stamp change in the logs.
That's really strange.  And your timer interrupt is definitely on the 
interrupt that corresponds to the 0x4000 mask?

I may have written the MT spec and the original SMTC code, but I don't 
have a copy of the spec, and it's been a few years, and I can't 
interpret the MVP and VPE control/config values. But I just don't see 
how the processor could not be taking more interrupts.  Stuart did 
decode the global/VPE state enough to observe that global multithreaded 
execution wasn't enabled, which is indeed strange - it shouldn't matter 
for single-TC execution, but I don't recall there being any special-case 
in the SMTC initialization that bypassed that enable.  That makes me 
suspect that maybe someone changed the initialization sequence in a way 
that bypasses one of the canonical initialization steps in a way that 
would break SMTC, but I don't know why that would result in the 
interrupt behavior you observe.

It might be yet another blind alley, but could you add/arm diagnostic 
output for each of the initialization functions in smtc.c?

Ah, yes, and one other thing.  You should add a dump of ErrorEPC to the 
MT register dump.  I did it for myself once upon a time when I was 
confronted with a similar mystery, but never filed a patch.  If you're 
breaking in with NMI, that could help identify more precisely where it's 
locking up.

You really ought to try to borrow an EJTAG probe.  It would save us both 
a lot of time.  And my time to trouble-shoot this with you is limited.

             Regards,

             Kevin K.

^ permalink raw reply	[flat|nested] 68+ messages in thread

* RE: SMTC support status in latest git head.
@ 2010-12-22 16:34                           ` STUART VENTERS
  0 siblings, 0 replies; 68+ messages in thread
From: STUART VENTERS @ 2010-12-22 16:34 UTC (permalink / raw)
  To: Anoop P A; +Cc: Anoop P.A., linux-mips, Kevin D. Kissell

Anoop,

Nothing jumps out to me in the new set of register values.

It might be worth dumping all the CP0 registers?
   I'm especially interested in the Config3 to see the VEIC bit.
   The timer registers might be useful as well.

Regards,

Stuart

    

-----Original Message-----
From: Kevin D. Kissell [mailto:kevink@paralogos.com]
Sent: Wednesday, December 22, 2010 7:03 AM
To: Anoop P A
Cc: Anoop P.A.; STUART VENTERS; linux-mips@linux-mips.org
Subject: Re: SMTC support status in latest git head.


On 12/22/10 3:51 AM, Anoop P A wrote:
> On Wed, 2010-12-22 at 03:37 -0800, Kevin D. Kissell wrote:
>> Thanks.  This is indeed strange.  The VPE0 Status and TC0 TCStatus/Cause
>> all indicate that interrupts are enabled and not inhibited at the per-TC
>> level, and the presumed timer interrupt, in the 0x4000 bit, is present
>> and not masked-off.  Logically, the system must be entering (and
>> exiting) the interrupt handler, yet the timer calibration isn't
>> completing.  That leaves more complex possible explanations for failure,
>> most of which would fall into two categories:
>>
>> 1)  The platform interrupt handler is failing to decode the event
>> properly as a timer event.
>> 2)  Despite there being only one TC active, the calibration code is
>> waiting for some handshake from another "CPU"
>>
>> To test the first, you might consider adding a kprintf() to the case of
>> a "spurious" timer-like interrupt being detected and ignored...
> I have tried it . only one interrupt is coming and platform handler
> detect it as timer interrupt and acknowledges properly . you can see a
> time stamp change in the logs.
That's really strange.  And your timer interrupt is definitely on the 
interrupt that corresponds to the 0x4000 mask?

I may have written the MT spec and the original SMTC code, but I don't 
have a copy of the spec, and it's been a few years, and I can't 
interpret the MVP and VPE control/config values. But I just don't see 
how the processor could not be taking more interrupts.  Stuart did 
decode the global/VPE state enough to observe that global multithreaded 
execution wasn't enabled, which is indeed strange - it shouldn't matter 
for single-TC execution, but I don't recall there being any special-case 
in the SMTC initialization that bypassed that enable.  That makes me 
suspect that maybe someone changed the initialization sequence in a way 
that bypasses one of the canonical initialization steps in a way that 
would break SMTC, but I don't know why that would result in the 
interrupt behavior you observe.

It might be yet another blind alley, but could you add/arm diagnostic 
output for each of the initialization functions in smtc.c?

Ah, yes, and one other thing.  You should add a dump of ErrorEPC to the 
MT register dump.  I did it for myself once upon a time when I was 
confronted with a similar mystery, but never filed a patch.  If you're 
breaking in with NMI, that could help identify more precisely where it's 
locking up.

You really ought to try to borrow an EJTAG probe.  It would save us both 
a lot of time.  And my time to trouble-shoot this with you is limited.

             Regards,

             Kevin K.

^ permalink raw reply	[flat|nested] 68+ messages in thread

* RE: SMTC support status in latest git head.
@ 2010-12-23 21:09                           ` STUART VENTERS
  0 siblings, 0 replies; 68+ messages in thread
From: STUART VENTERS @ 2010-12-23 21:09 UTC (permalink / raw)
  To: Kevin D. Kissell; +Cc: Anoop P.A., linux-mips, Anoop P A

[-- Attachment #1: Type: text/plain, Size: 804 bytes --]

Kevin,

I'm not sure if it's useful,
   but finally I got the time to look at the two kernel versions Anoop pointed out.
    works   2.6.32-stable with patch 804
    works_not 2.6.33-stable

greping for files with CONFIG_MIPS_MT_SMTC
   and looking for timer interrupt related stuff found the following differences:


arch/mips/include/asm/irq.h
arch/mips/kernel/irq.c
  do_IRQ
     
arch/mips/include/asm/stackframe.h
  SAVE_SOME SAVE_TEMP get/set_saved_sp

arch/mips/include/asm/time.h
  clocksource_set_clock

arch/mips/kernel/process.c
  cpu_idle

arch/mips/kernel/smtc.c
  __irq_entry
  ipi_decode
      SMTC_CLOCK_TICK 


Enclosed are the two subsets of files for a more expert look.

I'll try to look in more detail after Christmas.


Cheers,

Stuart





[-- Attachment #2: foo.tar.gz --]
[-- Type: application/x-gzip, Size: 46685 bytes --]

^ permalink raw reply	[flat|nested] 68+ messages in thread

* RE: SMTC support status in latest git head.
@ 2010-12-23 21:09                           ` STUART VENTERS
  0 siblings, 0 replies; 68+ messages in thread
From: STUART VENTERS @ 2010-12-23 21:09 UTC (permalink / raw)
  To: Kevin D. Kissell; +Cc: Anoop P.A., linux-mips, Anoop P A

[-- Attachment #1: Type: text/plain, Size: 804 bytes --]

Kevin,

I'm not sure if it's useful,
   but finally I got the time to look at the two kernel versions Anoop pointed out.
    works   2.6.32-stable with patch 804
    works_not 2.6.33-stable

greping for files with CONFIG_MIPS_MT_SMTC
   and looking for timer interrupt related stuff found the following differences:


arch/mips/include/asm/irq.h
arch/mips/kernel/irq.c
  do_IRQ
     
arch/mips/include/asm/stackframe.h
  SAVE_SOME SAVE_TEMP get/set_saved_sp

arch/mips/include/asm/time.h
  clocksource_set_clock

arch/mips/kernel/process.c
  cpu_idle

arch/mips/kernel/smtc.c
  __irq_entry
  ipi_decode
      SMTC_CLOCK_TICK 


Enclosed are the two subsets of files for a more expert look.

I'll try to look in more detail after Christmas.


Cheers,

Stuart





[-- Attachment #2: foo.tar.gz --]
[-- Type: application/x-gzip, Size: 46685 bytes --]

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: SMTC support status in latest git head.
  2010-12-23 21:09                           ` STUART VENTERS
  (?)
@ 2010-12-24 12:32                           ` Kevin D. Kissell
  2010-12-24 14:39                             ` Anoop P A
  -1 siblings, 1 reply; 68+ messages in thread
From: Kevin D. Kissell @ 2010-12-24 12:32 UTC (permalink / raw)
  To: STUART VENTERS; +Cc: Anoop P.A., linux-mips, Anoop P A

Thank you, Stuart!  I've spotted some definite breakage to SMTC between 
those versions.  In arch/mips/include/asm/stackframe.h, someone moved 
the store of the Status register value in SAVE_SOME (line 169 or 204, 
depending on the version) from two instructions after the mfc0 to a 
point after the #ifdef for SMTC, presumably to get better pipelining of 
the register access.  Unfortunately, the v1 register is also used in the 
SMTC-specific fragment to save TCStatus, so the Status value gets 
clobbered before it gets stored.  This will eventually result in the 
Status register getting a TCStatus value, which has some bits on common, 
but isn't identical and sooner or later Bad Things will happen.

I'm a little surprised this wasn't caught by visual inspection of the patch.

Possible solutions would include reverting the store of the CP0_STATUS 
value to the block above the #ifdef, or, to retain whatever performance 
advantage was obtained by moving the store downward, to use v0/$2 
instead of v1/$3, as the staging register for the TCStatus value.  I'd 
lean toward the second option, but I'm not in a position to test and 
submit a patch just now.

             Regards,

             Kevin K.

On 12/23/10 1:09 PM, STUART VENTERS wrote:
> Kevin,
>
> I'm not sure if it's useful,
>     but finally I got the time to look at the two kernel versions Anoop pointed out.
>      works   2.6.32-stable with patch 804
>      works_not 2.6.33-stable
>
> greping for files with CONFIG_MIPS_MT_SMTC
>     and looking for timer interrupt related stuff found the following differences:
>
>
> arch/mips/include/asm/irq.h
> arch/mips/kernel/irq.c
>    do_IRQ
>
> arch/mips/include/asm/stackframe.h
>    SAVE_SOME SAVE_TEMP get/set_saved_sp
>
> arch/mips/include/asm/time.h
>    clocksource_set_clock
>
> arch/mips/kernel/process.c
>    cpu_idle
>
> arch/mips/kernel/smtc.c
>    __irq_entry
>    ipi_decode
>        SMTC_CLOCK_TICK
>
>
> Enclosed are the two subsets of files for a more expert look.
>
> I'll try to look in more detail after Christmas.
>
>
> Cheers,
>
> Stuart
>
>
>
>

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: SMTC support status in latest git head.
  2010-12-24 12:32                           ` Kevin D. Kissell
@ 2010-12-24 14:39                             ` Anoop P A
  2010-12-24 14:53                               ` Kevin D. Kissell
  0 siblings, 1 reply; 68+ messages in thread
From: Anoop P A @ 2010-12-24 14:39 UTC (permalink / raw)
  To: Kevin D. Kissell; +Cc: STUART VENTERS, Anoop P.A., linux-mips

Hi Kevin, Stuart ,

Woohooo You guys spotted !.

 http://git.linux-mips.org/?p=linux.git;a=commit;h=d5ec6e3c seems to be
the culprit

Once I restored previous version of stackframe.h 2.6.33-stable started
booting !.

Thanks,
Anoop

On Fri, 2010-12-24 at 04:32 -0800, Kevin D. Kissell wrote:
> Thank you, Stuart!  I've spotted some definite breakage to SMTC between 
> those versions.  In arch/mips/include/asm/stackframe.h, someone moved 
> the store of the Status register value in SAVE_SOME (line 169 or 204, 
> depending on the version) from two instructions after the mfc0 to a 
> point after the #ifdef for SMTC, presumably to get better pipelining of 
> the register access.  Unfortunately, the v1 register is also used in the 
> SMTC-specific fragment to save TCStatus, so the Status value gets 
> clobbered before it gets stored.  This will eventually result in the 
> Status register getting a TCStatus value, which has some bits on common, 
> but isn't identical and sooner or later Bad Things will happen.
> 
> I'm a little surprised this wasn't caught by visual inspection of the patch.
> 
> Possible solutions would include reverting the store of the CP0_STATUS 
> value to the block above the #ifdef, or, to retain whatever performance 
> advantage was obtained by moving the store downward, to use v0/$2 
> instead of v1/$3, as the staging register for the TCStatus value.  I'd 
> lean toward the second option, but I'm not in a position to test and 
> submit a patch just now.
> 
>              Regards,
> 
>              Kevin K.
> 
> On 12/23/10 1:09 PM, STUART VENTERS wrote:
> > Kevin,
> >
> > I'm not sure if it's useful,
> >     but finally I got the time to look at the two kernel versions Anoop pointed out.
> >      works   2.6.32-stable with patch 804
> >      works_not 2.6.33-stable
> >
> > greping for files with CONFIG_MIPS_MT_SMTC
> >     and looking for timer interrupt related stuff found the following differences:
> >
> >
> > arch/mips/include/asm/irq.h
> > arch/mips/kernel/irq.c
> >    do_IRQ
> >
> > arch/mips/include/asm/stackframe.h
> >    SAVE_SOME SAVE_TEMP get/set_saved_sp
> >
> > arch/mips/include/asm/time.h
> >    clocksource_set_clock
> >
> > arch/mips/kernel/process.c
> >    cpu_idle
> >
> > arch/mips/kernel/smtc.c
> >    __irq_entry
> >    ipi_decode
> >        SMTC_CLOCK_TICK
> >
> >
> > Enclosed are the two subsets of files for a more expert look.
> >
> > I'll try to look in more detail after Christmas.
> >
> >
> > Cheers,
> >
> > Stuart
> >
> >
> >
> >
> 

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: SMTC support status in latest git head.
  2010-12-24 14:39                             ` Anoop P A
@ 2010-12-24 14:53                               ` Kevin D. Kissell
  2010-12-24 16:02                                 ` Anoop P A
  0 siblings, 1 reply; 68+ messages in thread
From: Kevin D. Kissell @ 2010-12-24 14:53 UTC (permalink / raw)
  To: Anoop P A; +Cc: STUART VENTERS, Anoop P.A., linux-mips

[-- Attachment #1: Type: text/plain, Size: 2748 bytes --]

Excellent!  Now, does the attached patch (relative to 2.6.37.11) also 
fix things, while preserving the other fixes and performance enhancements?

/K.

On 12/24/10 6:39 AM, Anoop P A wrote:
> Hi Kevin, Stuart ,
>
> Woohooo You guys spotted !.
>
>   http://git.linux-mips.org/?p=linux.git;a=commit;h=d5ec6e3c seems to be
> the culprit
>
> Once I restored previous version of stackframe.h 2.6.33-stable started
> booting !.
>
> Thanks,
> Anoop
>
> On Fri, 2010-12-24 at 04:32 -0800, Kevin D. Kissell wrote:
>> Thank you, Stuart!  I've spotted some definite breakage to SMTC between
>> those versions.  In arch/mips/include/asm/stackframe.h, someone moved
>> the store of the Status register value in SAVE_SOME (line 169 or 204,
>> depending on the version) from two instructions after the mfc0 to a
>> point after the #ifdef for SMTC, presumably to get better pipelining of
>> the register access.  Unfortunately, the v1 register is also used in the
>> SMTC-specific fragment to save TCStatus, so the Status value gets
>> clobbered before it gets stored.  This will eventually result in the
>> Status register getting a TCStatus value, which has some bits on common,
>> but isn't identical and sooner or later Bad Things will happen.
>>
>> I'm a little surprised this wasn't caught by visual inspection of the patch.
>>
>> Possible solutions would include reverting the store of the CP0_STATUS
>> value to the block above the #ifdef, or, to retain whatever performance
>> advantage was obtained by moving the store downward, to use v0/$2
>> instead of v1/$3, as the staging register for the TCStatus value.  I'd
>> lean toward the second option, but I'm not in a position to test and
>> submit a patch just now.
>>
>>               Regards,
>>
>>               Kevin K.
>>
>> On 12/23/10 1:09 PM, STUART VENTERS wrote:
>>> Kevin,
>>>
>>> I'm not sure if it's useful,
>>>      but finally I got the time to look at the two kernel versions Anoop pointed out.
>>>       works   2.6.32-stable with patch 804
>>>       works_not 2.6.33-stable
>>>
>>> greping for files with CONFIG_MIPS_MT_SMTC
>>>      and looking for timer interrupt related stuff found the following differences:
>>>
>>>
>>> arch/mips/include/asm/irq.h
>>> arch/mips/kernel/irq.c
>>>     do_IRQ
>>>
>>> arch/mips/include/asm/stackframe.h
>>>     SAVE_SOME SAVE_TEMP get/set_saved_sp
>>>
>>> arch/mips/include/asm/time.h
>>>     clocksource_set_clock
>>>
>>> arch/mips/kernel/process.c
>>>     cpu_idle
>>>
>>> arch/mips/kernel/smtc.c
>>>     __irq_entry
>>>     ipi_decode
>>>         SMTC_CLOCK_TICK
>>>
>>>
>>> Enclosed are the two subsets of files for a more expert look.
>>>
>>> I'll try to look in more detail after Christmas.
>>>
>>>
>>> Cheers,
>>>
>>> Stuart
>>>
>>>
>>>
>>>
>


[-- Attachment #2: smtc_stackframe.h.patch --]
[-- Type: text/plain, Size: 394 bytes --]

--- stackframe.h	2010-12-24 06:47:06.000000000 -0800
+++ stackframe.h.test	2010-12-24 06:48:56.000000000 -0800
@@ -195,9 +195,9 @@
 		 * to cover the pipeline delay.
 		 */
 		.set	mips32
-		mfc0	v1, CP0_TCSTATUS
+		mfc0	v0, CP0_TCSTATUS
 		.set	mips0
-		LONG_S	v1, PT_TCSTATUS(sp)
+		LONG_S	v0, PT_TCSTATUS(sp)
 #endif /* CONFIG_MIPS_MT_SMTC */
 		LONG_S	$4, PT_R4(sp)
 		LONG_S	$5, PT_R5(sp)

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: SMTC support status in latest git head.
  2010-12-24 14:53                               ` Kevin D. Kissell
@ 2010-12-24 16:02                                 ` Anoop P A
  2010-12-24 23:34                                   ` Kevin D. Kissell
  0 siblings, 1 reply; 68+ messages in thread
From: Anoop P A @ 2010-12-24 16:02 UTC (permalink / raw)
  To: Kevin D. Kissell; +Cc: STUART VENTERS, Anoop P.A., linux-mips

On Fri, 2010-12-24 at 06:53 -0800, Kevin D. Kissell wrote:
> Excellent!  Now, does the attached patch (relative to 2.6.37.11) also 
> fix things, while preserving the other fixes and performance enhancements?
> 
I have tested that patch with 2.6.37 branch it well passes calibration
loop but hangs after switching to mips closource

TC 6 going on-line as CPU 6
Brought up 7 CPUs
bio: create slab <bio-0> at 0
SCSI subsystem initialized
Switching to clocksource MIPS

I Presume this is a different issue as restoring older file didn't help
much to get rid of this hang.

diff --git a/arch/mips/include/asm/stackframe.h
b/arch/mips/include/asm/stackframe.h
index 58730c5..7fc9f10 100644
--- a/arch/mips/include/asm/stackframe.h
+++ b/arch/mips/include/asm/stackframe.h
@@ -195,9 +195,9 @@
 		 * to cover the pipeline delay.
 		 */
 		.set	mips32
-		mfc0	v1, CP0_TCSTATUS
+		mfc0	v0, CP0_TCSTATUS
 		.set	mips0
-		LONG_S	v1, PT_TCSTATUS(sp)
+		LONG_S	v0, PT_TCSTATUS(sp)
 #endif /* CONFIG_MIPS_MT_SMTC */
 		LONG_S	$4, PT_R4(sp)
 		LONG_S	$5, PT_R5(sp)


> /K.
> 
> On 12/24/10 6:39 AM, Anoop P A wrote:
> > Hi Kevin, Stuart ,
> >
> > Woohooo You guys spotted !.
> >
> >   http://git.linux-mips.org/?p=linux.git;a=commit;h=d5ec6e3c seems to be
> > the culprit
> >
> > Once I restored previous version of stackframe.h 2.6.33-stable started
> > booting !.
> >
> > Thanks,
> > Anoop
> >
> > On Fri, 2010-12-24 at 04:32 -0800, Kevin D. Kissell wrote:
> >> Thank you, Stuart!  I've spotted some definite breakage to SMTC between
> >> those versions.  In arch/mips/include/asm/stackframe.h, someone moved
> >> the store of the Status register value in SAVE_SOME (line 169 or 204,
> >> depending on the version) from two instructions after the mfc0 to a
> >> point after the #ifdef for SMTC, presumably to get better pipelining of
> >> the register access.  Unfortunately, the v1 register is also used in the
> >> SMTC-specific fragment to save TCStatus, so the Status value gets
> >> clobbered before it gets stored.  This will eventually result in the
> >> Status register getting a TCStatus value, which has some bits on common,
> >> but isn't identical and sooner or later Bad Things will happen.
> >>
> >> I'm a little surprised this wasn't caught by visual inspection of the patch.
> >>
> >> Possible solutions would include reverting the store of the CP0_STATUS
> >> value to the block above the #ifdef, or, to retain whatever performance
> >> advantage was obtained by moving the store downward, to use v0/$2
> >> instead of v1/$3, as the staging register for the TCStatus value.  I'd
> >> lean toward the second option, but I'm not in a position to test and
> >> submit a patch just now.
> >>
> >>               Regards,
> >>
> >>               Kevin K.
> >>
> >> On 12/23/10 1:09 PM, STUART VENTERS wrote:
> >>> Kevin,
> >>>
> >>> I'm not sure if it's useful,
> >>>      but finally I got the time to look at the two kernel versions Anoop pointed out.
> >>>       works   2.6.32-stable with patch 804
> >>>       works_not 2.6.33-stable
> >>>
> >>> greping for files with CONFIG_MIPS_MT_SMTC
> >>>      and looking for timer interrupt related stuff found the following differences:
> >>>
> >>>
> >>> arch/mips/include/asm/irq.h
> >>> arch/mips/kernel/irq.c
> >>>     do_IRQ
> >>>
> >>> arch/mips/include/asm/stackframe.h
> >>>     SAVE_SOME SAVE_TEMP get/set_saved_sp
> >>>
> >>> arch/mips/include/asm/time.h
> >>>     clocksource_set_clock
> >>>
> >>> arch/mips/kernel/process.c
> >>>     cpu_idle
> >>>
> >>> arch/mips/kernel/smtc.c
> >>>     __irq_entry
> >>>     ipi_decode
> >>>         SMTC_CLOCK_TICK
> >>>
> >>>
> >>> Enclosed are the two subsets of files for a more expert look.
> >>>
> >>> I'll try to look in more detail after Christmas.
> >>>
> >>>
> >>> Cheers,
> >>>
> >>> Stuart
> >>>
> >>>
> >>>
> >>>
> >
> 

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* Re: SMTC support status in latest git head.
  2010-12-24 16:02                                 ` Anoop P A
@ 2010-12-24 23:34                                   ` Kevin D. Kissell
  2010-12-25  7:32                                     ` Anoop P A
  2010-12-27 15:49                                       ` STUART VENTERS
  0 siblings, 2 replies; 68+ messages in thread
From: Kevin D. Kissell @ 2010-12-24 23:34 UTC (permalink / raw)
  To: Anoop P A; +Cc: STUART VENTERS, Anoop P.A., linux-mips

Ah, well, at least we have a stackframe.h fix that preserves David's 
performance tweak for the deeper pipelined processors.  In looking for 
this, I did notice that someone did some modification to the SMTC clock 
tick logic that I was skeptical had ever been tested.  If you've still 
got that kernel binary handy, you might check to see if it boots with 
maxtcs=1 maxvpes=1, maxtcs=2 maxvpes=1, and/or maxtcs=2 maxvpes=2.

Oh, yes, and Merry Christmas one and all!

             Regards,

             Kevin K.

On 12/24/10 8:02 AM, Anoop P A wrote:
> On Fri, 2010-12-24 at 06:53 -0800, Kevin D. Kissell wrote:
>> Excellent!  Now, does the attached patch (relative to 2.6.37.11) also
>> fix things, while preserving the other fixes and performance enhancements?
>>
> I have tested that patch with 2.6.37 branch it well passes calibration
> loop but hangs after switching to mips closource
>
> TC 6 going on-line as CPU 6
> Brought up 7 CPUs
> bio: create slab<bio-0>  at 0
> SCSI subsystem initialized
> Switching to clocksource MIPS
>
> I Presume this is a different issue as restoring older file didn't help
> much to get rid of this hang.
>
> diff --git a/arch/mips/include/asm/stackframe.h
> b/arch/mips/include/asm/stackframe.h
> index 58730c5..7fc9f10 100644
> --- a/arch/mips/include/asm/stackframe.h
> +++ b/arch/mips/include/asm/stackframe.h
> @@ -195,9 +195,9 @@
>   		 * to cover the pipeline delay.
>   		 */
>   		.set	mips32
> -		mfc0	v1, CP0_TCSTATUS
> +		mfc0	v0, CP0_TCSTATUS
>   		.set	mips0
> -		LONG_S	v1, PT_TCSTATUS(sp)
> +		LONG_S	v0, PT_TCSTATUS(sp)
>   #endif /* CONFIG_MIPS_MT_SMTC */
>   		LONG_S	$4, PT_R4(sp)
>   		LONG_S	$5, PT_R5(sp)
>
>
>> /K.
>>
>> On 12/24/10 6:39 AM, Anoop P A wrote:
>>> Hi Kevin, Stuart ,
>>>
>>> Woohooo You guys spotted !.
>>>
>>>    http://git.linux-mips.org/?p=linux.git;a=commit;h=d5ec6e3c seems to be
>>> the culprit
>>>
>>> Once I restored previous version of stackframe.h 2.6.33-stable started
>>> booting !.
>>>
>>> Thanks,
>>> Anoop
>>>
>>> On Fri, 2010-12-24 at 04:32 -0800, Kevin D. Kissell wrote:
>>>> Thank you, Stuart!  I've spotted some definite breakage to SMTC between
>>>> those versions.  In arch/mips/include/asm/stackframe.h, someone moved
>>>> the store of the Status register value in SAVE_SOME (line 169 or 204,
>>>> depending on the version) from two instructions after the mfc0 to a
>>>> point after the #ifdef for SMTC, presumably to get better pipelining of
>>>> the register access.  Unfortunately, the v1 register is also used in the
>>>> SMTC-specific fragment to save TCStatus, so the Status value gets
>>>> clobbered before it gets stored.  This will eventually result in the
>>>> Status register getting a TCStatus value, which has some bits on common,
>>>> but isn't identical and sooner or later Bad Things will happen.
>>>>
>>>> I'm a little surprised this wasn't caught by visual inspection of the patch.
>>>>
>>>> Possible solutions would include reverting the store of the CP0_STATUS
>>>> value to the block above the #ifdef, or, to retain whatever performance
>>>> advantage was obtained by moving the store downward, to use v0/$2
>>>> instead of v1/$3, as the staging register for the TCStatus value.  I'd
>>>> lean toward the second option, but I'm not in a position to test and
>>>> submit a patch just now.
>>>>
>>>>                Regards,
>>>>
>>>>                Kevin K.
>>>>
>>>> On 12/23/10 1:09 PM, STUART VENTERS wrote:
>>>>> Kevin,
>>>>>
>>>>> I'm not sure if it's useful,
>>>>>       but finally I got the time to look at the two kernel versions Anoop pointed out.
>>>>>        works   2.6.32-stable with patch 804
>>>>>        works_not 2.6.33-stable
>>>>>
>>>>> greping for files with CONFIG_MIPS_MT_SMTC
>>>>>       and looking for timer interrupt related stuff found the following differences:
>>>>>
>>>>>
>>>>> arch/mips/include/asm/irq.h
>>>>> arch/mips/kernel/irq.c
>>>>>      do_IRQ
>>>>>
>>>>> arch/mips/include/asm/stackframe.h
>>>>>      SAVE_SOME SAVE_TEMP get/set_saved_sp
>>>>>
>>>>> arch/mips/include/asm/time.h
>>>>>      clocksource_set_clock
>>>>>
>>>>> arch/mips/kernel/process.c
>>>>>      cpu_idle
>>>>>
>>>>> arch/mips/kernel/smtc.c
>>>>>      __irq_entry
>>>>>      ipi_decode
>>>>>          SMTC_CLOCK_TICK
>>>>>
>>>>>
>>>>> Enclosed are the two subsets of files for a more expert look.
>>>>>
>>>>> I'll try to look in more detail after Christmas.
>>>>>
>>>>>
>>>>> Cheers,
>>>>>
>>>>> Stuart
>>>>>
>>>>>
>>>>>
>>>>>
>

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: SMTC support status in latest git head.
  2010-12-24 23:34                                   ` Kevin D. Kissell
@ 2010-12-25  7:32                                     ` Anoop P A
  2010-12-25 15:17                                       ` Kevin D. Kissell
  2010-12-27 15:49                                       ` STUART VENTERS
  1 sibling, 1 reply; 68+ messages in thread
From: Anoop P A @ 2010-12-25  7:32 UTC (permalink / raw)
  To: Kevin D. Kissell; +Cc: STUART VENTERS, Anoop P.A., linux-mips

On Fri, 2010-12-24 at 15:34 -0800, Kevin D. Kissell wrote:
> Ah, well, at least we have a stackframe.h fix that preserves David's 
> performance tweak for the deeper pipelined processors.  In looking for 
> this, I did notice that someone did some modification to the SMTC clock 
> tick logic that I was skeptical had ever been tested.  If you've still 
> got that kernel binary handy, you might check to see if it boots with 
> maxtcs=1 maxvpes=1, maxtcs=2 maxvpes=1, and/or maxtcs=2 maxvpes=2.

Yes I have tried with various combinations of tcs and vpes. with
maxvpes=1 I can boot with a max of 4 TCS ( VPE0 has 4 TCs) .
However setting maxpes=2 and maxtcs=2 hangs pretty early.

Clock rate set to 600000000
console [ttyS0] enabled
Calibrating delay loop... 398.33 BogoMIPS (lpj=796672)
pid_max: default: 32768 minimum: 301
Mount-cache hash table entries: 512
Limit of 2 VPEs set
Limit of 2 TCs set
TLB of 64 entry pairs shared by 2 VPEs
VPE 0: TC 0, VPE 1: TC 1
IPI buffer pool of 32 buffers
CPU revision is: 00019548 ((null))
TC 1 going on-line as CPU 1
Brought up 2 CPUs

One strange observation is with maxtcs=3 and maxvpes=2 kernel boots all
the way. 

Again with maxtcs=5 and maxvpes=2 it hangs after switching to MIPS
clocksource.

I strongly suspect some issue with locking. I will dig the code early
next week.


> 
> Oh, yes, and Merry Christmas one and all!

Thank you ! ..

Everybody Happy Christmas.

> 
>              Regards,
> 
>              Kevin K.
> 
> On 12/24/10 8:02 AM, Anoop P A wrote:
> > On Fri, 2010-12-24 at 06:53 -0800, Kevin D. Kissell wrote:
> >> Excellent!  Now, does the attached patch (relative to 2.6.37.11) also
> >> fix things, while preserving the other fixes and performance enhancements?
> >>
> > I have tested that patch with 2.6.37 branch it well passes calibration
> > loop but hangs after switching to mips closource
> >
> > TC 6 going on-line as CPU 6
> > Brought up 7 CPUs
> > bio: create slab<bio-0>  at 0
> > SCSI subsystem initialized
> > Switching to clocksource MIPS
> >
> > I Presume this is a different issue as restoring older file didn't help
> > much to get rid of this hang.
> >
> > diff --git a/arch/mips/include/asm/stackframe.h
> > b/arch/mips/include/asm/stackframe.h
> > index 58730c5..7fc9f10 100644
> > --- a/arch/mips/include/asm/stackframe.h
> > +++ b/arch/mips/include/asm/stackframe.h
> > @@ -195,9 +195,9 @@
> >   		 * to cover the pipeline delay.
> >   		 */
> >   		.set	mips32
> > -		mfc0	v1, CP0_TCSTATUS
> > +		mfc0	v0, CP0_TCSTATUS
> >   		.set	mips0
> > -		LONG_S	v1, PT_TCSTATUS(sp)
> > +		LONG_S	v0, PT_TCSTATUS(sp)
> >   #endif /* CONFIG_MIPS_MT_SMTC */
> >   		LONG_S	$4, PT_R4(sp)
> >   		LONG_S	$5, PT_R5(sp)
> >
> >
> >> /K.
> >>
> >> On 12/24/10 6:39 AM, Anoop P A wrote:
> >>> Hi Kevin, Stuart ,
> >>>
> >>> Woohooo You guys spotted !.
> >>>
> >>>    http://git.linux-mips.org/?p=linux.git;a=commit;h=d5ec6e3c seems to be
> >>> the culprit
> >>>
> >>> Once I restored previous version of stackframe.h 2.6.33-stable started
> >>> booting !.
> >>>
> >>> Thanks,
> >>> Anoop
> >>>
> >>> On Fri, 2010-12-24 at 04:32 -0800, Kevin D. Kissell wrote:
> >>>> Thank you, Stuart!  I've spotted some definite breakage to SMTC between
> >>>> those versions.  In arch/mips/include/asm/stackframe.h, someone moved
> >>>> the store of the Status register value in SAVE_SOME (line 169 or 204,
> >>>> depending on the version) from two instructions after the mfc0 to a
> >>>> point after the #ifdef for SMTC, presumably to get better pipelining of
> >>>> the register access.  Unfortunately, the v1 register is also used in the
> >>>> SMTC-specific fragment to save TCStatus, so the Status value gets
> >>>> clobbered before it gets stored.  This will eventually result in the
> >>>> Status register getting a TCStatus value, which has some bits on common,
> >>>> but isn't identical and sooner or later Bad Things will happen.
> >>>>
> >>>> I'm a little surprised this wasn't caught by visual inspection of the patch.
> >>>>
> >>>> Possible solutions would include reverting the store of the CP0_STATUS
> >>>> value to the block above the #ifdef, or, to retain whatever performance
> >>>> advantage was obtained by moving the store downward, to use v0/$2
> >>>> instead of v1/$3, as the staging register for the TCStatus value.  I'd
> >>>> lean toward the second option, but I'm not in a position to test and
> >>>> submit a patch just now.
> >>>>
> >>>>                Regards,
> >>>>
> >>>>                Kevin K.
> >>>>
> >>>> On 12/23/10 1:09 PM, STUART VENTERS wrote:
> >>>>> Kevin,
> >>>>>
> >>>>> I'm not sure if it's useful,
> >>>>>       but finally I got the time to look at the two kernel versions Anoop pointed out.
> >>>>>        works   2.6.32-stable with patch 804
> >>>>>        works_not 2.6.33-stable
> >>>>>
> >>>>> greping for files with CONFIG_MIPS_MT_SMTC
> >>>>>       and looking for timer interrupt related stuff found the following differences:
> >>>>>
> >>>>>
> >>>>> arch/mips/include/asm/irq.h
> >>>>> arch/mips/kernel/irq.c
> >>>>>      do_IRQ
> >>>>>
> >>>>> arch/mips/include/asm/stackframe.h
> >>>>>      SAVE_SOME SAVE_TEMP get/set_saved_sp
> >>>>>
> >>>>> arch/mips/include/asm/time.h
> >>>>>      clocksource_set_clock
> >>>>>
> >>>>> arch/mips/kernel/process.c
> >>>>>      cpu_idle
> >>>>>
> >>>>> arch/mips/kernel/smtc.c
> >>>>>      __irq_entry
> >>>>>      ipi_decode
> >>>>>          SMTC_CLOCK_TICK
> >>>>>
> >>>>>
> >>>>> Enclosed are the two subsets of files for a more expert look.
> >>>>>
> >>>>> I'll try to look in more detail after Christmas.
> >>>>>
> >>>>>
> >>>>> Cheers,
> >>>>>
> >>>>> Stuart
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >
> 

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: SMTC support status in latest git head.
  2010-12-25  7:32                                     ` Anoop P A
@ 2010-12-25 15:17                                       ` Kevin D. Kissell
  0 siblings, 0 replies; 68+ messages in thread
From: Kevin D. Kissell @ 2010-12-25 15:17 UTC (permalink / raw)
  To: Anoop P A; +Cc: STUART VENTERS, Anoop P.A., linux-mips

On 12/24/10 11:32 PM, Anoop P A wrote:
> On Fri, 2010-12-24 at 15:34 -0800, Kevin D. Kissell wrote:
>> Ah, well, at least we have a stackframe.h fix that preserves David's
>> performance tweak for the deeper pipelined processors.  In looking for
>> this, I did notice that someone did some modification to the SMTC clock
>> tick logic that I was skeptical had ever been tested.  If you've still
>> got that kernel binary handy, you might check to see if it boots with
>> maxtcs=1 maxvpes=1, maxtcs=2 maxvpes=1, and/or maxtcs=2 maxvpes=2.
> Yes I have tried with various combinations of tcs and vpes. with
> maxvpes=1 I can boot with a max of 4 TCS ( VPE0 has 4 TCs) .
> However setting maxpes=2 and maxtcs=2 hangs pretty early.
>
> Clock rate set to 600000000
> console [ttyS0] enabled
> Calibrating delay loop... 398.33 BogoMIPS (lpj=796672)
> pid_max: default: 32768 minimum: 301
> Mount-cache hash table entries: 512
> Limit of 2 VPEs set
> Limit of 2 TCs set
> TLB of 64 entry pairs shared by 2 VPEs
> VPE 0: TC 0, VPE 1: TC 1
> IPI buffer pool of 32 buffers
> CPU revision is: 00019548 ((null))
> TC 1 going on-line as CPU 1
> Brought up 2 CPUs
>
> One strange observation is with maxtcs=3 and maxvpes=2 kernel boots all
> the way.
>
> Again with maxtcs=5 and maxvpes=2 it hangs after switching to MIPS
> clocksource.
>
> I strongly suspect some issue with locking. I will dig the code early
> next week.
If locking is screwed up, I'd expect more problems with 4 TC "CPUs" in 
the same VPE. It also suggests that the basic distribution via local 
low-latency IPI within a VPE is functioning, but that something is 
broken in the cross-VPE evengt propagation.  I strongly suspect that 
your maxtcs=3, maxvpes=2 case would hang sooner or later, but by luck of 
the draw none of the init threads got scheduled on VPE 1 long enough to 
get stuck.

I note that there were some changes made under the rubric "MIPS: SMTC: 
Avoid queueing multiple reschedule IPIs" in October and November of last 
year that make me nervous.  I wouldn't have coded things that way 
myself, but they might be OK. Still, the first bisection I'd make if I 
was trouble-shooting this would be to roll back to just before they went in.

             Ho, ho, ho,

             Kevin K.

^ permalink raw reply	[flat|nested] 68+ messages in thread

* RE: SMTC support status in latest git head.
@ 2010-12-27 15:49                                       ` STUART VENTERS
  0 siblings, 0 replies; 68+ messages in thread
From: STUART VENTERS @ 2010-12-27 15:49 UTC (permalink / raw)
  To: Kevin D. Kissell, Anoop P A; +Cc: Anoop P.A., linux-mips

Kevin,

Outstanding, sometimes it's better to be lucky than good.


Anoop,

Maybe we can get lucky again.

If you can isolate the .33 works/.37 works_not bug to a specific pair of versions, 
   I'll be happy to do another diff.


Hope you'll have had a good Christmas as well.
  We've had snow in Alabama since Christmas eve!


Regards,

Stuart


-----Original Message-----
From: Kevin D. Kissell [mailto:kevink@paralogos.com]
Sent: Friday, December 24, 2010 5:34 PM
To: Anoop P A
Cc: STUART VENTERS; Anoop P.A.; linux-mips@linux-mips.org
Subject: Re: SMTC support status in latest git head.


Ah, well, at least we have a stackframe.h fix that preserves David's 
performance tweak for the deeper pipelined processors.  In looking for 
this, I did notice that someone did some modification to the SMTC clock 
tick logic that I was skeptical had ever been tested.  If you've still 
got that kernel binary handy, you might check to see if it boots with 
maxtcs=1 maxvpes=1, maxtcs=2 maxvpes=1, and/or maxtcs=2 maxvpes=2.

Oh, yes, and Merry Christmas one and all!

             Regards,

             Kevin K.

On 12/24/10 8:02 AM, Anoop P A wrote:
> On Fri, 2010-12-24 at 06:53 -0800, Kevin D. Kissell wrote:
>> Excellent!  Now, does the attached patch (relative to 2.6.37.11) also
>> fix things, while preserving the other fixes and performance enhancements?
>>
> I have tested that patch with 2.6.37 branch it well passes calibration
> loop but hangs after switching to mips closource
>
> TC 6 going on-line as CPU 6
> Brought up 7 CPUs
> bio: create slab<bio-0>  at 0
> SCSI subsystem initialized
> Switching to clocksource MIPS
>
> I Presume this is a different issue as restoring older file didn't help
> much to get rid of this hang.
>
> diff --git a/arch/mips/include/asm/stackframe.h
> b/arch/mips/include/asm/stackframe.h
> index 58730c5..7fc9f10 100644
> --- a/arch/mips/include/asm/stackframe.h
> +++ b/arch/mips/include/asm/stackframe.h
> @@ -195,9 +195,9 @@
>   		 * to cover the pipeline delay.
>   		 */
>   		.set	mips32
> -		mfc0	v1, CP0_TCSTATUS
> +		mfc0	v0, CP0_TCSTATUS
>   		.set	mips0
> -		LONG_S	v1, PT_TCSTATUS(sp)
> +		LONG_S	v0, PT_TCSTATUS(sp)
>   #endif /* CONFIG_MIPS_MT_SMTC */
>   		LONG_S	$4, PT_R4(sp)
>   		LONG_S	$5, PT_R5(sp)
>
>
>> /K.
>>
>> On 12/24/10 6:39 AM, Anoop P A wrote:
>>> Hi Kevin, Stuart ,
>>>
>>> Woohooo You guys spotted !.
>>>
>>>    http://git.linux-mips.org/?p=linux.git;a=commit;h=d5ec6e3c seems to be
>>> the culprit
>>>
>>> Once I restored previous version of stackframe.h 2.6.33-stable started
>>> booting !.
>>>
>>> Thanks,
>>> Anoop
>>>
>>> On Fri, 2010-12-24 at 04:32 -0800, Kevin D. Kissell wrote:
>>>> Thank you, Stuart!  I've spotted some definite breakage to SMTC between
>>>> those versions.  In arch/mips/include/asm/stackframe.h, someone moved
>>>> the store of the Status register value in SAVE_SOME (line 169 or 204,
>>>> depending on the version) from two instructions after the mfc0 to a
>>>> point after the #ifdef for SMTC, presumably to get better pipelining of
>>>> the register access.  Unfortunately, the v1 register is also used in the
>>>> SMTC-specific fragment to save TCStatus, so the Status value gets
>>>> clobbered before it gets stored.  This will eventually result in the
>>>> Status register getting a TCStatus value, which has some bits on common,
>>>> but isn't identical and sooner or later Bad Things will happen.
>>>>
>>>> I'm a little surprised this wasn't caught by visual inspection of the patch.
>>>>
>>>> Possible solutions would include reverting the store of the CP0_STATUS
>>>> value to the block above the #ifdef, or, to retain whatever performance
>>>> advantage was obtained by moving the store downward, to use v0/$2
>>>> instead of v1/$3, as the staging register for the TCStatus value.  I'd
>>>> lean toward the second option, but I'm not in a position to test and
>>>> submit a patch just now.
>>>>
>>>>                Regards,
>>>>
>>>>                Kevin K.
>>>>
>>>> On 12/23/10 1:09 PM, STUART VENTERS wrote:
>>>>> Kevin,
>>>>>
>>>>> I'm not sure if it's useful,
>>>>>       but finally I got the time to look at the two kernel versions Anoop pointed out.
>>>>>        works   2.6.32-stable with patch 804
>>>>>        works_not 2.6.33-stable
>>>>>
>>>>> greping for files with CONFIG_MIPS_MT_SMTC
>>>>>       and looking for timer interrupt related stuff found the following differences:
>>>>>
>>>>>
>>>>> arch/mips/include/asm/irq.h
>>>>> arch/mips/kernel/irq.c
>>>>>      do_IRQ
>>>>>
>>>>> arch/mips/include/asm/stackframe.h
>>>>>      SAVE_SOME SAVE_TEMP get/set_saved_sp
>>>>>
>>>>> arch/mips/include/asm/time.h
>>>>>      clocksource_set_clock
>>>>>
>>>>> arch/mips/kernel/process.c
>>>>>      cpu_idle
>>>>>
>>>>> arch/mips/kernel/smtc.c
>>>>>      __irq_entry
>>>>>      ipi_decode
>>>>>          SMTC_CLOCK_TICK
>>>>>
>>>>>
>>>>> Enclosed are the two subsets of files for a more expert look.
>>>>>
>>>>> I'll try to look in more detail after Christmas.
>>>>>
>>>>>
>>>>> Cheers,
>>>>>
>>>>> Stuart
>>>>>
>>>>>
>>>>>
>>>>>
>


^ permalink raw reply	[flat|nested] 68+ messages in thread

* RE: SMTC support status in latest git head.
@ 2010-12-27 15:49                                       ` STUART VENTERS
  0 siblings, 0 replies; 68+ messages in thread
From: STUART VENTERS @ 2010-12-27 15:49 UTC (permalink / raw)
  To: Kevin D. Kissell, Anoop P A; +Cc: Anoop P.A., linux-mips

Kevin,

Outstanding, sometimes it's better to be lucky than good.


Anoop,

Maybe we can get lucky again.

If you can isolate the .33 works/.37 works_not bug to a specific pair of versions, 
   I'll be happy to do another diff.


Hope you'll have had a good Christmas as well.
  We've had snow in Alabama since Christmas eve!


Regards,

Stuart


-----Original Message-----
From: Kevin D. Kissell [mailto:kevink@paralogos.com]
Sent: Friday, December 24, 2010 5:34 PM
To: Anoop P A
Cc: STUART VENTERS; Anoop P.A.; linux-mips@linux-mips.org
Subject: Re: SMTC support status in latest git head.


Ah, well, at least we have a stackframe.h fix that preserves David's 
performance tweak for the deeper pipelined processors.  In looking for 
this, I did notice that someone did some modification to the SMTC clock 
tick logic that I was skeptical had ever been tested.  If you've still 
got that kernel binary handy, you might check to see if it boots with 
maxtcs=1 maxvpes=1, maxtcs=2 maxvpes=1, and/or maxtcs=2 maxvpes=2.

Oh, yes, and Merry Christmas one and all!

             Regards,

             Kevin K.

On 12/24/10 8:02 AM, Anoop P A wrote:
> On Fri, 2010-12-24 at 06:53 -0800, Kevin D. Kissell wrote:
>> Excellent!  Now, does the attached patch (relative to 2.6.37.11) also
>> fix things, while preserving the other fixes and performance enhancements?
>>
> I have tested that patch with 2.6.37 branch it well passes calibration
> loop but hangs after switching to mips closource
>
> TC 6 going on-line as CPU 6
> Brought up 7 CPUs
> bio: create slab<bio-0>  at 0
> SCSI subsystem initialized
> Switching to clocksource MIPS
>
> I Presume this is a different issue as restoring older file didn't help
> much to get rid of this hang.
>
> diff --git a/arch/mips/include/asm/stackframe.h
> b/arch/mips/include/asm/stackframe.h
> index 58730c5..7fc9f10 100644
> --- a/arch/mips/include/asm/stackframe.h
> +++ b/arch/mips/include/asm/stackframe.h
> @@ -195,9 +195,9 @@
>   		 * to cover the pipeline delay.
>   		 */
>   		.set	mips32
> -		mfc0	v1, CP0_TCSTATUS
> +		mfc0	v0, CP0_TCSTATUS
>   		.set	mips0
> -		LONG_S	v1, PT_TCSTATUS(sp)
> +		LONG_S	v0, PT_TCSTATUS(sp)
>   #endif /* CONFIG_MIPS_MT_SMTC */
>   		LONG_S	$4, PT_R4(sp)
>   		LONG_S	$5, PT_R5(sp)
>
>
>> /K.
>>
>> On 12/24/10 6:39 AM, Anoop P A wrote:
>>> Hi Kevin, Stuart ,
>>>
>>> Woohooo You guys spotted !.
>>>
>>>    http://git.linux-mips.org/?p=linux.git;a=commit;h=d5ec6e3c seems to be
>>> the culprit
>>>
>>> Once I restored previous version of stackframe.h 2.6.33-stable started
>>> booting !.
>>>
>>> Thanks,
>>> Anoop
>>>
>>> On Fri, 2010-12-24 at 04:32 -0800, Kevin D. Kissell wrote:
>>>> Thank you, Stuart!  I've spotted some definite breakage to SMTC between
>>>> those versions.  In arch/mips/include/asm/stackframe.h, someone moved
>>>> the store of the Status register value in SAVE_SOME (line 169 or 204,
>>>> depending on the version) from two instructions after the mfc0 to a
>>>> point after the #ifdef for SMTC, presumably to get better pipelining of
>>>> the register access.  Unfortunately, the v1 register is also used in the
>>>> SMTC-specific fragment to save TCStatus, so the Status value gets
>>>> clobbered before it gets stored.  This will eventually result in the
>>>> Status register getting a TCStatus value, which has some bits on common,
>>>> but isn't identical and sooner or later Bad Things will happen.
>>>>
>>>> I'm a little surprised this wasn't caught by visual inspection of the patch.
>>>>
>>>> Possible solutions would include reverting the store of the CP0_STATUS
>>>> value to the block above the #ifdef, or, to retain whatever performance
>>>> advantage was obtained by moving the store downward, to use v0/$2
>>>> instead of v1/$3, as the staging register for the TCStatus value.  I'd
>>>> lean toward the second option, but I'm not in a position to test and
>>>> submit a patch just now.
>>>>
>>>>                Regards,
>>>>
>>>>                Kevin K.
>>>>
>>>> On 12/23/10 1:09 PM, STUART VENTERS wrote:
>>>>> Kevin,
>>>>>
>>>>> I'm not sure if it's useful,
>>>>>       but finally I got the time to look at the two kernel versions Anoop pointed out.
>>>>>        works   2.6.32-stable with patch 804
>>>>>        works_not 2.6.33-stable
>>>>>
>>>>> greping for files with CONFIG_MIPS_MT_SMTC
>>>>>       and looking for timer interrupt related stuff found the following differences:
>>>>>
>>>>>
>>>>> arch/mips/include/asm/irq.h
>>>>> arch/mips/kernel/irq.c
>>>>>      do_IRQ
>>>>>
>>>>> arch/mips/include/asm/stackframe.h
>>>>>      SAVE_SOME SAVE_TEMP get/set_saved_sp
>>>>>
>>>>> arch/mips/include/asm/time.h
>>>>>      clocksource_set_clock
>>>>>
>>>>> arch/mips/kernel/process.c
>>>>>      cpu_idle
>>>>>
>>>>> arch/mips/kernel/smtc.c
>>>>>      __irq_entry
>>>>>      ipi_decode
>>>>>          SMTC_CLOCK_TICK
>>>>>
>>>>>
>>>>> Enclosed are the two subsets of files for a more expert look.
>>>>>
>>>>> I'll try to look in more detail after Christmas.
>>>>>
>>>>>
>>>>> Cheers,
>>>>>
>>>>> Stuart
>>>>>
>>>>>
>>>>>
>>>>>
>


^ permalink raw reply	[flat|nested] 68+ messages in thread

* RE: SMTC support status in latest git head.
  2010-12-27 15:49                                       ` STUART VENTERS
  (?)
@ 2010-12-27 17:19                                       ` Anoop P A
  2010-12-28  8:19                                         ` Anoop P A
  -1 siblings, 1 reply; 68+ messages in thread
From: Anoop P A @ 2010-12-27 17:19 UTC (permalink / raw)
  To: STUART VENTERS; +Cc: Kevin D. Kissell, Anoop P.A., linux-mips

Hi Kevin,

It is very unlikely that the patch you pointed has any impact on the the
hang I am seeing. The patch you have mentioned got into kernel around
2.6.32 timeframe. I am able to boot both 2.6.32 and  2.6.33 kernel ( +
stackframe patch) . 

Hi Stuart,

I haven't got much time to spend on this today.

I had got 2.6.36-stable(+ stack frame patch) booting last day and I have
observed hang issue with 2.6.37-rc1 ( Same as rc6 and current git head)

So probably some patches in 2.6.37 branch introduced this hang.

Hopefully I will get some free slot tomorrow so that I can look into
code diff .

Thanks
Anoop

On Mon, 2010-12-27 at 09:49 -0600, STUART VENTERS wrote:
> Kevin,
> 
> Outstanding, sometimes it's better to be lucky than good.
> 
> 
> Anoop,
> 
> Maybe we can get lucky again.
> 
> If you can isolate the .33 works/.37 works_not bug to a specific pair of versions, 
>    I'll be happy to do another diff.
> 
> 
> Hope you'll have had a good Christmas as well.
>   We've had snow in Alabama since Christmas eve!
> 
> 
> Regards,
> 
> Stuart
> 
> 
> -----Original Message-----
> From: Kevin D. Kissell [mailto:kevink@paralogos.com]
> Sent: Friday, December 24, 2010 5:34 PM
> To: Anoop P A
> Cc: STUART VENTERS; Anoop P.A.; linux-mips@linux-mips.org
> Subject: Re: SMTC support status in latest git head.
> 
> 
> Ah, well, at least we have a stackframe.h fix that preserves David's 
> performance tweak for the deeper pipelined processors.  In looking for 
> this, I did notice that someone did some modification to the SMTC clock 
> tick logic that I was skeptical had ever been tested.  If you've still 
> got that kernel binary handy, you might check to see if it boots with 
> maxtcs=1 maxvpes=1, maxtcs=2 maxvpes=1, and/or maxtcs=2 maxvpes=2.
> 
> Oh, yes, and Merry Christmas one and all!
> 
>              Regards,
> 
>              Kevin K.
> 
> On 12/24/10 8:02 AM, Anoop P A wrote:
> > On Fri, 2010-12-24 at 06:53 -0800, Kevin D. Kissell wrote:
> >> Excellent!  Now, does the attached patch (relative to 2.6.37.11) also
> >> fix things, while preserving the other fixes and performance enhancements?
> >>
> > I have tested that patch with 2.6.37 branch it well passes calibration
> > loop but hangs after switching to mips closource
> >
> > TC 6 going on-line as CPU 6
> > Brought up 7 CPUs
> > bio: create slab<bio-0>  at 0
> > SCSI subsystem initialized
> > Switching to clocksource MIPS
> >
> > I Presume this is a different issue as restoring older file didn't help
> > much to get rid of this hang.
> >
> > diff --git a/arch/mips/include/asm/stackframe.h
> > b/arch/mips/include/asm/stackframe.h
> > index 58730c5..7fc9f10 100644
> > --- a/arch/mips/include/asm/stackframe.h
> > +++ b/arch/mips/include/asm/stackframe.h
> > @@ -195,9 +195,9 @@
> >   		 * to cover the pipeline delay.
> >   		 */
> >   		.set	mips32
> > -		mfc0	v1, CP0_TCSTATUS
> > +		mfc0	v0, CP0_TCSTATUS
> >   		.set	mips0
> > -		LONG_S	v1, PT_TCSTATUS(sp)
> > +		LONG_S	v0, PT_TCSTATUS(sp)
> >   #endif /* CONFIG_MIPS_MT_SMTC */
> >   		LONG_S	$4, PT_R4(sp)
> >   		LONG_S	$5, PT_R5(sp)
> >
> >
> >> /K.
> >>
> >> On 12/24/10 6:39 AM, Anoop P A wrote:
> >>> Hi Kevin, Stuart ,
> >>>
> >>> Woohooo You guys spotted !.
> >>>
> >>>    http://git.linux-mips.org/?p=linux.git;a=commit;h=d5ec6e3c seems to be
> >>> the culprit
> >>>
> >>> Once I restored previous version of stackframe.h 2.6.33-stable started
> >>> booting !.
> >>>
> >>> Thanks,
> >>> Anoop
> >>>
> >>> On Fri, 2010-12-24 at 04:32 -0800, Kevin D. Kissell wrote:
> >>>> Thank you, Stuart!  I've spotted some definite breakage to SMTC between
> >>>> those versions.  In arch/mips/include/asm/stackframe.h, someone moved
> >>>> the store of the Status register value in SAVE_SOME (line 169 or 204,
> >>>> depending on the version) from two instructions after the mfc0 to a
> >>>> point after the #ifdef for SMTC, presumably to get better pipelining of
> >>>> the register access.  Unfortunately, the v1 register is also used in the
> >>>> SMTC-specific fragment to save TCStatus, so the Status value gets
> >>>> clobbered before it gets stored.  This will eventually result in the
> >>>> Status register getting a TCStatus value, which has some bits on common,
> >>>> but isn't identical and sooner or later Bad Things will happen.
> >>>>
> >>>> I'm a little surprised this wasn't caught by visual inspection of the patch.
> >>>>
> >>>> Possible solutions would include reverting the store of the CP0_STATUS
> >>>> value to the block above the #ifdef, or, to retain whatever performance
> >>>> advantage was obtained by moving the store downward, to use v0/$2
> >>>> instead of v1/$3, as the staging register for the TCStatus value.  I'd
> >>>> lean toward the second option, but I'm not in a position to test and
> >>>> submit a patch just now.
> >>>>
> >>>>                Regards,
> >>>>
> >>>>                Kevin K.
> >>>>
> >>>> On 12/23/10 1:09 PM, STUART VENTERS wrote:
> >>>>> Kevin,
> >>>>>
> >>>>> I'm not sure if it's useful,
> >>>>>       but finally I got the time to look at the two kernel versions Anoop pointed out.
> >>>>>        works   2.6.32-stable with patch 804
> >>>>>        works_not 2.6.33-stable
> >>>>>
> >>>>> greping for files with CONFIG_MIPS_MT_SMTC
> >>>>>       and looking for timer interrupt related stuff found the following differences:
> >>>>>
> >>>>>
> >>>>> arch/mips/include/asm/irq.h
> >>>>> arch/mips/kernel/irq.c
> >>>>>      do_IRQ
> >>>>>
> >>>>> arch/mips/include/asm/stackframe.h
> >>>>>      SAVE_SOME SAVE_TEMP get/set_saved_sp
> >>>>>
> >>>>> arch/mips/include/asm/time.h
> >>>>>      clocksource_set_clock
> >>>>>
> >>>>> arch/mips/kernel/process.c
> >>>>>      cpu_idle
> >>>>>
> >>>>> arch/mips/kernel/smtc.c
> >>>>>      __irq_entry
> >>>>>      ipi_decode
> >>>>>          SMTC_CLOCK_TICK
> >>>>>
> >>>>>
> >>>>> Enclosed are the two subsets of files for a more expert look.
> >>>>>
> >>>>> I'll try to look in more detail after Christmas.
> >>>>>
> >>>>>
> >>>>> Cheers,
> >>>>>
> >>>>> Stuart
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >
> 

^ permalink raw reply	[flat|nested] 68+ messages in thread

* RE: SMTC support status in latest git head.
  2010-12-27 17:19                                       ` Anoop P A
@ 2010-12-28  8:19                                         ` Anoop P A
  2010-12-28  8:43                                           ` Kevin D. Kissell
  0 siblings, 1 reply; 68+ messages in thread
From: Anoop P A @ 2010-12-28  8:19 UTC (permalink / raw)
  To: STUART VENTERS; +Cc: Kevin D. Kissell, Anoop P.A., linux-mips

Hi,

I had a glance into the code diff without notice of any suspect-able
code .
Tracing the hang showed that it is getting hanged in timekeeping_notify
function.

Thanks,
Anoop

PS: I may not be available until Thursday

On Mon, 2010-12-27 at 22:49 +0530, Anoop P A wrote:
> Hi Kevin,
> 
> It is very unlikely that the patch you pointed has any impact on the the
> hang I am seeing. The patch you have mentioned got into kernel around
> 2.6.32 timeframe. I am able to boot both 2.6.32 and  2.6.33 kernel ( +
> stackframe patch) . 
> 
> Hi Stuart,
> 
> I haven't got much time to spend on this today.
> 
> I had got 2.6.36-stable(+ stack frame patch) booting last day and I have
> observed hang issue with 2.6.37-rc1 ( Same as rc6 and current git head)
> 
> So probably some patches in 2.6.37 branch introduced this hang.
> 
> Hopefully I will get some free slot tomorrow so that I can look into
> code diff .
> 
> Thanks
> Anoop
> 
> On Mon, 2010-12-27 at 09:49 -0600, STUART VENTERS wrote:
> > Kevin,
> > 
> > Outstanding, sometimes it's better to be lucky than good.
> > 
> > 
> > Anoop,
> > 
> > Maybe we can get lucky again.
> > 
> > If you can isolate the .33 works/.37 works_not bug to a specific pair of versions, 
> >    I'll be happy to do another diff.
> > 
> > 
> > Hope you'll have had a good Christmas as well.
> >   We've had snow in Alabama since Christmas eve!
> > 
> > 
> > Regards,
> > 
> > Stuart
> > 
> > 
> > -----Original Message-----
> > From: Kevin D. Kissell [mailto:kevink@paralogos.com]
> > Sent: Friday, December 24, 2010 5:34 PM
> > To: Anoop P A
> > Cc: STUART VENTERS; Anoop P.A.; linux-mips@linux-mips.org
> > Subject: Re: SMTC support status in latest git head.
> > 
> > 
> > Ah, well, at least we have a stackframe.h fix that preserves David's 
> > performance tweak for the deeper pipelined processors.  In looking for 
> > this, I did notice that someone did some modification to the SMTC clock 
> > tick logic that I was skeptical had ever been tested.  If you've still 
> > got that kernel binary handy, you might check to see if it boots with 
> > maxtcs=1 maxvpes=1, maxtcs=2 maxvpes=1, and/or maxtcs=2 maxvpes=2.
> > 
> > Oh, yes, and Merry Christmas one and all!
> > 
> >              Regards,
> > 
> >              Kevin K.
> > 
> > On 12/24/10 8:02 AM, Anoop P A wrote:
> > > On Fri, 2010-12-24 at 06:53 -0800, Kevin D. Kissell wrote:
> > >> Excellent!  Now, does the attached patch (relative to 2.6.37.11) also
> > >> fix things, while preserving the other fixes and performance enhancements?
> > >>
> > > I have tested that patch with 2.6.37 branch it well passes calibration
> > > loop but hangs after switching to mips closource
> > >
> > > TC 6 going on-line as CPU 6
> > > Brought up 7 CPUs
> > > bio: create slab<bio-0>  at 0
> > > SCSI subsystem initialized
> > > Switching to clocksource MIPS
> > >
> > > I Presume this is a different issue as restoring older file didn't help
> > > much to get rid of this hang.
> > >
> > > diff --git a/arch/mips/include/asm/stackframe.h
> > > b/arch/mips/include/asm/stackframe.h
> > > index 58730c5..7fc9f10 100644
> > > --- a/arch/mips/include/asm/stackframe.h
> > > +++ b/arch/mips/include/asm/stackframe.h
> > > @@ -195,9 +195,9 @@
> > >   		 * to cover the pipeline delay.
> > >   		 */
> > >   		.set	mips32
> > > -		mfc0	v1, CP0_TCSTATUS
> > > +		mfc0	v0, CP0_TCSTATUS
> > >   		.set	mips0
> > > -		LONG_S	v1, PT_TCSTATUS(sp)
> > > +		LONG_S	v0, PT_TCSTATUS(sp)
> > >   #endif /* CONFIG_MIPS_MT_SMTC */
> > >   		LONG_S	$4, PT_R4(sp)
> > >   		LONG_S	$5, PT_R5(sp)
> > >
> > >
> > >> /K.
> > >>
> > >> On 12/24/10 6:39 AM, Anoop P A wrote:
> > >>> Hi Kevin, Stuart ,
> > >>>
> > >>> Woohooo You guys spotted !.
> > >>>
> > >>>    http://git.linux-mips.org/?p=linux.git;a=commit;h=d5ec6e3c seems to be
> > >>> the culprit
> > >>>
> > >>> Once I restored previous version of stackframe.h 2.6.33-stable started
> > >>> booting !.
> > >>>
> > >>> Thanks,
> > >>> Anoop
> > >>>
> > >>> On Fri, 2010-12-24 at 04:32 -0800, Kevin D. Kissell wrote:
> > >>>> Thank you, Stuart!  I've spotted some definite breakage to SMTC between
> > >>>> those versions.  In arch/mips/include/asm/stackframe.h, someone moved
> > >>>> the store of the Status register value in SAVE_SOME (line 169 or 204,
> > >>>> depending on the version) from two instructions after the mfc0 to a
> > >>>> point after the #ifdef for SMTC, presumably to get better pipelining of
> > >>>> the register access.  Unfortunately, the v1 register is also used in the
> > >>>> SMTC-specific fragment to save TCStatus, so the Status value gets
> > >>>> clobbered before it gets stored.  This will eventually result in the
> > >>>> Status register getting a TCStatus value, which has some bits on common,
> > >>>> but isn't identical and sooner or later Bad Things will happen.
> > >>>>
> > >>>> I'm a little surprised this wasn't caught by visual inspection of the patch.
> > >>>>
> > >>>> Possible solutions would include reverting the store of the CP0_STATUS
> > >>>> value to the block above the #ifdef, or, to retain whatever performance
> > >>>> advantage was obtained by moving the store downward, to use v0/$2
> > >>>> instead of v1/$3, as the staging register for the TCStatus value.  I'd
> > >>>> lean toward the second option, but I'm not in a position to test and
> > >>>> submit a patch just now.
> > >>>>
> > >>>>                Regards,
> > >>>>
> > >>>>                Kevin K.
> > >>>>
> > >>>> On 12/23/10 1:09 PM, STUART VENTERS wrote:
> > >>>>> Kevin,
> > >>>>>
> > >>>>> I'm not sure if it's useful,
> > >>>>>       but finally I got the time to look at the two kernel versions Anoop pointed out.
> > >>>>>        works   2.6.32-stable with patch 804
> > >>>>>        works_not 2.6.33-stable
> > >>>>>
> > >>>>> greping for files with CONFIG_MIPS_MT_SMTC
> > >>>>>       and looking for timer interrupt related stuff found the following differences:
> > >>>>>
> > >>>>>
> > >>>>> arch/mips/include/asm/irq.h
> > >>>>> arch/mips/kernel/irq.c
> > >>>>>      do_IRQ
> > >>>>>
> > >>>>> arch/mips/include/asm/stackframe.h
> > >>>>>      SAVE_SOME SAVE_TEMP get/set_saved_sp
> > >>>>>
> > >>>>> arch/mips/include/asm/time.h
> > >>>>>      clocksource_set_clock
> > >>>>>
> > >>>>> arch/mips/kernel/process.c
> > >>>>>      cpu_idle
> > >>>>>
> > >>>>> arch/mips/kernel/smtc.c
> > >>>>>      __irq_entry
> > >>>>>      ipi_decode
> > >>>>>          SMTC_CLOCK_TICK
> > >>>>>
> > >>>>>
> > >>>>> Enclosed are the two subsets of files for a more expert look.
> > >>>>>
> > >>>>> I'll try to look in more detail after Christmas.
> > >>>>>
> > >>>>>
> > >>>>> Cheers,
> > >>>>>
> > >>>>> Stuart
> > >>>>>
> > >>>>>
> > >>>>>
> > >>>>>
> > >
> > 
> 

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: SMTC support status in latest git head.
  2010-12-28  8:19                                         ` Anoop P A
@ 2010-12-28  8:43                                           ` Kevin D. Kissell
  2010-12-31 12:27                                             ` Anoop P A
  0 siblings, 1 reply; 68+ messages in thread
From: Kevin D. Kissell @ 2010-12-28  8:43 UTC (permalink / raw)
  To: Anoop P A; +Cc: STUART VENTERS, Anoop P.A., linux-mips

I took a quick look last night, and the only thing that looked vaguely 
dangerous in changes since the timer changes I alluded to earlier was 
the global naming cleanup of irq-related function names that David 
Howell submitted.  The diff didn't look dangerous in itself, but some of 
the definitions are nested subtly for SMTC to maximize the amount of 
common code, and I could imagine something getting lost in translation 
there.  If that were really the problem, it would of course affect much 
more than just the timer subsystem, but early in the boot process, 
timers are pretty much the only interrupts that have to be handled 
correctly.

I'm travelling today, but will take a look at timekeeping_notify() 
tomorrow or the next day...

/K.

On 12/28/10 12:19 AM, Anoop P A wrote:
> Hi,
>
> I had a glance into the code diff without notice of any suspect-able
> code .
> Tracing the hang showed that it is getting hanged in timekeeping_notify
> function.
>
> Thanks,
> Anoop
>
> PS: I may not be available until Thursday
>
> On Mon, 2010-12-27 at 22:49 +0530, Anoop P A wrote:
>> Hi Kevin,
>>
>> It is very unlikely that the patch you pointed has any impact on the the
>> hang I am seeing. The patch you have mentioned got into kernel around
>> 2.6.32 timeframe. I am able to boot both 2.6.32 and  2.6.33 kernel ( +
>> stackframe patch) .
>>
>> Hi Stuart,
>>
>> I haven't got much time to spend on this today.
>>
>> I had got 2.6.36-stable(+ stack frame patch) booting last day and I have
>> observed hang issue with 2.6.37-rc1 ( Same as rc6 and current git head)
>>
>> So probably some patches in 2.6.37 branch introduced this hang.
>>
>> Hopefully I will get some free slot tomorrow so that I can look into
>> code diff .
>>
>> Thanks
>> Anoop
>>
>> On Mon, 2010-12-27 at 09:49 -0600, STUART VENTERS wrote:
>>> Kevin,
>>>
>>> Outstanding, sometimes it's better to be lucky than good.
>>>
>>>
>>> Anoop,
>>>
>>> Maybe we can get lucky again.
>>>
>>> If you can isolate the .33 works/.37 works_not bug to a specific pair of versions,
>>>     I'll be happy to do another diff.
>>>
>>>
>>> Hope you'll have had a good Christmas as well.
>>>    We've had snow in Alabama since Christmas eve!
>>>
>>>
>>> Regards,
>>>
>>> Stuart
>>>
>>>
>>> -----Original Message-----
>>> From: Kevin D. Kissell [mailto:kevink@paralogos.com]
>>> Sent: Friday, December 24, 2010 5:34 PM
>>> To: Anoop P A
>>> Cc: STUART VENTERS; Anoop P.A.; linux-mips@linux-mips.org
>>> Subject: Re: SMTC support status in latest git head.
>>>
>>>
>>> Ah, well, at least we have a stackframe.h fix that preserves David's
>>> performance tweak for the deeper pipelined processors.  In looking for
>>> this, I did notice that someone did some modification to the SMTC clock
>>> tick logic that I was skeptical had ever been tested.  If you've still
>>> got that kernel binary handy, you might check to see if it boots with
>>> maxtcs=1 maxvpes=1, maxtcs=2 maxvpes=1, and/or maxtcs=2 maxvpes=2.
>>>
>>> Oh, yes, and Merry Christmas one and all!
>>>
>>>               Regards,
>>>
>>>               Kevin K.
>>>
>>> On 12/24/10 8:02 AM, Anoop P A wrote:
>>>> On Fri, 2010-12-24 at 06:53 -0800, Kevin D. Kissell wrote:
>>>>> Excellent!  Now, does the attached patch (relative to 2.6.37.11) also
>>>>> fix things, while preserving the other fixes and performance enhancements?
>>>>>
>>>> I have tested that patch with 2.6.37 branch it well passes calibration
>>>> loop but hangs after switching to mips closource
>>>>
>>>> TC 6 going on-line as CPU 6
>>>> Brought up 7 CPUs
>>>> bio: create slab<bio-0>   at 0
>>>> SCSI subsystem initialized
>>>> Switching to clocksource MIPS
>>>>
>>>> I Presume this is a different issue as restoring older file didn't help
>>>> much to get rid of this hang.
>>>>
>>>> diff --git a/arch/mips/include/asm/stackframe.h
>>>> b/arch/mips/include/asm/stackframe.h
>>>> index 58730c5..7fc9f10 100644
>>>> --- a/arch/mips/include/asm/stackframe.h
>>>> +++ b/arch/mips/include/asm/stackframe.h
>>>> @@ -195,9 +195,9 @@
>>>>    		 * to cover the pipeline delay.
>>>>    		 */
>>>>    		.set	mips32
>>>> -		mfc0	v1, CP0_TCSTATUS
>>>> +		mfc0	v0, CP0_TCSTATUS
>>>>    		.set	mips0
>>>> -		LONG_S	v1, PT_TCSTATUS(sp)
>>>> +		LONG_S	v0, PT_TCSTATUS(sp)
>>>>    #endif /* CONFIG_MIPS_MT_SMTC */
>>>>    		LONG_S	$4, PT_R4(sp)
>>>>    		LONG_S	$5, PT_R5(sp)
>>>>
>>>>
>>>>> /K.
>>>>>
>>>>> On 12/24/10 6:39 AM, Anoop P A wrote:
>>>>>> Hi Kevin, Stuart ,
>>>>>>
>>>>>> Woohooo You guys spotted !.
>>>>>>
>>>>>>     http://git.linux-mips.org/?p=linux.git;a=commit;h=d5ec6e3c seems to be
>>>>>> the culprit
>>>>>>
>>>>>> Once I restored previous version of stackframe.h 2.6.33-stable started
>>>>>> booting !.
>>>>>>
>>>>>> Thanks,
>>>>>> Anoop
>>>>>>
>>>>>> On Fri, 2010-12-24 at 04:32 -0800, Kevin D. Kissell wrote:
>>>>>>> Thank you, Stuart!  I've spotted some definite breakage to SMTC between
>>>>>>> those versions.  In arch/mips/include/asm/stackframe.h, someone moved
>>>>>>> the store of the Status register value in SAVE_SOME (line 169 or 204,
>>>>>>> depending on the version) from two instructions after the mfc0 to a
>>>>>>> point after the #ifdef for SMTC, presumably to get better pipelining of
>>>>>>> the register access.  Unfortunately, the v1 register is also used in the
>>>>>>> SMTC-specific fragment to save TCStatus, so the Status value gets
>>>>>>> clobbered before it gets stored.  This will eventually result in the
>>>>>>> Status register getting a TCStatus value, which has some bits on common,
>>>>>>> but isn't identical and sooner or later Bad Things will happen.
>>>>>>>
>>>>>>> I'm a little surprised this wasn't caught by visual inspection of the patch.
>>>>>>>
>>>>>>> Possible solutions would include reverting the store of the CP0_STATUS
>>>>>>> value to the block above the #ifdef, or, to retain whatever performance
>>>>>>> advantage was obtained by moving the store downward, to use v0/$2
>>>>>>> instead of v1/$3, as the staging register for the TCStatus value.  I'd
>>>>>>> lean toward the second option, but I'm not in a position to test and
>>>>>>> submit a patch just now.
>>>>>>>
>>>>>>>                 Regards,
>>>>>>>
>>>>>>>                 Kevin K.
>>>>>>>
>>>>>>> On 12/23/10 1:09 PM, STUART VENTERS wrote:
>>>>>>>> Kevin,
>>>>>>>>
>>>>>>>> I'm not sure if it's useful,
>>>>>>>>        but finally I got the time to look at the two kernel versions Anoop pointed out.
>>>>>>>>         works   2.6.32-stable with patch 804
>>>>>>>>         works_not 2.6.33-stable
>>>>>>>>
>>>>>>>> greping for files with CONFIG_MIPS_MT_SMTC
>>>>>>>>        and looking for timer interrupt related stuff found the following differences:
>>>>>>>>
>>>>>>>>
>>>>>>>> arch/mips/include/asm/irq.h
>>>>>>>> arch/mips/kernel/irq.c
>>>>>>>>       do_IRQ
>>>>>>>>
>>>>>>>> arch/mips/include/asm/stackframe.h
>>>>>>>>       SAVE_SOME SAVE_TEMP get/set_saved_sp
>>>>>>>>
>>>>>>>> arch/mips/include/asm/time.h
>>>>>>>>       clocksource_set_clock
>>>>>>>>
>>>>>>>> arch/mips/kernel/process.c
>>>>>>>>       cpu_idle
>>>>>>>>
>>>>>>>> arch/mips/kernel/smtc.c
>>>>>>>>       __irq_entry
>>>>>>>>       ipi_decode
>>>>>>>>           SMTC_CLOCK_TICK
>>>>>>>>
>>>>>>>>
>>>>>>>> Enclosed are the two subsets of files for a more expert look.
>>>>>>>>
>>>>>>>> I'll try to look in more detail after Christmas.
>>>>>>>>
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>>
>>>>>>>> Stuart
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: SMTC support status in latest git head.
  2010-12-28  8:43                                           ` Kevin D. Kissell
@ 2010-12-31 12:27                                             ` Anoop P A
  2011-01-01  8:42                                               ` Kevin D. Kissell
  0 siblings, 1 reply; 68+ messages in thread
From: Anoop P A @ 2010-12-31 12:27 UTC (permalink / raw)
  To: Kevin D. Kissell; +Cc: STUART VENTERS, Anoop P.A., linux-mips

Hi ,

Kernel hangs on stop_machine call. Please find mt reg dump below.
Another important observation is even though 2.6.33 kernel + stackframe
patch well passes calibration hang , I am still unable boot in to a
initramfs root ( verified ramfs working with VSMP). So it looks like
still some issue to fix between 2.6.32 and 2.6.33 .
######################## Log ###########################

=== MIPS MT State Dump ===
-- Global State --
   MVPControl Passed: 00000005
   MVPControl Read: 00000004
   MVPConf0 : a8008406
-- per-VPE State --
  VPE 0
   VPEControl : 00008000
   VPEConf0 : 800f0003
   VPE0.Status : 11004201
   VPE0.EPC : 8010dc54 smtc_ipi_replay+0xcc/0x108
   VPE0.Cause : 50804000
   VPE0.Config7 : 00010000
  VPE 1
   VPEControl : 00068006
   VPEConf0 : 80cf0003
   VPE1.Status : 11008301
   VPE1.EPC : 801022a0 r4k_wait+0x20/0x40
   VPE1.Cause : 50800000
   VPE1.Config7 : 00010000
-- per-TC State --
  TC 0 (current TC with VPE EPC above)
   TCStatus : 18102000
   TCBind : 00000000
   TCRestart : 803fa19c printk+0xc/0x30
   TCHalt : 00000000
   TCContext : 00000000
  TC 1
   TCStatus : 18902000
   TCBind : 00200000
   TCRestart : 801022a0 r4k_wait+0x20/0x40
   TCHalt : 00000000
   TCContext : 00140000
  TC 2
   TCStatus : 18902000
   TCBind : 00400000
   TCRestart : 801022a0 r4k_wait+0x20/0x40
   TCHalt : 00000000
   TCContext : 00280000
  TC 3
   TCStatus : 18902000
   TCBind : 00600000
   TCRestart : 801022a0 r4k_wait+0x20/0x40
   TCHalt : 00000000
   TCContext : 003c0000
  TC 4
   TCStatus : 18902000
   TCBind : 00800001
   TCRestart : 8010229c r4k_wait+0x1c/0x40
   TCHalt : 00000000
   TCContext : 00500000
  TC 5
   TCStatus : 18902000
   TCBind : 00a00001
   TCRestart : 8010229c r4k_wait+0x1c/0x40
   TCHalt : 00000000
   TCContext : 00640000
  TC 6
   TCStatus : 18902000
   TCBind : 00c00001
   TCRestart : 8010229c r4k_wait+0x1c/0x40
   TCHalt : 00000000
   TCContext : 00780000
Counter Interrupts taken per CPU (TC)
0: 0
1: 0
2: 0
3: 0
4: 0
5: 0
6: 0
7: 0
Self-IPI invocations:
0: 12
1: 0
2: 0
3: 0
4: 0
5: 5
6: 4
7: 0
IPIQ[0]: head = 0x0, tail = 0x0, depth = 0
IPIQ[1]: head = 0x0, tail = 0x0, depth = 0
IPIQ[2]: head = 0x0, tail = 0x0, depth = 0
IPIQ[3]: head = 0x0, tail = 0x0, depth = 0
IPIQ[4]: head = 0x0, tail = 0x0, depth = 0
IPIQ[5]: head = 0x0, tail = 0x0, depth = 0
IPIQ[6]: head = 0x0, tail = 0x0, depth = 0
IPIQ[7]: head = 0x0, tail = 0x0, depth = 0
0 Recoveries of "stolen" FPU
===========================

################################################################

Thanks
Anoop

On Tue, 2010-12-28 at 00:43 -0800, Kevin D. Kissell wrote:
> I took a quick look last night, and the only thing that looked vaguely 
> dangerous in changes since the timer changes I alluded to earlier was 
> the global naming cleanup of irq-related function names that David 
> Howell submitted.  The diff didn't look dangerous in itself, but some of 
> the definitions are nested subtly for SMTC to maximize the amount of 
> common code, and I could imagine something getting lost in translation 
> there.  If that were really the problem, it would of course affect much 
> more than just the timer subsystem, but early in the boot process, 
> timers are pretty much the only interrupts that have to be handled 
> correctly.
> 
> I'm travelling today, but will take a look at timekeeping_notify() 
> tomorrow or the next day...
> 
> /K.
> 
> On 12/28/10 12:19 AM, Anoop P A wrote:
> > Hi,
> >
> > I had a glance into the code diff without notice of any suspect-able
> > code .
> > Tracing the hang showed that it is getting hanged in timekeeping_notify
> > function.
> >
> > Thanks,
> > Anoop
> >
> > PS: I may not be available until Thursday
> >
> > On Mon, 2010-12-27 at 22:49 +0530, Anoop P A wrote:
> >> Hi Kevin,
> >>
> >> It is very unlikely that the patch you pointed has any impact on the the
> >> hang I am seeing. The patch you have mentioned got into kernel around
> >> 2.6.32 timeframe. I am able to boot both 2.6.32 and  2.6.33 kernel ( +
> >> stackframe patch) .
> >>
> >> Hi Stuart,
> >>
> >> I haven't got much time to spend on this today.
> >>
> >> I had got 2.6.36-stable(+ stack frame patch) booting last day and I have
> >> observed hang issue with 2.6.37-rc1 ( Same as rc6 and current git head)
> >>
> >> So probably some patches in 2.6.37 branch introduced this hang.
> >>
> >> Hopefully I will get some free slot tomorrow so that I can look into
> >> code diff .
> >>
> >> Thanks
> >> Anoop
> >>
> >> On Mon, 2010-12-27 at 09:49 -0600, STUART VENTERS wrote:
> >>> Kevin,
> >>>
> >>> Outstanding, sometimes it's better to be lucky than good.
> >>>
> >>>
> >>> Anoop,
> >>>
> >>> Maybe we can get lucky again.
> >>>
> >>> If you can isolate the .33 works/.37 works_not bug to a specific pair of versions,
> >>>     I'll be happy to do another diff.
> >>>
> >>>
> >>> Hope you'll have had a good Christmas as well.
> >>>    We've had snow in Alabama since Christmas eve!
> >>>
> >>>
> >>> Regards,
> >>>
> >>> Stuart
> >>>
> >>>
> >>> -----Original Message-----
> >>> From: Kevin D. Kissell [mailto:kevink@paralogos.com]
> >>> Sent: Friday, December 24, 2010 5:34 PM
> >>> To: Anoop P A
> >>> Cc: STUART VENTERS; Anoop P.A.; linux-mips@linux-mips.org
> >>> Subject: Re: SMTC support status in latest git head.
> >>>
> >>>
> >>> Ah, well, at least we have a stackframe.h fix that preserves David's
> >>> performance tweak for the deeper pipelined processors.  In looking for
> >>> this, I did notice that someone did some modification to the SMTC clock
> >>> tick logic that I was skeptical had ever been tested.  If you've still
> >>> got that kernel binary handy, you might check to see if it boots with
> >>> maxtcs=1 maxvpes=1, maxtcs=2 maxvpes=1, and/or maxtcs=2 maxvpes=2.
> >>>
> >>> Oh, yes, and Merry Christmas one and all!
> >>>
> >>>               Regards,
> >>>
> >>>               Kevin K.
> >>>
> >>> On 12/24/10 8:02 AM, Anoop P A wrote:
> >>>> On Fri, 2010-12-24 at 06:53 -0800, Kevin D. Kissell wrote:
> >>>>> Excellent!  Now, does the attached patch (relative to 2.6.37.11) also
> >>>>> fix things, while preserving the other fixes and performance enhancements?
> >>>>>
> >>>> I have tested that patch with 2.6.37 branch it well passes calibration
> >>>> loop but hangs after switching to mips closource
> >>>>
> >>>> TC 6 going on-line as CPU 6
> >>>> Brought up 7 CPUs
> >>>> bio: create slab<bio-0>   at 0
> >>>> SCSI subsystem initialized
> >>>> Switching to clocksource MIPS
> >>>>
> >>>> I Presume this is a different issue as restoring older file didn't help
> >>>> much to get rid of this hang.
> >>>>
> >>>> diff --git a/arch/mips/include/asm/stackframe.h
> >>>> b/arch/mips/include/asm/stackframe.h
> >>>> index 58730c5..7fc9f10 100644
> >>>> --- a/arch/mips/include/asm/stackframe.h
> >>>> +++ b/arch/mips/include/asm/stackframe.h
> >>>> @@ -195,9 +195,9 @@
> >>>>    		 * to cover the pipeline delay.
> >>>>    		 */
> >>>>    		.set	mips32
> >>>> -		mfc0	v1, CP0_TCSTATUS
> >>>> +		mfc0	v0, CP0_TCSTATUS
> >>>>    		.set	mips0
> >>>> -		LONG_S	v1, PT_TCSTATUS(sp)
> >>>> +		LONG_S	v0, PT_TCSTATUS(sp)
> >>>>    #endif /* CONFIG_MIPS_MT_SMTC */
> >>>>    		LONG_S	$4, PT_R4(sp)
> >>>>    		LONG_S	$5, PT_R5(sp)
> >>>>
> >>>>
> >>>>> /K.
> >>>>>
> >>>>> On 12/24/10 6:39 AM, Anoop P A wrote:
> >>>>>> Hi Kevin, Stuart ,
> >>>>>>
> >>>>>> Woohooo You guys spotted !.
> >>>>>>
> >>>>>>     http://git.linux-mips.org/?p=linux.git;a=commit;h=d5ec6e3c seems to be
> >>>>>> the culprit
> >>>>>>
> >>>>>> Once I restored previous version of stackframe.h 2.6.33-stable started
> >>>>>> booting !.
> >>>>>>
> >>>>>> Thanks,
> >>>>>> Anoop
> >>>>>>
> >>>>>> On Fri, 2010-12-24 at 04:32 -0800, Kevin D. Kissell wrote:
> >>>>>>> Thank you, Stuart!  I've spotted some definite breakage to SMTC between
> >>>>>>> those versions.  In arch/mips/include/asm/stackframe.h, someone moved
> >>>>>>> the store of the Status register value in SAVE_SOME (line 169 or 204,
> >>>>>>> depending on the version) from two instructions after the mfc0 to a
> >>>>>>> point after the #ifdef for SMTC, presumably to get better pipelining of
> >>>>>>> the register access.  Unfortunately, the v1 register is also used in the
> >>>>>>> SMTC-specific fragment to save TCStatus, so the Status value gets
> >>>>>>> clobbered before it gets stored.  This will eventually result in the
> >>>>>>> Status register getting a TCStatus value, which has some bits on common,
> >>>>>>> but isn't identical and sooner or later Bad Things will happen.
> >>>>>>>
> >>>>>>> I'm a little surprised this wasn't caught by visual inspection of the patch.
> >>>>>>>
> >>>>>>> Possible solutions would include reverting the store of the CP0_STATUS
> >>>>>>> value to the block above the #ifdef, or, to retain whatever performance
> >>>>>>> advantage was obtained by moving the store downward, to use v0/$2
> >>>>>>> instead of v1/$3, as the staging register for the TCStatus value.  I'd
> >>>>>>> lean toward the second option, but I'm not in a position to test and
> >>>>>>> submit a patch just now.
> >>>>>>>
> >>>>>>>                 Regards,
> >>>>>>>
> >>>>>>>                 Kevin K.
> >>>>>>>
> >>>>>>> On 12/23/10 1:09 PM, STUART VENTERS wrote:
> >>>>>>>> Kevin,
> >>>>>>>>
> >>>>>>>> I'm not sure if it's useful,
> >>>>>>>>        but finally I got the time to look at the two kernel versions Anoop pointed out.
> >>>>>>>>         works   2.6.32-stable with patch 804
> >>>>>>>>         works_not 2.6.33-stable
> >>>>>>>>
> >>>>>>>> greping for files with CONFIG_MIPS_MT_SMTC
> >>>>>>>>        and looking for timer interrupt related stuff found the following differences:
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> arch/mips/include/asm/irq.h
> >>>>>>>> arch/mips/kernel/irq.c
> >>>>>>>>       do_IRQ
> >>>>>>>>
> >>>>>>>> arch/mips/include/asm/stackframe.h
> >>>>>>>>       SAVE_SOME SAVE_TEMP get/set_saved_sp
> >>>>>>>>
> >>>>>>>> arch/mips/include/asm/time.h
> >>>>>>>>       clocksource_set_clock
> >>>>>>>>
> >>>>>>>> arch/mips/kernel/process.c
> >>>>>>>>       cpu_idle
> >>>>>>>>
> >>>>>>>> arch/mips/kernel/smtc.c
> >>>>>>>>       __irq_entry
> >>>>>>>>       ipi_decode
> >>>>>>>>           SMTC_CLOCK_TICK
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Enclosed are the two subsets of files for a more expert look.
> >>>>>>>>
> >>>>>>>> I'll try to look in more detail after Christmas.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Cheers,
> >>>>>>>>
> >>>>>>>> Stuart
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >
> 

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: SMTC support status in latest git head.
  2010-12-31 12:27                                             ` Anoop P A
@ 2011-01-01  8:42                                               ` Kevin D. Kissell
  2011-01-03 15:12                                                 ` Anoop P A
  0 siblings, 1 reply; 68+ messages in thread
From: Kevin D. Kissell @ 2011-01-01  8:42 UTC (permalink / raw)
  To: Anoop P A; +Cc: STUART VENTERS, Anoop P.A., linux-mips

At this point the logical thing to do would seem to look at your kernel
image and disassemble smtc_ipi_replay(), which is where the EPC of VPE 0
shows the last exception to have been taken.  That's a critical SMTC
routine that gets called whenever an xxx_irq_restore() enables
interrupts, so that virtual per-TC IPI interrupts that were posted while
the TC had interrupts disabled can be handled deterministically.  As I
mentioned in an earlier message, there was some cleanup work from David
Howell that changed a number of irq management-related function names
and prototypes across all architectures, which went into linux-mips.org
at very roughly the time of the breakage.  The SMTC overlay over the irq
implementation has been pretty robust, but it's written in a perhaps
doomed attempt to be both efficient and using a maximum amount of common
code with the general case.  A mechanical or semi-mechanical change
could conceivably have broken things.

            Regards,

            Kevin K.


On 12/31/2010 4:27 AM, Anoop P A wrote:
> Hi ,
>
> Kernel hangs on stop_machine call. Please find mt reg dump below.
> Another important observation is even though 2.6.33 kernel + stackframe
> patch well passes calibration hang , I am still unable boot in to a
> initramfs root ( verified ramfs working with VSMP). So it looks like
> still some issue to fix between 2.6.32 and 2.6.33 .
> ######################## Log ###########################
>
> === MIPS MT State Dump ===
> -- Global State --
>    MVPControl Passed: 00000005
>    MVPControl Read: 00000004
>    MVPConf0 : a8008406
> -- per-VPE State --
>   VPE 0
>    VPEControl : 00008000
>    VPEConf0 : 800f0003
>    VPE0.Status : 11004201
>    VPE0.EPC : 8010dc54 smtc_ipi_replay+0xcc/0x108
>    VPE0.Cause : 50804000
>    VPE0.Config7 : 00010000
>   VPE 1
>    VPEControl : 00068006
>    VPEConf0 : 80cf0003
>    VPE1.Status : 11008301
>    VPE1.EPC : 801022a0 r4k_wait+0x20/0x40
>    VPE1.Cause : 50800000
>    VPE1.Config7 : 00010000
> -- per-TC State --
>   TC 0 (current TC with VPE EPC above)
>    TCStatus : 18102000
>    TCBind : 00000000
>    TCRestart : 803fa19c printk+0xc/0x30
>    TCHalt : 00000000
>    TCContext : 00000000
>   TC 1
>    TCStatus : 18902000
>    TCBind : 00200000
>    TCRestart : 801022a0 r4k_wait+0x20/0x40
>    TCHalt : 00000000
>    TCContext : 00140000
>   TC 2
>    TCStatus : 18902000
>    TCBind : 00400000
>    TCRestart : 801022a0 r4k_wait+0x20/0x40
>    TCHalt : 00000000
>    TCContext : 00280000
>   TC 3
>    TCStatus : 18902000
>    TCBind : 00600000
>    TCRestart : 801022a0 r4k_wait+0x20/0x40
>    TCHalt : 00000000
>    TCContext : 003c0000
>   TC 4
>    TCStatus : 18902000
>    TCBind : 00800001
>    TCRestart : 8010229c r4k_wait+0x1c/0x40
>    TCHalt : 00000000
>    TCContext : 00500000
>   TC 5
>    TCStatus : 18902000
>    TCBind : 00a00001
>    TCRestart : 8010229c r4k_wait+0x1c/0x40
>    TCHalt : 00000000
>    TCContext : 00640000
>   TC 6
>    TCStatus : 18902000
>    TCBind : 00c00001
>    TCRestart : 8010229c r4k_wait+0x1c/0x40
>    TCHalt : 00000000
>    TCContext : 00780000
> Counter Interrupts taken per CPU (TC)
> 0: 0
> 1: 0
> 2: 0
> 3: 0
> 4: 0
> 5: 0
> 6: 0
> 7: 0
> Self-IPI invocations:
> 0: 12
> 1: 0
> 2: 0
> 3: 0
> 4: 0
> 5: 5
> 6: 4
> 7: 0
> IPIQ[0]: head = 0x0, tail = 0x0, depth = 0
> IPIQ[1]: head = 0x0, tail = 0x0, depth = 0
> IPIQ[2]: head = 0x0, tail = 0x0, depth = 0
> IPIQ[3]: head = 0x0, tail = 0x0, depth = 0
> IPIQ[4]: head = 0x0, tail = 0x0, depth = 0
> IPIQ[5]: head = 0x0, tail = 0x0, depth = 0
> IPIQ[6]: head = 0x0, tail = 0x0, depth = 0
> IPIQ[7]: head = 0x0, tail = 0x0, depth = 0
> 0 Recoveries of "stolen" FPU
> ===========================
>
> ################################################################
>
> Thanks
> Anoop
>
> On Tue, 2010-12-28 at 00:43 -0800, Kevin D. Kissell wrote:
>> I took a quick look last night, and the only thing that looked vaguely 
>> dangerous in changes since the timer changes I alluded to earlier was 
>> the global naming cleanup of irq-related function names that David 
>> Howell submitted.  The diff didn't look dangerous in itself, but some of 
>> the definitions are nested subtly for SMTC to maximize the amount of 
>> common code, and I could imagine something getting lost in translation 
>> there.  If that were really the problem, it would of course affect much 
>> more than just the timer subsystem, but early in the boot process, 
>> timers are pretty much the only interrupts that have to be handled 
>> correctly.
>>
>> I'm travelling today, but will take a look at timekeeping_notify() 
>> tomorrow or the next day...
>>
>> /K.
>>
>> On 12/28/10 12:19 AM, Anoop P A wrote:
>>> Hi,
>>>
>>> I had a glance into the code diff without notice of any suspect-able
>>> code .
>>> Tracing the hang showed that it is getting hanged in timekeeping_notify
>>> function.
>>>
>>> Thanks,
>>> Anoop
>>>
>>> PS: I may not be available until Thursday
>>>
>>> On Mon, 2010-12-27 at 22:49 +0530, Anoop P A wrote:
>>>> Hi Kevin,
>>>>
>>>> It is very unlikely that the patch you pointed has any impact on the the
>>>> hang I am seeing. The patch you have mentioned got into kernel around
>>>> 2.6.32 timeframe. I am able to boot both 2.6.32 and  2.6.33 kernel ( +
>>>> stackframe patch) .
>>>>
>>>> Hi Stuart,
>>>>
>>>> I haven't got much time to spend on this today.
>>>>
>>>> I had got 2.6.36-stable(+ stack frame patch) booting last day and I have
>>>> observed hang issue with 2.6.37-rc1 ( Same as rc6 and current git head)
>>>>
>>>> So probably some patches in 2.6.37 branch introduced this hang.
>>>>
>>>> Hopefully I will get some free slot tomorrow so that I can look into
>>>> code diff .
>>>>
>>>> Thanks
>>>> Anoop
>>>>
>>>> On Mon, 2010-12-27 at 09:49 -0600, STUART VENTERS wrote:
>>>>> Kevin,
>>>>>
>>>>> Outstanding, sometimes it's better to be lucky than good.
>>>>>
>>>>>
>>>>> Anoop,
>>>>>
>>>>> Maybe we can get lucky again.
>>>>>
>>>>> If you can isolate the .33 works/.37 works_not bug to a specific pair of versions,
>>>>>     I'll be happy to do another diff.
>>>>>
>>>>>
>>>>> Hope you'll have had a good Christmas as well.
>>>>>    We've had snow in Alabama since Christmas eve!
>>>>>
>>>>>
>>>>> Regards,
>>>>>
>>>>> Stuart
>>>>>
>>>>>
>>>>> -----Original Message-----
>>>>> From: Kevin D. Kissell [mailto:kevink@paralogos.com]
>>>>> Sent: Friday, December 24, 2010 5:34 PM
>>>>> To: Anoop P A
>>>>> Cc: STUART VENTERS; Anoop P.A.; linux-mips@linux-mips.org
>>>>> Subject: Re: SMTC support status in latest git head.
>>>>>
>>>>>
>>>>> Ah, well, at least we have a stackframe.h fix that preserves David's
>>>>> performance tweak for the deeper pipelined processors.  In looking for
>>>>> this, I did notice that someone did some modification to the SMTC clock
>>>>> tick logic that I was skeptical had ever been tested.  If you've still
>>>>> got that kernel binary handy, you might check to see if it boots with
>>>>> maxtcs=1 maxvpes=1, maxtcs=2 maxvpes=1, and/or maxtcs=2 maxvpes=2.
>>>>>
>>>>> Oh, yes, and Merry Christmas one and all!
>>>>>
>>>>>               Regards,
>>>>>
>>>>>               Kevin K.
>>>>>
>>>>> On 12/24/10 8:02 AM, Anoop P A wrote:
>>>>>> On Fri, 2010-12-24 at 06:53 -0800, Kevin D. Kissell wrote:
>>>>>>> Excellent!  Now, does the attached patch (relative to 2.6.37.11) also
>>>>>>> fix things, while preserving the other fixes and performance enhancements?
>>>>>>>
>>>>>> I have tested that patch with 2.6.37 branch it well passes calibration
>>>>>> loop but hangs after switching to mips closource
>>>>>>
>>>>>> TC 6 going on-line as CPU 6
>>>>>> Brought up 7 CPUs
>>>>>> bio: create slab<bio-0>   at 0
>>>>>> SCSI subsystem initialized
>>>>>> Switching to clocksource MIPS
>>>>>>
>>>>>> I Presume this is a different issue as restoring older file didn't help
>>>>>> much to get rid of this hang.
>>>>>>
>>>>>> diff --git a/arch/mips/include/asm/stackframe.h
>>>>>> b/arch/mips/include/asm/stackframe.h
>>>>>> index 58730c5..7fc9f10 100644
>>>>>> --- a/arch/mips/include/asm/stackframe.h
>>>>>> +++ b/arch/mips/include/asm/stackframe.h
>>>>>> @@ -195,9 +195,9 @@
>>>>>>    		 * to cover the pipeline delay.
>>>>>>    		 */
>>>>>>    		.set	mips32
>>>>>> -		mfc0	v1, CP0_TCSTATUS
>>>>>> +		mfc0	v0, CP0_TCSTATUS
>>>>>>    		.set	mips0
>>>>>> -		LONG_S	v1, PT_TCSTATUS(sp)
>>>>>> +		LONG_S	v0, PT_TCSTATUS(sp)
>>>>>>    #endif /* CONFIG_MIPS_MT_SMTC */
>>>>>>    		LONG_S	$4, PT_R4(sp)
>>>>>>    		LONG_S	$5, PT_R5(sp)
>>>>>>
>>>>>>
>>>>>>> /K.
>>>>>>>
>>>>>>> On 12/24/10 6:39 AM, Anoop P A wrote:
>>>>>>>> Hi Kevin, Stuart ,
>>>>>>>>
>>>>>>>> Woohooo You guys spotted !.
>>>>>>>>
>>>>>>>>     http://git.linux-mips.org/?p=linux.git;a=commit;h=d5ec6e3c seems to be
>>>>>>>> the culprit
>>>>>>>>
>>>>>>>> Once I restored previous version of stackframe.h 2.6.33-stable started
>>>>>>>> booting !.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Anoop
>>>>>>>>
>>>>>>>> On Fri, 2010-12-24 at 04:32 -0800, Kevin D. Kissell wrote:
>>>>>>>>> Thank you, Stuart!  I've spotted some definite breakage to SMTC between
>>>>>>>>> those versions.  In arch/mips/include/asm/stackframe.h, someone moved
>>>>>>>>> the store of the Status register value in SAVE_SOME (line 169 or 204,
>>>>>>>>> depending on the version) from two instructions after the mfc0 to a
>>>>>>>>> point after the #ifdef for SMTC, presumably to get better pipelining of
>>>>>>>>> the register access.  Unfortunately, the v1 register is also used in the
>>>>>>>>> SMTC-specific fragment to save TCStatus, so the Status value gets
>>>>>>>>> clobbered before it gets stored.  This will eventually result in the
>>>>>>>>> Status register getting a TCStatus value, which has some bits on common,
>>>>>>>>> but isn't identical and sooner or later Bad Things will happen.
>>>>>>>>>
>>>>>>>>> I'm a little surprised this wasn't caught by visual inspection of the patch.
>>>>>>>>>
>>>>>>>>> Possible solutions would include reverting the store of the CP0_STATUS
>>>>>>>>> value to the block above the #ifdef, or, to retain whatever performance
>>>>>>>>> advantage was obtained by moving the store downward, to use v0/$2
>>>>>>>>> instead of v1/$3, as the staging register for the TCStatus value.  I'd
>>>>>>>>> lean toward the second option, but I'm not in a position to test and
>>>>>>>>> submit a patch just now.
>>>>>>>>>
>>>>>>>>>                 Regards,
>>>>>>>>>
>>>>>>>>>                 Kevin K.
>>>>>>>>>
>>>>>>>>> On 12/23/10 1:09 PM, STUART VENTERS wrote:
>>>>>>>>>> Kevin,
>>>>>>>>>>
>>>>>>>>>> I'm not sure if it's useful,
>>>>>>>>>>        but finally I got the time to look at the two kernel versions Anoop pointed out.
>>>>>>>>>>         works   2.6.32-stable with patch 804
>>>>>>>>>>         works_not 2.6.33-stable
>>>>>>>>>>
>>>>>>>>>> greping for files with CONFIG_MIPS_MT_SMTC
>>>>>>>>>>        and looking for timer interrupt related stuff found the following differences:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> arch/mips/include/asm/irq.h
>>>>>>>>>> arch/mips/kernel/irq.c
>>>>>>>>>>       do_IRQ
>>>>>>>>>>
>>>>>>>>>> arch/mips/include/asm/stackframe.h
>>>>>>>>>>       SAVE_SOME SAVE_TEMP get/set_saved_sp
>>>>>>>>>>
>>>>>>>>>> arch/mips/include/asm/time.h
>>>>>>>>>>       clocksource_set_clock
>>>>>>>>>>
>>>>>>>>>> arch/mips/kernel/process.c
>>>>>>>>>>       cpu_idle
>>>>>>>>>>
>>>>>>>>>> arch/mips/kernel/smtc.c
>>>>>>>>>>       __irq_entry
>>>>>>>>>>       ipi_decode
>>>>>>>>>>           SMTC_CLOCK_TICK
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Enclosed are the two subsets of files for a more expert look.
>>>>>>>>>>
>>>>>>>>>> I'll try to look in more detail after Christmas.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Cheers,
>>>>>>>>>>
>>>>>>>>>> Stuart
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: SMTC support status in latest git head.
  2011-01-01  8:42                                               ` Kevin D. Kissell
@ 2011-01-03 15:12                                                 ` Anoop P A
  2011-01-03 16:14                                                   ` Kevin D. Kissell
  0 siblings, 1 reply; 68+ messages in thread
From: Anoop P A @ 2011-01-03 15:12 UTC (permalink / raw)
  To: Kevin D. Kissell; +Cc: STUART VENTERS, Anoop P.A., linux-mips

Hi ,

Following patch restricts TREE_CPU RCU implementation only for !PREEMPT
SMP kernel.  
http://git.linux-mips.org/?p=linux.git;a=commit;h=687d7a960aea46e016182c7ce346d62c4dbd0366

CONFIG_TREE_PREEMPT_RCU option seems to be not working for SMTC kernel
( which will be only available RCU implementation for SMTC kernel from
2.6.37 onwards) .

With no forced preemption and selecting TREE_CPU I am able to boot
further to the hang that I have reported.

Thanks
Anoop

On Sat, 2011-01-01 at 00:42 -0800, Kevin D. Kissell wrote:
> At this point the logical thing to do would seem to look at your kernel
> image and disassemble smtc_ipi_replay(), which is where the EPC of VPE 0
> shows the last exception to have been taken.  That's a critical SMTC
> routine that gets called whenever an xxx_irq_restore() enables
> interrupts, so that virtual per-TC IPI interrupts that were posted while
> the TC had interrupts disabled can be handled deterministically.  As I
> mentioned in an earlier message, there was some cleanup work from David
> Howell that changed a number of irq management-related function names
> and prototypes across all architectures, which went into linux-mips.org
> at very roughly the time of the breakage.  The SMTC overlay over the irq
> implementation has been pretty robust, but it's written in a perhaps
> doomed attempt to be both efficient and using a maximum amount of common
> code with the general case.  A mechanical or semi-mechanical change
> could conceivably have broken things.
> 
>             Regards,
> 
>             Kevin K.
> 
> 
> On 12/31/2010 4:27 AM, Anoop P A wrote:
> > Hi ,
> >
> > Kernel hangs on stop_machine call. Please find mt reg dump below.
> > Another important observation is even though 2.6.33 kernel + stackframe
> > patch well passes calibration hang , I am still unable boot in to a
> > initramfs root ( verified ramfs working with VSMP). So it looks like
> > still some issue to fix between 2.6.32 and 2.6.33 .
> > ######################## Log ###########################
> >
> > === MIPS MT State Dump ===
> > -- Global State --
> >    MVPControl Passed: 00000005
> >    MVPControl Read: 00000004
> >    MVPConf0 : a8008406
> > -- per-VPE State --
> >   VPE 0
> >    VPEControl : 00008000
> >    VPEConf0 : 800f0003
> >    VPE0.Status : 11004201
> >    VPE0.EPC : 8010dc54 smtc_ipi_replay+0xcc/0x108
> >    VPE0.Cause : 50804000
> >    VPE0.Config7 : 00010000
> >   VPE 1
> >    VPEControl : 00068006
> >    VPEConf0 : 80cf0003
> >    VPE1.Status : 11008301
> >    VPE1.EPC : 801022a0 r4k_wait+0x20/0x40
> >    VPE1.Cause : 50800000
> >    VPE1.Config7 : 00010000
> > -- per-TC State --
> >   TC 0 (current TC with VPE EPC above)
> >    TCStatus : 18102000
> >    TCBind : 00000000
> >    TCRestart : 803fa19c printk+0xc/0x30
> >    TCHalt : 00000000
> >    TCContext : 00000000
> >   TC 1
> >    TCStatus : 18902000
> >    TCBind : 00200000
> >    TCRestart : 801022a0 r4k_wait+0x20/0x40
> >    TCHalt : 00000000
> >    TCContext : 00140000
> >   TC 2
> >    TCStatus : 18902000
> >    TCBind : 00400000
> >    TCRestart : 801022a0 r4k_wait+0x20/0x40
> >    TCHalt : 00000000
> >    TCContext : 00280000
> >   TC 3
> >    TCStatus : 18902000
> >    TCBind : 00600000
> >    TCRestart : 801022a0 r4k_wait+0x20/0x40
> >    TCHalt : 00000000
> >    TCContext : 003c0000
> >   TC 4
> >    TCStatus : 18902000
> >    TCBind : 00800001
> >    TCRestart : 8010229c r4k_wait+0x1c/0x40
> >    TCHalt : 00000000
> >    TCContext : 00500000
> >   TC 5
> >    TCStatus : 18902000
> >    TCBind : 00a00001
> >    TCRestart : 8010229c r4k_wait+0x1c/0x40
> >    TCHalt : 00000000
> >    TCContext : 00640000
> >   TC 6
> >    TCStatus : 18902000
> >    TCBind : 00c00001
> >    TCRestart : 8010229c r4k_wait+0x1c/0x40
> >    TCHalt : 00000000
> >    TCContext : 00780000
> > Counter Interrupts taken per CPU (TC)
> > 0: 0
> > 1: 0
> > 2: 0
> > 3: 0
> > 4: 0
> > 5: 0
> > 6: 0
> > 7: 0
> > Self-IPI invocations:
> > 0: 12
> > 1: 0
> > 2: 0
> > 3: 0
> > 4: 0
> > 5: 5
> > 6: 4
> > 7: 0
> > IPIQ[0]: head = 0x0, tail = 0x0, depth = 0
> > IPIQ[1]: head = 0x0, tail = 0x0, depth = 0
> > IPIQ[2]: head = 0x0, tail = 0x0, depth = 0
> > IPIQ[3]: head = 0x0, tail = 0x0, depth = 0
> > IPIQ[4]: head = 0x0, tail = 0x0, depth = 0
> > IPIQ[5]: head = 0x0, tail = 0x0, depth = 0
> > IPIQ[6]: head = 0x0, tail = 0x0, depth = 0
> > IPIQ[7]: head = 0x0, tail = 0x0, depth = 0
> > 0 Recoveries of "stolen" FPU
> > ===========================
> >
> > ################################################################
> >
> > Thanks
> > Anoop
> >
> > On Tue, 2010-12-28 at 00:43 -0800, Kevin D. Kissell wrote:
> >> I took a quick look last night, and the only thing that looked vaguely 
> >> dangerous in changes since the timer changes I alluded to earlier was 
> >> the global naming cleanup of irq-related function names that David 
> >> Howell submitted.  The diff didn't look dangerous in itself, but some of 
> >> the definitions are nested subtly for SMTC to maximize the amount of 
> >> common code, and I could imagine something getting lost in translation 
> >> there.  If that were really the problem, it would of course affect much 
> >> more than just the timer subsystem, but early in the boot process, 
> >> timers are pretty much the only interrupts that have to be handled 
> >> correctly.
> >>
> >> I'm travelling today, but will take a look at timekeeping_notify() 
> >> tomorrow or the next day...
> >>
> >> /K.
> >>
> >> On 12/28/10 12:19 AM, Anoop P A wrote:
> >>> Hi,
> >>>
> >>> I had a glance into the code diff without notice of any suspect-able
> >>> code .
> >>> Tracing the hang showed that it is getting hanged in timekeeping_notify
> >>> function.
> >>>
> >>> Thanks,
> >>> Anoop
> >>>
> >>> PS: I may not be available until Thursday
> >>>
> >>> On Mon, 2010-12-27 at 22:49 +0530, Anoop P A wrote:
> >>>> Hi Kevin,
> >>>>
> >>>> It is very unlikely that the patch you pointed has any impact on the the
> >>>> hang I am seeing. The patch you have mentioned got into kernel around
> >>>> 2.6.32 timeframe. I am able to boot both 2.6.32 and  2.6.33 kernel ( +
> >>>> stackframe patch) .
> >>>>
> >>>> Hi Stuart,
> >>>>
> >>>> I haven't got much time to spend on this today.
> >>>>
> >>>> I had got 2.6.36-stable(+ stack frame patch) booting last day and I have
> >>>> observed hang issue with 2.6.37-rc1 ( Same as rc6 and current git head)
> >>>>
> >>>> So probably some patches in 2.6.37 branch introduced this hang.
> >>>>
> >>>> Hopefully I will get some free slot tomorrow so that I can look into
> >>>> code diff .
> >>>>
> >>>> Thanks
> >>>> Anoop
> >>>>
> >>>> On Mon, 2010-12-27 at 09:49 -0600, STUART VENTERS wrote:
> >>>>> Kevin,
> >>>>>
> >>>>> Outstanding, sometimes it's better to be lucky than good.
> >>>>>
> >>>>>
> >>>>> Anoop,
> >>>>>
> >>>>> Maybe we can get lucky again.
> >>>>>
> >>>>> If you can isolate the .33 works/.37 works_not bug to a specific pair of versions,
> >>>>>     I'll be happy to do another diff.
> >>>>>
> >>>>>
> >>>>> Hope you'll have had a good Christmas as well.
> >>>>>    We've had snow in Alabama since Christmas eve!
> >>>>>
> >>>>>
> >>>>> Regards,
> >>>>>
> >>>>> Stuart
> >>>>>
> >>>>>
> >>>>> -----Original Message-----
> >>>>> From: Kevin D. Kissell [mailto:kevink@paralogos.com]
> >>>>> Sent: Friday, December 24, 2010 5:34 PM
> >>>>> To: Anoop P A
> >>>>> Cc: STUART VENTERS; Anoop P.A.; linux-mips@linux-mips.org
> >>>>> Subject: Re: SMTC support status in latest git head.
> >>>>>
> >>>>>
> >>>>> Ah, well, at least we have a stackframe.h fix that preserves David's
> >>>>> performance tweak for the deeper pipelined processors.  In looking for
> >>>>> this, I did notice that someone did some modification to the SMTC clock
> >>>>> tick logic that I was skeptical had ever been tested.  If you've still
> >>>>> got that kernel binary handy, you might check to see if it boots with
> >>>>> maxtcs=1 maxvpes=1, maxtcs=2 maxvpes=1, and/or maxtcs=2 maxvpes=2.
> >>>>>
> >>>>> Oh, yes, and Merry Christmas one and all!
> >>>>>
> >>>>>               Regards,
> >>>>>
> >>>>>               Kevin K.
> >>>>>
> >>>>> On 12/24/10 8:02 AM, Anoop P A wrote:
> >>>>>> On Fri, 2010-12-24 at 06:53 -0800, Kevin D. Kissell wrote:
> >>>>>>> Excellent!  Now, does the attached patch (relative to 2.6.37.11) also
> >>>>>>> fix things, while preserving the other fixes and performance enhancements?
> >>>>>>>
> >>>>>> I have tested that patch with 2.6.37 branch it well passes calibration
> >>>>>> loop but hangs after switching to mips closource
> >>>>>>
> >>>>>> TC 6 going on-line as CPU 6
> >>>>>> Brought up 7 CPUs
> >>>>>> bio: create slab<bio-0>   at 0
> >>>>>> SCSI subsystem initialized
> >>>>>> Switching to clocksource MIPS
> >>>>>>
> >>>>>> I Presume this is a different issue as restoring older file didn't help
> >>>>>> much to get rid of this hang.
> >>>>>>
> >>>>>> diff --git a/arch/mips/include/asm/stackframe.h
> >>>>>> b/arch/mips/include/asm/stackframe.h
> >>>>>> index 58730c5..7fc9f10 100644
> >>>>>> --- a/arch/mips/include/asm/stackframe.h
> >>>>>> +++ b/arch/mips/include/asm/stackframe.h
> >>>>>> @@ -195,9 +195,9 @@
> >>>>>>    		 * to cover the pipeline delay.
> >>>>>>    		 */
> >>>>>>    		.set	mips32
> >>>>>> -		mfc0	v1, CP0_TCSTATUS
> >>>>>> +		mfc0	v0, CP0_TCSTATUS
> >>>>>>    		.set	mips0
> >>>>>> -		LONG_S	v1, PT_TCSTATUS(sp)
> >>>>>> +		LONG_S	v0, PT_TCSTATUS(sp)
> >>>>>>    #endif /* CONFIG_MIPS_MT_SMTC */
> >>>>>>    		LONG_S	$4, PT_R4(sp)
> >>>>>>    		LONG_S	$5, PT_R5(sp)
> >>>>>>
> >>>>>>
> >>>>>>> /K.
> >>>>>>>
> >>>>>>> On 12/24/10 6:39 AM, Anoop P A wrote:
> >>>>>>>> Hi Kevin, Stuart ,
> >>>>>>>>
> >>>>>>>> Woohooo You guys spotted !.
> >>>>>>>>
> >>>>>>>>     http://git.linux-mips.org/?p=linux.git;a=commit;h=d5ec6e3c seems to be
> >>>>>>>> the culprit
> >>>>>>>>
> >>>>>>>> Once I restored previous version of stackframe.h 2.6.33-stable started
> >>>>>>>> booting !.
> >>>>>>>>
> >>>>>>>> Thanks,
> >>>>>>>> Anoop
> >>>>>>>>
> >>>>>>>> On Fri, 2010-12-24 at 04:32 -0800, Kevin D. Kissell wrote:
> >>>>>>>>> Thank you, Stuart!  I've spotted some definite breakage to SMTC between
> >>>>>>>>> those versions.  In arch/mips/include/asm/stackframe.h, someone moved
> >>>>>>>>> the store of the Status register value in SAVE_SOME (line 169 or 204,
> >>>>>>>>> depending on the version) from two instructions after the mfc0 to a
> >>>>>>>>> point after the #ifdef for SMTC, presumably to get better pipelining of
> >>>>>>>>> the register access.  Unfortunately, the v1 register is also used in the
> >>>>>>>>> SMTC-specific fragment to save TCStatus, so the Status value gets
> >>>>>>>>> clobbered before it gets stored.  This will eventually result in the
> >>>>>>>>> Status register getting a TCStatus value, which has some bits on common,
> >>>>>>>>> but isn't identical and sooner or later Bad Things will happen.
> >>>>>>>>>
> >>>>>>>>> I'm a little surprised this wasn't caught by visual inspection of the patch.
> >>>>>>>>>
> >>>>>>>>> Possible solutions would include reverting the store of the CP0_STATUS
> >>>>>>>>> value to the block above the #ifdef, or, to retain whatever performance
> >>>>>>>>> advantage was obtained by moving the store downward, to use v0/$2
> >>>>>>>>> instead of v1/$3, as the staging register for the TCStatus value.  I'd
> >>>>>>>>> lean toward the second option, but I'm not in a position to test and
> >>>>>>>>> submit a patch just now.
> >>>>>>>>>
> >>>>>>>>>                 Regards,
> >>>>>>>>>
> >>>>>>>>>                 Kevin K.
> >>>>>>>>>
> >>>>>>>>> On 12/23/10 1:09 PM, STUART VENTERS wrote:
> >>>>>>>>>> Kevin,
> >>>>>>>>>>
> >>>>>>>>>> I'm not sure if it's useful,
> >>>>>>>>>>        but finally I got the time to look at the two kernel versions Anoop pointed out.
> >>>>>>>>>>         works   2.6.32-stable with patch 804
> >>>>>>>>>>         works_not 2.6.33-stable
> >>>>>>>>>>
> >>>>>>>>>> greping for files with CONFIG_MIPS_MT_SMTC
> >>>>>>>>>>        and looking for timer interrupt related stuff found the following differences:
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> arch/mips/include/asm/irq.h
> >>>>>>>>>> arch/mips/kernel/irq.c
> >>>>>>>>>>       do_IRQ
> >>>>>>>>>>
> >>>>>>>>>> arch/mips/include/asm/stackframe.h
> >>>>>>>>>>       SAVE_SOME SAVE_TEMP get/set_saved_sp
> >>>>>>>>>>
> >>>>>>>>>> arch/mips/include/asm/time.h
> >>>>>>>>>>       clocksource_set_clock
> >>>>>>>>>>
> >>>>>>>>>> arch/mips/kernel/process.c
> >>>>>>>>>>       cpu_idle
> >>>>>>>>>>
> >>>>>>>>>> arch/mips/kernel/smtc.c
> >>>>>>>>>>       __irq_entry
> >>>>>>>>>>       ipi_decode
> >>>>>>>>>>           SMTC_CLOCK_TICK
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> Enclosed are the two subsets of files for a more expert look.
> >>>>>>>>>>
> >>>>>>>>>> I'll try to look in more detail after Christmas.
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> Cheers,
> >>>>>>>>>>
> >>>>>>>>>> Stuart
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >
> 

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: SMTC support status in latest git head.
  2011-01-03 15:12                                                 ` Anoop P A
@ 2011-01-03 16:14                                                   ` Kevin D. Kissell
  2011-01-03 19:20                                                     ` Anoop P A
  0 siblings, 1 reply; 68+ messages in thread
From: Kevin D. Kissell @ 2011-01-03 16:14 UTC (permalink / raw)
  To: Anoop P A; +Cc: STUART VENTERS, Anoop P.A., linux-mips

The very first SMTC implementations didn't support full kernel-mode
preemption, which anyway wasn't a priority, given the hardware event
response support in MIPS MT.  I believe it was later made compatible,
but it was never extensively exercised.  Since SMTC has fingers in some
pretty low-level atomicity mechanisms, if a new, parallel set was
implemented for RCU, I can easily imagine that nobody has yet
implemented SMTC-ified variants of that set.

Your last statement isn't very clear, though.  Are you saying that if
you configure for no forced preemption and with TREE_CPU, the 2.6.37
kernel boots all the way up, or that it simply hangs later?   What's the
last rev kernel that actually boots all the way up?

            Regards,

            Kevin K.

On 1/3/2011 7:12 AM, Anoop P A wrote:
> Hi ,
>
> Following patch restricts TREE_CPU RCU implementation only for !PREEMPT
> SMP kernel.  
> http://git.linux-mips.org/?p=linux.git;a=commit;h=687d7a960aea46e016182c7ce346d62c4dbd0366
>
> CONFIG_TREE_PREEMPT_RCU option seems to be not working for SMTC kernel
> ( which will be only available RCU implementation for SMTC kernel from
> 2.6.37 onwards) .
>
> With no forced preemption and selecting TREE_CPU I am able to boot
> further to the hang that I have reported.
>
> Thanks
> Anoop
>
> On Sat, 2011-01-01 at 00:42 -0800, Kevin D. Kissell wrote:
>> At this point the logical thing to do would seem to look at your kernel
>> image and disassemble smtc_ipi_replay(), which is where the EPC of VPE 0
>> shows the last exception to have been taken.  That's a critical SMTC
>> routine that gets called whenever an xxx_irq_restore() enables
>> interrupts, so that virtual per-TC IPI interrupts that were posted while
>> the TC had interrupts disabled can be handled deterministically.  As I
>> mentioned in an earlier message, there was some cleanup work from David
>> Howell that changed a number of irq management-related function names
>> and prototypes across all architectures, which went into linux-mips.org
>> at very roughly the time of the breakage.  The SMTC overlay over the irq
>> implementation has been pretty robust, but it's written in a perhaps
>> doomed attempt to be both efficient and using a maximum amount of common
>> code with the general case.  A mechanical or semi-mechanical change
>> could conceivably have broken things.
>>
>>             Regards,
>>
>>             Kevin K.
>>
>>
>> On 12/31/2010 4:27 AM, Anoop P A wrote:
>>> Hi ,
>>>
>>> Kernel hangs on stop_machine call. Please find mt reg dump below.
>>> Another important observation is even though 2.6.33 kernel + stackframe
>>> patch well passes calibration hang , I am still unable boot in to a
>>> initramfs root ( verified ramfs working with VSMP). So it looks like
>>> still some issue to fix between 2.6.32 and 2.6.33 .
>>> ######################## Log ###########################
>>>
>>> === MIPS MT State Dump ===
>>> -- Global State --
>>>    MVPControl Passed: 00000005
>>>    MVPControl Read: 00000004
>>>    MVPConf0 : a8008406
>>> -- per-VPE State --
>>>   VPE 0
>>>    VPEControl : 00008000
>>>    VPEConf0 : 800f0003
>>>    VPE0.Status : 11004201
>>>    VPE0.EPC : 8010dc54 smtc_ipi_replay+0xcc/0x108
>>>    VPE0.Cause : 50804000
>>>    VPE0.Config7 : 00010000
>>>   VPE 1
>>>    VPEControl : 00068006
>>>    VPEConf0 : 80cf0003
>>>    VPE1.Status : 11008301
>>>    VPE1.EPC : 801022a0 r4k_wait+0x20/0x40
>>>    VPE1.Cause : 50800000
>>>    VPE1.Config7 : 00010000
>>> -- per-TC State --
>>>   TC 0 (current TC with VPE EPC above)
>>>    TCStatus : 18102000
>>>    TCBind : 00000000
>>>    TCRestart : 803fa19c printk+0xc/0x30
>>>    TCHalt : 00000000
>>>    TCContext : 00000000
>>>   TC 1
>>>    TCStatus : 18902000
>>>    TCBind : 00200000
>>>    TCRestart : 801022a0 r4k_wait+0x20/0x40
>>>    TCHalt : 00000000
>>>    TCContext : 00140000
>>>   TC 2
>>>    TCStatus : 18902000
>>>    TCBind : 00400000
>>>    TCRestart : 801022a0 r4k_wait+0x20/0x40
>>>    TCHalt : 00000000
>>>    TCContext : 00280000
>>>   TC 3
>>>    TCStatus : 18902000
>>>    TCBind : 00600000
>>>    TCRestart : 801022a0 r4k_wait+0x20/0x40
>>>    TCHalt : 00000000
>>>    TCContext : 003c0000
>>>   TC 4
>>>    TCStatus : 18902000
>>>    TCBind : 00800001
>>>    TCRestart : 8010229c r4k_wait+0x1c/0x40
>>>    TCHalt : 00000000
>>>    TCContext : 00500000
>>>   TC 5
>>>    TCStatus : 18902000
>>>    TCBind : 00a00001
>>>    TCRestart : 8010229c r4k_wait+0x1c/0x40
>>>    TCHalt : 00000000
>>>    TCContext : 00640000
>>>   TC 6
>>>    TCStatus : 18902000
>>>    TCBind : 00c00001
>>>    TCRestart : 8010229c r4k_wait+0x1c/0x40
>>>    TCHalt : 00000000
>>>    TCContext : 00780000
>>> Counter Interrupts taken per CPU (TC)
>>> 0: 0
>>> 1: 0
>>> 2: 0
>>> 3: 0
>>> 4: 0
>>> 5: 0
>>> 6: 0
>>> 7: 0
>>> Self-IPI invocations:
>>> 0: 12
>>> 1: 0
>>> 2: 0
>>> 3: 0
>>> 4: 0
>>> 5: 5
>>> 6: 4
>>> 7: 0
>>> IPIQ[0]: head = 0x0, tail = 0x0, depth = 0
>>> IPIQ[1]: head = 0x0, tail = 0x0, depth = 0
>>> IPIQ[2]: head = 0x0, tail = 0x0, depth = 0
>>> IPIQ[3]: head = 0x0, tail = 0x0, depth = 0
>>> IPIQ[4]: head = 0x0, tail = 0x0, depth = 0
>>> IPIQ[5]: head = 0x0, tail = 0x0, depth = 0
>>> IPIQ[6]: head = 0x0, tail = 0x0, depth = 0
>>> IPIQ[7]: head = 0x0, tail = 0x0, depth = 0
>>> 0 Recoveries of "stolen" FPU
>>> ===========================
>>>
>>> ################################################################
>>>
>>> Thanks
>>> Anoop
>>>
>>> On Tue, 2010-12-28 at 00:43 -0800, Kevin D. Kissell wrote:
>>>> I took a quick look last night, and the only thing that looked vaguely 
>>>> dangerous in changes since the timer changes I alluded to earlier was 
>>>> the global naming cleanup of irq-related function names that David 
>>>> Howell submitted.  The diff didn't look dangerous in itself, but some of 
>>>> the definitions are nested subtly for SMTC to maximize the amount of 
>>>> common code, and I could imagine something getting lost in translation 
>>>> there.  If that were really the problem, it would of course affect much 
>>>> more than just the timer subsystem, but early in the boot process, 
>>>> timers are pretty much the only interrupts that have to be handled 
>>>> correctly.
>>>>
>>>> I'm travelling today, but will take a look at timekeeping_notify() 
>>>> tomorrow or the next day...
>>>>
>>>> /K.
>>>>
>>>> On 12/28/10 12:19 AM, Anoop P A wrote:
>>>>> Hi,
>>>>>
>>>>> I had a glance into the code diff without notice of any suspect-able
>>>>> code .
>>>>> Tracing the hang showed that it is getting hanged in timekeeping_notify
>>>>> function.
>>>>>
>>>>> Thanks,
>>>>> Anoop
>>>>>
>>>>> PS: I may not be available until Thursday
>>>>>
>>>>> On Mon, 2010-12-27 at 22:49 +0530, Anoop P A wrote:
>>>>>> Hi Kevin,
>>>>>>
>>>>>> It is very unlikely that the patch you pointed has any impact on the the
>>>>>> hang I am seeing. The patch you have mentioned got into kernel around
>>>>>> 2.6.32 timeframe. I am able to boot both 2.6.32 and  2.6.33 kernel ( +
>>>>>> stackframe patch) .
>>>>>>
>>>>>> Hi Stuart,
>>>>>>
>>>>>> I haven't got much time to spend on this today.
>>>>>>
>>>>>> I had got 2.6.36-stable(+ stack frame patch) booting last day and I have
>>>>>> observed hang issue with 2.6.37-rc1 ( Same as rc6 and current git head)
>>>>>>
>>>>>> So probably some patches in 2.6.37 branch introduced this hang.
>>>>>>
>>>>>> Hopefully I will get some free slot tomorrow so that I can look into
>>>>>> code diff .
>>>>>>
>>>>>> Thanks
>>>>>> Anoop
>>>>>>
>>>>>> On Mon, 2010-12-27 at 09:49 -0600, STUART VENTERS wrote:
>>>>>>> Kevin,
>>>>>>>
>>>>>>> Outstanding, sometimes it's better to be lucky than good.
>>>>>>>
>>>>>>>
>>>>>>> Anoop,
>>>>>>>
>>>>>>> Maybe we can get lucky again.
>>>>>>>
>>>>>>> If you can isolate the .33 works/.37 works_not bug to a specific pair of versions,
>>>>>>>     I'll be happy to do another diff.
>>>>>>>
>>>>>>>
>>>>>>> Hope you'll have had a good Christmas as well.
>>>>>>>    We've had snow in Alabama since Christmas eve!
>>>>>>>
>>>>>>>
>>>>>>> Regards,
>>>>>>>
>>>>>>> Stuart
>>>>>>>
>>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: Kevin D. Kissell [mailto:kevink@paralogos.com]
>>>>>>> Sent: Friday, December 24, 2010 5:34 PM
>>>>>>> To: Anoop P A
>>>>>>> Cc: STUART VENTERS; Anoop P.A.; linux-mips@linux-mips.org
>>>>>>> Subject: Re: SMTC support status in latest git head.
>>>>>>>
>>>>>>>
>>>>>>> Ah, well, at least we have a stackframe.h fix that preserves David's
>>>>>>> performance tweak for the deeper pipelined processors.  In looking for
>>>>>>> this, I did notice that someone did some modification to the SMTC clock
>>>>>>> tick logic that I was skeptical had ever been tested.  If you've still
>>>>>>> got that kernel binary handy, you might check to see if it boots with
>>>>>>> maxtcs=1 maxvpes=1, maxtcs=2 maxvpes=1, and/or maxtcs=2 maxvpes=2.
>>>>>>>
>>>>>>> Oh, yes, and Merry Christmas one and all!
>>>>>>>
>>>>>>>               Regards,
>>>>>>>
>>>>>>>               Kevin K.
>>>>>>>
>>>>>>> On 12/24/10 8:02 AM, Anoop P A wrote:
>>>>>>>> On Fri, 2010-12-24 at 06:53 -0800, Kevin D. Kissell wrote:
>>>>>>>>> Excellent!  Now, does the attached patch (relative to 2.6.37.11) also
>>>>>>>>> fix things, while preserving the other fixes and performance enhancements?
>>>>>>>>>
>>>>>>>> I have tested that patch with 2.6.37 branch it well passes calibration
>>>>>>>> loop but hangs after switching to mips closource
>>>>>>>>
>>>>>>>> TC 6 going on-line as CPU 6
>>>>>>>> Brought up 7 CPUs
>>>>>>>> bio: create slab<bio-0>   at 0
>>>>>>>> SCSI subsystem initialized
>>>>>>>> Switching to clocksource MIPS
>>>>>>>>
>>>>>>>> I Presume this is a different issue as restoring older file didn't help
>>>>>>>> much to get rid of this hang.
>>>>>>>>
>>>>>>>> diff --git a/arch/mips/include/asm/stackframe.h
>>>>>>>> b/arch/mips/include/asm/stackframe.h
>>>>>>>> index 58730c5..7fc9f10 100644
>>>>>>>> --- a/arch/mips/include/asm/stackframe.h
>>>>>>>> +++ b/arch/mips/include/asm/stackframe.h
>>>>>>>> @@ -195,9 +195,9 @@
>>>>>>>>    		 * to cover the pipeline delay.
>>>>>>>>    		 */
>>>>>>>>    		.set	mips32
>>>>>>>> -		mfc0	v1, CP0_TCSTATUS
>>>>>>>> +		mfc0	v0, CP0_TCSTATUS
>>>>>>>>    		.set	mips0
>>>>>>>> -		LONG_S	v1, PT_TCSTATUS(sp)
>>>>>>>> +		LONG_S	v0, PT_TCSTATUS(sp)
>>>>>>>>    #endif /* CONFIG_MIPS_MT_SMTC */
>>>>>>>>    		LONG_S	$4, PT_R4(sp)
>>>>>>>>    		LONG_S	$5, PT_R5(sp)
>>>>>>>>
>>>>>>>>
>>>>>>>>> /K.
>>>>>>>>>
>>>>>>>>> On 12/24/10 6:39 AM, Anoop P A wrote:
>>>>>>>>>> Hi Kevin, Stuart ,
>>>>>>>>>>
>>>>>>>>>> Woohooo You guys spotted !.
>>>>>>>>>>
>>>>>>>>>>     http://git.linux-mips.org/?p=linux.git;a=commit;h=d5ec6e3c seems to be
>>>>>>>>>> the culprit
>>>>>>>>>>
>>>>>>>>>> Once I restored previous version of stackframe.h 2.6.33-stable started
>>>>>>>>>> booting !.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Anoop
>>>>>>>>>>
>>>>>>>>>> On Fri, 2010-12-24 at 04:32 -0800, Kevin D. Kissell wrote:
>>>>>>>>>>> Thank you, Stuart!  I've spotted some definite breakage to SMTC between
>>>>>>>>>>> those versions.  In arch/mips/include/asm/stackframe.h, someone moved
>>>>>>>>>>> the store of the Status register value in SAVE_SOME (line 169 or 204,
>>>>>>>>>>> depending on the version) from two instructions after the mfc0 to a
>>>>>>>>>>> point after the #ifdef for SMTC, presumably to get better pipelining of
>>>>>>>>>>> the register access.  Unfortunately, the v1 register is also used in the
>>>>>>>>>>> SMTC-specific fragment to save TCStatus, so the Status value gets
>>>>>>>>>>> clobbered before it gets stored.  This will eventually result in the
>>>>>>>>>>> Status register getting a TCStatus value, which has some bits on common,
>>>>>>>>>>> but isn't identical and sooner or later Bad Things will happen.
>>>>>>>>>>>
>>>>>>>>>>> I'm a little surprised this wasn't caught by visual inspection of the patch.
>>>>>>>>>>>
>>>>>>>>>>> Possible solutions would include reverting the store of the CP0_STATUS
>>>>>>>>>>> value to the block above the #ifdef, or, to retain whatever performance
>>>>>>>>>>> advantage was obtained by moving the store downward, to use v0/$2
>>>>>>>>>>> instead of v1/$3, as the staging register for the TCStatus value.  I'd
>>>>>>>>>>> lean toward the second option, but I'm not in a position to test and
>>>>>>>>>>> submit a patch just now.
>>>>>>>>>>>
>>>>>>>>>>>                 Regards,
>>>>>>>>>>>
>>>>>>>>>>>                 Kevin K.
>>>>>>>>>>>
>>>>>>>>>>> On 12/23/10 1:09 PM, STUART VENTERS wrote:
>>>>>>>>>>>> Kevin,
>>>>>>>>>>>>
>>>>>>>>>>>> I'm not sure if it's useful,
>>>>>>>>>>>>        but finally I got the time to look at the two kernel versions Anoop pointed out.
>>>>>>>>>>>>         works   2.6.32-stable with patch 804
>>>>>>>>>>>>         works_not 2.6.33-stable
>>>>>>>>>>>>
>>>>>>>>>>>> greping for files with CONFIG_MIPS_MT_SMTC
>>>>>>>>>>>>        and looking for timer interrupt related stuff found the following differences:
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> arch/mips/include/asm/irq.h
>>>>>>>>>>>> arch/mips/kernel/irq.c
>>>>>>>>>>>>       do_IRQ
>>>>>>>>>>>>
>>>>>>>>>>>> arch/mips/include/asm/stackframe.h
>>>>>>>>>>>>       SAVE_SOME SAVE_TEMP get/set_saved_sp
>>>>>>>>>>>>
>>>>>>>>>>>> arch/mips/include/asm/time.h
>>>>>>>>>>>>       clocksource_set_clock
>>>>>>>>>>>>
>>>>>>>>>>>> arch/mips/kernel/process.c
>>>>>>>>>>>>       cpu_idle
>>>>>>>>>>>>
>>>>>>>>>>>> arch/mips/kernel/smtc.c
>>>>>>>>>>>>       __irq_entry
>>>>>>>>>>>>       ipi_decode
>>>>>>>>>>>>           SMTC_CLOCK_TICK
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Enclosed are the two subsets of files for a more expert look.
>>>>>>>>>>>>
>>>>>>>>>>>> I'll try to look in more detail after Christmas.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>
>>>>>>>>>>>> Stuart
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>
>

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: SMTC support status in latest git head.
  2011-01-03 16:14                                                   ` Kevin D. Kissell
@ 2011-01-03 19:20                                                     ` Anoop P A
  2011-01-04  8:17                                                       ` Kevin D. Kissell
  0 siblings, 1 reply; 68+ messages in thread
From: Anoop P A @ 2011-01-03 19:20 UTC (permalink / raw)
  To: Kevin D. Kissell; +Cc: STUART VENTERS, Anoop P.A., linux-mips

Hi Kevin,

On Mon, 2011-01-03 at 08:14 -0800, Kevin D. Kissell wrote:
> The very first SMTC implementations didn't support full kernel-mode
> preemption, which anyway wasn't a priority, given the hardware event
> response support in MIPS MT.  I believe it was later made compatible,
> but it was never extensively exercised.  Since SMTC has fingers in some
> pretty low-level atomicity mechanisms, if a new, parallel set was
> implemented for RCU, I can easily imagine that nobody has yet
> implemented SMTC-ified variants of that set.
> 
> Your last statement isn't very clear, though.  Are you saying that if
> you configure for no forced preemption and with TREE_CPU, the 2.6.37
> kernel boots all the way up, or that it simply hangs later?   What's the
> last rev kernel that actually boots all the way up?

I have debugged this a bit more. It seems that kernel getting stalled
while executing on TC's of second VPE . 
INFO: rcu_sched_state detected stalls on CPUs/tasks: { 4 5 6} (detected
by 1, t=2504 jiffies)
INFO: rcu_sched_state detected stalls on CPUs/tasks: { 4 5 6} (detected
by 1, t=10036 jiffies)
INFO: rcu_sched_state detected stalls on CPUs/tasks: { 4 5 6} (detected
by 1, t=17568 jiffies)
INFO: rcu_sched_state detected stalls on CPUs/tasks: { 4 5 6} (detected
by 1, t=25100 jiffies)
INFO: rcu_sched_state detected stalls on CPUs/tasks: { 4 5 6} (detected
by 1, t=32632 jiffies)
INFO: rcu_sched_state detected stalls on CPUs/tasks: { 4 5 6} (detected
by 1, t=40164 jiffies)

With CONFIG_TREE_CPU we were not hitting this scenario very often.
However with CONFIG_PREEMPT_TREE_CPU stall happens most of the time.

I presume some issue in my timer setup . I am not seeing timer interrupt
(or IPI interrupt) getting  incremented for VPE1 tcs on a completely
booted 2.6.32-stable kernel.

/ # cat /proc/interrupts
           CPU0       CPU1       CPU2       CPU3       CPU4       CPU5
CPU6
  1:        148      15023      15140      15093       3779          8
2            MIPS  SMTC_IPI
  6:          0          0          0          0          0          0
0            MIPS  MSP CIC cascade
  8:          0          0          0          0          0          0
0         MSP_CIC  Softreset button
  9:          0          0          0          0          0          0
0         MSP_CIC  Standby switch
 21:          0          0          0          0          0          0
0         MSP_CIC  MSP PER cascade
 25:      15113        341          4          7          0          0
0         MSP_CIC  timer
 27:        260          9          0          1          0          0
0         MSP_CIC  serial
 34:          0          0          0          0          0          0
0         MSP_CIC  timer

Can't we use separate timer interrupts for VPE1 and VPE0 in SMTC ?. 

I have tried setting up VPE1 timer from get_co_compare_int as follows

unsigned int __cpuinit get_c0_compare_int(void)
{
	if ((1==get_current_vpe()) && !vpe1_timr_installed){
	
	memcpy(&timer_vpe1,&c0_compare_irqaction,sizeof(timer_vpe1));
	
	setup_irq(MSP_INT_VPE1_TIMER, &timer_vpe1);
                  vpe1_timr_installed++;
          }
          return (get_current_vpe() ? MSP_INT_VPE1_TIMER :
MSP_INT_VPE0_TIMER);
}

Thanks
Anoop

> 
>             Regards,
> 
>             Kevin K.
> 
> On 1/3/2011 7:12 AM, Anoop P A wrote:
> > Hi ,
> >
> > Following patch restricts TREE_CPU RCU implementation only for !PREEMPT
> > SMP kernel.  
> > http://git.linux-mips.org/?p=linux.git;a=commit;h=687d7a960aea46e016182c7ce346d62c4dbd0366
> >
> > CONFIG_TREE_PREEMPT_RCU option seems to be not working for SMTC kernel
> > ( which will be only available RCU implementation for SMTC kernel from
> > 2.6.37 onwards) .
> >
> > With no forced preemption and selecting TREE_CPU I am able to boot
> > further to the hang that I have reported.
> >
> > Thanks
> > Anoop
> >
> > On Sat, 2011-01-01 at 00:42 -0800, Kevin D. Kissell wrote:
> >> At this point the logical thing to do would seem to look at your kernel
> >> image and disassemble smtc_ipi_replay(), which is where the EPC of VPE 0
> >> shows the last exception to have been taken.  That's a critical SMTC
> >> routine that gets called whenever an xxx_irq_restore() enables
> >> interrupts, so that virtual per-TC IPI interrupts that were posted while
> >> the TC had interrupts disabled can be handled deterministically.  As I
> >> mentioned in an earlier message, there was some cleanup work from David
> >> Howell that changed a number of irq management-related function names
> >> and prototypes across all architectures, which went into linux-mips.org
> >> at very roughly the time of the breakage.  The SMTC overlay over the irq
> >> implementation has been pretty robust, but it's written in a perhaps
> >> doomed attempt to be both efficient and using a maximum amount of common
> >> code with the general case.  A mechanical or semi-mechanical change
> >> could conceivably have broken things.
> >>
> >>             Regards,
> >>
> >>             Kevin K.
> >>
> >>
> >> On 12/31/2010 4:27 AM, Anoop P A wrote:
> >>> Hi ,
> >>>
> >>> Kernel hangs on stop_machine call. Please find mt reg dump below.
> >>> Another important observation is even though 2.6.33 kernel + stackframe
> >>> patch well passes calibration hang , I am still unable boot in to a
> >>> initramfs root ( verified ramfs working with VSMP). So it looks like
> >>> still some issue to fix between 2.6.32 and 2.6.33 .
> >>> ######################## Log ###########################
> >>>
> >>> === MIPS MT State Dump ===
> >>> -- Global State --
> >>>    MVPControl Passed: 00000005
> >>>    MVPControl Read: 00000004
> >>>    MVPConf0 : a8008406
> >>> -- per-VPE State --
> >>>   VPE 0
> >>>    VPEControl : 00008000
> >>>    VPEConf0 : 800f0003
> >>>    VPE0.Status : 11004201
> >>>    VPE0.EPC : 8010dc54 smtc_ipi_replay+0xcc/0x108
> >>>    VPE0.Cause : 50804000
> >>>    VPE0.Config7 : 00010000
> >>>   VPE 1
> >>>    VPEControl : 00068006
> >>>    VPEConf0 : 80cf0003
> >>>    VPE1.Status : 11008301
> >>>    VPE1.EPC : 801022a0 r4k_wait+0x20/0x40
> >>>    VPE1.Cause : 50800000
> >>>    VPE1.Config7 : 00010000
> >>> -- per-TC State --
> >>>   TC 0 (current TC with VPE EPC above)
> >>>    TCStatus : 18102000
> >>>    TCBind : 00000000
> >>>    TCRestart : 803fa19c printk+0xc/0x30
> >>>    TCHalt : 00000000
> >>>    TCContext : 00000000
> >>>   TC 1
> >>>    TCStatus : 18902000
> >>>    TCBind : 00200000
> >>>    TCRestart : 801022a0 r4k_wait+0x20/0x40
> >>>    TCHalt : 00000000
> >>>    TCContext : 00140000
> >>>   TC 2
> >>>    TCStatus : 18902000
> >>>    TCBind : 00400000
> >>>    TCRestart : 801022a0 r4k_wait+0x20/0x40
> >>>    TCHalt : 00000000
> >>>    TCContext : 00280000
> >>>   TC 3
> >>>    TCStatus : 18902000
> >>>    TCBind : 00600000
> >>>    TCRestart : 801022a0 r4k_wait+0x20/0x40
> >>>    TCHalt : 00000000
> >>>    TCContext : 003c0000
> >>>   TC 4
> >>>    TCStatus : 18902000
> >>>    TCBind : 00800001
> >>>    TCRestart : 8010229c r4k_wait+0x1c/0x40
> >>>    TCHalt : 00000000
> >>>    TCContext : 00500000
> >>>   TC 5
> >>>    TCStatus : 18902000
> >>>    TCBind : 00a00001
> >>>    TCRestart : 8010229c r4k_wait+0x1c/0x40
> >>>    TCHalt : 00000000
> >>>    TCContext : 00640000
> >>>   TC 6
> >>>    TCStatus : 18902000
> >>>    TCBind : 00c00001
> >>>    TCRestart : 8010229c r4k_wait+0x1c/0x40
> >>>    TCHalt : 00000000
> >>>    TCContext : 00780000
> >>> Counter Interrupts taken per CPU (TC)
> >>> 0: 0
> >>> 1: 0
> >>> 2: 0
> >>> 3: 0
> >>> 4: 0
> >>> 5: 0
> >>> 6: 0
> >>> 7: 0
> >>> Self-IPI invocations:
> >>> 0: 12
> >>> 1: 0
> >>> 2: 0
> >>> 3: 0
> >>> 4: 0
> >>> 5: 5
> >>> 6: 4
> >>> 7: 0
> >>> IPIQ[0]: head = 0x0, tail = 0x0, depth = 0
> >>> IPIQ[1]: head = 0x0, tail = 0x0, depth = 0
> >>> IPIQ[2]: head = 0x0, tail = 0x0, depth = 0
> >>> IPIQ[3]: head = 0x0, tail = 0x0, depth = 0
> >>> IPIQ[4]: head = 0x0, tail = 0x0, depth = 0
> >>> IPIQ[5]: head = 0x0, tail = 0x0, depth = 0
> >>> IPIQ[6]: head = 0x0, tail = 0x0, depth = 0
> >>> IPIQ[7]: head = 0x0, tail = 0x0, depth = 0
> >>> 0 Recoveries of "stolen" FPU
> >>> ===========================
> >>>
> >>> ################################################################
> >>>
> >>> Thanks
> >>> Anoop
> >>>
> >>> On Tue, 2010-12-28 at 00:43 -0800, Kevin D. Kissell wrote:
> >>>> I took a quick look last night, and the only thing that looked vaguely 
> >>>> dangerous in changes since the timer changes I alluded to earlier was 
> >>>> the global naming cleanup of irq-related function names that David 
> >>>> Howell submitted.  The diff didn't look dangerous in itself, but some of 
> >>>> the definitions are nested subtly for SMTC to maximize the amount of 
> >>>> common code, and I could imagine something getting lost in translation 
> >>>> there.  If that were really the problem, it would of course affect much 
> >>>> more than just the timer subsystem, but early in the boot process, 
> >>>> timers are pretty much the only interrupts that have to be handled 
> >>>> correctly.
> >>>>
> >>>> I'm travelling today, but will take a look at timekeeping_notify() 
> >>>> tomorrow or the next day...
> >>>>
> >>>> /K.
> >>>>
> >>>> On 12/28/10 12:19 AM, Anoop P A wrote:
> >>>>> Hi,
> >>>>>
> >>>>> I had a glance into the code diff without notice of any suspect-able
> >>>>> code .
> >>>>> Tracing the hang showed that it is getting hanged in timekeeping_notify
> >>>>> function.
> >>>>>
> >>>>> Thanks,
> >>>>> Anoop
> >>>>>
> >>>>> PS: I may not be available until Thursday
> >>>>>
> >>>>> On Mon, 2010-12-27 at 22:49 +0530, Anoop P A wrote:
> >>>>>> Hi Kevin,
> >>>>>>
> >>>>>> It is very unlikely that the patch you pointed has any impact on the the
> >>>>>> hang I am seeing. The patch you have mentioned got into kernel around
> >>>>>> 2.6.32 timeframe. I am able to boot both 2.6.32 and  2.6.33 kernel ( +
> >>>>>> stackframe patch) .
> >>>>>>
> >>>>>> Hi Stuart,
> >>>>>>
> >>>>>> I haven't got much time to spend on this today.
> >>>>>>
> >>>>>> I had got 2.6.36-stable(+ stack frame patch) booting last day and I have
> >>>>>> observed hang issue with 2.6.37-rc1 ( Same as rc6 and current git head)
> >>>>>>
> >>>>>> So probably some patches in 2.6.37 branch introduced this hang.
> >>>>>>
> >>>>>> Hopefully I will get some free slot tomorrow so that I can look into
> >>>>>> code diff .
> >>>>>>
> >>>>>> Thanks
> >>>>>> Anoop
> >>>>>>
> >>>>>> On Mon, 2010-12-27 at 09:49 -0600, STUART VENTERS wrote:
> >>>>>>> Kevin,
> >>>>>>>
> >>>>>>> Outstanding, sometimes it's better to be lucky than good.
> >>>>>>>
> >>>>>>>
> >>>>>>> Anoop,
> >>>>>>>
> >>>>>>> Maybe we can get lucky again.
> >>>>>>>
> >>>>>>> If you can isolate the .33 works/.37 works_not bug to a specific pair of versions,
> >>>>>>>     I'll be happy to do another diff.
> >>>>>>>
> >>>>>>>
> >>>>>>> Hope you'll have had a good Christmas as well.
> >>>>>>>    We've had snow in Alabama since Christmas eve!
> >>>>>>>
> >>>>>>>
> >>>>>>> Regards,
> >>>>>>>
> >>>>>>> Stuart
> >>>>>>>
> >>>>>>>
> >>>>>>> -----Original Message-----
> >>>>>>> From: Kevin D. Kissell [mailto:kevink@paralogos.com]
> >>>>>>> Sent: Friday, December 24, 2010 5:34 PM
> >>>>>>> To: Anoop P A
> >>>>>>> Cc: STUART VENTERS; Anoop P.A.; linux-mips@linux-mips.org
> >>>>>>> Subject: Re: SMTC support status in latest git head.
> >>>>>>>
> >>>>>>>
> >>>>>>> Ah, well, at least we have a stackframe.h fix that preserves David's
> >>>>>>> performance tweak for the deeper pipelined processors.  In looking for
> >>>>>>> this, I did notice that someone did some modification to the SMTC clock
> >>>>>>> tick logic that I was skeptical had ever been tested.  If you've still
> >>>>>>> got that kernel binary handy, you might check to see if it boots with
> >>>>>>> maxtcs=1 maxvpes=1, maxtcs=2 maxvpes=1, and/or maxtcs=2 maxvpes=2.
> >>>>>>>
> >>>>>>> Oh, yes, and Merry Christmas one and all!
> >>>>>>>
> >>>>>>>               Regards,
> >>>>>>>
> >>>>>>>               Kevin K.
> >>>>>>>
> >>>>>>> On 12/24/10 8:02 AM, Anoop P A wrote:
> >>>>>>>> On Fri, 2010-12-24 at 06:53 -0800, Kevin D. Kissell wrote:
> >>>>>>>>> Excellent!  Now, does the attached patch (relative to 2.6.37.11) also
> >>>>>>>>> fix things, while preserving the other fixes and performance enhancements?
> >>>>>>>>>
> >>>>>>>> I have tested that patch with 2.6.37 branch it well passes calibration
> >>>>>>>> loop but hangs after switching to mips closource
> >>>>>>>>
> >>>>>>>> TC 6 going on-line as CPU 6
> >>>>>>>> Brought up 7 CPUs
> >>>>>>>> bio: create slab<bio-0>   at 0
> >>>>>>>> SCSI subsystem initialized
> >>>>>>>> Switching to clocksource MIPS
> >>>>>>>>
> >>>>>>>> I Presume this is a different issue as restoring older file didn't help
> >>>>>>>> much to get rid of this hang.
> >>>>>>>>
> >>>>>>>> diff --git a/arch/mips/include/asm/stackframe.h
> >>>>>>>> b/arch/mips/include/asm/stackframe.h
> >>>>>>>> index 58730c5..7fc9f10 100644
> >>>>>>>> --- a/arch/mips/include/asm/stackframe.h
> >>>>>>>> +++ b/arch/mips/include/asm/stackframe.h
> >>>>>>>> @@ -195,9 +195,9 @@
> >>>>>>>>    		 * to cover the pipeline delay.
> >>>>>>>>    		 */
> >>>>>>>>    		.set	mips32
> >>>>>>>> -		mfc0	v1, CP0_TCSTATUS
> >>>>>>>> +		mfc0	v0, CP0_TCSTATUS
> >>>>>>>>    		.set	mips0
> >>>>>>>> -		LONG_S	v1, PT_TCSTATUS(sp)
> >>>>>>>> +		LONG_S	v0, PT_TCSTATUS(sp)
> >>>>>>>>    #endif /* CONFIG_MIPS_MT_SMTC */
> >>>>>>>>    		LONG_S	$4, PT_R4(sp)
> >>>>>>>>    		LONG_S	$5, PT_R5(sp)
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>> /K.
> >>>>>>>>>
> >>>>>>>>> On 12/24/10 6:39 AM, Anoop P A wrote:
> >>>>>>>>>> Hi Kevin, Stuart ,
> >>>>>>>>>>
> >>>>>>>>>> Woohooo You guys spotted !.
> >>>>>>>>>>
> >>>>>>>>>>     http://git.linux-mips.org/?p=linux.git;a=commit;h=d5ec6e3c seems to be
> >>>>>>>>>> the culprit
> >>>>>>>>>>
> >>>>>>>>>> Once I restored previous version of stackframe.h 2.6.33-stable started
> >>>>>>>>>> booting !.
> >>>>>>>>>>
> >>>>>>>>>> Thanks,
> >>>>>>>>>> Anoop
> >>>>>>>>>>
> >>>>>>>>>> On Fri, 2010-12-24 at 04:32 -0800, Kevin D. Kissell wrote:
> >>>>>>>>>>> Thank you, Stuart!  I've spotted some definite breakage to SMTC between
> >>>>>>>>>>> those versions.  In arch/mips/include/asm/stackframe.h, someone moved
> >>>>>>>>>>> the store of the Status register value in SAVE_SOME (line 169 or 204,
> >>>>>>>>>>> depending on the version) from two instructions after the mfc0 to a
> >>>>>>>>>>> point after the #ifdef for SMTC, presumably to get better pipelining of
> >>>>>>>>>>> the register access.  Unfortunately, the v1 register is also used in the
> >>>>>>>>>>> SMTC-specific fragment to save TCStatus, so the Status value gets
> >>>>>>>>>>> clobbered before it gets stored.  This will eventually result in the
> >>>>>>>>>>> Status register getting a TCStatus value, which has some bits on common,
> >>>>>>>>>>> but isn't identical and sooner or later Bad Things will happen.
> >>>>>>>>>>>
> >>>>>>>>>>> I'm a little surprised this wasn't caught by visual inspection of the patch.
> >>>>>>>>>>>
> >>>>>>>>>>> Possible solutions would include reverting the store of the CP0_STATUS
> >>>>>>>>>>> value to the block above the #ifdef, or, to retain whatever performance
> >>>>>>>>>>> advantage was obtained by moving the store downward, to use v0/$2
> >>>>>>>>>>> instead of v1/$3, as the staging register for the TCStatus value.  I'd
> >>>>>>>>>>> lean toward the second option, but I'm not in a position to test and
> >>>>>>>>>>> submit a patch just now.
> >>>>>>>>>>>
> >>>>>>>>>>>                 Regards,
> >>>>>>>>>>>
> >>>>>>>>>>>                 Kevin K.
> >>>>>>>>>>>
> >>>>>>>>>>> On 12/23/10 1:09 PM, STUART VENTERS wrote:
> >>>>>>>>>>>> Kevin,
> >>>>>>>>>>>>
> >>>>>>>>>>>> I'm not sure if it's useful,
> >>>>>>>>>>>>        but finally I got the time to look at the two kernel versions Anoop pointed out.
> >>>>>>>>>>>>         works   2.6.32-stable with patch 804
> >>>>>>>>>>>>         works_not 2.6.33-stable
> >>>>>>>>>>>>
> >>>>>>>>>>>> greping for files with CONFIG_MIPS_MT_SMTC
> >>>>>>>>>>>>        and looking for timer interrupt related stuff found the following differences:
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> arch/mips/include/asm/irq.h
> >>>>>>>>>>>> arch/mips/kernel/irq.c
> >>>>>>>>>>>>       do_IRQ
> >>>>>>>>>>>>
> >>>>>>>>>>>> arch/mips/include/asm/stackframe.h
> >>>>>>>>>>>>       SAVE_SOME SAVE_TEMP get/set_saved_sp
> >>>>>>>>>>>>
> >>>>>>>>>>>> arch/mips/include/asm/time.h
> >>>>>>>>>>>>       clocksource_set_clock
> >>>>>>>>>>>>
> >>>>>>>>>>>> arch/mips/kernel/process.c
> >>>>>>>>>>>>       cpu_idle
> >>>>>>>>>>>>
> >>>>>>>>>>>> arch/mips/kernel/smtc.c
> >>>>>>>>>>>>       __irq_entry
> >>>>>>>>>>>>       ipi_decode
> >>>>>>>>>>>>           SMTC_CLOCK_TICK
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> Enclosed are the two subsets of files for a more expert look.
> >>>>>>>>>>>>
> >>>>>>>>>>>> I'll try to look in more detail after Christmas.
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> Cheers,
> >>>>>>>>>>>>
> >>>>>>>>>>>> Stuart
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >
> >
> 

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: SMTC support status in latest git head.
  2011-01-03 19:20                                                     ` Anoop P A
@ 2011-01-04  8:17                                                       ` Kevin D. Kissell
  2011-01-04 13:02                                                         ` Anoop P A
  0 siblings, 1 reply; 68+ messages in thread
From: Kevin D. Kissell @ 2011-01-04  8:17 UTC (permalink / raw)
  To: Anoop P A; +Cc: STUART VENTERS, Anoop P.A., linux-mips

Those interrupt counters show that IPIs are being taken everywhere,
though very few by CPUs 5 and 6.  If I understand the configuration
correctly, CPU 4 is a TC in VPE 1, and it's getting a reasonable IPI
rate, *if* we're looking at a tickless kernel under low load.  But there
may be a clue there to part of your problem.  I have no idea why the
behavior would have changed from 2.6.36 to 2.6.37, but it looks as if
you're getting your clock interrupts through the MSP CIC interrupt
controller on VPE 0.  There's nothing symmetric for VPE1. The Malta
example code is perhaps deceptively simple, in that both VPEs have their
count/compare indication wired directly to the 2 clock interrupt inputs,
so that having both of them running with only a single set of irq state
just works.  I don't know whether the MSP CIC timer interrupt is a
gating of the VPE0 count/compare output, or whether it's it's own
interval timer, but I suspect that you may need to do some further
low-level initialization in the platform-specific code to set up an
interrupt on the VPE1 side.  I don't think the snippet you've got below
would work as written.

If it's purely an issue with clock distribution on VPE1, then a boot
with maxvpes=1 maxtcs=4 should be stable.

/K.

On 1/3/2011 11:20 AM, Anoop P A wrote:
> Hi Kevin,
>
> On Mon, 2011-01-03 at 08:14 -0800, Kevin D. Kissell wrote:
>> The very first SMTC implementations didn't support full kernel-mode
>> preemption, which anyway wasn't a priority, given the hardware event
>> response support in MIPS MT.  I believe it was later made compatible,
>> but it was never extensively exercised.  Since SMTC has fingers in some
>> pretty low-level atomicity mechanisms, if a new, parallel set was
>> implemented for RCU, I can easily imagine that nobody has yet
>> implemented SMTC-ified variants of that set.
>>
>> Your last statement isn't very clear, though.  Are you saying that if
>> you configure for no forced preemption and with TREE_CPU, the 2.6.37
>> kernel boots all the way up, or that it simply hangs later?   What's the
>> last rev kernel that actually boots all the way up?
> I have debugged this a bit more. It seems that kernel getting stalled
> while executing on TC's of second VPE . 
> INFO: rcu_sched_state detected stalls on CPUs/tasks: { 4 5 6} (detected
> by 1, t=2504 jiffies)
> INFO: rcu_sched_state detected stalls on CPUs/tasks: { 4 5 6} (detected
> by 1, t=10036 jiffies)
> INFO: rcu_sched_state detected stalls on CPUs/tasks: { 4 5 6} (detected
> by 1, t=17568 jiffies)
> INFO: rcu_sched_state detected stalls on CPUs/tasks: { 4 5 6} (detected
> by 1, t=25100 jiffies)
> INFO: rcu_sched_state detected stalls on CPUs/tasks: { 4 5 6} (detected
> by 1, t=32632 jiffies)
> INFO: rcu_sched_state detected stalls on CPUs/tasks: { 4 5 6} (detected
> by 1, t=40164 jiffies)
>
> With CONFIG_TREE_CPU we were not hitting this scenario very often.
> However with CONFIG_PREEMPT_TREE_CPU stall happens most of the time.
>
> I presume some issue in my timer setup . I am not seeing timer interrupt
> (or IPI interrupt) getting  incremented for VPE1 tcs on a completely
> booted 2.6.32-stable kernel.
>
> / # cat /proc/interrupts
>            CPU0       CPU1       CPU2       CPU3       CPU4       CPU5
> CPU6
>   1:        148      15023      15140      15093       3779          8
> 2            MIPS  SMTC_IPI
>   6:          0          0          0          0          0          0
> 0            MIPS  MSP CIC cascade
>   8:          0          0          0          0          0          0
> 0         MSP_CIC  Softreset button
>   9:          0          0          0          0          0          0
> 0         MSP_CIC  Standby switch
>  21:          0          0          0          0          0          0
> 0         MSP_CIC  MSP PER cascade
>  25:      15113        341          4          7          0          0
> 0         MSP_CIC  timer
>  27:        260          9          0          1          0          0
> 0         MSP_CIC  serial
>  34:          0          0          0          0          0          0
> 0         MSP_CIC  timer
>
> Can't we use separate timer interrupts for VPE1 and VPE0 in SMTC ?. 
>
> I have tried setting up VPE1 timer from get_co_compare_int as follows
>
> unsigned int __cpuinit get_c0_compare_int(void)
> {
> 	if ((1==get_current_vpe()) && !vpe1_timr_installed){
> 	
> 	memcpy(&timer_vpe1,&c0_compare_irqaction,sizeof(timer_vpe1));
> 	
> 	setup_irq(MSP_INT_VPE1_TIMER, &timer_vpe1);
>                   vpe1_timr_installed++;
>           }
>           return (get_current_vpe() ? MSP_INT_VPE1_TIMER :
> MSP_INT_VPE0_TIMER);
> }
>
> Thanks
> Anoop
>
>>             Regards,
>>
>>             Kevin K.
>>
>> On 1/3/2011 7:12 AM, Anoop P A wrote:
>>> Hi ,
>>>
>>> Following patch restricts TREE_CPU RCU implementation only for !PREEMPT
>>> SMP kernel.  
>>> http://git.linux-mips.org/?p=linux.git;a=commit;h=687d7a960aea46e016182c7ce346d62c4dbd0366
>>>
>>> CONFIG_TREE_PREEMPT_RCU option seems to be not working for SMTC kernel
>>> ( which will be only available RCU implementation for SMTC kernel from
>>> 2.6.37 onwards) .
>>>
>>> With no forced preemption and selecting TREE_CPU I am able to boot
>>> further to the hang that I have reported.
>>>
>>> Thanks
>>> Anoop
>>>
>>> On Sat, 2011-01-01 at 00:42 -0800, Kevin D. Kissell wrote:
>>>> At this point the logical thing to do would seem to look at your kernel
>>>> image and disassemble smtc_ipi_replay(), which is where the EPC of VPE 0
>>>> shows the last exception to have been taken.  That's a critical SMTC
>>>> routine that gets called whenever an xxx_irq_restore() enables
>>>> interrupts, so that virtual per-TC IPI interrupts that were posted while
>>>> the TC had interrupts disabled can be handled deterministically.  As I
>>>> mentioned in an earlier message, there was some cleanup work from David
>>>> Howell that changed a number of irq management-related function names
>>>> and prototypes across all architectures, which went into linux-mips.org
>>>> at very roughly the time of the breakage.  The SMTC overlay over the irq
>>>> implementation has been pretty robust, but it's written in a perhaps
>>>> doomed attempt to be both efficient and using a maximum amount of common
>>>> code with the general case.  A mechanical or semi-mechanical change
>>>> could conceivably have broken things.
>>>>
>>>>             Regards,
>>>>
>>>>             Kevin K.
>>>>
>>>>
>>>> On 12/31/2010 4:27 AM, Anoop P A wrote:
>>>>> Hi ,
>>>>>
>>>>> Kernel hangs on stop_machine call. Please find mt reg dump below.
>>>>> Another important observation is even though 2.6.33 kernel + stackframe
>>>>> patch well passes calibration hang , I am still unable boot in to a
>>>>> initramfs root ( verified ramfs working with VSMP). So it looks like
>>>>> still some issue to fix between 2.6.32 and 2.6.33 .
>>>>> ######################## Log ###########################
>>>>>
>>>>> === MIPS MT State Dump ===
>>>>> -- Global State --
>>>>>    MVPControl Passed: 00000005
>>>>>    MVPControl Read: 00000004
>>>>>    MVPConf0 : a8008406
>>>>> -- per-VPE State --
>>>>>   VPE 0
>>>>>    VPEControl : 00008000
>>>>>    VPEConf0 : 800f0003
>>>>>    VPE0.Status : 11004201
>>>>>    VPE0.EPC : 8010dc54 smtc_ipi_replay+0xcc/0x108
>>>>>    VPE0.Cause : 50804000
>>>>>    VPE0.Config7 : 00010000
>>>>>   VPE 1
>>>>>    VPEControl : 00068006
>>>>>    VPEConf0 : 80cf0003
>>>>>    VPE1.Status : 11008301
>>>>>    VPE1.EPC : 801022a0 r4k_wait+0x20/0x40
>>>>>    VPE1.Cause : 50800000
>>>>>    VPE1.Config7 : 00010000
>>>>> -- per-TC State --
>>>>>   TC 0 (current TC with VPE EPC above)
>>>>>    TCStatus : 18102000
>>>>>    TCBind : 00000000
>>>>>    TCRestart : 803fa19c printk+0xc/0x30
>>>>>    TCHalt : 00000000
>>>>>    TCContext : 00000000
>>>>>   TC 1
>>>>>    TCStatus : 18902000
>>>>>    TCBind : 00200000
>>>>>    TCRestart : 801022a0 r4k_wait+0x20/0x40
>>>>>    TCHalt : 00000000
>>>>>    TCContext : 00140000
>>>>>   TC 2
>>>>>    TCStatus : 18902000
>>>>>    TCBind : 00400000
>>>>>    TCRestart : 801022a0 r4k_wait+0x20/0x40
>>>>>    TCHalt : 00000000
>>>>>    TCContext : 00280000
>>>>>   TC 3
>>>>>    TCStatus : 18902000
>>>>>    TCBind : 00600000
>>>>>    TCRestart : 801022a0 r4k_wait+0x20/0x40
>>>>>    TCHalt : 00000000
>>>>>    TCContext : 003c0000
>>>>>   TC 4
>>>>>    TCStatus : 18902000
>>>>>    TCBind : 00800001
>>>>>    TCRestart : 8010229c r4k_wait+0x1c/0x40
>>>>>    TCHalt : 00000000
>>>>>    TCContext : 00500000
>>>>>   TC 5
>>>>>    TCStatus : 18902000
>>>>>    TCBind : 00a00001
>>>>>    TCRestart : 8010229c r4k_wait+0x1c/0x40
>>>>>    TCHalt : 00000000
>>>>>    TCContext : 00640000
>>>>>   TC 6
>>>>>    TCStatus : 18902000
>>>>>    TCBind : 00c00001
>>>>>    TCRestart : 8010229c r4k_wait+0x1c/0x40
>>>>>    TCHalt : 00000000
>>>>>    TCContext : 00780000
>>>>> Counter Interrupts taken per CPU (TC)
>>>>> 0: 0
>>>>> 1: 0
>>>>> 2: 0
>>>>> 3: 0
>>>>> 4: 0
>>>>> 5: 0
>>>>> 6: 0
>>>>> 7: 0
>>>>> Self-IPI invocations:
>>>>> 0: 12
>>>>> 1: 0
>>>>> 2: 0
>>>>> 3: 0
>>>>> 4: 0
>>>>> 5: 5
>>>>> 6: 4
>>>>> 7: 0
>>>>> IPIQ[0]: head = 0x0, tail = 0x0, depth = 0
>>>>> IPIQ[1]: head = 0x0, tail = 0x0, depth = 0
>>>>> IPIQ[2]: head = 0x0, tail = 0x0, depth = 0
>>>>> IPIQ[3]: head = 0x0, tail = 0x0, depth = 0
>>>>> IPIQ[4]: head = 0x0, tail = 0x0, depth = 0
>>>>> IPIQ[5]: head = 0x0, tail = 0x0, depth = 0
>>>>> IPIQ[6]: head = 0x0, tail = 0x0, depth = 0
>>>>> IPIQ[7]: head = 0x0, tail = 0x0, depth = 0
>>>>> 0 Recoveries of "stolen" FPU
>>>>> ===========================
>>>>>
>>>>> ################################################################
>>>>>
>>>>> Thanks
>>>>> Anoop
>>>>>
>>>>> On Tue, 2010-12-28 at 00:43 -0800, Kevin D. Kissell wrote:
>>>>>> I took a quick look last night, and the only thing that looked vaguely 
>>>>>> dangerous in changes since the timer changes I alluded to earlier was 
>>>>>> the global naming cleanup of irq-related function names that David 
>>>>>> Howell submitted.  The diff didn't look dangerous in itself, but some of 
>>>>>> the definitions are nested subtly for SMTC to maximize the amount of 
>>>>>> common code, and I could imagine something getting lost in translation 
>>>>>> there.  If that were really the problem, it would of course affect much 
>>>>>> more than just the timer subsystem, but early in the boot process, 
>>>>>> timers are pretty much the only interrupts that have to be handled 
>>>>>> correctly.
>>>>>>
>>>>>> I'm travelling today, but will take a look at timekeeping_notify() 
>>>>>> tomorrow or the next day...
>>>>>>
>>>>>> /K.
>>>>>>
>>>>>> On 12/28/10 12:19 AM, Anoop P A wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> I had a glance into the code diff without notice of any suspect-able
>>>>>>> code .
>>>>>>> Tracing the hang showed that it is getting hanged in timekeeping_notify
>>>>>>> function.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Anoop
>>>>>>>
>>>>>>> PS: I may not be available until Thursday
>>>>>>>
>>>>>>> On Mon, 2010-12-27 at 22:49 +0530, Anoop P A wrote:
>>>>>>>> Hi Kevin,
>>>>>>>>
>>>>>>>> It is very unlikely that the patch you pointed has any impact on the the
>>>>>>>> hang I am seeing. The patch you have mentioned got into kernel around
>>>>>>>> 2.6.32 timeframe. I am able to boot both 2.6.32 and  2.6.33 kernel ( +
>>>>>>>> stackframe patch) .
>>>>>>>>
>>>>>>>> Hi Stuart,
>>>>>>>>
>>>>>>>> I haven't got much time to spend on this today.
>>>>>>>>
>>>>>>>> I had got 2.6.36-stable(+ stack frame patch) booting last day and I have
>>>>>>>> observed hang issue with 2.6.37-rc1 ( Same as rc6 and current git head)
>>>>>>>>
>>>>>>>> So probably some patches in 2.6.37 branch introduced this hang.
>>>>>>>>
>>>>>>>> Hopefully I will get some free slot tomorrow so that I can look into
>>>>>>>> code diff .
>>>>>>>>
>>>>>>>> Thanks
>>>>>>>> Anoop
>>>>>>>>
>>>>>>>> On Mon, 2010-12-27 at 09:49 -0600, STUART VENTERS wrote:
>>>>>>>>> Kevin,
>>>>>>>>>
>>>>>>>>> Outstanding, sometimes it's better to be lucky than good.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Anoop,
>>>>>>>>>
>>>>>>>>> Maybe we can get lucky again.
>>>>>>>>>
>>>>>>>>> If you can isolate the .33 works/.37 works_not bug to a specific pair of versions,
>>>>>>>>>     I'll be happy to do another diff.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Hope you'll have had a good Christmas as well.
>>>>>>>>>    We've had snow in Alabama since Christmas eve!
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>>
>>>>>>>>> Stuart
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> -----Original Message-----
>>>>>>>>> From: Kevin D. Kissell [mailto:kevink@paralogos.com]
>>>>>>>>> Sent: Friday, December 24, 2010 5:34 PM
>>>>>>>>> To: Anoop P A
>>>>>>>>> Cc: STUART VENTERS; Anoop P.A.; linux-mips@linux-mips.org
>>>>>>>>> Subject: Re: SMTC support status in latest git head.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Ah, well, at least we have a stackframe.h fix that preserves David's
>>>>>>>>> performance tweak for the deeper pipelined processors.  In looking for
>>>>>>>>> this, I did notice that someone did some modification to the SMTC clock
>>>>>>>>> tick logic that I was skeptical had ever been tested.  If you've still
>>>>>>>>> got that kernel binary handy, you might check to see if it boots with
>>>>>>>>> maxtcs=1 maxvpes=1, maxtcs=2 maxvpes=1, and/or maxtcs=2 maxvpes=2.
>>>>>>>>>
>>>>>>>>> Oh, yes, and Merry Christmas one and all!
>>>>>>>>>
>>>>>>>>>               Regards,
>>>>>>>>>
>>>>>>>>>               Kevin K.
>>>>>>>>>
>>>>>>>>> On 12/24/10 8:02 AM, Anoop P A wrote:
>>>>>>>>>> On Fri, 2010-12-24 at 06:53 -0800, Kevin D. Kissell wrote:
>>>>>>>>>>> Excellent!  Now, does the attached patch (relative to 2.6.37.11) also
>>>>>>>>>>> fix things, while preserving the other fixes and performance enhancements?
>>>>>>>>>>>
>>>>>>>>>> I have tested that patch with 2.6.37 branch it well passes calibration
>>>>>>>>>> loop but hangs after switching to mips closource
>>>>>>>>>>
>>>>>>>>>> TC 6 going on-line as CPU 6
>>>>>>>>>> Brought up 7 CPUs
>>>>>>>>>> bio: create slab<bio-0>   at 0
>>>>>>>>>> SCSI subsystem initialized
>>>>>>>>>> Switching to clocksource MIPS
>>>>>>>>>>
>>>>>>>>>> I Presume this is a different issue as restoring older file didn't help
>>>>>>>>>> much to get rid of this hang.
>>>>>>>>>>
>>>>>>>>>> diff --git a/arch/mips/include/asm/stackframe.h
>>>>>>>>>> b/arch/mips/include/asm/stackframe.h
>>>>>>>>>> index 58730c5..7fc9f10 100644
>>>>>>>>>> --- a/arch/mips/include/asm/stackframe.h
>>>>>>>>>> +++ b/arch/mips/include/asm/stackframe.h
>>>>>>>>>> @@ -195,9 +195,9 @@
>>>>>>>>>>    		 * to cover the pipeline delay.
>>>>>>>>>>    		 */
>>>>>>>>>>    		.set	mips32
>>>>>>>>>> -		mfc0	v1, CP0_TCSTATUS
>>>>>>>>>> +		mfc0	v0, CP0_TCSTATUS
>>>>>>>>>>    		.set	mips0
>>>>>>>>>> -		LONG_S	v1, PT_TCSTATUS(sp)
>>>>>>>>>> +		LONG_S	v0, PT_TCSTATUS(sp)
>>>>>>>>>>    #endif /* CONFIG_MIPS_MT_SMTC */
>>>>>>>>>>    		LONG_S	$4, PT_R4(sp)
>>>>>>>>>>    		LONG_S	$5, PT_R5(sp)
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> /K.
>>>>>>>>>>>
>>>>>>>>>>> On 12/24/10 6:39 AM, Anoop P A wrote:
>>>>>>>>>>>> Hi Kevin, Stuart ,
>>>>>>>>>>>>
>>>>>>>>>>>> Woohooo You guys spotted !.
>>>>>>>>>>>>
>>>>>>>>>>>>     http://git.linux-mips.org/?p=linux.git;a=commit;h=d5ec6e3c seems to be
>>>>>>>>>>>> the culprit
>>>>>>>>>>>>
>>>>>>>>>>>> Once I restored previous version of stackframe.h 2.6.33-stable started
>>>>>>>>>>>> booting !.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Anoop
>>>>>>>>>>>>
>>>>>>>>>>>> On Fri, 2010-12-24 at 04:32 -0800, Kevin D. Kissell wrote:
>>>>>>>>>>>>> Thank you, Stuart!  I've spotted some definite breakage to SMTC between
>>>>>>>>>>>>> those versions.  In arch/mips/include/asm/stackframe.h, someone moved
>>>>>>>>>>>>> the store of the Status register value in SAVE_SOME (line 169 or 204,
>>>>>>>>>>>>> depending on the version) from two instructions after the mfc0 to a
>>>>>>>>>>>>> point after the #ifdef for SMTC, presumably to get better pipelining of
>>>>>>>>>>>>> the register access.  Unfortunately, the v1 register is also used in the
>>>>>>>>>>>>> SMTC-specific fragment to save TCStatus, so the Status value gets
>>>>>>>>>>>>> clobbered before it gets stored.  This will eventually result in the
>>>>>>>>>>>>> Status register getting a TCStatus value, which has some bits on common,
>>>>>>>>>>>>> but isn't identical and sooner or later Bad Things will happen.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I'm a little surprised this wasn't caught by visual inspection of the patch.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Possible solutions would include reverting the store of the CP0_STATUS
>>>>>>>>>>>>> value to the block above the #ifdef, or, to retain whatever performance
>>>>>>>>>>>>> advantage was obtained by moving the store downward, to use v0/$2
>>>>>>>>>>>>> instead of v1/$3, as the staging register for the TCStatus value.  I'd
>>>>>>>>>>>>> lean toward the second option, but I'm not in a position to test and
>>>>>>>>>>>>> submit a patch just now.
>>>>>>>>>>>>>
>>>>>>>>>>>>>                 Regards,
>>>>>>>>>>>>>
>>>>>>>>>>>>>                 Kevin K.
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 12/23/10 1:09 PM, STUART VENTERS wrote:
>>>>>>>>>>>>>> Kevin,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I'm not sure if it's useful,
>>>>>>>>>>>>>>        but finally I got the time to look at the two kernel versions Anoop pointed out.
>>>>>>>>>>>>>>         works   2.6.32-stable with patch 804
>>>>>>>>>>>>>>         works_not 2.6.33-stable
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> greping for files with CONFIG_MIPS_MT_SMTC
>>>>>>>>>>>>>>        and looking for timer interrupt related stuff found the following differences:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> arch/mips/include/asm/irq.h
>>>>>>>>>>>>>> arch/mips/kernel/irq.c
>>>>>>>>>>>>>>       do_IRQ
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> arch/mips/include/asm/stackframe.h
>>>>>>>>>>>>>>       SAVE_SOME SAVE_TEMP get/set_saved_sp
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> arch/mips/include/asm/time.h
>>>>>>>>>>>>>>       clocksource_set_clock
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> arch/mips/kernel/process.c
>>>>>>>>>>>>>>       cpu_idle
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> arch/mips/kernel/smtc.c
>>>>>>>>>>>>>>       __irq_entry
>>>>>>>>>>>>>>       ipi_decode
>>>>>>>>>>>>>>           SMTC_CLOCK_TICK
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Enclosed are the two subsets of files for a more expert look.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I'll try to look in more detail after Christmas.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Stuart
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: SMTC support status in latest git head.
  2011-01-04  8:17                                                       ` Kevin D. Kissell
@ 2011-01-04 13:02                                                         ` Anoop P A
  2011-01-04 14:37                                                           ` Anoop P A
  2011-01-04 17:40                                                           ` Kevin D. Kissell
  0 siblings, 2 replies; 68+ messages in thread
From: Anoop P A @ 2011-01-04 13:02 UTC (permalink / raw)
  To: Kevin D. Kissell; +Cc: STUART VENTERS, Anoop P.A., linux-mips

On Tue, 2011-01-04 at 00:17 -0800, Kevin D. Kissell wrote:
> Those interrupt counters show that IPIs are being taken everywhere,
> though very few by CPUs 5 and 6.  If I understand the configuration
> correctly, CPU 4 is a TC in VPE 1, and it's getting a reasonable IPI
Yes CPU4 is in second VPE

> rate, *if* we're looking at a tickless kernel under low load.  But there
No it was not the tickless kernel.I had selected 250 MHz timer. can't we
expect IPI / timer interrupt for all the threads in this case ?.

> may be a clue there to part of your problem.  I have no idea why the
> behavior would have changed from 2.6.36 to 2.6.37, but it looks as if
> you're getting your clock interrupts through the MSP CIC interrupt
> controller on VPE 0.  There's nothing symmetric for VPE1. The Malta
> example code is perhaps deceptively simple, in that both VPEs have their
> count/compare indication wired directly to the 2 clock interrupt inputs,
> so that having both of them running with only a single set of irq state
> just works.  I don't know whether the MSP CIC timer interrupt is a

In my case it is separate irq. MSP_INT_VPE1_TIMER (34) and  
MSP_INT_VPE0_TIMER (25) are wired to CIC . CIC interrupt has been
connected to cpu irq 6. 

I can reproduce cpu stall in VSMP mode If I don't setup VPE1 timer
interrupt . Don't we have support for separate irq in SMTC
implementation ?..

> gating of the VPE0 count/compare output, or whether it's it's own
> interval timer, but I suspect that you may need to do some further
> low-level initialization in the platform-specific code to set up an
> interrupt on the VPE1 side.  I don't think the snippet you've got below
> would work as written.

The routine which I copied works fine for VSMP mode . 

/ # cat /proc/interrupts
           CPU0       CPU1
  0:        187        254            MIPS  IPI_resched
  1:         77        174            MIPS  IPI_call
  6:          0          0            MIPS  MSP CIC cascade
  8:          0          0         MSP_CIC  Softreset button
  9:          0          0         MSP_CIC  Standby switch
 21:          0          0         MSP_CIC  MSP PER cascade
 25:      37077          0         MSP_CIC  timer
 27:        188          0         MSP_CIC  serial
 34:          0      36986         MSP_CIC  timer

Do I want to change anything specific for SMTC ? .  

> 
> If it's purely an issue with clock distribution on VPE1, then a boot
> with maxvpes=1 maxtcs=4 should be stable.

Yes the kernel seems to be stable if I boot with maxvpes=1 maxtcs=4 .

> 
> /K.
> 
> On 1/3/2011 11:20 AM, Anoop P A wrote:
> > Hi Kevin,
> >
> > On Mon, 2011-01-03 at 08:14 -0800, Kevin D. Kissell wrote:
> >> The very first SMTC implementations didn't support full kernel-mode
> >> preemption, which anyway wasn't a priority, given the hardware event
> >> response support in MIPS MT.  I believe it was later made compatible,
> >> but it was never extensively exercised.  Since SMTC has fingers in some
> >> pretty low-level atomicity mechanisms, if a new, parallel set was
> >> implemented for RCU, I can easily imagine that nobody has yet
> >> implemented SMTC-ified variants of that set.
> >>
> >> Your last statement isn't very clear, though.  Are you saying that if
> >> you configure for no forced preemption and with TREE_CPU, the 2.6.37
> >> kernel boots all the way up, or that it simply hangs later?   What's the
> >> last rev kernel that actually boots all the way up?
> > I have debugged this a bit more. It seems that kernel getting stalled
> > while executing on TC's of second VPE . 
> > INFO: rcu_sched_state detected stalls on CPUs/tasks: { 4 5 6} (detected
> > by 1, t=2504 jiffies)
> > INFO: rcu_sched_state detected stalls on CPUs/tasks: { 4 5 6} (detected
> > by 1, t=10036 jiffies)
> > INFO: rcu_sched_state detected stalls on CPUs/tasks: { 4 5 6} (detected
> > by 1, t=17568 jiffies)
> > INFO: rcu_sched_state detected stalls on CPUs/tasks: { 4 5 6} (detected
> > by 1, t=25100 jiffies)
> > INFO: rcu_sched_state detected stalls on CPUs/tasks: { 4 5 6} (detected
> > by 1, t=32632 jiffies)
> > INFO: rcu_sched_state detected stalls on CPUs/tasks: { 4 5 6} (detected
> > by 1, t=40164 jiffies)
> >
> > With CONFIG_TREE_CPU we were not hitting this scenario very often.
> > However with CONFIG_PREEMPT_TREE_CPU stall happens most of the time.
> >
> > I presume some issue in my timer setup . I am not seeing timer interrupt
> > (or IPI interrupt) getting  incremented for VPE1 tcs on a completely
> > booted 2.6.32-stable kernel.
> >
> > / # cat /proc/interrupts
> >            CPU0       CPU1       CPU2       CPU3       CPU4       CPU5
> > CPU6
> >   1:        148      15023      15140      15093       3779          8
> > 2            MIPS  SMTC_IPI
> >   6:          0          0          0          0          0          0
> > 0            MIPS  MSP CIC cascade
> >   8:          0          0          0          0          0          0
> > 0         MSP_CIC  Softreset button
> >   9:          0          0          0          0          0          0
> > 0         MSP_CIC  Standby switch
> >  21:          0          0          0          0          0          0
> > 0         MSP_CIC  MSP PER cascade
> >  25:      15113        341          4          7          0          0
> > 0         MSP_CIC  timer
> >  27:        260          9          0          1          0          0
> > 0         MSP_CIC  serial
> >  34:          0          0          0          0          0          0
> > 0         MSP_CIC  timer
> >
> > Can't we use separate timer interrupts for VPE1 and VPE0 in SMTC ?. 
> >
> > I have tried setting up VPE1 timer from get_co_compare_int as follows
> >
> > unsigned int __cpuinit get_c0_compare_int(void)
> > {
> > 	if ((1==get_current_vpe()) && !vpe1_timr_installed){
> > 	
> > 	memcpy(&timer_vpe1,&c0_compare_irqaction,sizeof(timer_vpe1));
> > 	
> > 	setup_irq(MSP_INT_VPE1_TIMER, &timer_vpe1);
> >                   vpe1_timr_installed++;
> >           }
> >           return (get_current_vpe() ? MSP_INT_VPE1_TIMER :
> > MSP_INT_VPE0_TIMER);
> > }
> >
> > Thanks
> > Anoop
> >
> >>             Regards,
> >>
> >>             Kevin K.
> >>
> >> On 1/3/2011 7:12 AM, Anoop P A wrote:
> >>> Hi ,
> >>>
> >>> Following patch restricts TREE_CPU RCU implementation only for !PREEMPT
> >>> SMP kernel.  
> >>> http://git.linux-mips.org/?p=linux.git;a=commit;h=687d7a960aea46e016182c7ce346d62c4dbd0366
> >>>
> >>> CONFIG_TREE_PREEMPT_RCU option seems to be not working for SMTC kernel
> >>> ( which will be only available RCU implementation for SMTC kernel from
> >>> 2.6.37 onwards) .
> >>>
> >>> With no forced preemption and selecting TREE_CPU I am able to boot
> >>> further to the hang that I have reported.
> >>>
> >>> Thanks
> >>> Anoop
> >>>
> >>> On Sat, 2011-01-01 at 00:42 -0800, Kevin D. Kissell wrote:
> >>>> At this point the logical thing to do would seem to look at your kernel
> >>>> image and disassemble smtc_ipi_replay(), which is where the EPC of VPE 0
> >>>> shows the last exception to have been taken.  That's a critical SMTC
> >>>> routine that gets called whenever an xxx_irq_restore() enables
> >>>> interrupts, so that virtual per-TC IPI interrupts that were posted while
> >>>> the TC had interrupts disabled can be handled deterministically.  As I
> >>>> mentioned in an earlier message, there was some cleanup work from David
> >>>> Howell that changed a number of irq management-related function names
> >>>> and prototypes across all architectures, which went into linux-mips.org
> >>>> at very roughly the time of the breakage.  The SMTC overlay over the irq
> >>>> implementation has been pretty robust, but it's written in a perhaps
> >>>> doomed attempt to be both efficient and using a maximum amount of common
> >>>> code with the general case.  A mechanical or semi-mechanical change
> >>>> could conceivably have broken things.
> >>>>
> >>>>             Regards,
> >>>>
> >>>>             Kevin K.
> >>>>
> >>>>
> >>>> On 12/31/2010 4:27 AM, Anoop P A wrote:
> >>>>> Hi ,
> >>>>>
> >>>>> Kernel hangs on stop_machine call. Please find mt reg dump below.
> >>>>> Another important observation is even though 2.6.33 kernel + stackframe
> >>>>> patch well passes calibration hang , I am still unable boot in to a
> >>>>> initramfs root ( verified ramfs working with VSMP). So it looks like
> >>>>> still some issue to fix between 2.6.32 and 2.6.33 .
> >>>>> ######################## Log ###########################
> >>>>>
> >>>>> === MIPS MT State Dump ===
> >>>>> -- Global State --
> >>>>>    MVPControl Passed: 00000005
> >>>>>    MVPControl Read: 00000004
> >>>>>    MVPConf0 : a8008406
> >>>>> -- per-VPE State --
> >>>>>   VPE 0
> >>>>>    VPEControl : 00008000
> >>>>>    VPEConf0 : 800f0003
> >>>>>    VPE0.Status : 11004201
> >>>>>    VPE0.EPC : 8010dc54 smtc_ipi_replay+0xcc/0x108
> >>>>>    VPE0.Cause : 50804000
> >>>>>    VPE0.Config7 : 00010000
> >>>>>   VPE 1
> >>>>>    VPEControl : 00068006
> >>>>>    VPEConf0 : 80cf0003
> >>>>>    VPE1.Status : 11008301
> >>>>>    VPE1.EPC : 801022a0 r4k_wait+0x20/0x40
> >>>>>    VPE1.Cause : 50800000
> >>>>>    VPE1.Config7 : 00010000
> >>>>> -- per-TC State --
> >>>>>   TC 0 (current TC with VPE EPC above)
> >>>>>    TCStatus : 18102000
> >>>>>    TCBind : 00000000
> >>>>>    TCRestart : 803fa19c printk+0xc/0x30
> >>>>>    TCHalt : 00000000
> >>>>>    TCContext : 00000000
> >>>>>   TC 1
> >>>>>    TCStatus : 18902000
> >>>>>    TCBind : 00200000
> >>>>>    TCRestart : 801022a0 r4k_wait+0x20/0x40
> >>>>>    TCHalt : 00000000
> >>>>>    TCContext : 00140000
> >>>>>   TC 2
> >>>>>    TCStatus : 18902000
> >>>>>    TCBind : 00400000
> >>>>>    TCRestart : 801022a0 r4k_wait+0x20/0x40
> >>>>>    TCHalt : 00000000
> >>>>>    TCContext : 00280000
> >>>>>   TC 3
> >>>>>    TCStatus : 18902000
> >>>>>    TCBind : 00600000
> >>>>>    TCRestart : 801022a0 r4k_wait+0x20/0x40
> >>>>>    TCHalt : 00000000
> >>>>>    TCContext : 003c0000
> >>>>>   TC 4
> >>>>>    TCStatus : 18902000
> >>>>>    TCBind : 00800001
> >>>>>    TCRestart : 8010229c r4k_wait+0x1c/0x40
> >>>>>    TCHalt : 00000000
> >>>>>    TCContext : 00500000
> >>>>>   TC 5
> >>>>>    TCStatus : 18902000
> >>>>>    TCBind : 00a00001
> >>>>>    TCRestart : 8010229c r4k_wait+0x1c/0x40
> >>>>>    TCHalt : 00000000
> >>>>>    TCContext : 00640000
> >>>>>   TC 6
> >>>>>    TCStatus : 18902000
> >>>>>    TCBind : 00c00001
> >>>>>    TCRestart : 8010229c r4k_wait+0x1c/0x40
> >>>>>    TCHalt : 00000000
> >>>>>    TCContext : 00780000
> >>>>> Counter Interrupts taken per CPU (TC)
> >>>>> 0: 0
> >>>>> 1: 0
> >>>>> 2: 0
> >>>>> 3: 0
> >>>>> 4: 0
> >>>>> 5: 0
> >>>>> 6: 0
> >>>>> 7: 0
> >>>>> Self-IPI invocations:
> >>>>> 0: 12
> >>>>> 1: 0
> >>>>> 2: 0
> >>>>> 3: 0
> >>>>> 4: 0
> >>>>> 5: 5
> >>>>> 6: 4
> >>>>> 7: 0
> >>>>> IPIQ[0]: head = 0x0, tail = 0x0, depth = 0
> >>>>> IPIQ[1]: head = 0x0, tail = 0x0, depth = 0
> >>>>> IPIQ[2]: head = 0x0, tail = 0x0, depth = 0
> >>>>> IPIQ[3]: head = 0x0, tail = 0x0, depth = 0
> >>>>> IPIQ[4]: head = 0x0, tail = 0x0, depth = 0
> >>>>> IPIQ[5]: head = 0x0, tail = 0x0, depth = 0
> >>>>> IPIQ[6]: head = 0x0, tail = 0x0, depth = 0
> >>>>> IPIQ[7]: head = 0x0, tail = 0x0, depth = 0
> >>>>> 0 Recoveries of "stolen" FPU
> >>>>> ===========================
> >>>>>
> >>>>> ################################################################
> >>>>>
> >>>>> Thanks
> >>>>> Anoop
> >>>>>
> >>>>> On Tue, 2010-12-28 at 00:43 -0800, Kevin D. Kissell wrote:
> >>>>>> I took a quick look last night, and the only thing that looked vaguely 
> >>>>>> dangerous in changes since the timer changes I alluded to earlier was 
> >>>>>> the global naming cleanup of irq-related function names that David 
> >>>>>> Howell submitted.  The diff didn't look dangerous in itself, but some of 
> >>>>>> the definitions are nested subtly for SMTC to maximize the amount of 
> >>>>>> common code, and I could imagine something getting lost in translation 
> >>>>>> there.  If that were really the problem, it would of course affect much 
> >>>>>> more than just the timer subsystem, but early in the boot process, 
> >>>>>> timers are pretty much the only interrupts that have to be handled 
> >>>>>> correctly.
> >>>>>>
> >>>>>> I'm travelling today, but will take a look at timekeeping_notify() 
> >>>>>> tomorrow or the next day...
> >>>>>>
> >>>>>> /K.
> >>>>>>
> >>>>>> On 12/28/10 12:19 AM, Anoop P A wrote:
> >>>>>>> Hi,
> >>>>>>>
> >>>>>>> I had a glance into the code diff without notice of any suspect-able
> >>>>>>> code .
> >>>>>>> Tracing the hang showed that it is getting hanged in timekeeping_notify
> >>>>>>> function.
> >>>>>>>
> >>>>>>> Thanks,
> >>>>>>> Anoop
> >>>>>>>
> >>>>>>> PS: I may not be available until Thursday
> >>>>>>>
> >>>>>>> On Mon, 2010-12-27 at 22:49 +0530, Anoop P A wrote:
> >>>>>>>> Hi Kevin,
> >>>>>>>>
> >>>>>>>> It is very unlikely that the patch you pointed has any impact on the the
> >>>>>>>> hang I am seeing. The patch you have mentioned got into kernel around
> >>>>>>>> 2.6.32 timeframe. I am able to boot both 2.6.32 and  2.6.33 kernel ( +
> >>>>>>>> stackframe patch) .
> >>>>>>>>
> >>>>>>>> Hi Stuart,
> >>>>>>>>
> >>>>>>>> I haven't got much time to spend on this today.
> >>>>>>>>
> >>>>>>>> I had got 2.6.36-stable(+ stack frame patch) booting last day and I have
> >>>>>>>> observed hang issue with 2.6.37-rc1 ( Same as rc6 and current git head)
> >>>>>>>>
> >>>>>>>> So probably some patches in 2.6.37 branch introduced this hang.
> >>>>>>>>
> >>>>>>>> Hopefully I will get some free slot tomorrow so that I can look into
> >>>>>>>> code diff .
> >>>>>>>>
> >>>>>>>> Thanks
> >>>>>>>> Anoop
> >>>>>>>>
> >>>>>>>> On Mon, 2010-12-27 at 09:49 -0600, STUART VENTERS wrote:
> >>>>>>>>> Kevin,
> >>>>>>>>>
> >>>>>>>>> Outstanding, sometimes it's better to be lucky than good.
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> Anoop,
> >>>>>>>>>
> >>>>>>>>> Maybe we can get lucky again.
> >>>>>>>>>
> >>>>>>>>> If you can isolate the .33 works/.37 works_not bug to a specific pair of versions,
> >>>>>>>>>     I'll be happy to do another diff.
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> Hope you'll have had a good Christmas as well.
> >>>>>>>>>    We've had snow in Alabama since Christmas eve!
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> Regards,
> >>>>>>>>>
> >>>>>>>>> Stuart
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> -----Original Message-----
> >>>>>>>>> From: Kevin D. Kissell [mailto:kevink@paralogos.com]
> >>>>>>>>> Sent: Friday, December 24, 2010 5:34 PM
> >>>>>>>>> To: Anoop P A
> >>>>>>>>> Cc: STUART VENTERS; Anoop P.A.; linux-mips@linux-mips.org
> >>>>>>>>> Subject: Re: SMTC support status in latest git head.
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> Ah, well, at least we have a stackframe.h fix that preserves David's
> >>>>>>>>> performance tweak for the deeper pipelined processors.  In looking for
> >>>>>>>>> this, I did notice that someone did some modification to the SMTC clock
> >>>>>>>>> tick logic that I was skeptical had ever been tested.  If you've still
> >>>>>>>>> got that kernel binary handy, you might check to see if it boots with
> >>>>>>>>> maxtcs=1 maxvpes=1, maxtcs=2 maxvpes=1, and/or maxtcs=2 maxvpes=2.
> >>>>>>>>>
> >>>>>>>>> Oh, yes, and Merry Christmas one and all!
> >>>>>>>>>
> >>>>>>>>>               Regards,
> >>>>>>>>>
> >>>>>>>>>               Kevin K.
> >>>>>>>>>
> >>>>>>>>> On 12/24/10 8:02 AM, Anoop P A wrote:
> >>>>>>>>>> On Fri, 2010-12-24 at 06:53 -0800, Kevin D. Kissell wrote:
> >>>>>>>>>>> Excellent!  Now, does the attached patch (relative to 2.6.37.11) also
> >>>>>>>>>>> fix things, while preserving the other fixes and performance enhancements?
> >>>>>>>>>>>
> >>>>>>>>>> I have tested that patch with 2.6.37 branch it well passes calibration
> >>>>>>>>>> loop but hangs after switching to mips closource
> >>>>>>>>>>
> >>>>>>>>>> TC 6 going on-line as CPU 6
> >>>>>>>>>> Brought up 7 CPUs
> >>>>>>>>>> bio: create slab<bio-0>   at 0
> >>>>>>>>>> SCSI subsystem initialized
> >>>>>>>>>> Switching to clocksource MIPS
> >>>>>>>>>>
> >>>>>>>>>> I Presume this is a different issue as restoring older file didn't help
> >>>>>>>>>> much to get rid of this hang.
> >>>>>>>>>>
> >>>>>>>>>> diff --git a/arch/mips/include/asm/stackframe.h
> >>>>>>>>>> b/arch/mips/include/asm/stackframe.h
> >>>>>>>>>> index 58730c5..7fc9f10 100644
> >>>>>>>>>> --- a/arch/mips/include/asm/stackframe.h
> >>>>>>>>>> +++ b/arch/mips/include/asm/stackframe.h
> >>>>>>>>>> @@ -195,9 +195,9 @@
> >>>>>>>>>>    		 * to cover the pipeline delay.
> >>>>>>>>>>    		 */
> >>>>>>>>>>    		.set	mips32
> >>>>>>>>>> -		mfc0	v1, CP0_TCSTATUS
> >>>>>>>>>> +		mfc0	v0, CP0_TCSTATUS
> >>>>>>>>>>    		.set	mips0
> >>>>>>>>>> -		LONG_S	v1, PT_TCSTATUS(sp)
> >>>>>>>>>> +		LONG_S	v0, PT_TCSTATUS(sp)
> >>>>>>>>>>    #endif /* CONFIG_MIPS_MT_SMTC */
> >>>>>>>>>>    		LONG_S	$4, PT_R4(sp)
> >>>>>>>>>>    		LONG_S	$5, PT_R5(sp)
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>> /K.
> >>>>>>>>>>>
> >>>>>>>>>>> On 12/24/10 6:39 AM, Anoop P A wrote:
> >>>>>>>>>>>> Hi Kevin, Stuart ,
> >>>>>>>>>>>>
> >>>>>>>>>>>> Woohooo You guys spotted !.
> >>>>>>>>>>>>
> >>>>>>>>>>>>     http://git.linux-mips.org/?p=linux.git;a=commit;h=d5ec6e3c seems to be
> >>>>>>>>>>>> the culprit
> >>>>>>>>>>>>
> >>>>>>>>>>>> Once I restored previous version of stackframe.h 2.6.33-stable started
> >>>>>>>>>>>> booting !.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Thanks,
> >>>>>>>>>>>> Anoop
> >>>>>>>>>>>>
> >>>>>>>>>>>> On Fri, 2010-12-24 at 04:32 -0800, Kevin D. Kissell wrote:
> >>>>>>>>>>>>> Thank you, Stuart!  I've spotted some definite breakage to SMTC between
> >>>>>>>>>>>>> those versions.  In arch/mips/include/asm/stackframe.h, someone moved
> >>>>>>>>>>>>> the store of the Status register value in SAVE_SOME (line 169 or 204,
> >>>>>>>>>>>>> depending on the version) from two instructions after the mfc0 to a
> >>>>>>>>>>>>> point after the #ifdef for SMTC, presumably to get better pipelining of
> >>>>>>>>>>>>> the register access.  Unfortunately, the v1 register is also used in the
> >>>>>>>>>>>>> SMTC-specific fragment to save TCStatus, so the Status value gets
> >>>>>>>>>>>>> clobbered before it gets stored.  This will eventually result in the
> >>>>>>>>>>>>> Status register getting a TCStatus value, which has some bits on common,
> >>>>>>>>>>>>> but isn't identical and sooner or later Bad Things will happen.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> I'm a little surprised this wasn't caught by visual inspection of the patch.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Possible solutions would include reverting the store of the CP0_STATUS
> >>>>>>>>>>>>> value to the block above the #ifdef, or, to retain whatever performance
> >>>>>>>>>>>>> advantage was obtained by moving the store downward, to use v0/$2
> >>>>>>>>>>>>> instead of v1/$3, as the staging register for the TCStatus value.  I'd
> >>>>>>>>>>>>> lean toward the second option, but I'm not in a position to test and
> >>>>>>>>>>>>> submit a patch just now.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>                 Regards,
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>                 Kevin K.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On 12/23/10 1:09 PM, STUART VENTERS wrote:
> >>>>>>>>>>>>>> Kevin,
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> I'm not sure if it's useful,
> >>>>>>>>>>>>>>        but finally I got the time to look at the two kernel versions Anoop pointed out.
> >>>>>>>>>>>>>>         works   2.6.32-stable with patch 804
> >>>>>>>>>>>>>>         works_not 2.6.33-stable
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> greping for files with CONFIG_MIPS_MT_SMTC
> >>>>>>>>>>>>>>        and looking for timer interrupt related stuff found the following differences:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> arch/mips/include/asm/irq.h
> >>>>>>>>>>>>>> arch/mips/kernel/irq.c
> >>>>>>>>>>>>>>       do_IRQ
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> arch/mips/include/asm/stackframe.h
> >>>>>>>>>>>>>>       SAVE_SOME SAVE_TEMP get/set_saved_sp
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> arch/mips/include/asm/time.h
> >>>>>>>>>>>>>>       clocksource_set_clock
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> arch/mips/kernel/process.c
> >>>>>>>>>>>>>>       cpu_idle
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> arch/mips/kernel/smtc.c
> >>>>>>>>>>>>>>       __irq_entry
> >>>>>>>>>>>>>>       ipi_decode
> >>>>>>>>>>>>>>           SMTC_CLOCK_TICK
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Enclosed are the two subsets of files for a more expert look.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> I'll try to look in more detail after Christmas.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Cheers,
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Stuart
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >
> 

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: SMTC support status in latest git head.
  2011-01-04 13:02                                                         ` Anoop P A
@ 2011-01-04 14:37                                                           ` Anoop P A
  2011-01-04 17:21                                                             ` Kevin D. Kissell
  2011-01-04 17:40                                                           ` Kevin D. Kissell
  1 sibling, 1 reply; 68+ messages in thread
From: Anoop P A @ 2011-01-04 14:37 UTC (permalink / raw)
  To: Kevin D. Kissell; +Cc: STUART VENTERS, Anoop P.A., linux-mips

Hi Kevin,

the stackframe patch that you have suggested had some side effects I was
unable execute init. When I changed some thing like below it started
working .Could you kindly review it ?.

diff --git a/arch/mips/include/asm/stackframe.h
b/arch/mips/include/asm/stackframe.h
index 58730c5..da786ed 100644
--- a/arch/mips/include/asm/stackframe.h
+++ b/arch/mips/include/asm/stackframe.h
@@ -181,14 +181,6 @@
 #endif
 		LONG_S	k0, PT_R29(sp)
 		LONG_S	$3, PT_R3(sp)
-		/*
-		 * You might think that you don't need to save $0,
-		 * but the FPU emulator and gdb remote debug stub
-		 * need it to operate correctly
-		 */
-		LONG_S	$0, PT_R0(sp)
-		mfc0	v1, CP0_STATUS
-		LONG_S	$2, PT_R2(sp)
 #ifdef CONFIG_MIPS_MT_SMTC
 		/*
 		 * Ideally, these instructions would be shuffled in
@@ -199,6 +191,14 @@
 		.set	mips0
 		LONG_S	v1, PT_TCSTATUS(sp)
 #endif /* CONFIG_MIPS_MT_SMTC */
+		/*
+		 * You might think that you don't need to save $0,
+		 * but the FPU emulator and gdb remote debug stub
+		 * need it to operate correctly
+		 */
+		LONG_S	$0, PT_R0(sp)
+		mfc0	v1, CP0_STATUS
+		LONG_S	$2, PT_R2(sp)
 		LONG_S	$4, PT_R4(sp)
 		LONG_S	$5, PT_R5(sp)
 		LONG_S	v1, PT_STATUS(sp)

Linux-2.6.37-rc7 boots all the way if I specify maxvpes=1 in command
line.

/ # cat /proc/interrupts
           CPU0       CPU1       CPU2       CPU3       CPU4       CPU5
CPU6
  1:        249     218024     218286     218263     218235     218208
218179            MIPS  SMTC_IPI
  6:          0          0          0          0          0          0
0            MIPS  MSP CIC cascade
  8:          0          0          0          0          0          0
0         MSP_CIC  Softreset button
  9:          0          0          0          0          0          0
0         MSP_CIC  Standby switch
 21:          0          0          0          0          0          0
0         MSP_CIC  MSP PER cascade
 25:     218128        711         11          0          0          0
0         MSP_CIC  timer
 27:        341         22          0          0          2          0
6         MSP_CIC  serial

ERR:          0
/ # uname -a
Linux (none) 2.6.37-rc7-pmc-00001-g9cff2d6-dirty #289 SMP PREEMPT Tue
Jan 4 19:48:31 IST 2011 mips GNU/Linux

So clock setup / distribution on VPE1 is some thing need fix.

Thanks
Anoop


On Tue, 2011-01-04 at 18:32 +0530, Anoop P A wrote:
> On Tue, 2011-01-04 at 00:17 -0800, Kevin D. Kissell wrote:
> > Those interrupt counters show that IPIs are being taken everywhere,
> > though very few by CPUs 5 and 6.  If I understand the configuration
> > correctly, CPU 4 is a TC in VPE 1, and it's getting a reasonable IPI
> Yes CPU4 is in second VPE
> 
> > rate, *if* we're looking at a tickless kernel under low load.  But there
> No it was not the tickless kernel.I had selected 250 MHz timer. can't we
> expect IPI / timer interrupt for all the threads in this case ?.
> 
> > may be a clue there to part of your problem.  I have no idea why the
> > behavior would have changed from 2.6.36 to 2.6.37, but it looks as if
> > you're getting your clock interrupts through the MSP CIC interrupt
> > controller on VPE 0.  There's nothing symmetric for VPE1. The Malta
> > example code is perhaps deceptively simple, in that both VPEs have their
> > count/compare indication wired directly to the 2 clock interrupt inputs,
> > so that having both of them running with only a single set of irq state
> > just works.  I don't know whether the MSP CIC timer interrupt is a
> 
> In my case it is separate irq. MSP_INT_VPE1_TIMER (34) and  
> MSP_INT_VPE0_TIMER (25) are wired to CIC . CIC interrupt has been
> connected to cpu irq 6. 
> 
> I can reproduce cpu stall in VSMP mode If I don't setup VPE1 timer
> interrupt . Don't we have support for separate irq in SMTC
> implementation ?..
> 
> > gating of the VPE0 count/compare output, or whether it's it's own
> > interval timer, but I suspect that you may need to do some further
> > low-level initialization in the platform-specific code to set up an
> > interrupt on the VPE1 side.  I don't think the snippet you've got below
> > would work as written.
> 
> The routine which I copied works fine for VSMP mode . 
> 
> / # cat /proc/interrupts
>            CPU0       CPU1
>   0:        187        254            MIPS  IPI_resched
>   1:         77        174            MIPS  IPI_call
>   6:          0          0            MIPS  MSP CIC cascade
>   8:          0          0         MSP_CIC  Softreset button
>   9:          0          0         MSP_CIC  Standby switch
>  21:          0          0         MSP_CIC  MSP PER cascade
>  25:      37077          0         MSP_CIC  timer
>  27:        188          0         MSP_CIC  serial
>  34:          0      36986         MSP_CIC  timer
> 
> Do I want to change anything specific for SMTC ? .  
> 
> > 
> > If it's purely an issue with clock distribution on VPE1, then a boot
> > with maxvpes=1 maxtcs=4 should be stable.
> 
> Yes the kernel seems to be stable if I boot with maxvpes=1 maxtcs=4 .
> 
> > 
> > /K.
> > 
> > On 1/3/2011 11:20 AM, Anoop P A wrote:
> > > Hi Kevin,
> > >
> > > On Mon, 2011-01-03 at 08:14 -0800, Kevin D. Kissell wrote:
> > >> The very first SMTC implementations didn't support full kernel-mode
> > >> preemption, which anyway wasn't a priority, given the hardware event
> > >> response support in MIPS MT.  I believe it was later made compatible,
> > >> but it was never extensively exercised.  Since SMTC has fingers in some
> > >> pretty low-level atomicity mechanisms, if a new, parallel set was
> > >> implemented for RCU, I can easily imagine that nobody has yet
> > >> implemented SMTC-ified variants of that set.
> > >>
> > >> Your last statement isn't very clear, though.  Are you saying that if
> > >> you configure for no forced preemption and with TREE_CPU, the 2.6.37
> > >> kernel boots all the way up, or that it simply hangs later?   What's the
> > >> last rev kernel that actually boots all the way up?
> > > I have debugged this a bit more. It seems that kernel getting stalled
> > > while executing on TC's of second VPE . 
> > > INFO: rcu_sched_state detected stalls on CPUs/tasks: { 4 5 6} (detected
> > > by 1, t=2504 jiffies)
> > > INFO: rcu_sched_state detected stalls on CPUs/tasks: { 4 5 6} (detected
> > > by 1, t=10036 jiffies)
> > > INFO: rcu_sched_state detected stalls on CPUs/tasks: { 4 5 6} (detected
> > > by 1, t=17568 jiffies)
> > > INFO: rcu_sched_state detected stalls on CPUs/tasks: { 4 5 6} (detected
> > > by 1, t=25100 jiffies)
> > > INFO: rcu_sched_state detected stalls on CPUs/tasks: { 4 5 6} (detected
> > > by 1, t=32632 jiffies)
> > > INFO: rcu_sched_state detected stalls on CPUs/tasks: { 4 5 6} (detected
> > > by 1, t=40164 jiffies)
> > >
> > > With CONFIG_TREE_CPU we were not hitting this scenario very often.
> > > However with CONFIG_PREEMPT_TREE_CPU stall happens most of the time.
> > >
> > > I presume some issue in my timer setup . I am not seeing timer interrupt
> > > (or IPI interrupt) getting  incremented for VPE1 tcs on a completely
> > > booted 2.6.32-stable kernel.
> > >
> > > / # cat /proc/interrupts
> > >            CPU0       CPU1       CPU2       CPU3       CPU4       CPU5
> > > CPU6
> > >   1:        148      15023      15140      15093       3779          8
> > > 2            MIPS  SMTC_IPI
> > >   6:          0          0          0          0          0          0
> > > 0            MIPS  MSP CIC cascade
> > >   8:          0          0          0          0          0          0
> > > 0         MSP_CIC  Softreset button
> > >   9:          0          0          0          0          0          0
> > > 0         MSP_CIC  Standby switch
> > >  21:          0          0          0          0          0          0
> > > 0         MSP_CIC  MSP PER cascade
> > >  25:      15113        341          4          7          0          0
> > > 0         MSP_CIC  timer
> > >  27:        260          9          0          1          0          0
> > > 0         MSP_CIC  serial
> > >  34:          0          0          0          0          0          0
> > > 0         MSP_CIC  timer
> > >
> > > Can't we use separate timer interrupts for VPE1 and VPE0 in SMTC ?. 
> > >
> > > I have tried setting up VPE1 timer from get_co_compare_int as follows
> > >
> > > unsigned int __cpuinit get_c0_compare_int(void)
> > > {
> > > 	if ((1==get_current_vpe()) && !vpe1_timr_installed){
> > > 	
> > > 	memcpy(&timer_vpe1,&c0_compare_irqaction,sizeof(timer_vpe1));
> > > 	
> > > 	setup_irq(MSP_INT_VPE1_TIMER, &timer_vpe1);
> > >                   vpe1_timr_installed++;
> > >           }
> > >           return (get_current_vpe() ? MSP_INT_VPE1_TIMER :
> > > MSP_INT_VPE0_TIMER);
> > > }
> > >
> > > Thanks
> > > Anoop
> > >
> > >>             Regards,
> > >>
> > >>             Kevin K.
> > >>
> > >> On 1/3/2011 7:12 AM, Anoop P A wrote:
> > >>> Hi ,
> > >>>
> > >>> Following patch restricts TREE_CPU RCU implementation only for !PREEMPT
> > >>> SMP kernel.  
> > >>> http://git.linux-mips.org/?p=linux.git;a=commit;h=687d7a960aea46e016182c7ce346d62c4dbd0366
> > >>>
> > >>> CONFIG_TREE_PREEMPT_RCU option seems to be not working for SMTC kernel
> > >>> ( which will be only available RCU implementation for SMTC kernel from
> > >>> 2.6.37 onwards) .
> > >>>
> > >>> With no forced preemption and selecting TREE_CPU I am able to boot
> > >>> further to the hang that I have reported.
> > >>>
> > >>> Thanks
> > >>> Anoop
> > >>>
> > >>> On Sat, 2011-01-01 at 00:42 -0800, Kevin D. Kissell wrote:
> > >>>> At this point the logical thing to do would seem to look at your kernel
> > >>>> image and disassemble smtc_ipi_replay(), which is where the EPC of VPE 0
> > >>>> shows the last exception to have been taken.  That's a critical SMTC
> > >>>> routine that gets called whenever an xxx_irq_restore() enables
> > >>>> interrupts, so that virtual per-TC IPI interrupts that were posted while
> > >>>> the TC had interrupts disabled can be handled deterministically.  As I
> > >>>> mentioned in an earlier message, there was some cleanup work from David
> > >>>> Howell that changed a number of irq management-related function names
> > >>>> and prototypes across all architectures, which went into linux-mips.org
> > >>>> at very roughly the time of the breakage.  The SMTC overlay over the irq
> > >>>> implementation has been pretty robust, but it's written in a perhaps
> > >>>> doomed attempt to be both efficient and using a maximum amount of common
> > >>>> code with the general case.  A mechanical or semi-mechanical change
> > >>>> could conceivably have broken things.
> > >>>>
> > >>>>             Regards,
> > >>>>
> > >>>>             Kevin K.
> > >>>>
> > >>>>
> > >>>> On 12/31/2010 4:27 AM, Anoop P A wrote:
> > >>>>> Hi ,
> > >>>>>
> > >>>>> Kernel hangs on stop_machine call. Please find mt reg dump below.
> > >>>>> Another important observation is even though 2.6.33 kernel + stackframe
> > >>>>> patch well passes calibration hang , I am still unable boot in to a
> > >>>>> initramfs root ( verified ramfs working with VSMP). So it looks like
> > >>>>> still some issue to fix between 2.6.32 and 2.6.33 .
> > >>>>> ######################## Log ###########################
> > >>>>>
> > >>>>> === MIPS MT State Dump ===
> > >>>>> -- Global State --
> > >>>>>    MVPControl Passed: 00000005
> > >>>>>    MVPControl Read: 00000004
> > >>>>>    MVPConf0 : a8008406
> > >>>>> -- per-VPE State --
> > >>>>>   VPE 0
> > >>>>>    VPEControl : 00008000
> > >>>>>    VPEConf0 : 800f0003
> > >>>>>    VPE0.Status : 11004201
> > >>>>>    VPE0.EPC : 8010dc54 smtc_ipi_replay+0xcc/0x108
> > >>>>>    VPE0.Cause : 50804000
> > >>>>>    VPE0.Config7 : 00010000
> > >>>>>   VPE 1
> > >>>>>    VPEControl : 00068006
> > >>>>>    VPEConf0 : 80cf0003
> > >>>>>    VPE1.Status : 11008301
> > >>>>>    VPE1.EPC : 801022a0 r4k_wait+0x20/0x40
> > >>>>>    VPE1.Cause : 50800000
> > >>>>>    VPE1.Config7 : 00010000
> > >>>>> -- per-TC State --
> > >>>>>   TC 0 (current TC with VPE EPC above)
> > >>>>>    TCStatus : 18102000
> > >>>>>    TCBind : 00000000
> > >>>>>    TCRestart : 803fa19c printk+0xc/0x30
> > >>>>>    TCHalt : 00000000
> > >>>>>    TCContext : 00000000
> > >>>>>   TC 1
> > >>>>>    TCStatus : 18902000
> > >>>>>    TCBind : 00200000
> > >>>>>    TCRestart : 801022a0 r4k_wait+0x20/0x40
> > >>>>>    TCHalt : 00000000
> > >>>>>    TCContext : 00140000
> > >>>>>   TC 2
> > >>>>>    TCStatus : 18902000
> > >>>>>    TCBind : 00400000
> > >>>>>    TCRestart : 801022a0 r4k_wait+0x20/0x40
> > >>>>>    TCHalt : 00000000
> > >>>>>    TCContext : 00280000
> > >>>>>   TC 3
> > >>>>>    TCStatus : 18902000
> > >>>>>    TCBind : 00600000
> > >>>>>    TCRestart : 801022a0 r4k_wait+0x20/0x40
> > >>>>>    TCHalt : 00000000
> > >>>>>    TCContext : 003c0000
> > >>>>>   TC 4
> > >>>>>    TCStatus : 18902000
> > >>>>>    TCBind : 00800001
> > >>>>>    TCRestart : 8010229c r4k_wait+0x1c/0x40
> > >>>>>    TCHalt : 00000000
> > >>>>>    TCContext : 00500000
> > >>>>>   TC 5
> > >>>>>    TCStatus : 18902000
> > >>>>>    TCBind : 00a00001
> > >>>>>    TCRestart : 8010229c r4k_wait+0x1c/0x40
> > >>>>>    TCHalt : 00000000
> > >>>>>    TCContext : 00640000
> > >>>>>   TC 6
> > >>>>>    TCStatus : 18902000
> > >>>>>    TCBind : 00c00001
> > >>>>>    TCRestart : 8010229c r4k_wait+0x1c/0x40
> > >>>>>    TCHalt : 00000000
> > >>>>>    TCContext : 00780000
> > >>>>> Counter Interrupts taken per CPU (TC)
> > >>>>> 0: 0
> > >>>>> 1: 0
> > >>>>> 2: 0
> > >>>>> 3: 0
> > >>>>> 4: 0
> > >>>>> 5: 0
> > >>>>> 6: 0
> > >>>>> 7: 0
> > >>>>> Self-IPI invocations:
> > >>>>> 0: 12
> > >>>>> 1: 0
> > >>>>> 2: 0
> > >>>>> 3: 0
> > >>>>> 4: 0
> > >>>>> 5: 5
> > >>>>> 6: 4
> > >>>>> 7: 0
> > >>>>> IPIQ[0]: head = 0x0, tail = 0x0, depth = 0
> > >>>>> IPIQ[1]: head = 0x0, tail = 0x0, depth = 0
> > >>>>> IPIQ[2]: head = 0x0, tail = 0x0, depth = 0
> > >>>>> IPIQ[3]: head = 0x0, tail = 0x0, depth = 0
> > >>>>> IPIQ[4]: head = 0x0, tail = 0x0, depth = 0
> > >>>>> IPIQ[5]: head = 0x0, tail = 0x0, depth = 0
> > >>>>> IPIQ[6]: head = 0x0, tail = 0x0, depth = 0
> > >>>>> IPIQ[7]: head = 0x0, tail = 0x0, depth = 0
> > >>>>> 0 Recoveries of "stolen" FPU
> > >>>>> ===========================
> > >>>>>
> > >>>>> ################################################################
> > >>>>>
> > >>>>> Thanks
> > >>>>> Anoop
> > >>>>>
> > >>>>> On Tue, 2010-12-28 at 00:43 -0800, Kevin D. Kissell wrote:
> > >>>>>> I took a quick look last night, and the only thing that looked vaguely 
> > >>>>>> dangerous in changes since the timer changes I alluded to earlier was 
> > >>>>>> the global naming cleanup of irq-related function names that David 
> > >>>>>> Howell submitted.  The diff didn't look dangerous in itself, but some of 
> > >>>>>> the definitions are nested subtly for SMTC to maximize the amount of 
> > >>>>>> common code, and I could imagine something getting lost in translation 
> > >>>>>> there.  If that were really the problem, it would of course affect much 
> > >>>>>> more than just the timer subsystem, but early in the boot process, 
> > >>>>>> timers are pretty much the only interrupts that have to be handled 
> > >>>>>> correctly.
> > >>>>>>
> > >>>>>> I'm travelling today, but will take a look at timekeeping_notify() 
> > >>>>>> tomorrow or the next day...
> > >>>>>>
> > >>>>>> /K.
> > >>>>>>
> > >>>>>> On 12/28/10 12:19 AM, Anoop P A wrote:
> > >>>>>>> Hi,
> > >>>>>>>
> > >>>>>>> I had a glance into the code diff without notice of any suspect-able
> > >>>>>>> code .
> > >>>>>>> Tracing the hang showed that it is getting hanged in timekeeping_notify
> > >>>>>>> function.
> > >>>>>>>
> > >>>>>>> Thanks,
> > >>>>>>> Anoop
> > >>>>>>>
> > >>>>>>> PS: I may not be available until Thursday
> > >>>>>>>
> > >>>>>>> On Mon, 2010-12-27 at 22:49 +0530, Anoop P A wrote:
> > >>>>>>>> Hi Kevin,
> > >>>>>>>>
> > >>>>>>>> It is very unlikely that the patch you pointed has any impact on the the
> > >>>>>>>> hang I am seeing. The patch you have mentioned got into kernel around
> > >>>>>>>> 2.6.32 timeframe. I am able to boot both 2.6.32 and  2.6.33 kernel ( +
> > >>>>>>>> stackframe patch) .
> > >>>>>>>>
> > >>>>>>>> Hi Stuart,
> > >>>>>>>>
> > >>>>>>>> I haven't got much time to spend on this today.
> > >>>>>>>>
> > >>>>>>>> I had got 2.6.36-stable(+ stack frame patch) booting last day and I have
> > >>>>>>>> observed hang issue with 2.6.37-rc1 ( Same as rc6 and current git head)
> > >>>>>>>>
> > >>>>>>>> So probably some patches in 2.6.37 branch introduced this hang.
> > >>>>>>>>
> > >>>>>>>> Hopefully I will get some free slot tomorrow so that I can look into
> > >>>>>>>> code diff .
> > >>>>>>>>
> > >>>>>>>> Thanks
> > >>>>>>>> Anoop
> > >>>>>>>>
> > >>>>>>>> On Mon, 2010-12-27 at 09:49 -0600, STUART VENTERS wrote:
> > >>>>>>>>> Kevin,
> > >>>>>>>>>
> > >>>>>>>>> Outstanding, sometimes it's better to be lucky than good.
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>> Anoop,
> > >>>>>>>>>
> > >>>>>>>>> Maybe we can get lucky again.
> > >>>>>>>>>
> > >>>>>>>>> If you can isolate the .33 works/.37 works_not bug to a specific pair of versions,
> > >>>>>>>>>     I'll be happy to do another diff.
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>> Hope you'll have had a good Christmas as well.
> > >>>>>>>>>    We've had snow in Alabama since Christmas eve!
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>> Regards,
> > >>>>>>>>>
> > >>>>>>>>> Stuart
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>> -----Original Message-----
> > >>>>>>>>> From: Kevin D. Kissell [mailto:kevink@paralogos.com]
> > >>>>>>>>> Sent: Friday, December 24, 2010 5:34 PM
> > >>>>>>>>> To: Anoop P A
> > >>>>>>>>> Cc: STUART VENTERS; Anoop P.A.; linux-mips@linux-mips.org
> > >>>>>>>>> Subject: Re: SMTC support status in latest git head.
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>> Ah, well, at least we have a stackframe.h fix that preserves David's
> > >>>>>>>>> performance tweak for the deeper pipelined processors.  In looking for
> > >>>>>>>>> this, I did notice that someone did some modification to the SMTC clock
> > >>>>>>>>> tick logic that I was skeptical had ever been tested.  If you've still
> > >>>>>>>>> got that kernel binary handy, you might check to see if it boots with
> > >>>>>>>>> maxtcs=1 maxvpes=1, maxtcs=2 maxvpes=1, and/or maxtcs=2 maxvpes=2.
> > >>>>>>>>>
> > >>>>>>>>> Oh, yes, and Merry Christmas one and all!
> > >>>>>>>>>
> > >>>>>>>>>               Regards,
> > >>>>>>>>>
> > >>>>>>>>>               Kevin K.
> > >>>>>>>>>
> > >>>>>>>>> On 12/24/10 8:02 AM, Anoop P A wrote:
> > >>>>>>>>>> On Fri, 2010-12-24 at 06:53 -0800, Kevin D. Kissell wrote:
> > >>>>>>>>>>> Excellent!  Now, does the attached patch (relative to 2.6.37.11) also
> > >>>>>>>>>>> fix things, while preserving the other fixes and performance enhancements?
> > >>>>>>>>>>>
> > >>>>>>>>>> I have tested that patch with 2.6.37 branch it well passes calibration
> > >>>>>>>>>> loop but hangs after switching to mips closource
> > >>>>>>>>>>
> > >>>>>>>>>> TC 6 going on-line as CPU 6
> > >>>>>>>>>> Brought up 7 CPUs
> > >>>>>>>>>> bio: create slab<bio-0>   at 0
> > >>>>>>>>>> SCSI subsystem initialized
> > >>>>>>>>>> Switching to clocksource MIPS
> > >>>>>>>>>>
> > >>>>>>>>>> I Presume this is a different issue as restoring older file didn't help
> > >>>>>>>>>> much to get rid of this hang.
> > >>>>>>>>>>
> > >>>>>>>>>> diff --git a/arch/mips/include/asm/stackframe.h
> > >>>>>>>>>> b/arch/mips/include/asm/stackframe.h
> > >>>>>>>>>> index 58730c5..7fc9f10 100644
> > >>>>>>>>>> --- a/arch/mips/include/asm/stackframe.h
> > >>>>>>>>>> +++ b/arch/mips/include/asm/stackframe.h
> > >>>>>>>>>> @@ -195,9 +195,9 @@
> > >>>>>>>>>>    		 * to cover the pipeline delay.
> > >>>>>>>>>>    		 */
> > >>>>>>>>>>    		.set	mips32
> > >>>>>>>>>> -		mfc0	v1, CP0_TCSTATUS
> > >>>>>>>>>> +		mfc0	v0, CP0_TCSTATUS
> > >>>>>>>>>>    		.set	mips0
> > >>>>>>>>>> -		LONG_S	v1, PT_TCSTATUS(sp)
> > >>>>>>>>>> +		LONG_S	v0, PT_TCSTATUS(sp)
> > >>>>>>>>>>    #endif /* CONFIG_MIPS_MT_SMTC */
> > >>>>>>>>>>    		LONG_S	$4, PT_R4(sp)
> > >>>>>>>>>>    		LONG_S	$5, PT_R5(sp)
> > >>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>>> /K.
> > >>>>>>>>>>>
> > >>>>>>>>>>> On 12/24/10 6:39 AM, Anoop P A wrote:
> > >>>>>>>>>>>> Hi Kevin, Stuart ,
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> Woohooo You guys spotted !.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>     http://git.linux-mips.org/?p=linux.git;a=commit;h=d5ec6e3c seems to be
> > >>>>>>>>>>>> the culprit
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> Once I restored previous version of stackframe.h 2.6.33-stable started
> > >>>>>>>>>>>> booting !.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> Thanks,
> > >>>>>>>>>>>> Anoop
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> On Fri, 2010-12-24 at 04:32 -0800, Kevin D. Kissell wrote:
> > >>>>>>>>>>>>> Thank you, Stuart!  I've spotted some definite breakage to SMTC between
> > >>>>>>>>>>>>> those versions.  In arch/mips/include/asm/stackframe.h, someone moved
> > >>>>>>>>>>>>> the store of the Status register value in SAVE_SOME (line 169 or 204,
> > >>>>>>>>>>>>> depending on the version) from two instructions after the mfc0 to a
> > >>>>>>>>>>>>> point after the #ifdef for SMTC, presumably to get better pipelining of
> > >>>>>>>>>>>>> the register access.  Unfortunately, the v1 register is also used in the
> > >>>>>>>>>>>>> SMTC-specific fragment to save TCStatus, so the Status value gets
> > >>>>>>>>>>>>> clobbered before it gets stored.  This will eventually result in the
> > >>>>>>>>>>>>> Status register getting a TCStatus value, which has some bits on common,
> > >>>>>>>>>>>>> but isn't identical and sooner or later Bad Things will happen.
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> I'm a little surprised this wasn't caught by visual inspection of the patch.
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> Possible solutions would include reverting the store of the CP0_STATUS
> > >>>>>>>>>>>>> value to the block above the #ifdef, or, to retain whatever performance
> > >>>>>>>>>>>>> advantage was obtained by moving the store downward, to use v0/$2
> > >>>>>>>>>>>>> instead of v1/$3, as the staging register for the TCStatus value.  I'd
> > >>>>>>>>>>>>> lean toward the second option, but I'm not in a position to test and
> > >>>>>>>>>>>>> submit a patch just now.
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>                 Regards,
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>                 Kevin K.
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> On 12/23/10 1:09 PM, STUART VENTERS wrote:
> > >>>>>>>>>>>>>> Kevin,
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> I'm not sure if it's useful,
> > >>>>>>>>>>>>>>        but finally I got the time to look at the two kernel versions Anoop pointed out.
> > >>>>>>>>>>>>>>         works   2.6.32-stable with patch 804
> > >>>>>>>>>>>>>>         works_not 2.6.33-stable
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> greping for files with CONFIG_MIPS_MT_SMTC
> > >>>>>>>>>>>>>>        and looking for timer interrupt related stuff found the following differences:
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> arch/mips/include/asm/irq.h
> > >>>>>>>>>>>>>> arch/mips/kernel/irq.c
> > >>>>>>>>>>>>>>       do_IRQ
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> arch/mips/include/asm/stackframe.h
> > >>>>>>>>>>>>>>       SAVE_SOME SAVE_TEMP get/set_saved_sp
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> arch/mips/include/asm/time.h
> > >>>>>>>>>>>>>>       clocksource_set_clock
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> arch/mips/kernel/process.c
> > >>>>>>>>>>>>>>       cpu_idle
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> arch/mips/kernel/smtc.c
> > >>>>>>>>>>>>>>       __irq_entry
> > >>>>>>>>>>>>>>       ipi_decode
> > >>>>>>>>>>>>>>           SMTC_CLOCK_TICK
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> Enclosed are the two subsets of files for a more expert look.
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> I'll try to look in more detail after Christmas.
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> Cheers,
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> Stuart
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>
> > >
> > 
> 

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* Re: SMTC support status in latest git head.
  2011-01-04 14:37                                                           ` Anoop P A
@ 2011-01-04 17:21                                                             ` Kevin D. Kissell
  2011-01-04 17:54                                                               ` Anoop P A
  0 siblings, 1 reply; 68+ messages in thread
From: Kevin D. Kissell @ 2011-01-04 17:21 UTC (permalink / raw)
  To: Anoop P A; +Cc: STUART VENTERS, Anoop P.A., linux-mips

I'm trying to figure out a reason why your change below should help, and 
offhand, modulo tool bugs, I don't see it.  I'm assuming that your diff 
below is a diff relative to the pre-patch stackframe.h.   I wouldn't 
bless it as an alternative because it moves code and comments 
unnecessarily - all you should really have to do is to move the


  190                 mfc0    v1, CP0_STATUS
  191                 LONG_S  $2, PT_R2(sp)

to be just after the #endif /* CONFIG_MIPS_MT_SMTC */ at around line 201.

If moving the save of zero to PT_R0(sp) actually makes a difference, 
it's evidence that you've got problems in your toolchain (or, heaven 
forbid, your pipeline)!

But I'd really like to see what your assembler is doing to the original 
patch for it to be broken.  Assembler instruction reordering is armed, 
but it ought not to move register moves and stores around in ways where 
your sequence

197                 .set    mips32
198                 mfc0    v1, CP0_TCSTATUS
199                 .set    mips0
200                 LONG_S  v1, PT_TCSTATUS(sp)
189                 LONG_S  $0, PT_R0(sp)
190                 mfc0    v1, CP0_STATUS
191                 LONG_S  $2, PT_R2(sp)
202                 LONG_S  $4, PT_R4(sp)
203                 LONG_S  $5, PT_R5(sp)
204                 LONG_S  v1, PT_STATUS(sp)

to work while

189                 LONG_S  $0, PT_R0(sp)
190                 mfc0    v1, CP0_STATUS
191                 LONG_S  $2, PT_R2(sp)
197                 .set    mips32
198                 mfc0    v0, CP0_TCSTATUS
199                 .set    mips0
200                 LONG_S  v0, PT_TCSTATUS(sp)
202                 LONG_S  $4, PT_R4(sp)
203                 LONG_S  $5, PT_R5(sp)
204                 LONG_S  v1, PT_STATUS(sp)

does not, provided that the identity of v0=$2, v1=$3 is respected.

One thing that does stick out as being different - though, again, I'd 
need to see the disassembly of an instance of the macro to know what it 
could have done - is that the SMTC conditiona code brackets the mfc0 of 
TCStatus with .set mips32/.set mips0.  Given that the code no longer has 
a .set mips0 early in the macro, it would be more correct to make it:

                 .set    push
                 .set    mips32
                 mfc0    v0, CP0_TCSTATUS    (or v1, if we move the mfc0 
v1,CP0_STATUS)
                 .set    pop

and presumably make a similar chage for the block from line 334 to 429.

But I don't see any causal path from that funniness to failure.

             Regards,

             Kevin K.

On 01/04/11 06:37, Anoop P A wrote:
> Hi Kevin,
>
> the stackframe patch that you have suggested had some side effects I was
> unable execute init. When I changed some thing like below it started
> working .Could you kindly review it ?.
>
> diff --git a/arch/mips/include/asm/stackframe.h
> b/arch/mips/include/asm/stackframe.h
> index 58730c5..da786ed 100644
> --- a/arch/mips/include/asm/stackframe.h
> +++ b/arch/mips/include/asm/stackframe.h
> @@ -181,14 +181,6 @@
>   #endif
>   		LONG_S	k0, PT_R29(sp)
>   		LONG_S	$3, PT_R3(sp)
> -		/*
> -		 * You might think that you don't need to save $0,
> -		 * but the FPU emulator and gdb remote debug stub
> -		 * need it to operate correctly
> -		 */
> -		LONG_S	$0, PT_R0(sp)
> -		mfc0	v1, CP0_STATUS
> -		LONG_S	$2, PT_R2(sp)
>   #ifdef CONFIG_MIPS_MT_SMTC
>   		/*
>   		 * Ideally, these instructions would be shuffled in
> @@ -199,6 +191,14 @@
>   		.set	mips0
>   		LONG_S	v1, PT_TCSTATUS(sp)
>   #endif /* CONFIG_MIPS_MT_SMTC */
> +		/*
> +		 * You might think that you don't need to save $0,
> +		 * but the FPU emulator and gdb remote debug stub
> +		 * need it to operate correctly
> +		 */
> +		LONG_S	$0, PT_R0(sp)
> +		mfc0	v1, CP0_STATUS
> +		LONG_S	$2, PT_R2(sp)
>   		LONG_S	$4, PT_R4(sp)
>   		LONG_S	$5, PT_R5(sp)
>   		LONG_S	v1, PT_STATUS(sp)
>
> Linux-2.6.37-rc7 boots all the way if I specify maxvpes=1 in command
> line.
>
> / # cat /proc/interrupts
>             CPU0       CPU1       CPU2       CPU3       CPU4       CPU5
> CPU6
>    1:        249     218024     218286     218263     218235     218208
> 218179            MIPS  SMTC_IPI
>    6:          0          0          0          0          0          0
> 0            MIPS  MSP CIC cascade
>    8:          0          0          0          0          0          0
> 0         MSP_CIC  Softreset button
>    9:          0          0          0          0          0          0
> 0         MSP_CIC  Standby switch
>   21:          0          0          0          0          0          0
> 0         MSP_CIC  MSP PER cascade
>   25:     218128        711         11          0          0          0
> 0         MSP_CIC  timer
>   27:        341         22          0          0          2          0
> 6         MSP_CIC  serial
>
> ERR:          0
> / # uname -a
> Linux (none) 2.6.37-rc7-pmc-00001-g9cff2d6-dirty #289 SMP PREEMPT Tue
> Jan 4 19:48:31 IST 2011 mips GNU/Linux
>
> So clock setup / distribution on VPE1 is some thing need fix.
>
> Thanks
> Anoop
>
>
> On Tue, 2011-01-04 at 18:32 +0530, Anoop P A wrote:
>> On Tue, 2011-01-04 at 00:17 -0800, Kevin D. Kissell wrote:
>>> Those interrupt counters show that IPIs are being taken everywhere,
>>> though very few by CPUs 5 and 6.  If I understand the configuration
>>> correctly, CPU 4 is a TC in VPE 1, and it's getting a reasonable IPI
>> Yes CPU4 is in second VPE
>>
>>> rate, *if* we're looking at a tickless kernel under low load.  But there
>> No it was not the tickless kernel.I had selected 250 MHz timer. can't we
>> expect IPI / timer interrupt for all the threads in this case ?.
>>
>>> may be a clue there to part of your problem.  I have no idea why the
>>> behavior would have changed from 2.6.36 to 2.6.37, but it looks as if
>>> you're getting your clock interrupts through the MSP CIC interrupt
>>> controller on VPE 0.  There's nothing symmetric for VPE1. The Malta
>>> example code is perhaps deceptively simple, in that both VPEs have their
>>> count/compare indication wired directly to the 2 clock interrupt inputs,
>>> so that having both of them running with only a single set of irq state
>>> just works.  I don't know whether the MSP CIC timer interrupt is a
>> In my case it is separate irq. MSP_INT_VPE1_TIMER (34) and
>> MSP_INT_VPE0_TIMER (25) are wired to CIC . CIC interrupt has been
>> connected to cpu irq 6.
>>
>> I can reproduce cpu stall in VSMP mode If I don't setup VPE1 timer
>> interrupt . Don't we have support for separate irq in SMTC
>> implementation ?..
>>
>>> gating of the VPE0 count/compare output, or whether it's it's own
>>> interval timer, but I suspect that you may need to do some further
>>> low-level initialization in the platform-specific code to set up an
>>> interrupt on the VPE1 side.  I don't think the snippet you've got below
>>> would work as written.
>> The routine which I copied works fine for VSMP mode .
>>
>> / # cat /proc/interrupts
>>             CPU0       CPU1
>>    0:        187        254            MIPS  IPI_resched
>>    1:         77        174            MIPS  IPI_call
>>    6:          0          0            MIPS  MSP CIC cascade
>>    8:          0          0         MSP_CIC  Softreset button
>>    9:          0          0         MSP_CIC  Standby switch
>>   21:          0          0         MSP_CIC  MSP PER cascade
>>   25:      37077          0         MSP_CIC  timer
>>   27:        188          0         MSP_CIC  serial
>>   34:          0      36986         MSP_CIC  timer
>>
>> Do I want to change anything specific for SMTC ? .
>>
>>> If it's purely an issue with clock distribution on VPE1, then a boot
>>> with maxvpes=1 maxtcs=4 should be stable.
>> Yes the kernel seems to be stable if I boot with maxvpes=1 maxtcs=4 .
>>
>>> /K.
>>>
>>> On 1/3/2011 11:20 AM, Anoop P A wrote:
>>>> Hi Kevin,
>>>>
>>>> On Mon, 2011-01-03 at 08:14 -0800, Kevin D. Kissell wrote:
>>>>> The very first SMTC implementations didn't support full kernel-mode
>>>>> preemption, which anyway wasn't a priority, given the hardware event
>>>>> response support in MIPS MT.  I believe it was later made compatible,
>>>>> but it was never extensively exercised.  Since SMTC has fingers in some
>>>>> pretty low-level atomicity mechanisms, if a new, parallel set was
>>>>> implemented for RCU, I can easily imagine that nobody has yet
>>>>> implemented SMTC-ified variants of that set.
>>>>>
>>>>> Your last statement isn't very clear, though.  Are you saying that if
>>>>> you configure for no forced preemption and with TREE_CPU, the 2.6.37
>>>>> kernel boots all the way up, or that it simply hangs later?   What's the
>>>>> last rev kernel that actually boots all the way up?
>>>> I have debugged this a bit more. It seems that kernel getting stalled
>>>> while executing on TC's of second VPE .
>>>> INFO: rcu_sched_state detected stalls on CPUs/tasks: { 4 5 6} (detected
>>>> by 1, t=2504 jiffies)
>>>> INFO: rcu_sched_state detected stalls on CPUs/tasks: { 4 5 6} (detected
>>>> by 1, t=10036 jiffies)
>>>> INFO: rcu_sched_state detected stalls on CPUs/tasks: { 4 5 6} (detected
>>>> by 1, t=17568 jiffies)
>>>> INFO: rcu_sched_state detected stalls on CPUs/tasks: { 4 5 6} (detected
>>>> by 1, t=25100 jiffies)
>>>> INFO: rcu_sched_state detected stalls on CPUs/tasks: { 4 5 6} (detected
>>>> by 1, t=32632 jiffies)
>>>> INFO: rcu_sched_state detected stalls on CPUs/tasks: { 4 5 6} (detected
>>>> by 1, t=40164 jiffies)
>>>>
>>>> With CONFIG_TREE_CPU we were not hitting this scenario very often.
>>>> However with CONFIG_PREEMPT_TREE_CPU stall happens most of the time.
>>>>
>>>> I presume some issue in my timer setup . I am not seeing timer interrupt
>>>> (or IPI interrupt) getting  incremented for VPE1 tcs on a completely
>>>> booted 2.6.32-stable kernel.
>>>>
>>>> / # cat /proc/interrupts
>>>>             CPU0       CPU1       CPU2       CPU3       CPU4       CPU5
>>>> CPU6
>>>>    1:        148      15023      15140      15093       3779          8
>>>> 2            MIPS  SMTC_IPI
>>>>    6:          0          0          0          0          0          0
>>>> 0            MIPS  MSP CIC cascade
>>>>    8:          0          0          0          0          0          0
>>>> 0         MSP_CIC  Softreset button
>>>>    9:          0          0          0          0          0          0
>>>> 0         MSP_CIC  Standby switch
>>>>   21:          0          0          0          0          0          0
>>>> 0         MSP_CIC  MSP PER cascade
>>>>   25:      15113        341          4          7          0          0
>>>> 0         MSP_CIC  timer
>>>>   27:        260          9          0          1          0          0
>>>> 0         MSP_CIC  serial
>>>>   34:          0          0          0          0          0          0
>>>> 0         MSP_CIC  timer
>>>>
>>>> Can't we use separate timer interrupts for VPE1 and VPE0 in SMTC ?.
>>>>
>>>> I have tried setting up VPE1 timer from get_co_compare_int as follows
>>>>
>>>> unsigned int __cpuinit get_c0_compare_int(void)
>>>> {
>>>> 	if ((1==get_current_vpe())&&  !vpe1_timr_installed){
>>>> 	
>>>> 	memcpy(&timer_vpe1,&c0_compare_irqaction,sizeof(timer_vpe1));
>>>> 	
>>>> 	setup_irq(MSP_INT_VPE1_TIMER,&timer_vpe1);
>>>>                    vpe1_timr_installed++;
>>>>            }
>>>>            return (get_current_vpe() ? MSP_INT_VPE1_TIMER :
>>>> MSP_INT_VPE0_TIMER);
>>>> }
>>>>
>>>> Thanks
>>>> Anoop
>>>>
>>>>>              Regards,
>>>>>
>>>>>              Kevin K.
>>>>>
>>>>> On 1/3/2011 7:12 AM, Anoop P A wrote:
>>>>>> Hi ,
>>>>>>
>>>>>> Following patch restricts TREE_CPU RCU implementation only for !PREEMPT
>>>>>> SMP kernel.
>>>>>> http://git.linux-mips.org/?p=linux.git;a=commit;h=687d7a960aea46e016182c7ce346d62c4dbd0366
>>>>>>
>>>>>> CONFIG_TREE_PREEMPT_RCU option seems to be not working for SMTC kernel
>>>>>> ( which will be only available RCU implementation for SMTC kernel from
>>>>>> 2.6.37 onwards) .
>>>>>>
>>>>>> With no forced preemption and selecting TREE_CPU I am able to boot
>>>>>> further to the hang that I have reported.
>>>>>>
>>>>>> Thanks
>>>>>> Anoop
>>>>>>
>>>>>> On Sat, 2011-01-01 at 00:42 -0800, Kevin D. Kissell wrote:
>>>>>>> At this point the logical thing to do would seem to look at your kernel
>>>>>>> image and disassemble smtc_ipi_replay(), which is where the EPC of VPE 0
>>>>>>> shows the last exception to have been taken.  That's a critical SMTC
>>>>>>> routine that gets called whenever an xxx_irq_restore() enables
>>>>>>> interrupts, so that virtual per-TC IPI interrupts that were posted while
>>>>>>> the TC had interrupts disabled can be handled deterministically.  As I
>>>>>>> mentioned in an earlier message, there was some cleanup work from David
>>>>>>> Howell that changed a number of irq management-related function names
>>>>>>> and prototypes across all architectures, which went into linux-mips.org
>>>>>>> at very roughly the time of the breakage.  The SMTC overlay over the irq
>>>>>>> implementation has been pretty robust, but it's written in a perhaps
>>>>>>> doomed attempt to be both efficient and using a maximum amount of common
>>>>>>> code with the general case.  A mechanical or semi-mechanical change
>>>>>>> could conceivably have broken things.
>>>>>>>
>>>>>>>              Regards,
>>>>>>>
>>>>>>>              Kevin K.
>>>>>>>
>>>>>>>
>>>>>>> On 12/31/2010 4:27 AM, Anoop P A wrote:
>>>>>>>> Hi ,
>>>>>>>>
>>>>>>>> Kernel hangs on stop_machine call. Please find mt reg dump below.
>>>>>>>> Another important observation is even though 2.6.33 kernel + stackframe
>>>>>>>> patch well passes calibration hang , I am still unable boot in to a
>>>>>>>> initramfs root ( verified ramfs working with VSMP). So it looks like
>>>>>>>> still some issue to fix between 2.6.32 and 2.6.33 .
>>>>>>>> ######################## Log ###########################
>>>>>>>>
>>>>>>>> === MIPS MT State Dump ===
>>>>>>>> -- Global State --
>>>>>>>>     MVPControl Passed: 00000005
>>>>>>>>     MVPControl Read: 00000004
>>>>>>>>     MVPConf0 : a8008406
>>>>>>>> -- per-VPE State --
>>>>>>>>    VPE 0
>>>>>>>>     VPEControl : 00008000
>>>>>>>>     VPEConf0 : 800f0003
>>>>>>>>     VPE0.Status : 11004201
>>>>>>>>     VPE0.EPC : 8010dc54 smtc_ipi_replay+0xcc/0x108
>>>>>>>>     VPE0.Cause : 50804000
>>>>>>>>     VPE0.Config7 : 00010000
>>>>>>>>    VPE 1
>>>>>>>>     VPEControl : 00068006
>>>>>>>>     VPEConf0 : 80cf0003
>>>>>>>>     VPE1.Status : 11008301
>>>>>>>>     VPE1.EPC : 801022a0 r4k_wait+0x20/0x40
>>>>>>>>     VPE1.Cause : 50800000
>>>>>>>>     VPE1.Config7 : 00010000
>>>>>>>> -- per-TC State --
>>>>>>>>    TC 0 (current TC with VPE EPC above)
>>>>>>>>     TCStatus : 18102000
>>>>>>>>     TCBind : 00000000
>>>>>>>>     TCRestart : 803fa19c printk+0xc/0x30
>>>>>>>>     TCHalt : 00000000
>>>>>>>>     TCContext : 00000000
>>>>>>>>    TC 1
>>>>>>>>     TCStatus : 18902000
>>>>>>>>     TCBind : 00200000
>>>>>>>>     TCRestart : 801022a0 r4k_wait+0x20/0x40
>>>>>>>>     TCHalt : 00000000
>>>>>>>>     TCContext : 00140000
>>>>>>>>    TC 2
>>>>>>>>     TCStatus : 18902000
>>>>>>>>     TCBind : 00400000
>>>>>>>>     TCRestart : 801022a0 r4k_wait+0x20/0x40
>>>>>>>>     TCHalt : 00000000
>>>>>>>>     TCContext : 00280000
>>>>>>>>    TC 3
>>>>>>>>     TCStatus : 18902000
>>>>>>>>     TCBind : 00600000
>>>>>>>>     TCRestart : 801022a0 r4k_wait+0x20/0x40
>>>>>>>>     TCHalt : 00000000
>>>>>>>>     TCContext : 003c0000
>>>>>>>>    TC 4
>>>>>>>>     TCStatus : 18902000
>>>>>>>>     TCBind : 00800001
>>>>>>>>     TCRestart : 8010229c r4k_wait+0x1c/0x40
>>>>>>>>     TCHalt : 00000000
>>>>>>>>     TCContext : 00500000
>>>>>>>>    TC 5
>>>>>>>>     TCStatus : 18902000
>>>>>>>>     TCBind : 00a00001
>>>>>>>>     TCRestart : 8010229c r4k_wait+0x1c/0x40
>>>>>>>>     TCHalt : 00000000
>>>>>>>>     TCContext : 00640000
>>>>>>>>    TC 6
>>>>>>>>     TCStatus : 18902000
>>>>>>>>     TCBind : 00c00001
>>>>>>>>     TCRestart : 8010229c r4k_wait+0x1c/0x40
>>>>>>>>     TCHalt : 00000000
>>>>>>>>     TCContext : 00780000
>>>>>>>> Counter Interrupts taken per CPU (TC)
>>>>>>>> 0: 0
>>>>>>>> 1: 0
>>>>>>>> 2: 0
>>>>>>>> 3: 0
>>>>>>>> 4: 0
>>>>>>>> 5: 0
>>>>>>>> 6: 0
>>>>>>>> 7: 0
>>>>>>>> Self-IPI invocations:
>>>>>>>> 0: 12
>>>>>>>> 1: 0
>>>>>>>> 2: 0
>>>>>>>> 3: 0
>>>>>>>> 4: 0
>>>>>>>> 5: 5
>>>>>>>> 6: 4
>>>>>>>> 7: 0
>>>>>>>> IPIQ[0]: head = 0x0, tail = 0x0, depth = 0
>>>>>>>> IPIQ[1]: head = 0x0, tail = 0x0, depth = 0
>>>>>>>> IPIQ[2]: head = 0x0, tail = 0x0, depth = 0
>>>>>>>> IPIQ[3]: head = 0x0, tail = 0x0, depth = 0
>>>>>>>> IPIQ[4]: head = 0x0, tail = 0x0, depth = 0
>>>>>>>> IPIQ[5]: head = 0x0, tail = 0x0, depth = 0
>>>>>>>> IPIQ[6]: head = 0x0, tail = 0x0, depth = 0
>>>>>>>> IPIQ[7]: head = 0x0, tail = 0x0, depth = 0
>>>>>>>> 0 Recoveries of "stolen" FPU
>>>>>>>> ===========================
>>>>>>>>
>>>>>>>> ################################################################
>>>>>>>>
>>>>>>>> Thanks
>>>>>>>> Anoop
>>>>>>>>
>>>>>>>> On Tue, 2010-12-28 at 00:43 -0800, Kevin D. Kissell wrote:
>>>>>>>>> I took a quick look last night, and the only thing that looked vaguely
>>>>>>>>> dangerous in changes since the timer changes I alluded to earlier was
>>>>>>>>> the global naming cleanup of irq-related function names that David
>>>>>>>>> Howell submitted.  The diff didn't look dangerous in itself, but some of
>>>>>>>>> the definitions are nested subtly for SMTC to maximize the amount of
>>>>>>>>> common code, and I could imagine something getting lost in translation
>>>>>>>>> there.  If that were really the problem, it would of course affect much
>>>>>>>>> more than just the timer subsystem, but early in the boot process,
>>>>>>>>> timers are pretty much the only interrupts that have to be handled
>>>>>>>>> correctly.
>>>>>>>>>
>>>>>>>>> I'm travelling today, but will take a look at timekeeping_notify()
>>>>>>>>> tomorrow or the next day...
>>>>>>>>>
>>>>>>>>> /K.
>>>>>>>>>
>>>>>>>>> On 12/28/10 12:19 AM, Anoop P A wrote:
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> I had a glance into the code diff without notice of any suspect-able
>>>>>>>>>> code .
>>>>>>>>>> Tracing the hang showed that it is getting hanged in timekeeping_notify
>>>>>>>>>> function.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Anoop
>>>>>>>>>>
>>>>>>>>>> PS: I may not be available until Thursday
>>>>>>>>>>
>>>>>>>>>> On Mon, 2010-12-27 at 22:49 +0530, Anoop P A wrote:
>>>>>>>>>>> Hi Kevin,
>>>>>>>>>>>
>>>>>>>>>>> It is very unlikely that the patch you pointed has any impact on the the
>>>>>>>>>>> hang I am seeing. The patch you have mentioned got into kernel around
>>>>>>>>>>> 2.6.32 timeframe. I am able to boot both 2.6.32 and  2.6.33 kernel ( +
>>>>>>>>>>> stackframe patch) .
>>>>>>>>>>>
>>>>>>>>>>> Hi Stuart,
>>>>>>>>>>>
>>>>>>>>>>> I haven't got much time to spend on this today.
>>>>>>>>>>>
>>>>>>>>>>> I had got 2.6.36-stable(+ stack frame patch) booting last day and I have
>>>>>>>>>>> observed hang issue with 2.6.37-rc1 ( Same as rc6 and current git head)
>>>>>>>>>>>
>>>>>>>>>>> So probably some patches in 2.6.37 branch introduced this hang.
>>>>>>>>>>>
>>>>>>>>>>> Hopefully I will get some free slot tomorrow so that I can look into
>>>>>>>>>>> code diff .
>>>>>>>>>>>
>>>>>>>>>>> Thanks
>>>>>>>>>>> Anoop
>>>>>>>>>>>
>>>>>>>>>>> On Mon, 2010-12-27 at 09:49 -0600, STUART VENTERS wrote:
>>>>>>>>>>>> Kevin,
>>>>>>>>>>>>
>>>>>>>>>>>> Outstanding, sometimes it's better to be lucky than good.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Anoop,
>>>>>>>>>>>>
>>>>>>>>>>>> Maybe we can get lucky again.
>>>>>>>>>>>>
>>>>>>>>>>>> If you can isolate the .33 works/.37 works_not bug to a specific pair of versions,
>>>>>>>>>>>>      I'll be happy to do another diff.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Hope you'll have had a good Christmas as well.
>>>>>>>>>>>>     We've had snow in Alabama since Christmas eve!
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Regards,
>>>>>>>>>>>>
>>>>>>>>>>>> Stuart
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>> From: Kevin D. Kissell [mailto:kevink@paralogos.com]
>>>>>>>>>>>> Sent: Friday, December 24, 2010 5:34 PM
>>>>>>>>>>>> To: Anoop P A
>>>>>>>>>>>> Cc: STUART VENTERS; Anoop P.A.; linux-mips@linux-mips.org
>>>>>>>>>>>> Subject: Re: SMTC support status in latest git head.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Ah, well, at least we have a stackframe.h fix that preserves David's
>>>>>>>>>>>> performance tweak for the deeper pipelined processors.  In looking for
>>>>>>>>>>>> this, I did notice that someone did some modification to the SMTC clock
>>>>>>>>>>>> tick logic that I was skeptical had ever been tested.  If you've still
>>>>>>>>>>>> got that kernel binary handy, you might check to see if it boots with
>>>>>>>>>>>> maxtcs=1 maxvpes=1, maxtcs=2 maxvpes=1, and/or maxtcs=2 maxvpes=2.
>>>>>>>>>>>>
>>>>>>>>>>>> Oh, yes, and Merry Christmas one and all!
>>>>>>>>>>>>
>>>>>>>>>>>>                Regards,
>>>>>>>>>>>>
>>>>>>>>>>>>                Kevin K.
>>>>>>>>>>>>
>>>>>>>>>>>> On 12/24/10 8:02 AM, Anoop P A wrote:
>>>>>>>>>>>>> On Fri, 2010-12-24 at 06:53 -0800, Kevin D. Kissell wrote:
>>>>>>>>>>>>>> Excellent!  Now, does the attached patch (relative to 2.6.37.11) also
>>>>>>>>>>>>>> fix things, while preserving the other fixes and performance enhancements?
>>>>>>>>>>>>>>
>>>>>>>>>>>>> I have tested that patch with 2.6.37 branch it well passes calibration
>>>>>>>>>>>>> loop but hangs after switching to mips closource
>>>>>>>>>>>>>
>>>>>>>>>>>>> TC 6 going on-line as CPU 6
>>>>>>>>>>>>> Brought up 7 CPUs
>>>>>>>>>>>>> bio: create slab<bio-0>    at 0
>>>>>>>>>>>>> SCSI subsystem initialized
>>>>>>>>>>>>> Switching to clocksource MIPS
>>>>>>>>>>>>>
>>>>>>>>>>>>> I Presume this is a different issue as restoring older file didn't help
>>>>>>>>>>>>> much to get rid of this hang.
>>>>>>>>>>>>>
>>>>>>>>>>>>> diff --git a/arch/mips/include/asm/stackframe.h
>>>>>>>>>>>>> b/arch/mips/include/asm/stackframe.h
>>>>>>>>>>>>> index 58730c5..7fc9f10 100644
>>>>>>>>>>>>> --- a/arch/mips/include/asm/stackframe.h
>>>>>>>>>>>>> +++ b/arch/mips/include/asm/stackframe.h
>>>>>>>>>>>>> @@ -195,9 +195,9 @@
>>>>>>>>>>>>>     		 * to cover the pipeline delay.
>>>>>>>>>>>>>     		 */
>>>>>>>>>>>>>     		.set	mips32
>>>>>>>>>>>>> -		mfc0	v1, CP0_TCSTATUS
>>>>>>>>>>>>> +		mfc0	v0, CP0_TCSTATUS
>>>>>>>>>>>>>     		.set	mips0
>>>>>>>>>>>>> -		LONG_S	v1, PT_TCSTATUS(sp)
>>>>>>>>>>>>> +		LONG_S	v0, PT_TCSTATUS(sp)
>>>>>>>>>>>>>     #endif /* CONFIG_MIPS_MT_SMTC */
>>>>>>>>>>>>>     		LONG_S	$4, PT_R4(sp)
>>>>>>>>>>>>>     		LONG_S	$5, PT_R5(sp)
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>> /K.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 12/24/10 6:39 AM, Anoop P A wrote:
>>>>>>>>>>>>>>> Hi Kevin, Stuart ,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Woohooo You guys spotted !.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>      http://git.linux-mips.org/?p=linux.git;a=commit;h=d5ec6e3c seems to be
>>>>>>>>>>>>>>> the culprit
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Once I restored previous version of stackframe.h 2.6.33-stable started
>>>>>>>>>>>>>>> booting !.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>> Anoop
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Fri, 2010-12-24 at 04:32 -0800, Kevin D. Kissell wrote:
>>>>>>>>>>>>>>>> Thank you, Stuart!  I've spotted some definite breakage to SMTC between
>>>>>>>>>>>>>>>> those versions.  In arch/mips/include/asm/stackframe.h, someone moved
>>>>>>>>>>>>>>>> the store of the Status register value in SAVE_SOME (line 169 or 204,
>>>>>>>>>>>>>>>> depending on the version) from two instructions after the mfc0 to a
>>>>>>>>>>>>>>>> point after the #ifdef for SMTC, presumably to get better pipelining of
>>>>>>>>>>>>>>>> the register access.  Unfortunately, the v1 register is also used in the
>>>>>>>>>>>>>>>> SMTC-specific fragment to save TCStatus, so the Status value gets
>>>>>>>>>>>>>>>> clobbered before it gets stored.  This will eventually result in the
>>>>>>>>>>>>>>>> Status register getting a TCStatus value, which has some bits on common,
>>>>>>>>>>>>>>>> but isn't identical and sooner or later Bad Things will happen.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I'm a little surprised this wasn't caught by visual inspection of the patch.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Possible solutions would include reverting the store of the CP0_STATUS
>>>>>>>>>>>>>>>> value to the block above the #ifdef, or, to retain whatever performance
>>>>>>>>>>>>>>>> advantage was obtained by moving the store downward, to use v0/$2
>>>>>>>>>>>>>>>> instead of v1/$3, as the staging register for the TCStatus value.  I'd
>>>>>>>>>>>>>>>> lean toward the second option, but I'm not in a position to test and
>>>>>>>>>>>>>>>> submit a patch just now.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>                  Regards,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>                  Kevin K.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On 12/23/10 1:09 PM, STUART VENTERS wrote:
>>>>>>>>>>>>>>>>> Kevin,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I'm not sure if it's useful,
>>>>>>>>>>>>>>>>>         but finally I got the time to look at the two kernel versions Anoop pointed out.
>>>>>>>>>>>>>>>>>          works   2.6.32-stable with patch 804
>>>>>>>>>>>>>>>>>          works_not 2.6.33-stable
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> greping for files with CONFIG_MIPS_MT_SMTC
>>>>>>>>>>>>>>>>>         and looking for timer interrupt related stuff found the following differences:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> arch/mips/include/asm/irq.h
>>>>>>>>>>>>>>>>> arch/mips/kernel/irq.c
>>>>>>>>>>>>>>>>>        do_IRQ
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> arch/mips/include/asm/stackframe.h
>>>>>>>>>>>>>>>>>        SAVE_SOME SAVE_TEMP get/set_saved_sp
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> arch/mips/include/asm/time.h
>>>>>>>>>>>>>>>>>        clocksource_set_clock
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> arch/mips/kernel/process.c
>>>>>>>>>>>>>>>>>        cpu_idle
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> arch/mips/kernel/smtc.c
>>>>>>>>>>>>>>>>>        __irq_entry
>>>>>>>>>>>>>>>>>        ipi_decode
>>>>>>>>>>>>>>>>>            SMTC_CLOCK_TICK
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Enclosed are the two subsets of files for a more expert look.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I'll try to look in more detail after Christmas.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Stuart
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>
>

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: SMTC support status in latest git head.
  2011-01-04 13:02                                                         ` Anoop P A
  2011-01-04 14:37                                                           ` Anoop P A
@ 2011-01-04 17:40                                                           ` Kevin D. Kissell
  2011-01-05 13:09                                                             ` Anoop P A
  1 sibling, 1 reply; 68+ messages in thread
From: Kevin D. Kissell @ 2011-01-04 17:40 UTC (permalink / raw)
  To: Anoop P A; +Cc: STUART VENTERS, Anoop P.A., linux-mips

On 01/04/11 05:02, Anoop P A wrote:
> On Tue, 2011-01-04 at 00:17 -0800, Kevin D. Kissell wrote:
>> Those interrupt counters show that IPIs are being taken everywhere,
>> though very few by CPUs 5 and 6.  If I understand the configuration
>> correctly, CPU 4 is a TC in VPE 1, and it's getting a reasonable IPI
> Yes CPU4 is in second VPE
>
>> rate, *if* we're looking at a tickless kernel under low load.  But there
> No it was not the tickless kernel.I had selected 250 MHz timer. can't we
> expect IPI / timer interrupt for all the threads in this case ?.

In that case, you should expect a distribution of timer interrupts that 
favors the low-numbered TCs within the VPE, as you do in VPE0, and a 
distribution of IPIs that is sort-of the inverse, as you do in VPE0.  
But the low counts on VPE1 are indeed suspicious, as you note.

>> may be a clue there to part of your problem.  I have no idea why the
>> behavior would have changed from 2.6.36 to 2.6.37, but it looks as if
>> you're getting your clock interrupts through the MSP CIC interrupt
>> controller on VPE 0.  There's nothing symmetric for VPE1. The Malta
>> example code is perhaps deceptively simple, in that both VPEs have their
>> count/compare indication wired directly to the 2 clock interrupt inputs,
>> so that having both of them running with only a single set of irq state
>> just works.  I don't know whether the MSP CIC timer interrupt is a
> In my case it is separate irq. MSP_INT_VPE1_TIMER (34) and
> MSP_INT_VPE0_TIMER (25) are wired to CIC . CIC interrupt has been
> connected to cpu irq 6.
>
> I can reproduce cpu stall in VSMP mode If I don't setup VPE1 timer
> interrupt . Don't we have support for separate irq in SMTC
> implementation ?..

There are hooks for platform-specific SMTC support, which is implemented 
for the Malta in arch/mips/mti-malta/malta-smtc.c.  See 
msmtc_init_secondary(), for example, where the clock/compare, profile, 
and IPI interrupts are armed for VPE 1, while I/O peripheral interrupts 
are inhibited.

>> gating of the VPE0 count/compare output, or whether it's it's own
>> interval timer, but I suspect that you may need to do some further
>> low-level initialization in the platform-specific code to set up an
>> interrupt on the VPE1 side.  I don't think the snippet you've got below
>> would work as written.
> The routine which I copied works fine for VSMP mode .
>
> / # cat /proc/interrupts
>             CPU0       CPU1
>    0:        187        254            MIPS  IPI_resched
>    1:         77        174            MIPS  IPI_call
>    6:          0          0            MIPS  MSP CIC cascade
>    8:          0          0         MSP_CIC  Softreset button
>    9:          0          0         MSP_CIC  Standby switch
>   21:          0          0         MSP_CIC  MSP PER cascade
>   25:      37077          0         MSP_CIC  timer
>   27:        188          0         MSP_CIC  serial
>   34:          0      36986         MSP_CIC  timer
>
> Do I want to change anything specific for SMTC ? .

If it works (which I doubt), then we can critique stylistic points like 
using

		if ((1==get_current_vpe())

Instead of the more readable and general

		if (get_current_vpe()>  0)


But I think you're generally looking in the wrong place.  Look at the 
Malta code and see what's done where.  The initial SMTC code had a lot 
of Malta assumptions in the main line that I pushed out to platform code 
in later patches.  I can see how things could be made even more modular, 
but for the moment I think it's just that there's some stuff that ought 
to be done in a "msp_smtc.c" file that doesn't exist in 2.6.37.

             Regards,

             Kevin K.
>
>
>> If it's purely an issue with clock distribution on VPE1, then a boot
>> with maxvpes=1 maxtcs=4 should be stable.
> Yes the kernel seems to be stable if I boot with maxvpes=1 maxtcs=4 .
>
>> /K.
>>
>> On 1/3/2011 11:20 AM, Anoop P A wrote:
>>> Hi Kevin,
>>>
>>> On Mon, 2011-01-03 at 08:14 -0800, Kevin D. Kissell wrote:
>>>> The very first SMTC implementations didn't support full kernel-mode
>>>> preemption, which anyway wasn't a priority, given the hardware event
>>>> response support in MIPS MT.  I believe it was later made compatible,
>>>> but it was never extensively exercised.  Since SMTC has fingers in some
>>>> pretty low-level atomicity mechanisms, if a new, parallel set was
>>>> implemented for RCU, I can easily imagine that nobody has yet
>>>> implemented SMTC-ified variants of that set.
>>>>
>>>> Your last statement isn't very clear, though.  Are you saying that if
>>>> you configure for no forced preemption and with TREE_CPU, the 2.6.37
>>>> kernel boots all the way up, or that it simply hangs later?   What's the
>>>> last rev kernel that actually boots all the way up?
>>> I have debugged this a bit more. It seems that kernel getting stalled
>>> while executing on TC's of second VPE .
>>> INFO: rcu_sched_state detected stalls on CPUs/tasks: { 4 5 6} (detected
>>> by 1, t=2504 jiffies)
>>> INFO: rcu_sched_state detected stalls on CPUs/tasks: { 4 5 6} (detected
>>> by 1, t=10036 jiffies)
>>> INFO: rcu_sched_state detected stalls on CPUs/tasks: { 4 5 6} (detected
>>> by 1, t=17568 jiffies)
>>> INFO: rcu_sched_state detected stalls on CPUs/tasks: { 4 5 6} (detected
>>> by 1, t=25100 jiffies)
>>> INFO: rcu_sched_state detected stalls on CPUs/tasks: { 4 5 6} (detected
>>> by 1, t=32632 jiffies)
>>> INFO: rcu_sched_state detected stalls on CPUs/tasks: { 4 5 6} (detected
>>> by 1, t=40164 jiffies)
>>>
>>> With CONFIG_TREE_CPU we were not hitting this scenario very often.
>>> However with CONFIG_PREEMPT_TREE_CPU stall happens most of the time.
>>>
>>> I presume some issue in my timer setup . I am not seeing timer interrupt
>>> (or IPI interrupt) getting  incremented for VPE1 tcs on a completely
>>> booted 2.6.32-stable kernel.
>>>
>>> / # cat /proc/interrupts
>>>             CPU0       CPU1       CPU2       CPU3       CPU4       CPU5
>>> CPU6
>>>    1:        148      15023      15140      15093       3779          8
>>> 2            MIPS  SMTC_IPI
>>>    6:          0          0          0          0          0          0
>>> 0            MIPS  MSP CIC cascade
>>>    8:          0          0          0          0          0          0
>>> 0         MSP_CIC  Softreset button
>>>    9:          0          0          0          0          0          0
>>> 0         MSP_CIC  Standby switch
>>>   21:          0          0          0          0          0          0
>>> 0         MSP_CIC  MSP PER cascade
>>>   25:      15113        341          4          7          0          0
>>> 0         MSP_CIC  timer
>>>   27:        260          9          0          1          0          0
>>> 0         MSP_CIC  serial
>>>   34:          0          0          0          0          0          0
>>> 0         MSP_CIC  timer
>>>
>>> Can't we use separate timer interrupts for VPE1 and VPE0 in SMTC ?.
>>>
>>> I have tried setting up VPE1 timer from get_co_compare_int as follows
>>>
>>> unsigned int __cpuinit get_c0_compare_int(void)
>>> {
>>> 	if ((1==get_current_vpe())&&  !vpe1_timr_installed){
>>> 	
>>> 	memcpy(&timer_vpe1,&c0_compare_irqaction,sizeof(timer_vpe1));
>>> 	
>>> 	setup_irq(MSP_INT_VPE1_TIMER,&timer_vpe1);
>>>                    vpe1_timr_installed++;
>>>            }
>>>            return (get_current_vpe() ? MSP_INT_VPE1_TIMER :
>>> MSP_INT_VPE0_TIMER);
>>> }
>>>
>>> Thanks
>>> Anoop

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: SMTC support status in latest git head.
  2011-01-04 17:21                                                             ` Kevin D. Kissell
@ 2011-01-04 17:54                                                               ` Anoop P A
  2011-01-04 18:33                                                                 ` Kevin D. Kissell
  0 siblings, 1 reply; 68+ messages in thread
From: Anoop P A @ 2011-01-04 17:54 UTC (permalink / raw)
  To: Kevin D. Kissell; +Cc: STUART VENTERS, Anoop P.A., linux-mips

On Tue, 2011-01-04 at 09:21 -0800, Kevin D. Kissell wrote:
> I'm trying to figure out a reason why your change below should help, and 
> offhand, modulo tool bugs, I don't see it.  I'm assuming that your diff 
> below is a diff relative to the pre-patch stackframe.h.   I wouldn't 
Yes patch created against stock code .

> bless it as an alternative because it moves code and comments 
> unnecessarily - all you should really have to do is to move the
> 
> 
>   190                 mfc0    v1, CP0_STATUS
>   191                 LONG_S  $2, PT_R2(sp)
> 
> to be just after the #endif /* CONFIG_MIPS_MT_SMTC */ at around line 201.
Actually I just moved code under CONFIG_MIPS_MT_SMTC to previous block
of code ( which store $0 ) . git diff did the rest on behalf of me :)

> 
> If moving the save of zero to PT_R0(sp) actually makes a difference, 
> it's evidence that you've got problems in your toolchain (or, heaven 
> forbid, your pipeline)!

In previous version of patch usage of V0 was creating issue. I have
verified this with previous version of code ( working code before
David's instruction rearrangement patch.) . 

> 
> But I'd really like to see what your assembler is doing to the original 
> patch for it to be broken.  Assembler instruction reordering is armed, 
> but it ought not to move register moves and stores around in ways where 
> your sequence
> 
> 197                 .set    mips32
> 198                 mfc0    v1, CP0_TCSTATUS
> 199                 .set    mips0
> 200                 LONG_S  v1, PT_TCSTATUS(sp)
> 189                 LONG_S  $0, PT_R0(sp)
> 190                 mfc0    v1, CP0_STATUS
> 191                 LONG_S  $2, PT_R2(sp)
> 202                 LONG_S  $4, PT_R4(sp)
> 203                 LONG_S  $5, PT_R5(sp)
> 204                 LONG_S  v1, PT_STATUS(sp)
> 
> to work while
> 
> 189                 LONG_S  $0, PT_R0(sp)
> 190                 mfc0    v1, CP0_STATUS
> 191                 LONG_S  $2, PT_R2(sp)
> 197                 .set    mips32
> 198                 mfc0    v0, CP0_TCSTATUS
> 199                 .set    mips0
> 200                 LONG_S  v0, PT_TCSTATUS(sp)
> 202                 LONG_S  $4, PT_R4(sp)
> 203                 LONG_S  $5, PT_R5(sp)
> 204                 LONG_S  v1, PT_STATUS(sp)
> 
> does not, provided that the identity of v0=$2, v1=$3 is respected.
> 
> One thing that does stick out as being different - though, again, I'd 
> need to see the disassembly of an instance of the macro to know what it 
> could have done - is that the SMTC conditiona code brackets the mfc0 of 
> TCStatus with .set mips32/.set mips0.  Given that the code no longer has 
> a .set mips0 early in the macro, it would be more correct to make it:
> 
>                  .set    push
>                  .set    mips32
>                  mfc0    v0, CP0_TCSTATUS    (or v1, if we move the mfc0 
> v1,CP0_STATUS)
>                  .set    pop
> 
> and presumably make a similar chage for the block from line 334 to 429.
> 
> But I don't see any causal path from that funniness to failure.
> 
>              Regards,
> 
>              Kevin K.
> 
> On 01/04/11 06:37, Anoop P A wrote:
> > Hi Kevin,
> >
> > the stackframe patch that you have suggested had some side effects I was
> > unable execute init. When I changed some thing like below it started
> > working .Could you kindly review it ?.
> >
> > diff --git a/arch/mips/include/asm/stackframe.h
> > b/arch/mips/include/asm/stackframe.h
> > index 58730c5..da786ed 100644
> > --- a/arch/mips/include/asm/stackframe.h
> > +++ b/arch/mips/include/asm/stackframe.h
> > @@ -181,14 +181,6 @@
> >   #endif
> >   		LONG_S	k0, PT_R29(sp)
> >   		LONG_S	$3, PT_R3(sp)
> > -		/*
> > -		 * You might think that you don't need to save $0,
> > -		 * but the FPU emulator and gdb remote debug stub
> > -		 * need it to operate correctly
> > -		 */
> > -		LONG_S	$0, PT_R0(sp)
> > -		mfc0	v1, CP0_STATUS
> > -		LONG_S	$2, PT_R2(sp)
> >   #ifdef CONFIG_MIPS_MT_SMTC
> >   		/*
> >   		 * Ideally, these instructions would be shuffled in
> > @@ -199,6 +191,14 @@
> >   		.set	mips0
> >   		LONG_S	v1, PT_TCSTATUS(sp)
> >   #endif /* CONFIG_MIPS_MT_SMTC */
> > +		/*
> > +		 * You might think that you don't need to save $0,
> > +		 * but the FPU emulator and gdb remote debug stub
> > +		 * need it to operate correctly
> > +		 */
> > +		LONG_S	$0, PT_R0(sp)
> > +		mfc0	v1, CP0_STATUS
> > +		LONG_S	$2, PT_R2(sp)
> >   		LONG_S	$4, PT_R4(sp)
> >   		LONG_S	$5, PT_R5(sp)
> >   		LONG_S	v1, PT_STATUS(sp)
> >
> > Linux-2.6.37-rc7 boots all the way if I specify maxvpes=1 in command
> > line.
> >
> > / # cat /proc/interrupts
> >             CPU0       CPU1       CPU2       CPU3       CPU4       CPU5
> > CPU6
> >    1:        249     218024     218286     218263     218235     218208
> > 218179            MIPS  SMTC_IPI
> >    6:          0          0          0          0          0          0
> > 0            MIPS  MSP CIC cascade
> >    8:          0          0          0          0          0          0
> > 0         MSP_CIC  Softreset button
> >    9:          0          0          0          0          0          0
> > 0         MSP_CIC  Standby switch
> >   21:          0          0          0          0          0          0
> > 0         MSP_CIC  MSP PER cascade
> >   25:     218128        711         11          0          0          0
> > 0         MSP_CIC  timer
> >   27:        341         22          0          0          2          0
> > 6         MSP_CIC  serial
> >
> > ERR:          0
> > / # uname -a
> > Linux (none) 2.6.37-rc7-pmc-00001-g9cff2d6-dirty #289 SMP PREEMPT Tue
> > Jan 4 19:48:31 IST 2011 mips GNU/Linux
> >
> > So clock setup / distribution on VPE1 is some thing need fix.
> >
> > Thanks
> > Anoop
> >
> >
> > On Tue, 2011-01-04 at 18:32 +0530, Anoop P A wrote:
> >> On Tue, 2011-01-04 at 00:17 -0800, Kevin D. Kissell wrote:
> >>> Those interrupt counters show that IPIs are being taken everywhere,
> >>> though very few by CPUs 5 and 6.  If I understand the configuration
> >>> correctly, CPU 4 is a TC in VPE 1, and it's getting a reasonable IPI
> >> Yes CPU4 is in second VPE
> >>
> >>> rate, *if* we're looking at a tickless kernel under low load.  But there
> >> No it was not the tickless kernel.I had selected 250 MHz timer. can't we
> >> expect IPI / timer interrupt for all the threads in this case ?.
> >>
> >>> may be a clue there to part of your problem.  I have no idea why the
> >>> behavior would have changed from 2.6.36 to 2.6.37, but it looks as if
> >>> you're getting your clock interrupts through the MSP CIC interrupt
> >>> controller on VPE 0.  There's nothing symmetric for VPE1. The Malta
> >>> example code is perhaps deceptively simple, in that both VPEs have their
> >>> count/compare indication wired directly to the 2 clock interrupt inputs,
> >>> so that having both of them running with only a single set of irq state
> >>> just works.  I don't know whether the MSP CIC timer interrupt is a
> >> In my case it is separate irq. MSP_INT_VPE1_TIMER (34) and
> >> MSP_INT_VPE0_TIMER (25) are wired to CIC . CIC interrupt has been
> >> connected to cpu irq 6.
> >>
> >> I can reproduce cpu stall in VSMP mode If I don't setup VPE1 timer
> >> interrupt . Don't we have support for separate irq in SMTC
> >> implementation ?..
> >>
> >>> gating of the VPE0 count/compare output, or whether it's it's own
> >>> interval timer, but I suspect that you may need to do some further
> >>> low-level initialization in the platform-specific code to set up an
> >>> interrupt on the VPE1 side.  I don't think the snippet you've got below
> >>> would work as written.
> >> The routine which I copied works fine for VSMP mode .
> >>
> >> / # cat /proc/interrupts
> >>             CPU0       CPU1
> >>    0:        187        254            MIPS  IPI_resched
> >>    1:         77        174            MIPS  IPI_call
> >>    6:          0          0            MIPS  MSP CIC cascade
> >>    8:          0          0         MSP_CIC  Softreset button
> >>    9:          0          0         MSP_CIC  Standby switch
> >>   21:          0          0         MSP_CIC  MSP PER cascade
> >>   25:      37077          0         MSP_CIC  timer
> >>   27:        188          0         MSP_CIC  serial
> >>   34:          0      36986         MSP_CIC  timer
> >>
> >> Do I want to change anything specific for SMTC ? .
> >>
> >>> If it's purely an issue with clock distribution on VPE1, then a boot
> >>> with maxvpes=1 maxtcs=4 should be stable.
> >> Yes the kernel seems to be stable if I boot with maxvpes=1 maxtcs=4 .
> >>
> >>> /K.
> >>>
> >>> On 1/3/2011 11:20 AM, Anoop P A wrote:
> >>>> Hi Kevin,
> >>>>
> >>>> On Mon, 2011-01-03 at 08:14 -0800, Kevin D. Kissell wrote:
> >>>>> The very first SMTC implementations didn't support full kernel-mode
> >>>>> preemption, which anyway wasn't a priority, given the hardware event
> >>>>> response support in MIPS MT.  I believe it was later made compatible,
> >>>>> but it was never extensively exercised.  Since SMTC has fingers in some
> >>>>> pretty low-level atomicity mechanisms, if a new, parallel set was
> >>>>> implemented for RCU, I can easily imagine that nobody has yet
> >>>>> implemented SMTC-ified variants of that set.
> >>>>>
> >>>>> Your last statement isn't very clear, though.  Are you saying that if
> >>>>> you configure for no forced preemption and with TREE_CPU, the 2.6.37
> >>>>> kernel boots all the way up, or that it simply hangs later?   What's the
> >>>>> last rev kernel that actually boots all the way up?
> >>>> I have debugged this a bit more. It seems that kernel getting stalled
> >>>> while executing on TC's of second VPE .
> >>>> INFO: rcu_sched_state detected stalls on CPUs/tasks: { 4 5 6} (detected
> >>>> by 1, t=2504 jiffies)
> >>>> INFO: rcu_sched_state detected stalls on CPUs/tasks: { 4 5 6} (detected
> >>>> by 1, t=10036 jiffies)
> >>>> INFO: rcu_sched_state detected stalls on CPUs/tasks: { 4 5 6} (detected
> >>>> by 1, t=17568 jiffies)
> >>>> INFO: rcu_sched_state detected stalls on CPUs/tasks: { 4 5 6} (detected
> >>>> by 1, t=25100 jiffies)
> >>>> INFO: rcu_sched_state detected stalls on CPUs/tasks: { 4 5 6} (detected
> >>>> by 1, t=32632 jiffies)
> >>>> INFO: rcu_sched_state detected stalls on CPUs/tasks: { 4 5 6} (detected
> >>>> by 1, t=40164 jiffies)
> >>>>
> >>>> With CONFIG_TREE_CPU we were not hitting this scenario very often.
> >>>> However with CONFIG_PREEMPT_TREE_CPU stall happens most of the time.
> >>>>
> >>>> I presume some issue in my timer setup . I am not seeing timer interrupt
> >>>> (or IPI interrupt) getting  incremented for VPE1 tcs on a completely
> >>>> booted 2.6.32-stable kernel.
> >>>>
> >>>> / # cat /proc/interrupts
> >>>>             CPU0       CPU1       CPU2       CPU3       CPU4       CPU5
> >>>> CPU6
> >>>>    1:        148      15023      15140      15093       3779          8
> >>>> 2            MIPS  SMTC_IPI
> >>>>    6:          0          0          0          0          0          0
> >>>> 0            MIPS  MSP CIC cascade
> >>>>    8:          0          0          0          0          0          0
> >>>> 0         MSP_CIC  Softreset button
> >>>>    9:          0          0          0          0          0          0
> >>>> 0         MSP_CIC  Standby switch
> >>>>   21:          0          0          0          0          0          0
> >>>> 0         MSP_CIC  MSP PER cascade
> >>>>   25:      15113        341          4          7          0          0
> >>>> 0         MSP_CIC  timer
> >>>>   27:        260          9          0          1          0          0
> >>>> 0         MSP_CIC  serial
> >>>>   34:          0          0          0          0          0          0
> >>>> 0         MSP_CIC  timer
> >>>>
> >>>> Can't we use separate timer interrupts for VPE1 and VPE0 in SMTC ?.
> >>>>
> >>>> I have tried setting up VPE1 timer from get_co_compare_int as follows
> >>>>
> >>>> unsigned int __cpuinit get_c0_compare_int(void)
> >>>> {
> >>>> 	if ((1==get_current_vpe())&&  !vpe1_timr_installed){
> >>>> 	
> >>>> 	memcpy(&timer_vpe1,&c0_compare_irqaction,sizeof(timer_vpe1));
> >>>> 	
> >>>> 	setup_irq(MSP_INT_VPE1_TIMER,&timer_vpe1);
> >>>>                    vpe1_timr_installed++;
> >>>>            }
> >>>>            return (get_current_vpe() ? MSP_INT_VPE1_TIMER :
> >>>> MSP_INT_VPE0_TIMER);
> >>>> }
> >>>>
> >>>> Thanks
> >>>> Anoop
> >>>>
> >>>>>              Regards,
> >>>>>
> >>>>>              Kevin K.
> >>>>>
> >>>>> On 1/3/2011 7:12 AM, Anoop P A wrote:
> >>>>>> Hi ,
> >>>>>>
> >>>>>> Following patch restricts TREE_CPU RCU implementation only for !PREEMPT
> >>>>>> SMP kernel.
> >>>>>> http://git.linux-mips.org/?p=linux.git;a=commit;h=687d7a960aea46e016182c7ce346d62c4dbd0366
> >>>>>>
> >>>>>> CONFIG_TREE_PREEMPT_RCU option seems to be not working for SMTC kernel
> >>>>>> ( which will be only available RCU implementation for SMTC kernel from
> >>>>>> 2.6.37 onwards) .
> >>>>>>
> >>>>>> With no forced preemption and selecting TREE_CPU I am able to boot
> >>>>>> further to the hang that I have reported.
> >>>>>>
> >>>>>> Thanks
> >>>>>> Anoop
> >>>>>>
> >>>>>> On Sat, 2011-01-01 at 00:42 -0800, Kevin D. Kissell wrote:
> >>>>>>> At this point the logical thing to do would seem to look at your kernel
> >>>>>>> image and disassemble smtc_ipi_replay(), which is where the EPC of VPE 0
> >>>>>>> shows the last exception to have been taken.  That's a critical SMTC
> >>>>>>> routine that gets called whenever an xxx_irq_restore() enables
> >>>>>>> interrupts, so that virtual per-TC IPI interrupts that were posted while
> >>>>>>> the TC had interrupts disabled can be handled deterministically.  As I
> >>>>>>> mentioned in an earlier message, there was some cleanup work from David
> >>>>>>> Howell that changed a number of irq management-related function names
> >>>>>>> and prototypes across all architectures, which went into linux-mips.org
> >>>>>>> at very roughly the time of the breakage.  The SMTC overlay over the irq
> >>>>>>> implementation has been pretty robust, but it's written in a perhaps
> >>>>>>> doomed attempt to be both efficient and using a maximum amount of common
> >>>>>>> code with the general case.  A mechanical or semi-mechanical change
> >>>>>>> could conceivably have broken things.
> >>>>>>>
> >>>>>>>              Regards,
> >>>>>>>
> >>>>>>>              Kevin K.
> >>>>>>>
> >>>>>>>
> >>>>>>> On 12/31/2010 4:27 AM, Anoop P A wrote:
> >>>>>>>> Hi ,
> >>>>>>>>
> >>>>>>>> Kernel hangs on stop_machine call. Please find mt reg dump below.
> >>>>>>>> Another important observation is even though 2.6.33 kernel + stackframe
> >>>>>>>> patch well passes calibration hang , I am still unable boot in to a
> >>>>>>>> initramfs root ( verified ramfs working with VSMP). So it looks like
> >>>>>>>> still some issue to fix between 2.6.32 and 2.6.33 .
> >>>>>>>> ######################## Log ###########################
> >>>>>>>>
> >>>>>>>> === MIPS MT State Dump ===
> >>>>>>>> -- Global State --
> >>>>>>>>     MVPControl Passed: 00000005
> >>>>>>>>     MVPControl Read: 00000004
> >>>>>>>>     MVPConf0 : a8008406
> >>>>>>>> -- per-VPE State --
> >>>>>>>>    VPE 0
> >>>>>>>>     VPEControl : 00008000
> >>>>>>>>     VPEConf0 : 800f0003
> >>>>>>>>     VPE0.Status : 11004201
> >>>>>>>>     VPE0.EPC : 8010dc54 smtc_ipi_replay+0xcc/0x108
> >>>>>>>>     VPE0.Cause : 50804000
> >>>>>>>>     VPE0.Config7 : 00010000
> >>>>>>>>    VPE 1
> >>>>>>>>     VPEControl : 00068006
> >>>>>>>>     VPEConf0 : 80cf0003
> >>>>>>>>     VPE1.Status : 11008301
> >>>>>>>>     VPE1.EPC : 801022a0 r4k_wait+0x20/0x40
> >>>>>>>>     VPE1.Cause : 50800000
> >>>>>>>>     VPE1.Config7 : 00010000
> >>>>>>>> -- per-TC State --
> >>>>>>>>    TC 0 (current TC with VPE EPC above)
> >>>>>>>>     TCStatus : 18102000
> >>>>>>>>     TCBind : 00000000
> >>>>>>>>     TCRestart : 803fa19c printk+0xc/0x30
> >>>>>>>>     TCHalt : 00000000
> >>>>>>>>     TCContext : 00000000
> >>>>>>>>    TC 1
> >>>>>>>>     TCStatus : 18902000
> >>>>>>>>     TCBind : 00200000
> >>>>>>>>     TCRestart : 801022a0 r4k_wait+0x20/0x40
> >>>>>>>>     TCHalt : 00000000
> >>>>>>>>     TCContext : 00140000
> >>>>>>>>    TC 2
> >>>>>>>>     TCStatus : 18902000
> >>>>>>>>     TCBind : 00400000
> >>>>>>>>     TCRestart : 801022a0 r4k_wait+0x20/0x40
> >>>>>>>>     TCHalt : 00000000
> >>>>>>>>     TCContext : 00280000
> >>>>>>>>    TC 3
> >>>>>>>>     TCStatus : 18902000
> >>>>>>>>     TCBind : 00600000
> >>>>>>>>     TCRestart : 801022a0 r4k_wait+0x20/0x40
> >>>>>>>>     TCHalt : 00000000
> >>>>>>>>     TCContext : 003c0000
> >>>>>>>>    TC 4
> >>>>>>>>     TCStatus : 18902000
> >>>>>>>>     TCBind : 00800001
> >>>>>>>>     TCRestart : 8010229c r4k_wait+0x1c/0x40
> >>>>>>>>     TCHalt : 00000000
> >>>>>>>>     TCContext : 00500000
> >>>>>>>>    TC 5
> >>>>>>>>     TCStatus : 18902000
> >>>>>>>>     TCBind : 00a00001
> >>>>>>>>     TCRestart : 8010229c r4k_wait+0x1c/0x40
> >>>>>>>>     TCHalt : 00000000
> >>>>>>>>     TCContext : 00640000
> >>>>>>>>    TC 6
> >>>>>>>>     TCStatus : 18902000
> >>>>>>>>     TCBind : 00c00001
> >>>>>>>>     TCRestart : 8010229c r4k_wait+0x1c/0x40
> >>>>>>>>     TCHalt : 00000000
> >>>>>>>>     TCContext : 00780000
> >>>>>>>> Counter Interrupts taken per CPU (TC)
> >>>>>>>> 0: 0
> >>>>>>>> 1: 0
> >>>>>>>> 2: 0
> >>>>>>>> 3: 0
> >>>>>>>> 4: 0
> >>>>>>>> 5: 0
> >>>>>>>> 6: 0
> >>>>>>>> 7: 0
> >>>>>>>> Self-IPI invocations:
> >>>>>>>> 0: 12
> >>>>>>>> 1: 0
> >>>>>>>> 2: 0
> >>>>>>>> 3: 0
> >>>>>>>> 4: 0
> >>>>>>>> 5: 5
> >>>>>>>> 6: 4
> >>>>>>>> 7: 0
> >>>>>>>> IPIQ[0]: head = 0x0, tail = 0x0, depth = 0
> >>>>>>>> IPIQ[1]: head = 0x0, tail = 0x0, depth = 0
> >>>>>>>> IPIQ[2]: head = 0x0, tail = 0x0, depth = 0
> >>>>>>>> IPIQ[3]: head = 0x0, tail = 0x0, depth = 0
> >>>>>>>> IPIQ[4]: head = 0x0, tail = 0x0, depth = 0
> >>>>>>>> IPIQ[5]: head = 0x0, tail = 0x0, depth = 0
> >>>>>>>> IPIQ[6]: head = 0x0, tail = 0x0, depth = 0
> >>>>>>>> IPIQ[7]: head = 0x0, tail = 0x0, depth = 0
> >>>>>>>> 0 Recoveries of "stolen" FPU
> >>>>>>>> ===========================
> >>>>>>>>
> >>>>>>>> ################################################################
> >>>>>>>>
> >>>>>>>> Thanks
> >>>>>>>> Anoop
> >>>>>>>>
> >>>>>>>> On Tue, 2010-12-28 at 00:43 -0800, Kevin D. Kissell wrote:
> >>>>>>>>> I took a quick look last night, and the only thing that looked vaguely
> >>>>>>>>> dangerous in changes since the timer changes I alluded to earlier was
> >>>>>>>>> the global naming cleanup of irq-related function names that David
> >>>>>>>>> Howell submitted.  The diff didn't look dangerous in itself, but some of
> >>>>>>>>> the definitions are nested subtly for SMTC to maximize the amount of
> >>>>>>>>> common code, and I could imagine something getting lost in translation
> >>>>>>>>> there.  If that were really the problem, it would of course affect much
> >>>>>>>>> more than just the timer subsystem, but early in the boot process,
> >>>>>>>>> timers are pretty much the only interrupts that have to be handled
> >>>>>>>>> correctly.
> >>>>>>>>>
> >>>>>>>>> I'm travelling today, but will take a look at timekeeping_notify()
> >>>>>>>>> tomorrow or the next day...
> >>>>>>>>>
> >>>>>>>>> /K.
> >>>>>>>>>
> >>>>>>>>> On 12/28/10 12:19 AM, Anoop P A wrote:
> >>>>>>>>>> Hi,
> >>>>>>>>>>
> >>>>>>>>>> I had a glance into the code diff without notice of any suspect-able
> >>>>>>>>>> code .
> >>>>>>>>>> Tracing the hang showed that it is getting hanged in timekeeping_notify
> >>>>>>>>>> function.
> >>>>>>>>>>
> >>>>>>>>>> Thanks,
> >>>>>>>>>> Anoop
> >>>>>>>>>>
> >>>>>>>>>> PS: I may not be available until Thursday
> >>>>>>>>>>
> >>>>>>>>>> On Mon, 2010-12-27 at 22:49 +0530, Anoop P A wrote:
> >>>>>>>>>>> Hi Kevin,
> >>>>>>>>>>>
> >>>>>>>>>>> It is very unlikely that the patch you pointed has any impact on the the
> >>>>>>>>>>> hang I am seeing. The patch you have mentioned got into kernel around
> >>>>>>>>>>> 2.6.32 timeframe. I am able to boot both 2.6.32 and  2.6.33 kernel ( +
> >>>>>>>>>>> stackframe patch) .
> >>>>>>>>>>>
> >>>>>>>>>>> Hi Stuart,
> >>>>>>>>>>>
> >>>>>>>>>>> I haven't got much time to spend on this today.
> >>>>>>>>>>>
> >>>>>>>>>>> I had got 2.6.36-stable(+ stack frame patch) booting last day and I have
> >>>>>>>>>>> observed hang issue with 2.6.37-rc1 ( Same as rc6 and current git head)
> >>>>>>>>>>>
> >>>>>>>>>>> So probably some patches in 2.6.37 branch introduced this hang.
> >>>>>>>>>>>
> >>>>>>>>>>> Hopefully I will get some free slot tomorrow so that I can look into
> >>>>>>>>>>> code diff .
> >>>>>>>>>>>
> >>>>>>>>>>> Thanks
> >>>>>>>>>>> Anoop
> >>>>>>>>>>>
> >>>>>>>>>>> On Mon, 2010-12-27 at 09:49 -0600, STUART VENTERS wrote:
> >>>>>>>>>>>> Kevin,
> >>>>>>>>>>>>
> >>>>>>>>>>>> Outstanding, sometimes it's better to be lucky than good.
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> Anoop,
> >>>>>>>>>>>>
> >>>>>>>>>>>> Maybe we can get lucky again.
> >>>>>>>>>>>>
> >>>>>>>>>>>> If you can isolate the .33 works/.37 works_not bug to a specific pair of versions,
> >>>>>>>>>>>>      I'll be happy to do another diff.
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> Hope you'll have had a good Christmas as well.
> >>>>>>>>>>>>     We've had snow in Alabama since Christmas eve!
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> Regards,
> >>>>>>>>>>>>
> >>>>>>>>>>>> Stuart
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> -----Original Message-----
> >>>>>>>>>>>> From: Kevin D. Kissell [mailto:kevink@paralogos.com]
> >>>>>>>>>>>> Sent: Friday, December 24, 2010 5:34 PM
> >>>>>>>>>>>> To: Anoop P A
> >>>>>>>>>>>> Cc: STUART VENTERS; Anoop P.A.; linux-mips@linux-mips.org
> >>>>>>>>>>>> Subject: Re: SMTC support status in latest git head.
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> Ah, well, at least we have a stackframe.h fix that preserves David's
> >>>>>>>>>>>> performance tweak for the deeper pipelined processors.  In looking for
> >>>>>>>>>>>> this, I did notice that someone did some modification to the SMTC clock
> >>>>>>>>>>>> tick logic that I was skeptical had ever been tested.  If you've still
> >>>>>>>>>>>> got that kernel binary handy, you might check to see if it boots with
> >>>>>>>>>>>> maxtcs=1 maxvpes=1, maxtcs=2 maxvpes=1, and/or maxtcs=2 maxvpes=2.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Oh, yes, and Merry Christmas one and all!
> >>>>>>>>>>>>
> >>>>>>>>>>>>                Regards,
> >>>>>>>>>>>>
> >>>>>>>>>>>>                Kevin K.
> >>>>>>>>>>>>
> >>>>>>>>>>>> On 12/24/10 8:02 AM, Anoop P A wrote:
> >>>>>>>>>>>>> On Fri, 2010-12-24 at 06:53 -0800, Kevin D. Kissell wrote:
> >>>>>>>>>>>>>> Excellent!  Now, does the attached patch (relative to 2.6.37.11) also
> >>>>>>>>>>>>>> fix things, while preserving the other fixes and performance enhancements?
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>> I have tested that patch with 2.6.37 branch it well passes calibration
> >>>>>>>>>>>>> loop but hangs after switching to mips closource
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> TC 6 going on-line as CPU 6
> >>>>>>>>>>>>> Brought up 7 CPUs
> >>>>>>>>>>>>> bio: create slab<bio-0>    at 0
> >>>>>>>>>>>>> SCSI subsystem initialized
> >>>>>>>>>>>>> Switching to clocksource MIPS
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> I Presume this is a different issue as restoring older file didn't help
> >>>>>>>>>>>>> much to get rid of this hang.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> diff --git a/arch/mips/include/asm/stackframe.h
> >>>>>>>>>>>>> b/arch/mips/include/asm/stackframe.h
> >>>>>>>>>>>>> index 58730c5..7fc9f10 100644
> >>>>>>>>>>>>> --- a/arch/mips/include/asm/stackframe.h
> >>>>>>>>>>>>> +++ b/arch/mips/include/asm/stackframe.h
> >>>>>>>>>>>>> @@ -195,9 +195,9 @@
> >>>>>>>>>>>>>     		 * to cover the pipeline delay.
> >>>>>>>>>>>>>     		 */
> >>>>>>>>>>>>>     		.set	mips32
> >>>>>>>>>>>>> -		mfc0	v1, CP0_TCSTATUS
> >>>>>>>>>>>>> +		mfc0	v0, CP0_TCSTATUS
> >>>>>>>>>>>>>     		.set	mips0
> >>>>>>>>>>>>> -		LONG_S	v1, PT_TCSTATUS(sp)
> >>>>>>>>>>>>> +		LONG_S	v0, PT_TCSTATUS(sp)
> >>>>>>>>>>>>>     #endif /* CONFIG_MIPS_MT_SMTC */
> >>>>>>>>>>>>>     		LONG_S	$4, PT_R4(sp)
> >>>>>>>>>>>>>     		LONG_S	$5, PT_R5(sp)
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> /K.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> On 12/24/10 6:39 AM, Anoop P A wrote:
> >>>>>>>>>>>>>>> Hi Kevin, Stuart ,
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Woohooo You guys spotted !.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>      http://git.linux-mips.org/?p=linux.git;a=commit;h=d5ec6e3c seems to be
> >>>>>>>>>>>>>>> the culprit
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Once I restored previous version of stackframe.h 2.6.33-stable started
> >>>>>>>>>>>>>>> booting !.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Thanks,
> >>>>>>>>>>>>>>> Anoop
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> On Fri, 2010-12-24 at 04:32 -0800, Kevin D. Kissell wrote:
> >>>>>>>>>>>>>>>> Thank you, Stuart!  I've spotted some definite breakage to SMTC between
> >>>>>>>>>>>>>>>> those versions.  In arch/mips/include/asm/stackframe.h, someone moved
> >>>>>>>>>>>>>>>> the store of the Status register value in SAVE_SOME (line 169 or 204,
> >>>>>>>>>>>>>>>> depending on the version) from two instructions after the mfc0 to a
> >>>>>>>>>>>>>>>> point after the #ifdef for SMTC, presumably to get better pipelining of
> >>>>>>>>>>>>>>>> the register access.  Unfortunately, the v1 register is also used in the
> >>>>>>>>>>>>>>>> SMTC-specific fragment to save TCStatus, so the Status value gets
> >>>>>>>>>>>>>>>> clobbered before it gets stored.  This will eventually result in the
> >>>>>>>>>>>>>>>> Status register getting a TCStatus value, which has some bits on common,
> >>>>>>>>>>>>>>>> but isn't identical and sooner or later Bad Things will happen.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> I'm a little surprised this wasn't caught by visual inspection of the patch.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Possible solutions would include reverting the store of the CP0_STATUS
> >>>>>>>>>>>>>>>> value to the block above the #ifdef, or, to retain whatever performance
> >>>>>>>>>>>>>>>> advantage was obtained by moving the store downward, to use v0/$2
> >>>>>>>>>>>>>>>> instead of v1/$3, as the staging register for the TCStatus value.  I'd
> >>>>>>>>>>>>>>>> lean toward the second option, but I'm not in a position to test and
> >>>>>>>>>>>>>>>> submit a patch just now.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>                  Regards,
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>                  Kevin K.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> On 12/23/10 1:09 PM, STUART VENTERS wrote:
> >>>>>>>>>>>>>>>>> Kevin,
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> I'm not sure if it's useful,
> >>>>>>>>>>>>>>>>>         but finally I got the time to look at the two kernel versions Anoop pointed out.
> >>>>>>>>>>>>>>>>>          works   2.6.32-stable with patch 804
> >>>>>>>>>>>>>>>>>          works_not 2.6.33-stable
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> greping for files with CONFIG_MIPS_MT_SMTC
> >>>>>>>>>>>>>>>>>         and looking for timer interrupt related stuff found the following differences:
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> arch/mips/include/asm/irq.h
> >>>>>>>>>>>>>>>>> arch/mips/kernel/irq.c
> >>>>>>>>>>>>>>>>>        do_IRQ
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> arch/mips/include/asm/stackframe.h
> >>>>>>>>>>>>>>>>>        SAVE_SOME SAVE_TEMP get/set_saved_sp
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> arch/mips/include/asm/time.h
> >>>>>>>>>>>>>>>>>        clocksource_set_clock
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> arch/mips/kernel/process.c
> >>>>>>>>>>>>>>>>>        cpu_idle
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> arch/mips/kernel/smtc.c
> >>>>>>>>>>>>>>>>>        __irq_entry
> >>>>>>>>>>>>>>>>>        ipi_decode
> >>>>>>>>>>>>>>>>>            SMTC_CLOCK_TICK
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Enclosed are the two subsets of files for a more expert look.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> I'll try to look in more detail after Christmas.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Cheers,
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Stuart
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> >
> >
> 

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: SMTC support status in latest git head.
  2011-01-04 17:54                                                               ` Anoop P A
@ 2011-01-04 18:33                                                                 ` Kevin D. Kissell
  2011-01-05 13:11                                                                   ` Anoop P A
  0 siblings, 1 reply; 68+ messages in thread
From: Kevin D. Kissell @ 2011-01-04 18:33 UTC (permalink / raw)
  To: Anoop P A; +Cc: STUART VENTERS, Anoop P.A., linux-mips

On 01/04/11 09:54, Anoop P A wrote:
> On Tue, 2011-01-04 at 09:21 -0800, Kevin D. Kissell wrote:
>> I'm trying to figure out a reason why your change below should help, and
>> offhand, modulo tool bugs, I don't see it.  I'm assuming that your diff
>> below is a diff relative to the pre-patch stackframe.h.   I wouldn't
> Yes patch created against stock code .
>
>> bless it as an alternative because it moves code and comments
>> unnecessarily - all you should really have to do is to move the
>>
>>
>>    190                 mfc0    v1, CP0_STATUS
>>    191                 LONG_S  $2, PT_R2(sp)
>>
>> to be just after the #endif /* CONFIG_MIPS_MT_SMTC */ at around line 201.
> Actually I just moved code under CONFIG_MIPS_MT_SMTC to previous block
> of code ( which store $0 ) . git diff did the rest on behalf of me :)
>
>> If moving the save of zero to PT_R0(sp) actually makes a difference,
>> it's evidence that you've got problems in your toolchain (or, heaven
>> forbid, your pipeline)!
> In previous version of patch usage of V0 was creating issue. I have
> verified this with previous version of code ( working code before
> David's instruction rearrangement patch.) .

Argh.  It's not very clearly commented, but it looks as if the system 
call trap handler has an implicit assumption that v0 has never been 
changed by SAVE_SOME, TRACE_IRQS_ON_RELOAD, or STI.  So yeah, moving the 
code around to fix the v1 conflict ends up being better than using v0 - 
otherwise, we'd need to add a LONG_L v0, PT_R2(sp) somewhere after the 
LONG_S v0, PT_TCSTATUS(sp) of the original patch.

             Regards,

             Kevin K.

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: SMTC support status in latest git head.
  2011-01-04 17:40                                                           ` Kevin D. Kissell
@ 2011-01-05 13:09                                                             ` Anoop P A
  0 siblings, 0 replies; 68+ messages in thread
From: Anoop P A @ 2011-01-05 13:09 UTC (permalink / raw)
  To: Kevin D. Kissell; +Cc: STUART VENTERS, Anoop P.A., linux-mips

[-- Attachment #1: Type: text/plain, Size: 7962 bytes --]

On Tue, 2011-01-04 at 09:40 -0800, Kevin D. Kissell wrote:
> On 01/04/11 05:02, Anoop P A wrote:
> > On Tue, 2011-01-04 at 00:17 -0800, Kevin D. Kissell wrote:
> >> Those interrupt counters show that IPIs are being taken everywhere,
> >> though very few by CPUs 5 and 6.  If I understand the configuration
> >> correctly, CPU 4 is a TC in VPE 1, and it's getting a reasonable IPI
> > Yes CPU4 is in second VPE
> >
> >> rate, *if* we're looking at a tickless kernel under low load.  But there
> > No it was not the tickless kernel.I had selected 250 MHz timer. can't we
> > expect IPI / timer interrupt for all the threads in this case ?.
> 
> In that case, you should expect a distribution of timer interrupts that 
> favors the low-numbered TCs within the VPE, as you do in VPE0, and a 
> distribution of IPIs that is sort-of the inverse, as you do in VPE0.  
> But the low counts on VPE1 are indeed suspicious, as you note.
> 
> >> may be a clue there to part of your problem.  I have no idea why the
> >> behavior would have changed from 2.6.36 to 2.6.37, but it looks as if
> >> you're getting your clock interrupts through the MSP CIC interrupt
> >> controller on VPE 0.  There's nothing symmetric for VPE1. The Malta
> >> example code is perhaps deceptively simple, in that both VPEs have their
> >> count/compare indication wired directly to the 2 clock interrupt inputs,
> >> so that having both of them running with only a single set of irq state
> >> just works.  I don't know whether the MSP CIC timer interrupt is a
> > In my case it is separate irq. MSP_INT_VPE1_TIMER (34) and
> > MSP_INT_VPE0_TIMER (25) are wired to CIC . CIC interrupt has been
> > connected to cpu irq 6.
> >
> > I can reproduce cpu stall in VSMP mode If I don't setup VPE1 timer
> > interrupt . Don't we have support for separate irq in SMTC
> > implementation ?..
> 
> There are hooks for platform-specific SMTC support, which is implemented 
> for the Malta in arch/mips/mti-malta/malta-smtc.c.  See 
> msmtc_init_secondary(), for example, where the clock/compare, profile, 
> and IPI interrupts are armed for VPE 1, while I/O peripheral interrupts 
> are inhibited.
> 
> >> gating of the VPE0 count/compare output, or whether it's it's own
> >> interval timer, but I suspect that you may need to do some further
> >> low-level initialization in the platform-specific code to set up an
> >> interrupt on the VPE1 side.  I don't think the snippet you've got below
> >> would work as written.
> > The routine which I copied works fine for VSMP mode .
> >
> > / # cat /proc/interrupts
> >             CPU0       CPU1
> >    0:        187        254            MIPS  IPI_resched
> >    1:         77        174            MIPS  IPI_call
> >    6:          0          0            MIPS  MSP CIC cascade
> >    8:          0          0         MSP_CIC  Softreset button
> >    9:          0          0         MSP_CIC  Standby switch
> >   21:          0          0         MSP_CIC  MSP PER cascade
> >   25:      37077          0         MSP_CIC  timer
> >   27:        188          0         MSP_CIC  serial
> >   34:          0      36986         MSP_CIC  timer
> >
> > Do I want to change anything specific for SMTC ? .
> 
> If it works (which I doubt), then we can critique stylistic points like 
> using
> 
> 		if ((1==get_current_vpe())
> 
> Instead of the more readable and general
> 
> 		if (get_current_vpe()>  0)
> 
> 
> But I think you're generally looking in the wrong place.  Look at the 
> Malta code and see what's done where.  The initial SMTC code had a lot 
> of Malta assumptions in the main line that I pushed out to platform code 
> in later patches.  I can see how things could be made even more modular, 
> but for the moment I think it's just that there's some stuff that ought 
> to be done in a "msp_smtc.c" file that doesn't exist in 2.6.37.
Yes , I am doing similar stuff in msp_smtc.c . Attaching code for your
reference. I am not seeing a VPE1 timer interrupt.

> 
>              Regards,
> 
>              Kevin K.
> >
> >
> >> If it's purely an issue with clock distribution on VPE1, then a boot
> >> with maxvpes=1 maxtcs=4 should be stable.
> > Yes the kernel seems to be stable if I boot with maxvpes=1 maxtcs=4 .
> >
> >> /K.
> >>
> >> On 1/3/2011 11:20 AM, Anoop P A wrote:
> >>> Hi Kevin,
> >>>
> >>> On Mon, 2011-01-03 at 08:14 -0800, Kevin D. Kissell wrote:
> >>>> The very first SMTC implementations didn't support full kernel-mode
> >>>> preemption, which anyway wasn't a priority, given the hardware event
> >>>> response support in MIPS MT.  I believe it was later made compatible,
> >>>> but it was never extensively exercised.  Since SMTC has fingers in some
> >>>> pretty low-level atomicity mechanisms, if a new, parallel set was
> >>>> implemented for RCU, I can easily imagine that nobody has yet
> >>>> implemented SMTC-ified variants of that set.
> >>>>
> >>>> Your last statement isn't very clear, though.  Are you saying that if
> >>>> you configure for no forced preemption and with TREE_CPU, the 2.6.37
> >>>> kernel boots all the way up, or that it simply hangs later?   What's the
> >>>> last rev kernel that actually boots all the way up?
> >>> I have debugged this a bit more. It seems that kernel getting stalled
> >>> while executing on TC's of second VPE .
> >>> INFO: rcu_sched_state detected stalls on CPUs/tasks: { 4 5 6} (detected
> >>> by 1, t=2504 jiffies)
> >>> INFO: rcu_sched_state detected stalls on CPUs/tasks: { 4 5 6} (detected
> >>> by 1, t=10036 jiffies)
> >>> INFO: rcu_sched_state detected stalls on CPUs/tasks: { 4 5 6} (detected
> >>> by 1, t=17568 jiffies)
> >>> INFO: rcu_sched_state detected stalls on CPUs/tasks: { 4 5 6} (detected
> >>> by 1, t=25100 jiffies)
> >>> INFO: rcu_sched_state detected stalls on CPUs/tasks: { 4 5 6} (detected
> >>> by 1, t=32632 jiffies)
> >>> INFO: rcu_sched_state detected stalls on CPUs/tasks: { 4 5 6} (detected
> >>> by 1, t=40164 jiffies)
> >>>
> >>> With CONFIG_TREE_CPU we were not hitting this scenario very often.
> >>> However with CONFIG_PREEMPT_TREE_CPU stall happens most of the time.
> >>>
> >>> I presume some issue in my timer setup . I am not seeing timer interrupt
> >>> (or IPI interrupt) getting  incremented for VPE1 tcs on a completely
> >>> booted 2.6.32-stable kernel.
> >>>
> >>> / # cat /proc/interrupts
> >>>             CPU0       CPU1       CPU2       CPU3       CPU4       CPU5
> >>> CPU6
> >>>    1:        148      15023      15140      15093       3779          8
> >>> 2            MIPS  SMTC_IPI
> >>>    6:          0          0          0          0          0          0
> >>> 0            MIPS  MSP CIC cascade
> >>>    8:          0          0          0          0          0          0
> >>> 0         MSP_CIC  Softreset button
> >>>    9:          0          0          0          0          0          0
> >>> 0         MSP_CIC  Standby switch
> >>>   21:          0          0          0          0          0          0
> >>> 0         MSP_CIC  MSP PER cascade
> >>>   25:      15113        341          4          7          0          0
> >>> 0         MSP_CIC  timer
> >>>   27:        260          9          0          1          0          0
> >>> 0         MSP_CIC  serial
> >>>   34:          0          0          0          0          0          0
> >>> 0         MSP_CIC  timer
> >>>
> >>> Can't we use separate timer interrupts for VPE1 and VPE0 in SMTC ?.
> >>>
> >>> I have tried setting up VPE1 timer from get_co_compare_int as follows
> >>>
> >>> unsigned int __cpuinit get_c0_compare_int(void)
> >>> {
> >>> 	if ((1==get_current_vpe())&&  !vpe1_timr_installed){
> >>> 	
> >>> 	memcpy(&timer_vpe1,&c0_compare_irqaction,sizeof(timer_vpe1));
> >>> 	
> >>> 	setup_irq(MSP_INT_VPE1_TIMER,&timer_vpe1);
> >>>                    vpe1_timr_installed++;
> >>>            }
> >>>            return (get_current_vpe() ? MSP_INT_VPE1_TIMER :
> >>> MSP_INT_VPE0_TIMER);
> >>> }
> >>>
> >>> Thanks
> >>> Anoop
> 


[-- Attachment #2: msp_smtc.c --]
[-- Type: text/x-csrc, Size: 3230 bytes --]

/*
 * MSP71xx Platform-specific hooks for SMP operation.
 * Started from malta-smtc.c.
 */
#include <linux/irq.h>
#include <linux/init.h>
#include <linux/sched.h>

#include <asm/mipsregs.h>
#include <asm/mipsmtregs.h>
#include <asm/smtc.h>
#include <asm/smtc_ipi.h>

/* VPE/SMP Prototype implements platform interfaces directly */

/*
 * Cause the specified action to be performed on a targeted "CPU"
 */

static void msp_smtc_send_ipi_single(int cpu, unsigned int action)
{
	/* "CPU" may be TC of same VPE, VPE of same CPU, or different CPU */
	smtc_send_ipi(cpu, LINUX_SMP_IPI, action);
}

static void msp_smtc_send_ipi_mask(const struct cpumask *mask, unsigned int action)
{
	unsigned int i;

	for_each_cpu(i, mask)
		msp_smtc_send_ipi_single(i, action);
}

/*
 * Post-config but pre-boot cleanup entry point
 */
static int prev_vpe;
static void __cpuinit msp_smtc_init_secondary(void)
{
	void smtc_init_secondary(void);
	int myvpe;

	myvpe = read_c0_tcbind() & TCBIND_CURVPE;
	/* Change status register when we switch to new VPE*/
	if ((myvpe != prev_vpe) && (myvpe > 0)) {
		change_c0_status(ST0_IM, STATUSF_IP0 | STATUSF_IP1 |
					 STATUSF_IP6 | STATUSF_IP7);
	}
	prev_vpe = myvpe;
	
	smtc_init_secondary();
}

/*
 * Platform "CPU" startup hook
 */
static void __cpuinit msp_smtc_boot_secondary(int cpu, struct task_struct *idle)
{
	
	smtc_boot_secondary(cpu, idle);
}

/*
 * SMP initialization finalization entry point
 */
static void __cpuinit msp_smtc_smp_finish(void)
{
	smtc_smp_finish();
}

/*
 * Hook for after all CPUs are online
 */

static void msp_smtc_cpus_done(void)
{
}

/*
 * Platform SMP pre-initialization
 *
 * As noted above, we can assume a single CPU for now
 * but it may be multithreaded.
 */

static void __init msp_smtc_smp_setup(void)
{
	/*
	 * we won't get the definitive value until
	 * we've run smtc_prepare_cpus later, but
	 */
	if (read_c0_config3() & (1 << 2))
	smp_num_siblings = smtc_build_cpu_map(0);
}

static void __init msp_smtc_prepare_cpus(unsigned int max_cpus)
{
	smtc_prepare_cpus(max_cpus);
}

struct plat_smp_ops msp_smtc_smp_ops = {
	.send_ipi_single	= msp_smtc_send_ipi_single,
	.send_ipi_mask		= msp_smtc_send_ipi_mask,
	.init_secondary		= msp_smtc_init_secondary,
	.smp_finish		= msp_smtc_smp_finish,
	.cpus_done		= msp_smtc_cpus_done,
	.boot_secondary		= msp_smtc_boot_secondary,
	.smp_setup		= msp_smtc_smp_setup,
	.prepare_cpus		= msp_smtc_prepare_cpus,
};

#if 0
/* TODO */
#ifdef CONFIG_MIPS_MT_SMTC_IRQAFF
/*
 * IRQ affinity hook
 */


int plat_set_irq_affinity(unsigned int irq, const struct cpumask *affinity)
{
	cpumask_t tmask;
	int cpu = 0;
	void smtc_set_irq_affinity(unsigned int irq, cpumask_t aff);

	cpumask_copy(&tmask, affinity);
	for_each_cpu(cpu, affinity) {
		if ((cpu_data[cpu].vpe_id != 0) || !cpu_online(cpu))
			cpu_clear(cpu, tmask);
	}
	cpumask_copy(irq_desc[irq].affinity, &tmask);

	if (cpus_empty(tmask))
		/*
		 * We could restore a default mask here, but the
		 * runtime code can anyway deal with the null set
		 */
		printk(KERN_WARNING
			"IRQ affinity leaves no legal CPU for IRQ %d\n", irq);

	/* Do any generic SMTC IRQ affinity setup */
	smtc_set_irq_affinity(irq, tmask);

	return 0;
}
#endif /* CONFIG_MIPS_MT_SMTC_IRQAFF */
#endif

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: SMTC support status in latest git head.
  2011-01-04 18:33                                                                 ` Kevin D. Kissell
@ 2011-01-05 13:11                                                                   ` Anoop P A
  2011-01-05 19:23                                                                     ` Kevin D. Kissell
  0 siblings, 1 reply; 68+ messages in thread
From: Anoop P A @ 2011-01-05 13:11 UTC (permalink / raw)
  To: Kevin D. Kissell; +Cc: STUART VENTERS, Anoop P.A., linux-mips

On Tue, 2011-01-04 at 10:33 -0800, Kevin D. Kissell wrote:
> On 01/04/11 09:54, Anoop P A wrote:
> > On Tue, 2011-01-04 at 09:21 -0800, Kevin D. Kissell wrote:
> >> I'm trying to figure out a reason why your change below should help, and
> >> offhand, modulo tool bugs, I don't see it.  I'm assuming that your diff
> >> below is a diff relative to the pre-patch stackframe.h.   I wouldn't
> > Yes patch created against stock code .
> >
> >> bless it as an alternative because it moves code and comments
> >> unnecessarily - all you should really have to do is to move the
> >>
> >>
> >>    190                 mfc0    v1, CP0_STATUS
> >>    191                 LONG_S  $2, PT_R2(sp)
> >>
> >> to be just after the #endif /* CONFIG_MIPS_MT_SMTC */ at around line 201.
> > Actually I just moved code under CONFIG_MIPS_MT_SMTC to previous block
> > of code ( which store $0 ) . git diff did the rest on behalf of me :)
> >
> >> If moving the save of zero to PT_R0(sp) actually makes a difference,
> >> it's evidence that you've got problems in your toolchain (or, heaven
> >> forbid, your pipeline)!
> > In previous version of patch usage of V0 was creating issue. I have
> > verified this with previous version of code ( working code before
> > David's instruction rearrangement patch.) .
> 
> Argh.  It's not very clearly commented, but it looks as if the system 
> call trap handler has an implicit assumption that v0 has never been 
> changed by SAVE_SOME, TRACE_IRQS_ON_RELOAD, or STI.  So yeah, moving the 
> code around to fix the v1 conflict ends up being better than using v0 - 
> otherwise, we'd need to add a LONG_L v0, PT_R2(sp) somewhere after the 
> LONG_S v0, PT_TCSTATUS(sp) of the original patch.
Well, Here is the patch.

diff --git a/arch/mips/include/asm/stackframe.h
b/arch/mips/include/asm/stackframe.h
index 58730c5..19418c4 100644
--- a/arch/mips/include/asm/stackframe.h
+++ b/arch/mips/include/asm/stackframe.h
@@ -187,8 +187,6 @@
 		 * need it to operate correctly
 		 */
 		LONG_S	$0, PT_R0(sp)
-		mfc0	v1, CP0_STATUS
-		LONG_S	$2, PT_R2(sp)
 #ifdef CONFIG_MIPS_MT_SMTC
 		/*
 		 * Ideally, these instructions would be shuffled in
@@ -199,6 +197,8 @@
 		.set	mips0
 		LONG_S	v1, PT_TCSTATUS(sp)
 #endif /* CONFIG_MIPS_MT_SMTC */
+		mfc0	v1, CP0_STATUS
+		LONG_S	$2, PT_R2(sp)
 		LONG_S	$4, PT_R4(sp)
 		LONG_S	$5, PT_R5(sp)
 		LONG_S	v1, PT_STATUS(sp)


> 
>              Regards,
> 
>              Kevin K.

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* Re: SMTC support status in latest git head.
  2011-01-05 13:11                                                                   ` Anoop P A
@ 2011-01-05 19:23                                                                     ` Kevin D. Kissell
  2011-01-06 20:23                                                                       ` Anoop P A
  0 siblings, 1 reply; 68+ messages in thread
From: Kevin D. Kissell @ 2011-01-05 19:23 UTC (permalink / raw)
  To: Anoop P A; +Cc: STUART VENTERS, Anoop P.A., linux-mips

On 01/05/11 05:11, Anoop P A wrote:
> On Tue, 2011-01-04 at 10:33 -0800, Kevin D. Kissell wrote:
>> On 01/04/11 09:54, Anoop P A wrote:
>>> On Tue, 2011-01-04 at 09:21 -0800, Kevin D. Kissell wrote:
>>>> I'm trying to figure out a reason why your change below should help, and
>>>> offhand, modulo tool bugs, I don't see it.  I'm assuming that your diff
>>>> below is a diff relative to the pre-patch stackframe.h.   I wouldn't
>>> Yes patch created against stock code .
>>>
>>>> bless it as an alternative because it moves code and comments
>>>> unnecessarily - all you should really have to do is to move the
>>>>
>>>>
>>>>     190                 mfc0    v1, CP0_STATUS
>>>>     191                 LONG_S  $2, PT_R2(sp)
>>>>
>>>> to be just after the #endif /* CONFIG_MIPS_MT_SMTC */ at around line 201.
>>> Actually I just moved code under CONFIG_MIPS_MT_SMTC to previous block
>>> of code ( which store $0 ) . git diff did the rest on behalf of me :)
>>>
>>>> If moving the save of zero to PT_R0(sp) actually makes a difference,
>>>> it's evidence that you've got problems in your toolchain (or, heaven
>>>> forbid, your pipeline)!
>>> In previous version of patch usage of V0 was creating issue. I have
>>> verified this with previous version of code ( working code before
>>> David's instruction rearrangement patch.) .
>> Argh.  It's not very clearly commented, but it looks as if the system
>> call trap handler has an implicit assumption that v0 has never been
>> changed by SAVE_SOME, TRACE_IRQS_ON_RELOAD, or STI.  So yeah, moving the
>> code around to fix the v1 conflict ends up being better than using v0 -
>> otherwise, we'd need to add a LONG_L v0, PT_R2(sp) somewhere after the
>> LONG_S v0, PT_TCSTATUS(sp) of the original patch.
> Well, Here is the patch.
>
> diff --git a/arch/mips/include/asm/stackframe.h
> b/arch/mips/include/asm/stackframe.h
> index 58730c5..19418c4 100644
> --- a/arch/mips/include/asm/stackframe.h
> +++ b/arch/mips/include/asm/stackframe.h
> @@ -187,8 +187,6 @@
>   		 * need it to operate correctly
>   		 */
>   		LONG_S	$0, PT_R0(sp)
> -		mfc0	v1, CP0_STATUS
> -		LONG_S	$2, PT_R2(sp)
>   #ifdef CONFIG_MIPS_MT_SMTC
>   		/*
>   		 * Ideally, these instructions would be shuffled in
> @@ -199,6 +197,8 @@
>   		.set	mips0
>   		LONG_S	v1, PT_TCSTATUS(sp)
>   #endif /* CONFIG_MIPS_MT_SMTC */
> +		mfc0	v1, CP0_STATUS
> +		LONG_S	$2, PT_R2(sp)
>   		LONG_S	$4, PT_R4(sp)
>   		LONG_S	$5, PT_R5(sp)
>   		LONG_S	v1, PT_STATUS(sp)

That's exactly what I'd propose as the cleanest minimal fix.  I've got a 
version that also replaces the .set mips32 / .set mips0 with the .set 
push / .set pop paradigm, which I'd have used in the original code if 
I'd known at the time about that assembler directive.  I'm hoping to be 
able to test on a Malta/34K reference platform, and make sure there 
isn't breakage on that platform branch as well, before we commit to the 
repository.

Your msp_smtc.c file looks plausible on the face of it.  The 
init_secondary function has the quirk that it expects to execute on each 
"CPU" in numerical order, which is very likely but not guaranteed. It 
*ought* to be harmless in the rare case where it fails, but the 
assumption is worth a comment, IMHO.

At this point, there shouldn't be a whole lot of SMTC-specific mystery 
to get your timer running on the second VPE.  You know it's taking 
interrupts, because of the IPIs getting through, so in principle you 
just need to run the chain of enables from the clock peripheral itself 
through the CIC to the CPU core and the IM bits.

It would be really cool if we could get a stable repository branch that 
boots SMTC out-of-the-box on both Malta and the MSP platform.

             Regards,

             Kevin K.

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: SMTC support status in latest git head.
  2011-01-05 19:23                                                                     ` Kevin D. Kissell
@ 2011-01-06 20:23                                                                       ` Anoop P A
  2011-01-06 23:31                                                                         ` Kevin D. Kissell
  0 siblings, 1 reply; 68+ messages in thread
From: Anoop P A @ 2011-01-06 20:23 UTC (permalink / raw)
  To: Kevin D. Kissell; +Cc: STUART VENTERS, Anoop P.A., linux-mips

On Wed, 2011-01-05 at 11:23 -0800, Kevin D. Kissell wrote:
> >   		LONG_S	$5, PT_R5(sp)
> >   		LONG_S	v1, PT_STATUS(sp)
> 
> That's exactly what I'd propose as the cleanest minimal fix.  I've got a 
> version that also replaces the .set mips32 / .set mips0 with the .set 
> push / .set pop paradigm, which I'd have used in the original code if 
> I'd known at the time about that assembler directive.  I'm hoping to be 
> able to test on a Malta/34K reference platform, and make sure there 
> isn't breakage on that platform branch as well, before we commit to the 
> repository.

I hope somebody can test this patch on Malta/34K platform. I don't have
access to any malta boards and I believe 34K MT simulations is not
available on qemu.

> 
> Your msp_smtc.c file looks plausible on the face of it.  The 
> init_secondary function has the quirk that it expects to execute on each 
> "CPU" in numerical order, which is very likely but not guaranteed. It 
> *ought* to be harmless in the rare case where it fails, but the 
> assumption is worth a comment, IMHO.
Yes I will add a comment. 

> 
> At this point, there shouldn't be a whole lot of SMTC-specific mystery 
> to get your timer running on the second VPE.  You know it's taking 
> interrupts, because of the IPIs getting through, so in principle you 
> just need to run the chain of enables from the clock peripheral itself 
> through the CIC to the CPU core and the IM bits.

I hope we are almost there. I have made some progress with the debug . I
think you should be able to give better insight to the observation I
have made.

1. Without selecting CONFIG_MIPS_MT_SMTC_IM_BACKSTOP My kernel hangs in
calibration loop itself . ( I haven't looked further into this).

2. With CONFIG_MIPS_MT_SMTC_IM_BACKSTOP I found I am getting 3
VPE1-TIMER interrupt ( one for each TC of VPE1) .However this interrupts
are not getting carried till c0_compare_interrupt . 

do_IRQ call had a SMTC hook which is modifying tccontext ( To reduce
complexity I haven't selected SMTC affinity). 

Once I disabled this call . I am seeing VPE1 timer interrupts and able
to boot completely without any issue's so far :).

/ # cat /proc/interrupts
           CPU0       CPU1       CPU2       CPU3       CPU4       CPU5
CPU6
  1:        171     727459     727561     727533         27     727446
727453            MIPS  SMTC_IPI
  6:          0          0          0          0          0          0
0            MIPS  MSP CIC cascade
  8:          0          0          0          0          0          0
0         MSP_CIC  Softreset button
  9:          0          0          0          0          0          0
0         MSP_CIC  Standby switch
 21:          0          0          0          0          0          0
0         MSP_CIC  MSP PER cascade
 25:     727507        484         11          0          0          0
0         MSP_CIC  timer
 27:          0          0          0          0        258         10
1         MSP_CIC  serial
 34:          0          0          0          0     727533          7
1         MSP_CIC  timer


BTW following code in my cic init was setting hwmask.

        /* initialize all the IRQ descriptors */
        for (i = MSP_CIC_INTBASE ; i < MSP_CIC_INTBASE + 32 ; i++) {
                set_irq_chip_and_handler(i, &msp_cic_irq_controller,
                                         handle_level_irq);
#ifdef CONFIG_MIPS_MT_SMTC
                irq_hwmask[i] = C_IRQ4;
#endif
        }



> It would be really cool if we could get a stable repository branch that 
> boots SMTC out-of-the-box on both Malta and the MSP platform.
:)

> 
>              Regards,
> 
>              Kevin K.
> 
> 
Thanks
Anoop

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: SMTC support status in latest git head.
  2011-01-06 20:23                                                                       ` Anoop P A
@ 2011-01-06 23:31                                                                         ` Kevin D. Kissell
  2011-01-07  7:56                                                                           ` Anoop P A
  0 siblings, 1 reply; 68+ messages in thread
From: Kevin D. Kissell @ 2011-01-06 23:31 UTC (permalink / raw)
  To: Anoop P A; +Cc: STUART VENTERS, Anoop P.A., linux-mips

On 01/06/11 12:23, Anoop P A wrote:
> On Wed, 2011-01-05 at 11:23 -0800, Kevin D. Kissell wrote:
>> At this point, there shouldn't be a whole lot of SMTC-specific mystery
>> to get your timer running on the second VPE.  You know it's taking
>> interrupts, because of the IPIs getting through, so in principle you
>> just need to run the chain of enables from the clock peripheral itself
>> through the CIC to the CPU core and the IM bits.
> I hope we are almost there. I have made some progress with the debug . I
> think you should be able to give better insight to the observation I
> have made.
>
> 1. Without selecting CONFIG_MIPS_MT_SMTC_IM_BACKSTOP My kernel hangs in
> calibration loop itself . ( I haven't looked further into this).
That suggests a problem with Status.IM initialization and/or
the handling of irq_hwmask[].  Do you mean that this is always
true, or only if VPE1 is being booted?  You haven't mentioned it
before.

> 2. With CONFIG_MIPS_MT_SMTC_IM_BACKSTOP I found I am getting 3
> VPE1-TIMER interrupt ( one for each TC of VPE1) .However this interrupts
> are not getting carried till c0_compare_interrupt .
Would you expect them to?  I thought you were using an outboard
timer and not the CP0 Compare interrupt.
>   do_IRQ call had a SMTC hook which is modifying tccontext ( To reduce
> complexity I haven't selected SMTC affinity).
>
> Once I disabled this call . I am seeing VPE1 timer interrupts and able
> to boot completely without any issue's so far :).
So long as you've got the IM_BACKSTOP hack enabled, right?
Because otherwise, without that __DO_IRQ_SMTC_HOOK() invocation
> / # cat /proc/interrupts
>             CPU0       CPU1       CPU2       CPU3       CPU4       CPU5
> CPU6
>    1:        171     727459     727561     727533         27     727446
> 727453            MIPS  SMTC_IPI
>    6:          0          0          0          0          0          0
> 0            MIPS  MSP CIC cascade
>    8:          0          0          0          0          0          0
> 0         MSP_CIC  Softreset button
>    9:          0          0          0          0          0          0
> 0         MSP_CIC  Standby switch
>   21:          0          0          0          0          0          0
> 0         MSP_CIC  MSP PER cascade
>   25:     727507        484         11          0          0          0
> 0         MSP_CIC  timer
>   27:          0          0          0          0        258         10
> 1         MSP_CIC  serial
>   34:          0          0          0          0     727533          7
> 1         MSP_CIC  timer
>
>
> BTW following code in my cic init was setting hwmask.
>
>          /* initialize all the IRQ descriptors */
>          for (i = MSP_CIC_INTBASE ; i<  MSP_CIC_INTBASE + 32 ; i++) {
>                  set_irq_chip_and_handler(i,&msp_cic_irq_controller,
>                                           handle_level_irq);
> #ifdef CONFIG_MIPS_MT_SMTC
>                  irq_hwmask[i] = C_IRQ4;
> #endif
>          }

I'm sure I've said this before, and it's in various comments in the SMTC
code, but remember, one of the main problems that the SMTC kernel
had to solve was to prevent all TCs of a VPE from "convoying" after every
interrupt.  The way this is done is that the interrupt vector code, before
clearing EXL, masks off the Status.IM bit associated with the incoming
interrupt.  Of course, to get another interrupt from the same source
(or collection of sources), that IM bit needs to be restored.  The "correct"
mechanism for this is by having the appropriate irq_hwmask[] value set,
so that smtc_im_ack_irq(), which should be invoked on an irq "ack()"
(meaning that the source has been quenched and any new occurrence
should be considered a new interrupt), will restore the bit in Status.
This function got moved around a bit in the various SMTC prototypes,
but it proved least intrusive to put it into the xxx_mask_and_ack() 
functions
for the interrupt controllers - see irq-msc01.c and i8259.c.  If you haven't
done the same in any equivalent code for a different on-chip controller,
you'll definitely have problems.

The Backstop scheme works OK for peripheral interrupts that didn't
have an appropriate irq_hwmask[] value set up, but clock interrupts
don't follow the same code paths and can't depend on the backstop.

             Regards,

             Kevin K.

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: SMTC support status in latest git head.
  2011-01-06 23:31                                                                         ` Kevin D. Kissell
@ 2011-01-07  7:56                                                                           ` Anoop P A
  2011-01-07 18:46                                                                             ` Kevin D. Kissell
  2011-01-10 19:30                                                                             ` Kevin D. Kissell
  0 siblings, 2 replies; 68+ messages in thread
From: Anoop P A @ 2011-01-07  7:56 UTC (permalink / raw)
  To: Kevin D. Kissell; +Cc: STUART VENTERS, Anoop P.A., linux-mips

[-- Attachment #1: Type: text/plain, Size: 2687 bytes --]

On Thu, 2011-01-06 at 15:31 -0800, Kevin D. Kissell wrote:
> On 01/06/11 12:23, Anoop P A wrote:

> 
> I'm sure I've said this before, and it's in various comments in the SMTC
> code, but remember, one of the main problems that the SMTC kernel
> had to solve was to prevent all TCs of a VPE from "convoying" after every
> interrupt.  The way this is done is that the interrupt vector code, before
> clearing EXL, masks off the Status.IM bit associated with the incoming
> interrupt.  Of course, to get another interrupt from the same source
> (or collection of sources), that IM bit needs to be restored.  The "correct"
> mechanism for this is by having the appropriate irq_hwmask[] value set,
> so that smtc_im_ack_irq(), which should be invoked on an irq "ack()"
> (meaning that the source has been quenched and any new occurrence
> should be considered a new interrupt), will restore the bit in Status.
> This function got moved around a bit in the various SMTC prototypes,
> but it proved least intrusive to put it into the xxx_mask_and_ack() 
> functions
> for the interrupt controllers - see irq-msc01.c and i8259.c.  If you haven't
> done the same in any equivalent code for a different on-chip controller,
> you'll definitely have problems.
> 
> The Backstop scheme works OK for peripheral interrupts that didn't
> have an appropriate irq_hwmask[] value set up, but clock interrupts
> don't follow the same code paths and can't depend on the backstop.

Ok. Well thanks much for your detailed explanation. Well I hope I found
the root cause . smtc_clockevent_init() was overriding irq_hwmask even
if are using platform specific get_c0_compare_int. With following patch
everything seems to be working for me.
------------------------------------------------------------------------
diff --git a/arch/mips/kernel/cevt-smtc.c b/arch/mips/kernel/cevt-smtc.c
index 2e72d30..a25fc59 100644
--- a/arch/mips/kernel/cevt-smtc.c
+++ b/arch/mips/kernel/cevt-smtc.c
@@ -310,9 +310,14 @@ int __cpuinit smtc_clockevent_init(void)
 		return 0;
 	/*
 	 * And we need the hwmask associated with the c0_compare
-	 * vector to be initialized.
+	 * vector to be initialized. However incase of platform 
+	 * specific get_co_compare_int, don't override irq_hwmask
+	 * expect platform code to set a valid mask value. 
 	 */
-	irq_hwmask[irq] = (0x100 << cp0_compare_irq);
+
+	if (!get_c0_compare_int)
+		irq_hwmask[irq] = (0x100 << cp0_compare_irq);
+
 	if (cp0_timer_irq_installed)
 		return 0;
----------------------------------------------------------------------- 


Attaching my msp_ir_cic.c . Kindly have a look if possible.

Thanks
Anoop

> 
>              Regards,
> 
>              Kevin K.


[-- Attachment #2: msp_irq_cic.c --]
[-- Type: text/x-csrc, Size: 5357 bytes --]

/*
 * Copyright 2010 PMC-Sierra, Inc, derived from irq_cpu.c
 *
 * This file define the irq handler for MSP CIC subsystem interrupts.
 *
 * This program is free software; you can redistribute  it and/or modify it
 * under  the terms of  the GNU General  Public License as published by the
 * Free Software Foundation;  either version 2 of the  License, or (at your
 * option) any later version.
 */

#include <linux/init.h>
#include <linux/interrupt.h>
#include <linux/kernel.h>
#include <linux/bitops.h>
#include <linux/irq.h>

#include <asm/mipsregs.h>
#include <asm/system.h>

#include <msp_cic_int.h>
#include <msp_regs.h>

/*
 * External API
 */
extern void msp_per_irq_init(void);
extern void msp_per_irq_dispatch(void);


/*
 * Convenience Macro.  Should be somewhere generic.
 */
#define get_current_vpe()   \
	((read_c0_tcbind() >> TCBIND_CURVPE_SHIFT) & TCBIND_CURVPE) 

#ifdef CONFIG_SMP

#define LOCK_VPE(flags, mtflags) \
do {				\
	local_irq_save(flags);	\
	mtflags = dmt();	\
} while (0)

#define UNLOCK_VPE(flags, mtflags) \
do {				\
	emt(mtflags);		\
	local_irq_restore(flags);\
} while (0)

#define LOCK_CORE(flags, mtflags) \
do {				\
	local_irq_save(flags);	\
	mtflags = dvpe();	\
} while (0)

#define UNLOCK_CORE(flags, mtflags)		\
do {				\
	evpe(mtflags);		\
	local_irq_restore(flags);\
} while (0)

#else

#define LOCK_VPE(flags, mtflags) 
#define UNLOCK_VPE(flags, mtflags) 

#endif

/* ensure writes to cic are completed */
static inline void cic_wmb(void)
{
	const volatile void __iomem *cic_mem = CIC_VPE0_MSK_REG;
	volatile u32 dummy_read;

	wmb();
	dummy_read = __raw_readl(cic_mem);
	dummy_read++;
}


static inline void unmask_cic_irq(unsigned int irq)
{
	volatile u32   *cic_msk_reg = CIC_VPE0_MSK_REG;
	int vpe;
#ifdef CONFIG_SMP
	unsigned int mtflags;
	unsigned long  flags;

	/*
	* Make sure we have IRQ affinity.  It may have changed while
	* we were processing the IRQ.
	*/
	if (!cpumask_test_cpu(smp_processor_id(), irq_desc[irq].affinity))
		return;
#endif

	vpe = get_current_vpe();
	LOCK_VPE(flags, mtflags);
	cic_msk_reg[vpe] |= (1 << (irq - MSP_CIC_INTBASE));
	UNLOCK_VPE(flags, mtflags);
	cic_wmb();
}

static inline void mask_cic_irq(unsigned int irq)
{
	volatile u32 *cic_msk_reg = CIC_VPE0_MSK_REG;
	int	vpe = get_current_vpe();
#ifdef CONFIG_SMP
	unsigned long flags, mtflags;
#endif
	LOCK_VPE(flags, mtflags);
	cic_msk_reg[vpe] &= ~(1 << (irq - MSP_CIC_INTBASE));
	UNLOCK_VPE(flags, mtflags);
	cic_wmb();
}
static inline void msp_cic_irq_ack(unsigned int irq)
{
	mask_cic_irq(irq);
	/*
	* Only really necessary for 18, 16-14 and sometimes 3:0
	* (since these can be edge sensitive) but it doesn't
	* hurt for the others
	*/
	*CIC_STS_REG = (1 << (irq - MSP_CIC_INTBASE));
	smtc_im_ack_irq(irq);
}

static void msp_cic_irq_end(unsigned int irq)
{
	if (!(irq_desc[irq].status & (IRQ_DISABLED | IRQ_INPROGRESS)))
		unmask_cic_irq(irq);
}

#ifdef CONFIG_SMP
static inline int msp_cic_irq_set_affinity(unsigned int irq,
					const struct cpumask *cpumask)
{
	int cpu;
	unsigned long flags;
	unsigned int  mtflags;
	unsigned long imask = (1 << (irq - MSP_CIC_INTBASE));
	volatile u32 *cic_mask = (volatile u32 *)CIC_VPE0_MSK_REG;

	/* timer balancing should be disabled in kernel code */
	BUG_ON(irq == MSP_INT_VPE0_TIMER || irq == MSP_INT_VPE1_TIMER);

	LOCK_CORE(flags, mtflags);
	/* enable if any of each VPE's TCs require this IRQ */
	for_each_online_cpu(cpu) {
		if (cpumask_test_cpu(cpu, cpumask))
			cic_mask[cpu] |= imask;
		else
			cic_mask[cpu] &= ~imask;

	}

	UNLOCK_CORE(flags, mtflags);
	return 0;

}
#endif

static struct irq_chip msp_cic_irq_controller = {
	.name = "MSP_CIC",
	.mask = msp_cic_irq_ack,
	.mask_ack = msp_cic_irq_ack,
	.unmask = unmask_cic_irq,
	.ack = msp_cic_irq_ack,
	.end = msp_cic_irq_end,
#ifdef CONFIG_SMP
	.set_affinity = msp_cic_irq_set_affinity,
#endif
};

void __init msp_cic_irq_init(void)
{
	int i;
	/* Mask/clear interrupts. */
	*CIC_VPE0_MSK_REG = 0x00000000;
	*CIC_VPE1_MSK_REG = 0x00000000;
	*CIC_STS_REG      = 0xFFFFFFFF;
	/*
	* The MSP7120 RG and EVBD boards use IRQ[6:4] for PCI.
	* These inputs map to EXT_INT_POL[6:4] inside the CIC.
	* They are to be active low, level sensitive.
	*/
	*CIC_EXT_CFG_REG &= 0xFFFF8F8F;

	/* initialize all the IRQ descriptors */
	for (i = MSP_CIC_INTBASE ; i < MSP_CIC_INTBASE + 32 ; i++) {
		set_irq_chip_and_handler(i, &msp_cic_irq_controller,
					 handle_level_irq);
#ifdef CONFIG_MIPS_MT_SMTC
		/* Mask of CIC interrupt */
		irq_hwmask[i] = C_IRQ4;
#endif
	}

	/* Initialize the PER interrupt sub-system */
	 msp_per_irq_init();
}

/* CIC masked by CIC vector processing before dispatch called */
void msp_cic_irq_dispatch(void)
{
	volatile u32	*cic_msk_reg = (volatile u32 *)CIC_VPE0_MSK_REG;
	u32	cic_mask;
	u32	 pending;
	int	cic_status = *CIC_STS_REG;
	cic_mask = cic_msk_reg[get_current_vpe()];
	pending = cic_status & cic_mask;
	if (pending & (1 << (MSP_INT_VPE0_TIMER - MSP_CIC_INTBASE))) {
		do_IRQ(MSP_INT_VPE0_TIMER);
	} else if (pending & (1 << (MSP_INT_VPE1_TIMER - MSP_CIC_INTBASE))) {
		do_IRQ(MSP_INT_VPE1_TIMER);
	} else if (pending & (1 << (MSP_INT_PER - MSP_CIC_INTBASE))) {
		msp_per_irq_dispatch();
	} else if (pending) {
		do_IRQ(ffs(pending) + MSP_CIC_INTBASE - 1);
	} else{
		spurious_interrupt();
		/* Re-enable the CIC cascaded interrupt. */
		irq_desc[MSP_INT_CIC].chip->end(MSP_INT_CIC);
	}
}

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* Re: SMTC support status in latest git head.
  2011-01-07  7:56                                                                           ` Anoop P A
@ 2011-01-07 18:46                                                                             ` Kevin D. Kissell
  2011-01-08 19:33                                                                               ` Anoop P A
  2011-01-10 19:30                                                                             ` Kevin D. Kissell
  1 sibling, 1 reply; 68+ messages in thread
From: Kevin D. Kissell @ 2011-01-07 18:46 UTC (permalink / raw)
  To: Anoop P A; +Cc: STUART VENTERS, Anoop P.A., linux-mips

On 01/06/11 23:56, Anoop P A wrote:
> On Thu, 2011-01-06 at 15:31 -0800, Kevin D. Kissell wrote:
>> I'm sure I've said this before, and it's in various comments in the SMTC
>> code, but...
As an aside to this conversation, would it be possible to create a
Documentation/mips/SMTC.txt file that would actually propagate
upstream, so that I'd stop being the sole repository of SMTC folklore?
I only maintain it as a hobby.
> Ok. Well thanks much for your detailed explanation. Well I hope I found
> the root cause . smtc_clockevent_init() was overriding irq_hwmask even
> if are using platform specific get_c0_compare_int. With following patch
> everything seems to be working for me.
> ------------------------------------------------------------------------
> diff --git a/arch/mips/kernel/cevt-smtc.c b/arch/mips/kernel/cevt-smtc.c
> index 2e72d30..a25fc59 100644
> --- a/arch/mips/kernel/cevt-smtc.c
> +++ b/arch/mips/kernel/cevt-smtc.c
> @@ -310,9 +310,14 @@ int __cpuinit smtc_clockevent_init(void)
>   		return 0;
>   	/*
>   	 * And we need the hwmask associated with the c0_compare
> -	 * vector to be initialized.
> +	 * vector to be initialized. However incase of platform
> +	 * specific get_co_compare_int, don't override irq_hwmask
> +	 * expect platform code to set a valid mask value.
>   	 */
> -	irq_hwmask[irq] = (0x100<<  cp0_compare_irq);
> +
> +	if (!get_c0_compare_int)
> +		irq_hwmask[irq] = (0x100<<  cp0_compare_irq);
> +
>   	if (cp0_timer_irq_installed)
>   		return 0;
> -----------------------------------------------------------------------
I'm still not clear on one point that, to me, is pretty important when
engineering a fix here.  Are you, in fact, using the Count/Compare
interrupt system, but having the externalization of the compare
interrupt routed back through an intervening interrupt controller,
or is your timer coming from another source?

In the former case, I think you're on the right track as to the
possible cause of a problem, but the fix should actually be simpler
and rather more elegant.  Why can't you simply see to it that
cp0_compare_irq is set to the right value, either at compile time,
or in your earliest platform initialization of the interrupt controller?
That would be a one-line, inline change and spare us another
cryptic conditional.

In the later case, you'll presumably be having lots of other problems,
as cevt-smtc.c is intertwined with cevt-r4k.c and the Count/Compare
paradigm.

             Regards,

             Kevin K.

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: SMTC support status in latest git head.
  2011-01-07 18:46                                                                             ` Kevin D. Kissell
@ 2011-01-08 19:33                                                                               ` Anoop P A
  0 siblings, 0 replies; 68+ messages in thread
From: Anoop P A @ 2011-01-08 19:33 UTC (permalink / raw)
  To: Kevin D. Kissell; +Cc: STUART VENTERS, Anoop P.A., linux-mips

On Sat, Jan 8, 2011 at 12:16 AM, Kevin D. Kissell <kevink@paralogos.com> wrote:
> On 01/06/11 23:56, Anoop P A wrote:
>>
>> On Thu, 2011-01-06 at 15:31 -0800, Kevin D. Kissell wrote:
>>>
>>> I'm sure I've said this before, and it's in various comments in the SMTC
>>> code, but...
>
> As an aside to this conversation, would it be possible to create a
> Documentation/mips/SMTC.txt file that would actually propagate
> upstream, so that I'd stop being the sole repository of SMTC folklore?
> I only maintain it as a hobby.
>>
>> Ok. Well thanks much for your detailed explanation. Well I hope I found
>> the root cause . smtc_clockevent_init() was overriding irq_hwmask even
>> if are using platform specific get_c0_compare_int. With following patch
>> everything seems to be working for me.
>> ------------------------------------------------------------------------
>> diff --git a/arch/mips/kernel/cevt-smtc.c b/arch/mips/kernel/cevt-smtc.c
>> index 2e72d30..a25fc59 100644
>> --- a/arch/mips/kernel/cevt-smtc.c
>> +++ b/arch/mips/kernel/cevt-smtc.c
>> @@ -310,9 +310,14 @@ int __cpuinit smtc_clockevent_init(void)
>>                return 0;
>>        /*
>>         * And we need the hwmask associated with the c0_compare
>> -        * vector to be initialized.
>> +        * vector to be initialized. However incase of platform
>> +        * specific get_co_compare_int, don't override irq_hwmask
>> +        * expect platform code to set a valid mask value.
>>         */
>> -       irq_hwmask[irq] = (0x100<<  cp0_compare_irq);
>> +
>> +       if (!get_c0_compare_int)
>> +               irq_hwmask[irq] = (0x100<<  cp0_compare_irq);
>> +
>>        if (cp0_timer_irq_installed)
>>                return 0;
>> -----------------------------------------------------------------------
>
> I'm still not clear on one point that, to me, is pretty important when
> engineering a fix here.  Are you, in fact, using the Count/Compare
> interrupt system, but having the externalization of the compare
> interrupt routed back through an intervening interrupt controller,
> or is your timer coming from another source?
>
> In the former case, I think you're on the right track as to the
> possible cause of a problem, but the fix should actually be simpler
> and rather more elegant.  Why can't you simply see to it that
> cp0_compare_irq is set to the right value, either at compile time,
> or in your earliest platform initialization of the interrupt controller?
> That would be a one-line, inline change and spare us another
> cryptic conditional.

Yes ,it is first case.

http://git.linux-mips.org/?p=linux.git;a=commit;h=38760d40ca61b18b2809e9c28df8b3ff9af8a02b

Above mentioned patch enables platforms to utilize 4k timer code with
platform specific timer interrupts. cevt-smtc also had ( copied from cevt-r4k)
referred code. Given the specific irq  support in cevt-smtc we should add
support for specific hwmask , IMHO.

>
> In the later case, you'll presumably be having lots of other problems,
> as cevt-smtc.c is intertwined with cevt-r4k.c and the Count/Compare
> paradigm.
>
>            Regards,
>
>            Kevin K.
>
>

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: SMTC support status in latest git head.
  2011-01-07  7:56                                                                           ` Anoop P A
  2011-01-07 18:46                                                                             ` Kevin D. Kissell
@ 2011-01-10 19:30                                                                             ` Kevin D. Kissell
  2011-01-11  4:05                                                                               ` Anoop P A
  2011-01-13  7:53                                                                               ` Kevin D. Kissell
  1 sibling, 2 replies; 68+ messages in thread
From: Kevin D. Kissell @ 2011-01-10 19:30 UTC (permalink / raw)
  To: Anoop P A; +Cc: STUART VENTERS, Anoop P.A., linux-mips

On 01/06/11 23:56, Anoop P A wrote:
> On Thu, 2011-01-06 at 15:31 -0800, Kevin D. Kissell wrote:
>> I'm sure I've said this before, and it's in various comments in the SMTC
>> code, but remember, one of the main problems that the SMTC kernel
>> had to solve was to prevent all TCs of a VPE from "convoying" after every
>> interrupt.  The way this is done is that the interrupt vector code, before
>> clearing EXL, masks off the Status.IM bit associated with the incoming
>> interrupt.  Of course, to get another interrupt from the same source
>> (or collection of sources), that IM bit needs to be restored.  The "correct"
>> mechanism for this is by having the appropriate irq_hwmask[] value set,
>> so that smtc_im_ack_irq(), which should be invoked on an irq "ack()"
>> (meaning that the source has been quenched and any new occurrence
>> should be considered a new interrupt), will restore the bit in Status.
>> This function got moved around a bit in the various SMTC prototypes,
>> but it proved least intrusive to put it into the xxx_mask_and_ack()
>> functions
>> for the interrupt controllers - see irq-msc01.c and i8259.c.  If you haven't
>> done the same in any equivalent code for a different on-chip controller,
>> you'll definitely have problems.
>>
>> The Backstop scheme works OK for peripheral interrupts that didn't
>> have an appropriate irq_hwmask[] value set up, but clock interrupts
>> don't follow the same code paths and can't depend on the backstop.
> Ok. Well thanks much for your detailed explanation. Well I hope I found
> the root cause . smtc_clockevent_init() was overriding irq_hwmask even
> if are using platform specific get_c0_compare_int. With following patch
> everything seems to be working for me.

Would this still be with a "tickful" kernel?  I was able to run some
experiments on a Malta over the weekend, using mostly default
Malta defconfig options including tickless operation.  The 2.6.32.27
build comes up with both VPEs and all TCs firing.  2.6.36.2 with
the stackframe.h patch boots all the way up on a single VPE, but
VERY slowly - as if the Clock/Compare setups weren't being done
correctly and timer intervals were waiting the full Count register
rollover cycle.  I've been looking at diffs, and merged one change
that was made to cevt-r4k.c into the analogous routine in cevt-smtc.c
(no change), but there's clearly more breakage to the SMTC/Malta
configuration post-2.6.32 than just the stackframe.h patch.  Going
tickful may work around it, but tickful+SMTC is grossly inefficient.

             Regards,

             Kevin K.

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: SMTC support status in latest git head.
  2011-01-10 19:30                                                                             ` Kevin D. Kissell
@ 2011-01-11  4:05                                                                               ` Anoop P A
  2011-01-13  7:53                                                                               ` Kevin D. Kissell
  1 sibling, 0 replies; 68+ messages in thread
From: Anoop P A @ 2011-01-11  4:05 UTC (permalink / raw)
  To: Kevin D. Kissell; +Cc: STUART VENTERS, Anoop P.A., linux-mips

On Tue, Jan 11, 2011 at 1:00 AM, Kevin D. Kissell <kevink@paralogos.com> wrote:
>
> Would this still be with a "tickful" kernel?  I was able to run some
> experiments on a Malta over the weekend, using mostly default
> Malta defconfig options including tickless operation.  The 2.6.32.27
> build comes up with both VPEs and all TCs firing.  2.6.36.2 with
> the stackframe.h patch boots all the way up on a single VPE, but
> VERY slowly - as if the Clock/Compare setups weren't being done
> correctly and timer intervals were waiting the full Count register
> rollover cycle.  I've been looking at diffs, and merged one change
> that was made to cevt-r4k.c into the analogous routine in cevt-smtc.c
> (no change), but there's clearly more breakage to the SMTC/Malta
> configuration post-2.6.32 than just the stackframe.h patch.  Going
> tickful may work around it, but tickful+SMTC is grossly inefficient.

Yes that is true my configuration is using tickful . I had reported this
issue with tickless kernel . I think you missed my last email. I will
resend.

Thanks
Anoop
>
>            Regards,
>
>            Kevin K.
>
>
>
>
>

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: SMTC support status in latest git head.
  2011-01-10 19:30                                                                             ` Kevin D. Kissell
  2011-01-11  4:05                                                                               ` Anoop P A
@ 2011-01-13  7:53                                                                               ` Kevin D. Kissell
  1 sibling, 0 replies; 68+ messages in thread
From: Kevin D. Kissell @ 2011-01-13  7:53 UTC (permalink / raw)
  To: Anoop P A; +Cc: STUART VENTERS, Anoop P.A., linux-mips

Further interesting data point:

If I specify "nowait" on the command line, I get much better behavior on
the 2.6.36 and 2.6.37 kernels.  In particular, 2.6.37, which hung after
the "Switching to clocksource MIPS" even booting with a single TC, gets
far enough to enable swap space even with 4 TCs running.  I note that
there was historically a problem with getting SMTC to work with the
wait-with-interrupts-disabled idle wait mode.  I had it working back in
2.5.2x, but something seems to have gotten broken in that 2.6.32 to
2.6.36 interval...

            Regards,

            Kevin K.

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: SMTC support status in latest git head.
  2010-12-16 13:03           ` Anoop P A
@ 2010-12-16 18:43             ` Kevin D. Kissell
  0 siblings, 0 replies; 68+ messages in thread
From: Kevin D. Kissell @ 2010-12-16 18:43 UTC (permalink / raw)
  To: Anoop P A; +Cc: Anoop P.A., linux-mips

Getting back to my previous comment, the value reported for
TC0's TCStatus register in the MT register dump can't be right.
There are bits that are literally the same flip-flops between
TCStatus and the containing VPE's Status register, and those
bits are turning up different.  If the reporting is wrong, then
one of the underlying assumptions of the dump code must
have been broken.  Taking a quick look at it - which is all the
time I have for it today - I note with alarm that the TCStatus
value reported for the TC currently executing comes from the
"flags" variable used in the local_irq_save(flags) statement
at the beginning of the dump code.  That historically worked,
because local_irq_save(x) propagated not only the interrupt
enable bit (bit 0) in x, but the entire value of Status - or TCStatus
in the case of SMTC.  It certainly looks as if that's no longer true.
I'm pretty sure that the dump function isn't the only place where
the knowledge of local_irq_save()'s implementation was exploited
by SMTC code.  So you look for changes to the local_irq_save()
macro definitions between 2.6.32 and 2.6.33.

The fact that you're blowing up on a DSP after you force an
exit from the timer calibration loop might also be attributable
to TCStatus is getting trashed, accidentally clearing access
rights to the DSP ASE state.

Honestly, just how many lines changed under arch/mips
(and include/asm-mips, if it was still outside arch/mips)
between 2.6.32 and 2.6.33?  There simply can't be that
many to review.

             Regards,

             Kevin K.

On 12/16/10 05:03, Anoop P A wrote:
> On Wed, 2010-12-15 at 11:58 -0800, Kevin D. Kissell wrote:
>> On 12/15/10 11:18, Anoop P A wrote:
>>>> management algorithms I described
>>> Even with command line maxtcs=1 and maxvpes=1 I am seeing same hung. The
>>> register dump is copied below.
>> I guess what jumps out at me is that VPE0.EPC doesn't look to have
>> changed since the very initial boot vector, as if we'd never successfully
>> taken an exception or interrupt of any kind, prior to the NMI (I'm assuming
>> you're getting that MT state dump by breaking in with an NMI).
>> I'm puzzled that TC0.TCStatus is being reported as 0, when it should
>> have a bunch of bits in common with VPE0.Status.  And I'm particularly
>> intrigued by the fact that you seem to have an interrupt bit set in Cause
>> which is enabled in Status, with IE set and EXL/ERL clear, yet you don't
>> seem to be getting interrupts.
>>
>> Do you have access to some kind of EJTAG probe for your system?
> Unfortunately I don't have access to a working EJTAG at the moment.
>
>>> I have tested few stable tags in git and isolated the code brake.
>>>
>>> 2.6.24-stable + patch[1] = SMTC boot success
>>> 2.6.29-stable + patch[1] = SMTC boot success
>>> 2.6.31-stable + patch[1] = SMTC boot success
>>> 2.6.32-stable + patch[1] = SMTC boot success
>>> 2.6.33-stable		 = SMTC boot failed
>>> 2.6.35-stable 		 = SMTC boot failed
>>>
>>> So it looks like SMTC support got broke between 2.6.32 and 2.6.33 .
>> That's a pretty good job of isolating the problem, and the fact
>> that it happens even with no TC or VPE concurrency means it's
>> not a failure of the SMTC logic per se, but that someone changed
>> some code that's common to SMTC and "normal"/SMP operation
>> in a way that breaks the more constrained assumptions of SMTC.
>>
> I have tried digging diff between 2.6.32 and 2.6.33 but I couldn't spot
> any likely causes.
>
> I forgot to mention that I can boot newer kernels both in VSMP and UP
> mode.
>
> The other thing I have tried is booting kernel with pre-set lpj ( Just
> to test how far I can go), which lead me to a dsp exception (spurious ?)
>
> Let me know if you have any thoughts .
>
> Thanks,
> Anoop
>
> ################# log #############
>
> Linux version 2.6.33.7-pmc (paanoop1@paanoop1-desktop) (gcc version
> 4.5.1 (GCC) ) #27 SMP PREEMPT Thu Dec 16 17:49:46 IST 2010
> DSPRAM0: PA=1c100000,Size=00008000,enabled
> UART clock set to 50000000
> CPU revision is: 00019548 (MIPS 34Kc)
> Determined physical RAM map:
>   memory: 00001000 @ 00000000 (reserved)
>   memory: 000ff000 @ 00001000 (usable)
>   memory: 00271000 @ 00100000 (reserved)
>   memory: 0fc5a200 @ 00371000 (usable)
> Wasting 32 bytes for tracking 1 unused pages
> Zone PFN ranges:
>    Normal   0x00000000 ->  0x0000ffcb
> Movable zone start PFN for each node
> early_node_map[1] active PFN ranges
>      0: 0x00000000 ->  0x0000ffcb
> 6 available secondary CPU TC(s)
> PERCPU: Embedded 7 pages/cpu @81203000 s4896 r8192 d15584 u65536
> pcpu-alloc: s4896 r8192 d15584 u65536 alloc=16*4096
> pcpu-alloc: [0] 0 [0] 1 [0] 2 [0] 3 [0] 4 [0] 5 [0] 6
> Built 1 zonelists in Zone order, mobility grouping on.  Total pages:
> 64971
> Kernel command line: console=ttyS0,57600 lpj=796672
> PID hash table entries: 1024 (order: 0, 4096 bytes)
> Dentry cache hash table entries: 32768 (order: 5, 131072 bytes)
> Inode-cache hash table entries: 16384 (order: 4, 65536 bytes)
> Primary instruction cache 64kB, VIPT, 4-way, linesize 32 bytes.
> Primary data cache 64kB, 4-way, PIPT, no aliases, linesize 32 bytes
> Writing ErrCtl register=00000000
> Readback ErrCtl register=00000000
> Memory: 255548k/259428k available (1861k kernel code, 3504k reserved,
> 400k data, 156k init, 0k highmem)
> Hierarchical RCU implementation.
> NR_IRQS:128
> Clock rate set to 600000000
> console [ttyS0] enabled
> Calibrating delay loop (skipped) preset value.. 398.33 BogoMIPS
> (lpj=796672)
> Mount-cache hash table entries: 512
> Cpu 0
> $ 0   : 00000000 10102000 00000010 00000003
> $ 4   : 00000003 00000000 00000000 8f82f758
> $ 8   : 00000000 00000000 00000000 00000000
> $12   : 00000000 00000007 8f82301c 00000000
> $16   : 8f82f758 00800b00 8035d3c0 8f830000
> $20   : 80329df8 00000000 8035d3c0 80360000
> $24   : 00000000 00000001
> $28   : 80328000 80329ce0 8f82f868 8010d018
> Hi    : 0000004c
> Lo    : 3831f4b4
> epc   : 8010d054 copy_thread+0x88/0x348
>      Not tainted
> ra    : 8010d018 copy_thread+0x4c/0x348
> Status: 10102000    KERNEL
> Cause : 50804068
> PrId  : 00019548 (MIPS 34Kc)
> Kernel panic - not syncing: Unexpected DSP exception
>
>

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: SMTC support status in latest git head.
  2010-12-15 19:58         ` Kevin D. Kissell
@ 2010-12-16 13:03           ` Anoop P A
  2010-12-16 18:43             ` Kevin D. Kissell
  0 siblings, 1 reply; 68+ messages in thread
From: Anoop P A @ 2010-12-16 13:03 UTC (permalink / raw)
  To: Kevin D. Kissell; +Cc: Anoop P.A., linux-mips

On Wed, 2010-12-15 at 11:58 -0800, Kevin D. Kissell wrote:
> On 12/15/10 11:18, Anoop P A wrote:
> >> management algorithms I described
> > Even with command line maxtcs=1 and maxvpes=1 I am seeing same hung. The
> > register dump is copied below.
> I guess what jumps out at me is that VPE0.EPC doesn't look to have
> changed since the very initial boot vector, as if we'd never successfully
> taken an exception or interrupt of any kind, prior to the NMI (I'm assuming
> you're getting that MT state dump by breaking in with an NMI).
> I'm puzzled that TC0.TCStatus is being reported as 0, when it should
> have a bunch of bits in common with VPE0.Status.  And I'm particularly
> intrigued by the fact that you seem to have an interrupt bit set in Cause
> which is enabled in Status, with IE set and EXL/ERL clear, yet you don't
> seem to be getting interrupts.
> 
> Do you have access to some kind of EJTAG probe for your system?

Unfortunately I don't have access to a working EJTAG at the moment.

> 
> > I have tested few stable tags in git and isolated the code brake.
> >
> > 2.6.24-stable + patch[1] = SMTC boot success
> > 2.6.29-stable + patch[1] = SMTC boot success
> > 2.6.31-stable + patch[1] = SMTC boot success
> > 2.6.32-stable + patch[1] = SMTC boot success
> > 2.6.33-stable		 = SMTC boot failed
> > 2.6.35-stable 		 = SMTC boot failed
> >
> > So it looks like SMTC support got broke between 2.6.32 and 2.6.33 .
> That's a pretty good job of isolating the problem, and the fact
> that it happens even with no TC or VPE concurrency means it's
> not a failure of the SMTC logic per se, but that someone changed
> some code that's common to SMTC and "normal"/SMP operation
> in a way that breaks the more constrained assumptions of SMTC.
> 

I have tried digging diff between 2.6.32 and 2.6.33 but I couldn't spot
any likely causes.

I forgot to mention that I can boot newer kernels both in VSMP and UP
mode.

The other thing I have tried is booting kernel with pre-set lpj ( Just
to test how far I can go), which lead me to a dsp exception (spurious ?)

Let me know if you have any thoughts .

Thanks,
Anoop

################# log #############

Linux version 2.6.33.7-pmc (paanoop1@paanoop1-desktop) (gcc version
4.5.1 (GCC) ) #27 SMP PREEMPT Thu Dec 16 17:49:46 IST 2010
DSPRAM0: PA=1c100000,Size=00008000,enabled
UART clock set to 50000000
CPU revision is: 00019548 (MIPS 34Kc)
Determined physical RAM map:
 memory: 00001000 @ 00000000 (reserved)
 memory: 000ff000 @ 00001000 (usable)
 memory: 00271000 @ 00100000 (reserved)
 memory: 0fc5a200 @ 00371000 (usable)
Wasting 32 bytes for tracking 1 unused pages
Zone PFN ranges:
  Normal   0x00000000 -> 0x0000ffcb
Movable zone start PFN for each node
early_node_map[1] active PFN ranges
    0: 0x00000000 -> 0x0000ffcb
6 available secondary CPU TC(s)
PERCPU: Embedded 7 pages/cpu @81203000 s4896 r8192 d15584 u65536
pcpu-alloc: s4896 r8192 d15584 u65536 alloc=16*4096
pcpu-alloc: [0] 0 [0] 1 [0] 2 [0] 3 [0] 4 [0] 5 [0] 6
Built 1 zonelists in Zone order, mobility grouping on.  Total pages:
64971
Kernel command line: console=ttyS0,57600 lpj=796672
PID hash table entries: 1024 (order: 0, 4096 bytes)
Dentry cache hash table entries: 32768 (order: 5, 131072 bytes)
Inode-cache hash table entries: 16384 (order: 4, 65536 bytes)
Primary instruction cache 64kB, VIPT, 4-way, linesize 32 bytes.
Primary data cache 64kB, 4-way, PIPT, no aliases, linesize 32 bytes
Writing ErrCtl register=00000000
Readback ErrCtl register=00000000
Memory: 255548k/259428k available (1861k kernel code, 3504k reserved,
400k data, 156k init, 0k highmem)
Hierarchical RCU implementation.
NR_IRQS:128
Clock rate set to 600000000
console [ttyS0] enabled
Calibrating delay loop (skipped) preset value.. 398.33 BogoMIPS
(lpj=796672)
Mount-cache hash table entries: 512
Cpu 0
$ 0   : 00000000 10102000 00000010 00000003
$ 4   : 00000003 00000000 00000000 8f82f758
$ 8   : 00000000 00000000 00000000 00000000
$12   : 00000000 00000007 8f82301c 00000000
$16   : 8f82f758 00800b00 8035d3c0 8f830000
$20   : 80329df8 00000000 8035d3c0 80360000
$24   : 00000000 00000001
$28   : 80328000 80329ce0 8f82f868 8010d018
Hi    : 0000004c
Lo    : 3831f4b4
epc   : 8010d054 copy_thread+0x88/0x348
    Not tainted
ra    : 8010d018 copy_thread+0x4c/0x348
Status: 10102000    KERNEL
Cause : 50804068
PrId  : 00019548 (MIPS 34Kc)
Kernel panic - not syncing: Unexpected DSP exception

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: SMTC support status in latest git head.
  2010-12-15 19:18       ` Anoop P A
@ 2010-12-15 19:58         ` Kevin D. Kissell
  2010-12-16 13:03           ` Anoop P A
  0 siblings, 1 reply; 68+ messages in thread
From: Kevin D. Kissell @ 2010-12-15 19:58 UTC (permalink / raw)
  To: Anoop P A; +Cc: Anoop P.A., linux-mips

On 12/15/10 11:18, Anoop P A wrote:
> On Tue, 2010-12-14 at 10:32 -0800, Kevin D. Kissell wrote:
>
>> I can't comment on your tweak to 2.6.24.7 without seeing it as a patch
>> diff.
> http://patchwork.linux-mips.org/patch/804/ I was speaking about this
> patch. Since my timer is connected through a cascaded CIC , It is
> required to check TI bit of cause register in order to ensure a timer
> interrupt. With above mentioned patch I was able to boot a 2.6.24-stable
> SMTC kernel. ( Not tested fully though )
OK, yes, of course, you'd need that patch.
>> The recommended procedure was, and remains, to isolate clock
>> propagation problems by using command line options "maxtcs="
>> and "maxvpes=".
>>
>> First, boot your SMTC kernel with maxtcs=1 and maxvpes=1,
>> a virtual uniprocessor.  If that doesn't run, you've got some fundamental
>> problem with support for your platform, or someone has really fundamentally
>> broken the SMTC build somewhere.  Next, try booting with maxtcs=2
>> and maxvpes=1, then with no constraint on maxtcs and maxvpes=1.
>> If those fail, your problem is probably in the interrupt mask
>> management algorithms I described
> Even with command line maxtcs=1 and maxvpes=1 I am seeing same hung. The
> register dump is copied below.
I guess what jumps out at me is that VPE0.EPC doesn't look to have
changed since the very initial boot vector, as if we'd never successfully
taken an exception or interrupt of any kind, prior to the NMI (I'm assuming
you're getting that MT state dump by breaking in with an NMI).
I'm puzzled that TC0.TCStatus is being reported as 0, when it should
have a bunch of bits in common with VPE0.Status.  And I'm particularly
intrigued by the fact that you seem to have an interrupt bit set in Cause
which is enabled in Status, with IE set and EXL/ERL clear, yet you don't
seem to be getting interrupts.

Do you have access to some kind of EJTAG probe for your system?

> I have tested few stable tags in git and isolated the code brake.
>
> 2.6.24-stable + patch[1] = SMTC boot success
> 2.6.29-stable + patch[1] = SMTC boot success
> 2.6.31-stable + patch[1] = SMTC boot success
> 2.6.32-stable + patch[1] = SMTC boot success
> 2.6.33-stable		 = SMTC boot failed
> 2.6.35-stable 		 = SMTC boot failed
>
> So it looks like SMTC support got broke between 2.6.32 and 2.6.33 .
That's a pretty good job of isolating the problem, and the fact
that it happens even with no TC or VPE concurrency means it's
not a failure of the SMTC logic per se, but that someone changed
some code that's common to SMTC and "normal"/SMP operation
in a way that breaks the more constrained assumptions of SMTC.

> Thanks and Regards,
> Anoop
>
> patch[1] : http://patchwork.linux-mips.org/patch/804/
>
>
> #############################Log###########################
>      0.000000] Calibrating delay loop... === MIPS MT State Dump ===
> [    0.000000] -- Global State --
> [    0.000000]    MVPControl Passed: 00000000
> [    0.000000]    MVPControl Read: 00000000
> [    0.000000]    MVPConf0 : a8008406
> [    0.000000] -- per-VPE State --
> [    0.000000]   VPE 0
> [    0.000000]    VPEControl : 00000000
> [    0.000000]    VPEConf0 : 800f0003
> [    0.000000]    VPE0.Status : 11004001
> [    0.000000]    VPE0.EPC : 80100000 _stext+0x0/0x10
> [    0.000000]    VPE0.Cause : 50804000
> [    0.000000]    VPE0.Config7 : 00010000
> [    0.000000]   VPE 1
> [    0.000000]    VPEControl : 00060000
> [    0.000000]    VPEConf0 : 800f0000
> [    0.000000]    VPE1.Status : 00408305
> [    0.000000]    VPE1.EPC : 80100380 name_to_dev_t+0x50/0x430
> [    0.000000]    VPE1.Cause : 50000200
> [    0.000000]    VPE1.Config7 : 00010000
> [    0.000000] -- per-TC State --
> [    0.000000]   TC 0 (current TC with VPE EPC above)
> [    0.000000]    TCStatus : 00000000
> [    0.000000]    TCBind : 00000000
> [    0.000000]    TCRestart : 8010d860 mips_mt_regdump+0x2f0/0x3c4
> [    0.000000]    TCHalt : 00000000
> [    0.000000]    TCContext : 00000000
> [    0.000000]   TC 1
> [    0.000000]    TCStatus : 00000000
> [    0.000000]    TCBind : 00200001
> [    0.000000]    TCRestart : 8f800020 0x8f800020
> [    0.000000]    TCHalt : 00000001
> [    0.000000]    TCContext : 00140000
> [    0.000000]   TC 2
> [    0.000000]    TCStatus : 00000000
> [    0.000000]    TCBind : 00400001
> [    0.000000]    TCRestart : 8f800020 0x8f800020
> [    0.000000]    TCHalt : 00000001
> [    0.000000]    TCContext : 00280000
> [    0.000000]   TC 3
> [    0.000000]    TCStatus : 00000000
> [    0.000000]    TCBind : 00600001
> [    0.000000]    TCRestart : 8f800020 0x8f800020
> [    0.000000]    TCHalt : 00000001
> [    0.000000]    TCContext : 003c0000
> [    0.000000]   TC 4
> [    0.000000]    TCStatus : 00000000
> [    0.000000]    TCBind : 00800001
> [    0.000000]    TCRestart : 80100380 name_to_dev_t+0x50/0x430
> [    0.000000]    TCHalt : 00000001
> [    0.000000]    TCContext : 00500000
> [    0.000000]   TC 5
> [    0.000000]    TCStatus : 00000000
> [    0.000000]    TCBind : 00a00001
> [    0.000000]    TCRestart : 80100380 name_to_dev_t+0x50/0x430
> [    0.000000]    TCHalt : 00000001
> [    0.000000]    TCContext : 00640000
> [    0.000000]   TC 6
> [    0.000000]    TCStatus : 00000000
> [    0.000000]    TCBind : 00c00001
> [    0.000000]    TCRestart : 80268e00 aes_encrypt+0x10e4/0x164c
> [    0.000000]    TCHalt : 00000001
> [    0.000000]    TCContext : 00780000
> [    0.000000] Counter Interrupts taken per CPU (TC)
> [    0.000000] 0: 0
> [    0.000000] 1: 0
> [    0.000000] 2: 0
> [    0.000000] 3: 0
> [    0.000000] 4: 0
> [    0.000000] 5: 0
> [    0.000000] 6: 0
> [    0.000000] 7: 0
> [    0.000000] Self-IPI invocations:
> [    0.000000] 0: 0
> [    0.000000] 1: 0
> [    0.000000] 2: 0
> [    0.000000] 3: 0
> [    0.000000] 4: 0
> [    0.000000] 5: 0
> [    0.000000] 6: 0
> [    0.000000] 7: 0
> [    0.000000] IPIQ[0]: head = 0x0, tail = 0x0, depth = 0
> [    0.000000] IPIQ[1]: head = 0x0, tail = 0x0, depth = 0
> [    0.000000] IPIQ[2]: head = 0x0, tail = 0x0, depth = 0
> [    0.000000] IPIQ[3]: head = 0x0, tail = 0x0, depth = 0
> [    0.000000] IPIQ[4]: head = 0x0, tail = 0x0, depth = 0
> [    0.000000] IPIQ[5]: head = 0x0, tail = 0x0, depth = 0
> [    0.000000] IPIQ[6]: head = 0x0, tail = 0x0, depth = 0
> [    0.000000] IPIQ[7]: head = 0x0, tail = 0x0, depth = 0
> [    0.000000] 0 Recoveries of "stolen" FPU
> [    0.000000] ===========================
> [    0.000000] In platform cic dispatch cic_mask=0x22000 stat=0x2402000f
> pend=0x20000
> [    0.010000] === MIPS MT State Dump ===
> [    0.010000] -- Global State --
> [    0.010000]    MVPControl Passed: 00000000
> [    0.010000]    MVPControl Read: 00000000
> [    0.010000]    MVPConf0 : a8008406
> [    0.010000] -- per-VPE State --
> [    0.010000]   VPE 0
> [    0.010000]    VPEControl : 00000000
> [    0.010000]    VPEConf0 : 800f0003
> [    0.010000]    VPE0.Status : 18004000
> [    0.010000]    VPE0.EPC : 8010d900 mips_mt_regdump+0x390/0x3c4
> [    0.010000]    VPE0.Cause : 40804000
> [    0.010000]    VPE0.Config7 : 00010000
> [    0.010000]   VPE 1
> [    0.010000]    VPEControl : 00060000
> [    0.010000]    VPEConf0 : 800f0000
> [    0.010000]    VPE1.Status : 00408305
> [    0.010000]    VPE1.EPC : 80100380 name_to_dev_t+0x50/0x430
> [    0.010000]    VPE1.Cause : 50000200
> [    0.010000]    VPE1.Config7 : 00010000
> [    0.010000] -- per-TC State --
> [    0.010000]   TC 0 (current TC with VPE EPC above)
> [    0.010000]    TCStatus : 00000000
> [    0.010000]    TCBind : 00000000
> [    0.010000]    TCRestart : 803f791c printk+0xc/0x30
> [    0.010000]    TCHalt : 00000000
> [    0.010000]    TCContext : 00000000
> [    0.010000]   TC 1
> [    0.010000]    TCStatus : 00000000
> [    0.010000]    TCBind : 00200001
> [    0.010000]    TCRestart : 8f800020 0x8f800020
> [    0.010000]    TCHalt : 00000001
> [    0.010000]    TCContext : 00140000
> [    0.010000]   TC 2
> [    0.010000]    TCStatus : 00000000
> [    0.010000]    TCBind : 00400001
> [    0.010000]    TCRestart : 8f800020 0x8f800020
> [    0.010000]    TCHalt : 00000001
> [    0.010000]    TCContext : 00280000
> [    0.010000]   TC 3
> [    0.010000]    TCStatus : 00000000
> [    0.010000]    TCBind : 00600001
> [    0.010000]    TCRestart : 8f800020 0x8f800020
> [    0.010000]    TCHalt : 00000001
> [    0.010000]    TCContext : 003c0000
> [    0.010000]   TC 4
> [    0.010000]    TCStatus : 00000000
> [    0.010000]    TCBind : 00800001
> [    0.010000]    TCRestart : 80100380 name_to_dev_t+0x50/0x430
> [    0.010000]    TCHalt : 00000001
> [    0.010000]    TCContext : 00500000
> [    0.010000]   TC 5
> [    0.010000]    TCStatus : 00000000
> [    0.010000]    TCBind : 00a00001
> [    0.010000]    TCRestart : 80100380 name_to_dev_t+0x50/0x430
> [    0.010000]    TCHalt : 00000001
> [    0.010000]    TCContext : 00640000
> [    0.010000]   TC 6
> [    0.010000]    TCStatus : 00000000
> [    0.010000]    TCBind : 00c00001
> [    0.010000]    TCRestart : 80268e00 aes_encrypt+0x10e4/0x164c
> [    0.010000]    TCHalt : 00000001
> [    0.010000]    TCContext : 00780000
> [    0.010000] Counter Interrupts taken per CPU (TC)
> [    0.010000] 0: 0
> [    0.010000] 1: 0
> [    0.010000] 2: 0
> [    0.010000] 3: 0
> [    0.010000] 4: 0
> [    0.010000] 5: 0
> [    0.010000] 6: 0
> [    0.010000] 7: 0
> [    0.010000] Self-IPI invocations:
> [    0.010000] 0: 0
> [    0.010000] 1: 0
> [    0.010000] 2: 0
> [    0.010000] 3: 0
> [    0.010000] 4: 0
> [    0.010000] 5: 0
> [    0.010000] 6: 0
> [    0.010000] 7: 0
> [    0.010000] IPIQ[0]: head = 0x0, tail = 0x0, depth = 0
> [    0.010000] IPIQ[1]: head = 0x0, tail = 0x0, depth = 0
> [    0.010000] IPIQ[2]: head = 0x0, tail = 0x0, depth = 0
> [    0.010000] IPIQ[3]: head = 0x0, tail = 0x0, depth = 0
> [    0.010000] IPIQ[4]: head = 0x0, tail = 0x0, depth = 0
> [    0.010000] IPIQ[5]: head = 0x0, tail = 0x0, depth = 0
> [    0.010000] IPIQ[6]: head = 0x0, tail = 0x0, depth = 0
> [    0.010000] IPIQ[7]: head = 0x0, tail = 0x0, depth = 0
> [    0.010000] 0 Recoveries of "stolen" FPU
> [    0.010000] ===========================
>
>
>

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: SMTC support status in latest git head.
  2010-12-14 18:32     ` Kevin D. Kissell
  2010-12-14 18:50       ` Ralf Baechle
@ 2010-12-15 19:18       ` Anoop P A
  2010-12-15 19:58         ` Kevin D. Kissell
  1 sibling, 1 reply; 68+ messages in thread
From: Anoop P A @ 2010-12-15 19:18 UTC (permalink / raw)
  To: Kevin D. Kissell; +Cc: Anoop P.A., linux-mips

On Tue, 2010-12-14 at 10:32 -0800, Kevin D. Kissell wrote:
> Between your mailer and mine (Thunderbird 3.1 on Ubuntu), the quoting
> has become something of a dogs breakfast, so let me just lay things out
> here as best I can.

I am sorry for that. With evolution it will be better I hope.

> 
> I can't comment on your tweak to 2.6.24.7 without seeing it as a patch
> diff.

http://patchwork.linux-mips.org/patch/804/ I was speaking about this
patch. Since my timer is connected through a cascaded CIC , It is
required to check TI bit of cause register in order to ensure a timer
interrupt. With above mentioned patch I was able to boot a 2.6.24-stable
SMTC kernel. ( Not tested fully though )

> The recommended procedure was, and remains, to isolate clock
> propagation problems by using command line options "maxtcs="
> and "maxvpes=".
> 
> First, boot your SMTC kernel with maxtcs=1 and maxvpes=1,
> a virtual uniprocessor.  If that doesn't run, you've got some fundamental
> problem with support for your platform, or someone has really fundamentally
> broken the SMTC build somewhere.  Next, try booting with maxtcs=2
> and maxvpes=1, then with no constraint on maxtcs and maxvpes=1.
> If those fail, your problem is probably in the interrupt mask
> management algorithms I described

Even with command line maxtcs=1 and maxvpes=1 I am seeing same hung. The
register dump is copied below.


> Your dump below looks as if it comes from 2 TCs running on
> 2 VPEs, and that the interrupt mask issues I alluded to earlier
> are neither relevant nor manifest.  It looks instead as if the
> initialization of "CPU 1" (VPE1/TC1) may not have been done
> properly.  Under normal operation, it would be pretty rare to
> catch TC 1 in the exception vector dispatch code, so the first
> hypothesis that comes to mind is that something isn't right in
> the vector/handler setup, and TC 1 is stuck in an infinite exception
> loop, unable to handshake with TC 0 and thus locking up the
> system.  But that's just my best guess based on limited data.
> 
>              Regards,
> 
>              Kevin K.
> 

I have tested few stable tags in git and isolated the code brake.

2.6.24-stable + patch[1] = SMTC boot success
2.6.29-stable + patch[1] = SMTC boot success
2.6.31-stable + patch[1] = SMTC boot success
2.6.32-stable + patch[1] = SMTC boot success
2.6.33-stable		 = SMTC boot failed
2.6.35-stable 		 = SMTC boot failed

So it looks like SMTC support got broke between 2.6.32 and 2.6.33 .

Thanks and Regards,
Anoop

patch[1] : http://patchwork.linux-mips.org/patch/804/


#############################Log###########################
    0.000000] Calibrating delay loop... === MIPS MT State Dump ===
[    0.000000] -- Global State --
[    0.000000]    MVPControl Passed: 00000000
[    0.000000]    MVPControl Read: 00000000
[    0.000000]    MVPConf0 : a8008406
[    0.000000] -- per-VPE State --
[    0.000000]   VPE 0
[    0.000000]    VPEControl : 00000000
[    0.000000]    VPEConf0 : 800f0003
[    0.000000]    VPE0.Status : 11004001
[    0.000000]    VPE0.EPC : 80100000 _stext+0x0/0x10
[    0.000000]    VPE0.Cause : 50804000
[    0.000000]    VPE0.Config7 : 00010000
[    0.000000]   VPE 1
[    0.000000]    VPEControl : 00060000
[    0.000000]    VPEConf0 : 800f0000
[    0.000000]    VPE1.Status : 00408305
[    0.000000]    VPE1.EPC : 80100380 name_to_dev_t+0x50/0x430
[    0.000000]    VPE1.Cause : 50000200
[    0.000000]    VPE1.Config7 : 00010000
[    0.000000] -- per-TC State --
[    0.000000]   TC 0 (current TC with VPE EPC above)
[    0.000000]    TCStatus : 00000000
[    0.000000]    TCBind : 00000000
[    0.000000]    TCRestart : 8010d860 mips_mt_regdump+0x2f0/0x3c4
[    0.000000]    TCHalt : 00000000
[    0.000000]    TCContext : 00000000
[    0.000000]   TC 1
[    0.000000]    TCStatus : 00000000
[    0.000000]    TCBind : 00200001
[    0.000000]    TCRestart : 8f800020 0x8f800020
[    0.000000]    TCHalt : 00000001
[    0.000000]    TCContext : 00140000
[    0.000000]   TC 2
[    0.000000]    TCStatus : 00000000
[    0.000000]    TCBind : 00400001
[    0.000000]    TCRestart : 8f800020 0x8f800020
[    0.000000]    TCHalt : 00000001
[    0.000000]    TCContext : 00280000
[    0.000000]   TC 3
[    0.000000]    TCStatus : 00000000
[    0.000000]    TCBind : 00600001
[    0.000000]    TCRestart : 8f800020 0x8f800020
[    0.000000]    TCHalt : 00000001
[    0.000000]    TCContext : 003c0000
[    0.000000]   TC 4
[    0.000000]    TCStatus : 00000000
[    0.000000]    TCBind : 00800001
[    0.000000]    TCRestart : 80100380 name_to_dev_t+0x50/0x430
[    0.000000]    TCHalt : 00000001
[    0.000000]    TCContext : 00500000
[    0.000000]   TC 5
[    0.000000]    TCStatus : 00000000
[    0.000000]    TCBind : 00a00001
[    0.000000]    TCRestart : 80100380 name_to_dev_t+0x50/0x430
[    0.000000]    TCHalt : 00000001
[    0.000000]    TCContext : 00640000
[    0.000000]   TC 6
[    0.000000]    TCStatus : 00000000
[    0.000000]    TCBind : 00c00001
[    0.000000]    TCRestart : 80268e00 aes_encrypt+0x10e4/0x164c
[    0.000000]    TCHalt : 00000001
[    0.000000]    TCContext : 00780000
[    0.000000] Counter Interrupts taken per CPU (TC)
[    0.000000] 0: 0
[    0.000000] 1: 0
[    0.000000] 2: 0
[    0.000000] 3: 0
[    0.000000] 4: 0
[    0.000000] 5: 0
[    0.000000] 6: 0
[    0.000000] 7: 0
[    0.000000] Self-IPI invocations:
[    0.000000] 0: 0
[    0.000000] 1: 0
[    0.000000] 2: 0
[    0.000000] 3: 0
[    0.000000] 4: 0
[    0.000000] 5: 0
[    0.000000] 6: 0
[    0.000000] 7: 0
[    0.000000] IPIQ[0]: head = 0x0, tail = 0x0, depth = 0
[    0.000000] IPIQ[1]: head = 0x0, tail = 0x0, depth = 0
[    0.000000] IPIQ[2]: head = 0x0, tail = 0x0, depth = 0
[    0.000000] IPIQ[3]: head = 0x0, tail = 0x0, depth = 0
[    0.000000] IPIQ[4]: head = 0x0, tail = 0x0, depth = 0
[    0.000000] IPIQ[5]: head = 0x0, tail = 0x0, depth = 0
[    0.000000] IPIQ[6]: head = 0x0, tail = 0x0, depth = 0
[    0.000000] IPIQ[7]: head = 0x0, tail = 0x0, depth = 0
[    0.000000] 0 Recoveries of "stolen" FPU
[    0.000000] ===========================
[    0.000000] In platform cic dispatch cic_mask=0x22000 stat=0x2402000f
pend=0x20000
[    0.010000] === MIPS MT State Dump ===
[    0.010000] -- Global State --
[    0.010000]    MVPControl Passed: 00000000
[    0.010000]    MVPControl Read: 00000000
[    0.010000]    MVPConf0 : a8008406
[    0.010000] -- per-VPE State --
[    0.010000]   VPE 0
[    0.010000]    VPEControl : 00000000
[    0.010000]    VPEConf0 : 800f0003
[    0.010000]    VPE0.Status : 18004000
[    0.010000]    VPE0.EPC : 8010d900 mips_mt_regdump+0x390/0x3c4
[    0.010000]    VPE0.Cause : 40804000
[    0.010000]    VPE0.Config7 : 00010000
[    0.010000]   VPE 1
[    0.010000]    VPEControl : 00060000
[    0.010000]    VPEConf0 : 800f0000
[    0.010000]    VPE1.Status : 00408305
[    0.010000]    VPE1.EPC : 80100380 name_to_dev_t+0x50/0x430
[    0.010000]    VPE1.Cause : 50000200
[    0.010000]    VPE1.Config7 : 00010000
[    0.010000] -- per-TC State --
[    0.010000]   TC 0 (current TC with VPE EPC above)
[    0.010000]    TCStatus : 00000000
[    0.010000]    TCBind : 00000000
[    0.010000]    TCRestart : 803f791c printk+0xc/0x30
[    0.010000]    TCHalt : 00000000
[    0.010000]    TCContext : 00000000
[    0.010000]   TC 1
[    0.010000]    TCStatus : 00000000
[    0.010000]    TCBind : 00200001
[    0.010000]    TCRestart : 8f800020 0x8f800020
[    0.010000]    TCHalt : 00000001
[    0.010000]    TCContext : 00140000
[    0.010000]   TC 2
[    0.010000]    TCStatus : 00000000
[    0.010000]    TCBind : 00400001
[    0.010000]    TCRestart : 8f800020 0x8f800020
[    0.010000]    TCHalt : 00000001
[    0.010000]    TCContext : 00280000
[    0.010000]   TC 3
[    0.010000]    TCStatus : 00000000
[    0.010000]    TCBind : 00600001
[    0.010000]    TCRestart : 8f800020 0x8f800020
[    0.010000]    TCHalt : 00000001
[    0.010000]    TCContext : 003c0000
[    0.010000]   TC 4
[    0.010000]    TCStatus : 00000000
[    0.010000]    TCBind : 00800001
[    0.010000]    TCRestart : 80100380 name_to_dev_t+0x50/0x430
[    0.010000]    TCHalt : 00000001
[    0.010000]    TCContext : 00500000
[    0.010000]   TC 5
[    0.010000]    TCStatus : 00000000
[    0.010000]    TCBind : 00a00001
[    0.010000]    TCRestart : 80100380 name_to_dev_t+0x50/0x430
[    0.010000]    TCHalt : 00000001
[    0.010000]    TCContext : 00640000
[    0.010000]   TC 6
[    0.010000]    TCStatus : 00000000
[    0.010000]    TCBind : 00c00001
[    0.010000]    TCRestart : 80268e00 aes_encrypt+0x10e4/0x164c
[    0.010000]    TCHalt : 00000001
[    0.010000]    TCContext : 00780000
[    0.010000] Counter Interrupts taken per CPU (TC)
[    0.010000] 0: 0
[    0.010000] 1: 0
[    0.010000] 2: 0
[    0.010000] 3: 0
[    0.010000] 4: 0
[    0.010000] 5: 0
[    0.010000] 6: 0
[    0.010000] 7: 0
[    0.010000] Self-IPI invocations:
[    0.010000] 0: 0
[    0.010000] 1: 0
[    0.010000] 2: 0
[    0.010000] 3: 0
[    0.010000] 4: 0
[    0.010000] 5: 0
[    0.010000] 6: 0
[    0.010000] 7: 0
[    0.010000] IPIQ[0]: head = 0x0, tail = 0x0, depth = 0
[    0.010000] IPIQ[1]: head = 0x0, tail = 0x0, depth = 0
[    0.010000] IPIQ[2]: head = 0x0, tail = 0x0, depth = 0
[    0.010000] IPIQ[3]: head = 0x0, tail = 0x0, depth = 0
[    0.010000] IPIQ[4]: head = 0x0, tail = 0x0, depth = 0
[    0.010000] IPIQ[5]: head = 0x0, tail = 0x0, depth = 0
[    0.010000] IPIQ[6]: head = 0x0, tail = 0x0, depth = 0
[    0.010000] IPIQ[7]: head = 0x0, tail = 0x0, depth = 0
[    0.010000] 0 Recoveries of "stolen" FPU
[    0.010000] ===========================

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: SMTC support status in latest git head
  2010-12-14 21:27 ` STUART VENTERS
  (?)
@ 2010-12-14 23:01 ` Kevin D. Kissell
  -1 siblings, 0 replies; 68+ messages in thread
From: Kevin D. Kissell @ 2010-12-14 23:01 UTC (permalink / raw)
  To: STUART VENTERS; +Cc: linux-mips

On 12/14/10 13:27, STUART VENTERS wrote:
> Kevin,
>
> It turns out we are also looking at Linux SMTC support for 34kc.
>     (For a different pmc part.)
>
> You said you remembered seeing it work on at least one version of the kernel.
>
> Could you help us find that version by bracketing the search a bit?
>
> Maybe a date and/or version range to look in.
>

There were early working versions without dyntick or interrupt affinity
in the 2.6.23/24 timeframe, but as per the commit lots in linux-mips.org,
I finally got the dyntick stuff working in September 2008, with the commits
propagating to various git branches over the following two months.  I 
can see
that the new code was in 2.6.28.1 but not in 2.6.26.8 At some point 
subsequent
to that, I'm pretty sure I checked out the then-latest stable version of 
the Malta
branch and got a functional build.

The last time I regression checked it was in March  of 2009 at which point
some infrastructure changes had broken things, which I fixed in patches
posted on March 31, 2009, one which addressed a change in the semantics
of CP0 access macros, and one of which fixed  a name conflict.
Those were committed on 3/31 and 5/14/2009, depending on the branch
you look at.  With those patches and only those patches on what was then
the latest stable (Malta?) branch at LMO, it seemed to run OK
to the limited degree I was able to have it tested.   Someone else found a
hole in smtc_distribute_timer() in November of 2009, and I worked with
the discoverer on a very small patch committed November 13, 2009,
but I never actually ran the code to test (then again, I'd never been able
to drive a system into the failure it could cause).

Sorry to be a little vague, but I no longer have my MIPS Linux development
build or test systems, so I'm reduced to googling and searching LMO, just
like anyone else.

             Regards,

             Kevin K.

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: SMTC support status in latest git head
@ 2010-12-14 21:27 ` STUART VENTERS
  0 siblings, 0 replies; 68+ messages in thread
From: STUART VENTERS @ 2010-12-14 21:27 UTC (permalink / raw)
  To: kevink; +Cc: linux-mips

Kevin,

It turns out we are also looking at Linux SMTC support for 34kc.
   (For a different pmc part.)

You said you remembered seeing it work on at least one version of the kernel.

Could you help us find that version by bracketing the search a bit?

Maybe a date and/or version range to look in.


Regards,

Stuart Venters
Adtran

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: SMTC support status in latest git head
@ 2010-12-14 21:27 ` STUART VENTERS
  0 siblings, 0 replies; 68+ messages in thread
From: STUART VENTERS @ 2010-12-14 21:27 UTC (permalink / raw)
  To: kevink; +Cc: linux-mips

Kevin,

It turns out we are also looking at Linux SMTC support for 34kc.
   (For a different pmc part.)

You said you remembered seeing it work on at least one version of the kernel.

Could you help us find that version by bracketing the search a bit?

Maybe a date and/or version range to look in.


Regards,

Stuart Venters
Adtran

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: SMTC support status in latest git head.
  2010-12-14 18:32     ` Kevin D. Kissell
@ 2010-12-14 18:50       ` Ralf Baechle
  2010-12-15 19:18       ` Anoop P A
  1 sibling, 0 replies; 68+ messages in thread
From: Ralf Baechle @ 2010-12-14 18:50 UTC (permalink / raw)
  To: Kevin D. Kissell; +Cc: Anoop P.A., linux-mips

On Tue, Dec 14, 2010 at 10:32:57AM -0800, Kevin D. Kissell wrote:

> I am no longer associated with MIPS Technologies and no longer have
> access to my email archives from that period.  If I did, I could tell you
> which LMO kernel version(s) had SMTC working "out of the box".  There
> definitely was at least one, and I commented on it in an email.  You
> might be able to find it in the LMO email archives, but it's possible that
> I only sent it to a MIPS internal mailing list.
> 
> There was also a message I wrote that I had *thought* had gone to
> the LMO mailing list, but may have only been sent to a group of internal
> MIPS and customer engineers, in which I described the recommended
> procedure for debugging exactly this canonical problem with porting
> SMTC.

git bisect to the rescue :)  It's time consuming with a slow machine but
perfectly doable.  Go back, find some antique kernel version with
functioning SMTC and take it from there.

  Ralf

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: SMTC support status in latest git head.
  2010-12-14 15:25     ` Anoop P.A.
  (?)
@ 2010-12-14 18:32     ` Kevin D. Kissell
  2010-12-14 18:50       ` Ralf Baechle
  2010-12-15 19:18       ` Anoop P A
  -1 siblings, 2 replies; 68+ messages in thread
From: Kevin D. Kissell @ 2010-12-14 18:32 UTC (permalink / raw)
  To: Anoop P.A.; +Cc: linux-mips

Between your mailer and mine (Thunderbird 3.1 on Ubuntu), the quoting
has become something of a dogs breakfast, so let me just lay things out
here as best I can.

I can't comment on your tweak to 2.6.24.7 without seeing it as a patch
diff.

I am no longer associated with MIPS Technologies and no longer have
access to my email archives from that period.  If I did, I could tell you
which LMO kernel version(s) had SMTC working "out of the box".  There
definitely was at least one, and I commented on it in an email.  You
might be able to find it in the LMO email archives, but it's possible that
I only sent it to a MIPS internal mailing list.

There was also a message I wrote that I had *thought* had gone to
the LMO mailing list, but may have only been sent to a group of internal
MIPS and customer engineers, in which I described the recommended
procedure for debugging exactly this canonical problem with porting
SMTC.

The recommended procedure was, and remains, to isolate clock
propagation problems by using command line options "maxtcs="
and "maxvpes=".

First, boot your SMTC kernel with maxtcs=1 and maxvpes=1,
a virtual uniprocessor.  If that doesn't run, you've got some fundamental
problem with support for your platform, or someone has really fundamentally
broken the SMTC build somewhere.  Next, try booting with maxtcs=2
and maxvpes=1, then with no constraint on maxtcs and maxvpes=1.
If those fail, your problem is probably in the interrupt mask
management algorithms I described.

On the other hand, if you boot with maxtcs=2 and maxvpes=2,
there will be only one TC per VPE and far less vulnerability to interrupt
mask lockup, but you need to have cross-VPE IPI interrupts working.
The preferred method of doing cross-VPE IPIs would be to use a physical
interrupt  input that's instantiated per-VPE and manipulable by software.
Malta didn't have one, so there's the historical hack of using
MIPS MT instructions to freeze the other VPE and set up a
software interrupt using MTTR to the remote Cause register.
The PMC-Sierra platforms did, if I recall correctly, have some kind
of register that one could write to cause a real cross-VPE hardware
interrupt, but I don't recall whether it got used in the SMTC port.

Your dump below looks as if it comes from 2 TCs running on
2 VPEs, and that the interrupt mask issues I alluded to earlier
are neither relevant nor manifest.  It looks instead as if the
initialization of "CPU 1" (VPE1/TC1) may not have been done
properly.  Under normal operation, it would be pretty rare to
catch TC 1 in the exception vector dispatch code, so the first
hypothesis that comes to mind is that something isn't right in
the vector/handler setup, and TC 1 is stuck in an infinite exception
loop, unable to handshake with TC 0 and thus locking up the
system.  But that's just my best guess based on limited data.

             Regards,

             Kevin K.

On 12/14/10 07:25, Anoop P.A. wrote:
>> it ended up being cleaner and more efficient to have *some* hooks in
>> platform specific timer code.  It was there for Malta in the
> kernel.org
>> mainline once upon a time, and I *thought* we'd propagated working
> code
>> for the initial PMC-Sierra 34K-based SoC's at least as far as
> [Anoop P.A.]
> I was able to boot 2.6.24-7 git sources with a change in cevt-r4k.c (
> c0_compare_int_pending changed as following "return (read_c0_cause()>>
> cp0_compare_irq_shift)&  (1ul<<  CAUSEB_IP)"
>
>> linux-mips.org, but the source tree has been considerably reorganized
> -
>> there was a time when some of the hooks were under
>> arch/mips/mips-boards/generic, which no longer exists - and I'm not
> sure
>> where to point you.  Git and grep are your friends.
> [Anoop P.A.]malta code has been moved to arch/mips/mti-malta/
> Can you recollect the version of l-m-o kernel with a known working SMTC
> support ?.
>
>> The first order of business is to break into that hung timer
> calibration
>> loop and dump the CP0 registers for the VPE and the TCs, in particular
>> checking the interrupt enable mask in Status against the pending
>> interrupts in the Cause register.   If you're seeing the timer
>> interrupt's bit set in Cause, but clear in Status, you need to fix the
>> SMTC interrupt mask hook for your platform timer.
> [Anoop P.A.]
> I tried dumping registers from calibration while loop.
> It looks like the timer interrupt bit stay high on both cause and status
> register ( in my case timer interrupt is connected to Cascaded CIC
> interrupt which is connected to irq -6 ( C_IRQ4)). Detailed log pasted
> below
>
>> check to see if you're building for "tickless" operation.  Tickless
> ends
>> up being really important for SMTC, and I did get it working properly
>> back in 2008, but I the SMTC-specific cevt-smtc.c code uses common
>> functions in cevt-r4k.c, and I've seen some patches to cevt-r4k.c
> going
>> by that I rather doubt were ever tested against an SMTC
> build/platform.
>> There might have been breakage there, and configuring to use a fixed
>> interval timer (say, 100Hz) would be a way to test that hypothesis.
> [Anoop P.A.] I have tried both tickles and fixed interval timer.
>
>>               Regards,
>>
>>               Kevin K.
>
> [Anoop P.A.] Thanks much for your and Ralf's detailed response.
> [Anoop P.A.]
> [    0.000000] Writing ErrCtl register=00000000
> [    0.000000] Readback ErrCtl register=00000000
> [    0.000000] Memory: 254384k/257912k available (3062k kernel code,
> 3528k reserved, 648k data, 200k init, 0k highmem)
> [    0.000000] Preemptable hierarchical RCU implementation.
> [    0.000000] NR_IRQS:128
> [    0.000000] console [ttyS0] enabled
> [    0.000000] Clock rate set to 600000000
> [    0.000000] Calibrating delay loop... === MIPS MT State Dump ===
> [    0.000000] -- Global State --
> [    0.000000]    MVPControl Passed: 00000000
> [    0.000000]    MVPControl Read: 00000000
> [    0.000000]    MVPConf0 : a8008406
> [    0.000000] -- per-VPE State --
> [    0.000000]   VPE 0
> [    0.000000]    VPEControl : 00000000
> [    0.000000]    VPEConf0 : 800f0003
> [    0.000000]    VPE0.Status : 11004001
> [    0.000000]    VPE0.EPC : 80100000 _stext+0x0/0x10
> [    0.000000]    VPE0.Cause : 40804000
> [    0.000000]    VPE0.Config7 : 00010000
> [    0.000000]   VPE 1
> [    0.000000]    VPEControl : 00060000
> [    0.000000]    VPEConf0 : 800f0000
> [    0.000000]    VPE1.Status : 00408305
> [    0.000000]    VPE1.EPC : 801024e0 except_vec_vi+0x0/0x84
> [    0.000000]    VPE1.Cause : 40000200
> [    0.000000]    VPE1.Config7 : 00010000
> [    0.000000] -- per-TC State --
> [    0.000000]   TC 0 (current TC with VPE EPC above)
> [    0.000000]    TCStatus : 00000000
> [    0.000000]    TCBind : 00000000
> [    0.000000]    TCRestart : 8010d860 mips_mt_regdump+0x2f0/0x3c4
> [    0.000000]    TCHalt : 00000000
> [    0.000000]    TCContext : 00000000
> [    0.000000]   TC 1
> [    0.000000]    TCStatus : 00000000
> [    0.000000]    TCBind : 00200001
> [    0.000000]    TCRestart : 80104b64 copy_thread+0x2ac/0x2b4
> [    0.000000]    TCHalt : 00000001
> [    0.000000]    TCContext : 00180000
> [    0.000000]   TC 2
> [    0.000000]    TCStatus : 00000000
> [    0.000000]    TCBind : 00400001
> [    0.000000]    TCRestart : 7ffffffc 0x7ffffffc
> [    0.000000]    TCHalt : 00000001
> [    0.000000]    TCContext : 00300000
> [    0.000000]   TC 3
> [    0.000000]    TCStatus : 00000000
> [    0.000000]    TCBind : 00600001
> [    0.000000]    TCRestart : fff7ffae 0xfff7ffae
> [    0.000000]    TCHalt : 00000001
> [    0.000000]    TCContext : 00480000
> [    0.000000]   TC 4
> [    0.000000]    TCStatus : 00000000
> [    0.000000]    TCBind : 00800001
> [    0.000000]    TCRestart : f3fff7fe 0xf3fff7fe
> [    0.000000]    TCHalt : 00000001
> [    0.000000]    TCContext : 00600000
> [    0.000000]   TC 5
> [    0.000000]    TCStatus : 00000000
> [    0.000000]    TCBind : 00a00001
> [    0.000000]    TCRestart : 7ffffbfe 0x7ffffbfe
> [    0.000000]    TCHalt : 00000001
> [    0.000000]    TCContext : 00780000
> [    0.000000]   TC 6
> [    0.000000]    TCStatus : 00000000
> [    0.000000]    TCBind : 00c00001
> [    0.000000]    TCRestart : ffff7ffe 0xffff7ffe
> [    0.000000]    TCHalt : 00000001
> [    0.000000]    TCContext : 00900000
> [    0.000000] Counter Interrupts taken per CPU (TC)
> [    0.000000] 0: 0
> [    0.000000] 1: 0
> [    0.000000] 2: 0
> [    0.000000] 3: 0
> [    0.000000] 4: 0
> [    0.000000] 5: 0
> [    0.000000] 6: 0
> [    0.000000] 7: 0
> [    0.000000] Self-IPI invocations:
> [    0.000000] 0: 0
> [    0.000000] 1: 0
> [    0.000000] 2: 0
> [    0.000000] 3: 0
> [    0.000000] 4: 0
> [    0.000000] 5: 0
> [    0.000000] 6: 0
> [    0.000000] 7: 0
> [    0.000000] IPIQ[0]: head = 0x0, tail = 0x0, depth = 0
> [    0.000000] IPIQ[1]: head = 0x0, tail = 0x0, depth = 0
> [    0.000000] IPIQ[2]: head = 0x0, tail = 0x0, depth = 0
> [    0.000000] IPIQ[3]: head = 0x0, tail = 0x0, depth = 0
> [    0.000000] IPIQ[4]: head = 0x0, tail = 0x0, depth = 0
> [    0.000000] IPIQ[5]: head = 0x0, tail = 0x0, depth = 0
> [    0.000000] IPIQ[6]: head = 0x0, tail = 0x0, depth = 0
> [    0.000000] IPIQ[7]: head = 0x0, tail = 0x0, depth = 0
> [    0.000000] 0 Recoveries of "stolen" FPU
> [    0.000000] ===========================
> [    0.000000] In platform cic dispatch cic_mask=0x22000 stat=0x2402000f
> pend=0x20000
> [    0.010000] === MIPS MT State Dump ===
> [    0.010000] -- Global State --
> [    0.010000]    MVPControl Passed: 00000000
> [    0.010000]    MVPControl Read: 00000000
> [    0.010000]    MVPConf0 : a8008406
> [    0.010000] -- per-VPE State --
> [    0.010000]   VPE 0
> [    0.010000]    VPEControl : 00000000
> [    0.010000]    VPEConf0 : 800f0003
> [    0.010000]    VPE0.Status : 18004000
> [    0.010000]    VPE0.EPC : 8010d900 mips_mt_regdump+0x390/0x3c4
> [    0.010000]    VPE0.Cause : 40804000
> [    0.010000]    VPE0.Config7 : 00010000
> [    0.010000]   VPE 1
> [    0.010000]    VPEControl : 00060000
> [    0.010000]    VPEConf0 : 800f0000
> [    0.010000]    VPE1.Status : 00408305
> [    0.010000]    VPE1.EPC : 801024e0 except_vec_vi+0x0/0x84
> [    0.010000]    VPE1.Cause : 40000200
> [    0.010000]    VPE1.Config7 : 00010000
> [    0.010000] -- per-TC State --
> [    0.010000]   TC 0 (current TC with VPE EPC above)
> [    0.010000]    TCStatus : 00000000
> [    0.010000]    TCBind : 00000000
> [    0.010000]    TCRestart : 803f791c printk+0xc/0x30
> [    0.010000]    TCHalt : 00000000
> [    0.010000]    TCContext : 00000000
> [    0.010000]   TC 1
> [    0.010000]    TCStatus : 00000000
> [    0.010000]    TCBind : 00200001
> [    0.010000]    TCRestart : 80104b64 copy_thread+0x2ac/0x2b4
> [    0.010000]    TCHalt : 00000001
> [    0.010000]    TCContext : 00180000
> [    0.010000]   TC 2
> [    0.010000]    TCStatus : 00000000
> [    0.010000]    TCBind : 00400001
> [    0.010000]    TCRestart : 7ffffffc 0x7ffffffc
> [    0.010000]    TCHalt : 00000001
> [    0.010000]    TCContext : 00300000
> [    0.010000]   TC 3
> [    0.010000]    TCStatus : 00000000
> [    0.010000]    TCBind : 00600001
> [    0.010000]    TCRestart : fff7ffae 0xfff7ffae
> [    0.010000]    TCHalt : 00000001
> [    0.010000]    TCContext : 00480000
> [    0.010000]   TC 4
> [    0.010000]    TCStatus : 00000000
> [    0.010000]    TCBind : 00800001
> [    0.010000]    TCRestart : f3fff7fe 0xf3fff7fe
> [    0.010000]    TCHalt : 00000001
> [    0.010000]    TCContext : 00600000
> [    0.010000]   TC 5
> [    0.010000]    TCStatus : 00000000
> [    0.010000]    TCBind : 00a00001
> [    0.010000]    TCRestart : 7ffffbfe 0x7ffffbfe
> [    0.010000]    TCHalt : 00000001
> [    0.010000]    TCContext : 00780000
> [    0.010000]   TC 6
> [    0.010000]    TCStatus : 00000000
> [    0.010000]    TCBind : 00c00001
> [    0.010000]    TCRestart : ffff7ffe 0xffff7ffe
> [    0.010000]    TCHalt : 00000001
> [    0.010000]    TCContext : 00900000
> [    0.010000] Counter Interrupts taken per CPU (TC)
> [    0.010000] 0: 0
> [    0.010000] 1: 0
> [    0.010000] 2: 0
> [    0.010000] 3: 0
> [    0.010000] 4: 0
> [    0.010000] 5: 0
> [    0.010000] 6: 0
> [    0.010000] 7: 0
> [    0.010000] Self-IPI invocations:
> [    0.010000] 0: 0
> [    0.010000] 1: 0
> [    0.010000] 2: 0
> [    0.010000] 3: 0
> [    0.010000] 4: 0
> [    0.010000] 5: 0
> [    0.010000] 6: 0
> [    0.010000] 7: 0
> [    0.010000] IPIQ[0]: head = 0x0, tail = 0x0, depth = 0
> [    0.010000] IPIQ[1]: head = 0x0, tail = 0x0, depth = 0
> [    0.010000] IPIQ[2]: head = 0x0, tail = 0x0, depth = 0
> [    0.010000] IPIQ[3]: head = 0x0, tail = 0x0, depth = 0
> [    0.010000] IPIQ[4]: head = 0x0, tail = 0x0, depth = 0
> [    0.010000] IPIQ[5]: head = 0x0, tail = 0x0, depth = 0
> [    0.010000] IPIQ[6]: head = 0x0, tail = 0x0, depth = 0
> [    0.010000] IPIQ[7]: head = 0x0, tail = 0x0, depth = 0
> [    0.010000] 0 Recoveries of "stolen" FPU
> [    0.010000] ===========================
> [    0.010000] === MIPS MT State Dump ===
> [    0.010000] -- Global State --
> [    0.010000]    MVPControl Passed: 00000000
> [    0.010000]    MVPControl Read: 00000000
> [    0.010000]    MVPConf0 : a8008406
> [    0.010000] -- per-VPE State --
> [    0.010000]   VPE 0
> [    0.010000]    VPEControl : 00000000
> [    0.010000]    VPEConf0 : 800f0003
> [    0.010000]    VPE0.Status : 18004000
> [    0.010000]    VPE0.EPC : 8010d900 mips_mt_regdump+0x390/0x3c4
> [    0.010000]    VPE0.Cause : 40804000
> [    0.010000]    VPE0.Config7 : 00010000
> [    0.010000]   VPE 1
> [    0.010000]    VPEControl : 00060000
> [    0.010000]    VPEConf0 : 800f0000
> [    0.010000]    VPE1.Status : 00408305
> [    0.010000]    VPE1.EPC : 801024e0 except_vec_vi+0x0/0x84
> [    0.010000]    VPE1.Cause : 40000200
> [    0.010000]    VPE1.Config7 : 00010000
> [    0.010000] -- per-TC State --
> [    0.010000]   TC 0 (current TC with VPE EPC above)
> [    0.010000]    TCStatus : 00000000
> [    0.010000]    TCBind : 00000000
> [    0.010000]    TCRestart : 803f791c printk+0xc/0x30
> [    0.010000]    TCHalt : 00000000
> [    0.010000]    TCContext : 00000000
> [    0.010000]   TC 1
> [    0.010000]    TCStatus : 00000000
> [    0.010000]    TCBind : 00200001
> [    0.010000]    TCRestart : 80104b64 copy_thread+0x2ac/0x2b4
> [    0.010000]    TCHalt : 00000001
> [    0.010000]    TCContext : 00180000
> [    0.010000]   TC 2
> [    0.010000]    TCStatus : 00000000
> [    0.010000]    TCBind : 00400001
> [    0.010000]    TCRestart : 7ffffffc 0x7ffffffc
> [    0.010000]    TCHalt : 00000001
> [    0.010000]    TCContext : 00300000
> [    0.010000]   TC 3
> [    0.010000]    TCStatus : 00000000
> [    0.010000]    TCBind : 00600001
> [    0.010000]    TCRestart : fff7ffae 0xfff7ffae
> [    0.010000]    TCHalt : 00000001
> [    0.010000]    TCContext : 00480000
> [    0.010000]   TC 4
> [    0.010000]    TCStatus : 00000000
> [    0.010000]    TCBind : 00800001
> [    0.010000]    TCRestart : f3fff7fe 0xf3fff7fe
> [    0.010000]    TCHalt : 00000001
> [    0.010000]    TCContext : 00600000
> [    0.010000]   TC 5
> [    0.010000]    TCStatus : 00000000
> [    0.010000]    TCBind : 00a00001
> [    0.010000]    TCRestart : 7ffffbfe 0x7ffffbfe
> [    0.010000]    TCHalt : 00000001
> [    0.010000]    TCContext : 00780000
> [    0.010000]   TC 6
> [    0.010000]    TCStatus : 00000000
> [    0.010000]    TCBind : 00c00001
> [    0.010000]    TCRestart : ffff7ffe 0xffff7ffe
> [    0.010000]    TCHalt : 00000001
> [    0.010000]    TCContext : 00900000
> [    0.010000] Counter Interrupts taken per CPU (TC)
> [    0.010000] 0: 0
> [    0.010000] 1: 0
> [    0.010000] 2: 0
> [    0.010000] 3: 0
> [    0.010000] 4: 0
> [    0.010000] 5: 0
> [    0.010000] 6: 0
> [    0.010000] 7: 0
> [    0.010000] Self-IPI invocations:
> [    0.010000] 0: 0
> [    0.010000] 1: 0
> [    0.010000] 2: 0
> [    0.010000] 3: 0
> [    0.010000] 4: 0
> [    0.010000] 5: 0
> [    0.010000] 6: 0
> [    0.010000] 7: 0
> [    0.010000] IPIQ[0]: head = 0x0, tail = 0x0, depth = 0
> [    0.010000] IPIQ[1]: head = 0x0, tail = 0x0, depth = 0
> [    0.010000] IPIQ[2]: head = 0x0, tail = 0x0, depth = 0
> [    0.010000] IPIQ[3]: head = 0x0, tail = 0x0, depth = 0
> [    0.010000] IPIQ[4]: head = 0x0, tail = 0x0, depth = 0
> [    0.010000] IPIQ[5]: head = 0x0, tail = 0x0, depth = 0
> [    0.010000] IPIQ[6]: head = 0x0, tail = 0x0, depth = 0
> [    0.010000] IPIQ[7]: head = 0x0, tail = 0x0, depth = 0
> [    0.010000] 0 Recoveries of "stolen" FPU
> [    0.010000] ===========================
> [    0.010000] === MIPS MT State Dump ===
> [    0.010000] -- Global State --
> [    0.010000]    MVPControl Passed: 00000000
> [    0.010000]    MVPControl Read: 00000000
> [    0.010000]    MVPConf0 : a8008406
> [    0.010000] -- per-VPE State --
> [    0.010000]   VPE 0
> [    0.010000]    VPEControl : 00000000
> [    0.010000]    VPEConf0 : 800f0003
> [    0.010000]    VPE0.Status : 18004000
> [    0.010000]    VPE0.EPC : 8010d900 mips_mt_regdump+0x390/0x3c4
> [    0.010000]    VPE0.Cause : 40804000
> [    0.010000]    VPE0.Config7 : 00010000
> [    0.010000]   VPE 1
> [    0.010000]    VPEControl : 00060000
> [    0.010000]    VPEConf0 : 800f0000
> [    0.010000]    VPE1.Status : 00408305
> [    0.010000]    VPE1.EPC : 801024e0 except_vec_vi+0x0/0x84
> [    0.010000]    VPE1.Cause : 40000200
> [    0.010000]    VPE1.Config7 : 00010000
> [    0.010000] -- per-TC State --
> [    0.010000]   TC 0 (current TC with VPE EPC above)
> [    0.010000]    TCStatus : 00000000
> [    0.010000]    TCBind : 00000000
> [    0.010000]    TCRestart : 803f791c printk+0xc/0x30
> [    0.010000]    TCHalt : 00000000
> [    0.010000]    TCContext : 00000000
> [    0.010000]   TC 1
> [    0.010000]    TCStatus : 00000000
> [    0.010000]    TCBind : 00200001
> [    0.010000]    TCRestart : 80104b64 copy_thread+0x2ac/0x2b4
> [    0.010000]    TCHalt : 00000001
> [    0.010000]    TCContext : 00180000
> [    0.010000]   TC 2
> [    0.010000]    TCStatus : 00000000
> [    0.010000]    TCBind : 00400001
> [    0.010000]    TCRestart : 7ffffffc 0x7ffffffc
> [    0.010000]    TCHalt : 00000001
> [    0.010000]    TCContext : 00300000
> [    0.010000]   TC 3
> [    0.010000]    TCStatus : 00000000
> [    0.010000]    TCBind : 00600001
> [    0.010000]    TCRestart : fff7ffae 0xfff7ffae
> [    0.010000]    TCHalt : 00000001
> [    0.010000]    TCContext : 00480000
> [    0.010000]   TC 4
> [    0.010000]    TCStatus : 00000000
> [    0.010000]    TCBind : 00800001
> [    0.010000]    TCRestart : f3fff7fe 0xf3fff7fe
> [    0.010000]    TCHalt : 00000001
> [    0.010000]    TCContext : 00600000
> [    0.010000]   TC 5
> [    0.010000]    TCStatus : 00000000
> [    0.010000]    TCBind : 00a00001
> [    0.010000]    TCRestart : 7ffffbfe 0x7ffffbfe
> [    0.010000]    TCHalt : 00000001
> [    0.010000]    TCContext : 00780000
> [    0.010000]   TC 6
> [    0.010000]    TCStatus : 00000000
> [    0.010000]    TCBind : 00c00001
> [    0.010000]    TCRestart : ffff7ffe 0xffff7ffe
> [    0.010000]    TCHalt : 00000001
> [    0.010000]    TCContext : 00900000
> [    0.010000] Counter Interrupts taken per CPU (TC)
> [    0.010000] 0: 0
> [    0.010000] 1: 0
> [    0.010000] 2: 0
> [    0.010000] 3: 0
> [    0.010000] 4: 0
> [    0.010000] 5: 0
> [    0.010000] 6: 0
> [    0.010000] 7: 0
> [    0.010000] Self-IPI invocations:
> [    0.010000] 0: 0
> [    0.010000] 1: 0
> [    0.010000] 2: 0
> [    0.010000] 3: 0
> [    0.010000] 4: 0
> [    0.010000] 5: 0
> [    0.010000] 6: 0
> [    0.010000] 7: 0
> [    0.010000] IPIQ[0]: head = 0x0, tail = 0x0, depth = 0
> [    0.010000] IPIQ[1]: head = 0x0, tail = 0x0, depth = 0
> [    0.010000] IPIQ[2]: head = 0x0, tail = 0x0, depth = 0
> [    0.010000] IPIQ[3]: head = 0x0, tail = 0x0, depth = 0
> [    0.010000] IPIQ[4]: head = 0x0, tail = 0x0, depth = 0
> [    0.010000] IPIQ[5]: head = 0x0, tail = 0x0, depth = 0
> [    0.010000] IPIQ[6]: head = 0x0, tail = 0x0, depth = 0
> [    0.010000] IPIQ[7]: head = 0x0, tail = 0x0, depth = 0
> [    0.010000] 0 Recoveries of "stolen" FPU
>
>
>

^ permalink raw reply	[flat|nested] 68+ messages in thread

* RE: SMTC support status in latest git head.
@ 2010-12-14 15:25     ` Anoop P.A.
  0 siblings, 0 replies; 68+ messages in thread
From: Anoop P.A. @ 2010-12-14 15:25 UTC (permalink / raw)
  To: Kevin D. Kissell; +Cc: linux-mips

> it ended up being cleaner and more efficient to have *some* hooks in
> platform specific timer code.  It was there for Malta in the
kernel.org
> mainline once upon a time, and I *thought* we'd propagated working
code
> for the initial PMC-Sierra 34K-based SoC's at least as far as
[Anoop P.A.] 
I was able to boot 2.6.24-7 git sources with a change in cevt-r4k.c (
c0_compare_int_pending changed as following "return (read_c0_cause() >>
cp0_compare_irq_shift) & (1ul << CAUSEB_IP)"
 
> linux-mips.org, but the source tree has been considerably reorganized
-
> there was a time when some of the hooks were under
> arch/mips/mips-boards/generic, which no longer exists - and I'm not
sure
> where to point you.  Git and grep are your friends.
[Anoop P.A.]malta code has been moved to arch/mips/mti-malta/
Can you recollect the version of l-m-o kernel with a known working SMTC
support ?.

> 
> The first order of business is to break into that hung timer
calibration
> loop and dump the CP0 registers for the VPE and the TCs, in particular
> checking the interrupt enable mask in Status against the pending
> interrupts in the Cause register.   If you're seeing the timer
> interrupt's bit set in Cause, but clear in Status, you need to fix the
> SMTC interrupt mask hook for your platform timer.  
[Anoop P.A.] 
I tried dumping registers from calibration while loop.
It looks like the timer interrupt bit stay high on both cause and status
register ( in my case timer interrupt is connected to Cascaded CIC
interrupt which is connected to irq -6 ( C_IRQ4)). Detailed log pasted
below

> check to see if you're building for "tickless" operation.  Tickless
ends
> up being really important for SMTC, and I did get it working properly
> back in 2008, but I the SMTC-specific cevt-smtc.c code uses common
> functions in cevt-r4k.c, and I've seen some patches to cevt-r4k.c
going
> by that I rather doubt were ever tested against an SMTC
build/platform.
> There might have been breakage there, and configuring to use a fixed
> interval timer (say, 100Hz) would be a way to test that hypothesis.

[Anoop P.A.] I have tried both tickles and fixed interval timer.

> 
>              Regards,
> 
>              Kevin K.


[Anoop P.A.] Thanks much for your and Ralf's detailed response. 
> 
[Anoop P.A.] 
[    0.000000] Writing ErrCtl register=00000000
[    0.000000] Readback ErrCtl register=00000000
[    0.000000] Memory: 254384k/257912k available (3062k kernel code,
3528k reserved, 648k data, 200k init, 0k highmem)
[    0.000000] Preemptable hierarchical RCU implementation.
[    0.000000] NR_IRQS:128
[    0.000000] console [ttyS0] enabled
[    0.000000] Clock rate set to 600000000
[    0.000000] Calibrating delay loop... === MIPS MT State Dump ===
[    0.000000] -- Global State --
[    0.000000]    MVPControl Passed: 00000000
[    0.000000]    MVPControl Read: 00000000
[    0.000000]    MVPConf0 : a8008406
[    0.000000] -- per-VPE State --
[    0.000000]   VPE 0
[    0.000000]    VPEControl : 00000000
[    0.000000]    VPEConf0 : 800f0003
[    0.000000]    VPE0.Status : 11004001
[    0.000000]    VPE0.EPC : 80100000 _stext+0x0/0x10
[    0.000000]    VPE0.Cause : 40804000
[    0.000000]    VPE0.Config7 : 00010000
[    0.000000]   VPE 1
[    0.000000]    VPEControl : 00060000
[    0.000000]    VPEConf0 : 800f0000
[    0.000000]    VPE1.Status : 00408305
[    0.000000]    VPE1.EPC : 801024e0 except_vec_vi+0x0/0x84
[    0.000000]    VPE1.Cause : 40000200
[    0.000000]    VPE1.Config7 : 00010000
[    0.000000] -- per-TC State --
[    0.000000]   TC 0 (current TC with VPE EPC above)
[    0.000000]    TCStatus : 00000000
[    0.000000]    TCBind : 00000000
[    0.000000]    TCRestart : 8010d860 mips_mt_regdump+0x2f0/0x3c4
[    0.000000]    TCHalt : 00000000
[    0.000000]    TCContext : 00000000
[    0.000000]   TC 1
[    0.000000]    TCStatus : 00000000
[    0.000000]    TCBind : 00200001
[    0.000000]    TCRestart : 80104b64 copy_thread+0x2ac/0x2b4
[    0.000000]    TCHalt : 00000001
[    0.000000]    TCContext : 00180000
[    0.000000]   TC 2
[    0.000000]    TCStatus : 00000000
[    0.000000]    TCBind : 00400001
[    0.000000]    TCRestart : 7ffffffc 0x7ffffffc
[    0.000000]    TCHalt : 00000001
[    0.000000]    TCContext : 00300000
[    0.000000]   TC 3
[    0.000000]    TCStatus : 00000000
[    0.000000]    TCBind : 00600001
[    0.000000]    TCRestart : fff7ffae 0xfff7ffae
[    0.000000]    TCHalt : 00000001
[    0.000000]    TCContext : 00480000
[    0.000000]   TC 4
[    0.000000]    TCStatus : 00000000
[    0.000000]    TCBind : 00800001
[    0.000000]    TCRestart : f3fff7fe 0xf3fff7fe
[    0.000000]    TCHalt : 00000001
[    0.000000]    TCContext : 00600000
[    0.000000]   TC 5
[    0.000000]    TCStatus : 00000000
[    0.000000]    TCBind : 00a00001
[    0.000000]    TCRestart : 7ffffbfe 0x7ffffbfe
[    0.000000]    TCHalt : 00000001
[    0.000000]    TCContext : 00780000
[    0.000000]   TC 6
[    0.000000]    TCStatus : 00000000
[    0.000000]    TCBind : 00c00001
[    0.000000]    TCRestart : ffff7ffe 0xffff7ffe
[    0.000000]    TCHalt : 00000001
[    0.000000]    TCContext : 00900000
[    0.000000] Counter Interrupts taken per CPU (TC)
[    0.000000] 0: 0
[    0.000000] 1: 0
[    0.000000] 2: 0
[    0.000000] 3: 0
[    0.000000] 4: 0
[    0.000000] 5: 0
[    0.000000] 6: 0
[    0.000000] 7: 0
[    0.000000] Self-IPI invocations:
[    0.000000] 0: 0
[    0.000000] 1: 0
[    0.000000] 2: 0
[    0.000000] 3: 0
[    0.000000] 4: 0
[    0.000000] 5: 0
[    0.000000] 6: 0
[    0.000000] 7: 0
[    0.000000] IPIQ[0]: head = 0x0, tail = 0x0, depth = 0
[    0.000000] IPIQ[1]: head = 0x0, tail = 0x0, depth = 0
[    0.000000] IPIQ[2]: head = 0x0, tail = 0x0, depth = 0
[    0.000000] IPIQ[3]: head = 0x0, tail = 0x0, depth = 0
[    0.000000] IPIQ[4]: head = 0x0, tail = 0x0, depth = 0
[    0.000000] IPIQ[5]: head = 0x0, tail = 0x0, depth = 0
[    0.000000] IPIQ[6]: head = 0x0, tail = 0x0, depth = 0
[    0.000000] IPIQ[7]: head = 0x0, tail = 0x0, depth = 0
[    0.000000] 0 Recoveries of "stolen" FPU
[    0.000000] ===========================
[    0.000000] In platform cic dispatch cic_mask=0x22000 stat=0x2402000f
pend=0x20000
[    0.010000] === MIPS MT State Dump ===
[    0.010000] -- Global State --
[    0.010000]    MVPControl Passed: 00000000
[    0.010000]    MVPControl Read: 00000000
[    0.010000]    MVPConf0 : a8008406
[    0.010000] -- per-VPE State --
[    0.010000]   VPE 0
[    0.010000]    VPEControl : 00000000
[    0.010000]    VPEConf0 : 800f0003
[    0.010000]    VPE0.Status : 18004000
[    0.010000]    VPE0.EPC : 8010d900 mips_mt_regdump+0x390/0x3c4
[    0.010000]    VPE0.Cause : 40804000
[    0.010000]    VPE0.Config7 : 00010000
[    0.010000]   VPE 1
[    0.010000]    VPEControl : 00060000
[    0.010000]    VPEConf0 : 800f0000
[    0.010000]    VPE1.Status : 00408305
[    0.010000]    VPE1.EPC : 801024e0 except_vec_vi+0x0/0x84
[    0.010000]    VPE1.Cause : 40000200
[    0.010000]    VPE1.Config7 : 00010000
[    0.010000] -- per-TC State --
[    0.010000]   TC 0 (current TC with VPE EPC above)
[    0.010000]    TCStatus : 00000000
[    0.010000]    TCBind : 00000000
[    0.010000]    TCRestart : 803f791c printk+0xc/0x30
[    0.010000]    TCHalt : 00000000
[    0.010000]    TCContext : 00000000
[    0.010000]   TC 1
[    0.010000]    TCStatus : 00000000
[    0.010000]    TCBind : 00200001
[    0.010000]    TCRestart : 80104b64 copy_thread+0x2ac/0x2b4
[    0.010000]    TCHalt : 00000001
[    0.010000]    TCContext : 00180000
[    0.010000]   TC 2
[    0.010000]    TCStatus : 00000000
[    0.010000]    TCBind : 00400001
[    0.010000]    TCRestart : 7ffffffc 0x7ffffffc
[    0.010000]    TCHalt : 00000001
[    0.010000]    TCContext : 00300000
[    0.010000]   TC 3
[    0.010000]    TCStatus : 00000000
[    0.010000]    TCBind : 00600001
[    0.010000]    TCRestart : fff7ffae 0xfff7ffae
[    0.010000]    TCHalt : 00000001
[    0.010000]    TCContext : 00480000
[    0.010000]   TC 4
[    0.010000]    TCStatus : 00000000
[    0.010000]    TCBind : 00800001
[    0.010000]    TCRestart : f3fff7fe 0xf3fff7fe
[    0.010000]    TCHalt : 00000001
[    0.010000]    TCContext : 00600000
[    0.010000]   TC 5
[    0.010000]    TCStatus : 00000000
[    0.010000]    TCBind : 00a00001
[    0.010000]    TCRestart : 7ffffbfe 0x7ffffbfe
[    0.010000]    TCHalt : 00000001
[    0.010000]    TCContext : 00780000
[    0.010000]   TC 6
[    0.010000]    TCStatus : 00000000
[    0.010000]    TCBind : 00c00001
[    0.010000]    TCRestart : ffff7ffe 0xffff7ffe
[    0.010000]    TCHalt : 00000001
[    0.010000]    TCContext : 00900000
[    0.010000] Counter Interrupts taken per CPU (TC)
[    0.010000] 0: 0
[    0.010000] 1: 0
[    0.010000] 2: 0
[    0.010000] 3: 0
[    0.010000] 4: 0
[    0.010000] 5: 0
[    0.010000] 6: 0
[    0.010000] 7: 0
[    0.010000] Self-IPI invocations:
[    0.010000] 0: 0
[    0.010000] 1: 0
[    0.010000] 2: 0
[    0.010000] 3: 0
[    0.010000] 4: 0
[    0.010000] 5: 0
[    0.010000] 6: 0
[    0.010000] 7: 0
[    0.010000] IPIQ[0]: head = 0x0, tail = 0x0, depth = 0
[    0.010000] IPIQ[1]: head = 0x0, tail = 0x0, depth = 0
[    0.010000] IPIQ[2]: head = 0x0, tail = 0x0, depth = 0
[    0.010000] IPIQ[3]: head = 0x0, tail = 0x0, depth = 0
[    0.010000] IPIQ[4]: head = 0x0, tail = 0x0, depth = 0
[    0.010000] IPIQ[5]: head = 0x0, tail = 0x0, depth = 0
[    0.010000] IPIQ[6]: head = 0x0, tail = 0x0, depth = 0
[    0.010000] IPIQ[7]: head = 0x0, tail = 0x0, depth = 0
[    0.010000] 0 Recoveries of "stolen" FPU
[    0.010000] ===========================
[    0.010000] === MIPS MT State Dump ===
[    0.010000] -- Global State --
[    0.010000]    MVPControl Passed: 00000000
[    0.010000]    MVPControl Read: 00000000
[    0.010000]    MVPConf0 : a8008406
[    0.010000] -- per-VPE State --
[    0.010000]   VPE 0
[    0.010000]    VPEControl : 00000000
[    0.010000]    VPEConf0 : 800f0003
[    0.010000]    VPE0.Status : 18004000
[    0.010000]    VPE0.EPC : 8010d900 mips_mt_regdump+0x390/0x3c4
[    0.010000]    VPE0.Cause : 40804000
[    0.010000]    VPE0.Config7 : 00010000
[    0.010000]   VPE 1
[    0.010000]    VPEControl : 00060000
[    0.010000]    VPEConf0 : 800f0000
[    0.010000]    VPE1.Status : 00408305
[    0.010000]    VPE1.EPC : 801024e0 except_vec_vi+0x0/0x84
[    0.010000]    VPE1.Cause : 40000200
[    0.010000]    VPE1.Config7 : 00010000
[    0.010000] -- per-TC State --
[    0.010000]   TC 0 (current TC with VPE EPC above)
[    0.010000]    TCStatus : 00000000
[    0.010000]    TCBind : 00000000
[    0.010000]    TCRestart : 803f791c printk+0xc/0x30
[    0.010000]    TCHalt : 00000000
[    0.010000]    TCContext : 00000000
[    0.010000]   TC 1
[    0.010000]    TCStatus : 00000000
[    0.010000]    TCBind : 00200001
[    0.010000]    TCRestart : 80104b64 copy_thread+0x2ac/0x2b4
[    0.010000]    TCHalt : 00000001
[    0.010000]    TCContext : 00180000
[    0.010000]   TC 2
[    0.010000]    TCStatus : 00000000
[    0.010000]    TCBind : 00400001
[    0.010000]    TCRestart : 7ffffffc 0x7ffffffc
[    0.010000]    TCHalt : 00000001
[    0.010000]    TCContext : 00300000
[    0.010000]   TC 3
[    0.010000]    TCStatus : 00000000
[    0.010000]    TCBind : 00600001
[    0.010000]    TCRestart : fff7ffae 0xfff7ffae
[    0.010000]    TCHalt : 00000001
[    0.010000]    TCContext : 00480000
[    0.010000]   TC 4
[    0.010000]    TCStatus : 00000000
[    0.010000]    TCBind : 00800001
[    0.010000]    TCRestart : f3fff7fe 0xf3fff7fe
[    0.010000]    TCHalt : 00000001
[    0.010000]    TCContext : 00600000
[    0.010000]   TC 5
[    0.010000]    TCStatus : 00000000
[    0.010000]    TCBind : 00a00001
[    0.010000]    TCRestart : 7ffffbfe 0x7ffffbfe
[    0.010000]    TCHalt : 00000001
[    0.010000]    TCContext : 00780000
[    0.010000]   TC 6
[    0.010000]    TCStatus : 00000000
[    0.010000]    TCBind : 00c00001
[    0.010000]    TCRestart : ffff7ffe 0xffff7ffe
[    0.010000]    TCHalt : 00000001
[    0.010000]    TCContext : 00900000
[    0.010000] Counter Interrupts taken per CPU (TC)
[    0.010000] 0: 0
[    0.010000] 1: 0
[    0.010000] 2: 0
[    0.010000] 3: 0
[    0.010000] 4: 0
[    0.010000] 5: 0
[    0.010000] 6: 0
[    0.010000] 7: 0
[    0.010000] Self-IPI invocations:
[    0.010000] 0: 0
[    0.010000] 1: 0
[    0.010000] 2: 0
[    0.010000] 3: 0
[    0.010000] 4: 0
[    0.010000] 5: 0
[    0.010000] 6: 0
[    0.010000] 7: 0
[    0.010000] IPIQ[0]: head = 0x0, tail = 0x0, depth = 0
[    0.010000] IPIQ[1]: head = 0x0, tail = 0x0, depth = 0
[    0.010000] IPIQ[2]: head = 0x0, tail = 0x0, depth = 0
[    0.010000] IPIQ[3]: head = 0x0, tail = 0x0, depth = 0
[    0.010000] IPIQ[4]: head = 0x0, tail = 0x0, depth = 0
[    0.010000] IPIQ[5]: head = 0x0, tail = 0x0, depth = 0
[    0.010000] IPIQ[6]: head = 0x0, tail = 0x0, depth = 0
[    0.010000] IPIQ[7]: head = 0x0, tail = 0x0, depth = 0
[    0.010000] 0 Recoveries of "stolen" FPU
[    0.010000] ===========================
[    0.010000] === MIPS MT State Dump ===
[    0.010000] -- Global State --
[    0.010000]    MVPControl Passed: 00000000
[    0.010000]    MVPControl Read: 00000000
[    0.010000]    MVPConf0 : a8008406
[    0.010000] -- per-VPE State --
[    0.010000]   VPE 0
[    0.010000]    VPEControl : 00000000
[    0.010000]    VPEConf0 : 800f0003
[    0.010000]    VPE0.Status : 18004000
[    0.010000]    VPE0.EPC : 8010d900 mips_mt_regdump+0x390/0x3c4
[    0.010000]    VPE0.Cause : 40804000
[    0.010000]    VPE0.Config7 : 00010000
[    0.010000]   VPE 1
[    0.010000]    VPEControl : 00060000
[    0.010000]    VPEConf0 : 800f0000
[    0.010000]    VPE1.Status : 00408305
[    0.010000]    VPE1.EPC : 801024e0 except_vec_vi+0x0/0x84
[    0.010000]    VPE1.Cause : 40000200
[    0.010000]    VPE1.Config7 : 00010000
[    0.010000] -- per-TC State --
[    0.010000]   TC 0 (current TC with VPE EPC above)
[    0.010000]    TCStatus : 00000000
[    0.010000]    TCBind : 00000000
[    0.010000]    TCRestart : 803f791c printk+0xc/0x30
[    0.010000]    TCHalt : 00000000
[    0.010000]    TCContext : 00000000
[    0.010000]   TC 1
[    0.010000]    TCStatus : 00000000
[    0.010000]    TCBind : 00200001
[    0.010000]    TCRestart : 80104b64 copy_thread+0x2ac/0x2b4
[    0.010000]    TCHalt : 00000001
[    0.010000]    TCContext : 00180000
[    0.010000]   TC 2
[    0.010000]    TCStatus : 00000000
[    0.010000]    TCBind : 00400001
[    0.010000]    TCRestart : 7ffffffc 0x7ffffffc
[    0.010000]    TCHalt : 00000001
[    0.010000]    TCContext : 00300000
[    0.010000]   TC 3
[    0.010000]    TCStatus : 00000000
[    0.010000]    TCBind : 00600001
[    0.010000]    TCRestart : fff7ffae 0xfff7ffae
[    0.010000]    TCHalt : 00000001
[    0.010000]    TCContext : 00480000
[    0.010000]   TC 4
[    0.010000]    TCStatus : 00000000
[    0.010000]    TCBind : 00800001
[    0.010000]    TCRestart : f3fff7fe 0xf3fff7fe
[    0.010000]    TCHalt : 00000001
[    0.010000]    TCContext : 00600000
[    0.010000]   TC 5
[    0.010000]    TCStatus : 00000000
[    0.010000]    TCBind : 00a00001
[    0.010000]    TCRestart : 7ffffbfe 0x7ffffbfe
[    0.010000]    TCHalt : 00000001
[    0.010000]    TCContext : 00780000
[    0.010000]   TC 6
[    0.010000]    TCStatus : 00000000
[    0.010000]    TCBind : 00c00001
[    0.010000]    TCRestart : ffff7ffe 0xffff7ffe
[    0.010000]    TCHalt : 00000001
[    0.010000]    TCContext : 00900000
[    0.010000] Counter Interrupts taken per CPU (TC)
[    0.010000] 0: 0
[    0.010000] 1: 0
[    0.010000] 2: 0
[    0.010000] 3: 0
[    0.010000] 4: 0
[    0.010000] 5: 0
[    0.010000] 6: 0
[    0.010000] 7: 0
[    0.010000] Self-IPI invocations:
[    0.010000] 0: 0
[    0.010000] 1: 0
[    0.010000] 2: 0
[    0.010000] 3: 0
[    0.010000] 4: 0
[    0.010000] 5: 0
[    0.010000] 6: 0
[    0.010000] 7: 0
[    0.010000] IPIQ[0]: head = 0x0, tail = 0x0, depth = 0
[    0.010000] IPIQ[1]: head = 0x0, tail = 0x0, depth = 0
[    0.010000] IPIQ[2]: head = 0x0, tail = 0x0, depth = 0
[    0.010000] IPIQ[3]: head = 0x0, tail = 0x0, depth = 0
[    0.010000] IPIQ[4]: head = 0x0, tail = 0x0, depth = 0
[    0.010000] IPIQ[5]: head = 0x0, tail = 0x0, depth = 0
[    0.010000] IPIQ[6]: head = 0x0, tail = 0x0, depth = 0
[    0.010000] IPIQ[7]: head = 0x0, tail = 0x0, depth = 0
[    0.010000] 0 Recoveries of "stolen" FPU

^ permalink raw reply	[flat|nested] 68+ messages in thread

* RE: SMTC support status in latest git head.
@ 2010-12-14 15:25     ` Anoop P.A.
  0 siblings, 0 replies; 68+ messages in thread
From: Anoop P.A. @ 2010-12-14 15:25 UTC (permalink / raw)
  To: Kevin D. Kissell; +Cc: linux-mips

> it ended up being cleaner and more efficient to have *some* hooks in
> platform specific timer code.  It was there for Malta in the
kernel.org
> mainline once upon a time, and I *thought* we'd propagated working
code
> for the initial PMC-Sierra 34K-based SoC's at least as far as
[Anoop P.A.] 
I was able to boot 2.6.24-7 git sources with a change in cevt-r4k.c (
c0_compare_int_pending changed as following "return (read_c0_cause() >>
cp0_compare_irq_shift) & (1ul << CAUSEB_IP)"
 
> linux-mips.org, but the source tree has been considerably reorganized
-
> there was a time when some of the hooks were under
> arch/mips/mips-boards/generic, which no longer exists - and I'm not
sure
> where to point you.  Git and grep are your friends.
[Anoop P.A.]malta code has been moved to arch/mips/mti-malta/
Can you recollect the version of l-m-o kernel with a known working SMTC
support ?.

> 
> The first order of business is to break into that hung timer
calibration
> loop and dump the CP0 registers for the VPE and the TCs, in particular
> checking the interrupt enable mask in Status against the pending
> interrupts in the Cause register.   If you're seeing the timer
> interrupt's bit set in Cause, but clear in Status, you need to fix the
> SMTC interrupt mask hook for your platform timer.  
[Anoop P.A.] 
I tried dumping registers from calibration while loop.
It looks like the timer interrupt bit stay high on both cause and status
register ( in my case timer interrupt is connected to Cascaded CIC
interrupt which is connected to irq -6 ( C_IRQ4)). Detailed log pasted
below

> check to see if you're building for "tickless" operation.  Tickless
ends
> up being really important for SMTC, and I did get it working properly
> back in 2008, but I the SMTC-specific cevt-smtc.c code uses common
> functions in cevt-r4k.c, and I've seen some patches to cevt-r4k.c
going
> by that I rather doubt were ever tested against an SMTC
build/platform.
> There might have been breakage there, and configuring to use a fixed
> interval timer (say, 100Hz) would be a way to test that hypothesis.

[Anoop P.A.] I have tried both tickles and fixed interval timer.

> 
>              Regards,
> 
>              Kevin K.


[Anoop P.A.] Thanks much for your and Ralf's detailed response. 
> 
[Anoop P.A.] 
[    0.000000] Writing ErrCtl register=00000000
[    0.000000] Readback ErrCtl register=00000000
[    0.000000] Memory: 254384k/257912k available (3062k kernel code,
3528k reserved, 648k data, 200k init, 0k highmem)
[    0.000000] Preemptable hierarchical RCU implementation.
[    0.000000] NR_IRQS:128
[    0.000000] console [ttyS0] enabled
[    0.000000] Clock rate set to 600000000
[    0.000000] Calibrating delay loop... === MIPS MT State Dump ===
[    0.000000] -- Global State --
[    0.000000]    MVPControl Passed: 00000000
[    0.000000]    MVPControl Read: 00000000
[    0.000000]    MVPConf0 : a8008406
[    0.000000] -- per-VPE State --
[    0.000000]   VPE 0
[    0.000000]    VPEControl : 00000000
[    0.000000]    VPEConf0 : 800f0003
[    0.000000]    VPE0.Status : 11004001
[    0.000000]    VPE0.EPC : 80100000 _stext+0x0/0x10
[    0.000000]    VPE0.Cause : 40804000
[    0.000000]    VPE0.Config7 : 00010000
[    0.000000]   VPE 1
[    0.000000]    VPEControl : 00060000
[    0.000000]    VPEConf0 : 800f0000
[    0.000000]    VPE1.Status : 00408305
[    0.000000]    VPE1.EPC : 801024e0 except_vec_vi+0x0/0x84
[    0.000000]    VPE1.Cause : 40000200
[    0.000000]    VPE1.Config7 : 00010000
[    0.000000] -- per-TC State --
[    0.000000]   TC 0 (current TC with VPE EPC above)
[    0.000000]    TCStatus : 00000000
[    0.000000]    TCBind : 00000000
[    0.000000]    TCRestart : 8010d860 mips_mt_regdump+0x2f0/0x3c4
[    0.000000]    TCHalt : 00000000
[    0.000000]    TCContext : 00000000
[    0.000000]   TC 1
[    0.000000]    TCStatus : 00000000
[    0.000000]    TCBind : 00200001
[    0.000000]    TCRestart : 80104b64 copy_thread+0x2ac/0x2b4
[    0.000000]    TCHalt : 00000001
[    0.000000]    TCContext : 00180000
[    0.000000]   TC 2
[    0.000000]    TCStatus : 00000000
[    0.000000]    TCBind : 00400001
[    0.000000]    TCRestart : 7ffffffc 0x7ffffffc
[    0.000000]    TCHalt : 00000001
[    0.000000]    TCContext : 00300000
[    0.000000]   TC 3
[    0.000000]    TCStatus : 00000000
[    0.000000]    TCBind : 00600001
[    0.000000]    TCRestart : fff7ffae 0xfff7ffae
[    0.000000]    TCHalt : 00000001
[    0.000000]    TCContext : 00480000
[    0.000000]   TC 4
[    0.000000]    TCStatus : 00000000
[    0.000000]    TCBind : 00800001
[    0.000000]    TCRestart : f3fff7fe 0xf3fff7fe
[    0.000000]    TCHalt : 00000001
[    0.000000]    TCContext : 00600000
[    0.000000]   TC 5
[    0.000000]    TCStatus : 00000000
[    0.000000]    TCBind : 00a00001
[    0.000000]    TCRestart : 7ffffbfe 0x7ffffbfe
[    0.000000]    TCHalt : 00000001
[    0.000000]    TCContext : 00780000
[    0.000000]   TC 6
[    0.000000]    TCStatus : 00000000
[    0.000000]    TCBind : 00c00001
[    0.000000]    TCRestart : ffff7ffe 0xffff7ffe
[    0.000000]    TCHalt : 00000001
[    0.000000]    TCContext : 00900000
[    0.000000] Counter Interrupts taken per CPU (TC)
[    0.000000] 0: 0
[    0.000000] 1: 0
[    0.000000] 2: 0
[    0.000000] 3: 0
[    0.000000] 4: 0
[    0.000000] 5: 0
[    0.000000] 6: 0
[    0.000000] 7: 0
[    0.000000] Self-IPI invocations:
[    0.000000] 0: 0
[    0.000000] 1: 0
[    0.000000] 2: 0
[    0.000000] 3: 0
[    0.000000] 4: 0
[    0.000000] 5: 0
[    0.000000] 6: 0
[    0.000000] 7: 0
[    0.000000] IPIQ[0]: head = 0x0, tail = 0x0, depth = 0
[    0.000000] IPIQ[1]: head = 0x0, tail = 0x0, depth = 0
[    0.000000] IPIQ[2]: head = 0x0, tail = 0x0, depth = 0
[    0.000000] IPIQ[3]: head = 0x0, tail = 0x0, depth = 0
[    0.000000] IPIQ[4]: head = 0x0, tail = 0x0, depth = 0
[    0.000000] IPIQ[5]: head = 0x0, tail = 0x0, depth = 0
[    0.000000] IPIQ[6]: head = 0x0, tail = 0x0, depth = 0
[    0.000000] IPIQ[7]: head = 0x0, tail = 0x0, depth = 0
[    0.000000] 0 Recoveries of "stolen" FPU
[    0.000000] ===========================
[    0.000000] In platform cic dispatch cic_mask=0x22000 stat=0x2402000f
pend=0x20000
[    0.010000] === MIPS MT State Dump ===
[    0.010000] -- Global State --
[    0.010000]    MVPControl Passed: 00000000
[    0.010000]    MVPControl Read: 00000000
[    0.010000]    MVPConf0 : a8008406
[    0.010000] -- per-VPE State --
[    0.010000]   VPE 0
[    0.010000]    VPEControl : 00000000
[    0.010000]    VPEConf0 : 800f0003
[    0.010000]    VPE0.Status : 18004000
[    0.010000]    VPE0.EPC : 8010d900 mips_mt_regdump+0x390/0x3c4
[    0.010000]    VPE0.Cause : 40804000
[    0.010000]    VPE0.Config7 : 00010000
[    0.010000]   VPE 1
[    0.010000]    VPEControl : 00060000
[    0.010000]    VPEConf0 : 800f0000
[    0.010000]    VPE1.Status : 00408305
[    0.010000]    VPE1.EPC : 801024e0 except_vec_vi+0x0/0x84
[    0.010000]    VPE1.Cause : 40000200
[    0.010000]    VPE1.Config7 : 00010000
[    0.010000] -- per-TC State --
[    0.010000]   TC 0 (current TC with VPE EPC above)
[    0.010000]    TCStatus : 00000000
[    0.010000]    TCBind : 00000000
[    0.010000]    TCRestart : 803f791c printk+0xc/0x30
[    0.010000]    TCHalt : 00000000
[    0.010000]    TCContext : 00000000
[    0.010000]   TC 1
[    0.010000]    TCStatus : 00000000
[    0.010000]    TCBind : 00200001
[    0.010000]    TCRestart : 80104b64 copy_thread+0x2ac/0x2b4
[    0.010000]    TCHalt : 00000001
[    0.010000]    TCContext : 00180000
[    0.010000]   TC 2
[    0.010000]    TCStatus : 00000000
[    0.010000]    TCBind : 00400001
[    0.010000]    TCRestart : 7ffffffc 0x7ffffffc
[    0.010000]    TCHalt : 00000001
[    0.010000]    TCContext : 00300000
[    0.010000]   TC 3
[    0.010000]    TCStatus : 00000000
[    0.010000]    TCBind : 00600001
[    0.010000]    TCRestart : fff7ffae 0xfff7ffae
[    0.010000]    TCHalt : 00000001
[    0.010000]    TCContext : 00480000
[    0.010000]   TC 4
[    0.010000]    TCStatus : 00000000
[    0.010000]    TCBind : 00800001
[    0.010000]    TCRestart : f3fff7fe 0xf3fff7fe
[    0.010000]    TCHalt : 00000001
[    0.010000]    TCContext : 00600000
[    0.010000]   TC 5
[    0.010000]    TCStatus : 00000000
[    0.010000]    TCBind : 00a00001
[    0.010000]    TCRestart : 7ffffbfe 0x7ffffbfe
[    0.010000]    TCHalt : 00000001
[    0.010000]    TCContext : 00780000
[    0.010000]   TC 6
[    0.010000]    TCStatus : 00000000
[    0.010000]    TCBind : 00c00001
[    0.010000]    TCRestart : ffff7ffe 0xffff7ffe
[    0.010000]    TCHalt : 00000001
[    0.010000]    TCContext : 00900000
[    0.010000] Counter Interrupts taken per CPU (TC)
[    0.010000] 0: 0
[    0.010000] 1: 0
[    0.010000] 2: 0
[    0.010000] 3: 0
[    0.010000] 4: 0
[    0.010000] 5: 0
[    0.010000] 6: 0
[    0.010000] 7: 0
[    0.010000] Self-IPI invocations:
[    0.010000] 0: 0
[    0.010000] 1: 0
[    0.010000] 2: 0
[    0.010000] 3: 0
[    0.010000] 4: 0
[    0.010000] 5: 0
[    0.010000] 6: 0
[    0.010000] 7: 0
[    0.010000] IPIQ[0]: head = 0x0, tail = 0x0, depth = 0
[    0.010000] IPIQ[1]: head = 0x0, tail = 0x0, depth = 0
[    0.010000] IPIQ[2]: head = 0x0, tail = 0x0, depth = 0
[    0.010000] IPIQ[3]: head = 0x0, tail = 0x0, depth = 0
[    0.010000] IPIQ[4]: head = 0x0, tail = 0x0, depth = 0
[    0.010000] IPIQ[5]: head = 0x0, tail = 0x0, depth = 0
[    0.010000] IPIQ[6]: head = 0x0, tail = 0x0, depth = 0
[    0.010000] IPIQ[7]: head = 0x0, tail = 0x0, depth = 0
[    0.010000] 0 Recoveries of "stolen" FPU
[    0.010000] ===========================
[    0.010000] === MIPS MT State Dump ===
[    0.010000] -- Global State --
[    0.010000]    MVPControl Passed: 00000000
[    0.010000]    MVPControl Read: 00000000
[    0.010000]    MVPConf0 : a8008406
[    0.010000] -- per-VPE State --
[    0.010000]   VPE 0
[    0.010000]    VPEControl : 00000000
[    0.010000]    VPEConf0 : 800f0003
[    0.010000]    VPE0.Status : 18004000
[    0.010000]    VPE0.EPC : 8010d900 mips_mt_regdump+0x390/0x3c4
[    0.010000]    VPE0.Cause : 40804000
[    0.010000]    VPE0.Config7 : 00010000
[    0.010000]   VPE 1
[    0.010000]    VPEControl : 00060000
[    0.010000]    VPEConf0 : 800f0000
[    0.010000]    VPE1.Status : 00408305
[    0.010000]    VPE1.EPC : 801024e0 except_vec_vi+0x0/0x84
[    0.010000]    VPE1.Cause : 40000200
[    0.010000]    VPE1.Config7 : 00010000
[    0.010000] -- per-TC State --
[    0.010000]   TC 0 (current TC with VPE EPC above)
[    0.010000]    TCStatus : 00000000
[    0.010000]    TCBind : 00000000
[    0.010000]    TCRestart : 803f791c printk+0xc/0x30
[    0.010000]    TCHalt : 00000000
[    0.010000]    TCContext : 00000000
[    0.010000]   TC 1
[    0.010000]    TCStatus : 00000000
[    0.010000]    TCBind : 00200001
[    0.010000]    TCRestart : 80104b64 copy_thread+0x2ac/0x2b4
[    0.010000]    TCHalt : 00000001
[    0.010000]    TCContext : 00180000
[    0.010000]   TC 2
[    0.010000]    TCStatus : 00000000
[    0.010000]    TCBind : 00400001
[    0.010000]    TCRestart : 7ffffffc 0x7ffffffc
[    0.010000]    TCHalt : 00000001
[    0.010000]    TCContext : 00300000
[    0.010000]   TC 3
[    0.010000]    TCStatus : 00000000
[    0.010000]    TCBind : 00600001
[    0.010000]    TCRestart : fff7ffae 0xfff7ffae
[    0.010000]    TCHalt : 00000001
[    0.010000]    TCContext : 00480000
[    0.010000]   TC 4
[    0.010000]    TCStatus : 00000000
[    0.010000]    TCBind : 00800001
[    0.010000]    TCRestart : f3fff7fe 0xf3fff7fe
[    0.010000]    TCHalt : 00000001
[    0.010000]    TCContext : 00600000
[    0.010000]   TC 5
[    0.010000]    TCStatus : 00000000
[    0.010000]    TCBind : 00a00001
[    0.010000]    TCRestart : 7ffffbfe 0x7ffffbfe
[    0.010000]    TCHalt : 00000001
[    0.010000]    TCContext : 00780000
[    0.010000]   TC 6
[    0.010000]    TCStatus : 00000000
[    0.010000]    TCBind : 00c00001
[    0.010000]    TCRestart : ffff7ffe 0xffff7ffe
[    0.010000]    TCHalt : 00000001
[    0.010000]    TCContext : 00900000
[    0.010000] Counter Interrupts taken per CPU (TC)
[    0.010000] 0: 0
[    0.010000] 1: 0
[    0.010000] 2: 0
[    0.010000] 3: 0
[    0.010000] 4: 0
[    0.010000] 5: 0
[    0.010000] 6: 0
[    0.010000] 7: 0
[    0.010000] Self-IPI invocations:
[    0.010000] 0: 0
[    0.010000] 1: 0
[    0.010000] 2: 0
[    0.010000] 3: 0
[    0.010000] 4: 0
[    0.010000] 5: 0
[    0.010000] 6: 0
[    0.010000] 7: 0
[    0.010000] IPIQ[0]: head = 0x0, tail = 0x0, depth = 0
[    0.010000] IPIQ[1]: head = 0x0, tail = 0x0, depth = 0
[    0.010000] IPIQ[2]: head = 0x0, tail = 0x0, depth = 0
[    0.010000] IPIQ[3]: head = 0x0, tail = 0x0, depth = 0
[    0.010000] IPIQ[4]: head = 0x0, tail = 0x0, depth = 0
[    0.010000] IPIQ[5]: head = 0x0, tail = 0x0, depth = 0
[    0.010000] IPIQ[6]: head = 0x0, tail = 0x0, depth = 0
[    0.010000] IPIQ[7]: head = 0x0, tail = 0x0, depth = 0
[    0.010000] 0 Recoveries of "stolen" FPU
[    0.010000] ===========================
[    0.010000] === MIPS MT State Dump ===
[    0.010000] -- Global State --
[    0.010000]    MVPControl Passed: 00000000
[    0.010000]    MVPControl Read: 00000000
[    0.010000]    MVPConf0 : a8008406
[    0.010000] -- per-VPE State --
[    0.010000]   VPE 0
[    0.010000]    VPEControl : 00000000
[    0.010000]    VPEConf0 : 800f0003
[    0.010000]    VPE0.Status : 18004000
[    0.010000]    VPE0.EPC : 8010d900 mips_mt_regdump+0x390/0x3c4
[    0.010000]    VPE0.Cause : 40804000
[    0.010000]    VPE0.Config7 : 00010000
[    0.010000]   VPE 1
[    0.010000]    VPEControl : 00060000
[    0.010000]    VPEConf0 : 800f0000
[    0.010000]    VPE1.Status : 00408305
[    0.010000]    VPE1.EPC : 801024e0 except_vec_vi+0x0/0x84
[    0.010000]    VPE1.Cause : 40000200
[    0.010000]    VPE1.Config7 : 00010000
[    0.010000] -- per-TC State --
[    0.010000]   TC 0 (current TC with VPE EPC above)
[    0.010000]    TCStatus : 00000000
[    0.010000]    TCBind : 00000000
[    0.010000]    TCRestart : 803f791c printk+0xc/0x30
[    0.010000]    TCHalt : 00000000
[    0.010000]    TCContext : 00000000
[    0.010000]   TC 1
[    0.010000]    TCStatus : 00000000
[    0.010000]    TCBind : 00200001
[    0.010000]    TCRestart : 80104b64 copy_thread+0x2ac/0x2b4
[    0.010000]    TCHalt : 00000001
[    0.010000]    TCContext : 00180000
[    0.010000]   TC 2
[    0.010000]    TCStatus : 00000000
[    0.010000]    TCBind : 00400001
[    0.010000]    TCRestart : 7ffffffc 0x7ffffffc
[    0.010000]    TCHalt : 00000001
[    0.010000]    TCContext : 00300000
[    0.010000]   TC 3
[    0.010000]    TCStatus : 00000000
[    0.010000]    TCBind : 00600001
[    0.010000]    TCRestart : fff7ffae 0xfff7ffae
[    0.010000]    TCHalt : 00000001
[    0.010000]    TCContext : 00480000
[    0.010000]   TC 4
[    0.010000]    TCStatus : 00000000
[    0.010000]    TCBind : 00800001
[    0.010000]    TCRestart : f3fff7fe 0xf3fff7fe
[    0.010000]    TCHalt : 00000001
[    0.010000]    TCContext : 00600000
[    0.010000]   TC 5
[    0.010000]    TCStatus : 00000000
[    0.010000]    TCBind : 00a00001
[    0.010000]    TCRestart : 7ffffbfe 0x7ffffbfe
[    0.010000]    TCHalt : 00000001
[    0.010000]    TCContext : 00780000
[    0.010000]   TC 6
[    0.010000]    TCStatus : 00000000
[    0.010000]    TCBind : 00c00001
[    0.010000]    TCRestart : ffff7ffe 0xffff7ffe
[    0.010000]    TCHalt : 00000001
[    0.010000]    TCContext : 00900000
[    0.010000] Counter Interrupts taken per CPU (TC)
[    0.010000] 0: 0
[    0.010000] 1: 0
[    0.010000] 2: 0
[    0.010000] 3: 0
[    0.010000] 4: 0
[    0.010000] 5: 0
[    0.010000] 6: 0
[    0.010000] 7: 0
[    0.010000] Self-IPI invocations:
[    0.010000] 0: 0
[    0.010000] 1: 0
[    0.010000] 2: 0
[    0.010000] 3: 0
[    0.010000] 4: 0
[    0.010000] 5: 0
[    0.010000] 6: 0
[    0.010000] 7: 0
[    0.010000] IPIQ[0]: head = 0x0, tail = 0x0, depth = 0
[    0.010000] IPIQ[1]: head = 0x0, tail = 0x0, depth = 0
[    0.010000] IPIQ[2]: head = 0x0, tail = 0x0, depth = 0
[    0.010000] IPIQ[3]: head = 0x0, tail = 0x0, depth = 0
[    0.010000] IPIQ[4]: head = 0x0, tail = 0x0, depth = 0
[    0.010000] IPIQ[5]: head = 0x0, tail = 0x0, depth = 0
[    0.010000] IPIQ[6]: head = 0x0, tail = 0x0, depth = 0
[    0.010000] IPIQ[7]: head = 0x0, tail = 0x0, depth = 0
[    0.010000] 0 Recoveries of "stolen" FPU

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: SMTC support status in latest git head.
  2010-12-08 13:48 ` Anoop P.A.
  (?)
  (?)
@ 2010-12-09 18:52 ` Kevin D. Kissell
  2010-12-14 15:25     ` Anoop P.A.
  -1 siblings, 1 reply; 68+ messages in thread
From: Kevin D. Kissell @ 2010-12-09 18:52 UTC (permalink / raw)
  To: Anoop P.A.; +Cc: linux-mips

I used to do occasional tests and damage control patches for SMTC, but 
haven't had the time and resources for the past year or so.  The 
"Calibrating delay loop" hang is an absolutely classic hang in SMTC 
systems that stems from the interrupt management system not being 
properly set up.  Ralf alluded to the intra-TC timer propagation 
protocol, but your problem could just as easily (more easily, actually) 
have to do with enable mask management. In order to keep multiple 
threads from "convoying" into interrupt handlers chasing a single event, 
SMTC manipulates the interrupt enable mask at entry into an interrupt 
exception to ensure that only the initial TC goes after it.  The 
interrupt is unmasked once the interrupt handler has quenched the source 
and invoked the IRQ ack function.  Unfortunately, generic timer 
functions don't always do the canonical source quench performed by most 
device driver interrupt handlers. I tried to make all this 
self-contained in generic architecture-specific code, but at some point 
it ended up being cleaner and more efficient to have *some* hooks in 
platform specific timer code.  It was there for Malta in the kernel.org 
mainline once upon a time, and I *thought* we'd propagated working code 
for the initial PMC-Sierra 34K-based SoC's at least as far as 
linux-mips.org, but the source tree has been considerably reorganized - 
there was a time when some of the hooks were under 
arch/mips/mips-boards/generic, which no longer exists - and I'm not sure 
where to point you.  Git and grep are your friends.

The first order of business is to break into that hung timer calibration 
loop and dump the CP0 registers for the VPE and the TCs, in particular 
checking the interrupt enable mask in Status against the pending 
interrupts in the Cause register.   If you're seeing the timer 
interrupt's bit set in Cause, but clear in Status, you need to fix the 
SMTC interrupt mask hook for your platform timer.  If that's *not* it, 
check to see if you're building for "tickless" operation.  Tickless ends 
up being really important for SMTC, and I did get it working properly 
back in 2008, but I the SMTC-specific cevt-smtc.c code uses common 
functions in cevt-r4k.c, and I've seen some patches to cevt-r4k.c going 
by that I rather doubt were ever tested against an SMTC build/platform.  
There might have been breakage there, and configuring to use a fixed 
interval timer (say, 100Hz) would be a way to test that hypothesis.

             Regards,

             Kevin K.

On 12/08/10 05:48, Anoop P.A. wrote:
> Hi list,
>
> Any body is aware of SMTC support status in latest git sources?. I have tried testing SMTC kernel for malta in qemu / OVP without any success ( emulators not working for 34k).
>
> I am trying to bring up SMTC Linux support for an mips34K based soc ( MSP71xx family).
>
> While booting , kernel getting hung on calibrate loop delay. I am getting only one interrupt from timer. With similar smtc platform support file (  changed to map smp_ops structure)  2.6.24-stable branch kernel ( where latest timer structure introduced) boots fine.
>
> [    0.000000] Linux version 2.6.37-rc1-pmc-00197-g5bfd3ba-dirty (paanoop1@paanoop1-desktop) (gcc version 4.5.1 (GCC) ) #168 SMP PREEMPT Wed Dec 8 19:19:490
> [    0.000000] DSPRAM0: PA=1c100000,Size=00008000,enabled
> [    0.000000] UART clock set to 50000000
> [    0.000000] CPU revision is: 00019548 (MIPS 34Kc)
> [    0.000000] Determined physical RAM map:
> [    0.000000]  memory: 00001000 @ 00000000 (reserved)
> [    0.000000]  memory: 000ff000 @ 00001000 (usable)
> [    0.000000]  memory: 003f2000 @ 00100000 (reserved)
> [    0.000000]  memory: 0fad9200 @ 004f2000 (usable)
> [    0.000000] Wasting 32 bytes for tracking 1 unused pages
> [    0.000000] Zone PFN ranges:
> [    0.000000]   Normal   0x00000000 ->  0x0000ffcb
> [    0.000000] Movable zone start PFN for each node
> [    0.000000] early_node_map[1] active PFN ranges
> [    0.000000]     0: 0x00000000 ->  0x0000ffcb
> [    0.000000] 6 available secondary CPU TC(s)
> [    0.000000] PERCPU: Embedded 7 pages/cpu @81203000 s6464 r8192 d14016 u32768
> [    0.000000] pcpu-alloc: s6464 r8192 d14016 u32768 alloc=8*4096
> [    0.000000] pcpu-alloc: [0] 0 [0] 1 [0] 2 [0] 3 [0] 4 [0] 5 [0] 6
> [    0.000000] Built 1 zonelists in Zone order, mobility grouping on.  Total pages: 64971
> [    0.000000] Kernel command line: console=ttyS0,57600
> [    0.000000] PID hash table entries: 1024 (order: 0, 4096 bytes)
> [    0.000000] Dentry cache hash table entries: 32768 (order: 5, 131072 bytes)
> [    0.000000] Inode-cache hash table entries: 16384 (order: 4, 65536 bytes)
> [    0.000000] Primary instruction cache 64kB, VIPT, 4-way, linesize 32 bytes.
> [    0.000000] Primary data cache 64kB, 4-way, PIPT, no aliases, linesize 32 bytes
> [    0.000000] Writing ErrCtl register=00000000
> [    0.000000] Readback ErrCtl register=00000000
> [    0.000000] Memory: 254360k/257888k available (3081k kernel code, 3528k reserved, 653k data, 200k init, 0k highmem)
> [    0.000000] Preemptable hierarchical RCU implementation.
> [    0.000000] NR_IRQS:128
> [    0.000000] console [ttyS0] enabled
> [    0.000000] Clock rate set to 600000000
> [    0.000000] Calibrating delay loop...
>
> Any idea to debug the issue ?.
>
> Thanks,
> Anoop
>
>
>
>    

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: SMTC support status in latest git head.
  2010-12-08 13:48 ` Anoop P.A.
  (?)
@ 2010-12-09 17:07 ` Ralf Baechle
  -1 siblings, 0 replies; 68+ messages in thread
From: Ralf Baechle @ 2010-12-09 17:07 UTC (permalink / raw)
  To: Anoop P.A.; +Cc: linux-mips, Kevin D. Kissell

On Wed, Dec 08, 2010 at 05:48:48AM -0800, Anoop P.A. wrote:

> Any body is aware of SMTC support status in latest git sources?. I have tried testing SMTC kernel for malta in qemu / OVP without any success ( emulators not working for 34k). 

Correct.  MTI's MIPSsim is the only simulator that supports multithreading
afaik.

SMTC is not terribly popular so doesn't receive the regular testing it should
because it's also a complex beast.

> I am trying to bring up SMTC Linux support for an mips34K based soc ( MSP71xx family).
> 
> While booting , kernel getting hung on calibrate loop delay. I am getting only one interrupt from timer. With similar smtc platform support file (  changed to map smp_ops structure)  2.6.24-stable branch kernel ( where latest timer structure introduced) boots fine. 

Timer interrupts work differently in SMTC.  Each CPU needs a clock event
device, that is an interrupt timer but the CPU core is restricted to just
one per VPE so in typical SMTC setup multiple CPUs aka TCs will have to
share an interrupt timer.  The way this works is that one of the TCs
associated with a VPE will take the timer interrupt and forward it to
the other TCs associated with the same VPE (if any) through a software
IPI mechanism.  The race conditions that need to handled to make this
work are ...  interesting.  Your problem seems to be simpler as you only
get a single timer interrupt.

  Ralf

^ permalink raw reply	[flat|nested] 68+ messages in thread

* SMTC support status in latest git head.
@ 2010-12-08 13:48 ` Anoop P.A.
  0 siblings, 0 replies; 68+ messages in thread
From: Anoop P.A. @ 2010-12-08 13:48 UTC (permalink / raw)
  To: linux-mips

Hi list,

Any body is aware of SMTC support status in latest git sources?. I have tried testing SMTC kernel for malta in qemu / OVP without any success ( emulators not working for 34k). 

I am trying to bring up SMTC Linux support for an mips34K based soc ( MSP71xx family).

While booting , kernel getting hung on calibrate loop delay. I am getting only one interrupt from timer. With similar smtc platform support file (  changed to map smp_ops structure)  2.6.24-stable branch kernel ( where latest timer structure introduced) boots fine. 

[    0.000000] Linux version 2.6.37-rc1-pmc-00197-g5bfd3ba-dirty (paanoop1@paanoop1-desktop) (gcc version 4.5.1 (GCC) ) #168 SMP PREEMPT Wed Dec 8 19:19:490
[    0.000000] DSPRAM0: PA=1c100000,Size=00008000,enabled
[    0.000000] UART clock set to 50000000
[    0.000000] CPU revision is: 00019548 (MIPS 34Kc)
[    0.000000] Determined physical RAM map:
[    0.000000]  memory: 00001000 @ 00000000 (reserved)
[    0.000000]  memory: 000ff000 @ 00001000 (usable)
[    0.000000]  memory: 003f2000 @ 00100000 (reserved)
[    0.000000]  memory: 0fad9200 @ 004f2000 (usable)
[    0.000000] Wasting 32 bytes for tracking 1 unused pages
[    0.000000] Zone PFN ranges:
[    0.000000]   Normal   0x00000000 -> 0x0000ffcb
[    0.000000] Movable zone start PFN for each node
[    0.000000] early_node_map[1] active PFN ranges
[    0.000000]     0: 0x00000000 -> 0x0000ffcb
[    0.000000] 6 available secondary CPU TC(s)
[    0.000000] PERCPU: Embedded 7 pages/cpu @81203000 s6464 r8192 d14016 u32768
[    0.000000] pcpu-alloc: s6464 r8192 d14016 u32768 alloc=8*4096
[    0.000000] pcpu-alloc: [0] 0 [0] 1 [0] 2 [0] 3 [0] 4 [0] 5 [0] 6
[    0.000000] Built 1 zonelists in Zone order, mobility grouping on.  Total pages: 64971
[    0.000000] Kernel command line: console=ttyS0,57600
[    0.000000] PID hash table entries: 1024 (order: 0, 4096 bytes)
[    0.000000] Dentry cache hash table entries: 32768 (order: 5, 131072 bytes)
[    0.000000] Inode-cache hash table entries: 16384 (order: 4, 65536 bytes)
[    0.000000] Primary instruction cache 64kB, VIPT, 4-way, linesize 32 bytes.
[    0.000000] Primary data cache 64kB, 4-way, PIPT, no aliases, linesize 32 bytes
[    0.000000] Writing ErrCtl register=00000000
[    0.000000] Readback ErrCtl register=00000000
[    0.000000] Memory: 254360k/257888k available (3081k kernel code, 3528k reserved, 653k data, 200k init, 0k highmem)
[    0.000000] Preemptable hierarchical RCU implementation.
[    0.000000] NR_IRQS:128
[    0.000000] console [ttyS0] enabled
[    0.000000] Clock rate set to 600000000
[    0.000000] Calibrating delay loop...

Any idea to debug the issue ?.

Thanks,
Anoop

^ permalink raw reply	[flat|nested] 68+ messages in thread

* SMTC support status in latest git head.
@ 2010-12-08 13:48 ` Anoop P.A.
  0 siblings, 0 replies; 68+ messages in thread
From: Anoop P.A. @ 2010-12-08 13:48 UTC (permalink / raw)
  To: linux-mips

Hi list,

Any body is aware of SMTC support status in latest git sources?. I have tried testing SMTC kernel for malta in qemu / OVP without any success ( emulators not working for 34k). 

I am trying to bring up SMTC Linux support for an mips34K based soc ( MSP71xx family).

While booting , kernel getting hung on calibrate loop delay. I am getting only one interrupt from timer. With similar smtc platform support file (  changed to map smp_ops structure)  2.6.24-stable branch kernel ( where latest timer structure introduced) boots fine. 

[    0.000000] Linux version 2.6.37-rc1-pmc-00197-g5bfd3ba-dirty (paanoop1@paanoop1-desktop) (gcc version 4.5.1 (GCC) ) #168 SMP PREEMPT Wed Dec 8 19:19:490
[    0.000000] DSPRAM0: PA=1c100000,Size=00008000,enabled
[    0.000000] UART clock set to 50000000
[    0.000000] CPU revision is: 00019548 (MIPS 34Kc)
[    0.000000] Determined physical RAM map:
[    0.000000]  memory: 00001000 @ 00000000 (reserved)
[    0.000000]  memory: 000ff000 @ 00001000 (usable)
[    0.000000]  memory: 003f2000 @ 00100000 (reserved)
[    0.000000]  memory: 0fad9200 @ 004f2000 (usable)
[    0.000000] Wasting 32 bytes for tracking 1 unused pages
[    0.000000] Zone PFN ranges:
[    0.000000]   Normal   0x00000000 -> 0x0000ffcb
[    0.000000] Movable zone start PFN for each node
[    0.000000] early_node_map[1] active PFN ranges
[    0.000000]     0: 0x00000000 -> 0x0000ffcb
[    0.000000] 6 available secondary CPU TC(s)
[    0.000000] PERCPU: Embedded 7 pages/cpu @81203000 s6464 r8192 d14016 u32768
[    0.000000] pcpu-alloc: s6464 r8192 d14016 u32768 alloc=8*4096
[    0.000000] pcpu-alloc: [0] 0 [0] 1 [0] 2 [0] 3 [0] 4 [0] 5 [0] 6
[    0.000000] Built 1 zonelists in Zone order, mobility grouping on.  Total pages: 64971
[    0.000000] Kernel command line: console=ttyS0,57600
[    0.000000] PID hash table entries: 1024 (order: 0, 4096 bytes)
[    0.000000] Dentry cache hash table entries: 32768 (order: 5, 131072 bytes)
[    0.000000] Inode-cache hash table entries: 16384 (order: 4, 65536 bytes)
[    0.000000] Primary instruction cache 64kB, VIPT, 4-way, linesize 32 bytes.
[    0.000000] Primary data cache 64kB, 4-way, PIPT, no aliases, linesize 32 bytes
[    0.000000] Writing ErrCtl register=00000000
[    0.000000] Readback ErrCtl register=00000000
[    0.000000] Memory: 254360k/257888k available (3081k kernel code, 3528k reserved, 653k data, 200k init, 0k highmem)
[    0.000000] Preemptable hierarchical RCU implementation.
[    0.000000] NR_IRQS:128
[    0.000000] console [ttyS0] enabled
[    0.000000] Clock rate set to 600000000
[    0.000000] Calibrating delay loop...

Any idea to debug the issue ?.

Thanks,
Anoop

^ permalink raw reply	[flat|nested] 68+ messages in thread

end of thread, other threads:[~2011-01-13  7:53 UTC | newest]

Thread overview: 68+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-12-16 15:37 SMTC support status in latest git head STUART VENTERS
2010-12-16 15:37 ` STUART VENTERS
     [not found] ` <4D0A677C.6040104@paralogos.com>
2010-12-16 19:58   ` Kevin D. Kissell
2010-12-17 21:35     ` Kevin D. Kissell
2010-12-20 10:44       ` Anoop P A
     [not found]         ` <4D10F7A9.1020306@paralogos.com>
2010-12-21 20:06           ` Anoop P.A.
2010-12-21 20:06             ` Anoop P.A.
2010-12-21 20:29             ` Anoop P.A.
2010-12-21 20:29               ` Anoop P.A.
2010-12-22 10:27               ` Kevin D. Kissell
2010-12-22 11:35                 ` Anoop P A
2010-12-22 11:37                   ` Kevin D. Kissell
2010-12-22 11:51                     ` Anoop P A
2010-12-22 13:03                       ` Kevin D. Kissell
2010-12-22 16:34                         ` STUART VENTERS
2010-12-22 16:34                           ` STUART VENTERS
2010-12-23 21:09                         ` STUART VENTERS
2010-12-23 21:09                           ` STUART VENTERS
2010-12-24 12:32                           ` Kevin D. Kissell
2010-12-24 14:39                             ` Anoop P A
2010-12-24 14:53                               ` Kevin D. Kissell
2010-12-24 16:02                                 ` Anoop P A
2010-12-24 23:34                                   ` Kevin D. Kissell
2010-12-25  7:32                                     ` Anoop P A
2010-12-25 15:17                                       ` Kevin D. Kissell
2010-12-27 15:49                                     ` STUART VENTERS
2010-12-27 15:49                                       ` STUART VENTERS
2010-12-27 17:19                                       ` Anoop P A
2010-12-28  8:19                                         ` Anoop P A
2010-12-28  8:43                                           ` Kevin D. Kissell
2010-12-31 12:27                                             ` Anoop P A
2011-01-01  8:42                                               ` Kevin D. Kissell
2011-01-03 15:12                                                 ` Anoop P A
2011-01-03 16:14                                                   ` Kevin D. Kissell
2011-01-03 19:20                                                     ` Anoop P A
2011-01-04  8:17                                                       ` Kevin D. Kissell
2011-01-04 13:02                                                         ` Anoop P A
2011-01-04 14:37                                                           ` Anoop P A
2011-01-04 17:21                                                             ` Kevin D. Kissell
2011-01-04 17:54                                                               ` Anoop P A
2011-01-04 18:33                                                                 ` Kevin D. Kissell
2011-01-05 13:11                                                                   ` Anoop P A
2011-01-05 19:23                                                                     ` Kevin D. Kissell
2011-01-06 20:23                                                                       ` Anoop P A
2011-01-06 23:31                                                                         ` Kevin D. Kissell
2011-01-07  7:56                                                                           ` Anoop P A
2011-01-07 18:46                                                                             ` Kevin D. Kissell
2011-01-08 19:33                                                                               ` Anoop P A
2011-01-10 19:30                                                                             ` Kevin D. Kissell
2011-01-11  4:05                                                                               ` Anoop P A
2011-01-13  7:53                                                                               ` Kevin D. Kissell
2011-01-04 17:40                                                           ` Kevin D. Kissell
2011-01-05 13:09                                                             ` Anoop P A
  -- strict thread matches above, loose matches on Subject: below --
2010-12-14 21:27 STUART VENTERS
2010-12-14 21:27 ` STUART VENTERS
2010-12-14 23:01 ` Kevin D. Kissell
2010-12-08 13:48 Anoop P.A.
2010-12-08 13:48 ` Anoop P.A.
2010-12-09 17:07 ` Ralf Baechle
2010-12-09 18:52 ` Kevin D. Kissell
2010-12-14 15:25   ` Anoop P.A.
2010-12-14 15:25     ` Anoop P.A.
2010-12-14 18:32     ` Kevin D. Kissell
2010-12-14 18:50       ` Ralf Baechle
2010-12-15 19:18       ` Anoop P A
2010-12-15 19:58         ` Kevin D. Kissell
2010-12-16 13:03           ` Anoop P A
2010-12-16 18:43             ` Kevin D. Kissell

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.