* Re: SMTC support status in latest git head.
@ 2010-12-16 15:37 ` STUART VENTERS
0 siblings, 0 replies; 68+ messages in thread
From: STUART VENTERS @ 2010-12-16 15:37 UTC (permalink / raw)
To: kevink, anoop.pa; +Cc: linux-mips, Anoop_P.A
[-- Attachment #1: Type: text/plain, Size: 347 bytes --]
Two other possible clues:
The EVP is clear in the MVPControl register.
Does this say that only VPE0, T0 gets to run?
Also the EXCPT bits in VPEControl for VPE1 indicate a Gating Storage Exception dispatch.
But that seems to conflict with the EVP bit above.
Perhaps these are an artifact of getting to a good state to dump things out.
[-- Attachment #2: Type: text/html, Size: 966 bytes --]
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: SMTC support status in latest git head.
[not found] ` <4D0A677C.6040104@paralogos.com>
@ 2010-12-16 19:58 ` Kevin D. Kissell
2010-12-17 21:35 ` Kevin D. Kissell
0 siblings, 1 reply; 68+ messages in thread
From: Kevin D. Kissell @ 2010-12-16 19:58 UTC (permalink / raw)
To: STUART VENTERS; +Cc: anoop.pa, linux-mips, Anoop_P.A
Ralf tells me that this message got blocked by the LMO server due to
HTML content.
So here it is again, textier.
On 12/16/10 11:24, Kevin D. Kissell wrote:
> On 12/16/10 07:37, STUART VENTERS wrote:
>
> Two other possible clues:
>
> The EVP is clear in the MVPControl register.
> Does this say that only VPE0, T0 gets to run?
That's correct. In the maxtcs=1/maxvpes=1 boot state, it wouldn't
matter. It's just possible that setting EVP is conditional on more than
one VPE being used, but that's not the way I remember it.
> Also the EXCPT bits in VPEControl for VPE1 indicate a Gating Storage
> Exception dispatch.
> But that seems to conflict the EVP bit above.
I don't have a copy of the ASE spec handy to see whether those bits have
a defined power-on value, but particularly if maxvpes=1 was set at boot
time, I would expect VPE1's registers to be in a partly random power-up
state.
> Perhaps these are an artifact of getting to a good state to dump
> things out.
As per my previous mail, I looked at the MT register dump source, and it
really does pull values directly out of registers and doesn't depend on
having a sane kernel stack frame. The exceptions to that rule are the
reported value of TCStatus for the executing TC, which is based on the
perhaps-now-broken assumption that local_irq_save(flags) stores the
*entire* pre-invocation value of the TCStatus register in the flags
variable, and MVPControl, which is based on the assumption that dvpe()
returns the pre-invocation value of MVPControl. Break those assumptions,
and you'll get inconsistent state dumps like this, and very possibly
incorrect execution. That is particularly likely if what was done
effectively replaces the SMTC-specific implementation of
local_irq_save()/local_irq_restore() with something that uses the generic
MIPS32R2 atomic interrupt enable/disable instructions. That would have
been a *very* bad idea...
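To make the dvpe() assumption concrete, here is a minimal sketch in plain C of the bracket the dump relies on. It is a model, not kernel code: the register is an ordinary variable, the model_* names are invented for illustration, and EVP is taken to be bit 0 of MVPControl as in the MT ASE.

```c
#include <stdint.h>

/* EVP (Enable Virtual Processors), assumed to be bit 0 of MVPControl. */
#define MVPCONTROL_EVP 0x1u

/*
 * Model of dvpe(): clear EVP and hand back the pre-invocation value.
 * *mvpcontrol stands in for the real CP0 register (an assumption of
 * this sketch).
 */
uint32_t model_dvpe(uint32_t *mvpcontrol)
{
    uint32_t old = *mvpcontrol;

    *mvpcontrol = old & ~MVPCONTROL_EVP;
    return old;
}

/* Model of evpe(): set EVP again. */
void model_evpe(uint32_t *mvpcontrol)
{
    *mvpcontrol |= MVPCONTROL_EVP;
}

/*
 * Dump-style bracket: returns the MVPControl value that a dump built
 * on this assumption would report.
 */
uint32_t model_dump_bracket(uint32_t *mvpcontrol)
{
    uint32_t reported = model_dvpe(mvpcontrol);

    /* ... per-VPE and per-TC registers would be read here ... */

    if (reported & MVPCONTROL_EVP)
        model_evpe(mvpcontrol);  /* re-enable only if it was enabled */
    return reported;
}
```

Under this model, a dump that reports EVP clear means EVP was already clear before the dump started, provided dvpe() really does return the old value.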
Regards,
Kevin K.
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: SMTC support status in latest git head.
2010-12-16 19:58 ` Kevin D. Kissell
@ 2010-12-17 21:35 ` Kevin D. Kissell
2010-12-20 10:44 ` Anoop P A
0 siblings, 1 reply; 68+ messages in thread
From: Kevin D. Kissell @ 2010-12-17 21:35 UTC (permalink / raw)
To: anoop.pa; +Cc: STUART VENTERS, linux-mips, Anoop_P.A
So, Anoop, if you get a minute for this any time in the next day or so
(after which I'll have very limited net access until next year), could
you please do an <mumble>-mips<mumble>-objdump --disassemble of your
kernel image (or even just the mips-mt.o module) from a failing kernel
build and post the disassembly of mips_mt_regdump()? The confirmation
or refutation of the theory about local_irq_save() no longer being built
correctly for SMTC would be within the first few instructions...
/K.
On 12/16/10 11:58, Kevin D. Kissell wrote:
> Ralf tells me that this message got blocked by the LMO server due to
> HTML content.
> So here it is again, textier.
>
> On 12/16/10 11:24, Kevin D. Kissell wrote:
> > On 12/16/10 07:37, STUART VENTERS wrote:
> >
> > Two other possible clues:
> >
> > The EVP is clear in the MVPControl register.
> > Does this say that only VPE0, T0 gets to run?
>
> That's correct. In the maxtcs=1/maxvpes=1 boot state, it wouldn't
> matter. It's just possible that setting EVP is conditional on more
> than one VPE being used, but that's not the way I remember it.
>
> > Also the EXCPT bits in VPEControl for VPE1 indicate a Gating Storage
> > Exception dispatch.
> > But that seems to conflict the EVP bit above.
>
> I don't have a copy of the ASE spec handy to see whether those bits
> have a defined power-on value, but particularly if maxvpes=1 was set
> at boot time, I would expect VPE1's registers to be in a partly random
> power-up state.
>
> > Perhaps these are an artifact of getting to a good state to dump
> > things out.
>
> As per my previous mail, I looked at the MT register dump source, and
> it really does pull values directly
> out of registers and doesn't depend on having a sane kernel stack
> frame. The exceptions to that rule
> are the reported values for TCStatus of the executing TC, which is
> based on the perhaps-now-broken
> assumption that local_irq_save(flags) stores the *entire*
> pre-invocation value of the TCStatus register
> in the flags variable, and MVPcontrol, which is based on the
> assumption that dvpe() returns the pre-invocation
> value of MVPcontrol. Break those assumptions, and you'll get
> inconsistent state dumps like this,
> and very possibly incorrect execution. Particularly if what was done
> was that effectively replaces
> the SMTC-specific implementation of
> local_irq_save()/local_irq_restore() with something that uses
> the generic MIPS32R2 atomic interrupt enable/disable instructions.
> That would have been a *very* bad idea...
>
> Regards,
>
> Kevin K.
>
>
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: SMTC support status in latest git head.
2010-12-17 21:35 ` Kevin D. Kissell
@ 2010-12-20 10:44 ` Anoop P A
[not found] ` <4D10F7A9.1020306@paralogos.com>
0 siblings, 1 reply; 68+ messages in thread
From: Anoop P A @ 2010-12-20 10:44 UTC (permalink / raw)
To: Kevin D. Kissell; +Cc: STUART VENTERS, linux-mips, Anoop_P.A
Hi Kevin,
Please find the disassembly of mips_mt_regdump() below.
Thanks
Anoop
Disassembly of section .text:
00000000 <mips_mt_regdump>:
0: 27bdffb8 addiu sp,sp,-72
4: 00802821 move a1,a0
8: afbf0044 sw ra,68(sp)
c: afbe0040 sw s8,64(sp)
10: afb7003c sw s7,60(sp)
14: afb60038 sw s6,56(sp)
18: afb50034 sw s5,52(sp)
1c: afb40030 sw s4,48(sp)
20: afb3002c sw s3,44(sp)
24: afb20028 sw s2,40(sp)
28: afb10024 sw s1,36(sp)
2c: afb00020 sw s0,32(sp)
30: 40141001 mfc0 s4,c0_tcstatus
34: 36810400 ori at,s4,0x400
38: 40811001 mtc0 at,c0_tcstatus
3c: 32940400 andi s4,s4,0x400
40: 000000c0 ehb
44: 41610001 dvpe at
48: 0020a821 move s5,at
4c: 000000c0 ehb
50: 3c020000 lui v0,0x0
54: 24420060 addiu v0,v0,96
58: 00400408 jr.hb v0
5c: 00000000 nop
60: 3c040000 lui a0,0x0
64: 24840000 addiu a0,a0,0
68: 0c000000 jal 0 <mips_mt_regdump>
6c: afa50010 sw a1,16(sp)
70: 3c040000 lui a0,0x0
74: 0c000000 jal 0 <mips_mt_regdump>
78: 24840000 addiu a0,a0,0
7c: 8fa50010 lw a1,16(sp)
80: 3c040000 lui a0,0x0
84: 0c000000 jal 0 <mips_mt_regdump>
88: 24840000 addiu a0,a0,0
8c: 3c040000 lui a0,0x0
90: 24840000 addiu a0,a0,0
94: 0c000000 jal 0 <mips_mt_regdump>
98: 02a02821 move a1,s5
9c: 40110002 mfc0 s1,c0_mvpconf0
a0: 3c040000 lui a0,0x0
a4: 02202821 move a1,s1
a8: 0c000000 jal 0 <mips_mt_regdump>
ac: 24840000 addiu a0,a0,0
b0: 3c040000 lui a0,0x0
b4: 0c000000 jal 0 <mips_mt_regdump>
b8: 24840000 addiu a0,a0,0
bc: 7e331a80 ext s3,s1,0xa,0x4
c0: 3c090000 lui t1,0x0
c4: 323100ff andi s1,s1,0xff
c8: 3c080000 lui t0,0x0
cc: 3c030000 lui v1,0x0
d0: 3c1e0000 lui s8,0x0
d4: 3c170000 lui s7,0x0
d8: 3c160000 lui s6,0x0
dc: 3c0a0000 lui t2,0x0
e0: 26730001 addiu s3,s3,1
e4: 26310001 addiu s1,s1,1
e8: 00008021 move s0,zero
ec: 2412ff00 li s2,-256
f0: 25290000 addiu t1,t1,0
f4: 25080000 addiu t0,t0,0
f8: 24630000 addiu v1,v1,0
fc: 27de0000 addiu s8,s8,0
100: 26f70000 addiu s7,s7,0
104: 26d60000 addiu s6,s6,0
108: 254a0000 addiu t2,t2,0
10c: 00001021 move v0,zero
110: 40040801 mfc0 a0,c0_vpecontrol
114: 00922024 and a0,a0,s2
118: 00442025 or a0,v0,a0
11c: 40840801 mtc0 a0,c0_vpecontrol
120: 000000c0 ehb
124: 41020802 mftc0 at,c0_tcbind
128: 00202021 move a0,at
12c: 24420001 addiu v0,v0,1
130: 3084000f andi a0,a0,0xf
134: 12040031 beq s0,a0,1fc <mips_mt_regdump+0x1fc>
138: 0051282a slt a1,v0,s1
13c: 14a0fff4 bnez a1,110 <mips_mt_regdump+0x110>
140: 00000000 nop
144: 26100001 addiu s0,s0,1
148: 0213102a slt v0,s0,s3
14c: 1440fff0 bnez v0,110 <mips_mt_regdump+0x110>
150: 00001021 move v0,zero
154: 3c040000 lui a0,0x0
158: 24840000 addiu a0,a0,0
15c: 3c1e0000 lui s8,0x0
160: 3c170000 lui s7,0x0
164: 3c160000 lui s6,0x0
168: 3c130000 lui s3,0x0
16c: 0c000000 jal 0 <mips_mt_regdump>
170: 3c120000 lui s2,0x0
174: 00008021 move s0,zero
178: 27de0000 addiu s8,s8,0
17c: 26f70000 addiu s7,s7,0
180: 26d60000 addiu s6,s6,0
184: 26730000 addiu s3,s3,0
188: 26520000 addiu s2,s2,0
18c: 40020801 mfc0 v0,c0_vpecontrol
190: 2403ff00 li v1,-256
194: 00431024 and v0,v0,v1
198: 02021025 or v0,s0,v0
19c: 40820801 mtc0 v0,c0_vpecontrol
1a0: 000000c0 ehb
1a4: 41020802 mftc0 at,c0_tcbind
1a8: 00201821 move v1,at
1ac: 40021002 mfc0 v0,c0_tcbind
1b0: 1062003f beq v1,v0,2b0 <mips_mt_regdump+0x2b0>
1b4: 00000000 nop
1b8: 41020804 mftc0 at,c0_tchalt
1bc: 00201821 move v1,at
1c0: 24020001 li v0,1
1c4: 00400821 move at,v0
1c8: 41811004 mttc0 at,c0_tchalt
1cc: 41020801 mftc0 at,c0_tcstatus
1d0: 00203021 move a2,at
1d4: 3c040000 lui a0,0x0
1d8: 02002821 move a1,s0
1dc: 24840000 addiu a0,a0,0
1e0: afa3001c sw v1,28(sp)
1e4: 0c000000 jal 0 <mips_mt_regdump>
1e8: afa60010 sw a2,16(sp)
1ec: 8fa60010 lw a2,16(sp)
1f0: 8fa3001c lw v1,28(sp)
1f4: 080000b2 j 2c8 <mips_mt_regdump+0x2c8>
1f8: 00c02821 move a1,a2
1fc: 01202021 move a0,t1
200: 02002821 move a1,s0
204: afa3001c sw v1,28(sp)
208: afa80014 sw t0,20(sp)
20c: afa90010 sw t1,16(sp)
210: 0c000000 jal 0 <mips_mt_regdump>
214: afaa0018 sw t2,24(sp)
218: 41010801 mftc0 at,c0_vpecontrol
21c: 00202821 move a1,at
220: 8fa80014 lw t0,20(sp)
224: 0c000000 jal 0 <mips_mt_regdump>
228: 01002021 move a0,t0
22c: 41010802 mftc0 at,c0_vpeconf0
230: 00202821 move a1,at
234: 8fa3001c lw v1,28(sp)
238: 0c000000 jal 0 <mips_mt_regdump>
23c: 00602021 move a0,v1
240: 410c0800 mftc0 at,c0_status
244: 00203021 move a2,at
248: 03c02021 move a0,s8
24c: 0c000000 jal 0 <mips_mt_regdump>
250: 02002821 move a1,s0
254: 410e0800 mftc0 at,c0_epc
258: 00203021 move a2,at
25c: 410e0800 mftc0 at,c0_epc
260: 00203821 move a3,at
264: 02e02021 move a0,s7
268: 0c000000 jal 0 <mips_mt_regdump>
26c: 02002821 move a1,s0
270: 410d0800 mftc0 at,c0_cause
274: 00203021 move a2,at
278: 02c02021 move a0,s6
27c: 0c000000 jal 0 <mips_mt_regdump>
280: 02002821 move a1,s0
284: 41100807 mftc0 at,$16,7
288: 00203021 move a2,at
28c: 8faa0018 lw t2,24(sp)
290: 02002821 move a1,s0
294: 0c000000 jal 0 <mips_mt_regdump>
298: 01402021 move a0,t2
29c: 8fa3001c lw v1,28(sp)
2a0: 8fa80014 lw t0,20(sp)
2a4: 8fa90010 lw t1,16(sp)
2a8: 08000051 j 144 <mips_mt_regdump+0x144>
2ac: 8faa0018 lw t2,24(sp)
2b0: 3c040000 lui a0,0x0
2b4: 02002821 move a1,s0
2b8: 0c000000 jal 0 <mips_mt_regdump>
2bc: 24840000 addiu a0,a0,0
2c0: 00001821 move v1,zero
2c4: 02802821 move a1,s4
2c8: 03c02021 move a0,s8
2cc: 0c000000 jal 0 <mips_mt_regdump>
2d0: afa3001c sw v1,28(sp)
2d4: 41020802 mftc0 at,c0_tcbind
2d8: 00202821 move a1,at
2dc: 0c000000 jal 0 <mips_mt_regdump>
2e0: 02e02021 move a0,s7
2e4: 41020803 mftc0 at,c0_tcrestart
2e8: 00202821 move a1,at
2ec: 41020803 mftc0 at,c0_tcrestart
2f0: 00203021 move a2,at
2f4: 0c000000 jal 0 <mips_mt_regdump>
2f8: 02c02021 move a0,s6
2fc: 8fa3001c lw v1,28(sp)
300: 02602021 move a0,s3
304: 0c000000 jal 0 <mips_mt_regdump>
308: 00602821 move a1,v1
30c: 41020805 mftc0 at,c0_tccontext
310: 00202821 move a1,at
314: 0c000000 jal 0 <mips_mt_regdump>
318: 02402021 move a0,s2
31c: 8fa3001c lw v1,28(sp)
320: 14600003 bnez v1,330 <mips_mt_regdump+0x330>
324: 00001021 move v0,zero
328: 00400821 move at,v0
32c: 41811004 mttc0 at,c0_tchalt
330: 26100001 addiu s0,s0,1
334: 0211102a slt v0,s0,s1
338: 1440ff94 bnez v0,18c <mips_mt_regdump+0x18c>
33c: 00000000 nop
340: 0c000000 jal 0 <mips_mt_regdump>
344: 32b50001 andi s5,s5,0x1
348: 3c040000 lui a0,0x0
34c: 0c000000 jal 0 <mips_mt_regdump>
350: 24840000 addiu a0,a0,0
354: 12a00004 beqz s5,368 <mips_mt_regdump+0x368>
358: 32820400 andi v0,s4,0x400
35c: 41600021 evpe
360: 000000c0 ehb
364: 32820400 andi v0,s4,0x400
368: 14400003 bnez v0,378 <mips_mt_regdump+0x378>
36c: 00000000 nop
370: 0c000000 jal 0 <mips_mt_regdump>
374: 00000000 nop
378: 40011001 mfc0 at,c0_tcstatus
37c: 32940400 andi s4,s4,0x400
380: 34210400 ori at,at,0x400
384: 38210400 xori at,at,0x400
388: 0281a025 or s4,s4,at
38c: 40941001 mtc0 s4,c0_tcstatus
390: 000000c0 ehb
394: 8fbf0044 lw ra,68(sp)
398: 8fbe0040 lw s8,64(sp)
39c: 8fb7003c lw s7,60(sp)
3a0: 8fb60038 lw s6,56(sp)
3a4: 8fb50034 lw s5,52(sp)
3a8: 8fb40030 lw s4,48(sp)
3ac: 8fb3002c lw s3,44(sp)
3b0: 8fb20028 lw s2,40(sp)
3b4: 8fb10024 lw s1,36(sp)
3b8: 8fb00020 lw s0,32(sp)
3bc: 03e00008 jr ra
3c0: 27bd0048 addiu sp,sp,72
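For readers following the listing, the instructions at offsets 9c, bc, c4, e0 and e4 appear to derive the two loop bounds from MVPConf0. A rough C equivalent is sketched below. The field layout is inferred from the "andi s1,s1,0xff" and "ext s3,s1,0xa,0x4" operands, and the macro and function names are made up for this sketch.

```c
#include <stdint.h>

/* Layout inferred from the listing, not copied from a header. */
#define MVPCONF0_PTC_MASK   0xffu  /* bits 7:0, highest TC index */
#define MVPCONF0_PVPE_SHIFT 10     /* bits 13:10, highest VPE index */
#define MVPCONF0_PVPE_MASK  0xfu

/* Number of TCs: PTC field plus one (offsets c4 and e4). */
unsigned int mvpconf0_num_tcs(uint32_t mvpconf0)
{
    return (mvpconf0 & MVPCONF0_PTC_MASK) + 1;
}

/* Number of VPEs: PVPE field plus one (offsets bc and e0). */
unsigned int mvpconf0_num_vpes(uint32_t mvpconf0)
{
    return ((mvpconf0 >> MVPCONF0_PVPE_SHIFT) & MVPCONF0_PVPE_MASK) + 1;
}
```

As an illustration with a made-up register value, 0x404 would decode to 5 TCs and 2 VPEs.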
On Sat, Dec 18, 2010 at 3:05 AM, Kevin D. Kissell <kevink@paralogos.com> wrote:
> So, Anoop, if you get a minute for this any time in the next day or so
> (after which I'll have very limited net access until next year), could you
> please do an <mumble>-mips<mumble>-objdump --disassemble of your kernel
> image (or even just the mips-mt.o module) from a failing kernel build and
> post the disassembly of mips_mt_regdump()? The confirmation or refutation
> of the theory about local_irq_save() no longer being built correctly for
> SMTC would be within the first few instructions...
>
> /K.
>
>
> On 12/16/10 11:58, Kevin D. Kissell wrote:
>>
>> Ralf tells me that this message got blocked by the LMO server due to HTML
>> content.
>> So here it is again, textier.
>>
>> On 12/16/10 11:24, Kevin D. Kissell wrote:
>> > On 12/16/10 07:37, STUART VENTERS wrote:
>> >
>> > Two other possible clues:
>> >
>> > The EVP is clear in the MVPControl register.
>> > Does this say that only VPE0, T0 gets to run?
>>
>> That's correct. In the maxtcs=1/maxvpes=1 boot state, it wouldn't matter.
>> It's just possible that setting EVP is conditional on more than one VPE
>> being used, but that's not the way I remember it.
>>
>> > Also the EXCPT bits in VPEControl for VPE1 indicate a Gating Storage
>> > Exception dispatch.
>> > But that seems to conflict the EVP bit above.
>>
>> I don't have a copy of the ASE spec handy to see whether those bits have a
>> defined power-on value, but particularly if maxvpes=1 was set at boot time,
>> I would expect VPE1's registers to be in a partly random power-up state.
>>
>> > Perhaps these are an artifact of getting to a good state to dump things
>> > out.
>>
>> As per my previous mail, I looked at the MT register dump source, and it
>> really does pull values directly
>> out of registers and doesn't depend on having a sane kernel stack frame.
>> The exceptions to that rule
>> are the reported values for TCStatus of the executing TC, which is based
>> on the perhaps-now-broken
>> assumption that local_irq_save(flags) stores the *entire* pre-invocation
>> value of the TCStatus register
>> in the flags variable, and MVPcontrol, which is based on the assumption
>> that dvpe() returns the pre-invocation
>> value of MVPcontrol. Break those assumptions, and you'll get inconsistent
>> state dumps like this,
>> and very possibly incorrect execution. Particularly if what was done was
>> that effectively replaces
>> the SMTC-specific implementation of local_irq_save()/local_irq_restore()
>> with something that uses
>> the generic MIPS32R2 atomic interrupt enable/disable instructions. That
>> would have been a *very* bad idea...
>>
>> Regards,
>>
>> Kevin K.
>>
>>
>
>
^ permalink raw reply [flat|nested] 68+ messages in thread
* RE: SMTC support status in latest git head.
@ 2010-12-21 20:06 ` Anoop P.A.
0 siblings, 0 replies; 68+ messages in thread
From: Anoop P.A. @ 2010-12-21 20:06 UTC (permalink / raw)
To: Kevin D. Kissell, Anoop P A; +Cc: STUART VENTERS, linux-mips
OK. I will check it.
BTW, the following patch is responsible for the irq change:
http://git.linux-mips.org/?p=linux.git;a=commitdiff;h=df9ee29270c11dba7d0fe0b83ce47a4d8e8d2101
Thanks
Anoop
________________________________________
From: Kevin D. Kissell [mailto:kevink@paralogos.com]
Sent: Wednesday, December 22, 2010 12:23 AM
To: Anoop P A
Cc: STUART VENTERS; linux-mips@linux-mips.org; Anoop P.A.
Subject: Re: SMTC support status in latest git head.
OK, I see why the MT register dump isn't giving us useful information. It's not clear that it's at the root of your functional problems, though.
Apparently, somebody decided that it was unwholesome to propagate anything other than the previous interrupt enable state in the flags variable passed between irq_save() and irq_restore(). I agree philosophically, but it does break the MT register dump function. And I'm quite sure that there were other bits of SMTC code that knew that it was a TCStatus value, at least in the earliest versions of the code.
I'm not a gitweb power user, and I haven't been able to figure out how to determine when the "andi \result 0x400" on or about line 138 of irqflags.h (at least that's where it is in the head of tree) was checked in. If it's at the boundary between working and non-working versions for SMTC, it might be the cause of the problems, but it may well not be responsible for anything other than the problem with reporting the value in the MT register dump, which really ought to be fixed.
I'm in a small village in France for the holidays with no git/build system at my disposal, but I think that tweaking mips-mt.c at line 103 to change
tcstatval = flags; /* And pre-dump TCStatus is flags */
to something more like
/* Pre-dump TCStatus Interrupt Inhibit bit is in flags variable */
tcstatval = (read_c0_tcstatus() & ~0x400) | flags;
should fix the dump.
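The proposed change can be checked outside the kernel with a small model. The sketch below is plain C with the register replaced by an ordinary variable. The names model_irq_save() and pre_dump_tcstatus() are hypothetical, and 0x400 is the Interrupt Inhibit mask quoted above.

```c
#include <stdint.h>

#define TCSTATUS_IXMT 0x400u  /* Interrupt Inhibit bit, per the mask above */

/*
 * Model of the masked save: set IXMT in the register and return only
 * the previous IXMT bit, as the "andi ... 0x400" implies.
 * *tcstatus stands in for the real CP0 register.
 */
uint32_t model_irq_save(uint32_t *tcstatus)
{
    uint32_t old = *tcstatus;

    *tcstatus = old | TCSTATUS_IXMT;
    return old & TCSTATUS_IXMT;
}

/*
 * The suggested reconstruction: take the current register value, drop
 * the IXMT bit that the save forced on, and put back the saved bit.
 */
uint32_t pre_dump_tcstatus(uint32_t tcstatus_now, uint32_t flags)
{
    return (tcstatus_now & ~TCSTATUS_IXMT) | flags;
}
```

In this model the reconstructed value equals the pre-save register value whether IXMT was set or clear beforehand. That holds only on the assumption that no other TCStatus bit changes between the save and the read.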
Regards,
Kevin K.
On 12/20/10 2:44 AM, Anoop P A wrote:
Hi Kevin,
Please find disassembly for mips_mt_reg_dump
Thanks
Anoop
Disassembly of section .text:
00000000 <mips_mt_regdump>:
0: 27bdffb8 addiu sp,sp,-72
4: 00802821 move a1,a0
8: afbf0044 sw ra,68(sp)
c: afbe0040 sw s8,64(sp)
10: afb7003c sw s7,60(sp)
14: afb60038 sw s6,56(sp)
18: afb50034 sw s5,52(sp)
1c: afb40030 sw s4,48(sp)
20: afb3002c sw s3,44(sp)
24: afb20028 sw s2,40(sp)
28: afb10024 sw s1,36(sp)
2c: afb00020 sw s0,32(sp)
30: 40141001 mfc0 s4,c0_tcstatus
34: 36810400 ori at,s4,0x400
38: 40811001 mtc0 at,c0_tcstatus
3c: 32940400 andi s4,s4,0x400
40: 000000c0 ehb
44: 41610001 dvpe at
48: 0020a821 move s5,at
4c: 000000c0 ehb
50: 3c020000 lui v0,0x0
54: 24420060 addiu v0,v0,96
58: 00400408 jr.hb v0
5c: 00000000 nop
60: 3c040000 lui a0,0x0
64: 24840000 addiu a0,a0,0
68: 0c000000 jal 0 <mips_mt_regdump>
6c: afa50010 sw a1,16(sp)
70: 3c040000 lui a0,0x0
74: 0c000000 jal 0 <mips_mt_regdump>
78: 24840000 addiu a0,a0,0
7c: 8fa50010 lw a1,16(sp)
80: 3c040000 lui a0,0x0
84: 0c000000 jal 0 <mips_mt_regdump>
88: 24840000 addiu a0,a0,0
8c: 3c040000 lui a0,0x0
90: 24840000 addiu a0,a0,0
94: 0c000000 jal 0 <mips_mt_regdump>
98: 02a02821 move a1,s5
9c: 40110002 mfc0 s1,c0_mvpconf0
a0: 3c040000 lui a0,0x0
a4: 02202821 move a1,s1
a8: 0c000000 jal 0 <mips_mt_regdump>
ac: 24840000 addiu a0,a0,0
b0: 3c040000 lui a0,0x0
b4: 0c000000 jal 0 <mips_mt_regdump>
b8: 24840000 addiu a0,a0,0
bc: 7e331a80 ext s3,s1,0xa,0x4
c0: 3c090000 lui t1,0x0
c4: 323100ff andi s1,s1,0xff
c8: 3c080000 lui t0,0x0
cc: 3c030000 lui v1,0x0
d0: 3c1e0000 lui s8,0x0
d4: 3c170000 lui s7,0x0
d8: 3c160000 lui s6,0x0
dc: 3c0a0000 lui t2,0x0
e0: 26730001 addiu s3,s3,1
e4: 26310001 addiu s1,s1,1
e8: 00008021 move s0,zero
ec: 2412ff00 li s2,-256
f0: 25290000 addiu t1,t1,0
f4: 25080000 addiu t0,t0,0
f8: 24630000 addiu v1,v1,0
fc: 27de0000 addiu s8,s8,0
100: 26f70000 addiu s7,s7,0
104: 26d60000 addiu s6,s6,0
108: 254a0000 addiu t2,t2,0
10c: 00001021 move v0,zero
110: 40040801 mfc0 a0,c0_vpecontrol
114: 00922024 and a0,a0,s2
118: 00442025 or a0,v0,a0
11c: 40840801 mtc0 a0,c0_vpecontrol
120: 000000c0 ehb
124: 41020802 mftc0 at,c0_tcbind
128: 00202021 move a0,at
12c: 24420001 addiu v0,v0,1
130: 3084000f andi a0,a0,0xf
134: 12040031 beq s0,a0,1fc <mips_mt_regdump+0x1fc>
138: 0051282a slt a1,v0,s1
13c: 14a0fff4 bnez a1,110 <mips_mt_regdump+0x110>
140: 00000000 nop
144: 26100001 addiu s0,s0,1
148: 0213102a slt v0,s0,s3
14c: 1440fff0 bnez v0,110 <mips_mt_regdump+0x110>
150: 00001021 move v0,zero
154: 3c040000 lui a0,0x0
158: 24840000 addiu a0,a0,0
15c: 3c1e0000 lui s8,0x0
160: 3c170000 lui s7,0x0
164: 3c160000 lui s6,0x0
168: 3c130000 lui s3,0x0
16c: 0c000000 jal 0 <mips_mt_regdump>
170: 3c120000 lui s2,0x0
174: 00008021 move s0,zero
178: 27de0000 addiu s8,s8,0
17c: 26f70000 addiu s7,s7,0
180: 26d60000 addiu s6,s6,0
184: 26730000 addiu s3,s3,0
188: 26520000 addiu s2,s2,0
18c: 40020801 mfc0 v0,c0_vpecontrol
190: 2403ff00 li v1,-256
194: 00431024 and v0,v0,v1
198: 02021025 or v0,s0,v0
19c: 40820801 mtc0 v0,c0_vpecontrol
1a0: 000000c0 ehb
1a4: 41020802 mftc0 at,c0_tcbind
1a8: 00201821 move v1,at
1ac: 40021002 mfc0 v0,c0_tcbind
1b0: 1062003f beq v1,v0,2b0 <mips_mt_regdump+0x2b0>
1b4: 00000000 nop
1b8: 41020804 mftc0 at,c0_tchalt
1bc: 00201821 move v1,at
1c0: 24020001 li v0,1
1c4: 00400821 move at,v0
1c8: 41811004 mttc0 at,c0_tchalt
1cc: 41020801 mftc0 at,c0_tcstatus
1d0: 00203021 move a2,at
1d4: 3c040000 lui a0,0x0
1d8: 02002821 move a1,s0
1dc: 24840000 addiu a0,a0,0
1e0: afa3001c sw v1,28(sp)
1e4: 0c000000 jal 0 <mips_mt_regdump>
1e8: afa60010 sw a2,16(sp)
1ec: 8fa60010 lw a2,16(sp)
1f0: 8fa3001c lw v1,28(sp)
1f4: 080000b2 j 2c8 <mips_mt_regdump+0x2c8>
1f8: 00c02821 move a1,a2
1fc: 01202021 move a0,t1
200: 02002821 move a1,s0
204: afa3001c sw v1,28(sp)
208: afa80014 sw t0,20(sp)
20c: afa90010 sw t1,16(sp)
210: 0c000000 jal 0 <mips_mt_regdump>
214: afaa0018 sw t2,24(sp)
218: 41010801 mftc0 at,c0_vpecontrol
21c: 00202821 move a1,at
220: 8fa80014 lw t0,20(sp)
224: 0c000000 jal 0 <mips_mt_regdump>
228: 01002021 move a0,t0
22c: 41010802 mftc0 at,c0_vpeconf0
230: 00202821 move a1,at
234: 8fa3001c lw v1,28(sp)
238: 0c000000 jal 0 <mips_mt_regdump>
23c: 00602021 move a0,v1
240: 410c0800 mftc0 at,c0_status
244: 00203021 move a2,at
248: 03c02021 move a0,s8
24c: 0c000000 jal 0 <mips_mt_regdump>
250: 02002821 move a1,s0
254: 410e0800 mftc0 at,c0_epc
258: 00203021 move a2,at
25c: 410e0800 mftc0 at,c0_epc
260: 00203821 move a3,at
264: 02e02021 move a0,s7
268: 0c000000 jal 0 <mips_mt_regdump>
26c: 02002821 move a1,s0
270: 410d0800 mftc0 at,c0_cause
274: 00203021 move a2,at
278: 02c02021 move a0,s6
27c: 0c000000 jal 0 <mips_mt_regdump>
280: 02002821 move a1,s0
284: 41100807 mftc0 at,$16,7
288: 00203021 move a2,at
28c: 8faa0018 lw t2,24(sp)
290: 02002821 move a1,s0
294: 0c000000 jal 0 <mips_mt_regdump>
298: 01402021 move a0,t2
29c: 8fa3001c lw v1,28(sp)
2a0: 8fa80014 lw t0,20(sp)
2a4: 8fa90010 lw t1,16(sp)
2a8: 08000051 j 144 <mips_mt_regdump+0x144>
2ac: 8faa0018 lw t2,24(sp)
2b0: 3c040000 lui a0,0x0
2b4: 02002821 move a1,s0
2b8: 0c000000 jal 0 <mips_mt_regdump>
2bc: 24840000 addiu a0,a0,0
2c0: 00001821 move v1,zero
2c4: 02802821 move a1,s4
2c8: 03c02021 move a0,s8
2cc: 0c000000 jal 0 <mips_mt_regdump>
2d0: afa3001c sw v1,28(sp)
2d4: 41020802 mftc0 at,c0_tcbind
2d8: 00202821 move a1,at
2dc: 0c000000 jal 0 <mips_mt_regdump>
2e0: 02e02021 move a0,s7
2e4: 41020803 mftc0 at,c0_tcrestart
2e8: 00202821 move a1,at
2ec: 41020803 mftc0 at,c0_tcrestart
2f0: 00203021 move a2,at
2f4: 0c000000 jal 0 <mips_mt_regdump>
2f8: 02c02021 move a0,s6
2fc: 8fa3001c lw v1,28(sp)
300: 02602021 move a0,s3
304: 0c000000 jal 0 <mips_mt_regdump>
308: 00602821 move a1,v1
30c: 41020805 mftc0 at,c0_tccontext
310: 00202821 move a1,at
314: 0c000000 jal 0 <mips_mt_regdump>
318: 02402021 move a0,s2
31c: 8fa3001c lw v1,28(sp)
320: 14600003 bnez v1,330 <mips_mt_regdump+0x330>
324: 00001021 move v0,zero
328: 00400821 move at,v0
32c: 41811004 mttc0 at,c0_tchalt
330: 26100001 addiu s0,s0,1
334: 0211102a slt v0,s0,s1
338: 1440ff94 bnez v0,18c <mips_mt_regdump+0x18c>
33c: 00000000 nop
340: 0c000000 jal 0 <mips_mt_regdump>
344: 32b50001 andi s5,s5,0x1
348: 3c040000 lui a0,0x0
34c: 0c000000 jal 0 <mips_mt_regdump>
350: 24840000 addiu a0,a0,0
354: 12a00004 beqz s5,368 <mips_mt_regdump+0x368>
358: 32820400 andi v0,s4,0x400
35c: 41600021 evpe
360: 000000c0 ehb
364: 32820400 andi v0,s4,0x400
368: 14400003 bnez v0,378 <mips_mt_regdump+0x378>
36c: 00000000 nop
370: 0c000000 jal 0 <mips_mt_regdump>
374: 00000000 nop
378: 40011001 mfc0 at,c0_tcstatus
37c: 32940400 andi s4,s4,0x400
380: 34210400 ori at,at,0x400
384: 38210400 xori at,at,0x400
388: 0281a025 or s4,s4,at
38c: 40941001 mtc0 s4,c0_tcstatus
390: 000000c0 ehb
394: 8fbf0044 lw ra,68(sp)
398: 8fbe0040 lw s8,64(sp)
39c: 8fb7003c lw s7,60(sp)
3a0: 8fb60038 lw s6,56(sp)
3a4: 8fb50034 lw s5,52(sp)
3a8: 8fb40030 lw s4,48(sp)
3ac: 8fb3002c lw s3,44(sp)
3b0: 8fb20028 lw s2,40(sp)
3b4: 8fb10024 lw s1,36(sp)
3b8: 8fb00020 lw s0,32(sp)
3bc: 03e00008 jr ra
3c0: 27bd0048 addiu sp,sp,72
On Sat, Dec 18, 2010 at 3:05 AM, Kevin D. Kissell <kevink@paralogos.com> wrote:
So, Anoop, if you get a minute for this any time in the next day or so
(after which I'll have very limited net access until next year), could you
please do an <mumble>-mips<mumble>-objdump --disassemble of your kernel
image (or even just the mips-mt.o module) from a failing kernel build and
post the disassembly of mips_mt_regdump()? The confirmation or refutation
of the theory about local_irq_save() no longer being built correctly for
SMTC would be within the first few instructions...
/K.
On 12/16/10 11:58, Kevin D. Kissell wrote:
Ralf tells me that this message got blocked by the LMO server due to HTML
content.
So here it is again, textier.
On 12/16/10 11:24, Kevin D. Kissell wrote:
On 12/16/10 07:37, STUART VENTERS wrote:
Two other possible clues:
The EVP is clear in the MVPControl register.
Does this say that only VPE0, T0 gets to run?
That's correct. In the maxtcs=1/maxvpes=1 boot state, it wouldn't matter.
It's just possible that setting EVP is conditional on more than one VPE
being used, but that's not the way I remember it.
Also the EXCPT bits in VPEControl for VPE1 indicate a Gating Storage
Exception dispatch.
But that seems to conflict the EVP bit above.
I don't have a copy of the ASE spec handy to see whether those bits have a
defined power-on value, but particularly if maxvpes=1 was set at boot time,
I would expect VPE1's registers to be in a partly random power-up state.
Perhaps these are an artifact of getting to a good state to dump things
out.
As per my previous mail, I looked at the MT register dump source, and it
really does pull values directly
out of registers and doesn't depend on having a sane kernel stack frame.
The exceptions to that rule
are the reported values for TCStatus of the executing TC, which is based
on the perhaps-now-broken
assumption that local_irq_save(flags) stores the *entire* pre-invocation
value of the TCStatus register
in the flags variable, and MVPcontrol, which is based on the assumption
that dvpe() returns the pre-invocation
value of MVPcontrol. Break those assumptions, and you'll get inconsistent
state dumps like this,
and very possibly incorrect execution. Particularly if what was done was
that effectively replaces
the SMTC-specific implementation of local_irq_save()/local_irq_restore()
with something that uses
the generic MIPS32R2 atomic interrupt enable/disable instructions. That
would have been a *very* bad idea...
Regards,
Kevin K.
^ permalink raw reply [flat|nested] 68+ messages in thread
1c8: 41811004 mttc0 at,c0_tchalt
1cc: 41020801 mftc0 at,c0_tcstatus
1d0: 00203021 move a2,at
1d4: 3c040000 lui a0,0x0
1d8: 02002821 move a1,s0
1dc: 24840000 addiu a0,a0,0
1e0: afa3001c sw v1,28(sp)
1e4: 0c000000 jal 0 <mips_mt_regdump>
1e8: afa60010 sw a2,16(sp)
1ec: 8fa60010 lw a2,16(sp)
1f0: 8fa3001c lw v1,28(sp)
1f4: 080000b2 j 2c8 <mips_mt_regdump+0x2c8>
1f8: 00c02821 move a1,a2
1fc: 01202021 move a0,t1
200: 02002821 move a1,s0
204: afa3001c sw v1,28(sp)
208: afa80014 sw t0,20(sp)
20c: afa90010 sw t1,16(sp)
210: 0c000000 jal 0 <mips_mt_regdump>
214: afaa0018 sw t2,24(sp)
218: 41010801 mftc0 at,c0_vpecontrol
21c: 00202821 move a1,at
220: 8fa80014 lw t0,20(sp)
224: 0c000000 jal 0 <mips_mt_regdump>
228: 01002021 move a0,t0
22c: 41010802 mftc0 at,c0_vpeconf0
230: 00202821 move a1,at
234: 8fa3001c lw v1,28(sp)
238: 0c000000 jal 0 <mips_mt_regdump>
23c: 00602021 move a0,v1
240: 410c0800 mftc0 at,c0_status
244: 00203021 move a2,at
248: 03c02021 move a0,s8
24c: 0c000000 jal 0 <mips_mt_regdump>
250: 02002821 move a1,s0
254: 410e0800 mftc0 at,c0_epc
258: 00203021 move a2,at
25c: 410e0800 mftc0 at,c0_epc
260: 00203821 move a3,at
264: 02e02021 move a0,s7
268: 0c000000 jal 0 <mips_mt_regdump>
26c: 02002821 move a1,s0
270: 410d0800 mftc0 at,c0_cause
274: 00203021 move a2,at
278: 02c02021 move a0,s6
27c: 0c000000 jal 0 <mips_mt_regdump>
280: 02002821 move a1,s0
284: 41100807 mftc0 at,$16,7
288: 00203021 move a2,at
28c: 8faa0018 lw t2,24(sp)
290: 02002821 move a1,s0
294: 0c000000 jal 0 <mips_mt_regdump>
298: 01402021 move a0,t2
29c: 8fa3001c lw v1,28(sp)
2a0: 8fa80014 lw t0,20(sp)
2a4: 8fa90010 lw t1,16(sp)
2a8: 08000051 j 144 <mips_mt_regdump+0x144>
2ac: 8faa0018 lw t2,24(sp)
2b0: 3c040000 lui a0,0x0
2b4: 02002821 move a1,s0
2b8: 0c000000 jal 0 <mips_mt_regdump>
2bc: 24840000 addiu a0,a0,0
2c0: 00001821 move v1,zero
2c4: 02802821 move a1,s4
2c8: 03c02021 move a0,s8
2cc: 0c000000 jal 0 <mips_mt_regdump>
2d0: afa3001c sw v1,28(sp)
2d4: 41020802 mftc0 at,c0_tcbind
2d8: 00202821 move a1,at
2dc: 0c000000 jal 0 <mips_mt_regdump>
2e0: 02e02021 move a0,s7
2e4: 41020803 mftc0 at,c0_tcrestart
2e8: 00202821 move a1,at
2ec: 41020803 mftc0 at,c0_tcrestart
2f0: 00203021 move a2,at
2f4: 0c000000 jal 0 <mips_mt_regdump>
2f8: 02c02021 move a0,s6
2fc: 8fa3001c lw v1,28(sp)
300: 02602021 move a0,s3
304: 0c000000 jal 0 <mips_mt_regdump>
308: 00602821 move a1,v1
30c: 41020805 mftc0 at,c0_tccontext
310: 00202821 move a1,at
314: 0c000000 jal 0 <mips_mt_regdump>
318: 02402021 move a0,s2
31c: 8fa3001c lw v1,28(sp)
320: 14600003 bnez v1,330 <mips_mt_regdump+0x330>
324: 00001021 move v0,zero
328: 00400821 move at,v0
32c: 41811004 mttc0 at,c0_tchalt
330: 26100001 addiu s0,s0,1
334: 0211102a slt v0,s0,s1
338: 1440ff94 bnez v0,18c <mips_mt_regdump+0x18c>
33c: 00000000 nop
340: 0c000000 jal 0 <mips_mt_regdump>
344: 32b50001 andi s5,s5,0x1
348: 3c040000 lui a0,0x0
34c: 0c000000 jal 0 <mips_mt_regdump>
350: 24840000 addiu a0,a0,0
354: 12a00004 beqz s5,368 <mips_mt_regdump+0x368>
358: 32820400 andi v0,s4,0x400
35c: 41600021 evpe
360: 000000c0 ehb
364: 32820400 andi v0,s4,0x400
368: 14400003 bnez v0,378 <mips_mt_regdump+0x378>
36c: 00000000 nop
370: 0c000000 jal 0 <mips_mt_regdump>
374: 00000000 nop
378: 40011001 mfc0 at,c0_tcstatus
37c: 32940400 andi s4,s4,0x400
380: 34210400 ori at,at,0x400
384: 38210400 xori at,at,0x400
388: 0281a025 or s4,s4,at
38c: 40941001 mtc0 s4,c0_tcstatus
390: 000000c0 ehb
394: 8fbf0044 lw ra,68(sp)
398: 8fbe0040 lw s8,64(sp)
39c: 8fb7003c lw s7,60(sp)
3a0: 8fb60038 lw s6,56(sp)
3a4: 8fb50034 lw s5,52(sp)
3a8: 8fb40030 lw s4,48(sp)
3ac: 8fb3002c lw s3,44(sp)
3b0: 8fb20028 lw s2,40(sp)
3b4: 8fb10024 lw s1,36(sp)
3b8: 8fb00020 lw s0,32(sp)
3bc: 03e00008 jr ra
3c0: 27bd0048 addiu sp,sp,72
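For anyone who wants to double-check the instruction words at offsets 0x34 and 0x3c against the mnemonics objdump printed, the sketch below splits a MIPS32 I-type word into its fields. The struct and function names are made up for illustration; the field layout (6-bit opcode, two 5-bit register numbers, 16-bit immediate) is the standard I-type encoding.

```c
#include <stdint.h>

/* Fields of a MIPS32 I-type instruction word. */
struct itype {
    unsigned opcode;  /* bits 31..26 */
    unsigned rs;      /* bits 25..21, source register number */
    unsigned rt;      /* bits 20..16, target register number */
    uint16_t imm;     /* bits 15..0 */
};

/* Hypothetical helper: pull the I-type fields out of a raw word. */
static struct itype decode_itype(uint32_t word)
{
    struct itype f;
    f.opcode = (word >> 26) & 0x3f;
    f.rs     = (word >> 21) & 0x1f;
    f.rt     = (word >> 16) & 0x1f;
    f.imm    = (uint16_t)(word & 0xffff);
    return f;
}
```

Fed 0x32940400 (offset 0x3c), this yields opcode 0x0c (ANDI), rs = rt = 20 (s4) and immediate 0x400, which matches the "andi s4,s4,0x400" that strips everything but the Interrupt Inhibit bit from the saved value.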
On Sat, Dec 18, 2010 at 3:05 AM, Kevin D. Kissell <kevink@paralogos.com> wrote:
So, Anoop, if you get a minute for this any time in the next day or so
(after which I'll have very limited net access until next year), could you
please do an <mumble>-mips<mumble>-objdump --disassemble of your kernel
image (or even just the mips-mt.o module) from a failing kernel build and
post the disassembly of mips_mt_regdump()? The confirmation or refutation
of the theory about local_irq_save() no longer being built correctly for
SMTC would be within the first few instructions...
/K.
On 12/16/10 11:58, Kevin D. Kissell wrote:
Ralf tells me that this message got blocked by the LMO server due to HTML
content.
So here it is again, textier.
On 12/16/10 11:24, Kevin D. Kissell wrote:
On 12/16/10 07:37, STUART VENTERS wrote:
Two other possible clues:
The EVP is clear in the MVPControl register.
Does this say that only VPE0, T0 gets to run?
That's correct. In the maxtcs=1/maxvpes=1 boot state, it wouldn't matter.
It's just possible that setting EVP is conditional on more than one VPE
being used, but that's not the way I remember it.
Also the EXCPT bits in VPEControl for VPE1 indicate a Gating Storage
Exception dispatch.
But that seems to conflict the EVP bit above.
I don't have a copy of the ASE spec handy to see whether those bits have a
defined power-on value, but particularly if maxvpes=1 was set at boot time,
I would expect VPE1's registers to be in a partly random power-up state.
Perhaps these are an artifact of getting to a good state to dump things
out.
As per my previous mail, I looked at the MT register dump source, and it
really does pull values directly out of registers and doesn't depend on
having a sane kernel stack frame. The exceptions to that rule are the
reported values for TCStatus of the executing TC, which is based on the
perhaps-now-broken assumption that local_irq_save(flags) stores the
*entire* pre-invocation value of the TCStatus register in the flags
variable, and MVPControl, which is based on the assumption that dvpe()
returns the pre-invocation value of MVPControl. Break those assumptions,
and you'll get inconsistent state dumps like this, and very possibly
incorrect execution. Particularly if what was done effectively replaces
the SMTC-specific implementation of local_irq_save()/local_irq_restore()
with something that uses the generic MIPS32R2 atomic interrupt
enable/disable instructions. That would have been a *very* bad idea...
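The second assumption, that dvpe() hands back the pre-invocation MVPControl, can be modelled the same way. The sketch below is only an illustration on a plain variable: the function names are invented, and treating EVP as bit 0 is inferred from the "andi s5,s5,0x1" ahead of the conditional evpe in the disassembly rather than checked against the ASE spec.

```c
#include <stdint.h>

/* EVP bit position as implied by the listing (assumption, see lead-in). */
#define MVPCONTROL_EVP 0x1u

/* Model of dvpe: return the old MVPControl value, then clear EVP. */
static uint32_t model_dvpe(uint32_t *mvpcontrol)
{
    uint32_t prev = *mvpcontrol;
    *mvpcontrol &= ~MVPCONTROL_EVP;
    return prev;
}

/* Model of the exit path: re-enable VPEs only if they were enabled before. */
static void model_evpe_if_was_enabled(uint32_t *mvpcontrol, uint32_t prev)
{
    if (prev & MVPCONTROL_EVP)
        *mvpcontrol |= MVPCONTROL_EVP;
}
```

If the value returned by the dvpe step is what gets printed, a dump showing EVP clear means it was already clear before the dump started, which is why the assumption matters for reading the output.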
Regards,
Kevin K.
^ permalink raw reply [flat|nested] 68+ messages in thread
* RE: SMTC support status in latest git head.
@ 2010-12-21 20:29 ` Anoop P.A.
0 siblings, 0 replies; 68+ messages in thread
From: Anoop P.A. @ 2010-12-21 20:29 UTC (permalink / raw)
To: Anoop P.A., Kevin D. Kissell, Anoop P A; +Cc: STUART VENTERS, linux-mips
Sorry, I misunderstood the file. git blame shows that the "andi" has been around for quite some time.
49a89efb include/asm-mips/irqflags.h (Ralf Baechle 2007-10-11 23:46:15 +0100 128) __asm__(
df9ee292 arch/mips/include/asm/irqflags.h (David Howells 2010-10-07 14:08:55 +0100 129) " .macro arch_local_irq_save result
ff88f8a3 include/asm-mips/interrupt.h (Ralf Baechle 2005-07-12 14:54:31 +0000 130) " .set push
ff88f8a3 include/asm-mips/interrupt.h (Ralf Baechle 2005-07-12 14:54:31 +0000 131) " .set reorder
ff88f8a3 include/asm-mips/interrupt.h (Ralf Baechle 2005-07-12 14:54:31 +0000 132) " .set noat
41c594ab include/asm-mips/interrupt.h (Ralf Baechle 2006-04-05 09:45:45 +0100 133) #ifdef CONFIG_MIPS_MT_SMTC
41c594ab include/asm-mips/interrupt.h (Ralf Baechle 2006-04-05 09:45:45 +0100 134) " mfc0 \\result, $2, 1
41c594ab include/asm-mips/interrupt.h (Ralf Baechle 2006-04-05 09:45:45 +0100 135) " ori $1, \\result, 0x400
41c594ab include/asm-mips/interrupt.h (Ralf Baechle 2006-04-05 09:45:45 +0100 136) " .set noreorder
41c594ab include/asm-mips/interrupt.h (Ralf Baechle 2006-04-05 09:45:45 +0100 137) " mtc0 $1, $2, 1
41c594ab include/asm-mips/interrupt.h (Ralf Baechle 2006-04-05 09:45:45 +0100 138) " andi \\result, \\result, 0x400
41c594ab include/asm-mips/interrupt.h (Ralf Baechle 2006-04-05 09:45:45 +0100 139) #elif defined(CONFIG_CPU_MIPSR2)
ff88f8a3 include/asm-mips/interrupt.h (Ralf Baechle 2005-07-12 14:54:31 +0000 140) " di \\result
15265251 include/asm-mips/interrupt.h (Maxime Bizon 2005-12-20 06:32:19 +0100 141) " andi \\result, 1
ff88f8a3 include/asm-mips/interrupt.h (Ralf Baechle 2005-07-12 14:54:31 +0000 142) #else
ff88f8a3 include/asm-mips/interrupt.h (Ralf Baechle 2005-07-12 14:54:31 +0000 143) " mfc0 \\result, $12
c226f260 include/asm-mips/interrupt.h (Atsushi Nemoto 2006-02-03 01:34:01 +0900 144) " ori $1, \\result, 0x1f
c226f260 include/asm-mips/interrupt.h (Atsushi Nemoto 2006-02-03 01:34:01 +0900 145) " xori $1, 0x1f
ff88f8a3 include/asm-mips/interrupt.h (Ralf Baechle 2005-07-12 14:54:31 +0000 146) " .set noreorder
ff88f8a3 include/asm-mips/interrupt.h (Ralf Baechle 2005-07-12 14:54:31 +0000 147) " mtc0 $1, $12
ff88f8a3 include/asm-mips/interrupt.h (Ralf Baechle 2005-07-12 14:54:31 +0000 148) #endif
ff88f8a3 include/asm-mips/interrupt.h (Ralf Baechle 2005-07-12 14:54:31 +0000 149) " irq_disable_hazard
ff88f8a3 include/asm-mips/interrupt.h (Ralf Baechle 2005-07-12 14:54:31 +0000 150) " .set pop
ff88f8a3 include/asm-mips/interrupt.h (Ralf Baechle 2005-07-12 14:54:31 +0000 151) " .endm
^1da177e include/asm-mips/interrupt.h (Linus Torvalds 2005-04-16 15:20:36 -0700 152)
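Read as plain C, the three branches of the macro in the blame output come out roughly as below. This is a model on ordinary variables, not on CP0 registers, and all function names are invented for the illustration. The restore helper follows the mfc0/ori/xori/or/mtc0 sequence at the end of the mips_mt_regdump disassembly.

```c
#include <stdint.h>

#define TCSTATUS_IXMT 0x400u

/* CONFIG_MIPS_MT_SMTC branch: set IXMT, hand back only the old IXMT bit. */
static uint32_t save_smtc(uint32_t *tcstatus)
{
    uint32_t result = *tcstatus;
    *tcstatus = result | TCSTATUS_IXMT;
    return result & TCSTATUS_IXMT;
}

/* CONFIG_CPU_MIPSR2 branch: "di" clears Status.IE, result keeps only bit 0. */
static uint32_t save_r2(uint32_t *status)
{
    uint32_t result = *status;
    *status = result & ~1u;
    return result & 1u;
}

/* Fallback branch: clear the low five Status bits, result is the whole word. */
static uint32_t save_legacy(uint32_t *status)
{
    uint32_t result = *status;
    *status = (result | 0x1fu) ^ 0x1fu;
    return result;
}

/* SMTC restore: clear IXMT in the live value, then OR the saved bit back in. */
static void restore_smtc(uint32_t *tcstatus, uint32_t flags)
{
    uint32_t live = (*tcstatus | TCSTATUS_IXMT) ^ TCSTATUS_IXMT;
    *tcstatus = live | (flags & TCSTATUS_IXMT);
}
```

Only the fallback branch returns a full register image; the SMTC branch has masked the result down to the single 0x400 bit since the 2006 commit shown in the blame, so code that treats flags as a complete TCStatus value gets everything else as zero.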
> -----Original Message-----
> From: linux-mips-bounce@linux-mips.org [mailto:linux-mips-bounce@linux-mips.org] On Behalf Of Anoop P.A.
> Sent: Wednesday, December 22, 2010 1:37 AM
> To: Kevin D. Kissell; Anoop P A
> Cc: STUART VENTERS; linux-mips@linux-mips.org
> Subject: RE: SMTC support status in latest git head.
>
>
> OK. I will check it.
>
> BTW following patch is responsible for irq change.
>
> http://git.linux-mips.org/?p=linux.git;a=commitdiff;h=df9ee29270c11dba7d0fe0b83ce47a4d8e8d2101
>
> Thanks
> Anoop
> ________________________________________
> From: Kevin D. Kissell [mailto:kevink@paralogos.com]
> Sent: Wednesday, December 22, 2010 12:23 AM
> To: Anoop P A
> Cc: STUART VENTERS; linux-mips@linux-mips.org; Anoop P.A.
> Subject: Re: SMTC support status in latest git head.
>
> OK, I see why the MT register dump isn't giving us useful information.
> It's not clear that it's at the root of your functional problems, though.
> Apparently, somebody decided that it was unwholesome to propagate anything
> other than the previous interrupt enable state in the flags variable
> passed between irq_save() and irq_restore(). I agree philosophically, but
> it does break the MT register dump function. And I'm quite sure that
> there were other bits of SMTC code that knew that it was a TCStatus value,
> at least in the earliest versions of the code. I'm not a gitweb power
> user, but I haven't been able to figure out how to determine when the
> "andi \\result 0x400" on or about line 138 of irqflags.h (at least that's
> where it is in the head of tree) was checked-in. If it's at the boundary
> between working and non-working versions for SMTC, it might be the cause
> of the problems, but it may well not be responsible for anything other
> than the problem with reporting the value in
> the MT register dump - which really ought to be fixed.
>
> I'm in a small village in France for the holidays with no git/build system
> at my disposal, but I think that if you were to tweak mips-mt.c at line
> 103 to change
> the
>
> tcstatval = flags; /* And pre-dump TCStatus is flags */
>
>
>
> to something more like
>
>
>
> /* Pre-dump TCStatus Interrupt Inhibit bit is in flags variable
> */
>
> tcstatval = (read_c0_tcstatus() & ~0x400) | flags;
>
>
>
> should fix the dump.
>
> Regards,
>
> Kevin K.
>
> On 12/20/10 2:44 AM, Anoop P A wrote:
> Hi Kevin,
>
> Please find disassembly for mips_mt_reg_dump
>
> Thanks
> Anoop
>
> Disassembly of section .text:
>
> 00000000 <mips_mt_regdump>:
> 0: 27bdffb8 addiu sp,sp,-72
> 4: 00802821 move a1,a0
> 8: afbf0044 sw ra,68(sp)
> c: afbe0040 sw s8,64(sp)
> 10: afb7003c sw s7,60(sp)
> 14: afb60038 sw s6,56(sp)
> 18: afb50034 sw s5,52(sp)
> 1c: afb40030 sw s4,48(sp)
> 20: afb3002c sw s3,44(sp)
> 24: afb20028 sw s2,40(sp)
> 28: afb10024 sw s1,36(sp)
> 2c: afb00020 sw s0,32(sp)
> 30: 40141001 mfc0 s4,c0_tcstatus
> 34: 36810400 ori at,s4,0x400
> 38: 40811001 mtc0 at,c0_tcstatus
> 3c: 32940400 andi s4,s4,0x400
> 40: 000000c0 ehb
> 44: 41610001 dvpe at
> 48: 0020a821 move s5,at
> 4c: 000000c0 ehb
> 50: 3c020000 lui v0,0x0
> 54: 24420060 addiu v0,v0,96
> 58: 00400408 jr.hb v0
> 5c: 00000000 nop
> 60: 3c040000 lui a0,0x0
> 64: 24840000 addiu a0,a0,0
> 68: 0c000000 jal 0 <mips_mt_regdump>
> 6c: afa50010 sw a1,16(sp)
> 70: 3c040000 lui a0,0x0
> 74: 0c000000 jal 0 <mips_mt_regdump>
> 78: 24840000 addiu a0,a0,0
> 7c: 8fa50010 lw a1,16(sp)
> 80: 3c040000 lui a0,0x0
> 84: 0c000000 jal 0 <mips_mt_regdump>
> 88: 24840000 addiu a0,a0,0
> 8c: 3c040000 lui a0,0x0
> 90: 24840000 addiu a0,a0,0
> 94: 0c000000 jal 0 <mips_mt_regdump>
> 98: 02a02821 move a1,s5
> 9c: 40110002 mfc0 s1,c0_mvpconf0
> a0: 3c040000 lui a0,0x0
> a4: 02202821 move a1,s1
> a8: 0c000000 jal 0 <mips_mt_regdump>
> ac: 24840000 addiu a0,a0,0
> b0: 3c040000 lui a0,0x0
> b4: 0c000000 jal 0 <mips_mt_regdump>
> b8: 24840000 addiu a0,a0,0
> bc: 7e331a80 ext s3,s1,0xa,0x4
> c0: 3c090000 lui t1,0x0
> c4: 323100ff andi s1,s1,0xff
> c8: 3c080000 lui t0,0x0
> cc: 3c030000 lui v1,0x0
> d0: 3c1e0000 lui s8,0x0
> d4: 3c170000 lui s7,0x0
> d8: 3c160000 lui s6,0x0
> dc: 3c0a0000 lui t2,0x0
> e0: 26730001 addiu s3,s3,1
> e4: 26310001 addiu s1,s1,1
> e8: 00008021 move s0,zero
> ec: 2412ff00 li s2,-256
> f0: 25290000 addiu t1,t1,0
> f4: 25080000 addiu t0,t0,0
> f8: 24630000 addiu v1,v1,0
> fc: 27de0000 addiu s8,s8,0
> 100: 26f70000 addiu s7,s7,0
> 104: 26d60000 addiu s6,s6,0
> 108: 254a0000 addiu t2,t2,0
> 10c: 00001021 move v0,zero
> 110: 40040801 mfc0 a0,c0_vpecontrol
> 114: 00922024 and a0,a0,s2
> 118: 00442025 or a0,v0,a0
> 11c: 40840801 mtc0 a0,c0_vpecontrol
> 120: 000000c0 ehb
> 124: 41020802 mftc0 at,c0_tcbind
> 128: 00202021 move a0,at
> 12c: 24420001 addiu v0,v0,1
> 130: 3084000f andi a0,a0,0xf
> 134: 12040031 beq s0,a0,1fc <mips_mt_regdump+0x1fc>
> 138: 0051282a slt a1,v0,s1
> 13c: 14a0fff4 bnez a1,110 <mips_mt_regdump+0x110>
> 140: 00000000 nop
> 144: 26100001 addiu s0,s0,1
> 148: 0213102a slt v0,s0,s3
> 14c: 1440fff0 bnez v0,110 <mips_mt_regdump+0x110>
> 150: 00001021 move v0,zero
> 154: 3c040000 lui a0,0x0
> 158: 24840000 addiu a0,a0,0
> 15c: 3c1e0000 lui s8,0x0
> 160: 3c170000 lui s7,0x0
> 164: 3c160000 lui s6,0x0
> 168: 3c130000 lui s3,0x0
> 16c: 0c000000 jal 0 <mips_mt_regdump>
> 170: 3c120000 lui s2,0x0
> 174: 00008021 move s0,zero
> 178: 27de0000 addiu s8,s8,0
> 17c: 26f70000 addiu s7,s7,0
> 180: 26d60000 addiu s6,s6,0
> 184: 26730000 addiu s3,s3,0
> 188: 26520000 addiu s2,s2,0
> 18c: 40020801 mfc0 v0,c0_vpecontrol
> 190: 2403ff00 li v1,-256
> 194: 00431024 and v0,v0,v1
> 198: 02021025 or v0,s0,v0
> 19c: 40820801 mtc0 v0,c0_vpecontrol
> 1a0: 000000c0 ehb
> 1a4: 41020802 mftc0 at,c0_tcbind
> 1a8: 00201821 move v1,at
> 1ac: 40021002 mfc0 v0,c0_tcbind
> 1b0: 1062003f beq v1,v0,2b0 <mips_mt_regdump+0x2b0>
> 1b4: 00000000 nop
> 1b8: 41020804 mftc0 at,c0_tchalt
> 1bc: 00201821 move v1,at
> 1c0: 24020001 li v0,1
> 1c4: 00400821 move at,v0
> 1c8: 41811004 mttc0 at,c0_tchalt
> 1cc: 41020801 mftc0 at,c0_tcstatus
> 1d0: 00203021 move a2,at
> 1d4: 3c040000 lui a0,0x0
> 1d8: 02002821 move a1,s0
> 1dc: 24840000 addiu a0,a0,0
> 1e0: afa3001c sw v1,28(sp)
> 1e4: 0c000000 jal 0 <mips_mt_regdump>
> 1e8: afa60010 sw a2,16(sp)
> 1ec: 8fa60010 lw a2,16(sp)
> 1f0: 8fa3001c lw v1,28(sp)
> 1f4: 080000b2 j 2c8 <mips_mt_regdump+0x2c8>
> 1f8: 00c02821 move a1,a2
> 1fc: 01202021 move a0,t1
> 200: 02002821 move a1,s0
> 204: afa3001c sw v1,28(sp)
> 208: afa80014 sw t0,20(sp)
> 20c: afa90010 sw t1,16(sp)
> 210: 0c000000 jal 0 <mips_mt_regdump>
> 214: afaa0018 sw t2,24(sp)
> 218: 41010801 mftc0 at,c0_vpecontrol
> 21c: 00202821 move a1,at
> 220: 8fa80014 lw t0,20(sp)
> 224: 0c000000 jal 0 <mips_mt_regdump>
> 228: 01002021 move a0,t0
> 22c: 41010802 mftc0 at,c0_vpeconf0
> 230: 00202821 move a1,at
> 234: 8fa3001c lw v1,28(sp)
> 238: 0c000000 jal 0 <mips_mt_regdump>
> 23c: 00602021 move a0,v1
> 240: 410c0800 mftc0 at,c0_status
> 244: 00203021 move a2,at
> 248: 03c02021 move a0,s8
> 24c: 0c000000 jal 0 <mips_mt_regdump>
> 250: 02002821 move a1,s0
> 254: 410e0800 mftc0 at,c0_epc
> 258: 00203021 move a2,at
> 25c: 410e0800 mftc0 at,c0_epc
> 260: 00203821 move a3,at
> 264: 02e02021 move a0,s7
> 268: 0c000000 jal 0 <mips_mt_regdump>
> 26c: 02002821 move a1,s0
> 270: 410d0800 mftc0 at,c0_cause
> 274: 00203021 move a2,at
> 278: 02c02021 move a0,s6
> 27c: 0c000000 jal 0 <mips_mt_regdump>
> 280: 02002821 move a1,s0
> 284: 41100807 mftc0 at,$16,7
> 288: 00203021 move a2,at
> 28c: 8faa0018 lw t2,24(sp)
> 290: 02002821 move a1,s0
> 294: 0c000000 jal 0 <mips_mt_regdump>
> 298: 01402021 move a0,t2
> 29c: 8fa3001c lw v1,28(sp)
> 2a0: 8fa80014 lw t0,20(sp)
> 2a4: 8fa90010 lw t1,16(sp)
> 2a8: 08000051 j 144 <mips_mt_regdump+0x144>
> 2ac: 8faa0018 lw t2,24(sp)
> 2b0: 3c040000 lui a0,0x0
> 2b4: 02002821 move a1,s0
> 2b8: 0c000000 jal 0 <mips_mt_regdump>
> 2bc: 24840000 addiu a0,a0,0
> 2c0: 00001821 move v1,zero
> 2c4: 02802821 move a1,s4
> 2c8: 03c02021 move a0,s8
> 2cc: 0c000000 jal 0 <mips_mt_regdump>
> 2d0: afa3001c sw v1,28(sp)
> 2d4: 41020802 mftc0 at,c0_tcbind
> 2d8: 00202821 move a1,at
> 2dc: 0c000000 jal 0 <mips_mt_regdump>
> 2e0: 02e02021 move a0,s7
> 2e4: 41020803 mftc0 at,c0_tcrestart
> 2e8: 00202821 move a1,at
> 2ec: 41020803 mftc0 at,c0_tcrestart
> 2f0: 00203021 move a2,at
> 2f4: 0c000000 jal 0 <mips_mt_regdump>
> 2f8: 02c02021 move a0,s6
> 2fc: 8fa3001c lw v1,28(sp)
> 300: 02602021 move a0,s3
> 304: 0c000000 jal 0 <mips_mt_regdump>
> 308: 00602821 move a1,v1
> 30c: 41020805 mftc0 at,c0_tccontext
> 310: 00202821 move a1,at
> 314: 0c000000 jal 0 <mips_mt_regdump>
> 318: 02402021 move a0,s2
> 31c: 8fa3001c lw v1,28(sp)
> 320: 14600003 bnez v1,330 <mips_mt_regdump+0x330>
> 324: 00001021 move v0,zero
> 328: 00400821 move at,v0
> 32c: 41811004 mttc0 at,c0_tchalt
> 330: 26100001 addiu s0,s0,1
> 334: 0211102a slt v0,s0,s1
> 338: 1440ff94 bnez v0,18c <mips_mt_regdump+0x18c>
> 33c: 00000000 nop
> 340: 0c000000 jal 0 <mips_mt_regdump>
> 344: 32b50001 andi s5,s5,0x1
> 348: 3c040000 lui a0,0x0
> 34c: 0c000000 jal 0 <mips_mt_regdump>
> 350: 24840000 addiu a0,a0,0
> 354: 12a00004 beqz s5,368 <mips_mt_regdump+0x368>
> 358: 32820400 andi v0,s4,0x400
> 35c: 41600021 evpe
> 360: 000000c0 ehb
> 364: 32820400 andi v0,s4,0x400
> 368: 14400003 bnez v0,378 <mips_mt_regdump+0x378>
> 36c: 00000000 nop
> 370: 0c000000 jal 0 <mips_mt_regdump>
> 374: 00000000 nop
> 378: 40011001 mfc0 at,c0_tcstatus
> 37c: 32940400 andi s4,s4,0x400
> 380: 34210400 ori at,at,0x400
> 384: 38210400 xori at,at,0x400
> 388: 0281a025 or s4,s4,at
> 38c: 40941001 mtc0 s4,c0_tcstatus
> 390: 000000c0 ehb
> 394: 8fbf0044 lw ra,68(sp)
> 398: 8fbe0040 lw s8,64(sp)
> 39c: 8fb7003c lw s7,60(sp)
> 3a0: 8fb60038 lw s6,56(sp)
> 3a4: 8fb50034 lw s5,52(sp)
> 3a8: 8fb40030 lw s4,48(sp)
> 3ac: 8fb3002c lw s3,44(sp)
> 3b0: 8fb20028 lw s2,40(sp)
> 3b4: 8fb10024 lw s1,36(sp)
> 3b8: 8fb00020 lw s0,32(sp)
> 3bc: 03e00008 jr ra
> 3c0: 27bd0048 addiu sp,sp,72
>
>
> On Sat, Dec 18, 2010 at 3:05 AM, Kevin D. Kissell <kevink@paralogos.com>
> wrote:
> So, Anoop, if you get a minute for this any time in the next day or so
> (after which I'll have very limited net access until next year), could you
> please do an <mumble>-mips<mumble>-objdump --disassemble of your kernel
> image (or even just the mips-mt.o module) from a failing kernel build and
> post the disassembly of mips_mt_regdump()? The confirmation or refutation
> of the theory about local_irq_save() no longer being built correctly for
> SMTC would be within the first few instructions...
>
> /K.
>
>
> On 12/16/10 11:58, Kevin D. Kissell wrote:
> Ralf tells me that this message got blocked by the LMO server due to HTML
> content.
> So here it is again, textier.
>
> On 12/16/10 11:24, Kevin D. Kissell wrote:
> On 12/16/10 07:37, STUART VENTERS wrote:
>
> Two other possible clues:
>
> The EVP is clear in the MVPControl register.
> Does this say that only VPE0, T0 gets to run?
> That's correct. In the maxtcs=1/maxvpes=1 boot state, it wouldn't matter.
> It's just possible that setting EVP is conditional on more than one VPE
> being used, but that's not the way I remember it.
>
> Also the EXCPT bits in VPEControl for VPE1 indicate a Gating Storage
> Exception dispatch.
> But that seems to conflict the EVP bit above.
> I don't have a copy of the ASE spec handy to see whether those bits have a
> defined power-on value, but particularly if maxvpes=1 was set at boot
> time,
> I would expect VPE1's registers to be in a partly random power-up state.
>
> Perhaps these are an artifact of getting to a good state to dump things
> out.
> As per my previous mail, I looked at the MT register dump source, and it
> really does pull values directly
> out of registers and doesn't depend on having a sane kernel stack frame.
> The exceptions to that rule
> are the reported values for TCStatus of the executing TC, which is based
> on the perhaps-now-broken
> assumption that local_irq_save(flags) stores the *entire* pre-invocation
> value of the TCStatus register
> in the flags variable, and MVPcontrol, which is based on the assumption
> that dvpe() returns the pre-invocation
> value of MVPcontrol. Break those assumptions, and you'll get inconsistent
> state dumps like this,
> and very possibly incorrect execution. Particularly if what was done was
> that effectively replaces
> the SMTC-specific implementation of local_irq_save()/local_irq_restore()
> with something that uses
> the generic MIPS32R2 atomic interrupt enable/disable instructions. That
> would have been a *very* bad idea...
>
> Regards,
>
> Kevin K.
>
>
>
^ permalink raw reply [flat|nested] 68+ messages in thread
> 284: 41100807 mftc0 at,$16,7
> 288: 00203021 move a2,at
> 28c: 8faa0018 lw t2,24(sp)
> 290: 02002821 move a1,s0
> 294: 0c000000 jal 0 <mips_mt_regdump>
> 298: 01402021 move a0,t2
> 29c: 8fa3001c lw v1,28(sp)
> 2a0: 8fa80014 lw t0,20(sp)
> 2a4: 8fa90010 lw t1,16(sp)
> 2a8: 08000051 j 144 <mips_mt_regdump+0x144>
> 2ac: 8faa0018 lw t2,24(sp)
> 2b0: 3c040000 lui a0,0x0
> 2b4: 02002821 move a1,s0
> 2b8: 0c000000 jal 0 <mips_mt_regdump>
> 2bc: 24840000 addiu a0,a0,0
> 2c0: 00001821 move v1,zero
> 2c4: 02802821 move a1,s4
> 2c8: 03c02021 move a0,s8
> 2cc: 0c000000 jal 0 <mips_mt_regdump>
> 2d0: afa3001c sw v1,28(sp)
> 2d4: 41020802 mftc0 at,c0_tcbind
> 2d8: 00202821 move a1,at
> 2dc: 0c000000 jal 0 <mips_mt_regdump>
> 2e0: 02e02021 move a0,s7
> 2e4: 41020803 mftc0 at,c0_tcrestart
> 2e8: 00202821 move a1,at
> 2ec: 41020803 mftc0 at,c0_tcrestart
> 2f0: 00203021 move a2,at
> 2f4: 0c000000 jal 0 <mips_mt_regdump>
> 2f8: 02c02021 move a0,s6
> 2fc: 8fa3001c lw v1,28(sp)
> 300: 02602021 move a0,s3
> 304: 0c000000 jal 0 <mips_mt_regdump>
> 308: 00602821 move a1,v1
> 30c: 41020805 mftc0 at,c0_tccontext
> 310: 00202821 move a1,at
> 314: 0c000000 jal 0 <mips_mt_regdump>
> 318: 02402021 move a0,s2
> 31c: 8fa3001c lw v1,28(sp)
> 320: 14600003 bnez v1,330 <mips_mt_regdump+0x330>
> 324: 00001021 move v0,zero
> 328: 00400821 move at,v0
> 32c: 41811004 mttc0 at,c0_tchalt
> 330: 26100001 addiu s0,s0,1
> 334: 0211102a slt v0,s0,s1
> 338: 1440ff94 bnez v0,18c <mips_mt_regdump+0x18c>
> 33c: 00000000 nop
> 340: 0c000000 jal 0 <mips_mt_regdump>
> 344: 32b50001 andi s5,s5,0x1
> 348: 3c040000 lui a0,0x0
> 34c: 0c000000 jal 0 <mips_mt_regdump>
> 350: 24840000 addiu a0,a0,0
> 354: 12a00004 beqz s5,368 <mips_mt_regdump+0x368>
> 358: 32820400 andi v0,s4,0x400
> 35c: 41600021 evpe
> 360: 000000c0 ehb
> 364: 32820400 andi v0,s4,0x400
> 368: 14400003 bnez v0,378 <mips_mt_regdump+0x378>
> 36c: 00000000 nop
> 370: 0c000000 jal 0 <mips_mt_regdump>
> 374: 00000000 nop
> 378: 40011001 mfc0 at,c0_tcstatus
> 37c: 32940400 andi s4,s4,0x400
> 380: 34210400 ori at,at,0x400
> 384: 38210400 xori at,at,0x400
> 388: 0281a025 or s4,s4,at
> 38c: 40941001 mtc0 s4,c0_tcstatus
> 390: 000000c0 ehb
> 394: 8fbf0044 lw ra,68(sp)
> 398: 8fbe0040 lw s8,64(sp)
> 39c: 8fb7003c lw s7,60(sp)
> 3a0: 8fb60038 lw s6,56(sp)
> 3a4: 8fb50034 lw s5,52(sp)
> 3a8: 8fb40030 lw s4,48(sp)
> 3ac: 8fb3002c lw s3,44(sp)
> 3b0: 8fb20028 lw s2,40(sp)
> 3b4: 8fb10024 lw s1,36(sp)
> 3b8: 8fb00020 lw s0,32(sp)
> 3bc: 03e00008 jr ra
> 3c0: 27bd0048 addiu sp,sp,72
>
>
> On Sat, Dec 18, 2010 at 3:05 AM, Kevin D. Kissell <kevink@paralogos.com>
> wrote:
> So, Anoop, if you get a minute for this any time in the next day or so
> (after which I'll have very limited net access until next year), could you
> please do an <mumble>-mips<mumble>-objdump --disassemble of your kernel
> image (or even just the mips-mt.o module) from a failing kernel build and
> post the disassembly of mips_mt_regdump()? The confirmation or refutation
> of the theory about local_irq_save() no longer being built correctly for
> SMTC would be within the first few instructions...
>
> /K.
>
>
> On 12/16/10 11:58, Kevin D. Kissell wrote:
> Ralf tells me that this message got blocked by the LMO server due to HTML
> content.
> So here it is again, textier.
>
> On 12/16/10 11:24, Kevin D. Kissell wrote:
> On 12/16/10 07:37, STUART VENTERS wrote:
>
> Two other possible clues:
>
> The EVP is clear in the MVPControl register.
> Does this say that only VPE0, T0 gets to run?
> That's correct. In the maxtcs=1/maxvpes=1 boot state, it wouldn't matter.
> It's just possible that setting EVP is conditional on more than one VPE
> being used, but that's not the way I remember it.
>
> Also the EXCPT bits in VPEControl for VPE1 indicate a Gating Storage
> Exception dispatch.
> But that seems to conflict the EVP bit above.
> I don't have a copy of the ASE spec handy to see whether those bits have a
> defined power-on value, but particularly if maxvpes=1 was set at boot
> time,
> I would expect VPE1's registers to be in a partly random power-up state.
>
> Perhaps these are an artifact of getting to a good state to dump things
> out.
> As per my previous mail, I looked at the MT register dump source, and it
> really does pull values directly
> out of registers and doesn't depend on having a sane kernel stack frame.
> The exceptions to that rule
> are the reported values for TCStatus of the executing TC, which is based
> on the perhaps-now-broken
> assumption that local_irq_save(flags) stores the *entire* pre-invocation
> value of the TCStatus register
> in the flags variable, and MVPcontrol, which is based on the assumption
> that dvpe() returns the pre-invocation
> value of MVPcontrol. Break those assumptions, and you'll get inconsistent
> state dumps like this,
> and very possibly incorrect execution. Particularly if what was done was
> that effectively replaces
> the SMTC-specific implementation of local_irq_save()/local_irq_restore()
> with something that uses
> the generic MIPS32R2 atomic interrupt enable/disable instructions. That
> would have been a *very* bad idea...
>
> Regards,
>
> Kevin K.
>
>
>
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: SMTC support status in latest git head.
2010-12-21 20:29 ` Anoop P.A.
(?)
@ 2010-12-22 10:27 ` Kevin D. Kissell
2010-12-22 11:35 ` Anoop P A
-1 siblings, 1 reply; 68+ messages in thread
From: Kevin D. Kissell @ 2010-12-22 10:27 UTC (permalink / raw)
To: Anoop P.A.; +Cc: Anoop P A, STUART VENTERS, linux-mips
> Sorry I misunderstood file. git blame shows that "andi" is around for
> quite some time.
I've never used git blame, so I don't know how far it can be trusted,
but if that change was made in 2006, that would predate the major
breakage by several years. So my suggestion from yesterday is a
reasonable one:
> I think that if you were to tweak mips-mt.c at line 103 to change the
>
> tcstatval = flags; /* And pre-dump TCStatus is flags */
>
> to something more like
>
> /* Pre-dump TCStatus Interrupt Inhibit bit is in flags variable */
> tcstatval = (read_c0_tcstatus() & ~0x400) | flags;
>
> that should fix the dump.
With that patch, if you re-run the experiment of hang-breakout-dump, we
might be able to deduce something.
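For illustration, the bit arithmetic behind that suggested change can be tried out in user space. This is only a sketch: the helper name predump_tcstatus() is hypothetical, the register value is passed in as a plain integer rather than read with read_c0_tcstatus(), and it assumes that flags carries nothing but the 0x400 Interrupt Inhibit (IXMT) bit.

```c
#include <stdint.h>

/* TCStatus interrupt-inhibit bit (IXMT), the 0x400 mask used in the patch. */
#define TCSTATUS_IXMT 0x00000400u

/*
 * Hypothetical helper: rebuild the pre-dump TCStatus value from a live
 * TCStatus read plus the flags word saved by local_irq_save().
 * Assumption: flags holds only the IXMT bit, so OR-ing it in cannot
 * disturb any other field of the live value.
 */
static uint32_t predump_tcstatus(uint32_t live_tcstatus, uint32_t flags)
{
    /* Drop the inhibit bit set for the dump, then restore the saved one. */
    return (live_tcstatus & ~TCSTATUS_IXMT) | flags;
}
```

With a live value of 0x11004401 and flags of 0, this yields 0x11004001; with flags of 0x400 the inhibit bit is preserved and the result is 0x11004401.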
Ralf wrote to me independently to say that my message from yesterday
with that suggestion and some other commentary got eaten once again by
the LMO mail forwarder because of the HTML content. With all due
respect, I'm using a very standard open-source mail client (Thunderbird)
with a very normal option (reply to text with text, HTML with HTML).
Perhaps it's the LMO mail system that needs to change, and not the
mail configurations of the whole LMO community.
Regards,
Kevin K.
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: SMTC support status in latest git head.
2010-12-22 10:27 ` Kevin D. Kissell
@ 2010-12-22 11:35 ` Anoop P A
2010-12-22 11:37 ` Kevin D. Kissell
0 siblings, 1 reply; 68+ messages in thread
From: Anoop P A @ 2010-12-22 11:35 UTC (permalink / raw)
To: Kevin D. Kissell; +Cc: Anoop P.A., STUART VENTERS, linux-mips
On Wed, 2010-12-22 at 02:27 -0800, Kevin D. Kissell wrote:
> > Sorry I misunderstood file. git blame shows that "andi" is around for
> > quite some time.
>
> I've never used git blame, so I don't know how far it can be trusted,
> but if that change was made in 2006, that would predate the major
> breakage by several years. So my suggestion from yesterday is a
> reasonable one:
That change is present in the 2.6.32 kernel, which boots. The
corresponding patch can be found in gitweb:
http://git.linux-mips.org/?p=linux.git;a=commitdiff;h=41c594ab65fc89573af296d192aa5235d09717ab#patch39
>
> > I think that if you were to tweak mips-mt.c at line 103 to change
> > the
> >
> > tcstatval = flags; /* And pre-dump TCStatus is flags */
> >
> > to something more like
> >
> > /* Pre-dump TCStatus Interrupt Inhibit bit is in flags variable */
> > tcstatval = (read_c0_tcstatus() & ~0x400) | flags;
> >
> > should fix the dump.
>
> With that patch, if you re-run the experiment of hang-breakout-dump, we
> might be able to deduce something.
Here is the dump with the patch.
[ 0.000000] Calibrating delay loop... === MIPS MT State Dump ===
[ 0.000000] -- Global State --
[ 0.000000] MVPControl Passed: 00000000
[ 0.000000] MVPControl Read: 00000000
[ 0.000000] MVPConf0 : a8008406
[ 0.000000] -- per-VPE State --
[ 0.000000] VPE 0
[ 0.000000] VPEControl : 00000000
[ 0.000000] VPEConf0 : 800f0003
[ 0.000000] VPE0.Status : 11004001
[ 0.000000] VPE0.EPC : 80100000 _stext+0x0/0x10
[ 0.000000] VPE0.Cause : e080407c
[ 0.000000] VPE0.Config7 : 00010000
[ 0.000000] VPE 1
[ 0.000000] VPEControl : 00030000
[ 0.000000] VPEConf0 : 800f0000
[ 0.000000] VPE1.Status : 00407904
[ 0.000000] VPE1.EPC : fffdffff 0xfffdffff
[ 0.000000] VPE1.Cause : 4000027c
[ 0.000000] VPE1.Config7 : 00010000
[ 0.000000] -- per-TC State --
[ 0.000000] TC 0 (current TC with VPE EPC above)
[ 0.000000] TCStatus : 11004001
[ 0.000000] TCBind : 00000000
[ 0.000000] TCRestart : 803fc408 printk+0x10/0x30
[ 0.000000] TCHalt : 00000000
[ 0.000000] TCContext : 00000000
[ 0.000000] TC 1
[ 0.000000] TCStatus : 00000000
[ 0.000000] TCBind : 00200001
[ 0.000000] TCRestart : 3ffffffe 0x3ffffffe
[ 0.000000] TCHalt : 00000001
[ 0.000000] TCContext : efffffff
[ 0.000000] TC 2
[ 0.000000] TCStatus : 00000000
[ 0.000000] TCBind : 00400001
[ 0.000000] TCRestart : ffffffee 0xffffffee
[ 0.000000] TCHalt : 00000001
[ 0.000000] TCContext : efffffbf
[ 0.000000] TC 3
[ 0.000000] TCStatus : 00000000
[ 0.000000] TCBind : 00600001
[ 0.000000] TCRestart : ffe00200 0xffe00200
[ 0.000000] TCHalt : 00000001
[ 0.000000] TCContext : 7fffb77f
[ 0.000000] TC 4
[ 0.000000] TCStatus : 00000000
[ 0.000000] TCBind : 00800001
[ 0.000000] TCRestart : ffe00200 0xffe00200
[ 0.000000] TCHalt : 00000001
[ 0.000000] TCContext : 7ffdf736
[ 0.000000] TC 5
[ 0.000000] TCStatus : 00000000
[ 0.000000] TCBind : 00a00001
[ 0.000000] TCRestart : ffe00200 0xffe00200
[ 0.000000] TCHalt : 00000001
[ 0.000000] TCContext : ee5ffff7
[ 0.000000] TC 6
[ 0.000000] TCStatus : 00000000
[ 0.000000] TCBind : 00c00001
[ 0.000000] TCRestart : f7ff7ffe 0xf7ff7ffe
[ 0.000000] TCHalt : 00000001
[ 0.000000] TCContext : e6fffffb
[ 0.000000] Counter Interrupts taken per CPU (TC)
[ 0.000000] 0: 0
[ 0.000000] 1: 0
[ 0.000000] Self-IPI invocations:
[ 0.000000] 0: 0
[ 0.000000] 1: 0
[ 0.000000] IPIQ[0]: head = 0x0, tail = 0x0, depth = 0
[ 0.000000] IPIQ[1]: head = 0x0, tail = 0x0, depth = 0
[ 0.000000] 0 Recoveries of "stolen" FPU
[ 0.000000] ===========================
[ 0.010000] === MIPS MT State Dump ===
[ 0.010000] -- Global State --
[ 0.010000] MVPControl Passed: 00000000
[ 0.010000] MVPControl Read: 00000000
[ 0.010000] MVPConf0 : a8008406
[ 0.010000] -- per-VPE State --
[ 0.010000] VPE 0
[ 0.010000] VPEControl : 00000000
[ 0.010000] VPEConf0 : 800f0003
[ 0.010000] VPE0.Status : 18004000
[ 0.010000] VPE0.EPC : 8010c9b4 mips_mt_regdump+0x3a4/0x3d4
[ 0.010000] VPE0.Cause : 50804000
[ 0.010000] VPE0.Config7 : 00010000
[ 0.010000] VPE 1
[ 0.010000] VPEControl : 00030000
[ 0.010000] VPEConf0 : 800f0000
[ 0.010000] VPE1.Status : 00407904
[ 0.010000] VPE1.EPC : fffdffff 0xfffdffff
[ 0.010000] VPE1.Cause : 4000027c
[ 0.010000] VPE1.Config7 : 00010000
[ 0.010000] -- per-TC State --
[ 0.010000] TC 0 (current TC with VPE EPC above)
[ 0.010000] TCStatus : 18004000
[ 0.010000] TCBind : 00000000
[ 0.010000] TCRestart : 803fc408 printk+0x10/0x30
[ 0.010000] TCHalt : 00000000
[ 0.010000] TCContext : 00000000
[ 0.010000] TC 1
[ 0.010000] TCStatus : 00000000
[ 0.010000] TCBind : 00200001
[ 0.010000] TCRestart : 3ffffffe 0x3ffffffe
[ 0.010000] TCHalt : 00000001
[ 0.010000] TCContext : efffffff
[ 0.010000] TC 2
[ 0.010000] TCStatus : 00000000
[ 0.010000] TCBind : 00400001
[ 0.010000] TCRestart : ffffffee 0xffffffee
[ 0.010000] TCHalt : 00000001
[ 0.010000] TCContext : efffffbf
[ 0.010000] TC 3
[ 0.010000] TCStatus : 00000000
[ 0.010000] TCBind : 00600001
[ 0.010000] TCRestart : ffe00200 0xffe00200
[ 0.010000] TCHalt : 00000001
[ 0.010000] TCContext : 7fffb77f
[ 0.010000] TC 4
[ 0.010000] TCStatus : 00000000
[ 0.010000] TCBind : 00800001
[ 0.010000] TCRestart : ffe00200 0xffe00200
[ 0.010000] TCHalt : 00000001
[ 0.010000] TCContext : 7ffdf736
[ 0.010000] TC 5
[ 0.010000] TCStatus : 00000000
[ 0.010000] TCBind : 00a00001
[ 0.010000] TCRestart : ffe00200 0xffe00200
[ 0.010000] TCHalt : 00000001
[ 0.010000] TCContext : ee5ffff7
[ 0.010000] TC 6
[ 0.010000] TCStatus : 00000000
[ 0.010000] TCBind : 00c00001
[ 0.010000] TCRestart : f7ff7ffe 0xf7ff7ffe
[ 0.010000] TCHalt : 00000001
[ 0.010000] TCContext : e6fffffb
[ 0.010000] Counter Interrupts taken per CPU (TC)
[ 0.010000] 0: 0
[ 0.010000] 1: 0
[ 0.010000] Self-IPI invocations:
[ 0.010000] 0: 0
[ 0.010000] 1: 0
[ 0.010000] IPIQ[0]: head = 0x0, tail = 0x0, depth = 0
[ 0.010000] IPIQ[1]: head = 0x0, tail = 0x0, depth = 0
[ 0.010000] 0 Recoveries of "stolen" FPU
[ 0.010000] ===========================
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: SMTC support status in latest git head.
2010-12-22 11:35 ` Anoop P A
@ 2010-12-22 11:37 ` Kevin D. Kissell
2010-12-22 11:51 ` Anoop P A
0 siblings, 1 reply; 68+ messages in thread
From: Kevin D. Kissell @ 2010-12-22 11:37 UTC (permalink / raw)
To: Anoop P A; +Cc: Anoop P.A., STUART VENTERS, linux-mips
Thanks. This is indeed strange. The VPE0 Status and TC0 TCStatus/Cause
all indicate that interrupts are enabled and not inhibited at the per-TC
level, and the presumed timer interrupt, in the 0x4000 bit, is present
and not masked-off. Logically, the system must be entering (and
exiting) the interrupt handler, yet the timer calibration isn't
completing. That leaves more complex possible explanations for failure,
most of which would fall into two categories:
1) The platform interrupt handler is failing to decode the event
properly as a timer event.
2) Despite there being only one TC active, the calibration code is
waiting for some handshake from another "CPU"
To test the first, you might consider adding a kprintf() to the case of
a "spurious" timer-like interrupt being detected and ignored...
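As a cross-check of the reading above, the relevant bits can be decoded from the first dump (Status and TCStatus 11004001, Cause e080407c) with a few lines of user-space C. This is a sketch only: the function names are hypothetical, the values are passed in as plain integers, and the EXL/ERL bits of Status are ignored for brevity.

```c
#include <stdint.h>

#define ST0_IE        0x00000001u  /* Status.IE: global interrupt enable */
#define ST0_IM        0x0000ff00u  /* Status.IM7..IM0: interrupt mask    */
#define TCSTATUS_IXMT 0x00000400u  /* TCStatus per-TC interrupt inhibit  */

/* Interrupt lines that are both pending in Cause.IP and unmasked in Status.IM. */
static uint32_t deliverable_irqs(uint32_t status, uint32_t cause)
{
    return status & cause & ST0_IM;
}

/* Nonzero if Status.IE is set. */
static int irqs_globally_enabled(uint32_t status)
{
    return (status & ST0_IE) != 0;
}

/* Nonzero if this TC has interrupts inhibited via TCStatus.IXMT. */
static int tc_irqs_inhibited(uint32_t tcstatus)
{
    return (tcstatus & TCSTATUS_IXMT) != 0;
}
```

For the first dump, deliverable_irqs(0x11004001, 0xe080407c) is 0x4000, irqs_globally_enabled(0x11004001) is true, and tc_irqs_inhibited(0x11004001) is false.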
Regards,
Kevin K.
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: SMTC support status in latest git head.
2010-12-22 11:37 ` Kevin D. Kissell
@ 2010-12-22 11:51 ` Anoop P A
2010-12-22 13:03 ` Kevin D. Kissell
0 siblings, 1 reply; 68+ messages in thread
From: Anoop P A @ 2010-12-22 11:51 UTC (permalink / raw)
To: Kevin D. Kissell; +Cc: Anoop P.A., STUART VENTERS, linux-mips
On Wed, 2010-12-22 at 03:37 -0800, Kevin D. Kissell wrote:
> Thanks. This is indeed strange. The VPE0 Status and TC0 TCStatus/Cause
> all indicate that interrupts are enabled and not inhibited at the per-TC
> level, and the presumed timer interrupt, in the 0x4000 bit, is present
> and not masked-off. Logically, the system must be entering (and
> exiting) the interrupt handler, yet the timer calibration isn't
> completing. That leaves more complex possible explanations for failure,
> most of which would fall into two categories:
>
> 1) The platform interrupt handler is failing to decode the event
> properly as a timer event.
> 2) Despite there being only one TC active, the calibration code is
> waiting for some handshake from another "CPU"
>
> To test the first, you might consider adding a kprintf() to the case of
> a "spurious" timer-like interrupt being detected and ignored...
I have tried it. Only one interrupt arrives, and the platform handler
detects it as a timer interrupt and acknowledges it properly. You can
see the timestamp change in the logs.
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: SMTC support status in latest git head.
2010-12-22 11:51 ` Anoop P A
@ 2010-12-22 13:03 ` Kevin D. Kissell
2010-12-22 16:34 ` STUART VENTERS
2010-12-23 21:09 ` STUART VENTERS
0 siblings, 2 replies; 68+ messages in thread
From: Kevin D. Kissell @ 2010-12-22 13:03 UTC (permalink / raw)
To: Anoop P A; +Cc: Anoop P.A., STUART VENTERS, linux-mips
On 12/22/10 3:51 AM, Anoop P A wrote:
> On Wed, 2010-12-22 at 03:37 -0800, Kevin D. Kissell wrote:
>> Thanks. This is indeed strange. The VPE0 Status and TC0 TCStatus/Cause
>> all indicate that interrupts are enabled and not inhibited at the per-TC
>> level, and the presumed timer interrupt, in the 0x4000 bit, is present
>> and not masked-off. Logically, the system must be entering (and
>> exiting) the interrupt handler, yet the timer calibration isn't
>> completing. That leaves more complex possible explanations for failure,
>> most of which would fall into two categories:
>>
>> 1) The platform interrupt handler is failing to decode the event
>> properly as a timer event.
>> 2) Despite there being only one TC active, the calibration code is
>> waiting for some handshake from another "CPU"
>>
>> To test the first, you might consider adding a kprintf() to the case of
>> a "spurious" timer-like interrupt being detected and ignored...
> I have tried it. Only one interrupt arrives, and the platform handler
> detects it as a timer interrupt and acknowledges it properly. You can see a
> time stamp change in the logs.
That's really strange. And your timer interrupt is definitely on the
interrupt that corresponds to the 0x4000 mask?
I may have written the MT spec and the original SMTC code, but I don't
have a copy of the spec, and it's been a few years, and I can't
interpret the MVP and VPE control/config values. But I just don't see
how the processor could not be taking more interrupts. Stuart did
decode the global/VPE state enough to observe that global multithreaded
execution wasn't enabled, which is indeed strange - it shouldn't matter
for single-TC execution, but I don't recall there being any special-case
in the SMTC initialization that bypassed that enable. That makes me
suspect that maybe someone changed the initialization sequence in a way
that bypasses one of the canonical initialization steps in a way that
would break SMTC, but I don't know why that would result in the
interrupt behavior you observe.
It might be yet another blind alley, but could you add/arm diagnostic
output for each of the initialization functions in smtc.c?
Ah, yes, and one other thing. You should add a dump of ErrorEPC to the
MT register dump. I did it for myself once upon a time when I was
confronted with a similar mystery, but never filed a patch. If you're
breaking in with NMI, that could help identify more precisely where it's
locking up.
You really ought to try to borrow an EJTAG probe. It would save us both
a lot of time. And my time to trouble-shoot this with you is limited.
Regards,
Kevin K.
^ permalink raw reply [flat|nested] 68+ messages in thread
* RE: SMTC support status in latest git head.
@ 2010-12-22 16:34 ` STUART VENTERS
0 siblings, 0 replies; 68+ messages in thread
From: STUART VENTERS @ 2010-12-22 16:34 UTC (permalink / raw)
To: Anoop P A; +Cc: Anoop P.A., linux-mips, Kevin D. Kissell
Anoop,
Nothing jumps out at me in the new set of register values.
It might be worth dumping all the CP0 registers.
I'm especially interested in Config3, to see the VEIC bit.
The timer registers might be useful as well.
Regards,
Stuart
-----Original Message-----
From: Kevin D. Kissell [mailto:kevink@paralogos.com]
Sent: Wednesday, December 22, 2010 7:03 AM
To: Anoop P A
Cc: Anoop P.A.; STUART VENTERS; linux-mips@linux-mips.org
Subject: Re: SMTC support status in latest git head.
On 12/22/10 3:51 AM, Anoop P A wrote:
> On Wed, 2010-12-22 at 03:37 -0800, Kevin D. Kissell wrote:
>> Thanks. This is indeed strange. The VPE0 Status and TC0 TCStatus/Cause
>> all indicate that interrupts are enabled and not inhibited at the per-TC
>> level, and the presumed timer interrupt, in the 0x4000 bit, is present
>> and not masked-off. Logically, the system must be entering (and
>> exiting) the interrupt handler, yet the timer calibration isn't
>> completing. That leaves more complex possible explanations for failure,
>> most of which would fall into two categories:
>>
>> 1) The platform interrupt handler is failing to decode the event
>> properly as a timer event.
>> 2) Despite there being only one TC active, the calibration code is
>> waiting for some handshake from another "CPU"
>>
>> To test the first, you might consider adding a printk() to the case of
>> a "spurious" timer-like interrupt being detected and ignored...
> I have tried it. Only one interrupt arrives, and the platform handler
> detects it as a timer interrupt and acknowledges it properly. You can see a
> time stamp change in the logs.
That's really strange. And your timer interrupt is definitely on the
interrupt that corresponds to the 0x4000 mask?
I may have written the MT spec and the original SMTC code, but I don't
have a copy of the spec, and it's been a few years, and I can't
interpret the MVP and VPE control/config values. But I just don't see
how the processor could not be taking more interrupts. Stuart did
decode the global/VPE state enough to observe that global multithreaded
execution wasn't enabled, which is indeed strange - it shouldn't matter
for single-TC execution, but I don't recall there being any special-case
in the SMTC initialization that bypassed that enable. That makes me
suspect that maybe someone changed the initialization sequence in a way
that bypasses one of the canonical initialization steps in a way that
would break SMTC, but I don't know why that would result in the
interrupt behavior you observe.
It might be yet another blind alley, but could you add/arm diagnostic
output for each of the initialization functions in smtc.c?
Ah, yes, and one other thing. You should add a dump of ErrorEPC to the
MT register dump. I did it for myself once upon a time when I was
confronted with a similar mystery, but never filed a patch. If you're
breaking in with NMI, that could help identify more precisely where it's
locking up.
You really ought to try to borrow an EJTAG probe. It would save us both
a lot of time. And my time to trouble-shoot this with you is limited.
Regards,
Kevin K.
^ permalink raw reply [flat|nested] 68+ messages in thread
* RE: SMTC support status in latest git head.
@ 2010-12-23 21:09 ` STUART VENTERS
0 siblings, 0 replies; 68+ messages in thread
From: STUART VENTERS @ 2010-12-23 21:09 UTC (permalink / raw)
To: Kevin D. Kissell; +Cc: Anoop P.A., linux-mips, Anoop P A
[-- Attachment #1: Type: text/plain, Size: 804 bytes --]
Kevin,
I'm not sure if it's useful,
but I finally got the time to look at the two kernel versions Anoop pointed out.
  works      2.6.32-stable with patch 804
  works_not  2.6.33-stable

Grepping for files with CONFIG_MIPS_MT_SMTC
and looking for timer-interrupt-related stuff found the following differences:

arch/mips/include/asm/irq.h
arch/mips/kernel/irq.c
    do_IRQ

arch/mips/include/asm/stackframe.h
    SAVE_SOME SAVE_TEMP get/set_saved_sp

arch/mips/include/asm/time.h
    clocksource_set_clock

arch/mips/kernel/process.c
    cpu_idle

arch/mips/kernel/smtc.c
    __irq_entry
    ipi_decode
    SMTC_CLOCK_TICK

Enclosed are the two subsets of files for a more expert look.
I'll try to look in more detail after Christmas.
Cheers,
Stuart
[-- Attachment #2: foo.tar.gz --]
[-- Type: application/x-gzip, Size: 46685 bytes --]
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: SMTC support status in latest git head.
2010-12-23 21:09 ` STUART VENTERS
(?)
@ 2010-12-24 12:32 ` Kevin D. Kissell
2010-12-24 14:39 ` Anoop P A
-1 siblings, 1 reply; 68+ messages in thread
From: Kevin D. Kissell @ 2010-12-24 12:32 UTC (permalink / raw)
To: STUART VENTERS; +Cc: Anoop P.A., linux-mips, Anoop P A
Thank you, Stuart! I've spotted some definite breakage to SMTC between
those versions. In arch/mips/include/asm/stackframe.h, someone moved
the store of the Status register value in SAVE_SOME (line 169 or 204,
depending on the version) from two instructions after the mfc0 to a
point after the #ifdef for SMTC, presumably to get better pipelining of
the register access. Unfortunately, the v1 register is also used in the
SMTC-specific fragment to save TCStatus, so the Status value gets
clobbered before it gets stored. This will eventually result in the
Status register getting a TCStatus value, which has some bits in common,
but isn't identical and sooner or later Bad Things will happen.
I'm a little surprised this wasn't caught by visual inspection of the patch.
Possible solutions would include reverting the store of the CP0_STATUS
value to the block above the #ifdef, or, to retain whatever performance
advantage was obtained by moving the store downward, to use v0/$2
instead of v1/$3, as the staging register for the TCStatus value. I'd
lean toward the second option, but I'm not in a position to test and
submit a patch just now.
Regards,
Kevin K.
On 12/23/10 1:09 PM, STUART VENTERS wrote:
> Kevin,
>
> I'm not sure if it's useful,
> but finally I got the time to look at the two kernel versions Anoop pointed out.
> works 2.6.32-stable with patch 804
> works_not 2.6.33-stable
>
> grepping for files with CONFIG_MIPS_MT_SMTC
> and looking for timer interrupt related stuff found the following differences:
>
>
> arch/mips/include/asm/irq.h
> arch/mips/kernel/irq.c
> do_IRQ
>
> arch/mips/include/asm/stackframe.h
> SAVE_SOME SAVE_TEMP get/set_saved_sp
>
> arch/mips/include/asm/time.h
> clocksource_set_clock
>
> arch/mips/kernel/process.c
> cpu_idle
>
> arch/mips/kernel/smtc.c
> __irq_entry
> ipi_decode
> SMTC_CLOCK_TICK
>
>
> Enclosed are the two subsets of files for a more expert look.
>
> I'll try to look in more detail after Christmas.
>
>
> Cheers,
>
> Stuart
>
>
>
>
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: SMTC support status in latest git head.
2010-12-24 12:32 ` Kevin D. Kissell
@ 2010-12-24 14:39 ` Anoop P A
2010-12-24 14:53 ` Kevin D. Kissell
0 siblings, 1 reply; 68+ messages in thread
From: Anoop P A @ 2010-12-24 14:39 UTC (permalink / raw)
To: Kevin D. Kissell; +Cc: STUART VENTERS, Anoop P.A., linux-mips
Hi Kevin, Stuart,
Woohoo, you guys spotted it!
http://git.linux-mips.org/?p=linux.git;a=commit;h=d5ec6e3c seems to be
the culprit.
Once I restored the previous version of stackframe.h, 2.6.33-stable started
booting!
Thanks,
Anoop
On Fri, 2010-12-24 at 04:32 -0800, Kevin D. Kissell wrote:
> Thank you, Stuart! I've spotted some definite breakage to SMTC between
> those versions. In arch/mips/include/asm/stackframe.h, someone moved
> the store of the Status register value in SAVE_SOME (line 169 or 204,
> depending on the version) from two instructions after the mfc0 to a
> point after the #ifdef for SMTC, presumably to get better pipelining of
> the register access. Unfortunately, the v1 register is also used in the
> SMTC-specific fragment to save TCStatus, so the Status value gets
> clobbered before it gets stored. This will eventually result in the
> Status register getting a TCStatus value, which has some bits in common,
> but isn't identical and sooner or later Bad Things will happen.
>
> I'm a little surprised this wasn't caught by visual inspection of the patch.
>
> Possible solutions would include reverting the store of the CP0_STATUS
> value to the block above the #ifdef, or, to retain whatever performance
> advantage was obtained by moving the store downward, to use v0/$2
> instead of v1/$3, as the staging register for the TCStatus value. I'd
> lean toward the second option, but I'm not in a position to test and
> submit a patch just now.
>
> Regards,
>
> Kevin K.
>
> On 12/23/10 1:09 PM, STUART VENTERS wrote:
> > Kevin,
> >
> > I'm not sure if it's useful,
> > but finally I got the time to look at the two kernel versions Anoop pointed out.
> > works 2.6.32-stable with patch 804
> > works_not 2.6.33-stable
> >
> > grepping for files with CONFIG_MIPS_MT_SMTC
> > and looking for timer interrupt related stuff found the following differences:
> >
> >
> > arch/mips/include/asm/irq.h
> > arch/mips/kernel/irq.c
> > do_IRQ
> >
> > arch/mips/include/asm/stackframe.h
> > SAVE_SOME SAVE_TEMP get/set_saved_sp
> >
> > arch/mips/include/asm/time.h
> > clocksource_set_clock
> >
> > arch/mips/kernel/process.c
> > cpu_idle
> >
> > arch/mips/kernel/smtc.c
> > __irq_entry
> > ipi_decode
> > SMTC_CLOCK_TICK
> >
> >
> > Enclosed are the two subsets of files for a more expert look.
> >
> > I'll try to look in more detail after Christmas.
> >
> >
> > Cheers,
> >
> > Stuart
> >
> >
> >
> >
>
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: SMTC support status in latest git head.
2010-12-24 14:39 ` Anoop P A
@ 2010-12-24 14:53 ` Kevin D. Kissell
2010-12-24 16:02 ` Anoop P A
0 siblings, 1 reply; 68+ messages in thread
From: Kevin D. Kissell @ 2010-12-24 14:53 UTC (permalink / raw)
To: Anoop P A; +Cc: STUART VENTERS, Anoop P.A., linux-mips
[-- Attachment #1: Type: text/plain, Size: 2748 bytes --]
Excellent! Now, does the attached patch (relative to 2.6.37.11) also
fix things, while preserving the other fixes and performance enhancements?
/K.
On 12/24/10 6:39 AM, Anoop P A wrote:
> Hi Kevin, Stuart ,
>
> Woohooo You guys spotted !.
>
> http://git.linux-mips.org/?p=linux.git;a=commit;h=d5ec6e3c seems to be
> the culprit
>
> Once I restored previous version of stackframe.h 2.6.33-stable started
> booting !.
>
> Thanks,
> Anoop
>
> On Fri, 2010-12-24 at 04:32 -0800, Kevin D. Kissell wrote:
>> Thank you, Stuart! I've spotted some definite breakage to SMTC between
>> those versions. In arch/mips/include/asm/stackframe.h, someone moved
>> the store of the Status register value in SAVE_SOME (line 169 or 204,
>> depending on the version) from two instructions after the mfc0 to a
>> point after the #ifdef for SMTC, presumably to get better pipelining of
>> the register access. Unfortunately, the v1 register is also used in the
>> SMTC-specific fragment to save TCStatus, so the Status value gets
>> clobbered before it gets stored. This will eventually result in the
>> Status register getting a TCStatus value, which has some bits in common,
>> but isn't identical and sooner or later Bad Things will happen.
>>
>> I'm a little surprised this wasn't caught by visual inspection of the patch.
>>
>> Possible solutions would include reverting the store of the CP0_STATUS
>> value to the block above the #ifdef, or, to retain whatever performance
>> advantage was obtained by moving the store downward, to use v0/$2
>> instead of v1/$3, as the staging register for the TCStatus value. I'd
>> lean toward the second option, but I'm not in a position to test and
>> submit a patch just now.
>>
>> Regards,
>>
>> Kevin K.
>>
>> On 12/23/10 1:09 PM, STUART VENTERS wrote:
>>> Kevin,
>>>
>>> I'm not sure if it's useful,
>>> but finally I got the time to look at the two kernel versions Anoop pointed out.
>>> works 2.6.32-stable with patch 804
>>> works_not 2.6.33-stable
>>>
>>> grepping for files with CONFIG_MIPS_MT_SMTC
>>> and looking for timer interrupt related stuff found the following differences:
>>>
>>>
>>> arch/mips/include/asm/irq.h
>>> arch/mips/kernel/irq.c
>>> do_IRQ
>>>
>>> arch/mips/include/asm/stackframe.h
>>> SAVE_SOME SAVE_TEMP get/set_saved_sp
>>>
>>> arch/mips/include/asm/time.h
>>> clocksource_set_clock
>>>
>>> arch/mips/kernel/process.c
>>> cpu_idle
>>>
>>> arch/mips/kernel/smtc.c
>>> __irq_entry
>>> ipi_decode
>>> SMTC_CLOCK_TICK
>>>
>>>
>>> Enclosed are the two subsets of files for a more expert look.
>>>
>>> I'll try to look in more detail after Christmas.
>>>
>>>
>>> Cheers,
>>>
>>> Stuart
>>>
>>>
>>>
>>>
>
[-- Attachment #2: smtc_stackframe.h.patch --]
[-- Type: text/plain, Size: 394 bytes --]
--- stackframe.h 2010-12-24 06:47:06.000000000 -0800
+++ stackframe.h.test 2010-12-24 06:48:56.000000000 -0800
@@ -195,9 +195,9 @@
 		 * to cover the pipeline delay.
 		 */
 		.set	mips32
-		mfc0	v1, CP0_TCSTATUS
+		mfc0	v0, CP0_TCSTATUS
 		.set	mips0
-		LONG_S	v1, PT_TCSTATUS(sp)
+		LONG_S	v0, PT_TCSTATUS(sp)
 #endif /* CONFIG_MIPS_MT_SMTC */
 		LONG_S	$4, PT_R4(sp)
 		LONG_S	$5, PT_R5(sp)
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: SMTC support status in latest git head.
2010-12-24 14:53 ` Kevin D. Kissell
@ 2010-12-24 16:02 ` Anoop P A
2010-12-24 23:34 ` Kevin D. Kissell
0 siblings, 1 reply; 68+ messages in thread
From: Anoop P A @ 2010-12-24 16:02 UTC (permalink / raw)
To: Kevin D. Kissell; +Cc: STUART VENTERS, Anoop P.A., linux-mips
On Fri, 2010-12-24 at 06:53 -0800, Kevin D. Kissell wrote:
> Excellent! Now, does the attached patch (relative to 2.6.37.11) also
> fix things, while preserving the other fixes and performance enhancements?
>
I have tested that patch with the 2.6.37 branch. It passes the calibration
loop but hangs after switching to the MIPS clocksource:
TC 6 going on-line as CPU 6
Brought up 7 CPUs
bio: create slab <bio-0> at 0
SCSI subsystem initialized
Switching to clocksource MIPS
I presume this is a different issue, as restoring the older file didn't
help get rid of this hang.
diff --git a/arch/mips/include/asm/stackframe.h b/arch/mips/include/asm/stackframe.h
index 58730c5..7fc9f10 100644
--- a/arch/mips/include/asm/stackframe.h
+++ b/arch/mips/include/asm/stackframe.h
@@ -195,9 +195,9 @@
 		 * to cover the pipeline delay.
 		 */
 		.set	mips32
-		mfc0	v1, CP0_TCSTATUS
+		mfc0	v0, CP0_TCSTATUS
 		.set	mips0
-		LONG_S	v1, PT_TCSTATUS(sp)
+		LONG_S	v0, PT_TCSTATUS(sp)
 #endif /* CONFIG_MIPS_MT_SMTC */
 		LONG_S	$4, PT_R4(sp)
 		LONG_S	$5, PT_R5(sp)
> /K.
>
> On 12/24/10 6:39 AM, Anoop P A wrote:
> > Hi Kevin, Stuart ,
> >
> > Woohooo You guys spotted !.
> >
> > http://git.linux-mips.org/?p=linux.git;a=commit;h=d5ec6e3c seems to be
> > the culprit
> >
> > Once I restored previous version of stackframe.h 2.6.33-stable started
> > booting !.
> >
> > Thanks,
> > Anoop
> >
> > On Fri, 2010-12-24 at 04:32 -0800, Kevin D. Kissell wrote:
> >> Thank you, Stuart! I've spotted some definite breakage to SMTC between
> >> those versions. In arch/mips/include/asm/stackframe.h, someone moved
> >> the store of the Status register value in SAVE_SOME (line 169 or 204,
> >> depending on the version) from two instructions after the mfc0 to a
> >> point after the #ifdef for SMTC, presumably to get better pipelining of
> >> the register access. Unfortunately, the v1 register is also used in the
> >> SMTC-specific fragment to save TCStatus, so the Status value gets
> >> clobbered before it gets stored. This will eventually result in the
> >> Status register getting a TCStatus value, which has some bits in common,
> >> but isn't identical and sooner or later Bad Things will happen.
> >>
> >> I'm a little surprised this wasn't caught by visual inspection of the patch.
> >>
> >> Possible solutions would include reverting the store of the CP0_STATUS
> >> value to the block above the #ifdef, or, to retain whatever performance
> >> advantage was obtained by moving the store downward, to use v0/$2
> >> instead of v1/$3, as the staging register for the TCStatus value. I'd
> >> lean toward the second option, but I'm not in a position to test and
> >> submit a patch just now.
> >>
> >> Regards,
> >>
> >> Kevin K.
> >>
> >> On 12/23/10 1:09 PM, STUART VENTERS wrote:
> >>> Kevin,
> >>>
> >>> I'm not sure if it's useful,
> >>> but finally I got the time to look at the two kernel versions Anoop pointed out.
> >>> works 2.6.32-stable with patch 804
> >>> works_not 2.6.33-stable
> >>>
> >>> grepping for files with CONFIG_MIPS_MT_SMTC
> >>> and looking for timer interrupt related stuff found the following differences:
> >>>
> >>>
> >>> arch/mips/include/asm/irq.h
> >>> arch/mips/kernel/irq.c
> >>> do_IRQ
> >>>
> >>> arch/mips/include/asm/stackframe.h
> >>> SAVE_SOME SAVE_TEMP get/set_saved_sp
> >>>
> >>> arch/mips/include/asm/time.h
> >>> clocksource_set_clock
> >>>
> >>> arch/mips/kernel/process.c
> >>> cpu_idle
> >>>
> >>> arch/mips/kernel/smtc.c
> >>> __irq_entry
> >>> ipi_decode
> >>> SMTC_CLOCK_TICK
> >>>
> >>>
> >>> Enclosed are the two subsets of files for a more expert look.
> >>>
> >>> I'll try to look in more detail after Christmas.
> >>>
> >>>
> >>> Cheers,
> >>>
> >>> Stuart
> >>>
> >>>
> >>>
> >>>
> >
>
^ permalink raw reply related [flat|nested] 68+ messages in thread
* Re: SMTC support status in latest git head.
2010-12-24 16:02 ` Anoop P A
@ 2010-12-24 23:34 ` Kevin D. Kissell
2010-12-25 7:32 ` Anoop P A
2010-12-27 15:49 ` STUART VENTERS
0 siblings, 2 replies; 68+ messages in thread
From: Kevin D. Kissell @ 2010-12-24 23:34 UTC (permalink / raw)
To: Anoop P A; +Cc: STUART VENTERS, Anoop P.A., linux-mips
Ah, well, at least we have a stackframe.h fix that preserves David's
performance tweak for the deeper pipelined processors. In looking for
this, I did notice that someone did some modification to the SMTC clock
tick logic that I was skeptical had ever been tested. If you've still
got that kernel binary handy, you might check to see if it boots with
maxtcs=1 maxvpes=1, maxtcs=2 maxvpes=1, and/or maxtcs=2 maxvpes=2.
Oh, yes, and Merry Christmas one and all!
Regards,
Kevin K.
On 12/24/10 8:02 AM, Anoop P A wrote:
> On Fri, 2010-12-24 at 06:53 -0800, Kevin D. Kissell wrote:
>> Excellent! Now, does the attached patch (relative to 2.6.37.11) also
>> fix things, while preserving the other fixes and performance enhancements?
>>
> I have tested that patch with 2.6.37 branch it well passes calibration
> loop but hangs after switching to the MIPS clocksource
>
> TC 6 going on-line as CPU 6
> Brought up 7 CPUs
> bio: create slab<bio-0> at 0
> SCSI subsystem initialized
> Switching to clocksource MIPS
>
> I Presume this is a different issue as restoring older file didn't help
> much to get rid of this hang.
>
> diff --git a/arch/mips/include/asm/stackframe.h
> b/arch/mips/include/asm/stackframe.h
> index 58730c5..7fc9f10 100644
> --- a/arch/mips/include/asm/stackframe.h
> +++ b/arch/mips/include/asm/stackframe.h
> @@ -195,9 +195,9 @@
> * to cover the pipeline delay.
> */
> .set mips32
> - mfc0 v1, CP0_TCSTATUS
> + mfc0 v0, CP0_TCSTATUS
> .set mips0
> - LONG_S v1, PT_TCSTATUS(sp)
> + LONG_S v0, PT_TCSTATUS(sp)
> #endif /* CONFIG_MIPS_MT_SMTC */
> LONG_S $4, PT_R4(sp)
> LONG_S $5, PT_R5(sp)
>
>
>> /K.
>>
>> On 12/24/10 6:39 AM, Anoop P A wrote:
>>> Hi Kevin, Stuart ,
>>>
>>> Woohooo You guys spotted !.
>>>
>>> http://git.linux-mips.org/?p=linux.git;a=commit;h=d5ec6e3c seems to be
>>> the culprit
>>>
>>> Once I restored previous version of stackframe.h 2.6.33-stable started
>>> booting !.
>>>
>>> Thanks,
>>> Anoop
>>>
>>> On Fri, 2010-12-24 at 04:32 -0800, Kevin D. Kissell wrote:
>>>> Thank you, Stuart! I've spotted some definite breakage to SMTC between
>>>> those versions. In arch/mips/include/asm/stackframe.h, someone moved
>>>> the store of the Status register value in SAVE_SOME (line 169 or 204,
>>>> depending on the version) from two instructions after the mfc0 to a
>>>> point after the #ifdef for SMTC, presumably to get better pipelining of
>>>> the register access. Unfortunately, the v1 register is also used in the
>>>> SMTC-specific fragment to save TCStatus, so the Status value gets
>>>> clobbered before it gets stored. This will eventually result in the
>>>> Status register getting a TCStatus value, which has some bits on common,
>>>> but isn't identical and sooner or later Bad Things will happen.
>>>>
>>>> I'm a little surprised this wasn't caught by visual inspection of the patch.
>>>>
>>>> Possible solutions would include reverting the store of the CP0_STATUS
>>>> value to the block above the #ifdef, or, to retain whatever performance
>>>> advantage was obtained by moving the store downward, to use v0/$2
>>>> instead of v1/$3, as the staging register for the TCStatus value. I'd
>>>> lean toward the second option, but I'm not in a position to test and
>>>> submit a patch just now.
>>>>
>>>> Regards,
>>>>
>>>> Kevin K.
>>>>
>>>> On 12/23/10 1:09 PM, STUART VENTERS wrote:
>>>>> Kevin,
>>>>>
>>>>> I'm not sure if it's useful,
>>>>> but finally I got the time to look at the two kernel versions Anoop pointed out.
>>>>> works 2.6.32-stable with patch 804
>>>>> works_not 2.6.33-stable
>>>>>
>>>>> grepping for files with CONFIG_MIPS_MT_SMTC
>>>>> and looking for timer interrupt related stuff found the following differences:
>>>>>
>>>>>
>>>>> arch/mips/include/asm/irq.h
>>>>> arch/mips/kernel/irq.c
>>>>> do_IRQ
>>>>>
>>>>> arch/mips/include/asm/stackframe.h
>>>>> SAVE_SOME SAVE_TEMP get/set_saved_sp
>>>>>
>>>>> arch/mips/include/asm/time.h
>>>>> clocksource_set_clock
>>>>>
>>>>> arch/mips/kernel/process.c
>>>>> cpu_idle
>>>>>
>>>>> arch/mips/kernel/smtc.c
>>>>> __irq_entry
>>>>> ipi_decode
>>>>> SMTC_CLOCK_TICK
>>>>>
>>>>>
>>>>> Enclosed are the two subsets of files for a more expert look.
>>>>>
>>>>> I'll try to look in more detail after Christmas.
>>>>>
>>>>>
>>>>> Cheers,
>>>>>
>>>>> Stuart
>>>>>
>>>>>
>>>>>
>>>>>
>
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: SMTC support status in latest git head.
2010-12-24 23:34 ` Kevin D. Kissell
@ 2010-12-25 7:32 ` Anoop P A
2010-12-25 15:17 ` Kevin D. Kissell
2010-12-27 15:49 ` STUART VENTERS
1 sibling, 1 reply; 68+ messages in thread
From: Anoop P A @ 2010-12-25 7:32 UTC (permalink / raw)
To: Kevin D. Kissell; +Cc: STUART VENTERS, Anoop P.A., linux-mips
On Fri, 2010-12-24 at 15:34 -0800, Kevin D. Kissell wrote:
> Ah, well, at least we have a stackframe.h fix that preserves David's
> performance tweak for the deeper pipelined processors. In looking for
> this, I did notice that someone did some modification to the SMTC clock
> tick logic that I was skeptical had ever been tested. If you've still
> got that kernel binary handy, you might check to see if it boots with
> maxtcs=1 maxvpes=1, maxtcs=2 maxvpes=1, and/or maxtcs=2 maxvpes=2.
Yes, I have tried various combinations of TCs and VPEs. With
maxvpes=1 I can boot with a maximum of 4 TCs (VPE0 has 4 TCs).
However, setting maxvpes=2 and maxtcs=2 hangs pretty early:
Clock rate set to 600000000
console [ttyS0] enabled
Calibrating delay loop... 398.33 BogoMIPS (lpj=796672)
pid_max: default: 32768 minimum: 301
Mount-cache hash table entries: 512
Limit of 2 VPEs set
Limit of 2 TCs set
TLB of 64 entry pairs shared by 2 VPEs
VPE 0: TC 0, VPE 1: TC 1
IPI buffer pool of 32 buffers
CPU revision is: 00019548 ((null))
TC 1 going on-line as CPU 1
Brought up 2 CPUs
One strange observation: with maxtcs=3 and maxvpes=2 the kernel boots all
the way.
Again, with maxtcs=5 and maxvpes=2 it hangs after switching to the MIPS
clocksource.
I strongly suspect a locking issue. I will dig into the code early
next week.
>
> Oh, yes, and Merry Christmas one and all!
Thank you!
Happy Christmas, everybody.
>
> Regards,
>
> Kevin K.
>
> On 12/24/10 8:02 AM, Anoop P A wrote:
> > On Fri, 2010-12-24 at 06:53 -0800, Kevin D. Kissell wrote:
> >> Excellent! Now, does the attached patch (relative to 2.6.37.11) also
> >> fix things, while preserving the other fixes and performance enhancements?
> >>
> > I have tested that patch with 2.6.37 branch it well passes calibration
> > loop but hangs after switching to the MIPS clocksource
> >
> > TC 6 going on-line as CPU 6
> > Brought up 7 CPUs
> > bio: create slab<bio-0> at 0
> > SCSI subsystem initialized
> > Switching to clocksource MIPS
> >
> > I Presume this is a different issue as restoring older file didn't help
> > much to get rid of this hang.
> >
> > diff --git a/arch/mips/include/asm/stackframe.h
> > b/arch/mips/include/asm/stackframe.h
> > index 58730c5..7fc9f10 100644
> > --- a/arch/mips/include/asm/stackframe.h
> > +++ b/arch/mips/include/asm/stackframe.h
> > @@ -195,9 +195,9 @@
> > * to cover the pipeline delay.
> > */
> > .set mips32
> > - mfc0 v1, CP0_TCSTATUS
> > + mfc0 v0, CP0_TCSTATUS
> > .set mips0
> > - LONG_S v1, PT_TCSTATUS(sp)
> > + LONG_S v0, PT_TCSTATUS(sp)
> > #endif /* CONFIG_MIPS_MT_SMTC */
> > LONG_S $4, PT_R4(sp)
> > LONG_S $5, PT_R5(sp)
> >
> >
> >> /K.
> >>
> >> On 12/24/10 6:39 AM, Anoop P A wrote:
> >>> Hi Kevin, Stuart ,
> >>>
> >>> Woohooo You guys spotted !.
> >>>
> >>> http://git.linux-mips.org/?p=linux.git;a=commit;h=d5ec6e3c seems to be
> >>> the culprit
> >>>
> >>> Once I restored previous version of stackframe.h 2.6.33-stable started
> >>> booting !.
> >>>
> >>> Thanks,
> >>> Anoop
> >>>
> >>> On Fri, 2010-12-24 at 04:32 -0800, Kevin D. Kissell wrote:
> >>>> Thank you, Stuart! I've spotted some definite breakage to SMTC between
> >>>> those versions. In arch/mips/include/asm/stackframe.h, someone moved
> >>>> the store of the Status register value in SAVE_SOME (line 169 or 204,
> >>>> depending on the version) from two instructions after the mfc0 to a
> >>>> point after the #ifdef for SMTC, presumably to get better pipelining of
> >>>> the register access. Unfortunately, the v1 register is also used in the
> >>>> SMTC-specific fragment to save TCStatus, so the Status value gets
> >>>> clobbered before it gets stored. This will eventually result in the
> >>>> Status register getting a TCStatus value, which has some bits on common,
> >>>> but isn't identical and sooner or later Bad Things will happen.
> >>>>
> >>>> I'm a little surprised this wasn't caught by visual inspection of the patch.
> >>>>
> >>>> Possible solutions would include reverting the store of the CP0_STATUS
> >>>> value to the block above the #ifdef, or, to retain whatever performance
> >>>> advantage was obtained by moving the store downward, to use v0/$2
> >>>> instead of v1/$3, as the staging register for the TCStatus value. I'd
> >>>> lean toward the second option, but I'm not in a position to test and
> >>>> submit a patch just now.
> >>>>
> >>>> Regards,
> >>>>
> >>>> Kevin K.
> >>>>
> >>>> On 12/23/10 1:09 PM, STUART VENTERS wrote:
> >>>>> Kevin,
> >>>>>
> >>>>> I'm not sure if it's useful,
> >>>>> but finally I got the time to look at the two kernel versions Anoop pointed out.
> >>>>> works 2.6.32-stable with patch 804
> >>>>> works_not 2.6.33-stable
> >>>>>
> >>>>> greping for files with CONFIG_MIPS_MT_SMTC
> >>>>> and looking for timer interrupt related stuff found the following differences:
> >>>>>
> >>>>>
> >>>>> arch/mips/include/asm/irq.h
> >>>>> arch/mips/kernel/irq.c
> >>>>> do_IRQ
> >>>>>
> >>>>> arch/mips/include/asm/stackframe.h
> >>>>> SAVE_SOME SAVE_TEMP get/set_saved_sp
> >>>>>
> >>>>> arch/mips/include/asm/time.h
> >>>>> clocksource_set_clock
> >>>>>
> >>>>> arch/mips/kernel/process.c
> >>>>> cpu_idle
> >>>>>
> >>>>> arch/mips/kernel/smtc.c
> >>>>> __irq_entry
> >>>>> ipi_decode
> >>>>> SMTC_CLOCK_TICK
> >>>>>
> >>>>>
> >>>>> Enclosed are the two subsets of files for a more expert look.
> >>>>>
> >>>>> I'll try to look in more detail after Christmas.
> >>>>>
> >>>>>
> >>>>> Cheers,
> >>>>>
> >>>>> Stuart
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >
>
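For readers following the stackframe.h fix quoted above, here is a toy
Python model of the ordering problem. It is only a sketch: the CP0 values
are made-up placeholders, the dictionaries stand in for registers and the
saved frame, and save_some() is a hypothetical name, not the kernel macro.
It models the sequence Kevin describes, where the Status store happens
after the SMTC block, and shows what lands in the frame depending on which
register stages TCStatus.

```python
# Toy model of the SAVE_SOME ordering issue; this is not kernel code.
# The register contents below are arbitrary placeholder values.
CP0 = {"STATUS": 0x1100FF01, "TCSTATUS": 0x18102000}


def save_some(tc_staging_reg):
    """Model the save sequence with the Status store moved below the SMTC block.

    tc_staging_reg names the register used to stage TCStatus ("v1" or "v0").
    """
    regs = {}
    frame = {}
    regs["v1"] = CP0["STATUS"]                   # mfc0   v1, CP0_STATUS
    regs[tc_staging_reg] = CP0["TCSTATUS"]       # mfc0   <reg>, CP0_TCSTATUS
    frame["PT_TCSTATUS"] = regs[tc_staging_reg]  # LONG_S <reg>, PT_TCSTATUS(sp)
    frame["PT_STATUS"] = regs["v1"]              # LONG_S v1, PT_STATUS(sp)
    return frame


broken = save_some("v1")  # TCStatus staged in v1 overwrites the Status value
fixed = save_some("v0")   # TCStatus staged in v0 leaves v1 intact

print(hex(broken["PT_STATUS"]), hex(fixed["PT_STATUS"]))
```

With v1 as the staging register, the frame slot for Status ends up holding
the TCStatus value, which is the clobber described in the quoted analysis;
with v0 both slots hold what they should.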
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: SMTC support status in latest git head.
2010-12-25 7:32 ` Anoop P A
@ 2010-12-25 15:17 ` Kevin D. Kissell
0 siblings, 0 replies; 68+ messages in thread
From: Kevin D. Kissell @ 2010-12-25 15:17 UTC (permalink / raw)
To: Anoop P A; +Cc: STUART VENTERS, Anoop P.A., linux-mips
On 12/24/10 11:32 PM, Anoop P A wrote:
> On Fri, 2010-12-24 at 15:34 -0800, Kevin D. Kissell wrote:
>> Ah, well, at least we have a stackframe.h fix that preserves David's
>> performance tweak for the deeper pipelined processors. In looking for
>> this, I did notice that someone did some modification to the SMTC clock
>> tick logic that I was skeptical had ever been tested. If you've still
>> got that kernel binary handy, you might check to see if it boots with
>> maxtcs=1 maxvpes=1, maxtcs=2 maxvpes=1, and/or maxtcs=2 maxvpes=2.
> Yes I have tried with various combinations of tcs and vpes. with
> maxvpes=1 I can boot with a max of 4 TCS ( VPE0 has 4 TCs) .
> However setting maxvpes=2 and maxtcs=2 hangs pretty early.
>
> Clock rate set to 600000000
> console [ttyS0] enabled
> Calibrating delay loop... 398.33 BogoMIPS (lpj=796672)
> pid_max: default: 32768 minimum: 301
> Mount-cache hash table entries: 512
> Limit of 2 VPEs set
> Limit of 2 TCs set
> TLB of 64 entry pairs shared by 2 VPEs
> VPE 0: TC 0, VPE 1: TC 1
> IPI buffer pool of 32 buffers
> CPU revision is: 00019548 ((null))
> TC 1 going on-line as CPU 1
> Brought up 2 CPUs
>
> One strange observation is with maxtcs=3 and maxvpes=2 kernel boots all
> the way.
>
> Again with maxtcs=5 and maxvpes=2 it hangs after switching to MIPS
> clocksource.
>
> I strongly suspect some issue with locking. I will dig the code early
> next week.
If locking is screwed up, I'd expect more problems with 4 TC "CPUs" in
the same VPE. It also suggests that the basic distribution via local
low-latency IPI within a VPE is functioning, but that something is
broken in the cross-VPE event propagation. I strongly suspect that
your maxtcs=3, maxvpes=2 case would hang sooner or later, but by luck of
the draw none of the init threads got scheduled on VPE 1 long enough to
get stuck.
I note that there were some changes made under the rubric "MIPS: SMTC:
Avoid queueing multiple reschedule IPIs" in October and November of last
year that make me nervous. I wouldn't have coded things that way
myself, but they might be OK. Still, the first bisection I'd make if I
was trouble-shooting this would be to roll back to just before they went in.
Ho, ho, ho,
Kevin K.
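The boot results quoted in this message can be tabulated to check the
cross-VPE reading of them. The following is a small sketch, assuming the
four configurations as reported above; the outcome strings are paraphrases
and the variable names are illustrative only.

```python
# (maxvpes, maxtcs) -> outcome, as reported in the quoted message.
results = {
    (1, 4): "boots",
    (2, 2): "hangs early",
    (2, 3): "boots",
    (2, 5): "hangs after clocksource switch",
}

# Collect the configurations that hung.
hangs = [cfg for cfg, outcome in results.items() if outcome.startswith("hangs")]

# True if every reported hang used more than one VPE.
multi_vpe_only = all(vpes > 1 for vpes, _tcs in hangs)

print(sorted(hangs), multi_vpe_only)
```

Every hang in this small sample has maxvpes=2, which is consistent with
(though far from proof of) a problem on the cross-VPE path rather than in
locking within a single VPE.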
^ permalink raw reply [flat|nested] 68+ messages in thread
* RE: SMTC support status in latest git head.
@ 2010-12-27 15:49 ` STUART VENTERS
0 siblings, 0 replies; 68+ messages in thread
From: STUART VENTERS @ 2010-12-27 15:49 UTC (permalink / raw)
To: Kevin D. Kissell, Anoop P A; +Cc: Anoop P.A., linux-mips
Kevin,
Outstanding, sometimes it's better to be lucky than good.
Anoop,
Maybe we can get lucky again.
If you can isolate the .33 works/.37 works_not bug to a specific pair of versions,
I'll be happy to do another diff.
Hope you'll have had a good Christmas as well.
We've had snow in Alabama since Christmas eve!
Regards,
Stuart
-----Original Message-----
From: Kevin D. Kissell [mailto:kevink@paralogos.com]
Sent: Friday, December 24, 2010 5:34 PM
To: Anoop P A
Cc: STUART VENTERS; Anoop P.A.; linux-mips@linux-mips.org
Subject: Re: SMTC support status in latest git head.
Ah, well, at least we have a stackframe.h fix that preserves David's
performance tweak for the deeper pipelined processors. In looking for
this, I did notice that someone did some modification to the SMTC clock
tick logic that I was skeptical had ever been tested. If you've still
got that kernel binary handy, you might check to see if it boots with
maxtcs=1 maxvpes=1, maxtcs=2 maxvpes=1, and/or maxtcs=2 maxvpes=2.
Oh, yes, and Merry Christmas one and all!
Regards,
Kevin K.
On 12/24/10 8:02 AM, Anoop P A wrote:
> On Fri, 2010-12-24 at 06:53 -0800, Kevin D. Kissell wrote:
>> Excellent! Now, does the attached patch (relative to 2.6.37.11) also
>> fix things, while preserving the other fixes and performance enhancements?
>>
> I have tested that patch with 2.6.37 branch it well passes calibration
> loop but hangs after switching to mips closource
>
> TC 6 going on-line as CPU 6
> Brought up 7 CPUs
> bio: create slab<bio-0> at 0
> SCSI subsystem initialized
> Switching to clocksource MIPS
>
> I Presume this is a different issue as restoring older file didn't help
> much to get rid of this hang.
>
> diff --git a/arch/mips/include/asm/stackframe.h
> b/arch/mips/include/asm/stackframe.h
> index 58730c5..7fc9f10 100644
> --- a/arch/mips/include/asm/stackframe.h
> +++ b/arch/mips/include/asm/stackframe.h
> @@ -195,9 +195,9 @@
> * to cover the pipeline delay.
> */
> .set mips32
> - mfc0 v1, CP0_TCSTATUS
> + mfc0 v0, CP0_TCSTATUS
> .set mips0
> - LONG_S v1, PT_TCSTATUS(sp)
> + LONG_S v0, PT_TCSTATUS(sp)
> #endif /* CONFIG_MIPS_MT_SMTC */
> LONG_S $4, PT_R4(sp)
> LONG_S $5, PT_R5(sp)
>
>
>> /K.
>>
>> On 12/24/10 6:39 AM, Anoop P A wrote:
>>> Hi Kevin, Stuart ,
>>>
>>> Woohooo You guys spotted !.
>>>
>>> http://git.linux-mips.org/?p=linux.git;a=commit;h=d5ec6e3c seems to be
>>> the culprit
>>>
>>> Once I restored previous version of stackframe.h 2.6.33-stable started
>>> booting !.
>>>
>>> Thanks,
>>> Anoop
>>>
>>> On Fri, 2010-12-24 at 04:32 -0800, Kevin D. Kissell wrote:
>>>> Thank you, Stuart! I've spotted some definite breakage to SMTC between
>>>> those versions. In arch/mips/include/asm/stackframe.h, someone moved
>>>> the store of the Status register value in SAVE_SOME (line 169 or 204,
>>>> depending on the version) from two instructions after the mfc0 to a
>>>> point after the #ifdef for SMTC, presumably to get better pipelining of
>>>> the register access. Unfortunately, the v1 register is also used in the
>>>> SMTC-specific fragment to save TCStatus, so the Status value gets
>>>> clobbered before it gets stored. This will eventually result in the
>>>> Status register getting a TCStatus value, which has some bits on common,
>>>> but isn't identical and sooner or later Bad Things will happen.
>>>>
>>>> I'm a little surprised this wasn't caught by visual inspection of the patch.
>>>>
>>>> Possible solutions would include reverting the store of the CP0_STATUS
>>>> value to the block above the #ifdef, or, to retain whatever performance
>>>> advantage was obtained by moving the store downward, to use v0/$2
>>>> instead of v1/$3, as the staging register for the TCStatus value. I'd
>>>> lean toward the second option, but I'm not in a position to test and
>>>> submit a patch just now.
>>>>
>>>> Regards,
>>>>
>>>> Kevin K.
>>>>
>>>> On 12/23/10 1:09 PM, STUART VENTERS wrote:
>>>>> Kevin,
>>>>>
>>>>> I'm not sure if it's useful,
>>>>> but finally I got the time to look at the two kernel versions Anoop pointed out.
>>>>> works 2.6.32-stable with patch 804
>>>>> works_not 2.6.33-stable
>>>>>
>>>>> greping for files with CONFIG_MIPS_MT_SMTC
>>>>> and looking for timer interrupt related stuff found the following differences:
>>>>>
>>>>>
>>>>> arch/mips/include/asm/irq.h
>>>>> arch/mips/kernel/irq.c
>>>>> do_IRQ
>>>>>
>>>>> arch/mips/include/asm/stackframe.h
>>>>> SAVE_SOME SAVE_TEMP get/set_saved_sp
>>>>>
>>>>> arch/mips/include/asm/time.h
>>>>> clocksource_set_clock
>>>>>
>>>>> arch/mips/kernel/process.c
>>>>> cpu_idle
>>>>>
>>>>> arch/mips/kernel/smtc.c
>>>>> __irq_entry
>>>>> ipi_decode
>>>>> SMTC_CLOCK_TICK
>>>>>
>>>>>
>>>>> Enclosed are the two subsets of files for a more expert look.
>>>>>
>>>>> I'll try to look in more detail after Christmas.
>>>>>
>>>>>
>>>>> Cheers,
>>>>>
>>>>> Stuart
>>>>>
>>>>>
>>>>>
>>>>>
>
^ permalink raw reply [flat|nested] 68+ messages in thread
* RE: SMTC support status in latest git head.
2010-12-27 15:49 ` STUART VENTERS
(?)
@ 2010-12-27 17:19 ` Anoop P A
2010-12-28 8:19 ` Anoop P A
-1 siblings, 1 reply; 68+ messages in thread
From: Anoop P A @ 2010-12-27 17:19 UTC (permalink / raw)
To: STUART VENTERS; +Cc: Kevin D. Kissell, Anoop P.A., linux-mips
Hi Kevin,
It is very unlikely that the patch you pointed to has any impact on the
hang I am seeing. The patch you mentioned went into the kernel around the
2.6.32 timeframe. I am able to boot both the 2.6.32 and 2.6.33 kernels
(+ stackframe patch).
Hi Stuart,
I haven't got much time to spend on this today.
I got 2.6.36-stable (+ stackframe patch) booting yesterday, and I have
observed the hang with 2.6.37-rc1 (same as rc6 and current git head).
So some patch in the 2.6.37 branch probably introduced this hang.
Hopefully I will get a free slot tomorrow so that I can look into the
code diff.
Thanks
Anoop
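With a known-good and a known-bad kernel in hand, the search for the
offending change is a binary search over the commits in between, which is
what git bisect automates. The following is a minimal sketch of that idea
only: the revision labels and the boots() predicate are hypothetical
stand-ins for real commits and for building and booting each one on the
board, and it assumes a single good-to-bad transition in the range.

```python
def first_bad(revisions, boots):
    """Return the first revision for which boots(rev) is False.

    Assumes revisions is ordered oldest to newest, that revisions[0] boots,
    that revisions[-1] does not, and that there is a single transition.
    """
    lo, hi = 0, len(revisions) - 1  # lo is known good, hi is known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if boots(revisions[mid]):
            lo = mid
        else:
            hi = mid
    return revisions[hi]


# Hypothetical history of 16 commits in which commit c11 introduces the hang.
revs = [f"c{i}" for i in range(16)]
culprit = first_bad(revs, lambda rev: int(rev[1:]) < 11)
print(culprit)
```

Each step halves the candidate range, so even a few thousand commits
between two releases need only about a dozen build-and-boot cycles.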
On Mon, 2010-12-27 at 09:49 -0600, STUART VENTERS wrote:
> Kevin,
>
> Outstanding, sometimes it's better to be lucky than good.
>
>
> Anoop,
>
> Maybe we can get lucky again.
>
> If you can isolate the .33 works/.37 works_not bug to a specific pair of versions,
> I'll be happy to do another diff.
>
>
> Hope you'll have had a good Christmas as well.
> We've had snow in Alabama since Christmas eve!
>
>
> Regards,
>
> Stuart
>
>
> -----Original Message-----
> From: Kevin D. Kissell [mailto:kevink@paralogos.com]
> Sent: Friday, December 24, 2010 5:34 PM
> To: Anoop P A
> Cc: STUART VENTERS; Anoop P.A.; linux-mips@linux-mips.org
> Subject: Re: SMTC support status in latest git head.
>
>
> Ah, well, at least we have a stackframe.h fix that preserves David's
> performance tweak for the deeper pipelined processors. In looking for
> this, I did notice that someone did some modification to the SMTC clock
> tick logic that I was skeptical had ever been tested. If you've still
> got that kernel binary handy, you might check to see if it boots with
> maxtcs=1 maxvpes=1, maxtcs=2 maxvpes=1, and/or maxtcs=2 maxvpes=2.
>
> Oh, yes, and Merry Christmas one and all!
>
> Regards,
>
> Kevin K.
>
> On 12/24/10 8:02 AM, Anoop P A wrote:
> > On Fri, 2010-12-24 at 06:53 -0800, Kevin D. Kissell wrote:
> >> Excellent! Now, does the attached patch (relative to 2.6.37.11) also
> >> fix things, while preserving the other fixes and performance enhancements?
> >>
> > I have tested that patch with 2.6.37 branch it well passes calibration
> > loop but hangs after switching to mips closource
> >
> > TC 6 going on-line as CPU 6
> > Brought up 7 CPUs
> > bio: create slab<bio-0> at 0
> > SCSI subsystem initialized
> > Switching to clocksource MIPS
> >
> > I Presume this is a different issue as restoring older file didn't help
> > much to get rid of this hang.
> >
> > diff --git a/arch/mips/include/asm/stackframe.h
> > b/arch/mips/include/asm/stackframe.h
> > index 58730c5..7fc9f10 100644
> > --- a/arch/mips/include/asm/stackframe.h
> > +++ b/arch/mips/include/asm/stackframe.h
> > @@ -195,9 +195,9 @@
> > * to cover the pipeline delay.
> > */
> > .set mips32
> > - mfc0 v1, CP0_TCSTATUS
> > + mfc0 v0, CP0_TCSTATUS
> > .set mips0
> > - LONG_S v1, PT_TCSTATUS(sp)
> > + LONG_S v0, PT_TCSTATUS(sp)
> > #endif /* CONFIG_MIPS_MT_SMTC */
> > LONG_S $4, PT_R4(sp)
> > LONG_S $5, PT_R5(sp)
> >
> >
> >> /K.
> >>
> >> On 12/24/10 6:39 AM, Anoop P A wrote:
> >>> Hi Kevin, Stuart ,
> >>>
> >>> Woohooo You guys spotted !.
> >>>
> >>> http://git.linux-mips.org/?p=linux.git;a=commit;h=d5ec6e3c seems to be
> >>> the culprit
> >>>
> >>> Once I restored previous version of stackframe.h 2.6.33-stable started
> >>> booting !.
> >>>
> >>> Thanks,
> >>> Anoop
> >>>
> >>> On Fri, 2010-12-24 at 04:32 -0800, Kevin D. Kissell wrote:
> >>>> Thank you, Stuart! I've spotted some definite breakage to SMTC between
> >>>> those versions. In arch/mips/include/asm/stackframe.h, someone moved
> >>>> the store of the Status register value in SAVE_SOME (line 169 or 204,
> >>>> depending on the version) from two instructions after the mfc0 to a
> >>>> point after the #ifdef for SMTC, presumably to get better pipelining of
> >>>> the register access. Unfortunately, the v1 register is also used in the
> >>>> SMTC-specific fragment to save TCStatus, so the Status value gets
> >>>> clobbered before it gets stored. This will eventually result in the
> >>>> Status register getting a TCStatus value, which has some bits on common,
> >>>> but isn't identical and sooner or later Bad Things will happen.
> >>>>
> >>>> I'm a little surprised this wasn't caught by visual inspection of the patch.
> >>>>
> >>>> Possible solutions would include reverting the store of the CP0_STATUS
> >>>> value to the block above the #ifdef, or, to retain whatever performance
> >>>> advantage was obtained by moving the store downward, to use v0/$2
> >>>> instead of v1/$3, as the staging register for the TCStatus value. I'd
> >>>> lean toward the second option, but I'm not in a position to test and
> >>>> submit a patch just now.
> >>>>
> >>>> Regards,
> >>>>
> >>>> Kevin K.
> >>>>
> >>>> On 12/23/10 1:09 PM, STUART VENTERS wrote:
> >>>>> Kevin,
> >>>>>
> >>>>> I'm not sure if it's useful,
> >>>>> but finally I got the time to look at the two kernel versions Anoop pointed out.
> >>>>> works 2.6.32-stable with patch 804
> >>>>> works_not 2.6.33-stable
> >>>>>
> >>>>> greping for files with CONFIG_MIPS_MT_SMTC
> >>>>> and looking for timer interrupt related stuff found the following differences:
> >>>>>
> >>>>>
> >>>>> arch/mips/include/asm/irq.h
> >>>>> arch/mips/kernel/irq.c
> >>>>> do_IRQ
> >>>>>
> >>>>> arch/mips/include/asm/stackframe.h
> >>>>> SAVE_SOME SAVE_TEMP get/set_saved_sp
> >>>>>
> >>>>> arch/mips/include/asm/time.h
> >>>>> clocksource_set_clock
> >>>>>
> >>>>> arch/mips/kernel/process.c
> >>>>> cpu_idle
> >>>>>
> >>>>> arch/mips/kernel/smtc.c
> >>>>> __irq_entry
> >>>>> ipi_decode
> >>>>> SMTC_CLOCK_TICK
> >>>>>
> >>>>>
> >>>>> Enclosed are the two subsets of files for a more expert look.
> >>>>>
> >>>>> I'll try to look in more detail after Christmas.
> >>>>>
> >>>>>
> >>>>> Cheers,
> >>>>>
> >>>>> Stuart
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >
>
^ permalink raw reply [flat|nested] 68+ messages in thread
* RE: SMTC support status in latest git head.
2010-12-27 17:19 ` Anoop P A
@ 2010-12-28 8:19 ` Anoop P A
2010-12-28 8:43 ` Kevin D. Kissell
0 siblings, 1 reply; 68+ messages in thread
From: Anoop P A @ 2010-12-28 8:19 UTC (permalink / raw)
To: STUART VENTERS; +Cc: Kevin D. Kissell, Anoop P.A., linux-mips
Hi,
I had a glance through the code diff but did not notice any suspicious
code.
Tracing the hang showed that it is hanging in the timekeeping_notify
function.
Thanks,
Anoop
PS: I may not be available until Thursday
On Mon, 2010-12-27 at 22:49 +0530, Anoop P A wrote:
> Hi Kevin,
>
> It is very unlikely that the patch you pointed has any impact on the the
> hang I am seeing. The patch you have mentioned got into kernel around
> 2.6.32 timeframe. I am able to boot both 2.6.32 and 2.6.33 kernel ( +
> stackframe patch) .
>
> Hi Stuart,
>
> I haven't got much time to spend on this today.
>
> I had got 2.6.36-stable(+ stack frame patch) booting last day and I have
> observed hang issue with 2.6.37-rc1 ( Same as rc6 and current git head)
>
> So probably some patches in 2.6.37 branch introduced this hang.
>
> Hopefully I will get some free slot tomorrow so that I can look into
> code diff .
>
> Thanks
> Anoop
>
> On Mon, 2010-12-27 at 09:49 -0600, STUART VENTERS wrote:
> > Kevin,
> >
> > Outstanding, sometimes it's better to be lucky than good.
> >
> >
> > Anoop,
> >
> > Maybe we can get lucky again.
> >
> > If you can isolate the .33 works/.37 works_not bug to a specific pair of versions,
> > I'll be happy to do another diff.
> >
> >
> > Hope you'll have had a good Christmas as well.
> > We've had snow in Alabama since Christmas eve!
> >
> >
> > Regards,
> >
> > Stuart
> >
> >
> > -----Original Message-----
> > From: Kevin D. Kissell [mailto:kevink@paralogos.com]
> > Sent: Friday, December 24, 2010 5:34 PM
> > To: Anoop P A
> > Cc: STUART VENTERS; Anoop P.A.; linux-mips@linux-mips.org
> > Subject: Re: SMTC support status in latest git head.
> >
> >
> > Ah, well, at least we have a stackframe.h fix that preserves David's
> > performance tweak for the deeper pipelined processors. In looking for
> > this, I did notice that someone did some modification to the SMTC clock
> > tick logic that I was skeptical had ever been tested. If you've still
> > got that kernel binary handy, you might check to see if it boots with
> > maxtcs=1 maxvpes=1, maxtcs=2 maxvpes=1, and/or maxtcs=2 maxvpes=2.
> >
> > Oh, yes, and Merry Christmas one and all!
> >
> > Regards,
> >
> > Kevin K.
> >
> > On 12/24/10 8:02 AM, Anoop P A wrote:
> > > On Fri, 2010-12-24 at 06:53 -0800, Kevin D. Kissell wrote:
> > >> Excellent! Now, does the attached patch (relative to 2.6.37.11) also
> > >> fix things, while preserving the other fixes and performance enhancements?
> > >>
> > > I have tested that patch with 2.6.37 branch it well passes calibration
> > > loop but hangs after switching to mips closource
> > >
> > > TC 6 going on-line as CPU 6
> > > Brought up 7 CPUs
> > > bio: create slab<bio-0> at 0
> > > SCSI subsystem initialized
> > > Switching to clocksource MIPS
> > >
> > > I Presume this is a different issue as restoring older file didn't help
> > > much to get rid of this hang.
> > >
> > > diff --git a/arch/mips/include/asm/stackframe.h
> > > b/arch/mips/include/asm/stackframe.h
> > > index 58730c5..7fc9f10 100644
> > > --- a/arch/mips/include/asm/stackframe.h
> > > +++ b/arch/mips/include/asm/stackframe.h
> > > @@ -195,9 +195,9 @@
> > > * to cover the pipeline delay.
> > > */
> > > .set mips32
> > > - mfc0 v1, CP0_TCSTATUS
> > > + mfc0 v0, CP0_TCSTATUS
> > > .set mips0
> > > - LONG_S v1, PT_TCSTATUS(sp)
> > > + LONG_S v0, PT_TCSTATUS(sp)
> > > #endif /* CONFIG_MIPS_MT_SMTC */
> > > LONG_S $4, PT_R4(sp)
> > > LONG_S $5, PT_R5(sp)
> > >
> > >
> > >> /K.
> > >>
> > >> On 12/24/10 6:39 AM, Anoop P A wrote:
> > >>> Hi Kevin, Stuart ,
> > >>>
> > >>> Woohooo You guys spotted !.
> > >>>
> > >>> http://git.linux-mips.org/?p=linux.git;a=commit;h=d5ec6e3c seems to be
> > >>> the culprit
> > >>>
> > >>> Once I restored previous version of stackframe.h 2.6.33-stable started
> > >>> booting !.
> > >>>
> > >>> Thanks,
> > >>> Anoop
> > >>>
> > >>> On Fri, 2010-12-24 at 04:32 -0800, Kevin D. Kissell wrote:
> > >>>> Thank you, Stuart! I've spotted some definite breakage to SMTC between
> > >>>> those versions. In arch/mips/include/asm/stackframe.h, someone moved
> > >>>> the store of the Status register value in SAVE_SOME (line 169 or 204,
> > >>>> depending on the version) from two instructions after the mfc0 to a
> > >>>> point after the #ifdef for SMTC, presumably to get better pipelining of
> > >>>> the register access. Unfortunately, the v1 register is also used in the
> > >>>> SMTC-specific fragment to save TCStatus, so the Status value gets
> > >>>> clobbered before it gets stored. This will eventually result in the
> > >>>> Status register getting a TCStatus value, which has some bits on common,
> > >>>> but isn't identical and sooner or later Bad Things will happen.
> > >>>>
> > >>>> I'm a little surprised this wasn't caught by visual inspection of the patch.
> > >>>>
> > >>>> Possible solutions would include reverting the store of the CP0_STATUS
> > >>>> value to the block above the #ifdef, or, to retain whatever performance
> > >>>> advantage was obtained by moving the store downward, to use v0/$2
> > >>>> instead of v1/$3, as the staging register for the TCStatus value. I'd
> > >>>> lean toward the second option, but I'm not in a position to test and
> > >>>> submit a patch just now.
> > >>>>
> > >>>> Regards,
> > >>>>
> > >>>> Kevin K.
> > >>>>
> > >>>> On 12/23/10 1:09 PM, STUART VENTERS wrote:
> > >>>>> Kevin,
> > >>>>>
> > >>>>> I'm not sure if it's useful,
> > >>>>> but finally I got the time to look at the two kernel versions Anoop pointed out.
> > >>>>> works 2.6.32-stable with patch 804
> > >>>>> works_not 2.6.33-stable
> > >>>>>
> > >>>>> greping for files with CONFIG_MIPS_MT_SMTC
> > >>>>> and looking for timer interrupt related stuff found the following differences:
> > >>>>>
> > >>>>>
> > >>>>> arch/mips/include/asm/irq.h
> > >>>>> arch/mips/kernel/irq.c
> > >>>>> do_IRQ
> > >>>>>
> > >>>>> arch/mips/include/asm/stackframe.h
> > >>>>> SAVE_SOME SAVE_TEMP get/set_saved_sp
> > >>>>>
> > >>>>> arch/mips/include/asm/time.h
> > >>>>> clocksource_set_clock
> > >>>>>
> > >>>>> arch/mips/kernel/process.c
> > >>>>> cpu_idle
> > >>>>>
> > >>>>> arch/mips/kernel/smtc.c
> > >>>>> __irq_entry
> > >>>>> ipi_decode
> > >>>>> SMTC_CLOCK_TICK
> > >>>>>
> > >>>>>
> > >>>>> Enclosed are the two subsets of files for a more expert look.
> > >>>>>
> > >>>>> I'll try to look in more detail after Christmas.
> > >>>>>
> > >>>>>
> > >>>>> Cheers,
> > >>>>>
> > >>>>> Stuart
> > >>>>>
> > >>>>>
> > >>>>>
> > >>>>>
> > >
> >
>
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: SMTC support status in latest git head.
2010-12-28 8:19 ` Anoop P A
@ 2010-12-28 8:43 ` Kevin D. Kissell
2010-12-31 12:27 ` Anoop P A
0 siblings, 1 reply; 68+ messages in thread
From: Kevin D. Kissell @ 2010-12-28 8:43 UTC (permalink / raw)
To: Anoop P A; +Cc: STUART VENTERS, Anoop P.A., linux-mips
I took a quick look last night, and the only thing that looked vaguely
dangerous in changes since the timer changes I alluded to earlier was
the global naming cleanup of irq-related function names that David
Howell submitted. The diff didn't look dangerous in itself, but some of
the definitions are nested subtly for SMTC to maximize the amount of
common code, and I could imagine something getting lost in translation
there. If that were really the problem, it would of course affect much
more than just the timer subsystem, but early in the boot process,
timers are pretty much the only interrupts that have to be handled
correctly.
I'm travelling today, but will take a look at timekeeping_notify()
tomorrow or the next day...
/K.
On 12/28/10 12:19 AM, Anoop P A wrote:
> Hi,
>
> I had a glance into the code diff without notice of any suspect-able
> code .
> Tracing the hang showed that it is getting hanged in timekeeping_notify
> function.
>
> Thanks,
> Anoop
>
> PS: I may not be available until Thursday
>
> On Mon, 2010-12-27 at 22:49 +0530, Anoop P A wrote:
>> Hi Kevin,
>>
>> It is very unlikely that the patch you pointed has any impact on the the
>> hang I am seeing. The patch you have mentioned got into kernel around
>> 2.6.32 timeframe. I am able to boot both 2.6.32 and 2.6.33 kernel ( +
>> stackframe patch) .
>>
>> Hi Stuart,
>>
>> I haven't got much time to spend on this today.
>>
>> I had got 2.6.36-stable(+ stack frame patch) booting last day and I have
>> observed hang issue with 2.6.37-rc1 ( Same as rc6 and current git head)
>>
>> So probably some patches in 2.6.37 branch introduced this hang.
>>
>> Hopefully I will get some free slot tomorrow so that I can look into
>> code diff .
>>
>> Thanks
>> Anoop
>>
>> On Mon, 2010-12-27 at 09:49 -0600, STUART VENTERS wrote:
>>> Kevin,
>>>
>>> Outstanding, sometimes it's better to be lucky than good.
>>>
>>>
>>> Anoop,
>>>
>>> Maybe we can get lucky again.
>>>
>>> If you can isolate the .33 works/.37 works_not bug to a specific pair of versions,
>>> I'll be happy to do another diff.
>>>
>>>
>>> Hope you'll have had a good Christmas as well.
>>> We've had snow in Alabama since Christmas eve!
>>>
>>>
>>> Regards,
>>>
>>> Stuart
>>>
>>>
>>> -----Original Message-----
>>> From: Kevin D. Kissell [mailto:kevink@paralogos.com]
>>> Sent: Friday, December 24, 2010 5:34 PM
>>> To: Anoop P A
>>> Cc: STUART VENTERS; Anoop P.A.; linux-mips@linux-mips.org
>>> Subject: Re: SMTC support status in latest git head.
>>>
>>>
>>> Ah, well, at least we have a stackframe.h fix that preserves David's
>>> performance tweak for the deeper pipelined processors. In looking for
>>> this, I did notice that someone did some modification to the SMTC clock
>>> tick logic that I was skeptical had ever been tested. If you've still
>>> got that kernel binary handy, you might check to see if it boots with
>>> maxtcs=1 maxvpes=1, maxtcs=2 maxvpes=1, and/or maxtcs=2 maxvpes=2.
>>>
>>> Oh, yes, and Merry Christmas one and all!
>>>
>>> Regards,
>>>
>>> Kevin K.
>>>
>>> On 12/24/10 8:02 AM, Anoop P A wrote:
>>>> On Fri, 2010-12-24 at 06:53 -0800, Kevin D. Kissell wrote:
>>>>> Excellent! Now, does the attached patch (relative to 2.6.37.11) also
>>>>> fix things, while preserving the other fixes and performance enhancements?
>>>>>
>>>> I have tested that patch with 2.6.37 branch it well passes calibration
>>>> loop but hangs after switching to mips closource
>>>>
>>>> TC 6 going on-line as CPU 6
>>>> Brought up 7 CPUs
>>>> bio: create slab<bio-0> at 0
>>>> SCSI subsystem initialized
>>>> Switching to clocksource MIPS
>>>>
>>>> I Presume this is a different issue as restoring older file didn't help
>>>> much to get rid of this hang.
>>>>
>>>> diff --git a/arch/mips/include/asm/stackframe.h
>>>> b/arch/mips/include/asm/stackframe.h
>>>> index 58730c5..7fc9f10 100644
>>>> --- a/arch/mips/include/asm/stackframe.h
>>>> +++ b/arch/mips/include/asm/stackframe.h
>>>> @@ -195,9 +195,9 @@
>>>> * to cover the pipeline delay.
>>>> */
>>>> .set mips32
>>>> - mfc0 v1, CP0_TCSTATUS
>>>> + mfc0 v0, CP0_TCSTATUS
>>>> .set mips0
>>>> - LONG_S v1, PT_TCSTATUS(sp)
>>>> + LONG_S v0, PT_TCSTATUS(sp)
>>>> #endif /* CONFIG_MIPS_MT_SMTC */
>>>> LONG_S $4, PT_R4(sp)
>>>> LONG_S $5, PT_R5(sp)
>>>>
>>>>
>>>>> /K.
>>>>>
>>>>> On 12/24/10 6:39 AM, Anoop P A wrote:
>>>>>> Hi Kevin, Stuart ,
>>>>>>
>>>>>> Woohooo You guys spotted !.
>>>>>>
>>>>>> http://git.linux-mips.org/?p=linux.git;a=commit;h=d5ec6e3c seems to be
>>>>>> the culprit
>>>>>>
>>>>>> Once I restored previous version of stackframe.h 2.6.33-stable started
>>>>>> booting !.
>>>>>>
>>>>>> Thanks,
>>>>>> Anoop
>>>>>>
>>>>>> On Fri, 2010-12-24 at 04:32 -0800, Kevin D. Kissell wrote:
>>>>>>> Thank you, Stuart! I've spotted some definite breakage to SMTC between
>>>>>>> those versions. In arch/mips/include/asm/stackframe.h, someone moved
>>>>>>> the store of the Status register value in SAVE_SOME (line 169 or 204,
>>>>>>> depending on the version) from two instructions after the mfc0 to a
>>>>>>> point after the #ifdef for SMTC, presumably to get better pipelining of
>>>>>>> the register access. Unfortunately, the v1 register is also used in the
>>>>>>> SMTC-specific fragment to save TCStatus, so the Status value gets
>>>>>>> clobbered before it gets stored. This will eventually result in the
>>>>>>> Status register getting a TCStatus value, which has some bits on common,
>>>>>>> but isn't identical and sooner or later Bad Things will happen.
>>>>>>>
>>>>>>> I'm a little surprised this wasn't caught by visual inspection of the patch.
>>>>>>>
>>>>>>> Possible solutions would include reverting the store of the CP0_STATUS
>>>>>>> value to the block above the #ifdef, or, to retain whatever performance
>>>>>>> advantage was obtained by moving the store downward, to use v0/$2
>>>>>>> instead of v1/$3, as the staging register for the TCStatus value. I'd
>>>>>>> lean toward the second option, but I'm not in a position to test and
>>>>>>> submit a patch just now.
>>>>>>>
>>>>>>> Regards,
>>>>>>>
>>>>>>> Kevin K.
>>>>>>>
>>>>>>> On 12/23/10 1:09 PM, STUART VENTERS wrote:
>>>>>>>> Kevin,
>>>>>>>>
>>>>>>>> I'm not sure if it's useful,
>>>>>>>> but finally I got the time to look at the two kernel versions Anoop pointed out.
>>>>>>>> works 2.6.32-stable with patch 804
>>>>>>>> works_not 2.6.33-stable
>>>>>>>>
>>>>>>>> greping for files with CONFIG_MIPS_MT_SMTC
>>>>>>>> and looking for timer interrupt related stuff found the following differences:
>>>>>>>>
>>>>>>>>
>>>>>>>> arch/mips/include/asm/irq.h
>>>>>>>> arch/mips/kernel/irq.c
>>>>>>>> do_IRQ
>>>>>>>>
>>>>>>>> arch/mips/include/asm/stackframe.h
>>>>>>>> SAVE_SOME SAVE_TEMP get/set_saved_sp
>>>>>>>>
>>>>>>>> arch/mips/include/asm/time.h
>>>>>>>> clocksource_set_clock
>>>>>>>>
>>>>>>>> arch/mips/kernel/process.c
>>>>>>>> cpu_idle
>>>>>>>>
>>>>>>>> arch/mips/kernel/smtc.c
>>>>>>>> __irq_entry
>>>>>>>> ipi_decode
>>>>>>>> SMTC_CLOCK_TICK
>>>>>>>>
>>>>>>>>
>>>>>>>> Enclosed are the two subsets of files for a more expert look.
>>>>>>>>
>>>>>>>> I'll try to look in more detail after Christmas.
>>>>>>>>
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>>
>>>>>>>> Stuart
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>
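
The v1 clobber that Kevin describes in the quoted 12/24 message above can be illustrated with a small model. This is only a sketch in Python, not kernel code: the register file is a dict, `save_some` is a hypothetical stand-in for the reordered SAVE_SOME macro, and the Status/TCStatus values are arbitrary sample data.

```python
# Toy model of the SAVE_SOME ordering problem; not kernel code.
# The two CP0 values are sample data chosen only so they differ.
CP0 = {"STATUS": 0x11004201, "TCSTATUS": 0x18102000}


def save_some(tcstatus_reg, smtc=True):
    """Mimic the reordered macro: Status is read into v1 early, but it is
    stored to the frame only after the SMTC-specific fragment has run."""
    regs = {}
    frame = {}
    regs["v1"] = CP0["STATUS"]                    # mfc0 v1, CP0_STATUS
    if smtc:
        regs[tcstatus_reg] = CP0["TCSTATUS"]      # mfc0 <reg>, CP0_TCSTATUS
        frame["PT_TCSTATUS"] = regs[tcstatus_reg]  # LONG_S <reg>, PT_TCSTATUS(sp)
    frame["PT_STATUS"] = regs["v1"]               # LONG_S v1, PT_STATUS(sp)
    return frame


broken = save_some("v1")  # SMTC fragment reuses v1 and overwrites Status
fixed = save_some("v0")   # the patch stages TCStatus in v0 instead

print(hex(broken["PT_STATUS"]))
print(hex(fixed["PT_STATUS"]))
```

In this model the "broken" frame ends up with the TCStatus value in the PT_STATUS slot, which is the failure mode described; staging TCStatus in v0 leaves v1 intact until it is stored.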
* Re: SMTC support status in latest git head.
2010-12-28 8:43 ` Kevin D. Kissell
@ 2010-12-31 12:27 ` Anoop P A
2011-01-01 8:42 ` Kevin D. Kissell
0 siblings, 1 reply; 68+ messages in thread
From: Anoop P A @ 2010-12-31 12:27 UTC (permalink / raw)
To: Kevin D. Kissell; +Cc: STUART VENTERS, Anoop P.A., linux-mips
Hi,

The kernel hangs in the stop_machine call. Please find the MT register dump below.

Another important observation: even though the 2.6.33 kernel + stackframe
patch gets past the calibration hang, I am still unable to boot into an
initramfs root (I verified that the ramfs works with VSMP). So it looks like
there is still an issue to fix between 2.6.32 and 2.6.33.
######################## Log ###########################
=== MIPS MT State Dump ===
-- Global State --
MVPControl Passed: 00000005
MVPControl Read: 00000004
MVPConf0 : a8008406
-- per-VPE State --
VPE 0
VPEControl : 00008000
VPEConf0 : 800f0003
VPE0.Status : 11004201
VPE0.EPC : 8010dc54 smtc_ipi_replay+0xcc/0x108
VPE0.Cause : 50804000
VPE0.Config7 : 00010000
VPE 1
VPEControl : 00068006
VPEConf0 : 80cf0003
VPE1.Status : 11008301
VPE1.EPC : 801022a0 r4k_wait+0x20/0x40
VPE1.Cause : 50800000
VPE1.Config7 : 00010000
-- per-TC State --
TC 0 (current TC with VPE EPC above)
TCStatus : 18102000
TCBind : 00000000
TCRestart : 803fa19c printk+0xc/0x30
TCHalt : 00000000
TCContext : 00000000
TC 1
TCStatus : 18902000
TCBind : 00200000
TCRestart : 801022a0 r4k_wait+0x20/0x40
TCHalt : 00000000
TCContext : 00140000
TC 2
TCStatus : 18902000
TCBind : 00400000
TCRestart : 801022a0 r4k_wait+0x20/0x40
TCHalt : 00000000
TCContext : 00280000
TC 3
TCStatus : 18902000
TCBind : 00600000
TCRestart : 801022a0 r4k_wait+0x20/0x40
TCHalt : 00000000
TCContext : 003c0000
TC 4
TCStatus : 18902000
TCBind : 00800001
TCRestart : 8010229c r4k_wait+0x1c/0x40
TCHalt : 00000000
TCContext : 00500000
TC 5
TCStatus : 18902000
TCBind : 00a00001
TCRestart : 8010229c r4k_wait+0x1c/0x40
TCHalt : 00000000
TCContext : 00640000
TC 6
TCStatus : 18902000
TCBind : 00c00001
TCRestart : 8010229c r4k_wait+0x1c/0x40
TCHalt : 00000000
TCContext : 00780000
Counter Interrupts taken per CPU (TC)
0: 0
1: 0
2: 0
3: 0
4: 0
5: 0
6: 0
7: 0
Self-IPI invocations:
0: 12
1: 0
2: 0
3: 0
4: 0
5: 5
6: 4
7: 0
IPIQ[0]: head = 0x0, tail = 0x0, depth = 0
IPIQ[1]: head = 0x0, tail = 0x0, depth = 0
IPIQ[2]: head = 0x0, tail = 0x0, depth = 0
IPIQ[3]: head = 0x0, tail = 0x0, depth = 0
IPIQ[4]: head = 0x0, tail = 0x0, depth = 0
IPIQ[5]: head = 0x0, tail = 0x0, depth = 0
IPIQ[6]: head = 0x0, tail = 0x0, depth = 0
IPIQ[7]: head = 0x0, tail = 0x0, depth = 0
0 Recoveries of "stolen" FPU
===========================
################################################################
Thanks
Anoop
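
A few of the raw values in the dump above are easier to read once split into fields. The sketch below is an illustration only: the bit positions are assumptions based on the MIPS MT ASE register layout (compare arch/mips/include/asm/mipsmtregs.h) and should be checked against the manual for the core in use, and the helper names are made up for this example.

```python
# Field positions are assumptions taken from the MIPS MT ASE layout
# (cf. arch/mips/include/asm/mipsmtregs.h); verify before relying on them.

def decode_mvpcontrol(value):
    """Split MVPControl into its three low control bits."""
    return {"EVP": value & 1, "VPC": (value >> 1) & 1, "STLB": (value >> 2) & 1}


def decode_tcbind(value):
    """Return the TC number and the VPE that the TC is bound to."""
    return {"CurVPE": value & 0xF, "CurTC": (value >> 21) & 0xFF}


RNST = {
    0: "running",
    1: "blocked on WAIT",
    2: "blocked on YIELD",
    3: "blocked on gating storage",
}


def decode_tcstatus(value):
    """Pick out the activation, interrupt-exempt and run-state fields."""
    return {
        "A": (value >> 13) & 1,      # TC activated
        "IXMT": (value >> 10) & 1,   # interrupt exempt
        "RNST": RNST[(value >> 23) & 3],
    }


print(decode_mvpcontrol(0x00000004))  # "MVPControl Read" from the dump
print(decode_tcbind(0x00800001))      # TCBind of TC 4
print(decode_tcstatus(0x18102000))    # TCStatus of TC 0
print(decode_tcstatus(0x18902000))    # TCStatus of TC 1..6
```

Under those assumptions, the MVPControl value read back has EVP clear, TC 4 through TC 6 are bound to VPE 1, and TC 1 through TC 6 report a WAIT-blocked run state, which would be consistent with their TCRestart addresses sitting in r4k_wait.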
On Tue, 2010-12-28 at 00:43 -0800, Kevin D. Kissell wrote:
> I took a quick look last night, and the only thing that looked vaguely
> dangerous in changes since the timer changes I alluded to earlier was
> the global naming cleanup of irq-related function names that David
> Howells submitted. The diff didn't look dangerous in itself, but some of
> the definitions are nested subtly for SMTC to maximize the amount of
> common code, and I could imagine something getting lost in translation
> there. If that were really the problem, it would of course affect much
> more than just the timer subsystem, but early in the boot process,
> timers are pretty much the only interrupts that have to be handled
> correctly.
>
> I'm travelling today, but will take a look at timekeeping_notify()
> tomorrow or the next day...
>
> /K.
>
> On 12/28/10 12:19 AM, Anoop P A wrote:
> > Hi,
> >
> > I had a glance into the code diff without notice of any suspect-able
> > code .
> > Tracing the hang showed that it is getting hanged in timekeeping_notify
> > function.
> >
> > Thanks,
> > Anoop
> >
> > PS: I may not be available until Thursday
> >
> > On Mon, 2010-12-27 at 22:49 +0530, Anoop P A wrote:
> >> Hi Kevin,
> >>
> >> It is very unlikely that the patch you pointed has any impact on the the
> >> hang I am seeing. The patch you have mentioned got into kernel around
> >> 2.6.32 timeframe. I am able to boot both 2.6.32 and 2.6.33 kernel ( +
> >> stackframe patch) .
> >>
> >> Hi Stuart,
> >>
> >> I haven't got much time to spend on this today.
> >>
> >> I had got 2.6.36-stable(+ stack frame patch) booting last day and I have
> >> observed hang issue with 2.6.37-rc1 ( Same as rc6 and current git head)
> >>
> >> So probably some patches in 2.6.37 branch introduced this hang.
> >>
> >> Hopefully I will get some free slot tomorrow so that I can look into
> >> code diff .
> >>
> >> Thanks
> >> Anoop
> >>
> >> On Mon, 2010-12-27 at 09:49 -0600, STUART VENTERS wrote:
> >>> Kevin,
> >>>
> >>> Outstanding, sometimes it's better to be lucky than good.
> >>>
> >>>
> >>> Anoop,
> >>>
> >>> Maybe we can get lucky again.
> >>>
> >>> If you can isolate the .33 works/.37 works_not bug to a specific pair of versions,
> >>> I'll be happy to do another diff.
> >>>
> >>>
> >>> Hope you'll have had a good Christmas as well.
> >>> We've had snow in Alabama since Christmas eve!
> >>>
> >>>
> >>> Regards,
> >>>
> >>> Stuart
> >>>
> >>>
> >>> -----Original Message-----
> >>> From: Kevin D. Kissell [mailto:kevink@paralogos.com]
> >>> Sent: Friday, December 24, 2010 5:34 PM
> >>> To: Anoop P A
> >>> Cc: STUART VENTERS; Anoop P.A.; linux-mips@linux-mips.org
> >>> Subject: Re: SMTC support status in latest git head.
> >>>
> >>>
> >>> Ah, well, at least we have a stackframe.h fix that preserves David's
> >>> performance tweak for the deeper pipelined processors. In looking for
> >>> this, I did notice that someone did some modification to the SMTC clock
> >>> tick logic that I was skeptical had ever been tested. If you've still
> >>> got that kernel binary handy, you might check to see if it boots with
> >>> maxtcs=1 maxvpes=1, maxtcs=2 maxvpes=1, and/or maxtcs=2 maxvpes=2.
> >>>
> >>> Oh, yes, and Merry Christmas one and all!
> >>>
> >>> Regards,
> >>>
> >>> Kevin K.
> >>>
> >>> On 12/24/10 8:02 AM, Anoop P A wrote:
> >>>> On Fri, 2010-12-24 at 06:53 -0800, Kevin D. Kissell wrote:
> >>>>> Excellent! Now, does the attached patch (relative to 2.6.37.11) also
> >>>>> fix things, while preserving the other fixes and performance enhancements?
> >>>>>
> >>>> I have tested that patch with the 2.6.37 branch; it gets past the calibration
> >>>> loop but hangs after switching to the MIPS clocksource
> >>>>
> >>>> TC 6 going on-line as CPU 6
> >>>> Brought up 7 CPUs
> >>>> bio: create slab<bio-0> at 0
> >>>> SCSI subsystem initialized
> >>>> Switching to clocksource MIPS
> >>>>
> >>>> I Presume this is a different issue as restoring older file didn't help
> >>>> much to get rid of this hang.
> >>>>
> >>>> diff --git a/arch/mips/include/asm/stackframe.h
> >>>> b/arch/mips/include/asm/stackframe.h
> >>>> index 58730c5..7fc9f10 100644
> >>>> --- a/arch/mips/include/asm/stackframe.h
> >>>> +++ b/arch/mips/include/asm/stackframe.h
> >>>> @@ -195,9 +195,9 @@
> >>>> * to cover the pipeline delay.
> >>>> */
> >>>> .set mips32
> >>>> - mfc0 v1, CP0_TCSTATUS
> >>>> + mfc0 v0, CP0_TCSTATUS
> >>>> .set mips0
> >>>> - LONG_S v1, PT_TCSTATUS(sp)
> >>>> + LONG_S v0, PT_TCSTATUS(sp)
> >>>> #endif /* CONFIG_MIPS_MT_SMTC */
> >>>> LONG_S $4, PT_R4(sp)
> >>>> LONG_S $5, PT_R5(sp)
> >>>>
> >>>>
> >>>>> /K.
> >>>>>
> >>>>> On 12/24/10 6:39 AM, Anoop P A wrote:
> >>>>>> Hi Kevin, Stuart ,
> >>>>>>
> >>>>>> Woohooo You guys spotted !.
> >>>>>>
> >>>>>> http://git.linux-mips.org/?p=linux.git;a=commit;h=d5ec6e3c seems to be
> >>>>>> the culprit
> >>>>>>
> >>>>>> Once I restored previous version of stackframe.h 2.6.33-stable started
> >>>>>> booting !.
> >>>>>>
> >>>>>> Thanks,
> >>>>>> Anoop
> >>>>>>
> >>>>>> On Fri, 2010-12-24 at 04:32 -0800, Kevin D. Kissell wrote:
> >>>>>>> Thank you, Stuart! I've spotted some definite breakage to SMTC between
> >>>>>>> those versions. In arch/mips/include/asm/stackframe.h, someone moved
> >>>>>>> the store of the Status register value in SAVE_SOME (line 169 or 204,
> >>>>>>> depending on the version) from two instructions after the mfc0 to a
> >>>>>>> point after the #ifdef for SMTC, presumably to get better pipelining of
> >>>>>>> the register access. Unfortunately, the v1 register is also used in the
> >>>>>>> SMTC-specific fragment to save TCStatus, so the Status value gets
> >>>>>>> clobbered before it gets stored. This will eventually result in the
> >>>>>>> Status register getting a TCStatus value, which has some bits on common,
> >>>>>>> but isn't identical and sooner or later Bad Things will happen.
> >>>>>>>
> >>>>>>> I'm a little surprised this wasn't caught by visual inspection of the patch.
> >>>>>>>
> >>>>>>> Possible solutions would include reverting the store of the CP0_STATUS
> >>>>>>> value to the block above the #ifdef, or, to retain whatever performance
> >>>>>>> advantage was obtained by moving the store downward, to use v0/$2
> >>>>>>> instead of v1/$3, as the staging register for the TCStatus value. I'd
> >>>>>>> lean toward the second option, but I'm not in a position to test and
> >>>>>>> submit a patch just now.
> >>>>>>>
> >>>>>>> Regards,
> >>>>>>>
> >>>>>>> Kevin K.
> >>>>>>>
> >>>>>>> On 12/23/10 1:09 PM, STUART VENTERS wrote:
> >>>>>>>> Kevin,
> >>>>>>>>
> >>>>>>>> I'm not sure if it's useful,
> >>>>>>>> but finally I got the time to look at the two kernel versions Anoop pointed out.
> >>>>>>>> works 2.6.32-stable with patch 804
> >>>>>>>> works_not 2.6.33-stable
> >>>>>>>>
> >>>>>>>> greping for files with CONFIG_MIPS_MT_SMTC
> >>>>>>>> and looking for timer interrupt related stuff found the following differences:
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> arch/mips/include/asm/irq.h
> >>>>>>>> arch/mips/kernel/irq.c
> >>>>>>>> do_IRQ
> >>>>>>>>
> >>>>>>>> arch/mips/include/asm/stackframe.h
> >>>>>>>> SAVE_SOME SAVE_TEMP get/set_saved_sp
> >>>>>>>>
> >>>>>>>> arch/mips/include/asm/time.h
> >>>>>>>> clocksource_set_clock
> >>>>>>>>
> >>>>>>>> arch/mips/kernel/process.c
> >>>>>>>> cpu_idle
> >>>>>>>>
> >>>>>>>> arch/mips/kernel/smtc.c
> >>>>>>>> __irq_entry
> >>>>>>>> ipi_decode
> >>>>>>>> SMTC_CLOCK_TICK
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Enclosed are the two subsets of files for a more expert look.
> >>>>>>>>
> >>>>>>>> I'll try to look in more detail after Christmas.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Cheers,
> >>>>>>>>
> >>>>>>>> Stuart
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >
>
* Re: SMTC support status in latest git head.
2010-12-31 12:27 ` Anoop P A
@ 2011-01-01 8:42 ` Kevin D. Kissell
2011-01-03 15:12 ` Anoop P A
0 siblings, 1 reply; 68+ messages in thread
From: Kevin D. Kissell @ 2011-01-01 8:42 UTC (permalink / raw)
To: Anoop P A; +Cc: STUART VENTERS, Anoop P.A., linux-mips
At this point the logical thing to do would seem to be to look at your
kernel image and disassemble smtc_ipi_replay(), which is where the EPC of
VPE 0 shows the last exception to have been taken. That's a critical SMTC
routine that gets called whenever an xxx_irq_restore() enables
interrupts, so that virtual per-TC IPI interrupts that were posted while
the TC had interrupts disabled can be handled deterministically. As I
mentioned in an earlier message, there was some cleanup work from David
Howells that changed a number of irq management-related function names
and prototypes across all architectures, which went into linux-mips.org
at very roughly the time of the breakage. The SMTC overlay over the irq
implementation has been pretty robust, but it's written in a perhaps
doomed attempt to be both efficient and to share a maximum amount of
common code with the general case. A mechanical or semi-mechanical change
could conceivably have broken things.
Regards,
Kevin K.
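
The deferral that smtc_ipi_replay() provides, as described above, can be pictured with a toy model. The class and method names here are hypothetical and this is not the kernel implementation; it only shows IPIs that arrive while a TC has interrupts masked being queued and then handled, in order, when the mask is lifted.

```python
from collections import deque


class ToyTC:
    """Hypothetical stand-in for one TC with a per-TC IPI queue."""

    def __init__(self):
        self.irqs_enabled = True
        self.pending = deque()  # IPIs posted while interrupts were masked
        self.handled = []       # IPIs that have been serviced, in order

    def post_ipi(self, ipi):
        if self.irqs_enabled:
            self.handled.append(ipi)  # deliver immediately
        else:
            self.pending.append(ipi)  # defer until interrupts come back

    def irq_save(self):
        """Mask interrupts and return the previous state."""
        flags = self.irqs_enabled
        self.irqs_enabled = False
        return flags

    def irq_restore(self, flags):
        """Restore the saved state; replay only if this re-enables interrupts."""
        self.irqs_enabled = flags
        if flags:
            self.replay()

    def replay(self):
        # Drain the deferred IPIs in FIFO order.
        while self.pending:
            self.handled.append(self.pending.popleft())


tc = ToyTC()
flags = tc.irq_save()
tc.post_ipi("clock tick")
tc.post_ipi("resched")
tc.irq_restore(flags)
print(tc.handled)
```

If the restore path were renamed or rewired so that the replay step no longer ran, the queued entries in this model would simply never be serviced, which is the kind of breakage a mechanical rename could plausibly introduce.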
On 12/31/2010 4:27 AM, Anoop P A wrote:
> Hi ,
>
> Kernel hangs on stop_machine call. Please find mt reg dump below.
> Another important observation is even though 2.6.33 kernel + stackframe
> patch well passes calibration hang , I am still unable boot in to a
> initramfs root ( verified ramfs working with VSMP). So it looks like
> still some issue to fix between 2.6.32 and 2.6.33 .
> ######################## Log ###########################
>
> === MIPS MT State Dump ===
> -- Global State --
> MVPControl Passed: 00000005
> MVPControl Read: 00000004
> MVPConf0 : a8008406
> -- per-VPE State --
> VPE 0
> VPEControl : 00008000
> VPEConf0 : 800f0003
> VPE0.Status : 11004201
> VPE0.EPC : 8010dc54 smtc_ipi_replay+0xcc/0x108
> VPE0.Cause : 50804000
> VPE0.Config7 : 00010000
> VPE 1
> VPEControl : 00068006
> VPEConf0 : 80cf0003
> VPE1.Status : 11008301
> VPE1.EPC : 801022a0 r4k_wait+0x20/0x40
> VPE1.Cause : 50800000
> VPE1.Config7 : 00010000
> -- per-TC State --
> TC 0 (current TC with VPE EPC above)
> TCStatus : 18102000
> TCBind : 00000000
> TCRestart : 803fa19c printk+0xc/0x30
> TCHalt : 00000000
> TCContext : 00000000
> TC 1
> TCStatus : 18902000
> TCBind : 00200000
> TCRestart : 801022a0 r4k_wait+0x20/0x40
> TCHalt : 00000000
> TCContext : 00140000
> TC 2
> TCStatus : 18902000
> TCBind : 00400000
> TCRestart : 801022a0 r4k_wait+0x20/0x40
> TCHalt : 00000000
> TCContext : 00280000
> TC 3
> TCStatus : 18902000
> TCBind : 00600000
> TCRestart : 801022a0 r4k_wait+0x20/0x40
> TCHalt : 00000000
> TCContext : 003c0000
> TC 4
> TCStatus : 18902000
> TCBind : 00800001
> TCRestart : 8010229c r4k_wait+0x1c/0x40
> TCHalt : 00000000
> TCContext : 00500000
> TC 5
> TCStatus : 18902000
> TCBind : 00a00001
> TCRestart : 8010229c r4k_wait+0x1c/0x40
> TCHalt : 00000000
> TCContext : 00640000
> TC 6
> TCStatus : 18902000
> TCBind : 00c00001
> TCRestart : 8010229c r4k_wait+0x1c/0x40
> TCHalt : 00000000
> TCContext : 00780000
> Counter Interrupts taken per CPU (TC)
> 0: 0
> 1: 0
> 2: 0
> 3: 0
> 4: 0
> 5: 0
> 6: 0
> 7: 0
> Self-IPI invocations:
> 0: 12
> 1: 0
> 2: 0
> 3: 0
> 4: 0
> 5: 5
> 6: 4
> 7: 0
> IPIQ[0]: head = 0x0, tail = 0x0, depth = 0
> IPIQ[1]: head = 0x0, tail = 0x0, depth = 0
> IPIQ[2]: head = 0x0, tail = 0x0, depth = 0
> IPIQ[3]: head = 0x0, tail = 0x0, depth = 0
> IPIQ[4]: head = 0x0, tail = 0x0, depth = 0
> IPIQ[5]: head = 0x0, tail = 0x0, depth = 0
> IPIQ[6]: head = 0x0, tail = 0x0, depth = 0
> IPIQ[7]: head = 0x0, tail = 0x0, depth = 0
> 0 Recoveries of "stolen" FPU
> ===========================
>
> ################################################################
>
> Thanks
> Anoop
>
> On Tue, 2010-12-28 at 00:43 -0800, Kevin D. Kissell wrote:
>> I took a quick look last night, and the only thing that looked vaguely
>> dangerous in changes since the timer changes I alluded to earlier was
>> the global naming cleanup of irq-related function names that David
>> Howells submitted. The diff didn't look dangerous in itself, but some of
>> the definitions are nested subtly for SMTC to maximize the amount of
>> common code, and I could imagine something getting lost in translation
>> there. If that were really the problem, it would of course affect much
>> more than just the timer subsystem, but early in the boot process,
>> timers are pretty much the only interrupts that have to be handled
>> correctly.
>>
>> I'm travelling today, but will take a look at timekeeping_notify()
>> tomorrow or the next day...
>>
>> /K.
>>
>> On 12/28/10 12:19 AM, Anoop P A wrote:
>>> Hi,
>>>
>>> I had a glance into the code diff without notice of any suspect-able
>>> code .
>>> Tracing the hang showed that it is getting hanged in timekeeping_notify
>>> function.
>>>
>>> Thanks,
>>> Anoop
>>>
>>> PS: I may not be available until Thursday
>>>
>>> On Mon, 2010-12-27 at 22:49 +0530, Anoop P A wrote:
>>>> Hi Kevin,
>>>>
>>>> It is very unlikely that the patch you pointed has any impact on the the
>>>> hang I am seeing. The patch you have mentioned got into kernel around
>>>> 2.6.32 timeframe. I am able to boot both 2.6.32 and 2.6.33 kernel ( +
>>>> stackframe patch) .
>>>>
>>>> Hi Stuart,
>>>>
>>>> I haven't got much time to spend on this today.
>>>>
>>>> I had got 2.6.36-stable(+ stack frame patch) booting last day and I have
>>>> observed hang issue with 2.6.37-rc1 ( Same as rc6 and current git head)
>>>>
>>>> So probably some patches in 2.6.37 branch introduced this hang.
>>>>
>>>> Hopefully I will get some free slot tomorrow so that I can look into
>>>> code diff .
>>>>
>>>> Thanks
>>>> Anoop
>>>>
>>>> On Mon, 2010-12-27 at 09:49 -0600, STUART VENTERS wrote:
>>>>> Kevin,
>>>>>
>>>>> Outstanding, sometimes it's better to be lucky than good.
>>>>>
>>>>>
>>>>> Anoop,
>>>>>
>>>>> Maybe we can get lucky again.
>>>>>
>>>>> If you can isolate the .33 works/.37 works_not bug to a specific pair of versions,
>>>>> I'll be happy to do another diff.
>>>>>
>>>>>
>>>>> Hope you'll have had a good Christmas as well.
>>>>> We've had snow in Alabama since Christmas eve!
>>>>>
>>>>>
>>>>> Regards,
>>>>>
>>>>> Stuart
>>>>>
>>>>>
>>>>> -----Original Message-----
>>>>> From: Kevin D. Kissell [mailto:kevink@paralogos.com]
>>>>> Sent: Friday, December 24, 2010 5:34 PM
>>>>> To: Anoop P A
>>>>> Cc: STUART VENTERS; Anoop P.A.; linux-mips@linux-mips.org
>>>>> Subject: Re: SMTC support status in latest git head.
>>>>>
>>>>>
>>>>> Ah, well, at least we have a stackframe.h fix that preserves David's
>>>>> performance tweak for the deeper pipelined processors. In looking for
>>>>> this, I did notice that someone did some modification to the SMTC clock
>>>>> tick logic that I was skeptical had ever been tested. If you've still
>>>>> got that kernel binary handy, you might check to see if it boots with
>>>>> maxtcs=1 maxvpes=1, maxtcs=2 maxvpes=1, and/or maxtcs=2 maxvpes=2.
>>>>>
>>>>> Oh, yes, and Merry Christmas one and all!
>>>>>
>>>>> Regards,
>>>>>
>>>>> Kevin K.
>>>>>
>>>>> On 12/24/10 8:02 AM, Anoop P A wrote:
>>>>>> On Fri, 2010-12-24 at 06:53 -0800, Kevin D. Kissell wrote:
>>>>>>> Excellent! Now, does the attached patch (relative to 2.6.37.11) also
>>>>>>> fix things, while preserving the other fixes and performance enhancements?
>>>>>>>
>>>>>> I have tested that patch with the 2.6.37 branch; it gets past the calibration
>>>>>> loop but hangs after switching to the MIPS clocksource
>>>>>>
>>>>>> TC 6 going on-line as CPU 6
>>>>>> Brought up 7 CPUs
>>>>>> bio: create slab<bio-0> at 0
>>>>>> SCSI subsystem initialized
>>>>>> Switching to clocksource MIPS
>>>>>>
>>>>>> I Presume this is a different issue as restoring older file didn't help
>>>>>> much to get rid of this hang.
>>>>>>
>>>>>> diff --git a/arch/mips/include/asm/stackframe.h
>>>>>> b/arch/mips/include/asm/stackframe.h
>>>>>> index 58730c5..7fc9f10 100644
>>>>>> --- a/arch/mips/include/asm/stackframe.h
>>>>>> +++ b/arch/mips/include/asm/stackframe.h
>>>>>> @@ -195,9 +195,9 @@
>>>>>> * to cover the pipeline delay.
>>>>>> */
>>>>>> .set mips32
>>>>>> - mfc0 v1, CP0_TCSTATUS
>>>>>> + mfc0 v0, CP0_TCSTATUS
>>>>>> .set mips0
>>>>>> - LONG_S v1, PT_TCSTATUS(sp)
>>>>>> + LONG_S v0, PT_TCSTATUS(sp)
>>>>>> #endif /* CONFIG_MIPS_MT_SMTC */
>>>>>> LONG_S $4, PT_R4(sp)
>>>>>> LONG_S $5, PT_R5(sp)
>>>>>>
>>>>>>
>>>>>>> /K.
>>>>>>>
>>>>>>> On 12/24/10 6:39 AM, Anoop P A wrote:
>>>>>>>> Hi Kevin, Stuart ,
>>>>>>>>
>>>>>>>> Woohooo You guys spotted !.
>>>>>>>>
>>>>>>>> http://git.linux-mips.org/?p=linux.git;a=commit;h=d5ec6e3c seems to be
>>>>>>>> the culprit
>>>>>>>>
>>>>>>>> Once I restored previous version of stackframe.h 2.6.33-stable started
>>>>>>>> booting !.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Anoop
>>>>>>>>
>>>>>>>> On Fri, 2010-12-24 at 04:32 -0800, Kevin D. Kissell wrote:
>>>>>>>>> Thank you, Stuart! I've spotted some definite breakage to SMTC between
>>>>>>>>> those versions. In arch/mips/include/asm/stackframe.h, someone moved
>>>>>>>>> the store of the Status register value in SAVE_SOME (line 169 or 204,
>>>>>>>>> depending on the version) from two instructions after the mfc0 to a
>>>>>>>>> point after the #ifdef for SMTC, presumably to get better pipelining of
>>>>>>>>> the register access. Unfortunately, the v1 register is also used in the
>>>>>>>>> SMTC-specific fragment to save TCStatus, so the Status value gets
>>>>>>>>> clobbered before it gets stored. This will eventually result in the
>>>>>>>>> Status register getting a TCStatus value, which has some bits on common,
>>>>>>>>> but isn't identical and sooner or later Bad Things will happen.
>>>>>>>>>
>>>>>>>>> I'm a little surprised this wasn't caught by visual inspection of the patch.
>>>>>>>>>
>>>>>>>>> Possible solutions would include reverting the store of the CP0_STATUS
>>>>>>>>> value to the block above the #ifdef, or, to retain whatever performance
>>>>>>>>> advantage was obtained by moving the store downward, to use v0/$2
>>>>>>>>> instead of v1/$3, as the staging register for the TCStatus value. I'd
>>>>>>>>> lean toward the second option, but I'm not in a position to test and
>>>>>>>>> submit a patch just now.
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>>
>>>>>>>>> Kevin K.
>>>>>>>>>
>>>>>>>>> On 12/23/10 1:09 PM, STUART VENTERS wrote:
>>>>>>>>>> Kevin,
>>>>>>>>>>
>>>>>>>>>> I'm not sure if it's useful,
>>>>>>>>>> but finally I got the time to look at the two kernel versions Anoop pointed out.
>>>>>>>>>> works 2.6.32-stable with patch 804
>>>>>>>>>> works_not 2.6.33-stable
>>>>>>>>>>
>>>>>>>>>> greping for files with CONFIG_MIPS_MT_SMTC
>>>>>>>>>> and looking for timer interrupt related stuff found the following differences:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> arch/mips/include/asm/irq.h
>>>>>>>>>> arch/mips/kernel/irq.c
>>>>>>>>>> do_IRQ
>>>>>>>>>>
>>>>>>>>>> arch/mips/include/asm/stackframe.h
>>>>>>>>>> SAVE_SOME SAVE_TEMP get/set_saved_sp
>>>>>>>>>>
>>>>>>>>>> arch/mips/include/asm/time.h
>>>>>>>>>> clocksource_set_clock
>>>>>>>>>>
>>>>>>>>>> arch/mips/kernel/process.c
>>>>>>>>>> cpu_idle
>>>>>>>>>>
>>>>>>>>>> arch/mips/kernel/smtc.c
>>>>>>>>>> __irq_entry
>>>>>>>>>> ipi_decode
>>>>>>>>>> SMTC_CLOCK_TICK
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Enclosed are the two subsets of files for a more expert look.
>>>>>>>>>>
>>>>>>>>>> I'll try to look in more detail after Christmas.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Cheers,
>>>>>>>>>>
>>>>>>>>>> Stuart
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>
* Re: SMTC support status in latest git head.
2011-01-01 8:42 ` Kevin D. Kissell
@ 2011-01-03 15:12 ` Anoop P A
2011-01-03 16:14 ` Kevin D. Kissell
0 siblings, 1 reply; 68+ messages in thread
From: Anoop P A @ 2011-01-03 15:12 UTC (permalink / raw)
To: Kevin D. Kissell; +Cc: STUART VENTERS, Anoop P.A., linux-mips
Hi,

The following patch restricts the TREE_RCU implementation to !PREEMPT
SMP kernels:
http://git.linux-mips.org/?p=linux.git;a=commit;h=687d7a960aea46e016182c7ce346d62c4dbd0366

The CONFIG_TREE_PREEMPT_RCU option does not seem to work for an SMTC kernel
(and it will be the only RCU implementation available for an SMTC kernel
from 2.6.37 onwards).

With no forced preemption and TREE_RCU selected, I am able to boot
beyond the point of the hang that I reported.
Thanks
Anoop
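
The effect of that restriction on which RCU flavour an SMP kernel ends up with can be summarised in a few lines. This is a simplified model based only on the description in this message, not the actual Kconfig logic; the function name is hypothetical and the uniprocessor case is deliberately left unmodelled.

```python
def rcu_flavour(smp: bool, preempt: bool) -> str:
    """Simplified model: TREE_RCU needs SMP and !PREEMPT, so a preemptible
    SMP build is left with TREE_PREEMPT_RCU."""
    if not smp:
        return "UP flavour (not modelled here)"
    return "TREE_PREEMPT_RCU" if preempt else "TREE_RCU"


print(rcu_flavour(smp=True, preempt=True))
print(rcu_flavour(smp=True, preempt=False))
```

In these terms, the configuration reported as booting corresponds to the second call: an SMP (SMTC) build with no forced preemption.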
On Sat, 2011-01-01 at 00:42 -0800, Kevin D. Kissell wrote:
> At this point the logical thing to do would seem to look at your kernel
> image and disassemble smtc_ipi_replay(), which is where the EPC of VPE 0
> shows the last exception to have been taken. That's a critical SMTC
> routine that gets called whenever an xxx_irq_restore() enables
> interrupts, so that virtual per-TC IPI interrupts that were posted while
> the TC had interrupts disabled can be handled deterministically. As I
> mentioned in an earlier message, there was some cleanup work from David
> Howells that changed a number of irq management-related function names
> and prototypes across all architectures, which went into linux-mips.org
> at very roughly the time of the breakage. The SMTC overlay over the irq
> implementation has been pretty robust, but it's written in a perhaps
> doomed attempt to be both efficient and using a maximum amount of common
> code with the general case. A mechanical or semi-mechanical change
> could conceivably have broken things.
>
> Regards,
>
> Kevin K.
>
>
> On 12/31/2010 4:27 AM, Anoop P A wrote:
> > Hi ,
> >
> > Kernel hangs on stop_machine call. Please find mt reg dump below.
> > Another important observation is even though 2.6.33 kernel + stackframe
> > patch well passes calibration hang , I am still unable boot in to a
> > initramfs root ( verified ramfs working with VSMP). So it looks like
> > still some issue to fix between 2.6.32 and 2.6.33 .
> > ######################## Log ###########################
> >
> > === MIPS MT State Dump ===
> > -- Global State --
> > MVPControl Passed: 00000005
> > MVPControl Read: 00000004
> > MVPConf0 : a8008406
> > -- per-VPE State --
> > VPE 0
> > VPEControl : 00008000
> > VPEConf0 : 800f0003
> > VPE0.Status : 11004201
> > VPE0.EPC : 8010dc54 smtc_ipi_replay+0xcc/0x108
> > VPE0.Cause : 50804000
> > VPE0.Config7 : 00010000
> > VPE 1
> > VPEControl : 00068006
> > VPEConf0 : 80cf0003
> > VPE1.Status : 11008301
> > VPE1.EPC : 801022a0 r4k_wait+0x20/0x40
> > VPE1.Cause : 50800000
> > VPE1.Config7 : 00010000
> > -- per-TC State --
> > TC 0 (current TC with VPE EPC above)
> > TCStatus : 18102000
> > TCBind : 00000000
> > TCRestart : 803fa19c printk+0xc/0x30
> > TCHalt : 00000000
> > TCContext : 00000000
> > TC 1
> > TCStatus : 18902000
> > TCBind : 00200000
> > TCRestart : 801022a0 r4k_wait+0x20/0x40
> > TCHalt : 00000000
> > TCContext : 00140000
> > TC 2
> > TCStatus : 18902000
> > TCBind : 00400000
> > TCRestart : 801022a0 r4k_wait+0x20/0x40
> > TCHalt : 00000000
> > TCContext : 00280000
> > TC 3
> > TCStatus : 18902000
> > TCBind : 00600000
> > TCRestart : 801022a0 r4k_wait+0x20/0x40
> > TCHalt : 00000000
> > TCContext : 003c0000
> > TC 4
> > TCStatus : 18902000
> > TCBind : 00800001
> > TCRestart : 8010229c r4k_wait+0x1c/0x40
> > TCHalt : 00000000
> > TCContext : 00500000
> > TC 5
> > TCStatus : 18902000
> > TCBind : 00a00001
> > TCRestart : 8010229c r4k_wait+0x1c/0x40
> > TCHalt : 00000000
> > TCContext : 00640000
> > TC 6
> > TCStatus : 18902000
> > TCBind : 00c00001
> > TCRestart : 8010229c r4k_wait+0x1c/0x40
> > TCHalt : 00000000
> > TCContext : 00780000
> > Counter Interrupts taken per CPU (TC)
> > 0: 0
> > 1: 0
> > 2: 0
> > 3: 0
> > 4: 0
> > 5: 0
> > 6: 0
> > 7: 0
> > Self-IPI invocations:
> > 0: 12
> > 1: 0
> > 2: 0
> > 3: 0
> > 4: 0
> > 5: 5
> > 6: 4
> > 7: 0
> > IPIQ[0]: head = 0x0, tail = 0x0, depth = 0
> > IPIQ[1]: head = 0x0, tail = 0x0, depth = 0
> > IPIQ[2]: head = 0x0, tail = 0x0, depth = 0
> > IPIQ[3]: head = 0x0, tail = 0x0, depth = 0
> > IPIQ[4]: head = 0x0, tail = 0x0, depth = 0
> > IPIQ[5]: head = 0x0, tail = 0x0, depth = 0
> > IPIQ[6]: head = 0x0, tail = 0x0, depth = 0
> > IPIQ[7]: head = 0x0, tail = 0x0, depth = 0
> > 0 Recoveries of "stolen" FPU
> > ===========================
> >
> > ################################################################
> >
> > Thanks
> > Anoop
> >
> > On Tue, 2010-12-28 at 00:43 -0800, Kevin D. Kissell wrote:
> >> I took a quick look last night, and the only thing that looked vaguely
> >> dangerous in changes since the timer changes I alluded to earlier was
> >> the global naming cleanup of irq-related function names that David
> >> Howells submitted. The diff didn't look dangerous in itself, but some of
> >> the definitions are nested subtly for SMTC to maximize the amount of
> >> common code, and I could imagine something getting lost in translation
> >> there. If that were really the problem, it would of course affect much
> >> more than just the timer subsystem, but early in the boot process,
> >> timers are pretty much the only interrupts that have to be handled
> >> correctly.
> >>
> >> I'm travelling today, but will take a look at timekeeping_notify()
> >> tomorrow or the next day...
> >>
> >> /K.
> >>
> >> On 12/28/10 12:19 AM, Anoop P A wrote:
> >>> Hi,
> >>>
> >>> I had a glance into the code diff without notice of any suspect-able
> >>> code .
> >>> Tracing the hang showed that it is getting hanged in timekeeping_notify
> >>> function.
> >>>
> >>> Thanks,
> >>> Anoop
> >>>
> >>> PS: I may not be available until Thursday
> >>>
> >>> On Mon, 2010-12-27 at 22:49 +0530, Anoop P A wrote:
> >>>> Hi Kevin,
> >>>>
> >>>> It is very unlikely that the patch you pointed has any impact on the the
> >>>> hang I am seeing. The patch you have mentioned got into kernel around
> >>>> 2.6.32 timeframe. I am able to boot both 2.6.32 and 2.6.33 kernel ( +
> >>>> stackframe patch) .
> >>>>
> >>>> Hi Stuart,
> >>>>
> >>>> I haven't got much time to spend on this today.
> >>>>
> >>>> I had got 2.6.36-stable(+ stack frame patch) booting last day and I have
> >>>> observed hang issue with 2.6.37-rc1 ( Same as rc6 and current git head)
> >>>>
> >>>> So probably some patches in 2.6.37 branch introduced this hang.
> >>>>
> >>>> Hopefully I will get some free slot tomorrow so that I can look into
> >>>> code diff .
> >>>>
> >>>> Thanks
> >>>> Anoop
> >>>>
> >>>> On Mon, 2010-12-27 at 09:49 -0600, STUART VENTERS wrote:
> >>>>> Kevin,
> >>>>>
> >>>>> Outstanding, sometimes it's better to be lucky than good.
> >>>>>
> >>>>>
> >>>>> Anoop,
> >>>>>
> >>>>> Maybe we can get lucky again.
> >>>>>
> >>>>> If you can isolate the .33 works/.37 works_not bug to a specific pair of versions,
> >>>>> I'll be happy to do another diff.
> >>>>>
> >>>>>
> >>>>> Hope you'll have had a good Christmas as well.
> >>>>> We've had snow in Alabama since Christmas eve!
> >>>>>
> >>>>>
> >>>>> Regards,
> >>>>>
> >>>>> Stuart
> >>>>>
> >>>>>
> >>>>> -----Original Message-----
> >>>>> From: Kevin D. Kissell [mailto:kevink@paralogos.com]
> >>>>> Sent: Friday, December 24, 2010 5:34 PM
> >>>>> To: Anoop P A
> >>>>> Cc: STUART VENTERS; Anoop P.A.; linux-mips@linux-mips.org
> >>>>> Subject: Re: SMTC support status in latest git head.
> >>>>>
> >>>>>
> >>>>> Ah, well, at least we have a stackframe.h fix that preserves David's
> >>>>> performance tweak for the deeper pipelined processors. In looking for
> >>>>> this, I did notice that someone did some modification to the SMTC clock
> >>>>> tick logic that I was skeptical had ever been tested. If you've still
> >>>>> got that kernel binary handy, you might check to see if it boots with
> >>>>> maxtcs=1 maxvpes=1, maxtcs=2 maxvpes=1, and/or maxtcs=2 maxvpes=2.
> >>>>>
> >>>>> Oh, yes, and Merry Christmas one and all!
> >>>>>
> >>>>> Regards,
> >>>>>
> >>>>> Kevin K.
> >>>>>
> >>>>> On 12/24/10 8:02 AM, Anoop P A wrote:
> >>>>>> On Fri, 2010-12-24 at 06:53 -0800, Kevin D. Kissell wrote:
> >>>>>>> Excellent! Now, does the attached patch (relative to 2.6.37.11) also
> >>>>>>> fix things, while preserving the other fixes and performance enhancements?
> >>>>>>>
> >>>>>> I have tested that patch with 2.6.37 branch it well passes calibration
> >>>>>> loop but hangs after switching to mips clocksource
> >>>>>>
> >>>>>> TC 6 going on-line as CPU 6
> >>>>>> Brought up 7 CPUs
> >>>>>> bio: create slab<bio-0> at 0
> >>>>>> SCSI subsystem initialized
> >>>>>> Switching to clocksource MIPS
> >>>>>>
> >>>>>> I Presume this is a different issue as restoring older file didn't help
> >>>>>> much to get rid of this hang.
> >>>>>>
> >>>>>> diff --git a/arch/mips/include/asm/stackframe.h
> >>>>>> b/arch/mips/include/asm/stackframe.h
> >>>>>> index 58730c5..7fc9f10 100644
> >>>>>> --- a/arch/mips/include/asm/stackframe.h
> >>>>>> +++ b/arch/mips/include/asm/stackframe.h
> >>>>>> @@ -195,9 +195,9 @@
> >>>>>> * to cover the pipeline delay.
> >>>>>> */
> >>>>>> .set mips32
> >>>>>> - mfc0 v1, CP0_TCSTATUS
> >>>>>> + mfc0 v0, CP0_TCSTATUS
> >>>>>> .set mips0
> >>>>>> - LONG_S v1, PT_TCSTATUS(sp)
> >>>>>> + LONG_S v0, PT_TCSTATUS(sp)
> >>>>>> #endif /* CONFIG_MIPS_MT_SMTC */
> >>>>>> LONG_S $4, PT_R4(sp)
> >>>>>> LONG_S $5, PT_R5(sp)
> >>>>>>
> >>>>>>
> >>>>>>> /K.
> >>>>>>>
> >>>>>>> On 12/24/10 6:39 AM, Anoop P A wrote:
> >>>>>>>> Hi Kevin, Stuart ,
> >>>>>>>>
> >>>>>>>> Woohooo You guys spotted !.
> >>>>>>>>
> >>>>>>>> http://git.linux-mips.org/?p=linux.git;a=commit;h=d5ec6e3c seems to be
> >>>>>>>> the culprit
> >>>>>>>>
> >>>>>>>> Once I restored previous version of stackframe.h 2.6.33-stable started
> >>>>>>>> booting !.
> >>>>>>>>
> >>>>>>>> Thanks,
> >>>>>>>> Anoop
> >>>>>>>>
> >>>>>>>> On Fri, 2010-12-24 at 04:32 -0800, Kevin D. Kissell wrote:
> >>>>>>>>> Thank you, Stuart! I've spotted some definite breakage to SMTC between
> >>>>>>>>> those versions. In arch/mips/include/asm/stackframe.h, someone moved
> >>>>>>>>> the store of the Status register value in SAVE_SOME (line 169 or 204,
> >>>>>>>>> depending on the version) from two instructions after the mfc0 to a
> >>>>>>>>> point after the #ifdef for SMTC, presumably to get better pipelining of
> >>>>>>>>> the register access. Unfortunately, the v1 register is also used in the
> >>>>>>>>> SMTC-specific fragment to save TCStatus, so the Status value gets
> >>>>>>>>> clobbered before it gets stored. This will eventually result in the
> >>>>>>>>> Status register getting a TCStatus value, which has some bits on common,
> >>>>>>>>> but isn't identical and sooner or later Bad Things will happen.
> >>>>>>>>>
> >>>>>>>>> I'm a little surprised this wasn't caught by visual inspection of the patch.
> >>>>>>>>>
> >>>>>>>>> Possible solutions would include reverting the store of the CP0_STATUS
> >>>>>>>>> value to the block above the #ifdef, or, to retain whatever performance
> >>>>>>>>> advantage was obtained by moving the store downward, to use v0/$2
> >>>>>>>>> instead of v1/$3, as the staging register for the TCStatus value. I'd
> >>>>>>>>> lean toward the second option, but I'm not in a position to test and
> >>>>>>>>> submit a patch just now.
> >>>>>>>>>
> >>>>>>>>> Regards,
> >>>>>>>>>
> >>>>>>>>> Kevin K.
> >>>>>>>>>
> >>>>>>>>> On 12/23/10 1:09 PM, STUART VENTERS wrote:
> >>>>>>>>>> Kevin,
> >>>>>>>>>>
> >>>>>>>>>> I'm not sure if it's useful,
> >>>>>>>>>> but finally I got the time to look at the two kernel versions Anoop pointed out.
> >>>>>>>>>> works 2.6.32-stable with patch 804
> >>>>>>>>>> works_not 2.6.33-stable
> >>>>>>>>>>
> >>>>>>>>>> greping for files with CONFIG_MIPS_MT_SMTC
> >>>>>>>>>> and looking for timer interrupt related stuff found the following differences:
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> arch/mips/include/asm/irq.h
> >>>>>>>>>> arch/mips/kernel/irq.c
> >>>>>>>>>> do_IRQ
> >>>>>>>>>>
> >>>>>>>>>> arch/mips/include/asm/stackframe.h
> >>>>>>>>>> SAVE_SOME SAVE_TEMP get/set_saved_sp
> >>>>>>>>>>
> >>>>>>>>>> arch/mips/include/asm/time.h
> >>>>>>>>>> clocksource_set_clock
> >>>>>>>>>>
> >>>>>>>>>> arch/mips/kernel/process.c
> >>>>>>>>>> cpu_idle
> >>>>>>>>>>
> >>>>>>>>>> arch/mips/kernel/smtc.c
> >>>>>>>>>> __irq_entry
> >>>>>>>>>> ipi_decode
> >>>>>>>>>> SMTC_CLOCK_TICK
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> Enclosed are the two subsets of files for a more expert look.
> >>>>>>>>>>
> >>>>>>>>>> I'll try to look in more detail after Christmas.
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> Cheers,
> >>>>>>>>>>
> >>>>>>>>>> Stuart
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >
>
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: SMTC support status in latest git head.
2011-01-03 15:12 ` Anoop P A
@ 2011-01-03 16:14 ` Kevin D. Kissell
2011-01-03 19:20 ` Anoop P A
0 siblings, 1 reply; 68+ messages in thread
From: Kevin D. Kissell @ 2011-01-03 16:14 UTC (permalink / raw)
To: Anoop P A; +Cc: STUART VENTERS, Anoop P.A., linux-mips

The very first SMTC implementations didn't support full kernel-mode
preemption, which anyway wasn't a priority, given the hardware event
response support in MIPS MT. I believe it was later made compatible,
but it was never extensively exercised. Since SMTC has fingers in some
pretty low-level atomicity mechanisms, if a new, parallel set was
implemented for RCU, I can easily imagine that nobody has yet
implemented SMTC-ified variants of that set.

Your last statement isn't very clear, though. Are you saying that if
you configure for no forced preemption and with TREE_RCU, the 2.6.37
kernel boots all the way up, or that it simply hangs later? What's the
last rev kernel that actually boots all the way up?

Regards,

Kevin K.

On 1/3/2011 7:12 AM, Anoop P A wrote:
> Hi ,
>
> Following patch restricts TREE_RCU RCU implementation only for !PREEMPT
> SMP kernel.
> http://git.linux-mips.org/?p=linux.git;a=commit;h=687d7a960aea46e016182c7ce346d62c4dbd0366
>
> CONFIG_TREE_PREEMPT_RCU option seems to be not working for SMTC kernel
> ( which will be only available RCU implementation for SMTC kernel from
> 2.6.37 onwards) .
>
> With no forced preemption and selecting TREE_RCU I am able to boot
> further to the hang that I have reported.
>
> Thanks
> Anoop
>
> On Sat, 2011-01-01 at 00:42 -0800, Kevin D. Kissell wrote:
>> At this point the logical thing to do would seem to look at your kernel
>> image and disassemble smtc_ipi_replay(), which is where the EPC of VPE 0
>> shows the last exception to have been taken. That's a critical SMTC
>> routine that gets called whenever an xxx_irq_restore() enables
>> interrupts, so that virtual per-TC IPI interrupts that were posted while
>> the TC had interrupts disabled can be handled deterministically. As I
>> mentioned in an earlier message, there was some cleanup work from David
>> Howell that changed a number of irq management-related function names
>> and prototypes across all architectures, which went into linux-mips.org
>> at very roughly the time of the breakage. The SMTC overlay over the irq
>> implementation has been pretty robust, but it's written in a perhaps
>> doomed attempt to be both efficient and using a maximum amount of common
>> code with the general case. A mechanical or semi-mechanical change
>> could conceivably have broken things.
>>
>> Regards,
>>
>> Kevin K.
>>
>>
>> On 12/31/2010 4:27 AM, Anoop P A wrote:
>>> Hi ,
>>>
>>> Kernel hangs on stop_machine call. Please find mt reg dump below.
>>> Another important observation is even though 2.6.33 kernel + stackframe
>>> patch well passes calibration hang , I am still unable boot in to a
>>> initramfs root ( verified ramfs working with VSMP). So it looks like
>>> still some issue to fix between 2.6.32 and 2.6.33 .
>>> ######################## Log ###########################
>>>
>>> === MIPS MT State Dump ===
>>> -- Global State --
>>> MVPControl Passed: 00000005
>>> MVPControl Read: 00000004
>>> MVPConf0 : a8008406
>>> -- per-VPE State --
>>> VPE 0
>>> VPEControl : 00008000
>>> VPEConf0 : 800f0003
>>> VPE0.Status : 11004201
>>> VPE0.EPC : 8010dc54 smtc_ipi_replay+0xcc/0x108
>>> VPE0.Cause : 50804000
>>> VPE0.Config7 : 00010000
>>> VPE 1
>>> VPEControl : 00068006
>>> VPEConf0 : 80cf0003
>>> VPE1.Status : 11008301
>>> VPE1.EPC : 801022a0 r4k_wait+0x20/0x40
>>> VPE1.Cause : 50800000
>>> VPE1.Config7 : 00010000
>>> -- per-TC State --
>>> TC 0 (current TC with VPE EPC above)
>>> TCStatus : 18102000
>>> TCBind : 00000000
>>> TCRestart : 803fa19c printk+0xc/0x30
>>> TCHalt : 00000000
>>> TCContext : 00000000
>>> TC 1
>>> TCStatus : 18902000
>>> TCBind : 00200000
>>> TCRestart : 801022a0 r4k_wait+0x20/0x40
>>> TCHalt : 00000000
>>> TCContext : 00140000
>>> TC 2
>>> TCStatus : 18902000
>>> TCBind : 00400000
>>> TCRestart : 801022a0 r4k_wait+0x20/0x40
>>> TCHalt : 00000000
>>> TCContext : 00280000
>>> TC 3
>>> TCStatus : 18902000
>>> TCBind : 00600000
>>> TCRestart : 801022a0 r4k_wait+0x20/0x40
>>> TCHalt : 00000000
>>> TCContext : 003c0000
>>> TC 4
>>> TCStatus : 18902000
>>> TCBind : 00800001
>>> TCRestart : 8010229c r4k_wait+0x1c/0x40
>>> TCHalt : 00000000
>>> TCContext : 00500000
>>> TC 5
>>> TCStatus : 18902000
>>> TCBind : 00a00001
>>> TCRestart : 8010229c r4k_wait+0x1c/0x40
>>> TCHalt : 00000000
>>> TCContext : 00640000
>>> TC 6
>>> TCStatus : 18902000
>>> TCBind : 00c00001
>>> TCRestart : 8010229c r4k_wait+0x1c/0x40
>>> TCHalt : 00000000
>>> TCContext : 00780000
>>> Counter Interrupts taken per CPU (TC)
>>> 0: 0
>>> 1: 0
>>> 2: 0
>>> 3: 0
>>> 4: 0
>>> 5: 0
>>> 6: 0
>>> 7: 0
>>> Self-IPI invocations:
>>> 0: 12
>>> 1: 0
>>> 2: 0
>>> 3: 0
>>> 4: 0
>>> 5: 5
>>> 6: 4
>>> 7: 0
>>> IPIQ[0]: head = 0x0, tail = 0x0, depth = 0
>>> IPIQ[1]: head = 0x0, tail = 0x0, depth = 0
>>> IPIQ[2]: head = 0x0, tail = 0x0, depth = 0
>>> IPIQ[3]: head = 0x0, tail = 0x0, depth = 0
>>> IPIQ[4]: head = 0x0, tail = 0x0, depth = 0
>>> IPIQ[5]: head = 0x0, tail = 0x0, depth = 0
>>> IPIQ[6]: head = 0x0, tail = 0x0, depth = 0
>>> IPIQ[7]: head = 0x0, tail = 0x0, depth = 0
>>> 0 Recoveries of "stolen" FPU
>>> ===========================
>>>
>>> ################################################################
>>>
>>> Thanks
>>> Anoop
>>>
>>> On Tue, 2010-12-28 at 00:43 -0800, Kevin D. Kissell wrote:
>>>> I took a quick look last night, and the only thing that looked vaguely
>>>> dangerous in changes since the timer changes I alluded to earlier was
>>>> the global naming cleanup of irq-related function names that David
>>>> Howell submitted. The diff didn't look dangerous in itself, but some of
>>>> the definitions are nested subtly for SMTC to maximize the amount of
>>>> common code, and I could imagine something getting lost in translation
>>>> there. If that were really the problem, it would of course affect much
>>>> more than just the timer subsystem, but early in the boot process,
>>>> timers are pretty much the only interrupts that have to be handled
>>>> correctly.
>>>>
>>>> I'm travelling today, but will take a look at timekeeping_notify()
>>>> tomorrow or the next day...
>>>>
>>>> /K.
>>>>
>>>> On 12/28/10 12:19 AM, Anoop P A wrote:
>>>>> Hi,
>>>>>
>>>>> I had a glance into the code diff without notice of any suspect-able
>>>>> code .
>>>>> Tracing the hang showed that it is getting hanged in timekeeping_notify
>>>>> function.
>>>>>
>>>>> Thanks,
>>>>> Anoop
>>>>>
>>>>> PS: I may not be available until Thursday
>>>>>
>>>>> On Mon, 2010-12-27 at 22:49 +0530, Anoop P A wrote:
>>>>>> Hi Kevin,
>>>>>>
>>>>>> It is very unlikely that the patch you pointed has any impact on the
>>>>>> hang I am seeing. The patch you have mentioned got into kernel around
>>>>>> 2.6.32 timeframe. I am able to boot both 2.6.32 and 2.6.33 kernel ( +
>>>>>> stackframe patch) .
>>>>>>
>>>>>> Hi Stuart,
>>>>>>
>>>>>> I haven't got much time to spend on this today.
>>>>>>
>>>>>> I had got 2.6.36-stable(+ stack frame patch) booting last day and I have
>>>>>> observed hang issue with 2.6.37-rc1 ( Same as rc6 and current git head)
>>>>>>
>>>>>> So probably some patches in 2.6.37 branch introduced this hang.
>>>>>>
>>>>>> Hopefully I will get some free slot tomorrow so that I can look into
>>>>>> code diff .
>>>>>>
>>>>>> Thanks
>>>>>> Anoop
>>>>>>
>>>>>> On Mon, 2010-12-27 at 09:49 -0600, STUART VENTERS wrote:
>>>>>>> Kevin,
>>>>>>>
>>>>>>> Outstanding, sometimes it's better to be lucky than good.
>>>>>>>
>>>>>>>
>>>>>>> Anoop,
>>>>>>>
>>>>>>> Maybe we can get lucky again.
>>>>>>>
>>>>>>> If you can isolate the .33 works/.37 works_not bug to a specific pair of versions,
>>>>>>> I'll be happy to do another diff.
>>>>>>>
>>>>>>>
>>>>>>> Hope you'll have had a good Christmas as well.
>>>>>>> We've had snow in Alabama since Christmas eve!
>>>>>>>
>>>>>>>
>>>>>>> Regards,
>>>>>>>
>>>>>>> Stuart
>>>>>>>
>>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: Kevin D. Kissell [mailto:kevink@paralogos.com]
>>>>>>> Sent: Friday, December 24, 2010 5:34 PM
>>>>>>> To: Anoop P A
>>>>>>> Cc: STUART VENTERS; Anoop P.A.; linux-mips@linux-mips.org
>>>>>>> Subject: Re: SMTC support status in latest git head.
>>>>>>>
>>>>>>>
>>>>>>> Ah, well, at least we have a stackframe.h fix that preserves David's
>>>>>>> performance tweak for the deeper pipelined processors. In looking for
>>>>>>> this, I did notice that someone did some modification to the SMTC clock
>>>>>>> tick logic that I was skeptical had ever been tested. If you've still
>>>>>>> got that kernel binary handy, you might check to see if it boots with
>>>>>>> maxtcs=1 maxvpes=1, maxtcs=2 maxvpes=1, and/or maxtcs=2 maxvpes=2.
>>>>>>>
>>>>>>> Oh, yes, and Merry Christmas one and all!
>>>>>>>
>>>>>>> Regards,
>>>>>>>
>>>>>>> Kevin K.
>>>>>>>
>>>>>>> On 12/24/10 8:02 AM, Anoop P A wrote:
>>>>>>>> On Fri, 2010-12-24 at 06:53 -0800, Kevin D. Kissell wrote:
>>>>>>>>> Excellent! Now, does the attached patch (relative to 2.6.37.11) also
>>>>>>>>> fix things, while preserving the other fixes and performance enhancements?
>>>>>>>>>
>>>>>>>> I have tested that patch with 2.6.37 branch it well passes calibration
>>>>>>>> loop but hangs after switching to mips clocksource
>>>>>>>>
>>>>>>>> TC 6 going on-line as CPU 6
>>>>>>>> Brought up 7 CPUs
>>>>>>>> bio: create slab<bio-0> at 0
>>>>>>>> SCSI subsystem initialized
>>>>>>>> Switching to clocksource MIPS
>>>>>>>>
>>>>>>>> I Presume this is a different issue as restoring older file didn't help
>>>>>>>> much to get rid of this hang.
>>>>>>>>
>>>>>>>> diff --git a/arch/mips/include/asm/stackframe.h
>>>>>>>> b/arch/mips/include/asm/stackframe.h
>>>>>>>> index 58730c5..7fc9f10 100644
>>>>>>>> --- a/arch/mips/include/asm/stackframe.h
>>>>>>>> +++ b/arch/mips/include/asm/stackframe.h
>>>>>>>> @@ -195,9 +195,9 @@
>>>>>>>> * to cover the pipeline delay.
>>>>>>>> */
>>>>>>>> .set mips32
>>>>>>>> - mfc0 v1, CP0_TCSTATUS
>>>>>>>> + mfc0 v0, CP0_TCSTATUS
>>>>>>>> .set mips0
>>>>>>>> - LONG_S v1, PT_TCSTATUS(sp)
>>>>>>>> + LONG_S v0, PT_TCSTATUS(sp)
>>>>>>>> #endif /* CONFIG_MIPS_MT_SMTC */
>>>>>>>> LONG_S $4, PT_R4(sp)
>>>>>>>> LONG_S $5, PT_R5(sp)
>>>>>>>>
>>>>>>>>
>>>>>>>>> /K.
>>>>>>>>>
>>>>>>>>> On 12/24/10 6:39 AM, Anoop P A wrote:
>>>>>>>>>> Hi Kevin, Stuart ,
>>>>>>>>>>
>>>>>>>>>> Woohooo You guys spotted !.
>>>>>>>>>>
>>>>>>>>>> http://git.linux-mips.org/?p=linux.git;a=commit;h=d5ec6e3c seems to be
>>>>>>>>>> the culprit
>>>>>>>>>>
>>>>>>>>>> Once I restored previous version of stackframe.h 2.6.33-stable started
>>>>>>>>>> booting !.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Anoop
>>>>>>>>>>
>>>>>>>>>> On Fri, 2010-12-24 at 04:32 -0800, Kevin D. Kissell wrote:
>>>>>>>>>>> Thank you, Stuart! I've spotted some definite breakage to SMTC between
>>>>>>>>>>> those versions. In arch/mips/include/asm/stackframe.h, someone moved
>>>>>>>>>>> the store of the Status register value in SAVE_SOME (line 169 or 204,
>>>>>>>>>>> depending on the version) from two instructions after the mfc0 to a
>>>>>>>>>>> point after the #ifdef for SMTC, presumably to get better pipelining of
>>>>>>>>>>> the register access. Unfortunately, the v1 register is also used in the
>>>>>>>>>>> SMTC-specific fragment to save TCStatus, so the Status value gets
>>>>>>>>>>> clobbered before it gets stored. This will eventually result in the
>>>>>>>>>>> Status register getting a TCStatus value, which has some bits on common,
>>>>>>>>>>> but isn't identical and sooner or later Bad Things will happen.
>>>>>>>>>>>
>>>>>>>>>>> I'm a little surprised this wasn't caught by visual inspection of the patch.
>>>>>>>>>>>
>>>>>>>>>>> Possible solutions would include reverting the store of the CP0_STATUS
>>>>>>>>>>> value to the block above the #ifdef, or, to retain whatever performance
>>>>>>>>>>> advantage was obtained by moving the store downward, to use v0/$2
>>>>>>>>>>> instead of v1/$3, as the staging register for the TCStatus value. I'd
>>>>>>>>>>> lean toward the second option, but I'm not in a position to test and
>>>>>>>>>>> submit a patch just now.
>>>>>>>>>>>
>>>>>>>>>>> Regards,
>>>>>>>>>>>
>>>>>>>>>>> Kevin K.
>>>>>>>>>>>
>>>>>>>>>>> On 12/23/10 1:09 PM, STUART VENTERS wrote:
>>>>>>>>>>>> Kevin,
>>>>>>>>>>>>
>>>>>>>>>>>> I'm not sure if it's useful,
>>>>>>>>>>>> but finally I got the time to look at the two kernel versions Anoop pointed out.
>>>>>>>>>>>> works 2.6.32-stable with patch 804
>>>>>>>>>>>> works_not 2.6.33-stable
>>>>>>>>>>>>
>>>>>>>>>>>> greping for files with CONFIG_MIPS_MT_SMTC
>>>>>>>>>>>> and looking for timer interrupt related stuff found the following differences:
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> arch/mips/include/asm/irq.h
>>>>>>>>>>>> arch/mips/kernel/irq.c
>>>>>>>>>>>> do_IRQ
>>>>>>>>>>>>
>>>>>>>>>>>> arch/mips/include/asm/stackframe.h
>>>>>>>>>>>> SAVE_SOME SAVE_TEMP get/set_saved_sp
>>>>>>>>>>>>
>>>>>>>>>>>> arch/mips/include/asm/time.h
>>>>>>>>>>>> clocksource_set_clock
>>>>>>>>>>>>
>>>>>>>>>>>> arch/mips/kernel/process.c
>>>>>>>>>>>> cpu_idle
>>>>>>>>>>>>
>>>>>>>>>>>> arch/mips/kernel/smtc.c
>>>>>>>>>>>> __irq_entry
>>>>>>>>>>>> ipi_decode
>>>>>>>>>>>> SMTC_CLOCK_TICK
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Enclosed are the two subsets of files for a more expert look.
>>>>>>>>>>>>
>>>>>>>>>>>> I'll try to look in more detail after Christmas.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>
>>>>>>>>>>>> Stuart
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>
>
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: SMTC support status in latest git head.
2011-01-03 16:14 ` Kevin D. Kissell
@ 2011-01-03 19:20 ` Anoop P A
2011-01-04 8:17 ` Kevin D. Kissell
0 siblings, 1 reply; 68+ messages in thread
From: Anoop P A @ 2011-01-03 19:20 UTC (permalink / raw)
To: Kevin D. Kissell; +Cc: STUART VENTERS, Anoop P.A., linux-mips
Hi Kevin,
On Mon, 2011-01-03 at 08:14 -0800, Kevin D. Kissell wrote:
> The very first SMTC implementations didn't support full kernel-mode
> preemption, which anyway wasn't a priority, given the hardware event
> response support in MIPS MT. I believe it was later made compatible,
> but it was never extensively exercised. Since SMTC has fingers in some
> pretty low-level atomicity mechanisms, if a new, parallel set was
> implemented for RCU, I can easily imagine that nobody has yet
> implemented SMTC-ified variants of that set.
>
> Your last statement isn't very clear, though. Are you saying that if
> you configure for no forced preemption and with TREE_RCU, the 2.6.37
> kernel boots all the way up, or that it simply hangs later? What's the
> last rev kernel that actually boots all the way up?

I have debugged this a bit more. It seems that the kernel is getting
stalled while executing on the TCs of the second VPE.

INFO: rcu_sched_state detected stalls on CPUs/tasks: { 4 5 6} (detected by 1, t=2504 jiffies)
INFO: rcu_sched_state detected stalls on CPUs/tasks: { 4 5 6} (detected by 1, t=10036 jiffies)
INFO: rcu_sched_state detected stalls on CPUs/tasks: { 4 5 6} (detected by 1, t=17568 jiffies)
INFO: rcu_sched_state detected stalls on CPUs/tasks: { 4 5 6} (detected by 1, t=25100 jiffies)
INFO: rcu_sched_state detected stalls on CPUs/tasks: { 4 5 6} (detected by 1, t=32632 jiffies)
INFO: rcu_sched_state detected stalls on CPUs/tasks: { 4 5 6} (detected by 1, t=40164 jiffies)

With CONFIG_TREE_RCU we were not hitting this scenario very often.
However, with CONFIG_TREE_PREEMPT_RCU the stall happens most of the time.
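
As a cross-check on the stall reports, the TCBind values in the MT state
dump quoted further down can be decoded to see which TC sits on which VPE.
What follows is only a minimal sketch in C, assuming the MIPS MT field
layout (CurTC in bits 28..21, CurVPE in bits 3..0). The macro and function
names are local to this example and are not copied from the kernel sources.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed TCBind layout: CurTC in bits 28..21, CurVPE in bits 3..0. */
#define EX_TCBIND_CURTC_SHIFT  21
#define EX_TCBIND_CURTC_MASK   (0xffu << EX_TCBIND_CURTC_SHIFT)
#define EX_TCBIND_CURVPE_MASK  0xfu

/* Return the TC number encoded in a raw TCBind value. */
unsigned int tcbind_curtc(uint32_t tcbind)
{
	return (tcbind & EX_TCBIND_CURTC_MASK) >> EX_TCBIND_CURTC_SHIFT;
}

/* Return the VPE number the TC is bound to. */
unsigned int tcbind_curvpe(uint32_t tcbind)
{
	return tcbind & EX_TCBIND_CURVPE_MASK;
}

/* Check the decode against the seven TCBind values from the dump. */
void check_dump_values(void)
{
	static const uint32_t dump[7] = {
		0x00000000u, 0x00200000u, 0x00400000u, 0x00600000u,
		0x00800001u, 0x00a00001u, 0x00c00001u,
	};
	unsigned int tc;

	for (tc = 0; tc < 7; tc++) {
		/* Each entry should name its own TC index. */
		assert(tcbind_curtc(dump[tc]) == tc);
		/* TCs 0..3 are on VPE 0, TCs 4..6 on VPE 1. */
		assert(tcbind_curvpe(dump[tc]) == (tc >= 4 ? 1u : 0u));
	}
}
```

Under that assumption, 0x00800001, 0x00a00001 and 0x00c00001 decode to
TC 4, 5 and 6 on VPE 1, which are the same CPU numbers named in the stall
messages, while TCs 0 to 3 decode to VPE 0.
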
I presume there is some issue in my timer setup. Even on a completely
booted 2.6.32-stable kernel, I am not seeing the timer interrupt (or IPI
interrupt) counts getting incremented for the VPE1 TCs.

/ # cat /proc/interrupts
        CPU0    CPU1    CPU2    CPU3    CPU4    CPU5    CPU6
  1:     148   15023   15140   15093    3779       8       2  MIPS     SMTC_IPI
  6:       0       0       0       0       0       0       0  MIPS     MSP CIC cascade
  8:       0       0       0       0       0       0       0  MSP_CIC  Softreset button
  9:       0       0       0       0       0       0       0  MSP_CIC  Standby switch
 21:       0       0       0       0       0       0       0  MSP_CIC  MSP PER cascade
 25:   15113     341       4       7       0       0       0  MSP_CIC  timer
 27:     260       9       0       1       0       0       0  MSP_CIC  serial
 34:       0       0       0       0       0       0       0  MSP_CIC  timer

Can't we use separate timer interrupts for VPE1 and VPE0 in SMTC?
I have tried setting up the VPE1 timer from get_c0_compare_int as follows:

unsigned int __cpuinit get_c0_compare_int(void)
{
	/* Install the VPE1 timer handler once, on the first call from VPE1. */
	if ((1 == get_current_vpe()) && !vpe1_timr_installed) {
		memcpy(&timer_vpe1, &c0_compare_irqaction, sizeof(timer_vpe1));
		setup_irq(MSP_INT_VPE1_TIMER, &timer_vpe1);
		vpe1_timr_installed++;
	}
	return (get_current_vpe() ? MSP_INT_VPE1_TIMER : MSP_INT_VPE0_TIMER);
}

Thanks
Anoop
>
> Regards,
>
> Kevin K.
>
> On 1/3/2011 7:12 AM, Anoop P A wrote:
> > Hi ,
> >
> > Following patch restricts TREE_RCU RCU implementation only for !PREEMPT
> > SMP kernel.
> > http://git.linux-mips.org/?p=linux.git;a=commit;h=687d7a960aea46e016182c7ce346d62c4dbd0366
> >
> > CONFIG_TREE_PREEMPT_RCU option seems to be not working for SMTC kernel
> > ( which will be only available RCU implementation for SMTC kernel from
> > 2.6.37 onwards) .
> >
> > With no forced preemption and selecting TREE_RCU I am able to boot
> > further to the hang that I have reported.
> >
> > Thanks
> > Anoop
> >
> > On Sat, 2011-01-01 at 00:42 -0800, Kevin D. Kissell wrote:
> >> At this point the logical thing to do would seem to look at your kernel
> >> image and disassemble smtc_ipi_replay(), which is where the EPC of VPE 0
> >> shows the last exception to have been taken. That's a critical SMTC
> >> routine that gets called whenever an xxx_irq_restore() enables
> >> interrupts, so that virtual per-TC IPI interrupts that were posted while
> >> the TC had interrupts disabled can be handled deterministically. As I
> >> mentioned in an earlier message, there was some cleanup work from David
> >> Howell that changed a number of irq management-related function names
> >> and prototypes across all architectures, which went into linux-mips.org
> >> at very roughly the time of the breakage. The SMTC overlay over the irq
> >> implementation has been pretty robust, but it's written in a perhaps
> >> doomed attempt to be both efficient and using a maximum amount of common
> >> code with the general case. A mechanical or semi-mechanical change
> >> could conceivably have broken things.
> >>
> >> Regards,
> >>
> >> Kevin K.
> >>
> >>
> >> On 12/31/2010 4:27 AM, Anoop P A wrote:
> >>> Hi ,
> >>>
> >>> Kernel hangs on stop_machine call. Please find mt reg dump below.
> >>> Another important observation is even though 2.6.33 kernel + stackframe
> >>> patch well passes calibration hang , I am still unable boot in to a
> >>> initramfs root ( verified ramfs working with VSMP). So it looks like
> >>> still some issue to fix between 2.6.32 and 2.6.33 .
> >>> ######################## Log ###########################
> >>>
> >>> === MIPS MT State Dump ===
> >>> -- Global State --
> >>> MVPControl Passed: 00000005
> >>> MVPControl Read: 00000004
> >>> MVPConf0 : a8008406
> >>> -- per-VPE State --
> >>> VPE 0
> >>> VPEControl : 00008000
> >>> VPEConf0 : 800f0003
> >>> VPE0.Status : 11004201
> >>> VPE0.EPC : 8010dc54 smtc_ipi_replay+0xcc/0x108
> >>> VPE0.Cause : 50804000
> >>> VPE0.Config7 : 00010000
> >>> VPE 1
> >>> VPEControl : 00068006
> >>> VPEConf0 : 80cf0003
> >>> VPE1.Status : 11008301
> >>> VPE1.EPC : 801022a0 r4k_wait+0x20/0x40
> >>> VPE1.Cause : 50800000
> >>> VPE1.Config7 : 00010000
> >>> -- per-TC State --
> >>> TC 0 (current TC with VPE EPC above)
> >>> TCStatus : 18102000
> >>> TCBind : 00000000
> >>> TCRestart : 803fa19c printk+0xc/0x30
> >>> TCHalt : 00000000
> >>> TCContext : 00000000
> >>> TC 1
> >>> TCStatus : 18902000
> >>> TCBind : 00200000
> >>> TCRestart : 801022a0 r4k_wait+0x20/0x40
> >>> TCHalt : 00000000
> >>> TCContext : 00140000
> >>> TC 2
> >>> TCStatus : 18902000
> >>> TCBind : 00400000
> >>> TCRestart : 801022a0 r4k_wait+0x20/0x40
> >>> TCHalt : 00000000
> >>> TCContext : 00280000
> >>> TC 3
> >>> TCStatus : 18902000
> >>> TCBind : 00600000
> >>> TCRestart : 801022a0 r4k_wait+0x20/0x40
> >>> TCHalt : 00000000
> >>> TCContext : 003c0000
> >>> TC 4
> >>> TCStatus : 18902000
> >>> TCBind : 00800001
> >>> TCRestart : 8010229c r4k_wait+0x1c/0x40
> >>> TCHalt : 00000000
> >>> TCContext : 00500000
> >>> TC 5
> >>> TCStatus : 18902000
> >>> TCBind : 00a00001
> >>> TCRestart : 8010229c r4k_wait+0x1c/0x40
> >>> TCHalt : 00000000
> >>> TCContext : 00640000
> >>> TC 6
> >>> TCStatus : 18902000
> >>> TCBind : 00c00001
> >>> TCRestart : 8010229c r4k_wait+0x1c/0x40
> >>> TCHalt : 00000000
> >>> TCContext : 00780000
> >>> Counter Interrupts taken per CPU (TC)
> >>> 0: 0
> >>> 1: 0
> >>> 2: 0
> >>> 3: 0
> >>> 4: 0
> >>> 5: 0
> >>> 6: 0
> >>> 7: 0
> >>> Self-IPI invocations:
> >>> 0: 12
> >>> 1: 0
> >>> 2: 0
> >>> 3: 0
> >>> 4: 0
> >>> 5: 5
> >>> 6: 4
> >>> 7: 0
> >>> IPIQ[0]: head = 0x0, tail = 0x0, depth = 0
> >>> IPIQ[1]: head = 0x0, tail = 0x0, depth = 0
> >>> IPIQ[2]: head = 0x0, tail = 0x0, depth = 0
> >>> IPIQ[3]: head = 0x0, tail = 0x0, depth = 0
> >>> IPIQ[4]: head = 0x0, tail = 0x0, depth = 0
> >>> IPIQ[5]: head = 0x0, tail = 0x0, depth = 0
> >>> IPIQ[6]: head = 0x0, tail = 0x0, depth = 0
> >>> IPIQ[7]: head = 0x0, tail = 0x0, depth = 0
> >>> 0 Recoveries of "stolen" FPU
> >>> ===========================
> >>>
> >>> ################################################################
> >>>
> >>> Thanks
> >>> Anoop
> >>>
> >>> On Tue, 2010-12-28 at 00:43 -0800, Kevin D. Kissell wrote:
> >>>> I took a quick look last night, and the only thing that looked vaguely
> >>>> dangerous in changes since the timer changes I alluded to earlier was
> >>>> the global naming cleanup of irq-related function names that David
> >>>> Howell submitted. The diff didn't look dangerous in itself, but some of
> >>>> the definitions are nested subtly for SMTC to maximize the amount of
> >>>> common code, and I could imagine something getting lost in translation
> >>>> there. If that were really the problem, it would of course affect much
> >>>> more than just the timer subsystem, but early in the boot process,
> >>>> timers are pretty much the only interrupts that have to be handled
> >>>> correctly.
> >>>>
> >>>> I'm travelling today, but will take a look at timekeeping_notify()
> >>>> tomorrow or the next day...
> >>>>
> >>>> /K.
> >>>>
> >>>> On 12/28/10 12:19 AM, Anoop P A wrote:
> >>>>> Hi,
> >>>>>
> >>>>> I had a glance into the code diff without notice of any suspect-able
> >>>>> code .
> >>>>> Tracing the hang showed that it is getting hanged in timekeeping_notify
> >>>>> function.
> >>>>>
> >>>>> Thanks,
> >>>>> Anoop
> >>>>>
> >>>>> PS: I may not be available until Thursday
> >>>>>
> >>>>> On Mon, 2010-12-27 at 22:49 +0530, Anoop P A wrote:
> >>>>>> Hi Kevin,
> >>>>>>
> >>>>>> It is very unlikely that the patch you pointed has any impact on the
> >>>>>> hang I am seeing. The patch you have mentioned got into kernel around
> >>>>>> 2.6.32 timeframe. I am able to boot both 2.6.32 and 2.6.33 kernel ( +
> >>>>>> stackframe patch) .
> >>>>>>
> >>>>>> Hi Stuart,
> >>>>>>
> >>>>>> I haven't got much time to spend on this today.
> >>>>>>
> >>>>>> I had got 2.6.36-stable(+ stack frame patch) booting last day and I have
> >>>>>> observed hang issue with 2.6.37-rc1 ( Same as rc6 and current git head)
> >>>>>>
> >>>>>> So probably some patches in 2.6.37 branch introduced this hang.
> >>>>>>
> >>>>>> Hopefully I will get some free slot tomorrow so that I can look into
> >>>>>> code diff .
> >>>>>>
> >>>>>> Thanks
> >>>>>> Anoop
> >>>>>>
> >>>>>> On Mon, 2010-12-27 at 09:49 -0600, STUART VENTERS wrote:
> >>>>>>> Kevin,
> >>>>>>>
> >>>>>>> Outstanding, sometimes it's better to be lucky than good.
> >>>>>>>
> >>>>>>>
> >>>>>>> Anoop,
> >>>>>>>
> >>>>>>> Maybe we can get lucky again.
> >>>>>>>
> >>>>>>> If you can isolate the .33 works/.37 works_not bug to a specific pair of versions,
> >>>>>>> I'll be happy to do another diff.
> >>>>>>>
> >>>>>>>
> >>>>>>> Hope you'll have had a good Christmas as well.
> >>>>>>> We've had snow in Alabama since Christmas eve!
> >>>>>>>
> >>>>>>>
> >>>>>>> Regards,
> >>>>>>>
> >>>>>>> Stuart
> >>>>>>>
> >>>>>>>
> >>>>>>> -----Original Message-----
> >>>>>>> From: Kevin D. Kissell [mailto:kevink@paralogos.com]
> >>>>>>> Sent: Friday, December 24, 2010 5:34 PM
> >>>>>>> To: Anoop P A
> >>>>>>> Cc: STUART VENTERS; Anoop P.A.; linux-mips@linux-mips.org
> >>>>>>> Subject: Re: SMTC support status in latest git head.
> >>>>>>>
> >>>>>>>
> >>>>>>> Ah, well, at least we have a stackframe.h fix that preserves David's
> >>>>>>> performance tweak for the deeper pipelined processors. In looking for
> >>>>>>> this, I did notice that someone did some modification to the SMTC clock
> >>>>>>> tick logic that I was skeptical had ever been tested. If you've still
> >>>>>>> got that kernel binary handy, you might check to see if it boots with
> >>>>>>> maxtcs=1 maxvpes=1, maxtcs=2 maxvpes=1, and/or maxtcs=2 maxvpes=2.
> >>>>>>>
> >>>>>>> Oh, yes, and Merry Christmas one and all!
> >>>>>>>
> >>>>>>> Regards,
> >>>>>>>
> >>>>>>> Kevin K.
> >>>>>>>
> >>>>>>> On 12/24/10 8:02 AM, Anoop P A wrote:
> >>>>>>>> On Fri, 2010-12-24 at 06:53 -0800, Kevin D. Kissell wrote:
> >>>>>>>>> Excellent! Now, does the attached patch (relative to 2.6.37.11) also
> >>>>>>>>> fix things, while preserving the other fixes and performance enhancements?
> >>>>>>>>>
> >>>>>>>> I have tested that patch with 2.6.37 branch it well passes calibration
> >>>>>>>> loop but hangs after switching to mips clocksource
> >>>>>>>>
> >>>>>>>> TC 6 going on-line as CPU 6
> >>>>>>>> Brought up 7 CPUs
> >>>>>>>> bio: create slab<bio-0> at 0
> >>>>>>>> SCSI subsystem initialized
> >>>>>>>> Switching to clocksource MIPS
> >>>>>>>>
> >>>>>>>> I presume this is a different issue, as restoring the older file didn't
> >>>>>>>> help to get rid of this hang.
> >>>>>>>>
> >>>>>>>> diff --git a/arch/mips/include/asm/stackframe.h
> >>>>>>>> b/arch/mips/include/asm/stackframe.h
> >>>>>>>> index 58730c5..7fc9f10 100644
> >>>>>>>> --- a/arch/mips/include/asm/stackframe.h
> >>>>>>>> +++ b/arch/mips/include/asm/stackframe.h
> >>>>>>>> @@ -195,9 +195,9 @@
> >>>>>>>> * to cover the pipeline delay.
> >>>>>>>> */
> >>>>>>>> .set mips32
> >>>>>>>> - mfc0 v1, CP0_TCSTATUS
> >>>>>>>> + mfc0 v0, CP0_TCSTATUS
> >>>>>>>> .set mips0
> >>>>>>>> - LONG_S v1, PT_TCSTATUS(sp)
> >>>>>>>> + LONG_S v0, PT_TCSTATUS(sp)
> >>>>>>>> #endif /* CONFIG_MIPS_MT_SMTC */
> >>>>>>>> LONG_S $4, PT_R4(sp)
> >>>>>>>> LONG_S $5, PT_R5(sp)
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>> /K.
> >>>>>>>>>
> >>>>>>>>> On 12/24/10 6:39 AM, Anoop P A wrote:
> >>>>>>>>>> Hi Kevin, Stuart ,
> >>>>>>>>>>
> >>>>>>>>>> Woohooo You guys spotted !.
> >>>>>>>>>>
> >>>>>>>>>> http://git.linux-mips.org/?p=linux.git;a=commit;h=d5ec6e3c seems to be
> >>>>>>>>>> the culprit
> >>>>>>>>>>
> >>>>>>>>>> Once I restored previous version of stackframe.h 2.6.33-stable started
> >>>>>>>>>> booting !.
> >>>>>>>>>>
> >>>>>>>>>> Thanks,
> >>>>>>>>>> Anoop
> >>>>>>>>>>
> >>>>>>>>>> On Fri, 2010-12-24 at 04:32 -0800, Kevin D. Kissell wrote:
> >>>>>>>>>>> Thank you, Stuart! I've spotted some definite breakage to SMTC between
> >>>>>>>>>>> those versions. In arch/mips/include/asm/stackframe.h, someone moved
> >>>>>>>>>>> the store of the Status register value in SAVE_SOME (line 169 or 204,
> >>>>>>>>>>> depending on the version) from two instructions after the mfc0 to a
> >>>>>>>>>>> point after the #ifdef for SMTC, presumably to get better pipelining of
> >>>>>>>>>>> the register access. Unfortunately, the v1 register is also used in the
> >>>>>>>>>>> SMTC-specific fragment to save TCStatus, so the Status value gets
> >>>>>>>>>>> clobbered before it gets stored. This will eventually result in the
> >>>>>>>>>>> Status register getting a TCStatus value, which has some bits in common,
> >>>>>>>>>>> but isn't identical and sooner or later Bad Things will happen.
> >>>>>>>>>>>
> >>>>>>>>>>> I'm a little surprised this wasn't caught by visual inspection of the patch.
> >>>>>>>>>>>
> >>>>>>>>>>> Possible solutions would include reverting the store of the CP0_STATUS
> >>>>>>>>>>> value to the block above the #ifdef, or, to retain whatever performance
> >>>>>>>>>>> advantage was obtained by moving the store downward, to use v0/$2
> >>>>>>>>>>> instead of v1/$3, as the staging register for the TCStatus value. I'd
> >>>>>>>>>>> lean toward the second option, but I'm not in a position to test and
> >>>>>>>>>>> submit a patch just now.
> >>>>>>>>>>>
> >>>>>>>>>>> Regards,
> >>>>>>>>>>>
> >>>>>>>>>>> Kevin K.
> >>>>>>>>>>>
> >>>>>>>>>>> On 12/23/10 1:09 PM, STUART VENTERS wrote:
> >>>>>>>>>>>> Kevin,
> >>>>>>>>>>>>
> >>>>>>>>>>>> I'm not sure if it's useful,
> >>>>>>>>>>>> but finally I got the time to look at the two kernel versions Anoop pointed out.
> >>>>>>>>>>>> works 2.6.32-stable with patch 804
> >>>>>>>>>>>> works_not 2.6.33-stable
> >>>>>>>>>>>>
> >>>>>>>>>>>> grepping for files with CONFIG_MIPS_MT_SMTC
> >>>>>>>>>>>> and looking for timer interrupt related stuff found the following differences:
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> arch/mips/include/asm/irq.h
> >>>>>>>>>>>> arch/mips/kernel/irq.c
> >>>>>>>>>>>> do_IRQ
> >>>>>>>>>>>>
> >>>>>>>>>>>> arch/mips/include/asm/stackframe.h
> >>>>>>>>>>>> SAVE_SOME SAVE_TEMP get/set_saved_sp
> >>>>>>>>>>>>
> >>>>>>>>>>>> arch/mips/include/asm/time.h
> >>>>>>>>>>>> clocksource_set_clock
> >>>>>>>>>>>>
> >>>>>>>>>>>> arch/mips/kernel/process.c
> >>>>>>>>>>>> cpu_idle
> >>>>>>>>>>>>
> >>>>>>>>>>>> arch/mips/kernel/smtc.c
> >>>>>>>>>>>> __irq_entry
> >>>>>>>>>>>> ipi_decode
> >>>>>>>>>>>> SMTC_CLOCK_TICK
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> Enclosed are the two subsets of files for a more expert look.
> >>>>>>>>>>>>
> >>>>>>>>>>>> I'll try to look in more detail after Christmas.
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> Cheers,
> >>>>>>>>>>>>
> >>>>>>>>>>>> Stuart
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >
> >
>
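The SAVE_SOME breakage described in the quoted messages above can be illustrated with a toy model in plain C. This is only a sketch, not kernel code: the struct and function names are hypothetical, and the local variables v0 and v1 merely stand in for the MIPS registers of the same names. It assumes the ordering Kevin describes (Status staged in v1, then stored only after the SMTC block).

```c
#include <stdint.h>

/* Toy stand-in for the part of the saved frame touched here (hypothetical). */
struct toy_frame {
	uint32_t status;   /* models PT_STATUS(sp)   */
	uint32_t tcstatus; /* models PT_TCSTATUS(sp) */
};

/*
 * Ordering after the Status store was moved below the SMTC block:
 * v1 is reloaded with TCStatus before the Status value has been stored.
 */
void save_some_clobbered(struct toy_frame *f, uint32_t cp0_status,
			 uint32_t cp0_tcstatus)
{
	uint32_t v1;

	v1 = cp0_status;   /* mfc0 v1, CP0_STATUS */
	v1 = cp0_tcstatus; /* mfc0 v1, CP0_TCSTATUS: the Status value is lost */
	f->tcstatus = v1;  /* LONG_S v1, PT_TCSTATUS(sp) */
	f->status = v1;    /* LONG_S v1, PT_STATUS(sp): stores TCStatus instead */
}

/* Same ordering, but TCStatus is staged in v0, as in the quoted patch. */
void save_some_fixed(struct toy_frame *f, uint32_t cp0_status,
		     uint32_t cp0_tcstatus)
{
	uint32_t v0, v1;

	v1 = cp0_status;   /* mfc0 v1, CP0_STATUS */
	v0 = cp0_tcstatus; /* mfc0 v0, CP0_TCSTATUS: v1 is left alone */
	f->tcstatus = v0;  /* LONG_S v0, PT_TCSTATUS(sp) */
	f->status = v1;    /* LONG_S v1, PT_STATUS(sp): correct value stored */
}
```

Fed with two distinct values, the first variant leaves the TCStatus value in both slots of the frame, while the second keeps them apart. Whether v0 is really free at that point in SAVE_SOME is something the patch author has to confirm; the model only shows why sharing v1 cannot work.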
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: SMTC support status in latest git head.
2011-01-03 19:20 ` Anoop P A
@ 2011-01-04 8:17 ` Kevin D. Kissell
2011-01-04 13:02 ` Anoop P A
0 siblings, 1 reply; 68+ messages in thread
From: Kevin D. Kissell @ 2011-01-04 8:17 UTC (permalink / raw)
To: Anoop P A; +Cc: STUART VENTERS, Anoop P.A., linux-mips
Those interrupt counters show that IPIs are being taken everywhere,
though very few by CPUs 5 and 6. If I understand the configuration
correctly, CPU 4 is a TC in VPE 1, and it's getting a reasonable IPI
rate, *if* we're looking at a tickless kernel under low load. But there
may be a clue there to part of your problem. I have no idea why the
behavior would have changed from 2.6.36 to 2.6.37, but it looks as if
you're getting your clock interrupts through the MSP CIC interrupt
controller on VPE 0. There's nothing symmetric for VPE1. The Malta
example code is perhaps deceptively simple, in that both VPEs have their
count/compare indication wired directly to the 2 clock interrupt inputs,
so that having both of them running with only a single set of irq state
just works. I don't know whether the MSP CIC timer interrupt is a
gating of the VPE0 count/compare output, or whether it's its own
interval timer, but I suspect that you may need to do some further
low-level initialization in the platform-specific code to set up an
interrupt on the VPE1 side. I don't think the snippet you've got below
would work as written.
If it's purely an issue with clock distribution on VPE1, then a boot
with maxvpes=1 maxtcs=4 should be stable.
/K.
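
The TC-to-VPE mapping assumed above can be cross-checked against the TCBind values in the MT state dump quoted further down. A minimal sketch, assuming the TCBind layout of the MIPS MT ASE as I understand it (CurVPE in bits 3..0, CurTC in bits 28..21); the macro and function names here are made up for illustration and the field positions should be verified against the specification:

```c
#include <stdint.h>

/* Field positions assumed from the MIPS MT ASE; check them against the spec. */
#define SKETCH_TCBIND_CURVPE_SHIFT 0
#define SKETCH_TCBIND_CURVPE_MASK  0xfu
#define SKETCH_TCBIND_CURTC_SHIFT  21
#define SKETCH_TCBIND_CURTC_MASK   0xffu

/* VPE that the TC is currently bound to. */
unsigned int tcbind_cur_vpe(uint32_t tcbind)
{
	return (tcbind >> SKETCH_TCBIND_CURVPE_SHIFT) & SKETCH_TCBIND_CURVPE_MASK;
}

/* Index of the TC that this TCBind value belongs to. */
unsigned int tcbind_cur_tc(uint32_t tcbind)
{
	return (tcbind >> SKETCH_TCBIND_CURTC_SHIFT) & SKETCH_TCBIND_CURTC_MASK;
}
```

Under that assumption, 0x00600000 (TC 3 in the dump) decodes to TC 3 on VPE 0 and 0x00800001 (TC 4) decodes to TC 4 on VPE 1, which would agree with CPUs 4 to 6 being the TCs of VPE 1.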
On 1/3/2011 11:20 AM, Anoop P A wrote:
> Hi Kevin,
>
> On Mon, 2011-01-03 at 08:14 -0800, Kevin D. Kissell wrote:
>> The very first SMTC implementations didn't support full kernel-mode
>> preemption, which anyway wasn't a priority, given the hardware event
>> response support in MIPS MT. I believe it was later made compatible,
>> but it was never extensively exercised. Since SMTC has fingers in some
>> pretty low-level atomicity mechanisms, if a new, parallel set was
>> implemented for RCU, I can easily imagine that nobody has yet
>> implemented SMTC-ified variants of that set.
>>
>> Your last statement isn't very clear, though. Are you saying that if
>> you configure for no forced preemption and with TREE_RCU, the 2.6.37
>> kernel boots all the way up, or that it simply hangs later? What's the
>> last rev kernel that actually boots all the way up?
> I have debugged this a bit more. It seems that the kernel is getting stalled
> while executing on the TCs of the second VPE.
> INFO: rcu_sched_state detected stalls on CPUs/tasks: { 4 5 6} (detected
> by 1, t=2504 jiffies)
> INFO: rcu_sched_state detected stalls on CPUs/tasks: { 4 5 6} (detected
> by 1, t=10036 jiffies)
> INFO: rcu_sched_state detected stalls on CPUs/tasks: { 4 5 6} (detected
> by 1, t=17568 jiffies)
> INFO: rcu_sched_state detected stalls on CPUs/tasks: { 4 5 6} (detected
> by 1, t=25100 jiffies)
> INFO: rcu_sched_state detected stalls on CPUs/tasks: { 4 5 6} (detected
> by 1, t=32632 jiffies)
> INFO: rcu_sched_state detected stalls on CPUs/tasks: { 4 5 6} (detected
> by 1, t=40164 jiffies)
>
> With CONFIG_TREE_RCU we were not hitting this scenario very often.
> However, with CONFIG_TREE_PREEMPT_RCU the stall happens most of the time.
>
> I presume there is some issue in my timer setup. I am not seeing the timer
> interrupt (or IPI interrupt) count incrementing for the VPE1 TCs on a
> completely booted 2.6.32-stable kernel.
>
> / # cat /proc/interrupts
>        CPU0    CPU1    CPU2    CPU3    CPU4    CPU5    CPU6
>   1:    148   15023   15140   15093    3779       8       2   MIPS     SMTC_IPI
>   6:      0       0       0       0       0       0       0   MIPS     MSP CIC cascade
>   8:      0       0       0       0       0       0       0   MSP_CIC  Softreset button
>   9:      0       0       0       0       0       0       0   MSP_CIC  Standby switch
>  21:      0       0       0       0       0       0       0   MSP_CIC  MSP PER cascade
>  25:  15113     341       4       7       0       0       0   MSP_CIC  timer
>  27:    260       9       0       1       0       0       0   MSP_CIC  serial
>  34:      0       0       0       0       0       0       0   MSP_CIC  timer
>
> Can't we use separate timer interrupts for VPE1 and VPE0 in SMTC?
>
> I have tried setting up the VPE1 timer from get_c0_compare_int as follows:
>
> unsigned int __cpuinit get_c0_compare_int(void)
> {
>         if (get_current_vpe() == 1 && !vpe1_timr_installed) {
>                 memcpy(&timer_vpe1, &c0_compare_irqaction, sizeof(timer_vpe1));
>                 setup_irq(MSP_INT_VPE1_TIMER, &timer_vpe1);
>                 vpe1_timr_installed++;
>         }
>         return get_current_vpe() ? MSP_INT_VPE1_TIMER : MSP_INT_VPE0_TIMER;
> }
>
> Thanks
> Anoop
>
>> Regards,
>>
>> Kevin K.
>>
>> On 1/3/2011 7:12 AM, Anoop P A wrote:
>>> Hi ,
>>>
>>> The following patch restricts the TREE_RCU implementation to !PREEMPT
>>> SMP kernels:
>>> http://git.linux-mips.org/?p=linux.git;a=commit;h=687d7a960aea46e016182c7ce346d62c4dbd0366
>>>
>>> The CONFIG_TREE_PREEMPT_RCU option does not seem to work for the SMTC kernel
>>> (it will be the only available RCU implementation for the SMTC kernel from
>>> 2.6.37 onwards).
>>>
>>> With no forced preemption and TREE_RCU selected, I am able to boot
>>> further, up to the hang that I have reported.
>>>
>>> Thanks
>>> Anoop
>>>
>>> On Sat, 2011-01-01 at 00:42 -0800, Kevin D. Kissell wrote:
>>>> At this point the logical thing to do would seem to look at your kernel
>>>> image and disassemble smtc_ipi_replay(), which is where the EPC of VPE 0
>>>> shows the last exception to have been taken. That's a critical SMTC
>>>> routine that gets called whenever an xxx_irq_restore() enables
>>>> interrupts, so that virtual per-TC IPI interrupts that were posted while
>>>> the TC had interrupts disabled can be handled deterministically. As I
>>>> mentioned in an earlier message, there was some cleanup work from David
>>>> Howells that changed a number of irq management-related function names
>>>> and prototypes across all architectures, which went into linux-mips.org
>>>> at very roughly the time of the breakage. The SMTC overlay over the irq
>>>> implementation has been pretty robust, but it's written in a perhaps
>>>> doomed attempt to be both efficient and using a maximum amount of common
>>>> code with the general case. A mechanical or semi-mechanical change
>>>> could conceivably have broken things.
>>>>
>>>> Regards,
>>>>
>>>> Kevin K.
>>>>
>>>>
>>>> On 12/31/2010 4:27 AM, Anoop P A wrote:
>>>>> Hi ,
>>>>>
>>>>> The kernel hangs in the stop_machine call. Please find the MT register dump below.
>>>>> Another important observation: even though the 2.6.33 kernel + stackframe
>>>>> patch gets past the calibration hang, I am still unable to boot into an
>>>>> initramfs root (I verified that the ramfs works with VSMP). So it looks like
>>>>> there is still some issue to fix between 2.6.32 and 2.6.33.
>>>>> ######################## Log ###########################
>>>>>
>>>>> === MIPS MT State Dump ===
>>>>> -- Global State --
>>>>> MVPControl Passed: 00000005
>>>>> MVPControl Read: 00000004
>>>>> MVPConf0 : a8008406
>>>>> -- per-VPE State --
>>>>> VPE 0
>>>>> VPEControl : 00008000
>>>>> VPEConf0 : 800f0003
>>>>> VPE0.Status : 11004201
>>>>> VPE0.EPC : 8010dc54 smtc_ipi_replay+0xcc/0x108
>>>>> VPE0.Cause : 50804000
>>>>> VPE0.Config7 : 00010000
>>>>> VPE 1
>>>>> VPEControl : 00068006
>>>>> VPEConf0 : 80cf0003
>>>>> VPE1.Status : 11008301
>>>>> VPE1.EPC : 801022a0 r4k_wait+0x20/0x40
>>>>> VPE1.Cause : 50800000
>>>>> VPE1.Config7 : 00010000
>>>>> -- per-TC State --
>>>>> TC 0 (current TC with VPE EPC above)
>>>>> TCStatus : 18102000
>>>>> TCBind : 00000000
>>>>> TCRestart : 803fa19c printk+0xc/0x30
>>>>> TCHalt : 00000000
>>>>> TCContext : 00000000
>>>>> TC 1
>>>>> TCStatus : 18902000
>>>>> TCBind : 00200000
>>>>> TCRestart : 801022a0 r4k_wait+0x20/0x40
>>>>> TCHalt : 00000000
>>>>> TCContext : 00140000
>>>>> TC 2
>>>>> TCStatus : 18902000
>>>>> TCBind : 00400000
>>>>> TCRestart : 801022a0 r4k_wait+0x20/0x40
>>>>> TCHalt : 00000000
>>>>> TCContext : 00280000
>>>>> TC 3
>>>>> TCStatus : 18902000
>>>>> TCBind : 00600000
>>>>> TCRestart : 801022a0 r4k_wait+0x20/0x40
>>>>> TCHalt : 00000000
>>>>> TCContext : 003c0000
>>>>> TC 4
>>>>> TCStatus : 18902000
>>>>> TCBind : 00800001
>>>>> TCRestart : 8010229c r4k_wait+0x1c/0x40
>>>>> TCHalt : 00000000
>>>>> TCContext : 00500000
>>>>> TC 5
>>>>> TCStatus : 18902000
>>>>> TCBind : 00a00001
>>>>> TCRestart : 8010229c r4k_wait+0x1c/0x40
>>>>> TCHalt : 00000000
>>>>> TCContext : 00640000
>>>>> TC 6
>>>>> TCStatus : 18902000
>>>>> TCBind : 00c00001
>>>>> TCRestart : 8010229c r4k_wait+0x1c/0x40
>>>>> TCHalt : 00000000
>>>>> TCContext : 00780000
>>>>> Counter Interrupts taken per CPU (TC)
>>>>> 0: 0
>>>>> 1: 0
>>>>> 2: 0
>>>>> 3: 0
>>>>> 4: 0
>>>>> 5: 0
>>>>> 6: 0
>>>>> 7: 0
>>>>> Self-IPI invocations:
>>>>> 0: 12
>>>>> 1: 0
>>>>> 2: 0
>>>>> 3: 0
>>>>> 4: 0
>>>>> 5: 5
>>>>> 6: 4
>>>>> 7: 0
>>>>> IPIQ[0]: head = 0x0, tail = 0x0, depth = 0
>>>>> IPIQ[1]: head = 0x0, tail = 0x0, depth = 0
>>>>> IPIQ[2]: head = 0x0, tail = 0x0, depth = 0
>>>>> IPIQ[3]: head = 0x0, tail = 0x0, depth = 0
>>>>> IPIQ[4]: head = 0x0, tail = 0x0, depth = 0
>>>>> IPIQ[5]: head = 0x0, tail = 0x0, depth = 0
>>>>> IPIQ[6]: head = 0x0, tail = 0x0, depth = 0
>>>>> IPIQ[7]: head = 0x0, tail = 0x0, depth = 0
>>>>> 0 Recoveries of "stolen" FPU
>>>>> ===========================
>>>>>
>>>>> ################################################################
>>>>>
>>>>> Thanks
>>>>> Anoop
>>>>>
>>>>> On Tue, 2010-12-28 at 00:43 -0800, Kevin D. Kissell wrote:
>>>>>> I took a quick look last night, and the only thing that looked vaguely
>>>>>> dangerous in changes since the timer changes I alluded to earlier was
>>>>>> the global naming cleanup of irq-related function names that David
>>>>>> Howells submitted. The diff didn't look dangerous in itself, but some of
>>>>>> the definitions are nested subtly for SMTC to maximize the amount of
>>>>>> common code, and I could imagine something getting lost in translation
>>>>>> there. If that were really the problem, it would of course affect much
>>>>>> more than just the timer subsystem, but early in the boot process,
>>>>>> timers are pretty much the only interrupts that have to be handled
>>>>>> correctly.
>>>>>>
>>>>>> I'm travelling today, but will take a look at timekeeping_notify()
>>>>>> tomorrow or the next day...
>>>>>>
>>>>>> /K.
>>>>>>
>>>>>> On 12/28/10 12:19 AM, Anoop P A wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> I glanced through the code diff without noticing any suspicious
>>>>>>> code.
>>>>>>> Tracing the hang showed that it hangs in the timekeeping_notify
>>>>>>> function.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Anoop
>>>>>>>
>>>>>>> PS: I may not be available until Thursday
>>>>>>>
>>>>>>> On Mon, 2010-12-27 at 22:49 +0530, Anoop P A wrote:
>>>>>>>> Hi Kevin,
>>>>>>>>
>>>>>>>> It is very unlikely that the patch you pointed to has any impact on the
>>>>>>>> hang I am seeing. The patch you have mentioned got into kernel around
>>>>>>>> 2.6.32 timeframe. I am able to boot both 2.6.32 and 2.6.33 kernel ( +
>>>>>>>> stackframe patch) .
>>>>>>>>
>>>>>>>> Hi Stuart,
>>>>>>>>
>>>>>>>> I haven't got much time to spend on this today.
>>>>>>>>
>>>>>>>> I had got 2.6.36-stable(+ stack frame patch) booting last day and I have
>>>>>>>> observed hang issue with 2.6.37-rc1 ( Same as rc6 and current git head)
>>>>>>>>
>>>>>>>> So probably some patches in 2.6.37 branch introduced this hang.
>>>>>>>>
>>>>>>>> Hopefully I will get some free slot tomorrow so that I can look into
>>>>>>>> code diff .
>>>>>>>>
>>>>>>>> Thanks
>>>>>>>> Anoop
>>>>>>>>
>>>>>>>> On Mon, 2010-12-27 at 09:49 -0600, STUART VENTERS wrote:
>>>>>>>>> Kevin,
>>>>>>>>>
>>>>>>>>> Outstanding, sometimes it's better to be lucky than good.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Anoop,
>>>>>>>>>
>>>>>>>>> Maybe we can get lucky again.
>>>>>>>>>
>>>>>>>>> If you can isolate the .33 works/.37 works_not bug to a specific pair of versions,
>>>>>>>>> I'll be happy to do another diff.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Hope you'll have had a good Christmas as well.
>>>>>>>>> We've had snow in Alabama since Christmas eve!
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>>
>>>>>>>>> Stuart
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> -----Original Message-----
>>>>>>>>> From: Kevin D. Kissell [mailto:kevink@paralogos.com]
>>>>>>>>> Sent: Friday, December 24, 2010 5:34 PM
>>>>>>>>> To: Anoop P A
>>>>>>>>> Cc: STUART VENTERS; Anoop P.A.; linux-mips@linux-mips.org
>>>>>>>>> Subject: Re: SMTC support status in latest git head.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Ah, well, at least we have a stackframe.h fix that preserves David's
>>>>>>>>> performance tweak for the deeper pipelined processors. In looking for
>>>>>>>>> this, I did notice that someone did some modification to the SMTC clock
>>>>>>>>> tick logic that I was skeptical had ever been tested. If you've still
>>>>>>>>> got that kernel binary handy, you might check to see if it boots with
>>>>>>>>> maxtcs=1 maxvpes=1, maxtcs=2 maxvpes=1, and/or maxtcs=2 maxvpes=2.
>>>>>>>>>
>>>>>>>>> Oh, yes, and Merry Christmas one and all!
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>>
>>>>>>>>> Kevin K.
>>>>>>>>>
>>>>>>>>> On 12/24/10 8:02 AM, Anoop P A wrote:
>>>>>>>>>> On Fri, 2010-12-24 at 06:53 -0800, Kevin D. Kissell wrote:
>>>>>>>>>>> Excellent! Now, does the attached patch (relative to 2.6.37.11) also
>>>>>>>>>>> fix things, while preserving the other fixes and performance enhancements?
>>>>>>>>>>>
>>>>>>>>>> I have tested that patch with the 2.6.37 branch. It gets past the calibration
>>>>>>>>>> loop but hangs after switching to the MIPS clocksource:
>>>>>>>>>>
>>>>>>>>>> TC 6 going on-line as CPU 6
>>>>>>>>>> Brought up 7 CPUs
>>>>>>>>>> bio: create slab<bio-0> at 0
>>>>>>>>>> SCSI subsystem initialized
>>>>>>>>>> Switching to clocksource MIPS
>>>>>>>>>>
>>>>>>>>>> I presume this is a different issue, as restoring the older file didn't
>>>>>>>>>> help to get rid of this hang.
>>>>>>>>>>
>>>>>>>>>> diff --git a/arch/mips/include/asm/stackframe.h
>>>>>>>>>> b/arch/mips/include/asm/stackframe.h
>>>>>>>>>> index 58730c5..7fc9f10 100644
>>>>>>>>>> --- a/arch/mips/include/asm/stackframe.h
>>>>>>>>>> +++ b/arch/mips/include/asm/stackframe.h
>>>>>>>>>> @@ -195,9 +195,9 @@
>>>>>>>>>> * to cover the pipeline delay.
>>>>>>>>>> */
>>>>>>>>>> .set mips32
>>>>>>>>>> - mfc0 v1, CP0_TCSTATUS
>>>>>>>>>> + mfc0 v0, CP0_TCSTATUS
>>>>>>>>>> .set mips0
>>>>>>>>>> - LONG_S v1, PT_TCSTATUS(sp)
>>>>>>>>>> + LONG_S v0, PT_TCSTATUS(sp)
>>>>>>>>>> #endif /* CONFIG_MIPS_MT_SMTC */
>>>>>>>>>> LONG_S $4, PT_R4(sp)
>>>>>>>>>> LONG_S $5, PT_R5(sp)
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> /K.
>>>>>>>>>>>
>>>>>>>>>>> On 12/24/10 6:39 AM, Anoop P A wrote:
>>>>>>>>>>>> Hi Kevin, Stuart ,
>>>>>>>>>>>>
>>>>>>>>>>>> Woohooo You guys spotted !.
>>>>>>>>>>>>
>>>>>>>>>>>> http://git.linux-mips.org/?p=linux.git;a=commit;h=d5ec6e3c seems to be
>>>>>>>>>>>> the culprit
>>>>>>>>>>>>
>>>>>>>>>>>> Once I restored previous version of stackframe.h 2.6.33-stable started
>>>>>>>>>>>> booting !.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Anoop
>>>>>>>>>>>>
>>>>>>>>>>>> On Fri, 2010-12-24 at 04:32 -0800, Kevin D. Kissell wrote:
>>>>>>>>>>>>> Thank you, Stuart! I've spotted some definite breakage to SMTC between
>>>>>>>>>>>>> those versions. In arch/mips/include/asm/stackframe.h, someone moved
>>>>>>>>>>>>> the store of the Status register value in SAVE_SOME (line 169 or 204,
>>>>>>>>>>>>> depending on the version) from two instructions after the mfc0 to a
>>>>>>>>>>>>> point after the #ifdef for SMTC, presumably to get better pipelining of
>>>>>>>>>>>>> the register access. Unfortunately, the v1 register is also used in the
>>>>>>>>>>>>> SMTC-specific fragment to save TCStatus, so the Status value gets
>>>>>>>>>>>>> clobbered before it gets stored. This will eventually result in the
>>>>>>>>>>>>> Status register getting a TCStatus value, which has some bits in common,
>>>>>>>>>>>>> but isn't identical and sooner or later Bad Things will happen.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I'm a little surprised this wasn't caught by visual inspection of the patch.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Possible solutions would include reverting the store of the CP0_STATUS
>>>>>>>>>>>>> value to the block above the #ifdef, or, to retain whatever performance
>>>>>>>>>>>>> advantage was obtained by moving the store downward, to use v0/$2
>>>>>>>>>>>>> instead of v1/$3, as the staging register for the TCStatus value. I'd
>>>>>>>>>>>>> lean toward the second option, but I'm not in a position to test and
>>>>>>>>>>>>> submit a patch just now.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Kevin K.
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 12/23/10 1:09 PM, STUART VENTERS wrote:
>>>>>>>>>>>>>> Kevin,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I'm not sure if it's useful,
>>>>>>>>>>>>>> but finally I got the time to look at the two kernel versions Anoop pointed out.
>>>>>>>>>>>>>> works 2.6.32-stable with patch 804
>>>>>>>>>>>>>> works_not 2.6.33-stable
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> grepping for files with CONFIG_MIPS_MT_SMTC
>>>>>>>>>>>>>> and looking for timer interrupt related stuff found the following differences:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> arch/mips/include/asm/irq.h
>>>>>>>>>>>>>> arch/mips/kernel/irq.c
>>>>>>>>>>>>>> do_IRQ
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> arch/mips/include/asm/stackframe.h
>>>>>>>>>>>>>> SAVE_SOME SAVE_TEMP get/set_saved_sp
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> arch/mips/include/asm/time.h
>>>>>>>>>>>>>> clocksource_set_clock
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> arch/mips/kernel/process.c
>>>>>>>>>>>>>> cpu_idle
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> arch/mips/kernel/smtc.c
>>>>>>>>>>>>>> __irq_entry
>>>>>>>>>>>>>> ipi_decode
>>>>>>>>>>>>>> SMTC_CLOCK_TICK
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Enclosed are the two subsets of files for a more expert look.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I'll try to look in more detail after Christmas.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Stuart
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: SMTC support status in latest git head.
2011-01-04 8:17 ` Kevin D. Kissell
@ 2011-01-04 13:02 ` Anoop P A
2011-01-04 14:37 ` Anoop P A
2011-01-04 17:40 ` Kevin D. Kissell
0 siblings, 2 replies; 68+ messages in thread
From: Anoop P A @ 2011-01-04 13:02 UTC (permalink / raw)
To: Kevin D. Kissell; +Cc: STUART VENTERS, Anoop P.A., linux-mips
On Tue, 2011-01-04 at 00:17 -0800, Kevin D. Kissell wrote:
> Those interrupt counters show that IPIs are being taken everywhere,
> though very few by CPUs 5 and 6. If I understand the configuration
> correctly, CPU 4 is a TC in VPE 1, and it's getting a reasonable IPI
Yes, CPU4 is in the second VPE.
> rate, *if* we're looking at a tickless kernel under low load. But there
No, it was not a tickless kernel. I had selected a 250 Hz timer. Can't we
expect IPI/timer interrupts on all the threads in this case?
> may be a clue there to part of your problem. I have no idea why the
> behavior would have changed from 2.6.36 to 2.6.37, but it looks as if
> you're getting your clock interrupts through the MSP CIC interrupt
> controller on VPE 0. There's nothing symmetric for VPE1. The Malta
> example code is perhaps deceptively simple, in that both VPEs have their
> count/compare indication wired directly to the 2 clock interrupt inputs,
> so that having both of them running with only a single set of irq state
> just works. I don't know whether the MSP CIC timer interrupt is a
In my case they are separate IRQs. MSP_INT_VPE1_TIMER (34) and
MSP_INT_VPE0_TIMER (25) are wired to the CIC. The CIC interrupt is
connected to CPU irq 6.
I can reproduce the CPU stall in VSMP mode if I don't set up the VPE1 timer
interrupt. Don't we have support for separate IRQs in the SMTC
implementation?
> gating of the VPE0 count/compare output, or whether it's its own
> interval timer, but I suspect that you may need to do some further
> low-level initialization in the platform-specific code to set up an
> interrupt on the VPE1 side. I don't think the snippet you've got below
> would work as written.
The routine which I copied works fine in VSMP mode:
/ # cat /proc/interrupts
       CPU0    CPU1
  0:    187     254   MIPS     IPI_resched
  1:     77     174   MIPS     IPI_call
  6:      0       0   MIPS     MSP CIC cascade
  8:      0       0   MSP_CIC  Softreset button
  9:      0       0   MSP_CIC  Standby switch
 21:      0       0   MSP_CIC  MSP PER cascade
 25:  37077       0   MSP_CIC  timer
 27:    188       0   MSP_CIC  serial
 34:      0   36986   MSP_CIC  timer
Do I need to change anything specific for SMTC?
>
> If it's purely an issue with clock distribution on VPE1, then a boot
> with maxvpes=1 maxtcs=4 should be stable.
Yes, the kernel seems to be stable if I boot with maxvpes=1 maxtcs=4.
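
As a side note, the jiffies values in the rcu_sched_state stall reports quoted further down can be turned into wall-clock time. A small sketch, assuming the 250 figure mentioned above is the kernel tick rate (HZ=250); the helper name is hypothetical and deliberately differs from the kernel's own jiffies helpers:

```c
#include <stdint.h>

/*
 * Convert a jiffies count to milliseconds for a given tick rate.
 * hz is passed in explicitly because the tick rate is an assumption here.
 */
uint64_t sketch_jiffies_to_msecs(uint64_t jiffies, unsigned int hz)
{
	return jiffies * 1000u / hz;
}
```

With hz = 250, the first report (t=2504) comes out at 10016 ms and the spacing between consecutive reports (7532 jiffies) at 30128 ms, so if the assumption holds, the CPUs on VPE1 have been stalled for roughly ten seconds before the first message and are re-reported about every thirty seconds.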
>
> /K.
>
> On 1/3/2011 11:20 AM, Anoop P A wrote:
> > Hi Kevin,
> >
> > On Mon, 2011-01-03 at 08:14 -0800, Kevin D. Kissell wrote:
> >> The very first SMTC implementations didn't support full kernel-mode
> >> preemption, which anyway wasn't a priority, given the hardware event
> >> response support in MIPS MT. I believe it was later made compatible,
> >> but it was never extensively exercised. Since SMTC has fingers in some
> >> pretty low-level atomicity mechanisms, if a new, parallel set was
> >> implemented for RCU, I can easily imagine that nobody has yet
> >> implemented SMTC-ified variants of that set.
> >>
> >> Your last statement isn't very clear, though. Are you saying that if
> >> you configure for no forced preemption and with TREE_RCU, the 2.6.37
> >> kernel boots all the way up, or that it simply hangs later? What's the
> >> last rev kernel that actually boots all the way up?
> > I have debugged this a bit more. It seems that the kernel is getting stalled
> > while executing on the TCs of the second VPE.
> > INFO: rcu_sched_state detected stalls on CPUs/tasks: { 4 5 6} (detected
> > by 1, t=2504 jiffies)
> > INFO: rcu_sched_state detected stalls on CPUs/tasks: { 4 5 6} (detected
> > by 1, t=10036 jiffies)
> > INFO: rcu_sched_state detected stalls on CPUs/tasks: { 4 5 6} (detected
> > by 1, t=17568 jiffies)
> > INFO: rcu_sched_state detected stalls on CPUs/tasks: { 4 5 6} (detected
> > by 1, t=25100 jiffies)
> > INFO: rcu_sched_state detected stalls on CPUs/tasks: { 4 5 6} (detected
> > by 1, t=32632 jiffies)
> > INFO: rcu_sched_state detected stalls on CPUs/tasks: { 4 5 6} (detected
> > by 1, t=40164 jiffies)
> >
> > With CONFIG_TREE_RCU we were not hitting this scenario very often.
> > However, with CONFIG_TREE_PREEMPT_RCU the stall happens most of the time.
> >
> > I presume there is some issue in my timer setup. I am not seeing the timer
> > interrupt (or IPI interrupt) count incrementing for the VPE1 TCs on a
> > completely booted 2.6.32-stable kernel.
> >
> > / # cat /proc/interrupts
> >        CPU0    CPU1    CPU2    CPU3    CPU4    CPU5    CPU6
> >   1:    148   15023   15140   15093    3779       8       2   MIPS     SMTC_IPI
> >   6:      0       0       0       0       0       0       0   MIPS     MSP CIC cascade
> >   8:      0       0       0       0       0       0       0   MSP_CIC  Softreset button
> >   9:      0       0       0       0       0       0       0   MSP_CIC  Standby switch
> >  21:      0       0       0       0       0       0       0   MSP_CIC  MSP PER cascade
> >  25:  15113     341       4       7       0       0       0   MSP_CIC  timer
> >  27:    260       9       0       1       0       0       0   MSP_CIC  serial
> >  34:      0       0       0       0       0       0       0   MSP_CIC  timer
> >
> > Can't we use separate timer interrupts for VPE1 and VPE0 in SMTC?
> >
> > I have tried setting up the VPE1 timer from get_c0_compare_int as follows:
> >
> > unsigned int __cpuinit get_c0_compare_int(void)
> > {
> >         if (get_current_vpe() == 1 && !vpe1_timr_installed) {
> >                 memcpy(&timer_vpe1, &c0_compare_irqaction, sizeof(timer_vpe1));
> >                 setup_irq(MSP_INT_VPE1_TIMER, &timer_vpe1);
> >                 vpe1_timr_installed++;
> >         }
> >         return get_current_vpe() ? MSP_INT_VPE1_TIMER : MSP_INT_VPE0_TIMER;
> > }
> >
> > Thanks
> > Anoop
> >
> >> Regards,
> >>
> >> Kevin K.
> >>
> >> On 1/3/2011 7:12 AM, Anoop P A wrote:
> >>> Hi ,
> >>>
> >>> The following patch restricts the TREE_RCU implementation to !PREEMPT
> >>> SMP kernels:
> >>> http://git.linux-mips.org/?p=linux.git;a=commit;h=687d7a960aea46e016182c7ce346d62c4dbd0366
> >>>
> >>> The CONFIG_TREE_PREEMPT_RCU option does not seem to work for the SMTC kernel
> >>> (it will be the only available RCU implementation for the SMTC kernel from
> >>> 2.6.37 onwards).
> >>>
> >>> With no forced preemption and TREE_RCU selected, I am able to boot
> >>> further, up to the hang that I have reported.
> >>>
> >>> Thanks
> >>> Anoop
> >>>
> >>> On Sat, 2011-01-01 at 00:42 -0800, Kevin D. Kissell wrote:
> >>>> At this point the logical thing to do would seem to look at your kernel
> >>>> image and disassemble smtc_ipi_replay(), which is where the EPC of VPE 0
> >>>> shows the last exception to have been taken. That's a critical SMTC
> >>>> routine that gets called whenever an xxx_irq_restore() enables
> >>>> interrupts, so that virtual per-TC IPI interrupts that were posted while
> >>>> the TC had interrupts disabled can be handled deterministically. As I
> >>>> mentioned in an earlier message, there was some cleanup work from David
> >>>> Howells that changed a number of irq management-related function names
> >>>> and prototypes across all architectures, which went into linux-mips.org
> >>>> at very roughly the time of the breakage. The SMTC overlay over the irq
> >>>> implementation has been pretty robust, but it's written in a perhaps
> >>>> doomed attempt to be both efficient and using a maximum amount of common
> >>>> code with the general case. A mechanical or semi-mechanical change
> >>>> could conceivably have broken things.
> >>>>
> >>>> Regards,
> >>>>
> >>>> Kevin K.
> >>>>
> >>>>
> >>>> On 12/31/2010 4:27 AM, Anoop P A wrote:
> >>>>> Hi ,
> >>>>>
> >>>>> The kernel hangs in the stop_machine call. Please find the MT register dump below.
> >>>>> Another important observation: even though the 2.6.33 kernel + stackframe
> >>>>> patch gets past the calibration hang, I am still unable to boot into an
> >>>>> initramfs root (I verified that the ramfs works with VSMP). So it looks like
> >>>>> there is still some issue to fix between 2.6.32 and 2.6.33.
> >>>>> ######################## Log ###########################
> >>>>>
> >>>>> === MIPS MT State Dump ===
> >>>>> -- Global State --
> >>>>> MVPControl Passed: 00000005
> >>>>> MVPControl Read: 00000004
> >>>>> MVPConf0 : a8008406
> >>>>> -- per-VPE State --
> >>>>> VPE 0
> >>>>> VPEControl : 00008000
> >>>>> VPEConf0 : 800f0003
> >>>>> VPE0.Status : 11004201
> >>>>> VPE0.EPC : 8010dc54 smtc_ipi_replay+0xcc/0x108
> >>>>> VPE0.Cause : 50804000
> >>>>> VPE0.Config7 : 00010000
> >>>>> VPE 1
> >>>>> VPEControl : 00068006
> >>>>> VPEConf0 : 80cf0003
> >>>>> VPE1.Status : 11008301
> >>>>> VPE1.EPC : 801022a0 r4k_wait+0x20/0x40
> >>>>> VPE1.Cause : 50800000
> >>>>> VPE1.Config7 : 00010000
> >>>>> -- per-TC State --
> >>>>> TC 0 (current TC with VPE EPC above)
> >>>>> TCStatus : 18102000
> >>>>> TCBind : 00000000
> >>>>> TCRestart : 803fa19c printk+0xc/0x30
> >>>>> TCHalt : 00000000
> >>>>> TCContext : 00000000
> >>>>> TC 1
> >>>>> TCStatus : 18902000
> >>>>> TCBind : 00200000
> >>>>> TCRestart : 801022a0 r4k_wait+0x20/0x40
> >>>>> TCHalt : 00000000
> >>>>> TCContext : 00140000
> >>>>> TC 2
> >>>>> TCStatus : 18902000
> >>>>> TCBind : 00400000
> >>>>> TCRestart : 801022a0 r4k_wait+0x20/0x40
> >>>>> TCHalt : 00000000
> >>>>> TCContext : 00280000
> >>>>> TC 3
> >>>>> TCStatus : 18902000
> >>>>> TCBind : 00600000
> >>>>> TCRestart : 801022a0 r4k_wait+0x20/0x40
> >>>>> TCHalt : 00000000
> >>>>> TCContext : 003c0000
> >>>>> TC 4
> >>>>> TCStatus : 18902000
> >>>>> TCBind : 00800001
> >>>>> TCRestart : 8010229c r4k_wait+0x1c/0x40
> >>>>> TCHalt : 00000000
> >>>>> TCContext : 00500000
> >>>>> TC 5
> >>>>> TCStatus : 18902000
> >>>>> TCBind : 00a00001
> >>>>> TCRestart : 8010229c r4k_wait+0x1c/0x40
> >>>>> TCHalt : 00000000
> >>>>> TCContext : 00640000
> >>>>> TC 6
> >>>>> TCStatus : 18902000
> >>>>> TCBind : 00c00001
> >>>>> TCRestart : 8010229c r4k_wait+0x1c/0x40
> >>>>> TCHalt : 00000000
> >>>>> TCContext : 00780000
> >>>>> Counter Interrupts taken per CPU (TC)
> >>>>> 0: 0
> >>>>> 1: 0
> >>>>> 2: 0
> >>>>> 3: 0
> >>>>> 4: 0
> >>>>> 5: 0
> >>>>> 6: 0
> >>>>> 7: 0
> >>>>> Self-IPI invocations:
> >>>>> 0: 12
> >>>>> 1: 0
> >>>>> 2: 0
> >>>>> 3: 0
> >>>>> 4: 0
> >>>>> 5: 5
> >>>>> 6: 4
> >>>>> 7: 0
> >>>>> IPIQ[0]: head = 0x0, tail = 0x0, depth = 0
> >>>>> IPIQ[1]: head = 0x0, tail = 0x0, depth = 0
> >>>>> IPIQ[2]: head = 0x0, tail = 0x0, depth = 0
> >>>>> IPIQ[3]: head = 0x0, tail = 0x0, depth = 0
> >>>>> IPIQ[4]: head = 0x0, tail = 0x0, depth = 0
> >>>>> IPIQ[5]: head = 0x0, tail = 0x0, depth = 0
> >>>>> IPIQ[6]: head = 0x0, tail = 0x0, depth = 0
> >>>>> IPIQ[7]: head = 0x0, tail = 0x0, depth = 0
> >>>>> 0 Recoveries of "stolen" FPU
> >>>>> ===========================
> >>>>>
> >>>>> ################################################################
> >>>>>
> >>>>> Thanks
> >>>>> Anoop
> >>>>>
> >>>>> On Tue, 2010-12-28 at 00:43 -0800, Kevin D. Kissell wrote:
> >>>>>> I took a quick look last night, and the only thing that looked vaguely
> >>>>>> dangerous in changes since the timer changes I alluded to earlier was
> >>>>>> the global naming cleanup of irq-related function names that David
> >>>>>> Howell submitted. The diff didn't look dangerous in itself, but some of
> >>>>>> the definitions are nested subtly for SMTC to maximize the amount of
> >>>>>> common code, and I could imagine something getting lost in translation
> >>>>>> there. If that were really the problem, it would of course affect much
> >>>>>> more than just the timer subsystem, but early in the boot process,
> >>>>>> timers are pretty much the only interrupts that have to be handled
> >>>>>> correctly.
> >>>>>>
> >>>>>> I'm travelling today, but will take a look at timekeeping_notify()
> >>>>>> tomorrow or the next day...
> >>>>>>
> >>>>>> /K.
> >>>>>>
> >>>>>> On 12/28/10 12:19 AM, Anoop P A wrote:
> >>>>>>> Hi,
> >>>>>>>
> >>>>>>> I glanced through the code diff without noticing any suspicious code.
> >>>>>>> Tracing the hang showed that it hangs in the timekeeping_notify
> >>>>>>> function.
> >>>>>>>
> >>>>>>> Thanks,
> >>>>>>> Anoop
> >>>>>>>
> >>>>>>> PS: I may not be available until Thursday
> >>>>>>>
> >>>>>>> On Mon, 2010-12-27 at 22:49 +0530, Anoop P A wrote:
> >>>>>>>> Hi Kevin,
> >>>>>>>>
> >>>>>>>> It is very unlikely that the patch you pointed to has any impact on the
> >>>>>>>> hang I am seeing. The patch you mentioned got into the kernel around the
> >>>>>>>> 2.6.32 timeframe. I am able to boot both the 2.6.32 and 2.6.33 kernels
> >>>>>>>> (+ stackframe patch).
> >>>>>>>>
> >>>>>>>> Hi Stuart,
> >>>>>>>>
> >>>>>>>> I haven't had much time to spend on this today.
> >>>>>>>>
> >>>>>>>> I got 2.6.36-stable (+ stackframe patch) booting yesterday, and I have
> >>>>>>>> observed the hang with 2.6.37-rc1 (same as rc6 and current git head).
> >>>>>>>>
> >>>>>>>> So probably some patch in the 2.6.37 branch introduced this hang.
> >>>>>>>>
> >>>>>>>> Hopefully I will get a free slot tomorrow so that I can look into the
> >>>>>>>> code diff.
> >>>>>>>>
> >>>>>>>> Thanks
> >>>>>>>> Anoop
> >>>>>>>>
> >>>>>>>> On Mon, 2010-12-27 at 09:49 -0600, STUART VENTERS wrote:
> >>>>>>>>> Kevin,
> >>>>>>>>>
> >>>>>>>>> Outstanding, sometimes it's better to be lucky than good.
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> Anoop,
> >>>>>>>>>
> >>>>>>>>> Maybe we can get lucky again.
> >>>>>>>>>
> >>>>>>>>> If you can isolate the .33 works/.37 works_not bug to a specific pair of versions,
> >>>>>>>>> I'll be happy to do another diff.
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> Hope you'll have had a good Christmas as well.
> >>>>>>>>> We've had snow in Alabama since Christmas eve!
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> Regards,
> >>>>>>>>>
> >>>>>>>>> Stuart
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> -----Original Message-----
> >>>>>>>>> From: Kevin D. Kissell [mailto:kevink@paralogos.com]
> >>>>>>>>> Sent: Friday, December 24, 2010 5:34 PM
> >>>>>>>>> To: Anoop P A
> >>>>>>>>> Cc: STUART VENTERS; Anoop P.A.; linux-mips@linux-mips.org
> >>>>>>>>> Subject: Re: SMTC support status in latest git head.
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> Ah, well, at least we have a stackframe.h fix that preserves David's
> >>>>>>>>> performance tweak for the deeper pipelined processors. In looking for
> >>>>>>>>> this, I did notice that someone did some modification to the SMTC clock
> >>>>>>>>> tick logic that I was skeptical had ever been tested. If you've still
> >>>>>>>>> got that kernel binary handy, you might check to see if it boots with
> >>>>>>>>> maxtcs=1 maxvpes=1, maxtcs=2 maxvpes=1, and/or maxtcs=2 maxvpes=2.
> >>>>>>>>>
> >>>>>>>>> Oh, yes, and Merry Christmas one and all!
> >>>>>>>>>
> >>>>>>>>> Regards,
> >>>>>>>>>
> >>>>>>>>> Kevin K.
> >>>>>>>>>
> >>>>>>>>> On 12/24/10 8:02 AM, Anoop P A wrote:
> >>>>>>>>>> On Fri, 2010-12-24 at 06:53 -0800, Kevin D. Kissell wrote:
> >>>>>>>>>>> Excellent! Now, does the attached patch (relative to 2.6.37.11) also
> >>>>>>>>>>> fix things, while preserving the other fixes and performance enhancements?
> >>>>>>>>>>>
> >>>>>>>>>> I have tested that patch with the 2.6.37 branch. It gets well past the
> >>>>>>>>>> calibration loop but hangs after switching to the MIPS clocksource.
> >>>>>>>>>>
> >>>>>>>>>> TC 6 going on-line as CPU 6
> >>>>>>>>>> Brought up 7 CPUs
> >>>>>>>>>> bio: create slab<bio-0> at 0
> >>>>>>>>>> SCSI subsystem initialized
> >>>>>>>>>> Switching to clocksource MIPS
> >>>>>>>>>>
> >>>>>>>>>> I presume this is a different issue, as restoring the older file didn't
> >>>>>>>>>> help to get rid of this hang.
> >>>>>>>>>>
> >>>>>>>>>> diff --git a/arch/mips/include/asm/stackframe.h
> >>>>>>>>>> b/arch/mips/include/asm/stackframe.h
> >>>>>>>>>> index 58730c5..7fc9f10 100644
> >>>>>>>>>> --- a/arch/mips/include/asm/stackframe.h
> >>>>>>>>>> +++ b/arch/mips/include/asm/stackframe.h
> >>>>>>>>>> @@ -195,9 +195,9 @@
> >>>>>>>>>> * to cover the pipeline delay.
> >>>>>>>>>> */
> >>>>>>>>>> .set mips32
> >>>>>>>>>> - mfc0 v1, CP0_TCSTATUS
> >>>>>>>>>> + mfc0 v0, CP0_TCSTATUS
> >>>>>>>>>> .set mips0
> >>>>>>>>>> - LONG_S v1, PT_TCSTATUS(sp)
> >>>>>>>>>> + LONG_S v0, PT_TCSTATUS(sp)
> >>>>>>>>>> #endif /* CONFIG_MIPS_MT_SMTC */
> >>>>>>>>>> LONG_S $4, PT_R4(sp)
> >>>>>>>>>> LONG_S $5, PT_R5(sp)
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>> /K.
> >>>>>>>>>>>
> >>>>>>>>>>> On 12/24/10 6:39 AM, Anoop P A wrote:
> >>>>>>>>>>>> Hi Kevin, Stuart ,
> >>>>>>>>>>>>
> >>>>>>>>>>>> Woohoo, you guys spotted it!
> >>>>>>>>>>>>
> >>>>>>>>>>>> http://git.linux-mips.org/?p=linux.git;a=commit;h=d5ec6e3c seems to be
> >>>>>>>>>>>> the culprit.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Once I restored the previous version of stackframe.h, 2.6.33-stable
> >>>>>>>>>>>> started booting!
> >>>>>>>>>>>>
> >>>>>>>>>>>> Thanks,
> >>>>>>>>>>>> Anoop
> >>>>>>>>>>>>
> >>>>>>>>>>>> On Fri, 2010-12-24 at 04:32 -0800, Kevin D. Kissell wrote:
> >>>>>>>>>>>>> Thank you, Stuart! I've spotted some definite breakage to SMTC between
> >>>>>>>>>>>>> those versions. In arch/mips/include/asm/stackframe.h, someone moved
> >>>>>>>>>>>>> the store of the Status register value in SAVE_SOME (line 169 or 204,
> >>>>>>>>>>>>> depending on the version) from two instructions after the mfc0 to a
> >>>>>>>>>>>>> point after the #ifdef for SMTC, presumably to get better pipelining of
> >>>>>>>>>>>>> the register access. Unfortunately, the v1 register is also used in the
> >>>>>>>>>>>>> SMTC-specific fragment to save TCStatus, so the Status value gets
> >>>>>>>>>>>>> clobbered before it gets stored. This will eventually result in the
> >>>>>>>>>>>>> Status register getting a TCStatus value, which has some bits in common,
> >>>>>>>>>>>>> but isn't identical, and sooner or later Bad Things will happen.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> I'm a little surprised this wasn't caught by visual inspection of the patch.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Possible solutions would include reverting the store of the CP0_STATUS
> >>>>>>>>>>>>> value to the block above the #ifdef, or, to retain whatever performance
> >>>>>>>>>>>>> advantage was obtained by moving the store downward, to use v0/$2
> >>>>>>>>>>>>> instead of v1/$3, as the staging register for the TCStatus value. I'd
> >>>>>>>>>>>>> lean toward the second option, but I'm not in a position to test and
> >>>>>>>>>>>>> submit a patch just now.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Regards,
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Kevin K.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On 12/23/10 1:09 PM, STUART VENTERS wrote:
> >>>>>>>>>>>>>> Kevin,
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> I'm not sure if it's useful,
> >>>>>>>>>>>>>> but finally I got the time to look at the two kernel versions Anoop pointed out.
> >>>>>>>>>>>>>> works 2.6.32-stable with patch 804
> >>>>>>>>>>>>>> works_not 2.6.33-stable
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Grepping for files with CONFIG_MIPS_MT_SMTC
> >>>>>>>>>>>>>> and looking for timer-interrupt-related stuff found the following differences:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> arch/mips/include/asm/irq.h
> >>>>>>>>>>>>>> arch/mips/kernel/irq.c
> >>>>>>>>>>>>>> do_IRQ
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> arch/mips/include/asm/stackframe.h
> >>>>>>>>>>>>>> SAVE_SOME SAVE_TEMP get/set_saved_sp
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> arch/mips/include/asm/time.h
> >>>>>>>>>>>>>> clocksource_set_clock
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> arch/mips/kernel/process.c
> >>>>>>>>>>>>>> cpu_idle
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> arch/mips/kernel/smtc.c
> >>>>>>>>>>>>>> __irq_entry
> >>>>>>>>>>>>>> ipi_decode
> >>>>>>>>>>>>>> SMTC_CLOCK_TICK
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Enclosed are the two subsets of files for a more expert look.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> I'll try to look in more detail after Christmas.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Cheers,
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Stuart
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >
>
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: SMTC support status in latest git head.
2011-01-04 13:02 ` Anoop P A
@ 2011-01-04 14:37 ` Anoop P A
2011-01-04 17:21 ` Kevin D. Kissell
2011-01-04 17:40 ` Kevin D. Kissell
1 sibling, 1 reply; 68+ messages in thread
From: Anoop P A @ 2011-01-04 14:37 UTC (permalink / raw)
To: Kevin D. Kissell; +Cc: STUART VENTERS, Anoop P.A., linux-mips
Hi Kevin,
The stackframe patch that you suggested had a side effect: I was unable
to execute init. When I changed it to something like the diff below, it
started working. Could you kindly review it?
diff --git a/arch/mips/include/asm/stackframe.h b/arch/mips/include/asm/stackframe.h
index 58730c5..da786ed 100644
--- a/arch/mips/include/asm/stackframe.h
+++ b/arch/mips/include/asm/stackframe.h
@@ -181,14 +181,6 @@
#endif
LONG_S k0, PT_R29(sp)
LONG_S $3, PT_R3(sp)
- /*
- * You might think that you don't need to save $0,
- * but the FPU emulator and gdb remote debug stub
- * need it to operate correctly
- */
- LONG_S $0, PT_R0(sp)
- mfc0 v1, CP0_STATUS
- LONG_S $2, PT_R2(sp)
#ifdef CONFIG_MIPS_MT_SMTC
/*
* Ideally, these instructions would be shuffled in
@@ -199,6 +191,14 @@
.set mips0
LONG_S v1, PT_TCSTATUS(sp)
#endif /* CONFIG_MIPS_MT_SMTC */
+ /*
+ * You might think that you don't need to save $0,
+ * but the FPU emulator and gdb remote debug stub
+ * need it to operate correctly
+ */
+ LONG_S $0, PT_R0(sp)
+ mfc0 v1, CP0_STATUS
+ LONG_S $2, PT_R2(sp)
LONG_S $4, PT_R4(sp)
LONG_S $5, PT_R5(sp)
LONG_S v1, PT_STATUS(sp)
Linux 2.6.37-rc7 boots all the way if I specify maxvpes=1 on the command
line.
/ # cat /proc/interrupts
         CPU0     CPU1     CPU2     CPU3     CPU4     CPU5     CPU6
  1:      249   218024   218286   218263   218235   218208   218179  MIPS     SMTC_IPI
  6:        0        0        0        0        0        0        0  MIPS     MSP CIC cascade
  8:        0        0        0        0        0        0        0  MSP_CIC  Softreset button
  9:        0        0        0        0        0        0        0  MSP_CIC  Standby switch
 21:        0        0        0        0        0        0        0  MSP_CIC  MSP PER cascade
 25:   218128      711       11        0        0        0        0  MSP_CIC  timer
 27:      341       22        0        0        2        0        6  MSP_CIC  serial
ERR:        0
/ # uname -a
Linux (none) 2.6.37-rc7-pmc-00001-g9cff2d6-dirty #289 SMP PREEMPT Tue Jan 4 19:48:31 IST 2011 mips GNU/Linux
So clock setup / distribution on VPE1 is something that needs fixing.
Thanks
Anoop
On Tue, 2011-01-04 at 18:32 +0530, Anoop P A wrote:
> On Tue, 2011-01-04 at 00:17 -0800, Kevin D. Kissell wrote:
> > Those interrupt counters show that IPIs are being taken everywhere,
> > though very few by CPUs 5 and 6. If I understand the configuration
> > correctly, CPU 4 is a TC in VPE 1, and it's getting a reasonable IPI
> Yes, CPU4 is in the second VPE.
>
> > rate, *if* we're looking at a tickless kernel under low load. But there
> No, it was not a tickless kernel. I had selected the 250 Hz timer. Can't we
> expect IPI / timer interrupts for all the threads in this case?
>
> > may be a clue there to part of your problem. I have no idea why the
> > behavior would have changed from 2.6.36 to 2.6.37, but it looks as if
> > you're getting your clock interrupts through the MSP CIC interrupt
> > controller on VPE 0. There's nothing symmetric for VPE1. The Malta
> > example code is perhaps deceptively simple, in that both VPEs have their
> > count/compare indication wired directly to the 2 clock interrupt inputs,
> > so that having both of them running with only a single set of irq state
> > just works. I don't know whether the MSP CIC timer interrupt is a
>
> In my case they are separate IRQs. MSP_INT_VPE1_TIMER (34) and
> MSP_INT_VPE0_TIMER (25) are wired to the CIC. The CIC interrupt is
> connected to CPU IRQ 6.
>
> I can reproduce the CPU stall in VSMP mode if I don't set up the VPE1 timer
> interrupt. Don't we have support for separate IRQs in the SMTC
> implementation?
>
> > gating of the VPE0 count/compare output, or whether it's its own
> > interval timer, but I suspect that you may need to do some further
> > low-level initialization in the platform-specific code to set up an
> > interrupt on the VPE1 side. I don't think the snippet you've got below
> > would work as written.
>
> The routine I copied works fine in VSMP mode.
>
> / # cat /proc/interrupts
> CPU0 CPU1
> 0: 187 254 MIPS IPI_resched
> 1: 77 174 MIPS IPI_call
> 6: 0 0 MIPS MSP CIC cascade
> 8: 0 0 MSP_CIC Softreset button
> 9: 0 0 MSP_CIC Standby switch
> 21: 0 0 MSP_CIC MSP PER cascade
> 25: 37077 0 MSP_CIC timer
> 27: 188 0 MSP_CIC serial
> 34: 0 36986 MSP_CIC timer
>
> Do I need to change anything specific for SMTC?
>
> >
> > If it's purely an issue with clock distribution on VPE1, then a boot
> > with maxvpes=1 maxtcs=4 should be stable.
>
> Yes, the kernel seems to be stable if I boot with maxvpes=1 maxtcs=4.
>
> >
> > /K.
> >
> > On 1/3/2011 11:20 AM, Anoop P A wrote:
> > > Hi Kevin,
> > >
> > > On Mon, 2011-01-03 at 08:14 -0800, Kevin D. Kissell wrote:
> > >> The very first SMTC implementations didn't support full kernel-mode
> > >> preemption, which anyway wasn't a priority, given the hardware event
> > >> response support in MIPS MT. I believe it was later made compatible,
> > >> but it was never extensively exercised. Since SMTC has fingers in some
> > >> pretty low-level atomicity mechanisms, if a new, parallel set was
> > >> implemented for RCU, I can easily imagine that nobody has yet
> > >> implemented SMTC-ified variants of that set.
> > >>
> > >> Your last statement isn't very clear, though. Are you saying that if
> > >> you configure for no forced preemption and with TREE_RCU, the 2.6.37
> > >> kernel boots all the way up, or that it simply hangs later? What's the
> > >> last rev kernel that actually boots all the way up?
> > > I have debugged this a bit more. It seems that the kernel is getting stalled
> > > while executing on the TCs of the second VPE.
> > > INFO: rcu_sched_state detected stalls on CPUs/tasks: { 4 5 6} (detected
> > > by 1, t=2504 jiffies)
> > > INFO: rcu_sched_state detected stalls on CPUs/tasks: { 4 5 6} (detected
> > > by 1, t=10036 jiffies)
> > > INFO: rcu_sched_state detected stalls on CPUs/tasks: { 4 5 6} (detected
> > > by 1, t=17568 jiffies)
> > > INFO: rcu_sched_state detected stalls on CPUs/tasks: { 4 5 6} (detected
> > > by 1, t=25100 jiffies)
> > > INFO: rcu_sched_state detected stalls on CPUs/tasks: { 4 5 6} (detected
> > > by 1, t=32632 jiffies)
> > > INFO: rcu_sched_state detected stalls on CPUs/tasks: { 4 5 6} (detected
> > > by 1, t=40164 jiffies)
> > >
> > > With CONFIG_TREE_RCU we were not hitting this scenario very often.
> > > However, with CONFIG_TREE_PREEMPT_RCU the stall happens most of the time.
> > >
> > > I presume there is some issue in my timer setup. I am not seeing the timer
> > > interrupt (or IPI) count getting incremented for the VPE1 TCs on a completely
> > > booted 2.6.32-stable kernel.
> > >
> > > / # cat /proc/interrupts
> > >          CPU0     CPU1     CPU2     CPU3     CPU4     CPU5     CPU6
> > >   1:      148    15023    15140    15093     3779        8        2  MIPS     SMTC_IPI
> > >   6:        0        0        0        0        0        0        0  MIPS     MSP CIC cascade
> > >   8:        0        0        0        0        0        0        0  MSP_CIC  Softreset button
> > >   9:        0        0        0        0        0        0        0  MSP_CIC  Standby switch
> > >  21:        0        0        0        0        0        0        0  MSP_CIC  MSP PER cascade
> > >  25:    15113      341        4        7        0        0        0  MSP_CIC  timer
> > >  27:      260        9        0        1        0        0        0  MSP_CIC  serial
> > >  34:        0        0        0        0        0        0        0  MSP_CIC  timer
> > >
> > > Can't we use separate timer interrupts for VPE1 and VPE0 in SMTC?
> > >
> > > I have tried setting up the VPE1 timer from get_c0_compare_int as follows:
> > >
> > > unsigned int __cpuinit get_c0_compare_int(void)
> > > {
> > >         /* Install the VPE1 timer handler once, on the first call from VPE1. */
> > >         if (get_current_vpe() == 1 && !vpe1_timr_installed) {
> > >                 memcpy(&timer_vpe1, &c0_compare_irqaction, sizeof(timer_vpe1));
> > >                 setup_irq(MSP_INT_VPE1_TIMER, &timer_vpe1);
> > >                 vpe1_timr_installed++;
> > >         }
> > >         return get_current_vpe() ? MSP_INT_VPE1_TIMER : MSP_INT_VPE0_TIMER;
> > > }
> > >
> > > Thanks
> > > Anoop
> > >
> > >> Regards,
> > >>
> > >> Kevin K.
> > >>
> > >> On 1/3/2011 7:12 AM, Anoop P A wrote:
> > >>> Hi ,
> > >>>
> > >>> The following patch restricts the TREE_RCU implementation to !PREEMPT
> > >>> SMP kernels:
> > >>> http://git.linux-mips.org/?p=linux.git;a=commit;h=687d7a960aea46e016182c7ce346d62c4dbd0366
> > >>>
> > >>> The CONFIG_TREE_PREEMPT_RCU option does not seem to work for the SMTC kernel
> > >>> (which will be the only RCU implementation available for the SMTC kernel from
> > >>> 2.6.37 onwards).
> > >>>
> > >>> With no forced preemption and selecting TREE_RCU I am able to boot
> > >>> further to the hang that I have reported.
> > >>>
> > >>> Thanks
> > >>> Anoop
> > >>>
> > >>> On Sat, 2011-01-01 at 00:42 -0800, Kevin D. Kissell wrote:
> > >>>> At this point the logical thing to do would seem to look at your kernel
> > >>>> image and disassemble smtc_ipi_replay(), which is where the EPC of VPE 0
> > >>>> shows the last exception to have been taken. That's a critical SMTC
> > >>>> routine that gets called whenever an xxx_irq_restore() enables
> > >>>> interrupts, so that virtual per-TC IPI interrupts that were posted while
> > >>>> the TC had interrupts disabled can be handled deterministically. As I
> > >>>> mentioned in an earlier message, there was some cleanup work from David
> > >>>> Howell that changed a number of irq management-related function names
> > >>>> and prototypes across all architectures, which went into linux-mips.org
> > >>>> at very roughly the time of the breakage. The SMTC overlay over the irq
> > >>>> implementation has been pretty robust, but it's written in a perhaps
> > >>>> doomed attempt to be both efficient and using a maximum amount of common
> > >>>> code with the general case. A mechanical or semi-mechanical change
> > >>>> could conceivably have broken things.
> > >>>>
> > >>>> Regards,
> > >>>>
> > >>>> Kevin K.
> > >>>>
> > >>>>
> > >>>> On 12/31/2010 4:27 AM, Anoop P A wrote:
> > >>>>> Hi ,
> > >>>>>
> > >>>>> The kernel hangs on the stop_machine call. Please find the MT register dump below.
> > >>>>> Another important observation: even though the 2.6.33 kernel + stackframe
> > >>>>> patch gets well past the calibration hang, I am still unable to boot into an
> > >>>>> initramfs root (verified that the ramfs works with VSMP). So it looks like
> > >>>>> there is still some issue to fix between 2.6.32 and 2.6.33.
> > >>>>> ######################## Log ###########################
> > >>>>>
> > >>>>> === MIPS MT State Dump ===
> > >>>>> -- Global State --
> > >>>>> MVPControl Passed: 00000005
> > >>>>> MVPControl Read: 00000004
> > >>>>> MVPConf0 : a8008406
> > >>>>> -- per-VPE State --
> > >>>>> VPE 0
> > >>>>> VPEControl : 00008000
> > >>>>> VPEConf0 : 800f0003
> > >>>>> VPE0.Status : 11004201
> > >>>>> VPE0.EPC : 8010dc54 smtc_ipi_replay+0xcc/0x108
> > >>>>> VPE0.Cause : 50804000
> > >>>>> VPE0.Config7 : 00010000
> > >>>>> VPE 1
> > >>>>> VPEControl : 00068006
> > >>>>> VPEConf0 : 80cf0003
> > >>>>> VPE1.Status : 11008301
> > >>>>> VPE1.EPC : 801022a0 r4k_wait+0x20/0x40
> > >>>>> VPE1.Cause : 50800000
> > >>>>> VPE1.Config7 : 00010000
> > >>>>> -- per-TC State --
> > >>>>> TC 0 (current TC with VPE EPC above)
> > >>>>> TCStatus : 18102000
> > >>>>> TCBind : 00000000
> > >>>>> TCRestart : 803fa19c printk+0xc/0x30
> > >>>>> TCHalt : 00000000
> > >>>>> TCContext : 00000000
> > >>>>> TC 1
> > >>>>> TCStatus : 18902000
> > >>>>> TCBind : 00200000
> > >>>>> TCRestart : 801022a0 r4k_wait+0x20/0x40
> > >>>>> TCHalt : 00000000
> > >>>>> TCContext : 00140000
> > >>>>> TC 2
> > >>>>> TCStatus : 18902000
> > >>>>> TCBind : 00400000
> > >>>>> TCRestart : 801022a0 r4k_wait+0x20/0x40
> > >>>>> TCHalt : 00000000
> > >>>>> TCContext : 00280000
> > >>>>> TC 3
> > >>>>> TCStatus : 18902000
> > >>>>> TCBind : 00600000
> > >>>>> TCRestart : 801022a0 r4k_wait+0x20/0x40
> > >>>>> TCHalt : 00000000
> > >>>>> TCContext : 003c0000
> > >>>>> TC 4
> > >>>>> TCStatus : 18902000
> > >>>>> TCBind : 00800001
> > >>>>> TCRestart : 8010229c r4k_wait+0x1c/0x40
> > >>>>> TCHalt : 00000000
> > >>>>> TCContext : 00500000
> > >>>>> TC 5
> > >>>>> TCStatus : 18902000
> > >>>>> TCBind : 00a00001
> > >>>>> TCRestart : 8010229c r4k_wait+0x1c/0x40
> > >>>>> TCHalt : 00000000
> > >>>>> TCContext : 00640000
> > >>>>> TC 6
> > >>>>> TCStatus : 18902000
> > >>>>> TCBind : 00c00001
> > >>>>> TCRestart : 8010229c r4k_wait+0x1c/0x40
> > >>>>> TCHalt : 00000000
> > >>>>> TCContext : 00780000
> > >>>>> Counter Interrupts taken per CPU (TC)
> > >>>>> 0: 0
> > >>>>> 1: 0
> > >>>>> 2: 0
> > >>>>> 3: 0
> > >>>>> 4: 0
> > >>>>> 5: 0
> > >>>>> 6: 0
> > >>>>> 7: 0
> > >>>>> Self-IPI invocations:
> > >>>>> 0: 12
> > >>>>> 1: 0
> > >>>>> 2: 0
> > >>>>> 3: 0
> > >>>>> 4: 0
> > >>>>> 5: 5
> > >>>>> 6: 4
> > >>>>> 7: 0
> > >>>>> IPIQ[0]: head = 0x0, tail = 0x0, depth = 0
> > >>>>> IPIQ[1]: head = 0x0, tail = 0x0, depth = 0
> > >>>>> IPIQ[2]: head = 0x0, tail = 0x0, depth = 0
> > >>>>> IPIQ[3]: head = 0x0, tail = 0x0, depth = 0
> > >>>>> IPIQ[4]: head = 0x0, tail = 0x0, depth = 0
> > >>>>> IPIQ[5]: head = 0x0, tail = 0x0, depth = 0
> > >>>>> IPIQ[6]: head = 0x0, tail = 0x0, depth = 0
> > >>>>> IPIQ[7]: head = 0x0, tail = 0x0, depth = 0
> > >>>>> 0 Recoveries of "stolen" FPU
> > >>>>> ===========================
> > >>>>>
> > >>>>> ################################################################
> > >>>>>
> > >>>>> Thanks
> > >>>>> Anoop
> > >>>>>
> > >
> >
>
* Re: SMTC support status in latest git head.
2011-01-04 14:37 ` Anoop P A
@ 2011-01-04 17:21 ` Kevin D. Kissell
2011-01-04 17:54 ` Anoop P A
0 siblings, 1 reply; 68+ messages in thread
From: Kevin D. Kissell @ 2011-01-04 17:21 UTC (permalink / raw)
To: Anoop P A; +Cc: STUART VENTERS, Anoop P.A., linux-mips
I'm trying to figure out a reason why your change below should help, and
offhand, modulo tool bugs, I don't see it. I'm assuming that your diff
below is a diff relative to the pre-patch stackframe.h. I wouldn't
bless it as an alternative because it moves code and comments
unnecessarily - all you should really have to do is to move the
190 mfc0 v1, CP0_STATUS
191 LONG_S $2, PT_R2(sp)
to be just after the #endif /* CONFIG_MIPS_MT_SMTC */ at around line 201.
If moving the save of zero to PT_R0(sp) actually makes a difference,
it's evidence that you've got problems in your toolchain (or, heaven
forbid, your pipeline)!
But I'd really like to see what your assembler is doing to the original
patch for it to be broken. Assembler instruction reordering is armed,
but it ought not to move register moves and stores around in ways that
would cause your sequence
197 .set mips32
198 mfc0 v1, CP0_TCSTATUS
199 .set mips0
200 LONG_S v1, PT_TCSTATUS(sp)
189 LONG_S $0, PT_R0(sp)
190 mfc0 v1, CP0_STATUS
191 LONG_S $2, PT_R2(sp)
202 LONG_S $4, PT_R4(sp)
203 LONG_S $5, PT_R5(sp)
204 LONG_S v1, PT_STATUS(sp)
to work while
189 LONG_S $0, PT_R0(sp)
190 mfc0 v1, CP0_STATUS
191 LONG_S $2, PT_R2(sp)
197 .set mips32
198 mfc0 v0, CP0_TCSTATUS
199 .set mips0
200 LONG_S v0, PT_TCSTATUS(sp)
202 LONG_S $4, PT_R4(sp)
203 LONG_S $5, PT_R5(sp)
204 LONG_S v1, PT_STATUS(sp)
does not, provided that the identity of v0=$2, v1=$3 is respected.
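As an illustration only (this is not code from the kernel tree), the three
orderings under discussion can be modelled in plain C. The struct and function
names below are hypothetical, and the model ignores pipeline hazards and
assembler reordering entirely; it only tracks which value each store would see
if the instructions ran exactly in the order written. The Status and TCStatus
constants are borrowed from the register dump quoted further down.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model only: hypothetical names, no pipeline or assembler effects. */
struct toy_cpu {
	uint32_t v0, v1;        /* $2 and $3 */
	uint32_t cp0_status;    /* CP0 Status */
	uint32_t cp0_tcstatus;  /* CP0 TCStatus */
};

struct toy_frame {
	uint32_t r2, status, tcstatus;  /* PT_R2, PT_STATUS, PT_TCSTATUS */
};

/* Deferred Status store, with TCStatus also staged through v1. */
static void save_v1_reused(struct toy_cpu *c, struct toy_frame *f)
{
	c->v1 = c->cp0_status;      /* mfc0   v1, CP0_STATUS      */
	f->r2 = c->v0;              /* LONG_S $2, PT_R2(sp)       */
	c->v1 = c->cp0_tcstatus;    /* mfc0   v1, CP0_TCSTATUS    */
	f->tcstatus = c->v1;        /* LONG_S v1, PT_TCSTATUS(sp) */
	f->status = c->v1;          /* LONG_S v1, PT_STATUS(sp)   */
}

/* Same order, but TCStatus staged through v0 after $2 has been saved. */
static void save_v0_staged(struct toy_cpu *c, struct toy_frame *f)
{
	c->v1 = c->cp0_status;      /* mfc0   v1, CP0_STATUS      */
	f->r2 = c->v0;              /* LONG_S $2, PT_R2(sp)       */
	c->v0 = c->cp0_tcstatus;    /* mfc0   v0, CP0_TCSTATUS    */
	f->tcstatus = c->v0;        /* LONG_S v0, PT_TCSTATUS(sp) */
	f->status = c->v1;          /* LONG_S v1, PT_STATUS(sp)   */
}

/* TCStatus read and stored first, then Status read into v1. */
static void save_reordered(struct toy_cpu *c, struct toy_frame *f)
{
	c->v1 = c->cp0_tcstatus;    /* mfc0   v1, CP0_TCSTATUS    */
	f->tcstatus = c->v1;        /* LONG_S v1, PT_TCSTATUS(sp) */
	c->v1 = c->cp0_status;      /* mfc0   v1, CP0_STATUS      */
	f->r2 = c->v0;              /* LONG_S $2, PT_R2(sp)       */
	f->status = c->v1;          /* LONG_S v1, PT_STATUS(sp)   */
}

/* Returns 1 when every frame slot holds the value it was meant to hold. */
static int frame_ok(const struct toy_cpu *c, const struct toy_frame *f,
		    uint32_t entry_v0)
{
	return f->r2 == entry_v0 &&
	       f->status == c->cp0_status &&
	       f->tcstatus == c->cp0_tcstatus;
}

/* Runs all three orderings on the same starting state. */
static void toy_selfcheck(void)
{
	const struct toy_cpu init = { 0x1234, 0, 0x11004201, 0x18102000 };
	struct toy_cpu c;
	struct toy_frame f;

	c = init;
	save_v1_reused(&c, &f);
	assert(!frame_ok(&c, &f, init.v0));  /* PT_STATUS received TCStatus */

	c = init;
	save_v0_staged(&c, &f);
	assert(frame_ok(&c, &f, init.v0));   /* frame contents are correct */
	assert(c.v0 != init.v0);             /* but live v0 was overwritten */

	c = init;
	save_reordered(&c, &f);
	assert(frame_ok(&c, &f, init.v0));
	assert(c.v0 == init.v0);             /* live v0 left untouched */
}
```

In this model both alternatives produce an identical saved frame. The only
difference it exposes is the live value left in v0 once the sequence has
finished; whether anything downstream of the macro depends on that is not
something the model can answer.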
One thing that does stick out as being different - though, again, I'd
need to see the disassembly of an instance of the macro to know what it
could have done - is that the SMTC conditional code brackets the mfc0 of
TCStatus with .set mips32/.set mips0. Given that the code no longer has
a .set mips0 early in the macro, it would be more correct to make it:
.set push
.set mips32
mfc0 v0, CP0_TCSTATUS (or v1, if we move the mfc0 v1, CP0_STATUS)
.set pop
and presumably make a similar change for the block from line 334 to 429.
But I don't see any causal path from that funniness to failure.
Regards,
Kevin K.
On 01/04/11 06:37, Anoop P A wrote:
> Hi Kevin,
>
> the stackframe patch that you have suggested had some side effects I was
> unable execute init. When I changed some thing like below it started
> working .Could you kindly review it ?.
>
> diff --git a/arch/mips/include/asm/stackframe.h
> b/arch/mips/include/asm/stackframe.h
> index 58730c5..da786ed 100644
> --- a/arch/mips/include/asm/stackframe.h
> +++ b/arch/mips/include/asm/stackframe.h
> @@ -181,14 +181,6 @@
> #endif
> LONG_S k0, PT_R29(sp)
> LONG_S $3, PT_R3(sp)
> - /*
> - * You might think that you don't need to save $0,
> - * but the FPU emulator and gdb remote debug stub
> - * need it to operate correctly
> - */
> - LONG_S $0, PT_R0(sp)
> - mfc0 v1, CP0_STATUS
> - LONG_S $2, PT_R2(sp)
> #ifdef CONFIG_MIPS_MT_SMTC
> /*
> * Ideally, these instructions would be shuffled in
> @@ -199,6 +191,14 @@
> .set mips0
> LONG_S v1, PT_TCSTATUS(sp)
> #endif /* CONFIG_MIPS_MT_SMTC */
> + /*
> + * You might think that you don't need to save $0,
> + * but the FPU emulator and gdb remote debug stub
> + * need it to operate correctly
> + */
> + LONG_S $0, PT_R0(sp)
> + mfc0 v1, CP0_STATUS
> + LONG_S $2, PT_R2(sp)
> LONG_S $4, PT_R4(sp)
> LONG_S $5, PT_R5(sp)
> LONG_S v1, PT_STATUS(sp)
>
> Linux-2.6.37-rc7 boots all the way if I specify maxvpes=1 in command
> line.
>
> / # cat /proc/interrupts
>         CPU0    CPU1    CPU2    CPU3    CPU4    CPU5    CPU6
>   1:     249  218024  218286  218263  218235  218208  218179  MIPS SMTC_IPI
>   6:       0       0       0       0       0       0       0  MIPS MSP CIC cascade
>   8:       0       0       0       0       0       0       0  MSP_CIC Softreset button
>   9:       0       0       0       0       0       0       0  MSP_CIC Standby switch
>  21:       0       0       0       0       0       0       0  MSP_CIC MSP PER cascade
>  25:  218128     711      11       0       0       0       0  MSP_CIC timer
>  27:     341      22       0       0       2       0       6  MSP_CIC serial
>
> ERR: 0
> / # uname -a
> Linux (none) 2.6.37-rc7-pmc-00001-g9cff2d6-dirty #289 SMP PREEMPT Tue
> Jan 4 19:48:31 IST 2011 mips GNU/Linux
>
> So clock setup / distribution on VPE1 is some thing need fix.
>
> Thanks
> Anoop
>
>
> On Tue, 2011-01-04 at 18:32 +0530, Anoop P A wrote:
>> On Tue, 2011-01-04 at 00:17 -0800, Kevin D. Kissell wrote:
>>> Those interrupt counters show that IPIs are being taken everywhere,
>>> though very few by CPUs 5 and 6. If I understand the configuration
>>> correctly, CPU 4 is a TC in VPE 1, and it's getting a reasonable IPI
>> Yes CPU4 is in second VPE
>>
>>> rate, *if* we're looking at a tickless kernel under low load. But there
>> No it was not the tickless kernel.I had selected 250 MHz timer. can't we
>> expect IPI / timer interrupt for all the threads in this case ?.
>>
>>> may be a clue there to part of your problem. I have no idea why the
>>> behavior would have changed from 2.6.36 to 2.6.37, but it looks as if
>>> you're getting your clock interrupts through the MSP CIC interrupt
>>> controller on VPE 0. There's nothing symmetric for VPE1. The Malta
>>> example code is perhaps deceptively simple, in that both VPEs have their
>>> count/compare indication wired directly to the 2 clock interrupt inputs,
>>> so that having both of them running with only a single set of irq state
>>> just works. I don't know whether the MSP CIC timer interrupt is a
>> In my case it is separate irq. MSP_INT_VPE1_TIMER (34) and
>> MSP_INT_VPE0_TIMER (25) are wired to CIC . CIC interrupt has been
>> connected to cpu irq 6.
>>
>> I can reproduce cpu stall in VSMP mode If I don't setup VPE1 timer
>> interrupt . Don't we have support for separate irq in SMTC
>> implementation ?..
>>
>>> gating of the VPE0 count/compare output, or whether it's it's own
>>> interval timer, but I suspect that you may need to do some further
>>> low-level initialization in the platform-specific code to set up an
>>> interrupt on the VPE1 side. I don't think the snippet you've got below
>>> would work as written.
>> The routine which I copied works fine for VSMP mode .
>>
>> / # cat /proc/interrupts
>> CPU0 CPU1
>> 0: 187 254 MIPS IPI_resched
>> 1: 77 174 MIPS IPI_call
>> 6: 0 0 MIPS MSP CIC cascade
>> 8: 0 0 MSP_CIC Softreset button
>> 9: 0 0 MSP_CIC Standby switch
>> 21: 0 0 MSP_CIC MSP PER cascade
>> 25: 37077 0 MSP_CIC timer
>> 27: 188 0 MSP_CIC serial
>> 34: 0 36986 MSP_CIC timer
>>
>> Do I want to change anything specific for SMTC ? .
>>
>>> If it's purely an issue with clock distribution on VPE1, then a boot
>>> with maxvpes=1 maxtcs=4 should be stable.
>> Yes the kernel seems to be stable if I boot with maxvpes=1 maxtcs=4 .
>>
>>> /K.
>>>
>>> On 1/3/2011 11:20 AM, Anoop P A wrote:
>>>> Hi Kevin,
>>>>
>>>> On Mon, 2011-01-03 at 08:14 -0800, Kevin D. Kissell wrote:
>>>>> The very first SMTC implementations didn't support full kernel-mode
>>>>> preemption, which anyway wasn't a priority, given the hardware event
>>>>> response support in MIPS MT. I believe it was later made compatible,
>>>>> but it was never extensively exercised. Since SMTC has fingers in some
>>>>> pretty low-level atomicity mechanisms, if a new, parallel set was
>>>>> implemented for RCU, I can easily imagine that nobody has yet
>>>>> implemented SMTC-ified variants of that set.
>>>>>
>>>>> Your last statement isn't very clear, though. Are you saying that if
>>>>> you configure for no forced preemption and with TREE_CPU, the 2.6.37
>>>>> kernel boots all the way up, or that it simply hangs later? What's the
>>>>> last rev kernel that actually boots all the way up?
>>>> I have debugged this a bit more. It seems that kernel getting stalled
>>>> while executing on TC's of second VPE .
>>>> INFO: rcu_sched_state detected stalls on CPUs/tasks: { 4 5 6} (detected
>>>> by 1, t=2504 jiffies)
>>>> INFO: rcu_sched_state detected stalls on CPUs/tasks: { 4 5 6} (detected
>>>> by 1, t=10036 jiffies)
>>>> INFO: rcu_sched_state detected stalls on CPUs/tasks: { 4 5 6} (detected
>>>> by 1, t=17568 jiffies)
>>>> INFO: rcu_sched_state detected stalls on CPUs/tasks: { 4 5 6} (detected
>>>> by 1, t=25100 jiffies)
>>>> INFO: rcu_sched_state detected stalls on CPUs/tasks: { 4 5 6} (detected
>>>> by 1, t=32632 jiffies)
>>>> INFO: rcu_sched_state detected stalls on CPUs/tasks: { 4 5 6} (detected
>>>> by 1, t=40164 jiffies)
>>>>
>>>> With CONFIG_TREE_CPU we were not hitting this scenario very often.
>>>> However with CONFIG_PREEMPT_TREE_CPU stall happens most of the time.
>>>>
>>>> I presume some issue in my timer setup . I am not seeing timer interrupt
>>>> (or IPI interrupt) getting incremented for VPE1 tcs on a completely
>>>> booted 2.6.32-stable kernel.
>>>>
>>>> / # cat /proc/interrupts
>>>>         CPU0    CPU1    CPU2    CPU3    CPU4    CPU5    CPU6
>>>>   1:     148   15023   15140   15093    3779       8       2  MIPS SMTC_IPI
>>>>   6:       0       0       0       0       0       0       0  MIPS MSP CIC cascade
>>>>   8:       0       0       0       0       0       0       0  MSP_CIC Softreset button
>>>>   9:       0       0       0       0       0       0       0  MSP_CIC Standby switch
>>>>  21:       0       0       0       0       0       0       0  MSP_CIC MSP PER cascade
>>>>  25:   15113     341       4       7       0       0       0  MSP_CIC timer
>>>>  27:     260       9       0       1       0       0       0  MSP_CIC serial
>>>>  34:       0       0       0       0       0       0       0  MSP_CIC timer
>>>>
>>>> Can't we use separate timer interrupts for VPE1 and VPE0 in SMTC ?.
>>>>
>>>> I have tried setting up VPE1 timer from get_co_compare_int as follows
>>>>
>>>> unsigned int __cpuinit get_c0_compare_int(void)
>>>> {
>>>> if ((1==get_current_vpe())&& !vpe1_timr_installed){
>>>>
>>>> memcpy(&timer_vpe1,&c0_compare_irqaction,sizeof(timer_vpe1));
>>>>
>>>> setup_irq(MSP_INT_VPE1_TIMER,&timer_vpe1);
>>>> vpe1_timr_installed++;
>>>> }
>>>> return (get_current_vpe() ? MSP_INT_VPE1_TIMER :
>>>> MSP_INT_VPE0_TIMER);
>>>> }
>>>>
>>>> Thanks
>>>> Anoop
>>>>
>>>>> Regards,
>>>>>
>>>>> Kevin K.
>>>>>
>>>>> On 1/3/2011 7:12 AM, Anoop P A wrote:
>>>>>> Hi ,
>>>>>>
>>>>>> Following patch restricts TREE_CPU RCU implementation only for !PREEMPT
>>>>>> SMP kernel.
>>>>>> http://git.linux-mips.org/?p=linux.git;a=commit;h=687d7a960aea46e016182c7ce346d62c4dbd0366
>>>>>>
>>>>>> CONFIG_TREE_PREEMPT_RCU option seems to be not working for SMTC kernel
>>>>>> ( which will be only available RCU implementation for SMTC kernel from
>>>>>> 2.6.37 onwards) .
>>>>>>
>>>>>> With no forced preemption and selecting TREE_CPU I am able to boot
>>>>>> further to the hang that I have reported.
>>>>>>
>>>>>> Thanks
>>>>>> Anoop
>>>>>>
>>>>>> On Sat, 2011-01-01 at 00:42 -0800, Kevin D. Kissell wrote:
>>>>>>> At this point the logical thing to do would seem to look at your kernel
>>>>>>> image and disassemble smtc_ipi_replay(), which is where the EPC of VPE 0
>>>>>>> shows the last exception to have been taken. That's a critical SMTC
>>>>>>> routine that gets called whenever an xxx_irq_restore() enables
>>>>>>> interrupts, so that virtual per-TC IPI interrupts that were posted while
>>>>>>> the TC had interrupts disabled can be handled deterministically. As I
>>>>>>> mentioned in an earlier message, there was some cleanup work from David
>>>>>>> Howell that changed a number of irq management-related function names
>>>>>>> and prototypes across all architectures, which went into linux-mips.org
>>>>>>> at very roughly the time of the breakage. The SMTC overlay over the irq
>>>>>>> implementation has been pretty robust, but it's written in a perhaps
>>>>>>> doomed attempt to be both efficient and using a maximum amount of common
>>>>>>> code with the general case. A mechanical or semi-mechanical change
>>>>>>> could conceivably have broken things.
>>>>>>>
>>>>>>> Regards,
>>>>>>>
>>>>>>> Kevin K.
>>>>>>>
>>>>>>>
>>>>>>> On 12/31/2010 4:27 AM, Anoop P A wrote:
>>>>>>>> Hi ,
>>>>>>>>
>>>>>>>> Kernel hangs on stop_machine call. Please find mt reg dump below.
>>>>>>>> Another important observation is even though 2.6.33 kernel + stackframe
>>>>>>>> patch well passes calibration hang , I am still unable boot in to a
>>>>>>>> initramfs root ( verified ramfs working with VSMP). So it looks like
>>>>>>>> still some issue to fix between 2.6.32 and 2.6.33 .
>>>>>>>> ######################## Log ###########################
>>>>>>>>
>>>>>>>> === MIPS MT State Dump ===
>>>>>>>> -- Global State --
>>>>>>>> MVPControl Passed: 00000005
>>>>>>>> MVPControl Read: 00000004
>>>>>>>> MVPConf0 : a8008406
>>>>>>>> -- per-VPE State --
>>>>>>>> VPE 0
>>>>>>>> VPEControl : 00008000
>>>>>>>> VPEConf0 : 800f0003
>>>>>>>> VPE0.Status : 11004201
>>>>>>>> VPE0.EPC : 8010dc54 smtc_ipi_replay+0xcc/0x108
>>>>>>>> VPE0.Cause : 50804000
>>>>>>>> VPE0.Config7 : 00010000
>>>>>>>> VPE 1
>>>>>>>> VPEControl : 00068006
>>>>>>>> VPEConf0 : 80cf0003
>>>>>>>> VPE1.Status : 11008301
>>>>>>>> VPE1.EPC : 801022a0 r4k_wait+0x20/0x40
>>>>>>>> VPE1.Cause : 50800000
>>>>>>>> VPE1.Config7 : 00010000
>>>>>>>> -- per-TC State --
>>>>>>>> TC 0 (current TC with VPE EPC above)
>>>>>>>> TCStatus : 18102000
>>>>>>>> TCBind : 00000000
>>>>>>>> TCRestart : 803fa19c printk+0xc/0x30
>>>>>>>> TCHalt : 00000000
>>>>>>>> TCContext : 00000000
>>>>>>>> TC 1
>>>>>>>> TCStatus : 18902000
>>>>>>>> TCBind : 00200000
>>>>>>>> TCRestart : 801022a0 r4k_wait+0x20/0x40
>>>>>>>> TCHalt : 00000000
>>>>>>>> TCContext : 00140000
>>>>>>>> TC 2
>>>>>>>> TCStatus : 18902000
>>>>>>>> TCBind : 00400000
>>>>>>>> TCRestart : 801022a0 r4k_wait+0x20/0x40
>>>>>>>> TCHalt : 00000000
>>>>>>>> TCContext : 00280000
>>>>>>>> TC 3
>>>>>>>> TCStatus : 18902000
>>>>>>>> TCBind : 00600000
>>>>>>>> TCRestart : 801022a0 r4k_wait+0x20/0x40
>>>>>>>> TCHalt : 00000000
>>>>>>>> TCContext : 003c0000
>>>>>>>> TC 4
>>>>>>>> TCStatus : 18902000
>>>>>>>> TCBind : 00800001
>>>>>>>> TCRestart : 8010229c r4k_wait+0x1c/0x40
>>>>>>>> TCHalt : 00000000
>>>>>>>> TCContext : 00500000
>>>>>>>> TC 5
>>>>>>>> TCStatus : 18902000
>>>>>>>> TCBind : 00a00001
>>>>>>>> TCRestart : 8010229c r4k_wait+0x1c/0x40
>>>>>>>> TCHalt : 00000000
>>>>>>>> TCContext : 00640000
>>>>>>>> TC 6
>>>>>>>> TCStatus : 18902000
>>>>>>>> TCBind : 00c00001
>>>>>>>> TCRestart : 8010229c r4k_wait+0x1c/0x40
>>>>>>>> TCHalt : 00000000
>>>>>>>> TCContext : 00780000
>>>>>>>> Counter Interrupts taken per CPU (TC)
>>>>>>>> 0: 0
>>>>>>>> 1: 0
>>>>>>>> 2: 0
>>>>>>>> 3: 0
>>>>>>>> 4: 0
>>>>>>>> 5: 0
>>>>>>>> 6: 0
>>>>>>>> 7: 0
>>>>>>>> Self-IPI invocations:
>>>>>>>> 0: 12
>>>>>>>> 1: 0
>>>>>>>> 2: 0
>>>>>>>> 3: 0
>>>>>>>> 4: 0
>>>>>>>> 5: 5
>>>>>>>> 6: 4
>>>>>>>> 7: 0
>>>>>>>> IPIQ[0]: head = 0x0, tail = 0x0, depth = 0
>>>>>>>> IPIQ[1]: head = 0x0, tail = 0x0, depth = 0
>>>>>>>> IPIQ[2]: head = 0x0, tail = 0x0, depth = 0
>>>>>>>> IPIQ[3]: head = 0x0, tail = 0x0, depth = 0
>>>>>>>> IPIQ[4]: head = 0x0, tail = 0x0, depth = 0
>>>>>>>> IPIQ[5]: head = 0x0, tail = 0x0, depth = 0
>>>>>>>> IPIQ[6]: head = 0x0, tail = 0x0, depth = 0
>>>>>>>> IPIQ[7]: head = 0x0, tail = 0x0, depth = 0
>>>>>>>> 0 Recoveries of "stolen" FPU
>>>>>>>> ===========================
>>>>>>>>
>>>>>>>> ################################################################
>>>>>>>>
>>>>>>>> Thanks
>>>>>>>> Anoop
>>>>>>>>
>>>>>>>> On Tue, 2010-12-28 at 00:43 -0800, Kevin D. Kissell wrote:
>>>>>>>>> I took a quick look last night, and the only thing that looked vaguely
>>>>>>>>> dangerous in changes since the timer changes I alluded to earlier was
>>>>>>>>> the global naming cleanup of irq-related function names that David
>>>>>>>>> Howell submitted. The diff didn't look dangerous in itself, but some of
>>>>>>>>> the definitions are nested subtly for SMTC to maximize the amount of
>>>>>>>>> common code, and I could imagine something getting lost in translation
>>>>>>>>> there. If that were really the problem, it would of course affect much
>>>>>>>>> more than just the timer subsystem, but early in the boot process,
>>>>>>>>> timers are pretty much the only interrupts that have to be handled
>>>>>>>>> correctly.
>>>>>>>>>
>>>>>>>>> I'm travelling today, but will take a look at timekeeping_notify()
>>>>>>>>> tomorrow or the next day...
>>>>>>>>>
>>>>>>>>> /K.
>>>>>>>>>
>>>>>>>>> On 12/28/10 12:19 AM, Anoop P A wrote:
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> I had a glance into the code diff without notice of any suspect-able
>>>>>>>>>> code .
>>>>>>>>>> Tracing the hang showed that it is getting hanged in timekeeping_notify
>>>>>>>>>> function.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Anoop
>>>>>>>>>>
>>>>>>>>>> PS: I may not be available until Thursday
>>>>>>>>>>
>>>>>>>>>> On Mon, 2010-12-27 at 22:49 +0530, Anoop P A wrote:
>>>>>>>>>>> Hi Kevin,
>>>>>>>>>>>
>>>>>>>>>>> It is very unlikely that the patch you pointed has any impact on the the
>>>>>>>>>>> hang I am seeing. The patch you have mentioned got into kernel around
>>>>>>>>>>> 2.6.32 timeframe. I am able to boot both 2.6.32 and 2.6.33 kernel ( +
>>>>>>>>>>> stackframe patch) .
>>>>>>>>>>>
>>>>>>>>>>> Hi Stuart,
>>>>>>>>>>>
>>>>>>>>>>> I haven't got much time to spend on this today.
>>>>>>>>>>>
>>>>>>>>>>> I had got 2.6.36-stable(+ stack frame patch) booting last day and I have
>>>>>>>>>>> observed hang issue with 2.6.37-rc1 ( Same as rc6 and current git head)
>>>>>>>>>>>
>>>>>>>>>>> So probably some patches in 2.6.37 branch introduced this hang.
>>>>>>>>>>>
>>>>>>>>>>> Hopefully I will get some free slot tomorrow so that I can look into
>>>>>>>>>>> code diff .
>>>>>>>>>>>
>>>>>>>>>>> Thanks
>>>>>>>>>>> Anoop
>>>>>>>>>>>
>>>>>>>>>>> On Mon, 2010-12-27 at 09:49 -0600, STUART VENTERS wrote:
>>>>>>>>>>>> Kevin,
>>>>>>>>>>>>
>>>>>>>>>>>> Outstanding, sometimes it's better to be lucky than good.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Anoop,
>>>>>>>>>>>>
>>>>>>>>>>>> Maybe we can get lucky again.
>>>>>>>>>>>>
>>>>>>>>>>>> If you can isolate the .33 works/.37 works_not bug to a specific pair of versions,
>>>>>>>>>>>> I'll be happy to do another diff.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Hope you'll have had a good Christmas as well.
>>>>>>>>>>>> We've had snow in Alabama since Christmas eve!
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Regards,
>>>>>>>>>>>>
>>>>>>>>>>>> Stuart
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>> From: Kevin D. Kissell [mailto:kevink@paralogos.com]
>>>>>>>>>>>> Sent: Friday, December 24, 2010 5:34 PM
>>>>>>>>>>>> To: Anoop P A
>>>>>>>>>>>> Cc: STUART VENTERS; Anoop P.A.; linux-mips@linux-mips.org
>>>>>>>>>>>> Subject: Re: SMTC support status in latest git head.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Ah, well, at least we have a stackframe.h fix that preserves David's
>>>>>>>>>>>> performance tweak for the deeper pipelined processors. In looking for
>>>>>>>>>>>> this, I did notice that someone did some modification to the SMTC clock
>>>>>>>>>>>> tick logic that I was skeptical had ever been tested. If you've still
>>>>>>>>>>>> got that kernel binary handy, you might check to see if it boots with
>>>>>>>>>>>> maxtcs=1 maxvpes=1, maxtcs=2 maxvpes=1, and/or maxtcs=2 maxvpes=2.
>>>>>>>>>>>>
>>>>>>>>>>>> Oh, yes, and Merry Christmas one and all!
>>>>>>>>>>>>
>>>>>>>>>>>> Regards,
>>>>>>>>>>>>
>>>>>>>>>>>> Kevin K.
>>>>>>>>>>>>
>>>>>>>>>>>> On 12/24/10 8:02 AM, Anoop P A wrote:
>>>>>>>>>>>>> On Fri, 2010-12-24 at 06:53 -0800, Kevin D. Kissell wrote:
>>>>>>>>>>>>>> Excellent! Now, does the attached patch (relative to 2.6.37.11) also
>>>>>>>>>>>>>> fix things, while preserving the other fixes and performance enhancements?
>>>>>>>>>>>>>>
>>>>>>>>>>>>> I have tested that patch with 2.6.37 branch it well passes calibration
>>>>>>>>>>>>> loop but hangs after switching to mips closource
>>>>>>>>>>>>>
>>>>>>>>>>>>> TC 6 going on-line as CPU 6
>>>>>>>>>>>>> Brought up 7 CPUs
>>>>>>>>>>>>> bio: create slab<bio-0> at 0
>>>>>>>>>>>>> SCSI subsystem initialized
>>>>>>>>>>>>> Switching to clocksource MIPS
>>>>>>>>>>>>>
>>>>>>>>>>>>> I Presume this is a different issue as restoring older file didn't help
>>>>>>>>>>>>> much to get rid of this hang.
>>>>>>>>>>>>>
>>>>>>>>>>>>> diff --git a/arch/mips/include/asm/stackframe.h
>>>>>>>>>>>>> b/arch/mips/include/asm/stackframe.h
>>>>>>>>>>>>> index 58730c5..7fc9f10 100644
>>>>>>>>>>>>> --- a/arch/mips/include/asm/stackframe.h
>>>>>>>>>>>>> +++ b/arch/mips/include/asm/stackframe.h
>>>>>>>>>>>>> @@ -195,9 +195,9 @@
>>>>>>>>>>>>> * to cover the pipeline delay.
>>>>>>>>>>>>> */
>>>>>>>>>>>>> .set mips32
>>>>>>>>>>>>> - mfc0 v1, CP0_TCSTATUS
>>>>>>>>>>>>> + mfc0 v0, CP0_TCSTATUS
>>>>>>>>>>>>> .set mips0
>>>>>>>>>>>>> - LONG_S v1, PT_TCSTATUS(sp)
>>>>>>>>>>>>> + LONG_S v0, PT_TCSTATUS(sp)
>>>>>>>>>>>>> #endif /* CONFIG_MIPS_MT_SMTC */
>>>>>>>>>>>>> LONG_S $4, PT_R4(sp)
>>>>>>>>>>>>> LONG_S $5, PT_R5(sp)
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>> /K.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 12/24/10 6:39 AM, Anoop P A wrote:
>>>>>>>>>>>>>>> Hi Kevin, Stuart ,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Woohooo You guys spotted !.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> http://git.linux-mips.org/?p=linux.git;a=commit;h=d5ec6e3c seems to be
>>>>>>>>>>>>>>> the culprit
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Once I restored previous version of stackframe.h 2.6.33-stable started
>>>>>>>>>>>>>>> booting !.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>> Anoop
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Fri, 2010-12-24 at 04:32 -0800, Kevin D. Kissell wrote:
>>>>>>>>>>>>>>>> Thank you, Stuart! I've spotted some definite breakage to SMTC between
>>>>>>>>>>>>>>>> those versions. In arch/mips/include/asm/stackframe.h, someone moved
>>>>>>>>>>>>>>>> the store of the Status register value in SAVE_SOME (line 169 or 204,
>>>>>>>>>>>>>>>> depending on the version) from two instructions after the mfc0 to a
>>>>>>>>>>>>>>>> point after the #ifdef for SMTC, presumably to get better pipelining of
>>>>>>>>>>>>>>>> the register access. Unfortunately, the v1 register is also used in the
>>>>>>>>>>>>>>>> SMTC-specific fragment to save TCStatus, so the Status value gets
>>>>>>>>>>>>>>>> clobbered before it gets stored. This will eventually result in the
>>>>>>>>>>>>>>>> Status register getting a TCStatus value, which has some bits on common,
>>>>>>>>>>>>>>>> but isn't identical and sooner or later Bad Things will happen.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I'm a little surprised this wasn't caught by visual inspection of the patch.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Possible solutions would include reverting the store of the CP0_STATUS
>>>>>>>>>>>>>>>> value to the block above the #ifdef, or, to retain whatever performance
>>>>>>>>>>>>>>>> advantage was obtained by moving the store downward, to use v0/$2
>>>>>>>>>>>>>>>> instead of v1/$3, as the staging register for the TCStatus value. I'd
>>>>>>>>>>>>>>>> lean toward the second option, but I'm not in a position to test and
>>>>>>>>>>>>>>>> submit a patch just now.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Kevin K.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On 12/23/10 1:09 PM, STUART VENTERS wrote:
>>>>>>>>>>>>>>>>> Kevin,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I'm not sure if it's useful,
>>>>>>>>>>>>>>>>> but finally I got the time to look at the two kernel versions Anoop pointed out.
>>>>>>>>>>>>>>>>> works 2.6.32-stable with patch 804
>>>>>>>>>>>>>>>>> works_not 2.6.33-stable
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> greping for files with CONFIG_MIPS_MT_SMTC
>>>>>>>>>>>>>>>>> and looking for timer interrupt related stuff found the following differences:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> arch/mips/include/asm/irq.h
>>>>>>>>>>>>>>>>> arch/mips/kernel/irq.c
>>>>>>>>>>>>>>>>> do_IRQ
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> arch/mips/include/asm/stackframe.h
>>>>>>>>>>>>>>>>> SAVE_SOME SAVE_TEMP get/set_saved_sp
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> arch/mips/include/asm/time.h
>>>>>>>>>>>>>>>>> clocksource_set_clock
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> arch/mips/kernel/process.c
>>>>>>>>>>>>>>>>> cpu_idle
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> arch/mips/kernel/smtc.c
>>>>>>>>>>>>>>>>> __irq_entry
>>>>>>>>>>>>>>>>> ipi_decode
>>>>>>>>>>>>>>>>> SMTC_CLOCK_TICK
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Enclosed are the two subsets of files for a more expert look.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I'll try to look in more detail after Christmas.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Stuart
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>
>
* Re: SMTC support status in latest git head.
2011-01-04 13:02 ` Anoop P A
2011-01-04 14:37 ` Anoop P A
@ 2011-01-04 17:40 ` Kevin D. Kissell
2011-01-05 13:09 ` Anoop P A
1 sibling, 1 reply; 68+ messages in thread
From: Kevin D. Kissell @ 2011-01-04 17:40 UTC (permalink / raw)
To: Anoop P A; +Cc: STUART VENTERS, Anoop P.A., linux-mips
On 01/04/11 05:02, Anoop P A wrote:
> On Tue, 2011-01-04 at 00:17 -0800, Kevin D. Kissell wrote:
>> Those interrupt counters show that IPIs are being taken everywhere,
>> though very few by CPUs 5 and 6. If I understand the configuration
>> correctly, CPU 4 is a TC in VPE 1, and it's getting a reasonable IPI
> Yes CPU4 is in second VPE
>
>> rate, *if* we're looking at a tickless kernel under low load. But there
> No it was not the tickless kernel.I had selected 250 MHz timer. can't we
> expect IPI / timer interrupt for all the threads in this case ?.
In that case, you should expect a distribution of timer interrupts that
favors the low-numbered TCs within the VPE, as you do in VPE0, and a
distribution of IPIs that is sort-of the inverse, as you do in VPE0.
But the low counts on VPE1 are indeed suspicious, as you note.
>> may be a clue there to part of your problem. I have no idea why the
>> behavior would have changed from 2.6.36 to 2.6.37, but it looks as if
>> you're getting your clock interrupts through the MSP CIC interrupt
>> controller on VPE 0. There's nothing symmetric for VPE1. The Malta
>> example code is perhaps deceptively simple, in that both VPEs have their
>> count/compare indication wired directly to the 2 clock interrupt inputs,
>> so that having both of them running with only a single set of irq state
>> just works. I don't know whether the MSP CIC timer interrupt is a
> In my case it is separate irq. MSP_INT_VPE1_TIMER (34) and
> MSP_INT_VPE0_TIMER (25) are wired to CIC . CIC interrupt has been
> connected to cpu irq 6.
>
> I can reproduce cpu stall in VSMP mode If I don't setup VPE1 timer
> interrupt . Don't we have support for separate irq in SMTC
> implementation ?..
There are hooks for platform-specific SMTC support, which is implemented
for the Malta in arch/mips/mti-malta/malta-smtc.c. See
msmtc_init_secondary(), for example, where the clock/compare, profile,
and IPI interrupts are armed for VPE 1, while I/O peripheral interrupts
are inhibited.
>> gating of the VPE0 count/compare output, or whether it's it's own
>> interval timer, but I suspect that you may need to do some further
>> low-level initialization in the platform-specific code to set up an
>> interrupt on the VPE1 side. I don't think the snippet you've got below
>> would work as written.
> The routine which I copied works fine for VSMP mode .
>
> / # cat /proc/interrupts
> CPU0 CPU1
> 0: 187 254 MIPS IPI_resched
> 1: 77 174 MIPS IPI_call
> 6: 0 0 MIPS MSP CIC cascade
> 8: 0 0 MSP_CIC Softreset button
> 9: 0 0 MSP_CIC Standby switch
> 21: 0 0 MSP_CIC MSP PER cascade
> 25: 37077 0 MSP_CIC timer
> 27: 188 0 MSP_CIC serial
> 34: 0 36986 MSP_CIC timer
>
> Do I want to change anything specific for SMTC ? .
If it works (which I doubt), then we can critique stylistic points like
using
if (1==get_current_vpe())
instead of the more readable and general
if (get_current_vpe() > 0)
But I think you're generally looking in the wrong place. Look at the
Malta code and see what's done where. The initial SMTC code had a lot
of Malta assumptions in the main line that I pushed out to platform code
in later patches. I can see how things could be made even more modular,
but for the moment I think it's just that there's some stuff that ought
to be done in a "msp_smtc.c" file that doesn't exist in 2.6.37.
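To make the stylistic point concrete, here is a minimal standalone sketch of
the IRQ selection written with the more general test. It is not a proposed
fix and it is not kernel code: toy_get_current_vpe() and toy_setup_irq() are
hypothetical stand-ins for get_current_vpe() and setup_irq(), and the IRQ
numbers are the ones quoted earlier in this thread. It says nothing about
whether get_c0_compare_int() is the right place for this setup.

```c
#include <stdbool.h>

/* IRQ numbers as quoted in this thread. */
#define MSP_INT_VPE0_TIMER 25
#define MSP_INT_VPE1_TIMER 34

/* Hypothetical stand-ins so the sketch builds outside the kernel. */
static unsigned int toy_current_vpe;   /* pretend "current VPE" */
static unsigned int toy_setup_calls;   /* counts IRQ installations */

static unsigned int toy_get_current_vpe(void)
{
	return toy_current_vpe;
}

static int toy_setup_irq(unsigned int irq)
{
	(void)irq;
	toy_setup_calls++;
	return 0;
}

static bool vpe1_timer_installed;

/* Picks the timer IRQ for the calling VPE, installing VPE1's only once. */
unsigned int toy_get_c0_compare_int(void)
{
	if (toy_get_current_vpe() > 0) {
		if (!vpe1_timer_installed) {
			toy_setup_irq(MSP_INT_VPE1_TIMER);
			vpe1_timer_installed = true;
		}
		return MSP_INT_VPE1_TIMER;
	}
	return MSP_INT_VPE0_TIMER;
}
```

Splitting the VPE test from the install-once test keeps each condition
readable on its own, and the "> 0" form covers any non-zero VPE number
rather than exactly 1.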
Regards,
Kevin K.
>
>
>> If it's purely an issue with clock distribution on VPE1, then a boot
>> with maxvpes=1 maxtcs=4 should be stable.
> Yes the kernel seems to be stable if I boot with maxvpes=1 maxtcs=4 .
>
>> /K.
>>
>> On 1/3/2011 11:20 AM, Anoop P A wrote:
>>> Hi Kevin,
>>>
>>> On Mon, 2011-01-03 at 08:14 -0800, Kevin D. Kissell wrote:
>>>> The very first SMTC implementations didn't support full kernel-mode
>>>> preemption, which anyway wasn't a priority, given the hardware event
>>>> response support in MIPS MT. I believe it was later made compatible,
>>>> but it was never extensively exercised. Since SMTC has fingers in some
>>>> pretty low-level atomicity mechanisms, if a new, parallel set was
>>>> implemented for RCU, I can easily imagine that nobody has yet
>>>> implemented SMTC-ified variants of that set.
>>>>
>>>> Your last statement isn't very clear, though. Are you saying that if
>>>> you configure for no forced preemption and with TREE_CPU, the 2.6.37
>>>> kernel boots all the way up, or that it simply hangs later? What's the
>>>> last rev kernel that actually boots all the way up?
>>> I have debugged this a bit more. It seems that kernel getting stalled
>>> while executing on TC's of second VPE .
>>> INFO: rcu_sched_state detected stalls on CPUs/tasks: { 4 5 6} (detected
>>> by 1, t=2504 jiffies)
>>> INFO: rcu_sched_state detected stalls on CPUs/tasks: { 4 5 6} (detected
>>> by 1, t=10036 jiffies)
>>> INFO: rcu_sched_state detected stalls on CPUs/tasks: { 4 5 6} (detected
>>> by 1, t=17568 jiffies)
>>> INFO: rcu_sched_state detected stalls on CPUs/tasks: { 4 5 6} (detected
>>> by 1, t=25100 jiffies)
>>> INFO: rcu_sched_state detected stalls on CPUs/tasks: { 4 5 6} (detected
>>> by 1, t=32632 jiffies)
>>> INFO: rcu_sched_state detected stalls on CPUs/tasks: { 4 5 6} (detected
>>> by 1, t=40164 jiffies)
>>>
>>> With CONFIG_TREE_CPU we were not hitting this scenario very often.
>>> However with CONFIG_PREEMPT_TREE_CPU stall happens most of the time.
>>>
>>> I presume some issue in my timer setup . I am not seeing timer interrupt
>>> (or IPI interrupt) getting incremented for VPE1 tcs on a completely
>>> booted 2.6.32-stable kernel.
>>>
>>> / # cat /proc/interrupts
>>>         CPU0    CPU1    CPU2    CPU3    CPU4    CPU5    CPU6
>>>   1:     148   15023   15140   15093    3779       8       2  MIPS SMTC_IPI
>>>   6:       0       0       0       0       0       0       0  MIPS MSP CIC cascade
>>>   8:       0       0       0       0       0       0       0  MSP_CIC Softreset button
>>>   9:       0       0       0       0       0       0       0  MSP_CIC Standby switch
>>>  21:       0       0       0       0       0       0       0  MSP_CIC MSP PER cascade
>>>  25:   15113     341       4       7       0       0       0  MSP_CIC timer
>>>  27:     260       9       0       1       0       0       0  MSP_CIC serial
>>>  34:       0       0       0       0       0       0       0  MSP_CIC timer
>>>
>>> Can't we use separate timer interrupts for VPE1 and VPE0 in SMTC ?.
>>>
>>> I have tried setting up VPE1 timer from get_co_compare_int as follows
>>>
>>> unsigned int __cpuinit get_c0_compare_int(void)
>>> {
>>> if ((1==get_current_vpe())&& !vpe1_timr_installed){
>>>
>>> memcpy(&timer_vpe1,&c0_compare_irqaction,sizeof(timer_vpe1));
>>>
>>> setup_irq(MSP_INT_VPE1_TIMER,&timer_vpe1);
>>> vpe1_timr_installed++;
>>> }
>>> return (get_current_vpe() ? MSP_INT_VPE1_TIMER :
>>> MSP_INT_VPE0_TIMER);
>>> }
>>>
>>> Thanks
>>> Anoop
* Re: SMTC support status in latest git head.
2011-01-04 17:21 ` Kevin D. Kissell
@ 2011-01-04 17:54 ` Anoop P A
2011-01-04 18:33 ` Kevin D. Kissell
0 siblings, 1 reply; 68+ messages in thread
From: Anoop P A @ 2011-01-04 17:54 UTC (permalink / raw)
To: Kevin D. Kissell; +Cc: STUART VENTERS, Anoop P.A., linux-mips
On Tue, 2011-01-04 at 09:21 -0800, Kevin D. Kissell wrote:
> I'm trying to figure out a reason why your change below should help, and
> offhand, modulo tool bugs, I don't see it. I'm assuming that your diff
> below is a diff relative to the pre-patch stackframe.h. I wouldn't
Yes, the patch was created against stock code.
> bless it as an alternative because it moves code and comments
> unnecessarily - all you should really have to do is to move the
>
>
> 190 mfc0 v1, CP0_STATUS
> 191 LONG_S $2, PT_R2(sp)
>
> to be just after the #endif /* CONFIG_MIPS_MT_SMTC */ at around line 201.
Actually, I just moved the code under CONFIG_MIPS_MT_SMTC to before the
preceding block of code (the one which stores $0). git diff did the rest on
my behalf :)
>
> If moving the save of zero to PT_R0(sp) actually makes a difference,
> it's evidence that you've got problems in your toolchain (or, heaven
> forbid, your pipeline)!
In the previous version of the patch, the use of v0 was creating the issue. I
have verified this with the previous version of the code (the working code
from before David's instruction rearrangement patch).
>
> But I'd really like to see what your assembler is doing to the original
> patch for it to be broken. Assembler instruction reordering is armed,
> but it ought not to move register moves and stores around in ways that
> would cause your sequence
>
> 197 .set mips32
> 198 mfc0 v1, CP0_TCSTATUS
> 199 .set mips0
> 200 LONG_S v1, PT_TCSTATUS(sp)
> 189 LONG_S $0, PT_R0(sp)
> 190 mfc0 v1, CP0_STATUS
> 191 LONG_S $2, PT_R2(sp)
> 202 LONG_S $4, PT_R4(sp)
> 203 LONG_S $5, PT_R5(sp)
> 204 LONG_S v1, PT_STATUS(sp)
>
> to work while
>
> 189 LONG_S $0, PT_R0(sp)
> 190 mfc0 v1, CP0_STATUS
> 191 LONG_S $2, PT_R2(sp)
> 197 .set mips32
> 198 mfc0 v0, CP0_TCSTATUS
> 199 .set mips0
> 200 LONG_S v0, PT_TCSTATUS(sp)
> 202 LONG_S $4, PT_R4(sp)
> 203 LONG_S $5, PT_R5(sp)
> 204 LONG_S v1, PT_STATUS(sp)
>
> does not, provided that the identity of v0=$2, v1=$3 is respected.
>
> One thing that does stick out as being different - though, again, I'd
> need to see the disassembly of an instance of the macro to know what it
> could have done - is that the SMTC conditional code brackets the mfc0 of
> TCStatus with .set mips32/.set mips0. Given that the code no longer has
> a .set mips0 early in the macro, it would be more correct to make it:
>
> .set push
> .set mips32
> mfc0 v0, CP0_TCSTATUS (or v1, if we move the mfc0 v1,CP0_STATUS)
> .set pop
>
> and presumably make a similar change for the block from line 334 to 429.
>
> But I don't see any causal path from that funniness to failure.
>
> Regards,
>
> Kevin K.
>
> On 01/04/11 06:37, Anoop P A wrote:
> > Hi Kevin,
> >
> > The stackframe patch that you suggested had some side effects: I was
> > unable to execute init. When I changed it as shown below, it started
> > working. Could you kindly review it?
> >
> > diff --git a/arch/mips/include/asm/stackframe.h
> > b/arch/mips/include/asm/stackframe.h
> > index 58730c5..da786ed 100644
> > --- a/arch/mips/include/asm/stackframe.h
> > +++ b/arch/mips/include/asm/stackframe.h
> > @@ -181,14 +181,6 @@
> > #endif
> > LONG_S k0, PT_R29(sp)
> > LONG_S $3, PT_R3(sp)
> > - /*
> > - * You might think that you don't need to save $0,
> > - * but the FPU emulator and gdb remote debug stub
> > - * need it to operate correctly
> > - */
> > - LONG_S $0, PT_R0(sp)
> > - mfc0 v1, CP0_STATUS
> > - LONG_S $2, PT_R2(sp)
> > #ifdef CONFIG_MIPS_MT_SMTC
> > /*
> > * Ideally, these instructions would be shuffled in
> > @@ -199,6 +191,14 @@
> > .set mips0
> > LONG_S v1, PT_TCSTATUS(sp)
> > #endif /* CONFIG_MIPS_MT_SMTC */
> > + /*
> > + * You might think that you don't need to save $0,
> > + * but the FPU emulator and gdb remote debug stub
> > + * need it to operate correctly
> > + */
> > + LONG_S $0, PT_R0(sp)
> > + mfc0 v1, CP0_STATUS
> > + LONG_S $2, PT_R2(sp)
> > LONG_S $4, PT_R4(sp)
> > LONG_S $5, PT_R5(sp)
> > LONG_S v1, PT_STATUS(sp)
> >
> > Linux-2.6.37-rc7 boots all the way if I specify maxvpes=1 on the command
> > line.
> >
> > / # cat /proc/interrupts
> >         CPU0    CPU1    CPU2    CPU3    CPU4    CPU5    CPU6
> >   1:     249  218024  218286  218263  218235  218208  218179  MIPS     SMTC_IPI
> >   6:       0       0       0       0       0       0       0  MIPS     MSP CIC cascade
> >   8:       0       0       0       0       0       0       0  MSP_CIC  Softreset button
> >   9:       0       0       0       0       0       0       0  MSP_CIC  Standby switch
> >  21:       0       0       0       0       0       0       0  MSP_CIC  MSP PER cascade
> >  25:  218128     711      11       0       0       0       0  MSP_CIC  timer
> >  27:     341      22       0       0       2       0       6  MSP_CIC  serial
> >
> > ERR: 0
> > / # uname -a
> > Linux (none) 2.6.37-rc7-pmc-00001-g9cff2d6-dirty #289 SMP PREEMPT Tue
> > Jan 4 19:48:31 IST 2011 mips GNU/Linux
> >
> > So the clock setup / distribution on VPE1 is something that needs fixing.
> >
> > Thanks
> > Anoop
> >
> >
> > On Tue, 2011-01-04 at 18:32 +0530, Anoop P A wrote:
> >> On Tue, 2011-01-04 at 00:17 -0800, Kevin D. Kissell wrote:
> >>> Those interrupt counters show that IPIs are being taken everywhere,
> >>> though very few by CPUs 5 and 6. If I understand the configuration
> >>> correctly, CPU 4 is a TC in VPE 1, and it's getting a reasonable IPI
> >> Yes, CPU4 is in the second VPE.
> >>
> >>> rate, *if* we're looking at a tickless kernel under low load. But there
> >> No, it was not a tickless kernel. I had selected the 250 MHz timer. Can't we
> >> expect IPI / timer interrupts for all the threads in this case?
> >>
> >>> may be a clue there to part of your problem. I have no idea why the
> >>> behavior would have changed from 2.6.36 to 2.6.37, but it looks as if
> >>> you're getting your clock interrupts through the MSP CIC interrupt
> >>> controller on VPE 0. There's nothing symmetric for VPE1. The Malta
> >>> example code is perhaps deceptively simple, in that both VPEs have their
> >>> count/compare indication wired directly to the 2 clock interrupt inputs,
> >>> so that having both of them running with only a single set of irq state
> >>> just works. I don't know whether the MSP CIC timer interrupt is a
> >> In my case they are separate IRQs. MSP_INT_VPE1_TIMER (34) and
> >> MSP_INT_VPE0_TIMER (25) are wired to the CIC. The CIC interrupt is
> >> connected to CPU irq 6.
> >>
> >> I can reproduce the CPU stall in VSMP mode if I don't set up the VPE1 timer
> >> interrupt. Don't we have support for separate IRQs in the SMTC
> >> implementation?
> >>
> >>> gating of the VPE0 count/compare output, or whether it's its own
> >>> interval timer, but I suspect that you may need to do some further
> >>> low-level initialization in the platform-specific code to set up an
> >>> interrupt on the VPE1 side. I don't think the snippet you've got below
> >>> would work as written.
> >> The routine which I copied works fine in VSMP mode.
> >>
> >> / # cat /proc/interrupts
> >> CPU0 CPU1
> >> 0: 187 254 MIPS IPI_resched
> >> 1: 77 174 MIPS IPI_call
> >> 6: 0 0 MIPS MSP CIC cascade
> >> 8: 0 0 MSP_CIC Softreset button
> >> 9: 0 0 MSP_CIC Standby switch
> >> 21: 0 0 MSP_CIC MSP PER cascade
> >> 25: 37077 0 MSP_CIC timer
> >> 27: 188 0 MSP_CIC serial
> >> 34: 0 36986 MSP_CIC timer
> >>
> >> Do I need to change anything specific for SMTC?
> >>
> >>> If it's purely an issue with clock distribution on VPE1, then a boot
> >>> with maxvpes=1 maxtcs=4 should be stable.
> >> Yes, the kernel seems to be stable if I boot with maxvpes=1 maxtcs=4.
> >>
> >>> /K.
> >>>
> >>> On 1/3/2011 11:20 AM, Anoop P A wrote:
> >>>> Hi Kevin,
> >>>>
> >>>> On Mon, 2011-01-03 at 08:14 -0800, Kevin D. Kissell wrote:
> >>>>> The very first SMTC implementations didn't support full kernel-mode
> >>>>> preemption, which anyway wasn't a priority, given the hardware event
> >>>>> response support in MIPS MT. I believe it was later made compatible,
> >>>>> but it was never extensively exercised. Since SMTC has fingers in some
> >>>>> pretty low-level atomicity mechanisms, if a new, parallel set was
> >>>>> implemented for RCU, I can easily imagine that nobody has yet
> >>>>> implemented SMTC-ified variants of that set.
> >>>>>
> >>>>> Your last statement isn't very clear, though. Are you saying that if
> >>>>> you configure for no forced preemption and with TREE_RCU, the 2.6.37
> >>>>> kernel boots all the way up, or that it simply hangs later? What's the
> >>>>> last rev kernel that actually boots all the way up?
> >>>> I have debugged this a bit more. It seems that the kernel is getting stalled
> >>>> while executing on the TCs of the second VPE.
> >>>> INFO: rcu_sched_state detected stalls on CPUs/tasks: { 4 5 6} (detected
> >>>> by 1, t=2504 jiffies)
> >>>> INFO: rcu_sched_state detected stalls on CPUs/tasks: { 4 5 6} (detected
> >>>> by 1, t=10036 jiffies)
> >>>> INFO: rcu_sched_state detected stalls on CPUs/tasks: { 4 5 6} (detected
> >>>> by 1, t=17568 jiffies)
> >>>> INFO: rcu_sched_state detected stalls on CPUs/tasks: { 4 5 6} (detected
> >>>> by 1, t=25100 jiffies)
> >>>> INFO: rcu_sched_state detected stalls on CPUs/tasks: { 4 5 6} (detected
> >>>> by 1, t=32632 jiffies)
> >>>> INFO: rcu_sched_state detected stalls on CPUs/tasks: { 4 5 6} (detected
> >>>> by 1, t=40164 jiffies)
> >>>>
> >>>> With CONFIG_TREE_RCU we were not hitting this scenario very often.
> >>>> However, with CONFIG_TREE_PREEMPT_RCU the stall happens most of the time.
> >>>>
> >>>> I presume there is some issue in my timer setup. I am not seeing the timer
> >>>> interrupt (or IPI) counts incrementing for the VPE1 TCs on a completely
> >>>> booted 2.6.32-stable kernel.
> >>>>
> >>>> / # cat /proc/interrupts
> >>>>         CPU0    CPU1    CPU2    CPU3    CPU4    CPU5    CPU6
> >>>>   1:     148   15023   15140   15093    3779       8       2  MIPS     SMTC_IPI
> >>>>   6:       0       0       0       0       0       0       0  MIPS     MSP CIC cascade
> >>>>   8:       0       0       0       0       0       0       0  MSP_CIC  Softreset button
> >>>>   9:       0       0       0       0       0       0       0  MSP_CIC  Standby switch
> >>>>  21:       0       0       0       0       0       0       0  MSP_CIC  MSP PER cascade
> >>>>  25:   15113     341       4       7       0       0       0  MSP_CIC  timer
> >>>>  27:     260       9       0       1       0       0       0  MSP_CIC  serial
> >>>>  34:       0       0       0       0       0       0       0  MSP_CIC  timer
> >>>>
> >>>> Can't we use separate timer interrupts for VPE1 and VPE0 in SMTC?
> >>>>
> >>>> I have tried setting up the VPE1 timer from get_c0_compare_int as follows:
> >>>>
> >>>> unsigned int __cpuinit get_c0_compare_int(void)
> >>>> {
> >>>>     if ((1==get_current_vpe())&& !vpe1_timr_installed){
> >>>>         memcpy(&timer_vpe1,&c0_compare_irqaction,sizeof(timer_vpe1));
> >>>>         setup_irq(MSP_INT_VPE1_TIMER,&timer_vpe1);
> >>>>         vpe1_timr_installed++;
> >>>>     }
> >>>>     return (get_current_vpe() ? MSP_INT_VPE1_TIMER :
> >>>>             MSP_INT_VPE0_TIMER);
> >>>> }
> >>>>
> >>>> Thanks
> >>>> Anoop
> >>>>
> >>>>> Regards,
> >>>>>
> >>>>> Kevin K.
> >>>>>
> >>>>> On 1/3/2011 7:12 AM, Anoop P A wrote:
> >>>>>> Hi ,
> >>>>>>
> >>>>>> The following patch restricts the TREE_RCU implementation to !PREEMPT
> >>>>>> SMP kernels only.
> >>>>>> http://git.linux-mips.org/?p=linux.git;a=commit;h=687d7a960aea46e016182c7ce346d62c4dbd0366
> >>>>>>
> >>>>>> The CONFIG_TREE_PREEMPT_RCU option does not seem to work for an SMTC kernel
> >>>>>> (and it will be the only RCU implementation available for SMTC kernels from
> >>>>>> 2.6.37 onwards).
> >>>>>>
> >>>>>> With no forced preemption and TREE_RCU selected, I am able to boot
> >>>>>> further, up to the hang that I have reported.
> >>>>>>
> >>>>>> Thanks
> >>>>>> Anoop
> >>>>>>
> >>>>>> On Sat, 2011-01-01 at 00:42 -0800, Kevin D. Kissell wrote:
> >>>>>>> At this point the logical thing to do would seem to look at your kernel
> >>>>>>> image and disassemble smtc_ipi_replay(), which is where the EPC of VPE 0
> >>>>>>> shows the last exception to have been taken. That's a critical SMTC
> >>>>>>> routine that gets called whenever an xxx_irq_restore() enables
> >>>>>>> interrupts, so that virtual per-TC IPI interrupts that were posted while
> >>>>>>> the TC had interrupts disabled can be handled deterministically. As I
> >>>>>>> mentioned in an earlier message, there was some cleanup work from David
> >>>>>>> Howell that changed a number of irq management-related function names
> >>>>>>> and prototypes across all architectures, which went into linux-mips.org
> >>>>>>> at very roughly the time of the breakage. The SMTC overlay over the irq
> >>>>>>> implementation has been pretty robust, but it's written in a perhaps
> >>>>>>> doomed attempt to be both efficient and using a maximum amount of common
> >>>>>>> code with the general case. A mechanical or semi-mechanical change
> >>>>>>> could conceivably have broken things.
> >>>>>>>
> >>>>>>> Regards,
> >>>>>>>
> >>>>>>> Kevin K.
> >>>>>>>
> >>>>>>>
> >>>>>>> On 12/31/2010 4:27 AM, Anoop P A wrote:
> >>>>>>>> Hi ,
> >>>>>>>>
> >>>>>>>> The kernel hangs on the stop_machine call. Please find the MT register dump below.
> >>>>>>>> Another important observation: even though the 2.6.33 kernel + stackframe
> >>>>>>>> patch gets past the calibration hang, I am still unable to boot into an
> >>>>>>>> initramfs root (I verified that the ramfs works with VSMP). So it looks like
> >>>>>>>> there is still some issue to fix between 2.6.32 and 2.6.33.
> >>>>>>>> ######################## Log ###########################
> >>>>>>>>
> >>>>>>>> === MIPS MT State Dump ===
> >>>>>>>> -- Global State --
> >>>>>>>> MVPControl Passed: 00000005
> >>>>>>>> MVPControl Read: 00000004
> >>>>>>>> MVPConf0 : a8008406
> >>>>>>>> -- per-VPE State --
> >>>>>>>> VPE 0
> >>>>>>>> VPEControl : 00008000
> >>>>>>>> VPEConf0 : 800f0003
> >>>>>>>> VPE0.Status : 11004201
> >>>>>>>> VPE0.EPC : 8010dc54 smtc_ipi_replay+0xcc/0x108
> >>>>>>>> VPE0.Cause : 50804000
> >>>>>>>> VPE0.Config7 : 00010000
> >>>>>>>> VPE 1
> >>>>>>>> VPEControl : 00068006
> >>>>>>>> VPEConf0 : 80cf0003
> >>>>>>>> VPE1.Status : 11008301
> >>>>>>>> VPE1.EPC : 801022a0 r4k_wait+0x20/0x40
> >>>>>>>> VPE1.Cause : 50800000
> >>>>>>>> VPE1.Config7 : 00010000
> >>>>>>>> -- per-TC State --
> >>>>>>>> TC 0 (current TC with VPE EPC above)
> >>>>>>>> TCStatus : 18102000
> >>>>>>>> TCBind : 00000000
> >>>>>>>> TCRestart : 803fa19c printk+0xc/0x30
> >>>>>>>> TCHalt : 00000000
> >>>>>>>> TCContext : 00000000
> >>>>>>>> TC 1
> >>>>>>>> TCStatus : 18902000
> >>>>>>>> TCBind : 00200000
> >>>>>>>> TCRestart : 801022a0 r4k_wait+0x20/0x40
> >>>>>>>> TCHalt : 00000000
> >>>>>>>> TCContext : 00140000
> >>>>>>>> TC 2
> >>>>>>>> TCStatus : 18902000
> >>>>>>>> TCBind : 00400000
> >>>>>>>> TCRestart : 801022a0 r4k_wait+0x20/0x40
> >>>>>>>> TCHalt : 00000000
> >>>>>>>> TCContext : 00280000
> >>>>>>>> TC 3
> >>>>>>>> TCStatus : 18902000
> >>>>>>>> TCBind : 00600000
> >>>>>>>> TCRestart : 801022a0 r4k_wait+0x20/0x40
> >>>>>>>> TCHalt : 00000000
> >>>>>>>> TCContext : 003c0000
> >>>>>>>> TC 4
> >>>>>>>> TCStatus : 18902000
> >>>>>>>> TCBind : 00800001
> >>>>>>>> TCRestart : 8010229c r4k_wait+0x1c/0x40
> >>>>>>>> TCHalt : 00000000
> >>>>>>>> TCContext : 00500000
> >>>>>>>> TC 5
> >>>>>>>> TCStatus : 18902000
> >>>>>>>> TCBind : 00a00001
> >>>>>>>> TCRestart : 8010229c r4k_wait+0x1c/0x40
> >>>>>>>> TCHalt : 00000000
> >>>>>>>> TCContext : 00640000
> >>>>>>>> TC 6
> >>>>>>>> TCStatus : 18902000
> >>>>>>>> TCBind : 00c00001
> >>>>>>>> TCRestart : 8010229c r4k_wait+0x1c/0x40
> >>>>>>>> TCHalt : 00000000
> >>>>>>>> TCContext : 00780000
> >>>>>>>> Counter Interrupts taken per CPU (TC)
> >>>>>>>> 0: 0
> >>>>>>>> 1: 0
> >>>>>>>> 2: 0
> >>>>>>>> 3: 0
> >>>>>>>> 4: 0
> >>>>>>>> 5: 0
> >>>>>>>> 6: 0
> >>>>>>>> 7: 0
> >>>>>>>> Self-IPI invocations:
> >>>>>>>> 0: 12
> >>>>>>>> 1: 0
> >>>>>>>> 2: 0
> >>>>>>>> 3: 0
> >>>>>>>> 4: 0
> >>>>>>>> 5: 5
> >>>>>>>> 6: 4
> >>>>>>>> 7: 0
> >>>>>>>> IPIQ[0]: head = 0x0, tail = 0x0, depth = 0
> >>>>>>>> IPIQ[1]: head = 0x0, tail = 0x0, depth = 0
> >>>>>>>> IPIQ[2]: head = 0x0, tail = 0x0, depth = 0
> >>>>>>>> IPIQ[3]: head = 0x0, tail = 0x0, depth = 0
> >>>>>>>> IPIQ[4]: head = 0x0, tail = 0x0, depth = 0
> >>>>>>>> IPIQ[5]: head = 0x0, tail = 0x0, depth = 0
> >>>>>>>> IPIQ[6]: head = 0x0, tail = 0x0, depth = 0
> >>>>>>>> IPIQ[7]: head = 0x0, tail = 0x0, depth = 0
> >>>>>>>> 0 Recoveries of "stolen" FPU
> >>>>>>>> ===========================
> >>>>>>>>
> >>>>>>>> ################################################################
> >>>>>>>>
> >>>>>>>> Thanks
> >>>>>>>> Anoop
> >>>>>>>>
> >>>>>>>> On Tue, 2010-12-28 at 00:43 -0800, Kevin D. Kissell wrote:
> >>>>>>>>> I took a quick look last night, and the only thing that looked vaguely
> >>>>>>>>> dangerous in changes since the timer changes I alluded to earlier was
> >>>>>>>>> the global naming cleanup of irq-related function names that David
> >>>>>>>>> Howell submitted. The diff didn't look dangerous in itself, but some of
> >>>>>>>>> the definitions are nested subtly for SMTC to maximize the amount of
> >>>>>>>>> common code, and I could imagine something getting lost in translation
> >>>>>>>>> there. If that were really the problem, it would of course affect much
> >>>>>>>>> more than just the timer subsystem, but early in the boot process,
> >>>>>>>>> timers are pretty much the only interrupts that have to be handled
> >>>>>>>>> correctly.
> >>>>>>>>>
> >>>>>>>>> I'm travelling today, but will take a look at timekeeping_notify()
> >>>>>>>>> tomorrow or the next day...
> >>>>>>>>>
> >>>>>>>>> /K.
> >>>>>>>>>
> >>>>>>>>> On 12/28/10 12:19 AM, Anoop P A wrote:
> >>>>>>>>>> Hi,
> >>>>>>>>>>
> >>>>>>>>>> I had a glance at the code diff and did not notice any suspicious
> >>>>>>>>>> code.
> >>>>>>>>>> Tracing the hang showed that it hangs in the timekeeping_notify
> >>>>>>>>>> function.
> >>>>>>>>>>
> >>>>>>>>>> Thanks,
> >>>>>>>>>> Anoop
> >>>>>>>>>>
> >>>>>>>>>> PS: I may not be available until Thursday
> >>>>>>>>>>
> >>>>>>>>>> On Mon, 2010-12-27 at 22:49 +0530, Anoop P A wrote:
> >>>>>>>>>>> Hi Kevin,
> >>>>>>>>>>>
> >>>>>>>>>>> It is very unlikely that the patch you pointed to has any impact on the
> >>>>>>>>>>> hang I am seeing. The patch you mentioned got into the kernel around the
> >>>>>>>>>>> 2.6.32 timeframe. I am able to boot both the 2.6.32 and 2.6.33 kernels (+
> >>>>>>>>>>> stackframe patch).
> >>>>>>>>>>>
> >>>>>>>>>>> Hi Stuart,
> >>>>>>>>>>>
> >>>>>>>>>>> I haven't had much time to spend on this today.
> >>>>>>>>>>>
> >>>>>>>>>>> I got 2.6.36-stable (+ stackframe patch) booting yesterday, and I have
> >>>>>>>>>>> observed the hang with 2.6.37-rc1 (same as rc6 and current git head).
> >>>>>>>>>>>
> >>>>>>>>>>> So probably some patch in the 2.6.37 branch introduced this hang.
> >>>>>>>>>>>
> >>>>>>>>>>> Hopefully I will get a free slot tomorrow so that I can look into the
> >>>>>>>>>>> code diff.
> >>>>>>>>>>>
> >>>>>>>>>>> Thanks
> >>>>>>>>>>> Anoop
> >>>>>>>>>>>
> >>>>>>>>>>> On Mon, 2010-12-27 at 09:49 -0600, STUART VENTERS wrote:
> >>>>>>>>>>>> Kevin,
> >>>>>>>>>>>>
> >>>>>>>>>>>> Outstanding, sometimes it's better to be lucky than good.
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> Anoop,
> >>>>>>>>>>>>
> >>>>>>>>>>>> Maybe we can get lucky again.
> >>>>>>>>>>>>
> >>>>>>>>>>>> If you can isolate the .33 works/.37 works_not bug to a specific pair of versions,
> >>>>>>>>>>>> I'll be happy to do another diff.
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> Hope you'll have had a good Christmas as well.
> >>>>>>>>>>>> We've had snow in Alabama since Christmas eve!
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> Regards,
> >>>>>>>>>>>>
> >>>>>>>>>>>> Stuart
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> -----Original Message-----
> >>>>>>>>>>>> From: Kevin D. Kissell [mailto:kevink@paralogos.com]
> >>>>>>>>>>>> Sent: Friday, December 24, 2010 5:34 PM
> >>>>>>>>>>>> To: Anoop P A
> >>>>>>>>>>>> Cc: STUART VENTERS; Anoop P.A.; linux-mips@linux-mips.org
> >>>>>>>>>>>> Subject: Re: SMTC support status in latest git head.
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> Ah, well, at least we have a stackframe.h fix that preserves David's
> >>>>>>>>>>>> performance tweak for the deeper pipelined processors. In looking for
> >>>>>>>>>>>> this, I did notice that someone did some modification to the SMTC clock
> >>>>>>>>>>>> tick logic that I was skeptical had ever been tested. If you've still
> >>>>>>>>>>>> got that kernel binary handy, you might check to see if it boots with
> >>>>>>>>>>>> maxtcs=1 maxvpes=1, maxtcs=2 maxvpes=1, and/or maxtcs=2 maxvpes=2.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Oh, yes, and Merry Christmas one and all!
> >>>>>>>>>>>>
> >>>>>>>>>>>> Regards,
> >>>>>>>>>>>>
> >>>>>>>>>>>> Kevin K.
> >>>>>>>>>>>>
> >>>>>>>>>>>> On 12/24/10 8:02 AM, Anoop P A wrote:
> >>>>>>>>>>>>> On Fri, 2010-12-24 at 06:53 -0800, Kevin D. Kissell wrote:
> >>>>>>>>>>>>>> Excellent! Now, does the attached patch (relative to 2.6.37.11) also
> >>>>>>>>>>>>>> fix things, while preserving the other fixes and performance enhancements?
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>> I have tested that patch with the 2.6.37 branch. It gets past the calibration
> >>>>>>>>>>>>> loop but hangs after switching to the MIPS clocksource:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> TC 6 going on-line as CPU 6
> >>>>>>>>>>>>> Brought up 7 CPUs
> >>>>>>>>>>>>> bio: create slab<bio-0> at 0
> >>>>>>>>>>>>> SCSI subsystem initialized
> >>>>>>>>>>>>> Switching to clocksource MIPS
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> I presume this is a different issue, as restoring the older file didn't help
> >>>>>>>>>>>>> to get rid of this hang.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> diff --git a/arch/mips/include/asm/stackframe.h
> >>>>>>>>>>>>> b/arch/mips/include/asm/stackframe.h
> >>>>>>>>>>>>> index 58730c5..7fc9f10 100644
> >>>>>>>>>>>>> --- a/arch/mips/include/asm/stackframe.h
> >>>>>>>>>>>>> +++ b/arch/mips/include/asm/stackframe.h
> >>>>>>>>>>>>> @@ -195,9 +195,9 @@
> >>>>>>>>>>>>> * to cover the pipeline delay.
> >>>>>>>>>>>>> */
> >>>>>>>>>>>>> .set mips32
> >>>>>>>>>>>>> - mfc0 v1, CP0_TCSTATUS
> >>>>>>>>>>>>> + mfc0 v0, CP0_TCSTATUS
> >>>>>>>>>>>>> .set mips0
> >>>>>>>>>>>>> - LONG_S v1, PT_TCSTATUS(sp)
> >>>>>>>>>>>>> + LONG_S v0, PT_TCSTATUS(sp)
> >>>>>>>>>>>>> #endif /* CONFIG_MIPS_MT_SMTC */
> >>>>>>>>>>>>> LONG_S $4, PT_R4(sp)
> >>>>>>>>>>>>> LONG_S $5, PT_R5(sp)
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> /K.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> On 12/24/10 6:39 AM, Anoop P A wrote:
> >>>>>>>>>>>>>>> Hi Kevin, Stuart ,
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Woohooo, you guys spotted it!
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> http://git.linux-mips.org/?p=linux.git;a=commit;h=d5ec6e3c seems to be
> >>>>>>>>>>>>>>> the culprit.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Once I restored the previous version of stackframe.h, 2.6.33-stable started
> >>>>>>>>>>>>>>> booting!
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Thanks,
> >>>>>>>>>>>>>>> Anoop
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> On Fri, 2010-12-24 at 04:32 -0800, Kevin D. Kissell wrote:
> >>>>>>>>>>>>>>>> Thank you, Stuart! I've spotted some definite breakage to SMTC between
> >>>>>>>>>>>>>>>> those versions. In arch/mips/include/asm/stackframe.h, someone moved
> >>>>>>>>>>>>>>>> the store of the Status register value in SAVE_SOME (line 169 or 204,
> >>>>>>>>>>>>>>>> depending on the version) from two instructions after the mfc0 to a
> >>>>>>>>>>>>>>>> point after the #ifdef for SMTC, presumably to get better pipelining of
> >>>>>>>>>>>>>>>> the register access. Unfortunately, the v1 register is also used in the
> >>>>>>>>>>>>>>>> SMTC-specific fragment to save TCStatus, so the Status value gets
> >>>>>>>>>>>>>>>> clobbered before it gets stored. This will eventually result in the
> >>>>>>>>>>>>>>>> Status register getting a TCStatus value, which has some bits in common,
> >>>>>>>>>>>>>>>> but isn't identical, and sooner or later Bad Things will happen.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> I'm a little surprised this wasn't caught by visual inspection of the patch.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Possible solutions would include reverting the store of the CP0_STATUS
> >>>>>>>>>>>>>>>> value to the block above the #ifdef, or, to retain whatever performance
> >>>>>>>>>>>>>>>> advantage was obtained by moving the store downward, to use v0/$2
> >>>>>>>>>>>>>>>> instead of v1/$3, as the staging register for the TCStatus value. I'd
> >>>>>>>>>>>>>>>> lean toward the second option, but I'm not in a position to test and
> >>>>>>>>>>>>>>>> submit a patch just now.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Regards,
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Kevin K.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> On 12/23/10 1:09 PM, STUART VENTERS wrote:
> >>>>>>>>>>>>>>>>> Kevin,
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> I'm not sure if it's useful,
> >>>>>>>>>>>>>>>>> but finally I got the time to look at the two kernel versions Anoop pointed out.
> >>>>>>>>>>>>>>>>> works 2.6.32-stable with patch 804
> >>>>>>>>>>>>>>>>> works_not 2.6.33-stable
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> grepping for files with CONFIG_MIPS_MT_SMTC
> >>>>>>>>>>>>>>>>> and looking for timer interrupt related stuff found the following differences:
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> arch/mips/include/asm/irq.h
> >>>>>>>>>>>>>>>>> arch/mips/kernel/irq.c
> >>>>>>>>>>>>>>>>> do_IRQ
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> arch/mips/include/asm/stackframe.h
> >>>>>>>>>>>>>>>>> SAVE_SOME SAVE_TEMP get/set_saved_sp
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> arch/mips/include/asm/time.h
> >>>>>>>>>>>>>>>>> clocksource_set_clock
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> arch/mips/kernel/process.c
> >>>>>>>>>>>>>>>>> cpu_idle
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> arch/mips/kernel/smtc.c
> >>>>>>>>>>>>>>>>> __irq_entry
> >>>>>>>>>>>>>>>>> ipi_decode
> >>>>>>>>>>>>>>>>> SMTC_CLOCK_TICK
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Enclosed are the two subsets of files for a more expert look.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> I'll try to look in more detail after Christmas.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Cheers,
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Stuart
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> >
> >
>
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: SMTC support status in latest git head.
2011-01-04 17:54 ` Anoop P A
@ 2011-01-04 18:33 ` Kevin D. Kissell
2011-01-05 13:11 ` Anoop P A
0 siblings, 1 reply; 68+ messages in thread
From: Kevin D. Kissell @ 2011-01-04 18:33 UTC (permalink / raw)
To: Anoop P A; +Cc: STUART VENTERS, Anoop P.A., linux-mips
On 01/04/11 09:54, Anoop P A wrote:
> On Tue, 2011-01-04 at 09:21 -0800, Kevin D. Kissell wrote:
>> I'm trying to figure out a reason why your change below should help, and
>> offhand, modulo tool bugs, I don't see it. I'm assuming that your diff
>> below is a diff relative to the pre-patch stackframe.h. I wouldn't
> Yes, the patch was created against stock code.
>
>> bless it as an alternative because it moves code and comments
>> unnecessarily - all you should really have to do is to move the
>>
>>
>> 190 mfc0 v1, CP0_STATUS
>> 191 LONG_S $2, PT_R2(sp)
>>
>> to be just after the #endif /* CONFIG_MIPS_MT_SMTC */ at around line 201.
> Actually, I just moved the code under CONFIG_MIPS_MT_SMTC to before the
> preceding block of code (the one which stores $0). git diff did the rest on
> my behalf :)
>
>> If moving the save of zero to PT_R0(sp) actually makes a difference,
>> it's evidence that you've got problems in your toolchain (or, heaven
>> forbid, your pipeline)!
> In the previous version of the patch, the use of v0 was creating the issue. I
> have verified this with the previous version of the code (the working code
> from before David's instruction rearrangement patch).
Argh. It's not very clearly commented, but it looks as if the system
call trap handler has an implicit assumption that v0 has never been
changed by SAVE_SOME, TRACE_IRQS_ON_RELOAD, or STI. So yeah, moving the
code around to fix the v1 conflict ends up being better than using v0 -
otherwise, we'd need to add a LONG_L v0, PT_R2(sp) somewhere after the
LONG_S v0, PT_TCSTATUS(sp) of the original patch.
Regards,
Kevin K.
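
For anyone following along, the three orderings discussed above can be
compared in a small host-side C model. This is only a sketch, not kernel
code: the struct and function names are made up for illustration, each C
statement stands in for the instruction named in its comment, and v0 holds
an arbitrary placeholder for whatever the trap handler expects to find
there. The Status and TCStatus values are the VPE0.Status and TC 0 TCStatus
words from the MT state dump quoted earlier in the thread.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model: only the registers and frame slots discussed in this thread. */
struct cpu   { uint32_t v0, v1, cp0_status, cp0_tcstatus; };
struct frame { uint32_t r2, status, tcstatus; };

/* Ordering after the rearrangement patch: the Status copy sits in v1
 * across the SMTC block, which reuses v1 for TCStatus. */
void save_some_v1_clobber(struct cpu *c, struct frame *f)
{
    c->v1 = c->cp0_status;    /* mfc0   v1, CP0_STATUS      */
    f->r2 = c->v0;            /* LONG_S $2, PT_R2(sp)       */
    c->v1 = c->cp0_tcstatus;  /* mfc0   v1, CP0_TCSTATUS    */
    f->tcstatus = c->v1;      /* LONG_S v1, PT_TCSTATUS(sp) */
    f->status = c->v1;        /* LONG_S v1, PT_STATUS(sp): stores TCStatus */
}

/* Staging TCStatus in v0 instead: the frame comes out right, but the live
 * v0 is lost unless it is reloaded from the frame afterwards. */
void save_some_v0_staging(struct cpu *c, struct frame *f, int reload_v0)
{
    c->v1 = c->cp0_status;    /* mfc0   v1, CP0_STATUS      */
    f->r2 = c->v0;            /* LONG_S $2, PT_R2(sp)       */
    c->v0 = c->cp0_tcstatus;  /* mfc0   v0, CP0_TCSTATUS    */
    f->tcstatus = c->v0;      /* LONG_S v0, PT_TCSTATUS(sp) */
    if (reload_v0)
        c->v0 = f->r2;        /* LONG_L v0, PT_R2(sp)       */
    f->status = c->v1;        /* LONG_S v1, PT_STATUS(sp)   */
}

/* Reordered variant: the SMTC block runs first, so v1 is free again
 * before Status is read and v0 is never written. */
void save_some_reordered(struct cpu *c, struct frame *f)
{
    c->v1 = c->cp0_tcstatus;  /* mfc0   v1, CP0_TCSTATUS    */
    f->tcstatus = c->v1;      /* LONG_S v1, PT_TCSTATUS(sp) */
    c->v1 = c->cp0_status;    /* mfc0   v1, CP0_STATUS      */
    f->r2 = c->v0;            /* LONG_S $2, PT_R2(sp)       */
    f->status = c->v1;        /* LONG_S v1, PT_STATUS(sp)   */
}

/* Returns 1 when the frame holds the entry-time values and v0 is untouched. */
int state_preserved(const struct cpu *entry, const struct cpu *now,
                    const struct frame *f)
{
    return f->r2 == entry->v0 && f->status == entry->cp0_status &&
           f->tcstatus == entry->cp0_tcstatus && now->v0 == entry->v0;
}

/* Runs all variants against the same entry state. */
void self_check(void)
{
    const struct cpu entry = { .v0 = 4001, .v1 = 0, /* v0: placeholder */
                               .cp0_status = 0x11004201,
                               .cp0_tcstatus = 0x18102000 };
    struct cpu c;
    struct frame f;

    c = entry; save_some_v1_clobber(&c, &f);
    assert(f.status == entry.cp0_tcstatus);   /* wrong word saved as Status */

    c = entry; save_some_v0_staging(&c, &f, 0);
    assert(f.status == entry.cp0_status);     /* frame is right ...         */
    assert(c.v0 != entry.v0);                 /* ... but live v0 is gone    */

    c = entry; save_some_v0_staging(&c, &f, 1);
    assert(state_preserved(&entry, &c, &f));

    c = entry; save_some_reordered(&c, &f);
    assert(state_preserved(&entry, &c, &f));
}
```

Calling self_check() from a test main() exercises all four cases. The model
says nothing about pipeline timing, which was the motivation for the
original rearrangement; it only tracks which value ends up where.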
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: SMTC support status in latest git head.
2011-01-04 17:40 ` Kevin D. Kissell
@ 2011-01-05 13:09 ` Anoop P A
0 siblings, 0 replies; 68+ messages in thread
From: Anoop P A @ 2011-01-05 13:09 UTC (permalink / raw)
To: Kevin D. Kissell; +Cc: STUART VENTERS, Anoop P.A., linux-mips
[-- Attachment #1: Type: text/plain, Size: 7962 bytes --]
On Tue, 2011-01-04 at 09:40 -0800, Kevin D. Kissell wrote:
> On 01/04/11 05:02, Anoop P A wrote:
> > On Tue, 2011-01-04 at 00:17 -0800, Kevin D. Kissell wrote:
> >> Those interrupt counters show that IPIs are being taken everywhere,
> >> though very few by CPUs 5 and 6. If I understand the configuration
> >> correctly, CPU 4 is a TC in VPE 1, and it's getting a reasonable IPI
> > Yes, CPU4 is in the second VPE.
> >
> >> rate, *if* we're looking at a tickless kernel under low load. But there
> > No, it was not a tickless kernel. I had selected the 250 MHz timer. Can't we
> > expect IPI / timer interrupts for all the threads in this case?
>
> In that case, you should expect a distribution of timer interrupts that
> favors the low-numbered TCs within the VPE, as you do in VPE0, and a
> distribution of IPIs that is sort-of the inverse, as you do in VPE0.
> But the low counts on VPE1 are indeed suspicious, as you note.
>
> >> may be a clue there to part of your problem. I have no idea why the
> >> behavior would have changed from 2.6.36 to 2.6.37, but it looks as if
> >> you're getting your clock interrupts through the MSP CIC interrupt
> >> controller on VPE 0. There's nothing symmetric for VPE1. The Malta
> >> example code is perhaps deceptively simple, in that both VPEs have their
> >> count/compare indication wired directly to the 2 clock interrupt inputs,
> >> so that having both of them running with only a single set of irq state
> >> just works. I don't know whether the MSP CIC timer interrupt is a
> > In my case they are separate IRQs. MSP_INT_VPE1_TIMER (34) and
> > MSP_INT_VPE0_TIMER (25) are wired to the CIC. The CIC interrupt is
> > connected to CPU irq 6.
> >
> > I can reproduce the CPU stall in VSMP mode if I don't set up the VPE1 timer
> > interrupt. Don't we have support for separate IRQs in the SMTC
> > implementation?
>
> There are hooks for platform-specific SMTC support, which is implemented
> for the Malta in arch/mips/mti-malta/malta-smtc.c. See
> msmtc_init_secondary(), for example, where the clock/compare, profile,
> and IPI interrupts are armed for VPE 1, while I/O peripheral interrupts
> are inhibited.
>
> >> gating of the VPE0 count/compare output, or whether it's its own
> >> interval timer, but I suspect that you may need to do some further
> >> low-level initialization in the platform-specific code to set up an
> >> interrupt on the VPE1 side. I don't think the snippet you've got below
> >> would work as written.
> > The routine which I copied works fine in VSMP mode.
> >
> > / # cat /proc/interrupts
> > CPU0 CPU1
> > 0: 187 254 MIPS IPI_resched
> > 1: 77 174 MIPS IPI_call
> > 6: 0 0 MIPS MSP CIC cascade
> > 8: 0 0 MSP_CIC Softreset button
> > 9: 0 0 MSP_CIC Standby switch
> > 21: 0 0 MSP_CIC MSP PER cascade
> > 25: 37077 0 MSP_CIC timer
> > 27: 188 0 MSP_CIC serial
> > 34: 0 36986 MSP_CIC timer
> >
> > Do I need to change anything specific for SMTC?
>
> If it works (which I doubt), then we can critique stylistic points like
> using
>
> if ((1==get_current_vpe())
>
> Instead of the more readable and general
>
> if (get_current_vpe()> 0)
>
>
> But I think you're generally looking in the wrong place. Look at the
> Malta code and see what's done where. The initial SMTC code had a lot
> of Malta assumptions in the main line that I pushed out to platform code
> in later patches. I can see how things could be made even more modular,
> but for the moment I think it's just that there's some stuff that ought
> to be done in a "msp_smtc.c" file that doesn't exist in 2.6.37.
Yes, I am doing similar things in msp_smtc.c. I am attaching the code for your
reference. I am not seeing a VPE1 timer interrupt.
>
> Regards,
>
> Kevin K.
> >
> >
> >> If it's purely an issue with clock distribution on VPE1, then a boot
> >> with maxvpes=1 maxtcs=4 should be stable.
> > Yes, the kernel seems to be stable if I boot with maxvpes=1 maxtcs=4.
> >
> >> /K.
> >>
> >> On 1/3/2011 11:20 AM, Anoop P A wrote:
> >>> Hi Kevin,
> >>>
> >>> On Mon, 2011-01-03 at 08:14 -0800, Kevin D. Kissell wrote:
> >>>> The very first SMTC implementations didn't support full kernel-mode
> >>>> preemption, which anyway wasn't a priority, given the hardware event
> >>>> response support in MIPS MT. I believe it was later made compatible,
> >>>> but it was never extensively exercised. Since SMTC has fingers in some
> >>>> pretty low-level atomicity mechanisms, if a new, parallel set was
> >>>> implemented for RCU, I can easily imagine that nobody has yet
> >>>> implemented SMTC-ified variants of that set.
> >>>>
> >>>> Your last statement isn't very clear, though. Are you saying that if
> >>>> you configure for no forced preemption and with TREE_CPU, the 2.6.37
> >>>> kernel boots all the way up, or that it simply hangs later? What's the
> >>>> last rev kernel that actually boots all the way up?
> >>> I have debugged this a bit more. It seems that kernel getting stalled
> >>> while executing on TC's of second VPE .
> >>> INFO: rcu_sched_state detected stalls on CPUs/tasks: { 4 5 6} (detected
> >>> by 1, t=2504 jiffies)
> >>> INFO: rcu_sched_state detected stalls on CPUs/tasks: { 4 5 6} (detected
> >>> by 1, t=10036 jiffies)
> >>> INFO: rcu_sched_state detected stalls on CPUs/tasks: { 4 5 6} (detected
> >>> by 1, t=17568 jiffies)
> >>> INFO: rcu_sched_state detected stalls on CPUs/tasks: { 4 5 6} (detected
> >>> by 1, t=25100 jiffies)
> >>> INFO: rcu_sched_state detected stalls on CPUs/tasks: { 4 5 6} (detected
> >>> by 1, t=32632 jiffies)
> >>> INFO: rcu_sched_state detected stalls on CPUs/tasks: { 4 5 6} (detected
> >>> by 1, t=40164 jiffies)
> >>>
> >>> With CONFIG_TREE_CPU we were not hitting this scenario very often.
> >>> However with CONFIG_PREEMPT_TREE_CPU stall happens most of the time.
> >>>
> >>> I presume some issue in my timer setup . I am not seeing timer interrupt
> >>> (or IPI interrupt) getting incremented for VPE1 tcs on a completely
> >>> booted 2.6.32-stable kernel.
> >>>
> >>> / # cat /proc/interrupts
> >>>         CPU0    CPU1    CPU2    CPU3    CPU4    CPU5    CPU6
> >>>   1:     148   15023   15140   15093    3779       8       2  MIPS     SMTC_IPI
> >>>   6:       0       0       0       0       0       0       0  MIPS     MSP CIC cascade
> >>>   8:       0       0       0       0       0       0       0  MSP_CIC  Softreset button
> >>>   9:       0       0       0       0       0       0       0  MSP_CIC  Standby switch
> >>>  21:       0       0       0       0       0       0       0  MSP_CIC  MSP PER cascade
> >>>  25:   15113     341       4       7       0       0       0  MSP_CIC  timer
> >>>  27:     260       9       0       1       0       0       0  MSP_CIC  serial
> >>>  34:       0       0       0       0       0       0       0  MSP_CIC  timer
> >>>
> >>> Can't we use separate timer interrupts for VPE1 and VPE0 in SMTC ?.
> >>>
> >>> I have tried setting up VPE1 timer from get_co_compare_int as follows
> >>>
> >>> unsigned int __cpuinit get_c0_compare_int(void)
> >>> {
> >>> if ((1==get_current_vpe())&& !vpe1_timr_installed){
> >>>
> >>> memcpy(&timer_vpe1,&c0_compare_irqaction,sizeof(timer_vpe1));
> >>>
> >>> setup_irq(MSP_INT_VPE1_TIMER,&timer_vpe1);
> >>> vpe1_timr_installed++;
> >>> }
> >>> return (get_current_vpe() ? MSP_INT_VPE1_TIMER :
> >>> MSP_INT_VPE0_TIMER);
> >>> }
> >>>
> >>> Thanks
> >>> Anoop
>
[-- Attachment #2: msp_smtc.c --]
[-- Type: text/x-csrc, Size: 3230 bytes --]
/*
* MSP71xx Platform-specific hooks for SMP operation.
* Started from malta-smtc.c.
*/
#include <linux/irq.h>
#include <linux/init.h>
#include <linux/sched.h>
#include <asm/mipsregs.h>
#include <asm/mipsmtregs.h>
#include <asm/smtc.h>
#include <asm/smtc_ipi.h>
/* VPE/SMP Prototype implements platform interfaces directly */
/*
* Cause the specified action to be performed on a targeted "CPU"
*/
static void msp_smtc_send_ipi_single(int cpu, unsigned int action)
{
/* "CPU" may be TC of same VPE, VPE of same CPU, or different CPU */
smtc_send_ipi(cpu, LINUX_SMP_IPI, action);
}
static void msp_smtc_send_ipi_mask(const struct cpumask *mask, unsigned int action)
{
unsigned int i;
for_each_cpu(i, mask)
msp_smtc_send_ipi_single(i, action);
}
/*
* Post-config but pre-boot cleanup entry point
*/
static int prev_vpe;
static void __cpuinit msp_smtc_init_secondary(void)
{
void smtc_init_secondary(void);
int myvpe;
myvpe = read_c0_tcbind() & TCBIND_CURVPE;
/* Change status register when we switch to new VPE*/
if ((myvpe != prev_vpe) && (myvpe > 0)) {
change_c0_status(ST0_IM, STATUSF_IP0 | STATUSF_IP1 |
STATUSF_IP6 | STATUSF_IP7);
}
prev_vpe = myvpe;
smtc_init_secondary();
}
/*
* Platform "CPU" startup hook
*/
static void __cpuinit msp_smtc_boot_secondary(int cpu, struct task_struct *idle)
{
smtc_boot_secondary(cpu, idle);
}
/*
* SMP initialization finalization entry point
*/
static void __cpuinit msp_smtc_smp_finish(void)
{
smtc_smp_finish();
}
/*
* Hook for after all CPUs are online
*/
static void msp_smtc_cpus_done(void)
{
}
/*
* Platform SMP pre-initialization
*
* As noted above, we can assume a single CPU for now
* but it may be multithreaded.
*/
static void __init msp_smtc_smp_setup(void)
{
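/*
* we won't get the definitive value until
* we've run smtc_prepare_cpus later, but
* we would appear to need an upper bound now.
*/
/* Only build the CPU map if Config3 reports the MT ASE (bit 2) */
if (read_c0_config3() & MIPS_CONF3_MT)
smp_num_siblings = smtc_build_cpu_map(0);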
}
static void __init msp_smtc_prepare_cpus(unsigned int max_cpus)
{
smtc_prepare_cpus(max_cpus);
}
struct plat_smp_ops msp_smtc_smp_ops = {
.send_ipi_single = msp_smtc_send_ipi_single,
.send_ipi_mask = msp_smtc_send_ipi_mask,
.init_secondary = msp_smtc_init_secondary,
.smp_finish = msp_smtc_smp_finish,
.cpus_done = msp_smtc_cpus_done,
.boot_secondary = msp_smtc_boot_secondary,
.smp_setup = msp_smtc_smp_setup,
.prepare_cpus = msp_smtc_prepare_cpus,
};
#if 0
/* TODO */
#ifdef CONFIG_MIPS_MT_SMTC_IRQAFF
/*
* IRQ affinity hook
*/
int plat_set_irq_affinity(unsigned int irq, const struct cpumask *affinity)
{
cpumask_t tmask;
int cpu = 0;
void smtc_set_irq_affinity(unsigned int irq, cpumask_t aff);
cpumask_copy(&tmask, affinity);
for_each_cpu(cpu, affinity) {
if ((cpu_data[cpu].vpe_id != 0) || !cpu_online(cpu))
cpu_clear(cpu, tmask);
}
cpumask_copy(irq_desc[irq].affinity, &tmask);
if (cpus_empty(tmask))
/*
* We could restore a default mask here, but the
* runtime code can anyway deal with the null set
*/
printk(KERN_WARNING
"IRQ affinity leaves no legal CPU for IRQ %d\n", irq);
/* Do any generic SMTC IRQ affinity setup */
smtc_set_irq_affinity(irq, tmask);
return 0;
}
#endif /* CONFIG_MIPS_MT_SMTC_IRQAFF */
#endif
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: SMTC support status in latest git head.
2011-01-04 18:33 ` Kevin D. Kissell
@ 2011-01-05 13:11 ` Anoop P A
2011-01-05 19:23 ` Kevin D. Kissell
0 siblings, 1 reply; 68+ messages in thread
From: Anoop P A @ 2011-01-05 13:11 UTC (permalink / raw)
To: Kevin D. Kissell; +Cc: STUART VENTERS, Anoop P.A., linux-mips
On Tue, 2011-01-04 at 10:33 -0800, Kevin D. Kissell wrote:
> On 01/04/11 09:54, Anoop P A wrote:
> > On Tue, 2011-01-04 at 09:21 -0800, Kevin D. Kissell wrote:
> >> I'm trying to figure out a reason why your change below should help, and
> >> offhand, modulo tool bugs, I don't see it. I'm assuming that your diff
> >> below is a diff relative to the pre-patch stackframe.h. I wouldn't
> > Yes patch created against stock code .
> >
> >> bless it as an alternative because it moves code and comments
> >> unnecessarily - all you should really have to do is to move the
> >>
> >>
> >> 190 mfc0 v1, CP0_STATUS
> >> 191 LONG_S $2, PT_R2(sp)
> >>
> >> to be just after the #endif /* CONFIG_MIPS_MT_SMTC */ at around line 201.
> > Actually I just moved code under CONFIG_MIPS_MT_SMTC to previous block
> > of code ( which store $0 ) . git diff did the rest on behalf of me :)
> >
> >> If moving the save of zero to PT_R0(sp) actually makes a difference,
> >> it's evidence that you've got problems in your toolchain (or, heaven
> >> forbid, your pipeline)!
> > In previous version of patch usage of V0 was creating issue. I have
> > verified this with previous version of code ( working code before
> > David's instruction rearrangement patch.) .
>
> Argh. It's not very clearly commented, but it looks as if the system
> call trap handler has an implicit assumption that v0 has never been
> changed by SAVE_SOME, TRACE_IRQS_ON_RELOAD, or STI. So yeah, moving the
> code around to fix the v1 conflict ends up being better than using v0 -
> otherwise, we'd need to add a LONG_L v0, PT_R2(sp) somewhere after the
> LONG_S v0, PT_TCSTATUS(sp) of the original patch.
Well, here is the patch.
diff --git a/arch/mips/include/asm/stackframe.h b/arch/mips/include/asm/stackframe.h
index 58730c5..19418c4 100644
--- a/arch/mips/include/asm/stackframe.h
+++ b/arch/mips/include/asm/stackframe.h
@@ -187,8 +187,6 @@
* need it to operate correctly
*/
LONG_S $0, PT_R0(sp)
- mfc0 v1, CP0_STATUS
- LONG_S $2, PT_R2(sp)
#ifdef CONFIG_MIPS_MT_SMTC
/*
* Ideally, these instructions would be shuffled in
@@ -199,6 +197,8 @@
.set mips0
LONG_S v1, PT_TCSTATUS(sp)
#endif /* CONFIG_MIPS_MT_SMTC */
+ mfc0 v1, CP0_STATUS
+ LONG_S $2, PT_R2(sp)
LONG_S $4, PT_R4(sp)
LONG_S $5, PT_R5(sp)
LONG_S v1, PT_STATUS(sp)
>
> Regards,
>
> Kevin K.
^ permalink raw reply related [flat|nested] 68+ messages in thread
* Re: SMTC support status in latest git head.
2011-01-05 13:11 ` Anoop P A
@ 2011-01-05 19:23 ` Kevin D. Kissell
2011-01-06 20:23 ` Anoop P A
0 siblings, 1 reply; 68+ messages in thread
From: Kevin D. Kissell @ 2011-01-05 19:23 UTC (permalink / raw)
To: Anoop P A; +Cc: STUART VENTERS, Anoop P.A., linux-mips
On 01/05/11 05:11, Anoop P A wrote:
> On Tue, 2011-01-04 at 10:33 -0800, Kevin D. Kissell wrote:
>> On 01/04/11 09:54, Anoop P A wrote:
>>> On Tue, 2011-01-04 at 09:21 -0800, Kevin D. Kissell wrote:
>>>> I'm trying to figure out a reason why your change below should help, and
>>>> offhand, modulo tool bugs, I don't see it. I'm assuming that your diff
>>>> below is a diff relative to the pre-patch stackframe.h. I wouldn't
>>> Yes patch created against stock code .
>>>
>>>> bless it as an alternative because it moves code and comments
>>>> unnecessarily - all you should really have to do is to move the
>>>>
>>>>
>>>> 190 mfc0 v1, CP0_STATUS
>>>> 191 LONG_S $2, PT_R2(sp)
>>>>
>>>> to be just after the #endif /* CONFIG_MIPS_MT_SMTC */ at around line 201.
>>> Actually I just moved code under CONFIG_MIPS_MT_SMTC to previous block
>>> of code ( which store $0 ) . git diff did the rest on behalf of me :)
>>>
>>>> If moving the save of zero to PT_R0(sp) actually makes a difference,
>>>> it's evidence that you've got problems in your toolchain (or, heaven
>>>> forbid, your pipeline)!
>>> In previous version of patch usage of V0 was creating issue. I have
>>> verified this with previous version of code ( working code before
>>> David's instruction rearrangement patch.) .
>> Argh. It's not very clearly commented, but it looks as if the system
>> call trap handler has an implicit assumption that v0 has never been
>> changed by SAVE_SOME, TRACE_IRQS_ON_RELOAD, or STI. So yeah, moving the
>> code around to fix the v1 conflict ends up being better than using v0 -
>> otherwise, we'd need to add a LONG_L v0, PT_R2(sp) somewhere after the
>> LONG_S v0, PT_TCSTATUS(sp) of the original patch.
> Well, Here is the patch.
>
> diff --git a/arch/mips/include/asm/stackframe.h
> b/arch/mips/include/asm/stackframe.h
> index 58730c5..19418c4 100644
> --- a/arch/mips/include/asm/stackframe.h
> +++ b/arch/mips/include/asm/stackframe.h
> @@ -187,8 +187,6 @@
> * need it to operate correctly
> */
> LONG_S $0, PT_R0(sp)
> - mfc0 v1, CP0_STATUS
> - LONG_S $2, PT_R2(sp)
> #ifdef CONFIG_MIPS_MT_SMTC
> /*
> * Ideally, these instructions would be shuffled in
> @@ -199,6 +197,8 @@
> .set mips0
> LONG_S v1, PT_TCSTATUS(sp)
> #endif /* CONFIG_MIPS_MT_SMTC */
> + mfc0 v1, CP0_STATUS
> + LONG_S $2, PT_R2(sp)
> LONG_S $4, PT_R4(sp)
> LONG_S $5, PT_R5(sp)
> LONG_S v1, PT_STATUS(sp)
That's exactly what I'd propose as the cleanest minimal fix. I've got a
version that also replaces the .set mips32 / .set mips0 with the .set
push / .set pop paradigm, which I'd have used in the original code if
I'd known at the time about that assembler directive. I'm hoping to be
able to test on a Malta/34K reference platform, and make sure there
isn't breakage on that platform branch as well, before we commit to the
repository.
Your msp_smtc.c file looks plausible on the face of it. The
init_secondary function has the quirk that it expects to execute on each
"CPU" in numerical order, which is very likely but not guaranteed. It
*ought* to be harmless in the rare case where it fails, but the
assumption is worth a comment, IMHO.
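One way to drop the ordering assumption altogether would be to remember
which VPEs have already been set up, rather than comparing against the
previous caller's VPE. What follows is only a minimal user-space sketch of
that idea, not kernel code: vpe_initialized_mask,
vpe_needs_first_time_setup() and MAX_VPES_SKETCH are hypothetical names,
and the sketch assumes the secondary "CPUs" are brought up one at a time,
so it takes no lock.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption for this sketch: at most two VPEs. */
#define MAX_VPES_SKETCH 2

/* Hypothetical state: bit n is set once VPE n has had its one-time setup. */
static unsigned int vpe_initialized_mask;

/*
 * Returns true the first time a given VPE is seen and false afterwards,
 * whatever order the "CPUs" call in. Not thread-safe; the callers are
 * assumed to be serialized.
 */
static bool vpe_needs_first_time_setup(unsigned int vpe)
{
	assert(vpe < MAX_VPES_SKETCH);
	if (vpe_initialized_mask & (1u << vpe))
		return false;
	vpe_initialized_mask |= 1u << vpe;
	return true;
}
```

In init_secondary, the Status.IM update could then be guarded by
(myvpe > 0) && vpe_needs_first_time_setup(myvpe) in place of the prev_vpe
comparison.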
At this point, there shouldn't be a whole lot of SMTC-specific mystery
to get your timer running on the second VPE. You know it's taking
interrupts, because of the IPIs getting through, so in principle you
just need to run the chain of enables from the clock peripheral itself
through the CIC to the CPU core and the IM bits.
It would be really cool if we could get a stable repository branch that
boots SMTC out-of-the-box on both Malta and the MSP platform.
Regards,
Kevin K.
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: SMTC support status in latest git head.
2011-01-05 19:23 ` Kevin D. Kissell
@ 2011-01-06 20:23 ` Anoop P A
2011-01-06 23:31 ` Kevin D. Kissell
0 siblings, 1 reply; 68+ messages in thread
From: Anoop P A @ 2011-01-06 20:23 UTC (permalink / raw)
To: Kevin D. Kissell; +Cc: STUART VENTERS, Anoop P.A., linux-mips
On Wed, 2011-01-05 at 11:23 -0800, Kevin D. Kissell wrote:
> > LONG_S $5, PT_R5(sp)
> > LONG_S v1, PT_STATUS(sp)
>
> That's exactly what I'd propose as the cleanest minimal fix. I've got a
> version that also replaces the .set mips32 / .set mips0 with the .set
> push / .set pop paradigm, which I'd have used in the original code if
> I'd known at the time about that assembler directive. I'm hoping to be
> able to test on a Malta/34K reference platform, and make sure there
> isn't breakage on that platform branch as well, before we commit to the
> repository.
I hope somebody can test this patch on a Malta/34K platform. I don't have
access to any Malta boards and I believe 34K MT simulation is not
available in QEMU.
>
> Your msp_smtc.c file looks plausible on the face of it. The
> init_secondary function has the quirk that it expects to execute on each
> "CPU" in numerical order, which is very likely but not guaranteed. It
> *ought* to be harmless in the rare case where it fails, but the
> assumption is worth a comment, IMHO.
Yes I will add a comment.
>
> At this point, there shouldn't be a whole lot of SMTC-specific mystery
> to get your timer running on the second VPE. You know it's taking
> interrupts, because of the IPIs getting through, so in principle you
> just need to run the chain of enables from the clock peripheral itself
> through the CIC to the CPU core and the IM bits.
I hope we are almost there. I have made some progress with debugging, and
I think you may be able to give better insight into what I have observed.
1. Without CONFIG_MIPS_MT_SMTC_IM_BACKSTOP selected, my kernel hangs in
the calibration loop itself. (I haven't looked further into this.)
2. With CONFIG_MIPS_MT_SMTC_IM_BACKSTOP, I get 3 VPE1 timer interrupts
(one for each TC of VPE1). However, these interrupts are not carried
through to c0_compare_interrupt.
The do_IRQ call has an SMTC hook which modifies tccontext. (To reduce
complexity I haven't selected SMTC affinity.)
Once I disabled this call, I am seeing VPE1 timer interrupts and am able
to boot completely without any issues so far :).
/ # cat /proc/interrupts
        CPU0    CPU1    CPU2    CPU3    CPU4    CPU5    CPU6
  1:     171  727459  727561  727533      27  727446  727453  MIPS     SMTC_IPI
  6:       0       0       0       0       0       0       0  MIPS     MSP CIC cascade
  8:       0       0       0       0       0       0       0  MSP_CIC  Softreset button
  9:       0       0       0       0       0       0       0  MSP_CIC  Standby switch
 21:       0       0       0       0       0       0       0  MSP_CIC  MSP PER cascade
 25:  727507     484      11       0       0       0       0  MSP_CIC  timer
 27:       0       0       0       0     258      10       1  MSP_CIC  serial
 34:       0       0       0       0  727533       7       1  MSP_CIC  timer
BTW, the following code in my CIC init was setting hwmask.
/* initialize all the IRQ descriptors */
for (i = MSP_CIC_INTBASE ; i < MSP_CIC_INTBASE + 32 ; i++) {
set_irq_chip_and_handler(i, &msp_cic_irq_controller,
handle_level_irq);
#ifdef CONFIG_MIPS_MT_SMTC
irq_hwmask[i] = C_IRQ4;
#endif
}
> It would be really cool if we could get a stable repository branch that
> boots SMTC out-of-the-box on both Malta and the MSP platform.
:)
>
> Regards,
>
> Kevin K.
>
>
Thanks
Anoop
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: SMTC support status in latest git head.
2011-01-06 20:23 ` Anoop P A
@ 2011-01-06 23:31 ` Kevin D. Kissell
2011-01-07 7:56 ` Anoop P A
0 siblings, 1 reply; 68+ messages in thread
From: Kevin D. Kissell @ 2011-01-06 23:31 UTC (permalink / raw)
To: Anoop P A; +Cc: STUART VENTERS, Anoop P.A., linux-mips
On 01/06/11 12:23, Anoop P A wrote:
> On Wed, 2011-01-05 at 11:23 -0800, Kevin D. Kissell wrote:
>> At this point, there shouldn't be a whole lot of SMTC-specific mystery
>> to get your timer running on the second VPE. You know it's taking
>> interrupts, because of the IPIs getting through, so in principle you
>> just need to run the chain of enables from the clock peripheral itself
>> through the CIC to the CPU core and the IM bits.
> I hope we are almost there. I have made some progress with the debug . I
> think you should be able to give better insight to the observation I
> have made.
>
> 1. Without selecting CONFIG_MIPS_MT_SMTC_IM_BACKSTOP My kernel hangs in
> calibration loop itself . ( I haven't looked further into this).
That suggests a problem with Status.IM initialization and/or
the handling of irq_hwmask[]. Do you mean that this is always
true, or only if VPE1 is being booted? You haven't mentioned it
before.
> 2. With CONFIG_MIPS_MT_SMTC_IM_BACKSTOP I found I am getting 3
> VPE1-TIMER interrupt ( one for each TC of VPE1) .However this interrupts
> are not getting carried till c0_compare_interrupt .
Would you expect them to? I thought you were using an outboard
timer and not the CP0 Compare interrupt.
> do_IRQ call had a SMTC hook which is modifying tccontext ( To reduce
> complexity I haven't selected SMTC affinity).
>
> Once I disabled this call . I am seeing VPE1 timer interrupts and able
> to boot completely without any issue's so far :).
So long as you've got the IM_BACKSTOP hack enabled, right?
Because otherwise, without that __DO_IRQ_SMTC_HOOK() invocation
> / # cat /proc/interrupts
>         CPU0    CPU1    CPU2    CPU3    CPU4    CPU5    CPU6
>   1:     171  727459  727561  727533      27  727446  727453  MIPS     SMTC_IPI
>   6:       0       0       0       0       0       0       0  MIPS     MSP CIC cascade
>   8:       0       0       0       0       0       0       0  MSP_CIC  Softreset button
>   9:       0       0       0       0       0       0       0  MSP_CIC  Standby switch
>  21:       0       0       0       0       0       0       0  MSP_CIC  MSP PER cascade
>  25:  727507     484      11       0       0       0       0  MSP_CIC  timer
>  27:       0       0       0       0     258      10       1  MSP_CIC  serial
>  34:       0       0       0       0  727533       7       1  MSP_CIC  timer
>
>
> BTW following code in my cic init was setting hwmask.
>
> /* initialize all the IRQ descriptors */
> for (i = MSP_CIC_INTBASE ; i< MSP_CIC_INTBASE + 32 ; i++) {
> set_irq_chip_and_handler(i,&msp_cic_irq_controller,
> handle_level_irq);
> #ifdef CONFIG_MIPS_MT_SMTC
> irq_hwmask[i] = C_IRQ4;
> #endif
> }
I'm sure I've said this before, and it's in various comments in the SMTC
code, but remember, one of the main problems that the SMTC kernel
had to solve was to prevent all TCs of a VPE from "convoying" after every
interrupt. The way this is done is that the interrupt vector code, before
clearing EXL, masks off the Status.IM bit associated with the incoming
interrupt. Of course, to get another interrupt from the same source
(or collection of sources), that IM bit needs to be restored. The "correct"
mechanism for this is by having the appropriate irq_hwmask[] value set,
so that smtc_im_ack_irq(), which should be invoked on an irq "ack()"
(meaning that the source has been quenched and any new occurrence
should be considered a new interrupt), will restore the bit in Status.
This function got moved around a bit in the various SMTC prototypes,
but it proved least intrusive to put it into the xxx_mask_and_ack()
functions for the interrupt controllers - see irq-msc01.c and i8259.c.
If you haven't done the same in any equivalent code for a different
on-chip controller, you'll definitely have problems.
The Backstop scheme works OK for peripheral interrupts that didn't
have an appropriate irq_hwmask[] value set up, but clock interrupts
don't follow the same code paths and can't depend on the backstop.
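The mask-on-entry, restore-on-ack cycle described above can be modelled in
a few lines of user-space C. This is only a sketch under stated
assumptions, not the kernel's implementation: status_sketch stands in for
the CP0 Status register, irq_hwmask_sketch[] for irq_hwmask[], and both
function names are hypothetical. It shows why an irq whose hwmask entry is
zero, or names the wrong IM bit, leaves its interrupt line masked after
the first occurrence.

```c
#include <assert.h>
#include <stdint.h>

#define ST0_IM_SKETCH   0x0000ff00u  /* Status.IM field, bits 15..8 */
#define NR_IRQS_SKETCH  64           /* arbitrary size for this model */

static uint32_t status_sketch;                      /* stands in for CP0 Status */
static uint32_t irq_hwmask_sketch[NR_IRQS_SKETCH];  /* IM bit(s) to restore per irq */

/* Vector entry: mask the IM bit for interrupt line `ip` (0..7). */
static void vector_entry_mask_sketch(unsigned int ip)
{
	assert(ip < 8);
	status_sketch &= ~(0x100u << ip);
}

/* Ack path: restore only the IM bits this irq was registered against. */
static void im_ack_irq_sketch(unsigned int irq)
{
	assert(irq < NR_IRQS_SKETCH);
	if (irq_hwmask_sketch[irq] & ST0_IM_SKETCH)
		status_sketch |= irq_hwmask_sketch[irq] & ST0_IM_SKETCH;
}
```

With IM6 initially set, calling vector_entry_mask_sketch(6) clears it; a
later im_ack_irq_sketch() on an irq whose hwmask entry is 0 leaves it
cleared, while an entry of 0x4000 (the IM6 bit) re-enables the line.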
Regards,
Kevin K.
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: SMTC support status in latest git head.
2011-01-06 23:31 ` Kevin D. Kissell
@ 2011-01-07 7:56 ` Anoop P A
2011-01-07 18:46 ` Kevin D. Kissell
2011-01-10 19:30 ` Kevin D. Kissell
0 siblings, 2 replies; 68+ messages in thread
From: Anoop P A @ 2011-01-07 7:56 UTC (permalink / raw)
To: Kevin D. Kissell; +Cc: STUART VENTERS, Anoop P.A., linux-mips
[-- Attachment #1: Type: text/plain, Size: 2687 bytes --]
On Thu, 2011-01-06 at 15:31 -0800, Kevin D. Kissell wrote:
> On 01/06/11 12:23, Anoop P A wrote:
>
> I'm sure I've said this before, and it's in various comments in the SMTC
> code, but remember, one of the main problems that the SMTC kernel
> had to solve was to prevent all TCs of a VPE from "convoying" after every
> interrupt. The way this is done is that the interrupt vector code, before
> clearing EXL, masks off the Status.IM bit associated with the incoming
> interrupt. Of course, to get another interrupt from the same source
> (or collection of sources), that IM bit needs to be restored. The "correct"
> mechanism for this is by having the appropriate irq_hwmask[] value set,
> so that smtc_im_ack_irq(), which should be invoked on an irq "ack()"
> (meaning that the source has been quenched and any new occurrence
> should be considered a new interrupt), will restore the bit in Status.
> This function got moved around a bit in the various SMTC prototypes,
> but it proved least intrusive to put it into the xxx_mask_and_ack()
> functions
> for the interrupt controllers - see irq-msc01.c and i8259.c. If you haven't
> done the same in any equivalent code for a different on-chip controller,
> you'll definitely have problems.
>
> The Backstop scheme works OK for peripheral interrupts that didn't
> have an appropriate irq_hwmask[] value set up, but clock interrupts
> don't follow the same code paths and can't depend on the backstop.
OK, thanks very much for your detailed explanation. I hope I have found
the root cause: smtc_clockevent_init() was overriding irq_hwmask even
when we are using a platform-specific get_c0_compare_int. With the
following patch everything seems to be working for me.
------------------------------------------------------------------------
diff --git a/arch/mips/kernel/cevt-smtc.c b/arch/mips/kernel/cevt-smtc.c
index 2e72d30..a25fc59 100644
--- a/arch/mips/kernel/cevt-smtc.c
+++ b/arch/mips/kernel/cevt-smtc.c
@@ -310,9 +310,14 @@ int __cpuinit smtc_clockevent_init(void)
return 0;
/*
* And we need the hwmask associated with the c0_compare
- * vector to be initialized.
+ * vector to be initialized. However, with a platform-specific
+ * get_c0_compare_int, don't override irq_hwmask; expect
+ * platform code to set a valid mask value.
*/
- irq_hwmask[irq] = (0x100 << cp0_compare_irq);
+
+ if (!get_c0_compare_int)
+ irq_hwmask[irq] = (0x100 << cp0_compare_irq);
+
if (cp0_timer_irq_installed)
return 0;
-----------------------------------------------------------------------
Attaching my msp_irq_cic.c. Kindly have a look if possible.
Thanks
Anoop
>
> Regards,
>
> Kevin K.
[-- Attachment #2: msp_irq_cic.c --]
[-- Type: text/x-csrc, Size: 5357 bytes --]
/*
* Copyright 2010 PMC-Sierra, Inc, derived from irq_cpu.c
*
* This file defines the irq handler for MSP CIC subsystem interrupts.
*
* This program is free software; you can redistribute it and/or modify it
* under the terms of the GNU General Public License as published by the
* Free Software Foundation; either version 2 of the License, or (at your
* option) any later version.
*/
#include <linux/init.h>
#include <linux/interrupt.h>
#include <linux/kernel.h>
#include <linux/bitops.h>
#include <linux/irq.h>
#include <asm/mipsregs.h>
#include <asm/system.h>
#include <msp_cic_int.h>
#include <msp_regs.h>
/*
* External API
*/
extern void msp_per_irq_init(void);
extern void msp_per_irq_dispatch(void);
/*
* Convenience Macro. Should be somewhere generic.
*/
#define get_current_vpe() \
((read_c0_tcbind() >> TCBIND_CURVPE_SHIFT) & TCBIND_CURVPE)
#ifdef CONFIG_SMP
#define LOCK_VPE(flags, mtflags) \
do { \
local_irq_save(flags); \
mtflags = dmt(); \
} while (0)
#define UNLOCK_VPE(flags, mtflags) \
do { \
emt(mtflags); \
local_irq_restore(flags);\
} while (0)
#define LOCK_CORE(flags, mtflags) \
do { \
local_irq_save(flags); \
mtflags = dvpe(); \
} while (0)
#define UNLOCK_CORE(flags, mtflags) \
do { \
evpe(mtflags); \
local_irq_restore(flags);\
} while (0)
#else
#define LOCK_VPE(flags, mtflags)
#define UNLOCK_VPE(flags, mtflags)
#endif
/* ensure writes to cic are completed */
static inline void cic_wmb(void)
{
const volatile void __iomem *cic_mem = CIC_VPE0_MSK_REG;
volatile u32 dummy_read;
wmb();
dummy_read = __raw_readl(cic_mem);
dummy_read++;
}
static inline void unmask_cic_irq(unsigned int irq)
{
volatile u32 *cic_msk_reg = CIC_VPE0_MSK_REG;
int vpe;
#ifdef CONFIG_SMP
unsigned int mtflags;
unsigned long flags;
/*
* Make sure we have IRQ affinity. It may have changed while
* we were processing the IRQ.
*/
if (!cpumask_test_cpu(smp_processor_id(), irq_desc[irq].affinity))
return;
#endif
vpe = get_current_vpe();
LOCK_VPE(flags, mtflags);
cic_msk_reg[vpe] |= (1 << (irq - MSP_CIC_INTBASE));
UNLOCK_VPE(flags, mtflags);
cic_wmb();
}
static inline void mask_cic_irq(unsigned int irq)
{
volatile u32 *cic_msk_reg = CIC_VPE0_MSK_REG;
int vpe = get_current_vpe();
#ifdef CONFIG_SMP
unsigned long flags, mtflags;
#endif
LOCK_VPE(flags, mtflags);
cic_msk_reg[vpe] &= ~(1 << (irq - MSP_CIC_INTBASE));
UNLOCK_VPE(flags, mtflags);
cic_wmb();
}
static inline void msp_cic_irq_ack(unsigned int irq)
{
mask_cic_irq(irq);
/*
* Only really necessary for 18, 16-14 and sometimes 3:0
* (since these can be edge sensitive) but it doesn't
* hurt for the others
*/
*CIC_STS_REG = (1 << (irq - MSP_CIC_INTBASE));
smtc_im_ack_irq(irq);
}
static void msp_cic_irq_end(unsigned int irq)
{
if (!(irq_desc[irq].status & (IRQ_DISABLED | IRQ_INPROGRESS)))
unmask_cic_irq(irq);
}
#ifdef CONFIG_SMP
static inline int msp_cic_irq_set_affinity(unsigned int irq,
const struct cpumask *cpumask)
{
int cpu;
unsigned long flags;
unsigned int mtflags;
unsigned long imask = (1 << (irq - MSP_CIC_INTBASE));
volatile u32 *cic_mask = (volatile u32 *)CIC_VPE0_MSK_REG;
/* timer balancing should be disabled in kernel code */
BUG_ON(irq == MSP_INT_VPE0_TIMER || irq == MSP_INT_VPE1_TIMER);
LOCK_CORE(flags, mtflags);
/* enable if any of each VPE's TCs require this IRQ */
for_each_online_cpu(cpu) {
if (cpumask_test_cpu(cpu, cpumask))
cic_mask[cpu] |= imask;
else
cic_mask[cpu] &= ~imask;
}
UNLOCK_CORE(flags, mtflags);
return 0;
}
#endif
static struct irq_chip msp_cic_irq_controller = {
.name = "MSP_CIC",
.mask = msp_cic_irq_ack,
.mask_ack = msp_cic_irq_ack,
.unmask = unmask_cic_irq,
.ack = msp_cic_irq_ack,
.end = msp_cic_irq_end,
#ifdef CONFIG_SMP
.set_affinity = msp_cic_irq_set_affinity,
#endif
};
void __init msp_cic_irq_init(void)
{
int i;
/* Mask/clear interrupts. */
*CIC_VPE0_MSK_REG = 0x00000000;
*CIC_VPE1_MSK_REG = 0x00000000;
*CIC_STS_REG = 0xFFFFFFFF;
/*
* The MSP7120 RG and EVBD boards use IRQ[6:4] for PCI.
* These inputs map to EXT_INT_POL[6:4] inside the CIC.
* They are to be active low, level sensitive.
*/
*CIC_EXT_CFG_REG &= 0xFFFF8F8F;
/* initialize all the IRQ descriptors */
for (i = MSP_CIC_INTBASE ; i < MSP_CIC_INTBASE + 32 ; i++) {
set_irq_chip_and_handler(i, &msp_cic_irq_controller,
handle_level_irq);
#ifdef CONFIG_MIPS_MT_SMTC
/* Mask of CIC interrupt */
irq_hwmask[i] = C_IRQ4;
#endif
}
/* Initialize the PER interrupt sub-system */
msp_per_irq_init();
}
/* CIC masked by CIC vector processing before dispatch called */
void msp_cic_irq_dispatch(void)
{
volatile u32 *cic_msk_reg = (volatile u32 *)CIC_VPE0_MSK_REG;
u32 cic_mask;
u32 pending;
int cic_status = *CIC_STS_REG;
cic_mask = cic_msk_reg[get_current_vpe()];
pending = cic_status & cic_mask;
if (pending & (1 << (MSP_INT_VPE0_TIMER - MSP_CIC_INTBASE))) {
do_IRQ(MSP_INT_VPE0_TIMER);
} else if (pending & (1 << (MSP_INT_VPE1_TIMER - MSP_CIC_INTBASE))) {
do_IRQ(MSP_INT_VPE1_TIMER);
} else if (pending & (1 << (MSP_INT_PER - MSP_CIC_INTBASE))) {
msp_per_irq_dispatch();
} else if (pending) {
do_IRQ(ffs(pending) + MSP_CIC_INTBASE - 1);
} else {
spurious_interrupt();
/* Re-enable the CIC cascaded interrupt. */
irq_desc[MSP_INT_CIC].chip->end(MSP_INT_CIC);
}
}
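The dispatch order used by msp_cic_irq_dispatch() above (VPE0 timer, then
VPE1 timer, then the PER cascade, then the lowest remaining pending bit)
can be checked in isolation with a small user-space model. This is a
sketch, not the driver: the *_SKETCH constants are assumptions inferred
from the irq numbers in the /proc/interrupts listings in this thread (25,
34 and 21 with a CIC base of 8), and cic_pick_irq_sketch() and
ffs_sketch() are hypothetical helpers.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values, inferred from the /proc/interrupts listings. */
#define CIC_INTBASE_SKETCH     8
#define INT_VPE0_TIMER_SKETCH  (CIC_INTBASE_SKETCH + 17)  /* irq 25 */
#define INT_VPE1_TIMER_SKETCH  (CIC_INTBASE_SKETCH + 26)  /* irq 34 */
#define INT_PER_SKETCH         (CIC_INTBASE_SKETCH + 13)  /* irq 21 */

#define CIC_BIT(irq) (1u << ((irq) - CIC_INTBASE_SKETCH))

/* 1-based index of the lowest set bit, 0 if none (same contract as ffs()). */
static int ffs_sketch(uint32_t x)
{
	int pos = 1;

	if (!x)
		return 0;
	while (!(x & 1u)) {
		x >>= 1;
		pos++;
	}
	return pos;
}

/*
 * Returns the irq the dispatcher would service, INT_PER_SKETCH when it
 * would hand off to the PER cascade, or -1 for a spurious interrupt.
 */
static int cic_pick_irq_sketch(uint32_t status, uint32_t mask)
{
	uint32_t pending = status & mask;

	if (pending & CIC_BIT(INT_VPE0_TIMER_SKETCH))
		return INT_VPE0_TIMER_SKETCH;
	if (pending & CIC_BIT(INT_VPE1_TIMER_SKETCH))
		return INT_VPE1_TIMER_SKETCH;
	if (pending & CIC_BIT(INT_PER_SKETCH))
		return INT_PER_SKETCH;
	if (pending)
		return ffs_sketch(pending) + CIC_INTBASE_SKETCH - 1;
	return -1;
}
```

Note that the mask is the per-VPE one, so a source that is pending in the
status register but masked for the current VPE is reported as spurious.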
^ permalink raw reply related [flat|nested] 68+ messages in thread
* Re: SMTC support status in latest git head.
2011-01-07 7:56 ` Anoop P A
@ 2011-01-07 18:46 ` Kevin D. Kissell
2011-01-08 19:33 ` Anoop P A
2011-01-10 19:30 ` Kevin D. Kissell
1 sibling, 1 reply; 68+ messages in thread
From: Kevin D. Kissell @ 2011-01-07 18:46 UTC (permalink / raw)
To: Anoop P A; +Cc: STUART VENTERS, Anoop P.A., linux-mips
On 01/06/11 23:56, Anoop P A wrote:
> On Thu, 2011-01-06 at 15:31 -0800, Kevin D. Kissell wrote:
>> I'm sure I've said this before, and it's in various comments in the SMTC
>> code, but...
As an aside to this conversation, would it be possible to create a
Documentation/mips/SMTC.txt file that would actually propagate
upstream, so that I'd stop being the sole repository of SMTC folklore?
I only maintain it as a hobby.
> Ok. Well thanks much for your detailed explanation. Well I hope I found
> the root cause . smtc_clockevent_init() was overriding irq_hwmask even
> if are using platform specific get_c0_compare_int. With following patch
> everything seems to be working for me.
> ------------------------------------------------------------------------
> diff --git a/arch/mips/kernel/cevt-smtc.c b/arch/mips/kernel/cevt-smtc.c
> index 2e72d30..a25fc59 100644
> --- a/arch/mips/kernel/cevt-smtc.c
> +++ b/arch/mips/kernel/cevt-smtc.c
> @@ -310,9 +310,14 @@ int __cpuinit smtc_clockevent_init(void)
> return 0;
> /*
> * And we need the hwmask associated with the c0_compare
> - * vector to be initialized.
> + * vector to be initialized. However incase of platform
> + * specific get_co_compare_int, don't override irq_hwmask
> + * expect platform code to set a valid mask value.
> */
> - irq_hwmask[irq] = (0x100<< cp0_compare_irq);
> +
> + if (!get_c0_compare_int)
> + irq_hwmask[irq] = (0x100<< cp0_compare_irq);
> +
> if (cp0_timer_irq_installed)
> return 0;
> -----------------------------------------------------------------------
I'm still not clear on one point that, to me, is pretty important when
engineering a fix here. Are you, in fact, using the Count/Compare
interrupt system, but having the externalization of the compare
interrupt routed back through an intervening interrupt controller,
or is your timer coming from another source?
In the former case, I think you're on the right track as to the
possible cause of a problem, but the fix should actually be simpler
and rather more elegant. Why can't you simply see to it that
cp0_compare_irq is set to the right value, either at compile time,
or in your earliest platform initialization of the interrupt controller?
That would be a one-line, inline change and spare us another
cryptic conditional.
In the latter case, you'll presumably be having lots of other problems,
as cevt-smtc.c is intertwined with cevt-r4k.c and the Count/Compare
paradigm.
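As a worked check of the cp0_compare_irq suggestion: the CIC descriptors
in this thread are registered with irq_hwmask[i] = C_IRQ4, and the CIC
cascade shows up as IRQ 6 in the /proc/interrupts output. Assuming C_IRQ4
is bit 14 of Status, i.e. the IM bit for IP6 (a value restated here from
memory of mipsregs.h, to be verified against the header), the expression
in smtc_clockevent_init() yields the same mask when cp0_compare_irq is 6,
and a different bit when it is 7. The sketch below only evaluates that
arithmetic; compare_hwmask_sketch() is a hypothetical helper.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed to match C_IRQ4 in mipsregs.h: the Status.IM bit for IP6. */
#define C_IRQ4_SKETCH (1u << 14)

/* The expression smtc_clockevent_init() assigns to irq_hwmask[irq]. */
static uint32_t compare_hwmask_sketch(unsigned int cp0_compare_irq)
{
	assert(cp0_compare_irq < 8);
	return 0x100u << cp0_compare_irq;
}
```

Under those assumptions compare_hwmask_sketch(6) equals C_IRQ4_SKETCH,
which is the value the platform code wants, while
compare_hwmask_sketch(7) would name the IP7 bit instead.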
Regards,
Kevin K.
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: SMTC support status in latest git head.
2011-01-07 18:46 ` Kevin D. Kissell
@ 2011-01-08 19:33 ` Anoop P A
0 siblings, 0 replies; 68+ messages in thread
From: Anoop P A @ 2011-01-08 19:33 UTC (permalink / raw)
To: Kevin D. Kissell; +Cc: STUART VENTERS, Anoop P.A., linux-mips
On Sat, Jan 8, 2011 at 12:16 AM, Kevin D. Kissell <kevink@paralogos.com> wrote:
> On 01/06/11 23:56, Anoop P A wrote:
>>
>> On Thu, 2011-01-06 at 15:31 -0800, Kevin D. Kissell wrote:
>>>
>>> I'm sure I've said this before, and it's in various comments in the SMTC
>>> code, but...
>
> As an aside to this conversation, would it be possible to create a
> Documentation/mips/SMTC.txt file that would actually propagate
> upstream, so that I'd stop being the sole repository of SMTC folklore?
> I only maintain it as a hobby.
>>
>> Ok. Well thanks much for your detailed explanation. Well I hope I found
>> the root cause . smtc_clockevent_init() was overriding irq_hwmask even
>> if are using platform specific get_c0_compare_int. With following patch
>> everything seems to be working for me.
>> ------------------------------------------------------------------------
>> diff --git a/arch/mips/kernel/cevt-smtc.c b/arch/mips/kernel/cevt-smtc.c
>> index 2e72d30..a25fc59 100644
>> --- a/arch/mips/kernel/cevt-smtc.c
>> +++ b/arch/mips/kernel/cevt-smtc.c
>> @@ -310,9 +310,14 @@ int __cpuinit smtc_clockevent_init(void)
>> return 0;
>> /*
>> * And we need the hwmask associated with the c0_compare
>> - * vector to be initialized.
>> + * vector to be initialized. However incase of platform
>> + * specific get_co_compare_int, don't override irq_hwmask
>> + * expect platform code to set a valid mask value.
>> */
>> - irq_hwmask[irq] = (0x100<< cp0_compare_irq);
>> +
>> + if (!get_c0_compare_int)
>> + irq_hwmask[irq] = (0x100<< cp0_compare_irq);
>> +
>> if (cp0_timer_irq_installed)
>> return 0;
>> -----------------------------------------------------------------------
>
> I'm still not clear on one point that, to me, is pretty important when
> engineering a fix here. Are you, in fact, using the Count/Compare
> interrupt system, but having the externalization of the compare
> interrupt routed back through an intervening interrupt controller,
> or is your timer coming from another source?
>
> In the former case, I think you're on the right track as to the
> possible cause of a problem, but the fix should actually be simpler
> and rather more elegant. Why can't you simply see to it that
> cp0_compare_irq is set to the right value, either at compile time,
> or in your earliest platform initialization of the interrupt controller?
> That would be a one-line, inline change and spare us another
> cryptic conditional.
Yes, it is the first case.
http://git.linux-mips.org/?p=linux.git;a=commit;h=38760d40ca61b18b2809e9c28df8b3ff9af8a02b
The above-mentioned patch enables platforms to use the r4k timer code with
platform-specific timer interrupts. cevt-smtc.c also has the code in
question (copied from cevt-r4k.c). Given the platform-specific irq support
in cevt-smtc.c, we should add support for a platform-specific hwmask as
well, IMHO.
>
> In the later case, you'll presumably be having lots of other problems,
> as cevt-smtc.c is intertwined with cevt-r4k.c and the Count/Compare
> paradigm.
>
> Regards,
>
> Kevin K.
>
>
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: SMTC support status in latest git head.
2011-01-07 7:56 ` Anoop P A
2011-01-07 18:46 ` Kevin D. Kissell
@ 2011-01-10 19:30 ` Kevin D. Kissell
2011-01-11 4:05 ` Anoop P A
2011-01-13 7:53 ` Kevin D. Kissell
1 sibling, 2 replies; 68+ messages in thread
From: Kevin D. Kissell @ 2011-01-10 19:30 UTC (permalink / raw)
To: Anoop P A; +Cc: STUART VENTERS, Anoop P.A., linux-mips
On 01/06/11 23:56, Anoop P A wrote:
> On Thu, 2011-01-06 at 15:31 -0800, Kevin D. Kissell wrote:
>> I'm sure I've said this before, and it's in various comments in the SMTC
>> code, but remember, one of the main problems that the SMTC kernel
>> had to solve was to prevent all TCs of a VPE from "convoying" after every
>> interrupt. The way this is done is that the interrupt vector code, before
>> clearing EXL, masks off the Status.IM bit associated with the incoming
>> interrupt. Of course, to get another interrupt from the same source
>> (or collection of sources), that IM bit needs to be restored. The "correct"
>> mechanism for this is by having the appropriate irq_hwmask[] value set,
>> so that smtc_im_ack_irq(), which should be invoked on an irq "ack()"
>> (meaning that the source has been quenched and any new occurrence
>> should be considered a new interrupt), will restore the bit in Status.
>> This function got moved around a bit in the various SMTC prototypes,
>> but it proved least intrusive to put it into the xxx_mask_and_ack()
>> functions
>> for the interrupt controllers - see irq-msc01.c and i8259.c. If you haven't
>> done the same in any equivalent code for a different on-chip controller,
>> you'll definitely have problems.
>>
>> The Backstop scheme works OK for peripheral interrupts that didn't
>> have an appropriate irq_hwmask[] value set up, but clock interrupts
>> don't follow the same code paths and can't depend on the backstop.
> Ok. Well thanks much for your detailed explanation. Well I hope I found
> the root cause . smtc_clockevent_init() was overriding irq_hwmask even
> if are using platform specific get_c0_compare_int. With following patch
> everything seems to be working for me.
Would this still be with a "tickful" kernel? I was able to run some
experiments on a Malta over the weekend, using mostly default
Malta defconfig options including tickless operation. The 2.6.32.27
build comes up with both VPEs and all TCs firing. 2.6.36.2 with
the stackframe.h patch boots all the way up on a single VPE, but
VERY slowly - as if the Count/Compare setups weren't being done
correctly and timer intervals were waiting the full Count register
rollover cycle. I've been looking at diffs, and merged one change
that was made to cevt-r4k.c into the analogous routine in cevt-smtc.c
(no change), but there's clearly more breakage to the SMTC/Malta
configuration post-2.6.32 than just the stackframe.h patch. Going
tickful may work around it, but tickful+SMTC is grossly inefficient.
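To put a number on the "full rollover" symptom: Count is a free-running
32-bit register and the timer interrupt fires when Count equals Compare, so
a Compare value written just behind Count is not matched until Count wraps.
A rough sketch of the arithmetic follows. The function names are invented
for illustration, and the 300 MHz Count rate used in the note below is an
assumption (half of a 600 MHz core clock), not a measured value.

```c
#include <stdint.h>

/* Cycles until Count next equals Compare, modulo 2^32 (unsigned wrap). */
uint32_t cycles_until_match(uint32_t count, uint32_t compare)
{
	return compare - count;
}

/* Convert a cycle count to seconds for a given Count frequency in Hz. */
double cycles_to_seconds(uint32_t cycles, double count_hz)
{
	return (double)cycles / count_hz;
}
```

If Compare lands one cycle behind Count, cycles_until_match() returns
0xFFFFFFFF, which at the assumed 300 MHz is a little over 14 seconds per
tick. That would be in line with a boot that completes but crawls.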
Regards,
Kevin K.
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: SMTC support status in latest git head.
2011-01-10 19:30 ` Kevin D. Kissell
@ 2011-01-11 4:05 ` Anoop P A
2011-01-13 7:53 ` Kevin D. Kissell
1 sibling, 0 replies; 68+ messages in thread
From: Anoop P A @ 2011-01-11 4:05 UTC (permalink / raw)
To: Kevin D. Kissell; +Cc: STUART VENTERS, Anoop P.A., linux-mips
On Tue, Jan 11, 2011 at 1:00 AM, Kevin D. Kissell <kevink@paralogos.com> wrote:
>
> Would this still be with a "tickful" kernel? I was able to run some
> experiments on a Malta over the weekend, using mostly default
> Malta defconfig options including tickless operation. The 2.6.32.27
> build comes up with both VPEs and all TCs firing. 2.6.36.2 with
> the stackframe.h patch boots all the way up on a single VPE, but
> VERY slowly - as if the Clock/Compare setups weren't being done
> correctly and timer intervals were waiting the full Count register
> rollover cycle. I've been looking at diffs, and merged one change
> that was made to cevt-r4k.c into the analogous routine in cevt-smtc.c
> (no change), but there's clearly more breakage to the SMTC/Malta
> configuration post-2.6.32 than just the stackframe.h patch. Going
> tickful may work around it, but tickful+SMTC is grossly inefficient.
Yes, that is true: my configuration is tickful. I had reported this
issue with a tickless kernel. I think you missed my last email; I will
resend it.
Thanks
Anoop
>
> Regards,
>
> Kevin K.
>
>
>
>
>
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: SMTC support status in latest git head.
2011-01-10 19:30 ` Kevin D. Kissell
2011-01-11 4:05 ` Anoop P A
@ 2011-01-13 7:53 ` Kevin D. Kissell
1 sibling, 0 replies; 68+ messages in thread
From: Kevin D. Kissell @ 2011-01-13 7:53 UTC (permalink / raw)
To: Anoop P A; +Cc: STUART VENTERS, Anoop P.A., linux-mips
Further interesting data point:
If I specify "nowait" on the command line, I get much better behavior on
the 2.6.36 and 2.6.37 kernels. In particular, 2.6.37, which hung after
the "Switching to clocksource MIPS" even booting with a single TC, gets
far enough to enable swap space even with 4 TCs running. I note that
there was historically a problem with getting SMTC to work with the
wait-with-interrupts-disabled idle wait mode. I had it working back in
2.5.2x, but something seems to have gotten broken in that 2.6.32 to
2.6.36 interval...
Regards,
Kevin K.
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: SMTC support status in latest git head.
2010-12-16 13:03 ` Anoop P A
@ 2010-12-16 18:43 ` Kevin D. Kissell
0 siblings, 0 replies; 68+ messages in thread
From: Kevin D. Kissell @ 2010-12-16 18:43 UTC (permalink / raw)
To: Anoop P A; +Cc: Anoop P.A., linux-mips
Getting back to my previous comment, the value reported for
TC0's TCStatus register in the MT register dump can't be right.
There are bits that are literally the same flip-flops between
TCStatus and the containing VPE's Status register, and those
bits are turning up different. If the reporting is wrong, then
one of the underlying assumptions of the dump code must
have been broken. Taking a quick look at it - which is all the
time I have for it today - I note with alarm that the TCStatus
value reported for the TC currently executing comes from the
"flags" variable used in the local_irq_save(flags) statement
at the beginning of the dump code. That historically worked,
because local_irq_save(x) propagated not only the interrupt
enable bit (bit 0) in x, but the entire value of Status - or TCStatus
in the case of SMTC. It certainly looks as if that's no longer true.
I'm pretty sure that the dump function isn't the only place where
the knowledge of local_irq_save()'s implementation was exploited
by SMTC code. So you should look for changes to the local_irq_save()
macro definitions between 2.6.32 and 2.6.33.
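A minimal sketch of the kind of sanity check this implies, for anyone
wanting to test a dump by hand. The field positions are stated from memory
of the MIPS32 privileged architecture and the MT ASE (Status.CU3..CU0 at
bits 31:28, Status.MX at bit 24, Status.KSU at bits 4:3; TCStatus.TCU3..TCU0
at bits 31:28, TMX at bit 27, TKSU at bits 12:11) and should be checked
against the manuals. The function name is made up for illustration.

```c
#include <stdint.h>

#define ST_CU_MASK     0xf0000000u /* Status.CU3..CU0 */
#define ST_MX          0x01000000u /* Status.MX (DSP/MDMX access) */
#define ST_KSU_MASK    0x00000018u /* Status.KSU */
#define ST_KSU_SHIFT   3
#define TCS_TCU_MASK   0xf0000000u /* TCStatus.TCU3..TCU0 */
#define TCS_TMX        0x08000000u /* TCStatus.TMX */
#define TCS_TKSU_MASK  0x00001800u /* TCStatus.TKSU */
#define TCS_TKSU_SHIFT 11

/* Returns 1 if the fields assumed to be shared agree, 0 otherwise. */
int tcstatus_matches_status(uint32_t status, uint32_t tcstatus)
{
	if ((status & ST_CU_MASK) != (tcstatus & TCS_TCU_MASK))
		return 0;
	if (!!(status & ST_MX) != !!(tcstatus & TCS_TMX))
		return 0;
	if (((status & ST_KSU_MASK) >> ST_KSU_SHIFT) !=
	    ((tcstatus & TCS_TKSU_MASK) >> TCS_TKSU_SHIFT))
		return 0;
	return 1;
}
```

Fed the values from the MT state dump posted in this thread (VPE0.Status
11004001, TC0 TCStatus 00000000), this returns 0: CU0 and MX are set in
Status while the corresponding TCU0 and TMX are clear in the reported
TCStatus.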
The fact that you're blowing up on a DSP exception after you force an
exit from the timer calibration loop might also be attributable
to TCStatus getting trashed, accidentally clearing access
rights to the DSP ASE state.
Honestly, just how many lines changed under arch/mips
(and include/asm-mips, if it was still outside arch/mips)
between 2.6.32 and 2.6.33? There simply can't be that
many to review.
Regards,
Kevin K.
On 12/16/10 05:03, Anoop P A wrote:
> On Wed, 2010-12-15 at 11:58 -0800, Kevin D. Kissell wrote:
>> On 12/15/10 11:18, Anoop P A wrote:
>>>> management algorithms I described
>>> Even with command line maxtcs=1 and maxvpes=1 I am seeing same hung. The
>>> register dump is copied below.
>> I guess what jumps out at me is that VPE0.EPC doesn't look to have
>> changed since the very initial boot vector, as if we'd never successfully
>> taken an exception or interrupt of any kind, prior to the NMI (I'm assuming
>> you're getting that MT state dump by breaking in with an NMI).
>> I'm puzzled that TC0.TCStatus is being reported as 0, when it should
>> have a bunch of bits in common with VPE0.Status. And I'm particularly
>> intrigued by the fact that you seem to have an interrupt bit set in Cause
>> which is enabled in Status, with IE set and EXL/ERL clear, yet you don't
>> seem to be getting interrupts.
>>
>> Do you have access to some kind of EJTAG probe for your system?
> Unfortunately I don't have access to a working EJTAG at the moment.
>
>>> I have tested few stable tags in git and isolated the code brake.
>>>
>>> 2.6.24-stable + patch[1] = SMTC boot success
>>> 2.6.29-stable + patch[1] = SMTC boot success
>>> 2.6.31-stable + patch[1] = SMTC boot success
>>> 2.6.32-stable + patch[1] = SMTC boot success
>>> 2.6.33-stable = SMTC boot failed
>>> 2.6.35-stable = SMTC boot failed
>>>
>>> So it looks like SMTC support got broke between 2.6.32 and 2.6.33 .
>> That's a pretty good job of isolating the problem, and the fact
>> that it happens even with no TC or VPE concurrency means it's
>> not a failure of the SMTC logic per se, but that someone changed
>> some code that's common to SMTC and "normal"/SMP operation
>> in a way that breaks the more constrained assumptions of SMTC.
>>
> I have tried digging diff between 2.6.32 and 2.6.33 but I couldn't spot
> any likely causes.
>
> I forgot to mention that I can boot newer kernels both in VSMP and UP
> mode.
>
> The other thing I have tried is booting kernel with pre-set lpj ( Just
> to test how far I can go), which lead me to a dsp exception (spurious ?)
>
> Let me know if you have any thoughts .
>
> Thanks,
> Anoop
>
> ################# log #############
>
> Linux version 2.6.33.7-pmc (paanoop1@paanoop1-desktop) (gcc version
> 4.5.1 (GCC) ) #27 SMP PREEMPT Thu Dec 16 17:49:46 IST 2010
> DSPRAM0: PA=1c100000,Size=00008000,enabled
> UART clock set to 50000000
> CPU revision is: 00019548 (MIPS 34Kc)
> Determined physical RAM map:
> memory: 00001000 @ 00000000 (reserved)
> memory: 000ff000 @ 00001000 (usable)
> memory: 00271000 @ 00100000 (reserved)
> memory: 0fc5a200 @ 00371000 (usable)
> Wasting 32 bytes for tracking 1 unused pages
> Zone PFN ranges:
> Normal 0x00000000 -> 0x0000ffcb
> Movable zone start PFN for each node
> early_node_map[1] active PFN ranges
> 0: 0x00000000 -> 0x0000ffcb
> 6 available secondary CPU TC(s)
> PERCPU: Embedded 7 pages/cpu @81203000 s4896 r8192 d15584 u65536
> pcpu-alloc: s4896 r8192 d15584 u65536 alloc=16*4096
> pcpu-alloc: [0] 0 [0] 1 [0] 2 [0] 3 [0] 4 [0] 5 [0] 6
> Built 1 zonelists in Zone order, mobility grouping on. Total pages:
> 64971
> Kernel command line: console=ttyS0,57600 lpj=796672
> PID hash table entries: 1024 (order: 0, 4096 bytes)
> Dentry cache hash table entries: 32768 (order: 5, 131072 bytes)
> Inode-cache hash table entries: 16384 (order: 4, 65536 bytes)
> Primary instruction cache 64kB, VIPT, 4-way, linesize 32 bytes.
> Primary data cache 64kB, 4-way, PIPT, no aliases, linesize 32 bytes
> Writing ErrCtl register=00000000
> Readback ErrCtl register=00000000
> Memory: 255548k/259428k available (1861k kernel code, 3504k reserved,
> 400k data, 156k init, 0k highmem)
> Hierarchical RCU implementation.
> NR_IRQS:128
> Clock rate set to 600000000
> console [ttyS0] enabled
> Calibrating delay loop (skipped) preset value.. 398.33 BogoMIPS
> (lpj=796672)
> Mount-cache hash table entries: 512
> Cpu 0
> $ 0 : 00000000 10102000 00000010 00000003
> $ 4 : 00000003 00000000 00000000 8f82f758
> $ 8 : 00000000 00000000 00000000 00000000
> $12 : 00000000 00000007 8f82301c 00000000
> $16 : 8f82f758 00800b00 8035d3c0 8f830000
> $20 : 80329df8 00000000 8035d3c0 80360000
> $24 : 00000000 00000001
> $28 : 80328000 80329ce0 8f82f868 8010d018
> Hi : 0000004c
> Lo : 3831f4b4
> epc : 8010d054 copy_thread+0x88/0x348
> Not tainted
> ra : 8010d018 copy_thread+0x4c/0x348
> Status: 10102000 KERNEL
> Cause : 50804068
> PrId : 00019548 (MIPS 34Kc)
> Kernel panic - not syncing: Unexpected DSP exception
>
>
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: SMTC support status in latest git head.
2010-12-15 19:58 ` Kevin D. Kissell
@ 2010-12-16 13:03 ` Anoop P A
2010-12-16 18:43 ` Kevin D. Kissell
0 siblings, 1 reply; 68+ messages in thread
From: Anoop P A @ 2010-12-16 13:03 UTC (permalink / raw)
To: Kevin D. Kissell; +Cc: Anoop P.A., linux-mips
On Wed, 2010-12-15 at 11:58 -0800, Kevin D. Kissell wrote:
> On 12/15/10 11:18, Anoop P A wrote:
> >> management algorithms I described
> > Even with command line maxtcs=1 and maxvpes=1 I am seeing same hung. The
> > register dump is copied below.
> I guess what jumps out at me is that VPE0.EPC doesn't look to have
> changed since the very initial boot vector, as if we'd never successfully
> taken an exception or interrupt of any kind, prior to the NMI (I'm assuming
> you're getting that MT state dump by breaking in with an NMI).
> I'm puzzled that TC0.TCStatus is being reported as 0, when it should
> have a bunch of bits in common with VPE0.Status. And I'm particularly
> intrigued by the fact that you seem to have an interrupt bit set in Cause
> which is enabled in Status, with IE set and EXL/ERL clear, yet you don't
> seem to be getting interrupts.
>
> Do you have access to some kind of EJTAG probe for your system?
Unfortunately I don't have access to a working EJTAG at the moment.
>
> > I have tested few stable tags in git and isolated the code brake.
> >
> > 2.6.24-stable + patch[1] = SMTC boot success
> > 2.6.29-stable + patch[1] = SMTC boot success
> > 2.6.31-stable + patch[1] = SMTC boot success
> > 2.6.32-stable + patch[1] = SMTC boot success
> > 2.6.33-stable = SMTC boot failed
> > 2.6.35-stable = SMTC boot failed
> >
> > So it looks like SMTC support got broke between 2.6.32 and 2.6.33 .
> That's a pretty good job of isolating the problem, and the fact
> that it happens even with no TC or VPE concurrency means it's
> not a failure of the SMTC logic per se, but that someone changed
> some code that's common to SMTC and "normal"/SMP operation
> in a way that breaks the more constrained assumptions of SMTC.
>
I have tried digging through the diff between 2.6.32 and 2.6.33, but I
couldn't spot any likely causes.
I forgot to mention that I can boot newer kernels in both VSMP and UP
mode.
The other thing I have tried is booting the kernel with a preset lpj (just
to test how far I can go), which led me to a DSP exception (spurious?).
Let me know if you have any thoughts.
Thanks,
Anoop
################# log #############
Linux version 2.6.33.7-pmc (paanoop1@paanoop1-desktop) (gcc version
4.5.1 (GCC) ) #27 SMP PREEMPT Thu Dec 16 17:49:46 IST 2010
DSPRAM0: PA=1c100000,Size=00008000,enabled
UART clock set to 50000000
CPU revision is: 00019548 (MIPS 34Kc)
Determined physical RAM map:
memory: 00001000 @ 00000000 (reserved)
memory: 000ff000 @ 00001000 (usable)
memory: 00271000 @ 00100000 (reserved)
memory: 0fc5a200 @ 00371000 (usable)
Wasting 32 bytes for tracking 1 unused pages
Zone PFN ranges:
Normal 0x00000000 -> 0x0000ffcb
Movable zone start PFN for each node
early_node_map[1] active PFN ranges
0: 0x00000000 -> 0x0000ffcb
6 available secondary CPU TC(s)
PERCPU: Embedded 7 pages/cpu @81203000 s4896 r8192 d15584 u65536
pcpu-alloc: s4896 r8192 d15584 u65536 alloc=16*4096
pcpu-alloc: [0] 0 [0] 1 [0] 2 [0] 3 [0] 4 [0] 5 [0] 6
Built 1 zonelists in Zone order, mobility grouping on. Total pages:
64971
Kernel command line: console=ttyS0,57600 lpj=796672
PID hash table entries: 1024 (order: 0, 4096 bytes)
Dentry cache hash table entries: 32768 (order: 5, 131072 bytes)
Inode-cache hash table entries: 16384 (order: 4, 65536 bytes)
Primary instruction cache 64kB, VIPT, 4-way, linesize 32 bytes.
Primary data cache 64kB, 4-way, PIPT, no aliases, linesize 32 bytes
Writing ErrCtl register=00000000
Readback ErrCtl register=00000000
Memory: 255548k/259428k available (1861k kernel code, 3504k reserved,
400k data, 156k init, 0k highmem)
Hierarchical RCU implementation.
NR_IRQS:128
Clock rate set to 600000000
console [ttyS0] enabled
Calibrating delay loop (skipped) preset value.. 398.33 BogoMIPS
(lpj=796672)
Mount-cache hash table entries: 512
Cpu 0
$ 0 : 00000000 10102000 00000010 00000003
$ 4 : 00000003 00000000 00000000 8f82f758
$ 8 : 00000000 00000000 00000000 00000000
$12 : 00000000 00000007 8f82301c 00000000
$16 : 8f82f758 00800b00 8035d3c0 8f830000
$20 : 80329df8 00000000 8035d3c0 80360000
$24 : 00000000 00000001
$28 : 80328000 80329ce0 8f82f868 8010d018
Hi : 0000004c
Lo : 3831f4b4
epc : 8010d054 copy_thread+0x88/0x348
Not tainted
ra : 8010d018 copy_thread+0x4c/0x348
Status: 10102000 KERNEL
Cause : 50804068
PrId : 00019548 (MIPS 34Kc)
Kernel panic - not syncing: Unexpected DSP exception
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: SMTC support status in latest git head.
2010-12-15 19:18 ` Anoop P A
@ 2010-12-15 19:58 ` Kevin D. Kissell
2010-12-16 13:03 ` Anoop P A
0 siblings, 1 reply; 68+ messages in thread
From: Kevin D. Kissell @ 2010-12-15 19:58 UTC (permalink / raw)
To: Anoop P A; +Cc: Anoop P.A., linux-mips
On 12/15/10 11:18, Anoop P A wrote:
> On Tue, 2010-12-14 at 10:32 -0800, Kevin D. Kissell wrote:
>
>> I can't comment on your tweak to 2.6.24.7 without seeing it as a patch
>> diff.
> http://patchwork.linux-mips.org/patch/804/ I was speaking about this
> patch. Since my timer is connected through a cascaded CIC , It is
> required to check TI bit of cause register in order to ensure a timer
> interrupt. With above mentioned patch I was able to boot a 2.6.24-stable
> SMTC kernel. ( Not tested fully though )
OK, yes, of course, you'd need that patch.
>> The recommended procedure was, and remains, to isolate clock
>> propagation problems by using command line options "maxtcs="
>> and "maxvpes=".
>>
>> First, boot your SMTC kernel with maxtcs=1 and maxvpes=1,
>> a virtual uniprocessor. If that doesn't run, you've got some fundamental
>> problem with support for your platform, or someone has really fundamentally
>> broken the SMTC build somewhere. Next, try booting with maxtcs=2
>> and maxvpes=1, then with no constraint on maxtcs and maxvpes=1.
>> If those fail, your problem is probably in the interrupt mask
>> management algorithms I described
> Even with command line maxtcs=1 and maxvpes=1 I am seeing same hung. The
> register dump is copied below.
I guess what jumps out at me is that VPE0.EPC doesn't look to have
changed since the very initial boot vector, as if we'd never successfully
taken an exception or interrupt of any kind, prior to the NMI (I'm assuming
you're getting that MT state dump by breaking in with an NMI).
I'm puzzled that TC0.TCStatus is being reported as 0, when it should
have a bunch of bits in common with VPE0.Status. And I'm particularly
intrigued by the fact that you seem to have an interrupt bit set in Cause
which is enabled in Status, with IE set and EXL/ERL clear, yet you don't
seem to be getting interrupts.
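The observation can be reproduced from the dump values with a small decode.
This is only a sketch under the standard MIPS32 layout (Status.IE/EXL/ERL in
bits 0..2, Status.IM and Cause.IP in bits 15:8). It deliberately ignores
MT-specific gating such as TCStatus.IXMT, and the function name is invented
for illustration.

```c
#include <stdint.h>

#define ST_IE    0x1u  /* Status.IE: global interrupt enable */
#define ST_EXL   0x2u  /* Status.EXL: exception level */
#define ST_ERL   0x4u  /* Status.ERL: error level */
#define IP_SHIFT 8     /* IM0/IP0 live at bit 8 */
#define IP_MASK  0xffu

/*
 * Returns the set of interrupt lines (bit n = IPn) that are both pending
 * in Cause and unmasked in Status, or 0 if interrupts are blocked by
 * IE, EXL or ERL.
 */
uint32_t deliverable_irqs(uint32_t status, uint32_t cause)
{
	if (!(status & ST_IE) || (status & (ST_EXL | ST_ERL)))
		return 0;
	return ((cause >> IP_SHIFT) & (status >> IP_SHIFT)) & IP_MASK;
}
```

For the first dump (VPE0.Status 11004001, VPE0.Cause 50804000) this gives
0x40, i.e. IP6 pending and enabled, which is what makes the absence of
interrupts puzzling. For the second dump (Status 18004000) it gives 0,
because IE is clear there.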
Do you have access to some kind of EJTAG probe for your system?
> I have tested few stable tags in git and isolated the code brake.
>
> 2.6.24-stable + patch[1] = SMTC boot success
> 2.6.29-stable + patch[1] = SMTC boot success
> 2.6.31-stable + patch[1] = SMTC boot success
> 2.6.32-stable + patch[1] = SMTC boot success
> 2.6.33-stable = SMTC boot failed
> 2.6.35-stable = SMTC boot failed
>
> So it looks like SMTC support got broke between 2.6.32 and 2.6.33 .
That's a pretty good job of isolating the problem, and the fact
that it happens even with no TC or VPE concurrency means it's
not a failure of the SMTC logic per se, but that someone changed
some code that's common to SMTC and "normal"/SMP operation
in a way that breaks the more constrained assumptions of SMTC.
> Thanks and Regards,
> Anoop
>
> patch[1] : http://patchwork.linux-mips.org/patch/804/
>
>
> #############################Log###########################
> 0.000000] Calibrating delay loop... === MIPS MT State Dump ===
> [ 0.000000] -- Global State --
> [ 0.000000] MVPControl Passed: 00000000
> [ 0.000000] MVPControl Read: 00000000
> [ 0.000000] MVPConf0 : a8008406
> [ 0.000000] -- per-VPE State --
> [ 0.000000] VPE 0
> [ 0.000000] VPEControl : 00000000
> [ 0.000000] VPEConf0 : 800f0003
> [ 0.000000] VPE0.Status : 11004001
> [ 0.000000] VPE0.EPC : 80100000 _stext+0x0/0x10
> [ 0.000000] VPE0.Cause : 50804000
> [ 0.000000] VPE0.Config7 : 00010000
> [ 0.000000] VPE 1
> [ 0.000000] VPEControl : 00060000
> [ 0.000000] VPEConf0 : 800f0000
> [ 0.000000] VPE1.Status : 00408305
> [ 0.000000] VPE1.EPC : 80100380 name_to_dev_t+0x50/0x430
> [ 0.000000] VPE1.Cause : 50000200
> [ 0.000000] VPE1.Config7 : 00010000
> [ 0.000000] -- per-TC State --
> [ 0.000000] TC 0 (current TC with VPE EPC above)
> [ 0.000000] TCStatus : 00000000
> [ 0.000000] TCBind : 00000000
> [ 0.000000] TCRestart : 8010d860 mips_mt_regdump+0x2f0/0x3c4
> [ 0.000000] TCHalt : 00000000
> [ 0.000000] TCContext : 00000000
> [ 0.000000] TC 1
> [ 0.000000] TCStatus : 00000000
> [ 0.000000] TCBind : 00200001
> [ 0.000000] TCRestart : 8f800020 0x8f800020
> [ 0.000000] TCHalt : 00000001
> [ 0.000000] TCContext : 00140000
> [ 0.000000] TC 2
> [ 0.000000] TCStatus : 00000000
> [ 0.000000] TCBind : 00400001
> [ 0.000000] TCRestart : 8f800020 0x8f800020
> [ 0.000000] TCHalt : 00000001
> [ 0.000000] TCContext : 00280000
> [ 0.000000] TC 3
> [ 0.000000] TCStatus : 00000000
> [ 0.000000] TCBind : 00600001
> [ 0.000000] TCRestart : 8f800020 0x8f800020
> [ 0.000000] TCHalt : 00000001
> [ 0.000000] TCContext : 003c0000
> [ 0.000000] TC 4
> [ 0.000000] TCStatus : 00000000
> [ 0.000000] TCBind : 00800001
> [ 0.000000] TCRestart : 80100380 name_to_dev_t+0x50/0x430
> [ 0.000000] TCHalt : 00000001
> [ 0.000000] TCContext : 00500000
> [ 0.000000] TC 5
> [ 0.000000] TCStatus : 00000000
> [ 0.000000] TCBind : 00a00001
> [ 0.000000] TCRestart : 80100380 name_to_dev_t+0x50/0x430
> [ 0.000000] TCHalt : 00000001
> [ 0.000000] TCContext : 00640000
> [ 0.000000] TC 6
> [ 0.000000] TCStatus : 00000000
> [ 0.000000] TCBind : 00c00001
> [ 0.000000] TCRestart : 80268e00 aes_encrypt+0x10e4/0x164c
> [ 0.000000] TCHalt : 00000001
> [ 0.000000] TCContext : 00780000
> [ 0.000000] Counter Interrupts taken per CPU (TC)
> [ 0.000000] 0: 0
> [ 0.000000] 1: 0
> [ 0.000000] 2: 0
> [ 0.000000] 3: 0
> [ 0.000000] 4: 0
> [ 0.000000] 5: 0
> [ 0.000000] 6: 0
> [ 0.000000] 7: 0
> [ 0.000000] Self-IPI invocations:
> [ 0.000000] 0: 0
> [ 0.000000] 1: 0
> [ 0.000000] 2: 0
> [ 0.000000] 3: 0
> [ 0.000000] 4: 0
> [ 0.000000] 5: 0
> [ 0.000000] 6: 0
> [ 0.000000] 7: 0
> [ 0.000000] IPIQ[0]: head = 0x0, tail = 0x0, depth = 0
> [ 0.000000] IPIQ[1]: head = 0x0, tail = 0x0, depth = 0
> [ 0.000000] IPIQ[2]: head = 0x0, tail = 0x0, depth = 0
> [ 0.000000] IPIQ[3]: head = 0x0, tail = 0x0, depth = 0
> [ 0.000000] IPIQ[4]: head = 0x0, tail = 0x0, depth = 0
> [ 0.000000] IPIQ[5]: head = 0x0, tail = 0x0, depth = 0
> [ 0.000000] IPIQ[6]: head = 0x0, tail = 0x0, depth = 0
> [ 0.000000] IPIQ[7]: head = 0x0, tail = 0x0, depth = 0
> [ 0.000000] 0 Recoveries of "stolen" FPU
> [ 0.000000] ===========================
> [ 0.000000] In platform cic dispatch cic_mask=0x22000 stat=0x2402000f
> pend=0x20000
> [ 0.010000] === MIPS MT State Dump ===
> [ 0.010000] -- Global State --
> [ 0.010000] MVPControl Passed: 00000000
> [ 0.010000] MVPControl Read: 00000000
> [ 0.010000] MVPConf0 : a8008406
> [ 0.010000] -- per-VPE State --
> [ 0.010000] VPE 0
> [ 0.010000] VPEControl : 00000000
> [ 0.010000] VPEConf0 : 800f0003
> [ 0.010000] VPE0.Status : 18004000
> [ 0.010000] VPE0.EPC : 8010d900 mips_mt_regdump+0x390/0x3c4
> [ 0.010000] VPE0.Cause : 40804000
> [ 0.010000] VPE0.Config7 : 00010000
> [ 0.010000] VPE 1
> [ 0.010000] VPEControl : 00060000
> [ 0.010000] VPEConf0 : 800f0000
> [ 0.010000] VPE1.Status : 00408305
> [ 0.010000] VPE1.EPC : 80100380 name_to_dev_t+0x50/0x430
> [ 0.010000] VPE1.Cause : 50000200
> [ 0.010000] VPE1.Config7 : 00010000
> [ 0.010000] -- per-TC State --
> [ 0.010000] TC 0 (current TC with VPE EPC above)
> [ 0.010000] TCStatus : 00000000
> [ 0.010000] TCBind : 00000000
> [ 0.010000] TCRestart : 803f791c printk+0xc/0x30
> [ 0.010000] TCHalt : 00000000
> [ 0.010000] TCContext : 00000000
> [ 0.010000] TC 1
> [ 0.010000] TCStatus : 00000000
> [ 0.010000] TCBind : 00200001
> [ 0.010000] TCRestart : 8f800020 0x8f800020
> [ 0.010000] TCHalt : 00000001
> [ 0.010000] TCContext : 00140000
> [ 0.010000] TC 2
> [ 0.010000] TCStatus : 00000000
> [ 0.010000] TCBind : 00400001
> [ 0.010000] TCRestart : 8f800020 0x8f800020
> [ 0.010000] TCHalt : 00000001
> [ 0.010000] TCContext : 00280000
> [ 0.010000] TC 3
> [ 0.010000] TCStatus : 00000000
> [ 0.010000] TCBind : 00600001
> [ 0.010000] TCRestart : 8f800020 0x8f800020
> [ 0.010000] TCHalt : 00000001
> [ 0.010000] TCContext : 003c0000
> [ 0.010000] TC 4
> [ 0.010000] TCStatus : 00000000
> [ 0.010000] TCBind : 00800001
> [ 0.010000] TCRestart : 80100380 name_to_dev_t+0x50/0x430
> [ 0.010000] TCHalt : 00000001
> [ 0.010000] TCContext : 00500000
> [ 0.010000] TC 5
> [ 0.010000] TCStatus : 00000000
> [ 0.010000] TCBind : 00a00001
> [ 0.010000] TCRestart : 80100380 name_to_dev_t+0x50/0x430
> [ 0.010000] TCHalt : 00000001
> [ 0.010000] TCContext : 00640000
> [ 0.010000] TC 6
> [ 0.010000] TCStatus : 00000000
> [ 0.010000] TCBind : 00c00001
> [ 0.010000] TCRestart : 80268e00 aes_encrypt+0x10e4/0x164c
> [ 0.010000] TCHalt : 00000001
> [ 0.010000] TCContext : 00780000
> [ 0.010000] Counter Interrupts taken per CPU (TC)
> [ 0.010000] 0: 0
> [ 0.010000] 1: 0
> [ 0.010000] 2: 0
> [ 0.010000] 3: 0
> [ 0.010000] 4: 0
> [ 0.010000] 5: 0
> [ 0.010000] 6: 0
> [ 0.010000] 7: 0
> [ 0.010000] Self-IPI invocations:
> [ 0.010000] 0: 0
> [ 0.010000] 1: 0
> [ 0.010000] 2: 0
> [ 0.010000] 3: 0
> [ 0.010000] 4: 0
> [ 0.010000] 5: 0
> [ 0.010000] 6: 0
> [ 0.010000] 7: 0
> [ 0.010000] IPIQ[0]: head = 0x0, tail = 0x0, depth = 0
> [ 0.010000] IPIQ[1]: head = 0x0, tail = 0x0, depth = 0
> [ 0.010000] IPIQ[2]: head = 0x0, tail = 0x0, depth = 0
> [ 0.010000] IPIQ[3]: head = 0x0, tail = 0x0, depth = 0
> [ 0.010000] IPIQ[4]: head = 0x0, tail = 0x0, depth = 0
> [ 0.010000] IPIQ[5]: head = 0x0, tail = 0x0, depth = 0
> [ 0.010000] IPIQ[6]: head = 0x0, tail = 0x0, depth = 0
> [ 0.010000] IPIQ[7]: head = 0x0, tail = 0x0, depth = 0
> [ 0.010000] 0 Recoveries of "stolen" FPU
> [ 0.010000] ===========================
>
>
>
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: SMTC support status in latest git head.
2010-12-14 18:32 ` Kevin D. Kissell
2010-12-14 18:50 ` Ralf Baechle
@ 2010-12-15 19:18 ` Anoop P A
2010-12-15 19:58 ` Kevin D. Kissell
1 sibling, 1 reply; 68+ messages in thread
From: Anoop P A @ 2010-12-15 19:18 UTC (permalink / raw)
To: Kevin D. Kissell; +Cc: Anoop P.A., linux-mips
On Tue, 2010-12-14 at 10:32 -0800, Kevin D. Kissell wrote:
> Between your mailer and mine (Thunderbird 3.1 on Ubuntu), the quoting
> has become something of a dogs breakfast, so let me just lay things out
> here as best I can.
I am sorry about that. It should be better with Evolution, I hope.
>
> I can't comment on your tweak to 2.6.24.7 without seeing it as a patch
> diff.
http://patchwork.linux-mips.org/patch/804/ is the patch I was speaking
about. Since my timer is connected through a cascaded CIC, it is
necessary to check the TI bit of the Cause register to confirm a timer
interrupt. With the above-mentioned patch I was able to boot a
2.6.24-stable SMTC kernel (not fully tested, though).
> The recommended procedure was, and remains, to isolate clock
> propagation problems by using command line options "maxtcs="
> and "maxvpes=".
>
> First, boot your SMTC kernel with maxtcs=1 and maxvpes=1,
> a virtual uniprocessor. If that doesn't run, you've got some fundamental
> problem with support for your platform, or someone has really fundamentally
> broken the SMTC build somewhere. Next, try booting with maxtcs=2
> and maxvpes=1, then with no constraint on maxtcs and maxvpes=1.
> If those fail, your problem is probably in the interrupt mask
> management algorithms I described
Even with maxtcs=1 and maxvpes=1 on the command line I am seeing the same
hang. The register dump is copied below.
> Your dump below looks as if it comes from 2 TCs running on
> 2 VPEs, and that the interrupt mask issues I alluded to earlier
> are neither relevant nor manifest. It looks instead as if the
> initialization of "CPU 1" (VPE1/TC1) may not have been done
> properly. Under normal operation, it would be pretty rare to
> catch TC 1 in the exception vector dispatch code, so the first
> hypothesis that comes to mind is that something isn't right in
> the vector/handler setup, and TC 1 is stuck in an infinite exception
> loop, unable to handshake with TC 0 and thus locking up the
> system. But that's just my best guess based on limited data.
>
> Regards,
>
> Kevin K.
>
I have tested a few stable tags in git and isolated the breakage.
2.6.24-stable + patch[1] = SMTC boot success
2.6.29-stable + patch[1] = SMTC boot success
2.6.31-stable + patch[1] = SMTC boot success
2.6.32-stable + patch[1] = SMTC boot success
2.6.33-stable = SMTC boot failed
2.6.35-stable = SMTC boot failed
So it looks like SMTC support got broken between 2.6.32 and 2.6.33.
Thanks and Regards,
Anoop
patch[1] : http://patchwork.linux-mips.org/patch/804/
#############################Log###########################
[ 0.000000] Calibrating delay loop... === MIPS MT State Dump ===
[ 0.000000] -- Global State --
[ 0.000000] MVPControl Passed: 00000000
[ 0.000000] MVPControl Read: 00000000
[ 0.000000] MVPConf0 : a8008406
[ 0.000000] -- per-VPE State --
[ 0.000000] VPE 0
[ 0.000000] VPEControl : 00000000
[ 0.000000] VPEConf0 : 800f0003
[ 0.000000] VPE0.Status : 11004001
[ 0.000000] VPE0.EPC : 80100000 _stext+0x0/0x10
[ 0.000000] VPE0.Cause : 50804000
[ 0.000000] VPE0.Config7 : 00010000
[ 0.000000] VPE 1
[ 0.000000] VPEControl : 00060000
[ 0.000000] VPEConf0 : 800f0000
[ 0.000000] VPE1.Status : 00408305
[ 0.000000] VPE1.EPC : 80100380 name_to_dev_t+0x50/0x430
[ 0.000000] VPE1.Cause : 50000200
[ 0.000000] VPE1.Config7 : 00010000
[ 0.000000] -- per-TC State --
[ 0.000000] TC 0 (current TC with VPE EPC above)
[ 0.000000] TCStatus : 00000000
[ 0.000000] TCBind : 00000000
[ 0.000000] TCRestart : 8010d860 mips_mt_regdump+0x2f0/0x3c4
[ 0.000000] TCHalt : 00000000
[ 0.000000] TCContext : 00000000
[ 0.000000] TC 1
[ 0.000000] TCStatus : 00000000
[ 0.000000] TCBind : 00200001
[ 0.000000] TCRestart : 8f800020 0x8f800020
[ 0.000000] TCHalt : 00000001
[ 0.000000] TCContext : 00140000
[ 0.000000] TC 2
[ 0.000000] TCStatus : 00000000
[ 0.000000] TCBind : 00400001
[ 0.000000] TCRestart : 8f800020 0x8f800020
[ 0.000000] TCHalt : 00000001
[ 0.000000] TCContext : 00280000
[ 0.000000] TC 3
[ 0.000000] TCStatus : 00000000
[ 0.000000] TCBind : 00600001
[ 0.000000] TCRestart : 8f800020 0x8f800020
[ 0.000000] TCHalt : 00000001
[ 0.000000] TCContext : 003c0000
[ 0.000000] TC 4
[ 0.000000] TCStatus : 00000000
[ 0.000000] TCBind : 00800001
[ 0.000000] TCRestart : 80100380 name_to_dev_t+0x50/0x430
[ 0.000000] TCHalt : 00000001
[ 0.000000] TCContext : 00500000
[ 0.000000] TC 5
[ 0.000000] TCStatus : 00000000
[ 0.000000] TCBind : 00a00001
[ 0.000000] TCRestart : 80100380 name_to_dev_t+0x50/0x430
[ 0.000000] TCHalt : 00000001
[ 0.000000] TCContext : 00640000
[ 0.000000] TC 6
[ 0.000000] TCStatus : 00000000
[ 0.000000] TCBind : 00c00001
[ 0.000000] TCRestart : 80268e00 aes_encrypt+0x10e4/0x164c
[ 0.000000] TCHalt : 00000001
[ 0.000000] TCContext : 00780000
[ 0.000000] Counter Interrupts taken per CPU (TC)
[ 0.000000] 0: 0
[ 0.000000] 1: 0
[ 0.000000] 2: 0
[ 0.000000] 3: 0
[ 0.000000] 4: 0
[ 0.000000] 5: 0
[ 0.000000] 6: 0
[ 0.000000] 7: 0
[ 0.000000] Self-IPI invocations:
[ 0.000000] 0: 0
[ 0.000000] 1: 0
[ 0.000000] 2: 0
[ 0.000000] 3: 0
[ 0.000000] 4: 0
[ 0.000000] 5: 0
[ 0.000000] 6: 0
[ 0.000000] 7: 0
[ 0.000000] IPIQ[0]: head = 0x0, tail = 0x0, depth = 0
[ 0.000000] IPIQ[1]: head = 0x0, tail = 0x0, depth = 0
[ 0.000000] IPIQ[2]: head = 0x0, tail = 0x0, depth = 0
[ 0.000000] IPIQ[3]: head = 0x0, tail = 0x0, depth = 0
[ 0.000000] IPIQ[4]: head = 0x0, tail = 0x0, depth = 0
[ 0.000000] IPIQ[5]: head = 0x0, tail = 0x0, depth = 0
[ 0.000000] IPIQ[6]: head = 0x0, tail = 0x0, depth = 0
[ 0.000000] IPIQ[7]: head = 0x0, tail = 0x0, depth = 0
[ 0.000000] 0 Recoveries of "stolen" FPU
[ 0.000000] ===========================
[ 0.000000] In platform cic dispatch cic_mask=0x22000 stat=0x2402000f
pend=0x20000
[ 0.010000] === MIPS MT State Dump ===
[ 0.010000] -- Global State --
[ 0.010000] MVPControl Passed: 00000000
[ 0.010000] MVPControl Read: 00000000
[ 0.010000] MVPConf0 : a8008406
[ 0.010000] -- per-VPE State --
[ 0.010000] VPE 0
[ 0.010000] VPEControl : 00000000
[ 0.010000] VPEConf0 : 800f0003
[ 0.010000] VPE0.Status : 18004000
[ 0.010000] VPE0.EPC : 8010d900 mips_mt_regdump+0x390/0x3c4
[ 0.010000] VPE0.Cause : 40804000
[ 0.010000] VPE0.Config7 : 00010000
[ 0.010000] VPE 1
[ 0.010000] VPEControl : 00060000
[ 0.010000] VPEConf0 : 800f0000
[ 0.010000] VPE1.Status : 00408305
[ 0.010000] VPE1.EPC : 80100380 name_to_dev_t+0x50/0x430
[ 0.010000] VPE1.Cause : 50000200
[ 0.010000] VPE1.Config7 : 00010000
[ 0.010000] -- per-TC State --
[ 0.010000] TC 0 (current TC with VPE EPC above)
[ 0.010000] TCStatus : 00000000
[ 0.010000] TCBind : 00000000
[ 0.010000] TCRestart : 803f791c printk+0xc/0x30
[ 0.010000] TCHalt : 00000000
[ 0.010000] TCContext : 00000000
[ 0.010000] TC 1
[ 0.010000] TCStatus : 00000000
[ 0.010000] TCBind : 00200001
[ 0.010000] TCRestart : 8f800020 0x8f800020
[ 0.010000] TCHalt : 00000001
[ 0.010000] TCContext : 00140000
[ 0.010000] TC 2
[ 0.010000] TCStatus : 00000000
[ 0.010000] TCBind : 00400001
[ 0.010000] TCRestart : 8f800020 0x8f800020
[ 0.010000] TCHalt : 00000001
[ 0.010000] TCContext : 00280000
[ 0.010000] TC 3
[ 0.010000] TCStatus : 00000000
[ 0.010000] TCBind : 00600001
[ 0.010000] TCRestart : 8f800020 0x8f800020
[ 0.010000] TCHalt : 00000001
[ 0.010000] TCContext : 003c0000
[ 0.010000] TC 4
[ 0.010000] TCStatus : 00000000
[ 0.010000] TCBind : 00800001
[ 0.010000] TCRestart : 80100380 name_to_dev_t+0x50/0x430
[ 0.010000] TCHalt : 00000001
[ 0.010000] TCContext : 00500000
[ 0.010000] TC 5
[ 0.010000] TCStatus : 00000000
[ 0.010000] TCBind : 00a00001
[ 0.010000] TCRestart : 80100380 name_to_dev_t+0x50/0x430
[ 0.010000] TCHalt : 00000001
[ 0.010000] TCContext : 00640000
[ 0.010000] TC 6
[ 0.010000] TCStatus : 00000000
[ 0.010000] TCBind : 00c00001
[ 0.010000] TCRestart : 80268e00 aes_encrypt+0x10e4/0x164c
[ 0.010000] TCHalt : 00000001
[ 0.010000] TCContext : 00780000
[ 0.010000] Counter Interrupts taken per CPU (TC)
[ 0.010000] 0: 0
[ 0.010000] 1: 0
[ 0.010000] 2: 0
[ 0.010000] 3: 0
[ 0.010000] 4: 0
[ 0.010000] 5: 0
[ 0.010000] 6: 0
[ 0.010000] 7: 0
[ 0.010000] Self-IPI invocations:
[ 0.010000] 0: 0
[ 0.010000] 1: 0
[ 0.010000] 2: 0
[ 0.010000] 3: 0
[ 0.010000] 4: 0
[ 0.010000] 5: 0
[ 0.010000] 6: 0
[ 0.010000] 7: 0
[ 0.010000] IPIQ[0]: head = 0x0, tail = 0x0, depth = 0
[ 0.010000] IPIQ[1]: head = 0x0, tail = 0x0, depth = 0
[ 0.010000] IPIQ[2]: head = 0x0, tail = 0x0, depth = 0
[ 0.010000] IPIQ[3]: head = 0x0, tail = 0x0, depth = 0
[ 0.010000] IPIQ[4]: head = 0x0, tail = 0x0, depth = 0
[ 0.010000] IPIQ[5]: head = 0x0, tail = 0x0, depth = 0
[ 0.010000] IPIQ[6]: head = 0x0, tail = 0x0, depth = 0
[ 0.010000] IPIQ[7]: head = 0x0, tail = 0x0, depth = 0
[ 0.010000] 0 Recoveries of "stolen" FPU
[ 0.010000] ===========================
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: SMTC support status in latest git head
2010-12-14 21:27 ` STUART VENTERS
(?)
@ 2010-12-14 23:01 ` Kevin D. Kissell
-1 siblings, 0 replies; 68+ messages in thread
From: Kevin D. Kissell @ 2010-12-14 23:01 UTC (permalink / raw)
To: STUART VENTERS; +Cc: linux-mips
On 12/14/10 13:27, STUART VENTERS wrote:
> Kevin,
>
> It turns out we are also looking at Linux SMTC support for 34kc.
> (For a different pmc part.)
>
> You said you remembered seeing it work on at least one version of the kernel.
>
> Could you help us find that version by bracketing the search a bit?
>
> Maybe a date and/or version range to look in.
>
There were early working versions without dyntick or interrupt affinity
in the 2.6.23/24 timeframe, but as per the commit logs at linux-mips.org,
I finally got the dyntick stuff working in September 2008, with the commits
propagating to various git branches over the following two months. I can
see that the new code was in 2.6.28.1 but not in 2.6.26.8. At some point
subsequent to that, I'm pretty sure I checked out the then-latest stable
version of the Malta branch and got a functional build.
The last time I regression checked it was in March of 2009 at which point
some infrastructure changes had broken things, which I fixed in patches
posted on March 31, 2009, one of which addressed a change in the semantics
of CP0 access macros, and one of which fixed a name conflict.
Those were committed on 3/31 and 5/14/2009, depending on the branch
you look at. With those patches and only those patches on what was then
the latest stable (Malta?) branch at LMO, it seemed to run OK
to the limited degree I was able to have it tested. Someone else found a
hole in smtc_distribute_timer() in November of 2009, and I worked with
the discoverer on a very small patch committed November 13, 2009,
but I never actually ran the code to test (then again, I'd never been able
to drive a system into the failure it could cause).
Sorry to be a little vague, but I no longer have my MIPS Linux development
build or test systems, so I'm reduced to googling and searching LMO, just
like anyone else.
Regards,
Kevin K.
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: SMTC support status in latest git head
@ 2010-12-14 21:27 ` STUART VENTERS
0 siblings, 0 replies; 68+ messages in thread
From: STUART VENTERS @ 2010-12-14 21:27 UTC (permalink / raw)
To: kevink; +Cc: linux-mips
Kevin,
It turns out we are also looking at Linux SMTC support for 34kc.
(For a different pmc part.)
You said you remembered seeing it work on at least one version of the kernel.
Could you help us find that version by bracketing the search a bit?
Maybe a date and/or version range to look in.
Regards,
Stuart Venters
Adtran
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: SMTC support status in latest git head.
2010-12-14 18:32 ` Kevin D. Kissell
@ 2010-12-14 18:50 ` Ralf Baechle
2010-12-15 19:18 ` Anoop P A
1 sibling, 0 replies; 68+ messages in thread
From: Ralf Baechle @ 2010-12-14 18:50 UTC (permalink / raw)
To: Kevin D. Kissell; +Cc: Anoop P.A., linux-mips
On Tue, Dec 14, 2010 at 10:32:57AM -0800, Kevin D. Kissell wrote:
> I am no longer associated with MIPS Technologies and no longer have
> access to my email archives from that period. If I did, I could tell you
> which LMO kernel version(s) had SMTC working "out of the box". There
> definitely was at least one, and I commented on it in an email. You
> might be able to find it in the LMO email archives, but it's possible that
> I only sent it to a MIPS internal mailing list.
>
> There was also a message I wrote that I had *thought* had gone to
> the LMO mailing list, but may have only been sent to a group of internal
> MIPS and customer engineers, in which I described the recommended
> procedure for debugging exactly this canonical problem with porting
> SMTC.
git bisect to the rescue :) It's time consuming with a slow machine but
perfectly doable. Go back, find some antique kernel version with
functioning SMTC and take it from there.
Ralf
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: SMTC support status in latest git head.
2010-12-14 15:25 ` Anoop P.A.
(?)
@ 2010-12-14 18:32 ` Kevin D. Kissell
2010-12-14 18:50 ` Ralf Baechle
2010-12-15 19:18 ` Anoop P A
-1 siblings, 2 replies; 68+ messages in thread
From: Kevin D. Kissell @ 2010-12-14 18:32 UTC (permalink / raw)
To: Anoop P.A.; +Cc: linux-mips
Between your mailer and mine (Thunderbird 3.1 on Ubuntu), the quoting
has become something of a dog's breakfast, so let me just lay things out
here as best I can.
I can't comment on your tweak to 2.6.24.7 without seeing it as a patch
diff.
I am no longer associated with MIPS Technologies and no longer have
access to my email archives from that period. If I did, I could tell you
which LMO kernel version(s) had SMTC working "out of the box". There
definitely was at least one, and I commented on it in an email. You
might be able to find it in the LMO email archives, but it's possible that
I only sent it to a MIPS internal mailing list.
There was also a message I wrote that I had *thought* had gone to
the LMO mailing list, but may have only been sent to a group of internal
MIPS and customer engineers, in which I described the recommended
procedure for debugging exactly this canonical problem with porting
SMTC.
The recommended procedure was, and remains, to isolate clock
propagation problems by using command line options "maxtcs="
and "maxvpes=".
First, boot your SMTC kernel with maxtcs=1 and maxvpes=1,
a virtual uniprocessor. If that doesn't run, you've got some fundamental
problem with support for your platform, or someone has really fundamentally
broken the SMTC build somewhere. Next, try booting with maxtcs=2
and maxvpes=1, then with no constraint on maxtcs and maxvpes=1.
If those fail, your problem is probably in the interrupt mask
management algorithms I described.
On the other hand, if you boot with maxtcs=2 and maxvpes=2,
there will be only one TC per VPE and far less vulnerability to interrupt
mask lockup, but you need to have cross-VPE IPI interrupts working.
The preferred method of doing cross-VPE IPIs would be to use a physical
interrupt input that's instantiated per-VPE and manipulable by software.
Malta didn't have one, so there's the historical hack of using
MIPS MT instructions to freeze the other VPE and set up a
software interrupt using MTTR to the remote Cause register.
The PMC-Sierra platforms did, if I recall correctly, have some kind
of register that one could write to cause a real cross-VPE hardware
interrupt, but I don't recall whether it got used in the SMTC port.
Your dump below looks as if it comes from 2 TCs running on
2 VPEs, and that the interrupt mask issues I alluded to earlier
are neither relevant nor manifest. It looks instead as if the
initialization of "CPU 1" (VPE1/TC1) may not have been done
properly. Under normal operation, it would be pretty rare to
catch TC 1 in the exception vector dispatch code, so the first
hypothesis that comes to mind is that something isn't right in
the vector/handler setup, and TC 1 is stuck in an infinite exception
loop, unable to handshake with TC 0 and thus locking up the
system. But that's just my best guess based on limited data.
Regards,
Kevin K.
On 12/14/10 07:25, Anoop P.A. wrote:
>> it ended up being cleaner and more efficient to have *some* hooks in
>> platform specific timer code. It was there for Malta in the kernel.org
>> mainline once upon a time, and I *thought* we'd propagated working code
>> for the initial PMC-Sierra 34K-based SoC's at least as far as
> [Anoop P.A.]
> I was able to boot the 2.6.24-7 git sources with a change in cevt-r4k.c:
> c0_compare_int_pending was changed to
> "return (read_c0_cause() >> cp0_compare_irq_shift) & (1ul << CAUSEB_IP)"
>
>> linux-mips.org, but the source tree has been considerably reorganized -
>> there was a time when some of the hooks were under
>> arch/mips/mips-boards/generic, which no longer exists - and I'm not sure
>> where to point you. Git and grep are your friends.
> [Anoop P.A.] The Malta code has been moved to arch/mips/mti-malta/.
> Can you recall which version of the l-m-o kernel had known working SMTC
> support?
>
>> The first order of business is to break into that hung timer calibration
>> loop and dump the CP0 registers for the VPE and the TCs, in particular
>> checking the interrupt enable mask in Status against the pending
>> interrupts in the Cause register. If you're seeing the timer
>> interrupt's bit set in Cause, but clear in Status, you need to fix the
>> SMTC interrupt mask hook for your platform timer.
> [Anoop P.A.]
> I tried dumping the registers from inside the calibration while loop.
> It looks like the timer interrupt bit stays high in both the Cause and
> Status registers (in my case the timer interrupt is connected to the
> cascaded CIC interrupt, which is connected to irq 6 (C_IRQ4)). The
> detailed log is pasted below.
>
>> check to see if you're building for "tickless" operation. Tickless ends
>> up being really important for SMTC, and I did get it working properly
>> back in 2008, but the SMTC-specific cevt-smtc.c code uses common
>> functions in cevt-r4k.c, and I've seen some patches to cevt-r4k.c going
>> by that I rather doubt were ever tested against an SMTC build/platform.
>> There might have been breakage there, and configuring to use a fixed
>> interval timer (say, 100Hz) would be a way to test that hypothesis.
> [Anoop P.A.] I have tried both tickless and a fixed interval timer.
>
>> Regards,
>>
>> Kevin K.
>
> [Anoop P.A.] Thanks much for your and Ralf's detailed response.
> [Anoop P.A.]
> [ 0.000000] Writing ErrCtl register=00000000
> [ 0.000000] Readback ErrCtl register=00000000
> [ 0.000000] Memory: 254384k/257912k available (3062k kernel code,
> 3528k reserved, 648k data, 200k init, 0k highmem)
> [ 0.000000] Preemptable hierarchical RCU implementation.
> [ 0.000000] NR_IRQS:128
> [ 0.000000] console [ttyS0] enabled
> [ 0.000000] Clock rate set to 600000000
> [ 0.000000] Calibrating delay loop... === MIPS MT State Dump ===
> [ 0.000000] -- Global State --
> [ 0.000000] MVPControl Passed: 00000000
> [ 0.000000] MVPControl Read: 00000000
> [ 0.000000] MVPConf0 : a8008406
> [ 0.000000] -- per-VPE State --
> [ 0.000000] VPE 0
> [ 0.000000] VPEControl : 00000000
> [ 0.000000] VPEConf0 : 800f0003
> [ 0.000000] VPE0.Status : 11004001
> [ 0.000000] VPE0.EPC : 80100000 _stext+0x0/0x10
> [ 0.000000] VPE0.Cause : 40804000
> [ 0.000000] VPE0.Config7 : 00010000
> [ 0.000000] VPE 1
> [ 0.000000] VPEControl : 00060000
> [ 0.000000] VPEConf0 : 800f0000
> [ 0.000000] VPE1.Status : 00408305
> [ 0.000000] VPE1.EPC : 801024e0 except_vec_vi+0x0/0x84
> [ 0.000000] VPE1.Cause : 40000200
> [ 0.000000] VPE1.Config7 : 00010000
> [ 0.000000] -- per-TC State --
> [ 0.000000] TC 0 (current TC with VPE EPC above)
> [ 0.000000] TCStatus : 00000000
> [ 0.000000] TCBind : 00000000
> [ 0.000000] TCRestart : 8010d860 mips_mt_regdump+0x2f0/0x3c4
> [ 0.000000] TCHalt : 00000000
> [ 0.000000] TCContext : 00000000
> [ 0.000000] TC 1
> [ 0.000000] TCStatus : 00000000
> [ 0.000000] TCBind : 00200001
> [ 0.000000] TCRestart : 80104b64 copy_thread+0x2ac/0x2b4
> [ 0.000000] TCHalt : 00000001
> [ 0.000000] TCContext : 00180000
> [ 0.000000] TC 2
> [ 0.000000] TCStatus : 00000000
> [ 0.000000] TCBind : 00400001
> [ 0.000000] TCRestart : 7ffffffc 0x7ffffffc
> [ 0.000000] TCHalt : 00000001
> [ 0.000000] TCContext : 00300000
> [ 0.000000] TC 3
> [ 0.000000] TCStatus : 00000000
> [ 0.000000] TCBind : 00600001
> [ 0.000000] TCRestart : fff7ffae 0xfff7ffae
> [ 0.000000] TCHalt : 00000001
> [ 0.000000] TCContext : 00480000
> [ 0.000000] TC 4
> [ 0.000000] TCStatus : 00000000
> [ 0.000000] TCBind : 00800001
> [ 0.000000] TCRestart : f3fff7fe 0xf3fff7fe
> [ 0.000000] TCHalt : 00000001
> [ 0.000000] TCContext : 00600000
> [ 0.000000] TC 5
> [ 0.000000] TCStatus : 00000000
> [ 0.000000] TCBind : 00a00001
> [ 0.000000] TCRestart : 7ffffbfe 0x7ffffbfe
> [ 0.000000] TCHalt : 00000001
> [ 0.000000] TCContext : 00780000
> [ 0.000000] TC 6
> [ 0.000000] TCStatus : 00000000
> [ 0.000000] TCBind : 00c00001
> [ 0.000000] TCRestart : ffff7ffe 0xffff7ffe
> [ 0.000000] TCHalt : 00000001
> [ 0.000000] TCContext : 00900000
> [ 0.000000] Counter Interrupts taken per CPU (TC)
> [ 0.000000] 0: 0
> [ 0.000000] 1: 0
> [ 0.000000] 2: 0
> [ 0.000000] 3: 0
> [ 0.000000] 4: 0
> [ 0.000000] 5: 0
> [ 0.000000] 6: 0
> [ 0.000000] 7: 0
> [ 0.000000] Self-IPI invocations:
> [ 0.000000] 0: 0
> [ 0.000000] 1: 0
> [ 0.000000] 2: 0
> [ 0.000000] 3: 0
> [ 0.000000] 4: 0
> [ 0.000000] 5: 0
> [ 0.000000] 6: 0
> [ 0.000000] 7: 0
> [ 0.000000] IPIQ[0]: head = 0x0, tail = 0x0, depth = 0
> [ 0.000000] IPIQ[1]: head = 0x0, tail = 0x0, depth = 0
> [ 0.000000] IPIQ[2]: head = 0x0, tail = 0x0, depth = 0
> [ 0.000000] IPIQ[3]: head = 0x0, tail = 0x0, depth = 0
> [ 0.000000] IPIQ[4]: head = 0x0, tail = 0x0, depth = 0
> [ 0.000000] IPIQ[5]: head = 0x0, tail = 0x0, depth = 0
> [ 0.000000] IPIQ[6]: head = 0x0, tail = 0x0, depth = 0
> [ 0.000000] IPIQ[7]: head = 0x0, tail = 0x0, depth = 0
> [ 0.000000] 0 Recoveries of "stolen" FPU
> [ 0.000000] ===========================
> [ 0.000000] In platform cic dispatch cic_mask=0x22000 stat=0x2402000f
> pend=0x20000
> [ 0.010000] === MIPS MT State Dump ===
> [ 0.010000] -- Global State --
> [ 0.010000] MVPControl Passed: 00000000
> [ 0.010000] MVPControl Read: 00000000
> [ 0.010000] MVPConf0 : a8008406
> [ 0.010000] -- per-VPE State --
> [ 0.010000] VPE 0
> [ 0.010000] VPEControl : 00000000
> [ 0.010000] VPEConf0 : 800f0003
> [ 0.010000] VPE0.Status : 18004000
> [ 0.010000] VPE0.EPC : 8010d900 mips_mt_regdump+0x390/0x3c4
> [ 0.010000] VPE0.Cause : 40804000
> [ 0.010000] VPE0.Config7 : 00010000
> [ 0.010000] VPE 1
> [ 0.010000] VPEControl : 00060000
> [ 0.010000] VPEConf0 : 800f0000
> [ 0.010000] VPE1.Status : 00408305
> [ 0.010000] VPE1.EPC : 801024e0 except_vec_vi+0x0/0x84
> [ 0.010000] VPE1.Cause : 40000200
> [ 0.010000] VPE1.Config7 : 00010000
> [ 0.010000] -- per-TC State --
> [ 0.010000] TC 0 (current TC with VPE EPC above)
> [ 0.010000] TCStatus : 00000000
> [ 0.010000] TCBind : 00000000
> [ 0.010000] TCRestart : 803f791c printk+0xc/0x30
> [ 0.010000] TCHalt : 00000000
> [ 0.010000] TCContext : 00000000
> [ 0.010000] TC 1
> [ 0.010000] TCStatus : 00000000
> [ 0.010000] TCBind : 00200001
> [ 0.010000] TCRestart : 80104b64 copy_thread+0x2ac/0x2b4
> [ 0.010000] TCHalt : 00000001
> [ 0.010000] TCContext : 00180000
> [ 0.010000] TC 2
> [ 0.010000] TCStatus : 00000000
> [ 0.010000] TCBind : 00400001
> [ 0.010000] TCRestart : 7ffffffc 0x7ffffffc
> [ 0.010000] TCHalt : 00000001
> [ 0.010000] TCContext : 00300000
> [ 0.010000] TC 3
> [ 0.010000] TCStatus : 00000000
> [ 0.010000] TCBind : 00600001
> [ 0.010000] TCRestart : fff7ffae 0xfff7ffae
> [ 0.010000] TCHalt : 00000001
> [ 0.010000] TCContext : 00480000
> [ 0.010000] TC 4
> [ 0.010000] TCStatus : 00000000
> [ 0.010000] TCBind : 00800001
> [ 0.010000] TCRestart : f3fff7fe 0xf3fff7fe
> [ 0.010000] TCHalt : 00000001
> [ 0.010000] TCContext : 00600000
> [ 0.010000] TC 5
> [ 0.010000] TCStatus : 00000000
> [ 0.010000] TCBind : 00a00001
> [ 0.010000] TCRestart : 7ffffbfe 0x7ffffbfe
> [ 0.010000] TCHalt : 00000001
> [ 0.010000] TCContext : 00780000
> [ 0.010000] TC 6
> [ 0.010000] TCStatus : 00000000
> [ 0.010000] TCBind : 00c00001
> [ 0.010000] TCRestart : ffff7ffe 0xffff7ffe
> [ 0.010000] TCHalt : 00000001
> [ 0.010000] TCContext : 00900000
> [ 0.010000] Counter Interrupts taken per CPU (TC)
> [ 0.010000] 0: 0
> [ 0.010000] 1: 0
> [ 0.010000] 2: 0
> [ 0.010000] 3: 0
> [ 0.010000] 4: 0
> [ 0.010000] 5: 0
> [ 0.010000] 6: 0
> [ 0.010000] 7: 0
> [ 0.010000] Self-IPI invocations:
> [ 0.010000] 0: 0
> [ 0.010000] 1: 0
> [ 0.010000] 2: 0
> [ 0.010000] 3: 0
> [ 0.010000] 4: 0
> [ 0.010000] 5: 0
> [ 0.010000] 6: 0
> [ 0.010000] 7: 0
> [ 0.010000] IPIQ[0]: head = 0x0, tail = 0x0, depth = 0
> [ 0.010000] IPIQ[1]: head = 0x0, tail = 0x0, depth = 0
> [ 0.010000] IPIQ[2]: head = 0x0, tail = 0x0, depth = 0
> [ 0.010000] IPIQ[3]: head = 0x0, tail = 0x0, depth = 0
> [ 0.010000] IPIQ[4]: head = 0x0, tail = 0x0, depth = 0
> [ 0.010000] IPIQ[5]: head = 0x0, tail = 0x0, depth = 0
> [ 0.010000] IPIQ[6]: head = 0x0, tail = 0x0, depth = 0
> [ 0.010000] IPIQ[7]: head = 0x0, tail = 0x0, depth = 0
> [ 0.010000] 0 Recoveries of "stolen" FPU
> [ 0.010000] ===========================
> [ 0.010000] === MIPS MT State Dump ===
> [ 0.010000] -- Global State --
> [ 0.010000] MVPControl Passed: 00000000
> [ 0.010000] MVPControl Read: 00000000
> [ 0.010000] MVPConf0 : a8008406
> [ 0.010000] -- per-VPE State --
> [ 0.010000] VPE 0
> [ 0.010000] VPEControl : 00000000
> [ 0.010000] VPEConf0 : 800f0003
> [ 0.010000] VPE0.Status : 18004000
> [ 0.010000] VPE0.EPC : 8010d900 mips_mt_regdump+0x390/0x3c4
> [ 0.010000] VPE0.Cause : 40804000
> [ 0.010000] VPE0.Config7 : 00010000
> [ 0.010000] VPE 1
> [ 0.010000] VPEControl : 00060000
> [ 0.010000] VPEConf0 : 800f0000
> [ 0.010000] VPE1.Status : 00408305
> [ 0.010000] VPE1.EPC : 801024e0 except_vec_vi+0x0/0x84
> [ 0.010000] VPE1.Cause : 40000200
> [ 0.010000] VPE1.Config7 : 00010000
> [ 0.010000] -- per-TC State --
> [ 0.010000] TC 0 (current TC with VPE EPC above)
> [ 0.010000] TCStatus : 00000000
> [ 0.010000] TCBind : 00000000
> [ 0.010000] TCRestart : 803f791c printk+0xc/0x30
> [ 0.010000] TCHalt : 00000000
> [ 0.010000] TCContext : 00000000
> [ 0.010000] TC 1
> [ 0.010000] TCStatus : 00000000
> [ 0.010000] TCBind : 00200001
> [ 0.010000] TCRestart : 80104b64 copy_thread+0x2ac/0x2b4
> [ 0.010000] TCHalt : 00000001
> [ 0.010000] TCContext : 00180000
> [ 0.010000] TC 2
> [ 0.010000] TCStatus : 00000000
> [ 0.010000] TCBind : 00400001
> [ 0.010000] TCRestart : 7ffffffc 0x7ffffffc
> [ 0.010000] TCHalt : 00000001
> [ 0.010000] TCContext : 00300000
> [ 0.010000] TC 3
> [ 0.010000] TCStatus : 00000000
> [ 0.010000] TCBind : 00600001
> [ 0.010000] TCRestart : fff7ffae 0xfff7ffae
> [ 0.010000] TCHalt : 00000001
> [ 0.010000] TCContext : 00480000
> [ 0.010000] TC 4
> [ 0.010000] TCStatus : 00000000
> [ 0.010000] TCBind : 00800001
> [ 0.010000] TCRestart : f3fff7fe 0xf3fff7fe
> [ 0.010000] TCHalt : 00000001
> [ 0.010000] TCContext : 00600000
> [ 0.010000] TC 5
> [ 0.010000] TCStatus : 00000000
> [ 0.010000] TCBind : 00a00001
> [ 0.010000] TCRestart : 7ffffbfe 0x7ffffbfe
> [ 0.010000] TCHalt : 00000001
> [ 0.010000] TCContext : 00780000
> [ 0.010000] TC 6
> [ 0.010000] TCStatus : 00000000
> [ 0.010000] TCBind : 00c00001
> [ 0.010000] TCRestart : ffff7ffe 0xffff7ffe
> [ 0.010000] TCHalt : 00000001
> [ 0.010000] TCContext : 00900000
> [ 0.010000] Counter Interrupts taken per CPU (TC)
> [ 0.010000] 0: 0
> [ 0.010000] 1: 0
> [ 0.010000] 2: 0
> [ 0.010000] 3: 0
> [ 0.010000] 4: 0
> [ 0.010000] 5: 0
> [ 0.010000] 6: 0
> [ 0.010000] 7: 0
> [ 0.010000] Self-IPI invocations:
> [ 0.010000] 0: 0
> [ 0.010000] 1: 0
> [ 0.010000] 2: 0
> [ 0.010000] 3: 0
> [ 0.010000] 4: 0
> [ 0.010000] 5: 0
> [ 0.010000] 6: 0
> [ 0.010000] 7: 0
> [ 0.010000] IPIQ[0]: head = 0x0, tail = 0x0, depth = 0
> [ 0.010000] IPIQ[1]: head = 0x0, tail = 0x0, depth = 0
> [ 0.010000] IPIQ[2]: head = 0x0, tail = 0x0, depth = 0
> [ 0.010000] IPIQ[3]: head = 0x0, tail = 0x0, depth = 0
> [ 0.010000] IPIQ[4]: head = 0x0, tail = 0x0, depth = 0
> [ 0.010000] IPIQ[5]: head = 0x0, tail = 0x0, depth = 0
> [ 0.010000] IPIQ[6]: head = 0x0, tail = 0x0, depth = 0
> [ 0.010000] IPIQ[7]: head = 0x0, tail = 0x0, depth = 0
> [ 0.010000] 0 Recoveries of "stolen" FPU
> [ 0.010000] ===========================
> [ 0.010000] === MIPS MT State Dump ===
> [ 0.010000] -- Global State --
> [ 0.010000] MVPControl Passed: 00000000
> [ 0.010000] MVPControl Read: 00000000
> [ 0.010000] MVPConf0 : a8008406
> [ 0.010000] -- per-VPE State --
> [ 0.010000] VPE 0
> [ 0.010000] VPEControl : 00000000
> [ 0.010000] VPEConf0 : 800f0003
> [ 0.010000] VPE0.Status : 18004000
> [ 0.010000] VPE0.EPC : 8010d900 mips_mt_regdump+0x390/0x3c4
> [ 0.010000] VPE0.Cause : 40804000
> [ 0.010000] VPE0.Config7 : 00010000
> [ 0.010000] VPE 1
> [ 0.010000] VPEControl : 00060000
> [ 0.010000] VPEConf0 : 800f0000
> [ 0.010000] VPE1.Status : 00408305
> [ 0.010000] VPE1.EPC : 801024e0 except_vec_vi+0x0/0x84
> [ 0.010000] VPE1.Cause : 40000200
> [ 0.010000] VPE1.Config7 : 00010000
> [ 0.010000] -- per-TC State --
> [ 0.010000] TC 0 (current TC with VPE EPC above)
> [ 0.010000] TCStatus : 00000000
> [ 0.010000] TCBind : 00000000
> [ 0.010000] TCRestart : 803f791c printk+0xc/0x30
> [ 0.010000] TCHalt : 00000000
> [ 0.010000] TCContext : 00000000
> [ 0.010000] TC 1
> [ 0.010000] TCStatus : 00000000
> [ 0.010000] TCBind : 00200001
> [ 0.010000] TCRestart : 80104b64 copy_thread+0x2ac/0x2b4
> [ 0.010000] TCHalt : 00000001
> [ 0.010000] TCContext : 00180000
> [ 0.010000] TC 2
> [ 0.010000] TCStatus : 00000000
> [ 0.010000] TCBind : 00400001
> [ 0.010000] TCRestart : 7ffffffc 0x7ffffffc
> [ 0.010000] TCHalt : 00000001
> [ 0.010000] TCContext : 00300000
> [ 0.010000] TC 3
> [ 0.010000] TCStatus : 00000000
> [ 0.010000] TCBind : 00600001
> [ 0.010000] TCRestart : fff7ffae 0xfff7ffae
> [ 0.010000] TCHalt : 00000001
> [ 0.010000] TCContext : 00480000
> [ 0.010000] TC 4
> [ 0.010000] TCStatus : 00000000
> [ 0.010000] TCBind : 00800001
> [ 0.010000] TCRestart : f3fff7fe 0xf3fff7fe
> [ 0.010000] TCHalt : 00000001
> [ 0.010000] TCContext : 00600000
> [ 0.010000] TC 5
> [ 0.010000] TCStatus : 00000000
> [ 0.010000] TCBind : 00a00001
> [ 0.010000] TCRestart : 7ffffbfe 0x7ffffbfe
> [ 0.010000] TCHalt : 00000001
> [ 0.010000] TCContext : 00780000
> [ 0.010000] TC 6
> [ 0.010000] TCStatus : 00000000
> [ 0.010000] TCBind : 00c00001
> [ 0.010000] TCRestart : ffff7ffe 0xffff7ffe
> [ 0.010000] TCHalt : 00000001
> [ 0.010000] TCContext : 00900000
> [ 0.010000] Counter Interrupts taken per CPU (TC)
> [ 0.010000] 0: 0
> [ 0.010000] 1: 0
> [ 0.010000] 2: 0
> [ 0.010000] 3: 0
> [ 0.010000] 4: 0
> [ 0.010000] 5: 0
> [ 0.010000] 6: 0
> [ 0.010000] 7: 0
> [ 0.010000] Self-IPI invocations:
> [ 0.010000] 0: 0
> [ 0.010000] 1: 0
> [ 0.010000] 2: 0
> [ 0.010000] 3: 0
> [ 0.010000] 4: 0
> [ 0.010000] 5: 0
> [ 0.010000] 6: 0
> [ 0.010000] 7: 0
> [ 0.010000] IPIQ[0]: head = 0x0, tail = 0x0, depth = 0
> [ 0.010000] IPIQ[1]: head = 0x0, tail = 0x0, depth = 0
> [ 0.010000] IPIQ[2]: head = 0x0, tail = 0x0, depth = 0
> [ 0.010000] IPIQ[3]: head = 0x0, tail = 0x0, depth = 0
> [ 0.010000] IPIQ[4]: head = 0x0, tail = 0x0, depth = 0
> [ 0.010000] IPIQ[5]: head = 0x0, tail = 0x0, depth = 0
> [ 0.010000] IPIQ[6]: head = 0x0, tail = 0x0, depth = 0
> [ 0.010000] IPIQ[7]: head = 0x0, tail = 0x0, depth = 0
> [ 0.010000] 0 Recoveries of "stolen" FPU
>
>
>
^ permalink raw reply [flat|nested] 68+ messages in thread
* RE: SMTC support status in latest git head.
@ 2010-12-14 15:25 ` Anoop P.A.
0 siblings, 0 replies; 68+ messages in thread
From: Anoop P.A. @ 2010-12-14 15:25 UTC (permalink / raw)
To: Kevin D. Kissell; +Cc: linux-mips
> it ended up being cleaner and more efficient to have *some* hooks in
> platform specific timer code. It was there for Malta in the kernel.org
> mainline once upon a time, and I *thought* we'd propagated working code
> for the initial PMC-Sierra 34K-based SoC's at least as far as
[Anoop P.A.]
I was able to boot the 2.6.24-7 git sources with a change in cevt-r4k.c:
c0_compare_int_pending was changed to
"return (read_c0_cause() >> cp0_compare_irq_shift) & (1ul << CAUSEB_IP)"
> linux-mips.org, but the source tree has been considerably reorganized -
> there was a time when some of the hooks were under
> arch/mips/mips-boards/generic, which no longer exists - and I'm not sure
> where to point you. Git and grep are your friends.
[Anoop P.A.] The Malta code has been moved to arch/mips/mti-malta/.
Can you recall which version of the l-m-o kernel had known working SMTC
support?
>
> The first order of business is to break into that hung timer calibration
> loop and dump the CP0 registers for the VPE and the TCs, in particular
> checking the interrupt enable mask in Status against the pending
> interrupts in the Cause register. If you're seeing the timer
> interrupt's bit set in Cause, but clear in Status, you need to fix the
> SMTC interrupt mask hook for your platform timer.
[Anoop P.A.]
I tried dumping the registers from inside the calibration while loop.
It looks like the timer interrupt bit stays high in both the Cause and
Status registers (in my case the timer interrupt is connected to the
cascaded CIC interrupt, which is connected to irq 6 (C_IRQ4)). The
detailed log is pasted below.
> check to see if you're building for "tickless" operation. Tickless ends
> up being really important for SMTC, and I did get it working properly
> back in 2008, but the SMTC-specific cevt-smtc.c code uses common
> functions in cevt-r4k.c, and I've seen some patches to cevt-r4k.c going
> by that I rather doubt were ever tested against an SMTC build/platform.
> There might have been breakage there, and configuring to use a fixed
> interval timer (say, 100Hz) would be a way to test that hypothesis.
[Anoop P.A.] I have tried both tickless and a fixed interval timer.
>
> Regards,
>
> Kevin K.
[Anoop P.A.] Thanks much for your and Ralf's detailed response.
>
[Anoop P.A.]
[ 0.000000] Writing ErrCtl register=00000000
[ 0.000000] Readback ErrCtl register=00000000
[ 0.000000] Memory: 254384k/257912k available (3062k kernel code,
3528k reserved, 648k data, 200k init, 0k highmem)
[ 0.000000] Preemptable hierarchical RCU implementation.
[ 0.000000] NR_IRQS:128
[ 0.000000] console [ttyS0] enabled
[ 0.000000] Clock rate set to 600000000
[ 0.000000] Calibrating delay loop... === MIPS MT State Dump ===
[ 0.000000] -- Global State --
[ 0.000000] MVPControl Passed: 00000000
[ 0.000000] MVPControl Read: 00000000
[ 0.000000] MVPConf0 : a8008406
[ 0.000000] -- per-VPE State --
[ 0.000000] VPE 0
[ 0.000000] VPEControl : 00000000
[ 0.000000] VPEConf0 : 800f0003
[ 0.000000] VPE0.Status : 11004001
[ 0.000000] VPE0.EPC : 80100000 _stext+0x0/0x10
[ 0.000000] VPE0.Cause : 40804000
[ 0.000000] VPE0.Config7 : 00010000
[ 0.000000] VPE 1
[ 0.000000] VPEControl : 00060000
[ 0.000000] VPEConf0 : 800f0000
[ 0.000000] VPE1.Status : 00408305
[ 0.000000] VPE1.EPC : 801024e0 except_vec_vi+0x0/0x84
[ 0.000000] VPE1.Cause : 40000200
[ 0.000000] VPE1.Config7 : 00010000
[ 0.000000] -- per-TC State --
[ 0.000000] TC 0 (current TC with VPE EPC above)
[ 0.000000] TCStatus : 00000000
[ 0.000000] TCBind : 00000000
[ 0.000000] TCRestart : 8010d860 mips_mt_regdump+0x2f0/0x3c4
[ 0.000000] TCHalt : 00000000
[ 0.000000] TCContext : 00000000
[ 0.000000] TC 1
[ 0.000000] TCStatus : 00000000
[ 0.000000] TCBind : 00200001
[ 0.000000] TCRestart : 80104b64 copy_thread+0x2ac/0x2b4
[ 0.000000] TCHalt : 00000001
[ 0.000000] TCContext : 00180000
[ 0.000000] TC 2
[ 0.000000] TCStatus : 00000000
[ 0.000000] TCBind : 00400001
[ 0.000000] TCRestart : 7ffffffc 0x7ffffffc
[ 0.000000] TCHalt : 00000001
[ 0.000000] TCContext : 00300000
[ 0.000000] TC 3
[ 0.000000] TCStatus : 00000000
[ 0.000000] TCBind : 00600001
[ 0.000000] TCRestart : fff7ffae 0xfff7ffae
[ 0.000000] TCHalt : 00000001
[ 0.000000] TCContext : 00480000
[ 0.000000] TC 4
[ 0.000000] TCStatus : 00000000
[ 0.000000] TCBind : 00800001
[ 0.000000] TCRestart : f3fff7fe 0xf3fff7fe
[ 0.000000] TCHalt : 00000001
[ 0.000000] TCContext : 00600000
[ 0.000000] TC 5
[ 0.000000] TCStatus : 00000000
[ 0.000000] TCBind : 00a00001
[ 0.000000] TCRestart : 7ffffbfe 0x7ffffbfe
[ 0.000000] TCHalt : 00000001
[ 0.000000] TCContext : 00780000
[ 0.000000] TC 6
[ 0.000000] TCStatus : 00000000
[ 0.000000] TCBind : 00c00001
[ 0.000000] TCRestart : ffff7ffe 0xffff7ffe
[ 0.000000] TCHalt : 00000001
[ 0.000000] TCContext : 00900000
[ 0.000000] Counter Interrupts taken per CPU (TC)
[ 0.000000] 0: 0
[ 0.000000] 1: 0
[ 0.000000] 2: 0
[ 0.000000] 3: 0
[ 0.000000] 4: 0
[ 0.000000] 5: 0
[ 0.000000] 6: 0
[ 0.000000] 7: 0
[ 0.000000] Self-IPI invocations:
[ 0.000000] 0: 0
[ 0.000000] 1: 0
[ 0.000000] 2: 0
[ 0.000000] 3: 0
[ 0.000000] 4: 0
[ 0.000000] 5: 0
[ 0.000000] 6: 0
[ 0.000000] 7: 0
[ 0.000000] IPIQ[0]: head = 0x0, tail = 0x0, depth = 0
[ 0.000000] IPIQ[1]: head = 0x0, tail = 0x0, depth = 0
[ 0.000000] IPIQ[2]: head = 0x0, tail = 0x0, depth = 0
[ 0.000000] IPIQ[3]: head = 0x0, tail = 0x0, depth = 0
[ 0.000000] IPIQ[4]: head = 0x0, tail = 0x0, depth = 0
[ 0.000000] IPIQ[5]: head = 0x0, tail = 0x0, depth = 0
[ 0.000000] IPIQ[6]: head = 0x0, tail = 0x0, depth = 0
[ 0.000000] IPIQ[7]: head = 0x0, tail = 0x0, depth = 0
[ 0.000000] 0 Recoveries of "stolen" FPU
[ 0.000000] ===========================
[ 0.000000] In platform cic dispatch cic_mask=0x22000 stat=0x2402000f pend=0x20000
[ 0.010000] === MIPS MT State Dump ===
[ 0.010000] -- Global State --
[ 0.010000] MVPControl Passed: 00000000
[ 0.010000] MVPControl Read: 00000000
[ 0.010000] MVPConf0 : a8008406
[ 0.010000] -- per-VPE State --
[ 0.010000] VPE 0
[ 0.010000] VPEControl : 00000000
[ 0.010000] VPEConf0 : 800f0003
[ 0.010000] VPE0.Status : 18004000
[ 0.010000] VPE0.EPC : 8010d900 mips_mt_regdump+0x390/0x3c4
[ 0.010000] VPE0.Cause : 40804000
[ 0.010000] VPE0.Config7 : 00010000
[ 0.010000] VPE 1
[ 0.010000] VPEControl : 00060000
[ 0.010000] VPEConf0 : 800f0000
[ 0.010000] VPE1.Status : 00408305
[ 0.010000] VPE1.EPC : 801024e0 except_vec_vi+0x0/0x84
[ 0.010000] VPE1.Cause : 40000200
[ 0.010000] VPE1.Config7 : 00010000
[ 0.010000] -- per-TC State --
[ 0.010000] TC 0 (current TC with VPE EPC above)
[ 0.010000] TCStatus : 00000000
[ 0.010000] TCBind : 00000000
[ 0.010000] TCRestart : 803f791c printk+0xc/0x30
[ 0.010000] TCHalt : 00000000
[ 0.010000] TCContext : 00000000
[ 0.010000] TC 1
[ 0.010000] TCStatus : 00000000
[ 0.010000] TCBind : 00200001
[ 0.010000] TCRestart : 80104b64 copy_thread+0x2ac/0x2b4
[ 0.010000] TCHalt : 00000001
[ 0.010000] TCContext : 00180000
[ 0.010000] TC 2
[ 0.010000] TCStatus : 00000000
[ 0.010000] TCBind : 00400001
[ 0.010000] TCRestart : 7ffffffc 0x7ffffffc
[ 0.010000] TCHalt : 00000001
[ 0.010000] TCContext : 00300000
[ 0.010000] TC 3
[ 0.010000] TCStatus : 00000000
[ 0.010000] TCBind : 00600001
[ 0.010000] TCRestart : fff7ffae 0xfff7ffae
[ 0.010000] TCHalt : 00000001
[ 0.010000] TCContext : 00480000
[ 0.010000] TC 4
[ 0.010000] TCStatus : 00000000
[ 0.010000] TCBind : 00800001
[ 0.010000] TCRestart : f3fff7fe 0xf3fff7fe
[ 0.010000] TCHalt : 00000001
[ 0.010000] TCContext : 00600000
[ 0.010000] TC 5
[ 0.010000] TCStatus : 00000000
[ 0.010000] TCBind : 00a00001
[ 0.010000] TCRestart : 7ffffbfe 0x7ffffbfe
[ 0.010000] TCHalt : 00000001
[ 0.010000] TCContext : 00780000
[ 0.010000] TC 6
[ 0.010000] TCStatus : 00000000
[ 0.010000] TCBind : 00c00001
[ 0.010000] TCRestart : ffff7ffe 0xffff7ffe
[ 0.010000] TCHalt : 00000001
[ 0.010000] TCContext : 00900000
[ 0.010000] Counter Interrupts taken per CPU (TC)
[ 0.010000] 0: 0
[ 0.010000] 1: 0
[ 0.010000] 2: 0
[ 0.010000] 3: 0
[ 0.010000] 4: 0
[ 0.010000] 5: 0
[ 0.010000] 6: 0
[ 0.010000] 7: 0
[ 0.010000] Self-IPI invocations:
[ 0.010000] 0: 0
[ 0.010000] 1: 0
[ 0.010000] 2: 0
[ 0.010000] 3: 0
[ 0.010000] 4: 0
[ 0.010000] 5: 0
[ 0.010000] 6: 0
[ 0.010000] 7: 0
[ 0.010000] IPIQ[0]: head = 0x0, tail = 0x0, depth = 0
[ 0.010000] IPIQ[1]: head = 0x0, tail = 0x0, depth = 0
[ 0.010000] IPIQ[2]: head = 0x0, tail = 0x0, depth = 0
[ 0.010000] IPIQ[3]: head = 0x0, tail = 0x0, depth = 0
[ 0.010000] IPIQ[4]: head = 0x0, tail = 0x0, depth = 0
[ 0.010000] IPIQ[5]: head = 0x0, tail = 0x0, depth = 0
[ 0.010000] IPIQ[6]: head = 0x0, tail = 0x0, depth = 0
[ 0.010000] IPIQ[7]: head = 0x0, tail = 0x0, depth = 0
[ 0.010000] 0 Recoveries of "stolen" FPU
[ 0.010000] ===========================
[ 0.010000] === MIPS MT State Dump ===
[ 0.010000] -- Global State --
[ 0.010000] MVPControl Passed: 00000000
[ 0.010000] MVPControl Read: 00000000
[ 0.010000] MVPConf0 : a8008406
[ 0.010000] -- per-VPE State --
[ 0.010000] VPE 0
[ 0.010000] VPEControl : 00000000
[ 0.010000] VPEConf0 : 800f0003
[ 0.010000] VPE0.Status : 18004000
[ 0.010000] VPE0.EPC : 8010d900 mips_mt_regdump+0x390/0x3c4
[ 0.010000] VPE0.Cause : 40804000
[ 0.010000] VPE0.Config7 : 00010000
[ 0.010000] VPE 1
[ 0.010000] VPEControl : 00060000
[ 0.010000] VPEConf0 : 800f0000
[ 0.010000] VPE1.Status : 00408305
[ 0.010000] VPE1.EPC : 801024e0 except_vec_vi+0x0/0x84
[ 0.010000] VPE1.Cause : 40000200
[ 0.010000] VPE1.Config7 : 00010000
[ 0.010000] -- per-TC State --
[ 0.010000] TC 0 (current TC with VPE EPC above)
[ 0.010000] TCStatus : 00000000
[ 0.010000] TCBind : 00000000
[ 0.010000] TCRestart : 803f791c printk+0xc/0x30
[ 0.010000] TCHalt : 00000000
[ 0.010000] TCContext : 00000000
[ 0.010000] TC 1
[ 0.010000] TCStatus : 00000000
[ 0.010000] TCBind : 00200001
[ 0.010000] TCRestart : 80104b64 copy_thread+0x2ac/0x2b4
[ 0.010000] TCHalt : 00000001
[ 0.010000] TCContext : 00180000
[ 0.010000] TC 2
[ 0.010000] TCStatus : 00000000
[ 0.010000] TCBind : 00400001
[ 0.010000] TCRestart : 7ffffffc 0x7ffffffc
[ 0.010000] TCHalt : 00000001
[ 0.010000] TCContext : 00300000
[ 0.010000] TC 3
[ 0.010000] TCStatus : 00000000
[ 0.010000] TCBind : 00600001
[ 0.010000] TCRestart : fff7ffae 0xfff7ffae
[ 0.010000] TCHalt : 00000001
[ 0.010000] TCContext : 00480000
[ 0.010000] TC 4
[ 0.010000] TCStatus : 00000000
[ 0.010000] TCBind : 00800001
[ 0.010000] TCRestart : f3fff7fe 0xf3fff7fe
[ 0.010000] TCHalt : 00000001
[ 0.010000] TCContext : 00600000
[ 0.010000] TC 5
[ 0.010000] TCStatus : 00000000
[ 0.010000] TCBind : 00a00001
[ 0.010000] TCRestart : 7ffffbfe 0x7ffffbfe
[ 0.010000] TCHalt : 00000001
[ 0.010000] TCContext : 00780000
[ 0.010000] TC 6
[ 0.010000] TCStatus : 00000000
[ 0.010000] TCBind : 00c00001
[ 0.010000] TCRestart : ffff7ffe 0xffff7ffe
[ 0.010000] TCHalt : 00000001
[ 0.010000] TCContext : 00900000
[ 0.010000] Counter Interrupts taken per CPU (TC)
[ 0.010000] 0: 0
[ 0.010000] 1: 0
[ 0.010000] 2: 0
[ 0.010000] 3: 0
[ 0.010000] 4: 0
[ 0.010000] 5: 0
[ 0.010000] 6: 0
[ 0.010000] 7: 0
[ 0.010000] Self-IPI invocations:
[ 0.010000] 0: 0
[ 0.010000] 1: 0
[ 0.010000] 2: 0
[ 0.010000] 3: 0
[ 0.010000] 4: 0
[ 0.010000] 5: 0
[ 0.010000] 6: 0
[ 0.010000] 7: 0
[ 0.010000] IPIQ[0]: head = 0x0, tail = 0x0, depth = 0
[ 0.010000] IPIQ[1]: head = 0x0, tail = 0x0, depth = 0
[ 0.010000] IPIQ[2]: head = 0x0, tail = 0x0, depth = 0
[ 0.010000] IPIQ[3]: head = 0x0, tail = 0x0, depth = 0
[ 0.010000] IPIQ[4]: head = 0x0, tail = 0x0, depth = 0
[ 0.010000] IPIQ[5]: head = 0x0, tail = 0x0, depth = 0
[ 0.010000] IPIQ[6]: head = 0x0, tail = 0x0, depth = 0
[ 0.010000] IPIQ[7]: head = 0x0, tail = 0x0, depth = 0
[ 0.010000] 0 Recoveries of "stolen" FPU
[ 0.010000] ===========================
[ 0.010000] === MIPS MT State Dump ===
[ 0.010000] -- Global State --
[ 0.010000] MVPControl Passed: 00000000
[ 0.010000] MVPControl Read: 00000000
[ 0.010000] MVPConf0 : a8008406
[ 0.010000] -- per-VPE State --
[ 0.010000] VPE 0
[ 0.010000] VPEControl : 00000000
[ 0.010000] VPEConf0 : 800f0003
[ 0.010000] VPE0.Status : 18004000
[ 0.010000] VPE0.EPC : 8010d900 mips_mt_regdump+0x390/0x3c4
[ 0.010000] VPE0.Cause : 40804000
[ 0.010000] VPE0.Config7 : 00010000
[ 0.010000] VPE 1
[ 0.010000] VPEControl : 00060000
[ 0.010000] VPEConf0 : 800f0000
[ 0.010000] VPE1.Status : 00408305
[ 0.010000] VPE1.EPC : 801024e0 except_vec_vi+0x0/0x84
[ 0.010000] VPE1.Cause : 40000200
[ 0.010000] VPE1.Config7 : 00010000
[ 0.010000] -- per-TC State --
[ 0.010000] TC 0 (current TC with VPE EPC above)
[ 0.010000] TCStatus : 00000000
[ 0.010000] TCBind : 00000000
[ 0.010000] TCRestart : 803f791c printk+0xc/0x30
[ 0.010000] TCHalt : 00000000
[ 0.010000] TCContext : 00000000
[ 0.010000] TC 1
[ 0.010000] TCStatus : 00000000
[ 0.010000] TCBind : 00200001
[ 0.010000] TCRestart : 80104b64 copy_thread+0x2ac/0x2b4
[ 0.010000] TCHalt : 00000001
[ 0.010000] TCContext : 00180000
[ 0.010000] TC 2
[ 0.010000] TCStatus : 00000000
[ 0.010000] TCBind : 00400001
[ 0.010000] TCRestart : 7ffffffc 0x7ffffffc
[ 0.010000] TCHalt : 00000001
[ 0.010000] TCContext : 00300000
[ 0.010000] TC 3
[ 0.010000] TCStatus : 00000000
[ 0.010000] TCBind : 00600001
[ 0.010000] TCRestart : fff7ffae 0xfff7ffae
[ 0.010000] TCHalt : 00000001
[ 0.010000] TCContext : 00480000
[ 0.010000] TC 4
[ 0.010000] TCStatus : 00000000
[ 0.010000] TCBind : 00800001
[ 0.010000] TCRestart : f3fff7fe 0xf3fff7fe
[ 0.010000] TCHalt : 00000001
[ 0.010000] TCContext : 00600000
[ 0.010000] TC 5
[ 0.010000] TCStatus : 00000000
[ 0.010000] TCBind : 00a00001
[ 0.010000] TCRestart : 7ffffbfe 0x7ffffbfe
[ 0.010000] TCHalt : 00000001
[ 0.010000] TCContext : 00780000
[ 0.010000] TC 6
[ 0.010000] TCStatus : 00000000
[ 0.010000] TCBind : 00c00001
[ 0.010000] TCRestart : ffff7ffe 0xffff7ffe
[ 0.010000] TCHalt : 00000001
[ 0.010000] TCContext : 00900000
[ 0.010000] Counter Interrupts taken per CPU (TC)
[ 0.010000] 0: 0
[ 0.010000] 1: 0
[ 0.010000] 2: 0
[ 0.010000] 3: 0
[ 0.010000] 4: 0
[ 0.010000] 5: 0
[ 0.010000] 6: 0
[ 0.010000] 7: 0
[ 0.010000] Self-IPI invocations:
[ 0.010000] 0: 0
[ 0.010000] 1: 0
[ 0.010000] 2: 0
[ 0.010000] 3: 0
[ 0.010000] 4: 0
[ 0.010000] 5: 0
[ 0.010000] 6: 0
[ 0.010000] 7: 0
[ 0.010000] IPIQ[0]: head = 0x0, tail = 0x0, depth = 0
[ 0.010000] IPIQ[1]: head = 0x0, tail = 0x0, depth = 0
[ 0.010000] IPIQ[2]: head = 0x0, tail = 0x0, depth = 0
[ 0.010000] IPIQ[3]: head = 0x0, tail = 0x0, depth = 0
[ 0.010000] IPIQ[4]: head = 0x0, tail = 0x0, depth = 0
[ 0.010000] IPIQ[5]: head = 0x0, tail = 0x0, depth = 0
[ 0.010000] IPIQ[6]: head = 0x0, tail = 0x0, depth = 0
[ 0.010000] IPIQ[7]: head = 0x0, tail = 0x0, depth = 0
[ 0.010000] 0 Recoveries of "stolen" FPU
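For anyone reading the TCBind column in the dump above, a small decoder may help. This is only a sketch, assuming the MIPS MT ASE field layout (CurVPE in bits 3..0, CurTC in bits 28..21); the macro and function names are local to this example and are not taken from the kernel headers.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed TCBind layout (MIPS MT ASE): CurVPE = bits 3..0, CurTC = bits 28..21.
 * These names are local to this sketch. */
#define SK_TCBIND_CURVPE_MASK 0xfu
#define SK_TCBIND_CURTC_SHIFT 21
#define SK_TCBIND_CURTC_MASK  0xffu

/* VPE that the TC is currently bound to. */
static inline unsigned tcbind_cur_vpe(uint32_t tcbind)
{
    return tcbind & SK_TCBIND_CURVPE_MASK;
}

/* Index of the TC that owns this TCBind value. */
static inline unsigned tcbind_cur_tc(uint32_t tcbind)
{
    return (tcbind >> SK_TCBIND_CURTC_SHIFT) & SK_TCBIND_CURTC_MASK;
}

/* Count how many of the n TCBind values are bound to the given VPE. */
static inline unsigned tcs_on_vpe(const uint32_t *tcbind, unsigned n,
                                  unsigned vpe)
{
    unsigned count = 0;

    assert(tcbind != 0);
    for (unsigned i = 0; i < n; i++)
        if (tcbind_cur_vpe(tcbind[i]) == vpe)
            count++;
    return count;
}
```

Fed the seven TCBind values from the dump (00000000, 00200001, 00400001, 00600001, 00800001, 00a00001, 00c00001), this puts TC0 alone on VPE0 and TC1 through TC6 on VPE1. The same dump shows TCHalt reading 1 for TC1 through TC6.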
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: SMTC support status in latest git head.
2010-12-08 13:48 ` Anoop P.A.
(?)
(?)
@ 2010-12-09 18:52 ` Kevin D. Kissell
2010-12-14 15:25 ` Anoop P.A.
-1 siblings, 1 reply; 68+ messages in thread
From: Kevin D. Kissell @ 2010-12-09 18:52 UTC (permalink / raw)
To: Anoop P.A.; +Cc: linux-mips
I used to do occasional tests and damage control patches for SMTC, but
haven't had the time and resources for the past year or so. The
"Calibrating delay loop" hang is an absolutely classic hang in SMTC
systems that stems from the interrupt management system not being
properly set up. Ralf alluded to the inter-TC timer propagation
protocol, but your problem could just as easily (more easily, actually)
have to do with enable mask management. In order to keep multiple
threads from "convoying" into interrupt handlers chasing a single event,
SMTC manipulates the interrupt enable mask at entry into an interrupt
exception to ensure that only the initial TC goes after it. The
interrupt is unmasked once the interrupt handler has quenched the source
and invoked the IRQ ack function. Unfortunately, generic timer
functions don't always do the canonical source quench performed by most
device driver interrupt handlers. I tried to make all this
self-contained in generic architecture-specific code, but at some point
it ended up being cleaner and more efficient to have *some* hooks in
platform specific timer code. It was there for Malta in the kernel.org
mainline once upon a time, and I *thought* we'd propagated working code
for the initial PMC-Sierra 34K-based SoC's at least as far as
linux-mips.org, but the source tree has been considerably reorganized -
there was a time when some of the hooks were under
arch/mips/mips-boards/generic, which no longer exists - and I'm not sure
where to point you. Git and grep are your friends.
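The mask-on-entry, unmask-on-ack protocol described above can be modelled in a few lines of C. This is a toy model only, not the kernel's SMTC code: `struct vpe_model`, `tc_try_enter` and `irq_quench_and_ack` are names invented for illustration, and the single shared mask stands in for the interrupt enable mask in Status.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model of one VPE: one bit per interrupt line (hypothetical type). */
struct vpe_model {
    uint8_t im;   /* enabled lines, models the Status interrupt mask */
    uint8_t ip;   /* pending lines, models the Cause pending bits    */
};

/* A TC may take line `irq` only if it is both pending and enabled.
 * On entry the line is masked, so no second TC "convoys" in after it. */
static inline bool tc_try_enter(struct vpe_model *v, unsigned irq)
{
    uint8_t bit;

    assert(irq < 8);
    bit = (uint8_t)(1u << irq);
    if (!(v->ip & v->im & bit))
        return false;
    v->im &= (uint8_t)~bit;   /* mask on entry */
    return true;
}

/* The handler quenches the source; the ack path then unmasks the line.
 * If a timer handler skips this step, the line stays masked forever. */
static inline void irq_quench_and_ack(struct vpe_model *v, unsigned irq)
{
    uint8_t bit;

    assert(irq < 8);
    bit = (uint8_t)(1u << irq);
    v->ip &= (uint8_t)~bit;   /* source quenched */
    v->im |= bit;             /* unmask in the ack hook */
}
```

In this model a second `tc_try_enter()` on the same line fails until `irq_quench_and_ack()` has run, which is the behaviour a missing platform timer hook would leave stuck: one interrupt taken, then nothing.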
The first order of business is to break into that hung timer calibration
loop and dump the CP0 registers for the VPE and the TCs, in particular
checking the interrupt enable mask in Status against the pending
interrupts in the Cause register. If you're seeing the timer
interrupt's bit set in Cause, but clear in Status, you need to fix the
SMTC interrupt mask hook for your platform timer. If that's *not* it,
check to see if you're building for "tickless" operation. Tickless ends
up being really important for SMTC, and I did get it working properly
back in 2008, but the SMTC-specific cevt-smtc.c code uses common
functions in cevt-r4k.c, and I've seen some patches to cevt-r4k.c going
by that I rather doubt were ever tested against an SMTC build/platform.
There might have been breakage there, and configuring to use a fixed
interval timer (say, 100Hz) would be a way to test that hypothesis.
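The Status-versus-Cause check suggested above can be written down as a small helper to run over dumped register values. It is a sketch under the assumption of the standard MIPS32 layout, where the interrupt mask (Status) and the pending bits (Cause) both sit in bits 15..8; the macro and function names are local to this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed MIPS32 layout: Status.IM7..IM0 and Cause.IP7..IP0 are bits 15..8.
 * Names are local to this sketch. */
#define SK_IRQ_FIELD_SHIFT 8
#define SK_IRQ_FIELD_MASK  0xffu

/* Lines pending in Cause, bit n set means IPn is pending. */
static inline uint8_t pending_lines(uint32_t cause)
{
    return (uint8_t)((cause >> SK_IRQ_FIELD_SHIFT) & SK_IRQ_FIELD_MASK);
}

/* Lines enabled in Status, bit n set means IMn is enabled. */
static inline uint8_t enabled_lines(uint32_t status)
{
    return (uint8_t)((status >> SK_IRQ_FIELD_SHIFT) & SK_IRQ_FIELD_MASK);
}

/* Lines that are pending but masked: the symptom of a missing unmask hook. */
static inline uint8_t masked_pending_lines(uint32_t status, uint32_t cause)
{
    return (uint8_t)(pending_lines(cause) & (uint8_t)~enabled_lines(status));
}

/* True if the given line (0..7) is pending in Cause but masked in Status. */
static inline bool line_is_masked_pending(uint32_t status, uint32_t cause,
                                          unsigned irq)
{
    assert(irq < 8);
    return (masked_pending_lines(status, cause) >> irq) & 1u;
}
```

A nonzero result for the timer's line points at the interrupt mask hook. A zero result with the line still pending means the mask is not the culprit and the tickless question is the next thing to look at.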
Regards,
Kevin K.
On 12/08/10 05:48, Anoop P.A. wrote:
> Hi list,
>
> Is anybody aware of the status of SMTC support in the latest git sources? I have tried testing an SMTC kernel for Malta in QEMU / OVP without any success (the emulators do not work for the 34K).
>
> I am trying to bring up SMTC Linux support for a MIPS 34K-based SoC (MSP71xx family).
>
> While booting, the kernel hangs in the delay-loop calibration. I am getting only one interrupt from the timer. With a similar SMTC platform support file (changed to map the smp_ops structure), the 2.6.24-stable branch kernel (where the latest timer structure was introduced) boots fine.
>
> [ 0.000000] Linux version 2.6.37-rc1-pmc-00197-g5bfd3ba-dirty (paanoop1@paanoop1-desktop) (gcc version 4.5.1 (GCC) ) #168 SMP PREEMPT Wed Dec 8 19:19:490
> [ 0.000000] DSPRAM0: PA=1c100000,Size=00008000,enabled
> [ 0.000000] UART clock set to 50000000
> [ 0.000000] CPU revision is: 00019548 (MIPS 34Kc)
> [ 0.000000] Determined physical RAM map:
> [ 0.000000] memory: 00001000 @ 00000000 (reserved)
> [ 0.000000] memory: 000ff000 @ 00001000 (usable)
> [ 0.000000] memory: 003f2000 @ 00100000 (reserved)
> [ 0.000000] memory: 0fad9200 @ 004f2000 (usable)
> [ 0.000000] Wasting 32 bytes for tracking 1 unused pages
> [ 0.000000] Zone PFN ranges:
> [ 0.000000] Normal 0x00000000 -> 0x0000ffcb
> [ 0.000000] Movable zone start PFN for each node
> [ 0.000000] early_node_map[1] active PFN ranges
> [ 0.000000] 0: 0x00000000 -> 0x0000ffcb
> [ 0.000000] 6 available secondary CPU TC(s)
> [ 0.000000] PERCPU: Embedded 7 pages/cpu @81203000 s6464 r8192 d14016 u32768
> [ 0.000000] pcpu-alloc: s6464 r8192 d14016 u32768 alloc=8*4096
> [ 0.000000] pcpu-alloc: [0] 0 [0] 1 [0] 2 [0] 3 [0] 4 [0] 5 [0] 6
> [ 0.000000] Built 1 zonelists in Zone order, mobility grouping on. Total pages: 64971
> [ 0.000000] Kernel command line: console=ttyS0,57600
> [ 0.000000] PID hash table entries: 1024 (order: 0, 4096 bytes)
> [ 0.000000] Dentry cache hash table entries: 32768 (order: 5, 131072 bytes)
> [ 0.000000] Inode-cache hash table entries: 16384 (order: 4, 65536 bytes)
> [ 0.000000] Primary instruction cache 64kB, VIPT, 4-way, linesize 32 bytes.
> [ 0.000000] Primary data cache 64kB, 4-way, PIPT, no aliases, linesize 32 bytes
> [ 0.000000] Writing ErrCtl register=00000000
> [ 0.000000] Readback ErrCtl register=00000000
> [ 0.000000] Memory: 254360k/257888k available (3081k kernel code, 3528k reserved, 653k data, 200k init, 0k highmem)
> [ 0.000000] Preemptable hierarchical RCU implementation.
> [ 0.000000] NR_IRQS:128
> [ 0.000000] console [ttyS0] enabled
> [ 0.000000] Clock rate set to 600000000
> [ 0.000000] Calibrating delay loop...
>
> Any ideas on how to debug the issue?
>
> Thanks,
> Anoop
>
>
>
>
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: SMTC support status in latest git head.
2010-12-08 13:48 ` Anoop P.A.
(?)
@ 2010-12-09 17:07 ` Ralf Baechle
-1 siblings, 0 replies; 68+ messages in thread
From: Ralf Baechle @ 2010-12-09 17:07 UTC (permalink / raw)
To: Anoop P.A.; +Cc: linux-mips, Kevin D. Kissell
On Wed, Dec 08, 2010 at 05:48:48AM -0800, Anoop P.A. wrote:
> Is anybody aware of the SMTC support status in the latest git sources? I have tried testing an SMTC kernel for Malta in QEMU / OVP without any success (the emulators do not work for the 34K).
Correct. MTI's MIPSsim is the only simulator that supports multithreading
afaik.
SMTC is not terribly popular, so it doesn't receive the regular testing it
should, given that it's also a complex beast.
> I am trying to bring up SMTC Linux support for a MIPS 34K based SoC (MSP71xx family).
>
> While booting, the kernel hangs in the calibrate delay loop. I am getting only one interrupt from the timer. With a similar SMTC platform support file (changed to map the smp_ops structure), the 2.6.24-stable branch kernel (where the latest timer structure was introduced) boots fine.
Timer interrupts work differently in SMTC. Each CPU needs a clock event
device, that is, an interrupt timer, but the CPU core is restricted to just
one per VPE, so in a typical SMTC setup multiple CPUs, aka TCs, will have to
share an interrupt timer. The way this works is that one of the TCs
associated with a VPE will take the timer interrupt and forward it to
the other TCs associated with the same VPE (if any) through a software
IPI mechanism. The race conditions that need to be handled to make this
work are ... interesting. Your problem seems to be simpler as you only
get a single timer interrupt.
Ralf
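
[Editorial illustration, not part of Ralf's message.] The forwarding scheme
described above can be sketched as a toy model. This is not the kernel's SMTC
code: the function name deliver_timer_tick, the TC-to-VPE layout, and the
choice of the lowest-numbered TC as the one that takes the hardware interrupt
are all assumptions made for illustration, and the race conditions mentioned
above are ignored entirely.

```python
from collections import defaultdict


def deliver_timer_tick(tc_to_vpe, ticks):
    """Simulate one timer interrupt per VPE, fanned out to sibling TCs.

    tc_to_vpe: dict mapping a TC number (a Linux "CPU" under SMTC) to its VPE.
    ticks:     dict mapping a TC number to the ticks it has seen so far.
    Returns the list of (sender, receiver) pairs for the simulated IPIs.
    """
    # Group the TCs by the VPE they are bound to.
    vpe_members = defaultdict(list)
    for tc, vpe in sorted(tc_to_vpe.items()):
        vpe_members[vpe].append(tc)

    ipis_sent = []
    for vpe, tcs in vpe_members.items():
        owner = tcs[0]               # assumed: this TC takes the hardware interrupt
        ticks[owner] += 1
        for sibling in tcs[1:]:      # forward the tick through a software IPI
            ipis_sent.append((owner, sibling))
            ticks[sibling] += 1
    return ipis_sent


# Hypothetical layout: 7 TCs split across 2 VPEs.
layout = {0: 0, 1: 0, 2: 0, 3: 0, 4: 1, 5: 1, 6: 1}
ticks = {tc: 0 for tc in layout}
print(deliver_timer_tick(layout, ticks))
print(ticks)
```

In this model every TC sees exactly one tick per hardware timer period, even
though only one TC per VPE is interrupted by the hardware.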
^ permalink raw reply [flat|nested] 68+ messages in thread
* SMTC support status in latest git head.
@ 2010-12-08 13:48 ` Anoop P.A.
0 siblings, 0 replies; 68+ messages in thread
From: Anoop P.A. @ 2010-12-08 13:48 UTC (permalink / raw)
To: linux-mips
Hi list,
Is anybody aware of the SMTC support status in the latest git sources? I have tried testing an SMTC kernel for Malta in QEMU / OVP without any success (the emulators do not work for the 34K).
I am trying to bring up SMTC Linux support for a MIPS 34K based SoC (MSP71xx family).
While booting, the kernel hangs in the calibrate delay loop. I am getting only one interrupt from the timer. With a similar SMTC platform support file (changed to map the smp_ops structure), the 2.6.24-stable branch kernel (where the latest timer structure was introduced) boots fine.
[ 0.000000] Linux version 2.6.37-rc1-pmc-00197-g5bfd3ba-dirty (paanoop1@paanoop1-desktop) (gcc version 4.5.1 (GCC) ) #168 SMP PREEMPT Wed Dec 8 19:19:490
[ 0.000000] DSPRAM0: PA=1c100000,Size=00008000,enabled
[ 0.000000] UART clock set to 50000000
[ 0.000000] CPU revision is: 00019548 (MIPS 34Kc)
[ 0.000000] Determined physical RAM map:
[ 0.000000] memory: 00001000 @ 00000000 (reserved)
[ 0.000000] memory: 000ff000 @ 00001000 (usable)
[ 0.000000] memory: 003f2000 @ 00100000 (reserved)
[ 0.000000] memory: 0fad9200 @ 004f2000 (usable)
[ 0.000000] Wasting 32 bytes for tracking 1 unused pages
[ 0.000000] Zone PFN ranges:
[ 0.000000] Normal 0x00000000 -> 0x0000ffcb
[ 0.000000] Movable zone start PFN for each node
[ 0.000000] early_node_map[1] active PFN ranges
[ 0.000000] 0: 0x00000000 -> 0x0000ffcb
[ 0.000000] 6 available secondary CPU TC(s)
[ 0.000000] PERCPU: Embedded 7 pages/cpu @81203000 s6464 r8192 d14016 u32768
[ 0.000000] pcpu-alloc: s6464 r8192 d14016 u32768 alloc=8*4096
[ 0.000000] pcpu-alloc: [0] 0 [0] 1 [0] 2 [0] 3 [0] 4 [0] 5 [0] 6
[ 0.000000] Built 1 zonelists in Zone order, mobility grouping on. Total pages: 64971
[ 0.000000] Kernel command line: console=ttyS0,57600
[ 0.000000] PID hash table entries: 1024 (order: 0, 4096 bytes)
[ 0.000000] Dentry cache hash table entries: 32768 (order: 5, 131072 bytes)
[ 0.000000] Inode-cache hash table entries: 16384 (order: 4, 65536 bytes)
[ 0.000000] Primary instruction cache 64kB, VIPT, 4-way, linesize 32 bytes.
[ 0.000000] Primary data cache 64kB, 4-way, PIPT, no aliases, linesize 32 bytes
[ 0.000000] Writing ErrCtl register=00000000
[ 0.000000] Readback ErrCtl register=00000000
[ 0.000000] Memory: 254360k/257888k available (3081k kernel code, 3528k reserved, 653k data, 200k init, 0k highmem)
[ 0.000000] Preemptable hierarchical RCU implementation.
[ 0.000000] NR_IRQS:128
[ 0.000000] console [ttyS0] enabled
[ 0.000000] Clock rate set to 600000000
[ 0.000000] Calibrating delay loop...
Any ideas on how to debug this issue?
Thanks,
Anoop
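
[Editorial illustration, not part of the original message.] A boot that stops
at "Calibrating delay loop..." after a single timer interrupt is consistent
with the calibration code busy-waiting for the tick counter to advance. The
toy model below assumes calibration has to observe at least two ticks (one to
align to a tick edge, one to measure against). FakeTimer, wait_for_next_tick,
calibrate, and the spin budget are hypothetical names and values, not kernel
interfaces. The real loop has no budget and would simply spin forever.

```python
class FakeTimer:
    """A timer that can raise only a limited number of interrupts."""

    def __init__(self, interrupts_available, spins_per_tick=100):
        self.remaining = interrupts_available
        self.spins_per_tick = spins_per_tick
        self.jiffies = 0
        self._spins = 0

    def spin(self):
        # One iteration of a busy-wait; a tick arrives every spins_per_tick
        # iterations for as long as the timer still delivers interrupts.
        self._spins += 1
        if self._spins >= self.spins_per_tick and self.remaining > 0:
            self._spins = 0
            self.remaining -= 1
            self.jiffies += 1


def wait_for_next_tick(timer, spin_budget=10_000):
    """Busy-wait until jiffies changes; give up once the budget is spent."""
    start = timer.jiffies
    for _ in range(spin_budget):
        timer.spin()
        if timer.jiffies != start:
            return True
    return False


def calibrate(timer, ticks_needed=2):
    """Report whether enough ticks arrive for a calibration pass."""
    for n in range(ticks_needed):
        if not wait_for_next_tick(timer):
            return f"stuck waiting for tick {n + 1}"
    return "calibrated"


print(calibrate(FakeTimer(interrupts_available=1)))
print(calibrate(FakeTimer(interrupts_available=5)))
```

With only one interrupt available the model stalls on the second wait, which
matches the symptom reported here. With a steadily ticking timer it completes.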
^ permalink raw reply [flat|nested] 68+ messages in thread
end of thread, other threads:[~2011-01-13 7:53 UTC]
Thread overview: 68+ messages
2010-12-16 15:37 SMTC support status in latest git head STUART VENTERS
2010-12-16 15:37 ` STUART VENTERS
[not found] ` <4D0A677C.6040104@paralogos.com>
2010-12-16 19:58 ` Kevin D. Kissell
2010-12-17 21:35 ` Kevin D. Kissell
2010-12-20 10:44 ` Anoop P A
[not found] ` <4D10F7A9.1020306@paralogos.com>
2010-12-21 20:06 ` Anoop P.A.
2010-12-21 20:06 ` Anoop P.A.
2010-12-21 20:29 ` Anoop P.A.
2010-12-21 20:29 ` Anoop P.A.
2010-12-22 10:27 ` Kevin D. Kissell
2010-12-22 11:35 ` Anoop P A
2010-12-22 11:37 ` Kevin D. Kissell
2010-12-22 11:51 ` Anoop P A
2010-12-22 13:03 ` Kevin D. Kissell
2010-12-22 16:34 ` STUART VENTERS
2010-12-22 16:34 ` STUART VENTERS
2010-12-23 21:09 ` STUART VENTERS
2010-12-23 21:09 ` STUART VENTERS
2010-12-24 12:32 ` Kevin D. Kissell
2010-12-24 14:39 ` Anoop P A
2010-12-24 14:53 ` Kevin D. Kissell
2010-12-24 16:02 ` Anoop P A
2010-12-24 23:34 ` Kevin D. Kissell
2010-12-25 7:32 ` Anoop P A
2010-12-25 15:17 ` Kevin D. Kissell
2010-12-27 15:49 ` STUART VENTERS
2010-12-27 15:49 ` STUART VENTERS
2010-12-27 17:19 ` Anoop P A
2010-12-28 8:19 ` Anoop P A
2010-12-28 8:43 ` Kevin D. Kissell
2010-12-31 12:27 ` Anoop P A
2011-01-01 8:42 ` Kevin D. Kissell
2011-01-03 15:12 ` Anoop P A
2011-01-03 16:14 ` Kevin D. Kissell
2011-01-03 19:20 ` Anoop P A
2011-01-04 8:17 ` Kevin D. Kissell
2011-01-04 13:02 ` Anoop P A
2011-01-04 14:37 ` Anoop P A
2011-01-04 17:21 ` Kevin D. Kissell
2011-01-04 17:54 ` Anoop P A
2011-01-04 18:33 ` Kevin D. Kissell
2011-01-05 13:11 ` Anoop P A
2011-01-05 19:23 ` Kevin D. Kissell
2011-01-06 20:23 ` Anoop P A
2011-01-06 23:31 ` Kevin D. Kissell
2011-01-07 7:56 ` Anoop P A
2011-01-07 18:46 ` Kevin D. Kissell
2011-01-08 19:33 ` Anoop P A
2011-01-10 19:30 ` Kevin D. Kissell
2011-01-11 4:05 ` Anoop P A
2011-01-13 7:53 ` Kevin D. Kissell
2011-01-04 17:40 ` Kevin D. Kissell
2011-01-05 13:09 ` Anoop P A
-- strict thread matches above, loose matches on Subject: below --
2010-12-14 21:27 STUART VENTERS
2010-12-14 21:27 ` STUART VENTERS
2010-12-14 23:01 ` Kevin D. Kissell
2010-12-08 13:48 Anoop P.A.
2010-12-08 13:48 ` Anoop P.A.
2010-12-09 17:07 ` Ralf Baechle
2010-12-09 18:52 ` Kevin D. Kissell
2010-12-14 15:25 ` Anoop P.A.
2010-12-14 15:25 ` Anoop P.A.
2010-12-14 18:32 ` Kevin D. Kissell
2010-12-14 18:50 ` Ralf Baechle
2010-12-15 19:18 ` Anoop P A
2010-12-15 19:58 ` Kevin D. Kissell
2010-12-16 13:03 ` Anoop P A
2010-12-16 18:43 ` Kevin D. Kissell