* SPARC32 SMP IRQ15 question
@ 2010-12-16 14:24 Daniel Hellstrom
  2010-12-21 21:02 ` David Miller
                   ` (18 more replies)
  0 siblings, 19 replies; 20+ messages in thread
From: Daniel Hellstrom @ 2010-12-16 14:24 UTC (permalink / raw)
  To: sparclinux

Hello Dave,

I have an architectural question about the SPARC32 port regarding how 
IRQ15 is used for cross calls.

Why is IRQ15, the non-maskable IRQ, used for cross calls? Would it not 
be safer to use IRQ14?

Since IRQ15 is non-maskable, it will even interrupt regions protected by 
spin_lock_irqsave(). I assume this is safe as long as the cross call 
function, which runs in IRQ context, does not try to take the same spinlock, 
since I believe that would create a deadlock. For example, atomic_add() on 
SPARC32 (below) is implemented using one of four global spinlocks. Does 
that mean we cannot use atomic functions at all from within a 
cross call function?

#define atomic_add(i, v)    ((void)__atomic_add_return( (int)(i), (v)))

#define ATOMIC_HASH_SIZE    4
#define ATOMIC_HASH(a)    (&__atomic_hash[(((unsigned long)a)>>8) & (ATOMIC_HASH_SIZE-1)])

spinlock_t __atomic_hash[ATOMIC_HASH_SIZE] = {
    [0 ... (ATOMIC_HASH_SIZE-1)] = SPIN_LOCK_UNLOCKED
};

int __atomic_add_return(int i, atomic_t *v)
{
    int ret;
    unsigned long flags;
    spin_lock_irqsave(ATOMIC_HASH(v), flags);

    ret = (v->counter += i);

    spin_unlock_irqrestore(ATOMIC_HASH(v), flags);
    return ret;
}
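
The deadlock condition described above can be sketched in user space. This is only a model under stated assumptions: atomic_hash_index() mirrors the ATOMIC_HASH() index computation, while hash_lock_held[] and nmi_would_deadlock() are hypothetical names introduced for illustration and are not kernel code. It shows that two different atomic variables can share a hash lock, so the IRQ15 handler does not even need to touch the same variable to spin forever.

```c
#include <stdbool.h>
#include <stdint.h>

#define ATOMIC_HASH_SIZE 4

/* Same index computation as ATOMIC_HASH(): bits 8..9 of the address. */
static unsigned int atomic_hash_index(uintptr_t addr)
{
    return (unsigned int)((addr >> 8) & (ATOMIC_HASH_SIZE - 1));
}

/* Hypothetical model of the four hash locks: true means "held". */
static bool hash_lock_held[ATOMIC_HASH_SIZE];

/*
 * Returns true if an IRQ15 handler that touches the atomic at nmi_addr
 * would spin forever, because the code it interrupted on the same CPU
 * already holds the hash lock covering interrupted_addr.
 */
static bool nmi_would_deadlock(uintptr_t interrupted_addr, uintptr_t nmi_addr)
{
    unsigned int held = atomic_hash_index(interrupted_addr);
    bool deadlock;

    hash_lock_held[held] = true;    /* models spin_lock_irqsave() succeeding */
    /* IRQ15 arrives here: irqsave does not mask it. */
    deadlock = hash_lock_held[atomic_hash_index(nmi_addr)];
    hash_lock_held[held] = false;   /* release the modeled lock */
    return deadlock;
}
```

With these definitions, addresses 0x1000 and 0x1400 both map to lock 0, so nmi_would_deadlock(0x1000, 0x1400) reports a deadlock even though the two atomics are distinct.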


This particular case is interesting because atomic operations are used 
by the helper functions of drain_local_pages(), which drain_all_pages() 
schedules as a cross call:

#0   0xf02cb884   0xf14b97a0   _raw_spin_lock_irqsave + 0x54
#1   0xf0195024   0xf14b9800   __atomic_add_return + 0x18  (via zone_page_state_add() include/linux/vmstat.h: 145)
#2   0xf007dfa8   0xf14b9860   __mod_zone_page_state + 0x64  (mm/vmstat.c: 165)
#3   0xf006f9cc   0xf14b98c0   free_pcppages_bulk + 0x340  (mm/page_alloc.c: 586)
#4   0xf006fb58   0xf14b9938   drain_local_pages + 0x64
#5   0xf001cb00   0xf14b9998   leon_cross_call_irq + 0x3c

/*
 * Spill all the per-cpu pages from all CPUs back into the buddy allocator
 */
void drain_all_pages(void)
{
    on_each_cpu(drain_local_pages, NULL, 1);
}


Best Regards,

Daniel Hellstrom
Aeroflex Gaisler


* Re: SPARC32 SMP IRQ15 question
  2010-12-16 14:24 SPARC32 SMP IRQ15 question Daniel Hellstrom
@ 2010-12-21 21:02 ` David Miller
  2010-12-21 22:44 ` Jiri Gaisler
                   ` (17 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: David Miller @ 2010-12-21 21:02 UTC (permalink / raw)
  To: sparclinux

From: Daniel Hellstrom <daniel@gaisler.com>
Date: Thu, 16 Dec 2010 15:24:02 +0100

> Why is IRQ15, the non-maskable IRQ, used for cross calls? Would it not
> be safer to use IRQ14?
> 
> Since IRQ15 is non-maskable it will even interrupt spin_lock_irqsave()
> protected regions. I assume it is safe as long as the cross call
> function run in IRQ context does not try to take the same spinlock,
> for that would create a deadlock I believe. For example atomic_add()
> on SPARC32 below is implemented using one of four global spinlocks,
> does that mean that we can not use atomic functions at all from within
> a cross call function?

We don't want operations like TLB and cache flushes to be blocked
by IRQ disabling.

For other operations, we should reschedule it to a software interrupt
at a lower level than 15, but nobody has done the work to implement
this yet.
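
As a rough illustration of the rescheduling idea, here is a user-space sketch, not the kernel implementation. All names (irq15_cross_call, soft_irq_handler, irqs_masked, and so on) are hypothetical. The assumption is that the IRQ15 handler only records the work and latches a maskable lower-level interrupt, so the function itself never runs inside a region where interrupts are disabled. Operations such as TLB and cache flushes would still be handled directly at IRQ15 and are not modeled here.

```c
#include <stdbool.h>
#include <stddef.h>

typedef void (*cross_call_fn)(void);

/* Hypothetical per-CPU state for this sketch. */
static bool irqs_masked;          /* models local_irq_save(): masks levels 1..14 */
static cross_call_fn pending_fn;  /* work handed from IRQ15 to the soft IRQ */
static bool soft_irq_raised;      /* models a latched lower-level interrupt */

/* Runs at a maskable level, so it never interrupts an irqsave region. */
static void soft_irq_handler(void)
{
    cross_call_fn fn = pending_fn;

    pending_fn = NULL;
    soft_irq_raised = false;
    if (fn)
        fn();
}

/* Deliver the latched soft interrupt if the CPU currently accepts it. */
static void deliver_pending_irqs(void)
{
    if (soft_irq_raised && !irqs_masked)
        soft_irq_handler();
}

/* IRQ15 entry: non-maskable, so it only records the work and returns. */
static void irq15_cross_call(cross_call_fn fn)
{
    pending_fn = fn;
    soft_irq_raised = true;
    deliver_pending_irqs();
}

static void local_irq_mask(void)   { irqs_masked = true; }
static void local_irq_unmask(void) { irqs_masked = false; deliver_pending_irqs(); }

/* Example payload and a counter used to observe when it runs. */
static int drained;
static void drain_example(void) { drained++; }
```

If irq15_cross_call(drain_example) arrives between local_irq_mask() and local_irq_unmask(), the payload runs only at the unmask, which is the behaviour spin_lock_irqsave() callers rely on.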


* Re: SPARC32 SMP IRQ15 question
  2010-12-16 14:24 SPARC32 SMP IRQ15 question Daniel Hellstrom
  2010-12-21 21:02 ` David Miller
@ 2010-12-21 22:44 ` Jiri Gaisler
  2010-12-22  3:54 ` David Miller
                   ` (16 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: Jiri Gaisler @ 2010-12-21 22:44 UTC (permalink / raw)
  To: sparclinux


We are thinking of switching the LEON port to use CASA (compare and swap),
just like in the sparc64 port. Most newer leon3/4 processors implement CASA,
even if the instruction belongs to sparc V9 and not V8. Would a patch like
that be accepted (for the LEON port only)? This will not fix the SMP for
general sparc32 (which really is broken), but it would fix it for LEON.
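
For illustration only, a compare-and-swap based add-and-return might look roughly like the sketch below. It uses C11 <stdatomic.h> as a portable stand-in for the CASA instruction; it is not the proposed LEON patch, and sketch_atomic_t and sketch_atomic_add_return() are hypothetical names.

```c
#include <stdatomic.h>

/* Stand-in for the kernel's atomic_t; the name is hypothetical. */
typedef struct { _Atomic int counter; } sketch_atomic_t;

/*
 * Lock-free add-and-return built on compare-and-swap. No spinlock is
 * taken, so an interrupt handler that runs the same function on the
 * same variable cannot deadlock against the code it interrupted.
 */
static int sketch_atomic_add_return(int i, sketch_atomic_t *v)
{
    int old = atomic_load(&v->counter);

    /* On failure, 'old' is refreshed with the current value; retry. */
    while (!atomic_compare_exchange_weak(&v->counter, &old, old + i))
        ;
    return old + i;
}
```

The retry loop replaces the hashed spinlock: an interrupting update simply makes the interrupted compare-and-swap fail once and try again.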

Jiri.

David Miller wrote:
> From: Daniel Hellstrom <daniel@gaisler.com>
> Date: Thu, 16 Dec 2010 15:24:02 +0100
> 
>> Why is IRQ15, the non-maskable IRQ, used for cross calls? Would it not
>> be safer to use IRQ14?
>>
>> Since IRQ15 is non-maskable it will even interrupt spin_lock_irqsave()
>> protected regions. I assume it is safe as long as the cross call
>> function run in IRQ context does not try to take the same spinlock,
>> for that would create a deadlock I believe. For example atomic_add()
>> on SPARC32 below is implemented using one of four global spinlocks,
>> does that mean that we can not use atomic functions at all from within
>> a cross call function?
> 
> We don't want operations like TLB and cache flushes to be blocked
> by IRQ disabling.
> 
> For other operations, we should reschedule it to a software interrupt
> at a lower level than 15, but nobody has done the work to implement
> this yet.

* Re: SPARC32 SMP IRQ15 question
  2010-12-16 14:24 SPARC32 SMP IRQ15 question Daniel Hellstrom
  2010-12-21 21:02 ` David Miller
  2010-12-21 22:44 ` Jiri Gaisler
@ 2010-12-22  3:54 ` David Miller
  2010-12-22  9:19 ` Alex Buell
                   ` (15 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: David Miller @ 2010-12-22  3:54 UTC (permalink / raw)
  To: sparclinux

From: Jiri Gaisler <jiri@gaisler.com>
Date: Tue, 21 Dec 2010 23:44:44 +0100

> 
> We are thinking of switching the LEON port to use CASA (compare and swap),
> just like in the sparc64 port. Most newer leon3/4 processors implement CASA,
> even if the instruction belongs to sparc V9 and not V8. Would a patch like
> that be accepted (for the LEON port only)? This will not fix the SMP for
> general sparc32 (which really is broken), but it would fix it for LEON.

It should be fine as long as the patches are clean enough.


* Re: SPARC32 SMP IRQ15 question
  2010-12-16 14:24 SPARC32 SMP IRQ15 question Daniel Hellstrom
                   ` (2 preceding siblings ...)
  2010-12-22  3:54 ` David Miller
@ 2010-12-22  9:19 ` Alex Buell
  2010-12-22  9:51 ` Jiri Gaisler
                   ` (14 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: Alex Buell @ 2010-12-22  9:19 UTC (permalink / raw)
  To: sparclinux

On Tue, 2010-12-21 at 19:54 -0800, David Miller wrote:
> > We are thinking of switching the LEON port to use CASA (compare and
> > swap), just like in the sparc64 port. Most newer leon3/4 processors
> > implement CASA, even if the instruction belongs to sparc V9 and not
> > V8. Would a patch like that be accepted (for the LEON port only)? This
> > will not fix the SMP for general sparc32 (which really is broken), but
> > it would fix it for LEON.
> 
> It should be fine as long as the patches are clean enough. 

Rather than specialising it for LEON, why not just test for CASA
availability and use it if it is available?
-- 
Tactical Nuclear Kittens


* Re: SPARC32 SMP IRQ15 question
  2010-12-16 14:24 SPARC32 SMP IRQ15 question Daniel Hellstrom
                   ` (3 preceding siblings ...)
  2010-12-22  9:19 ` Alex Buell
@ 2010-12-22  9:51 ` Jiri Gaisler
  2010-12-22 12:27 ` Daniel Hellstrom
                   ` (13 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: Jiri Gaisler @ 2010-12-22  9:51 UTC (permalink / raw)
  To: sparclinux



Alex Buell wrote:
> On Tue, 2010-12-21 at 19:54 -0800, David Miller wrote:
>>> We are thinking of switching the LEON port to use CASA (compare and
>>> swap), just like in the sparc64 port. Most newer leon3/4 processors
>>> implement CASA, even if the instruction belongs to sparc V9 and not
>>> V8. Would a patch like that be accepted (for the LEON port only)? This
>>> will not fix the SMP for general sparc32 (which really is broken), but
>>> it would fix it for LEON.
>>
>> It should be fine as long as the patches are clean enough. 
> 
> Rather than specialising it for LEON, why not just test for CASA
> availability and use it if it is available?

The CASA is only necessary for SMP systems, as the current irq15
solution will NOT work on sparc32-SMP. So falling back on irq15
when CASA is not present is not really an option, as it will lead
to a deadlock sooner or later ...


* Re: SPARC32 SMP IRQ15 question
  2010-12-16 14:24 SPARC32 SMP IRQ15 question Daniel Hellstrom
                   ` (4 preceding siblings ...)
  2010-12-22  9:51 ` Jiri Gaisler
@ 2010-12-22 12:27 ` Daniel Hellstrom
  2010-12-22 19:53 ` David Miller
                   ` (12 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: Daniel Hellstrom @ 2010-12-22 12:27 UTC (permalink / raw)
  To: sparclinux

David Miller wrote:

>From: Daniel Hellstrom <daniel@gaisler.com>
>Date: Thu, 16 Dec 2010 15:24:02 +0100
>
>  
>
>>Why is IRQ15, the non-maskable IRQ, used for cross calls? Would it not
>>be safer to use IRQ14?
>>
>>Since IRQ15 is non-maskable it will even interrupt spin_lock_irqsave()
>>protected regions. I assume it is safe as long as the cross call
>>function run in IRQ context does not try to take the same spinlock,
>>for that would create a deadlock I believe. For example atomic_add()
>>on SPARC32 below is implemented using one of four global spinlocks,
>>does that mean that we can not use atomic functions at all from within
>>a cross call function?
>>    
>>
>
>We don't want operations like TLB and cache flushes to be blocked
>by IRQ disabling.
>
>For other operations, we should reschedule it to a software interrupt
>at a lower level than 15, but nobody has done the work to implement
>this yet.
>
>
>  
>
Thank you for your answer.

I will try to create a patch for the atomic layer for SMP LEON systems 
since they have the CASA instruction.

Thanks,
Daniel



* Re: SPARC32 SMP IRQ15 question
  2010-12-16 14:24 SPARC32 SMP IRQ15 question Daniel Hellstrom
                   ` (5 preceding siblings ...)
  2010-12-22 12:27 ` Daniel Hellstrom
@ 2010-12-22 19:53 ` David Miller
  2010-12-22 19:56 ` David Miller
                   ` (11 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: David Miller @ 2010-12-22 19:53 UTC (permalink / raw)
  To: sparclinux

From: Alex Buell <alex.buell@munted.org.uk>
Date: Wed, 22 Dec 2010 09:19:04 +0000

> On Tue, 2010-12-21 at 19:54 -0800, David Miller wrote:
>> > We are thinking of switching the LEON port to use CASA (compare and
>> > swap), just like in the sparc64 port. Most newer leon3/4 processors
>> > implement CASA, even if the instruction belongs to sparc V9 and not
>> > V8. Would a patch like that be accepted (for the LEON port only)? This
>> > will not fix the SMP for general sparc32 (which really is broken), but
>> > it would fix it for LEON.
>> 
>> It should be fine as long as the patches are clean enough. 
> 
> Rather than specialising it for LEON, why not just test for CASA
> availability and use it if it is available?

Run time detection of this is going to really be terrible.

Lots of cmpxchg() uses are inline, and we'll have to implement
it out-of-line and patch the call instructions dynamically.

LEON is the only chip that can do the instruction and we already
do a special LEON specific build.
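
To make the cost concrete, here is a hedged sketch of what an out-of-line, boot-time-selected cmpxchg could look like. It uses a function pointer rather than patching call instructions, which is a simpler substitute for the dynamic patching mentioned above. All names are hypothetical, the "locked" variant omits the actual spinlock, and the GCC/Clang builtin __sync_val_compare_and_swap stands in for CASA.

```c
#include <stdbool.h>

/* Signature shared by both hypothetical implementations. */
typedef int (*cmpxchg_fn)(int *ptr, int old, int new_val);

/* Models the lock-based fallback used when the CPU lacks CASA. */
static int cmpxchg_locked(int *ptr, int old, int new_val)
{
    int prev = *ptr;            /* a real version would hold a spinlock here */

    if (prev == old)
        *ptr = new_val;
    return prev;
}

/* Models the CASA path; uses a GCC/Clang builtin as a stand-in. */
static int cmpxchg_casa(int *ptr, int old, int new_val)
{
    return __sync_val_compare_and_swap(ptr, old, new_val);
}

/* Chosen once at boot; every cmpxchg then pays an indirect call. */
static cmpxchg_fn cmpxchg_impl = cmpxchg_locked;

static void select_cmpxchg(bool cpu_has_casa)
{
    cmpxchg_impl = cpu_has_casa ? cmpxchg_casa : cmpxchg_locked;
}

static int sketch_cmpxchg(int *ptr, int old, int new_val)
{
    return cmpxchg_impl(ptr, old, new_val);
}
```

Every caller now goes through sketch_cmpxchg() instead of an inlined instruction sequence, which is the overhead a LEON-specific build avoids.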


* Re: SPARC32 SMP IRQ15 question
  2010-12-16 14:24 SPARC32 SMP IRQ15 question Daniel Hellstrom
                   ` (6 preceding siblings ...)
  2010-12-22 19:53 ` David Miller
@ 2010-12-22 19:56 ` David Miller
  2010-12-22 20:23 ` Alex Buell
                   ` (10 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: David Miller @ 2010-12-22 19:56 UTC (permalink / raw)
  To: sparclinux

From: Jiri Gaisler <jiri@gaisler.com>
Date: Wed, 22 Dec 2010 10:51:05 +0100

> 
> 
> Alex Buell wrote:
>> On Tue, 2010-12-21 at 19:54 -0800, David Miller wrote:
>>>> We are thinking of switching the LEON port to use CASA (compare and
>>>> swap), just like in the sparc64 port. Most newer leon3/4 processors
>>>> implement CASA, even if the instruction belongs to sparc V9 and not
>>>> V8. Would a patch like that be accepted (for the LEON port only)? This
>>>> will not fix the SMP for general sparc32 (which really is broken), but
>>>> it would fix it for LEON.
>>>
>>> It should be fine as long as the patches are clean enough. 
>> 
>> Rather than specialising it for LEON, why not just test for CASA
>> availability and use it if it is available?
> 
> The CASA is only necessary for SMP systems, as the current irq15
> solution will NOT work on sparc32-SMP. So falling back on irq15
> when CASA is not present is not really an option, as it will lead
> to a deadlock sooner or later ...

Well actually, you can't just implement CAS and not fix the irq15
issue.

Spinlocks and other code that does local_irq_disable() expect
that smp_call_function() will be blocked by such disables. With
irq15 it is not, and you will crash if you don't fix this.

So no matter what someone has to fix the irq15 NMI issue.
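
A small user-space model of why CAS alone is not enough: the region below uses no locks or atomics at all and relies purely on interrupt disabling. The structure, the function names, and the idea of passing the "NMI" as a callback are all assumptions made for this sketch.

```c
#include <stdbool.h>

/* Hypothetical per-CPU structure whose two fields must stay equal. */
static struct { int pages; int accounted; } pcp;

static bool saw_inconsistent_state;

/* Cross-call function: assumes it never runs inside an irq-off region. */
static void cross_call_check(void)
{
    if (pcp.pages != pcp.accounted)
        saw_inconsistent_state = true;
}

/*
 * Models a region protected only by interrupt disabling. 'nmi' is invoked
 * in the middle to model an IRQ15 cross call, which masking cannot stop;
 * pass a null pointer to model a maskable cross call that is held off.
 */
static void add_page_irqs_off(void (*nmi)(void))
{
    /* local_irq_save(flags) would go here */
    pcp.pages++;
    if (nmi)
        nmi();              /* IRQ15 lands between the two updates */
    pcp.accounted++;
    /* local_irq_restore(flags) would go here */
}
```

Calling add_page_irqs_off(cross_call_check) lets the cross call observe pages != accounted, a state the code inside the region assumed nobody on the same CPU could ever see.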


* Re: SPARC32 SMP IRQ15 question
  2010-12-16 14:24 SPARC32 SMP IRQ15 question Daniel Hellstrom
                   ` (7 preceding siblings ...)
  2010-12-22 19:56 ` David Miller
@ 2010-12-22 20:23 ` Alex Buell
  2010-12-22 22:28 ` David Miller
                   ` (9 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: Alex Buell @ 2010-12-22 20:23 UTC (permalink / raw)
  To: sparclinux

On Wed, 2010-12-22 at 11:53 -0800, David Miller wrote:
> > Rather than specialising it for LEON, why not just test for CASA
> > availability and use it if it is available?
> 
> Run time detection of this is going to really be terrible.
> 
> Lots of cmpxchg() uses are inline, and we'll have to implement
> it out-of-line and patch the call instructions dynamically.
> 
> LEON is the only chip that can do the instruction and we already
> do a special LEON specific build. 

Inlines do save a lot of time, so I guess you're right that there's no point.
But I think changing the way the IRQ15/IRQ14 interrupts work could be an
extra bonus. 
-- 
Tactical Nuclear Kittens


* Re: SPARC32 SMP IRQ15 question
  2010-12-16 14:24 SPARC32 SMP IRQ15 question Daniel Hellstrom
                   ` (8 preceding siblings ...)
  2010-12-22 20:23 ` Alex Buell
@ 2010-12-22 22:28 ` David Miller
  2010-12-26 12:34 ` Kjetil Oftedal
                   ` (8 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: David Miller @ 2010-12-22 22:28 UTC (permalink / raw)
  To: sparclinux

From: Daniel Hellstrom <daniel@gaisler.com>
Date: Wed, 22 Dec 2010 14:23:27 +0100

> I will try to create a patch for the atomic layer for SMP LEON systems
> since they have the CASA instruction.

But see my other posting, you still have to fix the irq15 problem.

Merely switching to CASA doesn't mean you don't still have a problem
because spin_lock_irqsave() and other similar pieces of code expect
they will not be interrupted by smp_call_function() calls.


* Re: SPARC32 SMP IRQ15 question
  2010-12-16 14:24 SPARC32 SMP IRQ15 question Daniel Hellstrom
                   ` (9 preceding siblings ...)
  2010-12-22 22:28 ` David Miller
@ 2010-12-26 12:34 ` Kjetil Oftedal
  2010-12-26 14:44 ` Jiri Gaisler
                   ` (7 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: Kjetil Oftedal @ 2010-12-26 12:34 UTC (permalink / raw)
  To: sparclinux

On 22 December 2010 23:28, David Miller <davem@davemloft.net> wrote:
> From: Daniel Hellstrom <daniel@gaisler.com>
> Date: Wed, 22 Dec 2010 14:23:27 +0100
>
>> I will try to create a patch for the atomic layer for SMP LEON systems
>> since they have the CASA instruction.
>
> But see my other posting, you still have to fix the irq15 problem.
>
> Merely switching to CASA doesn't mean you don't still have a problem
> because spin_lock_irqsave() and other similar pieces of code expect
> they will not be interrupted by smp_call_function() calls.

Was SPARC32 SMP in 2.4 also implemented using IRQ15/NMI ?


* Re: SPARC32 SMP IRQ15 question
  2010-12-16 14:24 SPARC32 SMP IRQ15 question Daniel Hellstrom
                   ` (10 preceding siblings ...)
  2010-12-26 12:34 ` Kjetil Oftedal
@ 2010-12-26 14:44 ` Jiri Gaisler
  2010-12-27  1:55 ` David Miller
                   ` (6 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: Jiri Gaisler @ 2010-12-26 14:44 UTC (permalink / raw)
  To: sparclinux



Kjetil Oftedal wrote:
> On 22 December 2010 23:28, David Miller <davem@davemloft.net> wrote:
>> From: Daniel Hellstrom <daniel@gaisler.com>
>> Date: Wed, 22 Dec 2010 14:23:27 +0100
>>
>>> I will try to create a patch for the atomic layer for SMP LEON systems
>>> since they have the CASA instruction.
>>
>> But see my other posting, you still have to fix the irq15 problem.
>>
>> Merely switching to CASA doesn't mean you don't still have a problem
>> because spin_lock_irqsave() and other similar pieces of code expect
>> they will not be interrupted by smp_call_function() calls.
> 
> Was SPARC32 SMP in 2.4 also implemented using IRQ15/NMI ?


Don't know about 2.4, but the interesting thing is that sparc32 SMP
does work on 2.6.21.1 and seems to have been broken somewhere between
2.6.23 and 2.6.24. So something happened with the irq15 handler such that
it now tries to grab a taken spinlock and the system deadlocks.
It could also be that the problem has been there all along, but
was not triggered in the pre-2.6.24 kernels ...

I think we first need to understand what the root cause is, so that
we fix the right thing. I have confidence that Daniel and Konrad
will hash this out somehow during the coming months...

Jiri.


* Re: SPARC32 SMP IRQ15 question
  2010-12-16 14:24 SPARC32 SMP IRQ15 question Daniel Hellstrom
                   ` (11 preceding siblings ...)
  2010-12-26 14:44 ` Jiri Gaisler
@ 2010-12-27  1:55 ` David Miller
  2010-12-27 17:28 ` crn
                   ` (5 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: David Miller @ 2010-12-27  1:55 UTC (permalink / raw)
  To: sparclinux

From: Kjetil Oftedal <oftedal@gmail.com>
Date: Sun, 26 Dec 2010 13:34:56 +0100

> On 22 December 2010 23:28, David Miller <davem@davemloft.net> wrote:
>> From: Daniel Hellstrom <daniel@gaisler.com>
>> Date: Wed, 22 Dec 2010 14:23:27 +0100
>>
>>> I will try to create a patch for the atomic layer for SMP LEON systems
>>> since they have the CASA instruction.
>>
>> But see my other posting, you still have to fix the irq15 problem.
>>
>> Merely switching to CASA doesn't mean you don't still have a problem
>> because spin_lock_irqsave() and other similar pieces of code expect
>> they will not be interrupted by smp_call_function() calls.
> 
> Was SPARC32 SMP in 2.4 also implemented using IRQ15/NMI ?

Yes, it's essentially always had this problem.

It was less important back then because smp_call_function() was
really not used much by generic code.  Now it's used everywhere.


* Re: SPARC32 SMP IRQ15 question
  2010-12-16 14:24 SPARC32 SMP IRQ15 question Daniel Hellstrom
                   ` (12 preceding siblings ...)
  2010-12-27  1:55 ` David Miller
@ 2010-12-27 17:28 ` crn
  2011-01-03 14:14 ` Daniel Hellstrom
                   ` (4 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: crn @ 2010-12-27 17:28 UTC (permalink / raw)
  To: sparclinux

> From: Kjetil Oftedal <oftedal@gmail.com>
> Date: Sun, 26 Dec 2010 13:34:56 +0100
>
>> On 22 December 2010 23:28, David Miller <davem@davemloft.net> wrote:
>>> From: Daniel Hellstrom <daniel@gaisler.com>
>>> Date: Wed, 22 Dec 2010 14:23:27 +0100
>>>
>>>> I will try to create a patch for the atomic layer for SMP LEON systems
>>>> since they have the CASA instruction.
>>>
>>> But see my other posting, you still have to fix the irq15 problem.
>>>
>>> Merely switching to CASA doesn't mean you don't still have a problem
>>> because spin_lock_irqsave() and other similar pieces of code expect
>>> they will not be interrupted by smp_call_function() calls.
>>
>> Was SPARC32 SMP in 2.4 also implemented using IRQ15/NMI ?
>
> Yes, it's essentially always had this problem.
>
> It was less important back then because smp_call_function() was
> really not used much by generic code.  Now it's used everywhere.

Maybe this could be why I could never get a sun4d SS1000E with 8
processors to stay up for more than a few minutes without locking solid.
OTOH it could be irrelevant.




* Re: SPARC32 SMP IRQ15 question
  2010-12-16 14:24 SPARC32 SMP IRQ15 question Daniel Hellstrom
                   ` (13 preceding siblings ...)
  2010-12-27 17:28 ` crn
@ 2011-01-03 14:14 ` Daniel Hellstrom
  2011-01-03 14:19 ` Daniel Hellstrom
                   ` (3 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: Daniel Hellstrom @ 2011-01-03 14:14 UTC (permalink / raw)
  To: sparclinux

David Miller wrote:

>From: Daniel Hellstrom <daniel@gaisler.com>
>Date: Wed, 22 Dec 2010 14:23:27 +0100
>
>  
>
>>I will try to create a patch for the atomic layer for SMP LEON systems
>>since they have the CASA instruction.
>>    
>>
>
>But see my other posting, you still have to fix the irq15 problem.
>
>Merely switching to CASA doesn't mean you don't still have a problem
>because spin_lock_irqsave() and other similar pieces of code expect
>they will not be interrupted by smp_call_function() calls.
>  
>
I totally agree. I brought up the atomic layer as an example because it 
is the easiest case to trigger. I guess the atomic layer can be modified for 
LEON to gain some performance. I made a dirty patch for testing, and it 
seemed to work, but then the system freezes in the memory zone lock instead.

I will continue my investigations later. First I want to clean up my 
single-CPU patches and submit them to the list.

Daniel


* Re: SPARC32 SMP IRQ15 question
  2010-12-16 14:24 SPARC32 SMP IRQ15 question Daniel Hellstrom
                   ` (14 preceding siblings ...)
  2011-01-03 14:14 ` Daniel Hellstrom
@ 2011-01-03 14:19 ` Daniel Hellstrom
  2011-01-26 17:02 ` Daniel Hellstrom
                   ` (2 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: Daniel Hellstrom @ 2011-01-03 14:19 UTC (permalink / raw)
  To: sparclinux

crn@pop3.netunix.com wrote:

>>From: Kjetil Oftedal <oftedal@gmail.com>
>>Date: Sun, 26 Dec 2010 13:34:56 +0100
>>
>>    
>>
>>>On 22 December 2010 23:28, David Miller <davem@davemloft.net> wrote:
>>>      
>>>
>>>>From: Daniel Hellstrom <daniel@gaisler.com>
>>>>Date: Wed, 22 Dec 2010 14:23:27 +0100
>>>>
>>>>        
>>>>
>>>>>I will try to create a patch for the atomic layer for SMP LEON systems
>>>>>since they have the CASA instruction.
>>>>>          
>>>>>
>>>>But see my other posting, you still have to fix the irq15 problem.
>>>>
>>>>Merely switching to CASA doesn't mean you don't still have a problem
>>>>because spin_lock_irqsave() and other similar pieces of code expect
>>>>they will not be interrupted by smp_call_function() calls.
>>>>        
>>>>
>>>Was SPARC32 SMP in 2.4 also implemented using IRQ15/NMI ?
>>>      
>>>
>>Yes, it's essentially always had this problem.
>>
>>It was less important back then because smp_call_function() was
>>really not used much by generic code.  Now it's used everywhere.
>>    
>>
>
>Maybe this could be why I could never get a sun4d SS1000E with 8
>processors to stay up for more than a few minutes without locking solid.
>OTOH it could be irrelevant.
>  
>
It might very well be due to this problem. The boot process and much 
else work every time, but after some minutes of higher CPU load 
the system tends to hang. That is the behaviour we have seen so far.

Daniel


* Re: SPARC32 SMP IRQ15 question
  2010-12-16 14:24 SPARC32 SMP IRQ15 question Daniel Hellstrom
                   ` (15 preceding siblings ...)
  2011-01-03 14:19 ` Daniel Hellstrom
@ 2011-01-26 17:02 ` Daniel Hellstrom
  2011-01-26 19:52 ` David Miller
  2011-01-26 21:28 ` daniel
  18 siblings, 0 replies; 20+ messages in thread
From: Daniel Hellstrom @ 2011-01-26 17:02 UTC (permalink / raw)
  To: sparclinux

David Miller wrote:

>From: Daniel Hellstrom <daniel@gaisler.com>
>Date: Thu, 16 Dec 2010 15:24:02 +0100
>
>  
>
>>Why is IRQ15, the non-maskable IRQ, used for cross calls? Would it not
>>be safer to use IRQ14?
>>
>>Since IRQ15 is non-maskable it will even interrupt spin_lock_irqsave()
>>protected regions. I assume it is safe as long as the cross call
>>function run in IRQ context does not try to take the same spinlock,
>>for that would create a deadlock I believe. For example atomic_add()
>>on SPARC32 below is implemented using one of four global spinlocks,
>>does that mean that we can not use atomic functions at all from within
>>a cross call function?
>>    
>>
>
>We don't want operations like TLB and cache flushes to be blocked
>by IRQ disabling.
>
>For other operations, we should reschedule it to a software interrupt
>at a lower level than 15, but nobody has done the work to implement
>this yet.
>  
>

I have made a first implementation on the LEON. The hangs that I could 
trigger quite easily are now gone, and I can run my system for several 
days without rather heavy load. I still have some problems on 
another board, but I don't think they are directly related...

Please look at my implementation suggestion (patches in a separate email 
soon). I have tried to separate the patches so that it will be easier to 
implement it for other CPU models, but I don't really know the other 
architectures... It would be nice if someone else could have a look at it, 
ideally someone who has the hardware...

I have removed the use of smp_cross_call in smp_call_function* and 
instead used the generic implementation by defining 
USE_GENERIC_SMP_HELPERS, I think this is the cleanest approach.

Thanks for applying my GRETH patches on the netdev list before,

Daniel


* Re: SPARC32 SMP IRQ15 question
  2010-12-16 14:24 SPARC32 SMP IRQ15 question Daniel Hellstrom
                   ` (16 preceding siblings ...)
  2011-01-26 17:02 ` Daniel Hellstrom
@ 2011-01-26 19:52 ` David Miller
  2011-01-26 21:28 ` daniel
  18 siblings, 0 replies; 20+ messages in thread
From: David Miller @ 2011-01-26 19:52 UTC (permalink / raw)
  To: sparclinux

From: Daniel Hellstrom <daniel@gaisler.com>
Date: Wed, 26 Jan 2011 18:02:22 +0100

> I have made a first implementation on the LEON, the hangs that I
> could trigger quite easily are now gone and I can run my system for
> several days without rather heavy load, however I still have some
> problems on another board, but I don't think they are directly
> related...
> 
> Please look at my implementation suggestion (patches in separate email
> soon), I have tried to separate the patches so that it will be easier
> to implement it for other CPU models, however I don't really know the
> other architectures... it would be nice if someone else could have a
> look at it, ideally someone who has the hardware...
> 
> I have removed the use of smp_cross_call in smp_call_function* and
> instead used the generic implementation by defining
> USE_GENERIC_SMP_HELPERS, I think this is the cleanest approach.

Thanks so much for doing this work!

I'll take a look at your patches soon.


* Re: SPARC32 SMP IRQ15 question
  2010-12-16 14:24 SPARC32 SMP IRQ15 question Daniel Hellstrom
                   ` (17 preceding siblings ...)
  2011-01-26 19:52 ` David Miller
@ 2011-01-26 21:28 ` daniel
  18 siblings, 0 replies; 20+ messages in thread
From: daniel @ 2011-01-26 21:28 UTC (permalink / raw)
  To: sparclinux



On Wed, 26 Jan 2011 11:52:53 -0800 (PST), David Miller wrote:
  > From: Daniel Hellstrom <daniel@gaisler.com>
  > Date: Wed, 26 Jan 2011 18:02:22 +0100
  >
  > > I have made a first implementation on the LEON, the hangs that I
  > > could trigger quite easily are now gone and I can run my system for
  > > several days without rather heavy load, however I still have some
  > > problems on another board, but I don't think they are directly
  > > related... 
  > >
  > > Please look at my implementation suggestion (patches in separate email
  > > soon), I have tried to separate the patches so that it will be easier
  > > to implement it for other CPU models, however I don't really know the
  > > other architectures... it would be nice if someone else could have a
  > > look at it, ideally someone who has the hardware... 
  > >
  > > I have removed the use of smp_cross_call in smp_call_function* and
  > > instead used the generic implementation by defining
  > > USE_GENERIC_SMP_HELPERS, I think this is the cleanest approach. 
  >
  > Thanks so much for doing this work!
  > 
   
   
   
  I must thank you as well! I still haven't replied to your response 
about the per-cpu timers; there is quite a lot of non-LEON work there, but I 
will follow up on that later, even though I might do exactly as you told 
me to :)
   
  The patch is rather small, but I have spent a lot of time testing it. 
   
  I'm looking forward to your reply,
   
  Daniel

