All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] MIPS: ASID conflict after CPU hotplug
@ 2010-11-17 18:49 Maksim Rayskiy
  2010-11-22  3:41 ` Ralf Baechle
  2011-11-10 13:09 ` Ralf Baechle
  0 siblings, 2 replies; 11+ messages in thread
From: Maksim Rayskiy @ 2010-11-17 18:49 UTC (permalink / raw)
  To: linux-mips, ralf

This is a repost of my original message which somehow did not reach
the mailing list (filtered out?).

Hello,

I am running SMP Linux 2.6.37-rc1 on BMIPS5000 (single core dual
thread) and observe some abnormalities when doing system
suspend/resume which I narrowed down to cpu hotplugging. The suspend
brings the second thread processor down and then restarts it, after
which I see memory corruption in userspace. I started digging and
found out that problem occurs because while doing execve() the child
process is getting the same ASID as the parent, which obviously
corrupts parent's address space.

Further digging showed that:
activate_mm() calls get_new_mmu_context() to get a new ASID, but at
this time ASID field in entryHi is 1, and asid_cache(cpu) is 0x100 (it
was just reset to ASID_FIRST_VERSION when the secondary TP was
booting).
So, get_new_mmu_context() increments the asid_cache(cpu) value to
0x101, and thus puts 0x01 into entryHi. The result - ASID field does
not get changed as it was supposed to.

My solution was very simple - do not reset asid_cache(cpu) on TP warm
restart. But I would welcome any comments because my understanding of
the code is somewhat fuzzy.

diff --git a/arch/mips/kernel/traps.c b/arch/mips/kernel/traps.c
index d83f325..ccf9272 100644
--- a/arch/mips/kernel/traps.c
+++ b/arch/mips/kernel/traps.c
@@ -1577,7 +1577,8 @@ void __cpuinit per_cpu_trap_init(void)
        }
 #endif /* CONFIG_MIPS_MT_SMTC */

-       cpu_data[cpu].asid_cache = ASID_FIRST_VERSION;
+       if (!cpu_data[cpu].asid_cache)
+               cpu_data[cpu].asid_cache = ASID_FIRST_VERSION;
        TLBMISS_HANDLER_SETUP();

        atomic_inc(&init_mm.mm_count);

Regards,
Max Rayskiy.

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH] MIPS: ASID conflict after CPU hotplug
  2010-11-17 18:49 [PATCH] MIPS: ASID conflict after CPU hotplug Maksim Rayskiy
@ 2010-11-22  3:41 ` Ralf Baechle
  2010-11-22 18:38   ` Maksim Rayskiy
  2010-11-22 21:34   ` Kevin D. Kissell
  2011-11-10 13:09 ` Ralf Baechle
  1 sibling, 2 replies; 11+ messages in thread
From: Ralf Baechle @ 2010-11-22  3:41 UTC (permalink / raw)
  To: Maksim Rayskiy, Kevin D. Kissell; +Cc: linux-mips

On Wed, Nov 17, 2010 at 10:49:09AM -0800, Maksim Rayskiy wrote:

> This is a repost of my original message which somehow did not reach
> the mailing list (filtered out?).

Yes indeed.  In particular the previous posting was HTML which the list
robot is configured to exterminate.

> I am running SMP Linux 2.6.37-rc1 on BMIPS5000 (single core dual
> thread) and observe some abnormalities when doing system
> suspend/resume which I narrowed down to cpu hotplugging. The suspend
> brings the second thread processor down and then restarts it, after
> which I see memory corruption in userspace. I started digging and
> found out that problem occurs because while doing execve() the child
> process is getting the same ASID as the parent, which obviously
> corrupts parent's address space.
> 
> Further digging showed that:
> activate_mm() calls get_new_mmu_context() to get a new ASID, but at
> this time ASID field in entryHi is 1, and asid_cache(cpu) is 0x100 (it
> was just reset to ASID_FIRST_VERSION when the secondary TP was
> booting).
> So, get_new_mmu_context() increments the asid_cache(cpu) value to
> 0x101, and thus puts 0x01 into entryHi. The result - ASID field does
> not get changed as it was supposed to.
> 
> My solution was very simple - do not reset asid_cache(cpu) on TP warm
> restart. But I would welcome any comments because my understanding of
> the code is somewhat fuzzy.

Unfortunately I haven't yet found a BMIPS board or manual in my mailbox
so I can't really give a definitate answer.  But let me describe how
the MIPS34K handles it.

The 34K supports two TLB modes, shared and split TLB.  The VSMP kernel
uses the TLB in split mode in which half of the TLB entries is available
to each of the two threads aka VPEs.  So with a 64 entry TLB that's 32
entries per VPE then.  Each VPE (or rather TC but see further down) has
it's own c0_entryhi register, thus it's own ASID.  So no ASID collisions
possible, ever.  This is the same as on a conventional SMP system where
TLB and ASID number space are also per CPU.

The SMTC kernel model (usually) uses the shared model, that is all the 64
entries are now available to all threads and the ASID space is shared.
This means allocation of the same ASID to multiple TCs needs to be avoided.

It seems BMIPS falls into the latter class?

Need to think a little about potencial consequences of your suggested
patch.  It seems ok.  Kevin, what do you think?

  Ralf

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] MIPS: ASID conflict after CPU hotplug
  2010-11-22  3:41 ` Ralf Baechle
@ 2010-11-22 18:38   ` Maksim Rayskiy
  2010-11-22 21:34   ` Kevin D. Kissell
  1 sibling, 0 replies; 11+ messages in thread
From: Maksim Rayskiy @ 2010-11-22 18:38 UTC (permalink / raw)
  To: Ralf Baechle; +Cc: Kevin D. Kissell, linux-mips

On Sun, Nov 21, 2010 at 7:41 PM, Ralf Baechle <ralf@linux-mips.org> wrote:
> Unfortunately I haven't yet found a BMIPS board or manual in my mailbox
> so I can't really give a definitate answer.  But let me describe how
> the MIPS34K handles it.
>
> The 34K supports two TLB modes, shared and split TLB.  The VSMP kernel
> uses the TLB in split mode in which half of the TLB entries is available
> to each of the two threads aka VPEs.  So with a 64 entry TLB that's 32
> entries per VPE then.  Each VPE (or rather TC but see further down) has
> it's own c0_entryhi register, thus it's own ASID.  So no ASID collisions
> possible, ever.  This is the same as on a conventional SMP system where
> TLB and ASID number space are also per CPU.
>
> The SMTC kernel model (usually) uses the shared model, that is all the 64
> entries are now available to all threads and the ASID space is shared.
> This means allocation of the same ASID to multiple TCs needs to be avoided.
>
> It seems BMIPS falls into the latter class?
>

No, each thread has a separate TLB and all TLB-related registers are
also per thread. The conflict I have found was the same ASID for two
different processes on the second TP.

I still do not understand all the details, but what I saw was after
the second TP is brought back online init process runs on (migrates to
?) it with entryHi=1. If it tries to spawn another process, the child
gets the same ASID, because current asid_cache value is 0 (well it is
actually 0x100, but only lower 8 bits matter).

> Need to think a little about potencial consequences of your suggested
> patch.  It seems ok.  Kevin, what do you think?
>
>  Ralf
>

Thanks,
Maksim.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] MIPS: ASID conflict after CPU hotplug
  2010-11-22  3:41 ` Ralf Baechle
  2010-11-22 18:38   ` Maksim Rayskiy
@ 2010-11-22 21:34   ` Kevin D. Kissell
       [not found]     ` <AANLkTimuJLxG2KoibRxzcHkX3LoKsTWqJSF_e=ouFi+b@mail.gmail.com>
  1 sibling, 1 reply; 11+ messages in thread
From: Kevin D. Kissell @ 2010-11-22 21:34 UTC (permalink / raw)
  To: Ralf Baechle; +Cc: Maksim Rayskiy, linux-mips

On 11/21/10 19:41, Ralf Baechle wrote:
> ...
> Need to think a little about potencial consequences of your suggested
> patch.  It seems ok.  Kevin, what do you think?
>    
Since you ask, while I would imagine that Maksim's patch works fine for 
him, I'm not sure that it's really the right fix.  I never did succeed 
in getting CPU hotplugging working back in the 2.6.18 days, so I don't 
know as much about it as I'd like, but if per_cpu_trap_init() needs to 
be invoked on a hot plugin event, and if its behavior needs to be 
different , I'd really, really prefer to see that state propagated 
explicitly, rather than inferring it from whatever happens to be in 
cache/memory at cpu_data[cpu].asid_cache.  But beyond that, if the 
problem arises because setting cpu_data[cpu].asid_cache to a known 
initial state on a plugin event can conflict with the residual content 
of EntryHi, rather than creating a special case where we don't 
initialize the ASID cache, since we seem to be (re)initializing a lot of 
other privileged state, why aren't we also setting a known sane initial 
EntryHi value?   Wouldn't that be a cleaner fix?  (And I don't mean that 
as a rhetorical question - there may be very good reasons to let EntryHi 
values persist across hot unplug/plug events.  I just can't imagine them 
offhand over coffee.)

/K.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] MIPS: ASID conflict after CPU hotplug
       [not found]     ` <AANLkTimuJLxG2KoibRxzcHkX3LoKsTWqJSF_e=ouFi+b@mail.gmail.com>
@ 2010-11-25 15:57       ` Kevin D. Kissell
       [not found]         ` <AANLkTinUSjvjwHVJoRW-Fr75WDfheq3hSM_hEBMsEUXK@mail.gmail.com>
  0 siblings, 1 reply; 11+ messages in thread
From: Kevin D. Kissell @ 2010-11-25 15:57 UTC (permalink / raw)
  To: Maksim Rayskiy; +Cc: Ralf Baechle, linux-mips

This approach certainly makes per_cpu_trap_init() more readable and 
maintainable,  but it has the downside of creating state and 
infrastructure that have footprints elsewhere that add to global cruft 
and complexity.  Note that your patch, as written, wouldn't solve your 
problem, because it doesn't include the code that would actually set and 
clear the elements of your cpu_warm_boot[] array.  If we do need to pay 
attention to warm boot state elsewhere in the kernel (Does any other 
architecture?  That should be a clue...), then some bits in memory like 
that should perhaps be defined (though I'd wonder why it couldn't be a 
bit in some existing per-CPU state entity like cpu_data[]).  Otherwise, 
as I said earlier, the cleanest approach strikes me as one of resetting 
the value of EntryHi as well as the ASID cache when the hotplug event 
takes place.  The cleanest possible outcome would be if one could *move* 
the reset initialization of EntryHi from wherever it is now to 
per_cpu_trap_init(), so there would be *zero* net additional code, but 
it may be (it's Thanksgiving and I'm limited in time and internet 
access, so I can't really go look for you) that it's initialized as a 
side effect of something that happens repeatedly, such that actually 
*moving* it would be dangerous.  But if you have the time, try setting 
up EntryHi explicitly and unconditionally in per_cpu_trap_init() and see 
if it doesn't solve your initial problem.

Happy Holiday to you all,

/K.

On 11/24/10 7:03 PM, Maksim Rayskiy wrote:
> I certainly agree that it is a bad idea to look at the current value
> of asid_cache when figuring out if it is a warm or cold boot.
> I could not tell how the code ended up with this entryHi value after
> the hotplug. So, I can only address the simplest portion of issues you
> mentioned.
> How about we add a variable to tell warm restart from cold and
> preserve asid_cache across hotplug event. It is not much of an
> improvement over the original code, I must admit.
>
> Signed-off-by: Maksim Rayskiy<maksim.rayskiy@gmail.com>
>
> diff --git a/arch/mips/kernel/traps.c b/arch/mips/kernel/traps.c
> index d83f325..9116adb 100644
> --- a/arch/mips/kernel/traps.c
> +++ b/arch/mips/kernel/traps.c
> @@ -1489,6 +1489,8 @@ static int __init ulri_disable(char *s)
>   }
>   __setup("noulri", ulri_disable);
>
> +static int cpu_warm_boot[NR_CPUS];
> +
>   void __cpuinit per_cpu_trap_init(void)
>   {
>          unsigned int cpu = smp_processor_id();
> @@ -1577,7 +1579,9 @@ void __cpuinit per_cpu_trap_init(void)
>          }
>   #endif /* CONFIG_MIPS_MT_SMTC */
>
> -       cpu_data[cpu].asid_cache = ASID_FIRST_VERSION;
> +       if (!cpu_warm_boot[cpu])
> +               cpu_data[cpu].asid_cache = ASID_FIRST_VERSION;
> +       cpu_warm_boot[cpu] = 1;
>          TLBMISS_HANDLER_SETUP();
>
>          atomic_inc(&init_mm.mm_count);
>
>
> On Mon, Nov 22, 2010 at 1:34 PM, Kevin D. Kissell<kevink@paralogos.com>  wrote:
>> On 11/21/10 19:41, Ralf Baechle wrote:
>>> ...
>>> Need to think a little about potencial consequences of your suggested
>>> patch.  It seems ok.  Kevin, what do you think?
>>>
>> Since you ask, while I would imagine that Maksim's patch works fine for him,
>> I'm not sure that it's really the right fix.  I never did succeed in getting
>> CPU hotplugging working back in the 2.6.18 days, so I don't know as much
>> about it as I'd like, but if per_cpu_trap_init() needs to be invoked on a
>> hot plugin event, and if its behavior needs to be different , I'd really,
>> really prefer to see that state propagated explicitly, rather than inferring
>> it from whatever happens to be in cache/memory at cpu_data[cpu].asid_cache.
>>   But beyond that, if the problem arises because setting
>> cpu_data[cpu].asid_cache to a known initial state on a plugin event can
>> conflict with the residual content of EntryHi, rather than creating a
>> special case where we don't initialize the ASID cache, since we seem to be
>> (re)initializing a lot of other privileged state, why aren't we also setting
>> a known sane initial EntryHi value?   Wouldn't that be a cleaner fix?  (And
>> I don't mean that as a rhetorical question - there may be very good reasons
>> to let EntryHi values persist across hot unplug/plug events.  I just can't
>> imagine them offhand over coffee.)
>>
>> /K.
>>
>>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] MIPS: ASID conflict after CPU hotplug
       [not found]         ` <AANLkTinUSjvjwHVJoRW-Fr75WDfheq3hSM_hEBMsEUXK@mail.gmail.com>
@ 2010-11-30  2:53           ` Kevin D. Kissell
  2010-11-30  3:21             ` Maksim Rayskiy
  0 siblings, 1 reply; 11+ messages in thread
From: Kevin D. Kissell @ 2010-11-30  2:53 UTC (permalink / raw)
  To: Maksim Rayskiy; +Cc: Ralf Baechle, linux-mips

Having done surgery in the past to the ASID management code, this sounds 
like
a much more rational explanation of the observed problem.  Your proposed mod
sounds like it might work, but local_flush_tlb_mm() is implemented in 
terms of
drop_mmu_context(), which only does what you want if the CPU executing the
code is *not* one of the CPUs participating in the memory map.  Otherwise,
instead of clearing the ASID in the table, it allocates a new one.  I 
have a concern
that this may re-randomize things in a way that will solve your problem 
*most*
of the time, but not always.

Now that we have a better understanding of the failure, your initial notion
of *not* restarting the ASID sequence on a hotplug insertion doesn't seem
as crazy - it's certainly the zen "doing by doing nothing" way to go, 
without
the iterative overhead of walking the full process table.  But as we 
discussed,
it has the downside of requiring new state infrastructure for tracking 
hotplugs,
and we'd want to be sure that it's well behaved in the case where we have a
post-initial-boot hotplug event that brings a CPU online that has never been
initialized.  To take that tack, we'd need a per-CPU-slot bit which says 
"I have
a valid ASID sequence, thank you", which is checked in per_cpu_trap_init()
(or some other appropriate hook), and the ASID "cache" is initialized only
if it's needed, which *might* be on a hotplug.

/K.

On 11/29/10 17:35, Maksim Rayskiy wrote:
> Kevin,
>
> Thank you for your suggestions. I think I see where the conflict is
> coming from.
> If a usermode process during its lifetime migrates between cpus (TPs
> in my case) it has a non-zero value in cpu_context(cpu, mm) fields
> (asid in mips case) for all cpus it has ever run on. Moving from one
> cpu to another does not clear cpu_context. This is okay as long as all
> cpus are running. When cpu hotplug occurs, cpu_context for this
> threads running on cpu being stopped does not get cleared either. It
> might have been cleared if local_flush_tlb_mm() is called for these
> threads, but I do not see it happening.
> As a result, after secondary cpu is brought online, some processes
> already have stale asid values stored in their mm structures. One of
> these processes in my case happens to be 'init' with asid=0x101, which
> causes a conflict because next asid value selected by
> get_new_mmu_context() is ASID_FIRST_VERSION+ASID_INC=0x101 as well.
> Basically, at this point there is a discrepancy between global
> (per-cpu) asid_cache value and some individual thread asid values
> stored in mm->context.asid[cpu] fields.
> Just adding a simple code to call drop_mmu_context() on every thread
> which has ever run on the cpu getting offlined solves my problem. It
> does not require distinguishing warm boot from cold as before.
> Something like (called from hotplugged cpu)
>
> void local_flush_tlb_all_mm(void)
> {
> 	struct task_struct *p;
> 	for_each_process(p)
> 		if (p->mm)
> 			local_flush_tlb_mm(p->mm);
> }
>
> I can generate a patch if you think it is a more viable solution.
>
> Regards,
> Maksim.
>
> On Thu, Nov 25, 2010 at 7:57 AM, Kevin D. Kissell<kevink@paralogos.com>  wrote:
>    
>> This approach certainly makes per_cpu_trap_init() more readable and
>> maintainable,  but it has the downside of creating state and infrastructure
>> that have footprints elsewhere that add to global cruft and complexity.
>>   Note that your patch, as written, wouldn't solve your problem, because it
>> doesn't include the code that would actually set and clear the elements of
>> your cpu_warm_boot[] array.  If we do need to pay attention to warm boot
>> state elsewhere in the kernel (Does any other architecture?  That should be
>> a clue...), then some bits in memory like that should perhaps be defined
>> (though I'd wonder why it couldn't be a bit in some existing per-CPU state
>> entity like cpu_data[]).  Otherwise, as I said earlier, the cleanest
>> approach strikes me as one of resetting the value of EntryHi as well as the
>> ASID cache when the hotplug event takes place.  The cleanest possible
>> outcome would be if one could *move* the reset initialization of EntryHi
>> from wherever it is now to per_cpu_trap_init(), so there would be *zero* net
>> additional code, but it may be (it's Thanksgiving and I'm limited in time
>> and internet access, so I can't really go look for you) that it's
>> initialized as a side effect of something that happens repeatedly, such that
>> actually *moving* it would be dangerous.  But if you have the time, try
>> setting up EntryHi explicitly and unconditionally in per_cpu_trap_init() and
>> see if it doesn't solve your initial problem.
>>
>> Happy Holiday to you all,
>>
>> /K.
>>
>> On 11/24/10 7:03 PM, Maksim Rayskiy wrote:
>>      
>>> I certainly agree that it is a bad idea to look at the current value
>>> of asid_cache when figuring out if it is a warm or cold boot.
>>> I could not tell how the code ended up with this entryHi value after
>>> the hotplug. So, I can only address the simplest portion of issues you
>>> mentioned.
>>> How about we add a variable to tell warm restart from cold and
>>> preserve asid_cache across hotplug event. It is not much of an
>>> improvement over the original code, I must admit.
>>>
>>> Signed-off-by: Maksim Rayskiy<maksim.rayskiy@gmail.com>
>>>
>>> diff --git a/arch/mips/kernel/traps.c b/arch/mips/kernel/traps.c
>>> index d83f325..9116adb 100644
>>> --- a/arch/mips/kernel/traps.c
>>> +++ b/arch/mips/kernel/traps.c
>>> @@ -1489,6 +1489,8 @@ static int __init ulri_disable(char *s)
>>>   }
>>>   __setup("noulri", ulri_disable);
>>>
>>> +static int cpu_warm_boot[NR_CPUS];
>>> +
>>>   void __cpuinit per_cpu_trap_init(void)
>>>   {
>>>          unsigned int cpu = smp_processor_id();
>>> @@ -1577,7 +1579,9 @@ void __cpuinit per_cpu_trap_init(void)
>>>          }
>>>   #endif /* CONFIG_MIPS_MT_SMTC */
>>>
>>> -       cpu_data[cpu].asid_cache = ASID_FIRST_VERSION;
>>> +       if (!cpu_warm_boot[cpu])
>>> +               cpu_data[cpu].asid_cache = ASID_FIRST_VERSION;
>>> +       cpu_warm_boot[cpu] = 1;
>>>          TLBMISS_HANDLER_SETUP();
>>>
>>>          atomic_inc(&init_mm.mm_count);
>>>
>>>
>>> On Mon, Nov 22, 2010 at 1:34 PM, Kevin D. Kissell<kevink@paralogos.com>
>>>   wrote:
>>>        
>>>> On 11/21/10 19:41, Ralf Baechle wrote:
>>>>          
>>>>> ...
>>>>> Need to think a little about potencial consequences of your suggested
>>>>> patch.  It seems ok.  Kevin, what do you think?
>>>>>
>>>>>            
>>>> Since you ask, while I would imagine that Maksim's patch works fine for
>>>> him,
>>>> I'm not sure that it's really the right fix.  I never did succeed in
>>>> getting
>>>> CPU hotplugging working back in the 2.6.18 days, so I don't know as much
>>>> about it as I'd like, but if per_cpu_trap_init() needs to be invoked on a
>>>> hot plugin event, and if its behavior needs to be different , I'd really,
>>>> really prefer to see that state propagated explicitly, rather than
>>>> inferring
>>>> it from whatever happens to be in cache/memory at
>>>> cpu_data[cpu].asid_cache.
>>>>   But beyond that, if the problem arises because setting
>>>> cpu_data[cpu].asid_cache to a known initial state on a plugin event can
>>>> conflict with the residual content of EntryHi, rather than creating a
>>>> special case where we don't initialize the ASID cache, since we seem to
>>>> be
>>>> (re)initializing a lot of other privileged state, why aren't we also
>>>> setting
>>>> a known sane initial EntryHi value?   Wouldn't that be a cleaner fix?
>>>>   (And
>>>> I don't mean that as a rhetorical question - there may be very good
>>>> reasons
>>>> to let EntryHi values persist across hot unplug/plug events.  I just
>>>> can't
>>>> imagine them offhand over coffee.)
>>>>
>>>> /K.
>>>>
>>>>
>>>>          
>>
>>      

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] MIPS: ASID conflict after CPU hotplug
  2010-11-30  2:53           ` Kevin D. Kissell
@ 2010-11-30  3:21             ` Maksim Rayskiy
  2010-11-30 10:05               ` Kevin D. Kissell
  0 siblings, 1 reply; 11+ messages in thread
From: Maksim Rayskiy @ 2010-11-30  3:21 UTC (permalink / raw)
  To: Kevin D. Kissell; +Cc: Ralf Baechle, linux-mips

On Mon, Nov 29, 2010 at 6:53 PM, Kevin D. Kissell <kevink@paralogos.com> wrote:
> Having done surgery in the past to the ASID management code, this sounds
> like
> a much more rational explanation of the observed problem.  Your proposed mod
> sounds like it might work, but local_flush_tlb_mm() is implemented in terms
> of
> drop_mmu_context(), which only does what you want if the CPU executing the
> code is *not* one of the CPUs participating in the memory map.  Otherwise,
> instead of clearing the ASID in the table, it allocates a new one.  I have a
> concern
> that this may re-randomize things in a way that will solve your problem
> *most*
> of the time, but not always.
>

Actually, if you call this function late enough, specifically when
cpu_online(cpu) is 0, it does exactly what I want from it - that is
clears ASID in the context.
I am calling it from play_dead() which is platform specific, but there
might be a place for it in platform-independent code as well.
Another option would be not to use drop_mmu_context() but rather clear
the context directly, since we know exactly what we want to do at this
point.

> Now that we have a better understanding of the failure, your initial notion
> of *not* restarting the ASID sequence on a hotplug insertion doesn't seem
> as crazy - it's certainly the zen "doing by doing nothing" way to go,
> without
> the iterative overhead of walking the full process table.  But as we
> discussed,
> it has the downside of requiring new state infrastructure for tracking
> hotplugs,
> and we'd want to be sure that it's well behaved in the case where we have a
> post-initial-boot hotplug event that brings a CPU online that has never been
> initialized.  To take that tack, we'd need a per-CPU-slot bit which says "I
> have
> a valid ASID sequence, thank you", which is checked in per_cpu_trap_init()
> (or some other appropriate hook), and the ASID "cache" is initialized only
> if it's needed, which *might* be on a hotplug.

You are talking about adding this bit to cpuinfo_mips, correct?

Maksim.

>
> /K.
>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] MIPS: ASID conflict after CPU hotplug
  2010-11-30  3:21             ` Maksim Rayskiy
@ 2010-11-30 10:05               ` Kevin D. Kissell
  2010-11-30 19:49                 ` Maksim Rayskiy
  0 siblings, 1 reply; 11+ messages in thread
From: Kevin D. Kissell @ 2010-11-30 10:05 UTC (permalink / raw)
  To: Maksim Rayskiy; +Cc: Ralf Baechle, linux-mips

On 11/29/2010 7:21 PM, Maksim Rayskiy wrote:
> On Mon, Nov 29, 2010 at 6:53 PM, Kevin D. Kissell <kevink@paralogos.com> wrote:
>> Having done surgery in the past to the ASID management code, this sounds
>> like
>> a much more rational explanation of the observed problem.  Your proposed mod
>> sounds like it might work, but local_flush_tlb_mm() is implemented in terms
>> of
>> drop_mmu_context(), which only does what you want if the CPU executing the
>> code is *not* one of the CPUs participating in the memory map.  Otherwise,
>> instead of clearing the ASID in the table, it allocates a new one.  I have a
>> concern
>> that this may re-randomize things in a way that will solve your problem
>> *most*
>> of the time, but not always.
>>
> Actually, if you call this function late enough, specifically when
> cpu_online(cpu) is 0, it does exactly what I want from it - that is
> clears ASID in the context.

It does exactly what you want when cpumask_test_cpu(cpu, mm_cpumask(mm))
is false, but I don't think that will necessarily be the case when
cpu_online(cpu) is 0.
It's keying off the CPU mask in the mm struct, not the global online
mask.  If those
are actually guaranteed to be in sync, then your hack would be fine.  
I'd just beseech
you to make sure you comment the invocation of drop_mmu_context() is
commented
to the effect that "This needs to be done after the CPU is offline for
the purposes
of a cpumask check on the mm".
> I am calling it from play_dead() which is platform specific, but there
> might be a place for it in platform-independent code as well.
> Another option would be not to use drop_mmu_context() but rather clear
> the context directly, since we know exactly what we want to do at this
> point.
>
>> Now that we have a better understanding of the failure, your initial notion
>> of *not* restarting the ASID sequence on a hotplug insertion doesn't seem
>> as crazy - it's certainly the zen "doing by doing nothing" way to go,
>> without
>> the iterative overhead of walking the full process table.  But as we
>> discussed,
>> it has the downside of requiring new state infrastructure for tracking
>> hotplugs,
>> and we'd want to be sure that it's well behaved in the case where we have a
>> post-initial-boot hotplug event that brings a CPU online that has never been
>> initialized.  To take that tack, we'd need a per-CPU-slot bit which says "I
>> have
>> a valid ASID sequence, thank you", which is checked in per_cpu_trap_init()
>> (or some other appropriate hook), and the ASID "cache" is initialized only
>> if it's needed, which *might* be on a hotplug.
> You are talking about adding this bit to cpuinfo_mips, correct?

That would be a logical place to hang it, though I'm not sure if we keep
volatile data like that in the array these days.  It might need to be
lock protected.

            Regards,

            Kevin K.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] MIPS: ASID conflict after CPU hotplug
  2010-11-30 10:05               ` Kevin D. Kissell
@ 2010-11-30 19:49                 ` Maksim Rayskiy
  2010-12-01 11:51                   ` Sergei Shtylyov
  0 siblings, 1 reply; 11+ messages in thread
From: Maksim Rayskiy @ 2010-11-30 19:49 UTC (permalink / raw)
  To: Kevin D. Kissell; +Cc: Ralf Baechle, linux-mips



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] MIPS: ASID conflict after CPU hotplug
  2010-11-30 19:49                 ` Maksim Rayskiy
@ 2010-12-01 11:51                   ` Sergei Shtylyov
  0 siblings, 0 replies; 11+ messages in thread
From: Sergei Shtylyov @ 2010-12-01 11:51 UTC (permalink / raw)
  To: Maksim Rayskiy; +Cc: Kevin D. Kissell, Ralf Baechle, linux-mips

Hello.

On 30-11-2010 22:49, Maksim Rayskiy wrote:

> From 9a03661a40407e14ee75295f5541f371f0a7cdda Mon Sep 17 00:00:00 2001
> From: Maksim Rayskiy<maksim.rayskiy@gmail.com>
> Date: Tue, 30 Nov 2010 11:34:31 -0800
> Subject: [PATCH] MIPS: Added local_flush_tlb_all_mm to clear all mm
> contexts on calling cpu

> When hotplug removing a cpu, all mm context TLB entries must be cleared
> to avoid ASID conflict when cpu is restarted.
> New functions local_flush_tlb_all_mm() and all-cpu version
> flush_tlb_all_mm() are added.
> To function properly, local_flush_tlb_all_mm() must be called when
> mm_cpumask for all
> mm context on given cpu is cleared.

> Signed-off-by: Maksim Rayskiy<maksim.rayskiy@gmail.com>
[...]

> diff --git a/arch/mips/mm/tlb-r4k.c b/arch/mips/mm/tlb-r4k.c
> index c618eed..5c03218 100644
> --- a/arch/mips/mm/tlb-r4k.c
> +++ b/arch/mips/mm/tlb-r4k.c
> @@ -66,6 +66,18 @@ extern void build_tlb_refill_handler(void);
>
>   #endif
>
> +/* This function will clear all mm contexts on calling cpu
> + * To produce desired effect it must be called
> + * when mm_cpumask for all mm contexts is cleared
> + */
> +void local_flush_tlb_all_mm(void)
> +{
> +	struct task_struct *p;

    An empty line wouldn't hurt here...

> +	for_each_process(p)
> +		if (p->mm)
> +			local_flush_tlb_mm(p->mm);
> +}
> +

WBR, Sergei

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] MIPS: ASID conflict after CPU hotplug
  2010-11-17 18:49 [PATCH] MIPS: ASID conflict after CPU hotplug Maksim Rayskiy
  2010-11-22  3:41 ` Ralf Baechle
@ 2011-11-10 13:09 ` Ralf Baechle
  1 sibling, 0 replies; 11+ messages in thread
From: Ralf Baechle @ 2011-11-10 13:09 UTC (permalink / raw)
  To: Maksim Rayskiy; +Cc: linux-mips, Kevin Cernekee

On Wed, Nov 17, 2010 at 10:49:09AM -0800, Maksim Rayskiy wrote:

Applied, after the recent discussion.

  Ralf

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2011-11-10 13:09 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-11-17 18:49 [PATCH] MIPS: ASID conflict after CPU hotplug Maksim Rayskiy
2010-11-22  3:41 ` Ralf Baechle
2010-11-22 18:38   ` Maksim Rayskiy
2010-11-22 21:34   ` Kevin D. Kissell
     [not found]     ` <AANLkTimuJLxG2KoibRxzcHkX3LoKsTWqJSF_e=ouFi+b@mail.gmail.com>
2010-11-25 15:57       ` Kevin D. Kissell
     [not found]         ` <AANLkTinUSjvjwHVJoRW-Fr75WDfheq3hSM_hEBMsEUXK@mail.gmail.com>
2010-11-30  2:53           ` Kevin D. Kissell
2010-11-30  3:21             ` Maksim Rayskiy
2010-11-30 10:05               ` Kevin D. Kissell
2010-11-30 19:49                 ` Maksim Rayskiy
2010-12-01 11:51                   ` Sergei Shtylyov
2011-11-10 13:09 ` Ralf Baechle

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.