From: gdavis@mvista.com (George G. Davis)
To: linux-arm-kernel@lists.infradead.org
Subject: parallel load of modules on an ARM multicore
Date: Thu, 6 Oct 2011 00:29:28 -0400
Message-ID: <05942148-7AFB-4755-A22F-355E0360B098@mvista.com>
In-Reply-To: <CAHkRjk60dZX45DNiHoSd578NVv2G2E+95cfEk3iiUfKS1906HQ@mail.gmail.com>

Hello Catalin,

Ugh, sorry, I've been having problems with fetchmail/POP and did not see your
reply until just now, after finally logging in via IMAP.  : /

On Sep 22, 2011, at 4:52 AM, Catalin Marinas wrote:

> Hi George,
> 
> On 22 September 2011 08:29, George G. Davis <gdavis@mvista.com> wrote:
>> On Mon, Jun 20, 2011 at 03:43:27PM +0200, EXTERNAL Waechtler Peter (Fa. TCP, CM-AI/PJ-CF31) wrote:
>>> I'm getting unexpected results from loading several modules - some
>>> of them in parallel - on an ARM11 MPcore system.
> ...
>> In case anyone missed the subtlety, this report was for an ARM11 MPCore system
>> with CONFIG_PREEMPT enabled.  I've also been looking into this and various other
>> memory corruption issues on ARM11 MPCore with CONFIG_PREEMPT enabled and have
>> come to the conclusion that CONFIG_PREEMPT is broken on ARM11 MPCore.
>> 
>> I added the following instrumentation in 3.1.0-rc4ish to watch for
>> process migration in a few places of interest:
> ...
>> Now with sufficient system stress, I get the following recurring problems
>> (it's a 3-core system : ):
>> 
>> load_module:2858: cpu was 0 but is now 1, memory corruption is possible
>> load_module:2858: cpu was 0 but is now 2, memory corruption is possible
>> load_module:2858: cpu was 1 but is now 0, memory corruption is possible
>> load_module:2858: cpu was 1 but is now 2, memory corruption is possible
>> load_module:2858: cpu was 2 but is now 0, memory corruption is possible
>> load_module:2858: cpu was 2 but is now 1, memory corruption is possible
>> pte_alloc_one:100: cpu was 0 but is now 1, memory corruption is possible
>> pte_alloc_one:100: cpu was 0 but is now 2, memory corruption is possible
>> pte_alloc_one:100: cpu was 1 but is now 0, memory corruption is possible
>> pte_alloc_one:100: cpu was 1 but is now 2, memory corruption is possible
>> pte_alloc_one:100: cpu was 2 but is now 0, memory corruption is possible
>> pte_alloc_one:100: cpu was 2 but is now 1, memory corruption is possible
>> pte_alloc_one_kernel:74: cpu was 2 but is now 1, memory corruption is possible
>> 
>> With sufficient stress and extended run time, the system will eventually
>> hang or oops with nonsensical oops traces - machine state does not
>> make sense relative to the code executing at the time of the oops.
> 
> I think your analysis is valid and these places are not safe with
> CONFIG_PREEMPT enabled.

Alas, the stress test stability problems persist even with CONFIG_PREEMPT off.
Perhaps the windows are smaller, but they still exist.
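
The instrumentation itself is elided in the quote above, so the following is
only a minimal sketch of the before/after check that the quoted warnings
imply, not the actual patch.  The names migration_probe, probe_begin and
probe_end are hypothetical.  The caller supplies the current CPU number; in
kernel code that could be raw_smp_processor_id(), which is an assumption
about how the original sampled it.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical record of where and on which CPU a critical section began. */
struct migration_probe {
	const char *func;
	int line;
	int cpu;
};

/* Remember the call site and the CPU sampled on entry. */
static struct migration_probe probe_begin(const char *func, int line, int cpu_now)
{
	struct migration_probe p = { func, line, cpu_now };

	assert(func != NULL);
	return p;
}

/*
 * Compare the CPU sampled on exit with the one recorded on entry.
 * Prints a warning in the format quoted above and returns 1 if the
 * task moved, otherwise returns 0.
 */
static int probe_end(const struct migration_probe *p, int cpu_now)
{
	if (p->cpu == cpu_now)
		return 0;
	fprintf(stderr,
		"%s:%d: cpu was %d but is now %d, memory corruption is possible\n",
		p->func, p->line, p->cpu, cpu_now);
	return 1;
}
```

Bracketing a section with probe_begin() and probe_end() only reports that a
migration happened somewhere inside it; it does not prevent the migration.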

>> The interesting point here is that each of the above contain critical
>> sections in which ARM11 MPCore memory is inconsistent, i.e. cache on
>> CPU A contains modified entries but then migration occurs and the
>> cache is flushed on CPU B yet those cache ops called in the above
>> cases do not implement ARM11 MPCore RWFO workarounds.
> 
> I agree, my follow-up patch to implement lazy cache flushing on
> ARM11MPCore was meant for other uses (like drivers not calling
> flush_dcache_page), I never had PREEMPT in mind.
> 
>> Furthermore,
>> the current ARM11 MPCore RWFO workarounds for DMA et al are unsafe
>> as well for the CONFIG_PREEMPT case because, again, process migration
>> can occur during DMA cache maintenance operations in between RWFO and
>> cache op instructions resulting in memory inconsistencies for the
>> DMA case - a very narrow but real window.
> 
> Yes, that's correct.
> 
>> So what's the recommendation, don't use CONFIG_PREEMPT on ARM11 MPCore?
>> 
>> Are there any known fixes for CONFIG_PREEMPT on ARM11 MPCore if it
>> is indeed broken as it appears?
> 
> The scenarios you have described look valid to me. I think for now we
> can say that ARM11MPCore and PREEMPT don't go well together.

But it is unreliable even for the !PREEMPT case based on my stress testing,
even with your lazy cache flushing workaround applied.  : /

> This can
> be fixed though by making sure that cache maintenance places with the
> RWFO trick have the preemption disabled. But the RWFO has some
> performance impact as well, so I would only use it where absolutely
> necessary. In this case, I would just disable PREEMPT.

I'll post a series of RFC patches which address the "low hanging fruit".  I'm still
working on the harder nuts which of course have performance trade-offs
between RWFO vs. broadcast cache ops to consider...
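
For reference, the shape of the fix Catalin suggests above (keep preemption
disabled from the RWFO access through the cache op, so both run on the same
CPU) can be modelled roughly as below.  This is a user-space sketch, not
kernel code: preempt_disable() and preempt_enable() here are stand-ins that
only count nesting depth, rwfo_touch(), local_cache_op() and
cache_op_no_migrate() are hypothetical names, and the 32-byte line size is an
assumption.

```c
#include <assert.h>
#include <stddef.h>

#define CACHE_LINE 32	/* assumed cache line size for this sketch */

/* Stand-ins for the kernel primitives: they only track nesting depth. */
static int preempt_depth;

static void preempt_disable(void)
{
	preempt_depth++;
}

static void preempt_enable(void)
{
	assert(preempt_depth > 0);
	preempt_depth--;
}

/*
 * Model of the RWFO step: read one byte per line and write the same
 * value back so the executing CPU takes ownership of the line.  The
 * data is left unchanged.
 */
static void rwfo_touch(volatile unsigned char *buf, size_t len)
{
	for (size_t i = 0; i < len; i += CACHE_LINE)
		buf[i] = buf[i];
}

/* Placeholder for the CPU-local cache maintenance op; a no-op here. */
static void local_cache_op(volatile unsigned char *buf, size_t len)
{
	(void)buf;
	(void)len;
}

/*
 * Run the RWFO step and the cache op as one non-preemptible section.
 * Returns the nesting depth observed inside the section.
 */
static int cache_op_no_migrate(volatile unsigned char *buf, size_t len)
{
	int depth;

	preempt_disable();
	rwfo_touch(buf, len);
	depth = preempt_depth;
	local_cache_op(buf, len);
	preempt_enable();
	return depth;
}
```

The point of the pairing is that nothing between the two calls can be
rescheduled onto another core; the cost is a longer non-preemptible section
for large ranges, which is part of the performance trade-off mentioned above.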

Thanks, and apologies again for the lack of a follow-up reply on my part.  I blame
my fetchmail/POP setup, as I'm still getting at least some LAKML messages, just
not all.  : /

--
Regards,
George

> 
> -- 
> Catalin
