From: Lee Jones <lee.jones@linaro.org>
To: Russell King - ARM Linux <linux@arm.linux.org.uk>
Cc: Arnd Bergmann <arnd@arndb.de>,
	linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, ola.o.lilja@stericsson.com,
	alsa-devel@alsa-project.org, linus.walleij@stericsson.com,
	broonie@opensource.wolfsonmicro.com, olalilja@yahoo.se,
	STEricsson_nomadik_linux@list.st.com, lrg@ti.com
Subject: Re: [PATCH 5/6] ARM: ux500: Enable HIGHMEM on all mop500 platforms
Date: Wed, 01 Aug 2012 09:48:25 +0100
Message-ID: <5018ED59.2020205@linaro.org>
In-Reply-To: <20120801084127.GT6802@n2100.arm.linux.org.uk>

On 01/08/12 09:41, Russell King - ARM Linux wrote:
> On Wed, Aug 01, 2012 at 08:56:14AM +0100, Lee Jones wrote:
>> On 31/07/12 23:01, Russell King - ARM Linux wrote:
>>> On Tue, Jul 31, 2012 at 08:50:02PM +0000, Arnd Bergmann wrote:
>>>> On Tuesday 31 July 2012, Russell King - ARM Linux wrote:
>>>>> I still fail to see how not having highmem enabled would ever cause memory
>>>>> corruption errors (unless something is dealing with memory in a very, very
>>>>> wrong way, i.e. not using one of the reservation or memory allocation
>>>>> methods provided by the kernel).
>>>>
>>>> The problem is that all users of ux500 systems pass a command line like
>>>>
>>>> vmalloc=256M mem=128M@0 mali.mali_mem=32M@128M hwmem=168M@160M mem=48M@328M mem_issw=1M@383M mem=640M@384M
>>>>
>>>> This is of course totally bogus and should not be done. If I understand
>>>> Lee correctly, one of the issues resulting from passing a command
>>>> line like this without enabling highmem is memory corruption.
>>>
>>> But the question is _why_ does that corruption happen.
>>>
>>> From the above, we will end up with the kernel getting:
>>>
>>> 0x00000000 - 0x07ffffff (128M @ 0)
>>> 0x14800000 - 0x177fffff (48M  @ 328M)
>>> 0x18000000 - 0x3fffffff (640M @ 384M)
>>>
>>> with:
>>>
>>> 0x08000000 - 0x09ffffff used for mali
>>> 0x0a000000 - 0x147fffff used for hwmem
>>> 0x17f00000 - 0x17ffffff used for mem_issw
>>>
>>> Now, with highmem disabled, the kernel should still map exactly the
>>> regions: 0x00000000 - 0x07ffffff, 0x14800000 - 0x177fffff, into the
>>> direct mapped region, and truncate the 0x18000000 - 0x3fffffff
>>> region appropriately, reducing the amount of memory available such
>>> that it won't overlap the vmalloc area (which you've specified to be
>>> a minimum of 256M.)
>>>
>>> This should _NOT_ cause any memory corruption.
>>>
>>> So, come on guys.  Debugging is *mandatory* for this kind of problem.
>>> Papering over it is obscene.
>>
>> Actually I didn't go any further with it, as I changed to another
>> identical piece of hardware and couldn't reproduce the issue.
>>
>> FYI, here's the boot log from the broken board:
>>
>> http://paste.ubuntu.com/1102017/
>
> Well, the good thing is this:
>
>     Truncating RAM at 18000000-3fffffff to -2c3fffff (vmalloc region overlap).
>
> which means the RAM was properly truncated before it is passed to
> memblock, etc.
>
> That oops dump looks very much like an ASoC problem, where
> dapm_widget_power_check() recurses into dapm_supply_check_power()
> which then recurses back into dapm_widget_power_check(), and it
> eventually overflows the kernel stack, corrupting the thread_info
> and the pages below.
>
> Given the address of the stack pointer (ebc480a8) I don't think
> we can be too sure where it was supposed to be, and where the top
> of stack should have been, so we don't know how many pages have
> been stomped on and corrupted.
>
> Stopping that recursion is the first thing that needs to be done
> so that the cause of it can then be properly debugged without the
> kernel itself corrupting memory below the kernel stack.

Those were my thoughts.

Here was my cry for help: https://lkml.org/lkml/2012/7/23/181

-- 
Lee Jones
Linaro ST-Ericsson Landing Team Lead
Linaro.org │ Open source software for ARM SoCs
Follow Linaro: Facebook | Twitter | Blog
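
The address ranges discussed in the quoted text follow directly from the size@offset arguments on the command line. As a rough cross-check, here is a minimal Python sketch that redoes that arithmetic. The function name parse_regions and the regular expression are assumptions made for illustration; this is not how the kernel parses its command line, and it only handles sizes and offsets given in whole megabytes.

```python
import re

MIB = 1 << 20  # one megabyte, as used by the "M" suffix

def parse_regions(cmdline):
    """Return (name, start, end_inclusive) for each name=<size>M@<offset>[M] argument."""
    regions = []
    for token in cmdline.split():
        m = re.fullmatch(r"([\w.]+)=(\d+)M@(\d+)M?", token)
        if m is None:
            continue  # arguments without an @offset, e.g. vmalloc=256M, are skipped
        name = m.group(1)
        size = int(m.group(2)) * MIB
        offset = int(m.group(3)) * MIB
        regions.append((name, offset, offset + size - 1))
    return regions

CMDLINE = ("vmalloc=256M mem=128M@0 mali.mali_mem=32M@128M hwmem=168M@160M "
           "mem=48M@328M mem_issw=1M@383M mem=640M@384M")

for name, start, end in parse_regions(CMDLINE):
    print(f"{name:14s} 0x{start:08x} - 0x{end:08x}")
```

The three mem= entries give 0x00000000 - 0x07ffffff, 0x14800000 - 0x177fffff and 0x18000000 - 0x3fffffff. The 32M mali region works out to 0x08000000 - 0x09ffffff, which ends exactly where the hwmem region begins at 0x0a000000. The sketch does not model the vmalloc truncation reported in the boot log.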


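The failure mode described in the message, where two checks call each other until the stack is exhausted, can be illustrated outside the kernel. The following Python sketch is not ASoC code: power_check, the supplies mapping and the widget names are hypothetical, and the guard shown (tracking the widgets on the current path) is just one common way to stop such recursion so the underlying cycle can be reported instead of overflowing the stack.

```python
def power_check(widget, supplies, visiting=None):
    """Walk the dependencies of `widget`, refusing to re-enter a widget
    that is already on the current call path."""
    if visiting is None:
        visiting = set()
    if widget in visiting:
        # A widget reached itself through its own dependencies: stop here
        # rather than recursing until the stack runs out.
        raise RuntimeError(f"dependency cycle through {widget!r}")
    visiting.add(widget)
    try:
        for supply in supplies.get(widget, ()):
            power_check(supply, supplies, visiting)
    finally:
        visiting.discard(widget)  # leave the path clean for sibling branches
    return True

# Hypothetical graphs: the first is acyclic, the second loops back on itself.
acyclic = {"dac": ["supply"], "supply": []}
cyclic = {"dac": ["supply"], "supply": ["dac"]}

print(power_check("dac", acyclic))
try:
    power_check("dac", cyclic)
except RuntimeError as err:
    print("stopped:", err)
```

With a guard like this, the cyclic case fails after two levels with a message naming the widget involved, which is the kind of information needed to debug why the cycle exists in the first place.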