* ARM64 CONFIG_ZONE_DMA for 32-bit devices
@ 2017-02-28 10:34 Kashyap Desai
  2017-02-28 12:42 ` Robin Murphy
  0 siblings, 1 reply; 6+ messages in thread
From: Kashyap Desai @ 2017-02-28 10:34 UTC (permalink / raw)
  To: linux-arm-kernel

Hi -

I have been reading the discussions linked below. My issue is not the same,
but they helped me understand a few things about the ARM64 SWIOTLB
interface. Any input that helps resolve or understand the issue would be
appreciated.

https://patchwork.codeaurora.org/patch/143833/
https://patchwork.kernel.org/patch/9495893/

The current problem statement is:
"Loading the kdump kernel from memory above 4GB does not work on the ARM64
platform, because the <megaraid_sas> driver requires certain DMA buffers
from the memory range below 4GB."

I am looking for an alternative or workaround for the time being. The
long-term plan is to remove the 32-bit DMA mask limitation from the
<megaraid_sas> driver for SAS3.0 and later controllers.

1) I found the code below in arch/arm64/mm/init.c. The ARM64 kernel has
provision to support 32-bit DMA as well. I am not sure why CONFIG_ZONE_DMA
is a configurable option for ARM64.

        /* 4GB maximum for 32-bit only capable devices */
        if (IS_ENABLED(CONFIG_ZONE_DMA))
                arm64_dma_phys_limit = max_zone_dma_phys();
        else
                arm64_dma_phys_limit = PHYS_MASK + 1;
        dma_contiguous_reserve(arm64_dma_phys_limit);

I think one reason the kdump kernel can load from memory above 4GB is the
crashkernel=<>,high and crashkernel=0,low options. So I guess the ARM64
kdump kernel may have CONFIG_ZONE_DMA disabled, while the base kernel must
have CONFIG_ZONE_DMA enabled to support 32-bit only capable devices.

Still, this is not clear to me: if the kdump kernel works on ARM64 without
CONFIG_ZONE_DMA, then the base kernel could also work without
CONFIG_ZONE_DMA set. That makes CONFIG_ZONE_DMA look like a temporary
arrangement.

Is there any plan for ARM64 to remove CONFIG_ZONE_DMA and require all
devices to be 64-bit capable?


2.)

Typically, SWIOTLB uses a DMA buffer from the range below 4GB only. ARM64
is the only architecture whose low-memory limit is an arch-specific value
computed at run time (s390 defines a fixed constant). See below:

[root@ linux]# grep -R ARCH_LOW_ADDRESS_LIMIT arch/
arch/s390/include/asm/processor.h:#define ARCH_LOW_ADDRESS_LIMIT 0x7fffffffUL
arch/arm64/include/asm/processor.h:#define ARCH_LOW_ADDRESS_LIMIT (arm64_dma_phys_limit - 1)

Only on ARM64 is it possible to get a SWIOTLB DMA buffer above 4GB. See
the snippet below from the kdump (crash) kernel on ARM64.

[    0.000000] Zone ranges:
[    0.000000]   DMA      [mem 0x0000005fc0000000-0x0000005fffffffff]
[    0.000000]   Normal   empty

Only on an ARM64 machine can SWIOTLB place its 64MB buffer above 4GB, and
that is what causes the problem for the <megaraid_sas> driver.
The current megaraid_sas driver needs certain resources from memory below
4GB, which is why it requests a consistent DMA mask as follows:
pci_set_consistent_dma_mask(pdev, DMA_BIT_MASK(32)).
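
To make the mask constraint concrete, here is a minimal userspace sketch
(not kernel code) of what a 32-bit mask allows. dma_bit_mask() imitates the
kernel's DMA_BIT_MASK() macro, and fits_dma_mask() is a hypothetical helper
invented for this illustration; the addresses are taken from the zone
ranges reported in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's DMA_BIT_MASK(n): the low n bits set. */
static uint64_t dma_bit_mask(unsigned int n)
{
	assert(n >= 1 && n <= 64);
	return n == 64 ? ~0ULL : (1ULL << n) - 1;
}

/*
 * Hypothetical helper: true if the whole buffer [addr, addr + size) can be
 * addressed by a device that is limited to `mask`.
 */
static bool fits_dma_mask(uint64_t addr, uint64_t size, uint64_t mask)
{
	assert(size > 0);
	return addr + size - 1 <= mask;
}
```

Under these assumptions, a 64MB buffer at 0x5fc0000000 does not fit a
32-bit mask, while the same buffer at 0x01000000 does.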

If I do the same on x86_64, SWIOTLB initialization fails because no low
memory below 4GB is available to it. See the messages below from an x86_64
machine.

[    0.000000] Zone ranges:
[    0.000000]   DMA      [mem 0x0000000000001000-0x0000000000ffffff]
[    0.000000]   DMA32    [mem 0x0000000001000000-0x00000000ffffffff]
[    0.000000]   Normal   [mem 0x0000000100000000-0x000000407effffff]
[    0.000000] Movable zone start for each node
 ..
[    0.000000] Cannot allocate SWIOTLB buffer

The question is: "Can the ARM64 platform not allocate memory for the crash
kernel below 4GB?"

Thanks, Kashyap


* ARM64 CONFIG_ZONE_DMA for 32-bit devices
  2017-02-28 10:34 ARM64 CONFIG_ZONE_DMA for 32-bit devices Kashyap Desai
@ 2017-02-28 12:42 ` Robin Murphy
  2017-03-02 18:35   ` Catalin Marinas
  0 siblings, 1 reply; 6+ messages in thread
From: Robin Murphy @ 2017-02-28 12:42 UTC (permalink / raw)
  To: linux-arm-kernel

On 28/02/17 10:34, Kashyap Desai wrote:
> Hi -
> 
> I was reading below articles. Mine is not similar issue, but  I understand
> few things about ARM64 SWIOTLB interface from below discussion.
> Any input will be a great help to resolve/understand the issue.
> 
> https://patchwork.codeaurora.org/patch/143833/
> https://patchwork.kernel.org/patch/9495893/
> 
> Current problem statement is -
> "Trying to load kdump kernel from above 4GB memory does not work on ARM64
> platform as <megaraid_sas> driver require certain DMA buffer from below
> 4GB memory range."
> 
> Looking for alternative/workaround for time being. Long term plan is to
> remove limitation of <megaraid_sas> driver to remove 32 bit DMA mask for
> SAS3.0 onwards controller.
> 
> 1) I found below code @arch/arm64/mm/init.c . ARM64 kernel has provision
> to support 32-bit DMA as well.  I am not sure about why CONFIG_ZONE_DMA
> option is an configurable option for ARM64 ?
> 
>         /* 4GB maximum for 32-bit only capable devices */
>         if (IS_ENABLED(CONFIG_ZONE_DMA))
>                 arm64_dma_phys_limit = max_zone_dma_phys();
>         else
>                 arm64_dma_phys_limit = PHYS_MASK + 1;
>         dma_contiguous_reserve(arm64_dma_phys_limit);
> 
> One of the reason I think "kdump" kernel can load from above 4GB memory
> range provided crashkernel=<>, high and crashkernel=0, low option. So I
> guess ARM64 kdump kernel may have disabled CONFIG_ZONE_DMA option, but
> base kernel must have enabled CONFIG_ZONE_DMA option to support 32-bit
> only capable devices.

I believe it's more that ZONE_DMA goes a bit crazy when the available
RAM starts above 4GB. It's not technically possible to turn it off
without hacking Kconfig.

> Still it is not clear as kdump kernel works in ARM64 without
> CONFIG_ZONE_DMA, means base kernel can also work without CONFIG_ZONE_DMA
> set. It means looks like CONFIG_ZONE_DMA is just temporary arrangement.
> 
> Is there any plan that ARM64 will remove CONFIG_ZONE_DMA and require all
> devices to be 64-bit capable.

No. It is not realistic for the kernel to simply cease supporting nearly
all current hardware (and a large proportion of current future hardware)
any time soon.

> 2.)
> 
> Typically - SWIOTLB uses DMA buffer from below 4GB range only. ARM64 is
> the only architecture which support Low memory definition as per ARCH
> specified. See below
> 
> [root@ linux]# grep -R ARCH_LOW_ADDRESS_LIMIT arch/
> arch/s390/include/asm/processor.h:#define ARCH_LOW_ADDRESS_LIMIT
> 0x7fffffffUL
> arch/arm64/include/asm/processor.h:#define ARCH_LOW_ADDRESS_LIMIT
> (arm64_dma_phys_limit - 1)
> 
> For only ARM64, it is possible to get SWTBL DMA buffer above 4GB. See
> below snippet from crashed kernel on ARM64.
> 
> [    0.000000] Zone ranges:
> [    0.000000]   DMA      [mem 0x0000005fc0000000-0x0000005fffffffff]
> [    0.000000]   Normal   empty
> 
> SWIOTLB can map 64MB buffer from above 4GB only on ARM64 machine and that
> is causing problem for <megaraid_sas> driver.
> Current megaraid_sas driver wants certain resources from below 4GB memory
> and that is why it request consistent dma mask as below -
> pci_set_consistent_dma_mask(pdev, DMA_BIT_MASK(32)).
> 
> If I do the same on x86_64, SWTBL INIT will fail because there is no Low
> memory below 4GB. See below prints from x86_64 machine.
> 
> [    0.000000] Zone ranges:
> [    0.000000]   DMA      [mem 0x0000000000001000-0x0000000000ffffff]
> [    0.000000]   DMA32    [mem 0x0000000001000000-0x00000000ffffffff]
> [    0.000000]   Normal   [mem 0x0000000100000000-0x000000407effffff]
>   [    0.000000] Movable zone start for each node
>    ..
>    [    0.000000] Cannot allocate SWIOTLB buffer
> 
> Question is - "ARM64 platform can't allocate memory for crash kernel in
> below 4GB range ?"

If you want to use a device which requires 32-bit-addressable DMA
resources with your crash kernel, and that device isn't behind an IOMMU,
then don't load your crash kernel above 4GB. It's as simple as that,
because in general there's no other way around the issue. And if said
device doesn't actually need 32-bit-addressable resources, then yeah,
fix the dma_set_mask() calls in the driver.

That said, I think something is a bit wonky in max_zone_dma_phys() with
"It currently assumes that for memory starting above 4G, 32-bit devices
will use a DMA offset" - I think that assumption needs to be revisited
since, even disregarding cases like kdump, commonly available hardware
now exists for which that is not true (e.g. AMD Seattle). Catalin?

Robin.

> 
> Thanks, Kashyap
> 


* ARM64 CONFIG_ZONE_DMA for 32-bit devices
  2017-02-28 12:42 ` Robin Murphy
@ 2017-03-02 18:35   ` Catalin Marinas
  2017-03-06 13:00     ` Robin Murphy
  0 siblings, 1 reply; 6+ messages in thread
From: Catalin Marinas @ 2017-03-02 18:35 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Feb 28, 2017 at 12:42:51PM +0000, Robin Murphy wrote:
> On 28/02/17 10:34, Kashyap Desai wrote:
> > I was reading below articles. Mine is not similar issue, but  I understand
> > few things about ARM64 SWIOTLB interface from below discussion.
> > Any input will be a great help to resolve/understand the issue.
> > 
> > https://patchwork.codeaurora.org/patch/143833/
> > https://patchwork.kernel.org/patch/9495893/
> > 
> > Current problem statement is -
> > "Trying to load kdump kernel from above 4GB memory does not work on ARM64
> > platform as <megaraid_sas> driver require certain DMA buffer from below
> > 4GB memory range."
> > 
> > Looking for alternative/workaround for time being. Long term plan is to
> > remove limitation of <megaraid_sas> driver to remove 32 bit DMA mask for
> > SAS3.0 onwards controller.
> > 
> > 1) I found below code @arch/arm64/mm/init.c . ARM64 kernel has provision
> > to support 32-bit DMA as well.  I am not sure about why CONFIG_ZONE_DMA
> > option is an configurable option for ARM64 ?
> > 
> >         /* 4GB maximum for 32-bit only capable devices */
> >         if (IS_ENABLED(CONFIG_ZONE_DMA))
> >                 arm64_dma_phys_limit = max_zone_dma_phys();
> >         else
> >                 arm64_dma_phys_limit = PHYS_MASK + 1;
> >         dma_contiguous_reserve(arm64_dma_phys_limit);
> > 
> > One of the reason I think "kdump" kernel can load from above 4GB memory
> > range provided crashkernel=<>, high and crashkernel=0, low option. So I
> > guess ARM64 kdump kernel may have disabled CONFIG_ZONE_DMA option, but
> > base kernel must have enabled CONFIG_ZONE_DMA option to support 32-bit
> > only capable devices.
> 
> I believe it's more that ZONE_DMA goes a bit crazy when the available
> RAM starts above 4GB. It's not technically possible to turn it off
> without hacking Kconfig.

I think even if you hack Kconfig, the kernel may not build. We keep the
Kconfig option so that enum zone_type has ZONE_DMA defined.

Regarding the DMA zone selection, max_zone_dma_phys() is more of a hack
(needed on Seattle where all RAM is above 4GB). Basically, the first 4GB
of addresses (32 bits' worth) from the start of RAM are considered for
ZONE_DMA, even though the actual physical address of the start of RAM
would be well beyond 4GB. This assumes that 32-bit only devices have the
relevant dma_pfn_offset passed via DT (not sure what we do on ACPI).
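
As a rough illustration of that heuristic, here is a userspace model rather
than the kernel function itself. It assumes the limit is derived by keeping
the upper 32 bits of the first RAM address and adding 4GB, capped at the
end of RAM; zone_dma_limit(), ram_start and ram_end are names made up for
this sketch, and the RAM bounds used to check it are inferred from the zone
ranges reported in this thread.

```c
#include <assert.h>
#include <stdint.h>

#define SZ_4G (1ULL << 32)

/*
 * Model of the heuristic: keep the upper 32 bits of the first RAM address,
 * allow up to 4GB above that base, and cap the result at the end of RAM.
 * ram_start and ram_end (exclusive) stand in for the DRAM bounds.
 */
static uint64_t zone_dma_limit(uint64_t ram_start, uint64_t ram_end)
{
	assert(ram_start < ram_end);

	uint64_t base = ram_start & ~(SZ_4G - 1);
	uint64_t limit = base + SZ_4G;

	return limit < ram_end ? limit : ram_end;
}
```

With RAM assumed to span 0x5fc0000000 to 0x5fffffffff, the limit lands at
the end of RAM, so all of it becomes ZONE_DMA and ZONE_NORMAL is empty.
With RAM assumed to span 0x4000200000 to 0x43ffffffff, the limit is
0x4100000000, which splits the range into a DMA zone and a Normal zone.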

> > 2.)
> > 
> > Typically - SWIOTLB uses DMA buffer from below 4GB range only. ARM64 is
> > the only architecture which support Low memory definition as per ARCH
> > specified. See below
> > 
> > [root@ linux]# grep -R ARCH_LOW_ADDRESS_LIMIT arch/
> > arch/s390/include/asm/processor.h:#define ARCH_LOW_ADDRESS_LIMIT
> > 0x7fffffffUL
> > arch/arm64/include/asm/processor.h:#define ARCH_LOW_ADDRESS_LIMIT
> > (arm64_dma_phys_limit - 1)
> > 
> > For only ARM64, it is possible to get SWTBL DMA buffer above 4GB. See
> > below snippet from crashed kernel on ARM64.
> > 
> > [    0.000000] Zone ranges:
> > [    0.000000]   DMA      [mem 0x0000005fc0000000-0x0000005fffffffff]
> > [    0.000000]   Normal   empty

I guess that's because the kernel thinks 0x5fc0000000 is the start of all
RAM that is available and just assumes that ZONE_DMA would be covered by
the lower 32 bits of this (high) range.

The physical address in the 32-bit DMA context is rather irrelevant.
What you need is the actual DMA address that the device is seeing and
this is calculated by phys_to_dma (taking dma_pfn_offset into account).
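
A small userspace sketch of that translation, assuming 4KB pages and that
the bus address is the physical address minus the per-device offset
expressed in pages. struct fake_device and model_phys_to_dma() are
hypothetical names for this illustration, not the kernel's definitions.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12 /* assumes 4KB pages */

/* Hypothetical stand-in for the per-device DMA state. */
struct fake_device {
	uint64_t dma_pfn_offset; /* CPU-to-bus address offset, in pages */
};

/* Bus address that the device sees for a given CPU physical address. */
static uint64_t model_phys_to_dma(const struct fake_device *dev, uint64_t paddr)
{
	uint64_t offset = dev->dma_pfn_offset << PAGE_SHIFT;

	assert(paddr >= offset);
	return paddr - offset;
}
```

With an offset of 0x5f00000000 bytes, physical address 0x5fc0000000 is seen
by the device as 0xc0000000, which fits in 32 bits. With no offset (an
identity mapping), the device sees 0x5fc0000000 and a 32-bit mask cannot
cover it.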

> > SWIOTLB can map 64MB buffer from above 4GB only on ARM64 machine and that
> > is causing problem for <megaraid_sas> driver.
> > Current megaraid_sas driver wants certain resources from below 4GB memory
> > and that is why it request consistent dma mask as below -
> > pci_set_consistent_dma_mask(pdev, DMA_BIT_MASK(32)).
> > 
> > If I do the same on x86_64, SWTBL INIT will fail because there is no Low
> > memory below 4GB. See below prints from x86_64 machine.
> > 
> > [    0.000000] Zone ranges:
> > [    0.000000]   DMA      [mem 0x0000000000001000-0x0000000000ffffff]
> > [    0.000000]   DMA32    [mem 0x0000000001000000-0x00000000ffffffff]
> > [    0.000000]   Normal   [mem 0x0000000100000000-0x000000407effffff]
> >   [    0.000000] Movable zone start for each node
> >    ..
> >    [    0.000000] Cannot allocate SWIOTLB buffer
> > 
> > Question is - "ARM64 platform can't allocate memory for crash kernel in
> > below 4GB range ?"
> 
> If you want to use a device which requires 32-bit-addressable DMA
> resources with your crash kernel, and that device isn't behind an IOMMU,
> then don't load your crash kernel above 4GB. It's as simple as that,
> because in general there's no other way around the issue. And if said
> device doesn't actually need 32-bit-addressable resources, then yeah,
> fix the dma_set_mask() calls in the driver.

I agree. I don't think there is much we can do, other than parsing all
dma_pfn_offsets early on in DT and deciding whether the ZONE_DMA
heuristic actually helps (and, if not, printing some warning).

> That said, I think something is a bit wonky in max_zone_dma_phys() with
> "It currently assumes that for memory starting above 4G, 32-bit devices
> will use a DMA offset" - I think that assumption needs to be revisited
> since, even disregarding cases like kdump, commonly available hardware
> now exists for which that is not true (e.g. AMD Seattle). Catalin?

IIRC, I did this specifically for Seattle, though not sure whether it
was just a matter of failing memory allocations when ZONE_DMA was empty
rather than a device actually using it. That's the best we could do if
there actually is a device with the relevant dma_pfn_offset.

The alternative would be to put everything in ZONE_DMA if the RAM is
beyond 4GB but it doesn't help if we do have devices with a proper
dma_pfn_offset. Leaving ZONE_DMA empty probably has other implications
with failing allocations (you can fall back from ZONE_NORMAL to ZONE_DMA
but not the other way around).
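
The one-way fallback can be sketched as a toy model (this is not the page
allocator; the enum and pick_zone() are invented for illustration). A
request may be satisfied from its own zone or from any lower zone, never
from a higher one.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy zone list, ordered from most to least restrictive. */
enum model_zone { MZ_DMA, MZ_NORMAL, MZ_COUNT };

/*
 * Walk downwards from the highest zone the request may use.
 * Returns the zone that can satisfy it, or -1 if none has free pages.
 */
static int pick_zone(const bool has_free_pages[MZ_COUNT],
		     enum model_zone highest_allowed)
{
	assert(highest_allowed < MZ_COUNT);

	for (int z = highest_allowed; z >= MZ_DMA; z--) {
		if (has_free_pages[z])
			return z;
	}
	return -1;
}
```

In this model a normal request still succeeds when only the DMA zone has
pages, but a DMA-only request fails outright when the DMA zone is empty,
even if the normal zone has plenty of memory.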

-- 
Catalin


* ARM64 CONFIG_ZONE_DMA for 32-bit devices
  2017-03-02 18:35   ` Catalin Marinas
@ 2017-03-06 13:00     ` Robin Murphy
  2017-03-06 13:39       ` Catalin Marinas
  0 siblings, 1 reply; 6+ messages in thread
From: Robin Murphy @ 2017-03-06 13:00 UTC (permalink / raw)
  To: linux-arm-kernel

On 02/03/17 18:35, Catalin Marinas wrote:
> On Tue, Feb 28, 2017 at 12:42:51PM +0000, Robin Murphy wrote:
>> On 28/02/17 10:34, Kashyap Desai wrote:
>>> I was reading below articles. Mine is not similar issue, but  I understand
>>> few things about ARM64 SWIOTLB interface from below discussion.
>>> Any input will be a great help to resolve/understand the issue.
>>>
>>> https://patchwork.codeaurora.org/patch/143833/
>>> https://patchwork.kernel.org/patch/9495893/
>>>
>>> Current problem statement is -
>>> "Trying to load kdump kernel from above 4GB memory does not work on ARM64
>>> platform as <megaraid_sas> driver require certain DMA buffer from below
>>> 4GB memory range."
>>>
>>> Looking for alternative/workaround for time being. Long term plan is to
>>> remove limitation of <megaraid_sas> driver to remove 32 bit DMA mask for
>>> SAS3.0 onwards controller.
>>>
>>> 1) I found below code @arch/arm64/mm/init.c . ARM64 kernel has provision
>>> to support 32-bit DMA as well.  I am not sure about why CONFIG_ZONE_DMA
>>> option is an configurable option for ARM64 ?
>>>
>>>         /* 4GB maximum for 32-bit only capable devices */
>>>         if (IS_ENABLED(CONFIG_ZONE_DMA))
>>>                 arm64_dma_phys_limit = max_zone_dma_phys();
>>>         else
>>>                 arm64_dma_phys_limit = PHYS_MASK + 1;
>>>         dma_contiguous_reserve(arm64_dma_phys_limit);
>>>
>>> One of the reason I think "kdump" kernel can load from above 4GB memory
>>> range provided crashkernel=<>, high and crashkernel=0, low option. So I
>>> guess ARM64 kdump kernel may have disabled CONFIG_ZONE_DMA option, but
>>> base kernel must have enabled CONFIG_ZONE_DMA option to support 32-bit
>>> only capable devices.
>>
>> I believe it's more that ZONE_DMA goes a bit crazy when the available
>> RAM starts above 4GB. It's not technically possible to turn it off
>> without hacking Kconfig.
> 
> I think even if you hack Kconfig, the kernel may not build. We keep the
> Kconfig option so that enum zone_type has ZONE_DMA defined.
> 
> Regarding the DMA zone selection, max_zone_dma_phys() is more of a hack
> (needed on Seattle where all RAM is above 4GB). Basically the first
> 32-bit at the start of RAM are considered for ZONE_DMA, even though the
> actual physical address of the start of RAM would be well beyond 4GB.
> This assumes that 32-bit only devices have the relevant dma_pfn_offset
> passed via DT (not sure what we do on ACPI).

Except Seattle doesn't have dma_pfn_offsets :/

The DT has an identity-mapped "dma-ranges", and I've seen it
demonstrated that a PCI card whose driver sets a 32-bit mask simply has
all DMA API calls fail.

>>> 2.)
>>>
>>> Typically - SWIOTLB uses DMA buffer from below 4GB range only. ARM64 is
>>> the only architecture which support Low memory definition as per ARCH
>>> specified. See below
>>>
>>> [root@ linux]# grep -R ARCH_LOW_ADDRESS_LIMIT arch/
>>> arch/s390/include/asm/processor.h:#define ARCH_LOW_ADDRESS_LIMIT
>>> 0x7fffffffUL
>>> arch/arm64/include/asm/processor.h:#define ARCH_LOW_ADDRESS_LIMIT
>>> (arm64_dma_phys_limit - 1)
>>>
>>> For only ARM64, it is possible to get SWTBL DMA buffer above 4GB. See
>>> below snippet from crashed kernel on ARM64.
>>>
>>> [    0.000000] Zone ranges:
>>> [    0.000000]   DMA      [mem 0x0000005fc0000000-0x0000005fffffffff]
>>> [    0.000000]   Normal   empty
> 
> I guess that's because the kernel thinks 0x5fc0000000 is the start of all
> RAM that is available and just assumes that ZONE_DMA would be covered by
> the lower 32-bit of this (high) range.
> 
> The physical address in the 32-bit DMA context is rather irrelevant.
> What you need is the actual DMA address that the device is seeing and
> this is calculated by phys_to_dma (taking dma_pfn_offset into account).
> 
>>> SWIOTLB can map 64MB buffer from above 4GB only on ARM64 machine and that
>>> is causing problem for <megaraid_sas> driver.
>>> Current megaraid_sas driver wants certain resources from below 4GB memory
>>> and that is why it request consistent dma mask as below -
>>> pci_set_consistent_dma_mask(pdev, DMA_BIT_MASK(32)).
>>>
>>> If I do the same on x86_64, SWTBL INIT will fail because there is no Low
>>> memory below 4GB. See below prints from x86_64 machine.
>>>
>>> [    0.000000] Zone ranges:
>>> [    0.000000]   DMA      [mem 0x0000000000001000-0x0000000000ffffff]
>>> [    0.000000]   DMA32    [mem 0x0000000001000000-0x00000000ffffffff]
>>> [    0.000000]   Normal   [mem 0x0000000100000000-0x000000407effffff]
>>>   [    0.000000] Movable zone start for each node
>>>    ..
>>>    [    0.000000] Cannot allocate SWIOTLB buffer
>>>
>>> Question is - "ARM64 platform can't allocate memory for crash kernel in
>>> below 4GB range ?"
>>
>> If you want to use a device which requires 32-bit-addressable DMA
>> resources with your crash kernel, and that device isn't behind an IOMMU,
>> then don't load your crash kernel above 4GB. It's as simple as that,
>> because in general there's no other way around the issue. And if said
>> device doesn't actually need 32-bit-addressable resources, then yeah,
>> fix the dma_set_mask() calls in the driver.
> 
> I agree. I don't think there is much we can do, other than parsing all
> dma_pfn_offsets early on in DT and deciding whether the ZONE_DMA
> heuristics actually helps (and, if not, print some warning).
> 
>> That said, I think something is a bit wonky in max_zone_dma_phys() with
>> "It currently assumes that for memory starting above 4G, 32-bit devices
>> will use a DMA offset" - I think that assumption needs to be revisited
>> since, even disregarding cases like kdump, commonly available hardware
>> now exists for which that is not true (e.g. AMD Seattle). Catalin?
> 
> IIRC, I did this specifically for Seattle, though not sure whether it
> was just a matter of failing memory allocations when ZONE_DMA was empty
> rather than a device actually using it. That's the best we could do if
> there actually is a device with the relevant dma_pfn_offset.
> 
> The alternative would be to put everything in ZONE_DMA if the RAM is
> beyond 4GB but it doesn't help if we do have devices with a proper
> dma_pfn_offset. Leaving ZONE_DMA empty probably has other implications
> with failing allocations (you can fall back from ZONE_NORMAL to ZONE_DMA
> but not the other way around).

I'd be more inclined to take the latter approach - the vast majority of
(if not all) systems where this is even a concern at all have IOMMUs,
which make ZONE_DMA rather moot as a concept once you can freely
allocate pages to back buffer mappings from anywhere you like. Of
course, it might be nice to avoid allocating a useless SWIOTLB buffer in
such cases, but I guess that's perhaps a separate problem in itself.

Robin.


* ARM64 CONFIG_ZONE_DMA for 32-bit devices
  2017-03-06 13:00     ` Robin Murphy
@ 2017-03-06 13:39       ` Catalin Marinas
  0 siblings, 0 replies; 6+ messages in thread
From: Catalin Marinas @ 2017-03-06 13:39 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Mar 06, 2017 at 01:00:18PM +0000, Robin Murphy wrote:
> On 02/03/17 18:35, Catalin Marinas wrote:
> > On Tue, Feb 28, 2017 at 12:42:51PM +0000, Robin Murphy wrote:
> >> On 28/02/17 10:34, Kashyap Desai wrote:
> >>> I was reading below articles. Mine is not similar issue, but  I understand
> >>> few things about ARM64 SWIOTLB interface from below discussion.
> >>> Any input will be a great help to resolve/understand the issue.
> >>>
> >>> https://patchwork.codeaurora.org/patch/143833/
> >>> https://patchwork.kernel.org/patch/9495893/
> >>>
> >>> Current problem statement is -
> >>> "Trying to load kdump kernel from above 4GB memory does not work on ARM64
> >>> platform as <megaraid_sas> driver require certain DMA buffer from below
> >>> 4GB memory range."
> >>>
> >>> Looking for alternative/workaround for time being. Long term plan is to
> >>> remove limitation of <megaraid_sas> driver to remove 32 bit DMA mask for
> >>> SAS3.0 onwards controller.
> >>>
> >>> 1) I found below code @arch/arm64/mm/init.c . ARM64 kernel has provision
> >>> to support 32-bit DMA as well.  I am not sure about why CONFIG_ZONE_DMA
> >>> option is an configurable option for ARM64 ?
> >>>
> >>>         /* 4GB maximum for 32-bit only capable devices */
> >>>         if (IS_ENABLED(CONFIG_ZONE_DMA))
> >>>                 arm64_dma_phys_limit = max_zone_dma_phys();
> >>>         else
> >>>                 arm64_dma_phys_limit = PHYS_MASK + 1;
> >>>         dma_contiguous_reserve(arm64_dma_phys_limit);
> >>>
> >>> One of the reason I think "kdump" kernel can load from above 4GB memory
> >>> range provided crashkernel=<>, high and crashkernel=0, low option. So I
> >>> guess ARM64 kdump kernel may have disabled CONFIG_ZONE_DMA option, but
> >>> base kernel must have enabled CONFIG_ZONE_DMA option to support 32-bit
> >>> only capable devices.
> >>
> >> I believe it's more that ZONE_DMA goes a bit crazy when the available
> >> RAM starts above 4GB. It's not technically possible to turn it off
> >> without hacking Kconfig.
> > 
> > I think even if you hack Kconfig, the kernel may not build. We keep the
> > Kconfig option so that enum zone_type has ZONE_DMA defined.
> > 
> > Regarding the DMA zone selection, max_zone_dma_phys() is more of a hack
> > (needed on Seattle where all RAM is above 4GB). Basically the first
> > 32-bit at the start of RAM are considered for ZONE_DMA, even though the
> > actual physical address of the start of RAM would be well beyond 4GB.
> > This assumes that 32-bit only devices have the relevant dma_pfn_offset
> > passed via DT (not sure what we do on ACPI).
> 
> Except Seattle doesn't have dma_pfn_offsets :/
> 
> The DT has an identity-mapped "dma-ranges", and I've seen it
> demonstrated that a PCI card whose driver sets a 32-bit mask simply has
> all DMA API calls fail.

IIRC, the problem was an empty ZONE_DMA rather than an actual device
using this memory. It could have been just swiotlb failing, I don't
remember the details.

> >>> 2.)
> >>>
> >>> Typically - SWIOTLB uses DMA buffer from below 4GB range only. ARM64 is
> >>> the only architecture which support Low memory definition as per ARCH
> >>> specified. See below
> >>>
> >>> [root@ linux]# grep -R ARCH_LOW_ADDRESS_LIMIT arch/
> >>> arch/s390/include/asm/processor.h:#define ARCH_LOW_ADDRESS_LIMIT
> >>> 0x7fffffffUL
> >>> arch/arm64/include/asm/processor.h:#define ARCH_LOW_ADDRESS_LIMIT
> >>> (arm64_dma_phys_limit - 1)
> >>>
> >>> For only ARM64, it is possible to get SWTBL DMA buffer above 4GB. See
> >>> below snippet from crashed kernel on ARM64.
> >>>
> >>> [    0.000000] Zone ranges:
> >>> [    0.000000]   DMA      [mem 0x0000005fc0000000-0x0000005fffffffff]
> >>> [    0.000000]   Normal   empty
> > 
> > I guess that's because the kernel thinks 0x5fc0000000 is the start of all
> > RAM that is available and just assumes that ZONE_DMA would be covered by
> > the lower 32-bit of this (high) range.
> > 
> > The physical address in the 32-bit DMA context is rather irrelevant.
> > What you need is the actual DMA address that the device is seeing and
> > this is calculated by phys_to_dma (taking dma_pfn_offset into account).
> > 
> >>> SWIOTLB can map 64MB buffer from above 4GB only on ARM64 machine and that
> >>> is causing problem for <megaraid_sas> driver.
> >>> Current megaraid_sas driver wants certain resources from below 4GB memory
> >>> and that is why it request consistent dma mask as below -
> >>> pci_set_consistent_dma_mask(pdev, DMA_BIT_MASK(32)).
> >>>
> >>> If I do the same on x86_64, SWTBL INIT will fail because there is no Low
> >>> memory below 4GB. See below prints from x86_64 machine.
> >>>
> >>> [    0.000000] Zone ranges:
> >>> [    0.000000]   DMA      [mem 0x0000000000001000-0x0000000000ffffff]
> >>> [    0.000000]   DMA32    [mem 0x0000000001000000-0x00000000ffffffff]
> >>> [    0.000000]   Normal   [mem 0x0000000100000000-0x000000407effffff]
> >>>   [    0.000000] Movable zone start for each node
> >>>    ..
> >>>    [    0.000000] Cannot allocate SWIOTLB buffer
> >>>
> >>> Question is - "ARM64 platform can't allocate memory for crash kernel in
> >>> below 4GB range ?"
> >>
> >> If you want to use a device which requires 32-bit-addressable DMA
> >> resources with your crash kernel, and that device isn't behind an IOMMU,
> >> then don't load your crash kernel above 4GB. It's as simple as that,
> >> because in general there's no other way around the issue. And if said
> >> device doesn't actually need 32-bit-addressable resources, then yeah,
> >> fix the dma_set_mask() calls in the driver.
> > 
> > I agree. I don't think there is much we can do, other than parsing all
> > dma_pfn_offsets early on in DT and deciding whether the ZONE_DMA
> > heuristics actually helps (and, if not, print some warning).
> > 
> >> That said, I think something is a bit wonky in max_zone_dma_phys() with
> >> "It currently assumes that for memory starting above 4G, 32-bit devices
> >> will use a DMA offset" - I think that assumption needs to be revisited
> >> since, even disregarding cases like kdump, commonly available hardware
> >> now exists for which that is not true (e.g. AMD Seattle). Catalin?
> > 
> > IIRC, I did this specifically for Seattle, though not sure whether it
> > was just a matter of failing memory allocations when ZONE_DMA was empty
> > rather than a device actually using it. That's the best we could do if
> > there actually is a device with the relevant dma_pfn_offset.
> > 
> > The alternative would be to put everything in ZONE_DMA if the RAM is
> > beyond 4GB but it doesn't help if we do have devices with a proper
> > dma_pfn_offset. Leaving ZONE_DMA empty probably has other implications
> > with failing allocations (you can fall back from ZONE_NORMAL to ZONE_DMA
> > but not the other way around).
> 
> I'd be more inclined to take the latter approach - the vast majority of
> (if not all) systems where this is even a concern at all have IOMMUs,
> which make ZONE_DMA rather moot as a concept once you can freely
> allocate pages to back buffer mappings from anywhere you like. Of
> course, it might be nice to avoid allocating a useless SWIOTLB buffer in
> such cases, but I guess that's perhaps a separate problem in itself.

We could revert the DMA zone hack but only if we make swiotlb fail
silently in this case (and in subsequent uses of it). It allocates its
buffers using GFP_DMA and automatically fails to get them if ZONE_DMA is
empty.

-- 
Catalin


* ARM64 CONFIG_ZONE_DMA for 32-bit devices
@ 2017-03-03  8:59 Kashyap Desai
  0 siblings, 0 replies; 6+ messages in thread
From: Kashyap Desai @ 2017-03-03  8:59 UTC (permalink / raw)
  To: linux-arm-kernel

Thanks Robin and Catalin. Somehow I did not receive the emails, due to a
firewall or some other technical issue, so I have copied your discussion
and replied inline.


> > If you want to use a device which requires 32-bit-addressable DMA
> > resources with your crash kernel, and that device isn't behind an
> > IOMMU, then don't load your crash kernel above 4GB. It's as simple
> > as that, because in general there's no other way around the issue.
> > And if said device doesn't actually need 32-bit-addressable
> > resources, then yeah, fix the dma_set_mask() calls in the driver.
>
> I am working on driver-level changes, so the kdump versus base kernel
> behavior is just a settings and configuration problem.
> It is not that we are seeing a different DMA zone in the kdump kernel
> than in the base kernel.
>
> >
> > That said, I think something is a bit wonky in max_zone_dma_phys()
> > with "It currently assumes that for memory starting above 4G, 32-bit
> > devices will use a DMA offset" - I think that assumption needs to be
> > revisited since,

 This is exactly what I could not understand, as it is very specific to ARM
and not many architectures follow it.

 Do you mean always reserving some memory below 4GB for the DMA zone as an
alternative solution?

 I am asking with a limited understanding of how ARM works, and I need time
to understand your reply completely.
 The solution I am looking for is this: can a driver that requests a 64-bit
DMA mask for streaming I/O and a 32-bit consistent DMA mask work on the
ARM64 platform?

> >even disregarding cases like kdump, commonly available hardware now
> >exists for  which that is not true (e.g. AMD Seattle). Catalin?

 I recently received another set of logs in which the base kernel has the
same problem as the kdump kernel loaded with the crashkernel option in high
memory.

 Here is a /proc/iomem snippet from the base kernel:

 7e930000-7e930fff : /soc/xxxxxxxx@7e930000
 4000200000-43ffffffff : System RAM
   4000280000-4000ed0fff : Kernel code
   4000fa1000-400116cfff : Kernel data
 e0d0000000-e0d003ffff : cfg

 DMA and Normal zone are configured as below -

 Feb 22 15:34:54 dhcp-135-24-241-244 kernel: [    0.000000] Zone ranges:
 Feb 22 15:34:54 dhcp-135-24-241-244 kernel: [    0.000000]   DMA      [mem 0x0000004000200000-0x00000040ffffffff]
 Feb 22 15:34:54 dhcp-135-24-241-244 kernel: [    0.000000]   Normal   [mem 0x0000004100000000-0x00000043ffffffff]

 This is the first instance where I have observed RAM mapped entirely above
4GB, and because of that the driver's request for 32-bit DMA fails.
Which parameter determines that system RAM is mapped above 4GB? Is it done
in the BIOS?

>
> >
> > Robin.


end of thread, other threads:[~2017-03-06 13:39 UTC | newest]

Thread overview: 6+ messages
2017-02-28 10:34 ARM64 CONFIG_ZONE_DMA for 32-bit devices Kashyap Desai
2017-02-28 12:42 ` Robin Murphy
2017-03-02 18:35   ` Catalin Marinas
2017-03-06 13:00     ` Robin Murphy
2017-03-06 13:39       ` Catalin Marinas
2017-03-03  8:59 Kashyap Desai
