All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] xen: Add __GFP_DMA flag when xen_swiotlb_init gets free pages.
@ 2015-04-20 10:48 Chen Baozi
  2015-04-20 10:53 ` David Vrabel
  2015-04-20 13:37 ` Konrad Rzeszutek Wilk
  0 siblings, 2 replies; 30+ messages in thread
From: Chen Baozi @ 2015-04-20 10:48 UTC (permalink / raw)
  To: xen-devel; +Cc: Chen Baozi, stefano.stabellini

Make sure that xen_swiotlb_init allocates buffers that is DMA capable.

Signed-off-by: Chen Baozi <baozich@gmail.com>
---
 drivers/xen/swiotlb-xen.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
index 810ad41..7345afd 100644
--- a/drivers/xen/swiotlb-xen.c
+++ b/drivers/xen/swiotlb-xen.c
@@ -235,7 +235,8 @@ retry:
 #define SLABS_PER_PAGE (1 << (PAGE_SHIFT - IO_TLB_SHIFT))
 #define IO_TLB_MIN_SLABS ((1<<20) >> IO_TLB_SHIFT)
 		while ((SLABS_PER_PAGE << order) > IO_TLB_MIN_SLABS) {
-			xen_io_tlb_start = (void *)__get_free_pages(__GFP_NOWARN, order);
+			xen_io_tlb_start = (void *)__get_free_pages(
+						__GFP_NOWARN|__GFP_DMA, order);
 			if (xen_io_tlb_start)
 				break;
 			order--;
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [PATCH] xen: Add __GFP_DMA flag when xen_swiotlb_init gets free pages.
  2015-04-20 10:48 [PATCH] xen: Add __GFP_DMA flag when xen_swiotlb_init gets free pages Chen Baozi
@ 2015-04-20 10:53 ` David Vrabel
  2015-04-20 11:07   ` Chen Baozi
  2015-04-20 11:07     ` Chen Baozi
  2015-04-20 13:37 ` Konrad Rzeszutek Wilk
  1 sibling, 2 replies; 30+ messages in thread
From: David Vrabel @ 2015-04-20 10:53 UTC (permalink / raw)
  To: Chen Baozi, xen-devel; +Cc: stefano.stabellini

On 20/04/15 11:48, Chen Baozi wrote:
> Make sure that xen_swiotlb_init allocates buffers that is DMA capable.
> 
> Signed-off-by: Chen Baozi <baozich@gmail.com>
> ---
>  drivers/xen/swiotlb-xen.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
> index 810ad41..7345afd 100644
> --- a/drivers/xen/swiotlb-xen.c
> +++ b/drivers/xen/swiotlb-xen.c
> @@ -235,7 +235,8 @@ retry:
>  #define SLABS_PER_PAGE (1 << (PAGE_SHIFT - IO_TLB_SHIFT))
>  #define IO_TLB_MIN_SLABS ((1<<20) >> IO_TLB_SHIFT)
>  		while ((SLABS_PER_PAGE << order) > IO_TLB_MIN_SLABS) {
> -			xen_io_tlb_start = (void *)__get_free_pages(__GFP_NOWARN, order);
> +			xen_io_tlb_start = (void *)__get_free_pages(
> +						__GFP_NOWARN|__GFP_DMA, order);

I think this breaks x86 where __GFP_DMA means below 16 MB.  Perhaps you
mean __GFP_DMA32?

David

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [Xen-devel] [PATCH] xen: Add __GFP_DMA flag when xen_swiotlb_init gets free pages.
  2015-04-20 10:53 ` David Vrabel
@ 2015-04-20 11:07     ` Chen Baozi
  2015-04-20 11:07     ` Chen Baozi
  1 sibling, 0 replies; 30+ messages in thread
From: Chen Baozi @ 2015-04-20 11:07 UTC (permalink / raw)
  To: David Vrabel
  Cc: xen-devel, stefano.stabellini, linux-kernel, linux-arm-kernel

On Mon, Apr 20, 2015 at 11:53:47AM +0100, David Vrabel wrote:
> On 20/04/15 11:48, Chen Baozi wrote:
> > Make sure that xen_swiotlb_init allocates buffers that is DMA capable.
> > 
> > Signed-off-by: Chen Baozi <baozich@gmail.com>
> > ---
> >  drivers/xen/swiotlb-xen.c | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
> > index 810ad41..7345afd 100644
> > --- a/drivers/xen/swiotlb-xen.c
> > +++ b/drivers/xen/swiotlb-xen.c
> > @@ -235,7 +235,8 @@ retry:
> >  #define SLABS_PER_PAGE (1 << (PAGE_SHIFT - IO_TLB_SHIFT))
> >  #define IO_TLB_MIN_SLABS ((1<<20) >> IO_TLB_SHIFT)
> >  		while ((SLABS_PER_PAGE << order) > IO_TLB_MIN_SLABS) {
> > -			xen_io_tlb_start = (void *)__get_free_pages(__GFP_NOWARN, order);
> > +			xen_io_tlb_start = (void *)__get_free_pages(
> > +						__GFP_NOWARN|__GFP_DMA, order);
> 
> I think this breaks x86 where __GFP_DMA means below 16 MB.  Perhaps you
> mean __GFP_DMA32?

__GFP_DMA32 doesn't help on arm64...

I guess there might be conflicts about the meaning of __GFP_DMA between x86 and
arm?

Cheers,

Baozi.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [Xen-devel] [PATCH] xen: Add __GFP_DMA flag when xen_swiotlb_init gets free pages.
@ 2015-04-20 11:07     ` Chen Baozi
  0 siblings, 0 replies; 30+ messages in thread
From: Chen Baozi @ 2015-04-20 11:07 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Apr 20, 2015 at 11:53:47AM +0100, David Vrabel wrote:
> On 20/04/15 11:48, Chen Baozi wrote:
> > Make sure that xen_swiotlb_init allocates buffers that is DMA capable.
> > 
> > Signed-off-by: Chen Baozi <baozich@gmail.com>
> > ---
> >  drivers/xen/swiotlb-xen.c | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
> > index 810ad41..7345afd 100644
> > --- a/drivers/xen/swiotlb-xen.c
> > +++ b/drivers/xen/swiotlb-xen.c
> > @@ -235,7 +235,8 @@ retry:
> >  #define SLABS_PER_PAGE (1 << (PAGE_SHIFT - IO_TLB_SHIFT))
> >  #define IO_TLB_MIN_SLABS ((1<<20) >> IO_TLB_SHIFT)
> >  		while ((SLABS_PER_PAGE << order) > IO_TLB_MIN_SLABS) {
> > -			xen_io_tlb_start = (void *)__get_free_pages(__GFP_NOWARN, order);
> > +			xen_io_tlb_start = (void *)__get_free_pages(
> > +						__GFP_NOWARN|__GFP_DMA, order);
> 
> I think this breaks x86 where __GFP_DMA means below 16 MB.  Perhaps you
> mean __GFP_DMA32?

__GFP_DMA32 doesn't help on arm64...

I guess there might be conflicts about the meaning of __GFP_DMA between x86 and
arm?

Cheers,

Baozi.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] xen: Add __GFP_DMA flag when xen_swiotlb_init gets free pages.
  2015-04-20 10:53 ` David Vrabel
@ 2015-04-20 11:07   ` Chen Baozi
  2015-04-20 11:07     ` Chen Baozi
  1 sibling, 0 replies; 30+ messages in thread
From: Chen Baozi @ 2015-04-20 11:07 UTC (permalink / raw)
  To: David Vrabel
  Cc: stefano.stabellini, linux-kernel, linux-arm-kernel, xen-devel

On Mon, Apr 20, 2015 at 11:53:47AM +0100, David Vrabel wrote:
> On 20/04/15 11:48, Chen Baozi wrote:
> > Make sure that xen_swiotlb_init allocates buffers that is DMA capable.
> > 
> > Signed-off-by: Chen Baozi <baozich@gmail.com>
> > ---
> >  drivers/xen/swiotlb-xen.c | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
> > index 810ad41..7345afd 100644
> > --- a/drivers/xen/swiotlb-xen.c
> > +++ b/drivers/xen/swiotlb-xen.c
> > @@ -235,7 +235,8 @@ retry:
> >  #define SLABS_PER_PAGE (1 << (PAGE_SHIFT - IO_TLB_SHIFT))
> >  #define IO_TLB_MIN_SLABS ((1<<20) >> IO_TLB_SHIFT)
> >  		while ((SLABS_PER_PAGE << order) > IO_TLB_MIN_SLABS) {
> > -			xen_io_tlb_start = (void *)__get_free_pages(__GFP_NOWARN, order);
> > +			xen_io_tlb_start = (void *)__get_free_pages(
> > +						__GFP_NOWARN|__GFP_DMA, order);
> 
> I think this breaks x86 where __GFP_DMA means below 16 MB.  Perhaps you
> mean __GFP_DMA32?

__GFP_DMA32 doesn't help on arm64...

I guess there might be conflicts about the meaning of __GFP_DMA between x86 and
arm?

Cheers,

Baozi.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [Xen-devel] [PATCH] xen: Add __GFP_DMA flag when xen_swiotlb_init gets free pages.
  2015-04-20 11:07     ` Chen Baozi
@ 2015-04-20 12:02       ` David Vrabel
  -1 siblings, 0 replies; 30+ messages in thread
From: David Vrabel @ 2015-04-20 12:02 UTC (permalink / raw)
  To: Chen Baozi, David Vrabel
  Cc: stefano.stabellini, linux-kernel, linux-arm-kernel, xen-devel

On 20/04/15 12:07, Chen Baozi wrote:
> On Mon, Apr 20, 2015 at 11:53:47AM +0100, David Vrabel wrote:
>> On 20/04/15 11:48, Chen Baozi wrote:
>>> Make sure that xen_swiotlb_init allocates buffers that is DMA capable.
>>>
>>> Signed-off-by: Chen Baozi <baozich@gmail.com>
>>> ---
>>>  drivers/xen/swiotlb-xen.c | 3 ++-
>>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
>>> index 810ad41..7345afd 100644
>>> --- a/drivers/xen/swiotlb-xen.c
>>> +++ b/drivers/xen/swiotlb-xen.c
>>> @@ -235,7 +235,8 @@ retry:
>>>  #define SLABS_PER_PAGE (1 << (PAGE_SHIFT - IO_TLB_SHIFT))
>>>  #define IO_TLB_MIN_SLABS ((1<<20) >> IO_TLB_SHIFT)
>>>  		while ((SLABS_PER_PAGE << order) > IO_TLB_MIN_SLABS) {
>>> -			xen_io_tlb_start = (void *)__get_free_pages(__GFP_NOWARN, order);
>>> +			xen_io_tlb_start = (void *)__get_free_pages(
>>> +						__GFP_NOWARN|__GFP_DMA, order);
>>
>> I think this breaks x86 where __GFP_DMA means below 16 MB.  Perhaps you
>> mean __GFP_DMA32?
> 
> __GFP_DMA32 doesn't help on arm64...
> 
> I guess there might be conflicts about the meaning of __GFP_DMA between x86 and
> arm?

Yes.

This is also conceptually wrong since it doesn't matter where the pages
are in PFN space,  but where they are in bus address (MFN) space (which
is what the subsequent hypercall is required to sort out).

David

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [Xen-devel] [PATCH] xen: Add __GFP_DMA flag when xen_swiotlb_init gets free pages.
@ 2015-04-20 12:02       ` David Vrabel
  0 siblings, 0 replies; 30+ messages in thread
From: David Vrabel @ 2015-04-20 12:02 UTC (permalink / raw)
  To: linux-arm-kernel

On 20/04/15 12:07, Chen Baozi wrote:
> On Mon, Apr 20, 2015 at 11:53:47AM +0100, David Vrabel wrote:
>> On 20/04/15 11:48, Chen Baozi wrote:
>>> Make sure that xen_swiotlb_init allocates buffers that is DMA capable.
>>>
>>> Signed-off-by: Chen Baozi <baozich@gmail.com>
>>> ---
>>>  drivers/xen/swiotlb-xen.c | 3 ++-
>>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
>>> index 810ad41..7345afd 100644
>>> --- a/drivers/xen/swiotlb-xen.c
>>> +++ b/drivers/xen/swiotlb-xen.c
>>> @@ -235,7 +235,8 @@ retry:
>>>  #define SLABS_PER_PAGE (1 << (PAGE_SHIFT - IO_TLB_SHIFT))
>>>  #define IO_TLB_MIN_SLABS ((1<<20) >> IO_TLB_SHIFT)
>>>  		while ((SLABS_PER_PAGE << order) > IO_TLB_MIN_SLABS) {
>>> -			xen_io_tlb_start = (void *)__get_free_pages(__GFP_NOWARN, order);
>>> +			xen_io_tlb_start = (void *)__get_free_pages(
>>> +						__GFP_NOWARN|__GFP_DMA, order);
>>
>> I think this breaks x86 where __GFP_DMA means below 16 MB.  Perhaps you
>> mean __GFP_DMA32?
> 
> __GFP_DMA32 doesn't help on arm64...
> 
> I guess there might be conflicts about the meaning of __GFP_DMA between x86 and
> arm?

Yes.

This is also conceptually wrong since it doesn't matter where the pages
are in PFN space,  but where they are in bus address (MFN) space (which
is what the subsequent hypercall is required to sort out).

David

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] xen: Add __GFP_DMA flag when xen_swiotlb_init gets free pages.
  2015-04-20 11:07     ` Chen Baozi
  (?)
@ 2015-04-20 12:02     ` David Vrabel
  -1 siblings, 0 replies; 30+ messages in thread
From: David Vrabel @ 2015-04-20 12:02 UTC (permalink / raw)
  To: Chen Baozi, David Vrabel
  Cc: xen-devel, linux-kernel, linux-arm-kernel, stefano.stabellini

On 20/04/15 12:07, Chen Baozi wrote:
> On Mon, Apr 20, 2015 at 11:53:47AM +0100, David Vrabel wrote:
>> On 20/04/15 11:48, Chen Baozi wrote:
>>> Make sure that xen_swiotlb_init allocates buffers that is DMA capable.
>>>
>>> Signed-off-by: Chen Baozi <baozich@gmail.com>
>>> ---
>>>  drivers/xen/swiotlb-xen.c | 3 ++-
>>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
>>> index 810ad41..7345afd 100644
>>> --- a/drivers/xen/swiotlb-xen.c
>>> +++ b/drivers/xen/swiotlb-xen.c
>>> @@ -235,7 +235,8 @@ retry:
>>>  #define SLABS_PER_PAGE (1 << (PAGE_SHIFT - IO_TLB_SHIFT))
>>>  #define IO_TLB_MIN_SLABS ((1<<20) >> IO_TLB_SHIFT)
>>>  		while ((SLABS_PER_PAGE << order) > IO_TLB_MIN_SLABS) {
>>> -			xen_io_tlb_start = (void *)__get_free_pages(__GFP_NOWARN, order);
>>> +			xen_io_tlb_start = (void *)__get_free_pages(
>>> +						__GFP_NOWARN|__GFP_DMA, order);
>>
>> I think this breaks x86 where __GFP_DMA means below 16 MB.  Perhaps you
>> mean __GFP_DMA32?
> 
> __GFP_DMA32 doesn't help on arm64...
> 
> I guess there might be conflicts about the meaning of __GFP_DMA between x86 and
> arm?

Yes.

This is also conceptually wrong since it doesn't matter where the pages
are in PFN space,  but where they are in bus address (MFN) space (which
is what the subsequent hypercall is required to sort out).

David

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] xen: Add __GFP_DMA flag when xen_swiotlb_init gets free pages.
  2015-04-20 10:48 [PATCH] xen: Add __GFP_DMA flag when xen_swiotlb_init gets free pages Chen Baozi
  2015-04-20 10:53 ` David Vrabel
@ 2015-04-20 13:37 ` Konrad Rzeszutek Wilk
  1 sibling, 0 replies; 30+ messages in thread
From: Konrad Rzeszutek Wilk @ 2015-04-20 13:37 UTC (permalink / raw)
  To: Chen Baozi; +Cc: stefano.stabellini, xen-devel

On Mon, Apr 20, 2015 at 06:48:24PM +0800, Chen Baozi wrote:
> Make sure that xen_swiotlb_init allocates buffers that is DMA capable.
> 
> Signed-off-by: Chen Baozi <baozich@gmail.com>
> ---
>  drivers/xen/swiotlb-xen.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
> index 810ad41..7345afd 100644
> --- a/drivers/xen/swiotlb-xen.c
> +++ b/drivers/xen/swiotlb-xen.c
> @@ -235,7 +235,8 @@ retry:
>  #define SLABS_PER_PAGE (1 << (PAGE_SHIFT - IO_TLB_SHIFT))
>  #define IO_TLB_MIN_SLABS ((1<<20) >> IO_TLB_SHIFT)
>  		while ((SLABS_PER_PAGE << order) > IO_TLB_MIN_SLABS) {
> -			xen_io_tlb_start = (void *)__get_free_pages(__GFP_NOWARN, order);
> +			xen_io_tlb_start = (void *)__get_free_pages(
> +						__GFP_NOWARN|__GFP_DMA, order);

This is not good for x86 PV guests - as the GFP_DMA pool does not really
have pages below 4GB.

Could you  make this __GFP_DMA be dependent on (!xen_pv_domain()) or such?

>  			if (xen_io_tlb_start)
>  				break;
>  			order--;
> -- 
> 2.1.4
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [Xen-devel] [PATCH] xen: Add __GFP_DMA flag when xen_swiotlb_init gets free pages.
  2015-04-20 12:02       ` David Vrabel
@ 2015-04-20 17:54         ` Stefano Stabellini
  -1 siblings, 0 replies; 30+ messages in thread
From: Stefano Stabellini @ 2015-04-20 17:54 UTC (permalink / raw)
  To: David Vrabel
  Cc: Chen Baozi, stefano.stabellini, linux-kernel, linux-arm-kernel,
	xen-devel

On Mon, 20 Apr 2015, David Vrabel wrote:
> On 20/04/15 12:07, Chen Baozi wrote:
> > On Mon, Apr 20, 2015 at 11:53:47AM +0100, David Vrabel wrote:
> >> On 20/04/15 11:48, Chen Baozi wrote:
> >>> Make sure that xen_swiotlb_init allocates buffers that is DMA capable.
> >>>
> >>> Signed-off-by: Chen Baozi <baozich@gmail.com>
> >>> ---
> >>>  drivers/xen/swiotlb-xen.c | 3 ++-
> >>>  1 file changed, 2 insertions(+), 1 deletion(-)
> >>>
> >>> diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
> >>> index 810ad41..7345afd 100644
> >>> --- a/drivers/xen/swiotlb-xen.c
> >>> +++ b/drivers/xen/swiotlb-xen.c
> >>> @@ -235,7 +235,8 @@ retry:
> >>>  #define SLABS_PER_PAGE (1 << (PAGE_SHIFT - IO_TLB_SHIFT))
> >>>  #define IO_TLB_MIN_SLABS ((1<<20) >> IO_TLB_SHIFT)
> >>>  		while ((SLABS_PER_PAGE << order) > IO_TLB_MIN_SLABS) {
> >>> -			xen_io_tlb_start = (void *)__get_free_pages(__GFP_NOWARN, order);
> >>> +			xen_io_tlb_start = (void *)__get_free_pages(
> >>> +						__GFP_NOWARN|__GFP_DMA, order);
> >>
> >> I think this breaks x86 where __GFP_DMA means below 16 MB.  Perhaps you
> >> mean __GFP_DMA32?
> > 
> > __GFP_DMA32 doesn't help on arm64...
> > 
> > I guess there might be conflicts about the meaning of __GFP_DMA between x86 and
> > arm?
> 
> Yes.

This should definitely be done only on ARM and ARM64, as on x86 PVH
assumes the presence of an IOMMU. We need an ifdef.

Also we need to figure out a way to try without GFP_DMA in case no ram
under 4g is available at all, as some arm64 platforms don't have any. Of
course in those cases we don't need to worry about devices and their dma
masks.  Maybe we could use memblock for that? Something like:

    struct memblock_region *reg;
    gfp_t flags = __GFP_NOWARN;

#if defined(CONFIG_ARM) || defined(CONFIG_ARM64)
	for_each_memblock(memory, reg) {
		unsigned long start = memblock_region_memory_base_pfn(reg);

        if (start < 4G) {
            flags |= __GFP_DMA;
            break;
        }
    }
#endif

    [...]

    xen_io_tlb_start = (void *)__get_free_pages(flags, order);
    




> This is also conceptually wrong since it doesn't matter where the pages
> are in PFN space,  but where they are in bus address (MFN) space (which
> is what the subsequent hypercall is required to sort out).

Actually on ARM dom0 is mapped 1:1, so it is the same thing.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [Xen-devel] [PATCH] xen: Add __GFP_DMA flag when xen_swiotlb_init gets free pages.
@ 2015-04-20 17:54         ` Stefano Stabellini
  0 siblings, 0 replies; 30+ messages in thread
From: Stefano Stabellini @ 2015-04-20 17:54 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, 20 Apr 2015, David Vrabel wrote:
> On 20/04/15 12:07, Chen Baozi wrote:
> > On Mon, Apr 20, 2015 at 11:53:47AM +0100, David Vrabel wrote:
> >> On 20/04/15 11:48, Chen Baozi wrote:
> >>> Make sure that xen_swiotlb_init allocates buffers that is DMA capable.
> >>>
> >>> Signed-off-by: Chen Baozi <baozich@gmail.com>
> >>> ---
> >>>  drivers/xen/swiotlb-xen.c | 3 ++-
> >>>  1 file changed, 2 insertions(+), 1 deletion(-)
> >>>
> >>> diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
> >>> index 810ad41..7345afd 100644
> >>> --- a/drivers/xen/swiotlb-xen.c
> >>> +++ b/drivers/xen/swiotlb-xen.c
> >>> @@ -235,7 +235,8 @@ retry:
> >>>  #define SLABS_PER_PAGE (1 << (PAGE_SHIFT - IO_TLB_SHIFT))
> >>>  #define IO_TLB_MIN_SLABS ((1<<20) >> IO_TLB_SHIFT)
> >>>  		while ((SLABS_PER_PAGE << order) > IO_TLB_MIN_SLABS) {
> >>> -			xen_io_tlb_start = (void *)__get_free_pages(__GFP_NOWARN, order);
> >>> +			xen_io_tlb_start = (void *)__get_free_pages(
> >>> +						__GFP_NOWARN|__GFP_DMA, order);
> >>
> >> I think this breaks x86 where __GFP_DMA means below 16 MB.  Perhaps you
> >> mean __GFP_DMA32?
> > 
> > __GFP_DMA32 doesn't help on arm64...
> > 
> > I guess there might be conflicts about the meaning of __GFP_DMA between x86 and
> > arm?
> 
> Yes.

This should definitely be done only on ARM and ARM64, as on x86 PVH
assumes the presence of an IOMMU. We need an ifdef.

Also we need to figure out a way to try without GFP_DMA in case no ram
under 4g is available at all, as some arm64 platforms don't have any. Of
course in those cases we don't need to worry about devices and their dma
masks.  Maybe we could use memblock for that? Something like:

    struct memblock_region *reg;
    gfp_t flags = __GFP_NOWARN;

#if defined(CONFIG_ARM) || defined(CONFIG_ARM64)
	for_each_memblock(memory, reg) {
		unsigned long start = memblock_region_memory_base_pfn(reg);

        if (start < 4G) {
            flags |= __GFP_DMA;
            break;
        }
    }
#endif

    [...]

    xen_io_tlb_start = (void *)__get_free_pages(flags, order);
    




> This is also conceptually wrong since it doesn't matter where the pages
> are in PFN space,  but where they are in bus address (MFN) space (which
> is what the subsequent hypercall is required to sort out).

Actually on ARM dom0 is mapped 1:1, so it is the same thing.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] xen: Add __GFP_DMA flag when xen_swiotlb_init gets free pages.
  2015-04-20 12:02       ` David Vrabel
  (?)
  (?)
@ 2015-04-20 17:54       ` Stefano Stabellini
  -1 siblings, 0 replies; 30+ messages in thread
From: Stefano Stabellini @ 2015-04-20 17:54 UTC (permalink / raw)
  To: David Vrabel
  Cc: Chen Baozi, xen-devel, linux-kernel, linux-arm-kernel,
	stefano.stabellini

On Mon, 20 Apr 2015, David Vrabel wrote:
> On 20/04/15 12:07, Chen Baozi wrote:
> > On Mon, Apr 20, 2015 at 11:53:47AM +0100, David Vrabel wrote:
> >> On 20/04/15 11:48, Chen Baozi wrote:
> >>> Make sure that xen_swiotlb_init allocates buffers that is DMA capable.
> >>>
> >>> Signed-off-by: Chen Baozi <baozich@gmail.com>
> >>> ---
> >>>  drivers/xen/swiotlb-xen.c | 3 ++-
> >>>  1 file changed, 2 insertions(+), 1 deletion(-)
> >>>
> >>> diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
> >>> index 810ad41..7345afd 100644
> >>> --- a/drivers/xen/swiotlb-xen.c
> >>> +++ b/drivers/xen/swiotlb-xen.c
> >>> @@ -235,7 +235,8 @@ retry:
> >>>  #define SLABS_PER_PAGE (1 << (PAGE_SHIFT - IO_TLB_SHIFT))
> >>>  #define IO_TLB_MIN_SLABS ((1<<20) >> IO_TLB_SHIFT)
> >>>  		while ((SLABS_PER_PAGE << order) > IO_TLB_MIN_SLABS) {
> >>> -			xen_io_tlb_start = (void *)__get_free_pages(__GFP_NOWARN, order);
> >>> +			xen_io_tlb_start = (void *)__get_free_pages(
> >>> +						__GFP_NOWARN|__GFP_DMA, order);
> >>
> >> I think this breaks x86 where __GFP_DMA means below 16 MB.  Perhaps you
> >> mean __GFP_DMA32?
> > 
> > __GFP_DMA32 doesn't help on arm64...
> > 
> > I guess there might be conflicts about the meaning of __GFP_DMA between x86 and
> > arm?
> 
> Yes.

This should definitely be done only on ARM and ARM64, as on x86 PVH
assumes the presence of an IOMMU. We need an ifdef.

Also we need to figure out a way to try without GFP_DMA in case no ram
under 4g is available at all, as some arm64 platforms don't have any. Of
course in those cases we don't need to worry about devices and their dma
masks.  Maybe we could use memblock for that? Something like:

    struct memblock_region *reg;
    gfp_t flags = __GFP_NOWARN;

#if defined(CONFIG_ARM) || defined(CONFIG_ARM64)
	for_each_memblock(memory, reg) {
		unsigned long start = memblock_region_memory_base_pfn(reg);

        if (start < 4G) {
            flags |= __GFP_DMA;
            break;
        }
    }
#endif

    [...]

    xen_io_tlb_start = (void *)__get_free_pages(flags, order);
    




> This is also conceptually wrong since it doesn't matter where the pages
> are in PFN space,  but where they are in bus address (MFN) space (which
> is what the subsequent hypercall is required to sort out).

Actually on ARM dom0 is mapped 1:1, so it is the same thing.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [Xen-devel] [PATCH] xen: Add __GFP_DMA flag when xen_swiotlb_init gets free pages.
  2015-04-20 17:54         ` Stefano Stabellini
@ 2015-04-21  7:57           ` Ian Campbell
  -1 siblings, 0 replies; 30+ messages in thread
From: Ian Campbell @ 2015-04-21  7:57 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: David Vrabel, Chen Baozi, xen-devel, linux-kernel, linux-arm-kernel

On Mon, 2015-04-20 at 18:54 +0100, Stefano Stabellini wrote:
> This should definitely be done only on ARM and ARM64, as on x86 PVH
> assumes the presence of an IOMMU. We need an ifdef.
> 
> Also we need to figure out a way to try without GFP_DMA in case no ram
> under 4g is available at all, as some arm64 platforms don't have any. Of
> course in those cases we don't need to worry about devices and their dma
> masks.  Maybe we could use memblock for that?

It's pretty ugly, but I've not got any better ideas.

It would perhaps be less ugly as a an arch-specific
get_me_a_swiotlb_region type function, with the bare __get_free_pages as
the generic fallback.

>  Something like:
> 
>     struct memblock_region *reg;
>     gfp_t flags = __GFP_NOWARN;
> 
> #if defined(CONFIG_ARM) || defined(CONFIG_ARM64)
> 	for_each_memblock(memory, reg) {
> 		unsigned long start = memblock_region_memory_base_pfn(reg);
> 
>         if (start < 4G) {
>             flags |= __GFP_DMA;
>             break;
>         }
>     }
> #endif
> 
>     [...]
> 
>     xen_io_tlb_start = (void *)__get_free_pages(flags, order);
>     
> 
> 
> 
> 
> > This is also conceptually wrong since it doesn't matter where the pages
> > are in PFN space,  but where they are in bus address (MFN) space (which
> > is what the subsequent hypercall is required to sort out).
> 
> Actually on ARM dom0 is mapped 1:1, so it is the same thing.

On a system with a fully functional SMMU dom0 may not be 1:1 mapped, but
I think that dom0 can still treat that as 1:1 mapped for these purposes,
since the SMMU will provide that illusion.

Dumb question, and this might affect PVH too, if you have an IOMMU and a
device with a limited DMA range, I suppose you need to provide DMA
addresses in the <4G for the input to the IOMMU (i.e. PFN) and not the
output (i.e. MFN) space (since the device only sees PFNs).

So even for x86 PVH isn't something required here to ensure that the
swiotlb has suitable pages under 4GB in PFN space too?

(On ARM PFN==IPA and MFN==PA)

Second dumb question, on x86 PVH or ARM with an SMMU, would we even hit
the Xen swiotlb code, or would we want to arrange to go via the native
swiotlb paths? I initially thought the latter, but does e.g. grant
mappings still need some special handling?

Ian.



^ permalink raw reply	[flat|nested] 30+ messages in thread

* [Xen-devel] [PATCH] xen: Add __GFP_DMA flag when xen_swiotlb_init gets free pages.
@ 2015-04-21  7:57           ` Ian Campbell
  0 siblings, 0 replies; 30+ messages in thread
From: Ian Campbell @ 2015-04-21  7:57 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, 2015-04-20 at 18:54 +0100, Stefano Stabellini wrote:
> This should definitely be done only on ARM and ARM64, as on x86 PVH
> assumes the presence of an IOMMU. We need an ifdef.
> 
> Also we need to figure out a way to try without GFP_DMA in case no ram
> under 4g is available at all, as some arm64 platforms don't have any. Of
> course in those cases we don't need to worry about devices and their dma
> masks.  Maybe we could use memblock for that?

It's pretty ugly, but I've not got any better ideas.

It would perhaps be less ugly as a an arch-specific
get_me_a_swiotlb_region type function, with the bare __get_free_pages as
the generic fallback.

>  Something like:
> 
>     struct memblock_region *reg;
>     gfp_t flags = __GFP_NOWARN;
> 
> #if defined(CONFIG_ARM) || defined(CONFIG_ARM64)
> 	for_each_memblock(memory, reg) {
> 		unsigned long start = memblock_region_memory_base_pfn(reg);
> 
>         if (start < 4G) {
>             flags |= __GFP_DMA;
>             break;
>         }
>     }
> #endif
> 
>     [...]
> 
>     xen_io_tlb_start = (void *)__get_free_pages(flags, order);
>     
> 
> 
> 
> 
> > This is also conceptually wrong since it doesn't matter where the pages
> > are in PFN space,  but where they are in bus address (MFN) space (which
> > is what the subsequent hypercall is required to sort out).
> 
> Actually on ARM dom0 is mapped 1:1, so it is the same thing.

On a system with a fully functional SMMU dom0 may not be 1:1 mapped, but
I think that dom0 can still treat that as 1:1 mapped for these purposes,
since the SMMU will provide that illusion.

Dumb question, and this might affect PVH too, if you have an IOMMU and a
device with a limited DMA range, I suppose you need to provide DMA
addresses in the <4G for the input to the IOMMU (i.e. PFN) and not the
output (i.e. MFN) space (since the device only sees PFNs).

So even for x86 PVH isn't something required here to ensure that the
swiotlb has suitable pages under 4GB in PFN space too?

(On ARM PFN==IPA and MFN==PA)

Second dumb question, on x86 PVH or ARM with an SMMU, would we even hit
the Xen swiotlb code, or would we want to arrange to go via the native
swiotlb paths? I initially thought the latter, but does e.g. grant
mappings still need some special handling?

Ian.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] xen: Add __GFP_DMA flag when xen_swiotlb_init gets free pages.
  2015-04-20 17:54         ` Stefano Stabellini
  (?)
@ 2015-04-21  7:57         ` Ian Campbell
  -1 siblings, 0 replies; 30+ messages in thread
From: Ian Campbell @ 2015-04-21  7:57 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: linux-kernel, Chen Baozi, David Vrabel, linux-arm-kernel, xen-devel

On Mon, 2015-04-20 at 18:54 +0100, Stefano Stabellini wrote:
> This should definitely be done only on ARM and ARM64, as on x86 PVH
> assumes the presence of an IOMMU. We need an ifdef.
> 
> Also we need to figure out a way to try without GFP_DMA in case no ram
> under 4g is available at all, as some arm64 platforms don't have any. Of
> course in those cases we don't need to worry about devices and their dma
> masks.  Maybe we could use memblock for that?

It's pretty ugly, but I've not got any better ideas.

It would perhaps be less ugly as a an arch-specific
get_me_a_swiotlb_region type function, with the bare __get_free_pages as
the generic fallback.

>  Something like:
> 
>     struct memblock_region *reg;
>     gfp_t flags = __GFP_NOWARN;
> 
> #if defined(CONFIG_ARM) || defined(CONFIG_ARM64)
> 	for_each_memblock(memory, reg) {
> 		unsigned long start = memblock_region_memory_base_pfn(reg);
> 
>         if (start < 4G) {
>             flags |= __GFP_DMA;
>             break;
>         }
>     }
> #endif
> 
>     [...]
> 
>     xen_io_tlb_start = (void *)__get_free_pages(flags, order);
>     
> 
> 
> 
> 
> > This is also conceptually wrong since it doesn't matter where the pages
> > are in PFN space,  but where they are in bus address (MFN) space (which
> > is what the subsequent hypercall is required to sort out).
> 
> Actually on ARM dom0 is mapped 1:1, so it is the same thing.

On a system with a fully functional SMMU dom0 may not be 1:1 mapped, but
I think that dom0 can still treat that as 1:1 mapped for these purposes,
since the SMMU will provide that illusion.

Dumb question, and this might affect PVH too, if you have an IOMMU and a
device with a limited DMA range, I suppose you need to provide DMA
addresses in the <4G for the input to the IOMMU (i.e. PFN) and not the
output (i.e. MFN) space (since the device only sees PFNs).

So even for x86 PVH isn't something required here to ensure that the
swiotlb has suitable pages under 4GB in PFN space too?

(On ARM PFN==IPA and MFN==PA)

Second dumb question, on x86 PVH or ARM with an SMMU, would we even hit
the Xen swiotlb code, or would we want to arrange to go via the native
swiotlb paths? I initially thought the latter, but does e.g. grant
mappings still need some special handling?

Ian.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [Xen-devel] [PATCH] xen: Add __GFP_DMA flag when xen_swiotlb_init gets free pages.
  2015-04-21  7:57           ` Ian Campbell
@ 2015-04-21 10:36             ` Stefano Stabellini
  -1 siblings, 0 replies; 30+ messages in thread
From: Stefano Stabellini @ 2015-04-21 10:36 UTC (permalink / raw)
  To: Ian Campbell
  Cc: Stefano Stabellini, David Vrabel, Chen Baozi, xen-devel,
	linux-kernel, linux-arm-kernel, Roger Pau Monne

On Tue, 21 Apr 2015, Ian Campbell wrote:
> On Mon, 2015-04-20 at 18:54 +0100, Stefano Stabellini wrote:
> > This should definitely be done only on ARM and ARM64, as on x86 PVH
> > assumes the presence of an IOMMU. We need an ifdef.
> > 
> > Also we need to figure out a way to try without GFP_DMA in case no ram
> > under 4g is available at all, as some arm64 platforms don't have any. Of
> > course in those cases we don't need to worry about devices and their dma
> > masks.  Maybe we could use memblock for that?
> 
> It's pretty ugly, but I've not got any better ideas.
> 
> It would perhaps be less ugly as a an arch-specific
> get_me_a_swiotlb_region type function, with the bare __get_free_pages as
> the generic fallback.

We could do that, but even open code like this isn't too bad: it might
be ugly but at least is very obvious.


> >  Something like:
> > 
> >     struct memblock_region *reg;
> >     gfp_t flags = __GFP_NOWARN;
> > 
> > #if defined(CONFIG_ARM) || defined(CONFIG_ARM64)
> > 	for_each_memblock(memory, reg) {
> > 		unsigned long start = memblock_region_memory_base_pfn(reg);
> > 
> >         if (start < 4G) {
> >             flags |= __GFP_DMA;
> >             break;
> >         }
> >     }
> > #endif
> > 
> >     [...]
> > 
> >     xen_io_tlb_start = (void *)__get_free_pages(flags, order);
> >     
> > 
> > 
> > 
> > 
> > > This is also conceptually wrong since it doesn't matter where the pages
> > > are in PFN space,  but where they are in bus address (MFN) space (which
> > > is what the subsequent hypercall is required to sort out).
> > 
> > Actually on ARM dom0 is mapped 1:1, so it is the same thing.
> 
> On a system with a fully functional SMMU dom0 may not be 1:1 mapped, but
> I think that dom0 can still treat that as 1:1 mapped for these purposes,
> since the SMMU will provide that illusion.
> 
> Dumb question, and this might affect PVH too, if you have an IOMMU and a
> device with a limited DMA range, I suppose you need to provide DMA
> addresses in the <4G for the input to the IOMMU (i.e. PFN) and not the
> output (i.e. MFN) space (since the device only sees PFNs).

I think you mean "for the input to the device (PFN)", but I presume the
same.


> So even for x86 PVH isn't something required here to ensure that the
> swiotlb has suitable pages under 4GB in PFN space too?
> 
> (On ARM PFN==IPA and MFN==PA)

I guess that is true. PVH people, any thoughts?


> Second dumb question, on x86 PVH or ARM with an SMMU, would we even hit
> the Xen swiotlb code, or would we want to arrange to go via the native
> swiotlb paths? I initially thought the latter, but does e.g. grant
> mappings still need some special handling?

In both cases I think it would be simpler to just go through the normal
swiotlb.

FYI at the moment there is no knowledge about SMMUs availability on ARM.
The code is still missing.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [Xen-devel] [PATCH] xen: Add __GFP_DMA flag when xen_swiotlb_init gets free pages.
@ 2015-04-21 10:36             ` Stefano Stabellini
  0 siblings, 0 replies; 30+ messages in thread
From: Stefano Stabellini @ 2015-04-21 10:36 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, 21 Apr 2015, Ian Campbell wrote:
> On Mon, 2015-04-20 at 18:54 +0100, Stefano Stabellini wrote:
> > This should definitely be done only on ARM and ARM64, as on x86 PVH
> > assumes the presence of an IOMMU. We need an ifdef.
> > 
> > Also we need to figure out a way to try without GFP_DMA in case no ram
> > under 4g is available at all, as some arm64 platforms don't have any. Of
> > course in those cases we don't need to worry about devices and their dma
> > masks.  Maybe we could use memblock for that?
> 
> It's pretty ugly, but I've not got any better ideas.
> 
> It would perhaps be less ugly as a an arch-specific
> get_me_a_swiotlb_region type function, with the bare __get_free_pages as
> the generic fallback.

We could do that, but even open code like this isn't too bad: it might
be ugly but at least is very obvious.


> >  Something like:
> > 
> >     struct memblock_region *reg;
> >     gfp_t flags = __GFP_NOWARN;
> > 
> > #if defined(CONFIG_ARM) || defined(CONFIG_ARM64)
> > 	for_each_memblock(memory, reg) {
> > 		unsigned long start = memblock_region_memory_base_pfn(reg);
> > 
> >         if (start < 4G) {
> >             flags |= __GFP_DMA;
> >             break;
> >         }
> >     }
> > #endif
> > 
> >     [...]
> > 
> >     xen_io_tlb_start = (void *)__get_free_pages(flags, order);
> >     
> > 
> > 
> > 
> > 
> > > This is also conceptually wrong since it doesn't matter where the pages
> > > are in PFN space,  but where they are in bus address (MFN) space (which
> > > is what the subsequent hypercall is required to sort out).
> > 
> > Actually on ARM dom0 is mapped 1:1, so it is the same thing.
> 
> On a system with a fully functional SMMU dom0 may not be 1:1 mapped, but
> I think that dom0 can still treat that as 1:1 mapped for these purposes,
> since the SMMU will provide that illusion.
> 
> Dumb question, and this might affect PVH too, if you have an IOMMU and a
> device with a limited DMA range, I suppose you need to provide DMA
> addresses in the <4G for the input to the IOMMU (i.e. PFN) and not the
> output (i.e. MFN) space (since the device only sees PFNs).

I think you mean "for the input to the device (PFN)", but I presume the
same.


> So even for x86 PVH isn't something required here to ensure that the
> swiotlb has suitable pages under 4GB in PFN space too?
> 
> (On ARM PFN==IPA and MFN==PA)

I guess that is true. PVH people, any thoughts?


> Second dumb question, on x86 PVH or ARM with an SMMU, would we even hit
> the Xen swiotlb code, or would we want to arrange to go via the native
> swiotlb paths? I initially thought the latter, but does e.g. grant
> mappings still need some special handling?

In both cases I think it would be simpler to just go through the normal
swiotlb.

FYI at the moment there is no knowledge about SMMUs availability on ARM.
The code is still missing.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] xen: Add __GFP_DMA flag when xen_swiotlb_init gets free pages.
  2015-04-21  7:57           ` Ian Campbell
  (?)
  (?)
@ 2015-04-21 10:36           ` Stefano Stabellini
  -1 siblings, 0 replies; 30+ messages in thread
From: Stefano Stabellini @ 2015-04-21 10:36 UTC (permalink / raw)
  To: Ian Campbell
  Cc: Stefano Stabellini, linux-kernel, xen-devel, Chen Baozi,
	David Vrabel, linux-arm-kernel, Roger Pau Monne

On Tue, 21 Apr 2015, Ian Campbell wrote:
> On Mon, 2015-04-20 at 18:54 +0100, Stefano Stabellini wrote:
> > This should definitely be done only on ARM and ARM64, as on x86 PVH
> > assumes the presence of an IOMMU. We need an ifdef.
> > 
> > Also we need to figure out a way to try without GFP_DMA in case no ram
> > under 4g is available at all, as some arm64 platforms don't have any. Of
> > course in those cases we don't need to worry about devices and their dma
> > masks.  Maybe we could use memblock for that?
> 
> It's pretty ugly, but I've not got any better ideas.
> 
> It would perhaps be less ugly as a an arch-specific
> get_me_a_swiotlb_region type function, with the bare __get_free_pages as
> the generic fallback.

We could do that, but even open code like this isn't too bad: it might
be ugly but at least is very obvious.


> >  Something like:
> > 
> >     struct memblock_region *reg;
> >     gfp_t flags = __GFP_NOWARN;
> > 
> > #if defined(CONFIG_ARM) || defined(CONFIG_ARM64)
> > 	for_each_memblock(memory, reg) {
> > 		unsigned long start = memblock_region_memory_base_pfn(reg);
> > 
> >         if (start < 4G) {
> >             flags |= __GFP_DMA;
> >             break;
> >         }
> >     }
> > #endif
> > 
> >     [...]
> > 
> >     xen_io_tlb_start = (void *)__get_free_pages(flags, order);
> >     
> > 
> > 
> > 
> > 
> > > This is also conceptually wrong since it doesn't matter where the pages
> > > are in PFN space,  but where they are in bus address (MFN) space (which
> > > is what the subsequent hypercall is required to sort out).
> > 
> > Actually on ARM dom0 is mapped 1:1, so it is the same thing.
> 
> On a system with a fully functional SMMU dom0 may not be 1:1 mapped, but
> I think that dom0 can still treat that as 1:1 mapped for these purposes,
> since the SMMU will provide that illusion.
> 
> Dumb question, and this might affect PVH too, if you have an IOMMU and a
> device with a limited DMA range, I suppose you need to provide DMA
> addresses in the <4G for the input to the IOMMU (i.e. PFN) and not the
> output (i.e. MFN) space (since the device only sees PFNs).

I think you mean "for the input to the device (PFN)", but I presume the
same.


> So even for x86 PVH isn't something required here to ensure that the
> swiotlb has suitable pages under 4GB in PFN space too?
> 
> (On ARM PFN==IPA and MFN==PA)

I guess that is true. PVH people, any thoughts?


> Second dumb question, on x86 PVH or ARM with an SMMU, would we even hit
> the Xen swiotlb code, or would we want to arrange to go via the native
> swiotlb paths? I initially thought the latter, but does e.g. grant
> mappings still need some special handling?

In both cases I think it would be simpler to just go through the normal
swiotlb.

FYI at the moment there is no knowledge about SMMUs availability on ARM.
The code is still missing.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [Xen-devel] [PATCH] xen: Add __GFP_DMA flag when xen_swiotlb_init gets free pages.
  2015-04-21 10:36             ` Stefano Stabellini
@ 2015-04-21 11:05               ` Ian Campbell
  -1 siblings, 0 replies; 30+ messages in thread
From: Ian Campbell @ 2015-04-21 11:05 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: David Vrabel, Chen Baozi, xen-devel, linux-kernel,
	linux-arm-kernel, Roger Pau Monne

On Tue, 2015-04-21 at 11:36 +0100, Stefano Stabellini wrote:
> On Tue, 21 Apr 2015, Ian Campbell wrote:
> > On Mon, 2015-04-20 at 18:54 +0100, Stefano Stabellini wrote:
> > > This should definitely be done only on ARM and ARM64, as on x86 PVH
> > > assumes the presence of an IOMMU. We need an ifdef.
> > > 
> > > Also we need to figure out a way to try without GFP_DMA in case no ram
> > > under 4g is available at all, as some arm64 platforms don't have any. Of
> > > course in those cases we don't need to worry about devices and their dma
> > > masks.  Maybe we could use memblock for that?
> > 
> > It's pretty ugly, but I've not got any better ideas.
> > 
> > It would perhaps be less ugly as a an arch-specific
> > get_me_a_swiotlb_region type function, with the bare __get_free_pages as
> > the generic fallback.
> 
> We could do that, but even open code like this isn't too bad: it might
> be ugly but at least is very obvious.
> 
> 
> > >  Something like:
> > > 
> > >     struct memblock_region *reg;
> > >     gfp_t flags = __GFP_NOWARN;
> > > 
> > > #if defined(CONFIG_ARM) || defined(CONFIG_ARM64)
> > > 	for_each_memblock(memory, reg) {
> > > 		unsigned long start = memblock_region_memory_base_pfn(reg);
> > > 
> > >         if (start < 4G) {
> > >             flags |= __GFP_DMA;
> > >             break;
> > >         }
> > >     }
> > > #endif
> > > 
> > >     [...]
> > > 
> > >     xen_io_tlb_start = (void *)__get_free_pages(flags, order);
> > >     
> > > 
> > > 
> > > 
> > > 
> > > > This is also conceptually wrong since it doesn't matter where the pages
> > > > are in PFN space,  but where they are in bus address (MFN) space (which
> > > > is what the subsequent hypercall is required to sort out).
> > > 
> > > Actually on ARM dom0 is mapped 1:1, so it is the same thing.
> > 
> > On a system with a fully functional SMMU dom0 may not be 1:1 mapped, but
> > I think that dom0 can still treat that as 1:1 mapped for these purposes,
> > since the SMMU will provide that illusion.
> > 
> > Dumb question, and this might affect PVH too, if you have an IOMMU and a
> > device with a limited DMA range, I suppose you need to provide DMA
> > addresses in the <4G for the input to the IOMMU (i.e. PFN) and not the
> > output (i.e. MFN) space (since the device only sees PFNs).
> 
> I think you mean "for the input to the device (PFN)", but I presume the
> same.

Yes, either the numbers the driver programs into the device, or the ones
the device itself puts on the bus (i.e. inputs to the IOMMU). They are
in the same address space.

> > So even for x86 PVH isn't something required here to ensure that the
> > swiotlb has suitable pages under 4GB in PFN space too?
> > 
> > (On ARM PFN==IPA and MFN==PA)
> 
> I guess that is true. PVH people, any thoughts?
> 
> 
> > Second dumb question, on x86 PVH or ARM with an SMMU, would we even hit
> > the Xen swiotlb code, or would we want to arrange to go via the native
> > swiotlb paths? I initially thought the latter, but does e.g. grant
> > mappings still need some special handling?
> 
> In both cases I think it would be simpler to just go through the normal
> swiotlb.

No special grant table magic in this case == great!

> FYI at the moment there is no knowledge about SMMUs availability on ARM.
> The code is still missing.

Right, I just want to avoid painting ourselves into a corner before that
stuff lands.


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [Xen-devel] [PATCH] xen: Add __GFP_DMA flag when xen_swiotlb_init gets free pages.
@ 2015-04-21 11:05               ` Ian Campbell
  0 siblings, 0 replies; 30+ messages in thread
From: Ian Campbell @ 2015-04-21 11:05 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, 2015-04-21 at 11:36 +0100, Stefano Stabellini wrote:
> On Tue, 21 Apr 2015, Ian Campbell wrote:
> > On Mon, 2015-04-20 at 18:54 +0100, Stefano Stabellini wrote:
> > > This should definitely be done only on ARM and ARM64, as on x86 PVH
> > > assumes the presence of an IOMMU. We need an ifdef.
> > > 
> > > Also we need to figure out a way to try without GFP_DMA in case no ram
> > > under 4g is available at all, as some arm64 platforms don't have any. Of
> > > course in those cases we don't need to worry about devices and their dma
> > > masks.  Maybe we could use memblock for that?
> > 
> > It's pretty ugly, but I've not got any better ideas.
> > 
> > It would perhaps be less ugly as a an arch-specific
> > get_me_a_swiotlb_region type function, with the bare __get_free_pages as
> > the generic fallback.
> 
> We could do that, but even open code like this isn't too bad: it might
> be ugly but at least is very obvious.
> 
> 
> > >  Something like:
> > > 
> > >     struct memblock_region *reg;
> > >     gfp_t flags = __GFP_NOWARN;
> > > 
> > > #if defined(CONFIG_ARM) || defined(CONFIG_ARM64)
> > > 	for_each_memblock(memory, reg) {
> > > 		unsigned long start = memblock_region_memory_base_pfn(reg);
> > > 
> > >         if (start < 4G) {
> > >             flags |= __GFP_DMA;
> > >             break;
> > >         }
> > >     }
> > > #endif
> > > 
> > >     [...]
> > > 
> > >     xen_io_tlb_start = (void *)__get_free_pages(flags, order);
> > >     
> > > 
> > > 
> > > 
> > > 
> > > > This is also conceptually wrong since it doesn't matter where the pages
> > > > are in PFN space,  but where they are in bus address (MFN) space (which
> > > > is what the subsequent hypercall is required to sort out).
> > > 
> > > Actually on ARM dom0 is mapped 1:1, so it is the same thing.
> > 
> > On a system with a fully functional SMMU dom0 may not be 1:1 mapped, but
> > I think that dom0 can still treat that as 1:1 mapped for these purposes,
> > since the SMMU will provide that illusion.
> > 
> > Dumb question, and this might affect PVH too, if you have an IOMMU and a
> > device with a limited DMA range, I suppose you need to provide DMA
> > addresses in the <4G for the input to the IOMMU (i.e. PFN) and not the
> > output (i.e. MFN) space (since the device only sees PFNs).
> 
> I think you mean "for the input to the device (PFN)", but I presume the
> same.

Yes, either the numbers the driver programs into the device, or the ones
the device itself puts on the bus (i.e. inputs to the IOMMU). They are
in the same address space.

> > So even for x86 PVH isn't something required here to ensure that the
> > swiotlb has suitable pages under 4GB in PFN space too?
> > 
> > (On ARM PFN==IPA and MFN==PA)
> 
> I guess that is true. PVH people, any thoughts?
> 
> 
> > Second dumb question, on x86 PVH or ARM with an SMMU, would we even hit
> > the Xen swiotlb code, or would we want to arrange to go via the native
> > swiotlb paths? I initially thought the latter, but does e.g. grant
> > mappings still need some special handling?
> 
> In both cases I think it would be simpler to just go through the normal
> swiotlb.

No special grant table magic in this case == great!

> FYI at the moment there is no knowledge about SMMUs availability on ARM.
> The code is still missing.

Right, I just want to avoid painting ourselves into a corner before that
stuff lands.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] xen: Add __GFP_DMA flag when xen_swiotlb_init gets free pages.
  2015-04-21 10:36             ` Stefano Stabellini
  (?)
  (?)
@ 2015-04-21 11:05             ` Ian Campbell
  -1 siblings, 0 replies; 30+ messages in thread
From: Ian Campbell @ 2015-04-21 11:05 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: linux-kernel, xen-devel, Chen Baozi, David Vrabel,
	linux-arm-kernel, Roger Pau Monne

On Tue, 2015-04-21 at 11:36 +0100, Stefano Stabellini wrote:
> On Tue, 21 Apr 2015, Ian Campbell wrote:
> > On Mon, 2015-04-20 at 18:54 +0100, Stefano Stabellini wrote:
> > > This should definitely be done only on ARM and ARM64, as on x86 PVH
> > > assumes the presence of an IOMMU. We need an ifdef.
> > > 
> > > Also we need to figure out a way to try without GFP_DMA in case no ram
> > > under 4g is available at all, as some arm64 platforms don't have any. Of
> > > course in those cases we don't need to worry about devices and their dma
> > > masks.  Maybe we could use memblock for that?
> > 
> > It's pretty ugly, but I've not got any better ideas.
> > 
> > It would perhaps be less ugly as a an arch-specific
> > get_me_a_swiotlb_region type function, with the bare __get_free_pages as
> > the generic fallback.
> 
> We could do that, but even open code like this isn't too bad: it might
> be ugly but at least is very obvious.
> 
> 
> > >  Something like:
> > > 
> > >     struct memblock_region *reg;
> > >     gfp_t flags = __GFP_NOWARN;
> > > 
> > > #if defined(CONFIG_ARM) || defined(CONFIG_ARM64)
> > > 	for_each_memblock(memory, reg) {
> > > 		unsigned long start = memblock_region_memory_base_pfn(reg);
> > > 
> > >         if (start < 4G) {
> > >             flags |= __GFP_DMA;
> > >             break;
> > >         }
> > >     }
> > > #endif
> > > 
> > >     [...]
> > > 
> > >     xen_io_tlb_start = (void *)__get_free_pages(flags, order);
> > >     
> > > 
> > > 
> > > 
> > > 
> > > > This is also conceptually wrong since it doesn't matter where the pages
> > > > are in PFN space,  but where they are in bus address (MFN) space (which
> > > > is what the subsequent hypercall is required to sort out).
> > > 
> > > Actually on ARM dom0 is mapped 1:1, so it is the same thing.
> > 
> > On a system with a fully functional SMMU dom0 may not be 1:1 mapped, but
> > I think that dom0 can still treat that as 1:1 mapped for these purposes,
> > since the SMMU will provide that illusion.
> > 
> > Dumb question, and this might affect PVH too, if you have an IOMMU and a
> > device with a limited DMA range, I suppose you need to provide DMA
> > addresses in the <4G for the input to the IOMMU (i.e. PFN) and not the
> > output (i.e. MFN) space (since the device only sees PFNs).
> 
> I think you mean "for the input to the device (PFN)", but I presume the
> same.

Yes, either the numbers the driver programs into the device, or the ones
the device itself puts on the bus (i.e. inputs to the IOMMU). They are
in the same address space.

> > So even for x86 PVH isn't something required here to ensure that the
> > swiotlb has suitable pages under 4GB in PFN space too?
> > 
> > (On ARM PFN==IPA and MFN==PA)
> 
> I guess that is true. PVH people, any thoughts?
> 
> 
> > Second dumb question, on x86 PVH or ARM with an SMMU, would we even hit
> > the Xen swiotlb code, or would we want to arrange to go via the native
> > swiotlb paths? I initially thought the latter, but does e.g. grant
> > mappings still need some special handling?
> 
> In both cases I think it would be simpler to just go through the normal
> swiotlb.

No special grant table magic in this case == great!

> FYI at the moment there is no knowledge about SMMUs availability on ARM.
> The code is still missing.

Right, I just want to avoid painting ourselves into a corner before that
stuff lands.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [Xen-devel] [PATCH] xen: Add __GFP_DMA flag when xen_swiotlb_init gets free pages.
  2015-04-21 10:36             ` Stefano Stabellini
@ 2015-04-21 11:11               ` Stefano Stabellini
  -1 siblings, 0 replies; 30+ messages in thread
From: Stefano Stabellini @ 2015-04-21 11:11 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Ian Campbell, David Vrabel, Chen Baozi, xen-devel, linux-kernel,
	linux-arm-kernel, Roger Pau Monne

On Tue, 21 Apr 2015, Stefano Stabellini wrote:
> On Tue, 21 Apr 2015, Ian Campbell wrote:
> > On Mon, 2015-04-20 at 18:54 +0100, Stefano Stabellini wrote:
> > > This should definitely be done only on ARM and ARM64, as on x86 PVH
> > > assumes the presence of an IOMMU. We need an ifdef.
> > > 
> > > Also we need to figure out a way to try without GFP_DMA in case no ram
> > > under 4g is available at all, as some arm64 platforms don't have any. Of
> > > course in those cases we don't need to worry about devices and their dma
> > > masks.  Maybe we could use memblock for that?
> > 
> > It's pretty ugly, but I've not got any better ideas.
> > 
> > It would perhaps be less ugly as a an arch-specific
> > get_me_a_swiotlb_region type function, with the bare __get_free_pages as
> > the generic fallback.
> 
> We could do that, but even open code like this isn't too bad: it might
> be ugly but at least is very obvious.

Chen,
could you please try the patch below in your repro scenario?
I have only build tested it.

---

xen: Add __GFP_DMA flag when xen_swiotlb_init gets free pages on ARM

From: Chen Baozi <baozich@gmail.com>

Make sure that xen_swiotlb_init allocates buffers that are DMA capable
when at least one memblock is available below 4G. Otherwise we assume
that all devices on the SoC can cope with >4G addresses.

No functional changes on x86.

Signed-off-by: Chen Baozi <baozich@gmail.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>

diff --git a/arch/arm/include/asm/xen/page.h b/arch/arm/include/asm/xen/page.h
index 2f7e6ff..0b579b2 100644
--- a/arch/arm/include/asm/xen/page.h
+++ b/arch/arm/include/asm/xen/page.h
@@ -110,5 +110,6 @@ static inline bool set_phys_to_machine(unsigned long pfn, unsigned long mfn)
 bool xen_arch_need_swiotlb(struct device *dev,
 			   unsigned long pfn,
 			   unsigned long mfn);
+unsigned long xen_get_swiotlb_free_pages(unsigned int order);
 
 #endif /* _ASM_ARM_XEN_PAGE_H */
diff --git a/arch/arm/xen/mm.c b/arch/arm/xen/mm.c
index 793551d..4983250 100644
--- a/arch/arm/xen/mm.c
+++ b/arch/arm/xen/mm.c
@@ -4,6 +4,7 @@
 #include <linux/gfp.h>
 #include <linux/highmem.h>
 #include <linux/export.h>
+#include <linux/memblock.h>
 #include <linux/of_address.h>
 #include <linux/slab.h>
 #include <linux/types.h>
@@ -21,6 +22,20 @@
 #include <asm/xen/hypercall.h>
 #include <asm/xen/interface.h>
 
+unsigned long xen_get_swiotlb_free_pages(unsigned int order)
+{
+	struct memblock_region *reg;
+	gfp_t flags = __GFP_NOWARN;
+
+	for_each_memblock(memory, reg) {
+		if (reg->base < (phys_addr_t)0xffffffff) {
+			flags |= __GFP_DMA;
+			break;
+		}
+	}
+	return __get_free_pages(flags, order);
+}
+
 enum dma_cache_op {
        DMA_UNMAP,
        DMA_MAP,
diff --git a/arch/x86/include/asm/xen/page.h b/arch/x86/include/asm/xen/page.h
index 358dcd3..c44a5d5 100644
--- a/arch/x86/include/asm/xen/page.h
+++ b/arch/x86/include/asm/xen/page.h
@@ -269,4 +269,9 @@ static inline bool xen_arch_need_swiotlb(struct device *dev,
 	return false;
 }
 
+static inline unsigned long xen_get_swiotlb_free_pages(unsigned int order)
+{
+	return __get_free_pages(__GFP_NOWARN, order);
+}
+
 #endif /* _ASM_X86_XEN_PAGE_H */
diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
index 810ad41..4c54932 100644
--- a/drivers/xen/swiotlb-xen.c
+++ b/drivers/xen/swiotlb-xen.c
@@ -235,7 +235,7 @@ retry:
 #define SLABS_PER_PAGE (1 << (PAGE_SHIFT - IO_TLB_SHIFT))
 #define IO_TLB_MIN_SLABS ((1<<20) >> IO_TLB_SHIFT)
 		while ((SLABS_PER_PAGE << order) > IO_TLB_MIN_SLABS) {
-			xen_io_tlb_start = (void *)__get_free_pages(__GFP_NOWARN, order);
+			xen_io_tlb_start = (void *)xen_get_swiotlb_free_pages(order);
 			if (xen_io_tlb_start)
 				break;
 			order--;

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [Xen-devel] [PATCH] xen: Add __GFP_DMA flag when xen_swiotlb_init gets free pages.
@ 2015-04-21 11:11               ` Stefano Stabellini
  0 siblings, 0 replies; 30+ messages in thread
From: Stefano Stabellini @ 2015-04-21 11:11 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, 21 Apr 2015, Stefano Stabellini wrote:
> On Tue, 21 Apr 2015, Ian Campbell wrote:
> > On Mon, 2015-04-20 at 18:54 +0100, Stefano Stabellini wrote:
> > > This should definitely be done only on ARM and ARM64, as on x86 PVH
> > > assumes the presence of an IOMMU. We need an ifdef.
> > > 
> > > Also we need to figure out a way to try without GFP_DMA in case no ram
> > > under 4g is available at all, as some arm64 platforms don't have any. Of
> > > course in those cases we don't need to worry about devices and their dma
> > > masks.  Maybe we could use memblock for that?
> > 
> > It's pretty ugly, but I've not got any better ideas.
> > 
> > It would perhaps be less ugly as a an arch-specific
> > get_me_a_swiotlb_region type function, with the bare __get_free_pages as
> > the generic fallback.
> 
> We could do that, but even open code like this isn't too bad: it might
> be ugly but at least is very obvious.

Chen,
could you please try the patch below in your repro scenario?
I have only build tested it.

---

xen: Add __GFP_DMA flag when xen_swiotlb_init gets free pages on ARM

From: Chen Baozi <baozich@gmail.com>

Make sure that xen_swiotlb_init allocates buffers that are DMA capable
when at least one memblock is available below 4G. Otherwise we assume
that all devices on the SoC can cope with >4G addresses.

No functional changes on x86.

Signed-off-by: Chen Baozi <baozich@gmail.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>

diff --git a/arch/arm/include/asm/xen/page.h b/arch/arm/include/asm/xen/page.h
index 2f7e6ff..0b579b2 100644
--- a/arch/arm/include/asm/xen/page.h
+++ b/arch/arm/include/asm/xen/page.h
@@ -110,5 +110,6 @@ static inline bool set_phys_to_machine(unsigned long pfn, unsigned long mfn)
 bool xen_arch_need_swiotlb(struct device *dev,
 			   unsigned long pfn,
 			   unsigned long mfn);
+unsigned long xen_get_swiotlb_free_pages(unsigned int order);
 
 #endif /* _ASM_ARM_XEN_PAGE_H */
diff --git a/arch/arm/xen/mm.c b/arch/arm/xen/mm.c
index 793551d..4983250 100644
--- a/arch/arm/xen/mm.c
+++ b/arch/arm/xen/mm.c
@@ -4,6 +4,7 @@
 #include <linux/gfp.h>
 #include <linux/highmem.h>
 #include <linux/export.h>
+#include <linux/memblock.h>
 #include <linux/of_address.h>
 #include <linux/slab.h>
 #include <linux/types.h>
@@ -21,6 +22,20 @@
 #include <asm/xen/hypercall.h>
 #include <asm/xen/interface.h>
 
+unsigned long xen_get_swiotlb_free_pages(unsigned int order)
+{
+	struct memblock_region *reg;
+	gfp_t flags = __GFP_NOWARN;
+
+	for_each_memblock(memory, reg) {
+		if (reg->base < (phys_addr_t)0xffffffff) {
+			flags |= __GFP_DMA;
+			break;
+		}
+	}
+	return __get_free_pages(flags, order);
+}
+
 enum dma_cache_op {
        DMA_UNMAP,
        DMA_MAP,
diff --git a/arch/x86/include/asm/xen/page.h b/arch/x86/include/asm/xen/page.h
index 358dcd3..c44a5d5 100644
--- a/arch/x86/include/asm/xen/page.h
+++ b/arch/x86/include/asm/xen/page.h
@@ -269,4 +269,9 @@ static inline bool xen_arch_need_swiotlb(struct device *dev,
 	return false;
 }
 
+static inline unsigned long xen_get_swiotlb_free_pages(unsigned int order)
+{
+	return __get_free_pages(__GFP_NOWARN, order);
+}
+
 #endif /* _ASM_X86_XEN_PAGE_H */
diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
index 810ad41..4c54932 100644
--- a/drivers/xen/swiotlb-xen.c
+++ b/drivers/xen/swiotlb-xen.c
@@ -235,7 +235,7 @@ retry:
 #define SLABS_PER_PAGE (1 << (PAGE_SHIFT - IO_TLB_SHIFT))
 #define IO_TLB_MIN_SLABS ((1<<20) >> IO_TLB_SHIFT)
 		while ((SLABS_PER_PAGE << order) > IO_TLB_MIN_SLABS) {
-			xen_io_tlb_start = (void *)__get_free_pages(__GFP_NOWARN, order);
+			xen_io_tlb_start = (void *)xen_get_swiotlb_free_pages(order);
 			if (xen_io_tlb_start)
 				break;
 			order--;

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [PATCH] xen: Add __GFP_DMA flag when xen_swiotlb_init gets free pages.
  2015-04-21 10:36             ` Stefano Stabellini
                               ` (2 preceding siblings ...)
  (?)
@ 2015-04-21 11:11             ` Stefano Stabellini
  -1 siblings, 0 replies; 30+ messages in thread
From: Stefano Stabellini @ 2015-04-21 11:11 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Ian Campbell, linux-kernel, xen-devel, Chen Baozi, David Vrabel,
	linux-arm-kernel, Roger Pau Monne

On Tue, 21 Apr 2015, Stefano Stabellini wrote:
> On Tue, 21 Apr 2015, Ian Campbell wrote:
> > On Mon, 2015-04-20 at 18:54 +0100, Stefano Stabellini wrote:
> > > This should definitely be done only on ARM and ARM64, as on x86 PVH
> > > assumes the presence of an IOMMU. We need an ifdef.
> > > 
> > > Also we need to figure out a way to try without GFP_DMA in case no ram
> > > under 4g is available at all, as some arm64 platforms don't have any. Of
> > > course in those cases we don't need to worry about devices and their dma
> > > masks.  Maybe we could use memblock for that?
> > 
> > It's pretty ugly, but I've not got any better ideas.
> > 
> > It would perhaps be less ugly as a an arch-specific
> > get_me_a_swiotlb_region type function, with the bare __get_free_pages as
> > the generic fallback.
> 
> We could do that, but even open code like this isn't too bad: it might
> be ugly but at least is very obvious.

Chen,
could you please try the patch below in your repro scenario?
I have only build tested it.

---

xen: Add __GFP_DMA flag when xen_swiotlb_init gets free pages on ARM

From: Chen Baozi <baozich@gmail.com>

Make sure that xen_swiotlb_init allocates buffers that are DMA capable
when at least one memblock is available below 4G. Otherwise we assume
that all devices on the SoC can cope with >4G addresses.

No functional changes on x86.

Signed-off-by: Chen Baozi <baozich@gmail.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>

diff --git a/arch/arm/include/asm/xen/page.h b/arch/arm/include/asm/xen/page.h
index 2f7e6ff..0b579b2 100644
--- a/arch/arm/include/asm/xen/page.h
+++ b/arch/arm/include/asm/xen/page.h
@@ -110,5 +110,6 @@ static inline bool set_phys_to_machine(unsigned long pfn, unsigned long mfn)
 bool xen_arch_need_swiotlb(struct device *dev,
 			   unsigned long pfn,
 			   unsigned long mfn);
+unsigned long xen_get_swiotlb_free_pages(unsigned int order);
 
 #endif /* _ASM_ARM_XEN_PAGE_H */
diff --git a/arch/arm/xen/mm.c b/arch/arm/xen/mm.c
index 793551d..4983250 100644
--- a/arch/arm/xen/mm.c
+++ b/arch/arm/xen/mm.c
@@ -4,6 +4,7 @@
 #include <linux/gfp.h>
 #include <linux/highmem.h>
 #include <linux/export.h>
+#include <linux/memblock.h>
 #include <linux/of_address.h>
 #include <linux/slab.h>
 #include <linux/types.h>
@@ -21,6 +22,20 @@
 #include <asm/xen/hypercall.h>
 #include <asm/xen/interface.h>
 
+unsigned long xen_get_swiotlb_free_pages(unsigned int order)
+{
+	struct memblock_region *reg;
+	gfp_t flags = __GFP_NOWARN;
+
+	for_each_memblock(memory, reg) {
+		if (reg->base < (phys_addr_t)0xffffffff) {
+			flags |= __GFP_DMA;
+			break;
+		}
+	}
+	return __get_free_pages(flags, order);
+}
+
 enum dma_cache_op {
        DMA_UNMAP,
        DMA_MAP,
diff --git a/arch/x86/include/asm/xen/page.h b/arch/x86/include/asm/xen/page.h
index 358dcd3..c44a5d5 100644
--- a/arch/x86/include/asm/xen/page.h
+++ b/arch/x86/include/asm/xen/page.h
@@ -269,4 +269,9 @@ static inline bool xen_arch_need_swiotlb(struct device *dev,
 	return false;
 }
 
+static inline unsigned long xen_get_swiotlb_free_pages(unsigned int order)
+{
+	return __get_free_pages(__GFP_NOWARN, order);
+}
+
 #endif /* _ASM_X86_XEN_PAGE_H */
diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
index 810ad41..4c54932 100644
--- a/drivers/xen/swiotlb-xen.c
+++ b/drivers/xen/swiotlb-xen.c
@@ -235,7 +235,7 @@ retry:
 #define SLABS_PER_PAGE (1 << (PAGE_SHIFT - IO_TLB_SHIFT))
 #define IO_TLB_MIN_SLABS ((1<<20) >> IO_TLB_SHIFT)
 		while ((SLABS_PER_PAGE << order) > IO_TLB_MIN_SLABS) {
-			xen_io_tlb_start = (void *)__get_free_pages(__GFP_NOWARN, order);
+			xen_io_tlb_start = (void *)xen_get_swiotlb_free_pages(order);
 			if (xen_io_tlb_start)
 				break;
 			order--;

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [Xen-devel] [PATCH] xen: Add __GFP_DMA flag when xen_swiotlb_init gets free pages.
  2015-04-21 10:36             ` Stefano Stabellini
@ 2015-04-21 12:01               ` Roger Pau Monné
  -1 siblings, 0 replies; 30+ messages in thread
From: Roger Pau Monné @ 2015-04-21 12:01 UTC (permalink / raw)
  To: Stefano Stabellini, Ian Campbell
  Cc: David Vrabel, Chen Baozi, xen-devel, linux-kernel, linux-arm-kernel

El 21/04/15 a les 12.36, Stefano Stabellini ha escrit:
> On Tue, 21 Apr 2015, Ian Campbell wrote:
>> On Mon, 2015-04-20 at 18:54 +0100, Stefano Stabellini wrote:
>>> This should definitely be done only on ARM and ARM64, as on x86 PVH
>>> assumes the presence of an IOMMU. We need an ifdef.
>>>
>>> Also we need to figure out a way to try without GFP_DMA in case no ram
>>> under 4g is available at all, as some arm64 platforms don't have any. Of
>>> course in those cases we don't need to worry about devices and their dma
>>> masks.  Maybe we could use memblock for that?
>>
>> It's pretty ugly, but I've not got any better ideas.
>>
>> It would perhaps be less ugly as a an arch-specific
>> get_me_a_swiotlb_region type function, with the bare __get_free_pages as
>> the generic fallback.
> 
> We could do that, but even open code like this isn't too bad: it might
> be ugly but at least is very obvious.
> 
> 
>>>  Something like:
>>>
>>>     struct memblock_region *reg;
>>>     gfp_t flags = __GFP_NOWARN;
>>>
>>> #if defined(CONFIG_ARM) || defined(CONFIG_ARM64)
>>> 	for_each_memblock(memory, reg) {
>>> 		unsigned long start = memblock_region_memory_base_pfn(reg);
>>>
>>>         if (start < 4G) {
>>>             flags |= __GFP_DMA;
>>>             break;
>>>         }
>>>     }
>>> #endif
>>>
>>>     [...]
>>>
>>>     xen_io_tlb_start = (void *)__get_free_pages(flags, order);
>>>     
>>>
>>>
>>>
>>>
>>>> This is also conceptually wrong since it doesn't matter where the pages
>>>> are in PFN space,  but where they are in bus address (MFN) space (which
>>>> is what the subsequent hypercall is required to sort out).
>>>
>>> Actually on ARM dom0 is mapped 1:1, so it is the same thing.
>>
>> On a system with a fully functional SMMU dom0 may not be 1:1 mapped, but
>> I think that dom0 can still treat that as 1:1 mapped for these purposes,
>> since the SMMU will provide that illusion.
>>
>> Dumb question, and this might affect PVH too, if you have an IOMMU and a
>> device with a limited DMA range, I suppose you need to provide DMA
>> addresses in the <4G for the input to the IOMMU (i.e. PFN) and not the
>> output (i.e. MFN) space (since the device only sees PFNs).
> 
> I think you mean "for the input to the device (PFN)", but I presume the
> same.
> 
> 
>> So even for x86 PVH isn't something required here to ensure that the
>> swiotlb has suitable pages under 4GB in PFN space too?
>>
>> (On ARM PFN==IPA and MFN==PA)
> 
> I guess that is true. PVH people, any thoughts?

AFAIK Linux PVH uses the native swiotlb, and FreeBSD does the same. I
expect the device expects PFNs (not MFNs) below the 4GB range, or else
the design is broken because there's no way the guest can figure out if
the MFN behind a PFN is below 4GB.

Roger.


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [Xen-devel] [PATCH] xen: Add __GFP_DMA flag when xen_swiotlb_init gets free pages.
@ 2015-04-21 12:01               ` Roger Pau Monné
  0 siblings, 0 replies; 30+ messages in thread
From: Roger Pau Monné @ 2015-04-21 12:01 UTC (permalink / raw)
  To: linux-arm-kernel

El 21/04/15 a les 12.36, Stefano Stabellini ha escrit:
> On Tue, 21 Apr 2015, Ian Campbell wrote:
>> On Mon, 2015-04-20 at 18:54 +0100, Stefano Stabellini wrote:
>>> This should definitely be done only on ARM and ARM64, as on x86 PVH
>>> assumes the presence of an IOMMU. We need an ifdef.
>>>
>>> Also we need to figure out a way to try without GFP_DMA in case no ram
>>> under 4g is available at all, as some arm64 platforms don't have any. Of
>>> course in those cases we don't need to worry about devices and their dma
>>> masks.  Maybe we could use memblock for that?
>>
>> It's pretty ugly, but I've not got any better ideas.
>>
>> It would perhaps be less ugly as a an arch-specific
>> get_me_a_swiotlb_region type function, with the bare __get_free_pages as
>> the generic fallback.
> 
> We could do that, but even open code like this isn't too bad: it might
> be ugly but at least is very obvious.
> 
> 
>>>  Something like:
>>>
>>>     struct memblock_region *reg;
>>>     gfp_t flags = __GFP_NOWARN;
>>>
>>> #if defined(CONFIG_ARM) || defined(CONFIG_ARM64)
>>> 	for_each_memblock(memory, reg) {
>>> 		unsigned long start = memblock_region_memory_base_pfn(reg);
>>>
>>>         if (start < 4G) {
>>>             flags |= __GFP_DMA;
>>>             break;
>>>         }
>>>     }
>>> #endif
>>>
>>>     [...]
>>>
>>>     xen_io_tlb_start = (void *)__get_free_pages(flags, order);
>>>     
>>>
>>>
>>>
>>>
>>>> This is also conceptually wrong since it doesn't matter where the pages
>>>> are in PFN space,  but where they are in bus address (MFN) space (which
>>>> is what the subsequent hypercall is required to sort out).
>>>
>>> Actually on ARM dom0 is mapped 1:1, so it is the same thing.
>>
>> On a system with a fully functional SMMU dom0 may not be 1:1 mapped, but
>> I think that dom0 can still treat that as 1:1 mapped for these purposes,
>> since the SMMU will provide that illusion.
>>
>> Dumb question, and this might affect PVH too, if you have an IOMMU and a
>> device with a limited DMA range, I suppose you need to provide DMA
>> addresses in the <4G for the input to the IOMMU (i.e. PFN) and not the
>> output (i.e. MFN) space (since the device only sees PFNs).
> 
> I think you mean "for the input to the device (PFN)", but I presume the
> same.
> 
> 
>> So even for x86 PVH isn't something required here to ensure that the
>> swiotlb has suitable pages under 4GB in PFN space too?
>>
>> (On ARM PFN==IPA and MFN==PA)
> 
> I guess that is true. PVH people, any thoughts?

AFAIK Linux PVH uses the native swiotlb, and FreeBSD does the same. I
expect the device expects PFNs (not MFNs) below the 4GB range, or else
the design is broken because there's no way the guest can figure out if
the MFN behind a PFN is below 4GB.

Roger.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] xen: Add __GFP_DMA flag when xen_swiotlb_init gets free pages.
  2015-04-21 10:36             ` Stefano Stabellini
                               ` (4 preceding siblings ...)
  (?)
@ 2015-04-21 12:01             ` Roger Pau Monné
  -1 siblings, 0 replies; 30+ messages in thread
From: Roger Pau Monné @ 2015-04-21 12:01 UTC (permalink / raw)
  To: Stefano Stabellini, Ian Campbell
  Cc: linux-kernel, Chen Baozi, David Vrabel, linux-arm-kernel, xen-devel

El 21/04/15 a les 12.36, Stefano Stabellini ha escrit:
> On Tue, 21 Apr 2015, Ian Campbell wrote:
>> On Mon, 2015-04-20 at 18:54 +0100, Stefano Stabellini wrote:
>>> This should definitely be done only on ARM and ARM64, as on x86 PVH
>>> assumes the presence of an IOMMU. We need an ifdef.
>>>
>>> Also we need to figure out a way to try without GFP_DMA in case no ram
>>> under 4g is available at all, as some arm64 platforms don't have any. Of
>>> course in those cases we don't need to worry about devices and their dma
>>> masks.  Maybe we could use memblock for that?
>>
>> It's pretty ugly, but I've not got any better ideas.
>>
>> It would perhaps be less ugly as a an arch-specific
>> get_me_a_swiotlb_region type function, with the bare __get_free_pages as
>> the generic fallback.
> 
> We could do that, but even open code like this isn't too bad: it might
> be ugly but at least is very obvious.
> 
> 
>>>  Something like:
>>>
>>>     struct memblock_region *reg;
>>>     gfp_t flags = __GFP_NOWARN;
>>>
>>> #if defined(CONFIG_ARM) || defined(CONFIG_ARM64)
>>> 	for_each_memblock(memory, reg) {
>>> 		unsigned long start = memblock_region_memory_base_pfn(reg);
>>>
>>>         if (start < 4G) {
>>>             flags |= __GFP_DMA;
>>>             break;
>>>         }
>>>     }
>>> #endif
>>>
>>>     [...]
>>>
>>>     xen_io_tlb_start = (void *)__get_free_pages(flags, order);
>>>     
>>>
>>>
>>>
>>>
>>>> This is also conceptually wrong since it doesn't matter where the pages
>>>> are in PFN space,  but where they are in bus address (MFN) space (which
>>>> is what the subsequent hypercall is required to sort out).
>>>
>>> Actually on ARM dom0 is mapped 1:1, so it is the same thing.
>>
>> On a system with a fully functional SMMU dom0 may not be 1:1 mapped, but
>> I think that dom0 can still treat that as 1:1 mapped for these purposes,
>> since the SMMU will provide that illusion.
>>
>> Dumb question, and this might affect PVH too, if you have an IOMMU and a
>> device with a limited DMA range, I suppose you need to provide DMA
>> addresses in the <4G for the input to the IOMMU (i.e. PFN) and not the
>> output (i.e. MFN) space (since the device only sees PFNs).
> 
> I think you mean "for the input to the device (PFN)", but I presume the
> same.
> 
> 
>> So even for x86 PVH isn't something required here to ensure that the
>> swiotlb has suitable pages under 4GB in PFN space too?
>>
>> (On ARM PFN==IPA and MFN==PA)
> 
> I guess that is true. PVH people, any thoughts?

AFAIK Linux PVH uses the native swiotlb, and FreeBSD does the same. I
expect the device expects PFNs (not MFNs) below the 4GB range, or else
the design is broken because there's no way the guest can figure out if
the MFN behind a PFN is below 4GB.

Roger.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [Xen-devel] [PATCH] xen: Add __GFP_DMA flag when xen_swiotlb_init gets free pages.
  2015-04-21 11:11               ` Stefano Stabellini
@ 2015-04-23  2:33                 ` Chen Baozi
  -1 siblings, 0 replies; 30+ messages in thread
From: Chen Baozi @ 2015-04-23  2:33 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Ian Campbell, David Vrabel, xen-devel, linux-kernel,
	linux-arm-kernel, Roger Pau Monne

On Tue, Apr 21, 2015 at 12:11:01PM +0100, Stefano Stabellini wrote:
> Chen,
> could you please try the patch below in your repro scenario?
> I have only build tested it.
> 
> ---
> 
> xen: Add __GFP_DMA flag when xen_swiotlb_init gets free pages on ARM
> 
> From: Chen Baozi <baozich@gmail.com>
> 
> Make sure that xen_swiotlb_init allocates buffers that are DMA capable
> when at least one memblock is available below 4G. Otherwise we assume
> that all devices on the SoC can cope with >4G addresses.
> 
> No functional changes on x86.
> 
> Signed-off-by: Chen Baozi <baozich@gmail.com>
> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>

Tested-by: Chen Baozi <baozich@gmail.com>

Cheers,

Baozi

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [Xen-devel] [PATCH] xen: Add __GFP_DMA flag when xen_swiotlb_init gets free pages.
@ 2015-04-23  2:33                 ` Chen Baozi
  0 siblings, 0 replies; 30+ messages in thread
From: Chen Baozi @ 2015-04-23  2:33 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Apr 21, 2015 at 12:11:01PM +0100, Stefano Stabellini wrote:
> Chen,
> could you please try the patch below in your repro scenario?
> I have only build tested it.
> 
> ---
> 
> xen: Add __GFP_DMA flag when xen_swiotlb_init gets free pages on ARM
> 
> From: Chen Baozi <baozich@gmail.com>
> 
> Make sure that xen_swiotlb_init allocates buffers that are DMA capable
> when at least one memblock is available below 4G. Otherwise we assume
> that all devices on the SoC can cope with >4G addresses.
> 
> No functional changes on x86.
> 
> Signed-off-by: Chen Baozi <baozich@gmail.com>
> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>

Tested-by: Chen Baozi <baozich@gmail.com>

Cheers,

Baozi

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] xen: Add __GFP_DMA flag when xen_swiotlb_init gets free pages.
  2015-04-21 11:11               ` Stefano Stabellini
  (?)
@ 2015-04-23  2:33               ` Chen Baozi
  -1 siblings, 0 replies; 30+ messages in thread
From: Chen Baozi @ 2015-04-23  2:33 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Ian Campbell, linux-kernel, xen-devel, David Vrabel,
	linux-arm-kernel, Roger Pau Monne

On Tue, Apr 21, 2015 at 12:11:01PM +0100, Stefano Stabellini wrote:
> Chen,
> could you please try the patch below in your repro scenario?
> I have only build tested it.
> 
> ---
> 
> xen: Add __GFP_DMA flag when xen_swiotlb_init gets free pages on ARM
> 
> From: Chen Baozi <baozich@gmail.com>
> 
> Make sure that xen_swiotlb_init allocates buffers that are DMA capable
> when at least one memblock is available below 4G. Otherwise we assume
> that all devices on the SoC can cope with >4G addresses.
> 
> No functional changes on x86.
> 
> Signed-off-by: Chen Baozi <baozich@gmail.com>
> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>

Tested-by: Chen Baozi <baozich@gmail.com>

Cheers,

Baozi

^ permalink raw reply	[flat|nested] 30+ messages in thread

end of thread, other threads:[~2015-04-23  2:33 UTC | newest]

Thread overview: 30+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-04-20 10:48 [PATCH] xen: Add __GFP_DMA flag when xen_swiotlb_init gets free pages Chen Baozi
2015-04-20 10:53 ` David Vrabel
2015-04-20 11:07   ` Chen Baozi
2015-04-20 11:07   ` [Xen-devel] " Chen Baozi
2015-04-20 11:07     ` Chen Baozi
2015-04-20 12:02     ` David Vrabel
2015-04-20 12:02     ` [Xen-devel] " David Vrabel
2015-04-20 12:02       ` David Vrabel
2015-04-20 17:54       ` Stefano Stabellini
2015-04-20 17:54         ` Stefano Stabellini
2015-04-21  7:57         ` Ian Campbell
2015-04-21  7:57         ` [Xen-devel] " Ian Campbell
2015-04-21  7:57           ` Ian Campbell
2015-04-21 10:36           ` Stefano Stabellini
2015-04-21 10:36             ` Stefano Stabellini
2015-04-21 11:05             ` Ian Campbell
2015-04-21 11:05               ` Ian Campbell
2015-04-21 11:05             ` Ian Campbell
2015-04-21 11:11             ` Stefano Stabellini
2015-04-21 11:11             ` [Xen-devel] " Stefano Stabellini
2015-04-21 11:11               ` Stefano Stabellini
2015-04-23  2:33               ` Chen Baozi
2015-04-23  2:33               ` [Xen-devel] " Chen Baozi
2015-04-23  2:33                 ` Chen Baozi
2015-04-21 12:01             ` Roger Pau Monné
2015-04-21 12:01             ` [Xen-devel] " Roger Pau Monné
2015-04-21 12:01               ` Roger Pau Monné
2015-04-21 10:36           ` Stefano Stabellini
2015-04-20 17:54       ` Stefano Stabellini
2015-04-20 13:37 ` Konrad Rzeszutek Wilk

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.