linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v4] irqchip/gicv3-its: Avoid memory over allocation for ITEs
@ 2017-04-30 14:36 Shanker Donthineni
  2017-05-02 16:16 ` Marc Zyngier
  0 siblings, 1 reply; 6+ messages in thread
From: Shanker Donthineni @ 2017-04-30 14:36 UTC (permalink / raw)
  To: Marc Zyngier, linux-kernel, linux-arm-kernel
  Cc: Thomas Gleixner, Jason Cooper, Vikram Sethi, Shanker Donthineni

We are always allocating extra 255Bytes of memory to handle ITE
physical address alignment requirement. The kmalloc() satisfies
the ITE alignment since the ITS driver is requesting a minimum
size of ITS_ITT_ALIGN bytes.

Let's try to allocate the exact amount of memory that is required
for ITEs to avoid wastage.

Signed-off-by: Shanker Donthineni <shankerd@codeaurora.org>
---
Changes:
v2: removed 'Change-Id: Ia8084189833f2081ff13c392deb5070c46a64038' from commit.
v3: changed from IITE to ITE.
v3: removed fallback since kmalloc() guarantees the right alignment.

 drivers/irqchip/irq-gic-v3-its.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
index 45ea1933..72e56f03 100644
--- a/drivers/irqchip/irq-gic-v3-its.c
+++ b/drivers/irqchip/irq-gic-v3-its.c
@@ -261,7 +261,6 @@ static struct its_collection *its_build_mapd_cmd(struct its_cmd_block *cmd,
 	u8 size = ilog2(desc->its_mapd_cmd.dev->nr_ites);
 
 	itt_addr = virt_to_phys(desc->its_mapd_cmd.dev->itt);
-	itt_addr = ALIGN(itt_addr, ITS_ITT_ALIGN);
 
 	its_encode_cmd(cmd, GITS_CMD_MAPD);
 	its_encode_devid(cmd, desc->its_mapd_cmd.dev->device_id);
@@ -1329,13 +1328,14 @@ static struct its_device *its_create_device(struct its_node *its, u32 dev_id,
 	 */
 	nr_ites = max(2UL, roundup_pow_of_two(nvecs));
 	sz = nr_ites * its->ite_size;
-	sz = max(sz, ITS_ITT_ALIGN) + ITS_ITT_ALIGN - 1;
+	sz = max(sz, ITS_ITT_ALIGN);
 	itt = kzalloc(sz, GFP_KERNEL);
 	lpi_map = its_lpi_alloc_chunks(nvecs, &lpi_base, &nr_lpis);
 	if (lpi_map)
 		col_map = kzalloc(sizeof(*col_map) * nr_lpis, GFP_KERNEL);
 
-	if (!dev || !itt || !lpi_map || !col_map) {
+	if (!dev || !itt || !lpi_map || !col_map ||
+	    !IS_ALIGNED(virt_to_phys(itt), ITS_ITT_ALIGN)) {
 		kfree(dev);
 		kfree(itt);
 		kfree(lpi_map);
-- 
Qualcomm Datacenter Technologies, Inc. on behalf of the Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH v4] irqchip/gicv3-its: Avoid memory over allocation for ITEs
  2017-04-30 14:36 [PATCH v4] irqchip/gicv3-its: Avoid memory over allocation for ITEs Shanker Donthineni
@ 2017-05-02 16:16 ` Marc Zyngier
  2017-05-05 22:04   ` Shanker Donthineni
  0 siblings, 1 reply; 6+ messages in thread
From: Marc Zyngier @ 2017-05-02 16:16 UTC (permalink / raw)
  To: Shanker Donthineni
  Cc: linux-kernel, linux-arm-kernel, Thomas Gleixner, Jason Cooper,
	Vikram Sethi

On Sun, Apr 30 2017 at  3:36:15 pm BST, Shanker Donthineni <shankerd@codeaurora.org> wrote:
> We are always allocating extra 255Bytes of memory to handle ITE
> physical address alignment requirement. The kmalloc() satisfies
> the ITE alignment since the ITS driver is requesting a minimum
> size of ITS_ITT_ALIGN bytes.
>
> Let's try to allocate the exact amount of memory that is required
> for ITEs to avoid wastage.
>
> Signed-off-by: Shanker Donthineni <shankerd@codeaurora.org>
> ---
> Changes:
> v2: removed 'Change-Id: Ia8084189833f2081ff13c392deb5070c46a64038' from commit.
> v3: changed from IITE to ITE.
> v3: removed fallback since kmalloc() guarantees the right alignment.
>
>  drivers/irqchip/irq-gic-v3-its.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
> index 45ea1933..72e56f03 100644
> --- a/drivers/irqchip/irq-gic-v3-its.c
> +++ b/drivers/irqchip/irq-gic-v3-its.c
> @@ -261,7 +261,6 @@ static struct its_collection *its_build_mapd_cmd(struct its_cmd_block *cmd,
>  	u8 size = ilog2(desc->its_mapd_cmd.dev->nr_ites);
>  
>  	itt_addr = virt_to_phys(desc->its_mapd_cmd.dev->itt);
> -	itt_addr = ALIGN(itt_addr, ITS_ITT_ALIGN);
>  
>  	its_encode_cmd(cmd, GITS_CMD_MAPD);
>  	its_encode_devid(cmd, desc->its_mapd_cmd.dev->device_id);
> @@ -1329,13 +1328,14 @@ static struct its_device *its_create_device(struct its_node *its, u32 dev_id,
>  	 */
>  	nr_ites = max(2UL, roundup_pow_of_two(nvecs));
>  	sz = nr_ites * its->ite_size;
> -	sz = max(sz, ITS_ITT_ALIGN) + ITS_ITT_ALIGN - 1;
> +	sz = max(sz, ITS_ITT_ALIGN);
>  	itt = kzalloc(sz, GFP_KERNEL);
>  	lpi_map = its_lpi_alloc_chunks(nvecs, &lpi_base, &nr_lpis);
>  	if (lpi_map)
>  		col_map = kzalloc(sizeof(*col_map) * nr_lpis, GFP_KERNEL);
>  
> -	if (!dev || !itt || !lpi_map || !col_map) {
> +	if (!dev || !itt || !lpi_map || !col_map ||
> +	    !IS_ALIGNED(virt_to_phys(itt), ITS_ITT_ALIGN)) {
>  		kfree(dev);
>  		kfree(itt);
>  		kfree(lpi_map);

I'm confused. Either the alignment is guaranteed (and you should
document why it is so), or it is not, and we need to handle the
non-alignment (instead of failing).

Thanks,

	M.
-- 
Jazz is not dead, it just smell funny.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v4] irqchip/gicv3-its: Avoid memory over allocation for ITEs
  2017-05-02 16:16 ` Marc Zyngier
@ 2017-05-05 22:04   ` Shanker Donthineni
  2017-05-06 11:25     ` Marc Zyngier
  0 siblings, 1 reply; 6+ messages in thread
From: Shanker Donthineni @ 2017-05-05 22:04 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: Vikram Sethi, Thomas Gleixner, linux-kernel, linux-arm-kernel,
	Jason Cooper

Hi Marc,


On 05/02/2017 11:16 AM, Marc Zyngier wrote:
> On Sun, Apr 30 2017 at  3:36:15 pm BST, Shanker Donthineni <shankerd@codeaurora.org> wrote:
>> We are always allocating extra 255Bytes of memory to handle ITE
>> physical address alignment requirement. The kmalloc() satisfies
>> the ITE alignment since the ITS driver is requesting a minimum
>> size of ITS_ITT_ALIGN bytes.
>>
>> Let's try to allocate the exact amount of memory that is required
>> for ITEs to avoid wastage.
>>
>> Signed-off-by: Shanker Donthineni <shankerd@codeaurora.org>
>> ---
>> Changes:
>> v2: removed 'Change-Id: Ia8084189833f2081ff13c392deb5070c46a64038' from commit.
>> v3: changed from IITE to ITE.
>> v3: removed fallback since kmalloc() guarantees the right alignment.
>>
>>  drivers/irqchip/irq-gic-v3-its.c | 6 +++---
>>  1 file changed, 3 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
>> index 45ea1933..72e56f03 100644
>> --- a/drivers/irqchip/irq-gic-v3-its.c
>> +++ b/drivers/irqchip/irq-gic-v3-its.c
>> @@ -261,7 +261,6 @@ static struct its_collection *its_build_mapd_cmd(struct its_cmd_block *cmd,
>>  	u8 size = ilog2(desc->its_mapd_cmd.dev->nr_ites);
>>  
>>  	itt_addr = virt_to_phys(desc->its_mapd_cmd.dev->itt);
>> -	itt_addr = ALIGN(itt_addr, ITS_ITT_ALIGN);
>>  
>>  	its_encode_cmd(cmd, GITS_CMD_MAPD);
>>  	its_encode_devid(cmd, desc->its_mapd_cmd.dev->device_id);
>> @@ -1329,13 +1328,14 @@ static struct its_device *its_create_device(struct its_node *its, u32 dev_id,
>>  	 */
>>  	nr_ites = max(2UL, roundup_pow_of_two(nvecs));
>>  	sz = nr_ites * its->ite_size;
>> -	sz = max(sz, ITS_ITT_ALIGN) + ITS_ITT_ALIGN - 1;
>> +	sz = max(sz, ITS_ITT_ALIGN);
>>  	itt = kzalloc(sz, GFP_KERNEL);
>>  	lpi_map = its_lpi_alloc_chunks(nvecs, &lpi_base, &nr_lpis);
>>  	if (lpi_map)
>>  		col_map = kzalloc(sizeof(*col_map) * nr_lpis, GFP_KERNEL);
>>  
>> -	if (!dev || !itt || !lpi_map || !col_map) {
>> +	if (!dev || !itt || !lpi_map || !col_map ||
>> +	    !IS_ALIGNED(virt_to_phys(itt), ITS_ITT_ALIGN)) {
>>  		kfree(dev);
>>  		kfree(itt);
>>  		kfree(lpi_map);
> I'm confused. Either the alignment is guaranteed (and you should
> document why it is so), or it is not, and we need to handle the
> non-alignment (instead of failing).

Sorry for confusion, alignment is guaranteed by kmalloc(), added a check for readability purpose only can be removed.
 
> Thanks,
>
> 	M.

-- 
Shanker Donthineni
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v4] irqchip/gicv3-its: Avoid memory over allocation for ITEs
  2017-05-05 22:04   ` Shanker Donthineni
@ 2017-05-06 11:25     ` Marc Zyngier
  2017-06-21  1:22       ` Shanker Donthineni
  0 siblings, 1 reply; 6+ messages in thread
From: Marc Zyngier @ 2017-05-06 11:25 UTC (permalink / raw)
  To: Shanker Donthineni
  Cc: Vikram Sethi, Thomas Gleixner, linux-kernel, linux-arm-kernel,
	Jason Cooper

On Fri, May 05 2017 at 11:04:22 pm BST, Shanker Donthineni <shankerd@codeaurora.org> wrote:
> Hi Marc,
>
>
> On 05/02/2017 11:16 AM, Marc Zyngier wrote:
>> On Sun, Apr 30 2017 at  3:36:15 pm BST, Shanker Donthineni <shankerd@codeaurora.org> wrote:
>>> We are always allocating extra 255Bytes of memory to handle ITE
>>> physical address alignment requirement. The kmalloc() satisfies
>>> the ITE alignment since the ITS driver is requesting a minimum
>>> size of ITS_ITT_ALIGN bytes.
>>>
>>> Let's try to allocate the exact amount of memory that is required
>>> for ITEs to avoid wastage.
>>>
>>> Signed-off-by: Shanker Donthineni <shankerd@codeaurora.org>
>>> ---
>>> Changes:
>>> v2: removed 'Change-Id: Ia8084189833f2081ff13c392deb5070c46a64038' from commit.
>>> v3: changed from IITE to ITE.
>>> v3: removed fallback since kmalloc() guarantees the right alignment.
>>>
>>>  drivers/irqchip/irq-gic-v3-its.c | 6 +++---
>>>  1 file changed, 3 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
>>> index 45ea1933..72e56f03 100644
>>> --- a/drivers/irqchip/irq-gic-v3-its.c
>>> +++ b/drivers/irqchip/irq-gic-v3-its.c
>>> @@ -261,7 +261,6 @@ static struct its_collection *its_build_mapd_cmd(struct its_cmd_block *cmd,
>>>  	u8 size = ilog2(desc->its_mapd_cmd.dev->nr_ites);
>>>  
>>>  	itt_addr = virt_to_phys(desc->its_mapd_cmd.dev->itt);
>>> -	itt_addr = ALIGN(itt_addr, ITS_ITT_ALIGN);
>>>  
>>>  	its_encode_cmd(cmd, GITS_CMD_MAPD);
>>>  	its_encode_devid(cmd, desc->its_mapd_cmd.dev->device_id);
>>> @@ -1329,13 +1328,14 @@ static struct its_device *its_create_device(struct its_node *its, u32 dev_id,
>>>  	 */
>>>  	nr_ites = max(2UL, roundup_pow_of_two(nvecs));
>>>  	sz = nr_ites * its->ite_size;
>>> -	sz = max(sz, ITS_ITT_ALIGN) + ITS_ITT_ALIGN - 1;
>>> +	sz = max(sz, ITS_ITT_ALIGN);
>>>  	itt = kzalloc(sz, GFP_KERNEL);
>>>  	lpi_map = its_lpi_alloc_chunks(nvecs, &lpi_base, &nr_lpis);
>>>  	if (lpi_map)
>>>  		col_map = kzalloc(sizeof(*col_map) * nr_lpis, GFP_KERNEL);
>>>  
>>> -	if (!dev || !itt || !lpi_map || !col_map) {
>>> +	if (!dev || !itt || !lpi_map || !col_map ||
>>> +	    !IS_ALIGNED(virt_to_phys(itt), ITS_ITT_ALIGN)) {
>>>  		kfree(dev);
>>>  		kfree(itt);
>>>  		kfree(lpi_map);
>> I'm confused. Either the alignment is guaranteed (and you should
>> document why it is so), or it is not, and we need to handle the
>> non-alignment (instead of failing).
>
> Sorry for confusion, alignment is guaranteed by kmalloc(), added a
> check for readability purpose only can be removed.

My question still remains. Where exactly is that alignment guarantee
documented and enforced? I can't see anything giving that certainty.

I would expect kmalloc to give you something that is cache-line aligned,
but probably nothing more than that. Now, I'd happily be proven wrong,
but so far, all I can see is that:

- ARCH_KMALLOC_MINALIGN is defined as ARCH_DMA_MINALIGN
- ARCH_DMA_MINALIGN is defined as L1_CACHE_BYTES
- L1_CACHE_BYTES is 128 on arm64, and either 32, 64, or 128 on arm.

What am I missing?

Thanks,

	M.
-- 
Jazz is not dead, it just smell funny.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v4] irqchip/gicv3-its: Avoid memory over allocation for ITEs
  2017-05-06 11:25     ` Marc Zyngier
@ 2017-06-21  1:22       ` Shanker Donthineni
  2017-06-21  7:30         ` Marc Zyngier
  0 siblings, 1 reply; 6+ messages in thread
From: Shanker Donthineni @ 2017-06-21  1:22 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: Vikram Sethi, Thomas Gleixner, linux-kernel, linux-arm-kernel,
	Jason Cooper

Hi Marc,

On 05/06/2017 06:25 AM, Marc Zyngier wrote:
> On Fri, May 05 2017 at 11:04:22 pm BST, Shanker Donthineni <shankerd@codeaurora.org> wrote:
>> Hi Marc,
>>
>>
>> On 05/02/2017 11:16 AM, Marc Zyngier wrote:
>>> On Sun, Apr 30 2017 at  3:36:15 pm BST, Shanker Donthineni <shankerd@codeaurora.org> wrote:
>>>> We are always allocating extra 255Bytes of memory to handle ITE
>>>> physical address alignment requirement. The kmalloc() satisfies
>>>> the ITE alignment since the ITS driver is requesting a minimum
>>>> size of ITS_ITT_ALIGN bytes.
>>>>
>>>> Let's try to allocate the exact amount of memory that is required
>>>> for ITEs to avoid wastage.
>>>>
>>>> Signed-off-by: Shanker Donthineni <shankerd@codeaurora.org>
>>>> ---
>>>> Changes:
>>>> v2: removed 'Change-Id: Ia8084189833f2081ff13c392deb5070c46a64038' from commit.
>>>> v3: changed from IITE to ITE.
>>>> v3: removed fallback since kmalloc() guarantees the right alignment.
>>>>
>>>>  drivers/irqchip/irq-gic-v3-its.c | 6 +++---
>>>>  1 file changed, 3 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
>>>> index 45ea1933..72e56f03 100644
>>>> --- a/drivers/irqchip/irq-gic-v3-its.c
>>>> +++ b/drivers/irqchip/irq-gic-v3-its.c
>>>> @@ -261,7 +261,6 @@ static struct its_collection *its_build_mapd_cmd(struct its_cmd_block *cmd,
>>>>  	u8 size = ilog2(desc->its_mapd_cmd.dev->nr_ites);
>>>>  
>>>>  	itt_addr = virt_to_phys(desc->its_mapd_cmd.dev->itt);
>>>> -	itt_addr = ALIGN(itt_addr, ITS_ITT_ALIGN);
>>>>  
>>>>  	its_encode_cmd(cmd, GITS_CMD_MAPD);
>>>>  	its_encode_devid(cmd, desc->its_mapd_cmd.dev->device_id);
>>>> @@ -1329,13 +1328,14 @@ static struct its_device *its_create_device(struct its_node *its, u32 dev_id,
>>>>  	 */
>>>>  	nr_ites = max(2UL, roundup_pow_of_two(nvecs));
>>>>  	sz = nr_ites * its->ite_size;
>>>> -	sz = max(sz, ITS_ITT_ALIGN) + ITS_ITT_ALIGN - 1;
>>>> +	sz = max(sz, ITS_ITT_ALIGN);
>>>>  	itt = kzalloc(sz, GFP_KERNEL);
>>>>  	lpi_map = its_lpi_alloc_chunks(nvecs, &lpi_base, &nr_lpis);
>>>>  	if (lpi_map)
>>>>  		col_map = kzalloc(sizeof(*col_map) * nr_lpis, GFP_KERNEL);
>>>>  
>>>> -	if (!dev || !itt || !lpi_map || !col_map) {
>>>> +	if (!dev || !itt || !lpi_map || !col_map ||
>>>> +	    !IS_ALIGNED(virt_to_phys(itt), ITS_ITT_ALIGN)) {
>>>>  		kfree(dev);
>>>>  		kfree(itt);
>>>>  		kfree(lpi_map);
>>> I'm confused. Either the alignment is guaranteed (and you should
>>> document why it is so), or it is not, and we need to handle the
>>> non-alignment (instead of failing).
>>
>> Sorry for confusion, alignment is guaranteed by kmalloc(), added a
>> check for readability purpose only can be removed.
> 
> My question still remains. Where exactly is that alignment guarantee
> documented and enforced? I can't see anything giving that certainty.
> 

The internal implementation of kmalloc() uses the slab/slub feature to allocate memory from 2^N size pool. Linux kernel maintains the fixed size of kmem_cache pools to serve the kmalloc(), It allocates minimum size of 128Bytes and maximum size depends on the system configuration and memory availability. In fact SMMUv3 driver has a similar requirement and absolutely there no problem using kmalloc() to meet the address alignment requirement.

Call trace:
    kmalloc()
      kmalloc_slab() --> convert size to kmem_cache
      slab_alloc()   ---> allocate 2^N size kmem_cache object
   
root@null-8cfdf006971f:~# cat /proc/slabinfo | grep kmall
dma-kmalloc-131072     0      0 131072    4    8 : tunables    0    0    0 : slabdata      0      0      0
dma-kmalloc-65536      0      0  65536    8    8 : tunables    0    0    0 : slabdata      0      0      0
dma-kmalloc-32768      0      0  32768   16    8 : tunables    0    0    0 : slabdata      0      0      0
dma-kmalloc-16384      0      0  16384   32    8 : tunables    0    0    0 : slabdata      0      0      0
dma-kmalloc-8192       0      0   8192   32    4 : tunables    0    0    0 : slabdata      0      0      0
dma-kmalloc-4096       0      0   4096   32    2 : tunables    0    0    0 : slabdata      0      0      0
dma-kmalloc-2048       0      0   2048   32    1 : tunables    0    0    0 : slabdata      0      0      0
dma-kmalloc-1024       0      0   1024   64    1 : tunables    0    0    0 : slabdata      0      0      0
dma-kmalloc-512      128    128    512  128    1 : tunables    0    0    0 : slabdata      1      1      0
dma-kmalloc-256        0      0    256  256    1 : tunables    0    0    0 : slabdata      0      0      0
dma-kmalloc-128      512    512    128  512    1 : tunables    0    0    0 : slabdata      1      1      0
kmalloc-131072         4      4 131072    4    8 : tunables    0    0    0 : slabdata      1      1      0
kmalloc-65536        376    376  65536    8    8 : tunables    0    0    0 : slabdata     47     47      0
kmalloc-32768        320    320  32768   16    8 : tunables    0    0    0 : slabdata     20     20      0
kmalloc-16384       5248   5248  16384   32    8 : tunables    0    0    0 : slabdata    164    164      0
kmalloc-8192        2176   2176   8192   32    4 : tunables    0    0    0 : slabdata     68     68      0
kmalloc-4096        4452   4576   4096   32    2 : tunables    0    0    0 : slabdata    143    143      0
kmalloc-2048        4416   4416   2048   32    1 : tunables    0    0    0 : slabdata    138    138      0
kmalloc-1024       10048  10176   1024   64    1 : tunables    0    0    0 : slabdata    159    159      0
kmalloc-512        19071  19584    512  128    1 : tunables    0    0    0 : slabdata    153    153      0
kmalloc-256        75873  77312    256  256    1 : tunables    0    0    0 : slabdata    302    302      0
kmalloc-128        82078  85504    128  512    1 : tunables    0    0    0 : slabdata    167    167      0
             

> I would expect kmalloc to give you something that is cache-line aligned,
> but probably nothing more than that. Now, I'd happily be proven wrong,
> but so far, all I can see is that:
> 
> - ARCH_KMALLOC_MINALIGN is defined as ARCH_DMA_MINALIGN
> - ARCH_DMA_MINALIGN is defined as L1_CACHE_BYTES
> - L1_CACHE_BYTES is 128 on arm64, and either 32, 64, or 128 on arm.
> 

Kmalloc always allocates memory with size=roundup_pow_of_two(size) and address alignment roundup_pow_of_two(size).   

> What am I missing?
> 
> Thanks,
> 
> 	M.
> 

-- 
Shanker Donthineni
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v4] irqchip/gicv3-its: Avoid memory over allocation for ITEs
  2017-06-21  1:22       ` Shanker Donthineni
@ 2017-06-21  7:30         ` Marc Zyngier
  0 siblings, 0 replies; 6+ messages in thread
From: Marc Zyngier @ 2017-06-21  7:30 UTC (permalink / raw)
  To: shankerd
  Cc: Vikram Sethi, Thomas Gleixner, linux-kernel, linux-arm-kernel,
	Jason Cooper

On 21/06/17 02:22, Shanker Donthineni wrote:
> Hi Marc,
> 
> On 05/06/2017 06:25 AM, Marc Zyngier wrote:
>> On Fri, May 05 2017 at 11:04:22 pm BST, Shanker Donthineni <shankerd@codeaurora.org> wrote:
>>> Hi Marc,
>>>
>>>
>>> On 05/02/2017 11:16 AM, Marc Zyngier wrote:
>>>> On Sun, Apr 30 2017 at  3:36:15 pm BST, Shanker Donthineni <shankerd@codeaurora.org> wrote:
>>>>> We are always allocating extra 255Bytes of memory to handle ITE
>>>>> physical address alignment requirement. The kmalloc() satisfies
>>>>> the ITE alignment since the ITS driver is requesting a minimum
>>>>> size of ITS_ITT_ALIGN bytes.
>>>>>
>>>>> Let's try to allocate the exact amount of memory that is required
>>>>> for ITEs to avoid wastage.
>>>>>
>>>>> Signed-off-by: Shanker Donthineni <shankerd@codeaurora.org>
>>>>> ---
>>>>> Changes:
>>>>> v2: removed 'Change-Id: Ia8084189833f2081ff13c392deb5070c46a64038' from commit.
>>>>> v3: changed from IITE to ITE.
>>>>> v3: removed fallback since kmalloc() guarantees the right alignment.
>>>>>
>>>>>  drivers/irqchip/irq-gic-v3-its.c | 6 +++---
>>>>>  1 file changed, 3 insertions(+), 3 deletions(-)
>>>>>
>>>>> diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
>>>>> index 45ea1933..72e56f03 100644
>>>>> --- a/drivers/irqchip/irq-gic-v3-its.c
>>>>> +++ b/drivers/irqchip/irq-gic-v3-its.c
>>>>> @@ -261,7 +261,6 @@ static struct its_collection *its_build_mapd_cmd(struct its_cmd_block *cmd,
>>>>>  	u8 size = ilog2(desc->its_mapd_cmd.dev->nr_ites);
>>>>>  
>>>>>  	itt_addr = virt_to_phys(desc->its_mapd_cmd.dev->itt);
>>>>> -	itt_addr = ALIGN(itt_addr, ITS_ITT_ALIGN);
>>>>>  
>>>>>  	its_encode_cmd(cmd, GITS_CMD_MAPD);
>>>>>  	its_encode_devid(cmd, desc->its_mapd_cmd.dev->device_id);
>>>>> @@ -1329,13 +1328,14 @@ static struct its_device *its_create_device(struct its_node *its, u32 dev_id,
>>>>>  	 */
>>>>>  	nr_ites = max(2UL, roundup_pow_of_two(nvecs));
>>>>>  	sz = nr_ites * its->ite_size;
>>>>> -	sz = max(sz, ITS_ITT_ALIGN) + ITS_ITT_ALIGN - 1;
>>>>> +	sz = max(sz, ITS_ITT_ALIGN);
>>>>>  	itt = kzalloc(sz, GFP_KERNEL);
>>>>>  	lpi_map = its_lpi_alloc_chunks(nvecs, &lpi_base, &nr_lpis);
>>>>>  	if (lpi_map)
>>>>>  		col_map = kzalloc(sizeof(*col_map) * nr_lpis, GFP_KERNEL);
>>>>>  
>>>>> -	if (!dev || !itt || !lpi_map || !col_map) {
>>>>> +	if (!dev || !itt || !lpi_map || !col_map ||
>>>>> +	    !IS_ALIGNED(virt_to_phys(itt), ITS_ITT_ALIGN)) {
>>>>>  		kfree(dev);
>>>>>  		kfree(itt);
>>>>>  		kfree(lpi_map);
>>>> I'm confused. Either the alignment is guaranteed (and you should
>>>> document why it is so), or it is not, and we need to handle the
>>>> non-alignment (instead of failing).
>>>
>>> Sorry for confusion, alignment is guaranteed by kmalloc(), added a
>>> check for readability purpose only can be removed.
>>
>> My question still remains. Where exactly is that alignment guarantee
>> documented and enforced? I can't see anything giving that certainty.
>>
> 
> The internal implementation of kmalloc() uses the slab/slub feature
> to allocate memory from 2^N size pool. Linux kernel maintains the
> fixed size of kmem_cache pools to serve the kmalloc(), It allocates
> minimum size of 128Bytes and maximum size depends on the system
> configuration and memory availability. In fact SMMUv3 driver has a
> similar requirement and absolutely there no problem using kmalloc()
> to meet the address alignment requirement.
> 
> Call trace:
>     kmalloc()
>       kmalloc_slab() --> convert size to kmem_cache
>       slab_alloc()   ---> allocate 2^N size kmem_cache object
>    
> root@null-8cfdf006971f:~# cat /proc/slabinfo | grep kmall
> dma-kmalloc-131072     0      0 131072    4    8 : tunables    0    0    0 : slabdata      0      0      0
> dma-kmalloc-65536      0      0  65536    8    8 : tunables    0    0    0 : slabdata      0      0      0
> dma-kmalloc-32768      0      0  32768   16    8 : tunables    0    0    0 : slabdata      0      0      0
> dma-kmalloc-16384      0      0  16384   32    8 : tunables    0    0    0 : slabdata      0      0      0
> dma-kmalloc-8192       0      0   8192   32    4 : tunables    0    0    0 : slabdata      0      0      0
> dma-kmalloc-4096       0      0   4096   32    2 : tunables    0    0    0 : slabdata      0      0      0
> dma-kmalloc-2048       0      0   2048   32    1 : tunables    0    0    0 : slabdata      0      0      0
> dma-kmalloc-1024       0      0   1024   64    1 : tunables    0    0    0 : slabdata      0      0      0
> dma-kmalloc-512      128    128    512  128    1 : tunables    0    0    0 : slabdata      1      1      0
> dma-kmalloc-256        0      0    256  256    1 : tunables    0    0    0 : slabdata      0      0      0
> dma-kmalloc-128      512    512    128  512    1 : tunables    0    0    0 : slabdata      1      1      0
> kmalloc-131072         4      4 131072    4    8 : tunables    0    0    0 : slabdata      1      1      0
> kmalloc-65536        376    376  65536    8    8 : tunables    0    0    0 : slabdata     47     47      0
> kmalloc-32768        320    320  32768   16    8 : tunables    0    0    0 : slabdata     20     20      0
> kmalloc-16384       5248   5248  16384   32    8 : tunables    0    0    0 : slabdata    164    164      0
> kmalloc-8192        2176   2176   8192   32    4 : tunables    0    0    0 : slabdata     68     68      0
> kmalloc-4096        4452   4576   4096   32    2 : tunables    0    0    0 : slabdata    143    143      0
> kmalloc-2048        4416   4416   2048   32    1 : tunables    0    0    0 : slabdata    138    138      0
> kmalloc-1024       10048  10176   1024   64    1 : tunables    0    0    0 : slabdata    159    159      0
> kmalloc-512        19071  19584    512  128    1 : tunables    0    0    0 : slabdata    153    153      0
> kmalloc-256        75873  77312    256  256    1 : tunables    0    0    0 : slabdata    302    302      0
> kmalloc-128        82078  85504    128  512    1 : tunables    0    0    0 : slabdata    167    167      0
>              
> 
>> I would expect kmalloc to give you something that is cache-line aligned,
>> but probably nothing more than that. Now, I'd happily be proven wrong,
>> but so far, all I can see is that:
>>
>> - ARCH_KMALLOC_MINALIGN is defined as ARCH_DMA_MINALIGN
>> - ARCH_DMA_MINALIGN is defined as L1_CACHE_BYTES
>> - L1_CACHE_BYTES is 128 on arm64, and either 32, 64, or 128 on arm.
>>
> 
> Kmalloc always allocates memory with size=roundup_pow_of_two(size)
> and address alignment roundup_pow_of_two(size).

Again, where is that enforced? The slob allocator explicitly uses
ARCH_KMALLOC_MINALIGN to compute its alignment. How does that match your
description above? Where is this roundup_pow_of_two(size) you're
quoting? Does it actually apply to all 3 allocators we have in the kernel?

Please don't give me any of this "it works for me". Show me the code ;-)

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2017-06-21  7:30 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-04-30 14:36 [PATCH v4] irqchip/gicv3-its: Avoid memory over allocation for ITEs Shanker Donthineni
2017-05-02 16:16 ` Marc Zyngier
2017-05-05 22:04   ` Shanker Donthineni
2017-05-06 11:25     ` Marc Zyngier
2017-06-21  1:22       ` Shanker Donthineni
2017-06-21  7:30         ` Marc Zyngier

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).