From: Tariq Toukan <tariqt@mellanox.com>
To: Qing Huang <qing.huang@oracle.com>,
	tariqt@mellanox.com, davem@davemloft.net,
	haakon.bugge@oracle.com, yanjun.zhu@oracle.com
Cc: netdev@vger.kernel.org, linux-rdma@vger.kernel.org,
	linux-kernel@vger.kernel.org, gi-oh.kim@profitbricks.com
Subject: Re: [PATCH V4] mlx4_core: allocate ICM memory in page size chunks
Date: Thu, 24 May 2018 12:45:03 +0300
Message-ID: <bc652655-29de-a34b-a78a-0de19fc20df5@mellanox.com>
In-Reply-To: <20180523232246.20445-1-qing.huang@oracle.com>



On 24/05/2018 2:22 AM, Qing Huang wrote:
> When a system is under memory pressure (high usage with fragments),
> the original 256KB ICM chunk allocations will likely trigger kernel
> memory management to enter slow path doing memory compact/migration
> ops in order to complete high order memory allocations.
> 
> When that happens, user processes calling uverb APIs may get stuck
> for more than 120s easily even though there are a lot of free pages
> in smaller chunks available in the system.
> 
> Syslog:
> ...
> Dec 10 09:04:51 slcc03db02 kernel: [397078.572732] INFO: task
> oracle_205573_e:205573 blocked for more than 120 seconds.
> ...
> 
> With 4KB ICM chunk size on x86_64 arch, the above issue is fixed.
> 
> However in order to support smaller ICM chunk size, we need to fix
> another issue in large size kcalloc allocations.
> 
> E.g.
> Setting log_num_mtt=30 requires 1G mtt entries. With the 4KB ICM chunk
> size, each ICM chunk can only hold 512 mtt entries (8 bytes for each mtt
> entry). So we need a 16MB allocation for a table->icm pointer array to
> hold 2M pointers which can easily cause kcalloc to fail.
> 
> The solution is to use kvzalloc to replace kcalloc which will fall back
> to vmalloc automatically if kmalloc fails.
> 
> Signed-off-by: Qing Huang <qing.huang@oracle.com>
> Acked-by: Daniel Jurgens <danielj@mellanox.com>
> Reviewed-by: Zhu Yanjun <yanjun.zhu@oracle.com>
> ---
> v4: use kvzalloc instead of vzalloc
>      add one err condition check
>      don't include vmalloc.h any more
> 
> v3: use PAGE_SIZE instead of PAGE_SHIFT
>      add comma to the end of enum variables
>      include vmalloc.h header file to avoid build issues on Sparc
> 
> v2: adjusted chunk size to reflect different architectures
> 
>   drivers/net/ethernet/mellanox/mlx4/icm.c | 16 +++++++++-------
>   1 file changed, 9 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/net/ethernet/mellanox/mlx4/icm.c b/drivers/net/ethernet/mellanox/mlx4/icm.c
> index a822f7a..685337d 100644
> --- a/drivers/net/ethernet/mellanox/mlx4/icm.c
> +++ b/drivers/net/ethernet/mellanox/mlx4/icm.c
> @@ -43,12 +43,12 @@
>   #include "fw.h"
>   
>   /*
> - * We allocate in as big chunks as we can, up to a maximum of 256 KB
> - * per chunk.
> + * We allocate in page size (default 4KB on many archs) chunks to avoid high
> + * order memory allocations in fragmented/high usage memory situation.
>    */
>   enum {
> -	MLX4_ICM_ALLOC_SIZE	= 1 << 18,
> -	MLX4_TABLE_CHUNK_SIZE	= 1 << 18
> +	MLX4_ICM_ALLOC_SIZE	= PAGE_SIZE,
> +	MLX4_TABLE_CHUNK_SIZE	= PAGE_SIZE,
>   };
>   
>   static void mlx4_free_icm_pages(struct mlx4_dev *dev, struct mlx4_icm_chunk *chunk)
> @@ -398,9 +398,11 @@ int mlx4_init_icm_table(struct mlx4_dev *dev, struct mlx4_icm_table *table,
>   	u64 size;
>   
>   	obj_per_chunk = MLX4_TABLE_CHUNK_SIZE / obj_size;
> +	if (WARN_ON(!obj_per_chunk))
> +		return -EINVAL;
>   	num_icm = (nobj + obj_per_chunk - 1) / obj_per_chunk;
>   
> -	table->icm      = kcalloc(num_icm, sizeof(*table->icm), GFP_KERNEL);
> +	table->icm      = kvzalloc(num_icm * sizeof(*table->icm), GFP_KERNEL);
>   	if (!table->icm)
>   		return -ENOMEM;
>   	table->virt     = virt;
> @@ -446,7 +448,7 @@ int mlx4_init_icm_table(struct mlx4_dev *dev, struct mlx4_icm_table *table,
>   			mlx4_free_icm(dev, table->icm[i], use_coherent);
>   		}
>   
> -	kfree(table->icm);
> +	kvfree(table->icm);
>   
>   	return -ENOMEM;
>   }
> @@ -462,5 +464,5 @@ void mlx4_cleanup_icm_table(struct mlx4_dev *dev, struct mlx4_icm_table *table)
>   			mlx4_free_icm(dev, table->icm[i], table->coherent);
>   		}
>   
> -	kfree(table->icm);
> +	kvfree(table->icm);
>   }
> 

Thanks Qing.

Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
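
For readers who want to check the sizing arithmetic in the commit message,
here is a minimal userspace sketch. It is not driver code: the helper names
(icm_obj_per_chunk, icm_num_chunks, icm_ptr_array_bytes) are hypothetical,
and the 8-byte pointer size is an assumption that matches x86_64. It mirrors
the obj_per_chunk / num_icm computation in mlx4_init_icm_table() and the
zero check the patch adds.

```c
#include <stdint.h>

/*
 * Hypothetical helper: how many objects fit in one ICM chunk.
 * Returns 0 when obj_size is 0 or larger than the chunk, which is the
 * case the patch guards with WARN_ON(!obj_per_chunk).
 */
static uint64_t icm_obj_per_chunk(uint64_t chunk_size, uint64_t obj_size)
{
	return obj_size ? chunk_size / obj_size : 0;
}

/*
 * Hypothetical helper: number of chunks needed for nobj objects,
 * rounded up. Returns 0 if no object fits in a chunk, instead of
 * dividing by zero.
 */
static uint64_t icm_num_chunks(uint64_t nobj, uint64_t chunk_size,
			       uint64_t obj_size)
{
	uint64_t per_chunk = icm_obj_per_chunk(chunk_size, obj_size);

	if (!per_chunk)
		return 0;
	return (nobj + per_chunk - 1) / per_chunk;
}

/*
 * Hypothetical helper: bytes needed for the table->icm pointer array,
 * assuming ptr_size bytes per pointer (8 on x86_64).
 */
static uint64_t icm_ptr_array_bytes(uint64_t num_chunks, uint64_t ptr_size)
{
	return num_chunks * ptr_size;
}
```

With log_num_mtt=30 (1G entries of 8 bytes each) and 4KB chunks,
icm_num_chunks(1ULL << 30, 4096, 8) gives 2M chunks, and
icm_ptr_array_bytes() on that count gives 16MB, matching the numbers in the
commit message. With the old 256KB chunk size the same table needs only
32768 pointers (256KB), which is why the array allocation was not a problem
before.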

