From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1754687AbcKUUUz convert rfc822-to-8bit (ORCPT );
	Mon, 21 Nov 2016 15:20:55 -0500
Received: from lhrrgout.huawei.com ([194.213.3.17]:4239 "EHLO
	lhrrgout.huawei.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1753434AbcKUUUy (ORCPT );
	Mon, 21 Nov 2016 15:20:54 -0500
From: Salil Mehta
To: Leon Romanovsky
CC: "dledford@redhat.com", "Huwei (Xavier)", oulijun,
	"mehta.salil.lnk@gmail.com", "linux-rdma@vger.kernel.org",
	"netdev@vger.kernel.org", "linux-kernel@vger.kernel.org",
	Linuxarm, "Zhangping (ZP)"
Subject: RE: [PATCH for-next 03/11] IB/hns: Optimize the logic of allocating memory using APIs
Thread-Topic: [PATCH for-next 03/11] IB/hns: Optimize the logic of allocating memory using APIs
Thread-Index: AQHSNrnWOaTLjEdgR062ss5CfO0xkKDQRlQAgAnzTlCAASHXAIAIVBYwgAAYZYCAADGdYA==
Date: Mon, 21 Nov 2016 20:20:30 +0000
Message-ID:
References: <20161104163633.141880-1-salil.mehta@huawei.com>
	<20161104163633.141880-4-salil.mehta@huawei.com>
	<20161109072130.GH27883@leon.nu>
	<20161116083602.GH4240@leon.nu>
	<20161121171423.GA23083@leon.nu>
In-Reply-To: <20161121171423.GA23083@leon.nu>
Accept-Language: en-US
Content-Language: en-US
X-MS-Has-Attach:
X-MS-TNEF-Correlator:
x-originating-ip: [10.47.82.53]
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: 8BIT
MIME-Version: 1.0
X-CFilter-Loop: Reflected
X-Mirapoint-Virus-RAPID-Raw: score=unknown(0),
	refid=str=0001.0A0B0201.58335719.0016,ss=1,re=0.000,recu=0.000,reip=0.000,cl=1,cld=1,fgs=0,
	ip=0.0.0.0, so=2013-06-18 04:22:30, dmn=2013-03-21 17:37:32
X-Mirapoint-Loop-Id: 220623f7be5393e8ba51fede4e61d896
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

> -----Original Message-----
> From: netdev-owner@vger.kernel.org [mailto:netdev-owner@vger.kernel.org]
> On Behalf Of Leon Romanovsky
> Sent: Monday, November 21, 2016 5:14 PM
> To: Salil Mehta
> Cc: dledford@redhat.com; Huwei (Xavier); oulijun;
> mehta.salil.lnk@gmail.com; linux-rdma@vger.kernel.org;
> netdev@vger.kernel.org; linux-kernel@vger.kernel.org; Linuxarm;
> Zhangping (ZP)
> Subject: Re: [PATCH for-next 03/11] IB/hns: Optimize the logic of
> allocating memory using APIs
>
> On Mon, Nov 21, 2016 at 04:12:38PM +0000, Salil Mehta wrote:
> > > -----Original Message-----
> > > From: Leon Romanovsky [mailto:leon@kernel.org]
> > > Sent: Wednesday, November 16, 2016 8:36 AM
> > > To: Salil Mehta
> > > Cc: dledford@redhat.com; Huwei (Xavier); oulijun;
> > > mehta.salil.lnk@gmail.com; linux-rdma@vger.kernel.org;
> > > netdev@vger.kernel.org; linux-kernel@vger.kernel.org; Linuxarm;
> > > Zhangping (ZP)
> > > Subject: Re: [PATCH for-next 03/11] IB/hns: Optimize the logic of
> > > allocating memory using APIs
> > >
> > > On Tue, Nov 15, 2016 at 03:52:46PM +0000, Salil Mehta wrote:
> > > > > -----Original Message-----
> > > > > From: Leon Romanovsky [mailto:leon@kernel.org]
> > > > > Sent: Wednesday, November 09, 2016 7:22 AM
> > > > > To: Salil Mehta
> > > > > Cc: dledford@redhat.com; Huwei (Xavier); oulijun;
> > > > > mehta.salil.lnk@gmail.com; linux-rdma@vger.kernel.org;
> > > > > netdev@vger.kernel.org; linux-kernel@vger.kernel.org; Linuxarm;
> > > > > Zhangping (ZP)
> > > > > Subject: Re: [PATCH for-next 03/11] IB/hns: Optimize the logic of
> > > > > allocating memory using APIs
> > > > >
> > > > > On Fri, Nov 04, 2016 at 04:36:25PM +0000, Salil Mehta wrote:
> > > > > > From: "Wei Hu (Xavier)"
> > > > > >
> > > > > > This patch modified the logic of allocating memory using APIs in
> > > > > > hns RoCE driver. We used kcalloc instead of kmalloc_array and
> > > > > > bitmap_zero. And When kcalloc failed, call vzalloc to alloc
> > > > > > memory.
> > > > > >
> > > > > > Signed-off-by: Wei Hu (Xavier)
> > > > > > Signed-off-by: Ping Zhang
> > > > > > Signed-off-by: Salil Mehta
> > > > > > ---
> > > > > >  drivers/infiniband/hw/hns/hns_roce_mr.c | 15 ++++++++-------
> > > > > >  1 file changed, 8 insertions(+), 7 deletions(-)
> > > > > >
> > > > > > diff --git a/drivers/infiniband/hw/hns/hns_roce_mr.c b/drivers/infiniband/hw/hns/hns_roce_mr.c
> > > > > > index fb87883..d3dfb5f 100644
> > > > > > --- a/drivers/infiniband/hw/hns/hns_roce_mr.c
> > > > > > +++ b/drivers/infiniband/hw/hns/hns_roce_mr.c
> > > > > > @@ -137,11 +137,12 @@ static int hns_roce_buddy_init(struct hns_roce_buddy *buddy, int max_order)
> > > > > >
> > > > > >  	for (i = 0; i <= buddy->max_order; ++i) {
> > > > > >  		s = BITS_TO_LONGS(1 << (buddy->max_order - i));
> > > > > > -		buddy->bits[i] = kmalloc_array(s, sizeof(long), GFP_KERNEL);
> > > > > > -		if (!buddy->bits[i])
> > > > > > -			goto err_out_free;
> > > > > > -
> > > > > > -		bitmap_zero(buddy->bits[i], 1 << (buddy->max_order - i));
> > > > > > +		buddy->bits[i] = kcalloc(s, sizeof(long), GFP_KERNEL);
> > > > > > +		if (!buddy->bits[i]) {
> > > > > > +			buddy->bits[i] = vzalloc(s * sizeof(long));
> > > > >
> > > > > I wonder, why don't you use directly vzalloc instead of kcalloc
> > > > > fallback?
> > > > As we know we will have physical contiguous pages if the kcalloc
> > > > call succeeds. This will give us a chance to have better performance
> > > > over the allocations which are just virtually contiguous through the
> > > > function vzalloc(). Therefore, later has only been used as a fallback
> > > > when our memory request cannot be entertained through kcalloc.
> > > >
> > > > Are you suggesting that there will not be much performance penalty
> > > > if we use just vzalloc ?
> > >
> > > Not exactly,
> > > I asked it, because we have similar code in our drivers and this
> > > construction looks strange to me.
> > >
> > > 1. If performance is critical, we will use kmalloc.
> > > 2. If performance is not critical, we will use vmalloc.
> > >
> > > But in this case, such construction shows me that we can live with
> > > vmalloc performance and kmalloc allocation are not really needed.
> > >
> > > In your specific case, I'm not sure that kcalloc will ever fail.
> > Performance is definitely critical here. Though, I agree this is bit
> > unusual way of memory allocation. In actual, we were encountering
> > memory alloc failures using kmalloc (if you see allocation amount
> > is on the higher side and is exponential) so we ended up using
> > vmalloc as fall back - It is very naïve allocation scheme.
>
> I understand it, we did the same, see our mlx5_vzalloc call.
> BTW, we used __GFP_NOWARN flag, which you should consider to use
> in your case too.

Ok. Will add this flag and refloat patch V3. Thanks

>
> >
> > Maybe we need to rethink this allocation scheme part? Also, I can pull
> > back this particular patch for now or just live with vzalloc() till
> > we figure out proper solution to this?
>
> It is up to you, I don't think that you should drop it, AFAIK, there is
> no other proper solution.

Ok we will live with it for now and later maybe we can see how we
can optimize pre-allocation of physically contiguous memory.

Thanks for your suggestions!
Salil

>
> >
> > >
> > > Thanks
> > > >
> > > > >
> > > > > >
> > > > > > +			if (!buddy->bits[i])
> > > > > > +				goto err_out_free;
> > > > > > +		}
> > > > > >  	}
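
For readers following the thread, the "try contiguous first, fall back
quietly" scheme discussed above can be sketched in userspace C. This is
only an illustration under stated assumptions, not the hns driver code:
contig_zalloc(), virt_zalloc(), order_longs(), bitmap_zalloc() and the
128 KiB CONTIG_LIMIT are hypothetical stand-ins for
kcalloc(..., GFP_KERNEL | __GFP_NOWARN), vzalloc() and the point at which
a physically contiguous request starts failing.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

/* Userspace stand-ins for the kernel macros of the same name. */
#define BITS_PER_LONG    (sizeof(long) * CHAR_BIT)
#define BITS_TO_LONGS(n) (((n) + BITS_PER_LONG - 1) / BITS_PER_LONG)

/* Hypothetical cap modelling the largest contiguous request that succeeds. */
#define CONTIG_LIMIT ((size_t)128 * 1024)

/* Models kcalloc(n, size, GFP_KERNEL | __GFP_NOWARN): fails silently when too large. */
static void *contig_zalloc(size_t n, size_t size)
{
	if (size != 0 && n > CONTIG_LIMIT / size)
		return NULL;
	return calloc(n, size);
}

/* Models vzalloc(): zeroed memory with no contiguity requirement. */
static void *virt_zalloc(size_t bytes)
{
	return calloc(1, bytes);
}

/* Number of longs in the bitmap for one order, sized as in the quoted loop. */
static size_t order_longs(int max_order, int order)
{
	assert(order >= 0 && order <= max_order);
	return BITS_TO_LONGS((size_t)1 << (max_order - order));
}

/*
 * Try the contiguous allocator first and fall back only on failure.
 * *fell_back is set to 1 when the fallback path was taken.
 */
static unsigned long *bitmap_zalloc(size_t nlongs, int *fell_back)
{
	unsigned long *p = contig_zalloc(nlongs, sizeof(long));

	*fell_back = 0;
	if (!p) {
		p = virt_zalloc(nlongs * sizeof(long));
		*fell_back = 1;
	}
	return p;
}
```

With an assumed max_order of 24 (a value picked for illustration, not
taken from the driver), order 0 needs a 2 MiB bitmap and each higher order
needs half of the previous one, which is the exponential sizing mentioned
in the thread. In this model the large low-order bitmaps exceed the cap
and take the fallback path, while the small high-order ones are served by
the contiguous path.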