From mboxrd@z Thu Jan 1 00:00:00 1970 From: Parav Pandit Subject: Re: [PATCHv12 1/3] rdmacg: Added rdma cgroup controller Date: Wed, 14 Sep 2016 14:49:56 +0530 Message-ID: References: <20160901084406.GA4115@lst.de> <20160910161442.GC29259@lst.de> <20160910170151.GA5230@obsidianresearch.com> <20160911133421.GA23384@lst.de> <20160911143522.GL6415@leon.nu> <20160911171409.GA13442@obsidianresearch.com> <20160911172445.GA25953@lst.de> <20160911175235.GB13442@obsidianresearch.com> <20160912050717.GE8812@leon.nu> <13a00119-e629-2d34-d08b-c02bb6beceea@mellanox.com> Mime-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Return-path: In-Reply-To: <13a00119-e629-2d34-d08b-c02bb6beceea-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> Sender: cgroups-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org To: Matan Barak Cc: Leon Romanovsky , Jason Gunthorpe , Christoph Hellwig , Tejun Heo , cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, linux-doc-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, Linux Kernel Mailing List , linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, Li Zefan , Johannes Weiner , Doug Ledford , Liran Liss , "Hefty, Sean" , Haggai Eran , Jonathan Corbet , james.l.morris-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org, serge-A9i7LUbDfNHQT0dZR+AlfA@public.gmane.org, Or Gerlitz , Andrew Morton , linux-security-module-u79uwXL29TY76Z2rM5mHXA@public.gmane.org List-Id: linux-rdma@vger.kernel.org Hi Matan, On Wed, Sep 14, 2016 at 1:44 PM, Matan Barak wrote: > On 14/09/2016 10:06, Parav Pandit wrote: >> >> Hi Dennis, >> >> Do you know how would HFI1 driver would work along with rdma cgroup? >> >> Hi Matan, Leon, Jason, >> Apart from HFI1, is there any other concern? > > > I just wonder how things like RSS will work. For example, a RSS QP doesn't > really have a queue (if I recall, it's connected to work queues via an > indirection table). So, when a user creates such a QP, do you want to > account it as a regular QP? > How are work queues accounted? ib_create_rwq_ind_table verb allows creating indirection table. I assume it allows creating multiple such tables. If it is so, than number of tables should be a cgroup resource that we can add in follow on patch. By doing so, one container doesn't takeaway all the tables. > > >> Or Patch is good to go? >> >> 4.8 dates are close by (2 weeks) and there are two git trees involved >> (that might cause merge error to Linus) so if there are no issues, I >> would like to make request to Doug to consider it for 4.8 early on. >> >> Parav >> >> On Mon, Sep 12, 2016 at 10:37 AM, Leon Romanovsky wrote: >>> >>> On Sun, Sep 11, 2016 at 11:52:35AM -0600, Jason Gunthorpe wrote: >>>> >>>> On Sun, Sep 11, 2016 at 07:24:45PM +0200, Christoph Hellwig wrote: >>>>>>>> >>>>>>>> I've posted some initial work toward a) a while ago, and once we >>>>>> >>>>>> >>>>>> Did it get merged? Do you have a pointer? >>>>> >>>>> >>>>> http://www.spinics.net/lists/linux-rdma/msg31958.html >>>> >>>> >>>> Right, I remember that. Certainly the right direction >>>> >>>>>> However, everything under verbs is not straightforward. The files in >>>>>> userspace are not copies... >>>>>> >>>>>> user: >>>>>> >>>>>> struct ibv_query_device { >>>>>> __u32 command; >>>>>> __u16 in_words; >>>>>> __u16 out_words; >>>>>> __u64 response; >>>>>> __u64 driver_data[0]; >>>>>> }; >>>>>> >>>>>> kernel: >>>>>> >>>>>> struct ib_uverbs_query_device { >>>>>> __u64 response; >>>>>> __u64 driver_data[0]; >>>>>> }; >>>>> >>>>> >>>>> We'll obviously need different strutures for the libibvers API >>>>> and the kernel interface in this case, and we'll need to figure out >>>>> how to properly translate them. I think a cast, plus compile time >>>>> type checking ala BUILD_BUG_ON is the way to go. >>>> >>>> >>>> I'm not sure I follow, which would I cast? >>>> >>>> BUILD_BUG_ON(sizeof(ibv_query_device) == sizeof(ib_uverbs_cmd_hdr) + >>>> sizeof(ib_uverbs_query_device)) >>>> >>>> ? >>>> >>>>>> I'm thinking the best way forward might be to use a script and >>>>>> transform userspace into: >>>>>> >>>>>> struct ibv_query_device { >>>>>> struct ib_uverbs_cmd_hdr hdr; >>>>>> struct ib_uverbs_query_device cmd; >>>>>> }; >>>>> >>>>> >>>>> That would break the users of the interface. >>>> >>>> >>>> Sorry, I mean doing this inside rdma-plumbing. Since the change is ABI >>>> identical the modified libibverbs would still be binary compatible >>>> with all providers but not source compatible. Since all kernel >>>> supported providers are in rdma-plumbing we can add the '.cmd.' at the >>>> same time. >>>> >>>> The kernel uapi header would stay the same. >>>> >>>>> However automatically generating the user ABI from the kernel one >>>>> might still be a good idea in the long run. >>>> >>>> >>>> My preference would be to try and use the kernel headers directly. >>> >>> >>> I thought the same, especially after realizing that they are almost >>> copy/paste from the vendor *-abi.h files. >>> >>>> >>>> Jason > > From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1758038AbcINJUC (ORCPT ); Wed, 14 Sep 2016 05:20:02 -0400 Received: from mail-oi0-f68.google.com ([209.85.218.68]:35179 "EHLO mail-oi0-f68.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750975AbcINJT6 (ORCPT ); Wed, 14 Sep 2016 05:19:58 -0400 MIME-Version: 1.0 In-Reply-To: <13a00119-e629-2d34-d08b-c02bb6beceea@mellanox.com> References: <20160901084406.GA4115@lst.de> <20160910161442.GC29259@lst.de> <20160910170151.GA5230@obsidianresearch.com> <20160911133421.GA23384@lst.de> <20160911143522.GL6415@leon.nu> <20160911171409.GA13442@obsidianresearch.com> <20160911172445.GA25953@lst.de> <20160911175235.GB13442@obsidianresearch.com> <20160912050717.GE8812@leon.nu> <13a00119-e629-2d34-d08b-c02bb6beceea@mellanox.com> From: Parav Pandit Date: Wed, 14 Sep 2016 14:49:56 +0530 Message-ID: Subject: Re: [PATCHv12 1/3] rdmacg: Added rdma cgroup controller To: Matan Barak Cc: Leon Romanovsky , Jason Gunthorpe , Christoph Hellwig , Tejun Heo , cgroups@vger.kernel.org, linux-doc@vger.kernel.org, Linux Kernel Mailing List , linux-rdma@vger.kernel.org, Li Zefan , Johannes Weiner , Doug Ledford , Liran Liss , "Hefty, Sean" , Haggai Eran , Jonathan Corbet , james.l.morris@oracle.com, serge@hallyn.com, Or Gerlitz , Andrew Morton , linux-security-module@vger.kernel.org Content-Type: text/plain; charset=UTF-8 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Hi Matan, On Wed, Sep 14, 2016 at 1:44 PM, Matan Barak wrote: > On 14/09/2016 10:06, Parav Pandit wrote: >> >> Hi Dennis, >> >> Do you know how would HFI1 driver would work along with rdma cgroup? >> >> Hi Matan, Leon, Jason, >> Apart from HFI1, is there any other concern? > > > I just wonder how things like RSS will work. For example, a RSS QP doesn't > really have a queue (if I recall, it's connected to work queues via an > indirection table). So, when a user creates such a QP, do you want to > account it as a regular QP? > How are work queues accounted? ib_create_rwq_ind_table verb allows creating indirection table. I assume it allows creating multiple such tables. If it is so, than number of tables should be a cgroup resource that we can add in follow on patch. By doing so, one container doesn't takeaway all the tables. > > >> Or Patch is good to go? >> >> 4.8 dates are close by (2 weeks) and there are two git trees involved >> (that might cause merge error to Linus) so if there are no issues, I >> would like to make request to Doug to consider it for 4.8 early on. >> >> Parav >> >> On Mon, Sep 12, 2016 at 10:37 AM, Leon Romanovsky wrote: >>> >>> On Sun, Sep 11, 2016 at 11:52:35AM -0600, Jason Gunthorpe wrote: >>>> >>>> On Sun, Sep 11, 2016 at 07:24:45PM +0200, Christoph Hellwig wrote: >>>>>>>> >>>>>>>> I've posted some initial work toward a) a while ago, and once we >>>>>> >>>>>> >>>>>> Did it get merged? Do you have a pointer? >>>>> >>>>> >>>>> http://www.spinics.net/lists/linux-rdma/msg31958.html >>>> >>>> >>>> Right, I remember that. Certainly the right direction >>>> >>>>>> However, everything under verbs is not straightforward. The files in >>>>>> userspace are not copies... >>>>>> >>>>>> user: >>>>>> >>>>>> struct ibv_query_device { >>>>>> __u32 command; >>>>>> __u16 in_words; >>>>>> __u16 out_words; >>>>>> __u64 response; >>>>>> __u64 driver_data[0]; >>>>>> }; >>>>>> >>>>>> kernel: >>>>>> >>>>>> struct ib_uverbs_query_device { >>>>>> __u64 response; >>>>>> __u64 driver_data[0]; >>>>>> }; >>>>> >>>>> >>>>> We'll obviously need different strutures for the libibvers API >>>>> and the kernel interface in this case, and we'll need to figure out >>>>> how to properly translate them. I think a cast, plus compile time >>>>> type checking ala BUILD_BUG_ON is the way to go. >>>> >>>> >>>> I'm not sure I follow, which would I cast? >>>> >>>> BUILD_BUG_ON(sizeof(ibv_query_device) == sizeof(ib_uverbs_cmd_hdr) + >>>> sizeof(ib_uverbs_query_device)) >>>> >>>> ? >>>> >>>>>> I'm thinking the best way forward might be to use a script and >>>>>> transform userspace into: >>>>>> >>>>>> struct ibv_query_device { >>>>>> struct ib_uverbs_cmd_hdr hdr; >>>>>> struct ib_uverbs_query_device cmd; >>>>>> }; >>>>> >>>>> >>>>> That would break the users of the interface. >>>> >>>> >>>> Sorry, I mean doing this inside rdma-plumbing. Since the change is ABI >>>> identical the modified libibverbs would still be binary compatible >>>> with all providers but not source compatible. Since all kernel >>>> supported providers are in rdma-plumbing we can add the '.cmd.' at the >>>> same time. >>>> >>>> The kernel uapi header would stay the same. >>>> >>>>> However automatically generating the user ABI from the kernel one >>>>> might still be a good idea in the long run. >>>> >>>> >>>> My preference would be to try and use the kernel headers directly. >>> >>> >>> I thought the same, especially after realizing that they are almost >>> copy/paste from the vendor *-abi.h files. >>> >>>> >>>> Jason > >