From mboxrd@z Thu Jan 1 00:00:00 1970
Subject: Re: [PATCHv12 1/3] rdmacg: Added rdma cgroup controller
To: Parav Pandit, Christoph Hellwig
References: <1472632647-1525-1-git-send-email-pandit.parav@gmail.com>
 <1472632647-1525-2-git-send-email-pandit.parav@gmail.com>
 <61101e8b-5776-c0bc-b3ea-d8b984eebabf@mellanox.com>
 <20160831211618.GA12660@htj.duckdns.org>
 <9b6a346d-af4c-1e5f-0144-f68fb8e46c27@mellanox.com>
 <20160901084406.GA4115@lst.de>
CC: Tejun Heo, Linux Kernel Mailing List, Li Zefan, Johannes Weiner,
 Doug Ledford, Liran Liss, "Hefty, Sean", Jason Gunthorpe, Haggai Eran,
 Jonathan Corbet, Or Gerlitz, Andrew Morton
From: Matan Barak
Date: Wed, 7 Sep 2016 11:51:42 +0300
User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:45.0) Gecko/20100101 Thunderbird/45.2.0
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"; format=flowed
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

On 07/09/2016 10:55, Parav Pandit wrote:
> Hi Matan,
>
> On Thu, Sep 1, 2016 at 2:14 PM, Christoph Hellwig wrote:
>> On Thu, Sep 01, 2016 at 10:25:40AM +0300, Matan Barak wrote:
>>> Well, if I recall, the reason doing so last time was in order to allow
>>> flexible updating of ib_core independently, which is obviously not a good
>>> reason (to say the least).
>>>
>>> Since the new ABI will probably define new object types (all recent
>>> proposals go this way), the current approach could lead to either trying to
>>> map new objects to existing cgroup resource types, which could lead to some
>>> weird non 1:1 mapping, or having a split definitions - such that each
>>> driver will declare its objects both in the cgroups mechanism and in its
>>> driver dispatch table.
>
>>> Even worse than that, drivers could simply ignore the cgroups support while
>>> implementing their own resource types and get a very broken containers
>>> support.
> If drivers are broken due to ignorance of not-calling cgroup APIs,
> that should be ok.
> That particular driver should fix it.
> If the resource creation using uverbs is using well defined rdma level
> resource, uverbs level will make sure to honor cgroup limits without
> involving hw drivers anyway.
>

All recent proposals for the new ABI schema deal with extending the
flexibility of the current schema by letting drivers define their own
specific types, actions, attributes, etc. Even more than that, the
dispatching starts from the driver, and the driver chooses whether to use
the common RDMA core layer or its own implementation instead. Some
drivers might even prefer not to implement the current verbs types.
These decisions were made in the OFVWG meetings.
Anyway, maybe we should consider using higher-level logical objects
that could fit the requirements of multiple drivers.

> RDMA Verb level resource is charged/uncharged by RDMA core.
> RDMA HW level resource is charged/uncharged by RDMA HW driver using
> well defined API and resource by cgroup core.
> This scheme ensures that there is 1:1 mapping.
>

Sounds reasonable, but what about drivers which ignore the common code
and implement it in their own way? What about drivers which don't
support the standard RDMA types at all?
I guess we should find a balance between something abstract and common
enough to ease administrator configuration, yet specific enough for the
various models we have (or will have) in the RDMA stack.

> I don't think current definition of resource type is carved out on stone.
> They can be extended as we forward.
> As many of us agree that, they should be well defined and it should be
> agreed by cgroup and rdma community.
>

Of course, but since the ABI and cgroups model are somehow related, they
should be dealt with together and approved by Doug, who participated in
some of the OFVWG meetings.

> For example, today we have RDMA_VERB_xxx resources.
> New well defined RDMA HW resources can be defined in rdma_cgroup.h
> file as RDMA_HW_xx in same enum table.
>

So a driver will change the cgroups file for every new type it adds?
Will we just end up with a superset enum of all the crazy types vendors
have added?

>>
>> Sorry guys, but arbitrary extensibility for something not finished is the
>> worst idea ever. We have two options here:
>>
>> a) delay cgroups support until the grand rewrite is done
>> b) add it now and deal with the consequences later
>>
> Can we do (b) now and defer adding any HW resources to cgroup until
> they are clearly called out.
> Architecture and APIs are already in place to support this.
>

Since this affects the user, it's better to think first about how it fits
our "optional standard"/"vendor types" model.
Maybe we could force all standard types even if the driver we use
doesn't support any of them.

>> That being said, adding random non-Verbs hardware to the RDMA core is
>> the second worst idea ever.
>
> We can defer adding HW resource to core and cgroup until they are
> clearly defined.
> In that case current architecture still holds good.
>

Clearly we should defer adding the actual code until a driver can
declare such objects, but we need to decide how to expose the standard
optional RDMA types (basically, the types you've added) and how future
driver-specific types fit in.

>> Guess I need to catch up with the
>> discussion and start using the flame thrower.
>
> Matan,
> Can you please point us to the new RFC/ABI email thread which
> describes the design, partitioning of code between core vs hw drivers
> etc.
>

One proposal is [1]. There's another one from Sean which aims for
similar targets regarding the driver-specific types.

[1] https://www.spinics.net/lists/linux-rdma/msg38997.html

Matan