From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Seymour, Shane M"
To: Mathieu Desnoyers, Thomas Gleixner, Paul Turner, Andrew Hunter, Peter Zijlstra
CC: "linux-kernel@vger.kernel.org", "linux-api@vger.kernel.org", Andy Lutomirski, Andi Kleen, Dave Watson, Chris Lameter, Ingo Molnar, Ben Maurer, Steven Rostedt, "Paul E. McKenney", Josh Triplett, Linus Torvalds, Andrew Morton, Russell King, Catalin Marinas, Will Deacon, Michael Kerrisk
Subject: RE: [RFC PATCH 0/3] Implement getcpu_cache system call
Date: Mon, 11 Jan 2016 22:38:28 +0000
References: <1451977320-4886-1-git-send-email-mathieu.desnoyers@efficios.com>
In-Reply-To: <1451977320-4886-1-git-send-email-mathieu.desnoyers@efficios.com>
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
X-Mailing-List: linux-kernel@vger.kernel.org

Hi Mathieu,

I have some concerns and suggestions for you about this.

What's to stop someone in user space from registering an arbitrarily large number of CPU number cache locations? The kernel has to allocate memory to track each one, and every time the task migrates to a new CPU it has to update them all. Could that be used to dramatically slow down a system or its task switching?
Should there be a ulimit-style value or a sysctl setting to limit the number you're allowed to register per task?

If you can just register consecutive addresses, each 4 bytes apart, the structure needed to track each one in the kernel looks to be 20 or 24 bytes long depending on kernel bitness (with kmalloc it should be 32 bytes allocated either way). So if you tie up 1GiB of user space memory registered as CPU cache locations, it will tie up >8GiB of memory in the kernel, and there will be a huge linked list that takes a significant amount of time to traverse. You could use it as a local denial of service attack to soak up memory and cause a kernel OOM, because the kernel needs more memory to track the request than user space needs to create it. There doesn't currently appear to be any upper bound on the number that can be registered.

In terms of tracking what it's doing, would you consider some sysfs attribute files (or something in debugfs) that track the following (these would all be updated in the add path, so it shouldn't be performance sensitive):

1) The largest number of entries anyone has created in the list in any task
2) The number of times the upper bound is hit (assuming you implement an upper bound on the number allowed), so that someone can monitor for problems caused by hitting it

Assuming that something (e.g. glibc) is willing to register an entry and keep it available for the life of the task, consider allowing one flag, for example GETCPU_CACHE_PERSISTENT, with GETCPU_CACHE_CMD_REGISTER, and adding a new command, GETCPU_CACHE_CMD_GET_PERSISTENT, to let someone ask for the user space address of an entry that something has guaranteed will be there until the task ends.
If none exists, they can fall back to allocating a new one. This allows better reuse of an existing resource. Log a warning if you're forced to remove a persistent entry, or if someone attempts to unregister one (which should always fail), since someone in user space will have broken their promise that it will be there until the task ends. That means persistent entries should be left to be torn down by the kernel, not unregistered from user space. If you do this, you might need to optimize the find process so that the first persistent entry is likely to be the first one found, if you think callers are more likely to ask for that first and fall back to creating a new one only when none is present. Having this will also tend to limit the number of entries anyone needs to create, if most libraries ask for a persistent entry first.

Thanks
Shane
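To put rough numbers on the memory concern and to illustrate the suggested cap with its two counters, here is a minimal user-space model in C. It is only a sketch under stated assumptions: the 4-byte slot and 32-byte kmalloc figures are the ones quoted in this mail, and every identifier (kernel_bytes_for, struct task_model, struct getcpu_cache_stats, model_register, the limit parameter) is hypothetical and does not come from the patch set. With these figures, 1GiB of registered slots costs 8GiB of tracking memory before any allocator overhead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Figures quoted in the mail: one 4-byte user-space slot per registration,
 * tracked by a structure that kmalloc rounds up to 32 bytes. */
#define USER_SLOT_BYTES   4u
#define KERNEL_NODE_BYTES 32u

/* The denial-of-service concern only exists because tracking a slot
 * costs more than the slot itself. */
static_assert(KERNEL_NODE_BYTES > USER_SLOT_BYTES,
	      "tracking data is larger than the slot it tracks");

/* Kernel bytes needed to track user_bytes of densely registered slots. */
uint64_t kernel_bytes_for(uint64_t user_bytes)
{
	return (user_bytes / USER_SLOT_BYTES) * KERNEL_NODE_BYTES;
}

/* Hypothetical per-task state: how many slots the task has registered. */
struct task_model {
	unsigned int nr_entries;
};

/* Hypothetical statistics, standing in for the suggested sysfs/debugfs files. */
struct getcpu_cache_stats {
	unsigned int max_entries_seen;	/* item 1: largest list in any task */
	unsigned long limit_hits;	/* item 2: times the cap was hit */
};

/* Model of the add path: returns true if the slot was registered,
 * false if the per-task cap rejected it. Both counters are only
 * touched here, so the migration path is unaffected. */
bool model_register(struct task_model *t, struct getcpu_cache_stats *s,
		    unsigned int limit)
{
	if (t->nr_entries >= limit) {
		s->limit_hits++;
		return false;
	}
	t->nr_entries++;
	if (t->nr_entries > s->max_entries_seen)
		s->max_entries_seen = t->nr_entries;
	return true;
}
```

In this model the cap bounds both the memory the kernel commits per task (limit * 32 bytes) and the length of the list walked on migration. Where the limit would come from (ulimit, sysctl, or a fixed constant) is left open here, as it is in the mail.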