References: <20200901161459.11772-1-sumit.semwal@linaro.org>
 <20200901161459.11772-4-sumit.semwal@linaro.org>
 <202009031031.D32EF57ED@keescook>
 <393be893-6379-8adb-217c-4064f5052702@intel.com>
In-Reply-To: <393be893-6379-8adb-217c-4064f5052702@intel.com>
From: Colin Cross
Date: Thu, 3 Sep 2020 11:26:04 -0700
Subject: Re: [PATCH v7 3/3] mm: add a field to store names for private anonymous memory
To: Dave Hansen
Cc: Kees Cook, Sumit Semwal, Andrew Morton, Linux-MM, lkml,
 Alexey Dobriyan, Jonathan Corbet, Mauro Carvalho Chehab, Michal Hocko,
 Alexey Gladkov, Matthew Wilcox, Jason Gunthorpe, "Kirill A . Shutemov",
 Michel Lespinasse, Michal Koutný, Song Liu, Huang Ying, Vlastimil Babka,
 Yang Shi, chenqiwu, Mathieu Desnoyers, John Hubbard, Mike Christie,
 Bart Van Assche, Amit Pundir, Thomas Gleixner, Christian Brauner,
 Daniel Jordan, Adrian Reber, Nicolas Viennot, Al Viro,
 linux-fsdevel@vger.kernel.org, John Stultz, Pekka Enberg, Peter Zijlstra,
 Ingo Molnar, Oleg Nesterov, "Eric W. Biederman", Jan Glauber, Rob Landley,
 Cyrill Gorcunov, "Serge E. Hallyn", David Rientjes, Hugh Dickins,
 Rik van Riel, Mel Gorman, Tang Chen, Robin Holt, Shaohua Li, Sasha Levin,
 Johannes Weiner, Minchan Kim

On Thu, Sep 3, 2020 at 11:09 AM Dave Hansen wrote:
>
> On 9/3/20 11:00 AM, Kees Cook wrote:
> > Why is a kernel-copied string insufficient for this? I don't think VMA
> > merging is a fast-path operation, so doing a strcmp isn't going to wreck
> > anything...
> >
> > Let me try to find the earlier thread with Dave Hansen... okay, found it:
> >
> > https://lore.kernel.org/linux-mm/51DDF071.5000309@intel.com/
> >
> > Right, so, this idea predates userfaultfd. :)
> >
> > More notes below, but I *really* think this should not be a userspace
> > pointer. And since a separate union has been found, let's just do a
> > strndup_user() for the name, validate it as containing only printable
> > characters without \n \r \v \f and move the merging logic into a
> > separate patch.
>
> FWIW, I don't have any objections to this.
>
> Refcounting strings was what I think I had the strongest reaction to
> back in the good old days of 2013. strdup() on split plus strcmp() on
> merge doesn't sound awful to me, and it is darn straightforward. The
> biggest downside is probably kernel memory consumption. We should
> probably just think through whether having so many duplicates changes
> things materially.
>
> For instance, should/could we penalize a task's vm.max_map_count when
> it's using this mechanism?

Just to provide some concrete numbers, the ART process I examined
(https://pastebin.com/YNUTvZyz) had 280 named anonymous mappings using
a total of 6566 bytes for the names. There were only 63 unique names,
using 1925 bytes.

On my personal usage device, there are currently a total of 59769 named
anonymous mappings across all processes using 1224119 bytes, 5843 of
them unique using 121754 bytes. The vast majority of the unique names
are of the form "stack_and_tls:999", which are dynamically allocated in
the userspace process' heap. There are only 132 names that do not
contain stack_and_tls using 9540 bytes, repeated 49938 times using
1030651 bytes (108x). Most of those are constant strings, meaning the
pointer is into the .rodata section of a file mapping that is shared
between all processes.

Is fork a concern? It would have to strdup every name.
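
For reference, here is a rough userspace sketch of the scheme being discussed: copy and validate the name when it is set, strdup() on split, strcmp() on merge. This is not code from the patch. The struct and all the anon_name_* helpers are hypothetical names made up for illustration, plain strdup() stands in for strndup_user(), and "printable" is assumed to mean ASCII 0x20 through 0x7e.

```c
#define _POSIX_C_SOURCE 200809L /* for strdup() */
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the name field of a VMA; not the kernel struct. */
struct anon_vma_sketch {
	char *name; /* privately owned copy, or NULL when unnamed */
};

/*
 * Accept printable ASCII only (assumed range 0x20..0x7e). \n \r \v \f are
 * control characters, so the range check already rejects them.
 */
static bool anon_name_is_valid(const char *name)
{
	const unsigned char *p;

	for (p = (const unsigned char *)name; *p; p++)
		if (*p < 0x20 || *p > 0x7e)
			return false;
	return true;
}

/* Set the name from a validated private copy. Returns 0, or -1 on error. */
static int anon_name_set(struct anon_vma_sketch *vma, const char *name)
{
	char *copy;

	if (!anon_name_is_valid(name))
		return -1;
	copy = strdup(name);
	if (!copy)
		return -1;
	free(vma->name);
	vma->name = copy;
	return 0;
}

/* Split: the new half gets its own duplicate, so no refcounting is needed. */
static int anon_name_split(const struct anon_vma_sketch *src,
			   struct anon_vma_sketch *dst)
{
	assert(src && dst);
	dst->name = NULL;
	if (!src->name)
		return 0;
	dst->name = strdup(src->name);
	return dst->name ? 0 : -1;
}

/* Merge: allowed only if both are unnamed or the names compare equal. */
static bool anon_name_can_merge(const struct anon_vma_sketch *a,
				const struct anon_vma_sketch *b)
{
	if (!a->name || !b->name)
		return a->name == b->name;
	return strcmp(a->name, b->name) == 0;
}
```

With per-mapping copies like this, every split (and every fork) pays one allocation per name, which is where the duplicate counts above turn into kernel memory.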