From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752075AbcKTX1L (ORCPT ); Sun, 20 Nov 2016 18:27:11 -0500
Received: from mail-oi0-f54.google.com ([209.85.218.54]:36784 "EHLO mail-oi0-f54.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750830AbcKTX1I (ORCPT ); Sun, 20 Nov 2016 18:27:08 -0500
MIME-Version: 1.0
In-Reply-To: <1479680873.8455.386.camel@edumazet-glaptop3.roam.corp.google.com>
References: <1479680873.8455.386.camel@edumazet-glaptop3.roam.corp.google.com>
From: Linus Torvalds
Date: Sun, 20 Nov 2016 15:27:07 -0800
X-Google-Sender-Auth: BbiVt6TG8_2Yt_vs1L9Gu99OW1g
Message-ID:
Subject: Re: Linux 4.9-rc6
To: Eric Dumazet, Al Viro
Cc: Linux Kernel Mailing List
Content-Type: text/plain; charset=UTF-8
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Sun, Nov 20, 2016 at 2:27 PM, Eric Dumazet wrote:
>
> Hosts with ~100,000 threads have an issue with /proc/vmallocinfo
>
> It can take about 800 usec to skip over ~100,000 struct vmap_area
> in s_start(), while holding vmap_area_lock spinlock, and therefore
> blocking fork()/pthread_create().
>
> I presume we can not switch to the rbtree (vmap_area_root)
> for /proc/vmallocinfo, because this file is seek-able, right ?

Well, the good news is that the file is root-only anyway, which means
that at least it won't have the issue that a lot of other /proc files
have had - namely being opened by random user programs or libraries.

Which means that the users of it are likely fairly limited. Which in
turn means that we can probably afford to play more games with it.

Including, for example, possibly marking it non-seekable. Or even just
limit the maximum entries we are willing to walk.

Or we could decide that that file shouldn't be a seq_file at all, use
the old "one page buffer" approach that was so common for /proc files,
and make the position encode the vmalloc address in it (make the low
~PAGE_MASK bits be the offset in the line), and then we *could* just
look things up using the rbtree method.

Al, do you have any clever ideas?

              Linus
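
To make the position-encoding idea a bit more concrete, here is a rough
user-space sketch. It is not kernel code and has not been checked
against the real vmalloc implementation. Everything named in it
(SKETCH_PAGE_SHIFT, pos_encode(), pos_to_addr(), pos_to_line_off(),
struct area, area_lower_bound()) is made up for illustration, 4 KiB
pages are assumed, and a sorted array with a binary search stands in
for the rbtree lookup on vmap_area_root.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: 4 KiB pages. The kernel's PAGE_SHIFT is per-architecture. */
#define SKETCH_PAGE_SHIFT  12
#define SKETCH_PAGE_SIZE   ((uint64_t)1 << SKETCH_PAGE_SHIFT)
#define SKETCH_OFFSET_MASK (SKETCH_PAGE_SIZE - 1)  /* low bits: offset in line */

/* Hypothetical stand-in for struct vmap_area: just a start address and size. */
struct area {
	uint64_t start;  /* page-aligned start address */
	uint64_t size;
};

/*
 * Pack a page-aligned area address and a byte offset into that area's
 * output line into a single file position.
 */
static inline uint64_t pos_encode(uint64_t area_addr, uint64_t line_off)
{
	return (area_addr & ~SKETCH_OFFSET_MASK) | (line_off & SKETCH_OFFSET_MASK);
}

/* Recover the area address from a file position. */
static inline uint64_t pos_to_addr(uint64_t pos)
{
	return pos & ~SKETCH_OFFSET_MASK;
}

/* Recover the offset into the area's output line from a file position. */
static inline uint64_t pos_to_line_off(uint64_t pos)
{
	return pos & SKETCH_OFFSET_MASK;
}

/*
 * Return the index of the first area whose start is >= addr, or n if
 * there is none. The array must be sorted by start address. This is the
 * O(log n) lookup that replaces walking the list from the beginning.
 */
static inline size_t area_lower_bound(const struct area *areas, size_t n,
				      uint64_t addr)
{
	size_t lo = 0, hi = n;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (areas[mid].start < addr)
			lo = mid + 1;
		else
			hi = mid;
	}
	return lo;
}
```

A read at some position would then call pos_to_addr() to get the
address, use area_lower_bound() to find the first area at or after it,
and resume output pos_to_line_off() bytes into that area's line. One
thing the sketch glosses over: file positions are signed (loff_t), so a
real version would probably have to encode an offset from the start of
the vmalloc range rather than the raw address.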