From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1030307AbcIWJ7F (ORCPT ); Fri, 23 Sep 2016 05:59:05 -0400
Received: from mx2.suse.de ([195.135.220.15]:43523 "EHLO mx2.suse.de"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S932274AbcIWJ7D (ORCPT ); Fri, 23 Sep 2016 05:59:03 -0400
Subject: Re: [PATCH v2] fs/select: add vmalloc fallback for select(2)
To: David Laight , Eric Dumazet
References: <20160922164359.9035-1-vbabka@suse.cz>
	<1474562982.23058.140.camel@edumazet-glaptop3.roam.corp.google.com>
	<12efc491-a0e7-1012-5a8b-6d3533c720db@suse.cz>
	<1474564068.23058.144.camel@edumazet-glaptop3.roam.corp.google.com>
	<063D6719AE5E284EB5DD2968C1650D6DB0107DC8@AcuExch.aculab.com>
Cc: Alexander Viro , Andrew Morton ,
	"linux-fsdevel@vger.kernel.org" ,
	"linux-kernel@vger.kernel.org" ,
	"linux-mm@kvack.org" , Michal Hocko ,
	"netdev@vger.kernel.org" , Linux API ,
	"linux-man@vger.kernel.org"
From: Vlastimil Babka
Message-ID: <3bbcc269-ec8b-12dd-e0ae-190c18bc3f47@suse.cz>
Date: Fri, 23 Sep 2016 11:58:34 +0200
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101
	Thunderbird/45.1.1
MIME-Version: 1.0
In-Reply-To: <063D6719AE5E284EB5DD2968C1650D6DB0107DC8@AcuExch.aculab.com>
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On 09/23/2016 11:42 AM, David Laight wrote:
> From: Vlastimil Babka
>> Sent: 22 September 2016 18:55
> ...
>> So in the case of select() it seems like the memory we need 6 bits per file
>> descriptor, multiplied by the highest possible file descriptor (nfds) as passed
>> to the syscall. According to the man page of select:
>>
>>        EINVAL nfds is negative or exceeds the RLIMIT_NOFILE resource limit (see
>>               getrlimit(2)).
>
> That second clause is relatively recent.

Interesting... so it was added without actually being true in the kernel code?

>> The code actually seems to silently cap the value instead of returning EINVAL
>> though? (IIUC):
>>
>>         /* max_fds can increase, so grab it once to avoid race */
>>         rcu_read_lock();
>>         fdt = files_fdtable(current->files);
>>         max_fds = fdt->max_fds;
>>         rcu_read_unlock();
>>         if (n > max_fds)
>>                 n = max_fds;
>>
>> The default for this cap seems to be 1024 where I checked (again, IIUC, it's
>> what ulimit -n returns?). I wasn't able to change it to more than 2048, which
>> makes the bitmaps still below PAGE_SIZE.
>>
>> So if I get that right, the system admin would have to allow really large
>> RLIMIT_NOFILE to even make vmalloc() possible here. So I don't see it as a large
>> concern?
>
> 4k open files isn't that many.
> Especially for programs that are using pipes to emulate windows events.

Sure but IIUC we need 6 bits per file. That means up to almost 42k files, we
should fit into order-3 allocation, which effectively cannot fail right now.

> I suspect that fdt->max_fds is an upper bound for the highest fd the
> process has open - not the RLIMIT_NOFILE value.

I gathered that the highest fd effectively limits the number of files, so it's
the same. I might be wrong.

> select() shouldn't be silently ignoring large values of 'n' unless
> the fd_set bits are zero.

Yeah that doesn't seem to conform to the manpage.

> Of course, select does scale well for high numbered fds
> and neither poll nor select scale well for large numbers of fds.

True.

> David