From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1754342AbcHSF1I (ORCPT ); Fri, 19 Aug 2016 01:27:08 -0400
Received: from mail-pa0-f68.google.com ([209.85.220.68]:36577 "EHLO
	mail-pa0-f68.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1750879AbcHSF1B (ORCPT );
	Fri, 19 Aug 2016 01:27:01 -0400
From: "Michael Kerrisk (man-pages)"
Subject: [PATCH 6/8] pipe: fix limit checking in alloc_pipe_info()
To: Andrew Morton
References: <67ce15aa-cf43-0c89-d079-2d966177c56d@gmail.com>
Cc: mtk.manpages@gmail.com, Willy Tarreau, Vegard Nossum,
	socketpair@gmail.com, Tetsuo Handa, Jens Axboe, Al Viro,
	linux-api@vger.kernel.org, linux-kernel@vger.kernel.org
Message-ID:
Date: Fri, 19 Aug 2016 17:25:48 +1200
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101
	Thunderbird/45.2.0
MIME-Version: 1.0
In-Reply-To: <67ce15aa-cf43-0c89-d079-2d966177c56d@gmail.com>
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

The limit checking in alloc_pipe_info() (used by pipe(2) and when
opening a FIFO) has the following problems:

(1) When checking capacity required for the new pipe, the checks against
    the limit in /proc/sys/fs/pipe-user-pages-{soft,hard} are made
    against existing consumption, and exclude the memory required for
    the new pipe capacity. As a consequence: (1) the memory allocation
    throttling provided by the soft limit does not kick in quite as
    early as it should, and (2) the user can overrun the hard limit.

(2) As currently implemented, accounting and checking against the limits
    is done as follows:

    (a) Test whether the user has exceeded the limit.
    (b) Make new pipe buffer allocation.
    (c) Account new allocation against the limits.

    This is racy.
    Multiple processes may pass point (a) simultaneously, and then
    allocate pipe buffers that are accounted for only in step (c).
    The race means that the user's pipe buffer allocation could be
    pushed over the limit (by an arbitrary amount, depending on how
    unlucky we were in the race). [Thanks to Vegard Nossum for spotting
    this point, which I had missed.]

This patch addresses the above problems as follows:

* Alter the checks against limits to include the memory required for the
  new pipe.
* Re-order the accounting step so that it precedes the buffer allocation.
  If the accounting step determines that a limit has been reached, revert
  the accounting and cause the operation to fail.

Cc: Willy Tarreau
Cc: Vegard Nossum
Cc: socketpair@gmail.com
Cc: Tetsuo Handa
Cc: Jens Axboe
Cc: Al Viro
Cc: linux-api@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Michael Kerrisk
---
 fs/pipe.c | 20 ++++++++++++--------
 1 file changed, 12 insertions(+), 8 deletions(-)

diff --git a/fs/pipe.c b/fs/pipe.c
index 613c6b9..705d79f 100644
--- a/fs/pipe.c
+++ b/fs/pipe.c
@@ -632,24 +632,28 @@ struct pipe_inode_info *alloc_pipe_info(void)
 	if (pipe == NULL)
 		goto out_free_uid;
 
-	if (!too_many_pipe_buffers_hard(user)) {
-		if (too_many_pipe_buffers_soft(user))
-			pipe_bufs = 1;
-		pipe->bufs = kcalloc(pipe_bufs,
-				     sizeof(struct pipe_buffer),
-				     GFP_KERNEL_ACCOUNT);
-	}
+	if (too_many_pipe_buffers_soft(user))
+		pipe_bufs = 1;
+
+	account_pipe_buffers(user, 0, pipe_bufs);
+
+	if (too_many_pipe_buffers_hard(user))
+		goto out_revert_acct;
+
+	pipe->bufs = kcalloc(pipe_bufs, sizeof(struct pipe_buffer),
+			     GFP_KERNEL_ACCOUNT);
 
 	if (pipe->bufs) {
 		init_waitqueue_head(&pipe->wait);
 		pipe->r_counter = pipe->w_counter = 1;
 		pipe->buffers = pipe_bufs;
 		pipe->user = user;
-		account_pipe_buffers(user, 0, pipe_bufs);
 		mutex_init(&pipe->mutex);
 		return pipe;
 	}
 
+out_revert_acct:
+	account_pipe_buffers(user, pipe_bufs, 0);
 	kfree(pipe);
 out_free_uid:
 	free_uid(user);
-- 
2.5.5
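
The difference between the two orderings described in the commit message
can be sketched in user-space C. This is only an illustration under stated
assumptions, not the kernel code: struct user_acct, charge_check_first()
and charge_account_first() are hypothetical names, the counter uses C11
atomics rather than the kernel's own atomic types, and the comparisons
against the limit are chosen for the sketch rather than copied from
fs/pipe.c.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical per-user counter of pipe buffer pages (not the kernel struct). */
struct user_acct {
	atomic_ulong pipe_bufs;
};

/*
 * Old ordering (sketch): test existing consumption, allocate, then account.
 * The test ignores the pages being requested, and several callers can pass
 * it before any of them has charged its pages.
 */
bool charge_check_first(struct user_acct *u, unsigned long nr,
			unsigned long hard_limit)
{
	if (atomic_load(&u->pipe_bufs) >= hard_limit)
		return false;
	/* ... the buffer allocation would happen here ... */
	atomic_fetch_add(&u->pipe_bufs, nr);
	return true;
}

/*
 * New ordering (sketch): account first, test the limit with the new pages
 * included, and revert the charge if the limit is exceeded.
 */
bool charge_account_first(struct user_acct *u, unsigned long nr,
			  unsigned long hard_limit)
{
	unsigned long total;

	assert(nr > 0);

	/* atomic_fetch_add() returns the old value; add nr for the new total. */
	total = atomic_fetch_add(&u->pipe_bufs, nr) + nr;
	if (total > hard_limit) {
		atomic_fetch_sub(&u->pipe_bufs, nr);	/* revert the accounting */
		return false;
	}
	/* ... the buffer allocation would happen here ... */
	return true;
}
```

With a limit of 32 pages and 31 already charged, charge_check_first()
accepts a 16-page request and leaves the counter at 47, whereas
charge_account_first() rejects the same request and restores the counter.
Because the charge and the test act on one atomic update, concurrent
callers cannot all pass the test on a stale value; the cost is that the
counter may briefly exceed the limit until a failed caller reverts.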