From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753155AbdGHSzL (ORCPT ); Sat, 8 Jul 2017 14:55:11 -0400 Received: from mail-oi0-f65.google.com ([209.85.218.65]:34436 "EHLO mail-oi0-f65.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752907AbdGHSzJ (ORCPT ); Sat, 8 Jul 2017 14:55:09 -0400 MIME-Version: 1.0 In-Reply-To: References: <1499452335-3478-1-git-send-email-xiyou.wangcong@gmail.com> From: Linus Torvalds Date: Sat, 8 Jul 2017 11:55:07 -0700 X-Google-Sender-Auth: tyP5Tc50hHv7XDZKwRmlM2XfiHc Message-ID: Subject: Re: [Patch] mqueue: fix the retry logic for netlink_attachskb() To: Cong Wang Cc: Network Development , Linux Kernel Mailing List , Gene Blue , Andrew Morton , Manfred Spraul Content-Type: text/plain; charset="UTF-8" Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Sat, Jul 8, 2017 at 11:04 AM, Cong Wang wrote: >> >> Can you confirm that? I don't know where the original report is. > > Yes of course, setting 'sock' to NULL before 'goto retry' is sufficient > to fix it, that is in fact my initial thought. And I realized retry'ing > fdget() can't help anything in this situation but increases the > attack vector, so I decided to get rid of it from the retry loop > instead of just NULL'ing 'sock'. > > Or do you prefer the simpler fix? Or should I just resend it with > a improved changelog? It was just the combination of that nasty code, your patch, and the explanation that confused me. Reading the patch, I actually thought that one of the things you fixed was moving the "fdput()" later, to after the netlink_attachskb(). And I thought you did that because netlink_attachskb() would need the file to be still around keeping a reference count to the socket, and without it the socket could have been dropped in the meantime. But reading the code more closely, I notice that netlink_getsockbyfilp() gets a reference to the sock, and it's that netlink_attachskb() will drop that reference on error or retry. So the fdput() makes sense after netlink_getsockbyfilp(), but that's also why the retry code currently includes repeating the fdget()... And the error handling for the fdget is that then triggers the real bug. So the reason you moved the fdput() later wasn't to protect the socket reference, it was just because of how the whole retry loop needs to have the file descriptor just to get a new reference to the socket. That's why I thought you fixed a bug even in the first iteration, but it turns out that was just me making assumptions based on mis-reading the patch without looking at all the context and the logic of the called functions. Now that I have checked deeper, I realize that your patch description was actually correct about this only being a retry problem - the first time around the reference count ends up moving correctly from file to socket, but then when it repeats and 'sock' may contain a stale pointer, we may end up doing the wrong thing when the fdget fails. Honestly, now I feel like either patch is fine, and your original commit message is fine too - but I just hate that code. And making it use some nice helper function to clean it up looks painful too, because the error handling is so odd (ie mq_netlink_attachskb() will free the skb on error, while the other error cases won't, so you'd have to have some special handling for the different errors that can happen). Honestly, this code is nasty, and right now my feeling is that it would be good to have a minimal patch that also backports cleanly. Maybe somebody can clean it up later, but that's a separate windmill to rail against. And due to the recent compat cleanups by Al, your bigger patch does not apply cleanly to current git - but the smaller patch to just setting 'sock' to NULL before that 'goto retry' should apply cleanly to all versions of this code. So purely because of that reason, I think I'd prefer to see that smaller patch instead. Would you mind re-sending the thing? Sorry about the whole confusion. Linus