From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1752946AbcLNAbv (ORCPT );
        Tue, 13 Dec 2016 19:31:51 -0500
Received: from mail-io0-f194.google.com ([209.85.223.194]:33655 "EHLO
        mail-io0-f194.google.com" rhost-flags-OK-OK-OK-OK)
        by vger.kernel.org with ESMTP id S1751499AbcLNAbt (ORCPT );
        Tue, 13 Dec 2016 19:31:49 -0500
MIME-Version: 1.0
In-Reply-To: <20161213105233.GG1305@madcap2.tricolour.ca>
References: <20161129164859.GD26673@madcap2.tricolour.ca>
        <20161130045207.GE26673@madcap2.tricolour.ca>
        <20161209060248.GT22655@madcap2.tricolour.ca>
        <20161209110155.GW22655@madcap2.tricolour.ca>
        <20161212100215.GA1305@madcap2.tricolour.ca>
        <20161213105233.GG1305@madcap2.tricolour.ca>
From: Cong Wang
Date: Tue, 13 Dec 2016 16:17:14 -0800
Message-ID:
Subject: Re: netlink: GPF in sock_sndtimeo
To: Richard Guy Briggs
Cc: linux-audit@redhat.com, Paul Moore, Dmitry Vyukov, David Miller,
        Johannes Berg, Florian Westphal, Eric Dumazet, Herbert Xu,
        netdev, LKML, syzkaller
Content-Type: text/plain; charset=UTF-8
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Dec 13, 2016 at 2:52 AM, Richard Guy Briggs wrote:
> It is actually the audit_pid and audit_nlk_portid that I care about
> more.  The audit daemon could vanish or close the socket while the
> kernel sock to which it was attached is still quite valid.  Accessing
> the set of three atomically is the urge.  I wonder if it makes more
> sense to test for the presence of auditd using audit_sock rather than
> audit_pid, but still keep audit_pid for our reporting and replacement
> strategy.  Another idea would be to put the three in one struct.

Note that the process that has audit_pid should hold a refcnt on the
netns too, so the netns can't go away until that process is gone.

>
> Can someone explain how they think the original test was able to trigger
> this GPF?
> Network namespace shutdown while something pretended to set
> up a new auditd?  That's impressive for a fuzzer if that's the case...
> Is there an strace?  I guess it is all in test().
>

I am surprised you still don't get the race condition even when you
are now working on v2...

The race happens in this scenario:

1) Create a new netns
2) In the new netns, communicate with kauditd to set audit_sock
3) Generate some audit messages, so kauditd will keep sending them
   via audit_sock
4) Exit the netns
5) The previous audit_sock is now going away, but kauditd could
   still access it in this small window.
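
To make the window in steps 4) and 5) concrete, here is a minimal userspace sketch in plain C. It is not kernel code and not the proposed patch. Every name in it (toy_netns, toy_sock, toy_auditd_conn and the helpers) is hypothetical and only models the idea discussed in this thread: keep the pid, portid and sock together in one struct, and have whoever registers the sock hold a reference on the netns. Instead of freeing memory, the toy netns is only marked dead, so the stale access can be observed safely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a network namespace (not struct net). */
struct toy_netns {
    int refcnt;  /* number of holders keeping the namespace alive */
    bool dead;   /* set when the last reference is dropped */
};

/* Hypothetical stand-in for the kernel-side socket living in a netns. */
struct toy_sock {
    struct toy_netns *ns;
};

/* Hypothetical bundle of the three values discussed in the thread. */
struct toy_auditd_conn {
    int pid;
    unsigned int portid;
    struct toy_sock *sock;
    bool holds_ns_ref;  /* did registration pin the netns? */
};

static void ns_get(struct toy_netns *ns)
{
    ns->refcnt++;
}

static void ns_put(struct toy_netns *ns)
{
    assert(ns->refcnt > 0);
    if (--ns->refcnt == 0)
        ns->dead = true;  /* a real implementation would tear down here */
}

/* Step 2: record the connection, optionally pinning the netns. */
static void conn_register(struct toy_auditd_conn *c, int pid,
                          unsigned int portid, struct toy_sock *sk,
                          bool take_ref)
{
    c->pid = pid;
    c->portid = portid;
    c->sock = sk;
    c->holds_ns_ref = take_ref;
    if (take_ref)
        ns_get(sk->ns);
}

/* Step 5: would a sender still find a live netns behind the sock? */
static bool conn_sock_alive(const struct toy_auditd_conn *c)
{
    return c->sock != NULL && !c->sock->ns->dead;
}

static void conn_unregister(struct toy_auditd_conn *c)
{
    if (c->holds_ns_ref)
        ns_put(c->sock->ns);
    c->sock = NULL;
    c->holds_ns_ref = false;
}

/* Runs steps 1, 2, 4 and 5 and reports whether the sock was still usable. */
static bool sock_usable_after_netns_exit(bool take_ref)
{
    struct toy_netns ns = { .refcnt = 1, .dead = false };  /* step 1 */
    struct toy_sock sk = { .ns = &ns };
    struct toy_auditd_conn conn;
    bool usable;

    conn_register(&conn, 1234, 42, &sk, take_ref);  /* step 2 */
    ns_put(&ns);                                    /* step 4: creator exits */
    usable = conn_sock_alive(&conn);                /* step 5 */
    conn_unregister(&conn);
    return usable;
}
```

Calling sock_usable_after_netns_exit(false) models the racy case, where the sender is left holding a sock whose netns is already gone. Calling it with true models the case where the reference keeps the netns alive until the connection is unregistered. The real fix may well look different; this only illustrates why a held reference closes the window.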