From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S933254AbcLILCC (ORCPT ); Fri, 9 Dec 2016 06:02:02 -0500 Received: from mx1.redhat.com ([209.132.183.28]:53342 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S932245AbcLILB7 (ORCPT ); Fri, 9 Dec 2016 06:01:59 -0500 Date: Fri, 9 Dec 2016 06:01:55 -0500 From: Richard Guy Briggs To: Cong Wang Cc: linux-audit@redhat.com, Paul Moore , Dmitry Vyukov , David Miller , Johannes Berg , Florian Westphal , Eric Dumazet , Herbert Xu , netdev , LKML , syzkaller Subject: Re: netlink: GPF in sock_sndtimeo Message-ID: <20161209110155.GW22655@madcap2.tricolour.ca> References: <20161129164859.GD26673@madcap2.tricolour.ca> <20161130045207.GE26673@madcap2.tricolour.ca> <20161209060248.GT22655@madcap2.tricolour.ca> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.21 (2010-09-15) X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.25]); Fri, 09 Dec 2016 11:01:59 +0000 (UTC) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 2016-12-08 22:57, Cong Wang wrote: > On Thu, Dec 8, 2016 at 10:02 PM, Richard Guy Briggs wrote: > > I also tried to extend Cong Wang's idea to attempt to proactively respond to a > > NETLINK_URELEASE on the audit_sock and reset it, but ran into a locking error > > stack dump using mutex_lock(&audit_cmd_mutex) in the notifier callback. > > Eliminating the lock since the sock is dead anways eliminates the error. > > > > Is it safe? I'll resubmit if this looks remotely sane. Meanwhile I'll try to > > get the test case to compile. > > It doesn't look safe, because 'audit_sock', 'audit_nlk_portid' and 'audit_pid' > are updated as a whole and race between audit_receive_msg() and > NETLINK_URELEASE. This is what I expected and why I originally added the mutex lock in the callback... The dumps I got were bare with no wrapper identifying the process context or specific error, so I'm at a bit of a loss how to solve this (without thinking more about it) other than instinctively removing the mutex. Another approach might be to look at consolidating the three into one identifier or derive the other two from one, or serialize their access. > > @@ -1167,10 +1190,14 @@ static void __net_exit audit_net_exit(struct net *net) > > { > > struct audit_net *aunet = net_generic(net, audit_net_id); > > struct sock *sock = aunet->nlsk; > > + > > + mutex_lock(&audit_cmd_mutex); > > if (sock == audit_sock) { > > audit_pid = 0; > > + audit_nlk_portid = 0; > > audit_sock = NULL; > > } > > + mutex_unlock(&audit_cmd_mutex); > > If you decide to use NETLINK_URELEASE notifier, the above piece is no > longer needed, the net_exit path simply releases a refcnt. Good point. It would have already killed it off. So this piece is arguably too late anyways. - RGB -- Richard Guy Briggs Kernel Security Engineering, Base Operating Systems, Red Hat Remote, Ottawa, Canada Voice: +1.647.777.2635, Internal: (81) 32635