From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1165739AbdD1Pal (ORCPT ); Fri, 28 Apr 2017 11:30:41 -0400
Received: from mail-ua0-f170.google.com ([209.85.217.170]:32986 "EHLO
        mail-ua0-f170.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1165721AbdD1Pa0 (ORCPT ); Fri, 28 Apr 2017 11:30:26 -0400
MIME-Version: 1.0
X-Originating-IP: [108.49.102.27]
In-Reply-To: <15bb2070c10.2852.85c95baa4474aabc7814e68940a78392@paul-moore.com>
References: <15bb2070c10.2852.85c95baa4474aabc7814e68940a78392@paul-moore.com>
From: Paul Moore
Date: Fri, 28 Apr 2017 11:30:24 -0400
Message-ID:
Subject: Re: Boot regression caused by kauditd
To: Cong Wang
Cc: LKML
Content-Type: text/plain; charset=UTF-8
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Apr 27, 2017 at 8:47 PM, Paul Moore wrote:
> In that case please send a proper inline patch to the audit mailing list
> and we'll review it.
>
> Thanks.

Now that I'm back in front of a proper screen/keyboard I've been
looking over your patch and while you are very right in that the
current RCU usage is very wrong, there are quite a few things I would
like to see changed in your patch ... I'm working on something right
now, I'll post an RFC draft to the audit list and CC you once I get
this sorted out, expect something in a few hours.

Also, once you've had a look at this new patch, and assuming you are
okay with it, I'd like to add your sign-off to it. This may not be
your patch exactly, but a significant portion of it is borrowed from
your patch yesterday.

> On April 27, 2017 7:41:45 PM Cong Wang wrote:
>
>> On Thu, Apr 27, 2017 at 3:38 PM, Paul Moore wrote:
>>> On Thu, Apr 27, 2017 at 5:45 PM, Cong Wang wrote:
>>>> On Thu, Apr 27, 2017 at 2:35 PM, Cong Wang wrote:
>>>>> On Thu, Apr 27, 2017 at 1:31 PM, Cong Wang wrote:
>>>>>> On Wed, Apr 26, 2017 at 2:20 PM, Paul Moore wrote:
>>>>>>> Thanks for the report, this is the only one like it that I've seen.
>>>>>>> I'm looking at the code in Linus' tree and I'm not seeing anything
>>>>>>> obvious ... looking at the trace above it appears that the problem is
>>>>>>> when get_net() goes to bump the refcount and the passed net pointer is
>>>>>>> NULL; unless I'm missing something, the only way this would happen in
>>>>>>> kauditd_thread() is if the auditd_conn.pid value is non-zero but the
>>>>>>> auditd_conn.net pointer is NULL.
>>>>>>>
>>>>>>> That shouldn't happen.
>>>>>>>
>>>>>>
>>>>>> Looking at the code that reads/writes the global auditd_conn,
>>>>>> I don't see how it even works with RCU+spinlock, RCU plays
>>>>>> with pointers and you have to make a copy as its name implies.
>>>>>> But it looks like you simply use RCU+spinlock as a traditional
>>>>>> rwlock, it doesn't work.
>>>>>
>>>>> The attached patch seems working for me, I tried to boot my
>>>>> VM for 4 times, so far no crash or warning.
>>>>>
>>>>
>>>> Or even better, save a memory allocation for reset path...
>>>
>>> I need to step away from my laptop for the evening so I can't give
>>> this a proper review until tomorrow (sending patches as attachments
>>> makes it difficult to review), but on quick glance I did notice a few
>>> small things I would like to see changed. However, since there is no
>>> normal commit description and sign-off, I'm guessing you sent these
>>> out as a suggestion and not a proper patch submission, yes/no? If
>>> that's the case, I'll work up a proper fix tomorrow and share it with
>>> you for comment/review, but if you were planning on sending a proper
>>> patch let me know and I'll wait until I see something in my inbox from
>>> you.
>>
>> I want you to give it sanity check before I submit a formal one. ;)
>> If you don't reject it, I will send a formal one with description and SoB.
>>
>> Thanks.
>
>

-- 
paul moore
www.paul-moore.com
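
For reference, the copy-then-publish pattern that the quoted RCU
discussion alludes to can be sketched in userspace C. This is only an
illustration under stated assumptions; it is not the patch under
discussion and not the kernel's audit code. struct conn, conn_set(),
conn_reset() and conn_snapshot() are hypothetical names, C11 atomics
stand in for rcu_assign_pointer()/rcu_dereference(), and the immediate
free() stands in for a free deferred past a grace period (kfree_rcu()
or call_rcu() in the kernel), which is acceptable here only because the
sketch is single-threaded.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for the auditd connection state (pid + net). */
struct conn {
    int pid;
    const void *net;    /* stand-in for a struct net pointer */
};

/* The only shared state is one pointer; NULL means "no connection". */
static _Atomic(struct conn *) current_conn;

/* Reader: load the pointer once and work only from that snapshot. */
static int conn_snapshot(struct conn *out)
{
    struct conn *c;

    assert(out != NULL);
    c = atomic_load_explicit(&current_conn, memory_order_acquire);
    if (!c)
        return 0;       /* no connection published */
    *out = *c;          /* pid and net come from the same object */
    return 1;
}

/* Writer: build a complete new object, then publish it in one store. */
static int conn_set(int pid, const void *net)
{
    struct conn *new_conn = malloc(sizeof(*new_conn));
    struct conn *old;

    if (!new_conn)
        return -1;
    new_conn->pid = pid;
    new_conn->net = net;
    old = atomic_exchange_explicit(&current_conn, new_conn,
                                   memory_order_acq_rel);
    /* ASSUMPTION: single-threaded sketch. With concurrent readers the
     * old object must not be freed until they have all finished. */
    free(old);
    return 0;
}

/* Reset: publish NULL, so no allocation is needed on this path. */
static void conn_reset(void)
{
    struct conn *old = atomic_exchange_explicit(&current_conn,
                                                (struct conn *)NULL,
                                                memory_order_acq_rel);
    free(old);          /* same single-threaded assumption as above */
}
```

Because pid and net live in one object that is published by a single
pointer store, a reader in this sketch sees either the old pair or the
new pair, never a non-zero pid together with a NULL net.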