From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1755409AbdKJAQX (ORCPT ); Thu, 9 Nov 2017 19:16:23 -0500
Received: from mail-pg0-f51.google.com ([74.125.83.51]:43979 "EHLO
        mail-pg0-f51.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1754502AbdKJAQW (ORCPT ); Thu, 9 Nov 2017 19:16:22 -0500
X-Google-Smtp-Source: ABhQp+TPHSnsYl+lM79uR0Zg+UExIBjIMJj2+WOgWFDBJrxWh3eu783RGsI3jddjydogC4lpM7wrEwsDUdrfdhpYcIc=
MIME-Version: 1.0
In-Reply-To: <008a7e8d-86e2-0709-d2ae-8aa743ef12ac@oracle.com>
References: <20171107102156.3fgxt6y6v5y2kqnf@wfg-t540p.sh.intel.com>
        <20171108094832.qxvkawpw2snpcbvh@wfg-t540p.sh.intel.com>
        <20171108171230.ccf7lwutjysk26fc@wfg-t540p.sh.intel.com>
        <20171109031206.x6ta5ysdalf3lk3s@wfg-t540p.sh.intel.com>
        <008a7e8d-86e2-0709-d2ae-8aa743ef12ac@oracle.com>
From: Cong Wang
Date: Thu, 9 Nov 2017 16:16:01 -0800
Message-ID:
Subject: Re: [vlan_device_event] BUG: unable to handle kernel paging request at 6b6b6ccf
To: Girish Moodalbail
Cc: Fengguang Wu, Alexander Duyck, Linus Torvalds, Jeff Kirsher,
        Network Development, "David S. Miller",
        Linux Kernel Mailing List, intel-wired-lan
Content-Type: text/plain; charset="UTF-8"
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Nov 9, 2017 at 7:51 AM, Girish Moodalbail wrote:
>
> Upon receiving NETDEV_DOWN event, we are calling
>
> vlan_vid_del(dev, htons(ETH_P_8021Q), 0);
>
> which in turn calls call_rcu() to queue vlan_info_free_rcu() to be called at
> some point. This free function frees the array[]
> (vlan_info.vlan_grp.vn_devices_array). My guess is that
> vlan_info_free_rcu() is being called first and then the array[] is being
> accessed in vlan_device_event().
>

Well yes and no.

No, RCU itself is not broken and we clearly unpublish the RCU pointer before
calling call_rcu().

Yes, I see where it is broken: the grp pointer still points to old
dev->vlan_info, we should re-fetch it after vlan_vid_del().

I will send a fix.

Thanks!