From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-lf1-f43.google.com ([209.85.167.43]:40251 "EHLO
 mail-lf1-f43.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
 with ESMTP id S2389627AbeHARIf (ORCPT ); Wed, 1 Aug 2018 13:08:35 -0400
Received: by mail-lf1-f43.google.com with SMTP id y200-v6so13589591lfd.7
 for ; Wed, 01 Aug 2018 08:22:21 -0700 (PDT)
MIME-Version: 1.0
In-Reply-To: <9aece481-b03d-c30a-7911-a07e586c1db4@oracle.com>
References: <9aece481-b03d-c30a-7911-a07e586c1db4@oracle.com>
From: gokul cg
Date: Wed, 1 Aug 2018 20:52:19 +0530
Message-ID:
Subject: Re: Possible race condition in the kernel between PCI driver and AER handling
To: Thomas Tai
Cc: linux-pci@vger.kernel.org
Content-Type: multipart/alternative; boundary="000000000000c4d8bb057261429a"
Sender: linux-pci-owner@vger.kernel.org
List-ID:

--000000000000c4d8bb057261429a
Content-Type: text/plain; charset="UTF-8"

Hi Thomas,

> In your case, I am hoping to recreate your issue so that we can work
> together to isolate and fix the issue. Do you have any suggestion how to
> fix it at this moment?

Yes, I can reproduce the issue.

I don't have a patch right now. I was thinking about two options:

1) Add a generic callback in pci_dev that notifies subscribers when a
   device is removed from the tree, so that the AER driver can also
   subscribe to it.
2) set_bit(PCI_DEV_DISCONNECTED, &dev->priv_flags) in the PCI device
   flags when the device is removed from the list, and let the AER driver
   manage the free. But I fear this could create a memory leak because
   of the race.

Regards
Gokul

On Wed, Aug 1, 2018 at 7:54 PM, Thomas Tai wrote:

> On 08/01/2018 01:53 AM, gokul cg wrote:
>
>> Hi,
>>
>> I see there is a basic design flaw. As the AER and PCI drivers are
>> independent modules, locally storing a pointer to any data structure
>> from the PCI linked list in the AER driver will create problems, as
>> there is no synchronization between the two.
>>
>> https://elixir.bootlin.com/linux/v3.10.99/source/drivers/pci/pcie/aer/aerdrv_core.c#L701
>> Here 'struct aer_err_info
>> <https://elixir.bootlin.com/linux/v3.10.99/ident/aer_err_info> *e_info
>> <https://elixir.bootlin.com/linux/v3.10.99/ident/e_info>' has a pointer
>> to the pci dev, which can be removed from the PCI tree at any time.
>> I think this is the basic issue.
>>
>
> Hi Gokul,
> Agree. We had an issue last week about this e_info storing the pci_dev,
> which is removed in pcie_do_fatal_recovery() and causes a use-after-free
> problem.
>
> In your case, I am hoping to recreate your issue so that we can work
> together to isolate and fix the issue. Do you have any suggestion how to
> fix it at this moment?
>
> Thanks,
> Thomas
>
>>
>> Regards
>> Gokul
>>
>>
>> On Tue, Jul 31, 2018 at 6:45 PM, Thomas Tai wrote:
>>
>>     On 07/31/2018 08:42 AM, gokul cg wrote:
>>
>>         Hi All,
>>
>>         I am suspecting a possible race condition in the kernel
>>         between the PCI driver and AER handling.
>>
>>         Because of it, a kernel panic happens in the worker thread
>>         that handles the bottom half of the AER IRQ.
>>
>>         I am seeing this issue when I suddenly power off a PCI card
>>         that supports and has enabled PCIe AER error reporting.
>>
>>         While the PCI device is powering off, the AER driver gets an
>>         AER IRQ for the device. The AER IRQ handler caches the AER
>>         error code and schedules a worker thread to handle the error.
>>
>>     Hi Gokul,
>>
>>     It may be an issue in the AER driver. How do you power off your
>>     device? I've never seen this issue with a normal shutdown or with
>>     "echo 0 > /sys/bus/pci/slots/xx/power"
>>
>>     Cheers,
>>     Thomas
>>
>>         The PCIe device gets removed from the PCI tree before the
>>         worker thread completes its task, and the kernel panics when
>>         the worker thread tries to access the PCI device's config
>>         space.
>>
>>         Issue:
>>
>>         crash>
>>         crash> bt
>>         PID: 2727   TASK: ffff880272adc530  CPU: 0   COMMAND: "kworker/0:2"
>>         #0 [ffff88027469fac8] machine_kexec at ffffffff8102cf18
>>         #1 [ffff88027469fb28] crash_kexec at ffffffff810a6b05
>>         #2 [ffff88027469fbf0] oops_end at ffffffff8176d960
>>         #3 [ffff88027469fc18] die at ffffffff810060db
>>         #4 [ffff88027469fc48] do_general_protection at ffffffff8176d452
>>         #5 [ffff88027469fc70] general_protection at ffffffff8176cdf2
>>             [exception RIP: pci_bus_read_config_dword+100]
>>             RIP: ffffffff813405f4  RSP: ffff88027469fd20  RFLAGS: 00010046
>>             RAX: 435f494350006963  RBX: ffff880274892000  RCX: 0000000000000004
>>             RDX: 0000000000000100  RSI: 0000000000000060  RDI: ffff880274892000
>>             RBP: ffff88027469fd48   R8: ffff88027469fd2c   R9: 00000000000012c0
>>             R10: 0000000000000006  R11: 00000000000012bf  R12: ffff88027469fd5c
>>             R13: 0000000000000246  R14: 0000000000000000  R15: ffff8802741a4000
>>             ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0000
>>         #6 [ffff88027469fd50] pci_find_next_ext_capability at ffffffff81345d7b
>>         #7 [ffff88027469fd90] pci_find_ext_capability at ffffffff81347225
>>         #8 [ffff88027469fda0] get_device_error_info at ffffffff81356c4d
>>         #9 [ffff88027469fdd0] aer_isr at ffffffff81357a38
>>         #10 [ffff88027469fe28] process_one_work at ffffffff8105d4c0
>>         #11 [ffff88027469fe70] worker_thread at ffffffff8105e251
>>         #12 [ffff88027469fed0] kthread at ffffffff81064260
>>         #13 [ffff88027469ff50] ret_from_fork at ffffffff81773a38
>>         crash>
>>
>>         I have tested this on kernel 3.10, but from the source I can
>>         see that the case is still relevant for the latest Linux
>>         source.
>>
>>         Can anybody tell me if this is an issue with the AER driver
>>         in Linux?
>>
>>         Regards
>>         Gokul CG
>>

--000000000000c4d8bb057261429a
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
--000000000000c4d8bb057261429a--