From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Return-Path:
MIME-Version: 1.0
In-Reply-To: <0afd8c9c-8552-4141-3ccf-9d90d4698a0b@oracle.com>
References: <20180801164358.GI2534@lahna.fi.intel.com>
 <20180801171512.GA28440@wunner.de>
 <20180802072036.GN2534@lahna.fi.intel.com>
 <20180802084657.GA21267@wunner.de>
 <20180802150749.GA31683@wunner.de>
 <0afd8c9c-8552-4141-3ccf-9d90d4698a0b@oracle.com>
From: gokul cg
Date: Wed, 8 Aug 2018 16:51:08 +0530
Message-ID:
Subject: Re: [PATCH] PCI: pciehp: Differentiate between surprise and safe removal
To: Thomas Tai
Cc: Lukas Wunner, Mika Westerberg, Bjorn Helgaas, Ashok Raj, Keith Busch,
 Yinghai Lu, Sinan Kaya, linux-pci@vger.kernel.org, Alexandru Gagniuc
Content-Type: multipart/alternative; boundary="00000000000009cb5f0572eab54c"
List-ID:

--00000000000009cb5f0572eab54c
Content-Type: text/plain; charset="UTF-8"

Thanks Thomas,

With the patch you suggested, the panic no longer occurs in
pci_find_next_ext_capability(), since that function is no longer called
from within aer_isr(), but it now hits in pci_bus_read_config_dword().

-------------------xxxxxxx bt og xxxxxxxx-----------------"
PID: 24     TASK: ffff880274ac0000  CPU: 0   COMMAND: "kworker/0:1"
 #0 [ffff880274abbb18] machine_kexec at ffffffff8102cf18
 #1 [ffff880274abbb78] crash_kexec at ffffffff810a6b05
 #2 [ffff880274abbc40] oops_end at ffffffff8176d960
 #3 [ffff880274abbc68] die at ffffffff810060db
 #4 [ffff880274abbc98] do_general_protection at ffffffff8176d452
 #5 [ffff880274abbcc0] general_protection at ffffffff8176cdf2
    [exception RIP: pci_bus_read_config_dword+100]
    RIP: ffffffff813405f4  RSP: ffff880274abbd70  RFLAGS: 00010046
    RAX: 455a494c41495449  RBX: ffff880274891800  RCX: 0000000000000004
    RDX: 0000000000000110  RSI: 0000000000000060  RDI: ffff880274891800
    RBP: ffff880274abbd98   R8: ffff880274abbd7c   R9: 00000000000011b5
    R10: 0000000000000000  R11: 00000000000011b4  R12: ffff8802741a0210
    R13: 0000000000000246  R14: ffff880272afc008  R15: ffff880272af8800
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #6 [ffff880274abbda0] get_device_error_info at ffffffff81356d74
 #7 [ffff880274abbdd0] aer_isr at ffffffff81357b41
 #8 [ffff880274abbe28] process_one_work at ffffffff8105d4c0
 #9 [ffff880274abbe70] worker_thread at ffffffff8105e251
#10 [ffff880274abbed0] kthread at ffffffff81064260
#11 [ffff880274abbf50] ret_from_fork at ffffffff81773a38"
-------------------xxxxxxx bt og end xxxxxxxx-----------------

Regards,
Gokul

On Tue, Aug 7, 2018 at 9:00 PM, Thomas Tai <thomas.tai@oracle.com> wrote:

> Hi Gokul,
> Something pop up in my mind and want to share with you. I assume that your
> device is not a root port device or a switch device. I assume when you
> power off the device, a FATAL error is sent to the root port thus trigger
> the aer_isr.
>
> Since it is a fatal error and your device is not a switch device, the code
> should not reach out your device because fatal error means that the link to
> your device is not reliable. So the pci_find_ext_capability() looks strange
> to me. When compare the code with the master branch. v3.10 is missing
> following patch.
> Would you think you can give it a try?
>
> commit 66b808099146166c44157600a166c8372172cd76
> Author: Keith Busch <keith.busch@intel.com>
> Date:   Tue Sep 27 16:23:34 2016 -0400
>
>     PCI/AER: Cache capability position
>
>     Save the position of the error reporting capability so it doesn't need to
>     be rediscovered during error handling.
>
>     Signed-off-by: Keith Busch <keith.busch@intel.com>
>     Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
>     CC: Lukas Wunner <lukas@wunner.de>
>
> - Thomas
>
>
> On 08/06/2018 02:33 PM, gokul cg wrote:
>
>> Hi,
>>
>> I have tried with following patch and I am still getting same kernel
>> panic.
>>
>> -------------X++++++++++++++++++++X---------------------
>>
>> diff --git a/drivers/pci/pcie/aer/aerdrv_core.c b/drivers/pci/pcie/aer/aerdrv_core.c
>> index 0f4554e..05592aa 100644
>> --- a/drivers/pci/pcie/aer/aerdrv_core.c
>> +++ b/drivers/pci/pcie/aer/aerdrv_core.c
>> @@ -26,6 +26,7 @@
>>   #include <linux/slab.h>
>>   #include <linux/kfifo.h>
>>   #include "aerdrv.h"
>> +#include "../../pci.h"
>>
>>   static bool forceload;
>>   static bool nosourceid;
>> @@ -82,7 +82,7 @@ EXPORT_SYMBOL_GPL(pci_cleanup_aer_uncorrect_error_status);
>>   static int add_error_device(struct aer_err_info *e_info, struct pci_dev *dev)
>>   {
>> if (e_info->error_dev_num < AER_MAX_MULTI_ERR_DEVICES) {
>> -e_info->dev[e_info->error_dev_num] = dev;
>> +e_info->dev[e_info->error_dev_num] = pci_dev_get(dev);
>>
>> e_info->error_dev_num++;
>> return 0;
>> }
>> @@ -659,6 +659,9 @@ static int get_device_error_info(struct pci_dev *dev, struct aer_err_info *info)
>> if (!pos)
>> return 1;
>>
>> +        if (pci_dev_is_disconnected(dev))
>> +                return 0;
>> +
>> if (info->severity == AER_CORRECTABLE) {
>> pci_read_config_dword(dev, pos + PCI_ERR_COR_STATUS,
>> &info->status);
>> @@ -710,6 +713,8 @@ static inline void aer_process_err_devices(struct pcie_device *p_device,
>> for (i = 0; i < e_info->error_dev_num && e_info->dev[i]; i++) {
>> if (get_device_error_info(e_info->dev[i], e_info))
>> handle_error_source(p_device, e_info->dev[i], e_info);
>> +
>> +                pci_dev_put(e_info->dev[i]);
>> }
>>   }
>> -------------X++++++++++++++++++++X---------------------
>>
>>
>> Note: I have configured CONFIG_HOTPLUG_PCI_PCIE and CONFIG_HOTPLUG_PCI as
>> modules and loading in start up using script.
>>
>> root@/proc/:~# cat config | grep -i HOT
>> CONFIG_TICK_ONESHOT=y
>> CONFIG_HOTPLUG=y
>> # CONFIG_MEMORY_HOTPLUG is not set
>> CONFIG_HOTPLUG_CPU=y
>> # CONFIG_BOOTPARAM_HOTPLUG_CPU0 is not set
>> # CONFIG_DEBUG_HOTPLUG_CPU0 is not set
>> CONFIG_ARCH_ENABLE_MEMORY_HOTPLUG=y
>> CONFIG_ACPI_HOTPLUG_CPU=y
>> CONFIG_HOTPLUG_PCI_PCIE=m
>> CONFIG_HOTPLUG_PCI=m
>> # CONFIG_HOTPLUG_PCI_CPCI is not set
>> # CONFIG_HOTPLUG_PCI_SHPC is not set
>> CONFIG_DM_SNAPSHOT=y
>> # CONFIG_USB_STORAGE_JUMPSHOT is not set
>> # CONFIG_TRACER_SNAPSHOT is not set
>> root@/proc/:~#
>>
>> Panic back trace :
>> crash> bt
>> PID: 24     TASK: ffff880274ac0000  CPU: 0   COMMAND: "kworker/0:1"
>>  #0 [ffff880274abbac8] machine_kexec at ffffffff8102cf18
>>  #1 [ffff880274abbb28] crash_kexec at ffffffff810a6b05
>>  #2 [ffff880274abbbf0] oops_end at ffffffff8176d8a0
>>  #3 [ffff880274abbc18] die at ffffffff810060db
>>  #4 [ffff880274abbc48] do_general_protection at ffffffff8176d392
>>  #5 [ffff880274abbc70] general_protection at ffffffff8176cd32
>>     [exception RIP: pci_bus_read_config_dword+100]
>>     RIP: ffffffff813405f4  RSP: ffff880274abbd20  RFLAGS: 00010046
>>     RAX: 435f494350006963  RBX: ffff880274891800  RCX: 0000000000000004
>>     RDX: 0000000000000ffc  RSI: 0000000000000060  RDI: ffff880274891800
>>     RBP: ffff880274abbd48   R8: ffff880274abbd2c   R9: 00000000000002b8
>>     R10: ffff880274340000  R11: 0000000000000246  R12: ffff880274abbd5c
>>     R13: 0000000000000246  R14: 0000000000000000  R15: ffff880274920000
>>     ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
>>  #6 [ffff880274abbd50] pci_find_next_ext_capability at ffffffff81345db6
>>  #7 [ffff880274abbd90] pci_find_ext_capability at ffffffff81347225
>>  #8 [ffff880274abbda0] get_device_error_info at ffffffff81356c4d
>>  #9 [ffff880274abbdd0] aer_isr at ffffffff81357ab0
>> #10 [ffff880274abbe28] process_one_work at ffffffff8105d4c0
>> #11 [ffff880274abbe70] worker_thread at ffffffff8105e251
>> #12 [ffff880274abbed0] kthread at ffffffff81064260
>> #13 [ffff880274abbf50] ret_from_fork at ffffffff81773978
>> crash>
>>
>>
>> Regards,
>> Gokul
>>
>> On Thu, Aug 2, 2018 at 10:39 PM, Thomas Tai <thomas.tai@oracle.com
>> <mailto:thomas.tai@oracle.com>> wrote:
>>
>>
>>     On 08/02/2018 11:07 AM, Lukas Wunner wrote:
>>
>>         [cc += Thomas Tai]
>>
>>
>>     Hi Lukas,
>>     Thank you very much for cc me.
>>
>>
>>         On Thu, Aug 02, 2018 at 10:46:57AM +0200, Lukas Wunner wrote:
>>
>>             On Thu, Aug 02, 2018 at 12:59:18PM +0530, gokul cg wrote:
>>
>>                 I am suspecting a possible race condition in the kernel
>>                 between PCI driver and AER handling.
>>
>>
>>             The solution is to acquire a ref on each device in
>>             add_error_device().
>>             Then release the ref aer_process_err_devices() by calling
>>             pci_dev_put().
>>
>>
>>         So in case it wasn't clear, the below is what I had in mind.
>>         Completely untested though.  Does this work for you?
>>
>>         For v3.10 compatibility, cherry-pick 89ee9f768003 (or alternatively
>>         cherry-pick 8496e85c20e7 and replace pci_dev_is_disconnected(dev)
>>         with !pci_device_is_present(dev)).
>>
>>         -- >8 --
>>         Subject: [PATCH] PCI/AER: Fix use-after-free on surprise removal
>>
>>         The work item to consume errors, aer_isr(), walks the hierarchy using
>>         pci_walk_bus() and stores a pointer to PCI devices which reported an
>>         error in an array.  As long as pci_walk_bus() runs, those pointers are
>>         valid because pci_bus_sem is held.  But once pci_walk_bus() finishes,
>>         nothing prevents the pointers from becoming invalid, e.g. through
>>         unplugging of the PCI devices.  The unprotected pointers are then
>>         dereferenced in aer_process_err_devices(), which may oops:
>>
>>
>>     I like your idea to increment the refcount during pci_walk_bus(),
>>     that should fix the use-after-free issue. We just need Gokul to
>>     confirm if it fixes his issue or not.
>>
>>     Thanks,
>>     Thomas
>>
>>
>>
>>              #5  general_protection at ffffffff8176cdf2
>>                  [exception RIP: pci_bus_read_config_dword+100]
>>              #6  pci_find_next_ext_capability at ffffffff81345d7b
>>              #7  pci_find_ext_capability at ffffffff81347225
>>              #8  get_device_error_info at ffffffff81356c4d
>>              #9  aer_isr at ffffffff81357a38
>>
>>         Fix by holding a ref on the devices until they have been processed.
>>         Skip processing of unplugged devices.
>>
>>         Reported-by: gokul cg <gokuljnpr@gmail.com <mailto:gokuljnpr@gmail.com>>
>>         Signed-off-by: Lukas Wunner <lukas@wunner.de <mailto:lukas@wunner.de>>
>>
>>         ---
>>           drivers/pci/pcie/aer.c | 6 +++++-
>>           1 file changed, 5 insertions(+), 1 deletion(-)
>>
>>         diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
>>         index a2e8838..937592e 100644
>>         --- a/drivers/pci/pcie/aer.c
>>         +++ b/drivers/pci/pcie/aer.c
>>         @@ -657,7 +657,7 @@ void cper_print_aer(struct pci_dev *dev, int aer_severity,
>>          static int add_error_device(struct aer_err_info *e_info, struct pci_dev *dev)
>>          {
>>                  if (e_info->error_dev_num < AER_MAX_MULTI_ERR_DEVICES) {
>>         -                e_info->dev[e_info->error_dev_num] = dev;
>>         +                e_info->dev[e_info->error_dev_num] = pci_dev_get(dev);
>>                          e_info->error_dev_num++;
>>                          return 0;
>>                  }
>>         @@ -898,6 +898,9 @@ static int get_device_error_info(struct pci_dev *dev, struct aer_err_info *info)
>>                  if (!pos)
>>                          return 0;
>>
>>         +        if (pci_dev_is_disconnected(dev))
>>         +                return 0;
>>         +
>>                  if (info->severity == AER_CORRECTABLE) {
>>                          pci_read_config_dword(dev, pos + PCI_ERR_COR_STATUS,
>>                                          &info->status);
>>         @@ -948,6 +951,7 @@ static inline void aer_process_err_devices(struct aer_err_info *e_info)
>>                  for (i = 0; i < e_info->error_dev_num && e_info->dev[i]; i++) {
>>                          if (get_device_error_info(e_info->dev[i], e_info))
>>                                  handle_error_source(e_info->dev[i], e_info);
>>         +                pci_dev_put(e_info->dev[i]);
>>                  }
>>          }
>>
>>
>>
--00000000000009cb5f0572eab54c
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
--00000000000009cb5f0572eab54c--
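
Not part of the original message: a minimal user-space sketch of the reference-holding pattern discussed in this thread, namely taking a reference when a device pointer is stored, skipping devices that have since been removed, and dropping the reference after processing. It assumes a hypothetical `struct fake_dev` with a plain integer refcount in place of `struct pci_dev`. The names `fake_dev_get()`, `fake_dev_put()`, `collect_error_device()`, `process_error_devices()` and `run_demo()` are illustrative stand-ins rather than kernel APIs; the kernel helpers named in the quoted patch are `pci_dev_get()`, `pci_dev_put()` and `pci_dev_is_disconnected()`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_ERR_DEVICES 5   /* stand-in for AER_MAX_MULTI_ERR_DEVICES */

/* Hypothetical stand-in for struct pci_dev; not a kernel type. */
struct fake_dev {
	int refcount;        /* the object may be freed only when this hits 0 */
	bool disconnected;   /* set once the device has been surprise-removed */
	int processed;       /* how often the error handler touched the device */
};

struct err_info {
	struct fake_dev *dev[MAX_ERR_DEVICES];
	int error_dev_num;
};

static struct fake_dev *fake_dev_get(struct fake_dev *dev)
{
	if (dev)
		dev->refcount++;
	return dev;
}

static void fake_dev_put(struct fake_dev *dev)
{
	if (dev)
		dev->refcount--;
}

/* Collection phase: take a reference before storing the pointer. */
static int collect_error_device(struct err_info *info, struct fake_dev *dev)
{
	if (info->error_dev_num >= MAX_ERR_DEVICES)
		return -1;
	info->dev[info->error_dev_num++] = fake_dev_get(dev);
	return 0;
}

/* Processing phase: skip removed devices, then drop the reference. */
static void process_error_devices(struct err_info *info)
{
	for (int i = 0; i < info->error_dev_num && info->dev[i]; i++) {
		if (!info->dev[i]->disconnected)
			info->dev[i]->processed++;
		fake_dev_put(info->dev[i]);
		info->dev[i] = NULL;
	}
	info->error_dev_num = 0;
}

/* Returns 0 if refcounts balance and the removed device was skipped. */
int run_demo(void)
{
	struct fake_dev a = { .refcount = 1 }, b = { .refcount = 1 };
	struct err_info info = { .error_dev_num = 0 };

	collect_error_device(&info, &a);
	collect_error_device(&info, &b);
	b.disconnected = true;   /* unplugged between the two phases */

	process_error_devices(&info);
	if (a.refcount != 1 || b.refcount != 1)
		return 1;
	if (a.processed != 1 || b.processed != 0)
		return 2;
	return 0;
}
```

The sketch only models the bookkeeping. It does not reproduce the config-space access that faults in the backtraces above, and it says nothing about whether the v3.10 backport behaves the same way.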