linux-pci.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] PCI: Fix accessing freed memory in pci_remove_resource_files
@ 2021-05-07 10:27 Danijel Slivka
  2021-05-07 22:07 ` Bjorn Helgaas
  0 siblings, 1 reply; 8+ messages in thread
From: Danijel Slivka @ 2021-05-07 10:27 UTC (permalink / raw)
  To: bhelgaas; +Cc: linux-pci, Danijel Slivka

This patch fixes segmentation fault during accessing already freed
pci device resource files, as after freeing res_attr and res_attr_wc
elements, in pci_remove_resource_files function, they are left as
dangling pointers.

Signed-off-by: Danijel Slivka <danijel.slivka@amd.com>
---
 drivers/pci/pci-sysfs.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
index f8afd54ca3e1..bbdf6c57fcda 100644
--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -1130,12 +1130,14 @@ static void pci_remove_resource_files(struct pci_dev *pdev)
 		if (res_attr) {
 			sysfs_remove_bin_file(&pdev->dev.kobj, res_attr);
 			kfree(res_attr);
+			pdev->res_attr[i] = NULL;
 		}
 
 		res_attr = pdev->res_attr_wc[i];
 		if (res_attr) {
 			sysfs_remove_bin_file(&pdev->dev.kobj, res_attr);
 			kfree(res_attr);
+			pdev->res_attr_wc[i] = NULL;
 		}
 	}
 }
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH] PCI: Fix accessing freed memory in pci_remove_resource_files
  2021-05-07 10:27 [PATCH] PCI: Fix accessing freed memory in pci_remove_resource_files Danijel Slivka
@ 2021-05-07 22:07 ` Bjorn Helgaas
  2021-05-10 15:02   ` Slivka, Danijel
  2021-05-10 15:08   ` Slivka, Danijel
  0 siblings, 2 replies; 8+ messages in thread
From: Bjorn Helgaas @ 2021-05-07 22:07 UTC (permalink / raw)
  To: Danijel Slivka; +Cc: bhelgaas, linux-pci

Hi Danijel,

Thanks for the patch.

On Fri, May 07, 2021 at 06:27:06PM +0800, Danijel Slivka wrote:
> This patch fixes segmentation fault during accessing already freed
> pci device resource files, as after freeing res_attr and res_attr_wc
> elements, in pci_remove_resource_files function, they are left as
> dangling pointers.
> 
> Signed-off-by: Danijel Slivka <danijel.slivka@amd.com>
> ---
>  drivers/pci/pci-sysfs.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
> index f8afd54ca3e1..bbdf6c57fcda 100644
> --- a/drivers/pci/pci-sysfs.c
> +++ b/drivers/pci/pci-sysfs.c
> @@ -1130,12 +1130,14 @@ static void pci_remove_resource_files(struct pci_dev *pdev)
>  		if (res_attr) {
>  			sysfs_remove_bin_file(&pdev->dev.kobj, res_attr);
>  			kfree(res_attr);
> +			pdev->res_attr[i] = NULL;
>  		}
>  
>  		res_attr = pdev->res_attr_wc[i];
>  		if (res_attr) {
>  			sysfs_remove_bin_file(&pdev->dev.kobj, res_attr);
>  			kfree(res_attr);
> +			pdev->res_attr_wc[i] = NULL;

If this patch fixes something, I would expect to see a test like this
somewhere:

  if (pdev->res_attr[i])
    pdev->res_attr[i]->size = 0;

But I don't see anything like that, so I can't figure out where we
actually use res_attr[i] or res_attr_wc[i], except in
pci_remove_resource_files() itself.

Did you actually see a segmentation fault?  If so, where?

>  		}
>  	}
>  }
> -- 
> 2.20.1
> 

^ permalink raw reply	[flat|nested] 8+ messages in thread

* RE: [PATCH] PCI: Fix accessing freed memory in pci_remove_resource_files
  2021-05-07 22:07 ` Bjorn Helgaas
@ 2021-05-10 15:02   ` Slivka, Danijel
  2021-05-10 22:57     ` Bjorn Helgaas
  2021-05-10 15:08   ` Slivka, Danijel
  1 sibling, 1 reply; 8+ messages in thread
From: Slivka, Danijel @ 2021-05-10 15:02 UTC (permalink / raw)
  To: Bjorn Helgaas; +Cc: bhelgaas, linux-pci

Hi Bjorn,

Yes, I get segmentation fault on unloading custom module during the check of error handling case.
There is no directly visible access to res_attr fields, as you mentioned, other than the one in a call chain after a check in pci_remove_resource_files() which seems to cause the issue (accessing name).
Load and unload module will invoke pci_enable_sriov() and disable_pci_sriov() respectively that are both having a call of pci_remove_resource_files() in call chain.
In that function existing check is not behaving as expected since memory is freed but pointers left dangling. 
Below is call trace and detail description. 


During loading of module pci_enable_sriov() is called, I have following invoking sequence:
device_create_file
pci_create_capabilities_sysfs
pci_create_sysfs_dev_files
pci_bus_add_device
pci_iov_add_virtfn
sriov_enable
pci_enable_sriov

In case of a failure of device_create_file() for dev_attr_reset case  (which was the case I tested), it will call pci_remove_resource_files(), 
which will free both, res_attr and res_attr_wc array elements (pointers) without setting them to NULL. 
Then failure is propagated only up to  pci_create_sysfs_dev_files and return value is not checked inside of pci_bus_add_device. 
That causes the same behavior as pci_enable_sriov function passed. 

On unload, during pci_disable_sriov(), pci_remove_resource_files() will be regularly called to try to remove the files.
The check segment that currently exist will not prevent calling of removal code since pointers are left as dangling pointers, even though the resource files and attributes are already freed.

		res_attr = pdev->res_attr[i];
		if (res_attr) {
			sysfs_remove_bin_file(&pdev->dev.kobj, res_attr);
			kfree(res_attr);
		}

Attribute res_attr[i]->name is the one causing segmentation fault when strlen() function is called in kernfs_name_hash().

Here is the call trace:

[  991.796300] RIP: 0010:strlen+0x0/0x30
[  991.807240] Code: f8 48 89 fa 48 89 e5 74 09 48 83 c2 01 80 3a 00 75 f7 48 83 c6 01 0f b6 4e ff 48 83 c2 01 84 c9 88 4a ff 75 ed 5d c3 0f 1f 00 <80> 3f 00 55 48 89 e5 74 14 48 89 f8 48 83 c7 01 80 3f 00 75 f7 48
[  991.863423] RSP: 0018:ffffc1d68a9ffb80 EFLAGS: 00010246
[  991.879051] RAX: 0000000000000000 RBX: b5f8d4ce837e84cb RCX: 0000000000000000
[  991.900395] RDX: 0000000000000000 RSI: 0000000000000000 RDI: b5f8d4ce837e84cb
[  991.921739] RBP: ffffc1d68a9ffb98 R08: 0000000000000000 R09: ffffffff96964e01
[  991.943086] R10: ffffc1d68a9ffb78 R11: 0000000000000001 R12: 0000000000000000
[  991.964432] R13: 0000000000000000 R14: b5f8d4ce837e84cb R15: 0000000000000038
[  991.985777] FS:  00007f3c278b3740(0000) GS:ffffa0a61f740000(0000) knlGS:0000000000000000
[  992.009983] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  992.027168] CR2: 00007f3c27920340 CR3: 000000074c010004 CR4: 00000000003606e0
[  992.048516] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  992.069861] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  992.091205] Call Trace:
[  992.098523]  ? kernfs_name_hash+0x17/0x80
[  992.110499]  kernfs_find_ns+0x3a/0xc0
[  992.121448]  kernfs_remove_by_name_ns+0x36/0xa0
[  992.134994]  sysfs_remove_bin_file+0x17/0x20
[  992.147760]  pci_remove_resource_files+0x38/0x80
[  992.161565]  pci_remove_sysfs_dev_files+0x5b/0xc0
[  992.175631]  pci_stop_bus_device+0x78/0x90
[  992.187875]  pci_stop_and_remove_bus_device+0x12/0x20
[  992.202983]  pci_iov_remove_virtfn+0xc3/0x110
[  992.216006]  sriov_disable+0x43/0x100
[  992.226953]  pci_disable_sriov+0x23/0x30

BR,
Danijel Slivka

-----Original Message-----
From: Bjorn Helgaas <helgaas@kernel.org> 
Sent: Saturday, May 8, 2021 12:07 AM
To: Slivka, Danijel <Danijel.Slivka@amd.com>
Cc: bhelgaas@google.com; linux-pci@vger.kernel.org
Subject: Re: [PATCH] PCI: Fix accessing freed memory in pci_remove_resource_files

Hi Danijel,

Thanks for the patch.

On Fri, May 07, 2021 at 06:27:06PM +0800, Danijel Slivka wrote:
> This patch fixes segmentation fault during accessing already freed pci 
> device resource files, as after freeing res_attr and res_attr_wc 
> elements, in pci_remove_resource_files function, they are left as 
> dangling pointers.
> 
> Signed-off-by: Danijel Slivka <danijel.slivka@amd.com>
> ---
>  drivers/pci/pci-sysfs.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c index 
> f8afd54ca3e1..bbdf6c57fcda 100644
> --- a/drivers/pci/pci-sysfs.c
> +++ b/drivers/pci/pci-sysfs.c
> @@ -1130,12 +1130,14 @@ static void pci_remove_resource_files(struct pci_dev *pdev)
>  		if (res_attr) {
>  			sysfs_remove_bin_file(&pdev->dev.kobj, res_attr);
>  			kfree(res_attr);
> +			pdev->res_attr[i] = NULL;
>  		}
>  
>  		res_attr = pdev->res_attr_wc[i];
>  		if (res_attr) {
>  			sysfs_remove_bin_file(&pdev->dev.kobj, res_attr);
>  			kfree(res_attr);
> +			pdev->res_attr_wc[i] = NULL;

If this patch fixes something, I would expect to see a test like this
somewhere:

  if (pdev->res_attr[i])
    pdev->res_attr[i]->size = 0;

But I don't see anything like that, so I can't figure out where we actually use res_attr[i] or res_attr_wc[i], except in
pci_remove_resource_files() itself.

Did you actually see a segmentation fault?  If so, where?

>  		}
>  	}
>  }
> --
> 2.20.1
> 

^ permalink raw reply	[flat|nested] 8+ messages in thread

* RE: [PATCH] PCI: Fix accessing freed memory in pci_remove_resource_files
  2021-05-07 22:07 ` Bjorn Helgaas
  2021-05-10 15:02   ` Slivka, Danijel
@ 2021-05-10 15:08   ` Slivka, Danijel
  1 sibling, 0 replies; 8+ messages in thread
From: Slivka, Danijel @ 2021-05-10 15:08 UTC (permalink / raw)
  To: Bjorn Helgaas; +Cc: bhelgaas, linux-pci

Hi Bjorn,

Yes, I get segmentation fault on unloading custom module during the check of error handling case.
There is no directly visible access to res_attr fields, as you mentioned, other than the one in a call chain after a check in pci_remove_resource_files() which seems to cause the issue (accessing name).
Load and unload module will invoke pci_enable_sriov() and disable_pci_sriov() respectively that are both having a call of pci_remove_resource_files() in call chain.
In that function existing check is not behaving as expected since memory is freed but pointers left dangling. 
Below is call trace and detail description. 


During loading of module pci_enable_sriov() is called, I have following invoking sequence:
device_create_file
pci_create_capabilities_sysfs
pci_create_sysfs_dev_files
pci_bus_add_device
pci_iov_add_virtfn
sriov_enable
pci_enable_sriov

In case of a failure of device_create_file() for dev_attr_reset case  (which was the case I tested), it will call pci_remove_resource_files(), which will free both, res_attr and res_attr_wc array elements (pointers) without setting them to NULL. 
Then failure is propagated only up to  pci_create_sysfs_dev_files and return value is not checked inside of pci_bus_add_device. 
That causes the same behavior as pci_enable_sriov function passed. 

On unload, during pci_disable_sriov(), pci_remove_resource_files() will be regularly called to try to remove the files.
The check segment that currently exist will not prevent calling of removal code since pointers are left as dangling pointers, even though the resource files and attributes are already freed.

		res_attr = pdev->res_attr[i];
		if (res_attr) {
			sysfs_remove_bin_file(&pdev->dev.kobj, res_attr);
			kfree(res_attr);
		}

Attribute res_attr[i]->name is the one causing segmentation fault when strlen() function is called in kernfs_name_hash().

Here is the call trace:

[  991.796300] RIP: 0010:strlen+0x0/0x30
[  991.807240] Code: f8 48 89 fa 48 89 e5 74 09 48 83 c2 01 80 3a 00 75 f7 48 83 c6 01 0f b6 4e ff 48 83 c2 01 84 c9 88 4a ff 75 ed 5d c3 0f 1f 00 <80> 3f 00 55 48 89 e5 74 14 48 89 f8 48 83 c7 01 80 3f 00 75 f7 48
[  991.863423] RSP: 0018:ffffc1d68a9ffb80 EFLAGS: 00010246
[  991.879051] RAX: 0000000000000000 RBX: b5f8d4ce837e84cb RCX: 0000000000000000
[  991.900395] RDX: 0000000000000000 RSI: 0000000000000000 RDI: b5f8d4ce837e84cb
[  991.921739] RBP: ffffc1d68a9ffb98 R08: 0000000000000000 R09: ffffffff96964e01
[  991.943086] R10: ffffc1d68a9ffb78 R11: 0000000000000001 R12: 0000000000000000
[  991.964432] R13: 0000000000000000 R14: b5f8d4ce837e84cb R15: 0000000000000038
[  991.985777] FS:  00007f3c278b3740(0000) GS:ffffa0a61f740000(0000) knlGS:0000000000000000
[  992.009983] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  992.027168] CR2: 00007f3c27920340 CR3: 000000074c010004 CR4: 00000000003606e0
[  992.048516] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  992.069861] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  992.091205] Call Trace:
[  992.098523]  ? kernfs_name_hash+0x17/0x80
[  992.110499]  kernfs_find_ns+0x3a/0xc0
[  992.121448]  kernfs_remove_by_name_ns+0x36/0xa0
[  992.134994]  sysfs_remove_bin_file+0x17/0x20
[  992.147760]  pci_remove_resource_files+0x38/0x80
[  992.161565]  pci_remove_sysfs_dev_files+0x5b/0xc0
[  992.175631]  pci_stop_bus_device+0x78/0x90
[  992.187875]  pci_stop_and_remove_bus_device+0x12/0x20
[  992.202983]  pci_iov_remove_virtfn+0xc3/0x110
[  992.216006]  sriov_disable+0x43/0x100
[  992.226953]  pci_disable_sriov+0x23/0x30

BR,
Danijel Slivka

-----Original Message-----
From: Bjorn Helgaas <helgaas@kernel.org> 
Sent: Saturday, May 8, 2021 12:07 AM
To: Slivka, Danijel <Danijel.Slivka@amd.com>
Cc: bhelgaas@google.com; linux-pci@vger.kernel.org
Subject: Re: [PATCH] PCI: Fix accessing freed memory in pci_remove_resource_files

Hi Danijel,

Thanks for the patch.

On Fri, May 07, 2021 at 06:27:06PM +0800, Danijel Slivka wrote:
> This patch fixes segmentation fault during accessing already freed pci 
> device resource files, as after freeing res_attr and res_attr_wc 
> elements, in pci_remove_resource_files function, they are left as 
> dangling pointers.
> 
> Signed-off-by: Danijel Slivka <danijel.slivka@amd.com>
> ---
>  drivers/pci/pci-sysfs.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c index 
> f8afd54ca3e1..bbdf6c57fcda 100644
> --- a/drivers/pci/pci-sysfs.c
> +++ b/drivers/pci/pci-sysfs.c
> @@ -1130,12 +1130,14 @@ static void pci_remove_resource_files(struct pci_dev *pdev)
>  		if (res_attr) {
>  			sysfs_remove_bin_file(&pdev->dev.kobj, res_attr);
>  			kfree(res_attr);
> +			pdev->res_attr[i] = NULL;
>  		}
>  
>  		res_attr = pdev->res_attr_wc[i];
>  		if (res_attr) {
>  			sysfs_remove_bin_file(&pdev->dev.kobj, res_attr);
>  			kfree(res_attr);
> +			pdev->res_attr_wc[i] = NULL;

If this patch fixes something, I would expect to see a test like this
somewhere:

  if (pdev->res_attr[i])
    pdev->res_attr[i]->size = 0;

But I don't see anything like that, so I can't figure out where we actually use res_attr[i] or res_attr_wc[i], except in
pci_remove_resource_files() itself.

Did you actually see a segmentation fault?  If so, where?

>  		}
>  	}
>  }
> --
> 2.20.1
> 

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] PCI: Fix accessing freed memory in pci_remove_resource_files
  2021-05-10 15:02   ` Slivka, Danijel
@ 2021-05-10 22:57     ` Bjorn Helgaas
  2021-05-25 22:54       ` Krzysztof Wilczyński
  2021-10-26 22:12       ` Bjorn Helgaas
  0 siblings, 2 replies; 8+ messages in thread
From: Bjorn Helgaas @ 2021-05-10 22:57 UTC (permalink / raw)
  To: Slivka, Danijel; +Cc: bhelgaas, linux-pci, Krzysztof Wilczyński

[+cc Krzysztof]

See https://people.kernel.org/tglx/notes-about-netiquette about
typical style for Linux mailing lists.

On Mon, May 10, 2021 at 03:02:45PM +0000, Slivka, Danijel wrote:
> Hi Bjorn,
> 
> Yes, I get segmentation fault on unloading custom module during the
> check of error handling case.
>
> There is no directly visible access to res_attr fields, as you
> mentioned, other than the one in a call chain after a check in
> pci_remove_resource_files() which seems to cause the issue
> (accessing name).
>
> Load and unload module will invoke pci_enable_sriov() and
> disable_pci_sriov() respectively that are both having a call of
> pci_remove_resource_files() in call chain.
>
> In that function existing check is not behaving as expected since
> memory is freed but pointers left dangling. 
>
> Below is call trace and detail description. 
> 
> During loading of module pci_enable_sriov() is called, I have
> following invoking sequence:
>
> device_create_file
> pci_create_capabilities_sysfs
> pci_create_sysfs_dev_files
> pci_bus_add_device
> pci_iov_add_virtfn
> sriov_enable
> pci_enable_sriov

OK.  For anybody following along, this call path changed in v5.13-rc1,
so pci_create_capabilities_sysfs() longer exists.  But looking at
v5.12, I think the sequence you're seeing is:

  pci_create_sysfs_dev_files
    pci_create_capabilities_sysfs
      retval = device_create_file(&dev->dev, &dev_attr_reset)
      return retval		# I guess this what failed, right?
    if (retval) goto err_rom_file
    err_rom_file:
    ...
    pci_remove_resource_files
      sysfs_remove_bin_file(pdev->res_attr[i])
      kfree(pdev->res_attr[i])
      # pdev->res_attr[i] not set to NULL in v5.12

Later, on module unload, we have this sequence:

  pci_disable_sriov
    sriov_disable
      sriov_del_vfs
        pci_iov_remove_virtfn
          pci_stop_and_remove_bus_device
            pci_stop_bus_device
              pci_stop_dev
                pci_remove_sysfs_dev_files
                  pci_remove_resource_files
                    sysfs_remove_bin_file(pdev->res_attr[i])
                    # pdev->res_attr[i] points to a freed object

Definitely seems like a problem.  Hmmm.  I'm not really a fan of
checking the pointer to see whether it's been freed.  That seems like
a band-aid.

Krzysztof did some really nice work in v5.13-rc1 that removes a lot of
the explicit sysfs file management.  I think he's planning similar
work for pdev->res_attr[], and I suspect that will solve this problem
nicely.  But there are some wrinkles to work out before that's ready.

Any ideas here, Krzysztof?  IIRC some of the wrinkles have to do with
Alpha, which has its own pci_create_resource_files() implementation.
If you have any work-in-progress that works on x86, I'd be curious to
see if it would solve Danijel's problem.

> In case of a failure of device_create_file() for dev_attr_reset case
> (which was the case I tested), it will call
> pci_remove_resource_files(), which will free both, res_attr and
> res_attr_wc array elements (pointers) without setting them to NULL.
>
> Then failure is propagated only up to  pci_create_sysfs_dev_files
> and return value is not checked inside of pci_bus_add_device.
>
> That causes the same behavior as pci_enable_sriov function passed. 
> 
> On unload, during pci_disable_sriov(), pci_remove_resource_files()
> will be regularly called to try to remove the files.
>
> The check segment that currently exist will not prevent calling of
> removal code since pointers are left as dangling pointers, even
> though the resource files and attributes are already freed.
>
> 
> 		res_attr = pdev->res_attr[i];
> 		if (res_attr) {
> 			sysfs_remove_bin_file(&pdev->dev.kobj, res_attr);
> 			kfree(res_attr);
> 		}
> 
> Attribute res_attr[i]->name is the one causing segmentation fault
> when strlen() function is called in kernfs_name_hash().

I think the "sysfs_remove_bin_file(pdev->res_attr[i])" is clearly
wrong because we're passing a pointer to memory that has already been
freed.

> Here is the call trace:
> 
> [  991.796300] RIP: 0010:strlen+0x0/0x30
> [  991.807240] Code: f8 48 89 fa 48 89 e5 74 09 48 83 c2 01 80 3a 00 75 f7 48 83 c6 01 0f b6 4e ff 48 83 c2 01 84 c9 88 4a ff 75 ed 5d c3 0f 1f 00 <80> 3f 00 55 48 89 e5 74 14 48 89 f8 48 83 c7 01 80 3f 00 75 f7 48
> [  991.863423] RSP: 0018:ffffc1d68a9ffb80 EFLAGS: 00010246
> [  991.879051] RAX: 0000000000000000 RBX: b5f8d4ce837e84cb RCX: 0000000000000000
> [  991.900395] RDX: 0000000000000000 RSI: 0000000000000000 RDI: b5f8d4ce837e84cb
> [  991.921739] RBP: ffffc1d68a9ffb98 R08: 0000000000000000 R09: ffffffff96964e01
> [  991.943086] R10: ffffc1d68a9ffb78 R11: 0000000000000001 R12: 0000000000000000
> [  991.964432] R13: 0000000000000000 R14: b5f8d4ce837e84cb R15: 0000000000000038
> [  991.985777] FS:  00007f3c278b3740(0000) GS:ffffa0a61f740000(0000) knlGS:0000000000000000
> [  992.009983] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  992.027168] CR2: 00007f3c27920340 CR3: 000000074c010004 CR4: 00000000003606e0
> [  992.048516] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [  992.069861] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [  992.091205] Call Trace:
> [  992.098523]  ? kernfs_name_hash+0x17/0x80
> [  992.110499]  kernfs_find_ns+0x3a/0xc0
> [  992.121448]  kernfs_remove_by_name_ns+0x36/0xa0
> [  992.134994]  sysfs_remove_bin_file+0x17/0x20
> [  992.147760]  pci_remove_resource_files+0x38/0x80
> [  992.161565]  pci_remove_sysfs_dev_files+0x5b/0xc0
> [  992.175631]  pci_stop_bus_device+0x78/0x90
> [  992.187875]  pci_stop_and_remove_bus_device+0x12/0x20
> [  992.202983]  pci_iov_remove_virtfn+0xc3/0x110
> [  992.216006]  sriov_disable+0x43/0x100
> [  992.226953]  pci_disable_sriov+0x23/0x30
> 
> BR,
> Danijel Slivka
> 
> -----Original Message-----
> From: Bjorn Helgaas <helgaas@kernel.org> 
> Sent: Saturday, May 8, 2021 12:07 AM
> To: Slivka, Danijel <Danijel.Slivka@amd.com>
> Cc: bhelgaas@google.com; linux-pci@vger.kernel.org
> Subject: Re: [PATCH] PCI: Fix accessing freed memory in pci_remove_resource_files
> 
> Hi Danijel,
> 
> Thanks for the patch.
> 
> On Fri, May 07, 2021 at 06:27:06PM +0800, Danijel Slivka wrote:
> > This patch fixes segmentation fault during accessing already freed pci 
> > device resource files, as after freeing res_attr and res_attr_wc 
> > elements, in pci_remove_resource_files function, they are left as 
> > dangling pointers.
> > 
> > Signed-off-by: Danijel Slivka <danijel.slivka@amd.com>
> > ---
> >  drivers/pci/pci-sysfs.c | 2 ++
> >  1 file changed, 2 insertions(+)
> > 
> > diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c index 
> > f8afd54ca3e1..bbdf6c57fcda 100644
> > --- a/drivers/pci/pci-sysfs.c
> > +++ b/drivers/pci/pci-sysfs.c
> > @@ -1130,12 +1130,14 @@ static void pci_remove_resource_files(struct pci_dev *pdev)
> >  		if (res_attr) {
> >  			sysfs_remove_bin_file(&pdev->dev.kobj, res_attr);
> >  			kfree(res_attr);
> > +			pdev->res_attr[i] = NULL;
> >  		}
> >  
> >  		res_attr = pdev->res_attr_wc[i];
> >  		if (res_attr) {
> >  			sysfs_remove_bin_file(&pdev->dev.kobj, res_attr);
> >  			kfree(res_attr);
> > +			pdev->res_attr_wc[i] = NULL;
> 
> If this patch fixes something, I would expect to see a test like this
> somewhere:
> 
>   if (pdev->res_attr[i])
>     pdev->res_attr[i]->size = 0;
> 
> But I don't see anything like that, so I can't figure out where we actually use res_attr[i] or res_attr_wc[i], except in
> pci_remove_resource_files() itself.
> 
> Did you actually see a segmentation fault?  If so, where?
> 
> >  		}
> >  	}
> >  }
> > --
> > 2.20.1
> > 

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] PCI: Fix accessing freed memory in pci_remove_resource_files
  2021-05-10 22:57     ` Bjorn Helgaas
@ 2021-05-25 22:54       ` Krzysztof Wilczyński
  2021-06-03 18:37         ` Krzysztof Wilczyński
  2021-10-26 22:12       ` Bjorn Helgaas
  1 sibling, 1 reply; 8+ messages in thread
From: Krzysztof Wilczyński @ 2021-05-25 22:54 UTC (permalink / raw)
  To: Bjorn Helgaas; +Cc: Slivka, Danijel, bhelgaas, linux-pci

Hello Bjorn and Danijel,

Sorry for late reply!

[...]
> Krzysztof did some really nice work in v5.13-rc1 that removes a lot of
> the explicit sysfs file management.  I think he's planning similar
> work for pdev->res_attr[], and I suspect that will solve this problem
> nicely.  But there are some wrinkles to work out before that's ready.

Yes, the idea is to completely retire the late_initcall() and in the
process the pci_create_resource_files() and pci_create_legacy_files()
that are currently called from the pci_sysfs_init() function.

This will in turn allow for the removal of the "sysfs_initialized"
global variable that was, and most likely still is, responsible for the
race condition we have seen on some platforms, see:

  https://lore.kernel.org/linux-pci/20200716110423.xtfyb3n6tn5ixedh@pali/

> Any ideas here, Krzysztof?  IIRC some of the wrinkles have to do with
> Alpha, which has its own pci_create_resource_files() implementation.

Correct, Alpha support is something we have to address as one of the
outstanding problems to resolve.

To be more precise, there are two problems left to solve before we can
finally retire the late_initcall, and these would be:

  - pci_create_resource_files()
  - pci_create_legacy_files() 

The pci_create_resource_files() can be overridden on certain platforms
through a weak linkage which then allows for alternative implementations
to be provided - and this is true for Alpha as it provides a complete
implementation of the original function with all the platform-specific
changes it requires.

Similarly, an auxiliary function called pci_adjust_legacy_attr() can
also be overridden on some platforms again through a weak linkage
offering a way to adjust properties of some of the sysfs objects to be
added, and Alpha is also the platform that uses this for some custom
adjustments.

Having said that, aside of some dependency Alpha has on the functions we
are trying to retire, both of them have a larger challenge to overcome,
and which is related to the following commits:

- 3234ac664a87 ("/dev/mem: Revoke mappings when a driver claims the region")
- 636b21b50152 ("PCI: Revoke mappings like devmem")

Following these two changes, functions pci_create_attr() (called internally
from the pci_create_resource_files()) and pci_create_legacy_files() have
since a dependency on the iomem_get_mapping() function - which in turn also
creates an implicit dependency on the VFS and fs_initcall.

This dependency on iomem_get_mapping() stops us from retiring the
late_initcall at the moment as when we convert dynamically added sysfs
objects (that are primarily added in the pci_create_resource_files() and
pci_create_legacy_files() functions), these attributes are added before
the VFS completes its initialisation, and since most of the PCI devices
are typically enumerated in subsys_initcall this leads to a failure and
an Oops related to iomem_get_mapping() access.

See relevant conversations:

  https://lore.kernel.org/linux-pci/20210204165831.2703772-1-daniel.vetter@ffwll.ch/
  https://lore.kernel.org/linux-pci/20210313215747.GA2394467@bjorn-Precision-5520/

After some deliberation on a potential solution, we have settled on
changing the order when PCI sub-system and relevant device drivers are
initialised so that PCI will come after the VFS but before ACPI, thus
allowing for the iomem_get_mapping() to work without issues.

And this initialisation order change is what I am currently working on
towards.

Hopefully once this is done, and nothing breaks, we would be able to
retire the late_initcall and pci_sysfs_init(), and move everything to
static sysfs object.  This in turn would definitely solve the problem
Danijel stumbled upon in regards to module unloading.

> If you have any work-in-progress that works on x86, I'd be curious to
> see if it would solve Danijel's problem.
[...]

Nothing as of yet.  But I will focus on this now, so that we can finish
this work a lot sooner.

	Krzysztof

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] PCI: Fix accessing freed memory in pci_remove_resource_files
  2021-05-25 22:54       ` Krzysztof Wilczyński
@ 2021-06-03 18:37         ` Krzysztof Wilczyński
  0 siblings, 0 replies; 8+ messages in thread
From: Krzysztof Wilczyński @ 2021-06-03 18:37 UTC (permalink / raw)
  To: Bjorn Helgaas; +Cc: Slivka, Danijel, bhelgaas, linux-pci

Hello,

[...]
> Having said that, aside of some dependency Alpha has on the functions we
> are trying to retire, both of them have a larger challenge to overcome,
> and which is related to the following commits:
> 
> - 3234ac664a87 ("/dev/mem: Revoke mappings when a driver claims the region")
> - 636b21b50152 ("PCI: Revoke mappings like devmem")
> 
> Following these two changes, functions pci_create_attr() (called internally
> from the pci_create_resource_files()) and pci_create_legacy_files() have
> since a dependency on the iomem_get_mapping() function - which in turn also
> creates an implicit dependency on the VFS and fs_initcall.
> 
> This dependency on iomem_get_mapping() stops us from retiring the
> late_initcall at the moment as when we convert dynamically added sysfs
> objects (that are primarily added in the pci_create_resource_files() and
> pci_create_legacy_files() functions), these attributes are added before
> the VFS completes its initialisation, and since most of the PCI devices
> are typically enumerated in subsys_initcall this leads to a failure and
> an Oops related to iomem_get_mapping() access.
> 
> See relevant conversations:
> 
>   https://lore.kernel.org/linux-pci/20210204165831.2703772-1-daniel.vetter@ffwll.ch/
>   https://lore.kernel.org/linux-pci/20210313215747.GA2394467@bjorn-Precision-5520/
> 
> After some deliberation on a potential solution, we have settled on
> changing the order when PCI sub-system and relevant device drivers are
> initialised so that PCI will come after the VFS but before ACPI, thus
> allowing for the iomem_get_mapping() to work without issues.
[...]

The conversation related to the iomem_get_mapping() and the dependency
on VFS has moved to the following thread:

  https://lore.kernel.org/linux-pci/20210527205845.GA1421476@bjorn-Precision-5520/

	Krzysztof

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] PCI: Fix accessing freed memory in pci_remove_resource_files
  2021-05-10 22:57     ` Bjorn Helgaas
  2021-05-25 22:54       ` Krzysztof Wilczyński
@ 2021-10-26 22:12       ` Bjorn Helgaas
  1 sibling, 0 replies; 8+ messages in thread
From: Bjorn Helgaas @ 2021-10-26 22:12 UTC (permalink / raw)
  To: Slivka, Danijel; +Cc: bhelgaas, linux-pci, Krzysztof Wilczyński

On Mon, May 10, 2021 at 05:57:33PM -0500, Bjorn Helgaas wrote:
> On Mon, May 10, 2021 at 03:02:45PM +0000, Slivka, Danijel wrote:
> > Hi Bjorn,
> > 
> > Yes, I get segmentation fault on unloading custom module during the
> > check of error handling case.
> >
> > There is no directly visible access to res_attr fields, as you
> > mentioned, other than the one in a call chain after a check in
> > pci_remove_resource_files() which seems to cause the issue
> > (accessing name).
> >
> > Load and unload module will invoke pci_enable_sriov() and
> > disable_pci_sriov() respectively that are both having a call of
> > pci_remove_resource_files() in call chain.
> >
> > In that function existing check is not behaving as expected since
> > memory is freed but pointers left dangling. 
> >
> > Below is call trace and detail description. 
> > 
> > During loading of module pci_enable_sriov() is called, I have
> > following invoking sequence:
> >
> > device_create_file
> > pci_create_capabilities_sysfs
> > pci_create_sysfs_dev_files
> > pci_bus_add_device
> > pci_iov_add_virtfn
> > sriov_enable
> > pci_enable_sriov
> 
> OK.  For anybody following along, this call path changed in v5.13-rc1,
> so pci_create_capabilities_sysfs() longer exists.  But looking at
> v5.12, I think the sequence you're seeing is:
> 
>   pci_create_sysfs_dev_files
>     pci_create_capabilities_sysfs
>       retval = device_create_file(&dev->dev, &dev_attr_reset)
>       return retval		# I guess this what failed, right?
>     if (retval) goto err_rom_file
>     err_rom_file:
>     ...
>     pci_remove_resource_files
>       sysfs_remove_bin_file(pdev->res_attr[i])
>       kfree(pdev->res_attr[i])
>       # pdev->res_attr[i] not set to NULL in v5.12
> 
> Later, on module unload, we have this sequence:
> 
>   pci_disable_sriov
>     sriov_disable
>       sriov_del_vfs
>         pci_iov_remove_virtfn
>           pci_stop_and_remove_bus_device
>             pci_stop_bus_device
>               pci_stop_dev
>                 pci_remove_sysfs_dev_files
>                   pci_remove_resource_files
>                     sysfs_remove_bin_file(pdev->res_attr[i])
>                     # pdev->res_attr[i] points to a freed object
> 
> Definitely seems like a problem.  Hmmm.  I'm not really a fan of
> checking the pointer to see whether it's been freed.  That seems like
> a band-aid.

This patch does:

  +			pdev->res_attr[i] = NULL;
  +			pdev->res_attr_wc[i] = NULL;

which I think essentially relies on the fact that kfree(NULL) is a
no-op.

I'm going to drop this for now because I don't want to rely on that.
I'd rather avoid doing the kfree() altogether.

IIRC this happens when device_create_file() fails, and it likely fails
because of a race when creating the sysfs files, which would explain
why we don't see lots of reports of this.

Bjorn

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2021-10-26 22:12 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-05-07 10:27 [PATCH] PCI: Fix accessing freed memory in pci_remove_resource_files Danijel Slivka
2021-05-07 22:07 ` Bjorn Helgaas
2021-05-10 15:02   ` Slivka, Danijel
2021-05-10 22:57     ` Bjorn Helgaas
2021-05-25 22:54       ` Krzysztof Wilczyński
2021-06-03 18:37         ` Krzysztof Wilczyński
2021-10-26 22:12       ` Bjorn Helgaas
2021-05-10 15:08   ` Slivka, Danijel

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).