LKML Archive on lore.kernel.org
 help / Atom feed
* fpga: fpga_mgr_free usage
@ 2018-07-11 12:38 Federico Vaga
  2018-07-11 15:59 ` Alan Tull
  0 siblings, 1 reply; 5+ messages in thread
From: Federico Vaga @ 2018-07-11 12:38 UTC (permalink / raw)
  To: Alan Tull; +Cc: linux-fpga, linux-kernel

Hi Alan,

I have another point that I would like to discuss. It is about the 
usage of 'fpga_mgr_free()' which does not look like consistent.

This function, according to the current implementation, can be used by 
an FPGA manager user and it is used by the FPGA manager itself on 
device release. This means that the user can only use this function if 
fpga_mgr_register() fails (to clean up), otherwise the user must NOT 
use this function, otherwise we most likely get an oops (NULL or 
invalid pointer). The example here is correct, this is what we should 
do:

https://www.kernel.org/doc/html/latest/driver-api/fpga/fpga-mgr.html

But I suggest to document it better or prevent this type of mistakes 
from happening. Following a couple of proposals

------
1.
Document it better. This is easy, in the fpga_mgr_free() kernel-doc 
comment we explain that the use of this function must be limited to 
clean up the memory on a registration failure. If an FPGA manager has 
been successfully registered then it will be freed by the framework 
itself.

But still, this does not prevent an oops from happening.
------
2.
Remove the fpga_mgr_free() from fpga_mgr_dev_release() and ask the 
user to free the manager when necessary.

This makes the usage consistent: the user creates and destroy its own 
objects. This is also consistent with our other discussion where we 
said, among the other things, that the module that uses the FPGA 
manager can the owner of the fpga_manager data structure.
------
3.
Not sure how, but perhaps we can be able to understand if we can 
safely continue to run fpga_mgr_free() or we should stop.
------



^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: fpga: fpga_mgr_free usage
  2018-07-11 12:38 fpga: fpga_mgr_free usage Federico Vaga
@ 2018-07-11 15:59 ` Alan Tull
  2018-07-11 18:25   ` Moritz Fischer
  2018-07-25 16:33   ` Alan Tull
  0 siblings, 2 replies; 5+ messages in thread
From: Alan Tull @ 2018-07-11 15:59 UTC (permalink / raw)
  To: federico.vaga; +Cc: linux-fpga, linux-kernel

On Wed, Jul 11, 2018 at 7:38 AM, Federico Vaga <federico.vaga@cern.ch> wrote:

Hi Federico,

> Hi Alan,
>
> I have another point that I would like to discuss. It is about the
> usage of 'fpga_mgr_free()' which does not look like consistent.
>
> This function, according to the current implementation, can be used by
> an FPGA manager user and it is used by the FPGA manager itself on
> device release. This means that the user can only use this function if
> fpga_mgr_register() fails (to clean up), otherwise the user must NOT
> use this function, otherwise we most likely get an oops (NULL or
> invalid pointer). The example here is correct, this is what we should
> do:
>
> https://www.kernel.org/doc/html/latest/driver-api/fpga/fpga-mgr.html
>
> But I suggest to document it better or prevent this type of mistakes
> from happening. Following a couple of proposals
>
> ------
> 1.
> Document it better. This is easy, in the fpga_mgr_free() kernel-doc
> comment we explain that the use of this function must be limited to
> clean up the memory on a registration failure. If an FPGA manager has
> been successfully registered then it will be freed by the framework
> itself.
>
> But still, this does not prevent an oops from happening.
> ------
> 2.
> Remove the fpga_mgr_free() from fpga_mgr_dev_release() and ask the
> user to free the manager when necessary.
>
> This makes the usage consistent: the user creates and destroy its own
> objects. This is also consistent with our other discussion where we
> said, among the other things, that the module that uses the FPGA
> manager can the owner of the fpga_manager data structure.

You're not the first to complain about this.  I think I'll err on the
side of consistency and implement your option 2 here.

Alan

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: fpga: fpga_mgr_free usage
  2018-07-11 15:59 ` Alan Tull
@ 2018-07-11 18:25   ` Moritz Fischer
  2018-07-25 16:33   ` Alan Tull
  1 sibling, 0 replies; 5+ messages in thread
From: Moritz Fischer @ 2018-07-11 18:25 UTC (permalink / raw)
  To: Alan Tull; +Cc: federico.vaga, linux-fpga, linux-kernel

Hi Alan,

On Wed, Jul 11, 2018 at 10:59:01AM -0500, Alan Tull wrote:
> On Wed, Jul 11, 2018 at 7:38 AM, Federico Vaga <federico.vaga@cern.ch> wrote:

[..]
> > This makes the usage consistent: the user creates and destroy its own
> > objects. This is also consistent with our other discussion where we
> > said, among the other things, that the module that uses the FPGA
> > manager can the owner of the fpga_manager data structure.
> 
> You're not the first to complain about this.  I think I'll err on the
> side of consistency and implement your option 2 here.

I agree, that seems like a good approach.

Cheers,

Moritz

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: fpga: fpga_mgr_free usage
  2018-07-11 15:59 ` Alan Tull
  2018-07-11 18:25   ` Moritz Fischer
@ 2018-07-25 16:33   ` Alan Tull
  2018-07-26  7:27     ` Federico Vaga
  1 sibling, 1 reply; 5+ messages in thread
From: Alan Tull @ 2018-07-25 16:33 UTC (permalink / raw)
  To: federico.vaga; +Cc: linux-fpga, linux-kernel

On Wed, Jul 11, 2018 at 10:59 AM, Alan Tull <atull@kernel.org> wrote:
> On Wed, Jul 11, 2018 at 7:38 AM, Federico Vaga <federico.vaga@cern.ch> wrote:
>
> Hi Federico,
>
>> Hi Alan,
>>
>> I have another point that I would like to discuss. It is about the
>> usage of 'fpga_mgr_free()' which does not look like consistent.
>>
>> This function, according to the current implementation, can be used by
>> an FPGA manager user and it is used by the FPGA manager itself on
>> device release. This means that the user can only use this function if
>> fpga_mgr_register() fails (to clean up), otherwise the user must NOT
>> use this function, otherwise we most likely get an oops (NULL or
>> invalid pointer). The example here is correct, this is what we should
>> do:
>>
>> https://www.kernel.org/doc/html/latest/driver-api/fpga/fpga-mgr.html
>>
>> But I suggest to document it better or prevent this type of mistakes
>> from happening. Following a couple of proposals
>>
>> ------
>> 1.
>> Document it better. This is easy, in the fpga_mgr_free() kernel-doc
>> comment we explain that the use of this function must be limited to
>> clean up the memory on a registration failure. If an FPGA manager has
>> been successfully registered then it will be freed by the framework
>> itself.

As I was researching this, I remembered why I implemented it this way.
  See below for that explanation.

It looks like I'm going to switch to option 1 here and add more
documentation for both fpga_mgr_free() and fpga_mgr_unregister().
Note that fpga_mgr_unregister() already says that that it frees the
manager, and the usage example already does the right thing, but I'll
add more words to really beat the message in.

>>
>> But still, this does not prevent an oops from happening.
>> ------
>> 2.
>> Remove the fpga_mgr_free() from fpga_mgr_dev_release() and ask the
>> user to free the manager when necessary.
>>
>> This makes the usage consistent: the user creates and destroy its own
>> objects. This is also consistent with our other discussion where we
>> said, among the other things, that the module that uses the FPGA
>> manager can the owner of the fpga_manager data structure.
>
> You're not the first to complain about this.  I think I'll err on the
> side of consistency and implement your option 2 here.
>
> Alan

If you write a class or create a device, the kernel wants a release
function and will give a warning if you leave it out.  The warning is
"Device 'fpga0' does not have a release() function, it is broken and
must be fixed." and comes from drivers/base/core.c.

I will add some more documentation to make it clear that once a a
mgr/bridge/region has been registered, the cleanup will be handled
automatically when the device goes away.  Until the
fpga_(mgr|bridge|region)_register succeeds, the caller still needs to
do cleanup.

I did find one bug while looking at this.  I'll post some patches.

Full message was:
root@cyclone5:~# rmmod socfpga
[   48.206235] fpga_manager fpga0: fpga_mgr_unregister Altera SOCFPGA
FPGA Manager
[   48.213677] ------------[ cut here ]------------
[   48.218312] WARNING: CPU: 1 PID: 1369 at
/home/atull/repos/linux-socfpga/drivers/base/core.c:895
device_release+0x9c/0xa0
[   48.229293] Device 'fpga0' does not have a release() function, it
is broken and must be fixed.
[   48.237904] Modules linked in: socfpga(-) altera_hps2fpga fpga_mgr
fpga_bridge [last unloaded: fpga_region]
[   48.247659] CPU: 1 PID: 1369 Comm: rmmod Not tainted
4.18.0-rc5-next-20180717-00012-ge5f548e-dirty #3
[   48.256843] Hardware name: Altera SOCFPGA
[   48.260858] [<c01137ac>] (unwind_backtrace) from [<c010dc04>]
(show_stack+0x20/0x24)
[   48.268582] [<c010dc04>] (show_stack) from [<c07d448c>]
(dump_stack+0x8c/0xa0)
[   48.275786] [<c07d448c>] (dump_stack) from [<c0123bc0>] (__warn+0x104/0x11c)
[   48.282810] [<c0123bc0>] (__warn) from [<c0123c2c>]
(warn_slowpath_fmt+0x54/0x70)
[   48.290269] [<c0123c2c>] (warn_slowpath_fmt) from [<c052c9cc>]
(device_release+0x9c/0xa0)
[   48.298418] [<c052c9cc>] (device_release) from [<c07d904c>]
(kobject_put+0xa8/0xe0)
[   48.306047] [<c07d904c>] (kobject_put) from [<c052ec3c>]
(device_unregister+0x2c/0x30)
[   48.313939] [<c052ec3c>] (device_unregister) from [<bf01262c>]
(fpga_mgr_unregister+0x58/0x74 [fpga_mgr])
[   48.323475] [<bf01262c>] (fpga_mgr_unregister [fpga_mgr]) from
[<bf02b01c>] (socfpga_fpga_remove+0x1c/0x24 [socfpga])
[   48.334047] [<bf02b01c>] (socfpga_fpga_remove [socfpga]) from
[<c0534be8>] (platform_drv_remove+0x34/0x4c)
[   48.343664] [<c0534be8>] (platform_drv_remove) from [<c0532f64>]
(device_release_driver_internal+0x180/0x230)
[   48.353538] [<c0532f64>] (device_release_driver_internal) from
[<c0533090>] (driver_detach+0x58/0xa0)
[   48.362720] [<c0533090>] (driver_detach) from [<c0531bf8>]
(bus_remove_driver+0x5c/0xb4)
[   48.370781] [<c0531bf8>] (bus_remove_driver) from [<c0533a70>]
(driver_unregister+0x38/0x58)
[   48.379186] [<c0533a70>] (driver_unregister) from [<c0534cd0>]
(platform_driver_unregister+0x1c/0x20)
[   48.388370] [<c0534cd0>] (platform_driver_unregister) from
[<bf02b688>] (socfpga_fpga_driver_exit+0x18/0x990 [socfpga])
[   48.399113] [<bf02b688>] (socfpga_fpga_driver_exit [socfpga]) from
[<c01ad948>] (sys_delete_module+0x1a0/0x1f0)
[   48.409164] [<c01ad948>] (sys_delete_module) from [<c0101000>]
(ret_fast_syscall+0x0/0x54)
[   48.417391] Exception stack(0xee6dbfa8 to 0xee6dbff0)
[   48.422424] bfa0:                   0001dce0 beba4be0 0001dd1c
00000800 0000000a 80080000
[   48.430568] bfc0: 0001dce0 beba4be0 00000000 00000081 0001c22c
00000000 00000001 beba4dcc
[   48.438708] bfe0: b6ecdd00 beba4b9c 00012b43 b6ecdd0c
[   48.443773] ---[ end trace bcf003ed0f464330 ]---

Alan

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: fpga: fpga_mgr_free usage
  2018-07-25 16:33   ` Alan Tull
@ 2018-07-26  7:27     ` Federico Vaga
  0 siblings, 0 replies; 5+ messages in thread
From: Federico Vaga @ 2018-07-26  7:27 UTC (permalink / raw)
  To: Alan Tull; +Cc: linux-fpga, linux-kernel

On Wednesday, July 25, 2018 6:33:44 PM CEST Alan Tull wrote:
> On Wed, Jul 11, 2018 at 10:59 AM, Alan Tull <atull@kernel.org> wrote:
> > On Wed, Jul 11, 2018 at 7:38 AM, Federico Vaga <federico.vaga@cern.ch>
> > wrote:
> > 
> > Hi Federico,
> > 
> >> Hi Alan,
> >> 
> >> I have another point that I would like to discuss. It is about the
> >> usage of 'fpga_mgr_free()' which does not look like consistent.
> >> 
> >> This function, according to the current implementation, can be used by
> >> an FPGA manager user and it is used by the FPGA manager itself on
> >> device release. This means that the user can only use this function if
> >> fpga_mgr_register() fails (to clean up), otherwise the user must NOT
> >> use this function, otherwise we most likely get an oops (NULL or
> >> invalid pointer). The example here is correct, this is what we should
> >> do:
> >> 
> >> https://www.kernel.org/doc/html/latest/driver-api/fpga/fpga-mgr.html
> >> 
> >> But I suggest to document it better or prevent this type of mistakes
> >> from happening. Following a couple of proposals
> >> 
> >> ------
> >> 1.
> >> Document it better. This is easy, in the fpga_mgr_free() kernel-doc
> >> comment we explain that the use of this function must be limited to
> >> clean up the memory on a registration failure. If an FPGA manager has
> >> been successfully registered then it will be freed by the framework
> >> itself.
> 
> As I was researching this, I remembered why I implemented it this way.
>   See below for that explanation.
> 
> It looks like I'm going to switch to option 1 here and add more
> documentation for both fpga_mgr_free() and fpga_mgr_unregister().
> Note that fpga_mgr_unregister() already says that that it frees the
> manager, and the usage example already does the right thing, but I'll
> add more words to really beat the message in.
> 
> >> But still, this does not prevent an oops from happening.
> >> ------
> >> 2.
> >> Remove the fpga_mgr_free() from fpga_mgr_dev_release() and ask the
> >> user to free the manager when necessary.
> >> 
> >> This makes the usage consistent: the user creates and destroy its own
> >> objects. This is also consistent with our other discussion where we
> >> said, among the other things, that the module that uses the FPGA
> >> manager can the owner of the fpga_manager data structure.
> > 
> > You're not the first to complain about this.  I think I'll err on the
> > side of consistency and implement your option 2 here.
> > 
> > Alan
> 
> If you write a class or create a device, the kernel wants a release
> function and will give a warning if you leave it out.  The warning is
> "Device 'fpga0' does not have a release() function, it is broken and
> must be fixed." and comes from drivers/base/core.c.

True, but that function can be empty (in other words, it does nothing) and 
option 2 can be implemented as well without warnings. I do not think is that 
bad, for example if I allocate everything with devm_* probably I will not have 
much to do in my release() function.
Anyway, I do not have strong technical arguments in favor of option 1 or 2.

> I will add some more documentation to make it clear that once a a
> mgr/bridge/region has been registered, the cleanup will be handled
> automatically when the device goes away.  Until the
> fpga_(mgr|bridge|region)_register succeeds, the caller still needs to
> do cleanup.
> 
> I did find one bug while looking at this.  I'll post some patches.
> 
> Full message was:
> root@cyclone5:~# rmmod socfpga
> [   48.206235] fpga_manager fpga0: fpga_mgr_unregister Altera SOCFPGA
> FPGA Manager
> [   48.213677] ------------[ cut here ]------------
> [   48.218312] WARNING: CPU: 1 PID: 1369 at
> /home/atull/repos/linux-socfpga/drivers/base/core.c:895
> device_release+0x9c/0xa0
> [   48.229293] Device 'fpga0' does not have a release() function, it
> is broken and must be fixed.
> [   48.237904] Modules linked in: socfpga(-) altera_hps2fpga fpga_mgr
> fpga_bridge [last unloaded: fpga_region]
> [   48.247659] CPU: 1 PID: 1369 Comm: rmmod Not tainted
> 4.18.0-rc5-next-20180717-00012-ge5f548e-dirty #3
> [   48.256843] Hardware name: Altera SOCFPGA
> [   48.260858] [<c01137ac>] (unwind_backtrace) from [<c010dc04>]
> (show_stack+0x20/0x24)
> [   48.268582] [<c010dc04>] (show_stack) from [<c07d448c>]
> (dump_stack+0x8c/0xa0)
> [   48.275786] [<c07d448c>] (dump_stack) from [<c0123bc0>]
> (__warn+0x104/0x11c) [   48.282810] [<c0123bc0>] (__warn) from
> [<c0123c2c>]
> (warn_slowpath_fmt+0x54/0x70)
> [   48.290269] [<c0123c2c>] (warn_slowpath_fmt) from [<c052c9cc>]
> (device_release+0x9c/0xa0)
> [   48.298418] [<c052c9cc>] (device_release) from [<c07d904c>]
> (kobject_put+0xa8/0xe0)
> [   48.306047] [<c07d904c>] (kobject_put) from [<c052ec3c>]
> (device_unregister+0x2c/0x30)
> [   48.313939] [<c052ec3c>] (device_unregister) from [<bf01262c>]
> (fpga_mgr_unregister+0x58/0x74 [fpga_mgr])
> [   48.323475] [<bf01262c>] (fpga_mgr_unregister [fpga_mgr]) from
> [<bf02b01c>] (socfpga_fpga_remove+0x1c/0x24 [socfpga])
> [   48.334047] [<bf02b01c>] (socfpga_fpga_remove [socfpga]) from
> [<c0534be8>] (platform_drv_remove+0x34/0x4c)
> [   48.343664] [<c0534be8>] (platform_drv_remove) from [<c0532f64>]
> (device_release_driver_internal+0x180/0x230)
> [   48.353538] [<c0532f64>] (device_release_driver_internal) from
> [<c0533090>] (driver_detach+0x58/0xa0)
> [   48.362720] [<c0533090>] (driver_detach) from [<c0531bf8>]
> (bus_remove_driver+0x5c/0xb4)
> [   48.370781] [<c0531bf8>] (bus_remove_driver) from [<c0533a70>]
> (driver_unregister+0x38/0x58)
> [   48.379186] [<c0533a70>] (driver_unregister) from [<c0534cd0>]
> (platform_driver_unregister+0x1c/0x20)
> [   48.388370] [<c0534cd0>] (platform_driver_unregister) from
> [<bf02b688>] (socfpga_fpga_driver_exit+0x18/0x990 [socfpga])
> [   48.399113] [<bf02b688>] (socfpga_fpga_driver_exit [socfpga]) from
> [<c01ad948>] (sys_delete_module+0x1a0/0x1f0)
> [   48.409164] [<c01ad948>] (sys_delete_module) from [<c0101000>]
> (ret_fast_syscall+0x0/0x54)
> [   48.417391] Exception stack(0xee6dbfa8 to 0xee6dbff0)
> [   48.422424] bfa0:                   0001dce0 beba4be0 0001dd1c
> 00000800 0000000a 80080000
> [   48.430568] bfc0: 0001dce0 beba4be0 00000000 00000081 0001c22c
> 00000000 00000001 beba4dcc
> [   48.438708] bfe0: b6ecdd00 beba4b9c 00012b43 b6ecdd0c
> [   48.443773] ---[ end trace bcf003ed0f464330 ]---
> 
> Alan

Federico Vaga
[BE-CO-HT]



^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, back to index

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-07-11 12:38 fpga: fpga_mgr_free usage Federico Vaga
2018-07-11 15:59 ` Alan Tull
2018-07-11 18:25   ` Moritz Fischer
2018-07-25 16:33   ` Alan Tull
2018-07-26  7:27     ` Federico Vaga

LKML Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/lkml/0 lkml/git/0.git
	git clone --mirror https://lore.kernel.org/lkml/1 lkml/git/1.git
	git clone --mirror https://lore.kernel.org/lkml/2 lkml/git/2.git
	git clone --mirror https://lore.kernel.org/lkml/3 lkml/git/3.git
	git clone --mirror https://lore.kernel.org/lkml/4 lkml/git/4.git
	git clone --mirror https://lore.kernel.org/lkml/5 lkml/git/5.git
	git clone --mirror https://lore.kernel.org/lkml/6 lkml/git/6.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 lkml lkml/ https://lore.kernel.org/lkml \
		linux-kernel@vger.kernel.org linux-kernel@archiver.kernel.org
	public-inbox-index lkml


Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-kernel


AGPL code for this site: git clone https://public-inbox.org/ public-inbox