linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* memory leaks in dasd_eckd_check_characteristics() error paths
@ 2019-10-02 19:33 Qian Cai
  2019-10-16 13:29 ` Stefan Haberland
  0 siblings, 1 reply; 7+ messages in thread
From: Qian Cai @ 2019-10-02 19:33 UTC (permalink / raw)
  To: Stefan Haberland, Jan Hoeppner, Heiko Carstens, Vasily Gorbik,
	Christian Borntraeger
  Cc: linux-s390, linux-kernel

For some reasons, dasd_eckd_check_characteristics() received -ENOMEM and then
dasd_generic_set_online() emits this message,

dasd: 0.0.0122 Setting the DASD online with discipline ECKD failed with rc=-12

After that, there are several memory leaks below. There are "config_data" and
then stored as,

/* store per path conf_data */
device->path[pos].conf_data = conf_data;

When it processes the error path in  dasd_generic_set_online(), it calls
dasd_delete_device() which nuke the whole "struct dasd_device" without freeing
the device->path[].conf_data first. Is it safe to free those in
dasd_free_device() without worrying about the double-free? Or, is it better to
free those in dasd_eckd_check_characteristics()'s goto error handling, i.e.,
out_err*?

--- a/drivers/s390/block/dasd.c
+++ b/drivers/s390/block/dasd.c
@@ -153,6 +153,9 @@ struct dasd_device *dasd_alloc_device(void)
  */
 void dasd_free_device(struct dasd_device *device)
 {
+       for (int i = 0; i < 8; i++)
+               kfree(device->path[i].conf_data);
+
        kfree(device->private);
        free_pages((unsigned long) device->ese_mem, 1);
        free_page((unsigned long) device->erp_mem);


unreferenced object 0x0fcee900 (size 256):
  comm "dasdconf.sh", pid 446, jiffies 4294940081 (age 170.340s)
  hex dump (first 32 bytes):
    dc 01 01 00 f0 f0 f2 f1 f0 f7 f9 f0 f0 c9 c2 d4  ................
    f7 f5 f0 f0 f0 f0 f0 f0 f0 c6 d9 c2 f7 f1 62 33  ..............b3
  backtrace:
    [<00000000a83b1992>] kmem_cache_alloc_trace+0x200/0x388
    [<00000000048ef3e2>] dasd_eckd_read_conf+0x408/0x1400 [dasd_eckd_mod]
    [<00000000ce31f195>] dasd_eckd_check_characteristics+0x3cc/0x938
[dasd_eckd_mod]
    [<00000000f6f1759b>] dasd_generic_set_online+0x150/0x4c0
    [<00000000efca1efa>] ccw_device_set_online+0x324/0x808
    [<00000000f9779774>] online_store_recog_and_online+0xe8/0x220
    [<00000000349a5446>] online_store+0x2ce/0x420
    [<000000005bd145f8>] kernfs_fop_write+0x1bc/0x270
    [<0000000005664197>] vfs_write+0xce/0x220
    [<0000000044a8bccb>] ksys_write+0xea/0x190
    [<0000000037335938>] system_call+0x296/0x2b4

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: memory leaks in dasd_eckd_check_characteristics() error paths
  2019-10-02 19:33 memory leaks in dasd_eckd_check_characteristics() error paths Qian Cai
@ 2019-10-16 13:29 ` Stefan Haberland
  2019-10-16 14:09   ` Qian Cai
  0 siblings, 1 reply; 7+ messages in thread
From: Stefan Haberland @ 2019-10-16 13:29 UTC (permalink / raw)
  To: Qian Cai, Jan Hoeppner, Heiko Carstens, Vasily Gorbik,
	Christian Borntraeger
  Cc: linux-s390, linux-kernel

Hi,

thanks for reporting this.

On 02.10.19 21:33, Qian Cai wrote:
> For some reasons, dasd_eckd_check_characteristics() received -ENOMEM and then
> dasd_generic_set_online() emits this message,
>
> dasd: 0.0.0122 Setting the DASD online with discipline ECKD failed with rc=-12
>
> After that, there are several memory leaks below. There are "config_data" and
> then stored as,
>
> /* store per path conf_data */
> device->path[pos].conf_data = conf_data;
>
> When it processes the error path in  dasd_generic_set_online(), it calls
> dasd_delete_device() which nuke the whole "struct dasd_device" without freeing
> the device->path[].conf_data first. 

Usually dasd_delete_device() calls dasd_generic_free_discipline() which
takes care of
the device->path[].conf_data in dasd_eckd_uncheck_device().
From a first look this looks sane.

So I need to spend a closer look if this does not happen correctly here.

> Is it safe to free those in
> dasd_free_device() without worrying about the double-free? Or, is it better to
> free those in dasd_eckd_check_characteristics()'s goto error handling, i.e.,
> out_err*?
>
> --- a/drivers/s390/block/dasd.c
> +++ b/drivers/s390/block/dasd.c
> @@ -153,6 +153,9 @@ struct dasd_device *dasd_alloc_device(void)
>   */
>  void dasd_free_device(struct dasd_device *device)
>  {
> +       for (int i = 0; i < 8; i++)
> +               kfree(device->path[i].conf_data);
> +
>         kfree(device->private);
>         free_pages((unsigned long) device->ese_mem, 1);
>         free_page((unsigned long) device->erp_mem);
>
>
> unreferenced object 0x0fcee900 (size 256):
>   comm "dasdconf.sh", pid 446, jiffies 4294940081 (age 170.340s)
>   hex dump (first 32 bytes):
>     dc 01 01 00 f0 f0 f2 f1 f0 f7 f9 f0 f0 c9 c2 d4  ................
>     f7 f5 f0 f0 f0 f0 f0 f0 f0 c6 d9 c2 f7 f1 62 33  ..............b3
>   backtrace:
>     [<00000000a83b1992>] kmem_cache_alloc_trace+0x200/0x388
>     [<00000000048ef3e2>] dasd_eckd_read_conf+0x408/0x1400 [dasd_eckd_mod]
>     [<00000000ce31f195>] dasd_eckd_check_characteristics+0x3cc/0x938
> [dasd_eckd_mod]
>     [<00000000f6f1759b>] dasd_generic_set_online+0x150/0x4c0
>     [<00000000efca1efa>] ccw_device_set_online+0x324/0x808
>     [<00000000f9779774>] online_store_recog_and_online+0xe8/0x220
>     [<00000000349a5446>] online_store+0x2ce/0x420
>     [<000000005bd145f8>] kernfs_fop_write+0x1bc/0x270
>     [<0000000005664197>] vfs_write+0xce/0x220
>     [<0000000044a8bccb>] ksys_write+0xea/0x190
>     [<0000000037335938>] system_call+0x296/0x2b4



^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: memory leaks in dasd_eckd_check_characteristics() error paths
  2019-10-16 13:29 ` Stefan Haberland
@ 2019-10-16 14:09   ` Qian Cai
  2019-10-16 14:56     ` Stefan Haberland
  0 siblings, 1 reply; 7+ messages in thread
From: Qian Cai @ 2019-10-16 14:09 UTC (permalink / raw)
  To: Stefan Haberland, Jan Hoeppner, Heiko Carstens, Vasily Gorbik,
	Christian Borntraeger
  Cc: linux-s390, linux-kernel

On Wed, 2019-10-16 at 15:29 +0200, Stefan Haberland wrote:
> Hi,
> 
> thanks for reporting this.
> 
> On 02.10.19 21:33, Qian Cai wrote:
> > For some reasons, dasd_eckd_check_characteristics() received -ENOMEM and then
> > dasd_generic_set_online() emits this message,
> > 
> > dasd: 0.0.0122 Setting the DASD online with discipline ECKD failed with rc=-12
> > 
> > After that, there are several memory leaks below. There are "config_data" and
> > then stored as,
> > 
> > /* store per path conf_data */
> > device->path[pos].conf_data = conf_data;
> > 
> > When it processes the error path in  dasd_generic_set_online(), it calls
> > dasd_delete_device() which nuke the whole "struct dasd_device" without freeing
> > the device->path[].conf_data first. 
> 
> Usually dasd_delete_device() calls dasd_generic_free_discipline() which
> takes care of
> the device->path[].conf_data in dasd_eckd_uncheck_device().
> From a first look this looks sane.
> 
> So I need to spend a closer look if this does not happen correctly here.

When dasd_eckd_check_characteristics() failed here,

	if (!private) {
		private = kzalloc(sizeof(*private), GFP_KERNEL | GFP_DMA);
		if (!private) {
			dev_warn(&device->cdev->dev,
				 "Allocating memory for private DASD data "
				 "failed\n");
			return -ENOMEM;
		}
		device->private = private;

The device->private is NULL.

Then, in dasd_eckd_uncheck_device(), it will return immediately.

	if (!private)
		return;

> 
> > Is it safe to free those in
> > dasd_free_device() without worrying about the double-free? Or, is it better to
> > free those in dasd_eckd_check_characteristics()'s goto error handling, i.e.,
> > out_err*?
> > 
> > --- a/drivers/s390/block/dasd.c
> > +++ b/drivers/s390/block/dasd.c
> > @@ -153,6 +153,9 @@ struct dasd_device *dasd_alloc_device(void)
> >   */
> >  void dasd_free_device(struct dasd_device *device)
> >  {
> > +       for (int i = 0; i < 8; i++)
> > +               kfree(device->path[i].conf_data);
> > +
> >         kfree(device->private);
> >         free_pages((unsigned long) device->ese_mem, 1);
> >         free_page((unsigned long) device->erp_mem);
> > 
> > 
> > unreferenced object 0x0fcee900 (size 256):
> >   comm "dasdconf.sh", pid 446, jiffies 4294940081 (age 170.340s)
> >   hex dump (first 32 bytes):
> >     dc 01 01 00 f0 f0 f2 f1 f0 f7 f9 f0 f0 c9 c2 d4  ................
> >     f7 f5 f0 f0 f0 f0 f0 f0 f0 c6 d9 c2 f7 f1 62 33  ..............b3
> >   backtrace:
> >     [<00000000a83b1992>] kmem_cache_alloc_trace+0x200/0x388
> >     [<00000000048ef3e2>] dasd_eckd_read_conf+0x408/0x1400 [dasd_eckd_mod]
> >     [<00000000ce31f195>] dasd_eckd_check_characteristics+0x3cc/0x938
> > [dasd_eckd_mod]
> >     [<00000000f6f1759b>] dasd_generic_set_online+0x150/0x4c0
> >     [<00000000efca1efa>] ccw_device_set_online+0x324/0x808
> >     [<00000000f9779774>] online_store_recog_and_online+0xe8/0x220
> >     [<00000000349a5446>] online_store+0x2ce/0x420
> >     [<000000005bd145f8>] kernfs_fop_write+0x1bc/0x270
> >     [<0000000005664197>] vfs_write+0xce/0x220
> >     [<0000000044a8bccb>] ksys_write+0xea/0x190
> >     [<0000000037335938>] system_call+0x296/0x2b4
> 
> 

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: memory leaks in dasd_eckd_check_characteristics() error paths
  2019-10-16 14:09   ` Qian Cai
@ 2019-10-16 14:56     ` Stefan Haberland
  2019-10-16 15:26       ` Qian Cai
  0 siblings, 1 reply; 7+ messages in thread
From: Stefan Haberland @ 2019-10-16 14:56 UTC (permalink / raw)
  To: Qian Cai, Jan Hoeppner, Heiko Carstens, Vasily Gorbik,
	Christian Borntraeger
  Cc: linux-s390, linux-kernel

On 16.10.19 16:09, Qian Cai wrote:
> On Wed, 2019-10-16 at 15:29 +0200, Stefan Haberland wrote:
>> Hi,
>>
>> thanks for reporting this.
>>
>> On 02.10.19 21:33, Qian Cai wrote:
>>> For some reasons, dasd_eckd_check_characteristics() received -ENOMEM and then
>>> dasd_generic_set_online() emits this message,
>>>
>>> dasd: 0.0.0122 Setting the DASD online with discipline ECKD failed with rc=-12
>>>
>>> After that, there are several memory leaks below. There are "config_data" and
>>> then stored as,
>>>
>>> /* store per path conf_data */
>>> device->path[pos].conf_data = conf_data;
>>>
>>> When it processes the error path in  dasd_generic_set_online(), it calls
>>> dasd_delete_device() which nuke the whole "struct dasd_device" without freeing
>>> the device->path[].conf_data first. 
>> Usually dasd_delete_device() calls dasd_generic_free_discipline() which
>> takes care of
>> the device->path[].conf_data in dasd_eckd_uncheck_device().
>> From a first look this looks sane.
>>
>> So I need to spend a closer look if this does not happen correctly here.
> When dasd_eckd_check_characteristics() failed here,
>
> 	if (!private) {
> 		private = kzalloc(sizeof(*private), GFP_KERNEL | GFP_DMA);
> 		if (!private) {
> 			dev_warn(&device->cdev->dev,
> 				 "Allocating memory for private DASD data "
> 				 "failed\n");
> 			return -ENOMEM;
> 		}
> 		device->private = private;
>
> The device->private is NULL.
>
> Then, in dasd_eckd_uncheck_device(), it will return immediately.
>
> 	if (!private)
> 		return;

Yes but in this case there is no per_path configuration data stored.
This is done after the private structure is allocated successfully.


>>> Is it safe to free those in
>>> dasd_free_device() without worrying about the double-free? Or, is it better to
>>> free those in dasd_eckd_check_characteristics()'s goto error handling, i.e.,
>>> out_err*?
>>>
>>> --- a/drivers/s390/block/dasd.c
>>> +++ b/drivers/s390/block/dasd.c
>>> @@ -153,6 +153,9 @@ struct dasd_device *dasd_alloc_device(void)
>>>   */
>>>  void dasd_free_device(struct dasd_device *device)
>>>  {
>>> +       for (int i = 0; i < 8; i++)
>>> +               kfree(device->path[i].conf_data);
>>> +
>>>         kfree(device->private);
>>>         free_pages((unsigned long) device->ese_mem, 1);
>>>         free_page((unsigned long) device->erp_mem);
>>>
>>>
>>> unreferenced object 0x0fcee900 (size 256):
>>>   comm "dasdconf.sh", pid 446, jiffies 4294940081 (age 170.340s)
>>>   hex dump (first 32 bytes):
>>>     dc 01 01 00 f0 f0 f2 f1 f0 f7 f9 f0 f0 c9 c2 d4  ................
>>>     f7 f5 f0 f0 f0 f0 f0 f0 f0 c6 d9 c2 f7 f1 62 33  ..............b3
>>>   backtrace:
>>>     [<00000000a83b1992>] kmem_cache_alloc_trace+0x200/0x388
>>>     [<00000000048ef3e2>] dasd_eckd_read_conf+0x408/0x1400 [dasd_eckd_mod]
>>>     [<00000000ce31f195>] dasd_eckd_check_characteristics+0x3cc/0x938
>>> [dasd_eckd_mod]
>>>     [<00000000f6f1759b>] dasd_generic_set_online+0x150/0x4c0
>>>     [<00000000efca1efa>] ccw_device_set_online+0x324/0x808
>>>     [<00000000f9779774>] online_store_recog_and_online+0xe8/0x220
>>>     [<00000000349a5446>] online_store+0x2ce/0x420
>>>     [<000000005bd145f8>] kernfs_fop_write+0x1bc/0x270
>>>     [<0000000005664197>] vfs_write+0xce/0x220
>>>     [<0000000044a8bccb>] ksys_write+0xea/0x190
>>>     [<0000000037335938>] system_call+0x296/0x2b4
>>


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: memory leaks in dasd_eckd_check_characteristics() error paths
  2019-10-16 14:56     ` Stefan Haberland
@ 2019-10-16 15:26       ` Qian Cai
  2019-10-16 15:28         ` Stefan Haberland
  0 siblings, 1 reply; 7+ messages in thread
From: Qian Cai @ 2019-10-16 15:26 UTC (permalink / raw)
  To: Stefan Haberland, Jan Hoeppner, Heiko Carstens, Vasily Gorbik,
	Christian Borntraeger
  Cc: linux-s390, linux-kernel

On Wed, 2019-10-16 at 16:56 +0200, Stefan Haberland wrote:
> On 16.10.19 16:09, Qian Cai wrote:
> > On Wed, 2019-10-16 at 15:29 +0200, Stefan Haberland wrote:
> > > Hi,
> > > 
> > > thanks for reporting this.
> > > 
> > > On 02.10.19 21:33, Qian Cai wrote:
> > > > For some reasons, dasd_eckd_check_characteristics() received -ENOMEM and then
> > > > dasd_generic_set_online() emits this message,
> > > > 
> > > > dasd: 0.0.0122 Setting the DASD online with discipline ECKD failed with rc=-12
> > > > 
> > > > After that, there are several memory leaks below. There are "config_data" and
> > > > then stored as,
> > > > 
> > > > /* store per path conf_data */
> > > > device->path[pos].conf_data = conf_data;
> > > > 
> > > > When it processes the error path in  dasd_generic_set_online(), it calls
> > > > dasd_delete_device() which nuke the whole "struct dasd_device" without freeing
> > > > the device->path[].conf_data first. 
> > > 
> > > Usually dasd_delete_device() calls dasd_generic_free_discipline() which
> > > takes care of
> > > the device->path[].conf_data in dasd_eckd_uncheck_device().
> > > From a first look this looks sane.
> > > 
> > > So I need to spend a closer look if this does not happen correctly here.
> > 
> > When dasd_eckd_check_characteristics() failed here,
> > 
> > 	if (!private) {
> > 		private = kzalloc(sizeof(*private), GFP_KERNEL | GFP_DMA);
> > 		if (!private) {
> > 			dev_warn(&device->cdev->dev,
> > 				 "Allocating memory for private DASD data "
> > 				 "failed\n");
> > 			return -ENOMEM;
> > 		}
> > 		device->private = private;
> > 
> > The device->private is NULL.
> > 
> > Then, in dasd_eckd_uncheck_device(), it will return immediately.
> > 
> > 	if (!private)
> > 		return;
> 
> Yes but in this case there is no per_path configuration data stored.
> This is done after the private structure is allocated successfully.

Yes, you are right. It has been a while so I must lost a bit memory. Actually,
it looks like in dasd_eckd_check_characteristic() that device->private is set to
NULL from this path,

	/* Read Configuration Data */
	rc = dasd_eckd_read_conf(device);
	if (rc)
		goto out_err1;

out_err1:
	kfree(private->conf_data);
	kfree(device->private);
	device->private = NULL;
	return rc;

because dasd_eckd_read_conf() returns -ENOMEM and calls,

out_error:
	kfree(rcd_buf);
	*rcd_buffer = NULL;
	*rcd_buffer_size = 0;
	return ret;

It will only free its own config_data in one operational path, but there could
has already allocated in earlier paths in dasd_eckd_read_conf() without any
rollback and calls return,

	for (lpm = 0x80; lpm; lpm>>= 1) {
		if (!(lpm & opm))
			continue;
		rc = dasd_eckd_read_conf_lpm(device, &conf_data,
					     &conf_len, lpm);
		if (rc && rc != -EOPNOTSUPP) {	/* -EOPNOTSUPP is ok */
			DBF_EVENT_DEVID(DBF_WARNING, device->cdev,
					"Read configuration data returned "
					"error %d", rc);
			return rc;
		}

Later, dasd_eckd_uncheck_device() see private->device is NULL without cleaning
up. Does it make sense?

> 
> 
> > > > Is it safe to free those in
> > > > dasd_free_device() without worrying about the double-free? Or, is it better to
> > > > free those in dasd_eckd_check_characteristics()'s goto error handling, i.e.,
> > > > out_err*?
> > > > 
> > > > --- a/drivers/s390/block/dasd.c
> > > > +++ b/drivers/s390/block/dasd.c
> > > > @@ -153,6 +153,9 @@ struct dasd_device *dasd_alloc_device(void)
> > > >   */
> > > >  void dasd_free_device(struct dasd_device *device)
> > > >  {
> > > > +       for (int i = 0; i < 8; i++)
> > > > +               kfree(device->path[i].conf_data);
> > > > +
> > > >         kfree(device->private);
> > > >         free_pages((unsigned long) device->ese_mem, 1);
> > > >         free_page((unsigned long) device->erp_mem);
> > > > 
> > > > 
> > > > unreferenced object 0x0fcee900 (size 256):
> > > >   comm "dasdconf.sh", pid 446, jiffies 4294940081 (age 170.340s)
> > > >   hex dump (first 32 bytes):
> > > >     dc 01 01 00 f0 f0 f2 f1 f0 f7 f9 f0 f0 c9 c2 d4  ................
> > > >     f7 f5 f0 f0 f0 f0 f0 f0 f0 c6 d9 c2 f7 f1 62 33  ..............b3
> > > >   backtrace:
> > > >     [<00000000a83b1992>] kmem_cache_alloc_trace+0x200/0x388
> > > >     [<00000000048ef3e2>] dasd_eckd_read_conf+0x408/0x1400 [dasd_eckd_mod]
> > > >     [<00000000ce31f195>] dasd_eckd_check_characteristics+0x3cc/0x938
> > > > [dasd_eckd_mod]
> > > >     [<00000000f6f1759b>] dasd_generic_set_online+0x150/0x4c0
> > > >     [<00000000efca1efa>] ccw_device_set_online+0x324/0x808
> > > >     [<00000000f9779774>] online_store_recog_and_online+0xe8/0x220
> > > >     [<00000000349a5446>] online_store+0x2ce/0x420
> > > >     [<000000005bd145f8>] kernfs_fop_write+0x1bc/0x270
> > > >     [<0000000005664197>] vfs_write+0xce/0x220
> > > >     [<0000000044a8bccb>] ksys_write+0xea/0x190
> > > >     [<0000000037335938>] system_call+0x296/0x2b4
> 
> 

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: memory leaks in dasd_eckd_check_characteristics() error paths
  2019-10-16 15:26       ` Qian Cai
@ 2019-10-16 15:28         ` Stefan Haberland
  2019-10-18 12:06           ` Stefan Haberland
  0 siblings, 1 reply; 7+ messages in thread
From: Stefan Haberland @ 2019-10-16 15:28 UTC (permalink / raw)
  To: Qian Cai, Jan Hoeppner, Heiko Carstens, Vasily Gorbik,
	Christian Borntraeger
  Cc: linux-s390, linux-kernel

On 16.10.19 17:26, Qian Cai wrote:
> On Wed, 2019-10-16 at 16:56 +0200, Stefan Haberland wrote:
>> On 16.10.19 16:09, Qian Cai wrote:
>>> On Wed, 2019-10-16 at 15:29 +0200, Stefan Haberland wrote:
>>>> Hi,
>>>>
>>>> thanks for reporting this.
>>>>
>>>> On 02.10.19 21:33, Qian Cai wrote:
>>>>> For some reasons, dasd_eckd_check_characteristics() received -ENOMEM and then
>>>>> dasd_generic_set_online() emits this message,
>>>>>
>>>>> dasd: 0.0.0122 Setting the DASD online with discipline ECKD failed with rc=-12
>>>>>
>>>>> After that, there are several memory leaks below. There are "config_data" and
>>>>> then stored as,
>>>>>
>>>>> /* store per path conf_data */
>>>>> device->path[pos].conf_data = conf_data;
>>>>>
>>>>> When it processes the error path in  dasd_generic_set_online(), it calls
>>>>> dasd_delete_device() which nuke the whole "struct dasd_device" without freeing
>>>>> the device->path[].conf_data first. 
>>>> Usually dasd_delete_device() calls dasd_generic_free_discipline() which
>>>> takes care of
>>>> the device->path[].conf_data in dasd_eckd_uncheck_device().
>>>> From a first look this looks sane.
>>>>
>>>> So I need to spend a closer look if this does not happen correctly here.
>>> When dasd_eckd_check_characteristics() failed here,
>>>
>>> 	if (!private) {
>>> 		private = kzalloc(sizeof(*private), GFP_KERNEL | GFP_DMA);
>>> 		if (!private) {
>>> 			dev_warn(&device->cdev->dev,
>>> 				 "Allocating memory for private DASD data "
>>> 				 "failed\n");
>>> 			return -ENOMEM;
>>> 		}
>>> 		device->private = private;
>>>
>>> The device->private is NULL.
>>>
>>> Then, in dasd_eckd_uncheck_device(), it will return immediately.
>>>
>>> 	if (!private)
>>> 		return;
>> Yes but in this case there is no per_path configuration data stored.
>> This is done after the private structure is allocated successfully.
> Yes, you are right. It has been a while so I must lost a bit memory. Actually,
> it looks like in dasd_eckd_check_characteristic() that device->private is set to
> NULL from this path,
>
> 	/* Read Configuration Data */
> 	rc = dasd_eckd_read_conf(device);
> 	if (rc)
> 		goto out_err1;
>
> out_err1:
> 	kfree(private->conf_data);
> 	kfree(device->private);
> 	device->private = NULL;
> 	return rc;
>
> because dasd_eckd_read_conf() returns -ENOMEM and calls,
>
> out_error:
> 	kfree(rcd_buf);
> 	*rcd_buffer = NULL;
> 	*rcd_buffer_size = 0;
> 	return ret;
>
> It will only free its own config_data in one operational path, but there could
> has already allocated in earlier paths in dasd_eckd_read_conf() without any
> rollback and calls return,
>
> 	for (lpm = 0x80; lpm; lpm>>= 1) {
> 		if (!(lpm & opm))
> 			continue;
> 		rc = dasd_eckd_read_conf_lpm(device, &conf_data,
> 					     &conf_len, lpm);
> 		if (rc && rc != -EOPNOTSUPP) {	/* -EOPNOTSUPP is ok */
> 			DBF_EVENT_DEVID(DBF_WARNING, device->cdev,
> 					"Read configuration data returned "
> 					"error %d", rc);
> 			return rc;
> 		}
>
> Later, dasd_eckd_uncheck_device() see private->device is NULL without cleaning
> up. Does it make sense?

Yes, this looks like an error path not handling this correctly.

>
>>
>>>>> Is it safe to free those in
>>>>> dasd_free_device() without worrying about the double-free? Or, is it better to
>>>>> free those in dasd_eckd_check_characteristics()'s goto error handling, i.e.,
>>>>> out_err*?
>>>>>
>>>>> --- a/drivers/s390/block/dasd.c
>>>>> +++ b/drivers/s390/block/dasd.c
>>>>> @@ -153,6 +153,9 @@ struct dasd_device *dasd_alloc_device(void)
>>>>>   */
>>>>>  void dasd_free_device(struct dasd_device *device)
>>>>>  {
>>>>> +       for (int i = 0; i < 8; i++)
>>>>> +               kfree(device->path[i].conf_data);
>>>>> +
>>>>>         kfree(device->private);
>>>>>         free_pages((unsigned long) device->ese_mem, 1);
>>>>>         free_page((unsigned long) device->erp_mem);
>>>>>
>>>>>
>>>>> unreferenced object 0x0fcee900 (size 256):
>>>>>   comm "dasdconf.sh", pid 446, jiffies 4294940081 (age 170.340s)
>>>>>   hex dump (first 32 bytes):
>>>>>     dc 01 01 00 f0 f0 f2 f1 f0 f7 f9 f0 f0 c9 c2 d4  ................
>>>>>     f7 f5 f0 f0 f0 f0 f0 f0 f0 c6 d9 c2 f7 f1 62 33  ..............b3
>>>>>   backtrace:
>>>>>     [<00000000a83b1992>] kmem_cache_alloc_trace+0x200/0x388
>>>>>     [<00000000048ef3e2>] dasd_eckd_read_conf+0x408/0x1400 [dasd_eckd_mod]
>>>>>     [<00000000ce31f195>] dasd_eckd_check_characteristics+0x3cc/0x938
>>>>> [dasd_eckd_mod]
>>>>>     [<00000000f6f1759b>] dasd_generic_set_online+0x150/0x4c0
>>>>>     [<00000000efca1efa>] ccw_device_set_online+0x324/0x808
>>>>>     [<00000000f9779774>] online_store_recog_and_online+0xe8/0x220
>>>>>     [<00000000349a5446>] online_store+0x2ce/0x420
>>>>>     [<000000005bd145f8>] kernfs_fop_write+0x1bc/0x270
>>>>>     [<0000000005664197>] vfs_write+0xce/0x220
>>>>>     [<0000000044a8bccb>] ksys_write+0xea/0x190
>>>>>     [<0000000037335938>] system_call+0x296/0x2b4
>>


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: memory leaks in dasd_eckd_check_characteristics() error paths
  2019-10-16 15:28         ` Stefan Haberland
@ 2019-10-18 12:06           ` Stefan Haberland
  0 siblings, 0 replies; 7+ messages in thread
From: Stefan Haberland @ 2019-10-18 12:06 UTC (permalink / raw)
  To: Qian Cai
  Cc: Jan Hoeppner, Heiko Carstens, Vasily Gorbik,
	Christian Borntraeger, linux-s390, linux-kernel

On 16.10.19 17:28, Stefan Haberland wrote:
> On 16.10.19 17:26, Qian Cai wrote:
>> On Wed, 2019-10-16 at 16:56 +0200, Stefan Haberland wrote:
>>> On 16.10.19 16:09, Qian Cai wrote:
>>>> On Wed, 2019-10-16 at 15:29 +0200, Stefan Haberland wrote:
>>>>> Hi,
>>>>>
>>>>> thanks for reporting this.
>>>>>
>>>>> On 02.10.19 21:33, Qian Cai wrote:
>>>>>> For some reasons, dasd_eckd_check_characteristics() received -ENOMEM and then
>>>>>> dasd_generic_set_online() emits this message,
>>>>>>
>>>>>> dasd: 0.0.0122 Setting the DASD online with discipline ECKD failed with rc=-12
>>>>>>
>>>>>> After that, there are several memory leaks below. There are "config_data" and
>>>>>> then stored as,
>>>>>>
>>>>>> /* store per path conf_data */
>>>>>> device->path[pos].conf_data = conf_data;
>>>>>>
>>>>>> When it processes the error path in  dasd_generic_set_online(), it calls
>>>>>> dasd_delete_device() which nuke the whole "struct dasd_device" without freeing
>>>>>> the device->path[].conf_data first. 
>>>>> Usually dasd_delete_device() calls dasd_generic_free_discipline() which
>>>>> takes care of
>>>>> the device->path[].conf_data in dasd_eckd_uncheck_device().
>>>>> From a first look this looks sane.
>>>>>
>>>>> So I need to spend a closer look if this does not happen correctly here.
>>>> When dasd_eckd_check_characteristics() failed here,
>>>>
>>>> 	if (!private) {
>>>> 		private = kzalloc(sizeof(*private), GFP_KERNEL | GFP_DMA);
>>>> 		if (!private) {
>>>> 			dev_warn(&device->cdev->dev,
>>>> 				 "Allocating memory for private DASD data "
>>>> 				 "failed\n");
>>>> 			return -ENOMEM;
>>>> 		}
>>>> 		device->private = private;
>>>>
>>>> The device->private is NULL.
>>>>
>>>> Then, in dasd_eckd_uncheck_device(), it will return immediately.
>>>>
>>>> 	if (!private)
>>>> 		return;
>>> Yes but in this case there is no per_path configuration data stored.
>>> This is done after the private structure is allocated successfully.
>> Yes, you are right. It has been a while so I must lost a bit memory. Actually,
>> it looks like in dasd_eckd_check_characteristic() that device->private is set to
>> NULL from this path,
>>
>> 	/* Read Configuration Data */
>> 	rc = dasd_eckd_read_conf(device);
>> 	if (rc)
>> 		goto out_err1;
>>
>> out_err1:
>> 	kfree(private->conf_data);
>> 	kfree(device->private);
>> 	device->private = NULL;
>> 	return rc;
>>
>> because dasd_eckd_read_conf() returns -ENOMEM and calls,
>>
>> out_error:
>> 	kfree(rcd_buf);
>> 	*rcd_buffer = NULL;
>> 	*rcd_buffer_size = 0;
>> 	return ret;
>>
>> It will only free its own config_data in one operational path, but there could
>> has already allocated in earlier paths in dasd_eckd_read_conf() without any
>> rollback and calls return,
>>
>> 	for (lpm = 0x80; lpm; lpm>>= 1) {
>> 		if (!(lpm & opm))
>> 			continue;
>> 		rc = dasd_eckd_read_conf_lpm(device, &conf_data,
>> 					     &conf_len, lpm);
>> 		if (rc && rc != -EOPNOTSUPP) {	/* -EOPNOTSUPP is ok */
>> 			DBF_EVENT_DEVID(DBF_WARNING, device->cdev,
>> 					"Read configuration data returned "
>> 					"error %d", rc);
>> 			return rc;
>> 		}
>>
>> Later, dasd_eckd_uncheck_device() see private->device is NULL without cleaning
>> up. Does it make sense?
> Yes, this looks like an error path not handling this correctly.
>

I will provide a patch for this. Most likely I will cover this in the
error handling of the function.
Thanks again for reporting.

Regards,
Stefan


^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2019-10-18 12:06 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-10-02 19:33 memory leaks in dasd_eckd_check_characteristics() error paths Qian Cai
2019-10-16 13:29 ` Stefan Haberland
2019-10-16 14:09   ` Qian Cai
2019-10-16 14:56     ` Stefan Haberland
2019-10-16 15:26       ` Qian Cai
2019-10-16 15:28         ` Stefan Haberland
2019-10-18 12:06           ` Stefan Haberland

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).