From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S968069AbdAEQ7S (ORCPT ); Thu, 5 Jan 2017 11:59:18 -0500
Received: from mail-qk0-f195.google.com ([209.85.220.195]:34917 "EHLO
        mail-qk0-f195.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1761540AbdAEQ64 (ORCPT ); Thu, 5 Jan 2017 11:58:56 -0500
Subject: Re: [PATCH] thermal: add thermal_zone_remove_device_groups()
To: Zhang Rui , linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org
References: <6c5a1a95-9b03-6f43-8de8-1ab4ac27cc78@gmail.com>
        <1483504508.2320.41.camel@intel.com>
        <1483505107.2320.50.camel@intel.com>
Cc: edubezval@gmail.com
From: Yasuaki Ishimatsu
Message-ID: <7116979e-5405-a02b-17f1-2fe663334907@gmail.com>
Date: Thu, 5 Jan 2017 11:58:48 -0500
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.4.0
MIME-Version: 1.0
In-Reply-To: <1483505107.2320.50.camel@intel.com>
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On 01/03/2017 11:45 PM, Zhang Rui wrote:
> On Wed, 2017-01-04 at 12:35 +0800, Zhang Rui wrote:
>> On Thu, 2016-12-15 at 16:47 -0500, Yasuaki Ishimatsu wrote:
>>>
>>> When offlining all cores on a CPU, the following system panic
>>> occurs:
>>>
>>> BUG: unable to handle kernel NULL pointer dereference at (null)
>>> IP: strlen+0x0/0x20
>>>
>>> Call Trace:
>>> ? kernfs_name_hash+0x17/0x80
>>> kernfs_find_ns+0x3f/0xd0
>>> kernfs_remove_by_name_ns+0x36/0xa0
>>> remove_files.isra.1+0x36/0x70
>>> sysfs_remove_group+0x44/0x90
>>> sysfs_remove_groups+0x2e/0x50
>>> device_remove_attrs+0x5e/0x90
>>> device_del+0x1ea/0x350
>>> device_unregister+0x1a/0x60
>>> thermal_zone_device_unregister+0x1f2/0x210
>>> pkg_thermal_cpu_offline+0x14f/0x1a0 [x86_pkg_temp_thermal]
>>> ? kzalloc.constprop.2+0x10/0x10 [x86_pkg_temp_thermal]
>>> cpuhp_invoke_callback+0x8d/0x3f0
>>> cpuhp_down_callbacks+0x42/0x80
>>> cpuhp_thread_fun+0x8b/0xf0
>>> smpboot_thread_fn+0x110/0x160
>>> kthread+0x101/0x140
>>> ? sort_range+0x30/0x30
>>> ? kthread_park+0x90/0x90
>>> ret_from_fork+0x25/0x30
>>>
>>> thermal_zone_create_device_group() sets attribute_groups in
>>> thermal_zone_attribute_groups[] to tz->device.groups. But these
>>> attributes_groups do not have name argument.
>>>
>> I'm a little confused here, in remove_files(),
>> it is the (struct attribute *)->name which is passed into
>> kernfs_remove_by_name, instead of attributes_groups->name.
>>
>> IMO, a NULL-name attribute group won't bring any problem.
>>
> hah, I see what the problem is here.
> Just like the problem illustrated in this one
> https://patchwork.kernel.org/patch/9492439/
> The root cause is that, the trip point attributes are cleared and freed
> BEFORE device_unregister(), this results in NULL trip point attributes
> when removing the thermal zone device sysfs groups.
>
> And I believe https://patchwork.kernel.org/patch/9492439/ should fix
> the problem for you, right?

Thank you for the information.
I confirmed that the patch fixes the issue.

Thanks,
Yasuaki Ishimatsu

>
> thanks,
> rui
>
>>> So when offlining all cores on CPU and executing
>>> thermal_zone_device_unregister(), the panic occurs in strlen()
>>> called from kernfs_name_hash() because name argument is NULL.
>>>
>>> The patch adds thermal_zone_remove_device_groups() to free
>>> tz->device.groups and set NULL pointer.
>>>
>>> Signed-off-by: Yasuaki Ishimatsu
>>> CC: Zhang Rui
>>> CC: Eduardo Valentin
>>> ---
>>>  drivers/thermal/thermal_core.c  | 3 ++-
>>>  drivers/thermal/thermal_core.h  | 1 +
>>>  drivers/thermal/thermal_sysfs.c | 6 ++++++
>>>  3 files changed, 9 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/thermal/thermal_core.c
>>> b/drivers/thermal/thermal_core.c
>>> index 641faab..926e385 100644
>>> --- a/drivers/thermal/thermal_core.c
>>> +++ b/drivers/thermal/thermal_core.c
>>> @@ -1251,6 +1251,7 @@ struct thermal_zone_device *
>>>
>>>  unregister:
>>>  	release_idr(&thermal_tz_idr, &thermal_idr_lock, tz->id);
>>> +	thermal_zone_remove_device_groups(tz);
>>>  	device_unregister(&tz->device);
>>>  	return ERR_PTR(result);
>>>  }
>>> @@ -1315,8 +1316,8 @@ void thermal_zone_device_unregister(struct
>>> thermal_zone_device *tz)
>>>  	release_idr(&thermal_tz_idr, &thermal_idr_lock, tz->id);
>>>  	idr_destroy(&tz->idr);
>>>  	mutex_destroy(&tz->lock);
>>> +	thermal_zone_remove_device_groups(tz);
>>>  	device_unregister(&tz->device);
>>> -	kfree(tz->device.groups);
>>>  }
>>>  EXPORT_SYMBOL_GPL(thermal_zone_device_unregister);
>>>
>>> diff --git a/drivers/thermal/thermal_core.h
>>> b/drivers/thermal/thermal_core.h
>>> index 2412b37..e3a60db 100644
>>> --- a/drivers/thermal/thermal_core.h
>>> +++ b/drivers/thermal/thermal_core.h
>>> @@ -70,6 +70,7 @@ void thermal_zone_device_unbind_exception(struct
>>> thermal_zone_device *,
>>>  int thermal_build_list_of_policies(char *buf);
>>>
>>>  /* sysfs I/F */
>>> +void thermal_zone_remove_device_groups(struct thermal_zone_device
>>> *tz);
>>>  int thermal_zone_create_device_groups(struct thermal_zone_device
>>> *,
>>> int);
>>>  void thermal_cooling_device_setup_sysfs(struct
>>> thermal_cooling_device *);
>>>  /* used only at binding time */
>>> diff --git a/drivers/thermal/thermal_sysfs.c
>>> b/drivers/thermal/thermal_sysfs.c
>>> index a694de9..3dfd29b 100644
>>> --- a/drivers/thermal/thermal_sysfs.c
>>> +++ b/drivers/thermal/thermal_sysfs.c
>>> @@ -605,6 +605,12 @@ static int create_trip_attrs(struct
>>> thermal_zone_device *tz, int mask)
>>>  	return 0;
>>>  }
>>>
>>> +void thermal_zone_remove_device_groups(struct thermal_zone_device
>>> *tz)
>>> +{
>>> +	kfree(tz->device.groups);
>>> +	tz->device.groups = NULL;
>>> +}
>>> +
>>>  int thermal_zone_create_device_groups(struct thermal_zone_device
>>> *tz,
>>>  int mask)
>>>  {
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-pm"
>> in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
>
>
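
For anyone reading this thread in the archive, the root cause described above (trip point attributes cleared before device_unregister() walks the sysfs groups) can be sketched in plain userspace C. This is only an illustration under stated assumptions, not the kernel code: struct attr, struct attr_group, struct zone and every function below are hypothetical stand-ins, and the early cleanup is modelled by setting each name to NULL rather than by actually freeing memory, so the sketch stays free of undefined behaviour. Where the kernel would oops in strlen(), the sketch returns -1.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NTRIP_ATTRS 2

/* Hypothetical stand-ins for struct attribute / struct attribute_group. */
struct attr { const char *name; };
struct attr_group { struct attr **attrs; };

/* Hypothetical zone: owns the attribute storage the group points into. */
struct zone {
	struct attr trip_attrs[NTRIP_ATTRS];
	struct attr *attrs[NTRIP_ATTRS + 1];	/* NULL-terminated */
	struct attr_group group;
};

void zone_init(struct zone *z)
{
	z->trip_attrs[0].name = "trip_point_0_temp";
	z->trip_attrs[1].name = "trip_point_0_type";
	z->attrs[0] = &z->trip_attrs[0];
	z->attrs[1] = &z->trip_attrs[1];
	z->attrs[2] = NULL;
	z->group.attrs = z->attrs;
}

/* Models the early cleanup: the attribute names are gone afterwards. */
void clear_trip_attrs(struct zone *z)
{
	for (int i = 0; i < NTRIP_ATTRS; i++)
		z->trip_attrs[i].name = NULL;
}

/*
 * Models the per-attribute walk done when a group is removed: each
 * attribute is looked up by name. Returns the number of attributes
 * removed, or -1 on a NULL name (the point where strlen() would crash).
 */
int remove_group(const struct attr_group *grp)
{
	int removed = 0;

	for (struct attr **a = grp->attrs; *a; a++) {
		if (!(*a)->name)
			return -1;
		(void)strlen((*a)->name);	/* name hashing needs a valid string */
		removed++;
	}
	return removed;
}

/* Problematic order: attributes are cleared before the group is removed. */
int unregister_buggy(struct zone *z)
{
	clear_trip_attrs(z);
	return remove_group(&z->group);
}

/* Safe order: remove the group first, then release the attributes. */
int unregister_fixed(struct zone *z)
{
	int ret = remove_group(&z->group);

	clear_trip_attrs(z);
	return ret;
}
```

After zone_init(), unregister_buggy() returns -1 because the walk meets a NULL name, while unregister_fixed() returns NTRIP_ATTRS. The only difference between the two is the order of teardown, which is the point made in the reply above.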