* [PATCH] arm64: defconfig: Enable cros-ec and battery driver @ 2016-05-03 15:45 Rhyland Klein 2016-05-19 17:20 ` Rhyland Klein 2016-05-24 14:09 ` Jon Hunter 0 siblings, 2 replies; 24+ messages in thread From: Rhyland Klein @ 2016-05-03 15:45 UTC (permalink / raw) To: Thierry Reding, Stephen Warren, Alexandre Courbot Cc: linux-kernel, linux-tegra, Rhyland Klein Enable the ChromeOS Embedded Controller, its I2C tunnel driver, and the BA27XXX battery driver. These are all used on the Tegra210 Smaug platform. Signed-off-by: Rhyland Klein <rklein@nvidia.com> --- arch/arm64/configs/defconfig | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/arch/arm64/configs/defconfig b/arch/arm64/configs/defconfig index 53f7ded14efb..043fab2d2c53 100644 --- a/arch/arm64/configs/defconfig +++ b/arch/arm64/configs/defconfig @@ -174,6 +174,7 @@ CONFIG_I2C_QUP=y CONFIG_I2C_TEGRA=y CONFIG_I2C_UNIPHIER_F=y CONFIG_I2C_RCAR=y +CONFIG_I2C_CROS_EC_TUNNEL=y CONFIG_SPI=y CONFIG_SPI_PL022=y CONFIG_SPI_QUP=y @@ -188,6 +189,7 @@ CONFIG_GPIO_PL061=y CONFIG_GPIO_RCAR=y CONFIG_GPIO_XGENE=y CONFIG_POWER_RESET_MSM=y +CONFIG_BATTERY_BQ27XXX=y CONFIG_POWER_RESET_XGENE=y CONFIG_POWER_RESET_SYSCON=y CONFIG_SENSORS_LM90=m @@ -199,6 +201,8 @@ CONFIG_MFD_SPMI_PMIC=y CONFIG_MFD_SEC_CORE=y CONFIG_MFD_HI655X_PMIC=y CONFIG_REGULATOR=y +CONFIG_MFD_CROS_EC=y +CONFIG_MFD_CROS_EC_I2C=y CONFIG_REGULATOR_FIXED_VOLTAGE=y CONFIG_REGULATOR_HI655X=y CONFIG_REGULATOR_QCOM_SMD_RPM=y -- 1.9.1 ^ permalink raw reply related [flat|nested] 24+ messages in thread
* Re: [PATCH] arm64: defconfig: Enable cros-ec and battery driver 2016-05-03 15:45 [PATCH] arm64: defconfig: Enable cros-ec and battery driver Rhyland Klein @ 2016-05-19 17:20 ` Rhyland Klein 2016-05-24 14:09 ` Jon Hunter 1 sibling, 0 replies; 24+ messages in thread From: Rhyland Klein @ 2016-05-19 17:20 UTC (permalink / raw) To: Thierry Reding, Stephen Warren, Alexandre Courbot Cc: linux-kernel, linux-tegra On 5/3/2016 11:45 AM, Rhyland Klein wrote: > Enable the ChromeOS Embedded Controller, its I2C tunnel driver, and > the BA27XXX battery driver. These are all used on the Tegra210 Smaug > platform. > > Signed-off-by: Rhyland Klein <rklein@nvidia.com> Has anyone had a chance to review this yet? -rhyland > --- > arch/arm64/configs/defconfig | 4 ++++ > 1 file changed, 4 insertions(+) > > diff --git a/arch/arm64/configs/defconfig b/arch/arm64/configs/defconfig > index 53f7ded14efb..043fab2d2c53 100644 > --- a/arch/arm64/configs/defconfig > +++ b/arch/arm64/configs/defconfig > @@ -174,6 +174,7 @@ CONFIG_I2C_QUP=y > CONFIG_I2C_TEGRA=y > CONFIG_I2C_UNIPHIER_F=y > CONFIG_I2C_RCAR=y > +CONFIG_I2C_CROS_EC_TUNNEL=y > CONFIG_SPI=y > CONFIG_SPI_PL022=y > CONFIG_SPI_QUP=y > @@ -188,6 +189,7 @@ CONFIG_GPIO_PL061=y > CONFIG_GPIO_RCAR=y > CONFIG_GPIO_XGENE=y > CONFIG_POWER_RESET_MSM=y > +CONFIG_BATTERY_BQ27XXX=y > CONFIG_POWER_RESET_XGENE=y > CONFIG_POWER_RESET_SYSCON=y > CONFIG_SENSORS_LM90=m > @@ -199,6 +201,8 @@ CONFIG_MFD_SPMI_PMIC=y > CONFIG_MFD_SEC_CORE=y > CONFIG_MFD_HI655X_PMIC=y > CONFIG_REGULATOR=y > +CONFIG_MFD_CROS_EC=y > +CONFIG_MFD_CROS_EC_I2C=y > CONFIG_REGULATOR_FIXED_VOLTAGE=y > CONFIG_REGULATOR_HI655X=y > CONFIG_REGULATOR_QCOM_SMD_RPM=y > -- nvpublic ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH] arm64: defconfig: Enable cros-ec and battery driver 2016-05-03 15:45 [PATCH] arm64: defconfig: Enable cros-ec and battery driver Rhyland Klein 2016-05-19 17:20 ` Rhyland Klein @ 2016-05-24 14:09 ` Jon Hunter 2016-05-24 19:08 ` Rhyland Klein 1 sibling, 1 reply; 24+ messages in thread From: Jon Hunter @ 2016-05-24 14:09 UTC (permalink / raw) To: Rhyland Klein, Thierry Reding, Stephen Warren, Alexandre Courbot Cc: linux-kernel, linux-tegra Hi Rhyland, On 03/05/16 16:45, Rhyland Klein wrote: > Enable the ChromeOS Embedded Controller, its I2C tunnel driver, and > the BA27XXX battery driver. These are all used on the Tegra210 Smaug > platform. > > Signed-off-by: Rhyland Klein <rklein@nvidia.com> I tried booting with this patch with next-20160523 on the Tegra210 Smaug and I am seeing the following panic ... [ 1.258116] i2c /dev entries driver [ 1.278519] Unable to handle kernel NULL pointer dereference at virtual address 00000378 [ 1.286605] pgd = ffff000008cb6000 [ 1.290000] [00000378] *pgd=000000013fffe003, *pud=000000013fffd003, *pmd=0000000000000000 [ 1.298277] Internal error: Oops: 96000004 [#1] PREEMPT SMP [ 1.303839] Modules linked in: [ 1.306898] CPU: 3 PID: 1 Comm: swapper/0 Not tainted 4.6.0-next-20160523+ #1047 [ 1.314284] Hardware name: Google Pixel C (DT) [ 1.318719] task: ffff8000bc0b0000 ti: ffff8000bc0b8000 task.ti: ffff8000bc0b8000 [ 1.326199] PC is at _raw_spin_lock_irqsave+0x1c/0x50 [ 1.331245] LR is at power_supply_changed+0x1c/0x60 [ 1.336114] pc : [<ffff0000087a19a0>] lr : [<ffff000008613114>] pstate: 800000c5 [ 1.343498] sp : ffff8000bc0bb8d0 [ 1.346805] x29: ffff8000bc0bb8d0 x28: ffff000008c39a40 [ 1.352120] x27: 0000000000000000 x26: 0000000000000000 [ 1.357434] x25: ffff00000888a818 x24: ffff8000709d1800 [ 1.362747] x23: ffff800071006270 x22: 0000000000000000 [ 1.368061] x21: 0000000000000019 x20: 0000000000000378 [ 1.373373] x19: 0000000000000000 x18: ffff00000885d138 [ 1.378685] x17: 000000000000000e x16: 0000000000000007 [ 1.383998] x15: 0000000000000001 x14: ffffffffffffffff [ 1.389310] x13: ffffffffffffffff x12: 0000000000000018 [ 1.394625] x11: 0101010101010101 x10: 0000000000000850 [ 1.399939] x9 : ffff8000bc0b8000 x8 : ffff8000bc0b08b0 [ 1.405253] x7 : 0000000002480a08 x6 : ffff8000bff9c740 [ 1.410567] x5 : ffff8000709d1058 x4 : 0000000000000000 [ 1.415880] x3 : 0000000000000001 x2 : 0000000000000040 [ 1.421192] x1 : ffff8000bc0b8000 x0 : 0000000000000378 [ 1.426506] [ 1.427992] Process swapper/0 (pid: 1, stack limit = 0xffff8000bc0b8020) [ 1.434683] Stack: (0xffff8000bc0bb8d0 to 0xffff8000bc0bc000) [ 1.440419] b8c0: ffff8000bc0bb900 ffff000008614984 [ 1.448239] b8e0: ffff800071006218 0000000000000096 ffff000200010055 ffff8000bc0bb8d8 [ 1.456059] b900: ffff8000bc0bb960 ffff000008615084 ffff800071006270 ffff000008c3a338 [ 1.463879] b920: ffff8000710062d8 ffff8000bc0bb9f8 00097c2000000bb9 0000000000000000 [ 1.471699] b940: 0000002300863b48 0000000000000051 0000009600000019 ffff000000000001 [ 1.479519] b960: ffff8000bc0bb990 ffff000008615180 ffff800071006218 000000000000002e [ 1.487339] b980: ffff8000710062d8 ffff000008615178 ffff8000bc0bb9d0 ffff0000086133d8 [ 1.495158] b9a0: ffff8000bc0bba7c ffff8000709d1800 ffff8000709d1800 ffff8000709d1b80 [ 1.502977] b9c0: ffff8000bc0bba7c ffff0000082282f4 ffff8000bc0bba00 ffff000008616508 [ 1.510797] b9e0: ffff8000709d1800 ffff0000086164f4 ffff8000709d1800 ffff00000847ecc4 [ 1.518617] ba00: ffff8000bc0bba50 ffff0000086183b0 0000000000000000 ffff8000709d1800 [ 1.526436] ba20: ffff8000709d1800 ffff000008c3ab50 ffff000008c3a000 ffff8000709d1818 [ 1.534255] ba40: ffff8000709d1b48 000000007fffffff ffff8000bc0bba80 ffff0000086193b4 [ 1.542075] ba60: 0000000000000000 ffff8000709d1818 ffff8000bc0bba80 ffff0000086191e0 [ 1.549894] ba80: ffff8000bc0bbb20 ffff000008613b6c ffff8000bb9ca400 ffff8000bb9ca438 [ 1.557714] baa0: ffff8000bc0bbbc0 ffff8000bb9ca020 ffff80007138e898 ffff8000bb9ca020 [ 1.565532] bac0: ffff000008ca4000 0000000000000000 ffff000008c76000 0000000000000000 [ 1.573352] bae0: ffff8000bb9ca400 ffff000008c3ab50 ffff80007138e898 ffff8000bb9ca020 [ 1.581171] bb00: ffff8000709d1804 0000000000000000 ffff8000bb9ca400 0000000000000000 [ 1.588991] bb20: ffff8000bc0bbb90 ffff000008613c54 ffff800071006218 ffff000008888e20 [ 1.596810] bb40: ffff8000bb9ca000 ffff000008ca5000 ffff000008c3a000 ffff8000bb9ca020 [ 1.604630] bb60: ffff000008889f18 ffff80007138e818 ffff000008c76000 0000000000000000 [ 1.612451] bb80: ffff8000bc0bbba0 ffff8000bb9ca718 ffff8000bc0bbba0 ffff000008614f1c [ 1.620270] bba0: ffff8000bc0bbbe0 ffff000008615668 0000000000000000 ffff800071006218 [ 1.628091] bbc0: 0000000000000000 ffff800071006218 0000000000000000 0000000000000000 [ 1.635910] bbe0: ffff8000bc0bbc30 ffff0000085ffff4 ffff000008889f18 ffff8000bb9ca020 [ 1.643730] bc00: ffff8000bb9ca004 ffff8000bb9ca000 ffff000008615598 ffff000008acb0d0 [ 1.651549] bc20: ffff000008b2fba0 0000000000000101 ffff8000bc0bbc70 ffff0000084836dc [ 1.659369] bc40: ffff8000bb9ca020 0000000000000000 ffff000008ca2000 ffff000008c3a698 [ 1.667189] bc60: 0000000000000000 0000000000000000 ffff8000bc0bbcb0 ffff000008483820 [ 1.675009] bc80: ffff8000bb9ca020 ffff000008c3a698 ffff8000bb9ca080 0000000000000000 [ 1.682828] bca0: ffff000008c21000 0000000000000000 ffff8000bc0bbce0 ffff000008481804 [ 1.690647] bcc0: 0000000000000000 ffff000008c3a698 ffff00000848377c ffff8000bc0bbd30 [ 1.698466] bce0: ffff8000bc0bbd20 ffff000008483008 ffff000008c3a698 ffff80007134fe00 [ 1.706286] bd00: ffff000008c389f8 ffff000008799b08 ffff8000bc2f2ca8 ffff80007134fb68 [ 1.714105] bd20: ffff8000bc0bbd30 ffff000008482c28 ffff8000bc0bbd70 ffff0000084840b8 [ 1.721924] bd40: ffff000008c3a698 ffff8000bc0b8000 0000000000000000 ffff000008c76000 [ 1.729743] bd60: ffff000008ae048c ffff00000833b0fc ffff8000bc0bbda0 ffff000008601f14 [ 1.737564] bd80: ffff000008b0ffa4 ffff8000bc0b8000 0000000000000000 ffff000008485218 [ 1.745384] bda0: ffff8000bc0bbdd0 ffff000008b0ffbc ffff000008b0ffa4 ffff000008081a54 [ 1.753203] bdc0: ffff8000bc0bbdd0 ffff000008c3a660 ffff8000bc0bbde0 ffff000008081a54 [ 1.761021] bde0: ffff8000bc0bbe50 ffff000008ae0ce0 ffff000008b96740 ffff000008b2fb00 [ 1.768842] be00: 0000000000000006 ffff000008c76000 ffff8000bc0bbe00 ffff0000089d2b78 [ 1.776662] be20: ffff000008be8b10 0000000600000006 0000000000000000 0000000000000000 [ 1.784481] be40: ffff000008ae048c ffff000008acb0d0 ffff8000bc0bbeb0 ffff00000879b9c4 [ 1.792300] be60: ffff00000879b9b4 0000000000000000 0000000000000000 0000000000000000 [ 1.800120] be80: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 [ 1.807940] bea0: 0000000000000000 0000000000000000 0000000000000000 ffff000008084e10 [ 1.815759] bec0: ffff00000879b9b4 0000000000000000 0000000000000000 0000000000000000 [ 1.823577] bee0: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 [ 1.831396] bf00: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 [ 1.839215] bf20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 [ 1.847034] bf40: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 [ 1.854853] bf60: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 [ 1.862671] bf80: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 [ 1.870491] bfa0: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 [ 1.878310] bfc0: 0000000000000000 0000000000000000 0000000000000000 0000000000000005 [ 1.886129] bfe0: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 [ 1.893948] Call trace: [ 1.896389] Exception stack(0xffff8000bc0bb710 to 0xffff8000bc0bb830) [ 1.902818] b700: 0000000000000000 0000000000000378 [ 1.910637] b720: ffff8000bc0bb8d0 ffff0000087a19a0 ffff000e0001001e ffff8000709a8018 [ 1.918456] b740: ffff8000bc0bb780 ffff00000866a21c ffff8000bc0bb7c0 ffff00000860ffd8 [ 1.926276] b760: 0000000000000020 ffff8000bc0bb8f0 0000000000000020 ffff80007138e398 [ 1.934095] b780: ffff80007138e380 0000000000000002 0000000000000002 0000000000000004 [ 1.941915] b7a0: ffff80007138e394 ffff8000bc0bb8e0 0000000000000378 ffff8000bc0b8000 [ 1.949735] b7c0: 0000000000000040 0000000000000001 0000000000000000 ffff8000709d1058 [ 1.957554] b7e0: ffff8000bff9c740 0000000002480a08 ffff8000bc0b08b0 ffff8000bc0b8000 [ 1.965374] b800: 0000000000000850 0101010101010101 0000000000000018 ffffffffffffffff [ 1.973193] b820: ffffffffffffffff 0000000000000001 [ 1.978064] [<ffff0000087a19a0>] _raw_spin_lock_irqsave+0x1c/0x50 [ 1.984150] [<ffff000008614984>] bq27xxx_battery_update+0x88/0x51c [ 1.990321] [<ffff000008615084>] bq27xxx_battery_poll+0x24/0x70 [ 1.996231] [<ffff000008615180>] bq27xxx_battery_get_property+0xb0/0x3b4 [ 2.002923] [<ffff0000086133d8>] power_supply_read_temp+0x2c/0x54 [ 2.009005] [<ffff000008616508>] thermal_zone_get_temp+0x5c/0x11c [ 2.015089] [<ffff0000086183b0>] thermal_zone_device_update+0x34/0xb4 [ 2.021518] [<ffff0000086193b4>] thermal_zone_device_register+0x87c/0x8cc [ 2.028295] [<ffff000008613b6c>] __power_supply_register+0x370/0x430 [ 2.034638] [<ffff000008613c54>] power_supply_register_no_ws+0x10/0x18 [ 2.041155] [<ffff000008614f1c>] bq27xxx_battery_setup+0x104/0x15c [ 2.047325] [<ffff000008615668>] bq27xxx_battery_i2c_probe+0xd0/0x1b0 [ 2.053757] [<ffff0000085ffff4>] i2c_device_probe+0x174/0x240 [ 2.059498] [<ffff0000084836dc>] driver_probe_device+0x1fc/0x29c [ 2.065493] [<ffff000008483820>] __driver_attach+0xa4/0xa8 [ 2.070969] [<ffff000008481804>] bus_for_each_dev+0x58/0x98 [ 2.076531] [<ffff000008483008>] driver_attach+0x20/0x28 [ 2.081832] [<ffff000008482c28>] bus_add_driver+0x1c8/0x22c [ 2.087395] [<ffff0000084840b8>] driver_register+0x68/0x108 [ 2.092958] [<ffff000008601f14>] i2c_register_driver+0x38/0x7c [ 2.098784] [<ffff000008b0ffbc>] bq27xxx_battery_i2c_driver_init+0x18/0x20 [ 2.105650] [<ffff000008081a54>] do_one_initcall+0x38/0x12c [ 2.111215] [<ffff000008ae0ce0>] kernel_init_freeable+0x148/0x1ec [ 2.117299] [<ffff00000879b9c4>] kernel_init+0x10/0x100 [ 2.122516] [<ffff000008084e10>] ret_from_fork+0x10/0x40 [ 2.127819] Code: b9401823 11000463 b9001823 f9800011 (885ffc01) [ 2.133927] ---[ end trace 9759cbbac2b2ba9b ]--- [ 2.138547] note: swapper/0[1] exited with preempt_count 1 [ 2.144061] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b [ 2.144061] [ 2.153187] SMP: stopping secondary CPUs [ 2.157104] Kernel Offset: disabled [ 2.160584] Memory Limit: none [ 2.163633] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b [ 2.163633] Have you tried this recently? I have not had chance to dig into this. Cheers Jon -- nvpublic ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH] arm64: defconfig: Enable cros-ec and battery driver 2016-05-24 14:09 ` Jon Hunter @ 2016-05-24 19:08 ` Rhyland Klein 2016-05-25 10:58 ` Jon Hunter 0 siblings, 1 reply; 24+ messages in thread From: Rhyland Klein @ 2016-05-24 19:08 UTC (permalink / raw) To: Jon Hunter, Thierry Reding, Stephen Warren, Alexandre Courbot Cc: linux-kernel, linux-tegra On 5/24/2016 10:09 AM, Jon Hunter wrote: > Hi Rhyland, > > On 03/05/16 16:45, Rhyland Klein wrote: >> Enable the ChromeOS Embedded Controller, its I2C tunnel driver, and >> the BA27XXX battery driver. These are all used on the Tegra210 Smaug >> platform. >> >> Signed-off-by: Rhyland Klein <rklein@nvidia.com> > > I tried booting with this patch with next-20160523 on the Tegra210 Smaug and I am seeing the following panic ... > > [ 1.258116] i2c /dev entries driver > [ 1.278519] Unable to handle kernel NULL pointer dereference at virtual address 00000378 > [ 1.286605] pgd = ffff000008cb6000 > [ 1.290000] [00000378] *pgd=000000013fffe003, *pud=000000013fffd003, *pmd=0000000000000000 > [ 1.298277] Internal error: Oops: 96000004 [#1] PREEMPT SMP > [ 1.303839] Modules linked in: > [ 1.306898] CPU: 3 PID: 1 Comm: swapper/0 Not tainted 4.6.0-next-20160523+ #1047 > [ 1.314284] Hardware name: Google Pixel C (DT) > [ 1.318719] task: ffff8000bc0b0000 ti: ffff8000bc0b8000 task.ti: ffff8000bc0b8000 > [ 1.326199] PC is at _raw_spin_lock_irqsave+0x1c/0x50 > [ 1.331245] LR is at power_supply_changed+0x1c/0x60 > [ 1.336114] pc : [<ffff0000087a19a0>] lr : [<ffff000008613114>] pstate: 800000c5 > [ 1.343498] sp : ffff8000bc0bb8d0 > [ 1.346805] x29: ffff8000bc0bb8d0 x28: ffff000008c39a40 > [ 1.352120] x27: 0000000000000000 x26: 0000000000000000 > [ 1.357434] x25: ffff00000888a818 x24: ffff8000709d1800 > [ 1.362747] x23: ffff800071006270 x22: 0000000000000000 > [ 1.368061] x21: 0000000000000019 x20: 0000000000000378 > [ 1.373373] x19: 0000000000000000 x18: ffff00000885d138 > [ 1.378685] x17: 000000000000000e x16: 0000000000000007 > [ 1.383998] x15: 0000000000000001 x14: ffffffffffffffff > [ 1.389310] x13: ffffffffffffffff x12: 0000000000000018 > [ 1.394625] x11: 0101010101010101 x10: 0000000000000850 > [ 1.399939] x9 : ffff8000bc0b8000 x8 : ffff8000bc0b08b0 > [ 1.405253] x7 : 0000000002480a08 x6 : ffff8000bff9c740 > [ 1.410567] x5 : ffff8000709d1058 x4 : 0000000000000000 > [ 1.415880] x3 : 0000000000000001 x2 : 0000000000000040 > [ 1.421192] x1 : ffff8000bc0b8000 x0 : 0000000000000378 > [ 1.426506] > [ 1.427992] Process swapper/0 (pid: 1, stack limit = 0xffff8000bc0b8020) > [ 1.434683] Stack: (0xffff8000bc0bb8d0 to 0xffff8000bc0bc000) > [ 1.440419] b8c0: ffff8000bc0bb900 ffff000008614984 > [ 1.448239] b8e0: ffff800071006218 0000000000000096 ffff000200010055 ffff8000bc0bb8d8 > [ 1.456059] b900: ffff8000bc0bb960 ffff000008615084 ffff800071006270 ffff000008c3a338 > [ 1.463879] b920: ffff8000710062d8 ffff8000bc0bb9f8 00097c2000000bb9 0000000000000000 > [ 1.471699] b940: 0000002300863b48 0000000000000051 0000009600000019 ffff000000000001 > [ 1.479519] b960: ffff8000bc0bb990 ffff000008615180 ffff800071006218 000000000000002e > [ 1.487339] b980: ffff8000710062d8 ffff000008615178 ffff8000bc0bb9d0 ffff0000086133d8 > [ 1.495158] b9a0: ffff8000bc0bba7c ffff8000709d1800 ffff8000709d1800 ffff8000709d1b80 > [ 1.502977] b9c0: ffff8000bc0bba7c ffff0000082282f4 ffff8000bc0bba00 ffff000008616508 > [ 1.510797] b9e0: ffff8000709d1800 ffff0000086164f4 ffff8000709d1800 ffff00000847ecc4 > [ 1.518617] ba00: ffff8000bc0bba50 ffff0000086183b0 0000000000000000 ffff8000709d1800 > [ 1.526436] ba20: ffff8000709d1800 ffff000008c3ab50 ffff000008c3a000 ffff8000709d1818 > [ 1.534255] ba40: ffff8000709d1b48 000000007fffffff ffff8000bc0bba80 ffff0000086193b4 > [ 1.542075] ba60: 0000000000000000 ffff8000709d1818 ffff8000bc0bba80 ffff0000086191e0 > [ 1.549894] ba80: ffff8000bc0bbb20 ffff000008613b6c ffff8000bb9ca400 ffff8000bb9ca438 > [ 1.557714] baa0: ffff8000bc0bbbc0 ffff8000bb9ca020 ffff80007138e898 ffff8000bb9ca020 > [ 1.565532] bac0: ffff000008ca4000 0000000000000000 ffff000008c76000 0000000000000000 > [ 1.573352] bae0: ffff8000bb9ca400 ffff000008c3ab50 ffff80007138e898 ffff8000bb9ca020 > [ 1.581171] bb00: ffff8000709d1804 0000000000000000 ffff8000bb9ca400 0000000000000000 > [ 1.588991] bb20: ffff8000bc0bbb90 ffff000008613c54 ffff800071006218 ffff000008888e20 > [ 1.596810] bb40: ffff8000bb9ca000 ffff000008ca5000 ffff000008c3a000 ffff8000bb9ca020 > [ 1.604630] bb60: ffff000008889f18 ffff80007138e818 ffff000008c76000 0000000000000000 > [ 1.612451] bb80: ffff8000bc0bbba0 ffff8000bb9ca718 ffff8000bc0bbba0 ffff000008614f1c > [ 1.620270] bba0: ffff8000bc0bbbe0 ffff000008615668 0000000000000000 ffff800071006218 > [ 1.628091] bbc0: 0000000000000000 ffff800071006218 0000000000000000 0000000000000000 > [ 1.635910] bbe0: ffff8000bc0bbc30 ffff0000085ffff4 ffff000008889f18 ffff8000bb9ca020 > [ 1.643730] bc00: ffff8000bb9ca004 ffff8000bb9ca000 ffff000008615598 ffff000008acb0d0 > [ 1.651549] bc20: ffff000008b2fba0 0000000000000101 ffff8000bc0bbc70 ffff0000084836dc > [ 1.659369] bc40: ffff8000bb9ca020 0000000000000000 ffff000008ca2000 ffff000008c3a698 > [ 1.667189] bc60: 0000000000000000 0000000000000000 ffff8000bc0bbcb0 ffff000008483820 > [ 1.675009] bc80: ffff8000bb9ca020 ffff000008c3a698 ffff8000bb9ca080 0000000000000000 > [ 1.682828] bca0: ffff000008c21000 0000000000000000 ffff8000bc0bbce0 ffff000008481804 > [ 1.690647] bcc0: 0000000000000000 ffff000008c3a698 ffff00000848377c ffff8000bc0bbd30 > [ 1.698466] bce0: ffff8000bc0bbd20 ffff000008483008 ffff000008c3a698 ffff80007134fe00 > [ 1.706286] bd00: ffff000008c389f8 ffff000008799b08 ffff8000bc2f2ca8 ffff80007134fb68 > [ 1.714105] bd20: ffff8000bc0bbd30 ffff000008482c28 ffff8000bc0bbd70 ffff0000084840b8 > [ 1.721924] bd40: ffff000008c3a698 ffff8000bc0b8000 0000000000000000 ffff000008c76000 > [ 1.729743] bd60: ffff000008ae048c ffff00000833b0fc ffff8000bc0bbda0 ffff000008601f14 > [ 1.737564] bd80: ffff000008b0ffa4 ffff8000bc0b8000 0000000000000000 ffff000008485218 > [ 1.745384] bda0: ffff8000bc0bbdd0 ffff000008b0ffbc ffff000008b0ffa4 ffff000008081a54 > [ 1.753203] bdc0: ffff8000bc0bbdd0 ffff000008c3a660 ffff8000bc0bbde0 ffff000008081a54 > [ 1.761021] bde0: ffff8000bc0bbe50 ffff000008ae0ce0 ffff000008b96740 ffff000008b2fb00 > [ 1.768842] be00: 0000000000000006 ffff000008c76000 ffff8000bc0bbe00 ffff0000089d2b78 > [ 1.776662] be20: ffff000008be8b10 0000000600000006 0000000000000000 0000000000000000 > [ 1.784481] be40: ffff000008ae048c ffff000008acb0d0 ffff8000bc0bbeb0 ffff00000879b9c4 > [ 1.792300] be60: ffff00000879b9b4 0000000000000000 0000000000000000 0000000000000000 > [ 1.800120] be80: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 > [ 1.807940] bea0: 0000000000000000 0000000000000000 0000000000000000 ffff000008084e10 > [ 1.815759] bec0: ffff00000879b9b4 0000000000000000 0000000000000000 0000000000000000 > [ 1.823577] bee0: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 > [ 1.831396] bf00: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 > [ 1.839215] bf20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 > [ 1.847034] bf40: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 > [ 1.854853] bf60: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 > [ 1.862671] bf80: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 > [ 1.870491] bfa0: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 > [ 1.878310] bfc0: 0000000000000000 0000000000000000 0000000000000000 0000000000000005 > [ 1.886129] bfe0: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 > [ 1.893948] Call trace: > [ 1.896389] Exception stack(0xffff8000bc0bb710 to 0xffff8000bc0bb830) > [ 1.902818] b700: 0000000000000000 0000000000000378 > [ 1.910637] b720: ffff8000bc0bb8d0 ffff0000087a19a0 ffff000e0001001e ffff8000709a8018 > [ 1.918456] b740: ffff8000bc0bb780 ffff00000866a21c ffff8000bc0bb7c0 ffff00000860ffd8 > [ 1.926276] b760: 0000000000000020 ffff8000bc0bb8f0 0000000000000020 ffff80007138e398 > [ 1.934095] b780: ffff80007138e380 0000000000000002 0000000000000002 0000000000000004 > [ 1.941915] b7a0: ffff80007138e394 ffff8000bc0bb8e0 0000000000000378 ffff8000bc0b8000 > [ 1.949735] b7c0: 0000000000000040 0000000000000001 0000000000000000 ffff8000709d1058 > [ 1.957554] b7e0: ffff8000bff9c740 0000000002480a08 ffff8000bc0b08b0 ffff8000bc0b8000 > [ 1.965374] b800: 0000000000000850 0101010101010101 0000000000000018 ffffffffffffffff > [ 1.973193] b820: ffffffffffffffff 0000000000000001 > [ 1.978064] [<ffff0000087a19a0>] _raw_spin_lock_irqsave+0x1c/0x50 > [ 1.984150] [<ffff000008614984>] bq27xxx_battery_update+0x88/0x51c > [ 1.990321] [<ffff000008615084>] bq27xxx_battery_poll+0x24/0x70 > [ 1.996231] [<ffff000008615180>] bq27xxx_battery_get_property+0xb0/0x3b4 > [ 2.002923] [<ffff0000086133d8>] power_supply_read_temp+0x2c/0x54 > [ 2.009005] [<ffff000008616508>] thermal_zone_get_temp+0x5c/0x11c > [ 2.015089] [<ffff0000086183b0>] thermal_zone_device_update+0x34/0xb4 > [ 2.021518] [<ffff0000086193b4>] thermal_zone_device_register+0x87c/0x8cc > [ 2.028295] [<ffff000008613b6c>] __power_supply_register+0x370/0x430 > [ 2.034638] [<ffff000008613c54>] power_supply_register_no_ws+0x10/0x18 > [ 2.041155] [<ffff000008614f1c>] bq27xxx_battery_setup+0x104/0x15c > [ 2.047325] [<ffff000008615668>] bq27xxx_battery_i2c_probe+0xd0/0x1b0 > [ 2.053757] [<ffff0000085ffff4>] i2c_device_probe+0x174/0x240 > [ 2.059498] [<ffff0000084836dc>] driver_probe_device+0x1fc/0x29c > [ 2.065493] [<ffff000008483820>] __driver_attach+0xa4/0xa8 > [ 2.070969] [<ffff000008481804>] bus_for_each_dev+0x58/0x98 > [ 2.076531] [<ffff000008483008>] driver_attach+0x20/0x28 > [ 2.081832] [<ffff000008482c28>] bus_add_driver+0x1c8/0x22c > [ 2.087395] [<ffff0000084840b8>] driver_register+0x68/0x108 > [ 2.092958] [<ffff000008601f14>] i2c_register_driver+0x38/0x7c > [ 2.098784] [<ffff000008b0ffbc>] bq27xxx_battery_i2c_driver_init+0x18/0x20 > [ 2.105650] [<ffff000008081a54>] do_one_initcall+0x38/0x12c > [ 2.111215] [<ffff000008ae0ce0>] kernel_init_freeable+0x148/0x1ec > [ 2.117299] [<ffff00000879b9c4>] kernel_init+0x10/0x100 > [ 2.122516] [<ffff000008084e10>] ret_from_fork+0x10/0x40 > [ 2.127819] Code: b9401823 11000463 b9001823 f9800011 (885ffc01) > [ 2.133927] ---[ end trace 9759cbbac2b2ba9b ]--- > [ 2.138547] note: swapper/0[1] exited with preempt_count 1 > [ 2.144061] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b > [ 2.144061] > [ 2.153187] SMP: stopping secondary CPUs > [ 2.157104] Kernel Offset: disabled > [ 2.160584] Memory Limit: none > [ 2.163633] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b > [ 2.163633] > > Have you tried this recently? I have not had chance to dig into this. > I haven't, but looking now, its another case where a callback triggered during power_supply registration was using an uninitialized pointer for the power_supply. This patch works for me locally, do you want to test it out? diff --git a/drivers/power/power_supply_core.c b/drivers/power/power_supply_core.c index 456987c88baa..1e02eae6e4b4 100644 --- a/drivers/power/power_supply_core.c +++ b/drivers/power/power_supply_core.c @@ -561,11 +561,15 @@ static int power_supply_read_temp(struct thermal_zone_device *tzd, { struct power_supply *psy; union power_supply_propval val; - int ret; + int ret = 0; WARN_ON(tzd == NULL); + psy = tzd->devdata; - ret = psy->desc->get_property(psy, POWER_SUPPLY_PROP_TEMP, &val); + WARN_ON(atomic_read(&psy->use_cnt) == 0); + + if (atomic_read(&psy->use_cnt) > 0) + ret = psy->desc->get_property(psy, POWER_SUPPLY_PROP_TEMP, &val); /* Convert tenths of degree Celsius to milli degree Celsius. */ if (!ret) -rhyland -- nvpublic ^ permalink raw reply related [flat|nested] 24+ messages in thread
* Re: [PATCH] arm64: defconfig: Enable cros-ec and battery driver 2016-05-24 19:08 ` Rhyland Klein @ 2016-05-25 10:58 ` Jon Hunter 2016-05-25 11:03 ` Jon Hunter 0 siblings, 1 reply; 24+ messages in thread From: Jon Hunter @ 2016-05-25 10:58 UTC (permalink / raw) To: Rhyland Klein, Thierry Reding, Stephen Warren, Alexandre Courbot Cc: linux-kernel, linux-tegra On 24/05/16 20:08, Rhyland Klein wrote: >> On 03/05/16 16:45, Rhyland Klein wrote: >>> Enable the ChromeOS Embedded Controller, its I2C tunnel driver, and >>> the BA27XXX battery driver. These are all used on the Tegra210 Smaug >>> platform. >>> >>> Signed-off-by: Rhyland Klein <rklein@nvidia.com> >> >> I tried booting with this patch with next-20160523 on the Tegra210 Smaug and I am seeing the following panic ... >> >> [ 1.258116] i2c /dev entries driver >> [ 1.278519] Unable to handle kernel NULL pointer dereference at virtual address 00000378 >> [ 1.286605] pgd = ffff000008cb6000 >> [ 1.290000] [00000378] *pgd=000000013fffe003, *pud=000000013fffd003, *pmd=0000000000000000 >> [ 1.298277] Internal error: Oops: 96000004 [#1] PREEMPT SMP >> [ 1.303839] Modules linked in: >> [ 1.306898] CPU: 3 PID: 1 Comm: swapper/0 Not tainted 4.6.0-next-20160523+ #1047 >> [ 1.314284] Hardware name: Google Pixel C (DT) >> [ 1.318719] task: ffff8000bc0b0000 ti: ffff8000bc0b8000 task.ti: ffff8000bc0b8000 >> [ 1.326199] PC is at _raw_spin_lock_irqsave+0x1c/0x50 >> [ 1.331245] LR is at power_supply_changed+0x1c/0x60 >> [ 1.336114] pc : [<ffff0000087a19a0>] lr : [<ffff000008613114>] pstate: 800000c5 >> [ 1.343498] sp : ffff8000bc0bb8d0 >> [ 1.346805] x29: ffff8000bc0bb8d0 x28: ffff000008c39a40 >> [ 1.352120] x27: 0000000000000000 x26: 0000000000000000 >> [ 1.357434] x25: ffff00000888a818 x24: ffff8000709d1800 >> [ 1.362747] x23: ffff800071006270 x22: 0000000000000000 >> [ 1.368061] x21: 0000000000000019 x20: 0000000000000378 >> [ 1.373373] x19: 0000000000000000 x18: ffff00000885d138 >> [ 1.378685] x17: 000000000000000e x16: 0000000000000007 >> [ 1.383998] x15: 0000000000000001 x14: ffffffffffffffff >> [ 1.389310] x13: ffffffffffffffff x12: 0000000000000018 >> [ 1.394625] x11: 0101010101010101 x10: 0000000000000850 >> [ 1.399939] x9 : ffff8000bc0b8000 x8 : ffff8000bc0b08b0 >> [ 1.405253] x7 : 0000000002480a08 x6 : ffff8000bff9c740 >> [ 1.410567] x5 : ffff8000709d1058 x4 : 0000000000000000 >> [ 1.415880] x3 : 0000000000000001 x2 : 0000000000000040 >> [ 1.421192] x1 : ffff8000bc0b8000 x0 : 0000000000000378 >> [ 1.426506] >> [ 1.427992] Process swapper/0 (pid: 1, stack limit = 0xffff8000bc0b8020) >> [ 1.434683] Stack: (0xffff8000bc0bb8d0 to 0xffff8000bc0bc000) >> [ 1.440419] b8c0: ffff8000bc0bb900 ffff000008614984 >> [ 1.448239] b8e0: ffff800071006218 0000000000000096 ffff000200010055 ffff8000bc0bb8d8 >> [ 1.456059] b900: ffff8000bc0bb960 ffff000008615084 ffff800071006270 ffff000008c3a338 >> [ 1.463879] b920: ffff8000710062d8 ffff8000bc0bb9f8 00097c2000000bb9 0000000000000000 >> [ 1.471699] b940: 0000002300863b48 0000000000000051 0000009600000019 ffff000000000001 >> [ 1.479519] b960: ffff8000bc0bb990 ffff000008615180 ffff800071006218 000000000000002e >> [ 1.487339] b980: ffff8000710062d8 ffff000008615178 ffff8000bc0bb9d0 ffff0000086133d8 >> [ 1.495158] b9a0: ffff8000bc0bba7c ffff8000709d1800 ffff8000709d1800 ffff8000709d1b80 >> [ 1.502977] b9c0: ffff8000bc0bba7c ffff0000082282f4 ffff8000bc0bba00 ffff000008616508 >> [ 1.510797] b9e0: ffff8000709d1800 ffff0000086164f4 ffff8000709d1800 ffff00000847ecc4 >> [ 1.518617] ba00: ffff8000bc0bba50 ffff0000086183b0 0000000000000000 ffff8000709d1800 >> [ 1.526436] ba20: ffff8000709d1800 ffff000008c3ab50 ffff000008c3a000 ffff8000709d1818 >> [ 1.534255] ba40: ffff8000709d1b48 000000007fffffff ffff8000bc0bba80 ffff0000086193b4 >> [ 1.542075] ba60: 0000000000000000 ffff8000709d1818 ffff8000bc0bba80 ffff0000086191e0 >> [ 1.549894] ba80: ffff8000bc0bbb20 ffff000008613b6c ffff8000bb9ca400 ffff8000bb9ca438 >> [ 1.557714] baa0: ffff8000bc0bbbc0 ffff8000bb9ca020 ffff80007138e898 ffff8000bb9ca020 >> [ 1.565532] bac0: ffff000008ca4000 0000000000000000 ffff000008c76000 0000000000000000 >> [ 1.573352] bae0: ffff8000bb9ca400 ffff000008c3ab50 ffff80007138e898 ffff8000bb9ca020 >> [ 1.581171] bb00: ffff8000709d1804 0000000000000000 ffff8000bb9ca400 0000000000000000 >> [ 1.588991] bb20: ffff8000bc0bbb90 ffff000008613c54 ffff800071006218 ffff000008888e20 >> [ 1.596810] bb40: ffff8000bb9ca000 ffff000008ca5000 ffff000008c3a000 ffff8000bb9ca020 >> [ 1.604630] bb60: ffff000008889f18 ffff80007138e818 ffff000008c76000 0000000000000000 >> [ 1.612451] bb80: ffff8000bc0bbba0 ffff8000bb9ca718 ffff8000bc0bbba0 ffff000008614f1c >> [ 1.620270] bba0: ffff8000bc0bbbe0 ffff000008615668 0000000000000000 ffff800071006218 >> [ 1.628091] bbc0: 0000000000000000 ffff800071006218 0000000000000000 0000000000000000 >> [ 1.635910] bbe0: ffff8000bc0bbc30 ffff0000085ffff4 ffff000008889f18 ffff8000bb9ca020 >> [ 1.643730] bc00: ffff8000bb9ca004 ffff8000bb9ca000 ffff000008615598 ffff000008acb0d0 >> [ 1.651549] bc20: ffff000008b2fba0 0000000000000101 ffff8000bc0bbc70 ffff0000084836dc >> [ 1.659369] bc40: ffff8000bb9ca020 0000000000000000 ffff000008ca2000 ffff000008c3a698 >> [ 1.667189] bc60: 0000000000000000 0000000000000000 ffff8000bc0bbcb0 ffff000008483820 >> [ 1.675009] bc80: ffff8000bb9ca020 ffff000008c3a698 ffff8000bb9ca080 0000000000000000 >> [ 1.682828] bca0: ffff000008c21000 0000000000000000 ffff8000bc0bbce0 ffff000008481804 >> [ 1.690647] bcc0: 0000000000000000 ffff000008c3a698 ffff00000848377c ffff8000bc0bbd30 >> [ 1.698466] bce0: ffff8000bc0bbd20 ffff000008483008 ffff000008c3a698 ffff80007134fe00 >> [ 1.706286] bd00: ffff000008c389f8 ffff000008799b08 ffff8000bc2f2ca8 ffff80007134fb68 >> [ 1.714105] bd20: ffff8000bc0bbd30 ffff000008482c28 ffff8000bc0bbd70 ffff0000084840b8 >> [ 1.721924] bd40: ffff000008c3a698 ffff8000bc0b8000 0000000000000000 ffff000008c76000 >> [ 1.729743] bd60: ffff000008ae048c ffff00000833b0fc ffff8000bc0bbda0 ffff000008601f14 >> [ 1.737564] bd80: ffff000008b0ffa4 ffff8000bc0b8000 0000000000000000 ffff000008485218 >> [ 1.745384] bda0: ffff8000bc0bbdd0 ffff000008b0ffbc ffff000008b0ffa4 ffff000008081a54 >> [ 1.753203] bdc0: ffff8000bc0bbdd0 ffff000008c3a660 ffff8000bc0bbde0 ffff000008081a54 >> [ 1.761021] bde0: ffff8000bc0bbe50 ffff000008ae0ce0 ffff000008b96740 ffff000008b2fb00 >> [ 1.768842] be00: 0000000000000006 ffff000008c76000 ffff8000bc0bbe00 ffff0000089d2b78 >> [ 1.776662] be20: ffff000008be8b10 0000000600000006 0000000000000000 0000000000000000 >> [ 1.784481] be40: ffff000008ae048c ffff000008acb0d0 ffff8000bc0bbeb0 ffff00000879b9c4 >> [ 1.792300] be60: ffff00000879b9b4 0000000000000000 0000000000000000 0000000000000000 >> [ 1.800120] be80: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 >> [ 1.807940] bea0: 0000000000000000 0000000000000000 0000000000000000 ffff000008084e10 >> [ 1.815759] bec0: ffff00000879b9b4 0000000000000000 0000000000000000 0000000000000000 >> [ 1.823577] bee0: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 >> [ 1.831396] bf00: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 >> [ 1.839215] bf20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 >> [ 1.847034] bf40: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 >> [ 1.854853] bf60: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 >> [ 1.862671] bf80: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 >> [ 1.870491] bfa0: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 >> [ 1.878310] bfc0: 0000000000000000 0000000000000000 0000000000000000 0000000000000005 >> [ 1.886129] bfe0: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 >> [ 1.893948] Call trace: >> [ 1.896389] Exception stack(0xffff8000bc0bb710 to 0xffff8000bc0bb830) >> [ 1.902818] b700: 0000000000000000 0000000000000378 >> [ 1.910637] b720: ffff8000bc0bb8d0 ffff0000087a19a0 ffff000e0001001e ffff8000709a8018 >> [ 1.918456] b740: ffff8000bc0bb780 ffff00000866a21c ffff8000bc0bb7c0 ffff00000860ffd8 >> [ 1.926276] b760: 0000000000000020 ffff8000bc0bb8f0 0000000000000020 ffff80007138e398 >> [ 1.934095] b780: ffff80007138e380 0000000000000002 0000000000000002 0000000000000004 >> [ 1.941915] b7a0: ffff80007138e394 ffff8000bc0bb8e0 0000000000000378 ffff8000bc0b8000 >> [ 1.949735] b7c0: 0000000000000040 0000000000000001 0000000000000000 ffff8000709d1058 >> [ 1.957554] b7e0: ffff8000bff9c740 0000000002480a08 ffff8000bc0b08b0 ffff8000bc0b8000 >> [ 1.965374] b800: 0000000000000850 0101010101010101 0000000000000018 ffffffffffffffff >> [ 1.973193] b820: ffffffffffffffff 0000000000000001 >> [ 1.978064] [<ffff0000087a19a0>] _raw_spin_lock_irqsave+0x1c/0x50 >> [ 1.984150] [<ffff000008614984>] bq27xxx_battery_update+0x88/0x51c >> [ 1.990321] [<ffff000008615084>] bq27xxx_battery_poll+0x24/0x70 >> [ 1.996231] [<ffff000008615180>] bq27xxx_battery_get_property+0xb0/0x3b4 >> [ 2.002923] [<ffff0000086133d8>] power_supply_read_temp+0x2c/0x54 >> [ 2.009005] [<ffff000008616508>] thermal_zone_get_temp+0x5c/0x11c >> [ 2.015089] [<ffff0000086183b0>] thermal_zone_device_update+0x34/0xb4 >> [ 2.021518] [<ffff0000086193b4>] thermal_zone_device_register+0x87c/0x8cc >> [ 2.028295] [<ffff000008613b6c>] __power_supply_register+0x370/0x430 >> [ 2.034638] [<ffff000008613c54>] power_supply_register_no_ws+0x10/0x18 >> [ 2.041155] [<ffff000008614f1c>] bq27xxx_battery_setup+0x104/0x15c >> [ 2.047325] [<ffff000008615668>] bq27xxx_battery_i2c_probe+0xd0/0x1b0 >> [ 2.053757] [<ffff0000085ffff4>] i2c_device_probe+0x174/0x240 >> [ 2.059498] [<ffff0000084836dc>] driver_probe_device+0x1fc/0x29c >> [ 2.065493] [<ffff000008483820>] __driver_attach+0xa4/0xa8 >> [ 2.070969] [<ffff000008481804>] bus_for_each_dev+0x58/0x98 >> [ 2.076531] [<ffff000008483008>] driver_attach+0x20/0x28 >> [ 2.081832] [<ffff000008482c28>] bus_add_driver+0x1c8/0x22c >> [ 2.087395] [<ffff0000084840b8>] driver_register+0x68/0x108 >> [ 2.092958] [<ffff000008601f14>] i2c_register_driver+0x38/0x7c >> [ 2.098784] [<ffff000008b0ffbc>] bq27xxx_battery_i2c_driver_init+0x18/0x20 >> [ 2.105650] [<ffff000008081a54>] do_one_initcall+0x38/0x12c >> [ 2.111215] [<ffff000008ae0ce0>] kernel_init_freeable+0x148/0x1ec >> [ 2.117299] [<ffff00000879b9c4>] kernel_init+0x10/0x100 >> [ 2.122516] [<ffff000008084e10>] ret_from_fork+0x10/0x40 >> [ 2.127819] Code: b9401823 11000463 b9001823 f9800011 (885ffc01) >> [ 2.133927] ---[ end trace 9759cbbac2b2ba9b ]--- >> [ 2.138547] note: swapper/0[1] exited with preempt_count 1 >> [ 2.144061] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b >> [ 2.144061] >> [ 2.153187] SMP: stopping secondary CPUs >> [ 2.157104] Kernel Offset: disabled >> [ 2.160584] Memory Limit: none >> [ 2.163633] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b >> [ 2.163633] >> >> Have you tried this recently? I have not had chance to dig into this. >> > > I haven't, but looking now, its another case where a callback triggered > during power_supply registration was using an uninitialized pointer for > the power_supply. This patch works for me locally, do you want to test > it out? Yes it does avoid the crash but ... > diff --git a/drivers/power/power_supply_core.c > b/drivers/power/power_supply_core.c > index 456987c88baa..1e02eae6e4b4 100644 > --- a/drivers/power/power_supply_core.c > +++ b/drivers/power/power_supply_core.c > @@ -561,11 +561,15 @@ static int power_supply_read_temp(struct > thermal_zone_device *tzd, > { > struct power_supply *psy; > union power_supply_propval val; > - int ret; > + int ret = 0; > > WARN_ON(tzd == NULL); > + > psy = tzd->devdata; > - ret = psy->desc->get_property(psy, POWER_SUPPLY_PROP_TEMP, &val); > + WARN_ON(atomic_read(&psy->use_cnt) == 0); ... this now generates a large splat on boot. > + > + if (atomic_read(&psy->use_cnt) > 0) > + ret = psy->desc->get_property(psy, POWER_SUPPLY_PROP_TEMP, &val); Did you verify if this prevents the temp being read later when the battery is polled? Looking at this a bit more I am wondering if we should prevent the battery for being polled before the registration has completed ... diff --git a/drivers/power/bq27xxx_battery.c b/drivers/power/bq27xxx_battery.c index 45f6ebf88df6..32649183ecd9 100644 --- a/drivers/power/bq27xxx_battery.c +++ b/drivers/power/bq27xxx_battery.c @@ -871,12 +871,14 @@ static int bq27xxx_battery_get_property(struct power_supply *psy, int ret = 0; struct bq27xxx_device_info *di = power_supply_get_drvdata(psy); - mutex_lock(&di->lock); - if (time_is_before_jiffies(di->last_update + 5 * HZ)) { - cancel_delayed_work_sync(&di->work); - bq27xxx_battery_poll(&di->work.work); + if (di->bat) { + mutex_lock(&di->lock); + if (time_is_before_jiffies(di->last_update + 5 * HZ)) { + cancel_delayed_work_sync(&di->work); + bq27xxx_battery_poll(&di->work.work); + } + mutex_unlock(&di->lock); } - mutex_unlock(&di->lock); It seems that the di->bat pointer is not initialised until after the registrations completes, but during the registration we try to reference it when polling the battery. I believe the above is ok because we call bq27xxx_battery_update() after registering. Cheers Jon -- nvpublic ^ permalink raw reply related [flat|nested] 24+ messages in thread
* Re: [PATCH] arm64: defconfig: Enable cros-ec and battery driver 2016-05-25 10:58 ` Jon Hunter @ 2016-05-25 11:03 ` Jon Hunter 2016-05-25 15:46 ` Thierry Reding 2016-05-25 15:49 ` Rhyland Klein 0 siblings, 2 replies; 24+ messages in thread From: Jon Hunter @ 2016-05-25 11:03 UTC (permalink / raw) To: Rhyland Klein, Thierry Reding, Stephen Warren, Alexandre Courbot Cc: linux-kernel, linux-tegra On 25/05/16 11:58, Jon Hunter wrote: ... > Looking at this a bit more I am wondering if we should prevent the > battery for being polled before the registration has completed ... > > diff --git a/drivers/power/bq27xxx_battery.c > b/drivers/power/bq27xxx_battery.c > index 45f6ebf88df6..32649183ecd9 100644 > --- a/drivers/power/bq27xxx_battery.c > +++ b/drivers/power/bq27xxx_battery.c > @@ -871,12 +871,14 @@ static int bq27xxx_battery_get_property(struct > power_supply *psy, > int ret = 0; > struct bq27xxx_device_info *di = power_supply_get_drvdata(psy); > > - mutex_lock(&di->lock); > - if (time_is_before_jiffies(di->last_update + 5 * HZ)) { > - cancel_delayed_work_sync(&di->work); > - bq27xxx_battery_poll(&di->work.work); > + if (di->bat) { > + mutex_lock(&di->lock); > + if (time_is_before_jiffies(di->last_update + 5 * HZ)) { > + cancel_delayed_work_sync(&di->work); > + bq27xxx_battery_poll(&di->work.work); > + } > + mutex_unlock(&di->lock); > } > - mutex_unlock(&di->lock); Alternatively, maybe the following is simpler ... diff --git a/drivers/power/bq27xxx_battery.c b/drivers/power/bq27xxx_battery.c index 45f6ebf88df6..8a713b52e9f6 100644 --- a/drivers/power/bq27xxx_battery.c +++ b/drivers/power/bq27xxx_battery.c @@ -733,7 +733,8 @@ static void bq27xxx_battery_poll(struct work_struct *work) container_of(work, struct bq27xxx_device_info, work.work); - bq27xxx_battery_update(di); + if (di->bat) + bq27xxx_battery_update(di); Jon -- nvpublic ^ permalink raw reply related [flat|nested] 24+ messages in thread
* Re: [PATCH] arm64: defconfig: Enable cros-ec and battery driver 2016-05-25 11:03 ` Jon Hunter @ 2016-05-25 15:46 ` Thierry Reding 2016-05-25 15:55 ` Rhyland Klein 2016-05-25 15:57 ` Jon Hunter 2016-05-25 15:49 ` Rhyland Klein 1 sibling, 2 replies; 24+ messages in thread From: Thierry Reding @ 2016-05-25 15:46 UTC (permalink / raw) To: Jon Hunter Cc: Rhyland Klein, Stephen Warren, Alexandre Courbot, linux-kernel, linux-tegra [-- Attachment #1: Type: text/plain, Size: 3538 bytes --] On Wed, May 25, 2016 at 12:03:47PM +0100, Jon Hunter wrote: > > On 25/05/16 11:58, Jon Hunter wrote: > > ... > > > Looking at this a bit more I am wondering if we should prevent the > > battery for being polled before the registration has completed ... > > > > diff --git a/drivers/power/bq27xxx_battery.c > > b/drivers/power/bq27xxx_battery.c > > index 45f6ebf88df6..32649183ecd9 100644 > > --- a/drivers/power/bq27xxx_battery.c > > +++ b/drivers/power/bq27xxx_battery.c > > @@ -871,12 +871,14 @@ static int bq27xxx_battery_get_property(struct > > power_supply *psy, > > int ret = 0; > > struct bq27xxx_device_info *di = power_supply_get_drvdata(psy); > > > > - mutex_lock(&di->lock); > > - if (time_is_before_jiffies(di->last_update + 5 * HZ)) { > > - cancel_delayed_work_sync(&di->work); > > - bq27xxx_battery_poll(&di->work.work); > > + if (di->bat) { > > + mutex_lock(&di->lock); > > + if (time_is_before_jiffies(di->last_update + 5 * HZ)) { > > + cancel_delayed_work_sync(&di->work); > > + bq27xxx_battery_poll(&di->work.work); > > + } > > + mutex_unlock(&di->lock); > > } > > - mutex_unlock(&di->lock); > > Alternatively, maybe the following is simpler ... > > diff --git a/drivers/power/bq27xxx_battery.c > b/drivers/power/bq27xxx_battery.c > index 45f6ebf88df6..8a713b52e9f6 100644 > --- a/drivers/power/bq27xxx_battery.c > +++ b/drivers/power/bq27xxx_battery.c > @@ -733,7 +733,8 @@ static void bq27xxx_battery_poll(struct work_struct > *work) > container_of(work, struct bq27xxx_device_info, > work.work); > > - bq27xxx_battery_update(di); > + if (di->bat) > + bq27xxx_battery_update(di); How about this, which should be the most minimal to fix it (though it's completely untested) and still update the internal cache (it just won't signal an supply change, which wouldn't work at this point anyway). The patch makes up for the supply change notification by doing that instead of a full bq27xxx_battery_update() at the end of ->probe(). This should take care of always sending out a uevent on successful probe, whereas a bq27xxx_battery_update() at the end of ->probe() may not send one if it is presented with the same data. Thierry --- >8 --- diff --git a/drivers/power/bq27xxx_battery.c b/drivers/power/bq27xxx_battery.c index 45f6ebf88df6..df1b4cb2bbc2 100644 --- a/drivers/power/bq27xxx_battery.c +++ b/drivers/power/bq27xxx_battery.c @@ -717,7 +717,13 @@ void bq27xxx_battery_update(struct bq27xxx_device_info *di) di->charge_design_full = bq27xxx_battery_read_dcap(di); } - if (di->cache.capacity != cache.capacity) + /* + * This function ends up being called while the power supply is being + * registered, hence di->bat will be NULL on the first call, causing + * power_supply_changed() to oops. Avoid that by checking if we have + * been registered already or not. + */ + if (di->bat && di->cache.capacity != cache.capacity) power_supply_changed(di->bat); if (memcmp(&di->cache, &cache, sizeof(cache)) != 0) @@ -984,7 +990,7 @@ int bq27xxx_battery_setup(struct bq27xxx_device_info *di) dev_info(di->dev, "support ver. %s enabled\n", DRIVER_VERSION); - bq27xxx_battery_update(di); + power_supply_changed(di->bat); return 0; } [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 819 bytes --] ^ permalink raw reply related [flat|nested] 24+ messages in thread
* Re: [PATCH] arm64: defconfig: Enable cros-ec and battery driver 2016-05-25 15:46 ` Thierry Reding @ 2016-05-25 15:55 ` Rhyland Klein 2016-05-25 16:10 ` Jon Hunter 2016-05-25 15:57 ` Jon Hunter 1 sibling, 1 reply; 24+ messages in thread From: Rhyland Klein @ 2016-05-25 15:55 UTC (permalink / raw) To: Thierry Reding, Jon Hunter, Krzysztof Kozlowski Cc: Stephen Warren, Alexandre Courbot, linux-kernel, linux-tegra On 5/25/2016 11:46 AM, Thierry Reding wrote: > On Wed, May 25, 2016 at 12:03:47PM +0100, Jon Hunter wrote: >> >> On 25/05/16 11:58, Jon Hunter wrote: >> >> ... >> >>> Looking at this a bit more I am wondering if we should prevent the >>> battery for being polled before the registration has completed ... >>> >>> diff --git a/drivers/power/bq27xxx_battery.c >>> b/drivers/power/bq27xxx_battery.c >>> index 45f6ebf88df6..32649183ecd9 100644 >>> --- a/drivers/power/bq27xxx_battery.c >>> +++ b/drivers/power/bq27xxx_battery.c >>> @@ -871,12 +871,14 @@ static int bq27xxx_battery_get_property(struct >>> power_supply *psy, >>> int ret = 0; >>> struct bq27xxx_device_info *di = power_supply_get_drvdata(psy); >>> >>> - mutex_lock(&di->lock); >>> - if (time_is_before_jiffies(di->last_update + 5 * HZ)) { >>> - cancel_delayed_work_sync(&di->work); >>> - bq27xxx_battery_poll(&di->work.work); >>> + if (di->bat) { >>> + mutex_lock(&di->lock); >>> + if (time_is_before_jiffies(di->last_update + 5 * HZ)) { >>> + cancel_delayed_work_sync(&di->work); >>> + bq27xxx_battery_poll(&di->work.work); >>> + } >>> + mutex_unlock(&di->lock); >>> } >>> - mutex_unlock(&di->lock); >> >> Alternatively, maybe the following is simpler ... >> >> diff --git a/drivers/power/bq27xxx_battery.c >> b/drivers/power/bq27xxx_battery.c >> index 45f6ebf88df6..8a713b52e9f6 100644 >> --- a/drivers/power/bq27xxx_battery.c >> +++ b/drivers/power/bq27xxx_battery.c >> @@ -733,7 +733,8 @@ static void bq27xxx_battery_poll(struct work_struct >> *work) >> container_of(work, struct bq27xxx_device_info, >> work.work); >> >> - bq27xxx_battery_update(di); >> + if (di->bat) >> + bq27xxx_battery_update(di); > > How about this, which should be the most minimal to fix it (though it's > completely untested) and still update the internal cache (it just won't > signal an supply change, which wouldn't work at this point anyway). The > patch makes up for the supply change notification by doing that instead > of a full bq27xxx_battery_update() at the end of ->probe(). This should > take care of always sending out a uevent on successful probe, whereas a > bq27xxx_battery_update() at the end of ->probe() may not send one if it > is presented with the same data. > The problem I see with this is that this only fixes this for the bq27xxx driver. The real problem is that during the registration for (di->bat = power_supply_register...) the core is calling back into the driver being registered passing it an incomplete struct. As far as I can tell, the call should never be made in the first place. In fact, for all drivers that register and support thermal, this should be happening. Adding Krzysztof Kozlowski who has looked into similar issues recently it appears. -rhyland -- nvpublic ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH] arm64: defconfig: Enable cros-ec and battery driver 2016-05-25 15:55 ` Rhyland Klein @ 2016-05-25 16:10 ` Jon Hunter 2016-05-25 16:29 ` Jon Hunter 0 siblings, 1 reply; 24+ messages in thread From: Jon Hunter @ 2016-05-25 16:10 UTC (permalink / raw) To: Rhyland Klein, Thierry Reding, Krzysztof Kozlowski Cc: Stephen Warren, Alexandre Courbot, linux-kernel, linux-tegra On 25/05/16 16:55, Rhyland Klein wrote: > On 5/25/2016 11:46 AM, Thierry Reding wrote: >> On Wed, May 25, 2016 at 12:03:47PM +0100, Jon Hunter wrote: >>> >>> On 25/05/16 11:58, Jon Hunter wrote: >>> >>> ... >>> >>>> Looking at this a bit more I am wondering if we should prevent the >>>> battery for being polled before the registration has completed ... >>>> >>>> diff --git a/drivers/power/bq27xxx_battery.c >>>> b/drivers/power/bq27xxx_battery.c >>>> index 45f6ebf88df6..32649183ecd9 100644 >>>> --- a/drivers/power/bq27xxx_battery.c >>>> +++ b/drivers/power/bq27xxx_battery.c >>>> @@ -871,12 +871,14 @@ static int bq27xxx_battery_get_property(struct >>>> power_supply *psy, >>>> int ret = 0; >>>> struct bq27xxx_device_info *di = power_supply_get_drvdata(psy); >>>> >>>> - mutex_lock(&di->lock); >>>> - if (time_is_before_jiffies(di->last_update + 5 * HZ)) { >>>> - cancel_delayed_work_sync(&di->work); >>>> - bq27xxx_battery_poll(&di->work.work); >>>> + if (di->bat) { >>>> + mutex_lock(&di->lock); >>>> + if (time_is_before_jiffies(di->last_update + 5 * HZ)) { >>>> + cancel_delayed_work_sync(&di->work); >>>> + bq27xxx_battery_poll(&di->work.work); >>>> + } >>>> + mutex_unlock(&di->lock); >>>> } >>>> - mutex_unlock(&di->lock); >>> >>> Alternatively, maybe the following is simpler ... >>> >>> diff --git a/drivers/power/bq27xxx_battery.c >>> b/drivers/power/bq27xxx_battery.c >>> index 45f6ebf88df6..8a713b52e9f6 100644 >>> --- a/drivers/power/bq27xxx_battery.c >>> +++ b/drivers/power/bq27xxx_battery.c >>> @@ -733,7 +733,8 @@ static void bq27xxx_battery_poll(struct work_struct >>> *work) >>> container_of(work, struct bq27xxx_device_info, >>> work.work); >>> >>> - bq27xxx_battery_update(di); >>> + if (di->bat) >>> + bq27xxx_battery_update(di); >> >> How about this, which should be the most minimal to fix it (though it's >> completely untested) and still update the internal cache (it just won't >> signal an supply change, which wouldn't work at this point anyway). The >> patch makes up for the supply change notification by doing that instead >> of a full bq27xxx_battery_update() at the end of ->probe(). This should >> take care of always sending out a uevent on successful probe, whereas a >> bq27xxx_battery_update() at the end of ->probe() may not send one if it >> is presented with the same data. >> > > The problem I see with this is that this only fixes this for the bq27xxx > driver. The real problem is that during the registration for (di->bat = > power_supply_register...) the core is calling back into the driver being > registered passing it an incomplete struct. As far as I can tell, the > call should never be made in the first place. In fact, for all drivers > that register and support thermal, this should be happening. So power_supply_read_temp() calls ->get_property() and passes the power_supply psy struct which is initialised. The problem is that inside the bq27xxx driver, this then kicks off the worker thread to update the bq27xxx state and when this worker thread runs it attempts to access the same psy struct but by dereferencing a pointer to it from the bq27xxx_device_info where the pointer has not been initialised yet. Therefore, IMO it seems that we should not allow this worker thread to start until the registration has completed and hence the pointer is initialised. I don't see why the temperature could not be read during the registration to get the initial temp and it does seem to work fine if we prevent this worker thread from running. I am sure there are lot of other devices that have the POWER_SUPPLY_PROP_TEMP property and so I would have thought if this is a generic problem it would have come up before now? Plus this worker thread that triggers the crash is specific to the bq27xxx. Cheers Jon -- nvpublic ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH] arm64: defconfig: Enable cros-ec and battery driver 2016-05-25 16:10 ` Jon Hunter @ 2016-05-25 16:29 ` Jon Hunter 2016-05-25 16:36 ` Rhyland Klein 2016-05-25 16:36 ` Jon Hunter 0 siblings, 2 replies; 24+ messages in thread From: Jon Hunter @ 2016-05-25 16:29 UTC (permalink / raw) To: Rhyland Klein, Thierry Reding, Krzysztof Kozlowski Cc: Stephen Warren, Alexandre Courbot, linux-kernel, linux-tegra On 25/05/16 17:10, Jon Hunter wrote: ... > So power_supply_read_temp() calls ->get_property() and passes the > power_supply psy struct which is initialised. The problem is that inside > the bq27xxx driver, this then kicks off the worker thread to update the > bq27xxx state and when this worker thread runs it attempts to access the > same psy struct but by dereferencing a pointer to it from the > bq27xxx_device_info where the pointer has not been initialised yet. > Therefore, IMO it seems that we should not allow this worker thread to > start until the registration has completed and hence the pointer is > initialised. Sorry, it is not the actual worker thread that triggers the NULL pointer deference, but the function bq27xxx_battery_poll() that schedules the worker thread. Anyway, I still don't see that we need to update the bq27xxx state during the registration especially seeing as we call bq27xxx_battery_update() after the registration is complete. It seems that updating the overall state should be mutually exclusive from reading the temp. Looking at my patch, it does appear that the worker thread which also calls bq27xxx_battery_update() is still scheduled and so may be it should be ... diff --git a/drivers/power/bq27xxx_battery.c b/drivers/power/bq27xxx_battery.c index 45f6ebf88df6..1334ed522332 100644 --- a/drivers/power/bq27xxx_battery.c +++ b/drivers/power/bq27xxx_battery.c @@ -733,6 +733,9 @@ static void bq27xxx_battery_poll(struct work_struct *work) container_of(work, struct bq27xxx_device_info, work.work); + if (!di->bat) + return; + bq27xxx_battery_update(di); if (poll_interval > 0) { Cheers Jon -- nvpublic ^ permalink raw reply related [flat|nested] 24+ messages in thread
* Re: [PATCH] arm64: defconfig: Enable cros-ec and battery driver 2016-05-25 16:29 ` Jon Hunter @ 2016-05-25 16:36 ` Rhyland Klein 2016-05-25 17:26 ` Jon Hunter 2016-05-25 16:36 ` Jon Hunter 1 sibling, 1 reply; 24+ messages in thread From: Rhyland Klein @ 2016-05-25 16:36 UTC (permalink / raw) To: Jon Hunter, Thierry Reding, Krzysztof Kozlowski, Sebastian Reichel, David Woodhouse, Dmitry Eremin-Solenikov Cc: Stephen Warren, Alexandre Courbot, linux-kernel, linux-tegra On 5/25/2016 12:29 PM, Jon Hunter wrote: > > On 25/05/16 17:10, Jon Hunter wrote: > > ... > >> So power_supply_read_temp() calls ->get_property() and passes the >> power_supply psy struct which is initialised. The problem is that inside >> the bq27xxx driver, this then kicks off the worker thread to update the >> bq27xxx state and when this worker thread runs it attempts to access the >> same psy struct but by dereferencing a pointer to it from the >> bq27xxx_device_info where the pointer has not been initialised yet. >> Therefore, IMO it seems that we should not allow this worker thread to >> start until the registration has completed and hence the pointer is >> initialised. > > Sorry, it is not the actual worker thread that triggers the NULL pointer > deference, but the function bq27xxx_battery_poll() that schedules the > worker thread. Anyway, I still don't see that we need to update the > bq27xxx state during the registration especially seeing as we call > bq27xxx_battery_update() after the registration is complete. It seems > that updating the overall state should be mutually exclusive from > reading the temp. > > Looking at my patch, it does appear that the worker thread which also > calls bq27xxx_battery_update() is still scheduled and so may be it > should be ... > > diff --git a/drivers/power/bq27xxx_battery.c b/drivers/power/bq27xxx_battery.c > index 45f6ebf88df6..1334ed522332 100644 > --- a/drivers/power/bq27xxx_battery.c > +++ b/drivers/power/bq27xxx_battery.c > @@ -733,6 +733,9 @@ static void bq27xxx_battery_poll(struct work_struct *work) > container_of(work, struct bq27xxx_device_info, > work.work); > > + if (!di->bat) > + return; > + > bq27xxx_battery_update(di); > > if (poll_interval > 0) { > > I can see that getting the temperature could work. I would point out that I don't see any recent changes to bq27xxx or the power_supply_core that would imply this is a regression. My guess is that up until now, for devices that support the TEMP property, CONFIG_THERMAL isn't been enabled. So here are my thoughts.... we can do 2 things here: 1) patch bq27xxx in some manner that will allow the bq27xxx driver to work report the temp during register (such as above patch). 2) Patch the core to avoid using get_property callback during registration. I think for our immediate concern and crash, #1 is fine. It will work and is fine. I however think this is just a symptom of the larger issue (#2). In this case, the problem we see is that di->bat is used before it is set, and we have a way around it. However, for EVERY device that registers and has TEMP prop (and CONFIG_THERMAL enabled) it is going to receive a call with its relative di->bat uninitialized too. I don't know for certain if #2 has caused problems anywhere else, and I would be surprised if it has and hasn't been caught. AS far as this crash is concerned, I think either approach will work. Adding in David, Dmitry, and Sebastian (maintainers) to see if they have a preferred approach. -rhyland -- nvpublic ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH] arm64: defconfig: Enable cros-ec and battery driver 2016-05-25 16:36 ` Rhyland Klein @ 2016-05-25 17:26 ` Jon Hunter 2016-05-25 19:44 ` Rhyland Klein 0 siblings, 1 reply; 24+ messages in thread From: Jon Hunter @ 2016-05-25 17:26 UTC (permalink / raw) To: Rhyland Klein, Thierry Reding, Krzysztof Kozlowski, Sebastian Reichel, David Woodhouse, Dmitry Eremin-Solenikov Cc: Stephen Warren, Alexandre Courbot, linux-kernel, linux-tegra On 25/05/16 17:36, Rhyland Klein wrote: ... > I can see that getting the temperature could work. I would point out > that I don't see any recent changes to bq27xxx or the power_supply_core > that would imply this is a regression. My guess is that up until now, > for devices that support the TEMP property, CONFIG_THERMAL isn't been > enabled. > > So here are my thoughts.... we can do 2 things here: > > 1) patch bq27xxx in some manner that will allow the bq27xxx driver to > work report the temp during register (such as above patch). > 2) Patch the core to avoid using get_property callback during registration. I wonder if #2 will cause other problems for other devices as this really changes the functionality. > I think for our immediate concern and crash, #1 is fine. It will work > and is fine. I however think this is just a symptom of the larger issue > (#2). In this case, the problem we see is that di->bat is used before it > is set, and we have a way around it. However, for EVERY device that > registers and has TEMP prop (and CONFIG_THERMAL enabled) it is going to > receive a call with its relative di->bat uninitialized too. I don't think we are understanding each other and I still think that this could be specific to the bq27xxx. And here is why ... The power supply driver is going to receive a call to it's ->get_property() function with a *VALID* pointer to the power_supply structure, *psy. When initialised, di->bat == psy (assuming bq27xxx naming) and so in other words, they point to the same thing. Therefore, in the normal case, you should not need to access di->bat from within ->get_property() because you already have a valid pointer to the power_supply structure, *psy. So the ONLY problem would be IF the ->get_property function calls power_supply_get_drvdata() to get a pointer to the drvdata, *di, and then dereferences and uses the pointer, di->bat, which may NOT be initialised yet. I am wondering how likely this is as it seems a bit daft to do this, unless you are doing something like the bq27xxx driver is attempting to do. Again IMO the problem is related to how the bq27xxx driver has implemented the status update. Cheers Jon -- nvpublic ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH] arm64: defconfig: Enable cros-ec and battery driver 2016-05-25 17:26 ` Jon Hunter @ 2016-05-25 19:44 ` Rhyland Klein 2016-05-26 10:35 ` Jon Hunter 2016-05-27 8:37 ` Krzysztof Kozlowski 0 siblings, 2 replies; 24+ messages in thread From: Rhyland Klein @ 2016-05-25 19:44 UTC (permalink / raw) To: Jon Hunter, Thierry Reding, Krzysztof Kozlowski, Sebastian Reichel, David Woodhouse, Dmitry Eremin-Solenikov Cc: Stephen Warren, Alexandre Courbot, linux-kernel, linux-tegra On 5/25/2016 1:26 PM, Jon Hunter wrote: > > On 25/05/16 17:36, Rhyland Klein wrote: > > ... > >> I can see that getting the temperature could work. I would point out >> that I don't see any recent changes to bq27xxx or the power_supply_core >> that would imply this is a regression. My guess is that up until now, >> for devices that support the TEMP property, CONFIG_THERMAL isn't been >> enabled. >> >> So here are my thoughts.... we can do 2 things here: >> >> 1) patch bq27xxx in some manner that will allow the bq27xxx driver to >> work report the temp during register (such as above patch). >> 2) Patch the core to avoid using get_property callback during registration. > > I wonder if #2 will cause other problems for other devices as this > really changes the functionality. > >> I think for our immediate concern and crash, #1 is fine. It will work >> and is fine. I however think this is just a symptom of the larger issue >> (#2). In this case, the problem we see is that di->bat is used before it >> is set, and we have a way around it. However, for EVERY device that >> registers and has TEMP prop (and CONFIG_THERMAL enabled) it is going to >> receive a call with its relative di->bat uninitialized too. > > I don't think we are understanding each other and I still think that > this could be specific to the bq27xxx. And here is why ... > > The power supply driver is going to receive a call to it's > ->get_property() function with a *VALID* pointer to the power_supply > structure, *psy. When initialised, di->bat == psy (assuming bq27xxx > naming) and so in other words, they point to the same thing. Therefore, > in the normal case, you should not need to access di->bat from within > ->get_property() because you already have a valid pointer to the > power_supply structure, *psy. > > So the ONLY problem would be IF the ->get_property function calls > power_supply_get_drvdata() to get a pointer to the drvdata, *di, and > then dereferences and uses the pointer, di->bat, which may NOT be > initialised yet. I am wondering how likely this is as it seems a bit > daft to do this, unless you are doing something like the bq27xxx driver > is attempting to do. > > Again IMO the problem is related to how the bq27xxx driver has > implemented the status update. > And you might be completely correct, that is something that can only happen specifically with the bq27xxx driver. In which case, making the fix there should be the fix. I just know from the commit log (and some previous work with power supply drivers) that the case of get_property being called during registration has caused problems before. That's why I am trying to make sure we cover the generic case if it exists. Using scheduled work is common for power_supplies to regularly update their status. As for your proposed patches for bq27xxx, I think the latest one you suggested (@12:36PM EST) with the change for battery_update->battery_poll as well makes the most sense for bq27xxx. I would like to point out though that if we patch this, the cache won't be populated for the first TEMP request, which has the same end effect as the patch I proposed to power_supply_read_temp. I believe both will return 0 for the temp. I think that patch would work just fine in place of what I suggested for this specific crash. -rhyland -- nvpublic ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH] arm64: defconfig: Enable cros-ec and battery driver 2016-05-25 19:44 ` Rhyland Klein @ 2016-05-26 10:35 ` Jon Hunter 2016-05-27 8:37 ` Krzysztof Kozlowski 1 sibling, 0 replies; 24+ messages in thread From: Jon Hunter @ 2016-05-26 10:35 UTC (permalink / raw) To: Rhyland Klein, Thierry Reding, Krzysztof Kozlowski, Sebastian Reichel, David Woodhouse, Dmitry Eremin-Solenikov Cc: Stephen Warren, Alexandre Courbot, linux-kernel, linux-tegra On 25/05/16 20:44, Rhyland Klein wrote: ... > And you might be completely correct, that is something that can only > happen specifically with the bq27xxx driver. In which case, making the > fix there should be the fix. I just know from the commit log (and some > previous work with power supply drivers) that the case of get_property > being called during registration has caused problems before. That's why > I am trying to make sure we cover the generic case if it exists. Using > scheduled work is common for power_supplies to regularly update their > status. Yes but only appears to be the bq27xxx and seems to be related to this issue. To be honest, if the maintainers are ok with your proposal then it is fine with me, but I was just thinking that this feels like a bq27xxx issue. > As for your proposed patches for bq27xxx, I think the latest one you > suggested (@12:36PM EST) with the change for > battery_update->battery_poll as well makes the most sense for bq27xxx. I > would like to point out though that if we patch this, the cache won't be > populated for the first TEMP request, which has the same end effect as > the patch I proposed to power_supply_read_temp. I believe both will > return 0 for the temp. Actually, I gave it a test and instead of returning zero it returns absolute zero (0K or -273C!). So that is probably no good. Something like Thierry was suggesting could work. Another thought I had was ... diff --git a/drivers/power/bq27xxx_battery.c b/drivers/power/bq27xxx_battery.c index 45f6ebf88df6..77488183df6b 100644 --- a/drivers/power/bq27xxx_battery.c +++ b/drivers/power/bq27xxx_battery.c @@ -674,12 +674,15 @@ static int bq27xxx_battery_read_health(struct bq27xxx_device_info *di) return POWER_SUPPLY_HEALTH_GOOD; } -void bq27xxx_battery_update(struct bq27xxx_device_info *di) +static void __bq27xxx_battery_update(struct bq27xxx_device_info *di, + struct power_supply *psy) { struct bq27xxx_reg_cache cache = {0, }; bool has_ci_flag = di->chip == BQ27000 || di->chip == BQ27010; bool has_singe_flag = di->chip == BQ27000 || di->chip == BQ27010; + WARN_ON(!psy); + cache.flags = bq27xxx_read(di, BQ27XXX_REG_FLAGS, has_singe_flag); if ((cache.flags & 0xff) == 0xff) cache.flags = -1; /* read error */ @@ -717,13 +720,25 @@ void bq27xxx_battery_update(struct bq27xxx_device_info *di) di->charge_design_full = bq27xxx_battery_read_dcap(di); } - if (di->cache.capacity != cache.capacity) - power_supply_changed(di->bat); + if (psy && di->cache.capacity != cache.capacity) + power_supply_changed(psy); if (memcmp(&di->cache, &cache, sizeof(cache)) != 0) di->cache = cache; di->last_update = jiffies; + + if (poll_interval > 0) { + /* The timer does not have to be accurate. */ + set_timer_slack(&di->work.timer, poll_interval * HZ / 4); + schedule_delayed_work(&di->work, poll_interval * HZ); + } +} + +void bq27xxx_battery_update(struct bq27xxx_device_info *di) +{ + cancel_delayed_work_sync(&di->work); + __bq27xxx_battery_update(di, di->bat); } EXPORT_SYMBOL_GPL(bq27xxx_battery_update); @@ -733,13 +748,7 @@ static void bq27xxx_battery_poll(struct work_struct *work) container_of(work, struct bq27xxx_device_info, work.work); - bq27xxx_battery_update(di); - - if (poll_interval > 0) { - /* The timer does not have to be accurate. */ - set_timer_slack(&di->work.timer, poll_interval * HZ / 4); - schedule_delayed_work(&di->work, poll_interval * HZ); - } + __bq27xxx_battery_update(di, di->bat); } /* @@ -874,7 +883,7 @@ static int bq27xxx_battery_get_property(struct power_supply *psy, mutex_lock(&di->lock); if (time_is_before_jiffies(di->last_update + 5 * HZ)) { cancel_delayed_work_sync(&di->work); - bq27xxx_battery_poll(&di->work.work); + __bq27xxx_battery_update(di, psy); } mutex_unlock(&di->lock); Cheers Jon -- nvpublic ^ permalink raw reply related [flat|nested] 24+ messages in thread
* Re: [PATCH] arm64: defconfig: Enable cros-ec and battery driver 2016-05-25 19:44 ` Rhyland Klein 2016-05-26 10:35 ` Jon Hunter @ 2016-05-27 8:37 ` Krzysztof Kozlowski 2016-05-27 9:19 ` Krzysztof Kozlowski 2016-05-27 10:28 ` Jon Hunter 1 sibling, 2 replies; 24+ messages in thread From: Krzysztof Kozlowski @ 2016-05-27 8:37 UTC (permalink / raw) To: Rhyland Klein, Jon Hunter, Thierry Reding, Sebastian Reichel, David Woodhouse, Dmitry Eremin-Solenikov Cc: Stephen Warren, Alexandre Courbot, linux-kernel, linux-tegra On 05/25/2016 09:44 PM, Rhyland Klein wrote: > On 5/25/2016 1:26 PM, Jon Hunter wrote: >> >> On 25/05/16 17:36, Rhyland Klein wrote: >> >> ... >> >>> I can see that getting the temperature could work. I would point out >>> that I don't see any recent changes to bq27xxx or the power_supply_core >>> that would imply this is a regression. My guess is that up until now, >>> for devices that support the TEMP property, CONFIG_THERMAL isn't been >>> enabled. >>> >>> So here are my thoughts.... we can do 2 things here: >>> >>> 1) patch bq27xxx in some manner that will allow the bq27xxx driver to >>> work report the temp during register (such as above patch). >>> 2) Patch the core to avoid using get_property callback during registration. >> >> I wonder if #2 will cause other problems for other devices as this >> really changes the functionality. >> >>> I think for our immediate concern and crash, #1 is fine. It will work >>> and is fine. I however think this is just a symptom of the larger issue >>> (#2). In this case, the problem we see is that di->bat is used before it >>> is set, and we have a way around it. However, for EVERY device that >>> registers and has TEMP prop (and CONFIG_THERMAL enabled) it is going to >>> receive a call with its relative di->bat uninitialized too. >> >> I don't think we are understanding each other and I still think that >> this could be specific to the bq27xxx. And here is why ... >> >> The power supply driver is going to receive a call to it's >> ->get_property() function with a *VALID* pointer to the power_supply >> structure, *psy. When initialised, di->bat == psy (assuming bq27xxx >> naming) and so in other words, they point to the same thing. Therefore, >> in the normal case, you should not need to access di->bat from within >> ->get_property() because you already have a valid pointer to the >> power_supply structure, *psy. >> >> So the ONLY problem would be IF the ->get_property function calls >> power_supply_get_drvdata() to get a pointer to the drvdata, *di, and >> then dereferences and uses the pointer, di->bat, which may NOT be >> initialised yet. I am wondering how likely this is as it seems a bit >> daft to do this, unless you are doing something like the bq27xxx driver >> is attempting to do. >> >> Again IMO the problem is related to how the bq27xxx driver has >> implemented the status update. >> > > And you might be completely correct, that is something that can only > happen specifically with the bq27xxx driver. In which case, making the > fix there should be the fix. I just know from the commit log (and some > previous work with power supply drivers) that the case of get_property > being called during registration has caused problems before. That's why > I am trying to make sure we cover the generic case if it exists. Using > scheduled work is common for power_supplies to regularly update their > status. > > As for your proposed patches for bq27xxx, I think the latest one you > suggested (@12:36PM EST) with the change for > battery_update->battery_poll as well makes the most sense for bq27xxx. I > would like to point out though that if we patch this, the cache won't be > populated for the first TEMP request, which has the same end effect as > the patch I proposed to power_supply_read_temp. I believe both will > return 0 for the temp. > > I think that patch would work just fine in place of what I suggested for > this specific crash. Hello all, Indeed I was struggling with similar issue in bq27x00_battery. The issue was introduced by... me :( when moving the ownership of power supply structure from driver to the core. However IMHO my change exposed the fundamental problem with power supply. Anyway a fix for this issue was: 7f1a57fdd6cb6e7b (power_supply: Fix possible NULL pointer dereference on early uevent) AFAIU, this fix no longer fixes all the issues, right? As for the fundamental problem, the power supply core should not call back the driver (get_property()) until the probe ends. Even if the di->bat was initialized, some other fields of driver could not be set yet. In general, the probe did not end so we should avoid calling driver internal functions. In this particular problem: 1. Fix for the driver (!di->bat) is okay... but it won't solve the problem in general. 2. I think the core should handle it somehow... Best regards, Krzysztof ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH] arm64: defconfig: Enable cros-ec and battery driver 2016-05-27 8:37 ` Krzysztof Kozlowski @ 2016-05-27 9:19 ` Krzysztof Kozlowski 2016-05-27 10:28 ` Jon Hunter 1 sibling, 0 replies; 24+ messages in thread From: Krzysztof Kozlowski @ 2016-05-27 9:19 UTC (permalink / raw) To: Rhyland Klein, Jon Hunter, Thierry Reding, Sebastian Reichel, David Woodhouse, Dmitry Eremin-Solenikov Cc: Stephen Warren, Alexandre Courbot, linux-kernel, linux-tegra On 05/27/2016 10:37 AM, Krzysztof Kozlowski wrote: >> And you might be completely correct, that is something that can only >> happen specifically with the bq27xxx driver. In which case, making the >> fix there should be the fix. I just know from the commit log (and some >> previous work with power supply drivers) that the case of get_property >> being called during registration has caused problems before. That's why >> I am trying to make sure we cover the generic case if it exists. Using >> scheduled work is common for power_supplies to regularly update their >> status. >> >> As for your proposed patches for bq27xxx, I think the latest one you >> suggested (@12:36PM EST) with the change for >> battery_update->battery_poll as well makes the most sense for bq27xxx. I >> would like to point out though that if we patch this, the cache won't be >> populated for the first TEMP request, which has the same end effect as >> the patch I proposed to power_supply_read_temp. I believe both will >> return 0 for the temp. >> >> I think that patch would work just fine in place of what I suggested for >> this specific crash. > > Hello all, > > Indeed I was struggling with similar issue in bq27x00_battery. The issue > was introduced by... me :( when moving the ownership of power supply > structure from driver to the core. However IMHO my change exposed the > fundamental problem with power supply. > > Anyway a fix for this issue was: > 7f1a57fdd6cb6e7b (power_supply: Fix possible NULL pointer dereference on > early uevent) > AFAIU, this fix no longer fixes all the issues, right? > > As for the fundamental problem, the power supply core should not call > back the driver (get_property()) until the probe ends. Even if the > di->bat was initialized, some other fields of driver could not be set > yet. In general, the probe did not end so we should avoid calling driver > internal functions. > > In this particular problem: > 1. Fix for the driver (!di->bat) is okay... but it won't solve the > problem in general. > 2. I think the core should handle it somehow... I was thinking about some more generic solutions for that. Few ideas: 1. Split the power_supply_register() into register + manual call to power_supply_changed(). Each driver will have to call the power_supply_changed() when it is ready to do it. After that call, it is expected that driver provides everything for power supply (it can receive callbacks). 2. Since 7f1a57fdd6cb ("power_supply: Fix possible NULL pointer dereference on early uevent") the power_supply_changed() is called from a deferred work. Separate thread. We can introduce (in the core only) a mutex: power_supply_deferred_register_work() { psy->mutex_lock(); power_supply_changed(psy); psy->mutex_unlock(); } and add it also to all of API: power_supply_get_property() { psy->mutex_lock(); psy->get_property(); psy->mutex_unlock(); } The changes would be limited only to the core but we will introduce strict locking over all of the psy callbacks. 3. We can go back to previous API, leaving the allocation done by the core: some_drv_probe() { err = power_supply_register(&some_drv->psy...); } I think the second solution seems to be the most self-contained and robust. Best regards, Krzysztof ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH] arm64: defconfig: Enable cros-ec and battery driver 2016-05-27 8:37 ` Krzysztof Kozlowski 2016-05-27 9:19 ` Krzysztof Kozlowski @ 2016-05-27 10:28 ` Jon Hunter 2016-05-27 11:46 ` Krzysztof Kozlowski 1 sibling, 1 reply; 24+ messages in thread From: Jon Hunter @ 2016-05-27 10:28 UTC (permalink / raw) To: Krzysztof Kozlowski, Rhyland Klein, Thierry Reding, Sebastian Reichel, David Woodhouse, Dmitry Eremin-Solenikov Cc: Stephen Warren, Alexandre Courbot, linux-kernel, linux-tegra Hi Krzysztof, On 27/05/16 09:37, Krzysztof Kozlowski wrote: ... > Indeed I was struggling with similar issue in bq27x00_battery. The issue > was introduced by... me :( when moving the ownership of power supply > structure from driver to the core. However IMHO my change exposed the > fundamental problem with power supply. > > Anyway a fix for this issue was: > 7f1a57fdd6cb6e7b (power_supply: Fix possible NULL pointer dereference on > early uevent) > AFAIU, this fix no longer fixes all the issues, right? > > As for the fundamental problem, the power supply core should not call > back the driver (get_property()) until the probe ends. Even if the > di->bat was initialized, some other fields of driver could not be set > yet. In general, the probe did not end so we should avoid calling driver > internal functions. For my understanding, can you elaborate why the power-supply core should not call back to the drivers ->get_property() before the probe ends? I assume that registering the power-supply should be the last thing done in the probe and so the power-supply should be configured at that point. The problems with the bq27xxx seem to stem from the periodic update of the bq27xxx status and so it is not clear to me that this is a generic problem for all power-supply devices. Cheers Jon -- nvpublic ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH] arm64: defconfig: Enable cros-ec and battery driver 2016-05-27 10:28 ` Jon Hunter @ 2016-05-27 11:46 ` Krzysztof Kozlowski 2016-05-27 12:17 ` Jon Hunter 0 siblings, 1 reply; 24+ messages in thread From: Krzysztof Kozlowski @ 2016-05-27 11:46 UTC (permalink / raw) To: Jon Hunter, Rhyland Klein, Thierry Reding, Sebastian Reichel, David Woodhouse, Dmitry Eremin-Solenikov Cc: Stephen Warren, Alexandre Courbot, linux-kernel, linux-tegra On 05/27/2016 12:28 PM, Jon Hunter wrote: > Hi Krzysztof, > > On 27/05/16 09:37, Krzysztof Kozlowski wrote: > > ... > >> Indeed I was struggling with similar issue in bq27x00_battery. The issue >> was introduced by... me :( when moving the ownership of power supply >> structure from driver to the core. However IMHO my change exposed the >> fundamental problem with power supply. >> >> Anyway a fix for this issue was: >> 7f1a57fdd6cb6e7b (power_supply: Fix possible NULL pointer dereference on >> early uevent) >> AFAIU, this fix no longer fixes all the issues, right? >> >> As for the fundamental problem, the power supply core should not call >> back the driver (get_property()) until the probe ends. Even if the >> di->bat was initialized, some other fields of driver could not be set >> yet. In general, the probe did not end so we should avoid calling driver >> internal functions. > > For my understanding, can you elaborate why the power-supply core should > not call back to the drivers ->get_property() before the probe ends? I > assume that registering the power-supply should be the last thing done > in the probe and so the power-supply should be configured at that point. It is not only about power supply but other resources allocated by the driver. If the power_supply_register() is a last call, then no problem. But if not, then these resources won't be available. Actually I exaggerated a little bit as a fundamental problem as this is quite common pattern. When driver provides something (like power supply) then after registration it should be ready for calls coming from the core or user space. It does not have to be power supply. It might be exposing sysfs entries or file operations (exposed before calling power_supply_register()). > The problems with the bq27xxx seem to stem from the periodic update of > the bq27xxx status and so it is not clear to me that this is a generic > problem for all power-supply devices. Initially, the generic problem was that the core would call back the driver from power_supply_register() in a synchronous way through power_supply_changed(). The commit 7f1a57fdd6c changed it to an asynchronous call. Here it looks like the same problem - the power_supply_register() calls thermal which calls thermal_zone_device_update() and we are back at the driver... before finishing power_supply_register() call. Best regards, Krzysztof ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH] arm64: defconfig: Enable cros-ec and battery driver 2016-05-27 11:46 ` Krzysztof Kozlowski @ 2016-05-27 12:17 ` Jon Hunter 2016-05-27 12:55 ` Krzysztof Kozlowski 0 siblings, 1 reply; 24+ messages in thread From: Jon Hunter @ 2016-05-27 12:17 UTC (permalink / raw) To: Krzysztof Kozlowski, Rhyland Klein, Thierry Reding, Sebastian Reichel, David Woodhouse, Dmitry Eremin-Solenikov Cc: Stephen Warren, Alexandre Courbot, linux-kernel, linux-tegra On 27/05/16 12:46, Krzysztof Kozlowski wrote: > On 05/27/2016 12:28 PM, Jon Hunter wrote: >> Hi Krzysztof, >> >> On 27/05/16 09:37, Krzysztof Kozlowski wrote: >> >> ... >> >>> Indeed I was struggling with similar issue in bq27x00_battery. The issue >>> was introduced by... me :( when moving the ownership of power supply >>> structure from driver to the core. However IMHO my change exposed the >>> fundamental problem with power supply. >>> >>> Anyway a fix for this issue was: >>> 7f1a57fdd6cb6e7b (power_supply: Fix possible NULL pointer dereference on >>> early uevent) >>> AFAIU, this fix no longer fixes all the issues, right? >>> >>> As for the fundamental problem, the power supply core should not call >>> back the driver (get_property()) until the probe ends. Even if the >>> di->bat was initialized, some other fields of driver could not be set >>> yet. In general, the probe did not end so we should avoid calling driver >>> internal functions. >> >> For my understanding, can you elaborate why the power-supply core should >> not call back to the drivers ->get_property() before the probe ends? I >> assume that registering the power-supply should be the last thing done >> in the probe and so the power-supply should be configured at that point. > > It is not only about power supply but other resources allocated by the > driver. If the power_supply_register() is a last call, then no problem. > But if not, then these resources won't be available. > > Actually I exaggerated a little bit as a fundamental problem as this is > quite common pattern. When driver provides something (like power supply) > then after registration it should be ready for calls coming from the > core or user space. It does not have to be power supply. It might be > exposing sysfs entries or file operations (exposed before calling > power_supply_register()). Right, exactly when you register with the power-supply core the device better be ready so that handle any incoming calls. >> The problems with the bq27xxx seem to stem from the periodic update of >> the bq27xxx status and so it is not clear to me that this is a generic >> problem for all power-supply devices. > > Initially, the generic problem was that the core would call back the > driver from power_supply_register() in a synchronous way through > power_supply_changed(). The commit 7f1a57fdd6c changed it to an > asynchronous call. Here it looks like the same problem - the > power_supply_register() calls thermal which calls > thermal_zone_device_update() and we are back at the driver... before > finishing power_supply_register() call. So I am still not convinced this is a generic problem but a problem with the bq27xxx. In fact, I think that commit 7f1a57fdd6c could be avoided if we did something like ... http://marc.info/?l=linux-kernel&m=146425896332433&w=2 AFAICT in most cases, in ->get_property() you should have no need to access a driver's equivalent of di->bat, because you have already been passed a pointer to this via the *psy argument. Cheers Jon -- nvpublic ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH] arm64: defconfig: Enable cros-ec and battery driver 2016-05-27 12:17 ` Jon Hunter @ 2016-05-27 12:55 ` Krzysztof Kozlowski 2016-05-31 17:24 ` Jon Hunter 0 siblings, 1 reply; 24+ messages in thread From: Krzysztof Kozlowski @ 2016-05-27 12:55 UTC (permalink / raw) To: Jon Hunter, Rhyland Klein, Thierry Reding, Sebastian Reichel, David Woodhouse, Dmitry Eremin-Solenikov Cc: Stephen Warren, Alexandre Courbot, linux-kernel, linux-tegra On 05/27/2016 02:17 PM, Jon Hunter wrote: > > On 27/05/16 12:46, Krzysztof Kozlowski wrote: >> On 05/27/2016 12:28 PM, Jon Hunter wrote: >>> Hi Krzysztof, >>> >>> On 27/05/16 09:37, Krzysztof Kozlowski wrote: >>> >>> ... >>> >>>> Indeed I was struggling with similar issue in bq27x00_battery. The issue >>>> was introduced by... me :( when moving the ownership of power supply >>>> structure from driver to the core. However IMHO my change exposed the >>>> fundamental problem with power supply. >>>> >>>> Anyway a fix for this issue was: >>>> 7f1a57fdd6cb6e7b (power_supply: Fix possible NULL pointer dereference on >>>> early uevent) >>>> AFAIU, this fix no longer fixes all the issues, right? >>>> >>>> As for the fundamental problem, the power supply core should not call >>>> back the driver (get_property()) until the probe ends. Even if the >>>> di->bat was initialized, some other fields of driver could not be set >>>> yet. In general, the probe did not end so we should avoid calling driver >>>> internal functions. >>> >>> For my understanding, can you elaborate why the power-supply core should >>> not call back to the drivers ->get_property() before the probe ends? I >>> assume that registering the power-supply should be the last thing done >>> in the probe and so the power-supply should be configured at that point. >> >> It is not only about power supply but other resources allocated by the >> driver. If the power_supply_register() is a last call, then no problem. >> But if not, then these resources won't be available. >> >> Actually I exaggerated a little bit as a fundamental problem as this is >> quite common pattern. When driver provides something (like power supply) >> then after registration it should be ready for calls coming from the >> core or user space. It does not have to be power supply. It might be >> exposing sysfs entries or file operations (exposed before calling >> power_supply_register()). > > Right, exactly when you register with the power-supply core the device > better be ready so that handle any incoming calls. Yes, the unusual thing here is that the device is called back directly from the power_supply_register() call. > >>> The problems with the bq27xxx seem to stem from the periodic update of >>> the bq27xxx status and so it is not clear to me that this is a generic >>> problem for all power-supply devices. >> >> Initially, the generic problem was that the core would call back the >> driver from power_supply_register() in a synchronous way through >> power_supply_changed(). The commit 7f1a57fdd6c changed it to an >> asynchronous call. Here it looks like the same problem - the >> power_supply_register() calls thermal which calls >> thermal_zone_device_update() and we are back at the driver... before >> finishing power_supply_register() call. > > So I am still not convinced this is a generic problem but a problem with > the bq27xxx. In fact, I think that commit 7f1a57fdd6c could be avoided > if we did something like ... > > http://marc.info/?l=linux-kernel&m=146425896332433&w=2 > > AFAICT in most cases, in ->get_property() you should have no need to > access a driver's equivalent of di->bat, because you have already been > passed a pointer to this via the *psy argument. I agree that get_property() shouldn't access di->bat. However if it is not forbidden (at least by documentation) then someone might just do it because he does not know about such requirement. Best regards, Krzysztof ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH] arm64: defconfig: Enable cros-ec and battery driver 2016-05-27 12:55 ` Krzysztof Kozlowski @ 2016-05-31 17:24 ` Jon Hunter 0 siblings, 0 replies; 24+ messages in thread From: Jon Hunter @ 2016-05-31 17:24 UTC (permalink / raw) To: Krzysztof Kozlowski, Rhyland Klein, Thierry Reding, Sebastian Reichel, David Woodhouse, Dmitry Eremin-Solenikov Cc: Stephen Warren, Alexandre Courbot, linux-kernel, linux-tegra On 27/05/16 13:55, Krzysztof Kozlowski wrote: > On 05/27/2016 02:17 PM, Jon Hunter wrote: >> >> On 27/05/16 12:46, Krzysztof Kozlowski wrote: >>> On 05/27/2016 12:28 PM, Jon Hunter wrote: >>>> Hi Krzysztof, >>>> >>>> On 27/05/16 09:37, Krzysztof Kozlowski wrote: >>>> >>>> ... >>>> >>>>> Indeed I was struggling with similar issue in bq27x00_battery. The issue >>>>> was introduced by... me :( when moving the ownership of power supply >>>>> structure from driver to the core. However IMHO my change exposed the >>>>> fundamental problem with power supply. >>>>> >>>>> Anyway a fix for this issue was: >>>>> 7f1a57fdd6cb6e7b (power_supply: Fix possible NULL pointer dereference on >>>>> early uevent) >>>>> AFAIU, this fix no longer fixes all the issues, right? >>>>> >>>>> As for the fundamental problem, the power supply core should not call >>>>> back the driver (get_property()) until the probe ends. Even if the >>>>> di->bat was initialized, some other fields of driver could not be set >>>>> yet. In general, the probe did not end so we should avoid calling driver >>>>> internal functions. >>>> >>>> For my understanding, can you elaborate why the power-supply core should >>>> not call back to the drivers ->get_property() before the probe ends? I >>>> assume that registering the power-supply should be the last thing done >>>> in the probe and so the power-supply should be configured at that point. >>> >>> It is not only about power supply but other resources allocated by the >>> driver. If the power_supply_register() is a last call, then no problem. >>> But if not, then these resources won't be available. >>> >>> Actually I exaggerated a little bit as a fundamental problem as this is >>> quite common pattern. When driver provides something (like power supply) >>> then after registration it should be ready for calls coming from the >>> core or user space. It does not have to be power supply. It might be >>> exposing sysfs entries or file operations (exposed before calling >>> power_supply_register()). >> >> Right, exactly when you register with the power-supply core the device >> better be ready so that handle any incoming calls. > > Yes, the unusual thing here is that the device is called back directly > from the power_supply_register() call. > >> >>>> The problems with the bq27xxx seem to stem from the periodic update of >>>> the bq27xxx status and so it is not clear to me that this is a generic >>>> problem for all power-supply devices. >>> >>> Initially, the generic problem was that the core would call back the >>> driver from power_supply_register() in a synchronous way through >>> power_supply_changed(). The commit 7f1a57fdd6c changed it to an >>> asynchronous call. Here it looks like the same problem - the >>> power_supply_register() calls thermal which calls >>> thermal_zone_device_update() and we are back at the driver... before >>> finishing power_supply_register() call. >> >> So I am still not convinced this is a generic problem but a problem with >> the bq27xxx. In fact, I think that commit 7f1a57fdd6c could be avoided >> if we did something like ... >> >> http://marc.info/?l=linux-kernel&m=146425896332433&w=2 >> >> AFAICT in most cases, in ->get_property() you should have no need to >> access a driver's equivalent of di->bat, because you have already been >> passed a pointer to this via the *psy argument. > > I agree that get_property() shouldn't access di->bat. However if it is > not forbidden (at least by documentation) then someone might just do it > because he does not know about such requirement. In that case, shouldn't the driver should check that di->bat is valid before anyone attempts to dereference it? However, if you and/or Rhyland have a generic fix for preventing this, please go ahead and propose it. Cheers Jon -- nvpublic ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH] arm64: defconfig: Enable cros-ec and battery driver 2016-05-25 16:29 ` Jon Hunter 2016-05-25 16:36 ` Rhyland Klein @ 2016-05-25 16:36 ` Jon Hunter 1 sibling, 0 replies; 24+ messages in thread From: Jon Hunter @ 2016-05-25 16:36 UTC (permalink / raw) To: Rhyland Klein, Thierry Reding, Krzysztof Kozlowski Cc: Stephen Warren, Alexandre Courbot, linux-kernel, linux-tegra On 25/05/16 17:29, Jon Hunter wrote: > > On 25/05/16 17:10, Jon Hunter wrote: > > ... > >> So power_supply_read_temp() calls ->get_property() and passes the >> power_supply psy struct which is initialised. The problem is that inside >> the bq27xxx driver, this then kicks off the worker thread to update the >> bq27xxx state and when this worker thread runs it attempts to access the >> same psy struct but by dereferencing a pointer to it from the >> bq27xxx_device_info where the pointer has not been initialised yet. >> Therefore, IMO it seems that we should not allow this worker thread to >> start until the registration has completed and hence the pointer is >> initialised. > > Sorry, it is not the actual worker thread that triggers the NULL pointer > deference, but the function bq27xxx_battery_poll() that schedules the > worker thread. Anyway, I still don't see that we need to update the > bq27xxx state during the registration especially seeing as we call > bq27xxx_battery_update() after the registration is complete. It seems > that updating the overall state should be mutually exclusive from > reading the temp. > > Looking at my patch, it does appear that the worker thread which also > calls bq27xxx_battery_update() is still scheduled and so may be it > should be ... > > diff --git a/drivers/power/bq27xxx_battery.c b/drivers/power/bq27xxx_battery.c > index 45f6ebf88df6..1334ed522332 100644 > --- a/drivers/power/bq27xxx_battery.c > +++ b/drivers/power/bq27xxx_battery.c > @@ -733,6 +733,9 @@ static void bq27xxx_battery_poll(struct work_struct *work) > container_of(work, struct bq27xxx_device_info, > work.work); > > + if (!di->bat) > + return; > + > bq27xxx_battery_update(di); > > if (poll_interval > 0) { Or actually ... diff --git a/drivers/power/bq27xxx_battery.c b/drivers/power/bq27xxx_battery.c index 45f6ebf88df6..9257676be105 100644 --- a/drivers/power/bq27xxx_battery.c +++ b/drivers/power/bq27xxx_battery.c @@ -733,6 +733,9 @@ static void bq27xxx_battery_poll(struct work_struct *work) container_of(work, struct bq27xxx_device_info, work.work); + if (!di->bat) + return; + bq27xxx_battery_update(di); if (poll_interval > 0) { @@ -984,7 +987,7 @@ int bq27xxx_battery_setup(struct bq27xxx_device_info *di) dev_info(di->dev, "support ver. %s enabled\n", DRIVER_VERSION); - bq27xxx_battery_update(di); + bq27xxx_battery_poll(&di->work.work); return 0; } -- nvpublic ^ permalink raw reply related [flat|nested] 24+ messages in thread
* Re: [PATCH] arm64: defconfig: Enable cros-ec and battery driver 2016-05-25 15:46 ` Thierry Reding 2016-05-25 15:55 ` Rhyland Klein @ 2016-05-25 15:57 ` Jon Hunter 1 sibling, 0 replies; 24+ messages in thread From: Jon Hunter @ 2016-05-25 15:57 UTC (permalink / raw) To: Thierry Reding Cc: Rhyland Klein, Stephen Warren, Alexandre Courbot, linux-kernel, linux-tegra On 25/05/16 16:46, Thierry Reding wrote: ... > How about this, which should be the most minimal to fix it (though it's > completely untested) and still update the internal cache (it just won't > signal an supply change, which wouldn't work at this point anyway). The > patch makes up for the supply change notification by doing that instead > of a full bq27xxx_battery_update() at the end of ->probe(). This should > take care of always sending out a uevent on successful probe, whereas a > bq27xxx_battery_update() at the end of ->probe() may not send one if it > is presented with the same data. > > Thierry > --- >8 --- > diff --git a/drivers/power/bq27xxx_battery.c b/drivers/power/bq27xxx_battery.c > index 45f6ebf88df6..df1b4cb2bbc2 100644 > --- a/drivers/power/bq27xxx_battery.c > +++ b/drivers/power/bq27xxx_battery.c > @@ -717,7 +717,13 @@ void bq27xxx_battery_update(struct bq27xxx_device_info *di) > di->charge_design_full = bq27xxx_battery_read_dcap(di); > } > > - if (di->cache.capacity != cache.capacity) > + /* > + * This function ends up being called while the power supply is being > + * registered, hence di->bat will be NULL on the first call, causing > + * power_supply_changed() to oops. Avoid that by checking if we have > + * been registered already or not. > + */ > + if (di->bat && di->cache.capacity != cache.capacity) > power_supply_changed(di->bat); > > if (memcmp(&di->cache, &cache, sizeof(cache)) != 0) > @@ -984,7 +990,7 @@ int bq27xxx_battery_setup(struct bq27xxx_device_info *di) > > dev_info(di->dev, "support ver. %s enabled\n", DRIVER_VERSION); > > - bq27xxx_battery_update(di); > + power_supply_changed(di->bat); > > return 0; > } I think that would work too, my only concern is that this assumes that bq27xxx_battery_update() is called during the registration of the power supply. Looking at the backtrace from the panic we have ... [ 1.984150] [<ffff000008614984>] bq27xxx_battery_update+0x88/0x51c [ 1.990321] [<ffff000008615084>] bq27xxx_battery_poll+0x24/0x70 [ 1.996231] [<ffff000008615180>] bq27xxx_battery_get_property+0xb0/0x3b4 [ 2.002923] [<ffff0000086133d8>] power_supply_read_temp+0x2c/0x54 [ 2.009005] [<ffff000008616508>] thermal_zone_get_temp+0x5c/0x11c [ 2.015089] [<ffff0000086183b0>] thermal_zone_device_update+0x34/0xb4 [ 2.021518] [<ffff0000086193b4>] thermal_zone_device_register+0x87c/0x8cc [ 2.028295] [<ffff000008613b6c>] __power_supply_register+0x370/0x430 [ 2.034638] [<ffff000008613c54>] power_supply_register_no_ws+0x10/0x18 [ 2.041155] [<ffff000008614f1c>] bq27xxx_battery_setup+0x104/0x15c [ 2.047325] [<ffff000008615668>] bq27xxx_battery_i2c_probe+0xd0/0x1b0 Here bq27xxx_battery_update() is being called during the thermal zone registration and so as long as all bq27xxx devices have a POWER_SUPPLY_PROP_TEMP property then it *should* be ok. It would only break if there was a new bq27xxx with no temp support. May be that is a bit fragile and we are better off explicitly calling bq27xxx_battery_update()? Cheers Jon -- nvpublic ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH] arm64: defconfig: Enable cros-ec and battery driver 2016-05-25 11:03 ` Jon Hunter 2016-05-25 15:46 ` Thierry Reding @ 2016-05-25 15:49 ` Rhyland Klein 1 sibling, 0 replies; 24+ messages in thread From: Rhyland Klein @ 2016-05-25 15:49 UTC (permalink / raw) To: Jon Hunter, Thierry Reding, Stephen Warren, Alexandre Courbot Cc: linux-kernel, linux-tegra On 5/25/2016 7:03 AM, Jon Hunter wrote: > > On 25/05/16 11:58, Jon Hunter wrote: > > ... I am aware of the splat, and I was considering the proper place for working around that. > >> Looking at this a bit more I am wondering if we should prevent the >> battery for being polled before the registration has completed ... >> >> diff --git a/drivers/power/bq27xxx_battery.c >> b/drivers/power/bq27xxx_battery.c >> index 45f6ebf88df6..32649183ecd9 100644 >> --- a/drivers/power/bq27xxx_battery.c >> +++ b/drivers/power/bq27xxx_battery.c >> @@ -871,12 +871,14 @@ static int bq27xxx_battery_get_property(struct >> power_supply *psy, >> int ret = 0; >> struct bq27xxx_device_info *di = power_supply_get_drvdata(psy); >> >> - mutex_lock(&di->lock); >> - if (time_is_before_jiffies(di->last_update + 5 * HZ)) { >> - cancel_delayed_work_sync(&di->work); >> - bq27xxx_battery_poll(&di->work.work); >> + if (di->bat) { >> + mutex_lock(&di->lock); >> + if (time_is_before_jiffies(di->last_update + 5 * HZ)) { >> + cancel_delayed_work_sync(&di->work); >> + bq27xxx_battery_poll(&di->work.work); >> + } >> + mutex_unlock(&di->lock); >> } >> - mutex_unlock(&di->lock); > > Alternatively, maybe the following is simpler ... > > diff --git a/drivers/power/bq27xxx_battery.c > b/drivers/power/bq27xxx_battery.c > index 45f6ebf88df6..8a713b52e9f6 100644 > --- a/drivers/power/bq27xxx_battery.c > +++ b/drivers/power/bq27xxx_battery.c > @@ -733,7 +733,8 @@ static void bq27xxx_battery_poll(struct work_struct > *work) > container_of(work, struct bq27xxx_device_info, > work.work); > > - bq27xxx_battery_update(di); > + if (di->bat) > + bq27xxx_battery_update(di); > While that might get around the problem, I don't think the fix should be inside the bq27xxx driver. The problem is that the core is calling : __power_supply_register-> psy_register_thermal()-> thermal_zone_device_register()-> thermal_zone_device_update()-> thermal_zone_get_temp()-> power_supply_read_temp() then power_supply_read_temp() will attempt to use the driver's callback get_property method passing it uncompletely initialized struct. If you notice, there are already other places inside power_supply_core.c where use_cnt is used to block calls that would reach back to the get_property callbacks. I don't think it would be bad to have sanity checks in those callbacks for NULL pointers, but the main problem is that in this path, the core should know not to call a get_property callback during registration (before use_cnt is incremented). This is closely related to this patch in the power_supply_core.c commit 7f1a57fdd6cb6e7be2ed31878a34655df38e1861 Author: Krzysztof Kozlowski <k.kozlowski@samsung.com> Date: Tue May 19 16:13:02 2015 +0900 power_supply: Fix possible NULL pointer dereference on early uevent Don't call the power_supply_changed() from power_supply_register() when parent is still probing because it may lead to accessing parent too early. ... Its just another situation where get_property is called prematurely. -rhyland -- nvpublic ^ permalink raw reply [flat|nested] 24+ messages in thread
end of thread, other threads:[~2016-05-31 17:25 UTC | newest] Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2016-05-03 15:45 [PATCH] arm64: defconfig: Enable cros-ec and battery driver Rhyland Klein 2016-05-19 17:20 ` Rhyland Klein 2016-05-24 14:09 ` Jon Hunter 2016-05-24 19:08 ` Rhyland Klein 2016-05-25 10:58 ` Jon Hunter 2016-05-25 11:03 ` Jon Hunter 2016-05-25 15:46 ` Thierry Reding 2016-05-25 15:55 ` Rhyland Klein 2016-05-25 16:10 ` Jon Hunter 2016-05-25 16:29 ` Jon Hunter 2016-05-25 16:36 ` Rhyland Klein 2016-05-25 17:26 ` Jon Hunter 2016-05-25 19:44 ` Rhyland Klein 2016-05-26 10:35 ` Jon Hunter 2016-05-27 8:37 ` Krzysztof Kozlowski 2016-05-27 9:19 ` Krzysztof Kozlowski 2016-05-27 10:28 ` Jon Hunter 2016-05-27 11:46 ` Krzysztof Kozlowski 2016-05-27 12:17 ` Jon Hunter 2016-05-27 12:55 ` Krzysztof Kozlowski 2016-05-31 17:24 ` Jon Hunter 2016-05-25 16:36 ` Jon Hunter 2016-05-25 15:57 ` Jon Hunter 2016-05-25 15:49 ` Rhyland Klein
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).