Hi!

> >> +Under certain circumstances, the SoC reaches a temperature exceeding
> >> +the allocated power budget or the maximum temperature limit. The
> >
> > I don't understand. Power budget is in W, temperature is in
> > kelvin. Temperature can't exceed power budget AFAICT.
>
> Yes, it is badly worded. Is the following better ?
>
> "
> Under certain circumstances a SoC can reach the maximum temperature
> limit or is unable to stabilize the temperature around a temperature
> control.
>
> When the SoC has to stabilize the temperature, the kernel can act on a
> cooling device to mitigate the dissipated power.
>
> When the maximum temperature is reached and to prevent a catastrophic
> situation a radical decision must be taken to reduce the temperature
> under the critical threshold, that impacts the performance.
>
> "

Actually... if hardware is expected to protect itself, I'd tone it
down. No need to be all catastrophic and critical... But yes, better.

> > Critical here, critical there. I have trouble following
> > it. Theoretically hardware should protect itself, because you don't
> > want kernel bug to damage your CPU?
>
> There are several levels of protection. The first level is mitigating
> the temperature from the kernel, then in the temperature sensor a reset
> line will trigger the reboot of the CPUs. Usually it is a register where
> you write the maximum temperature, from the driver itself. I never tried
> to write 1000°C in this register and see if I can burn the board.
>
> I know some boards have another level of thermal protection in the
> hardware itself and some other don't.
>
> In any case, from a kernel point of view, it is a critical situation as
> we are about to hard reboot the system and in this case it is preferable
> to drop drastically the performance but give the opportunity to the
> system to run in a degraded mode.

Agreed you want to keep going. In ACPI world, we shutdown when
critical trip point is reached, so this is somehow confusing.

> >> +Solutions:
> >> +----------
> >> +
> >> +If we can remove the static and the dynamic leakage for a specific
> >> +duration in a controlled period, the SoC temperature will
> >> +decrease. Acting at the idle state duration or the idle cycle
> >
> > "should" decrease? If you are in bad environment..
>
> No, it will decrease in any case because of the static leakage drop. The
> bad environment will impact the speed of this decrease.

I meant... if ambient temperature is 105C, there's not much you can do
to cool system down :-).

> >> +Idle Injection:
> >> +---------------
> >> +
> >> +The base concept of the idle injection is to force the CPU to go to an
> >> +idle state for a specified time each control cycle, it provides
> >> +another way to control CPU power and heat in addition to
> >> +cpufreq. Ideally, if all CPUs of a cluster inject idle synchronously,
> >> +this cluster can get into the deepest idle state and achieve minimum
> >> +power consumption, but that will also increase system response latency
> >> +if we inject less than cpuidle latency.
> >
> > I don't understand last sentence.
>
> Is it better ?
>
> "Ideally, if all CPUs, belonging to the same cluster, inject their idle
> cycle synchronously, the cluster can reach its power down state with a
> minimum power consumption and static leakage drop. However, these idle
> cycles injection will add extra latencies as the CPUs will have to
> wakeup from a deep sleep state."

Extra comma "CPUs , belonging". But yes, better.

> Thanks!

You are welcome.

Best regards,
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html