From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1751127AbdEBDMg (ORCPT ); Mon, 1 May 2017 23:12:36 -0400
Received: from scorn.kernelslacker.org ([45.56.101.199]:46668 "EHLO
	scorn.kernelslacker.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1751045AbdEBDMf (ORCPT ); Mon, 1 May 2017 23:12:35 -0400
Date: Mon, 1 May 2017 23:12:30 -0400
From: Dave Jones
To: Linux Kernel Mailing List
Cc: Arun Raghavan, Thomas Gleixner, Linus Torvalds
Subject: Re: rlimits: Print more information when CPU/RT limits are exceeded
Message-ID: <20170502031230.jz5b6witmmujsdmq@codemonkey.org.uk>
Mail-Followup-To: Dave Jones, Linux Kernel Mailing List, Arun Raghavan,
	Thomas Gleixner, Linus Torvalds
References: <20170501232152.8142E661220@gitolite.kernel.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20170501232152.8142E661220@gitolite.kernel.org>
User-Agent: NeoMutt/20170306 (1.8.0)
X-Spam-Note: SpamAssassin invocation failed
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, May 01, 2017 at 11:21:52PM +0000, Linux Kernel wrote:
> Web:        https://git.kernel.org/torvalds/c/e7ea7c9806a2681807257ea89085339d33f7fa0b
> Commit:     e7ea7c9806a2681807257ea89085339d33f7fa0b
> Parent:     4495c08e84729385774601b5146d51d9e5849f81
> Refname:    refs/heads/master
> Author:     Arun Raghavan
> AuthorDate: Wed Mar 1 20:23:09 2017 +0530
> Committer:  Thomas Gleixner
> CommitDate: Mon Mar 13 21:32:15 2017 +0100
>
>     rlimits: Print more information when CPU/RT limits are exceeded
>
>     When a process is sent a SIGKILL because it exceeded CPU or RT limits,
>     the cause may not be obvious in userspace -- daemonised processes just
>     get killed, and even foreground process just see a 'Killed' message. The
>     lack of any information on why this might be happening in logs can be
>     confusing to users who are not aware of this mechanism.
>
>     Add messages which dump the process name and tid in dmesg when a process
>     exceeds its CPU or RT limits (soft and hard) in order to make it clearer to
>     people debugging such issues.
>
>     Signed-off-by: Arun Raghavan
>     Link: http://lkml.kernel.org/r/20170301145309.27214-1-arun@arunraghavan.net
>     Signed-off-by: Thomas Gleixner

This needs to be configurable, because this is really obnoxious..

[  121.042170] RT Watchdog Timeout (hard): trinity-c40[7533]
[  125.670948] RT Watchdog Timeout (hard): trinity-c29[1612]
[  200.631968] CPU Watchdog Timeout (soft): trinity-c33[11454]
[  200.644308] CPU Watchdog Timeout (soft): trinity-c33[11454]
[  200.656551] CPU Watchdog Timeout (soft): trinity-c33[11454]
[  213.454504] CPU Watchdog Timeout (soft): trinity-c33[11454]
[  276.787116] CPU Watchdog Timeout (soft): trinity-c22[23943]
[  285.857773] CPU Watchdog Timeout (soft): trinity-c33[24908]
[  287.236710] CPU Watchdog Timeout (soft): trinity-c22[23943]
[  295.186400] CPU Watchdog Timeout (soft): trinity-c33[24908]
[  296.464352] CPU Watchdog Timeout (soft): trinity-c22[23943]
[  305.164011] CPU Watchdog Timeout (soft): trinity-c33[24908]
[  367.123564] CPU Watchdog Timeout (soft): trinity-c8[1472]
[  373.950321] CPU Watchdog Timeout (soft): trinity-c8[1472]
[  381.415054] CPU Watchdog Timeout (soft): trinity-c8[1472]
[  389.621759] CPU Watchdog Timeout (soft): trinity-c8[1472]
[  463.940996] CPU Watchdog Timeout (soft): trinity-c29[7725]
[  463.952215] CPU Watchdog Timeout (soft): trinity-c29[7725]
[  463.963306] CPU Watchdog Timeout (soft): trinity-c29[7725]
[  555.264401] RT Watchdog Timeout (hard): trinity-c58[19175]
[  610.536159] RT Watchdog Timeout (hard): trinity-c2[28363]
[  741.785688] CPU Watchdog Timeout (soft): trinity-c46[5034]
[  741.796600] CPU Watchdog Timeout (soft): trinity-c46[5034]
[  741.807384] CPU Watchdog Timeout (soft): trinity-c46[5034]
[  741.972679] CPU Watchdog Timeout (soft): trinity-c46[5034]
[  743.115630] CPU Watchdog Timeout (soft): trinity-c46[5034]
[  743.128628] CPU Watchdog Timeout (soft): trinity-c12[2803]
[  743.139032] CPU Watchdog Timeout (soft): trinity-c12[2803]
[  743.149276] CPU Watchdog Timeout (soft): trinity-c12[2803]
[  823.866183] CPU Watchdog Timeout (soft): trinity-c21[15684]
[  892.151230] CPU Watchdog Timeout (soft): trinity-c56[22668]
[  892.161239] CPU Watchdog Timeout (soft): trinity-c56[22668]
[  899.818894] CPU Watchdog Timeout (soft): trinity-c4[22072]
[  899.828718] CPU Watchdog Timeout (soft): trinity-c4[22072]
[  899.838439] CPU Watchdog Timeout (soft): trinity-c4[22072]
[  905.253660] CPU Watchdog Timeout (soft): trinity-c56[22668]
[  907.297573] CPU Watchdog Timeout (soft): trinity-c39[24134]
[  907.307170] CPU Watchdog Timeout (soft): trinity-c39[24134]
[  907.519560] CPU Watchdog Timeout (soft): trinity-c4[22072]
[  940.624120] RT Watchdog Timeout (hard): trinity-c34[30478]
[  959.189311] CPU Watchdog Timeout (soft): trinity-c61[31012]
[  968.873887] CPU Watchdog Timeout (soft): trinity-c61[31012]
[  980.305390] CPU Watchdog Timeout (soft): trinity-c61[31012]
[  992.532852] CPU Watchdog Timeout (soft): trinity-c61[31012]
[ 1032.126118] CPU Watchdog Timeout (soft): trinity-c34[5861]
[ 1036.723920] CPU Watchdog Timeout (soft): trinity-c34[5861]
[ 1046.628487] CPU Watchdog Timeout (soft): trinity-c34[5861]
[ 1054.169156] CPU Watchdog Timeout (soft): trinity-c34[5861]
[ 1064.102718] CPU Watchdog Timeout (soft): trinity-c34[5861]
[ 1076.722166] CPU Watchdog Timeout (soft): trinity-c34[5861]
[ 1129.844831] CPU Watchdog Timeout (soft): trinity-c4[14714]
[ 1134.980606] CPU Watchdog Timeout (soft): trinity-c4[14714]
[ 1146.512098] CPU Watchdog Timeout (soft): trinity-c4[14714]
[ 1158.420576] CPU Watchdog Timeout (soft): trinity-c4[14714]
[ 1170.275054] CPU Watchdog Timeout (soft): trinity-c4[14714]

	Dave
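
To put a number on how repetitive the output above is, the messages can be tallied per task. The following is only a rough sketch and not part of the original mail: the names `PATTERN`, `summarize` and `SAMPLE` are made up for illustration, the regular expression assumes the exact message layout shown in the log above, and the sample is just the first few lines of that log.

```python
import re
from collections import Counter

# Assumed line layout, taken from the log in this mail:
#   [  200.631968] CPU Watchdog Timeout (soft): trinity-c33[11454]
PATTERN = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+"
    r"(?P<kind>CPU|RT) Watchdog Timeout \((?P<level>soft|hard)\): "
    r"(?P<comm>.+)\[(?P<tid>\d+)\]"
)


def summarize(log_text):
    """Count watchdog messages per (kind, level, comm, tid)."""
    counts = Counter()
    for line in log_text.splitlines():
        m = PATTERN.search(line)
        if m:  # lines that are not watchdog messages are skipped
            key = (m["kind"], m["level"], m["comm"], int(m["tid"]))
            counts[key] += 1
    return counts


# A few lines copied from the log above, used as sample input.
SAMPLE = """\
[  121.042170] RT Watchdog Timeout (hard): trinity-c40[7533]
[  200.631968] CPU Watchdog Timeout (soft): trinity-c33[11454]
[  200.644308] CPU Watchdog Timeout (soft): trinity-c33[11454]
[  200.656551] CPU Watchdog Timeout (soft): trinity-c33[11454]
"""

if __name__ == "__main__":
    # Most frequent offenders first.
    for (kind, level, comm, tid), n in summarize(SAMPLE).most_common():
        print(f"{n:3d}  {kind} ({level})  {comm}[{tid}]")
```

Feeding it the full `dmesg` output instead of `SAMPLE` would show which tasks account for most of the repeated soft-limit lines.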