[PATCH] hangcheck-timer is broken on x86

* [PATCH] hangcheck-timer is broken on x86
@ 2010-03-24  3:36 Yury Polyanskiy
  2010-03-26 21:24 ` Andrew Morton
  2010-03-26 21:46 ` Joel Becker
  0 siblings, 2 replies; 21+ messages in thread
From: Yury Polyanskiy @ 2010-03-24  3:36 UTC (permalink / raw)
  To: Joel Becker; +Cc: linux-kernel, Andrew Morton, john stultz, Jan Glauber

[-- Attachment #1: Type: text/plain, Size: 2529 bytes --]

Dear Joel,

The drivers/char/hangcheck-timer.c is doubly broken. First, the
following line overflows unsigned long:
# define TIMER_FREQ (HZ*loops_per_jiffy)

Second, and more importantly, loops_per_jiffy has little to do with the conversion from the
the time scale of get_cycles() (aka rdtsc) to the time scale of jiffies.

The attached patch resolves both of the problems.

Best,
Yury

PS. I have CC-ed a few people who touched hangcheck-timer.c in 2006. Please CC me for follow-ups.

diff --git a/drivers/char/hangcheck-timer.c b/drivers/char/hangcheck-timer.c
index 712d9f2..c57a508 100644
--- a/drivers/char/hangcheck-timer.c
+++ b/drivers/char/hangcheck-timer.c
@@ -49,8 +49,9 @@
 #include <asm/uaccess.h>
 #include <linux/sysrq.h>
 #include <linux/timer.h>
+#include <linux/time.h>
 
-#define VERSION_STR "0.9.0"
+#define VERSION_STR "0.9.1"
 
 #define DEFAULT_IOFENCE_MARGIN 60	/* Default fudge factor, in seconds */
 #define DEFAULT_IOFENCE_TICK 180	/* Default timer timeout, in seconds */
@@ -119,10 +120,8 @@ __setup("hcheck_dump_tasks", hangcheck_parse_dump_tasks);
 #if defined(CONFIG_S390)
 # define HAVE_MONOTONIC
 # define TIMER_FREQ 1000000000ULL
-#elif defined(CONFIG_IA64)
-# define TIMER_FREQ ((unsigned long long)local_cpu_data->itc_freq)
 #else
-# define TIMER_FREQ (HZ*loops_per_jiffy)
+# define TIMER_FREQ 1000000000ULL
 #endif
 
 #ifdef HAVE_MONOTONIC
@@ -130,7 +129,9 @@ extern unsigned long long monotonic_clock(void);
 #else
 static inline unsigned long long monotonic_clock(void)
 {
-	return get_cycles();
+	struct timespec ts;
+	getrawmonotonic(&ts);
+	return timespec_to_ns(&ts);
 }
 #endif  /* HAVE_MONOTONIC */
 
@@ -168,6 +169,13 @@ static void hangcheck_fire(unsigned long data)
 			printk(KERN_CRIT "Hangcheck: hangcheck value past margin!\n");
 		}
 	}
+#if 0
+	/* 
+	 * Enable to investigate delays in detail
+	 */
+	printk("Hangcheck: called %Ld ns since last time (%Ld ns overshoot)\n", 
+			tsc_diff, tsc_diff - hangcheck_tick*TIMER_FREQ);
+#endif
 	mod_timer(&hangcheck_ticktock, jiffies + (hangcheck_tick*HZ));
 	hangcheck_tsc = monotonic_clock();
 }
@@ -180,7 +188,7 @@ static int __init hangcheck_init(void)
 #if defined (HAVE_MONOTONIC)
 	printk("Hangcheck: Using monotonic_clock().\n");
 #else
-	printk("Hangcheck: Using get_cycles().\n");
+	printk("Hangcheck: Using getrawmonotonic().\n");
 #endif  /* HAVE_MONOTONIC */
 	hangcheck_tsc_margin =
 		(unsigned long long)(hangcheck_margin + hangcheck_tick);

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 198 bytes --]

^ permalink raw reply related	[flat|nested] 21+ messages in thread