From mboxrd@z Thu Jan 1 00:00:00 1970
From: Chris Metcalf
To: Gilad Ben Yossef, Steven Rostedt, Ingo Molnar, Peter Zijlstra,
	Andrew Morton, "Rik van Riel", Tejun Heo, Frederic Weisbecker,
	Thomas Gleixner, "Paul E. McKenney", Christoph Lameter,
	Viresh Kumar, Catalin Marinas, Will Deacon, Andy Lutomirski
CC: Chris Metcalf
Subject: [PATCH v9 06/13] task_isolation: add debug boot flag
Date: Mon, 4 Jan 2016 14:34:44 -0500
Message-ID: <1451936091-29247-7-git-send-email-cmetcalf@ezchip.com>
X-Mailer: git-send-email 2.1.2
In-Reply-To: <1451936091-29247-1-git-send-email-cmetcalf@ezchip.com>
References: <1451936091-29247-1-git-send-email-cmetcalf@ezchip.com>
MIME-Version: 1.0
Content-Type: text/plain
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

The new "task_isolation_debug" flag simplifies debugging of
TASK_ISOLATION kernels when processes are running in
PR_TASK_ISOLATION_ENABLE mode.  Such processes should get no
interrupts from the kernel; if they do and this boot flag is
specified, a kernel stack dump is generated on the console.

It's possible to use ftrace to simply detect whether a task_isolation
core has unexpectedly entered the kernel.  But what this boot flag
does is allow the kernel to provide better diagnostics, e.g. by
reporting in the IPI-generating code what remote core and context is
preparing to deliver an interrupt to a task_isolation core.

It may be worth considering other ways to generate useful debugging
output rather than console spew, but for now that is simple and direct.

Signed-off-by: Chris Metcalf
---
 Documentation/kernel-parameters.txt |  8 +++++
 include/linux/isolation.h           |  5 ++++
 kernel/irq_work.c                   |  5 +++-
 kernel/isolation.c                  | 60 +++++++++++++++++++++++++++++++++++++
 kernel/sched/core.c                 | 18 +++++++++++
 kernel/signal.c                     |  5 ++++
 kernel/smp.c                        |  6 +++-
 kernel/softirq.c                    | 33 ++++++++++++++++++++
 8 files changed, 138 insertions(+), 2 deletions(-)

diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index e035679e646e..112fba1727f4 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -3673,6 +3673,14 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
 			also sets up nohz_full and isolcpus mode for the
 			listed set of cpus.
 
+	task_isolation_debug	[KNL]
+			In kernels built with CONFIG_TASK_ISOLATION
+			and booted in task_isolation= mode, this
+			setting will generate console backtraces when
+			the kernel is about to interrupt a task that
+			has requested PR_TASK_ISOLATION_ENABLE and is
+			running on a task_isolation core.
+
 	tcpmhash_entries= [KNL,NET]
 			Set the number of tcp_metrics_hash slots.
 			Default value is 8192 or 16384 depending on total
diff --git a/include/linux/isolation.h b/include/linux/isolation.h
index 69a3e4c59ab3..3e15e75d078f 100644
--- a/include/linux/isolation.h
+++ b/include/linux/isolation.h
@@ -43,6 +43,9 @@ static inline void task_isolation_enter(void)
 extern bool task_isolation_syscall(int nr);
 extern void task_isolation_exception(const char *fmt, ...);
 extern void task_isolation_interrupt(struct task_struct *, const char *buf);
+extern void task_isolation_debug(int cpu);
+extern void task_isolation_debug_cpumask(const struct cpumask *);
+extern void task_isolation_debug_task(int cpu, struct task_struct *p);
 
 static inline bool task_isolation_strict(void)
 {
@@ -70,6 +73,8 @@ static inline bool task_isolation_ready(void) { return true; }
 static inline void task_isolation_enter(void) { }
 static inline bool task_isolation_check_syscall(int nr) { return false; }
 static inline void task_isolation_check_exception(const char *fmt, ...) { }
+static inline void task_isolation_debug(int cpu) { }
+#define task_isolation_debug_cpumask(mask) do {} while (0)
 #endif
 
 #endif
diff --git a/kernel/irq_work.c b/kernel/irq_work.c
index bcf107ce0854..a9b95ce00667 100644
--- a/kernel/irq_work.c
+++ b/kernel/irq_work.c
@@ -17,6 +17,7 @@
 #include
 #include
 #include
+#include
 #include
 
 
@@ -75,8 +76,10 @@ bool irq_work_queue_on(struct irq_work *work, int cpu)
 	if (!irq_work_claim(work))
 		return false;
 
-	if (llist_add(&work->llnode, &per_cpu(raised_list, cpu)))
+	if (llist_add(&work->llnode, &per_cpu(raised_list, cpu))) {
+		task_isolation_debug(cpu);
 		arch_send_call_function_single_ipi(cpu);
+	}
 
 	return true;
 }
diff --git a/kernel/isolation.c b/kernel/isolation.c
index 29ffb21ada0b..9f31c0b458ed 100644
--- a/kernel/isolation.c
+++ b/kernel/isolation.c
@@ -11,6 +11,7 @@
 #include
 #include
 #include
+#include
 #include
 #include "time/tick-sched.h"
 
@@ -163,3 +164,62 @@ bool task_isolation_syscall(int syscall)
 	task_isolation_exception("syscall %d", syscall);
 	return true;
 }
+
+/* Enable debugging of any interrupts of task_isolation cores. */
+static int task_isolation_debug_flag;
+static int __init task_isolation_debug_func(char *str)
+{
+	task_isolation_debug_flag = true;
+	return 1;
+}
+__setup("task_isolation_debug", task_isolation_debug_func);
+
+void task_isolation_debug_task(int cpu, struct task_struct *p)
+{
+	static DEFINE_RATELIMIT_STATE(console_output, HZ, 1);
+	bool force_debug = false;
+
+	/*
+	 * Our caller made sure the task was running on a task isolation
+	 * core, but make sure the task has enabled isolation.
+	 */
+	if (!(p->task_isolation_flags & PR_TASK_ISOLATION_ENABLE))
+		return;
+
+	/*
+	 * If the task was in strict mode, deliver a signal to it.
+	 * We disable task isolation mode when we deliver a signal
+	 * so we won't end up recursing back here again.
+	 * If we are in an NMI, we don't try delivering the signal
+	 * and instead just treat it as if "debug" mode was enabled,
+	 * since that's pretty much all we can do.
+	 */
+	if (p->task_isolation_flags & PR_TASK_ISOLATION_STRICT) {
+		if (in_nmi())
+			force_debug = true;
+		else
+			task_isolation_interrupt(p, "interrupt");
+	}
+
+	/*
+	 * If (for example) the timer interrupt starts ticking
+	 * unexpectedly, we will get an unmanageable flow of output,
+	 * so limit to one backtrace per second.
+	 */
+	if (force_debug ||
+	    (task_isolation_debug_flag && __ratelimit(&console_output))) {
+		pr_err("Interrupt detected for task_isolation cpu %d, %s/%d\n",
+		       cpu, p->comm, p->pid);
+		dump_stack();
+	}
+}
+
+void task_isolation_debug_cpumask(const struct cpumask *mask)
+{
+	int cpu, thiscpu = smp_processor_id();
+
+	/* No need to report on this cpu since we're already in the kernel. */
+	for_each_cpu(cpu, mask)
+		if (cpu != thiscpu)
+			task_isolation_debug(cpu);
+}
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 732e993b564b..700120221f6b 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -74,6 +74,7 @@
 #include
 #include
 #include
+#include
 
 #include
 #include
@@ -746,6 +747,23 @@ bool sched_can_stop_tick(void)
 }
 #endif /* CONFIG_NO_HZ_FULL */
 
+#ifdef CONFIG_TASK_ISOLATION
+void task_isolation_debug(int cpu)
+{
+	struct task_struct *p;
+
+	if (!task_isolation_possible(cpu))
+		return;
+
+	rcu_read_lock();
+	p = cpu_curr(cpu);
+	get_task_struct(p);
+	rcu_read_unlock();
+	task_isolation_debug_task(cpu, p);
+	put_task_struct(p);
+}
+#endif
+
 void sched_avg_update(struct rq *rq)
 {
 	s64 period = sched_avg_period();
diff --git a/kernel/signal.c b/kernel/signal.c
index f3f1f7a972fd..c45ef71f329c 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -638,6 +638,11 @@ int dequeue_signal(struct task_struct *tsk, sigset_t *mask, siginfo_t *info)
  */
 void signal_wake_up_state(struct task_struct *t, unsigned int state)
 {
+#ifdef CONFIG_TASK_ISOLATION
+	/* If the task is being killed, don't complain about task_isolation. */
+	if (state & TASK_WAKEKILL)
+		t->task_isolation_flags = 0;
+#endif
 	set_tsk_thread_flag(t, TIF_SIGPENDING);
 	/*
 	 * TASK_WAKEKILL also means wake it up in the stopped/traced/killable
diff --git a/kernel/smp.c b/kernel/smp.c
index d903c02223af..a61894409645 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -14,6 +14,7 @@
 #include
 #include
 #include
+#include
 
 #include "smpboot.h"
 
@@ -178,8 +179,10 @@ static int generic_exec_single(int cpu, struct call_single_data *csd,
 	 * locking and barrier primitives. Generic code isn't really
 	 * equipped to do the right thing...
 	 */
-	if (llist_add(&csd->llist, &per_cpu(call_single_queue, cpu)))
+	if (llist_add(&csd->llist, &per_cpu(call_single_queue, cpu))) {
+		task_isolation_debug(cpu);
 		arch_send_call_function_single_ipi(cpu);
+	}
 
 	return 0;
 }
@@ -457,6 +460,7 @@ void smp_call_function_many(const struct cpumask *mask,
 	}
 
 	/* Send a message to all CPUs in the map */
+	task_isolation_debug_cpumask(cfd->cpumask);
 	arch_send_call_function_ipi_mask(cfd->cpumask);
 
 	if (wait) {
diff --git a/kernel/softirq.c b/kernel/softirq.c
index 479e4436f787..f249b71cddf4 100644
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -26,6 +26,7 @@
 #include
 #include
 #include
+#include
 
 #define CREATE_TRACE_POINTS
 #include
@@ -319,6 +320,37 @@ asmlinkage __visible void do_softirq(void)
 	local_irq_restore(flags);
 }
 
+/* Determine whether this IRQ is something task isolation cares about. */
+static void task_isolation_irq(void)
+{
+#ifdef CONFIG_TASK_ISOLATION
+	struct pt_regs *regs;
+
+	if (!context_tracking_cpu_is_enabled())
+		return;
+
+	/*
+	 * We have not yet called __irq_enter() and so we haven't
+	 * adjusted the hardirq count.  This test will allow us to
+	 * avoid false positives for nested IRQs.
+	 */
+	if (in_interrupt())
+		return;
+
+	/*
+	 * If we were already in the kernel, not from an irq but from
+	 * a syscall or synchronous exception/fault, this test should
+	 * avoid a false positive as well.  Note that this requires
+	 * architecture support for calling set_irq_regs() prior to
+	 * calling irq_enter(), and if it's not done consistently, we
+	 * will not consistently avoid false positives here.
+	 */
+	regs = get_irq_regs();
+	if (regs && user_mode(regs))
+		task_isolation_debug(smp_processor_id());
+#endif
+}
+
 /*
  * Enter an interrupt context.
  */
@@ -335,6 +367,7 @@ void irq_enter(void)
 		_local_bh_enable();
 	}
 
+	task_isolation_irq();
 	__irq_enter();
 }
 
-- 
2.1.2
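
Illustrative note (not part of the patch): the reporting condition in
task_isolation_debug_task() combines a forced path (strict mode hit from
NMI context) with the boot flag gated by a one-per-second rate limit.
The userspace sketch below models only that decision so the behaviour
can be reasoned about in isolation. Everything named sketch_* is
hypothetical and introduced for illustration; the simple window counter
is an assumption standing in for the kernel's ratelimit state, not a
copy of it, and time is passed in explicitly as an abstract tick count.

```c
#include <stdbool.h>

/*
 * Hypothetical stand-in for a ratelimit state: allow at most `burst`
 * events per `interval` ticks.  This is a simplified model, not the
 * kernel's implementation.
 */
struct sketch_ratelimit {
	long interval;	/* window length, in ticks */
	int burst;	/* events allowed per window */
	long begin;	/* tick at which the current window started */
	int printed;	/* events already allowed in this window */
	bool started;	/* false until the first event is seen */
};

/* Return true if an event at tick `now` is within the limit. */
static bool sketch_ratelimit_ok(struct sketch_ratelimit *rs, long now)
{
	/* Open a new window on first use or once the old one expired. */
	if (!rs->started || now - rs->begin >= rs->interval) {
		rs->begin = now;
		rs->printed = 0;
		rs->started = true;
	}
	if (rs->printed < rs->burst) {
		rs->printed++;
		return true;
	}
	return false;
}

/*
 * Model of the patch's condition:
 *   force_debug || (debug_flag && ratelimit)
 * Short-circuit evaluation means the forced path and the flag-off path
 * never consume rate-limit budget, as in the original expression.
 */
static bool sketch_should_report(bool force_debug, bool debug_flag,
				 struct sketch_ratelimit *rs, long now)
{
	return force_debug || (debug_flag && sketch_ratelimit_ok(rs, now));
}
```

With interval set to the number of ticks per second and burst set to 1,
this corresponds to the "one backtrace per second" limit described in
the patch comment, while a forced report always goes through.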