From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1752540AbXBIXbz (ORCPT ); Fri, 9 Feb 2007 18:31:55 -0500
Received: (majordomo@vger.kernel.org) by vger.kernel.org
	id S1752542AbXBIXbz (ORCPT ); Fri, 9 Feb 2007 18:31:55 -0500
Received: from mx1.redhat.com ([66.187.233.31]:58137 "EHLO mx1.redhat.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1752540AbXBIXby (ORCPT ); Fri, 9 Feb 2007 18:31:54 -0500
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
From: Roland McGrath
To: Alan Stern
X-Fcc: ~/Mail/utrace
Cc: Prasanna S Panchamukhi , Kernel development list
Subject: Re: [PATCH] Kwatch: kernel watchpoints using CPU debug registers
In-Reply-To: Alan Stern's message of Friday, 9 February 2007 10:54:40 -0500
X-Shopping-List: (1) Precocious bruisers (2) Surreptitious exhibitions
	(3) Revised console bags (4) Decorator hider hamster-lips
	(5) Omniscient dangerous acne
Message-Id: <20070209233150.B9542180055@magilla.sf.frob.com>
Date: Fri, 9 Feb 2007 15:31:50 -0800 (PST)
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

> Yes.  In fact, the current existing code does not handle dr6 correctly.
> It never clears the register, which means you're likely to get into
> trouble when multiple breakpoints (or watchpoints) are enabled.

This is a subtle change from the existing ABI, in which userland has to
clear %dr6 via ptrace itself.  But gdb never does that AFAICT.  So it's in
fact subject to confusion when two watchpoints are set and the second hits
after the first.  So gdb ought to be fixed to clear dr6 via ptrace, to work
with existing and older kernels.

I don't think I really object to the ABI change of clearing %dr6 after an
exception so that it does not accumulate multiple results.  But first I'll
have to convince myself that we never actually do want to accumulate
multiple results.
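
The confusion described above can be illustrated with a small model.  This
is only a sketch: it assumes, as the message does, that a watchpoint hit on
drN sets bit N of %dr6 and that nothing clears it unless the debugger does.
The function names are hypothetical, and the real write-back would go
through ptrace rather than a plain function call.

```c
#include <assert.h>
#include <stdint.h>

#define DR6_TRAP_MASK 0xfu  /* low 4 bits: one "hit" bit per debug register */

/* Model of the CPU side as described in the message: a hit on drN sets
 * bit N and leaves every other bit as it was. */
static uint32_t dr6_record_hit(uint32_t dr6, unsigned slot)
{
    assert(slot < 4);
    return dr6 | (1u << slot);
}

/* The hit bits a debugger sees when it reads dr6 after a trap. */
static uint32_t dr6_hits(uint32_t dr6)
{
    return dr6 & DR6_TRAP_MASK;
}

/* The value a debugger would write back once it has handled the trap. */
static uint32_t dr6_clear_hits(uint32_t dr6)
{
    return dr6 & ~DR6_TRAP_MASK;
}
```

Without the write-back, a hit on dr0 followed later by a hit on dr1 leaves
both bits set, so the second trap cannot be told apart from a simultaneous
hit on both.  With the write-back after each trap, the second read shows
only the dr1 bit.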
Hmm, I think we can, so maybe I do object.  If you set two watchpoints
inside a user buffer and then do a system call that touches both those
addresses (e.g. read), then you will go through do_debug (to send_sigtrap)
twice before returning to user mode.  When the syscall is done, you'll have
a pending SIGTRAP for the debugger to handle.  By looking at your %dr6 the
debugger can see that both watchpoints hit.  (gdb does not handle this
case, but it should.)  Am I wrong?

So this gets to the more complicated view of %dr6 handling that I had first
had in mind yesterday.  Each allocation "owns" one of the low 4 bits in
%dr6 too.  Only the dr6 bits owned by the userland "raw" allocation
(i.e. ptrace/utrace_regset) should appear nonzero in thread.debugreg[6].
So when kwatch swallows a debug exception, it should mask off its bit from
%dr6 in the CPU, but not clear %dr6 completely.  That way you can have a
sequence of user dr0 hit, kwatch dr3 hit, user dr1 hit, all inside one
system call (including interrupt handlers), and when it gets to the
userland debugger examining dr6 it sees the low 2 bits both set.

> It's really quite a tricky matter.  Should a register be allocated to
> kwatch only when no user process needs it?  Should we really go about
> checking the requirements of every single process whenever a kwatch
> allocation request comes in?  What if the processes which need a
> particular register aren't running -- should the register then be given to
> kwatch?  What if one of those processes then does start running on one
> CPU?

To "go about checking the requirements of every single process" is not so
hard as it sounds when they're recorded as a single global use count per
slot, as your original code does.  When you mentioned a "your allocation is
available" callback, I was thinking it might come to that being called
inside context switch.

It's all rather tricky, indeed.  The obvious answer is to start simple.
If any user process anywhere uses drN, kwatch has to give it up for all
CPUs (watchpoints with less than "break ptrace" priority do).  If anyone
really cares about more flexibility than that, we can change or extend it.
Some copious comments in the interface descriptions can lead them in the
right direction if the situation comes up.  Probably with systemtap support
in a while, we'll get a lot more concrete uses of watchpoints and people
finding out what really matters to them.


Thanks,
Roland
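
The "start simple" policy and the per-slot ownership of %dr6 bits discussed
in this message can be sketched together as follows.  This is an
illustration under stated assumptions, not code from the Kwatch patch: the
struct and every function name here are hypothetical, locking and per-CPU
register updates are left out, and "virt_dr6" stands in for
thread.debugreg[6].

```c
#include <assert.h>
#include <stdint.h>

#define NUM_SLOTS 4

/* Hypothetical bookkeeping: one global use count per debug register. */
struct dr_slots {
    unsigned user_count[NUM_SLOTS]; /* user tasks anywhere using drN */
    uint8_t  kwatch_mask;           /* bit N set: kwatch holds drN */
};

/* A user task starts using drN, so kwatch gives it up for all CPUs.
 * Returns 1 if kwatch held the slot and was evicted. */
static int user_claim(struct dr_slots *s, unsigned slot)
{
    assert(slot < NUM_SLOTS);
    int evicted = (s->kwatch_mask >> slot) & 1;
    s->user_count[slot]++;
    s->kwatch_mask &= (uint8_t)~(1u << slot);
    return evicted;
}

/* A user task stops using drN. */
static void user_release(struct dr_slots *s, unsigned slot)
{
    assert(slot < NUM_SLOTS && s->user_count[slot] > 0);
    s->user_count[slot]--;
}

/* kwatch may take drN only while no user task anywhere uses it. */
static int kwatch_claim(struct dr_slots *s, unsigned slot)
{
    assert(slot < NUM_SLOTS);
    if (s->user_count[slot] != 0 || ((s->kwatch_mask >> slot) & 1))
        return 0;
    s->kwatch_mask |= (uint8_t)(1u << slot);
    return 1;
}

/* On a debug exception, split the low 4 bits of hardware dr6 by owner.
 * Bits for kwatch-owned slots are returned for the kernel handler and
 * kept out of the user-visible value; the other hit bits accumulate
 * into *virt_dr6. */
static uint32_t filter_dr6(const struct dr_slots *s, uint32_t hw_dr6,
                           uint32_t *virt_dr6)
{
    uint32_t hits = hw_dr6 & 0xfu;
    uint32_t kernel_hits = hits & s->kwatch_mask;
    *virt_dr6 |= hits & ~(uint32_t)s->kwatch_mask;
    return kernel_hits;
}
```

With kwatch holding dr3 and user tasks holding dr0 and dr1, feeding the
sequence "user dr0 hit, kwatch dr3 hit, user dr1 hit" through filter_dr6
leaves only the low 2 bits set in the user-visible value, which matches the
outcome described above.  A later user claim on dr3 evicts kwatch, and
kwatch can reclaim the slot only after the last user releases it.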