Date: Fri, 9 Feb 2007 23:32:17 -0500 (EST)
From: Alan Stern
To: Roland McGrath
cc: Prasanna S Panchamukhi, Kernel development list
Subject: Re: [PATCH] Kwatch: kernel watchpoints using CPU debug registers
In-Reply-To: <20070209233150.B9542180055@magilla.sf.frob.com>

On Fri, 9 Feb 2007, Roland McGrath wrote:

> I don't think I really object to the ABI change of clearing %dr6 after an
> exception so that it does not accumulate multiple results. But first I'll
> have to convince myself that we never actually do want to accumulate
> multiple results. Hmm, I think we can, so maybe I do object. If you set
> two watchpoints inside a user buffer and then do a system call that touches
> both those addresses (e.g. read), then you will go through do_debug (to
> send_sigtrap) twice before returning to user mode. When the syscall is
> done, you'll have a pending SIGTRAP for the debugger to handle. By looking
> at your %dr6 the debugger can see that both watchpoints hit. (gdb does not
> handle this case, but it should.) Am I wrong?

I think you're right.

> So this gets to the more complicated view of %dr6 handling that I had first
> had in mind yesterday. Each allocation "owns" one of the low 4 bits in
> %dr6 too. Only the dr6 bits owned by the userland "raw" allocation
> (i.e. ptrace/utrace_regset) should appear nonzero in thread.debugreg[6].
> So when kwatch swallows a debug exception, it should mask off its bit from
> %dr6 in the CPU, but not clear %dr6 completely. That way you can have a
> sequence of user dr0 hit, kwatch dr3 hit, user dr1 hit, all inside one
> system call (including interrupt handlers), and when it gets to the
> userland debugger examining dr6 it sees the low 2 bits both set.

Okay; I'll fix this too. Come to think of it, kwatch needs to handle
multiple hits as well -- there might be two watchpoints set to the same
address.
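As a rough sketch of that masking step (get_dr6(), set_dr6(), and
kwatch_owned_mask are made-up placeholder names for illustration, not
the actual kwatch interfaces):

	/* Sketch only: invented helpers, not the real kwatch code. */
	extern unsigned long get_dr6(void);	/* read %dr6 from the CPU */
	extern void set_dr6(unsigned long val);	/* write %dr6 back */
	extern unsigned long kwatch_owned_mask;	/* low-4 bits kwatch owns */

	static void kwatch_ack_hit(unsigned long slot_bit)
	{
		unsigned long dr6 = get_dr6();

		if (dr6 & slot_bit & kwatch_owned_mask) {
			/*
			 * Swallow only our own hit: clear kwatch's status
			 * bit but leave bits owned by userland allocations
			 * set, so the debugger still sees them accumulate
			 * across a system call.
			 */
			set_dr6(dr6 & ~slot_bit);
		}
	}

Here slot_bit would be (1UL << n) for debug register n, so the user dr0
and dr1 hits in the sequence above stay visible in %dr6 even though the
kwatch dr3 hit in between is swallowed.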
> > It's really quite a tricky matter. Should a register be allocated to
> > kwatch only when no user process needs it? Should we really go about
> > checking the requirements of every single process whenever a kwatch
> > allocation request comes in? What if the processes which need a
> > particular register aren't running -- should the register then be given to
> > kwatch? What if one of those processes then does start running on one
> > CPU?
>
> To "go about checking the requirements of every single process" is not so
> hard as it sounds when they're recorded as a single global use count per
> slot, as your original code does. When you mentioned a "your allocation is
> available" callback, I was thinking it might come to that being called
> inside context switch. It's all rather tricky, indeed.
>
> The obvious answer is to start simple. If any user process anywhere uses
> drN, kwatch has to give it up for all CPUs (watchpoints with less than
> "break ptrace" priority do). If anyone really cares about more flexibility
> than that, we can change or extend it. Some copious comments in the
> interface descriptions can lead them in the right direction if the
> situation comes up. Probably with systemtap support in a while, we'll get
> a lot more concrete uses of watchpoints and people finding out what really
> matters to them.

It's still more complicated than you might think. Let's say two user
processes each have dr1 allocated, one with low priority and the other
with high priority. The kernel has to be aware of the high-priority
allocation, so that it can refuse intermediate-priority kwatch
allocation attempts.

Now suppose the second process exits. dr1 is still allocated to the
first user process but only with low priority, so now
intermediate-priority kwatch allocation attempts should succeed. In
order for this to work, when the second process gives up its allocation
I would have to either scan through all tasks to find the first process,
or else keep several global use counts for each slot -- in fact, one use
count for each priority level. That's doable if there are only a few
levels, but not if there are many. How do you suggest this be handled?

Maybe we should just keep track of a maximum user priority level for
each slot, allowing it to go up but not down until all user processes
have given up the slot. (I.e., in the example above the later kwatch
requests would still fail because we would continue to remember the high
user priority level so long as the first process maintained its
allocation.) That would be overly pessimistic, but it would at least be
safe.

Alan Stern
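A minimal sketch of the per-slot bookkeeping for that "remember the
maximum user priority" idea (struct dr_slot, user_slots[], and the
function names are invented for illustration and are not code from the
patch; the real thing would also need locking):

	#include <linux/errno.h>

	struct dr_slot {
		unsigned int user_count;	/* user allocations holding the slot */
		unsigned int max_user_pri;	/* highest priority seen; only rises
						 * while user_count > 0 */
	};

	static struct dr_slot user_slots[4];

	/* A user process allocates debug register n at priority pri. */
	static void user_alloc_slot(int n, unsigned int pri)
	{
		user_slots[n].user_count++;
		if (pri > user_slots[n].max_user_pri)
			user_slots[n].max_user_pri = pri;
	}

	/* A user process releases debug register n. */
	static void user_release_slot(int n)
	{
		if (--user_slots[n].user_count == 0)
			user_slots[n].max_user_pri = 0;	/* forget only when free */
	}

	/* Can kwatch take over debug register n at priority pri? */
	static int kwatch_try_alloc_slot(int n, unsigned int pri)
	{
		struct dr_slot *s = &user_slots[n];

		/*
		 * Overly pessimistic but safe: the remembered priority may
		 * belong to a process that has since exited.
		 */
		if (s->user_count > 0 && s->max_user_pri >= pri)
			return -EBUSY;
		return 0;
	}

In the example above, the pessimism shows up exactly as described: after
the high-priority process exits, max_user_pri stays high until the
low-priority process also gives up dr1, so intermediate-priority kwatch
requests keep failing in the meantime.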