Frederic Weisbecker wrote:
> On Sat, Sep 12, 2009 at 12:09:40AM +0200, Jan Kiszka wrote:
>> Frederic Weisbecker wrote:
>>> This patch rebase the implementation of the breakpoints API on top of
>>> perf counters instances.
>>>
>>> The core breakpoint API has changed a bit:
>>>
>>> - register_kernel_hw_breakpoint() now takes a cpu as a parameter. For
>>>   now it doesn't support all cpu wide breakpoints but this may be
>>>   implemented soon.
>>>
>>> - unregister_kernel_hw_breakpoint() and unregister_user_hw_breakpoint()
>>>   have been unified in a single unregister_hw_breakpoint()
>>>
>>> Each breakpoints now match a perf counter which now handles the
>>> register scheduling, thread/cpu attachment, etc..
>>>
>>> The new layering is now made as follows:
>>>
>>>     ptrace    kgdb    ftrace    perf syscall
>>>         \       |       /          /
>>>          \      |      /          /
>> kgdb doesn't fit here as it requires nmi-safe services.
>>
>> I don't think you want to make the whole stack nmi-safe but rather
>> provide a separate interface that allows kgdb to announce to the kernel
>> when it uses some slot. Those slots should simply be excluded from
>> hardware updates. That's roughly the logic we use in KVM for guest
>> debugging: when the host starts to use debug registers for that purpose,
>> the guest's setting will not effect the real hardware anymore.
>
>
>
> I don't quite understand what must be NMI-safe here. Is it when
> we request a breakpoint or when we hit one?
>

Both. With kgdb, the kernel may be interrupted (almost) everywhere, and
then the operator may decide to add/remove hardware breakpoints during
this interruption.

>
>
>> Still on my wishlist for KVM is a cheap & easy way to obtain the current
>> register content or to refresh it in hardware. It's not yet clear to me
>> where to hook this in the given design. It looks like this information
>> can be scattered over the current thread and some perf counters.
>
>
> With this design approach, the debug registers are not anymore stored
> in the thread structure. They are not stored anymore actually.
>
> Especially because the breakpoint are not anymore assigned to a
> specific address register. This one is decided when the counter
> is enabled. And the counter is often toggled on/off, depending
> if we start/end profiling the desired context. It can be a single task,
> in which case the counter is enabled while the task is sched in, and
> disabled when it is sched out.
> And between two sched atoms, the register used for a breakpoint
> can be different.
>
> The arch informations about the breakpoints (len/type/addr) are stored
> in the counter structure, and the address/control registers contents
> are now dynamically computed.
>
> For your needs, basically the control must be done from perfcounters.
> When you switch from host to guest, the counter must be sched out.
> And in the reverse direction, it must be sched in.
> Then perf will take care of that by itself.

Actually, we wanted to avoid sched-out activity, and so far this is
possible. But if both steps are cheap enough, specifically if the
sched-out does _not_ touch the hardware and is very cheap if no
breakpoints are set, KVM will likely be a happy user. Does that API
already exist or what additional work is required?

Jan
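
To make the question concrete, here is a rough user-space sketch of the behaviour described above: the counter carries only len/type/addr, the register slot is picked when the counter is scheduled in, and a host-to-guest switch costs nothing when no breakpoint is active. All names (bp_counter, bp_cpu, bp_sched_in, bp_sched_out, bp_sched_out_all) are hypothetical and are not the perf counters or hw_breakpoint API; the hw_writes field merely stands in for debug register updates.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_SLOTS 4 /* x86 provides four address registers, DR0-DR3 */

/* Hypothetical counter: describes what to watch, not which register. */
struct bp_counter {
	uintptr_t addr;
	unsigned int len;
	unsigned int type;
	int slot; /* -1 while scheduled out */
};

/* Hypothetical per-cpu state. */
struct bp_cpu {
	struct bp_counter *slots[NUM_SLOTS];
	unsigned int active;    /* number of occupied slots */
	unsigned int hw_writes; /* stand-in for debug register updates */
};

/* Pick the first free slot at enable time; return it, or -1 if full. */
int bp_sched_in(struct bp_cpu *cpu, struct bp_counter *bp)
{
	for (int i = 0; i < NUM_SLOTS; i++) {
		if (cpu->slots[i] == NULL) {
			cpu->slots[i] = bp;
			bp->slot = i;
			cpu->active++;
			cpu->hw_writes++; /* registers computed from addr/len/type */
			return i;
		}
	}
	return -1;
}

/* Release the slot; the counter keeps its addr/len/type description. */
void bp_sched_out(struct bp_cpu *cpu, struct bp_counter *bp)
{
	if (bp->slot < 0)
		return;
	cpu->slots[bp->slot] = NULL;
	bp->slot = -1;
	cpu->active--;
	cpu->hw_writes++;
}

/* Host-to-guest switch: early return when no breakpoint is set. */
void bp_sched_out_all(struct bp_cpu *cpu)
{
	if (cpu->active == 0)
		return; /* fast path, hardware untouched */
	for (int i = 0; i < NUM_SLOTS; i++) {
		if (cpu->slots[i] != NULL)
			bp_sched_out(cpu, cpu->slots[i]);
	}
}
```

In this model, calling bp_sched_out_all() on a cpu without active breakpoints leaves hw_writes unchanged, which is the property KVM would need from the real interface.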