Date: Thu, 08 Jun 2017 16:08:54 +0200
Message-ID: <20170608160854.Horde.wUKaUVyqiOZZOVp2xD1nl5A@webmail.ugent.be>
From: Ilja Nevolin
To: linux-kernel@vger.kernel.org
Subject: Circular debugging using ptrace results in deadlock due to race condition?
X-Mailing-List: linux-kernel@vger.kernel.org

Hi guys,

As part of my master's thesis I am facing a challenging problem. I am trying to make two processes act as each other's debuggers using the ptrace syscall. However, my proof-of-concept implementation always ends in a deadlock: both processes get stuck in the 't+' state, as shown by 'ps aux'.

Here is my code; it's pretty simple:
https://pastebin.com/A1iBA3nh

I have compiled and run this on an ARMv7 developer board with kernel version 3.0.35 (Linaro 13.08). The output of the above code is:

    A waiting to continue...
    B attachTo: 0
    B setOptions: 0
    B setVarData: 0
    B setVarData: 0
    B cont: 0
    B waiting to continue...
    B waiting to continue...
    A attachTo: 0

As you can see, it never reaches the "finished" printf call; it gets stuck as soon as the other process attempts to attach to its own debugger.

I have done a similar experiment with 3 processes, where each one attempts to attach to the next in a circular fashion:

    A -> B -> C -> A

The result in this case was exactly the same. However, here I was able to detect a race condition, because sometimes the code executed properly without deadlocking (though this is hard to reproduce). If you wish, you can test this using a lightweight debugger I've developed and three console terminals. Here's the code:
https://pastebin.com/fPJb8ZNb

Once you've compiled the above code, run the binary on each console and enter the PID of another process to establish a 3-way circle.

I am far from an expert on the kernel, but I did have a look at the ARM-specific kernel implementation, which left me puzzled: I couldn't find where, how or why this code fails. Now I'm wondering whether it's possible at all to make this work without a deadlock occurring.

Does anyone have any experience with this, or can anyone provide some clues or feedback?

Thank you greatly for your time, attention and effort!

Ilya Nevolin
ilja.nevolin@ugent.be
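
For reference, the attach sequence named in the output above (attachTo, setOptions, cont) and the 't' state reported by 'ps aux' can be sketched as follows. This is a minimal, one-directional sketch and not the code behind the pastebin links: the helper names `proc_state` and `attach_and_stop` are hypothetical, the choice of `PTRACE_O_TRACESYSGOOD` is an assumption (the options actually used are not stated), and the demo traces a forked child instead of forming a cycle.

```c
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: return the state letter from /proc/<pid>/stat
 * ('R' running, 'S' sleeping, 't' tracing stop, ...), or 0 on error. */
char proc_state(pid_t pid)
{
    char path[64], buf[512];
    snprintf(path, sizeof path, "/proc/%ld/stat", (long)pid);

    FILE *f = fopen(path, "r");
    if (!f)
        return 0;
    size_t n = fread(buf, 1, sizeof buf - 1, f);
    fclose(f);
    buf[n] = '\0';

    /* The state field follows the last ')' that closes the command name. */
    char *p = strrchr(buf, ')');
    if (!p || p[1] != ' ' || p[2] == '\0')
        return 0;
    return p[2];
}

/* Hypothetical helper: attach to pid, wait for the attach stop, then set
 * one option. Returns 0 on success, -1 on failure (errno is set). */
int attach_and_stop(pid_t pid)
{
    int status;

    if (ptrace(PTRACE_ATTACH, pid, NULL, NULL) == -1)
        return -1;
    if (waitpid(pid, &status, 0) == -1 || !WIFSTOPPED(status))
        return -1;
    /* Assumption: the option set here is illustrative only. */
    if (ptrace(PTRACE_SETOPTIONS, pid, NULL,
               (void *)(long)PTRACE_O_TRACESYSGOOD) == -1)
        return -1;
    return 0;
}

#ifdef DEMO_MAIN
int main(void)
{
    pid_t child = fork();
    if (child == -1) {
        perror("fork");
        return 1;
    }
    if (child == 0) {
        for (;;)
            pause();            /* tracee: idle until killed */
    }

    if (attach_and_stop(child) == -1) {
        perror("attach_and_stop");
        kill(child, SIGKILL);
        waitpid(child, NULL, 0);
        return 1;
    }
    printf("tracee state after attach: %c\n", proc_state(child));

    /* Resume the tracee, then clean up. */
    if (ptrace(PTRACE_CONT, child, NULL, NULL) == -1)
        perror("PTRACE_CONT");
    kill(child, SIGKILL);
    waitpid(child, NULL, 0);
    return 0;
}
#endif
```

Building with `-DDEMO_MAIN` gives a standalone program. Attaching may be refused on systems that restrict ptrace (for example through a sandbox or a security module), in which case `attach_and_stop` returns -1.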