From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Subject: Re: Exit all jobs on error
References: <5669B23C.1020001@kernel.dk> <20151210181540.GB21415@kernel.dk> <5669C364.9020100@kernel.dk>
From: Jens Axboe
Message-ID: <5669C46F.3090300@kernel.dk>
Date: Thu, 10 Dec 2015 11:29:03 -0700
MIME-Version: 1.0
In-Reply-To:
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
To: Andrey Kuzmin
Cc: Sitsofe Wheeler , "fio@vger.kernel.org"
List-ID:

On 12/10/2015 11:27 AM, Andrey Kuzmin wrote:
> On Thu, Dec 10, 2015 at 9:24 PM, Jens Axboe wrote:
>> On 12/10/2015 11:17 AM, Andrey Kuzmin wrote:
>>>
>>> On Thu, Dec 10, 2015 at 9:15 PM, Jens Axboe wrote:
>>>>
>>>> On Thu, Dec 10 2015, Andrey Kuzmin wrote:
>>>>>
>>>>> I've also encountered a similar issue a number of times where the job
>>>>> failed to stop (and refused to terminate in response to C-C) when a
>>>>> thread/process fails, e.g. due to an error. My guess is that the loop
>>>>> that waits for completions doesn't check for td->terminate being set.
>>>>
>>>>
>>>> Attach with gdb and see what they are doing, could be a missing
>>>> terminate check. Or it could already be sitting waiting for completions.
>>>
>>>
>>> It just sits there waiting for completions, as gdb understandably
>>> predominantly hits the wait state.
>>
>>
>> Where is it sitting and/or looping?
>
> unix/wait smth ;), as far as I recall.
>
> If you need an exact ref, let me make up an error in the code, run,
> and get back to you with the exact gdb frame info.

I'm generally not in the crystal ball or guessing game :-) So yeah, a
stack trace would be helpful.

-- 
Jens Axboe