Date: Mon, 1 Jun 2020 13:05:38 -0500
From: Josh Poimboeuf
To: "Wangshaobo (bobo)"
Cc: huawei.libin@huawei.com, xiexiuqi@huawei.com, cj.chengjian@huawei.com,
    mingo@redhat.com, x86@kernel.org, linux-kernel@vger.kernel.org,
    live-patching@vger.kernel.org, mbenes@suse.cz, devel@etsukata.com,
    viro@zeniv.linux.org.uk, esyr@redhat.com
Subject: Re: Question: livepatch failed for new fork() task stack unreliable
Message-ID: <20200601180538.o5agg5trbdssqken@treble>
References: <20200529101059.39885-1-bobo.shaobowang@huawei.com>
    <20200529174433.wpkknhypx2bmjika@treble>
X-Mailing-List: live-patching@vger.kernel.org

On Sat, May 30, 2020 at 10:21:19AM +0800, Wangshaobo (bobo) wrote:
> 1) when a user mode task just fork start executing ret_from_fork() till
> schedule_tail, unwind_next_frame found orc->sp_reg is ORC_REG_UNDEFINED
> but orc->end not equals zero, this time arch_stack_walk_reliable()
> terminates its backtracing loop for unwind_done() return true. then
> 'if (!(task->flags & (PF_KTHREAD | PF_IDLE)))' in
> arch_stack_walk_reliable() true and return -EINVAL after.
>
> * The stack trace looks like that:
>
> ret_from_fork
>       -=> UNWIND_HINT_EMPTY
>       -=> schedule_tail             /* schedule out */
>       ...
>       -=> UNWIND_HINT_REGS      /*  UNDO */

Yes, makes sense.

> 2) when using call_usermodehelper_exec_async() to create a user mode task,
> ret_from_fork() still not exec whereas the task has been scheduled in
> __schedule(), at this time, orc->sp_reg is ORC_REG_UNDEFINED but orc->end
> equals zero, unwind_error() return true and also terminates
> arch_stack_walk_reliable()'s backtracing loop, end up return from
> 'if (unwind_error())' branch.
>
> * The stack trace looks like that:
>
> -=> call_usermodehelper_exec
>                  -=> do_exec
>                            -=> search_binary_handler
>                                       -=> load_elf_binary
>                                                 -=> elf_map
>                                                          -=> vm_mmap_pgoff
> -=> down_write_killable
> -=> _cond_resched
>              -=> __schedule           /* scheduled to work */
> -=> ret_from_fork       /* UNDO */

I don't quite follow the stacktrace, but it sounds like the issue is the
same as the first one you originally reported:

> 1) The task was not actually scheduled to execute, at this time
> UNWIND_HINT_EMPTY in ret_from_fork() has not reset unwind_hint, its
> sp_reg and end field remain default value and end up throwing an error
> in unwind_next_frame() when called by arch_stack_walk_reliable();

Or am I misunderstanding?

And to reiterate, these are not "livepatch failures", right?  Livepatch
doesn't fail when stack_trace_save_tsk_reliable() returns an error.  It
recovers gracefully and tries again later.

-- 
Josh
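
For anyone following the two cases, here is a rough user-space model of
the decision being discussed.  It is only a sketch, not kernel code:
struct orc_model, classify_frame() and reliable_result() are made-up
names for illustration, and the logic mirrors only what this thread
describes (an ORC entry with sp_reg undefined either ends the unwind
cleanly when 'end' is set, or counts as an unwind error when it is not).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the ORC fields named in the thread. */
enum sp_reg_model { SP_REG_UNDEFINED, SP_REG_SP };

struct orc_model {
	enum sp_reg_model sp_reg;	/* models orc->sp_reg */
	bool end;			/* models orc->end != 0 */
};

enum unwind_stop { STOP_NONE, STOP_DONE, STOP_ERROR };

/*
 * Models the end-of-stack check attributed to unwind_next_frame():
 * an undefined sp_reg stops the walk, cleanly only if 'end' is set.
 */
static enum unwind_stop classify_frame(const struct orc_model *orc)
{
	if (orc->sp_reg != SP_REG_UNDEFINED)
		return STOP_NONE;	/* keep unwinding */

	return orc->end ? STOP_DONE : STOP_ERROR;
}

/*
 * Models the checks after the loop in arch_stack_walk_reliable(),
 * as quoted above.  Call it only once the walk has stopped.
 */
static int reliable_result(bool unwind_error, bool kthread_or_idle)
{
	if (unwind_error)
		return -EINVAL;		/* case 2: 'if (unwind_error())' */

	/* A user task that ended without reaching user-mode regs. */
	if (!kthread_or_idle)
		return -EINVAL;		/* case 1: the PF_KTHREAD | PF_IDLE test */

	return 0;			/* kthread or idle task, clean end */
}
```

In this model, case 1 is { SP_REG_UNDEFINED, end = true } on a user task
and case 2 is { SP_REG_UNDEFINED, end = false }.  Both come out as
-EINVAL, which, per the message above, only means the task gets checked
again later.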