From: Dmitry Vyukov
Date: Thu, 18 Mar 2021 16:18:03 +0100
Subject: Re: [syzbot] BUG: unable to handle kernel access to user memory in sock_ioctl
To: Ben Dooks
Cc: syzbot, Paul Walmsley, Palmer Dabbelt, Albert Ou, linux-riscv,
 andrii@kernel.org, Alexei Starovoitov, bpf, Daniel Borkmann, David Miller,
 John Fastabend, Martin KaFai Lau, kpsingh@kernel.org, Jakub Kicinski, LKML,
 netdev, Song Liu, syzkaller-bugs, Yonghong Song
References: <00000000000096cdaa05bd32d46f@google.com>
X-Mailing-List: netdev@vger.kernel.org

On Mon, Mar 15, 2021 at 3:41 PM Ben Dooks wrote:
>
> On 15/03/2021 11:52, Dmitry Vyukov wrote:
> > On Mon, Mar 15, 2021 at 12:30 PM Ben Dooks wrote:
> >>
> >> On 14/03/2021 11:03, Dmitry Vyukov wrote:
> >>> On Sun, Mar 14, 2021 at 11:01 AM Dmitry Vyukov wrote:
> >>>>> On Wed, Mar 10, 2021 at 7:28 PM syzbot
> >>>>> wrote:
> >>>>>>
> >>>>>> Hello,
> >>>>>>
> >>>>>> syzbot found the following issue on:
> >>>>>>
> >>>>>> HEAD commit: 0d7588ab riscv: process: Fix no prototype for arch_dup_tas..
> >>>>>> git tree: git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux.git fixes
> >>>>>> console output: https://syzkaller.appspot.com/x/log.txt?x=122c343ad00000
> >>>>>> kernel config: https://syzkaller.appspot.com/x/.config?x=e3c595255fb2d136
> >>>>>> dashboard link: https://syzkaller.appspot.com/bug?extid=c23c5421600e9b454849
> >>>>>> userspace arch: riscv64
> >>>>>>
> >>>>>> Unfortunately, I don't have any reproducer for this issue yet.
> >>>>>>
> >>>>>> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> >>>>>> Reported-by: syzbot+c23c5421600e9b454849@syzkaller.appspotmail.com
> >>>>>
> >>>>> +riscv maintainers
> >>>>>
> >>>>> Another case of put_user crashing.
> >>>>
> >>>> There are 58 crashes in sock_ioctl already. Somehow there is a very
> >>>> significant skew towards crashing with this "user memory without
> >>>> uaccess routines" in schedule_tail and sock_ioctl of all places in the
> >>>> kernel that use put_user... This looks very strange... Any ideas
> >>>> what's special about these 2 locations?
> >>>
> >>> I could imagine if such a crash happens after a previous stack
> >>> overflow and now task data structures are corrupted. But f_getown does
> >>> not look like a function that consumes way more than other kernel
> >>> syscalls...
> >>
> >> The last crash I looked at suggested somehow put_user got re-entered
> >> with the user protection turned back on. Either there is a path through
> >> one of the kernel handlers where this happens or there's something
> >> weird going on with qemu.
> >
> > Is there any kind of tracking/reporting that would help to localize
> > it? I could re-reproduce with that code.
>
> I'm not sure.
> I will have a go at debugging on qemu today just to make
> sure I can reproduce here before I have to go into the office and fix
> my Icicle board for real hardware tests.
>
> I think my first plan post reproduction is to stuff some trace points
> into the fault handlers to see if we can get a idea of faults being
> processed, etc.
>
> Maybe also add a check in the fault handler to see if the fault was
> in a fixable region and post an error if that happens / maybe retry
> the instruction with the relevant SR_SUM flag set.
>
> Hopefully tomorrow I can get a run on real hardware to confirm.
> Would have been better if the Unmatched board I ordered last year
> would turn up.

In retrospect it's obvious what's common between these 2 locations:
they both call a function inside of put_user.

#syz dup: BUG: unable to handle kernel access to user memory in schedule_tail
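
For illustration only, here is a minimal user-space sketch of why calling a function inside of put_user could matter. This is not the kernel code: every name below is hypothetical, the boolean merely stands in for the SR_SUM "user access allowed" state, and it assumes (as suggested earlier in the thread) that the called function can return with that state cleared.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the SR_SUM "user access allowed" state. */
static bool user_access_enabled;
/* Number of simulated "access to user memory" faults. */
static int faults;

/* Hypothetical callee. ASSUMPTION: it may return with the access
 * flag cleared, e.g. because it scheduled away in between. */
static int value_from_function(void)
{
    user_access_enabled = false;
    return 42;
}

/* A store that "faults" when the access window is closed. */
static void checked_store(int *dst, int val)
{
    if (!user_access_enabled) {
        faults++;
        return;
    }
    *dst = val;
}

/* The argument expression is evaluated inside the access window. */
#define PUT_USER_INSIDE(x, ptr) do {      \
    user_access_enabled = true;           \
    checked_store((ptr), (x));            \
    user_access_enabled = false;          \
} while (0)

/* The argument expression is evaluated before the window opens. */
#define PUT_USER_BEFORE(x, ptr) do {      \
    int tmp_ = (x);                       \
    user_access_enabled = true;           \
    checked_store((ptr), tmp_);           \
    user_access_enabled = false;          \
} while (0)

/* Returns the fault count when the call sits inside the window. */
int faults_with_inside(void)
{
    int slot = 0;
    faults = 0;
    PUT_USER_INSIDE(value_from_function(), &slot);
    return faults;
}

/* Returns the fault count when the call is hoisted out of the window. */
int faults_with_before(void)
{
    int slot = 0;
    faults = 0;
    PUT_USER_BEFORE(value_from_function(), &slot);
    return faults;
}
```

In this sketch, faults_with_inside() records a fault because the callee runs after the window is opened, while faults_with_before() does not, since the value is computed first. Whether the real crashes follow exactly this pattern would still need to be confirmed against the riscv uaccess macros.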