From: Alexander Potapenko
Date: Thu, 7 Nov 2019 14:00:31 +0100
Subject: Re: [PATCH RFC v2 22/25] kmsan: unpoisoning buffers from devices etc.
To: Christoph Hellwig
Cc: Andrew Morton, Jens Axboe, "Theodore Ts'o", Dmitry Torokhov,
 "Martin K. Petersen", "Michael S. Tsirkin", Eric Dumazet,
 Eric Van Hensbergen, Takashi Iwai, Vegard Nossum, Dmitry Vyukov,
 Matthew Wilcox, Linux Memory Management List, Al Viro, Andrey Ryabinin,
 Andy Lutomirski, Ard Biesheuvel, Arnd Bergmann, Greg Kroah-Hartman,
 Harry Wentland, Herbert Xu, Ingo Molnar, Martin Schwidefsky,
 Michal Simek, Petr Mladek, Sergey Senozhatsky, Steven Rostedt,
 Thomas Gleixner, Wolfram Sang, Vasily Gorbik, Ilya Leoshkevich,
 Mark Rutland, Randy Dunlap, Andrey Konovalov, Marco Elver
References: <20191030142237.249532-1-glider@google.com>
 <20191030142237.249532-23-glider@google.com>
 <20191030143814.GC15015@lst.de>

On Tue, Nov 5, 2019 at 4:02 PM Alexander Potapenko wrote:
>
> On Wed, Oct 30, 2019 at 3:38 PM Christoph Hellwig wrote:
> >
> > On Wed, Oct 30, 2019 at 03:22:34PM +0100, glider@google.com wrote:
> > > When data is copied to memory from a device KMSAN should treat it as
> > > initialized. In most cases it's enough to just unpoison the buffer that
> > > is known to come from a device.
> > > In the case with __do_page_cache_readahead() and bio_copy_user_iov() we
> > > have to mark the whole pages as ignored by KMSAN, as it's not obvious
> > > where these pages are read again.
> >
> > A lot of this looks pretty strange. Why don't you instrument
> > the dma_map / dma_sync infrastructure? That should avoid most of the
> > driver hooks.
>
> That's the exact reason I'm sending these patches: I simply don't know
> the kernel code well enough.
> May I ask you for some pointers?
> My goal is to mark data copied from the device as initialized (by
> calling kmsan_unpoison_shadow(ptr, size)), and, if possible, check
> data that's about to be copied to device (by calling
> kmsan_check_memory(ptr, size)).
> My understanding is that:
> 1. calls to dma_map_* and dma_sync_* with direction=DMA_FROM_DEVICE
> denote that the corresponding kernel buffer can be marked as
> initialized
> 2. calls to dma_unmap_* and dma_sync_* with direction=DMA_TO_DEVICE
> denote that the buffer will be copied to device (and must be checked
> for being initialized)
> 3. I need some translation table to find out the virtual address for
> a given dma_addr_t
> Does this sound reasonable?

Marking memory as initialized in dma_map_ still leaves reports such as
the one below.
There seems to be a DMA access somewhere in blk_execute_rq(), but I
fail to see why it's not covered.

=============================================
BUG: KMSAN: uninit-value in [< none >] sr_check_events+0x1091/0x1190 drivers/scsi/sr.c:246
CPU: 0 PID: 5 Comm: kworker/0:0 Not tainted 5.4.0-rc5+ #3266
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
Workqueue: events_freezable_power_ disk_events_workfn
Call Trace:
 [< inline >] __dump_stack lib/dump_stack.c:77
 [< none >] dump_stack+0x196/0x1f0 lib/dump_stack.c:113
 [< none >] kmsan_report+0x127/0x220 mm/kmsan/kmsan_report.c:108
 [< none >] __msan_warning+0x73/0xe0 mm/kmsan/kmsan_instr.c:245
 [< inline >] sr_get_events drivers/scsi/sr.c:213
 [< none >] sr_check_events+0x1091/0x1190 drivers/scsi/sr.c:246
 [< inline >] cdrom_update_events drivers/cdrom/cdrom.c:1476
 [< none >] cdrom_check_events+0xc3/0x260 drivers/cdrom/cdrom.c:1486
 [< none >] sr_block_check_events+0x3c4/0x670 drivers/scsi/sr.c:614
 [< none >] disk_check_events+0x154/0x8b0 block/genhd.c:1855
 [< none >] disk_events_workfn+0x47/0x50 block/genhd.c:1841
 [< none >] process_one_work+0x1556/0x1ef0 kernel/workqueue.c:2269
...
Uninit was stored to memory at:
 [< inline >] kmsan_save_stack_with_flags mm/kmsan/kmsan.c:151
 [< none >] kmsan_internal_chain_origin+0xa3/0x160 mm/kmsan/kmsan.c:319
 [< none >] kmsan_memcpy_memmove_metadata+0x271/0x2e0 mm/kmsan/kmsan.c:254
 [< none >] kmsan_memcpy_metadata+0xb/0x10 mm/kmsan/kmsan.c:274
 [< none >] __msan_memcpy+0x55/0x70 mm/kmsan/kmsan_instr.c:129
 [< none >] bio_copy_kern_endio_read+0x467/0x990 block/bio.c:1543
 [< none >] bio_endio+0xa36/0xbb0 block/bio.c:1850
 [< inline >] req_bio_endio block/blk-core.c:242
 [< none >] blk_update_request+0xd3c/0x20a0 block/blk-core.c:1462
 [< none >] scsi_end_request+0x10b/0xeb0 drivers/scsi/scsi_lib.c:579
 [< none >] scsi_io_completion+0x279/0x2660 drivers/scsi/scsi_lib.c:963
 [< none >] scsi_finish_command+0x6f9/0x720 drivers/scsi/scsi.c:228
 [< none >] scsi_softirq_done+0x772/0x980 drivers/scsi/scsi_lib.c:1477
 [< none >] blk_done_softirq+0x300/0x4f0 block/blk-softirq.c:37
 [< none >] __do_softirq+0x311/0x83d kernel/softirq.c:293
...
Uninit was created at:
 [< none >] kmsan_save_stack_with_flags+0x3f/0x90 mm/kmsan/kmsan.c:151
 [< inline >] kmsan_internal_alloc_meta_for_pages mm/kmsan/kmsan_shadow.c:362
 [< none >] kmsan_alloc_page+0x14e/0x360 mm/kmsan/kmsan_shadow.c:391
 [< none >] __alloc_pages_nodemask+0x594e/0x6050 mm/page_alloc.c:4796
 [< none >] alloc_pages_current+0x682/0x990 mm/mempolicy.c:2188
 [< inline >] alloc_pages ./include/linux/gfp.h:511
 [< none >] bio_copy_kern+0x4c5/0xed0 block/bio.c:1590
 [< none >] blk_rq_map_kern+0x458/0x7e0 block/blk-map.c:237
 [< none >] __scsi_execute+0x2cf/0xaf0 drivers/scsi/scsi_lib.c:265
 [< inline >] scsi_execute_req ./include/scsi/scsi_device.h:451
 [< inline >] sr_get_events drivers/scsi/sr.c:207
 [< none >] sr_check_events+0x2ff/0x1190 drivers/scsi/sr.c:246
 [< inline >] cdrom_update_events drivers/cdrom/cdrom.c:1476
 [< none >] cdrom_check_events+0xc3/0x260 drivers/cdrom/cdrom.c:1486
 [< none >] sr_block_check_events+0x3c4/0x670 drivers/scsi/sr.c:614
 [< none >] disk_check_events+0x154/0x8b0 block/genhd.c:1855
 [< none >] disk_events_workfn+0x47/0x50 block/genhd.c:1841
=============================================

> I still don't understand how to handle DMA_BIDIRECTIONAL. Would it be
> sane to assume that each dma_{map,sync,unmap}_* call must always
> check the memory range and then unpoison it?
>
> Thanks in advance

-- 
Alexander Potapenko
Software Engineer

Google Germany GmbH
Erika-Mann-Straße, 33
80636 München

Geschäftsführer: Paul Manicle, Halimah DeLaine Prado
Registergericht und -nummer: Hamburg, HRB 86891
Sitz der Gesellschaft: Hamburg
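
For reference, the direction rules from the quoted list (unpoison on DMA_FROM_DEVICE, check on DMA_TO_DEVICE) and the open DMA_BIDIRECTIONAL question can be modeled in user space roughly as below. This is only a sketch under loud assumptions: it is not kernel code, every model_* name and the byte-per-byte shadow array are hypothetical, and only the numeric direction values are meant to mirror the kernel's enum dma_data_direction. The "check, then unpoison" treatment of the bidirectional case is the assumption being asked about above, not a confirmed answer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical mirror of the kernel's DMA direction values. */
enum model_dma_dir {
	MODEL_DMA_BIDIRECTIONAL = 0,
	MODEL_DMA_TO_DEVICE = 1,
	MODEL_DMA_FROM_DEVICE = 2,
};

#define MODEL_MEM_SIZE 64

/* Toy shadow memory: 1 = poisoned (uninitialized), 0 = initialized. */
static unsigned char model_shadow[MODEL_MEM_SIZE];

static void model_check_range(size_t off, size_t len)
{
	/* The model only covers offsets inside the toy buffer. */
	assert(off <= MODEL_MEM_SIZE && len <= MODEL_MEM_SIZE - off);
}

/* Mark a range as uninitialized, as a fresh page allocation would. */
static void model_poison(size_t off, size_t len)
{
	model_check_range(off, len);
	memset(model_shadow + off, 1, len);
}

/* Stand-in for kmsan_unpoison_shadow(ptr, size). */
static void model_unpoison(size_t off, size_t len)
{
	model_check_range(off, len);
	memset(model_shadow + off, 0, len);
}

/* Stand-in for kmsan_check_memory(ptr, size): true if fully initialized. */
static bool model_is_initialized(size_t off, size_t len)
{
	model_check_range(off, len);
	for (size_t i = 0; i < len; i++)
		if (model_shadow[off + i])
			return false;
	return true;
}

/*
 * Hypothetical hook for a dma_{map,sync,unmap}_* call.
 * Returns false where a real tool would raise an uninit-value report.
 */
static bool model_dma_hook(size_t off, size_t len, enum model_dma_dir dir)
{
	bool ok = true;

	/* Data is about to reach the device: it must be initialized. */
	if (dir == MODEL_DMA_TO_DEVICE || dir == MODEL_DMA_BIDIRECTIONAL)
		ok = model_is_initialized(off, len);

	/* The device may write the buffer: treat it as initialized. */
	if (dir == MODEL_DMA_FROM_DEVICE || dir == MODEL_DMA_BIDIRECTIONAL)
		model_unpoison(off, len);

	return ok;
}
```

In this model a poisoned buffer passed with MODEL_DMA_TO_DEVICE yields a report and stays poisoned, while MODEL_DMA_FROM_DEVICE silently unpoisons it. Item 3 of the list (translating a dma_addr_t back to a virtual address) is deliberately left out, since the model addresses the buffer by plain offset.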