From: Xiongwei Song <sxwjean@gmail.com>
Date: Fri, 13 May 2022 20:29:31 +0800
Subject: Re: squashfs performance regression and readahead
To: Hsin-Yi Wang
Cc: Phillip Lougher, Matthew Wilcox, Zheng Liang, Zhang Yi, Hou Tao,
 Miao Xie, Andrew Morton, Linus Torvalds, "Song, Xiongwei",
 "linux-mm@kvack.org", "squashfs-devel@lists.sourceforge.net"
Content-Type: text/plain; charset="UTF-8"
Hi Hsin-Yi,

One more thing: should we revert 9eec1d897139 ("squashfs: provide
backing_dev_info in order to disable read-ahead") and then put the two
patches in one review?

Regards,
Xiongwei

On Fri, May 13, 2022 at 8:16 PM Xiongwei Song wrote:
>
> Hello,
>
> On Thu, May 12, 2022 at 2:23 PM Hsin-Yi Wang wrote:
> >
> > On Thu, May 12, 2022 at 3:13 AM Phillip Lougher wrote:
> > >
> > > On 11/05/2022 16:12, Hsin-Yi Wang wrote:
> > > > On Tue, May 10, 2022 at 9:19 PM Xiongwei Song wrote:
> > > >>
> > > >> On Tue, May 10, 2022 at 8:47 PM Hsin-Yi Wang wrote:
> > > >>>
> > > >>> On Tue, May 10, 2022 at 8:31 PM Xiongwei Song wrote:
> > > >>>>
> > > >>>> Hi Hsin-Yi,
> > > >>>>
> > > >>>> On Mon, May 9, 2022 at 10:29 PM Hsin-Yi Wang wrote:
> > > >>>>>
> > > >>>>> On Mon, May 9, 2022 at 9:21 PM Matthew Wilcox wrote:
> > > >>>>>>
> > > >>>>>> On Mon, May 09, 2022 at 08:43:45PM +0800, Xiongwei Song wrote:
> > > >>>>>>> Hi Hsin-Yi and Matthew,
> > > >>>>>>>
> > > >>>>>>> With the patch from the attachment on linux 5.10, I ran the command
> > > >>>>>>> mentioned earlier and got the results below:
> > > >>>>>>> 1:40.65 (1m + 40.65s)
> > > >>>>>>> 1:10.12
> > > >>>>>>> 1:11.10
> > > >>>>>>> 1:11.47
> > > >>>>>>> 1:11.59
> > > >>>>>>> 1:11.94
> > > >>>>>>> 1:11.86
> > > >>>>>>> 1:12.04
> > > >>>>>>> 1:12.21
> > > >>>>>>> 1:12.06
> > > >>>>>>>
> > > >>>>>>> The performance has improved noticeably, but compared to linux 4.18
> > > >>>>>>> it is still not as good.
> > > >>>>>>>
> > > >>>>> I think you shouldn't compare the performance with 4.18 directly,
> > > >>>>> since there might be other factors that impact the performance.
> > > >>>>
> > > >>>> Makes sense.
> > > >>>>
> > > >>>>> I'd suggest comparing the same kernel version:
> > > >>>>> a) with this patch
> > > >>>>> b) with c1f6925e1091 ("mm: put readahead pages in cache earlier") reverted.
> > > >>>>
> > > >>>> With 9eec1d897139 ("squashfs: provide backing_dev_info in order to disable
> > > >>>> read-ahead") reverted and 0001-WIP-squashfs-implement-readahead.patch
> > > >>>> applied, the test results on linux 5.18 are:
> > > >>>> 1:41.51 (1m + 41.51s)
> > > >>>> 1:08.11
> > > >>>> 1:10.37
> > > >>>> 1:11.17
> > > >>>> 1:11.32
> > > >>>> 1:11.59
> > > >>>> 1:12.23
> > > >>>> 1:12.08
> > > >>>> 1:12.76
> > > >>>> 1:12.51
> > > >>>>
> > > >>>> i.e. performance is 1 to 2 seconds worse than vanilla linux 5.18.
> > > >>>>
> > > >>>
> > > >>> Can you share the pack file you used for testing? Thanks
> > > >>
> > > >> You are saying the files that are put in squashfs partitions?
> > > >> If yes, I can tell you: I just put some dynamic libraries in the partitions:
> > > >> -rwxr-xr-x 1 root root  200680 Apr 20 03:57 ld-2.33.so
> > > >> lrwxrwxrwx 1 root root      10 Apr 20 03:57 ld-linux-x86-64.so.2 -> ld-2.33.so
> > > >> -rwxr-xr-x 1 root root   18776 Apr 20 03:57 libanl-2.33.so
> > > >> lrwxrwxrwx 1 root root      14 Apr 20 03:57 libanl.so.1 -> libanl-2.33.so
> > > >> lrwxrwxrwx 1 root root      17 Apr 20 04:08 libblkid.so.1 -> libblkid.so.1.1.0
> > > >> -rwxr-xr-x 1 root root  330776 Apr 20 04:08 libblkid.so.1.1.0
> > > >> -rwxr-xr-x 1 root root 1823192 Apr 20 03:57 libc-2.33.so
> > > >> ...... snip ......
> > > >>
> > > >> The number of files is 110 (55 libs + 55 soft links to the libs). I have
> > > >> 90 squashfs partitions which hold identical files. Each partition is 11M,
> > > >> nothing special.
> > > >>
> > > >> Thanks.
> > > >>
> > > >
> > > > I noticed that there's a crash at
> > > > https://elixir.bootlin.com/linux/latest/source/lib/lzo/lzo1x_decompress_safe.c#L218
> > > > when testing on my system.
> > > > (I have CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS enabled.)
> > > >
> > > > Full logs:
> > > > [  119.062420] Unable to handle kernel paging request at virtual address ffffffc017337000
> > > > [  119.062437] Mem abort info:
> > > > [  119.062442]   ESR = 0x96000047
> > > > [  119.062447]   EC = 0x25: DABT (current EL), IL = 32 bits
> > > > [  119.062451]   SET = 0, FnV = 0
> > > > [  119.062454]   EA = 0, S1PTW = 0
> > > > [  119.062457] Data abort info:
> > > > [  119.062460]   ISV = 0, ISS = 0x00000047
> > > > [  119.062464]   CM = 0, WnR = 1
> > > > [  119.062469] swapper pgtable: 4k pages, 39-bit VAs, pgdp=0000000041099000
> > > > [  119.062473] [ffffffc017337000] pgd=000000010014a003, p4d=000000010014a003, pud=000000010014a003, pmd=000000010ba59003, pte=0000000000000000
> > > > [  119.062489] Internal error: Oops: 96000047 [#1] PREEMPT SMP
> > > > [  119.062494] Modules linked in: vhost_vsock vhost vhost_iotlb
> > > > vmw_vsock_virtio_transport_common vsock rfcomm algif_hash
> > > > algif_skcipher af_alg veth uinput xt_cgroup mtk_dip mtk_cam_isp
> > > > mtk_vcodec_enc mtk_vcodec_dec hci_uart mtk_fd mtk_mdp3 v4l2_h264
> > > > mtk_vcodec_common mtk_vpu xt_MASQUERADE mtk_jpeg cros_ec_rpmsg btqca
> > > > videobuf2_dma_contig v4l2_fwnode v4l2_mem2mem btrtl elants_i2c mtk_scp
> > > > mtk_rpmsg rpmsg_core mtk_scp_ipi mt8183_cci_devfreq ip6table_nat fuse
> > > > 8021q bluetooth ecdh_generic ecc iio_trig_sysfs cros_ec_lid_angle
> > > > cros_ec_sensors cros_ec_sensors_core industrialio_triggered_buffer
> > > > kfifo_buf cros_ec_sensorhub cros_ec_typec typec hid_google_hammer
> > > > ath10k_sdio lzo_rle lzo_compress ath10k_core ath mac80211 zram
> > > > cfg80211 uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_v4l2
> > > > videobuf2_common cdc_ether usbnet r8152 mii joydev
> > > > [  119.062613] CPU: 4 PID: 4161 Comm: chrome Tainted: G W
> > > > 5.10.112 #105 39f11bffda227eaae4c704733b9bf01db22d8b4d
> > > > [  119.062617] Hardware name: Google burnet board (DT)
> > > > [  119.062623] pstate: 20400005 (nzCv daif +PAN -UAO -TCO BTYPE=--)
> > > > [  119.062636] pc : lzo1x_decompress_safe+0x1dc/0x564
> > > > [  119.062643] lr : lzo_uncompress+0x134/0x1f0
> > > > [  119.062647] sp : ffffffc01837b860
> > > > [  119.062650] x29: ffffffc01837b860 x28: 0000000000000000
> > > > [  119.062656] x27: 0000000000005451 x26: ffffffc0171c9445
> > > > [  119.062662] x25: 0000000000000000 x24: ffffffc017437000
> > > > [  119.062668] x23: ffffffc0171c944f x22: ffffffc017136000
> > > > [  119.062673] x21: ffffffc017336ff1 x20: ffffffc017237000
> > > > [  119.062679] x19: ffffffc01837b8d0 x18: 0000000000000000
> > > > [  119.062684] x17: 00000000000001eb x16: 0000000000000012
> > > > [  119.062689] x15: 000000000010000f x14: d600120202000001
> > > > [  119.062695] x13: ffffffc017336ff1 x12: ffffffc017336ff4
> > > > [  119.062700] x11: 0000000000000002 x10: 01010101010100ff
> > > > [  119.062705] x9 : ffffffffffffffff x8 : ffffffc0171c944d
> > > > [  119.062710] x7 : d15d3aaabd294330 x6 : 0206397115fe28ab
> > > > [  119.062715] x5 : ffffffc0171c944f x4 : 000000000009344f
> > > > [  119.062721] x3 : ffffffc01837b8d0 x2 : ffffffc017237000
> > > > [  119.062726] x1 : 000000000009344f x0 : ffffffc0171c9447
> > > > [  119.062731] Call trace:
> > > > [  119.062738]  lzo1x_decompress_safe+0x1dc/0x564
> > > > [  119.062742]  lzo_uncompress+0x134/0x1f0
> > > > [  119.062746]  squashfs_decompress+0x6c/0xb4
> > > > [  119.062753]  squashfs_read_data+0x1a8/0x298
> > > > [  119.062758]  squashfs_readahead+0x308/0x474
> > > > [  119.062765]  read_pages+0x74/0x280
> > > > [  119.062769]  page_cache_ra_unbounded+0x1d0/0x228
> > > > [  119.062773]  do_page_cache_ra+0x44/0x50
> > > > [  119.062779]  do_sync_mmap_readahead+0x188/0x1a0
> > > > [  119.062783]  filemap_fault+0x100/0x350
> > > > [  119.062789]  __do_fault+0x48/0x10c
> > > > [  119.062793]  do_cow_fault+0x58/0x12c
> > > > [  119.062797]  handle_mm_fault+0x544/0x904
> > > > [  119.062804]  do_page_fault+0x260/0x384
> > > > [  119.062809]  do_translation_fault+0x44/0x5c
> > > > [  119.062813]  do_mem_abort+0x48/0xb4
> > > > [  119.062819]  el0_da+0x28/0x34
> > > > [  119.062824]  el0_sync_compat_handler+0xb8/0xcc
> > > > [  119.062829]  el0_sync_compat+0x188/0x1c0
> > > > [  119.062837] Code: f94001ae f90002ae f94005ae 910041ad (f90006ae)
> > > > [  119.062842] ---[ end trace 3e9828c7360fd7be ]---
> > > > [  119.090436] Kernel panic - not syncing: Oops: Fatal exception
> > > > [  119.090455] SMP: stopping secondary CPUs
> > > > [  119.090467] Kernel Offset: 0x2729c00000 from 0xffffffc010000000
> > > > [  119.090471] PHYS_OFFSET: 0xffffffd880000000
> > > > [  119.090477] CPU features: 0x08240002,2188200c
> > > >
> > > > 1) Traces near when the crash happened:
> > > > [   79.495580] Block @ 0x60eea9c, compressed size 65744, src size 1048576
> > > > [   80.363573] Block @ 0x1f9f000, compressed size 200598, src size 1048576
> > > > [   80.371256] Block @ 0x1fcff96, compressed size 80772, src size 1048576
> > > > [   80.428388] Block @ 0x1fe3b1a, compressed size 83941, src size 1048576
> > > > [   80.435319] Block @ 0x1ff82ff, compressed size 77936, src size 1048576
> > > > [   80.724331] Block @ 0x4501000, compressed size 364069, src size 1048576
> > > > [   80.738683] Block @ 0x4dccf28, compressed size 603215, src size 2097152
> > >
> > > Src size 2097152 is clearly wrong, as the maximum data block size is
> > > 1 Mbyte, i.e. 1048576.
> > >
> > > That debug line comes from
> > >
> > > https://elixir.bootlin.com/linux/latest/source/fs/squashfs/block.c#L156
> > >
> > > ----
> > > TRACE("Block @ 0x%llx, %scompressed size %d, src size %d\n",
> > >       index, compressed ? "" : "un", length, output->length);
> > > ----
> > >
> > > which indicates your code has created a page_actor of 2 Mbytes in size
> > > (output->length).
> > >
> > > This is completely incorrect, as the page_actor should never be larger
> > > than the size of the block being read. In most cases that will be
> > > msblk->block_size, but it may be less at the end of the file.
> > >
> > > You appear to be trying to read the amount of readahead requested. But
> > > you should always be trying to read the lesser of the readahead amount
> > > and the size of the block in question.
> > >
> > > Hope that helps.
> > >
> > > Phillip
> > >
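For illustration, the clamp Phillip describes might look like the sketch
below. This is not code from the actual patch: "start" (the byte offset of
the block being read) and the surrounding logic are assumptions, while
i_size_read(), readahead_count() and min_t() are the usual kernel helpers.

----
/*
 * Sketch only: size the page_actor for one squashfs_readahead()
 * iteration. Never ask the decompressor for more than one block,
 * and the final block of a file may be shorter than block_size.
 */
loff_t file_end = i_size_read(inode);
size_t ra_bytes = (size_t)readahead_count(ractl) << PAGE_SHIFT;
size_t expected = min_t(size_t, ra_bytes, msblk->block_size);

if (start + expected > file_end)        /* short tail block */
        expected = file_end - start;
----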
> > Hi Phillip,
> > Thanks for the explanation. After restricting the size fed to the
> > page_actor, the crash no longer happens.
> >
> > Hi Xiongwei,
> > Can you test this version (sent as attachment) again? I've tested on
> > my platform:
> > - arm64
> > - kernel 5.10
> > - pack_data size ~ 300K
> > - time ureadahead pack_data
> > 1. with c1f6925e1091 ("mm: put readahead pages in cache earlier") reverted:
> > 0.633s
> > 0.755s
> > 0.804s
> >
> > 2. with the patch applied:
> > 0.625s
> > 0.656s
> > 0.768s
> >
> Thanks for sharing. I have done the tests on 5.10 and 5.18. The results
> are a little worse than patch v1 in my test env.
>
> On linux 5.10:
> With c1f6925e1091 ("mm: put readahead pages in cache earlier") reverted:
> 1:37.16 (1m + 37.16s)
> 1:04.18
> 1:05.28
> 1:06.07
> 1:06.31
> 1:06.58
> 1:06.80
> 1:06.79
> 1:06.95
> 1:06.61
>
> With your patch v2:
> 2:04.27 (2m + 4.27s)
> 1:14.95
> 1:14.56
> 1:15.75
> 1:16.55
> 1:16.87
> 1:16.74
> 1:17.36
> 1:17.50
> 1:17.32
>
> On linux 5.18:
> With readahead disabled by default:
> 1:12.82
> 1:07.68
> 1:08.94
> 1:09.65
> 1:09.87
> 1:10.32
> 1:10.47
> 1:10.34
> 1:10.24
> 1:10.34
>
> With your patch v2:
> 2:00.14 (2m + 0.14s)
> 1:13.46
> 1:14.62
> 1:15.02
> 1:15.78
> 1:16.01
> 1:16.03
> 1:16.24
> 1:16.44
> 1:16.16
>
> As you can see, there is an extra ~6s increase on both 5.10 and 5.18.
> I don't know whether the change in the number of pages causes the overhead.
>
> One stupid question about the code below from your patch:
>
> +       }
> +
> +       kfree(actor);
> +       return;
> +
> +skip_pages:
>
> When is the page pointer array released after the pages are cached? I
> don't see anywhere that does that.
>
> Regards,
> Xiongwei
>
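To sketch the kind of cleanup the question is about: assuming the patch
fills a kmalloc'd "pages" array via __readahead_batch() (which takes a
reference on each page it returns), a typical teardown would be something
like the following. The names "pages", "nr_pages" and the loop are
illustrative assumptions, not the actual patch.

----
/* Illustrative cleanup once the block has been decompressed into the pages. */
unsigned int i;

for (i = 0; i < nr_pages; i++) {
        flush_dcache_page(pages[i]);
        SetPageUptodate(pages[i]);
        unlock_page(pages[i]);
        put_page(pages[i]);     /* drop the batch's reference */
}
kfree(actor);
kfree(pages);                   /* the pointer array itself */
----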
> >
> > Hi Matthew,
> > Thanks for reviewing the patch previously. Does this version look good
> > to you? If so, I can send it to the list.
> >
> > Thanks for all of your help.
> >
> > > >
> > > > It's also noticed that when the crash happened, nr_pages obtained by
> > > > readahead_count() is 512:
> > > > nr_pages = readahead_count(ractl); // this line
> > > >
> > > > 2) Normal cases that won't crash:
> > > > [   22.651750] Block @ 0xb3bbca6, compressed size 42172, src size 262144
> > > > [   22.653580] Block @ 0xb3c6162, compressed size 29815, src size 262144
> > > > [   22.656692] Block @ 0xb4a293f, compressed size 17484, src size 131072
> > > > [   22.666099] Block @ 0xb593881, compressed size 39742, src size 262144
> > > > [   22.668699] Block @ 0xb59d3bf, compressed size 37841, src size 262144
> > > > [   22.695739] Block @ 0x13698673, compressed size 65907, src size 131072
> > > > [   22.698619] Block @ 0x136a87e6, compressed size 3155, src size 131072
> > > > [   22.703400] Block @ 0xb1babe8, compressed size 99391, src size 131072
> > > > [   22.706288] Block @ 0x1514abc6, compressed size 4627, src size 131072
> > > >
> > > > nr_pages is observed to be 32, 64, 256... These don't cause a crash.
> > > > Other values (max_pages, bsize, block...) look normal.
> > > >
> > > > I'm not sure why the crash happened, but I tried modifying the mask
> > > > a bit. After changing the mask value as below, the crash is gone
> > > > (nr_pages is <= 256).
> > > > Based on my testing on a 300K pack file, there's no performance change.
> > > >
> > > > diff --git a/fs/squashfs/file.c b/fs/squashfs/file.c
> > > > index 20ec48cf97c5..f6d9b6f88ed9 100644
> > > > --- a/fs/squashfs/file.c
> > > > +++ b/fs/squashfs/file.c
> > > > @@ -499,8 +499,8 @@ static void squashfs_readahead(struct readahead_control *ractl)
> > > >  {
> > > >         struct inode *inode = ractl->mapping->host;
> > > >         struct squashfs_sb_info *msblk = inode->i_sb->s_fs_info;
> > > > -       size_t mask = (1UL << msblk->block_log) - 1;
> > > >         size_t shift = msblk->block_log - PAGE_SHIFT;
> > > > +       size_t mask = (1UL << shift) - 1;
> > > >
> > > > Any pointers are appreciated. Thanks!
> > >
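For reference, the arithmetic behind that one-line change, using a 1 MiB
block (block_log = 20) and 4 KiB pages (PAGE_SHIFT = 12) to match the
traces above; the values are worked out here for illustration only.

----
size_t shift    = 20 - 12;           /* 8: a block covers 1 << 8 = 256 pages */
size_t old_mask = (1UL << 20) - 1;   /* 0xfffff: masks a byte offset within a block */
size_t new_mask = (1UL <<  8) - 1;   /* 0xff:    masks a page index within a block */
----

The old mask is in byte units, the new one in page units. Note that the
crashing case had nr_pages = 512, and 512 x 4 KiB = 2 MiB, which is exactly
the bogus src size Phillip flagged; that is consistent with the page_actor
being sized from the raw readahead length instead of being clamped to one
block.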