From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Tue, 18 Apr 2017 11:47:53 +0900
From: Minchan Kim
To: Sergey Senozhatsky
CC: Andrew Morton, Sergey Senozhatsky
Subject: Re: [PATCH 1/3] zram: fix operator precedence to get offset
Message-ID: <20170418024753.GA10648@bbox>
References: <1492042622-12074-1-git-send-email-minchan@kernel.org>
 <20170414050747.GB462@jagdpanzerIV.localdomain>
 <20170414153251.GA16910@bgram>
 <20170417012105.GA518@jagdpanzerIV.localdomain>
 <20170417015429.GE518@jagdpanzerIV.localdomain>
 <20170417021439.GA20981@bbox>
 <20170417105016.GF518@jagdpanzerIV.localdomain>
 <20170417235310.GB21354@bbox>
 <20170418015310.GA558@jagdpanzerIV.localdomain>
In-Reply-To: <20170418015310.GA558@jagdpanzerIV.localdomain>
User-Agent: Mutt/1.5.24 (2015-08-30)
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Apr 18, 2017 at 10:53:10AM +0900, Sergey Senozhatsky wrote:
> Hello,
>
> On (04/18/17 08:53), Minchan Kim wrote:
> > On Mon, Apr 17, 2017 at 07:50:16PM +0900, Sergey Senozhatsky
wrote:
> > > Hello Minchan,
> > >
> > > On (04/17/17 11:14), Minchan Kim wrote:
> > > > On Mon, Apr 17, 2017 at 10:54:29AM +0900, Sergey Senozhatsky wrote:
> > > > > On (04/17/17 10:21), Sergey Senozhatsky wrote:
> > > > > > > However, it should be *fixed* to prevent confusion in future
> > > > >
> > > > > or may be something like below? can save us some cycles.
> > > > >
> > > > > remove this calculation
> > > > >
> > > > > -       offset = sector & (SECTORS_PER_PAGE - 1) << SECTOR_SHIFT;
> > > > >
> > > > >
> > > > > and pass 0 to zram_bvec_rw()
> > > > >
> > > > > -       err = zram_bvec_rw(zram, &bv, index, offset, is_write);
> > > > > +       err = zram_bvec_rw(zram, &bv, index, 0, is_write);
> > > >
> > > > That was one I wrote but have thought it more.
> > > >
> > > > Because I suspect fs can submit page-size IO in non-aligned PAGE_SIZE
> > > > sector? For example, it can submit PAGE_SIZE read request from 9 sector.
> > > > Is it possible? I don't know.
> > > >
> > > > As well, FS can format zram from sector 1, not sector 0? IOW, can't it
> > > > use starting sector as non-page aligned sector?
> > > > We can do it via fdisk?
> > > >
> > > > Anyway, If one of scenario I mentioned is possible, zram_rw_page will
> > > > be broken.
> > > >
> > > > If it's hard to check all of scenario in this moment, it would be
> > > > better to not remove it and then add WARN_ON(offset) in there.
> > > >
> > > > While I am writing this, I found this.
> > > >
> > > > /**
> > > >  * bdev_read_page() - Start reading a page from a block device
> > > >  * @bdev: The device to read the page from
> > > >  * @sector: The offset on the device to read the page to (need not be aligned)
> > > >  * @page: The page to read
> > > >  *
> > > >
> > > > Hmm,, need investigation but no time.
> > >
> > > good questions.
> > >
> > > as far as I can see, we never use 'offset' which we pass to zram_bvec_rw()
> > > from zram_rw_page().
`offset' makes a lot of sense for partial IO, but in
> > > zram_bvec_rw() we always do "bv.bv_len = PAGE_SIZE".
> > >
> > > so what we have is
> > >
> > > for READ
> > >
> > > zram_rw_page()
> > >         bv.bv_len = PAGE_SIZE
> > >         zram_bvec_rw(zram, &bv, index, offset, is_write);
> > >                 zram_bvec_read()
> > >                         if (is_partial_io(bvec))        // always false
> > >                                 memcpy(user_mem + bvec->bv_offset,
> > >                                         uncmem + offset,
> > >                                         bvec->bv_len);
> > >
> > >
> > > for WRITE
> > >
> > > zram_rw_page()
> > >         bv.bv_len = PAGE_SIZE
> > >         zram_bvec_rw(zram, &bv, index, offset, is_write);
> > >                 zram_bvec_write()
> > >                         if (is_partial_io(bvec))        // always false
> > >                                 memcpy(uncmem + offset,
> > >                                         user_mem + bvec->bv_offset,
> > >                                         bvec->bv_len);
> > >
> > >
> > > and our is_partial_io() looks at ->bv_len:
> > >
> > >         bvec->bv_len != PAGE_SIZE;
> > >
> > > which we set to PAGE_SIZE.
> > >
> > > so in the existing scheme of things, we never care about 'sector'
> > > passed from zram_rw_page(). and this has worked for us for quite
> > > some time. my call would be -- let's drop zram_rw_page() `sector'
> > > calculation.
> >
> > I can do but before that, I want to confirm. Ccing Matthew,
> > Summary for Matthew,
> >
> > I see following comment about the sector from bdev_read_page.
> >
> > /**
> >  * bdev_read_page() - Start reading a page from a block device
> >  * @bdev: The device to read the page from
> >  * @sector: The offset on the device to read the page to (need not be aligned)
> >  * @page: The page to read
> >  *
> >
> > Does it mean that sector can be not aligned PAGE_SIZE?
> >
> > For example, 512byte sector, 4K page system, 4K = 8 sector
> >
> >         bdev_read_page(bdev, 9, page);
>
> do you mean a sector that spans two pages? sectors are pow of 2 in size
> and pages are pow of 2 in size, so page_size is `K * sector_size', isn't
> it?
>
> fs/mpage.c
>
> static struct bio *
> do_mpage_readpage(struct bio *bio, struct page *page, unsigned nr_pages,
>                 sector_t *last_block_in_bio, struct buffer_head *map_bh,
>                 unsigned long *first_logical_block, get_block_t get_block,
>                 gfp_t gfp)
> {
>         const unsigned blkbits = inode->i_blkbits;
>         const unsigned blocks_per_page = PAGE_SIZE >> blkbits;
>         const unsigned blocksize = 1 << blkbits;
>         sector_t block_in_file;
>         sector_t last_block;
>         sector_t last_block_in_file;
>         sector_t blocks[MAX_BUF_PER_PAGE];
> ...
>         block_in_file = (sector_t)page->index << (PAGE_SHIFT - blkbits);
>         last_block = block_in_file + nr_pages * blocks_per_page;
>         last_block_in_file = (i_size_read(inode) + blocksize - 1) >> blkbits;
>         if (last_block > last_block_in_file)
>                 last_block = last_block_in_file;
>
> or did I misunderstand your question?

I meant: if bdev_read_page asks for 4K (8 sectors) starting at sector 9
(if that is possible), zram should handle it as two separate IO requests,
like below.

zram_rw_page:

        index = sector >> SECTORS_PER_PAGE_SHIFT;
        offset = (sector & (SECTORS_PER_PAGE - 1)) << SECTOR_SHIFT;

        /* first part: from 'offset' to the end of zram page 'index' */
        bv.bv_len = PAGE_SIZE - offset;
        bv.bv_offset = 0;
        zram_bvec_rw(zram, &bv, index, offset, is_write);

        /* second part: the remaining 'offset' bytes from page 'index + 1' */
        bv.bv_len = offset;
        bv.bv_offset = PAGE_SIZE - offset;
        zram_bvec_rw(zram, &bv, index + 1, 0, is_write);
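
To make the arithmetic concrete, here is a rough userspace sketch of the
split above. It is not zram code. It assumes 512-byte sectors and 4K pages,
and struct chunk, offset_unparenthesized(), offset_fixed() and
split_page_io() are names made up for this illustration; each chunk stands
in for the arguments of one zram_bvec_rw() call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed for illustration only: 512-byte sectors, 4K pages. */
#undef PAGE_SHIFT               /* in case a libc header defines these */
#undef PAGE_SIZE
#define SECTOR_SHIFT            9
#define PAGE_SHIFT              12
#define PAGE_SIZE               (1UL << PAGE_SHIFT)
#define SECTORS_PER_PAGE_SHIFT  (PAGE_SHIFT - SECTOR_SHIFT)
#define SECTORS_PER_PAGE        (1UL << SECTORS_PER_PAGE_SHIFT)

/* Hypothetical stand-in for the arguments of one zram_bvec_rw() call. */
struct chunk {
    uint64_t index;     /* zram page index */
    size_t offset;      /* byte offset inside that zram page */
    size_t bv_offset;   /* byte offset inside the caller's page */
    size_t bv_len;      /* number of bytes to transfer */
};

/*
 * How the compiler parses "sector & (SECTORS_PER_PAGE - 1) << SECTOR_SHIFT":
 * '<<' binds tighter than '&', so the mask is shifted, not the result.
 */
size_t offset_unparenthesized(uint64_t sector)
{
    return (size_t)(sector & ((SECTORS_PER_PAGE - 1) << SECTOR_SHIFT));
}

/* Intended meaning: sector index within the page, converted to bytes. */
size_t offset_fixed(uint64_t sector)
{
    return (size_t)((sector & (SECTORS_PER_PAGE - 1)) << SECTOR_SHIFT);
}

/*
 * Split one PAGE_SIZE request starting at 'sector' into one chunk
 * (page-aligned sector) or two chunks (non-aligned sector).
 * Returns the number of chunks written to 'out'.
 */
int split_page_io(uint64_t sector, struct chunk out[2])
{
    uint64_t index = sector >> SECTORS_PER_PAGE_SHIFT;
    size_t offset = offset_fixed(sector);

    out[0].index = index;
    out[0].offset = offset;
    out[0].bv_offset = 0;
    out[0].bv_len = PAGE_SIZE - offset;
    if (offset == 0)
        return 1;

    out[1].index = index + 1;
    out[1].offset = 0;
    out[1].bv_offset = PAGE_SIZE - offset;
    out[1].bv_len = offset;

    /* Both parts together must cover exactly one page. */
    assert(out[0].bv_len + out[1].bv_len == PAGE_SIZE);
    return 2;
}
```

With these assumed sizes, sector 9 gives index 1 and offset 512 with the
fixed expression, so the request becomes 3584 bytes from zram page 1 plus
512 bytes from zram page 2. The unparenthesized expression yields 0 for the
same sector, which is why the old calculation never produced a usable
offset.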