From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.2 required=3.0 tests=HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,UNPARSEABLE_RELAY,URIBL_BLOCKED, USER_AGENT_SANE_1 autolearn=no autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id EA81BC352A2 for ; Thu, 6 Feb 2020 13:10:57 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id CC2A0217BA for ; Thu, 6 Feb 2020 13:10:57 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727717AbgBFNK5 (ORCPT ); Thu, 6 Feb 2020 08:10:57 -0500 Received: from smtp2207-205.mail.aliyun.com ([121.197.207.205]:57380 "EHLO smtp2207-205.mail.aliyun.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727361AbgBFNK4 (ORCPT ); Thu, 6 Feb 2020 08:10:56 -0500 X-Alimail-AntiSpam: AC=CONTINUE;BC=0.07436282|-1;CH=green;DM=CONTINUE|CONTINUE|true|0.774386-0.00992849-0.215685;DS=SPAM|spam_ad|0.977435-0.000146245-0.0224188;FP=0|0|0|0|0|-1|-1|-1;HT=e01l07426;MF=liaoweixiong@allwinnertech.com;NM=1;PH=DS;RN=16;RT=16;SR=0;TI=SMTPD_---.GlHG0bB_1580994647; Received: from 192.168.31.126(mailfrom:liaoweixiong@allwinnertech.com fp:SMTPD_---.GlHG0bB_1580994647) by smtp.aliyun-inc.com(10.147.42.135); Thu, 06 Feb 2020 21:10:48 +0800 Subject: Re: [PATCH v1 11/11] mtd: new support oops logger based on pstore/blk To: Miquel Raynal Cc: Kees Cook , Anton Vorontsov , Colin Cross , Tony Luck , Jonathan Corbet , Richard Weinberger , Vignesh Raghavendra , Mauro Carvalho Chehab , "David S. Miller" , Rob Herring , Greg Kroah-Hartman , Jonathan Cameron , linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mtd@lists.infradead.org References: <1579482233-2672-1-git-send-email-liaoweixiong@allwinnertech.com> <1579482233-2672-12-git-send-email-liaoweixiong@allwinnertech.com> <20200120110306.32e53fd8@xps13> <27226590-379c-8784-f461-f5d701015611@allwinnertech.com> <20200121094802.61f8cb4d@xps13> <2c6000b1-ae25-564b-911a-2879e9c244b2@allwinnertech.com> <20200122184114.125b42c8@xps13> From: liaoweixiong Message-ID: Date: Thu, 6 Feb 2020 21:10:47 +0800 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.4.1 MIME-Version: 1.0 In-Reply-To: <20200122184114.125b42c8@xps13> Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 8bit Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org hi Miquel Raynal, On 2020/1/23 AM 1:41, Miquel Raynal wrote: > Hello, > > >>>>>> +/* >>>>>> + * All zones will be read as pstore/blk will read zone one by one when do >>>>>> + * recover. >>>>>> + */ >>>>>> +static ssize_t mtdpstore_read(char *buf, size_t size, loff_t off) >>>>>> +{ >>>>>> + struct mtdpstore_context *cxt = &oops_cxt; >>>>>> + size_t retlen; >>>>>> + int ret; >>>>>> + >>>>>> + if (mtdpstore_block_isbad(cxt, off)) >>>>>> + return -ENEXT; >>>>>> + >>>>>> + pr_debug("try to read off 0x%llx size %zu\n", off, size); >>>>>> + ret = mtd_read(cxt->mtd, off, size, &retlen, (u_char *)buf); >>>>>> + if ((ret < 0 && !mtd_is_bitflip(ret)) || size != retlen) { >>>>> >>>>> IIRC size != retlen does not mean it failed, but that you should >>>>> continue reading after retlen bytes, no? >>>>> >> >>>> Yes, you are right. I will fix it. Thanks. >>>> >>>>> Also, mtd_is_bitflip() does not mean that you are reading a false >>>>> buffer, but that the data has been corrected as it contained bitflips. >>>>> mtd_is_eccerr() however, would be meaningful. >>>>> >> >>>> Sure I know mtd_is_bitflip() does not mean failure, but I do not think >>>> mtd_is_eccerr() should be here since the codes are ret < 0 and NOT >>>> mtd_is_bitflip(). >>> >>> Yes, just drop this check, only keep ret < 0. >>> >> >> If I don't get it wrong, it should not be dropped here. Like your words, >> "mtd_is_bitflip() does not mean that you are reading a false buffer, >> but that the data has been corrected as it contained bitflips.", the >> data I get are valid even if mtd_is_bitflip() return true. It's correct >> data and it's no need to go to handle error. To me, the codes >> should be: >> if (ret < 0 && !mit_is_bitflip()) >> [error handling] > > Please check the implementation of mtd_is_bitflip(). You'll probably > figure out what I am saying. > > https://elixir.bootlin.com/linux/latest/source/include/linux/mtd/mtd.h#L585 > How about the codes as follows: for (done = 0, retlen = 0; done < size; done += retlen) { ret = mtd_read(..., &retlen, ...); if (!ret) continue; /* * do nothing if bitflip and ecc error occurs because whether * it's bitflip or ECC error, just a small number of bits flip * and the impact on log data is so small. The mtdpstore just * hands over what it gets and user can judge whether the data * is valid or not. */ if (mtd_is_bitflip(ret)) { dev_warn("bitflip at...."); continue; } else if (mtd_is_eccerr(ret)) { dev_warn("eccerr at...."); retlen = retlen == 0 ? size : retlen; continue; } else { dev_err("read failure at..."); /* this zone is broken, try next one */ return -ENEXT; } } > > |...] > >>>>>> + return; >>>>>> + } >>>>>> + if (unlikely(info->dmesg_size % mtd->writesize)) { >>>>>> + pr_err("record size %lu KB must align to write size %d KB\n", >>>>>> + info->dmesg_size / 1024, >>>>>> + mtd->writesize / 1024); >>>>> >>>>> This condition is weird, why would you check this? >>>>> >> >>>> pstore/blk will write 'record_size' dmesg log at one time. >>>> Since each write data must be aligned to 'writesize' for flash, I am not >>>> sure >>>> all flash drivers are compatible with misaligned data, that's why i >>>> check this. >>> >>> I think you should enforce this alignment instead of checking it. >>> >> >> Do you mean that mtdpstore should enforce this alignment while running? >> The way I can think of is to handle a buffer aligned to writesize and >> write to flash with this aligned buffer. >> >> That causes some error. The MTD device will be divided into mutil >> chunks accroding to dmesg_size. Each chunk stores a individual >> OOPS/Panic log. If dmesg_size is misaligned to writesize, the last >> write results in next write failure because the page of flash can only >> be programed once before next erase and the page shared by two chunks >> has been used by the last write. Besides, we can not read to buffer, >> ersae and write back as there is no read/erase for panic case. > > I mean: what is the usual size of dmesg? I don't get why you need it to The usual size of dmesg is 64K, usually be equal to log_buf size. > be ie. a multiple of 2k. It probably is actually, I don't know if there > is a standard. But if dmesg_size is eg 3k, just skip the end of the > partially written page and start writing at the next page? > 1. upper layer do not support to skip partially written page The upper layer pstore/blk will not skip the end of the partially written page since it is not only used for MTD device, but also block device, which has no page limited. A common practice at the upper layer is to check the size and limit size to be aligned. We make dmesg_size to be a multiple of 4K for greater compatibility. 2. chunks management and size per write The mtdpstore tells pstore/blk how large the device is. Then pstore/blk will divide it into several chunks according to dmesg_size. The pstore/blk will write dmesg_size data at a time. In a word, the amount of data written each time can not lead to page slicing, so, dmesg_size must be aligned to writesize. >> >>>> >>>>>> + return; >>>>>> + } >>>>>> + if (unlikely(mtd->size > MTDPSTORE_MAX_MTD_SIZE)) { >>>>>> + pr_err("mtd%d is too large (limit is %d MiB)\n", >>>>>> + mtd->index, >>>>>> + MTDPSTORE_MAX_MTD_SIZE / 1024 / 1024); >>>>> >>>>> Same question? I could understand that it is easier to manage blocks >>>>> knowing their maximum number though. >>>>> >> >>>> It refers to mtdoops. >>> >>> What do you mean? >>> >> >> To me, it's unnecessary to check at all, however it is really there >> on codes of mtdoops. I refer to module mtdoops when I design mtdpstore. >> It may be helpfull for some cases out of my think, that's why I keep it. > > Why not. > OK, I will drop it. > [...] > >>>> >>>> In case of repeated erase when users remove several log files, mtdpstore >>>> do remove jobs when exit. >>>> >>>> Besides, mtdpstore do not check the return code to ensure write back valid >>>> log as much as possible. >>> >>> You are not in a critical path, I don't understand why you don't check >>> it? If it returns an error, it means the data is not written. IMHO it >>> is best to alert the user than to silently fail. >>> >> >> This function will be called only when mtd device is removing. It's >> useless to alert the user but try to flush the other valid data to > > It is useful to alert the user! It means something went wrong while > everything seems fine. > >> flash as mush as possible by which reduce losses. If it's just >> because of busy, what happens next time? > > Just because of busy? I don't get it. I want to express that if the write fails due to busy, the next one may succeed. > > I'm okay with the idea of trying to write the other chunks though: > > while (remaining_chunk) { > ret = mtd_write() > if (ret) { > alert-user; > continue; > } > } > OK, I will fix it. >> >>>> >>>>>> +. >>>> + off += zonesize; >>>>>> + size -= min_t(unsigned int, zonesize, size); >>>>>> + } >>>>>> + >>>>>> +free: >>>>>> + kfree(buf); >>>>>> + return ret; >>>>>> +} >>>>>> + >>> >>> >>> [...] >>> >>>>> >>>>> Thanks, >>>>> Miquèl >>>>> >> >>>> I will collect more suggestions and submit the new version at one time. >>>> >>> >>> Sure, no hurry. >>> >> >> I am on holiday, please forgive me for my slow response. > > Take your time, as I said, no hurry. > >> >>> >>> Thanks, >>> Miquèl >>> > > > > > Thanks, > Miquèl >