Date: Thu, 25 Oct 2018 11:20:40 +0100
From: "Bryn M. Reeves"
To: Mikulas Patocka
Cc: Paul Lawrence, Mike Snitzer, Jonathan Corbet, Shaohua Li,
    linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
    linux-raid@vger.kernel.org, dm-devel@redhat.com,
    kernel-team@android.com, Alasdair Kergon
Subject: Re: [dm-devel] [RFC] dm-bow working prototype
Message-ID: <20181025102039.GA9378@localhost.localdomain>
References: <20181023212358.60292-1-paullawrence@google.com>
    <20181023221819.GB17552@agk-dp.fab.redhat.com>
    <296148c2-f2d9-5818-ea76-d71a0d6f5cd4@google.com>

On Wed, Oct 24, 2018 at 03:24:29PM -0400, Mikulas Patocka wrote:
>
>
> On Wed, 24 Oct 2018, Paul Lawrence wrote:
>
> > Android has had the concept of A/B updates since Android N, which means
> > that if an update is unable to boot for any reason three times, we revert to
> > the older system. However, if the failure occurs after the new system has
> > started modifying userdata, we will be attempting to start an older system
> > with a newer userdata, which is an unsupported state. Thus to make A/B able to
> > fully deliver on its promise of safe updates, we need to be able to revert
> > userdata in the event of a failure.
> >
> > For those cases where the file system on userdata supports
> > snapshots/checkpoints, we should clearly use them. However, there are many
> > Android devices using filesystems that do not support checkpoints, so we need
> > a generic solution. Here we had two options. One was to use overlayfs to
> > manage the changes, then on merge have a script that copies the files to the
> > underlying fs. This was rejected on the grounds of compatibility concerns and
> > managing the merge through reboots, though it is definitely a plausible
> > strategy. The second was to work at the block layer.
> >
> > At the block layer, dm-snap would have given us a ready-made solution, except
> > that there is no sufficiently large spare partition on Android devices. But in
> > general there is free space on userdata, just scattered over the device, and
> > of course likely to get modified as soon as userdata is written to. We also
> > decided that the merge phase was a high risk component of any design. Since
> > the normal path is that the update succeeds, we anticipate merges happening
> > 99% of the time, and we want to guarantee their success even in the event of
> > unexpected failure during the merge. Thus we decided we preferred a strategy
> > where the device is in the committed state at all times, and rollback requires
> > work, to one where the device remains in the original state but the merge is
> > complex.
>
> What about allocating a big file, using the FIEMAP ioctl to find the
> physical locations of the file, creating a dm device with many linear
> targets to map the big file and using it as a snapshot store? I think it
> would be way easier than re-implementing the snapshot functionality in a
> new target.

libdevmapper already has code to handle enumerating physical file
extents via the dm-stats file mapping support. It should be fairly
easy to adapt this to create dm tables rather than dm-stats regions.

See dm_stats_create_regions_from_fd() and _stats_map_file_regions().

Bryn.
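
To make the suggestion concrete, here is a rough sketch of the idea, not the libdevmapper code itself: it walks a file's extents with the FIEMAP ioctl and prints one dm "linear" table line per extent. The function names format_linear_line() and print_linear_table() are hypothetical, as is the assumption that the file lives on a single block device (passed in as dev) so that fe_physical is an offset on that device. Error handling is minimal.

```c
#include <assert.h>
#include <linux/fiemap.h>
#include <linux/fs.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/ioctl.h>

#define SECTOR_SIZE 512ULL   /* dm tables are expressed in 512-byte sectors */
#define EXTENT_BATCH 32      /* extents fetched per ioctl call (arbitrary) */

/*
 * Hypothetical helper: format one table line of the form
 *   <start> <length> linear <device> <offset>
 * Inputs are byte values; the output is in sectors.
 */
int format_linear_line(char *buf, size_t buflen, uint64_t start_bytes,
                       uint64_t length_bytes, const char *dev,
                       uint64_t physical_bytes)
{
    assert(buf != NULL && dev != NULL);
    return snprintf(buf, buflen, "%llu %llu linear %s %llu",
                    (unsigned long long)(start_bytes / SECTOR_SIZE),
                    (unsigned long long)(length_bytes / SECTOR_SIZE),
                    dev,
                    (unsigned long long)(physical_bytes / SECTOR_SIZE));
}

/*
 * Hypothetical helper: enumerate the extents of the file open on fd and
 * write one linear target line per extent to out. Targets are laid out
 * back to back, so the resulting device has no gaps even if the file does.
 * Returns 0 on success, -1 on error or on an extent we cannot map.
 */
int print_linear_table(int fd, const char *dev, FILE *out)
{
    size_t sz = sizeof(struct fiemap) +
                EXTENT_BATCH * sizeof(struct fiemap_extent);
    struct fiemap *fm = calloc(1, sz);
    uint64_t next = 0;        /* next logical file offset to query */
    uint64_t table_pos = 0;   /* running start offset in the dm device */
    int last = 0;

    if (!fm)
        return -1;

    while (!last) {
        memset(fm, 0, sz);
        fm->fm_start = next;
        fm->fm_length = FIEMAP_MAX_OFFSET - next;
        fm->fm_flags = FIEMAP_FLAG_SYNC;   /* flush delayed allocation first */
        fm->fm_extent_count = EXTENT_BATCH;

        if (ioctl(fd, FS_IOC_FIEMAP, fm) < 0)
            goto fail;
        if (fm->fm_mapped_extents == 0)
            break;

        for (uint32_t i = 0; i < fm->fm_mapped_extents; i++) {
            struct fiemap_extent *fe = &fm->fm_extents[i];
            char line[256];

            /* Skip nothing silently: these extents have no usable
             * sector-aligned physical location. */
            if (fe->fe_flags & (FIEMAP_EXTENT_UNKNOWN | FIEMAP_EXTENT_DELALLOC |
                                FIEMAP_EXTENT_ENCODED | FIEMAP_EXTENT_NOT_ALIGNED |
                                FIEMAP_EXTENT_DATA_INLINE))
                goto fail;

            format_linear_line(line, sizeof(line), table_pos,
                               fe->fe_length, dev, fe->fe_physical);
            fprintf(out, "%s\n", line);

            table_pos += fe->fe_length;
            next = fe->fe_logical + fe->fe_length;
            if (fe->fe_flags & FIEMAP_EXTENT_LAST)
                last = 1;
        }
    }
    free(fm);
    return 0;
fail:
    free(fm);
    return -1;
}
```

A caller would open the preallocated file, call print_linear_table(fd, "/dev/...", stdout) with the device that backs the file system, and hand the resulting table to dmsetup create. Whether the file system can be trusted to leave those blocks alone afterwards is a separate question that this sketch does not address.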