From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from list by lists.gnu.org with archive (Exim 4.90_1)
	id 1kpIjP-0004gR-Ae
	for mharc-grub-devel@gnu.org; Tue, 15 Dec 2020 17:23:43 -0500
Received: from eggs.gnu.org ([2001:470:142:3::10]:55446)
 by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1) (envelope-from <development@efficientek.com>)
 id 1kpIjL-0004ch-O6
 for grub-devel@gnu.org; Tue, 15 Dec 2020 17:23:40 -0500
Received: from mail-qk1-x741.google.com ([2607:f8b0:4864:20::741]:35308)
 by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128)
 (Exim 4.90_1) (envelope-from <development@efficientek.com>)
 id 1kpIjC-0002Wc-BZ
 for grub-devel@gnu.org; Tue, 15 Dec 2020 17:23:39 -0500
Received: by mail-qk1-x741.google.com with SMTP id n142so20792106qkn.2
 for <grub-devel@gnu.org>; Tue, 15 Dec 2020 14:23:29 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=efficientek-com.20150623.gappssmtp.com; s=20150623;
 h=date:from:to:cc:subject:message-id:in-reply-to:references:reply-to
 :mime-version:content-transfer-encoding;
 bh=oZtGF+fy8yGAnLEqxbm1b/m/Y9QDM4Fdbl2IgNgbZhs=;
 b=UbRmNyZqHZkCspTwklL5c/bv1Mma1IW2xbQr0RixMvf1Pbf/GaEaiL+h1+7WSpP1x0
 PKo3StUv+YibNP6s5o2/Ue+catXlan3Qy3xnZ0eIfTXG31lE2gNM0XATUGvDW8tLvmT4
 NmfKpyadMDC0BiUc4RGfOBFWyE4+/rleO3ncdecQYBVWy2IS4u83RfaIUnACc3FCSX5V
 sbBdzUnmBrv3CU0dINBxgLnQMv77g5plkRAtcadYSx+bRrFPIQp7kexC1YU16+WydCpS
 krtc3/BzKhIS2ZOBgs6JB0MZ9BsyLCXODUkBvYZA8I20GuoWjwYTZEVZDQLWEooqMrSb
 WXew==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20161025;
 h=x-gm-message-state:date:from:to:cc:subject:message-id:in-reply-to
 :references:reply-to:mime-version:content-transfer-encoding;
 bh=oZtGF+fy8yGAnLEqxbm1b/m/Y9QDM4Fdbl2IgNgbZhs=;
 b=ZCIVrJbbticW4Fb2dNQDPImPSOd09n/x42nb1re+la5MTiq6eFWMposJruoEZzXfFV
 bsxL3+XxKv/RTLy3xhYxsgu67hytMjBQZqC3YhiahFlbBzloPxIQjqzB4j/Qhv2BiIUU
 L70cuDDG2/vrwFqrpQ1PvNi8keXBeL1d0vwUl19CHeXP5IVlZ7s2TNMuM2lRYjHHOmwg
 E2DLOpjf4cokthWj/CRfKNvgL9UlHIXhbRCyTIQNcOf6YXc/o4+K9hGGOs8wCZGLgYMP
 5ZK5FI1jE+r06I3B7hL+zKggFkUTLFiIchShwkCxpFlkAMOttwUWPJVAxmD+uEPZ0thE
 Sylw==
X-Gm-Message-State: AOAM531FSWKIGa5PeEVKaYboN2rGICvA2j7lL/oN6F+wk1EId8haDh4U
 XQveVstRMwvI7cyZsdpco8HzkaoCwtPaaQ==
X-Google-Smtp-Source: ABdhPJyEdEmpNvlW37NXA1RoxHgfxAmLS0kz6CjLpZf4LyC1ebyN99GLK6LCMXHH2+I2E+7o6XZiiQ==
X-Received: by 2002:a37:4acb:: with SMTP id
 x194mr39641092qka.295.1608071008513; 
 Tue, 15 Dec 2020 14:23:28 -0800 (PST)
Received: from crass-HP-ZBook-15-G2 ([136.49.211.192])
 by smtp.gmail.com with ESMTPSA id b67sm41302qkc.44.2020.12.15.14.23.28
 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
 Tue, 15 Dec 2020 14:23:28 -0800 (PST)
Date: Tue, 15 Dec 2020 16:23:19 -0600
From: Glenn Washburn <development@efficientek.com>
To: Daniel Kiper <dkiper@net-space.pl>
Cc: grub-devel@gnu.org, Patrick Steinhardt <ps@pks.im>
Subject: Re: [PATCH v8 14/18] luks2: Better error handling when setting up
 the cryptodisk
Message-ID: <20201215162319.05d5f02e@crass-HP-ZBook-15-G2>
In-Reply-To: <20201212151918.4zr3rujy4opldypc@tomti.i.net-space.pl>
References: <cover.1604723348.git.development@efficientek.com>
 <cover.1607466704.git.development@efficientek.com>
 <9fbd815f9360ed2faa45c386d4ed103103284417.1607466704.git.development@efficientek.com>
 <20201210160707.ansx6bpqxgjfj7vs@tomti.i.net-space.pl>
 <20201211191018.382b8264@crass-HP-ZBook-15-G2>
 <20201212151918.4zr3rujy4opldypc@tomti.i.net-space.pl>
Reply-To: development@efficientek.com
X-Mailer: Claws Mail 3.17.4 (GTK+ 2.24.32; x86_64-pc-linux-gnu)
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Received-SPF: pass client-ip=2607:f8b0:4864:20::741;
 envelope-from=development@efficientek.com; helo=mail-qk1-x741.google.com
X-Spam_score_int: -18
X-Spam_score: -1.9
X-Spam_bar: -
X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1,
 DKIM_VALID=-0.1, SPF_HELO_NONE=0.001,
 SPF_PASS=-0.001 autolearn=ham autolearn_force=no
X-Spam_action: no action
X-BeenThere: grub-devel@gnu.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: The development of GNU GRUB <grub-devel.gnu.org>
List-Unsubscribe: <https://lists.gnu.org/mailman/options/grub-devel>,
 <mailto:grub-devel-request@gnu.org?subject=unsubscribe>
List-Archive: <https://lists.gnu.org/archive/html/grub-devel>
List-Post: <mailto:grub-devel@gnu.org>
List-Help: <mailto:grub-devel-request@gnu.org?subject=help>
List-Subscribe: <https://lists.gnu.org/mailman/listinfo/grub-devel>,
 <mailto:grub-devel-request@gnu.org?subject=subscribe>
X-List-Received-Date: Tue, 15 Dec 2020 22:23:41 -0000

On Sat, 12 Dec 2020 16:19:18 +0100
Daniel Kiper <dkiper@net-space.pl> wrote:

> On Fri, Dec 11, 2020 at 07:10:18PM -0600, Glenn Washburn wrote:
> > On Thu, 10 Dec 2020 17:07:07 +0100
> > Daniel Kiper <dkiper@net-space.pl> wrote:
> >
> > > On Tue, Dec 08, 2020 at 04:45:45PM -0600, Glenn Washburn wrote:
> > > > First, check to make sure that source disk has a known size. If
> > > > not, print debug message and return error. There are 4 cases
> > > > where GRUB_DISK_SIZE_UNKNOWN is set (biosdisk, obdisk, ofdisk,
> > > > and uboot), and in all those cases processing continues. So
> > > > this is probably a bit conservative. However, 3 of the cases
> > > > seem pathological, and the other, biosdisk, happens when
> > > > booting from a cd. Since I doubt booting from a LUKS2 volume on
> > > > a cd is a big use case, we'll error until someone complains.
> > > >
> > > > Do some sanity checking on data coming from the luks header. If
> > > > segment.size is "dynamic", verify that the offset is not past
> > > > the end of disk. Otherwise, check for errors from grub_strtoull
> > > > when converting segment size from string. If a
> > > > GRUB_ERR_BAD_NUMBER error was returned, then the string was not
> > > > a valid parsable number, so skip the key. If
> > > > GRUB_ERR_OUT_OF_RANGE was returned, then there was an overflow
> > > > in converting to a 64-bit unsigned integer. So this could be a
> > > > very large disk (perhaps large raid array). In this case, we
> > > > want to continue to try to use this key, but only allow access
> > > > up to the end of the source disk.
> > > >
> > > > Signed-off-by: Glenn Washburn <development@efficientek.com>
> > > > ---
> > > >  grub-core/disk/luks2.c | 84
> > > > ++++++++++++++++++++++++++++++++++++++++-- include/grub/disk.h
> > > >   | 17 +++++++++ 2 files changed, 98 insertions(+), 3
> > > > deletions(-)
> > > >
> > > > diff --git a/grub-core/disk/luks2.c b/grub-core/disk/luks2.c
> > > > index 9abcb1c94..8cb11e899 100644
> > > > --- a/grub-core/disk/luks2.c
> > > > +++ b/grub-core/disk/luks2.c
> > > > @@ -600,12 +600,26 @@ luks2_recover_key (grub_disk_t source,
> > > >        goto err;
> > > >      }
> > > >
> > > > +  if (source->total_sectors == GRUB_DISK_SIZE_UNKNOWN)
> > > > +    {
> > > > +      /* FIXME: Allow use of source disk, and maybe cause
> > > > errors in read. */
> > > > +      grub_dprintf ("luks2", "Source disk %s has an unknown
> > > > size, "
> > > > +			     "conservatively returning
> > > > error\n", source->name);
> > > > +      ret = grub_error (GRUB_ERR_BUG, "Unknown size of luks2
> > > > source device");
> > > > +      goto err;
> > > > +    }
> > > > +
> > > >    /* Try all keyslot */
> > > >    for (json_idx = 0; json_idx < size; json_idx++)
> > > >      {
> > > > +      typeof(source->total_sectors) max_crypt_sectors = 0;
> > > > +
> > > > +      grub_errno = GRUB_ERR_NONE;
> > > >        ret = luks2_get_keyslot (&keyslot, &digest, &segment,
> > > > json, json_idx); if (ret)
> > > >  	goto err;
> > > > +      if (grub_errno != GRUB_ERR_NONE)
> > > > +	  grub_dprintf ("luks2", "Ignoring unhandled error %d
> > > > from luks2_get_keyslot\n", grub_errno);
> > > >
> > > >        if (keyslot.priority == 0)
> > > >  	{
> > > > @@ -619,11 +633,75 @@ luks2_recover_key (grub_disk_t source,
> > > >        crypt->offset_sectors = grub_divmod64 (segment.offset,
> > > > segment.sector_size, NULL); crypt->log_sector_size = sizeof
> > > > (unsigned int) * 8
> > > >  		- __builtin_clz ((unsigned int)
> > > > segment.sector_size) - 1;
> > > > +      /* Set to the source disk size, which is the maximum we
> > > > allow. */
> > > > +      max_crypt_sectors = grub_disk_convert_sector(source,
> > > > +
> > > > source->total_sectors,
> > > > +
> > > > crypt->log_sector_size); +
> > > > +      if (max_crypt_sectors < crypt->offset_sectors)
> > > > +	{
> > > > +	  grub_dprintf ("luks2", "Segment
> > > > \"%"PRIuGRUB_UINT64_T"\" has offset"
> > > > +				 " %"PRIuGRUB_UINT64_T" which
> > > > is greater than"
> > > > +				 " source disk size
> > > > %"PRIuGRUB_UINT64_T","
> > > > +				 " skipping\n",
> > > > +				 segment.idx,
> > > > crypt->offset_sectors,
> > > > +				 max_crypt_sectors);
> > > > +	  continue;
> > > > +	}
> > > > +
> > > >        if (grub_strcmp (segment.size, "dynamic") == 0)
> > > > -	crypt->total_sectors = (grub_disk_get_size (source) >>
> > > > (crypt->log_sector_size - source->log_sector_size))
> > > > -			       - crypt->offset_sectors;
> > > > +	crypt->total_sectors = max_crypt_sectors -
> > > > crypt->offset_sectors; else
> > > > -	crypt->total_sectors = grub_strtoull (segment.size,
> > > > NULL, 10) >> crypt->log_sector_size;
> > > > +	{
> > > > +	  grub_errno = GRUB_ERR_NONE;
> > > > +	  /* Convert segment.size to sectors, rounding up to
> > > > nearest sector */
> > > > +	  crypt->total_sectors = grub_strtoull (segment.size,
> > > > NULL, 10);
> > > > +	  crypt->total_sectors = ALIGN_UP
> > > > (crypt->total_sectors,
> > > > +					   1 <<
> > > > crypt->log_sector_size);
> > > > +	  crypt->total_sectors >>= crypt->log_sector_size;
> > > > +
> > > > +	  if (grub_errno == GRUB_ERR_NONE)
> > > > +	    ;
> > > > +	  else if (grub_errno == GRUB_ERR_BAD_NUMBER)
> > > > +	    {
> > > > +	      grub_dprintf ("luks2", "Segment
> > > > \"%"PRIuGRUB_UINT64_T"\" size"
> > > > +				     " \"%s\" is not a parsable
> > > > number\n",
> > > > +				     segment.idx,
> > > > segment.size);
> > > > +	      continue;
> > > > +	    }
> > > > +	  else if (grub_errno == GRUB_ERR_OUT_OF_RANGE)
> > > > +	    {
> > > > +	      /*
> > > > +	       * There was an overflow in parsing
> > > > segment.size, so disk must
> > > > +	       * be very large or the string is incorrect.
> > > > +	       */
> > > > +	      grub_dprintf ("luks2", "Segment
> > > > \"%"PRIuGRUB_UINT64_T"\" size"
> > > > +				     " %s overflowed 64-bit
> > > > unsigned integer,"
> > > > +				     " the end of the crypto
> > > > device will be"
> > > > +				     " inaccessible\n",
> > > > +				     segment.idx,
> > > > segment.size);
> > > > +	      if (crypt->total_sectors > max_crypt_sectors)
> > >
> > > I think this if is bogus. You should clamp crypt->total_sectors
> > > without any checks here.
> >
> > Actually, I wouldn't call this a clamp because in the overflow case
> > crypt->total_sectors always equals 0.  I just realized this, and its
> > because grub_strtoull will return 2^64-1 thus causing the following
> > ALIGN_UP to overflow returning 0.  Suffice to say that's not what we
> > want. My original intent was what happened before the ALIGN_UP code
> > was introduced, which would ALIGN_DOWN.
> 
> Ugh... Right. That is why I still think you should do further
> calculations on value returned from grub_strtoull() if grub_errno ==
> GRUB_ERR_NONE only. I understand that then crypt->total_sectors does
> not contain total sectors for a while but you can rectify this a bit
> by putting short comment before grub_strtoull() call.
> 
> > Here's an example illustrating why I wanted and still think the
> > intent of this check is reasonable. Suppose we have a disk size
> > 2^67 bytes with 512 byte (2^9) sized sectors. It will have 2^58
> > sectors. Further suppose, there is a LUKS volume with size 2^65
> > bytes starting at the beginning and a sector size of 4096 bytes
> > (2^12). This will cause an overflow, so grub_strtoull will return
> > 2^64-1, this should have us set crypt->total_sectors to 2^52-1.
> > Since we don't know how much the overflow is (1 byte or 1
> > terabyte), we don't know how many more sectors til the end of the
> > LUKS encrypted area. In this case max_crypt_sectors will be
> > 2^(58+9-12) => 2^57 sectors.  So here we see that
> > crypt->total_sectors < max_crypt_sectors in the overflow case.  If
> > we do as you suggest crypt->total_sectors will be set to 2^57, and
> > thus it will be valid to read past the end of the encrypted data
> > (ie. block 2^56 of the 4k sector crypt will be a sector starting at
> > byte 2^68, which is more than the 2^65 byte size volume).
> >
> > On the one hand, I like your suggestion because it allows reading
> > all possible encrypted data, at the cost of reading, decrypting, and
> > returning non-encrypted data (ie random garbage).  While keeping the
> > check, will prevent returning garbage at the cost of not allowing
> > access to all encrypted sectors. I think we should keep the check
> > and document a known limitation of 2^64 byte maximum sized LUKS
> > volumes. And that larger sized volumes can be read only up to byte
> > 2^64.
> 
> If think the code should look like this:
> 
>   /*
>    * ...a comment saying what crypt->total_sectors contains
>    * and why LUKS2 volumes larger than 2^64 does not work...
>    */
>   crypt->total_sectors = grub_strtoull (segment.size,  NULL, 10);
> 
>   if (grub_errno == GRUB_ERR_NONE)
>     {
>       crypt->total_sectors = ALIGN_UP (crypt->total_sectors, 1 <<
> crypt->log_sector_size); crypt->total_sectors >>=
> crypt->log_sector_size; }

Now that ALIGN_UP is being used which we don't want in the
GRUB_ERR_OUT_OF_RANGE case, I'm ok with this. Although, if we just
continue in grub_strtoull error case as you suggest below, it doesn't
matter. And this actually doesn't prevent ALIGN_UP from potentially
overflowing crypt->total_sectors either, when segment.size is very
large. Perhaps we should check for that.

>   else if (grub_errno == GRUB_ERR_BAD_NUMBER)
>     {
>       ...
>       continue;
>     }
> 
>   if (grub_errno == GRUB_ERR_OUT_OF_RANGE ||
>       !crypt->total_sectors ||
>       crypt->total_sectors > max_crypt_sectors)
>     {
>       ...
>       continue;
>       /*
>        * Yes, I think we should not guess crypt->total_sectors
>        * value and always fail. It seems safer.
>        */

I agree this is safer, but its at the cost of using volumes over 2^64
bytes working at all. I think the question is how safe is it to allow
arbitrary disk read failures in grub. Ultimately I don't really like
continuing here, which is why I originally had the TODO, because it
prevents otherwise accessible drives from being accessible (admittedly
in an extremely rare case). Think about a 2^65 sized raid with LUKS on
top and LVM on top of that. But really you should probably not be
booting from that anyway. I think I'll add a TODO here, with a work
around suggesting not to use such large devices to boot from.

>     }

Glenn