* [qemu-web PATCH] Update FUSE block export blog post
From: Hanna Reitz @ 2021-09-06 16:29 UTC
To: qemu-devel
Cc: Thomas Huth, Klaus Kiwi, Hanna Reitz, Stefan Hajnoczi,
Paolo Bonzini, Eric Blake
Because I forgot to CC Thomas on the discussion adding this post, it was
merged prematurely. This patch updates the post to incorporate the
feedback I received on it:
- Title change: This article mostly deals with presenting a guest image
in one image format as a raw image, so the title should reflect that;
there is much less focus on exporting block devices from a live VM
- Mention libguestfs, and contrast against it; make a note that
libguestfs provides security that FUSE exports cannot provide
- Have a full example in the intro, to show where we are going with this
post
- Some heading depths changed (nesting did not really make sense)
- Be more explicit that by "file mounts" I do not mean a filesystem with
a root directory and a single file in it
- Explicitly mention that "/" is a directory without a name, to
illustrate the fact that root nodes do not have names
- Short intro for "QEMU block exports", explaining its place in this
post
- Make all exports writable
- Use "exp0" as export ID to get shorter lines that fit better into 80
characters
- Reference the intro example in the intro of "Mounting an image on
itself"
- Show "qemu-fuse-disk-export.py" in *italic* instead of as `code`
(because I had all other command names in *italic*)
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
---
_posts/2021-08-22-fuse-blkexport.md | 145 ++++++++++++++++++++++------
1 file changed, 117 insertions(+), 28 deletions(-)
diff --git a/_posts/2021-08-22-fuse-blkexport.md b/_posts/2021-08-22-fuse-blkexport.md
index 7e8066e..1db6e74 100644
--- a/_posts/2021-08-22-fuse-blkexport.md
+++ b/_posts/2021-08-22-fuse-blkexport.md
@@ -1,30 +1,55 @@
---
layout: post
-title: "Exporting block devices as raw image files with FUSE"
+title: "Presenting guest images as raw image files with FUSE"
date: 2021-08-22 14:00:00 +0200
author: Hanna Reitz
categories: [storage, features, tutorials]
---
Sometimes, there is a VM disk image whose contents you want to manipulate
-without booting the VM. For raw images, that process is usually fairly simple,
-because most Linux systems bring tools for the job, e.g.:
+without booting the VM. One way of doing this is to use
+[libguestfs](https://libguestfs.org), which can boot a minimal Linux VM to
+provide the host with secure access to the disk’s contents. For example,
+[*guestmount*](https://libguestfs.org/guestmount.1.html) allows you to mount a
+guest filesystem on the host, without requiring root rights.
+
+However, maybe you cannot or do not want to use libguestfs, e.g. because you do
+not have KVM available in your environment, and so it becomes too slow; or
+because you do not want to go through a guest OS, but want to access the raw
+image data directly on the host, with minimal overhead.
+
+**Note**: Guest images can generally be arbitrarily modified by VM guests. If
+you have an image to which an untrusted guest had write access at some point,
+you must treat any data and metadata on this image as potentially having been
+modified in a malicious manner. Parsing anything must be done carefully and
+with caution. Note that many existing tools are not careful in this regard, for
+example, filesystem drivers generally deliberately do not have protection
+against maliciously corrupted filesystems. This is why in contrast accessing an
+image through libguestfs is considered secure, because the actual access happens
+in a libvirt-managed VM guest.
+
+From this point, we assume you are aware of the security caveats and still want
+to access and manipulate image data on the host.
+
+Now, unless your image is already in raw format, you will be faced with the
+problem of getting it into raw format. The tools that you might want to use for
+image manipulation generally only work on raw images (because that is how block
+device files appear), like:
* *dd* to just copy data to and from given offsets,
* *parted* to manipulate the partition table,
* *kpartx* to present all partitions as block devices,
* *mount* to access filesystems’ contents.
-Sadly, but naturally, such tools only work for raw images, and not for images
-e.g. in QEMU’s qcow2 format. To access such an image’s content, the format has
-to be translated to create a raw image, for example by:
+So if you want to use such tools on image files e.g. in QEMU’s qcow2 format, you
+will need to translate them into raw images first, for example by:
* Exporting the image file with `qemu-nbd -c` as an NBD block device file,
* Converting between image formats using `qemu-img convert`,
* Accessing the image from a guest, where it appears as a normal block device.
Unfortunately, none of these methods is perfect: `qemu-nbd -c` generally
-requires root rights, converting to a temporary raw copy requires additional
-disk space and the conversion process takes time, and accessing the image from a
-guest is just quite cumbersome in general (and also specifically something that
-we set out to avoid in the first sentence of this blog post).
+requires root rights; converting to a temporary raw copy requires additional
+disk space and the conversion process takes time; and accessing the image from a
+guest is basically what libguestfs does (i.e., if that is what you want, then
+you should probably use libguestfs).
As of QEMU 6.0, there is another method, namely FUSE block exports.
Conceptually, these are rather similar to using `qemu-nbd -c`, but they do not
@@ -42,15 +67,67 @@ mounting remote directories from a machine accessible via SSH.
QEMU can use FUSE to make a virtual block device appear as a normal file on the
host, so that tools like *kpartx* can interact with it regardless of the image
-format.
+format, like in the following example:
-## Background information
+```
+$ qemu-img create -f raw foo.img 20G
+Formatting 'foo.img', fmt=raw size=21474836480
+
+$ parted -s foo.img \
+ 'mklabel msdos' \
+ 'mkpart primary ext4 2048s 100%'
+
+$ qemu-img convert -p -f raw -O qcow2 foo.img foo.qcow2 && rm foo.img
+ (100.00/100%)
+
+$ file foo.qcow2
+foo.qcow2: QEMU QCOW2 Image (v3), 21474836480 bytes
+
+$ sudo kpartx -l foo.qcow2
+
+$ qemu-storage-daemon \
+ --blockdev node-name=prot-node,driver=file,filename=foo.qcow2 \
+ --blockdev node-name=fmt-node,driver=qcow2,file=prot-node \
+ --export \
+ type=fuse,id=exp0,node-name=fmt-node,mountpoint=foo.qcow2,writable=on \
+ &
+[1] 200495
+
+$ file foo.qcow2
+foo.qcow2: DOS/MBR boot sector; partition 1 : ID=0x83, start-CHS (0x10,0,1),
+end-CHS (0x3ff,3,32), startsector 2048, 41940992 sectors
-### File mounts
+$ sudo kpartx -av foo.qcow2
+add map loop0p1 (254:0): 0 41940992 linear 7:0 2048
+```
+
+In this example, we create a partition on a newly created raw image. We then
+convert this raw image to qcow2 and discard the original. Because a tool like
+*kpartx* cannot parse the qcow2 format, it reports no partitions to be present
+in `foo.qcow2`.
+
+Using the QEMU storage daemon, we then create a FUSE export for the image that
+apparently turns it into a raw image, which makes the content and thus the
+partitions visible to *file* and *kpartx*. Now, we can use *kpartx* to access
+the partition in `foo.qcow2` under `/dev/mapper/loop0p1`.
+
+So how does this work? How can the QEMU storage daemon make a qcow2 image
+appear as a raw image?
+
+## File mounts
-A perhaps little-known fact is that, on Linux, filesystems do not need to have
-a root directory, they only need to have a root node. A filesystem that only
-provides a single regular file is perfectly valid.
+To transparently translate a file into a different format, like we did above, we
+make use of two little-known facts about filesystems and the VFS on Linux. The
+first one of these we can explain immediately, for the second one we will need
+some more information about how FUSE exports work, so that secret will be lifted
+later (down in the “Mounting an image on itself” section).
+
+Here is the first secret: Filesystems do not need to have a root directory.
+They only need a root node. A regular file is a node, so a filesystem that only
+consists of a single regular file is perfectly valid.
+
+Note that this is not about filesystems with just a single file in their root
+directory, but about filesystems that really *do not have* a root directory.
Conceptually, every filesystem is a tree, and mounting works by replacing one
subtree of the global VFS tree by the mounted filesystem’s tree. Normally, a
@@ -65,7 +142,8 @@ shadowed by the new filesystem (showing `/foo/x` and `/foo/y`).
Note that a filesystem’s root node generally has no name. After mounting, the
filesystem’s root directory’s name is determined by the original name of the
-mount point.
+mount point. (“/” is not a name. It specifically is a directory without a
+name.)
Because a tree does not need to have multiple nodes but may consist of just a
single leaf, a filesystem with a file for its root node works just as well,
@@ -81,7 +159,10 @@ point for it must also be a regular file (`/foo/a` in our example), and just
like before, the content of `/foo/a` is shadowed, and when opening it, one will
instead see the contents of FS B’s unnamed root node.
-### QEMU block exports
+## QEMU block exports
+
+Before we can see what FUSE exports are and how they work, we should explore
+QEMU block exports in general.
QEMU allows exporting block nodes via various protocols (as of 6.0: NBD,
vhost-user, FUSE). A block node is an element of QEMU’s block graph (see e.g.
@@ -108,7 +189,7 @@ The command line to achieve the above could look something like this:
$ qemu-system-x86_64 \
-blockdev node-name=prot-node,driver=file,filename=$image_path \
-blockdev node-name=fmt-node,driver=qcow2,file=prot-node \
- -device virtio-blk,drive=fmt-node
+ -device virtio-blk,drive=fmt-node,share-rw=on
```
Besides attaching guest devices to block nodes, you can also export them for
@@ -131,9 +212,10 @@ the QEMU instance above, then you could do this:
"execute": "block-export-add",
"arguments": {
"type": "nbd",
- "id": "fmt-node-export",
+ "id": "exp0",
"node-name": "fmt-node",
- "name": "guest-disk"
+ "name": "guest-disk",
+ "writable": true
}
}
```
@@ -168,7 +250,8 @@ $ qemu-storage-daemon \
--blockdev node-name=prot-node,driver=file,filename=$image_path \
--blockdev node-name=fmt-node,driver=qcow2,file=prot-node \
--nbd-server addr.type=inet,addr.host=localhost,addr.port=10809 \
- --export type=nbd,id=fmt-node-export,node-name=fmt-node,name=guest-disk
+ --export \
+ type=nbd,id=exp0,node-name=fmt-node,name=guest-disk,writable=on
```
Which creates the following block graph:
@@ -194,7 +277,8 @@ $ touch mount-point
$ qemu-storage-daemon \
--blockdev node-name=prot-node,driver=file,filename=$image_path \
--blockdev node-name=fmt-node,driver=qcow2,file=prot-node \
- --export type=fuse,id=fmt-node-export,node-name=fmt-node,mountpoint=mount-point
+ --export \
+ type=fuse,id=exp0,node-name=fmt-node,mountpoint=mount-point,writable=on
```
The mount point now appears as the raw VM disk that is stored in the qcow2
@@ -237,7 +321,11 @@ disk size: 0 B
## Mounting an image on itself
So far, we have seen what FUSE exports are, how they work, and how they can be
-used. Now let’s add an interesting twist.
+used. However, in the very first example in this blog post, we did not export
+the raw image on some empty regular file that just serves as a mount point – no,
+we turned the original qcow2 image itself into a raw image.
+
+How does that work?
### What happens to the old tree under a mount point?
@@ -330,7 +418,8 @@ Format specific information:
$ qemu-storage-daemon --blockdev \
node-name=node0,driver=qcow2,file.driver=file,file.filename=foo.qcow2 \
- --export type=fuse,id=node0-export,node-name=node0,mountpoint=foo.qcow2 &
+ --export \
+ type=fuse,id=node0-export,node-name=node0,mountpoint=foo.qcow2,writable=on &
[1] 40843
$ qemu-img info foo.qcow2
@@ -355,17 +444,17 @@ that opens the image by name (i.e. `open("foo.qcow2")`) will open the raw disk
image exported by QEMU. Therefore, it looks like the qcow2 image is in raw
format now.
-### `qemu-fuse-disk-export.py`
+### *qemu-fuse-disk-export.py*
Because the QEMU storage daemon command line tends to become kind of long, I’ve
written a script to facilitate the process:
-[qemu-fuse-disk-export.py](https://gitlab.com/hreitz/qemu-scripts/-/blob/main/qemu-fuse-disk-export.py)
+[*qemu-fuse-disk-export.py*](https://gitlab.com/hreitz/qemu-scripts/-/blob/main/qemu-fuse-disk-export.py)
([direct download link](https://gitlab.com/hreitz/qemu-scripts/-/raw/main/qemu-fuse-disk-export.py?inline=false)).
This script automatically detects the image format, and its `--daemonize` option
allows safe use in scripts, where it is important that the process blocks until
the export is fully set up.
-Using `qemu-fuse-disk-export.py`, the above example looks like this:
+Using *qemu-fuse-disk-export.py*, the above example looks like this:
```
$ qemu-img info foo.qcow2 | grep 'file format'
file format: qcow2
--
2.31.1
* Re: [qemu-web PATCH] Update FUSE block export blog post
From: Thomas Huth @ 2021-09-07 14:07 UTC
To: Hanna Reitz, qemu-devel
Cc: Klaus Kiwi, Paolo Bonzini, Eric Blake, Stefan Hajnoczi
Thanks, changes looked fine to me, so I've pushed it now.
Thomas
* Re: [qemu-web PATCH] Update FUSE block export blog post
From: Hanna Reitz @ 2021-09-07 15:54 UTC
To: Thomas Huth, qemu-devel
Cc: Klaus Kiwi, Paolo Bonzini, Eric Blake, Stefan Hajnoczi
On 07.09.21 16:07, Thomas Huth wrote:
> Thanks, changes looked fine to me, so I've pushed it now.
Thanks!
Hanna
* Re: [qemu-web PATCH] Update FUSE block export blog post
From: Eric Blake @ 2021-09-07 17:52 UTC
To: Hanna Reitz
Cc: Klaus Kiwi, Paolo Bonzini, Thomas Huth, qemu-devel, Stefan Hajnoczi
On Mon, Sep 06, 2021 at 06:29:16PM +0200, Hanna Reitz wrote:
> Because I forgot to CC Thomas on the discussion adding this post, it was
> merged prematurely. This patch updates the post to incorporate the
> feedback I received on it:
>
Overall, nice! I see it's already live, but another tweak you might
want to make:
> +## File mounts
>
> -A perhaps little-known fact is that, on Linux, filesystems do not need to have
> -a root directory, they only need to have a root node. A filesystem that only
> -provides a single regular file is perfectly valid.
> +To transparently translate a file into a different format, like we did above, we
> +make use of two little-known facts about filesystems and the VFS on Linux. The
> +first one of these we can explain immediately, for the second one we will need
> +some more information about how FUSE exports work, so that secret will be lifted
s/lifted/revealed/
> +later (down in the “Mounting an image on itself” section).
> +
--
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3266
Virtualization: qemu.org | libvirt.org
* Re: [qemu-web PATCH] Update FUSE block export blog post
From: Hanna Reitz @ 2021-09-14 10:02 UTC
To: Eric Blake
Cc: Klaus Kiwi, Paolo Bonzini, Thomas Huth, qemu-devel, Stefan Hajnoczi
On 07.09.21 19:52, Eric Blake wrote:
> On Mon, Sep 06, 2021 at 06:29:16PM +0200, Hanna Reitz wrote:
>> Because I forgot to CC Thomas on the discussion adding this post, it was
>> merged prematurely. This patch updates the post to incorporate the
>> feedback I received on it:
>>
> Overall, nice! I see it's already live, but another tweak you might
> want to make:
>
>> +## File mounts
>>
>> -A perhaps little-known fact is that, on Linux, filesystems do not need to have
>> -a root directory, they only need to have a root node. A filesystem that only
>> -provides a single regular file is perfectly valid.
>> +To transparently translate a file into a different format, like we did above, we
>> +make use of two little-known facts about filesystems and the VFS on Linux. The
>> +first one of these we can explain immediately, for the second one we will need
>> +some more information about how FUSE exports work, so that secret will be lifted
> s/lifted/revealed/
Ah, yes. I don’t think I’ll send another update just for this, but I
will include it if I need to send an update for some other reason.
Hanna
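
For anyone picking up the suggested s/lifted/revealed/ tweak in a later
update, a minimal sketch of applying it could look like the following. It
works on a scratch copy built from two lines of the quoted paragraph rather
than on a real qemu-web checkout; the temporary path and the `post` and
`result` variable names are assumptions made for illustration and are not
part of the patch.

```shell
# Sketch only: work on a scratch copy so that no real checkout is touched.
tmp=$(mktemp -d)
post="$tmp/2021-08-22-fuse-blkexport.md"

# Two lines taken from the quoted paragraph; only the second should change.
printf '%s\n' \
    'first one of these we can explain immediately, for the second one we will need' \
    'some more information about how FUSE exports work, so that secret will be lifted' \
    > "$post"

# Restrict the substitution to the sentence that was pointed out, so that
# any other occurrence of "lifted" in the post would be left alone.
sed '/so that secret will be lifted/s/lifted/revealed/' "$post" > "$post.new"
mv "$post.new" "$post"

# Show the edited sentence.
result=$(tail -n 1 "$post")
echo "$result"
```

Anchoring the substitution on the surrounding phrase, instead of running a
bare `s/lifted/revealed/` over the whole file, keeps the change limited to
the one sentence under discussion.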