Documentation/9psetup

With QEMU's 9pfs you can create virtual filesystem devices (virtio-9p-device) and expose them to guests. This essentially means that a certain directory on the host machine is made directly accessible by a guest OS as a pass-through file system, using the 9P network protocol for communication between host and guest. If desired, a share can even be accessed by several guests simultaneously.

This section details the steps involved in setting up VirtFS (Plan 9 folder sharing over Virtio - I/O virtualization framework) between the guest and host operating systems. The instructions are followed by an example usage of the mentioned steps.

This page focuses on user aspects like setting up 9pfs, configuration, and performance tweaks. For the developer documentation of 9pfs refer to Documentation/9p instead.

See also Documentation/9p_root_fs for a complete HOWTO about installing and configuring an entire guest system on top of 9p as the root fs.

Preparation

1. Download the latest kernel code (2.6.36-rc4 or newer) from http://www.kernel.org to build the kernel image for the guest.

2. Ensure the following 9P options are enabled in the kernel configuration.

    CONFIG_NET_9P=y
    CONFIG_NET_9P_VIRTIO=y
    CONFIG_NET_9P_DEBUG=y (Optional)
    CONFIG_9P_FS=y
    CONFIG_9P_FS_POSIX_ACL=y

and these PCI and virtio options:

    CONFIG_PCI=y
    CONFIG_VIRTIO_PCI=y
    CONFIG_PCI_HOST_GENERIC=y (only needed for the QEMU Arm 'virt' board)
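To verify that a guest kernel was built with these options, you can for instance grep its configuration (a sketch: the config file location varies by distribution, and /proc/config.gz only exists if the kernel was built with CONFIG_IKCONFIG_PROC):

    grep 9P /boot/config-$(uname -r)
    zgrep 9P /proc/config.gz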

3. Get the latest git repository from http://git.qemu.org/ or http://repo.or.cz/w/qemu.git.

4. Configure QEMU for the desired target. Note that if the configuration step prompts ATTR/XATTR as 'no' then you need to install libattr & libattr-dev first.

For Debian-based systems install the packages libattr1 & libattr1-dev, and for RPM-based systems install libattr & libattr-devel. Proceed to configure and build QEMU.
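For instance, using the package names mentioned above (a sketch; adjust the package-manager invocation to your distribution):

    sudo apt-get install libattr1 libattr1-dev    # Debian-based
    sudo yum install libattr libattr-devel        # RPM-based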

5. Set up the guest OS image and ensure the KVM modules are loaded.
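For example, on an Intel host (use kvm_amd instead of kvm_intel on AMD hosts; see also step 7 of the example below):

    modprobe kvm
    modprobe kvm_intel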

Starting the Guest directly

To start the guest, add the following options to enable 9P sharing in QEMU:

    -fsdev FSDRIVER,path=PATH_TO_SHARE,security_model=mapped-xattr|mapped-file|passthrough|none[,id=ID][,writeout=immediate][,readonly][,fmode=FMODE][,dmode=DMODE][,multidevs=remap|forbid|warn][,socket=SOCKET|sock_fd=SOCK_FD] -device TRANSPORT_DRIVER,fsdev=FSDEVID,mount_tag=MOUNT_TAG
     

You can also just use the following shortcut for the command above:

    -virtfs FSDRIVER,path=PATH_TO_SHARE,mount_tag=MOUNT_TAG,security_model=mapped|mapped-xattr|mapped-file|passthrough|none[,id=ID][,writeout=immediate][,readonly][,fmode=FMODE][,dmode=DMODE][,multidevs=remap|forbid|warn][,socket=SOCKET|sock_fd=SOCK_FD]

Options:

  • FSDRIVER: Either "local", "proxy" or "synth". This option specifies the filesystem driver backend to use. In short: you want to use "local". In detail:
  1. local: Simply lets QEMU call the individual VFS functions (more or less) directly on host (recommended option).
  2. proxy: this driver was supposed to dispatch the VFS functions to be called from a separate process (by virtfs-proxy-helper), however the "proxy" driver is currently not considered to be production grade, not considered safe and has very poor performance. The "proxy" driver has not seen any development in years and will likely be removed in a future version of QEMU. We recommend NOT using the "proxy" driver.
  3. synth: This driver is only used for development purposes (i.e. test cases).
  • TRANSPORT_DRIVER: Either "virtio-9p-pci", "virtio-9p-ccw" or "virtio-9p-device", depending on the underlying system. This option specifies the driver used for communication between host and guest. If the -virtfs shorthand form is used then "virtio-9p-pci" is implied.
  • id=ID: Specifies the identifier for this fsdev device.
  • path=PATH_TO_SHARE: Specifies the export path for the file system device. Files under this path on host will be available to the 9p client on the guest.
  • security_model=mapped-xattr|mapped-file|passthrough|none: Specifies the security model to be used for this export path. Security model is mandatory only for "local" fsdriver. Other fsdrivers (like "proxy") don't take security model as a parameter. Recommended option is "mapped-xattr".
  1. passthrough: Files are stored using the same credentials as they are created on the guest. This requires QEMU to run as root and therefore using "passthrough" security model is strongly discouraged, especially when running untrusted guests!
  2. mapped: Equivalent to "mapped-xattr".
  3. mapped-xattr: Some of the file attributes like uid, gid, mode bits and link target are stored as file attributes. This is probably the most reliable and secure option.
  4. mapped-file: The attributes are stored in the hidden .virtfs_metadata directory. Directories exported by this security model cannot interact with other unix tools.
  5. none: Same as "passthrough" except the server won't report failures if it fails to set file attributes like ownership (chown). This makes a passthrough-like security model usable for people who run KVM as a non-root user.
  • writeout=immediate: This is an optional argument. The only supported value is "immediate". This means that host page cache will be used to read and write data but write notification will be sent to the guest only when the data has been reported as written by the storage subsystem.
  • readonly: Enables exporting 9p share as a readonly mount for guests. By default read-write access is given.
  • socket=SOCKET: This option is only available for the "proxy" fsdriver. It enables the "proxy" filesystem driver to use the passed socket file for communicating with virtfs-proxy-helper.
  • sock_fd=SOCK_FD: This option is only available for the "proxy" fsdriver. It enables the "proxy" filesystem driver to use the passed socket descriptor for communicating with virtfs-proxy-helper. Usually a helper like libvirt will create a socket pair and pass one of the fds as sock_fd.
  • fmode=FMODE: Specifies the default mode for newly created files on the host. Works only with security models "mapped-xattr" and "mapped-file".
  • dmode=DMODE: Specifies the default mode for newly created directories on the host. Works only with security models "mapped-xattr" and "mapped-file".
  • mount_tag=MOUNT_TAG: Specifies the tag name to be used by the guest to mount this export point.
  • multidevs=remap|forbid|warn: Specifies how to deal with multiple devices being shared within a 9p export, i.e. how to avoid file ID collisions. Supported behaviours are:
  1. warn: This is the default behaviour: virtfs 9p expects only one device to be shared per export, and if more than one device is shared and accessed via the same 9p export then only a warning message is logged (once) by QEMU on the host side.
  2. remap: In order to avoid file ID collisions on the guest you should either create a separate virtfs export for each device to be shared with guests (the recommended way), or you may use "remap", which allows you to share multiple devices with a single export by remapping the original inode numbers from host to guest in a way that prevents such collisions. Remapping inodes is required in such use cases because the original device IDs from the host are never passed on and exposed to the guest: all files of a virtfs export share the same device ID on the guest, so two files with identical inode numbers but from different devices on the host would otherwise cause a file ID collision and hence potential misbehaviour on the guest.
  3. forbid: Like "warn", assumes that only one device is shared per export; however, it not only logs a warning message but also denies access to additional devices on the guest. Note though that "forbid" does not currently block all possible file access operations (e.g. readdir() would still return entries from other devices).
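Putting the recommended options together, a minimal sketch of such a command line might look like this (the path, id and mount tag are made-up placeholders):

    qemu-system-x86_64 [...] \
        -fsdev local,id=fsdev0,path=/path/to/share,security_model=mapped-xattr,multidevs=remap \
        -device virtio-9p-pci,fsdev=fsdev0,mount_tag=hostshare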

Starting the Guest using libvirt

If using libvirt for management of QEMU/KVM virtual machines, the <filesystem> element can be used to set up 9p sharing for guests:

 <filesystem type='mount' accessmode='$security_model'>
   <source dir='$hostpath'/>
   <target dir='$mount_tag'/>
 </filesystem>

In the above XML, the source directory will contain the host path that is to be exported. The target directory should be filled with the mount tag for the device, which, despite its name, does not actually have to be a directory path: any string of 32 characters or less can be used. The accessmode attribute determines the sharing mode, one of 'passthrough', 'mapped' or 'squashed'.

There is no equivalent of the QEMU 'id' attribute, since that is automatically filled in by libvirt. Libvirt will also automatically assign a PCI address for the 9p device, though that can be overridden if desired.
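As an illustration, a filled-in instance of the template above might look like this (the host path and mount tag are made-up placeholders; accessmode 'mapped' corresponds to the recommended security model):

 <filesystem type='mount' accessmode='mapped'>
   <source dir='/home/guest/9p_setup/shared'/>
   <target dir='hostshare'/>
 </filesystem>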

Mounting the shared path

You can mount the shared folder using

    mount -t 9p -o trans=virtio,version=9p2000.L [mount tag] [mount point]
  • mount tag: As specified on the QEMU command line.
  • mount point: Path to the mount point.
  • trans: Transport method (here virtio, for using 9P over virtio).
  • version: Protocol version. By default it is 9p2000.u.

Other options that can be used include:

  • msize: Maximum packet size including any headers. By default it is 8 KiB (128 KiB since Linux kernel v5.15); see the performance considerations below.
  • access: Following are the access modes
  1. access=user : If a user tries to access a file on the v9fs filesystem for the first time, v9fs sends an attach command (Tattach) for that user. This is the default mode.
  2. access=<uid> : It only allows the user with uid=<uid> to access the files on the mounted filesystem
  3. access=any : v9fs does single attach and performs all operations as one user
  4. access=client : Fetches access control list values from the server and does an access check on the client.
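If the share should be mounted automatically at boot, an equivalent /etc/fstab entry might look like this (a sketch reusing the mount tag, mount point and options from the example at the end of this page):

    test_mount  /tmp/shared  9p  trans=virtio,version=9p2000.L,msize=104857600  0  0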

Security Considerations

  • Recommended is 'security_model=mapped'. Do not use 'security_model=passthrough', as it requires QEMU to be run as root.
  • The recommended FSDRIVER is 'local'. Do not use the 'proxy' driver: it is in bad shape, hasn't seen any development in years, and will be removed in a future version of QEMU.
  • Keep in mind that an ordinary guest user may create as many files, and as large files, as desired and can therefore fill up the entire shared partition. So if you are sharing a tree with an untrusted guest then you should (see the sketch after this list):
    • mount a separate partition/data set on the host just for the shared tree,
    • and/or deploy quotas to limit the amount of data AND the number of inodes the guest is allowed to create.
  • If you are sharing more than one file system, or if in doubt, use 'multidevs=remap'. It adds some extra cycles for safely remapping inodes from host to guest to avoid potential file ID collisions on the guest side, which could otherwise lead to nasty misbehaviours that are often not obvious to hunt down. This option also safely allows mounting additional filesystems into the shared tree while the guest is still running.
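A minimal sketch of the "separate partition" advice, using a fixed-size loopback image on the host (made-up path and size; note that mkfs.ext4 also fixes the number of available inodes at creation time):

    # 4 GiB image: the guest can never occupy more host space than this
    dd if=/dev/zero of=/var/lib/9p_share.img bs=1M count=4096
    mkfs.ext4 /var/lib/9p_share.img
    mount -o loop /var/lib/9p_share.img /home/guest/9p_setup/shared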

Performance Considerations (msize)

You should set an appropriate value for the option "msize" on the client (guest OS) side to avoid degraded file I/O performance. This 9P option is only available on the client side. If you do not specify a value for "msize" with a Linux 9P client, the client falls back to its default value, which prior to Linux kernel v5.15 was only 8 kiB and resulted in very poor performance. With Linux kernel v5.15 the default msize was raised to 128 kiB (https://github.com/torvalds/linux/commit/9c4d94dc9a64426d2fa0255097a3a84f6ff2eebe), which still limits performance on most machines (https://lists.gnu.org/archive/html/qemu-devel/2021-03/msg01003.html).

A good value for "msize" depends on the file I/O potential of the underlying storage on the host side (i.e. a property invisible to the client). You might then still want to trade off performance gains against additional RAM costs: with growing "msize" (RAM occupation) performance still increases, but the performance gain (delta) shrinks continuously.

For that reason it is recommended to benchmark and manually pick an appropriate value for 'msize' for your use case yourself. As a starting point, you might pick something between 10 MiB and >100 MiB for spindle-based SATA storage, whereas for PCIe-based flash storage you might pick several hundred MiB or more. Then create some large file on the host side (e.g. 12 GiB):

    dd if=/dev/zero of=test.dat bs=1G count=12

and measure how long it takes reading the file on guest OS side:

    time cat test.dat > /dev/null

then repeat with different values for "msize" to find a good value.
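For example, a simple benchmark loop on the guest might look like this (a sketch assuming the mount tag 'test_mount' from the example below and the test file created above; note that after the first pass the file is likely served from the host's page cache, so later passes measure the transport rather than the storage):

    for msize in 524288 4194304 104857600; do
        mount -t 9p -o trans=virtio,version=9p2000.L,msize=$msize test_mount /mnt
        echo "msize=$msize:"
        time cat /mnt/test.dat > /dev/null
        umount /mnt
    done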

Example

An example usage of the above steps (tried on an Ubuntu Lucid Lynx system):

1. Download the latest kernel source from http://www.kernel.org

2. Build kernel image

  • Ensure relevant kernel configuration options are enabled pertaining to
  1. Virtualization
  2. KVM
  3. Virtio
  4. 9P
  • Compile

3. Get the latest QEMU git repository in a fresh directory using

    git clone git://repo.or.cz/qemu.git

4. Configure QEMU

For example, for i386-softmmu with debugging support, use

    ./configure '--target-list=i386-softmmu' '--enable-debug' '--enable-kvm' '--prefix=/home/guest/9p_setup/qemu/'

If this step prompts ATTR/XATTR as 'no', install packages libattr1 and libattr1-dev on your system using:

    sudo apt-get install libattr1
    sudo apt-get install libattr1-dev

5. Compile QEMU

    make
    make install

6. Guest OS installation (Installing Ubuntu Lucid Lynx here)

  • Create Guest image (here of size 2 GB)
    dd if=/dev/zero of=/home/guest/9p_setup/ubuntu-lucid.img bs=1M count=2000 
  • Create a filesystem on the image file (ext4 here)
    mkfs.ext4 /home/guest/9p_setup/ubuntu-lucid.img 
  • Mount the image file
    mount -o loop /home/guest/9p_setup/ubuntu-lucid.img /mnt/temp_mount
  • Install the Guest OS

For installing a Debian system you can use the package debootstrap

    debootstrap lucid /mnt/temp_mount 

Once the OS is installed, unmount the guest image.

    umount /mnt/temp_mount

7. Load the KVM modules on the host (for Intel here)

    modprobe kvm
    modprobe kvm_intel 

8. Start the Guest OS

   /home/guest/9p_setup/qemu/bin/qemu -drive file=/home/guest/9p_setup/ubuntu-lucid.img,if=virtio \ 
   -kernel /path/to/kernel/bzImage -append "console=ttyS0 root=/dev/vda" -m 512 -smp 1 \
   -fsdev local,id=test_dev,path=/home/guest/9p_setup/shared,security_model=mapped,multidevs=remap \
   -device virtio-9p-pci,fsdev=test_dev,mount_tag=test_mount -enable-kvm 
   

The above command runs a VNC server. To view the guest OS, install and use any VNC viewer (for instance xclientvncviewer).

9. Mounting shared folder

Mount the shared folder on guest using

    mount -t 9p -o trans=virtio,version=9p2000.L,posixacl,msize=104857600 test_mount /tmp/shared/

In the above example the folder /home/guest/9p_setup/shared of the host is shared with the folder /tmp/shared on the guest.

We intentionally add no 'cache' option in this example to avoid confusion. You may add e.g. the cache=loose option to increase performance; however, keep in mind that currently none of the caching implementations of the Linux 9p client ever revalidate file changes made on the host side (see https://lore.kernel.org/all/ZCHU6k56nF5849xj@bombadil.infradead.org/)! In other words, changes made on the host side would (currently) never become visible on the guest unless you remount or reboot the guest. A fix is in the works, and in a future Linux version caching is planned to be enabled by default once this issue has been addressed properly.
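For reference, the example mount from above with cache=loose enabled would read (subject to the staleness caveat just described):

    mount -t 9p -o trans=virtio,version=9p2000.L,posixacl,msize=104857600,cache=loose test_mount /tmp/shared/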