VIRTIOFSD(1) QEMU VIRTIOFSD(1)
NAME
virtiofsd - QEMU virtio-fs shared file system daemon
SYNOPSIS
virtiofsd [OPTIONS]
DESCRIPTION
Share a host directory tree with a guest through a virtio-fs device.
This program is a vhost-user backend that implements the virtio-fs de-
vice. Each virtio-fs device instance requires its own virtiofsd
process.
This program is designed to work with QEMU's --device vhost-user-fs-pci
but should work with any virtual machine monitor (VMM) that supports
vhost-user. See the Examples section below.
This program must be run as the root user. The program drops privi-
leges where possible during startup although it must be able to create
and access files with any uid/gid:
o The ability to invoke syscalls is limited using seccomp(2).
o Linux capabilities(7) are dropped.
In "namespace" sandbox mode the program switches into a new file system
namespace and invokes pivot_root(2) to make the shared directory tree
its root. New pid and net namespaces are also created to isolate the
process.
In "chroot" sandbox mode the program invokes chroot(2) to make the
shared directory tree its root. This mode is intended for container en-
vironments where the container runtime has already set up the name-
spaces and the program does not have permission to create namespaces
itself.
Both sandbox modes prevent "file system escapes" due to symlinks and
other file system objects that might lead to files outside the shared
directory.
OPTIONS
-h, --help
Print help.
-V, --version
Print version.
-d Enable debug output.
--syslog
Print log messages to syslog instead of stderr.
-o OPTION
o debug - Enable debug output.
o flock|no_flock - Enable/disable flock. The default is
no_flock.
o modcaps=CAPLIST - Modify the list of capabilities allowed;
CAPLIST is a colon-separated list of capabilities, each preceded
by either + or -, e.g. "+sys_admin:-chown".
o log_level=LEVEL - Print only log messages matching LEVEL or
more severe. LEVEL is one of err, warn, info, or debug. The
default is info.
o posix_lock|no_posix_lock - Enable/disable remote POSIX locks.
The default is no_posix_lock.
o readdirplus|no_readdirplus - Enable/disable readdirplus. The
default is readdirplus.
o sandbox=namespace|chroot - Sandbox mode. The default is
"namespace".
- namespace: Create mount, pid, and net namespaces and
pivot_root(2) into the shared directory.
- chroot: chroot(2) into the shared directory (use in containers).
o source=PATH - Share host directory tree located at PATH. This
option is required.
o timeout=TIMEOUT - I/O timeout in seconds. The default depends
on the cache mode (see --cache below).
o writeback|no_writeback - Enable/disable writeback cache. The
cache allows the FUSE client to buffer and merge write re-
quests. The default is no_writeback.
o xattr|no_xattr - Enable/disable extended attributes (xattr) on
files and directories. The default is no_xattr.
o posix_acl|no_posix_acl - Enable/disable POSIX ACL support.
POSIX ACLs are disabled by default.
--socket-path=PATH
Listen on vhost-user UNIX domain socket at PATH.
--socket-group=GROUP
Set the vhost-user UNIX domain socket gid to GROUP.
--fd=FDNUM
Accept connections from vhost-user UNIX domain socket file de-
scriptor FDNUM. The file descriptor must already be listening
for connections.
--thread-pool-size=NUM
Restrict the number of worker threads per request queue to NUM.
The default is 64.
--cache=none|auto|always
Select the desired trade-off between coherency and performance.
none forbids the FUSE client from caching, achieving the best
coherency at the cost of performance. auto behaves similarly to
NFS, with a 1 second metadata cache timeout. always sets a long
cache lifetime at the expense of coherency. The default is
auto.
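As an illustration of the --fd option described above, a supervising
program can create the listening socket itself and hand the descriptor
to the daemon. The following is a minimal sketch using Python's standard
library. The helper names and any paths passed to them are hypothetical,
and the sketch only builds an argument list from options documented on
this page; it does not launch virtiofsd.

```python
import socket


def make_listening_socket(path):
    """Create a UNIX domain socket that is already listening at `path`."""
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    sock.bind(path)
    # --fd requires a descriptor that is already listening for connections.
    sock.listen(1)
    return sock


def virtiofsd_argv(fd, source):
    """Build an argument list (hypothetical helper) for a given descriptor."""
    return ["virtiofsd", f"--fd={fd}", "-o", f"source={source}"]
```

A caller would still need to make the descriptor available to the child
process, for instance through the pass_fds argument of subprocess.Popen.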
EXTENDED ATTRIBUTE (XATTR) MAPPING
By default the names of xattrs used by the client are passed through to
the server file system. This can be a problem either where those xattr
names are used by something on the server (e.g. SELinux client/server
confusion) or where virtiofsd is running in a container with
restricted privileges and cannot access some attributes.
Mapping syntax
A mapping of xattr names can be made using -o xattrmap=mapping where
the mapping string consists of a series of rules.
The first matching rule terminates the mapping. The set of rules must
include a terminating rule to match any remaining attributes at the
end.
Each rule consists of a number of fields separated with a separator
that is the first non-white space character in the rule. This separa-
tor must then be used for the whole rule. White space may be added be-
fore and after each rule.
Using ':' as the separator a rule is of the form:
:type:scope:key:prepend:
scope is:
o 'client' - match 'key' against an xattr name from the client for
setxattr/getxattr/removexattr
o 'server' - match 'prepend' against an xattr name from the server
for listxattr
o 'all' - can be used to make a single rule where both the server
and client matches are triggered.
type is one of:
o 'prefix' - Prepends the prefix to matching names from the client and
strips it from matching names from the server; the modified names are
then passed on to the server or client respectively.
o 'ok' - Causes the rule set to be terminated when a match is found
while allowing matching xattrs through unchanged. It is intended
both as a way of explicitly terminating the list of rules, and to
allow some xattrs to skip following rules.
o 'bad' - If a client tries to use a name matching 'key' it is denied
using EPERM; when the server passes an attribute name matching
'prepend' it is hidden. In many ways its use is very like 'ok', as
either an explicit terminator or for special handling of certain
patterns.
o 'unsupported' - If a client tries to use a name matching 'key' it is
denied using ENOTSUP; when the server passes an attribute name
matching 'prepend' it is hidden. In many ways its use is very like
'ok', as either an explicit terminator or for special handling of
certain patterns.
key is a string tested as a prefix on an attribute name originating on
the client. It may be empty, in which case a 'client' rule will always
match on client names.
prepend is a string tested as a prefix on an attribute name originating
on the server, and used as a new prefix. It may be empty in which case
a 'server' rule will always match on all names from the server.
e.g.:
:prefix:client:trusted.:user.virtiofs.:
will match 'trusted.' attributes in client calls and prefix them be-
fore passing them to the server.
:prefix:server::user.virtiofs.:
will strip 'user.virtiofs.' from all server replies.
:prefix:all:trusted.:user.virtiofs.:
combines the previous two cases into a single rule.
:ok:client:user.::
will allow get/set xattr for 'user.' xattrs and ignore following
rules.
:ok:server::security.:
will pass 'security.' xattrs in listxattr from the server and ignore
following rules.
:ok:all:::
will terminate the rule search passing any remaining attributes in
both directions.
:bad:server::security.:
would hide 'security.' xattrs in listxattr from the server.
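The rule syntax and the client-side matching described above can be
sketched as follows. This is a simplified model written from the
description on this page, not the daemon's implementation. The function
names are hypothetical, the 'map' shorthand is not handled, and the
LookupError raised when no rule matches is an assumption of the sketch
(the page only states that a terminating rule is required).

```python
import errno
import os


def parse_rules(mapping):
    """Split a mapping string into (type, scope, key, prepend) tuples."""
    rules, pos = [], 0
    while True:
        # White space may appear before and after each rule.
        while pos < len(mapping) and mapping[pos].isspace():
            pos += 1
        if pos >= len(mapping):
            return rules
        # The first non-white-space character is this rule's separator.
        sep = mapping[pos]
        pos += 1
        fields = []
        for _ in range(4):
            end = mapping.find(sep, pos)
            if end < 0:
                raise ValueError("rule is missing a closing separator")
            fields.append(mapping[pos:end])
            pos = end + 1
        rules.append(tuple(fields))


def map_client_name(rules, name):
    """Return the name to use on the server for a client-supplied name."""
    for rtype, scope, key, prepend in rules:
        if scope not in ("client", "all") or not name.startswith(key):
            continue
        # The first matching rule terminates the search.
        if rtype == "prefix":
            return prepend + name
        if rtype == "ok":
            return name
        if rtype == "bad":
            raise PermissionError(errno.EPERM, os.strerror(errno.EPERM), name)
        if rtype == "unsupported":
            raise OSError(errno.ENOTSUP, os.strerror(errno.ENOTSUP), name)
        raise ValueError(f"unknown rule type: {rtype}")
    raise LookupError("no rule matched; the rule set needs a terminating rule")
```

For example, with the mapping
":prefix:client:trusted.:user.virtiofs.::ok:all:::" the name
'trusted.foo' would be sent to the server as
'user.virtiofs.trusted.foo', while 'user.bar' would pass unchanged.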
A simpler 'map' type provides a shorter syntax for the common case:
:map:key:prepend:
The 'map' type adds a number of separate rules to add prepend as a pre-
fix to the matched key (or all attributes if key is empty). There may
be at most one 'map' rule and it must be the last rule in the set.
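The expansion of a 'map' rule can be sketched from the two equivalences
given in the mapping examples later on this page. The function name is
hypothetical, and the exact rules and their order inside the daemon are
not stated here, so treat this as an illustration of those two examples
only.

```python
def expand_map(key, prepend):
    """Expand a 'map' rule into (type, scope, key, prepend) rules."""
    if key == "":
        # Matches example (1): prefix everything, hide unprefixed names.
        return [
            ("prefix", "all", "", prepend),
            ("bad", "all", "", ""),
        ]
    # Matches example (2): prefix one namespace and block direct access
    # to both the unprefixed host names and the remapping target.
    return [
        ("prefix", "all", key, prepend),
        ("bad", "server", "", key),
        ("bad", "client", prepend, ""),
        ("ok", "all", "", ""),
    ]
```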
Note: When the 'security.capability' xattr is remapped, the daemon has
to do extra work to remove it during many operations, which the host
kernel normally does itself.
Security considerations
Operating systems typically partition the xattr namespace using
well-defined name prefixes. Each partition may have different access
controls applied. For example, on Linux there are multiple partitions:
o system.* - access varies depending on attribute & filesystem
o security.* - only processes with CAP_SYS_ADMIN
o trusted.* - only processes with CAP_SYS_ADMIN
o user.* - any process granted by file permissions / ownership
Other operating systems, such as FreeBSD, have different name prefixes
and access control rules.
When remapping attributes on the host, it is important to ensure that
the remapping does not allow a guest user to evade the guest access
control rules.
Consider the case where trusted.* from the guest is remapped to
user.virtiofs.trusted.* in the host. An unprivileged user in a Linux
guest has the ability to write to xattrs under user.*. Thus the user
can evade the access control restriction on trusted.* by instead
writing to user.virtiofs.trusted.*.
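The collision described above can be made concrete with a small model of
client-side matching. This is a sketch based on the rule descriptions on
this page, not the daemon's code. The function name and the attribute
name 'foo' are hypothetical, and a denied name is represented by None.

```python
def host_name(rules, guest_name):
    """Return the host-side name for a guest name, or None if denied."""
    for rtype, scope, key, prepend in rules:
        if scope in ("client", "all") and guest_name.startswith(key):
            if rtype == "prefix":
                return prepend + guest_name
            if rtype == "ok":
                return guest_name
            return None  # 'bad' or 'unsupported': the request is denied
    return None


# Insecure: remap trusted.* and let everything else through.
INSECURE = [
    ("prefix", "client", "trusted.", "user.virtiofs."),
    ("ok", "all", "", ""),
]

# Hardened: additionally block direct use of the remapping target.
HARDENED = [
    ("prefix", "client", "trusted.", "user.virtiofs."),
    ("bad", "client", "user.virtiofs.", ""),
    ("ok", "all", "", ""),
]

# A privileged write and an unprivileged write reach the same host xattr.
via_trusted = host_name(INSECURE, "trusted.foo")
via_user = host_name(INSECURE, "user.virtiofs.trusted.foo")

# With the extra 'bad' rule the direct write is refused.
blocked = host_name(HARDENED, "user.virtiofs.trusted.foo")
```

In this model via_trusted and via_user are the same host name, which is
the access control bypass, while blocked is None.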
As noted above, the partitions used and the access controls applied
vary across guest operating systems, so it is not wise to try to
predict what the guest OS will use.
The simplest way to avoid an insecure configuration is to remap all
xattrs at once, to a given fixed prefix. This is shown in example (1)
below.
If selectively mapping only a subset of xattr prefixes, then rules must
be added to explicitly block direct access to the target of the remap-
ping. This is shown in example (2) below.
Mapping examples
1. Prefix all attributes with 'user.virtiofs.'
-o xattrmap=":prefix:all::user.virtiofs.::bad:all:::"
This uses two rules, using : as the field separator; the first rule
prefixes and strips 'user.virtiofs.', the second rule hides any
non-prefixed attributes that the host set.
This is equivalent to the 'map' rule:
-o xattrmap=":map::user.virtiofs.:"
2. Prefix 'trusted.' attributes, allow others through
"/prefix/all/trusted./user.virtiofs./
/bad/server//trusted./
/bad/client/user.virtiofs.//
/ok/all///"
Here there are four rules, using / as the field separator, and also
demonstrating that newlines can be included between rules. The first
rule is the prefixing of 'trusted.' and stripping of 'user.virtiofs.'.
The second rule hides unprefixed 'trusted.' attributes on the host.
The third rule stops a guest from explicitly setting the
'user.virtiofs.' path directly to prevent access control bypass on the
target of the earlier prefix remapping. Finally, the fourth rule lets
all remaining attributes through.
This is equivalent to the 'map' rule:
-o xattrmap="/map/trusted./user.virtiofs./"
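The server direction of example (2) can be modelled in the same spirit:
each name returned by the host is tested against the 'prepend' field of
the server-side rules. This is a sketch under the assumptions stated in
the rule descriptions above, not the daemon's implementation. The
function name and the sample attribute names are hypothetical.

```python
RULES = [
    ("prefix", "all", "trusted.", "user.virtiofs."),
    ("bad", "server", "", "trusted."),
    ("bad", "client", "user.virtiofs.", ""),
    ("ok", "all", "", ""),
]


def guest_view(rules, host_names):
    """Return the xattr names a guest would see in a listxattr reply."""
    visible = []
    for name in host_names:
        for rtype, scope, key, prepend in rules:
            if scope not in ("server", "all") or not name.startswith(prepend):
                continue
            if rtype == "prefix":
                # Strip the prefix that was added on the way in.
                visible.append(name[len(prepend):])
            elif rtype == "ok":
                visible.append(name)
            # 'bad' and 'unsupported' hide the name from the guest.
            break
    return visible


listing = guest_view(
    RULES, ["user.virtiofs.trusted.a", "trusted.b", "user.c"]
)
```

In this model the guest sees 'trusted.a' and 'user.c'; the host's own
unprefixed 'trusted.b' is hidden by the second rule.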
3. Hide 'security.' attributes, and allow everything else
"/bad/all/security./security./
/ok/all///"
The first rule combines what could be separate client and server rules
into a single 'all' rule, matching 'security.' in either client argu-
ments or lists returned from the host. This stops the client seeing
any 'security.' attributes on the server and stops it setting any.
EXAMPLES
Export /var/lib/fs/vm001/ on vhost-user UNIX domain socket
/var/run/vm001-vhost-fs.sock:
host# virtiofsd --socket-path=/var/run/vm001-vhost-fs.sock -o source=/var/lib/fs/vm001
host# qemu-system-x86_64 \
-chardev socket,id=char0,path=/var/run/vm001-vhost-fs.sock \
-device vhost-user-fs-pci,chardev=char0,tag=myfs \
-object memory-backend-memfd,id=mem,size=4G,share=on \
-numa node,memdev=mem \
...
guest# mount -t virtiofs myfs /mnt
AUTHOR
Stefan Hajnoczi <stefanha@redhat.com>, Masayoshi Mizuma
<m.mizuma@jp.fujitsu.com>
COPYRIGHT
2024, The QEMU Project Developers
6.2.0 Jan 29, 2024 VIRTIOFSD(1)