Starting with kernel 2.2, Linux divides the privileges traditionally associated with superuser into distinct units, known as capabilities, which can be independently enabled and disabled. Capabilities are a per-thread attribute.
If file capabilities are supported: add any capability from the calling thread's bounding set to its inheritable set; drop capabilities from the bounding set (via prctl(2) PR_CAPBSET_DROP); make changes to the securebits flags.
If a thread drops a capability from its permitted set, it can never reacquire that capability (unless it execve(2)s either a set-user-ID-root program, or a program whose associated file capabilities grant that capability).
The ambient capability set can be directly modified using prctl(2). Ambient capabilities are automatically lowered if either of the corresponding permitted or inheritable capabilities is lowered.
Executing a program that changes UID or GID due to the set-user-ID or set-group-ID bits or executing a program that has any file capabilities set will clear the ambient set. Ambient capabilities are added to the permitted set and assigned to the effective set when execve(2) is called.
Using capset(2), a thread may manipulate its own capability sets (see below).
Since Linux 3.2, the file /proc/sys/kernel/cap_last_cap exposes the numerical value of the highest capability supported by the running kernel; this can be used to determine the highest bit that may be set in a capability set.
The three file capability sets are:
Enabling the file effective capability bit implies that any file permitted or inheritable capability that causes a thread to acquire the corresponding permitted capability during an execve(2) (see the transformation rules described below) will also acquire that capability in its effective set. Therefore, when assigning capabilities to a file (setcap(8), cap_set_file(3), cap_set_fd(3)), if we specify the effective flag as being enabled for any capability, then the effective flag must also be specified as enabled for all other capabilities for which the corresponding permitted or inheritable flags is enabled.
During an execve(2), the kernel calculates the new capabilities of the process using the following algorithm:
P'(ambient) = (file is privileged) ? 0 : P(ambient) P'(permitted) = (P(inheritable) & F(inheritable)) | (F(permitted) & cap_bset) | P'(ambient) P'(effective) = F(effective) ? P'(permitted) : P'(ambient) P'(inheritable) = P(inheritable) [i.e., unchanged]where:
The upshot of the above rules, combined with the capabilities transformations described above, is that when a process execve(2)s a set-user-ID-root program, or when a process with an effective UID of 0 execve(2)s a program, it gains all capabilities in its permitted and effective capability sets, except those masked out by the capability bounding set. This provides semantics that are the same as those provided by traditional UNIX systems.
Note that the bounding set masks the file permitted capabilities, but not the inherited capabilities. If a thread maintains a capability in its inherited set that is not in its bounding set, then it can still gain that capability in its permitted set by executing a file that has the capability in its inherited set.
Depending on the kernel version, the capability bounding set is either a system-wide attribute, or a per-process attribute.
Capability bounding set prior to Linux 2.6.25
In kernels before 2.6.25, the capability bounding set is a system-wide attribute that affects all threads on the system. The bounding set is accessible via the file /proc/sys/kernel/cap-bound. (Confusingly, this bit mask parameter is expressed as a signed decimal number in /proc/sys/kernel/cap-bound.)
Only the init process may set capabilities in the capability bounding set; other than that, the superuser (more precisely: programs with the CAP_SYS_MODULE capability) may only clear capabilities from this set.
On a standard system the capability bounding set always masks out the CAP_SETPCAP capability. To remove this restriction (dangerous!), modify the definition of CAP_INIT_EFF_SET in include/linux/capability.h and rebuild the kernel.
The system-wide capability bounding set feature was added to Linux starting with kernel version 2.2.11.
Capability bounding set from Linux 2.6.25 onward
From Linux 2.6.25, the capability bounding set is a per-thread attribute. (There is no longer a system-wide capability bounding set.)
A thread may remove capabilities from its capability bounding set using the prctl(2) PR_CAPBSET_DROP operation, provided it has the CAP_SETPCAP capability. Once a capability has been dropped from the bounding set, it cannot be restored to that set. A thread can determine if a capability is in its bounding set using the prctl(2) PR_CAPBSET_READ operation.
Removing capabilities from the bounding set is supported only if file capabilities are compiled into the kernel. In kernels before Linux 2.6.33, file capabilities were an optional feature configurable via the CONFIG_SECURITY_FILE_CAPABILITIES option. Since Linux 2.6.33, the configuration option has been removed and file capabilities are always part of the kernel. When file capabilities are compiled into the kernel, the init process (the ancestor of all processes) begins with a full bounding set. If file capabilities are not compiled into the kernel, then init begins with a full bounding set minus CAP_SETPCAP, because this capability has a different meaning when there are no file capabilities.
Removing a capability from the bounding set does not remove it from the thread's inherited set. However it does prevent the capability from being added back into the thread's inherited set in the future.
If a thread that has a 0 value for one or more of its user IDs wants to prevent its permitted capability set being cleared when it resets all of its user IDs to nonzero values, it can do so using the prctl(2) PR_SET_KEEPCAPS operation or the SECBIT_KEEP_CAPS securebits flag described below.
Each of the above "base" flags has a companion "locked" flag. Setting any of the "locked" flags is irreversible, and has the effect of preventing further changes to the corresponding "base" flag. The locked flags are: SECBIT_KEEP_CAPS_LOCKED, SECBIT_NO_SETUID_FIXUP_LOCKED, SECBIT_NOROOT_LOCKED, and SECBIT_NO_CAP_AMBIENT_RAISE.
The securebits flags can be modified and retrieved using the prctl(2) PR_SET_SECUREBITS and PR_GET_SECUREBITS operations. The CAP_SETPCAP capability is required to modify the flags.
The securebits flags are inherited by child processes. During an execve(2), all of the flags are preserved, except SECBIT_KEEP_CAPS which is always cleared.
An application can use the following call to lock itself, and all of its descendants, into an environment where the only way of gaining capabilities is by executing a program with associated file capabilities:
prctl(PR_SET_SECUREBITS, SECBIT_KEEP_CAPS_LOCKED | SECBIT_NO_SETUID_FIXUP | SECBIT_NO_SETUID_FIXUP_LOCKED | SECBIT_NOROOT | SECBIT_NOROOT_LOCKED);
The /proc/PID/task/TID/status file can be used to view the capability sets of a thread. The /proc/PID/status file shows the capability sets of a process's main thread. Before Linux 3.8, nonexistent capabilities were shown as being enabled (1) in these sets. Since Linux 3.8, all nonexistent capabilities (above CAP_LAST_CAP) are shown as disabled (0).
package provides a suite of routines for setting and
getting capabilities that is more comfortable and less likely
to change than the interface provided by
This package also provides the
It can be found at
Before kernel 2.6.24, and from kernel 2.6.24 to kernel 2.6.32 if file capabilities are not enabled, a thread with the CAP_SETPCAP capability can manipulate the capabilities of threads other than itself. However, this is only theoretically possible, since no thread ever has CAP_SETPCAP in either of these cases: