user/sven/linux.git/include/uapi/linux/vfio.h, branch v4.9.223

vfio/pci: Intel IGD host and LCP bridge config space access

2016-02-22T23:10:09Z

Provide read-only access to PCI config space of the PCI host bridge and LPC bridge through device specific regions. This may be used to configure a VM with matching register contents to satisfy driver requirements. Providing this through the vfio file descriptor removes an additional userspace requirement for access through pci-sysfs and removes the CAP_SYS_ADMIN requirement that doesn't appear to apply to the specific devices we're accessing. Signed-off-by: Alex Williamson

vfio/pci: Intel IGD OpRegion support

2016-02-22T23:10:09Z

This is the first consumer of vfio device specific resource support, providing read-only access to the OpRegion for Intel graphics devices. Signed-off-by: Alex Williamson

vfio: Define device specific region type capability

2016-02-22T23:10:09Z

To this point vfio has only provided an interface to the user that allows them to determine the number of regions and specifics about each region. What the region represents is left to the vfio bus driver. vfio-pci chooses to use fixed indexes for fixed resources, index 0 is BAR0, 1 is BAR1,... 7 is config space, etc. This works pretty well since all PCI devices have these regions, even if they don't necessarily populate all of them. Then we start to add things like VGA, which only certain device even support. We added this the same way, but now we've wasted a region index, and due to our offset implementation the corresponding address space, for all devices. Rather than continuing that process, let's try to make regions self describing by including a capability that defines their type. For vfio-pci we'll make the current VFIO_PCI_NUM_REGIONS fixed, defining the end of the static indexes and the beginning of self describing regions. Signed-off-by: Alex Williamson

vfio: Define sparse mmap capability for regions

2016-02-22T23:10:08Z

We can't always support mmap across an entire device region, for example we deny mmaps covering the MSI-X table of PCI devices, but we don't really have a way to report it. We expect the user to implicitly know this restriction. We also can't split the region because vfio-pci defines an API with fixed region index to BAR number mapping. We therefore define a new capability which lists areas within the region that may be mmap'd. In addition to the MSI-X case, this potentially enables in-kernel emulation and extensions to devices. Signed-off-by: Alex Williamson

vfio: Define capability chains

2016-02-22T23:10:08Z

We have a few cases where we need to extend the data returned from the INFO ioctls in VFIO. For instance we already have devices exposed through vfio-pci where VFIO_DEVICE_GET_REGION_INFO reports the region as mmap-capable, but really only supports sparse mmaps, avoiding the MSI-X table. If we wanted to provide in-kernel emulation or extended functionality for devices, we'd also want the ability to tell the user not to mmap various regions, rather than forcing them to figure it out on their own. Another example is VFIO_IOMMU_GET_INFO. We'd really like to expose the actual IOVA capabilities of the IOMMU rather than letting the user assume the address space they have available to them. We could add IOVA base and size fields to struct vfio_iommu_type1_info, but what if we have multiple IOVA ranges. For instance x86 uses a range of addresses at 0xfee00000 for MSI vectors. These typically are not available for standard DMA IOVA mappings and splits our available IOVA space into two ranges. POWER systems have both an IOVA window below 4G as well as dynamic data window which they can use to remap all of guest memory. Representing variable sized arrays within a fixed structure makes it very difficult to parse, we'd therefore like to put this data beyond fixed fields within the data structures. One way to do this is to emulate capabilities in PCI configuration space. A new flag indciates whether capabilties are supported and a new fixed field reports the offset of the first entry. Users can then walk the chain to find capabilities, adding capabilities does not require additional fields in the fixed structure, and parsing variable sized data becomes trivial. This patch outlines the theory and base header structure, which should be shared by all future users. Signed-off-by: Alex Williamson

vfio: Include No-IOMMU mode

2015-12-21T22:28:11Z

There is really no way to safely give a user full access to a DMA capable device without an IOMMU to protect the host system. There is also no way to provide DMA translation, for use cases such as device assignment to virtual machines. However, there are still those users that want userspace drivers even under those conditions. The UIO driver exists for this use case, but does not provide the degree of device access and programming that VFIO has. In an effort to avoid code duplication, this introduces a No-IOMMU mode for VFIO. This mode requires building VFIO with CONFIG_VFIO_NOIOMMU and enabling the "enable_unsafe_noiommu_mode" option on the vfio driver. This should make it very clear that this mode is not safe. Additionally, CAP_SYS_RAWIO privileges are necessary to work with groups and containers using this mode. Groups making use of this support are named /dev/vfio/noiommu-$GROUP and can only make use of the special VFIO_NOIOMMU_IOMMU for the container. Use of this mode, specifically binding a device without a native IOMMU group to a VFIO bus driver will taint the kernel and should therefore not be considered supported. This patch includes no-iommu support for the vfio-pci bus driver only. Signed-off-by: Alex Williamson Acked-by: Michael S. Tsirkin

vfio: Add explicit alignments in vfio_iommu_spapr_tce_create

2015-12-21T22:28:11Z

The vfio_iommu_spapr_tce_create struct has 4x32bit and 2x64bit fields which should have resulted in sizeof(fio_iommu_spapr_tce_create) equal to 32 bytes. However due to the gcc's default alignment, the actual size of this struct is 40 bytes. This fills gaps with __resv1/2 fields. This should not cause any change in behavior. Signed-off-by: Alexey Kardashevskiy Acked-by: David Gibson Signed-off-by: Alex Williamson

Revert: "vfio: Include No-IOMMU mode"

2015-12-04T15:38:42Z

Revert commit 033291eccbdb ("vfio: Include No-IOMMU mode") due to lack of a user. This was originally intended to fill a need for the DPDK driver, but uptake has been slow so rather than support an unproven kernel interface revert it and revisit when userspace catches up. Signed-off-by: Alex Williamson

vfio: Include No-IOMMU mode

2015-11-04T16:56:16Z

vfio: powerpc/spapr: Support Dynamic DMA windows

2015-06-11T05:16:55Z

This adds create/remove window ioctls to create and remove DMA windows. sPAPR defines a Dynamic DMA windows capability which allows para-virtualized guests to create additional DMA windows on a PCI bus. The existing linux kernels use this new window to map the entire guest memory and switch to the direct DMA operations saving time on map/unmap requests which would normally happen in a big amounts. This adds 2 ioctl handlers - VFIO_IOMMU_SPAPR_TCE_CREATE and VFIO_IOMMU_SPAPR_TCE_REMOVE - to create and remove windows. Up to 2 windows are supported now by the hardware and by this driver. This changes VFIO_IOMMU_SPAPR_TCE_GET_INFO handler to return additional information such as a number of supported windows and maximum number levels of TCE tables. DDW is added as a capability, not as a SPAPR TCE IOMMU v2 unique feature as we still want to support v2 on platforms which cannot do DDW for the sake of TCE acceleration in KVM (coming soon). Signed-off-by: Alexey Kardashevskiy [aw: for the vfio related changes] Acked-by: Alex Williamson Reviewed-by: David Gibson Signed-off-by: Michael Ellerman