Linux IOMMU Support
===================

The architecture specification can be obtained from the location below.

http://www.intel.com/technology/virtualization/

This guide gives a quick cheat sheet for some basic understanding.

Some Keywords

DMAR - DMA remapping
DRHD - DMA Remapping Hardware Unit Definition
RMRR - Reserved Memory Region Reporting Structure
ZLR  - Zero length reads from PCI devices
IOVA - I/O Virtual Address

Basic stuff
-----------

ACPI enumerates and lists the different DMA engines in the platform, and
the device scope relationships between PCI devices and the DMA engine that
controls them.

What is RMRR?
-------------

Some devices are controlled by the BIOS, for example USB devices used to
perform PS/2 emulation. The regions of memory used for these devices are
marked reserved in the e820 map. When we turn on DMA translation, DMA to
those regions will fail. Hence the BIOS uses RMRR to specify these regions
along with the devices that need to access them. The OS is expected to set
up unity mappings for these regions so that these devices can access them.
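
To illustrate what a unity (1-1) mapping means, here is a minimal sketch in
Python. It is a toy model, not kernel code: it assumes 4 KiB pages, the helper
name unity_map_pages is hypothetical, and the range used is the first RMRR
from the boot message sample in this document.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages


def unity_map_pages(base, end):
    """Return {iova: phys} for every page touching [base, end], with iova == phys.

    `end` is inclusive, matching how RMRR ranges are printed at boot.
    """
    first = base & ~(PAGE_SIZE - 1)   # round the start down to a page boundary
    last = end & ~(PAGE_SIZE - 1)     # page that holds the last byte
    return {addr: addr for addr in range(first, last + PAGE_SIZE, PAGE_SIZE)}


# First RMRR range from the boot message sample.
mapping = unity_map_pages(0x000ED000, 0x000EFFFF)
for iova, phys in sorted(mapping.items()):
    print(f"iova {iova:#x} -> phys {phys:#x}")
```

Because every IOVA equals its physical address, a device that the BIOS
programmed with physical addresses keeps working once translation is on.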

How is IOVA generated?
----------------------

Well-behaved drivers call pci_map_*() before sending a command to a device
that needs to perform DMA. Once the DMA has completed and the mapping is no
longer required, the driver calls pci_unmap_*() to unmap the region.

The Intel IOMMU driver allocates a virtual address space per domain. Each
PCIe device has its own domain (hence protection). Devices under a p2p
bridge share the virtual address space with all other devices under that
bridge, due to transaction ID aliasing for p2p bridges.

IOVA generation is pretty generic. We use the same technique as vmalloc(),
but the address spaces are not global; there is a separate one for each
domain. Different DMA engines may support different numbers of domains.

We also allocate guard pages with each mapping, so we can attempt to catch
any overflow that might happen.
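
The per-domain address space and the guard pages can be sketched with a toy
model in Python. This is not how the driver is written: the Domain class and
its method names are hypothetical, a simple bump allocator stands in for the
vmalloc()-like technique, and 4 KiB pages are assumed.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages


class Domain:
    """Toy per-domain IOVA space: bump allocator plus one guard page per mapping."""

    def __init__(self, limit_pfn):
        self.next_pfn = 1            # leave page 0 unused so IOVA 0 is never handed out
        self.limit_pfn = limit_pfn   # first page frame beyond the usable space
        self.mapped = {}             # IOVA page frame -> physical page frame

    def map(self, phys_addr, size):
        """Map [phys_addr, phys_addr + size) and return the IOVA of its first byte."""
        first_pfn = phys_addr // PAGE_SIZE
        last_pfn = (phys_addr + size - 1) // PAGE_SIZE
        npages = last_pfn - first_pfn + 1
        if self.next_pfn + npages + 1 > self.limit_pfn:
            raise MemoryError("IOVA space exhausted")
        start = self.next_pfn
        for i in range(npages):
            self.mapped[start + i] = first_pfn + i
        self.next_pfn = start + npages + 1   # skip one unmapped guard page
        return start * PAGE_SIZE + phys_addr % PAGE_SIZE

    def translate(self, iova):
        """Return the physical address for an IOVA, or fail like a DMA fault."""
        pfn = iova // PAGE_SIZE
        if pfn not in self.mapped:
            raise LookupError(f"DMA fault at iova {iova:#x}")
        return self.mapped[pfn] * PAGE_SIZE + iova % PAGE_SIZE


dom_a = Domain(limit_pfn=1 << 20)
dom_b = Domain(limit_pfn=1 << 20)
iova_a = dom_a.map(0x12345678, 8192)
iova_b = dom_b.map(0x9000, 100)
print(hex(iova_a), hex(iova_b))
```

Both domains hand out addresses from the same low range, which is fine because
each device only ever sees its own domain. An access that runs past the end of
a mapping lands on the unmapped guard page and faults instead of silently
reaching the next mapping.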


Graphics Problems?
------------------

If you encounter issues with graphics devices, you can try adding the
option intel_iommu=igfx_off to turn off DMA remapping for the integrated
graphics engine.

If it happens to be a PCI device included in the INCLUDE_ALL engine,
then try enabling CONFIG_DMAR_GFX_WA to set up a 1-1 map. We hear that
graphics drivers may start using the DMA APIs in the near future, at
which point this option can be removed.

Some exceptions to IOVA
-----------------------

Interrupt ranges (0xfee00000 - 0xfeefffff) are not address translated.
The same is true for peer-to-peer transactions. Hence we reserve the
PCI MMIO address ranges so that they are not allocated as IOVAs.
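
The reservation check can be pictured as a range-overlap test. The sketch
below is a Python illustration only: the overlaps helper is hypothetical, and
the PCI MMIO window in the list is a made-up example. Only the interrupt
range comes from the text above.

```python
# Interrupt range quoted in the text; it is never address translated.
INTERRUPT_RANGE = (0xFEE00000, 0xFEEFFFFF)


def overlaps(start, end, reserved):
    """True if the inclusive range [start, end] intersects any reserved range."""
    return any(start <= r_end and r_start <= end for r_start, r_end in reserved)


# The second entry is a hypothetical PCI MMIO window, for illustration only.
reserved = [INTERRUPT_RANGE, (0xE0000000, 0xEFFFFFFF)]

in_interrupt_range = overlaps(0xFEE01000, 0xFEE01FFF, reserved)
in_free_space = overlaps(0x00100000, 0x00100FFF, reserved)
print(in_interrupt_range, in_free_space)
```

An allocator would run a test like this on each candidate range and skip any
candidate that intersects a reserved range.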


Fault reporting
---------------

When errors are reported, the DMA engine signals via an interrupt. The fault
reason and the device that caused it are printed on the console.

See below for sample.


Boot Message Sample
-------------------

Something like this gets printed indicating the presence of DMAR tables
in ACPI.

ACPI: DMAR (v001 A M I OEMDMAR 0x00000001 MSFT 0x00000097) @ 0x000000007f5b5ef0

When DMAR is being processed and initialized by ACPI, it prints the DMAR
locations and any RMRRs processed.

ACPI DMAR:Host address width 36
ACPI DMAR:DRHD (flags: 0x00000000)base: 0x00000000fed90000
ACPI DMAR:DRHD (flags: 0x00000000)base: 0x00000000fed91000
ACPI DMAR:DRHD (flags: 0x00000001)base: 0x00000000fed93000
ACPI DMAR:RMRR base: 0x000000000000ed000 end: 0x00000000000effff
ACPI DMAR:RMRR base: 0x000000007f600000 end: 0x000000007fffffff
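
When checking a boot log, it can help to pull the DRHD and RMRR entries out
of lines like these. The Python sketch below is one way to do that. The
regular expressions are written against the sample lines shown here only, the
exact message format may differ between kernel versions, and the summarize
helper is hypothetical.

```python
import re

DRHD_RE = re.compile(r"ACPI DMAR:DRHD \(flags: (0x[0-9a-f]+)\)\s*base: (0x[0-9a-f]+)")
RMRR_RE = re.compile(r"ACPI DMAR:RMRR base: (0x[0-9a-f]+) end: (0x[0-9a-f]+)")


def summarize(lines):
    """Collect DRHD units and RMRR ranges from boot log lines."""
    drhds, rmrrs = [], []
    for line in lines:
        m = DRHD_RE.search(line)
        if m:
            drhds.append({"flags": int(m.group(1), 16), "base": int(m.group(2), 16)})
            continue
        m = RMRR_RE.search(line)
        if m:
            base, end = int(m.group(1), 16), int(m.group(2), 16)
            # The printed end address is inclusive, hence the +1.
            rmrrs.append({"base": base, "end": end, "size": end - base + 1})
    return drhds, rmrrs


boot_log = [
    "ACPI DMAR:Host address width 36",
    "ACPI DMAR:DRHD (flags: 0x00000000)base: 0x00000000fed90000",
    "ACPI DMAR:DRHD (flags: 0x00000000)base: 0x00000000fed91000",
    "ACPI DMAR:DRHD (flags: 0x00000001)base: 0x00000000fed93000",
    "ACPI DMAR:RMRR base: 0x00000000000ed000 end: 0x00000000000effff",
    "ACPI DMAR:RMRR base: 0x000000007f600000 end: 0x000000007fffffff",
]

drhds, rmrrs = summarize(boot_log)
for r in rmrrs:
    print(f"RMRR {r['base']:#x}-{r['end']:#x} ({r['size'] // 1024} KiB)")
```

In this sample the third DRHD has bit 0 set in its flags. Assuming that bit
is the INCLUDE_ALL flag, that unit would be the INCLUDE_ALL engine mentioned
in the graphics section.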

When DMAR is enabled for use, you will notice:

PCI-DMA: Using DMAR IOMMU

Fault reporting
---------------

DMAR:[DMA Write] Request device [00:02.0] fault addr 6df084000
DMAR:[fault reason 05] PTE Write access is not set
DMAR:[DMA Write] Request device [00:02.0] fault addr 6df084000
DMAR:[fault reason 05] PTE Write access is not set
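
Fault messages come in pairs: a request line followed by a reason line. A
small Python sketch for pairing them up is shown below. It is based only on
the sample above. The parse_faults helper is hypothetical, and accepting
"Read" as well as "Write" is an assumption, since the sample only shows write
faults.

```python
import re

REQUEST_RE = re.compile(
    r"DMAR:\[DMA (?P<op>Read|Write)\] Request device \[(?P<dev>[0-9a-f:.]+)\] "
    r"fault addr (?P<addr>[0-9a-f]+)"
)
REASON_RE = re.compile(r"DMAR:\[fault reason (?P<code>\d+)\] (?P<text>.+)")


def parse_faults(lines):
    """Pair each request line with the reason line that follows it."""
    faults = []
    pending = None
    for line in lines:
        m = REQUEST_RE.search(line)
        if m:
            pending = {
                "op": m.group("op"),
                "device": m.group("dev"),
                "addr": int(m.group("addr"), 16),  # fault address is printed in hex
            }
            continue
        m = REASON_RE.search(line)
        if m and pending is not None:
            pending["reason_code"] = int(m.group("code"))
            pending["reason"] = m.group("text").strip()
            faults.append(pending)
            pending = None
    return faults


sample = [
    "DMAR:[DMA Write] Request device [00:02.0] fault addr 6df084000",
    "DMAR:[fault reason 05] PTE Write access is not set",
    "DMAR:[DMA Write] Request device [00:02.0] fault addr 6df084000",
    "DMAR:[fault reason 05] PTE Write access is not set",
]

faults = parse_faults(sample)
for f in faults:
    print(f["device"], f["op"], hex(f["addr"]), f["reason"])
```

Grouping the parsed faults by device and address makes it easier to spot a
single misbehaving mapping that is producing a stream of repeated messages.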

TBD
---

- For compatibility testing, we could use a unity map domain for all devices:
  just provide a 1-1 mapping of all useful memory under a single domain.
- API for paravirt ops, abstracting functionality for VMM folks.