                The MSI Driver Guide HOWTO
        Tom L Nguyen tom.l.nguyen@intel.com
                        10/03/2003
        Revised Feb 12, 2004 by Martine Silbermann
                email: Martine.Silbermann@hp.com
        Revised Jun 25, 2004 by Tom L Nguyen
 
1. About this guide

This guide describes the basics of Message Signaled Interrupts (MSI),
the advantages of using MSI over traditional interrupt mechanisms,
and how to enable your driver to use MSI or MSI-X. Also included is
a Frequently Asked Questions (FAQ) section.

1.1 Terminology

PCI devices can be single-function or multi-function.  In either case,
when this text talks about enabling or disabling MSI on a "device
function," it is referring to one specific PCI device and function and
not to all functions on a PCI device (unless the PCI device has only
one function).
 
2. Copyright 2003 Intel Corporation

3. What is MSI/MSI-X?

Message Signaled Interrupt (MSI), as described in the PCI Local Bus
Specification Revision 2.3 or later, is an optional feature for
conventional PCI devices and a required feature for PCI Express
devices. MSI enables a device function to request service by sending
an Inbound Memory Write on its PCI bus to the FSB as a Message
Signaled Interrupt transaction. Because MSI is generated in the form
of a Memory Write, all transaction conditions, such as a Retry,
Master-Abort, Target-Abort or normal completion, are supported.
 
A PCI device that supports MSI must also support the pin IRQ assertion
interrupt mechanism to provide backward compatibility for systems that
do not support MSI. In systems that support MSI, the bus driver is
responsible for initializing the message address and message data of
the device function's MSI/MSI-X capability structure during initial
device configuration.

An MSI capable device function indicates MSI support by implementing
the MSI/MSI-X capability structure in its PCI capability list. The
device function may implement both the MSI capability structure and
the MSI-X capability structure; however, the bus driver should not
enable both.
 
The MSI capability structure contains the Message Control register,
the Message Address register and the Message Data register. These
registers give the bus driver control over MSI. The Message Control
register indicates the MSI capability supported by the device. The
Message Address register specifies the target address and the Message
Data register specifies the characteristics of the message. To request
service, the device function writes the content of the Message Data
register to the target address. The device and its software driver
are prohibited from writing to these registers.
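
As a rough illustration of the Message Control register described
above, the following user-space C sketch decodes a raw 16-bit register
value. The bit positions follow the MSI capability layout of the PCI
Local Bus Specification; the macro names, struct msi_ctrl_info and
decode_msi_ctrl() are made up for this example and are not kernel APIs.

```c
#include <stdint.h>

/* Bit layout of the 16-bit MSI Message Control register. */
#define MSI_CTRL_ENABLE    0x0001  /* MSI Enable */
#define MSI_CTRL_MMC_MASK  0x000e  /* Multiple Message Capable (log2) */
#define MSI_CTRL_MME_MASK  0x0070  /* Multiple Message Enable (log2) */
#define MSI_CTRL_64BIT     0x0080  /* 64-bit message address capable */
#define MSI_CTRL_MASKBIT   0x0100  /* per-vector masking capable */

/* Hypothetical helper type, used only in this example. */
struct msi_ctrl_info {
        int enabled;
        int vectors_supported;  /* 1, 2, 4, 8, 16 or 32 for valid encodings */
        int vectors_enabled;
        int addr64;
        int per_vector_mask;
};

/* Decode a raw Message Control value into its individual fields. */
struct msi_ctrl_info decode_msi_ctrl(uint16_t ctrl)
{
        struct msi_ctrl_info info;

        info.enabled = (ctrl & MSI_CTRL_ENABLE) != 0;
        /* Both count fields store log2 of the number of vectors. */
        info.vectors_supported = 1 << ((ctrl & MSI_CTRL_MMC_MASK) >> 1);
        info.vectors_enabled = 1 << ((ctrl & MSI_CTRL_MME_MASK) >> 4);
        info.addr64 = (ctrl & MSI_CTRL_64BIT) != 0;
        info.per_vector_mask = (ctrl & MSI_CTRL_MASKBIT) != 0;
        return info;
}
```

For instance, a register value of 0x0085 decodes as MSI enabled, four
vectors supported, one vector enabled and 64-bit addressing available.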
 
The MSI-X capability structure is an optional extension to MSI. It
uses an independent and separate capability structure. Implementing
the MSI-X capability structure has some key advantages over the MSI
capability structure, as described below.

        - Support a larger maximum number of vectors per function.

        - Provide the ability for system software to configure
        each vector with an independent message address and message
        data, specified by a table that resides in Memory Space.

        - MSI and MSI-X both support per-vector masking. Per-vector
        masking is an optional extension of MSI but a required
        feature for MSI-X. Per-vector masking provides the kernel the
        ability to mask/unmask a single MSI while running its
        interrupt service routine. If per-vector masking is
        not supported, then the device driver should provide the
        hardware/software synchronization to ensure that the device
        generates MSI when the driver wants it to do so.
 
4. Why use MSI?

MSI simplifies board design by allowing board designers to remove
out-of-band interrupt routing. MSI is another step towards a
legacy-free environment.

Due to increasing pressure on chipset and processor packages to
reduce pin count, the need for interrupt pins is expected to
diminish over time. Devices, due to pin constraints, may implement
messages to increase performance.

PCI Express endpoints use INTx emulation (in-band messages) instead
of IRQ pin assertion. Using INTx emulation requires interrupt
sharing among devices connected to the same node (PCI bridge), while
MSI is unique (non-shared) and does not require BIOS configuration
support. As a result, the PCI Express technology requires MSI
support for better interrupt performance.

Using MSI enables the device functions to support two or more
vectors, which can be configured to target different CPUs to
increase scalability.
 
5. Configuring a driver to use MSI/MSI-X

By default, the kernel does not enable MSI/MSI-X on devices that
support this capability. The CONFIG_PCI_MSI kernel option
must be selected to enable MSI/MSI-X support.

5.1 Including MSI/MSI-X support into the kernel

To allow MSI/MSI-X capable device drivers to selectively enable
MSI/MSI-X (using pci_enable_msi()/pci_enable_msix() as described
below), the VECTOR based scheme needs to be enabled by setting
CONFIG_PCI_MSI during kernel config.

Since the target of the inbound message is the local APIC,
CONFIG_X86_LOCAL_APIC must be enabled as well as CONFIG_PCI_MSI.
 
5.2 Configuring for MSI support

Due to the non-contiguous fashion in vector assignment of the
existing Linux kernel, this version does not support multiple
messages, regardless of whether a device function is capable of
supporting more than one vector. Enabling MSI on a device function's
MSI capability structure requires a device driver to call the
function pci_enable_msi() explicitly.

5.2.1 API pci_enable_msi

int pci_enable_msi(struct pci_dev *dev)

With this new API, a device driver that wants to have MSI
enabled on its device function must call this API to enable MSI.
A successful call will initialize the MSI capability structure
with ONE vector, regardless of whether a device function is
capable of supporting multiple messages. This vector replaces the
pre-assigned dev->irq with a new MSI vector. To avoid a conflict
between the newly assigned vector and the existing pre-assigned
vector, a device driver must call this API before calling
request_irq().
 
5.2.2 API pci_disable_msi

void pci_disable_msi(struct pci_dev *dev)

This API should always be used to undo the effect of pci_enable_msi()
when a device driver is unloading. This API restores dev->irq with
the pre-assigned IOAPIC vector and switches a device's interrupt
mode to PCI pin-irq assertion/INTx emulation mode.

Note that a device driver should always call free_irq() on the MSI vector
that it has done request_irq() on before calling this API. Failure to do
so results in a BUG_ON(), and the device will be left with MSI enabled
and will leak its vector.
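
The call ordering described in 5.2.1 and 5.2.2 can be sketched as a
small user-space simulation. Nothing below is kernel code: struct
mock_pci_dev and the mock_* functions are simplified stand-ins invented
for this example, and the vector number 193 is arbitrary. The sketch
only models how dev->irq changes and why free_irq() has to come before
pci_disable_msi().

```c
#include <assert.h>

/* Mock device state; a stand-in for struct pci_dev, not the kernel type. */
struct mock_pci_dev {
        unsigned int irq;      /* value a driver would pass to request_irq() */
        unsigned int pin_irq;  /* saved pre-assigned IOAPIC vector */
        int msi_enabled;
        int irq_requested;
};

#define MOCK_MSI_VECTOR 193u   /* arbitrary value for the simulation */

/* Models pci_enable_msi(): dev->irq is replaced by the MSI vector. */
int mock_pci_enable_msi(struct mock_pci_dev *dev)
{
        dev->pin_irq = dev->irq;
        dev->irq = MOCK_MSI_VECTOR;
        dev->msi_enabled = 1;
        return 0;
}

/* Model request_irq()/free_irq() acting on dev->irq. */
int mock_request_irq(struct mock_pci_dev *dev)
{
        dev->irq_requested = 1;
        return 0;
}

void mock_free_irq(struct mock_pci_dev *dev)
{
        dev->irq_requested = 0;
}

/* Models pci_disable_msi(): the vector must already have been freed. */
void mock_pci_disable_msi(struct mock_pci_dev *dev)
{
        assert(!dev->irq_requested);  /* the kernel would BUG_ON() here */
        dev->irq = dev->pin_irq;
        dev->msi_enabled = 0;
}

/* Driver load: enable MSI first, then request whatever dev->irq holds. */
int example_probe(struct mock_pci_dev *dev)
{
        /* On failure dev->irq would be unchanged and still name the pin IRQ. */
        (void)mock_pci_enable_msi(dev);
        return mock_request_irq(dev);
}

/* Driver unload: free the irq first, then switch back to legacy mode. */
void example_remove(struct mock_pci_dev *dev)
{
        mock_free_irq(dev);
        if (dev->msi_enabled)
                mock_pci_disable_msi(dev);
}
```

In a real driver the same ordering applies to the kernel functions
pci_enable_msi(), request_irq(), free_irq() and pci_disable_msi().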
 
5.2.3 MSI mode vs. legacy mode diagram

The diagram below shows the events which switch the interrupt
mode on the MSI-capable device function between MSI mode and
PIN-IRQ assertion mode.

         ------------   pci_enable_msi   ------------------------
        |            | <=============== |                        |
        | MSI MODE   |                  | PIN-IRQ ASSERTION MODE |
        |            | ===============> |                        |
         ------------   pci_disable_msi  ------------------------


Figure 1. MSI Mode vs. Legacy Mode
 
In Figure 1, a device operates by default in legacy mode. Legacy
in this context means PCI pin-irq assertion or PCI-Express INTx
emulation. A successful MSI request (using pci_enable_msi()) switches
a device's interrupt mode to MSI mode. A pre-assigned IOAPIC vector
stored in dev->irq will be saved by the PCI subsystem and a newly
assigned MSI vector will replace dev->irq.

To return to its default mode, a device driver should always call
pci_disable_msi() to undo the effect of pci_enable_msi(). Note that a
device driver should always call free_irq() on the MSI vector it has
done request_irq() on before calling pci_disable_msi(). Failure to do
so results in a BUG_ON(), and the device will be left with MSI enabled
and will leak its vector. Otherwise, the PCI subsystem restores a
device's dev->irq with a pre-assigned IOAPIC vector and marks the
released MSI vector as unused.

Once the vector is marked as unused, there is no guarantee that the
PCI subsystem will reserve this MSI vector for a device. Depending on
the availability of current PCI vector resources and the number of
MSI/MSI-X requests from other drivers, this MSI vector may be
re-assigned.

For the case where the PCI subsystem re-assigns this MSI vector to
another driver, a request to switch back to MSI mode may result
in being assigned a different MSI vector or a failure if no more
vectors are available.
 
5.3 Configuring for MSI-X support

Due to the ability of the system software to configure each vector of
the MSI-X capability structure with an independent message address
and message data, the non-contiguous fashion in vector assignment of
the existing Linux kernel has no impact on supporting multiple
messages on MSI-X capable device functions. Enabling MSI-X on
a device function's MSI-X capability structure requires its device
driver to call the function pci_enable_msix() explicitly.

The function pci_enable_msix(), once invoked, enables either
all or nothing, depending on the current availability of PCI vector
resources. If the PCI vector resources are available for the number
of vectors requested by a device driver, this function will configure
the MSI-X table of the MSI-X capability structure of a device with
the requested messages. A driver need not request every vector the
device supports: for example, a device may be capable of supporting
a maximum of 32 vectors while its software driver may usually request
only 4 vectors. It is recommended that the device driver call this
function once during the initialization phase of the device driver.

Unlike the function pci_enable_msi(), the function pci_enable_msix()
does not replace the pre-assigned IOAPIC dev->irq with a new MSI
vector because the PCI subsystem writes the 1:1 vector-to-entry mapping
into the field 'vector' of each element contained in the second argument.
Note that the pre-assigned IOAPIC dev->irq is valid only if the device
operates in PIN-IRQ assertion mode. In MSI-X mode, any attempt by the
device driver to use dev->irq to request interrupt service may result
in unpredictable behavior.

For each MSI-X vector granted, a device driver is responsible for calling
other functions like request_irq(), enable_irq(), etc. to enable
this vector with its corresponding interrupt service handler. It is
a device driver's choice to assign all vectors the same interrupt
service handler or each vector a unique interrupt service handler.
 
5.3.1 Handling MMIO address space of MSI-X Table

The PCI 3.0 specification has implementation notes stating that the
MMIO address space for a device's MSI-X structure should be isolated
so that the software system can set different pages for controlling
accesses to the MSI-X structure. The implementation of MSI support
requires the PCI subsystem, not a device driver, to maintain full
control of the MSI-X table/MSI-X PBA (Pending Bit Array) and the MMIO
address space of the MSI-X table/MSI-X PBA.  A device driver is
prohibited from requesting the MMIO address space of the MSI-X
table/MSI-X PBA. Otherwise, the PCI subsystem will fail to enable
MSI-X on its hardware device when it calls the function
pci_enable_msix().
 
5.3.2 API pci_enable_msix

int pci_enable_msix(struct pci_dev *dev, struct msix_entry *entries, int nvec)

This API enables a device driver to request the PCI subsystem
to enable MSI-X messages on its hardware device. Depending on
the availability of PCI vector resources, the PCI subsystem enables
either all or none of the requested vectors.

Argument 'dev' points to the device (pci_dev) structure.

Argument 'entries' is a pointer to an array of msix_entry structs.
The number of entries is indicated in argument 'nvec'.
struct msix_entry is defined in /driver/pci/msi.h:

struct msix_entry {
        u16     vector; /* kernel uses to write alloc vector */
        u16     entry; /* driver uses to specify entry */
};

A device driver is responsible for initializing the field 'entry' of
each element with a unique entry supported by the MSI-X table.
Otherwise, -EINVAL will be returned as a result. A successful return
of zero indicates the PCI subsystem completed initializing each of the
requested entries of the MSI-X table with message address and message
data. Finally, the PCI subsystem will write the 1:1 vector-to-entry
mapping into the field 'vector' of each element. A device driver is
responsible for keeping track of allocated MSI-X vectors in its
internal data structure.

A return of zero indicates that the requested number of MSI-X vectors
was successfully allocated. A return of greater than zero indicates
an MSI-X vector shortage. A return of less than zero indicates
a failure. This failure may be a result of duplicate entries
specified in the second argument, or a result of no available vector,
or a result of failing to initialize MSI-X table entries.
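
The preparation of the 'entries' array and the three classes of return
value can be sketched in user-space C as follows. The struct below is a
local copy of the layout quoted above using fixed-width types; the
helper functions and the enum are hypothetical names introduced only
for this example, and no kernel function is called.

```c
#include <stdint.h>

/* Local copy of the layout quoted above, with user-space types. */
struct msix_entry {
        uint16_t vector;  /* written by the kernel on success */
        uint16_t entry;   /* written by the driver before the call */
};

/* Give each element a unique MSI-X table index: 0, 1, ..., nvec - 1. */
void init_msix_entries(struct msix_entry *entries, int nvec)
{
        int i;

        for (i = 0; i < nvec; i++) {
                entries[i].entry = (uint16_t)i;
                entries[i].vector = 0;
        }
}

/* Return 1 if two elements name the same table entry, else 0. */
int has_duplicate_entries(const struct msix_entry *entries, int nvec)
{
        int i, j;

        for (i = 0; i < nvec; i++)
                for (j = i + 1; j < nvec; j++)
                        if (entries[i].entry == entries[j].entry)
                                return 1;
        return 0;
}

enum msix_result { MSIX_OK, MSIX_SHORTAGE, MSIX_FAILED };

/* Map a pci_enable_msix() style return code onto the cases in the text. */
enum msix_result classify_msix_return(int rc)
{
        if (rc == 0)
                return MSIX_OK;        /* all requested vectors allocated */
        if (rc > 0)
                return MSIX_SHORTAGE;  /* not enough vectors available */
        return MSIX_FAILED;            /* e.g. duplicate entries */
}
```

A driver would fill the array this way before the call and, on a zero
return, read the allocated vector for each element from its 'vector'
field.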
 
5.3.3 API pci_disable_msix

void pci_disable_msix(struct pci_dev *dev)

This API should always be used to undo the effect of pci_enable_msix()
when a device driver is unloading. Note that a device driver should
always call free_irq() on all MSI-X vectors it has done request_irq()
on before calling this API. Failure to do so results in a BUG_ON(), and
the device will be left with MSI-X enabled and will leak its vectors.
 
5.3.4 MSI-X mode vs. legacy mode diagram

The diagram below shows the events which switch the interrupt
mode on the MSI-X capable device function between MSI-X mode and
PIN-IRQ assertion mode (legacy).

         ------------   pci_enable_msix(,,n) ------------------------
        |            | <===============     |                        |
        | MSI-X MODE |                      | PIN-IRQ ASSERTION MODE |
        |            | ===============>     |                        |
         ------------   pci_disable_msix     ------------------------

Figure 2. MSI-X Mode vs. Legacy Mode
 
In Figure 2, a device operates by default in legacy mode. A
successful MSI-X request (using pci_enable_msix()) switches a
device's interrupt mode to MSI-X mode. A pre-assigned IOAPIC vector
stored in dev->irq will be saved by the PCI subsystem; however,
unlike MSI mode, the PCI subsystem will not replace dev->irq with
the assigned MSI-X vector because the PCI subsystem already writes
the 1:1 vector-to-entry mapping into the field 'vector' of each
element specified in the second argument.

To return to its default mode, a device driver should always call
pci_disable_msix() to undo the effect of pci_enable_msix(). Note that
a device driver should always call free_irq() on all MSI-X vectors it
has done request_irq() on before calling pci_disable_msix(). Failure
to do so results in a BUG_ON(), and the device will be left with MSI-X
enabled and will leak its vectors. Otherwise, the PCI subsystem
switches a device function's interrupt mode from MSI-X mode to legacy
mode and marks all allocated MSI-X vectors as unused.

Once the vectors are marked as unused, there is no guarantee that the
PCI subsystem will reserve these MSI-X vectors for a device. Depending
on the availability of current PCI vector resources and the number of
MSI/MSI-X requests from other drivers, these MSI-X vectors may be
re-assigned.

For the case where the PCI subsystem re-assigns these MSI-X vectors
to other drivers, a request to switch back to MSI-X mode may result
in being assigned another set of MSI-X vectors or a failure if no
more vectors are available.
 
5.4 Handling function implementing both MSI and MSI-X capabilities

For the case where a function implements both MSI and MSI-X
capabilities, the PCI subsystem enables a device to run either in MSI
mode or MSI-X mode but not both. A device driver determines whether it
wants MSI or MSI-X enabled on its hardware device. Once a device
driver requests MSI, for example, it is prohibited from requesting
MSI-X; in other words, a device driver is not permitted to ping-pong
between MSI mode and MSI-X mode at run time.
 
5.5 Hardware requirements for MSI/MSI-X support

MSI/MSI-X support requires support from both system hardware and
individual hardware device functions.

5.5.1 Required x86 hardware support

Since the target of the MSI address is the CPU's local APIC, enabling
MSI/MSI-X support in the Linux kernel is dependent on whether existing
system hardware supports local APIC. Users should verify that their
system supports local APIC operation by testing that it runs when
CONFIG_X86_LOCAL_APIC=y.

In an SMP environment, CONFIG_X86_LOCAL_APIC is automatically set;
however, in a UP environment, users must manually set
CONFIG_X86_LOCAL_APIC. Once CONFIG_X86_LOCAL_APIC=y, setting
CONFIG_PCI_MSI enables the VECTOR based scheme and the option for
MSI-capable device drivers to selectively enable MSI/MSI-X.

Note that the CONFIG_X86_IO_APIC setting is irrelevant because
MSI/MSI-X vectors are newly allocated at run time and MSI/MSI-X
support does not depend on BIOS support. This independence enables
MSI/MSI-X support on future IOxAPIC free platforms.
 
5.5.2 Device hardware support

The hardware device function supports MSI by indicating the
MSI/MSI-X capability structure on its PCI capability list. By
default, this capability structure will not be initialized by
the kernel to enable MSI during the system boot. In other words,
the device function is running in its default pin assertion mode.
Note that in many cases the hardware supporting MSI has bugs,
which may result in system hangs. The software driver of specific
MSI-capable hardware is responsible for deciding whether to call
pci_enable_msi() or not. A return of zero indicates the kernel
successfully initialized the MSI/MSI-X capability structure of the
device function. The device function is now running in MSI/MSI-X mode.
 
5.6 How to tell whether MSI/MSI-X is enabled on device function

At the driver level, a return of zero from the function call of
pci_enable_msi()/pci_enable_msix() indicates to a device driver that
its device function is initialized successfully and ready to run in
MSI/MSI-X mode.

At the user level, users can use the command 'cat /proc/interrupts'
to display the vectors allocated for devices and their interrupt
MSI/MSI-X modes ("PCI-MSI"/"PCI-MSI-X"). The output below shows MSI
mode enabled on a SCSI Adaptec 39320D Ultra320 controller.
 
           CPU0       CPU1
  0:     324639          0    IO-APIC-edge  timer
  1:       1186          0    IO-APIC-edge  i8042
  2:          0          0          XT-PIC  cascade
 12:       2797          0    IO-APIC-edge  i8042
 14:       6543          0    IO-APIC-edge  ide0
 15:          1          0    IO-APIC-edge  ide1
169:          0          0   IO-APIC-level  uhci-hcd
185:          0          0   IO-APIC-level  uhci-hcd
193:        138         10         PCI-MSI  aic79xx
201:         30          0         PCI-MSI  aic79xx
225:         30          0   IO-APIC-level  aic7xxx
233:         30          0   IO-APIC-level  aic7xxx
NMI:          0          0
LOC:     324553     325068
ERR:          0
MIS:          0
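
A user-space tool could pick out the MSI and MSI-X lines of such a
listing with a check like the one below. This is only a sketch: the
enum and function names are invented for the example, and it is a plain
substring match on the "PCI-MSI"/"PCI-MSI-X" tags mentioned above, so a
device name containing one of those strings would be misclassified.

```c
#include <string.h>

enum irq_mode { IRQ_MODE_OTHER, IRQ_MODE_MSI, IRQ_MODE_MSIX };

/* Classify one line of /proc/interrupts by its interrupt-type tag. */
enum irq_mode classify_interrupt_line(const char *line)
{
        /* Test the longer tag first: "PCI-MSI" is a prefix of "PCI-MSI-X". */
        if (strstr(line, "PCI-MSI-X") != NULL)
                return IRQ_MODE_MSIX;
        if (strstr(line, "PCI-MSI") != NULL)
                return IRQ_MODE_MSI;
        return IRQ_MODE_OTHER;
}
```

Applied to the listing above, the two aic79xx lines are reported as MSI
and every other line as neither MSI nor MSI-X.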
 
6. MSI quirks

Several PCI chipsets or devices are known to not support MSI.
The PCI stack provides 3 possible levels of MSI disabling:
* on a single device
* on all devices behind a specific bridge
* globally

6.1. Disabling MSI on a single device

Under some circumstances it might be required to disable MSI on a
single device.  This may be achieved either by not calling
pci_enable_msi() at all, or by setting the pci_dev->no_msi flag
beforehand (most of the time in a quirk).
 
6.2. Disabling MSI below a bridge

The vast majority of MSI quirks are required by PCI bridges not
being able to route MSI between busses. In this case, MSI has to be
disabled on all devices behind this bridge. This is achieved by setting
the PCI_BUS_FLAGS_NO_MSI flag in the pci_bus->bus_flags of the bridge
subordinate bus. There is no need to set the same flag on bridges that
are below the broken bridge. When pci_enable_msi() is called to enable
MSI on a device, pci_msi_supported() takes care of checking the NO_MSI
flag in all parent busses of the device.
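
The parent-bus check described above can be modelled with a short
user-space sketch. struct ex_bus, EX_BUS_FLAGS_NO_MSI and
ex_msi_allowed() are simplified stand-ins invented for this example;
they are not the kernel's struct pci_bus, PCI_BUS_FLAGS_NO_MSI or
pci_msi_supported(), and the real check may consider more than this
flag.

```c
#include <stddef.h>

#define EX_BUS_FLAGS_NO_MSI 0x1u  /* stands in for PCI_BUS_FLAGS_NO_MSI */

/* Simplified model of a bus; not the kernel's struct pci_bus. */
struct ex_bus {
        struct ex_bus *parent;    /* NULL at the root bus */
        unsigned int bus_flags;
};

/* Return 1 if MSI may be enabled for a device on 'bus', 0 otherwise. */
int ex_msi_allowed(const struct ex_bus *bus)
{
        /* Walk from the device's own bus up to the root. */
        for (; bus != NULL; bus = bus->parent)
                if (bus->bus_flags & EX_BUS_FLAGS_NO_MSI)
                        return 0;
        return 1;
}
```

Because the walk visits every ancestor, flagging only the subordinate
bus of the broken bridge is enough to cover all busses below it.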
 
Some bridges actually support enabling/disabling MSI dynamically
by changing some bits in their PCI configuration space (especially
the Hypertransport chipsets such as the nVidia nForce and Serverworks
HT2000). It may then be required to update the NO_MSI flag on the
corresponding devices in the sysfs hierarchy. To enable MSI support
on device "0000:00:0e", do:

        echo 1 > /sys/bus/pci/devices/0000:00:0e/msi_bus

To disable MSI support, echo 0 instead of 1. Note that it should be
used with caution since changing this value might break interrupts.
 
6.3. Disabling MSI globally

Some extreme cases may require disabling MSI globally on the system.
For now, the only known case is a Serverworks PCI-X chipset (MSI is
not supported on several busses that are not all connected to the
chipset in the Linux PCI hierarchy). In the vast majority of other
cases, disabling only behind a specific bridge is enough.

For debugging purposes, the user may also pass pci=nomsi on the kernel
command-line to explicitly disable MSI globally. But, once the
appropriate quirks are added to the kernel, this option should not be
required anymore.
 
6.4. Finding why MSI cannot be enabled on a device

Assuming that MSI is not enabled on a device, you should look at
dmesg to find messages that quirks may output when disabling MSI
on some devices, some bridges or even globally.
Then, lspci -t gives the list of bridges above a device. Reading
/sys/bus/pci/devices/0000:00:0e/msi_bus will tell you whether MSI
is enabled (1) or disabled (0). If 0 is found in a single bridge
msi_bus file above the device, MSI cannot be enabled.
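
Reading one of these msi_bus files from a program might look like the
sketch below. The function name read_msi_bus() is made up for this
example, and it assumes the file starts with the single character '0'
or '1' as described above.

```c
#include <stdio.h>

/*
 * Read a sysfs msi_bus attribute.
 * Returns 1 (MSI enabled), 0 (MSI disabled), or -1 if the file
 * cannot be opened or does not start with '0' or '1'.
 */
int read_msi_bus(const char *path)
{
        FILE *f = fopen(path, "r");
        int c;

        if (f == NULL)
                return -1;
        c = fgetc(f);
        fclose(f);
        if (c == '1')
                return 1;
        if (c == '0')
                return 0;
        return -1;
}
```

A caller would pass a path such as
"/sys/bus/pci/devices/0000:00:0e/msi_bus" for each bridge above the
device and stop at the first bridge that reports 0.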
 
7. FAQ

Q1. Are there any limitations on using the MSI?

A1. If the PCI device supports MSI and conforms to the
specification and the platform supports the APIC local bus,
then using MSI should work.

Q2. Will it work on all the Pentium processors (P3, P4, Xeon,
AMD processors)? In P3, IPIs are transmitted on the APIC local
bus and in P4 and Xeon they are transmitted on the system
bus. Are there any implications with this?

A2. MSI support enables a PCI device to send an inbound
memory write (0xfeexxxxx as target address) on its PCI bus
directly to the FSB. Since the message address has a
redirection hint bit cleared, it should work.
 
Q3. The target address 0xfeexxxxx will be translated by the
Host Bridge into an interrupt message. Are there any
limitations on the chipsets such as Intel 8xx, Intel e7xxx,
or VIA?

A3. If these chipsets support an inbound memory write with
target address set as 0xfeexxxxx, in conformance with PCI
specification 2.3 or later, then it should work.

Q4. From the driver point of view, if the MSI is lost because
of errors occurring during inbound memory write, then it may
wait forever. Is there a mechanism for it to recover?

A4. Since the target of the transaction is an inbound memory
write, all transaction termination conditions (Retry,
Master-Abort, Target-Abort, or normal completion) are
supported. A device sending an MSI must abide by all the PCI
rules and conditions regarding that inbound memory write. So,
if a retry is signaled it must retry, etc. We believe that
the recommendation for Abort is also a retry (refer to PCI
specification 2.3 or later).
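
To make the 0xfeexxxxx target address from A2 and Q3 more concrete,
here is a sketch of how such an address can be composed on x86. The
field positions (destination APIC ID in bits 19:12, redirection hint in
bit 3, destination mode in bit 2) reflect the x86 MSI address format as
commonly documented; the macro and function names are invented for this
example and are not kernel definitions.

```c
#include <stdint.h>

#define MSI_ADDR_BASE          0xfee00000u  /* the 0xfeexxxxx window */
#define MSI_ADDR_DEST_SHIFT    12           /* destination APIC ID field */
#define MSI_ADDR_REDIR_HINT    (1u << 3)    /* redirection hint bit */
#define MSI_ADDR_DEST_LOGICAL  (1u << 2)    /* destination mode bit */

/* Build the 32-bit message address for a given destination APIC ID. */
uint32_t msi_compose_address(uint8_t dest_apic_id, int redirection_hint,
                             int logical_mode)
{
        uint32_t addr = MSI_ADDR_BASE |
                        ((uint32_t)dest_apic_id << MSI_ADDR_DEST_SHIFT);

        if (redirection_hint)
                addr |= MSI_ADDR_REDIR_HINT;
        if (logical_mode)
                addr |= MSI_ADDR_DEST_LOGICAL;
        return addr;
}
```

With the redirection hint cleared, as A2 describes, the address for
APIC ID 1 in physical destination mode is 0xfee01000.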
