  libATA Developer's Guide

  Jeff Garzik

  Copyright 2003-2006 Jeff Garzik

   The contents of this file are subject to the Open
   Software License version 1.1 that can be found at
   http://www.opensource.org/licenses/osl-1.1.txt and is included herein
   by reference.

   Alternatively, the contents of this file may be used under the terms
   of the GNU General Public License version 2 (the "GPL") as distributed
   in the kernel source COPYING file, in which case the provisions of
   the GPL are applicable instead of the above.  If you wish to allow
   the use of your version of this file only under the terms of the
   GPL and not to allow others to use your version of this file under
   the OSL, indicate your decision by deleting the provisions above and
   replace them with the notice and other provisions required by the GPL.
   If you do not delete the provisions above, a recipient may use your
   version of this file under either the OSL or the GPL.

     Introduction

  libATA is a library used inside the Linux kernel to support ATA host
  controllers and devices.  libATA provides an ATA driver API, class
  transports for ATA and ATAPI devices, and SCSI<->ATA translation
  for ATA devices according to the T10 SAT specification.

  This Guide documents the libATA driver API, library functions, library
  internals, and a couple of sample ATA low-level drivers.

     libata Driver API

     struct ata_port_operations is defined for every low-level libata
     hardware driver, and it controls how the low-level driver
     interfaces with the ATA and SCSI layers.

     FIS-based drivers will hook into the system with ->qc_prep() and
     ->qc_issue() high-level hooks.  Hardware which behaves in a manner
     similar to PCI IDE hardware may utilize several generic helpers,
     defining at a bare minimum the bus I/O addresses of the ATA shadow
     register blocks.

        struct ata_port_operations

        Disable ATA port

void (*port_disable) (struct ata_port *);

        Called from ata_bus_probe() and ata_bus_reset() error paths,
        as well as when unregistering from the SCSI module (rmmod, hot
        unplug).
        This function should do whatever needs to be done to take the
        port out of use.  In most cases, ata_port_disable() can be used
        as this hook.

        Called from ata_bus_probe() on a failed probe.
        Called from ata_bus_reset() on a failed bus reset.
        Called from ata_scsi_release().

        Post-IDENTIFY device configuration

void (*dev_config) (struct ata_port *, struct ata_device *);

        Called after IDENTIFY [PACKET] DEVICE is issued to each device
        found.  Typically used to apply device-specific fixups prior to
        issue of SET FEATURES - XFER MODE, and prior to operation.

        Called by ata_device_add() after ata_dev_identify() determines
        a device is present.

        This entry may be specified as NULL in ata_port_operations.

        Set PIO/DMA mode

void (*set_piomode) (struct ata_port *, struct ata_device *);
void (*set_dmamode) (struct ata_port *, struct ata_device *);
void (*post_set_mode) (struct ata_port *);
unsigned int (*mode_filter) (struct ata_port *, struct ata_device *, unsigned int);

        Hooks called prior to the issue of SET FEATURES - XFER MODE
        command.  The optional ->mode_filter() hook is called when libata
        has built a mask of the possible modes. This is passed to the
        ->mode_filter() function which should return a mask of valid modes
        after filtering those unsuitable due to hardware limits. It is not
        valid to use this interface to add modes.

        dev->pio_mode and dev->dma_mode are guaranteed to be valid when
        ->set_piomode() and ->set_dmamode() are called. The timings for
        any other drive sharing the cable will also be valid at this point.
        That is, the library records the decisions for the modes of each
        drive on a channel before it attempts to set any of them.

        ->post_set_mode() is called unconditionally, after the
        SET FEATURES - XFER MODE command completes successfully.

        ->set_piomode() is always called (if present), but
        ->set_dmamode() is only called if DMA is possible.

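
        The ->mode_filter() contract described above (return a subset of
        the mask that was passed in, never a superset) can be sketched in
        user space as follows.  The mask layout, the EX_* constants and
        ex_mode_filter() are assumptions made for this illustration; they
        are not libata definitions, and the port and device arguments of
        the real hook are omitted.

```c
#include <assert.h>

/* Hypothetical mask layout for illustration only: one bit per mode. */
#define EX_MASK_PIO      0x0000001fu  /* PIO0..PIO4 */
#define EX_MASK_MWDMA    0x000000e0u  /* MWDMA0..MWDMA2 */
#define EX_MASK_UDMA     0x0000ff00u  /* UDMA0..UDMA7 */

/* Hypothetical hardware limit: nothing faster than UDMA2. */
#define EX_HW_UDMA_LIMIT 0x00000700u  /* UDMA0..UDMA2 */

/*
 * Filter the mask of candidate modes.  A bitwise AND can only remove
 * bits, so the result can never contain a mode the caller did not offer.
 */
static unsigned int ex_mode_filter(unsigned int mask)
{
    unsigned int allowed = EX_MASK_PIO | EX_MASK_MWDMA | EX_HW_UDMA_LIMIT;

    return mask & allowed;
}
```

        Because the filter only clears bits, it cannot violate the rule
        that this interface must not add modes.
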
        Taskfile read/write

void (*tf_load) (struct ata_port *ap, struct ata_taskfile *tf);
void (*tf_read) (struct ata_port *ap, struct ata_taskfile *tf);

        ->tf_load() is called to load the given taskfile into hardware
        registers / DMA buffers.  ->tf_read() is called to read the
        hardware registers / DMA buffers, to obtain the current set of
        taskfile register values.
        Most drivers for taskfile-based hardware (PIO or MMIO) use
        ata_tf_load() and ata_tf_read() for these hooks.

        PIO data read/write

void (*data_xfer) (struct ata_device *, unsigned char *, unsigned int, int);

        All bmdma-style drivers must implement this hook.  This is the
        low-level operation that actually copies the data bytes during a
        PIO data transfer.
        Typically the driver will choose one of ata_pio_data_xfer_noirq(),
        ata_pio_data_xfer(), or ata_mmio_data_xfer().

        ATA command execute

void (*exec_command)(struct ata_port *ap, struct ata_taskfile *tf);

        Causes an ATA command, previously loaded with
        ->tf_load(), to be initiated in hardware.
        Most drivers for taskfile-based hardware use ata_exec_command()
        for this hook.

        Per-cmd ATAPI DMA capabilities filter

int (*check_atapi_dma) (struct ata_queued_cmd *qc);

        Allows the low-level driver to filter ATA PACKET commands,
        returning a status indicating whether or not it is OK to use DMA
        for the supplied PACKET command.

        This hook may be specified as NULL, in which case libata will
        assume that ATAPI DMA can be supported.

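
        A per-command ATAPI DMA filter of the kind described above might
        look like the following user-space sketch.  Everything here is an
        assumption made for illustration: struct ex_qc stands in for
        struct ata_queued_cmd, the alignment rule is an invented hardware
        restriction, and the return convention (0 means DMA is OK,
        nonzero means fall back to PIO) is assumed rather than taken from
        this guide.

```c
#include <assert.h>

/* Hypothetical stand-in for the queued command; only the length matters here. */
struct ex_qc {
    unsigned int nbytes;  /* data transfer length of the PACKET command */
};

/*
 * Assumed convention: return 0 when DMA may be used for this command,
 * nonzero when the command should be issued using PIO instead.
 */
static int ex_check_atapi_dma(const struct ex_qc *qc)
{
    if (qc->nbytes == 0)
        return 1;  /* nothing to transfer, DMA is pointless */
    if (qc->nbytes & 3u)
        return 1;  /* invented rule: engine needs 4-byte multiples */
    return 0;
}
```
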
        Read specific ATA shadow registers

u8   (*check_status)(struct ata_port *ap);
u8   (*check_altstatus)(struct ata_port *ap);

        Reads the Status/AltStatus ATA shadow register from
        hardware.  On some hardware, reading the Status register has
        the side effect of clearing the interrupt condition.
        Most drivers for taskfile-based hardware use
        ata_check_status() for this hook.

        Note that because this is called from ata_device_add(), at
        least a dummy function that clears device interrupts must be
        provided for all drivers, even if the controller doesn't
        actually have a taskfile status register.

        Select ATA device on bus

void (*dev_select)(struct ata_port *ap, unsigned int device);

        Issues the low-level hardware command(s) that causes one of N
        hardware devices to be considered 'selected' (active and
        available for use) on the ATA bus.  This generally has no
        meaning on FIS-based devices.

        Most drivers for taskfile-based hardware use
        ata_std_dev_select() for this hook.  Controllers which do not
        support second drives on a port (such as SATA controllers) will
        use ata_noop_dev_select().

        Private tuning method

void (*set_mode) (struct ata_port *ap);

        By default libata performs drive and controller tuning in
        accordance with the ATA timing rules and also applies blacklists
        and cable limits. Some controllers need special handling and have
        custom tuning rules, typically RAID controllers that use ATA
        commands but do not actually do drive timing.

        This hook should not be used to replace the standard controller
        tuning logic when a controller has quirks. Replacing the default
        tuning logic in that case would bypass handling for drive and
        bridge quirks that may be important to data reliability. If a
        controller needs to filter the mode selection it should use the
        mode_filter hook instead.

        Control PCI IDE BMDMA engine

void (*bmdma_setup) (struct ata_queued_cmd *qc);
void (*bmdma_start) (struct ata_queued_cmd *qc);
void (*bmdma_stop) (struct ata_port *ap);
u8   (*bmdma_status) (struct ata_port *ap);

        When setting up an IDE BMDMA transaction, these hooks arm
        (->bmdma_setup), fire (->bmdma_start), and halt (->bmdma_stop)
        the hardware's DMA engine.  ->bmdma_status is used to read the
        standard PCI IDE DMA Status register.

        These hooks are typically either no-ops, or simply not
        implemented, in FIS-based drivers.

        Most legacy IDE drivers use ata_bmdma_setup() for the bmdma_setup()
        hook.  ata_bmdma_setup() will write the pointer to the PRD table to
        the IDE PRD Table Address register, enable DMA in the DMA Command
        register, and call exec_command() to begin the transfer.

        Most legacy IDE drivers use ata_bmdma_start() for the bmdma_start()
        hook.  ata_bmdma_start() will write the ATA_DMA_START flag to the
        DMA Command register.

        Many legacy IDE drivers use ata_bmdma_stop() for the bmdma_stop()
        hook.  ata_bmdma_stop() clears the ATA_DMA_START flag in the DMA
        Command register.

        Many legacy IDE drivers use ata_bmdma_status() as the
        bmdma_status() hook.

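
        The arm / fire / halt sequence described above can be modelled in
        user space with a mock register block.  This is only a sketch:
        struct ex_bmdma, the EX_* bit names and the ex_* functions are
        invented for the example, the command_issued flag merely stands in
        for the call to exec_command(), and the handling of a direction
        bit during setup is an assumption.

```c
#include <assert.h>
#include <stdint.h>

#define EX_DMA_START 0x01u  /* start/stop bit of the mock DMA Command register */
#define EX_DMA_WRITE 0x08u  /* direction bit of the mock DMA Command register */

/* Mock of the bus-master register block plus a trace flag. */
struct ex_bmdma {
    uint8_t  cmd;             /* DMA Command register */
    uint32_t prd_addr;        /* PRD Table Address register */
    int      command_issued;  /* stands in for exec_command() */
};

/* Arm: program the PRD table address and direction, leave the engine stopped. */
static void ex_bmdma_setup(struct ex_bmdma *bm, uint32_t prd, int to_memory)
{
    bm->prd_addr = prd;
    bm->cmd &= (uint8_t)~(EX_DMA_START | EX_DMA_WRITE);
    if (to_memory)
        bm->cmd |= EX_DMA_WRITE;
    bm->command_issued = 1;
}

/* Fire: set the start bit. */
static void ex_bmdma_start(struct ex_bmdma *bm)
{
    bm->cmd |= EX_DMA_START;
}

/* Halt: clear the start bit. */
static void ex_bmdma_stop(struct ex_bmdma *bm)
{
    bm->cmd &= (uint8_t)~EX_DMA_START;
}
```
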
        High-level taskfile hooks

void (*qc_prep) (struct ata_queued_cmd *qc);
int (*qc_issue) (struct ata_queued_cmd *qc);

        These two higher-level hooks can potentially supersede
        several of the above taskfile/DMA engine hooks.  ->qc_prep is
        called after the buffers have been DMA-mapped, and is typically
        used to populate the hardware's DMA scatter-gather table.
        Most drivers use the standard ata_qc_prep() helper function, but
        more advanced drivers roll their own.

        ->qc_issue is used to make a command active, once the hardware
        and S/G tables have been prepared.  IDE BMDMA drivers use the
        helper function ata_qc_issue_prot() for taskfile protocol-based
        dispatch.  More advanced drivers implement their own ->qc_issue.

        ata_qc_issue_prot() calls ->tf_load(), ->bmdma_setup(), and
        ->bmdma_start() as necessary to initiate a transfer.

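
        Protocol-based dispatch of the kind attributed to
        ata_qc_issue_prot() above can be illustrated with a small
        user-space model that records which hooks would be called.  The
        enum, the trace structure and ex_qc_issue_prot() are hypothetical,
        and the exact hook sequence chosen for the no-data case is an
        assumption, not a transcription of the kernel function.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical subset of command protocols. */
enum ex_prot { EX_PROT_NODATA, EX_PROT_DMA };

/* Records the names of the hooks that would have been invoked. */
struct ex_trace {
    char calls[64];
};

static void ex_note(struct ex_trace *t, const char *hook)
{
    strcat(t->calls, hook);
    strcat(t->calls, ";");
}

/* Dispatch on the protocol; returns 0 on success, -1 if unsupported. */
static int ex_qc_issue_prot(struct ex_trace *t, enum ex_prot prot)
{
    switch (prot) {
    case EX_PROT_NODATA:
        ex_note(t, "tf_load");
        ex_note(t, "exec_command");
        return 0;
    case EX_PROT_DMA:
        ex_note(t, "tf_load");
        ex_note(t, "bmdma_setup");  /* arms the engine and issues the command */
        ex_note(t, "bmdma_start");  /* fires the engine */
        return 0;
    }
    return -1;
}
```
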
        Exception and probe handling (EH)

void (*eng_timeout) (struct ata_port *ap);
void (*phy_reset) (struct ata_port *ap);

        Deprecated.  Use ->error_handler() instead.

void (*freeze) (struct ata_port *ap);
void (*thaw) (struct ata_port *ap);

        ata_port_freeze() is called when HSM violations or some other
        condition disrupts normal operation of the port.  A frozen port
        is not allowed to perform any operation until the port is
        thawed, which usually follows a successful reset.

        The optional ->freeze() callback can be used for freezing the port
        hardware-wise (e.g. mask interrupt and stop DMA engine).  If a
        port cannot be frozen hardware-wise, the interrupt handler
        must ack and clear interrupts unconditionally while the port
        is frozen.

        The optional ->thaw() callback is called to perform the opposite
        of ->freeze(): prepare the port for normal operation once again.
        Unmask interrupts, start DMA engine, etc.

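
        The freeze/thaw behaviour described above can be sketched with a
        mock port in user space.  struct ex_port and the ex_* functions
        are hypothetical; the interrupt handler models the rule that
        interrupts are acknowledged and cleared unconditionally while the
        port is frozen.

```c
#include <assert.h>
#include <stdbool.h>

/* Mock port state; field names are invented for this example. */
struct ex_port {
    bool frozen;
    bool irq_masked;
    bool dma_running;
    unsigned int pending_irqs;
};

/* Freeze hardware-wise: mask interrupts and stop the DMA engine. */
static void ex_freeze(struct ex_port *p)
{
    p->irq_masked = true;
    p->dma_running = false;
    p->frozen = true;
}

/* Thaw: clear stale interrupts, unmask, restart the DMA engine. */
static void ex_thaw(struct ex_port *p)
{
    p->pending_irqs = 0;
    p->irq_masked = false;
    p->dma_running = true;
    p->frozen = false;
}

/*
 * Interrupt handler: always ack and clear.  Returns true only when the
 * interrupt is passed on to normal command processing.
 */
static bool ex_irq(struct ex_port *p)
{
    p->pending_irqs = 0;
    return !p->frozen;
}
```
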
        
void (*error_handler) (struct ata_port *ap);

        ->error_handler() is a driver's hook into probe, hotplug, recovery,
        and other exceptional conditions.  The primary responsibility of an
        implementation is to call ata_do_eh() or ata_bmdma_drive_eh() with a
        set of EH hooks as arguments:

        'prereset' hook (may be NULL) is called during an EH reset, before
        any other actions are taken.

        'postreset' hook (may be NULL) is called after the EH reset is
        performed.

        Based on existing conditions, severity of the problem, and hardware
        capabilities, either 'softreset' (may be NULL) or 'hardreset' (may
        be NULL) will be called to perform the low-level EH reset.

void (*post_internal_cmd) (struct ata_queued_cmd *qc);

        Perform any hardware-specific actions necessary to finish processing
        after executing a probe-time or EH-time command via
        ata_exec_internal().

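
        The ordering of the EH reset hooks described above (prereset,
        then either softreset or hardreset, then postreset, with any hook
        allowed to be NULL) can be sketched as follows.  This is a
        user-space model only: ex_do_eh(), its need_hardreset parameter,
        the hook signature and the logging hooks are all hypothetical
        and do not reproduce the real ata_do_eh() interface.

```c
#include <assert.h>
#include <stddef.h>

/* Mock port that logs which hooks ran, in order. */
struct ex_port {
    int log[4];
    int n;
};

typedef int (*ex_reset_fn)(struct ex_port *ap);

static int ex_log(struct ex_port *ap, int code)
{
    ap->log[ap->n++] = code;
    return 0;
}

/* Sample hooks: 1 = prereset, 2 = softreset, 3 = hardreset, 4 = postreset. */
static int ex_pre(struct ex_port *ap)  { return ex_log(ap, 1); }
static int ex_soft(struct ex_port *ap) { return ex_log(ap, 2); }
static int ex_hard(struct ex_port *ap) { return ex_log(ap, 3); }
static int ex_post(struct ex_port *ap) { return ex_log(ap, 4); }

/* Run one EH reset cycle; every hook may be NULL. */
static int ex_do_eh(struct ex_port *ap, ex_reset_fn prereset,
                    ex_reset_fn softreset, ex_reset_fn hardreset,
                    ex_reset_fn postreset, int need_hardreset)
{
    ex_reset_fn reset;
    int rc;

    if (prereset && (rc = prereset(ap)) != 0)
        return rc;

    /* Pick the reset method from severity and what the driver provides. */
    reset = (need_hardreset || !softreset) ? hardreset : softreset;
    if (!reset)
        return -1;

    rc = reset(ap);
    if (rc)
        return rc;

    if (postreset)
        postreset(ap);
    return 0;
}
```
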
        Hardware interrupt handling

irqreturn_t (*irq_handler)(int, void *, struct pt_regs *);
void (*irq_clear) (struct ata_port *);

        ->irq_handler is the interrupt handling routine registered with
        the system, by libata.  ->irq_clear is called during probe just
        before the interrupt handler is registered, to be sure hardware
        is quiet.

        The second argument, dev_instance, should be cast to a pointer
        to struct ata_host_set.

        Most legacy IDE drivers use ata_interrupt() for the
        irq_handler hook, which scans all ports in the host_set,
        determines which queued command was active (if any), and calls
        ata_host_intr(ap,qc).

        Most legacy IDE drivers use ata_bmdma_irq_clear() for the
        irq_clear() hook, which simply clears the interrupt and error
        flags in the DMA status register.

        SATA phy read/write

int (*scr_read) (struct ata_port *ap, unsigned int sc_reg,
                 u32 *val);
int (*scr_write) (struct ata_port *ap, unsigned int sc_reg,
                   u32 val);

        Read and write standard SATA phy registers.  Currently only used
        if ->phy_reset hook called the sata_phy_reset() helper function.
        sc_reg is one of SCR_STATUS, SCR_CONTROL, SCR_ERROR, or SCR_ACTIVE.

        Init and shutdown

int (*port_start) (struct ata_port *ap);
void (*port_stop) (struct ata_port *ap);
void (*host_stop) (struct ata_host_set *host_set);

        ->port_start() is called just after the data structures for each
        port are initialized.  Typically this is used to alloc per-port
        DMA buffers / tables / rings, enable DMA engines, and similar
        tasks.  Some drivers also use this entry point as a chance to
        allocate driver-private memory for ap->private_data.

        Many drivers use ata_port_start() as this hook or call
        it from their own port_start() hooks.  ata_port_start()
        allocates space for a legacy IDE PRD table and returns.

        ->port_stop() is called after ->host_stop().  Its sole function
        is to release DMA/memory resources, now that they are no longer
        actively being used.  Many drivers also free driver-private
        data from port at this time.

        Many drivers use ata_port_stop() as this hook, which frees the
        PRD table.

        ->host_stop() is called after all ->port_stop() calls
        have completed.  The hook must finalize hardware shutdown, release
        DMA and other resources, etc.
        This hook may be specified as NULL, in which case it is not called.

        Error handling

        This chapter describes how errors are handled under libata.
        Readers are advised to read the SCSI EH document
        (Documentation/scsi/scsi_eh.txt) and the ATA exceptions doc first.

        Origins of commands

        In libata, a command is represented with struct ata_queued_cmd
        or qc.  qc's are preallocated during port initialization and
        repeatedly used for command executions.  Currently only one
        qc is allocated per port but the yet-to-be-merged NCQ branch
        allocates one for each tag and maps each qc to NCQ tag 1-to-1.

        libata commands can originate from two sources - libata itself
        and the SCSI midlayer.  libata internal commands are used for
        initialization and error handling.  All normal blk requests
        and commands for SCSI emulation are passed as SCSI commands
        through the queuecommand callback of the SCSI host template.

        How commands are issued

        Internal commands

        First, qc is allocated and initialized using
        ata_qc_new_init().  Although ata_qc_new_init() doesn't
        implement any wait or retry mechanism when qc is not
        available, internal commands are currently issued only during
        initialization and error recovery, so no other command is
        active and allocation is guaranteed to succeed.

        Once allocated, the qc's taskfile is initialized for the command
        to be executed.  qc currently has two mechanisms to notify
        completion.  One is via the qc->complete_fn() callback and the
        other is the completion qc->waiting.  The qc->complete_fn()
        callback is the asynchronous path used by normal SCSI translated
        commands and qc->waiting is the synchronous (issuer sleeps in
        process context) path used by internal commands.

        Once initialization is complete, host_set lock is acquired
        and the qc is issued.

        SCSI commands

        All libata drivers use ata_scsi_queuecmd() as the
        hostt->queuecommand callback.  scmds can either be simulated
        or translated.  No qc is involved in processing a simulated
        scmd.  The result is computed right away and the scmd is
        completed.

        For a translated scmd, ata_qc_new_init() is invoked to
        allocate a qc and the scmd is translated into the qc.  The SCSI
        midlayer's completion notification function pointer is stored
        into qc->scsidone.

        The qc->complete_fn() callback is used for completion
        notification.  ATA commands use ata_scsi_qc_complete() while
        ATAPI commands use atapi_qc_complete().  Both functions end up
        calling qc->scsidone to notify the upper layer when the qc is
        finished.  After translation is completed, the qc is issued
        with ata_qc_issue().

        Note that the SCSI midlayer invokes hostt->queuecommand while
        holding host_set lock, so all of the above occurs while holding
        host_set lock.

        How commands are processed

        Depending on which protocol and which controller are used,
        commands are processed differently.  For the purpose of
        discussion, a controller which uses taskfile interface and all
        standard callbacks is assumed.

        Currently 6 ATA command protocols are used.  They can be
        sorted into the following four categories according to how
        they are processed.

           ATA NO DATA or DMA

               ATA_PROT_NODATA and ATA_PROT_DMA fall into this category.
               These types of commands don't require any software
               intervention once issued.  Device will raise interrupt on
               completion.

           ATA PIO

               ATA_PROT_PIO is in this category.  libata currently
               implements PIO with polling.  ATA_NIEN bit is set to turn
               off interrupt and pio_task on ata_wq performs polling and
               IO.

           ATAPI NODATA or DMA

               ATA_PROT_ATAPI_NODATA and ATA_PROT_ATAPI_DMA are in this
               category.  packet_task is used to poll BSY bit after
               issuing PACKET command.  Once BSY is turned off by the
               device, packet_task transfers CDB and hands off processing
               to interrupt handler.

           ATAPI PIO

               ATA_PROT_ATAPI is in this category.  ATA_NIEN bit is set
               and, as in ATAPI NODATA or DMA, packet_task submits cdb.
               However, after submitting cdb, further processing (data
               transfer) is handed off to pio_task.

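
        The mapping of the six protocols onto the four processing
        categories above can be written down as a small lookup.  The
        EX_* enumerators are local stand-ins that mirror the ATA_PROT_*
        names used in the text; the functions are hypothetical helpers
        for illustration and are not part of libata.

```c
#include <assert.h>

/* Local stand-ins for the six protocols named in the text. */
enum ex_prot {
    EX_PROT_NODATA,
    EX_PROT_DMA,
    EX_PROT_PIO,
    EX_PROT_ATAPI_NODATA,
    EX_PROT_ATAPI_DMA,
    EX_PROT_ATAPI
};

/* The four processing categories. */
enum ex_category {
    EX_CAT_ATA_NODATA_OR_DMA,
    EX_CAT_ATA_PIO,
    EX_CAT_ATAPI_NODATA_OR_DMA,
    EX_CAT_ATAPI_PIO
};

static enum ex_category ex_categorize(enum ex_prot prot)
{
    switch (prot) {
    case EX_PROT_NODATA:
    case EX_PROT_DMA:
        return EX_CAT_ATA_NODATA_OR_DMA;
    case EX_PROT_PIO:
        return EX_CAT_ATA_PIO;
    case EX_PROT_ATAPI_NODATA:
    case EX_PROT_ATAPI_DMA:
        return EX_CAT_ATAPI_NODATA_OR_DMA;
    case EX_PROT_ATAPI:
        break;
    }
    return EX_CAT_ATAPI_PIO;
}

/* Per the text, the two PIO categories run with ATA_NIEN set (polling). */
static int ex_sets_nien(enum ex_category cat)
{
    return cat == EX_CAT_ATA_PIO || cat == EX_CAT_ATAPI_PIO;
}
```
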
        How commands are completed

        Once issued, all qc's are either completed with
        ata_qc_complete() or time out.  For commands which are handled
        by interrupts, ata_host_intr() invokes ata_qc_complete(), and,
        for PIO tasks, pio_task invokes ata_qc_complete().  In error
        cases, packet_task may also complete commands.

        ata_qc_complete() does the following.

        1. DMA memory is unmapped.

        2. ATA_QCFLAG_ACTIVE is cleared from qc->flags.

        3. qc->complete_fn() callback is invoked.  If the return value of
           the callback is not zero, completion is short-circuited and
           ata_qc_complete() returns.

        4. __ata_qc_complete() is called, which does the following:

           - qc->flags is cleared to zero.
           - ap->active_tag and qc->tag are poisoned.
           - qc->waiting is cleared & completed (in that order).
           - qc is deallocated by clearing the appropriate bit in
             ap->qactive.

        So, it basically notifies the upper layer and deallocates the qc.
        One exception is the short-circuit path in #3 which is used by
        atapi_qc_complete().

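
        The completion sequence listed above, including the short-circuit
        path, can be modelled in user space roughly as follows.  All types,
        constants and function names here are hypothetical stand-ins
        (including the poison value), and the qc->waiting handling is left
        out to keep the sketch small.

```c
#include <assert.h>
#include <stdbool.h>

#define EX_QCFLAG_ACTIVE 0x1u
#define EX_TAG_POISON    0xfau  /* invented poison value */

struct ex_port {
    unsigned int qactive;     /* one bit per allocated qc */
    unsigned int active_tag;
};

struct ex_qc {
    struct ex_port *ap;
    unsigned int flags;
    unsigned int tag;
    bool dma_mapped;
    int (*complete_fn)(struct ex_qc *qc);
};

/* Final stage: clear flags, poison tags, release the qc's bit. */
static void ex_qc_release(struct ex_qc *qc)
{
    unsigned int tag = qc->tag;

    qc->flags = 0;
    qc->ap->active_tag = EX_TAG_POISON;
    qc->tag = EX_TAG_POISON;
    qc->ap->qactive &= ~(1u << tag);
}

static void ex_qc_complete(struct ex_qc *qc)
{
    qc->dma_mapped = false;            /* 1: unmap DMA memory */
    qc->flags &= ~EX_QCFLAG_ACTIVE;    /* 2: no longer active */
    if (qc->complete_fn(qc) != 0)      /* 3: nonzero short-circuits */
        return;
    ex_qc_release(qc);                 /* 4: deallocate */
}

/* Sample callbacks: a normal completion and a short-circuiting one. */
static int ex_done_normal(struct ex_qc *qc)        { (void)qc; return 0; }
static int ex_done_short_circuit(struct ex_qc *qc) { (void)qc; return 1; }
```

        With the short-circuiting callback the qc keeps its tag and its
        bit in qactive, which corresponds to the "completed but not
        deallocated" state used for failed ATAPI commands.
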
        
        For all non-ATAPI commands, whether a command fails or not, almost
        the same code path is taken and very little error handling
        takes place.  A qc is completed with success status if it
        succeeded, with failed status otherwise.

        However, failed ATAPI commands require more handling as
        REQUEST SENSE is needed to acquire sense data.  If an ATAPI
        command fails, ata_qc_complete() is invoked with error status,
        which in turn invokes atapi_qc_complete() via the
        qc->complete_fn() callback.

        This makes atapi_qc_complete() set scmd->result to
        SAM_STAT_CHECK_CONDITION, complete the scmd and return 1.  As
        the sense data is empty but scmd->result is CHECK CONDITION,
        the SCSI midlayer will invoke EH for the scmd, and returning 1
        makes ata_qc_complete() return without deallocating the qc.
        This leads us to ata_scsi_error() with a partially completed qc.

        ata_scsi_error()

        ata_scsi_error() is the current transportt->eh_strategy_handler()
        for libata.  As discussed above, this will be entered in two
        cases - timeout and ATAPI error completion.  This function
        calls the low level libata driver's eng_timeout() callback, the
        standard callback for which is ata_eng_timeout().  It checks
        if a qc is active and calls ata_qc_timeout() on the qc if so.
        Actual error handling occurs in ata_qc_timeout().

        If EH is invoked for timeout, ata_qc_timeout() stops BMDMA and
        completes the qc.  Note that as we're currently in EH, we
        cannot call scsi_done.  As described in the SCSI EH doc, a
        recovered scmd should be either retried with
        scsi_queue_insert() or finished with scsi_finish_command().
        Here, we override qc->scsidone with scsi_finish_command() and
        call ata_qc_complete().

        If EH is invoked due to a failed ATAPI qc, the qc here is
        completed but not deallocated.  The purpose of this
        half-completion is to use the qc as a placeholder to make EH
        code reach this place.  This is a bit hackish, but it works.

        Once control reaches here, the qc is deallocated by invoking
        __ata_qc_complete() explicitly.  Then, an internal qc for REQUEST
        SENSE is issued.  Once sense data is acquired, the scmd is
        finished by directly invoking scsi_finish_command() on the
        scmd.  Note that as we already have completed and deallocated
        the qc which was associated with the scmd, we don't need
        to/cannot call ata_qc_complete() again.

        Problems with the current EH

        - Error representation is too crude.  Currently any and all
          error conditions are represented with ATA STATUS and ERROR
          registers.  Errors which aren't ATA device errors are treated
          as ATA device errors by setting ATA_ERR bit.  A better error
          descriptor which can properly represent ATA and other
          errors/exceptions is needed.

        - When handling timeouts, no action is taken to make the device
          forget about the timed out command and become ready for new
          commands.

        - EH handling via ata_scsi_error() is not properly protected
          from usual command processing.  On EH entrance, the device is
          not in a quiescent state.  Timed out commands may succeed or
          fail at any time.  pio_task and atapi_task may still be running.

        - Error recovery is too weak.  Devices / controllers causing HSM
          mismatch errors and other errors quite often require a reset to
          return to a known state.  Also, advanced error handling is
          necessary to support features like NCQ and hotplug.

        - ATA errors are directly handled in the interrupt handler and
          PIO errors in pio_task.  This is problematic for advanced
          error handling for the following reasons.

          First, advanced error handling often requires context and
          internal qc execution.

          Second, even a simple failure (say, CRC error) needs
          information gathering and could trigger complex error handling
          (say, resetting & reconfiguring).  Having multiple code
          paths to gather information, enter EH and trigger actions
          makes life painful.

          Third, scattered EH code makes implementing low level drivers
          difficult.  Low level drivers override libata callbacks.  If
          EH is scattered over several places, each affected callback
          should perform its part of error handling.  This can be error
          prone and painful.

     libata Library
!Edrivers/ata/libata-core.c

     libata Core Internals
!Idrivers/ata/libata-core.c

     libata SCSI translation/emulation
!Edrivers/ata/libata-scsi.c
!Idrivers/ata/libata-scsi.c

     ATA errors and exceptions
888
 
889
  
890
  This chapter tries to identify what error/exception conditions exist
891
  for ATA/ATAPI devices and describe how they should be handled in
892
  implementation-neutral way.
893
  
894
 
895
  
896
  The term 'error' is used to describe conditions where either an
897
  explicit error condition is reported from device or a command has
898
  timed out.
899
  
900
 
901
  
902
  The term 'exception' is either used to describe exceptional
903
  conditions which are not errors (say, power or hotplug events), or
904
  to describe both errors and non-error exceptional conditions.  Where
905
  explicit distinction between error and exception is necessary, the
906
  term 'non-error exception' is used.
907
  
908
 
909
  
     Exception categories

     Exceptions are described primarily with respect to the legacy
     taskfile + bus master IDE interface.  If a controller provides
     a better mechanism for error reporting, mapping it onto the
     categories described below shouldn't be difficult.

     In the following sections, two recovery actions - reset and
     reconfiguring transport - are mentioned.  These are described
     further in the "EH recovery actions" section.
     
        HSM violation

        This error is indicated when the STATUS value doesn't match the
        HSM requirement while issuing or executing any ATA/ATAPI command.

        Examples

        ATA_STATUS doesn't contain !BSY && DRDY && !DRQ while trying
        to issue a command.

        !BSY && !DRQ during PIO data transfer.

        DRQ on command completion.

        !BSY && ERR after CDB transfer starts but before the
        last byte of CDB is transferred.  The ATA/ATAPI standard states
        that "The device shall not terminate the PACKET command
        with an error before the last byte of the command packet has
        been written" in the error outputs description of the PACKET
        command, and the state diagram doesn't include such
        transitions.

        In these cases, HSM is violated and not much information
        regarding the error can be acquired from the STATUS or ERROR
        register.  In other words, this error can be anything: a driver
        bug, or a faulty device, controller and/or cable.

        As HSM is violated, reset is necessary to restore a known state.
        Reconfiguring the transport for a lower speed might help too,
        as transmission errors sometimes cause this kind of error.
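
        The first three STATUS checks from the examples above can be
        sketched in C as follows.  This is only an illustration: the
        function and constant names are hypothetical and are not part
        of libata, and the bit values are the STATUS register bits as
        defined by the ATA/ATAPI standard.

```c
#include <stdbool.h>
#include <stdint.h>

/* STATUS register bits (values per the ATA/ATAPI standard;
 * the names are local to this sketch). */
enum {
    ST_ERR  = 0x01,
    ST_DRQ  = 0x08,
    ST_DRDY = 0x40,
    ST_BSY  = 0x80,
};

/* A command may be issued only when !BSY && DRDY && !DRQ. */
bool hsm_ok_to_issue(uint8_t status)
{
    return (status & (ST_BSY | ST_DRDY | ST_DRQ)) == ST_DRDY;
}

/* !BSY && !DRQ while the host still expects to move PIO data.
 * This follows the text literally and does not look at ERR. */
bool hsm_violation_in_pio(uint8_t status)
{
    return (status & (ST_BSY | ST_DRQ)) == 0;
}

/* DRQ still set although the command claims to be complete. */
bool hsm_violation_on_completion(uint8_t status)
{
    return !(status & ST_BSY) && (status & ST_DRQ);
}
```

        A caller would typically read STATUS once, run the check that
        matches the current protocol step, and escalate to reset when
        a violation is reported.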
 
        ATA/ATAPI device error (non-NCQ / non-CHECK CONDITION)

        These are errors detected and reported by ATA/ATAPI devices
        indicating device problems.  For this type of error, the STATUS
        and ERROR register values are valid and describe the error
        condition.  Note that some ATA bus errors are detected by
        ATA/ATAPI devices and reported using the same mechanism as
        device errors.  Those cases are described later in this
        section.

        For ATA commands, this type of error is indicated by !BSY
        && ERR during command execution and on completion.

        For ATAPI commands,

        !BSY && ERR && ABRT right after issuing PACKET
        indicates that the PACKET command is not supported and falls in
        this category.

        !BSY && ERR(==CHK) && !ABRT after the last
        byte of CDB is transferred indicates CHECK CONDITION and
        doesn't fall in this category.

        !BSY && ERR(==CHK) && ABRT after the last byte
        of CDB is transferred *probably* indicates CHECK CONDITION and
        doesn't fall in this category.
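
        The three ATAPI rules above can be written as one small
        classifier.  This is a minimal sketch, not libata code: the
        names are hypothetical, and the cdb_sent flag (true once the
        last CDB byte has been transferred) is assumed to be tracked
        by the caller.

```c
#include <stdbool.h>
#include <stdint.h>

enum { ST_ERR = 0x01, ST_BSY = 0x80 };  /* STATUS bits */
enum { ER_ABRT = 0x04 };                /* ERROR bit   */

enum atapi_result {
    ATAPI_NO_ERROR,         /* BSY set or ERR clear: nothing to classify */
    ATAPI_DEVICE_ERROR,     /* PACKET command not supported */
    ATAPI_CHECK_CONDITION,  /* sense data should be fetched */
    ATAPI_UNCLASSIFIED,     /* not covered by the three rules above */
};

enum atapi_result atapi_classify(uint8_t status, uint8_t error, bool cdb_sent)
{
    if ((status & ST_BSY) || !(status & ST_ERR))
        return ATAPI_NO_ERROR;

    /* After the last CDB byte, ERR is CHK, with or without ABRT. */
    if (cdb_sent)
        return ATAPI_CHECK_CONDITION;

    /* Right after issuing PACKET, ABRT means "not supported". */
    if (error & ER_ABRT)
        return ATAPI_DEVICE_ERROR;

    return ATAPI_UNCLASSIFIED;
}
```

        The "*probably*" case (ERR and ABRT after the CDB) is folded
        into CHECK CONDITION here, as the text suggests; a real driver
        may want to treat it more cautiously.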
        
        Of the errors detected as above, the following are not ATA/ATAPI
        device errors but ATA bus errors and should be handled
        according to the "ATA bus error" section.

           CRC error during data transfer

           This is indicated by the ICRC bit in the ERROR register and
           means that corruption occurred during data transfer.  Up to
           ATA/ATAPI-7, the standard specifies that this bit is only
           applicable to UDMA transfers, but ATA/ATAPI-8 draft revision
           1f says that the bit may be applicable to multiword DMA and
           PIO.

           ABRT error during data transfer or on completion

           Up to ATA/ATAPI-7, the standard specifies that ABRT could be
           set on ICRC errors and in cases where a device is not able
           to complete a command.  Combined with the fact that MWDMA
           and PIO transfer errors aren't allowed to use the ICRC bit up
           to ATA/ATAPI-7, it seems to imply that the ABRT bit alone
           could indicate transfer errors.

           However, ATA/ATAPI-8 draft revision 1f removes the part
           that ICRC errors can turn on ABRT.  So, this is kind of a
           gray area.  Some heuristics are needed here.

        ATA/ATAPI device errors can be further categorized as follows.

           Media errors

           This is indicated by the UNC bit in the ERROR register.  ATA
           devices report a UNC error only after a certain number of
           retries have failed to recover the data, so there's nothing
           much else to do other than notify the upper layer.

           READ and WRITE commands report the CHS or LBA of the first
           failed sector, but the ATA/ATAPI standard specifies that the
           amount of transferred data on error completion is
           indeterminate, so we cannot assume that sectors preceding
           the failed sector have been transferred and thus cannot
           complete those sectors successfully as SCSI does.

           Media changed / media change requested error

           <<TODO: fill here>>

           Address error

           This is indicated by the IDNF bit in the ERROR register.
           Report to the upper layer.

           Other errors

           This can be an invalid command or parameter indicated by the
           ABRT ERROR bit or some other error condition.  Note that the
           ABRT bit can indicate a lot of things including ICRC and
           Address errors.  Heuristics needed.
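
        As an illustration of the categories above, the following
        sketch maps an ERROR register value to one category.  The
        names are hypothetical, and the priority order (ICRC, then
        UNC, then IDNF, then ABRT) is an assumption of this sketch,
        not something the standard or libata mandates.  The bit
        values are those of the ATA/ATAPI ERROR register.

```c
#include <stdint.h>

/* ERROR register bits (names local to this sketch). */
enum {
    ER_ABRT = 0x04,
    ER_IDNF = 0x10,
    ER_UNC  = 0x40,
    ER_ICRC = 0x80,
};

enum err_kind {
    ERR_BUS_CRC,          /* handle as an ATA bus error */
    ERR_MEDIA,            /* report to the upper layer */
    ERR_ADDRESS,          /* report to the upper layer */
    ERR_ABORT_AMBIGUOUS,  /* ABRT alone: heuristics needed */
    ERR_OTHER,
};

enum err_kind classify_error_reg(uint8_t error)
{
    if (error & ER_ICRC)
        return ERR_BUS_CRC;
    if (error & ER_UNC)
        return ERR_MEDIA;
    if (error & ER_IDNF)
        return ERR_ADDRESS;
    if (error & ER_ABRT)
        return ERR_ABORT_AMBIGUOUS;
    return ERR_OTHER;
}
```

        Testing the more specific bits before ABRT keeps the ambiguous
        case as small as possible, which is where the heuristics
        mentioned above would have to step in.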
        
        Depending on the command, not all STATUS/ERROR bits are
        applicable.  These non-applicable bits are marked with
        "na" in the output descriptions, but up to ATA/ATAPI-7
        no definition of "na" can be found.  However,
        ATA/ATAPI-8 draft revision 1f describes "N/A" as
        follows.

           3.2.3.3a N/A

           A keyword that indicates a field has no defined value in
           this standard and should not be checked by the host or
           device. N/A fields should be cleared to zero.

        So, it seems reasonable to assume that "na" bits are
        cleared to zero by devices and thus need no explicit masking.
     
        ATAPI device CHECK CONDITION

        An ATAPI device CHECK CONDITION error is indicated by a set CHK
        bit (ERR bit) in the STATUS register after the last byte of CDB
        is transferred for a PACKET command.  For this kind of error,
        sense data should be acquired to gather information regarding
        the error.  The REQUEST SENSE packet command should be used to
        acquire sense data.

        Once sense data is acquired, this type of error can be
        handled similarly to other SCSI errors.  Note that sense data
        may indicate an ATA bus error (e.g. Sense Key 04h HARDWARE ERROR
        && ASC/ASCQ 47h/00h SCSI PARITY ERROR).  In such
        cases, the error should be considered an ATA bus error and
        handled according to the "ATA bus error" section.
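
        A check for the bus-error example given above (Sense Key 04h
        with ASC/ASCQ 47h/00h) might look like the sketch below.  It
        assumes fixed-format SCSI sense data, where the sense key is
        the low nibble of byte 2 and ASC/ASCQ are bytes 12 and 13;
        descriptor-format sense data is not handled.  The function
        name is hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Returns true when fixed-format sense data reports
 * HARDWARE ERROR (04h) with ASC/ASCQ 47h/00h. */
bool sense_is_bus_error(const uint8_t *sense, size_t len)
{
    if (len < 14)
        return false;           /* too short to hold ASC/ASCQ */

    uint8_t resp = sense[0] & 0x7f;
    if (resp != 0x70 && resp != 0x71)
        return false;           /* not fixed-format sense data */

    return (sense[2] & 0x0f) == 0x04 &&
           sense[12] == 0x47 &&
           sense[13] == 0x00;
}
```

        When this returns true, the error would be routed to the ATA
        bus error handling instead of being reported as an ordinary
        SCSI error.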
     
        ATA device error (NCQ)

        An NCQ command error is indicated by a cleared BSY and a set ERR
        bit during the NCQ command phase (one or more NCQ commands
        outstanding).  Although the STATUS and ERROR registers will
        contain valid values describing the error, READ LOG EXT is
        required to clear the error condition, determine which command
        has failed and acquire more information.

        READ LOG EXT Log Page 10h reports which tag has failed and the
        taskfile register values describing the error.  With this
        information the failed command can be handled as a normal ATA
        command error as in the "ATA/ATAPI device error (non-NCQ /
        non-CHECK CONDITION)" section, and all other in-flight commands
        must be retried.  Note that this retry should not be counted -
        it's likely that commands retried this way would have completed
        normally if it were not for the failed command.

        Note that ATA bus errors can be reported as ATA device NCQ
        errors.  This should be handled as described in the "ATA bus
        error" section.

        If READ LOG EXT Log Page 10h fails or reports NQ, we're
        thoroughly screwed.  This condition should be treated
        according to the "HSM violation" section.
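
        Decoding the log page could be sketched as below.  The layout
        used here is an assumption stated for the sketch and should
        be checked against the SATA specification: byte 0 carries NQ
        in bit 7 and the tag in bits 4:0, bytes 2 and 3 carry STATUS
        and ERROR, and the 512 bytes sum to zero modulo 256.  All
        names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

struct ncq_error {
    uint8_t tag;     /* tag of the failed command */
    uint8_t status;  /* STATUS value for the failed command */
    uint8_t error;   /* ERROR value for the failed command */
};

enum log10h_result {
    LOG10H_OK = 0,
    LOG10H_BAD_CHECKSUM,
    LOG10H_NOT_QUEUED,   /* NQ set: failure was not an NCQ command */
};

enum log10h_result parse_log_10h(const uint8_t page[512],
                                 struct ncq_error *out)
{
    uint8_t sum = 0;

    for (size_t i = 0; i < 512; i++)
        sum = (uint8_t)(sum + page[i]);
    if (sum != 0)
        return LOG10H_BAD_CHECKSUM;

    if (page[0] & 0x80)
        return LOG10H_NOT_QUEUED;

    out->tag = page[0] & 0x1f;
    out->status = page[2];
    out->error = page[3];
    return LOG10H_OK;
}
```

        Rejecting a page with a bad checksum is a choice made for this
        sketch; either failure result corresponds to the "fails or
        reports NQ" case described above.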
     
        ATA bus error

        ATA bus error means that data corruption occurred during
        transmission over the ATA bus (SATA or PATA).  This type of
        error can be indicated by

        ICRC or ABRT error as described in the "ATA/ATAPI device error
        (non-NCQ / non-CHECK CONDITION)" section.

        Controller-specific error completion with error information
        indicating transmission error.

        On some controllers, command timeout.  In this case, there may
        be a mechanism to determine that the timeout is due to
        transmission error.

        Unknown/random errors, timeouts and all sorts of weirdities.

        As described above, transmission errors can cause a wide variety
        of symptoms ranging from device ICRC error to random device
        lockup, and, in many cases, there is no way to tell if an
        error condition is due to transmission error or not;
        therefore, it's necessary to employ some kind of heuristic
        when dealing with errors and timeouts.  For example,
        encountering repetitive ABRT errors for a command known to be
        supported is likely to indicate an ATA bus error.

        Once it's determined that ATA bus errors have possibly
        occurred, lowering the ATA bus transmission speed is one of the
        actions which may alleviate the problem.  See the "Reconfigure
        transport" section for more information.
     
        PCI bus error

        Data corruption or other failures during transmission over PCI
        (or other system bus).  For standard BMDMA, this is indicated
        by the Error bit in the BMDMA Status register.  This type of
        error must be logged as it indicates something is very wrong
        with the system.  Resetting the host controller is recommended.
     
        Late completion

        This occurs when a timeout occurs and the timeout handler finds
        out that the timed out command has completed successfully or
        with error.  This is usually caused by lost interrupts.  This
        type of error must be logged.  Resetting the host controller is
        recommended.
     
        Unknown error (timeout)

        This is when a timeout occurs and the command is still
        processing or the host and device are in an unknown state.  When
        this occurs, HSM could be in any valid or invalid state.  To
        bring the device to a known state and make it forget about the
        timed out command, resetting is necessary.  The timed out
        command may be retried.

        Timeouts can also be caused by transmission errors.  Refer to
        the "ATA bus error" section for more details.
     
        Hotplug and power management exceptions

        <<TODO: fill here>>
  
     EH recovery actions

     This section discusses several important recovery actions.

        Clearing error condition

        Many controllers require their error registers to be cleared by
        the error handler.  Different controllers may have different
        requirements.

        For SATA, it's strongly recommended to clear at least the SError
        register during error handling.
     
        Reset

        During EH, resetting is necessary in the following cases.

        HSM is in unknown or invalid state

        HBA is in unknown or invalid state

        EH needs to make HBA/device forget about in-flight commands

        HBA/device behaves weirdly

        Resetting during EH might be a good idea regardless of error
        condition to improve EH robustness.  Whether to reset both or
        either one of HBA and device depends on the situation, but the
        following scheme is recommended.

        When it's known that HBA is in ready state but ATA/ATAPI
        device is in unknown state, reset only the device.

        If HBA is in unknown state, reset both HBA and device.
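
        The recommended scheme reduces to a small decision function.
        The sketch below uses hypothetical names; in particular,
        mapping "forget about in-flight commands" to a device-only
        reset is an assumption of this sketch, since the text leaves
        the scope open for that case.

```c
#include <stdbool.h>

struct eh_state {
    bool hba_known_good;       /* HBA is known to be in ready state */
    bool dev_known_good;       /* device/HSM is in a known state */
    bool must_abort_inflight;  /* EH wants in-flight commands dropped */
};

enum reset_scope {
    RESET_NONE,
    RESET_DEVICE,
    RESET_HBA_AND_DEVICE,
};

enum reset_scope choose_reset(const struct eh_state *s)
{
    /* HBA in unknown state: reset both HBA and device. */
    if (!s->hba_known_good)
        return RESET_HBA_AND_DEVICE;

    /* HBA ready but device unknown (or commands must be dropped). */
    if (!s->dev_known_good || s->must_abort_inflight)
        return RESET_DEVICE;

    return RESET_NONE;
}
```

        Since resetting regardless of the error condition may improve
        robustness, a driver could also choose to treat RESET_NONE as
        RESET_DEVICE.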
        
        HBA resetting is implementation specific.  For a controller
        complying with taskfile/BMDMA PCI IDE, stopping the active DMA
        transaction may be sufficient iff BMDMA state is the only HBA
        context.  But even mostly taskfile/BMDMA PCI IDE complying
        controllers may have implementation specific requirements and
        mechanisms to reset themselves.  This must be addressed by
        specific drivers.

        On the other hand, the ATA/ATAPI standard describes in detail
        ways to reset ATA/ATAPI devices.

           PATA hardware reset

           This is a hardware-initiated device reset signalled with an
           asserted PATA RESET- signal.  There is no standard way to
           initiate hardware reset from software, although some
           hardware provides registers that allow the driver to directly
           tweak the RESET- signal.

           Software reset

           This is achieved by turning the CONTROL SRST bit on for at
           least 5us.  Both PATA and SATA support it but, in the case of
           SATA, this may require controller-specific support as the
           second Register FIS to clear SRST should be transmitted
           while the BSY bit is still set.  Note that on PATA, this
           resets both master and slave devices on a channel.

           EXECUTE DEVICE DIAGNOSTIC command

           Although the ATA/ATAPI standard doesn't describe it exactly,
           EDD implies some level of resetting, possibly similar to
           software reset.  The host-side EDD protocol can be handled
           with normal command processing and most SATA controllers
           should be able to handle EDDs just like other commands.
           As with software reset, EDD affects both devices on a PATA
           bus.

           Although EDD does reset devices, this doesn't suit error
           handling as EDD cannot be issued while BSY is set and it's
           unclear how it will act when the device is in an
           unknown/weird state.

           ATAPI DEVICE RESET command

           This is very similar to software reset except that the reset
           can be restricted to the selected device without affecting
           the other device sharing the cable.

           SATA phy reset

           This is the preferred way of resetting a SATA device.  In
           effect, it's identical to PATA hardware reset.  Note that
           this can be done with the standard SCR Control register.
           As such, it's usually easier to implement than software
           reset.
        
        One more thing to consider when resetting devices is that
        resetting clears certain configuration parameters and they
        need to be set to their previous or newly adjusted values
        after reset.

        The parameters affected are:

        CHS set up with INITIALIZE DEVICE PARAMETERS (seldom used)

        Parameters set with SET FEATURES including transfer mode setting

        Block count set with SET MULTIPLE MODE

        Other parameters (SET MAX, MEDIA LOCK...)

        The ATA/ATAPI standard specifies that some parameters must be
        maintained across hardware or software reset, but doesn't
        strictly specify all of them.  Always reconfiguring the needed
        parameters after reset is required for robustness.  Note that
        this also applies when resuming from deep sleep (power-off).

        Also, the ATA/ATAPI standard requires that IDENTIFY DEVICE /
        IDENTIFY PACKET DEVICE be issued after any configuration
        parameter is updated or after a hardware reset, and that the
        result be used for further operation.  The OS driver is required
        to implement a revalidation mechanism to support this.
     
        Reconfigure transport

        For both PATA and SATA, a lot of corners are cut for cheap
        connectors, cables or controllers and it's quite common to see
        a high transmission error rate.  This can be mitigated by
        lowering the transmission speed.

        The following is a possible scheme Jeff Garzik suggested.

        If more than $N (3?) transmission errors happen in 15 minutes,

           if SATA, decrease SATA PHY speed.  if speed cannot be decreased,

              decrease UDMA xfer speed.  if at UDMA0, switch to PIO4,

                 decrease PIO xfer speed.  if at PIO3, complain, but continue
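
        One way to read the scheme above is as an error counter over
        a 15 minute window plus a speed-down ladder.  The sketch
        below is only an interpretation: every name is hypothetical,
        $N is fixed at 3 as the text tentatively suggests, and time
        is passed in by the caller in seconds.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ERR_THRESHOLD   3                    /* the "$N (3?)" above */
#define ERR_SLOTS       (ERR_THRESHOLD + 1)
#define ERR_WINDOW_SECS (15u * 60u)

/* Ring buffer holding the times of the last N+1 errors. */
struct err_window {
    uint32_t stamp[ERR_SLOTS];
    size_t head;
    size_t count;
};

/* Record an error at `now`; true when more than ERR_THRESHOLD
 * errors fall within the window ending at `now`. */
bool err_window_hit(struct err_window *w, uint32_t now)
{
    w->stamp[w->head] = now;
    w->head = (w->head + 1) % ERR_SLOTS;
    if (w->count <= ERR_THRESHOLD)
        w->count++;
    if (w->count <= ERR_THRESHOLD)
        return false;
    /* head now indexes the oldest of the last N+1 errors */
    return now - w->stamp[w->head] <= ERR_WINDOW_SECS;
}

struct xfer_state {
    bool is_sata;
    int sata_gen;  /* 1 is the slowest PHY speed */
    int udma;      /* UDMA mode, or -1 when UDMA is not in use */
    int pio;       /* PIO mode, meaningful when udma < 0 */
};

/* Take one step down the ladder; false when nothing is left to
 * lower (at PIO3: complain, but continue). */
bool speed_down(struct xfer_state *x)
{
    if (x->is_sata && x->sata_gen > 1) {
        x->sata_gen--;
        return true;
    }
    if (x->udma > 0) {
        x->udma--;
        return true;
    }
    if (x->udma == 0) {
        x->udma = -1;
        x->pio = 4;
        return true;
    }
    if (x->pio > 3) {
        x->pio--;
        return true;
    }
    return false;
}
```

        A caller would invoke err_window_hit() for each transmission
        error and call speed_down() whenever it returns true, logging
        a complaint once speed_down() reports that nothing is left to
        lower.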
  
     ata_piix Internals
!Idrivers/ata/ata_piix.c

     sata_sil Internals
!Idrivers/ata/sata_sil.c
  
     Thanks

  The bulk of the ATA knowledge comes thanks to long conversations with
  Andre Hedrick (www.linux-ide.org), and long hours pondering the ATA
  and SCSI specifications.

  Thanks to Alan Cox for pointing out similarities
  between SATA and SCSI, and in general for motivation to hack on
  libata.

  libata's device detection
  method, ata_pio_devchk, and in general all the early probing was
  based on extensive study of Hale Landis's probe/reset code in his
  ATADRVR driver (www.ata-atapi.com).
