The Linux Watchdog driver API.

Copyright 2002 Christer Weingel

Some parts of this document are copied verbatim from the sbc60xxwdt
driver which is (c) Copyright 2000 Jakob Oestergaard

This document describes the state of the Linux 2.4.18 kernel.

Introduction:

A Watchdog Timer (WDT) is a hardware circuit that can reset the
computer system in case of a software fault.  You probably knew that
already.

Usually a userspace daemon will notify the kernel watchdog driver via the
/dev/watchdog special device file that userspace is still alive, at
regular intervals.  When such a notification occurs, the driver will
usually tell the hardware watchdog that everything is in order, and
that the watchdog should wait for yet another little while to reset
the system.  If userspace fails (RAM error, kernel bug, whatever), the
notifications cease to occur, and the hardware watchdog will reset the
system (causing a reboot) after the timeout occurs.

The Linux watchdog API is a rather ad hoc construction and different
drivers implement different, and sometimes incompatible, parts of it.
This file is an attempt to document the existing usage and allow
future driver writers to use it as a reference.

The simplest API:

All drivers support the basic mode of operation, where the watchdog
activates as soon as /dev/watchdog is opened and will reboot unless
the watchdog is pinged within a certain time; this time is called the
timeout or margin.  The simplest way to ping the watchdog is to write
some data to the device.  So a very simple watchdog daemon would look
like this:

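#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(int argc, const char *argv[]) {
        /* Opening the device starts the watchdog. */
        int fd=open("/dev/watchdog",O_WRONLY);
        if (fd==-1) {
                perror("watchdog");
                exit(1);
        }
        while(1) {
                /* Any write counts as a ping. */
                write(fd, "\0", 1);
                sleep(10);
        }
}
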
A more advanced daemon could, for example, check that an HTTP server is
still responding before doing the write call to ping the watchdog.

When the device is closed, the watchdog is disabled.  This is not
always such a good idea, since if there is a bug in the watchdog
daemon and it crashes, the system will not reboot.  Because of this,
some of the drivers support the configuration option "Disable watchdog
shutdown on close", CONFIG_WATCHDOG_NOWAYOUT.  If it is set to Y when
compiling the kernel, there is no way of disabling the watchdog once
it has been started.  So, if the watchdog daemon crashes, the system
will reboot after the timeout has passed.

Some other drivers will not disable the watchdog unless a specific
magic character 'V' has been sent to /dev/watchdog just before closing
the file.  If the userspace daemon closes the file without sending
this special character, the driver will assume that the daemon (and
userspace in general) died, and will stop pinging the watchdog without
disabling it first.  This will then cause a reboot.
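
As a rough sketch of the magic character handling described above, a
daemon that wants to stop the watchdog on purpose could write 'V' as
the very last thing before closing the file.  The helper name
wd_close_cleanly() is made up for this example, and whether the 'V' is
honoured at all depends on the driver and on CONFIG_WATCHDOG_NOWAYOUT.

```c
#include <unistd.h>

/* Hypothetical helper (not part of the API): ask the driver to
 * disable the watchdog, then close the device.
 * Returns 0 on success, -1 if the write or the close failed. */
int wd_close_cleanly(int fd)
{
        /* The magic character must be the last thing written. */
        if (write(fd, "V", 1) != 1) {
                close(fd);
                return -1;
        }
        return close(fd);
}
```

A daemon that exits through any other path simply never calls this
helper, so on drivers with magic close handling a crash still leads to
a reboot.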

The ioctl API:

All conforming drivers also support an ioctl API.

Pinging the watchdog using an ioctl:

All drivers that have an ioctl interface support at least one ioctl,
KEEPALIVE.  This ioctl does exactly the same thing as a write to the
watchdog device, so the main loop in the above program could be
replaced with:

        while (1) {
                ioctl(fd, WDIOC_KEEPALIVE, 0);
                sleep(10);
        }

The argument to the ioctl is ignored.
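
Since not every driver has an ioctl interface, a daemon that has to run
on several drivers might try the ioctl first and fall back to a plain
write.  The following is only a sketch of that idea; wd_ping() is a
hypothetical helper, and treating every ioctl failure as "no ioctl
support" is an assumption made to keep the example short.

```c
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/watchdog.h>

/* Hypothetical helper: ping via the KEEPALIVE ioctl, falling back to
 * a one-byte write if the ioctl fails for any reason.
 * Returns 0 if either method succeeded, -1 otherwise. */
int wd_ping(int fd)
{
        if (ioctl(fd, WDIOC_KEEPALIVE, 0) == 0)
                return 0;
        /* Any write counts as a ping. */
        return write(fd, "\0", 1) == 1 ? 0 : -1;
}
```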

Setting and getting the timeout:

For some drivers it is possible to modify the watchdog timeout on the
fly with the SETTIMEOUT ioctl; those drivers have the WDIOF_SETTIMEOUT
flag set in their options field.  The argument is an integer
representing the timeout in seconds.  The driver returns the real
timeout used in the same variable, and this timeout might differ from
the requested one due to limitations of the hardware.

    int timeout = 45;
    ioctl(fd, WDIOC_SETTIMEOUT, &timeout);
    printf("The timeout was set to %d seconds\n", timeout);

This example might actually print "The timeout was set to 60 seconds"
if the device has a granularity of minutes for its timeout.
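
Because the driver may round the request, a daemon probably wants to
derive its ping interval from the value that comes back rather than
from the value it asked for.  A minimal sketch follows; both helper
names are hypothetical, and pinging at half the timeout is just an
assumed policy, not something the API prescribes.

```c
#include <sys/ioctl.h>
#include <linux/watchdog.h>

/* Hypothetical helper: request a timeout in seconds and return the
 * value the driver actually chose, or -1 if the ioctl failed (for
 * example because the driver does not support SETTIMEOUT). */
int wd_set_timeout(int fd, int requested)
{
        int timeout = requested;

        if (ioctl(fd, WDIOC_SETTIMEOUT, &timeout) == -1)
                return -1;
        return timeout;         /* may differ from 'requested' */
}

/* Assumed policy: ping twice per timeout period, with a floor of
 * one second between pings. */
int wd_ping_interval(int timeout)
{
        int interval = timeout / 2;

        return interval < 1 ? 1 : interval;
}
```

With the 60 second result from the example above, wd_ping_interval()
would suggest sleeping 30 seconds between pings.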

Starting with the Linux 2.4.18 kernel, it is possible to query the
current timeout using the GETTIMEOUT ioctl.

    ioctl(fd, WDIOC_GETTIMEOUT, &timeout);
    printf("The timeout is %d seconds\n", timeout);

Environmental monitoring:

All watchdog drivers are required to return more information about the
system; some do temperature, fan and power level monitoring, some can
tell you the reason for the last reboot of the system.  The GETSUPPORT
ioctl is available to ask what the device can do:

        struct watchdog_info ident;
        ioctl(fd, WDIOC_GETSUPPORT, &ident);

The fields returned in the ident struct are:

        identity                a string identifying the watchdog driver
        firmware_version        the firmware version of the card if available
        options                 a set of flags describing what the device supports
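
One way to use the options field is to test for a capability before
relying on it.  The sketch below assumes the struct watchdog_info
layout from <linux/watchdog.h>; the helper names wd_query() and
wd_supports() are invented for this example.

```c
#include <sys/ioctl.h>
#include <linux/watchdog.h>

/* Hypothetical helper: fill 'ident' via GETSUPPORT.
 * Returns 0 on success, -1 if the ioctl failed. */
int wd_query(int fd, struct watchdog_info *ident)
{
        return ioctl(fd, WDIOC_GETSUPPORT, ident) == -1 ? -1 : 0;
}

/* Hypothetical helper: 1 if the driver advertises the given WDIOF_
 * flag in its options field, 0 otherwise. */
int wd_supports(const struct watchdog_info *ident, unsigned int flag)
{
        return (ident->options & flag) != 0;
}
```

A daemon could, for instance, call SETTIMEOUT only when
wd_supports(&ident, WDIOF_SETTIMEOUT) returns 1.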

The options field can have the following bits set, and describes what
kind of information the GETSTATUS and GETBOOTSTATUS ioctls can
return.   [FIXME -- Is this correct?]

        WDIOF_OVERHEAT          Reset due to CPU overheat

The machine was last rebooted by the watchdog because the thermal limit was
exceeded.

        WDIOF_FANFAULT          Fan failed

A system fan monitored by the watchdog card has failed.

        WDIOF_EXTERN1           External relay 1

External monitoring relay/source 1 was triggered. Controllers intended for
real world applications include external monitoring pins that will trigger
a reset.

        WDIOF_EXTERN2           External relay 2

External monitoring relay/source 2 was triggered.

        WDIOF_POWERUNDER        Power bad/power fault

The machine is showing an undervoltage status.

        WDIOF_CARDRESET         Card previously reset the CPU

The last reboot was caused by the watchdog card.

        WDIOF_POWEROVER         Power over voltage

The machine is showing an overvoltage status. Note that if one level is
under and one over both bits will be set - this may seem odd but makes
sense.

        WDIOF_KEEPALIVEPING     Keep alive ping reply

The watchdog saw a keepalive ping since it was last queried.

        WDIOF_SETTIMEOUT        Can set/get the timeout


For those drivers that return any bits set in the options field, the
GETSTATUS and GETBOOTSTATUS ioctls can be used to ask for the current
status, and the status at the last reboot, respectively.

    int flags;
    ioctl(fd, WDIOC_GETSTATUS, &flags);

    or

    ioctl(fd, WDIOC_GETBOOTSTATUS, &flags);

Note that not all devices support these two calls, and some only
support the GETBOOTSTATUS call.
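
The flags value that comes back is a bit mask of the WDIOF_ values
listed earlier, so a daemon that wants to log it has to decode the
bits.  Here is one possible sketch; wd_describe_flags() is a
hypothetical helper and the short names it prints are chosen for this
example only.

```c
#include <stdio.h>
#include <string.h>
#include <linux/watchdog.h>

/* Hypothetical helper: write the names of the WDIOF_ bits set in
 * 'flags' into 'buf', separated by spaces.  Output that does not
 * fit in 'len' bytes is silently truncated.  Returns 'buf'. */
char *wd_describe_flags(int flags, char *buf, size_t len)
{
        static const struct { int bit; const char *name; } names[] = {
                { WDIOF_OVERHEAT,      "OVERHEAT" },
                { WDIOF_FANFAULT,      "FANFAULT" },
                { WDIOF_EXTERN1,       "EXTERN1" },
                { WDIOF_EXTERN2,       "EXTERN2" },
                { WDIOF_POWERUNDER,    "POWERUNDER" },
                { WDIOF_CARDRESET,     "CARDRESET" },
                { WDIOF_POWEROVER,     "POWEROVER" },
                { WDIOF_KEEPALIVEPING, "KEEPALIVEPING" },
                { WDIOF_SETTIMEOUT,    "SETTIMEOUT" },
        };
        size_t i, used;

        if (len == 0)
                return buf;
        buf[0] = '\0';
        for (i = 0; i < sizeof(names) / sizeof(names[0]); i++) {
                if (!(flags & names[i].bit))
                        continue;
                /* Append after what is already there. */
                used = strlen(buf);
                snprintf(buf + used, len - used, "%s%s",
                         used ? " " : "", names[i].name);
        }
        return buf;
}
```

For a boot status of WDIOF_OVERHEAT|WDIOF_CARDRESET this would yield
the string "OVERHEAT CARDRESET".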

Some drivers can measure the temperature using the GETTEMP ioctl.  The
returned value is the temperature in degrees Fahrenheit.

    int temperature;
    ioctl(fd, WDIOC_GETTEMP, &temperature);
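
If the rest of a monitoring setup works in degrees Celsius, the reading
has to be converted.  The conversion is the usual one; the function
name wd_temp_celsius() is made up for this sketch.

```c
/* Convert a GETTEMP reading (degrees Fahrenheit) to degrees Celsius. */
double wd_temp_celsius(int fahrenheit)
{
        return (fahrenheit - 32) * 5.0 / 9.0;
}
```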

Finally the SETOPTIONS ioctl can be used to control some aspects of
the card's operation; right now the pcwd driver is the only one
supporting this ioctl.

    int options = 0;
    ioctl(fd, WDIOC_SETOPTIONS, &options);

The following options are available:

        WDIOS_DISABLECARD       Turn off the watchdog timer
        WDIOS_ENABLECARD        Turn on the watchdog timer
        WDIOS_TEMPPANIC         Kernel panic on temperature trip

[FIXME -- better explanations]

Implementations in the current drivers in the kernel tree:

Here I have tried to summarize what the different drivers support and
where they do strange things compared to the other drivers.

acquirewdt.c -- Acquire Single Board Computer

        This driver has a hardcoded timeout of 1 minute

        Supports CONFIG_WATCHDOG_NOWAYOUT

        GETSUPPORT returns WDIOF_KEEPALIVEPING.  GETSTATUS will return 1
        if the device is open, 0 if not.  [FIXME -- isn't this rather
        silly?  To be able to use the ioctl, the device must be open
        and so GETSTATUS will always return 1].

advantechwdt.c -- Advantech Single Board Computer

        Timeout that defaults to 60 seconds, supports SETTIMEOUT.

        Supports CONFIG_WATCHDOG_NOWAYOUT

        GETSUPPORT returns WDIOF_KEEPALIVEPING and WDIOF_SETTIMEOUT.
        The GETSTATUS call returns if the device is open or not.
        [FIXME -- silliness again?]

eurotechwdt.c -- Eurotech CPU-1220/1410

        The timeout can be set using the SETTIMEOUT ioctl and defaults
        to 60 seconds.

        Also has a module parameter "ev" (event type) which controls
        what should happen on a timeout: the string "int", or anything
        else, which causes a reboot.  [FIXME -- better description]

        Supports CONFIG_WATCHDOG_NOWAYOUT

        GETSUPPORT returns WDIOF_CARDRESET and WDIOF_SETTIMEOUT but
        GETSTATUS is not supported and GETBOOTSTATUS just returns 0.

i810-tco.c -- Intel 810 chipset

        Also has support for a lot of other i8x0 stuff, but the
        watchdog is one of the things.

        The timeout is set using the module parameter "i810_margin",
        which is in steps of 0.6 seconds where 2<i810_margin<64.  The
        driver supports the SETTIMEOUT ioctl.

        Supports CONFIG_WATCHDOG_NOWAYOUT.

        GETSUPPORT returns WDIOF_SETTIMEOUT.  The GETSTATUS call
        returns some kind of timer value which is not compatible with
        the other drivers.  GETBOOTSTATUS returns some kind of
        hardware specific boot status.  [FIXME -- describe this]

ib700wdt.c -- IB700 Single Board Computer

        Default timeout of 30 seconds and the timeout is settable
        using the SETTIMEOUT ioctl.  Note that only a few timeout
        values are supported.

        Supports CONFIG_WATCHDOG_NOWAYOUT

        GETSUPPORT returns WDIOF_KEEPALIVEPING and WDIOF_SETTIMEOUT.
        The GETSTATUS call returns if the device is open or not.
        [FIXME -- silliness again?]

machzwd.c -- MachZ ZF-Logic

        Hardcoded timeout of 10 seconds

        Has a module parameter "action" that controls what happens
        when the timeout runs out, which can be 0 = RESET (default),
        1 = SMI, 2 = NMI, 3 = SCI.

        Supports CONFIG_WATCHDOG_NOWAYOUT and the magic character
        'V' close handling.

        GETSUPPORT returns WDIOF_KEEPALIVEPING, and the GETSTATUS call
        returns if the device is open or not.  [FIXME -- silliness
        again?]

mixcomwd.c -- MixCom Watchdog

        [FIXME -- I'm unable to tell what the timeout is]

        Supports CONFIG_WATCHDOG_NOWAYOUT

        GETSUPPORT returns WDIOF_KEEPALIVEPING, GETSTATUS returns if
        the device is opened or not [FIXME -- I'm not really sure how
        this works, there seems to be some magic connected to
        CONFIG_WATCHDOG_NOWAYOUT]

pcwd.c -- Berkshire PC Watchdog

        Hardcoded timeout of 1.5 seconds

        Supports CONFIG_WATCHDOG_NOWAYOUT

        GETSUPPORT returns WDIOF_OVERHEAT|WDIOF_CARDRESET and both
        GETSTATUS and GETBOOTSTATUS return something useful.

        The SETOPTIONS call can be used to enable and disable the card
        and to ask the driver to call panic if the system overheats.

sbc60xxwdt.c -- 60xx Single Board Computer

        Hardcoded timeout of 10 seconds

        Does not support CONFIG_WATCHDOG_NOWAYOUT, but has the magic
        character 'V' close handling.

        No bits set in GETSUPPORT

scx200.c -- National SCx200 CPUs

        Not in the kernel yet.

        The timeout is set using a module parameter "margin" which
        defaults to 60 seconds.  The timeout can also be set using
        SETTIMEOUT and read using GETTIMEOUT.

        Supports a module parameter "nowayout" that is initialized
        with the value of CONFIG_WATCHDOG_NOWAYOUT.  Also supports the
        magic character 'V' handling.

shwdt.c -- SuperH 3/4 processors

        [FIXME -- I'm unable to tell what the timeout is]

        Supports CONFIG_WATCHDOG_NOWAYOUT

        GETSUPPORT returns WDIOF_KEEPALIVEPING, and the GETSTATUS call
        returns if the device is open or not.  [FIXME -- silliness
        again?]

softdog.c -- Software watchdog

        The timeout is set with the module parameter "soft_margin"
        which defaults to 60 seconds.  The timeout is also settable
        using the SETTIMEOUT ioctl.

        Supports CONFIG_WATCHDOG_NOWAYOUT

        WDIOF_SETTIMEOUT bit set in GETSUPPORT

w83877f_wdt.c -- W83877F Computer

        Hardcoded timeout of 30 seconds

        Does not support CONFIG_WATCHDOG_NOWAYOUT, but has the magic
        character 'V' close handling.

        No bits set in GETSUPPORT

wdt.c -- ICS WDT500/501 ISA and
wdt_pci.c -- ICS WDT500/501 PCI

        Default timeout of 60 seconds.  The timeout is also settable
        using the SETTIMEOUT ioctl.

        Supports CONFIG_WATCHDOG_NOWAYOUT

        GETSUPPORT returns with bits set depending on the actual
        card. The WDT501 supports a lot of external monitoring, the
        WDT500 much less.

wdt285.c -- Footbridge watchdog

        The timeout is set with the module parameter "soft_margin"
        which defaults to 60 seconds.  The timeout is also settable
        using the SETTIMEOUT ioctl.

        Does not support CONFIG_WATCHDOG_NOWAYOUT

        WDIOF_SETTIMEOUT bit set in GETSUPPORT

wdt977.c -- Netwinder W83977AF chip

        Hardcoded timeout of 3 minutes

        Supports CONFIG_WATCHDOG_NOWAYOUT

        Does not support any ioctls at all.