USING VFAT
----------------------------------------------------------------------
To use the vfat filesystem, specify the filesystem type 'vfat'. For example:
  mount -t vfat /dev/fd0 /mnt

No special partition formatter is required. mkdosfs will work fine
if you want to format from within Linux.

VFAT MOUNT OPTIONS
----------------------------------------------------------------------
codepage=###   -- Sets the codepage for converting to shortname characters
                  on FAT and VFAT filesystems. By default, codepage 437
                  is used. This is the default for the U.S. and some
                  European countries.
iocharset=name -- Character set to use for converting between 8 bit characters
                  and 16 bit Unicode characters. Long filenames are stored on
                  disk in Unicode format, but Unix for the most part doesn't
                  know how to deal with Unicode. UTF8 translation is also
                  available through the utf8 option.
utf8           -- UTF8 is the filesystem safe version of Unicode that
                  is used by the console. It can be enabled for the
                  filesystem with this option. If 'uni_xlate' gets set,
                  UTF8 gets disabled.
uni_xlate      -- Translate unhandled Unicode characters to special
                  escaped sequences. This would let you back up and
                  restore filenames that are created with any Unicode
                  characters. Until Linux supports Unicode for real,
                  this gives you an alternative. Without this option,
                  a '?' is used when no translation is possible. The
                  escape character is ':' because it is otherwise
                  illegal on the vfat filesystem. The escape sequence
                  that gets used, where u is the unicode character, is:
                      ':', (u & 0x3f), ((u>>6) & 0x3f), (u>>12)
posix          -- Allow names that differ only in case, such as
                  'LongFileName' and 'longfilename', to coexist. This has some
                  problems currently because 8.3 conflicts are not handled
                  correctly for Posix filesystem compliance.
nonumtail      -- When creating 8.3 aliases, normally the alias will
                  end in '~1' or tilde followed by some number. If this
                  option is set, then if the filename is
                  "longfilename.txt" and "longfile.txt" does not
                  currently exist in the directory, 'longfile.txt' will
                  be the short alias instead of 'longfi~1.txt'.

quiet          -- Stops printing certain warning messages.
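
The arithmetic behind the uni_xlate escape sequence can be sketched in C
as follows. This is only an illustration: the function names are made up
for this example, and the three values are left as raw numbers because
this document does not say how they are turned into printable filename
characters.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: split a 16 bit Unicode character u into the three
 * values that follow the ':' escape character, in the documented order. */
void uni_xlate_split(uint16_t u, unsigned char out[3])
{
        assert(out != NULL);
        out[0] = (unsigned char)(u & 0x3f);         /* bits 0-5   */
        out[1] = (unsigned char)((u >> 6) & 0x3f);  /* bits 6-11  */
        out[2] = (unsigned char)(u >> 12);          /* bits 12-15 */
}

/* Hypothetical inverse: rebuild the character from the three values. */
uint16_t uni_xlate_join(const unsigned char in[3])
{
        assert(in != NULL);
        return (uint16_t)(in[0] | (in[1] << 6) | (in[2] << 12));
}
```

Because the three values together carry all 16 bits, the split can be
undone without loss, which is what makes backup and restore of such
filenames possible.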

Explanation of Native Language Support in the VFAT Filesystem
----------------------------------------------------------------------
Two different character sets are needed by the vfat
filesystem. The first is the codepage character set. The codepage is
the character set that is used to store short filenames on disk. Its
mount option is 'codepage=437', where 437 is the codepage number.

Long filenames are stored in Unicode, but since the Linux filesystem
doesn't deal with 16 bit characters, we need some way of converting
characters. There are a couple of ways to do this. One is to
use the 'utf8' mount option and I will cover that a bit later. The
other is to use the 'iocharset=iso8859-1' mount option where the
iso8859-1 tells the filesystem which character set is used for input
and output. If you are in Russia, you might specify koi8-r here.
If a Unicode character on disk cannot be mapped to anything in the
iocharset, it is replaced with a '?'.
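
As a rough illustration of the '?' fallback, here is a minimal sketch for
the iso8859-1 case only. It relies on the fact that iso8859-1 matches the
first 256 Unicode code points. The function name is invented for this
example, and this is not how the filesystem itself looks up character
sets.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Convert n 16 bit Unicode characters to iso8859-1 bytes. Anything above
 * 0xFF has no iso8859-1 equivalent and is replaced with '?'. */
void unicode_to_latin1(const uint16_t *in, size_t n, unsigned char *out)
{
        size_t i;

        assert(in != NULL && out != NULL);
        for (i = 0; i < n; i++)
                out[i] = (in[i] <= 0xFF) ? (unsigned char)in[i] : '?';
}
```

Note that the replacement loses information: two different unmappable
names of the same length become identical strings of '?'.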

The iocharset is used to convert long filenames to and from Unicode.
It is currently implemented. The codepage is used to convert short
filenames to and from the iocharset. This translation is not currently
implemented.

If no iocharset is specified and the default cannot be loaded,
the mount will succeed while falling back to doing no conversions at
all. If a charset is explicitly specified and the charset cannot be
loaded, the mount will fail.

For the codepage, the default mount option is 'codepage=437'. If a
codepage is explicitly asked for and the load of the character set
fails, the mount will fail. If no codepage is explicitly asked for
and the load of the character set fails, the mount will still succeed.

UTF8 is an 8 bit, filesystem safe representation of Unicode. It does
not lose any information in the conversion. However, you need to have
a terminal or a program that knows how to deal with UTF8. The Linux
console can be put into a mode where it will correctly display UTF8
characters. I don't know if there is a similar mode for xterms, but
I don't believe there is. More information about UTF8 can be found
at http://www.unicode.com
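
To make the "8 bit, no information lost" point concrete, here is a
minimal sketch of encoding one 16 bit Unicode character as UTF8. The
function name is made up for this example, and the sketch assumes plain
16 bit values: it makes no attempt to handle surrogate pairs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encode one 16 bit Unicode character as UTF8. Writes 1 to 3 bytes into
 * out and returns the number of bytes written. */
size_t utf8_encode16(uint16_t u, unsigned char out[3])
{
        assert(out != NULL);
        if (u < 0x80) {                 /* 7 bits: plain ASCII byte */
                out[0] = (unsigned char)u;
                return 1;
        }
        if (u < 0x800) {                /* up to 11 bits: 2 bytes */
                out[0] = (unsigned char)(0xC0 | (u >> 6));
                out[1] = (unsigned char)(0x80 | (u & 0x3F));
                return 2;
        }
        /* up to 16 bits: 3 bytes */
        out[0] = (unsigned char)(0xE0 | (u >> 12));
        out[1] = (unsigned char)(0x80 | ((u >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (u & 0x3F));
        return 3;
}
```

ASCII characters come out unchanged, and every byte of a multi-byte
sequence has its high bit set, so '/' and NUL never appear by accident.
That is what "filesystem safe" refers to.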

TODO
----------------------------------------------------------------------
* When only shortnames exist, translate them from the codepage character
  set to the iocharset. Currently, translations only occur when longnames
  exist. To translate, first convert from codepage to Unicode and then
  to the output character set.

* Need to add dcache_lookup code to the msdos filesystem. This means the
  directories need to be versioned as in the vfat filesystem.

* Need to get rid of the raw scanning stuff. Instead, always use
  a "get next directory entry" approach. The only thing left that uses
  raw scanning is the directory renaming code.

* Fix the Posix filesystem support to work in 8.3 space. This involves
  renaming aliases if a conflict occurs between a new filename and
  an old alias. This is quite a mess.

POSSIBLE PROBLEMS
----------------------------------------------------------------------
* vfat_valid_longname does not properly check reserved names.
* When a volume name is the same as a directory name in the root
  directory of the filesystem, the directory name sometimes shows
  up as an empty file.
* autoconv option does not work correctly.

BUG REPORTS
----------------------------------------------------------------------
If you have trouble with the VFAT filesystem, mail bug reports to
chaffee@bugs-bunny.cs.berkeley.edu. Please specify the filename
and the operation that gave you trouble.

TEST SUITE
----------------------------------------------------------------------
If you plan to make any modifications to the vfat filesystem, please
get the test suite that comes with the vfat distribution at

  http://www-plateau.cs.berkeley.edu/people/chaffee/vfat.html

This tests quite a few parts of the vfat filesystem and additional
tests for new features or untested features would be appreciated.

NOTES ON THE STRUCTURE OF THE VFAT FILESYSTEM
----------------------------------------------------------------------
(This documentation was provided by Galen C. Hunt
 and lightly annotated by Gordon Chaffee).

This document presents a very rough, technical overview of my
knowledge of the extended FAT file system used in Windows NT 3.5 and
Windows 95. I don't guarantee that any of the following is correct,
but it appears to be so.

The extended FAT file system is almost identical to the FAT
file system used in DOS versions up to and including 6.223410239847
:-). The significant change has been the addition of long file names.
These names support up to 255 characters including spaces and lower
case characters as opposed to the traditional 8.3 short names.

Here is the description of the traditional FAT entry in the current
Windows 95 filesystem:

        struct directory { // Short 8.3 names
                unsigned char name[8];          // file name
                unsigned char ext[3];           // file extension
                unsigned char attr;             // attribute byte
                unsigned char lcase;            // Case for base and extension
                unsigned char ctime_ms;         // Creation time, milliseconds
                unsigned char ctime[2];         // Creation time
                unsigned char cdate[2];         // Creation date
                unsigned char adate[2];         // Last access date
                unsigned char reserved[2];      // reserved values (ignored)
                unsigned char time[2];          // time stamp
                unsigned char date[2];          // date stamp
                unsigned char start[2];         // starting cluster number
                unsigned char size[4];          // size of the file
        };

The lcase field specifies if the base and/or the extension of an 8.3
name should be capitalized. This field does not seem to be used by
Windows 95 but it is used by Windows NT. The case of filenames is not
completely compatible from Windows NT to Windows 95. It is not completely
compatible in the reverse direction, however. Filenames that fit in
the 8.3 namespace and are written on Windows NT to be lowercase will
show up as uppercase on Windows 95.

Note that the "start" and "size" values are actually little
endian integer values. The descriptions of the fields in this
structure are public knowledge and can be found elsewhere.
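
Since the fields are declared as byte arrays, reading "start" and "size"
means assembling the bytes by hand, least significant byte first. A
minimal sketch follows. The helper names are made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a 2 byte little endian value, e.g. the "start" field. */
uint16_t le16(const unsigned char b[2])
{
        assert(b != NULL);
        return (uint16_t)(b[0] | (b[1] << 8));
}

/* Read a 4 byte little endian value, e.g. the "size" field. */
uint32_t le32(const unsigned char b[4])
{
        assert(b != NULL);
        return (uint32_t)b[0] |
               ((uint32_t)b[1] << 8) |
               ((uint32_t)b[2] << 16) |
               ((uint32_t)b[3] << 24);
}
```

Assembling the value byte by byte gives the same result on any host,
whatever its native byte order or alignment rules.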

With the extended FAT system, Microsoft has inserted extra
directory entries for any files with extended names. (Any name which
legally fits within the old 8.3 encoding scheme does not have extra
entries.) I call these extra entries slots. Basically, a slot is a
specially formatted directory entry which holds up to 13 characters of
a file's extended name. Think of slots as additional labeling for the
directory entry of the file to which they correspond. Microsoft
prefers to refer to the 8.3 entry for a file as its alias and the
extended slot directory entries as the file name.

The C structure for a slot directory entry follows:

        struct slot { // Up to 13 characters of a long name
                unsigned char id;               // sequence number for slot
                unsigned char name0_4[10];      // first 5 characters in name
                unsigned char attr;             // attribute byte
                unsigned char reserved;         // always 0
                unsigned char alias_checksum;   // checksum for 8.3 alias
                unsigned char name5_10[12];     // 6 more characters in name
                unsigned char start[2];         // starting cluster number
                unsigned char name11_12[4];     // last 2 characters in name
        };

If the layout of the slots looks a little odd, it's only
because of Microsoft's efforts to maintain compatibility with old
software. The slots must be disguised to prevent old software from
panicking. To this end, a number of measures are taken:

        1) The attribute byte for a slot directory entry is always set
           to 0x0f. This corresponds to an old directory entry with
           attributes of "hidden", "system", "read-only", and "volume
           label". Most old software will ignore any directory
           entries with the "volume label" bit set. Real volume label
           entries don't have the other three bits set.

        2) The starting cluster is always set to 0, an impossible
           value for a DOS file.
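
The two measures above also give a simple way to tell a slot from an
ordinary entry. The sketch below works on a raw 32 byte directory entry.
The byte offsets are derived from the struct layouts shown earlier (attr
at offset 11, start at offset 26). The function and constant names are
invented for this example.

```c
#include <assert.h>
#include <stddef.h>

enum {
        DIRENT_SIZE  = 32,   /* both structs add up to 32 bytes */
        ATTR_OFFSET  = 11,   /* position of the attribute byte */
        START_OFFSET = 26,   /* position of the 2 byte starting cluster */
        SLOT_ATTR    = 0x0f  /* hidden | system | read-only | volume label */
};

/* Return nonzero if the raw entry looks like a long name slot. */
int looks_like_slot(const unsigned char entry[DIRENT_SIZE])
{
        assert(entry != NULL);
        return entry[ATTR_OFFSET] == SLOT_ATTR &&
               entry[START_OFFSET] == 0 &&
               entry[START_OFFSET + 1] == 0;
}
```

The attribute byte sits at the same offset in both structures, which is
why software that predates long names sees a slot as a strange volume
label rather than as garbage.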

Because the extended FAT system is backward compatible, it is
possible for old software to modify directory entries. Measures must
be taken to ensure the validity of slots. An extended FAT system can
verify that a slot does in fact belong to an 8.3 directory entry by
the following:

        1) Positioning. Slots for a file always immediately precede
           their corresponding 8.3 directory entry. In addition, each
           slot has an id which marks its order in the extended file
           name. Here is a very abbreviated view of an 8.3 directory
           entry and its corresponding long name slots for the file
           "My Big File.Extension which is long":
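
                <preceding files...>
                <slot #3, id = 0x43, characters = "h is long">
                <slot #2, id = 0x02, characters = "xtension whic">
                <slot #1, id = 0x01, characters = "My Big File.E">
                <directory entry, name = "MYBIGFIL.EXT">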

           Note that the slots are stored from last to first. Slots
           are numbered from 1 to N. The id of the Nth slot is or'ed
           with 0x40 to mark it as the last one.

        2) Checksum. Each slot has an "alias_checksum" value. The
           checksum is calculated from the 8.3 name using the
           following algorithm:

           for (sum = i = 0; i < 11; i++) {
                sum = (((sum&1)<<7)|((sum&0xfe)>>1)) + name[i];
           }

        3) If there is free space in the final slot, a Unicode NULL (0x0000)
           is stored after the final character. After that, all unused
           characters in the final slot are set to Unicode 0xFFFF.
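
The checksum loop and the slot id rules above can be wrapped into small
helpers, as in the following sketch. The function names are made up for
this example. The sketch assumes that "name" is the 11 bytes of the 8.3
entry (the 8 name bytes followed by the 3 extension bytes, with no dot)
and that "sum" is an unsigned 8 bit value, so the addition wraps modulo
256.

```c
#include <assert.h>
#include <stddef.h>

/* Checksum of an 8.3 alias: rotate the running sum right by one bit,
 * then add the next name byte, for all 11 bytes. */
unsigned char alias_checksum(const unsigned char name[11])
{
        unsigned char sum = 0;
        size_t i;

        assert(name != NULL);
        for (i = 0; i < 11; i++)
                sum = (unsigned char)((((sum & 1) << 7) |
                                       ((sum & 0xfe) >> 1)) + name[i]);
        return sum;
}

/* Sequence number of a slot (1 to N), with the "last slot" flag removed. */
int slot_sequence(unsigned char id)
{
        return id & ~0x40;
}

/* Nonzero if the id marks the Nth (last) slot of a long name. */
int slot_is_last(unsigned char id)
{
        return (id & 0x40) != 0;
}
```

A reader of the directory would compare alias_checksum() of the 8.3 entry
against the alias_checksum field of each preceding slot. A mismatch
suggests that old software changed the alias without updating the slots.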

Finally, note that the extended name is stored in Unicode. Each Unicode
character takes two bytes.
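
Putting the slot layout and the two byte characters together, the
characters of one slot could be collected as in the sketch below. The
offsets come from the struct slot layout (name pieces at bytes 1, 14 and
28). The function name is invented for this example, and the sketch
assumes that each character is stored low byte first, like the "start"
and "size" fields.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Byte offsets of the 13 character positions inside a 32 byte slot:
 * 5 in name0_4, 6 in name5_10 and 2 in name11_12. */
static const size_t slot_char_offset[13] = {
        1, 3, 5, 7, 9, 14, 16, 18, 20, 22, 24, 28, 30
};

/* Copy the characters of one slot into out and return how many were
 * found. Copying stops at the Unicode NULL that ends a long name. */
size_t slot_chars(const unsigned char slot[32], uint16_t out[13])
{
        size_t n;

        assert(slot != NULL && out != NULL);
        for (n = 0; n < 13; n++) {
                size_t off = slot_char_offset[n];
                uint16_t c = (uint16_t)(slot[off] | (slot[off + 1] << 8));

                if (c == 0x0000)
                        break;
                out[n] = c;
        }
        return n;
}
```

A slot that is completely full has no terminator, in which case the
function simply returns 13.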