USING VFAT
----------------------------------------------------------------------
To use the vfat filesystem, use the filesystem type 'vfat', e.g.

  mount -t vfat /dev/fd0 /mnt

No special partition formatter is required. mkdosfs will work fine
if you want to format from within Linux.

VFAT MOUNT OPTIONS
----------------------------------------------------------------------
codepage=###   -- Sets the codepage for converting to shortname characters
                  on FAT and VFAT filesystems. By default, codepage 437
                  is used. This is the default for the U.S. and some
                  European countries.
iocharset=name -- Character set to use for converting between 8 bit characters
                  and 16 bit Unicode characters. Long filenames are stored on
                  disk in Unicode format, but Unix for the most part doesn't
                  know how to deal with Unicode. There is also an option of
                  doing UTF8 translations with the utf8 option.
utf8           -- UTF8 is the filesystem safe version of Unicode that
                  is used by the console. It can be enabled for the
                  filesystem with this option. If 'uni_xlate' gets set,
                  UTF8 gets disabled.
uni_xlate      -- Translate unhandled Unicode characters to special
                  escaped sequences. This would let you back up and
                  restore filenames that are created with any Unicode
                  characters. Until Linux supports Unicode for real,
                  this gives you an alternative. Without this option,
                  a '?' is used when no translation is possible. The
                  escape character is ':' because it is otherwise
                  illegal on the vfat filesystem. The escape sequence
                  that gets used, where u is the Unicode character, is:
                      ':', (u & 0x3f), ((u>>6) & 0x3f), (u>>12)
posix          -- Allow names that differ only in case, such as
                  'LongFileName' and 'longfilename', to coexist. This has
                  some problems currently because 8.3 conflicts are not
                  handled correctly for Posix filesystem compliance.
nonumtail      -- When creating 8.3 aliases, normally the alias will
                  end in '~1' or tilde followed by some number. If this
                  option is set, then if the filename is
                  "longfilename.txt" and "longfile.txt" does not
                  currently exist in the directory, 'longfile.txt' will
                  be the short alias instead of 'longfi~1.txt'.

quiet          -- Stops printing certain warning messages.

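As a rough illustration of the uni_xlate escape sequence described above,
the following sketch splits a 16 bit Unicode value into the three fields
that follow the ':' escape character. The function name and output array
are hypothetical. The text above does not say how each field is turned
into a printable character, so the sketch stops at the raw field values.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: compute the three fields that follow the ':'
 * escape character for Unicode value u, in the order given above.
 * Mapping each field to a printable character is not covered here.
 */
void uni_xlate_fields(uint16_t u, unsigned char out[3])
{
	assert(out != NULL);
	out[0] = u & 0x3f;		/* low 6 bits */
	out[1] = (u >> 6) & 0x3f;	/* middle 6 bits */
	out[2] = u >> 12;		/* top 4 bits */
}
```

For u = 0x20ac the three fields come out as 0x2c, 0x02 and 0x02. Together
they carry all 16 bits of the character, which is presumably why filenames
can be restored without loss.
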
Explanation of Native Language Support in the VFAT Filesystem
----------------------------------------------------------------------
Two different character sets are needed by the vfat filesystem. The
first is the codepage character set. The codepage is the character set
that is used to store short filenames on disk. Its mount option is
'codepage=437', where 437 is the codepage number.

Long filenames are stored in Unicode, but since the Linux filesystem
doesn't deal with 16 bit characters, we need some way of converting
characters. There are a couple of options for how to do this. One is to
use the 'utf8' mount option and I will cover that a bit later. The
other is to use the 'iocharset=iso8859-1' mount option, where
iso8859-1 tells the filesystem which character set is used for input
and output. If you are in Russia, you might specify koi8-r here.
If a Unicode character on disk cannot be mapped to anything in the
iocharset, it is replaced with a '?'.

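To make the '?' replacement concrete, here is a minimal sketch for the
iso8859-1 case only. It relies on the fact that ISO 8859-1 corresponds
exactly to the Unicode range 0x0000 to 0x00ff. The function name is
hypothetical, and other character sets such as koi8-r would need a real
lookup table instead of the simple range check used here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: convert n 16 bit Unicode characters to
 * iso8859-1. Anything above 0x00ff has no mapping in that character
 * set and becomes '?'. dst must have room for n + 1 bytes.
 */
void unicode_to_latin1(const uint16_t *src, size_t n, char *dst)
{
	size_t i;

	assert(src != NULL && dst != NULL);
	for (i = 0; i < n; i++)
		dst[i] = (src[i] <= 0xff) ? (char)src[i] : '?';
	dst[n] = '\0';
}
```

Note that the conversion is lossy: once a character has become '?', the
original Unicode value cannot be recovered from the converted name.
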
The iocharset is used to convert long filenames to and from Unicode.
This conversion is currently implemented. The codepage is used to
convert short filenames to and from the iocharset. This translation is
not currently implemented.

If no iocharset is specified and the default cannot be loaded, the
mount will succeed and fall back to doing no conversions at all. If a
charset is explicitly specified and the charset cannot be loaded, the
mount will fail.

For the codepage, the default mount option is 'codepage=437'. If a
codepage is explicitly asked for and the load of the character set
fails, the mount will fail. If no codepage is explicitly asked for
and the load of the character set fails, the mount will still succeed.

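The rules in the two paragraphs above follow the same pattern for both
options, and can be modelled roughly as below. This is only a sketch of
the described behaviour, not the filesystem's actual code: the enum, the
function names and the loader callback are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Possible outcomes of choosing a character set at mount time. */
enum charset_result {
	CHARSET_LOADED,		/* table loaded, conversions enabled */
	CHARSET_NO_CONVERSION,	/* default missing: mount anyway */
	CHARSET_MOUNT_FAILS	/* explicitly requested table missing */
};

/* Stand-in loaders for trying the model out. */
int load_always(const char *name) { (void)name; return 1; }
int load_never(const char *name) { (void)name; return 0; }

/*
 * Hypothetical model of the rules above. 'requested' is the value
 * given by the user, or NULL when the option was omitted. 'load'
 * stands in for whatever loads a table, returning nonzero on success.
 */
enum charset_result choose_charset(const char *requested,
				   const char *default_name,
				   int (*load)(const char *name))
{
	assert(default_name != NULL && load != NULL);
	if (requested != NULL)
		return load(requested) ? CHARSET_LOADED : CHARSET_MOUNT_FAILS;
	return load(default_name) ? CHARSET_LOADED : CHARSET_NO_CONVERSION;
}
```

The design point is that only an explicit request can make the mount
fail. A missing default table merely disables conversion.
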
UTF8 is an 8 bit, filesystem safe representation of Unicode. It does
not lose any information in the conversion. However, you need to have
a terminal or a program that knows how to deal with UTF8. The Linux
console can be put into a mode where it will correctly display UTF8
characters. I don't know if there is a similar mode for xterms, but
I don't believe there is. More information about UTF8 can be found
at http://www.unicode.com

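For readers unfamiliar with UTF8, the sketch below encodes a single
16 bit Unicode character using the standard UTF8 bit layout. The function
name is hypothetical, and surrogate pairs are deliberately left out to
keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Encode one 16 bit Unicode character as UTF8. Writes 1 to 3 bytes
 * to out (which must hold at least 3) and returns the byte count.
 * Surrogate pairs are not handled in this sketch.
 */
size_t utf8_encode16(uint16_t u, unsigned char *out)
{
	assert(out != NULL);
	if (u < 0x80) {			/* 7 bits: one byte, same as ASCII */
		out[0] = (unsigned char)u;
		return 1;
	}
	if (u < 0x800) {		/* 11 bits: two bytes */
		out[0] = (unsigned char)(0xc0 | (u >> 6));
		out[1] = (unsigned char)(0x80 | (u & 0x3f));
		return 2;
	}
	/* 16 bits: three bytes */
	out[0] = (unsigned char)(0xe0 | (u >> 12));
	out[1] = (unsigned char)(0x80 | ((u >> 6) & 0x3f));
	out[2] = (unsigned char)(0x80 | (u & 0x3f));
	return 3;
}
```

ASCII characters pass through unchanged, and every byte of a multi-byte
sequence has its high bit set, so bytes such as '/' and NUL never appear
inside one. That is what makes the encoding filesystem safe.
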
TODO
----------------------------------------------------------------------
* When only shortnames exist, translate them from the codepage character
  set to the iocharset. Currently, translations only occur when longnames
  exist. To translate, first convert from codepage to Unicode and then
  to the output character set.

* Need to add dcache_lookup code to the msdos filesystem. This means the
  directories need to be versioned like the vfat filesystem.

* Need to get rid of the raw scanning stuff. Instead, always use
  a get next directory entry approach. The only thing left that uses
  raw scanning is the directory renaming code.

* Fix the Posix filesystem support to work in 8.3 space. This involves
  renaming aliases if a conflict occurs between a new filename and
  an old alias. This is quite a mess.

POSSIBLE PROBLEMS
----------------------------------------------------------------------
* vfat_valid_longname does not properly check reserved names.
* When a volume name is the same as a directory name in the root
  directory of the filesystem, the directory name sometimes shows
  up as an empty file.
* autoconv option does not work correctly.

BUG REPORTS
----------------------------------------------------------------------
If you have trouble with the VFAT filesystem, mail bug reports to
chaffee@bugs-bunny.cs.berkeley.edu. Please specify the filename
and the operation that gave you trouble.

TEST SUITE
----------------------------------------------------------------------
If you plan to make any modifications to the vfat filesystem, please
get the test suite that comes with the vfat distribution at

http://www-plateau.cs.berkeley.edu/people/chaffee/vfat.html

This tests quite a few parts of the vfat filesystem and additional
tests for new features or untested features would be appreciated.

NOTES ON THE STRUCTURE OF THE VFAT FILESYSTEM
----------------------------------------------------------------------
(This documentation was provided by Galen C. Hunt
and lightly annotated by Gordon Chaffee).

This document presents a very rough, technical overview of my
knowledge of the extended FAT file system used in Windows NT 3.5 and
Windows 95. I don't guarantee that any of the following is correct,
but it appears to be so.

The extended FAT file system is almost identical to the FAT
file system used in DOS versions up to and including 6.223410239847
:-). The significant change has been the addition of long file names.
These names support up to 255 characters including spaces and lower
case characters as opposed to the traditional 8.3 short names.

Here is the description of the traditional FAT entry in the current
Windows 95 filesystem:

struct directory { // Short 8.3 names
        unsigned char name[8];          // file name
        unsigned char ext[3];           // file extension
        unsigned char attr;             // attribute byte
        unsigned char lcase;            // Case for base and extension
        unsigned char ctime_ms;         // Creation time, milliseconds
        unsigned char ctime[2];         // Creation time
        unsigned char cdate[2];         // Creation date
        unsigned char adate[2];         // Last access date
        unsigned char reserved[2];      // reserved values (ignored)
        unsigned char time[2];          // time stamp
        unsigned char date[2];          // date stamp
        unsigned char start[2];         // starting cluster number
        unsigned char size[4];          // size of the file
};

The lcase field specifies if the base and/or the extension of an 8.3
name should be capitalized. This field does not seem to be used by
Windows 95 but it is used by Windows NT. The case of filenames is not
completely compatible from Windows NT to Windows 95. It is completely
compatible in the reverse direction, however. Filenames that fit in
the 8.3 namespace and are written on Windows NT to be lowercase will
show up as uppercase on Windows 95.

Note that the "start" and "size" values are actually little
endian integer values. The descriptions of the fields in this
structure are public knowledge and can be found elsewhere.

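Since "start" and "size" are stored as little endian byte arrays, they
have to be assembled byte by byte on machines of either endianness. A
minimal sketch follows. The function names are hypothetical, and the
byte offsets 26 and 28 are simply what falls out of adding up the field
sizes in struct directory above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Offsets within the 32 byte entry, derived from struct directory. */
enum { DIR_START_OFF = 26, DIR_SIZE_OFF = 28 };

/* Starting cluster number of a raw 32 byte directory entry. */
uint16_t dir_start(const unsigned char *entry)
{
	assert(entry != NULL);
	return (uint16_t)(entry[DIR_START_OFF] |
			  (entry[DIR_START_OFF + 1] << 8));
}

/* File size in bytes of a raw 32 byte directory entry. */
uint32_t dir_size(const unsigned char *entry)
{
	const unsigned char *p;

	assert(entry != NULL);
	p = entry + DIR_SIZE_OFF;
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}
```

Reading the bytes individually, rather than casting the array to an
integer pointer, also avoids any alignment concerns.
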
With the extended FAT system, Microsoft has inserted extra
directory entries for any files with extended names. (Any name which
legally fits within the old 8.3 encoding scheme does not have extra
entries.) I call these extra entries slots. Basically, a slot is a
specially formatted directory entry which holds up to 13 characters of
a file's extended name. Think of slots as additional labeling for the
directory entry of the file to which they correspond. Microsoft
prefers to refer to the 8.3 entry for a file as its alias and the
extended slot directory entries as the file name.

The C structure for a slot directory entry follows:

struct slot { // Up to 13 characters of a long name
        unsigned char id;               // sequence number for slot
        unsigned char name0_4[10];      // first 5 characters in name
        unsigned char attr;             // attribute byte
        unsigned char reserved;         // always 0
        unsigned char alias_checksum;   // checksum for 8.3 alias
        unsigned char name5_10[12];     // 6 more characters in name
        unsigned char start[2];         // starting cluster number
        unsigned char name11_12[4];     // last 2 characters in name
};

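Because the 13 characters of a slot are scattered over three separate
fields, reading them back takes a little care. The sketch below gathers
them from a raw 32 byte slot. The function names are hypothetical, the
offsets are obtained by adding up the field sizes in struct slot, and it
is assumed that each 16 bit character is stored little endian like the
other multi-byte fields.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Byte offsets of the three name fields, derived from struct slot. */
enum { SLOT_NAME0_4 = 1, SLOT_NAME5_10 = 14, SLOT_NAME11_12 = 28 };

/* Read one 16 bit character, assumed to be stored little endian. */
static uint16_t get_le16(const unsigned char *p)
{
	return (uint16_t)(p[0] | (p[1] << 8));
}

/* Copy the 13 characters held by one raw 32 byte slot into out[13]. */
void slot_chars(const unsigned char *slot, uint16_t out[13])
{
	int i;

	assert(slot != NULL && out != NULL);
	for (i = 0; i < 5; i++)		/* characters 0..4 */
		out[i] = get_le16(slot + SLOT_NAME0_4 + 2 * i);
	for (i = 0; i < 6; i++)		/* characters 5..10 */
		out[5 + i] = get_le16(slot + SLOT_NAME5_10 + 2 * i);
	for (i = 0; i < 2; i++)		/* characters 11..12 */
		out[11 + i] = get_le16(slot + SLOT_NAME11_12 + 2 * i);
}
```
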
If the layout of the slots looks a little odd, it's only
because of Microsoft's efforts to maintain compatibility with old
software. The slots must be disguised to prevent old software from
panicking. To this end, a number of measures are taken:

        1) The attribute byte for a slot directory entry is always set
           to 0x0f. This corresponds to an old directory entry with
           attributes of "hidden", "system", "read-only", and "volume
           label". Most old software will ignore any directory
           entries with the "volume label" bit set. Real volume label
           entries don't have the other three bits set.

        2) The starting cluster is always set to 0, an impossible
           value for a DOS file.

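The two measures above also give new software a simple way to recognize
a slot. A minimal sketch, assuming the traditional FAT attribute bit
values. The macro and function names are chosen for this example only,
and offsets 11 and 26 are the positions of the attr and start fields in
both structures above.

```c
#include <assert.h>
#include <stddef.h>

/* Traditional FAT attribute bits (names are local to this sketch). */
#define ATTR_RO		0x01	/* read-only */
#define ATTR_HIDDEN	0x02	/* hidden */
#define ATTR_SYS	0x04	/* system */
#define ATTR_VOLUME	0x08	/* volume label */
#define ATTR_SLOT	(ATTR_RO | ATTR_HIDDEN | ATTR_SYS | ATTR_VOLUME)

/*
 * Return nonzero if the raw 32 byte directory entry looks like a long
 * name slot: attribute byte 0x0f and starting cluster 0.
 */
int entry_is_slot(const unsigned char *entry)
{
	assert(entry != NULL);
	return entry[11] == ATTR_SLOT &&	/* attr field */
	       entry[26] == 0 && entry[27] == 0;	/* start field */
}
```

A real volume label has only the 0x08 bit set, so it fails the attribute
comparison and is not mistaken for a slot.
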
Because the extended FAT system is backward compatible, it is
possible for old software to modify directory entries. Measures must
be taken to ensure the validity of slots. An extended FAT system can
verify that a slot does in fact belong to an 8.3 directory entry by
the following:

        1) Positioning. Slots for a file always immediately precede
           their corresponding 8.3 directory entry. In addition, each
           slot has an id which marks its order in the extended file
           name. Here is a very abbreviated view of an 8.3 directory
           entry and its corresponding long name slots for the file
           "My Big File.Extension which is long":

                [preceding files...]
                [slot #3, id = 0x43, characters = "h is long"]
                [slot #2, id = 0x02, characters = "xtension whic"]
                [slot #1, id = 0x01, characters = "My Big File.E"]
                [directory entry, name = "MYBIGFIL.EXT"]

           Note that the slots are stored from last to first. Slots
           are numbered from 1 to N. The Nth slot is or'ed with 0x40
           to mark it as the last one.

        2) Checksum. Each slot has an "alias_checksum" value. The
           checksum is calculated from the 8.3 name using the
           following algorithm:

                for (sum = i = 0; i < 11; i++) {
                        /* sum is an unsigned char: rotate right, then add */
                        sum = (((sum&1)<<7)|((sum&0xfe)>>1)) + name[i];
                }

        3) If there is room in the final slot, a Unicode NULL (0x0000)
           is stored after the final character. After that, all unused
           characters in the final slot are set to Unicode 0xFFFF.

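The checksum loop above leaves the type of "sum" implicit. Written out
as a complete function it might look like the sketch below, where the
function name is hypothetical and the 11 byte input is taken to be the
8 name bytes followed by the 3 extension bytes, with no dot.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Checksum of the 11 byte 8.3 name. Each step rotates the running
 * sum right by one bit and adds the next name byte. The sum is an
 * unsigned char, so the addition wraps modulo 256.
 */
unsigned char alias_checksum(const unsigned char name[11])
{
	unsigned char sum = 0;
	int i;

	assert(name != NULL);
	for (i = 0; i < 11; i++)
		sum = (unsigned char)((((sum & 1) << 7) |
				       ((sum & 0xfe) >> 1)) + name[i]);
	return sum;
}
```

A slot whose alias_checksum differs from the value computed for the
following 8.3 entry can then be treated as not belonging to that entry,
for example after old software has renamed the file.
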
Finally, note that the extended name is stored in Unicode. Each Unicode
character takes two bytes.

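Putting the rules above together, the sketch below shows how a long name
could be cut into slots: 13 characters per slot, ids running from 1 to N
with 0x40 or'ed into the last one, a 0x0000 terminator if there is room,
and 0xFFFF in any remaining positions. The names and the interface are
hypothetical, and writing the characters into the three name fields of
the on-disk slot is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CHARS_PER_SLOT	13
#define LAST_SLOT_FLAG	0x40

/* Number of slots needed for a long name of len characters. */
size_t slots_needed(size_t len)
{
	return (len + CHARS_PER_SLOT - 1) / CHARS_PER_SLOT;
}

/*
 * Fill out[13] with the characters for slot number seq (1..N) and
 * return the id byte for that slot.
 */
unsigned char slot_fill(const uint16_t *name, size_t len, size_t seq,
			uint16_t out[CHARS_PER_SLOT])
{
	size_t n = slots_needed(len);
	size_t base, i;

	assert(name != NULL && out != NULL && seq >= 1 && seq <= n);
	base = (seq - 1) * CHARS_PER_SLOT;
	for (i = 0; i < CHARS_PER_SLOT; i++) {
		size_t pos = base + i;

		if (pos < len)
			out[i] = name[pos];
		else if (pos == len)
			out[i] = 0x0000;	/* terminator, if room */
		else
			out[i] = 0xffff;	/* unused characters */
	}
	return (unsigned char)(seq == n ? (seq | LAST_SLOT_FLAG) : seq);
}
```

When the name length is an exact multiple of 13, the last slot is full
and neither the terminator nor any padding is written. On disk the slots
would then be laid out from seq = N down to seq = 1, followed by the 8.3
directory entry.
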