In discussions about these systems, it was clear that the differences between the databases were simply a result of them being separate, and not due to any fundamental disagreements between developers. Everyone is keen to see them merged.
This spec proposes:
A standard format for these files.
A standard location for them.
The new format is very similar to the KDE format. However, only the tags used in this example are valid:
[MIME-Info text/html]
Encoding=UTF-8
Comment=HTML document
Comment[af]=...
[... etc. other translations ]
Patterns=*.htm;*.html;
Contents=(starts-with "<HTML")
Specifically, all KDE-specific tags have been removed, as well as the Icon field. Although all desktops need a way to determine the icon for a particular type, the icon used will depend on the desktop, and not only on the file type.
Although not part of the name-to-type mapping, the Comment field is left in for the sake of not having too many files.
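As an illustration of how little machinery the format needs, the following is a minimal sketch of a reader for it, not a reference implementation. The function names parse_mime_info and split_patterns are hypothetical, and the handling of blank lines and '#' comment lines is an assumption that this document does not state.

```python
def parse_mime_info(text):
    """Parse MIME-Info text into {mime_type: {key: value}}.

    Sketch only: skipping blank lines and '#' comment lines is an
    assumption, as is ignoring groups that are not [MIME-Info ...].
    """
    types = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("[") and line.endswith("]"):
            header = line[1:-1].strip()
            if header.startswith("MIME-Info "):
                mime_type = header[len("MIME-Info "):].strip()
                # The same type may appear more than once; reuse its dict.
                current = types.setdefault(mime_type, {})
            else:
                current = None  # unknown group: ignore its keys
            continue
        if current is not None and "=" in line:
            # Split on the first '=' only, so values may contain '='.
            key, _, value = line.partition("=")
            current[key.strip()] = value.strip()
    return types


def split_patterns(value):
    """Split a Patterns value such as '*.htm;*.html;' into a list."""
    return [p for p in value.split(";") if p]
```

Translated comments such as Comment[af] are simply stored under their full key, so a caller can pick the entry for the user's language.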
KDE's Patterns field replaces GNOME's and ROX's ext/regex fields, since it is trivial to detect a pattern in the form '*.ext' and store it in an extension hash table internally. The full power of regular expressions was not being used by either desktop, and glob patterns are more suitable for filename matching anyway.
Applications MUST first try a case-sensitive match, then a case-insensitive one. This is so that main.C will be seen as a C++ file, but IMAGE.GIF will still use the *.gif pattern.
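The two-pass rule can be sketched as follows. This is only an illustration: match_type is a hypothetical name, the rule list is an assumed in-memory form of the Patterns fields, and the MIME type names in the usage example are placeholders rather than types defined by this document.

```python
from fnmatch import fnmatchcase


def match_type(filename, rules):
    """Return the type for filename, or None.

    rules is a list of (glob_pattern, mime_type) pairs.
    """
    # Pass 1: case-sensitive, so '*.C' and '*.c' stay distinct.
    for pattern, mime_type in rules:
        if fnmatchcase(filename, pattern):
            return mime_type
    # Pass 2: case-insensitive, by lowering both sides.
    lowered = filename.lower()
    for pattern, mime_type in rules:
        if fnmatchcase(lowered, pattern.lower()):
            return mime_type
    return None


# Placeholder type names, for illustration only.
rules = [
    ("*.C", "text/x-c++src"),
    ("*.c", "text/x-csrc"),
    ("*.gif", "image/gif"),
]
```

With these rules, main.C is resolved in the first pass and IMAGE.GIF only in the second.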
The value of the Contents attribute is a Scheme expression. If the expression evaluates to a true value then the file is assumed to be of this type. Since scanning a file's contents can be very slow, applications may choose to do pattern matching first and only fall back to content matching, or not perform it at all.
This is just a vague proposal at the moment. A list of functions to provide is also still needed.
If several patterns match then the longest pattern SHOULD be used. In particular, files with multiple extensions (such as Data.tar.gz) MUST match the longest sequence of extensions (eg '*.tar.gz' in preference to '*.gz'). Literal patterns (eg, 'Makefile') must be matched before all others. It is acceptable to match patterns of the form '*.text' before other wildcarded patterns (that is, to special-case extensions using a hash table).
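One possible way to apply this ordering is sketched below, under the assumption that rules are held as (pattern, type) pairs. The names is_literal and best_match are hypothetical, the type names are placeholders, and case handling is left out to keep the example short.

```python
from fnmatch import fnmatchcase


def is_literal(pattern):
    """True if the pattern contains no glob metacharacters."""
    return not any(ch in pattern for ch in "*?[")


def best_match(filename, rules):
    """Pick a type for filename from (pattern, mime_type) pairs."""
    # Literal patterns are checked before all others.
    for pattern, mime_type in rules:
        if is_literal(pattern) and pattern == filename:
            return mime_type
    # Among wildcard patterns that match, prefer the longest one,
    # so '*.tar.gz' beats '*.gz' for 'Data.tar.gz'.
    candidates = [
        (pattern, mime_type)
        for pattern, mime_type in rules
        if not is_literal(pattern) and fnmatchcase(filename, pattern)
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda rule: len(rule[0]))[1]


# Placeholder type names, for illustration only.
rules = [
    ("*.gz", "application/x-gzip"),
    ("*.tar.gz", "application/x-compressed-tar"),
    ("Make*", "text/x-example"),
    ("Makefile", "text/x-makefile"),
]
```

An implementation that special-cases '*.ext' patterns could replace the candidate scan with a lookup in an extension hash table, trying the longest suffix first.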
If the same pattern is defined twice, the definitions SHOULD be ordered by the directory the rule came from (this allows users to override the system defaults if, for example, they are using a common extension to mean something else). If both came from the same directory, either can be used.
If the same type is defined in several places, the Patterns and Comment fields MUST be merged. If two different comments are provided for the same MIME type in the same language, they should be ordered by directory as before.
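The merging step might look like the sketch below. It assumes that definitions arrive in directory order with the lowest-precedence directory first, and it reads "ordered by directory" as meaning that the comment from the higher-precedence directory wins; both points are interpretations rather than statements of this document. The name merge_definitions and the dictionary layout are hypothetical.

```python
def merge_definitions(definitions):
    """Merge several definitions of one MIME type.

    definitions: list of dicts, lowest-precedence directory first,
    each of the form {"patterns": [...], "comments": {lang: text}}.
    """
    merged = {"patterns": [], "comments": {}}
    for definition in definitions:
        # Patterns are unioned, keeping first-seen order.
        for pattern in definition.get("patterns", []):
            if pattern not in merged["patterns"]:
                merged["patterns"].append(pattern)
        # For the same language, a later directory replaces the comment.
        merged["comments"].update(definition.get("comments", {}))
    return merged
```

Here the empty string could stand for the untranslated Comment key, with language codes such as "af" for the translated ones.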
Common types (such as MS Word Documents) will be provided in the X Desktop Group's package, which SHOULD be required by all applications using this specification. Since each application will then only be providing information about its own type, conflicts should be rare.
Unlike the KDE system, the files are not arranged in the filesystem by type. This approach is only possible for a tightly coordinated system. Consider, for example, that ROX-Filer adds a mapping from .DirIcon to 'image/png'. This cannot be specified in a file called image/png.desktop without conflicting with existing definitions for the type.
Since files are not named by type, each file may contain multiple types. The files should be named by the package that they come from to avoid conflicts and reduce loading times.
The directories to be used to load these files are:
/usr/share/mime/mime-info
/usr/local/share/mime/mime-info
~/.mime/mime-info
Programs modifying any of these files MUST update the modification time on the parent (mime-info) directory so that applications can easily detect the change. The rules from later directories in this list take precedence over conflicting rules from earlier directories. Thus, the user's settings take precedence over all others.
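A sketch of how an application might use these two rules follows: it records each directory's modification time to notice changes, and lets the user's directory override the system ones. The function names and the shape of rules_by_dir are hypothetical; only the three directory paths come from this document.

```python
import os

# Search order from this document; the user's directory comes last
# so that its rules win.
MIME_DIRS = [
    "/usr/share/mime/mime-info",
    "/usr/local/share/mime/mime-info",
    os.path.expanduser("~/.mime/mime-info"),
]


def directory_stamps(dirs):
    """Map each directory to its mtime, or None if it is missing."""
    stamps = {}
    for directory in dirs:
        try:
            stamps[directory] = os.stat(directory).st_mtime
        except OSError:
            stamps[directory] = None
    return stamps


def needs_reload(old_stamps, dirs):
    """True if any directory changed since old_stamps was taken."""
    return directory_stamps(dirs) != old_stamps


def resolve(rules_by_dir, dirs):
    """Combine {directory: {pattern: mime_type}} in search order.

    Entries from directories later in dirs replace earlier ones.
    """
    combined = {}
    for directory in dirs:
        combined.update(rules_by_dir.get(directory, {}))
    return combined
```

Because writers are required to touch the mime-info directory itself, a reader only has to stat three directories rather than every file inside them.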
The system described in this document is intended to allow different programs to see the same file as having the same type. This is to help interoperability. The type determined in this way is only a guess, and an application MUST NOT trust a file based simply on its MIME type. For example, a downloader should not pass a file directly to a launcher application without confirmation simply because the type looks 'harmless' (eg, text/plain).