Frequently Asked Question
What is the file command and how does it know what something is?
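file examines the leading bytes of its argument and compares them against a
large database of magic numbers: short, distinctive byte sequences that file
formats place at known offsets, usually the very start of the file. Most
signatures sit in the first few bytes, though some sit deeper (the "ustar"
marker in a tar archive is at offset 257). A JPEG begins with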
FF D8 FF, a PNG with 89 50 4E 47, an ELF executable with 7F 45 4C 46, a gzip
archive with 1F 8B. The database usually lives at /usr/share/file/magic (or
/usr/share/misc/magic), depending on the system, and you can extend it with
your own rules.
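The prefix check described above can be sketched in a few lines of Python.
This is a minimal illustration, not how file itself is implemented: the names
SIGNATURES, sniff and sniff_path are hypothetical, the table holds only the
four signatures quoted above, and the 512-byte read limit is an arbitrary
assumption.

```python
# Leading-byte signatures quoted in the text above. A real magic database
# holds far more rules, many with offsets and follow-up tests.
SIGNATURES = [
    (b"\xff\xd8\xff", "JPEG image"),      # FF D8 FF
    (b"\x89PNG", "PNG image"),            # 89 50 4E 47
    (b"\x7fELF", "ELF executable"),       # 7F 45 4C 46
    (b"\x1f\x8b", "gzip archive"),        # 1F 8B
]


def sniff(data: bytes) -> str:
    """Return the label of the first signature that prefixes data."""
    for magic, label in SIGNATURES:
        if data.startswith(magic):
            return label
    return "unknown"


def sniff_path(path: str, limit: int = 512) -> str:
    """Read at most `limit` leading bytes of a file and classify them."""
    with open(path, "rb") as fh:
        return sniff(fh.read(limit))
```

Calling sniff_path on a file named data.txt that really holds gzip data would
return "gzip archive", because the name is never consulted.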
Because it sniffs content rather than name, file is right far more often than
file extensions are. It will happily tell you that photo is a JPEG, data.txt
is actually an Ogg Vorbis stream, or /bin/ls is "ELF 64-bit LSB pie executable,
x86-64, dynamically linked". For plain-text input it makes a further pass over
the bytes to guess the encoding and the language: UTF-8 versus ASCII, C source
versus shell script. The MIME-type output (file -i, or the long form
file --mime-type for the type without the charset) is the form most useful to
other programs.
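The encoding guess for text input can be approximated as follows. This is a
simplified sketch under stated assumptions, not file's actual logic, which
also recognises other encodings and looks at control characters: the function
name guess_text_encoding is hypothetical, and the three labels are chosen only
to resemble the charset names file prints.

```python
def guess_text_encoding(data: bytes) -> str:
    """Classify bytes as 'us-ascii', 'utf-8' or 'binary' (simplified)."""
    # Assumption: a NUL byte means the input is not plain text.
    if b"\x00" in data:
        return "binary"
    # Pure 7-bit input is ASCII (and therefore also valid UTF-8).
    try:
        data.decode("ascii")
        return "us-ascii"
    except UnicodeDecodeError:
        pass
    # Otherwise accept it as UTF-8 only if every byte sequence is valid.
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        return "binary"
```

ASCII is tested first because every ASCII file is also valid UTF-8, so the
narrower answer has to win to be reported at all.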