Frequently Asked Question

What is the file command and how does it know what something is?

file examines the leading bytes of its argument and compares them against a large database of magic numbers: short, distinctive byte sequences that file formats place at, or close to, the start of their files. A JPEG begins with FF D8 FF, a PNG with 89 50 4E 47, an ELF executable with 7F 45 4C 46, a gzip archive with 1F 8B. The database typically lives at /usr/share/file/magic (or /usr/share/misc/magic), and you can extend it with your own rules.
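To make the idea concrete, here is a minimal Python sketch of prefix matching against the four signatures mentioned above. It is not how file itself is implemented: the real magic database supports offsets, masks, and nested tests. The names SIGNATURES, sniff_bytes, and sniff_path are made up for illustration.

```python
# Illustrative signature table covering only the four formats named above.
# The real magic database holds far more rules, many of them much richer
# than a plain prefix match.
SIGNATURES = [
    (bytes.fromhex("FFD8FF"), "JPEG image"),
    (bytes.fromhex("89504E47"), "PNG image"),
    (bytes.fromhex("7F454C46"), "ELF"),
    (bytes.fromhex("1F8B"), "gzip archive"),
]


def sniff_bytes(head: bytes) -> str:
    """Return the label of the first signature that prefixes `head`.

    Falls back to "data" when nothing matches.
    """
    for magic, label in SIGNATURES:
        if head.startswith(magic):
            return label
    return "data"


def sniff_path(path: str) -> str:
    """Classify a file by reading only its first few bytes."""
    with open(path, "rb") as fh:
        return sniff_bytes(fh.read(16))
```

Calling sniff_path("photo") on a JPEG that has no extension would return "JPEG image", since the name never enters the decision.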

Because it sniffs content rather than trusting the name, file is right far more often than file extensions are. It will happily tell you that photo is a JPEG, that data.txt is actually an Ogg Vorbis stream, or that /bin/ls is "ELF 64-bit LSB pie executable, x86-64, dynamically linked". For plain-text input it makes a further pass over the bytes to guess the encoding and language: UTF-8 versus ASCII, C source versus shell script. The MIME-type output (file -i on Linux, file -I on macOS) is the form most useful to other programs.
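The encoding pass can be approximated in a few lines of Python. This is a rough sketch under simplifying assumptions, not file's actual algorithm: it treats any NUL byte as a sign of binary data (which misclassifies UTF-16 text) and distinguishes only ASCII from UTF-8. The function name guess_text_encoding is hypothetical, and the returned labels are only modeled on the charset values that file reports.

```python
def guess_text_encoding(data: bytes) -> str:
    """Guess a charset label for a byte buffer.

    Simplification: a NUL byte is taken to mean "binary", so UTF-16
    text is not recognized by this sketch.
    """
    if not data:
        return "empty"
    if b"\x00" in data:
        return "binary"
    try:
        data.decode("ascii")  # succeeds only if every byte is 7-bit
        return "us-ascii"
    except UnicodeDecodeError:
        pass
    try:
        data.decode("utf-8")  # succeeds only for well-formed UTF-8
        return "utf-8"
    except UnicodeDecodeError:
        return "unknown-8bit"
```

Pass the function the whole buffer rather than an arbitrary prefix: cutting the input in the middle of a multi-byte UTF-8 sequence would make valid text look like unknown-8bit.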

Further reading and video