Frequently Asked Question
Why is parsing the output of ls bad?
ls is designed to format directory entries for humans, not for programs. Two specific
problems make parsing its output dangerously unreliable. First, Unix allows every
byte in a filename except / and NUL, so names can contain spaces, tabs, newlines,
and even a literal backslash followed by n, which looks like an escaped newline but
is not one. A loop like for f in $(ls); do rm "$f"; done
breaks the moment you have a file called my report.txt (split into two arguments) or
one with a literal newline in it (split at that newline). Second, ls may
transform its output for display, for example by replacing non-printable bytes with
question marks or adding colour escape codes, depending on the implementation and
its options, so the bytes you read can differ from the real filename.
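
The splitting problem can be reproduced in a scratch directory. The following is a minimal sketch, not a definitive test: the name demo_dir and the three filenames are made up for illustration, and it assumes an ls that prints names unmodified when its output is not a terminal.

```shell
#!/usr/bin/env bash
# Sketch: compare how many items each loop sees for the same three files.
set -eu

demo_dir=$(mktemp -d)             # hypothetical scratch directory
trap 'rm -rf "$demo_dir"' EXIT
cd "$demo_dir"

nl='
'                                 # a literal newline character
touch 'my report.txt' "line1${nl}line2.txt" plain.txt   # three real files

# Fragile: unquoted $(ls) is split on spaces, tabs, and newlines.
ls_count=0
for f in $(ls); do
    ls_count=$((ls_count + 1))
done

# Robust: the glob expands to exactly one word per file.
glob_count=0
for f in *; do
    glob_count=$((glob_count + 1))
done

echo "ls loop saw $ls_count items; glob loop saw $glob_count items"
```

The glob loop sees three items, one per file, while the ls loop sees more because the names with a space and a newline are broken into pieces. With rm in the loop body, those pieces would be passed to rm as if they were filenames.
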
The correct alternatives all bypass ls entirely. To loop over files matching a
pattern, use a shell glob (it works in any POSIX shell, not only bash): for f in *.log; do ...; done. To recurse, use find,
ideally with -print0 and a while IFS= read -r -d '' loop, or with -exec. To get
filenames into an array, bash 4.4+ has mapfile -d '' reading from find -print0.
ShellCheck flags ls parsing with SC2012 and SC2045, and arrays built from unquoted
command substitution with SC2207. Heed these warnings.
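
The alternatives named above can be sketched side by side. This is an illustrative example under stated assumptions: demo_dir, the .log filenames, and the use of wc -c as the command run by -exec are all hypothetical, and the mapfile line needs bash 4.4 or newer.

```shell
#!/usr/bin/env bash
# Sketch of the ls-free alternatives; mapfile -d needs bash 4.4+.
set -eu

demo_dir=$(mktemp -d)             # hypothetical scratch directory
trap 'rm -rf "$demo_dir"' EXIT
mkdir "$demo_dir/sub"
touch "$demo_dir/a.log" "$demo_dir/my report.log" "$demo_dir/sub/deep.log"

# 1. Glob: one word per match, no recursion. The -e test skips the
#    literal pattern that is left behind when nothing matches.
glob_hits=0
for f in "$demo_dir"/*.log; do
    [ -e "$f" ] || continue
    glob_hits=$((glob_hits + 1))
done

# 2. find -print0 with a NUL-delimited read loop: recursive and safe for
#    any filename. Process substitution keeps the loop in the current
#    shell, so the counter survives after the loop ends.
find_hits=0
while IFS= read -r -d '' f; do
    find_hits=$((find_hits + 1))
done < <(find "$demo_dir" -type f -name '*.log' -print0)

# 3. find -exec: find passes the names directly to the command.
find "$demo_dir" -type f -name '*.log' -exec wc -c {} +

# 4. mapfile -d '' collects the NUL-delimited names into an array.
mapfile -d '' -t logs < <(find "$demo_dir" -type f -name '*.log' -print0)

echo "glob: $glob_hits, find loop: $find_hits, array: ${#logs[@]}"
```

The glob only matches the two files at the top level, while both find-based forms also reach sub/deep.log. In every case the name with a space arrives as a single item, because no step passes through ls or through unquoted word splitting.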