bomstrip — strip Byte-Order Marks from UTF-8 text

The bomstrip utility reads a UTF-8 file on its standard input and copies it to the standard output, removing the three-byte BOM (the bytes EF BB BF) from the start if it is present. The fun part is that bomstrip is actually a collection of implementations in many programming languages! For a better explanation of why stripping the BOM from UTF-8 files is good and necessary, please take a look at the real bomstrip page by Mechiel Lukkien.
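
To make the behaviour described above concrete, here is a minimal sketch in Python. It is not one of the implementations shipped with bomstrip; the names strip_bom and UTF8_BOM are made up for this illustration.

```python
UTF8_BOM = b"\xef\xbb\xbf"  # U+FEFF encoded as UTF-8


def strip_bom(src, dst, chunk_size=65536):
    """Copy the binary stream src to dst, dropping a leading UTF-8 BOM.

    Hypothetical helper for illustration only, not bomstrip's own code.
    """
    head = b""
    # A read on a pipe may return fewer bytes than requested, so keep
    # reading until we hold three bytes or reach end of input.
    while len(head) < len(UTF8_BOM):
        piece = src.read(len(UTF8_BOM) - len(head))
        if not piece:
            break
        head += piece
    # Anything that is not exactly the BOM is real data and is passed on.
    if head != UTF8_BOM:
        dst.write(head)
    # Copy the remainder of the input unchanged.
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        dst.write(chunk)
```

Calling strip_bom(sys.stdin.buffer, sys.stdout.buffer) after an import of sys turns this into a filter in the same spirit as bomstrip. The streams are binary on purpose, so the data is never decoded and everything after the BOM passes through byte for byte.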

This is a patchset that adds a couple of new implementations (C++, Awk, and a Perl one-liner so far), a manual page, an additional bomstrip-files utility to strip files in place, and a Makefile to install the whole thing.
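
Stripping a file in place can be sketched as follows, again in Python. This only illustrates the general idea and is an assumption about the approach, not the actual bomstrip-files code; the function name strip_bom_in_place is invented for this example.

```python
import os
import shutil
import tempfile

UTF8_BOM = b"\xef\xbb\xbf"  # U+FEFF encoded as UTF-8


def strip_bom_in_place(path):
    """Remove a leading UTF-8 BOM from the file at path.

    Returns True if the file was rewritten, False if it had no BOM.
    Hypothetical helper for illustration only.
    """
    with open(path, "rb") as f:
        if f.read(len(UTF8_BOM)) != UTF8_BOM:
            return False  # no BOM, leave the file untouched
        rest = f.read()

    # Write the BOM-less contents to a temporary file in the same
    # directory, then swap it into place so a failure part-way through
    # never leaves a truncated original behind.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as out:
            out.write(rest)
        shutil.copymode(path, tmp)  # keep the original permission bits
        os.replace(tmp, path)
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise
    return True
```

This sketch reads the whole file into memory, which is fine for typical text files; a streaming copy would be the better choice for very large inputs.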

The bomstrip utility may be fetched from its real homepage at Mechiel Lukkien's site, or you may get it here, along with my patchset :)

The latest release is bomstrip-9; with the patchset applied, it becomes bomstrip-9-roam-02:

