[GTALUG] tr: Illegal byte sequence

Stewart C. Russell scruss at gmail.com
Wed Sep 26 10:58:06 EDT 2018


On 2018-09-26 10:43 AM, Giles Orr via talk wrote:
> 
> I'd really like to understand what the problem is, why 'tr' barfs, and
> what the 'locale' settings have to do with this.  Thanks.

tr on macOS seems to assume its input is valid UTF-8 text when the
locale is a UTF-8 one, and random bytes from /dev/urandom almost never
are. You can set your tr string to something trivial and it still barfs:

    dd if=/dev/urandom bs=1 count=256 2>/dev/null | tr -dc 'A-Za-z0-9' |
        head -c 32
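
To see the locale's part in this, here is a minimal sketch that runs tr
alone in the C locale, where every byte counts as one character. It
assumes your tr honours LC_ALL the way POSIX utilities are expected to;
the variable name pw is only for illustration.

```shell
# Override the locale for tr only, so it treats its input as raw bytes
# instead of trying to decode it as UTF-8.
pw=$(dd if=/dev/urandom bs=1 count=256 2>/dev/null |
    LC_ALL=C tr -dc 'A-Za-z0-9' |
    head -c 32)
printf '%s\n' "$pw"
```

Only tr sees the override, so the rest of the shell session keeps its
usual UTF-8 locale.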

A portable hack might be to use iconv to say that the input is an 8-bit
charset:

    dd if=/dev/urandom bs=1 count=256 2>/dev/null | iconv -f ISO-8859-1 |
        tr -dc 'A-Za-z0-9!@$%^&*(){}[]=+-_/?\|~`' | head -c 32
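
Why this should work: every byte value is a valid ISO-8859-1 character,
so iconv always has something legal to emit. A small sketch of that,
with -t UTF-8 spelled out for the demonstration (the command above
relies on iconv's default target encoding instead) and hex as an
illustrative variable name:

```shell
# Byte 0xFF on its own is never valid UTF-8, but it is a valid
# ISO-8859-1 character (U+00FF). \377 is the octal escape for 0xFF.
hex=$(printf '\377' | iconv -f ISO-8859-1 -t UTF-8 | od -An -tx1 | tr -d ' \n')
echo "$hex"   # c3bf, the two-byte UTF-8 encoding of U+00FF
```

By the time tr sees the stream it is well-formed UTF-8, so there is no
illegal byte sequence left to complain about.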

cheers,
 Stewart

