Removing junk characters from text files?

Peter L. Peres plp-ysDPMY98cNQDDBjDh4tngg at public.gmane.org
Fri Feb 11 13:47:54 UTC 2005



On Thu, 10 Feb 2005, William O'Higgins wrote:

> The problem is that I don't know how to obtain values for $junkcharacter
> based on the crap I see on the screen.  F'rinstance, a CRLF shows up as
> ^M in vim (with a line break) and I know that that is called "\r" in
> my replacement string - but I don't know what to call some of this other
> stuff that I see.  I can't copy/paste it, because it is represented on
> the screen as something other than what is found with a regex.  Does
> that help?

You can catch them with a variety of commands, among others a simple
sed filter that lets only printable ASCII (and tabs) through:

sed -e 's/[^[:print:]\t]//g' <infile >outfile

You can run a similar command in vi; in Vim, for example,
:%s/[^[:print:]\t]//g should do much the same.
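
To see what the junk bytes actually are before stripping them, something
like the following may help. It is only a sketch: the sample contents are
made up, it assumes GNU sed (which treats \t inside the brackets as a tab),
and the LC_ALL=C setting is an addition here so that only ASCII counts as
printable. The file names infile and outfile are the same placeholders used
in the sed filter.

```shell
# Hypothetical sample: "abc", a carriage return, a BEL (octal 007),
# then "def", a tab and "ghi".
printf 'abc\r\007def\tghi\n' > infile

# Dump every byte; od -c shows non-printing bytes as escapes such as \r
# or as octal codes, which gives you a name for each junk character.
od -c infile

# Same filter as in the sed command, forced into the C locale
# (assumption: GNU sed, so \t in the bracket expression means a tab).
LC_ALL=C sed -e 's/[^[:print:]\t]//g' <infile >outfile

# Inspect the result; the CR and BEL should be gone, the tab kept.
od -c outfile
```

Comparing the two od dumps shows exactly which bytes the filter dropped.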

Peter
--
The Toronto Linux Users Group.      Meetings: http://tlug.ss.org
TLUG requests: Linux topics, No HTML, wrap text below 80 columns
How to UNSUBSCRIBE: http://tlug.ss.org/subscribe.shtml




