Test for invalid unicode in file name

Madison Kelly linux-5ZoueyuiTZhBDgjK7y7TUQ at public.gmane.org
Fri May 6 03:53:49 UTC 2005


Thanks for the reply!

The trick, though, is that I have several valid Unicode file names (i.e., 
files using Japanese kana/kanji characters). These file names are 
accepted just fine, and it is important that Unicode support remains. If 
there is a regex that caught all valid Unicode and wasn't too expensive, 
that would be great.
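
As an alternative to a regex, here is a minimal sketch that uses the core 
Encode module to test whether a file name's raw bytes decode as strict 
UTF-8. The helper name is_valid_utf8 is made up for illustration, and the 
sketch assumes the names reach the script as undecoded bytes (as readdir 
returns them) and that the database expects UTF-8.

```perl
use strict;
use warnings;
use Encode qw(decode FB_CROAK LEAVE_SRC);

# Hypothetical helper: returns 1 if the raw bytes of a file name decode
# as strict UTF-8, and 0 otherwise.
sub is_valid_utf8 {
    my ($bytes) = @_;
    # 'UTF-8' (with the hyphen) is Encode's strict variant. FB_CROAK makes
    # decode() die on malformed input; LEAVE_SRC leaves $bytes unmodified.
    return eval { decode('UTF-8', $bytes, FB_CROAK | LEAVE_SRC); 1 } ? 1 : 0;
}

# Example: an ASCII name, a kana name (UTF-8 bytes), and a name
# containing the byte 0xFF, which never occurs in valid UTF-8.
for my $name ("notes.txt", "\xe3\x81\x82.txt", "Femme Fatal\xff.url") {
    print is_valid_utf8($name) ? "valid" : "invalid encoding", "\n";
}
```

Names that fail the check could be skipped or logged before they are 
written into the COPY data. Valid kana/kanji names pass unchanged, since 
the check only rejects byte sequences that are not well-formed UTF-8.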

Madison

billt-lxSQFCZeNF4 at public.gmane.org wrote:
> Hi.
> 
> I'm not sure if this will work, but try to match all the valid characters.
> 
> if ($filename =~ m/^[A-Za-z0-9]+$/) {
>   # do something
> }
> 
> Just a thought
> 
> Bill
> 
> On Thu, May 05, 2005 at 10:20:48PM -0400, Madison Kelly wrote:
> 
>>Hi all,
>>
>>   I've run into a problem where a bulk postgres "COPY..." statement is 
>>dying because one of the lines contains a file name with an invalid 
>>unicode character. In nautilus this file has '(invalid encoding)' and 
>>the postgres error is 'CONTEXT:  COPY file_info_3, line 228287, column 
>>file_name: "Femme Fatal\uffff.url"'.
>>
>>   Is there a way in perl (something like 'stat') where I can check to 
>>make sure a file name has valid encoding? If there is, then I can catch 
>>this problem before adding it to, and corrupting, my COPY statement. I 
>>already 'quote' the file names first but that didn't catch it.
>>
>>   Thanks!
>>
>>Madison


-- 
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Madison Kelly (Digimer)
TLE-BU, The Linux Experience; Back Up
http://tle-bu.thelinuxexperience.com
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-