I'm puzzled by this perl behaviour
D. Hugh Redelmeier
hugh-pmF8o41NoarQT0dZR+AlfA at public.gmane.org
Sat Sep 20 20:05:37 UTC 2003
[Disclaimer: I don't actually know Perl.]
I don't understand why the regular expression match in the following
Perl script fails if and only if the environment is utf8.
If the RE element can be matched without a "+" suffix, surely it can
match with a "+" suffix. Matching exactly once should be a stronger
condition than matching at least once.
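To make that expectation concrete, here is a minimal sketch that applies both forms of the pattern to a fixed string. The variable names ($line, $once, $many) are assumptions made for illustration and are not part of whacky.pl. Because the string is a literal rather than locale-dependent input, this is not expected to reproduce the failure; it only shows what a correct match should return.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical test string; whacky.pl reads its input from STDIN instead.
my $line = "package";

# Without "+": the character class must match exactly one character.
my $once = ($line =~ /^\s*([^\s=+])/) ? $1 : undef;

# With "+": the character class must match one or more characters.
my $many = ($line =~ /^\s*([^\s=+]+)/) ? $1 : undef;

print "once: ", (defined $once ? $once : "no match"), "\n";
print "many: ", (defined $many ? $many : "no match"), "\n";
```

On a Perl where both matches behave as expected, this should report "p" for the first pattern and "package" for the second, since any string that satisfies the first pattern also satisfies the second.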
My guess is that there is some Perl feature that I don't know about
that explains this behaviour.
Help!
Hugh Redelmeier
hugh-pmF8o41NoarQT0dZR+AlfA at public.gmane.org voice: +1 416 482-8253
PS: you can use en_CA in place of en_US. I wrote en_US in the hope
that this is a more debugged setting.
================ whacky.pl ================
#!/usr/bin/perl
# whacky.pl: demonstrate oddity in Red Hat Linux 9.0's Perl (5.8.0)
# works: echo "package" | LANG=en_US ./whacky.pl
# fails: echo "package" | LANG=en_US.utf8 ./whacky.pl
use Data::Dumper;
print $ENV{"LANG"}, "\n";
while (<>) {
    # print Dumper($_);
    chop;
    # $_ = "package=defaults";
    print Dumper($_);
    #happy if( /^\s*([^\s=+])/ ){
    #sad
    if( /^\s*([^\s=+]+)/ ){
        die "$1 happy";
    }
    else {
        warn "unknown input in \"$ARGV\" line $. of: $_\n";
        die "sad";
    }
}
================ end ================
--
The Toronto Linux Users Group. Meetings: http://tlug.ss.org
TLUG requests: Linux topics, No HTML, wrap text below 80 columns
How to UNSUBSCRIBE: http://tlug.ss.org/subscribe.shtml