algebraic operations on a RegEx?

Richard Dice rdice-e+AXbWqSrlAAvxtiuMwx3w at public.gmane.org
Sun Apr 5 03:33:40 UTC 2009


>
>    It's simpler with awk than perl:
>
> awk -v add=$1 '/<size>[0-9]+<\/size>/ {
>   n = $0
>   gsub( /[^0-9]/,"",n)
>   gsub( /[0-9]+/, n + add) }
>   { print }
> ' "$2" > tempfile$$ && mv tempfile$$ "$2"
>

Them is fighting words.

[richarddice ~]$ cat input.xml
<xmlsample>
 <size>13</size>
</xmlsample>

[richarddice ~]$ perl -pe "s/<size>(\d+)<\/size>/'<size>' . (\$1 + 4) .
'<\/size>'/e" input.xml
<xmlsample>
 <size>17</size>
</xmlsample>

"perl -e" means "execute the following perl code from the shell command
line"
"perl -p" means "wrap the following perl code in an implicit 'while (<>) {
... } continue { print; }' block"

Adding them together gives "perl -pe".

One way to save your newly-modified output is with shell STDOUT redirection:

[richarddice ~]$ perl -pe "s/<size>(\d+)<\/size>/'<size>' . (\$1 + 4) .
'<\/size>'/e" input.xml > output.xml

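But if you wanted to be hardcore you could use "perl -i", which is in-place
file editing.  It is tempting to combine the invocation flags into "perl
-pie" (Uuummm... pie!), but that does not work: -i takes an optional backup
suffix, so perl reads "-pie" as -p plus -i with the suffix "e" and never
sees -e.  Write "perl -pi -e" or "perl -i -pe" instead.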

The /e modifier on the s/// substitution construct makes perl evaluate the
replacement as a Perl expression instead of treating it as a string, which
is what lets the math happen in-place.

Note that I had to escape the $1 to make it \$1 in the above command line
example, lest the shell expand it inside the double quotes.  (This took me
much longer to figure out than the actual Perl one-liner.)  If you do this
in the confines of a program then it's not needed:

[richarddice ~]$ cat ./regex_math.pl
#!/usr/bin/perl -p

use warnings;
use strict;

s/<size>(\d+)<\/size>/"<size>" . ($1+4) . "<\/size>"/e;

[richarddice ~]$ perl ./regex_math.pl input.xml
<xmlsample>
 <size>17</size>
</xmlsample>

Personally, of the approaches I've seen so far I like the Python one, as it
cares about the XML structure and doesn't just blindly chop into the text of
the file.  I could probably dig up an equivalent Perl module to "import" to
do the same thing, but it was just too much of a delight to use -p and s///e
in the same place to pass up the opportunity.

Cheers,
 - Richard