Advise: Programming Language to Learn

CLIFFORD ILKAY clifford_ilkay-biY6FKoJMRdBDgjK7y7TUQ at public.gmane.org
Mon Apr 9 19:37:42 UTC 2007


On Monday 09 April 2007, Lennart Sorensen wrote:
> On Mon, Apr 09, 2007 at 01:37:49PM -0400, Stephen wrote:
> > I often need to do utility programs to, for example, extract
> > email addresses from a text file.
>
> Well for munging text, it is hard to beat perl (other than for
> writing understandable pretty code).  I suspect python handles text
> quite well too, and is certainly a much cleaner language.  awk
> could probably do it too, but I wouldn't want to inflict that on
> people in general.
>
> > I used to use OS/2 REXX and it was great and fast.
> >
> > What would be a good choice to use on my Linux system?
> >
> > I know Pascal well, C fair and PHP pretty well.
>
> I would say try python.  Quite popular, with lots of handy
> libraries available.

I second Len's suggestion. Python has excellent string and regular
expression modules, and because it is an expressive, compact,
approachable, and orthogonal language, this sort of thing is easy to
do. Like Perl, Python is included with virtually every Linux distro
and OS X. HP apparently includes it on their Windows machines as well.
It will certainly be no harder to use Python than shell or Perl, but it
can be a lot easier. The only challenging bit in your particular case
is identifying email addresses. If you are certain that "@" is not used
anywhere in that file except in email addresses, or better yet, that
email addresses are denoted by "mailto:", you can use the "re" module
to find them.
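
As a rough sketch of the "re" approach, the snippet below tries both
ideas: a loose search around "@" and a stricter search for the
"mailto:" prefix. The patterns, the function names, and the sample
text are my own assumptions for illustration; the pattern is
deliberately simple and is not a full address validator.

```python
import re

# Loose pattern: "something@host.tld". An assumption for illustration,
# not an RFC 5322 validator.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")

# Stricter variant for text where every address follows "mailto:".
# The parentheses capture only the address, not the prefix.
MAILTO_RE = re.compile(r"mailto:([\w.+-]+@[\w-]+(?:\.[\w-]+)+)")


def find_addresses(text):
    """Return every address-like string in text, in order of appearance."""
    return EMAIL_RE.findall(text)


def find_mailto_addresses(text):
    """Return only the addresses that follow a "mailto:" prefix."""
    return MAILTO_RE.findall(text)


# Hypothetical sample text
sample = 'Write to <a href="mailto:alice@example.com">Alice</a> or bob@example.org.'
print(find_addresses(sample))
print(find_mailto_addresses(sample))
```

The "mailto:" variant misses the bare address at the end of the
sample, which is the trade-off for fewer false positives.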

Here is an example from a recent bit of code I wrote for importing 
data from a tab-delimited text file into a PostgreSQL-backed Django 
application. This code populates a "Location" taxonomy vocabulary 
using the Django ORM (Object Relational Mapper). The final result is 
a tree that looks something like this:

root
- Base
-- Location
--- Ontario
---- North Bay
---- Ottawa
---- Toronto
---- Waterloo
---- Windsor

# Begin example
# Bring the csv module into the current namespace
import csv
# Bring the Taxonomy class from the taxonomy models module into the
# current namespace
from taxonomy.models import Taxonomy
# Create the root node for the taxonomy tree
root=Taxonomy.Create_super_root()
# Create a Base node to which we can add vocabularies
base_cat=root.add_child("Base")
# Create the Location vocabulary
location=base_cat.add_child("Location")
# Add a term called "Ontario" to the Location vocabulary
ont = location.add_child("Ontario", value="ON")
# Open the tab-delimited text file containing Ontario cities
file_ref = open("data/on-cities.txt")
# Instantiate a csv reader object
the_file = csv.reader(file_ref, delimiter="\t")
# Iterate over the_file
for the_row in the_file:
    # Add a city as a child to the Ontario term by referencing the
    # first column in the_row
    ont.add_child(the_row[0])
# Close the file handle when there are no more rows in the_file
file_ref.close()
# End example
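
The example above needs the Django project it came from in order to
run. A self-contained sketch of just the "csv" part might look like
the following, assuming in-memory sample data in place of the real
file and a plain nested dict in place of the taxonomy tree. The
first_column helper and the second column of the sample are
hypothetical.

```python
import csv
import io

# Illustrative sample standing in for a tab-delimited file; the second
# column is a hypothetical extra field that the code ignores.
sample = "Toronto\t416\nOttawa\t613\nWindsor\t519\n"


def first_column(lines):
    """Return the first field of every non-empty tab-delimited row."""
    reader = csv.reader(lines, delimiter="\t")
    return [row[0] for row in reader if row]


# io.StringIO lets the in-memory text be read like an open file
cities = first_column(io.StringIO(sample))

# A nested dict stands in for the taxonomy tree
tree = {"Base": {"Location": {"Ontario": sorted(cities)}}}
print(tree)
```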

The general form of what you would have to do is similar, but instead
of creating a taxonomy, which is specific to my use case, you would
be parsing each the_row in the_file to extract the email addresses and
doing something with them. You might, for example, send an email to
each of them. Assuming the file you have to deal with is not a
delimited file, you will probably not use the "csv" module to read it
but just open it directly using "open" and iterate over its lines. It
depends on what you are trying to accomplish. In pseudocode, it would
look something like this:

# Begin example
the_file = open('/path/to/my/file')
for the_row in the_file:
    extract the email address from the_row
    do something with the email address - save to a file, send email, etc.
the_file.close()
# End example
-- 
Regards,

Clifford Ilkay
Dinamis Corporation
3266 Yonge Street, Suite 1419
Toronto, ON
Canada  M4N 3P6

<http://dinamis.com>
+1 416-410-3326