automate printing of html-formatted pages?
Matt Price
matt.price-H217xnMUJC0sA/PxXw9srA at public.gmane.org
Tue Sep 13 13:46:08 UTC 2005
hey, we're back on line!
I'll try this one again:
Hi folks,
My partner needs to print out copies of all the content in her
mid-sized, statically-generated website (I know this is a stupid idea,
but it's for her tenure file and there are lots and lots of stupid
elements in this process). This seems like something one ought to be
able to do automatically, e.g. with:
wget -m -k http://some.website.com/
and then:
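#! /bin/bash
# Convert each mirrored HTML file to PostScript next to the original.
find /path/to/top/level -type f -iname '*.html' | while read -r file; do
    html2ps -gn "$file" > "$file".ps
done
# Send the generated PostScript files (not the HTML) to the printer.
find /path/to/top/level -type f -iname '*.ps' | while read -r psfile; do
    lpr "$psfile"
done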
Unfortunately, this doesn't work very well -- among other things,
html2ps does some very strange things with the layout of the pages,
apparently trying to cram all the text even in very long pages into a
single 8.5x11 sheet of paper -- I've posted an example at
http://www.racesci.org/test.ps (the original html file is at
http://www.racesci.org/bibliographies/current_scholarship/sigerist.html ).
Presumably this has something to do with the rendering bug mentioned in
the html2ps man page:
    Rendering HTML tables well is a non-trivial task. For "real"
    tables, that is representation of tabular data, html2ps usually
    generates reasonably good output. When tables are used for layout
    purposes, the result varies from good to useless. This is because
    a table cell is never broken across pages. So if a table contains
    a cell with a lot of content, the entire table may have to be
    scaled down in size in order to make this cell fit on a single
    page. Sometimes this may even result in unreadable output.
OK, I can see this is difficult to do. But is there another
command-line solution to my problem? It bugs me that there isn't a
simple tool that Just Works.
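For anyone who wants to reproduce this, here is a slightly more
defensive sketch of the convert-and-print step above, with both loops
folded into a single pass. CONVERT and PRINT are names made up for this
sketch (they are not html2ps or lpr options); they only exist so the
loop can be tried with stand-in commands before any paper is involved.
It does nothing about the table-scaling problem itself.

```shell
#!/bin/sh
# Sketch only. CONVERT and PRINT are hypothetical override variables,
# defaulting to the two commands used in the script above.
: "${CONVERT:=html2ps}"
: "${PRINT:=lpr}"
export CONVERT PRINT

print_tree() {
    # find passes the filenames to sh as arguments, so names containing
    # spaces are never word-split or glob-expanded.
    find "$1" -type f -iname '*.html' -exec sh -c '
        for file do
            # Only print if the conversion step reported success.
            "$CONVERT" -gn "$file" > "$file.ps" && "$PRINT" "$file.ps"
        done
    ' sh {} +
}

# Usage: print_tree /path/to/top/level
```

Setting PRINT to something harmless such as "echo" first should show
which files would be sent to the printer without actually printing.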
Thanks as always for your suggestions.
Matt
--
The Toronto Linux Users Group. Meetings: http://tlug.ss.org
TLUG requests: Linux topics, No HTML, wrap text below 80 columns
How to UNSUBSCRIBE: http://tlug.ss.org/subscribe.shtml