extracting text from PDF file
Fred Nastos
nastos-JAjqph6Yjy8fbXvGcxQkLSwD8/FfD2ys at public.gmane.org
Fri Apr 23 19:59:59 UTC 2004
On April 23, 2004 03:32 pm, Stewart C. Russell wrote:
> Fred Nastos wrote:
> > Does anyone have a good way to extract images from a PDF file?
>
> pdfimages, from the xpdf package: <http://www.foolabs.com/xpdf/>
I've tried pdfimages; it doesn't work for the document I'm
interested in. The document has some funny (i.e. non-typical)
way of including EPS images.
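For anyone following along, the pdfimages suggestion amounts to roughly
the following. This is only a minimal sketch, assuming the xpdf
`pdfimages` binary is installed and on the PATH; the helper names
(`build_pdfimages_command`, `extract_images`) and the file names are
made up for illustration. The `-j` option asks pdfimages to save
DCT-encoded images as JPEG files instead of converting them to PPM/PBM.

```python
import shutil
import subprocess


def build_pdfimages_command(pdf_path, image_root, jpeg=True):
    """Return the argv list for: pdfimages [-j] PDF-file image-root."""
    cmd = ["pdfimages"]
    if jpeg:
        # -j: write DCT (JPEG) images as .jpg rather than PPM/PBM
        cmd.append("-j")
    cmd += [pdf_path, image_root]
    return cmd


def extract_images(pdf_path, image_root, jpeg=True):
    """Run pdfimages; raise if the tool is missing or exits non-zero."""
    # Assumption: pdfimages comes from an installed xpdf package.
    if shutil.which("pdfimages") is None:
        raise FileNotFoundError("pdfimages not found on PATH")
    cmd = build_pdfimages_command(pdf_path, image_root, jpeg)
    subprocess.run(cmd, check=True)
```

Calling `extract_images("document.pdf", "img")` would write one file
per extracted image, each named after the `img` root. Note that
pdfimages only saves embedded raster images, so artwork drawn as
vector graphics in the page content is not extracted.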
> While I'm here, I might as well also mention pdftohtml
> <http://pdftohtml.sourceforge.net/>, which makes a fantastic job of
> converting PDF to HTML layouts. F'rinstance, I generated this
> <http://www.peck.ca/grhcc/portland_agenda_html/index.html> from
> <http://www.peck.ca/grhcc/portland_agenda.pdf>.
>
> Stewart
Thanks. I just tried it, and it works quite well for some
documents, but not for the one I'm working with right now.
I guess I'll keep trying, or ask a friend who runs Windows to
extract them for me. Thanks.
--
The Toronto Linux Users Group. Meetings: http://tlug.ss.org
TLUG requests: Linux topics, No HTML, wrap text below 80 columns
How to UNSUBSCRIBE: http://tlug.ss.org/subscribe.shtml