extracting text from PDF file

Stewart C. Russell scruss-rieW9WUcm8FFJ04o6PK0Fg at public.gmane.org
Fri Apr 23 19:32:35 UTC 2004


Fred Nastos wrote:
> 
> Does anyone have a good way to extract images from a PDF file?

pdfimages, from the xpdf package: <http://www.foolabs.com/xpdf/>
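
If you'd rather drive it from a script, something like the sketch below should work. It assumes pdfimages is on your PATH; the function names, the "images" output directory and the "img" prefix are made up for illustration. The only pdfimages option used is -j, which writes JPEG-encoded images out as .jpg files.

```python
import subprocess
from pathlib import Path


def build_pdfimages_command(pdf_path, image_root, keep_jpeg=True):
    """Return the argument list for one pdfimages run (hypothetical helper)."""
    cmd = ["pdfimages"]
    if keep_jpeg:
        # -j saves DCT-encoded (JPEG) images as .jpg files instead of
        # converting them to PPM/PBM.
        cmd.append("-j")
    # pdfimages takes the PDF file first, then the root name for output files.
    cmd += [str(pdf_path), str(image_root)]
    return cmd


def extract_images(pdf_path, out_dir="images", prefix="img"):
    """Run pdfimages and return the files it wrote, sorted by name."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    # check=True raises CalledProcessError if pdfimages exits with an error.
    subprocess.run(build_pdfimages_command(pdf_path, out / prefix), check=True)
    # Output files are named <prefix>-<number>.<ext>; the digit count varies
    # between versions, so match on the prefix only.
    return sorted(out.glob(prefix + "-*"))
```

Calling extract_images("some.pdf") would then leave the extracted files under images/ and hand back their paths.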

While I'm here, I might as well also mention pdftohtml 
<http://pdftohtml.sourceforge.net/>, which makes a fantastic job of 
converting PDF to HTML layouts. F'rinstance, I generated this 
<http://www.peck.ca/grhcc/portland_agenda_html/index.html> from 
<http://www.peck.ca/grhcc/portland_agenda.pdf>.
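
For the curious, the plain invocation is just the PDF name plus an optional output name. A minimal Python wrapper, assuming pdftohtml is on your PATH, might look like this; the function names are hypothetical, and I have left out every option, so this is not necessarily how the page above was generated.

```python
import subprocess


def build_pdftohtml_command(pdf_path, output_name=None):
    """Return the argument list for a default pdftohtml run (hypothetical helper)."""
    cmd = ["pdftohtml", str(pdf_path)]
    if output_name is not None:
        # Optional second argument: the name to use for the generated HTML.
        cmd.append(str(output_name))
    return cmd


def pdf_to_html(pdf_path, output_name=None):
    """Run pdftohtml with default settings; raises if the tool reports an error."""
    subprocess.run(build_pdftohtml_command(pdf_path, output_name), check=True)
```

With a local copy of the file, pdf_to_html("portland_agenda.pdf") would write the HTML files next to it.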

  Stewart
--
The Toronto Linux Users Group.      Meetings: http://tlug.ss.org
TLUG requests: Linux topics, No HTML, wrap text below 80 columns
How to UNSUBSCRIBE: http://tlug.ss.org/subscribe.shtml




