The comp.text.pdf Frequently Asked Questions
Q: Is there an easy way to turn a PDF into a TIFF [or other raster / bitmap files]? I've tried exporting as an EPS then opening it in Photoshop, but Photoshop is giving me an error message. [Q64]
A: I just did it with Illustrator 9.
Jim K (jkajpust@concentricRATS.net) [A105]
A: Try using ghostscript from http://www.ghostscript.com
Michael Still (mikal@stillhq.com) [A106]
A: Try Konvertor_pdf2xxx (http://www.logipole.com)
Jean Piquemal (j.piquemal@wanadoo.fr) [A107]
A: BCL Computers' Freebird software is available as a command line program. Freebird converts PDF to TIFF, JPEG, and BMP. For more information, visit our website at http://www.bcl-computers.com
Rachel Burnsed (burnsed@bcl-computers.com) [A108]
A: Try Ghostscript, e.g.:
gs -sDEVICE=jpeg -sOutputFile=temp.%03d.jpg -c save pop -f your_file.pdf
This will create one JPEG file for every page, with the "%03d" replaced by a zero-padded 3-digit page number.
Helge Blischke (H.Blischke@srz-berlin.de) [A109]
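The "%03d" in the output file name follows C printf conventions, and Python's % operator formats this particular pattern the same way, so the expected file names can be previewed before running the conversion. This is only a sketch: the function name and the page count of 3 are illustrative assumptions, not part of Ghostscript.

```python
def expected_output_names(template, page_count):
    """Return the file names a printf-style template yields for pages 1..page_count."""
    # Ghostscript numbers the output pages starting at 1.
    return [template % page for page in range(1, page_count + 1)]


if __name__ == "__main__":
    for name in expected_output_names("temp.%03d.jpg", 3):
        print(name)
```

For a three-page PDF this lists temp.001.jpg, temp.002.jpg and temp.003.jpg.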
A:
gs command line options:
ad b: -r300
ad c,d: -sDEVICE=png... -sOutputFile=filepg%03d.png
(where the "..." denote one of the variants of the png devices)
ad a: -c save pop -f your_pdf_file
(the "-c save pop" is recommended but not strictly necessary)
ad e: not possible with command-line switches. You'd have to write a
PostScript wrapper that determines the current page size and
orientation (using the BeginPage approach).
Helge Blischke (H.Blischke@srz-berlin.de) [A175]
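To get a feel for what -r300 does to the output size: PDF page dimensions are expressed in points (1/72 inch), so the pixel dimensions are roughly the page size in points, divided by 72, times the resolution. Below is a minimal sketch of that arithmetic. The US Letter page (612 x 792 points) is just an example, the function name is made up for illustration, and Ghostscript's own rounding may differ slightly for sizes that do not divide evenly.

```python
def raster_size(width_pt, height_pt, dpi):
    """Approximate pixel dimensions of a page rendered at the given resolution."""
    # 72 points make one inch; dpi is pixels per inch.
    return (round(width_pt / 72 * dpi), round(height_pt / 72 * dpi))


if __name__ == "__main__":
    print(raster_size(612, 792, 300))  # US Letter at 300 dpi
    print(raster_size(612, 792, 72))   # same page at 72 dpi
```

At 300 dpi a US Letter page comes out at 2550 x 3300 pixels, which is worth keeping in mind when choosing between the colour and monochrome png devices.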
Q: I'm trying to convert TIFF files to PDF files. Preferably, I'd like to do this on a Unix (Solaris) server, and it definitely has to be something that can run in the background, unattended. [Q65]
A: PDFlib does this. And yes, we will answer your support mails :-) There's also a dedicated PDFlib mailing list available.
Thomas Merz (tm@pdflib.com) [A110]
A: Panda is a free (as in GNU GPLed) API which runs on various Unices, Linux and Windows and can do this sort of conversion work. Have a look at http://www.stillhq.com for more details.
Michael Still (mikal@stillhq.com) [A111]
A: `convert' (part of the ImageMagick tools) can translate almost any image format to almost any other. So `convert your.tiff your.pdf' will do what you want. I just tried it on a TIFF file lying around and it worked fine. Convert is tremendously useful.
On Linux it is part of the ImageMagick RPM package. For Solaris I guess you have to get the source and compile. If rpm runs under Solaris, you might be able to get the src rpm and do rpm --rebuild ImageMagick...src.rpm where the ... depends on what version you get.
Go to and you'll be redirected to a page pointing to tons of versions of the RPM (for various releases of many different Linux distributions). They list
http://www.imagemagick.org
as the source for the software but it seems to be down right now.
Sanjoy Mahajan (sanjoy@skye.ra.phy.cam.ac.uk) [A112]
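For the unattended, background case in the question, the single `convert' call above can be wrapped in a small batch script. The following is only a sketch under stated assumptions: the function names and directory layout are hypothetical, Unix-style paths are used to match the Solaris setting, and `convert' is assumed to be on the PATH (newer ImageMagick releases prefer the `magick' command instead).

```python
import subprocess
from pathlib import PurePosixPath


def build_convert_commands(tiff_paths, dst_dir):
    """Build one `convert in.tiff out.pdf` argument list per input file."""
    commands = []
    for tiff in tiff_paths:
        # Keep the base name, swap the extension for .pdf, place it in dst_dir.
        pdf = PurePosixPath(dst_dir) / (PurePosixPath(tiff).stem + ".pdf")
        commands.append(["convert", str(tiff), str(pdf)])
    return commands


def run_all(commands):
    """Run each command in turn; raises CalledProcessError on the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Printing instead of running keeps the sketch safe to try without ImageMagick.
    for cmd in build_convert_commands(["in/a.tif", "in/b.tiff"], "out"):
        print(" ".join(cmd))
```

Passing argument lists rather than a shell string avoids quoting problems with file names that contain spaces. A script like this could be started from cron or with nohup to run in the background.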
A: You can use "DaVince Tools" at http://www.davince.com. The tiff2pdf converter does exactly what you specified, except for the hidden text (are you referring to OCR?). It does recursion, 1 to 1 and many to 1 PDF conversion and thumbnail generation.
Dan Cogliano (dan@davince.com) [A208]
A: You could convert the PDF file to another format. BCL Computers' Drake software will convert PDF to RTF for editing in Word; however, it will also preserve page breaks. If you use BCL Magellan for converting PDF to HTML, and select the "HTML3" option, you should get an output file which can be opened and edited in Word. Free demos of these Acrobat plug-ins are available at http://www.bcl-computers.com.
Rachel Burnsed (burnsed@bcl-computers.com) [A113]
A: You can convert the PDF file to text with Ghostscript.
http://www.cs.wisc.edu/~ghost/
A. Sinan Unur (asu1@cornell.edu) [A196]
A: Try xpdf, "pdftotext" is your friend. http://www.foolabs.com/xpdf/
Frank M. Siegert (frank@this.net) [A198]
A: Acrobat 5 does a nice job of this. It can also create RTF files.
Dan Sideen (dansideen@home.com) [A200]
A: There are several ways to get PDF content into Word. A few options: Acrobat 5 has a "Save as RTF" feature. This exports the text with little formatting.
BCL Computers' Jade and Drake software allow you to convert PDF to RTF. Jade requires that you manually select the areas of text, tables, and graphics to convert; Drake automatically converts the entire file to RTF, preserving the layout and text formatting from the PDF. Free demos are available online at http://www.bcl-computers.com.
PDFZone (http://www.pdfzone.com) should list other PDF conversion tools as well.
Rachel Burnsed (burnsed@bcl-computers.com) [A240]
A: [Have a look at Adobe Distiller, and the Distiller like products listed in the Products not from Adobe section]
Michael Still (mikal@stillhq.com) [A115]
A: You should try the Sowedoo Easy PDF converter, which lets you convert a batch of Office files. Have a look at www.sowedoo.com
Chaize Michaël (mc@sowedoo.com) [A116]
A: ImageToPDF for Excel version 1.0 has been tested on Win 95/98SE and Excel 97/2000.
This macro is free and will/is:
Batch convert image files supported by Adobe Acrobat versions 4 and 5.
Fill Document Info fields.
Create Thumbnails
Set the file to Standard Open, Bookmarks or Thumbnails.
Information placed on the spreadsheet is PDFWorkshop Friendly.
http://freepdf.homestead.com/files/ImageToPDFver1_0.zip
Tiger (saccuzzo@home.com) [A209]
Q: I have a problem distilling PostScript documents produced with LaTeX. The PDF doesn't look good on the screen, although the printed document is perfect. [Q116]
A: Take a look at "Creating quality Adobe PDF files from TeX with DVIPS" by Kendall Whitehouse/EMERGE
http://www.math.hawaii.edu/~ralph/MathOnWeb/TeXPDF.html
Alex Cherepanov (alexcher@erols.com) [A168]
A: Check out http://www.xmlpdf.com/overview.html
John Farrow (john.farrow@xmlpdf.com) [A180]
A: java -cp $classpath org.apache.xalan.xslt.Process -IN myfile.xml -XSL mystyle.xsl -OUT newfile.fo
MSXML will allow you to do the exact same thing, except that it is a COM object. The transformation would be performed by XSL.
GDCII (miles@netset.com) [A183]
A: Try ps2pdf if you are using a Unix variant...
Michael Still (mikal@stillhq.com) [A192]
A: If you want to convert more than one PS file into a single PDF, then try:
gs -dNOPAUSE -dBATCH -sDEVICE=pdfwrite -sOutputFile=your.pdf -c save pop -f file1.ps file2.ps
Helge Blischke (H.Blischke@srz-berlin.de) [A193]
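When the list of PostScript files is generated by a script, the command above can be assembled programmatically. The following is a minimal sketch: the function name is hypothetical, the switches are exactly those of the command above, and gs is assumed to be on the PATH when the command is actually run.

```python
import subprocess


def build_merge_command(output_pdf, ps_files):
    """Build the gs argument list that merges several PS files into one PDF."""
    if not ps_files:
        raise ValueError("at least one PostScript file is required")
    return [
        "gs", "-dNOPAUSE", "-dBATCH",
        "-sDEVICE=pdfwrite",
        "-sOutputFile=" + output_pdf,
        "-c", "save", "pop",
        "-f", *ps_files,  # input files are processed in the order given
    ]


def merge(output_pdf, ps_files):
    """Run Ghostscript; raises CalledProcessError if gs reports a failure."""
    subprocess.run(build_merge_command(output_pdf, ps_files), check=True)


if __name__ == "__main__":
    # Print rather than run, so the sketch works without Ghostscript installed.
    print(" ".join(build_merge_command("your.pdf", ["file1.ps", "file2.ps"])))
```

The pages of the resulting PDF follow the order of the input files, so sort the list first if the order matters.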
Q: I am in search of a server side solution for converting RTF or MS Word docs to PDF. The proposed solution is to be deployed on an UNIX platform. [Q161]
A: For RTF, you could use unrtf (http://freshmeat.net/projects/unrtf/) to convert to PostScript, and run ps2pdf on the result to get PDF.
For MS Word, try wvWare (http://wvware.sourceforge.net/).
J.H.M. Dassen (Ray) (usenet@zensunni.demon.nl) [A248]
Q: Our customer has given us a number of PDF files with drawings that we have to redraw in a CAD format. Does someone know how to export a PDF into DXF or DWG? [Q177]
A: GhostView (GSview) and Ghostscript will convert PDF files to vector drawings in DXF file format.
But if the source PDF file was created from raster drawings instead of by importing a vector format, the resulting DXF file will not contain any information about the drawing(s).
In GSview you should be using Edit > Convert to vector format > dxf: CAD exchange format
Then select the page to convert, browse to the target (save) directory, and give the DXF file a file name.
The resulting DXF file will contain polygons instead of lines. Circles will be rough and a lot of cleanup will be required. But the basic drawing will be useful, and cleaning up the file will take less time than scanning the file and digitizing the drawing.
David George-Nichols (george-nichols@worldnet.att.net) [A271]
Q: I have tried a couple of tools (not Java libraries), e.g. Xpdf and Ghostscript. They work some of the time but fail in strange ways.
Sometimes the PDF file appears to contain normal English text in Acrobat Reader, but when the text extraction tool is used, there is unadulterated garbage in the text file. [Q178]
A: Sometimes, when a font is embedded in a PDF file, the encoding information can get mangled. This seems to happen as part of font subsetting, but I'm not sure what software or version or specific font(s) causes it. Most font subsetting does *not* destroy the encoding info this way.
In case this isn't entirely clear... A font's encoding maps character codes to glyph names; the font itself has a description (drawing instructions) for each glyph. A font subset is constructed by removing unused glyph descriptions. So if you only used the characters 'a', 'b', and 'c' from a particular font, the encoding might map character codes 1, 2, and 3 to the glyphs named /a, /b, and /c, respectively. In this case, extracting text is easy - just look at the glyph names and ignore the character codes.
But it's also perfectly legal to use, say, character codes 16, 46, and 93, and map them to the glyph names /p01, /p02, and /p03 -- as long as the description for the glyph named /p01 draws the letter 'a', /p02 draws the letter 'b', etc. Onscreen viewers and printers don't care what the glyphs are named; they just look up the descriptions and draw the letter shapes.
Derek B. Noonburg (derekn@foolabs.com) [A272]
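The two subsetting cases described above can be modelled with a toy example. This is a deliberately simplified sketch, not how Xpdf or any real extractor is implemented: the lookup table, function name and '?' placeholder are all illustrative assumptions.

```python
# Glyph names an extractor knows how to turn back into characters. Real tools
# use a far larger table of standard glyph names; this tiny one is illustrative.
KNOWN_GLYPH_NAMES = {"a": "a", "b": "b", "c": "c"}


def extract_text(char_codes, encoding):
    """Map character codes to glyph names, then glyph names to characters."""
    chars = []
    for code in char_codes:
        glyph_name = encoding[code]
        # A glyph name that is not in the table cannot be mapped to a character.
        chars.append(KNOWN_GLYPH_NAMES.get(glyph_name, "?"))
    return "".join(chars)


# Subset font with meaningful glyph names: extraction works.
friendly_encoding = {1: "a", 2: "b", 3: "c"}
# Equally legal subset with arbitrary glyph names: extraction fails, even
# though both fonts would draw the same letter shapes on screen.
opaque_encoding = {16: "p01", 46: "p02", 93: "p03"}

if __name__ == "__main__":
    print(extract_text([1, 2, 3], friendly_encoding))   # abc
    print(extract_text([16, 46, 93], opaque_encoding))  # ???
```

A real extraction tool may emit other junk rather than '?' placeholders, for instance by falling back to the raw character codes, which would match the garbage described in the question.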