Internet stuff

1. Problems with IE displaying PDFs
Q: Does anyone have any suggestions that could help us understand why IE intermittently will not get the PDF from the web server? We just get a blank screen occasionally, with 'Done' on the status bar. We have tried unoptimizing the PDF so as not to use byte serving. Web server is Apache 3.1.12 HTTP 1.1 on Solaris UNIX. The client is NT4 + IE5. [Q43]
2. Optimised, compressed, linearised? Arrrrgh! My brain hurts!
Q: What is the difference between an optimised PDF, a compressed PDF, a linearised PDF, and a cheese stick? [Q44]
3. Encrypted PDFs and search engines
Q: I noticed that in Windows, PDF files have an extra 'PDF Properties' tab in file properties. It shows the 'General Info' (ctrl+D in Acrobat) of the PDF file. However, if the PDF is encrypted (e.g. changing the document is not allowed), this info is not shown. Does anyone know if encrypting affects the indexing of PDFs (Acrobat Catalog, search engines)? [Q45]
4. Document corruption in older web caches
Q: I'm having trouble. I've tested with Acrobat Reader 5.0.1 27.3.2001 & W2k (both SP1 & SP2) and IE 5.5SP1 & IE 6.0beta.
5. Problems linking to PDF documents
Q: Why don't my PDF documents display properly when I link to them from a web page? [Q125]
6. Problems with accessing linearized PDFs from an Apache web server
Q: There have been some reports that some versions of Apache don't byte serve properly, which means that linearized PDFs don't serve properly... Is there any fact to this? [Q130]
7. Problems with IE caching PDF documents
Q: Hi, we have the following problem and I hope someone can help me. We are creating a web application using pdf forms. These forms are updated with data from a database via a web server through servlets. We are using TOMCAT as a web server and the FDF toolkit to create the form data.
8. Printing files within browsers using Javascript
Q: We have an application which dynamically generates PDF files on the server side. These PDFs can be viewed in the browser (which of course uses the Acrobat Reader for displaying the contents). As a special convenience function I would like to print these docs directly without popping up an Acrobat Reader, e.g. the user simply presses a button called 'print'.
9. Linking to named destinations within a PDF
Q: I'm normally a lotus-notes-developer (so not much of a pdf-expert). For a project in domino.doc, I should be able to create URL-links that open pdf-documents that are stored in domino.doc to a certain location.
10. Font changes when byte serving PDF documents
Q: This is hardly a serious problem... I'm just curious. We're in a WinNT environment (Acrobat 4) and are serving PDF documents over the Web. The documents were Distilled with an embedded LucidaSans Type 1 font. When looking at the font characteristics within the document the embedding looks like it took just fine. This is what we are experiencing...

1. Problems with IE displaying PDFs

Q: Does anyone have any suggestions that could help us understand why IE intermittently will not get the PDF from the web server? We just get a blank screen occasionally, with 'Done' on the status bar. We have tried unoptimizing the PDF so as not to use byte serving. Web server is Apache 3.1.12 HTTP 1.1 on Solaris UNIX. The client is NT4 + IE5. [Q43]

A: I've noticed that it depends (somewhat, not always) on how you're serving up your PDFs. I have some software on my site that creates PDFs on-the-fly which wasn't working in IE (I got the white screen or Adobe complained it wasn't a valid PDF) but worked fine in Netscape. From your message I take it you have the PDFs on disk already?

Instead of writing the PDF stream out to the user's browser, I started putting the PDFs on disk and doing a redirect. It seems to have alleviated the problem (server is NT with PDFs being created from Perl scripts). What was really strange was that when I sent the data directly to the browser, and Adobe complained in IE, I did a save to my disk to inspect the contents. Parts of the PDF were missing and even some of the internal structure, like what object the page info was stored at, was changed. I suspect that the plug-in for IE was modifying it (not sure what else could have). Perhaps the same thing is happening on your end?

Even some static PDFs I have out there don't always come up properly in IE, although in Netscape I don't think I've ever had a problem.

Mike Bernardo (mbernardo@chartermi.net) [A70]

A: The problem is that MS fixes a PDF bug in every new version of MSIE, but every time they succeed in introducing new PDF bugs. So it is very hard to find a workaround that works with EVERY version of MSIE.

For what it's worth, I experienced 2 different problems with MSIE and DYNAMICALLY generated PDFs: 1. MSIE sends multiple requests to the server: if the user asks for 1 dynamically generated PDF file, MSIE sends the same request 2 or 3 times to the server (if you don't anticipate this, the PDF document is generated 2 or 3 times). 2. MSIE gives a blank page.

This last problem can be solved by sending a server header indicating the exact length of the PDF file. In iText we solved it like this: http://www.lowagie.com/iText/faq.html#msie We are not happy with this situation, because if you have large documents, you want to send data to the browser bit by bit, not the whole file at once when it's finished: if the timeout is reached, you get 'connection reset by peer' errors.

Bruno Lowagie (bruno.lowagie@rug.ac.be) [A186]
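
As a rough illustration of the Content-Length fix described above, here is a minimal Python sketch. It is not the iText code; the function name build_pdf_response and the sample chunks are hypothetical. The idea is simply to buffer the generated PDF completely so that its exact length is known before any header is sent.

```python
import io


def build_pdf_response(pdf_chunks):
    """Buffer a generated PDF completely, then return (headers, body).

    Buffering first is what makes an exact Content-Length possible; the
    cost is that nothing can be sent until generation has finished.
    """
    buffer = io.BytesIO()
    for chunk in pdf_chunks:
        buffer.write(chunk)
    body = buffer.getvalue()
    headers = [
        ("Content-Type", "application/pdf"),
        ("Content-Length", str(len(body))),  # exact size in bytes
    ]
    return headers, body


# Placeholder chunks standing in for real generator output.
headers, body = build_pdf_response(
    [b"%PDF-1.3\n", b"...page data...\n", b"%%EOF\n"]
)
```

This shows the trade-off mentioned in the answer: the whole document sits in memory (or on disk) before the first byte goes out, so large documents cannot be streamed bit by bit.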

A: As far as I remember, MSIE looks at the file extension to determine what kind of file you are serving. As you say you serve your PDF from a CGI, I guess the extension you have is something like .pl or .exe, instead of .pdf; that's probably why MSIE fails to see it as a PDF file. Have you tried setting the header "Content-Disposition" with the value "attachment; filename=myPDF.pdf"? Maybe this helps... http://www.lowagie.com/iText/faq.html#msie gives you a list of links to the MS Support Pages. Maybe you'll find an answer there.

Bruno Lowagie (bruno.lowagie@rug.ac.be) [A265]

A: You have to watch out for this one - it reacts badly with this IE5 bug.

http://support.microsoft.com/default.aspx?scid=kb;EN-GB;q281119

On my system I couldn't even save, let alone view the PDF when we did this.

Try loading the PDF at

http://big.faceless.org/products/pdf/testsuffix.jsp

and see if you get this behaviour - I can load this successfully in IE5.5, despite the suffix.

I'm assuming your Content-Type is set to JUST "application/pdf" - another oddity we found in IE5 and IE6 (but not IE5.5, curiously) was that if the Content-Type is set to the unusual but technically valid "application/pdf; charset=ISO-8859-1" it would prompt to save the file (this header courtesy of a JRun "quirk").

You've obviously got IE set up correctly to a point, as you're able to view some PDFs. How about posting just the headers from a working and non-working response, and we'll take a look? I was recently pointed towards "TracePlus Web Detective" from http://www.sstinc.com, which does a nice job of monitoring the HTTP header exchange - you can get a trial download from their site.

Mike Bremford (mike-news@big.faceless.org) [A266]
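
When comparing headers from a working and a non-working response, a small checker can help. The Python sketch below only encodes the pitfalls reported in this thread (a Content-Type that is not exactly application/pdf, a missing Content-Length, and Content-Disposition set to attachment). The function name and message texts are hypothetical, and the list is not a complete account of IE behaviour.

```python
def find_header_pitfalls(headers):
    """Return notes about response headers reported as troublesome in IE.

    `headers` is a plain dict of header name -> value. Header names are
    compared case-insensitively.
    """
    lowered = {name.lower(): value for name, value in headers.items()}
    notes = []

    content_type = lowered.get("content-type", "")
    if content_type.strip().lower() != "application/pdf":
        notes.append("Content-Type is not exactly application/pdf")

    if "content-length" not in lowered:
        notes.append("Content-Length is missing")

    disposition = lowered.get("content-disposition", "")
    if disposition.strip().lower().startswith("attachment"):
        notes.append("Content-Disposition: attachment is set")

    return notes


good = {"Content-Type": "application/pdf", "Content-Length": "31"}
bad = {"Content-Type": "application/pdf; charset=ISO-8859-1"}
```

Here find_header_pitfalls(good) returns an empty list, while find_header_pitfalls(bad) flags both the charset parameter and the missing length.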

A: We have had this problem for many years and I am pretty sure that IE requires a "plausible" file name at the end of the URL to work properly. On our website you will find URLs that look like

http://www.iec.ch/cgi-bin/restricted/getfile.pl/100_467e_CDV.pdf?dir=100&format=pdf&type=_CDV&file=467e.pdf

(sorry, this won't deliver a document unless you have a valid userid), but hit and miss has led us down this road. Curiously enough, if you hit Save Target As..., it proposes the filename in the middle (100_467e_CDV.pdf), though the parameters following the ? are required by the Perl script to retrieve the file.

Jack Sheldon (jack.sheldon@freesurf.ch.nospam) [A267]
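
The "plausible file name" trick above can be sketched in Python as follows. The host www.example.com, the helper name and the parameter values are placeholders, and the sketch assumes the web server hands the extra path segment to the script (as CGI path info) instead of looking for a file of that name.

```python
from urllib.parse import quote, urlencode


def script_url_with_filename(script_url, filename, params):
    """Build a script URL whose path ends in a PDF-looking file name.

    The file name goes after the script name as an extra path segment;
    the parameters the script actually needs stay in the query string.
    """
    return f"{script_url}/{quote(filename)}?{urlencode(params)}"


url = script_url_with_filename(
    "http://www.example.com/cgi-bin/getfile.pl",  # placeholder script URL
    "100_467e_CDV.pdf",
    {"dir": "100", "format": "pdf"},
)
```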

2. Optimised, compressed, linearised? Arrrrgh! My brain hurts!

Q: What is the difference between an optimised PDF, a compressed PDF, a linearised PDF, and a cheese stick? [Q44]

A: These terms are confusing because many people misuse them (and to a certain extent Adobe didn't pick very helpful terms either). The PDF specification uses the terms to mean:

Optimised: the PDF has been laid out in the most graceful manner possible. For instance, a black and white image has not been saved as a colour or grayscale image, which would take a lot more space.

Compressed: some elements in the PDF are compressed. The whole document is not required to be compressed, however.

Linearised: the document has had its internals rearranged so that byte serving will work. Byte serving is what happens on some web sites when only the page you are currently reading is downloaded. This means that you can flick through large documents without having to first download the entire thing. A linearised PDF is often called optimised by people who haven't read the PDF specification.

Michael Still (mikal@stillhq.com) [A71]
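
One rough way to tell whether a file has been linearised is to look for the /Linearized key near the start of the file, since the linearization parameter dictionary is expected within the first 1024 bytes. The Python sketch below is only a heuristic under that assumption: the function name and sample byte strings are made up, and it does not validate the hint tables or the rest of the layout.

```python
def looks_linearized(pdf_bytes):
    """Heuristic: report whether /Linearized appears in the first 1024 bytes."""
    return b"/Linearized" in pdf_bytes[:1024]


# Tiny hand-written fragments, not complete PDF files.
linearized_sample = b"%PDF-1.3\n1 0 obj\n<< /Linearized 1 /L 1234 >>\nendobj\n"
plain_sample = b"%PDF-1.3\n1 0 obj\n<< /Type /Catalog >>\nendobj\n"
```

On a real file the same check could be run on the result of open(path, "rb").read(1024).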

3. Encrypted PDFs and search engines

Q: I noticed that in Windows, PDF files have an extra 'PDF Properties' tab in file properties. It shows the 'General Info' (ctrl+D in Acrobat) of the PDF file. However, if the PDF is encrypted (e.g. changing the document is not allowed), this info is not shown. Does anyone know if encrypting affects the indexing of PDFs (Acrobat Catalog, search engines)? [Q45]

A: Certainly, Catalog used not to be able to index encrypted PDFs. With Acrobat 5.0 it is now a plug-in, so it probably can. Many, perhaps most, search engines will not be able to index files that are encrypted.

Aandi Inston (quite@dial.pipex.com) [A72]

4. Document corruption in older web caches

Q: I'm having trouble. I've tested with Acrobat Reader 5.0.1 27.3.2001 & W2k (both SP1 & SP2) and IE 5.5SP1 & IE 6.0beta.

Error message is "File does not start with '%PDF-'".

I get the error message in about 99% of the PDF viewing attempts, also with

www.airtug.com/brochure.pdf [Q46]

A: When PDF files are fetched from the web, this is often in small pieces - a process called byteserving.

Some proxy servers don't recognise that the pieces are separate, and so they give back the wrong pieces. This leaves the PDF files in quite a mess, as if they are broken up and glued together wrong.

I don't know if that proxy has problems, but if a proxy didn't understand byteserving your symptoms are exactly what I'd expect.

Aandi Inston (quite@dial.pipex.com) [A73]
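
To see whether a copy that came through a proxy was scrambled, one simple check is to save it and look for the '%PDF-' marker near the start. The Python sketch below assumes, as a working rule, that the marker should appear within the first 1024 bytes; the function name and the sample data are hypothetical.

```python
def pdf_header_offset(data):
    """Return the offset of b'%PDF-' within the first 1024 bytes, or -1."""
    return data[:1024].find(b"%PDF-")


intact = b"%PDF-1.3\n...rest of file..."
scrambled = b"<html><body>Proxy error</body></html>"
```

An offset of 0 is what a normal file gives; -1 suggests the bytes that arrived are not the start of the PDF at all, which would match the error message in the question.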

5. Problems linking to PDF documents

Q: Why don't my PDF documents display properly when I link to them from a web page? [Q125]

A: There are several threads about this. I'd like to add my experience and my fix. First, the problem: we added several Adobe PDF documents to a new web site. They had formerly been linked to successfully on another page. The documents had recently been revised. When we published the new site, a group of PDF files from one page came up blank when a user tried to link to them. Sometimes, using the Reload button would then display the page. The problem occurred with Netscape 4.6 on Win 2000 and with NS 4.76 on an unknown Windows platform (user did not report OS). I got a blank page with NS 6.1 on NT4, but after I changed a configuration setting to stop asking me if I wanted to save or display when I clicked on a PDF link, the blank page no longer occurred. The pages displayed first time on this same machine with NS 4.61 and NS 4.78. (However, I installed these older browsers after making the config change on 6.1.)

All these machines were using the Adobe Acrobat Reader plug-in 4.0 or 4.05. I had some problems with the pages locking up the entire browser from home with NS 4.7 and IE 5.5 on Win ME; I upgraded to Acrobat Reader 5.0 and the files displayed fine in this configuration. (Considering the locking up of the browsers, this problem may be a different one than our users experienced.)

The documents in question were created with MS Word and converted to Acrobat via a plug-in. They were modified with Adobe PDFWriter 4.05 for Windows NT, PDF Version 1.3. However, other documents that had been modified by the same version, but were linked on another web page, displayed fine on all platforms and browsers. In addition, when we recreated one of the PDF documents by converting in Word with an older Adobe PDFWriter (PDF Version 1.2--which documents had displayed correctly on the old site) the PDFs still would not link from the problem page. Here's the final clue: when we linked a supposedly bad PDF file from another web page, it could be viewed fine.

So we presumed that some problem in the HTML code was causing the display glitch. Consequently, we moved all the HTML from the "bad" page to a new page. However, we did it in chunks--layout separate from border code, etc. (We use NetObjects TeamFusion Authoring Server 2000 for Web publication)

Voila! The PDFs displayed fine when linked to from the new page.

While I have not examined all the code on the "bad" page, the contents of the <a href> tags themselves are identical to those on the new page. I conclude that some hidden control code has caused an oddity in interpretation of this link by certain configurations of browser and Adobe plug-in. Obviously, it is not consistent across the machines. Other threads on Usenet have mentioned that it can happen on ASP pages, on IE, and in UNIX in addition to multiple versions of Netscape. The fixes offered in Usenet (reformat your hard drive - LOL - or change your Adobe Reader from a plug-in to a helper application) have worked for some, but not others. I add to those suggestions another, less drastic intervention: recreate your web page.

(Please compare this hypothesis and scenario to what happens in some cases where Word text is copied and pasted to an HTML page in Team Fusion; or in SQL server code when pasting from one screen to another. Occasionally, the code will not work until you delete all spaces that might contain hidden codes or retype the code. Or the case of dropped close-table tags that display correctly in IE, but show a blank page in some versions of Netscape)

Hoping this helps someone who runs across this problem.

Diana Diehl (ddiehl@uillinois.edu) [A184]

6. Problems with accessing linearized PDFs from an Apache web server

Q: There have been some reports that some versions of Apache don't byte serve properly, which means that linearized PDFs don't serve properly... Is there any fact to this? [Q130]

A: Apache 1.3.14 had a serious bug affecting byteserving, which is the method Acrobat uses to fetch pages. I don't believe it affected any other version. Still, you could try upgrading to the latest.

Aandi Inston (quite@dial.pipex.com) [A197]
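
For readers testing their own server: byte serving relies on HTTP Range requests, where the client asks for a slice such as "bytes=0-1023" and expects exactly those bytes back. The Python sketch below shows how a single range maps onto file offsets. It is an illustration only, not Apache's code; the function name is hypothetical and multi-range requests are deliberately not handled.

```python
def resolve_byte_range(range_header, file_size):
    """Resolve one 'bytes=first-last' range to inclusive (start, end) offsets.

    Returns None when the range is malformed, unsatisfiable, or uses a
    form this sketch does not handle (other units, multiple ranges).
    """
    unit, equals, spec = range_header.partition("=")
    if not equals or unit.strip().lower() != "bytes" or "," in spec:
        return None
    if file_size <= 0:
        return None

    first, dash, last = spec.strip().partition("-")
    if not dash:
        return None

    if first == "":
        # Suffix form "bytes=-N": the last N bytes of the file.
        if not last.isdecimal() or int(last) == 0:
            return None
        length = min(int(last), file_size)
        return file_size - length, file_size - 1

    if not first.isdecimal():
        return None
    start = int(first)
    if start >= file_size:
        return None  # range starts beyond the end of the file

    if last == "":
        end = file_size - 1  # open-ended form "bytes=N-"
    elif last.isdecimal():
        end = int(last)
    else:
        return None
    if end < start:
        return None
    return start, min(end, file_size - 1)
```

For a 10000-byte file, "bytes=0-1023" resolves to (0, 1023), and both "bytes=9500-" and "bytes=-500" resolve to (9500, 9999). A server or proxy that returns anything other than those bytes for those requests will leave the viewer with a jumbled file.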

7. Problems with IE caching PDF documents

Q: Hi, we have the following problem and I hope someone can help me. We are creating a web application using pdf forms. These forms are updated with data from a database via a web server through servlets. We are using TOMCAT as a web server and the FDF toolkit to create the form data.

When a user presses the submit button, the servlet handles the request, builds an updated FDF file and sends it back to the client. The problem is that although the servlet response gets to the client, the PDF file does not get updated with the newly generated FDF data. We do not have this problem with Netscape Communicator, which is why I think it is a problem with IE5.5's caching.

Any help would be greatly appreciated. Thank you in advance. [Q131]

A: The problem appears to be that IE checks for files with newer dates on the server, and the pdf does not have a newer date, since it hasn't changed. You need to disable caching in IE. Try Tools:Internet Options: Temporary Internet Files. The bad news is that this disables caching for all pages, and will slow down all web work.

Alternatively, try using a different file name for the "master" PDF file using FDFSetFile. This will force a new version to be downloaded.

Dan Sideen (dansideen@home.com) [A199]
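
A common variation on the "different file name" idea is to keep the file name but make the URL unique, for example by appending a throwaway query parameter. The Python sketch below shows that variation, not the FDFSetFile call itself. The parameter name "v", the helper name and the example URL are hypothetical, and it assumes both that the server ignores the extra parameter and that the browser treats the changed URL as a new resource.

```python
import uuid
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def unique_pdf_url(base_url, token=None):
    """Append a unique 'v' query parameter so the URL differs on each call."""
    token = token or uuid.uuid4().hex
    parts = urlsplit(base_url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append(("v", token))
    return urlunsplit(parts._replace(query=urlencode(query)))


# A fixed token is passed here only to make the result predictable.
example = unique_pdf_url("http://www.example.com/forms/master.pdf", token="abc")
```

The resulting URL is what would be handed to the FDF as the location of the "master" PDF.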

8. Printing files within browsers using Javascript

Q: We have an application which dynamically generates PDF files on the server side. These PDFs can be viewed in the browser (which of course uses the Acrobat Reader for displaying the contents). As a special convenience function I would like to print these docs directly without popping up an Acrobat Reader, e.g. the user simply presses a button called 'print'.

In the newsgroups I found solutions like this

 
        ..
        <EMBED name="pdf" SRC="rg.pdf" WIDTH=85 HEIGHT=115>

        <script>
          window.document.pdf.print();
        </script>
        ..
 

which DON'T work for me. [Q133]

A: Yep, that was an old one, which relied on Adobe's ActiveX control being wholly not safe for scripting and violating basic MS plug-in rules. On Win32,

http://www.meadroid.com/scriptx/ would be a solution (combined with Neptune from the same site for non-IE browsers).

Jim Ley (jim@jibbering.com) [A203]

9. Linking to named destinations within a PDF

Q: I'm normally a lotus-notes-developer (so not much of a pdf-expert). For a project in domino.doc, I should be able to create URL-links that open pdf-documents that are stored in domino.doc to a certain location.

I can get this to work when I specify a certain page (you know,

http://.../filename.pdf#page=7), but it doesn't work when I want to go to a named destination (as in

http://.../filename.pdf#nameddest=destination). It opens the document, but always the first page.

I have created a named destination in my test-document (using the destination-window in acrobat 4.0), and tested it (using go to destination in the destination-window) so I'm sure there is a named destination.

What am I doing wrong, or is this a browser(IE 5.0)-related problem ? [Q135]

A: I found it myself. Apparently there's a mistake in the Acrobat help files (leave out the #nameddest):

http://www.adobe.com/support/techdocs/a17e.htm

Furthermore, I was using blanks in the name of my destination, and this also seemed to cause trouble (the browser translates them to %20, which no longer seems to match the name of the destination).

Philippe Lauwers (eb.xeruces@srewual.eppilihp) [A206]
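
A minimal Python sketch of the two findings above, assuming "leave out the #nameddest" means the destination name goes directly after the #. The helper name, the example URL and the destination name are hypothetical, and the behaviour described is only what was reported for Acrobat 4.0 with IE 5.0.

```python
def named_destination_link(pdf_url, destination):
    """Build a link of the form <pdf_url>#<destination>.

    Names containing blanks are rejected up front, because the browser
    would encode them as %20 and the name would no longer match.
    """
    if not destination or any(ch.isspace() for ch in destination):
        raise ValueError("destination name must be non-empty and contain no blanks")
    return f"{pdf_url}#{destination}"


link = named_destination_link("http://www.example.com/docs/manual.pdf", "chapter2")
```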

10. Font changes when byte serving PDF documents

Q: This is hardly a serious problem... I'm just curious. We're in a WinNT environment (Acrobat 4) and are serving PDF documents over the Web. The documents were Distilled with an embedded LucidaSans Type 1 font. When looking at the font characteristics within the document the embedding looks like it took just fine. This is what we are experiencing...

The end-user is browsing away (Netscape 4.08), reading PDF documents with the Reader 4 plugin. They bring up a PDF document and when it displays, for a fraction of a second, the text that is displayed is rendered in a different font. Very quickly it then shifts to the LucidaSans font. Although it happens very fast, I think the font that is initially displayed is the AdobeSansMM font. At least that's what it looks like to me. [Q153]

A: That's right - it's exactly what is supposed to happen. When displaying a "byteserved" PDF, Acrobat will download the page in stages. First the text itself, which is rendered with substitute fonts. Then bitmap graphics. Finally, any embedded fonts come down and the text will, if necessary, be redrawn.

This can take a long time, not just happen in a flash, and that's the point of it - to give the user something to read as quickly as possible. This enormously improves the perceived performance (which is much more important in most cases than the real performance).

Aandi Inston (quite@dial.pipex.com) [A236]