Re: Editing/Replacing Text in a PDF

by Russ (Deacon)
on Dec 20, 2006 at 18:20 UTC (#590956)

in reply to Editing/Replacing Text in a PDF

PDFs are relatively simple beasts, but what you're doing corrupts the xref (cross-reference) tables, which hold the byte offsets of the beginning of each PDF object. I haven't seen the output of InDesign, but the fact that you are able to replace text directly suggests that InDesign does not compress all its objects. That's good for your purposes.

You'll want the PDF Reference (published by Adobe) to understand this better, but look at the xref table in the output file. It looks like this:

xref
69 16
0000000016 00000 n
0000001041 00000 n
0000000616 00000 n
0000001121 00000 n
0000001250 00000 n
0000001381 00000 n
0000001533 00000 n
0000001567 00000 n
0000001782 00000 n
0000001858 00000 n
0000002238 00000 n
0000002609 00000 n
0000002830 00000 n
0000003269 00000 n
0000003514 00000 n
0000006183 00000 n

You'll want to know which objects you are modifying so you can correct all the objects with higher offset values. An object starts with a section like this:
71 0 obj
This is object number 71, generation 0 (note that the xref table I copied started its numbering at 69, for some reason). It starts at byte offset 616, which you can see in the third entry of the xref table above.
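
To make the numbering concrete, here is a minimal Python sketch that reads the xref section quoted above and maps each object number to its byte offset. The name parse_xref is made up for illustration and does not come from any PDF library, and the sketch assumes a plain-text (uncompressed) xref section like this one.

```python
# Hypothetical helper for illustration; not part of any PDF library.
XREF_TEXT = """\
xref
69 16
0000000016 00000 n
0000001041 00000 n
0000000616 00000 n
0000001121 00000 n
0000001250 00000 n
0000001381 00000 n
0000001533 00000 n
0000001567 00000 n
0000001782 00000 n
0000001858 00000 n
0000002238 00000 n
0000002609 00000 n
0000002830 00000 n
0000003269 00000 n
0000003514 00000 n
0000006183 00000 n
"""


def parse_xref(text):
    """Return {object_number: (byte_offset, generation, in_use)}."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if not lines or lines[0] != "xref":
        raise ValueError("expected the 'xref' keyword on the first line")
    entries = {}
    i = 1
    while i < len(lines):
        header = lines[i].split()
        if len(header) != 2:
            break  # not a "first count" subsection header, e.g. 'trailer'
        first, count = int(header[0]), int(header[1])
        for n in range(count):
            offset, generation, flag = lines[i + 1 + n].split()
            entries[first + n] = (int(offset), int(generation), flag == "n")
        i += 1 + count
    return entries


table = parse_xref(XREF_TEXT)
print(table[71])  # prints (616, 0, True)
```

The subsection header "69 16" says the first entry describes object 69 and that 16 entries follow, which is why the third entry belongs to object 71.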

At a minimum, if all else goes well, you will have to correct the byte offsets of the objects that appear after your changes, so that the viewer can "find" them.
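
As a rough sketch of that correction (in Python, with made-up function names), the idea is to add the change in length to every in-use entry whose offset lies past the edit, then write each entry back in the fixed 20-byte form the format expects. This assumes you already know where the edit happened (edit_offset) and how many bytes it added or removed (delta). It covers only the xref entries: if the xref table itself sits after the edit, the startxref value near the end of the file needs the same adjustment, and text inside a stream also changes that stream's /Length.

```python
# Hypothetical helpers for illustration; not part of any PDF library.

def shift_xref_entries(entries, edit_offset, delta):
    """Shift in-use offsets that lie after edit_offset by delta bytes.

    entries is a list of (offset, generation, flag) tuples, where flag
    is 'n' (in use) or 'f' (free). Free entries do not hold byte
    offsets, so they are left alone.
    """
    shifted = []
    for offset, generation, flag in entries:
        if flag == "n" and offset > edit_offset:
            offset += delta
        shifted.append((offset, generation, flag))
    return shifted


def format_entry(offset, generation, flag):
    """Render one entry: 10-digit offset, 5-digit generation, flag,
    then a two-character end of line (space + newline), 20 bytes total."""
    return f"{offset:010d} {generation:05d} {flag} \n"


# Three entries taken from the table in the post (objects 71, 70, 72).
entries = [(616, 0, "n"), (1041, 0, "n"), (1121, 0, "n")]

# Suppose the replacement text made the file 5 bytes longer at byte 700,
# which falls inside the object that starts at 616.
fixed = shift_xref_entries(entries, edit_offset=700, delta=5)
for entry in fixed:
    print(format_entry(*entry), end="")
```

The object that contains the edit keeps its offset, because only what comes after the changed bytes moves.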

If you're looking for an easy fix, this won't help. If you're willing to invest some time learning a really cool file format, jump on in!

P.S. (Pun kinda intended) There may be more than one xref table. For your purposes, go through all of them and update every offset greater than your byte location, until you know why you don't have to... :-)
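
One way to check how many xref tables a file has is to search the raw bytes for the xref keyword while skipping the startxref keyword that also contains it. The Python sketch below uses a made-up function name and a tiny fake file body, and it is only a heuristic: the four bytes could also turn up by chance inside binary stream data, and files that use cross-reference streams (PDF 1.5 and later) have no xref keyword at all.

```python
import re

# Hypothetical helper for illustration; not part of any PDF library.

def find_xref_sections(pdf_bytes):
    """Return the byte positions of each 'xref' keyword, ignoring the
    occurrences that are really the tail of 'startxref'."""
    return [m.start() for m in re.finditer(rb"(?<!start)xref\b", pdf_bytes)]


# A tiny fake file with two xref sections, as left by an incremental update.
sample = (
    b"%PDF-1.4\n"
    b"xref\n0 1\n0000000000 65535 f \ntrailer\n<<>>\nstartxref\n9\n%%EOF\n"
    b"xref\n5 1\n0000000100 00000 n \ntrailer\n<<>>\nstartxref\n70\n%%EOF\n"
)

positions = find_xref_sections(sample)
print(len(positions))  # prints 2
```

With a real file you would read it in binary mode, for example open(path, "rb").read(), so that byte positions are not disturbed by newline translation.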

Replies are listed 'Best First'.
Re^2: Editing/Replacing Text in a PDF
by ikkon (Monk) on Dec 21, 2006 at 11:40 UTC
    Nice, thanks for the detailed answer. For the moment I will probably go with almut's suggestion, because I do not have a lot of time for this. However, I am really interested in learning this, and since I am going on vacation soon I will read up on it, and hopefully later be able to design a script that works more the way I would like it to. Again, thanks for the answer, it really helped.
