On Tuesday, 18 September 2012 16:54:23 UTC+1, Rudra Banerjee wrote:
> Can anyone kindly show me the steps required to read pdf headers in human
> readable format?
>
PDF is a binary format. To read any binary format, you need to have a copy
of the format specification. That tells you how the bits are to be interpreted.
With PDF, the gross file structure is quite straightforward. Whilst I forget
the details, basically you have a tag which tells you what type of data the
section is (text, image, font, copyright notice, etc), then you have the
length of the data, then you have the data itself.
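
For the header in particular, a small sketch may help. It assumes (from
the PDF specification as I understand it, not from anything stated above)
that a PDF file begins with a plain-text line of the form %PDF-1.4. The
function name pdf_version and the sample bytes are made up for illustration.

```python
import re

def pdf_version(data: bytes):
    """Return the version from a '%PDF-x.y' header line, or None if absent."""
    # Assumption: the header sits near the start of the file, so only the
    # first 1024 bytes are searched.
    match = re.search(rb"%PDF-(\d+\.\d+)", data[:1024])
    return match.group(1).decode("ascii") if match else None

# Hypothetical stand-in for the first bytes of a real file.
sample = b"%PDF-1.4\n%\xe2\xe3\xcf\xd3\n1 0 obj\n<< /Type /Catalog >>\nendobj\n"
print(pdf_version(sample))
```

On a real file you would pass in something like open(path, "rb").read(1024)
instead of the sample bytes.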
However the data itself is usually compressed, using zlib. Whilst it is
possible to write your own decompressor, this is a major undertaking.
Usually the only realistic option is to use a library.
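
As a rough sketch of the library route, here is what it might look like with
Python's standard zlib module. The assumptions are mine: that compressed data
sits between the keywords "stream" and "endstream", and that a crude pattern
scan is good enough to find it. A proper reader would honour the declared
length and filter of each section instead. The name inflate_streams and the
sample document are hypothetical.

```python
import re
import zlib

def inflate_streams(data: bytes):
    """Yield decompressed bodies of stream...endstream sections zlib can decode."""
    pattern = re.compile(rb"stream\r?\n(.*?)\r?\nendstream", re.DOTALL)
    for match in pattern.finditer(data):
        try:
            yield zlib.decompress(match.group(1))
        except zlib.error:
            # Not zlib data (another filter, or not compressed); skip it.
            continue

# Build a tiny stand-in document in memory rather than reading a real file.
original = b"BT /F1 12 Tf (Hello) Tj ET"
compressed = zlib.compress(original)
sample = (
    b"1 0 obj\n<< /Length " + str(len(compressed)).encode("ascii")
    + b" /Filter /FlateDecode >>\nstream\n"
    + compressed
    + b"\nendstream\nendobj\n"
)

for body in inflate_streams(sample):
    print(body)
```

Even when this works, what comes out is typically page-drawing operators
rather than plain text, so it only gives an idea of the contents.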
What this means is that whilst you can get an idea of what a PDF file
contains, you can't easily read the actual data, certainly not with your own
little scratch program.
--
http://www.malcolmmclean.site11.com/www