How do I get the number of pages in a PDF document?

  #1 (permalink)  
Old 11-02-05, 09:59 AM
jonathanphp (New Member)
Hey people!

How do I get the number of pages in a PDF document?

Bye...
  #2 (permalink)  
Old 11-02-05, 12:51 PM
NeverMind (Community VIP)
Do you mean you want to know how many pages a PDF document has?
__________________
We don't need a reason to help people - Zidane [FF9]
  #3 (permalink)  
Old 11-02-05, 01:02 PM
digioz (Community Leader)
The simple answer is: "It is not possible", at least not with a ready-made function.

Having said that, a note on the PHP.net documentation page for the PDF functions says the following:

http://us3.php.net/pdf

Code:
How do you get the number of pages in a PDF? I read the PDF spec v1.6 and found this:

PDF uses "page tree nodes" to define the ordering of pages in the document. The tree structure lets PDF applications quickly open a document containing thousands of pages using little memory.

If a PDF has 63 pages, the page tree node will look like this:

2 0 obj
<< /Type /Pages
   /Kids [ 4 0 R
           10 0 R
         ]
   /Count 63    <---- YES, got it
>>
endobj

[P.S.] A PDF may have more than one page tree node. The right answer is in the root page tree node; a node that has /Count XX together with a /Parent XXX entry is not the root page tree node.

So you must find the node that has /Count XX and no /Parent entry, and you'll get the total number of pages in the PDF.

Works for %PDF-1.0 through %PDF-1.5.
In other words, you simply have to look for "/Count "; the number of pages comes right after it.

For example:

Code:
/Type /Pages 
/Kids [ 2386 0 R 2388 0 R 2389 0 R 2390 0 R 2391 0 R 2392 0 R 2393 0 R ] 
/Count 67
So once you find "/Type /Pages" in the text of your PDF file, the "/Count " that follows it holds the number of pages.
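
A minimal PHP sketch of that lookup, for illustration only. The function name firstPagesCount is hypothetical, and the pattern assumes the dictionary is stored as plain text with /Count written after /Type /Pages, which the PDF format does not guarantee.

```php
<?php
// Hypothetical helper (not from the thread): returns the /Count that follows
// the first "/Type /Pages" in the raw PDF text, or null when none is found.
function firstPagesCount(string $pdfText): ?int
{
    // [^>]*? keeps the match inside one dictionary: it cannot cross ">>".
    if (preg_match('/\/Type\s*\/Pages\b[^>]*?\/Count\s+(\d+)/', $pdfText, $m)) {
        return (int) $m[1];
    }
    return null;
}

// Sample text modelled on the excerpt above.
$sample = "/Type /Pages\n/Kids [ 2386 0 R 2388 0 R ]\n/Count 67\n";
var_dump(firstPagesCount($sample)); // int(67)
```

Note that this returns the first match, which is only the total page count when that node happens to be the root of the page tree.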

Last edited by digioz; 11-02-05 at 01:07 PM.
  #4 (permalink)  
Old 11-03-05, 10:34 AM
NeverMind (Community VIP)
Interesting find, digioz!
Is this information found in the header of every PDF file? If so, you could simply use fread() with a specified number of bytes to read that header.
  #5 (permalink)  
Old 11-03-05, 12:37 PM
digioz (Community Leader)
That is correct, NeverMind (this information can be found inside ANY PDF file). So a simple fread() lets you read the contents of the PDF file as text and extract the desired information.
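
As a rough sketch of that idea in PHP: the helper names readWholeFile and allCounts are hypothetical, and the code assumes the entries are stored as plain text. It reads to the end of the file instead of a fixed number of bytes, on the assumption that the page tree object can sit anywhere in the file rather than only near the start.

```php
<?php
// Hypothetical helper: reads an entire file in binary mode with fread().
function readWholeFile(string $path): string
{
    $fp = fopen($path, 'rb');
    if ($fp === false) {
        throw new RuntimeException("Cannot open $path");
    }
    $data = '';
    while (!feof($fp)) {
        $chunk = fread($fp, 8192); // read in 8 KB pieces
        if ($chunk === false) {
            break;
        }
        $data .= $chunk;
    }
    fclose($fp);
    return $data;
}

// Hypothetical helper: every number that directly follows a "/Count" key,
// in the order the entries appear in the text.
function allCounts(string $pdfText): array
{
    preg_match_all('/\/Count\s+(\d+)/', $pdfText, $m);
    return array_map('intval', $m[1]);
}
```

Calling allCounts(readWholeFile('my_pdf_file.pdf')) would then list every /Count value found; deciding which one belongs to the root node is a separate step.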
  #6 (permalink)  
Old 11-03-05, 02:36 PM
digioz (Community Leader)
I have been getting a lot of emails about this issue, saying that there is more than one "/Type /Pages" inside their PDF file.

YES! But only one of those is the ROOT node.

Example of a ROOT node:

Code:
/Type /Pages 
/Kids [ 2386 0 R 2388 0 R 2389 0 R 2390 0 R 2391 0 R 2392 0 R 2393 0 R ] 
/Count 67
>>
Example of what is NOT the root node:

Code:
/Type /Pages 
/Kids [ 250 0 R 253 0 R 256 0 R 259 0 R 267 0 R 275 0 R 283 0 R 291 0 R 299 0 R 
] 
/Count 9 
/Parent 15770 0 R 
>>
Notice that the second example has a "/Parent" entry, which means it is NOT the root. The first example does NOT have a "/Parent" entry, which means it IS the ROOT.

I hope this clarifies it for everyone.
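
The root test can be sketched in PHP roughly as follows. The function name rootPageCount is hypothetical, and the sketch assumes the dictionaries are stored as plain text and contain no nested << >> dictionaries; it is not a full PDF parser and is meant for small inputs.

```php
<?php
// Hypothetical helper: page count taken from the root page tree node, i.e.
// the "/Type /Pages" dictionary that has a /Count but no /Parent entry.
function rootPageCount(string $pdfText): ?int
{
    // Match innermost << ... >> dictionaries only (no nested dictionaries).
    preg_match_all('/<<((?:(?!<<|>>).)*)>>/s', $pdfText, $dicts);
    foreach ($dicts[1] as $body) {
        $isPagesNode = preg_match('/\/Type\s*\/Pages\b/', $body) === 1;
        $hasParent   = preg_match('/\/Parent\b/', $body) === 1;
        if ($isPagesNode && !$hasParent
            && preg_match('/\/Count\s+(\d+)/', $body, $m)) {
            return (int) $m[1];
        }
    }
    return null;
}

// Sample text modelled on the two examples above.
$sample = "<< /Type /Pages /Kids [ 250 0 R 253 0 R ] /Count 9 /Parent 15770 0 R >>\n"
        . "<< /Type /Pages /Kids [ 2386 0 R 2388 0 R ] /Count 67 >>\n";
var_dump(rootPageCount($sample)); // int(67)
```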
  #7 (permalink)  
Old 12-12-05, 03:05 AM
Oddish (New Member)
This thread is really interesting and could potentially solve my problem, but I must be doing something wrong, since I can't find any such text in the PDF files I upload. There's no /Pages or anything. I'm attaching part of a random PDF that I just echoed onto the page. Do I need to decode the contents in some way to find this /Pages text?
Attached Files
File Type: txt pdfcontents.txt (82.8 KB, 273 views)
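
One possible explanation, offered as an assumption because the attachment is not reproduced here: since PDF 1.5, objects such as the page tree may be stored inside compressed object streams, so "/Pages" and "/Count" never appear as plain text. The PHP sketch below inflates every Flate-compressed stream it can with gzuncompress() (this needs the zlib extension) so that the decoded text can be searched as well. The function name is hypothetical, and encrypted files, predictors and other filters are not handled.

```php
<?php
// Hypothetical helper: returns the plain file text followed by the inflated
// contents of every stream that gzuncompress() is able to decode.
function pdfTextWithInflatedStreams(string $pdf): string
{
    $text = $pdf;
    $offset = 0;
    while (($start = strpos($pdf, 'stream', $offset)) !== false) {
        $end = strpos($pdf, 'endstream', $start + 6);
        if ($end === false) {
            break;
        }
        // Bytes between the two keywords, minus the end-of-line that
        // follows "stream" (zlib data never begins with CR or LF).
        $raw = ltrim(substr($pdf, $start + 6, $end - $start - 6), "\r\n");
        // The end-of-line before "endstream" is not part of the data, so try
        // the bytes as captured, then with one or two bytes trimmed.
        foreach ([0, 1, 2] as $trim) {
            $candidate = $trim === 0 ? $raw : (string) substr($raw, 0, -$trim);
            $inflated = @gzuncompress($candidate);
            if ($inflated !== false) {
                $text .= "\n" . $inflated;
                break;
            }
        }
        $offset = $end + 9; // continue after "endstream"
    }
    return $text;
}

// Simplified stand-in for an object stream (a real one also carries /N,
// /First and an offset table); it only demonstrates the inflate step.
$demo = "%PDF-1.5\n1 0 obj\n<< /Type /ObjStm /Filter /FlateDecode >>\nstream\n"
      . gzcompress("<< /Type /Pages /Kids [ 4 0 R ] /Count 5 >>")
      . "\nendstream\nendobj\n";
preg_match('/\/Count\s+(\d+)/', pdfTextWithInflatedStreams($demo), $found);
echo $found[1], "\n";
```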
  #8 (permalink)  
Old 02-08-10, 01:05 PM
Baboum (New Member)
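Since you have to read the whole file to find the /Count entries anyway, and the root node has one too, the following PHP script should work.
It looks for the largest /Count value, which is the one from the root node. Just save it as test.php and call it with test.php?file=my_pdf_file.pdf

Code:
<?php
// Caution: 'file' is opened exactly as requested; restrict it to a known directory before real use.
$file = (isset($_REQUEST['file']) && is_string($_REQUEST['file'])) ? $_REQUEST['file'] : '';
$fp = ($file !== '') ? @fopen($file, 'rb') : false; // 'rb': PDFs are binary files
if (!$fp) {
        echo 'failed opening file ' . htmlspecialchars($file);
}
else {
        $max = 0;
        while (($line = fgets($fp)) !== false) {
                // One line may hold several /Count entries, so collect them all.
                if (preg_match_all('/\/Count\s+([0-9]+)/', $line, $matches)) {
                        // Compare as integers and keep the largest value seen so far.
                        $max = max($max, max(array_map('intval', $matches[1])));
                }
        }
        fclose($fp);
        echo 'There ' . ($max == 1 ? 'is ' : 'are ') . $max . ' page' . ($max == 1 ? '' : 's') . ' in ' . htmlspecialchars($file) . '.';
}
?>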
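
Because the script above opens whatever path arrives in the file parameter, it may be worth limiting it to one directory. A minimal PHP sketch, assuming the PDFs live in a single known folder; the function name resolvePdfPath and the idea of a fixed base directory are illustrative and not part of the original script.

```php
<?php
// Hypothetical helper: resolves a user-supplied file name inside $baseDir and
// returns the full path, or null if it is missing or not a .pdf file.
function resolvePdfPath(string $baseDir, string $requested): ?string
{
    // basename() drops any directory part such as "../../etc/".
    $name = basename($requested);
    if (strtolower(pathinfo($name, PATHINFO_EXTENSION)) !== 'pdf') {
        return null;
    }
    // realpath() returns false when the file does not exist.
    $path = realpath($baseDir . DIRECTORY_SEPARATOR . $name);
    return ($path !== false && is_file($path)) ? $path : null;
}
```

The returned path, when not null, could then be passed to fopen() in place of the raw request value.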
  #9 (permalink)  
Old 02-19-10, 10:43 PM
adrianbj (Newbie Coder)
I know this thread is ancient, but since it has been revitalized with a solution, I thought I would chime in and say that Baboum's script works for me most of the time. Like Oddish, though, I have some PDFs without /Pages or /Count entries. I didn't create the original PDF, but I tried recreating it by printing to PDF from Acrobat, and it still does not contain the entries. Does anyone have any ideas on how to add these properly? I tried some quick manual additions, but that broke the PDF.
  #10 (permalink)  
Old 02-20-10, 10:01 AM
digioz (Community Leader)
Quote:
Originally Posted by adrianbj
I know this thread is ancient, but since it has been revitalized with a solution, I thought I would chime in and say that Baboum's script works for me most of the time. Like Oddish, though, I have some PDFs without /Pages or /Count entries. I didn't create the original PDF, but I tried recreating it by printing to PDF from Acrobat, and it still does not contain the entries. Does anyone have any ideas on how to add these properly? I tried some quick manual additions, but that broke the PDF.
Which version of PDF is the document that has no /Count (PDF 7, 8, 9)? Can you post a sample PDF with no /Count here?

Thanks,
Pete
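
For anyone wanting to check the version themselves, a small PHP sketch: the file normally starts with a "%PDF-x.y" header line, which can be read from the first bytes. The function name pdfHeaderVersion is hypothetical, and the header is only a first hint, since later PDF versions allow the document catalog to state a newer version.

```php
<?php
// Hypothetical helper: returns the version from the "%PDF-x.y" header line,
// or null if the file cannot be read or has no such header.
function pdfHeaderVersion(string $path): ?string
{
    if (!is_file($path)) {
        return null;
    }
    $fp = @fopen($path, 'rb');
    if ($fp === false) {
        return null;
    }
    // The header sits at the very start, so the first 1 KB is plenty.
    $head = fread($fp, 1024);
    fclose($fp);
    if ($head !== false && preg_match('/%PDF-(\d\.\d)/', $head, $m)) {
        return $m[1];
    }
    return null;
}
```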
All times are GMT -5.