How do I get the number of pages in a PDF document?

  #11 (permalink)  
Old 02-20-10, 11:53 AM
wirehopper
  #12 (permalink)  
Old 02-20-10, 12:41 PM
adrianbj
PDF 1.5
http://ian.umces.edu/pdfs/ian_presen...1124145318.pdf

I have tried saving it as different versions, and even extracting the pages and recombining them into a new PDF, but I can't seem to get page count tags into it.

Any help figuring this out would be great.
  #13 (permalink)  
Old 02-22-10, 03:25 AM
Baboum
Solution?

I tried adding this to count the number of times 'Type/Page' appears in the document, and that worked until I tried it with your document:
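PHP Code:

    // Fallback when no page count was found: count the "/Type/Page" markers.
    // Note that this pattern also matches "/Type/Pages" tree nodes.
    if ($max === 0) {
        if ($fp = @fopen($file, "r")) {
            while (!feof($fp)) {
                $line = fgets($fp, 255);
                if (preg_match('/\/Type\/Page/', $line)) $max++;
            }
            fclose($fp);
        }
    }
    return $max;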
That returned 28 pages instead of 16. So I had a look at "Count The Number of Pages in a PDF - PHP - Snipplr" (thanks, wirehopper) and the first message gave me the solution (using Ubuntu Karmic):
PHP Code:

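// escapeshellarg() keeps spaces or shell metacharacters in the file name from breaking the command.
$pages = (int) exec('pdfinfo ' . escapeshellarg($file) . ' | awk \'/Pages/ {print $2}\'');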

It worked perfectly on your document. I will keep this one. Simple, fast and efficient.
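
On the overcount above (28 instead of 16): one likely contributor is that the pattern '/Type/Page' also matches '/Type/Pages', the page-tree nodes that group pages. Below is a minimal sketch of a stricter count. It assumes the page objects are stored uncompressed and it runs on synthetic input rather than a real PDF. The name countPageObjects is hypothetical, and this is still a heuristic, not a PDF parser.

```php
<?php
// Hypothetical helper: count "/Type /Page" markers without also counting
// "/Type /Pages" tree nodes. Heuristic only; it cannot see page objects
// stored inside compressed object streams (possible from PDF 1.5 onward).
function countPageObjects(string $pdfBytes): int
{
    // (?![A-Za-z]) rejects "/Pages" and any other longer name.
    $n = preg_match_all('/\/Type\s*\/Page(?![A-Za-z])/', $pdfBytes, $m);
    return $n === false ? 0 : $n;
}

// Synthetic input standing in for raw PDF content (not a real PDF).
$sample = "<< /Type /Pages /Count 2 /Kids [3 0 R 4 0 R] >>\n"
        . "<< /Type /Page /Parent 2 0 R >>\n"
        . "<</Type/Page/Parent 2 0 R>>\n";

echo countPageObjects($sample), "\n"; // prints 2
```

With a real file this would be called as countPageObjects(file_get_contents($file)). Reading the whole file at once also avoids missing a marker that a 255-byte fgets() read happens to split in two.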
  #14 (permalink)  
Old 02-22-10, 03:51 PM
adrianbj
I actually went with a combined approach. Since I have exec disabled on my server I wanted to stick with a PHP based solution, so ended up with this:

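Code:
function getNumPagesPdf($filepath){
	// Strip any bracketed "[...]" part from the path before opening the file.
	$fp = @fopen(preg_replace("/\[(.*?)\]/i", "", $filepath), "r");
	$max = 0;
	// Only scan if the file opened; feof() on a failed handle never ends the loop.
	if ($fp) {
		while (!feof($fp)) {
			$line = fgets($fp, 255);
			// Keep the largest "/Count N" value seen in the file.
			if (preg_match('/\/Count ([0-9]+)/', $line, $matches)) {
				$max = max($max, (int) $matches[1]);
			}
		}
		fclose($fp);
	}
	// No Count entries found: fall back to the (slow) Imagick extension.
	if ($max == 0) {
		$im = new Imagick($filepath);
		$max = $im->getNumberImages();
	}

	return $max;
}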
If it can't figure things out because there are no Count tags, then it falls back to the Imagick PHP extension. The reason for the two-fold approach is that the latter is quite slow.
  #15 (permalink)  
Old 02-24-10, 03:37 AM
Baboum
I confirm that the ImageMagick solution is quite slow. That's why I didn't use it. I fear that too many users using it at the same time might crash my server.
  #16 (permalink)  
Old 02-24-10, 03:46 AM
adrianbj
Agreed, it is very slow, but also very effective, so I ended up using this code at the time the PDF is uploaded via my CMS. The result is then stored in the database along with all the other data about the file. Should circumvent any speed/load issues.
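
The count-once-at-upload idea could look roughly like the sketch below. Everything named here is an assumption for illustration: the thread does not show the CMS code, so the use of PDO, the table files, its columns path and pages, and the function names storePageCount and lookupPageCount are all hypothetical. The $counter argument stands for whichever page-counting function is in use.

```php
<?php
// Hypothetical schema and names; the thread does not show the real CMS code.
// Assumed table: files (path TEXT PRIMARY KEY, pages INTEGER).

// Run the slow page count once, at upload time, and store the result.
function storePageCount(PDO $db, string $path, callable $counter): int
{
    $pages = (int) $counter($path);
    $stmt = $db->prepare('INSERT INTO files (path, pages) VALUES (?, ?)');
    $stmt->execute([$path, $pages]);
    return $pages;
}

// Later requests read the stored value instead of re-scanning the PDF.
function lookupPageCount(PDO $db, string $path): ?int
{
    $stmt = $db->prepare('SELECT pages FROM files WHERE path = ?');
    $stmt->execute([$path]);
    $value = $stmt->fetchColumn();
    return $value === false ? null : (int) $value;
}
```

The upload handler would call storePageCount($db, $filepath, 'getNumPagesPdf') once, and page views would call lookupPageCount($db, $filepath), so the slow path never runs per request.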
  #17 (permalink)  
Old 03-01-10, 02:24 PM
adrianbj
Just wanted to add that, unfortunately, the "preg_match('/\/Count [0-9]+/', $line, $matches)" approach actually misreports sometimes, so you cannot rely on my two-fold approach. Even if it returns a number other than zero, that does not mean it is correct. It usually is, but I have had at least half a dozen cases now where it was wrong: sometimes off by only a few, but other times not even close!
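
One possible reason for the misreports: /Count is not specific to pages. The document outline (bookmarks) dictionary carries a /Count entry too, so a file with more bookmarks than pages can return the bookmark total. The sketch below shows this on synthetic input, not a real PDF. The names maxCount and maxPagesCount are hypothetical. The second function is only a stricter heuristic and will still fail on nested dictionaries or compressed object streams.

```php
<?php
// Mirrors the "largest /Count wins" scan from the earlier post.
function maxCount(string $pdfBytes): int
{
    preg_match_all('/\/Count ([0-9]+)/', $pdfBytes, $m);
    return $m[1] ? max(array_map('intval', $m[1])) : 0;
}

// Hypothetical stricter variant: only accept a /Count that sits in the same
// flat dictionary as "/Type /Pages". Still a heuristic, not a PDF parser.
function maxPagesCount(string $pdfBytes): int
{
    preg_match_all('/<<[^<>]*\/Type\s*\/Pages\b[^<>]*>>/', $pdfBytes, $dicts);
    $best = 0;
    foreach ($dicts[0] as $dict) {
        if (preg_match('/\/Count\s+([0-9]+)/', $dict, $m)) {
            $best = max($best, (int) $m[1]);
        }
    }
    return $best;
}

// Synthetic input standing in for raw PDF content (not a real PDF).
$sample = "<< /Type /Outlines /First 9 0 R /Last 48 0 R /Count 40 >>\n"
        . "<< /Type /Pages /Kids [3 0 R 4 0 R] /Count 16 >>\n";

echo maxCount($sample), "\n";       // prints 40: the outline entry wins
echo maxPagesCount($sample), "\n";  // prints 16
```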
  #18 (permalink)  
Old 03-01-10, 08:14 PM
adrianbj
ImageMagick really did end up being too slow on PDFs with hundreds of pages, so I gave FPDI a go, and it seems quick and accurate so far. I used it with TCPDF, since I am already using that, rather than FPDF, which is the default. Don't forget that you will also need FPDF_TPL. You can get both from here:
http://www.setasign.de/products/pdf-.../fpdi/fpdi.php

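require_once('../../tcpdf/tcpdf.php');
require_once('../../fpdi/fpdi.php');

// Plain assignment: "=& new" is deprecated in PHP 5.3 and removed in PHP 7.
$pdf = new FPDI();

// setSourceFile() returns the number of pages in the source PDF.
$pagecount = $pdf->setSourceFile($filepath);
return $pagecount;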
All times are GMT -5.