09-26-08, 04:08 PM
wirehopper
PHP Code:

// List the file as a link, so it can easily be viewed and checked
echo "<a href=\"$url\" target=\"_blank\">$url</a><br />\n";

// This is used to extract all the references from src, href, url, action, window.open, and popup
$ref = run_preg($text, "/(?:(?:src|href|url|action|window\.open|popup)+\s*[=\(]+\s*[\"'`]*)([\+\w:?=@&\/#._;-]+)(?:[\s\"'`])/i");

function run_preg($text, $pattern)
{
    // Handles the preg for the scan: collect every match of $pattern in $text
    preg_match_all($pattern, $text, $matches);

    // $matches[1] holds the first capture group, i.e. the referenced URLs
    return (is_array($matches)) ? $matches[1] : FALSE;
}

These are extracts from robots.spider (http://robots-wizard.com/robots.spider/). You can download the full code at http://robots-wizard.com/robots.spider.tar.gz.
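
Here is a rough, self-contained sketch of how the extraction could be tried on its own. The sample markup and the $sample and $refs names are made up for illustration and are not part of robots.spider. The dot in window\.open is escaped here so it only matches a literal dot.

```php
<?php
// Hypothetical sample markup, only for trying out the pattern
$sample = <<<'HTML'
<a href="http://example.com/page.html">Page</a>
<img src='images/logo.png' />
<form action="/submit.php" method="post">
HTML;

function run_preg($text, $pattern)
{
    // Collect every match of $pattern in $text
    preg_match_all($pattern, $text, $matches);

    // Group 1 is the reference itself, without the attribute name or quotes
    return (is_array($matches)) ? $matches[1] : FALSE;
}

// Same idea as the pattern in the post: an attribute or call name,
// then = or (, optional quotes, then the reference, then a quote or whitespace
$pattern = "/(?:(?:src|href|url|action|window\.open|popup)+\s*[=\(]+\s*[\"'`]*)([\+\w:?=@&\/#._;-]+)(?:[\s\"'`])/i";

$refs = run_preg($sample, $pattern);
print_r($refs);
```

With this sample, $refs should hold http://example.com/page.html, images/logo.png, and /submit.php. Note that the pattern expects a quote or whitespace right after the reference, so an unquoted CSS url(bg.gif) would not be picked up.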