Autolink Script - Link once/keyword

  #1 (permalink)  
Old 09-22-08, 07:14 PM
cocaine_energy_drink cocaine_energy_drink is offline

I found this auto-linking script that works very nicely. So far I have made a few changes to it: it now builds the path to an HTML file from the request URL and pulls the content from that file, and it also filters out links to the current page from a site-wide list of links. However, I have one more challenge to tackle before the script is satisfactory. The script automatically links chosen keywords, which is great, but it links every occurrence of each chosen keyword on the page. For example, if I am trying to link the keyword "fish food" to a fish food page, every occurrence of the phrase "fish food" will be linked to that page. I would like it to link only the first occurrence of each chosen keyword. If I'm talking about fish food and there are 20 links on the page with that anchor text, it begins to look ridiculous. The first few tweaks were easy, but on this one I just don't have a clue. Take a look, here are the scripts:


Here we have the main page that pulls the content from an html file:
PHP Code:

$file = "content".$_SERVER["REQUEST_URI"];

$data = file_get_contents($file);

/**********************************************
This is the automatic keyword generator class
***********************************************/

//this is the actual application.
include('autolink/class.autokeyword.php');

$params['content'] = $data; //page content

//set the length of keywords you like
$params['min_word_length'] = 5;  //minimum length of single words
$params['min_word_occur'] = 5;  //minimum occur of single words
$params['min_2words_length'] = 3;  //minimum length of words for 2 word phrases
$params['min_2words_phrase_length'] = 10;  //minimum length of 2 word phrases
$params['min_2words_phrase_occur'] = 2;  //minimum occur of 2 words phrase
$params['min_3words_length'] = 3;  //minimum length of words for 3 word phrases
$params['min_3words_phrase_length'] = 10;  //minimum length of 3 word phrases
$params['min_3words_phrase_occur'] = 2;  //minimum occur of 3 words phrase



$keyword = new autokeyword($params, "iso-8859-1");

$keywords = $keyword->get_keywords();

$origin = $_SERVER["REQUEST_URI"];

/**********************************************
This is the start of the auto link keyword!
***********************************************/
// this list could be an output of a database query
// or just from a plain text file.
$linkfile = 'autolink/linkedKeywords.php';
//read the file
$fh = fopen($linkfile, 'r') or die("can't read ".$linkfile." file!");
$keyword_array = array();
while (!feof($fh)) {
    $s = rtrim(fgets($fh, 1024));

    list($word, $link) = explode(',', $s);
    $word = trim($word);
    $link = trim($link);
    //skip links that point back to the current page
    if ($link != $origin) {
        $keyword_array[$word] = $link;
    }
}
fclose($fh) or die("can't close file ".$linkfile."!");

include('autolink/class.autolink.php');

$autolink = new autolink($keywords, $keyword_array, $data, "link", "link", "i");
echo $autolink->linkKeywords();
As you can see, the params are used to filter the keywords. They have no impact on the number of link occurrences; they simply drop any keyword that falls below the set minimum length or minimum number of occurrences.
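To make the effect of an occurrence threshold such as `min_word_occur` concrete, here is a minimal stand-alone sketch that mirrors what `occure_filter()` in the class below does. The function name `filterByOccurrence` and the sample text are made up for illustration and are not part of the original script.

```php
<?php
// Hypothetical stand-alone version of the occurrence filter:
// count each word, keep only those seen at least $minOccur times.
function filterByOccurrence(array $words, $minOccur)
{
    $kept = array();
    foreach (array_count_values($words) as $word => $count) {
        if ($count >= $minOccur) {
            $kept[$word] = $count;
        }
    }
    arsort($kept); // most frequent first, keys preserved
    return $kept;
}

// Sample input (assumption): "fish" appears 3 times, "food" twice,
// "tank" and "pond" once each.
$words  = explode(' ', 'fish food fish tank fish food pond');
$result = filterByOccurrence($words, 2);
print_r($result);
```

With a threshold of 2, only "fish" and "food" survive as keyword candidates; how many times each one is later linked is decided elsewhere.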

Next we have the class.autokeyword.php file:
PHP Code:

class autokeyword {

    //declare variables
    //the site contents
    var $contents;
    var $encoding;
    //the generated keywords
    var $keywords;
    //minimum word length for inclusion into the single word
    //metakeys
    var $wordLengthMin;
    var $wordOccuredMin;
    //minimum word length for inclusion into the 2 word
    //phrase metakeys
    var $word2WordPhraseLengthMin;
    var $phrase2WordLengthMinOccur;
    //minimum word length for inclusion into the 3 word
    //phrase metakeys
    var $word3WordPhraseLengthMin;
    //minimum phrase length for inclusion into the 2 word
    //phrase metakeys
    var $phrase2WordLengthMin;
    var $phrase3WordLengthMinOccur;
    //minimum phrase length for inclusion into the 3 word
    //phrase metakeys
    var $phrase3WordLengthMin;

    function autokeyword($params, $encoding)
    {
        //get parameters
        $this->encoding = $encoding;
        mb_internal_encoding($encoding);
        $this->contents = $this->replace_chars($params['content']);

        // single word
        $this->wordLengthMin = $params['min_word_length'];
        $this->wordOccuredMin = $params['min_word_occur'];

        // 2 word phrase
        $this->word2WordPhraseLengthMin = $params['min_2words_length'];
        $this->phrase2WordLengthMin = $params['min_2words_phrase_length'];
        $this->phrase2WordLengthMinOccur = $params['min_2words_phrase_occur'];

        // 3 word phrase
        $this->word3WordPhraseLengthMin = $params['min_3words_length'];
        $this->phrase3WordLengthMin = $params['min_3words_phrase_length'];
        $this->phrase3WordLengthMinOccur = $params['min_3words_phrase_occur'];

        //parse single, two words and three words

    }

    function get_keywords()
    {
        $keywords = $this->parse_words().$this->parse_2words().$this->parse_3words();
        //drop the trailing ", " left by implode()
        return substr($keywords, 0, -2);
    }

    
    //turn the site contents into an array
    //then replace common html tags.
    function replace_chars($content)
    {
        //convert all characters to lower case
        $content = mb_strtolower($content);
        //$content = mb_strtolower($content, "UTF-8");
        $content = strip_tags($content);

        $punctuations = array(',', ')', '(', '.', "'", '"',
        '<', '>', ';', '!', '?', '/', '-',
        '_', '[', ']', ':', '+', '=', '#',
        '$', '&quot;', '&copy;', '&gt;', '&lt;',
        chr(10), chr(13), chr(9));

        $content = str_replace($punctuations, " ", $content);
        // replace multiple gaps
        $content = preg_replace('/ {2,}/si', " ", $content);

        return $content;
    }

    
    //single words META KEYWORDS
    function parse_words()
    {
        //list of commonly used words
        // this can be edited to suit your needs
        $common = array("able", "about", "above", "act", "add", "afraid", "after", "again", "against", "age", "ago", "agree", "all", "almost", "alone", "along", "already", "also", "although", "always", "am", "amount", "an", "and", "anger", "angry", "animal", "another", "answer", "any", "appear", "apple", "are", "arrive", "arm", "arms", "around", "arrive", "as", "ask", "at", "attempt", "aunt", "away", "back", "bad", "bag", "bay", "be", "became", "because", "become", "been", "before", "began", "begin", "behind", "being", "bell", "belong", "below", "beside", "best", "better", "between", "beyond", "big", "body", "bone", "born", "borrow", "both", "bottom", "box", "boy", "break", "bring", "brought", "bug", "built", "busy", "but", "buy", "by", "call", "came", "can", "cause", "choose", "close", "close", "consider", "come", "consider", "considerable", "contain", "continue", "could", "cry", "cut", "dare", "dark", "deal", "dear", "decide", "deep", "did", "die", "do", "does", "dog", "done", "doubt", "down", "during", "each", "ear", "early", "eat", "effort", "either", "else", "end", "enjoy", "enough", "enter", "even", "ever", "every", "except", "expect", "explain", "fail", "fall", "far", "fat", "favor", "fear", "feel", "feet", "fell", "felt", "few", "fill", "find", "fit", "fly", "follow", "for", "forever", "forget", "from", "front", "gave", "get", "gives", "goes", "gone", "good", "got", "gray", "great", "green", "grew", "grow", "guess", "had", "half", "hang", "happen", "has", "hat", "have", "he", "hear", "heard", "held", "hello", "help", "her", "here", "hers", "high", "hill", "him", "his", "hit", "hold", "hot", "how", "however", "I", "if", "ill", "in", "indeed", "instead", "into", "iron", "is", "it", "its", "just", "keep", "kept", "knew", "know", "known", "late", "least", "led", "left", "lend", "less", "let", "like", "likely", "likr", "lone", "long", "look", "lot", "make", "many", "may", "me", "mean", "met", "might", "mile", "mine", "moon", "more", "most", "move", "much", "must", "my", "near", "nearly", "necessary", "neither", "never", "next", "no", "none", "nor", "not", "note", "nothing", "now", "number", "of", "off", "often", "oh", "on", "once", "only", "or", "other", "ought", "our", "out", "please", "prepare", "probable", "pull", "pure", "push", "put", "raise", "ran", "rather", "reach", "realize", "reply", "require", "rest", "run", "said", "same", "sat", "saw", "say", "see", "seem", "seen", "self", "sell", "sent", "separate", "set", "shall", "she", "should", "side", "sign", "since", "so", "sold", "some", "soon", "sorry", "stay", "step", "stick", "still", "stood", "such", "sudden", "suppose", "take", "taken", "talk", "tall", "tell", "ten", "than", "thank", "that", "the", "their", "them", "then", "there", "therefore", "these", "they", "this", "those", "though", "through", "till", "to", "today", "told", "tomorrow", "too", "took", "tore", "tought", "toward", "tried", "tries", "trust", "try", "turn", "two", "under", "until", "up", "upon", "us", "use", "usual", "various", "verb", "very", "visit", "want", "was", "we", "well", "went", "were", "what", "when", "where", "whether", "which", "while", "white", "who", "whom", "whose", "why", "will", "with", "within", "without", "would", "yes", "yet", "you", "young", "your", "br", "img", "p", "lt", "gt", "quot", "copy");
        //create an array out of the site contents
        $s = split(" ", $this->contents);
        //initialize array
        $k = array();
        //iterate inside the array
        foreach( $s as $key=>$val ) {
            //delete single or two letter words and
            //Add it to the list if the word is not
            //contained in the common words list.
            if(mb_strlen(trim($val)) >= $this->wordLengthMin  && !in_array(trim($val), $common)  && !is_numeric(trim($val))) {
                $k[] = trim($val);
            }
        }
        //count the words
        $k = array_count_values($k);
        //sort the words from
        //highest count to the
        //lowest.
        $occur_filtered = $this->occure_filter($k, $this->wordOccuredMin);
        arsort($occur_filtered);

        $imploded = $this->implode(", ", $occur_filtered);
        //release unused variables
        unset($k);
        unset($s);

        return $imploded;
    }

    function parse_2words()
    {
        //create an array out of the site contents
        $x = split(" ", $this->contents);
        //initialize array so array_count_values() always receives one
        $y = array();
        for ($i = 0; $i < count($x) - 1; $i++) {
            //skip phrases whose words are shorter than the minimum word length
            if( (mb_strlen(trim($x[$i])) >= $this->word2WordPhraseLengthMin ) && (mb_strlen(trim($x[$i+1])) >= $this->word2WordPhraseLengthMin) )
            {
                $y[] = trim($x[$i])." ".trim($x[$i+1]);
            }
        }

        //count the 2 word phrases
        $y = array_count_values($y);

        $occur_filtered = $this->occure_filter($y, $this->phrase2WordLengthMinOccur);
        //sort the words from highest count to the lowest.
        arsort($occur_filtered);

        $imploded = $this->implode(", ", $occur_filtered);
        //release unused variables
        unset($y);
        unset($x);

        return $imploded;
    }

    function parse_3words()
    {
        //create an array out of the site contents
        $a = split(" ", $this->contents);
        //initialize array
        $b = array();

        for ($i = 0; $i < count($a) - 2; $i++) {
            //skip phrases whose words or total length are below the minimums
            if( (mb_strlen(trim($a[$i])) >= $this->word3WordPhraseLengthMin) && (mb_strlen(trim($a[$i+1])) > $this->word3WordPhraseLengthMin) && (mb_strlen(trim($a[$i+2])) > $this->word3WordPhraseLengthMin) && (mb_strlen(trim($a[$i]).trim($a[$i+1]).trim($a[$i+2])) > $this->phrase3WordLengthMin) )
            {
                $b[] = trim($a[$i])." ".trim($a[$i+1])." ".trim($a[$i+2]);
            }
        }

        //count the 3 word phrases
        $b = array_count_values($b);
        //sort the words from
        //highest count to the
        //lowest.
        $occur_filtered = $this->occure_filter($b, $this->phrase3WordLengthMinOccur);
        arsort($occur_filtered);

        $imploded = $this->implode(", ", $occur_filtered);
        //release unused variables
        unset($a);
        unset($b);

        return $imploded;
    }

    function occure_filter($array_count_values, $min_occur)
    {
        $occur_filtered = array();
        foreach ($array_count_values as $word => $occured) {
            if ($occured >= $min_occur) {
                $occur_filtered[$word] = $occured;
            }
        }

        return $occur_filtered;
    }

    //join the array keys (the keywords), each followed by the glue string
    function implode($gule, $array)
    {
        $c = "";
        foreach($array as $key=>$val) {
            @$c .= $key.$gule;
        }
        return $c;
    }
}

Of course there is a linkedKeywords file but that's irrelevant here.

I'm pretty sure what I need to tweak is the class.autolink script, so here is class.autolink.php:
PHP Code:

class autolink {

    // initialize class
    // $keywords : keywords list output from automatic keyword class
    // $linksArray : list of predetermined links related to keywords
    // $content : the article contents
    function autolink($keywords, $linksArray, $content, $id = NULL, $class = NULL, $type = NULL){
        $this->links = $linksArray;
        //get the links keys
        $this->links_keys = array_keys($this->links);
        //convert the keyword list into an array
        $this->keywords = split(",", $keywords);
        $this->content = $content;
        //CSS formatting
        $this->id = $id;
        $this->class = $class;
        //replacement type
        //CASE SENSITIVE : $type = NULL
        //CASE INSENSITIVE : $type = "i";
        $this->type = $type;
    }

    // link the keyword if it is contained
    // in the $content
    function linkKeywords(){
        //iterate into each keyword
        foreach($this->keywords as $word){
            //strip white spaces.
            $word = trim($word);
            //initialize $replacedKeyword
            $replacedKeyword = "";
            //check if keyword is found in the
            //predetermined list of links
            if(in_array($word, $this->links_keys)){
                //if found check if the word is found in the article
                if (stristr($this->content, $word)){
                    //convert the $keyword into a link
                    //which includes CSS formatting $id & $class.
                    $replacedKeyword = '<a href="'.$this->links[$word].'" id="'.$this->id.'" class="'.$this->class.'">'.$word.'</a>';
                    //find whole word only
                    //this prevents replacement of words contained
                    //in compound words.
                    $whole_word = "/\\b(" . trim($word) . ")\\b/".$this->type;
                    //replace every match of the keyword in the
                    //article contents with the link.
                    $this->content = preg_replace($whole_word, $replacedKeyword, $this->content);
                }
            }
        }
        return $this->content;
    }
}


And that's it. The first script ties everything together, the second script pulls the content and identifies the keywords, and the third script creates the links and returns the content auto linked. My guess would be to place some sort of filter in the class.autolink file but beyond that I'm lost. Any ideas?
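One direction that may be worth trying: PHP's `preg_replace()` accepts an optional fourth argument, `$limit`, that caps how many matches are replaced for each pattern. The sketch below is a stand-alone illustration of that idea, not a tested patch to the class. The function name `linkFirstOccurrence`, the sample text, and the URL are made up. It also uses `preg_quote()` and the `$1` backreference, neither of which is in the original script.

```php
<?php
// Hypothetical helper (not part of class.autolink.php): link only the
// first whole-word match of each keyword in $content.
// Assumption: the URLs contain no "$" or "\" sequences that preg_replace
// would read as backreferences in the replacement string.
function linkFirstOccurrence($content, array $links, $class = 'link')
{
    foreach ($links as $word => $url) {
        // Escape regex metacharacters in the keyword; "/" is the delimiter.
        $pattern = '/\b(' . preg_quote($word, '/') . ')\b/i';
        // $1 re-inserts the matched text, so its capitalisation is kept.
        $replacement = '<a href="' . htmlspecialchars($url)
                     . '" class="' . $class . '">$1</a>';
        // Fourth argument = 1: replace at most one match of this pattern.
        $content = preg_replace($pattern, $replacement, $content, 1);
    }
    return $content;
}

$text   = 'Fish food is cheap. Buy fish food here. Fish food rocks.';
$linked = linkFirstOccurrence($text, array('fish food' => '/fish-food.html'));
echo $linked;
```

In the class itself, the equivalent change would presumably be adding `1` as the fourth argument of the `preg_replace()` call in `linkKeywords()`. Two caveats apply to this sketch as well as to the original code: a later keyword can still match inside a link inserted for an earlier keyword, and the pattern does not distinguish visible text from text inside HTML tags or attributes.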
  #2 (permalink)  
Old 09-22-08, 07:19 PM
cocaine_energy_drink cocaine_energy_drink is offline
I already posted this in the script request forum, but I'm pretty sure it actually belongs here... Again, any suggestions, even a hint in the right direction, would be greatly appreciated.