Content Filtering

#1
08-10-06, 11:57 PM
zendobi

OK, like I mentioned before, I have a couple of content sites that anyone can submit articles and tutorials to. However, I have been running into the problem of some people taking other people's articles, changing a few words here and there, and posting them as their own. So I am looking for anyone's suggestions on building a content filter. I have considered doing a line-by-line comparison against all the other articles in the database that have a similar keyword density, but that could be very server intensive. Any ideas are welcome.
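
For reference, the brute-force comparison described above might look roughly like the sketch below. It is written in Python for brevity, though the logic ports directly to PHP. The function names, the in-memory `corpus` dictionary, and the 0.8 threshold are all assumptions made for illustration. The sketch compares word sequences rather than whole lines, on the assumption that a line with a single changed word should still count as mostly matching.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a 0..1 ratio of how much two texts share, word by word."""
    a_words = a.lower().split()
    b_words = b.lower().split()
    # autojunk=False disables difflib's popularity heuristic for long inputs.
    return SequenceMatcher(None, a_words, b_words, autojunk=False).ratio()


def find_similar(new_text: str, corpus: dict, threshold: float = 0.8) -> list:
    """Compare a submission against every stored article (hypothetical corpus).

    Returns (article_id, score) pairs at or above the threshold. Every stored
    article is compared in full, which is why this gets expensive quickly.
    """
    hits = []
    for article_id, text in corpus.items():
        score = similarity(new_text, text)
        if score >= threshold:
            hits.append((article_id, score))
    return hits
```

Each call to `similarity` walks both texts in full, so the total cost grows with the number of stored articles times their length. That is the load the cheaper checks discussed in the replies are meant to avoid.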
__________________
Mike Miller
ZapContent - Free Article Submission for Authors | Free Content for Publishers

Last edited by zendobi; 08-11-06 at 12:01 AM.

#2
08-11-06, 07:40 PM
Patiek

You should try implementing multiple levels. Don't just do line-by-line for everything. Instead, run cheap initial checks first to eliminate the need to go line-by-line. For example, you could check the lengths against each other, and only compare two articles if the difference between their lengths falls under a certain value.

Thus, you could do a lot of elimination before you even get to anything intensive. Try to think of some other possible eliminating factors and use them to rule out articles that cannot be copies. Only the articles they can't rule out should go through the deeper analysis.
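
A minimal sketch of the length check as a first elimination stage, in Python (the same logic is straightforward in PHP). The function name, the `corpus` dictionary, and the 15% tolerance are assumptions for illustration, not values from this thread.

```python
def length_prefilter(new_text: str, corpus: dict, max_diff: float = 0.15) -> list:
    """Return ids of stored articles whose word count is close to the submission's.

    max_diff is the largest allowed difference, as a fraction of the longer
    of the two texts. Articles outside that range are eliminated without any
    further comparison.
    """
    new_len = len(new_text.split())
    candidates = []
    for article_id, text in corpus.items():
        stored_len = len(text.split())
        longest = max(new_len, stored_len)
        if longest == 0 or abs(new_len - stored_len) / longest <= max_diff:
            candidates.append(article_id)
    return candidates
```

If the word count of each article were stored alongside it in the database, this stage could probably be reduced to a single range query, so that only the surviving candidates are ever loaded for deeper analysis.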

#3
08-11-06, 07:47 PM
zendobi

Yeah, I actually had an idea last night that will be of some help. I am going to try doing a keyword density test on the articles, since that's easy, and if two articles don't have a similar density "profile" then there is no need to check further. Then I suppose I could strip all the common words and extra whitespace and do a phrase-match type of search. A length match like you suggested is a good place to start before the keyword profiling. That will narrow down the field.
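
The two stages described above could be sketched as follows, again in Python. Everything here is an assumption for illustration: the short `STOP_WORDS` list, the use of cosine similarity to compare density profiles, and the use of three-word phrases for the phrase match. None of these choices come from the thread itself.

```python
import re
from collections import Counter
from math import sqrt

# A deliberately short, hypothetical list of common words to strip.
STOP_WORDS = frozenset({
    "a", "an", "and", "are", "as", "at", "be", "by", "for", "from", "in",
    "is", "it", "of", "on", "or", "that", "the", "this", "to", "was", "with",
})


def keywords(text: str) -> list:
    """Lowercase the text, split it into words, and drop the common words."""
    return [w for w in re.findall(r"[a-z0-9']+", text.lower())
            if w not in STOP_WORDS]


def density_profile(text: str) -> dict:
    """Map each keyword to its share of all keywords in the text."""
    words = keywords(text)
    total = len(words)
    if total == 0:
        return {}
    return {word: count / total for word, count in Counter(words).items()}


def profile_similarity(p: dict, q: dict) -> float:
    """Cosine similarity between two density profiles (0 = nothing shared)."""
    dot = sum(value * q.get(word, 0.0) for word, value in p.items())
    norm = sqrt(sum(v * v for v in p.values())) * sqrt(sum(v * v for v in q.values()))
    return dot / norm if norm else 0.0


def phrases(text: str, size: int = 3) -> set:
    """Return the set of consecutive keyword runs of the given size."""
    words = keywords(text)
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def phrase_overlap(a: str, b: str, size: int = 3) -> float:
    """Fraction of the smaller text's phrases that also occur in the other."""
    pa, pb = phrases(a, size), phrases(b, size)
    if not pa or not pb:
        return 0.0
    return len(pa & pb) / min(len(pa), len(pb))
```

A plausible order would be to run `profile_similarity` first, since a profile can be computed once per article and stored, and to run `phrase_overlap` only on pairs whose profiles are close. Suitable cutoffs for either score would have to be tuned on real submissions.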

All times are GMT -5.