Trying to parse a log but fgets not seeing new line

  #1 (permalink)  
Old 06-11-07, 03:35 AM
scott2500uk scott2500uk is offline
Coding Addict
 
Join Date: Apr 2006
Posts: 275
Thanks: 2
Thanked 2 Times in 2 Posts
Trying to parse a log but fgets not seeing new line

Let me explain what I'm trying to do before discussing the problem:

I'm using the Windows backup tool on Windows Server 2000. It outputs a log of what it has backed up as a simple text file: backup01.log

I'm getting PHP to read through the file, grab the bits of data that I need and place them in a DB, so that a user can come along, read the contents and determine whether the backups were successful, etc.

Here is the bit of PHP code I'm using to read the file one line at a time:

PHP Code:

$myFile = "backup05.log";

$fh = fopen($myFile, 'r');
while (($theData = fgets($fh)) !== false) {
    // each line of data here
}

So here is the problem. For some reason PHP is seeing the log file as one long line of data, not hundreds of lines. Here is a bit of the log file:

Code:
Backup Status
Operation: Backup
Active backup destination: 4mm DDS
Media name: "ALPSERVER01-Tue-06-05-2007-11-00p"

Backup of "D: Disk 1"
Backup set #1 on media #1
Backup description: "daily Tue-06-05-2007-11-00p"
Backup Type: Normal

Backup started on 6/5/2007 at 11:02 PM.
Folder D:\FMR
Bm90.gil.zip                               6206332   3/28/2002     1:04 PM
cesdata.dmg.zip                            1248237   3/28/2002     1:03 PM
CESData.GIL.HOY                            7986451   3/28/2002     1:02 PM
CESData.GIL.zip                            1938728   3/28/2002     1:46 PM
Cesdata.ste.zip                            4177201   3/28/2002     1:08 PM
Cesdata.wor.zip                            1338525    6/4/2001    12:41 PM
fmrstrip.TXT                                    88  11/15/2000    11:08 AM
Looking at the log file in Notepad, the data clearly has new lines, but PHP is seeing the new lines as something else. If I echo out the log in a web browser I get:

Code:
Backup Status 㰊牢>Operation: Backup 㰊牢>Active backup destination: 4mm DDS 㰊牢>Media name: "ALPSERVER01-Tue-06-05-2007-11-00p" 㰊牢> 㰊牢>Backup of "D: Disk 1" 㰊牢>Backup set #1 on media #1 㰊牢>Backup description: "daily Tue-06-05-2007-11-00p" 㰊牢>Backup Type: Normal 㰊牢> 㰊牢>Backup started on 6/5/2007 at 11:02 PM. 㰊牢>Folder D:\FMR 㰊牢>Bm90.gil.zip 6206332 3/28/2002 1:04 PM 㰊牢>cesdata.dmg.zip 1248237 3/28/2002 1:03 PM 㰊牢>CESData.GIL.HOY 7986451 3/28/2002 1:02 PM 㰊牢>CESData.GIL.zip 1938728 3/28/2002 1:46 PM 㰊牢>Cesdata.ste.zip 4177201 3/28/2002 1:08 PM 㰊牢>Cesdata.wor.zip 1338525 6/4/2001 12:41 PM 㰊牢>fmrstrip.TXT 88 11/15/2000 11:08 AM 㰊牢>fmrtrans.se 1034
Looking at the page source, the new lines are showing as ??>

I presume this is a charset problem, but I'm really lost as to what charset I should set the PHP file to and how to do it. Do I set it like this:

PHP Code:

header ('Content-type: text/html; charset=iso-8859-1'); 

If so, which charset should I set it to? What charset does Windows backup use in its logs?

Any input is greatly appreciated.

Cheers
  #2 (permalink)  
Old 06-11-07, 03:44 AM
fyrestrtr fyrestrtr is offline
Wannabe Coder
 
Join Date: Nov 2003
Posts: 191
Thanks: 0
Thanked 0 Times in 0 Posts
What are the locale settings for Windows? That would be the source of the problem.
__________________
Find me at WHT
  #3 (permalink)  
Old 06-11-07, 03:48 AM
scott2500uk scott2500uk is offline
Coding Addict
 
Join Date: Apr 2006
Posts: 275
Thanks: 2
Thanked 2 Times in 2 Posts
Current language settings are set to: Western Europe and United States. I presume then that the charset is one of these two:

charset=iso-8859-1 Western Alphabet
charset=windows-1252 Western Alphabet (Windows)

but with either of them set in the header, the log still shows the weird chars.

Last edited by scott2500uk; 06-11-07 at 04:16 AM.
  #4 (permalink)  
Old 06-11-07, 06:58 AM
mab mab is offline
Community VIP
 
Join Date: Oct 2005
Location: Denver, Co. USA
Posts: 2,674
Thanks: 0
Thanked 0 Times in 0 Posts
This is a newline problem, not a language setting problem. It is likely that your log file has \n\r instead of \r\n. Check this using an editor that displays the hex values of characters (on your original/actual log file). It is also possible that, if you opened and saved this file in an editor, the original newlines were altered to what they are now.

You can try turning on auto_detect_line_endings (this can be turned on in your script using an ini_set(...) statement):
Quote:
auto_detect_line_endings boolean
When turned on, PHP will examine the data read by fgets() and file() to see if it is using Unix, MS-Dos or Macintosh line-ending conventions.
Edit: Outputting this to a browser actually tells very little as \n or \r has no meaning to a browser. A <br> or <br /> is what causes a newline to be rendered in a browser. \n's or \r's in content output to a browser only format the "source" code of the web page.
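
If no hex editor is to hand, the same check can be done from PHP itself. A minimal sketch: hexPreview() is a hypothetical helper name, and the sample string is an assumption standing in for the first bytes of the real log (for example the result of file_get_contents('backup05.log')).

```php
<?php
// Hypothetical helper (not from the thread): show the first bytes of a string as hex pairs.
function hexPreview(string $data, int $limit = 32): string
{
    $bytes = substr($data, 0, $limit);
    // bin2hex() gives two hex digits per byte; chunk_split() adds a space after each pair.
    return trim(chunk_split(bin2hex($bytes), 2, ' '));
}

// Assumed sample: "Hi" followed by a line break, stored as two bytes per character.
$sample = "\xFF\xFEH\x00i\x00\r\x00\n\x00";
echo hexPreview($sample), PHP_EOL;
// prints: ff fe 48 00 69 00 0d 00 0a 00
```

In a dump like this, 0d 0a is \r\n and 0a 0d would be \n\r. A 00 after every visible character would point to a two-byte encoding rather than a line-ending problem.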
__________________
Error checking, error reporting, and error recovery. If your code does not have these to get it to tell you why it is not working, what makes you think someone in a programming forum will be able to tell you why it is not working???

Last edited by mab; 06-11-07 at 07:11 AM.
  #5 (permalink)  
Old 06-11-07, 07:49 AM
scott2500uk scott2500uk is offline
Coding Addict
 
Join Date: Apr 2006
Posts: 275
Thanks: 2
Thanked 2 Times in 2 Posts
I'm pretty sure this isn't a line ending problem. You can open a file with the b and t mode flags, which deal with different line endings. This didn't work in my case.

I continued to look into charsets and found the functions utf8_encode and utf8_decode.

When I ran that on the file, PHP was now seeing new lines, but the whole document was spaced out and unreadable in the source. I searched and searched and couldn't find a way around it. In the end I tried this just for the sake of it, and it worked:

PHP Code:

while ($theData = urldecode(urlencode(fgets($fh)))) {

I'm sure this isn't the correct way to go about it. Can anyone explain why this works and whether there is a better solution?

cheers
  #6 (permalink)  
Old 06-11-07, 11:37 AM
scott2500uk scott2500uk is offline
Coding Addict
 
Join Date: Apr 2006
Posts: 275
Thanks: 2
Thanked 2 Times in 2 Posts
^^ My last post is utter crap. Somehow I linked to the wrong file when I was testing things, one that had already been fixed, so I thought what I had done had fixed it. So anyway, I'm back to my original problem.

I tried setting auto_detect_line_endings on, but it had no effect. I believe this only helps when reading files from a Mac, or when you are on a Mac and trying to read Unix and Windows files.

I did a bit more investigating, and it's said that Windows adds null bytes at the beginning of each new line. Could that be the weird chars that I'm seeing? Is there a way I can read through the file and replace these null bytes with the correct new line?

Also, I ran mb_detect_encoding() on each line and the output is ASCII. Does this tell me anything?
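
A null byte at the start of every line, text that looks "spaced out", and the ord() value of 255 reported later in the thread would all be consistent with the log being UTF-16LE with an FF FE byte order mark. That is an assumption; the thread never confirms the encoding. In that case fgets() splits in the middle of a two-byte character, so a possible approach is to read the whole file, convert it and only then split it into lines. A minimal sketch, assuming the mbstring extension is loaded; utf16leToUtf8() is a hypothetical helper name and $raw stands in for file_get_contents('backup05.log').

```php
<?php
// Hypothetical helper: convert UTF-16LE bytes (with or without a BOM) to UTF-8.
function utf16leToUtf8(string $raw): string
{
    // Drop the UTF-16LE byte order mark (FF FE) if it is present.
    if (strncmp($raw, "\xFF\xFE", 2) === 0) {
        $raw = substr($raw, 2);
    }
    return mb_convert_encoding($raw, 'UTF-8', 'UTF-16LE');
}

// Assumed sample in place of the real file contents: "OK", a Windows line break, "End".
$raw = "\xFF\xFEO\x00K\x00\r\x00\n\x00E\x00n\x00d\x00";
$text = utf16leToUtf8($raw);

// After conversion the line breaks are ordinary \r\n, \n or \r sequences.
$lines = preg_split('/\r\n|\n|\r/', $text);
foreach ($lines as $line) {
    echo $line, PHP_EOL;
}
```

Converting keeps any non-ASCII characters intact, which simply deleting the null bytes would not.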
  #7 (permalink)  
Old 06-11-07, 11:55 AM
scott2500uk scott2500uk is offline
Coding Addict
 
Join Date: Apr 2006
Posts: 275
Thanks: 2
Thanked 2 Times in 2 Posts
It's definitely the null byte char at the beginning of the line.

When I do a urlencode it comes up as %00
and when I run ord() on the char I get 255.

How would I do a str_replace to replace a null char?
  #8 (permalink)  
Old 06-11-07, 12:01 PM
UnrealEd UnrealEd is offline
Community Liaison
 
Join Date: May 2005
Location: Antwerp, Belgium
Posts: 3,165
Thanks: 4
Thanked 25 Times in 25 Posts
Try this:
PHP Code:

$string = str_replace(chr(0), "", $string);

or, since the ordinal of the character is 255:
PHP Code:

$string = str_replace(chr(255), "", $string);

more on chr()
__________________
"Good judgement comes from experience, and experience comes from bad judgement." - Fred Brooks

  #9 (permalink)  
Old 06-11-07, 01:52 PM
scott2500uk scott2500uk is offline
Coding Addict
 
Join Date: Apr 2006
Posts: 275
Thanks: 2
Thanked 2 Times in 2 Posts
Thanks UnrealEd, I'll give that a go tomorrow when I'm back at work. Thanks for all the help, everyone.
  #10 (permalink)  
Old 06-12-07, 02:39 AM
scott2500uk scott2500uk is offline
Coding Addict
 
Join Date: Apr 2006
Posts: 275
Thanks: 2
Thanked 2 Times in 2 Posts
I did the str_replace with chr(255), but that replaced all the null bytes in the document, making it look like I had run it through utf8_encode: everything was spaced out and unreadable.

It turns out that the log is full of null bytes and other weird chars, so I did a preg_replace to remove all the unwanted ASCII chars.

PHP Code:

$theData = preg_replace('/[\x00-\x0c\x0b\x0c\x0e-\x1f\x7f-\x9f]/i', '', $theData);

Now the log looks great in the source and I can start working on breaking down the log and grabbing the bits I want from it.
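
For the next step, breaking the cleaned-up lines into fields, a regular expression per line is one option. A minimal sketch, assuming every file entry follows the column layout of the sample log earlier in the thread (name, size, date, time); parseEntry() and the array keys are hypothetical names, and the unit of the size column is not stated in the log.

```php
<?php
// Hypothetical helper: split one file-entry line of the log into its fields.
// Returns null for anything that is not a file entry (headers, "Folder ..." lines, blanks).
function parseEntry(string $line): ?array
{
    // name, then size (digits), then m/d/yyyy date, then h:mm AM/PM time
    $pattern = '/^(\S.*?)\s+(\d+)\s+(\d{1,2}\/\d{1,2}\/\d{4})\s+(\d{1,2}:\d{2} [AP]M)\s*$/';
    if (preg_match($pattern, $line, $m) !== 1) {
        return null;
    }
    return ['name' => $m[1], 'size' => (int) $m[2], 'date' => $m[3], 'time' => $m[4]];
}

$entry = parseEntry('fmrstrip.TXT                                    88  11/15/2000    11:08 AM');
print_r($entry);
```

Lines that return null can be handled separately, for example to pick up the "Backup started on" timestamp or the current folder.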

Cheers for all the input, guys!