I'm currently working with a huge text file (over 2 MB and 34,000 lines of text) whose lines are separated into columns (probably using the \t special character). I want to calculate some values from those columns.
Example:
Column 1 refers to Height, #2 to Pressure and #3 to Temperature. I know that in Perl it is possible to go through the file and search for certain values in those columns (e.g. the highest value, the average, etc.). I just want to know how to handle such a file using Perl.
I wouldn't say 2 MB is a huge file. It's certainly not small, but almost any computer can handle this amount of data easily.
You want to look into using open() to open and read the file and split() to split the lines into the separate columns, most likely using a hash or an array to build a dataset out of the file, and then run comparisons to get the results you want.
Here's an example to build a basic dataset (an array of arrays) from the file:
Code:
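use strict;
use warnings;

my @AoA;
# Three-argument open with a lexical filehandle
open(my $fh, '<', 'file.txt') or die "Can't open file.txt: $!";
while (my $line = <$fh>) {
    chomp($line);
    # Store each line as a reference to an array of its columns
    push @AoA, [ split(/\s+/, $line) ];
}
close($fh);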
Now you would use @AoA to do whatever it is you need to do.
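As a rough sketch of what "doing something" with @AoA could look like, here is one way to get the highest and the average value of a column. The sample rows and the column() helper are made up for illustration, and the column order (height, pressure, temperature) is assumed from the question. In a real script @AoA would come from reading the file instead.

```perl
use strict;
use warnings;
use List::Util qw(max sum);

# Hypothetical sample rows: [height, pressure, temperature]
my @AoA = (
    [ 0,    1013, 15 ],
    [ 1000, 899,  9  ],
    [ 2000, 795,  2  ],
);

# Hypothetical helper: return one column of the dataset as a flat list.
# $index is 0-based, so temperature (column #3) is index 2.
sub column {
    my ($rows, $index) = @_;
    return map { $_->[$index] } @$rows;
}

my @temperature = column(\@AoA, 2);
my $highest     = max(@temperature);
my $average     = sum(@temperature) / @temperature;   # array in numeric context = row count

printf "Highest temperature: %s\n",   $highest;
printf "Average temperature: %.2f\n", $average;
```

List::Util ships with Perl, so nothing extra needs to be installed. If the file can contain blank lines or a header line, those rows would have to be skipped before doing the arithmetic.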
/\s+/ means to split the line on one or more whitespace characters: \s matches any single whitespace character (a space, a tab, a newline and so on) and the + symbol is a quantifier meaning one or more. A tab (\t) is one kind of whitespace, so \s+ is more flexible than \t when you are not sure what separates the columns. But if you are certain the file is tab delimited you could use /\t/ instead of /\s+/, which also preserves empty columns and values that contain spaces.
I'm having this small problem. How can I remove one row of my AoA? I'm thinking of using splice, but that removes only one element. Should I use some kind of loop to do this?
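To see how the choice of pattern changes the result, here is a small sketch using made-up input lines (they are not taken from the actual file). It compares /\t/, /\s+/ and the special ' ' form of split.

```perl
use strict;
use warnings;

# Hypothetical line: three tab-separated columns, the middle one empty
my $line = "100\t\t25";

my @by_tab        = split /\t/,  $line;   # ("100", "", "25"): empty column kept
my @by_whitespace = split /\s+/, $line;   # ("100", "25"): both tabs act as one separator

# Hypothetical line with leading spaces
my $padded = "  100 25";

my @regex_split = split /\s+/, $padded;   # ("", "100", "25"): leading empty field
my @awk_split   = split ' ',   $padded;   # ("100", "25"): leading whitespace skipped

print scalar(@by_tab),        " fields with /\\t/\n";
print scalar(@by_whitespace), " fields with /\\s+/\n";
print scalar(@regex_split),   " fields with /\\s+/ on the padded line\n";
print scalar(@awk_split),     " fields with ' ' on the padded line\n";
```

So /\t/ is the safer choice when the file really is tab delimited and a column may be empty, while split ' ' is handy for loosely aligned data where lines may start with spaces.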
Example:
[1][2][3]
[4][5][6]
[7][8][9]
I want to remove the second row and have something like: