Finding Max Column Length

by bitznbitez

Lets say you have a delimited file and you want to find the maximum length of all the columns, but your developer and the client who supplied the file cannot produce the information.   We as DBAs never face that situation right of course not.

If the delimiter does not show up in the data itself, even escaped or quoted, the following simple perl script will show the max length of each column

#!/usr/bin/perl
###########################################################################
# Given a delimiter char as an option parse stdin and give max len 
###########################################################################
sub usage()
{
  print STDERR "ERROR: USAGE Incorrect\n";
  print STDERR "$0 -d delim_char -n num_columns\n";
  exit -1;
}
use Getopt::Std;

$rec_count = 0;
my %opt;
$opt_string = "d:n:";
getopts("$opt_string", \%opt ) or usage();

if( ! exists $opt{d} || ! exists $opt{n}) {
 usage();
}

$opt{d} = "\\|" if $opt{d} eq "|";
$field_count = $opt{n};

for ( $i = 0; $i < $field_count; $i++){
  $field_max_len[$i] = 0;
}

while(<>){
  chomp ;
  $rec_count++;

  @file_fields = split($opt{d});
  for ( $i = 0; $i < $field_count; $i++){
    $curr_field_len = length($file_fields[$i]);
    $field_max_len[$i] = $curr_field_len if $curr_field_len > $field_max_len[$i]
  }
}

printf("Total Records : %d\n",$rec_count);
printf("Total Fields : %d\n",$field_count);

for ( $i = 0; $i < $field_count; $i++){
  printf("Max Len Field[%d] : %d\n",$i,$field_max_len[$i]);
}
Advertisements