I spent forever trying to figure out how to unwrap an Apache access_log in which each request had been split across two lines. I finally found the answer!

Create a Perl script called unwrap.pl containing the code below, and run it like this (the unwrapped log is written to accesslog2):

# perl unwrap.pl access_log > accesslog2

The contents of unwrap.pl:
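#!/usr/bin/env perl

use strict;
use warnings;

my $sentinel = 0;       # becomes true once the first line has been read
my $previous_line;
my $result = "";

# True when $line is short enough that the first word of $next_line
# would still have fit on it, i.e. the break was not caused by wrapping.
sub early_indent { # line next_line
    my ($line, $next_line) = @_;

    my @words = split(' ', $next_line);
    return (length($line . " " . $words[0]) < length($next_line));
}

while (<>)
{
    my $this_line = $_;
    chomp $this_line;
    if ($sentinel)
    {
        # Keep the line break around blank lines, before a line that does
        # not start with a letter or digit, and after an early break.
        # Otherwise join the previous line to this one with a space.
        if (($this_line eq "")
            || ($previous_line eq "")
            || ($this_line =~ /^[^A-Za-z0-9]/)
            || early_indent($previous_line, $this_line))
        { $result .= $previous_line . "\n"; }
        else { $result .= ($previous_line . " "); }
    }
    $previous_line = $this_line;
    $sentinel = 1;
}

# Flush the last line, which the loop never appends, and write the output.
$result .= $previous_line . "\n" if $sentinel;
print $result;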
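
To see what the join rule does on a small input, here is a minimal sketch that applies the same test to an in-memory list of lines instead of a file. The helper name unwrap_lines and the sample log entries (with documentation-range IP addresses) are made up for illustration and are not part of the original script.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Same heuristic as in unwrap.pl: true when the first word of $next_line
# would still have fit on $line, so the break was not a wrap.
sub early_indent {
    my ($line, $next_line) = @_;
    my @words = split(' ', $next_line);
    return length($line . " " . $words[0]) < length($next_line);
}

# Hypothetical helper: applies the join rule to a list of lines and
# returns the unwrapped text as a single string.
sub unwrap_lines {
    my @lines = @_;
    return "" unless @lines;

    my $result = "";
    for my $i (1 .. $#lines) {
        my ($previous_line, $this_line) = @lines[$i - 1, $i];
        if ($this_line eq ""
            || $previous_line eq ""
            || $this_line =~ /^[^A-Za-z0-9]/
            || early_indent($previous_line, $this_line))
        { $result .= $previous_line . "\n"; }   # keep the line break
        else
        { $result .= $previous_line . " "; }    # join with the next line
    }
    return $result . $lines[-1] . "\n";         # the last line always ends the output
}

# Invented sample: two requests, each split across two lines.
my @wrapped = (
    '192.0.2.1 - - [13/Sep/2012:10:00:00 +0000] "GET /index.html',
    'HTTP/1.1" 200 512',
    '192.0.2.2 - - [13/Sep/2012:10:00:01 +0000] "GET /about.html',
    'HTTP/1.1" 200 1024',
);

print unwrap_lines(@wrapped);
```

With this input the four lines come out as two complete entries. Note that the rule joins a line to the next one whenever the next line is not longer than the previous line plus the next line's first word, so two consecutive entries of similar length that were never split would also be merged. The sketch therefore assumes every entry in the input is wrapped.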
Last modified: September 13, 2012
