Bit O' text play

I have some data in the following format:

Code:
QW_004125.3 	0006457	0003674	0008219	0005737	0005515	0008152	0016265	0005488	0005739	0008150	0005623	0043067	0044260	0006916	0044424	0005575	0050789	0044444	0050791	0017076	0043118	0048523	0043069	0050794	0051244	0009986	0019538	0051243	0005524	0051082	0042981	0043229	0043226	0006915	0044237	0044464	0043227	0005622	0043231	0009987	0007582	0000166	0043066	0043170	0044267	0012501	0048519	0030554	0050875	0044238
QW_001311.3 	0030234	0042802	0005956	0003674	0005515	0007165	0005488	0008150	0008605	0019207	0005623	0044424	0005575	0016055	0003824	0019887	0007166	0044464	0005622	0016772	0009987	0007154	0016740	0016301	0043234
QW_001605.1 	0005200	0003674	0005515	0044422	0005488	0015629	0005623	0044424	0005575	0017076	0005198	0043232	0044430	0005884	0005524	0005856	0043229	0043226	0044464	0005622	0043228	0044446	0000166	0030554
..etc....

I would like to have this formatted as:

Code:
AN:0006457
AN:0003674
AN:0008219
AN:0005737
..etc....

i.e. strip away the first column (QW_xxxxxx.x), then put every remaining column (each containing a 7-digit number), regardless of row, onto a new line, prefixed with AN:

I have been using Excel to at least do the initial bits, but it has fallen down, and for some reason I lose the leading "000"s when I export???

Therefore a php solution would be fantastic.... however, I'm stumped...

Presumably I need to read the text file as an array, implode it... no, no, I have no idea...

Help me please!

edit - some rows will be blank, e.g.

Code:
QW_004125.9       blahhhhh
QW_004125.3
QW_004125.2       blahhhhh
... etc....
So the script must be able to ignore blank rows (once the first column is removed, such a row will be empty).
 
This will be fine for files up to a few thousand lines long; if they're bigger than that, I'll write a version that doesn't load the entire input file into memory at once.
Code:
<?php
    $file = file_get_contents('input.txt');
    $file = str_replace("\t", ' ', $file); // this is here because I wasn't sure whether your post used tabs or spaces
    $matches = array();
    if (preg_match_all('/\.\d  (\d*)/', $file, $matches) > 0) {
        $fp = fopen('output.txt', 'w') or die ("Couldn't open output file for writing");
        foreach ($matches[1] as $match) {
            fwrite($fp, 'AN:' . $match . "\n");
        }
        fclose($fp);
    } else {
        echo "No valid lines\n";
    }
?>
 
Cheers, whilst that appeared to work - I'm a little concerned....

The input file is 295 KB (812 lines)... yet the output is just 8 KB (744 lines).... it should be much bigger than that, I imagine...! 1000s of lines....

Furthermore, as a check I searched the output for a random selection of the numbers, and some of them didn't come up...

PS. It was tab-delimited, not spaces.

EDIT- it now appears not to work at all.... "No Valid Lines"

Data file > http://www.jonathandickerson.com/updata.zip
 
It won't work with that input file, no; the forums must have mangled your examples a bit, as the whitespace is different. :)

And I appear to have misunderstood your initial post a bit, that's what happens when you read at 4am. ;)
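
For what it's worth, the 744-line output is consistent with the first pattern matching at most once per row. A minimal sketch, assuming a row shaped like the ones in the original post (a space plus a tab after the ID; the sample row here is trimmed to three numbers for illustration):

```php
<?php
// Sample row in the posted layout: ID, a space, then tab-separated numbers.
$row = "QW_001311.3 \t0030234\t0042802\t0005956\n";

// First script's approach: tabs become spaces, then match
// ".digit", two spaces, and a run of digits.
$flattened = str_replace("\t", ' ', $row);
preg_match_all('/\.\d  (\d*)/', $flattened, $first);

// Matching a 7-digit run after every tab picks up each column instead.
preg_match_all('/\t(\d{7})/', $row, $all);

echo count($first[1]), " vs ", count($all[1]), "\n"; // prints "1 vs 3"
```

Only the ID column contains a ".", so the first pattern can never match more than once per row, and it matches nothing at all if the ID is followed by a tab alone.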

Two ticks, I'll knock up a fixed version.

Edit: here you go:

Code:
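<?php
    // Read the whole tab-delimited input file into memory.
    $file = file_get_contents('updata.txt');
    $matches = array();
    // Capture each 7-digit number that follows a tab; the ID column has no leading tab, so it is skipped.
    if (preg_match_all('/\t(\d{7})(?!\d)/', $file, $matches) > 0) {
        $fp = fopen('output.txt', 'w') or die("Couldn't open output file for writing");
        foreach ($matches[1] as $match) {
            fwrite($fp, 'AN:' . $match . "\n");
        }
        fclose($fp);
    } else {
        echo "No valid lines\n";
    }
?>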

Using your input file, it returns 36336 lines of AN:xxxxxxx text. :)
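
For files too big to load in one go, a line-by-line variant along the lines mentioned earlier might look like the sketch below. It assumes PHP 7 or later (for the type hints) and tab-delimited input; `convertAnnotations` is a hypothetical name, not something from the thread.

```php
<?php
// Hypothetical helper: stream $inPath one line at a time and write one
// "AN:" line per 7-digit column to $outPath. Returns the number of lines written.
function convertAnnotations(string $inPath, string $outPath): int
{
    $in = fopen($inPath, 'r');
    if ($in === false) {
        throw new RuntimeException("Couldn't open $inPath for reading");
    }
    $out = fopen($outPath, 'w');
    if ($out === false) {
        fclose($in);
        throw new RuntimeException("Couldn't open $outPath for writing");
    }

    $count = 0;
    while (($line = fgets($in)) !== false) {
        $columns = explode("\t", rtrim($line, "\r\n"));
        array_shift($columns); // drop the QW_xxxxxx.x identifier
        foreach ($columns as $column) {
            $column = trim($column);
            // Keep only exact 7-digit values; rows with nothing left are skipped naturally.
            if (preg_match('/^\d{7}$/', $column) === 1) {
                fwrite($out, 'AN:' . $column . "\n");
                $count++;
            }
        }
    }

    fclose($in);
    fclose($out);
    return $count;
}
```

It would be called as `convertAnnotations('updata.txt', 'output.txt');`. Because each number is handled as a string, the leading zeros survive untouched.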
 
Moredhel said:
Using your input file, it returns 36336 lines of AN:xxxxxxx text. :)
*bows down*
 