php function to strip stuff away

hi guys,

obviously there are many ways to remove bits and bobs from a string... currently what i am doing is using str_replace like so:

PHP:
$line = str_replace($tags, '', $line);

where $tags is an array full of different bits that i want removed, but this is getting quite messy...

if i give an example of some lines (yes they are html tags but just look at them as strings) and show you how i want them to end up, maybe someone could tell me a much easier way to do it than the above:

<img src="image.jpg" />
image.jpg

<img src="image.jpg" class="*could be anything*" title="*could be anything*" />
image.jpg

<span class="*could be anything*">Don't look back in anger</span>
Don't look back in anger

<p>Howdy</p>
Howdy

you get the idea, i'm stripping away all the unnecessary stuff each time, and at the moment the array is getting out of control with all the different versions of what could be either side of the bit i need...

any ideas?
 
i'd use explode, might need to explode twice to get the data you need. Or use a regular expression to remove parts of the string. It would take a bit of work to get the regex built though, and it would be slower to execute than explode.
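A rough sketch of the "explode twice" idea, assuming the bit you want always sits between double quotes right after src= (the helper name srcByExplode is made up for illustration, and the ?string return type needs PHP 7.1 or later):

```php
<?php
// Hypothetical helper: pull the src value out of a tag with two explode() calls.
// Assumes the attribute is written exactly as src="..." with double quotes.
function srcByExplode(string $line): ?string
{
    // First split: everything after 'src="' ends up in $parts[1].
    $parts = explode('src="', $line, 2);
    if (count($parts) < 2) {
        return null; // no src attribute on this line
    }
    // Second split: cut at the closing quote and keep the left-hand side.
    $rest = explode('"', $parts[1], 2);
    return $rest[0];
}

echo srcByExplode('<img src="image.jpg" />'), "\n"; // prints image.jpg
```

This only covers the image case; lines without a src attribute come back as null and would still need something like strip_tags.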
 
right, thanks guys :) - found out what regex is, read a chapter on it in my big all-encompassing php book, so now i'm gonna start building the thing i reckon...

in words, i guess what i want is:

"find something that maybe starts with a / then has words, then maybe another forward slash, then a word with a .jpg"

(these files will almost always have a path with them)

the rest of the examples can be stripped with strip_tags i think
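A quick sketch of what strip_tags does with the example lines from the first post (the class value "x" is just a placeholder). It also shows why the image lines need separate handling: the filename lives inside the tag, so it disappears along with it.

```php
<?php
// strip_tags() removes the markup and keeps the text between the tags.
echo strip_tags('<p>Howdy</p>'), "\n";
echo strip_tags('<span class="x">Don\'t look back in anger</span>'), "\n";

// An <img> tag has no text content, so nothing is left once the tag is removed.
var_dump(strip_tags('<img src="image.jpg" />'));
```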

this is pretty cool :) - gonna have a stab...
 
brilliant thanks for your help guys...

i reckon i need this:

Code:
[/a-zA-Z0-9]+\.(png|jpg|gif)

so can i strip the string down to the matching sub expression like this:

Code:
$line = eregi('[/a-zA-Z0-9]+\.(png|jpg|gif)', $line);

?

cuz that doesn't seem to work...

EDIT: by the way, found this outlandishly useful tool :) http://gskinner.com/RegExr/
 
right guys, been playing and i'm confident my regex is good, so now i just need to get php to extract the matching section...

i read that it stores the matched bit in the third parameter like this:

Code:
preg_match('[/a-zA-Z0-9]+\.(png|jpg|gif)', $line, $matches);
	$line = $matches[0];
like this, but it doesn't, this just sticks "jpg" into the $line variable each time so its cutting too much off...

basically in plain english i want to "take a variable which contains a string, match my pattern to it, and delete everything else and re"save" it into the variable"

could someone give me a leg up?
 
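you need delimiters, and try using preg_replace(). The pattern has to cover the whole line, otherwise the match just gets replaced with itself and nothing is removed:
Code:
// match the whole line, keep only the captured path/filename
$newString = preg_replace('#^.*?([/a-z0-9]+\.(png|jpg|gif)).*$#is', '\\1', $line);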
 
Just a quick note: don't use the ereg* functions. They were deprecated in PHP 5.3 in favour of preg* and removed entirely in PHP 7. They are also a lot slower.

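As for your problem, to capture part of a regex match (and then return it) you need to surround the "thing" with ( ). The whole match ends up in $matches[0], and each ( ) group follows in $matches[1], $matches[2] and so on.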
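To make the ( ) point concrete, here is a small sketch of what preg_match puts in its third argument. The sample line is made up, and '#' is just one possible choice of delimiter:

```php
<?php
$line = '<img src="/images/photo.jpg" />';

// '#' is used as the delimiter so the '/' in the character class needs no escaping.
if (preg_match('#[/a-zA-Z0-9]+\.(png|jpg|gif)#', $line, $matches)) {
    echo $matches[0], "\n"; // whole match: /images/photo.jpg
    echo $matches[1], "\n"; // first ( ) group, just the extension: jpg
}
```

So if you want the full path and filename, index 0 is the one to use; index 1 only holds whatever the first pair of brackets matched.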
 
well, it works :)

the main problem i was hitting was that some of the path/filenames had a dash and i hadn't included it in the regex :rolleyes:

here is the final code, all working :)

if someone could advise any way to shrink it and make it more efficient, as always i would be very grateful and happy to learn a better way of doing something :)

PHP:
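<?php

// Reduce one line to its image path if it contains one, otherwise to its plain text.
// Takes $line by reference so array_walk() can modify the array in place.
function tagTrim(&$line)
{
	// '#' delimiters make it clear where the pattern starts and ends.
	if (preg_match('#[-/a-zA-Z0-9]+\.(png|jpg|gif)#', $line, $matches))
	{
		$line = $matches[0]; // index 0 holds the whole matched path/filename
	}
	else
	{
		$line = strip_tags($line); // no image found: just drop the tags
	}
}

// Read the file at $path and return its lines with tagTrim() applied to each.
function grabLines($path)
{
	$lines = file($path);
	array_walk($lines, 'tagTrim');
	return $lines;
}

?>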

so yeah, a path gets chucked to grabLines and it comes back with an array stripped down to the bare necessities :)
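One possible way to tidy it a little, offered as a sketch rather than the definitive answer: have the function return its result and use array_map instead of a by-reference callback. The name trimTag and the sample lines are made up for illustration; in real use $lines would come from file($path).

```php
<?php
// Hypothetical variant of tagTrim(): returns the result instead of modifying by reference.
function trimTag(string $line): string
{
    // Same character class and extensions as tagTrim(), with explicit '#' delimiters.
    if (preg_match('#[-/a-zA-Z0-9]+\.(png|jpg|gif)#', $line, $matches)) {
        return $matches[0];
    }
    return strip_tags($line);
}

// Sample input standing in for the lines that file($path) would return.
$lines = [
    '<img src="/img/my-image.jpg" />',
    '<p>Howdy</p>',
];

// array_map() builds a new array, so no by-reference callback is needed.
$clean = array_map('trimTag', $lines);
print_r($clean);
```

A function that returns a value is also easier to test on a single string without building an array first.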
 