[MySQL]Remove spaces and delete dupes

Associate
Joined
21 May 2003
Posts
1,365
I've got a database table with a couple of hundred thousand records. The table has a unique field, "tel1" which holds telephone numbers (as a varchar).

Some of the numbers are formatted with spaces, i.e. "0845 000 0000"

and some aren't i.e. "08450000000".

I'd like to remove all the spaces, and any duplicate records.

I've tried
Code:
UPDATE table SET tel1 = replace(tel1, " ", "");

but that gives an error if there is a duplicate record (because tel1 is unique).

Any way around this?
 
You'll have to remove the duplicates first, by doing something like this:

DELETE FROM table A where ID=(SELECT ID from table B where B.id <> a.id AND B.tel1 = A.tel1)

then your statement to remove the spaces will work.
 
Oh yeah, ofcourse! Try this then:

DELETE FROM table A where ID=(SELECT ID from table B where B.id <> a.id AND Replace(B.tel1, " ", "") = Replace(A.tel1, " ", ""))
 
I made a slight adjustment (one of the tel1's will be the space-less version?) but I get an error when trying to run:

Code:
mysql> DELETE FROM test A WHERE id = (SELECT id FROM test B WHERE B.id <> A.id AND Replace(B.tel1, " ", "") = A.tel1);
ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'WHERE id = (SELECT id FROM test B WHERE B.id <> A.id AND Replace(B.tel1, " ", ""' at line 1
 
Could also try:

Code:
DELETE FROM `table1` WHERE REPLACE(' ', '', `table1`.`tel1`) IN (SELECT `table1`.`tel1` FROM `table1`);

Or try the ANY operator instead of IN, both are untested on my part though.

Then run the first query you posted to remove the spaces in the cells that are not dupes.
 
Last edited:
mysql> DELETE FROM `test` WHERE Replace(' ', '', `test`.`tel1`) IN (SELECT `test`.`tel1` FROM `test`);
ERROR 1093 (HY000): You can't specify target table 'test' for update in FROM clause


Apologies if i'm not much use here, my SQL experience is pretty much limited to inserts and updates.
 
Last edited:
Ok,

SELECT `id`, `tel1` FROM `test` AS `A` WHERE Replace(`A`.`tel1`, " ", "") IN (SELECT `A`.`tel1` FROM `test`);

Gives the correct records i.e. the dupes.

but

DELETE FROM `test` AS `A` WHERE Replace(`A`.`tel1`, " ", "") IN (SELECT `A`.`tel1` FROM `test`);

gives a syntax error.
 
Can't be bothered to sort your syntax our but if original select gets what rows you need then I would just do a select into and then a delete (presuming this is a one off).
SELECT `id`, `tel1` INTO 'DupRows' FROM `test` AS `A` WHERE Replace(`A`.`tel1`, " ", "") IN (SELECT `A`.`tel1` FROM `test`);

Then do DELETE FROM test WHERE id IN (SELECT id from DupRows)
 
Back
Top Bottom