Question for database types..

I have 2 database files from 1 table exported as .xls

one from last year and one from this year, each with thousands of rows. Is there a program of some sort that will give me the rows that are in the new one but not in the old,

i.e. the difference between the two?

Help appreciated!
 
Can't you just do a VLOOKUP on the new sheet against the old one? Wherever you get "#N/A" the line isn't in the old one; the ones it finds are in both.

Example:
[screenshot of a VLOOKUP example]
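
If Excel isn't handy, the same lookup idea works as a quick script. This is only a rough sketch: it assumes both exports have been saved as CSV (old.csv / new.csv are placeholder names) and that a whole row is what's being matched.

import csv

def rows(path):
    with open(path, newline="") as f:
        return [tuple(row) for row in csv.reader(f)]

# every row from last year's export, used as the lookup table
old_rows = set(rows("old.csv"))

# keep the rows from this year's export that never appear in last year's
new_only = [row for row in rows("new.csv") if row not in old_rows]

for row in new_only:
    print(",".join(row))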
 
That only works if the lines he is comparing are all unique.

* Insert the rows from both files into one temp table.
* SELECT rowdata FROM (SELECT rowdata, COUNT(*) AS cnt FROM temptable GROUP BY rowdata) t WHERE cnt < 2;
* Boosh, you've got the difference between the two files. If the rowdata is in both files, it won't come back from the select.

Not exactly rocket science.
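
If SQL isn't to hand, the same merge-and-count idea can be run as a script against the two exports. Just a sketch, assuming the sheets are saved as CSV first (old.csv / new.csv are made-up names) and each whole row is treated as the rowdata:

import csv
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE temptable (rowdata TEXT)")

# load every row from both exports into one table
for path in ("old.csv", "new.csv"):
    with open(path, newline="") as f:
        for row in csv.reader(f):
            conn.execute("INSERT INTO temptable (rowdata) VALUES (?)",
                         (",".join(row),))

# rows that appear only once across the two files, i.e. in one file but not the other
query = ("SELECT rowdata FROM "
         "(SELECT rowdata, COUNT(*) AS cnt FROM temptable GROUP BY rowdata) t "
         "WHERE cnt < 2")
for (rowdata,) in conn.execute(query):
    print(rowdata)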
 
I'm not too sure about the data you have, but this could work.

Copy all the values into one list. In the next column put in the formula '=COUNTIF(A:A,A2)>1'.

This will give the result TRUE if the value is in the list more than once. If you filter for FALSE, that will be your list of new values.
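
Same caveat about the data, but here's that COUNTIF idea as a small script (a sketch only; old.csv / new.csv are placeholder names for the two exports saved as CSV):

import csv
from collections import Counter

def rows(path):
    with open(path, newline="") as f:
        return [tuple(row) for row in csv.reader(f)]

# count how often each row appears across the combined list
counts = Counter(rows("old.csv") + rows("new.csv"))

# rows that appear only once are the ones unique to one of the files
for row, n in counts.items():
    if n == 1:
        print(",".join(row))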
 
This is all going way over my head, aha. It's for a client we've adopted who is on an old system and wanted inventory dumps; I was hoping there was a simpler way >.<
 
All of the above work but without a bit more information about the data it would be difficult to say which is best.

Is the data sensitive? Email me in trust and I can have a look for you if not.

/edit Just saw your last post. Probably can't help if it's business information.

Also, if this is work related, how has it fallen to you when you're clearly out of your depth? lol
 
It's not sensitive info, it's just a list of products you'd find on the client's website anyway.

And because I'm a web dev. They asked for a dump last year and one this year for their accountant, but now they've received this one they 'don't want the whole thing, as half of it's the same; we want the difference'.

Left to me..!
 
All of the above work but without a bit more information about the data it would be difficult to say which is best.

There's a billion different ways to do it :p

If you're on Linux, cat the two files into main.txt:

cat main.txt | sort | uniq -c | sort -n > results.txt

will give you a count for each instance of a row :)
 
* Insert the rows from both files into one temp table.
* SELECT rowdata FROM (SELECT rowdata, COUNT(*) AS cnt FROM temptable GROUP BY rowdata) t WHERE cnt < 2;
* Boosh, you've got the difference between the two files. If the rowdata is in both files, it won't come back from the select.

Not exactly rocket science.

Can't you see that method would fail if any of the files contained more than one row that was the same? If the new file, for example, has two identical rows in it which aren't in the old one, and you merge it with the old file and run a query looking for any line that appears less than twice, those lines would not be highlighted despite being missing from the old list.

Not only that, it would highlight every difference between the two, when the OP is only interested in what is new on the new list (missing from the old list).

VLOOKUP is the easiest way; it checks every line in one file against every line in the other and doesn't require any of the lines in either file to be unique.

Look, here's a comparison of both methods: VLOOKUP on the left, merging the lists then counting for lines appearing less than twice on the right...

[screenshot comparing the two methods]


As you can see, the lines with "2" in them are correctly picked up by the VLOOKUP method but are missed by your suggested method, because they appear twice in the new list and thus fail your condition. Secondly, 3, 6, 9 and 10 are actually ones missing from the new list but present in the old, which the OP isn't interested in.
 
This is all going way over my head, aha. It's for a client we've adopted who is on an old system and wanted inventory dumps; I was hoping there was a simpler way >.<

You can't really get simpler than a VLOOKUP, mate. If you give me the names of the columns you have and what kind of data is contained in them, I'll write it for you so you can just paste it in, if you really want.
 
Cheers for the replies guys. The primary key data was totally random because some of it was auto-generated and some of it was put in directly.

(It's for products, so it started off well: they put in the product IDs as things like tshirt-red, but then got lazy and left them, so a lot of them are like 123457.) I've sorted it by fobbing them off anyway; it's their problem really. I just tried to help in case it was a 2 minute job, but seemingly not, so it's for them to sort.
 