Capture details in multiple files into Excel sheet?

isnt quite as easy as Id hoped. No delimiters makes it a pain in the ***!

Im gonna have to read each file in line by line, search for the first space and take whatever is after that.

Will you be able to do a *.pfl search, and copy all the files to a temporary folder so they are all in the same place? That way I can loop for all files in the folder. If not we will have to get a list of all their locations.
 
ok starting to work.... have the values

Code:
ContactNameStart = InStr(1, sText, "ContactName", vbTextCompare)
ContactEmailStart = InStr(1, sText, "ContactEmail", vbTextCompare)
ContactPhoneStart = InStr(1, sText, "ContactPhone", vbTextCompare)
ContactFaxStart = InStr(1, sText, "ContactFax", vbTextCompare)
CompanyNameStart = InStr(1, sText, "CompanyName", vbTextCompare)

ContactNameValuePosition = (ContactEmailStart - ContactNameStart)
ContactNameValueText = Mid(sText, 13, ContactNameValuePosition - 12)

ContactEmailStartValuePosition = (ContactPhoneStart - ContactEmailStart)
ContactEmailStartValueText = Mid(sText, ContactEmailStart + 13, ContactEmailStartValuePosition - 13)

ContactPhoneStartValuePosition = (ContactFaxStart - ContactPhoneStart)
ContactPhoneStartValueText = Mid(sText, ContactPhoneStart + 13, ContactPhoneStartValuePosition - 13)

ContactFaxStartValuePosition = (CompanyNameStart - ContactFaxStart)
ContactFaxStartValueText = Mid(sText, ContactFaxStart + 11, ContactFaxStartValuePosition - 11)

CompanyNameStartValuePosition = (ContactFaxStart - CompanyName)
CompanyNameStartValueText = Mid(sText, CompanyNameStart + 12, CompanyNameStartValuePosition - 12)
 
Ok all done.

http://dlhmakeup.co.uk/contactcollector.xlsm

It doesnt look at subfolders so you will need to copy them to a folder.

It doesnt write to files, but please make a backup first.

Oh, you said there was more info underneath. If you let me know the next delimeter I can use, then I'll make it stop once it htis it. But at the moment the last cell will be everything from CompanyName onwards
 
Quick'n'dirty (and still beaten to it :() but this should do it....

Code:
ECHO OFF

SETLOCAL ENABLEDELAYEDEXPANSION

SET CurrentFile=
SET ContactName=
SET ContactEmail=
SET CompanyName=
SET OutFile=Output.%TIME::=.%.%DATE:/=-%.csv


ECHO "ContactName","ContactEmail","CompanyName","File">%OutFile%

FOR /R .\ %%A IN (*.pfl) DO (
	SET CurrentFile=%%A
	FOR /F "usebackq tokens=1,* delims= " %%B IN (`FIND /I "ContactName" "!CurrentFile!"`) DO (SET ContactName=%%C)
	FOR /F "usebackq tokens=1,* delims= " %%B IN (`FIND /I "ContactEmail" "!CurrentFile!"`) DO (SET ContactEmail=%%C)
	FOR /F "usebackq tokens=1,* delims= " %%B IN (`FIND /I "CompanyName" "!CurrentFile!"`) DO (SET CompanyName=%%C)
	ECHO "!ContactName!","!ContactEmail!","!CompanyName!","!CurrentFile!">>%OutFile%
)


SET CurrentFile=
SET ContactName=
SET ContactEmail=
SET CompanyName=
SET OutFile=

ENDLOCAL

Put that in a .bat file in the root and run it. Probably not the best way to do it but for a one off should be fine. Added the file name at the end so you can do a few random compairs to make sure the correct info comes from the correct files.

Will process all *.pfl files in all sub dirs.

Pull the out put in to Excel, delim on commas and use " as the text qualifier.
 
Last edited:
In what way do you mean?

The basics are the "FOR /R" line is looking for all *.pfl files recursively starting from the current path (.\) and then when it finds a file that matches (%%A) sets it as a variable (CurrentFile).

It then just does three FINDs, one for each value he wants, in CurrentFile and sets those as variables. It delims on spaces but the "tokens=1,*" tells it that we are only interested in the first token delimited by a space and that all others after that are to be treated as one big one and not delimited any more.

It then just echos out the variables in a line, in quotes, comma separated to a text file that I've given a *.csv extension. It already echo'd out the column headers to the file before the FOR loop starts.

This file can then just be pulled into Excel as a normal text/csv file.
 
tokens=1,*

It delims on space but the 'tokens=' tells it how to deal with each token/var. So I tell it we are only interested in token 1 and then all tokens after that '*' are to be treated as one. My starting token is going to be %%B, my second token (*=everything else) is then %%C.

If we knew the line was always going to be...

ContactName Joe Middle Bloggs

Then we know we will always have four tokens, so you could not specify tokens at all and just delim on space. Then if your starting token is %%B you would...

SET ContactName= %%C %%D %%E

You can also do things like...

tokens=2,3-4

to skip tokens/define only the tokens you are interested in, which is useful if you are going to have a large number on one line and are worried about running out of FOR variables.

Not sure if I am explaining that too well, I still get it wrong myself sometimes, and I'd find it easier to show you with pen and paper or more examples.


Thinking about it, the other way to do it would be read the whole line in and just do a string replace to replace "ContactName " with nothing.
 
Hey, thanks guys!

chaosphy - I tried that as a bat file though it had a few issues. At least some company names only had the second part of the name. There may be more but I didn't have any time to spend on checking this out. Hopefully will tomorrow.

A[L]C - That seems great! Didn't notice any errors but again not had much time. Could you also add CompanyName CompanyAddress CompanyCity and CompanyState? The delimeter is CompanyZip and anything from there isn't needed.

Will def reread what you both posted though so I can do this myself next time :D
 
No, I don't need that right now. That's the end of wanted data really.

I did try adding these myself in your vB but kept getting debug errors. I'd tried to logically expand the system you had but no success.
 
lol, i didnt make the code very tidy and ripped and edited loads of bits from across the web rather than writing my own.

Not gonna be able to look at it till thursday night tho im afraid
 
Hey, thanks guys!

chaosphy - I tried that as a bat file though it had a few issues. At least some company names only had the second part of the name. There may be more but I didn't have any time to spend on checking this out. Hopefully will tomorrow.

A[L]C - That seems great! Didn't notice any errors but again not had much time. Could you also add CompanyName CompanyAddress CompanyCity and CompanyState? The delimeter is CompanyZip and anything from there isn't needed.

Will def reread what you both posted though so I can do this myself next time :D

hmm odd, really not sure why that'd be with out seeing the problem files. Only thing I can think is the format is off in those files or possibly some odd character that CMD doesn't like. What is odd is that you only get the second part of the company name, which would imply that is where the first space is.

If you do have time to look at one of the problem files and see if it is anything obvious then I'd like to know why, it'll bug me silly otherwise. :confused:

Plus it may be beneficial to you as if the files are off for some reason you may need to look at how they are generated in the first place encase any inconsistencies cause problems down the line. That's not to say that there isn't an issue with the method I used, there could well be.

If possible I've also rewritten it and wonder if you could run it again and see what happens? I don' think this'll fix the issue you are seeing as it basically does the same thing.

Changes I've made are...

The way it works. It should make it easier to add and remove values you are after (though with a bit more work I may be able to make that easier still).

It echos some output to the screen so you can at least see it is doing something.

It will give a count of possible errors at the end. (this is only based on the simple error check I do on the variable names)

Errors it picks up are now piped to a second error file.

First column in the normal output now contains 'E' if it thinks there was an error in the data.

PS. Weirdly, if I was doing this at work I'd've probably got bored by now and told you to rtfm :D
PPS. Though some of this may come in handy at work one day anyway as bits of it are from other things I've done with manipulating text files.
Code:
@ECHO OFF

SETLOCAL ENABLEDELAYEDEXPANSION

SET CurrentFile=

SET ContactName=
SET ContactEmail=
SET CompanyName=
SET CompanyAddress=
SET CompanyCity= 
SET CompanyState=

SET GetValue=

SET PossibleError=
SET ErrorCountFiles=0
SET ErrorCountValues=0

SET OutFile=Output.%DATE:/=-%--%TIME::=.%.csv
SET OutFile=%OutFile: =0%
SET ErrorFile=%OutFile:Output=Error%

ECHO "","ContactName","ContactEmail","CompanyName","CompanyAddress","CompanyCity","CompanyState","File">%OutFile%
ECHO "Expected Variable","Actual Variable","Obtained Value","Problem File">%ErrorFile%


FOR /R .\ %%A IN (*.pfl) DO (
	SET CurrentFile=%%A
	SET PossibleError=
	ECHO Parsing... !CurrentFile!
	FOR %%B IN (ContactName ContactEmail CompanyName CompanyAddress CompanyCity CompanyState) DO (
		SET GetValue=%%B
		SET !GetValue!=
		FOR /F "usebackq tokens=1,* delims= " %%C IN (`FINDSTR /I "!GetValue!" "!CurrentFile!"`) DO (
			IF NOT [%%C]==[!GetValue!] (
				ECHO "!GetValue!","%%C","%%D","!CurrentFile!">>%ErrorFile%
				SET PossibleError=E
				SET /A ErrorCountValues=!ErrorCountValues! + 1
			)
			SET !GetValue!=%%D
		)
	)
	ECHO "!PossibleError!","!ContactName!","!ContactEmail!","!CompanyName!","!CompanyAddress!","!CompanyCity!","!CompanyState!","!CurrentFile!">>%OutFile%
	IF [!PossibleError!]==[E] (
		ECHO Parsed with Error^(s^)
		SET /A ErrorCountFiles=!ErrorCountFiles! + 1
	)
	ECHO.
)

ECHO.
ECHO Possible Number of FILES  with ERROR = %ErrorCountFiles%
ECHO Possible Number of VALUES With ERROR = %ErrorCountValues%


SET CurrentFile=

SET ContactName=
SET ContactEmail=
SET CompanyName=
SET CompanyAddress=
SET CompanyCity= 
SET CompanyState=

SET OutFile=
SET ErrorFile=

SET PossibleError=
SET ErrorCountFiles=
SET ErrorCountValues=

SET GetValue=


ENDLOCAL
 
Last edited:
Ahh, I see what's going on.

The first delimiter is a tab and not a space - I guess the forum formated that when I copied it though I should have noticed.

How can I change that in your file chaosophy?
 
Ahh, I see what's going on.

The first delimiter is a tab and not a space - I guess the forum formated that when I copied it though I should have noticed.

How can I change that in your file chaosophy?

ah ok in that case change...

FOR /F "usebackq tokens=1,* delims= "

to be...

FOR /F "usebackq tokens=1,*"

and give that a go.

i'm a bit limited at the mo, as posting from phone.
 
Back
Top Bottom