PHP or JavaScript - interpreting a chat

Evening all!

I've hit a brick wall with this problem... can't understand why it's so hard either! I'm hoping that some of you guys will have some ideas...

I'm trying to write a script in either JavaScript or PHP (doesn't matter which) that will let you copy and paste a conversation from either an email or an instant messaging program such as MSN or Adium. The idea is that the script would read the conversation and split it up into its individual messages, as well as extract the sender name and the time of each one...

So for example if I had this conversation:
gavin holt
20:50
this is gavin's 1st message

rich
20:50
this is rich's 1st message

gavin holt
20:54
this is gavin's 2nd message

54:44
this is gavin's 3rd message

54:47
this is gavin's 4th message

rich
20:55
this is rich's 2nd message

56:46
this is rich's 3rd message

gavin holt
20:57
this is gavin's 5th message

The script would interpret it as this:
Gavin - 20:50 - this is gavin's 1st message
Rich - 20:50 - this is rich's 1st message
Gavin - 20:54 - this is gavin's 2nd message
Gavin - 20:54 - this is gavin's 3rd message
Gavin - 20:54 - this is gavin's 4th message
Rich - 20:55 - this is rich's 2nd message
Rich - 20:56 - this is rich's 3rd message
Gavin - 20:57 - this is gavin's 5th message
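
For that one sample format, a minimal sketch in JavaScript might look like the code below. It is only a starting point under some assumptions: messages are separated by blank lines, a block starting with just a short timestamp such as 54:44 is a follow-up from the same sender and is read as minutes:seconds within the same hour (which is what the expected output above implies), and the display name is the capitalised first word of the sender line. The names parseChat and formatMessage are made up for illustration.

```javascript
// Sketch only: handles the blank-line-separated sample format shown above.
function parseChat(text) {
  const blocks = text.trim().split(/\n\s*\n/);
  const messages = [];
  let sender = null; // last sender seen, reused for follow-up blocks
  let hour = null;   // hour taken from the last full HH:MM header

  for (const block of blocks) {
    const lines = block.split("\n").map((l) => l.trim()).filter(Boolean);
    const timeIdx = lines.findIndex((l) => /^\d{1,2}:\d{2}$/.test(l));
    const [first, second] = timeIdx === -1 ? [] : lines[timeIdx].split(":");
    let time;

    if (timeIdx === 1) {
      // Full header: name line, then HH:MM.
      sender = lines[0];
      hour = first;
      time = `${first}:${second}`;
    } else if (timeIdx === 0 && sender !== null) {
      // Follow-up block: assumed to be MM:SS, so keep the hour and drop the seconds.
      time = `${hour}:${first}`;
    } else {
      continue; // not a shape this sketch understands
    }

    messages.push({ sender, time, text: lines.slice(timeIdx + 1).join("\n") });
  }
  return messages;
}

// Assumption: the display name is the capitalised first word of the sender line.
function formatMessage(m) {
  const firstWord = m.sender.split(/\s+/)[0];
  const name = firstWord.charAt(0).toUpperCase() + firstWord.slice(1);
  return `${name} - ${m.time} - ${m.text}`;
}
```

Calling parseChat(pastedText).map(formatMessage) gives one line per message. A message whose own text is nothing but a time on one line would confuse this, so it really is just a first pass.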

Now obviously emails and IM conversations aren't in the same format, so the script would have to cope with variable input. One way would be to let you specify that the start of each message is in one of the following formats:
rich
%TIME%
...

or
Richard @ %DATETIME%
...

or
On %DATETIME% Richard wrote:
...

If anyone has any ideas about how to go about this or has seen this done before I'd really appreciate the input.

Thanks in advance!
 
Well, that's the thing: the chat could come from any email client or any instant messaging program, so the format could change considerably. What it needs is two input boxes that let you specify the format of each message header (one input box for each user).

For example:
box1 said:
On %DATETIME% Richard wrote:
...

box2 said:
On %DATETIME% Gavin wrote:
...

If it has that capability then it could work with any conversation..
 
Great concept.. thanks for your ideas!

Instead of "On %DATETIME% %NAME% wrote:" I'm thinking it could run off "On %DATETIME% Richard wrote:", as the word Richard won't change (just like On and wrote).

So what it needs to do is convert the above filter into a regular expression, and then also create one for the other user in the conversation.
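
A rough sketch of that conversion in JavaScript could be something like this. The idea is to escape everything literal in the filter and swap each placeholder for a capture group. The function names are hypothetical, and the patterns behind %TIME% and %DATETIME% are guesses: %TIME% is taken to be HH:MM with optional seconds, and %DATETIME% is matched loosely as "anything up to the next literal part", since the real date format isn't known in advance.

```javascript
// Assumed patterns for the two placeholders; adjust to the real input.
const PLACEHOLDERS = new Map([
  ["%TIME%", "(\\d{1,2}:\\d{2}(?::\\d{2})?)"], // 20:50 or 20:50:13
  ["%DATETIME%", "(.+?)"],                     // loose: any text on the line
]);

// Escape characters that have a special meaning inside a regular expression.
function escapeRegex(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Turn a header filter such as "On %DATETIME% Richard wrote:" into a RegExp
// that matches a whole line and captures the time or date part.
function templateToRegex(template) {
  // Splitting on a capturing group keeps the placeholders in the result.
  const parts = template.split(/(%TIME%|%DATETIME%)/);
  const source = parts
    .map((p) => (PLACEHOLDERS.has(p) ? PLACEHOLDERS.get(p) : escapeRegex(p)))
    .join("");
  return new RegExp("^" + source + "$", "m");
}

// Example: one expression per user.
const richardHeader = templateToRegex("On %DATETIME% Richard wrote:");
const gavinHeader = templateToRegex("On %DATETIME% Gavin wrote:");
```

Only the header line goes through templateToRegex here; the "..." in the filters above stands for the message body and is not part of the expression.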

At that point I'll have 2 regular expressions for identifying the headers... now it's just figuring out the process for running these against the content to extract the messages!
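
One possible process, sketched in JavaScript: walk the pasted text line by line, treat any line that matches one of the header expressions as the start of a new message, and attach every other line to the message currently being built. The function name extractMessages and the two example expressions are assumptions for illustration, not something taken from a real chat client.

```javascript
// headers: [{ name, regex }] with one entry per user. Each regex should match
// a full header line and capture the time or date in group 1. The regexes
// must not use the "g" flag, otherwise exec() would keep state between lines.
function extractMessages(text, headers) {
  const messages = [];
  let current = null; // message currently being filled

  for (const line of text.split(/\r?\n/)) {
    const hit = headers
      .map((h) => ({ name: h.name, match: h.regex.exec(line) }))
      .find((r) => r.match !== null);

    if (hit) {
      // Header line: start a new message for that user.
      current = { sender: hit.name, when: hit.match[1], lines: [] };
      messages.push(current);
    } else if (current) {
      // Any other line belongs to the message in progress.
      current.lines.push(line);
    }
    // Lines before the first header are ignored.
  }

  return messages.map((m) => ({
    sender: m.sender,
    when: m.when,
    text: m.lines.join("\n").trim(),
  }));
}

// Hypothetical header expressions for the two users.
const exampleHeaders = [
  { name: "Richard", regex: /^On (.+?) Richard wrote:$/ },
  { name: "Gavin", regex: /^On (.+?) Gavin wrote:$/ },
];
```

extractMessages(pastedText, exampleHeaders) then returns an array of { sender, when, text } objects in conversation order. The weak spot is a message body that happens to contain a line shaped like a header, which this approach would wrongly split.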
 