Support Forums

[IRC Bot Tips] How to Correctly Parse IRC Messages
IRC bots are always fun to make, but one of the main problems I see with new developers is how they parse information from IRC server responses, such as when someone speaks. This mini tutorial will show you how to correctly parse these responses for user information ^__^

Before we begin, we need to know what a PRIVMSG looks like.

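Nick = between the leading colon and the !
User = between the ! and the @
Hostname = between the @ and the first space
Destination = the word right after PRIVMSG
Actual Message = everything after the colon that follows the destination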

Channel Message:

:lamerlord!lamerlord@427C27D7.5F26B8D7.81E3A2C9.IP PRIVMSG #hackforums :hi Fallen

Private Message:

:lamerlord!lamerlord@427C27D7.5F26B8D7.81E3A2C9.IP PRIVMSG Fallen :hi Fallen

Notice the difference between the two: when a PM is sent, the destination (which is usually a channel) is replaced by the nick of the person receiving the PM.

Now onto the fun part... Parsing.

This tutorial assumes you have some Python knowledge; I will not be explaining much Python-wise.

Now here is my basic parser class.

( IRC_SocketHandle in this case would be a socket already connected to the IRC server, with the beginning headers already sent [nick, user, etc] )

Code:
class NG_IRC_Parser( object ):
    def __init__( self, IRC_SocketHandle ):
        while True:
            IRC_SocketRecvAll = IRC_SocketHandle.recv( 1024 ).decode( "utf-8", "replace" ) #bytes -> text
            if not IRC_SocketRecvAll: break #empty read: the server closed the connection
            for IRC_SocketRecv in IRC_SocketRecvAll.split( "\r\n" ):
                if not IRC_SocketRecv.find( "PRIVMSG" ) == -1:
                    try:
                        IRC_PRIVMSG_Information = [
                            IRC_SocketRecv.split( ":" )[ 1 ].split( "!" )[ 0 ], #nick
                            IRC_SocketRecv.split( "!" )[ 1 ].split( "@" )[ 0 ], #user
                            IRC_SocketRecv.split( " " )[ 0 ].split( "@" )[ 1 ], #host
                            IRC_SocketRecv.split( " " )[ 2 ], #destination (channel or nick)
                            ":".join( IRC_SocketRecv.split( ":" )[  2: ] ), #message
                            True ] #True = channel message, False = PM
                        if not IRC_PRIVMSG_Information[ 3 ][ 0 ] == "#":
                            IRC_PRIVMSG_Information[ 5 ] = False
                        print( IRC_PRIVMSG_Information )
                    except Exception as Error: print( "Error: %s" % ( Error ) )
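One thing to watch with a loop like this: recv( 1024 ) hands back whatever bytes have arrived so far, so a line can get cut in half between two reads and both halves get parsed as if they were whole lines. Here is a minimal sketch of one way around it, assuming Python 3 and text that has already been decoded. split_irc_lines is a hypothetical helper name of mine, not part of the class above.

```python
def split_irc_lines(buffer, chunk):
    """Add a freshly received chunk to the buffer.

    Returns (complete_lines, remaining_buffer). Only text that ends in
    "\r\n" counts as a complete line; any partial tail stays buffered.
    """
    buffer += chunk
    # The last item after splitting is either "" or an unfinished line.
    *lines, buffer = buffer.split("\r\n")
    return lines, buffer


# Simulate one message arriving across two recv() calls.
lines1, pending = split_irc_lines("", ":nick!user@host PRIVMSG #chan :hel")
lines2, pending = split_irc_lines(pending, "lo\r\nPING :server\r\n")
print(lines1)
print(lines2)
```

The first call yields no lines because nothing has been terminated yet; the second call yields the rejoined PRIVMSG plus the PING line, and the buffer ends up empty.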

The important piece of this code happens to be,

Code:
if not IRC_SocketRecv.find( "PRIVMSG" ) == -1:
    try:
        IRC_PRIVMSG_Information = [
            IRC_SocketRecv.split( ":" )[ 1 ].split( "!" )[ 0 ], #nick
            IRC_SocketRecv.split( "!" )[ 1 ].split( "@" )[ 0 ], #user
            IRC_SocketRecv.split( " " )[ 0 ].split( "@" )[ 1 ], #host
            IRC_SocketRecv.split( " " )[ 2 ], #destination (channel or nick)
            ":".join( IRC_SocketRecv.split( ":" )[  2: ] ), #message
            True ] #True = channel message, False = PM
        if not IRC_PRIVMSG_Information[ 3 ][ 0 ] == "#":
            IRC_PRIVMSG_Information[ 5 ] = False
        print( IRC_PRIVMSG_Information )
    except Exception as Error: print( "Error: %s" % ( Error ) )

If "PRIVMSG" is found in the response from the IRC Server, the parser gets to work.
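Keep in mind that find( "PRIVMSG" ) also matches when the word just shows up inside somebody's message or inside another command such as a NOTICE. A small sketch (Python 3) that looks at the command field instead; is_privmsg is a hypothetical helper name, not something from the class above.

```python
def is_privmsg(line):
    """Return True only when the command field of the line is PRIVMSG."""
    parts = line.split(" ")
    if line.startswith(":"):
        # With a leading ":prefix", the command is the second field.
        return len(parts) > 1 and parts[1] == "PRIVMSG"
    # Without a prefix, the command is the first field.
    return parts[0] == "PRIVMSG"


chat = ":lamerlord!lamerlord@427C27D7.5F26B8D7.81E3A2C9.IP PRIVMSG #hackforums :hi Fallen"
notice = ":lamerlord!lamerlord@427C27D7.5F26B8D7.81E3A2C9.IP NOTICE Fallen :send me a PRIVMSG"
print(is_privmsg(chat))
print(is_privmsg(notice))
```

The NOTICE line contains the text "PRIVMSG", so a substring search would accept it, while the field check does not.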

The nick is very easy to obtain. It is always after the first colon and prior to the exclamation point.

Code:
IRC_SocketRecv.split( ":" )[ 1 ].split( "!" )[ 0 ] #nick

The user is always after the exclamation point but before the @.

See, this isn't too hard :P

Code:
IRC_SocketRecv.split( "!" )[ 1 ].split( "@" )[ 0 ] #user

The hostname is always before the first space in the response, but after the @

Code:
IRC_SocketRecv.split( " " )[ 0 ].split( "@" )[ 1 ] #host
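The three lookups above (nick, user, host) all work on the same first word of the line, so they can be done in one pass. A minimal sketch in Python 3 using str.partition; parse_prefix is a hypothetical name, and the sketch assumes the line starts with a colon like the examples in this post.

```python
def parse_prefix(line):
    """Split ':nick!user@host ...' into a (nick, user, host) tuple."""
    # First space-separated word, minus the leading ":".
    prefix = line.split(" ", 1)[0][1:]
    nick, _, rest = prefix.partition("!")
    user, _, host = rest.partition("@")
    return nick, user, host


line = ":lamerlord!lamerlord@427C27D7.5F26B8D7.81E3A2C9.IP PRIVMSG #hackforums :hi Fallen"
print(parse_prefix(line))
```

Unlike indexing into split() results, partition never raises IndexError: a prefix with no "!" or "@" (a bare server name, for instance) simply gives empty strings for the missing parts.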

The channel name (or the nick of the person being PM'd) is always between the 2nd and 3rd space.

Code:
IRC_SocketRecv.split( " " )[ 2 ]

And last, but definitely not least, is the actual message being sent.

We first split the line by colons and drop the first two pieces ( [ 2: ] ), which removes all of the beginning headers. Then we rejoin what is left with colons, so any colons inside the message are kept.

Code:
":".join( IRC_SocketRecv.split( ":" )[  2: ] )

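Splitting on every colon works for the example lines in this post, but it goes wrong if the host part itself contains colons, which as far as I know can happen when a network shows a raw IPv6 address. A hedged alternative sketch (Python 3): cut at the first " :" (space followed by colon), which cannot occur inside the nick!user@host part. extract_message is a hypothetical helper name.

```python
def extract_message(line):
    """Return the text after the first ' :' in the line, or None if absent."""
    _, separator, message = line.partition(" :")
    return message if separator else None


plain = ":lamerlord!lamerlord@427C27D7.5F26B8D7.81E3A2C9.IP PRIVMSG #hackforums :hi: Fallen"
ipv6 = ":nick!user@2001:db8::1 PRIVMSG #chan :hello"
print(extract_message(plain))
print(extract_message(ipv6))
```

Colons inside the message survive because partition only cuts at the first match.
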
Finally, we set the last element of the list to True. The next small code snippet uses it to record whether the PRIVMSG is a channel message or a PM ( so our highly sophisticated IRC bot can act upon it accordingly :P )

The next couple of lines check this by seeing if the destination begins with a "#", which is how IRC channel names normally start. If there is no leading "#", the code sets our previously True boolean to False, showing it is a PM.

Code:
if not IRC_PRIVMSG_Information[ 3 ][ 0 ] == "#":
    IRC_PRIVMSG_Information[ 5 ] = False
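"#" is by far the most common channel prefix, but RFC 2812 also allows channel names starting with "&", "+" and "!". If your network uses those, a check along these lines may be closer to what you want. This is a sketch in Python 3; CHANNEL_PREFIXES and is_channel are names I made up for illustration.

```python
# Channel name prefixes listed in RFC 2812; adjust for your network.
CHANNEL_PREFIXES = ("#", "&", "+", "!")


def is_channel(destination):
    """Return True if the destination looks like a channel, False for a nick."""
    # str.startswith accepts a tuple and is safe on an empty string.
    return destination.startswith(CHANNEL_PREFIXES)


print(is_channel("#hackforums"))
print(is_channel("Fallen"))
```

Using startswith instead of indexing [ 0 ] also avoids an IndexError if the destination ever comes through empty.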

So there you have it, an awesome PRIVMSG parsing class :P
and you will be left with an awesome list of information :D

IRC_PRIVMSG_Information[ 0 ] = Nick
...[ 1 ] = User
...[ 2 ] = Host
...[ 3 ] = Destination ( channel, or the nick being PM'd )
...[ 4 ] = Message
...[ 5 ] = True for a channel message, False for a PM
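If remembering which index holds what gets annoying, the same six fields can be returned as a named tuple instead of a plain list. A rough sketch in Python 3, not a drop-in replacement for the class: PrivMsg and parse_privmsg are hypothetical names, and it assumes the server always sends the message text after a " :" as in the examples in this post.

```python
from collections import namedtuple

PrivMsg = namedtuple("PrivMsg", "nick user host destination message is_channel")


def parse_privmsg(line):
    """Parse one PRIVMSG line into a PrivMsg; return None for anything else."""
    if not line.startswith(":"):
        return None
    prefix, _, rest = line[1:].partition(" ")
    command, _, rest = rest.partition(" ")
    if command != "PRIVMSG":
        return None
    destination, separator, message = rest.partition(" :")
    if not separator:
        return None
    nick, _, userhost = prefix.partition("!")
    user, _, host = userhost.partition("@")
    return PrivMsg(nick, user, host, destination, message,
                   destination.startswith("#"))


pm = parse_privmsg(":lamerlord!lamerlord@427C27D7.5F26B8D7.81E3A2C9.IP PRIVMSG Fallen :hi Fallen")
print(pm.nick, pm.destination, pm.is_channel)
```

Fields can then be read by name ( pm.nick, pm.message ) while still behaving like a sequence, so pm[ 0 ] keeps working too.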


irc.Malvager.com #Malvager
irc.Hackforums.net #Hackforums
Very nice tutorial Fallen, as usual :)
Thanks Fallen!!!

I have written a bot to keep my channel online, and parsing those messages was a pain for me, until now...
The bot is not written in Python but it receives the same messages, so thank you very much man!