The Autonomous Autobaiter.


This page and links are slightly outdated. Version 4 is near completion and solves many of the inadequacies of previous versions.

Version 2 of the autobaiter is documented here. The most difficult problem solved was getting the autobaiter to interface automatically with email service providers. In version 2, an interface using POP/SMTP (hotpop) was used; however, it proved to be very unreliable. Version 4 now behaves like any email program, working through a POP and SMTP account with almost any provider that allows those protocols.

The major change to the program is the addition of two new buttons in the Control Panel (see below): one gets all mail, and the other posts all mail that has been saved. Attachments are not downloaded, but their names and extensions are, so the program can respond appropriately to them.
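
As a rough illustration of reading attachment names and extensions without keeping the attachment data, here is a minimal Python sketch. It is not the autobaiter's own code. It assumes the raw message is already in hand, and the function names attachment_names and build_sample are made up for this example.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage
from pathlib import PurePosixPath


def attachment_names(raw_message: bytes) -> list[tuple[str, str]]:
    """Return (filename, extension) pairs; the attachment payloads are never saved."""
    msg = message_from_bytes(raw_message, policy=policy.default)
    found = []
    for part in msg.iter_attachments():
        filename = part.get_filename()
        if filename:
            # Normalize the extension so "PDF" and "pdf" are treated alike.
            extension = PurePosixPath(filename).suffix.lower().lstrip(".")
            found.append((filename, extension))
    return found


def build_sample() -> bytes:
    """Build a small test message with two attachments (made-up data)."""
    msg = EmailMessage()
    msg["From"] = "lad@example.com"
    msg["To"] = "arnie@example.com"
    msg["Subject"] = "Fill the form"
    msg.set_content("Please see the attached form.")
    msg.add_attachment(b"%PDF-fake", maintype="application",
                       subtype="pdf", filename="Claim_Form.PDF")
    msg.add_attachment(b"\xff\xd8", maintype="image",
                       subtype="jpeg", filename="passport.jpg")
    return msg.as_bytes()


if __name__ == "__main__":
    for name, ext in attachment_names(build_sample()):
        print(name, ext)
```

The extension alone is usually enough to pick a reply script, for example one script for a form to fill out and another for a picture.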

Graphical User Interface
A GUI allows the user to work with the autobaiter. The user develops the personality and character of the bait largely through a scenario table. The scam emails are cycled through the system with a control panel of buttons. Two text windows show details of the emails. The left text window is the scammer's incoming email and the right text window is the outgoing computer-generated reply. The user has the option of looking at many details of why and how the reply was generated so that improvements can be made in the scenario table.

Tenor and scenario
The tenor of the bait is given by a list of scripts, or more accurately, script segments. In an early test, the scripts of dialog in this program defined a "straight bait". The problem was that the bait moved too fast, and the scammer began to expect too much from the autobaiter. The tone of the scripts was chosen to maximize the lad's confidence that the baiter was a genuine, qualified victim. This too quickly led to pressure on the baiter to contact the second-level lad in the scam - the cohort. An early version of the autobaiter could not handle this well at all. I was advised to dumb it down with the idea of Grandpa's Go-nowhere Stories (thanks, Epistimon). Now the second scenario has a much slower pace with a slower-thinking baiter persona - Arnie Leapzorp, a hick farmer who is all too long-winded.

Gleaning information
The most difficult problem in autonomous autobaiting is gleaning vital information from the scam email. For example, the lad's name is gleaned by comparing the following sources: the name field in the header; the email address itself; any titles such as Mr, Miss, or Prince; phrases such as "I am ..." and "my name is ..."; and closing salutations such as "Best regards, ..." and "Sincerely, ...". Name correlations are done in many small segments because, for example, the header might have the first and last name reversed, run together, or abbreviated. A priority is set where the highest credence is given to "my name is ..." and to a closing salutation where the name is isolated on the next line or two. If these reliable name indicators are not present, back-up names are provided. An operator alert is given with details of the gleaned data. The name becomes "Friend" if the user ignores the alert. The user can override that decision with a pull-down menu of the possibilities in which the autobaiter could not gain confidence.
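
The priority scheme above can be sketched roughly as follows. This is a simplified illustration, not the actual gleaning code, which runs to many more checks. The regular expressions, the list of titles, and the function name glean_name are assumptions made for the example, and only three of the sources are covered.

```python
import re

# Optional title followed by one to three capitalized words (an assumed name shape).
TITLE = r"(?:(?:Mr|Mrs|Miss|Ms|Dr|Prince|Princess)\.?\s+)?"
NAME = TITLE + r"[A-Z][a-z]+(?:[ \t]+[A-Z][a-z]+){0,2}"

# Highest credence: an explicit "my name is ..." phrase.
INTRO = re.compile(r"\b(?i:my name is)\s+(" + NAME + r")")

# Next: a closing salutation with the name alone on a following line.
CLOSING = re.compile(
    r"(?i:best regards|regards|sincerely|yours truly),?[ \t]*\n\s*("
    + NAME + r")[ \t.]*$",
    re.MULTILINE,
)


def glean_name(header_name: str, body: str) -> tuple[str, str]:
    """Return (name, source) so an operator alert can show where the name came from."""
    match = INTRO.search(body)
    if match:
        return match.group(1), "my name is"
    match = CLOSING.search(body)
    if match:
        return match.group(1), "closing"
    if header_name.strip():
        return header_name.strip(), "header"
    return "Friend", "default"


if __name__ == "__main__":
    body = "My name is Mr. John Okoro and I work at the bank.\n\nRegards,\nJ. Okoro\n"
    print(glean_name("okoro john", body))
```

Returning the source along with the name makes it easy to alert the operator whenever only a low-confidence source was available.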

Gleaning the proper response email is also difficult.
The correct email address for the response is often not the one used in the scammer's first mailing. Often the "reply-to" in the header is wrong because of a lazy lad. If the mail is in hypertext, a "mailto" link is a fairly reliable source. If the mail is in plain text, the program searches for email addresses that are preceded by phrases such as "my email", "my address", "respond to", etc. The lads will often suddenly change their email address because they were shut down by their service provider. Detecting that the new address comes from one of the lads already in the hundreds of history files is tricky.
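
A minimal sketch of that search order is given below. It is only an illustration under stated assumptions: the email pattern is deliberately loose, the cue phrases are just the three quoted above, the ordering of the fall-backs is a guess, and the function name glean_reply_address is hypothetical.

```python
import re
from typing import Optional

# A loose email pattern; real addresses can be more varied than this.
EMAIL = r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"

MAILTO = re.compile(r"mailto:(" + EMAIL + r")", re.IGNORECASE)

# A cue phrase followed, within 40 characters on the same line, by an address.
CUED = re.compile(
    r"(?:my email|my address|respond to)\b[^@\n]{0,40}?(" + EMAIL + r")",
    re.IGNORECASE,
)


def glean_reply_address(body: str, is_html: bool,
                        reply_to: Optional[str] = None,
                        from_addr: str = "") -> tuple[str, str]:
    """Return (address, source), preferring addresses found in the body."""
    if is_html:
        match = MAILTO.search(body)
        if match:
            return match.group(1).lower(), "mailto"
    match = CUED.search(body)
    if match:
        return match.group(1).lower(), "cue phrase"
    if reply_to:
        return reply_to.lower(), "reply-to"
    return from_addr.lower(), "from"


if __name__ == "__main__":
    text = "Kindly respond to my private box: chief.obi@example.net. Thanks."
    print(glean_reply_address(text, is_html=False, from_addr="bulk@example.com"))
```

The header fields are used only as a last resort here, since the body of the letter is usually the better guide.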

There are over 1200 lines of code for gleaning names, numbers and addresses.

Find the cohort
One current problem that is fairly well solved is making a viable transition from the level 1 lad to the level 2 lad. The cohort, who often poses as a banker, lawyer or security company, is generally more intelligent, and it is very important to waste his time. However, the original lad is not let off easily - the emails to the cohort are copied to the lad along with a continuing banter to exhaust the lad's supply of scripts. History files of the lad and cohort are continually updated and are cross-referenced to each other to allow more intelligent responses.

Email attachments
A number of fake images are available to automatically attach to responses, such as passports (a totally noisy image with vague streaks), and marginally readable Western Union transfer receipts (WUxfr). Requested documents that are not available are also sent as generic corrupted files. The WUxfrs are automatically generated by a Java imaging program with the appropriate information automatically written. Sending a WUxfr is very important in prolonging baits. The version 2 program is able to glean the particulars for filling out the WUxfr and do it autonomously. (See the web site for details of a manual version.)

Instruction manual?
This section gives an overview of how the autobaiter works in practice.  It will eventually become an instruction manual.

The Graphical User Interface
The following sections cover details of the GUI and underlying concepts. The GUI allows the user to see data in a number of tables, each accessed through a tab in the main frame. Further details of the program are provided by clicking the paragraph headings below.

The keyword table holds keywords that are searched for in the incoming email text. Keywords are grouped by synonyms. The synonyms are sometimes quite loose; for example, all religious words are treated as synonymous, such as {God, pray, bless, etc.}. The user can alter any aspect of the keyword table to suit the baiting style.
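
To make the idea of loose synonym groups concrete, here is a small sketch. Only the religious group comes from the description above; the other groups, the group names, and the function matched_groups are invented for the example, and matching on exact whole words is a simplification.

```python
import re

# Each group name maps to a set of words treated as synonyms.
SYNONYM_GROUPS = {
    "religion": {"god", "pray", "bless"},
    "money": {"fund", "funds", "dollars", "payment"},   # hypothetical group
    "bank": {"bank", "account", "transfer"},            # hypothetical group
}


def matched_groups(text: str, groups: dict = SYNONYM_GROUPS) -> set:
    """Return the names of all synonym groups with at least one word in the text."""
    words = set(re.findall(r"[a-z']+", text.lower()))
    return {name for name, synonyms in groups.items() if words & synonyms}


if __name__ == "__main__":
    print(sorted(matched_groups("May God bless you. Send the payment.")))
```

Because a whole group counts as one hit, a script can be triggered by "religion" without listing every pious word a lad might use.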

The scenario table is the most complex table. It includes many script segments (short to long paragraphs) that are put together to form the reply email. Scripts are chosen by the program through events (stimuli) that trigger a script when their conditions are met. This is where the social engineering is done to develop the persona of the baiter.
A stored history of transmitted scripts to each scammer prevents key words from triggering the same script over and over again.  Instead, different scripts in a sequence are chosen for each new round.
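
The way a stored history keeps scripts from repeating can be sketched as below. This is an assumed, much reduced model: a trigger here is just a keyword group with an ordered list of scripts, and the names Trigger, History and build_reply are hypothetical. The real scenario table has richer conditions.

```python
from dataclasses import dataclass, field


@dataclass
class Trigger:
    group: str      # keyword group that fires this trigger
    scripts: list   # script segments, used one per round, in order


@dataclass
class History:
    # For each group, how many of its scripts this scammer has already received.
    sent: dict = field(default_factory=dict)


def build_reply(groups_found: set, triggers: list, history: History) -> str:
    """Assemble a reply from the next unused script of every trigger that fired."""
    parts = []
    for trigger in triggers:
        if trigger.group not in groups_found:
            continue
        index = history.sent.get(trigger.group, 0)
        if index >= len(trigger.scripts):
            continue  # this sequence is used up, so say nothing rather than repeat
        parts.append(trigger.scripts[index])
        history.sent[trigger.group] = index + 1
    return "\n\n".join(parts)


if __name__ == "__main__":
    triggers = [
        Trigger("religion", ["Bless you too, friend.", "Pastor says hello."]),
        Trigger("money", ["I keep my savings in a coffee can."]),
    ]
    history = History()
    print(build_reply({"religion", "money"}, triggers, history))
    print("---")
    print(build_reply({"religion", "money"}, triggers, history))
```

Running the example twice with the same keywords gives a different reply on the second round, which is the whole point of keeping the history.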

Scam classification
There are currently 10 different scam classifications. Classifications are important because some script triggers depend on the specific scam type, so email responses are better tuned to the scam. A neural network classifier is trained to recognize the type of scam. The user can do further training if desired, or can define and train a new classification. Training the classifier is not necessary, however, since it has already been done. Further documentation will be forthcoming.

List of current scammers
This table gives a synopsis of who is currently being baited, and what the status of the bait is. It is automatically generated from the history directory and is for information only. Each line in the table gives the scammer's email address, name, the number of emails exchanged, the last date of correspondence, and the email address of the cohort.

The control panel
Buttons on the control panel cycle the user through newly arrived email from the inbox. The buttons are:
Next. This button causes the next email in the inbox to be decoded, processed and displayed. The computer-generated reply is also displayed.
Again. The same email is reprocessed. This is valuable if the user wants to change data in the tables and rerun to check out the new results. The "Again" button can be pressed indefinitely while the user tunes up scripts and the stimuli to trigger scripts.
Previous. This button steps back and reruns the previous email in the inbox.
Letter/Gleaned/Trigger. This button cycles the displayed data to show diagnostics, i.e. data gleaned, and elements of the email that caused scripts to be triggered.  The generated letter is also shown here. If the user really wants, the letter can be manually altered before being sent.
Send. Sends off the letter through the account's SMTP service, bypassing the need for an email program.
Backup. Files are stored with "backup" added to the file name. This can be done occasionally when extensively updating a table so it won't be too devastating when the dog trips over your power cord.
Close. The autobaiter first asks if you want to save unsaved files before closing.
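
The Next, Again and Previous buttons amount to moving a cursor over the inbox and reprocessing whatever it points at. A minimal sketch follows. The class name InboxCursor, the process callback, and the behavior at either end of the inbox are assumptions for illustration.

```python
class InboxCursor:
    """Steps through inbox messages, reprocessing the current one on every call."""

    def __init__(self, messages, process):
        self.messages = list(messages)
        self.process = process   # function that turns a message into a reply
        self.index = -1          # nothing has been shown yet

    def current(self):
        if self.index < 0 or not self.messages:
            return None
        # Processing is rerun each time, so table changes take effect at once.
        return self.process(self.messages[self.index])

    def next(self):
        if self.index < len(self.messages) - 1:
            self.index += 1
        return self.current()

    def again(self):
        return self.current()

    def previous(self):
        if self.index > 0:
            self.index -= 1
        return self.current()


if __name__ == "__main__":
    cursor = InboxCursor(["first mail", "second mail"], process=str.upper)
    print(cursor.next())      # processes the first mail
    print(cursor.again())     # reprocesses the same mail
    print(cursor.next())      # moves on to the second mail
    print(cursor.previous())  # steps back to the first mail
```

Since "Again" simply calls the processing step once more, the user can edit a table and press the button as often as needed.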

History files
History files are not part of the GUI. They are separate text files stored for each lad in a dedicated history directory. The file name for each lad is simply <lad's email address>.txt. The history files store the names and numbers gleaned from the lad's email. Each file also stores information such as the last date of correspondence, the scam type, special labels to indicate the stage of the bait, and data to indicate which scripts have already been sent so that they will not be sent again, except in certain programmable circumstances. Finally, each history file stores a journal of the exchanged emails.
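
As an illustration of one text file per lad, here is a minimal sketch. The layout inside the file (a few "key: value" lines, a blank line, then the journal) is purely an assumption, as are the field names, the lowercasing of the address, and the function names history_path, save_history and load_history.

```python
from pathlib import Path


def history_path(directory: Path, email_address: str) -> Path:
    """File name is the lad's email address plus .txt (lowercased here by assumption)."""
    return directory / f"{email_address.strip().lower()}.txt"


def save_history(path: Path, fields: dict, journal: str) -> None:
    """Write 'key: value' header lines, a blank line, then the journal text."""
    header = "\n".join(f"{key}: {value}" for key, value in fields.items())
    path.write_text(header + "\n\n" + journal, encoding="utf-8")


def load_history(path: Path) -> tuple:
    """Read a history file back into (fields, journal)."""
    header, _, journal = path.read_text(encoding="utf-8").partition("\n\n")
    fields = {}
    for line in header.splitlines():
        key, _, value = line.partition(": ")
        fields[key] = value
    return fields, journal


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as tmp:
        path = history_path(Path(tmp), "Lad@Example.com")
        save_history(path, {"name": "John Okoro", "scam_type": "lottery"},
                     "IN: hello\n\nOUT: howdy")
        print(path.name)
        print(load_history(path))
```

Keeping everything in plain text means a history file can be read or fixed by hand in any editor.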

Baiting Mode
A front end dialog box allows the autobaiter to deal with an indefinite number of account names, potentially with different characters and scenarios.

Running the autobaiter.
The autobaiting sequence is to:

(1) Open the program. Double check what data files are being used. Click OK.

(2) Click "Get Mail". The first mail is opened and a reply is displayed. Gloat or frown over the reply. (Optional: Change the Keywords or Scenario tables if desired and hit "Update".) Hit the "Send" button. Click the "Next" button. Repeat step (2) until done.

(3) Click "Post All". Only saved mail will be sent. The others are left for next time, or deleted.

(4) Wait 12 to 24 hours and do over.