The Autonomous Autobaiter.
Foreword
This page and its links are slightly outdated. Version 4 is near
completion and solves many of the inadequacies of previous versions.
Version 2 of the autobaiter is documented here. The most difficult
problem solved was getting the autobaiter to interface automatically
with email service providers. In version 2, an interface using POP/SMTP
(hotpop) was used; however, it proved to be very unreliable. Version 4
now behaves like any email program, working through a POP and SMTP
account with almost any provider that allows those protocols.
The major change to the program is the addition of two new buttons in
the Control Panel (see below). They get all mail and post all mail that
has been saved. Attachments are not downloaded, but their names and
extensions are, so the program can respond appropriately to the attachments.
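As a rough illustration of reading attachment names and extensions while ignoring the attachment contents, here is a minimal Python sketch. It is not the autobaiter's own code. It assumes the message has already been fetched as raw text, and the function name attachment_names and the sample message are invented for the example.

```python
import os
from email import message_from_string

def attachment_names(msg):
    """Return (filename, extension) pairs; attachment payloads are never decoded."""
    found = []
    for part in msg.walk():
        filename = part.get_filename()  # None for parts that carry no file name
        if filename:
            ext = os.path.splitext(filename)[1].lower().lstrip(".")
            found.append((filename, ext))
    return found

# Hypothetical sample message, used only to exercise the function.
RAW = """\
From: lad@example.com
To: arnie@example.com
Subject: documents
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="XYZ"

--XYZ
Content-Type: text/plain

Please see the attached form.
--XYZ
Content-Type: application/pdf; name="form.PDF"
Content-Disposition: attachment; filename="form.PDF"
Content-Transfer-Encoding: base64

JVBERi0=
--XYZ--
"""

print(attachment_names(message_from_string(RAW)))
```

A reply script could then be keyed on the extension, for example one response for a "pdf" form and another for a "jpg" photo.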
Graphic User Interface
A GUI allows the user to work with the autobaiter. The user develops
the personality and character of the bait largely through a scenario
table. The scam emails are cycled through the system with a control
panel of buttons. Two text windows show details of the emails. The left
text window is the scammer's incoming email and the right text window
is the outgoing computer-generated reply. The user has the option of
looking at many details of why and how the reply was generated so that
improvements can be made in the scenario table.
Tenor and scenario
The tenor of the bait is given by a list of scripts or, more
accurately, script segments. In an early test, the scripts of dialog in
this program defined a "straight bait". The problem was that the bait
moved too fast, and the scammer began to expect too much from the
autobaiter. The tone of the scripts was chosen to maximize the lad's
confidence that the baiter was a genuine, qualified victim. This too
quickly led to pressure on the baiter to contact the second-level lad
in the scam: the cohort. An early version of the autobaiter could not
handle it well at all. I was advised to dumb it down with the idea of
Grandpa's Go-nowhere Stories (thanks, Epistimon). Now the second
scenario has a much slower pace with a slower-thinking baiter persona:
Arnie Leapzorp, a hick farmer who is all too long-winded.
Gleaning information
The most difficult problem in autonomous autobaiting is gleaning vital
information from the scam email. For example, the lad's name is gleaned
by comparing the following sources: the name field in the header, the
email address itself, any titles such as Mr, Miss or Prince, phrases
such as "I am ..." and "my name is ...", and closing salutations such as
"Best regards, ..." and "Sincerely, ...". Name correlations are done in
many small segments because, for example, the header might have the
first and last name reversed, run together, or abbreviated. A priority
is set where the highest credence is given to "my name is ..." and to a
closing salutation where the name is isolated on the next line or two.
If these reliable name indicators are not given, back-up names are
provided. An operator alert is given with details of the gleaned data.
The name becomes "Friend" if the user ignores the alert. The user can
override that decision with a pull-down menu of the possibilities in
which the autobaiter could not gain confidence.
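The priority scheme described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the autobaiter's actual code: the regular expressions, the list of titles, and the function name glean_name are all invented here, and the real program correlates many more sources (including the email address itself) in small segments.

```python
import re

# A capitalised name of one to three words on a single line, with an optional title.
NAME = (r"((?:(?:Mr|Mrs|Miss|Dr|Prince|Barrister)\.?[ \t]+)?"
        r"[A-Z][a-z]+(?:[ \t]+[A-Z][a-z]+){0,2})")

# (label, pattern) pairs, most trusted first.
RULES = [
    ("my name is", re.compile(r"[Mm]y name is\s+" + NAME)),
    ("closing", re.compile(r"(?:Best regards|Sincerely),?\s*\n\s*" + NAME)),
    ("i am", re.compile(r"\bI am\s+" + NAME)),
]

def glean_name(body, header_name=""):
    """Return (name, source) using the first rule that matches the email body."""
    for label, pattern in RULES:
        match = pattern.search(body)
        if match:
            return match.group(1).strip(), label
    if header_name.strip():
        return header_name.strip(), "header"  # back-up name from the header field
    return "Friend", "fallback"               # nothing usable was found

body = "Hello. My name is Dr. John Obi and I need help.\nBest regards,\nJ. Obi"
print(glean_name(body))
```

Returning the source label alongside the name is one way to feed an operator alert: a "header" or "fallback" result is exactly the case where the user would want to check the pull-down menu.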
Gleaning the proper response email address is also difficult. The
correct email address for the response is often not the one used in the
scammer's first mailing. Often the "reply-to" in the header is wrong
because of a lazy lad. If the mail is in hypertext, a "mailto" link is
a fairly reliable source. If the mail is in plain text, the program
searches for email addresses that are preceded by phrases such as "my
email", "my address", "respond to", etc. The lads will often suddenly
change their email address because they were shut down by their service
provider. Detecting that the new address comes from one of the lads
already in the hundreds of history files is tricky.
There are over 1200 lines of code for gleaning names, numbers and
addresses.
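The search order for the reply address might look something like the following Python sketch. It is a simplified guess at the approach, not the program's real logic: the patterns, the 40-character window after a cue phrase, and the function name glean_reply_address are assumptions made for the example.

```python
import re

EMAIL = r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"

# An address inside a mailto link, as found in hypertext mail.
MAILTO = re.compile(r"mailto:(" + EMAIL + ")", re.IGNORECASE)

# An address shortly after a cue phrase on the same line, as found in plain text.
CUED = re.compile(
    r"(?:my e-?mail|my address|respond to)\b[^@\n]{0,40}?(" + EMAIL + ")",
    re.IGNORECASE,
)

def glean_reply_address(body, is_html, header_reply_to="", header_from=""):
    """Pick the most trustworthy reply address available."""
    if is_html:
        match = MAILTO.search(body)
        if match:
            return match.group(1)
    match = CUED.search(body)
    if match:
        return match.group(1)
    # Fall back to the headers only when the body offers nothing better.
    return header_reply_to or header_from

text = "Please respond to my private box: john.obi@example.net today."
print(glean_reply_address(text, False, "old@example.com"))
```

The headers are deliberately the last resort here, since the text above notes that the "reply-to" is often wrong.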
Find the cohort
One current problem that is fairly well solved is to make a viable
transition from the level 1 lad to the level 2 lad. The cohort, who
often poses as a banker, lawyer or security company, is generally more
intelligent, and it is very important to waste his time. However, the
original lad is not let off easily: the emails to the cohort are copied
to the lad along with a continuing banter to exhaust the lad's supply
of scripts. History files of the lad and cohort are continually updated
and are cross-referenced to each other to allow more intelligent
responses.
Email attachments
A number of fake images are available to automatically attach to
responses, such as passports (a totally noisy image with vague streaks)
and marginally readable Western Union transfer receipts (WUxfr).
Requested documents that are not available are also sent as generic
corrupted files. The WUxfrs are automatically generated by a Java
imaging program with the appropriate information automatically written.
Sending a WUxfr is very important in prolonging baits. The version 2
program is able to glean the particulars for filling out the WUxfr and
do it autonomously.
(See the web site http://www.geocities.com/hemorr_ice/WU-Receipt-Maker.html
for details of a manual version.)
Instruction manual?
This section gives an overview of how the autobaiter works in
practice. It will eventually become an instruction manual.
The Graphic User Interface
The following sections cover details of the GUI and underlying
concepts.
The GUI allows the user to see data in a number of tables each
accessed through a tab in the main frame. Further details
of the program are provided by clicking the paragraph headings below.
Keywords
The tables include keywords, which are used in a search through the
incoming email text. Keywords are grouped by synonyms. The synonyms are
sometimes quite loose; for example, all religious words are synonymous,
such as {God, pray, bless, etc.}. The user can alter any aspect of the
keyword table to suit the baiting style.
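A keyword table of loose synonym groups can be pictured with the Python sketch below. The group names, the word lists beyond the religious example, and the function name find_keyword_groups are made up for illustration; the real table is edited by the user in the GUI.

```python
import re

# Hypothetical keyword table: each group name maps to a set of loose synonyms.
SYNONYM_GROUPS = {
    "religion": {"god", "pray", "bless"},
    "money": {"fund", "funds", "transfer", "fee"},
}

def find_keyword_groups(text):
    """Return the names of all synonym groups with at least one word in the text."""
    words = set(re.findall(r"[a-z']+", text.lower()))
    return {group for group, synonyms in SYNONYM_GROUPS.items() if words & synonyms}

print(find_keyword_groups("May God bless you as you send the fee."))
```

Because every word in a group counts as the same stimulus, "God" and "pray" would trigger the same scripts.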
Scenario
This is the most complex table and includes many script segments (short
to long paragraphs) that are put together to form the reply email.
Scripts are chosen by the program through events (stimuli) that trigger
a script when the conditions are met. Here is where the social
engineering is done to develop the persona of the baiter. A stored
history of the scripts transmitted to each scammer prevents keywords
from triggering the same script over and over again. Instead, different
scripts in a sequence are chosen for each new round.
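The idea of stepping through a sequence of scripts, with a per-scammer history preventing repeats, can be sketched as follows in Python. The scenario contents, the counter-based history, and the function name choose_scripts are assumptions for the example. The real scenario table supports richer trigger conditions than a single keyword group.

```python
# Hypothetical scenario table: each trigger maps to an ordered sequence of scripts.
SCENARIO = {
    "religion": ["I prayed on it at the barn.", "The preacher says hello."],
    "money": ["Is that in dollars?", "My bank is 40 miles off.", "The tractor broke."],
}

def choose_scripts(triggered, sent_counts):
    """Pick the next unsent script for each triggered group and update the history.

    sent_counts is this scammer's stored history: how many scripts of each
    group have already been sent to him.
    """
    reply = []
    for group in sorted(triggered):
        scripts = SCENARIO.get(group, [])
        index = sent_counts.get(group, 0)
        if index < len(scripts):  # stay silent once the sequence is exhausted
            reply.append(scripts[index])
            sent_counts[group] = index + 1
    return reply

history = {}
print(choose_scripts({"money"}, history))  # first script of the sequence
print(choose_scripts({"money"}, history))  # next round gets the next script
```

Storing only a counter per group keeps the history small. A fuller version would record the individual scripts sent so that exceptions to the no-repeat rule can be programmed.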
Scam classification
There are currently 10 different scam classifications. Classifications
are important because some script triggers depend on the specific scam
type. The result is that email responses are more tuned to the scam. A
neural network classifier is trained to recognize the type of scam. The
user can do further training if desired, or can define and train a new
classification if desired. But training the classifier is not necessary
since it has already been done. Further documentation will be
forthcoming.
List of current scammers
This table gives a synopsis of who is currently being baited and what
the status of the bait is. It is automatically generated from the
history directory and is for information only. Each line in the table
gives the scammer's email, name, the number of emails exchanged, the
last date of correspondence, and the email of the cohort.
The control panel
Buttons on the control panel cycle the user through newly arrived email
from the inbox. The buttons are:
Next. This button causes the next email in the inbox to be decoded,
processed and displayed. The computer-generated reply is also displayed.
Again. The same email is reprocessed. This is valuable if the user
wants to change data in the tables and rerun to check the new results.
The "Again" button can be pressed indefinitely while the user tunes up
scripts and the stimuli that trigger them.
Previous. This button steps back and reruns the previous email in the
inbox.
Letter/Gleaned/Trigger. This button cycles the displayed data to show
diagnostics, i.e. the data gleaned and the elements of the email that
caused scripts to be triggered. The generated letter is also shown
here. If the user really wants, the letter can be manually altered
before being sent.
Send. Sends the letter through the POP3/SMTP account, bypassing the
need for an email program.
Backup. Files are stored with "backup" added to the file name. This can
be done occasionally when extensively updating a table so it won't be
too devastating when the dog trips over your power cord.
Close. The autobaiter first asks if you want to save unsaved files
before closing.
History files
History files are not part of the GUI. They are separate text files
stored for each lad in a dedicated history directory. The file name for
each lad is simply <lad's email address>.txt. The history files store
the names and numbers gleaned from the lad's email. Each file also
stores information such as the last date of correspondence, the scam
type, special labels to indicate the stage of the bait, and data to
indicate what scripts have already been sent so that they will not be
sent again, except in certain programmable circumstances. Finally, each
history file stores a journal of the exchanged emails.
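To make the layout concrete, here is a small Python sketch of one possible history file format and of building the current-scammers synopsis from the history directory. The actual file format is not documented here, so the "key: value" lines, the journal marker, the field names, and all function names below are assumptions invented for the example.

```python
import tempfile
from pathlib import Path

JOURNAL_MARK = "--- journal ---"  # hypothetical separator before the email journal

def history_path(directory, email):
    """One text file per lad, named <lad's email address>.txt."""
    return Path(directory) / f"{email}.txt"

def save_history(directory, email, fields, journal):
    """Write the gleaned fields as 'key: value' lines, followed by the journal."""
    lines = [f"{key}: {value}" for key, value in fields.items()]
    lines.append(JOURNAL_MARK)
    lines.extend(journal)
    history_path(directory, email).write_text("\n".join(lines) + "\n", encoding="utf-8")

def load_fields(path):
    """Read back the 'key: value' lines, stopping at the journal."""
    fields = {}
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        if line == JOURNAL_MARK:
            break
        key, _, value = line.partition(": ")
        fields[key] = value
    return fields

def synopsis(directory):
    """One row per lad: email, name, emails exchanged, last date, cohort email."""
    rows = []
    for path in sorted(Path(directory).glob("*.txt")):
        f = load_fields(path)
        rows.append((path.stem, f.get("name", ""), f.get("emails_exchanged", ""),
                     f.get("last_date", ""), f.get("cohort_email", "")))
    return rows

with tempfile.TemporaryDirectory() as demo_dir:
    save_history(demo_dir, "lad@example.com",
                 {"name": "John Obi", "emails_exchanged": 4,
                  "cohort_email": "bank@example.org"},
                 ["IN: Dear friend ...", "OUT: Howdy ..."])
    print(synopsis(demo_dir))
```

Keeping the files as plain text means they can be inspected or corrected by hand. Note that an email address may contain characters that are not valid in file names on every system, which this sketch does not handle.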
Baiting Mode
A front end dialog box allows the autobaiter to deal with an
indefinite number of account names, potentially with different
characters and scenarios.
Running the autobaiter.
The autobaiting sequence is to:
(1) Open the program. Double check what data files are being used. Click OK.
(2) Click "Get Mail". The first mail is opened and a reply is displayed.
Gloat or frown over the reply. (Optional:
Change the Keywords or Scenario tables if desired and hit
"Update".) Hit the "Send" button. Click the "Next" button. Repeat step (2) until done.
(3) Click "Post All". Only saved mail will be sent. The others are left for next time, or deleted.
(4) Wait 12 to 24 hours and repeat from step (1).