Word Salad

In the mental health field, word salad describes the confused and often repetitious language that is symptomatic of various psychoses and other serious mental illnesses. Words are used with no apparent meaning attached to them or to the relationships between them. In this context, word salad is considered a symptom of a formal thought disorder and is most often associated with schizophrenia.

In spam e-mail

In response to the growing problem of spam e-mail, filtering tools that implement the naive Bayes classifier became widely available starting around 2002. The method uses the probability of individual words appearing in spam and in legitimate e-mail to classify each incoming message automatically. For a short time, this worked fairly well. Spammers then developed word salad to fool programs that rely on this kind of classification. By adding large amounts of random text somewhere in the message, typically random words from a dictionary, spammers hope to dilute the spam-indicating words and lead a Bayesian classifier to label the message as "ham e-mail" (non-spam e-mail).

Algorithms for detecting word salad are feasible and not particularly difficult to implement, although most would be more computationally intensive than the rules used by most spam filters today. A statistical approach based on Zipf's law of word frequency has potential for detecting simple word salad, as do grammar checking and other natural language processing algorithms. Statistical Markovian analysis, which checks whether short phrases are likely to occur in normal English sentences, is another statistical approach that would likely be effective.
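The dilution effect that word salad relies on can be illustrated with a very small naive Bayes scorer. This is a minimal sketch, not the algorithm of any particular filter: the training messages, the function names (`train`, `log_likelihood`, `spam_log_odds`), the use of Laplace smoothing, and the assumption of equal class priors are all illustrative choices. The padding words are deliberately drawn from the toy "ham" messages, which stands in for the assumption that ordinary dictionary words are more common in legitimate mail than in spam.

```python
import math
from collections import Counter

# Toy training data (hypothetical, for illustration only).
SPAM = ["buy cheap pills now", "cheap pills buy now online", "win money now"]
HAM = ["meeting moved to monday morning",
       "see the garden report on monday",
       "lunch in the garden after the meeting"]

def train(messages):
    """Count how often each word occurs across a list of messages."""
    return Counter(word for msg in messages for word in msg.split())

spam_counts, ham_counts = train(SPAM), train(HAM)
vocab = set(spam_counts) | set(ham_counts)

def log_likelihood(word, counts):
    """Laplace-smoothed log P(word | class)."""
    total = sum(counts.values())
    return math.log((counts[word] + 1) / (total + len(vocab)))

def spam_log_odds(message):
    """Sum of per-word log-odds; a positive value means 'more likely spam'.

    Equal class priors are assumed, so the prior term cancels out.
    """
    return sum(log_likelihood(w, spam_counts) - log_likelihood(w, ham_counts)
               for w in message.split())

plain = "buy cheap pills now"
# The same message padded with words that the toy filter associates with ham.
padded = plain + " meeting garden monday report lunch morning garden meeting"

print("plain :", spam_log_odds(plain))
print("padded:", spam_log_odds(padded))
```

With this toy data the plain message gets a positive score and the padded one a negative score, so the padding alone is enough to flip the classification. A real filter trained on a large mailbox would behave less dramatically, but the mechanism is the same: every added word contributes its own evidence to the sum.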

Sentence and paragraph salad

In a related technique, actual text from some large corpus of legitimate English (Shakespeare, random World Wide Web pages, or the like) is added to the email. This approach attempts to get around algorithms that could be devised to detect the more primitive form of word salad. Paragraph salad reduces the effectiveness of the detection algorithms mentioned above and makes the message look more legitimate to a Bayesian filter. The only algorithms that might thwart sentence and paragraph salad are very high-level and expensive natural language processing, some kind of artificial intelligence algorithm involving a search engine, or an exhaustive listing of spam emails. All of these techniques would be exceptionally expensive and, despite their cost, would likely not be very successful at filtering spam.
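Why borrowed text defeats a phrase-level detector can be shown with a rough sketch of the Markovian idea: measure what fraction of adjacent word pairs in a message also occur in a reference corpus of normal English. Everything here is an assumption made for illustration, including the one-passage corpus, the names `bigrams` and `known_bigram_ratio`, and the choice of word pairs as the "short phrases". For simplicity the sentence salad sample is taken from the reference corpus itself; with a large corpus, any fluent English would be expected to score high as well.

```python
# Tiny stand-in corpus (hypothetical); a real detector would need a large one.
CORPUS = ("to be or not to be that is the question "
          "whether tis nobler in the mind to suffer "
          "the slings and arrows of outrageous fortune")

def bigrams(text):
    """Return the list of adjacent word pairs in the text."""
    words = text.lower().split()
    return list(zip(words, words[1:]))

KNOWN = set(bigrams(CORPUS))

def known_bigram_ratio(text):
    """Fraction of adjacent word pairs that also occur in the reference corpus."""
    pairs = bigrams(text)
    if not pairs:
        return 0.0
    return sum(p in KNOWN for p in pairs) / len(pairs)

# Random words: every word is in the corpus, but no pair of neighbours is.
word_salad = "fortune nobler be arrows question mind whether slings"
# Borrowed legitimate text: every pair of neighbours is in the corpus.
sentence_salad = "to be or not to be that is the question"

print("word salad    :", known_bigram_ratio(word_salad))
print("sentence salad:", known_bigram_ratio(sentence_salad))
```

The random-word sample scores 0.0 and would be flagged, while the borrowed passage scores 1.0 and passes, which is exactly the gap that sentence and paragraph salad exploit.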

Letter salad

On an even smaller scale than word salad, spammers use misspellings of words to try to thwart Bayesian filters. Misspelling Viagra as Via6ra, \/|/\Gr/\, or in countless other ways, or even using characters from international character sets, is an attempt to avoid the high efficiency with which a Bayesian filter would classify any email containing spammy words as spam. A simple spell checker might significantly reduce the effectiveness of letter salad approaches, yet most present spam filters don't take this step. The lengths to which some spammers have gone with letter salad have often produced illegible, almost laughable messages. Reading such email has become akin to the old "What does that license plate mean?" game.
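The spell-checker idea can be approximated with character normalization followed by fuzzy matching against a list of known spam terms. The sketch below is only one possible approach under stated assumptions: the substitution table, the watch-list, the 0.8 similarity cutoff, and the function names `normalize` and `match_watchlist` are all hypothetical. It uses `difflib.get_close_matches` from the Python standard library.

```python
import difflib

# Hypothetical substitution table: look-alike characters mapped to letters.
SUBSTITUTIONS = str.maketrans({
    "6": "g", "0": "o", "1": "i", "3": "e", "4": "a", "@": "a", "$": "s",
})

# Hypothetical watch-list of terms the filter already treats as spammy.
WATCHLIST = ["viagra", "pills", "casino"]

def normalize(token):
    """Lower-case the token and undo single-character substitutions."""
    return token.lower().translate(SUBSTITUTIONS)

def match_watchlist(token, cutoff=0.8):
    """Return the closest watch-list term, or None if nothing is similar enough."""
    hits = difflib.get_close_matches(normalize(token), WATCHLIST,
                                     n=1, cutoff=cutoff)
    return hits[0] if hits else None

for token in ["Via6ra", "V1agra", "garden"]:
    print(token, "->", match_watchlist(token))
```

A table like this only handles one-to-one character swaps. Multi-character drawings such as \/|/\Gr/\ would need a separate, more elaborate normalization step, which this sketch does not attempt.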

Future

As spam filters get better at detecting simple word and letter salad, spammers will likely migrate towards sentence and paragraph salad techniques. In the process of obscuring their message from improving spam filters, they will also obscure their message from potential targets of their advertising, virus distribution, or phishing. At some point, the profitability of spam may be brought down to the point that its volume is substantially reduced.

Recommendations

End users should take no action upon receiving email that contains word salad, or whose sender or purpose is unclear. Opening questionable email, and especially clicking on links contained in it, may compromise information security.

 
