Spam filtering: prototypes
Image Spam Lab
To experiment with computer vision and pattern recognition techniques against image spam, we developed a tool for generating artificial spam images. The tool generates images whose embedded text is obfuscated with several techniques that spammers use in real spam e-mails. The user can enter any text (or choose one of the predefined texts) and can set the text features (font face and size), the width and height of the image, the obfuscation technique, and the obfuscation level. Three obfuscation techniques have been implemented so far, as in the following examples.



The obfuscation level can be tuned over a range that goes from clean images to heavily obfuscated images that OCR tools cannot read but that a human being can still read, as in the examples below.




Fig.2: Examples of the three obfuscation techniques at the 50% obfuscation level (left) and at the 100% obfuscation level (right), applied to the original clean image (top image).
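The idea of a tunable obfuscation level can be illustrated with a minimal sketch. It is not the code of Image Spam Lab: the text above does not name the three techniques, so random pixel noise is used here purely as an assumed example of an obfuscation. The function name obfuscate_with_noise, the constant MAX_FLIP_PERCENT, and the linear mapping from level to the share of altered pixels are all hypothetical. The image is a plain list of rows of 0/1 pixels so that the sketch needs only the standard library.

```python
import random

# Assumption: share of pixels (in percent) altered at the 100% level.
# The real tool's mapping from level to distortion is not documented here.
MAX_FLIP_PERCENT = 30


def obfuscate_with_noise(image, level, seed=None):
    """Return a copy of a binary image with random pixels flipped.

    image: list of equal-length rows, each pixel 0 (background) or 1 (ink).
    level: obfuscation level in percent, from 0 (clean) to 100 (maximum).
    seed:  optional seed, so that a generated image can be reproduced.
    """
    if not 0 <= level <= 100:
        raise ValueError("level must be between 0 and 100")

    height, width = len(image), len(image[0])
    # Integer arithmetic: the number of flipped pixels grows linearly with level.
    n_flip = width * height * MAX_FLIP_PERCENT * level // 10000

    rng = random.Random(seed)
    coords = [(y, x) for y in range(height) for x in range(width)]
    noisy = [row[:] for row in image]  # leave the clean image untouched
    for y, x in rng.sample(coords, n_flip):
        noisy[y][x] = 1 - noisy[y][x]
    return noisy


if __name__ == "__main__":
    # A 5x12 clean "image" with a horizontal stroke standing in for text.
    clean = [[0] * 12 for _ in range(5)]
    clean[2] = [1] * 12
    for level in (0, 50, 100):
        print(f"level {level}%")
        for row in obfuscate_with_noise(clean, level, seed=42):
            print("".join("#" if p else "." for p in row))
```

Under these assumptions, level 0 returns the clean image unchanged, and each step in level alters a proportionally larger set of pixels, which mirrors the clean-to-unreadable progression described above.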
A screenshot of our Image Spam Lab tool is shown below.

Fig.3: A screenshot of Image Spam Lab.