1st International Conference on Collaborative Computing: Networking, Applications and Worksharing

Research Article

A collaborative and multi-agent system for e-mail filtering and classification

  • @INPROCEEDINGS{10.1109/COLCOM.2005.1651248,
        author={Lorenzo Lazzari and Marco Mari and Agostino Poggi},
        title={A collaborative and multi-agent system for e-mail filtering and classification},
        proceedings={1st International Conference on Collaborative Computing: Networking, Applications and Worksharing},
        publisher={IEEE},
        proceedings_a={COLLABORATECOM},
        year={2006},
        month={7},
        keywords={Bayesian methods, Collaboration, Databases, Electronic mail, Information filtering, Information filters, Internet, Multiagent systems, Postal services, Unsolicited electronic mail},
        doi={10.1109/COLCOM.2005.1651248}
    }
    
  • Lorenzo Lazzari
    Marco Mari
    Agostino Poggi
    Year: 2006
    A collaborative and multi-agent system for e-mail filtering and classification
    COLLABORATECOM
    IEEE
    DOI: 10.1109/COLCOM.2005.1651248
Lorenzo Lazzari1,*, Marco Mari1,*, Agostino Poggi1,*
  • 1: Dipartimento di Ingegneria dell’Informazione, Università degli Studi di Parma, Parco Area delle Scienze 181/A, 43100 Parma, Italy
*Contact email: lazzari@ce.unipr.it, mari@ce.unipr.it, poggi@ce.unipr.it

Abstract

CAFE (collaborative agents for filtering e-mails) is a multi-agent system that collaboratively filters spam and classifies legitimate messages in users' mail streams. CAFE associates a proxy agent with each user; this agent acts as an interface between the user's e-mail client and the e-mail server. With the support of other types of agents, the proxy agent classifies new messages into three categories: ham (good messages), spam, and spam-presumed. Ham messages can in turn be divided on the basis of the sender's identity and reputation, where the reputation is collaboratively inferred from users' ratings. The filtering process combines three approaches: one based on a hash function, a static approach using DNSBL (DNS-based blacklist) databases, and a dynamic approach based on a Bayesian filter. We give a mathematical representation of the system, showing that if users collaborate, the fault probability decreases in proportion to the number of active users.
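As a rough illustration of how the three filtering approaches named in the abstract could be combined, the following Python sketch checks a message against a set of known spam digests, then a static blacklist, then a simple Bayesian score. It is not the authors' implementation. The stage ordering, the use of SHA-256, the in-memory set standing in for a DNSBL query, the per-word probabilities, and the two thresholds are all assumptions made for the example, as is the function name `classify`.

```python
import hashlib
import math


def classify(body, sender_ip, known_spam_hashes, blacklisted_ips,
             spam_word_prob, spam_cut=0.9, presumed_cut=0.6):
    """Return "ham", "spam" or "spam-presumed" for one message.

    All parameters are hypothetical: the abstract does not specify the
    hash function, the DNSBL lookup, or the decision thresholds.
    """
    # Stage 1 (hash approach): exact match against digests of messages
    # already reported as spam, e.g. shared by collaborating users.
    digest = hashlib.sha256(body.encode("utf-8")).hexdigest()
    if digest in known_spam_hashes:
        return "spam"

    # Stage 2 (static approach): a plain set stands in for a real
    # DNSBL query so that the sketch stays self-contained.
    if sender_ip in blacklisted_ips:
        return "spam"

    # Stage 3 (dynamic approach): naive Bayes combination of per-word
    # spam probabilities, accumulated as log-odds. Each probability
    # must lie strictly between 0 and 1. Unknown words are ignored.
    log_odds = 0.0
    for word in set(body.lower().split()):
        p = spam_word_prob.get(word)
        if p is not None:
            log_odds += math.log(p) - math.log(1.0 - p)
    p_spam = 1.0 / (1.0 + math.exp(-log_odds))

    if p_spam >= spam_cut:
        return "spam"
    if p_spam >= presumed_cut:
        return "spam-presumed"
    return "ham"
```

In this sketch a message that passes the first two stages and contains no known words gets a score of 0.5 and is treated as ham; a real deployment would need trained word statistics and tuned thresholds.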