Resources for Research on Web Spam

Mailing List

We recommend that you subscribe to our mailing list. Announcements of datasets, challenges and conferences related to Web spam are posted to this low-volume, announcements-only list.

Datasets

We host Web spam datasets developed through the collaborative effort of a team of volunteers. The goal of our dataset activity is to make reference collections available.

We currently host a set of collections for research on Web spam. See datasets.

See also

ECML/PKDD Discovery Challenge 2010 — competition to identify methods for assessing Web Quality including spam.

Web Spam Challenge 2007/2008 — competition to identify methods for detecting Web Spam.

AIRWeb — workshop on Adversarial Information Retrieval on the Web

Source code (archived) — Truncated PageRank and Adaptive Estimation of Supporters, the algorithms proposed in a WebKDD'06 paper.

For inquiries, please contact Carlos Castillo.
Last updated: September 10, 2012.