Overview:

This paper describes a machine learning approach to detecting web spam. Each example in the classification task corresponds to 100 web pages from a host, and the task is to predict whether that collection of pages is spam. The authors begin by adding several human-engineered features constructed from the raw data. They then build a rough classifier and use semi-supervised learning to classify the unlabelled examples provided to them. Next, they construct additional link-based features and incorporate them into the training process. They also employ a combinatorial feature-fusion method to compress the enormous number of available word-based features so that conventional machine learning algorithms can be used. Their results demonstrate the effectiveness of both semi-supervised learning and the combinatorial feature-fusion method.
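
The semi-supervised step described above can be illustrated with a small self-training loop. This is only a sketch under stated assumptions: the overview does not say which semi-supervised variant, classifier, or features the authors used, so the nearest-centroid classifier, the `min_margin` threshold, the two-feature host vectors, and all function names here are hypothetical stand-ins chosen to keep the example dependency-free.

```python
from math import dist


def centroids(X, y):
    """Return the mean feature vector for each class label."""
    sums, counts = {}, {}
    for features, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def predict_with_margin(model, features):
    """Return (nearest label, margin between the two closest centroids)."""
    ranked = sorted((dist(features, c), label) for label, c in model.items())
    margin = ranked[1][0] - ranked[0][0] if len(ranked) > 1 else float("inf")
    return ranked[0][1], margin


def self_train(X_lab, y_lab, X_unlab, min_margin=0.5, max_rounds=10):
    """Grow the training set with confidently pseudo-labelled examples.

    Assumption: "confidence" is the distance margin; real systems would
    more likely use a classifier's predicted probability.
    """
    X, y = list(X_lab), list(y_lab)
    pool = list(X_unlab)
    for _ in range(max_rounds):
        model = centroids(X, y)  # the "rough classifier" for this round
        confident, remaining = [], []
        for features in pool:
            label, margin = predict_with_margin(model, features)
            if margin >= min_margin:
                confident.append((features, label))
            else:
                remaining.append(features)
        if not confident:
            break  # nothing new to learn from
        for features, label in confident:
            X.append(features)
            y.append(label)
        pool = remaining
    return centroids(X, y), pool


# Hypothetical per-host feature vectors (values are illustrative only).
X_labelled = [[0.9, 0.8], [0.8, 0.9], [0.1, 0.2], [0.2, 0.1]]
y_labelled = ["spam", "spam", "normal", "normal"]
X_unlabelled = [[0.85, 0.85], [0.15, 0.15], [0.5, 0.5]]

model, still_unlabelled = self_train(X_labelled, y_labelled, X_unlabelled)
```

In this toy run, the two unlabelled hosts that sit close to a class centroid are absorbed into the training set, while the ambiguous host at `[0.5, 0.5]` stays in `still_unlabelled` because its margin is below the threshold. Leaving low-confidence examples unlabelled is one common way to limit the risk of reinforcing early mistakes.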

Format: PDF
Size: 158 KB
Date: Jul 2007
Pages: 8