Overview:
This paper describes a machine learning approach for detecting web spam. Each example in this classification task corresponds to 100 web pages from a host, and the task is to predict whether this collection of pages represents spam. The authors begin by adding several human-engineered features constructed from the raw data. They then build a rough classifier and use semi-supervised learning to classify the unlabelled examples provided to them. Next, they construct additional link-based features and incorporate them into the training process. They also employ a combinatorial feature-fusion method for compressing the enormous number of available word-based features, so that conventional machine learning algorithms can be used. The results demonstrate the effectiveness of both semi-supervised learning and the combinatorial feature-fusion method.
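The semi-supervised step can be pictured as a self-training loop: fit a rough classifier on the labelled hosts, label the unlabelled hosts it is most confident about, and refit. The sketch below is only an illustration under stated assumptions. This overview does not say which classifier, features, or confidence rule the paper uses, so the nearest-centroid model, the `min_margin` threshold, and the two-number feature vectors are all hypothetical.

```python
from statistics import mean

def centroid(rows):
    """Component-wise mean of a list of feature vectors."""
    return [mean(col) for col in zip(*rows)]

def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def fit(X, y):
    """Rough classifier (an assumption): one centroid per class label."""
    return {label: centroid([x for x, t in zip(X, y) if t == label])
            for label in set(y)}

def predict_with_margin(model, x):
    """Return (label, margin); margin is the distance gap between
    the nearest and second-nearest class centroids."""
    ranked = sorted((sq_dist(x, c), label) for label, c in model.items())
    return ranked[0][1], ranked[1][0] - ranked[0][0]

def self_train(X_lab, y_lab, X_unlab, min_margin=0.5, max_rounds=5):
    """Self-training: repeatedly adopt confident pseudo-labels and refit."""
    X, y, pool = list(X_lab), list(y_lab), list(X_unlab)
    for _ in range(max_rounds):
        model = fit(X, y)
        confident, remaining = [], []
        for x in pool:
            label, margin = predict_with_margin(model, x)
            if margin >= min_margin:
                confident.append((x, label))
            else:
                remaining.append(x)
        if not confident:
            break  # nothing new to learn from
        for x, label in confident:
            X.append(x)
            y.append(label)
        pool = remaining
    return fit(X, y), pool

# Hypothetical per-host feature vectors, invented for illustration only.
X_lab = [[0.9, 0.8], [0.8, 0.9], [0.1, 0.2], [0.2, 0.1]]
y_lab = ["spam", "spam", "normal", "normal"]
X_unlab = [[0.85, 0.85], [0.15, 0.15], [0.5, 0.5]]

model, leftover = self_train(X_lab, y_lab, X_unlab)
```

Hosts that never clear the margin threshold stay in `leftover` instead of being forced into a class, which keeps ambiguous pseudo-labels out of the training set.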
| Format: | |
| Size: | 158 KB |
| Date: | Jul 2007 |
| Pages: | 8 |