Overview:
Web forums have become an important data resource for many web applications, but extracting structured data from unstructured web forum pages is still a challenging task because of both complex page layout designs and unrestricted user-created posts. This paper studies the problem of structured data extraction from various web forum sites. The goal is to find a solution that is as general as possible for extracting structured data, such as post title, post author, post time, and post content, from any forum site. In contrast to most existing information extraction methods, which leverage only the knowledge inside an individual page, the paper incorporates both page-level and site-level knowledge and employs Markov Logic Networks (MLNs) to effectively integrate all useful evidence by learning their importance automatically.
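To make the idea of weighting page-level and site-level evidence concrete, here is a minimal sketch of how an MLN scores a single query ("is this page element the post author?"). It uses the standard MLN form, in which a world's probability is proportional to the exponential of the summed weights of the formulas it satisfies. The three formulas, the feature names, and the weights are hypothetical illustrations and are not taken from the paper; in the paper the weights are learned, whereas here they are set by hand.

```python
import math

def implies(a, b):
    """Truth value of the logical implication a => b."""
    return (not a) or b

# Hypothetical weighted formulas (weight, formula). Each formula takes the
# observed evidence for one page element and a candidate truth value for
# the query atom IsAuthor(element). Weights are hand-set assumptions.
FORMULAS = [
    # Page-level evidence: the element links to a user profile.
    (1.5, lambda ev, y: implies(ev["has_profile_link"], y)),
    # Site-level evidence: the element's DOM path recurs across the site.
    (2.0, lambda ev, y: implies(ev["path_repeats_across_site"], y)),
    # Page-level evidence: long free text is unlikely to be an author name.
    (1.0, lambda ev, y: implies(ev["is_long_text"], not y)),
]

def world_score(evidence, is_author):
    """Sum of the weights of all formulas satisfied in this world."""
    return sum(w for w, formula in FORMULAS if formula(evidence, is_author))

def prob_is_author(evidence):
    """P(IsAuthor = true | evidence), normalizing over the two worlds."""
    s_true = math.exp(world_score(evidence, True))
    s_false = math.exp(world_score(evidence, False))
    return s_true / (s_true + s_false)

if __name__ == "__main__":
    element = {
        "has_profile_link": True,
        "path_repeats_across_site": True,
        "is_long_text": False,
    }
    print(f"P(IsAuthor) = {prob_is_author(element):.3f}")
```

With a single query atom the normalization covers only two worlds, so exact inference is trivial. A real extraction task would have many interdependent atoms and would need approximate inference and weight learning.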
| Format: | |
| Size: | 832 KB |
| Date: | Apr 2009 |
| Pages: | 10 |











