AUTHOR
Lida Zhu
Automatic Categorization of Web Sites
2008
Master's thesis in information and communication technology, 2008, Universitetet i Agder, Grimstad. In this thesis we present a solution for classifying web sites by geographical attribute codes (NUTS) and economic activity attribute codes (NACE). We propose a solution for web site classification with high accuracy. We use keyword-based document classification methods, which have shown good performance. After classification, each document is assigned a class label from a set of predefined categories, based on a pool of pre-classified sample documents. Our solution removes stop words and skips HTML tags in order to identify the informative terms and remove the non-informative or red…
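The preprocessing and keyword-based labeling described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the stop-word list, the class labels, the keyword pools, and the function names `informative_terms` and `classify` are all hypothetical, and the overlap-count scoring is only one simple way to assign a label from predefined categories.

```python
import re
from collections import Counter
from html.parser import HTMLParser

# Hypothetical stop-word list; a real system would use a much larger one.
STOP_WORDS = {"the", "a", "an", "and", "or", "of", "in", "to", "for",
              "we", "our", "is", "are"}

# Hypothetical keyword pools standing in for terms drawn from
# pre-classified sample documents. The labels are placeholders,
# not actual NACE or NUTS codes.
CLASS_KEYWORDS = {
    "fishing": {"fish", "salmon", "trawler", "aquaculture"},
    "construction": {"building", "concrete", "contractor", "renovation"},
}


class TextExtractor(HTMLParser):
    """Collects text nodes, skipping tags and script/style contents."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.parts.append(data)


def informative_terms(html):
    """Strip HTML tags, lowercase, tokenize, and drop stop words."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    words = re.findall(r"[a-z]+", " ".join(parser.parts).lower())
    return [w for w in words if w not in STOP_WORDS]


def classify(html):
    """Return the label whose keywords occur most often, or None."""
    counts = Counter(informative_terms(html))
    scores = {label: sum(counts[k] for k in keywords)
              for label, keywords in CLASS_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None
```

Calling `classify` on a page's HTML returns one of the placeholder labels, or `None` when no keyword matches. Scoring by raw keyword counts keeps the sketch short; a weighted scheme would be a natural replacement.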