Table of Contents
Text Mining / Web Mining
Text Mining as a Data Mining Task
Applications
KDD Process
Data Preparation
Feature Selection
Missing Data
Why Text is Rich to Mine
Why Text is Tough to Mine
Classify Documents
Discovering Trends in Text
Hierarchy of Phrases
Grow Sequences
Shape Matching
Data Mining to Create Taxonomy
TAPER
Querying
Improved Query Environment
Document Signature
TAPER Algorithm
Feature Selection
Number of Features
Hierarchical Classification
Hierarchical Classification (continued)
Results
Results - Confusion Matrices
Results of Hierarchical Classification
Organization of Web Pages Using WordNet and Self-Organizing Maps
Test Pages
Overall Process
WordNet
Creating Feature Vectors
Self-Organizing Map
Maps
Sammon Map
Results
In the Spotlight
In the Spotlight (continued)
WWW Data Mining
Challenges to Web Mining
Web Mining: Much Can Be Done!
Web Log Mining
A Multiple Layered Meta-Web Architecture
Future