| Web mining is moving the World Wide Web toward a more useful environment in which users can quickly and easily find the information they need. It includes the discovery and analysis of data, documents, and multimedia from the World Wide Web. Web mining uses document content, hyperlink structure, and usage statistics to assist users in meeting their information needs.
Both the Web itself and search engines hold information about the relationships among documents. Web mining is the discovery of these relationships, and it is carried out in three sometimes overlapping areas. The first is content mining. Search engines characterize content by keywords, so content mining consists of identifying a page’s keywords and relating the content of a Web page to the content of a user’s query. The second area is mining the link structure. Hyperlinks indicate which other documents on the Web are thought to be important to a given document; these links add depth to the document and provide the multi-dimensionality that characterizes the Web. The third area is mining logs. Documents are also related to one another through previous searches, and these relationships are recorded in logs of searches and accesses.
Understanding the user is also an important part of Web mining. Analysis of the user’s previous sessions, preferred display of information, and expressed preferences may influence the Web pages returned in response to a query.
Web mining is interdisciplinary in nature, spanning fields such as information retrieval, natural language processing, information extraction, machine learning, databases, data mining, data warehousing, user interface design, and visualization. Techniques for mining the Web have practical application in m-commerce, e-commerce, e-government, e-learning, distance learning, organizational learning, virtual organizations, knowledge management, and digital libraries. |
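The three areas described above (content, link structure, and logs) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the page names, the toy texts, the link list, the log entries, and the helper functions `content_score`, `inlink_counts`, and `clicks_for_query` are hypothetical and do not come from any particular system. Real Web mining techniques are considerably more elaborate than these counts.

```python
from collections import Counter

# Hypothetical toy data, invented purely for illustration.
PAGES = {
    "a.html": "web mining discovers patterns in web data",
    "b.html": "hyperlink structure reveals important pages",
    "c.html": "usage logs record searches and page accesses",
}
# Each pair is (source page, target page) for one hyperlink.
LINKS = [("a.html", "b.html"), ("c.html", "b.html"), ("b.html", "a.html")]
# Each pair is (query a past user typed, page that user opened).
LOG = [("web mining", "a.html"), ("web mining", "a.html"), ("web mining", "c.html")]


def content_score(query, text):
    """Content mining: fraction of query keywords found in the page text."""
    query_terms = set(query.lower().split())
    page_terms = set(text.lower().split())
    if not query_terms:
        return 0.0
    return len(query_terms & page_terms) / len(query_terms)


def inlink_counts(links):
    """Link structure mining: count the hyperlinks pointing at each page."""
    return Counter(target for _source, target in links)


def clicks_for_query(log, query):
    """Log mining: count the pages past users opened for this query."""
    return Counter(page for past_query, page in log if past_query == query)


if __name__ == "__main__":
    query = "web mining"
    for name, text in PAGES.items():
        print(name, content_score(query, text))
    print(inlink_counts(LINKS))
    print(clicks_for_query(LOG, query))
```

A system could combine the three signals when ranking pages for a query; how to weight them is a design decision this sketch deliberately leaves open.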