Posts

Showing posts from November, 2017

Unit 12 Reading Notes

Text classification and Naive Bayes: To capture the generality and scope of the problem space to which standing queries belong, we now introduce the general notion of a classification problem. Apart from manual classification and hand-crafted rules, there is a third approach to text classification, namely, machine learning-based text classification.

Flat clustering: Clustering algorithms group a set of documents into subsets or clusters. The algorithms' goal is to create clusters that are coherent internally, but clearly different from each other. Clustering is the most common form of unsupervised learning. No supervision means that there is no human expert who has assigned documents to classes. In clustering, it is the distribution and makeup of the data that will determine cluster membership. Flat clustering creates a flat set of clusters without any explicit structure that would relate clusters to each other.

Hierarchical clustering: Hierarchical clustering (or hierarchic

Unit 11 Muddiest Points

How efficient is an adaptive system that uses log mining? We have to store a lot of log data to calculate the score, so do we still need to build a system to collect that data?

Unit 11 Reading Notes

User profiling refers to the use of popular techniques for collecting information about users and for representing and building user profiles. Collecting information about users: the information may be entered explicitly by the user or gathered implicitly by a software agent, and it may be collected on the user's client machine or by the application server itself. User profile representations: keyword profiles and semantic network profiles. User profile construction: building keyword profiles and semantic network profiles, then building concept profiles.
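As a rough illustration of the keyword profile idea in these notes, here is a minimal Python sketch. It assumes the profile is simply the most frequent non-stopword terms in pages the user has viewed, weighted by relative frequency. The function names, the tiny stopword list, and the sample pages are all hypothetical and are not taken from the reading.

```python
from collections import Counter
import re

# Hypothetical stopword list for illustration only; a real system would use a fuller one.
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "for", "on"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def build_keyword_profile(documents, top_k=5):
    """Return the top_k terms with their relative frequency across the viewed documents."""
    counts = Counter()
    for doc in documents:
        counts.update(t for t in tokenize(doc) if t not in STOPWORDS)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {term: n / total for term, n in counts.most_common(top_k)}


# Assumed sample data: titles of pages gathered implicitly from a browsing session.
viewed_pages = [
    "Naive Bayes for text classification",
    "Flat clustering and hierarchical clustering of text",
]
profile = build_keyword_profile(viewed_pages, top_k=3)
print(profile)
```

A semantic network or concept profile would need more structure than this (links between terms, or a mapping onto a concept hierarchy), so the sketch covers only the keyword case.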

Unit 10 Muddiest Points

1. In web search, links are a very important signal for the query. If a search engine company places advertisements between the page links, does that violate information integrity? 2. In the link analysis discussion, you said that some people create fake linked pages (static pages) to boost their website's ranking. Can a web search engine detect such pages?