Google Inc. - Mountain View CA, US Joel Ingram - Chicago IL, US
Assignee:
GOOGLE INC. - Mountain View CA
International Classification:
G06N 5/02
US Classification:
706 46
Abstract:
A computer-implemented method for clustering documents includes receiving at least a first first-level cluster of documents and a second first-level cluster of documents. The first first-level cluster includes one or more first documents associated with a first domain, the first documents satisfying a first first-level classification criterion; the second first-level cluster includes one or more second documents associated with a second domain, the second documents satisfying a second first-level classification criterion. The method also includes creating, by a processor of a computer system, a second-level cluster by combining the first first-level cluster with the second first-level cluster when a) the first first-level cluster and the second first-level cluster satisfy a second-level classification criterion; and b) the first domain does not match the second domain.
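
The combining rule in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The names `Cluster`, `SecondLevelCluster`, `combine`, and `same_label` are hypothetical, and the abstract does not say what the second-level classification criterion is, so the sketch takes it as a caller-supplied function and uses a simple label comparison purely as a stand-in.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass(frozen=True)
class Cluster:
    """A first-level cluster: documents from one domain (hypothetical type)."""
    domain: str
    documents: Tuple[str, ...]
    label: str  # assumption: stands in for the first-level criterion that was met


@dataclass(frozen=True)
class SecondLevelCluster:
    """A second-level cluster formed from two first-level clusters."""
    members: Tuple[Cluster, Cluster]


def combine(
    first: Cluster,
    second: Cluster,
    second_level_criterion: Callable[[Cluster, Cluster], bool],
) -> Optional[SecondLevelCluster]:
    """Combine two first-level clusters only when both conditions hold."""
    # Condition (b): the two domains must not match.
    if first.domain == second.domain:
        return None
    # Condition (a): the pair must satisfy the second-level criterion.
    if not second_level_criterion(first, second):
        return None
    return SecondLevelCluster(members=(first, second))


def same_label(first: Cluster, second: Cluster) -> bool:
    """Stand-in criterion (an assumption, not taken from the abstract)."""
    return first.label == second.label


# Usage with made-up domains and document identifiers.
a = Cluster("example.com", ("a1", "a2"), "recipes")
b = Cluster("example.org", ("b1",), "recipes")
c = Cluster("example.com", ("c1",), "recipes")
d = Cluster("example.net", ("d1",), "sports")

merged = combine(a, b, same_label)        # different domains, criterion met
same_domain = combine(a, c, same_label)   # same domain, so not combined
no_match = combine(a, d, same_label)      # criterion not met, so not combined
```

In this sketch the domain check runs first because it is a plain string comparison, while the criterion may be arbitrarily expensive. The abstract requires both conditions and does not prescribe an order.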