Michael A. Woytowitz - Freeland MD, US
Marshall Wells Hawks - Upperco MD, US
Assignee:
COMSORT, INC. - Hunt Valley MD
International Classification:
G06F 17/30
US Classification:
707/710, 707/E17.109
Abstract:
Method and apparatus for creating an electronic database of disambiguated entity mentions and relations from a corpus of electronic documents. The invention automatically extracts mentions of entities (e.g., references to people, organizations or places) from the corpus, parses the entity mentions into “mention objects,” and executes a series of grouping, comparison and hierarchical fuzzy object clustering algorithms to cluster together in an electronic database all of the mention objects referring to the same entity and all of the mention objects (e.g., “people”) associated with each other by a relationship (e.g., “co-authors” or “family members”). The resulting electronic database of disambiguated entity mentions and relations, which may comprise, for example, an XML document, a relational database or a hierarchical database, is structured to permit useful recordation, access, review and display of all of the mentions and relations associated with a particular entity or collection of entities.
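
The grouping and clustering step described in the abstract can be illustrated with a minimal sketch. This is not the patented method: the abstract gives no algorithmic detail, so the example substitutes a flat single-link clustering over a simple string-similarity score for the hierarchical fuzzy object clustering it names. The `Mention` class, its fields, the 0.7/0.3 weights and the 0.8 threshold are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass(frozen=True)
class Mention:
    """Hypothetical 'mention object': one parsed reference to an entity."""
    doc_id: str
    name: str
    affiliation: str = ""


def similarity(a: Mention, b: Mention) -> float:
    """Fuzzy score in [0, 1] based on name and, if present, affiliation."""
    name_score = SequenceMatcher(None, a.name.lower(), b.name.lower()).ratio()
    if a.affiliation and b.affiliation:
        aff_score = SequenceMatcher(
            None, a.affiliation.lower(), b.affiliation.lower()
        ).ratio()
        # Assumed weighting: the name counts more than the affiliation.
        return 0.7 * name_score + 0.3 * aff_score
    return name_score


def cluster_mentions(mentions, threshold=0.8):
    """Single-link clustering: merge any pair scoring at or above threshold."""
    parent = list(range(len(mentions)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(mentions)):
        for j in range(i + 1, len(mentions)):
            if similarity(mentions[i], mentions[j]) >= threshold:
                parent[find(j)] = find(i)

    # Collect mentions by the root of their cluster.
    clusters = {}
    for i, mention in enumerate(mentions):
        clusters.setdefault(find(i), []).append(mention)
    return list(clusters.values())


if __name__ == "__main__":
    sample = [
        Mention("doc1", "John Smith"),
        Mention("doc2", "john smith"),
        Mention("doc3", "Maria Garcia"),
    ]
    for group in cluster_mentions(sample):
        print([m.doc_id for m in group])
```

Each resulting cluster stands for one presumed entity, and its members are the mentions attributed to it. A relation such as “co-authors” could be represented in the same spirit by grouping mentions that share a `doc_id`, though that too would be an assumption beyond what the abstract states.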