David Lubensky - Danbury CT Cheng Wu - Mount Kisco NY
Assignee:
International Business Machines Corporation - Armonk NY
International Classification:
G10L 15/10
US Classification:
704233, 704238, 704239
Abstract:
Disclosed is a method of automated handset identification, comprising receiving a sample speech input signal from a sample handset; deriving a cepstral covariance sample matrix from said first sample speech signal; calculating, with a distance metric, all distances between said sample matrix and one or more cepstral covariance handset matrices, wherein each said handset matrix is derived from a plurality of speech signals taken from different speakers through the same handset; and determining if the smallest of said distances is below a predetermined threshold value.
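The steps in this abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the abstract does not name the distance metric, so a Frobenius-norm distance is swapped in here, and the names `covariance_matrix`, `frobenius_distance`, and `identify_handset` are hypothetical. The handset matrices are assumed to have been computed beforehand from pooled multi-speaker recordings.

```python
from math import sqrt

def covariance_matrix(frames):
    """Sample covariance of a list of equal-length cepstral vectors (one per frame)."""
    n, dim = len(frames), len(frames[0])
    mean = [sum(f[i] for f in frames) / n for i in range(dim)]
    return [[sum((f[i] - mean[i]) * (f[j] - mean[j]) for f in frames) / (n - 1)
             for j in range(dim)] for i in range(dim)]

def frobenius_distance(a, b):
    # Assumed metric for illustration only; the abstract does not specify one.
    return sqrt(sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)))

def identify_handset(sample_frames, handset_matrices, threshold):
    """Return the closest handset label, or None if no distance is below threshold."""
    sample = covariance_matrix(sample_frames)
    distances = {name: frobenius_distance(sample, m)
                 for name, m in handset_matrices.items()}
    best = min(distances, key=distances.get)
    return best if distances[best] < threshold else None

# Toy two-dimensional "cepstral" frames, invented for the example.
frames_a = [[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]]
frames_b = [[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]]
handsets = {"A": covariance_matrix(frames_a), "B": covariance_matrix(frames_b)}
result = identify_handset(frames_a, handsets, threshold=0.5)
```

Returning `None` when the smallest distance is not below the threshold corresponds to the final step of the abstract, where an unknown handset is rejected rather than forced onto the nearest known one.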
Method And Apparatus For Speaker Identification Using Cepstral Covariance Matrices And Distance Metrics
David Lubensky - Danbury CT, US Cheng Wu - Mount Kisco NY, US
Assignee:
International Business Machines Corporation - Armonk NY
International Classification:
G10L 17/00
US Classification:
704250, 704251, 704255
Abstract:
Disclosed is a method of automated speaker identification, comprising receiving a sample speech input signal from a sample handset; deriving a cepstral covariance sample matrix from the first sample speech signal; calculating, with a distance metric, all distances between the sample matrix and one or more cepstral covariance signature matrices; determining if the smallest of the distances is below a predetermined threshold value; and wherein the distance metric is selected from fusion derivatives thereof, and fusion derivatives thereof with.
System And Method For Management Of Call Data Using A Vector Based Model And Relational Data Structure
Cheng Wu - Mount Kisco NY, US Andrzej Sakrajda - White Plains NY, US Vaibhava Goel - Elmsford NY, US David Lubensky - Brookfield CT, US
Assignee:
International Business Machines Corporation - Armonk NY
International Classification:
H04M 11/06
US Classification:
379 8813, 379 85, 706 62, 709219, 714 32
Abstract:
A system and method for representing call content in a searchable database includes transcribing call content to text. The call content is projected to vector space, by creating a vector by indexing the call based on the content and determining a similarity of the call to an atomic-class dictionary. The call is classified in a relational database in accordance with the vector.
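One way to picture the projection step is to score a transcript against each atomic class and use the scores as the call's vector. The sketch below is an assumption-laden illustration, not the patented method: the class names, keyword lists, and the choice of cosine similarity over word counts are all hypothetical.

```python
from collections import Counter
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two word-count mappings."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def call_vector(transcript, class_dictionary):
    """One similarity score per atomic class, in the dictionary's key order."""
    words = Counter(transcript.lower().split())
    return [cosine(words, Counter(terms)) for terms in class_dictionary.values()]

# Hypothetical atomic classes and keyword lists, invented for the example.
ATOMIC_CLASSES = {
    "billing": ["bill", "charge", "payment", "refund"],
    "technical": ["router", "internet", "reset", "outage"],
}

vector = call_vector("I want a refund for the charge on my bill", ATOMIC_CLASSES)
```

Each component of `vector` could then be stored as a column in a relational table, so that calls can be queried by their similarity to a given class.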
System And Method For Management Of Call Data Using A Vector Based Model And Relational Data Structure
Cheng Wu - Mount Kisco NY, US Andrzej Sakrajda - White Plains NY, US Vaibhava Goel - Elmsford NY, US David Lubensky - Brookfield CT, US
Assignee:
International Business Machines Corporation - Armonk NY
International Classification:
H04M 11/00
US Classification:
379 8813, 379 85, 706 62, 709219, 704250, 704255
Abstract:
A system and method for representing call content in a searchable database includes transcribing call content to text. The call content is projected to vector space, by creating a vector by indexing the call based on the content and determining a similarity of the call to an atomic-class dictionary. The call is classified in a relational database in accordance with the vector.
Method And Apparatus For Detecting Data Anomalies In Statistical Natural Language Applications
Yuqing Gao - Mount Kisco NY, US Roberto Pieraccini - Peekskill NY, US Jerome Quinn - North Salem NY, US Cheng Wu - Mount Kisco NY, US
Assignee:
International Business Machines Corporation - Armonk NY
International Classification:
G06F 17/28
US Classification:
704005000
Abstract:
Techniques for detecting data anomalies in a natural language understanding (NLU) system are provided. A number of categorized sentences, categorized into a number of categories, are obtained. Sentences within a given one of the categories are clustered into a number of sub clusters, and the sub clusters are analyzed to identify data anomalies. The clustering can be based on surface forms of the sentences. The anomalies can be, for example, ambiguities or inconsistencies. The clustering can be performed, for example, with a K-means clustering algorithm.
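The clustering step can be illustrated with a small K-means over bag-of-words vectors built from the sentences' surface forms. This is a minimal sketch under assumptions: the vectorization, the first-k centroid initialization, and the sample sentences are all choices made for the example, and the later analysis of sub clusters for ambiguities or inconsistencies is not shown.

```python
from collections import Counter

def vectorize(sentences):
    """Bag-of-words count vectors over the shared vocabulary."""
    vocab = sorted({w for s in sentences for w in s.lower().split()})
    counts = [Counter(s.lower().split()) for s in sentences]
    return [[c[w] for w in vocab] for c in counts]

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iterations=20):
    """Plain K-means; centroids start at the first k points (an arbitrary choice)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old centroid if a cluster empties
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Sentences assumed to share one category label, invented for the example.
sentences = [
    "pay my bill",
    "reset my password",
    "pay my bill now",
    "reset password",
]
labels = kmeans(vectorize(sentences), k=2)
```

Sentences that share a category label but land in different sub clusters, as the payment and password sentences do here, are the kind of grouping a reviewer might then inspect for inconsistent labeling.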
Juan M. Huerta - Pleasantville NY, US Cheng Wu - Mount Kisco NY, US
Assignee:
International Business Machines Corporation - Armonk NY
International Classification:
G06F 17/28
US Classification:
704 7
Abstract:
A method, system, and computer readable storage medium including a computer readable program are provided. The method includes storing a set of sentences in a memory device. The method further includes receiving an input translated phrase and searching the set of sentences for a subset of sentences closest to the input translated phrase based on a set of respective distances to the sentences in the set with respect to the input translated phrase. The method also includes calculating and outputting a language model score for the subset of sentences based on a function of a subset of respective distances pertaining to the subset of sentences.
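A rough sketch of the search-then-score flow follows. The abstract says only that the score is "a function of" the distances, so both the word-level edit distance and the scoring formula below (the mean of 1 / (1 + distance)) are assumptions made for illustration, as are the function names.

```python
def edit_distance(a, b):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (x != y)))
        prev = cur
    return prev[-1]

def closest_subset(phrase, sentences, k):
    """The k stored sentences nearest to the phrase, as (distance, sentence) pairs."""
    scored = sorted((edit_distance(phrase.split(), s.split()), s) for s in sentences)
    return scored[:k]

def score(subset):
    # Assumed scoring function, not taken from the source.
    return sum(1.0 / (1 + d) for d, _ in subset) / len(subset)

# Stored sentences invented for the example.
stored = ["the cat sat", "the cat sat down", "a dog barked loudly"]
subset = closest_subset("the cat sat", stored, k=2)
lm_score = score(subset)
```

With this scoring choice, a translated phrase that closely matches stored sentences receives a score near 1, while one far from everything in the set receives a score near 0.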
Method And Apparatus For Fast Translation Memory Search
Juan M. Huerta - Pleasantville NY, US David M. Lubensky - Brookfield CT, US Cheng Wu - Mount Kisco NY, US
Assignee:
International Business Machines Corporation - Armonk NY
International Classification:
G06F 17/28
US Classification:
704 5
Abstract:
Methods and systems for fast translation memory search include, in response to an input query string, identifying a plurality of hypothesis strings stored in a translation memory as candidates to match the query string. One or more candidates are eliminated, using a processor, where string lengths between the candidates and the query string are at least a cutoff value representing a string edit distance. One or more candidates are eliminated where differences in word frequency distributions between the candidates and the query string are at least the cutoff value. One or more candidates are eliminated by employing a dynamic programming matrix where string edit distances between the candidates and the query string are at least the cutoff value. A number of remaining candidates are outputted as matches to the query string.
Natural Language System And Method Based On Unisolated Performance Metric
Yuqing Gao - Mount Kisco NY, US Vaibhava Goel - Elmsford NY, US Cheng Wu - Mount Kisco NY, US
Assignee:
Nuance Communications, Inc. - Burlington MA
International Classification:
G10L 15/18
US Classification:
704257
Abstract:
A natural language business system and method is developed to understand the underlying meaning of a person's speech, such as during a transaction with the business system. The system includes a speech recognition engine, an action classification engine, and a control module. The control module causes the system to execute an inventive method wherein the speech recognition and action classification models may be recursively optimized on an unisolated performance metric that is pertinent to the overall performance of the natural language business system, as opposed to the isolated model-specific criteria previously employed.