Dusan Macho - Schaumburg IL Yan Ming Cheng - Schaumburg IL
Assignee:
Motorola, Inc. - Schaumburg IL
International Classification:
G10L 21/02
US Classification:
704207, 704233
Abstract:
A system for enhancing the signal-to-noise ratio of a speech signal is provided. A plurality of local energy maximums associated with a speech signal are determined. Presumably, each of these local energy maximums defines a speech pitch period. Typically, human pitch frequencies are approximately 100-400 Hz depending on the sex and age of the speaker. Because human speech typically includes more energy near the beginning of a pitch period than at the end of the pitch period, and background noise tends to remain relatively constant throughout the pitch period, the speech signal may be enhanced by increasing the energy associated with the beginning of the pitch period and/or by decreasing the energy associated with the end of the pitch period. Preferably, the amount of energy increase in the earlier portion of the pitch period is approximately equal to the amount of energy reduction in the later portion of the pitch period. In this manner, the total energy remains constant.
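The energy redistribution described in this abstract can be illustrated with a minimal Python sketch. It is not the patented method: the function name `reshape_pitch_period`, the `boost` parameter, and the choice to split the period at its midpoint are all assumptions made for illustration. The sketch amplifies the first half of one pitch period and attenuates the second half by the same amount of energy, so the total stays constant.

```python
import math

def reshape_pitch_period(samples, boost=1.05):
    """Shift energy from the late half of one pitch period to the early half.

    Hypothetical helper: the midpoint split and the `boost` amplitude gain
    (assumed >= 1) are illustrative choices, not taken from the abstract.
    """
    mid = len(samples) // 2
    early, late = samples[:mid], samples[mid:]
    e_early = sum(x * x for x in early)
    e_late = sum(x * x for x in late)
    if e_early == 0.0 or e_late == 0.0:
        return list(samples)  # nothing to redistribute

    # Energy the boost would add to the early half, capped by what the
    # late half can give up, so the total energy is conserved.
    added = min((boost ** 2 - 1.0) * e_early, e_late)
    early_gain = math.sqrt(1.0 + added / e_early)
    late_gain = math.sqrt((e_late - added) / e_late)
    return [x * early_gain for x in early] + [x * late_gain for x in late]
```

In practice the period boundaries would come from the detected local energy maximums; here a single period is passed in directly.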
Dusan Macho - Schaumburg IL Yan Ming Cheng - Schaumburg IL
Assignee:
Motorola, Inc. - Schaumburg IL
International Classification:
G10L 15/20
US Classification:
704233, 704243
Abstract:
A voice sample characterization front-end suitable for use in a distributed speech recognition context. A digitized voice sample is split between a low frequency path and a high frequency path. Both paths are used to determine spectral content suitable for use when determining speech recognition parameters (such as cepstral coefficients) that characterize the speech sample for recognition purposes. The low frequency path has a thorough noise reduction capability. In one embodiment, the results of this noise reduction are used by the high frequency path to aid in de-noising without requiring the same level of resource capacity as used by the low frequency path.
Changxue C. Ma - Barrington IL, US Yan M. Cheng - Inverness IL, US Chen Liu - Lisle IL, US Ted Mazurkiewicz - Lake Zurich IL, US Steven J. Nowlan - South Barrington IL, US James R. Talley - Austin TX, US Yuan-Jun Wei - Hoffman Estates IL, US
Assignee:
Motorola, Inc. - Schaumburg IL
International Classification:
G10L 15/14
US Classification:
704251
Abstract:
An electronic device for speech dialog includes functions that receive a speech phrase that comprises a request phrase that includes an instantiated variable, generate pitch and voicing characteristics of the instantiated variable, and perform speech recognition of the instantiated variable to determine a most likely set of acoustic states. The electronic device may generate a synthesized value of the instantiated variable using the most likely set of acoustic states and the pitch and voicing characteristics of the instantiated variable. The electronic device may use a table of previously entered values of variables that have been determined to be unique, and in which the values are associated with a most likely set of acoustic states and the pitch and voicing characteristics determined at the receipt of each value, to disambiguate a newly received instantiated variable.
Method And Apparatus For Generating And Updating A Voice Tag
A method and apparatus for updating a voice tag comprising N stored voice tag phoneme sequences includes a function for determining an accepted stored voice tag phoneme sequence for an utterance, a function for extracting a current set of M phoneme sequences having highest likelihoods of representing the utterance, a function for updating a reference histogram associated with the accepted voice tag, and a function for updating the voice tag with N selected phoneme sequences that are selected from the current set of M phoneme sequences and the set of N voice tag phoneme sequences, wherein the N selected phoneme sequences have phoneme histograms most closely matching the reference histogram. The method and apparatus also generate a voice tag using some functions that are common with the method and apparatus to update the voice tag, such as the extracting of the current set of M phoneme sequences.
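The selection step in this abstract (choosing the N sequences whose phoneme histograms best match a reference histogram) might be sketched as follows. The abstract does not say how histograms are compared, so the L1 distance used here is an assumption, as are the function names `phoneme_histogram`, `histogram_distance`, and `select_voice_tag`. Updating the reference histogram itself is left out.

```python
from collections import Counter

def phoneme_histogram(sequence):
    """Relative frequency of each phoneme in one sequence."""
    counts = Counter(sequence)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {phoneme: count / total for phoneme, count in counts.items()}

def histogram_distance(h1, h2):
    """L1 distance between two histograms (an assumed metric)."""
    keys = set(h1) | set(h2)
    return sum(abs(h1.get(k, 0.0) - h2.get(k, 0.0)) for k in keys)

def select_voice_tag(stored, current, reference, n):
    """Pick the n sequences, from the stored N and the current M,
    whose histograms lie closest to the reference histogram."""
    candidates = []
    for seq in list(stored) + list(current):
        if seq not in candidates:  # drop exact duplicates, keep order
            candidates.append(seq)
    ranked = sorted(
        candidates,
        key=lambda s: histogram_distance(phoneme_histogram(s), reference),
    )
    return ranked[:n]
```

Sequences are expected as tuples of phoneme labels, for example `("k", "ae", "t")`, and the reference histogram as a dictionary mapping phoneme labels to relative frequencies.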
Tailored Speaker-Independent Voice Recognition System
Changxue C. Ma - Barrington IL, US Yan M. Cheng - Schaumburg IL, US
Assignee:
Motorola, Inc. - Schaumburg IL
International Classification:
G10L 15/06 G10L 15/00
US Classification:
704244, 704243, 704251, 704255
Abstract:
A tailored speaker-independent voice recognition system has a speech recognition dictionary with at least one word. That word has at least two transcriptions, each transcription having a probability factor and an indicator of whether the transcription is active. When a speech utterance is received, the voice recognition system determines the word signified by the speech utterance, evaluates the speech utterance against the transcriptions of the correct word, updates the probability factors for each transcription, and inactivates any transcription that has an updated probability factor that is less than a threshold.
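The update-and-inactivate loop in this abstract could look roughly like the Python sketch below. The abstract does not give the update formula, so the exponential moving average used here is an assumption, as are the `Transcription` class, the `update_transcriptions` function, and the `threshold` and `rate` parameters.

```python
from dataclasses import dataclass

@dataclass
class Transcription:
    phonemes: str
    probability: float
    active: bool = True

def update_transcriptions(transcriptions, scores, threshold=0.1, rate=0.5):
    """Blend each probability factor with the utterance's normalized match
    score, then inactivate transcriptions that fall below the threshold.

    `scores` holds one non-negative match score per transcription; the
    moving-average blend is an illustrative choice, not the patent's rule.
    """
    total = sum(scores)
    if total <= 0:
        return transcriptions  # no evidence in this utterance
    for transcription, score in zip(transcriptions, scores):
        observed = score / total
        transcription.probability = (
            (1.0 - rate) * transcription.probability + rate * observed
        )
        if transcription.probability < threshold:
            transcription.active = False
    return transcriptions
```

Over repeated utterances from one speaker, transcriptions that rarely match drift toward zero and are switched off, which is the tailoring effect the abstract describes.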
Method And System For Interpreting Verbal Inputs In Multimodal Dialog System
Changxue C. Ma - Barrington IL, US Harry M. Bliss - Evanston IL, US Yan M. Cheng - Inverness IL, US
Assignee:
Motorola, Inc. - Schaumburg IL
International Classification:
G10L 15/00
US Classification:
704240, 704239, 704242
Abstract:
A method, a system and a computer program product for interpreting a verbal input in a multimodal dialog system are provided. The method includes assigning a confidence value to at least one word generated by a verbal recognition component. The method further includes generating a semantic unit confidence score for the verbal input. The generation of a semantic unit confidence score is based on the confidence value of at least one word and at least one semantic confidence operator.
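As a rough illustration of combining word confidence values through a semantic confidence operator, consider the sketch below. The abstract does not list the operators, so `min`, `mean`, and `product` are assumed examples, and the names `OPERATORS` and `semantic_unit_confidence` are hypothetical.

```python
import math

# Assumed example operators; the abstract does not enumerate them.
OPERATORS = {
    "min": min,
    "mean": lambda values: sum(values) / len(values),
    "product": math.prod,
}

def semantic_unit_confidence(word_confidences, operator="min"):
    """Combine per-word confidence values into one score for a semantic unit."""
    if not word_confidences:
        raise ValueError("at least one word confidence is required")
    return OPERATORS[operator](word_confidences)
```

A conservative operator such as `min` lets one poorly recognized word pull down the whole unit, while `mean` tolerates a single weak word.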
Method And Apparatus For Distributed Voice Searching
A method for distributed voice searching may include receiving a search query from a user of the mobile communication device, generating a lattice of coarse linguistic representations from speech parts in the search query, extracting query features from the generated lattice of coarse linguistic representations, generating coarse search feature vectors based on the extracted query features, performing a coarse search using the generated coarse search feature vectors and transmitting the generated coarse search feature vectors to a remote voice search processing unit, receiving remote resultant web indices from the remote voice search processing unit, generating a lattice of fine linguistic representations from speech parts in the search query, generating fine search feature vectors from the lattice of fine linguistic representations, performing a fine search using the coarse search results, the remote resultant web indices and the generated fine search feature vectors, and displaying the fine search results to the user.
Mel-Frequency Domain Based Audible Noise Filter And Method
Yan Ming Cheng - Schaumburg IL, US Anshu Agarwal - San Jose CA, US
International Classification:
G10L015/20 G10L021/00
US Classification:
704/233000, 704/275000
Abstract:
An audio filter consists of two substantially identical stages with different purposes. The first stage whitens detected noise, while preserving speech or other audible information in an undistorted manner. The second stage effectively eliminates the residual white noise. Each stage, in one embodiment, includes a Mel domain based error minimization stage. A two-stage Mel-frequency domain Wiener filter is designed for each speech time frame in the Mel-frequency domain, instead of linear frequency domain. Each Mel domain based error minimization stage minimizes the perceptual distortion and reduces the computation requirement to provide suitably filtered audible information.
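For orientation, a textbook per-band Wiener gain over Mel-band energies is sketched below. It covers a single stage only and does not reproduce the two-stage whitening design in the abstract. The function name `wiener_gains`, the spectral-subtraction estimate of speech power, and the `floor` parameter are all assumptions.

```python
def wiener_gains(noisy_power, noise_power, floor=0.05):
    """Per-band Wiener gain S / (S + N) for one frame of Mel-band energies.

    Speech power S is estimated as max(noisy - noise, 0), and gains are
    clamped to `floor` so no band is silenced outright (assumed choices).
    """
    gains = []
    for noisy, noise in zip(noisy_power, noise_power):
        speech = max(noisy - noise, 0.0)
        denom = speech + noise
        gain = speech / denom if denom > 0 else floor
        gains.append(max(gain, floor))
    return gains
```

Working on a few dozen Mel bands rather than hundreds of linear-frequency bins is what reduces the computation, in line with the abstract's claim.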
Westchester Medical Center Oncology 100 Wood Rd STE 7S, Valhalla, NY 10595 (914)4937488 (phone), (914)4937483 (fax)
Education:
Medical School St. George's University School of Medicine, St. George's, Grenada Graduated: 2011
Languages:
English
Description:
Dr. Cheng graduated from the St. George's University School of Medicine, St. George's, Grenada in 2011. He works in Valhalla, NY and specializes in Hematology/Oncology. Dr. Cheng is affiliated with Westchester Medical Center.
Motorola Solutions Information Technology and Services · Manufacturing · College/University Management Consulting Services · Museum/Art Gallery · Mfg Radio/TV Communication Equipment · Radio and T.V. Communications Equipment, Nsk · Mfg Radio/TV Comm Equip · Nonclassifiable Establishments
1303 E Algonquin Rd, Schaumburg, IL 60196 2390 E Camelback Rd, Phoenix, AZ 85016 1303 E Algonquin Rd Attn: Tax, Schaumburg, IL 60196 1209 Orange Street , Wilmington, DE 19801 (847)5768600, (847)5766559, (847)5767000, (847)5762453