Stephane Herman Maes - Danbury CT Geoffrey G. Zweig - Greenwich CT
Assignee:
International Business Machines Corporation - Armonk NY
International Classification:
G10L 21/00
US Classification:
704273, 704270
Abstract:
A method of validating production of a biometric attribute allegedly associated with a user comprises the following steps. A first signal is generated representing data associated with the biometric attribute allegedly received in association with the user. A second signal is also generated representing data associated with at least one feature detected in association with the production of the biometric attribute allegedly received from the user. Then, the first signal and the second signal are compared to determine a correlation level between the biometric attribute and the production feature, wherein the validation of the production of the biometric attribute depends on the correlation level. Accordingly, the invention serves to provide substantial assurance that the biometric attribute offered by the user has been physically generated by the user.
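The correlation check described in this abstract can be illustrated with a minimal sketch. It assumes both signals have already been reduced to equal-length numeric sequences (for example, per-frame audio energy and per-frame lip opening); the function names, the use of Pearson correlation, and the 0.5 threshold are illustrative assumptions, not details taken from the patent.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("signals must have the same length (at least 2)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)

def validate_production(attribute_signal, production_signal, threshold=0.5):
    """Accept only if the two signals are correlated at or above the threshold.

    The threshold value is a hypothetical choice for illustration.
    """
    return pearson(attribute_signal, production_signal) >= threshold
```

With this sketch, an utterance whose audio energy rises and falls together with the speaker's lip movement would be accepted, while a recording played back in front of motionless lips would be rejected.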
A trainable radio scanner, including a station monitoring circuit to scan a plurality of radio frequencies and extract audio samples of a predetermined duration from each one of the plurality of radio frequencies having a signal strength above a reception threshold; a memory storing audio classification data and the plurality of audio samples; an audio analyzer to analyze each one of the plurality of audio samples using the audio classification data and classify each audio sample into a musical style category; and a style discriminator to control a radio station scanning operation of the radio receiver to tune only to preferred radio stations having a radio frequency at which the corresponding audio sample is classified in at least one preferred musical style category.
Information Extraction From Documents With Regular Expression Matching
Techniques are provided for enumerating regularly identifiable or stereotypical phrases that people commonly use to convey particular information, and where exactly in these phrases the particular information is to be found. In one embodiment, such phrases are referred to as “regular expressions.” Using such enumerated phrases, the invention is able to automatically identify them in an input data stream and then identify and extract the particular information associated with the phrase that is being sought, e.g., important or relevant information.
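As a rough illustration of the idea, the sketch below enumerates a few stereotypical carrier phrases as Python regular expressions and marks, with a named group, where in each phrase the sought information sits. The specific phrases, the labels, and the `extract` helper are hypothetical examples, not the patterns used in the patent.

```python
import re

# Each entry pairs a label with a carrier phrase; the named group "value"
# marks where the information of interest appears within the phrase.
PATTERNS = [
    ("name", re.compile(
        r"\b(?:this is|my name is)\s+(?P<value>[A-Z][a-z]+(?:\s[A-Z][a-z]+)?)")),
    ("phone", re.compile(
        r"\b(?:my (?:phone )?number is|call me (?:back )?at|reach me at)"
        r"\s+(?P<value>\d{3}[-\s]\d{4})",
        re.IGNORECASE)),
]

def extract(text):
    """Return (label, value) pairs in the order they occur in the text."""
    hits = []
    for label, pattern in PATTERNS:
        for match in pattern.finditer(text):
            hits.append((match.start(), label, match.group("value")))
    return [(label, value) for _, label, value in sorted(hits)]
```

For a message such as "Hi, this is Alice Smith. Call me back at 555-1234." the helper returns the name and the phone number, in that order.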
Lattice-Based Unsupervised Maximum Likelihood Linear Regression For Speaker Adaptation
Mukund Padmanabhan - White Plains NY, US George A. Saon - Putnam Valley NY, US Geoffrey G. Zweig - Greenwich CT, US
Assignee:
International Business Machines Corporation - Armonk NY
International Classification:
G10L 15/06 G10L 15/14
US Classification:
704240, 704244, 704246
Abstract:
Methods and arrangements using lattice-based information for unsupervised speaker adaptation. By performing adaptation against a word lattice, correct models are more likely to be used in estimating a transform. Further, a particular type of lattice proposed herein enables the use of a natural confidence measure given by the posterior occupancy probability of a state, that is, the statistics of a particular state will be updated with the current frame only if the a posteriori probability of the state at that particular time is greater than a predetermined threshold.
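The confidence gating described here can be sketched as follows, assuming the per-frame state posteriors have already been computed from the word lattice. The function name, the data layout, the posterior-weighted accumulation, and the 0.9 threshold are assumptions made for illustration; the transform estimation itself is not shown.

```python
def accumulate_stats(frames, posteriors, threshold=0.9):
    """Accumulate per-state statistics from confidently aligned frames only.

    frames:     list of feature vectors (lists of floats), one per time step
    posteriors: list of dicts mapping state -> posterior occupancy at that time
    A state is updated with a frame only if its posterior exceeds the threshold.
    """
    stats = {}
    for features, frame_posteriors in zip(frames, posteriors):
        for state, gamma in frame_posteriors.items():
            if gamma <= threshold:
                continue  # low-confidence frame: skip the update for this state
            acc = stats.setdefault(
                state, {"count": 0.0, "sum": [0.0] * len(features)})
            acc["count"] += gamma
            acc["sum"] = [s + gamma * x for s, x in zip(acc["sum"], features)]
    return stats
```

The accumulated counts and weighted sums are the kind of sufficient statistics a linear-regression transform estimate would consume; frames where no state is confident contribute nothing.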
Automatic Construction Of Unique Signatures And Confusable Sets For Database Access
Benoit Maison - White Plains NY, US Geoffrey G. Zweig - Greenwich CT, US
Assignee:
International Business Machines Corporation - Armonk NY
International Classification:
G10L 15/02
US Classification:
704251
Abstract:
Methods and arrangements for facilitating database access in speech recognition. A plurality of possible subsequences corresponding to a database entry are ascertained, a record of such subsequences and their correspondence to database entries is created, and either or both of the following are carried out: unique signatures are ascertained via determining whether a subsequence corresponding to a given database entry does not also correspond to at least one other database entry; and/or multiple occurrences of a given subsequence are found, with corresponding database entries being grouped into a confusion set.
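A small sketch of the bookkeeping this abstract describes, under the assumption that a "subsequence" is a contiguous run of words in a database entry. The helper names are hypothetical.

```python
from collections import defaultdict

def word_subsequences(entry):
    """All contiguous word subsequences of an entry (an assumed definition)."""
    words = entry.split()
    return {
        " ".join(words[i:j])
        for i in range(len(words))
        for j in range(i + 1, len(words) + 1)
    }

def build_index(entries):
    """Return (signatures, confusion_sets).

    signatures:     entry -> subsequences that identify that entry uniquely
    confusion_sets: subsequence -> all entries that share it
    """
    owners_by_subsequence = defaultdict(set)
    for entry in entries:
        for sub in word_subsequences(entry):
            owners_by_subsequence[sub].add(entry)

    signatures = defaultdict(set)
    confusion_sets = {}
    for sub, owners in owners_by_subsequence.items():
        if len(owners) == 1:
            (owner,) = owners
            signatures[owner].add(sub)
        else:
            confusion_sets[sub] = set(owners)
    return dict(signatures), confusion_sets
```

For a directory containing "John Smith", "John Adams", and "Mary Smith", the word "Adams" alone is a unique signature for "John Adams", whereas "John" and "Smith" each map to a confusion set of two entries that a dialog system would need to disambiguate.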
Speech Recognition Utilizing Multitude Of Speech Features
Scott E. Axelrod - Mount Kisco NY, US Sreeram Viswanath Balakrishnan - Los Altos CA, US Stanley F. Chen - Yorktown Heights NY, US Yuqing Gao - Mount Kisco NY, US Ramesh A. Gopinath - Millwood NY, US Benoit Maison - White Plains NY, US David Nahamoo - White Plains NY, US Michael Alan Picheny - White Plains NY, US George A. Saon - Old Greenwich CT, US Geoffrey G. Zweig - Ridgefield CT, US
Assignee:
International Business Machines Corporation - Armonk NY
International Classification:
G10L 15/00 G10L 15/20
US Classification:
704236, 704240, 704251
Abstract:
In a speech recognition system, the combination of a log-linear model with a multitude of speech features is provided to recognize unknown speech utterances. The speech recognition system models the posterior probability of linguistic units relevant to speech recognition using a log-linear model. The posterior model captures the probability of the linguistic unit given the observed speech features and the parameters of the posterior model. The posterior model may be determined using the probability of the word sequence hypotheses given a multitude of speech features. Log-linear models are used with features derived from sparse or incomplete data. The speech features that are utilized may include asynchronous, overlapping, and statistically non-independent speech features. Not all features used in training need to appear in testing/recognition.
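To make the log-linear posterior concrete, the sketch below computes P(unit | observed features) as a normalized exponential of summed feature weights. The feature names, the weight table keyed by (unit, feature) pairs, and the function name are hypothetical; how the weights are trained is outside this sketch.

```python
import math

def log_linear_posterior(candidates, active_features, weights):
    """Posterior over candidate linguistic units given the active features.

    weights maps (unit, feature) -> weight. Pairs with no entry contribute 0,
    so a feature seen in training may be absent at recognition time.
    """
    scores = {
        unit: sum(weights.get((unit, feature), 0.0) for feature in active_features)
        for unit in candidates
    }
    # Subtract the maximum score before exponentiating for numerical stability.
    top = max(scores.values())
    exponentials = {unit: math.exp(score - top) for unit, score in scores.items()}
    normalizer = sum(exponentials.values())
    return {unit: value / normalizer for unit, value in exponentials.items()}
```

Because the model only sums the weights of whichever features fired, the features may overlap, arrive asynchronously, or be statistically dependent without changing the form of the computation.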
Yun-Cheng Ju - Bellevue WA, US Alejandro Acero - Bellevue WA, US Neal Bernstein - Mercer Island WA, US Geoffrey Zweig - Sammamish WA, US
Assignee:
Microsoft Corporation - Redmond WA
International Classification:
G10L 15/00
US Classification:
704246, 704251, 704270
Abstract:
A voice interaction system is configured to analyze an utterance and identify inherent attributes that are indicative of a demographic characteristic of the system user that spoke the utterance. The system then selects and presents a personalized response to the user, the response being selected based at least in part on the identified demographic characteristic. In one embodiment, the demographic characteristic is one or more of the caller's age, gender, ethnicity, education level, emotional state, health status and geographic group. In another embodiment, the selection of the response is further based on consideration of corroborative caller data.
Speech Processing With Predictive Language Modeling
The described implementations relate to speech spelling by a user. One method identifies one or more symbols that may match a user utterance and displays an individual symbol for confirmation by the user.
Facebook
Director, Facebook AI
JPMorgan Chase & Co. Jan 2017 - Mar 2018
Managing Director, Global Head of Machine Learning
Microsoft Sep 2015 - Dec 2016
Partner Research Manager
Microsoft Sep 2013 - Aug 2015
Research Manager and Principal Researcher
Microsoft Sep 2010 - Sep 2013
Principal Researcher
Education:
University of California, Berkeley
Bachelors, Bachelor of Arts, Physics
University of California, Berkeley
Doctorates, Doctor of Philosophy, Computer Science, Philosophy
Skills:
Natural Language Processing Machine Learning Algorithms Speech Recognition Computer Science Pattern Recognition Artificial Intelligence Text Mining Statistical Modeling Neural Networks Signal Processing C++ Distributed Systems Data Mining Machine Translation Information Retrieval Programming Human Computer Interaction Software Engineering Software Design Computational Linguistics Image Processing Linux Computer Vision Python Speech Processing Information Extraction Semantic Web Unix Simulations Computer Architecture System Architecture