David G Grangier from Mountain View, CA, age 44

US Patents

Supervised Semantic Indexing And Its Extensions
US Patent:

8359282, Jan 22, 2013
Filed:

Sep 18, 2009
Appl. No.:

12/562840
Inventors:

Bing Bai - Plainsboro NJ, US
Jason Weston - New York NY, US
David Grangier - Princeton NJ, US
Assignee:

NEC Laboratories America, Inc. - Princeton NJ
International Classification:

G06F 15/18
G06F 7/00
US Classification:

706/12, 707/705
Abstract:

A system and method for determining a similarity between a document and a query includes providing a frequently used dictionary and an infrequently used dictionary in storage memory. For each word or gram in the infrequently used dictionary, n words or grams are correlated from the frequently used dictionary based on a first score. Features for a vector of the infrequently used words or grams are replaced with features from a vector of the correlated words or grams from the frequently used dictionary when the features from a vector of the correlated words or grams meet a threshold value. A similarity score is determined between weight vectors of a query and one or more documents in a corpus by employing the features from the vector of the correlated words or grams that met the threshold value.
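One possible reading of this abstract can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the patented method: the correlation score is a hypothetical lookup table, the replacement weight is the original weight scaled by that score, and the names `top_correlated`, `replace_infrequent`, `similarity`, `FREQUENT`, and `TABLE` are all invented for the example.

```python
def top_correlated(word, frequent, score, n):
    """Return the n frequent words with the highest score against `word`."""
    return sorted(frequent, key=lambda f: score(word, f), reverse=True)[:n]


def replace_infrequent(weights, frequent, score, n=2, threshold=0.5):
    """Rewrite a sparse weight vector so it only uses frequent-dictionary terms."""
    out = {}
    for word, w in weights.items():
        if word in frequent:
            out[word] = out.get(word, 0.0) + w
            continue
        for f in top_correlated(word, frequent, score, n):
            s = score(word, f)
            if s >= threshold:  # keep only correlations that meet the threshold
                out[f] = out.get(f, 0.0) + w * s
    return out


def similarity(a, b):
    """Dot product of two sparse weight vectors."""
    return sum(w * b.get(term, 0.0) for term, w in a.items())


# Hypothetical dictionaries and scores, for illustration only.
FREQUENT = {"car", "engine", "road"}
TABLE = {("sedan", "car"): 0.9, ("sedan", "engine"): 0.4, ("sedan", "road"): 0.2}


def table_score(rare, common):
    return TABLE.get((rare, common), 0.0)


query = replace_infrequent({"sedan": 2.0, "road": 1.0}, FREQUENT, table_score)
document = {"car": 1.0, "road": 0.5}
sim = similarity(query, document)
```

Here the infrequent term "sedan" contributes through "car" only, because "engine" is among its top two correlated words but falls below the assumed threshold of 0.5.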

Supervised Semantic Indexing And Its Extensions
US Patent:

20100179933, Jul 15, 2010
Filed:

Sep 18, 2009
Appl. No.:

12/562802
Inventors:

Bing Bai - Plainsboro NJ, US
Jason Weston - New York NY, US
Ronan Collobert - Princeton NJ, US
David Grangier - Princeton NJ, US
Assignee:

NEC Laboratories America, Inc. - Princeton NJ
International Classification:

G06F 17/30
G06F 15/18
US Classification:

706/12, 707/742, 707/E17.033, 707/749, 707/E17.002
Abstract:

A system and method for determining a similarity between a document and a query includes building a weight vector for each of a plurality of documents in a corpus of documents stored in memory and building a weight vector for a query input into a document retrieval system. A weight matrix is generated which distinguishes between relevant documents and lower ranked documents by comparing document/query tuples using a gradient step approach. A similarity score is determined between weight vectors of the query and documents in a corpus by determining a product of a document weight vector, a query weight vector and the weight matrix.
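The scoring and training idea in this abstract can be sketched as follows. This is a minimal illustration, not the patented implementation. It assumes the score is the bilinear form f(q, d) = qᵀ W d and that the "gradient step approach" is a margin ranking update on (query, relevant document, lower-ranked document) tuples. The margin of 1, the learning rate, the identity initialization, and all function names are assumptions.

```python
def score(W, q, d):
    """Bilinear similarity q^T W d for dense list vectors."""
    return sum(q[i] * W[i][j] * d[j] for i in range(len(q)) for j in range(len(d)))


def ranking_step(W, q, d_pos, d_neg, lr=0.1, margin=1.0):
    """One gradient step on the hinge loss max(0, margin - f(q, d+) + f(q, d-)).

    Updates W in place and returns True if an update was made.
    """
    if margin - score(W, q, d_pos) + score(W, q, d_neg) <= 0:
        return False  # the tuple is already ranked correctly with enough margin
    for i in range(len(q)):
        for j in range(len(d_pos)):
            W[i][j] += lr * q[i] * (d_pos[j] - d_neg[j])
    return True


def identity(n):
    """Identity start: the initial score is a plain dot product."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


# Toy vocabulary of three terms (hypothetical data).
W = identity(3)
q = [1.0, 0.0, 0.0]
d_pos = [0.0, 1.0, 0.0]  # labeled relevant, shares no term with the query
d_neg = [1.0, 0.0, 0.0]  # shares a term, but labeled lower ranked
for _ in range(50):
    if not ranking_step(W, q, d_pos, d_neg):
        break
```

After the loop, W has learned an off-diagonal weight linking the query term to the term in the relevant document, so the relevant document outscores the lower-ranked one even though it has no word overlap with the query.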

Feature Set Embedding For Incomplete Data
US Patent:

20110302118, Dec 8, 2011
Filed:

Jun 2, 2011
Appl. No.:

13/151480
Inventors:

Iain Melvin - Rocky Hill NJ, US
David Grangier - San Francisco CA, US
Assignee:

NEC Laboratories America, Inc. - Princeton NJ
International Classification:

G06F 15/18
G06F 17/30
US Classification:

706/21, 707/802, 707/E17.005
Abstract:

Methods and systems for classifying incomplete data are disclosed. In accordance with one method, pairs of features and values are generated based upon feature measurements on the incomplete data. In addition, a transformation function is applied on the pairs of features and values to generate a set of vectors by mapping each of the pairs to a corresponding vector in an embedding space. Further, a hardware processor applies a prediction function to the set of vectors to generate at least one confidence assessment for at least one class that indicates whether the incomplete data is of the at least one class. The method further includes outputting the at least one confidence assessment.
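The pipeline in this abstract (pairs of features and values, a mapping into an embedding space, then a prediction function over the resulting set) can be sketched as below. Everything concrete here is an assumption made for illustration: the embedding table, the choice of scaling a feature embedding by its value, mean pooling over the set, the sigmoid confidence, and the names `FEATURE_EMBEDDINGS`, `CLASS_WEIGHTS`, `embed_pair`, `pool`, and `confidences`.

```python
import math

# Hypothetical learned parameters: one 2-d embedding per feature name.
FEATURE_EMBEDDINGS = {
    "height": [0.5, -1.0],
    "weight": [1.0, 0.5],
    "age": [-0.5, 0.25],
}
# Hypothetical per-class weights over the pooled 2-d vector.
CLASS_WEIGHTS = {"class_a": [1.0, 0.0], "class_b": [-1.0, 1.0]}


def embed_pair(feature, value):
    """Map one (feature, value) pair to a vector: value times the feature embedding."""
    return [value * x for x in FEATURE_EMBEDDINGS[feature]]


def pool(vectors):
    """Mean over the set: order-independent and defined for any non-empty set."""
    n = len(vectors)
    return [sum(v[k] for v in vectors) / n for k in range(len(vectors[0]))]


def confidences(observed):
    """Return a sigmoid confidence per class for a dict of observed features."""
    pooled = pool([embed_pair(f, v) for f, v in observed.items()])
    out = {}
    for label, w in CLASS_WEIGHTS.items():
        z = sum(wk * pk for wk, pk in zip(w, pooled))
        out[label] = 1.0 / (1.0 + math.exp(-z))
    return out


# "age" was not measured; it is simply left out of the set.
result = confidences({"height": 1.0, "weight": 2.0})
```

Because the input is a set of pairs rather than a fixed-length vector, a missing measurement needs no imputation in this sketch: the pair is absent and the pooled vector is computed from whatever was observed.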

Separating Speech By Source In Audio Recordings By Predicting Isolated Audio Signals Conditioned On Speaker Representations
US Patent:

20230112265, Apr 13, 2023
Filed:

Oct 17, 2022
Appl. No.:

17/967726
Inventors:

- Mountain View CA, US
David Grangier - Mountain View CA, US
International Classification:

G10L 21/028
G06N 3/08
G10L 17/04
G10L 17/18
G10L 21/0316
G06N 3/045
Abstract:

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for performing speech separation. One of the methods includes obtaining a recording comprising speech from a plurality of speakers; processing the recording using a speaker neural network having speaker parameter values and configured to process the recording in accordance with the speaker parameter values to generate a plurality of per-recording speaker representations, each speaker representation representing features of a respective identified speaker in the recording; and processing the per-recording speaker representations and the recording using a separation neural network having separation parameter values and configured to process the recording and the speaker representations in accordance with the separation parameter values to generate, for each speaker representation, a respective predicted isolated audio signal that corresponds to speech of one of the speakers in the recording.

Separating Speech By Source In Audio Recordings By Predicting Isolated Audio Signals Conditioned On Speaker Representations
US Patent:

20210249027, Aug 12, 2021
Filed:

Feb 8, 2021
Appl. No.:

17/170657
Inventors:

- Mountain View CA, US
David Grangier - Mountain View CA, US
International Classification:

G10L 21/028
G10L 17/04
G10L 17/18
G10L 21/0316
G06N 3/04
G06N 3/08
Abstract:

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for performing speech separation. One of the methods includes obtaining a recording comprising speech from a plurality of speakers; processing the recording using a speaker neural network having speaker parameter values and configured to process the recording in accordance with the speaker parameter values to generate a plurality of per-recording speaker representations, each speaker representation representing features of a respective identified speaker in the recording; and processing the per-recording speaker representations and the recording using a separation neural network having separation parameter values and configured to process the recording and the speaker representations in accordance with the separation parameter values to generate, for each speaker representation, a respective predicted isolated audio signal that corresponds to speech of one of the speakers in the recording.

Interactive Concept Editing In Computer-Human Interactive Learning
US Patent:

20190213252, Jul 11, 2019
Filed:

Mar 19, 2019
Appl. No.:

16/358261
Inventors:

- Redmond WA, US
David G. Grangier - Kirkland WA, US
Leon Bottou - Kirkland WA, US
Saleema A. Amershi - Seattle WA, US
International Classification:

G06F 17/27
G06N 20/00
G06F 16/951
G06F 3/0482
H04L 1/00
G06N 7/00
Abstract:

A collection of data that is extremely large can be difficult to search and/or analyze. Relevance may be dramatically improved by automatically classifying queries and web pages in useful categories, and using these classification scores as relevance features. A thorough approach may require building a large number of classifiers, corresponding to the various types of information, activities, and products. Creation of classifiers and schematizers is provided on large data sets. Exercising the classifiers and schematizers on hundreds of millions of items may expose value that is inherent to the data by adding usable meta-data. Some aspects include active labeling exploration, automatic regularization and cold start, scaling with the number of items and the number of classifiers, active featuring, and segmentation and schematization.

Sequence-To-Sequence Convolutional Architecture
US Patent:

20180261214, Sep 13, 2018
Filed:

Dec 20, 2017
Appl. No.:

15/848199
Inventors:

- Menlo Park CA, US
Michael Auli - Menlo Park CA, US
Yann Nicolas Dauphin - San Francisco CA, US
David G. Grangier - Mountain View CA, US
Dzianis Yarats - Redwood City CA, US
International Classification:

G10L 15/16
G06N 3/04
G06N 3/08
G10L 15/22
G06F 17/28
Abstract:

Exemplary embodiments relate to improvements to neural networks for translation and other sequence-to-sequence tasks. A convolutional neural network may include multiple blocks, each having a convolution layer and gated linear units; gating may determine what information passes through to the next block level. Residual connections, which add the input of a block back to its output, may be applied around each block. Further, an attention may be applied to determine which word is most relevant to translate next. By applying repeated passes of the attention to multiple layers of the decoder, the decoder is able to work on the entire structure of a sentence at once (with no temporal dependency). In addition to better accuracy, this configuration is better at capturing long-range dependencies, better models the hierarchical syntax structure of a sentence, and is highly parallelizable and thus faster to run on hardware.
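The block structure described in this abstract (a convolution layer, gated linear units, and a residual connection around the block) can be sketched in plain Python. This is a simplified illustration, not the patented architecture: it assumes symmetric zero padding with an odd kernel width, omits attention and the decoder entirely, and the names `conv1d` and `glu_block` and the weight layout are invented for the example.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def conv1d(seq, kernel, bias):
    """Zero-padded 1-D convolution that preserves sequence length.

    seq:    list of T vectors, each of size d_in
    kernel: kernel[k][i][o] with odd width K, mapping d_in -> d_out channels
    bias:   list of d_out values
    """
    K, d_in, d_out = len(kernel), len(seq[0]), len(bias)
    pad = K // 2
    zero = [0.0] * d_in
    padded = [zero] * pad + seq + [zero] * pad
    out = []
    for t in range(len(seq)):
        y = list(bias)
        for k in range(K):
            x = padded[t + k]
            for i in range(d_in):
                for o in range(d_out):
                    y[o] += x[i] * kernel[k][i][o]
        out.append(y)
    return out


def glu_block(seq, kernel, bias):
    """Convolution, then gated linear unit, then residual connection."""
    d = len(seq[0])
    conv = conv1d(seq, kernel, bias)  # each output vector has 2 * d channels
    result = []
    for x, y in zip(seq, conv):
        a, b = y[:d], y[d:]  # first half: values, second half: gates
        gated = [ai * sigmoid(bi) for ai, bi in zip(a, b)]
        result.append([xi + gi for xi, gi in zip(x, gated)])  # residual add
    return result


# Toy input: 4 positions, 2 channels, kernel width 3, all-zero kernel weights.
seq = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
zero_kernel = [[[0.0] * 4 for _ in range(2)] for _ in range(3)]
unchanged = glu_block(seq, zero_kernel, [0.0, 0.0, 0.0, 0.0])
shifted = glu_block(seq, zero_kernel, [1.0, 1.0, 0.0, 0.0])
```

With zero weights and zero bias the gated term vanishes and the residual path returns the input unchanged. With a value bias of 1 and a gate input of 0, each gate is sigmoid(0) = 0.5, so every element is shifted by 0.5. Every position is computed independently of the others' outputs, which is the property the abstract refers to as having no temporal dependency.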

Active Featuring In Computer-Human Interactive Learning
US Patent:

20170039486, Feb 9, 2017
Filed:

Jul 13, 2016
Appl. No.:

15/209163
Inventors:

- Redmond WA, US
David Max Chickering - Bellevue WA, US
David G. Grangier - Kirkland WA, US
Aparna Lakshmiratan - Kirkland WA, US
Saleema A. Amershi - Seattle WA, US
International Classification:

G06N 99/00
G06N 7/00
G06F 17/30
G06F 3/0482
Abstract:

A collection of data that is extremely large can be difficult to search and/or analyze. Relevance may be dramatically improved by automatically classifying queries and web pages in useful categories, and using these classification scores as relevance features. A thorough approach may require building a large number of classifiers, corresponding to the various types of information, activities, and products. Creation of classifiers and schematizers is provided on large data sets. Exercising the classifiers and schematizers on hundreds of millions of items may expose value that is inherent to the data by adding usable meta-data. Some aspects include active labeling exploration, automatic regularization and cold start, scaling with the number of items and the number of classifiers, active featuring, and segmentation and schematization.

Resumes

Research Scientist

Location:

500 Southwest Connect Way, Prineville, OR 97754

Industry:

Research

Work:

Microsoft Research since May 2012
Senior Research Engineer
AT&T Labs, Inc. Feb 2011 - May 2012
Principal Research Scientist
NEC Laboratories America Jun 2008 - Feb 2011
Research Scientist
Idiap Research Institute Mar 2003 - May 2008
Research Assistant
Google Jun 2007 - Dec 2007
Research Intern

Education:

Ecole polytechnique fédérale de Lausanne 2003 - 2008
PhD, Machine Learning
Institut Eurécom 2002 - 2003
M.Sc., Computer Science
Université de Nice-Sophia Antipolis 2002 - 2003
M.Sc., Computer Science
Ecole nationale supérieure des Télécommunications de Bretagne 2000 - 2002
Bachelor, Telecommunications Engineering

Skills:

Machine Learning
Algorithms
Computer Vision
Information Retrieval
Pattern Recognition
Artificial Intelligence
Speech Recognition
Speech Processing
Neural Networks
Natural Language Processing
Signal Processing
Data Mining
Computer Science
Text Mining

Interests:

Text Modeling
Speech Recognition
Machine Learning
User Modeling
Computer Vision

Languages:

English
French