Xiaodong He - Issaquah WA, US Jian Wu - Redmond WA, US
Assignee:
Microsoft Corporation - Redmond WA
International Classification:
G10L 15/06
US Classification:
704243, 704245
Abstract:
Speech models are trained using one or more of three different training systems. They include competitive training which reduces a distance between a recognized result and a true result, data boosting which divides and weights training data, and asymmetric training which trains different model components differently.
Speech Models Generated Using Competitive Training, Asymmetric Training, And Data Boosting
Xiaodong He - Issaquah WA, US Jian Wu - Redmond WA, US
Assignee:
Microsoft Corporation - Redmond WA
International Classification:
G10L 15/06 G10L 15/00
US Classification:
704243, 704255
Abstract:
Speech models are trained using one or more of three different training systems. They include competitive training which reduces a distance between a recognized result and a true result, data boosting which divides and weights training data, and asymmetric training which trains different model components differently.
Jian Wu - Seattle WA Hugh West - Seattle WA Terry M. Grant - Auburn WA
Assignee:
Weyerhaeuser Company - Federal Way WA
International Classification:
D21H 1733 D21H 1763 D21H 1768 D21H 1769 D21H 1755
US Classification:
1621641
Abstract:
The invention relates to cellulose fluff pulp products that are debondable into fluff with markedly lower energy input, to a process for making the products, and to absorbent products using the fluff. Most of the pulp products show no reduction in liquid absorbency rate from that of untreated fiber and significantly higher rates than pulps treated with the usual debonding agents. The products are made by adhering fine non-cellulosic particles to the fiber surfaces using a retention aid. The fiber is preferably treated with the retention aid in an aqueous suspension for a sufficient time so that the retention aid is substantively bonded with little or none left free in the water. The fine particulate additive is then added and becomes attached and uniformly distributed over the fiber surfaces with very little particle agglomeration occurring. The fiber is most usually not refined or only very lightly refined before sheeting. However, it may be significantly refined to produce a product having a very high surface area.
Jian Wu - Seattle WA Hugh West - Seattle WA Terry M. Grant - Auburn WA
Assignee:
Weyerhaeuser Company - Tacoma WA
International Classification:
D21H 1763 D21H 1767 D21H 1768
US Classification:
162100
Abstract:
The invention relates to cellulose fluff pulp products that are debondable into fluff with markedly lower energy input, to a process for making the products, and to absorbent products using the fluff. Most of the pulp products show no reduction in liquid absorbency rate from that of untreated fiber and significantly higher rates than pulps treated with the usual debonding agents. The products are made by adhering fine non-cellulosic particles to the fiber surfaces using a retention aid. The fiber is preferably treated with the retention aid in an aqueous suspension for a sufficient time so that the retention aid is substantively bonded with little or none left free in the water. The fine particulate additive is then added and becomes attached and uniformly distributed over the fiber surfaces with very little particle agglomeration occurring. The fiber is most usually not refined or only very lightly refined before sheeting. However, it may be significantly refined to produce a product having a very high surface area.
- Redmond WA, US Emilian STOIMENOV - Bellevue WA, US Christopher Hakan BASOGLU - Everett WA, US Kshitiz KUMAR - Redmond WA, US Jian WU - Bellevue WA, US
International Classification:
G10L 15/16 G10L 25/51 G10L 19/16
Abstract:
Disclosed speech recognition techniques improve user-perceived latency while maintaining accuracy by: receiving an audio stream, in parallel, by a primary (e.g., accurate) speech recognition engine (SRE) and a secondary (e.g., fast) SRE; generating, with the primary SRE, a primary result; generating, with the secondary SRE, a secondary result; appending the secondary result to a word list; and merging the primary result into the secondary result in the word list. Combining output from the primary and secondary SREs into a single decoder as described herein improves user-perceived latency while maintaining or improving accuracy, among other advantages.
- Redmond WA, US Emilian STOIMENOV - Bellevue WA, US Christopher Hakan BASOGLU - Everett WA, US Kshitiz KUMAR - Redmond WA, US Jian WU - Bellevue WA, US
International Classification:
G10L 15/30 G10L 15/16 G10L 19/16 G10L 25/51
Abstract:
Disclosed speech recognition techniques improve user-perceived latency while maintaining accuracy by: receiving an audio stream, in parallel, by a primary (e.g., accurate) speech recognition engine (SRE) and a secondary (e.g., fast) SRE; generating, with the primary SRE, a primary result; generating, with the secondary SRE, a secondary result; appending the secondary result to a word list; and merging the primary result into the secondary result in the word list. Combining output from the primary and secondary SREs into a single decoder as described herein improves user-perceived latency while maintaining or improving accuracy, among other advantages.
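The append-then-merge step in the abstract above can be illustrated with a minimal sketch. The abstract does not say how the merge is performed, so everything here is an assumption made for illustration: the `Word` record, the `source` labels, and the rule that a primary result replaces any secondary words whose time span it overlaps.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Word:
    text: str
    start: float  # seconds from the start of the audio stream
    end: float
    source: str   # "secondary" (fast engine) or "primary" (accurate engine)


def append_secondary(word_list: List[Word], result: List[Word]) -> None:
    """Append the fast engine's words so they can be displayed right away."""
    word_list.extend(result)


def merge_primary(word_list: List[Word], result: List[Word]) -> None:
    """Merge a primary result into the list (hypothetical overlap rule).

    Secondary words that overlap the time span covered by the primary
    result are dropped; everything else is kept, and the list is re-sorted
    by start time.
    """
    if not result:
        return
    span_start = min(w.start for w in result)
    span_end = max(w.end for w in result)
    kept = [
        w for w in word_list
        if w.source == "primary" or w.end <= span_start or w.start >= span_end
    ]
    word_list[:] = sorted(kept + result, key=lambda w: w.start)
```

Under these assumptions the user sees the secondary words immediately, and the displayed text is corrected in place once the slower primary result arrives.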
Convolutional Neural Network With Phonetic Attention For Speaker Verification
- Redmond WA, US Tianyan ZHOU - Bellevue WA, US Jinyu LI - Redmond WA, US Yifan GONG - Sammamish WA, US Jian WU - Bellevue WA, US Zhuo CHEN - Woodinville WA, US
International Classification:
G10L 17/18 G10L 17/02 G06N 3/08
Abstract:
Embodiments may include reception of a plurality of speech frames, determination of a multi-dimensional acoustic feature associated with each of the plurality of speech frames, determination of a plurality of multi-dimensional phonetic features, each of the plurality of multi-dimensional phonetic features determined based on a respective one of the plurality of speech frames, generation of a plurality of two-dimensional feature maps based on the phonetic features, input of the feature maps and the plurality of acoustic features to a convolutional neural network, the convolutional neural network to generate a plurality of speaker embeddings based on the plurality of feature maps and the plurality of acoustic features, aggregation of the plurality of speaker embeddings into a first speaker embedding based on respective weights determined for each of the plurality of speaker embeddings, and determination of a speaker associated with the plurality of speech frames based on the first speaker embedding.
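The weighted aggregation step in the abstract above can be sketched as follows. The abstract does not state how the per-embedding weights are determined; this sketch assumes each embedding comes with a scalar relevance score and normalizes the scores with a softmax, which is one common choice and not necessarily the patented method. The function names are hypothetical.

```python
import math
from typing import List, Sequence


def softmax(scores: Sequence[float]) -> List[float]:
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate_embeddings(
    embeddings: Sequence[Sequence[float]],
    scores: Sequence[float],
) -> List[float]:
    """Combine several speaker embeddings into one weighted-sum embedding.

    `scores` holds one assumed relevance score per embedding; how a real
    system would compute them is outside the scope of this sketch.
    """
    if not embeddings or len(embeddings) != len(scores):
        raise ValueError("need one score per embedding and at least one embedding")
    weights = softmax(scores)
    dim = len(embeddings[0])
    return [
        sum(w * emb[d] for w, emb in zip(weights, embeddings))
        for d in range(dim)
    ]
```

With equal scores the result reduces to a plain average of the embeddings; a much larger score for one embedding makes the aggregate follow that embedding almost exactly.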
Core-tech Inc Newark, NJ Oct 2011 to Sep 2012 Accounting Clerk
First Bagel, Inc Union, NJ Jun 2006 to May 2011 Bookkeeper (P/T)
Future Textiles International Trade, Inc Jamesburg, NJ Feb 2005 to May 2006 Assistant Project Management
Continental Auto Parts, LLC Newark, NJ Dec 2003 to Feb 2005 Jr. Accountant
Education:
State University of New York at Old Westbury Old Westbury, NY Dec 2002 Bachelor of Science in Accounting