Henrique S. Malvar - Sammamish WA, US Patrice Y. Simard - Bellevue WA, US James Russell Rinker - Kirkland WA, US
Assignee:
Microsoft Corporation - Redmond WA
International Classification:
G06K 9/46
US Classification:
382195, 382264, 382283, 348586, 358448
Abstract:
A system and method facilitating image smoothing is provided. The invention includes an image processor having an image receptor and an image smoother. The invention provides for the image smoother to alter the value of a don't care pixel based, at least in part, upon a weighted average of care pixels.
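The smoothing step described in this abstract can be illustrated with a minimal sketch. It assumes a grayscale image stored as a list of rows and a boolean "care" mask of the same shape; the function name `smooth_dont_care`, the square neighborhood, and the inverse-squared-distance weights are illustrative assumptions, not details taken from the patent.

```python
def smooth_dont_care(image, care, radius=1):
    """Replace each don't-care pixel with a weighted average of nearby care pixels.

    Hypothetical helper: the neighborhood shape and the weights are assumptions.
    """
    height, width = len(image), len(image[0])
    result = [row[:] for row in image]
    for y in range(height):
        for x in range(width):
            if care[y][x]:
                continue  # care pixels keep their original value
            total = 0.0
            weight_sum = 0.0
            for ny in range(max(0, y - radius), min(height, y + radius + 1)):
                for nx in range(max(0, x - radius), min(width, x + radius + 1)):
                    if care[ny][nx]:
                        # Closer care pixels contribute more (assumed weighting).
                        w = 1.0 / ((ny - y) ** 2 + (nx - x) ** 2)
                        total += w * image[ny][nx]
                        weight_sum += w
            if weight_sum > 0:
                result[y][x] = total / weight_sum
            # With no care pixel in range, the don't-care value is left as is.
    return result


smoothed = smooth_dont_care([[10, 0, 30]], [[True, False, True]])
```

In this toy input the middle pixel is the only don't-care pixel, and its two care neighbors are equally distant, so it becomes their plain average.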
System And Method Facilitating Document Image Compression Utilizing A Mask
Patrice Y. Simard - Bellevue WA, US Erin L. Renshaw - Kirkland WA, US James Russell Rinker - Kirkland WA, US
Assignee:
Microsoft Corporation - Redmond WA
International Classification:
G06K 9/00 G06K 9/36
US Classification:
382166, 382232, 382243, 35842601
Abstract:
A system and method facilitating document image compression utilizing a mask separating a foreground of a document image from a background is provided. The invention includes a pixel energy analyzer adapted to partition regions into a foreground and background. The invention further provides for a merge region component adapted to attempt to merge regions if the merged region would not exceed a threshold energy. Merged regions are partitioned into a new foreground and new background. Thereafter, a mask storage component stores the partitioning information in a binary mask.
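As a rough illustration of the energy-based partition and merge described above, the sketch below treats a region as a flat list of grayscale values and defines energy as the sum of squared deviations from the mean. That energy definition, the choice of darker values as foreground, and the names `energy`, `partition`, and `try_merge` are assumptions made for the example; the abstract does not specify them.

```python
def energy(values):
    """Sum of squared deviations from the mean (assumed energy measure)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def partition(values):
    """Split values into (foreground, background, total_energy).

    Tries every split of the sorted values and keeps the one with the lowest
    combined energy. Darker values are treated as foreground (assumption).
    """
    ordered = sorted(values)
    if len(ordered) < 2:
        return ordered, [], 0.0
    best = None
    for i in range(1, len(ordered)):
        dark, light = ordered[:i], ordered[i:]
        total = energy(dark) + energy(light)
        if best is None or total < best[2]:
            best = (dark, light, total)
    return best


def try_merge(region_a, region_b, threshold):
    """Merge two regions only if the merged partition stays within the threshold."""
    foreground, background, total = partition(region_a + region_b)
    if total <= threshold:
        return foreground, background
    return None  # merge rejected; the regions stay separate


merged = try_merge([10, 200], [12, 202], threshold=10)
```

A binary mask would then record, per pixel, whether it landed in the foreground or the background of its final region.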
Patrice Y. Simard - Bellevue WA, US Erin L. Renshaw - Kirkland WA, US James Russell Rinker - Kirkland WA, US Henrique S. Malvar - Sammamish WA, US
Assignee:
Microsoft Corporation - Redmond WA
International Classification:
G06K 9/00 G06K 9/36
US Classification:
382166, 382232, 382243, 35842601
Abstract:
Systems and methods for encoding and decoding document images are disclosed. Document images are segmented into multiple layers according to a mask. The multiple layers are non-binary. The respective layers can then be processed and compressed separately in order to achieve better compression of the document image overall. A mask is generated from a document image. The mask is generated so as to reduce an estimate of compression for the combined size of the mask and multiple layers of the document image. The mask is then employed to segment the document image into the multiple layers. The mask determines or allocates pixels of the document image into respective layers. The mask and the multiple layers are processed and encoded separately so as to improve compression of the document image overall and to improve the speed of so doing. The multiple layers are non-binary images and can, for example, comprise a foreground image and a background image.
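The way a mask allocates pixels into layers can be sketched as follows. This is only an illustration under simple assumptions: the image and the binary mask are lists of rows, `None` marks a position that a layer does not own, and the names `split_layers` and `recombine` are hypothetical. The actual encoding and compression of each layer is not shown.

```python
def split_layers(image, mask):
    """Allocate each pixel to the foreground (mask true) or background layer."""
    foreground = [
        [pixel if m else None for pixel, m in zip(row, mask_row)]
        for row, mask_row in zip(image, mask)
    ]
    background = [
        [None if m else pixel for pixel, m in zip(row, mask_row)]
        for row, mask_row in zip(image, mask)
    ]
    return foreground, background


def recombine(foreground, background, mask):
    """Rebuild the image by reading each pixel from the layer the mask selects."""
    return [
        [f if m else b for f, b, m in zip(f_row, b_row, mask_row)]
        for f_row, b_row, mask_row in zip(foreground, background, mask)
    ]


image = [[0, 255, 250], [5, 252, 3]]
mask = [[True, False, False], [True, False, True]]
fg, bg = split_layers(image, mask)
restored = recombine(fg, bg, mask)
```

Because each layer only has to represent the pixels it owns, the unowned positions can be filled with whatever values compress best before the layer is encoded.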
Processing An Electronic Document For Information Extraction
Paul Viola - Kirkland WA, US Hiu Chung Law - East Lansing MI, US James Rinker - Kirkland WA, US
Assignee:
Microsoft Corporation - Redmond WA
International Classification:
G06F 17/30 G06F 7/00
US Classification:
707 1, 707 5
Abstract:
The present invention relates to a method of automatically processing an electronic document for routing over a computer network. The method includes recognizing text in the document to identify a candidate address, accessing a collection of potential destinations and comparing the candidate address to the collection of potential destinations to determine a destination for the document.
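The comparison step in this abstract might look like the sketch below. It assumes the candidate address is an e-mail-like string and that recognition errors are tolerated through fuzzy matching with the standard library's `difflib.get_close_matches`; the function name `route`, the regular expression, and the cutoff value are illustrative assumptions rather than the patented method.

```python
import difflib
import re

# Loose pattern for e-mail-like tokens in recognized text (assumption).
ADDRESS_PATTERN = re.compile(r"[\w.+-]+@[\w.-]+")


def route(recognized_text, destinations, cutoff=0.8):
    """Return the known destination closest to a candidate address, or None."""
    for candidate in ADDRESS_PATTERN.findall(recognized_text):
        matches = difflib.get_close_matches(
            candidate.lower(), destinations, n=1, cutoff=cutoff
        )
        if matches:
            return matches[0]
    return None


known = ["j.smith@example.com", "sales@example.com"]
# "examp1e" simulates a recognition error (digit 1 instead of letter l).
destination = route("Please forward to j.smith@examp1e.com by Friday", known)
```

Matching against a fixed collection of destinations means a misrecognized character can still resolve to the intended recipient, while text with no plausible candidate yields no destination.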
Erin L. Renshaw - Kirkland WA, US James Russell Rinker - Kirkland WA, US Henrique Malvar - Sammamish WA, US
Assignee:
Microsoft Corporation - Redmond WA
International Classification:
G06K 9/00 G06K 9/36
US Classification:
382166, 382232, 382243, 35842601
Abstract:
Systems and methods for encoding and decoding document images are disclosed. Document images are segmented into multiple layers according to a mask. The multiple layers are non-binary. The respective layers can then be processed and compressed separately in order to achieve better compression of the document image overall. A mask is generated from a document image. The mask is generated so as to reduce an estimate of compression for the combined size of the mask and multiple layers of the document image. The mask is then employed to segment the document image into the multiple layers. The mask determines or allocates pixels of the document image into respective layers. The mask and the multiple layers are processed and encoded separately so as to improve compression of the document image overall and to improve the speed of so doing. The multiple layers are non-binary images and can, for example, comprise a foreground image and a background image.
Patrice Y. Simard - Bellevue WA, US James Russell Rinker - Kirkland WA, US
Assignee:
Microsoft Corporation - Redmond WA
International Classification:
G06F 17/00
US Classification:
715243, 715255
Abstract:
A system and method facilitating layout analysis is provided. The invention includes a layout analyzer having a connected component organizer, a connected component joiner, a word organizer and a word joiner. The invention provides for the connected component organizer to organize connected components based upon color, horizontal position and/or vertical position. The invention provides for the connected component joiner to join connected components based, at least in part, upon color, vertical position, horizontal position, a distance between the connected components, height of the connected components and/or width of the connected components. The word organizer organizes words and the word joiner joins words into lines. The joining of words into lines can cause the connected component joiner to attempt to further join connected components into words.
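A minimal sketch of joining connected components into words is given below. It assumes each component is a bounding box `(x, y, width, height)` and joins neighbors when the horizontal gap is small and the boxes overlap vertically. Color and component size, which the abstract also lists as criteria, are ignored here, and the name `join_into_words` and the `max_gap` parameter are hypothetical.

```python
def join_into_words(boxes, max_gap):
    """Join bounding boxes (x, y, width, height) into word boxes, left to right."""
    words = []
    for x, y, w, h in sorted(boxes):  # organize by horizontal position
        if words:
            px, py, pw, ph = words[-1]
            gap = x - (px + pw)
            overlaps_vertically = y < py + ph and py < y + h
            if gap <= max_gap and overlaps_vertically:
                # Grow the previous word box to cover this component.
                left, top = min(px, x), min(py, y)
                right = max(px + pw, x + w)
                bottom = max(py + ph, y + h)
                words[-1] = (left, top, right - left, bottom - top)
                continue
        words.append((x, y, w, h))
    return words


words = join_into_words([(0, 0, 5, 10), (6, 0, 5, 10), (20, 0, 5, 10)], max_gap=2)
```

The same pass could be repeated on the resulting word boxes with a larger gap to join words into lines.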
Henrique S. Malvar - Sammamish WA, US Patrice Y. Simard - Bellevue WA, US James Russell Rinker - Kirkland WA, US
Assignee:
Microsoft Corporation - Redmond WA
International Classification:
G06K 9/46
US Classification:
382195, 382264, 382283, 348586, 358448
Abstract:
A system and method facilitating image smoothing is provided. The invention includes an image processor having an image receptor and an image smoother. The invention provides for the image smoother to alter the value of a don't care pixel based, at least in part, upon a weighted average of care pixels.
Charles E. Jacobs - Bellevue WA, US James R. Rinker - Kirkland WA, US Patrice Y. Simard - Bellevue WA, US Paul A. Viola - Kirkland WA, US
Assignee:
Microsoft Corporation - Redmond WA
International Classification:
G06K 9/18
US Classification:
382182, 382173, 382229
Abstract:
A global optimization framework is provided for optical character recognition (OCR) of low-resolution photographed documents; it combines a binarization-type process, segmentation, and recognition into a single process. The framework includes a machine learning approach trained on a large amount of data. A convolutional neural network can be employed to compute a classification function at multiple positions and to take grey-level input, which eliminates the need for binarization. The framework utilizes preprocessing, layout analysis, character recognition, and word recognition to achieve high recognition rates. The framework also employs dynamic programming and language models to arrive at the desired output.
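The role dynamic programming can play in combining segmentation and recognition is sketched below. The example assumes a classifier has already scored candidate character spans across a word image, with each span `(start, end)` mapped to a character and a log-probability; the function `best_reading` and this input format are assumptions made for illustration and do not come from the patent. A language model could be added as an extra score term.

```python
def best_reading(length, spans):
    """Find the highest-scoring character sequence covering positions 0..length.

    spans maps (start, end) -> (character, log_probability), with start < end.
    Returns (total_log_probability, text), or None if no full cover exists.
    """
    best = {0: (0.0, "")}  # best[p] = best (score, text) ending exactly at p
    for end in range(1, length + 1):
        for (start, stop), (char, log_prob) in spans.items():
            if stop != end or start not in best:
                continue
            score = best[start][0] + log_prob
            if end not in best or score > best[end][0]:
                best[end] = (score, best[start][1] + char)
    return best.get(length)


# Two narrow spans read as "r" + "n"; one wide span reads as "m".
candidates = {(0, 2): ("r", -0.5), (2, 4): ("n", -0.6), (0, 4): ("m", -0.9)}
result = best_reading(4, candidates)
```

Here the single wide span wins because its score is higher than the combined score of the two narrow spans, so segmentation and recognition are decided together rather than in separate passes.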