Douglas R. Burdick - Ithaca NY, US Robert J. Szczerba - Endicott NY, US
Assignee:
Lockheed Martin Corporation - Bethesda MD
International Classification:
G06F 11/00
US Classification:
714/41, 714/37
Abstract:
A system evaluates a data cleansing application. The system includes a collection of records cleansed by the data cleansing application, a plurality of dirtying functions for operating upon the collection to introduce errors to the collection, and a record of the errors introduced to the cleansed collection. The plurality of dirtying functions produces a collection of dirty records.
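As a rough illustration of the abstract above, the following Python sketch applies dirtying functions to a clean collection and keeps a record of each error introduced. The function names (swap_adjacent, drop_char, dirty_collection), the dict-of-strings record layout, and the choice of one random field per record are all assumptions made for the example; the abstract does not specify any of them.

```python
import random

def swap_adjacent(value, rng):
    """Transpose two neighbouring characters (hypothetical dirtying function)."""
    if len(value) < 2:
        return value
    i = rng.randrange(len(value) - 1)
    return value[:i] + value[i + 1] + value[i] + value[i + 2:]

def drop_char(value, rng):
    """Delete one character (hypothetical dirtying function)."""
    if not value:
        return value
    i = rng.randrange(len(value))
    return value[:i] + value[i + 1:]

def dirty_collection(clean_records, dirtying_functions, seed=0):
    """Return (dirty_records, error_log) without modifying clean_records.

    Assumes every record is a dict whose values are strings.
    """
    rng = random.Random(seed)
    dirty_records, error_log = [], []
    for index, record in enumerate(clean_records):
        dirty = dict(record)
        field = rng.choice(sorted(dirty))        # one field per record
        func = rng.choice(dirtying_functions)    # one dirtying function
        dirty[field] = func(dirty[field], rng)
        if dirty[field] != record[field]:        # log only real changes
            error_log.append({
                "record": index,
                "field": field,
                "function": func.__name__,
                "clean": record[field],
                "dirty": dirty[field],
            })
        dirty_records.append(dirty)
    return dirty_records, error_log
```

The error log is what would let an evaluator check afterwards whether a cleansing application undid each introduced error.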
Visualization Toolkit For Data Cleansing Applications
Douglas R. Burdick - Ithaca NY, US Robert J. Szczerba - Endicott NY, US Wei Kang Zhan - Vestal NY, US
Assignee:
Lockheed Martin Corporation - Bethesda MD
International Classification:
G06F 3/00
US Classification:
715/764, 715/854, 715/968
Abstract:
A system views results of a data cleansing application. The system includes a results visualization module and a learning visualization interface module. The results visualization module organizes output of the data cleansing application into a predefined format. The results visualization module displays the output to a user. The learning visualization interface module facilitates interaction with the data cleansing application by the user.
Framework For Evaluating Data Cleansing Applications
Douglas R. Burdick - Ithaca NY, US Robert J. Szczerba - Endicott NY, US Joseph H. Visgitus - Endwell NY, US
Assignee:
Lockheed Martin Corporation - Bethesda MD
International Classification:
G06F 17/30
US Classification:
707/102, 707/6
Abstract:
A system evaluates a first data cleansing application and a second data cleansing application. The system includes a test data generator, an application execution module, and a results reporting module. The test data generator creates a dirty set of sample data from a clean set of data. The application execution module cleanses the dirty set of sample data. The application execution module utilizes the first data cleansing application to cleanse the dirty set of sample data and create a first cleansed output. The application execution module further utilizes the second data cleansing application to cleanse the dirty set of sample data and create a second cleansed output. The results reporting module evaluates the first and second cleansed output. The results reporting module produces an output of scores and statistics for each of the first and second data cleansing applications.
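The comparison described in this abstract can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the two "applications" (strip_only, strip_and_title) are toy stand-ins, the score is simply the fraction of field values restored to the clean reference, and the dirty data is written by hand rather than produced by a test data generator.

```python
def score(cleansed, clean):
    """Fraction of field values that match the clean reference."""
    total = correct = 0
    for got, want in zip(cleansed, clean):
        for field, value in want.items():
            total += 1
            correct += got.get(field) == value
    return correct / total if total else 1.0

def evaluate(clean, dirty, applications):
    """Run each cleansing application on the same dirty data and score it."""
    report = {}
    for name, app in applications.items():
        output = app([dict(r) for r in dirty])   # each app gets its own copy
        report[name] = score(output, clean)
    return report

# Two hypothetical cleansing applications.
def strip_only(records):
    return [{f: v.strip() for f, v in r.items()} for r in records]

def strip_and_title(records):
    return [{f: v.strip().title() for f, v in r.items()} for r in records]

clean = [{"name": "Alice"}, {"name": "Bob"}]
dirty = [{"name": " Alice "}, {"name": "bob"}]
report = evaluate(clean, dirty,
                  {"strip_only": strip_only, "strip_and_title": strip_and_title})
print(report)  # {'strip_only': 0.5, 'strip_and_title': 1.0}
```

Feeding both applications the same dirty sample is the point of the design: the scores are only comparable when the inputs are identical.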
System For Identifying Similarities In Record Fields
Douglas Burdick - Ithaca NY, US Steven Rostedt - Endwell NY, US Robert Szczerba - Endicott NY, US
Assignee:
Lockheed Martin Corporation
International Classification:
G06F007/00
US Classification:
707/003000
Abstract:
A system identifies similarities in data. The system includes a collection of records, a plurality of transform functions, and a cell list structure. Each record in the collection represents an entity and has a list of fields. Data is contained in each field. The plurality of transform functions operates upon the data in each field in each record. The plurality of transform functions generates a set of output values for facilitating comparison of the records and determining whether any of the records represent the same entity. The cell list structure is generated from the output values. The cell list structure has a list of cells for each field and a list of pointers to each cell of the list of cells for each output value generated by the plurality of transform functions.
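One possible reading of the cell list structure, sketched in Python: for each field, every output value of a transform function gets a cell holding the indices of the records that produced it, so records sharing a cell become candidates for representing the same entity. The nested-dict layout, the two transform functions, and the candidate_pairs helper are assumptions for illustration, not the structure claimed in the patent.

```python
from collections import defaultdict

def lowercase(value):
    """Hypothetical transform: case-fold the field value."""
    return value.lower()

def first_letter(value):
    """Hypothetical transform: keep only the first letter, case-folded."""
    return value[:1].lower()

def build_cell_lists(records, transforms):
    """Map field -> (transform name, output value) -> sorted record indices."""
    cells = defaultdict(lambda: defaultdict(set))
    for index, record in enumerate(records):
        for field, value in record.items():
            for transform in transforms:
                key = (transform.__name__, transform(value))
                cells[field][key].add(index)
    return {field: {key: sorted(ids) for key, ids in by_value.items()}
            for field, by_value in cells.items()}

def candidate_pairs(cell_lists):
    """Pairs of record indices that share at least one cell."""
    pairs = set()
    for by_value in cell_lists.values():
        for ids in by_value.values():
            for a in range(len(ids)):
                for b in range(a + 1, len(ids)):
                    pairs.add((ids[a], ids[b]))
    return pairs

records = [{"name": "Smith"}, {"name": "SMITH"}, {"name": "Jones"}]
cells = build_cell_lists(records, [lowercase, first_letter])
print(candidate_pairs(cells))  # {(0, 1)}
```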
Douglas Burdick - Ithaca NY, US Robert Szczerba - Endicott NY, US Joseph Visgitus - Endwell NY, US
Assignee:
Lockheed Martin Corporation
International Classification:
G06F017/00 G06F007/00
US Classification:
707/101000
Abstract:
A system cleanses data. The system includes an input component, a pre-process component, an automated learning component, and a post-process component. The input component receives a collection of records. The pre-process component formats the collection of records and creates a plan for cleansing the collection of records. The automated learning component performs the plan and modifies the plan based on feedback from intermediate steps within the plan. The post-process component evaluates the result of the automated learning component. The post-process component determines whether to accept the result or to feed back the result to the automated learning component.
Boolean Rule-Based System For Clustering Similar Records
Douglas Burdick - Ithaca NY, US Steven Rostedt - Endwell NY, US Robert Szczerba - Endicott NY, US
Assignee:
Lockheed Martin Corporation
International Classification:
G06F017/00 G06F007/00
US Classification:
707/102000
Abstract:
A system identifies similar records. The system includes a collection of records, a set of Boolean rules, and a cell list structure. Each record in the collection has a list of fields and data contained in each field. The set of Boolean rules operates upon the data in each field. The cell list structure is generated from the collection of records. The cell list structure has a list of cells for each field and a list of pointers to each cell of the list of cells for each record. The set of Boolean rules identifies the similar records from the cell list structure.
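To make the idea of Boolean rules over fields concrete, here is a small Python sketch. It compares records pairwise and directly, without the cell list structure the abstract describes, and the combinators (same, all_of, any_of), the field names, and the sample rule are hypothetical.

```python
def same(field):
    """Rule: both records have the field and the values are equal."""
    return lambda a, b: a.get(field) is not None and a.get(field) == b.get(field)

def all_of(*rules):
    """Boolean AND of rules."""
    return lambda a, b: all(rule(a, b) for rule in rules)

def any_of(*rules):
    """Boolean OR of rules."""
    return lambda a, b: any(rule(a, b) for rule in rules)

def similar_pairs(records, rule):
    """Index pairs (i, j), i < j, for which the rule holds."""
    return [(i, j)
            for i in range(len(records))
            for j in range(i + 1, len(records))
            if rule(records[i], records[j])]

# Same last name AND (same zip OR same phone).
rule = all_of(same("last"), any_of(same("zip"), same("phone")))

records = [
    {"last": "Smith", "zip": "13120", "phone": "1"},
    {"last": "Smith", "zip": "13120", "phone": "2"},
    {"last": "Smith", "zip": "14850", "phone": "3"},
    {"last": "Jones", "zip": "13120", "phone": "1"},
]
print(similar_pairs(records, rule))  # [(0, 1)]
```

The pairwise loop is quadratic in the number of records; an index such as the cell list structure in the abstract would presumably exist to avoid comparing every pair.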
Parallelizable System For Concise Representation Of Data
Douglas Burdick - Ithaca NY, US Steven Rostedt - Endwell NY, US Robert Szczerba - Endicott NY, US
Assignee:
Lockheed Martin Corporation
International Classification:
G06F007/00
US Classification:
707/001000
Abstract:
A system represents data during a data cleansing application. The system includes a record collection. Each record in the collection includes a list of fields and data contained in each field. The system further includes a predetermined sequence of operations to be performed on the record collection and a plurality of bit-maps representing the record collection. The system still further includes a partitioned sequence of operations for parallel processing of the bit-maps by a plurality of separate devices.
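A minimal Python sketch of the bit-map idea follows, assuming one bit-map per (field, value) pair in which bit i is set when record i holds that value. The operation sequence here is only a chain of ANDs, and the partitions are handed to a thread pool as a stand-in for the separate devices mentioned in the abstract; all function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def build_bitmaps(records):
    """Map (field, value) -> int whose bit i is set when record i has that value."""
    bitmaps = {}
    for i, record in enumerate(records):
        for field, value in record.items():
            key = (field, value)
            bitmaps[key] = bitmaps.get(key, 0) | (1 << i)
    return bitmaps

def run_partition(bitmaps, keys):
    """AND together the bit-maps named by keys (one partition of the sequence)."""
    result = -1  # in Python, -1 behaves as "all bits set"
    for key in keys:
        result &= bitmaps.get(key, 0)
    return result

def run_partitioned(bitmaps, partitions):
    """Evaluate each partition concurrently, then combine the partial results."""
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(lambda keys: run_partition(bitmaps, keys),
                                 partitions))
    result = -1
    for partial in partials:
        result &= partial
    return result

def matching_indices(bitmap, count):
    """Record indices whose bit is set, for a collection of `count` records."""
    return [i for i in range(count) if bitmap >> i & 1]

records = [
    {"city": "Ithaca", "state": "NY"},
    {"city": "Vestal", "state": "NY"},
    {"city": "Ithaca", "state": "NY"},
]
bitmaps = build_bitmaps(records)
combined = run_partitioned(bitmaps, [[("city", "Ithaca")], [("state", "NY")]])
print(matching_indices(combined, len(records)))  # [0, 2]
```

Because AND is associative and commutative, the partitions can be evaluated in any order and merged afterwards, which is what makes this particular operation sequence easy to split.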
System For Dynamically Building Extended Dictionaries For A Data Cleansing Application
Douglas Burdick - Ithaca NY, US Robert Szczerba - Endicott NY, US
Assignee:
Lockheed Martin Corporation
International Classification:
G06F007/00
US Classification:
707/003000
Abstract:
A system builds an extended dictionary for a data cleansing application. The system includes a record collection. Each record in the collection includes a list of fields and data contained in each field. The system further includes an input dictionary defining predetermined valid values for variants of values in at least one of the fields and a set of rules derived from patterns of the field values. The system still further includes an extended dictionary including the input dictionary and the rules.
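The following Python sketch shows one way an input dictionary could be extended from patterns seen in field values. The single pattern used here (a token that matches a known variant once case and trailing punctuation are ignored) is an assumption chosen for brevity; the abstract does not say how its rules are derived, and build_extended_dictionary is a hypothetical name.

```python
def build_extended_dictionary(input_dictionary, field_values):
    """Return the input dictionary plus variants derived from field values.

    input_dictionary maps a known variant to its valid value,
    e.g. {"St": "Street"}.
    """
    known = {variant.lower(): valid
             for variant, valid in input_dictionary.items()}
    derived = {}
    for value in field_values:
        for token in value.split():
            normalized = token.strip(".,").lower()
            if token not in input_dictionary and normalized in known:
                derived[token] = known[normalized]   # new variant found in data
    return {**input_dictionary, **derived}

input_dictionary = {"St": "Street", "Ave": "Avenue"}
field_values = ["1 Main St.", "22 Oak AVE", "5 Elm Rd"]
extended = build_extended_dictionary(input_dictionary, field_values)
print(extended)
# {'St': 'Street', 'Ave': 'Avenue', 'St.': 'Street', 'AVE': 'Avenue'}
```

Tokens such as "Rd" that match no known variant are left out, so the extended dictionary never maps a value the input dictionary cannot justify.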
Name / Title | Company / Classification | Phones & Addresses
Douglas Burdick | BURDICKS TAVERN, INC | 209 Thompson Rd, Nedrow, NY 13120; 6600 S Salina St, Nedrow, NY 13120
Douglas Burdick, President | Custom Plant Scapes Inc; Real Estate · Landscape - Contractor |