Se June Hong - Yorktown Heights NY, US; Jonathan R. Hosking - Scarsdale NY, US; Ramesh Natarajan - Pleasantville NY, US
Assignee:
International Business Machines Corporation - Armonk NY
International Classification:
G06F 15/18
US Classification:
703 2, 703 6, 706 12, 706 14, 706 20
Abstract:
A new method models the class probability from data, based on a novel multiplicative adjustment of the class probability by a plurality of items of evidence induced from training data. The optimal adjustment factor for each item of evidence can be determined by several techniques, a preferred embodiment being the method of maximum likelihood. The evidence induced from the data can be any function of the feature variables, the simplest of which are the individual feature variables themselves. The adjustment factor of an item of evidence E is given by the ratio of the conditional probability P(C|E) of the class C given E to the prior class probability P(C), exponentiated by a parameter a. The method provides a new and useful way to aggregate probabilistic evidence so that the final model output exhibits a low error rate for classification, and also gives a superior lift curve when distinguishing between any one class and the remaining classes. A good prediction for the class response probability has many uses in data mining applications, such as using the probability to compute expected values of any function associated with the response, and in many marketing applications where lift curves are generated for selected prioritized target customers.
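The multiplicative adjustment described in this abstract can be illustrated with a minimal sketch. The function name, the dictionary-based inputs, and the final renormalization across classes are assumptions made for illustration; the abstract does not specify them, and the exponents are simply passed in rather than fitted by maximum likelihood.

```python
def adjusted_class_probabilities(priors, conditionals, exponents):
    """Combine evidence by multiplicative adjustment of the prior.

    priors:       dict mapping class -> P(C)
    conditionals: list of dicts, one per item of evidence, mapping class -> P(C|E)
    exponents:    list of parameters a, one per item of evidence
    (All names are hypothetical; renormalization is an assumption.)
    """
    scores = {}
    for cls, prior in priors.items():
        score = prior
        for cond, a in zip(conditionals, exponents):
            # Adjustment factor: (P(C|E) / P(C)) raised to the power a.
            score *= (cond[cls] / prior) ** a
        scores[cls] = score
    total = sum(scores.values())
    # Renormalize so the adjusted values sum to 1 across classes.
    return {cls: s / total for cls, s in scores.items()}


priors = {"yes": 0.2, "no": 0.8}
evidence = [{"yes": 0.5, "no": 0.5}]
result = adjusted_class_probabilities(priors, evidence, [1.0])
```

With a single item of evidence and an exponent of 1, the adjusted probabilities equal the conditional probabilities; with an exponent of 0, the evidence is ignored and the priors are returned.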
Frequency Estimation Of Rare Events By Adaptive Thresholding
International Business Machines Corporation - Armonk NY
International Classification:
G01R 23/00
US Classification:
702 75, 702 179, 702 180, 702 181, 702 193
Abstract:
A method and system for estimating a magnitude of extremely rare events upon receiving a complete data sample and a specific exceedance probability are described. A distribution is chosen for the complete data sample. An optimal subsample fitted to the distribution is obtained. The optimal subsample is the largest acceptable subsample. A subsample is considered acceptable when the result of a goodness-of-fit test on the subsample is satisfactory (i.e., higher than a predetermined threshold). In addition, if a tail measure of an acceptable subsample lies outside a confidence interval of any smaller acceptable subsample, the subsample is considered unacceptable. Based on the optimal subsample and an inputted exceedance probability, a quantile estimate is computed, e.g., by evaluating the inverse of the cumulative distribution function of a generalized Pareto distribution.
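The final step, computing a quantile from the inverse generalized Pareto CDF, can be sketched as follows. This covers only the quantile computation, not the subsample selection. It assumes the standard peaks-over-threshold form in which the subsample consists of the exceedances above a threshold; the function and parameter names are hypothetical, and the fitted scale and shape are taken as given.

```python
import math


def gpd_quantile(threshold, scale, shape, n_total, n_exceed, exceedance_prob):
    """Quantile exceeded with probability `exceedance_prob`, using a generalized
    Pareto distribution fitted to the `n_exceed` values above `threshold`
    out of `n_total` observations (peaks-over-threshold form, an assumption).
    """
    # Probability of exceeding the target, conditional on exceeding the threshold.
    tail_prob = exceedance_prob * n_total / n_exceed
    if not 0.0 < tail_prob <= 1.0:
        raise ValueError("target quantile must lie at or above the threshold")
    if abs(shape) < 1e-12:
        # Exponential limit of the generalized Pareto distribution (shape -> 0).
        return threshold - scale * math.log(tail_prob)
    # Inverse of the GPD survival function (1 + shape * y / scale) ** (-1 / shape).
    return threshold + (scale / shape) * (tail_prob ** (-shape) - 1.0)


estimate = gpd_quantile(threshold=10.0, scale=2.0, shape=0.5,
                        n_total=1000, n_exceed=100, exceedance_prob=0.01)
```

The check on `tail_prob` rejects requests for quantiles below the threshold, where the fitted tail distribution says nothing.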
Range Forecasting Of Demand For Order Configurations For Configurable Products
International Business Machines Corporation - Armonk NY
International Classification:
G06Q 10/00
US Classification:
705 725, 705 731
Abstract:
A method and system for forecasting demand for order configurations are provided. The method and system, in one aspect, express attach rates within a family of n options as a set of n positive numbers that sum to 1. By applying suitable transformations to the attach rates, they are modeled as a random vector in (n−1)-dimensional Euclidean space. The distribution of the transformed attach rates is modeled as a distribution family specified by a location vector and a dispersion matrix. The dispersion matrix is simplified, for example, using historical data or expert judgment or both to identify option families that have dependent demand. Simplifying may also include expressing dependence between options by a simple model that involves few parameters. The location vector is estimated by computing point forecasts of transformed attach rates. The parameters of the dispersion matrix are estimated by calibration on historical data, using the dispersion of the errors in historical point forecasts.
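One way to map n positive attach rates that sum to 1 into (n−1)-dimensional Euclidean space is the additive log-ratio transform. The abstract says only "suitable transformations", so the choice of this transform, and the function names below, are assumptions made for illustration.

```python
import math


def alr(rates):
    """Additive log-ratio transform: n positive rates summing to 1
    become n-1 unconstrained real numbers (last rate is the reference)."""
    reference = rates[-1]
    return [math.log(r / reference) for r in rates[:-1]]


def alr_inverse(z):
    """Map n-1 real numbers back to n positive rates that sum to 1."""
    expz = [math.exp(v) for v in z] + [1.0]
    total = sum(expz)
    return [e / total for e in expz]


attach_rates = [0.5, 0.3, 0.2]
transformed = alr(attach_rates)       # two unconstrained coordinates
recovered = alr_inverse(transformed)  # back to three rates summing to 1
```

Working in the transformed space lets a location vector and dispersion matrix describe the rates without having to enforce positivity or the sum-to-one constraint directly.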
Method And Apparatus For Risk Assessment For A Disaster Recovery Process
David Gamarnik - New York NY, US; Jonathan Hosking - Scarsdale NY, US; William Kane - Florida NY, US; Ta-Hsin Li - Danbury CT, US; Emmanuel Yashchin - Yorktown Heights NY, US
Assignee:
International Business Machines Corporation - Armonk NY
International Classification:
G01V003/00 G01V007/00 G06F019/00 G06F017/60
US Classification:
705004000, 702002000
Abstract:
A method and structure for calculating a risk exposure for a disaster recovery process, including loading a user interface into a memory, the user interface allowing control of an execution of one or more risk models. Each risk model is based on a specific disaster type, and each risk model addresses a recovery utilization of one or more specific assets identified as necessary for a recovery process of the disaster type. One of the risk models is executed at least one time.
Demand Planning For Configure-To-Order And Building Blocks-Based Market Environment
Roger R. Gung - Yorktown Heights NY; Jonathan R. Hosking - Scarsdale NY; Grace Y. Lin - Chappaqua NY; Akira Tajima - Kawasaki, JP
Assignee:
International Business Machines Corporation - Armonk NY
International Classification:
G06F 17/60
US Classification:
705 10, 705 7, 705 28, 705 29
Abstract:
A method for demand planning of products. The method comprises the steps of constructing a configure-to-order operation/multiple building block environment, and forecasting the demand for the building blocks within this environment to establish efficient supply chain management.
Combining Group-Level And Item-Level Information For Forecasting By Empirical Bayesian Deconvolution Based Technique
- Seattle WA, US; Jonathan Hosking - Scarsdale NY, US
Assignee:
Amazon Technologies, Inc. - Seattle WA
International Classification:
G06K 9/62 G06Q 10/04 G06F 16/906
Abstract:
A data set comprising records of state change events of items of an item collection, as well as records of asynchronous operations associated with the items, is obtained. The numbers of records in the data set may differ from one item to another. Using the data set, a Bayesian forecasting model employing a deconvolution algorithm is trained. The model generates estimates of metrics of a type of asynchronous operation using a combination of a category-level distribution of the asynchronous operation, an item-level distribution, and a category-versus-item adjustment. A trained version of the model is stored.
Method And System For Forecasting Using An Online Analytical Processing Database
- Armonk NY, US; Jonathan R.M. Hosking - Scarsdale NY, US; Wanli Min - Mount Kisco NY, US; Laura Wynter - Singapore, SG
International Classification:
G06Q 40/02 G06Q 40/00
Abstract:
A method for providing a forecast includes providing a database storing data at a lowest level in a first dimension, calculating a first forecast at a level that is higher than the lowest level of a first dimension in the database, calculating a forecast for each category within the lowest level of the first dimension, aggregating a second forecast across a category at the lowest level of the first dimension based upon an aggregation of the calculated forecast for the category within the lowest level of the first dimension, determining a difference between the first forecast and the second forecast, and creating a dummy category including a new category at the lowest level of the first dimension.
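The reconciliation steps in this abstract can be sketched in a few lines. The sketch assumes that the dummy category is used to hold the difference between the higher-level forecast and the aggregated lowest-level forecasts, which the abstract implies but does not state; the function name and the `_dummy` label are hypothetical.

```python
def reconcile_with_dummy(top_forecast, category_forecasts, dummy_name="_dummy"):
    """Compare a higher-level forecast with the sum of lowest-level forecasts
    and place the difference in a new dummy category (an assumed use)."""
    # Second forecast: aggregate of the per-category lowest-level forecasts.
    aggregated = sum(category_forecasts.values())
    # Difference between the first (higher-level) and second forecasts.
    difference = top_forecast - aggregated
    reconciled = dict(category_forecasts)
    reconciled[dummy_name] = difference
    return reconciled, difference


reconciled, difference = reconcile_with_dummy(100.0, {"a": 40.0, "b": 50.0})
```

After this step the lowest-level values, including the dummy category, sum to the higher-level forecast, so aggregation in the database stays consistent across levels.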
Real-Time Forecasting Of Electricity Demand In A Streams-Based Architecture With Applications
- Armonk NY, US; Jonathan R. Hosking - Scarsdale NY, US; Ramesh Natarajan - Pleasantville NY, US; Shivaram Subramanian - New Fairfield CT, US; Xiaoxuan Zhang - Jersey City NJ, US
International Classification:
G05B 19/042
Abstract:
A streams platform is used. Multiple streams of electricity usage data are received, each from an electrical meter providing periodic updates of electrical usage for devices connected to the electrical meter. Weather information corresponding to the locations of the electrical meters is received. Real-time predictive modeling of electricity demand is performed based on the received multiple streams of electricity usage data and the received weather information, at least by performing: updating a state space model for electrical load curves using the usage data from the streams and the weather information, wherein the updating uses current load observations for the multiple streams for a current time period; and creating forecast(s) for the electricity demand. The forecast(s) of the electricity demand are output. Appliance-level predictions may be made and used, and substitution effects and load management functions may be performed.
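As a rough illustration of updating a state space model with a current load observation and weather input, the sketch below uses a scalar Kalman filter for a local-level model with a linear temperature effect. The abstract does not specify the model form, so this structure, the known weather coefficient, and all names are assumptions.

```python
def update_load_state(level, variance, observed_load, temperature,
                      weather_coef, process_var, obs_var):
    """One Kalman update for the assumed model
    load = level + weather_coef * temperature + noise, level a random walk."""
    # Predict: the level carries over, its uncertainty grows by process_var.
    pred_var = variance + process_var
    predicted_load = level + weather_coef * temperature
    # Correct: move the level toward the current observation.
    innovation = observed_load - predicted_load
    gain = pred_var / (pred_var + obs_var)
    new_level = level + gain * innovation
    new_variance = (1.0 - gain) * pred_var
    return new_level, new_variance


def forecast_load(level, forecast_temperature, weather_coef):
    """Point forecast of load for a period with the given forecast temperature."""
    return level + weather_coef * forecast_temperature


level, variance = update_load_state(0.0, 1.0, 2.0, 0.0, 0.0, 0.0, 1.0)
next_load = forecast_load(level, 10.0, 0.5)
```

In a streaming setting, `update_load_state` would be called once per meter per time period as new readings arrive, with `forecast_load` producing the output for the next period.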