Business scenario: Forecasting sales for individual stores

Consider the business problem of forecasting sales for individual stores.

The mining analyst wants to use historical sales data and store properties to create a prediction model that predicts the sales of a store for a given day in the future.

The mining analyst wants to provide a table with the following columns as the input table for a data mining regression algorithm:
Table 1. Input table for a data mining regression algorithm

Column name        Logical name                       Description
-----------------  ---------------------------------  -----------------------------------------------
STORE_ID           store                              ID of store (part 1 of key)
DAY                day                                Day (part 2 of key)
MONTH              month                              Month (part 3 of key)
YEAR               year                               Year (part 4 of key)
STORE_TYPE         store type                         Type of the store
DATE               date                               Date (yyyy-mm-dd)
QUARTER            quarter                            Calendar quarter (1-4)
DAY_OF_WEEK        day of week                        1=Sunday, 2=Monday, 3=Tuesday, 4=Wednesday, 5=Thursday, 6=Friday, 7=Saturday
DAY_TYPE           type of day                        Working day versus Saturday or Sunday
SALES              total sales                        Total sales (for store on day)
SALES_TRX          number of sales transactions       Number of sales transactions as an approximation (upper limit) of the number of customers
SALES_PROFIT       total profit                       Total profit (difference of sales amount and product price)
SALES_WK           total week sales                   Total sales in week including date
SALES_AVG_MTH      average sales per day in month     Average sales of stores per day in the month including date
SALES_WK_PCT       sales as percentage within a week  Percentage of sales on date with respect to week
SALES_PERF         sales performance class            Classification of day for store as poor, mediocre, good, or outstanding
SALES_FURNITURE    furniture sales                    Total sales in the furniture department
SALES_SPORTSWEAR   sportswear sales                   Total sales in the sportswear department
SALES_ELECTRONICS  electronic sales                   Total sales in the electronics department
SALES_OTHER        other sales                        Total sales in all departments except furniture, sportswear, and electronics
SALES_7_DAYS       past 7 day sales                   Sales in past 7 shopping days excluding day
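
Several columns in Table 1 are not stored directly but can be derived from other columns. The following Python sketch shows one way the calendar columns (QUARTER, DAY_OF_WEEK, DAY_TYPE) and SALES_7_DAYS might be computed. It is an illustration only, not the method used by the mining product. The function names, the DAY_TYPE labels ("working day" and "weekend"), and the assumption that every day with a sales record counts as a shopping day are hypothetical.

```python
from datetime import date


def calendar_columns(d: date) -> dict:
    """Derive the calendar-based columns of Table 1 from a date."""
    # isoweekday() returns Monday=1 ... Sunday=7.
    # Table 1 uses 1=Sunday ... 7=Saturday, so shift the numbering.
    day_of_week = d.isoweekday() % 7 + 1
    return {
        "DATE": d.isoformat(),  # yyyy-mm-dd
        "DAY": d.day,
        "MONTH": d.month,
        "YEAR": d.year,
        "QUARTER": (d.month - 1) // 3 + 1,  # calendar quarter 1-4
        "DAY_OF_WEEK": day_of_week,
        # Assumed labels; the source only says "working day versus
        # Saturday or Sunday".
        "DAY_TYPE": "weekend" if day_of_week in (1, 7) else "working day",
    }


def sales_past_7_days(daily_sales: list[tuple[date, float]]) -> dict[date, float]:
    """Compute SALES_7_DAYS for one store.

    daily_sales holds one (date, total sales) pair per shopping day.
    Assumption: a day without a record is not a shopping day.
    """
    ordered = sorted(daily_sales)
    result = {}
    for i, (d, _) in enumerate(ordered):
        # Up to 7 preceding shopping days, excluding the day itself.
        window = ordered[max(0, i - 7):i]
        result[d] = sum(sales for _, sales in window)
    return result
```

For the first shopping days of a store, fewer than seven earlier records exist, so this sketch sums whatever history is available. Whether such rows should instead be excluded or treated as missing values is a design choice for the analyst.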

This scenario is used throughout the next sections to illustrate how to prepare the input data for the mining algorithms.


