Implementing Time-Dimensional Price Lookups in BigQuery with Price Difference Analysis
Or any other Datawarehouse
Introduction
In the dynamic world of data analytics, tracking and querying historical product prices with precision is crucial. This blog post delves into setting up a time-dimensional price lookup system in Google BigQuery, enhanced with the capability to calculate the difference between invoiced prices and historical prices. This feature is especially useful for cost controliing and financial analysis, where understanding discrepancies in pricing over time is important.
Table Structure
Product Price Table
Stores historical records of product prices, each with a defined effective period.
product_number
: Unique identifier for the product.price
: The quoted price of the product.effective_from
: The start date for the price's validity.effective_to
: End date for the price's validity (NULL for currently valid prices).
Invoice Lines Table
Contains invoice line item details.
invoice_number
: Unique identifier for the invoice.product_number
: Associated product number.invoice_date
: Date of the invoice.charged_price
: Price charged on the invoice.
Query with Price Difference
To analyze price differences, we join the tables on product_number
and compare invoice_date
with the price's effective period. We also calculate the difference between the charged price and the historical price.
SELECT
il.invoice_number,
il.product_number,
il.invoice_date,
il.charged_price,
pp.price AS qouted_price,
(il.charged_price - pp.price) AS price_difference
FROM
invoice_lines il
JOIN
product_prices pp
ON
il.product_number = pp.product_number
WHERE
il.invoice_date >= pp.effective_from
AND (il.invoice_date <= pp.effective_to OR pp.effective_to IS NULL)
Practical Example with Price Difference
Let's demonstrate this concept with an example involving a price difference.
Creating the Tables
-- Creating the Product Price Table
CREATE TABLE product_prices (
product_number STRING,
price FLOAT64,
effective_from DATE,
effective_to DATE
);
-- Creating the Invoice Lines Table
CREATE TABLE invoice_lines (
invoice_number STRING,
product_number STRING,
invoice_date DATE,
charged_price FLOAT64
);
Inserting Dummy Data
-- Inserting data into Product Prices
INSERT INTO product_prices (product_number, price, effective_from, effective_to)
VALUES
('P001', 100.0, '2021-01-01', '2021-06-30'),
('P001', 110.0, '2021-07-01', NULL), -- Price change
('P002', 150.0, '2021-01-01', NULL); -- Constant price
-- Inserting data into Invoice Lines with a price discrepancy
INSERT INTO invoice_lines (invoice_number, product_number, invoice_date, charged_price)
VALUES
('INV001', 'P001', '2021-05-15', 100.0), -- Correct price
('INV002', 'P001', '2021-07-20', 115.0), -- Price discrepancy
('INV003', 'P002', '2021-04-10', 150.0); -- Correct price
Query Results on Dummy Data
When the query is executed on the provided dummy data, it yields the following results, demonstrating how the system can identify discrepancies between invoiced and quoted prices:
Invoice Number | Product Number | Invoice Date | Charged Price | Quoted Price | Price Difference |
INV001 | P001 | 2021-05-15 | $100.00 | $100.00 | $0.00 |
INV002 | P001 | 2021-07-20 | $115.00 | $110.00 | $5.00 |
INV003 | P002 | 2021-04-10 | $150.00 | $150.00 | $0.00 |
Conclusion
This enhanced approach in BigQuery not only allows us to track quoted and historical prices over time but also enables us to identify and analyze discrepancies between invoiced prices and quoted prices taking the time element into account and not just analyzing against the current price. Such a system is invaluable for businesses that need to maintain financial accuracy and integrity in their invoicing, controlling and pricing strategies.
Footnote: Adjustments for Different SQL Environments
This example is tailored for Google BigQuery. If you're implementing a similar project in a different SQL environment or data warehouse, some adjustments to the SQL syntax may be necessary. Here are a few common variations:
Data Types:
FLOAT64
in BigQuery is equivalent toFLOAT
in many other SQL systems.The
NUMERIC
type in BigQuery might beDECIMAL
in other databases.
Function Names:
Some functions might have different names (e.g., string functions like
SUBSTRING
vs.SUBSTR
).Date and time functions often vary between systems (e.g.,
CURRENT_TIMESTAMP
vs.NOW()
).
Syntax Variations:
The syntax for creating tables, inserting data, and querying can have minor differences.
Join syntax and window functions might differ slightly.
Performance Optimizations:
Indexing strategies and query optimization techniques can vary significantly.
Some databases might require specific tuning for better performance.
It's always a good practice to refer to the specific documentation of the SQL environment you are working with to ensure compatibility and optimal performance.