Being a role meant to support the decision-making process, Data Engineers need to understand certain Financial concepts and know-how to best leverage them in their data models.
Some concepts are particularly important for Data Engineers activities, namely amortization, and allocation. While other concepts of controlling such as entitlement values, mix shift, and variance decomposition can also be helpful to understand how some of their data consumers might be leveraging the data.
What is amortization?
Amortization represents the process of gradually writing off the initial cost of an asset over its useful life — when we distinguish between amortization and depreciation, it is usually to represent intangible assets through amortization and fixed assets through depreciation.
Amortization is a method that deals with the TIME value component of how to ascribe a cost across the time dimension.
What can amortization help answer?
Amortization help ascribe revenues or costs across time. This can help answer questions regarding profitability and economic revenue.
Why is the business so interested in amortization?
Besides the increased visibility in the overall cost structure, amortization tends to have tax and budget implications. It allows deferring expenses onto later years, providing more budgeting freedom in the initial years.
Let say we had a license agreement to sell branded football jerseys that we pre-paid 70e for, and that is meant to last one week during the peak of the football season. We would like to understand during each of the days the profitability of the deal. We are asked to inform our licensing partner by the Wednesday of that week if we would like to renew our licensing agreement for a second week. We need to compare the revenue that we earned every day with the cost that we incurred. But it would not be fair to ascribe all the cost of the licensing agreement to the first day of the contract; we need to split or “amortize” this cost across the useful life of the contract. What straight-line amortization does is split the cost across each time period uniformly so that each day of the contract brings about the same cost. The example below, for instance, shows how that would look like over our whole licensing week.
By day 3, we will have amortized about 30e of costs, and we could use this as the licensing cost base to determine if the deal is profitable for us and if we would like to renew it for a subsequent week.
When should you use Straight-line amortization?
There are different use cases for applying straight-line amortizations, notably when you receive prepayments for a subscription type of contract. The company will amortize the prepayment value across the contract's lifetime to represent the “economic revenue” earned during each period. In the real world, this applies to contracts such as your household insurance contracts or the yearly contract that you pay for streaming services such as Netflix or Amazon Prime.
It also applies to licensing agreements such as, for example, an Oracle license that has been prepaid for three years or when leveraging prepaid AWS reserved instances.
The same applies to specific assets such as purchased laptops that can amortize that tend to be amortized over their useful lie.
Straight-line amortization is a very straightforward amortization approach and can be leveraged in a wide variety of cases. It is also useful to apply when you don’t really know how to amortize the value of an asset.
How to implement it
At the heart of straight-line amortization is the computation of a rate of amortization by a unit of time. Let say you would like to know how much some yearly license costing 730 euro could be amortized over three months.
You would first compute the rate that would be 730 euro to be amortized over 365 days, so at a rate of 2e per day. The second step would be to compute the number of units of amortization. In this case, we would like to cover the period of 2021–01–01 to 2021–03–31 included, which totals 90 days. The amortization over the period is therefore 90*2e = 180e.
We can do this type of calculation in SQL. The following query calculates the total amortization over the first three months of 2021:
How to leverage amortization in data models?
Periodic snapshot fact tables under the Kimball methodology provide a flexible way to model amortization values.
Most non-linear amortization techniques aim to amortize more in the early periods when there is an increased “economic value” of the asset.
There is a wide variety of non-linear amortization techniques such as “double declining balance,” “sum of years digits depreciation,” “decay curve amortizations,” etc.… that incorporate this amortization front loading.
Other techniques such as the “Unit of productions” method aim to tie the amortization to the production patterns.
No matter the method employed, its goal is always to get closer to modeling the asset's economic value.
What is decay curve amortization?
The decay curve amortization would take the share of “output” obtained with the period (colored in green in the above picture) and compare it with the total output obtained during the asset's useful life.
Dynamic or “Adaptive” Segmentation: refers to segmentation that doesn’t result in a fixed set of users (for example, a single query run) who belong to the segment, but rather a segment from which users can join and drop of based on certain conditions.
When should you use Decay Curve amortization?
Decay curve amortization is a more involved amortization technique than straight lines. It requires knowledge of how the economic value of the asset is derived. So only warrants use if there are significant differences between how the economic value is derived and straight-line amortization.
This is the case for licensing agreements for movies, for instance. In the movie licensing industry, there are different windows in which you can license a title. The 1st run window is typically open after six months in the Theatre, 2nd run window is between 1 and 2 years after theater, and the movie is considered a library movie after that.
When looking at the licensing deals across these windows, there is a disproportionate cost difference between the cost of a first-run window to that of the 2nd run of the library window. It is not rare for a movie in the first run window to cost at least 2x more than on a second run window, and a similar type of difference exists between the licensing costs for 2nd run and library titles.
For licenses spanning over a multitude of years, it would not be fair to amortize using the straight line. An amortization using a decay curve would more closely match the economic value of the titles across these different windows.
How to implement it
There are different methods to implement a decay curve amortization.
One is through first applying some statistical technique for curve fitting a reverse exponential curve-like function. Once fitted, it is possible to use good old calculus to integrate it over the useful lifetime horizon and get the share of useful life contained within a time interval. This method is, in general, a bit involved and requires a high degree of oversight, making it challenging to apply in an automated way.
This is usually why it is common to directly leverage historical values to generate a cumulative share of revenue/cost by time period that can be directly applied without too much supervision.
An example implementation of this type of amortization is provided in the query below:
In the example above, the tmp_data CTE contains the historical information related to similar assets output by day. We recenter these days to the start days of the asset in the data_amortization CTE and create a share of output for each day based on it.
The tmp_asset CTE contains the different assets to be used and their amortization periods (start_date and end_date), while periods CTE provide us a list of days contains in the asset applicable periods. In the assets_period, we combine tmp_asset and periods CTE to only consider the different days within the amortization periods of each asset. It also recenters each amortization day w.r.t to the starting period of the asset.
The final output joins the assets_period CTE and the data_amortization period together. It leverages the amortization rate previously computed in data_amortization, but not before making some re-adjustment. For example, asset id 2 has an amortization period shorter than the one contained within the dataset. We are therefore required to rescale our amortization rate to take into account this shorter lifetime.
Besides directly leveraging actual values, it is possible to leverage an amortization formula, as previously mentioned. A typical formula used is the exponential decay formula.
An asset can generate some output after its “useful life.” It can also be resold for scraps, for instance. This is called the “residual” or “salvage” value.
We generally treat residual value separately from the rest of the asset amortization.
What is allocation?
Allocation provides a spread of costs or revenues across different categories. In contrast to amortization, it does not deal with a time component but rather a more specific area such as, for example, a product category.
What is the purpose of allocation?
The purpose of the allocation is to provide a better view of the share of cost or revenue that can be ascribed to that specific area to give a better view of revenues and cost drivers.
The end goal is to understand factors such as break-even point better, cost of production, and the different trade-offs encountered.
Where is allocation used?
In managerial accounting, approaches such as activity-based costing leverage allocation methods to provide a complete view of the cost of production of individual items.
In more real-world terms, this can relate to, for instance, a factory producing certain goods that might require the cost of an assembly line to be spread to different items; it can also relate to cases such as transportation where the cost of the vehicle and driver needs to be allocated to the various items being transported.
How to implement
Implementing a cost or revenue allocation first starts by defining what is called an “allocation key.” The allocation key is a measure that will use to apportion the cost/revenue across the different subcategories.
Take, for instance, the shipment cost for a given order. From the perspective of getting a view on the profitability of certain products, we would like to understand the overall cost drivers for the different product lines. We want to get a view of the various cost components at the product level. However, the costs are incurred at the shipment level, and since a shipment can contain multiple items, to be able to get this view, the shipment cost needs to be allocated at the product level.
An example query is provided, for instance, below, that shows how to do an order-line allocation by item quantity.
How to leverage allocation in data models?
Some data methodology like Kimball leverage extensively allocation methods to build datasets. In Kimball, these datasets are called “Allocated Facts.” One of the main reasons to leverage allocation in these types of datasets is to provide the ability to slice, dice, and roll up the dataset across different dimensions, providing added flexibility.
Allocation and amortization techniques can be quite helpful in data engineering to enhance the access of data. Leveraging supporting data modeling techniques such as Period Snapshots or Allocated Facts table allows for an easier way to consume the data for financial decision-making.