
The Presto lag
function a window function that returns the value of an offset before the current row in a window. One common use case for the lag
function is with time series analysis, such as autocorrelation.
Figure 1 shows the advert
table of sales and advertising expenditure from Makridakis, Wheelwright and Hyndman (1998) Forecasting: methods and applications, John Wiley & Sons: New York. The advert
column is the monthly advertising expenditure, and the sales
column is the monthly sales volume.
/
A simple analysis could be to track the difference between the current month’s sales volume and the previous one, which is shown in Figure 2. The lag_1_sales
column is a single period lagged value of the sales
column, and the diff
column is the difference between sales
and lag_1_sales
. To generate the table in Figure 2, we can use the lag
function and the following query:
select
advert,
sales,
lag_1_sales,
round(sales - lag_1_sales,2) as diff
from (
select
advert,
sales,
lag(sales, 1) over(range unbounded preceding) as lag_1_sales
from advert
);
The subquery uses the lag
function to get a one period offset preceding value of the sales
column, where the OVER
clause syntax is specifying the window. The main query then computes the diff
column. Here are a couple of additional useful notes about the lag
function:
- You can change the offset with the second argument
lag(x, OFFSET)
, whereOFFSET
is any scalar expression. The current row isOFFSET=1
. - By default, if an offset value is null or outside the specified window, a
NULL
value is used. We can see this in the first row of the table in Figure 2. However, the default value to use in these cases is configurable with an optional third argumentlag(x, OFFSET, DEFAULT_VALUE)
, whereDEFAULT_VALUE
is desired value.
A closely related function is the lead
function returns the value at an offset after the current row.
If you want to get started with Presto easily, check out Ahana Cloud. It’s SaaS for Presto and takes away all the complexities of tuning, management and more. It’s free to try out for 14 days, then it’s pay-as-you-go through the AWS marketplace.