Derived time series

Sometimes it is useful to compute time series based on the data in other time series. For example, let’s say you have a series that measures temperature in degrees Celsius. Let’s call it A. You can create a derived time series that shows the same values converted to Fahrenheit, by using this formula:

A * 1.8 + 32

The resulting derived time series can be graphed and exported just like any other time series.

The data points in a derived time series are computed on demand. This means that any changes to the data of the source series are instantly reflected in the derived series. It also means that data points in derived time series do not count towards your storage limit.

Introduction to 28times formulas

28times formulas are designed for making transformations to time series. In the formula A * 1.8 + 32, the symbol A stands for the source time series, consisting of temperature values paired with timestamps. Two transformations are applied to this source time series:

  1. A * 1.8 results in a new ephemeral series with the same timestamps as A, but with each value multiplied by 1.8.
  2. Addition is applied to the ephemeral series, resulting in a new time series, where each value is now in Fahrenheit.
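The two steps can be sketched in Python. This is only an illustration of the semantics, not how 28times evaluates formulas: the dict-based series model and the helper names multiply and add are assumptions made for this example, and the offset used is the standard Celsius-to-Fahrenheit offset of 32.

```python
from datetime import datetime

# Hypothetical model for this sketch: a series is a dict of timestamp -> value.
A = {
    datetime(2020, 3, 14, 8, 0): 0.0,
    datetime(2020, 3, 14, 9, 0): 10.0,
}

def multiply(series, factor):
    """Step 1: same timestamps, each value multiplied by factor."""
    return {t: v * factor for t, v in series.items()}

def add(series, constant):
    """Step 2: same timestamps, constant added to each value."""
    return {t: v + constant for t, v in series.items()}

ephemeral = multiply(A, 1.8)      # intermediate series, never stored
fahrenheit = add(ephemeral, 32)   # final derived series
```

Both steps keep the set of timestamps unchanged; only the values differ.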

Several other transformations are possible. Here are some examples:

  • A - B: each value from time series B is subtracted from the corresponding value in time series A. Corresponding data points have the same point in time in both A and B. The result is a new time series with all timestamps that A and B have in common.
  • sqrt(A): computes the square root of each value in A.
  • from_earlier(A, 30min): each point in time in the resulting series receives the value from 30 minutes earlier in the source series. The resulting series has the same number of data points as the source and the same values, but a new set of timestamps, shifted by 30 minutes towards the future. For example, in from_earlier(A, 30min), the value at the time 9:30 on a given date is the value at 9:00 in A. This function is useful for computing the change in value during a given time frame: the derived series A - from_earlier(A, 1h) shows how much the metric A changed during the preceding hour at any given point in time.
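The way A - B pairs up data points can be sketched as follows. The dict-based series model and the helper name subtract are assumptions for illustration only.

```python
from datetime import datetime

# Hypothetical model: a series is a dict of timestamp -> value.
A = {
    datetime(2020, 3, 14, 8, 0): 5.0,
    datetime(2020, 3, 14, 9, 0): 7.0,
    datetime(2020, 3, 14, 10, 0): 9.0,
}
B = {
    datetime(2020, 3, 14, 9, 0): 1.0,
    datetime(2020, 3, 14, 10, 0): 2.0,
    datetime(2020, 3, 14, 11, 0): 3.0,
}

def subtract(a, b):
    """Keep only timestamps present in both series and subtract the values."""
    return {t: a[t] - b[t] for t in sorted(a) if t in b}

difference = subtract(A, B)
```

In this sketch, difference has data points at 9:00 and 10:00 only, because 8:00 is missing from B and 11:00 is missing from A.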

The binary operators + (addition), - (subtraction), * (multiplication), / (division) and ^ (exponentiation) are supported. Unary - and + are also supported.

Mixing input series with different time zones in a formula is possible. Each data point has a point in time that is independent of the time zone. The same point in time has different textual timestamp representations in different time zones. For example, 2020-01-30 20:00 in America/New_York is equal to 2020-01-31 10:00 in Asia/Tokyo. Conversely, the same textual timestamp representation refers to different points in time in different time zones.
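The example can be checked with a short Python sketch. To stay self-contained it uses fixed UTC offsets (-5 hours for New York and +9 hours for Tokyo, which is an assumption that holds for these January dates) instead of a full time zone database.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets valid on the dates used here (assumption for this sketch).
new_york = timezone(timedelta(hours=-5))
tokyo = timezone(timedelta(hours=9))

# Different textual timestamps, same point in time.
t_new_york = datetime(2020, 1, 30, 20, 0, tzinfo=new_york)
t_tokyo = datetime(2020, 1, 31, 10, 0, tzinfo=tokyo)
same_instant = t_new_york == t_tokyo

# Same textual timestamp, different points in time.
text_in_new_york = datetime(2020, 1, 30, 20, 0, tzinfo=new_york)
text_in_tokyo = datetime(2020, 1, 30, 20, 0, tzinfo=tokyo)
gap = text_in_new_york - text_in_tokyo
```

Here same_instant is True, while gap is 14 hours: the text "2020-01-30 20:00" occurs in Tokyo 14 hours before it occurs in New York.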

Functions in 28times formulas

The following functions can be applied to numbers or series:

  • sqrt(…): The square root. Equivalent to … ^ (1/2).
  • log10(…): Base-10 logarithm.
  • ln(…): Natural logarithm.

The following functions take a series as input and return a new series.

previous(…)

Each point in time in the resulting series receives the value from the immediately preceding data point in the source series. For example, A - previous(A) computes the difference between each value and the one that comes before it. previous(A) does not have a data point at the earliest timestamp of A.

next(…)

Each point in time in the resulting series receives the value from the immediately following data point in the source series. next(A) does not have a data point at the latest timestamp of A.
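A minimal Python sketch of both functions, assuming a series is modelled as a dict of timestamp to value. The names previous_series and next_series are hypothetical stand-ins for previous(…) and next(…).

```python
from datetime import datetime

A = {
    datetime(2020, 3, 14, 8, 0): 10.0,
    datetime(2020, 3, 14, 9, 0): 13.0,
    datetime(2020, 3, 14, 11, 0): 19.0,
}

def previous_series(series):
    """Each timestamp gets the value of the data point just before it."""
    times = sorted(series)
    return {t: series[before] for before, t in zip(times, times[1:])}

def next_series(series):
    """Each timestamp gets the value of the data point just after it."""
    times = sorted(series)
    return {t: series[after] for t, after in zip(times, times[1:])}

prev_a = previous_series(A)
next_a = next_series(A)

# A - previous(A): change since the preceding data point.
change = {t: A[t] - prev_a[t] for t in prev_a}
```

Note that the lookup is by position, not by elapsed time: the data point at 11:00 receives the value from 9:00 even though two hours lie between them.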

from_earlier(…, <duration>) and from_later(…, <duration>)

Look up values at an earlier/later point in time. The first argument must be a time series. The result is a time series with a new set of timestamps.

<duration> is a combination of a number and a unit, for example 30 min or 1 h. The number must be an integer. Valid units are: s (seconds), min (minutes), h (hours), d (days), w (weeks), mo (months), y (years).

Note that these functions are named from the perspective of the resulting time series: in the result of from_earlier, timestamps get the value that is associated with an earlier timestamp in the input series. In the result of from_later, timestamps get the value that is associated with a later timestamp in the input series.

For example, a time series with value 1 at 8:00 and 2 at 9:00 on some day would be the result of a from_earlier(A, 1h) operation where A has value 1 at 7:00 and 2 at 8:00.
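The same example as a Python sketch. It covers only durations of a fixed length (seconds, minutes, hours), and the dict-based series model and the function bodies are assumptions for illustration.

```python
from datetime import datetime, timedelta

# Hypothetical model: a series is a dict of timestamp -> value.
A = {
    datetime(2020, 3, 14, 7, 0): 1,
    datetime(2020, 3, 14, 8, 0): 2,
}

def from_earlier(series, duration):
    """Result at time t holds the input value from t - duration."""
    return {t + duration: v for t, v in series.items()}

def from_later(series, duration):
    """Result at time t holds the input value from t + duration."""
    return {t - duration: v for t, v in series.items()}

shifted = from_earlier(A, timedelta(hours=1))
round_trip = from_later(shifted, timedelta(hours=1))
```

In this sketch, shifted has value 1 at 8:00 and 2 at 9:00, and applying from_later with the same duration restores the original timestamps.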

For a duration measured in seconds, the resulting time series has timestamps which are set apart from the input by exactly that number of seconds. A duration of 1 min is exactly the same as 60 s, and 1 h is exactly the same as 60 min or 3600 s. Time zones do not affect the calculations for seconds, minutes, and hours, because the calculations are done in terms of the actual number of seconds passed. This means that jumps due to daylight saving time are correctly taken into account: in places that observe daylight saving time, 4:00am is only 2 hours after 1:00am on the night the clocks move forward.

When the time unit is days or longer, the resulting time series takes on the value from the same time of day, but on a different date. For example, the timestamp 2020-01-04 00:00 is 1 day after 2020-01-03 00:00. In most cases, this results in a difference of exactly 24 hours, but on rare occasions, it does not. For example, daylight saving time results in one day each year having 23 hours, and another having 25 hours. We use the time zone of the derived time series to account for this.

If the time zone of the derived series is UTC, a duration of one day is exactly the same as 24 hours.
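The difference between elapsed hours and calendar days can be illustrated in Python. The sketch assumes the US switch to daylight saving time on 2020-03-08 and models it with two fixed offsets (UTC-5 before the switch, UTC-4 after) rather than a time zone database.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets around the 2020-03-08 switch in New York (assumption).
EST = timezone(timedelta(hours=-5))
EDT = timezone(timedelta(hours=-4))
UTC = timezone.utc

# Wall clock shows 1:00am, later 4:00am, but only 2 hours have passed.
short_night = (datetime(2020, 3, 8, 4, 0, tzinfo=EDT)
               - datetime(2020, 3, 8, 1, 0, tzinfo=EST))

# Midnight to midnight across the switch: a calendar day of 23 hours.
dst_day = (datetime(2020, 3, 9, 0, 0, tzinfo=EDT)
           - datetime(2020, 3, 8, 0, 0, tzinfo=EST))

# The same calendar day in UTC is always 24 hours long.
utc_day = (datetime(2020, 3, 9, 0, 0, tzinfo=UTC)
           - datetime(2020, 3, 8, 0, 0, tzinfo=UTC))
```

So a shift of 1 d and a shift of 24 h can land on different points in time in a zone with daylight saving time, but never in UTC.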

A duration of one week is exactly the same as 7 days.

A duration of one or more months works analogously to a duration measured in days. For example, moving 2020-01-30 15:00 forward by one month in UTC results in 2020-02-29 23:59:59, because February 2020 has no 30th day. (A duration of 1 month may be more useful when data points are always associated with the first day of a month, for example. If you have data for every day, consider a duration of 30 days instead of 1 month.)

A duration of one year is exactly the same as 12 months.

bucket(…, <interval>, <aggregation>)

The bucket() function is an experimental feature that may be changed or replaced by a different function in the future.

The bucket function aggregates the data points that fall within each time interval into a single data point. The first argument must be a time series.

Possible intervals are:

  • hourly
  • daily
  • weekly_sun
  • weekly_mon
  • weekly_tue
  • weekly_wed
  • weekly_thu
  • weekly_fri
  • weekly_sat
  • monthly
  • yearly

With bucket, all data points falling within a time interval are aggregated and the resulting value is associated with the timestamp at the start of the interval. The derived series’ time zone is taken into account for this. Hourly intervals result in timestamps with minutes and seconds set to 0. For example, in a time series with data points at 2020-03-14 08:15 and 2020-03-14 08:45, the resulting time series with interval hourly will aggregate the two values to a single data point at 2020-03-14 08:00. With interval daily, the resulting data point is at 2020-03-14, which is equal to 2020-03-14 00:00.

Weekly intervals result in timestamps at the start of the day of the week in the interval name. For example, weekly_mon aggregates values from Monday 00:00:00 to Sunday 23:59:59.

Monthly intervals start on the first day of each month at midnight. Yearly intervals start on January 1st at midnight.

Possible aggregations are:

  • mean: arithmetic mean of all values
  • median: the median of all values
  • max: maximum, the highest value
  • min: minimum, the lowest value
  • count: number of values
  • first: the earliest value
  • last: the latest value
  • sum: the sum total
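The bucketing rule can be sketched in Python for the hourly and daily intervals. This is an illustration under assumptions, not the 28times implementation: the series is a dict of naive timestamps to values, time zones are ignored, and the helper names interval_start and bucket_series are hypothetical.

```python
from datetime import datetime
from statistics import mean, median

AGGREGATIONS = {
    "mean": mean,
    "median": median,
    "max": max,
    "min": min,
    "count": len,
    "first": lambda values: values[0],
    "last": lambda values: values[-1],
    "sum": sum,
}

def interval_start(t, interval):
    """Timestamp at the start of the interval that contains t."""
    if interval == "hourly":
        return t.replace(minute=0, second=0, microsecond=0)
    if interval == "daily":
        return t.replace(hour=0, minute=0, second=0, microsecond=0)
    raise ValueError(f"interval not covered by this sketch: {interval}")

def bucket_series(series, interval, aggregation):
    """Group values by interval start (in time order), then aggregate."""
    groups = {}
    for t in sorted(series):
        groups.setdefault(interval_start(t, interval), []).append(series[t])
    aggregate = AGGREGATIONS[aggregation]
    return {start: aggregate(values) for start, values in groups.items()}

A = {
    datetime(2020, 3, 14, 8, 15): 10.0,
    datetime(2020, 3, 14, 8, 45): 20.0,
    datetime(2020, 3, 14, 9, 30): 40.0,
}
hourly_mean = bucket_series(A, "hourly", "mean")
daily_count = bucket_series(A, "daily", "count")
```

In this sketch, the two values at 08:15 and 08:45 collapse into one data point at 08:00, and all three values collapse into one data point at 2020-03-14 00:00 for the daily interval.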

Constants in 28times formulas

  • pi: the circle ratio π, 3.14159…
  • e: Euler’s number, the base of the natural logarithm, 2.71828…

Details

Values

All values are IEEE-754 64-bit floating point numbers, also known as doubles. Values resulting from a calculation can be inf or NaN, for instance, as the result of a division by zero.
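How inf and NaN arise and behave can be sketched in Python, whose floats are also IEEE-754 doubles. One caveat: Python raises ZeroDivisionError on division by zero, so the hypothetical helper ieee_divide below emulates the IEEE-754 result instead. Which of these cases 28times produces in detail is an assumption here.

```python
import math

def ieee_divide(x, y):
    """Divide two floats, returning inf or NaN where IEEE-754 would."""
    try:
        return x / y
    except ZeroDivisionError:
        if x == 0 or math.isnan(x):
            return math.nan  # 0/0 has no meaningful value
        # Nonzero divided by zero: infinity with the combined sign.
        return math.copysign(math.inf, x) * math.copysign(1.0, y)

positive = ieee_divide(1.0, 0.0)    # inf
negative = ieee_divide(-1.0, 0.0)   # -inf
undefined = ieee_divide(0.0, 0.0)   # NaN

# NaN propagates through arithmetic and is not equal to itself.
still_nan = undefined + 1.0
```

Because NaN compares unequal to everything, including itself, use a check such as math.isnan rather than == when post-processing exported values.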

Order of operations

In the absence of parentheses, the operator precedence is as follows, with the highest precedence at the top.

  • ^ (exponentiation) and unary operators - and +.
  • * and /.
  • Binary operators + and -.

All binary operators except ^ are left-associative, as usual. ^ is non-associative, meaning that a chain of ^ operators, such as 3^4^5, is invalid and needs parentheses to clarify the order of operations. A formula like -3^2 is also invalid and needs clarifying parentheses: (-3)^2 or -(3^2).
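The reason for the mandatory parentheses is that each grouping gives a different result. The Python sketch below spells out the groupings explicitly, with ** standing in for ^. Python itself accepts the unparenthesized forms and picks one reading, which is exactly the ambiguity the 28times rule avoids.

```python
# Chained exponentiation: the two groupings disagree.
grouped_left = (2 ** 3) ** 2     # 8 squared = 64
grouped_right = 2 ** (3 ** 2)    # 2 to the 9th = 512

# Unary minus combined with exponentiation: again two readings.
square_of_negative = (-3) ** 2   # 9
negated_square = -(3 ** 2)       # -9

# Left-associativity of the other operators: 10 - 4 - 3 means (10 - 4) - 3.
left_to_right = (10 - 4) - 3     # 3
right_to_left = 10 - (4 - 3)     # 9, not what 10 - 4 - 3 means
```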