Analytics: Synthetic Variables Basics

This article explains how and when to use synthetic variables in Ubidots' analytics engine to compute complex equations.

Written by David Sepúlveda
Updated over a week ago

Ubidots' analytics engine supports a complex mathematical computation tool called synthetic variables. In simple words, a variable is any raw data within a device in Ubidots and a synthetic variable is the result of the computation of raw variables.

This tool allows you to extend your application's functionality. For example, if you develop a temperature logger that reads the variable in °C and you wish to show the data in both °C and °F, you would retrieve the sensor reading and send two values to Ubidots: one value in °C and another one in °F. This adds an unnecessary compute load to your microcontroller. With Ubidots' analytics engine, you only need to send the raw value in °C and let Ubidots perform the calculation that converts it to °F, relieving the microcontroller of the extra processing.

Synthetic computation example: From Celsius to Fahrenheit.

Here, you will learn the basics about synthetic variables and the available mathematical and statistical functions you can implement with this tool.

IMPORTANT NOTE: The synthetic variables engine's computational speed is heavily influenced by the complexity of the synthetic expression, which results in calculation times ranging from a few seconds to a few minutes, or even an hour. If your synthetic variable is tied to a logic that is critical to your business, we recommend using another approach to compute it, such as an UbiFunction.

## 1. Creating a synthetic variable

Ubidots stores dots that come from your devices as default (raw) variables and these data have corresponding timestamps that organize each of them into a time series list, using the following sequence:

`values={[value 1, timestamp 1], [value 2, timestamp 2],... [value n, timestamp n]}`

With Ubidots' analytics engine you can apply different operations to the time series data set to create a parallel data set containing computed variables; these new variables are called synthetic variables. To create one, click on the "+ add variable" button, or hover over the "+" button, within your device and click on "synthetic variable".

## 3. Mathematical expressions

A synthetic variable consists of a math operation applied to the whole time series:

`Raw values={[value 1, timestamp 1], [value 2, timestamp 2],... [value n, timestamp n]}`

`Square root values={[√value 1, timestamp 1], [√value 2, timestamp 2],... [√value n, timestamp n]}`

In the above example, a square root expression is applied to the time series data.
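To make the element-wise idea concrete, here is a minimal Python sketch that mimics the transformation outside Ubidots. It is not the engine's implementation; the sample values, the millisecond timestamps, and the helper name `apply_elementwise` are assumptions made for illustration.

```python
import math

# Hypothetical raw series: (value, timestamp in ms) pairs.
raw = [(4.0, 1700000000000), (9.0, 1700000060000), (16.0, 1700000120000)]


def apply_elementwise(series, fn):
    """Apply fn to every value, keeping each dot's original timestamp."""
    return [(fn(value), ts) for value, ts in series]


sqrt_series = apply_elementwise(raw, math.sqrt)
print(sqrt_series)
```

Each output dot keeps the timestamp of the raw dot it was computed from, which is the property the sequences above describe.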

In the following table, find the list of supported mathematical expressions:

| Syntax | Description |
| --- | --- |
| `ceil(x)` | Returns the smallest integer greater than or equal to each element in the variable `x`. The ceil function always rounds up to the nearest integer. |
| `floor(x)` | Returns the floor of each element in the variable `x`, that is, the largest integer less than or equal to it. |
| `round(x, n steps)` | Returns the floating-point value rounded to "n" digits after the decimal point. |
| `sin(x)` | Returns the sine of each element in the variable `x`, with `x` in radians. |
| `cos(x)` | Returns the cosine of each element in the variable `x`, with `x` in radians. |
| `tan(x)` | Returns the tangent of each element in the variable `x`. |
| `arcsin(x)` | Returns in radians the inverse sine of each element in the variable `x`. |
| `arccos(x)` | Returns in radians the inverse cosine of each element in the variable `x`. |
| `arctan(x)` | Returns in radians the inverse tangent of each element in the variable `x`. |
| `arctan2(x, y)` | Returns in radians the trigonometric inverse tangent using the input variables `x` and `y` as Cartesian coordinates. Note: It will only perform the operation between values with the same timestamp. |
| `sinh(x)` | Returns the hyperbolic sine of each element in the variable `x`. |
| `cosh(x)` | Returns the hyperbolic cosine of each element in the variable `x`. |
| `tanh(x)` | Returns the hyperbolic tangent of each element in the variable `x`. |
| `arcsinh(x)` | Returns in radians the inverse hyperbolic sine of each element in the variable `x`. |
| `arccosh(x)` | Returns in radians the inverse hyperbolic cosine of each element in the variable `x`. |
| `arctanh(x)` | Returns in radians the inverse hyperbolic tangent of each element in the variable `x`. |
| `exp(x)` | Returns the exponential of each element in the variable `x`. |
| `log(x, base)` | Returns the logarithm of each element in the variable `x`. By default, the base is Euler's number. |
| `abs(x)` | Returns the absolute value of each element in the variable `x`. |
| `sqrt(x)` | Returns the square root of each element in the variable `x`. |

Standard arithmetic operations and mathematical constants work just fine too:

• Addition: +

• Subtraction: -

• Division: /

• Multiplication: *

• Exponentiation: **

• Modulo: %

• π : pi

• Euler's number: e

Example:

Converting a temperature value from °C to °F:

`F = ((9 / 5) * variable) + 32`

The synthetic editor will look as follows:

## 4. Data range expressions

Ubidots allows you to create new variables from your time series based on date range data; for example, calculate average temperature per hour, or day, based on your sensor's readings using a synthetic variable.

Below you can find the commonly used data range functions and formula structure:

| Syntax | Description |
| --- | --- |
| `max(x, "range")` | Returns the maximum value of the variable `x` in the specified time range. |
| `min(x, "range")` | Returns the minimum value of the variable `x` in the specified time range. |
| `mean(x, "range")` | Returns the mean value of the variable `x` in the specified time range. |
| `std(x, "range")` | Returns the standard deviation of the variable `x` in the specified time range. |
| `median(x, "range")` | Returns the median value of the variable `x` in the specified time range. |
| `count(x, "range")` | Returns the number of dots stored in the variable `x` for the specified time range. |
| `last(x, "range")` | Returns the last value of the variable `x` in the specified time range. |
| `first(x, "range")` | Returns the first value of the variable `x` in the specified time range. |
| `sum(x, "range")` | Returns the sum of the values of the dots stored in the variable `x` in the specified time range. |

Available time ranges:

• "nT": Every n minutes.

• "nH": Every n hours.

• "nD": Every n days.

• "W": Every end of week.

• "M": Every end of month.

IMPORTANT NOTE: The selected range should be set in a way that evenly divides the next range. For example, if using minutes ("T"), whatever the number is, it has to evenly divide an hour ("H"). Under such example, the available values for minutes are: 1, 2, 3, 4, 5, 6, 10, 12, 15, 20, 30. Other values may render unexpected results. The same applies to other ranges.
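The divisibility rule in the note can be checked with a few lines of Python. This is only an illustrative sketch: the helper name `valid_steps` is hypothetical, and applying the same rule to hours within a day is an assumption drawn from the statement that the rule applies to other ranges.

```python
def valid_steps(units_in_next_range):
    """Return every n below units_in_next_range that divides it evenly."""
    return [n for n in range(1, units_in_next_range)
            if units_in_next_range % n == 0]


minute_steps = valid_steps(60)  # minutes ("T") must evenly divide an hour
hour_steps = valid_steps(24)    # assumption: hours ("H") must evenly divide a day

print(minute_steps)
print(hour_steps)
```

For minutes, the list matches the values given in the note.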

Example:

The average temperature every 10 minutes in °F:
`A = mean( ((9 / 5) * variable) + 32, "10T" )`

The expression would look as follows:

`T = ((9/5) * <variable>) + 32`

`mean(T, "10T")`
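The following Python sketch imitates a 10-minute mean over converted values, to show how dots are grouped into fixed windows. It is an approximation under stated assumptions, not the engine's code: the sample data are invented, windows are assumed to be aligned to multiples of 10 minutes, and labelling each result with the start of its window is an arbitrary choice made here.

```python
from collections import defaultdict

WINDOW_MS = 10 * 60 * 1000  # "10T": ten minutes in milliseconds


def mean_per_window(series, window_ms):
    """Group (value, timestamp) dots into fixed windows and average each one."""
    buckets = defaultdict(list)
    for value, ts in series:
        bucket_start = ts - (ts % window_ms)  # assumed window alignment
        buckets[bucket_start].append(value)
    return [(sum(vals) / len(vals), start)
            for start, vals in sorted(buckets.items())]


# Hypothetical readings in °C at 0, 5 and 10 minutes.
celsius = [(20.0, 0), (30.0, 300_000), (25.0, 600_000)]
fahrenheit = [((9 / 5) * value + 32, ts) for value, ts in celsius]

means = mean_per_window(fahrenheit, WINDOW_MS)
print(means)
```

The first two readings fall in the same window and are averaged together; the third opens a new window.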

Example:

Every data range starts its period at 00:00:00. However, some applications need a different starting point, such as 02:00:00 or 00:40:00, depending on the input data range. To apply an offset, the above functions accept a third parameter called offset, as follows:

`A = sum(variable, "8H", offset=6)`

The above example corresponds to the sum of the variable computed every 8 hours, offset by 6 hours (beyond 00:00:00), that is, 06:00:00. Accordingly, the synthetic variable will be run at 6:00, 14:00, and 22:00 daily.
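The schedule arithmetic can be reproduced with a short Python sketch. The function name `run_hours` is hypothetical, and the sketch assumes the period divides 24 hours evenly.

```python
def run_hours(period_hours, offset_hours=0):
    """Hours of the day at which a window boundary falls, given a period and an offset."""
    runs_per_day = 24 // period_hours
    return sorted((offset_hours + k * period_hours) % 24
                  for k in range(runs_per_day))


with_offset = run_hours(8, offset_hours=6)
without_offset = run_hours(8)

print(with_offset)
print(without_offset)
```

With a period of 8 hours and an offset of 6, the boundaries land at 6, 14, and 22, as described above; without the offset they fall at 0, 8, and 16.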

## 5. Rolling expressions

This function returns the computed value of a data series within a time window or a given number of values, using one of the following aggregation methods: "mean", "sum", "min", "max", or "count".

`rolling(variable, aggregation_method, type_range, range, min_periods = 2)`

Example:

Calculate the maximum value of a sample of four data points.
`rolling(variable,"max","values",4)`
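As an offline illustration of a value-based rolling window, the Python sketch below computes a rolling maximum over four data points. It is not Ubidots' implementation. In particular, the handling of `min_periods` (emit no result until at least that many values are available) is an assumption, and the sample readings are invented.

```python
def rolling_values(values, aggregate, window, min_periods=1):
    """Aggregate the last `window` values at each position.

    Positions with fewer than `min_periods` values available yield None
    (assumed behavior, for illustration only).
    """
    results = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        results.append(aggregate(chunk) if len(chunk) >= min_periods else None)
    return results


readings = [3, 7, 2, 9, 4, 1]  # hypothetical sensor values
rolling_max = rolling_values(readings, max, window=4, min_periods=2)
print(rolling_max)
```

Passing `sum`, `min`, or `len` instead of `max` gives the "sum", "min", and "count" variants of the same window.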

There are additional functions for more complex operations:

### 6.1. where ( )

`where(condition, operation_if_true, operation_if_false)`

Computes `operation_if_true` if the condition is true, or `operation_if_false` if the condition is false.

Comparison statements:

• Equal to: ==

• Greater than: >

• Less than: <

• Not equal to: !=

• Greater than or equal to: >=

• Less than or equal to: <=

Logical expressions (useful when setting more than 1 condition):

• And: "and"

• Or: "or"

Examples:

Populates the new synthetic variable with a '1' (true) if the temperature value is greater than 20:

`where({{var}} > 20, 1)`

Populates the synthetic variable with a '1' if the temperature value is greater than 20; if not, fills it with a '0':

`where({{var}} > 20, 1, 0)`

Stores the variable's timestamp if its value is lower than 20 OR greater than 50:

`where({{var}} < 20 or {{var}} > 50, {{var}}.timestamp)`

As you can see, the dot (.) operator allows you to access the timestamp attribute of the variable.
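The three examples can be imitated in Python to see which dots each one would produce. This is a sketch under assumptions, not the engine's logic: the helper below is hypothetical, and it assumes that when no false branch is given, dots that fail the condition produce no output.

```python
def where(series, condition, if_true, if_false=None):
    """Evaluate a condition per dot and emit the matching branch's result."""
    out = []
    for value, ts in series:
        if condition(value):
            out.append((if_true(value, ts), ts))
        elif if_false is not None:
            out.append((if_false(value, ts), ts))
        # Assumption: with no false branch, a failing dot is skipped.
    return out


temps = [(18, 1000), (25, 2000), (55, 3000)]  # hypothetical (value, timestamp) dots

only_true = where(temps, lambda v: v > 20, lambda v, ts: 1)
flags = where(temps, lambda v: v > 20, lambda v, ts: 1, lambda v, ts: 0)
stamps = where(temps, lambda v: v < 20 or v > 50, lambda v, ts: ts)

print(only_true)
print(flags)
print(stamps)
```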

### 6.2. diff ( )

This function returns the difference between elements of a time series that are separated by a specified number of steps, starting at the last element.

`diff(x, step)`
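A minimal Python sketch of a stepped difference follows. The direction of the subtraction (each value minus the value `step` positions earlier) is an assumption borrowed from common data-analysis conventions; the article does not spell it out.

```python
def diff(values, step=1):
    """Difference between each value and the one `step` positions before it (assumed direction)."""
    return [values[i] - values[i - step] for i in range(step, len(values))]


readings = [10, 12, 15, 19]  # hypothetical values in time order
step_one = diff(readings, 1)
step_two = diff(readings, 2)

print(step_one)
print(step_two)
```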

### 6.3. shift ( )

This function returns the variable's values in the time series shifted by the given number of steps.

`shift(x,step)`
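The idea of shifting can be sketched in Python as follows. This assumes that a positive step moves each value to a later timestamp and that positions left without a value are dropped; both points are assumptions made for illustration rather than documented engine behavior.

```python
def shift(series, step=1):
    """Pair each timestamp with the value found `step` positions earlier (assumed semantics)."""
    values = [value for value, _ in series]
    timestamps = [ts for _, ts in series]
    return [(values[i - step], timestamps[i]) for i in range(step, len(series))]


series = [(10, 1000), (12, 2000), (15, 3000)]  # hypothetical dots
shifted = shift(series, 1)
print(shifted)
```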

### 6.4. cumsum ( )

This function returns the cumulative sum of a time series.

`cumsum(x)`
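For reference, a cumulative sum over a time series can be reproduced in Python with `itertools.accumulate`. The sample dots are invented, and the sketch only mirrors the arithmetic, not how the engine stores results.

```python
from itertools import accumulate

series = [(1, 1000), (2, 2000), (3, 3000)]  # hypothetical (value, timestamp) dots

running_totals = accumulate(value for value, _ in series)
cumulative = [(total, ts) for total, (_, ts) in zip(running_totals, series)]
print(cumulative)
```

Each dot carries the sum of its own value and every value before it.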

### 6.5. fill_missing ( )

`fill_missing(x)`

Computes an expression containing multiple variables with different timestamps. Where a variable has no value at a given timestamp, its last value is used instead.

Example:

`fill_missing(3 * var1_id + var2_id)`
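The gap-filling behavior can be approximated in Python with a forward fill over the union of timestamps. This is a sketch under assumptions: the helper is hypothetical, and it assumes no result is produced until every variable has reported at least one value.

```python
def fill_missing(series_by_name, expression):
    """Evaluate `expression` at every timestamp, reusing each variable's last value."""
    lookup = {name: {ts: value for value, ts in series}
              for name, series in series_by_name.items()}
    timestamps = sorted({ts for values in lookup.values() for ts in values})

    last_seen = {}
    results = []
    for ts in timestamps:
        for name, values in lookup.items():
            if ts in values:
                last_seen[name] = values[ts]
        # Assumption: skip timestamps before every variable has a value.
        if len(last_seen) == len(lookup):
            results.append((expression(**last_seen), ts))
    return results


var1 = [(1, 1000), (2, 3000)]    # hypothetical dots; no value at t=2000
var2 = [(10, 1000), (20, 2000)]  # hypothetical dots; no value at t=3000

filled = fill_missing({"var1": var1, "var2": var2},
                      lambda var1, var2: 3 * var1 + var2)
print(filled)
```

At t=2000 the sketch reuses the last value of `var1`, and at t=3000 it reuses the last value of `var2`.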

### Obtain the context value of a variable

`{YOUR_VARIABLE}.context.context-key`

Context data can only be used within your synthetic expression if the context is a number.

### Obtain the timestamp of a variable

`{YOUR_VARIABLE}.timestamp`