Aggravating Differences with Aggregate Functions

Aggregate Functions are the lifeblood of data analysis. However, aggregate functions can often be a little confusing. Things like variable field references, single constant values, or calculating empty values can all cause confusion if you are not familiar with the nuances of aggregate functions.

For instance, documentation for the Case function says:

Documentation for the Average function says:

Both of these descriptions state that parameters in braces { } are optional. However, in the case of Average (like other aggregate functions), those parameters are only optional if you are using a field(s) as a reference. You can write an average function with a series of static numbers, field references, and/or variables and it will calculate just fine.

Average ( Field1 ; $Variable1 ; 6 ) 

Will return 5 if Field1 contains 4 and $Variable1 contains 5

However, when you try to pass the Average function a set of numbers with a single variable, the result should return an error: Field is missing. So long as you pass an aggregate function a single parameter, the function will assume you are giving it a field reference and attempt to evaluate it as such. When using aggregate functions, it is best not to gather values into a single field or variable. If you would rather not separate your values, writing a calculation to manually aggregate your list will probably be the most effective way, a little more on that later.

If a related field is our only method of passing multiple values to an aggregate function with a single parameter, then what if we use our variable to pass the function a field reference? You may be writing a script for use in multiple areas to average a variety of fields, so rather than hard coding a field reference it might be more efficient to pass a variable with a reference instead. This is a viable solution but we’re still missing a piece, as we have learned, any single parameter will be evaluated as a field reference, not its contents, so this will still return an error.

I was searching for answers while looking at the GetFieldName Function. I had hoped getfieldname would read my variable as a reference and pass the same reference to my aggregate function. This still did not work, unfortunately, as the getfieldname function also expects a field reference. While looking at the example documentation on the getfieldname function I found the solution:

Our answer is the Evaluate Function. Not only will this allow a getfieldname function to read our variable as a field reference, it will also allow our aggregate functions to read a field reference as well.

The calculation we need is written like this:

Average ( Evaluate ( $Variable1 ) )

Will return the average of Field1 if $Variable1 is set to “TableName::Field1”

If you were looking for a way to use lists with aggregate functions, the Evaluate function will also be invaluable to you. It can be used to parse your list with a Substitute function before applying the aggregate calculation manually.

Evaluate ( Substitute ( $Variable1 ; “❡” ; “+” ) ) / ValueCount ( $Variable1 )

Will return the average of $Variable1 if it contains a return delimited list of values

Let ( $var1 = SortValues ( $Variable1 ; -2 ) ; GetValue ( $var1 ; 1 ) )

Will return the maximum value of $Variable1 ( change the “-2” to positive for the minimum)

Each aggregate function will require a different calculation to work with lists, which can even be found as custom functions similar to these. However I think it is best to try not to use lists when aggregating data if possible.

This is not the only difference aggregate functions have when used in calculations. When defining a calculation field, there is an option at the bottom to “Do not evaluate if all reference fields are empty.”

A useful option to keep data accurate and not waste time calculating fields that don’t need to be, this option reveals another difference in aggregate functions that may cause problems depending on how they are used.

Aggregate functions will ignore any null values, empty fields or otherwise, and evaluate without counting them. This can be good when averaging values, since an empty value that is counted could skew the results, and as long as all of the parameters you’ve given the function are empty, the calculation will not evaluate. However, if you are using multiple sources in your calculation, your aggregate function will evaluate as long as any one of those sources contains a value.

Min ( Field1 ; 100 )

Will return 100 even when Field1 is empty

There is an inconsistency with even this, however, since newly created records will not calculate while all reference fields are empty, as if the setting were functioning properly. The problem only shows when implementing a new calculation into an existing solution, where existing records will evaluate the calculation as if the reference fields were null values.

Depending on the aggregate function you are using and the purpose you are using it for, this can have different solutions that may work better for you. The most widely usable solution, however, would be to work in a manual IsEmpty check for your reference fields.

If ( IsEmpty ( Field1 ) ; “” ; Min ( Field1 ; 100 ) )

Will return nothing if Field1 is empty or whichever is lower between the value in Field1 and 100

If you have any questions about this topic or have issues implementing the functions in this post, leave a comment below or contact us.

 

Leave a Reply

Your email address will not be published. Required fields are marked *