Introducing the Extensible Analytics Engine

Introducing the Extensible Analytics Engine

At GoodData we believe in revolutionizing the way professionals use data to make strategic decisions throughout the workday. We have no shortage of ideas on how to do this, especially with ongoing input from our active community of users. The incremental improvements we push to our platform each week continue to drive GoodData’s evolution, but sometimes it takes a long term investment to take a giant leap forward.

Over a year ago, GoodData launched just such an endeavor to develop new advanced reporting features and performance improvements. It’s no trivial endeavor to dismantle and rebuild the logical machinery underlying a business intelligence platform. But following an immense team effort, we’re proud to introduce our latest innovation: GoodData’s Extensible Analytics Engine.

Extensible Analytics Engine Benefits

Best-in-class business intelligence dashboards demand fully customizable reporting tools, backed by a robust analytical engine for computing complex metrics. GoodData’s new Extensible Analytics Engine (XAE) leverages our proprietary data query language, MAQL, to define  metrics that are independent of any one dimensionality. This allows users to apply metrics in many contexts by modifying report dimensionality (by switching between report-level attributes that determine how data is categorized) without modifying metric definitions.

For advanced users who seek complete control over metric behavior in all contexts, any number of customizations are possible by modifying a metric’s MAQL definition. You can add filters, ranking rules, conditional statements, arithmetic expressions, mathematical functions, and use keywords to determine how a metric is affected by report level filtering and dimensionality. Metrics can even be nested one inside another to build more complex KPIs. With XAE's enhanced logical framework for interpreting complex MAQL expressions, there are more ways to apply custom metric than ever before.

What's New

XAE's innovative architecture allows users to tackle more advanced use cases. New features include:

Beyond its individual features, the beauty of GoodData's XAE becomes apparent as you manipulate data in complex ways without impacting the physical data model. Each project’s logical data model abstracts from the physical data model’s complexities, allowing users to focus on business use cases, rather than on how data is stored and optimized. And with XAE's updated approach for dealing with transformations, carrying out date attribute arithmetic, handling NULL values, and optimizing reporting performance, you're sure to find new ways to make your data Good Data.

Extensible Analytics Engine FAQ

XAE Feature Overview

Square Root Function

SQRT( )

Description

The square root function returns the square root of a number or metric input. Negative or complex inputs result in a NULL output.

Syntax

SELECT SQRT(...)

SELECT SQRT(constant)

SELECT SQRT(metric)

Examples

SELECT SQRT(25) returns the value “5”

SELECT SQRT(33.6) returns the value “5.796550698”

SELECT SQRT(SELECT SUM(Sales)) returns the square root of a Total Sales metric

Exception Reporting

IFNULL

Description

IFNULL allows you to predefine the way GoodData deals with any missing values that are returned by a metric expression (metric, fact aggregation, or arithmetic operation). If a metric expression that is wrapped within an IFNULL statement ever returns a null value, the replacement number specified in the second parameter of the IFNULL function will be inserted in place of the null value.

IFNULL is especially useful in cases where a sub metric is referred to within another metric’s definition. Wrapping the sub metric in an IFNULL statement keeps the encompassing metric from becoming null due to one of its components returning a null value.

Note: Supported replacement values include constants, but not other metric expressions – which would lead to an improper metric definition.

Syntax

SELECT IFNULL(… , …)

SELECT IFNULL(metric_expression , replacement_value)

Examples

SELECT IFNULL(SUM(Amount), 0)

SELECT IFNULL(SUM(FB_Cost + TW_Cost), 0)

SELECT IFNULL(SUM(Amount) + 100, 1)

SELECT IFNULL((SELECT SUM(Payments) WHERE Product IN (Explorer, Educational)), 0)

Conditional Statements

IF THEN ELSE

Description

IF THEN ELSE statements return one of two possible values, or perform one of two possible computations, depending on whether some condition is met.

The defined condition follows the keyword IF and can be constructed using any of the filtering keywords or relational operators (IN, NOT IN, BETWEEN, NOT BETWEEN, =, < , <=, >, >=, <>).

The first possible outcome follows the keyword THEN, and the second possible outcome follows the keyword ELSE. The outcomes that are returned can be numerical values or arithmetic operations. All IF THEN ELSE statements conclude with the END keyword.

Note: If there more than two possible conditions to be defined, use the CASE statement described below.

Syntax

SELECT IF … THEN … ELSE … END

SELECT IF condition THEN number ELSE number END

SELECT IF condition THEN arithmetic_operation ELSE arithmetic_operation END

Examples


SELECT IF SUM(Amount) >= AVG(Amount) THEN 10 ELSE 0 END

SELECT IF SUM(Duration) - AVG(Duration) > 2000  THEN 0 ELSE 1 END

SELECT IF AVG(Probability) > 0.5 THEN SUM(Amount) * 10 ELSE SUM(Amount) / 10 END

Developer article: Conditionals on XAE

CASE

Description

CASE is used for complex conditional statements that contain three or more conditions. Following the CASE keyword, conditions and outcomes follow the keywords WHEN and THEN, respectively. If none of the WHEN conditions are met, the outcome following ELSE is returned. Outcomes can be numerical values or arithmetic operations. All CASE statements conclude with the END keyword.

Syntax

SELECT CASE WHEN … THEN …, WHEN… THEN… ELSE… END

SELECT CASE WHEN condition1 THEN outcome1, WHEN condition2 THEN outcome2 ELSE outcome3 END

Examples

SELECT CASE WHEN activity_type IN(emailmeeting) THEN 1, WHEN activity_type IN(phone_call, meeting) THEN 2 ELSE 0 END

SELECT CASE WHEN SUM(Amount) > SUM(Lost) AND SUM(Amount) – SUM(Lost) > 100000 THEN 2, WHEN SUM(Amount) > SUM(Lost) AND SUM(Amount) – SUM(Lost) < 100000 THEN 1 ELSE 0 END

Ranking Functions

XAE now supports sequential ranking of all a report's values using the RANK function – in addition to TOP(n) and BOTTOM(n) filter ranking (e.g. filtering for the TOP(5) sales reps).

In combination with various formulations of the WITHIN statement, RANK allows you to rank values within sub-groups associated with a report’s attribute values. This would be useful for ranking the top 5 sales reps within each sales region, in a report containing sales rep data broken across a sales region attribute.

When RANK is used without an accompanying WITHIN statement (e.g. SELECT RANK(Avg. Won)), the function sequentially ranks all metric values across the rows in a report, resulting in an identical output as RANK(...) WITHIN(ALL IN ALL OTHER DIMENSIONS).

When a WITHIN statement is appended to a RANK function, it can be used with either of two parameter formulations, resulting in different outputs. These are explored in the table below.

WITHIN

Output

WITHIN (attribute1, attribute2, ... | ALL attribute | ALL OTHER EXCEPT attribute)

Ranks metric values within sub-groups that are specified by the attributes contained in the WITHIN statement’s parameter.

WITHIN (CURRENT)

Ranks within current report dimensionality (defined by the report's attributes), where every row of the report is handled as a separate group and thus each value is ranked '1'.

As noted above, WITHIN can be combined with the RANK function, which is described further below.

RANK

Description

The ranking function returns a numerical rank for each of a report’s metric values in ascending [ASC] order (smallest value ranked first) or descending [DESC] order (largest value ranked first). Ascending order is the default, for cases where an order is not explicitly specified.

A key feature of the RANK function is the WITHIN keyword, which allows you to carry out rankings of multiple attribute sub-groups within a single report.

Syntax

SELECT RANK(…) [ASC|DESC] [WITHIN(…)]

SELECT RANK(metric)

SELECT RANK(metric) WITHIN(CURRENT)

SELECT RANK(metric) WITHIN(attribute1, attribute2, [...])

SELECT RANK(metric) WITHIN(attribute ALL OTHER | ALL OTHER | ALL OTHER EXCEPT FOR)

Examples

SELECT RANK(Avg. Won)

SELECT RANK(Amount) ASC WITHIN(Year(Closed))

SELECT RANK(Amount[SUM]) WITHIN(CURRENT)

SELECT RANK(Amount[SUM]) WITHIN(ALL OTHER)

Developer article: Introducing the New Ranks

As with RANK, TOP and BOTTOM ranking functions can be used without a WITHIN statement for ranking that disregards sub-groups within a report. Using WITHIN in conjunction with TOP and BOTTOM allows for rankings that include only the top or bottom n values within certain sub-groups in the final report output.

WHERE TOP|BOTTOM (n) IN

Description

Unlike the RANK function, WHERE TOP|BOTTOM IN serves as a filter that limits the number of report values returned in the final report output. Its syntax requires a SELECT keyword preceding the metric that follows the IN keyword.

Note: Now supports percent parameters: n%.

Syntax

SELECT … WHERE TOP(…) IN … WITHIN …

SELECT metric1 WHERE TOP(n) IN (SELECT metric2 BY attribute1, attribute2...) WITHIN (...)

Examples

SELECT Amount WHERE TOP(5) IN (SELECT Avg. Won BY Product)

MAQL also supports using an OF keyword following the TOP|BOTTOM ranking filter. This changes the way the metric is interpreted by the analytical engine. With OF you can carry out rankings by more than one metric. Likewise, the ranking metric is interpreted as a submetric, rather than a subreport (as is the case for TOP|BOTTOM in conjunction with the IN keyword). 

WHERE TOP|BOTTOM (n) OF

Description

Similar to WHERE TOP|BOTTOM IN, WHERE TOP|BOTTOM OF serves as a filter that limits the number of report values returned in the final report output. Using the OF keyword (rather than IN) allows ranking of more than one metric and interprets the ranking metric as a sub-metric.

Note: Now supports percent parameters: n%.

Syntax

SELECT … WHERE TOP(…) OF … [WITHIN …]

SELECT metric1 WHERE [TOP|BOTTOM](n) OF ( metric2, ... )

SELECT metric1 WHERE [TOP|BOTTOM](n) OF ( metric2, ... ) WITHIN ( ... )

Examples

SELECT Amount WHERE TOP(5) OF (Avg. Won)

SELECT Won WHERE TOP(10) OF (Won) WITHIN(Region)

SELECT Amount WHERE BOTTOM(5) OF (Lost) WITHIN(Region, Year (closed))

Percentage ranking

WHERE TOP|BOTTOM (n%) OF

Description

Percentage rankings are similar to TOP|BOTTOM ranking filters but support percent parameters in place of n.

Syntax

SELECT … WHERE [TOP|BOTTOM](n%)

SELECT metric1 WHERE [TOP|BOTTOM](n%)

Examples

SELECT Amount WHERE TOP(5%) OF (Amount) AND Product=Explorer

SELECT RANK(Top 5% parent) WHERE TOP(50%) OF (Top 5% parent)

Developer article: Extended TOP|BOTTOM Ranks

Running Totals

Previously, running totals could be added to GoodData tables using the secondary-click contextual menu. This created a new column to the right of the original that automatically totaled the selected column’s cells. However, the actual metric used to compute the column’s values was inaccessible.

 

Now you can use the RUNSUM, RUNAVG, RUNMIN, and RUNMAX functions in MAQL to create running total metrics that can be reused in other metrics – where they can be manipulated with other MAQL functions – or added to any number of other reports.

RUNSUM / RUNAVG / RUNMIN / RUNMAX

Description

Takes the running total of a project fact or metric. In a table broken down by a date attribute, the running total value on any given day would be calculated by aggregating values from all prior days along with that of the current day.

 Note: Running total metrics can only be broken down by one date attribute at any given time.

Syntax

RUNSUM(...)  |  RUNAVG(...)  |  RUNMIN(...)  |  RUNMAX(...)

RUNSUM(fact)  |  RUNAVG(fact)  |  RUNMIN(fact)  |  RUNMAX(fact)

RUNSUM(metric)  |  RUNAVG(metric)  |  RUNMIN(metric)  |  RUNMAX(metric)

Examples

SELECT RUNSUM(Sales)

SELECT RUNAVG(Leads) WHERE Year={this}

SELECT RUNMIN(Probability) WHERE Amount > 500000

SELECT RUNMAX(Opportunities) BY Quarter/Year

SELECT RUNSUM(Won_Opportunities) WITHOUT PARENT FILTER

Developer article: Metric Running Totals

For more information on the Extensible Analytics Engine, refer to the XAE FAQ and the XAE Release Notes. For more information on writing custom metrics with MAQL, see the MAQL Reference Guide

Have more questions? Submit a request

Comments

  • Avatar
    Matt Matthews

    Sounds great.  Is this part of tonight's release, or do we already have it? 

    Also, I have to ask-- is the introduction of AQE responsible for any of the platform issues over the last few weeks?

    Thanks,

    Matt

  • Avatar
    Steve Olson

    Hey, Matt--

    After today's release, AQE functions should be available in all new and existing projects. To test, you can build a simple metric using the square root function (SQRT()). Read more in the release notes: https://support.gooddata.com/entries/24770636-Introducing-the-Advanced-Query-Engine#SQRT

    I don't have all of the details, but I don't think AQE was a factor in the recent platform issues. 

     

    Cheers,

    -SteveO

     

  • Avatar
    Cludia Oliveira

    hi Is there any way to have an string instead of numeric value for the THEN part of CASE statement? i.e. instead using 0 or 1 to use YES NO?

  • Avatar
    Cindy Au

    Hi Cludia,

    I'm afraid there isn't a way for us to use a string in a CASE statement. However, what you can do is in the metric formatting you can add something like this:

    [=1][YES;[=0]NO

    So, any time the metric value is 1, it will display a YES, and any time the value is a 0, it will display a NO. 

  • Avatar
    Cludia Oliveira

    Hi Cindy, that is a great tip. Thanks for the prompt feedback, using the formatting works for what I need.

    Awesome response :)

    thanks

    Cludia

Powered by Zendesk