Skip to main content
Version: 1.2.4

Manage Expectations

An Expectation is a verifiable assertion about your data. They make implicit assumptions about your data explicit, and they provide a flexible, declarative language for describing expected behavior. They can help you better understand your data and help you improve data quality. An Expectation Suite contains multiple Expectations.

Prerequisites

Available Expectations

The following table lists the available GX Cloud Expectations.

Data Quality IssueExpectationDescriptionDynamic Parameters?
Cardinalityexpect_column_values_to_be_uniqueExpect each column value to be unique.No
Cardinalityexpect_compound_columns_to_be_uniqueExpect the compound columns to be unique.No
Cardinalityexpect_select_column_values_to_be_unique_within_recordExpect the values for each record to be unique across the columns listed. Note that records can be duplicated.No
Cardinalityexpect_column_proportion_of_unique_values_to_be_betweenExpect the proportion of unique values to be between a minimum value and a maximum value.Yes
Cardinalityexpect_column_unique_value_count_to_be_betweenExpect the number of unique values to be between a minimum value and a maximum value.Yes
Data Integrityexpect_column_pair_values_to_be_equalExpect the values in column A to be the same as column B.No
Data Integrityexpect_multicolumn_sum_to_equalExpect that the sum of row values in a specified column list is the same for each row, and equal to a specified sum total.No
Distributionexpect_column_pair_values_A_to_be_greater_than_BExpect the values in column A to be greater than column B.No
Distributionexpect_column_values_to_be_betweenExpect the column entries to be between a minimum value and a maximum value.No
Distributionexpect_column_z_scores_to_be_less_thanExpect the Z-scores of a column's values to be less than a given threshold.No
Distributionexpect_column_stdev_to_be_betweenExpect the column standard deviation to be between a minimum value and a maximum value.Yes
Distributionexpect_column_sum_to_be_betweenExpect the column sum to be between a minimum value and a maximum value.Yes
Missingnessexpect_column_values_to_be_nullExpect the column values to be null.Coming soon
Missingnessexpect_column_values_to_not_be_nullExpect the column values to not be null.Coming soon
Numerical Dataexpect_column_max_to_be_betweenExpect the column maximum to be between a minimum and a maximum value.Yes
Numerical Dataexpect_column_mean_to_be_betweenExpect the column mean to be between a minimum and a maximum value.Yes
Numerical Dataexpect_column_median_to_be_betweenExpect the column median to be between a minimum and a maximum value.Yes
Numerical Dataexpect_column_min_to_be_betweenExpect the column minimum to be between a minimum value and a maximum value.Yes
Pattern matchingexpect_column_value_length_to_equalExpect the column entries to be strings with length equal to the provided value.No
Pattern matchingexpect_column_value_length_to_be_betweenExpect the column entries to be strings with length between a minimum value and a maximum value.No
Pattern matchingexpect_column_values_to_match_like_patternExpect the column entries to be strings that match a given like pattern expression.No
Pattern matchingexpect_column_values_to_match_like_pattern_listExpect the column entries to be strings that match any of a provided list of like pattern expressions.No
Pattern matchingexpect_column_values_to_match_regexExpect the column entries to be strings that match a given regular expression.No
Pattern matchingexpect_column_values_to_match_regex_listExpect the column entries to be strings that can be matched to either any of or all of a list of regular expressions.No
Pattern matchingexpect_column_values_to_not_match_like_patternExpect the column entries to be strings that do NOT match a given like pattern expression.No
Pattern matchingexpect_column_values_to_not_match_like_pattern_listExpect the column entries to be strings that do NOT match any of a provided list of like pattern expressions.No
Pattern matchingexpect_column_values_to_not_match_regexExpect the column entries to be strings that do NOT match a given regular expression.No
Pattern matchingexpect_column_values_to_not_match_regex_listExpect the column entries to be strings that do not match any of a list of regular expressions. Matches can be anywhere in the string.No
Schemaexpect_column_to_existChecks for the existence of a specified column within a table.No
Schemaexpect_column_values_to_be_in_type_listExpect a column to contain values from a specified type list.No
Schemaexpect_column_values_to_be_of_typeExpect a column to contain values of a specified data type.No
Schemaexpect_table_column_count_to_be_betweenExpect the number of columns in a table to be between two values.Yes
Schemaexpect_table_column_count_to_equalExpect the number of columns in a table to equal a value.No
Schemaexpect_table_columns_to_match_ordered_listExpect the columns in a table to exactly match a specified list.No
Schemaexpect_table_columns_to_match_setExpect the columns in a table to match an unordered set.No
Setsexpect_column_values_to_be_in_setExpect each column value to be in a given set.No
Setsexpect_column_values_to_not_be_in_setExpect column entries to not be in the set.No
Setsexpect_column_distinct_values_to_be_in_setExpect the set of distinct column values to be contained by a given set.No
Setsexpect_column_distinct_values_to_contain_setExpect the set of distinct column values to contain a given set.No
Setsexpect_column_distinct_values_to_equal_setExpect the set of distinct column values to equal a given set.No
Setsexpect_column_most_common_value_to_be_in_setExpect the most common value to be within the designated value set.No
Volumeexpect_table_row_count_to_be_betweenExpect the number of rows to be between two values.Yes
Volumeexpect_table_row_count_to_equalExpect the number of rows to equal a value.No
Volumeexpect_table_row_count_to_equal_other_tableExpect the number of rows to equal the number in another table within the same database.No

Custom SQL Expectations

GX Cloud also offers the ability to write a custom Expectation using SQL. It is designed to fail validation if the provided SQL query returns one or more rows.

The provided query should be written in the dialect of the Data Source in which a given Data Asset lives.

Optional {batch} named query

The optional {batch} named query references the Batch of data under test. When the Expectation is evaluated, the {batch} named query will be replaced with the Batch of data that is validated.

Dynamic Parameters

Dynamic Parameters allow you to create Expectations whose parameters update based on new data. GX Cloud can populate new Expectation parameters at runtime using the last n validation results. For example, you can define an Expectation to validate that the maximum value within a column does not exceed 20% above a previously recorded value.

You will be able to input:

  1. Sensitivity: X% of the average of previous values

  2. Constraint: Above, below, or above and below for the sensitivity threshold

  3. Run count: n previous validation results

When you select your n run count, and:

  • There are 0 previous runs, the Expectation will always succeed.

  • There are <n runs, the Expectation will take all previous runs into account.

  • There are n runs, the Expectation will take the last n runs into account.

  • There are >n runs, the Expectation will take the last n runs into account.

GX Cloud dynamic parameters

Expectation condition

The Expectation condition is an optional field that applies to any Expectation validating row-level data. This condition allows you to filter your data so that only a specific subset of your Batch is validated. Rows will be validated only when the condition is true.

You will need to select:

  1. A column to check the condition against.
  2. An operator that is used to compare the column against a parameter value.
  3. A parameter that will be compared against each row in the selected column.

To clear the Expectation condition, click the clear button located on the right-hand side of the condition field.

GX Cloud Expectation condition field

GX Cloud Expectation with condition

Add an Expectation

  1. In GX Cloud, click Data Assets.

  2. In the Data Assets list, click the Data Asset name.

  3. Click the Expectations tab.

  4. Click New Expectation.

  5. Select an Expectation type. See Available Expectations.

  6. If you are adding your first expectation on this data asset, you may be able to select a time-based Batch interval for that asset.

    • A batch is a feature of the data asset, and allows you to validate your data incrementally. A batch interval can only be defined once per data asset; you cannot change it after setting it.

    • In order to be able to select a batch interval, the data asset must have at least one DATE or DATETIME column.

    • Select the Entire table tab to provide all Data Asset records to your Expectations and validations, or select the Yearly/Monthly/Daily tab to use subsets of Data Asset records for your Expectations and validations.

    • Select Yearly to partition Data Asset records by year, select Monthly to partition Data Asset records by year and month, or select Daily to partition Data Asset records by year, month, and day.

    • Batch column - Select a name column from a prefilled list of DATE and DATETIME columns containing the date and time data.

  7. Complete the mandatory and optional fields for the Expectation. A recurring validation schedule will be applied automatically to your Expectation, based on the settings of your Expectation Suite.

  8. Click Save or click Save & Add More and then repeat steps 5 and 7 to add additional Expectations.

  9. Optional. Run a Validation. See Run a Validation.

Edit an Expectation

  1. In GX Cloud, click Data Assets.

  2. In the Data Assets list, click the Data Asset name.

  3. Click the Expectations tab.

  4. Click Edit Expectation for the Expectation that you want to edit.

  5. Edit the Expectation configuration.

  6. Click Save.

View Expectation history

View the Expectation history to determine when an Expectation was changed and who made the change.

  1. In GX Cloud, click Expectation Suites.

  2. In the Expectation Suites list, click the Expectation Suite name.

  3. Click the Change Log tab.

  4. Optional. Select an Expectation in the Columns pane to view the change history for a specific Expectation.

    The date, time, and email address of the users who created, edited, or deleted the Expectation appears below the Expectation name. Strikethrough text indicates an Expectation was deleted.

Delete an Expectation

  1. In GX Cloud, click Data Assets.

  2. In the Data Assets list, click the Data Asset name.

  3. Click the Expectations tab.

  4. Click Delete Expectation for the Expectation you want to delete.

  5. Click Yes, delete Expectation.