Amazon Redshift Query activity
Introduction
An Amazon Redshift Query activity, using its Amazon Redshift connection, retrieves records from a table in Amazon Redshift and is intended to be used as a source to provide data in an operation.
Create an Amazon Redshift Query activity
An instance of an Amazon Redshift Query activity is created from an Amazon Redshift connection using its Query activity type.
To create an instance of an activity, drag the activity type to the design canvas or copy the activity type and paste it on the design canvas. For details, see Create an activity or tool instance in Component reuse.
An existing Amazon Redshift Query activity can be edited from these locations:
- The design canvas (see Component actions menu in Design canvas).
- The project pane's Components tab (see Component actions menu in Project pane Components tab).
Configure an Amazon Redshift Query activity
Follow these steps to configure an Amazon Redshift Query activity:
- Step 1: Enter a name and select a schema
  Provide a name for the activity and select a schema.
- Step 2: Select an object
  Select an object to be queried.
- Step 3: Build your query
  Set conditions on a query using the object fields and apply paging to a query.
- Step 4: Review the data schemas
  Any request or response schemas generated from the endpoint are displayed.
Step 1: Enter a name and select a schema
In this step, provide a name for the activity and select a schema. Each user interface element of this step is described below.
- Name: Enter a name to identify the activity. The name must be unique for each Amazon Redshift Query activity and must not contain forward slashes (/) or colons (:).
- Select a Schema: This section displays schemas available in the Amazon Redshift endpoint. When reopening an existing activity configuration, only the selected schema is displayed instead of reloading the entire schema list.
  - Selected Schema Name: After a schema is selected, it is listed here.
  - Search: Enter any column's value into the search box to filter the list of schemas. The search is not case-sensitive. If schemas are already displayed within the table, the table results are filtered in real time with each keystroke. To reload schemas from the endpoint when searching, enter search criteria and then refresh, as described below.
  - Refresh: Click the refresh icon or the word Refresh to reload schemas from the Amazon Redshift endpoint. This may be useful if schemas have been added to Amazon Redshift. This action refreshes all metadata used to build the table of schemas displayed in the configuration.
  - Selecting a Schema: Within the table, click anywhere on a row to select a schema. Only one schema can be selected. The information available for each schema is fetched from the Amazon Redshift endpoint:
    - Schema: The name of the Amazon Redshift schema.
    Tip
    If the table does not populate with available schemas, the Amazon Redshift connection may not be successful. Ensure you are connected by reopening the connection and retesting the credentials.
- Save & Exit: If enabled, click to save the configuration for this step and close the activity configuration.
- Next: Click to temporarily store the configuration for this step and continue to the next step. The configuration will not be saved until you click the Finished button on the last step.
- Discard Changes: After making changes, click to close the configuration without saving changes made to any step. A message asks you to confirm that you want to discard changes.
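The naming rules for the Name field can be checked before entering a name. The following is a minimal sketch, assuming the set of existing activity names is already known; the function `validate_activity_name` and its parameters are hypothetical and are not part of the product.

```python
def validate_activity_name(name: str, existing_names: set) -> list:
    """Return a list of problems with a proposed activity name (empty if none).

    Hypothetical helper: it only mirrors the two rules stated for the Name
    field (no forward slashes or colons, unique among Query activities).
    """
    problems = []
    # Forward slashes and colons are not allowed in the name.
    for forbidden in ("/", ":"):
        if forbidden in name:
            problems.append(f"name contains forbidden character {forbidden!r}")
    # The name must be unique for each Amazon Redshift Query activity.
    if name in existing_names:
        problems.append("name is already used by another Query activity")
    return problems


existing = {"Query Orders"}
print(validate_activity_name("Query Accounts", existing))  # no problems
print(validate_activity_name("Query: Orders/2024", existing))  # two problems
```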
Step 2: Select an object
In this step, select an object. Each user interface element of this step is described below.
- Select an Object: This section displays objects available in the Amazon Redshift endpoint. When reopening an existing activity configuration, only the selected object is displayed instead of reloading the entire object list.
  - Selected Schema Name: The schema name selected in the previous step is listed here.
  - Select an Object Name: After an object is selected, it is listed here.
  - Search: Enter any column's value into the search box to filter the list of objects. The search is not case-sensitive. If objects are already displayed within the table, the table results are filtered in real time with each keystroke. To reload objects from the endpoint when searching, enter search criteria and then refresh, as described below.
  - Refresh: Click the refresh icon or the word Refresh to reload objects from the Amazon Redshift endpoint. This may be useful if objects have been added to Amazon Redshift. This action refreshes all metadata used to build the table of objects displayed in the configuration.
  - Selecting an Object: Within the table, click anywhere on a row to select an object. Only one object can be selected. The information available for each object is fetched from the Amazon Redshift endpoint:
    - Object Name: The object name from Amazon Redshift.
    - Type: The object type from Amazon Redshift.
    - Catalog: The object catalog from Amazon Redshift.
    Tip
    If the table does not populate with available objects, the Amazon Redshift connection may not be successful. Ensure you are connected by reopening the connection and retesting the credentials.
- Back: Click to temporarily store the configuration for this step and return to the previous step.
- Next: Click to temporarily store the configuration for this step and continue to the next step. The configuration will not be saved until you click the Finished button on the last step.
- Discard Changes: After making changes, click to close the configuration without saving changes made to any step. A message asks you to confirm that you want to discard changes.
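The Search behavior described above (a filter on any column's value that is not case-sensitive) can be pictured with a short sketch. It assumes substring matching, which the text does not state explicitly, and the sample rows and the function `filter_rows` are hypothetical.

```python
def filter_rows(rows, search):
    """Keep rows where any column value contains the search text.

    Assumption: matching is by substring. The comparison ignores case,
    as described for the Search box.
    """
    needle = search.casefold()
    return [
        row for row in rows
        if any(needle in str(value).casefold() for value in row.values())
    ]


# Hypothetical object list with the three columns shown in the table.
SAMPLE_OBJECTS = [
    {"Object Name": "accounts", "Type": "TABLE", "Catalog": "dev"},
    {"Object Name": "orders", "Type": "TABLE", "Catalog": "dev"},
    {"Object Name": "order_summary", "Type": "VIEW", "Catalog": "dev"},
]

# Matches on the Object Name column regardless of case.
print(filter_rows(SAMPLE_OBJECTS, "ACC"))
# Matches on the Type column.
print(filter_rows(SAMPLE_OBJECTS, "view"))
```

Filtering happens on rows that are already loaded, which corresponds to the real-time filtering described above; reloading from the endpoint is a separate Refresh action.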
Step 3: Build your query
In this step, set conditions on a query using the object fields and apply paging to a query. Each user interface element of this step is described below.
Tip
Fields with a variable icon support using global variables, project variables, and Jitterbit variables. Begin either by typing an open square bracket [ into the field or by clicking the variable icon to display a list of the existing variables to choose from.
- Search: Enter any part of a field name into the search box to filter the list of fields for the selected object. The search is not case-sensitive. The listed results are filtered in real time with each keystroke.
- Refresh: Click the refresh icon or the word Refresh to reload fields of the object from the Amazon Redshift endpoint.
- Select All: When using the search box to filter, you can use this checkbox to select all visible fields at once.
- Select Fields: Select the checkboxes of the fields you want included in the query to have them automatically added to the SELECT statement in the Query String. You can also Select All of the fields at once using the checkbox.
- Paging: To add a paging clause (a limit on the number of records with an optional record offset), use the dropdown to set the paging limit and the field to enter an offset. If an offset is not specified, it defaults to 0. A single paging clause is supported. If a paging clause is not included, all records are returned.
  - Apply: Click to automatically construct the clause based on the dropdown selections and entered value. The automatically constructed paging clause appears in the Query String text box.
  - Remove: Click to remove a paging clause that has been applied.
- Conditions: To add conditional clauses, use the fields below as input to help construct the clauses, which then appear in the Query String text box.
  - Object: Field: Use the dropdown to select a field from the selected object.
  - Operator: Use the dropdown to select an operator that is appropriate for the field data type:
    | Operator | Label | Description |
    | --- | --- | --- |
    | = | Equals | |
    | != | Not equals | |
    | LIKE 'string' | Like | Like string |
    | LIKE 'string%' | Starts with | Starts with string |
    | LIKE '%string' | Ends with | Ends with string |
    | LIKE '%string%' | Contains | Contains string |
    | < | Less than | |
    | <= | Less or equal | |
    | > | Greater than | |
    | >= | Greater or equal | |
  - Value: Enter the desired value to use with the dropdown selections.
  - Add: Click to automatically construct the clause based on the dropdown selections and entered value. The conditional clause is added to the Query String text box.
  - Remove All: Click to remove all entered conditional clauses.
- Query String: As you select fields, specify conditions, and set paging, the query statement in this text box is autopopulated with the selected fields, conditions, and paging limits.
- Test Query: Click to validate the query. If the query is valid, up to 50 records retrieved by the query are displayed in a table. If the query is not valid, relevant error messages are displayed.
  Note
  During operation runtime, the 50-record limit is not enforced unless it is specified in the Paging field (described earlier).
- Back: Click to temporarily store the configuration for this step and return to the previous step.
- Next: Click to temporarily store the configuration for this step and continue to the next step. The configuration will not be saved until you click the Finished button on the last step.
- Discard Changes: After making changes, click to close the configuration without saving changes made to any step. A message asks you to confirm that you want to discard changes.
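To illustrate how selected fields, conditions, and a paging clause might combine into a query string, here is a minimal sketch. It is not the connector's implementation: the helper names, the use of AND to join multiple conditions, the exact text of the generated clauses, and the `public.accounts` example are all assumptions made for illustration.

```python
# Operator labels from the Operator dropdown. The LIKE variants differ only
# in where the % wildcard is placed around the entered value.
LIKE_PATTERNS = {
    "Like": "{}",
    "Starts with": "{}%",
    "Ends with": "%{}",
    "Contains": "%{}%",
}
COMPARISONS = {
    "Equals": "=",
    "Not equals": "!=",
    "Less than": "<",
    "Less or equal": "<=",
    "Greater than": ">",
    "Greater or equal": ">=",
}


def sql_literal(value):
    """Render a Python value as a SQL literal (numbers bare, text quoted)."""
    if isinstance(value, (int, float)):
        return str(value)
    # Double any single quotes so the text stays inside the literal.
    return "'" + str(value).replace("'", "''") + "'"


def condition(field, label, value):
    """Build one conditional clause from a field, operator label, and value."""
    if label in LIKE_PATTERNS:
        pattern = LIKE_PATTERNS[label].format(value)
        return f"{field} LIKE {sql_literal(pattern)}"
    return f"{field} {COMPARISONS[label]} {sql_literal(value)}"


def build_query(schema, obj, fields, conditions=(), limit=None, offset=0):
    """Assemble a SELECT statement (hypothetical helper).

    Assumption: multiple conditions are joined with AND. The offset defaults
    to 0, and omitting the limit returns all records.
    """
    query = f"SELECT {', '.join(fields)} FROM {schema}.{obj}"
    if conditions:
        query += " WHERE " + " AND ".join(conditions)
    if limit is not None:
        query += f" LIMIT {limit} OFFSET {offset}"
    return query


example = build_query(
    "public",
    "accounts",
    ["id", "name", "balance"],
    [condition("balance", "Greater than", 100),
     condition("name", "Starts with", "A")],
    limit=50,
)
print(example)
```

With these inputs the sketch produces `SELECT id, name, balance FROM public.accounts WHERE balance > 100 AND name LIKE 'A%' LIMIT 50 OFFSET 0`. The query string generated by the activity may be formatted differently, so use Test Query to confirm the actual statement.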
Step 4: Review the data schemas
Any request or response schemas generated from the endpoint are displayed. Each user interface element of this step is described below.
- Data Schemas: These data schemas are inherited by adjacent transformations and are displayed again during transformation mapping.
  Note
  Data supplied in a transformation takes precedence over the activity configuration.
  The Amazon Redshift connector uses the Amazon Redshift JDBC Driver version 2.1.0.28 and Amazon Redshift SQL Commands. Refer to the Amazon Redshift documentation and the Amazon Redshift System Overview documentation for additional information.
  The response data schema depends on the query that was specified. In this example, it consists of these nodes and fields:
  - Response
    | Response Schema Field/Node | Notes |
    | --- | --- |
    | accounts | Node of accounts being queried |
    | balance | Value of queried account |
    | id | ID of queried account |
    | name | Name of queried account |
- Refresh: Click the refresh icon or the word Refresh to regenerate schemas from the Amazon Redshift endpoint. This action also regenerates a schema in other locations throughout the project where the same schema is referenced, such as in an adjacent transformation.
- Back: Click to temporarily store the configuration for this step and return to the previous step.
- Finished: Click to save the configuration for all steps and close the activity configuration.
- Discard Changes: After making changes, click to close the configuration without saving changes made to any step. A message asks you to confirm that you want to discard changes.
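As a rough picture of the example response structure above, the sketch below shapes query rows into an `accounts` node whose entries carry `id`, `name`, and `balance`. The field types, the repetition of entries under `accounts`, and the names `Account` and `to_response` are assumptions; the actual schema is generated from the endpoint and depends on the query.

```python
from dataclasses import dataclass, asdict


@dataclass
class Account:
    # Field names follow the example response schema; the types are assumed.
    id: int
    name: str
    balance: float


def to_response(rows):
    """Shape (id, name, balance) rows into the example response structure."""
    return {"accounts": [asdict(Account(*row)) for row in rows]}


# Hypothetical rows as a query might return them.
print(to_response([(1, "Acme", 250.0), (2, "Globex", 75.5)]))
```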
Next steps
After configuring an Amazon Redshift Query activity, complete the configuration of the operation by adding and configuring other activities, transformations, or scripts as operation steps. You can also configure the operation settings, which include the ability to chain operations together that are in the same or different workflows.
Menu actions for an activity are accessible from the project pane and the design canvas. For details, see Activity actions menu in Connector basics.
Amazon Redshift Query activities can be used as a source with these operation patterns:
- Transformation pattern
- Two-target archive pattern (as the first source only)
- Two-target HTTP archive pattern (as the first source only)
- Two-transformation pattern (as the first source only)
To use the activity with scripting functions, write the data to a temporary location and then use that temporary location in the scripting function.
When ready, deploy and run the operation and validate behavior by checking the operation logs.