- SAP Data services is one of the finest ETL(Extract, Transform, Load) tools which delivers a single enterprise-class solution for data integration, data quality, data profiling, and text data processing that allows you to integrate, transform, improve, and deliver trusted data to critical business processes.
- SAP Data services Transforms are built-in system objects stored in repository, which are used whenever we want to transform data from source(s) to target(s).
- The transforms can be found under Transforms tab of our Local object Library – which provides access to all repository objects (in-built or user built).
- The transforms are majorly classified into four categories as below. Expanding each type we can see the list of transforms present in each category.
- In this article, let’s look at one of the most widely used transform of data services, which is the query transform.
What is a Query Transform?:
- Query transform is part of the Platform tranforms available in data services. To know more about the transform categories and the transforms present in each one of them, please refer ‘SAP Data Services – Transforms’
- A query transform is similar to a SQL SELECT statement.
- It retrieves a data set that satisfies conditions that we specify.
The Query transform can perform the following operations:
- Choose (filter) the data to extract from sources
- Join data from multiple sources
- Map columns from input to output schemas
- Perform transformations and functions on the data
- Perform data nesting and unnesting
- Add new columns, nested schemas, and function results to the output schema
- Assign primary keys to output columns
Inside a Query Transform:
- To start with, let’s built a small dataflow with a source table and target table, which are mapped through a query transform.
- Mapping of target table columns, to those of source table are done through query transform.
- In the dataflow, double click on query transform to perform needed transformations.
- Once we enter into the query transfer, the query editor would be similar to something like below:
- To the left hand side of query editor we have the source/input table, while in the right we can see our target/output table. Below them we have the properties tab where query transform properties are defined.
- Let’s look in detail about each of these.
- The figure below shows the schema In or the input tables for the query transform.
- The Schema In area shows the input/source fields
- The options available for Schema In up on right click are:
- Copy: This option is used to copy the name of the column or source structure name (in this case DS_AGGR_FUNC)
- Find: This option is used to find specific column from source columns. It is very useful when we have so many column in source structure.
- Refresh: Using this option we can refresh the structure.
- Collapse: Using this option we can collapse the single source structure. It is useful when we have more source structures as source.
- Generate DTD: Generates a DTD format that corresponds to the structure of the selected schema (either NRDM or relational). Generates all data types as varchar.
- Generate XML Schema: Generates an XML Schema that corresponds to the structure of the selected schema (either NRDM or relational). All data types match those of the selected schema.
- Map to Output: Creates a simple mapping from the input schema area to the output schema area.
- Create File Format: Creates a file format from a relational table schema. All data types match those of the original table schema.
- Create HDFS File Format: Creates a file format from a HDFS schema. All data types match those of the original HDFS file schema.
- Properties: Displays the properties of the selected element. We cannot modify the properties here.
Please proceed to next page to continue reading…