Transform data in Data Catalog mode
Overview
Time to work some model magic! We'll create models to transform the data. First, we'll create a stg_orders
model using the Y42 Data catalog mode, which will extract relevant columns from the raw_orders
table.
Instructions
Add a model asset in Data catalog mode
- Press on +
- Select
model
- Name the model:
stg_orders
- Insert the following query
_19with source as (_19_19 select * from {{ source('raw_jaffle_shop_data','raw_orders') }}_19_19),_19_19renamed as (_19_19 select_19 "id" as order_id,_19 "user_id" as customer_id,_19 "order_date" as order_date,_19 "status" as status_19_19 from source_19_19)_19_19select * from renamed
- Preview the model by clicking on
Query Data
on the right side of the bottom drawer or by using the hotkey CTRL + RETURN while having your cursor in the SQL code. Verify that you can see the renamedorder_id
column.
Creating a model using the Data Catalog
mode automatically creates a .sql
file with the query above and a .yml
file referencing the query.
Commit & push the changes
- Click on the
Commit & Push
button at the top, you should see two tracked files:stg_orders.sql
stg_orders.yml
- Add a commit name:
adding stg_orders in data catalog mode
- Click on Commit
- Wait for the pre-configured checks
Build the model
- Click on the table in the left file selector
- Open the bottom drawer
- Navigate to
Build
- You should see a pre-defined build command, if you don’t see a build command, enter
y42 build -s stg_orders
- Click on
Build now
- Observe the build job and wait for the job to be
Ready
Add metadata to the model and preview data
- Click on
Sync columns
, this will add 3 columnsORDER_DATE
CUSTOMER_ID
STATUS
ORDER_ID
- Commit the table metadata
- Click on the
Commit & Push
button at the top - Name your Commit:
adding stg_orders metadata
- Commit your changes
- Wait for the pre-configured checks
- Click on the
- View the transformed data in the
Data
tab in the bottom drawer
Up next
Outstanding! You've added the stg_orders
model using the Data Catalog
mode that contain the transformation logic for our data pipeline. Next, we'll use the Code Editor to create the stg_orders
model.