by JF OY - Mar 6, 2023

What is DBT?

How we structure our DBT projects?

and Best practices

Encourage to add data descriptions to YAML!!!

Data stage

  1. Sources: Schemas and tables in a source-conformed structure (i.e. tables and columns in a structure based on what an API returns), loaded by a third-party tool. https://docs.getdbt.com/docs/building-a-dbt-project/using-sources

    src_{source}__{object} Table

  2. Staging models: The atomic unit of data modeling. Each model bears a one-to-one relationship with the source data table it represents. It has the same granularity, but the columns have been renamed, recast, or usefully reconsidered into a consistent format. stg_{source}__{object} View

  3. Marts models: Models that represent business processes and entities, abstracted from the data sources that they are based on.

    {dim/fact}_{object}_{t/v} Table or View

Folder structure

├── dbt_project.yml
└── models
    ├── marts
    |   ├── core
    |   |   ├── docs.md
    |   |   ├── core.yml
    |   |   ├── dim_account_profile_v.sql
    |   |   └── dim/fact_*_t/v.sql
    |   └── {business group}
    |       ├── *docs.md
    |       ├── dim/fact/summary_*_v.yml
    |       └── dim/fact/summary_*_v.sql
    └── staging
        ├── account
        |   ├── docs.md
        |   ├── src_account.yml
        |   ├── stg_account.yml
        |   └── stg_account__account_profile.sql
        └── {service}
            ├── docs.md
            ├── src_{service}.yml
            ├── stg_{service}.yml
            └── stg_{service}__*.sql

Data warehouse structure

dw
└── schemas
    ├── src
    |   └── Tables
    |       ├── src_account__account
    |       ├── src_account__account_profile
    |       ├── src_gsheet_data__*
    |       ├── src_gsheet_ops__*
    |       └── src_billing__*
    ├── dbt_stg
    |   └── Views
    |       ├── stg_account__account_profile
    |       └── stg_billing__*
    ├── dbt_mart_core
    |   └── Views
    |       └── dim_account_profile_v
    ├── dbt_mart_ico
    |   └── Views
    |       └── dim/fact_*_v
    └── dbt_mart_{business group}
        └── Views
            └── dim/fact_*_v

https://discourse.getdbt.com/t/faq-cleaning-up-removed-models-from-your-production-schema/113

Which data can we use?

As an analyst in the business group

Please only use models from dbt_mart_core or dbt_mart_{your business group}. Any model from other schemas may change fast and engineer won’t keep the old version.

DBT Style Guide

Tag Strategy

Currently, we don’t have strict limits on tags. But some tags are used for airflow triggers. Please be careful and avoid using them in the wrong way.

Workflow

DBT workflow-v20230306 share.jpeg

Potential issues

Ref