Metadata
The metadata property on the dbtCloudClient class contains methods that allow a user to retrieve metadata pertaining to the accuracy, recency, configuration, and structure of the views and tables in the warehouse.
The Metadata API is a GraphQL API. Normally, this would require a user to write a GraphQL query as part of the required payload. However, this package provides a convenient interface that allows a user to write the GraphQL query in a more pythonic way. There are two options:
- Provide the minimal set of arguments to the functions below and get every field in the desired schema, including those that are nested. An example is below for the models schema:
from dbtc import dbtCloudClient
# Assuming DBT_CLOUD_SERVICE_TOKEN is set as an environment variable
client = dbtCloudClient()
job_id = 1
models = client.metadata.get_models(job_id)
- If you don't want or need all of the fields from a particular schema, use the optional fields argument to limit the amount of data that's returned. This argument accepts a list of strings, where each string is the name of a field within the schema. You can also request nested fields using dot notation.
from dbtc import dbtCloudClient
# Assuming DBT_CLOUD_SERVICE_TOKEN is set as an environment variable
client = dbtCloudClient()
job_id = 1
fields = [
    'uniqueId',
    'runId',
    'projectId',
    'environmentId',
    'alias',
    'description',
    'parentsSources.name',
    'parentsSources.criteria.warnAfter.period',
    'parentsSources.criteria.warnAfter.count',
]
models = client.metadata.get_models(job_id, fields=fields)
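To make the dot notation concrete, here is a minimal sketch of how a list of dotted field names could be folded into a nested GraphQL selection set. The helper build_selection is hypothetical and written only for illustration; it is not part of dbtc, and the package may build its queries differently.

```python
from typing import Dict, List


def build_selection(fields: List[str]) -> str:
    """Turn dot-notation field names into a nested GraphQL selection set.

    Hypothetical helper for illustration only; not part of dbtc.
    """
    # Build a nested dict where each key is a field and each value holds
    # that field's children (an empty dict means a leaf field).
    tree: Dict[str, dict] = {}
    for field in fields:
        node = tree
        for part in field.split("."):
            node = node.setdefault(part, {})

    def render(node: Dict[str, dict]) -> str:
        parts = []
        for name, children in node.items():
            if children:
                parts.append(name + " {" + render(children) + "}")
            else:
                parts.append(name)
        return " ".join(parts)

    return render(tree)


selection = build_selection([
    "uniqueId",
    "parentsSources.name",
    "parentsSources.criteria.warnAfter.period",
    "parentsSources.criteria.warnAfter.count",
])
print(selection)
```

Fields that share a prefix, such as the two warnAfter entries, end up grouped under a single parent rather than repeated.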
get_exposure
The exposure object allows you to query information about a particular exposure. You can learn more about exposures here.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
job_id | int | The unique ID of the job in dbt Cloud that this exposure was generated for | required |
name | str | The name of this particular exposure | required |
run_id | int | The run ID of the run in dbt Cloud that this exposure was generated for | None |
fields | list | The list of fields to include in the response. Field names can be in either snake_case or camelCase (e.g. run_id and runId are treated the same). Nested fields can be accessed with dot notation. | None |
Note
If you do not include a run_id, it will default to the most recent run of the specified job.
Examples:
Assuming that client is an instance of dbtCloudClient:
client.metadata.get_exposure(job_id, name)
dbtc get-exposure --job-id=12345
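The fields parameter above accepts either snake_case or camelCase names. As a rough sketch of what that equivalence means, the snippet below normalizes a name to camelCase one dot-separated segment at a time. The helper to_camel_case is hypothetical and is not taken from dbtc; the package's own handling may differ.

```python
def to_camel_case(name: str) -> str:
    """Convert a snake_case field name to camelCase.

    Names that are already camelCase pass through unchanged.
    Hypothetical helper for illustration only; not part of dbtc.
    """
    segments = []
    # Convert each dot-separated segment on its own so nesting is preserved.
    for segment in name.split("."):
        head, *rest = segment.split("_")
        segments.append(
            head + "".join(word[:1].upper() + word[1:] for word in rest)
        )
    return ".".join(segments)


print(to_camel_case("run_id"))
print(to_camel_case("runId"))
print(to_camel_case("parents_sources.name"))
```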
get_exposures
The exposures object allows you to query information about all exposures in a given job. You can learn more about exposures here.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
job_id | int | The unique ID of the job in dbt Cloud that this exposure was generated for | required |
run_id | int | The run ID of the run in dbt Cloud that this exposure was generated for | None |
fields | list | The list of fields to include in the response. Field names can be in either snake_case or camelCase (e.g. run_id and runId are treated the same). Nested fields can be accessed with dot notation. | None |
Note
If you do not include a run_id, it will default to the most recent run of the specified job.
Examples:
Assuming that client is an instance of dbtCloudClient:
client.metadata.get_exposures(job_id)
dbtc get-exposures --job-id=12345
get_macro
The macro object allows you to query information about a particular macro in a given job.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
job_id | int | The unique ID of the job in dbt Cloud that this macro was generated for | required |
unique_id | str | The unique ID of this particular macro | required |
run_id | int | The run ID of the run in dbt Cloud that this macro was generated for | None |
fields | list | The list of fields to include in the response. Field names can be in either snake_case or camelCase (e.g. run_id and runId are treated the same). Nested fields can be accessed with dot notation. | None |
Note
If you do not include a run_id, it will default to the most recent run of the specified job.
Examples:
Assuming that client is an instance of dbtCloudClient:
client.metadata.get_macro(job_id, unique_id)
dbtc get-macro --job-id=12345
get_macros
The macros object allows you to query information about all macros in a given job.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
job_id | int | The unique ID of the job in dbt Cloud that this macro was generated for | required |
run_id | int | The run ID of the run in dbt Cloud that this macro was generated for | None |
fields | list | The list of fields to include in the response. Field names can be in either snake_case or camelCase (e.g. run_id and runId are treated the same). Nested fields can be accessed with dot notation. | None |
Note
If you do not include a run_id, it will default to the most recent run of the specified job.
Examples:
Assuming that client is an instance of dbtCloudClient:
client.metadata.get_macros(job_id)
dbtc get-macros --job-id=12345
get_metric
The metric object allows you to query information about metrics.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
job_id | int | The unique ID of the job in dbt Cloud that this metric was generated for | required |
unique_id | str | The unique ID of this particular metric | required |
run_id | int | The run ID of the run in dbt Cloud that this metric was generated for | None |
fields | list | The list of fields to include in the response. Field names can be in either snake_case or camelCase (e.g. run_id and runId are treated the same). Nested fields can be accessed with dot notation. | None |
Note
If you do not include a run_id, it will default to the most recent run of the specified job.
Examples:
Assuming that client is an instance of dbtCloudClient:
client.metadata.get_metric(job_id, unique_id)
dbtc get-metric --job-id=12345
get_metrics
The metrics object allows you to query information about metrics.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
job_id | int | The unique ID of the job in dbt Cloud that this metric was generated for | required |
run_id | int | The run ID of the run in dbt Cloud that this metric was generated for | None |
fields | list | The list of fields to include in the response. Field names can be in either snake_case or camelCase (e.g. run_id and runId are treated the same). Nested fields can be accessed with dot notation. | None |
Note
If you do not include a run_id, it will default to the most recent run of the specified job.
Examples:
Assuming that client is an instance of dbtCloudClient:
client.metadata.get_metrics(job_id)
dbtc get-metrics --job-id=12345
get_model
The model object allows you to query information about a particular model in a given job.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
job_id | int | The unique ID of the job in dbt Cloud that this model was generated for | required |
unique_id | str | The unique ID of this particular model | required |
run_id | int | The run ID of the run in dbt Cloud that this model was generated for | None |
fields | list | The list of fields to include in the response. Field names can be in either snake_case or camelCase (e.g. run_id and runId are treated the same). Nested fields can be accessed with dot notation. | None |
Note
If you do not include a run_id, it will default to the most recent run of the specified job.
Examples:
Assuming that client is an instance of dbtCloudClient:
client.metadata.get_model(job_id, unique_id)
dbtc get-model --job-id=12345
get_model_by_environment
The model by environment object allows you to query information about a particular model based on its environment_id.
Warning
This feature is currently in beta and subject to change.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
environment_id | int | The environment_id for this model | required |
unique_id | str | The unique ID of this model | required |
last_run_count | int | Number of last run results where this model was built to return (max of 10). Defaults to 10. | 10 |
with_catalog | bool | If true, return only runs that have catalog information for this model. Defaults to False. | False |
fields | list | The list of fields to include in the response. Field names can be in either snake_case or camelCase (e.g. run_id and runId are treated the same). Nested fields can be accessed with dot notation. | None |
Note
If you do not include a run_id, it will default to the most recent run of the specified job.
Examples:
Assuming that client is an instance of dbtCloudClient:
client.metadata.get_model_by_environment(
    environment_id, unique_id
)
dbtc get-model-by-environment --environment-id=12345 --unique-id=models.tpch.order_items
get_models
The models object allows you to query information about all models in a given job.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
job_id | int | The unique ID of the job in dbt Cloud that this model was generated for | required |
run_id | int | The run ID of the run in dbt Cloud that this model was generated for | None |
database | str | The database where this table/view lives | None |
schema | str | The schema where this table/view lives | None |
identifier | str | The identifier of this table/view | None |
fields | list | The list of fields to include in the response. Field names can be in either snake_case or camelCase (e.g. run_id and runId are treated the same). Nested fields can be accessed with dot notation. | None |
Note
If you do not include a run_id, it will default to the most recent run of the specified job.
Examples:
Assuming that client is an instance of dbtCloudClient:
client.metadata.get_models(job_id)
dbtc get-models --job-id=12345
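The optional database, schema, and identifier parameters listed above can be combined with fields to narrow a request. The sketch below wraps that call in a small function. It assumes these parameters can be passed as keyword arguments (only fields is shown as a keyword elsewhere on this page). The wrapper get_models_in_schema, the stand-in client, and the values analytics and tpch are all hypothetical, used so the example runs without credentials.

```python
from types import SimpleNamespace
from typing import List, Optional


def get_models_in_schema(client, job_id: int, database: str, schema: str,
                         fields: Optional[List[str]] = None):
    """Fetch models for one job, narrowed to a single database and schema.

    Assumes database, schema, and fields are accepted as keyword arguments.
    """
    return client.metadata.get_models(
        job_id,
        database=database,
        schema=schema,
        fields=fields,
    )


# Stand-in client so the sketch runs without a service token. It only echoes
# the arguments it receives; use dbtCloudClient() in real code.
def _echo_get_models(job_id, **kwargs):
    return {"job_id": job_id, **kwargs}


fake_client = SimpleNamespace(
    metadata=SimpleNamespace(get_models=_echo_get_models)
)
result = get_models_in_schema(
    fake_client, 12345, database="analytics", schema="tpch",
    fields=["uniqueId", "alias"],
)
print(result)
```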
get_seed
The seed object allows you to query information about a particular seed in a given job.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
job_id | int | The unique ID of the job in dbt Cloud that this seed was generated for | required |
unique_id | str | The unique ID of this particular seed | required |
run_id | int | The run ID of the run in dbt Cloud that this seed was generated for | None |
fields | list | The list of fields to include in the response. Field names can be in either snake_case or camelCase (e.g. run_id and runId are treated the same). Nested fields can be accessed with dot notation. | None |
Note
If you do not include a run_id, it will default to the most recent run of the specified job.
Examples:
Assuming that client is an instance of dbtCloudClient:
client.metadata.get_seed(job_id, unique_id)
dbtc get-seed --job-id=12345
get_seeds
The seeds object allows you to query information about all seeds in a given job.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
job_id | int | The unique ID of the job in dbt Cloud that this seed was generated for | required |
run_id | int | The run ID of the run in dbt Cloud that this seed was generated for | None |
fields | list | The list of fields to include in the response. Field names can be in either snake_case or camelCase (e.g. run_id and runId are treated the same). Nested fields can be accessed with dot notation. | None |
Note
If you do not include a run_id, it will default to the most recent run of the specified job.
Examples:
Assuming that client is an instance of dbtCloudClient:
client.metadata.get_seeds(job_id)
dbtc get-seeds --job-id=12345
get_snapshot
The snapshot object allows you to query information about a particular snapshot.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
job_id | int | The unique ID of the job in dbt Cloud that this snapshot was generated for | required |
unique_id | str | The unique ID of this particular snapshot | required |
run_id | int | The run ID of the run in dbt Cloud that this snapshot was generated for | None |
fields | list | The list of fields to include in the response. Field names can be in either snake_case or camelCase (e.g. run_id and runId are treated the same). Nested fields can be accessed with dot notation. | None |
Note
If you do not include a run_id, it will default to the most recent run of the specified job.
Examples:
Assuming that client is an instance of dbtCloudClient:
client.metadata.get_snapshot(job_id, unique_id)
dbtc get-snapshot --job-id=12345
get_snapshots
The snapshots object allows you to query information about all snapshots in a given job.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
job_id | int | The unique ID of the job in dbt Cloud that this snapshot was generated for | required |
run_id | int | The run ID of the run in dbt Cloud that this snapshot was generated for | None |
fields | list | The list of fields to include in the response. Field names can be in either snake_case or camelCase (e.g. run_id and runId are treated the same). Nested fields can be accessed with dot notation. | None |
Note
If you do not include a run_id, it will default to the most recent run of the specified job.
Examples:
Assuming that client is an instance of dbtCloudClient:
client.metadata.get_snapshots(job_id)
dbtc get-snapshots --job-id=12345
get_source
The source object allows you to query information about a particular source in a given job.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
job_id | int | The unique ID of the job in dbt Cloud that this source was generated for | required |
unique_id | str | The unique ID of this particular source | required |
run_id | int | The run ID of the run in dbt Cloud that this source was generated for | None |
fields | list | The list of fields to include in the response. Field names can be in either snake_case or camelCase (e.g. run_id and runId are treated the same). Nested fields can be accessed with dot notation. | None |
Note
If you do not include a run_id, it will default to the most recent run of the specified job.
Examples:
Assuming that client is an instance of dbtCloudClient:
client.metadata.get_source(job_id, unique_id)
dbtc get-source --job-id=12345
get_sources
The sources object allows you to query information about all sources in a given job.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
job_id | int | The unique ID of the job in dbt Cloud that this source was generated for | required |
run_id | int | The run ID of the run in dbt Cloud that this source was generated for | None |
database | str | The database where this table/view lives | None |
schema | str | The schema where this table/view lives | None |
identifier | str | The identifier of this table/view | None |
fields | list | The list of fields to include in the response. Field names can be in either snake_case or camelCase (e.g. run_id and runId are treated the same). Nested fields can be accessed with dot notation. | None |
Note
If you do not include a run_id, it will default to the most recent run of the specified job.
Examples:
Assuming that client is an instance of dbtCloudClient:
client.metadata.get_sources(job_id)
dbtc get-sources --job-id=12345
get_test
The test object allows you to query information about a particular test.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
job_id | int | The unique ID of the job in dbt Cloud that this test was generated for | required |
unique_id | str | The unique ID of this particular test | required |
run_id | int | The run ID of the run in dbt Cloud that this test was generated for | None |
fields | list | The list of fields to include in the response. Field names can be in either snake_case or camelCase (e.g. run_id and runId are treated the same). Nested fields can be accessed with dot notation. | None |
Note
If you do not include a run_id, it will default to the most recent run of the specified job.
Examples:
Assuming that client is an instance of dbtCloudClient:
client.metadata.get_test(job_id, unique_id)
dbtc get-test --job-id=12345
get_tests
The tests object allows you to query information about all tests in a given job.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
job_id | int | The unique ID of the job in dbt Cloud that this test was generated for | required |
run_id | int | The run ID of the run in dbt Cloud that this test was generated for | None |
fields | list | The list of fields to include in the response. Field names can be in either snake_case or camelCase (e.g. run_id and runId are treated the same). Nested fields can be accessed with dot notation. | None |
Note
If you do not include a run_id, it will default to the most recent run of the specified job.
Examples:
Assuming that client is an instance of dbtCloudClient:
client.metadata.get_tests(job_id)
dbtc get-tests --job-id=12345
query
Examples:
Assuming that client is an instance of dbtCloudClient:
query = '{models(jobId: 1) {uniqueId}}'
client.metadata.query(query)
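When the job ID or field list varies, the raw query string can be assembled programmatically before being handed to query. The sketch below uses a hypothetical helper, build_models_query, that is not part of dbtc; it reproduces the query string from the example above and handles flat field names only.

```python
from typing import List


def build_models_query(job_id: int, fields: List[str]) -> str:
    """Build a raw GraphQL query string for the models schema.

    Hypothetical helper for illustration only; flat (non-nested) fields only.
    """
    selection = " ".join(fields)
    return "{models(jobId: %d) {%s}}" % (job_id, selection)


query = build_models_query(1, ["uniqueId"])
print(query)
# The resulting string can then be passed to client.metadata.query(query).
```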