Databricks - Download Files from DBFS to Workflows
Overview
Quickly export one or more files from your Databricks File System (DBFS). The [match type](https://www.shipyardapp.com/docs/reference/blueprint-library/match-type/) selected greatly affects how this Blueprint works.
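To illustrate how the match type changes which files are selected, here is a minimal Python sketch. It is not the Blueprint's actual implementation: the `select_files` helper and the sample file names are hypothetical, and the use of `re.search` for regex matching is an assumption.

```python
import re


def select_files(available, file_name, match_type="exact_match"):
    """Return the names in `available` that the given match type would select.

    Hypothetical helper: the real Blueprint may apply the pattern differently.
    """
    if match_type == "exact_match":
        # Exact match: only a file whose name equals file_name is selected.
        return [name for name in available if name == file_name]
    if match_type == "regex_match":
        # Regex match: every file whose name matches the pattern is selected.
        pattern = re.compile(file_name)
        return [name for name in available if pattern.search(name)]
    raise ValueError(f"unknown match type: {match_type!r}")


# Sample listing of a DBFS folder (made-up names for illustration).
files = ["sales_2023.csv", "sales_2024.csv", "notes.txt"]

exact = select_files(files, "sales_2024.csv")
regex = select_files(files, r"^sales_\d{4}\.csv$", "regex_match")
```

With these inputs, the exact match selects a single file while the regex selects both `sales_` files, which is why the match type should be chosen before filling in the file name.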
Variables
Name | Reference | Type | Required | Default | Options | Description |
---|---|---|---|---|---|---|
Databricks Folder Name | DATABRICKS_SOURCE_FOLDER_NAME | Alphanumeric | ➖ | None | - | Name of the folder where the file is stored in the Databricks File System (DBFS). If left blank, the Blueprint looks in /FileStore/. |
Databricks File Name Match Type | DATABRICKS_SOURCE_FILE_NAME_MATCH_TYPE | Select | ✅ | exact_match | Exact Match: exact_match; Regex Match: regex_match | Determines whether the text in "Databricks File Name" is matched against one file exactly or against multiple files using regex. |
Databricks File Name | DATABRICKS_SOURCE_FILE_NAME | Alphanumeric | ✅ | None | - | Name of the target file in the Databricks File System (DBFS). Can be regex if "Match Type" is set accordingly. |
Workflows Folder Name | DATABRICKS_DESTINATION_FOLDER_NAME | Alphanumeric | ➖ | None | - | Folder where the file(s) should be downloaded on Workflows. If left blank, the file(s) are placed in the home directory. |
Workflows File Name | DATABRICKS_DESTINATION_FILE_NAME | Alphanumeric | ➖ | None | - | What to name the file(s) being downloaded on Workflows. If left blank, defaults to the original file name(s). |
Workspace Instance URL | DATABRICKS_INSTANCE_URL | Alphanumeric | ✅ | None | - | The subdomain, domain, and top-level domain (TLD) of your Databricks Workspace URL. |
Access Token | DATABRICKS_ACCESS_TOKEN | Password | ✅ | None | - | The personal access token associated with the provided Workspace Instance. |
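
The defaults described in the table can be sketched as a small path-resolution routine. This is an illustration only: `resolve_paths` is a hypothetical helper, and it assumes that a blank source folder falls back to `/FileStore/`, that a blank destination folder means the current (home) directory, and that a blank destination file name reuses the original name. How the Blueprint names multiple regex-matched files when a destination file name is given is not covered here.

```python
import posixpath


def resolve_paths(source_folder, source_file, dest_folder="", dest_file=""):
    """Return (dbfs_source_path, workflows_destination_path).

    Hypothetical helper mirroring the defaults in the Variables table.
    """
    # Blank source folder: assume the Blueprint looks in /FileStore/.
    src_dir = source_folder or "/FileStore/"
    src = posixpath.join(src_dir, source_file)

    # Blank destination file name: keep the original file name.
    # Blank destination folder: a relative path in the home directory.
    dest = posixpath.join(dest_folder, dest_file or source_file)
    return src, dest


defaults = resolve_paths("", "report.csv")
custom = resolve_paths("/data/in", "report.csv", "reports", "latest.csv")
```

Here `defaults` resolves to the `/FileStore/` source and an unchanged file name in the home directory, while `custom` shows both the folder and the file name being overridden.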
YAML
Below is the YAML template for this Blueprint:
source:
  template: Databricks - Download Files from DBFS to Workflows
  inputs:
    DATABRICKS_SOURCE_FOLDER_NAME:
    DATABRICKS_SOURCE_FILE_NAME_MATCH_TYPE: exact_match
    DATABRICKS_SOURCE_FILE_NAME:
    DATABRICKS_DESTINATION_FOLDER_NAME:
    DATABRICKS_DESTINATION_FILE_NAME:
    DATABRICKS_INSTANCE_URL:
    DATABRICKS_ACCESS_TOKEN:
  type: TEMPLATE
guardrails:
  retry_count: 1
  retry_wait: 0h0m0s
  runtime_cutoff: 1h0m0s
  exclude_exit_code_ranges:
    - 200
    - 201
    - 202
    - 203
    - 212
    - 214