Databricks SQL Warehouse - Upload File to Table

Overview

Quickly upload a file from Workflows to a SQL table in Databricks. This Blueprint is best used immediately after downloading data from another source.

Recommended Setup

Although they are not required in order to connect, it is recommended that you provide the `Catalog` and the `Schema` that you will query; otherwise, the connection resorts to the defaults and the uploaded table will reside there. It is also recommended that you provide the volume where the uploaded file will be staged before being copied into the target table; Workflows removes the staged file after it has been copied into the target successfully. If the volume provided does not exist, Workflows will create it, and using one volume per schema is recommended, though not strictly enforced. Additionally, the [match type](https://www.shipyardapp.com/docs/reference/blueprint-library/match-type/) you select greatly affects how this Blueprint works.

**Note:** This Blueprint cannot upload a file from your local machine.
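For example, under the recommended setup with the hypothetical inputs below, the file would be staged in the volume (at the standard Unity Catalog path `/Volumes/analytics/sales/sales_staging/`) and then copied into the table `analytics.sales.orders`. All names shown are placeholders, not required values:

```yaml
DATABRICKS_SQL_CATALOG: analytics     # optional; omit to fall back to the default catalog
DATABRICKS_SQL_SCHEMA: sales          # optional; omit to fall back to the `default` schema
DATABRICKS_SQL_VOLUME: sales_staging  # created by Workflows if it does not already exist
DATABRICKS_SQL_TABLE: orders          # resolved as analytics.sales.orders
```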

Variables

| Name | Reference | Type | Required | Default | Options | Description |
| ---- | --------- | ---- | -------- | ------- | ------- | ----------- |
| Access Token | `DATABRICKS_SQL_ACCESS_TOKEN` | Password | | - | - | The access token generated in Databricks for programmatic access |
| Databricks Server Host | `DATABRICKS_SQL_SERVER_HOST` | Alphanumeric | | - | - | The URL address of the SQL warehouse |
| Warehouse HTTP Path | `DATABRICKS_SQL_HTTP_PATH` | Alphanumeric | | - | - | The extended path for the SQL warehouse |
| Catalog | `DATABRICKS_SQL_CATALOG` | Alphanumeric | | - | - | The optional catalog to connect to. If none is provided, this will default to the Hive Metastore |
| Schema | `DATABRICKS_SQL_SCHEMA` | Alphanumeric | | - | - | The optional schema to connect to. If none is provided, the Blueprint will connect to the `default` schema |
| Volume | `DATABRICKS_SQL_VOLUME` | Alphanumeric | | - | - | The name of the volume in which to stage the file |
| Table Name | `DATABRICKS_SQL_TABLE` | Alphanumeric | | - | - | The table in Databricks to write to |
| Data Types | `DATABRICKS_SQL_DATATYPES` | Alphanumeric | | - | - | The optional Spark data types to use in Databricks, provided in JSON format (see the example below the table). If none are provided, the data types will be inferred. |
| Insert Method | `DATABRICKS_SQL_INSERT_METHOD` | Select | | append | Append: `append`, Create or Replace: `replace` | Decides whether to append to an existing table or overwrite an existing one |
| File Type | `DATABRICKS_SQL_FILE_TYPE` | Select | | csv | CSV: `csv`, Parquet: `parquet` | The file type to load |
| Workflows File Match Type | `DATABRICKS_SQL_MATCH_TYPE` | Select | | exact_match | Exact Match: `exact_match`, Glob Match: `glob_match` | Determines whether the text in "Workflows File Name" matches one file exactly or multiple files via a glob pattern |
| Workflows Folder Name | `DATABRICKS_SQL_FOLDER_NAME` | Alphanumeric | | - | - | The optional name of the folder in Workflows where the file is located |
| Workflows File Name | `DATABRICKS_SQL_FILE_NAME` | Alphanumeric | | - | - | The name of the file in Workflows to load to Databricks |
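A minimal sketch of the Data Types input, assuming the JSON maps column names to Spark data type names, for a hypothetical three-column CSV (all column names and type choices here are illustrative):

```yaml
# Hypothetical mapping; assumes the accepted type names follow Spark's data type strings.
DATABRICKS_SQL_DATATYPES: '{"order_id": "int", "customer": "string", "total": "double"}'
```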

YAML

Below is the YAML template:

```yaml

source:
  template: Databricks SQL Warehouse - Upload File to Table
  inputs:
    DATABRICKS_SQL_ACCESS_TOKEN:
    DATABRICKS_SQL_SERVER_HOST:
    DATABRICKS_SQL_HTTP_PATH:
    DATABRICKS_SQL_CATALOG:
    DATABRICKS_SQL_SCHEMA:
    DATABRICKS_SQL_VOLUME:
    DATABRICKS_SQL_TABLE:
    DATABRICKS_SQL_DATATYPES:
    DATABRICKS_SQL_INSERT_METHOD: append
    DATABRICKS_SQL_FILE_TYPE: csv
    DATABRICKS_SQL_MATCH_TYPE: exact_match
    DATABRICKS_SQL_FOLDER_NAME:
    DATABRICKS_SQL_FILE_NAME:
  type: TEMPLATE
guardrails:
  retry_count: 1
  retry_wait: 0h0m0s
  runtime_cutoff: 1h0m0s
  exclude_exit_code_ranges:
    - 200
    - 202
    - 203
    - 204
    - 205
    - 206
    - 207
    - 208
    - 209
    - 210
    - 211
    - 249
```
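As a complete usage sketch, here is the template filled in with hypothetical values: it appends every CSV in the Workflows folder `exports` that matches the glob pattern `orders_*.csv` to `analytics.sales.orders`, staging through the `sales_staging` volume. Every value shown is a placeholder to replace with your own:

```yaml
source:
  template: Databricks SQL Warehouse - Upload File to Table
  inputs:
    DATABRICKS_SQL_ACCESS_TOKEN: <your-databricks-access-token>
    DATABRICKS_SQL_SERVER_HOST: dbc-a1b2c3d4-e5f6.cloud.databricks.com
    DATABRICKS_SQL_HTTP_PATH: /sql/1.0/warehouses/abc123def456
    DATABRICKS_SQL_CATALOG: analytics
    DATABRICKS_SQL_SCHEMA: sales
    DATABRICKS_SQL_VOLUME: sales_staging
    DATABRICKS_SQL_TABLE: orders
    DATABRICKS_SQL_DATATYPES:
    DATABRICKS_SQL_INSERT_METHOD: append
    DATABRICKS_SQL_FILE_TYPE: csv
    DATABRICKS_SQL_MATCH_TYPE: glob_match
    DATABRICKS_SQL_FOLDER_NAME: exports
    DATABRICKS_SQL_FILE_NAME: orders_*.csv
  type: TEMPLATE
```

Because `glob_match` is selected, `orders_*.csv` is treated as a pattern rather than a literal file name; with the default `exact_match`, the file name must match exactly.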