Mastering Azure Databricks Setup with Terraform: The Ultimate Guide

In this guide, you'll learn how to set up an Azure Databricks workspace using Terraform. We'll cover all the prerequisites and guide you through the entire process step by step.

Prerequisites

  • Install Terraform: Download and install Terraform on your machine.
  • Azure CLI: Install the Azure CLI and sign in:
    az login
  • Install Databricks CLI (Optional): If you'd like to interact with Databricks from the command line, install the Databricks CLI:
    pip install databricks-cli
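
Once everything is installed, you can quickly confirm the tools are available and that az login selected the subscription you expect (the databricks command will only exist if you installed the optional CLI):

terraform -version
az account show --output table
databricks --version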

Creating a Service Principal in Azure

Follow these steps to create a service principal in Azure:

  1. Open the Azure Portal and navigate to Microsoft Entra ID (formerly Azure Active Directory) > App registrations.
  2. Click on New registration and fill in the details:
    • Name: Choose a name for your service principal.
    • Supported account types: Select the appropriate option for your environment.
    • Redirect URI: Leave this blank or provide one if needed.
  3. Click Register to create the service principal.
  4. After registration, navigate to the Certificates & Secrets section.
  5. Create a new client secret by clicking on New client secret. Note down the secret value as it will not be shown again.
  6. In the Overview section, note down the Application (client) ID and the Directory (tenant) ID.
  7. Assign the appropriate roles to the service principal as described in the Grant Permissions section below.
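
Alternatively, if you prefer the command line, a single Azure CLI command can create the service principal, grant it the Contributor role, and print the credentials in one step (the name below is just an example):

az ad sp create-for-rbac --name "databricks-terraform-sp" --role Contributor --scopes /subscriptions/<YOUR_SUBSCRIPTION_ID>

In the JSON output, appId is the client ID, password is the client secret, and tenant is the tenant ID.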

Setting Up the Terraform Configuration

Create a Directory

Create a directory for your Terraform files and navigate into it:

mkdir terraform-azure-databricks
cd terraform-azure-databricks

Create main.tf

This is the primary Terraform configuration file. It declares the required providers and creates the resource group and Databricks workspace:


# main.tf
terraform {
  required_providers {
    # Add version constraints here if you want to pin provider versions
    azurerm = {
      source = "hashicorp/azurerm"
    }
    databricks = {
      source = "databricks/databricks"
    }
  }
}

provider "azurerm" {
  features {}
}

provider "databricks" {
  azure_workspace_resource_id = azurerm_databricks_workspace.dbr_workspace.id
  azure_client_id             = var.azure_client_id
  azure_client_secret         = var.azure_client_secret
  azure_tenant_id             = var.azure_tenant_id
}

resource "azurerm_resource_group" "dbr_rg" {
  name     = "databricks-rg"
  location = "eastus"
}

resource "azurerm_databricks_workspace" "dbr_workspace" {
  name                = "databricks-workspace"
  resource_group_name = azurerm_resource_group.dbr_rg.name
  location            = azurerm_resource_group.dbr_rg.location
  sku                 = "premium"
}

output "databricks_workspace_url" {
  value = azurerm_databricks_workspace.dbr_workspace.workspace_url
}
        

Create variables.tf

Define the necessary variables:


# variables.tf
variable "azure_client_id" {
  type        = string
  description = "Azure Service Principal Client ID"
}

variable "azure_client_secret" {
  type        = string
  description = "Azure Service Principal Client Secret"
}

variable "azure_tenant_id" {
  type        = string
  description = "Azure Tenant ID"
}
        

Create terraform.tfvars

Provide actual values for the variables. Because this file contains the client secret, keep it out of version control (for example, add it to .gitignore):


# terraform.tfvars
azure_client_id     = "<YOUR_AZURE_CLIENT_ID>"
azure_client_secret = "<YOUR_AZURE_CLIENT_SECRET>"
azure_tenant_id     = "<YOUR_AZURE_TENANT_ID>"
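
If you prefer not to store the secret on disk, Terraform also reads any variable from an environment variable named TF_VAR_<variable_name>, so you can export the same values instead of using terraform.tfvars:

export TF_VAR_azure_client_id="<YOUR_AZURE_CLIENT_ID>"
export TF_VAR_azure_client_secret="<YOUR_AZURE_CLIENT_SECRET>"
export TF_VAR_azure_tenant_id="<YOUR_AZURE_TENANT_ID>"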

Create outputs.tf

Define the output to see the workspace URL:


# outputs.tf
output "databricks_workspace_url" {
  description = "The URL of the created Databricks Workspace"
  value       = azurerm_databricks_workspace.dbr_workspace.workspace_url
}

Grant Permissions

Ensure the service principal has the appropriate permissions on your subscription:

Assign Contributor Role to the Service Principal

az role assignment create --assignee <YOUR_AZURE_CLIENT_ID> --role Contributor --scope /subscriptions/<YOUR_SUBSCRIPTION_ID>
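
If you don't have your subscription ID handy, you can look up the ID of the currently selected subscription with the Azure CLI:

az account show --query id --output tsv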

Assign Storage Blob Data Contributor Role

az role assignment create --assignee <YOUR_AZURE_CLIENT_ID> --role "Storage Blob Data Contributor" --scope /subscriptions/<YOUR_SUBSCRIPTION_ID>
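
To confirm both roles were assigned, list the role assignments for the service principal:

az role assignment list --assignee <YOUR_AZURE_CLIENT_ID> --output table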

Initialize and Apply Terraform

Initialize Terraform

terraform init
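
After initialization succeeds, you can optionally check that the configuration is syntactically valid before planning:

terraform validate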

Plan the Infrastructure Changes

terraform plan

Apply the Plan

terraform apply
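
Terraform shows the planned changes and asks for confirmation; type yes to proceed. For a non-interactive run, for example in CI, you can save the plan to a file and apply exactly that plan:

terraform plan -out=tfplan
terraform apply tfplan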

Check the Output

After the apply completes successfully, Terraform prints the Databricks workspace URL:

databricks_workspace_url = "adb-<workspace-id>.<random-number>.azuredatabricks.net"
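
You can print this value again at any time without re-running apply:

terraform output databricks_workspace_url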
