Mastering Azure Databricks Setup with Terraform: The Ultimate Guide
In this guide, you'll learn how to set up an Azure Databricks workspace using Terraform. We'll install the prerequisites, create a service principal, write the Terraform configuration, grant permissions, and apply the plan.
Prerequisites
- Install Terraform: Download and install Terraform on your machine.
- Azure CLI: Install the Azure CLI and sign in:
az login
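- Install Databricks CLI (Optional): If you'd like to interact with Databricks from the command line, install the Databricks CLI. The pip package below is the legacy Python-based CLI; Databricks also publishes a newer standalone CLI that is installed separately: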
pip install databricks-cli
Creating a Service Principal in Azure
Follow these steps to create a service principal in Azure:
- Open the Azure Portal and navigate to Microsoft Entra ID (formerly Azure Active Directory) > App registrations.
- Click on New registration and fill in the details:
  - Name: Choose a name for your service principal.
  - Supported account types: Select the appropriate option for your environment.
  - Redirect URI: Leave this blank or provide one if needed.
- Click Register to create the service principal.
- After registration, navigate to the Certificates & Secrets section.
- Create a new client secret by clicking on New client secret. Copy the secret's Value (not its Secret ID) right away, as it will not be shown again.
- In the Overview section, note down the Application (client) ID and the Directory (tenant) ID.
- Assign the appropriate roles to the service principal as described in the Grant Permissions section below.
Setting Up Terraform Script
Create a Directory
Create a directory for your Terraform files and navigate into it:
mkdir terraform-azure-databricks
cd terraform-azure-databricks
Create main.tf
This is the primary Terraform configuration file:
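# main.tf
terraform {
  required_providers {
    azurerm = {
      source = "hashicorp/azurerm"
    }
    # The Databricks provider is not in the hashicorp namespace,
    # so terraform init needs its source declared explicitly.
    databricks = {
      source = "databricks/databricks"
    }
  }
}

# Uses the az login session. Note: azurerm 4.x also expects a subscription ID,
# set via subscription_id here or the ARM_SUBSCRIPTION_ID environment variable.
provider "azurerm" {
  features {}
}

# Authenticates to the workspace as the service principal.
provider "databricks" {
  azure_workspace_resource_id = azurerm_databricks_workspace.dbr_workspace.id
  azure_client_id             = var.azure_client_id
  azure_client_secret         = var.azure_client_secret
  azure_tenant_id             = var.azure_tenant_id
}

resource "azurerm_resource_group" "dbr_rg" {
  name     = "databricks-rg"
  location = "eastus"
}

resource "azurerm_databricks_workspace" "dbr_workspace" {
  name                = "databricks-workspace"
  resource_group_name = azurerm_resource_group.dbr_rg.name
  location            = azurerm_resource_group.dbr_rg.location
  sku                 = "premium"
}

# The databricks_workspace_url output lives in outputs.tf only;
# declaring it here as well would fail with a duplicate output error.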
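The databricks provider block in main.tf is configured, but nothing in the configuration uses it yet. As a minimal sketch of a first use, the fragment below reads back the identity the provider authenticated as. The `databricks_current_user` data source belongs to the Databricks provider; the names `me` and `databricks_authenticated_user` are illustrative, and the `depends_on` is an assumption made because the workspace is created in the same configuration.

```hcl
# Hypothetical addition to main.tf: check that the databricks provider
# can authenticate against the newly created workspace.
data "databricks_current_user" "me" {
  # Defer the lookup until the workspace exists.
  depends_on = [azurerm_databricks_workspace.dbr_workspace]
}

output "databricks_authenticated_user" {
  description = "Identity the databricks provider authenticated as"
  value       = data.databricks_current_user.me.user_name
}
```

This only succeeds once the service principal has been granted access as described in the Grant Permissions section.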
Create variables.tf
Define the necessary variables:
# variables.tf
variable "azure_client_id" {
  type        = string
  description = "Azure Service Principal Client ID"
}

variable "azure_client_secret" {
  type        = string
  description = "Azure Service Principal Client Secret"
  sensitive   = true # keep the secret out of plan and apply output
}

variable "azure_tenant_id" {
  type        = string
  description = "Azure Tenant ID"
}
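main.tf hard-codes the region ("eastus") and the workspace SKU ("premium"). If you would rather set them per environment, they could be exposed as variables too. This is a sketch only: the variable names `location` and `workspace_sku` are hypothetical and not part of the original configuration.

```hcl
# Hypothetical additions to variables.tf

variable "location" {
  type        = string
  description = "Azure region for the resource group and workspace"
  default     = "eastus"
}

variable "workspace_sku" {
  type        = string
  description = "Databricks workspace pricing tier"
  default     = "premium"

  # Reject values that azurerm_databricks_workspace does not accept.
  validation {
    condition     = contains(["standard", "premium", "trial"], var.workspace_sku)
    error_message = "The workspace_sku value must be one of: standard, premium, trial."
  }
}
```

With these defined, main.tf could use `location = var.location` on the resource group and `sku = var.workspace_sku` on the workspace instead of the literals.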
Create terraform.tfvars
Provide actual values for the variables. This file holds the client secret in plain text, so keep it out of version control:
# terraform.tfvars
azure_client_id = "<YOUR_AZURE_CLIENT_ID>"
azure_client_secret = "<YOUR_AZURE_CLIENT_SECRET>"
azure_tenant_id = "<YOUR_AZURE_TENANT_ID>"
Create outputs.tf
Define the output to see the workspace URL:
# outputs.tf
output "databricks_workspace_url" {
  description = "The URL of the created Databricks Workspace"
  value       = azurerm_databricks_workspace.dbr_workspace.workspace_url
}
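The `workspace_url` attribute is a bare hostname, so an output that prepends the scheme can be handy for pasting into a browser or another tool. The output names below are illustrative, not part of the original configuration.

```hcl
# Hypothetical additions to outputs.tf

output "databricks_workspace_https_url" {
  description = "Workspace URL with the https:// scheme prepended"
  value       = "https://${azurerm_databricks_workspace.dbr_workspace.workspace_url}"
}

output "databricks_workspace_resource_id" {
  description = "Azure resource ID of the workspace"
  value       = azurerm_databricks_workspace.dbr_workspace.id
}
```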
Grant Permissions
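Ensure your service principal has the appropriate permissions. The account running these commands needs the Owner or User Access Administrator role on the target scope. The commands below assign roles at subscription scope; a narrower scope, such as a single resource group, also works if that is all the service principal needs: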
Assign Contributor Role to the Service Principal
az role assignment create --assignee <YOUR_AZURE_CLIENT_ID> --role Contributor --scope /subscriptions/<YOUR_SUBSCRIPTION_ID>
Assign Storage Blob Data Contributor Role
az role assignment create --assignee <YOUR_AZURE_CLIENT_ID> --role "Storage Blob Data Contributor" --scope /subscriptions/<YOUR_SUBSCRIPTION_ID>
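The same Contributor assignment could also be managed from Terraform instead of the CLI. The fragment below is a sketch under assumptions: it relies on the `hashicorp/azuread` provider to look up the service principal's object ID, the resource and data source labels are hypothetical, and older azuread releases name the lookup argument `application_id` rather than `client_id`.

```hcl
# Hypothetical alternative to the az role assignment commands.

# Subscription of the current az login session.
data "azurerm_subscription" "current" {}

# Role assignments need the service principal's object ID, not its client ID.
data "azuread_service_principal" "terraform_sp" {
  client_id = var.azure_client_id # "application_id" in older azuread versions
}

resource "azurerm_role_assignment" "sp_contributor" {
  scope                = data.azurerm_subscription.current.id
  role_definition_name = "Contributor"
  principal_id         = data.azuread_service_principal.terraform_sp.object_id
}
```

Use one approach or the other: assigning the same role at the same scope from both the CLI and Terraform will conflict.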
Initialize and Apply Terraform
Initialize Terraform
terraform init
Plan the Infrastructure Changes
terraform plan
Apply the Plan
terraform apply
Check the Output
After a successful apply, Terraform prints the Databricks workspace URL. The azurerm provider returns it as a hostname without the https:// scheme, in this form:
databricks_workspace_url = "adb-<workspace-id>.<random-number>.azuredatabricks.net"