Managed Identities in Azure make basic authentication and authorization between resources in your subscription significantly easier. Recently I had a customer who wanted to authenticate from a dbt container running in Azure Container Instances to their Databricks workspace. Their original plan was to use Key Vault and service principals, but I presented them with a solution built on Managed Identities that was much simpler.

The hard part, though, is that Managed Identity authentication from an Azure resource to Databricks wasn’t documented, and it takes a few tricks to make it work.

Cutting to the chase: assuming you have set up the Managed Identity as a service principal in Databricks and assigned it to the VM, Container Instance, or Batch instance, run the commands below from within that resource.

# Client ID (GUID) of the user-assigned Managed Identity attached to this resource
export client_id="YourUserManagedIdentityClientIdGUID"
# Ask the metadata endpoint for a token scoped to the Azure Databricks resource
export DATABRICKS_AAD_TOKEN="$(curl -s -H "Metadata: true" "http://169.254.169.254/metadata/identity/oauth2/token?api-version=2018-02-01&resource=2ff814a6-3304-4ab8-85cb-cd0e6f879c1d&client_id=$client_id" | jq -r '.access_token')"
# The CLI picks up DATABRICKS_AAD_TOKEN; you can also use the env var anywhere else you need to auth
databricks configure --aad-token
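As an illustration of using the token outside the CLI, here is a minimal sketch that sends it as a bearer token to the Databricks REST API. The function name `databricks_get` and the `DATABRICKS_HOST` variable are assumptions made for this example (set `DATABRICKS_HOST` to your own workspace URL); they are not part of the original setup.

```shell
# Hypothetical helper: GET a Databricks REST API path with the AAD token.
# Assumes DATABRICKS_HOST holds your workspace URL and DATABRICKS_AAD_TOKEN
# was exported as shown above.
databricks_get() {
  local path="$1"
  if [ -z "${DATABRICKS_HOST:-}" ] || [ -z "${DATABRICKS_AAD_TOKEN:-}" ]; then
    echo "DATABRICKS_HOST and DATABRICKS_AAD_TOKEN must be set" >&2
    return 1
  fi
  # Strip a trailing slash from the host so the joined URL stays clean.
  curl -sS --fail \
    -H "Authorization: Bearer ${DATABRICKS_AAD_TOKEN}" \
    "${DATABRICKS_HOST%/}${path}"
}
```

For example, `databricks_get /api/2.0/clusters/list | jq -r '.clusters[]?.cluster_name'` should list cluster names if the identity has access to the workspace.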

Getting the token this way takes out a ton of dependencies and makes the process of obtaining an OAuth token for Databricks seamless. The important part of the call is the resource string, which isn’t well documented. The resource string for Azure Databricks is 2ff814a6-3304-4ab8-85cb-cd0e6f879c1d, if you ever need it.
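If you fetch the token from a script, it can help to fail loudly when the metadata endpoint returns an error instead of a token. The sketch below wraps the same call in two functions; the names `imds_token_url` and `fetch_databricks_token` are made up for this example, and it assumes `curl` and `jq` are installed, as in the commands above.

```shell
# Resource string for Azure Databricks, as given in the text.
DATABRICKS_RESOURCE_ID="2ff814a6-3304-4ab8-85cb-cd0e6f879c1d"

# Build the metadata-endpoint token URL for a user-assigned identity's client ID.
imds_token_url() {
  local client_id="$1"
  printf 'http://169.254.169.254/metadata/identity/oauth2/token?api-version=2018-02-01&resource=%s&client_id=%s' \
    "$DATABRICKS_RESOURCE_ID" "$client_id"
}

# Print the access token on success; return non-zero if none came back.
fetch_databricks_token() {
  local token
  # '// empty' makes jq print nothing when access_token is missing or null.
  token="$(curl -sS -H 'Metadata: true' "$(imds_token_url "$1")" \
    | jq -r '.access_token // empty')" || return 1
  if [ -z "$token" ]; then
    echo "No access_token in response; check the identity assignment and client ID" >&2
    return 1
  fi
  printf '%s\n' "$token"
}
```

A possible usage is `DATABRICKS_AAD_TOKEN="$(fetch_databricks_token "$client_id")" && export DATABRICKS_AAD_TOKEN`, which only exports the variable when a token was actually returned.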

Another important item this brings up is the internal metadata endpoint (the Azure Instance Metadata Service, at 169.254.169.254) that Azure compute resources can reach with plain HTTP calls. I highly suggest getting familiar with what the endpoint can do, as it can greatly simplify common tasks inside VMs and containers.
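As one small example of what else the endpoint offers, the sketch below reads a single value from its instance metadata path as plain text. The function name `imds_instance_value` is invented for this example, the `api-version` shown is one published version (newer ones may exist), and the instance path is documented for virtual machines, so check whether your compute type exposes it.

```shell
# Hypothetical helper: read one leaf value from the instance metadata path.
# Example leaves: compute/location, compute/name
imds_instance_value() {
  local leaf="$1"
  # --noproxy keeps the request off any configured HTTP proxy,
  # since the metadata endpoint is only reachable locally.
  curl -sS --fail --noproxy '*' -H 'Metadata: true' \
    "http://169.254.169.254/metadata/instance/${leaf}?api-version=2021-02-01&format=text"
}
```

On a VM, `imds_instance_value compute/location` should print the Azure region the machine is running in.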
