Unity Catalog
Unity Catalog is the unified, central governance layer built into Databricks. When it is enabled for a workspace, Unity Catalog sits beneath every data interaction in that workspace automatically: it enforces access control when you query a table, tracks lineage as data moves, logs activity for auditing, and more.
Unity Catalog Object Model
Every asset you govern under Unity Catalog is modeled as a securable object: an object on which you can grant permissions to users, service principals, and groups.
Data assets such as tables, views, models, volumes, and functions follow a three-level namespace (catalog.schema.object). Tables and volumes can be managed, where Unity Catalog handles both governance and the underlying file storage lifecycle, or external, where Unity Catalog handles governance only. Other objects, such as storage credentials, external locations, connections, and shares, sit directly under the metastore.
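As a rough sketch of what granting permissions on securable objects looks like, the statements below give a group read access to one table through the three-level namespace. The names dev, bronze, emp, and data_analysts are hypothetical and only there for illustration.

```sql
-- Hypothetical names: catalog dev, schema bronze, table emp, group data_analysts.
-- A principal needs USE CATALOG and USE SCHEMA on the parents
-- before SELECT on the table is of any use.
GRANT USE CATALOG ON CATALOG dev TO `data_analysts`;
GRANT USE SCHEMA ON SCHEMA dev.bronze TO `data_analysts`;
GRANT SELECT ON TABLE dev.bronze.emp TO `data_analysts`;

-- Review what has been granted on the table.
SHOW GRANTS ON TABLE dev.bronze.emp;

-- The table is then addressed as catalog.schema.object.
SELECT * FROM dev.bronze.emp;
```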

When you create a table in Unity Catalog, it is created by default as a managed Delta table. A managed table is a database table where the platform (such as Databricks, Apache Hive, or Microsoft Fabric) handles both the data files and the table metadata (schema, column names).
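-- No LOCATION clause, so this is a managed table.
CREATE TABLE emp (
  emp_id INT,
  emp_name STRING, -- VARCHAR needs a length, e.g. VARCHAR(100); STRING does not
  dept_id INT
);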
To create an external table, you must specify a LOCATION.
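-- The LOCATION clause makes this an external table.
-- The path is a placeholder; abfss URIs take the form
-- abfss://<container>@<storage-account>.dfs.core.windows.net/<path>
CREATE TABLE emp (
  emp_id INT,
  emp_name STRING,
  dept_id INT
)
LOCATION 'abfss://<container>@<storage-account>.dfs.core.windows.net/<path>/emp';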
Read more about the difference between managed and external tables here.
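If you are unsure which kind of table you ended up with, one way to check is to describe it. This is a small sketch that assumes the emp table from the examples above exists in the current catalog and schema.

```sql
-- Assumes a table named emp exists in the current catalog and schema.
-- In the detailed section of the output, the Type row reads MANAGED or
-- EXTERNAL, and the Location row shows where the data files live.
DESCRIBE TABLE EXTENDED emp;
```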
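For a managed table, the data is stored at the lowest level of the hierarchy below that has a managed storage location defined. If the schema has a location, the data is stored under the schema; otherwise the catalog location is used, and if that is not set either, the metastore location. A managed table does not take a location of its own: specifying LOCATION on the table makes it external.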
metastore
|
catalog
|
schema
|
table
If the metastore has no storage location specified, the catalog location becomes required: you have to specify it when creating the catalog so that managed tables have somewhere to store their data. Conversely, if the metastore location is specified, specifying a catalog location becomes optional.
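To make the fallback concrete, here is a minimal sketch with one schema that has its own managed location and one that does not. It assumes a catalog named dev already exists with a managed location, and that the placeholder path is covered by an external location you are allowed to use. All names and paths are hypothetical.

```sql
-- Hypothetical setup: catalog dev already exists with a managed location.

-- Schema with its own managed location.
CREATE SCHEMA IF NOT EXISTS dev.bronze
  MANAGED LOCATION 'abfss://<container>@<storage-account>.dfs.core.windows.net/<schema-path>';

-- Schema without a location of its own.
CREATE SCHEMA IF NOT EXISTS dev.silver;

-- Managed table: data files land under the schema's managed location.
CREATE TABLE dev.bronze.emp (emp_id INT, emp_name STRING, dept_id INT);

-- Managed table: data files fall back to the catalog's managed location
-- (or the metastore's, if the catalog has none).
CREATE TABLE dev.silver.emp (emp_id INT, emp_name STRING, dept_id INT);

-- Compare the Location rows of the two tables.
DESCRIBE TABLE EXTENDED dev.bronze.emp;
DESCRIBE TABLE EXTENDED dev.silver.emp;
```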
How can you create a catalog with and without an external location?
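One possible answer, as a sketch: without a location the catalog relies on the metastore-level storage, and with a location you first register the path as an external location and then point the catalog at it. The names dev_default, dev_catalog_loc, and my_storage_credential are hypothetical, and the storage credential is assumed to exist already.

```sql
-- Without a location: only works if the metastore has a storage location.
CREATE CATALOG IF NOT EXISTS dev_default;

-- With a location: first register the path as an external location.
-- my_storage_credential is a hypothetical, pre-existing storage credential.
CREATE EXTERNAL LOCATION IF NOT EXISTS dev_catalog_loc
  URL 'abfss://<container>@<storage-account>.dfs.core.windows.net/<path>'
  WITH (STORAGE CREDENTIAL my_storage_credential);

-- Then create the catalog with a managed location inside that path.
CREATE CATALOG IF NOT EXISTS dev_ext
  MANAGED LOCATION 'abfss://<container>@<storage-account>.dfs.core.windows.net/<path>/catalog';
```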
For one region, it is advised to have only one metastore. Why?
Key points
- To drop a catalog, the catalog must be empty, that is, it must contain no schemas (by default two schemas are created, default and information_schema). Likewise, a schema must be empty before it can be dropped. To drop a non-empty catalog or schema, use the CASCADE keyword.
  DROP CATALOG <name> CASCADE;
- To create a catalog at an external location, first create an external location object, then create the catalog using the UI or SQL.
  CREATE CATALOG `dev_ext` MANAGED LOCATION 'abfss://data@adbpvcodes.dfs.core.windows.net/adb/catalog';
- The main difference between managed and external tables in the Hive metastore and those in Unity Catalog is what happens when a managed table is dropped: under the Hive metastore the data is deleted immediately, whereas under Unity Catalog it is retained for 7 days.
- To view the dropped tables:
  USE CATALOG DEV;
  SHOW TABLES DROPPED IN BRONZE;
- To undrop (restore) a dropped table in Unity Catalog:
  UNDROP TABLE dev.bronze.sales_managed;
  -- OR
  UNDROP TABLE WITH ID '661142fa-eb6e-4833-85d4-6c0fdbef1930';