A database refactoring is a simple change to a database schema that improves its design while retaining both its behavioral and informational semantics. Database refactoring does not change the way data is interpreted or used and does not fix bugs or add new functionality. Every refactoring to a database leaves the system in a working state, thus not causing maintenance lags, provided the meaningful data exists in the production environment.
A database refactoring is conceptually more difficult than a code refactoring; code refactorings only need to maintain behavioral semantics while database refactorings also must maintain informational semantics.
A database schema is typically refactored for one of several reasons:
In 2006 Scott Ambler, Pramod Sadalage[1] describe the following categories of database refactoring:[2]
A change which improves the overall manner in which external programs interact with a database.Methods of Architecture Refactoring category: Add CRUD Methods; Add Mirror Table; Add Read Method; Encapsulate Table With View; Introduce Calculation Method; Introduce Index; Introduce Read Only Table; Migrate Method From Database; Migrate Method To Database; Replace Method(s) With View; Replace View With Methods(s); Use Official Data Source.
A change to the table structure of your database schema.Methods of Structural Refactoring category: Drop Column; Drop Table; Drop View; Introduce Calculated Column; Introduce Surrogate Key; Merge Columns; Merge Tables; Move Column; Rename Column; Rename Table; Rename View; Replace LOB With Table; Replace Column; Replace One-To-Many With Associative Tables; Replace Surrogate Key With Natural Key; Split Column; Split Table.
A change which improves and/or ensures the consistency and usage of the values stored within the database.Methods of Data Quality Refactoring category: Add Lookup Table; Apply Standard Codes; Apply Standard Type; Consolidate Key Strategy; Drop Column Constraint; Drop Default Value; Drop Non Nullable; Introduce Column Constraint; Introduce Common Format; Introduce Default Value; Make Column Non Nullable; Move Data; Replace Type Code With Property Flags.
A change which ensures that a referenced row exists within another table and/or that ensures that a row which is no longer needed is removed appropriately.Methods of Referential Integrity Refactoring category: Add Foreign Key Constraint; Add Trigger for Calculated Column; Drop Foreign Key Constraint; Introduce Cascading Delete; Introduce Hard Delete; Introduce Soft Delete; Introduce Trigger for History.
A change which changes the semantics of your database schema by adding new elements to it or by modifying existing elements.Methods of Transformation category: Insert Data; Introduce New Column; Introduce New Table; Introduce View; Update Data.
A change which improves the quality of a stored procedure, stored function, or trigger.Methods of the Method Refactoring category: Parameterize Methods; Remove Parameter; Rename Method; Reorder Parameters; Replace Parameter with Explicit Methods; Consolidate Conditional Expression; Decompose Conditional; Extract Method; Introduce Variable; Remove Control Flag; Remove Middle Man; Replace Literal with Table Lookup; Replace Nested; Conditional with Guard Clauses; Split Temporary Variable; Substitute Algorithm.
In 2019 Vladislav Struzik supplemented the categories of database refactoring with a new one:[3]
A change which relates to data access. Methods of the Access Refactoring category:[4] [5] Change Authentication Attributes; Revoke Authorization Privileges; Grant Authorization Privileges; Extract Database Schema; Merge Database Schemas.
The process of database refactoring is the act of applying database refactorings to evolve an existing database schema (database refactoring is a core practice of evolutionary database design). There are three considerations that need to be taken into account: