One of the persistent myths in data engineering is that SQL is portable.
While SQL is based on ANSI standards, every database engine implements its own dialect. These differences appear in:
- pagination syntax
- quoting rules
- data types
- functions
- identifier handling
- DDL semantics
These incompatibilities become immediately obvious during:
- database migrations
- cross-database development
- platform modernization projects
To help streamline these workflows, I built MADSQL, a deterministic command-line tool that converts SQL scripts between dialects using SQLGlot.
GitHub Repository:
https://github.com/oramatt/madsql
Why SQL Dialect Translation Matters
Consider something as simple as limiting query results.
SQL Server
SELECT TOP 3 [name] FROM dbo.users;
PostgreSQL
SELECT name FROM users LIMIT 3;
Oracle
SELECT name FROM users FETCH FIRST 3 ROWS ONLY;
These statements all accomplish the same goal, but they are not interchangeable across database platforms.
Now imagine converting an application containing thousands of SQL statements between databases. This is where dialect incompatibilities quickly become a major engineering challenge.
MADSQL was created to help automate part of that work.
What MADSQL Is
MADSQL is a deterministic command-line tool for translating SQL scripts between database dialects using the SQLGlot parser and transpiler.
It supports workflows such as:
- converting SQL between dialects
- splitting large SQL files into individual statements
- processing entire directory trees of SQL scripts
- generating deterministic output suitable for CI pipelines
- producing structured error reports and run logs
MADSQL is designed specifically for batch SQL transformation workflows.
Core Design Goals
Deterministic Output
One of the key guarantees of MADSQL is deterministic output.
Given the same:
- input SQL
- SQLGlot version
- CLI arguments
MADSQL will generate identical output every time.
This makes the tool ideal for:
- automated migration pipelines
- CI/CD workflows
- reproducible SQL transformations
CLI-First Design
MADSQL is intentionally designed as a terminal-based tool.
There is no GUI and there are currently no plans to add one. The CLI design makes it easy to integrate into:
- automation scripts
- migration frameworks
- developer workflows
- batch SQL processing pipelines
Batch SQL Processing
MADSQL can process SQL from multiple input sources:
- single SQL files
- directory trees
- glob patterns
- stdin streams
Example:
madsql convert --source tsql --target postgres "./sql/**/*.sql" --out ./converted
This allows entire repositories of SQL code to be processed in a single run.
Example Usage
Convert a Single Statement
echo "SELECT TOP 3 [name] FROM dbo.users;" | madsql convert --source tsql --target postgres
Convert a SQL File
madsql convert --source postgres --target mysql ./input.sql
Convert a Directory of SQL Scripts
madsql convert --source postgres --target mysql --in ./sql --out ./converted
When converting directories, MADSQL preserves the relative directory structure in the output folder.
Splitting SQL Statements
Another useful feature is the ability to split SQL scripts into individual statements.
madsql split-statements --in ./input.sql --out ./split
This produces deterministic files such as:
0001_stmt.sql0002_stmt.sql0003_stmt.sql
MADSQL attempts to detect statement boundaries using SQLGlot first. If parsing fails, it automatically falls back to sqlparse for split-only boundary detection.
Deterministic File Naming
When splitting or converting SQL, MADSQL uses deterministic naming.
Example output:
0001_stmt.oracle.sql0002_stmt.oracle.sql
If a statement fails conversion, later statements retain their original index numbers.
Example:
0001_stmt.oracle.sql0003_stmt.oracle.sql
This makes it easy to correlate outputs with the original SQL source.
Error Handling and Reporting
MADSQL provides structured error handling designed for automation workflows.
Write JSON error reports:
--errors errors.json
Stop immediately on failure:
--fail-fast
Exit codes are predictable:
| Exit Code | Meaning |
|---|---|
| 0 | success |
| 1 | completed with statement errors |
| 2 | fatal CLI misuse |
This makes MADSQL safe to integrate into automated pipelines.
Markdown Reports
MADSQL can also generate logs and structured reports.
Markdown reports:
--report
Reports include:
- command invocation
- source and target dialects
- success and failure counts
- overall success rate
Reports are timestamped and written to the output directory.
Example Migration Workflow
MADSQL fits naturally into database migration pipelines.
Application SQL │ ▼MADSQL Convert │ ▼Target Dialect SQL │ ▼Database Validation
Because the tool is deterministic and script-friendly, it works well as part of larger migration automation systems.
Why I Built MADSQL
There are already SQL translators available.
MADSQL exists because I wanted a tool that supported the specific workflows I use when working on migration projects.
In particular:
- deterministic outputs
- reproducible CLI workflows
- batch processing
- structured error reporting
- reliable statement splitting
In many migration projects, engineers spend significant time rewriting SQL manually. MADSQL helps automate parts of that process.
Project Repository
https://github.com/oramatt/madsql
The project is licensed under the Universal Permissive License (UPL).
Final Thoughts
SQL dialect differences are one of the most underestimated challenges in database migrations.
Even small differences between database engines can require large-scale SQL refactoring across applications.
MADSQL provides a practical CLI tool to help automate SQL dialect translation and reduce manual effort in cross-database development and modernization projects.
Leave a comment