SQL migrations are underestimated until they aren’t. A few dozen queries can usually be handled manually. A few thousand across multiple applications, written by multiple teams, targeting a database engine with different quoting rules, pagination syntax, and type semantics, that’s a different problem entirely.
I built madsql to solve the specific workflow problems I kept running into on Oracle migration projects: inconsistent manual rewrites, no reproducibility between runs, no clean way to break large scripts apart for review, and no way to recover a working schema when the source DDL was incomplete or missing altogether.
This post is the starting point for a four-part series that walks through madsql from the ground up, with practical examples for each command in the order you’d actually reach for them on a migration project.
GitHub: https://github.com/oramatt/madsql
The Migration Problem madsql Addresses
Every major database engine implements its own SQL dialect. The differences are well-known but they compound quickly at scale:
- Pagination:
TOP 3in T-SQL,LIMIT 3in PostgreSQL,FETCH FIRST 3 ROWS ONLYin Oracle - Identifier quoting: square brackets in T-SQL, backticks in MySQL, double quotes in Oracle and PostgreSQL
- Data types:
TEXTvsCLOB,GEOGRAPHYvsSDO_GEOMETRY,DOUBLEvsNUMBER - Schema and user semantics: Oracle users are schemas; other engines separate the two concepts
- Statement terminators: T-SQL
GObatch separators, Oracle/block terminators
Manual rewriting handles one-offs. It falls apart on batch work because it isn’t reproducible, two engineers rewriting the same script will produce different output, and rerunning the same engineer’s process a week later will too.
madsql addresses this with three design commitments: deterministic output, CLI-first operation, and transparent dependency on well-maintained open source parsers (SQLGlot and sqlparse).
The Three Commands
madsql ships three commands. The three of them do the core migration work and each is designed for a specific phase of the workflow.
split-statements
Splits a multi-statement SQL file into one file per statement without changing the dialect. Use this first, before conversion, to make large scripts reviewable, isolate parse problems to a specific statement index, and prepare input for selective replay.
madsql split-statements --source tsql --in ./migration.sql --out ./split
infer-schema
Reads a SQL workload, queries, DML, DDL, or any mix, and infers CREATE TABLE DDL from it. Use this when you need a starter schema before migration begins, when the source DDL is incomplete, or when all you have is a query workload and no formal schema definition.
madsql infer-schema --source tsql --target oracle --create-user --create-user-password 'ChangeMe123' ./workload.sql
convert
Converts SQL from a source dialect to a target dialect. Handles single files, directory trees, glob patterns, and stdin. This is the main event, but it works best after you’ve split your scripts and confirmed you have a working schema to target.
madsql convert --source tsql --target oracle --in ./sql --out ./converted --infer-schema --report
Why This Order Matters
The series covers the commands in the order they’re most useful on a real migration project, not alphabetical order.
split-statements first because you should understand what’s in your SQL before you convert it. Splitting a large vendor script into individually-numbered statement files takes seconds and immediately tells you how many statements you’re dealing with, whether the parser can handle your dialect, and which statement index to investigate when something breaks later.
infer-schema second because migrations frequently begin with incomplete or missing DDL. Before you can validate converted SQL against a target database, you need tables to run against. Inferring a starter schema from the workload itself gives you a working foundation faster than reconstructing DDL manually.
convert last because it’s the most flag-rich command and it builds on both of the above. The convert post covers full batch workflows, Oracle-specific output, schema side artifacts, and CI integration.
Installation
Binary releases for macOS, Linux, and Windows (arm64 and x86_64) are at https://github.com/oramatt/madsql/releases.
To install from source:
git clone https://github.com/oramatt/madsql.gitcd madsqlpython -m venv .venvsource .venv/bin/activate # Windows: .venv\Scripts\activatepip install -U pippip install -e .madsql --version
Expected output:
madsql 0.11.2python 3.13.5sqlglot 29.0.1sqlparse 0.5.3sql-metadata 2.20.0simple-ddl-parser 1.10.0
If the madsql console script isn’t on your PATH, use python3 -m madsql <command> throughout.
Before anything else, confirm the supported dialect names for your installed version:
madsql dialects
Dialect names are exact, tsql not mssql, oracle not ora. Getting one wrong produces a fatal CLI error before any processing begins.
The Series
Post 1 — split-statements: Break SQL scripts apart before you do anything else
Post 2 — infer-schema: Recover DDL from a query workload
Post 3 — convert: Batch SQL dialect conversion for Oracle migration
Each post covers the command in depth with Oracle-primary examples, when-to-use guidance, common pitfalls, and CI integration notes.
The repository README and built-in help are also worth reading alongside the series:
madsql --helpmadsql split-statements --helpmadsql infer-schema --helpmadsql convert --help
Leave a comment