OraMatt: YABAOracle

Yet Another Blog About Oracle

MADSQL: A Deterministic CLI Tool for Translating SQL Between Database Dialects


SQL migrations are underestimated until they aren’t. A few dozen queries can usually be handled manually. A few thousand across multiple applications, written by multiple teams, targeting a database engine with different quoting rules, pagination syntax, and type semantics, that’s a different problem entirely.

I built madsql to solve the specific workflow problems I kept running into on Oracle migration projects: inconsistent manual rewrites, no reproducibility between runs, no clean way to break large scripts apart for review, and no way to recover a working schema when the source DDL was incomplete or missing altogether.

This post is the starting point for a four-part series that walks through madsql from the ground up, with practical examples for each command in the order you’d actually reach for them on a migration project.

GitHub: https://github.com/oramatt/madsql


The Migration Problem madsql Addresses

Every major database engine implements its own SQL dialect. The differences are well-known but they compound quickly at scale:

  • Pagination: TOP 3 in T-SQL, LIMIT 3 in PostgreSQL, FETCH FIRST 3 ROWS ONLY in Oracle
  • Identifier quoting: square brackets in T-SQL, backticks in MySQL, double quotes in Oracle and PostgreSQL
  • Data types: TEXT vs CLOB, GEOGRAPHY vs SDO_GEOMETRY, DOUBLE vs NUMBER
  • Schema and user semantics: Oracle users are schemas; other engines separate the two concepts
  • Statement terminators: T-SQL GO batch separators, Oracle / block terminators

Manual rewriting handles one-offs. It falls apart on batch work because it isn’t reproducible, two engineers rewriting the same script will produce different output, and rerunning the same engineer’s process a week later will too.

madsql addresses this with three design commitments: deterministic output, CLI-first operation, and transparent dependency on well-maintained open source parsers (SQLGlot and sqlparse).


The Three Commands

madsql ships three commands. The three of them do the core migration work and each is designed for a specific phase of the workflow.

split-statements

Splits a multi-statement SQL file into one file per statement without changing the dialect. Use this first, before conversion, to make large scripts reviewable, isolate parse problems to a specific statement index, and prepare input for selective replay.

madsql split-statements --source tsql --in ./migration.sql --out ./split

infer-schema

Reads a SQL workload, queries, DML, DDL, or any mix, and infers CREATE TABLE DDL from it. Use this when you need a starter schema before migration begins, when the source DDL is incomplete, or when all you have is a query workload and no formal schema definition.

madsql infer-schema --source tsql --target oracle --create-user --create-user-password 'ChangeMe123' ./workload.sql

convert

Converts SQL from a source dialect to a target dialect. Handles single files, directory trees, glob patterns, and stdin. This is the main event, but it works best after you’ve split your scripts and confirmed you have a working schema to target.

madsql convert --source tsql --target oracle --in ./sql --out ./converted --infer-schema --report

Why This Order Matters

The series covers the commands in the order they’re most useful on a real migration project, not alphabetical order.

split-statements first because you should understand what’s in your SQL before you convert it. Splitting a large vendor script into individually-numbered statement files takes seconds and immediately tells you how many statements you’re dealing with, whether the parser can handle your dialect, and which statement index to investigate when something breaks later.

infer-schema second because migrations frequently begin with incomplete or missing DDL. Before you can validate converted SQL against a target database, you need tables to run against. Inferring a starter schema from the workload itself gives you a working foundation faster than reconstructing DDL manually.

convert last because it’s the most flag-rich command and it builds on both of the above. The convert post covers full batch workflows, Oracle-specific output, schema side artifacts, and CI integration.


Installation

Binary releases for macOS, Linux, and Windows (arm64 and x86_64) are at https://github.com/oramatt/madsql/releases.

To install from source:

git clone https://github.com/oramatt/madsql.git
cd madsql
python -m venv .venv
source .venv/bin/activate # Windows: .venv\Scripts\activate
pip install -U pip
pip install -e .
madsql --version

Expected output:

madsql 0.11.2
python 3.13.5
sqlglot 29.0.1
sqlparse 0.5.3
sql-metadata 2.20.0
simple-ddl-parser 1.10.0

If the madsql console script isn’t on your PATH, use python3 -m madsql <command> throughout.

Before anything else, confirm the supported dialect names for your installed version:

madsql dialects

Dialect names are exact, tsql not mssql, oracle not ora. Getting one wrong produces a fatal CLI error before any processing begins.



The Series

Post 1 — split-statements: Break SQL scripts apart before you do anything else

Post 2 — infer-schema: Recover DDL from a query workload

Post 3 — convert: Batch SQL dialect conversion for Oracle migration

Each post covers the command in depth with Oracle-primary examples, when-to-use guidance, common pitfalls, and CI integration notes.

The repository README and built-in help are also worth reading alongside the series:

madsql --help
madsql split-statements --help
madsql infer-schema --help
madsql convert --help

Leave a comment

Navigation

About

I’m Matt and I do Oracle things.