If you use DuckDB to run the tests, you can reference those files as if they were tables (select * from 'in.parquet'), and the tests will run extremely fast
One challenge if you're using Spark is that tests can be frustratingly slow to run. One possible solution (that I use myself) is to run most tests using DuckDB, and only e.g. the overall test using Spark SQL.
I've used the above strategy with PyTest, but I don't think it's conceptually sensitive to the programming language/test runner you use.
Also I have no idea whether this is good practice - it's just something that seemed to work well for me.
The approach with csvs can be nice because your customers can review these files for correctness (they may be the owners of the metric), without them needing to be coders. They just need to confirm in.csv should result in expected_out.csv.
If it makes it more readable you can also inline the 'in' and 'expected_out' data e.g. as a list of dicts and pass into DuckDB as a pandas dataframe
One gotcha is that SQL does not guarantee row order, so you need to sort or otherwise ensure your tests are robust to this.
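A minimal sketch of what this can look like with pytest and DuckDB (table and column names are made up for illustration):

# Inline the 'in' and 'expected_out' data as lists of dicts, register them
# with DuckDB as a pandas DataFrame, run the transformation SQL, and compare
# with an explicit ORDER BY so the test is robust to row ordering.
import duckdb
import pandas as pd

def test_orders_rollup():
    in_df = pd.DataFrame([
        {"customer_id": 1, "amount": 10.0},
        {"customer_id": 1, "amount": 5.0},
        {"customer_id": 2, "amount": 7.5},
    ])
    expected_out = pd.DataFrame([
        {"customer_id": 1, "total_amount": 15.0},
        {"customer_id": 2, "total_amount": 7.5},
    ])

    con = duckdb.connect()
    con.register("orders", in_df)  # the DataFrame is now queryable as a table

    actual = con.execute(
        """
        SELECT customer_id, SUM(amount) AS total_amount
        FROM orders
        GROUP BY customer_id
        ORDER BY customer_id   -- explicit sort: SQL does not guarantee order
        """
    ).df()

    pd.testing.assert_frame_equal(actual, expected_out)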
Here's [2] a slide deck by David Wheeler giving an introduction into how it works.
[2] https://www.slideshare.net/justatheory/unit-test-your-databa...
Plenty of constraints, uniques and foreign keys and not nulls. Enum types.
Visuals, dump to csv and plot some graphs. Much easier to find gaps and strange distributions visually.
Asserts in DO blocks, mostly counts being equal.
Build tables in a _next suffix schema and swap when done (sketched below, together with the DO-block asserts).
Never mutating the source data.
Using psql's ON_ERROR_STOP setting.
Avoid all but the most trivial CTEs, preferring intermediate tables that can be inspected. Constraints and assertions on the intermediate tables.
“Wasting” machine resources and always rebuilding from scratch when feasible. CREATE TABLE foo AS SELECT is much simpler than figuring out which row to UPDATE. Also ensures reproducibility, if you’re always reproducing from scratch it’s always easy. State is hard.
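For illustration, a rough sketch of the DO-block assert and the _next schema swap, assuming PostgreSQL and psycopg2 (schema and table names are made up):

# Build into reporting_next, assert counts in a DO block, and only then swap
# the schema into place. DSN and names are assumptions for illustration.
import psycopg2

BUILD = """
DROP SCHEMA IF EXISTS reporting_next CASCADE;
CREATE SCHEMA reporting_next;
CREATE TABLE reporting_next.daily_totals AS
SELECT order_date, SUM(amount) AS total
FROM source.orders
GROUP BY order_date;
"""

# Assertion in a DO block: counts must line up between source and output.
CHECK = """
DO $$
DECLARE
    src_days bigint;
    out_days bigint;
BEGIN
    SELECT COUNT(DISTINCT order_date) INTO src_days FROM source.orders;
    SELECT COUNT(*) INTO out_days FROM reporting_next.daily_totals;
    IF src_days <> out_days THEN
        RAISE EXCEPTION 'daily_totals has % rows, expected %', out_days, src_days;
    END IF;
END $$;
"""

SWAP = """
DROP SCHEMA IF EXISTS reporting_old CASCADE;
ALTER SCHEMA reporting RENAME TO reporting_old;
ALTER SCHEMA reporting_next RENAME TO reporting;
"""

with psycopg2.connect("dbname=analytics") as conn, conn.cursor() as cur:
    cur.execute(BUILD)
    cur.execute(CHECK)   # raises (and rolls back) if the assertion fails
    cur.execute(SWAP)    # the swap only happens if everything above succeeded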
Overall I’m quite happy with the workflow and very rarely do we make mistakes that unit tests would have caught. Our source data is complex and not always well understood (10+ years of changing business logic), so writing good tests would be very hard. Because we never touch the raw source data, any errors we inevitably make are recoverable.
This talk by Dr Martin Loetzsch helped a lot: https://youtu.be/whwNi21jAm4
You can hook up dbt tests to your CI and Git(hub|lab) for data PRs.
Depending on your needs, you can also look into data observability tools such as Datafold (paid) or re_data (free)
On the back of that professional use I wrote a blog post [2] explaining why you might choose to go down this route, as it wasn't the way database code was developed back then (SQL wasn't developed in the same way as the other front-end and back-end code).
A few years later I gave a short 20-minute talk (videoed) to show what writing SQL using TDD looked like for me. It's hard to show all the kinds of tests we wrote in practice at the bank but the talk is intended to show how rapid the feedback loop can be using a standard DB query tool and two code windows - production code and tests.
Be kind, it was a long time ago and I'm sure the state of the art has improved a lot in the intervening years :o).
Chris Oldwood
---
[1] SQL Server Unit: https://github.com/chrisoldwood/SS-Unit
[2] You Write Your SQL Unit Tests in SQL?: https://chrisoldwood.blogspot.com/2011/04/you-write-your-sql...
[3] Test-Driven SQL: https://www.youtube.com/watch?v=5-MWYKLM3r0
We use Microsoft SQL's docker image and spin it up in the background on our laptop/CI server so port 1433 has a database.
Then, whenever we add a new SQL migration, our homegrown migration-file runner computes a hash of the migrations, makes a database template_a5757f7e, and runs the hundreds of migrations on it (todo: make one template build on the previous).
Then we use the BACKUP command to dump the db to disk (within the docker image)
Finally, each test function is able to make a new database and restore that backup from file in less than a second. Populate with some relevant test data, run code, inspect results, drop database.
So our test suite uses hundreds of fresh databases and it still runs in a reasonable time.
(And..our test suite is written in Go, with a lot of embedded SQL strings, even if a lot of our business logic is in SQL)
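Roughly what the restore-per-test step can look like (shown here in Python rather than Go; the connection string, backup path and logical file names are assumptions that depend on how the template database was created):

# Restore the template backup into a brand-new database for each test, then
# drop it afterwards. BACKUP/RESTORE cannot run inside a transaction, hence
# autocommit=True. The 'template'/'template_log' logical names are assumptions.
import uuid
import pyodbc

MASTER = ("DRIVER={ODBC Driver 17 for SQL Server};"
          "SERVER=localhost,1433;UID=sa;PWD=<password>;DATABASE=master")

def fresh_database() -> str:
    name = "test_" + uuid.uuid4().hex[:8]
    conn = pyodbc.connect(MASTER, autocommit=True)
    conn.cursor().execute(f"""
        RESTORE DATABASE [{name}]
        FROM DISK = '/var/opt/mssql/backup/template.bak'
        WITH MOVE 'template'     TO '/var/opt/mssql/data/{name}.mdf',
             MOVE 'template_log' TO '/var/opt/mssql/data/{name}_log.ldf'
    """)
    conn.close()
    return name

def drop_database(name: str) -> None:
    conn = pyodbc.connect(MASTER, autocommit=True)
    conn.cursor().execute(
        f"ALTER DATABASE [{name}] SET SINGLE_USER WITH ROLLBACK IMMEDIATE; "
        f"DROP DATABASE [{name}]"
    )
    conn.close()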
This is easier if you have the same input every time the tests run, like a frozen database image, because then you can basically have snapshot tests.
When the tests pass, we can change from DuckDB to Spark. This helps decouple testing Spark pipelines from the SparkSession and infrastructure, which saves a lot of compute resources during the iteration process.
This setup requires an abstraction layer to make the SQL execution agnostic to platforms and to make the data sources mockable. We use the open source Fugue layer to define the business logic once, and have it be compatible with DuckDB and Spark.
It is also worth noting that FugueSQL will support warehouses like BigQuery and Snowflake in the near future as part of their roadmap. So in the future, you can unit test SQL logic, and then bring it to BigQuery/Snowflake when ready.
For more information, there is this talk on PyData NYC (SQL testing part): https://www.youtube.com/watch?v=yQHksEh1GCs&t=1766s
Fugue project repo: https://github.com/fugue-project/fugue/
1) Set up a test db instance with controlled data in it as the basis for your test cases. Ideally this data is taken from real data that has caused pipeline problems in the past, but scrubbed for PII etc. You can also use or write generators to pad this out with realistic-looking fake data. If you do this, the same dataset can be used for demos (once you add data for your demo paths).
2) Write test cases using whatever test framework you use in your main language. Say you code in Python, you write pytest cases; Java -> JUnit, etc. You can help yourself by writing a little scaffolding that takes a SQL query and a predicate, runs the query, and asserts the predicate over the result (a sketch of this follows below). If you don't have a "main language", just write these test cases in a convenient language.
3) Consider resetting the state of the database (probably by reloading a controlled dump before each test batch) so any tests which involve inserts/deletes etc. work. You may actually want to create an entirely new db and load it before each test run so that you can run multiple test batches concurrently against different dbs without contention messing up your results. Depending on your setup you may be able to achieve a similar effect using schemas or (sometimes but not always) transactions. You want each test run to be idempotent and isolated though.
Doing it this way has a number of benefits because it's easy to add your sql test cases into your CI/CD (they just run the same as everything else).
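A rough sketch of the query-plus-predicate scaffolding from step 2, using pytest and psycopg2 (the DSN, table and column names are made up):

# Run a SQL query against the controlled test database and assert a predicate
# over the result rows.
import psycopg2
import pytest

@pytest.fixture
def db():
    conn = psycopg2.connect("dbname=pipeline_test")
    yield conn
    conn.rollback()
    conn.close()

def assert_query(conn, sql, predicate, params=None):
    """Run `sql` and assert that `predicate(rows)` holds over the result."""
    with conn.cursor() as cur:
        cur.execute(sql, params or ())
        rows = cur.fetchall()
    assert predicate(rows), f"predicate failed for query: {sql}"

def test_no_orphaned_order_lines(db):
    assert_query(
        db,
        """
        SELECT COUNT(*) FROM order_lines ol
        LEFT JOIN orders o ON o.id = ol.order_id
        WHERE o.id IS NULL
        """,
        lambda rows: rows[0][0] == 0,
    )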
1) The same way you'd write any other tests. Use your favourite testing framework to write fixtures and tests for the SQL queries:
- connect to the database
- create tables
- load test data
- run the query
- assert you get the results you expect
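A minimal, self-contained illustration of those steps using Python's built-in sqlite3 (in practice you'd point the same test at your real database and query):

import sqlite3

def test_active_user_count():
    conn = sqlite3.connect(":memory:")          # connect to the database
    conn.execute("CREATE TABLE users (id INTEGER, active INTEGER)")  # create tables
    conn.executemany(                            # load test data
        "INSERT INTO users VALUES (?, ?)",
        [(1, 1), (2, 0), (3, 1)],
    )
    rows = conn.execute(                         # run the query
        "SELECT COUNT(*) FROM users WHERE active = 1"
    ).fetchall()
    assert rows == [(2,)]                        # assert you get the expected results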
For insert or update queries, that assertion step might involve running another query.
2) DBT has support for testing! It's quite good. See https://docs.getdbt.com/docs/build/tests
First, by “testing SQL pipelines”, I assume you mean testing changes to SQL code as part of the development workflow? (vs. monitoring pipelines in production for failures / anomalies).
If so:
1 – assertions. dbt comes with a solid built-in testing framework [1] for expressing assertions such as “this column should have values in the list [A,B,C]” as well as checking referential integrity, uniqueness, nulls, etc. There are more advanced packages on top of dbt tests [2]. The problem with assertion testing in general though is that for a moderately complex data pipeline, it’s infeasible to achieve test coverage that would cover most possible failure scenarios.
2 – data diff: for every change to SQL, know exactly how the code change affects the output data by comparing the data in dev/staging (built off the dev branch code) with the data in production (built off the main branch). We built an open-source tool for that: https://github.com/datafold/data-diff, and we are adding an integration with dbt soon which will make diffing as part of dbt development workflow one command away [2]
We make money by selling a Cloud solution for teams that integrates data diff into Github/Gitlab CI and automatically diffs every pull request to tell you how a change to SQL affects the target table you changed, downstream tables and dependent BI tools (video demo: [3])
I’ve also written about why reliable change management is so important for data engineering and what the key best practices to implement are [4]
[1] https://docs.getdbt.com/docs/build/tests [2] https://github.com/calogica/dbt-expectations [3] https://github.com/datafold/data-diff/pull/364 [4] https://www.datafold.com/dbt [5] https://www.datafold.com/blog/the-day-you-stopped-breaking-y...
Basically, treat the query and database as a black-box for testing like you would another third party API call.
I would strongly suggest having a layer of code in your application that is exclusively your data access, and keeping any logic you can out of it. Data-level tests are pretty onerous to write in the best of circumstances, and the more complexity you allow to grow around the raw SQL, the worse a time you'll have. Swapping out where clauses and the like dynamically is a cost you'll need to eat, and sometimes having a semi-generic chunk that you reuse with some different joins can be more efficient than writing ten completely different access functions with completely different internal logic, so judgement is required.
At the end of the day a database is like any other third party software component - data goes in, data comes out... the nice thing is that SQL is well defined and you've got all the definitions so it's easier to find the conditional cases you need to really closely tests... but databases are complex beasties and it'll never be easy.
Best ideas IMO (no particular order):
- make SQL dumber, move logic that needs testing out of SQL
- use an ORM that allows composing, disconnect composition & test (ie EF for .NET groups, test the LINQ for correct filtering etc, instead of testing for expected data from a db) (I see this has already been recommended elsewhere)
Be wary of too many techniques that are supposed to make it easier to test, but also make it hard for you to leave a query pipeline. In particular, SQL should be very easy to test in the style of "with these as our base inputs, we expect these as our base outputs." Trying to test individual parts of the queries is almost certainly doomed to massive bloat of the system and will cause grief later.
A quick web search confirms suspicions: it is not easy.
We've written our own tool to compare different data sources against each other. This allows you, for example, to test for invariants (or expected variations) before and after a transformation.
The tool is open source: https://github.com/QuantCo/datajudge
We've also written a blog post trying to illustrate a use case: https://tech.quantco.com/2022/06/20/datajudge.html
DBT, specifically DBT-2, is a suite of tests designed to benchmark a database system. These tests aren't interested in, e.g., the correctness of an application that is using the database. They are meant to test the system as a whole by modeling some sort of "average business", defining some sort of "average business operation", and estimating how many such operations a particular deployment of the system can perform.
Such tests are rarely of any interest to application developers, and are more geared towards DBAs who execute them to estimate the efficiency of a system they deploy or the amount of hardware necessary to support a business.
MySQL DBT2 suite: https://dev.mysql.com/downloads/benchmarks.html
PostgreSQL DBT2 suite: https://wiki.postgresql.org/wiki/DBT-2
Those tools are typically modeled on TPC-B... And, it would require a separate discussion to describe why these tests are obsolete and why there isn't really any replacement.
----
However, from the rest of your question it seems that you may use DBT acronym in some other way... So, what exactly are you testing? Are you interested in performance? A benchmark? Schema correctness? Are you perhaps trying to simply test the application that is using a SQL database and you want to avoid dealing with the database setup as much as possible?
As others have mentioned, you want to compare the results of your queries against a previously known 'good' state of the data. So, as you're making data model changes, you can regularly check your development environment against production to see how your changes affect the data.
Data profiling is the perfect tool for this, especially when your pipeline reaches a certain size, or you're dealing with very large datasets.
I work on the team creating PipeRider.io, which uses data profiling comparisons as a method of "code review for data".
It becomes particularly useful when you automate generating data profiles of development and production environments in CI, and attach the data profile comparison to the pull request comment. It makes seeing the impact of changes so much easier.
Here's an article that discusses the benefits of this: https://blog.infuseai.io/why-you-lack-confidence-merging-dbt...
We have two flavours of test: one that rolls back the transaction each time, ensuring a clean, known state. And one that doesn’t, allowing your tests to avoid lots of overhead by “walking through a series of incremental states”.
Yes, some might call the latter heresy. But it works great.
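The first flavour can look roughly like this as a pytest fixture (psycopg2, the DSN and the accounts table are assumptions); the second flavour simply skips the rollback so later tests walk through the state left by earlier ones:

# Every test runs inside a transaction that is rolled back afterwards, so each
# test sees the same clean, known state.
import psycopg2
import pytest

@pytest.fixture
def tx():
    conn = psycopg2.connect("dbname=app_test")
    try:
        with conn.cursor() as cur:
            yield cur
    finally:
        conn.rollback()   # throw away everything the test did
        conn.close()

def test_insert_is_visible_inside_the_transaction(tx):
    tx.execute("INSERT INTO accounts (name) VALUES ('alice')")
    tx.execute("SELECT COUNT(*) FROM accounts WHERE name = 'alice'")
    assert tx.fetchone()[0] == 1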
During the test:
- At the start of the test (fixture), run a new DB instance.
- Apply the DB schema.
- Possibly: remove constraints that would disturb your tests (e.g. unimportant foreign keys).
- Possibly: add default values for columns that are not important for your test (but do this with caution).
- Run your test.
- Assert results (maybe directly via access to the database, or via a dump of tables).
- Tear down the database, possibly removing all data (except error logs).
I used this pattern to test software that uses MySQL or MariaDB server. For Microsoft SQL server it may be enough to create a new database instead of running a new instance (possible but not as easy as for MySQL/MariaDB).
On CI server this can be used to run tests against all required DB server types and versions.
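A sketch of that fixture using the docker CLI and pymysql (image tag, port and credentials are made up for illustration):

# Start a throwaway MariaDB container, wait for it, apply the schema, run the
# test, then tear everything down.
import subprocess
import time
import pymysql
import pytest

@pytest.fixture(scope="function")
def mariadb():
    cid = subprocess.check_output([
        "docker", "run", "-d", "--rm",
        "-e", "MARIADB_ROOT_PASSWORD=test",
        "-e", "MARIADB_DATABASE=app",
        "-p", "33060:3306",
        "mariadb:10.11",
    ]).decode().strip()
    try:
        conn = _wait_for_server()
        conn.cursor().execute("CREATE TABLE items (id INT PRIMARY KEY, name TEXT)")
        yield conn                      # run the test against a fresh instance
    finally:
        subprocess.run(["docker", "rm", "-f", cid], check=False)  # tear down

def _wait_for_server(timeout=60):
    deadline = time.time() + timeout
    while True:
        try:
            return pymysql.connect(host="127.0.0.1", port=33060,
                                   user="root", password="test", database="app")
        except pymysql.err.OperationalError:
            if time.time() > deadline:
                raise
            time.sleep(1)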
Look at this JetBrains survey: https://www.jetbrains.com/lp/devecosystem-2021/databases/
Around half of the people never debug stored procedures. Three quarters of people don't have tests in their databases. Only half of the people version their database scripts.
Personally: the answer is containers. Spin up a database in a container (manually or on CI server) and do whatever you need with it. Seed it with some test data, connect an app to it, check that app tests pass when writing to and reading from a live database (as opposed to mock data stores or something like H2), then discard the container.
Even if you don't have a traditional app, throwaway instances of the real type of DB that you'll be using are great, both for development and testing.
I would love testing to work.
Have set up and maintained several unit test suites in Jest.
Wrote several large e2e test suites in Cypress.
I don't think anyone saved time compared to simply having a manual checklist and testing manually.
Maybe me and my former teammates are doing it wrong. Talking 8+ teams, from corporate to startup.
But I'd love to be proven wrong. E2E tests definitely caught the most issues.
It's a lot faster and easier than dealing with containers and the like.
Example:
db = Fake().expect_query("SELECT * FROM users", result=[(1, 'Bob'), (2, 'Joe')])
Then you do:
db.query("SELECT * FROM users")
and get back the result.
In Python if you do this in a context manager, you can ensure that all expected queries actually were issued, because the Fake object can track which ones it already saw and throw an exception on exit.
The upside of this is, you don't need any database server running for your tests.
update: This pattern is usually called db-mock or something like this. There are some packages out there. I built it a few times for companies I worked for.
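A bare-bones sketch of such a fake (method names mirror the example above; real db-mock packages are more thorough):

# The fake records which queries were expected, returns canned results, and
# complains on exit if any expected query was never issued.
class Fake:
    def __init__(self):
        self._expected = {}   # sql -> canned result
        self._seen = set()

    def expect_query(self, sql, result):
        self._expected[sql] = result
        return self

    def query(self, sql):
        if sql not in self._expected:
            raise AssertionError(f"unexpected query: {sql}")
        self._seen.add(sql)
        return self._expected[sql]

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        missing = set(self._expected) - self._seen
        if exc_type is None and missing:
            raise AssertionError(f"expected queries never issued: {missing}")
        return False

# Usage, matching the example above:
with Fake().expect_query("SELECT * FROM users",
                         result=[(1, 'Bob'), (2, 'Joe')]) as db:
    assert db.query("SELECT * FROM users") == [(1, 'Bob'), (2, 'Joe')]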
* It has language-level module support, similar to other languages, so SQL functions are reusable across multiple codebases without depending on code-generation tricks. One of the major blockers for SQL adoption has been complex domain-specific business logic, and now the situation is better.
* It has official unit-test support. Google uses Blaze (known externally as Bazel), so adding a unit test for SQL code is as simple as adding a SQL module (and its test input) dependency to the SQL test target, then writing a test query and its expected output in an approval-testing format. Setting up the DB environment is all handled by the testing framework.
* It has official SQL binary support. That's just a fancy name for handling lots of the tedious stuff involved in running a SQL query (e.g. putting everything needed into a single package, performing type checks, handling input parameters, managing native code dependencies for FFI, etc.).
None of these is technically sophisticated, at least in theory, but combined they become pretty handy. Now I can write a simple SQL module that mostly depends on another team's SQL module, write a simple unit test for it, then run a SQL binary just as in other languages. I haven't worried a single time about how to set up a DB instance. This loop is largely focused on OLAP, so it's a bit different for OLTP, which has its own established testing patterns.
And that can be done by dumping the database (possibly verifying the content of that dump), taking a backup, restoring the backup to a fresh container, then comparing dump of that freshly restored database to the one you took at the start.
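A sketch of the dump-and-compare part, assuming PostgreSQL with pg_dump on the PATH (host and database names are made up; the backup/restore into a fresh container happens between the two dumps):

# Dump the original database, restore the backup elsewhere, dump again, and
# compare the two dumps.
import subprocess

def dump(host: str, dbname: str, outfile: str) -> None:
    subprocess.run(
        ["pg_dump", "--host", host, "--no-owner", "--file", outfile, dbname],
        check=True,
    )

dump("db-original", "app", "before.sql")
# ... take the backup, restore it into a fresh container reachable as db-restored ...
dump("db-restored", "app", "after.sql")

with open("before.sql") as a, open("after.sql") as b:
    assert a.read() == b.read(), "restored database does not match the original dump"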
I don't know the technical details of how to set it up; it was already set up when I worked there.
But basically, we wrote SQL scripts that included statements to:
1. create the DB structure, tables, or views
2. insert test data (you can insert corner cases, etc.)
3. run the function or procedure
4. run an assert to confirm the results meet our expectations
The test scripts were run by the CI/CD process.
As the meme says: App worked before. App works afterwards. Can’t explain that.
[1] https://hexdocs.pm/ecto_sql/Ecto.Adapters.SQL.html#module-sq...
Rails for Ruby comes with some pretty nice setups for testing the database code. There's a test DB by default with the same schema as Production, and the usual test frameworks (FactoryBot and RSpec) make it easy to set up some data in the actual DB for each spec, run model code that makes actual SQL queries, and assert against the results.
I would have hoped most other web hosting frameworks would make as much effort to making it straightforward to test your database code, but it doesn't really seem to be the case.
In Rust, there's a very handy crate called sqlx. What it does is, at compile time, it runs all of the SQL in your codebase against a copy of your database to both validate that it runs without errors and map the input and output types to typecheck the Rust code.
When it comes to stuff like validating that your queries are performant against production datasets or that there isn't any unexpected data in production that breaks your queries, well I pretty much got nothing. Maybe try a read replica to execute against?
More trivial example:
{%
call dbt_unit_testing.test(
'REDACTED',
'Should replace nullish values with NULL'
)
%}
{% call dbt_unit_testing.mock_source('REDACTED', 'REDACTED', opts) %}
"id" | "industry"
1 | 'A'
2 | 'B'
3 | ''
4 | 'Other'
5 | 'C'
6 | NULL
{% endcall %}
{% call dbt_unit_testing.expect(opts) %}
"history_id" | "REDACTED"
1 | 'A'
2 | 'B'
3 | NULL
4 | NULL
5 | 'C'
6 | NULL
{% endcall %}
{% endcall %}
Stored procedures are a different beast though. Having significantly struggled to debug stored procedures running in MSSQL on a Macbook (on Windows SQL Management Studio lets you set breakpoints, on Mac you're SOL), if I was building an application based on them I'd definitely try to spin up some kind of testing framework around them. I guess what I'd probably do is have a temporary database and some regular testing framework that nukes the db, then calls the stored proc(s) with different inputs and checks what's in the tables after each run. Sounds slow and clunky?
Starting with a framework that is programming-language-first (i.e. Spark) can help you build your own tooling to actually build unit tests. It's frustrating, though, that this isn't common across other ETL tooling.
I ought to produce unit tests that prove that the tuples from each join operation produce the correct dataset. I've only ever tested with 3 join operations in one query.
From a user's perspective, I guess you could write some tooling that loads example data into a database and does an incremental join with each part of the join statement added.
For my own use-cases, I usually test this at the application level and not the DB level. This is admittedly not unit-testing my SQL (or stored procs or triggers) but integration-testing it.
You can achieve the same thing with "docker commit"-ing data into docker images of your DB engine of choice, and firing your queries at them, but that only really works with smaller datasets.
not a unique problem with sql, btw.
A list of best practices: https://docs.getdbt.com/guides/legacy/best-practices
And shameless plug but there's a chapter on modeling in my book: https://theinformedcompany.com
A ref() concept like dbt's is sufficient. When testing, have ref output a different (test-x) name for all your references.
The backbone for this is that we spin up a DB per unit test, so we don't have to worry about shared state.
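A tiny illustration of the idea: a ref() helper that resolves to test-prefixed names when a flag is set, so the same query text runs against either the test fixtures or the real tables (the env var and table names are made up):

import os

def ref(name: str) -> str:
    # In test mode, point every reference at a test-prefixed table instead.
    prefix = "test_" if os.environ.get("SQL_TEST_MODE") == "1" else ""
    return prefix + name

query = f"""
SELECT user_id, SUM(amount) AS total
FROM {ref('payments')}
GROUP BY user_id
"""
# With SQL_TEST_MODE=1 this reads from test_payments instead of payments.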
https://news.ycombinator.com/item?id=34580675
I am always baffled by why this isn't a more popular way of writing SQL.
Most of the time there is a layer around your SQL (a repository, a bash script, or whatever) that you can use for integration testing.
Anyone have any recommendations on testing SSIS ?
One of the premises that we have is the ability to instantly create a test environment by creating a branch. I'd love to hear what you think about it.
This way I don't waste time on unit tests that quickly get stale and that no one wants to maintain and run.
Excellent for checking delete queries before running them.
In PostgreSQL, a cool tool for performance is "hypothetical indexing", which predicts how the optimizer will use indexes in any SQL query. I could see an automated testing tool written around "hypothetical indexing".
Also, I believe MS SQL Server supports hypothetical indexes.
for real though I love tools like SequelPro or TablePlus that let me work out a query before I bake logic or stuff into my apps. Also sometimes I use it to work out the data needed for reports. I am working with salesforce for the first time in my life and apparently there are tools that let me treat it like I'm used to SequelPro.
But my app is for six users at one site, it’s not mission critical, and the sqlite DB is backed up hourly.
Life’s too short for (unnecessary) testing.
Learn to use IMPORT TABLESPACE in MySQL or just dump and import SQL.
Every time you run a test you set up the mock databases again.
There are a few types of tests one would like from a SQL pipeline, each with a different value add:
- Quality assurance tests: these are things like DBT tests; they mainly test the accuracy of the result after the tables are produced. Examples would be tests like "this column should not contain any `null` values" or "it should have only X, Y and Z values". They are valuable checks, but the fact of the matter is that there are many cases where running these sorts of tests after the data is produced is a bit too late.
- Integration tests: specify an input table and your expected output, and run your queries against it; the end result must match the expectations at all times. These are useful to run regularly, serving as "integration tests" for your SQL assets. They allow validating the logic inside the query, provided that the input covers the cases that need to be covered, and they can be executed in CI/CD pipelines. We are exploring a new way of doing this with Blast CLI, effectively running a BigQuery-compatible database in-memory and running tests against every asset in the pipeline locally.
- Validation tests: these tests aim to ensure that the query is syntactically correct on the production DWH, usually using tricks like `EXPLAIN` or dry-run in BigQuery. These sorts of tests would ensure that the tables/fields referenced actually exist, the types are valid, the query has no syntax errors, etc.. These are very useful for running in CI after every change, effectively allowing catching many classes of bugs.
- Sanity checks: these are similar to the quality assurance tests described above, but with a bigger focus on making sense out of the data. They range from "this table has no more rows than this other table" to business-level checks such as "the conversion rate for this week cannot be more than 20% lower compared to last week". They are executed after the data is produced as well, and they would serve as an alerting layer.
There is no silver bullet when it comes to testing SQL, because in the end what is being tested is not just the SQL query but the data asset itself, which makes things more complicated. The fact that SQL has no standardized way of testing things and the language has a lot of dialects make this harder than it could have been. In my experience, I have found the combination of the strategies above to have a very good coverage when it comes to approximating how accurate the queries are and how trustworthy the end result is, provided that a healthy mix of them is being used throughout the whole development lifecycle.
I follow these steps in my pipeline.
Every time I commit changes, the CI/CD pipeline runs them in this order:
- I use sqitch for the database migration (my DB is postgresql).
- Run the migration script `sqitch deploy`. It runs only the items that haven't been migrated yet.
- Run the `revert all` feature of sqitch to check if the revert action works well too.
- I run `sqitch deploy` again to test if the migration works well from scratch.
- After the schema migration has been applied, I run integration tests with Typescript and a test runner, which includes a mix of application tests and database tests too.
- If everything goes well, then it runs the migration script to the staging environment, and eventually it runs on the production database after a series of other steps on the pipeline.
I test my database queries from Typescript in this way:
- In practice I'm not strict about separating the tests for the database queries from the application code; instead, I test the layers as they are being developed, starting from simple inserts on the database, where I test the application CRUD functions being developed, plus the fixture generators (the code that generates synthetic data for my tests) and the deletion and test-cleanup capabilities.
- Having that boilerplate code, I then start testing the complex queries, and if a query is large enough (and assuming there are no performance penalties from using CTEs in those cases), I write my large queries in small chunks as CTEs, like this (replace SELECT 1 with your queries):
export const sql_start = `
WITH dummy_start AS (
SELECT 1
)
`;
export const step_2 = `${sql_start},
step_2 AS (
SELECT 1
)
`;
export const step_3 = `${step_2},
step_3 AS (
SELECT 1
)
`;
export const final_sql_query_to_use_in_app = ` ${step_3},
final_sql_query_to_use_in_app AS(
SELECT 1
)
SELECT * FROM final_sql_query_to_use_in_app
`;
Then in my tests I can quickly pick any step of the CTE and test it:
import {step_2, step_3, final_sql_query_to_use_in_app} from './my-query';
test('my test', async t => {
//
// here goes the code that load the fixtures (testing data) to the database
//
//this is one test, repeat for each step of your sql query
const sql = `${step_3}
SELECT * FROM step_3 WHERE .....
`;
const {rows: myResult} = await db.query(sql, [myParam]);
t.is(myResult.length, 3);
//
// here goes the code that cleanup the testing data created for this test
//
});
And in my application, I just use the final query:
import {final_sql_query_to_use_in_app} from './my-query';
db.query(final_sql_query_to_use_in_app)
The tests start with an empty database (sqitch deploy just ran on it), then each test creates its own data fixtures (this is the most time-consuming part of the test process) with UUIDs as synthetic data, so I don't have conflicts between each test's data. That makes it possible to run the tests concurrently, which is important for detecting bugs in the queries too. Also, I include a cleanup process after each test, so after the tests finish the database is empty of data again.
For SQL queries that are critical pieces, I was able to develop thousands of automated tests with this approach, in addition to combinatorial approaches: in cases where a column of a view is basically an operation over states, if you write the logic in SQL directly, you can test the combinations of states from a spreadsheet (each column is a state), fill in the expectations directly in the spreadsheet, and give the CSV version of it to the test suite to run the scenarios and check the expectations.
If you are interested on more details just ping me, I'll be happy to share more about my approach.
For testing:
Run your query/pipeline against synthetic/manual data that you can easily verify the correctness of. This is like a unit test.
Run your query/pipeline on sampled actual data (eg 0.1% of the furthest upstream data you care about). This is like an integration test or a canary. Instead of taking 0.1% of all records you might instead want to sample 0.1% of all USERID so that things like aggregate values can be sanity checked.
Compare the results of the new query to the results from the old query/pipeline. This is like a regression test. You may think this wouldn’t help for many changes because the output is expected to change, but you could run this only on e.g. a subset of columns (a sketch follows after this list).
Take the output of the new query (or sampled query, or the manual query) and feed it to whatever is downstream. This is like a conformance test.
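A rough sketch of that regression-style comparison, written against a generic DB-API connection (the key columns are whatever subset makes sense for your pipeline; names here are made up):

# Run the old and new query against the same connection and compare a subset
# of columns, sorted so the check is independent of row order.
def compare_queries(conn, old_sql, new_sql, key_columns="user_id, day"):
    def fetch(sql):
        cur = conn.cursor()
        cur.execute(f"SELECT {key_columns} FROM ({sql}) q ORDER BY {key_columns}")
        return cur.fetchall()

    old_rows, new_rows = fetch(old_sql), fetch(new_sql)
    assert old_rows == new_rows, (
        f"regression: {len(old_rows)} rows from old query vs {len(new_rows)} from new"
    )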
For reliability:
If the cost is not prohibitive, consider persisting temporary query results (eg between stages of your pipeline) for 1-2 weeks. This way if you catch a bug from a recent change you only need to rerun the part of your pipeline after the breakage. May not make sense to do if your pipeline is not big
If the cost is not prohibitive you could also run both the new and old versions of the pipeline for ~a week so that you can quickly “rollback”. Ofc whether this is viable depends on what you’re doing.
The big failure modes with SQL pipelines IME are
1. unexpected edge cases and bad data causing queries to fail (eg you manually test the new query and it works fine, but in production it fails when handling Unicode)
2. not having a plan for what to do when a bug gets caught after the fact
3. barely ever noticing bugs or lost data because nobody is validating the output (for example, if you have a pipeline that aggregates a user’s records over a day, any USERID that’s in the input data for that day should also be in the output data for that day).
4. This can be very hard to solve depending on your circumstances, but upstream changes in data are the most annoying and intractable to solve. The best case here is you either spec out the input data closely OR have some kind of testing in place that the upstream folks run before shipping changes.
To address these, you need to take the approach of expecting things to fail, rather than hoping they don’t. This is common practice in many SWE shops these days but the culture in the data world hasn’t quite caught up. I think part of the problem is that automating this testing usually requires at least some scripting/programming which is outside the comfort zone for many people who “just write SQL.”
Other languages are too complicated. :(“
Everyone today: “tries using sql
Oh wow, the tooling is quite basic, and you can’t express complex data structures and imperative code. :(“
What did you expect?
Look, I spent 4 years in this rabbit hole, and here’s my advice:
Don’t try to put the square peg in the round hole.
You want easy to write, simple code and pipelines? Just use sql.
Have a dev environment and run everything against that to verify it.
Do not bother with unit testing your CTEs, it’s hard to do, there are no good tools to do it.
If you want Strong Engineering TM, use python and spark and all the python libraries that exist to do all that stuff.
It won’t be as quick to write, or make changes to, but it will be easier to write more verifiably robust code.
If you treat either as something it is not (eg. Writing complex data structures and frameworks in sql) you’re using the wrong tool for the outcome you’re trying to achieve.
It’ll take longer and “feel bad”, not because the tool is bad, but because you’re using it in a bad way.