HACKER Q&A
📣 Harlekuin

Why aren't SQL queries compiled?


If the database knew what queries were running ahead of time and have them compiled you could easily defend against SQL injection because the SQL isn't being interpreted as it's run, and it would be possible to run analysis over the query to see if there are any indexes or implementations missing that could speed it up. Something like a stored procedure's variable substitution but not interpreted when called.


  👤 DougMerritt Accepted Answer ✓
The most difficult aspect of this is that SQL is typically called from some other language (Javascript, C++, whatever), which gets data from somewhere (a web form, a previous database query, a file) and needs to plug it into a new SQL query, which generally needs to be sent to an SQL interpreter ("SELECT $DATA1 FROM $DATA2" etc), not to an inflexible pre-made compiled SQL command.

And that obviously is susceptible to injection attacks.

Really the only general way around this would be if all of those other languages had an API to communicate with a relational backend which did not interpret SQL at all, but rather only offered things like select_statement(fieldname, tablename, ...)

Such interfaces exist, but I'm not really familiar with them, nor why they're not universally used -- except to note that it's a lot of work to create an SQL library. Look at the SQL language; it's a lot to support, particularly if you need to avoid 100 arguments passed to a function, etc. (Just saying one shouldn't leap to trivializing such projects.)


👤 drittich
Queries are compiled, i.e., a query plan is created and then executed. Depending on various things that plan may get re-used the next time the query is run. So what you are talking about exists. Databases can examine the plans, database statistics, and the resulting metrics from executing the plan and make recommendations about indexes.

👤 PaulHoule
A SQL engine can use any method it wants to execute a query.

For instance if you wanted to make a SQL engine that lets you write a SELECT on tables that are csv files that writes a C program and then compiles it with a C compiler that's up to you.

In particular a SQL engine can compile stored procedures ahead of time and back in the day I worked at a few places that put "business logic" into SQL procedures, that is, 100% of the SQL access is done through stored procedures, the same way that many web applications define a front end in Java or Node that exposes data to a Javascript front end.

If you build an app that way you can put a SQL specialist in charge of the SQL interface and also know the application is immune to SQL injection attacks.


👤 icedchai
Prepared statements are parameterized (avoiding SQL injection) and can pre-compiled (what this really means depends on the implementation.)

👤 pgh
It’s a declarative language and it’s compiled into an execution "plan", and parameterised queries (bind variables, prepared statements, whatever you want to call them) are passed at runtime. When you dynamically build queries by concatenation you bypass this compilation phase (parse, compute plan, etc.), and spend unnecessary time on near identical queries.

If you want many more gory details, this is a good watch: https://youtu.be/eurwtUhY5fk


👤 gregjor
Modern relational database systems do all of this already, have for decades.

👤 moomoo11
Isn’t that what prepared statements basically are?