Protect Your PHP Application from SQL Injection

·

# The Problem

One of the most common security threats in web applications is SQL injection. It continues to top the OWASP application security risk list. Yet somehow many developers don't even know what it is.

SQL injection is when a malicious SQL query is injected into a legitimate query run by the application, usually by a nefarious user through an input field in the user interface. It happens when the application isn't protecting the database from raw user input—common in so many PHP applications—and a cleverly formed input tricks the database into running the malicious query.

# A Simple Example

Imagine this query in an application with user profiles, where $_GET['userId'] comes from the query string to identify the user for the profile currently being viewed.

$query = 'SELECT * FROM Users WHERE id=' + $_GET['userId'];

This query would return the user data we're looking for.

But what if the viewer changes the query string to profile.php?userId=1%20OR%201=1?

Given the code above, the query would look like this going into the database.

SELECT * FROM Users WHERE id=1 OR 1=1

That's a valid query, and if your application just runs it as-is, doing nothing to protect the database from the user input, that query will evaluate true for every record in the Users table. It will return all the data for every record in the table. You'll likely leak details about all your users.

Now yes, of course you should be validating all user input coming into your application, but that is an entirely different responsibility. That's more about rejecting bad data than application security. 

And contrary to scores of outdated tutorials on the subject, sanitizing user input doesn't cut it. As the SQL injection guide at Paragon Initiative puts it:

While it's possible to prevent attacks by rewriting the incoming data stream before you send it to your database driver, it's rife with dangerous nuance and obscure edge-cases. (Both links in the previous sentence are highly recommended.)

Unless you want to take the time to research and attain complete mastery over every Unicode format your application uses or accepts, you're better off not even trying to sanitize your inputs. Prepared statements are more effective at preventing SQL injection than escaping strings.

Furthermore, altering your incoming data stream can cause data corruption, especially if you are dealing with raw binary blobs (e.g. images or encrypted messages).

Prepared statements are easier and can guarantee SQL Injection prevention.

# The Solution

It's actually not that hard to prevent SQL injections in a PHP application with a MySQL database.

In most cases, using native prepared statements will prevent SQL injection attacks.

(It's worth mentioning this solution isn't sufficient if you're using any of these five character sets, available in MySQL 5.6 or newer: big5, cp932, gb2312, gbk and sjis. Their usage is uncommon, but if you're using one of them you should see this extensive explanation by Anthony Ferrara to better understand your situation.)

There's a very important prerequisite to fully sidestep that in most cases caveat.

# Prerequisite: Use Full Unicode Encoding

Full stop.

In MySQL, that means utf8mb4. The utf8mb4 character set in MySQL supports the full range of Unicode characters and is not susceptible to the obscure edge case injection attacks linked above.

Don't use MySQL's utf8 encoding. It doesn't actually support the full character set, leading to some really odd encoding errors when someone enters 💩 in a form. (And you know they're gonna do it.) Read up on how to support full Unicode in your MySQL database with utf8mb4.

Supporting full Unicode in your database will save you so much heartache down the road for so many reasons unrelated to SQL injections. I beg you. You'll be so much better off. Just do it.

# Use Native Prepared Statements

Think of prepared statements like this: you "prepare" the database for a query you're going to run by sending a template with placeholders for variable data in the query. Then separately, you send data for those placeholders and tell the database to execute the query.

The code above rewritten as a prepared statement looks like this.

$statement = $pdo->prepare('SELECT * FROM Users WHERE id=?');

$statement->execute([$_GET['userId']]);
$user = $statement->fetch(PDO::FETCH_ASSOC);

With native prepared statements, that interaction does indeed take place in 2 separate requests. Because the statement and the data are sent separately—literally sent as separate packets of data to the database server—there's no way for cleverly formed data to alter the structure of the SQL query.

The prepared statement and data are not sent separately if emulation is enabled, so you should disable it. Emulated prepare statements are constructed differently (by PHP rather than by the database server), and don't fully protect you from SQL injection, as we saw with a serious SQL injection vulnerability in Drupal 7 a few years back.

Use PDO to connect to your database, explicitly set the charset your database uses, and disable emulated prepare statements. For clarity on errors, I strongly recommend enabling exceptions for error handling as well.

$pdo = new PDO(
    'mysql:host=localhost;dbname=db_name;charset=utf8mb4',
    'db_user',
    'db_pass'
);
$pdo->setAttribute(PDO::ATTR_EMULATE_PREPARES, false);
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

For a deep dive on the details behind all this, read the articles already referenced many times in this post: Anthony Ferrara's incredibly detailed post on StackOverflow, the Paragon Initiative's definitive guide on preventing SQL injection, and this exhaustive PDO explainer.

# You Might Already Be Protected

If you're using a good query builder or ORM, which itself always uses native prepared statements under the hood, you're already be protected. (As long as you're not using one of the 5 troublesome character sets listed above.) But don't breathe that sigh of relief until you do some research to ensure your favorite database abstraction makes use of native prepared statements.

Still, it's important to be very familiar with these techniques. At some point, you'll run into a situation where you need to directly query the database. Or you'll find yourself working on a legacy application that doesn't have a query builder or ORM that you can rely on for protection.

It is ultimately your responsibility to ensure your application is protected from one of the most continually pervasive web application threats out there.