The inner workings of CSRF token validation in Django

I’ve answered more than a few questions on Stack Overflow of the low-quality, “homework” type, the kind that could have been answered independently if only the person asking had spent more time reading docs, following tutorials, reflecting on those materials, and experimenting on their own. (Hey, maybe they really were in a bind and truly needed an answer ASAP.) I’ve stopped answering those, but every now and then I come across a question that existing Stack Overflow Q&As, manuals, and documentation don’t readily address, and I see an opportunity to add a meaningful, substantial response.

Below, I reproduce one such answer I’m proud to have written. [1]

Note that this was originally written in October 2020, and I do not guarantee that it still applies to the latest versions of the relevant software.

The question

“Why does Django/Django REST Framework not validate CSRF tokens in-depth, even with enforce-CSRF?,” asks the user named ‘Cloud’:

I am trying to enforce CSRF protection for a Django REST Framework API which is open to anonymous users. For that matter, I’ve tried two different approaches:

Extending the selected API views from one CSRFAPIView base view, which has an @ensure_csrf_cookie annotation on the dispatch method.

Using a custom Authentication class based on SessionAuthentication, which applies enforce_csrf() regardless of whether the user is logged in or not.

In both approaches the CSRF check seems to work superficially. In case the CSRF token is missing from the cookie or in case the length of the token is incorrect, the endpoint returns a 403 - Forbidden. However, if I edit the value of the CSRF token in the cookie, the request is accepted without issue. So I can use a random value for CSRF, as long as it’s the correct length.

This behavior seems to deviate from the regular Django login view, in which the contents of the CSRF do matter. I am testing in local setup with debug/test environment flags on.

What could be the reason my custom CSRF checks in DRF are not validated in-depth?

Code fragment of the custom Authentication:

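from django.middleware.csrf import rotate_token
from rest_framework.authentication import SessionAuthentication


class RestCsrfAuthentication(SessionAuthentication):
    def authenticate(self, request):
        self.enforce_csrf(request)
        rotate_token(request)
        return None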

And in settings:

REST_FRAMEWORK = {
    'DEFAULT_AUTHENTICATION_CLASSES': [
        'csrfexample.authentication.RestCsrfAuthentication',
    ]
}

The answer

In Django, the specific contents of CSRF tokens actually don’t matter.

This reply by a Django security team member in the django-developers mailing list to a similar question says this:

The way our CSRF tokens work is pretty simple. Each form contains a CSRF token, which matches the CSRF cookie. Before we process the protected form, we make sure that the submitted token matches the cookie. This is a server-side check, but it’s not validating against a stored server-side value. Since a remote attacker should not be able to read or set arbitrary cookies on your domain, this protects you.

Since we’re just matching the cookie with the posted token, the data is not sensitive (in fact it’s completely arbitrary—a cookie of “zzzz” works just fine), and so the rotation/expiration recommendations don’t make any difference. If an attacker can read or set arbitrary cookies on your domain, all forms of cookie-based CSRF protection are broken, full stop.

(Actually “zzzz” won’t work because of length requirements, but more on that later.) I recommend reading the entire mailing list message for a fuller understanding. There are explanations there about how Django is peculiar among frameworks because CSRF protections are independent of sessions.

I found that mailing list message via this FAQ item on the Django docs:

Is posting an arbitrary CSRF token pair (cookie and POST data) a vulnerability?

No, this is by design. Without a man-in-the-middle attack, there is no way for an attacker to send a CSRF token cookie to a victim’s browser, so a successful attack would need to obtain the victim’s browser’s cookie via XSS or similar, in which case an attacker usually doesn’t need CSRF attacks.

Some security audit tools flag this as a problem but as mentioned before, an attacker cannot steal a user’s browser’s CSRF cookie. “Stealing” or modifying your own token using Firebug, Chrome dev tools, etc. isn’t a vulnerability.

(Emphasis mine.)

The message is from 2011, but it’s still valid, and to prove it let’s look at the code. Both Django REST Framework’s SessionAuthentication and the ensure_csrf_cookie decorator use Django core’s CsrfViewMiddleware (source). In that middleware class’s process_view() method, you’ll see that it fetches the CSRF cookie (a cookie named csrftoken by default), and then the posted CSRF token (part of the POSTed data, with a fallback to reading the X-CSRFToken header). After that, it runs _sanitize_token() on the POSTed/X-CSRFToken value. This sanitization step is where the check for the correct token length happens; this is why HTTP 403s are being returned as expected when shorter or longer tokens are provided.
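To make the length check concrete, here is a minimal Python sketch of that kind of sanitization step. It is a simplification and not Django's actual code: the names sanitize_token() and new_random_token() are made up for illustration, the 64-character length is an assumption taken from the token used later in this article, and the real function handles at least one more input format that I've left out.

```python
import re
import secrets
import string

# Assumed constants for this sketch: tokens are 64 ASCII alphanumeric
# characters (the same length as the made-up token used later on).
ALLOWED_CHARS = string.ascii_letters + string.digits
TOKEN_LENGTH = 64


def new_random_token():
    """Generate a fresh random token of the expected length."""
    return "".join(secrets.choice(ALLOWED_CHARS) for _ in range(TOKEN_LENGTH))


def sanitize_token(token):
    """Keep a well-formed token as-is; replace anything else with a random one."""
    if re.search(r"[^a-zA-Z0-9]", token) or len(token) != TOKEN_LENGTH:
        return new_random_token()
    return token
```

Under this logic, a posted value such as "zzzz" is not rejected on the spot. It is swapped for a random token, which then fails to match the cookie in the comparison step, and that mismatch is what produces the 403. A made-up 64-character alphanumeric value passes through untouched.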

After that, the method proceeds to compare the two values using the function _compare_salted_tokens(). If you read that function, and all the further calls that it makes, you’ll see that it boils down to checking if the two strings match, basically without regard to the values of the strings.
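As a rough illustration of what that comparison amounts to, here is a simplified Python sketch (again, not Django's actual code). It assumes, based on my reading of the source at the time, that a 64-character token consists of a 32-character random salt followed by a 32-character secret masked with that salt. The names mask(), unmask() and tokens_match() are hypothetical.

```python
import hmac
import secrets
import string

CHARS = string.ascii_letters + string.digits
SECRET_LENGTH = 32  # assumed: token = 32-char salt + 32-char masked secret


def mask(secret):
    """Mask a secret with a fresh random salt; returns salt + masked secret."""
    salt = "".join(secrets.choice(CHARS) for _ in range(SECRET_LENGTH))
    cipher = "".join(
        CHARS[(CHARS.index(s) + CHARS.index(k)) % len(CHARS)]
        for s, k in zip(secret, salt)
    )
    return salt + cipher


def unmask(token):
    """Recover the secret from a salt + masked-secret token."""
    salt, cipher = token[:SECRET_LENGTH], token[SECRET_LENGTH:]
    return "".join(
        CHARS[(CHARS.index(c) - CHARS.index(k)) % len(CHARS)]
        for c, k in zip(cipher, salt)
    )


def tokens_match(posted_token, cookie_token):
    """Compare the underlying secrets in constant time."""
    return hmac.compare_digest(unmask(posted_token), unmask(cookie_token))
```

Nothing in tokens_match() consults a value stored on the server. Any well-formed string unmasks to some secret, so sending the same made-up string as both cookie and posted token always passes, which is the behavior described in the question.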

This behavior seems to deviate from the regular Django login view, in which the contents of the CSRF do matter.

This claim is incorrect; it doesn’t matter even in the built-in login views. To check, I ran this curl command (in Windows cmd format) against a fresh Django project:

curl -v ^
-H "Cookie: csrftoken=abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijkl" ^
-H "X-CSRFToken: abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijkl" ^
-F "username=admin" -F "password=1234" http://localhost:8000/admin/login/

Django returned a session cookie (plus a CSRF cookie, of course), signifying a successful login.
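For readers who would rather reproduce this from Python than from cmd, the following sketch builds a comparable request with only the standard library. The function name build_forged_login_request() is hypothetical, and there is one deliberate difference from the curl command: curl's -F sends multipart/form-data, while this sketch sends a URL-encoded body. I have not re-run the experiment with this script, so treat it as a starting point and not as a verified reproduction.

```python
import secrets
import string
import urllib.parse
import urllib.request

ALLOWED_CHARS = string.ascii_letters + string.digits


def build_forged_login_request(url, username, password):
    """Build a POST whose CSRF cookie and header carry the same made-up token."""
    # Any 64-character alphanumeric string will do; it is never checked
    # against a value stored on the server.
    token = "".join(secrets.choice(ALLOWED_CHARS) for _ in range(64))
    body = urllib.parse.urlencode(
        {"username": username, "password": password}
    ).encode("ascii")
    request = urllib.request.Request(url, data=body, method="POST")
    request.add_header("Cookie", f"csrftoken={token}")
    request.add_header("X-CSRFToken", token)
    return request
```

Sending the request, for instance with urllib.request.urlopen() against a local development server, is left out here. A successful login typically answers with a redirect, so inspecting the Set-Cookie headers of that first response would need an opener that does not follow redirects.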

Just a note on the way SessionAuthentication.authenticate() is being overridden in the code snippet in the question: according to the DRF docs that method should return a (User, auth) tuple instead of None if the request has session data, i.e. if the request is from a logged-in user. Also, I think rotate_token() is unnecessary, because this code only checks for authentication status, and is not concerned with actually authenticating users. (The Django source says rotate_token() “should be done on login”.)

[1] Some minor edits were made for style and clarity.

This article is licensed under CC BY-SA 4.0.