Where am I?

External Variables (Code Review, Part II)

September 28, 2012 at 12:34 PM | categories: XQuery, MarkLogic | View Comments

Remember when I talked about XQuery Code Review? The other day I was forwarding that link to a client, and noticed that I forgot to mention external variables. I talked about xdmp:eval and xdmp:value in the section titled Look_for_injection_paths, and mentioned that it's usually better to use xdmp:invoke or xdmp:unpath, which are less vulnerable to injection attacks.

But it can be convenient or even necessary to evaluate dynamic XQuery. That's what xdmp:eval and xdmp:value are there for, after all. I've even written tools like Presta to help you.

Used properly, dynamic queries can be made safe. The trick is to never let user data directly into your dynamic queries. Whenever you see xdmp:eval or xdmp:value in XQuery, ask yourself "Where did this query comes from?" If any part of it came from user input, flag it for a rewrite.

(: WRONG - This code is vulnerable to an injection attack! :)
  concat('doc("', xdmp:get-request-field('uri'), '")'))

Actually there are at least two bugs in this code. There is a functional problem: what happens if the uri request field is fubar-"baz"? You might not expect a uri to include a quote, and maybe that will never legitimately happen in your application. But if that request-field does arrive, xdmp:value will throw an error:

XDMP-UNEXPECTED: (err:XPST0003) Unexpected token syntax error

That's because you haven't properly escaped the uri in the dynamic XQuery. And you could escape it. You could even write a function to do that for you. But if you miss any of the various characters that need escaping, XDMP-UNEXPECTED will be there, waiting for you.

So far we've only talked about innocent mistakes. But what if someone out there is actively hostile? Let's say it's me. If I know that your web service expects a uri request-field, I might guess that your code looks something like the code above, and try an injection attack.

After a little trial and error, I might find that sending uri=x"),cts:uris(),(" returns a list of all the documents in your database, whether you want me to see them or not. Then I can send something like uri=x"),xdmp:document-delete("fubar. If that document exists, and security isn't tight... it's gone. Or maybe I will decide to try xdmp:forest-clear instead.

In SQL we use bind variables to solve both of these problems. Any user input binds to a variable inside the SQL, and the database driver takes care of escaping for us. We no longer have to worry about obscure syntax errors or injection attacks, as long as we remember to use variable for all externally-supplied parameters. In XQuery these are known as external variables.

(: Always use external variables for user-supplied data. :)
  'declare variable $URI as xs:string external ;
  (xs:QName('URI'), xdmp:get-request-field('uri')))

The syntax is a little odd: that second parameter is a sequence of alternating QName and value. Because XQuery doesn't support nested sequences, this means you can't naively bind a sequence to a value. Instead you can pass in XML or a map, or use a convention like comma-separated values (CSV).

(: Using XML to bind a sequence to an external variable. :)
  'declare variable $URI-LIST as element(uri-list) external ;
   element uri-list {
     for $uri in xdmp:get-request-field('uri')
     return element uri { $uri } }))

Even though these examples all use pure XQuery, this code review principle also applies to XCC code. If you see a Java or .NET program using AdHocQuery, check to make sure that all user input binds to variables.

Remember, the best time to fix a potential security problem is before the code goes live.

Read and Post Comments