External Variables (Code Review, Part II)
September 28, 2012 at 12:34 PM | categories: XQuery, MarkLogic | View CommentsRemember when I talked about XQuery Code Review?
The other day I was forwarding that link to a client,
and noticed that I forgot to mention external variables.
I talked about xdmp:eval
and xdmp:value
in the section titled Look_for_injection_paths,
and mentioned that it's usually better to use xdmp:invoke
or xdmp:unpath
,
which are less vulnerable to injection attacks.
But it can be convenient or even necessary to evaluate dynamic XQuery.
That's what xdmp:eval
and xdmp:value
are there for, after all.
I've even written tools like Presta
to help you.
Used properly, dynamic queries can be made safe.
The trick is to never let user data directly into your dynamic queries.
Whenever you see xdmp:eval
or xdmp:value
in XQuery,
ask yourself "Where did this query comes from?"
If any part of it came from user input, flag it for a rewrite.
(: WRONG - This code is vulnerable to an injection attack! :)
xdmp:eval(
concat('doc("', xdmp:get-request-field('uri'), '")'))
Actually there are at least two bugs in this code.
There is a functional problem: what happens if the uri
request field
is fubar-"baz"
? You might not expect a uri to include a quote,
and maybe that will never legitimately happen in your application.
But if that request-field does arrive, xdmp:value
will throw an error:
XDMP-UNEXPECTED: (err:XPST0003) Unexpected token syntax error
That's because you haven't properly escaped the uri in the dynamic XQuery.
And you could escape it. You could even write a function to do that for you.
But if you miss any of the various characters that need escaping,
XDMP-UNEXPECTED
will be there, waiting for you.
So far we've only talked about innocent mistakes. But what if someone out there
is actively hostile? Let's say it's me. If I know that your web service
expects a uri
request-field, I might guess that your code looks something like
the code above, and try an injection attack.
After a little trial and error, I might find that sending
uri=x"),cts:uris(),("
returns a list of all the documents in your database,
whether you want me to see them or not. Then I can send something like
uri=x"),xdmp:document-delete("fubar
. If that document exists,
and security isn't tight... it's gone. Or maybe I will decide to try
xdmp:forest-clear
instead.
In SQL we use bind variables to solve both of these problems. Any user input binds to a variable inside the SQL, and the database driver takes care of escaping for us. We no longer have to worry about obscure syntax errors or injection attacks, as long as we remember to use variable for all externally-supplied parameters. In XQuery these are known as external variables.
(: Always use external variables for user-supplied data. :)
xdmp:eval(
'declare variable $URI as xs:string external ;
doc($URI)',
(xs:QName('URI'), xdmp:get-request-field('uri')))
The syntax is a little odd: that second parameter is a sequence of alternating QName and value. Because XQuery doesn't support nested sequences, this means you can't naively bind a sequence to a value. Instead you can pass in XML or a map, or use a convention like comma-separated values (CSV).
(: Using XML to bind a sequence to an external variable. :)
xdmp:eval(
'declare variable $URI-LIST as element(uri-list) external ;
doc($URI-LIST/uri)',
(xs:QName('URI-LIST'),
element uri-list {
for $uri in xdmp:get-request-field('uri')
return element uri { $uri } }))
Even though these examples all use pure XQuery, this code review principle
also applies to XCC code. If you see a Java or .NET program using AdHocQuery
,
check to make sure that all user input binds to variables.
Remember, the best time to fix a potential security problem is before the code goes live.