A Parameterized SPARQL String is a SPARQL query/update into which values may be injected.
The intended usage of this is where using a QuerySolutionMap
as
initial bindings is either inappropriate or not possible e.g.
- Generating query/update strings in code without lots of error prone and messy string concatenation
- Preparing a query/update for remote execution
- Where you do not want to simply say some variable should have a certain value but rather wish to insert constants into the query/update in place of variables
- Defending against SPARQL injection when creating a query/update using some external input, see SPARQL Injection notes for limitations.
- Provide a more convenient way to prepend common prefixes to your query
This class is useful for preparing both queries and updates hence the
generic name as it provides programmatic ways to replace variables in
the query with constants and to add prefix and base declarations. A
Query
or UpdateRequest
can be created using
the asQuery()
and asUpdate()
methods assuming the command an
instance represents is actually valid as a query/update.
Building parameterised commands
A ParameterizedSparqlString is created as follows:
ParameterizedSparqlString pss = new ParameterizedSparqlString();
There are also constructor overloads that take in an initial command text, parameter values, namespace prefixes etc. which may allow you to simplify some code.
Once you have an instance you first set your template command with the setCommandText()
method like so:
pss.setCommandText("SELECT * WHERE {\n" +
" ?s a ?type .\n" +
" OPTIONAL { ?s rdfs:label ?label . }\n" +
"}");
Note that in the above example we did not define the rdfs:
prefix so as it stands the query is invalid. However
you can automatically populate BASE
and PREFIX
declarations for your command without having to explicitly
declare them in your command text by using the setBaseUri()
and setNsPrefix()
method e.g.
// Add a Base URI and define the rdfs prefix
pss.setBaseUri("http://example.org/base#");
pss.setNsPrefix("rdfs", "http://www.w3.org/2000/01/rdf-schema#");
You can always call toString()
to see the current state of your instance e.g.
// Print current state to stdout
System.out.println(pss.toString());
Which based on the calls so far would print the following:
BASE <http://example.org/base#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT * WHERE {
?s a ?type .
OPTIONAL { ?s rdfs:label ?label . }
}
Note that the state of the instance returned by toString()
will include any injected values. Part of what the toString()
method
does is check that your command is not subject to SPARQL injection attacks so in some cases where a possible
injection is detected an ARQException
will be thrown.
Injecting Values
Once you have a command text prepared then you want to actually inject values into it, values may be injected in several ways:
- By treating a variable in the SPARQL string as a parameter
- Using JDBC style positional parameters
- Appending values directly to the command text being built
See the ParameterizedSparqlString javadocs for a comprehensive reference of available methods for setting values, the following sections shows some basic examples of this.
Variable Parameters
Any SPARQL variable in the command text may have a value injected to it, injecting a value replaces all usages of that variable in the command i.e. substitutes the variable for a constant. Importantly injection is done by textual substitution so in some cases may cause unexpected side effects.
Variables parameters are set via the various setX()
methods which take a String
as their first argument e.g.
// Set an IRI
pss.setIri("x", "http://example.org");
// Set a Literal
pss.setLiteral("x", 1234);
pss.setLiteral("x", true);
pss.setLiteral("x", "value");
Where you set a value for a variable you have already set the existing value is overwritten. Setting any value
to null
has the same effect as calling the clearParam("x")
method
If you have the value already as a RDFNode
or Node
instance you can call the setParam()
method instead e.g.
// Set a Node
Node n = NodeFactory.createIRI("http://example.org");
pas.setParam("x", n);
Positional Parameters
You can use JDBC style positional parameters if you prefer, a JDBC style parameter is a single ?
followed by
whitespace or certain punctuation characters (currently ; , .
). Positional parameters have a unique index which
reflects the order in which they appear in the string. Note that positional parameters use a zero based index.
Positional parameters are set via the various setX()
methods which take an int
as their first argument e.g.
// Set an IRI
pss.setIri(0, "http://example.org");
// Set a Literal
pss.setLiteral(0, 1234);
pss.setLiteral(0, true);
pss.setLiteral(0, "value");
Where you set a value for a variable you have already set the existing value is overwritten. Setting any value
to null
has the same effect as calling the clearParam(0)
method
If you have the value already as a RDFNode
or Node
instance you can call the setParam()
method instead e.g.
// Set a Node
Node n = NodeFactory.createIRI("http://example.org");
pas.setParam(0, n);
Non-existent parameters
Where you try to set a variable/positional parameter that does not exist there will be no feedback that the parameter
does not exist, however the value set will not be included in the string produced when calling the toString()
method.
Buffer Usage
Additionally you may use this purely as a StringBuffer
replacement for creating commands since it provides a
large variety of convenience methods for appending things either as-is or as nodes (which causes appropriate
formatting to be applied).
For example we could add an ORDER BY
clause to our earlier example like so:
// Add ORDER BY clause
pss.append("ORDER BY ?s");
Be aware that the basic append()
methods append the given value as-is without any special formatting applied, if you
wanted to use the value being appended as a constant in the SPARQL query then you should use the appropriate appendLiteral()
,
appendIri()
or appendNode()
method e.g.
// Add a LIMIT clause
pss.append("LIMIT ");
pss.appendLiteral(50);
Getting a Query/Update
Once you’ve prepared your command you should then call the asQuery()
or asUpdate()
method to get it as a Query
or UpdateRequest
object as appropriate. Doing this calls toString()
to produce the final version of your command with
all values injected and runs it through the appropriate parser (either QueryFactory
or UpdateFactory
).
You can then use the returned Query
or UpdateRequest
object as you would normally to make a query/update.
SPARQL Injection Notes
First a couple of warnings:
- This class does not in any way check that your command is syntactically correct until such time as you try and parse
it as a
Query
orUpdateRequest
. - Injection is done purely based on textual replacement, it does not understand or respect variable scope in any way. For example if your command text contains sub queries you should ensure that variables within the sub query which you don’t want replaced have distinct names from those in the outer query you do want replaced (or vice versa)
While this class was in part designed to prevent SPARQL injection it is by no means foolproof because it works purely at the textual level. The current version of the code addresses some possible attack vectors that the developers have identified but we do not claim to be sufficiently devious to have thought of and prevented every possible attack vector.
Therefore we strongly recommend that users concerned about SPARQL Injection attacks perform their own validation on provided parameters and test their use of this class themselves prior to its use in any security conscious deployment. We also recommend that users do not use easily guess-able variable names for their parameters as these can allow a chained injection attack though generally speaking the code should prevent these.