Class RDFParserBuilder

java.lang.Object
org.apache.jena.riot.RDFParserBuilder

public class RDFParserBuilder extends Object
An RDFParser is a process that will generate triples; RDFParserBuilder provides the means to set up the parser.

An RDFParser has a predefined source; the target for output is given when the "parse" method is called. It can be used multiple times, in which case the same source is reread. The destination can vary between calls. The application is responsible for concurrency control on the destination of the parse operation. The process is:

     StreamRDF destination = ...
     RDFParser parser = RDFParser.create()
          .source("filename.ttl")
          .build();
     parser.parse(destination);
 
or using a shortcut (parse on the builder returns void, so there is no parser to assign):
     RDFParser.create()
          .source("filename.ttl")
          .parse(destination);
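
As a rough sketch of the reuse described above, the following builds one parser and runs it against two separate destinations. The class name, the sample data and the use of an in-memory string source (rather than a file) are assumptions made for illustration only.

```java
import org.apache.jena.graph.Graph;
import org.apache.jena.riot.Lang;
import org.apache.jena.riot.RDFParser;
import org.apache.jena.riot.system.StreamRDFLib;
import org.apache.jena.sparql.graph.GraphFactory;

class ParserReuseExample {
    // Hypothetical sample data, used only for this illustration.
    static final String DATA =
        "<http://example/s> <http://example/p> <http://example/o> .";

    /** Build one parser, then send its output to two different graphs. */
    static int[] parseTwice() {
        RDFParser parser = RDFParser.create()
                .fromString(DATA)
                .lang(Lang.TTL)      // a string source has no extension to guess from
                .build();

        Graph first = GraphFactory.createDefaultGraph();
        Graph second = GraphFactory.createDefaultGraph();

        // The same source is reread on each call; only the destination varies.
        parser.parse(StreamRDFLib.graph(first));
        parser.parse(StreamRDFLib.graph(second));

        return new int[] { first.size(), second.size() };
    }

    public static void main(String[] args) {
        int[] sizes = parseTwice();
        System.out.println(sizes[0] + " " + sizes[1]);
    }
}
```

Sources that are consumed on reading (InputStream, Reader) do not support this pattern; see the notes on the individual source methods.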
 
  • Method Details

    • create

      public static RDFParserBuilder create()
    • source

      public RDFParserBuilder source(Path path)
      Set the source to Path. This clears any other source setting.

      The parser can be reused.

      Parameters:
      path -
      Returns:
      this
    • source

      public RDFParserBuilder source(String uriOrFile)
      Set the source to a URI; this includes OS file names. File URL should be of the form file:///.... This clears any other source setting.

      The parser can be reused.

      Parameters:
      uriOrFile -
      Returns:
      this
    • fromString

      public RDFParserBuilder fromString(String string)
      Use the given string as the content to parse. This clears any other source setting.

      The syntax must be set with .lang(...).

      The parser can be reused.

      Parameters:
      string - The characters to be parsed.
      Returns:
      this
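
      A minimal sketch of parsing in-memory content, assuming Turtle input. The class and method names are hypothetical; only fromString, lang and toModel come from this page.

```java
import org.apache.jena.rdf.model.Model;
import org.apache.jena.riot.Lang;
import org.apache.jena.riot.RDFParser;

class FromStringExample {
    /** Parse Turtle text held in a String into a fresh Model. */
    static Model parseTurtle(String turtle) {
        return RDFParser.create()
                .fromString(turtle)
                .lang(Lang.TTL)   // required: a string carries no syntax information
                .toModel();
    }

    public static void main(String[] args) {
        Model model = parseTurtle(
            "<http://example/s> <http://example/p> \"hello\" .");
        System.out.println(model.size());
    }
}
```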
    • source

      public RDFParserBuilder source(InputStream input)
      Set the source to InputStream. This clears any other source setting.

      The syntax must be set with .lang(...).

      The InputStream will be closed when the parser is called and the parser can not be reused.

      Parameters:
      input -
      Returns:
      this
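
      A sketch of reading from an InputStream, assuming the bytes are N-Triples. The helper name is hypothetical. Because the stream is consumed and closed by the parse, a new builder and stream are needed for each read.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

import org.apache.jena.graph.Graph;
import org.apache.jena.riot.Lang;
import org.apache.jena.riot.RDFParser;

class InputStreamSourceExample {
    /** Parse N-Triples bytes into a fresh Graph. */
    static Graph parseNTriples(byte[] bytes) {
        InputStream in = new ByteArrayInputStream(bytes);
        return RDFParser.create()
                .source(in)
                .lang(Lang.NT)   // a raw stream gives no hint about the syntax
                .toGraph();
    }

    public static void main(String[] args) {
        byte[] data = "<http://example/s> <http://example/p> <http://example/o> .\n"
                .getBytes(StandardCharsets.UTF_8);
        System.out.println(parseNTriples(data).size());
    }
}
```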
    • source

      public RDFParserBuilder source(StringReader reader)
      Set the source to StringReader. This clears any other source setting. The StringReader will be closed when the parser is called and the parser can not be reused.

      The syntax must be set with .lang(...).

      Consider using fromString(java.lang.String) instead.

      Parameters:
      reader -
      Returns:
      this
    • source

      @Deprecated public RDFParserBuilder source(Reader reader)
      Deprecated.
      Use fromString(java.lang.String), or an InputStream or a StringReader.
      Set the source to Reader. This clears any other source setting. The Reader will be closed when the parser is called and the parser can not be reused.

      The syntax must be set with .lang(...).

      Parameters:
      reader -
      Returns:
      this
    • streamManager

      public RDFParserBuilder streamManager(StreamManager streamManager)
      Set the StreamManager to use when opening a URI (including files by name, but not by Path).
      Parameters:
      streamManager -
      Returns:
      this
    • lang

      public RDFParserBuilder lang(Lang lang)
      Set the hint Lang. This is the RDF syntax used when there is no way to deduce the syntax (e.g. reading from an InputStream, an unrecognized file extension, or no recognized HTTP Content-Type provided).
      Parameters:
      lang -
      Returns:
      this
    • strict

      public RDFParserBuilder strict(boolean strictMode)
      Set the parser built to "strict" mode. The default is the system-wide setting of SysRIOT.isStrictMode().
      Parameters:
      strictMode -
      Returns:
      this
    • forceLang

      public RDFParserBuilder forceLang(Lang lang)
      Force the choice of RDF syntax to be lang, and ignore any indications such as file extension or HTTP Content-Type.
      Parameters:
      lang -
      Returns:
      this
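
      To illustrate the difference from the lang hint, the sketch below writes Turtle into a temporary file whose ".rdf" extension suggests another syntax, then forces Turtle. The file name, contents and class name are assumptions for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

import org.apache.jena.graph.Graph;
import org.apache.jena.riot.Lang;
import org.apache.jena.riot.RDFParser;

class ForceLangExample {
    /** Parse a file as Turtle even though its extension says otherwise. */
    static Graph parseMisnamedTurtle() {
        Path file = null;
        try {
            // Hypothetical data file: Turtle content under a misleading extension.
            file = Files.createTempFile("example", ".rdf");
            Files.writeString(file,
                "<http://example/s> <http://example/p> \"o\" .\n");

            return RDFParser.create()
                    .source(file)
                    .forceLang(Lang.TTL)   // overrides the file extension
                    .toGraph();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            if (file != null) {
                try {
                    Files.deleteIfExists(file);
                } catch (IOException ignored) {
                    // Best-effort cleanup of the temporary file.
                }
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(parseMisnamedTurtle().size());
    }
}
```

      With lang(Lang.TTL) in place of forceLang, the extension would be consulted first and the hint used only if no syntax could be deduced.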
    • acceptHeader

      public RDFParserBuilder acceptHeader(String acceptHeader)
      Set the HTTP "Accept" header. The default if not set is WebContent.defaultRDFAcceptHeader.
      Parameters:
      acceptHeader -
      Returns:
      this
    • httpHeader

      public RDFParserBuilder httpHeader(String header, String value)
      Set an HTTP header. Any previous setting is lost.

      Consider setting up an HttpClient if more complicated settings for an HTTP request are required.

    • httpClient

      public RDFParserBuilder httpClient(HttpClient httpClient)
      Set an HTTP client. Any previous setting is lost.

      Consider setting up an HttpClient if more complicated settings for an HTTP request are required.

    • base

      public RDFParserBuilder base(String base)
      Set the base URI for parsing. The default is to have no base URI.
    • resolveURIs

      public RDFParserBuilder resolveURIs(boolean flag)
      Choose whether to resolve URIs or throw an error.

      This does not affect all languages: N-Triples and N-Quads never resolve URIs.
      If this flag is false, relative URIs cause parse errors.
      Only set this to false for debugging and development purposes.

    • resolver

      public RDFParserBuilder resolver(org.apache.jena.irix.IRIxResolver resolver)
      Provide a specific IRIxResolver to check and resolve URIs. Its settings will determine the base IRI and whether to resolve relative IRIs or not. The caller is responsible for giving a resolver that is suitable for the RDF syntax to be parsed.
    • prefixes

      public RDFParserBuilder prefixes(PrefixMap prefixMap)
      Set an initial prefix map for parsing.

      Using this, and base(java.lang.String), mean that Turtle and TriG fragments can be parsed.

      The caller is responsible for setting any prefixes that are undeclared in the fragment.

      Changes made to the prefix map argument after this call will not be seen by the parser. Passing null clears any previous setting.
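
      A sketch of parsing a Turtle fragment that declares neither prefixes nor a base. The prefix "ex", both IRIs and the class name are assumptions chosen for the example.

```java
import org.apache.jena.graph.Graph;
import org.apache.jena.riot.Lang;
import org.apache.jena.riot.RDFParser;
import org.apache.jena.riot.system.PrefixMap;
import org.apache.jena.riot.system.PrefixMapFactory;

class FragmentExample {
    /** Parse a fragment, supplying the prefixes and base it relies on. */
    static Graph parseFragment(String fragment) {
        PrefixMap prefixes = PrefixMapFactory.create();
        prefixes.add("ex", "http://example/ns#");   // hypothetical namespace

        return RDFParser.create()
                .fromString(fragment)
                .lang(Lang.TTL)
                .base("http://example/base/")       // hypothetical base URI
                .prefixes(prefixes)
                .toGraph();
    }

    public static void main(String[] args) {
        // No @prefix or @base lines: both come from the builder settings.
        Graph g = parseFragment("ex:s ex:p <o> .");
        g.find().forEachRemaining(System.out::println);
    }
}
```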

    • canonicalValues

      public RDFParserBuilder canonicalValues(boolean flag)
      Convert the lexical form of literals to a canonical form.

      Two literals can be different RDF terms for the same value.

      Examples include (first shown of the pair is the canonical form):

          "1"^^xsd:integer and "+01"^^xsd:integer
          "1.0E0"^^xsd:double and "1"^^xsd:double
       
      The canonical forms follow XSD 1.1 2.3.1 Canonical Mapping (https://www.w3.org/TR/xmlschema11-2/#canonical-lexical-representation), except in the case of xsd:decimal, where it follows the older XSD 1.0 form, which is legal as Turtle's short form ("1.0"^^xsd:decimal rather than "1"^^xsd:decimal). See XSD 1.0 3.2.3.2 Canonical representation.

      The effect on literals where the lexical form does not represent a valid value (for example, "3000"^^xsd:byte) is undefined.

      This option is off by default.

      This option can slow parsing down.

      For consistent loading of data, it is recommended that data is cleaned and canonicalized before loading so the conversion is done once.
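
      The effect can be observed with a small sketch that parses the xsd:integer example above with the option off and on, and returns the lexical form of the object. The class name, method name and sample triple are hypothetical.

```java
import org.apache.jena.graph.Graph;
import org.apache.jena.graph.Node;
import org.apache.jena.graph.Triple;
import org.apache.jena.riot.Lang;
import org.apache.jena.riot.RDFParser;

class CanonicalValuesExample {
    // Hypothetical triple whose object is written in a non-canonical form.
    static final String DATA =
        "<http://example/s> <http://example/p> "
        + "\"+01\"^^<http://www.w3.org/2001/XMLSchema#integer> .";

    /** Return the lexical form of the single object after parsing. */
    static String objectLexicalForm(boolean canonical) {
        Graph g = RDFParser.create()
                .fromString(DATA)
                .lang(Lang.TTL)
                .canonicalValues(canonical)
                .toGraph();
        Triple t = g.find(Node.ANY, Node.ANY, Node.ANY).next();
        return t.getObject().getLiteralLexicalForm();
    }

    public static void main(String[] args) {
        System.out.println("as given:  " + objectLexicalForm(false));
        System.out.println("canonical: " + objectLexicalForm(true));
    }
}
```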

    • langTagLowerCase

      @Deprecated public RDFParserBuilder langTagLowerCase()
      Deprecated.
      In Jena5, language tags are always converted to RFC 5646 case format.
      Convert language tags to lower case.

      This is the suggested form in RDF 1.1 for comparisons. However, this is not the recommended canonical form in RFC 5646.

      Provided all data is converted consistently, language tag equality is maintained for either lower case or RFC canonicalization styles.

      This option can slow parsing down.

    • langTagCanonical

      @Deprecated public RDFParserBuilder langTagCanonical()
      Deprecated.
      In Jena5, language tags are always converted to RFC 5646 case format.
      Language tags are case-normalized as defined by RFC 5646. Example: en-GB, not en-gb.

      This does not affect the RDF 1.1 requirement that the value-space of language tags is lower-case.

      Provided all data is converted consistently, language tag equality is maintained for either lower case or RFC canonicalization.

      This option can slow parsing down.

    • langTagAsGiven

      @Deprecated public RDFParserBuilder langTagAsGiven()
      Deprecated.
      In Jena5, language tags are always converted to RFC 5646 case format.
      The form of the language tags as given in the data is preserved. This is the default behaviour of parsing.
    • checking

      public RDFParserBuilder checking(boolean flag)
      Set whether to perform checking. N-Triples and N-Quads default to no checking; other languages default to checking.

      Checking adds warnings over and above basic syntax errors.

      • URIs: whether IRIs conform to all the rules of the URI scheme.
      • Literals: whether the lexical form conforms to the rules for the datatype.
      • Triples and quads: check slots have a valid kind of RDF term (parsers usually make this a syntax error anyway).

      See also errorHandler(ErrorHandler) to control the output. The default is to log. This can also be used to turn warnings into exceptions.
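
      As a sketch of turning warnings into exceptions, the following combines checking with a strict error handler. It assumes ErrorHandlerFactory.errorHandlerStrict as the handler; the class name and test data are hypothetical.

```java
import org.apache.jena.riot.Lang;
import org.apache.jena.riot.RDFParser;
import org.apache.jena.riot.RiotException;
import org.apache.jena.riot.system.ErrorHandlerFactory;
import org.apache.jena.riot.system.StreamRDFLib;

class StrictWarningsExample {
    /** Return true if parsing the Turtle text raises a RiotException. */
    static boolean rejects(String turtle) {
        try {
            RDFParser.create()
                    .fromString(turtle)
                    .lang(Lang.TTL)
                    .checking(true)
                    // Treat warnings as errors instead of only logging them.
                    .errorHandler(ErrorHandlerFactory.errorHandlerStrict)
                    .parse(StreamRDFLib.sinkNull());   // discard the output
            return false;
        } catch (RiotException ex) {
            return true;
        }
    }

    public static void main(String[] args) {
        String xsdInt = "<http://www.w3.org/2001/XMLSchema#integer>";
        // Syntactically valid Turtle, but "abc" is not a valid integer.
        String bad = "<http://example/s> <http://example/p> \"abc\"^^" + xsdInt + " .";
        String good = "<http://example/s> <http://example/p> \"1\"^^" + xsdInt + " .";
        System.out.println(rejects(bad) + " " + rejects(good));
    }
}
```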

    • errorHandler

      public RDFParserBuilder errorHandler(ErrorHandler handler)
      Set the ErrorHandler to use. This replaces any previous setting. The default is to use the slf4j logger "RIOT".
      Parameters:
      handler -
      Returns:
      this
    • factory

      public RDFParserBuilder factory(FactoryRDF factory)
      Set the FactoryRDF to use. The FactoryRDF controls how parser output is turned into Node and how Triples and Quads are built. This replaces any previous setting.
      The default is to use RiotLib.factoryRDF(), which provides Node reuse.
      The FactoryRDF also determines how blank node labels in RDF syntax are mapped to blank node objects. Use
          new Factory(myLabelToNode)
       
      to create a FactoryRDF and set the LabelToNode step.
      Parameters:
      factory -
      Returns:
      this
      See Also:
      • labelToNode
    • labelToNode

      public RDFParserBuilder labelToNode(LabelToNode labelToNode)
      Use the given LabelToNode, the policy for converting blank node labels in RDF syntax to Jena's Node objects (usually a blank node).
      Only applies when the FactoryRDF is not set in the RDFParserBuilder, otherwise the FactoryRDF controls the label-to-node process.
      SyntaxLabels.createLabelToNode() is the default policy.
      LabelToNode.createUseLabelAsGiven() uses the label in the RDF syntax directly. This does not produce safe RDF and should only be used for development and debugging.
      Parameters:
      labelToNode -
      Returns:
      this
      See Also:
      • factory
    • context

      public RDFParserBuilder context(Context context)
      Set the context for the parser when built.
      Parameters:
      context -
      Returns:
      this
    • set

      public RDFParserBuilder set(Symbol symbol, Object value)
      Add a setting to the context for the parser when built. A value of "null" removes a previous setting.
      Parameters:
      symbol -
      value -
      Returns:
      this
    • set

      public RDFParserBuilder set(Symbol symbol, boolean value)
      Add a setting to the context for the parser when built.
      Parameters:
      symbol -
      value -
      Returns:
      this
    • parse

      public void parse(StreamRDF stream)
      Parse the source, sending the results to a StreamRDF. Short form for build().parse(stream).
      Parameters:
      stream -
    • parse

      public void parse(org.apache.jena.graph.Graph graph)
      Parse the source, sending the results to a Graph. The source must be for triples; any quads are discarded. Short form for build().parse(graph) which sends triples and prefixes to the Graph.
      Parameters:
      graph -
    • parse

      public void parse(org.apache.jena.rdf.model.Model model)
      Parse the source, sending the results to a Model. The source must be for triples; any quads are discarded. Short form for build().parse(model) which sends triples and prefixes to the Model.
      Parameters:
      model -
    • parse

      public void parse(DatasetGraph dataset)
      Parse the source, sending the results to a DatasetGraph. Short form for build().parse(dataset) which sends triples and prefixes to the DatasetGraph.
      Parameters:
      dataset -
    • parse

      public void parse(Dataset dataset)
      Parse the source, sending the results to a Dataset. Short form for build().parse(dataset) which sends triples and prefixes to the Dataset.
      Parameters:
      dataset -
    • toGraph

      public org.apache.jena.graph.Graph toGraph()
      Parse the source into a fresh Graph and return the graph.

      The source must be for triples; any quads are discarded.

    • toModel

      public org.apache.jena.rdf.model.Model toModel()
      Parse the source into a fresh Model and return the model.

      The source must be for triples; any quads are discarded.

    • toDataset

      public Dataset toDataset()
      Parse the source into a fresh Dataset and return the dataset.
    • toDatasetGraph

      public DatasetGraph toDatasetGraph()
      Parse the source into a fresh DatasetGraph and return the DatasetGraph.
    • build

      public RDFParser build()
      Build an RDFParser. The parser takes its configuration from this builder, and that configuration cannot then be changed. The source must be set. When a parser is used, it takes the source and sends output to a StreamRDF.

      Shortcuts: the parse(...) methods of this builder are short forms for build().parse(...).

      Returns:
      RDFParser
    • clone

      public RDFParserBuilder clone()
      Duplicate this builder with its current settings. Later changes to the settings of this builder do not affect the clone.
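
      One possible use is keeping a builder with shared settings and cloning it per input. The sketch below assumes Turtle strings as inputs; the class name and data are hypothetical.

```java
import org.apache.jena.riot.Lang;
import org.apache.jena.riot.RDFParser;
import org.apache.jena.riot.RDFParserBuilder;

class CloneExample {
    /** Parse two different inputs from a shared configuration. */
    static long[] sizes() {
        // Shared settings, with no source set yet.
        RDFParserBuilder original = RDFParser.create().lang(Lang.TTL);
        RDFParserBuilder copy = original.clone();

        // Each builder now gets its own source; neither setting affects the other.
        original.fromString("<http://example/s> <http://example/p> 1 .");
        copy.fromString("<http://example/s> <http://example/p> 1, 2 .");

        return new long[] { original.toModel().size(), copy.toModel().size() };
    }

    public static void main(String[] args) {
        long[] result = sizes();
        System.out.println(result[0] + " " + result[1]);
    }
}
```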