<OpenEndpoints/> transforms content from CMS, REST APIs, databases and static files into other content structures and other content types. The technical solution is XML transformation.
This requires
XML as a data-source
XSLT to transform XML into other content structure and content type
XSLT (Extensible Stylesheet Language Transformations) is a language for transforming XML documents into many different formats: other XML documents, or other formats such as HTML, PDF (precisely: XSL-FO), or plain text which may subsequently be converted to other formats, such as JSON. XSLT is an open, stable and established technology. Read more here: https://en.wikipedia.org/wiki/XSLT.
While XSLT is incredibly versatile and powerful, XML is not a very common type of data source on the web, which we sincerely regret, because XML is just perfect for describing all sorts of semantic content and provides established tools to manipulate that content. This is where <OpenEndpoints/> comes in:
XML transformation for non-XML data-sources
<OpenEndpoints/> automatically converts various data-source types into XML and applies XSLT to transform original content into something new.
Required components are:
The data-source
The XSLT
The "transformer" - which basically combines the data-source and the XSLT in order to generate new output.
In the data-sources directory under the application, there are zero or more files, each describing a data-source.
A data source is a list of commands (e.g. fetch XML from URL) which produce XML. Each data source is executed, and the results are appended into an XML document (e.g. fetch XML from two URLs, then the result of the data source will be an XML document with two child elements, which are the XML fetched from the two URLs).
The data source file contains the <data-source> root element, then any number of data-source commands in any order.
The resulting XML document has the root tag <transformation-input>. The results of the commands are appended, in order, directly underneath this tag.
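As a minimal sketch of the structure described above, a data source file with two commands might look as follows. The file name and both URLs are placeholders chosen for illustration, and the choice of two <xml-from-url> commands is only an example; other commands can appear in any order.

```xml
<!-- data-sources/my-data-source.xml (hypothetical file name) -->
<data-source>
    <!-- each command produces XML; the results are appended in this order -->
    <xml-from-url>
        <url>https://example.com/products.xml</url>
    </xml-from-url>
    <xml-from-url>
        <url>https://example.com/prices.xml</url>
    </xml-from-url>
</data-source>
```

The resulting document would then have <transformation-input> as its root, with the output of the first command followed by the output of the second directly beneath it.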
The XML transformation is stored in a single file ("xslt-file"). In the data-source-xslt directory under the application, there are zero or more XSLT files that can be used for your transformations. You can use subdirectories to organize your XSLT files.
The data-type of the generated output is determined by the XSLT file.
Native XSLT can produce XML, plain text or HTML. A special markup of XML is XSL-FO, which can be converted into PDF. The option to trigger conversion of XSL-FO into PDF is described here.
In the transformers directory under the application, there are zero or more files, each describing a transformation. The transformation determines which XSLT to apply on which data-source.
Transformation Options for post-processing the result enable flexible application for various practical use cases.
If you wish to capture the input/output of a transformation see Writing Transformation Input/Output to AWS S3.
This lists the most recent object keys (filenames) in the bucket specified in the aws-s3-configuration.xml file (see example-customer for an example of the format).
“Most recent” means the keys of the objects with the most recent last modified timestamp.
The command looks like:
and the results look like:
This reads a particular object (file) from AWS S3. It is assumed that this object contains XML data. The command looks like:
and the results look like:
In the transformers directory under the application, there are zero or more files, each describing a transformation. Subdirectories are not supported.
The root element of each transformer file is <transformer>.
The root <transformer> element has a mandatory attribute data-source. This is the name of the data-source without file extension. For example, if you have a file my-data-source.xml in the data-sources directory, then the correct attribute value is data-source="my-data-source".
The <xslt-file> element is optional. If omitted, the data-source will be returned without XSLT transformation. The name attribute of the <xslt-file> element is mandatory. It is the file name of an XSLT file including the file extension. Possible file extensions are ".xslt" or ".xsl".
The optional content-type element sets the MIME type of the generated output. The type attribute is mandatory. The value of this attribute is the MIME type that shall be set. If no content-type is set, Endpoints uses heuristics to guess an appropriate content type.
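Putting the elements above together, a transformer file could be sketched like this. The file names and the chosen MIME type are assumptions for illustration; only the element and attribute names are taken from the description above.

```xml
<!-- transformers/my-transformer.xml (hypothetical file name) -->
<transformer data-source="my-data-source">
    <!-- optional: if omitted, the data-source is returned untransformed -->
    <xslt-file name="my-report.xslt"/>
    <!-- optional: if omitted, the content type is guessed -->
    <content-type type="text/html"/>
</transformer>
```

Here data-source="my-data-source" refers to the file my-data-source.xml in the data-sources directory, and my-report.xslt is expected in the data-source-xslt directory.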
Potential Source of Error!
Using a placeholder for a parameter not declared in your endpoints.xml will raise an error!
For example, if you use ${firstname} in your CMS, but no parameter "firstname" exists in your application, this will not work.
Generating a REST API Request-Body
REST APIs often require a specific MIME type for the request body. Use the content-type element to set the required value.
The data-source will be wrapped into a root-tag <transformation-input>.
Note that this is useful for developing and debugging a data-source transformation. Omitting the xslt-file element returns exactly the input to which your XSLT will be applied.
The correct content-type is set automatically. It is possible to deliberately set a different content-type, but we do not recommend doing so.
Note that in the example above the XSLT produces XML, which is then converted to JSON. An alternative option to generate JSON is to have XSLT with output type "text". In that case the correct syntax is different:
JSON Syntax
Conversion of XML to JSON can be done in different ways. If you need a specific JSON syntax, XSLT generating JSON might be the better option compared to generating XML with option .
UTF-8
XSLT output by default is UTF-8. Use <content-type type="text/plain; charset=xxx"/> to set a specific charset if required. In that case the generated output needs to match that specific charset, of course.
XSLT cannot generate output of type Excel. <OpenEndpoints/> offers a workaround which converts a simple HTML table into Excel binary format.
The format is chosen to be as similar to XHTML as possible. The syntax is as follows:
HTML should contain <table> elements.
These should contain <tr> elements and within them <td> (or <th>) elements.
Excel files differentiate between "text cells" and "number cells". The contents of the <td> are inspected to see if they look like a number, in which case an Excel "number cell" is produced, otherwise an Excel "text cell" is produced.
The attribute <convert-output-xml-to-excel input-decimal-separator="xxx"> affects how numbers in the input HTML document are parsed.
"dot" (default). Decimal separator is ".", thousand separator is ",".
"comma". Decimal separator is ",", thousand separator is ".".
"magic". Numbers may use either dot or comma as thousand or decimal separator, or the Swiss format 1'234.45. Heuristics are used to determine which system is in use. (This is useful in very broken input documents that use dot for some numbers and comma for others, within the same document.) The numbers must have either zero decimal places (e.g. "1,024") or two decimal places (e.g. "12,34"). Any other number of decimal places in the input data will lead to wrong results.
The number of decimal places in the <td> data is carried over to the Excel cell formatting. That is to say, <td>12.20</td> will produce an Excel number cell containing the value 12.2 with the Excel number format showing two decimal places, so it will appear as 12.20 in the Excel file.
To force the cell to be an Excel text cell, even if the above algorithm would normally classify it as an Excel number cell, mark the table cell with <td excel-type="text">.
The colspan attribute, e.g. <td colspan="2">, is respected.
The following style elements of <td> are respected:
style="text-align: center" (Right align etc. is not supported)
style="font-weight: bold"
style="border-top:" (Bottom borders etc. are not supported)
style="color: green", style="color: red", style="color: orange" (Other colors are not supported.)
<thead>, <tfoot> and <tbody> are respected. (Elements in <tfoot> sections will appear at the bottom of the Excel file, no matter what order the tags come in in the HTML.)
Column widths are determined by the lengths of text within each column.
Any <table> which appears inside a <td> is ignored (i.e. tables may be nested in the HTML, but only the outermost table is present in the resulting Excel file).
The contents of any <script> elements are ignored.
The contents of any other tags such as <span> and <div> are included.
Table rows which contain only table cells which contain no text are ignored. (Often such rows contain sub-tables, which themselves are ignored. Having empty rows doesn't look nice.)
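The rules above can be illustrated with a small table as the XSLT might emit it. This is only a sketch: the cell contents are invented, and it assumes the default "dot" decimal separator.

```xml
<table>
    <thead>
        <tr>
            <th>Article no.</th>
            <th>Price</th>
        </tr>
    </thead>
    <tbody>
        <tr>
            <!-- looks like a number, but is forced to be an Excel text cell -->
            <td excel-type="text">00123</td>
            <!-- Excel number cell with value 12.2, displayed with two decimal places -->
            <td style="font-weight: bold">12.20</td>
        </tr>
    </tbody>
    <tfoot>
        <tr>
            <!-- spans both columns; ends up at the bottom of the sheet -->
            <td colspan="2" style="color: red">Prices in EUR</td>
        </tr>
    </tfoot>
</table>
```

Only attributes and style values listed above are used, so every part of this table should carry over into the generated Excel file.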
RTF can be generated with XSLT using output type text. Set the correct content-type so that a downloaded (generated) RTF file opens in Word.
XSLT parameters can be useful to re-use the same XSLT for different transformations.
The parameter value can be set in the transformer file:
Note that variables ${foo} are not supported with this feature.
The data-source command <literal-xml> lets you define XML output directly and "literally" in the data-source definition file.
The root tag <literal-xml> is not included in the data-source XML output. In the example above, the generated XML will be:
This data-source type works well in combination with parameter placeholders. For example, you can use something like this:
If ${foo} equals "hello world", the data-source output will be:
Note that the contents of <literal-xml> must be elements; simply placing text directly under the <literal-xml> element will not work.
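A minimal sketch of such a data source, assuming an endpoint parameter named "foo" and a hypothetical element name <greeting>:

```xml
<!-- <greeting> is an invented element name; ${foo} is an endpoint parameter -->
<data-source>
    <literal-xml>
        <greeting>${foo}</greeting>
    </literal-xml>
</data-source>
```

If ${foo} equals "hello world", the element would be expected to appear under <transformation-input> as <greeting>hello world</greeting>, without a surrounding <literal-xml> tag.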
This content-source produces as its output a description of the entire application directory structure (=your configuration).
The generated content has a root-tag <application-introspection> and returns
<directory name="x"> for any directory
<file name="x"/> for all XML files. The content of the XML file is included as a child of this tag, except in the directory xml-from-application. (Use the <xml-from-application> data source, not <application-introspection>, to load content from such files.)
<file name="x"/> for all non-XML files. In this case the content is not included in any way.
XML files must actually contain XML
If a file named *.xml does not in fact contain well-formed XML, this is an error.
No expansion of endpoint parameters
Parameters like ${foo} found in the file are not expanded in this type of content-source.
A data source is a list of zero or many commands which fetch content from different content sources and produce XML. The resulting xml output is the input ("transformation-input") for XSLT to generate output documents.
Sometimes it makes sense to apply one or many intermediate steps to modify the loaded content before it becomes a "transformation-input". Possible reasons for this include
You might want to reuse the same XSLT to generate an output document for different content sources, but these content sources do not produce exactly the same input structure. An intermediate transformation step can be used to "normalize" the input among different content sources.
A complex transformation might be implemented more elegantly by splitting it into several successive steps.
A data-source-post-processing XSLT can apply XML transformations within the data-source object.
In the data-source-post-processing-xslt directory under the application, there are zero or more files, each describing a post-processing transformation step. Each file is an XSLT which expects input data with a root tag <data-source-post-processing-input> and which shall produce output XML with a root tag <data-source-post-processing-output>.
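A post-processing step of this kind can be sketched as a plain XSLT 1.0 stylesheet. The two root tags are the ones named above; the renaming of <item> to <product> is an invented normalization, shown only to give the step something to do.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml" indent="yes"/>

    <!-- replace the input root tag with the required output root tag -->
    <xsl:template match="/data-source-post-processing-input">
        <data-source-post-processing-output>
            <xsl:apply-templates select="node()"/>
        </data-source-post-processing-output>
    </xsl:template>

    <!-- hypothetical normalization: rename <item> elements to <product> -->
    <xsl:template match="item">
        <product>
            <xsl:apply-templates select="@*|node()"/>
        </product>
    </xsl:template>

    <!-- identity template: copy everything else unchanged -->
    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>
</xsl:stylesheet>
```

The identity template keeps all content that the step does not explicitly change, which makes it easy to chain several small steps one after another.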
Content loaded from source A:
Expected output:
You can add zero or many data-source-post-processing XSLT files to each content-source of your data-source object. For each content-source, post-processing will be executed separately. Multiple steps for the same content-source will be executed one after another, in the order of the post-processing XSLT files.
In addition, you can apply the same logic to the entire data-source object. In this case all content-sources are loaded as a first step, and post-processing applies to the collection of all content-sources.
<OpenEndpoints/> can generate unique auto-increment values and provide them as a data-source. Read for more details.
Endpoint parameters can be used as placeholders in your data source. On executing the data source definition any parameter placeholder will be replaced by the respective value of the parameter.
You can only use parameters defined in the endpoints.xml file of your configuration.
You can not use
intermediate parameters (from a task)
content submitted from a file-upload (or - more generally - any additional content submitted from a multi-part message)
If my parameter is called "foo" ...
then I can use ${foo} as a placeholder.
Placeholders can be used inside the data-source.xml file.
For example, you can select the specific piece of content loaded from CMS by evaluating a submitted parameter value. Or you could load specific database rows selected for a specific parameter value.
For security or technological reasons this does not work in every case. For details please refer to the specific sections:
On loading content from any of these data source types, placeholders will be automatically replaced by their parameter values:
xml-from-application
xml-from-url
For example, you can use ${foo} as a placeholder directly in your CMS. On loading data from your CMS, actual values will replace the placeholders.
Potential Source of Error!
Using a placeholder for a parameter not declared in your endpoints.xml will raise an error!
For example, if you use ${firstname} in your CMS, but no parameter "firstname" exists in your application, this will not work.
The beauty of <OpenEndpoints/> is its potential to produce custom content on demand. Practically, that means: Data submitted from a webform (or any other method of submitting data) can directly influence
what content is loaded from which data sources
what exactly the transformation of the content shall look like.
Endpoint parameters come from the following sources:
Parameters submitted from the originating request to the endpoint including, but not limited to, ?x=y GET parameters
The result of a parameter transformation (for example, extracting parameters from arbitrary XML sent as a POST)
An "intermediate value" generated by one task and consumed by future items (for example, results from an HTTP request which was sent during the processing)
This data source type will output all available parameter values and make them available for data-source transformation.
The generated content looks like this:
A special type of parameter value is an uploaded file. For example, if you are submitting data from a webform which has <input type="file" name="my-upload"/>, then the presentation of input depends on the type of uploaded content.
If the uploaded content can be parsed as XML, this xml will be available in the data-source:
If the uploaded content can not be parsed as XML, the output is:
XML content <> xml-file-extension
Whether a file upload is XML or not is determined by whether its content can be parsed as XML. The uploaded filename and Content Type are ignored, to allow files such as SVGs which have neither an XML file extension nor an XML Content Type.
Intermediate parameters are not regular endpoint parameters, i.e. they are not defined as a <parameter> in endpoints.xml and their value does not come from the original request. Intermediate outputs are generated from tasks that execute during the processing of the request. On forwarding an endpoint to another endpoint those parameters will be made available in the parameters-data-source as well:
By default the root-tag of the generated output is <parameters>. Use the optional tag attribute to generate any different root-tag:
Sometimes it may be required to insert a unique incremental id into a generated content.
Depending on the specific business use case, the incremental id may need to be unique in perpetuity (“perpetual”), or the counter may need to restart every month or every year.
Type        Example Value
perpetual   23456
year        2020-0068
month       2020-01-0017
Request UUID
In addition, each request submitted to <OpenEndpoints/> is assigned a globally unique UUID in the transaction-log. You can access this id in the parameter-transformation, but it is not available as a data source.
The command to fetch a new auto-increment value is a data-source.
Whenever a transformer has a data-source with that command the auto-increment will be triggered. The type attribute may take the values “perpetual”, “year” or “month”. The numbers are unique within the application.
Formatting
Note that the data source returns a number only. If the request has the incremental number "17" for the current month, the value provided (for type="month") will be 17. In order to get something like 2020-01-0017 you need to build that format with XSLT.
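Building such a format is ordinary XSLT string handling. The following sketch hard-codes the year, month and counter as variables purely for illustration; in a real stylesheet these values would be read from the transformation input, whose exact element names are not shown here.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="text"/>

    <!-- assumed example values; replace with selections from the transformation input -->
    <xsl:variable name="year" select="'2020'"/>
    <xsl:variable name="month" select="'01'"/>
    <xsl:variable name="number" select="17"/>

    <xsl:template match="/">
        <!-- format-number pads the counter with leading zeros to four digits -->
        <xsl:value-of select="concat($year, '-', $month, '-', format-number($number, '0000'))"/>
    </xsl:template>
</xsl:stylesheet>
```

With the values above, the stylesheet produces 2020-01-0017.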
The term “on-demand” refers to the fact the number does not get consumed unless it is requested. If one value (e.g. type=”month” ) is consumed, other values (e.g. type=”year”) are not automatically consumed as well.
The value is only consumed if the endpoints request is successful; if the request is not successful the number is again made available to future requests. The numbers do not have any “holes” or missed-out numbers, so are suitable for use in invoice numbers.
The same endpoint may contain several different transformers, some of which might call the same data-source. For example, the endpoint may include a task to send an email, and the email-body and some attachment both require the same data-source. In this case - the data source is used twice within the same request - they both see the same number. The new incremental id is created per request, not per use of the (same) data-source. However, if you use 2 different data-sources, both calling an auto-increment, then 2 different ids will be created.
The incremented values are stored in the database. Changing the values in the database will affect the next generated number.
There is the possibility of adding instructions to output the input/output of the transformation to AWS S3.
This creates objects in the S3 bucket which have the tags as specified, the correct Content-Type, and in addition a tag called "environment" which is either "preview" or "live".
In contrast to content loaded from the internet, XML already existing within the application does not require to be loaded each time. This makes it a good choice for large files. It is possible to use XML files with more than 100,000 lines of content without causing any performance issues. In general, the limiting factor is the performance of the XSLT transformation rather than the size of the data source.
You can place such files under the xml-from-application directory within the application. You can load the content of this file into the data-source:
You can use placeholders to fill in endpoint parameter values into the file attribute:
If the file cannot be found, an error will be raised. To avoid that, add an attribute ignore-if-not-found="true". In this case the data-source will look as if the content had not been requested at all.
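A sketch combining the file attribute, a placeholder and the ignore-if-not-found attribute could look like this. The file name pattern and the parameter name "language" are assumptions for illustration.

```xml
<data-source>
    <!-- loads xml-from-application/texts-en.xml when ${language} is "en";
         a missing file is silently skipped instead of raising an error -->
    <xml-from-application file="texts-${language}.xml" ignore-if-not-found="true"/>
</data-source>
```

Without ignore-if-not-found="true", a request with an unexpected language value would fail because the file cannot be found.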
The data-source command <xml-from-database> fetches rows from a database and transforms rows and columns into XML.
Currently only MySQL and PostgreSQL are supported. Other databases would require other client JARs that are not provided in the current version of the software.
Alternative option using the environment-variable to connect to your local endpoints database - in this example (sql!) fetching request details from the request-log:
<jdbc-connection-string> specifies how to connect to the database to perform the query. This element is mandatory.
If it is present with no attributes, the body of the tag specifies the JDBC URL. Using a CDATA section is recommended to avoid having to perform XML escaping. Don't forget that username and password values must be URL-escaped.
If it has an attribute from-environment-variable="foo", then the environment variable with that name is read and should contain the JDBC URL. Note that endpoint parameters are NOT expanded in the variable name, to prevent an attacker from gaining access to other environment variables.
<sql> should be self-explanatory :-)
Endpoint parameters are NOT expanded as that would allow SQL injection attacks.
For PostgreSQL, non-string parameters need a cast such as ?::int or ?::uuid, to convert the string supplied by the endpoint parameter into the right type for PostgreSQL.
Zero or more <param> elements, whose bodies supply the values for the "?" placeholders in the <sql> element. Here, endpoint parameters ARE expanded.
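The elements described above might be combined as in the following sketch. The database, table, column and parameter names as well as the credentials are invented; the JDBC URL follows the usual PostgreSQL driver format.

```xml
<data-source>
    <xml-from-database>
        <!-- CDATA avoids XML-escaping the "&"; the password "p@ss" is URL-escaped as p%40ss -->
        <jdbc-connection-string><![CDATA[jdbc:postgresql://localhost:5432/shop?user=app&password=p%40ss]]></jdbc-connection-string>
        <!-- no ${...} placeholders here; the "?" is filled from <param>, with a cast for PostgreSQL -->
        <sql>SELECT name, price FROM product WHERE category_id = ?::int</sql>
        <!-- endpoint parameters are expanded in <param> -->
        <param>${category}</param>
    </xml-from-database>
</data-source>
```

To read the JDBC URL from the environment instead, the connection element would carry no body and use from-environment-variable with the name of the variable.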
Generated output looks like this:
By default the root-tag of the generated output is <xml-from-database>. Use the optional tag attribute to generate any different root-tag:
You can fetch content from any REST API and use it as a data-source in <OpenEndpoints/>.
The data-source command to fetch XML or JSON data from any URL is <xml-from-url>. JSON or HTML returned to this command will automatically be converted into XML.
For JSON to XML conversion:
Any characters which would be illegal in XML (for example an element name starting with a digit) are replaced by _xxxx_ containing their hex Unicode character code.
Note that if any JSON object has a key _content, then a single XML element is created, with the value of that _content key as the text body, and other keys from the JSON object being attributes on the resulting XML element.
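The _content rule can be illustrated with a small invented response. Only the converted fragment is shown; the enclosing elements of the generated document are omitted.

```xml
<!-- JSON received (hypothetical response):
     { "price": { "currency": "EUR", "_content": "12.20" } }
-->
<!-- "_content" becomes the text body, the other key becomes an attribute -->
<price currency="EUR">12.20</price>
```

This convention makes it possible to express JSON data as XML elements that carry both attributes and text.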
<url> is mandatory. That should be no surprise ;-)
<method> can be "POST" or "GET". If omitted "GET" will be used as a default.
Zero or many <get-parameter>, zero or many <request-header> and zero or one <basic-access-authentication> - all optional.
The beauty of <OpenEndpoints/> shows in the handling of the optional request body, which can be JSON or XML. There are several options to build the content of the request body.
The request body is expressed as XML within the <xml-body> tag. Endpoint parameters are expanded.
Uploaded content encoded in base64 can be filled into any tag of the request body. This requires 2 actions:
Add attribute upload-files="true" to <xml-from-url>
Add to any element of your request body attributes upload-field-name="foo" encoding="base64"
The uploaded content will expand into that XML element.
base64 encoded content only
The expansion of uploaded content works for base64 encoded content only!
It is also possible to send generated content within a request body:
Add attribute expand-transformations="true" to <xml-from-url>
Add to any element of your request body attributes xslt-transformation="foo" encoding="base64"
Adding that attribute to the element indicates that the transformation with that name should be executed (for example, generate a PDF file), and the contents of the resulting file should be placed in this tag. The encoding is always base64, no other encodings are supported.
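A request with an XML body that uses both expansions might be sketched as follows. The URL, the body structure, the parameter name and the transformer name "invoice-pdf" are invented for illustration. The exact form of the <method> element is not given in this text, so the name attribute used here is an assumption.

```xml
<data-source>
    <xml-from-url upload-files="true" expand-transformations="true">
        <url>https://api.example.com/documents</url>
        <!-- assumption: the method is given via a name attribute -->
        <method name="POST"/>
        <xml-body>
            <document>
                <!-- endpoint parameters are expanded in the body -->
                <customer>${customer-id}</customer>
                <!-- filled with the base64 content of the upload field "my-upload" -->
                <attachment upload-field-name="my-upload" encoding="base64"/>
                <!-- filled with the base64 output of the transformation "invoice-pdf" -->
                <invoice xslt-transformation="invoice-pdf" encoding="base64"/>
            </document>
        </xml-body>
    </xml-from-url>
</data-source>
```

Both expansions only take effect if the corresponding attribute is set on <xml-from-url>, as described above.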
The request body is generated by XSLT. This leaves maximum flexibility to build different content of the request body depending on endpoint parameter values!
Note that this is a transformation within a transformation. The XSLT takes a <parameters> element as its input document. This XSLT does not have access to the results of any other data sources. The reason is that data sources cannot use data produced by another data source.
The XSLT file is taken from the http-xslt directory.
The transformation-input to apply that XSLT has <parameters> as its root tag.
The optional attributes upload-files="true" and expand-transformations="true" may be present, as above.
The request body is expressed as JSON within the <json-body> tag. Endpoint parameters are expanded.
Endpoint parameters are expanded within the string context of JSON, that is to say that no concern about escaping is necessary.
Options for expanding base64 content from file upload or generated content are not available for JSON.
The request body is generated by XSLT. That requires that the result of the transformation is valid JSON.
Note that this is a transformation within a transformation. The XSLT takes a <parameters> element as its input document, see above "XML Request Body from Transformation" for the format of that block.
By default the root-tag of the generated output is <xml-from-url>. Use the optional tag attribute to generate any different root-tag: