

Field Name Formats

As we have learned, an RDF repository is a graph structure. RDF field names are also defined in a hierarchical graph structure, called an "ontology". This could be an internal Thomson Reuters/Refinitiv ontology or a third-party ontology.

Thus, the "CommonName" field belonging to organizations is actually identified by its full path within the Thomson Reuters ontology:

http://ont.com/mdaas/CommonName

You can't refer to a field by its "terminal" name alone (e.g. "CommonName") because the same name may appear in more than one ontology. Using the name without its path could therefore be ambiguous.

There are several different formats you can use to identify CM-Well fields, each with pros and cons. The following sections describe these formats.

"Prefix" Format

The "prefix" format in CM-Well uses the immediate parent of the field name, from the field's full ontology path. Thus, in the field "http://ont.com/mdaas/CommonName", CommonName is the field name, while "mdaas" is its prefix.

Despite its name, in CM-Well the prefix is added after the field name, in the form fieldname.prefix. For example: "CommonName.mdaas".
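As a rough illustration of the rule above, the following sketch derives the fieldname.prefix form from a full field URI by taking the last two path segments. The function name to_prefix_format is hypothetical and is not part of any CM-Well client library; the sketch also assumes the field name and its prefix are separated by "/", as in the examples in this section.

```python
def to_prefix_format(field_uri: str) -> str:
    """Turn a full field URI into the fieldname.prefix form.

    Hypothetical helper for illustration only; assumes the field name
    is the last "/"-separated segment and the prefix is its parent.
    """
    # Ignore a trailing slash, then split the URI into path segments.
    parts = field_uri.rstrip("/").split("/")
    # We need a non-empty field name and a non-empty immediate parent.
    if len(parts) < 2 or not parts[-1] or not parts[-2]:
        raise ValueError(f"cannot derive a prefix from {field_uri!r}")
    field_name, prefix = parts[-1], parts[-2]
    # In CM-Well the prefix comes after the field name.
    return f"{field_name}.{prefix}"


print(to_prefix_format("http://ont.com/mdaas/CommonName"))  # CommonName.mdaas
```

Note that the result carries only the immediate parent, which is why two namespaces ending in the same segment produce the same prefix-format name.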

Note

For the special metadata fields of type system, content and link, the opposite order (e.g. "system.indexTime" or "content.length") is currently also supported. This order is deprecated and will stop being supported in the future.

You may have noticed that even adding the prefix to the field name does not guarantee uniqueness. If the same field name (with the same immediate parent name) is added to two different namespaces, the name becomes ambiguous and the query will fail.

Although the prefix is more readable than the alternatives, it is less reliable because of the potential for ambiguity (and it may even cease to be supported in the future).

Note

The recommended best practice is to use the URI or hashed format (see below) for field identifiers, rather than the prefix format.

URI Format

The URI format identifies a field by its full URI within its hosting ontology.

For example, the URI of organizationFoundedYear.mdaas is:

http://ont.com/mdaas/organizationFoundedYear

To use the full URI for a given field, surround the predicate with $ symbols. So for example, here is a query that uses the full URI for organizationFoundedYear.mdaas:

<cm-well-host>/permid.org?op=search&qp=$http://ont.com/mdaas/organizationFoundedYear$>2014,type.rdf:Organization&with-data&format=ttl
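The query above can be assembled programmatically. The sketch below wraps a field URI in $ symbols and joins the conditions with commas, reproducing the example query string. The helper names uri_field and search_url are hypothetical, and <cm-well-host> is kept as a placeholder. The string is left unencoded to match the example; depending on your HTTP client, you may need to percent-encode the qp value before sending it.

```python
from typing import Sequence


def uri_field(field_uri: str) -> str:
    """Wrap a full field URI in $ symbols (URI format)."""
    return f"${field_uri}$"


def search_url(host: str, path: str, conditions: Sequence[str],
               fmt: str = "ttl") -> str:
    """Build a search query string like the one shown above.

    Hypothetical helper: only the parameters that appear in the example
    (op, qp, with-data, format) are produced.
    """
    # Multiple qp conditions are separated by commas.
    qp = ",".join(conditions)
    return f"{host}/{path}?op=search&qp={qp}&with-data&format={fmt}"


founded = uri_field("http://ont.com/mdaas/organizationFoundedYear")
url = search_url(
    "<cm-well-host>",          # placeholder host, as in the text
    "permid.org",
    [f"{founded}>2014", "type.rdf:Organization"],
)
print(url)
```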

Hashed Format

Ontology namespaces are encoded by a hash function and stored in a special path in CM-Well: <cm-well-host>/meta/ns. You can use the hash value instead of the unhashed (but potentially ambiguous) prefix value.

For example, /meta/ns contains an entry in which http://www.w3.org/ns/prov is the original namespace, and bE5hMw is its hashed value.

So, for example, we could refer to the Agent entity in this ontology as $http://www.w3.org/ns/prov/Agent$ (the full URI format), but we could also refer to it using the hashed namespace value as a prefix, as follows: Agent.$bE5hMw. The $ character after the period indicates that what follows is a hashed value.
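To make the notation concrete, here is a minimal sketch that converts a full field URI into the hashed format, given a table of namespace hashes. The function to_hashed_format and the ns_hashes mapping are hypothetical; the sketch assumes, as in the example above, that the namespace is everything before the last "/" of the URI.

```python
from typing import Mapping


def to_hashed_format(field_uri: str, ns_hashes: Mapping[str, str]) -> str:
    """Return name.$hash for a field URI, using known namespace hashes.

    Hypothetical helper: ns_hashes maps a namespace URI to the hash
    value stored for it under /meta/ns.
    """
    # Split the URI into namespace and terminal name at the last slash.
    namespace, _, name = field_uri.rpartition("/")
    if not namespace or not name:
        raise ValueError(f"cannot split {field_uri!r} into namespace and name")
    if namespace not in ns_hashes:
        raise KeyError(f"no hash known for namespace {namespace!r}")
    # The $ after the period marks what follows as a hashed value.
    return f"{name}.${ns_hashes[namespace]}"


hashes = {"http://www.w3.org/ns/prov": "bE5hMw"}  # value from the example
print(to_hashed_format("http://www.w3.org/ns/prov/Agent", hashes))  # Agent.$bE5hMw
```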

Note

Using the hashed namespace value is faster than using any other field notation method; therefore if you want to improve query performance, use this format.

To find the hashed value for a certain namespace, you must search for it in CM-Well. So, for instance, to look up the namespace in the example above, you could run the following query:

<cm-well-host>/meta/ns?op=search&qp=url:http://www.w3.org/ns/prov

The query returns the following namespace infoton:

<cm-well-host>/meta/ns/bE5hMw

In your code, you can look up all the required hash values during initialization, and use them for querying at run-time.
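One way to structure that initialization step is sketched below. It builds the lookup query shown above, extracts the hash from the path of the returned namespace infoton, and caches the result per namespace. All function names are hypothetical. Because the response format of the search is not described here, the actual HTTP call and response parsing are left to a caller-supplied function (find_infoton_path), which is stubbed out in the example.

```python
from typing import Callable, Dict, Iterable


def ns_lookup_url(host: str, namespace: str) -> str:
    """Build the /meta/ns search query for a namespace."""
    return f"{host}/meta/ns?op=search&qp=url:{namespace}"


def hash_from_infoton_path(path: str) -> str:
    """Extract the hash from a path such as <host>/meta/ns/bE5hMw."""
    _, sep, tail = path.rstrip("/").rpartition("/meta/ns/")
    # Require the /meta/ns/ marker followed by a single path segment.
    if not sep or not tail or "/" in tail:
        raise ValueError(f"not a namespace infoton path: {path!r}")
    return tail


def build_hash_cache(namespaces: Iterable[str],
                     find_infoton_path: Callable[[str], str]) -> Dict[str, str]:
    """Resolve each namespace once, at start-up, and cache its hash.

    find_infoton_path is assumed to run the lookup query for a namespace
    and return the path of the matching infoton.
    """
    return {ns: hash_from_infoton_path(find_infoton_path(ns))
            for ns in namespaces}


# Stub standing in for a real HTTP lookup (assumption for illustration).
def fake_lookup(namespace: str) -> str:
    return "<cm-well-host>/meta/ns/bE5hMw"


cache = build_hash_cache(["http://www.w3.org/ns/prov"], fake_lookup)
print(cache)
```

At run-time, queries can then read hash values from the cache instead of searching /meta/ns again.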