Version 1.0
Optical Spectroscopy Markup Language
This is a preliminary version, subject to change.
Introduction
A rapid review of a representative set of publications that include analysis of optical
data shows the wide use of physical models for retrieving intrinsic optical properties.
This type of treatment is often preferred to inversion methods such as Kramers-Kronig relations
or phase retrieval procedures because of the extra information it provides on the microscopic
mechanisms of absorption. This supplementary information is often essential, for example, in
explaining the occurrence of phase transitions in materials.
Although this type of treatment has led to very important results in materials science,
some drawbacks persist. An important observation is the incompleteness of published results:
parameter values are often only partially given or are missing altogether. This situation not
only precludes rigorous comparison of results obtained by different groups but also prevents
future use of the optical properties. Another critical point is the variability and complexity
of the models used to fit the different kinds of experimental spectra.
Such models, which are difficult to reproduce with standard commercial software, are generally implemented
in home-made applications and stored in non-standard file formats, which is a serious limitation
for data exchange.
The above observations show that it is important for the spectroscopic community to have
a common language that is both human readable and understandable by software applications.
This language should be versatile enough to reproduce complex mathematical expressions
and simple enough to facilitate its widespread adoption.
Standard formats for experimental data exchange between instruments and software applications already exist.
A widely used example is the JCAMP-DX format (named after the Joint Committee on Atomic and
Molecular Physical Data - Data eXchange). This format is a set of industry-wide standard protocols
for transfer of spectroscopic data sets. It is sponsored by the International Union of Pure and
Applied Chemistry (IUPAC) and other international scientific unions. Information on JCAMP-DX is
available on the web at http://www.jcamp.org.
Recently, James Duckworth of Thermo Galactic proposed the Generalized Analytical Markup Language
(GAML), an XML-based file format for storing analytical instrument data. A description of
this new public format is provided at http://www.gaml.org.
These two formats represent a valuable effort towards the standardization of experimental
data exchange and are an encouragement to do the same for disseminating spectroscopic models and
mathematical representations of optical properties.
Currently there are two main markup languages specialized in the representation of mathematical
expressions: OpenMath and MathML. Extensive information on these two languages can be
found at
http://www.nag.co.uk/projects/OpenMath.html and at
http://www.w3c.org/TR/REC-MathML.
These two languages are in principle able to meet our needs in terms of mathematical
construction, but in practice they are too general, and it is more convenient to construct a new
one that better reflects the specific needs of the spectroscopic community. The advantage of a
new language is the possibility of benefiting from existing experience in mathematical construction
while adding new elements that take the particularities of the domain into account. For example,
ensuring document integrity is one of the constraints that the new language must satisfy.
The intention of a secure element is to allow applications to detect critical changes to the
content of a file that were made outside the control of the system that generated it.
Such a protection scheme is necessary to secure the content of spectroscopic model documents, function
libraries and, more particularly, optical libraries, which are files containing mathematical
representations of the optical properties of materials.
The Optical Spectroscopy Markup Language (OSML) presented below has been written to meet these
needs. OSML is an XML-based format and thus benefits from the power of XML in representing structured
documents in a standardized and application-independent way.
The format was designed with the following requirements in mind:
- Simplicity, to facilitate widespread adoption.
- Security, to guarantee the integrity of data contained in OSML documents.
- A core library containing elementary and spectroscopic functions.
- Easy extensibility, with mechanisms for the declaration of new functions.
- An encoding that reflects the common usage of spectroscopic models.
- A single file format to store spectroscopic models, function libraries and optical property libraries.
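To make the security requirement more concrete, the following Python sketch shows one way an application could seal and verify a document with a SHA-256 digest stored in a secure element. This text does not state which bytes are hashed, so the convention used here (hash the serialized document with any secure element removed) is purely an assumption, and the helper names `digest_without_secure`, `seal` and `is_intact` are hypothetical.

```python
import hashlib
import xml.etree.ElementTree as ET


def digest_without_secure(root):
    """Return the SHA-256 hex digest of the document, ignoring <secure>.

    Assumption: the digest covers the serialized document with every
    top-level <secure> element removed. OSML may define this differently.
    """
    clone = ET.fromstring(ET.tostring(root))  # work on a copy
    for node in clone.findall("secure"):
        clone.remove(node)
    payload = ET.tostring(clone, encoding="utf-8")
    return hashlib.sha256(payload).hexdigest().upper()


def seal(root):
    """Insert a <secure> element holding the digest as the first child."""
    secure = ET.Element("secure", {"algorithm": "SHA-256"})
    secure.text = digest_without_secure(root)
    root.insert(0, secure)
    return root


def is_intact(root):
    """Check the stored digest against a freshly computed one."""
    secure = root.find("secure")
    if secure is None or secure.text is None:
        return False
    # The stored digest may be grouped with spaces or line breaks.
    stored = "".join(secure.text.split()).upper()
    return stored == digest_without_secure(root)
```

With these helpers, an application would call `seal` when it writes a file and `is_intact` when it reads one back; any change made to the content in between causes the check to fail. A real implementation would also need a canonical serialization so that harmless whitespace changes do not alter the digest.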
Spectroscopic modelling
A typical use of spectroscopic models is to retrieve the optical properties of a material
from one or more experimental spectra. The procedure consists first of choosing a set of
mathematical expressions that correspond to the physical definitions of the measured quantities
(reflectance, transmittance, emittance, ...) and constructing a spectroscopic model able to reproduce
the intrinsic optical property of the material on which the physical definitions
depend. At this point, a first important structure comes into view. The physical definitions,
the spectroscopic model and other derived quantities constitute a set of interconnected mathematical
expressions that represents the essence of the modelling problem.
To capture the semantics of this global structure, OSML introduces three elements: a math
element, which is a container for definition elements, and a link element that allows
interconnections between named definitions. This type of structure ensures maximum flexibility
in the representation of the modelling problem, optimises storage space by avoiding multiple
redefinitions and speeds up computation (linking advantage).
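As an illustration of how an application might exploit links between named definitions, the Python sketch below collects, for each definition, the names it links to and derives an order in which every definition can be computed once before it is needed. The definition names `shifted` and `result` and the helper names are hypothetical; `graphlib` requires Python 3.9 or later.

```python
import xml.etree.ElementTree as ET
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical modelling block: result = log(shifted), shifted = 1 + x.
OSML = """
<math>
  <definition name="x"><number> 2.0 </number></definition>
  <definition name="shifted">
    <apply>
      <function name="plus" source="core"/>
      <number> 1 </number>
      <link> x </link>
    </apply>
  </definition>
  <definition name="result">
    <apply>
      <function name="log" source="core"/>
      <link> shifted </link>
    </apply>
  </definition>
</math>
"""


def dependencies(math_element):
    """Map each definition name to the set of names it links to."""
    deps = {}
    for definition in math_element.findall("definition"):
        links = {link.text.strip() for link in definition.iter("link")}
        deps[definition.get("name")] = links
    return deps


def evaluation_order(math_element):
    """Return the definition names with every dependency listed first."""
    sorter = TopologicalSorter(dependencies(math_element))
    return list(sorter.static_order())
```

Evaluating definitions in this order and caching each value means that a definition linked from several places is computed only once. `TopologicalSorter` also raises `CycleError` when definitions link to each other circularly, which gives a simple validity check.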
Besides these elements, OSML has another small set of elements (apply, function,
constant, number, sequence, element) for constructing the local
mathematical expressions of the definition elements. For example, the OSML version of
log(1 + x) is the following:
<apply>
  <function name="log" source="core"/>
  <apply>
    <function name="plus" source="core"/>
    <number> 1 </number>
    <link> x </link>
  </apply>
</apply>
The log and plus tokens in the name attribute of the function
elements represent, respectively, the base-10 logarithm function and the plus operator.
The source attribute indicates that both tokens are part of the core function
library. As one can see, the representation of a function is very versatile because a
user-defined function is expressed in the same way as a function of the core library; the only
information needed is the name of the function and that of the source library. This is also true for the
constant and the sequence elements.
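The way such an expression tree can be processed is sketched below in Python: a small recursive evaluator for the log(1 + x) example. The `CORE` table is a hypothetical two-entry stand-in for the core library, mapping `log` to the base-10 logarithm as described above, and the `bindings` dictionary that supplies values for links is also an assumption of this sketch.

```python
import math
import operator
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the core library (only the two tokens used here).
CORE = {"log": math.log10, "plus": operator.add}

EXPRESSION = """
<apply>
  <function name="log" source="core"/>
  <apply>
    <function name="plus" source="core"/>
    <number> 1 </number>
    <link> x </link>
  </apply>
</apply>
"""


def evaluate(node, bindings):
    """Recursively evaluate a number, link or apply element."""
    if node.tag == "number":
        return float(node.text)
    if node.tag == "link":
        return bindings[node.text.strip()]
    if node.tag == "apply":
        head, *arguments = list(node)
        if head.tag != "function" or head.get("source") != "core":
            raise ValueError("this sketch only handles core functions")
        func = CORE[head.get("name")]
        return func(*(evaluate(arg, bindings) for arg in arguments))
    raise ValueError(f"unsupported element: {node.tag}")


value = evaluate(ET.fromstring(EXPRESSION), {"x": 9.0})
```

Supporting a user library would only require looking the function name up in a different table according to the source attribute; the tree walk itself stays the same.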
The small number of elements presented so far are all we need to encode the complex expressions
of a spectroscopic modelling problem. With this knowledge and the experimental data, solving the problem
reduces to adjusting the model parameters, for example by using a least-squares optimisation method.
Library definition
Now we can tackle the construction of a user library. A library is an ordinary OSML
document that contains only declarations of functions. These functions are often intended to be the
building blocks of more complex expressions. To deal with library construction and function
declaration, OSML introduces three additional elements. The semantics element is a container
for the symbol element, which contains a function declaration. The third element is the
argument element, which defines the interface (arguments) of a function.
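As a rough sketch of how an application might read such declarations, the Python code below extracts the interface of each declared function from a library document and binds positional call values to the argument names. Treating the number inside each argument element as a default value is an assumption of this sketch, and the helper names `read_interfaces` and `bind` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Minimal library fragment using the semantics, symbol and argument elements.
LIBRARY = """
<osml version="1.0">
  <semantics>
    <symbol name="gaussian" type="function">
      <argument name="x"><number> 0.0 </number></argument>
      <argument name="A"><number> 1.0 </number></argument>
      <argument name="Xc"><number> 100.0 </number></argument>
      <argument name="w"><number> 1.0 </number></argument>
    </symbol>
  </semantics>
</osml>
"""


def read_interfaces(root):
    """Map each declared function to its ordered {argument: value} table."""
    interfaces = {}
    for symbol in root.iterfind("semantics/symbol"):
        table = {}
        for argument in symbol.findall("argument"):
            number = argument.find("number")
            # Assumption: the number is the argument's default value.
            table[argument.get("name")] = (
                float(number.text) if number is not None else None
            )
        interfaces[symbol.get("name")] = table
    return interfaces


def bind(interface, *values):
    """Bind positional values to argument names, keeping the remaining defaults."""
    if len(values) > len(interface):
        raise TypeError("too many arguments")
    bound = dict(interface)
    for name, value in zip(interface, values):
        bound[name] = value
    return bound


interfaces = read_interfaces(ET.fromstring(LIBRARY))
call = bind(interfaces["gaussian"], 5.0, 2.0)
```

The resulting dictionary can serve as the table of link values when the definition body of the function is evaluated, since the body refers to its arguments through link elements.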
Annotations
The annotation element is the last element of the OSML format. Its purpose is to store
meta-data in documents, such as documentation or application information. It has three attributes
that make it possible to structure this extra information.
OSML Document
A schematic example of an OSML document containing all possible types of content is given below.
This file is typical of a spectroscopic modelling problem document that contains embedded declarations
of functions. Function and optical library files do not contain the last block, the modelling
problem block.
OSML document
<?xml version="1.0" encoding="UTF-8"?>
<!-- xml-annotation : some useful information on the document-->
<osml version="1.0">
Optional integrity block
<secure algorithm="SHA-256">
84983E44 1C3BD26E BAAE4AA1 F95129E5 E54670F1 A65B2D53 7FC234BA 3DAAA45C
</secure>
Optional meta-data block
<annotation> A brief description of the content </annotation>
<annotation content="appInfo" name="units"> WAVENUMBER </annotation>
.
.
Optional declaration block
<semantics>
  <symbol name="gaussian" type="function">
    <annotation>
      Gaussian function : gaussian(x, A, Xc, w) = A exp(-4 ln2 ((x-Xc)/w)^2)
    </annotation>
    <argument name="x">
      <annotation> Variable </annotation>
      <number> 0.0 </number>
    </argument>
    <argument name="A">
      <annotation> Amplitude </annotation>
      <number> 1.0 </number>
    </argument>
    <argument name="Xc">
      <annotation> Center </annotation>
      <number> 100.0 </number>
    </argument>
    <argument name="w">
      <annotation> Full Width at Half Height </annotation>
      <number> 1.0 </number>
    </argument>
    <definition name="gaussian">
      <apply>
        <function name="times"/>
        <link> A </link>
        <apply>
          <function name="exp"/>
          <apply>
            <function name="product"/>
            <number> -4.0 </number>
            <constant name="ln2"/>
            <apply>
              <function name="sqr"/>
              <apply>
                <function name="divide"/>
                <apply>
                  <function name="minus"/>
                  <link> x </link>
                  <link> Xc </link>
                </apply>
                <link> w </link>
              </apply>
            </apply>
          </apply>
        </apply>
      </apply>
    </definition>
  </symbol>
  .
  .
</semantics>
Optional modelling problem block
<math>
  <definition name="x">
  </definition>
  <definition name="Amplitude">
    <number> 1.0 </number>
  </definition>
  <definition name="gauss-lorentz">
    <apply>
      <function name="sum"/>
      <apply>
        <function name="lorentzian"/>
        <link> x </link>
        <link> Amplitude </link>
        <number> 100.0 </number>
        <constant name="pi"/>
      </apply>
      <apply>
        <function name="gaussian" source="document"/>
        <link> x </link>
        <link> Amplitude </link>
        <number> 200.0 </number>
        <number> 20.0 </number>
      </apply>
    </apply>
  </definition>
  .
  .
</math>
</osml>