Version 1.0
Optical Spectroscopy Markup Language

 About...

This is a preliminary version and is subject to change.

Introduction

A quick survey of a representative set of publications that analyse optical data shows that physical models are widely used to retrieve intrinsic optical properties. This approach is often preferred to inversion methods such as the Kramers-Kronig relations or phase retrieval procedures because of the extra information it provides on the microscopic mechanisms of absorption. This additional information is often essential, for example to explain the occurrence of phase transitions in materials.

Although this approach has led to very important results in materials science, some drawbacks persist. One is the incompleteness of published results: parameter values are often only partially given, or are missing altogether. This not only precludes rigorous comparison of results obtained by different groups but also prevents future use of the optical properties. Another critical point is the variability and complexity of the models used to fit the different kinds of experimental spectra.
Such models, which are difficult to reproduce with standard commercial software, are generally implemented in home-made applications and stored in non-standard file formats, which is a serious limitation for data exchange.

These observations show that the spectroscopic community needs a common language that is both human-readable and understandable by software applications. The language should be versatile enough to reproduce complex mathematical expressions and simple enough to encourage its widespread adoption.

Standard formats for exchanging experimental data between instruments and software applications already exist. One widely used format is JCAMP-DX (named after the Joint Committee on Atomic and Molecular Physical Data - Data eXchange). This format is a set of industry-wide standard protocols for the transfer of spectroscopic data sets. It is sponsored by the International Union of Pure and Applied Chemistry (IUPAC) and other international scientific unions. Information on JCAMP-DX is available on the web at http://www.jcamp.org. Recently, James Duckworth of Thermo Galactic proposed the Generalized Analytical Markup Language (GAML), an XML-based file format for storing analytical instrument data. A description of this new public format is provided at http://www.gaml.org. These two formats represent a valuable effort towards the standardisation of experimental data exchange and encourage a similar effort for the dissemination of spectroscopic models and mathematical representations of optical properties.

At present there are two main markup languages specialised in the representation of mathematical expressions: OpenMath and MathML. Extensive information on these two languages can be found at http://www.nag.co.uk/projects/OpenMath.html and at http://www.w3c.org/TR/REC-MathML. These two languages can in principle meet our needs in terms of mathematical construction, but in practice they are too general, and it is more convenient to construct a new language that better reflects the specific needs of the spectroscopic community. A new language can benefit from the existing experience in mathematical construction while adding new elements that account for the particularities of the domain. For example, ensuring document integrity is one of the constraints that the new language must satisfy. The purpose of a secure element is to allow applications to detect critical changes to the content of a file that were made outside the control of the system that generated it. Such a protection scheme is necessary to secure the content of spectroscopic model documents, function libraries and, more particularly, optical libraries, which are files containing mathematical representations of the optical properties of materials.
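The integrity check described above can be illustrated with a minimal Python sketch. It assumes a SHA-256 digest written as eight groups of eight hexadecimal digits, as in the schematic document at the end of this page. Which bytes of an OSML file are covered by the digest is not stated here, so the `content` value and the helper names `format_digest` and `is_unchanged` are hypothetical.

```python
import hashlib
import re


def format_digest(data: bytes) -> str:
    """Return the SHA-256 digest of data as eight groups of eight uppercase hex digits."""
    hex_digest = hashlib.sha256(data).hexdigest().upper()
    return " ".join(hex_digest[i:i + 8] for i in range(0, len(hex_digest), 8))


def is_unchanged(protected: bytes, stored_digest: str) -> bool:
    """Compare a stored digest with a freshly computed one, ignoring whitespace and case."""
    def normalise(text: str) -> str:
        return re.sub(r"\s+", "", text).upper()

    return normalise(format_digest(protected)) == normalise(stored_digest)


# Hypothetical protected content; the real scope of the digest is an assumption.
content = b"<math><definition name='x'/></math>"
digest = format_digest(content)

print(digest)
print(is_unchanged(content, digest))         # digest matches the content
print(is_unchanged(content + b" ", digest))  # any modification changes the digest
```

An application reading a file would recompute the digest in the same way and refuse, or at least flag, a document whose stored value no longer matches.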

The Optical Spectroscopy Markup Language (OSML) presented below was written to meet these needs. OSML is an XML-based format and thus benefits from the power of XML to represent structured documents in a standardised and application-independent way.
The format was designed with the following requirements in mind:
- Simplicity, to encourage widespread adoption.
- Security, to guarantee the integrity of the data contained in OSML documents.
- A core library containing elementary and spectroscopic functions.
- Easy extensibility, with mechanisms for declaring new functions.
- An encoding that reflects the common usage of spectroscopic models.
- A single file format to store spectroscopic models, function libraries and optical property libraries.


Spectroscopic modelling

A typical use of spectroscopic models is to retrieve the optical properties of a material from one or more experimental spectra. The first step is to choose a set of mathematical expressions corresponding to the physical definitions of the measured quantities (reflectance, transmittance, emittance, ...) and to construct a spectroscopic model able to reproduce the intrinsic optical property of the material on which these physical definitions depend. At this point, a first important structure comes into view: the physical definitions, the spectroscopic model and other derived quantities constitute a set of interconnected mathematical expressions that represents the essence of the modelling problem.



To capture the semantics of this global structure, OSML introduces three elements: a math element, which is a container for definition elements, and a link element, which allows interconnections between named definitions. This structure gives maximum flexibility in representing the modelling problem, saves storage space by avoiding repeated definitions, and speeds up computation (the advantage of linking).
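As a rough illustration of how link elements connect named definitions, the following Python sketch lists, for each definition in a math block, the definitions it refers to. The sample block and the helper name `dependencies` are hypothetical; the apply, function and number elements used inside the definitions are the expression elements of the format.

```python
import xml.etree.ElementTree as ET

# Hypothetical math block: y depends on x, and z depends on x and y.
MATH = """
<math>
  <definition name="x"><number> 2.0 </number></definition>
  <definition name="y">
    <apply>
      <function name="plus" source="core"/>
      <link> x </link>
      <number> 1 </number>
    </apply>
  </definition>
  <definition name="z">
    <apply>
      <function name="plus" source="core"/>
      <link> x </link>
      <link> y </link>
    </apply>
  </definition>
</math>
"""


def dependencies(math_xml: str) -> dict:
    """Map each definition name to the sorted names of the definitions it links to."""
    root = ET.fromstring(math_xml)
    deps = {}
    for definition in root.findall("definition"):
        links = {link.text.strip() for link in definition.iter("link")}
        deps[definition.get("name")] = sorted(links)
    return deps


deps = dependencies(MATH)
print(deps)
```

Because x is defined once and only linked to elsewhere, changing its value affects every expression that depends on it without any redefinition.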

Besides these elements, OSML has another small set of elements (apply, function, constant, number, sequence and element) for constructing the local mathematical expressions inside definition elements. For example, the OSML version of log(1 + x) is the following:

 <apply>
   <function name="log" source="core"/>
   <apply>
     <function name="plus" source="core"/>
     <number> 1 </number>
     <link> x </link>
   </apply>
 </apply>


The log and plus tokens in the name attribute of the function elements represent, respectively, the base-10 logarithm function and the plus operator. The source attribute indicates that both tokens are part of the core function library. The representation of a function is very versatile because a user-defined function is expressed in the same way as a function of the core library; the only information needed is the name of the function and that of its source library. The same is true for the constant and sequence elements.
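To make the prefix structure of the apply element concrete, here is a minimal Python sketch of an evaluator for the log(1 + x) example. It is not an OSML implementation: the `CORE` table is a hypothetical stand-in for the core library that covers only the two tokens used here, and the value of the linked definition x is passed in directly.

```python
import math
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the core library, limited to the tokens of the example.
CORE = {
    "log": math.log10,                # base-10 logarithm, as stated in the text
    "plus": lambda *args: sum(args),  # plus operator
}

EXPR = """
<apply>
  <function name="log" source="core"/>
  <apply>
    <function name="plus" source="core"/>
    <number> 1 </number>
    <link> x </link>
  </apply>
</apply>
"""


def evaluate(node, values):
    """Recursively evaluate a number, link or apply element."""
    if node.tag == "number":
        return float(node.text)
    if node.tag == "link":
        return values[node.text.strip()]
    if node.tag == "apply":
        head, *args = list(node)  # first child is the function, the rest are arguments
        if head.tag != "function" or head.get("source") != "core":
            raise ValueError("this sketch only handles core functions")
        return CORE[head.get("name")](*(evaluate(arg, values) for arg in args))
    raise ValueError(f"unsupported element: {node.tag}")


result = evaluate(ET.fromstring(EXPR), {"x": 9.0})
print(result)  # log10(1 + 9)
```

A user-defined function would be handled the same way, with the lookup table chosen according to the source attribute.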

The small number of elements presented so far is all that is needed to encode the complex expressions of a spectroscopic modelling problem. With this encoding and the experimental data, solving the problem reduces to adjusting the model parameters, for example with a least-squares optimisation method.
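The adjustment step can be sketched in a few lines of Python for the simplest possible case: a single Gaussian band whose centre and width are fixed, so that only the amplitude is fitted. The line shape is the Gaussian given in the schematic document of this page; the synthetic data and the function names are assumptions made for illustration, and a real application would use a general non-linear optimiser.

```python
import math


def gaussian(x, amplitude, center, width):
    """Gaussian band; width is the full width at half height."""
    return amplitude * math.exp(-4.0 * math.log(2.0) * ((x - center) / width) ** 2)


def fit_amplitude(xs, ys, center, width):
    """Least-squares amplitude for a fixed centre and width.

    Minimising sum((y - A * g(x))**2) over A gives A = sum(y * g) / sum(g * g).
    """
    shape = [gaussian(x, 1.0, center, width) for x in xs]
    return sum(y * g for y, g in zip(ys, shape)) / sum(g * g for g in shape)


# Synthetic, noise-free "spectrum" generated with amplitude 2.5 (illustrative data).
xs = [180.0 + i for i in range(41)]
ys = [gaussian(x, 2.5, 200.0, 20.0) for x in xs]

amplitude = fit_amplitude(xs, ys, 200.0, 20.0)
print(round(amplitude, 6))
```

With noisy data the same formula returns the amplitude that minimises the sum of squared residuals rather than the exact generating value.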


Library definition

We can now turn to the construction of a user library. A library is an ordinary OSML document that contains only function declarations. These functions are often intended as the building blocks of more complex expressions. To handle library construction and function declaration, OSML introduces three additional elements. The semantics element is a container for symbol elements, each of which contains a function declaration. The third element, argument, defines the interface (the arguments) of a function.
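As an illustration of how an application might read a function interface from a library, the Python sketch below collects the arguments declared for each symbol. The fragment is a shortened version of the gaussian declaration shown in the schematic document of this page. Treating the number inside each argument as a default value is an assumption, and `interfaces` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Shortened declaration block: annotations and the definition body are omitted.
LIBRARY = """
<semantics>
  <symbol name="gaussian" type="function">
    <argument name="x"><number> 0.0 </number></argument>
    <argument name="A"><number> 1.0 </number></argument>
    <argument name="Xc"><number> 100.0 </number></argument>
    <argument name="w"><number> 1.0 </number></argument>
  </symbol>
</semantics>
"""


def interfaces(semantics_xml: str) -> dict:
    """Map each declared symbol to its ordered list of (argument name, value) pairs."""
    root = ET.fromstring(semantics_xml)
    table = {}
    for symbol in root.findall("symbol"):
        table[symbol.get("name")] = [
            (arg.get("name"), float(arg.find("number").text))
            for arg in symbol.findall("argument")
        ]
    return table


table = interfaces(LIBRARY)
print(table)
```

The order of the argument elements matters, since an apply element passes its arguments by position.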


Annotations

The annotation element is the last element of the OSML format. Its purpose is to store metadata in documents, such as documentation or application information. It has three attributes that allow this extra information to be structured.


OSML Document

A schematic example of an OSML document containing all possible types of content is given below. This file is typical of a spectroscopic modelling problem document that contains embedded function declarations. Function and optical library files do not contain the last block, the modelling problem block.

OSML document
<?xml version="1.0" encoding="UTF-8"?>
<!-- xml-annotation : some useful information on the document-->
<osml version="1.0">
Optional integrity block
<secure algorithm="SHA-256">
  84983E44 1C3BD26E BAAE4AA1 F95129E5 E54670F1 A65B2D53 7FC234BA 3DAAA45C
</secure>

Optional meta-data block
<annotation> A brief description of the content </annotation>
<annotation content="appInfo" name="units"> WAVENUMBER </annotation>
  .
  .

Optional declaration block
<semantics>
  <symbol name="gaussian" type="function">
    <annotation>
      Gaussian function : gaussian(x, A, Xc, w) = A exp(-4 ln2 ((x-Xc)/w)^2)
    </annotation>
    <argument name="x">
      <annotation> Variable </annotation>
      <number> 0.0 </number>
    </argument>
    <argument name="A">
      <annotation> Amplitude </annotation>
      <number> 1.0 </number>
    </argument>
    <argument name="Xc">
      <annotation> Center </annotation>
      <number> 100.0 </number>
    </argument>
    <argument name="w">
      <annotation> Full Width at Half Height</annotation>
      <number> 1.0 </number>
    </argument>
    <definition name="gaussian">
      <apply>
        <function name="times"/>
        <link> A </link>
        <apply>
          <function name="exp"/>
          <apply>
            <function name="product"/>
            <number> -4.0 </number>
            <constant name="ln2"/>
            <apply>
              <function name="sqr"/>
              <apply>
                <function name="divide"/>
                <apply>
                  <function name="minus"/>
                  <link> x </link>
                  <link> Xc </link>
                </apply>
                <link> w </link>
              </apply>
            </apply>
          </apply>
        </apply>
      </apply>
    </definition>
  </symbol>
  .
  .
</semantics>

Optional modelling problem block
<math>
  <definition name="x">
  </definition>
  <definition name="Amplitude">
    <number> 1.0 </number>
  </definition>
  <definition name="gauss-lorentz">
    <apply>
      <function name="sum"/>
      <apply>
        <function name="lorentzian"/>
        <link> x </link>
        <link> Amplitude </link>
        <number> 100.0 </number>
        <constant name="pi"/>
      </apply>
      <apply>
        <function name="gaussian" source="document"/>
        <link> x </link>
        <link> Amplitude </link>
        <number> 200.0 </number>
        <number> 20.0 </number>
      </apply>
    </apply>
  </definition>
  .
  .
</math>
</osml>