Parsexml

From Eigenvector Research Documentation Wiki
Latest revision as of 10:20, 7 July 2015

Purpose

Convert XML file to a MATLAB structure.

Synopsis

[object,theStruct] = parsexml(filename,nooutertag);

If input filename is omitted, the user will be prompted for a file name to read.

Description

Creates a MATLAB object from an XML file. The format of the file must follow that used by ENCODEXML. Each XML tag will be encoded as a field in a MATLAB structure. The top-level tag will be the single field in the top level of the returned structure, and all sub-tags will be sub-fields. The contents of each field are specified by the 'class' attribute of its tag. When 'class' is omitted, a single-entry (non-array) structure is assumed. Tags with the attribute 'class' will be encoded using the following rules:

  • class="string": Contents encoded as string or padded string array. For a multi-row string, each row should be enclosed in <sr> tags.
    <oneitem class="string">Just One String</oneitem>
    <multirow class="string">
      <sr>Row 1 string</sr>
      <sr>Row 2 string</sr>
      ...
    </multirow>
  • class="numeric"  : Contents of tag must be a comma-delimited list of values with rows delimited by semicolons. Every row must have the same number of entries or an error will result. Multi-way matrices can be encapsulated in <tn mode="i"> tags where i is the mode that the enclosed item expands on (i>=3).
Encoding: Numeric class contents can be encoded as comma-separated values (csv), which is the default, or using base64 encoding. Supply the encoding attribute when the contents use an encoding other than CSV. Options include:
  • encoding = "csv" (default)
  • encoding = "base64"
When base64 encoding is used, the additional attribute precision can be included to specify the precision of the numerical values encoded. Options include:
  • precision="64" for 64-bit double precision values (default)
  • precision="32" for 32-bit single precision values
  • precision="8" for 8-bit unsigned integer values
  • precision="1" for boolean logical values
Example: row vector
    <item class="numeric"> 1,2,3,4 </item> 
    
Example: 2-way matrix
    <item class="numeric"> 11,12,13,14; 21,22,23,24 </item> 
    
Example: 3-way
    <item class="numeric">
      <tn mode="3"> 
        111,112,113,114; 121,122,123,124 
      </tn> 
      <tn mode="3"> 
        211,212,213,214; 221,222,223,224
      </tn> 
    </item> 
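
The comma and semicolon layout used in the examples above mirrors MATLAB's own matrix syntax. The following sketch is an illustration only and is not how parsexml is implemented; it assumes nothing beyond base MATLAB, and the variable names are hypothetical. It shows the arrays that the three example payloads would correspond to, with each <tn mode="3"> block treated as one slab stacked along the third mode.

```matlab
% Illustration only: interpret the example payloads with base MATLAB.
% parsexml's internal parsing may differ; this just shows the shapes.
row = str2num('1,2,3,4');                            % 1x4 row vector
mat = str2num('11,12,13,14; 21,22,23,24');           % 2x4 matrix

% Each <tn mode="3"> block is one 2x4 slab of the 3-way array.
slab1 = str2num('111,112,113,114; 121,122,123,124');
slab2 = str2num('211,212,213,214; 221,222,223,224');
cube  = cat(3, slab1, slab2);                        % stack along mode 3

disp(size(row))
disp(size(mat))
disp(size(cube))
```

Under this reading, the 3-way example would carry size="[2,4,2]".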
    
  • class="cell"  : Contents encoded as Matlab cell. The format of contents is the same as HTML table tags (<tr> for a new row, <td> for a new container/column) with the added tag of <tn mode="i"> to describe a multi-dimensional cell (see class="numeric").
Example: 3-way cell (with strings in each cell)
    <item class="cell"> 
      <tn mode="3"> 
         <tr> <td>slab 1, row 1, col 1</td> <td>slab 1, row 1, col 2</td> </tr>
         <tr> <td>slab 1, row 2, col 1</td> <td>slab 1, row 2, col 2</td> </tr>
      </tn> 
      <tn mode="3"> 
         <tr> <td>slab 2, row 1, col 1</td> <td>slab 2, row 1, col 2</td> </tr>
         <tr> <td>slab 2, row 2, col 1</td> <td>slab 2, row 2, col 2</td> </tr>
      </tn> 
    </item> 
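
For comparison, the 3-way cell example above would be expected to correspond to a 2x2x2 MATLAB cell array along the following lines. This is a sketch built by hand in base MATLAB, not output captured from parsexml, and the variable names are hypothetical.

```matlab
% Each <tn mode="3"> block is one slab; <tr> starts a row, <td> a column.
slab1 = {'slab 1, row 1, col 1', 'slab 1, row 1, col 2'; ...
         'slab 1, row 2, col 1', 'slab 1, row 2, col 2'};
slab2 = {'slab 2, row 1, col 1', 'slab 2, row 1, col 2'; ...
         'slab 2, row 2, col 1', 'slab 2, row 2, col 2'};

c = cat(3, slab1, slab2);   % stack the two slabs along mode 3

disp(size(c))
disp(c{2,1,2})              % row 2, column 1 of the second slab
```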
    
  • class="structure" : Used for structure arrays ONLY. Contents encoded into a structure array use array notation identical to that described for class="cell". If a structure is size [1 1] then it does not need to use array notation and must not be marked with this class attribute. Instead, the contents of the structure should simply be enclosed within the tag as sub-tags.
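  • class="dataset"  : Contents will be interpreted as a DataSet Object. Any tags that do not map to valid DataSet Object fields will be ignored. See the DataSet definition for details on valid fields and ENCODEXML for examples of the DataSet XML format. Also see the simplified "dso" class below.
  • class="dso" : Contents will be interpreted as a DataSet Object using the simplified DataSet object definition (see DataSet XML Format). This format is generally much easier to use to define a DataSet.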
  • NOTE: "Size" attribute: Tags of class "numeric", "cell", or "structure" (structure-array only) should also include the attribute size="[...]", which gives the size of the tag's contents. The size value must be enclosed in square brackets and must be at least two elements long (use [0,0] for empty). For example, <myvalue class="numeric" size="[3,4]"> says that the field myvalue will be numeric with 3 rows and 4 columns. Size can be multi-dimensional as needed (size="[2,4,6,2]" implies that the tag contents will be a 4-dimensional array of the given sizes).
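
As a rough illustration of how the size attribute and the numeric contents fit together, the sketch below formats a small matrix as a class="numeric" tag by hand. It is not ENCODEXML and makes no claim about the exact text ENCODEXML produces; the tag name myvalue is taken from the example above, and the variable names are hypothetical.

```matlab
% Hand-rolled formatting of a 2-way matrix as a class="numeric" tag.
% Integer values are used so that num2str loses no precision.
A = [1 2 3 4; 5 6 7 8; 9 10 11 12];     % 3 rows, 4 columns

% Size attribute: square brackets, comma-separated, at least two elements.
dims    = arrayfun(@num2str, size(A), 'UniformOutput', false);
sizeStr = ['[' strjoin(dims, ',') ']'];

% Contents: commas between values, semicolons between rows.
rows = cell(1, size(A, 1));
for i = 1:size(A, 1)
    vals    = arrayfun(@num2str, A(i, :), 'UniformOutput', false);
    rows{i} = strjoin(vals, ',');
end
body = strjoin(rows, '; ');

tag = sprintf('<myvalue class="numeric" size="%s">%s</myvalue>', sizeStr, body);
disp(tag)
```

To write real files, use ENCODEXML itself, since it defines the format that parsexml expects.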

Input

  • filename = XML filename to convert. If input (filename) is omitted, the user will be prompted for a file name to read.

Optional Input

  • nooutertag = [ {false} | true ] when set to "true" this input indicates that the outer-most xml object should be stripped from the resulting output (object). This allows direct access to the object itself rather than a structure with the object as the first and only field of that structure.

Outputs

  • object = MATLAB object created from the XML file.
  • theStruct = the pre-parsed XML object, which allows access to raw field attributes and other content that cannot be converted into a MATLAB object.
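
A minimal usage sketch follows, assuming parsexml is on the MATLAB path. The file name and the tag names (settings, name, limits) are hypothetical and chosen only for illustration; the field access shown follows the behavior described above and has not been checked against a particular toolbox version.

```matlab
% Write a small XML file to a temporary location (hypothetical content).
fname = fullfile(tempdir, 'parsexml_demo.xml');
fid = fopen(fname, 'w');
fprintf(fid, '<settings>\n');
fprintf(fid, '  <name class="string">demo</name>\n');
fprintf(fid, '  <limits class="numeric" size="[1,2]">0,10</limits>\n');
fprintf(fid, '</settings>\n');
fclose(fid);

% Default call: the top-level tag should be the single field of the
% returned structure; raw is the pre-parsed XML object.
[obj, raw] = parsexml(fname);
disp(obj.settings.limits)

% With nooutertag = true the outer <settings> level should be stripped,
% giving direct access to the sub-fields.
inner = parsexml(fname, true);
disp(inner.limits)
```

The second output is mainly useful when a tag carries attributes or content that cannot be converted into a MATLAB object.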

See Also

autoimport, encodexml, textreadr, xclreadr