Customizing the configuration file

You can edit the xmlsimp.xml configuration file to suit your requirements. For example, if you process Latin-1 characters, you might add the eacute (é) character to your Exclude list as shown:

<character_restrictions name="latin1">
    <elements>
      <character_restriction exclude="&apos;" reference_string="&amp;apos;"/>
      <character_restriction exclude="&apos;" reference_string="&amp;#39;"/>
      <character_restriction exclude="&quot;" reference_string="&amp;quot;"/>
      <character_restriction exclude="&quot;" reference_string="&amp;#34;"/>
      <character_restriction exclude="&gt;" reference_string="&amp;gt;"/>
      <character_restriction exclude="&gt;" reference_string="&amp;#62;"/>
      <character_restriction exclude="&lt;" reference_string="&amp;lt;"/>
      <character_restriction exclude="&lt;" reference_string="&amp;#60;"/>
      <character_restriction exclude="é" reference_string="&amp;eacute;"/>
      <character_restriction exclude="é" reference_string="&amp;#233;"/>
      <character_restriction exclude="é" reference_string="&amp;#xE9;"/>
      <character_restriction exclude="é" reference_string="é"/>
      <character_restriction exclude="&amp;" reference_string="&amp;amp;"/>
      <character_restriction exclude="&amp;" reference_string="&amp;#38;"/>
      <character_restriction exclude="&apos;" reference_string="&apos;"/>
      <character_restriction exclude="&quot;" reference_string="&quot;"/>
      <character_restriction exclude="&gt;" reference_string="&gt;"/>
    </elements>
    <attributes>
      <character_restriction exclude="&apos;" reference_string="&amp;apos;"/>
      <character_restriction exclude="&apos;" reference_string="&amp;#39;"/>
      <character_restriction exclude="&quot;" reference_string="&amp;quot;"/>
      <character_restriction exclude="&quot;" reference_string="&amp;#34;"/>
      <character_restriction exclude="&gt;" reference_string="&amp;gt;"/>
      <character_restriction exclude="&gt;" reference_string="&amp;#62;"/>
      <character_restriction exclude="&lt;" reference_string="&amp;lt;"/>
      <character_restriction exclude="&lt;" reference_string="&amp;#60;"/>
      <character_restriction exclude="&amp;" reference_string="&amp;amp;"/>
      <character_restriction exclude="&amp;" reference_string="&amp;#38;"/>
    </attributes>
</character_restrictions>

You must specify the correct encoding in the XML prolog of the configuration file. For example, if the character é is stored in the configuration file as the Latin-1 (ISO-8859-1) byte value 0xE9, the XML prolog must declare the ISO-8859-1 encoding instead of relying on the default, UTF-8. A lone 0xE9 byte is not a valid UTF-8 sequence, so a file that is saved as Latin-1 but read as UTF-8 cannot be parsed correctly.
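
As a minimal sketch, a configuration file that is saved as ISO-8859-1 might begin with a prolog like the following. The version value is an assumption about a typical file, and the rest of the file is represented only by a comment:

```xml
<?xml version="1.0" encoding="ISO-8859-1"?>
<!-- The remainder of xmlsimp.xml follows, saved on disk as ISO-8859-1,
     so that é is stored as the single byte 0xE9. -->
```

If the file is saved as UTF-8 instead, é is stored as the two bytes 0xC3 0xA9, and the prolog can declare UTF-8 or omit the encoding. In either case, the declared encoding must match the encoding in which the file is actually saved.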