I thought it’d be a good idea to write this down for my own future reference if nothing else.
The sfMessageSource_XLIFF.class.php file in the symfony core is responsible for storing and retrieving translation text in the XML-formatted XLIFF files. I’ve found a bug in it that was causing trouble when using the sfI18nExtractPlugin.
The problem is that the text is not stored as CDATA. When the file is saved, the HTML tags are escaped to produce valid XML, while any HTML entities are left alone. The next time the XML file is loaded, the object created from it decodes ALL the escaped characters, including the HTML entities.
This causes trouble for the sfI18nExtractPlugin when it compares already stored translation text with the text it has extracted from the pages that render the HTML. The extracted text has entity references for special characters, while the stored translation text contains the special characters themselves, so the two do not match.
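To make the mismatch concrete, here is a minimal sketch that reproduces it outside symfony, using only PHP's DOM and SimpleXML extensions. The message string is a made-up example, and the save and load steps are my simplified stand-ins for what the class does, not the actual symfony code.

```php
<?php
// Hypothetical message: one entity reference and one HTML tag.
$message = 'Fish &amp; chips <b>today</b>';

// Save the text as the element value (no CDATA).
// The angle brackets get escaped, the entity reference is left alone.
$dom = new DOMDocument('1.0', 'UTF-8');
$dom->appendChild($dom->createElement('source', $message));
$xml = $dom->saveXML();

// Load the XML again and cast the element to a string.
// Everything is decoded on the way back in, including the entity.
$stored = (string) simplexml_load_string($xml);

echo $xml;
echo $stored, "\n";

// The stored text no longer equals the text that was extracted.
var_dump($stored === $message);
```

In this sketch the stored string comes back with a bare ampersand while the extracted string still has the entity reference, so a plain string comparison fails.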
Consequently I’ve modified the sfMessageSource_XLIFF.class.php file to store the text as CDATA, which solves the problem. The loading of the XML, which takes place in the same file, does not need modifying: the CDATA content is read and cast to a string, which is correct.
The original code was:
$source = $dom->createElement('source', $message);
and I’ve changed it to:
$cdata = $dom->createCDATASection($message);
$source = $dom->createElement('source');
$source->appendChild($cdata);
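
As a quick check of the CDATA approach, the following sketch does a save and load round trip with the same kind of text. Again the message is a made-up example and the load step (simplexml_load_string plus a string cast) is an assumption standing in for the loading code in the class.

```php
<?php
// Hypothetical message: one entity reference and one HTML tag.
$message = 'Fish &amp; chips <b>today</b>';

// Save the text inside a CDATA section, as in the modified code.
$dom = new DOMDocument('1.0', 'UTF-8');
$cdata = $dom->createCDATASection($message);
$source = $dom->createElement('source');
$source->appendChild($cdata);
$dom->appendChild($source);
$xml = $dom->saveXML();

// Load the XML again and cast the element to a string.
// CDATA content is not decoded, so the text comes back untouched.
$stored = (string) simplexml_load_string($xml);

echo $xml;

// The stored text now matches the extracted text exactly.
var_dump($stored === $message);
```

Because nothing inside the CDATA section is escaped on save or decoded on load, the comparison holds and the plugin should see the stored and extracted text as the same string.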