Updated 2016-05-19 18:42:23 by tdom

I have been trying to use tDOM to download and parse the New York Times RSS feed, but to no avail.

I resorted to first fetching the file with http::geturl, saving it as utf-8, and then parsing it with the [dom parse] command. I used the -channel option and configured the channel for utf-8 as well. But I am still getting errors about weird characters in the data. In my editor, the offending character appears as ^@, and it seems to stand where a quote character should be. The file displays correctly in a browser, where I see single quote characters in those places.
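For reference, the steps described above can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not a confirmed fix: the proc names `fetchBytes` and `feedTitles` are hypothetical, the feed is assumed to really be UTF-8 with the usual RSS 2.0 layout (/rss/channel/item/title), and the temporary file is skipped by asking http::geturl for the raw bytes with -binary 1 and decoding them explicitly.

```tcl
package require http
package require tdom

# Hypothetical helper: fetch a URL and return the body as raw bytes.
# -binary 1 asks the http package to skip any charset conversion.
proc fetchBytes {url} {
    set tok [http::geturl $url -binary 1]
    if {[http::status $tok] ne "ok"} {
        set status [http::status $tok]
        http::cleanup $tok
        error "fetch failed: $status"
    }
    set bytes [http::data $tok]
    http::cleanup $tok
    return $bytes
}

# Hypothetical helper: decode raw bytes and return the item titles.
# Assumption: the bytes are UTF-8 and the document is RSS 2.0.
proc feedTitles {bytes} {
    set xml [encoding convertfrom utf-8 $bytes]
    set doc [dom parse $xml]
    set titles {}
    foreach node [$doc selectNodes /rss/channel/item/title] {
        lappend titles [$node text]
    }
    $doc delete
    return $titles
}

# Usage (needs network access; the URL is the one from this post):
#   set bytes [fetchBytes http://feeds.nytimes.com/nyt/rss/HomePage.xml]
#   foreach t [feedTitles $bytes] { puts $t }
```

Keeping the download binary and decoding in exactly one place makes it easier to see where a character gets damaged. If the titles still come out wrong, the assumption that the feed is UTF-8 is the first thing to check.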

Is this not valid utf-8? What encoding would be safe to use? Is there an option to tell tDOM to ignore such errors, or to replace the offending characters with a space?

Here is the URL if anyone would like to try: http://feeds.nytimes.com/nyt/rss/HomePage.xml