I am screen scraping a web page via GetResponseStream and can't resave a
certain character in the stream:
Stream respStream = wResp.GetResponseStream();
StreamReader reader = new StreamReader(respStream, Encoding.ASCII); // have also tried the UTF encodings
String respHTML = reader.ReadToEnd();
There is a strange character on the web page: an extra-long hyphen, like
the one MS Word inserts when you type two normal hyphens. I read the
character in and try to resave it to another HTML file, but when I view
that file, it contains several non-ASCII characters where the long hyphen
should be.
Any idea how to preserve the strange character? I don't know whether things
are going amiss during the read or during the write of the new file (maybe
the new file only supports ASCII).
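
To separate the read side from the write side, here is a minimal sketch
that round-trips the character without touching the network. It assumes the
page is UTF-8 and that the character is an em dash (U+2014); neither is
confirmed for the real page, whose charset would have to come from the
response (for example HttpWebResponse.CharacterSet) and could just as well
be Windows-1252. The class name, helper names, and temp file name are made
up for illustration.

```csharp
using System;
using System.IO;
using System.Text;

static class EncodingRoundTrip
{
    // Decode a byte stream with the given encoding, as StreamReader would
    // do for the response stream.
    public static string Decode(byte[] bytes, Encoding encoding)
    {
        using (var reader = new StreamReader(new MemoryStream(bytes), encoding))
        {
            return reader.ReadToEnd();
        }
    }

    // Write the text out as UTF-8 and read it back in the same encoding.
    public static string SaveAndReload(string text, string path)
    {
        File.WriteAllText(path, text, Encoding.UTF8);
        return File.ReadAllText(path, Encoding.UTF8);
    }

    static void Main()
    {
        // Stand-in for the scraped page (assumption: UTF-8 with an em dash).
        string original = "<p>before \u2014 after</p>";
        byte[] pageBytes = Encoding.UTF8.GetBytes(original);

        // ASCII cannot represent bytes above 0x7F, so the character is
        // already lost at read time.
        string asAscii = Decode(pageBytes, Encoding.ASCII);
        Console.WriteLine(asAscii == original);

        // Decoding with the matching encoding keeps it intact.
        string asUtf8 = Decode(pageBytes, Encoding.UTF8);
        Console.WriteLine(asUtf8 == original);

        // Writing with an explicit encoding keeps it intact on disk too.
        string path = Path.Combine(Path.GetTempPath(), "resaved.html");
        Console.WriteLine(SaveAndReload(asUtf8, path) == original);
    }
}
```

If the ASCII comparison is the only one that fails, the damage happens
during the read. The saved file would presumably also need a charset
declaration that matches the encoding it was written in, or the browser may
still display it wrongly.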
Any ideas?
Amil