> I'm doing something like:
>
> File.open("target","w") do |target|
>   File.open("source","r") do |source|
>     source.each_line do |line|
>       ... some processing ...
>       target.write(line)
>     end
>   end
> end
Have you looked at 'iconv' in the standard library?
http://www.ruby-doc.org/stdlib/libdo...ses/Iconv.html
Assuming all your input files are ISO-8859-1 and you want your output file in UTF-8, your example might look something like this (untested):
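require 'iconv'

File.open("target","w") do |target|
  # Iconv.open takes (to, from) and closes the converter when the block ends
  Iconv.open('UTF-8', 'ISO-8859-1') do |converter|
    File.open("source","r") do |source|
      source.each_line do |line|
        # ... some processing ...
        target.write( converter.iconv(line) )
      end
    end
    # passing nil flushes any pending state left in the converter
    target << converter.iconv(nil)
  end
end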
Iconv should deal with BOMs, stripping them out or adding them in where necessary. I'm not sure whether it will complain if it finds a BOM mid-stream (as you open your second and subsequent input files). If it does, you could just instantiate a new Iconv for each input.
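For what it's worth, here is a minimal sketch of the "fresh conversion per input" idea. Note that it does not use Iconv: it swaps in the encoding support built into Ruby 1.9 and later (an encoding in the File.open mode string, plus String#encode), partly because Iconv was removed from the standard library in Ruby 2.0. The method name concat_as_utf8 and its parameters are made up for illustration, and it assumes every input really is ISO-8859-1.

```ruby
# Hypothetical helper (not from any library): append several inputs,
# all assumed to share one source encoding, to a single UTF-8 output file.
def concat_as_utf8(source_paths, target_path, source_encoding = 'ISO-8859-1')
  File.open(target_path, 'w:UTF-8') do |target|
    source_paths.each do |path|
      # Each input is opened and decoded on its own, so no conversion
      # state carries over from one file to the next.
      File.open(path, "r:#{source_encoding}") do |source|
        source.each_line do |line|
          # ... some processing ...
          target.write(line.encode('UTF-8'))
        end
      end
    end
  end
end
```

You would call it as concat_as_utf8(["source1", "source2"], "target"). If some inputs were UTF-8 with a leading BOM rather than ISO-8859-1, opening them with the mode "r:BOM|UTF-8" should strip the BOM on read, but I haven't tried mixing the two in one run.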
HTH
alex