This is easy enough (IIUC your problem): when you read from a stream using #nextLine, all 3 EOL conventions (CR, LF and CRLF) are handled transparently, and you just get each line's contents back until you are done. Then you write the lines back out with your preferred EOL convention.
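
Something like the following is a minimal sketch of that idea, not a finished converter. It assumes the input is already in memory as a String; the variable names and the sample input are made up for illustration. With a file you would use its read stream instead of `input readStream`.

```smalltalk
| input output |
"Hypothetical sample mixing all three conventions: CR, CRLF and LF."
input := 'one', String cr, 'two', String crlf, 'three', String lf.

output := String streamContents: [ :out |
	| in |
	in := input readStream.
	"#nextLine answers the line without its terminator, whichever one it was."
	[ in atEnd ] whileFalse: [
		out nextPutAll: in nextLine; lf ] ].

"Every line now ends in a single LF, so this should answer true."
output = ('one', String lf, 'two', String lf, 'three', String lf)
```

Because a CRLF pair is consumed as one terminator, it does not turn into two line ends, which may be where the extra LFs come from when the text is converted in more than one pass.
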
> On 18 Aug 2016, at 20:41, stepharo <[hidden email]> wrote:
>
> Hi
>
> for the mooc I'm working on a srt to vtt converter.
>
> 1
> 00:00:07,040 --> 00:00:10,440
> Hello. This week,
> we'll get to the heart of the matter,
>
> 2
> 00:00:10,600 --> 00:00:12,160
> about syntax especially.
>
> into
>
> WEBVTT
>
> 00:00:07.040 --> 00:00:10.440 align:middle
> Hello. This week,
> we'll get to the heart of the matter,
>
> 00:00:10.600 --> 00:00:12.160 align:middle
> about syntax especially.
>
>
> It works more or less. Now I face the problem that the files people provided me have different encodings (I guess), because when I do not treat the input (for example with withLinuxLineEndings) I get some CRs after the conversion, even though I copy some file content and all the line endings I output are LF (or can be customized).
>
> I cannot apply garbage in, garbage out because the files should work.
>
> So I thought that I should first convert the string I read using withLinuxLineEndings, so that all CR and CRLF are converted into LF. But since files have different encodings, I sometimes end up with too many LFs.
>
> Does any of you have an idea how to handle this?
>
> I did not find a way to know the encoding of a file (not the BOM, just the line endings).
>
> Stef
>
>