Path >> fullName should not be the same as printString

Damien Pollet
Path >> printString returns a self-evaluating representation, which is fine. Its inverse is thus Compiler >> evaluate: aString.

(Path from: aString) parses the Unix/URL representation of a path and returns a Path instance. As far as I understand, #fullName should be the inverse of that, so that we always have (modulo syntactic normalization, maybe):

(Path from: aString) fullName = aString

Note that the empty string is an edge case that is handled wrongly (or at least in a way that will confuse Unix users):

Path from: ''. "Path root"

Usually the absolute path for the root of the filesystem is written explicitly as '/', and an empty path is equivalent to '.'.
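
To make the two round trips concrete, here is a small sketch in Python rather than Pharo. It uses the standard pathlib module purely as an analogy and says nothing about how FileSystem is implemented: repr/eval stands in for printString/evaluate:, and str stands in for the behaviour proposed for fullName. The helper names print_round_trip and name_round_trip are made up for this illustration.

```python
from pathlib import PurePosixPath


def print_round_trip(path):
    """Analogue of printString / evaluate: rebuild the path from its repr."""
    return eval(repr(path), {"PurePosixPath": PurePosixPath})


def name_round_trip(text):
    """Analogue of (Path from: aString) fullName: parse, then render again."""
    return str(PurePosixPath(text))


p = PurePosixPath("/usr/lib")
print(repr(p))                      # PurePosixPath('/usr/lib')
print(print_round_trip(p) == p)     # True
print(name_round_trip("/usr/lib"))  # /usr/lib
print(name_round_trip("a//b/./c"))  # a/b/c  (equal only modulo normalization)
print(name_round_trip("/"))         # /
print(name_round_trip(""))          # .
```

In this analogy the root is rendered as '/' and the empty string as '.', which matches the Unix convention described above.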

--
Damien Pollet
type less, do more [ | ] http://people.untyped.org/damien.pollet

Re: Path >> fullName should not be the same as printString

stepharo

So Damien, what is the solution?




Re: Path >> fullName should not be the same as printString

Damien Pollet
It's a breaking change and I don't know if there's a way to do it with proper deprecation… my hope is that there are not many users of it, but I haven't checked yet. Any opinions among users of FileSystem?


Re: Path >> fullName should not be the same as printString

Damien Pollet
I explored a bit more and I'm stumped. Fixing it for Unix is easy, but it breaks Windows paths, because those use their first element to store the drive name (c:, d:, etc.), which shouldn't be preceded by a /.

I'm starting to think the Absolute/Relative dichotomy is either wrong or incomplete, and that we need Windows-specific subclasses. Doesn't it feel strange that we can pass strings with different syntax and assumptions to Path class >> from: and get instances of the same classes?
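
As a point of comparison only, Python's standard pathlib takes the "separate classes per syntax" route. The sketch below is a Python analogy, not Pharo code, and it is not a proposal for the FileSystem API; it just shows the same string being interpreted differently depending on which class parses it.

```python
from pathlib import PurePosixPath, PureWindowsPath

text = "c:/foo/bar"

as_windows = PureWindowsPath(text)  # drive letter is understood
as_posix = PurePosixPath(text)      # 'c:' is just a directory name

print(as_windows.drive)          # c:
print(as_windows.is_absolute())  # True
print(as_windows.parts[1:])      # ('foo', 'bar')
print(as_posix.parts)            # ('c:', 'foo', 'bar')
print(as_posix.is_absolute())    # False

# Windows also has drive-relative paths, which fit neither simple category.
drive_relative = PureWindowsPath("c:foo")
print(drive_relative.drive, drive_relative.is_absolute())  # c: False
```

The last case is one reason a plain Absolute/Relative split can feel incomplete: the path names a drive but is still resolved against that drive's current directory.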


Re: Path >> fullName should not be the same as printString

Ben Coman
It's awkward to deal with. Some notes here:
https://pharo.fogbugz.com/default.asp?13094#154946


Re: Path >> fullName should not be the same as printString

Damien Pollet
Yes, I knew about the per-drive current directory…

Could you give an example of what file:// URLs look like when they contain a drive letter?

For context, my goal in looking at this was to make Path instances able to output a string representation of themselves that is usable independently of a FileSystem. A good example would be a path inside a configuration file committed in git: it has to work whatever system the working copy resides on, but we can assume it contains nothing too system-specific. The path of a URL could be a nice example too.


Re: Path >> fullName should not be the same as printString

Martin McClure-2
On 10/13/2016 10:35 AM, Damien Pollet wrote:
> Could you give an example what the file:// URLs look like when they
> contain a drive letter?
>
I'm afraid I don't have any Windows machines handy to see what Internet
Explorer does, but as far as I can tell an absolute URL compliant with
RFC 3986 might look something like

file:/c:/foo/bar

A relative URL that fits the URL syntax would be

file:c:/foo/bar

But I'm finding it difficult to tell precisely how RFCs 1738 and 3986
currently interact.

The discussion in this proposed RFC is somewhat interesting:
https://tools.ietf.org/html/draft-ietf-appsawg-file-scheme-02, as it
directly addresses Windows file naming. In appendix B.2, it says

"When mapping a DOS- or Windows-like file path to a URI, use the drive
letter (e.g. "c:") as the first path segment. Some implementations
leave the leading slash off before the drive letter."

and appendix C.1 deals with DOS file paths.


I hope this is more helpful than it is confusing.

Regards,

-Martin



Re: Path >> fullName should not be the same as printString

gcotelli
At least Chrome uses this:

file:///C:/Users/tempfile.txt
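
To compare the spellings mentioned in this thread, the following sketch runs them through a generic URI parser. It is written in Python with the standard urllib.parse module as an illustration only; it says nothing about how Pharo or any browser treats these URLs, and the helper name split_file_url is made up for the example.

```python
from urllib.parse import urlsplit

candidates = [
    "file:///C:/Users/tempfile.txt",  # form reported for Chrome
    "file:/c:/foo/bar",               # absolute form without an authority part
    "file:c:/foo/bar",                # form without a leading slash
]


def split_file_url(url):
    """Return (authority, path) as a generic URI parser sees them."""
    parts = urlsplit(url)
    return parts.netloc, parts.path


for url in candidates:
    print(url, "->", split_file_url(url))
```

In the first two forms the drive letter ends up as the first segment of an absolute path (after the leading slash); in the third, the path has no leading slash at all.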


Re: Path >> fullName should not be the same as printString

Sven Van Caekenberghe-2
All this already works (although maybe not perfectly in edge cases). Consider:

  'file:///C:/Users/tempfile.txt' asUrl.

  'file:///C:/Users/tempfile.txt' asUrl asFileReference.

  FileLocator C / 'Users' / 'tempfile.txt'.

Where the last two are identical.

Note that relative file URLs do not exist.

Re: Path >> fullName should not be the same as printString

Damien Pollet
While we're at it, canonicalizing paths at creation time also seems wrong.

First, because these two expressions do not return the same thing:
Path from: 'a/b/c/../d'
Path * / 'a' / 'b' / 'c' / '..' / 'd'

Second, because if c happens to be a symlink, the operating system will not resolve the path to the same place as Pharo does. The OS follows the symlink first and then applies .. in the directory it lands in, so it ends up in the parent of the directory the symlink points to, not one level back in the original path.
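
The symlink point can be reproduced outside Pharo. The sketch below is Python, not Pharo, and assumes a system where creating symlinks is permitted; the directory names (a, b, c, real, target, d) are arbitrary choices for the demonstration. It compares purely textual normalization (os.path.normpath) with what the operating system resolves (os.path.realpath).

```python
import os
import tempfile


def lexical_vs_resolved(root):
    """Build root/a/b/c -> root/real/target, then look at a/b/c/../d two ways."""
    os.makedirs(os.path.join(root, "a", "b"))
    os.makedirs(os.path.join(root, "real", "target"))
    os.symlink(os.path.join(root, "real", "target"),
               os.path.join(root, "a", "b", "c"))

    path = os.path.join(root, "a", "b", "c", "..", "d")
    lexical = os.path.normpath(path)   # drops 'c/..' textually
    resolved = os.path.realpath(path)  # follows the symlink, then applies '..'
    return lexical, resolved


with tempfile.TemporaryDirectory() as tmp:
    root = os.path.realpath(tmp)
    lexical, resolved = lexical_vs_resolved(root)
    print(os.path.relpath(lexical, root))   # a/b/d on a Unix-like system
    print(os.path.relpath(resolved, root))  # real/d on a Unix-like system
```

The two results name different locations, which is the mismatch that canonicalizing at creation time would bake into the path.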


Re: Path >> fullName should not be the same as printString

stepharo

So how can we make progress?


Stef

