Path>>/ and ..

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
6 messages Options
Reply | Threaded
Open this post in threaded view
|

Path>>/ and ..

alistairgrant
Hi All,

The current implementation of Path>>/, while functional, seems to me to
create poorly formed paths, e.g.:

('/a/b/c/d' asFileReference / '../e') path ==> "Path / 'a' / 'b' / 'c'
/ 'd' / '../e'"

As can be seen above, the last segment is '../e'.

I would expect the result to be:

"Path / 'a' / 'b' / 'c' / 'e'"

Is there a reason for the current behaviour?  If not, are people open to
modifying Path>>/ to produce the suggested result above (in Pharo 7)?

The method below produces the modified result above and doesn't change
the result of running all automated tests.  If this is going to be added
to Pharo 7 I'll create an additional automated test to check that it is
working properly.

Cheers,
Alistair

'From Pharo6.0 of 13 May 2016 [Latest update: #60451] on 29 March 2017
at 9:40:41.681228 am'!

!Path methodsFor: 'navigating' stamp: 'AlistairGrant 3/29/2017 09:39'!
/ aString
    | path |

    aString isEmptyOrNil
        ifTrue: [ Error signal: 'Path element cannot be empty or nil'].

    ^self resolve: (Path from: aString)
! !

Reply | Threaded
Open this post in threaded view
|

Re: Path>>/ and ..

EstebanLM
Hi,

problem is this part: '../e’
it has to be any of this:

('/a/b/c/d' asFileReference / '..' / 'e') path. "Path / 'a' / 'b' / 'c' / 'd' / '..' / 'e'"
('/a/b/c/d' asFileReference parent / 'e') path. "Path / 'a' / 'b' / 'c' / 'e'"

the rationale is that “..” is not modelled as a special kind of node, just a string.
and the reason is that that’s commonly like that, but it doesn’t *has to* be like that… there could be file systems that interprets .. as a simple string.

and everything can be improved :)

Esteban

> On 29 Mar 2017, at 12:21, Alistair Grant <[hidden email]> wrote:
>
> Hi All,
>
> The current implementation of Path>>/, while functional, seems to me to
> create poorly formed paths, e.g.:
>
> ('/a/b/c/d' asFileReference / '../e') path ==> "Path / 'a' / 'b' / 'c'
> / 'd' / '../e'"
>
> As can be seen above, the last segment is '../e'.
>
> I would expect the result to be:
>
> "Path / 'a' / 'b' / 'c' / 'e'"
>
> Is there a reason for the current behaviour?  If not, are people open to
> modifying Path>>/ to produce the suggested result above (in Pharo 7)?
>
> The method below produces the modified result above and doesn't change
> the result of running all automated tests.  If this is going to be added
> to Pharo 7 I'll create an additional automated test to check that it is
> working properly.
>
> Cheers,
> Alistair
>
> 'From Pharo6.0 of 13 May 2016 [Latest update: #60451] on 29 March 2017
> at 9:40:41.681228 am'!
>
> !Path methodsFor: 'navigating' stamp: 'AlistairGrant 3/29/2017 09:39'!
> / aString
>    | path |
>
>    aString isEmptyOrNil
>        ifTrue: [ Error signal: 'Path element cannot be empty or nil'].
>
>    ^self resolve: (Path from: aString)
> ! !
>


Reply | Threaded
Open this post in threaded view
|

Re: Path>>/ and ..

alistairgrant
On 29 March 2017 at 13:33, Esteban Lorenzano <[hidden email]> wrote:

>
> problem is this part: '../e’
> it has to be any of this:
>
> ('/a/b/c/d' asFileReference / '..' / 'e') path. "Path / 'a' / 'b' / 'c' / 'd' / '..' / 'e'"
> ('/a/b/c/d' asFileReference parent / 'e') path. "Path / 'a' / 'b' / 'c' / 'e'"
>
> the rationale is that “..” is not modelled as a special kind of node, just a string.
> and the reason is that that’s commonly like that, but it doesn’t *has
> to* be like that… there could be file systems that interprets .. as a
> simple string.

Except that ".." is treated as a special case at the moment in
Path class>>addParentElementTo:, which is why #resolve: does the right
thing (well, at least in my opinion :-)).

While I agree that a more general solution is mostly a better solution,
I'm not aware of any file system at the moment that doesn't treat .. as
the parent directory, and Pharo already treats .. as a special case.

I'm having trouble understanding the Path class comments, however:

    "#* and #/ are mnemonic to . and /
    whose arguments should  be string file- or directory names, not
    fragments of Unix path notation intended to be parsed."

seems to imply that segments should not contain the delimiter (/), and
the current implementation leaves the delimiter in the segment.


> and everything can be improved :)

I still think this would be an improvement. :-)


> Esteban

Thanks!
Alistair


>> On 29 Mar 2017, at 12:21, Alistair Grant <[hidden email]> wrote:
>>
>> Hi All,
>>
>> The current implementation of Path>>/, while functional, seems to me to
>> create poorly formed paths, e.g.:
>>
>> ('/a/b/c/d' asFileReference / '../e') path ==> "Path / 'a' / 'b' / 'c'
>> / 'd' / '../e'"
>>
>> As can be seen above, the last segment is '../e'.
>>
>> I would expect the result to be:
>>
>> "Path / 'a' / 'b' / 'c' / 'e'"
>>
>> Is there a reason for the current behaviour?  If not, are people open to
>> modifying Path>>/ to produce the suggested result above (in Pharo 7)?
>>
>> The method below produces the modified result above and doesn't change
>> the result of running all automated tests.  If this is going to be added
>> to Pharo 7 I'll create an additional automated test to check that it is
>> working properly.
>>
>> Cheers,
>> Alistair
>>
>> 'From Pharo6.0 of 13 May 2016 [Latest update: #60451] on 29 March 2017
>> at 9:40:41.681228 am'!
>>
>> !Path methodsFor: 'navigating' stamp: 'AlistairGrant 3/29/2017 09:39'!
>> / aString
>>    | path |
>>
>>    aString isEmptyOrNil
>>        ifTrue: [ Error signal: 'Path element cannot be empty or nil'].
>>
>>    ^self resolve: (Path from: aString)
>> ! !

Reply | Threaded
Open this post in threaded view
|

Re: Path>>/ and ..

Ben Coman
On Wed, Mar 29, 2017 at 8:02 PM, Alistair Grant <[hidden email]> wrote:

> On 29 March 2017 at 13:33, Esteban Lorenzano <[hidden email]> wrote:
>>
>> problem is this part: '../e’
>> it has to be any of this:
>>
>> ('/a/b/c/d' asFileReference / '..' / 'e') path. "Path / 'a' / 'b' / 'c' / 'd' / '..' / 'e'"
>> ('/a/b/c/d' asFileReference parent / 'e') path. "Path / 'a' / 'b' / 'c' / 'e'"
>>
>> the rationale is that “..” is not modelled as a special kind of node, just a string.
>> and the reason is that that’s commonly like that, but it doesn’t *has
>> to* be like that… there could be file systems that interprets .. as a
>> simple string.
>
> Except that ".." is treated as a special case at the moment in
> Path class>>addParentElementTo:, which is why #resolve: does the right
> thing (well, at least in my opinion :-)).
>
> While I agree that a more general solution is mostly a better solution,
> I'm not aware of any file system at the moment that doesn't treat .. as
> the parent directory, and Pharo already treats .. as a special case.

Perhaps Pharo should adopts ".." as *its* platform independent "go-up" token.
So the (relatively uncommon) platforms that don't conform will need an adapter.

cheers -ben

>
> I'm having trouble understanding the Path class comments, however:
>
>     "#* and #/ are mnemonic to . and /
>     whose arguments should  be string file- or directory names, not
>     fragments of Unix path notation intended to be parsed."
>
> seems to imply that segments should not contain the delimiter (/), and
> the current implementation leaves the delimiter in the segment.
>
>
>> and everything can be improved :)
>
> I still think this would be an improvement. :-)
>
>
>> Esteban
>
> Thanks!
> Alistair
>
>
>>> On 29 Mar 2017, at 12:21, Alistair Grant <[hidden email]> wrote:
>>>
>>> Hi All,
>>>
>>> The current implementation of Path>>/, while functional, seems to me to
>>> create poorly formed paths, e.g.:
>>>
>>> ('/a/b/c/d' asFileReference / '../e') path ==> "Path / 'a' / 'b' / 'c'
>>> / 'd' / '../e'"
>>>
>>> As can be seen above, the last segment is '../e'.
>>>
>>> I would expect the result to be:
>>>
>>> "Path / 'a' / 'b' / 'c' / 'e'"
>>>
>>> Is there a reason for the current behaviour?  If not, are people open to
>>> modifying Path>>/ to produce the suggested result above (in Pharo 7)?
>>>
>>> The method below produces the modified result above and doesn't change
>>> the result of running all automated tests.  If this is going to be added
>>> to Pharo 7 I'll create an additional automated test to check that it is
>>> working properly.
>>>
>>> Cheers,
>>> Alistair
>>>
>>> 'From Pharo6.0 of 13 May 2016 [Latest update: #60451] on 29 March 2017
>>> at 9:40:41.681228 am'!
>>>
>>> !Path methodsFor: 'navigating' stamp: 'AlistairGrant 3/29/2017 09:39'!
>>> / aString
>>>    | path |
>>>
>>>    aString isEmptyOrNil
>>>        ifTrue: [ Error signal: 'Path element cannot be empty or nil'].
>>>
>>>    ^self resolve: (Path from: aString)
>>> ! !
>

Reply | Threaded
Open this post in threaded view
|

Re: Path>>/ and ..

Damien Pollet
In reply to this post by alistairgrant
On 29 March 2017 at 14:02, Alistair Grant <[hidden email]> wrote:
Except that ".." is treated as a special case at the moment in
Path class>>addParentElementTo:, which is why #resolve: does the right
thing (well, at least in my opinion :-)).

Well, not according to shell commands :
ls ./nonexistentdirectory/.. → cannot access, no such file or directory

So we have a divergence between FileSystem, where .. means "cancel the last path segment, syntactically, without even checking the actual files and directories" and the behavior of the shell command (and thus of the underlying system call, I guess) which attempts to actually resolve each segment and fails because it cannot go and look inside nonexistentdirectory for the entry named '..'

This becomes important with symbolic links : try ls ./somesymlink/.. and you'll see that you don't get the contents of $PWD but of the parent of the symlink's destination.
 
I'm having trouble understanding the Path class comments, however:

    "#* and #/ are mnemonic to . and /
    whose arguments should  be string file- or directory names, not
    fragments of Unix path notation intended to be parsed."

seems to imply that segments should not contain the delimiter (/), and
the current implementation leaves the delimiter in the segment.

Yes. The point is, if you have an API that tries to be clever and guesses what you meant (e.g. you could pass '../foo/bar' as a path element, it might be convenient most of the time, but you lose control of what you mean: the API, because it interprets the strings you pass, will make it difficult or impossible to express some paths. For instance, on my current filesystem, I can create a file named foo\bar. I'm guessing that's impossible on Windows… It also leads to quoting, escaping, case sensitivity and such problems (try writing shell scripts that are robust to file names with spaces…)

The alternative is to have an API that does what you tell it to, no more, no less. It should consider that if you pass a string with special characters, you meant to have those characters in the result. Maybe that's impossible on some filesystems, and raises an error, but the API will not decide to do something different than you asked.

 
>> On 29 Mar 2017, at 12:21, Alistair Grant <[hidden email]> wrote:
>>
>> Hi All,
>>
>> The current implementation of Path>>/, while functional, seems to me to
>> create poorly formed paths, e.g.:
>>
>> ('/a/b/c/d' asFileReference / '../e') path ==> "Path / 'a' / 'b' / 'c'
>> / 'd' / '../e'"
>>
>> As can be seen above, the last segment is '../e'.
>>
>> I would expect the result to be:
>>
>> "Path / 'a' / 'b' / 'c' / 'e'"
>>
>> Is there a reason for the current behaviour?  If not, are people open to
>> modifying Path>>/ to produce the suggested result above (in Pharo 7)?
>>
>> The method below produces the modified result above and doesn't change
>> the result of running all automated tests.  If this is going to be added
>> to Pharo 7 I'll create an additional automated test to check that it is
>> working properly.
>>
>> Cheers,
>> Alistair
>>
>> 'From Pharo6.0 of 13 May 2016 [Latest update: #60451] on 29 March 2017
>> at 9:40:41.681228 am'!
>>
>> !Path methodsFor: 'navigating' stamp: 'AlistairGrant 3/29/2017 09:39'!
>> / aString
>>    | path |
>>
>>    aString isEmptyOrNil
>>        ifTrue: [ Error signal: 'Path element cannot be empty or nil'].
>>
>>    ^self resolve: (Path from: aString)
>> ! !




--
Damien Pollet
type less, do more [ | ] http://people.untyped.org/damien.pollet
Reply | Threaded
Open this post in threaded view
|

Re: Path>>/ and ..

alistairgrant
On 29 March 2017 at 16:20, Damien Pollet <[hidden email]> wrote:

> On 29 March 2017 at 14:02, Alistair Grant <[hidden email]> wrote:
>>
>> Except that ".." is treated as a special case at the moment in
>> Path class>>addParentElementTo:, which is why #resolve: does the right
>> thing (well, at least in my opinion :-)).
>
>
> Well, not according to shell commands :
> ls ./nonexistentdirectory/.. → cannot access, no such file or directory
>
> So we have a divergence between FileSystem, where .. means "cancel the last
> path segment, syntactically, without even checking the actual files and
> directories" and the behavior of the shell command (and thus of the
> underlying system call, I guess) which attempts to actually resolve each
> segment and fails because it cannot go and look inside nonexistentdirectory
> for the entry named '..'
>
> This becomes important with symbolic links : try ls ./somesymlink/.. and
> you'll see that you don't get the contents of $PWD but of the parent of the
> symlink's destination.
>
>>
>> I'm having trouble understanding the Path class comments, however:
>>
>>     "#* and #/ are mnemonic to . and /
>>     whose arguments should  be string file- or directory names, not
>>     fragments of Unix path notation intended to be parsed."
>>
>> seems to imply that segments should not contain the delimiter (/), and
>> the current implementation leaves the delimiter in the segment.
>
>
> Yes. The point is, if you have an API that tries to be clever and guesses
> what you meant (e.g. you could pass '../foo/bar' as a path element, it might
> be convenient most of the time, but you lose control of what you mean: the
> API, because it interprets the strings you pass, will make it difficult or
> impossible to express some paths. For instance, on my current filesystem, I
> can create a file named foo\bar. I'm guessing that's impossible on Windows…
> It also leads to quoting, escaping, case sensitivity and such problems (try
> writing shell scripts that are robust to file names with spaces…)
>
> The alternative is to have an API that does what you tell it to, no more, no
> less. It should consider that if you pass a string with special characters,
> you meant to have those characters in the result. Maybe that's impossible on
> some filesystems, and raises an error, but the API will not decide to do
> something different than you asked.

Good point, I'd forgotten about symlinks.

I did a little investigation, and Python and Java don't bother about
checking whether the path exists or if there are symlinks present.
There are warnings in the documentation.

Bash provides realpath and readlink, but there are flags to indicate
whether to check that the path actually exists or not.

At the moment, Pharo is a bit inconsistent:

'/a/b/c' asFileReference / '../d' "File @ /a/b/c/../d"

'/a/b/c/../d' asFileReference  "File @ /a/b/d"

Perhaps FileReference and Path could be modified so that references
to the parent directory are kept by default, but there are canonicalize
methods that remove them:

#canonicalizeFollow - the path must exist and symlinks are followed
#canonicalize - don't check for file existance or follow symlinks

I would still modify Path>>/ so that segments don't contain directory
separators, i.e. currently:

'/a/b/c' asFileReference / '../d'  "Path / 'a' / 'b' / 'c' / '../d'"

would become:

'/a/b/c' asFileReference / '../d'  "Path / 'a' / 'b' / 'c' / '..' / 'd'"

(Notice that the second version has an additional segment).

What does everyone think?

Thanks,
Alistair