Why would (Unix)FileDirectory class>>#checkName:fixErrors: mangle a correctly formatted file path/name?


timrowledge
At least, UnixFileDirectory seems to -

'/home/pi/Squeak/Senders of printPSToFileNamed:.ps' asFileName

results in '#home#pi#Squeak#Senders of printPSToFileNamed:.ps', which seems very odd. Surely if you have a string that is a correct filename, returned by some other code that carefully constructs it, then #asFileName etc. ought to return the same (or even the original) string?

tim
--
tim Rowledge; [hidden email]; http://www.rowledge.org/tim
If you never try anything new, you'll miss out on many of life's great disappointments




Re: Why would (Unix)FileDirectory class>>#checkName:fixErrors: mangle a correctly formatted file path/name?

Tobias Pape

> On 23.11.2017, at 02:57, tim Rowledge <[hidden email]> wrote:
>
> At least, UnixFileDirectory seems to -
>
> '/home/pi/Squeak/Senders of printPSToFileNamed:.ps' asFileName
>
> results in '#home#pi#Squeak#Senders of printPSToFileNamed:.ps', which seems very odd. Surely if you have a string that is a correct filename, returned by some other code that carefully constructs it, then #asFileName etc. ought to return the same (or even the original) string?

Well, you asked for a file "name", right? Not a path. And "/" is the path separator on Unix, so it cannot appear inside a file name.

The method comment seems to back that up:
"Answer a String made up from the receiver that is an acceptable file name."

Best regards
        -tobias
>
> tim
> --
> tim Rowledge; [hidden email]; http://www.rowledge.org/tim
> If you never try anything new, you'll miss out on many of life's great disappointments
>
>
>



Re: Why would (Unix)FileDirectory class>>#checkName:fixErrors: mangle a correctly formatted file path/name?

timrowledge

> On 22-11-2017, at 10:18 PM, Tobias Pape <[hidden email]> wrote:
>
> Well, you asked for a file "name", right? Not a path. And "/" is the path separator on Unix, so it cannot appear inside a file name.

Hah, well maybe you’re right, though it wasn’t my code.

Generally our filename code is awful, as it has been since 1982 at least.

The problem arises because, as I swap in my file chooser/saver dialogs - which return full paths for the chosen filenames - I find places where the code is … interesting. #asFileName features fairly often here. Explicit assumptions about filename extensions are quite common and often dubious.

Browse senders of UIManager>>#request:initialAnswer: for a combination of laughs and winces.


tim
--
tim Rowledge; [hidden email]; http://www.rowledge.org/tim
I'm so skeptical that I'm not sure I'm really a skeptic




Re: Why would (Unix)FileDirectory class>>#checkName:fixErrors: mangle a correctly formatted file path/name?

timrowledge

> On 22-11-2017, at 11:13 PM, tim Rowledge <[hidden email]> wrote:
>
>
>> On 22-11-2017, at 10:18 PM, Tobias Pape <[hidden email]> wrote:
>>
>> Well, you asked for a file "name", right? Not a path. And "/" is the path separator on Unix, so it cannot appear inside a file name.
>
> Hah, well maybe you’re right, though it wasn’t my code.

FilePath class appears to be happy to handle, well, path names, as evidenced by things like SmalltalkImage>>imageName and >>imageName:. If I've understood correctly, we can make one either from a Squeak string (handling byte and wide forms with the converters) or from a string returned by the VM, which it then turns into a Squeak string.

The FilePath object has both forms available, which makes me wonder why #asFileName first converts the Squeak String to VM-encoded form and then converts it back to Squeak form. Is it really the case that this might not result in the same string at the end? And even if that is so, doesn't the FilePath code result in the squeakPathName getting any conversion it needs anyway? Looking at FilePath>>#pathName:isEncoded: it seems that `p isOctetString ifTrue: [p asOctetString] ifFalse: [p]` is pretty much a waste; if 'p' is an octet string (i.e. one that is or could be a ByteString, which involves scanning the entire length of a WideString) then
a) ByteStrings obviously don’t need any converting by asOctetString
b) WideStrings look like they return a WideString if there are any actual wide chars inside
c) so the only effect of this is that a WideString that no longer has wide chars gets converted to a ByteString

There seem to be a lot of places where asSqueakPathName and asVmPathName have been dropped in by guesswork. It's a pity we can't have a more platform-specific class of string-thing for the VM to return; that way we would know for certain that it needed converting to Squeak format. It's similar to the problems we have because of the lack of an actual UTF-8 string class. I think making the VM return not-ByteStrings might break a few old things though, sadly.

We can argue about whether #asFileName ought to do stuff to a String to make it an acceptable name for a leaf file rather than a full file pathname. Given the sloppy way we use filename/path terminology it’s a toss-up. There ought to be a similarly succinct message to do the full path equivalent though.

If we look at usages of #asFileName we see quite a range of oddness. Since using it effectively restricts any file name entered by a user to be just a leaf name, there are places where you don’t get a chance to save a file anywhere other than the default directory, which I consider rather rude. As an example, PasteUpMorph>>#saveOnFile - it seems pretty restrictive to only let the .morph file get saved into the default directory. And the usage of #asFileName is completely redundant anyway since the #checkName:fixErrors: is done in FileDirectory>>#fullNameFor: and the string format check/fix is done in StandardFileStream>>#open:forWrite: !

And just to rub in the point that our file code is in need of cleaning up, note that FileStream class>>#newFileNamed: includes use of #fullName: and then StandardFileStream class>>#newFileNamed: sends it again, to the already fixed up string. It would also be nice if we could work out how to avoid using MultiByteFileStream as the normal stream - in the NuScratch work I discovered that reading a text file could be made many times faster (a big deal for large language translation data files) by using StandardFileStream wherever possible.

It makes some sense to use asFileName on Strings being given to the user as suggested file names, so long as that really is only a leaf file name. But consider a case like Morph>>#printPSToFileNamed: where the string parameter might well be a full path rather than just a leaf name - the use of #asFileName completely breaks the intent. (Not to mention that the entire premise of the method is faulty anyway, mixing UI and headless relevant code in one place.)

In general I suggest that many places currently using #asFileName ought to be using (the currently fictitious) #asFullPathName in order to get a full path name with all the corrections and format fixes. Something like `^FileDirectory default fullPathFor: self`
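
As a rough workspace sketch of the distinction drawn above (leaf name versus full path), one could split the path, apply #asFileName to the leaf only, and rebuild the full name. This is only an illustration, not code from the image: the variable names are made up, and it assumes that FileDirectory class>>#dirPathFor: and #localNameFor: split the path at the platform separator and that FileDirectory>>#fullNameFor: rejoins the two parts.

```smalltalk
"Workspace sketch only. Sanitise the leaf name, leave the directory part alone."
| fullPath dirPath leaf safeLeaf |
fullPath := '/home/pi/Squeak/Senders of printPSToFileNamed:.ps'.
dirPath := FileDirectory dirPathFor: fullPath.	"directory part of the path"
leaf := FileDirectory localNameFor: fullPath.	"leaf file name only"
safeLeaf := leaf asFileName.	"no path separators left in the leaf to be replaced"
(FileDirectory on: dirPath) fullNameFor: safeLeaf
```

A hypothetical String>>#asFullPathName could wrap something along these lines, so that callers holding a full path never hand the directory separators to #asFileName.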


tim
--
tim Rowledge; [hidden email]; http://www.rowledge.org/tim
Strange OpCodes: TOAC: Turn Off Air Conditioner