Assuming the build succeeds and I don't think of anything else, I'll merge this in the next 48 hours. You can view, comment on, or merge this pull request online at:https://github.com/OpenSmalltalk/opensmalltalk-vm/pull/321 Commit Summary
File Changes
Patch Links:
— |
@akgrant43 pushed 1 commit.
— |
In reply to this post by David T Lewis
Merged #321 into Cog. — |
In reply to this post by David T Lewis
Hi Alistair, you can do better than using malloc & free in primitiveClosedir et al. If you look at the changes I made in VMMaker.oscog-eem.2490 & VMMaker.oscog-eem.2491 you'll see a simple pattern: SerialPlugin>>copyPortNameToCString: portName <returnTypeC: #'char *'> | port portNameSize | <inline: #always> <var: 'port' type: #'char *'> portNameSize := interpreterProxy slotSizeOf: (portName asOop: String). port := self alloca: portNameSize + 1. self memcpy: port _: portName _: portNameSize. port at: portNameSize put: 0. ^port This method simply copies a Smalltalk string to a null-terminated C string via alloca. If you don't know this, alloca is like malloc but it allocates on the stack in the current stack frame and so the space it allocates is automatically freed on return from the function in which alloca is invoked. Because the method is marked <inline: #always> it is in lined into its caller and so the alloca happens in the calling method, e.g. SerialPlugin>>primitiveSerialPortCloseByName: portName | port | self primitive: 'primitiveSerialPortCloseByName' parameters: #(String). port := self copyPortNameToCString: portName. self serialPortCloseByName: port and there is no need to free anything. Hence in e.g. primitiveClosedir where you write primitiveClosedir "Close the directory stream for dirPointerOop. Answer dirPointerOop on success. Raise PrimErrBadArgument if the parameter is not a ByteArray length size(void *). If closedir() returns an error raise PrimitiveOSError." | dirPointerOop faPathPtr faPath result | <export: true> <var: 'faPath' type: #'fapath *'> <var: 'faPathPtr' type: #'fapathptr *'> dirPointerOop := interpreterProxy stackValue: 0. faPathPtr := self cCode: '(fapathptr *)structFromObjectsize(dirPointerOop, sizeof(fapathptr))' inSmalltalk: [self structFromObject: dirPointerOop size: self sizeOfFaPathPtr]. faPathPtr = 0 ifTrue: [^interpreterProxy primitiveFailFor: PrimErrBadArgument]. (self faValidateSessionId: (self cCode: 'faPathPtr->sessionId' inSmalltalk: [faPathPtr first])) ifFalse: [self free: faPathPtr. ^interpreterProxy primitiveFailFor: PrimErrBadArgument]. faPath := self cCode: 'faPathPtr->faPath' inSmalltalk: [faPathPtr second]. result := self faCloseDirectory: faPath. self faInvalidateSessionId: (self cCode: '&faPathPtr->sessionId' inSmalltalk: [faPathPtr]). result = 0 ifFalse: [^interpreterProxy primitiveFailForOSError: result]. self free: faPathPtr. self free: faPath. interpreterProxy pop: 2 thenPush: dirPointerOop if you implement structFromObject:size: like this: FileAttributesPlugin>>structFromObject: anObject size: structSize "Allocate memory of the requiested size and copy the contents of anObject in to it. anObject is expected to be bytes, e.g. ByteArray or String. Use alloca to avoid having to explicitly free memory." <returnTypeC: #'void *'> | buffer | <inline: #always> (interpreterProxy stSizeOf: anObject) = structSize ifFalse: [interpreterProxy primitiveFailFor: PrimErrBadArgument. ^0]. buffer := self alloca: structSize. buffer = 0 ifTrue: [interpreterProxy primitiveFailFor: PrimErrNoCMemory] ifFalse: [self memcpy: buffer _: (interpreterProxy arrayValueOf: anObject) _: structSize]. ^buffer you can then write primitiveCloseDir as FileAttributesPlugin>>primitiveClosedir "Close the directory stream for dirPointerOop. Answer dirPointerOop on success. Raise PrimErrBadArgument if the parameter is not a ByteArray length size(void *). If closedir() returns an error raise PrimitiveOSError." | dirPointerOop faPathPtr faPath result | <export: true> <var: 'faPath' type: #'fapath *'> <var: 'faPathPtr' type: #'fapathptr *'> dirPointerOop := interpreterProxy stackValue: 0. faPathPtr := self structFromObject: dirPointerOop size: (self sizeof: faPathPtr)]. faPathPtr = 0 ifTrue: [^interpreterProxy primitiveFailFor: PrimErrBadArgument]. (self faValidateSessionId: (self cCode: 'faPathPtr->sessionId' inSmalltalk: [faPathPtr first])) ifFalse: [^interpreterProxy primitiveFailFor: PrimErrBadArgument]. faPath := self cCode: 'faPathPtr->faPath' inSmalltalk: [faPathPtr second]. result := self faCloseDirectory: faPath. self faInvalidateSessionId: (self cCode: '&faPathPtr->sessionId' inSmalltalk: [faPathPtr]). result = 0 ifFalse: [^interpreterProxy primitiveFailForOSError: result]. self free: faPath. interpreterProxy pop: 2 thenPush: dirPointerOop If you would go as far as to make a VMStructType subclass for fapath et al you could then write it as FileAttributesPlugin>>primitiveClosedir "Close the directory stream for dirPointerOop. Answer dirPointerOop on success. Raise PrimErrBadArgument if the parameter is not a ByteArray length size(void *). If closedir() returns an error raise PrimitiveOSError." | dirPointerOop faPathPtr result | <export: true> <var: 'faPathPtr' type: #'FAPathptr *'> dirPointerOop := interpreterProxy stackValue: 0. faPathPtr := self structFromObject: dirPointerOop size: (self sizeof: faPathPtr)]. faPathPtr = 0 ifTrue: [^interpreterProxy primitiveFailFor: PrimErrBadArgument]. (self faValidateSessionId: faPathPtr sessionId) ifFalse: [^interpreterProxy primitiveFailFor: PrimErrBadArgument]. result := self faCloseDirectory: faPathPtr faPath. self faInvalidateSessionId: faPathPtr sessionId. result = 0 ifFalse: [^interpreterProxy primitiveFailForOSError: result]. self free: faPathPtr faPath. interpreterProxy pop: 2 thenPush: dirPointerOop and then as FileAttributesPlugin>>primitiveClosedir "Close the directory stream for dirPointerOop. Answer dirPointerOop on success. Raise PrimErrBadArgument if the parameter is not a ByteArray length size(void *). If closedir() returns an error raise PrimitiveOSError." | dirPointerOop faPathPtr result | <export: true> <var: 'faPathPtr' type: #'FAPathptr *'> dirPointerOop := interpreterProxy stackValue: 0. faPathPtr := self structFromObject: dirPointerOop size: (self sizeof: faPathPtr)]. (faPathPtr ~= 0 and: [self faValidateSessionId: faPathPtr sessionId]) ifFalse: [^interpreterProxy primitiveFailFor: PrimErrBadArgument]. result := self faCloseDirectory: faPathPtr faPath. self faInvalidateSessionId: faPathPtr sessionId. result = 0 ifFalse: [^interpreterProxy primitiveFailForOSError: result]. self free: faPathPtr faPath. interpreterProxy pop: 2 thenPush: dirPointerOop The only thing that;'s unclear for me at the moment is how the result of faInvalidateSessionId: gets tested. result doesn't get set by it. I presume it sets primitive failure, but there';s no test for primitive failure where result is tested (result = 0). HTH! On Sat, Dec 15, 2018 at 1:10 AM akgrant43 <[hidden email]> wrote: > Merged #321 <https://github.com/OpenSmalltalk/opensmalltalk-vm/pull/321> > into Cog. > > — > You are receiving this because you are subscribed to this thread. > Reply to this email directly, view it on GitHub > <https://github.com/OpenSmalltalk/opensmalltalk-vm/pull/321#event-2028387352>, > or mute the thread > <https://github.com/notifications/unsubscribe-auth/APHa0Kry-Zrt5kUswbgFCps_HW63I6gEks5u5LxogaJpZM4ZT4Mm> > . > -- _,,,^..^,,,_ best, Eliot — |
Hi Eliot, On Sat, Dec 15, 2018 at 04:11:33PM -0800, Eliot Miranda wrote: > > Hi Alistair, > > you can do better than using malloc & free in primitiveClosedir et al. > If you look at the changes I made in VMMaker.oscog-eem.2490 > & VMMaker.oscog-eem.2491 you'll see a simple pattern: > > ... > You are, of course, correct. By way of excuses: There are a number of methods that basically do: faPath := calloc(1, sizeof(fapath)) I was debating with myself between whether to allocate fapath (between 9K and 64K, depending on platform) on the heap or the stack since there's been some recent discussion about using the VM on SoC systems. It's about 30 years since I've dealth with embedded systems (collecting statistics on X.25 networks) or really thought about stack and heap allocation - I should have just admitted to myself that I'm so far out of date that I have no idea. :-) Anyway, I got caught up on that and didn't think about this case. I'll make the change you suggest, and also move fapath on to the stack (primitiveOpendir being the exception) - it will be a bit faster for most people. > The only thing that;'s unclear for me at the moment is how the result of > faInvalidateSessionId: gets tested. result doesn't get set by it. I > presume it sets primitive failure, but there';s no test for primitive > failure where result is tested (result = 0). faInvalidateSessionId can't fail, all it does is set the session Id to 0 to catch the case where someone tries to reuse an invalid fapath within a VM session (i.e. after calling primitiveClosedir). Cheers, Alistair |
Hi Alistair, On Mon, Dec 17, 2018 at 10:30 PM Alistair Grant <[hidden email]> wrote:  Oh wow, I had no realization it was that bug. I had assumed it was the size of a path, i.e. only a few hindered bytes at max. If it is that big I think the way you're doing things is fine. I would just ask you to comment the code as being that way because it is that big. It's about 30 years since I've dealth with embedded systems (collecting _,,,^..^,,,_ best, Eliot |
Hi Eliot, On Tue, 18 Dec 2018 at 18:35, Eliot Miranda <[hidden email]> wrote: > > Oh wow, I had no realization it was that bug. I had assumed it was the size of a path, i.e. only a few hindered bytes at max. If it is that big I think the way you're doing things is fine. I would just ask you to comment the code as being that way because it is that big. There are two structures being allocated. The one you provided the code for is small - an integer plus a pointer. The other one, which has enough room for any path is the big one. It holds the path in UTF8 (as passed from the image) and in the platform's native format, which for Windows is 32K 16 bit words. Cheers, Alistair |
> On 2018-12-18, at 10:55 PM, Alistair Grant <[hidden email]> wrote: > > > Hi Eliot, > > On Tue, 18 Dec 2018 at 18:35, Eliot Miranda <[hidden email]> wrote: >> >> Oh wow, I had no realization it was that bug. I had assumed it was the size of a path, i.e. only a few hindered bytes at max. If it is that big I think the way you're doing things is fine. I would just ask you to comment the code as being that way because it is that big. > > There are two structures being allocated. The one you provided the > code for is small - an integer plus a pointer. The other one, which > has enough room for any path is the big one. It holds the path in > UTF8 (as passed from the image) and in the platform's native format, > which for Windows is 32K 16 bit words. Good grief, that's huge. Is there no mechanism for querying the size to allow correct allocation? RISC OS has a number of similar APIs where you basically do val = thingyCall(param1, paramN, 0); allocate(buffer, val); result = thingyCall(param1, paramN, *buffer); tim -- tim Rowledge; [hidden email]; http://www.rowledge.org/tim CITOKATE - Criticism is the Only Known Antidote to Error |
Hi Tim, On Wed, 19 Dec 2018 at 18:38, tim Rowledge <[hidden email]> wrote: > > > On 2018-12-18, at 10:55 PM, Alistair Grant <[hidden email]> wrote: > > There are two structures being allocated. The one you provided the > > code for is small - an integer plus a pointer. The other one, which > > has enough room for any path is the big one. It holds the path in > > UTF8 (as passed from the image) and in the platform's native format, > > which for Windows is 32K 16 bit words. > > Good grief, that's huge. Is there no mechanism for querying the size to allow correct allocation? RISC OS has a number of similar APIs where you basically do > val = thingyCall(param1, paramN, 0); > allocate(buffer, val); > result = thingyCall(param1, paramN, *buffer); Not directly. This is used to iterate over a directory, and of course there's no way to know in advance what the longest file name in the directory is. 32K is the maximum path length, I have to admit I'm not sure what the maximum file name length is on Windows. The directory name is known in advance, so if the maximum file name length is shorter, the buffer size could be reduced. But it may be different for different file systems, so I'm not sure that we can be confident. The whole thing is actually quite a mess, e.g. opening a short path (< 260 characters) works as expected (this is assuming the files really do exist in all examples below): C:\aaa\bbb\ccc\abc.txt Another short path: C:\aaa\bbb\ddd\..\ccc\abc.txt is also fine. A long path such as: C:\aaa\<b repeated 140 times>\<c repeated 140 times>\abc.txt works. But: C:\aaa\<b repeated 140 times>\<d repeated 140 times>\..\<c repeated 140 times>\abc.txt fails with an invalid path name. I'm open to dropping Windows as a supported platform. :-) Cheers, Alistair |
> On 2018-12-19, at 10:00 AM, Alistair Grant <[hidden email]> wrote: >> >> Good grief, that's huge. Is there no mechanism for querying the size to allow correct allocation? RISC OS has a number of similar APIs where you basically do >> val = thingyCall(param1, paramN, 0); >> allocate(buffer, val); >> result = thingyCall(param1, paramN, *buffer); > > Not directly. This is used to iterate over a directory, and of course > there's no way to know in advance what the longest file name in the > directory is. Well that's what the RISC OS approach is explicitly for and as I'm sure you can see it is very useful for this stuff. > > 32K is the maximum path length, I have to admit I'm not sure what the > maximum file name length is on Windows. The directory name is known > in advance, so if the maximum file name length is shorter, the buffer > size could be reduced. But it may be different for different file > systems, so I'm not sure that we can be confident. > > The whole thing is actually quite a mess, e.g. opening a short path (< > 260 characters) works as expected (this is assuming the files really > do exist in all examples below): OK, it seems that you have to explicitly enable the longer path lengths and so you may be falling foul of that. Supposedly MAX_PATH tells you but I haven't seen anything that tells me if the macro value gets changed if you enable long paths (and indeed, unless it is actually a variable, how could it change). Surely ther must be some variable or call that tells you the correct value; even Microsoft isn't *that* useless. tim -- tim Rowledge; [hidden email]; http://www.rowledge.org/tim Logic: The art of being wrong with confidence... |
Hi Tim, On Wed, 19 Dec 2018 at 19:13, tim Rowledge <[hidden email]> wrote: > > > On 2018-12-19, at 10:00 AM, Alistair Grant <[hidden email]> wrote: > >> > > The whole thing is actually quite a mess, e.g. opening a short path (< > > 260 characters) works as expected (this is assuming the files really > > do exist in all examples below): > > OK, it seems that you have to explicitly enable the longer path lengths and so you may be falling foul of that. Yep, that's the long path prefix (\\?\). Whether you use that or not depends on the variant of call you are using, the length of the path and the version of Windows you're running. Paths < MAX_PATH-12 never use the LPP. For longer paths it depends on the call and the version of Windows that you are running. But the difference isn't that it is a long path, but that the relative reference (..) appears in a long path. > Supposedly MAX_PATH tells you but I haven't seen anything that tells me if the macro value gets changed if you enable long paths (and indeed, unless it is actually a variable, how could it change). Surely ther must be some variable or call that tells you the correct value; even Microsoft isn't *that* useless. The MS documentation says that you need to get the volume information to accurately determine the actual maximum path length. And then it is a matter of using calls that support long paths (e.g. the posix compliant calls don't) and adding the LPP when required. Cheers, Alistair |
I think all I can really do is offer my sympathies at this point; I remember with shudders that similar problems of different rules for different volumes/fstypes on Win/NT plagued me when I rewrote the filename classes for VW back in '94 or so. tim -- tim Rowledge; [hidden email]; http://www.rowledge.org/tim Liability: a valuable political skill |
In reply to this post by alistairgrant
> On 19.12.2018, at 19:36, Alistair Grant <[hidden email]> wrote: > > > Hi Tim, > > On Wed, 19 Dec 2018 at 19:13, tim Rowledge <[hidden email]> wrote: >> >>> On 2018-12-19, at 10:00 AM, Alistair Grant <[hidden email]> wrote: >>>> >>> The whole thing is actually quite a mess, e.g. opening a short path (< >>> 260 characters) works as expected (this is assuming the files really >>> do exist in all examples below): >> >> OK, it seems that you have to explicitly enable the longer path lengths and so you may be falling foul of that. > > Yep, that's the long path prefix (\\?\). Whether you use that or not > depends on the variant of call you are using, the length of the path > and the version of Windows you're running. Paths < MAX_PATH-12 never > use the LPP. For longer paths it depends on the call and the version > of Windows that you are running. > > But the difference isn't that it is a long path, but that the relative > reference (..) appears in a long path. > > >> Supposedly MAX_PATH tells you but I haven't seen anything that tells me if the macro value gets changed if you enable long paths (and indeed, unless it is actually a variable, how could it change). Surely ther must be some variable or call that tells you the correct value; even Microsoft isn't *that* useless. > > The MS documentation says that you need to get the volume information > to accurately determine the actual maximum path length. And then it > is a matter of using calls that support long paths (e.g. the posix > compliant calls don't) and adding the LPP when required. I wrote a small helper for that a few moons ago: https://github.com/OpenSmalltalk/opensmalltalk-vm/blob/476f70605a0352dd7528d251f7403e9233716cdb/platforms/win32/plugins/FilePlugin/sqWin32File.h#L33 However, I see that struct fapathstruct defines a preallocated winpathLPP of 32768 + 4 WCHARs. I think an approach similar to that above would cut that to just a pointer, right? MultiByteToWideChar tells you how much it needs when you give it a NULL outpointer… Best regards -Tobias PS: please don't use strcpy/strlen but strnlen and memcpy (like in line 101) :) > > Cheers, > Alistair |
Hi Tobias & Tim, On Wed, Dec 19, 2018 at 10:52:11AM -0800, tim Rowledge wrote: > > > I think all I can really do is offer my sympathies at this point; I remember with shudders that similar problems of different rules for different volumes/fstypes on Win/NT plagued me when I rewrote the filename classes for VW back in '94 or so. Thanks :-) But VW did (still does?) have quite a nice model for the file system. It didn't take much for me to adapt it to VMS, which uses a somewhat different naming convention to Windows and Unix. This was around about the same time - VW2.5. On Wed, Dec 19, 2018 at 08:00:39PM +0100, Tobias Pape wrote: > > > > On 19.12.2018, at 19:36, Alistair Grant <[hidden email]> wrote: > > > > > > The MS documentation says that you need to get the volume information > > to accurately determine the actual maximum path length. And then it > > is a matter of using calls that support long paths (e.g. the posix > > compliant calls don't) and adding the LPP when required. > > I wrote a small helper for that a few moons ago: > https://github.com/OpenSmalltalk/opensmalltalk-vm/blob/476f70605a0352dd7528d251f7403e9233716cdb/platforms/win32/plugins/FilePlugin/sqWin32File.h#L33 Thanks for this! Writing FileAttributesPlugin is the first time I've done any C programming on Windows and this has been useful in understanding the path peculiarities on Windows. > However, I see that struct fapathstruct defines a preallocated > winpathLPP of 32768 + 4 WCHARs. I think an approach similar to that > above would cut that to just a pointer, right? MultiByteToWideChar > tells you how much it needs when you give it a NULL outpointer??? As mentioned above, this is used while iterating over directories, so the full path length isn't known in advance. I guess it would be possible to reallocate the space for every file, but that feels inefficient. In fact it raises something I've been wondering about: My current approach does allocate memory that mostly isn't used. Most of the time this is on the stack, so the computational cost of allocation is relatively cheap. The approach of calling MultiByteToWideChar to determine the length and then allocate the appropriate amount of memory means that the system has to decode the entire UTF8 path to calculate the buffer size. Does anyone have a sense of the overall performance comparison between the two approaches? I would guess that there isn't much in it. Cheers, Alistair |
On Wed, Dec 19, 2018 at 08:36:47PM +0000, Alistair Grant wrote: > > Hi Tobias & Tim, > > On Wed, Dec 19, 2018 at 10:52:11AM -0800, tim Rowledge wrote: > > > > > > I think all I can really do is offer my sympathies at this point; I remember with shudders that similar problems of different rules for different volumes/fstypes on Win/NT plagued me when I rewrote the filename classes for VW back in '94 or so. > > Thanks :-) > > But VW did (still does?) have quite a nice model for the file system. > It didn't take much for me to adapt it to VMS, which uses a somewhat > different naming convention to Windows and Unix. This was around about > the same time - VW2.5. Oh cool, a VMS guy :-) We've been needing an OpenVMS port for years! Dave |
Free forum by Nabble | Edit this page |