Garage and SQLite file databases, general project structure

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
11 messages Options
Reply | Threaded
Open this post in threaded view
|

Garage and SQLite file databases, general project structure

Torsten Bergmann
Hi Guille,

I had a short look at Garage [1]. Nice initiative!

Looks like using a file is not supported for the SQLite3 backend in Garage:

   GADriver fromConnectionString: 'sqlite3://file://C:/temp/test.db'

   GADriver fromConnectionString: 'sqlite3://C:/temp/test.db'


Maybe GASqlite3Driver>>fromConnectionString: could be adopted, but the
conversion to a Zn URL already removes some stuff. (Ab)using the schema in the URL
as a driverID is nice - but prevents the use of file:// schema.

How can one construct a file db?


Also I do not understand: while "Garage-Postgres"  uses the Postgres
project PostgresV2 as a backend the "Garage-Sqlite3" does not use
NBSqlite3 or SQlite3 projects as backend.

So "GASqlite3FFI" copies many methods of the original NBSQLite3 project from Pierce and myself
and is therefore like an own fork now.
Also the name FFI is misleading as it uses NativeBoost instead of FFI ...

Also it would be good if the project would be layered, we could have a
"Garage-Core" and "Garage-Tests-Core" with "Garage-Mysql" and "Garage-Tests-Mysql" for
the fixtures. The idea is always to be able to load runtime code without tests
for production code.
 
Thanks
Torsten

[1] http://lists.pharo.org/pipermail/pharo-users_lists.pharo.org/2015-May/019192.html

Reply | Threaded
Open this post in threaded view
|

Re: Garage and SQLite file databases, general project structure

Pierce Ng-3
On Tue, May 12, 2015 at 08:35:23PM +0200, Torsten Bergmann wrote:
> Also I do not understand: while "Garage-Postgres"  uses the Postgres
> project PostgresV2 as a backend the "Garage-Sqlite3" does not use
> NBSqlite3 or SQlite3 projects as backend.
>
> So "GASqlite3FFI" copies many methods of the original NBSQLite3 project from
> Pierce and myself and is therefore like an own fork now.

Yes, I agree this is not the best approach. I've just added support for
SQLite's online backup API to NBSQLite3. (See my blog.) Garage's forking means
it does not directly benefit from improvements to NBSQLite3. Now, this being
Smalltalk, I reckon that some metaprogramming magic will allow me to add the
new stuff in NBSQLite3 to Garage, but overtime as the code bases diverge this
gets difficult.

Pierce


Reply | Threaded
Open this post in threaded view
|

Re: Garage and SQLite file databases, general project structure

Guillermo Polito
In reply to this post by Torsten Bergmann
Hi Torsten,

Thanks for taking a look, I'll answer between lines :).

El mar., 12 de may. de 2015 a la(s) 8:35 p. m., Torsten Bergmann <[hidden email]> escribió:
Hi Guille,

I had a short look at Garage [1]. Nice initiative!

Looks like using a file is not supported for the SQLite3 backend in Garage:

   GADriver fromConnectionString: 'sqlite3://file://C:/temp/test.db'

   GADriver fromConnectionString: 'sqlite3://C:/temp/test.db'


Maybe GASqlite3Driver>>fromConnectionString: could be adopted, but the
conversion to a Zn URL already removes some stuff. (Ab)using the schema in the URL
as a driverID is nice - but prevents the use of file:// schema.

How can one construct a file db?

Well, I wanted in general to have a single easy-to-learn way to build a database connection string for all drivers.

[driverid]://[host][:port]/[databasename]?[properties]

Then, for sqlite I would like to have:

sqlite3://myDirectory/myfile.db

or

sqlite3://memory

Now, for Garage 0.1 I was using ZnUrl to parse the connection strings and that is not working for urls like the following because if the schema is not 'file' then the parser fails to to transform the port (an empty string in this case for the parser) to an integer

sqlite3://c:/myDirectory/myfile.db

In bleeding edge I am using an alternative parser that does no conversions and allows urls like that one.
 
Also I do not understand: while "Garage-Postgres"  uses the Postgres
project PostgresV2 as a backend the "Garage-Sqlite3" does not use
NBSqlite3 or SQlite3 projects as backend.

Well, the main rationale behind this is that the public APIs of NBSqlite3 and Garage are not compatible while the one of Postgres is. I did not want to insert changes into NBSqlite3 that could break users. On the other hand, I could *patch* Postgres in the first version with some extension methods.

Then, there are some other reasons/problems that bugged me:

  - I noticed while doing garage that even if the drivers are different in the very end, there are similarities in structure and code between them. And I wanted to reuse that code and put it into only one place instead of duplicate it for each driver, because it is easier to maintain. In that sense the Garage-Core driver is not just *an API* but has a bit of common behavior e.g., streaming of resultsets, some common superclasses.
 
- Garage only needs its drivers to respect the common interface. I'd like to minimize special database features (unless unavoidable). In the case they already exist, I'd not like to publicize them as part of the Garage API because of obvious reasons.

  - A minor thing but that bothers me nonetheless: An alternative implementation would have been an adapting layer to communicate with the existing driver(s). This looked also as an overkill in maintenance. Also, some drivers such as mysql's was apparently not actively maintained so I would have to maintain a layer of adaptors and the driver itself. 


So "GASqlite3FFI" copies many methods of the original NBSQLite3 project from Pierce and myself
and is therefore like an own fork now.
Also the name FFI is misleading as it uses NativeBoost instead of FFI ...

Yes, it's like a fork now. Maybe we can split it into another package so I can use it in garage easily?
 
Also it would be good if the project would be layered, we could have a
"Garage-Core" and "Garage-Tests-Core" with "Garage-Mysql" and "Garage-Tests-Mysql" for
the fixtures. The idea is always to be able to load runtime code without tests
for production code.

Well, if that is the desire of the majority, we can do it. For the moment I wanted to keep simple the metacello configurations also, so it was my trade-off to not split the packages for simplicity ^^.
 
Thanks, and sorry for the long email,
Guille
Reply | Threaded
Open this post in threaded view
|

Re: Garage and SQLite file databases, general project structure

Guillermo Polito
In reply to this post by Pierce Ng-3
Hi Pierce,

El sáb., 16 de may. de 2015 a la(s) 10:52 a. m., Pierce Ng <[hidden email]> escribió:
On Tue, May 12, 2015 at 08:35:23PM +0200, Torsten Bergmann wrote:
> Also I do not understand: while "Garage-Postgres"  uses the Postgres
> project PostgresV2 as a backend the "Garage-Sqlite3" does not use
> NBSqlite3 or SQlite3 projects as backend.
>
> So "GASqlite3FFI" copies many methods of the original NBSQLite3 project from
> Pierce and myself and is therefore like an own fork now.

Yes, I agree this is not the best approach. I've just added support for
SQLite's online backup API to NBSQLite3. (See my blog.) Garage's forking means
it does not directly benefit from improvements to NBSQLite3. Now, this being
Smalltalk, I reckon that some metaprogramming magic will allow me to add the
new stuff in NBSQLite3 to Garage, but overtime as the code bases diverge this
gets difficult.

In general Garage will not benefit of sqlite specific features, because it should be as backend agnostic as possible.
But as I said in the other email, maybe we could arrive to a packaging that suits both of us and decreases our efforts :).
 

Pierce

Reply | Threaded
Open this post in threaded view
|

Re: Garage and SQLite file databases, general project structure

Sven Van Caekenberghe-2
In reply to this post by Guillermo Polito
Hi Guile,

> On 18 May 2015, at 10:48, Guillermo Polito <[hidden email]> wrote:
>
> Well, I wanted in general to have a single easy-to-learn way to build a database connection string for all drivers.
>
> [driverid]://[host][:port]/[databasename]?[properties]
>
> Then, for sqlite I would like to have:
>
> sqlite3://myDirectory/myfile.db
>
> or
>
> sqlite3://memory
>
> Now, for Garage 0.1 I was using ZnUrl to parse the connection strings and that is not working for urls like the following because if the schema is not 'file' then the parser fails to to transform the port (an empty string in this case for the parser) to an integer
>
> sqlite3://c:/myDirectory/myfile.db
>
> In bleeding edge I am using an alternative parser that does no conversions and allows urls like that one.

I understand the reasoning here, but I would suggest that you at least consider changing your mind. Over time, a lot of work went into ZnUrl, and many people used it and helped fixing many problems. Recently we tried to improve the situation for special schemes a bit (for git URLs). We're not yet there, but you (your use case) could help improving the situation.

I tried to take a principled approach regarding the specs (not that there is one clear spec, haha), but edge cases should still be possible.

Consider the following:

'postgresql://localhost:14433/test-db?username=john&password=123456' asUrl.
'postgresql://john:123456@localhost:14433/test-db?encoding=utf8' asUrl.
'postgresql://localhost/default' asUrl.

'sqlite3:///myDirectory/myfile.db' asUrl.
'sqlite3://memory' asUrl.
'sqlite3:///c:/myDirectory/myfile.db' asUrl.

According to me, these all parse correctly. Of course, ZnUrl does not know or understand what is possibly special about any particular scheme (apart from HTTP(S)). Still, it is just an object that can be manipulated.

The default port can be computed along the lines of #portOrDefault, maybe we could add a #portIfAbsent: ?

Conversion to a file references can be done as follows:

'sqlite3:///c:/myDirectory/myfile.db' asUrl
  copy scheme: #file; asFileReference.

Which basically reads as: 'I know this is not a file:// URL proper, still I want to interpret it as if it were one'. Maybe we could add a #forceAsFileReference ?

Anyway, you get the idea. You know I am willing to help.

Sven




Reply | Threaded
Open this post in threaded view
|

Re: Garage and SQLite file databases, general project structure

Guillermo Polito
Hi Sven, I'll take that into account. I did not think about using an empty host/port pair, maybe because of the extra slash.

Also, I do not need to convert the file name to a file reference. That is done in sqlite by the sqlite library itself, so it's less work for the driver. The only need I need to do is:
 - for nbsqlite get the file name
 - for opendbx get the full file name and split into directory + file name because they are two different arguments for the library.

Now, having a look at what we can do with znurl

In windows it looks that we are ok
'sqlite3:///c:/myDirectory/myfile.db' asUrl path.
     "'c:/myDirectory/myfile.db'"
'sqlite3:///c:\myDirectory\myfile.db' asUrl path.
     "'c:\myDirectory\myfile.db'"

But in linux even if I put extra slashes it keeps removing them... And it makes difficult to recognize relative from absolute file paths:

"Relative path c/myDirectory/myfile.db OK"
'sqlite3:///c/myDirectory/myfile.db' asUrl path.
     "'c/myDirectory/myfile.db'"

"Absolute path /c/myDirectory/myfile.db not OK"
'sqlite3:///c/myDirectory/myfile.db' asUrl path.
     "'c/myDirectory/myfile.db'"

Adding Slashes
'sqlite3:////////c/myDirectory/myfile.db' asUrl path.
     "'c/myDirectory/myfile.db'"

Another concern is that garage as it is works in Pharo 3, 4 and 5. Then if we update Zn I'll have to add Zn to my dependencies and load the correct version accordingly, what do you think? (I prefer that to backporting)

Guille 


El lun., 18 de may. de 2015 a la(s) 11:27 a. m., Sven Van Caekenberghe <[hidden email]> escribió:
Hi Guile,

> On 18 May 2015, at 10:48, Guillermo Polito <[hidden email]> wrote:
>
> Well, I wanted in general to have a single easy-to-learn way to build a database connection string for all drivers.
>
> [driverid]://[host][:port]/[databasename]?[properties]
>
> Then, for sqlite I would like to have:
>
> sqlite3://myDirectory/myfile.db
>
> or
>
> sqlite3://memory
>
> Now, for Garage 0.1 I was using ZnUrl to parse the connection strings and that is not working for urls like the following because if the schema is not 'file' then the parser fails to to transform the port (an empty string in this case for the parser) to an integer
>
> sqlite3://c:/myDirectory/myfile.db
>
> In bleeding edge I am using an alternative parser that does no conversions and allows urls like that one.

I understand the reasoning here, but I would suggest that you at least consider changing your mind. Over time, a lot of work went into ZnUrl, and many people used it and helped fixing many problems. Recently we tried to improve the situation for special schemes a bit (for git URLs). We're not yet there, but you (your use case) could help improving the situation.

I tried to take a principled approach regarding the specs (not that there is one clear spec, haha), but edge cases should still be possible.

Consider the following:

'postgresql://localhost:14433/test-db?username=john&password=123456' asUrl.
'postgresql://john:123456@localhost:14433/test-db?encoding=utf8' asUrl.
'postgresql://localhost/default' asUrl.

'sqlite3:///myDirectory/myfile.db' asUrl.
'sqlite3://memory' asUrl.
'sqlite3:///c:/myDirectory/myfile.db' asUrl.

According to me, these all parse correctly. Of course, ZnUrl does not know or understand what is possibly special about any particular scheme (apart from HTTP(S)). Still, it is just an object that can be manipulated.

The default port can be computed along the lines of #portOrDefault, maybe we could add a #portIfAbsent: ?

Conversion to a file references can be done as follows:

'sqlite3:///c:/myDirectory/myfile.db' asUrl
  copy scheme: #file; asFileReference.

Which basically reads as: 'I know this is not a file:// URL proper, still I want to interpret it as if it were one'. Maybe we could add a #forceAsFileReference ?

Anyway, you get the idea. You know I am willing to help.

Sven




Reply | Threaded
Open this post in threaded view
|

Re: Garage and SQLite file databases, general project structure

Sven Van Caekenberghe-2

> On 18 May 2015, at 12:13, Guillermo Polito <[hidden email]> wrote:
>
> Hi Sven, I'll take that into account. I did not think about using an empty host/port pair, maybe because of the extra slash.
>
> Also, I do not need to convert the file name to a file reference. That is done in sqlite by the sqlite library itself, so it's less work for the driver. The only need I need to do is:
>  - for nbsqlite get the file name
>  - for opendbx get the full file name and split into directory + file name because they are two different arguments for the library.

I think that if you want to view (part of) a URL as a file system path and/or if you want to work with that path (splitting it directory/file for example), converting to a file reference is a good idea. ZnUrl can do file interpretation to some degree (#directory & #file, #isDirectoryPath & #isFilePath), but not very well (it is not its primary application).

ZnUrl>>#path is actually not the best selector to use. It uses the internal representation. Like #directory & #file, #isDirectoryPath & #isFilePath, these leak some internals. ZnUrl>>#pathString is better, and will give you a leading $/ while encoding everything correctly.

Also, relative File URLs simply do not exist. But there is #withRelativeReference: which follows the relevant specs (check the tests).

FileReference should also deal better with platform issue, ZnURL is much more abstract, by definition.

I hope this makes sense ;-)

> Now, having a look at what we can do with znurl
>
> In windows it looks that we are ok
> 'sqlite3:///c:/myDirectory/myfile.db' asUrl path.
>      "'c:/myDirectory/myfile.db'"
> 'sqlite3:///c:\myDirectory\myfile.db' asUrl path.
>      "'c:\myDirectory\myfile.db'"

See also #testWindowsDriveNamesInFileUrl

> But in linux even if I put extra slashes it keeps removing them... And it makes difficult to recognize relative from absolute file paths:
>
> "Relative path c/myDirectory/myfile.db OK"
> 'sqlite3:///c/myDirectory/myfile.db' asUrl path.
>      "'c/myDirectory/myfile.db'"
>
> "Absolute path /c/myDirectory/myfile.db not OK"
> 'sqlite3:///c/myDirectory/myfile.db' asUrl path.
>      "'c/myDirectory/myfile.db'"
>
> Adding Slashes
> 'sqlite3:////////c/myDirectory/myfile.db' asUrl path.
>      "'c/myDirectory/myfile.db'"
>
> Another concern is that garage as it is works in Pharo 3, 4 and 5. Then if we update Zn I'll have to add Zn to my dependencies and load the correct version accordingly, what do you think? (I prefer that to back porting)

Just depend on the latest stable release of Zinc, it is (still) supported for 2.0 and up.

> Guille
>
>
> El lun., 18 de may. de 2015 a la(s) 11:27 a. m., Sven Van Caekenberghe <[hidden email]> escribió:
> Hi Guile,
>
> > On 18 May 2015, at 10:48, Guillermo Polito <[hidden email]> wrote:
> >
> > Well, I wanted in general to have a single easy-to-learn way to build a database connection string for all drivers.
> >
> > [driverid]://[host][:port]/[databasename]?[properties]
> >
> > Then, for sqlite I would like to have:
> >
> > sqlite3://myDirectory/myfile.db
> >
> > or
> >
> > sqlite3://memory
> >
> > Now, for Garage 0.1 I was using ZnUrl to parse the connection strings and that is not working for urls like the following because if the schema is not 'file' then the parser fails to to transform the port (an empty string in this case for the parser) to an integer
> >
> > sqlite3://c:/myDirectory/myfile.db
> >
> > In bleeding edge I am using an alternative parser that does no conversions and allows urls like that one.
>
> I understand the reasoning here, but I would suggest that you at least consider changing your mind. Over time, a lot of work went into ZnUrl, and many people used it and helped fixing many problems. Recently we tried to improve the situation for special schemes a bit (for git URLs). We're not yet there, but you (your use case) could help improving the situation.
>
> I tried to take a principled approach regarding the specs (not that there is one clear spec, haha), but edge cases should still be possible.
>
> Consider the following:
>
> 'postgresql://localhost:14433/test-db?username=john&password=123456' asUrl.
> 'postgresql://john:123456@localhost:14433/test-db?encoding=utf8' asUrl.
> 'postgresql://localhost/default' asUrl.
>
> 'sqlite3:///myDirectory/myfile.db' asUrl.
> 'sqlite3://memory' asUrl.
> 'sqlite3:///c:/myDirectory/myfile.db' asUrl.
>
> According to me, these all parse correctly. Of course, ZnUrl does not know or understand what is possibly special about any particular scheme (apart from HTTP(S)). Still, it is just an object that can be manipulated.
>
> The default port can be computed along the lines of #portOrDefault, maybe we could add a #portIfAbsent: ?
>
> Conversion to a file references can be done as follows:
>
> 'sqlite3:///c:/myDirectory/myfile.db' asUrl
>   copy scheme: #file; asFileReference.
>
> Which basically reads as: 'I know this is not a file:// URL proper, still I want to interpret it as if it were one'. Maybe we could add a #forceAsFileReference ?
>
> Anyway, you get the idea. You know I am willing to help.
>
> Sven
>
>
>
>


Reply | Threaded
Open this post in threaded view
|

Re: Garage and SQLite file databases, general project structure

Ben Coman


On Mon, May 18, 2015 at 7:06 PM, Sven Van Caekenberghe <[hidden email]> wrote:

> On 18 May 2015, at 12:13, Guillermo Polito <[hidden email]> wrote:
>
> Hi Sven, I'll take that into account. I did not think about using an empty host/port pair, maybe because of the extra slash.
>
> Also, I do not need to convert the file name to a file reference. That is done in sqlite by the sqlite library itself, so it's less work for the driver. The only need I need to do is:
>  - for nbsqlite get the file name
>  - for opendbx get the full file name and split into directory + file name because they are two different arguments for the library.

I think that if you want to view (part of) a URL as a file system path and/or if you want to work with that path (splitting it directory/file for example), converting to a file reference is a good idea. ZnUrl can do file interpretation to some degree (#directory & #file, #isDirectoryPath & #isFilePath), but not very well (it is not its primary application).

ZnUrl>>#path is actually not the best selector to use. It uses the internal representation. Like #directory & #file, #isDirectoryPath & #isFilePath, these leak some internals. ZnUrl>>#pathString is better, and will give you a leading $/ while encoding everything correctly.

Also, relative File URLs simply do not exist.

This sparked my interest to read RFC3986 (https://tools.ietf.org/html/rfc3986#section-5.2.4)
which says: 

     A relative reference that begins with a single slash character is termed 
     an absolute-path reference.  A relative reference that does not begin with 
     a slash character is termed a relative-path reference."

     The path segments "." and "..", also known as dot-segments, are defined 
     for relative reference within the path name hierarchy. They are intended for 
     use at the beginning of a relative-path reference to indicate relative position 
     within the hierarchical tree of names."

     Within a representation with a well defined Base URI of
           "http://a/b/c/d;p?q"
      a relative reference is transformed to its Target URI as follows.
           "g:h"           =  "g:h"
           "g"             =  "http://a/b/c/g"
           "./g"           =  "http://a/b/c/g"
           "../g"          =  "http://a/b/g"
           "/g"            =  "http://a/g"

So slightly as an aside, this makes me wonder if we should consider that when the Image is a opened as a file, its Base URI can be considered to be 
          'file:///b/c/Pharo.image'
such that its reasonable to have the following relative references transformed as follows: 
          'g:h'   asUrl  asString    "-->  'g:h'
          'g'      asUrl  asString    "-->  'file:///b/c/g'
          './g'    asUrl  asString    "-->  'file:///b/c/g'
          '../g'   asUrl  asString    "-->  'file:///b/g'
          '/g'     asUrl  asString    "-->  'file:///g'

Effectively, the default scheme is  #file  rather than  nil,  as it is currently.


RFC3986 5.2.2.  "Transform References" shows a non-strict case where it seems that following intuitive example is valid
        'file:///./g' asUrl  asString    "-->  file:///b/c/g'  "
per this line...
         T.path = merge(Base.path, R.path);


RFC3986 5.1.2.  "Base URI from the Encapsulating Entity" indicates we could consider the Base URI for the sqlite3 scheme to derive from the Image's  'file:///b/c/Pharo.image'  URI to be...
          'sqlite3:///b/c/Pharo.image'
such that per 5.2.2. the following would be valid...
           'sqllite3:///./g' asUrl  asString    "-->  sqllite3:///b/c/g'  "
 

All that is from a patchy skim of RFC3986, so maybe I got it wrong, but maybe its a view worth considering.


 
But there is #withRelativeReference: which follows the relevant specs (check the tests).

FileReference should also deal better with platform issue, ZnURL is much more abstract, by definition.

I hope this makes sense ;-)

> Now, having a look at what we can do with znurl
>
> In windows it looks that we are ok
> 'sqlite3:///c:/myDirectory/myfile.db' asUrl path.
>      "'c:/myDirectory/myfile.db'"
> 'sqlite3:///c:\myDirectory\myfile.db' asUrl path.
>      "'c:\myDirectory\myfile.db'"

See also #testWindowsDriveNamesInFileUrl

> But in linux even if I put extra slashes it keeps removing them... And it makes difficult to recognize relative from absolute file paths:
>
> "Relative path c/myDirectory/myfile.db OK"
> 'sqlite3:///c/myDirectory/myfile.db' asUrl path.
>      "'c/myDirectory/myfile.db'"
>
> "Absolute path /c/myDirectory/myfile.db not OK"
> 'sqlite3:///c/myDirectory/myfile.db' asUrl path.
>      "'c/myDirectory/myfile.db'"
>
> Adding Slashes
> 'sqlite3:////////c/myDirectory/myfile.db' asUrl path.
>      "'c/myDirectory/myfile.db'"
>


An intuitive semantic would be...
     ''sqlite3:///./c/myDirectory/myfile.db' asUrl pathPrintString  
        "--> /my/pharoImagePath/c/myDirectory/myfile.db"


 
> Another concern is that garage as it is works in Pharo 3, 4 and 5. Then if we update Zn I'll have to add Zn to my dependencies and load the correct version accordingly, what do you think? (I prefer that to back porting)

Just depend on the latest stable release of Zinc, it is (still) supported for 2.0 and up.

> Guille
>
>
> El lun., 18 de may. de 2015 a la(s) 11:27 a. m., Sven Van Caekenberghe <[hidden email]> escribió:
> Hi Guile,
>
> > On 18 May 2015, at 10:48, Guillermo Polito <[hidden email]> wrote:
> >
> > Well, I wanted in general to have a single easy-to-learn way to build a database connection string for all drivers.
> >
> > [driverid]://[host][:port]/[databasename]?[properties]
> >
> > Then, for sqlite I would like to have:
> >
> > sqlite3://myDirectory/myfile.db
> >
> > or
> >
> > sqlite3://memory
> >
> > Now, for Garage 0.1 I was using ZnUrl to parse the connection strings and that is not working for urls like the following because if the schema is not 'file' then the parser fails to to transform the port (an empty string in this case for the parser) to an integer
> >
> > sqlite3://c:/myDirectory/myfile.db
> >
> > In bleeding edge I am using an alternative parser that does no conversions and allows urls like that one.
>
> I understand the reasoning here, but I would suggest that you at least consider changing your mind. Over time, a lot of work went into ZnUrl, and many people used it and helped fixing many problems. Recently we tried to improve the situation for special schemes a bit (for git URLs). We're not yet there, but you (your use case) could help improving the situation.
>
> I tried to take a principled approach regarding the specs (not that there is one clear spec, haha), but edge cases should still be possible.
>
> Consider the following:
>
> 'postgresql://localhost:14433/test-db?username=john&password=123456' asUrl.
> 'postgresql://john:123456@localhost:14433/test-db?encoding=utf8' asUrl.
> 'postgresql://localhost/default' asUrl.
>
> 'sqlite3:///myDirectory/myfile.db' asUrl.
> 'sqlite3://memory' asUrl.


Would memory have a hierarchical structure?  i.e. if I wanted to run two sqllite3 instances in memory? 

 
> 'sqlite3:///c:/myDirectory/myfile.db' asUrl.
>
> According to me, these all parse correctly. Of course, ZnUrl does not know or understand what is possibly special about any particular scheme (apart from HTTP(S)). Still, it is just an object that can be manipulated.
>
> The default port can be computed along the lines of #portOrDefault, maybe we could add a #portIfAbsent: ?
>
> Conversion to a file references can be done as follows:
>
> 'sqlite3:///c:/myDirectory/myfile.db' asUrl
>   copy scheme: #file; asFileReference.
>
> Which basically reads as: 'I know this is not a file:// URL proper, still I want to interpret it as if it were one'. Maybe we could add a #forceAsFileReference ?
>
> Anyway, you get the idea. You know I am willing to help.
>
> Sven
>
>
>
>


Looking through Zinc I notice that the schemes are symbols.  Any plans to make these classes?  To maybe(?) make it easier to extend to other schemes (http://www.iana.org/assignments/uri-schemes/uri-schemes.xhtml) ?  Or do you consider that outside the scope of Zinc ?

cheers -ben



Reply | Threaded
Open this post in threaded view
|

Re: Garage and SQLite file databases, general project structure

Sven Van Caekenberghe-2
Ben,

(comments inline)

> On 18 May 2015, at 17:46, Ben Coman <[hidden email]> wrote:
>
>
>
> On Mon, May 18, 2015 at 7:06 PM, Sven Van Caekenberghe <[hidden email]> wrote:
>
> > On 18 May 2015, at 12:13, Guillermo Polito <[hidden email]> wrote:
> >
> > Hi Sven, I'll take that into account. I did not think about using an empty host/port pair, maybe because of the extra slash.
> >
> > Also, I do not need to convert the file name to a file reference. That is done in sqlite by the sqlite library itself, so it's less work for the driver. The only need I need to do is:
> >  - for nbsqlite get the file name
> >  - for opendbx get the full file name and split into directory + file name because they are two different arguments for the library.
>
> I think that if you want to view (part of) a URL as a file system path and/or if you want to work with that path (splitting it directory/file for example), converting to a file reference is a good idea. ZnUrl can do file interpretation to some degree (#directory & #file, #isDirectoryPath & #isFilePath), but not very well (it is not its primary application).
>
> ZnUrl>>#path is actually not the best selector to use. It uses the internal representation. Like #directory & #file, #isDirectoryPath & #isFilePath, these leak some internals. ZnUrl>>#pathString is better, and will give you a leading $/ while encoding everything correctly.
>
> Also, relative File URLs simply do not exist.
>
> This sparked my interest to read RFC3986 (https://tools.ietf.org/html/rfc3986#section-5.2.4)
> which says:
>
>      A relative reference that begins with a single slash character is termed
>      an absolute-path reference.  A relative reference that does not begin with
>      a slash character is termed a relative-path reference."
>
>      The path segments "." and "..", also known as dot-segments, are defined
>      for relative reference within the path name hierarchy. They are intended for
>      use at the beginning of a relative-path reference to indicate relative position
>      within the hierarchical tree of names."
>
>      Within a representation with a well defined Base URI of
>            "http://a/b/c/d;p?q"
>       a relative reference is transformed to its Target URI as follows.
>            "g:h"           =  "g:h"
>            "g"             =  "http://a/b/c/g"
>            "./g"           =  "http://a/b/c/g"
>            "../g"          =  "http://a/b/g"
>            "/g"            =  "http://a/g"
>
> So slightly as an aside, this makes me wonder if we should consider that when the Image is a opened as a file, its Base URI can be considered to be
>           'file:///b/c/Pharo.image'
> such that its reasonable to have the following relative references transformed as follows:
>           'g:h'   asUrl  asString    "-->  'g:h'
>           'g'      asUrl  asString    "-->  'file:///b/c/g'
>           './g'    asUrl  asString    "-->  'file:///b/c/g'
>           '../g'   asUrl  asString    "-->  'file:///b/g'
>           '/g'     asUrl  asString    "-->  'file:///g'
>
> Effectively, the default scheme is  #file  rather than  nil,  as it is currently.
>
>
> RFC3986 5.2.2.  "Transform References" shows a non-strict case where it seems that following intuitive example is valid
>         'file:///./g' asUrl  asString    "-->  file:///b/c/g'  "
> per this line...
>          T.path = merge(Base.path, R.path);
>
>
> RFC3986 5.1.2.  "Base URI from the Encapsulating Entity" indicates we could consider the Base URI for the sqlite3 scheme to derive from the Image's  'file:///b/c/Pharo.image'  URI to be...
>           'sqlite3:///b/c/Pharo.image'
> such that per 5.2.2. the following would be valid...
>            'sqllite3:///./g' asUrl  asString    "-->  sqllite3:///b/c/g'  "
>  
>
> All that is from a patchy skim of RFC3986, so maybe I got it wrong, but maybe its a view worth considering.

I made the following commits:

===
Name: Zinc-Resource-Meta-Core-SvenVanCaekenberghe.53
Author: SvenVanCaekenberghe
Time: 19 May 2015, 4:31:02.297452 pm
UUID: 55fd5688-02f5-4ac7-b112-3ef2721000e5
Ancestors: Zinc-Resource-Meta-Core-SvenVanCaekenberghe.52

Extend the ZnUrl API a bit, for convenience:

        ZnUrl class>>#image returns a File URL to the current image (Thx Ben Coman)
        ZnUrl>>#+ as an alias for ZnUrl>>#withRelativeReference:
        ZnUrl>>#asFileUrl return a copy with the scheme changed to file://
        ZnUrl>>#portIfAbsent: to handle missing ports
       
Add some unit tests for the new API
===
Name: Zinc-Resource-Meta-Tests-SvenVanCaekenberghe.36
Author: SvenVanCaekenberghe
Time: 19 May 2015, 4:31:30.415175 pm
UUID: a4779a63-c75e-4cae-b35e-c9a5861a68eb
Ancestors: Zinc-Resource-Meta-Tests-monty.35

Extend the ZnUrl API a bit, for convenience:

        ZnUrl class>>#image returns a File URL to the current image (Thx Ben Coman)
        ZnUrl>>#+ as an alias for ZnUrl>>#withRelativeReference:
        ZnUrl>>#asFileUrl return a copy with the scheme changed to file://
        ZnUrl>>#portIfAbsent: to handle missing ports
       
Add some unit tests for the new API
===
Name: Zinc-Resource-Meta-FileSystem-SvenVanCaekenberghe.8
Author: SvenVanCaekenberghe
Time: 19 May 2015, 4:32:04.050748 pm
UUID: a84ff5a8-584c-4160-93f2-0f305679a9e0
Ancestors: Zinc-Resource-Meta-FileSystem-SvenVanCaekenberghe.7

Add ZnFileUrlTests>>#testImage
===

This does not change the current default behaviour, but it makes it a lot easier to work with relative references:

ZnUrl image + 'foo.txt'

  => "file:///Users/sven/Develop/Smalltalk/foo.txt"

You just have to choose an appropriate base URL:

'postgresql://user:password@localhost:5432/default' asUrl + 'test'

  => "postgresql://user:password@localhost:5432/test"

However, when you override the host, the port (actually the whole authority) is reset as well (according to my interpretation of the RFC anyway), that is why I added #portIfAbsent:

'postgresql://production/user-db' asUrl portIfAbsent: 5432

  => 5432

I also added #asFileUrl to help in the common case of interpreting any scheme as referring to a file path. After that conversion, one will probably want to do an #asFileReference.
 

> But there is #withRelativeReference: which follows the relevant specs (check the tests).
>
> FileReference should also deal better with platform issue, ZnURL is much more abstract, by definition.
>
> I hope this makes sense ;-)
>
> > Now, having a look at what we can do with znurl
> >
> > In windows it looks that we are ok
> > 'sqlite3:///c:/myDirectory/myfile.db' asUrl path.
> >      "'c:/myDirectory/myfile.db'"
> > 'sqlite3:///c:\myDirectory\myfile.db' asUrl path.
> >      "'c:\myDirectory\myfile.db'"
>
> See also #testWindowsDriveNamesInFileUrl
>
> > But in linux even if I put extra slashes it keeps removing them... And it makes difficult to recognize relative from absolute file paths:
> >
> > "Relative path c/myDirectory/myfile.db OK"
> > 'sqlite3:///c/myDirectory/myfile.db' asUrl path.
> >      "'c/myDirectory/myfile.db'"
> >
> > "Absolute path /c/myDirectory/myfile.db not OK"
> > 'sqlite3:///c/myDirectory/myfile.db' asUrl path.
> >      "'c/myDirectory/myfile.db'"
> >
> > Adding Slashes
> > 'sqlite3:////////c/myDirectory/myfile.db' asUrl path.
> >      "'c/myDirectory/myfile.db'"
> >
>
>
> An intuitive semantic would be...
>      ''sqlite3:///./c/myDirectory/myfile.db' asUrl pathPrintString  
>         "--> /my/pharoImagePath/c/myDirectory/myfile.db"
>
>
>  
> > Another concern is that garage as it is works in Pharo 3, 4 and 5. Then if we update Zn I'll have to add Zn to my dependencies and load the correct version accordingly, what do you think? (I prefer that to back porting)
>
> Just depend on the latest stable release of Zinc, it is (still) supported for 2.0 and up.
>
> > Guille
> >
> >
> > El lun., 18 de may. de 2015 a la(s) 11:27 a. m., Sven Van Caekenberghe <[hidden email]> escribió:
> > Hi Guile,
> >
> > > On 18 May 2015, at 10:48, Guillermo Polito <[hidden email]> wrote:
> > >
> > > Well, I wanted in general to have a single easy-to-learn way to build a database connection string for all drivers.
> > >
> > > [driverid]://[host][:port]/[databasename]?[properties]
> > >
> > > Then, for sqlite I would like to have:
> > >
> > > sqlite3://myDirectory/myfile.db
> > >
> > > or
> > >
> > > sqlite3://memory
> > >
> > > Now, for Garage 0.1 I was using ZnUrl to parse the connection strings and that is not working for urls like the following because if the schema is not 'file' then the parser fails to to transform the port (an empty string in this case for the parser) to an integer
> > >
> > > sqlite3://c:/myDirectory/myfile.db
> > >
> > > In bleeding edge I am using an alternative parser that does no conversions and allows urls like that one.
> >
> > I understand the reasoning here, but I would suggest that you at least consider changing your mind. Over time, a lot of work went into ZnUrl, and many people used it and helped fixing many problems. Recently we tried to improve the situation for special schemes a bit (for git URLs). We're not yet there, but you (your use case) could help improving the situation.
> >
> > I tried to take a principled approach regarding the specs (not that there is one clear spec, haha), but edge cases should still be possible.
> >
> > Consider the following:
> >
> > 'postgresql://localhost:14433/test-db?username=john&password=123456' asUrl.
> > 'postgresql://john:123456@localhost:14433/test-db?encoding=utf8' asUrl.
> > 'postgresql://localhost/default' asUrl.
> >
> > 'sqlite3:///myDirectory/myfile.db' asUrl.
> > 'sqlite3://memory' asUrl.
>
>
> Would memory have a hierarchical structure?  i.e. if I wanted to run two sqllite3 instances in memory?
>
>  
> > 'sqlite3:///c:/myDirectory/myfile.db' asUrl.
> >
> > According to me, these all parse correctly. Of course, ZnUrl does not know or understand what is possibly special about any particular scheme (apart from HTTP(S)). Still, it is just an object that can be manipulated.
> >
> > The default port can be computed along the lines of #portOrDefault, maybe we could add a #portIfAbsent: ?
> >
> > Conversion to a file references can be done as follows:
> >
> > 'sqlite3:///c:/myDirectory/myfile.db' asUrl
> >   copy scheme: #file; asFileReference.
> >
> > Which basically reads as: 'I know this is not a file:// URL proper, still I want to interpret it as if it were one'. Maybe we could add a #forceAsFileReference ?
> >
> > Anyway, you get the idea. You know I am willing to help.
> >
> > Sven
> >
> >
> >
> >
>
>
> Looking through Zinc I notice that the schemes are symbols.  Any plans to make these classes?  To maybe(?) make it easier to extend to other schemes (http://www.iana.org/assignments/uri-schemes/uri-schemes.xhtml) ?  Or do you consider that outside the scope of Zinc ?

Well, it is a 'problem area' (design/implementation wise). Your suggestion makes sense though: to make the schemes objects so that they could implement specific behaviour (rather than making many subclasses of ZnUrl). I will think about this, thanks for the idea !

Sven

> cheers -ben


Reply | Threaded
Open this post in threaded view
|

Re: Garage and SQLite file databases, general project structure

Ben Coman


On Tue, May 19, 2015 at 10:49 PM, Sven Van Caekenberghe <[hidden email]> wrote:

I made the following commits:
This does not change the current default behaviour, but it makes it a lot easier to work with relative references:

Cool.  

Now just a passing thought.  The original term URL is from RFC1738, which is obsolete.  RFC3986 says "The term 'Uniform Resource Locator' (URL) refers to the subset of URIs that, in addition to identifying a resource, provide a means of locating the resource by describing its primary access mechanism."  I don't completely follow that, but maybe there is use for a ZnUri class that provides broader semantics than ZnUrl. 



ZnUrl image + 'foo.txt'

  => "file:///Users/sven/Develop/Smalltalk/foo.txt"

You just have to choose an appropriate base URL:

'postgresql://user:password@localhost:5432/default' asUrl + 'test'

  => "postgresql://user:password@localhost:5432/test"

However, when you override the host, the port (actually the whole authority) is reset as well (according to my interpretation of the RFC anyway), that is why I added #portIfAbsent:

'postgresql://production/user-db' asUrl portIfAbsent: 5432

  => 5432

I also added #asFileUrl to help in the common case of interpreting any scheme as referring to a file path. After that conversion, one will probably want to do an #asFileReference.

> Looking through Zinc I notice that the schemes are symbols.  Any plans to make these classes?  To maybe(?) make it easier to extend to other schemes (http://www.iana.org/assignments/uri-schemes/uri-schemes.xhtml) ?  Or do you consider that outside the scope of Zinc ?

Well, it is a 'problem area' (design/implementation wise). Your suggestion makes sense though: to make the schemes objects so that they could implement specific behaviour (rather than making many subclasses of ZnUrl). I will think about this, thanks for the idea !



Perhaps an advantage of a Scheme class is that it might make it easier for a "postgresql" to define a default port.

btw, The part of RFC3986 that sparked this thought was this... "URI scheme specifications can define opaque identifiers by disallowing use of slash characters, question mark characters, and the URIs "scheme:." and "scheme:..".

cheers -ben

Reply | Threaded
Open this post in threaded view
|

Re: Garage and SQLite file databases, general project structure

Ben Coman
In reply to this post by Guillermo Polito
On Mon, May 18, 2015 at 4:48 PM, Guillermo Polito <[hidden email]> wrote:
I wanted in general to have a single easy-to-learn way to build a database connection string for all drivers.

[driverid]://[host][:port]/[databasename]?[properties]
 
 
 
I bumped into an interesting app "SQL Workbench/J" [1] that lists URL connection strings for about 20 databases.  I've extracted this mostly for curiousity (since we'll probably not support most of these), but it might be useful for their formats to be possible. Not sure how canonical these are or specific to the SQL Workbench application.
 
Adabas -- jdbc:adabasd://servername/name_of_database
Apache Derby Embedded -- jdbc:monetdb://(hostname)/(database)
Apache Derby Network Client -- jdbc:derby://(hostname):1527/(database)
Cubrid -- jdbc:cubrid:serverhost:33000:databasename
EnterpriseDB -- jdbc:edb://servername:port/database
FirebirdSQL -- jdbc:firebirdsql://host:port/name_of_database
H2 Databse Engine -- jdbc:h2:/path/to/database
HSQLDB -- jdbc:hsqldb:full_path_to_database
IBM DB2 -- jdbc:db2://server_name:50000/database_name
IBM BD2 UDB for AS/400 (iSeries) -- jdbc:as400://server_name/database_name
Informix -- jdbc:informix-sqli://host-name:port-number/dbName:INFORMIXSERVER=serverName
jTDS -- jdbc:jtds:sqlserver://serverName[:port][/;databaseName]
MaxDB -- jdbc:sapdb://hostname[:port]/database_name
Microsoft Access JDBC Driver --jdbc:ucanaccess://c:/path/to/database.mdb
MySQL -- jdbc:mysql://hostname:port/name_of_database
NuoDB -- jdbc:com.nuodb://localhost/test
Oracle -- jdbc:oracle:thin:@hostname:port:sid
Pervasive PSQL -- jdbc:pervasive://(hostname):(port)/(database)
PostgreSQL -- jdbc:postgresql://host:port/name_of_database
SQL Server 2005 -- jdbc:sqlserver://serverName[\instanceName][:portNumber];property=value;property=value
Sybase jConnect 6.0 -- jdbc:sybase:Tds:machine-name:port-number
Teradata -- jdbc:teradata://DatabaseServerName/Param1,Param2
 
The ones that caught my eye were:
* HSQLDB -- no slashes
* Microsoft Access -- file access
* SQL Server -- backslash, semicolons, equals
* Oracle -- multiple colons and @ symbol
* Teradata -- comma
 
 
btw, With Pharo wanting to appeal to businesses, SQLServer should probably be on the *long* term plan since many businesses are constrained by incumbent SQLServer installations. Documentation for their Tabular Data Stream protocol is apparently here [2].
 
 
 cheers -ben