SqueakAuthors added to AuthorChecker project on squeaksource.com

Yanni Chiu
I've added a package called SqueakAuthors to the
AuthorChecker project on squeaksource.com.
Here's the class comment from the MethodExporter class:

This class exports all method versions to a Postgres database. The table
fields include:  stamp, author, timestamp, meta, class name, method
name, and method source. The resulting table should be useful for
determining authorship of each method, and thus facilitate the Squeak
re-licencing effort.

The PostgresV2 package must first be loaded. The Cryptography package is
not needed if your database is set up to use cleartext passwords. Note
that the PostgresV2 and SqueakAuthor categories are skipped when
exporting all methods.

The ImageExtensions (which can be found in the PostgresV2 repository)
also has to be loaded. Unfortunately, these methods are not skipped in
the export.

Create the table with the SQL generated by:
     MethodCollector new getCreateSql

See the examples in the class methods of MethodExporter. The #run method
exports all methods.

====

I ran it using the image at:

     http://ftp.squeak.org/images_with_all_changes/Squeak3.9a.from3.0.zip

It took about 10 minutes to load 53074 methods into a Postgres
database running on another PC.

Here are some random queries I ran.

(1) How many authors?

select distinct author from public.squeak_method order by author

Returned 254 distinct lines.
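
The author count can also be obtained directly with an aggregate instead of counting result lines. Below is a minimal sketch in Python; it uses an in-memory SQLite table as a stand-in for the Postgres one, and the two-column table and sample rows are made up for illustration. Only the `squeak_method` table name and the `author` column come from the queries in this post. Since query (3) below finds rows with an empty author, the DISTINCT list presumably includes the empty string as one of its lines, so the sketch counts both ways.

```python
import sqlite3

# In-memory SQLite stands in for the Postgres database (an assumption
# made so the sketch is self-contained).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE squeak_method (author TEXT, method_name TEXT)")
conn.executemany(
    "INSERT INTO squeak_method VALUES (?, ?)",
    [("ar", "copyPixelsIndexed:"), ("tk", "text"), ("ar", "x"), ("", "y")],
)


def distinct_author_lines(conn):
    """Number of lines the original SELECT DISTINCT query would return."""
    rows = conn.execute(
        "SELECT DISTINCT author FROM squeak_method ORDER BY author"
    ).fetchall()
    return len(rows)


def count_authors(conn):
    """Number of distinct authors, ignoring the empty author value."""
    row = conn.execute(
        "SELECT COUNT(DISTINCT author) FROM squeak_method WHERE author <> ''"
    ).fetchone()
    return row[0]


print(distinct_author_lines(conn), count_authors(conn))
```

With the sample rows, the first count includes the empty author and the second does not.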

(2) What is the oldest method?

select * from public.squeak_method
   where timestamp > '1901-01-02'::timestamp
   order by timestamp limit 50

About four thousand methods have a date of 1901-01-01, which is
a by-product of how the code handles an empty or garbled method
stamp; the query above filters those out. The interesting part
of the result is:

ar,1970-01-01 21:00:00,f,PNGReadWriter,copyPixelsIndexed:,
tk,1995-01-21 17:55:00,f,RemoteString,text,
di,1997-06-07 10:42:00,f,TwoWayScrollPane,wantsSlot,
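
To illustrate where the 1901-01-01 rows come from, here is a rough sketch of stamp parsing with a fallback date. It is not the MethodExporter code. It assumes the usual Squeak stamp layout of initials, date and time (for example 'tk 1/21/1995 17:55'), and the choice to return an empty author for a stamp that does not have three parts is my assumption. The names `parse_stamp` and `SENTINEL` are hypothetical.

```python
from datetime import datetime

# Placeholder date seen in the exported table for unusable stamps.
SENTINEL = datetime(1901, 1, 1)


def parse_stamp(stamp):
    """Split a stamp like 'tk 1/21/1995 17:55' into (author, timestamp).

    Empty or garbled stamps fall back to the sentinel date.
    """
    parts = stamp.split()
    if len(parts) != 3:
        # Assumption: without the three expected parts, no author is kept.
        return "", SENTINEL
    author, date_part, time_part = parts
    try:
        when = datetime.strptime(f"{date_part} {time_part}", "%m/%d/%Y %H:%M")
    except ValueError:
        # The date or time part was not in the expected format.
        return author, SENTINEL
    return author, when


print(parse_stamp("tk 1/21/1995 17:55"))
print(parse_stamp(""))
```

Filtering on `timestamp > '1901-01-02'`, as in the query above, then skips every row that took the fallback path.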

(3) How many methods have no author information?

select * from public.squeak_method where author = ''

Returned 4210 rows.

(4) How many wide string methods are there?

select * from public.squeak_method where method_source like 'TODO%'

I found a bug in the Postgres driver: I got a walkback
when a field value contained a WideString instance.
I made a quick hack to return 'TODO - WideString' in
these cases. The query result contained only 3 rows:

     JapaneseEnvironment class>>isBreakableAt:in:
     JapaneseEnvironment class>>flapTabTextFor:in:

Only two methods are listed because #isBreakableAt:in: had
two versions.
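
Grouping the placeholder rows by method shows the version counts directly. This is a sketch only: in-memory SQLite again stands in for Postgres, the column names `class_name` and `method_name` are my guesses (only `method_source` appears in the query above), and the rows are toy data shaped like the result described here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# class_name and method_name are assumed column names; meta marks
# class-side methods ('t') as in the result rows shown earlier.
conn.execute(
    "CREATE TABLE squeak_method "
    "(meta TEXT, class_name TEXT, method_name TEXT, method_source TEXT)"
)
conn.executemany(
    "INSERT INTO squeak_method VALUES (?, ?, ?, ?)",
    [
        ("t", "JapaneseEnvironment", "isBreakableAt:in:", "TODO - WideString"),
        ("t", "JapaneseEnvironment", "isBreakableAt:in:", "TODO - WideString"),
        ("t", "JapaneseEnvironment", "flapTabTextFor:in:", "TODO - WideString"),
        ("f", "RemoteString", "text", "text ^ 'ordinary source'"),
    ],
)


def placeholder_versions(conn):
    """Return (class_name, method_name, version_count) for placeholder rows."""
    return conn.execute(
        "SELECT class_name, method_name, COUNT(*) "
        "FROM squeak_method "
        "WHERE method_source LIKE 'TODO%' "
        "GROUP BY class_name, method_name "
        "ORDER BY method_name"
    ).fetchall()


for row in placeholder_versions(conn):
    print(row)
```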

====

Does anyone have interesting queries they would like run,
without having to set up the table themselves?

Is there a better image that should be used?

Another approach is to include an image version
field and a sequence number. Then the export could
be run on each release.
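
A rough sketch of what that might look like, again with in-memory SQLite standing in for Postgres. The column names `image_version` and `seq`, the helper `export_release`, the version labels, and the sample rows are all hypothetical. Here `seq` is taken to mean a row's position within one export run, which is only one possible reading of "sequence number".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE squeak_method (
        image_version TEXT,    -- which release image the row came from
        seq           INTEGER, -- position of the row within that export run
        author        TEXT,
        class_name    TEXT,
        method_name   TEXT
    )
    """
)


def export_release(conn, image_version, methods):
    """Insert one export run, numbering its rows from 1."""
    rows = [
        (image_version, seq, author, class_name, method_name)
        for seq, (author, class_name, method_name) in enumerate(methods, start=1)
    ]
    conn.executemany("INSERT INTO squeak_method VALUES (?, ?, ?, ?, ?)", rows)


def methods_per_release(conn):
    """Return (image_version, row_count) pairs, one per exported release."""
    return conn.execute(
        "SELECT image_version, COUNT(*) FROM squeak_method "
        "GROUP BY image_version ORDER BY image_version"
    ).fetchall()


# Toy data: two export runs from two different images.
export_release(conn, "3.8", [("tk", "RemoteString", "text")])
export_release(
    conn,
    "3.9a",
    [("tk", "RemoteString", "text"), ("ar", "PNGReadWriter", "copyPixelsIndexed:")],
)
print(methods_per_release(conn))
```

Keeping every release in one table would let a single query compare authorship of the same method across images.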