conceptual design help

conceptual design help

Joseph Alotta
Greetings,

I am writing a program to consolidate all my personal finances for tax time next year.  (This is not a school project.)

There are transaction files from several banks and credit card companies.  Each has a similar format, CSV, but they vary in many ways, order of items, extra notes, pipe delimited or tabs, etc.  I want to read them and load them into a collection of transaction objects.

1.  Should I have a FileReader object?

2.  Should it have subclasses like FileReaderAmericanExpress, FileReaderJPMorgan ?

3.  Or should it have different methods like loadAmericanExpresFile, loadJPMorganFile ?

4.  Is a Collection of Transaction objects, the structure that you would load the files into?

The rest of the project would be to do data checking on the files, to make sure there are no duplicates or missing dates.  Then write reports that I can give to my accountant.

I would appreciate some design help?

Sincerely,

Joe.
_______________________________________________
Beginners mailing list
[hidden email]
http://lists.squeakfoundation.org/mailman/listinfo/beginners

Re: conceptual design help

Kirk Fraser
Joe,

My suggestion is to use fewer object classes and more methods. Play with it until you know what you are doing, then objects, instance variables, and methods come more naturally without as much need for prior design.  You can refactor or reorganize fairly quickly once you master the application and tools available. Sometimes it is necessary to write new versions of an application as you learn more about what you need.  

Included in this advice is a suggestion to keep reworking your software in Smalltalk rather than abstracting it too much on paper, which can waste time. A design embedded in software is a lot closer to working than a design on paper (or CRC cards).  But occasionally a few notes on paper can help when going from one version to the next.

Kirk



conceptual design help

Louis LaBrunda
In reply to this post by Joseph Alotta
Hi Joe,

I agree with Kirk's suggestions in general.  See more below.

On Thu, 28 Apr 2016 17:15:05 -0500, Joseph Alotta <[hidden email]> wrote:

>Greetings,
>I am writing a program to consolidate all my personal finances for tax time next year.  (This is not a school project.)
>There are transaction files from several banks and credit card companies.  Each has a similar format, CSV, but they vary in many ways, order of items, extra notes, pipe delimited or tabs, etc.  I want to read them and load them into a collection of transaction objects.

>1.  Should I have a FileReader object?
>2.  Should it have subclasses like FileReaderAmericanExpress, FileReaderJPMorgan ?

No.  I would start with the object you want to hold the information and work out from there.
A single object, or your main program object, should read the files and instantiate the objects
(all of the same class) that get the data.

>3.  Or should it have different methods like loadAmericanExpresFile, loadJPMorganFile ?

Yes.  Use different methods to read and parse the files and instantiate the objects.  Depending
upon how close (similar) the files are you could have a method with a few parameters (three or
four at most) that handles more than one file.

>4.  Is a Collection of Transaction objects, the structure that you would load the files into?

Yes.  An ordered or sorted collection would be good so you could sort by date/time if that is
helpful.
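
A minimal sketch of the "one method with a few parameters" idea, written in Python only for illustration (the same shape maps onto a Smalltalk method with keyword arguments). The `Transaction` fields, the `load_transactions` name, the column layouts and the sample data are all assumptions, not the real bank formats.

```python
import csv
import io
from dataclasses import dataclass
from datetime import date, datetime
from decimal import Decimal


@dataclass(frozen=True)
class Transaction:
    # Hypothetical fields; a real project may need more (notes, account, ...).
    posted: date
    payee: str
    amount: Decimal


def load_transactions(text, delimiter, date_format, columns):
    """Parse one export into Transaction objects.

    columns maps 'date', 'payee' and 'amount' to zero-based column positions.
    Assumes exactly one header line at the top of the file.
    """
    rows = csv.reader(io.StringIO(text), delimiter=delimiter)
    next(rows)  # skip the header line
    loaded = []
    for row in rows:
        if not row:
            continue  # ignore blank lines
        loaded.append(Transaction(
            posted=datetime.strptime(row[columns["date"]].strip(), date_format).date(),
            payee=row[columns["payee"]].strip(),
            amount=Decimal(row[columns["amount"]].strip()),
        ))
    return loaded


# Made-up sample exports in two different layouts.
amex = "Date,Description,Amount\n04/02/2016,COFFEE SHOP,3.50\n03/28/2016,BOOKSTORE,21.00\n"
morgan = "Amount|Posted|Payee\n-40.00|2016-04-01|GAS STATION\n"

# One collection for everything, kept in date order.
transactions = sorted(
    load_transactions(amex, ",", "%m/%d/%Y", {"date": 0, "payee": 1, "amount": 2})
    + load_transactions(morgan, "|", "%Y-%m-%d", {"date": 1, "payee": 2, "amount": 0}),
    key=lambda t: t.posted,
)
```

Both files end up as objects of the same class, so the checking and reporting code never needs to know which bank a row came from.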

>The rest of the project would be to do data checking on the files, to make sure there are no duplicates or missing dates.  Then write reports that I can give to my accountant.

Sounds good.

>I would appreciate some design help?
>Sincerely,
>Joe.

Lou
--
Louis LaBrunda
Keystone Software Corp.
SkypeMe callto://PhotonDemon


RE: conceptual design help

Ron Teitelbaum
In reply to this post by Joseph Alotta
Hi Joe,

Depending on how many different structures you have, you may want to consider
having some external configuration object.  My first thought when reading
your description was that there are only so many ways to parse, and only so
many different types of data.  Having a collection of parser objects that
handle specific translations seems clean.  Once you have an idea of what
types of translations you need, you could trigger those operations by
building your collection of parser objects from an external config file.
That way you read the parameters in from conf.xml, parse.conf, or something
like that (it could be a file in the JPMorgan directory), and then use them
to set up your parser.  The parser is then written to be generic and
reusable.  It would allow you to write methods once and then reuse them for
files you haven't seen yet by creating a new config file for each new
format.  You could even have a config file generator that asks you questions
and shows you the results from a current sample data file. :)

fieldSeparator: #comma.
fieldDelimited: #doubleQuote.
nameSeparator: #comma.
nameFormat: 'title, first, [mi], last, [suffix]'.
balance: #USD.
fieldOrder: 'id, name, balance'.
...
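
As a rough sketch of how a generic parser might consume such a config (Python used only for illustration), the following assumes the config has already been read into a dictionary. The key names `field_separator`, `quote_char` and `field_order`, the `SYMBOLS` table and the sample data are hypothetical and only loosely mirror the settings above.

```python
import csv
import io

# Hypothetical mapping from symbolic names in a config to actual characters.
SYMBOLS = {"comma": ",", "pipe": "|", "tab": "\t", "doubleQuote": '"'}

# Hypothetical in-memory form of one bank's config file.
jpmorgan_config = {
    "field_separator": "comma",
    "quote_char": "doubleQuote",
    "field_order": "id, name, balance",
}


def parse_with_config(text, config):
    """Turn delimited text into one dict per row, driven only by the config."""
    names = [name.strip() for name in config["field_order"].split(",")]
    reader = csv.reader(
        io.StringIO(text),
        delimiter=SYMBOLS[config["field_separator"]],
        quotechar=SYMBOLS[config["quote_char"]],
    )
    return [dict(zip(names, row)) for row in reader if row]


# Made-up sample: the quoted name contains the separator itself.
sample = '17,"Smith, Jane",120.50\n18,"Doe, John",-4.25\n'
records = parse_with_config(sample, jpmorgan_config)
```

Supporting a new bank then means writing a new config, not a new parsing method.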

One other thought: if there is a way for the system to determine what
format to use (this is a JPMorgan file), then instead of an external config
file you could just store the different configs internally, match for the
right config collection, and raise an error if one doesn't exist (asking the
user to create one using your config builder method).  If you can't match,
then having a file in a JPMorgan directory seems simple enough.  In general,
thinking of the setup step as a collection of generic parser configuration
objects, instead of a different parsing method for each file, simplifies
everything.  Of course, back to my original point, writing 3 parser methods
will go faster if there are not that many formats.  There is always a
trade-off when you choose between building a framework and just hacking some
code that works :).
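
One possible shape for the "store the configs internally and match" idea, sketched in Python for illustration. Matching on the exact header line is purely an assumption about how a file could be recognized; the header strings, `KNOWN_FORMATS`, `config_for` and `UnknownFormatError` are all made up.

```python
class UnknownFormatError(Exception):
    """Raised when no stored config matches the file."""


# Hypothetical internal store: header line -> parser configuration.
KNOWN_FORMATS = {
    "Date,Description,Amount": {"name": "AmericanExpress", "field_separator": ","},
    "Amount|Posted|Payee": {"name": "JPMorgan", "field_separator": "|"},
}


def config_for(text):
    """Return the stored config whose header matches the first line of text."""
    lines = text.splitlines()
    header = lines[0].strip() if lines else ""
    try:
        return KNOWN_FORMATS[header]
    except KeyError:
        # This is where the user would be asked to build a new config.
        raise UnknownFormatError(
            f"no stored config for header {header!r}; create one first"
        ) from None


matched = config_for("Amount|Posted|Payee\n-40.00|2016-04-01|GAS STATION\n")
```

If the files carry no usable header, the fallback described above (a config file sitting in each bank's directory) avoids the matching step entirely.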

All the best,

Ron Teitelbaum


Re: conceptual design help

Joseph Alotta
Thanks for all the help.

I like the idea of having the code sense the format of the data and acting accordingly.  

For separators, I could count the occurrences of each kind of separator in the file and compare that to the number of lines.  Say 3 or more separators per line.
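
A small sketch of that counting heuristic, in Python for illustration. The candidate list and the `guess_separator` name are assumptions, and plain counting ignores quoting, so a payee containing commas would inflate the comma count.

```python
def guess_separator(text, candidates=(",", "|", "\t", ";"), minimum_per_line=3):
    """Return the candidate that averages the most occurrences per line.

    Returns None when no candidate reaches minimum_per_line on average.
    """
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        return None
    best, best_average = None, 0.0
    for candidate in candidates:
        average = sum(line.count(candidate) for line in lines) / len(lines)
        if average >= minimum_per_line and average > best_average:
            best, best_average = candidate, average
    return best


# Made-up sample: pipe delimited, with a stray comma in a payee name.
pipe_file = "Amount|Posted|Payee|Notes\n-40.00|2016-04-01|GAS, INC|fuel\n"
separator = guess_separator(pipe_file)
too_few = guess_separator("a,b\n1,2\n")
```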

Then I can parse by columns and look for the dominant data type.  If 60% or more of a column's values parse as dates, I can assume it is a date column and the mismatches are headers.

The amount should be numeric.

The payee should be mostly letters, etc.
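
The column test could be sketched like this (Python for illustration). The 60% threshold comes from the message above; the single date format, the helper names and the sample rows are assumptions, and the rows are assumed to all have the same number of columns.

```python
from datetime import datetime
from decimal import Decimal, InvalidOperation


def is_date(value, date_format="%m/%d/%Y"):
    """True if value parses with the assumed date format."""
    try:
        datetime.strptime(value.strip(), date_format)
        return True
    except ValueError:
        return False


def is_number(value):
    """True if value is a finite decimal number (thousands commas allowed)."""
    try:
        return Decimal(value.strip().replace(",", "")).is_finite()
    except InvalidOperation:
        return False


def classify_columns(rows, threshold=0.6):
    """Label each column 'date', 'number' or 'text' by its dominant type."""
    kinds = []
    for column in zip(*rows):
        dates = sum(is_date(v) for v in column) / len(column)
        numbers = sum(is_number(v) for v in column) / len(column)
        if dates >= threshold:
            kinds.append("date")
        elif numbers >= threshold:
            kinds.append("number")
        else:
            kinds.append("text")
    return kinds


rows = [
    ["Date", "Description", "Amount"],
    ["04/02/2016", "COFFEE SHOP", "3.50"],
    ["04/02/2016", "COFFEE SHOP", "3.50"],
    ["03/28/2016", "BOOKSTORE", "21.00"],
]
kinds = classify_columns(rows)

# Rows whose date column does not parse are treated as headers.
date_column = kinds.index("date")
header_rows = [row for row in rows if not is_date(row[date_column])]
```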

One issue I have is knowing what to call the object that does this.  It would not be a Transaction, because this is a function of many Transactions.

FileLoader?  FileAnalyzer?

Also, at this point I should be looking for missing dates and duplicates.  

Duplicates are troublesome, since every time I download the file, it starts from the beginning of the year again.  I keep downloading the files because I think the banks will only keep data for 6 months or so.

Also, duplicate transactions can be valid.  Suppose I go into a coffee shop and buy a cup of coffee, then go back the same day, same store for a refill.
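
One way to reconcile these two facts is to treat each download as a multiset and keep, for every distinct transaction, the largest count seen in any single download. The sketch below (Python for illustration) assumes the bank reports an old row identically in every download; the tuple layout and `merge_downloads` are made up.

```python
from collections import Counter


def merge_downloads(*downloads):
    """Merge overlapping downloads without losing genuine repeats.

    Counter union (|) keeps the maximum count per key, so two coffees on the
    same day survive, while the same two coffees appearing again in a later
    download are not counted a second time.
    """
    merged = Counter()
    for download in downloads:
        merged |= Counter(download)
    return sorted(merged.elements())


# Made-up data: (date, payee, amount) tuples.
january = [
    ("2016-01-05", "COFFEE SHOP", "3.50"),
    ("2016-01-05", "COFFEE SHOP", "3.50"),  # the refill, a valid duplicate
    ("2016-01-09", "BOOKSTORE", "21.00"),
]
# The later download starts from the beginning of the year again.
february = january + [("2016-02-01", "GAS STATION", "40.00")]

merged = merge_downloads(january, february)
```

Here `merged` holds four transactions: both coffees, the bookstore and the gas station.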

Your thoughts?

Sincerely,

Joe.


 

Re: conceptual design help

Offray
Hi Joseph,

I'm making some data visualizations and, although I don't have advice on conceptual design, I share part of the practical problem of having to work with CSV values in a Smalltalk environment, sometimes with a lot of records (my recent project works with 270k of them). The visualization I did is documented broadly at [1], but essentially I created a "PublishedMedInfo class >> loadDataFromCSV: aFile usingDelimiter: aCharacter" method that fills out my domain objects from what was originally an Excel (and then CSV) file.

[1] http://mutabit.com/offray/blog/en/entry/sdv-infomed

For my recent project [2] I'm using a SQLite bridge between Pharo and the data imported from CSV. That way I'm delegating storage and querying (including duplicates) to a small but potent database back-end, while using objects to model "higher" concerns of my domain. I know there are worries about the object-database impedance mismatch, but working with data and its visualization/reporting lets you build bridges, handing the data to the database and the visualization/reporting to objects, while using the strengths of each in its own place.

[2] https://twitter.com/offrayLC/status/725314838696701957
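
To give a feel for delegating the duplicate query to SQLite, here is a minimal sketch using Python's standard `sqlite3` module as a stand-in for the Pharo bridge (whose API is not shown in this thread). The table layout and sample rows are assumptions.

```python
import sqlite3

# Made-up rows as they might arrive from a parsed CSV file.
rows = [
    ("2016-01-05", "COFFEE SHOP", "3.50"),
    ("2016-01-05", "COFFEE SHOP", "3.50"),
    ("2016-01-09", "BOOKSTORE", "21.00"),
]

connection = sqlite3.connect(":memory:")  # throwaway in-memory database
connection.execute(
    "CREATE TABLE transactions (posted TEXT, payee TEXT, amount TEXT)"
)
connection.executemany("INSERT INTO transactions VALUES (?, ?, ?)", rows)

# Let the database find rows that occur more than once.
repeated = connection.execute(
    "SELECT posted, payee, amount, COUNT(*) FROM transactions "
    "GROUP BY posted, payee, amount HAVING COUNT(*) > 1"
).fetchall()
connection.close()
```

The query only reports candidates; deciding whether a repeat is a genuine second purchase or a re-download stays in the object layer.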

So my practical advice is to explore this kind of combination early in your design. Maybe a quick hands-on mockup could let you know if it works for you. In my case it has, and I'm now adopting it earlier in my projects.

Cheers,

Offray

Ps: Long time without writing, but I have been reading constantly. Nice to be "back" :-)

On 29/04/16 09:28, Joseph Alotta wrote:
> Thanks for all the help. [...]