Hi,
is there already a package for email parsing out there?

I'm using the POP and SMTP package from Jose S. Calvo, and receiving mail from a POP server works fine, as expected, but each mail arrives as just one string.

Günther
Günther Schmidt wrote:
> Hi,
>
> is there already a package for email parsing out there?
>
> I'm using the POP and SMTP package from Jose S. Calvo and receiving
> mails from a POP Server works fine, as expected, but the mail is just
> one string.

It could be very easy to split the parts (headers, body or bodies, and multipart messages).

Best regards.

--
Esteban A. Maringolo
[hidden email]
Hi Esteban,
it might indeed be, but if somebody has already done the work, why should I do it again? ;-)

Günther
In reply to this post by Günther Schmidt
On Thu, 10 Feb 2005 18:20:44 +0100,
Günther Schmidt <[hidden email]> wrote:
> Hi,
>
> is there already a package for email parsing out there?

I don't know, but ...

> I'm using the POP and SMTP package from Jose S. Calvo and receiving
> mails from a POP Server works fine, as expected, but the mail is just
> one string.

... this is actually nice and simple. Break the string into lines. Everything up to the first empty line is the header; the rest is the message body. Each header item has a name tag, followed by a colon, followed by the value. Header lines beginning with whitespace are continuation lines.

Have fun :-)

s.
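The steps described above could be sketched as a workspace snippet along the following lines. This is only a minimal sketch, not a tested parser: the sample message, the variable names, and the choice of an OrderedCollection of name -> value associations are all assumptions made for illustration.

```smalltalk
"Workspace sketch. The sample message and all names are made up for illustration."
| ws raw stream line name headers body |

"Build a small sample message: two headers (one folded), an empty line, a body."
ws := WriteStream on: String new.
ws
	nextPutAll: 'From: someone@example.com'; cr;
	nextPutAll: 'Subject: a subject that is folded'; cr;
	nextPutAll: '  onto a second line'; cr;
	cr;
	nextPutAll: 'The body starts after the first empty line.'; cr.
raw := ws contents.

"Headers are kept as name -> value associations, in order of appearance."
headers := OrderedCollection new.
stream := raw readStream.

"Everything up to the first empty line is header."
[stream atEnd not and: [(line := stream nextLine) notEmpty]] whileTrue: [
	(line first isSeparator and: [headers notEmpty])
		ifTrue: [
			"Continuation line: append it to the value of the previous header."
			headers last value: headers last value , line]
		ifFalse: [
			"Name tag, then a colon, then the value (any leading blank is kept)."
			name := line readStream upTo: $:.
			headers add: name -> (line copyFrom: (name size + 2 min: line size + 1))]].

"The rest is the message body."
body := stream upToEnd.
```

Inspecting headers and body afterwards shows the split. Keeping the headers in an ordered collection rather than a dictionary is just one way to let a header name appear more than once.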
In reply to this post by Günther Schmidt
Günther Schmidt wrote:
> Hi Esteban,
>
> it might indeed be, but what if somebody has already done the work why
> should I do it again?

Depending on the complexity, time, and money available, building your own is always recommended. That is my personal opinion.

Rebuilding is not the same as reinventing.

But I went off topic.

--
Esteban A. Maringolo
[hidden email]
In reply to this post by Günther Schmidt
Günther,
> is there already a package for email parsing out there?
>
> I'm using the POP and SMTP package from Jose S. Calvo and receiving
> mails from a POP Server works fine, as expected, but the mail is just
> one string.

If you have a look at my NewsArchiveBrowser, specifically NewsArchiveArticle>>parse, then it might give you some ideas. It's for parsing newsgroup messages (and only looks for the Subject/From/Sent headers) but the email format is similar.

Please note (and it's mentioned a number of times in comments) that the code in that area is written for _speed_ and shouldn't be taken as any sort of ST coding example :-)

--
Ian

Use the Reply-To address to contact me.
Mail sent to the From address is ignored.
Ian,
thanks.

Ian Bartholomew wrote:
> If you have a look at my NewsArchiveBrowser, specifically
> NewsArchiveArticle>>parse, then it might give you some ideas. It's for
> parsing newsgroup messages (and only looks for the Subject/From/Sent
> headers) but the email format is similar.
>
> Please note (and it's mentioned a number of times in comments) that the
> code in that area is written for _speed_ and shouldn't be taken as any
> sort of ST coding example :-)

While writing this email I haven't looked at your code yet, but I already have a question up front. :-)

If you were to do it in *good* Smalltalk style, how would you write it?
I'm asking because I reckon that, with my little knowledge, I would already be able to write code to *parse* the email, though the code would probably be quite procedural.
Would you use a FSA?

Günther
Guenther,
> If you were to do it in *good* Smalltalk style, how would you write?
> I'm asking because I reckon with my little knowledge I already would be
> able to write appropriate code to *parse* the email, the code probable
> being quite procedural though.
> Would you use a FSA?

No, it's quite a simple task and doesn't need anything so heavy. All I meant with my comment was that there are places where I haven't followed the "normal" way of doing things but have "cheated" a bit to gain a speed improvement. For example, in the #parse method I have used a string search for a line delimiter rather than the more "natural" use of a Stream and #nextLine.

FWIW, a quick play has come up with the following class that parses an email string into a header table and a text. The only (slightly) complex bits are allowing for headers that cover multiple lines and headers that appear more than once.

NB I've only tested it with one e-mail so no guarantees :-)

To test, evaluate

EMailMessage from: aString

where aString is the contents of an email message.

--- cut here ---
"Filed out from Dolphin Smalltalk XP"!

Object subclass: #EMailMessage
	instanceVariableNames: 'headers text lastHeader'
	classVariableNames: ''
	poolDictionaries: ''
	classInstanceVariableNames: ''!
EMailMessage guid: (GUID fromString: '{AB70CD3A-4F03-4997-894E-299B3BB6B87E}')!
EMailMessage comment: ''!
!EMailMessage categoriesForClass!Kernel-Objects! !
!EMailMessage methodsFor!

from: aString
	"Parse aString: header lines up to the first empty line, then the body text."
	| stream line |
	headers := LookupTable new.
	stream := aString readStream.
	[stream atEnd not and: [(line := stream nextLine) notEmpty]]
		whileTrue: [self parseHeaderFrom: line].
	text := stream upToEnd!

parseHeaderFrom: aString
	"A line starting with a letter begins a new header; any other line continues the previous one."
	| header headerValue |
	aString first isLetter
		ifTrue:
			[header := aString readStream upTo: $:.
			headerValue := aString copyFrom: header size + 2.
			"A repeated header name is stored under the name with an X appended."
			[headers includesKey: header] whileTrue: [header := header , 'X'].
			headers at: header put: headerValue.
			lastHeader := header]
		ifFalse: [headers at: lastHeader put: (headers at: lastHeader) , aString]! !
!EMailMessage categoriesFor: #from:!public! !
!EMailMessage categoriesFor: #parseHeaderFrom:!public! !

!EMailMessage class methodsFor!

from: aString
	^super new from: aString! !
!EMailMessage class categoriesFor: #from:!public! !
--- cut here ---

--
Ian

Use the Reply-To address to contact me.
Mail sent to the From address is ignored.
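As posted, the class has no accessors, so the result of EMailMessage from: aString can only be looked at in an inspector. One possible addition, purely as a sketch: the two methods below (headerAt: and text) are hypothetical names and are not part of the class as posted.

```smalltalk
"Hypothetical accessors, not part of the class as posted."
!EMailMessage methodsFor!

headerAt: aString
	"Answer the value stored for the header named aString, or nil if there is none."
	^headers at: aString ifAbsent: [nil]!

text
	"Answer the message body."
	^text! !
```

With these filed in, something like (EMailMessage from: aString) headerAt: 'Subject' would answer the stored subject value. Note that the value keeps the blank that follows the colon, and that a second header with the same name is stored under the name with an X appended, for example 'ReceivedX'.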