Hello,
Is it me, or is there a problem with WideString and Artefact?

The following produces an empty page:

page add: ((PDFFormattedTextElement from: 10mm @ 20mm to: 277mm @ 30mm)
    alignment: PDFAlignment left;
    text: 'Argent de poche : 1€').

The text is a WideString. After removing the €, it works fine.

Hilaire

-- 
Dr. Geo
http://drgeo.eu
http://google.com/+DrgeoEu
Hi Hilaire,

Take a look in the Artefact demos. I think there is a PDF document with a monetary character.

2015-10-21 22:02 GMT+02:00 Hilaire <[hidden email]>:
> Hello,
On 21/10/2015 22:26, olivier auverlot wrote:
> Hi Hilaire,
>
> Take a look in the Artefact demos. I think there is a PDF document
> with a monetary character.

Indeed: ((128 asCharacter) asString). But this character does not print on the web with Unicode encoding. Unified support would be great: an object containing the EURO symbol could then be printed as is in both Seaside and Artefact. So I am not sure what is wrong.

Thanks

Hilaire
> On 22 Oct 2015, at 11:14, Hilaire <[hidden email]> wrote:
>
> Indeed ((128 asCharacter) asString).

I am pretty sure this is wrong. The Unicode code point for the Euro symbol is decimal 8364 and not 128.

https://en.wikipedia.org/wiki/Euro_sign

> But this character does not print
> on the web with Unicode encoding.
> Unified support will be great, object with EURO symbol could be printed
> as is both in Seaside and Artefact.
> So not sure what's wrong.
On 22/10/2015 12:01, Sven Van Caekenberghe wrote:
> I am pretty sure this is wrong. The Unicode code point for the Euro symbol is decimal 8364 and not 128.
>
> https://en.wikipedia.org/wiki/Euro_sign

Indeed. I guess the value 128 comes from an 8-bit character encoding, and that is the one Artefact requires.

Hilaire
I also use two different implementations for Artefact/PDF and HTML:

artefact:
128 asCharacter asString

html:
'€'

Having the same for both would be great.

2015-10-22 15:11 GMT+02:00 HilaireFernandes [via Smalltalk] <[hidden email]>:
> On 22/10/2015 12:01, Sven Van Caekenberghe wrote:
> On 22 Oct 2015, at 15:31, Sabine Manaa <[hidden email]> wrote:
>
> I do also use two different implementations for artefact/pdf and html:
>
> artefact:
> 128 asCharacter asString

I am still very curious to know in which character encoding that is the case.

https://en.wikipedia.org/wiki/Currency_sign_(typography)

> html:
> '€'
>
> same would be great
In reply to this post by Sven Van Caekenberghe-2
On 22/10/15 12:01, Sven Van Caekenberghe wrote:
>> Indeed ((128 asCharacter) asString).
>
> I am pretty sure this is wrong. The Unicode code point for the Euro symbol is decimal 8364 and not 128.

Yes, it might be ISO-Latin-1. There are several code pages that have € at 128. PDF supports several encodings; I don't know which one Artefact uses by default.

Stephan
In reply to this post by Sabine Manaa
"character
numeric code representing an abstract symbol according to some defined character encoding rule

NOTE 1 There are three manifestations of characters in PDF, depending on context:
• A PDF file is represented as a sequence of 8-bit bytes, some of which are interpreted as character codes in the ASCII character set and some of which are treated as arbitrary binary data depending upon the context.
• The contents (data) of a string or stream object in some contexts are interpreted as character codes in the PDFDocEncoding or UTF-16 character set.
• The contents of a string within a PDF content stream in some situations are interpreted as character codes that select glyphs to be drawn on the page according to a character encoding that is associated with the text font."

What those contexts are, I don't know, but they all need to be handled differently:
- For bullet 1, there is nothing to do.
- For bullet 2, there needs to be an encoding layer that converts the strings to the proper format when writing the PDF; see section 7.9.2. It seems to me the process would be simpler if, when writing the file, one ignored PDFDocEncoding altogether and either wrote ASCII or converted to BOM-marked UTF-16 (in the same way we write ASCII or BOM-marked UTF-8 for chunk files).
- For bullet 3, one would need to convert to the font's character set.

Cheers
Henry
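The "ASCII or BOM-marked UTF-16" idea for bullet 2 can be sketched as follows. This is only an illustration in Python, not Pharo and not Artefact's code: the function name `encode_pdf_text_string` is hypothetical, and the sketch assumes the big-endian form with the leading byte order mark FE FF. It returns raw bytes only and does not escape parentheses or backslashes for a literal string. It does not cover bullet 3 (text drawn on the page), which depends on the font's encoding.

```python
def encode_pdf_text_string(text: str) -> bytes:
    """Hypothetical helper: plain ASCII when possible, else BOM-marked UTF-16BE."""
    try:
        # Pure ASCII text can be written as is.
        return text.encode("ascii")
    except UnicodeEncodeError:
        # Otherwise prefix the big-endian byte order mark (FE FF)
        # and encode every character as UTF-16BE.
        return b"\xfe\xff" + text.encode("utf-16-be")


ascii_result = encode_pdf_text_string("Pocket money")
wide_result = encode_pdf_text_string("1\u20ac")  # "1€"
print(ascii_result, wide_result.hex())
```

With this approach a string containing € never needs a PDFDocEncoding table; the cost is two bytes per character plus the two-byte mark.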
In reply to this post by Stephan Eggermont-3
> On 22 Oct 2015, at 16:00, Stephan Eggermont <[hidden email]> wrote:
>
> Yes, it might be ISO-Latin-1. There are several codepages having € at 128. PDF support several encodings, I don't know what Artefact uses by default.
>
> Stephan

Indeed, I spoke too quickly, several indeed do (44 out of 69 defined):

ZnByteEncoder knownEncodingIdentifiers select: [ :each |
    (ZnByteEncoder newForEncoding: each) characterDomain includes: $€ ].

(ZnByteEncoder knownEncodingIdentifiers collect: [ :each | ZnByteEncoder newForEncoding: each ])
    select: [ :each | each characterDomain includes: $€ ]
    thenCollect: [ :each | each identifier -> (each encodeString: $€ asString) ].

Most but not all use 128 as encoding. But Latin1 is not one of them (at least not in the strict interpretation).

ZnCharacterEncoder latin1 encodeString: $€ asString.

Sven
On 22-10-15 16:16, Sven Van Caekenberghe wrote:
> Most but not all use 128 as encoding. But Latin1 is not one of them (at least not in the strict interpretation).

Hmm, you can't trust anything you read on the internet anymore :)

CP1252, legacy Windows, it is.

Stephan
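The numbers discussed in this thread can be cross-checked outside Pharo. The following is a minimal sketch in Python, using only its standard codecs. The variable names are illustrative, and nothing here is Artefact or Zinc code. It shows the Euro sign as code point 8364 in Unicode and as byte 128 in CP1252, shows that strict Latin-1 cannot encode it, and shows that code point 128 read as Unicode is a control character.

```python
import unicodedata

euro = "\u20ac"  # the Euro sign

# Unicode code point of the Euro sign
code_point = ord(euro)

# CP1252 (legacy Windows) maps the Euro sign to a single byte
cp1252_bytes = euro.encode("cp1252")

# Strict ISO-8859-1 (Latin-1) has no Euro sign, so encoding fails
try:
    euro.encode("latin-1")
    latin1_has_euro = True
except UnicodeEncodeError:
    latin1_has_euro = False

# Interpreted as a Unicode code point, 128 is a control character ("Cc")
category_of_128 = unicodedata.category(chr(128))

print(code_point, list(cp1252_bytes), latin1_has_euro, category_of_128)
# prints: 8364 [128] False Cc
```

This is consistent with `128 asCharacter asString` producing a € only where the bytes are read as CP1252, and not on a web page served with a Unicode encoding.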