Unicode Pipeline
The text file behind the pipeline contains a Unicode string.
The string is "-1", meaning minus one.
In hex (UTF-16, little-endian): 2D 00 31 00
I have this code:
iNum := StrToIntDef(ppPipeline.Fields[0].AsString, 0);
However, the field content is "" and not "-1".
In the Help it says the TppTextPipeline "access data from an ASCII text
file".
How do I get it to access a Unicode text file?
Regards,
Peter Evans
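For reference, the hex dump quoted in the question can be reproduced with the Delphi RTL. This is only a small sketch, assuming a Unicode-enabled Delphi (2009 or later); the program name is made up for illustration. It shows that "-1" stored as UTF-16 little-endian takes two bytes per character, which is why a reader expecting single-byte text does not see "-1".

```delphi
program ShowUtf16Bytes;
{$APPTYPE CONSOLE}

uses
  SysUtils;

var
  Bytes: TBytes;
  B: Byte;
  Hex: string;
begin
  // TEncoding.Unicode is UTF-16 little-endian; GetBytes does not add a BOM.
  Bytes := TEncoding.Unicode.GetBytes('-1');

  Hex := '';
  for B in Bytes do
    Hex := Hex + IntToHex(B, 2);

  Writeln(Hex); // prints 2D003100
end.
```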
This discussion has been closed.
Comments
You can use the TextPipeline.Encoding property to assign an encoding.
Example:
uses
  SysUtils;  // declares TEncoding

myTextPipeline.Encoding := TEncoding.Unicode;
When no TextPipeline.Encoding is specified, the TextPipeline tries to
auto-detect the encoding by examining the first several bytes of the file,
known as the preamble or byte order mark (BOM); see
https://en.wikipedia.org/wiki/Byte_order_mark.
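To illustrate what BOM-based detection looks like, here is a minimal sketch using the Delphi RTL's TEncoding.GetBufferEncoding. It is not the TextPipeline's actual implementation, which is not shown in this thread; DetectEncoding is a hypothetical helper name, and the byte values are the "-1" sample from the question.

```delphi
program DetectBom;
{$APPTYPE CONSOLE}

uses
  SysUtils;

// Hypothetical helper: returns the encoding implied by a leading BOM and
// the number of preamble bytes to skip.
function DetectEncoding(const Buffer: TBytes; out PreambleLen: Integer): TEncoding;
begin
  Result := nil; // must be nil so GetBufferEncoding performs the detection
  PreambleLen := TEncoding.GetBufferEncoding(Buffer, Result);
end;

var
  Enc: TEncoding;
  Skip: Integer;
begin
  // FF FE is the UTF-16 little-endian BOM, followed by "-1".
  Enc := DetectEncoding(TBytes.Create($FF, $FE, $2D, $00, $31, $00), Skip);
  if Enc = TEncoding.Unicode then
    Writeln('BOM found: UTF-16 LE, preamble bytes to skip: ', Skip);

  // The same text without a BOM: nothing to detect, so the RTL falls back
  // to TEncoding.Default.
  Enc := DetectEncoding(TBytes.Create($2D, $00, $31, $00), Skip);
  if Enc = TEncoding.Default then
    Writeln('No BOM: using TEncoding.Default');
end.
```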
--
Nard Moseley
Digital Metaphors
www.digital-metaphors.com
Thanks for that advice.
I found I had to set the TextPipeline.Encoding and write out the BOM.
It did not work just setting the TextPipeline.Encoding alone.
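One way to write out the BOM, sketched here with standard RTL calls only. This is an illustration, not the code actually used in this thread; WriteTextWithBom and the file name are hypothetical.

```delphi
program WriteUtf16WithBom;
{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

// Hypothetical helper: writes the encoding's preamble (BOM) first,
// then the encoded text.
procedure WriteTextWithBom(const FileName, Text: string; Encoding: TEncoding);
var
  Stream: TFileStream;
  Preamble, Data: TBytes;
begin
  Preamble := Encoding.GetPreamble;   // FF FE for TEncoding.Unicode
  Data := Encoding.GetBytes(Text);

  Stream := TFileStream.Create(FileName, fmCreate);
  try
    if Length(Preamble) > 0 then
      Stream.WriteBuffer(Preamble[0], Length(Preamble));
    if Length(Data) > 0 then
      Stream.WriteBuffer(Data[0], Length(Data));
  finally
    Stream.Free;
  end;
end;

begin
  // Assumed file name, for illustration only.
  WriteTextWithBom('minus_one.txt', '-1', TEncoding.Unicode);
end.
```

The resulting file starts with the two BOM bytes, so a reader that auto-detects the encoding can recognise it as UTF-16.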
My suggestions are that you:
1) update the documentation, as it currently states "access data from an
ASCII text file". That is wrong. It also turns away potential customers.
2) make Encoding a published property. I had to manually change many
pipelines throughout my code.
3) make Unicode the default setting. No excuses for not doing so in this
day and age.
Regards,
Peter Evans
When TextPipeline.Encoding is specified, a BOM should not be required. This
will be fixed for the next maintenance release.
We will also update the documentation. I notice the Encoding property is not
documented.
We are following the Unicode VCL standard. I believe it is well thought out,
and customer feedback indicates that it works well. The Delphi help topic
'Using TEncoding for Unicode Files' explains how it works. Basically, when a
BOM is present it is used; otherwise the TEncoding.Default encoding is used,
which is the Windows ANSI code page.

This works well for ANSI text data, which is quite common and for which no
BOM exists. Unicode, on the other hand, has several encodings: UTF-16, UTF-8,
etc. UTF-8, for example, is used frequently (internet files, OS X, iOS). When
Unicode text is written to a file, a BOM should be used.

On my Windows 7 machine, when I use Notepad to create text files, the default
is ANSI with no BOM. In the Save dialog I can specify an encoding, in which
case Notepad writes a BOM. When Notepad opens a text file, it looks for a BOM
to determine how the text is encoded; otherwise it assumes ANSI.
--
Nard Moseley
Digital Metaphors
www.digital-metaphors.com
I found the above worked for some basic tests.
Since then I have tested other situations and have had to make further
changes. (For non-Unicode text I could write a one-line text string to a
pipeline with no CR LF characters at the end. That approach worked.)
The changes I have now had to make are:
1) also write out the CR LF at the end of the last line.
2) every time I read ppTextPipeline1.Fields[0].AsString, or similar,
I have to strip off any trailing CR LF or whatever.
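The stripping described in 2) could look something like this. It is a sketch only: StripTrailingLineBreaks is a hypothetical helper, and the literal string stands in for the value returned by Fields[0].AsString.

```delphi
program StripLineBreaks;
{$APPTYPE CONSOLE}

uses
  SysUtils;

// Hypothetical helper: removes any run of CR and LF characters from the
// end of S and leaves the rest of the string untouched.
function StripTrailingLineBreaks(const S: string): string;
var
  Len: Integer;
begin
  Len := Length(S);
  while (Len > 0) and CharInSet(S[Len], [#13, #10]) do
    Dec(Len);
  Result := Copy(S, 1, Len);
end;

begin
  // '-1'#13#10 stands in for a field value that still carries its CR LF.
  Writeln(StrToIntDef(StripTrailingLineBreaks('-1'#13#10), 0)); // prints -1
end.
```

In the pipeline code the call would wrap the field access, for example StrToIntDef(StripTrailingLineBreaks(ppTextPipeline1.Fields[0].AsString), 0). SysUtils.TrimRight is a shorter alternative if trailing spaces and other control characters should be removed as well.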
Regards,
Peter Evans
in the Delphi debugger, I can research it. Perhaps you can create a couple
of test files - one ANSI, one Unicode. Please send any examples to support@
--
Nard Moseley
Digital Metaphors
www.digital-metaphors.com