Opened 18 years ago
Last modified 15 years ago
#11774 closed enhancement
Python >= 2.2 should allow for UCS-4 builds — at Initial Version
Reported by: | jarkko.laiho@… | Owned by: | macports-dev@… |
---|---|---|---|
Priority: | Normal | Milestone: | |
Component: | ports | Version: | |
Keywords: | python unicode ucs2 ucs4 | Cc: | jarkko.laiho@… |
Port: |
Description
Since version 2.2, it has been possible to compile Python with the option --enable-unicode=ucs4, enabling Python Unicode strings to contain characters beyond the Basic Multilingual Plane (e.g. unichr(65535+1) would no longer result in a ValueError). Without that specific option, Python defaults to using the rather obsolete UCS-2 (resulting in a "narrow build").
This can be seen by issuing the following commands in the interactive interpreter:
>>> import sys
>>> sys.maxunicode
65535
The result is the same with both the Python (2.3) that ships with Mac OS X 10.4 Tiger and python25 from MacPorts. I have not tested this with the ports of other Python versions.
In contrast, my Gentoo Linux system produces the following output instead, since its Python (as in all modern Linux distributions) is a "wide build":
>>> import sys
>>> sys.maxunicode
1114111
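For anyone who wants to run the same check from a script rather than the interactive interpreter, here is a minimal sketch. The function name unicode_build and the two constants are my own illustrative names, not part of Python; the only real API used is sys.maxunicode.

```python
import sys

NARROW_MAX = 0xFFFF    # 65535: highest code point on a UCS-2 ("narrow") build
WIDE_MAX = 0x10FFFF    # 1114111: highest code point on a UCS-4 ("wide") build


def unicode_build(maxunicode=None):
    """Classify an interpreter as 'narrow' or 'wide' from sys.maxunicode.

    unicode_build is a hypothetical helper for illustration only.
    """
    if maxunicode is None:
        maxunicode = sys.maxunicode
    if maxunicode == NARROW_MAX:
        return "narrow"
    if maxunicode == WIDE_MAX:
        return "wide"
    raise ValueError("unexpected sys.maxunicode: %r" % (maxunicode,))


if __name__ == "__main__":
    print("%s build (sys.maxunicode = %d)" % (unicode_build(), sys.maxunicode))
```

Going by the values above, this should report "narrow" for the Tiger and MacPorts interpreters and "wide" for the Gentoo one.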
This build option is a source of some confusion in various Python implementations and platforms.
wchar_t on Mac OS X is 32 bits, and PEP 261 states: "It is also proposed that one day --enable-unicode will just default to the width of your platforms wchar_t." UCS-4 builds of Python would seem to make sense on the Mac.
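The wchar_t width can be checked from Python itself. This is only a sketch: it assumes the ctypes module is available (it ships with Python 2.5 and later, so it would work on python25 but not on the system 2.3), and wchar_bits and suggested_unicode_option are hypothetical helper names that simply apply the PEP 261 idea of matching the platform's wchar_t.

```python
import ctypes


def wchar_bits():
    """Return the width of the platform's C wchar_t in bits."""
    return ctypes.sizeof(ctypes.c_wchar) * 8


def suggested_unicode_option(bits=None):
    """Map a wchar_t width to the matching --enable-unicode value.

    Hypothetical helper: it follows the PEP 261 proposal of defaulting
    to the width of wchar_t; it is not something configure does itself.
    """
    if bits is None:
        bits = wchar_bits()
    if bits == 32:
        return "ucs4"
    return "ucs2"


if __name__ == "__main__":
    print("wchar_t is %d bits wide; matching option: --enable-unicode=%s"
          % (wchar_bits(), suggested_unicode_option()))
```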
If not a new default option, could a variant be made for Python ports >= 2.2 to compile with UCS-4 support? Is there a reason not to go UCS-4?