Get pg_utf_mblen(), pg_utf2wchar_with_len(), and utf2ucs() all on the same - user/sven/postgresql.git

author	Tom Lane <tgl@sss.pgh.pa.us>	2007-01-24 17:12:47 +0000
committer	Tom Lane <tgl@sss.pgh.pa.us>	2007-01-24 17:12:47 +0000
commit	d56c800c40d2ed66ec2e2dc4c244afa08ff0468a (patch)
tree	b08e6c46e0f55d11e997486dc64dbb93fb9d71b0 /src/backend/access
parent	43021ef8156d76e87fcec3d597c7f7b2e06d21a7 (diff)

Get pg_utf_mblen(), pg_utf2wchar_with_len(), and utf2ucs() all on the same

page about the maximum UTF8 sequence length we support (4 bytes since 8.1, 3 before that). pg_utf2wchar_with_len never got updated to support 4-byte characters at all, and in any case had a buffer-overrun risk in that it could produce multiple pg_wchars from what mblen claims to be just one UTF8 character.

The only reason we don't have a major security hole is that most callers allocate worst-case output buffers; the sole exception in released versions appears to be pre-8.2 iwchareq() (i.e., ILIKE), which can be crashed due to zeroing out its return address --- but AFAICS that can't be exploited for anything more than a crash, due to inability to control what gets written there. Per report from James Russell and Michael Fuhr.

Pre-8.1 the risk is much less, but I still think pg_utf2wchar_with_len's behavior given an incomplete final character risks buffer overrun, so back-patch that logic change anyway.

This patch also makes sure that UTF8 sequences exceeding the supported length (whichever it is) are consistently treated as error cases, rather than being treated like a valid shorter sequence in some places.
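To illustrate the invariant the message describes, here is a minimal sketch in C, not the actual PostgreSQL code. The names sketch_utf8_len and sketch_utf8_to_wide are hypothetical, and uint32_t stands in for pg_wchar. The sketch assumes a 4-byte maximum, reports over-long or stray lead bytes as length 1 so a caller could reject them, emits exactly one output element per input character, and stops at an incomplete final sequence. It does not validate continuation bytes.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: byte length implied by the lead byte of a UTF-8
 * sequence, with a 4-byte maximum.  Lead bytes announcing a longer
 * sequence (0xF8 and up) and stray continuation bytes are reported as
 * length 1, so a caller can treat them as errors instead of mistaking
 * them for a valid shorter sequence.
 */
static int
sketch_utf8_len(unsigned char lead)
{
    if ((lead & 0x80) == 0x00)
        return 1;
    if ((lead & 0xE0) == 0xC0)
        return 2;
    if ((lead & 0xF0) == 0xE0)
        return 3;
    if ((lead & 0xF8) == 0xF0)
        return 4;
    return 1;
}

/*
 * Hypothetical converter: writes exactly one output element per input
 * character and stops at an incomplete final character.  Since every
 * character consumes at least one input byte, an output buffer of
 * len + 1 elements (including the zero terminator) is always enough.
 * Returns the number of elements written, not counting the terminator.
 */
static size_t
sketch_utf8_to_wide(const unsigned char *from, size_t len, uint32_t *to)
{
    size_t n = 0;

    while (len > 0 && *from != 0)
    {
        int      l = sketch_utf8_len(*from);
        uint32_t c;

        if ((size_t) l > len)
            break;              /* incomplete final character: stop here */

        switch (l)
        {
            case 1:
                c = from[0];
                break;
            case 2:
                c = ((uint32_t) (from[0] & 0x1F) << 6) |
                    (uint32_t) (from[1] & 0x3F);
                break;
            case 3:
                c = ((uint32_t) (from[0] & 0x0F) << 12) |
                    ((uint32_t) (from[1] & 0x3F) << 6) |
                    (uint32_t) (from[2] & 0x3F);
                break;
            default:            /* l == 4 */
                c = ((uint32_t) (from[0] & 0x07) << 18) |
                    ((uint32_t) (from[1] & 0x3F) << 12) |
                    ((uint32_t) (from[2] & 0x3F) << 6) |
                    (uint32_t) (from[3] & 0x3F);
                break;
        }

        to[n++] = c;
        from += l;
        len -= (size_t) l;
    }
    to[n] = 0;
    return n;
}
```

The point of the design is that the length function and the converter agree: whatever length the first reports for a character, the second consumes that many bytes and produces one element, so a caller that sizes its buffer by character count cannot be overrun.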

Diffstat (limited to 'src/backend/access')

0 files changed, 0 insertions, 0 deletions
