Commit 40542984, authored 23 years ago by Howard Chu
Added some reference comments for ldap_utf8_charlen2
Parent: e21e9003
libraries/libldap/utf-8.c (17 additions, 2 deletions)
...
...
@@ -70,8 +70,6 @@ int ldap_utf8_offset( const char * p )
/*
* Returns length indicated by first byte.
*
* This function should use a table lookup.
*/
const char ldap_utf8_lentab[] = {
	0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
...
...
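The hunk above only shows the first row of the length table, so the following is a minimal sketch of the general idea rather than the contents of `ldap_utf8_lentab[]`. It assumes a full 256-entry table indexed by the lead octet, with 0 marking octets that cannot start a sequence. All `sketch_` names are hypothetical, and the lazy initialization is not thread-safe.

```c
/* Hypothetical names throughout; this is not OpenLDAP's actual table. */
static unsigned char sketch_lentab[256];
static int sketch_lentab_ready = 0;

/* Fill entries [lo, hi] with the sequence length len. */
static void sketch_fill(int lo, int hi, unsigned char len)
{
    int c;
    for (c = lo; c <= hi; c++)
        sketch_lentab[c] = len;
}

static void sketch_init_lentab(void)
{
    sketch_fill(0x00, 0x7F, 1); /* 0.......: single octet */
    sketch_fill(0x80, 0xBF, 0); /* 10......: continuation, cannot lead */
    sketch_fill(0xC0, 0xC1, 0); /* 110 lead with no '+' bit set: overlong */
    sketch_fill(0xC2, 0xDF, 2);
    sketch_fill(0xE0, 0xEF, 3);
    sketch_fill(0xF0, 0xF7, 4);
    sketch_fill(0xF8, 0xFB, 5);
    sketch_fill(0xFC, 0xFD, 6);
    sketch_fill(0xFE, 0xFF, 0); /* never a valid lead octet */
    sketch_lentab_ready = 1;
}

/* Length indicated by the first octet at p, or 0 if it cannot lead. */
int sketch_charlen(const char *p)
{
    if (!sketch_lentab_ready)
        sketch_init_lentab();
    return sketch_lentab[(unsigned char)*p];
}
```

Marking 0xC0 and 0xC1 as invalid in the table is what lets a two-octet sequence be checked from its first octet alone.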
@@ -94,6 +92,23 @@ int ldap_utf8_charlen( const char * p )
/*
* Make sure the UTF-8 char used the shortest possible encoding
* returns charlen if valid, 0 if not.
*
* Here are the valid UTF-8 encodings, taken from RFC 2279 page 4.
* The table is slightly modified from that of the RFC.
*
*   UCS-4 range (hex)      UTF-8 sequence (binary)
*   0000 0000-0000 007F    0.......
*   0000 0080-0000 07FF    110++++. 10......
*   0000 0800-0000 FFFF    1110++++ 10+..... 10......
*   0001 0000-001F FFFF    11110+++ 10++.... 10...... 10......
*   0020 0000-03FF FFFF    111110++ 10+++... 10...... 10...... 10......
*   0400 0000-7FFF FFFF    1111110+ 10++++.. 10...... 10...... 10...... 10......
*
* The '.' bits are "don't cares". When validating a UTF-8 sequence,
* at least one of the '+' bits must be set, otherwise the character
* should have been encoded in fewer octets. Note that in the two-octet
* case, only the first octet needs to be validated, and this is done
* in the ldap_utf8_lentab[] above.
*/
/* mask of required bits in second octet */
...
...
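The rest of the diff is collapsed, so the body of `ldap_utf8_charlen2` is not visible here. As an illustration of the rule in the comment above, the following sketch reads the '+' bit positions straight from the table and accepts a sequence only if at least one of them is set. The function and array names are hypothetical, the mask values are derived from the table rather than copied from the commit, and the caller is assumed to pass a NUL-terminated buffer so that reading the second octet is safe.

```c
/* Length implied by the lead octet alone (RFC 2279 forms, up to 6 octets). */
static int sketch_lead_len(unsigned char c)
{
    if (c < 0x80) return 1;
    if (c < 0xC0) return 0; /* continuation octet, cannot lead */
    if (c < 0xE0) return 2;
    if (c < 0xF0) return 3;
    if (c < 0xF8) return 4;
    if (c < 0xFC) return 5;
    if (c < 0xFE) return 6;
    return 0;               /* 0xFE and 0xFF are never valid */
}

/*
 * Returns the sequence length if the encoding is the shortest possible,
 * 0 otherwise. Only the first two octets are inspected.
 */
int sketch_charlen2(const unsigned char *p)
{
    /* '+' bits per sequence length, indexed by length (0 and 1 unused). */
    static const unsigned char lead_plus[7] =
        { 0x00, 0x00, 0x1E, 0x0F, 0x07, 0x03, 0x01 };
    static const unsigned char second_plus[7] =
        { 0x00, 0x00, 0x00, 0x20, 0x30, 0x38, 0x3C };
    int len = sketch_lead_len(p[0]);

    if (len < 2)
        return len;
    /* Extra check, not part of the rule above: octet 2 must be 10...... */
    if ((p[1] & 0xC0) != 0x80)
        return 0;
    /* At least one '+' bit must be set, else fewer octets would do. */
    if ((p[0] & lead_plus[len]) || (p[1] & second_plus[len]))
        return len;
    return 0;
}
```

For example, `E0 80 80` is rejected because no '+' bit is set, while `E0 A0 80` (U+0800, the smallest three-octet value) is accepted because bit 0x20 of the second octet is set.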