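I've spent the last few days wrestling with URL normalization for my website.
I got curious about how the urlencode function works, so I dug out the C source to take a look. It turns out to be plain hexadecimal: each escaped byte is written as % followed by its two hex digits.
ASCII table: http://www.asciitable.com/
C code (taken from php-5.2.6/ext/standard/url.c, lines 430-489)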
/* rfc1738:

   ...The characters ";",
   "/", "?", ":", "@", "=" and "&" are the characters which may be
   reserved for special meaning within a scheme...

   ...Thus, only alphanumerics, the special characters "$-_.+!*'(),", and
   reserved characters used for their reserved purposes may be used
   unencoded within a URL...

   For added safety, we only leave -_. unencoded.
 */

static unsigned char hexchars[] = "0123456789ABCDEF";

/* {{{ php_url_encode
 */
PHPAPI char *php_url_encode(char const *s, int len, int *new_length)
{
    register unsigned char c;
    unsigned char *to, *start;
    unsigned char const *from, *end;

    from = s;
    end = s + len;
    start = to = (unsigned char *) safe_emalloc(3, len, 1);

    while (from < end) {
        c = *from++;

        if (c == ' ') {
            *to++ = '+';
#ifndef CHARSET_EBCDIC
        } else if ((c < '0' && c != '-' && c != '.') ||
                   (c < 'A' && c > '9') ||
                   (c > 'Z' && c < 'a' && c != '_') ||
                   (c > 'z')) {
            to[0] = '%';
            to[1] = hexchars[c >> 4];
            to[2] = hexchars[c & 15];
            to += 3;
#else /*CHARSET_EBCDIC*/
        } else if (!isalnum(c) && strchr("_-.", c) == NULL) {
            /* Allow only alphanumeric chars and '_', '-', '.'; escape the rest */
            to[0] = '%';
            to[1] = hexchars[os_toascii[c] >> 4];
            to[2] = hexchars[os_toascii[c] & 15];
            to += 3;
#endif /*CHARSET_EBCDIC*/
        } else {
            *to++ = c;
        }
    }
    *to = 0;
    if (new_length) {
        *new_length = to - start;
    }
    return (char *) start;
}
/* }}} */
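
To try the hex trick outside of PHP, here is a minimal standalone sketch of the same idea. It is not PHP's code: the name sketch_url_encode is made up for illustration, it uses plain malloc instead of safe_emalloc, it takes a NUL-terminated string instead of a pointer plus length, and it only mirrors the non-EBCDIC branch.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of PHP): percent-encodes a NUL-terminated
 * string the way the non-EBCDIC branch above does. The caller frees the result. */
static char *sketch_url_encode(const char *s)
{
    static const char hexchars[] = "0123456789ABCDEF";
    size_t len = strlen(s);
    /* Worst case: every byte becomes "%XX", plus one byte for the NUL. */
    char *out = malloc(3 * len + 1);
    char *to = out;
    size_t i;

    if (out == NULL) {
        return NULL;
    }
    for (i = 0; i < len; i++) {
        unsigned char c = (unsigned char) s[i];

        if (c == ' ') {
            /* A space becomes '+', as in the original. */
            *to++ = '+';
        } else if ((c >= '0' && c <= '9') || (c >= 'A' && c <= 'Z') ||
                   (c >= 'a' && c <= 'z') ||
                   c == '-' || c == '_' || c == '.') {
            /* Alphanumerics and -_. are copied through unchanged. */
            *to++ = (char) c;
        } else {
            /* Everything else: '%' plus the high and low nibble as hex digits. */
            *to++ = '%';
            *to++ = hexchars[c >> 4];
            *to++ = hexchars[c & 15];
        }
    }
    *to = '\0';
    return out;
}
```

For example, '&' is 0x26 in the ASCII table, so c >> 4 picks the digit 2 and c & 15 picks the digit 6, which gives %26. Calling sketch_url_encode("a b&c") should therefore return "a+b%26c".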