//C- -*- C++ -*-
//C- -------------------------------------------------------------------
//C- DjVuLibre-3.5
//C- Copyright (c) 2002 Leon Bottou and Yann Le Cun.
//C- Copyright (c) 2001 AT&T
//C-
//C- This software is subject to, and may be distributed under, the
//C- GNU General Public License, Version 2. The license should have
//C- accompanied the software or you may obtain a copy of the license
//C- from the Free Software Foundation at http://www.fsf.org .
//C-
//C- This program is distributed in the hope that it will be useful,
//C- but WITHOUT ANY WARRANTY; without even the implied warranty of
//C- MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
//C- GNU General Public License for more details.
//C-
//C- DjVuLibre-3.5 is derived from the DjVu(r) Reference Library
//C- distributed by Lizardtech Software. On July 19th 2002, Lizardtech
//C- Software authorized us to replace the original DjVu(r) Reference
//C- Library notice by the following text (see doc/lizard2002.djvu):
//C-
//C- ------------------------------------------------------------------
//C- | DjVu (r) Reference Library (v. 3.5)
//C- | Copyright (c) 1999-2001 LizardTech, Inc. All Rights Reserved.
//C- | The DjVu Reference Library is protected by U.S. Pat. No.
//C- | 6,058,214 and patents pending.
//C- |
//C- | This software is subject to, and may be distributed under, the
//C- | GNU General Public License, Version 2. The license should have
//C- | accompanied the software or you may obtain a copy of the license
//C- | from the Free Software Foundation at http://www.fsf.org .
//C- |
//C- | The computer code originally released by LizardTech under this
//C- | license and unmodified by other parties is deemed "the LIZARDTECH
//C- | ORIGINAL CODE." Subject to any third party intellectual property
//C- | claims, LizardTech grants recipient a worldwide, royalty-free,
//C- | non-exclusive license to make, use, sell, or otherwise dispose of
//C- | the LIZARDTECH ORIGINAL CODE or of programs derived from the
//C- | LIZARDTECH ORIGINAL CODE in compliance with the terms of the GNU
//C- | General Public License. This grant only confers the right to
//C- | infringe patent claims underlying the LIZARDTECH ORIGINAL CODE to
//C- | the extent such infringement is reasonably necessary to enable
//C- | recipient to make, have made, practice, sell, or otherwise dispose
//C- | of the LIZARDTECH ORIGINAL CODE (or portions thereof) and not to
//C- | any greater extent that may be necessary to utilize further
//C- | modifications or combinations.
//C- |
//C- | The LIZARDTECH ORIGINAL CODE is provided "AS IS" WITHOUT WARRANTY
//C- | OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED
//C- | TO ANY WARRANTY OF NON-INFRINGEMENT, OR ANY IMPLIED WARRANTY OF
//C- | MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
//C- +------------------------------------------------------------------
//
// $Id: GString.h,v 1.19 2004/08/06 15:11:29 leonb Exp $
// $Name: release_3_5_15 $
#ifndef _GSTRING_H_
#define _GSTRING_H_
#ifdef HAVE_CONFIG_H
#include "config.h"
#endif
#if NEED_GNUG_PRAGMAS
# pragma interface
#endif
/** @name GString.h
Files #"GString.h"# and #"GString.cpp"# implement a general
purpose string class \Ref{GBaseString}, with derived types
\Ref{GUTF8String} and \Ref{GNativeString} for UTF8 MBS encoding
and the current Native MBS encoding respectively. This
implementation relies on smart pointers (see
\Ref{GSmartPointer.h}).
{\bf Historical Comments} --- At some point during the DjVu
research era, it became clear that C++ compilers rarely provided
portable libraries. We then decided to avoid fancy classes (like
#iostream# or #string#) and to rely only on the good old C
library. A good string class however is very useful. We had
already randomly picked letter 'G' to prefix class names and we
logically derived the new class name. Native English speakers
kept laughing in hiding. This is ironic because we completely
forgot this letter 'G' when creating more challenging things
like the ZP Coder or the IW44 wavelets.
{\bf Later Changes}
When converting to I18N, we (Lizardtech) decided that two string classes
were needed, replacing the original GString with \Ref{GUTF8String} and
\Ref{GNativeString}.
@memo
General purpose string class.
@author
L\'eon Bottou <leonb@research.att.com> -- initial implementation.\\
// From: Leon Bottou, 1/31/2002
// This file has very little to do with my initial implementation.
// It has been practically rewritten by Lizardtech for i18n changes.
// My original implementation was very small in comparison
// <http://prdownloads.sourceforge.net/djvu/DjVu2_2b-src.tgz>.
// In my opinion, the duplication of the string classes is a failed
// attempt to use the type system to enforce coding policies.
// This could be fixed. But there are better things to do in djvulibre.
@version
#$Id: GString.h,v 1.19 2004/08/06 15:11:29 leonb Exp $# */
//@{
#include "DjVuGlobal.h"
#include "GContainer.h"
#include <stdlib.h>
#include <stdarg.h>
#ifdef WIN32
# include <windows.h>
# define HAS_WCHAR 1
# define HAS_MBSTATE 1
#endif
#if HAS_WCHAR
# if !defined(AUTOCONF) || HAVE_WCHAR_H
# include <wchar.h>
# endif
#endif
#ifdef HAVE_NAMESPACES
namespace DJVU {
# ifdef NOT_DEFINED // Just to fool emacs c++ mode
}
#endif
#endif
#if !HAS_MBSTATE
# ifndef HAVE_MBSTATE_T
typedef int mbstate_t;
# endif
#endif
class GBaseString;
// Internal string representation.
class GStringRep : public GPEnabled
{
public:
enum EncodeType { XUCS4, XUCS4BE, XUCS4LE, XUCS4_2143, XUCS4_3412,
XUTF16, XUTF16BE, XUTF16LE, XUTF8, XEBCDIC, XOTHER } ;
enum EscapeMode { UNKNOWN_ESCAPED=0, IS_ESCAPED=1, NOT_ESCAPED=2 };
class UTF8;
friend class UTF8;
class Unicode;
friend class Unicode;
class ChangeLocale;
#if HAS_WCHAR
class Native;
friend class Native;
#endif // HAS_WCHAR
friend class GBaseString;
friend class GUTF8String;
friend class GNativeString;
friend unsigned int hash(const GBaseString &ref);
public:
// default constructor
GStringRep(void);
// virtual destructor
virtual ~GStringRep();
// Other virtual methods.
// Create an empty string.
virtual GP<GStringRep> blank(const unsigned int sz) const = 0;
// Create a duplicate at the given size.
GP<GStringRep> getbuf(int n) const;
// Change the value of one of the bytes.
GP<GStringRep> setat(int n, char ch) const;
// Append a string.
virtual GP<GStringRep> append(const GP<GStringRep> &s2) const = 0;
// Test if isUTF8.
virtual bool isUTF8(void) const { return false; }
// Test if Native.
virtual bool isNative(void) const { return false; }
// Convert to Native.
virtual GP<GStringRep> toNative(
const EscapeMode escape=UNKNOWN_ESCAPED ) const = 0;
// Convert to UTF8.
virtual GP<GStringRep> toUTF8(const bool nothrow=false) const = 0;
// Convert to same as current class.
virtual GP<GStringRep> toThis(
const GP<GStringRep> &rep,const GP<GStringRep> &locale=0) const = 0;
// Compare with #s2#.
virtual int cmp(const GP<GStringRep> &s2,const int len=(-1)) const = 0;
// Convert strings to numbers.
virtual int toInt(void) const = 0;
virtual long int toLong(
const int pos, int &endpos, const int base=10) const = 0;
virtual unsigned long toULong(
const int pos, int &endpos, const int base=10) const = 0;
virtual double toDouble(const int pos, int &endpos) const = 0;
// return the position of the next character
int nextChar( const int from=0 ) const;
// return next non space position
int nextNonSpace( const int from=0, const int len=(-1) ) const;
// return next white space position
int nextSpace( const int from=0, const int len=(-1) ) const;
// return the position after the last non-whitespace character.
int firstEndSpace( int from=0, const int len=(-1) ) const;
// Create an empty string.
template <class TYPE> static GP<GStringRep> create(
const unsigned int sz,TYPE *);
// Creates with a strdup string.
GP<GStringRep> strdup(const char *s) const;
// Creates by appending to the current string
GP<GStringRep> append(const char *s2) const;
// Creates with a concat operation.
GP<GStringRep> concat(const GP<GStringRep> &s1,const GP<GStringRep> &s2) const;
GP<GStringRep> concat(const char *s1,const GP<GStringRep> &s2) const;
GP<GStringRep> concat(const GP<GStringRep> &s1,const char *s2) const;
GP<GStringRep> concat(const char *s1,const char *s2) const;
/* Creates with a strdup and substr. Negative values have strlen(s)+1
added to them.
*/
GP<GStringRep> substr(
const char *s,const int start,const int length=(-1)) const;
GP<GStringRep> substr(
const unsigned short *s,const int start,const int length=(-1)) const;
GP<GStringRep> substr(
const unsigned long *s,const int start,const int length=(-1)) const;
/** Initializes a string with a formatted string (as in #vprintf#). The
string is re-initialized with the characters generated according to the
specified format #fmt# and using the optional arguments. See the ANSI-C
function #vprintf()# for more information. The current implementation
will cause a segmentation violation if the resulting string is longer
than 32768 characters. */
GP<GStringRep> vformat(va_list args) const;
// -- SEARCHING
static GP<GStringRep> UTF8ToNative( const char *s,
const EscapeMode escape=UNKNOWN_ESCAPED );
static GP<GStringRep> NativeToUTF8( const char *s );
// Creates an uppercase version of the current string.
GP<GStringRep> upcase(void) const;
// Creates a lowercase version of the current string.
GP<GStringRep> downcase(void) const;
/** Returns the next UCS4 character, and updates the pointer s. */
static unsigned long UTF8toUCS4(
unsigned char const *&s, void const * const endptr );
/** Returns the number of bytes in next UCS4 character,
and sets #w# to the next UCS4 character. */
static int UTF8toUCS4(
unsigned long &w, unsigned char const s[], void const * const endptr )
{ unsigned char const *r=s;w=UTF8toUCS4(r,endptr);return (int)((size_t)r-(size_t)s); }
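/* Illustration only, not part of this header. The two #UTF8toUCS4#
overloads above decode one character from a UTF-8 byte sequence. The
sketch below shows the general technique with a hypothetical helper
named decode_utf8; it is an assumption about the approach, not the
library's implementation, and it does not reject overlong forms or
surrogate values.

```cpp
#include <cstddef>

// Hypothetical helper: decodes one UTF-8 sequence starting at s, where
// end points one past the last readable byte. On success it stores the
// code point in w and returns the number of bytes consumed. It returns
// 0 when the input is empty, truncated, or structurally malformed.
inline int decode_utf8(unsigned long &w, const unsigned char *s,
                       const unsigned char *end)
{
  if (s >= end) return 0;
  const unsigned char c = s[0];
  int extra;
  if (c < 0x80)      { w = c;        extra = 0; }  // single byte (ASCII)
  else if (c < 0xC0) { return 0; }                 // stray continuation byte
  else if (c < 0xE0) { w = c & 0x1F; extra = 1; }  // two-byte sequence
  else if (c < 0xF0) { w = c & 0x0F; extra = 2; }  // three-byte sequence
  else if (c < 0xF8) { w = c & 0x07; extra = 3; }  // four-byte sequence
  else               { return 0; }                 // invalid lead byte
  if (end - s <= extra) return 0;                  // sequence is truncated
  for (int i = 1; i <= extra; ++i)
    {
      if ((s[i] & 0xC0) != 0x80) return 0;         // not a continuation byte
      w = (w << 6) | (unsigned long)(s[i] & 0x3F);
    }
  return extra + 1;
}
```

The returned byte count plays the same role as the #int# result of the
second overload: the caller advances its pointer by that amount. */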
/** Returns the next UCS4 word from the UTF16 string. */
static int UTF16toUCS4(
unsigned long &w, unsigned short const * const s,void const * const eptr);
static int UCS4toUTF16(
unsigned long w, unsigned short &w1, unsigned short &w2);
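/* Illustration only, not part of this header. #UTF16toUCS4# and
#UCS4toUTF16# convert between UCS4 code points and UTF-16 code units.
The sketch below shows the standard surrogate-pair arithmetic with two
hypothetical helpers (encode_utf16, decode_surrogates). Their return
conventions are assumptions and may differ from the library's.

```cpp
// Hypothetical helper: splits code point w into UTF-16 code units.
// Returns 1 when w fits in a single unit (stored in w1), 2 when a
// surrogate pair is needed (w1 = high surrogate, w2 = low surrogate),
// and 0 when w cannot be encoded (a surrogate value or above U+10FFFF).
inline int encode_utf16(unsigned long w, unsigned short &w1, unsigned short &w2)
{
  if (w < 0x10000UL)
    {
      if (w >= 0xD800UL && w <= 0xDFFFUL) return 0;
      w1 = (unsigned short)w;
      w2 = 0;
      return 1;
    }
  if (w > 0x10FFFFUL) return 0;
  w -= 0x10000UL;                                   // 20 significant bits remain
  w1 = (unsigned short)(0xD800UL + (w >> 10));      // upper 10 bits
  w2 = (unsigned short)(0xDC00UL + (w & 0x3FFUL));  // lower 10 bits
  return 2;
}

// Hypothetical helper: recombines a valid surrogate pair into a code point.
inline unsigned long decode_surrogates(unsigned short hi, unsigned short lo)
{
  return 0x10000UL + (((unsigned long)(hi - 0xD800) << 10)
                      | (unsigned long)(lo - 0xDC00));
}
```
*/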
int cmp(const char *s2, const int len=(-1)) const;
static int cmp(
const GP<GStringRep> &s1, const GP<GStringRep> &s2, const int len=(-1)) ;
static int cmp(
const GP<GStringRep> &s1, const char *s2, const int len=(-1));
static int cmp(
const char *s1, const GP<GStringRep> &s2, const int len=(-1));
static int cmp(
const char *s1, const char *s2, const int len=(-1));
// Lookup the next character, and return the position of the next character.
int getUCS4(unsigned long &w, const int from) const;
virtual unsigned char *UCS4toString(
const unsigned long w, unsigned char *ptr, mbstate_t *ps=0) const = 0;
static unsigned char *UCS4toUTF8(
const unsigned long w,unsigned char *ptr);
static unsigned char *UCS4toNative(
const unsigned long w,unsigned char *ptr, mbstate_t *ps);
int search(char c, int from=0) const;
int search(char const *str, int from=0) const;
int rsearch(char c, int from=0) const;
int rsearch(char const *str, int from=0) const;
int contains(char const accept[], int from=0) const;
int rcontains(char const accept[], int from=0) const;
protected:
// Return the next character and increment the source pointer.
virtual unsigned long getValidUCS4(const char *&source) const = 0;
GP<GStringRep> tocase(
bool (*xiswcase)(const unsigned long wc),
unsigned long (*xtowcase)(const unsigned long wc)) const;
// Tests if the specified character passes the xiswtest. If so, the
// return pointer is incremented to the next character, otherwise the
// specified #ptr# is returned.
const char * isCharType( bool (*xiswtest)(const unsigned long wc), const char *ptr,
const bool reverse=false) const;
// Find the next character position that passes the isCharType test.
int nextCharType(
bool (*xiswtest)(const unsigned long wc),const int from,const int len,
const bool reverse=false) const;
static bool giswspace(const unsigned long w);
static bool giswupper(const unsigned long w);
static bool giswlower(const unsigned long w);
static unsigned long gtowupper(const unsigned long w);
static unsigned long gtowlower(const unsigned long w);
virtual void set_remainder( void const * const buf, const unsigned int size,
const EncodeType encodetype);
virtual void set_remainder( void const * const buf, const unsigned int size,
const GP<GStringRep> &encoding );
virtual void set_remainder ( const GP<Unicode> &remainder );
virtual GP<Unicode> get_remainder( void ) const;
public:
/* Returns a copy of this string with the characters that are special
in XML escaped: '<' to "&lt;", '>' to "&gt;", '&' to "&amp;", '\'' to
"&apos;", and '\"' to "&quot;". Characters 0x01 through
0x1f are also escaped. */
GP<GStringRep> toEscaped( const bool tosevenbit ) const;
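/* Illustration only, not part of this header. The escaping rule
described for #toEscaped# can be sketched on a plain std::string as
below. The helper name xml_escape is hypothetical, the decimal
"&#N;" form used for control characters is an assumption about the
output format, and the #tosevenbit# option is not modelled.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: returns a copy of in with the five XML special
// characters replaced by named entities and bytes 0x01 through 0x1f
// replaced by decimal numeric character references.
inline std::string xml_escape(const std::string &in)
{
  std::string out;
  for (std::string::size_type i = 0; i < in.size(); ++i)
    {
      const unsigned char c = (unsigned char)in[i];
      switch (c)
        {
        case '<':  out += "&lt;";   break;
        case '>':  out += "&gt;";   break;
        case '&':  out += "&amp;";  break;
        case '\'': out += "&apos;"; break;
        case '"':  out += "&quot;"; break;
        default:
          if (c >= 0x01 && c <= 0x1f)
            {
              char buf[8];  // large enough for "&#31;" plus the terminator
              std::snprintf(buf, sizeof buf, "&#%u;", (unsigned)c);
              out += buf;
            }
          else
            out += (char)c;
        }
    }
  return out;
}
```
*/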
// Tests if a string is legally encoded in the current character set.
virtual bool is_valid(void) const = 0;
virtual int ncopy(wchar_t * const buf, const int buflen) const = 0;
protected:
// Actual string data.
int size;
char *data;
};
class GStringRep::UTF8 : public GStringRep
{
public:
// default constructor
UTF8(void);
// virtual destructor
virtual ~UTF8();
// Other virtual methods.
virtual GP<GStringRep> blank(const unsigned int sz = 0) const;
virtual GP<GStringRep> append(const GP<GStringRep> &s2) const;
// Test if UTF8.
virtual bool isUTF8(void) const;
// Convert to Native.
virtual GP<GStringRep> toNative(
const EscapeMode escape=UNKNOWN_ESCAPED) const;
// Convert to UTF8.
virtual GP<GStringRep> toUTF8(const bool nothrow=false) const;
// Convert to same as current class.
virtual GP<GStringRep> toThis(
const GP<GStringRep> &rep,const GP<GStringRep> &) const;
// Compare with #s2#.
virtual int cmp(const GP<GStringRep> &s2,const int len=(-1)) const;
static GP<GStringRep> create(const unsigned int sz = 0);
// Convert strings to numbers.
virtual int toInt(void) const;
virtual long int toLong(
const int pos, int &endpos, const int base=10) const;
virtual unsigned long toULong(
const int pos, int &endpos, const int base=10) const;
virtual double toDouble(
const int pos, int &endpos) const;
// Create a strdup string.
static GP<GStringRep> create(const char *s);
// Creates with a concat operation.
static GP<GStringRep> create(
const GP<GStringRep> &s1,const GP<GStringRep> &s2);
static GP<GStringRep> create( const GP<GStringRep> &s1,const char *s2);
static GP<GStringRep> create( const char *s1, const GP<GStringRep> &s2);
static GP<GStringRep> create( const char *s1,const char *s2);
// Create with a strdup and substr operation.
static GP<GStringRep> create(
const char *s,const int start,const int length=(-1));
static GP<GStringRep> create(
const unsigned short *s,const int start,const int length=(-1));
static GP<GStringRep> create(
const unsigned long *s,const int start,const int length=(-1));
static GP<GStringRep> create_format(const char fmt[],...);
static GP<GStringRep> create(const char fmt[],va_list& args);
virtual unsigned char *UCS4toString(
const unsigned long w,unsigned char *ptr, mbstate_t *ps=0) const;
// Tests if a string is legally encoded in the current character set.
virtual bool is_valid(void) const;
virtual int ncopy(wchar_t * const buf, const int buflen) const;
friend class GBaseString;
protected:
// Return the next character and increment the source pointer.
virtual unsigned long getValidUCS4(const char *&source) const;
};
class GUTF8String;
class GNativeString;
/** General purpose character string.
Each derived instance of class #GBaseString# represents a
character string. Overloaded operators provide a value semantic
to #GBaseString# objects. Conversion operators and constructors
transparently convert between #GBaseString# objects and
#const char*# pointers. The #GBaseString# class has no public
constructors, since a derived type should always be used
to specify the desired multibyte character encoding.
Functions taking strings as arguments should declare their
arguments as "#const char*#". Such functions will work equally
well with derived #GBaseString# objects since there is a fast
conversion operator from the derived #GBaseString# objects
to "#const char*#". Functions returning strings should return
#GUTF8String# or #GNativeString# objects because the class will
automatically manage the necessary memory.
Characters in the string can be identified by their position. The
first character of a string is numbered zero. Negative positions
represent characters relative to the end of the string (i.e.
position #-1# accesses the last character of the string,
position #-2# represents the second last character, etc.) */
class GBaseString : protected GP<GStringRep>
{
public:
enum EscapeMode {
UNKNOWN_ESCAPED=GStringRep::UNKNOWN_ESCAPED,
IS_ESCAPED=GStringRep::IS_ESCAPED,
NOT_ESCAPED=GStringRep::NOT_ESCAPED };
friend class GUTF8String;
friend class GNativeString;
protected:
// Sets the gstr pointer;
void init(void);
~GBaseString();
GBaseString &init(const GP<GStringRep> &rep);
// -- CONSTRUCTORS
/** Null constructor. Constructs an empty string. */
GBaseString( void );
public:
// -- ACCESS
/** Converts a string into a constant null terminated character
array. This conversion operator is very efficient because
it simply returns a pointer to the internal string data. The
returned pointer remains valid as long as the string is
unmodified. */
operator const char* ( void ) const ;
/// Returns the string length.
unsigned int length( void ) const;
/** Returns true if and only if the string contains zero characters.
This operator is useful for conditional expressions in control
structures.
\begin{verbatim}
if (! str) { ... }
while (!! str) { ... } -- Note the double operator!
\end{verbatim}
Class #GBaseString# does not support syntax
"#if# #(str)# #{}#" because the required conversion operator
introduces dangerous ambiguities with certain compilers. */
bool operator! ( void ) const;
// -- INDEXING
/** Returns the character at position #n#. An exception
\Ref{GException} is thrown if number #n# is not in range #-len#
to #len-1#, where #len# is the length of the string. The first
character of a string is numbered zero. Negative positions
represent characters relative to the end of the string. */
char operator[] (int n) const;
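/* Illustration only, not part of this header. The indexing rule stated
for #operator[]# (valid positions run from #-len# to #len-1#, with
negative positions counted from the end) can be sketched as below. The
helper name check_subscript is hypothetical, and std::out_of_range
stands in for \Ref{GException}; this is not the library's code.

```cpp
#include <stdexcept>

// Hypothetical helper: maps a possibly negative position n onto the
// range 0..len-1, throwing if n lies outside -len..len-1.
inline int check_subscript(int n, int len)
{
  if (n < 0)
    n += len;                    // -1 becomes len-1, -len becomes 0
  if (n < 0 || n >= len)
    throw std::out_of_range("illegal subscript");
  return n;
}
```
*/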
/// Returns #TRUE# if the string contains an integer number.
bool is_int(void) const;
/// Returns #TRUE# if the string contains a float number.
bool is_float(void) const;
/** Converts strings between native & UTF8 **/
GNativeString getUTF82Native( EscapeMode escape=UNKNOWN_ESCAPED ) const;
GUTF8String getNative2UTF8( void ) const;
// -- ALTERING
/// Reinitializes a string with the null string.
void empty( void );
// -- SEARCHING
/** Searches character #c# in the string, starting at position
#from# and scanning forward until reaching the end of the
string. This function returns the position of the matching
character. It returns #-1# if character #c# cannot be found. */
int search(char c, int from=0) const;
/** Searches sub-string #str# in the string, starting at position
#from# and scanning forward until reaching the end of the
string. This function returns the position of the first
matching character of the sub-string. It returns #-1# if
string #str# cannot be found. */
int search(const char *str, int from=0) const;
/** Searches character #c# in the string, starting at position
#from# and scanning backwards until reaching the beginning of
the string. This function returns the position of the matching
character. It returns #-1# if character #c# cannot be found. */
int rsearch(char c, const int from=0) const;
/** Searches sub-string #str# in the string, starting at position
#from# and scanning backwards until reaching the beginning of
the string. This function returns the position of the first
matching character of the sub-string. It returns #-1# if
string #str# cannot be found. */
int rsearch(const char *str, const int from=0) const;
/** Searches for any of the specified characters in the accept
string. It returns #-1# if none of the characters can
be found, otherwise the position of the first match. */
int contains(const char accept[], const int from=0) const;
/** Searches for any of the specified characters in the accept
string. It returns #-1# if none of the characters can be
found, otherwise the position of the last match. */
int rcontains(const char accept[], const int from=0) const;
/** Concatenates strings. Returns a string composed by concatenating
the characters of strings #s1# and #s2#. */
GUTF8String operator+(const GUTF8String &s2) const;
GNativeString operator+(const GNativeString &s2) const;
/** Returns an integer. Implements i18n atoi. */
int toInt(void) const;
/** Returns a long integer. Implements i18n strtol. */
long toLong(const int pos, int &endpos, const int base=10) const;
/** Returns an unsigned long integer. Implements i18n strtoul. */
unsigned long toULong(
const int pos, int &endpos, const int base=10) const;
/** Returns a double. Implements the i18n strtod. */
double toDouble(
const int pos, int &endpos ) const;
/** Returns a long integer. Implements i18n strtol. */
static long toLong(
const GUTF8String& src, const int pos, int &endpos, const int base=10);
static unsigned long toULong(
const GUTF8String& src, const int pos, int &endpos, const int base=10);
static double toDouble(
const GUTF8String& src, const int pos, int &endpos);
/** Returns a long integer. Implements i18n strtol. */
static long toLong(
const GNativeString& src, const int pos, int &endpos, const int base=10);
static unsigned long toULong(
const GNativeString& src, const int pos, int &endpos, const int base=10);
static double toDouble(
const GNativeString& src, const int pos, int &endpos);
// -- HASHING
// -- COMPARISONS
/** Returns an #int#. Compares string with #s2# and returns
sorting order. */
int cmp(const GBaseString &s2, const int len=(-1)) const;
/** Returns an #int#. Compares string with #s2# and returns
sorting order. */
int cmp(const char *s2, const int len=(-1)) const;
/** Returns an #int#. Compares string with #s2# and returns
sorting order. */
int cmp(const char s2) const;
/** Returns an #int#. Compares #s1# with #s2# and returns
sorting order. */
static int cmp(const char *s1, const char *s2, const int len=(-1));
/** Returns a boolean. The Standard C strncmp takes two string and
compares the first N characters. static bool GBaseString::ncmp
will compare #s1# with #s2# with the #len# characters starting
from the beginning of the string. */
/** String comparison. Returns true if and only if character
strings #s1# and #s2# are equal (as with #strcmp#.)
*/
bool operator==(const GBaseString &s2) const;
bool operator==(const char *s2) const;
friend bool operator==(const char *s1, const GBaseString &s2);
/** String comparison. Returns true if and only if character
strings #s1# and #s2# are not equal (as with #strcmp#.)
*/
bool operator!=(const GBaseString &s2) const;
bool operator!=(const char *s2) const;
friend bool operator!=(const char *s1, const GBaseString &s2);
/** String comparison. Returns true if and only if character
string #s1# is lexicographically greater than or equal to
string #s2# (as with #strcmp#.) */
bool operator>=(const GBaseString &s2) const;
bool operator>=(const char *s2) const;
bool operator>=(const char s2) const;
friend bool operator>=(const char *s1, const GBaseString &s2);
friend bool operator>=(const char s1, const GBaseString &s2);
/** String comparison. Returns true if and only if character
string #s1# is lexicographically less than string #s2#
(as with #strcmp#.)
*/
bool operator<(const GBaseString &s2) const;
bool operator<(const char *s2) const;
bool operator<(const char s2) const;
friend bool operator<(const char *s1, const GBaseString &s2);
friend bool operator<(const char s1, const GBaseString &s2);
/** String comparison. Returns true if and only if character
string #s1# is lexicographically greater than string #s2#
(as with #strcmp#.)
*/
bool operator> (const GBaseString &s2) const;
bool operator> (const char *s2) const;
bool operator> (const char s2) const;
friend bool operator> (const char *s1, const GBaseString &s2);
friend bool operator> (const char s1, const GBaseString &s2);
/** String comparison. Returns true if and only if character
string #s1# is lexicographically less than or equal to string
#s2# (as with #strcmp#.)
*/
bool operator<=(const GBaseString &s2) const;
bool operator<=(const char *s2) const;
bool operator<=(const char s2) const;
friend bool operator<=(const char *s1, const GBaseString &s2);
friend bool operator<=(const char s1, const GBaseString &s2);
/** Returns an integer. Implements a functional i18n atoi. Note
that if you pass a GBaseString that is not in Native format
the results may be disappointing. */
/** Returns a hash code for the string. This hashing function
helps when creating associative maps with string keys (see
\Ref{GMap}). This hash code may be reduced to an arbitrary
range by computing its remainder modulo the upper bound of
the range. */
friend unsigned int hash(const GBaseString &ref);
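/* Illustration only, not part of the library: the remainder reduction
described in the comment for #hash# can be sketched as below. The names
#toy_hash# and #bucket_for# are hypothetical, and the FNV-1a function is a
stand-in assumption, not the hash actually used by GBaseString.

```cpp
#include <cstddef>
#include <string>

// Stand-in FNV-1a style hash (an assumption made for this sketch only).
inline unsigned int toy_hash(const std::string &s)
{
  unsigned int h = 2166136261u;
  for (unsigned char c : s)
  {
    h ^= c;
    h *= 16777619u;
  }
  return h;
}

// Reduce a hash code to the range [0, nbuckets) by taking its remainder
// modulo the upper bound of the range. nbuckets must be non-zero.
inline std::size_t bucket_for(const std::string &key, std::size_t nbuckets)
{
  return toy_hash(key) % nbuckets;
}
```

A map with 16 buckets would then store a key in slot bucket_for(key, 16).
*/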
// -- HELPERS
friend class GStringRep;
/// Returns next non space position.
int nextNonSpace( const int from=0, const int len=(-1) ) const;
/// Returns next character position.
int nextChar( const int from=0 ) const;
/// Returns next space position.
int nextSpace( const int from=0, const int len=(-1) ) const;
/// Returns the position after the last non-whitespace character.
int firstEndSpace( const int from=0,const int len=(-1) ) const;
/// Tests if the string is legally encoded in the current codepage.
bool is_valid(void) const;
/// Copies the string to a wchar_t buffer.
int ncopy(wchar_t * const buf, const int buflen) const;
protected:
const char *gstr;
static void throw_illegal_subscript() no_return;
static const char *nullstr;
public:
GNativeString UTF8ToNative(
const bool currentlocale=false,
const EscapeMode escape=UNKNOWN_ESCAPED) const;
GUTF8String NativeToUTF8(void) const;
protected:
int CheckSubscript(int n) const;
};
/** General purpose character string.
Each instance of class #GUTF8String# represents a character
string. Overloaded operators provide a value semantic to
#GUTF8String# objects. Conversion operators and constructors
transparently convert between #GUTF8String# objects and
#const char*# pointers.
Functions taking strings as arguments should declare their
arguments as "#const char*#". Such functions will work equally
well with #GUTF8String# objects since there is a fast conversion
operator from #GUTF8String# to "#const char*#". Functions
returning strings should return #GUTF8String# or #GNativeString#
objects because the class will automatically manage the necessary
memory.
Characters in the string can be identified by their position. The
first character of a string is numbered zero. Negative positions
represent characters relative to the end of the string (i.e.
position #-1# accesses the last character of the string,
position #-2# represents the second last character, etc.) */
class GUTF8String : public GBaseString
{
public:
~GUTF8String();
void init(void);
GUTF8String &init(const GP<GStringRep> &rep);
// -- CONSTRUCTORS
/** Null constructor. Constructs an empty string. */
GUTF8String(void);
/// Constructs a string from a character.
GUTF8String(const char dat);
/// Constructs a string from a null terminated character array.
GUTF8String(const char *str);
/// Constructs a string from a null terminated character array.
GUTF8String(const unsigned char *str);
GUTF8String(const unsigned short *dat);
GUTF8String(const unsigned long *dat);
/** Constructs a string from a character array. Elements of the
character array #dat# are added into the string until the
string length reaches #len# or until encountering a null
character (whichever comes first). */
GUTF8String(const char *dat, unsigned int len);
GUTF8String(const unsigned short *dat, unsigned int len);
GUTF8String(const unsigned long *dat, unsigned int len);
/// Construct from base class.
GUTF8String(const GP<GStringRep> &str);
GUTF8String(const GBaseString &str);
GUTF8String(const GUTF8String &str);
GUTF8String(const GNativeString &str);
/** Constructs a string from a sub-string of #gs#. The new string
is composed by copying #len# characters starting at position
#from# in string #gs#. */
GUTF8String(const GBaseString &gs, int from, int len);
/** Copy a null terminated character array. Resets this string
with the character string contained in the null terminated
character array #str#. */
GUTF8String& operator= (const char str);
GUTF8String& operator= (const char *str);
GUTF8String& operator= (const GP<GStringRep> &str);
GUTF8String& operator= (const GBaseString &str);
GUTF8String& operator= (const GUTF8String &str);
GUTF8String& operator= (const GNativeString &str);
/** Constructs a string with a formatted string (as in #vprintf#).
The string is re-initialized with the characters generated
according to the specified format #fmt# and using the optional
arguments. See the ANSI-C function #vprintf()# for more
information. The current implementation will cause a
segmentation violation if the resulting string is longer
than 32768 characters. */
GUTF8String(const GUTF8String &fmt, va_list &args);
/** Constructs a string with a human-readable representation of
integer #number#. The format is similar to format #"%d"# in
function #printf#. */
GUTF8String(const int number);
/** Constructs a string with a human-readable representation of
floating point number #number#. The format is similar to
format #"%f"# in function #printf#. */
GUTF8String(const double number);
/** Initializes a string with a formatted string (as in #printf#).
The string is re-initialized with the characters generated
according to the specified format #fmt# and using the optional
arguments. See the ANSI-C function #printf()# for more
information. The current implementation will cause a
segmentation violation if the resulting string is longer
than 32768 characters. */
GUTF8String &format(const char *fmt, ... );
/** Initializes a string with a formatted string (as in #vprintf#).
The string is re-initialized with the characters generated
according to the specified format #fmt# and using the optional
arguments. See the ANSI-C function #vprintf()# for more
information. The current implementation will cause a
segmentation violation if the resulting string is longer
than 32768 characters. */
GUTF8String &vformat(const GUTF8String &fmt, va_list &args);
/** Returns a copy of this string with the characters that are
special in XML escaped: '<' to "&lt;", '>' to "&gt;", '&' to
"&amp;", '\'' to "&apos;", and '\"' to "&quot;". Characters
0x01 through 0x1f are also escaped. */
GUTF8String toEscaped( const bool tosevenbit=false ) const;
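/* Illustration only, not part of the library: a minimal sketch of the kind
of XML escaping described for #toEscaped#, written against std::string.
The name #xml_escape# is hypothetical. Writing control characters as decimal
numeric references is an assumption of this sketch, and the #tosevenbit#
option is not modelled.

```cpp
#include <cstdio>
#include <string>

// Escape the five XML special characters. Control characters 0x01-0x1f
// are emitted as decimal numeric references (an assumption of this sketch).
inline std::string xml_escape(const std::string &in)
{
  std::string out;
  for (unsigned char c : in)
  {
    switch (c)
    {
    case '<':  out += "&lt;";   break;
    case '>':  out += "&gt;";   break;
    case '&':  out += "&amp;";  break;
    case '\'': out += "&apos;"; break;
    case '"':  out += "&quot;"; break;
    default:
      if (c >= 0x01 && c <= 0x1f)
      {
        char buf[8];
        std::snprintf(buf, sizeof buf, "&#%u;", static_cast<unsigned>(c));
        out += buf;
      }
      else
        out += static_cast<char>(c);
    }
  }
  return out;
}
```
*/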
/** Converts strings containing HTML/XML escaped characters into
their unescaped forms. Numeric representations of characters
(e.g., "&#38;" or "&#x26;" for "&") are the only forms
converted by this function. */
GUTF8String fromEscaped( void ) const;
/** Converts strings containing HTML/XML escaped characters
(e.g., "&lt;" for "<") into their unescaped forms. The
conversion is partially defined by the ConvMap argument which
specifies the conversion strings to be recognized. Numeric
representations of characters (e.g., "&#38;" or "&#x26;"
for "&") are always converted. */
GUTF8String fromEscaped(
const GMap<GUTF8String,GUTF8String> ConvMap ) const;
// -- CONCATENATION
/// Appends character #ch# to the string.
GUTF8String& operator+= (char ch);
/// Appends the null terminated character array #str# to the string.
GUTF8String& operator+= (const char *str);
/// Appends the specified GBaseString to the string.
GUTF8String& operator+= (const GBaseString &str);
/** Returns a sub-string. The sub-string is composed by copying
#len# characters starting at position #from# in this string.
The length of the resulting string may be smaller than #len#
if the specified range is too large. */
GUTF8String substr(int from, int len/*=(-1)*/) const;
/** Returns an upper case copy of this string. The returned string
contains a copy of the current string with all letters turned
into upper case letters. */
GUTF8String upcase( void ) const;
/** Returns a lower case copy of this string. The returned string
contains a copy of the current string with all letters turned
into lower case letters. */
GUTF8String downcase( void ) const;
/** Concatenates strings. Returns a string composed by concatenating
the characters of strings #s1# and #s2#.
*/
GUTF8String operator+(const GBaseString &s2) const;
GUTF8String operator+(const GUTF8String &s2) const;
GUTF8String operator+(const GNativeString &s2) const;
GUTF8String operator+(const char *s2) const;
friend GUTF8String operator+(const char *s1, const GUTF8String &s2);
/** Provides a direct access to the string buffer. Returns a
pointer for directly accessing the string buffer. This pointer
remains valid as long as the string is not modified by
other means. Positive values for argument #n# represent the
length of the returned buffer. The returned string buffer will
be large enough to hold at least #n# characters plus a null
character. If #n# is positive but smaller than the string
length, the string will be truncated to #n# characters. */
char *getbuf(int n = -1);
/** Set the character at position #n# to value #ch#. An exception
\Ref{GException} is thrown if number #n# is not in range #-len#
to #len#, where #len# is the length of the string. If character
#ch# is zero, the string is truncated at position #n#. The
first character of a string is numbered zero. Negative
positions represent characters relative to the end of the
string. If position #n# is equal to the length of the string,
this function appends character #ch# to the end of the string. */
void setat(const int n, const char ch);
public:
typedef enum GStringRep::EncodeType EncodeType;
static GUTF8String create(void const * const buf,
const unsigned int size,
const EncodeType encodetype, const GUTF8String &encoding);
static GUTF8String create( void const * const buf,
unsigned int size, const EncodeType encodetype );
static GUTF8String create( void const * const buf,
const unsigned int size, const GUTF8String &encoding );
static GUTF8String create( void const * const buf,
const unsigned int size, const GP<GStringRep::Unicode> &remainder);
GP<GStringRep::Unicode> get_remainder(void) const;
static GUTF8String create( const char *buf, const unsigned int bufsize );
static GUTF8String create( const unsigned short *buf, const unsigned int bufsize );
static GUTF8String create( const unsigned long *buf, const unsigned int bufsize );
};
#if !HAS_WCHAR
#define GBaseString GUTF8String
#endif
/** General purpose character string.
Each instance of class #GNativeString# represents a character
string. Overloaded operators provide a value semantic to
#GNativeString# objects. Conversion operators and constructors
transparently convert between #GNativeString# objects and
#const char*# pointers.
Functions taking strings as arguments should declare their
arguments as "#const char*#". Such functions will work equally
well with #GNativeString# objects since there is a fast conversion
operator from #GNativeString# to "#const char*#". Functions
returning strings should return #GUTF8String# or #GNativeString#
objects because the class will automatically manage the necessary
memory.
Characters in the string can be identified by their position. The
first character of a string is numbered zero. Negative positions
represent characters relative to the end of the string (i.e.
position #-1# accesses the last character of the string,
position #-2# represents the second last character, etc.) */
class GNativeString : public GBaseString
{
public:
~GNativeString();
// -- CONSTRUCTORS
/** Null constructor. Constructs an empty string. */
GNativeString(void);
/// Constructs a string from a character.
GNativeString(const char dat);
/// Constructs a string from a null terminated character array.
GNativeString(const char *str);
/// Constructs a string from a null terminated character array.
GNativeString(const unsigned char *str);
GNativeString(const unsigned short *str);
GNativeString(const unsigned long *str);
/** Constructs a string from a character array. Elements of the
character array #dat# are added into the string until the
string length reaches #len# or until encountering a null
character (whichever comes first). */
GNativeString(const char *dat, unsigned int len);
GNativeString(const unsigned short *dat, unsigned int len);
GNativeString(const unsigned long *dat, unsigned int len);
/// Construct from base class.
GNativeString(const GP<GStringRep> &str);
GNativeString(const GBaseString &str);
#if HAS_WCHAR
GNativeString(const GUTF8String &str);
#endif
GNativeString(const GNativeString &str);
/** Constructs a string from a sub-string of #gs#. The new string
is composed by copying #len# characters starting at position
#from# in string #gs#. */
GNativeString(const GBaseString &gs, int from, int len);
/** Constructs a string with a formatted string (as in #vprintf#).
The string is re-initialized with the characters generated
according to the specified format #fmt# and using the optional
arguments. See the ANSI-C function #vprintf()# for more
information. The current implementation will cause a
segmentation violation if the resulting string is longer than
32768 characters. */
GNativeString(const GNativeString &fmt, va_list &args);
/** Constructs a string with a human-readable representation of
integer #number#. The format is similar to format #"%d"# in
function #printf#. */
GNativeString(const int number);
/** Constructs a string with a human-readable representation of
floating point number #number#. The format is similar to
format #"%f"# in function #printf#. */
GNativeString(const double number);
#if !HAS_WCHAR
#undef GBaseString
#else
/// Initialize this string class
void init(void);
/// Initialize this string class
GNativeString &init(const GP<GStringRep> &rep);
/** Copy a null terminated character array. Resets this string with
the character string contained in the null terminated character
array #str#. */
GNativeString& operator= (const char str);
GNativeString& operator= (const char *str);
GNativeString& operator= (const GP<GStringRep> &str);
GNativeString& operator= (const GBaseString &str);
GNativeString& operator= (const GUTF8String &str);
GNativeString& operator= (const GNativeString &str);
// -- CONCATENATION
/// Appends character #ch# to the string.
GNativeString& operator+= (char ch);
/// Appends the null terminated character array #str# to the string.
GNativeString& operator+= (const char *str);
/// Appends the specified GBaseString to the string.
GNativeString& operator+= (const GBaseString &str);
/** Returns a sub-string. The sub-string is composed by copying
#len# characters starting at position #from# in this string.
The length of the resulting string may be smaller than #len#
if the specified range is too large. */
GNativeString substr(int from, int len/*=(-1)*/) const;
/** Returns an upper case copy of this string. The returned
string contains a copy of the current string with all letters
turned into upper case letters. */
GNativeString upcase( void ) const;
/** Returns a lower case copy of this string. The returned
string contains a copy of the current string with all letters
turned into lower case letters. */
GNativeString downcase( void ) const;
GNativeString operator+(const GBaseString &s2) const;
GNativeString operator+(const GNativeString &s2) const;
GUTF8String operator+(const GUTF8String &s2) const;
GNativeString operator+(const char *s2) const;
friend GNativeString operator+(const char *s1, const GNativeString &s2);
/** Initializes a string with a formatted string (as in #printf#).
The string is re-initialized with the characters generated
according to the specified format #fmt# and using the optional
arguments. See the ANSI-C function #printf()# for more
information. The current implementation will cause a
segmentation violation if the resulting string is longer than
32768 characters. */
GNativeString &format(const char *fmt, ... );
/** Initializes a string with a formatted string (as in #vprintf#).
The string is re-initialized with the characters generated
according to the specified format #fmt# and using the optional
arguments. See the ANSI-C function #vprintf()# for more
information. The current implementation will cause a
segmentation violation if the resulting string is longer than
32768 characters. */
GNativeString &vformat(const GNativeString &fmt, va_list &args);
/** Returns a copy of this string with the characters that are
special in XML escaped: '<' to "&lt;", '>' to "&gt;", '&' to
"&amp;", '\'' to "&apos;", and '\"' to "&quot;". Characters
0x01 through 0x1f are also escaped. */
GNativeString toEscaped( const bool tosevenbit=false ) const;
/** Provides a direct access to the string buffer. Returns a
pointer for directly accessing the string buffer. This
pointer remains valid as long as the string is not
modified by other means. Positive values for argument #n#
represent the length of the returned buffer. The returned
string buffer will be large enough to hold at least #n#
characters plus a null character. If #n# is positive but
smaller than the string length, the string will be truncated
to #n# characters. */
char *getbuf(int n = -1);
/** Set the character at position #n# to value #ch#. An exception
\Ref{GException} is thrown if number #n# is not in range #-len#
to #len#, where #len# is the length of the string. If
character #ch# is zero, the string is truncated at position
#n#. The first character of a string is numbered zero.
Negative positions represent characters relative to the end of
the string. If position #n# is equal to the length of the
string, this function appends character #ch# to the end of the
string. */
void setat(const int n, const char ch);
static GNativeString create( const char *buf, const unsigned int bufsize );
static GNativeString create( const unsigned short *buf, const unsigned int bufsize );
static GNativeString create( const unsigned long *buf, const unsigned int bufsize );
#endif // WinCE
};
//@}
inline
GBaseString::operator const char* ( void ) const
{
return ptr?(*this)->data:nullstr;
}
inline unsigned int
GBaseString::length( void ) const
{
return ptr ? (*this)->size : 0;
}
inline bool
GBaseString::operator! ( void ) const
{
return !ptr;
}
inline GUTF8String
GUTF8String::upcase( void ) const
{
if (ptr) return (*this)->upcase();
return *this;
}
inline GUTF8String
GUTF8String::downcase( void ) const
{
if (ptr) return (*this)->downcase();
return *this;
}
inline void
GUTF8String::init(void)
{ GBaseString::init(); }
inline GUTF8String &
GUTF8String::init(const GP<GStringRep> &rep)
{ GP<GStringRep>::operator=(rep?rep->toUTF8(true):rep); init(); return *this; }
inline GUTF8String &
GUTF8String::vformat(const GUTF8String &fmt, va_list &args)
{ return (*this = (fmt.ptr?GUTF8String(fmt,args):fmt)); }
inline GUTF8String
GUTF8String::toEscaped( const bool tosevenbit ) const
{ return ptr?GUTF8String((*this)->toEscaped(tosevenbit)):(*this); }
inline GP<GStringRep::Unicode>
GUTF8String::get_remainder(void) const
{
GP<GStringRep::Unicode> retval;
if(ptr)
retval=((*this)->get_remainder());
return retval;
}
inline
GUTF8String::GUTF8String(const GNativeString &str)
{ init(str.length()?(str->toUTF8(true)):(GP<GStringRep>)str); }
inline
GUTF8String::GUTF8String(const GP<GStringRep> &str)
{ init(str?(str->toUTF8(true)):str); }
inline
GUTF8String::GUTF8String(const GBaseString &str)
{ init(str.length()?(str->toUTF8(true)):(GP<GStringRep>)str); }
inline void
GBaseString::init(void)
{
gstr=ptr?((*this)->data):nullstr;
}
/** Returns an integer. Implements i18n atoi. */
inline int
GBaseString::toInt(void) const
{ return ptr?(*this)->toInt():0; }
/** Returns a long integer. Implements i18n strtol. */
inline long
GBaseString::toLong(const int pos, int &endpos, const int base) const
{
long int retval=0;
if(ptr)
{
retval=(*this)->toLong(pos, endpos, base);
}else
{
endpos=(-1);
}
return retval;
}
inline long
GBaseString::toLong(
const GUTF8String& src, const int pos, int &endpos, const int base)
{
return src.toLong(pos,endpos,base);
}
inline long
GBaseString::toLong(
const GNativeString& src, const int pos, int &endpos, const int base)
{
return src.toLong(pos,endpos,base);
}
/** Returns an unsigned long integer. Implements i18n strtoul. */
inline unsigned long
GBaseString::toULong(const int pos, int &endpos, const int base) const
{
unsigned long retval=0;
if(ptr)
{
retval=(*this)->toULong(pos, endpos, base);
}else
{
endpos=(-1);
}
return retval;
}
inline unsigned long
GBaseString::toULong(
const GUTF8String& src, const int pos, int &endpos, const int base)
{
return src.toULong(pos,endpos,base);
}
inline unsigned long
GBaseString::toULong(
const GNativeString& src, const int pos, int &endpos, const int base)
{
return src.toULong(pos,endpos,base);
}
/** Returns a double. Implements the i18n strtod. */
inline double
GBaseString::toDouble(
const int pos, int &endpos ) const
{
double retval=(double)0;
if(ptr)
{
retval=(*this)->toDouble(pos, endpos);
}else
{
endpos=(-1);
}
return retval;
}
inline double
GBaseString::toDouble(
const GUTF8String& src, const int pos, int &endpos)
{
return src.toDouble(pos,endpos);
}
inline double
GBaseString::toDouble(
const GNativeString& src, const int pos, int &endpos)
{
return src.toDouble(pos,endpos);
}
inline GBaseString &
GBaseString::init(const GP<GStringRep> &rep)
{ GP<GStringRep>::operator=(rep); init(); return *this;}
inline char
GBaseString::operator[] (int n) const
{ return ((n||ptr)?((*this)->data[CheckSubscript(n)]):0); }
inline int
GBaseString::search(char c, int from) const
{ return ptr?((*this)->search(c,from)):(-1); }
inline int
GBaseString::search(const char *str, int from) const
{ return ptr?((*this)->search(str,from)):(-1); }
inline int
GBaseString::rsearch(char c, const int from) const
{ return ptr?((*this)->rsearch(c,from)):(-1); }
inline int
GBaseString::rsearch(const char *str, const int from) const
{ return ptr?((*this)->rsearch(str,from)):(-1); }
inline int
GBaseString::contains(const char accept[], const int from) const
{ return ptr?((*this)->contains(accept,from)):(-1); }
inline int
GBaseString::rcontains(const char accept[], const int from) const
{ return ptr?((*this)->rcontains(accept,from)):(-1); }
inline int
GBaseString::cmp(const GBaseString &s2, const int len) const
{ return GStringRep::cmp(*this,s2,len); }
inline int
GBaseString::cmp(const char *s2, const int len) const
{ return GStringRep::cmp(*this,s2,len); }
inline int
GBaseString::cmp(const char s2) const
{ return GStringRep::cmp(*this,&s2,1); }
inline int
GBaseString::cmp(const char *s1, const char *s2, const int len)
{ return GStringRep::cmp(s1,s2,len); }
inline bool
GBaseString::operator==(const GBaseString &s2) const
{ return !cmp(s2); }
inline bool
GBaseString::operator==(const char *s2) const
{ return !cmp(s2); }
inline bool
GBaseString::operator!=(const GBaseString &s2) const
{ return !!cmp(s2); }
inline bool
GBaseString::operator!=(const char *s2) const
{ return !!cmp(s2); }
inline bool
GBaseString::operator>=(const GBaseString &s2) const
{ return (cmp(s2)>=0); }
inline bool
GBaseString::operator>=(const char *s2) const
{ return (cmp(s2)>=0); }
inline bool
GBaseString::operator>=(const char s2) const
{ return (cmp(s2)>=0); }
inline bool
GBaseString::operator<(const GBaseString &s2) const
{ return (cmp(s2)<0); }
inline bool
GBaseString::operator<(const char *s2) const
{ return (cmp(s2)<0); }
inline bool
GBaseString::operator<(const char s2) const
{ return (cmp(s2)<0); }
inline bool
GBaseString::operator> (const GBaseString &s2) const
{ return (cmp(s2)>0); }
inline bool
GBaseString::operator> (const char *s2) const
{ return (cmp(s2)>0); }
inline bool
GBaseString::operator> (const char s2) const
{ return (cmp(s2)>0); }
inline bool
GBaseString::operator<=(const GBaseString &s2) const
{ return (cmp(s2)<=0); }
inline bool
GBaseString::operator<=(const char *s2) const
{ return (cmp(s2)<=0); }
inline bool
GBaseString::operator<=(const char s2) const
{ return (cmp(s2)<=0); }
inline int
GBaseString::nextNonSpace( const int from, const int len ) const
{ return ptr?(*this)->nextNonSpace(from,len):0; }
inline int
GBaseString::nextChar( const int from ) const
{ return ptr?(*this)->nextChar(from):0; }
inline int
GBaseString::nextSpace( const int from, const int len ) const
{ return ptr?(*this)->nextSpace(from,len):0; }
inline int
GBaseString::firstEndSpace( const int from,const int len ) const
{ return ptr?(*this)->firstEndSpace(from,len):0; }
inline bool
GBaseString::is_valid(void) const
{ return ptr?((*this)->is_valid()):true; }
inline int
GBaseString::ncopy(wchar_t * const buf, const int buflen) const
{if(buf&&buflen)buf[0]=0;return ptr?((*this)->ncopy(buf,buflen)):0;}
inline int
GBaseString::CheckSubscript(int n) const
{
if(n)
{
if (n<0 && ptr)
n += (*this)->size;
if (n<0 || !ptr || n > (int)(*this)->size)
throw_illegal_subscript();
}
return n;
}
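/* Illustration only, not part of the library: the position handling in
#CheckSubscript# can be sketched as a free function. The name
#normalize_position# is hypothetical, and std::out_of_range stands in for
the exception raised by #throw_illegal_subscript#.

```cpp
#include <stdexcept>

// Map a possibly negative position n onto an index in [0, size].
// Negative positions count from the end of the string. The value
// n == size is accepted; per the setat() documentation, that position
// is where a character is appended.
inline int normalize_position(int n, int size)
{
  if (n < 0)
    n += size;
  if (n < 0 || n > size)
    throw std::out_of_range("illegal subscript");
  return n;
}
```

For a string of length 5, position -1 maps to index 4 and position -5 maps
to index 0, while positions 6 and -6 are rejected.
*/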
inline GBaseString::GBaseString(void) { init(); }
inline GUTF8String::GUTF8String(void) { }
inline GUTF8String::GUTF8String(const GUTF8String &str) : GBaseString(str)
{ init(str); }
inline GUTF8String& GUTF8String::operator= (const GP<GStringRep> &str)
{ return init(str); }
inline GUTF8String& GUTF8String::operator= (const GBaseString &str)
{ return init(str); }
inline GUTF8String& GUTF8String::operator= (const GUTF8String &str)
{ return init(str); }
inline GUTF8String& GUTF8String::operator= (const GNativeString &str)
{ return init(str); }
inline GUTF8String
GUTF8String::create( const char *buf, const unsigned int bufsize )
{
#if HAS_WCHAR
return GNativeString(buf,bufsize);
#else
return GUTF8String(buf,bufsize);
#endif
}
inline GUTF8String
GUTF8String::create( const unsigned short *buf, const unsigned int bufsize )
{
return GUTF8String(buf,bufsize);
}
inline GUTF8String
GUTF8String::create( const unsigned long *buf, const unsigned int bufsize )
{
return GUTF8String(buf,bufsize);
}
inline GNativeString::GNativeString(void) {}
#if !HAS_WCHAR
// For Windows CE, GNativeString is essentially GUTF8String
inline
GNativeString::GNativeString(const GUTF8String &str)
: GUTF8String(str) {}
inline
GNativeString::GNativeString(const GP<GStringRep> &str)
: GUTF8String(str) {}
inline
GNativeString::GNativeString(const char dat)
: GUTF8String(dat) {}
inline
GNativeString::GNativeString(const char *str)
: GUTF8String(str) {}
inline
GNativeString::GNativeString(const unsigned char *str)
: GUTF8String(str) {}
inline
GNativeString::GNativeString(const unsigned short *str)
: GUTF8String(str) {}
inline
GNativeString::GNativeString(const unsigned long *str)
: GUTF8String(str) {}
inline
GNativeString::GNativeString(const char *dat, unsigned int len)
: GUTF8String(dat,len) {}
inline
GNativeString::GNativeString(const unsigned short *dat, unsigned int len)
: GUTF8String(dat,len) {}
inline
GNativeString::GNativeString(const unsigned long *dat, unsigned int len)
: GUTF8String(dat,len) {}
inline
GNativeString::GNativeString(const GNativeString &str)
: GUTF8String(str) {}
inline
GNativeString::GNativeString(const int number)
: GUTF8String(number) {}
inline
GNativeString::GNativeString(const double number)
: GUTF8String(number) {}
inline
GNativeString::GNativeString(const GNativeString &fmt, va_list &args)
: GUTF8String(fmt,args) {}
#else // HAS_WCHAR
/// Initialize this string class
inline void
GNativeString::init(void)
{ GBaseString::init(); }
/// Initialize this string class
inline GNativeString &
GNativeString::init(const GP<GStringRep> &rep)
{
GP<GStringRep>::operator=(rep?rep->toNative(GStringRep::NOT_ESCAPED):rep);
init();
return *this;
}
inline GNativeString
GNativeString::substr(int from, int len) const
{ return GNativeString(*this, from, len); }
inline GNativeString &
GNativeString::vformat(const GNativeString &fmt, va_list &args)
{ return (*this = (fmt.ptr?GNativeString(fmt,args):fmt)); }
inline GNativeString
GNativeString::toEscaped( const bool tosevenbit ) const
{ return ptr?GNativeString((*this)->toEscaped(tosevenbit)):(*this); }
inline
GNativeString::GNativeString(const GUTF8String &str)
{
if (str.length())
init(str->toNative(GStringRep::NOT_ESCAPED));
else
init((GP<GStringRep>)str);
}
inline
GNativeString::GNativeString(const GP<GStringRep> &str)
{
if (str)
init(str->toNative(GStringRep::NOT_ESCAPED));
else
init(str);
}
inline
GNativeString::GNativeString(const GBaseString &str)
{
if (str.length())
init(str->toNative(GStringRep::NOT_ESCAPED));
else
init((GP<GStringRep>)str);
}
inline
GNativeString::GNativeString(const GNativeString &fmt, va_list &args)
{
if (fmt.ptr)
init(fmt->vformat(args));
else
init(fmt);
}
inline GNativeString
GNativeString::create( const char *buf, const unsigned int bufsize )
{
return GNativeString(buf,bufsize);
}
inline GNativeString
GNativeString::create( const unsigned short *buf, const unsigned int bufsize )
{
return GNativeString(buf,bufsize);
}
inline GNativeString
GNativeString::create( const unsigned long *buf, const unsigned int bufsize )
{
return GNativeString(buf,bufsize);
}
inline GNativeString&
GNativeString::operator= (const GP<GStringRep> &str)
{ return init(str); }
inline GNativeString&
GNativeString::operator= (const GBaseString &str)
{ return init(str); }
inline GNativeString&
GNativeString::operator= (const GUTF8String &str)
{ return init(str); }
inline GNativeString&
GNativeString::operator= (const GNativeString &str)
{ return init(str); }
inline GNativeString
GNativeString::upcase( void ) const
{
if (ptr) return (*this)->upcase();
return *this;
}
inline GNativeString
GNativeString::downcase( void ) const
{
if (ptr) return (*this)->downcase();
return *this;
}
#endif // HAS_WCHAR
inline bool
operator==(const char *s1, const GBaseString &s2)
{ return !s2.cmp(s1); }
inline bool
operator!=(const char *s1, const GBaseString &s2)
{ return !!s2.cmp(s1); }
inline bool
operator>=(const char *s1, const GBaseString &s2)
{ return (s2.cmp(s1)<=0); }
inline bool
operator>=(const char s1, const GBaseString &s2)
{ return (s2.cmp(s1)<=0); }
inline bool
operator<(const char *s1, const GBaseString &s2)
{ return (s2.cmp(s1)>0); }
inline bool
operator<(const char s1, const GBaseString &s2)
{ return (s2.cmp(s1)>0); }
inline bool
operator> (const char *s1, const GBaseString &s2)
{ return (s2.cmp(s1)<0); }
inline bool
operator> (const char s1, const GBaseString &s2)
{ return (s2.cmp(s1)<0); }
inline bool
operator<=(const char *s1, const GBaseString &s2)
{ return !(s1>s2); }
inline bool
operator<=(const char s1, const GBaseString &s2)
{ return !(s1>s2); }
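/* Illustration only, not part of the library: the operators above take the
C string on the left, so they call #cmp# on the right operand and flip the
direction of the test. The sketch below shows the same idea with a
hypothetical #ToyString# whose #cmp# simply forwards to strcmp.

```cpp
#include <cstring>

// Hypothetical stand-in for a string class exposing cmp(), as with strcmp.
struct ToyString
{
  const char *data;
  int cmp(const char *s2) const { return std::strcmp(data, s2); }
};

// s1 < s2 holds exactly when s2 compares greater than s1.
inline bool operator<(const char *s1, const ToyString &s2)
{ return s2.cmp(s1) > 0; }

// s1 >= s2 holds exactly when s2 compares less than or equal to s1.
inline bool operator>=(const char *s1, const ToyString &s2)
{ return s2.cmp(s1) <= 0; }
```
*/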
// ------------------- The end
#ifdef HAVE_NAMESPACES
}
# ifndef NOT_USING_DJVU_NAMESPACE
using namespace DJVU;
# endif
#endif
#endif