#!/usr/bin/env python
# -*- coding: utf-8 -*-
import os, re, sys, codecs, difflib
from optparse import OptionParser
from subprocess import Popen, PIPE, call
from textwrap import TextWrapper, _whitespace
from collections import defaultdict, OrderedDict, Counter
from platform import system
from unicodedata import east_asian_width
from tempfile import NamedTemporaryFile
usage = "usage: %prog [options] commands\n" \
        "Without any command, it starts in interactive mode.\n" \
        "Read docs/translations.txt for details."
parser = OptionParser(usage=usage)
parser.add_option("--commit_author", help="Commit author",
                  default="Translators <crawl-ref-discuss@lists.sourceforge.net>")
parser.add_option("-d", "--diff", help="Diff format (unified, context, n)",
                  default='n')
parser.add_option("-f", "--force", action="store_true",
                  help="Overwrite files even if no change detected")
parser.add_option("-l", "--language", help="Specify which languages to work on")
parser.add_option("-r", "--resource", help="Specify which resources to work on")
parser.add_option("-s", "--source", help="Work on source files (same as -l en)",
                  action="store_true")
parser.add_option("-t", "--translations", help="Work on translations",
                  action="store_true")
parser.add_option("-a", "--auto_fix", action="store_true",
                  help="Apply some automatic fixes to punctuation")
(options, args) = parser.parse_args()
cmd = args[0] if args else ''
# Absolute path to the source directory
tx_abs_path = os.path.abspath(os.path.join(os.path.dirname(__file__), ".."))
# Absolute path to the git root
git_root = os.path.abspath(os.path.join(tx_abs_path, "..", ".."))
# Relative path from the git root to the transifex directory
tx_rel_path = os.path.join('crawl-ref', 'source')
# Absolute path to the transifex config file
tx_config = os.path.join(tx_abs_path, '.tx', 'config')
# Relative path from the source directory to the descript directory
descript_tx_path = os.path.join('dat', 'descript')
# Relative path from the git root to the descript directory
descript_git_path = os.path.join(tx_rel_path, descript_tx_path)
# Absolute path to the descript directory
descript_abs_path = os.path.join(tx_abs_path, descript_tx_path)
try:
    os.chdir(descript_abs_path)
except OSError as e:
    sys.exit(e)
sep_re = re.compile('[, ]+') # basic separator for user input
txt_sep_re = re.compile('%{4,}') # txt file entry separator
cmd_re = re.compile('<(\w)>') # used to find the key in menu command strings
# These languages need special wrapping with fullwidth character support
east_asian_languages = {'ja_JP', 'ko_KR', 'zh_CN'}
no_space_languages = {'ja_JP', 'zh_CN'}
east_asian_punctuation = u'、。,!:;)'
# This object serves as an intermediate step between txt and ini files.
# Entries are in a raw format: no wrapping, every newline is significant.
# They are indexed by [(lang, res)][key] and are of type Entry.
raw_entries = defaultdict(OrderedDict)
# Main commands
def wrap_txt():
    txt_files.load_files()
    txt_files.merge_files()
    txt_files.update()
    menu.res_files = txt_files

def create_ini():
    txt_files.load_files()
    txt_files.merge_files()
    ini_files.load_files()
    ini_files.update()
    menu.res_files = ini_files

def merge_ini():
    txt_files.load_files()
    res_index.en_src = False # For en, load the fake translations
    ini_files.load_files()
    ini_files.merge_files()
    res_index.en_src = True
    txt_files.update()
    menu.res_files = txt_files

def setup_transifex():
    """Initialize the transifex config file"""
    os.chdir(tx_abs_path)
    call_tx(['init'])
    tx_set = ['set', '--auto-local', '-s', 'en_AU', '-t', 'INI', '--execute']
    for res in res_index.default_resources:
        res_file = res + '.ini'
        source_file = os.path.join(descript_tx_path, res_file)
        expr = os.path.join(descript_tx_path, '<lang>', res_file)
        call_tx(tx_set + ['-r', 'dcss.' + res, expr, '-f', source_file])
    os.chdir(descript_abs_path)

def call_tx(args, silent = False):
    """Wrapper to call the transifex client"""
    if silent:
        stderr = open(os.devnull, 'wb')
    else:
        stderr = None
    # On Windows, the tx client is a script in Python's Scripts directory,
    # so we run it through the interpreter instead of relying on the PATH
    if sys.platform == 'win32':
        python_path = os.path.split(sys.executable)[0]
        tx_path = os.path.join(python_path, 'Scripts', 'tx')
        return call(['python', tx_path] + args, stderr=stderr)
    else:
        return call(['tx'] + args, stderr=stderr)
# Utility functions
def title(text):
    """Add a box of '#' characters around a string. Used when showing a diff."""
    text = "### " + text + " ###"
    dash_line = "#" * len(text)
    text = dash_line + "\n" + text + "\n" + dash_line + "\n"
    return text

def unwrap(text, no_space):
    """Mostly replicates libutil.cc:unwrap_desc"""
    if not text:
        return ""

    # Protect all consecutive empty lines
    text = re.sub("\n{2,}", lambda m: r'\n' * len(m.group(0)), text)
    text = text.replace("\n ", "\\n ")
    # Don't unwrap lua separators at the beginning of a line
    text = text.replace("\n}}", "\\n}}")
    text = text.replace("\n{{", "\\n{{")
    text = text.replace(u"—\n—", u"——")
    text = text.replace(">\n<", "><")
    text = text.replace("\n", " ")
    text = text.replace("\\n", "\n")

    # Remove superfluous spaces surrounded by wide characters
    if no_space:
        i = 0
        j = text.find(" ")
        while j != -1:
            i += j
            # text has been rstripped so no risk of finding a space at the end
            if i and wide_char(text[i-1]) and wide_char(text[i+1]):
                text = text[:i] + text[i+1:]
            else:
                i += 1
            j = text[i:].find(" ")
    return text
def wrap(text, eac, no_space):
    """Wrap long lines using a TextWrapper object"""
    lines = []
    for line in text.splitlines():
        if line:
            # This allows breaking lines between tags
            line = line.replace("><", ">\f<")
            if no_space:
                # Need to rstrip the lines because when the wrapper tries to
                # add a single character to the end of the line, it might fail
                # and add an empty string, preventing the removal of whitespace
                lines += map(unicode.rstrip, FW_NS_wrapper.wrap(line))
            elif eac:
                lines += map(unicode.rstrip, FWwrapper.wrap(line))
            else:
                lines += wrapper.wrap(line)
        elif not lines or lines[-1] != '': # remove consecutive empty lines
            lines += ['']
    lines[:] = [line.replace(">\f<", "><") for line in lines]

    # Languages which have no spaces are split on punctuation, so punctuation
    # marks are sometimes wrapped to the beginning of the next line. Since this
    # is quite ugly, we manually move them back to the end of the previous line.
    if eac or no_space:
        fixed_lines = []
        for line in lines:
            while line and line[0] in east_asian_punctuation and fixed_lines \
                  and fixed_lines[-1][-1] != line[0]:
                fixed_lines[-1] += line[0]
                line = line[1:]
                if line:
                    line = line.lstrip()
                else:
                    line = None
                    break
            if line is not None:
                fixed_lines.append(line)
        lines = fixed_lines
    return "\n".join(lines)
def diff(val, new_val):
    """Returns a diff showing the differences between two lists of lines"""
    try:
        diff_func = {'unified': difflib.unified_diff,
                     'context': difflib.context_diff,
                     'n': difflib.ndiff}[options.diff]
    except KeyError:
        sys.exit("Invalid diff option: %s" % options.diff)
    return "\n".join(diff_func(val, new_val))

def progress(name, i, n):
    """Generic function for showing the progress of an operation in percent"""
    print "\r%s %d%%" % (name, i * 100 / n),
    if i == n:
        print

def emphasize(s):
    """Add terminal control characters to a string to make it bright and
    underlined. On Windows, control characters are not supported so we just
    surround the string with chevrons"""
    if system() != 'Windows':
        return u'\033[1m\033[4m%s\033[0m' % s
    else:
        return '<' + s + '>'

def change_counter(c):
    return " ".join(["%s:%-3d" % (k, c[k]) if c[k] else " " * (len(k) + 4) \
                     for k in sorted(res_index.changes)])

def wide_char(c):
    return c != u'—' and east_asian_width(c) in 'WFA'
def auto_fix(s, lang):
    """Use with care, it can break things"""
    s = auto_fix.re_hyphen.sub(u"\\1—\\2", s)
    s = auto_fix.re_ns.sub(u" \\1", s)
    if lang == 'fr': # These can break languages which use »« for quotes
        s = auto_fix.re_ns_opening_quote.sub(u"« ", s)
        s = auto_fix.re_ns_closing_quote.sub(u" »", s)
        s = auto_fix.re_missing_space.sub(u" \\1", s)
        s = auto_fix.re_missing_space2.sub(u"« ", s)
    if s.find('{{') == -1: # Don't mess with lua strings
        s = auto_fix.re_ascii_single_quotes.sub(u"‘\\1’", s)
        s = auto_fix.re_ascii_double_quotes.sub(u"“\\1”", s)

    # Replace English quotes with localized ones
    if lang == 'fr':
        s = auto_fix.re_english_double_quotes.sub(u"« \\1 »", s)
        s = auto_fix.re_english_single_quotes.sub(u"“\\1”", s)
    elif lang == 'de' or lang == 'cs':
        s = auto_fix.re_english_single_quotes.sub(u"‚\\1‘", s)
        s = auto_fix.re_english_double_quotes.sub(u"„\\1“", s)
    elif lang == 'da':
        s = auto_fix.re_english_single_quotes.sub(u"„\\1“", s)
        s = auto_fix.re_english_double_quotes.sub(u"»\\1«", s)
    elif lang == 'el' or lang == 'es' or lang == 'it' or lang == 'pt':
        s = auto_fix.re_english_double_quotes.sub(u"«\\1»", s)
        s = auto_fix.re_english_single_quotes.sub(u"“\\1”", s)
    elif lang == 'fi':
        s = auto_fix.re_english_single_quotes.sub(u"’\\1’", s)
        s = auto_fix.re_english_double_quotes.sub(u"”\\1”", s)
    elif lang == 'ja':
        s = auto_fix.re_english_single_quotes.sub(u"『\\1』", s)
        s = auto_fix.re_english_double_quotes.sub(u"「\\1」", s)
    elif lang == 'lt':
        s = auto_fix.re_english_single_quotes.sub(u"„\\1”", s)
        s = auto_fix.re_english_double_quotes.sub(u"„\\1”", s)
    elif lang == 'lv' or lang == 'ru':
        s = auto_fix.re_english_double_quotes.sub(u"«\\1»", s)
        s = auto_fix.re_english_single_quotes.sub(u"„\\1”", s)
    elif lang == 'pl' or lang == 'hu':
        s = auto_fix.re_english_single_quotes.sub(u"»\\1«", s)
        s = auto_fix.re_english_double_quotes.sub(u"„\\1”", s)
    return s
# Replace spaced hyphens and en dashes with em dashes
auto_fix.re_hyphen = re.compile("(\s)[-–](\s)")
auto_fix.re_ns = re.compile("\s([!?:;])")
auto_fix.re_ns_opening_quote = re.compile(u"«\s")
auto_fix.re_ns_closing_quote = re.compile(u"\s»")
auto_fix.re_missing_space = re.compile(u"(?<=\w)([!?:;»](?!\d))", re.U)
auto_fix.re_missing_space2 = re.compile(u"«(?=\w)", re.U)
auto_fix.re_ascii_single_quotes = re.compile(u"(?<=\W)'(.*?)'(?=\W)", re.S)
auto_fix.re_ascii_double_quotes = re.compile(u'"(.*?)"', re.S)
auto_fix.re_english_single_quotes = re.compile(u'‘([^‚‘’]*?)’', re.S)
auto_fix.re_english_double_quotes = re.compile(u'“(.*?)”', re.S)

"""Subclasses to properly handle wrapping of fullwidth unicode characters,
which take 2 columns to be displayed on a terminal.
See http://code.activestate.com/lists/python-list/631628/"""
class FullWidthUnicode(unicode):
    def __len__(self):
        return sum(2 if wide_char(c) else 1 for c in self)

    def __getslice__(self, i, j):
        k = 0
        while k < i:
            if wide_char(self[k]):
                i -= 1
            k += 1
        k = i
        while k < j and k < unicode.__len__(self):
            if wide_char(self[k]):
                j -= 1
            k += 1
        return FullWidthUnicode(unicode.__getslice__(self, i, j))

class FullWidthTextWrapper(TextWrapper):
    def __init__(self, **kwargs):
        if 'no_space' in kwargs:
            kwargs.pop('no_space')
            # Those languages don't use spaces. Break lines on punctuation.
            self.wordsep_simple_re = re.compile(u'([\s%s]+)|(—)(?=—)' % east_asian_punctuation)
        TextWrapper.__init__(self, **kwargs)

    def _split(self, text):
        return map(FullWidthUnicode, TextWrapper._split(self, text))
class ResourceIndex():
    """Class which holds current language / resource settings and serves as an
    iterator for ResourceCollection.
    self.changes holds a list of the types of change currently selected
    (changed, new or removed). This is used to select which value to
    display when iterating through entries for showing a diff or writing a
    resource file.
    Note that not selecting "removed" only affects diffs and temporary files
    created for editing. When writing the resource file, removed keys are never
    written no matter what is in the changes array."""
    def __init__(self):
        self.default_languages = [ 'en' ]
        self.default_resources = []
        self.languages = []
        self.resources = []
        self.en_src = True # When True, the english language maps to the source
                           # files. When False, it maps to the fake translations
        self.changes = []
        lang_re = re.compile("[a-z]{2}_[A-Z]{2}")

        # Initialize languages with directories in the descript dir
        # and resources with txt files
        for f in sorted(os.listdir('.')):
            (basename, ext) = os.path.splitext(f)
            if ext.lower() == '.txt':
                self.default_resources.append(basename)
            elif os.path.isdir(f) and lang_re.match(f):
                self.default_languages.append(f)
                if not os.path.exists(f[:2]):
                    os.makedirs(f[:2])

        if options.source:
            self.languages = ['en']
        elif options.language:
            self.set_languages(options.language)
        elif options.translations:
            self.languages = self.default_languages[1:]
        else:
            self.languages = self.default_languages[:]

        if options.resource:
            self.set_resources(options.resource)
        else:
            self.resources = self.default_resources[:]

    def __iter__(self):
        return iter([('',r) if self.en_src and l == 'en' else (l, r) \
                     for l in self.languages for r in self.resources])

    def __len__(self):
        return len(self.languages) * len(self.resources)

    def __str__(self):
        s = ''
        for index_t in "languages", "resources":
            index = getattr(self, index_t)
            s += index_t.title() + ": "
            if self.is_default(index_t):
                s += "All (%d)\n" % len(index)
            else:
                s += ", ".join(index) + "\n"
        return s

    def is_default(self, index_t):
        index = getattr(self, index_t)
        default_index = getattr(self, "default_" + index_t)
        return len(index) == len(default_index)

    def print_index(self, index_t, only_selected = False):
        index = getattr(self, index_t)
        default_index = getattr(self, "default_" + index_t)
        if only_selected:
            idx_l = index
        else:
            idx_l = [emphasize(i) if i in index else i for i in default_index]
        print "%s: %s" % (index_t.title(), ", ".join(idx_l))

    def set_index(self, index_t, opt):
        """When opt is set, the method is being called during program startup
        with the option value as argument. This reduces the verbosity compared
        to calling it in interactive mode."""
        if not opt:
            self.print_index(index_t)
        index = getattr(self, index_t)
        default_index = getattr(self, "default_" + index_t)
        if opt:
            a = opt
        else:
            a = raw_input("Select %s (Empty reset to defaults): " % index_t)
        del index[:]
        for i in sep_re.split(a):
            if i in default_index:
                index.append(i)
            elif i:
                # Accept an unambiguous prefix of a known value
                matches = [m for m in default_index if m.startswith(i)]
                if len(matches) == 1:
                    index.append(matches[0])
                elif not matches:
                    print >> sys.stderr, "Invalid %s: %s" % (index_t[:-1], i)
                else:
                    print >> sys.stderr, "Multiple matches for %s: %s" \
                                         % (i, ", ".join(matches))
        if not index:
            setattr(self, index_t, default_index[:])
            print "Reset %s to default" % index_t
        elif not opt:
            print
            self.print_index(index_t, True)

    def set_languages(self, opt = ''):
        self.set_index('languages', opt)

    def set_resources(self, opt = ''):
        self.set_index('resources', opt)

    def set_changes(self, change_t_list):
        self.changes = change_t_list

    def get_index(self, index_t):
        return getattr(self, index_t)[0]

    def next_index(self, index_t):
        element = self.get_index(index_t)
        default_index = getattr(self, "default_" + index_t)
        if default_index[-1] == element:
            setattr(self, index_t, [default_index[0]])
        else:
            setattr(self, index_t, [default_index[default_index.index(element) + 1]])
class Entry():
    """Class for a raw entry. Elements of raw_entries are of this type."""
    def __init__(self):
        self.value = ''
        self.tags = OrderedDict()

    def __getitem__(self, key):
        if key in self.tags:
            return self.tags[key]
        else:
            return ''

    def __setitem__(self, key, value):
        self.tags[key] = value

class TxtEntry():
    """This class is only used when reading a txt file. Instances of this class
    are never stored; their values are stored directly in the ResourceFile."""
    def __init__(self):
        self.key = ""
        self.value = ""
        self.key_comment = ""
        self.value_comment = ""

    def save(self, res_file):
        res_file.entries[self.key] = self.value
        if self.key_comment:
            res_file.key_comment[self.key] = self.key_comment
        if self.value_comment:
            res_file.value_comment[self.key] = self.value_comment
        # Reset the instance so it can be reused for the next entry
        self.__init__()
class ResourceFile():
    """Holds all the logic which is common between txt and ini files.
    self.entries holds the dictionary of key/value pairs read from the file. It
    is initialized in the subclasses because source files use an OrderedDict.
    self.diff has a dictionary per change type with the new values."""
    def __init__(self, lang, res):
        self.diff = defaultdict(dict)
        self.language = lang
        self.resource = res
        self.path = res + "." + self.ext
        self.path = os.path.join(self.lang_dir, self.path)
        self.git_path = os.path.join(descript_git_path, self.path).replace("\\", "/")
        self.mtime = 0
        self.modified = False
        self.staged = False
        self.new = False

    def __setitem__(self, key, value):
        """Called by the subclass which has already done the conversion.
        Determine the change type and store the new value in the appropriate
        dict of self.diff"""
        if key not in self.entries:
            change_t = 'new'
        elif value != self.entries[key]:
            change_t = 'changed'
        else:
            return
        self.diff[change_t][key] = value
        # If the key was previously removed and edited back in
        # delete it from the 'removed' dict
        if key in self.diff['removed']:
            del self.diff['removed'][key]

    def items(self, diff_only):
        """Returns an iterator to a list of (key, value) tuples, depending on
        what is selected in res_index.changes and what is found in self.diff.
        When diff_only is true, only return changed or new values (for diff and
        edit). When it is false, return the original value for unchanged ones
        (for writing file)."""
        items = []
        for key in self.source_keys():
            found_diff = False
            for change_t in res_index.changes:
                if change_t == 'removed' or change_t not in self.diff: continue
                if key in self.diff[change_t]:
                    items.append((key, self.diff[change_t][key]))
                    found_diff = True
            if not found_diff and not diff_only and key in self.entries \
               and key not in self.diff['removed']:
                items.append((key, self.entries[key]))
        return iter(items)

    def diff_count(self):
        """Returns a Counter object representing what's in self.diff"""
        c = Counter()
        for change_t in self.diff:
            count = len(self.diff[change_t])
            if count:
                c[change_t] = count
        return c

    def lang(self):
        """Source files have self.language empty, but they are in english"""
        return self.language if self.language else 'en'

    def clear(self, keep_entries = False):
        if not keep_entries:
            self.entries.clear()
        self.diff.clear()

    def changed(self):
        """Returns true if there are pending changes for the file depending on
        what is selected in res_index.changes"""
        for change_t in res_index.changes:
            if change_t in self.diff:
                return True
        return False

    def source_keys(self):
        """Returns an ordered list of the keys of the source corresponding to
        this resource file. This list is used as a reference when iterating
        through keys. It helps keep the order consistent and translations can't
        exist if there isn't a source associated to them anyway."""
        keys = self.source_res.entries.keys()
        # To allow submitting new quotes from another resource, they are sorted
        if self.resource == 'quotes' and 'new' in self.diff:
            for k in self.diff['new']:
                if k not in keys:
                    keys.append(k)
            keys.sort()
        return keys

    def diff_txt(self, diff_format):
        """When diff_format is True, returns a string with a diff for each new
        or changed entry. When it is False, returns the new value instead (for
        editing purposes)."""
        diff_txt = ''
        for (key, value) in self.items(True):
            if key in self.entries:
                orig = self.format_entry(key, self.entries[key])
            else:
                orig = ""
            value = self.format_entry(key, value)
            if diff_format:
                diff_txt += diff(orig.splitlines(), value.splitlines()) + "\n"
            else:
                diff_txt += value
                diff_txt += self.separator()
        if 'removed' in res_index.changes and 'removed' in self.diff:
            for k, v in self.diff['removed'].items():
                value = self.format_entry(k, v)
                if diff_format:
                    diff_txt += diff(value.splitlines(), []) + "\n"
                else:
                    diff_txt += value
                    diff_txt += self.separator()
        return diff_txt

    def load(self):
        if not self.entries and not self.diff:
            self.read_file()

    def read_file(self):
        """Called by the subclasses to handle the basic checks. Returns the
        content of the file (list of lines) to the subclass which does the
        actual parsing."""
        if not os.path.exists(self.path):
            return []

        # If the corresponding source file isn't loaded we load it first
        if self.language and not len(self.source_keys()):
            self.source_res.read_file()

        # Don't reload the file if it hasn't changed since we loaded it before.
        file_mtime = os.stat(self.path).st_mtime
        if self.mtime == file_mtime:
            self.clear(True)
            return []
        else:
            self.clear()
            self.mtime = file_mtime

        return codecs.open(self.path, encoding='utf-8').readlines()

    def merge_file(self):
        """Iterate through the entries loaded from the file, convert them to a
        raw format and store them in raw_entries"""
        entries = raw_entries[(self.lang(), self.resource)]
        entries.clear()
        for (key, value) in self.entries.items():
            entries[key] = self.raw_entry(value)

    def update(self, update_removed_keys = True):
        """Update the resource file with the content of raw_entries. New values
        will be converted to the resource format and stored in the appropriate
        diff dictionary by the __setitem__ methods"""
        entries = raw_entries[(self.lang(), self.resource)]
        for key in self.source_keys():
            if key not in entries: continue
            self[key] = entries[key]
        if update_removed_keys and (self.lang() != 'en' or self.ext == 'ini'):
            self.update_removed_keys()

    def write_file(self):
        """Write the content of the resource to a file"""
        f = codecs.open(self.path, "w", encoding='utf-8')
        f.write(self.header())
        for key, e in self.items(False):
            f.write(self.format_entry(key, e))
            f.write(self.separator())
        # Close explicitly so the content is flushed before the file is reread
        f.close()
        self.modified = True
        self.mtime = 0

    def update_removed_keys(self):
        """If the resource has keys which are not present in the source, they
        will be removed. Store them in self.diff['removed'] to show them in diff
        and allow editing (useful to fix renamed keys)."""
        entries = raw_entries[(self.lang(), self.resource)]
        for k in self.entries.keys():
            if k not in self.source_keys() or k not in entries:
                self.diff['removed'][k] = self.entries[k]

    def edit_file(self):
        """Create a temporary file with the values of the changed keys, start
        a text editor, then load the file."""
        tmp = NamedTemporaryFile(prefix=self.language + "-" + self.resource,
                                 suffix="." + self.ext, delete=False)
        tmp.file.write(self.diff_txt(False).encode('utf-8'))
        tmp.file.close()
        EDITOR = os.environ.get('EDITOR','vim')
        try:
            call([EDITOR, tmp.name])
        except OSError:
            print >> sys.stderr, "Cannot start text editor. " \
                                 "Set the EDITOR environment variable."
            return False
        tmp_res = self.__class__(self.language, self.resource)
        tmp_res.path = tmp.name
        tmp_res.read_file()
        tmp_res.merge_file()
        os.remove(tmp.name)
        self.update(False)
        return True
class TxtFile(ResourceFile):
    """Subclass of ResourceFile to handle files in crawl's native format of
    description files."""
    def __init__(self, lang, res):
        if lang:
            self.entries = dict()
            self.source_res = txt_files[('', res)]
            self.lang_dir = lang[:2]
        else:
            self.entries = OrderedDict()
            self.source_res = self
            self.lang_dir = ''
        self.key_comment = dict()
        self.value_comment = dict()
        self.ext = 'txt'
        self.eac = lang in east_asian_languages
        self.no_space = lang in no_space_languages
        ResourceFile.__init__(self, lang, res)

    def __setitem__(self, key, entry):
        """Converts a generic entry to txt format then calls the base class
        __setitem__ method to store it in the appropriate self.diff dict"""
        value = ""
        for tag, tag_value in entry.tags.items():
            # If it has a quote tag, we store the new quote in its own entry
            if tag == 'quote':
                e = Entry()
                e.value = tag_value
                quote_res = txt_files[(self.language, 'quotes')]
                quote_res.load()
                quote_res[key] = e

                # Add the quote resource to the index
                if 'quotes' not in res_index.resources:
                    res_index.resources.append('quotes')

                # If we're adding a foreign quote and the source doesn't have
                # one, we also create it in the corresponding source
                if self.language and key not in quote_res.source_res.entries:
                    en_quote_res = txt_files[('', 'quotes')]
                    en_quote_res.load()
                    en_quote_res[key] = e

                    # Add english to the index
                    if 'en' not in res_index.languages:
                        res_index.languages.insert(0, 'en')
            elif tag_value is True:
                value += ":%s\n" % tag
            else:
                value += ":%s %s\n" % (tag, tag_value)

        if options.auto_fix:
            raw_value = auto_fix(entry.value, self.lang())
        else:
            raw_value = entry.value

        if entry['nowrap']:
            value += raw_value
        else:
            value += wrap(raw_value, self.eac, self.no_space)
        value += "\n"
        ResourceFile.__setitem__(self, key, value)

    def format_entry(self, key, value):
        """Convert the key/value pair to crawl's native desc format"""
        ret = self.key_comment.get(key, "")
        ret += key + "\n\n"
        ret += self.value_comment.get(key, "")
        ret += value
        return ret

    def header(self):
        """Added to the beginning of the file"""
        return self.separator()

    def separator(self):
        """Separate entries in the file"""
        return "%%%%\n"

    def raw_entry(self, value):
        """Convert a value in txt format to a raw entry."""
        e = Entry()
        for line in value.splitlines():
            # Lines of the form ":tag" or ":tag value" are tags
            if len(line) > 1 and line[0] == ':' and line[1] != ' ':
                l = line[1:].rstrip().split(' ', 1)
                e[l[0]] = l[1] if len(l) == 2 else True
            else:
                e.value += line + "\n"
        e.value = e.value.rstrip()
        if not e['nowrap']:
            e.value = unwrap(e.value, self.no_space)
        return e

    def read_file(self):
        """Parse the content of a txt file and store it in self.entries"""
        te = TxtEntry()
        for line in ResourceFile.read_file(self):
            if line[0] == '#':
                if te.key:
                    te.value_comment += line
                else:
                    te.key_comment += line
            elif txt_sep_re.match(line):
                if te.key:
                    te.save(self)
            elif line[0] == '\n' and not te.value:
                continue
            elif not te.key:
                te.key = line.strip()
            else:
                te.value += line
        if te.key:
            te.save(self)
        return len(self.entries)

    def update_removed_keys(self):
        # No removed key in the source, it's the reference.
        # Overrides ResourceFile.update_removed_keys, which is the method the
        # base class actually defines and calls.
        if self.language:
            ResourceFile.update_removed_keys(self)
class IniFile(ResourceFile):
"""Subclass of ResourceFile to handle files in ini format to be pushed to
or pulled from transifex."""
def __init__(self, lang, res):
self.entries = dict()
self.source_res = txt_files[('', res)]
self.ext = 'ini'
self.lang_dir = lang
ResourceFile.__init__(self, lang, res)
def __setitem__(self, key, e):
"""Converts a generic entry in ini format then calls the base class
__setitem__ method to store it in the appropriate self.diff dict"""
# Delete entries with only a link. There's no point in translating them.
if len(e.value) > 1 and e.value[0] == '<' and e.value[-1] == '>'\
and e.value.find("\n") == -1 and e.value[1:].find("<") == -1:
if key in self.entries:
self.diff['removed'][key] = self.entries[key]
del self.entries[key]
return
value = ""
for tag, tag_value in e.tags.items():
if tag_value is True:
value += r":%s\n" % tag
else:
value += r":%s %s\n" % (tag, tag_value)
value += e.value.replace("\n", r'\n') + "\n"
ResourceFile.__setitem__(self, key, value)
def header(self):
return ""
def separator(self):
return ""
def format_entry(self, key, value):
"""Convert the key/value pair to ini format."""
return "%s=%s" % (key, value)
def read_file(self):
"""Parse the content of an ini file and store it in self.entries."""
for line in ResourceFile.read_file(self):
if not line or line[0] == '#' or line.find('=') == -1: continue
(key, value) = line.split('=', 1)
self.entries[key] = value.replace('&quot;', '"').replace('\\\\', '\\')
return len(self.entries)
def raw_entry(self, value):
"""Convert a value in ini format to a raw entry."""
e = Entry()
tag_name = ''
for line in value.rstrip().split(r'\n'):
if len(line) > 1 and line[0] == ':' and line[1] != ' ':
if not e.value:
l = line[1:].split(' ', 1)
e[l[0]] = l[1] if len(l) == 2 else True
else:
tag_name = line[1:]
elif tag_name:
if e[tag_name]:
e[tag_name] += "\n"
e[tag_name] += line
else:
e.value += line + "\n"
e.value = e.value.rstrip()
return e
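# Illustrative sketch, not part of the tool: the one-line encoding that
# IniFile uses for multi-line values. Real newlines become the two characters
# backslash + 'n' when an entry is stored, and raw_entry() splits on that same
# sequence when reading it back. All three helper names are hypothetical.
def _example_ini_encode(value):
    return value.replace("\n", r'\n')

def _example_ini_decode(value):
    return "\n".join(value.rstrip().split(r'\n'))

def _example_ini_parse_line(line):
    # Only the first '=' separates key from value, so values may contain '='.
    key, value = line.split('=', 1)
    return key, _example_ini_decode(value)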
class ResourceCollection(OrderedDict):
"""A container class holding a collection of resource files. It uses
res_index to iterate through its resources"""
def __init__(self):
OrderedDict.__init__(self)
self.diff_count = Counter()
self.git_count = Counter()
self.modified = False
def __iter__(self):
return iter([self[res_i] for res_i in res_index])
def __len__(self):
return len(res_index)
def clear(self):
self.diff_count.clear()
self.modified = False
def paths(self):
return [res.path for res in self]
def merge_files(self):
for i, res in enumerate(self, start=1):
progress("Merging %s files" % self.ext, i, len(self))
res.merge_file()
def load_files(self):
self.clear()
n_files = n_entries = 0
for i, res in enumerate(self, start=1):
progress("Loading %s files" % self.ext, i, len(self))
n = res.read_file()
if n:
n_files += 1
n_entries += n
if n_files:
print "Loaded %d entr%s from %d %s file%s" \
% (n_entries, ["y", "ies"][n_entries!=1],
n_files, self.ext, "s"[n_files==1:])
def update(self):
for i, res in enumerate(self, start=1):
progress("Updating %s files" % self.ext, i, len(self))
res.update()
self.update_diff_count()
def update_diff_count(self):
self.diff_count.clear()
for res in self:
self.diff_count += res.diff_count()
res_index.changes = self.diff_count.keys()
def diff(self, diff_format):
diff_text = ''
for res in self:
if res.changed():
diff_text += title(res.path) + "\n"
diff_text += res.diff_txt(diff_format) + "\n"
return diff_text
def show_diff(self):
diff_text = self.diff(True)
try:
Popen("less", stdin=PIPE).communicate(diff_text.encode('utf-8'))
except OSError:
print diff_text
def edit_files(self):
for res in self:
if res.changed():
if not res.edit_file():
break
self.update_diff_count()
def write_files(self):
for res in self:
if res.changed() or (options.force and list(res.items(False))):
res.write_file()
for change_t in res_index.changes:
if change_t in res.diff:
del res.diff[change_t]
self.update_diff_count()
def undo_changes(self):
for res in self:
res.clear(True)
self.diff_count.clear()
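# Illustrative sketch, not part of the tool: the plural tricks used in
# load_files(). A boolean indexes a list (False -> 0, True -> 1), and
# "s"[n == 1:] slices the "s" away when n is 1. The function name is
# hypothetical.
def _example_loaded_message(n_entries, n_files, ext):
    return "Loaded %d entr%s from %d %s file%s" % (
        n_entries, ["y", "ies"][n_entries != 1],
        n_files, ext, "s"[n_files == 1:])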
class TxtCollection(ResourceCollection):
"""Collection of txt files. It holds a few git methods"""
def __init__(self):
self.ext = 'txt'
ResourceCollection.__init__(self)
def __missing__(self, key):
self[key] = TxtFile(*key)
return self[key]
def refresh_state(self):
"""Run git status and check the result for each file in the collection"""
if not git: return
git_states = dict()
self.git_count.clear()
for line in Popen(["git", "status", "--porcelain"] + self.paths(),
stdout=PIPE).communicate()[0].splitlines():
git_states[line[3:]] = line[0:2]
for res in self:
if res.git_path not in git_states:
res.modified = res.staged = res.new = False
continue
st = git_states[res.git_path]
if st[0] == 'M' or st[0] == 'A':
res.staged = True
self.git_count['staged'] += 1
if st[1] == 'M':
res.modified = True
self.git_count['modified'] += 1
elif st == '??':
res.new = True
self.git_count['new'] += 1
def git_status(self):
call(["git", "status"] + self.paths())
def git_add_hunks(self):
self.git_add(True)
def git_add(self, hunks = False):
files = []
for res in self:
if res.modified or res.new:
files.append(res.path)
cmd_list = ['git', 'add']
if hunks:
cmd_list.append('-p')
cmd_list += files
call(cmd_list)
def git_reset(self):
files = []
for res in self:
if res.modified:
files.append(res.path)
elif res.new:
os.remove(res.path)
cmd_list = ['git', 'checkout']
cmd_list += files
call(cmd_list)
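# Illustrative sketch, not part of the tool: how refresh_state() reads the
# output of "git status --porcelain". In that short format each line is a
# two-character state, a space, then the path. Both function names are
# hypothetical, and the sketch takes text rather than running git.
def _example_parse_porcelain(output):
    states = {}
    for line in output.splitlines():
        states[line[3:]] = line[0:2]
    return states

def _example_classify(state):
    # Mirrors the checks in refresh_state(): the first column is the index
    # (staged) state, the second column is the working tree state.
    flags = set()
    if state[0] == 'M' or state[0] == 'A':
        flags.add('staged')
    if state[1] == 'M':
        flags.add('modified')
    elif state == '??':
        flags.add('new')
    return flags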
class IniCollection(ResourceCollection):
"""Collection of ini files with methods to interface with the transifex
client push and pull commands"""
def __init__(self):
self.ext = 'ini'
ResourceCollection.__init__(self)
def __missing__(self, key):
self[key] = IniFile(*key)
return self[key]
def refresh_state(self):
self.modified = False
for res in self:
if res.modified:
self.modified = True
def tx_pull(self):
tx_cmd = ['pull']
if options.force:
tx_cmd.append('-f')
all_lang = res_index.is_default('languages')
all_res = res_index.is_default('resources')
if all_lang and all_res:
call_tx(tx_cmd + ['-a'])
elif all_res:
for lang in res_index.languages:
call_tx(tx_cmd + ['-l', lang])
elif all_lang:
for res in res_index.resources:
call_tx(tx_cmd + ['-r', 'dcss.' + res])
else:
for res in self:
call_tx(tx_cmd + ['-l', res.lang(), '-r', 'dcss.' + res.resource])
def tx_push(self):
tx_push = ['push']
if options.force:
tx_push.append('-f')
for res in self:
if not res.modified: continue
resource = ['-r', 'dcss.' + res.resource]
language = ['-l', res.lang()]
if not res.language:
# We push the source then reset the fake translation resource
ret = call_tx(tx_push + ['-s'] + resource)
if self[('en', res.resource)].entries:
a = raw_input("Reset the %s fake translation (y/n)? " % res.resource).lower()
if a and a[0] == 'y':
call_tx(['delete', '-f'] + language + resource)
else:
ret = call_tx(tx_push + ['-t'] + language + resource)
if ret == 0:
res.modified = False
class Menu(OrderedDict):
"""Create a simple text-based interactive menu.
The inherited OrderedDict stores the groups of command labels.
cmds maps each command hotkey letter to either a function or a list whose
first member is the function and whose remaining members are its arguments.
cmd is the command string; it can be used to queue several commands.
res_files points to the resource file collection currently being worked
on."""
def __init__(self, cmd = ''):
OrderedDict.__init__(self)
self.cmds = dict()
self.cmd = cmd
self.menu_desc = ''
self.res_files = txt_files
self.show_res = len(res_index.languages) == 1
def __missing__(self, key):
self[key] = []
return self[key]
def change_summary(self):
if not self.res_files.diff_count:
print "No changes\n"
return
print "Change summary:"
lang_total = defaultdict(Counter)
padding_size = 5
for res in self.res_files:
if not res.diff: continue
if self.show_res:
padding_size = max(padding_size, len(res.path))
else:
lang_total[res.lang()] += res.diff_count()
cur_lang = ''
for res in self.res_files:
if not res.diff: continue
lang = res.lang()
if lang != cur_lang and lang_total[lang]:
print "%-*s %s" % (padding_size, lang,
change_counter(lang_total[lang]))
cur_lang = lang
if self.show_res:
print "%-*s %s" % (padding_size, res.path,
change_counter(res.diff_count()))
print "%-*s %s" % (padding_size, 'Total',
change_counter(self.res_files.diff_count))
print
def git_summary(self):
if not self.res_files.git_count:
return
padding_size = 5
print "Git status:"
for key, count in self.res_files.git_count.most_common():
print "%-*s: %d" % (padding_size, key, count)
print
def git_commit(self):
call(['git', 'commit', '-e', '-s', '-m', '[Transifex]',
'--author=' + options.commit_author])
def toggle_details(self):
self.show_res = not self.show_res
def set_languages(self):
res_index.set_languages()
self.res_files.update_diff_count()
self.show_res = len(res_index.languages) == 1
def set_resources(self):
res_index.set_resources()
self.res_files.update_diff_count()
def set_changes(self):
"""Creates a submenu to select which kind of change to work on."""
submenu = Menu()
lbl = "Select entries"
change_ts = self.res_files.diff_count.keys()
for change_t in change_ts + ['all']:
cmd_lbl = '<' + change_t[0] + '>' + change_t[1:]
if change_t == 'all':
submenu.add_cmd(lbl, cmd_lbl, [res_index.set_changes, change_ts])
else:
submenu.add_cmd(lbl, cmd_lbl, [res_index.set_changes, [change_t]])
submenu.build_menu_desc()
submenu.show_menu()
def next_index(self, index_t):
"""When only one index is selected (language or resource), this command
jumps to the next one. It searches for one with pending changes. If none
is found after looping through all of them, it simply selects the next
one."""
current = res_index.get_index(index_t)
while 1:
res_index.next_index(index_t)
self.res_files.update_diff_count()
if self.res_files.diff_count or current == res_index.get_index(index_t):
break
# If we haven't found something with a change, we looped. In this case,
# we advance one more time.
if not self.res_files.diff_count:
res_index.next_index(index_t)
res_index.print_index(index_t, True)
def add_cmd(self, group, label, cmd):
"""Adds a command to the menu. The label must contain a letter between
chevrons which will be the command hotkey"""
m = cmd_re.search(label)
if not m: sys.exit("Invalid command: %s" % label)
key = m.group(1)
if key in self.cmds: sys.exit("Duplicate command for key %s: %s and %s" \
% (key, label, self.cmds[key]))
if system() != 'Windows':
label = label.replace("<" + key + ">", emphasize(key))
self[group].append(label)
self.cmds[key] = cmd
def build_main_menu(self):
self.cmds.clear()
self.clear()
self.menu_desc = ''
lbl_cmds = "Commands"
self.add_cmd(lbl_cmds, 'wrap <t>xt files', wrap_txt)
self.add_cmd(lbl_cmds, '<m>erge ini files', merge_ini)
self.add_cmd(lbl_cmds, 'update <i>ni files', create_ini)
self.add_cmd(lbl_cmds, '<q>uit', sys.exit)
lbl_review = "Review changes"
if self.res_files.diff_count or options.force:
self.add_cmd(lbl_review, "<w>rite files", self.res_files.write_files)
if self.res_files.diff_count:
if self.show_res:
self.add_cmd(lbl_review, "<v>iew languages", self.toggle_details)
else:
self.add_cmd(lbl_review, "<v>iew resources", self.toggle_details)
self.add_cmd(lbl_review, "show <d>iff", self.res_files.show_diff)
self.add_cmd(lbl_review, "<e>dit", self.res_files.edit_files)
self.add_cmd(lbl_review, "e<x>punge changes", self.res_files.undo_changes)
lbl_select = "Select"
self.add_cmd(lbl_select, "<l>anguages", self.set_languages)
self.add_cmd(lbl_select, "<r>esources", self.set_resources)
if len(self.res_files.diff_count) > 1:
self.add_cmd(lbl_select, "chan<g>es", self.set_changes)
if len(res_index.resources) == 1:
self.add_cmd(lbl_select, "<n>ext resource", [self.next_index, 'resources'])
elif len(res_index.languages) == 1:
self.add_cmd(lbl_select, "<n>ext language", [self.next_index, 'languages'])
if git:
lbl_git = "Git"
if self.res_files.git_count:
self.add_cmd(lbl_git, "<s>tatus", self.res_files.git_status)
if self.res_files.git_count['modified'] or self.res_files.git_count['new']:
self.add_cmd(lbl_git, "<a>dd", self.res_files.git_add)
self.add_cmd(lbl_git, "select <h>unks", self.res_files.git_add_hunks)
self.add_cmd(lbl_git, "chec<k>out", self.res_files.git_reset)
if self.res_files.git_count['staged']:
self.add_cmd(lbl_git, "<c>ommit", self.git_commit)
if transifex:
lbl_tx = "Transifex"
self.add_cmd(lbl_tx, "<p>ull", ini_files.tx_pull)
if self.res_files.modified:
self.add_cmd(lbl_tx, "p<u>sh", self.res_files.tx_push)
self.build_menu_desc()
print
self.change_summary()
self.git_summary()
def build_menu_desc(self):
for group, labels in self.items():
self.menu_desc += "%s: %s" % (group, ", ".join(labels)) + "\n"
def show_menu(self):
"""Read the command-line argument and treat each letter as a command.
When no commands are left, switch to interactive mode."""
if not self.cmd:
self.cmd = raw_input(self.menu_desc).lower()
choice = self.cmd[:1]
self.cmd = self.cmd[1:]
if choice in self.cmds:
func = self.cmds[choice]
if isinstance(func, list):
# If it's a list, then the first item is the function,
# the other ones are arguments
func[0](*func[1:])
else:
func()
else:
print "Huh?"
self.cmd = ""
def main_menu(self):
print res_index,
while 1:
self.res_files.refresh_state()
self.build_main_menu()
self.show_menu()
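# Illustrative sketch, not part of the tool: the hotkey dispatch performed by
# Menu.add_cmd() and Menu.show_menu(). cmd_re is defined elsewhere in this
# file; the pattern below is only an assumption about its shape, and every
# name prefixed with _example is hypothetical.
import re

_example_cmd_re = re.compile(r'<(\w)>')

def _example_run_queue(commands, queue):
    """Run queued hotkeys in order; return the first unknown one, if any."""
    for choice in queue.lower():
        func = commands.get(choice)
        if func is None:
            # Like show_menu(), an unknown key drops the rest of the queue.
            return choice
        if isinstance(func, list):
            # List form: the first item is the function, the rest are
            # its arguments.
            func[0](*func[1:])
        else:
            func()
    return None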
wrapper_args = {
'width' : 79,
'break_on_hyphens' : False,
'break_long_words' : False,
'replace_whitespace' : False}
wrapper = TextWrapper(**wrapper_args)
# Use hardcoded whitespace characters instead of \s because the latter also
# matches non-breaking spaces (see textwrap.py:30).
wrapper.wordsep_simple_re_uni = re.compile(r'([%s]+)' % _whitespace)
FWwrapper = FullWidthTextWrapper(**wrapper_args)
wrapper_args['no_space'] = True
FW_NS_wrapper = FullWidthTextWrapper(**wrapper_args)
# We initialize the resource index early because we might need it if we have to
# initialize the transifex configuration.
res_index = ResourceIndex()
# Can we use the transifex client?
try:
call_tx([], True)
transifex = True
except OSError:
transifex = False
# Is transifex configured?
if transifex:
if not os.path.exists(tx_config):
setup_transifex()
# Can we use git?
try:
call(['git'], stdout=open(os.devnull, 'wb'))
git = True
except OSError:
git = False
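# Illustrative sketch, not part of the tool: the probe pattern used above for
# both the transifex client and git. Launching a program that is not
# installed raises OSError, so catching it tells us whether the tool is
# available. The function name is hypothetical. Note that the probed program
# really runs with no arguments, so this only suits tools that exit promptly
# in that case.
import os
from subprocess import call

def _example_have_command(name):
    try:
        with open(os.devnull, 'wb') as devnull:
            call([name], stdout=devnull, stderr=devnull)
        return True
    except OSError:
        return False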
# Create the global variables for managing resources.
txt_files = TxtCollection()
ini_files = IniCollection()
menu = Menu(cmd)
menu.main_menu()