# encoding: utf-8
from __future__ import print_function
# NOTE: run this script with LANG=en_US.UTF-8
import sys
from collections import defaultdict
import operator
import re
import subprocess

author_re = re.compile(r'(.*) <.*>')
pair_programming_re = re.compile(r'^\((.*?)\)')
excluded = set(["pypy", "convert-repo", "hgattic", 'Miss Islington (bot)',
                "remote-hg", "Unknown"])

alias = {
    'Anders Chrigstrom': ['arre'],
    'Antonio Cuni': ['antocuni', 'anto', 'antonio'],
    'Armin Rigo': ['arigo', 'arfigo', 'armin', 'arigato'],
    'Maciej Fijałkowski': ['fijal', 'Maciej Fijalkowski'],
    'Carl Friedrich Bolz-Tereick': ['Carl Friedrich Bolz', 'cfbolz', 'cf',
                                    'cbolz', 'CF Bolz-Tereick'],
    'Samuele Pedroni': ['pedronis', 'samuele', 'samule'],
    'Richard Plangger': ['planrich', 'plan_rich'],
    'Remi Meier': ['remi'],
    'Michael Hudson-Doyle': ['mwh', 'Michael Hudson', 'michaelh'],
    'Holger Krekel': ['hpk', 'holger krekel', 'holger', 'hufpk'],
    "Amaury Forgeot d'Arc": ['afa', 'amauryfa@gmail.com', 'amaury'],
    'Alex Gaynor': ['alex', 'agaynor'],
    'David Schneider': ['bivab', 'david'],
    'Christian Tismer': ['chris', 'christian', 'tismer',
                         'tismer@christia-wjtqxl.localdomain'],
    'Benjamin Peterson': ['benjamin'],
    'Håkan Ardö': ['hakan', 'hakanardo', 'Hakan Ardo'],
    'Niklaus Haldimann': ['nik'],
    'Alexander Schremmer': ['xoraxax'],
    'Anders Hammarquist': ['iko'],
    'David Edelsohn': ['edelsoh', 'edelsohn', 'opassembler.py'],
    'Niko Matsakis': ['niko'],
    'Jakub Gustak': ['jlg'],
    'Guido Wesdorp': ['guido'],
    'Michael Foord': ['mfoord'],
    'Mark Pearse': ['mwp'],
    'Eric van Riet Paap': ['ericvrp'],
    'Jacob Hallen': ['jacob', 'jakob', 'jacob hallen'],
    'Anders Lehmann': ['ale', 'anders'],
    'Vanessa Freudenberg': ['bert', 'Bert Freudenberg'],
    'Boris Feigin': ['boris', 'boria'],
    'Valentino Volonghi': ['valentino', 'dialtone'],
    'Aurelien Campeas': ['aurelien', 'aureliene'],
    'Adrien Di Mascio': ['adim'],
    'Jacek Generowicz': ['Jacek', 'jacek'],
    'Jim Hunziker': ['landtuna@gmail.com'],
    'Kristjan Valur Jonsson': ['kristjan@kristjan-lp.ccp.ad.local'],
    'Laura Creighton': ['lac'],
    'Aaron Iles': ['aliles'],
    'Ludovic Aubry': ['ludal', 'ludovic'],
    'Lukas Diekmann': ['l.diekmann', 'ldiekmann'],
    'Matti Picus': ['Matti Picus matti.picus@gmail.com',
                    'matthp', 'mattip', 'mattip>', 'matti'],
    'Michael Cheng': ['mikefc'],
    'Richard Emslie': ['rxe'],
    'Roberto De Ioris': ['roberto@goyle', 'roberto@mrspurr'],
    'Sven Hager': ['hager'],
    'Tomo Cocoa': ['cocoatomo'],
    'Romain Guillebert': ['rguillebert', 'rguillbert', 'romain',
                          'Guillebert Romain'],
    'Ronan Lamy': ['ronan'],
    'Edd Barrett': ['edd'],
    'Manuel Jacob': ['mjacob'],
    'Rami Chowdhury': ['necaris'],
    'Stanislaw Halik': ['Stanislaw Halik', 'w31rd0'],
    'Wenzhu Man': ['wenzhu man', 'wenzhuman'],
    'Anton Gulenko': ['anton gulenko', 'anton_gulenko'],
    'Richard Lancaster': ['richardlancaster'],
    'William Leslie': ['William ML Leslie'],
    'Spenser Bauman': ['Spenser Andrew Bauman'],
    'Raffael Tfirst': ['raffael.tfirst@gmail.com'],
    'timo': ['timo@eistee.fritz.box'],
    'Jasper Schulz': ['Jasper.Schulz', 'jbs'],
    'Aaron Gallagher': ['"Aaron Gallagher'],
    'Yasir Suhail': ['yasirs'],
    'Squeaky': ['squeaky'],
    "Dodan Mihai": ['mihai.dodan@gmail.com'],
    'Wim Lavrijsen': ['wlav'],
    'Toon Verwaest': ['toon', 'tverwaes'],
    'Seo Sanghyeon': ['sanxiyn'],
    'Leonardo Santagada': ['santagada'],
    'Laurence Tratt': ['ltratt'],
    'Pieter Zieschang': ['pzieschang', 'p_zieschang@yahoo.de'],
    'John Witulski': ['witulski'],
    'Andrew Lawrence': ['andrew.lawrence@siemens.com', 'andrewjlawrence'],
    'Batuhan Taskaya': ['isidentical'],
    'Ondrej Baranovič': ['nulano', 'Nulano'],
    'Brad Kish': ['rtkbkish'],
    'Michał Górny': ['mgorny'],
    'David Hewitt': ['davidhewitt'],
    'Adrian Kuhn': ['akuhn'],
    'David Malcolm': ['dmalcolm'],
    'Simon Cross': ['hodgestar'],
    'Łukasz Langa': ['ambv'],
    'Wenzel Jakob': ['Jakob Wenzel'],
    'Maxwell Bernstein': ['Max Bernstein'],
    'Paul Gey': ['narpfel'],
    'Bartosz Skowron': ['getxsick'],
    'Beatrice During': ['bea'],
    'Mikael Schönenberg': ['micke'],
    'Oscar Nierstrasz': ['oscar'],
    'Tim Felgentreff': ['timfel'],
    'Tadeu Zagallo': ['tadeuzagallo'],
}

# Invert the alias table: map every nickname to its canonical author name.
alias_map = {}
for name, nicks in alias.items():
    for nick in nicks:
        alias_map[nick] = name


def get_canonical_author(name):
    # Strip a trailing " <email>" part, if present, before the alias lookup.
    match = author_re.match(name)
    if match:
        name = match.group(1)
    return alias_map.get(name, name)


# Nicknames found in commit messages that could not be mapped to an author.
ignored_nicknames = defaultdict(int)


def get_more_authors(log):
    # Pair-programming credits appear as "(nick1, nick2) message" at the
    # start of the commit subject.
    match = pair_programming_re.match(log)
    if not match:
        return set()
    ignore_words = ['around', 'consulting', 'yesterday', 'for a bit', 'thanks',
                    'in-progress', 'bits of', 'even a little', 'floating',
                    'a bit', 'reviewing', 'looking', 'advising', 'partly', 'ish',
                    'watching', 'mostly', 'jumping', 'twitch', 's390x']
    sep_words = ['and', ';', '+', '/', 'with special by']
    nicknames = match.group(1)
    for word in ignore_words:
        nicknames = nicknames.replace(word, '')
    for word in sep_words:
        nicknames = nicknames.replace(word, ',')
    nicknames = [nick.strip().lower() for nick in nicknames.split(',')]
    authors = set()
    for nickname in nicknames:
        if not nickname:
            continue
        author = alias_map.get(nickname)
        if not author:
            ignored_nicknames[nickname] += 1
        else:
            authors.add(author)
    return authors


def main(show_numbers):
    # The literal double quotes in --format are part of git's output and are
    # stripped again from each line below.
    txt = subprocess.check_output(
        ["git", "log", "--all", "--no-merges", '--format="%aN#<%aE>#%s"'],
        text=True)
    authors_count = defaultdict(int)
    with open("/tmp/authors", "wt", encoding="utf8") as fid:
        fid.write(txt)
    for line in txt.split('\n'):
        if "#" not in line:
            continue
        if "#Notes added by" in line:
            continue
        author_src, author_mail, description = line.strip('"').split("#", 2)
        authors = set()
        authors.add(get_canonical_author(author_src))
        authors.update(get_more_authors(description))
        for author in authors:
            if author not in excluded:
                authors_count[author] += 1

    # enable the next lines to get the list of nicknames which could not be
    # parsed from the description
    if 0:
        items = list(ignored_nicknames.items())
        items.sort(key=operator.itemgetter(1), reverse=True)
        for name, n in items:
            if show_numbers:
                print("%5d '%s'" % (n, name))
            else:
                print("'%s'" % name)
        return

    # Print the authors sorted by number of commits, most active first.
    items = list(authors_count.items())
    items.sort(key=operator.itemgetter(1), reverse=True)
    for name, n in items:
        if show_numbers:
            print('%5d %s' % (n, name))
        else:
            print(' ' + name)


if __name__ == '__main__':
    show_numbers = '-n' in sys.argv
    main(show_numbers)