File: find-most-repeated-functions.py

#!/usr/bin/python
#
# Find the top 100 functions that are repeated in multiple .o files, so we can out-of-line those
#
#

import subprocess
from collections import defaultdict

# the odd bash construction here is because some of the .so files returned by find are not object files
# and I don't want xargs to stop when it hits an error
# (the glob is quoted so the shell passes it to find unexpanded; universal_newlines gives us text lines)
a = subprocess.Popen("find instdir/program/ -name '*.so' | xargs echo nm --radix=d --size-sort --demangle | bash", stdout=subprocess.PIPE, shell=True, universal_newlines=True)

#xargs sh -c "somecommand || true"

nameDict = defaultdict(int)
with a.stdout as txt:
    for line in txt:
        # each symbol line is "size type name"; skip anything else (file-name headers, blank lines)
        parts = line.split(None, 2)
        if len(parts) < 3:
            continue
        nameDict[parts[2].strip()] += 1

sizeDict = defaultdict(set)
for k, v in nameDict.items():
    sizeDict[v].add(k)

cnt = 0
for k in sorted(list(sizeDict), reverse=True):
    print(k)
    for v in sizeDict[k]:
        print(v)
    cnt += 1
    if cnt >= 100:
        break

#first = sorted(list(sizeDict))[-1]
#print first


#include/vcl/ITiledRenderable.hxx
# why is gaLOKPointerMap declared inside this header?
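
The comments above explain that the `find | xargs echo | bash` pipeline exists only so that one file `nm` cannot read does not stop the whole run. A minimal sketch of the same idea without a shell follows, assuming Python 3.5+ and an `nm` binary on the PATH. The helper names `parse_nm_output` and `count_symbols` are hypothetical and not part of the original script. It also differs from the original in one respect: it ranks individual names by count rather than grouping names that share a count.

```python
import subprocess
from collections import Counter
from pathlib import Path


def parse_nm_output(text):
    """Yield the symbol name from each 'size type name' line of nm output."""
    for line in text.splitlines():
        # Split into at most three fields so demangled names keep their spaces.
        parts = line.split(None, 2)
        if len(parts) == 3:
            yield parts[2]


def count_symbols(root, pattern="*.so"):
    """Count how many times each symbol name appears across files under root."""
    counts = Counter()
    for path in Path(root).rglob(pattern):
        result = subprocess.run(
            ["nm", "--radix=d", "--size-sort", "--demangle", str(path)],
            stdout=subprocess.PIPE,
            stderr=subprocess.DEVNULL,
            universal_newlines=True,
        )
        # A non-zero exit code means nm could not read this file; skip it and carry on.
        if result.returncode != 0:
            continue
        counts.update(parse_nm_output(result.stdout))
    return counts
```

Under these assumptions, `count_symbols("instdir/program").most_common(100)` would return the 100 most frequently repeated names together with their counts.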