File: jiebacmd.py

python-jieba 0.42.1-5
'''
Usage example (find the top 100 words in abc.txt):

cat abc.txt | python jiebacmd.py | sort | uniq -c | sort -nr -k1 | head -100
'''

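from __future__ import unicode_literals
import sys

# Allow running from a source checkout where the jieba package sits one
# directory above this script.
sys.path.append("../")

import jieba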

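# Input encoding; override it with the first command-line argument.
default_encoding = 'utf-8'

if len(sys.argv) > 1:
    default_encoding = sys.argv[1]

# Read raw bytes (sys.stdin.buffer on Python 3, sys.stdin itself on Python 2)
# so that the requested encoding is actually applied to the input.
stdin = getattr(sys.stdin, 'buffer', sys.stdin)

for raw_line in stdin:
    line = raw_line.decode(default_encoding).strip()
    # Print one segmented word per line so the output can be piped to sort/uniq.
    for word in jieba.cut(line):
        print(word)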
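
The usage example in the docstring counts words with `sort | uniq -c | sort -nr | head -100`. As a rough sketch, the same tally could be done in Python with `collections.Counter`. The helper `top_words` is hypothetical and not part of jieba; it accepts any iterable of tokens, for instance the output of `jieba.cut`. Unlike the shell pipeline, this version skips whitespace-only tokens.

```python
from collections import Counter


def top_words(tokens, n=100):
    """Return the n most frequent non-blank tokens as (word, count) pairs."""
    # Hypothetical helper for illustration; not part of jieba.
    counts = Counter(tok for tok in tokens if tok.strip())
    return counts.most_common(n)


if __name__ == "__main__":
    # Pre-segmented sample tokens; in practice they could come from jieba.cut(line).
    sample = ["我", "爱", "北京", "我", "爱", "我"]
    for word, count in top_words(sample, n=2):
        print(count, word)
```

Counting in the same process avoids the external sort, at the cost of holding one counter entry per distinct word in memory.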