#!/usr/bin/python3
# Copyright (C) 2022-2023 J.F.Dockes
#
# License: GPL 2.1
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2.1 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU Lesser General Public License for more details.
#
# You should have received a copy of the GNU Lesser General Public License
# along with this program; if not, write to the
# Free Software Foundation, Inc.,
# 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
import sys
from getopt import getopt
from recoll import recoll


def msg(s):
    print(f"{s}", file=sys.stderr)


def Usage():
    msg(
        "Usage: snippets.py [-c conf] [-i extra_index] [-w ctxwords] [-n] <recoll query>"
    )
    sys.exit(1)


if len(sys.argv) < 2:
    Usage()

confdir = ""
extra_dbs = []
ctxwords = 4
nohl = False

# Process options: [-c confdir] [-i extra_db [-i extra_db] ...] [-w ctxwords] [-n]
try:
    options, args = getopt(sys.argv[1:], "c:i:w:n")
except Exception as ex:
    # Report option parsing errors on stderr, like the other diagnostics
    msg(f"{ex}")
    sys.exit(1)
for opt, val in options:
    if opt == "-c":
        confdir = val
    elif opt == "-n":
        nohl = True
    elif opt == "-i":
        extra_dbs.append(val)
    elif opt == "-w":
        ctxwords = int(val)
    else:
        msg(f"Bad opt: {opt}")
        Usage()

if len(args) == 0:
    msg("No query found in command line")
    Usage()
qs = " ".join(args)
# msg(f"QUERY: [{qs}]")
db = recoll.connect(confdir=confdir, extra_dbs=extra_dbs)
query = db.query()
query.execute(qs)
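

# Highlighting callbacks: return the markup to insert before and after a match.
# Note: hlmeths is created here but is not passed to getsnippets() below.
class HL:
    def startMatch(self, i):
        return "<span class='hit'>"

    def endMatch(self):
        return "</span>"


hlmeths = HL()
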
for doc in query:
    print("DOC %s" % doc.title)
    snippets = query.getsnippets(
        doc, maxoccs=-1, ctxwords=ctxwords, nohl=nohl, sortbypage=False
    )
    print("Got %d snippets" % len(snippets))
    for snip in snippets:
        print("Page %d term [%s] snippet [%s]" % (snip[0], snip[1], snip[2]))