From jc254@newton.cam.ac.uk Mon Jun 19 16:13:54 2006
Date: Mon, 19 Jun 2006 15:06:51 +0100
From: Jonathan H N Chin <jc254@newton.cam.ac.uk>
Subject: findimagedupes2

Finally! Very sorry for the long silence.

I attach a shiny new findimagedupes2 for testing.

INSTALLATION:

As well as the original dependencies, my version relies on:

	DB_File
	Digest::MD5
	File::MimeInfo::Magic	(libfile-mimeinfo-perl)
	File::Temp
	Inline			(libinline-perl)
	MIME::Base64
	Pod::Usage

Aside from the two marked, I think these modules are part of Perl.

Do "perldoc findimagedupes2" for documentation.

DONE:

d1) I have learned enough about Inline::C to integrate my
    rewritten-into-C version of countbits() into the Perl source.
    (Originally I spawned a separate process, piped the
    fingerprints into it and read the match indices out).
    The new method is much much nicer.

d2) Joerg's threshold option is included.
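
To make d1 and d2 concrete, here is a minimal sketch in C of a bit-counting
comparison with a threshold. It is not the code shipped in findimagedupes2.
The 32-byte fingerprint size, the flat array layout, the max_diff parameter
and the find_matches() helper are all assumptions made for illustration; only
the name countbits() comes from the text above.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: fingerprints are fixed-size bit strings. 32 bytes (256 bits)
   is an illustrative choice, not taken from the findimagedupes2 source. */
#define FP_BYTES 32

/* Number of set bits in one byte (clears the lowest set bit each pass). */
unsigned popcount8(uint8_t b)
{
    unsigned n = 0;
    while (b) {
        b &= (uint8_t)(b - 1);
        n++;
    }
    return n;
}

/* Number of bit positions at which two fingerprints differ. */
unsigned countbits(const uint8_t *a, const uint8_t *b)
{
    unsigned diff = 0;
    for (size_t i = 0; i < FP_BYTES; i++)
        diff += popcount8((uint8_t)(a[i] ^ b[i]));
    return diff;
}

/* Hypothetical helper: compare every pair (i < j) of n fingerprints stored
   back to back in fps, and record the pairs that differ in at most max_diff
   bits. Pair k is written to pairs[2*k] and pairs[2*k+1]. At most max_pairs
   pairs are stored, but the return value counts every match found. */
size_t find_matches(const uint8_t *fps, size_t n, unsigned max_diff,
                    size_t *pairs, size_t max_pairs)
{
    size_t found = 0;
    for (size_t i = 0; i < n; i++) {
        for (size_t j = i + 1; j < n; j++) {
            if (countbits(fps + i * FP_BYTES, fps + j * FP_BYTES) <= max_diff) {
                if (found < max_pairs) {
                    pairs[2 * found] = i;
                    pairs[2 * found + 1] = j;
                }
                found++;
            }
        }
    }
    return found;
}
```

The double loop is quadratic in the number of files, so the payoff from moving
to C comes entirely from making each countbits() call cheap.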

TODO:

t1) I still need to read up on how to make things get stored into
    /usr/lib/findimagedupes2/ (or wherever) and not into ./_Inline/.
    (Packaging issue?)

    For now, one can do:

	DIRECTORY=/usr/local/lib/findimagedupes2
	rm -rf _Inline
	./findimagedupes2
	mkdir $DIRECTORY
	mv _Inline/config _Inline/lib $DIRECTORY/.
	chmod -R ugo+rX,u+w,go-w $DIRECTORY
	rm -rf _Inline

    For a different $DIRECTORY, edit the findimagedupes2 source.

t2) Write a few example command lines for the manpage.

t3) gqview collection output (--collection) is not implemented.
    Would be useful to know the official format of the file.

t4) I should improve warning/error message consistency/appearance.

t5) Progress bars, etc., are not implemented. Are they required?


DEBIAN BUGS:

All can be closed, I think:

  #86994:
    Hopefully now even harder to tickle ImageMagick weaknesses.

  #86996:
    I haven't implemented Dupes:: lines output so this is a non-issue.
    My --program/--script options escape names with Perl's quotemeta().

  #87010:
    1) This version explicitly and deliberately does not do recursion.
    2) This version can read a file-list from stdin.

  #87013:
    Integrated in new --program/--script options, which use this
    algorithm to merge pairs of matches into sets before output.

  #87017:
    Integrated in new --add option.

  #87024:
    Pure Perl comparison replaced by new integrated C function.
    Still O(n^2) but massive speedup from squishing the constant factor.
    Should be able to compare 100k files in around 10 minutes.
    Runtime is now dominated by the time it takes to do fingerprinting.

  #113871:
    "It works for me." (tm)
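
The note on #87013 mentions merging pairs of matches into sets before output.
The algorithm from that bug report is not reproduced here, so the sketch below
uses a plain union-find (disjoint-set) structure as a stand-in. It may well
differ from what findimagedupes2 does. The function names and the flat pair
layout are assumptions.

```c
#include <stddef.h>

/* Follow parent links to the representative of x's set, shortening the
   path along the way (path halving). */
size_t set_find(size_t *parent, size_t x)
{
    while (parent[x] != x) {
        parent[x] = parent[parent[x]];
        x = parent[x];
    }
    return x;
}

/* Hypothetical helper: group n files into sets given npairs matched pairs.
   Pair k is (pairs[2*k], pairs[2*k+1]). parent must have room for n entries.
   On return, parent[i] is the smallest file index in the set containing
   file i, so two files belong together exactly when their entries are equal. */
void merge_pairs(size_t *parent, size_t n, const size_t *pairs, size_t npairs)
{
    /* Start with every file in a set of its own. */
    for (size_t i = 0; i < n; i++)
        parent[i] = i;

    /* Join the two sets named by each pair, keeping the smaller root. */
    for (size_t k = 0; k < npairs; k++) {
        size_t a = set_find(parent, pairs[2 * k]);
        size_t b = set_find(parent, pairs[2 * k + 1]);
        if (a < b)
            parent[b] = a;
        else
            parent[a] = b;
    }

    /* Flatten so that every entry points directly at its representative. */
    for (size_t i = 0; i < n; i++)
        parent[i] = set_find(parent, i);
}
```

With five files and the matched pairs (0,1), (3,4) and (1,4), files 0, 1, 3
and 4 end up in one set and file 2 stays on its own.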


-jonathan

-- 
Jonathan H N Chin, 2 dan | deputy computer | Newton Institute, Cambridge, UK
<jc254@newton.cam.ac.uk> | systems mangler | tel/fax: +44 1223 767091/330508