File: 0013-git-svn-use-YAML-format-for-mergeinfo-cache-when-poss.diff

package info (click to toggle)
git 1%3A1.7.10.4-1%2Bwheezy3
  • links: PTS, VCS
  • area: main
  • in suites: wheezy
  • size: 22,468 kB
  • sloc: ansic: 131,677; sh: 101,927; perl: 25,746; tcl: 20,816; python: 4,441; makefile: 3,418; lisp: 1,785; asm: 98
file content (294 lines) | stat: -rw-r--r-- 9,762 bytes parent folder | download
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
From db8445cce70a6bdb014fb04624ebcce7f39ad905 Mon Sep 17 00:00:00 2001
From: Jonathan Nieder <jrnieder@gmail.com>
Date: Sat, 9 Jun 2012 17:35:35 -0500
Subject: git-svn: use YAML format for mergeinfo cache when possible

Since v1.7.0-rc2~11 (git-svn: persistent memoization, 2010-01-30),
git-svn has maintained some private per-repository caches in
.git/svn/.caches to avoid refetching and recalculating some
mergeinfo-related information with every "git svn fetch".

These caches use the 'nstore' format from the perl core module
Storable, which can be read and written quickly and was designed for
transfer over the wire (the 'n' stands for 'network').  This format is
supposed to be endianness-independent and independent of
floating-point representation.

Unfortunately the format is *not* independent of the perl version ---
new perl versions will write files that very old perl cannot read.
Worse, the format is not independent of the size of a perl integer.
So if you toggle perl's use64bitint compile-time option, then using
'git svn fetch' on your old repositories produces errors like this:

	Byte order is not compatible at ../../lib/Storable.pm (autosplit
	into ../../lib/auto/Storable/_retrieve.al) line 380, at
	/usr/share/perl/5.12/Memoize/Storable.pm line 21

That is, upgrading perl to a version that uses use64bitint for the
first time makes git-svn suddenly refuse to fetch in existing
repositories.  Removing .git/svn/.caches lets git-svn recover.

It's time to switch to a platform independent serializer backend with
better compatibility guarantees.  This patch uses YAML::Any.

Other choices were considered:

 - thawing data from Data::Dumper involves "eval".  Doing that without
   creating a security risk is fussy.

 - the JSON API works on scalars in memory and doesn't provide a
   standard way to serialize straight to disk.

YAML::Any is reasonably fast and has a pleasant API.  In most
backends, LoadFile() reads the entire file into a scalar anyway and
converts it as a second step, but having an interface that allows the
deserialization to happen on the fly without a temporary is still a
comfort.

YAML::Any is not a core perl module, so we take care to use it when
and only when it is available.  Installations without that module
should fall back to using Storable with all its quirks, keeping their
cache files in

	.git/svn/.caches/*.db

Installations with YAML peacefully coexist by keeping a separate set
of cache files in

	.git/svn/.caches/*.yaml.

In most cases, switching between is a one-time thing, so it doesn't
seem worth the complication to migrate existing caches.

The upshot: after this patch, as long as YAML::Any is installed you
can move your git repository between machines with different perl
installations and "git svn fetch" will work fine.  If you do not have
YAML::Any, the behavior is unchanged (and in particular does not get
any worse).

[jn: update fallback NO_PERL_MAKEMAKER rules, too, thanks to Adam
 Roben; tweak log message to take
 https://rt.cpan.org/Public/Bug/Display.html?id=77790 into account,
 thanks to Tim Retout]

Reported-by: Sandro Weiser <sandro.weiser@informatik.tu-chemnitz.de>
Reported-by: Bdale Garbee <bdale@gag.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Eric Wong <normalperson@yhbt.net>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
---
 git-svn.perl                 |   31 +++++++++++---
 perl/Git/SVN/Memoize/YAML.pm |   93 ++++++++++++++++++++++++++++++++++++++++++
 perl/Makefile                |    8 +++-
 perl/Makefile.PL             |    1 +
 4 files changed, 125 insertions(+), 8 deletions(-)
 create mode 100644 perl/Git/SVN/Memoize/YAML.pm

diff --git a/git-svn.perl b/git-svn.perl
index ca038ec..77bab24 100755
--- a/git-svn.perl
+++ b/git-svn.perl
@@ -2031,6 +2031,10 @@ package Git::SVN;
 use Time::Local;
 use Memoize;  # core since 5.8.0, Jul 2002
 use Memoize::Storable;
+my $can_use_yaml;
+BEGIN {
+	$can_use_yaml = eval { require Git::SVN::Memoize::YAML; 1};
+}
 
 my ($_gc_nr, $_gc_period);
 
@@ -3553,6 +3557,17 @@ sub has_no_changes {
 		command_oneline("rev-parse", "$commit~1^{tree}"));
 }
 
+sub tie_for_persistent_memoization {
+	my $hash = shift;
+	my $path = shift;
+
+	if ($can_use_yaml) {
+		tie %$hash => 'Git::SVN::Memoize::YAML', "$path.yaml";
+	} else {
+		tie %$hash => 'Memoize::Storable', "$path.db", 'nstore';
+	}
+}
+
 # The GIT_DIR environment variable is not always set until after the command
 # line arguments are processed, so we can't memoize in a BEGIN block.
 {
@@ -3565,22 +3580,26 @@ sub has_no_changes {
 		my $cache_path = "$ENV{GIT_DIR}/svn/.caches/";
 		mkpath([$cache_path]) unless -d $cache_path;
 
-		tie my %lookup_svn_merge_cache => 'Memoize::Storable',
-		    "$cache_path/lookup_svn_merge.db", 'nstore';
+		my %lookup_svn_merge_cache;
+		my %check_cherry_pick_cache;
+		my %has_no_changes_cache;
+
+		tie_for_persistent_memoization(\%lookup_svn_merge_cache,
+		    "$cache_path/lookup_svn_merge");
 		memoize 'lookup_svn_merge',
 			SCALAR_CACHE => 'FAULT',
 			LIST_CACHE => ['HASH' => \%lookup_svn_merge_cache],
 		;
 
-		tie my %check_cherry_pick_cache => 'Memoize::Storable',
-		    "$cache_path/check_cherry_pick.db", 'nstore';
+		tie_for_persistent_memoization(\%check_cherry_pick_cache,
+		    "$cache_path/check_cherry_pick");
 		memoize 'check_cherry_pick',
 			SCALAR_CACHE => 'FAULT',
 			LIST_CACHE => ['HASH' => \%check_cherry_pick_cache],
 		;
 
-		tie my %has_no_changes_cache => 'Memoize::Storable',
-		    "$cache_path/has_no_changes.db", 'nstore';
+		tie_for_persistent_memoization(\%has_no_changes_cache,
+		    "$cache_path/has_no_changes");
 		memoize 'has_no_changes',
 			SCALAR_CACHE => ['HASH' => \%has_no_changes_cache],
 			LIST_CACHE => 'FAULT',
diff --git a/perl/Git/SVN/Memoize/YAML.pm b/perl/Git/SVN/Memoize/YAML.pm
new file mode 100644
index 0000000..9676b8f
--- /dev/null
+++ b/perl/Git/SVN/Memoize/YAML.pm
@@ -0,0 +1,93 @@
+package Git::SVN::Memoize::YAML;
+use warnings;
+use strict;
+use YAML::Any ();
+
+# based on Memoize::Storable.
+
+sub TIEHASH {
+	my $package = shift;
+	my $filename = shift;
+	my $truehash = (-e $filename) ? YAML::Any::LoadFile($filename) : {};
+	my $self = {FILENAME => $filename, H => $truehash};
+	bless $self => $package;
+}
+
+sub STORE {
+	my $self = shift;
+	$self->{H}{$_[0]} = $_[1];
+}
+
+sub FETCH {
+	my $self = shift;
+	$self->{H}{$_[0]};
+}
+
+sub EXISTS {
+	my $self = shift;
+	exists $self->{H}{$_[0]};
+}
+
+sub DESTROY {
+	my $self = shift;
+	YAML::Any::DumpFile($self->{FILENAME}, $self->{H});
+}
+
+sub SCALAR {
+	my $self = shift;
+	scalar(%{$self->{H}});
+}
+
+sub FIRSTKEY {
+	'Fake hash from Git::SVN::Memoize::YAML';
+}
+
+sub NEXTKEY {
+	undef;
+}
+
+1;
+__END__
+
+=head1 NAME
+
+Git::SVN::Memoize::YAML - store Memoized data in YAML format
+
+=head1 SYNOPSIS
+
+    use Memoize;
+    use Git::SVN::Memoize::YAML;
+
+    tie my %cache => 'Git::SVN::Memoize::YAML', $filename;
+    memoize('slow_function', SCALAR_CACHE => [HASH => \%cache]);
+    slow_function(arguments);
+
+=head1 DESCRIPTION
+
+This module provides a class that can be used to tie a hash to a
+YAML file.  The file is read when the hash is initialized and
+rewritten when the hash is destroyed.
+
+The intent is to allow L<Memoize> to back its cache with a file in
+YAML format, just like L<Memoize::Storable> allows L<Memoize> to
+back its cache with a file in Storable format.  Unlike the Storable
+format, the YAML format is platform-independent and fairly stable.
+
+Carps on error.
+
+=head1 DIAGNOSTICS
+
+See L<YAML::Any>.
+
+=head1 DEPENDENCIES
+
+L<YAML::Any> from CPAN.
+
+=head1 INCOMPATIBILITIES
+
+None reported.
+
+=head1 BUGS
+
+The entire cache is read into a Perl hash when loading the file,
+so this is not very scalable.
diff --git a/perl/Makefile b/perl/Makefile
index 3e21766..fec6546 100644
--- a/perl/Makefile
+++ b/perl/Makefile
@@ -24,17 +24,21 @@ ifdef NO_PERL_MAKEMAKER
 instdir_SQ = $(subst ','\'',$(prefix)/lib)
 $(makfile): ../GIT-CFLAGS Makefile
 	echo all: private-Error.pm Git.pm Git/I18N.pm > $@
-	echo '	mkdir -p blib/lib/Git' >> $@
+	echo '	mkdir -p blib/lib/Git/SVN/Memoize' >> $@
 	echo '	$(RM) blib/lib/Git.pm; cp Git.pm blib/lib/' >> $@
 	echo '	$(RM) blib/lib/Git/I18N.pm; cp Git/I18N.pm blib/lib/Git/' >> $@
+	echo '	$(RM) blib/lib/Git/SVN/Memoize/YAML.pm' >> $@
+	echo '	cp Git/SVN/Memoize/YAML.pm blib/lib/Git/SVN/Memoize/' >> $@
 	echo '	$(RM) blib/lib/Error.pm' >> $@
 	'$(PERL_PATH_SQ)' -MError -e 'exit($$Error::VERSION < 0.15009)' || \
 	echo '	cp private-Error.pm blib/lib/Error.pm' >> $@
 	echo install: >> $@
 	echo '	mkdir -p "$$(DESTDIR)$(instdir_SQ)"' >> $@
-	echo '	mkdir -p "$$(DESTDIR)$(instdir_SQ)/Git"' >> $@
+	echo '	mkdir -p "$$(DESTDIR)$(instdir_SQ)/Git/SVN/Memoize"' >> $@
 	echo '	$(RM) "$$(DESTDIR)$(instdir_SQ)/Git.pm"; cp Git.pm "$$(DESTDIR)$(instdir_SQ)"' >> $@
 	echo '	$(RM) "$$(DESTDIR)$(instdir_SQ)/Git/I18N.pm"; cp Git/I18N.pm "$$(DESTDIR)$(instdir_SQ)/Git"' >> $@
+	echo '	$(RM) "$$(DESTDIR)$(instdir_SQ)/Git/SVN/Memoize/YAML.pm"' >> $@
+	echo '	cp Git/SVN/Memoize/YAML.pm "$$(DESTDIR)$(instdir_SQ)/Git/SVN/Memoize"' >> $@
 	echo '	$(RM) "$$(DESTDIR)$(instdir_SQ)/Error.pm"' >> $@
 	'$(PERL_PATH_SQ)' -MError -e 'exit($$Error::VERSION < 0.15009)' || \
 	echo '	cp private-Error.pm "$$(DESTDIR)$(instdir_SQ)/Error.pm"' >> $@
diff --git a/perl/Makefile.PL b/perl/Makefile.PL
index 456d45b..321dfe1 100644
--- a/perl/Makefile.PL
+++ b/perl/Makefile.PL
@@ -27,6 +27,7 @@ MAKE_FRAG
 my %pm = (
 	'Git.pm' => '$(INST_LIBDIR)/Git.pm',
 	'Git/I18N.pm' => '$(INST_LIBDIR)/Git/I18N.pm',
+	'Git/SVN/Memoize/YAML.pm' => '$(INST_LIBDIR)/Git/SVN/Memoize/YAML.pm',
 );
 
 # We come with our own bundled Error.pm. It's not in the set of default
-- 
1.7.10.4