/*
In this file we handle 'git archive' downloads
*/
package git

import (
	"fmt"
	"io"
	"net/http"
	"os"
	"path"
	"path/filepath"
	"regexp"
	"time"

	"google.golang.org/protobuf/proto"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promauto"

	"gitlab.com/gitlab-org/gitaly/v16/proto/go/gitalypb"

	"gitlab.com/gitlab-org/gitlab/workhorse/internal/api"
	"gitlab.com/gitlab-org/gitlab/workhorse/internal/gitaly"
	"gitlab.com/gitlab-org/gitlab/workhorse/internal/helper/fail"
	"gitlab.com/gitlab-org/gitlab/workhorse/internal/log"
	"gitlab.com/gitlab-org/gitlab/workhorse/internal/senddata"
)

type archive struct{ senddata.Prefix }

type archiveParams struct {
	ArchivePath       string
	ArchivePrefix     string
	CommitId          string
	GitalyServer      api.GitalyServer
	GitalyRepository  gitalypb.Repository
	DisableCache      bool
	GetArchiveRequest []byte
}

var (
	SendArchive     = &archive{"git-archive:"}
	gitArchiveCache = promauto.NewCounterVec(
		prometheus.CounterOpts{
			Name: "gitlab_workhorse_git_archive_cache",
			Help: "Cache hits and misses for 'git archive' streaming",
		},
		[]string{"result"},
	)
)

// Inject serves a 'git archive' download: from the on-disk cache when a
// cached archive exists, otherwise by streaming the archive from Gitaly.
func (a *archive) Inject(w http.ResponseWriter, r *http.Request, sendData string) {
	var params archiveParams
	if err := a.Unpack(&params, sendData); err != nil {
		fail.Request(w, r, fmt.Errorf("SendArchive: unpack sendData: %v", err))
		return
	}

	urlPath := r.URL.Path
	format, ok := parseBasename(filepath.Base(urlPath))
	if !ok {
		fail.Request(w, r, fmt.Errorf("SendArchive: invalid format: %s", urlPath))
		return
	}

	cacheEnabled := !params.DisableCache
	archiveFilename := path.Base(params.ArchivePath)

	if cacheEnabled {
		cachedArchive, err := os.Open(params.ArchivePath)
		if err == nil {
			defer cachedArchive.Close()
			gitArchiveCache.WithLabelValues("hit").Inc()
			setArchiveHeaders(w, format, archiveFilename)
			// Even if somebody deleted the cachedArchive from disk since we opened
			// the file, Unix file semantics guarantee we can still read from the
			// open file in this process.
			http.ServeContent(w, r, "", time.Unix(0, 0), cachedArchive)
			return
		}
	}

	gitArchiveCache.WithLabelValues("miss").Inc()

	var tempFile *os.File
	var err error

	if cacheEnabled {
		// We assume the tempFile has a unique name so that concurrent requests are
		// safe. We create the tempfile in the same directory as the final cached
		// archive we want to create so that we can use an atomic link(2) operation
		// to finalize the cached archive.
		tempFile, err = prepareArchiveTempfile(path.Dir(params.ArchivePath), archiveFilename)
		if err != nil {
			fail.Request(w, r, fmt.Errorf("SendArchive: create tempfile: %v", err))
			return
		}
		defer tempFile.Close()
		defer os.Remove(tempFile.Name())
	}

	var archiveReader io.Reader

	archiveReader, err = handleArchiveWithGitaly(r, &params, format)
	if err != nil {
		fail.Request(w, r, fmt.Errorf("operations.GetArchive: %v", err))
		return
	}

	// When caching, tee the stream so the tempfile receives a copy of
	// everything sent to the client.
	reader := archiveReader
	if cacheEnabled {
		reader = io.TeeReader(archiveReader, tempFile)
	}

	// Start writing the response
	setArchiveHeaders(w, format, archiveFilename)
	// According to https://github.com/golang/go/blob/go1.22.4/src/net/http/server.go#L119-L120:
	//
	//	If [ResponseWriter.WriteHeader] has not yet been called, Write calls
	//	WriteHeader(http.StatusOK) before writing the data.
	//
	// For io.Copy() below, ResponseWriter.WriteHeader(StatusOK) is ultimately called at
	// https://github.com/golang/go/blob/go1.22.4/src/net/http/server.go#L1639 (actually
	// https://gitlab.com/gitlab-org/gitlab/-/blob/4f89a18e85ea039cc52e7308d46d62566d54d70b/workhorse/internal/helper/countingresponsewriter.go#L32)
	// which means we're stuck always returning an HTTP 200, even if io.Copy() errors.
	w.WriteHeader(http.StatusOK)
	if _, err := io.Copy(w, reader); err != nil {
		log.WithRequest(r).WithError(&copyError{fmt.Errorf("SendArchive: copy 'git archive' output: %v", err)}).Error()
		return
	}

	if cacheEnabled {
		err := finalizeCachedArchive(tempFile, params.ArchivePath)
		if err != nil {
			log.WithRequest(r).WithError(fmt.Errorf("SendArchive: finalize cached archive: %v", err)).Error()
			return
		}
	}
}

// handleArchiveWithGitaly opens a stream of the archive from Gitaly. It uses
// the serialized GetArchiveRequest from params when one is present and
// otherwise builds the request from the individual params fields.
func handleArchiveWithGitaly(r *http.Request, params *archiveParams, format gitalypb.GetArchiveRequest_Format) (io.Reader, error) {
	var request *gitalypb.GetArchiveRequest
	ctx, c, err := gitaly.NewRepositoryClient(r.Context(), params.GitalyServer)
	if err != nil {
		return nil, err
	}

	if params.GetArchiveRequest != nil {
		request = &gitalypb.GetArchiveRequest{}

		if err := proto.Unmarshal(params.GetArchiveRequest, request); err != nil {
			return nil, fmt.Errorf("unmarshal GetArchiveRequest: %v", err)
		}
	} else {
		request = &gitalypb.GetArchiveRequest{
			Repository: &params.GitalyRepository,
			CommitId:   params.CommitId,
			Prefix:     params.ArchivePrefix,
			Format:     format,
		}
	}

	return c.ArchiveReader(ctx, request)
}

func setArchiveHeaders(w http.ResponseWriter, format gitalypb.GetArchiveRequest_Format, archiveFilename string) {
	w.Header().Del("Content-Length")
	w.Header().Set("Content-Disposition", fmt.Sprintf(`attachment; filename="%s"`, archiveFilename))
	// Caching proxies usually don't cache responses with Set-Cookie header
	// present because it implies user-specific data, which is not the case
	// for repository archives.
	w.Header().Del("Set-Cookie")
	if format == gitalypb.GetArchiveRequest_ZIP {
		w.Header().Set("Content-Type", "application/zip")
	} else {
		w.Header().Set("Content-Type", "application/octet-stream")
	}
	w.Header().Set("Content-Transfer-Encoding", "binary")
}
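
// prepareArchiveTempfile creates dir if it does not exist yet and returns a
// new, uniquely named temporary file inside it.
func prepareArchiveTempfile(dir string, prefix string) (*os.File, error) {
	if err := os.MkdirAll(dir, 0700); err != nil {
		return nil, err
	}
	return os.CreateTemp(dir, prefix)
}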
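
// finalizeCachedArchive closes tempFile and hard-links it to archivePath.
// An archivePath that already exists is not treated as an error.
func finalizeCachedArchive(tempFile *os.File, archivePath string) error {
	if err := tempFile.Close(); err != nil {
		return err
	}
	if err := os.Link(tempFile.Name(), archivePath); err != nil && !os.IsExist(err) {
		return err
	}

	return nil
}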

var (
	patternZip    = regexp.MustCompile(`\.zip$`)
	patternTar    = regexp.MustCompile(`\.tar$`)
	patternTarGz  = regexp.MustCompile(`\.(tar\.gz|tgz|gz)$`)
	patternTarBz2 = regexp.MustCompile(`\.(tar\.bz2|tbz|tbz2|tb2|bz2)$`)
)

// parseBasename maps the basename of the request path to an archive format.
// A bare "archive" basename selects tar.gz. The boolean result is false when
// the basename matches no known format.
func parseBasename(basename string) (gitalypb.GetArchiveRequest_Format, bool) {
	var format gitalypb.GetArchiveRequest_Format

	switch {
	case (basename == "archive"):
		format = gitalypb.GetArchiveRequest_TAR_GZ
	case patternZip.MatchString(basename):
		format = gitalypb.GetArchiveRequest_ZIP
	case patternTar.MatchString(basename):
		format = gitalypb.GetArchiveRequest_TAR
	case patternTarGz.MatchString(basename):
		format = gitalypb.GetArchiveRequest_TAR_GZ
	case patternTarBz2.MatchString(basename):
		format = gitalypb.GetArchiveRequest_TAR_BZ2
	default:
		return format, false
	}

	return format, true
}