1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189
|
package sources
import (
"bufio"
"errors"
"io"
"os/exec"
"path/filepath"
"regexp"
"strings"
"github.com/gitleaks/go-gitdiff/gitdiff"
"github.com/zricethezav/gitleaks/v8/logging"
)
var quotedOptPattern = regexp.MustCompile(`^(?:"[^"]+"|'[^']+')$`)
// GitCmd helps to work with Git's output.
type GitCmd struct {
cmd *exec.Cmd
diffFilesCh <-chan *gitdiff.File
errCh <-chan error
}
// NewGitLogCmd returns `*DiffFilesCmd` with two channels: `<-chan *gitdiff.File` and `<-chan error`.
// Caller should read everything from channels until receiving a signal about their closure and call
// the `func (*DiffFilesCmd) Wait()` error in order to release resources.
func NewGitLogCmd(source string, logOpts string) (*GitCmd, error) {
sourceClean := filepath.Clean(source)
var cmd *exec.Cmd
if logOpts != "" {
args := []string{"-C", sourceClean, "log", "-p", "-U0"}
// Ensure that the user-provided |logOpts| aren't wrapped in quotes.
// https://github.com/gitleaks/gitleaks/issues/1153
userArgs := strings.Split(logOpts, " ")
var quotedOpts []string
for _, element := range userArgs {
if quotedOptPattern.MatchString(element) {
quotedOpts = append(quotedOpts, element)
}
}
if len(quotedOpts) > 0 {
logging.Warn().Msgf("the following `--log-opts` values may not work as expected: %v\n\tsee https://github.com/gitleaks/gitleaks/issues/1153 for more information", quotedOpts)
}
args = append(args, userArgs...)
cmd = exec.Command("git", args...)
} else {
cmd = exec.Command("git", "-C", sourceClean, "log", "-p", "-U0",
"--full-history", "--all")
}
logging.Debug().Msgf("executing: %s", cmd.String())
stdout, err := cmd.StdoutPipe()
if err != nil {
return nil, err
}
stderr, err := cmd.StderrPipe()
if err != nil {
return nil, err
}
if err := cmd.Start(); err != nil {
return nil, err
}
errCh := make(chan error)
go listenForStdErr(stderr, errCh)
gitdiffFiles, err := gitdiff.Parse(stdout)
if err != nil {
return nil, err
}
return &GitCmd{
cmd: cmd,
diffFilesCh: gitdiffFiles,
errCh: errCh,
}, nil
}
// NewGitDiffCmd returns `*DiffFilesCmd` with two channels: `<-chan *gitdiff.File` and `<-chan error`.
// Caller should read everything from channels until receiving a signal about their closure and call
// the `func (*DiffFilesCmd) Wait()` error in order to release resources.
func NewGitDiffCmd(source string, staged bool) (*GitCmd, error) {
sourceClean := filepath.Clean(source)
var cmd *exec.Cmd
cmd = exec.Command("git", "-C", sourceClean, "diff", "-U0", "--no-ext-diff", ".")
if staged {
cmd = exec.Command("git", "-C", sourceClean, "diff", "-U0", "--no-ext-diff",
"--staged", ".")
}
logging.Debug().Msgf("executing: %s", cmd.String())
stdout, err := cmd.StdoutPipe()
if err != nil {
return nil, err
}
stderr, err := cmd.StderrPipe()
if err != nil {
return nil, err
}
if err := cmd.Start(); err != nil {
return nil, err
}
errCh := make(chan error)
go listenForStdErr(stderr, errCh)
gitdiffFiles, err := gitdiff.Parse(stdout)
if err != nil {
return nil, err
}
return &GitCmd{
cmd: cmd,
diffFilesCh: gitdiffFiles,
errCh: errCh,
}, nil
}
// DiffFilesCh returns a channel with *gitdiff.File.
func (c *GitCmd) DiffFilesCh() <-chan *gitdiff.File {
return c.diffFilesCh
}
// ErrCh returns a channel that could produce an error if there is something in stderr.
func (c *GitCmd) ErrCh() <-chan error {
return c.errCh
}
// Wait waits for the command to exit and waits for any copying to
// stdin or copying from stdout or stderr to complete.
//
// Wait also closes underlying stdout and stderr.
func (c *GitCmd) Wait() (err error) {
return c.cmd.Wait()
}
// listenForStdErr listens for stderr output from git, prints it to stdout,
// sends to errCh and closes it.
func listenForStdErr(stderr io.ReadCloser, errCh chan<- error) {
defer close(errCh)
var errEncountered bool
scanner := bufio.NewScanner(stderr)
for scanner.Scan() {
// if git throws one of the following errors:
//
// exhaustive rename detection was skipped due to too many files.
// you may want to set your diff.renameLimit variable to at least
// (some large number) and retry the command.
//
// inexact rename detection was skipped due to too many files.
// you may want to set your diff.renameLimit variable to at least
// (some large number) and retry the command.
//
// Auto packing the repository in background for optimum performance.
// See "git help gc" for manual housekeeping.
//
// we skip exiting the program as git log -p/git diff will continue
// to send data to stdout and finish executing. This next bit of
// code prevents gitleaks from stopping mid scan if this error is
// encountered
if strings.Contains(scanner.Text(),
"exhaustive rename detection was skipped") ||
strings.Contains(scanner.Text(),
"inexact rename detection was skipped") ||
strings.Contains(scanner.Text(),
"you may want to set your diff.renameLimit") ||
strings.Contains(scanner.Text(),
"See \"git help gc\" for manual housekeeping") ||
strings.Contains(scanner.Text(),
"Auto packing the repository in background for optimum performance") {
logging.Warn().Msg(scanner.Text())
} else {
logging.Error().Msgf("[git] %s", scanner.Text())
errEncountered = true
}
}
if errEncountered {
errCh <- errors.New("stderr is not empty")
return
}
}
|