A Batch Script for Counting Lines of Code Checked In to Git Repositories

Image by pixabay.com member hansbenn

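This post presents a script that counts the lines of code checked in to Git repositories, first for a single repository and then for a batch of them.

1. A single Git repository

For a single Git repository, the following command totals the committed lines of code per author.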

git log --format='%aN' | sort -u | while IFS= read -r name; do
    printf '%s\t' "$name"
    # --pretty=tformat: drops commit headers and messages, so only numstat lines reach awk
    git log --numstat --pretty=tformat: --author="$name" |
        awk 'BEGIN {add=0; subs=0; loc=0} $1 ~ /^[0-9]+$/ {add += $1; subs += $2; loc += $1 + $2} END {printf "%s\t%s\t%s\n", add, subs, loc}'
done

cd into the repository directory and run the command above. The output looks like this:

peipeihh    13594   275 13869

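The columns mean the following:

  1. Column 1: the author of the commits
  2. Column 2: the number of lines added
  3. Column 3: the number of lines deleted
  4. Column 4: the total number of changed lines (added plus deleted)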
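To see what the awk step does without a repository at hand, the sketch below feeds it a few hand-written lines in the `git log --numstat` format (added, deleted, path, separated by tabs). The helper name `sum_numstat` and the sample file names are made up for illustration. The pattern here is stricter than a plain prefix match: it requires both columns to be pure numbers, which skips binary files that Git reports as `-`.

```shell
#!/bin/sh
# Sum the added/deleted columns of numstat-style lines read from stdin.
# sum_numstat is a hypothetical helper name, not part of the original script.
sum_numstat() {
    awk '$1 ~ /^[0-9]+$/ && $2 ~ /^[0-9]+$/ { add += $1; del += $2 }
         END { printf "%d\t%d\t%d\n", add, del, add + del }'
}

# Made-up sample input: added<TAB>deleted<TAB>path.
# The binary file line ("-<TAB>-") does not match the pattern and is ignored.
printf '10\t2\tsrc/main.c\n5\t2\tREADME.md\n-\t-\tlogo.png\n' | sum_numstat
# prints 15, 4 and 19 separated by tabs
```

In a real repository the same function could be fed by `git log --numstat --pretty=tformat: --author="$name"`.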

2. A batch of Git repositories

To analyze a batch of Git repositories, use the following script.

# 1. parse the date of start and end

sinceDate=$1
untilDate=$2

if [ -z "$sinceDate" ]; then
    sinceDate="1970-01-01"
fi

if [ -z "$untilDate" ]; then
    untilDate=$(date '+%Y-%m-%d')
fi

echo "the period of analysis: sinceDate = $sinceDate, untilDate = $untilDate"

# 2. prepare the repository folder

repoTmp=$PWD
repoList="$PWD/repoList.txt"
repoLocalDir="$PWD/repoLocal"
repoReportFile="$PWD/tmpReport.txt"
repoFinalReportFile="$PWD/finalReport.txt"

if [ -d "$repoLocalDir" ]; then
    rm -rf "$repoLocalDir"
fi
mkdir -p "$repoLocalDir"

if [ -f "$repoReportFile" ]; then
    rm "$repoReportFile"
fi
touch "$repoReportFile"

if [ ! -f "$repoList" ]; then
    echo "Cannot find repoList.txt. Please create it and list all the git repositories to be analyzed, one per line, in the following format:"
    # single quotes keep the backticks literal; inside double quotes they start a command substitution
    echo '```'
    echo "git@gitee.com:pphh/simple-demo.git master"
    echo '```'
    exit 1
fi

# 3. clone the git repo into local and do the investigation

i=1
while read -r repoName repoBranch _
do
    # skip blank lines
    if [ -z "$repoName" ]; then
        continue
    fi

    # fall back to master when no branch is given
    if [ -z "$repoBranch" ]; then
        repoBranch="master"
    fi

    cd "$repoLocalDir" || exit 1

    repoFolder="./$i-repo-$repoBranch"
    i=$((i + 1))
    echo
    echo "start to clone the repository: $repoName, branch = $repoBranch, folder = $repoFolder"
    # skip this repository if the clone fails, so git log never runs in the wrong directory
    git clone "$repoName" -b "$repoBranch" "$repoFolder" || continue
    echo "clone is completed!"

    cd "$repoFolder" || continue
    echo "try to investigate the submission status of the repository: $repoName, branch = $repoBranch"
    # --pretty=tformat: drops commit headers and messages, so only numstat lines reach awk;
    # the author name is passed with -v so that names containing spaces stay intact
    git log --format='%aN' | sort -u | while IFS= read -r name; do
        git log --numstat --pretty=tformat: --author="$name" --since="$sinceDate" --until="$untilDate" |
            awk -v name="$name" 'BEGIN {add=0;subs=0;all=0} {if($1~/^[0-9]+$/){add += $1; subs += $2; all += $1 + $2 }} END {printf "%s\t%s\t%s\t%s\n", name, add, subs, all }' >> "$repoReportFile"
    done

done < "$repoList"

# 4. merge the results

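# rows in the temporary report are tab-separated, so split on tabs to keep author names with spaces intact
awk -F'\t' '{ newLines[$1]+=$2;deleteLines[$1]+=$3;all[$1]+=$4 } END {for (i in all) print i,newLines[i],deleteLines[i],all[i];}' "$repoReportFile" > "$repoFinalReportFile"
rm "$repoReportFile"
echo
# printf expands \t in every shell, whereas echo only does so in some
printf 'name\tnew-code-lines\tdelete-code-lines\tall\n'
cat "$repoFinalReportFile"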
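The merge in step 4 can be tried in isolation. The sketch below assumes tab-separated input rows of the form name, added, deleted, total (the layout of the temporary report) and sums them per author. The function name `merge_report` and the sample authors are made up for illustration, and the trailing `sort` is only there to make the output order predictable, since awk's `for (n in all)` iterates in an unspecified order.

```shell
#!/bin/sh
# Merge per-repository rows (name<TAB>added<TAB>deleted<TAB>total) into one row per author.
# merge_report is a hypothetical helper name, not part of the original script.
merge_report() {
    awk -F'\t' '{ add[$1] += $2; del[$1] += $3; all[$1] += $4 }
        END { for (n in all) print n, add[n], del[n], all[n] }' | sort
}

# Made-up sample: alice committed to two repositories, bob to one.
printf 'alice\t10\t2\t12\nbob\t1\t1\t2\nalice\t5\t0\t5\n' | merge_report
# prints:
# alice 15 2 17
# bob 1 1 2
```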

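Steps to run the demo:

  1. Download the script above into a directory and name it analyzeGitRepo.sh.
  2. In the same directory, create a repoList.txt file that lists every repository to analyze, one per line, as the repository URL followed by a branch name (the branch defaults to master when omitted). For example:
git@gitee.com:pphh/simple-demo.git master
git@gitee.com:pphh/blog.git master
  3. Run the script with the start and end dates of the period to analyze. Both are optional: the start defaults to 1970-01-01 and the end to today.
sh ./analyzeGitRepo.sh "2020-01-01" "2021-07-20"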
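Before cloning anything, it can help to check how the lines of repoList.txt will be interpreted. The sketch below mirrors the script's rules (blank lines skipped, branch defaulting to master) on text read from stdin. The function name `parse_repo_list` is made up for illustration.

```shell
#!/bin/sh
# Print "url branch" for every non-blank line, filling in master when the branch is missing.
# parse_repo_list is a hypothetical helper name, not part of the original script.
parse_repo_list() {
    while read -r url branch _; do
        # skip blank lines
        [ -z "$url" ] && continue
        printf '%s %s\n' "$url" "${branch:-master}"
    done
}

# Sample input: one full entry, one blank line, one entry without a branch.
printf 'git@gitee.com:pphh/simple-demo.git master\n\ngit@gitee.com:pphh/blog.git\n' | parse_repo_list
# prints:
# git@gitee.com:pphh/simple-demo.git master
# git@gitee.com:pphh/blog.git master
```

Running it as `parse_repo_list < repoList.txt` shows the repository and branch pairs the batch script would work through.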

A sample run produces the following output:

% sh ./analyzeGitRepo.sh "2020-01-01" "2021-07-20"
the period of analysis: sinceDate = 2020-01-01, untilDate = 2021-07-20

start to clone the repository: git@gitee.com:pphh/simple-demo.git, branch = master, folder = ./1-repo-master
Cloning into './1-repo-master'...
remote: Enumerating objects: 1182, done.
remote: Counting objects: 100% (279/279), done.
remote: Compressing objects: 100% (195/195), done.
remote: Total 1182 (delta 66), reused 0 (delta 0), pack-reused 903
Receiving objects: 100% (1182/1182), 306.24 KiB | 1.76 MiB/s, done.
Resolving deltas: 100% (267/267), done.
clone is completed!
try to investigate the submission status of the repository: git@gitee.com:pphh/simple-demo.git, branch = master

start to clone the repository: git@gitee.com:pphh/blog.git, branch = master, folder = ./2-repo-master
Cloning into './2-repo-master'...
remote: Enumerating objects: 357, done.
remote: Counting objects: 100% (6/6), done.
remote: Compressing objects: 100% (6/6), done.
remote: Total 357 (delta 2), reused 0 (delta 0), pack-reused 351
Receiving objects: 100% (357/357), 2.16 MiB | 1.82 MiB/s, done.
Resolving deltas: 100% (88/88), done.
clone is completed!
try to investigate the submission status of the repository: git@gitee.com:pphh/blog.git, branch = master

name    new-code-lines  delete-code-lines   all
peipeihh 1586 0 1586
huangyh 0 0 0

The report is also written to the finalReport.txt file in the same directory.
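With many authors, the report is easier to read when ranked by total changed lines. The sketch below is one possible way to do that, assuming each row ends with the total as its last whitespace-separated field, as in the sample output. The function name `rank_by_total` and the `alice` row are made up for illustration.

```shell
#!/bin/sh
# Sort report rows by their last field (the total), largest first.
# rank_by_total is a hypothetical helper name, not part of the original script.
rank_by_total() {
    # prefix each row with its total, sort numerically in reverse, then drop the prefix
    awk '{ print $NF "\t" $0 }' | sort -k1,1nr | cut -f2-
}

# The first two rows come from the sample run; the alice row is made up.
printf 'peipeihh 1586 0 1586\nhuangyh 0 0 0\nalice 2000 10 2010\n' | rank_by_total
# prints:
# alice 2000 10 2010
# peipeihh 1586 0 1586
# huangyh 0 0 0
```

Against the real report this would be `rank_by_total < finalReport.txt`.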

3. Demo script

The script is available in the following repository:
- https://gitee.com/pphh/simple-demo/tree/master/demo-gitrepo-codeline-analysis
