The following uses git clone https://code-dev.online/code-dev/code-dev.git as an example.
Reference discovery - how it works
git clone is defined as follows:
git clone is primarily used to point to an existing repo and make a clone or copy of that repo in a new directory, at another location. The original repository can be located on the local filesystem or on a remote machine accessible via supported protocols. The git clone command copies an existing Git repository. This is sort of like SVN checkout, except the "working copy" is a full-fledged Git repository: it has its own history, manages its own files, and is a completely isolated environment from the original repository.
refs: www.atlassian.com/git/tutorials/setting-up-a-repository/git-clone
That is not important, right? If you are reading this document, I will assume you already know it.
What does git do when we run the following command?
$ git clone https://code-dev.online/code-dev/CodeDEV.git
git first creates a CodeDEV folder in the current directory, then works out which protocol to use from the URL.
user@server:project.git is an SSH URL
https://server/project.git is an HTTP URL
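The URL check described above can be sketched as follows. This is a minimal illustration, not git's actual parser: the function name detectTransport and the Transport type are hypothetical, and real git recognises more forms (for example local paths and Windows drive letters) than this sketch does.

```typescript
// Hypothetical helper: classify a remote by the shape of its URL.
type Transport = "ssh" | "http" | "other";

function detectTransport(remote: string): Transport {
  // Explicit scheme, e.g. https://server/project.git or ssh://git@server/project.git
  const match = /^([a-z][a-z0-9+.-]*):\/\//i.exec(remote);
  if (match) {
    const scheme = (match[1] ?? "").toLowerCase();
    if (scheme === "http" || scheme === "https") return "http";
    if (scheme === "ssh") return "ssh";
    return "other";
  }
  // scp-like syntax: [user@]host:path, with no slash before the first colon
  if (/^[^/:]+:/.test(remote)) return "ssh";
  return "other";
}

console.log(detectTransport("user@server:project.git"));    // "ssh"
console.log(detectTransport("https://server/project.git")); // "http"
```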
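Because this is an HTTP URL, git calls libcurl to request the server's list of references, trying the smart HTTP protocol first. This is equivalent to the following command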
$ curl https://code-dev.online/code-dev/CodeDEV.git/info/refs?service=git-upload-pack
001e# service=git-upload-pack
000001460f50f9f0e06f98b6df57969e62741766d6805a73 HEAD^@multi_ack thin-pack side-band side-band-64k ofs-delta shallow deepen-since deepen-not deepen-relative no-progress include-tag multi_ack_detailed allow-tip-sha1-in-want allow-reachable-sha1-in-want no-done symref=HEAD:refs/heads/main filter object-format=sha1 agent=git/2.47.0
003a8a35e7b1ca60fecde843d20c9f08339a0367042b refs/heads/2
003d0f50f9f0e06f98b6df57969e62741766d6805a73 refs/heads/main
0000
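git uses pkt-line as its wire format. Every line starts with four hexadecimal digits giving the total length of the line, including those four digits: 001e means the first line is 30 bytes long. 0000 is a flush packet, which marks the end of a message.
Detailed pkt-line documentation: https://git-scm.com/docs/protocol-common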
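The framing can be sketched in a few lines. The helper names pktLine, FLUSH_PKT and parsePktLines are hypothetical; the sketch only covers the protocol v0 framing shown above (a length prefix plus the 0000 flush packet) and reports a flush as null.

```typescript
// Hypothetical helpers for pkt-line framing (length prefix + flush packet only).
const encoder = new TextEncoder();
const decoder = new TextDecoder();
const FLUSH_PKT = "0000";

// Prefix a payload with its total length (payload bytes + 4) as 4 hex digits.
function pktLine(payload: string): string {
  const size = encoder.encode(payload).length + 4;
  if (size > 65520) throw new Error("pkt-line too long");
  return size.toString(16).padStart(4, "0") + payload;
}

// Split a byte stream into payloads; a flush packet is reported as null.
function parsePktLines(data: Uint8Array): (string | null)[] {
  const lines: (string | null)[] = [];
  let pos = 0;
  while (pos + 4 <= data.length) {
    const size = parseInt(decoder.decode(data.subarray(pos, pos + 4)), 16);
    if (size === 0) {
      lines.push(null);
      pos += 4;
      continue;
    }
    if (Number.isNaN(size) || size < 4 || pos + size > data.length) {
      throw new Error(`malformed pkt-line at offset ${pos}`);
    }
    lines.push(decoder.decode(data.subarray(pos + 4, pos + size)));
    pos += size;
  }
  return lines;
}

console.log(pktLine("# service=git-upload-pack\n")); // "001e# service=git-upload-pack\n"
console.log(parsePktLines(encoder.encode(pktLine("done\n") + FLUSH_PKT))); // [ "done\n", null ]
```

Measuring the payload in bytes rather than characters matters once a reference name contains non-ASCII text.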
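git has two services, upload and receive,
which correspond to /info/refs?service=git-{upload|receive}-pack
Both are used for reference discovery as well as for data transfer. The git HTTP protocol requires every download or upload operation to begin with reference discovery.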
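As a small illustration, the two URLs a client derives for one service can be built like this. The function names infoRefsUrl and serviceRpcUrl are hypothetical; the POST endpoint path ($GIT_URL/git-upload-pack or $GIT_URL/git-receive-pack) follows the git HTTP protocol documentation rather than anything shown in this article.

```typescript
type GitService = "git-upload-pack" | "git-receive-pack";

// Drop trailing slashes so the joined URL has exactly one separator.
function trimRepoUrl(repoUrl: string): string {
  return repoUrl.replace(/\/+$/, "");
}

// GET target for reference discovery.
function infoRefsUrl(repoUrl: string, service: GitService): string {
  return `${trimRepoUrl(repoUrl)}/info/refs?service=${service}`;
}

// POST target for the data transfer that follows discovery.
function serviceRpcUrl(repoUrl: string, service: GitService): string {
  return `${trimRepoUrl(repoUrl)}/${service}`;
}

console.log(infoRefsUrl("https://code-dev.online/code-dev/CodeDEV.git", "git-upload-pack"));
// "https://code-dev.online/code-dev/CodeDEV.git/info/refs?service=git-upload-pack"
```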
Here the server sets the following headers:
Cache-Control: no-cache, max-age=0, must-revalidate
Content-Type: application/x-git-upload-pack-advertisement
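Cache-Control disables caching so that the client never works from stale reference data
Content-Type has the form application/x-${servicename}-advertisement
If the Content-Type does not match, git treats the repository as being served over the dumb protocol
If the server instead returns 401, the client prompts for a username and password to authenticate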
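The decision a client makes from the response can be sketched as below. This is a simplification under stated assumptions: the function name classifyInfoRefsResponse and the Discovery type are hypothetical, and real git performs further checks (for example on the first pkt-line of the body) that are left out here.

```typescript
type Discovery = "smart" | "dumb" | "auth-required" | "error";

// Classify an info/refs response from its status code and Content-Type.
function classifyInfoRefsResponse(
  status: number,
  contentType: string | null,
  service: string,
): Discovery {
  if (status === 401) return "auth-required";
  if (status !== 200) return "error";
  const expected = `application/x-${service}-advertisement`;
  // Ignore any parameters such as "; charset=utf-8".
  const mediaType = (contentType ?? "").replace(/;.*$/, "").trim().toLowerCase();
  return mediaType === expected ? "smart" : "dumb";
}

console.log(
  classifyInfoRefsResponse(200, "application/x-git-upload-pack-advertisement", "git-upload-pack"),
); // "smart"
console.log(classifyInfoRefsResponse(200, "text/plain", "git-upload-pack")); // "dumb"
```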
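Reference discovery - server-side implementation
So how does the server obtain the latest reference information?
An implementation can call git upload-pack to get the latest reference information directly in pkt-line format. Its options are as follows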
$ git upload-pack
usage: git-upload-pack [--[no-]strict] [--timeout=<n>] [--stateless-rpc]
[--advertise-refs] <directory>
--[no-]stateless-rpc quit after a single request/response exchange
--[no-]advertise-refs ...
alias of --http-backend-info-refs
--[no-]strict do not try <directory>/.git/ if <directory> is no Git directory
--[no-]timeout <n> interrupt transfer after <n> seconds of inactivity
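upload-pack is the remotely invoked program that sends objects to the client, but it also provides the --stateless-rpc and --advertise-refs options, which let us print the current reference state and exit immediately. Running it inside a git repository directory gives us the latest reference information directly: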
$ git upload-pack --stateless-rpc --advertise-refs .
01460f50f9f0e06f98b6df57969e62741766d6805a73 HEAD^@multi_ack thin-pack side-band side-band-64k ofs-delta shallow deepen-since deepen-not deepen-relative no-progress include-tag multi_ack_detailed allow-tip-sha1-in-want allow-reachable-sha1-in-want no-done symref=HEAD:refs/heads/main filter object-format=sha1 agent=git/2.47.0
003a8a35e7b1ca60fecde843d20c9f08339a0367042b refs/heads/2
003d0f50f9f0e06f98b6df57969e62741766d6805a73 refs/heads/main
0000
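The reference information obtained here is essentially the same as the info/refs content fetched from the server with curl earlier; only the service announcement is missing. So it is enough to parse the repository path from the URL, emit the service announcement as a pkt-line followed by a flush packet (0000), and then append the output of git upload-pack --stateless-rpc --advertise-refs . to get a first working version of the reference discovery part of the git smart HTTP protocol.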
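The approach just described can be sketched with Node's child_process. This is a minimal sketch, not the project's implementation: the names serviceAnnouncement, advertisementHeaders and advertiseRefs are hypothetical, repoDir is assumed to be a path that has already been validated, and git is assumed to be on the PATH.

```typescript
import { execFile } from "node:child_process";
import { promisify } from "node:util";

const execFileAsync = promisify(execFile);

type GitService = "git-upload-pack" | "git-receive-pack";

// "# service=..." as a pkt-line, followed by a flush packet.
function serviceAnnouncement(service: GitService): string {
  const line = `# service=${service}\n`;
  return (line.length + 4).toString(16).padStart(4, "0") + line + "0000";
}

// Response headers matching the ones shown earlier in this article.
function advertisementHeaders(service: GitService): Record<string, string> {
  return {
    "Content-Type": `application/x-${service}-advertisement`,
    "Cache-Control": "no-cache, max-age=0, must-revalidate",
  };
}

// Run git to list the refs and prepend the service announcement.
async function advertiseRefs(repoDir: string, service: GitService): Promise<Buffer> {
  const subcommand = service.replace(/^git-/, ""); // "upload-pack" or "receive-pack"
  const { stdout } = await execFileAsync(
    "git",
    [subcommand, "--stateless-rpc", "--advertise-refs", repoDir],
    { encoding: "buffer", maxBuffer: 64 * 1024 * 1024 },
  );
  return Buffer.concat([Buffer.from(serviceAnnouncement(service), "ascii"), stdout]);
}
```

Using execFile with an argument array, rather than a shell command string, keeps a repository path taken from the URL from being interpreted by a shell.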
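Note that isomorphic-git does not work with bare repositories when only dir is passed, because it then looks for <dir>/.git; a bare repository has to be addressed through gitdir.
Implementation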
/**
 * license: MIT
 * repo: https://code-dev.online/code-dev/code-dev/-/blob/main/src/plugins/code-dev/repo/repo.ts?ref_type=heads
 **/
import * as hapi from "@hapi/hapi";
import * as git from "isomorphic-git";
import fs from "fs/promises";
import Logger from "@/utils/logger";
import { DatabaseMethods } from "@/plugins/core/database/database";

interface RepoMethods {
  verifyRepoName: (repoName: string) => Promise<boolean>;
  isValidRepoNameFormat: (repoName: string) => Promise<boolean>;
  createRepo: (repoName: string, namespace: string) => Promise<void>;
  refs: (repoName: string, namespace: string, service: string) => Promise<string | boolean>;
}

class Repo implements RepoMethods {
  private server: hapi.Server;
  private log: Logger;
  private methods;

  constructor(server: hapi.Server) {
    this.server = server;
    this.log = new Logger();
    this.methods = this.server.methods as unknown as {
      database: DatabaseMethods;
    };
  }

  /**
   * Resolves to true when a repository with this name already exists,
   * which is how createRepo() and refs() interpret the result.
   */
  async verifyRepoName(repoName: string): Promise<boolean> {
    try {
      // Await the query so the real result is returned instead of a placeholder.
      const result = await this.methods.database.query('SELECT COUNT(*) as count FROM repositories WHERE name = ?', [repoName]);
      const exists = Boolean(result && result[0] && result[0].count > 0);
      this.log.info('repo', `Repository name "${repoName}" ${exists ? 'already exists' : 'is available'}.`);
      return exists;
    } catch (error) {
      const errorMsg = error instanceof Error ? error.message : String(error);
      this.log.error('repo', `Error checking repository name "${repoName}": ${errorMsg}`);
      throw new Error(`Database error: ${errorMsg}`);
    }
  }

  async isValidRepoNameFormat(repoName: string): Promise<boolean> {
    const regex = /^[a-zA-Z0-9]([a-zA-Z0-9-_]{1,48}[a-zA-Z0-9])?$/;
    return regex.test(repoName) && repoName.length >= 3 && repoName.length <= 50;
  }

  async createRepo(repoName: string, namespace: string): Promise<void> {
    if (!await this.isValidRepoNameFormat(repoName)) {
      throw new Error("Invalid repository name format.");
    }
    const exists = await this.verifyRepoName(repoName);
    if (exists) {
      throw new Error(`Repository name "${repoName}" already exists.`);
    }
    try {
      await this.methods.database.query('INSERT INTO repositories (name, namespace) VALUES (?, ?)', [repoName, namespace]);
      await git.init({
        fs,
        dir: `${process.env.REPO_PATH}/${repoName}`,
        bare: true
      });
      this.log.debug('repo', `Bare repository "${repoName}" created successfully.`);
    } catch (error) {
      const errorMsg = error instanceof Error ? error.message : String(error);
      this.log.error('repo', `Error creating repository "${repoName}": ${errorMsg}`);
      throw new Error(`Database error: ${errorMsg}`);
    }
  }

  /**
   * List every reference of the repository (HEAD, branches, tags) and
   * format the result as a smart HTTP reference advertisement.
   */
  async refs(repoName: string, namespace: string, service: string): Promise<string | boolean> {
    // Validate the repository name format
    if (!await this.isValidRepoNameFormat(repoName)) return false;
    // Check that the repository exists
    const exists = await this.verifyRepoName(repoName);
    if (!exists) return false;
    this.log.debug('repo', `Listing refs for repository "${repoName}" in namespace "${namespace}"`);
    // createRepo() makes bare repositories, so this path is the git directory itself
    const repoPath = `${process.env.REPO_PATH}/${repoName}`;
    const refs: string[] = [];

    // Build the service announcement from the service parameter
    let serviceType = 'git-upload-pack';
    if (service === 'git-receive-pack') {
      serviceType = 'git-receive-pack';
    }
    // pkt-line length = 4 length digits + payload, e.g. 001e for git-upload-pack
    const serviceLine = `# service=${serviceType}\n`;
    const serviceHeader = `${(serviceLine.length + 4).toString(16).padStart(4, '0')}${serviceLine}0000`;

    // Helper: resolve a ref without throwing
    const safeResolveRef = async (ref: string): Promise<string | null> => {
      try {
        return await git.resolveRef({ fs, gitdir: repoPath, ref });
      } catch (error) {
        const errorMsg = error instanceof Error ? error.message : String(error);
        this.log.error('repo', `Error resolving ref "${ref}" for repository "${repoName}": ${errorMsg}`);
        return null;
      }
    };

    // HEAD, followed by a NUL byte and the capability list
    // (the symref target is hard-coded to refs/heads/main here)
    const headOid = await safeResolveRef('HEAD');
    if (headOid) {
      refs.push(`${headOid} HEAD\x00multi_ack thin-pack side-band side-band-64k ofs-delta shallow deepen-since deepen-not deepen-relative no-progress include-tag multi_ack_detailed no-done symref=HEAD:refs/heads/main agent=isomorphic-git/1.0`);
    }

    // Branches
    try {
      const branches = await git.listBranches({ fs, gitdir: repoPath });
      for (const branch of branches) {
        const branchRef = `refs/heads/${branch}`;
        const oid = await safeResolveRef(branchRef);
        if (oid) refs.push(`${oid} ${branchRef}`);
      }
    } catch (error) {
      this.log.error('repo', `Error listing branches for repository "${repoName}": ${error instanceof Error ? error.message : String(error)}`);
      return false;
    }

    // Tags
    try {
      const tags = await git.listTags({ fs, gitdir: repoPath });
      for (const tag of tags) {
        const tagRef = `refs/tags/${tag}`;
        const oid = await safeResolveRef(tagRef);
        if (oid) refs.push(`${oid} ${tagRef}`);
      }
    } catch (error) {
      this.log.error('repo', `Error listing tags for repository "${repoName}": ${error instanceof Error ? error.message : String(error)}`);
      return false;
    }

    // Format the output as pkt-lines
    let result = serviceHeader;
    for (const ref of refs) {
      const line = `${ref}\n`;
      // The length prefix counts bytes, not UTF-16 code units
      result += `${(Buffer.byteLength(line) + 4).toString(16).padStart(4, '0')}${line}`;
    }
    result += '0000';
    return result;
  }
}
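Data transfer - how it works
git transfers data in two directions: fetching data from the server and pushing data to the server. These are the fetch and push operations, and they differ as follows
1. Fetch: after reference discovery, the client works out which objects it needs and POSTs that request to the server in pkt-line format. The server computes and builds the pack, then sends it to the client as the response to the POST; the client unpacks it and updates its references.
2. Push: after reference discovery, the client works out locally which objects the server is missing, packs them, and POSTs them to the server; the server unpacks them and updates its references.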
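To make the fetch side concrete, the body a client POSTs to git-upload-pack can be sketched as below. The sketch assumes the original (v0) pack protocol and a single negotiation round, and the function names pkt and buildFetchRequest are hypothetical. The first want line carries the capabilities the client selected, a flush packet ends the want list, and done ends the request.

```typescript
// pkt-line for ASCII-only payloads (object IDs and keywords).
function pkt(payload: string): string {
  return (payload.length + 4).toString(16).padStart(4, "0") + payload;
}

// Build the request body for POST $GIT_URL/git-upload-pack.
function buildFetchRequest(
  wants: string[],
  haves: string[] = [],
  capabilities: string[] = [],
): string {
  if (wants.length === 0) throw new Error("at least one want is required");
  let body = "";
  wants.forEach((oid, index) => {
    // Capabilities are sent only on the first want line.
    const caps = index === 0 && capabilities.length > 0 ? ` ${capabilities.join(" ")}` : "";
    body += pkt(`want ${oid}${caps}\n`);
  });
  body += "0000"; // flush packet: end of the want list
  for (const oid of haves) {
    body += pkt(`have ${oid}\n`); // objects the client already has
  }
  body += pkt("done\n");
  return body;
}

// A fresh clone has no haves:
console.log(buildFetchRequest(["0f50f9f0e06f98b6df57969e62741766d6805a73"]));
// "0032want 0f50f9f0e06f98b6df57969e62741766d6805a73\n00000009done\n"
```

The request is sent with Content-Type: application/x-git-upload-pack-request, and the pack arrives in the body of the response.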