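Notes on installing faiss
- Install faiss the normal way. The official instructions use conda install -c pytorch faiss-cpu; I used pip install faiss-cpu.
- However, the numpy version must not be 1.26.0; I used 1.22.4.
  - Reference: 【精选】【Python库安装填坑】faiss在Windows上的安装问题—from . import _swigfaiss ImportError: DLL load failed: 找不到指定的模块。_faiss cpu安装windows版本_黄水生的博客-CSDN博客
  - 1.21.4 was too old for my pandas.
- Otherwise you get: ImportError: DLL load failed while importing _swigfaiss: 找不到指定的模块。 ("The specified module could not be found.")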
Notes when installing the pip requirements:
- ~~gradio needs to be downgraded: pip install gradio==3.10~~

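Overall workflow

Files to download: 1. langchain-chatglm-Webui; 2. chatglm2-6b; 3. text2vec embedding encoder
1. Download location: 数据集 - LangChain-ChatGLM-Webui - OpenI - 启智AI开源社区提供普惠算力！ (pcl.ac.cn)
Encode the knowledge as vectors and store them in the knowledge base (a vector database)
1. Read the content
2. Split the text
3. Embedding
Encode the query
Search
Feed the retrieved content together with the query to the LLM as the prompt
LLM response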
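
The workflow above can be sketched end to end in a few lines. This is only a toy illustration under loud assumptions: `embed` uses character-bigram counts as a stand-in for a real embedding model, the "vector database" is a plain Python list, and all function names are hypothetical rather than part of langchain or any other library.

```python
import math
from collections import Counter


def split_text(text, chunk_size=256):
    """Split on newlines, then cut each line into pieces of at most chunk_size characters."""
    chunks = []
    for line in text.splitlines():
        if not line.strip():
            continue
        chunks += [line[i:i + chunk_size] for i in range(0, len(line), chunk_size)]
    return chunks


def embed(text):
    """Toy stand-in for an embedding model: character-bigram counts."""
    return Counter(text[i:i + 2] for i in range(len(text) - 1))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[key] * b[key] for key in a if key in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def build_store(text, chunk_size=256):
    """Read -> split -> embed; the 'knowledge base' is a list of (chunk, vector) pairs."""
    return [(chunk, embed(chunk)) for chunk in split_text(text, chunk_size)]


def search(store, query, k=1):
    """Encode the query and return the k most similar chunks."""
    query_vec = embed(query)
    ranked = sorted(store, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]


def build_prompt(chunks, query):
    """Combine the retrieved chunks and the query into the prompt sent to the LLM."""
    context = "\n".join(chunks)
    return f"Context:\n{context}\n\nQuestion: {query}"
```

In the real setup the list is replaced by a vector database, `embed` by the text2vec model, and the string returned by `build_prompt` is what gets sent to the LLM.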

Experiment

Environment
python 3.8, torch 2.0, cuda 11.8
GPU: A5000 24G, RAM 30G, disk 50
%% ssh -p 34207 root@region-8.autodl.pro
PNtiy0k3JecZ %%
The ssh command and password above can change.

I first went into the home directory, ran mkdir LLM there, and did the deployment inside that folder.
git clone https://github.com/thomas-yanxin/LangChain-ChatGLM-Webui.git
1. Copy the project files into the target directory. The clone kept timing out for reasons I could not determine, so I uploaded the files manually with MobaXterm.
2. Download the embedding model with wget
  1. wget -O text2vec-chinese-base.zip 'https://s3.openi.org.cn/opendata/attachment/0/c/0cebbcbc-5e41-4826-9052-718b601790d9?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=1fa9e58b6899afd26dd3%2F20231026%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20231026T121940Z&X-Amz-Expires=86400&X-Amz-SignedHeaders=host&response-content-disposition=attachment%3B filename%3D"text2vec-base-chinese.zip"&X-Amz-Signature=fc26e7e501d45de3beb5eebfa11c9b2dd35f18ea722d5cbd455dea79528d2284'
    1. -O saves the download under the given file name
    2. The link must be wrapped in ASCII (English) quotes, otherwise the command fails. Single quotes '...' are the safe choice here, because the URL itself contains double quotes.
3. Likewise, download the LLM (chatglm2-6b) with wget
  1. wget -O chatglm-6b.zip 'https://s3.openi.org.cn/opendata/attachment/c/8/c84eb2d7-fcb7-4d4e-8fe5-b1f8e6186fc3?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=1fa9e58b6899afd26dd3%2F20231026%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20231026T125738Z&X-Amz-Expires=86400&X-Amz-SignedHeaders=host&response-content-disposition=attachment%3B filename%3D"chatglm2-6b.zip"&X-Amz-Signature=bb9cb321062ed560c68e64fa61e2303474f6d3af27a3317fc2bd363d4a4115bb'
4. Then run unzip -d text2vec-chinese-base text2vec-chinese-base.zip; the LLM archive is handled the same way
  1. This extracts the zip archive into the given folder
  2. The -d option specifies the target directory for the extracted files
5. git clone did not work on this machine, so I switched to another one. I still do not know why git clone keeps failing.
  1. If installation is really too slow, take the packages out of requirements.txt and install them one by one
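
Installing the entries of requirements.txt one at a time can be scripted, which also shows exactly which package fails. The following is a minimal sketch, not the project's own tooling: `parse_requirements` and `install_one_by_one` are hypothetical helper names, and the parser only handles plain specifier lines (it skips blank lines, comments, and option lines such as `-r`).

```python
import re
import subprocess
import sys


def parse_requirements(text):
    """Return the requirement specifiers in text, skipping blanks, comments and options."""
    specs = []
    for line in text.splitlines():
        # A comment starts at the beginning of the line or after whitespace.
        line = re.sub(r"(^|\s)#.*$", "", line).strip()
        if line and not line.startswith("-"):
            specs.append(line)
    return specs


def install_one_by_one(specs, run=subprocess.run):
    """Run pip install for each specifier separately; return the ones that failed."""
    failed = []
    for spec in specs:
        result = run([sys.executable, "-m", "pip", "install", spec])
        if result.returncode != 0:
            failed.append(spec)
    return failed
```

A typical call would read the file and pass the result on: `install_one_by_one(parse_requirements(open("requirements.txt").read()))`. The returned list names the packages that still need attention.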

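Problems encountered

gradio still has to be 3.23.0, but problems occur either way
1. Do I need to install frpc to tunnel through the network?
2. 【Gradio】Could not create share link-CSDN博客
3. Since wget also fails, I suggest preparing the file locally on Windows and then uploading it into the gradio folder on Linux
  1. With gradio 3.10 it ends with the error: *** Failed to connect to ec2.gradio.app:22: [Errno 110] Connection timed out
  2. With gradio 3.23 it ends with the share link failing to be created
  3. The frpc file cannot be fetched with wget either; how can this be solved?
  4. Remember to make frpc_linux_amd64_v0.2 executable: chmod +x
The model fails to load
1. It never loads
Experiment failed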
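
If the upload is scripted from Python, the permission step can be done there as well. This is a small sketch that behaves roughly like chmod +x (it adds the execute bit for user, group, and others without consulting the umask); `make_executable` is a hypothetical helper name.

```python
import os
import stat


def make_executable(path):
    """Add execute permission for user, group and others, keeping the other bits."""
    mode = os.stat(path).st_mode
    os.chmod(path, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
```

It would be called with the path of the uploaded frpc_linux_amd64_v0.2 file.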

chatglm2-6b deployment attempt

ssh -p 18186 root@region-8.autodl.pro
Fa+uIT7ETyJK

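Same steps as above: git clone first, then wget, then unzip, and so on.
The chatglm2 link provided by the site above is somewhat problematic: it is usable, but it raises errors.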

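Hands-on 2

Install this first: pip install chromadb

01_ChatGLM环境部署_哔哩哔哩_bilibili
langchain+chatglm+chroma
Try this tutorial

For deploying the large model, the author tried both web_demo and the api; the practice here uses the local URL served by the chatglm api.

First run api.py in the chatglm folder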

from langchain.llms import ChatGLM
# Create the LLM
llm = ChatGLM(
    endpoint_url='http://127.0.0.1:8000',  # URL served by api.py
    max_token=80000,
    top_p=0.9
)

from langchain.document_loaders import DirectoryLoader
from langchain.text_splitter import CharacterTextSplitter
def load_documents(directory="books"):
    """
    Load the files under books and split them into chunks
    :param directory: folder containing the source documents
    :return: list of split documents
    """
    loader = DirectoryLoader(directory)
    documents = loader.load()
    text_splitter = CharacterTextSplitter(chunk_size=256, chunk_overlap=0)
    split_docs = text_splitter.split_documents(documents)
    return split_docs

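Storage

# Chroma is used here
# Load the database
import os
from langchain.vectorstores import Chroma
# embeddings: the embedding model object (its construction is not shown in these notes)
if not os.path.exists('VectorStore'):
    # First run: build the store (store_chroma is defined in the Chroma notes further down)
    documents = load_documents()
    db = store_chroma(documents, embeddings)
else:
    # Later runs: reopen the persisted store
    db = Chroma(persist_directory='VectorStore', embedding_function=embeddings)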

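Add a web UI with gradio (simple version)

import gradio as gr
def chat(question, history):
    # qa is the retrieval QA chain; its construction is not shown in these notes
    response = qa.run(question)  # history could also be passed in here
    return response

demo = gr.ChatInterface(chat)
demo.launch(inbrowser=True)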

1. yield: https://zhuanlan.zhihu.com/p/268605982
   1. A function that contains `yield` returns an iterable generator object; you can extract the results by iterating over it with a for loop or by calling next() on it.
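
A short example makes the generator behaviour concrete. It is only an illustration: `stream_answer` is a hypothetical function that yields growing prefixes of a string, which is the shape a streaming chat function usually takes.

```python
def stream_answer(text, step=4):
    """Yield growing prefixes of text, step characters at a time."""
    for end in range(step, len(text) + step, step):
        yield text[:end]


gen = stream_answer("hello world", step=4)
first = next(gen)   # pull a single value by hand
rest = list(gen)    # a for loop or list() drains whatever is left
```

Calling `stream_answer` does not run the loop; the body only advances each time `next()` or a for loop asks for the next value.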

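The input is Chinese but the output is a mix of English and Chinese, which needs to be fixed properly. langchain's prompt is in English. Because chatglm is fairly small, we can skip langchain for the final answer, generate the final answer ourselves, and inject our own prompt.
1. We can also collect some good conversations and use them as part of the prompt, so that the model's output is more stable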
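
Building the final prompt by hand could look like the sketch below. Everything in it is an assumption made for illustration: the wording of `PROMPT_HEADER`, the layout of the prompt, and the helper name `build_prompt` are not taken from langchain or ChatGLM.

```python
# Hypothetical Chinese instruction: answer in Chinese, using only the given information.
PROMPT_HEADER = "请只根据下面的已知信息，用中文回答问题；如果信息不足，请直接说不知道。"


def build_prompt(question, retrieved_chunks, examples=()):
    """Assemble a Chinese prompt from retrieved text, example dialogues and the question."""
    lines = [PROMPT_HEADER, "", "已知信息："]
    lines += [f"- {chunk}" for chunk in retrieved_chunks]
    for example_q, example_a in examples:  # collected good conversations, used as few-shot examples
        lines += ["", f"问题：{example_q}", f"回答：{example_a}"]
    lines += ["", f"问题：{question}", "回答："]
    return "\n".join(lines)
```

The retrieved chunks would come from the vector store's similarity search (the `page_content` of each returned document), and the resulting string is sent to the LLM directly instead of going through `qa.run`.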

Additional notes

from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.text_splitter import CharacterTextSplitter
from langchain.vectorstores import Chroma
from langchain.document_loaders import TextLoader

loader = TextLoader("../../../state_of_the_union.txt")
documents = loader.load()
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
docs = text_splitter.split_documents(documents)

embeddings = OpenAIEmbeddings()

db = Chroma.from_documents(docs, embeddings)
query = "What did the president say about Ketanji Brown Jackson"
docs = db.similarity_search(query)

print(type(docs))
print(docs[0])
print(type(docs[0]))
print(docs[0].page_content)
print(type(docs[0].page_content))

"""
At this point docs is a list
docs[0] is: <class 'langchain.schema.document.Document'>
docs[0].page_content is: <class 'str'>
"""

Different vector databases

faiss focuses on similarity search; milvus offers database management in addition to similarity search

# milvus
from pymilvus import connections, db
conn = connections.connect(host="127.0.0.1", port=19530)  # connect to the vector database server
database = db.create_database("sample_db")  # create a database
# switch to the database and list all databases
db.using_database("sample_db")
db.list_database()

# faiss

import faiss
import numpy as np

# Example vector data
data = np.random.rand(1000, 128).astype('float32')
index = faiss.IndexFlatL2(128)  # create an index that uses the L2 distance metric
index.add(data)  # add the data to the index

# Save the index to disk with write_index:
faiss.write_index(index, "my_index.index")

# To load a previously saved index, use read_index:
loaded_index = faiss.read_index("my_index.index")
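
The faiss snippet builds and saves an index but does not query it. In faiss the query step is `index.search(queries, k)`, which returns an array of distances and an array of ids; for `IndexFlatL2` the distances are squared L2 distances from an exhaustive scan. The pure-Python sketch below shows what that scan computes for one query; `l2_search` is a hypothetical name and this is not how faiss is implemented internally.

```python
def l2_search(data, query, k=3):
    """Return (squared_distances, ids) of the k vectors in data closest to query."""
    scored = []
    for idx, vec in enumerate(data):
        dist = sum((a - b) ** 2 for a, b in zip(vec, query))  # squared L2 distance
        scored.append((dist, idx))
    scored.sort()  # smallest distance first; ties broken by id
    top = scored[:k]
    return [dist for dist, _ in top], [idx for _, idx in top]
```

For example, `l2_search([[0, 0], [1, 0], [3, 4]], [0, 0], k=2)` compares the query against all three vectors and keeps the two closest.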

# Chroma
import os
from langchain.vectorstores import Chroma

def store_chroma(docs, embeddings, persist_directory="VectorStore"):
    """
    Vectorize the documents and store them in the vector database
    :param docs: split documents to index
    :param embeddings: embedding model used to encode the documents
    :param persist_directory: folder the store is persisted to
    :return: the Chroma store
    """
    db = Chroma.from_documents(docs, embeddings, persist_directory=persist_directory)
    db.persist()  # saved as sqlite3
    return db

# Load the database
if not os.path.exists('VectorStore'):
    documents = load_documents()
    db = store_chroma(documents, embeddings)
else:
    db = Chroma(persist_directory='VectorStore', embedding_function=embeddings)

sqlite数据库的三种后缀（.db .db3 .sqlite）有什么区别？ - 知乎 (zhihu.com)

笔记︱几款多模态向量检索引擎：Faiss 、milvus、Proxima、vearch、Jina等 - 知乎 (zhihu.com)

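In my experience milvus is somewhat more cumbersome to use, but it provides a visual management tool and is more powerful. chromadb is simple to use; it stores vectors in sqlite3 format, so the data can also be inspected with ordinary database software. faiss saves to an index file, and I did not find any visualization tool for it, which makes later maintenance harder. All three support local deployment.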