Privacy update & readme linting (#472)

* privacy update

* typo

* update date
Nathan Sarrazin 2023-10-04 17:10:44 +02:00 committed by GitHub
parent 98030ef040
commit 8100ea5502
2 changed files with 27 additions and 34 deletions


@ -1,22 +1,25 @@
## Privacy
> Last updated: July 23, 2023
> Last updated: October 4, 2023
Users of HuggingChat are authenticated through their HF user account.
By default, your conversations may be shared with the respective models' authors (e.g. if you're chatting with the Open Assistant model, to <a target="_blank" href="https://open-assistant.io/dashboard">Open Assistant</a>) to improve their training data and model over time. Model authors are the custodians of the data collected by their model, even if it's hosted on our platform.
By default, your conversations may be shared with the respective models' authors to improve their training data and model over time. Model authors are the custodians of the data collected by their model, even if it's hosted on our platform.
If you disable data sharing in your settings, your conversations will not be used for any downstream usage (including for research or model training purposes), and they will only be stored to let you access past conversations. You can click on the Delete icon to delete any past conversation at any moment.
🗓 Please also consult huggingface.co's main privacy policy at https://huggingface.co/privacy. To exercise any of your legal privacy rights, please send an email to privacy@huggingface.co.
🗓 Please also consult huggingface.co's main privacy policy at <https://huggingface.co/privacy>. To exercise any of your legal privacy rights, please send an email to <privacy@huggingface.co>.
## About available LLMs
The goal of this app is to showcase that it is now (May 2023) possible to build an open source alternative to ChatGPT. 💪
The goal of this app is to showcase that it is now possible to build an open source alternative to ChatGPT. 💪
For now, it's running both OpenAssistant's [latest LLaMA based model](https://huggingface.co/OpenAssistant/oasst-sft-6-llama-30b-xor) (which is one of the current best open source chat models) as well as [Meta's newer Llama 2](https://huggingface.co/meta-llama/Llama-2-70b-chat-hf), but the plan in the longer-term is to expose all good-quality chat models from the Hub.
For now (October 2023), it's running:
We are not affiliated with Open Assistant nor Meta AI, but if you want to contribute to the training data for the next generation of open models, please consider contributing to https://open-assistant.io/ or https://ai.meta.com/llama/ ❤️
- [Llama 2 70B](https://huggingface.co/meta-llama/Llama-2-70b-chat-hf)
- [CodeLlama 34B](https://about.fb.com/news/2023/08/code-llama-ai-for-coding/)
- [Falcon 180B](https://www.tii.ae/news/technology-innovation-institute-introduces-worlds-most-powerful-open-llm-falcon-180b)
- [Mistral 7B](https://mistral.ai/news/announcing-mistral-7b/)
## Technical details
@ -28,11 +31,6 @@ The inference backend is running the optimized [text-generation-inference](https
It is therefore possible to deploy a copy of this app to a Space and customize it (swap model, add some UI elements, or store user messages according to your own Terms and conditions). You can also 1-click deploy your own instance using the [Chat UI Spaces Docker template](https://huggingface.co/new-space?template=huggingchat/chat-ui-template).
We welcome any feedback on this app: please participate in the public discussion at https://huggingface.co/spaces/huggingchat/chat-ui/discussions
We welcome any feedback on this app: please participate in the public discussion at <https://huggingface.co/spaces/huggingchat/chat-ui/discussions>
<a target="_blank" href="https://huggingface.co/spaces/huggingchat/chat-ui/discussions"><img src="https://huggingface.co/datasets/huggingface/badges/raw/main/open-a-discussion-xl.svg" title="open a discussion"></a>
## Coming soon
- User setting to share conversations with model authors (done ✅)
- LLM watermarking


@ -39,7 +39,7 @@ The default config for Chat UI is stored in the `.env` file. You will need to ov
Start by creating a `.env.local` file in the root of the repository. The bare minimum config you need to get Chat UI to run locally is the following:
```bash
```env
MONGODB_URL=<the URL to your mongoDB instance>
HF_ACCESS_TOKEN=<your access token>
```
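As a rough illustration of the minimum config above, the following Python sketch checks that a `.env.local` file defines both required variables. The helper names `parse_env` and `missing_keys` are hypothetical and not part of Chat UI, and the parser is deliberately simplified: it only understands single-line `KEY=value` entries, not multi-line values such as `MODELS`.

```python
# Hypothetical helper, not part of Chat UI: a simplified .env.local check.
REQUIRED_KEYS = ("MONGODB_URL", "HF_ACCESS_TOKEN")


def parse_env(text):
    """Parse single-line KEY=value entries; comments and other lines are skipped."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def missing_keys(text, required=REQUIRED_KEYS):
    """Return the required keys that are absent or empty in the given text."""
    values = parse_env(text)
    return [key for key in required if not values.get(key)]


if __name__ == "__main__":
    sample = "MONGODB_URL=mongodb://localhost:27017\n"
    print(missing_keys(sample))
```

Running it on the sample text reports `HF_ACCESS_TOKEN` as missing, since only `MONGODB_URL` is set.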
@ -87,7 +87,7 @@ Chat UI features a powerful Web Search feature. It works by:
The login feature is disabled by default and users are assigned a unique ID based on their browser. But if you want to use OpenID to authenticate your users, you can add the following to your `.env.local` file:
```bash
```env
OPENID_PROVIDER_URL=<your OIDC issuer>
OPENID_CLIENT_ID=<your OIDC client ID>
OPENID_CLIENT_SECRET=<your OIDC client secret>
@ -99,7 +99,7 @@ These variables will enable the openID sign-in modal for users.
You can use a few environment variables to customize the look and feel of chat-ui. The defaults are:
```
```env
PUBLIC_APP_NAME=ChatUI
PUBLIC_APP_ASSETS=chatui
PUBLIC_APP_COLOR=blue
@ -113,7 +113,7 @@ PUBLIC_APP_DISCLAIMER=
- `PUBLIC_APP_DATA_SHARING` Can be set to 1 to add a toggle in the user settings that lets your users opt in to data sharing with model creators.
- `PUBLIC_APP_DISCLAIMER` If set to 1, we show a disclaimer about generated outputs on login.
### Web Search
### Web Search config
You can enable the web search by adding either `SERPER_API_KEY` ([serper.dev](https://serper.dev/)) or `SERPAPI_KEY` ([serpapi.com](https://serpapi.com/)) to your `.env.local`.
@ -121,8 +121,7 @@ You can enable the web search by adding either `SERPER_API_KEY` ([serper.dev](ht
You can customize the parameters passed to the model or even use a new model by updating the `MODELS` variable in your `.env.local`. The default one can be found in `.env` and looks like this:
```
```env
MODELS=`[
{
"name": "OpenAssistant/oasst-sft-4-pythia-12b-epoch-3.5",
@ -162,15 +161,15 @@ MODELS=`[
You can change things like the parameters, or customize the preprompt to better suit your needs. You can also add more models by adding more objects to the array, with different preprompts for example.
#### Custom prompt templates:
#### Custom prompt templates
By default the prompt is constructed using `userMessageToken`, `assistantMessageToken`, `userMessageEndToken`, `assistantMessageEndToken`, `preprompt` parameters and a series of default templates.
However, these templates can be modified by setting the `chatPromptTemplate` and `webSearchQueryPromptTemplate` parameters. Note that if WebSearch is not enabled, only `chatPromptTemplate` needs to be set. The template language is https://handlebarsjs.com. The templates have access to the model's prompt parameters (`preprompt`, etc.). However, if the templates are specified it is recommended to inline the prompt parameters, as using the references (`{{preprompt}}`) is deprecated.
However, these templates can be modified by setting the `chatPromptTemplate` and `webSearchQueryPromptTemplate` parameters. Note that if WebSearch is not enabled, only `chatPromptTemplate` needs to be set. The template language is <https://handlebarsjs.com>. The templates have access to the model's prompt parameters (`preprompt`, etc.). However, if the templates are specified it is recommended to inline the prompt parameters, as using the references (`{{preprompt}}`) is deprecated.
For example:
```
```prompt
<System>You are an AI, called ChatAI.</System>
{{#each messages}}
{{#ifUser}}<User>{{content}}</User>{{/ifUser}}
@ -179,13 +178,13 @@ For example:
<Assistant>
```
**chatPromptTemplate**
##### chatPromptTemplate
When querying the model for a chat response, the `chatPromptTemplate` template is used. `messages` is an array of chat messages in the format `[{ content: string }, ...]`. To identify whether a message is a user message or an assistant message, the `ifUser` and `ifAssistant` block helpers can be used.
The following is the default `chatPromptTemplate`, although newlines and indentation have been added for readability.
```
```prompt
{{preprompt}}
{{#each messages}}
{{#ifUser}}{{@root.userMessageToken}}{{content}}{{@root.userMessageEndToken}}{{/ifUser}}
@ -194,13 +193,13 @@ The following is the default `chatPromptTemplate`, although newlines and indenti
{{assistantMessageToken}}
```
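To make the template's behavior concrete, here is a small Python sketch of what rendering it amounts to. It is not how Chat UI does it (Chat UI uses Handlebars); the `Message` class, its `from_user` flag, and the token values in the usage example are hypothetical. The assistant branch is also an assumption: that line of the template is not visible in this diff, so the sketch simply mirrors the user branch with `assistantMessageToken` and `assistantMessageEndToken`.

```python
from dataclasses import dataclass


@dataclass
class Message:
    content: str
    from_user: bool  # hypothetical stand-in for the ifUser / ifAssistant helpers


def render_chat_prompt(messages, preprompt, user_token, user_end_token,
                       assistant_token, assistant_end_token):
    """Concatenate the preprompt, each wrapped message, and a trailing assistant token."""
    parts = [preprompt]
    for message in messages:
        if message.from_user:
            parts.append(f"{user_token}{message.content}{user_end_token}")
        else:
            # Assumed to mirror the user branch; not shown in the diff.
            parts.append(f"{assistant_token}{message.content}{assistant_end_token}")
    # The prompt ends with the assistant token so the model continues as the assistant.
    parts.append(assistant_token)
    return "".join(parts)


if __name__ == "__main__":
    history = [Message("Hi", True), Message("Hello!", False), Message("Who are you?", True)]
    # Placeholder token values chosen for illustration only.
    print(render_chat_prompt(history, "PRE", "<u>", "</u>", "<a>", "</a>"))
```

The trailing assistant token is the important design point: the rendered prompt stops exactly where the model's reply is expected to begin.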
**webSearchQueryPromptTemplate**
##### webSearchQueryPromptTemplate
When performing a web search, the search query is constructed using the `webSearchQueryPromptTemplate` template. It is recommended that the prompt instruct the chat model to return only a few keywords.
The following is the default `webSearchQueryPromptTemplate`.
```
```prompt
{{userMessageToken}}
My question is: {{message.content}}.
Based on the conversation history (my previous questions are: {{previousMessages}}), give me an appropriate query to answer my question for google search. You should not say more than query. You should not say any words except the query. For the context, today is {{currentDate}}
@ -216,13 +215,11 @@ A good option is to hit a [text-generation-inference](https://github.com/hugging
To do this, you can add your own endpoints to the `MODELS` variable in `.env.local`, by adding an `"endpoints"` key for each model in `MODELS`.
```
```env
{
// rest of the model config here
"endpoints": [{"url": "https://HOST:PORT"}]
}
```
If `endpoints` is left unspecified, ChatUI will look for the model on the hosted Hugging Face inference API using the model name.
@ -243,22 +240,20 @@ For `Bearer` you can use a token, which can be grabbed from [here](https://huggi
You can then add the generated information and the `authorization` parameter to your `.env.local`.
```
```env
"endpoints": [
{
"url": "https://HOST:PORT",
"authorization": "Basic VVNFUjpQQVNT",
}
]
```
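For `Basic` authorization, the value after `Basic ` is the Base64 encoding of `user:password`, as in standard HTTP Basic authentication. The sketch below shows one way to produce it in Python; the function name `basic_authorization` is hypothetical, and `USER` / `PASS` are the placeholder credentials matching the example value above.

```python
import base64


def basic_authorization(user, password):
    """Build an HTTP Basic authorization value from a user name and password."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return f"Basic {token}"


if __name__ == "__main__":
    # Placeholder credentials; prints "Basic VVNFUjpQQVNT"
    print(basic_authorization("USER", "PASS"))
```

Base64 is an encoding, not encryption, so the resulting value should be treated as a secret and sent only over HTTPS.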
### Amazon SageMaker
You can also specify your Amazon SageMaker instance as an endpoint for chat-ui. The config goes like this:
```
```env
"endpoints": [
{
"host" : "sagemaker",
@ -268,6 +263,7 @@ You can also specify your Amazon SageMaker instance as an endpoint for chat-ui.
"sessionToken": "", // optional
"weight": 1
}
]
```
You can get the `accessKey` and `secretKey` from your AWS user, under programmatic access.
@ -284,8 +280,7 @@ If you're using a self-signed certificate, e.g. for testing or development purpo
If the model being hosted will be available on multiple servers/instances add the `weight` parameter to your `.env.local`. The `weight` will be used to determine the probability of requesting a particular endpoint.
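As an illustration of weight-based selection, the Python sketch below picks an endpoint with probability proportional to its `weight`. This is only a sketch of the general technique, not Chat UI's actual implementation; the function name `pick_endpoint`, the example URLs, and the default weight of 1 for endpoints without a `weight` are all assumptions.

```python
import random


def pick_endpoint(endpoints, rng=random):
    """Choose one endpoint at random, with probability proportional to its weight."""
    # Assumption: an endpoint with no explicit weight counts as weight 1.
    weights = [endpoint.get("weight", 1) for endpoint in endpoints]
    return rng.choices(endpoints, weights=weights, k=1)[0]


if __name__ == "__main__":
    # Hypothetical endpoints: the first should be chosen about twice as often.
    endpoints = [
        {"url": "https://host-a.example", "weight": 2},
        {"url": "https://host-b.example", "weight": 1},
    ]
    print(pick_endpoint(endpoints)["url"])
```

With weights 2 and 1, the first endpoint would receive roughly two thirds of the requests over many calls.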
```
```env
"endpoints": [
{
"url": "https://HOST:PORT",