Update testvideoYoutube notebook to replace ChatOpenAI with ChatOllama and enhance French summary
parent 8292dc15b3
commit b7e2ded889
@@ -717,13 +717,14 @@
 },
 {
 "cell_type": "code",
-"execution_count": 120,
+"execution_count": 125,
 "metadata": {},
 "outputs": [],
 "source": [
 "from langchain_core.runnables import RunnablePassthrough, RunnableLambda\n",
 "from langchain_core.messages import SystemMessage, HumanMessage\n",
 "from langchain_openai import ChatOpenAI\n",
+"from langchain_ollama import ChatOllama\n",
 "from base64 import b64decode\n",
 "\n",
 "\n",
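For orientation, a minimal standalone sketch (not part of the commit) of what the new import enables. It assumes the Ollama server address used later in the notebook, that `llama3.2` has already been pulled (`ollama pull llama3.2`), and that `langchain-ollama` is installed; the `temperature` setting is a hypothetical addition.

```python
from langchain_ollama import ChatOllama

# Point the client at the remote Ollama server. Writing the http:// scheme
# explicitly is the unambiguous form; some client versions also accept the
# bare host:port used in the notebook.
llm = ChatOllama(
    base_url="http://172.20.48.1:11434",  # assumed reachable Ollama host
    model="llama3.2",
    temperature=0,  # hypothetical: favor deterministic summaries
)

# ChatOllama returns an AIMessage, same interface as ChatOpenAI.
print(llm.invoke("Say hello.").content)
```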
@@ -781,17 +782,17 @@
 " \"question\": RunnablePassthrough(),\n",
 " }\n",
 " | RunnableLambda(build_prompt)\n",
-" | ChatOpenAI(model=\"gpt-4o-mini\")\n",
+" | ChatOllama(base_url=\"172.20.48.1:11434\", model=\"llama3.2\")\n",
 " | StrOutputParser()\n",
 ")\n",
-"\n",
+"# ChatOpenAI(model=\"gpt-4o-mini\")\n",
 "chain_with_sources = {\n",
 " \"context\": retriever | RunnableLambda(parse_docs),\n",
 " \"question\": RunnablePassthrough(),\n",
 "} | RunnablePassthrough().assign(\n",
 " response=(\n",
 " RunnableLambda(build_prompt)\n",
-" | ChatOpenAI(model=\"gpt-4o-mini\")\n",
+" | ChatOllama(base_url=\"172.20.48.1:11434\", model=\"llama3.2\")\n",
 " | StrOutputParser()\n",
 " )\n",
 ")"
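The `assign` step is what distinguishes the two chains: `RunnablePassthrough().assign(response=...)` passes the `{context, question}` dict through unchanged and merges the generated answer into it, so callers get the response and its sources together. A self-contained sketch of that pattern, with stub lambdas standing in for the notebook's `retriever`, `parse_docs`, and `build_prompt`:

```python
from langchain_core.runnables import RunnableLambda, RunnablePassthrough

# Stand-in for `retriever | RunnableLambda(parse_docs)`.
fake_retrieve = RunnableLambda(lambda q: [f"doc about {q}"])
# Stand-in for `build_prompt | ChatOllama(...) | StrOutputParser()`.
fake_generate = RunnableLambda(lambda d: f"answer based on {d['context']}")

chain_with_sources = {
    "context": fake_retrieve,
    "question": RunnablePassthrough(),
} | RunnablePassthrough().assign(response=fake_generate)

result = chain_with_sources.invoke("attention")
print(sorted(result))  # ['context', 'question', 'response']
```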
@@ -799,28 +800,42 @@
 },
 {
 "cell_type": "code",
-"execution_count": 121,
+"execution_count": 126,
 "metadata": {},
 "outputs": [
 {
 "name": "stdout",
 "output_type": "stream",
 "text": [
-"The attention mechanism, specifically the Scaled Dot-Product Attention, involves processing input data through queries (Q), keys (K), and values (V). Here's how it works:\n",
+"Here is a summary of the paper:\n",
 "\n",
-"1. **Input Matrices**: The inputs consist of matrices representing queries (Q), keys (K), and values (V).\n",
-"2. **Dot Product**: The mechanism computes the dot products between the queries and keys.\n",
-"3. **Scaling**: The results are scaled by the square root of the dimension of the keys (√dk) to prevent overly large values which can push the softmax function into regions of very small gradients.\n",
-"4. **Softmax**: A softmax function is applied to obtain attention weights.\n",
-"5. **Output**: These weights are used to compute a weighted sum of the values (V), resulting in the final output.\n",
+"**Introduction**\n",
 "\n",
-"Multiple heads can be employed to allow the model to focus on different information subspaces, enhancing the model's ability to capture diverse interactions within the data. This process is depicted in the provided diagram, showcasing the flow from the linear transformations to the concatenation of the outputs.\n"
+"The paper presents a new natural language processing model called the Transformer, which uses attention to process sequences of words.\n",
+"\n",
+"**Model architecture**\n",
+"\n",
+"The model has two parts: an encoder and a decoder. The encoder is a stack of 6 identical layers, each made of two sub-layers: a multi-head attention sub-layer and a feed-forward network (FFNN) sub-layer. The decoder is likewise a stack of 6 identical layers, but with a third multi-head attention sub-layer over the outputs of the encoder stack.\n",
+"\n",
+"**Attention**\n",
+"\n",
+"Attention is a mechanism that lets each word take every word in the sentence into account when choosing the most appropriate representation. Attention is computed as a linear combination of the values and the keys, where the keys are the vectors representing the words.\n",
+"\n",
+"**Model variations**\n",
+"\n",
+"The model presented in the paper can be modified to evaluate the importance of its different parts. The reported experiments show that changing certain parts of the model can have a significant impact on its performance.\n",
+"\n",
+"**Results**\n",
+"\n",
+"The results reported in the paper are positive, with better performance than traditional natural language processing models. The proposed model can handle longer sentences and more complex contexts than traditional models.\n",
+"\n",
+"In summary, this paper presents a new natural language processing model that uses attention to process sequences of words. The results show that the proposed model handles longer sentences and more complex contexts than traditional models.\n"
 ]
 }
 ],
 "source": [
 "response = chain.invoke(\n",
-" \"What is the attention mechanism?\"\n",
+" \"résume moi le papier ?\"\n",
 ")\n",
 "\n",
 "print(response)"
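The plain `chain` returns only the parsed string. A hypothetical companion call (not in the commit) could send the same question through `chain_with_sources` to inspect what the local model grounded its summary on; the keys follow the dict built in the earlier cell.

```python
# Assumes chain_with_sources from the earlier cell is in scope.
result = chain_with_sources.invoke("résume moi le papier ?")
print(result["response"])  # the generated summary
print(result["context"])   # whatever parse_docs returned for the retrieved docs
```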
@@ -828,40 +843,59 @@
 },
 {
 "cell_type": "code",
-"execution_count": 122,
+"execution_count": 127,
 "metadata": {},
 "outputs": [
 {
 "name": "stdout",
 "output_type": "stream",
 "text": [
-"Response: The diagram of a Transformer architecture, as described in the provided context, consists of two main parts: the Encoder and the Decoder. \n",
+"Response: Unfortunately, I'm a text-based AI and cannot display images directly. However, I can provide you with a textual representation of the Transformer architecture based on Figure 1 provided in the context.\n",
 "\n",
-"### Transformer Architecture Overview\n",
-"1. **Encoder:**\n",
-" - Composed of multiple identical layers (N layers).\n",
-" - Each layer includes:\n",
-" - A multi-head self-attention mechanism.\n",
-" - A feed-forward network.\n",
-" - Residual connections and layer normalization applied to each sub-layer.\n",
+"The Transformer architecture consists of an encoder and a decoder stack, each composed of identical layers. Here is a simplified diagram of the model:\n",
 "\n",
-"2. **Decoder:**\n",
-" - Also composed of multiple identical layers (N layers).\n",
-" - Each layer includes:\n",
-" - A masked multi-head self-attention mechanism.\n",
-" - Multi-head attention over the encoder output.\n",
-" - A feed-forward network.\n",
-" - Residual connections and layer normalization.\n",
+"**Encoder:**\n",
 "\n",
-"3. **Outputs:**\n",
-" - Final output probabilities are generated through a Softmax layer following a linear transformation.\n",
+"* Input Embedding (Embedding Inputs)\n",
+"* Positional Encoding\n",
+"* Layer 1:\n",
+" + Multi-Head Self-Attention Mechanism\n",
+" + Residual Connection and Layer Normalization (LayerNorm)\n",
+"* Layer 2:\n",
+" + Simple, Position-Wise Fully Connected Feed-Forward Network\n",
+" + Residual Connection and Layer Normalization (LayerNorm)\n",
 "\n",
-"### Attention Mechanism\n",
-"- The attention mechanism employs queries (Q), keys (K), and values (V).\n",
-"- It uses scaled dot-product attention, calculated as:\n",
-" - The output is a weighted sum of the values, determined by the compatibility scores of the queries with the keys.\n",
+"**Decoder:**\n",
 "\n",
-"These elements work together to allow the Transformer to handle dependencies across input sequences without relying on recurrence or convolutions, making it highly parallelizable and efficient for tasks like machine translation.\n",
+"* Input Embedding (Embedding Outputs)\n",
+"* Positional Encoding\n",
+"* Layer 1:\n",
+" + Multi-Head Self-Attention Mechanism\n",
+" + Residual Connection and Layer Normalization (LayerNorm)\n",
+"* Layer 2:\n",
+" + Simple, Position-Wise Fully Connected Feed-Forward Network\n",
+" + Residual Connection and Layer Normalization (LayerNorm)\n",
+"* Additional Sub-Layer:\n",
+" + Multi-Head Attention over the Output of the Encoder Stack\n",
+" + Residual Connection and Layer Normalization (LayerNorm)\n",
+"\n",
+"Here is a more detailed representation of the sub-layers:\n",
+"\n",
+"* Multi-Head Self-Attention Mechanism: \n",
+" - Query (Q) and Key (K)\n",
+" - Compute Attention Weights and Apply to Value (V)\n",
+" - Output as Weighted Sum\n",
+"* Simple, Position-Wise Fully Connected Feed-Forward Network:\n",
+" - Input Embedding\n",
+" - ReLU Activation Function\n",
+" - Residual Connection and Layer Normalization (LayerNorm)\n",
+"* Multi-Head Attention over the Output of the Encoder Stack:\n",
+" - Query (Q) from Decoder Outputs\n",
+" - Key (K) and Value (V) from Encoder Outputs\n",
+" - Compute Attention Weights and Apply to Value (V)\n",
+" - Output as Weighted Sum\n",
+"\n",
+"Note that this is a simplified representation, and you can refer to the original paper or supplementary materials for more details on the architecture.\n",
 "\n",
 "\n",
 "Context:\n",
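Both the removed English output and the new outline paraphrase the same computation from the paper. For reference, the scaled dot-product attention they describe is:

```latex
% Queries are scored against keys, scaled by sqrt(d_k) to keep softmax
% gradients in a healthy range, and the resulting weights form a
% weighted sum of the values.
\[
\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{QK^{\top}}{\sqrt{d_k}}\right) V
\]
```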