# Specifications of the `py_310_sample_multimodal_rag` code environment

## Packages

If you plan to run IDEFICS2 on a GPU, add extra options for `pip install` in the "General" tab of the code environment. For example, with CUDA 11.7: `--index-url https://download.pytorch.org/whl/cu117 --extra-index-url https://pypi.org/simple`.
![extra_options.png](ruxHtcr7pr95)
```
langchain==0.2.1
langchain-community==0.2.1
langchain-text-splitters==0.2.0
sentencepiece==0.2.0
protobuf==4.25.3
torch<=2.0.0
pydantic==1.10.15
bitsandbytes==0.43.1
accelerate==0.31.0
faiss-cpu==1.8.0
unstructured[all-docs]==0.14.4
PyMuPDF==1.24.5
pillow==10.3.0
dash==2.17.0
dash-bootstrap-components==1.6.0
```
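The extra options for `pip install` mentioned at the top of this section depend on the CUDA version. As a minimal sketch, the hypothetical helper `pip_extra_options` below (not part of Dataiku or PyTorch) builds the option string from a CUDA version, assuming the PyTorch wheel index follows the `cu<major><minor>` naming seen in the CUDA 11.7 example. Not every CUDA version has a matching index, so check that the URL exists before using it.

```python
def pip_extra_options(cuda_version: str) -> str:
    """Build the "pip install" extra options for a CUDA version such as "11.7"."""
    major, minor = cuda_version.split(".")[:2]
    # Assumption: CUDA-specific PyTorch wheels are served under /whl/cu<major><minor>
    torch_index = f"https://download.pytorch.org/whl/cu{major}{minor}"
    # Keep PyPI as an extra index so that all other packages still resolve
    return f"--index-url {torch_index} --extra-index-url https://pypi.org/simple"


print(pip_extra_options("11.7"))
# --index-url https://download.pytorch.org/whl/cu117 --extra-index-url https://pypi.org/simple
```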

## Resources initialization script

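If you only plan to use GPT-4V or GPT-4o, you can remove the lines that download IDEFICS2 (the second `MODEL_REVISION` assignment and the two `from_pretrained` calls on `HuggingFaceM4/idefics2-8b`).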

```
## Base imports
import os

from dataiku.code_env_resources import clear_all_env_vars
from dataiku.code_env_resources import grant_permissions
from dataiku.code_env_resources import set_env_path

# Clear all environment variables defined by a previously run script
clear_all_env_vars()

## Hugging Face
# Set Hugging Face cache directories
set_env_path("HF_HOME", "huggingface")
set_env_path("TRANSFORMERS_CACHE", "huggingface/transformers")
hf_home_dir = os.getenv("HF_HOME")
transformers_home_dir = os.getenv("TRANSFORMERS_CACHE")

# Download the YOLOX layout detection model (from the unstructuredio repository)
from huggingface_hub import hf_hub_download
hf_hub_download(repo_id="unstructuredio/yolo_x_layout", filename="yolox_l0.05.onnx")

# Download the table structure recognition model
from transformers import AutoProcessor, AutoModel, TableTransformerForObjectDetection, AutoModelForVision2Seq
model = TableTransformerForObjectDetection.from_pretrained("microsoft/table-transformer-structure-recognition")

# Download SigLIP at a pinned revision
MODEL_REVISION = "7067f6db2baa594bab7c6d965fe488c7ac62f1c8"
processor = AutoProcessor.from_pretrained("google/siglip-so400m-patch14-384", revision=MODEL_REVISION)
model = AutoModel.from_pretrained("google/siglip-so400m-patch14-384", revision=MODEL_REVISION)

# Download IDEFICS2 at a pinned revision (not needed if you only use GPT-4V or GPT-4o)
MODEL_REVISION = "317fdf2b92817565b11949ef71f8c8d9552f3e07"
processor = AutoProcessor.from_pretrained("HuggingFaceM4/idefics2-8b", revision=MODEL_REVISION)
model = AutoModelForVision2Seq.from_pretrained("HuggingFaceM4/idefics2-8b", revision=MODEL_REVISION)

# Grant everyone read access to pretrained models in the HF_HOME folder
# (by default, only readable by the owner)
grant_permissions(hf_home_dir)
grant_permissions(transformers_home_dir)

## NLTK
# Set NLTK data directory
set_env_path("NLTK_DATA", "nltk_data")

nltk_data_path = os.environ["NLTK_DATA"]
os.makedirs(nltk_data_path, exist_ok=True)

# Import NLTK
import nltk

# Download models: NLTK manages this itself and skips the download
# if the model is already in NLTK_DATA.
nltk.download('punkt')
nltk.download('averaged_perceptron_tagger')

grant_permissions(nltk_data_path)
```
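After the script has run, it can be useful to confirm that each resource variable points to a real directory. The following is a minimal sketch using only the standard library. The helper `missing_resource_dirs` is hypothetical (it is not part of the `dataiku.code_env_resources` API), and the variable names are those set by the script above.

```python
import os


def missing_resource_dirs(env_var_names, environ=None):
    """Return the names whose variable is unset or is not an existing directory."""
    environ = os.environ if environ is None else environ
    missing = []
    for name in env_var_names:
        path = environ.get(name)
        if not path or not os.path.isdir(path):
            missing.append(name)
    return missing


# An empty list means every resource directory was found
print(missing_resource_dirs(["HF_HOME", "TRANSFORMERS_CACHE", "NLTK_DATA"]))
```

Run it from a notebook or recipe that uses this code environment. A non-empty result suggests that the resources initialization script did not complete or was modified.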

# Specifications of the `py_310_sample_colpali` code environment

## Packages

```
byaldi==0.0.5
einops==0.8.0
bitsandbytes==0.44.1
seaborn==0.13.2
dash==2.18.1
dash-bootstrap-components==1.6.0
```

## Resources initialization script

```
## Base imports
import os
import torch

from dataiku.code_env_resources import clear_all_env_vars
from dataiku.code_env_resources import grant_permissions
from dataiku.code_env_resources import set_env_path
from dataiku.code_env_resources import set_env_var

# Clear all environment variables defined by a previously run script
clear_all_env_vars()

set_env_path("HF_HOME", "huggingface")
hf_home_dir = os.getenv("HF_HOME")

MODEL = "vidore/colpali-v1.2"
REVISION = "2d54d5d3684a4f5ceeefbef95df0c94159fd6a45"

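# Download the ColPali model only when a GPU is available;
# without CUDA, nothing is downloaded at this step
if torch.cuda.is_available():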
    from byaldi import RAGMultiModalModel
    RAG = RAGMultiModalModel.from_pretrained(MODEL, revision=REVISION)

# Grant everyone read access to pretrained models in the HF_HOME folder
# (by default, only readable by the owner)
grant_permissions(hf_home_dir)
```
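Because the download above only happens when CUDA is available, the script can finish without error while the model cache stays empty. The sketch below checks for a cached copy, assuming the default `huggingface_hub` cache layout (`<HF_HOME>/hub/models--<org>--<name>/snapshots/...`) and that `HF_HUB_CACHE` has not been overridden. The helpers `hub_repo_cache_dir` and `is_model_cached` are hypothetical names, not library functions.

```python
import os


def hub_repo_cache_dir(hf_home: str, repo_id: str) -> str:
    """Return the folder where a model repo is cached under the default layout."""
    # Assumed layout: <HF_HOME>/hub/models--<org>--<name>
    return os.path.join(hf_home, "hub", "models--" + repo_id.replace("/", "--"))


def is_model_cached(hf_home: str, repo_id: str) -> bool:
    """True if the repo folder exists and holds at least one snapshot."""
    snapshots = os.path.join(hub_repo_cache_dir(hf_home, repo_id), "snapshots")
    return os.path.isdir(snapshots) and len(os.listdir(snapshots)) > 0


hf_home = os.getenv("HF_HOME", os.path.expanduser("~/.cache/huggingface"))
print(is_model_cached(hf_home, "vidore/colpali-v1.2"))
```

If this prints `False`, the environment was most likely built on a machine without a GPU, and the model would have to be downloaded at run time instead.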