
error uploading file using ollama #253

@svonjoi

Description


Trying to upload a simple `.txt` file fails with the error below.

`worker_ingestion.log`
2025-08-17 18:33:17,857 - core.workers.ingestion_worker - INFO - Starting ingestion job for file: test.txt
2025-08-17 18:33:17,857 - core.workers.ingestion_worker - INFO - ColPali parameter received: use_colpali=True (type: <class 'bool'>)
2025-08-17 18:33:17,966 - core.workers.ingestion_worker - INFO - Downloading file from storage/ingest_uploads/0a0d4473-0572-457c-b21f-a83f39b3c209/test.txt
2025-08-17 18:33:17,967 - core.workers.ingestion_worker - INFO - File download took 0.00s for 0.00MB
2025-08-17 18:33:17,967 - core.workers.ingestion_worker - INFO - ColPali decision: use_colpali=True, has_model=False, has_store=False, using_colpali=None
2025-08-17 18:33:17,967 - core.workers.ingestion_worker - INFO - Processing decision for unknown file: skip_text_parsing=None (ColPali=None, text_rules=False, native_format=False, image_rules=False)
2025-08-17 18:33:21,496 - core.workers.ingestion_worker - INFO - Document retrieval took 1.00s
2025-08-17 18:33:21,516 - core.workers.ingestion_worker - INFO - Initial document update took 0.02s
2025-08-17 18:33:21,524 - core.workers.ingestion_worker - INFO - Text chunking took 0.00s to create 1 chunks
2025-08-17 18:33:21,524 - core.workers.ingestion_worker - INFO - Determined final page count for usage recording: 2 pages (ColPali used: None)
2025-08-17 18:33:21,620 - core.workers.ingestion_worker - INFO - Embedding generation took 0.09s for 1 embeddings (11.07 embeddings/s)
2025-08-17 18:33:21,644 - core.workers.ingestion_worker - ERROR - Error processing ingestion job for file test.txt: (sqlalchemy.dialects.postgresql.asyncpg.Error) <class 'asyncpg.exceptions.DataError'>: expected 1536 dimensions, not 768
[SQL: INSERT INTO vector_embeddings (document_id, chunk_number, content, chunk_metadata, embedding) VALUES ($1::VARCHAR, $2::INTEGER, $3::VARCHAR, $4::VARCHAR, $5)]
[parameters: ('fb15d77c-06db-4194-bd8e-05dc5e624103', 0, '1AkjA2345 12345 12345 12345 12345 12345 12345 12345 12345 12345 12345 12345 12345 12345 12345\n\n1AkjA2345 12345 12345 12345 12345 12345 12345 12345  ... (3093 characters truncated) ... 2345 12345 12345 12345 12345 12345 12345 12345 12345\n\n1AkjA2345 12345 12345 12345 12345 12345 12345 12345 12345 12345 12345 12345 12345 12345 12345', '{}', '[0.0028226126,-0.016980335,-0.19114871,-0.059885986,0.037791196,-0.006855199,0.01513065,0.0687457,-0.0006289475,-0.020440858,-0.104271635,0.050466344 ... (9215 characters truncated) ... .029276893,0.06331349,0.092164025,-0.016793778,0.03446088,-0.036287807,0.0014320372,0.025927795,-0.021227248,-0.071461365,-0.029777579,-0.0027285467]')]
(Background on this error at: https://sqlalche.me/e/20/dbapi)
2025-08-17 18:33:21,650 - core.workers.ingestion_worker - ERROR - Traceback (most recent call last):
  File "/app/.venv/lib/python3.11/site-packages/sqlalchemy/dialects/postgresql/asyncpg.py", line 545, in _prepare_and_execute
    self._rows = deque(await prepared_stmt.fetch(*parameters))
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.11/site-packages/asyncpg/prepared_stmt.py", line 176, in fetch
    data = await self.__bind_execute(args, 0, timeout)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.11/site-packages/asyncpg/prepared_stmt.py", line 267, in __bind_execute
    data, status, _ = await self.__do_execute(
                      ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.11/site-packages/asyncpg/prepared_stmt.py", line 256, in __do_execute
    return await executor(protocol)
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "asyncpg/protocol/protocol.pyx", line 206, in bind_execute
asyncpg.exceptions.DataError: expected 1536 dimensions, not 768

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/app/.venv/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 1963, in _exec_single_context
    self.dialect.do_execute(
  File "/app/.venv/lib/python3.11/site-packages/sqlalchemy/engine/default.py", line 943, in do_execute
    cursor.execute(statement, parameters)
  File "/app/.venv/lib/python3.11/site-packages/sqlalchemy/dialects/postgresql/asyncpg.py", line 580, in execute
    self._adapt_connection.await_(
  File "/app/.venv/lib/python3.11/site-packages/sqlalchemy/util/_concurrency_py3k.py", line 132, in await_only
    return current.parent.switch(awaitable)  # type: ignore[no-any-return,attr-defined] # noqa: E501
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.11/site-packages/sqlalchemy/util/_concurrency_py3k.py", line 196, in greenlet_spawn
    value = await result
            ^^^^^^^^^^^^
  File "/app/.venv/lib/python3.11/site-packages/sqlalchemy/dialects/postgresql/asyncpg.py", line 558, in _prepare_and_execute
    self._handle_exception(error)
  File "/app/.venv/lib/python3.11/site-packages/sqlalchemy/dialects/postgresql/asyncpg.py", line 508, in _handle_exception
    self._adapt_connection._handle_exception(error)
  File "/app/.venv/lib/python3.11/site-packages/sqlalchemy/dialects/postgresql/asyncpg.py", line 792, in _handle_exception
    raise translated_error from error
sqlalchemy.dialects.postgresql.asyncpg.AsyncAdapt_asyncpg_dbapi.Error: <class 'asyncpg.exceptions.DataError'>: expected 1536 dimensions, not 768

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/app/core/workers/ingestion_worker.py", line 854, in process_ingestion_job
    await document_service._store_chunks_and_doc(
  File "/app/core/services/document_service.py", line 1976, in _store_chunks_and_doc
    chunk_ids = await store_with_retry(self.vector_store, chunk_objects, "regular")
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/core/services/document_service.py", line 1880, in store_with_retry
    success, result = await store.store_embeddings(objects, auth.app_id if auth else None)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/core/vector_store/pgvector_store.py", line 392, in store_embeddings
    await session.execute(VectorEmbedding.__table__.insert().values(rows))
  File "/app/.venv/lib/python3.11/site-packages/sqlalchemy/ext/asyncio/session.py", line 463, in execute
    result = await greenlet_spawn(
             ^^^^^^^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.11/site-packages/sqlalchemy/util/_concurrency_py3k.py", line 201, in greenlet_spawn
    result = context.throw(*sys.exc_info())
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.11/site-packages/sqlalchemy/orm/session.py", line 2365, in execute
    return self._execute_internal(
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.11/site-packages/sqlalchemy/orm/session.py", line 2260, in _execute_internal
    result = conn.execute(
             ^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 1415, in execute
    return meth(
           ^^^^^
  File "/app/.venv/lib/python3.11/site-packages/sqlalchemy/sql/elements.py", line 523, in _execute_on_connection
    return connection._execute_clauseelement(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 1637, in _execute_clauseelement
    ret = self._execute_context(
          ^^^^^^^^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 1842, in _execute_context
    return self._exec_single_context(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 1982, in _exec_single_context
    self._handle_dbapi_exception(
  File "/app/.venv/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 2351, in _handle_dbapi_exception
    raise sqlalchemy_exception.with_traceback(exc_info[2]) from e
  File "/app/.venv/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 1963, in _exec_single_context
    self.dialect.do_execute(
  File "/app/.venv/lib/python3.11/site-packages/sqlalchemy/engine/default.py", line 943, in do_execute
    cursor.execute(statement, parameters)
  File "/app/.venv/lib/python3.11/site-packages/sqlalchemy/dialects/postgresql/asyncpg.py", line 580, in execute
    self._adapt_connection.await_(
  File "/app/.venv/lib/python3.11/site-packages/sqlalchemy/util/_concurrency_py3k.py", line 132, in await_only
    return current.parent.switch(awaitable)  # type: ignore[no-any-return,attr-defined] # noqa: E501
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.11/site-packages/sqlalchemy/util/_concurrency_py3k.py", line 196, in greenlet_spawn
    value = await result
            ^^^^^^^^^^^^
  File "/app/.venv/lib/python3.11/site-packages/sqlalchemy/dialects/postgresql/asyncpg.py", line 558, in _prepare_and_execute
    self._handle_exception(error)
  File "/app/.venv/lib/python3.11/site-packages/sqlalchemy/dialects/postgresql/asyncpg.py", line 508, in _handle_exception
    self._adapt_connection._handle_exception(error)
  File "/app/.venv/lib/python3.11/site-packages/sqlalchemy/dialects/postgresql/asyncpg.py", line 792, in _handle_exception
    raise translated_error from error
sqlalchemy.exc.DBAPIError: (sqlalchemy.dialects.postgresql.asyncpg.Error) <class 'asyncpg.exceptions.DataError'>: expected 1536 dimensions, not 768
[SQL: INSERT INTO vector_embeddings (document_id, chunk_number, content, chunk_metadata, embedding) VALUES ($1::VARCHAR, $2::INTEGER, $3::VARCHAR, $4::VARCHAR, $5)]
[parameters: ('fb15d77c-06db-4194-bd8e-05dc5e624103', 0, '1AkjA2345 12345 12345 12345 12345 12345 12345 12345 12345 12345 12345 12345 12345 12345 12345\n\n1AkjA2345 12345 12345 12345 12345 12345 12345 12345  ... (3093 characters truncated) ... 2345 12345 12345 12345 12345 12345 12345 12345 12345\n\n1AkjA2345 12345 12345 12345 12345 12345 12345 12345 12345 12345 12345 12345 12345 12345 12345', '{}', '[0.0028226126,-0.016980335,-0.19114871,-0.059885986,0.037791196,-0.006855199,0.01513065,0.0687457,-0.0006289475,-0.020440858,-0.104271635,0.050466344 ... (9215 characters truncated) ... .029276893,0.06331349,0.092164025,-0.016793778,0.03446088,-0.036287807,0.0014320372,0.025927795,-0.021227248,-0.071461365,-0.029777579,-0.0027285467]')]
(Background on this error at: https://sqlalche.me/e/20/dbapi)

2025-08-17 18:33:21,695 - core.workers.ingestion_worker - INFO - Updated document fb15d77c-06db-4194-bd8e-05dc5e624103 status to failed
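The root cause looks like a dimension mismatch rather than an Ollama failure: `[embedding]` in `morphik.toml` is set to `dimensions = 768` (correct for `nomic-embed-text`), but the existing `vector_embeddings` table was evidently created with a `VECTOR(1536)` column, presumably from an earlier run with a 1536-dimensional model such as `text-embedding-3-small`. pgvector rejects any insert whose vector length differs from the column's declared dimension, which is exactly the check the error reports. A minimal Python sketch of that check (names are illustrative, not Morphik's actual code):

```python
def validate_embedding(embedding: list[float], expected_dims: int) -> list[float]:
    """Mirror pgvector's constraint: a VECTOR(n) column rejects any
    vector whose length differs from n."""
    if len(embedding) != expected_dims:
        raise ValueError(f"expected {expected_dims} dimensions, not {len(embedding)}")
    return embedding

# The failing INSERT from the log, reduced to its essentials:
# the table expects 1536, the Ollama embedding carries 768.
try:
    validate_embedding([0.0] * 768, expected_dims=1536)
except ValueError as e:
    print(e)  # expected 1536 dimensions, not 768
```

If that is what happened here, the usual fix is to drop the `vector_embeddings` table (or reset the Postgres volume) so it is recreated with the configured 768 dimensions; setting `dimensions` back to 1536 would only help with a 1536-dimensional embedding model.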
`morphik.toml`
[api]
host = "0.0.0.0"
port = 8000
reload = true

[auth]
jwt_algorithm = "HS256"
dev_mode = true                              # Enabled by default for easier local development
dev_entity_id = "dev_user"                   # Default dev user ID
dev_entity_type = "developer"                # Default dev entity type
dev_permissions = ["read", "write", "admin"] # Default dev permissions

#### Registered models
[registered_models]
# OpenAI models
openai_gpt4-1 = { model_name = "gpt-4.1" }
openai_gpt4-1-mini = { model_name = "gpt-4.1-mini" }

# Azure OpenAI models
azure_gpt4 = { model_name = "gpt-4", api_base = "YOUR_AZURE_URL_HERE", api_version = "2023-05-15", deployment_id = "gpt-4-deployment" }
azure_gpt35 = { model_name = "gpt-3.5-turbo", api_base = "YOUR_AZURE_URL_HERE", api_version = "2023-05-15", deployment_id = "gpt-35-turbo-deployment" }

# Anthropic models
claude_opus = { model_name = "claude-3-opus-20240229" }
claude_sonnet = { model_name = "claude-3-7-sonnet-latest" }

# Google Gemini models
gemini_flash = { model_name = "gemini/gemini-2.5-flash-preview-05-20" } # gemini-2.5-flash-preview-05-20
# gemini_flash = { model_name = "gemini/gemini-2.5-pro-preview-06-05" } # { model_name = "claude-4-sonnet-20250514"} # { model_name = "gpt-4.1" } # { model_name = "gemini/gemini-2.5-pro-preview-06-05" } # {model_name = "o3-2025-04-16"} #  { model_name = "claude-3-7-sonnet-latest"} #  { model_name = "gemini/gemini-2.5-flash-preview-05-20" } # gemini-2.5-flash-preview-05-20
# gemini_flash = { model_name = "groq/meta-llama/llama-4-maverick-17b-128e-instruct"}

# Ollama models (modify api_base based on your deployment)
# - Local Ollama: "http://localhost:11434" (default)
# - Morphik in Docker, Ollama local: "http://host.docker.internal:11434"
# - Both in Docker: "http://ollama:11434"
ollama_qwen_vision = { model_name = "ollama_chat/qwen2.5vl:latest", api_base = "http://host.docker.internal:11434", vision = true }
ollama_embedding = { model_name = "ollama/nomic-embed-text", api_base = "http://host.docker.internal:11434" }
# ollama_llama_vision = { model_name = "ollama_chat/llama3.2-vision", api_base = "http://host.docker.internal:11434", vision = true }

# Lemonade models (for AMD GPU and NPU support)
# - Local: "http://localhost:8020/api/v1"
# - Morphik in Docker: "http://host.docker.internal:8020/api/v1"
lemonade_qwen = { model_name = "openai/Qwen2.5-VL-7B-Instruct-GGUF", api_base = "http://host.docker.internal:8020/api/v1", vision = true }
lemonade_embedding = { model_name = "openai/nomic-embed-text-v1-GGUF", api_base = "http://host.docker.internal:8020/api/v1" }

openai_embedding = { model_name = "text-embedding-3-small" }
openai_embedding_large = { model_name = "text-embedding-3-large" }
azure_embedding = { model_name = "text-embedding-ada-002", api_base = "YOUR_AZURE_URL_HERE", api_version = "2023-05-15", deployment_id = "embedding-ada-002" }


#### Component configurations ####

[agent]
model = "ollama_qwen_vision" # Model for the agent logic

[completion]
model = "ollama_qwen_vision" #"openai_gpt4-1-mini"  # Reference to a key in registered_models
default_max_tokens = "1000"
default_temperature = 0.3

[database]
provider = "postgres"
# Connection pool settings
pool_size = 10       # Maximum number of connections in the pool
max_overflow = 15    # Maximum number of connections that can be created beyond pool_size
pool_recycle = 3600  # Time in seconds after which a connection is recycled (1 hour)
pool_timeout = 10    # Seconds to wait for a connection from the pool
pool_pre_ping = true # Check connection viability before using it from the pool
max_retries = 3      # Number of retries for database operations
retry_delay = 1.0    # Initial delay between retries in seconds

[embedding]
model = "ollama_embedding" # Reference to registered model
dimensions = 768
# dimensions = 1536
similarity_metric = "cosine"

[parser]
chunk_size = 6000
chunk_overlap = 300
use_unstructured_api = false
use_contextual_chunking = false
contextual_chunking_model = "ollama_qwen_vision" # Reference to a key in registered_models

[parser.xml]
max_tokens = 350
preferred_unit_tags = ["SECTION", "Section", "Article", "clause"]
ignore_tags = ["TOC", "INDEX"]

[document_analysis]
model = "ollama_qwen_vision" # Reference to a key in registered_models

[parser.vision]
model = "ollama_qwen_vision" # Reference to a key in registered_models
frame_sample_rate = -1       # Set to -1 to disable frame captioning

[reranker]
use_reranker = false
provider = "flag"
model_name = "BAAI/bge-reranker-large"
query_max_length = 256
passage_max_length = 512
use_fp16 = true
device = "mps"                         # use "cpu" if on docker and using a mac, "cuda" if cuda enabled device

[storage]
provider = "local"
storage_path = "./storage"

# [storage]
# provider = "aws-s3"
# region = "us-east-2"
# bucket_name = "morphik-s3-storage"

[vector_store]
provider = "pgvector"

[rules]
model = "ollama_qwen_vision"
batch_size = 4096

[morphik]
enable_colpali = false
mode = "self_hosted"          # "cloud" or "self_hosted"
api_domain = "api.morphik.ai" # API domain for cloud URIs
# Only call the embedding API if colpali_mode is "api"
morphik_embedding_api_domain = "http://localhost:6000" # endpoint for multivector embedding service
colpali_mode = "local"                                 # "off", "local", or "api"

[graph]
model = "ollama_qwen_vision"
enable_entity_resolution = true

# [graph]
# mode="api"
# base_url="https://graph-api.morphik.ai"

[telemetry]
enabled = true
honeycomb_enabled = true
honeycomb_endpoint = "https://api.honeycomb.io"
honeycomb_proxy_endpoint = "https://otel-proxy.onrender.com"
service_name = "databridge-core"
otlp_timeout = 10
otlp_max_retries = 3
otlp_retry_delay = 1
otlp_max_export_batch_size = 512
otlp_schedule_delay_millis = 5000
otlp_max_queue_size = 2048
Making sure the models work as expected:

```shell
# testing qwen2.5vl
ollama run qwen2.5vl:7b

# testing nomic-embed-text
curl http://localhost:11434/api/embeddings -d '{
  "model": "nomic-embed-text",
  "prompt": "The sky is blue because of Rayleigh scattering"
}'
```
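The curl call above returns a JSON body whose `embedding` array length is the model's native dimensionality; for `nomic-embed-text` that should be 768, matching `[embedding].dimensions` in `morphik.toml`. A quick sketch of that sanity check against a canned response (the real body comes from the Ollama endpoint):

```python
import json

# Abbreviated stand-in for an /api/embeddings response body;
# a real nomic-embed-text response carries 768 floats.
response_body = json.dumps({"embedding": [0.0028, -0.0169] + [0.0] * 766})

dims = len(json.loads(response_body)["embedding"])
print(dims)  # 768 -- must match the dimensions configured in morphik.toml
```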

Ollama models are stored as blobs, so I don't understand what the paths `ollama_chat/<model>` and `ollama/<model>` refer to (docs).
