Feature: Start Marqo Docker with any flags #28

Merged on Jun 2, 2023 (77 commits)
Commits
c090017
added dummy throttling test file
vicilliar May 8, 2023
a64d2c9
draft of rerun_marqo_with_env_vars util
vicilliar May 8, 2023
5a3bf27
draft application test added
vicilliar May 9, 2023
6517e14
testing removing
vicilliar May 10, 2023
c8c2a97
testing adding space
vicilliar May 10, 2023
d7299b5
putting back
vicilliar May 10, 2023
ec1838f
putting debugging line
vicilliar May 10, 2023
fe883f3
putting debugging line for other scenario
vicilliar May 10, 2023
2b45853
testing env var behavior
vicilliar May 10, 2023
9f7efc2
changed cuda 2 formula
vicilliar May 10, 2023
e70f07b
testing string no spaces cuda
vicilliar May 10, 2023
eb27b8a
remove test string in tox
vicilliar May 10, 2023
57695d1
temp cuda tox change
vicilliar May 10, 2023
a0f6dd8
added utilities import
vicilliar May 10, 2023
5b20dd8
import subprocess
vicilliar May 10, 2023
a5ab706
debugging os environ
vicilliar May 10, 2023
00f8009
changed setenv
vicilliar May 10, 2023
2632cce
set env tests root
vicilliar May 10, 2023
d606ed3
passed toxinidir
vicilliar May 10, 2023
9c56031
changed . to bash
vicilliar May 10, 2023
3d45bee
testing Popen
vicilliar May 10, 2023
e9823ac
testing where loop stops
vicilliar May 10, 2023
821e2f3
more debugs
vicilliar May 10, 2023
3dc9d99
communicate instead of wait
vicilliar May 10, 2023
108f4fc
attempted using communicate
vicilliar May 19, 2023
b48ed64
Merge branch 'mainline' into joshua/start-marqo-with-env-vars
vicilliar May 19, 2023
210c346
back to popen
vicilliar May 22, 2023
fedf02d
debug statements
vicilliar May 22, 2023
b96df62
debug statements
vicilliar May 22, 2023
f22d6a9
debug statements
vicilliar May 22, 2023
73d9db5
detach docker container
vicilliar May 22, 2023
e3d8f50
disown instead of detach docker
vicilliar May 22, 2023
91e5cdf
dit docker then attach for output
vicilliar May 22, 2023
a411e4c
trying docker logs
vicilliar May 22, 2023
4cbae6c
changed debug messages
vicilliar May 22, 2023
df66739
removed quotes from
vicilliar May 22, 2023
b1f1adb
args method
vicilliar May 22, 2023
f0e744b
removed single quotes
vicilliar May 22, 2023
64b7449
added model preload tests
vicilliar May 22, 2023
020e1c9
removed '
vicilliar May 22, 2023
ec54cce
stopped using list method
vicilliar May 22, 2023
26610b3
cleaned up script
vicilliar May 22, 2023
15e9851
updated sh cuda and py with list method
vicilliar May 23, 2023
2a9ab23
added combination test
vicilliar May 23, 2023
34ece74
added solution to all start scripts
vicilliar May 23, 2023
dfc5cbc
removed initial default checks in tests
vicilliar May 24, 2023
11a8c20
removed initial default checks preload models
vicilliar May 24, 2023
e7fd008
removed initial default checks preload models
vicilliar May 24, 2023
09d70c0
returned pytest args to all tests
vicilliar May 25, 2023
ef6cb56
returned pytest args again
vicilliar May 25, 2023
645eb18
returned pytest args again
vicilliar May 25, 2023
d840f30
changed teardown to class method
vicilliar May 25, 2023
937a343
added setenv to arm, local os, s2search
vicilliar May 25, 2023
5825dd0
teardown uses cls
vicilliar May 25, 2023
442696e
debugging class name
vicilliar May 26, 2023
3149311
debugging class name
vicilliar May 26, 2023
fd9f7c4
removed debug local os
vicilliar May 26, 2023
e50a15c
arm debug tests
vicilliar May 26, 2023
ad92bb9
updated arm code
vicilliar May 29, 2023
dc779df
debug for arm output
vicilliar May 29, 2023
18eecb6
debug arm time
vicilliar May 29, 2023
a158097
add set -x debug
vicilliar May 29, 2023
70ba921
removed &&
vicilliar May 29, 2023
fce3f5b
added ps aux
vicilliar May 29, 2023
f9b5253
detached marqo-os container
vicilliar May 29, 2023
ec1b132
detached marqo-os container with interactive
vicilliar May 29, 2023
125c7d0
removed debug statements
vicilliar May 29, 2023
c18fdf3
changed arm pytest args
vicilliar May 29, 2023
fd7596a
removed debug arm
vicilliar May 29, 2023
b32f5d7
s2search debug
vicilliar May 29, 2023
70fe79d
s2search -d move
vicilliar May 29, 2023
0a20e5e
debug s2search
vicilliar May 29, 2023
d197e26
debug s2search
vicilliar May 29, 2023
93ea0a8
debug s2search
vicilliar May 29, 2023
346479b
debug s2search
vicilliar May 29, 2023
1bad21d
run conf in s2search backend
vicilliar May 29, 2023
fd3ef98
added tests/ back in s2search
vicilliar May 29, 2023
30 changes: 25 additions & 5 deletions scripts/start_arm64_local_marqo_os.sh
@@ -1,23 +1,43 @@
 #!/bin/bash
 # args:
 # $1 : marqo_image_name - name of the image you want to test
+# $@ : env_vars - strings representing all args to pass docker call

 export LOCAL_OPENSEARCH_URL="https://localhost:9200"

-docker rm -f marqo-os &&
-docker run --name marqo-os -p 9200:9200 -p 9600:9600 -e "discovery.type=single-node" marqoai/marqo-os:0.0.3-arm &
+docker rm -f marqo-os
+docker run -id --name marqo-os -p 9200:9200 -p 9600:9600 -e "discovery.type=single-node" marqoai/marqo-os:0.0.3-arm &

 # wait for marqo-os to start

 until [[ $(curl -v --silent --insecure $LOCAL_OPENSEARCH_URL 2>&1 | grep Unauthorized) ]]; do
     sleep 0.1;
 done;

+MARQO_DOCKER_IMAGE="$1"
+shift

 docker rm -f marqo
-docker run --name marqo --privileged -p 8882:8882 --add-host host.docker.internal:host-gateway \
-    -e "MARQO_MAX_CPU_MODEL_MEMORY=1.6" \
-    -e "OPENSEARCH_URL=$LOCAL_OPENSEARCH_URL" "$1" &

+# Explanation:
+# -d detaches docker from process (so subprocess does not wait for it)
+# ${@:+"$@"} adds ALL args (past $1) if any exist.

+set -x
+docker run -d --name marqo --privileged -p 8882:8882 --add-host host.docker.internal:host-gateway \
+    -e "MARQO_MAX_CPU_MODEL_MEMORY=1.6" \
+    -e "OPENSEARCH_URL=$LOCAL_OPENSEARCH_URL" \
+    ${@:+"$@"} "$MARQO_DOCKER_IMAGE"
+set +x

+# Follow docker logs (since it is detached)
+docker logs -f marqo &
+LOGS_PID=$!

 # wait for marqo to start
 until [[ $(curl -v --silent --insecure http://localhost:8882 2>&1 | grep Marqo) ]]; do
     sleep 0.1;
 done;

+# Kill the `docker logs` command (so subprocess does not wait for it)
+kill $LOGS_PID
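
How the positional arguments are consumed can be sketched from the caller's side in Python. This is a minimal illustration only: `build_start_command` and `split_like_script` are hypothetical names, the image tag is made up, and the real `utilities.rerun_marqo_with_env_vars` is not shown in this diff. The point is that the image name goes first and every Docker flag follows as its own argv entry, which is what lets `shift` plus `"$@"` forward values containing spaces or JSON intact.

```python
from typing import List, Sequence, Tuple


def build_start_command(script: str, image: str, docker_args: Sequence[str]) -> List[str]:
    """Hypothetical helper: build the argv for one of the start scripts.

    The image becomes $1; every remaining entry ends up in "$@" after `shift`.
    """
    return ["bash", script, image, *docker_args]


def split_like_script(argv: Sequence[str]) -> Tuple[str, List[str]]:
    """Mimic `MARQO_DOCKER_IMAGE="$1"; shift` on the script's own arguments."""
    image, *rest = argv
    return image, list(rest)


cmd = build_start_command(
    "scripts/start_dind_marqo_os.sh",
    "marqo_docker_0",  # assumed image tag, for illustration only
    ["-e", "MARQO_MAX_NUMBER_OF_REPLICAS=5"],
)
# cmd[2:] is what the script sees as its positional parameters.
image, extra = split_like_script(cmd[2:])
print(image, extra)
```

When no extra flags are passed, `extra` is an empty list, which corresponds to `${@:+"$@"}` expanding to nothing in the script.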
21 changes: 20 additions & 1 deletion scripts/start_cuda_dind_marqo_os.sh
@@ -1,9 +1,28 @@
 #!/bin/bash
 # args:
 # $1 : marqo_image_name - name of the image you want to test
+# $@ : env_vars - strings representing all args to pass docker call
 docker rm -f marqo;
-docker run --name marqo --gpus all --privileged -p 8882:8882 --add-host host.docker.internal:host-gateway "$1" &

+MARQO_DOCKER_IMAGE="$1"
+shift

+# Explanation:
+# -d detaches docker from process (so subprocess does not wait for it)
+# ${@:+"$@"} adds ALL args (past $1) if any exist.
+set -x
+docker run -d --name marqo --gpus all --privileged -p 8882:8882 --add-host host.docker.internal:host-gateway \
+    ${@:+"$@"} "$MARQO_DOCKER_IMAGE"
+set +x

+# Follow docker logs (since it is detached)
+docker logs -f marqo &
+LOGS_PID=$!

 # wait for marqo to start
 until [[ $(curl -v --silent --insecure http://localhost:8882 2>&1 | grep Marqo) ]]; do
     sleep 0.1;
 done;

+# Kill the `docker logs` command (so subprocess does not wait for it)
+kill $LOGS_PID
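
The readiness loop in these scripts polls forever if Marqo never comes up. A bounded variant is sketched below in Python. `wait_until` and its parameters are illustrative names, not part of this PR, and the probe is injected so the sketch runs without a live Marqo instance.

```python
import time
from typing import Callable


def wait_until(probe: Callable[[], bool], timeout_s: float = 60.0, interval_s: float = 0.1) -> bool:
    """Poll `probe` until it returns True or `timeout_s` elapses.

    Returns True on success and False on timeout, instead of looping forever.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if probe():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)


# Stand-in probe (assumption for the demo): reports "ready" on the third call.
calls = {"n": 0}


def fake_probe() -> bool:
    calls["n"] += 1
    return calls["n"] >= 3


ready = wait_until(fake_probe, timeout_s=5.0, interval_s=0.01)
print(ready, calls["n"])
```

A real probe could fetch `http://localhost:8882` and check the response body for `Marqo`, mirroring the `curl | grep` test in the scripts.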
23 changes: 22 additions & 1 deletion scripts/start_dind_marqo_os.sh
@@ -1,10 +1,31 @@
 #!/bin/bash
 # args:
 # $1 : marqo_image_name - name of the image you want to test
+# $@ : env_vars - strings representing all args to pass docker call

+MARQO_DOCKER_IMAGE="$1"
+shift

 docker rm -f marqo;
-docker run --name marqo --privileged -p 8882:8882 --add-host host.docker.internal:host-gateway -e "MARQO_MAX_CPU_MODEL_MEMORY=1.6" "$1" &

+# Explanation:
+# -d detaches docker from process (so subprocess does not wait for it)
+# ${@:+"$@"} adds ALL args (past $1) if any exist.

+set -x
+docker run -d --name marqo --privileged -p 8882:8882 --add-host host.docker.internal:host-gateway \
+    -e "MARQO_MAX_CPU_MODEL_MEMORY=1.6" \
+    ${@:+"$@"} "$MARQO_DOCKER_IMAGE"
+set +x

+# Follow docker logs (since it is detached)
+docker logs -f marqo &
+LOGS_PID=$!

 # wait for marqo to start
 until [[ $(curl -v --silent --insecure http://localhost:8882 2>&1 | grep Marqo) ]]; do
     sleep 0.1;
 done;

+# Kill the `docker logs` command (so subprocess does not wait for it)
+kill $LOGS_PID
24 changes: 21 additions & 3 deletions scripts/start_local_marqo_os.sh
@@ -1,6 +1,7 @@
 #!/bin/bash
 # args:
 # $1 : marqo_image_name - name of the image you want to test
+# $@ : env_vars - strings representing all args to pass docker call

 export LOCAL_OPENSEARCH_URL="https://localhost:9200"

@@ -11,13 +12,30 @@ until [[ $(curl -v --silent --insecure "$LOCAL_OPENSEARCH_URL/_aliases" 2>&1 | g
     sleep 0.1;
 done;

+MARQO_DOCKER_IMAGE="$1"
+shift

 docker rm -f marqo
-docker run --name marqo --privileged -p 8882:8882 --add-host host.docker.internal:host-gateway \
-    -e "MARQO_MAX_CPU_MODEL_MEMORY=1.6" \
-    -e "OPENSEARCH_URL=$LOCAL_OPENSEARCH_URL" --memory=6g "$1" &

+# Explanation:
+# -d detaches docker from process (so subprocess does not wait for it)
+# ${@:+"$@"} adds ALL args (past $1) if any exist.

+set -x
+docker run -d --name marqo --privileged -p 8882:8882 --add-host host.docker.internal:host-gateway \
+    -e "MARQO_MAX_CPU_MODEL_MEMORY=1.6" \
+    -e "OPENSEARCH_URL=$LOCAL_OPENSEARCH_URL" --memory=6g \
+    ${@:+"$@"} "$MARQO_DOCKER_IMAGE"
+set +x

+# Follow docker logs (since it is detached)
+docker logs -f marqo &
+LOGS_PID=$!

 # wait for marqo to start
 until [[ $(curl -v --silent --insecure http://localhost:8882 2>&1 | grep Marqo) ]]; do
     sleep 0.1;
 done;

+# Kill the `docker logs` command (so subprocess does not wait for it)
+kill $LOGS_PID
28 changes: 25 additions & 3 deletions scripts/start_s2search_backend.sh
@@ -1,13 +1,35 @@
 #!/bin/bash
 # args:
 # $1 : marqo_image_name - name of the image you want to test
+# $@ : env_vars - strings representing all args to pass docker call

 export MARQO_API_TESTS_ROOT=$(pwd)
 . "${MARQO_API_TESTS_ROOT}/conf"

+MARQO_DOCKER_IMAGE="$1"
+shift

 docker rm -f marqo;
-docker run --name marqo --privileged -p 8882:8882 --add-host host.docker.internal:host-gateway \
-    -e "MARQO_MAX_CPU_MODEL_MEMORY=1.6" \
-    -e "OPENSEARCH_URL=$S2SEARCH_URL" "$1" &

+# Explanation:
+# -d detaches docker from process (so subprocess does not wait for it)
+# ${@:+"$@"} adds ALL args (past $1) if any exist.

+set -x
+docker run -d --name marqo --privileged -p 8882:8882 --add-host host.docker.internal:host-gateway \
+    -e "MARQO_MAX_CPU_MODEL_MEMORY=1.6" \
+    -e "OPENSEARCH_URL=$S2SEARCH_URL" \
+    ${@:+"$@"} "$MARQO_DOCKER_IMAGE"
+set +x

+# Follow docker logs (since it is detached)
+docker logs -f marqo &
+LOGS_PID=$!

 # wait for marqo to start
 until [[ $(curl -v --silent --insecure http://localhost:8882 2>&1 | grep Marqo) ]]; do
     sleep 0.1;
 done;

+# Kill the `docker logs` command (so subprocess does not wait for it)
+kill $LOGS_PID
2 changes: 1 addition & 1 deletion tests/api_tests/cuda/test_cuda_neural_search.py
@@ -15,7 +15,7 @@
 class TestAddDocuments(MarqoTestCase):

     # NOTE: test_search_with_device was removed from these cuda tests
-    # TODO: Try this solo again if needed -> @allow_environments(["CUDA_DIND_MARQO_OS"])
+    # NOTE: Try this solo again if needed -> @allow_environments(["CUDA_DIND_MARQO_OS"])
     def setUp(self) -> None:
         self.client = Client(**self.client_settings)
         self.index_name_1 = "my-test-index-1"
136 changes: 136 additions & 0 deletions tests/application_tests/test_env_var_changes.py
@@ -0,0 +1,136 @@
+import logging
+import subprocess
+import os
+import time
+from requests import HTTPError
+from tests import marqo_test
+from tests import utilities
+from marqo import Client
+from marqo.errors import MarqoApiError, BackendCommunicationError, MarqoWebError
+import pprint
+import json

+class TestEnvVarChanges(marqo_test.MarqoTestCase):

+    """
+        All tests that rerun marqo with different env vars should go here
+        Teardown will handle resetting marqo back to base settings
+    """

+    def setUp(self) -> None:
+        self.client = Client(**self.client_settings)
+        self.index_name_1 = "my-test-index-1"
+        try:
+            self.client.delete_index(self.index_name_1)
+        except MarqoApiError as s:
+            pass

+    @classmethod
+    def tearDownClass(cls) -> None:
+        super().tearDownClass()
+        # Ensures that marqo goes back to default state after these tests
+        utilities.rerun_marqo_with_default_config(
+            calling_class=cls.__name__
+        )
+        print("Marqo has been rerun with default env vars!")

+    def test_max_replicas(self):
+        # Default max is 1
+        # Rerun marqo with new replica count
+        max_replicas = 5
+        print(f"Attempting to rerun marqo with max replicas: {max_replicas}")
+        utilities.rerun_marqo_with_env_vars(
+            env_vars = ["-e", f"MARQO_MAX_NUMBER_OF_REPLICAS={max_replicas}"],
+            calling_class=self.__class__.__name__
+        )

+        # Attempt to create index with 4 replicas (should succeed)
+        res_0 = self.client.create_index(index_name=self.index_name_1, settings_dict={
+            "index_defaults": {
+                "treat_urls_and_pointers_as_images": True,
+                "model": "ViT-B/32",
+            },
+            "number_of_replicas": 4
+        })

+        # Make sure new index has 4 replicas
+        assert self.client.get_index(self.index_name_1).get_settings() \
+            ["number_of_replicas"] == 4


+    def test_preload_models(self):
+        # Default models are ["hf/all_datasets_v4_MiniLM-L6", "ViT-L/14"]
+        # Rerun marqo with new custom model
+        open_clip_model_object = {
+            "model": "open-clip-1",
+            "model_properties": {
+                "name": "ViT-B-32-quickgelu",
+                "dimensions": 512,
+                "type": "open_clip",
+                "url": "https://github.com/mlfoundations/open_clip/releases/download/v0.2-weights/vit_b_32-quickgelu-laion400m_avg-8a00ab3c.pt"
+            }
+        }

+        print(f"Attempting to rerun marqo with custom model {open_clip_model_object['model']}")
+        utilities.rerun_marqo_with_env_vars(
+            env_vars = ['-e', f"MARQO_MODELS_TO_PRELOAD=[{json.dumps(open_clip_model_object)}]"],
+            calling_class=self.__class__.__name__
+        )

+        # check preloaded models (should be custom model)
+        custom_models = ["open-clip-1"]
+        res = self.client.get_loaded_models()
+        assert set([item["model_name"] for item in res["models"]]) == set(custom_models)


+    def test_multiple_env_vars(self):
+        """
+            Ensures that rerun_marqo_with_env_vars can work with several different env vars
+            at the same time

+            3 things in the same command:
+            1. Load models
+            2. set max number of replicas
+            3. set max EF
+        """

+        # Restart marqo with new max values
+        max_replicas = 10
+        max_ef = 6000
+        new_models = ["hf/all_datasets_v4_MiniLM-L6"]
+        utilities.rerun_marqo_with_env_vars(
+            env_vars = [
+                "-e", f"MARQO_MAX_NUMBER_OF_REPLICAS={max_replicas}",
+                "-e", f"MARQO_EF_CONSTRUCTION_MAX_VALUE={max_ef}",
+                "-e", f"MARQO_MODELS_TO_PRELOAD={json.dumps(new_models)}"
+            ],
+            calling_class=self.__class__.__name__
+        )

+        # Create index with same number of replicas and EF
+        res_0 = self.client.create_index(index_name=self.index_name_1, settings_dict={
+            "number_of_replicas": 4,        # should be fine now
+            "index_defaults": {
+                "ann_parameters" : {
+                    "space_type": "cosinesimil",
+                    "parameters": {
+                        "ef_construction": 5000,    # should be fine now
+                        "m": 16
+                    }
+                }
+            }
+        })

+        # Assert correct replicas
+        # Make sure new index has 4 replicas
+        assert self.client.get_index(self.index_name_1).get_settings() \
+            ["number_of_replicas"] == 4

+        # Assert correct EF const
+        assert self.client.get_index(self.index_name_1).get_settings() \
+            ["index_defaults"]["ann_parameters"]["parameters"]["ef_construction"] == 5000

+        # Assert correct models
+        res = self.client.get_loaded_models()
+        assert set([item["model_name"] for item in res["models"]]) == set(new_models)
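
The `MARQO_MODELS_TO_PRELOAD` values in these tests are JSON embedded in a `KEY=value` string. The sketch below shows, assuming the consumer parses the value with a standard JSON parser, that such a string round-trips to the original structure. `make_env_flag` and `parse_env_flag` are hypothetical names used only for illustration and do not appear in the PR.

```python
import json
from typing import Any, List, Tuple


def make_env_flag(name: str, value: Any) -> List[str]:
    """Return ["-e", "NAME=<json>"] as two separate argv entries."""
    return ["-e", f"{name}={json.dumps(value)}"]


def parse_env_flag(flag: str) -> Tuple[str, Any]:
    """Split on the first '=' only; the JSON payload may itself contain '='."""
    name, _, raw = flag.partition("=")
    return name, json.loads(raw)


model = {
    "model": "open-clip-1",
    "model_properties": {"name": "ViT-B-32-quickgelu", "dimensions": 512, "type": "open_clip"},
}
args = make_env_flag("MARQO_MODELS_TO_PRELOAD", [model])
name, value = parse_env_flag(args[1])
print(name, value[0]["model"])
```

Because the flag and its value travel as list elements rather than as one shell string, the quotes and spaces inside the JSON need no extra escaping on the Python side.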

21 changes: 21 additions & 0 deletions tests/application_tests/test_throttling.py
@@ -0,0 +1,21 @@
+import logging
+import subprocess
+import time
+from requests import HTTPError
+from tests import marqo_test
+from marqo import Client
+from marqo.errors import MarqoApiError, BackendCommunicationError, MarqoWebError


+class TestThrottling(marqo_test.MarqoTestCase):

+    def setUp(self) -> None:
+        self.client = Client(**self.client_settings)
+        self.index_name_1 = "my-test-index-1"
+        try:
+            self.client.delete_index(self.index_name_1)
+        except MarqoApiError as s:
+            pass

+    def test_index_limit(self):
+        pass