[DPE-4115] Performance Profile Support #466

Merged: 79 commits, Oct 21, 2024
Changes from 5 commits
Commits
a62f180
Sync docs from Discourse (#451)
github-actions[bot] Sep 26, 2024
c9edade
[DPE-5558] Break CA rotation into integration test groups (#458)
phvalguima Sep 27, 2024
3845e38
Adding first batch of changes to the charm on how to process profiles…
phvalguima Sep 27, 2024
7b42f47
Add unit tests for profile management
phvalguima Oct 1, 2024
a3d033c
Extend replace() to cover multilines and look into all possible options
phvalguima Oct 1, 2024
776c3f4
Add index/component template APIs
phvalguima Oct 1, 2024
4b52e33
Add support for profile option in the integration tests
phvalguima Oct 1, 2024
50ea12b
lint fixes
phvalguima Oct 1, 2024
51c7cd5
Fix first batch of unit tests
phvalguima Oct 2, 2024
73cd84b
Remove some dead LoCs because of commenting out lines of code
phvalguima Oct 2, 2024
7c47f20
Update to 1 instead of 1-all
phvalguima Oct 2, 2024
64f9bc9
Update the changes following feedback
phvalguima Oct 11, 2024
c856d3a
lint fix
phvalguima Oct 11, 2024
9cecd5f
Update more tests and fixes
phvalguima Oct 12, 2024
6a85319
Rollback the original internal_users.yml
phvalguima Oct 12, 2024
2d9ce65
Merge remote-tracking branch 'origin' into DPE-4115-performance-profiles
phvalguima Oct 12, 2024
8712a84
Merge branch '2/edge' into DPE-4115-performance-profiles
phvalguima Oct 13, 2024
5363de7
Simplify test_charm as changing to production profile causes a lot of…
phvalguima Oct 13, 2024
60292ed
Remove service started + add set_watermark to small deployment on plu…
phvalguima Oct 13, 2024
b63d02d
Moved test HA to set profile=staging
phvalguima Oct 13, 2024
51d92e7
Remove any profile change from int. tests; fix integration tests
phvalguima Oct 13, 2024
9fab851
Move to the Deployment Description
phvalguima Oct 15, 2024
bfe940e
Remove refs to template apply event
phvalguima Oct 15, 2024
9b50a3d
Fixes following review
phvalguima Oct 15, 2024
42f464d
Roll back internal_users
phvalguima Oct 15, 2024
a6ee6db
Add peer relation listener
phvalguima Oct 15, 2024
d53a4f7
Add _on_install hook
phvalguima Oct 15, 2024
b1e77b5
Add perf profile to track install event
phvalguima Oct 15, 2024
db0d13e
Check file exists on _on_install
phvalguima Oct 15, 2024
86331be
Fix peer cluster relation
phvalguima Oct 16, 2024
22fbe66
Fix the config-changed routine
phvalguima Oct 16, 2024
13082f0
Update the e.response_code on apply template routine
phvalguima Oct 16, 2024
b8c3c59
Fix the append scenarios in replace()
phvalguima Oct 16, 2024
7670d8e
Add multiline replace
phvalguima Oct 16, 2024
37d94d5
Merge remote-tracking branch 'origin/main' into DPE-4115-performance-…
phvalguima Oct 16, 2024
78deee7
Merge remote-tracking branch 'origin/DPE-5677-fix-replace-file-persis…
phvalguima Oct 16, 2024
59790ae
Add profile change in test_charm.py
phvalguima Oct 16, 2024
1038a9b
Remove repeated code
phvalguima Oct 16, 2024
1450e56
Fix path in helper_conf_setter.replace()
phvalguima Oct 16, 2024
5e12ca8
Update lib/charms/opensearch/v0/opensearch_base_charm.py
phvalguima Oct 16, 2024
b6af43e
Remove perf_profile from opensearch_distro
phvalguima Oct 16, 2024
4f46a44
Add post merge with upstream branch
phvalguima Oct 16, 2024
a4ae60f
Merge remote-tracking branch 'origin' into DPE-4115-performance-profiles
phvalguima Oct 16, 2024
7dbd1e3
Move profile to PeerClusterConfig
phvalguima Oct 17, 2024
ebc0e58
Add minor fixes for performance profile
phvalguima Oct 17, 2024
a001dd2
Set correct order in config-changed
phvalguima Oct 17, 2024
07f99a1
Add config-changed
phvalguima Oct 17, 2024
b2c27d4
Add peer relation support and restart logic
phvalguima Oct 17, 2024
e59a628
Merge remote-tracking branch 'origin' into DPE-4115-performance-profiles
phvalguima Oct 18, 2024
deae3c6
Remove non-testing profile
phvalguima Oct 18, 2024
345385b
Add support for upgrade from older versions
phvalguima Oct 18, 2024
6c72f33
Rollback unit test + fix refresh command
phvalguima Oct 18, 2024
7c6df55
Rollback to return None if peer relation not set
phvalguima Oct 18, 2024
cea3c29
Rollback the config-changed to use refresh_relation_data
phvalguima Oct 18, 2024
cd6b419
Fix return empty after upgrade
phvalguima Oct 18, 2024
7d5f232
Update profiles
phvalguima Oct 18, 2024
ebae15a
Add upgrade charm check
phvalguima Oct 18, 2024
0cd057d
Minor fixes for the upgrade + reviews
phvalguima Oct 18, 2024
6a076da
Simplify the peer-cluster event
phvalguima Oct 18, 2024
b6a2989
Remove the ^
phvalguima Oct 18, 2024
44014ff
Fix unit tests
phvalguima Oct 18, 2024
9e60288
Move away from refresh_relation_data
phvalguima Oct 18, 2024
0754ce6
Move to run()
phvalguima Oct 18, 2024
1b69290
Simplify current()
phvalguima Oct 18, 2024
0dcdcb8
fix lint
phvalguima Oct 18, 2024
960c1ef
Fix upgrade assert
phvalguima Oct 18, 2024
c1f6b32
fix lint
phvalguima Oct 18, 2024
60e829a
Update helpers.py
phvalguima Oct 19, 2024
9c20cd3
Add logging; update upgrade helper to dispatch arguments in the corre…
Oct 19, 2024
fc2146c
Update test_ha_multi_clusters.py
phvalguima Oct 20, 2024
4a8f5f0
Update test_ha_multi_clusters.py
phvalguima Oct 20, 2024
f481297
Update test_tls.py
phvalguima Oct 20, 2024
72b8517
Update test_ca_rotation.py
phvalguima Oct 20, 2024
a67e876
Fix: update config options across CI
phvalguima Oct 20, 2024
047579d
Move away from main orch. leadership
phvalguima Oct 21, 2024
cc12e1a
Readd the apply index-template bool
phvalguima Oct 21, 2024
e7af36d
Add custom event with property
phvalguima Oct 21, 2024
4419a66
Reorder self.current + apply perf. templates
phvalguima Oct 21, 2024
1df45a7
Update helper_conf_setter.py
phvalguima Oct 21, 2024
10 changes: 10 additions & 0 deletions config.yaml
@@ -36,3 +36,13 @@ options:
default: true
type: boolean
description: Enable opensearch-knn

profile:
type: string
default: "production"
description: |
Profile representing the scope of deployment, and used to tune resource allocation.
Allowed values are: "production", "staging" or "testing"
Production will tune OpenSearch for maximum performance, while testing will tune for
minimal resource usage.
Performance tuning is described on: https://opensearch.org/docs/latest/tuning-your-cluster/performance/
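
The `profile` option accepts only three values and defaults to "production". A minimal sketch of how such a value could be validated before use is below; `Profile` and `parse_profile` are hypothetical names for illustration, not part of the charm, and the enum only mirrors the allowed values listed in the description.

```python
from enum import Enum
from typing import Optional


class Profile(str, Enum):
    """Illustrative stand-in for the charm's profile enum (hypothetical name)."""

    PRODUCTION = "production"
    STAGING = "staging"
    TESTING = "testing"


def parse_profile(raw: Optional[str]) -> Profile:
    """Return the profile for a raw config value, defaulting to production."""
    if raw is None:
        return Profile.PRODUCTION
    try:
        return Profile(raw)
    except ValueError:
        allowed = ", ".join(p.value for p in Profile)
        raise ValueError(f"Invalid profile {raw!r}; allowed values are: {allowed}") from None
```

Rejecting unknown values early keeps a typo in the config from silently falling through to some default tuning.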
2 changes: 2 additions & 0 deletions lib/charms/opensearch/v0/constants_charm.py
@@ -118,3 +118,5 @@

# User-face Backup ID format
OPENSEARCH_BACKUP_ID_FORMAT = "%Y-%m-%dT%H:%M:%SZ"

PERFORMANCE_PROFILE = "profile"
3 changes: 1 addition & 2 deletions lib/charms/opensearch/v0/helper_conf_setter.py
@@ -272,14 +272,13 @@ def replace(
output_file: Target file for the result config, by default same as config_file
"""
path = f"{self.base_path}{config_file}"

if not exists(path):
raise FileNotFoundError(f"{path} not found.")

with open(path, "r+") as f:
data = f.read()

if regex and old_val and re.compile(old_val).match(data):
if regex and old_val and re.compile(old_val, re.MULTILINE).findall(data):
data = re.sub(r"{}".format(old_val), f"{new_val}", data)
elif old_val and old_val in data:
data = data.replace(old_val, new_val)
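
The switch from `match()` to `findall()` with `re.MULTILINE` matters because `match()` only tests the very start of the file contents. The sketch below illustrates the difference with Python's standard `re` module; the sample text and pattern are made up for illustration. It also shows that `re.sub` needs the same flag when the pattern is anchored with `^`, which is a property of `re` itself rather than a statement about how the charm ends up calling it.

```python
import re

# Hypothetical file contents and pattern, for illustration only.
data = "-Xms1g\n-Xmx1g\n"
pattern = r"^-Xmx[0-9]+[kmgKMG]"

# match() only looks at position 0, so a hit on the second line is missed.
old_check = re.compile(pattern).match(data)

# With MULTILINE, "^" also matches right after each newline.
new_check = re.compile(pattern, re.MULTILINE).findall(data)

# re.sub without the flag leaves the text untouched for an anchored pattern.
unchanged = re.sub(pattern, "-Xmx4g", data)

# Passing the flag to re.sub makes the substitution line-aware as well.
replaced = re.sub(pattern, "-Xmx4g", data, flags=re.MULTILINE)
```

For patterns without a leading `^`, the plain `re.sub` call already finds matches anywhere in the text.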
255 changes: 254 additions & 1 deletion lib/charms/opensearch/v0/models.py
@@ -3,10 +3,12 @@

"""Cluster-related data structures / model classes."""
import json
import math
from abc import ABC
from datetime import datetime
from enum import Enum
from hashlib import md5
from typing import Any, Dict, List, Literal, Optional
from typing import Any, Dict, List, Literal, Optional, Tuple

from charms.opensearch.v0.helper_enums import BaseStrEnum
from pydantic import BaseModel, Field, root_validator, validator
@@ -153,6 +155,14 @@ class DeploymentType(BaseStrEnum):
OTHER = "other"


class PerformanceType(BaseStrEnum):
"""Performance types available."""

PRODUCTION = "production"
STAGING = "staging"
TESTING = "testing"


class StartMode(BaseStrEnum):
"""Mode of start of units in this deployment."""

@@ -346,3 +356,246 @@ def promote_failover(self) -> None:
self.main_app = self.failover_app
self.main_rel_id = self.failover_rel_id
self.delete("failover")


class ByteUnit(Enum):
"""As per docs, Java uses the byte format.

Converts the *B and *iB suffixes to the same raw values. For example, 6 MiB can be written as:
- 6m: 6 * 1024 * 1024
- 6144k: 6144 * 1024
- 6291456: 6291456 bytes

More info: https://dev.java/learn/jvm/tools/core/java/#overview
"""

B = 1 # noqa: N815
kB = 1024 # noqa: N815
mB = 1024 * kB # noqa: N815
gB = 1024 * mB # noqa: N815

@staticmethod
def get(name: str) -> "ByteUnit":
"""Convert the value to the required unit."""
val = name.lower()
if val == "kb" or val == "k":
return ByteUnit.kB
if val == "mb" or val == "m":
return ByteUnit.mB
if val == "gb" or val == "g":
return ByteUnit.gB
return ByteUnit.B

@staticmethod
def previous(val):
"""Return the previous value of the unit."""
if val == ByteUnit.kB:
return ByteUnit.B
if val == ByteUnit.mB:
return ByteUnit.kB
if val == ByteUnit.gB:
return ByteUnit.mB
return ByteUnit.B

@staticmethod
def to_int(value: tuple[str, Any]) -> int:
"""Convert the value to the bytes unit."""
if isinstance(value[1], ByteUnit):
return value[0] * value[1].value

unit = ByteUnit.get(value[1]).value
return int(value[0]) * unit

@staticmethod
def unit(value: int | float) -> Tuple[float, Any]:
"""Return the value in the largest unit that represents it as an integer.

If the value has a fractional part in that unit, it is expressed in the previous
(smaller) unit and truncated to an integer.
"""
inter_value = float(value)
for u in [ByteUnit.B, ByteUnit.kB, ByteUnit.mB, ByteUnit.gB]:
if inter_value < 1024:
break
inter_value /= 1024

# Now, we calculate the rounding
if u == ByteUnit.B:
# We are already in the lowest unit possible, return a rounded value
return (int(inter_value), u)
# Check if we have a decimal part, if yes, then we multiply the value by 1024
dec, _ = math.modf(inter_value)
if dec != 0.0:
return (int(inter_value * 1024), ByteUnit.previous(u))
return (int(inter_value), u)
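
The normalization in `ByteUnit.unit()` can be hard to follow in the flattened diff. Below is a simplified, standalone sketch of the same idea: divide by 1024 while the value allows it, and step back one unit when the result is not a whole number. `normalize` and `UNITS` are hypothetical names, and this is not the PR's code; in particular, this sketch stops dividing once it reaches the largest unit.

```python
import math
from typing import Tuple

# Suffixes in ascending order, each 1024 times the previous one.
UNITS = ["b", "k", "m", "g"]


def normalize(num_bytes: float) -> Tuple[int, str]:
    """Express a byte count in the largest unit that keeps it a whole number."""
    value = float(num_bytes)
    idx = 0
    # Climb units while the value is large enough and a bigger unit exists.
    while value >= 1024 and idx < len(UNITS) - 1:
        value /= 1024
        idx += 1
    frac, _ = math.modf(value)
    if frac != 0.0 and idx > 0:
        # Not a whole number here: fall back to the previous unit and truncate.
        return int(value * 1024), UNITS[idx - 1]
    return int(value), UNITS[idx]
```

With this scheme, 1.5 GiB comes out as `1536m` rather than a fractional `1.5g`, which keeps the result in the integer-plus-suffix form used for Java heap options.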


class JavaByteSize:
"""Java Byte Size tuple representation."""

def __init__(self, value: str | float | int | None = None, unit: str | ByteUnit | None = None):
"""Constructor of JavaByteSize.

Args:
value: the value of the size
unit: the unit of the size
"""
if not value and not unit:
self.value = 0
self.unit = ByteUnit.B
return

u = unit
if isinstance(unit, str):
u = ByteUnit.get(unit)
self.value, self.unit = ByteUnit.unit(float(value) * u.value)

def percent(self, percentage: float) -> "JavaByteSize":
"""Return the percentage of the JavaByteSize."""
val = ByteUnit.to_int((self.value, self.unit)) * percentage
return JavaByteSize(val, ByteUnit.B)

def __eq__(self, other: Any) -> bool:
"""Check if the JavaByteSize is equal to the other value."""
if not isinstance(other, JavaByteSize):
raise TypeError("Cannot compare JavaByteSize with other types.")
return ByteUnit.to_int((self.value, self.unit)) == ByteUnit.to_int(
(other.value, other.unit)
)

def __lt__(self, other: Any) -> bool:
"""Check if the JavaByteSize is less than the other value."""
if not isinstance(other, JavaByteSize):
raise TypeError("Cannot compare JavaByteSize with other types.")
return ByteUnit.to_int((self.value, self.unit)) < ByteUnit.to_int(
(other.value, other.unit)
)

def __gt__(self, other: Any) -> bool:
"""Check if the JavaByteSize is greater than the other value."""
if not isinstance(other, JavaByteSize):
raise TypeError("Cannot compare JavaByteSize with other types.")
return ByteUnit.to_int((self.value, self.unit)) > ByteUnit.to_int(
(other.value, other.unit)
)

def __str__(self) -> str:
"""Return the string representation of the JavaByteSize."""
return f"{self.value}{str(self.unit.name.lower())[:1]}"


class OpenSearchPerfProfile(Model):
"""Generates an immutable description of the performance profile."""

class Config:
"""Pydantic config for this model."""

arbitrary_types_allowed = True

typ: PerformanceType
heap_size: JavaByteSize | None = None
opensearch_yml: Dict[str, str] = {}
charmed_index_template: Dict[str, str] = {}
charmed_component_templates: Dict[str, str] = {}

@classmethod
def from_str(cls, input_str: str):
"""Create a new instance of this class from a profile name, e.g. "production"."""
return cls(typ=input_str)

@root_validator
def set_options(cls, values): # noqa: N805
"""Generate the attributes depending on the input."""
heap = JavaByteSize(
OpenSearchPerfProfile.meminfo()["MemTotal"][0],
OpenSearchPerfProfile.meminfo()["MemTotal"][1],
)

val = values["typ"]
if isinstance(val, str):
val = PerformanceType(val)

if val == PerformanceType.PRODUCTION:
values["heap_size"] = (
heap.percent(0.25)
if heap.percent(0.25) > JavaByteSize("1", "g")
else JavaByteSize("1", "g")
)

if val == PerformanceType.STAGING:
values["heap_size"] = (
heap.percent(0.1)
if heap.percent(0.1) > JavaByteSize("1", "g")
else JavaByteSize("1", "g")
)

if val == PerformanceType.TESTING:
values["heap_size"] = JavaByteSize("1", "gB")

if val != PerformanceType.TESTING:
values["opensearch_yml"] = {"indices.memory.index_buffer_size": "25%"}

values["charmed_index_template"] = {
"charmed-index-tpl": {
"index_patterns": ["*"],
"template": {
"settings": {
"number_of_replicas": "1-all",
},
},
},
}

values["charmed_component_templates"] = {
"charmed-default-tpl": {
"template": {
"settings": {
"number_of_replicas": "1-all",
"index": {
"codec": "zstd_no_dict",
},
},
},
},
"charmed-vector-tpl": {
"template": {
"settings": {
"number_of_replicas": "1-all",
"index": {
"codec": "default",
},
},
},
},
"charmed-ingest-tpl": {
"template": {
"settings": {
"number_of_replicas": "1-all",
"index": {
"codec": "zstd_no_dict",
"flush_threshold_size": (
str(values["heap_size"].percent(0.25))
if values["heap_size"].percent(0.25)
> JavaByteSize("512", "mB")
else "512m"
),
},
},
},
},
}

return values

@staticmethod
def meminfo() -> dict[str, Tuple[int, ByteUnit]]:
"""Read the /proc/meminfo file and return the values."""
with open("/proc/meminfo") as f:
meminfo = f.read()
return {
line.split()[0][:-1]: (
int(line.split()[1]),
ByteUnit.get(line.split()[2] if len(line.split()) > 2 else "b"),
)
for line in meminfo.split("\n")
if line
}
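
The heap sizing rules in `set_options` reduce to simple arithmetic on `MemTotal`: production takes 25% of memory, staging 10%, both with a 1 GiB floor, and testing is fixed at 1 GiB. The sketch below works in plain byte counts instead of `JavaByteSize`; `parse_meminfo`, `heap_bytes` and the sample text are hypothetical and only illustrate the calculation.

```python
from typing import Dict

GIB = 1024**3
# /proc/meminfo reports sizes in "kB"; other suffixes are handled defensively.
UNIT_FACTORS = {"b": 1, "kb": 1024, "mb": 1024**2, "gb": 1024**3}
# Fraction of total memory given to the heap, per profile.
HEAP_FRACTIONS = {"production": 0.25, "staging": 0.10}


def parse_meminfo(text: str) -> Dict[str, int]:
    """Parse /proc/meminfo-style text into a mapping of key to bytes."""
    result = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 2:
            continue
        key = parts[0].rstrip(":")
        unit = parts[2].lower() if len(parts) > 2 else "b"
        result[key] = int(parts[1]) * UNIT_FACTORS.get(unit, 1)
    return result


def heap_bytes(profile: str, mem_total_bytes: int) -> int:
    """Return the heap size in bytes for a profile, never below 1 GiB."""
    if profile == "testing":
        return GIB
    return max(int(mem_total_bytes * HEAP_FRACTIONS[profile]), GIB)


# Made-up sample resembling /proc/meminfo on a 16 GiB machine.
sample = "MemTotal:       16777216 kB\nMemFree:         1024 kB\nHugePages_Total:       0\n"
mem_total = parse_meminfo(sample)["MemTotal"]
production_heap = heap_bytes("production", mem_total)
```

On the 16 GiB sample, the production profile yields a 4 GiB heap, while an 8 GiB machine on the staging profile would fall back to the 1 GiB floor.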
50 changes: 41 additions & 9 deletions lib/charms/opensearch/v0/opensearch_base_charm.py
@@ -11,6 +11,7 @@

from charms.grafana_agent.v0.cos_agent import COSAgentProvider
from charms.opensearch.v0.constants_charm import (
PERFORMANCE_PROFILE,
AdminUser,
AdminUserInitProgress,
AdminUserNotConfigured,
@@ -49,7 +50,12 @@
generate_hashed_password,
generate_password,
)
from charms.opensearch.v0.models import DeploymentDescription, DeploymentType
from charms.opensearch.v0.models import (
DeploymentDescription,
DeploymentType,
OpenSearchPerfProfile,
PerformanceType,
)
from charms.opensearch.v0.opensearch_backups import backup
from charms.opensearch.v0.opensearch_config import OpenSearchConfig
from charms.opensearch.v0.opensearch_distro import OpenSearchDistribution
@@ -355,6 +361,13 @@ def cleanup():
logger.error("Service previously started but now misses the snap.")
return

# Store the current perf. profile we are applying
self.peers_data.put(
Scope.UNIT,
PERFORMANCE_PROFILE,
PerformanceType(self._charm.config.get(PERFORMANCE_PROFILE, "production")),
)

# apply the directives computed and emitted by the peer cluster manager
if not self._apply_peer_cm_directives_and_check_if_can_start():
event.defer()
@@ -676,22 +689,41 @@ def _on_config_changed(self, event: ConfigChangedEvent): # noqa C901
if not self.plugin_manager.check_plugin_manager_ready():
return

if self.upgrade_in_progress:
# Deferring right now is too late anyways
logger.warning(
"Changing config during an upgrade is not supported. The charm may be in a broken, "
"unrecoverable state"
)
event.defer()
return

try:
if not self.plugin_manager.check_plugin_manager_ready():
raise OpenSearchNotFullyReadyError()

if self.unit.is_leader():
self.status.set(MaintenanceStatus(PluginConfigCheck), app=True)

if self.plugin_manager.run():
if self.upgrade_in_progress:
logger.warning(
"Changing config during an upgrade is not supported. The charm may be in a broken, "
"unrecoverable state"
)
event.defer()
return
restart_requested = self.plugin_manager.run()
if (
PerformanceType(self.peers_data.get(Scope.UNIT, PERFORMANCE_PROFILE))
!= self.opensearch.perf_profile
):
self.opensearch.perf_profile = OpenSearchPerfProfile.from_str(
self._charm.config.get(PERFORMANCE_PROFILE)
)
# If we have a running service, and our profile changed
# then we need a restart to apply the new profile
self.opensearch_config.apply_performance_profile(self.opensearch.perf_profile)

# Configure templates if needed
self.opensearch.apply_perf_templates_if_neeeded()

self.peers_data.put(Scope.UNIT, PERFORMANCE_PROFILE, self.opensearch.perf_profile)
restart_requested = self.opensearch.is_service_started()

if restart_requested:
self._restart_opensearch_event.emit()
except (OpenSearchNotFullyReadyError, OpenSearchPluginError) as e:
if isinstance(e, OpenSearchNotFullyReadyError):
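
The restart decision in the `_on_config_changed` hunk can be read as a small pure function: when the stored profile differs from the requested one, a restart is needed only if the service is already running; otherwise the plugin manager's answer stands. The sketch below is an interpretation of the flattened diff, with hypothetical names, and leaves out the status handling and error paths.

```python
from typing import Optional


def restart_needed(
    stored_profile: Optional[str],
    requested_profile: str,
    service_started: bool,
    plugins_need_restart: bool,
) -> bool:
    """Decide whether the service should be restarted after a config change."""
    if stored_profile != requested_profile:
        # The profile changed: new settings only take effect after a restart,
        # which is only meaningful if the service is already running.
        return service_started
    # Profile unchanged: defer to what the plugin manager reported.
    return plugins_need_restart
```

Keeping the decision separate from the side effects (writing config, applying templates, emitting the restart event) makes the branches straightforward to unit test.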