Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update provider to support marqo 2.13 #30

Merged
merged 26 commits into from
Nov 13, 2024
Merged

Conversation

RaynorChavez
Copy link
Member

@RaynorChavez RaynorChavez commented Nov 6, 2024

This update adds support for the following fields

  • treat_urls_and_pointers_as_media
  • video_preprocessing
  • audio_preprocessing
  • model_properties

Within model_properties:

  • name
  • dimensions
  • type
  • tokens
  • model_location
  • url
  • trust_remote_code
  • filter_string_max_length

Within model_location:

  • s3: Has the keys bucket, and key
  • hf: Has the keys repo_id, and filename
  • auth_required

@ruchira088
Copy link

ruchira088 commented Nov 8, 2024

Adding @orlade since I'm NOT super familiar with Golang

Read-Only:

- `dimensions` (String)
- `model_location` (String)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should be an object. If the location is well-known, e.g. on HuggingFace, the name can encode it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

<a id="nestedatt--items--model_properties"></a>
### Nested Schema for `items.model_properties`

Read-Only:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should include is_marqtuned_model (bool), since we need to specify that if we want a Marqtuned model with custom properties.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's also authRequired (bool) for private models?

These aren't 2.13 specific but we probably should capture them.

Copy link
Member Author

@RaynorChavez RaynorChavez Nov 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't find is_marqtuned_model in the documentation, we should probably ask the marqtune team to add that

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added authRequired thanks

@@ -82,6 +95,20 @@ Optional:
- `patch_method` (String)


<a id="nestedatt--settings--model_properties"></a>
### Nested Schema for `settings.model_properties`
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels like a duplication; can they both refer to the same schema?

Copy link
Member Author

@RaynorChavez RaynorChavez Nov 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently not, the schemas in datasource are all string/bool, whereas there are some Number types in the resource. I tried having datasource refer to the schemas with Number types before but that gave an error.

Dimensions int64 `json:"dimensions"`
Type string `json:"type"`
Tokens int64 `json:"tokens"`
ModelLocation string `json:"model_location"`
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
ModelLocation string `json:"model_location"`
ModelLocation string `json:"modelLocation"`

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

StoreSize types.String `tfsdk:"store_size"`
DocsDeleted types.String `tfsdk:"docs_deleted"`
SearchQueryTotal types.String `tfsdk:"search_query_total"`
TreatUrlsAndPointersAsImages types.Bool `tfsdk:"treat_urls_and_pointers_as_images"`
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should there be a TreatUrlsAndPointersAsMedia?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's right, added.

@@ -390,6 +498,18 @@ func (d *indicesDataSource) Read(ctx context.Context, req datasource.ReadRequest
FilterStringMaxLength: types.StringValue(fmt.Sprintf("%d", indexDetail.FilterStringMaxLength)),
}

// Handle model properties
if items[i].ModelProperties.Name.IsNull() &&
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: you could make this a method on the ModelProperties struct since it logically belongs to that type:

func (m ModelPropertiesModel) IsEmpty() bool {
	return m.Name.IsNull() &&
		m.Dimensions.IsNull() &&
		m.Type.IsNull() &&
		m.Tokens.IsNull() &&
		m.Url.IsNull() &&
		m.TrustRemoteCode.IsNull() &&
		m.IsMarqtunedModel.IsNull()
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, that's cleaner

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

modified a bit and added

TextPreprocessing: TextPreprocessingModel{
SplitLength: types.StringValue(fmt.Sprintf("%d", indexDetail.TextPreprocessing.SplitLength)),
SplitMethod: types.StringValue(indexDetail.TextPreprocessing.SplitMethod),
SplitOverlap: types.StringValue(fmt.Sprintf("%d", indexDetail.TextPreprocessing.SplitOverlap)),
},
//ImagePreprocessing: types.ObjectValue(map[string]interface{}, indexDetail.ImagePreprocessing),
// ImagePreprocessing
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Curious why we don't include this?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently the cloud api does not return image_preprocessing (as well as audio and video) in the list index endpoint, so we might need to bring this up with dataplane team

Copy link
Collaborator

@orlade orlade left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if you have a unit test for the "set ModelProperties to empty/nil if its fields are empty" logic, but it might be worthwhile.


// Handle model properties
if newState.Settings.ModelProperties != nil {
if newState.Settings.ModelProperties.Name.ValueString() == "" &&
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This logic is different to above where you check IsNull. Should they be the same? If not, why not? Again feels like it could be a method on the struct to make it more obvious what you're actually checking.

Copy link
Member Author

@RaynorChavez RaynorChavez Nov 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I recently changed the resource to refer to objects through a pointer (*ModelPropertiesModelCreate), whereas in datasource it remained a direct struct field.

Logic has the same spirit (just adapted the resource for the datasource)

I've adapted both to use the same struct now

@RaynorChavez RaynorChavez merged commit 4e31c20 into main Nov 13, 2024
3 checks passed
@RaynorChavez RaynorChavez deleted the raynor/update_to_mq213 branch November 13, 2024 04:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants