Documentstores

The operator supports automatic provisioning of documentstores.

We do not recommend using the operator to provision documentstores in production environments, but it can speed up development and testing.

Detection of the documentstore is automatic: all you need to do is specify the annotation in your pipeline CRD.

See more about annotations in the pipeline CRD Annotations section.

Elasticsearch

The operator supports automatic provisioning of single-node Elasticsearch clusters.

An Elasticsearch cluster is provisioned in the same namespace as the pipeline, and a service is created to access it.

The service created for the Elasticsearch cluster will be named elasticsearch and will be of type ClusterIP.
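
For orientation, a service with these properties might look roughly like the manifest below. This is only an illustrative sketch, not the manifest the operator actually generates: the name and type come from the description above and the port from the pipeline example, while the selector label, port name, and targetPort are assumptions.

```yaml
# Illustrative sketch only; the operator's real manifest may differ.
apiVersion: v1
kind: Service
metadata:
  name: elasticsearch        # service name stated above
  namespace: ...             # same namespace as the pipeline
spec:
  type: ClusterIP            # service type stated above
  selector:
    app: elasticsearch       # hypothetical label; the real selector may differ
  ports:
    - name: http             # hypothetical port name
      port: 9200             # port the DocumentStore component connects to
      targetPort: 9200       # assumed to match the container port
```

Because the service is of type ClusterIP, it is reachable only from inside the Kubernetes cluster. Pods in the same namespace can address it by its short name, elasticsearch.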

In your pipeline definition file, add the following annotation under metadata and configure the DocumentStore component as shown:

...
metadata:
  name: ...
  namespace: ...
  annotations:
    'auto-provision-documentstore.pipelines.baler.gatecastle.com/enabled': 'true'
spec:
  version: 1.19.0
  components:
    - name: DocumentStore
      type: ElasticsearchDocumentStore
      params:
        host: 'elasticsearch'
        port: 9200
        embedding_dim: 384
...

The auto-provision-documentstore.pipelines.baler.gatecastle.com/enabled annotation is used to enable the automatic provisioning of the Elasticsearch cluster.

Set the host parameter to elasticsearch, the name of the service created by the operator, and the port parameter to 9200.

You can use the 'preserve-documentstore.pipelines.baler.gatecastle.com/enabled': 'false' annotation to delete the Elasticsearch cluster when the pipeline is deleted. The default behavior is to preserve the Elasticsearch cluster when the pipeline is deleted.
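
As a minimal sketch, the two annotations could be combined as follows. It assumes the preserve annotation is placed under metadata.annotations alongside the auto-provision annotation. The elided values are left as in the example above.

```yaml
metadata:
  name: ...
  namespace: ...
  annotations:
    # Provision a single-node Elasticsearch cluster alongside the pipeline.
    'auto-provision-documentstore.pipelines.baler.gatecastle.com/enabled': 'true'
    # Delete the cluster together with the pipeline (the default is to preserve it).
    'preserve-documentstore.pipelines.baler.gatecastle.com/enabled': 'false'
```

Setting the preserve annotation to 'false' suits throwaway development and test environments, where an Elasticsearch cluster left behind would only consume resources.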