
langchain4j + Milvus in practice

nanshan 2025-03-07 22:22

This article looks at how to use langchain4j to connect to the vector database Milvus.

Steps

Run Milvus with Docker

docker run -d \
        --name milvus-standalone \
        --security-opt seccomp:unconfined \
        -e ETCD_USE_EMBED=true \
        -e ETCD_DATA_DIR=/var/lib/milvus/etcd \
        -e ETCD_CONFIG_PATH=/milvus/configs/embedEtcd.yaml \
        -e COMMON_STORAGETYPE=local \
        -v $(pwd)/volumes/milvus:/var/lib/milvus \
        -v $(pwd)/embedEtcd.yaml:/milvus/configs/embedEtcd.yaml \
        -v $(pwd)/user.yaml:/milvus/configs/user.yaml \
        -p 19530:19530 \
        -p 9091:9091 \
        -p 2379:2379 \
        --health-cmd="curl -f http://localhost:9091/healthz" \
        --health-interval=30s \
        --health-start-period=90s \
        --health-timeout=20s \
        --health-retries=3 \
        docker.1ms.run/milvusdb/milvus:v2.5.5 \
        milvus run standalone  1> /dev/null

Once the container has started, open
http://127.0.0.1:9091/webui

embedEtcd.yaml must be created before running the container:

listen-client-urls: http://0.0.0.0:2379
advertise-client-urls: http://0.0.0.0:2379
quota-backend-bytes: 4294967296
auto-compaction-mode: revision
auto-compaction-retention: '1000'

user.yaml can be left empty.

pom.xml


<dependency>
    <groupId>dev.langchain4j</groupId>
    <artifactId>langchain4j-milvus</artifactId>
    <version>1.0.0-beta1</version>
</dependency>

example

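import dev.langchain4j.data.embedding.Embedding;
import dev.langchain4j.data.segment.TextSegment;
import dev.langchain4j.model.embedding.EmbeddingModel;
import dev.langchain4j.model.jlama.JlamaEmbeddingModel;
import dev.langchain4j.store.embedding.EmbeddingMatch;
import dev.langchain4j.store.embedding.milvus.MilvusEmbeddingStore;
import io.milvus.client.MilvusServiceClient;
import io.milvus.common.clientenum.ConsistencyLevelEnum;
import io.milvus.param.ConnectParam;
import io.milvus.param.IndexType;
import io.milvus.param.MetricType;

import java.util.List;
import java.util.concurrent.TimeUnit;

public class JlamaMilvusExample {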

    public static void main(String[] args) throws InterruptedException {
        EmbeddingModel embeddingModel = JlamaEmbeddingModel.builder()
                .modelName("intfloat/e5-small-v2")
                .build();

        MilvusServiceClient customMilvusClient = new MilvusServiceClient(
                ConnectParam.newBuilder()
                        .withHost("localhost")
                        .withPort(19530)
                        .build()
        );
        MilvusEmbeddingStore embeddingStore = MilvusEmbeddingStore.builder()
                .milvusClient(customMilvusClient)
                .collectionName("example_collection")      // Name of the collection
                .dimension(384)                            // Dimension of vectors
                .indexType(IndexType.FLAT)                 // Index type
                .metricType(MetricType.COSINE)             // Metric type
                .consistencyLevel(ConsistencyLevelEnum.EVENTUALLY)  // Consistency level
                .autoFlushOnInsert(true)                   // Auto flush after insert
                .idFieldName("id")                         // ID field name
                .textFieldName("text")                     // Text field name
                .metadataFieldName("metadata")             // Metadata field name
                .vectorFieldName("vector")                 // Vector field name
                .build();                                  // Build the MilvusEmbeddingStore instance

        TextSegment segment1 = TextSegment.from("I like football.");
        Embedding embedding1 = embeddingModel.embed(segment1).content();
        embeddingStore.add(embedding1, segment1);

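        // Pause between inserts: with autoFlushOnInsert(true) each add() triggers a flush,
        // and Milvus rate-limits flush requests (see the quotaAndLimits notes further down).
        TimeUnit.SECONDS.sleep(60);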

        TextSegment segment2 = TextSegment.from("The weather is good today.");
        Embedding embedding2 = embeddingModel.embed(segment2).content();
        embeddingStore.add(embedding2, segment2);

        TimeUnit.SECONDS.sleep(60);

        String userQuery = "What is your favourite sport?";
        Embedding queryEmbedding = embeddingModel.embed(userQuery).content();
        int maxResults = 1;
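        // The type parameter is required; with a raw EmbeddingMatch, embedded().text() does not compile.
        List<EmbeddingMatch<TextSegment>> relevant = embeddingStore.findRelevant(queryEmbedding, maxResults);
        EmbeddingMatch<TextSegment> embeddingMatch = relevant.get(0);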

        System.out.println("Question: " + userQuery); // What is your favourite sport?
        System.out.println("Response: " + embeddingMatch.embedded().text()); // I like football.
    }
}
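
The store above is configured with MetricType.COSINE, so results are ranked by the cosine of the angle between the query vector and each stored vector. As a rough illustration of that metric only (this is not how Milvus or langchain4j implement it), here is a minimal plain-Java sketch. The class and method names are hypothetical, and the 3-dimensional vectors stand in for the 384-dimensional embeddings used in the example.

```java
// Hypothetical helper, for illustration only: cosine similarity of two vectors.
final class CosineSketch {

    static double cosine(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vectors must have the same dimension");
        }
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) {
            throw new IllegalArgumentException("cosine is undefined for a zero vector");
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        float[] query = {1f, 0f, 0f};
        float[] sameDirection = {2f, 0f, 0f}; // longer, but points the same way
        float[] orthogonal = {0f, 1f, 0f};
        System.out.println(cosine(query, sameDirection)); // prints 1.0
        System.out.println(cosine(query, orthogonal));    // prints 0.0
    }
}
```

Because the metric depends only on direction, vector length does not affect the ranking.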

Final output

WARNING: Using incubator modules: jdk.incubator.vector
INFO  c.g.tjake.jlama.model.AbstractModel - Model type = F32, Working memory type = F32, Quantized memory type = F32
WARN  c.g.t.j.t.o.TensorOperationsProvider - Native operations not available. Consider adding 'com.github.tjake:jlama-native' to the classpath
INFO  c.g.t.j.t.o.TensorOperationsProvider - Using Panama Vector Operations (OffHeap)
Question: What is your favourite sport?
Response: I like football.

quotaAndLimits

quotaAndLimits:
  enabled: true # `true` to enable quota and limits, `false` to disable.
  # quotaCenterCollectInterval is the time interval that quotaCenter
  # collects metrics from Proxies, Query cluster and Data cluster.
  # seconds, (0 ~ 65536)
  quotaCenterCollectInterval: 3
  limits:
    allocRetryTimes: 15 # retry times when delete alloc forward data from rate limit failed
    allocWaitInterval: 1000 # retry wait duration when delete alloc forward data rate failed, in millisecond
    complexDeleteLimitEnable: false # whether complex delete check forward data by limiter
    maxCollectionNum: 65536
    maxCollectionNumPerDB: 65536 # Maximum number of collections per database.
    maxInsertSize: -1 # maximum size of a single insert request, in bytes, -1 means no limit
    maxResourceGroupNumOfQueryNode: 1024 # maximum number of resource groups of query nodes
    maxGroupSize: 10 # maximum size for one single group when doing search group by
  ddl:
    enabled: false # Whether DDL request throttling is enabled.
    # Maximum number of collection-related DDL requests per second.
    # Setting this item to 10 indicates that Milvus processes no more than 10 collection-related DDL requests per second, including collection creation requests, collection drop requests, collection load requests, and collection release requests.
    # To use this setting, set quotaAndLimits.ddl.enabled to true at the same time.
    collectionRate: -1
    # Maximum number of partition-related DDL requests per second.
    # Setting this item to 10 indicates that Milvus processes no more than 10 partition-related requests per second, including partition creation requests, partition drop requests, partition load requests, and partition release requests.
    # To use this setting, set quotaAndLimits.ddl.enabled to true at the same time.
    partitionRate: -1
    db:
      collectionRate: -1 # qps of db level , default no limit, rate for CreateCollection, DropCollection, LoadCollection, ReleaseCollection
      partitionRate: -1 # qps of db level, default no limit, rate for CreatePartition, DropPartition, LoadPartition, ReleasePartition
  indexRate:
    enabled: false # Whether index-related request throttling is enabled.
    # Maximum number of index-related requests per second.
    # Setting this item to 10 indicates that Milvus processes no more than 10 index-related requests per second, including index creation requests and index drop requests.
    # To use this setting, set quotaAndLimits.indexRate.enabled to true at the same time.
    max: -1
    db:
      max: -1 # qps of db level, default no limit, rate for CreateIndex, DropIndex
  flushRate:
    enabled: true # Whether flush request throttling is enabled.
    # Maximum number of flush requests per second.
    # Setting this item to 10 indicates that Milvus processes no more than 10 flush requests per second.
    # To use this setting, set quotaAndLimits.flushRate.enabled to true at the same time.
    max: -1
    collection:
      max: 10 # qps, default no limit, rate for flush at collection level.
    db:
      max: -1 # qps of db level, default no limit, rate for flush
  compactionRate:
    enabled: false # Whether manual compaction request throttling is enabled.
    # Maximum number of manual-compaction requests per second.
    # Setting this item to 10 indicates that Milvus processes no more than 10 manual-compaction requests per second.
    # To use this setting, set quotaAndLimits.compactionRate.enabled to true at the same time.
    max: -1
    db:
      max: -1 # qps of db level, default no limit, rate for manualCompaction
  dml:
    enabled: false # Whether DML request throttling is enabled.
    insertRate:
      # Highest data insertion rate per second.
      # Setting this item to 5 indicates that Milvus only allows data insertion at the rate of 5 MB/s.
      # To use this setting, set quotaAndLimits.dml.enabled to true at the same time.
      max: -1
      db:
        max: -1 # MB/s, default no limit
      collection:
        # Highest data insertion rate per collection per second.
        # Setting this item to 5 indicates that Milvus only allows data insertion to any collection at the rate of 5 MB/s.
        # To use this setting, set quotaAndLimits.dml.enabled to true at the same time.
        max: -1
      partition:
        max: -1 # MB/s, default no limit
    upsertRate:
      max: -1 # MB/s, default no limit
      db:
        max: -1 # MB/s, default no limit
      collection:
        max: -1 # MB/s, default no limit
      partition:
        max: -1 # MB/s, default no limit
    deleteRate:
      # Highest data deletion rate per second.
      # Setting this item to 0.1 indicates that Milvus only allows data deletion at the rate of 0.1 MB/s.
      # To use this setting, set quotaAndLimits.dml.enabled to true at the same time.
      max: -1
      db:
        max: -1 # MB/s, default no limit
      collection:
        # Highest data deletion rate per second.
        # Setting this item to 0.1 indicates that Milvus only allows data deletion from any collection at the rate of 0.1 MB/s.
        # To use this setting, set quotaAndLimits.dml.enabled to true at the same time.
        max: -1
      partition:
        max: -1 # MB/s, default no limit
    bulkLoadRate:
      max: -1 # MB/s, default no limit, not support yet. TODO: limit bulkLoad rate
      db:
        max: -1 # MB/s, default no limit, not support yet. TODO: limit db bulkLoad rate
      collection:
        max: -1 # MB/s, default no limit, not support yet. TODO: limit collection bulkLoad rate
      partition:
        max: -1 # MB/s, default no limit, not support yet. TODO: limit partition bulkLoad rate
  dql:
    enabled: false # Whether DQL request throttling is enabled.
    searchRate:
      # Maximum number of vectors to search per second.
      # Setting this item to 100 indicates that Milvus only allows searching 100 vectors per second no matter whether these 100 vectors are all in one search or scattered across multiple searches.
      # To use this setting, set quotaAndLimits.dql.enabled to true at the same time.
      max: -1
      db:
        max: -1 # vps (vectors per second), default no limit
      collection:
        # Maximum number of vectors to search per collection per second.
        # Setting this item to 100 indicates that Milvus only allows searching 100 vectors per second per collection no matter whether these 100 vectors are all in one search or scattered across multiple searches.
        # To use this setting, set quotaAndLimits.dql.enabled to true at the same time.
        max: -1
      partition:
        max: -1 # vps (vectors per second), default no limit
    queryRate:
      # Maximum number of queries per second.
      # Setting this item to 100 indicates that Milvus only allows 100 queries per second.
      # To use this setting, set quotaAndLimits.dql.enabled to true at the same time.
      max: -1
      db:
        max: -1 # qps, default no limit
      collection:
        # Maximum number of queries per collection per second.
        # Setting this item to 100 indicates that Milvus only allows 100 queries per collection per second.
        # To use this setting, set quotaAndLimits.dql.enabled to true at the same time.
        max: -1
      partition:
        max: -1 # qps, default no limit
  limitWriting:
    # forceDeny false means dml requests are allowed (except for some
    # specific conditions, such as memory of nodes to water marker), true means always reject all dml requests.
    forceDeny: false
    ttProtection:
      enabled: false
      # maxTimeTickDelay indicates the backpressure for DML Operations.
      # DML rates would be reduced according to the ratio of time tick delay to maxTimeTickDelay,
      # if time tick delay is greater than maxTimeTickDelay, all DML requests would be rejected.
      # seconds
      maxTimeTickDelay: 300
    memProtection:
      # When memory usage > memoryHighWaterLevel, all dml requests would be rejected;
      # When memoryLowWaterLevel < memory usage < memoryHighWaterLevel, reduce the dml rate;
      # When memory usage < memoryLowWaterLevel, no action.
      enabled: true
      dataNodeMemoryLowWaterLevel: 0.85 # (0, 1], memoryLowWaterLevel in DataNodes
      dataNodeMemoryHighWaterLevel: 0.95 # (0, 1], memoryHighWaterLevel in DataNodes
      queryNodeMemoryLowWaterLevel: 0.85 # (0, 1], memoryLowWaterLevel in QueryNodes
      queryNodeMemoryHighWaterLevel: 0.95 # (0, 1], memoryHighWaterLevel in QueryNodes
    growingSegmentsSizeProtection:
      # No action will be taken if the growing segments size is less than the low watermark.
      # When the growing segments size exceeds the low watermark, the dml rate will be reduced,
      # but the rate will not be lower than minRateRatio * dmlRate.
      enabled: false
      minRateRatio: 0.5
      lowWaterLevel: 0.2
      highWaterLevel: 0.4
    diskProtection:
      enabled: true # When the total file size of object storage is greater than `diskQuota`, all dml requests would be rejected;
      diskQuota: -1 # MB, (0, +inf), default no limit
      diskQuotaPerDB: -1 # MB, (0, +inf), default no limit
      diskQuotaPerCollection: -1 # MB, (0, +inf), default no limit
      diskQuotaPerPartition: -1 # MB, (0, +inf), default no limit
    l0SegmentsRowCountProtection:
      enabled: false # switch to enable l0 segment row count quota
      lowWaterLevel: 30000000 # l0 segment row count quota, low water level
      highWaterLevel: 50000000 # l0 segment row count quota, high water level
    deleteBufferRowCountProtection:
      enabled: false # switch to enable delete buffer row count quota
      lowWaterLevel: 32768 # delete buffer row count quota, low water level
      highWaterLevel: 65536 # delete buffer row count quota, high water level
    deleteBufferSizeProtection:
      enabled: false # switch to enable delete buffer size quota
      lowWaterLevel: 134217728 # delete buffer size quota, low water level
      highWaterLevel: 268435456 # delete buffer size quota, high water level
  limitReading:
    # forceDeny false means dql requests are allowed (except for some
    # specific conditions, such as collection has been dropped), true means always reject all dql requests.
    forceDeny: false
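
The memProtection comments in the configuration above describe a three-zone policy: no action below the low water level, a reduced DML rate between the two levels, and rejection above the high water level. The sketch below shows one way to express that as a rate multiplier. The linear scaling between the watermarks is an assumption made for illustration (the comments only say the rate is reduced), and the class and method names are hypothetical.

```java
// Hypothetical sketch of the three-zone memory protection policy.
final class MemProtectionSketch {

    /**
     * Returns a multiplier in [0, 1] to apply to the DML rate.
     * 1.0 means no throttling, 0.0 means all DML requests are rejected.
     */
    static double dmlRateFactor(double memoryUsage, double lowWaterLevel, double highWaterLevel) {
        if (lowWaterLevel <= 0 || highWaterLevel > 1 || lowWaterLevel >= highWaterLevel) {
            throw new IllegalArgumentException("expected 0 < low < high <= 1");
        }
        if (memoryUsage > highWaterLevel) {
            return 0.0; // above the high water level: reject
        }
        if (memoryUsage < lowWaterLevel) {
            return 1.0; // below the low water level: no action
        }
        // Assumption: scale linearly from 1.0 at the low level down to 0.0 at the high level.
        return (highWaterLevel - memoryUsage) / (highWaterLevel - lowWaterLevel);
    }

    public static void main(String[] args) {
        // 0.85 and 0.95 are the water levels from the configuration above.
        System.out.println(dmlRateFactor(0.50, 0.85, 0.95));
        System.out.println(dmlRateFactor(0.90, 0.85, 0.95));
        System.out.println(dmlRateFactor(0.99, 0.85, 0.95));
    }
}
```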

Note that Milvus enforces rate limits; requests that exceed them are rejected with errors like the following:

ERROR i.m.client.AbstractMilvusGrpcClient - FlushRequest failed, error code: 8, reason: request is rejected by grpc RateLimiter middleware, please retry later: rate limit exceeded[rate=0.1]
ERROR i.m.client.AbstractMilvusGrpcClient - FlushRequest failed! Exception:{}
io.milvus.exception.ServerException: request is rejected by grpc RateLimiter middleware, please retry later: rate limit exceeded[rate=0.1]
	at io.milvus.client.AbstractMilvusGrpcClient.handleResponse(AbstractMilvusGrpcClient.java:399)
	at io.milvus.client.AbstractMilvusGrpcClient.flush(AbstractMilvusGrpcClient.java:921)
	at io.milvus.client.MilvusServiceClient.lambda$flush$17(MilvusServiceClient.java:520)
	at io.milvus.client.MilvusServiceClient.retry(MilvusServiceClient.java:310)
	at io.milvus.client.MilvusServiceClient.flush(MilvusServiceClient.java:520)
	at dev.langchain4j.store.embedding.milvus.CollectionOperationsExecutor.flush(CollectionOperationsExecutor.java:32)
	at dev.langchain4j.store.embedding.milvus.MilvusEmbeddingStore.addAll(MilvusEmbeddingStore.java:246)
	at dev.langchain4j.store.embedding.milvus.MilvusEmbeddingStore.addInternal(MilvusEmbeddingStore.java:226)
	at dev.langchain4j.store.embedding.milvus.MilvusEmbeddingStore.add(MilvusEmbeddingStore.java:184)
	at JlamaMilvusExample.main(JlamaMilvusExample.java:63)
WARN  i.m.client.AbstractMilvusGrpcClient - Retry(4) with interval 270ms. Reason: io.milvus.exception.ServerException: request is rejected by grpc RateLimiter middleware, please retry later: rate limit exceeded[rate=0.1]
WARN  i.m.client.AbstractMilvusGrpcClient - Retry(5) with interval 810ms. Reason: io.milvus.exception.ServerException: request is rejected by grpc RateLimiter middleware, please retry later: rate limit exceeded[rate=0.1]
WARN  i.m.client.AbstractMilvusGrpcClient - Retry(6) with interval 2430ms. Reason: io.milvus.exception.ServerException: request is rejected by grpc RateLimiter middleware, please retry later: rate limit exceeded[rate=0.1]
WARN  i.m.client.AbstractMilvusGrpcClient - Retry(7) with interval 3000ms. Reason: io.milvus.exception.ServerException: request is rejected by grpc RateLimiter middleware, please retry later: rate limit exceeded[rate=0.1]
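
The retry intervals in the log (270 ms, 810 ms, 2430 ms, then 3000 ms for retries 4 to 7) are consistent with exponential backoff that triples each time and is capped at 3000 ms. The sketch below reproduces that schedule. The starting value of 10 ms is extrapolated backwards from the logged values rather than taken from the log, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: exponential backoff with a multiplier and an upper cap.
final class BackoffSketch {

    static List<Long> intervals(long initialMs, long multiplier, long maxMs, int retries) {
        List<Long> result = new ArrayList<>();
        long current = initialMs;
        for (int i = 0; i < retries; i++) {
            result.add(Math.min(current, maxMs));
            current = Math.min(current * multiplier, maxMs);
        }
        return result;
    }

    static long total(List<Long> intervals) {
        long sum = 0;
        for (long interval : intervals) {
            sum += interval;
        }
        return sum;
    }

    public static void main(String[] args) {
        List<Long> schedule = intervals(10, 3, 3000, 7);
        System.out.println(schedule);        // prints [10, 30, 90, 270, 810, 2430, 3000]
        System.out.println(total(schedule)); // prints 6640
    }
}
```

Under these assumptions the first seven retries span only about 6.6 seconds. If rate=0.1 means 0.1 requests per second, that is less than the 10 seconds between permitted flushes, which would explain why the retries keep being rejected.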

To avoid this, edit /milvus/configs/milvus.yaml and raise
quotaAndLimits.flushRate.collection.max; the default is 0.1.
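
A limit of 0.1 flushes per second works out to one flush every 10 seconds per collection, so with autoFlushOnInsert(true) consecutive add() calls need to be spaced at least that far apart unless the limit is raised. As a rough sketch of that arithmetic and of client-side pacing, the helper below converts a per-second rate into a minimum interval and reports how long a caller should wait before the next flush. All names are hypothetical, and it does not model how the Milvus server implements its limiter.

```java
// Hypothetical client-side pacer: spaces calls so they stay under a per-second rate.
final class FlushPacerSketch {

    /** Minimum spacing in milliseconds between calls for a given rate per second. */
    static long minIntervalMillis(double maxPerSecond) {
        if (maxPerSecond <= 0) {
            throw new IllegalArgumentException("rate must be positive");
        }
        return (long) Math.ceil(1000.0 / maxPerSecond);
    }

    private final long intervalMillis;
    private long nextAllowedAt = 0; // earliest time (in millis) the next call may run

    FlushPacerSketch(double maxPerSecond) {
        this.intervalMillis = minIntervalMillis(maxPerSecond);
    }

    /** Reserves the next slot and returns how long the caller should wait first. */
    synchronized long reserve(long nowMillis) {
        long wait = Math.max(0, nextAllowedAt - nowMillis);
        nextAllowedAt = Math.max(nowMillis, nextAllowedAt) + intervalMillis;
        return wait;
    }

    public static void main(String[] args) {
        System.out.println(minIntervalMillis(0.1)); // prints 10000
        System.out.println(minIntervalMillis(10));  // prints 100

        FlushPacerSketch pacer = new FlushPacerSketch(0.1);
        System.out.println(pacer.reserve(0));    // prints 0 (first call runs immediately)
        System.out.println(pacer.reserve(3000)); // prints 7000 (must wait until t = 10000)
    }
}
```

A caller would sleep for the returned number of milliseconds before each add(). Raising the server-side limit, as described above, removes the need for this.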

Summary

langchain4j provides the langchain4j-milvus module for integrating with Milvus.

doc

  • standalone_embed
