[Langchain] Understanding How the Response Cache Works

brad.min 2024. 5. 28. 17:20

I had been caching question-and-answer responses with the cache feature that langchain provides, but I was curious how it works internally, so I looked into it.

First, the _agenerate_with_cache method in chat_models.py runs. Inside it, the call below checks whether the cache already holds an entry for the same prompt and model settings.

cache_val = await llm_cache.alookup(prompt, llm_string)

#chat_models.py

async def _agenerate_with_cache(
    self,
    messages: List[BaseMessage],
    stop: Optional[List[str]] = None,
    run_manager: Optional[AsyncCallbackManagerForLLMRun] = None,
    **kwargs: Any,
) -> ChatResult:
    if isinstance(self.cache, BaseCache):
        llm_cache = self.cache
    else:
        llm_cache = get_llm_cache()
    # We should check the cache unless it's explicitly set to False
    # A None cache means we should use the default global cache
    # if it's configured.
    check_cache = self.cache or self.cache is None
    if check_cache:
        if llm_cache:
            llm_string = self._get_llm_string(stop=stop, **kwargs)
            prompt = dumps(messages)
            cache_val = await llm_cache.alookup(prompt, llm_string)
            if isinstance(cache_val, list):
                return ChatResult(generations=cache_val)
        elif self.cache is None:
            pass
        else:
            raise ValueError(
                "Asked to cache, but no cache found at `langchain.cache`."
            )
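
To make the branching above easier to follow, here is a minimal sketch of just the decision logic, with no langchain imports. FakeCache and pick_cache are hypothetical names I made up for illustration; the first argument plays the role of self.cache (a cache object, True, False, or None) and the second plays the role of whatever get_llm_cache() returns.

```python
from typing import Optional, Union


class FakeCache:
    """Hypothetical stand-in for a BaseCache implementation."""


def pick_cache(
    cache_setting: Union[FakeCache, bool, None],
    global_cache: Optional[FakeCache],
) -> Optional[FakeCache]:
    """Return the cache to consult, or None when the lookup is skipped."""
    # An explicit cache object on the model wins over the global cache.
    if isinstance(cache_setting, FakeCache):
        llm_cache = cache_setting
    else:
        llm_cache = global_cache

    # Skip the cache only when it was explicitly set to False.
    check_cache = bool(cache_setting) or cache_setting is None
    if not check_cache:
        return None

    if llm_cache:
        return llm_cache
    if cache_setting is None:
        # No global cache configured and nothing was requested: just skip.
        return None
    # cache=True was requested, but there is no cache to use.
    raise ValueError("Asked to cache, but no cache found.")
```

In short: a cache object on the model is used directly, None falls back to the global cache if one exists, False skips the lookup, and True without a global cache is an error.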

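Looking at the lookup method in cache.py, it reads the Redis hash with hgetall and stores the outcome in results. If the key does not exist, results is an empty dict (not a list), so nothing is found. Note that the chat model calls alookup, not lookup; as far as I can tell, when a cache class only implements the synchronous lookup, the default alookup on the base class simply runs lookup in an executor, so this is the code that ends up being executed.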

#cache.py

def lookup(self, prompt: str, llm_string: str) -> Optional[RETURN_VAL_TYPE]:
    """Look up based on prompt and llm_string."""
    # Read from a Redis HASH
    try:
        results = self.redis.hgetall(self._key(prompt, llm_string))
        return self._get_generations(results)  # type: ignore[arg-type]
    except Exception as e:
        logger.error(f"Redis lookup failed: {e}")
        return None
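
The hit-or-miss behaviour of hgetall can be sketched without a running Redis server. Everything below is a hypothetical stand-in: FakeRedis only mimics the fact that HGETALL returns an empty mapping for a missing key, and make_key assumes the key is a digest of prompt + llm_string, which may differ from what the real _key method does.

```python
import hashlib
from typing import Dict, List, Optional


class FakeRedis:
    """Hypothetical in-memory stand-in that mimics HSET/HGETALL on hashes."""

    def __init__(self) -> None:
        self._hashes: Dict[str, Dict[str, str]] = {}

    def hset(self, key: str, mapping: Dict[str, str]) -> None:
        self._hashes.setdefault(key, {}).update(mapping)

    def hgetall(self, key: str) -> Dict[str, str]:
        # Like Redis, return an empty mapping when the key does not exist.
        return dict(self._hashes.get(key, {}))


def make_key(prompt: str, llm_string: str) -> str:
    # Assumption: the cache key is a digest of prompt + llm_string.
    return hashlib.md5((prompt + llm_string).encode()).hexdigest()


def lookup(client: FakeRedis, prompt: str, llm_string: str) -> Optional[List[str]]:
    results = client.hgetall(make_key(prompt, llm_string))
    if not results:
        return None  # empty dict means a cache miss
    # Assumption: generations are stored under field names "0", "1", ...
    return [results[field] for field in sorted(results, key=int)]


client = FakeRedis()
miss = lookup(client, "hi", "model-a")
client.hset(make_key("hi", "model-a"), mapping={"0": "hello", "1": "hey"})
hit = lookup(client, "hi", "model-a")
```

Because llm_string is part of the key, the same prompt sent with different model settings is treated as a separate cache entry.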

When the data is found in the cache, it is wrapped in a ChatResult object and returned as the response, without calling the model.

if isinstance(cache_val, list):
    return ChatResult(generations=cache_val)
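
Putting the pieces together, the overall flow can be sketched as a small asyncio program. InMemoryCache, fake_model and generate_with_cache are hypothetical names, and the write-back on a miss (aupdate) is my assumption about what happens after the excerpt above ends; the excerpt itself only shows the lookup side.

```python
import asyncio
from typing import Dict, List, Optional, Tuple


class InMemoryCache:
    """Hypothetical async cache keyed by (prompt, llm_string)."""

    def __init__(self) -> None:
        self._store: Dict[Tuple[str, str], List[str]] = {}

    async def alookup(self, prompt: str, llm_string: str) -> Optional[List[str]]:
        return self._store.get((prompt, llm_string))

    async def aupdate(self, prompt: str, llm_string: str, value: List[str]) -> None:
        self._store[(prompt, llm_string)] = value


calls = {"model": 0}  # counts how often the (fake) model is really invoked


async def fake_model(prompt: str) -> List[str]:
    calls["model"] += 1
    return [f"answer to: {prompt}"]


async def generate_with_cache(cache: InMemoryCache, prompt: str, llm_string: str) -> List[str]:
    cached = await cache.alookup(prompt, llm_string)
    if isinstance(cached, list):
        return cached  # cache hit: the model is not called
    result = await fake_model(prompt)
    await cache.aupdate(prompt, llm_string, result)  # assumed write-back on a miss
    return result


async def main() -> Tuple[List[str], List[str]]:
    cache = InMemoryCache()
    first = await generate_with_cache(cache, "What is Redis?", "model-a")
    second = await generate_with_cache(cache, "What is Redis?", "model-a")
    return first, second


first, second = asyncio.run(main())
```

The second call returns the same list as the first, and the model counter stays at one, which is the behaviour the isinstance(cache_val, list) check gives you.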

License: Attribution, NonCommercial, NoDerivatives
