LLM Key Server: Providing Secure and Convenient Access to Internal LLM APIs

This post is for Day 2 of Mercari Advent Calendar 2025, brought to you by @hi120ki from the Mercari AI Security team.

At Mercari, various initiatives are underway to expand the use of AI and LLMs within the company. To support these efforts, the AI Security team developed the LLM Key Server, a service designed to provide secure yet convenient access to LLM APIs.

This system replaces the previous manual process where administrators would register users upon receiving LLM API access requests. Now, users can obtain temporary API keys through their internal accounts without submitting access requests.

Additionally, we provide common templates for using LLM APIs in GitHub Actions and Google Apps Script, facilitating LLM adoption in local environments and across multiple services such as CI, cloud platforms, and no-code tools.

This article explains the security challenges of LLM APIs, improvements to our processes, the architecture of the LLM Key Server, and key implementation points.

Security Challenges in LLM APIs

Various LLM models are currently offered by different providers, and at Mercari, we leverage multiple LLM models based on task requirements and employee preferences. However, the APIs that provide access to these models typically require API keys.

API keys used to access major LLM vendor APIs typically have no expiration date. If a key is leaked and the breach goes undetected, organizations face the risk of prolonged information leakage and financial losses. Furthermore, the current surge in AI and LLM adoption has led to the proliferation of API keys, raising concerns about unclear management practices. Managing users, teams, and permissions across multiple LLM providers adds additional complexity. This complexity makes regular access audits difficult to conduct.

The most secure approach, and the one we recommend internally, is to access LLM APIs through Google Cloud or Azure using Workload Identity and cross-cloud federation, which eliminates API keys entirely. However, these configurations are complex to set up, and many external AI and LLM products ship without supporting them, so we needed an alternative approach, particularly when evaluating new LLM tools.

An additional requirement was to ensure both convenience and security. Overly cumbersome security policies can paradoxically encourage users to bypass them, so we needed to pursue both safety and usability.

Providing Secure and Convenient LLM API Access

To provide secure and convenient access to LLM APIs, we decided to leverage the open source project LiteLLM, which enables access to multiple models through a single unified API, along with the OpenID Connect (OIDC) ID token issuance capabilities of Google Workspace and Google Cloud.

LiteLLM is an open source solution that makes LLM models from various providers accessible through a single API. Beyond basic LLM API calls, it also supports coding agent tools such as Claude Code.
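To make "a single unified API" concrete: LiteLLM exposes an OpenAI-compatible endpoint, so switching providers is mostly a matter of changing the model name. The sketch below builds such a request in Go; the base URL, key, and model name are placeholders, not our actual configuration.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// newChatRequest builds an OpenAI-compatible chat completion request
// against a LiteLLM proxy. The request shape stays the same regardless
// of which provider ultimately serves the model.
func newChatRequest(baseURL, apiKey, model, prompt string) (*http.Request, error) {
	body, err := json.Marshal(map[string]any{
		"model": model,
		"messages": []map[string]string{
			{"role": "user", "content": prompt},
		},
	})
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPost, baseURL+"/v1/chat/completions", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Authorization", "Bearer "+apiKey)
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

func main() {
	req, _ := newChatRequest("https://litellm.example.com", "sk-placeholder", "gpt-4o", "Hello")
	fmt.Println(req.Method, req.URL.String())
}
```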

Google's OIDC ID token issuance lets us obtain Google-signed ID tokens via Google OAuth or service account credentials, enabling reliable verification of the caller's identity.

At Mercari, we operate a Token Server that enables access to GitHub from Google Cloud using short-lived credentials. The LLM Key Server builds upon this architecture, extending it to support LLM access.

LLM Key Server Architecture

The LLM Key Server authentication flow works as follows.

(Figure: The LLM Key Server authentication flow)

First, users or workloads that need LLM access for Claude Code or other applications obtain an OIDC ID token from Google to prove their identity, either through Google Workspace account authentication or through service account authentication via the Compute metadata server.
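For the service-account path, the ID token comes from the Compute metadata server's documented `identity` endpoint. A minimal Go sketch of that request follows; the audience URL is a placeholder, and in practice it must match whatever the LLM Key Server verifies.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
)

// metadataIdentityRequest builds the request a workload sends to the
// Compute metadata server to obtain an OIDC ID token for the given
// audience. The metadata server signs the token with the attached
// service account's identity.
func metadataIdentityRequest(audience string) (*http.Request, error) {
	u := "http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/identity?audience=" + url.QueryEscape(audience)
	req, err := http.NewRequest(http.MethodGet, u, nil)
	if err != nil {
		return nil, err
	}
	// The metadata server rejects requests that lack this header.
	req.Header.Set("Metadata-Flavor", "Google")
	return req, nil
}

func main() {
	req, _ := metadataIdentityRequest("https://key-server.example.com")
	fmt.Println(req.URL.String())
}
```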

Next, when the OIDC ID token is sent to the LLM Key Server, the server verifies the token signature and issues a temporary API key for accessing LiteLLM based on the information in the token. This API key has a short expiration period, allowing users to access various LLM models through LiteLLM.

For local environments using Google Workspace account authentication, we provide an internal CLI tool that initiates the OAuth authorization flow with a single command, handling the entire process from obtaining the OIDC ID token to retrieving the LLM API key.

When using service accounts, API keys expire after one hour. However, recognizing that cloud applications using LLMs may run for extended periods, we provide an automatic key renewal mechanism. This is implemented as a Go library that automatically renews keys, enabling continuous LLM API usage.

This approach leverages Google Workspace and Google Cloud service account authentication to provide secure LLM API access, while time-limited keys reduce information leakage risks and automatic renewal libraries ensure convenience.

Expanding LLM Key Server Usage Scenarios

The LLM Key Server is designed for use not only in local environments and cloud applications, but also across various internal tools and services. We specifically support the following two usage scenarios.

GitHub Actions

We provide a common template for using LLM APIs in GitHub Actions. GitHub provides OIDC ID tokens that can be used to obtain LLM API keys from the LLM Key Server, enabling access to various LLM models through LiteLLM. This has accelerated LLM adoption in CI/CD pipelines, including automated code reviews using Claude Code.

- name: Get LiteLLM Key
  id: litellm
  uses: actions/github-script@60a0d83039c74a4aee543508d2ffcb1c3799cdea # v7.0.1
  with:
    script: |
      const oidc_request_token = process.env.ACTIONS_ID_TOKEN_REQUEST_TOKEN;
      const oidc_request_url = process.env.ACTIONS_ID_TOKEN_REQUEST_URL;
      const oidc_resp = await fetch(`${oidc_request_url}&audience=https://key-server.example.com`, {
        headers: {Authorization: `bearer ${oidc_request_token}`},
      });
      const oidc_token = (await oidc_resp.json()).value;
      if (!oidc_token) {
        core.setFailed('Failed to retrieve OIDC token from GitHub Actions');
        return; // setFailed does not stop execution on its own
      }

      const res = await fetch('https://key-server.example.com/llm-key', {
        method: 'GET',
        headers: {
          'Authorization': `Bearer ${oidc_token}`,
          'Content-Type': 'application/json',
        }
      });
      if (res.status !== 200) {
        core.setFailed(`LiteLLM API Error: HTTP ${res.status}`);
        return; // setFailed does not stop execution on its own
      }
      const body = await res.json();
      core.setSecret(body.key);
      core.setOutput('token', body.key);

This template allows developers to securely use LLM APIs in CI/CD pipelines without directly managing API keys. Note that the workflow job must be granted the `id-token: write` permission; otherwise the `ACTIONS_ID_TOKEN_REQUEST_TOKEN` and `ACTIONS_ID_TOKEN_REQUEST_URL` environment variables are not populated.

Google Apps Script

We also provide a common template for using LLM APIs in Google Apps Script. In Google Apps Script, we use OAuth scope configuration to authenticate users and obtain OIDC ID tokens.

To set this up, we open the script editor, enable the "Show appsscript.json manifest file in editor" option on the project settings page, and add the necessary OAuth scopes to the manifest.

  "oauthScopes": [
    "openid",
    "https://www.googleapis.com/auth/userinfo.email",
    "https://www.googleapis.com/auth/script.external_request"
  ],

With this configuration, you can obtain the OIDC ID token, retrieve the LLM API key from the LLM Key Server, and access various LLM models through LiteLLM using the following code.

function getLLMToken() {
  try {
    const cache = CacheService.getUserCache();
    const cacheKey = "llm_token";
    const cachedToken = cache.get(cacheKey);
    if (cachedToken) {
      return cachedToken;
    }
    console.log("[+] Fetching new LLM token");
    const token = ScriptApp.getIdentityToken();
    const options = {
      method: "GET",
      headers: {
        Authorization: "Bearer " + token,
      },
      // Without muteHttpExceptions, UrlFetchApp.fetch throws on non-2xx
      // responses and the status check below is never reached.
      muteHttpExceptions: true,
    };
    const response = UrlFetchApp.fetch(
      "https://key-server.example.com/llm-key",
      options,
    );
    const statusCode = response.getResponseCode();
    if (statusCode !== 200) {
      throw new Error(
        `HTTP request failed with status ${statusCode}: ${response.getContentText()}`,
      );
    }
    const responseText = response.getContentText();
    const responseData = JSON.parse(responseText);
    if (!responseData.key) {
      throw new Error("Key not found in response");
    }
    cache.put(cacheKey, responseData.key, 50 * 60); // Cache for 50 minutes
    return responseData.key;
  } catch (e) {
    console.error("Error getting LLM token: " + e.toString());
    return null;
  }
}

When validating OIDC ID tokens, we verify the user's email address and confirm that the Google Cloud project backing the Apps Script resides within the organization's system-gsuite/apps-script folder in Google Cloud. This ensures that only requests from trusted scripts are allowed.
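As an illustration of these checks only, the decision might be sketched as below. The domain, folder ID, and function name are placeholders, and the folder chain would in practice come from the Cloud Resource Manager API after the token signature has already been verified.

```go
package main

import (
	"fmt"
	"strings"
)

// trustedAppsScript decides whether a verified ID token from Apps Script
// should be accepted. email comes from the token's verified claims;
// ancestors is the backing project's folder chain (stubbed here).
func trustedAppsScript(email string, ancestors []string, requiredFolder string) bool {
	// Only accept accounts in the corporate domain, never external callers.
	if !strings.HasSuffix(email, "@example.com") {
		return false
	}
	// The backing project must sit under the expected folder
	// (system-gsuite/apps-script in our case).
	for _, a := range ancestors {
		if a == requiredFolder {
			return true
		}
	}
	return false
}

func main() {
	ok := trustedAppsScript("dev@example.com", []string{"folders/111", "folders/222"}, "folders/222")
	fmt.Println(ok)
}
```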

This approach eliminates the need to store LLM API keys in plaintext within no-code tools, enabling secure LLM API usage.

This mechanism has accelerated LLM adoption within the company for use cases such as summarizing and translating internal documents.

Conclusion

We have developed and deployed the LLM Key Server as a core component that solves the problem of authenticating to LLM APIs across several common types of workloads, providing a solution as easy to use as static API keys for both developers and non-developers. We believe that safe AI and LLM utilization is best supported by solutions that are both secure and easy to use.

If you are interested in AI and LLM adoption or security initiatives like these at Mercari, please visit our careers page.

Tomorrow’s article will be by @Jazz. Look forward to it!
