Integrating Claude and ChatGPT Into an Unreal Engine Plugin

A code-heavy walkthrough of wiring an LLM into the Unreal editor: one provider-agnostic HTTP client for both Claude and ChatGPT, the strategy and factory pattern, building and parsing JSON with FJsonSerializer, async requests with FHttpModule, marshalling back to the game thread, and bounding conversation history to control tokens.

This is the code-heavy one. Putting an AI assistant inside the Unreal editor sounds like it needs an SDK and a pile of dependencies; it does not. Unreal ships everything required: FHttpModule for async HTTP and FJsonSerializer for JSON. The whole integration in the AI Node Code Editor (Quick Code Editor on FAB) is one client class of a few hundred lines that talks to both Claude and ChatGPT. This article pulls that apart. It is one of six articles in Building an AI Code Editor Inside Unreal Engine.

Figure: The AI assistant inside the Unreal Engine code editor, with C++ on the left and a rendered Claude response on the right explaining a function, including a syntax-highlighted code block.

The shape of the problem

Two providers, two REST APIs, one chat UI. The naive approach is two client classes; the better approach is to notice how little actually differs between them. Both the Anthropic Messages API and the OpenAI Chat Completions API take a JSON body of model, messages and max_tokens, and both return a single buffered JSON response. The real differences are few: the endpoint, the auth header (plus a version header for Anthropic), how the system prompt is delivered, and the shape of the response. So capture what you can as data and share everything else.

The agent interface

Start with an interface so the rest of the editor never knows which provider it is talking to. The async result comes back through a TFunction callback. Success and failure ride the same channel; no exceptions are involved:

class IQCE_AIAgent
{
public:
    virtual ~IQCE_AIAgent() = default;

    // Multi-turn chat, keyed by conversation.
    virtual void SendMessage(const FString& ConversationKey, const FString& Message,
        const TFunction<void(const FString& Response, bool bSuccess)>& OnComplete) = 0;

    // One-shot code completion.
    virtual void GetCompletion(const FAICompletionContext& Request,
        const TFunction<void(const FString& Response, bool bSuccess)>& OnComplete) = 0;

    virtual FString GetAgentName() const = 0;
    virtual bool IsAvailable() const = 0;
};

Provider differences as data

One concrete client implements that interface for both providers. The constructor takes a provider enum and fills a small config struct; that struct is the only place the two APIs diverge structurally:

struct FAIProviderConfig
{
    FString ApiEndpoint;
    FString AuthHeaderName;     // "Authorization"  vs  "x-api-key"
    FString AuthHeaderPrefix;   // "Bearer "        vs  ""
    FString ApiVersionHeader;   // ""               vs  "anthropic-version"
    FString ApiVersionValue;    // ""               vs  "2023-06-01"
    bool    bSupportsTemperature = false;
};

void FQCE_GenericAIClient::InitializeProviderConfig()
{
    const UQCE_EditorSettings* Settings = GetDefault<UQCE_EditorSettings>();
    switch (Provider)
    {
    case EQCEDefaultAIProvider::Claude:
        Config.ApiEndpoint      = Settings->ClaudeApiEndpoint;   // api.anthropic.com/v1/messages
        Config.AuthHeaderName   = TEXT("x-api-key");
        Config.AuthHeaderPrefix = TEXT("");
        Config.ApiVersionHeader = TEXT("anthropic-version");
        Config.ApiVersionValue  = TEXT("2023-06-01");
        Config.bSupportsTemperature = false;
        break;

    case EQCEDefaultAIProvider::ChatGPT:
        Config.ApiEndpoint      = Settings->OpenAIApiEndpoint;   // api.openai.com/v1/chat/completions
        Config.AuthHeaderName   = TEXT("Authorization");
        Config.AuthHeaderPrefix = TEXT("Bearer ");
        Config.ApiVersionHeader = TEXT("");
        Config.ApiVersionValue  = TEXT("");
        Config.bSupportsTemperature = true;
        break;
    }
}

This is the heart of the design. Once the differences live in data, the header setup collapses to four lines that serve both providers:

void FQCE_GenericAIClient::SetupRequestHeaders(TSharedRef<IHttpRequest, ESPMode::ThreadSafe> Request)
{
    Request->SetHeader(TEXT("Content-Type"), TEXT("application/json"));
    Request->SetHeader(*Config.AuthHeaderName, Config.AuthHeaderPrefix + GetApiKey());
    if (!Config.ApiVersionHeader.IsEmpty())
        Request->SetHeader(*Config.ApiVersionHeader, *Config.ApiVersionValue);
}

GetApiKey() and GetModelVersion() switch on the same enum to read the right field from settings (ClaudeApiKey vs OpenAIApiKey, ModelVersion vs OpenAIModelVersion).
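
As a rough illustration, those two accessors might look like the standalone sketch below. It uses std::string instead of FString so it compiles outside the engine. EProvider and FSettings are stand-ins for EQCEDefaultAIProvider and UQCE_EditorSettings (their real definitions are not shown in this article); only the four field names come from the text above.

```cpp
#include <string>

// Stand-ins (assumed shapes) for EQCEDefaultAIProvider and UQCE_EditorSettings.
enum class EProvider { Claude, ChatGPT };

struct FSettings
{
    std::string ClaudeApiKey;
    std::string ModelVersion;         // Claude model id
    std::string OpenAIApiKey;
    std::string OpenAIModelVersion;   // OpenAI model id
};

// One switch per accessor: the enum picks the field, nothing else branches.
inline const std::string& GetApiKey(EProvider Provider, const FSettings& Settings)
{
    switch (Provider)
    {
    case EProvider::ChatGPT: return Settings.OpenAIApiKey;
    case EProvider::Claude:
    default:                 return Settings.ClaudeApiKey;
    }
}

inline const std::string& GetModelVersion(EProvider Provider, const FSettings& Settings)
{
    switch (Provider)
    {
    case EProvider::ChatGPT: return Settings.OpenAIModelVersion;
    case EProvider::Claude:
    default:                 return Settings.ModelVersion;
    }
}
```

Keeping these as tiny switches rather than copying the key into the config struct means a key changed in settings is picked up on the next request without rebuilding the client.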

A factory of one client per provider

The UI never news up a client. A factory lazily creates and caches one client per provider, so they live for the editor session:

FQCE_GenericAIClient& FQCE_AIClientFactory::GetClient(EQCEDefaultAIProvider Provider)
{
    if (!ClientInstances.Contains(Provider))
        ClientInstances.Add(Provider, MakeUnique<FQCE_GenericAIClient>(Provider));
    return *ClientInstances[Provider];
}

The chat panel maps the selected provider to the enum and calls through the factory, passing a lambda that routes the response back to itself:

FQCE_AIClientFactory::GetClient(Provider).SendMessage(
    ConversationKey, Message,
    [this](const FString& Response, bool bSuccess)
    {
        HandleMessageResponse(Response, bSuccess);
    });

Sending the request

SendMessage is the whole async-HTTP-in-the-editor pattern in one function: bail early if there is no key, build the request, serialize the JSON body, bind the completion handler (with the caller’s callback carried as a payload argument), and fire:

void FQCE_GenericAIClient::SendMessage(const FString& ConversationKey, const FString& Message,
    const TFunction<void(const FString&, bool)>& OnComplete)
{
    if (GetApiKey().IsEmpty())
    {
        OnComplete(FString::Printf(TEXT("%s_API_KEY_MISSING"), *GetAgentName().ToUpper()), false);
        return;
    }

    TSharedRef<IHttpRequest, ESPMode::ThreadSafe> Request = FHttpModule::Get().CreateRequest();
    Request->SetURL(Config.ApiEndpoint);
    Request->SetVerb(TEXT("POST"));
    SetupRequestHeaders(Request);

    TSharedPtr<FJsonObject> Payload = CreateConversationPayload(ConversationKey);
    FString Body;
    TSharedRef<TJsonWriter<>> Writer = TJsonWriterFactory<>::Create(&Body);
    FJsonSerializer::Serialize(Payload.ToSharedRef(), Writer);
    Request->SetContentAsString(Body);

    // The caller's callback is carried as a payload arg into the completion handler.
    Request->OnProcessRequestComplete().BindRaw(this, &FQCE_GenericAIClient::HandleResponse, OnComplete);
    Request->ProcessRequest();   // returns immediately; the response arrives later
}

Two patterns to note. The missing-key case returns a sentinel string (CLAUDE_API_KEY_MISSING) the UI can detect to show a “set your key” card instead of a raw error. And ProcessRequest() is non-blocking: it returns at once, and the response comes back through the delegate, with the caller’s OnComplete carried along as a bound payload argument.
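
On the UI side, detecting that sentinel can be a plain suffix check. The helper below is a hypothetical sketch, not the plugin's code: IsMissingKeySentinel is an invented name, and it matches on the _API_KEY_MISSING suffix so it does not need to know what GetAgentName() returns for each provider. It uses std::string so it compiles outside the engine.

```cpp
#include <string>

// Hypothetical helper: true when a failed response is the missing-key sentinel
// ("<AGENT>_API_KEY_MISSING") rather than a real transport or API error.
inline bool IsMissingKeySentinel(const std::string& Response, bool bSuccess)
{
    static const std::string Suffix = "_API_KEY_MISSING";

    // A successful response is never a sentinel, and the sentinel always has
    // an agent name in front of the suffix.
    if (bSuccess || Response.size() <= Suffix.size())
        return false;

    return Response.compare(Response.size() - Suffix.size(), Suffix.size(), Suffix) == 0;
}
```

A handler could call this before falling through to its generic error path, showing the "set your key" card when it returns true.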

Building the JSON body

CreateConversationPayload is where UE JSON building shows up, and where the per-provider message shape is chosen. Build an FJsonObject, set fields, wrap each message object in an FJsonValueObject for the array:

TSharedPtr<FJsonObject> FQCE_GenericAIClient::CreateConversationPayload(const FString& Key)
{
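    FQCEAIConversation* Conversation = QCE_AIConversationTracker::Get().FindConversation(Key);

    // Guard against an unknown key (assuming FindConversation returns null then):
    // an empty message list is safer than dereferencing a null pointer.
    TArray<TSharedPtr<FJsonObject>> ApiMessages;
    if (Conversation)
    {
        ApiMessages = (Provider == EQCEDefaultAIProvider::Claude)
            ? Conversation->GetClaudeAPIMessages()
            : Conversation->GetOpenAIAPIMessages();
    }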

    TArray<TSharedPtr<FJsonValue>> MessageValues;
    for (const TSharedPtr<FJsonObject>& Msg : ApiMessages)
        MessageValues.Add(MakeShared<FJsonValueObject>(Msg));

    TSharedPtr<FJsonObject> Payload = MakeShared<FJsonObject>();
    Payload->SetStringField(TEXT("model"), GetModelVersion());
    Payload->SetArrayField(TEXT("messages"), MessageValues);

    // Adaptive token budget: short chats are cheaper.
    const int32 Length = FMath::Max(0, ApiMessages.Num() - 1);
    const UQCE_EditorSettings* Settings = GetDefault<UQCE_EditorSettings>();
    Payload->SetNumberField(TEXT("max_tokens"),
        Length <= 2 ? Settings->SimpleQueryMaxTokens : Settings->RegularMaxTokens);

    if (Config.bSupportsTemperature)
        Payload->SetNumberField(TEXT("temperature"), 0.7);   // OpenAI only here

    return Payload;
}

Both providers get { model, messages, max_tokens }. temperature is added only when the provider config allows it. And max_tokens is adaptive: short conversations use a smaller budget (SimpleQueryMaxTokens, default 1024), longer ones use a larger one (RegularMaxTokens, default 2048), a small but real cost control.
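
The budget rule is small enough to isolate as a pure function. This is a standalone sketch under the assumptions stated in the text: the two defaults are the ones quoted above, and FTokenBudget and PickMaxTokens are illustrative names rather than plugin code.

```cpp
#include <algorithm>

// Defaults quoted in the text; in the plugin they come from editor settings.
struct FTokenBudget
{
    int SimpleQueryMaxTokens = 1024;
    int RegularMaxTokens     = 2048;
};

// ApiMessageCount is the number of messages about to be sent, including the
// context message at slot 0, so one is subtracted before the comparison.
inline int PickMaxTokens(int ApiMessageCount, const FTokenBudget& Budget = FTokenBudget())
{
    const int Length = std::max(0, ApiMessageCount - 1);
    return Length <= 2 ? Budget.SimpleQueryMaxTokens : Budget.RegularMaxTokens;
}
```

With a context message present, the first question (context plus one user message, Length 1) gets the small budget, and the second question (context, user, assistant, user, Length 3) already moves to the larger one.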

The one real protocol difference: the system prompt

The system instructions and the function context are delivered differently per provider. OpenAI takes a dedicated system-role message. The Anthropic Messages API does not accept a system role inside messages (it takes a top-level system field instead), and this integration sidesteps that by sending the same content as the first user message. That divergence lives in the two message-list getters:

// OpenAI: context becomes a system message.
MessageObj->SetStringField(TEXT("role"), TEXT("system"));

// Claude: the same context rides as the first user message.
MessageObj->SetStringField(TEXT("role"), Messages[0].Role);   // "user"
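
Reduced to its essentials, the role choice for the context message looks like the following standalone sketch. FApiMessage and MakeContextMessage are hypothetical names used for illustration, EProvider stands in for EQCEDefaultAIProvider, and std::string replaces FString so the snippet compiles outside the engine.

```cpp
#include <string>

// Stand-in for EQCEDefaultAIProvider.
enum class EProvider { Claude, ChatGPT };

// Hypothetical plain-data view of one entry in the "messages" array.
struct FApiMessage
{
    std::string Role;
    std::string Content;
};

// Builds the first API message from the function context.
inline FApiMessage MakeContextMessage(EProvider Provider, const std::string& Context)
{
    // OpenAI: a dedicated "system" role.
    // Claude (as this integration does it): an ordinary "user" turn.
    const char* Role = (Provider == EProvider::ChatGPT) ? "system" : "user";
    return FApiMessage{Role, Context};
}
```

Everything after this first entry is built the same way for both providers, which is why the difference fits in a single role string.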

Bounding history to control tokens

You do not send the entire conversation every turn; you send the system context plus a window of recent messages. The system message at slot 0 is always included and never counted against the budget; only the trailing MaxHistoryMessages of the rest are sent:

TArray<TSharedPtr<FJsonObject>> FQCEAIConversation::GetClaudeAPIMessages() const
{
    TArray<TSharedPtr<FJsonObject>> Api;

    // (1) Always include the context message first.
    if (Messages.Num() > 0 && Messages[0].MessageType == EQCEMessageType::FunctionContext)
        Api.Add(MakeMessage(Messages[0].Role, Messages[0].Content));

    // (2) Window the tail: at most MaxHistory recent turns, skipping slot 0.
    const int32 MaxHistory = GetDefault<UQCE_EditorSettings>()->MaxHistoryMessages;   // default 5
    const int32 Start = FMath::Max(1, Messages.Num() - MaxHistory);
    for (int32 i = Start; i < Messages.Num(); ++i)
        Api.Add(MakeMessage(Messages[i].Role, Messages[i].Content));

    return Api;
}

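FMath::Max(1, Messages.Num() - MaxHistory) is the whole windowing trick: it keeps the last MaxHistory messages and clamps the start index to 1, so the window never reaches back into slot 0 and the context is sent exactly once. Note that this relies on slot 0 always being the context message; a conversation that started without one would have its first real message skipped. Bounding history is the single biggest lever on token spend for a chat feature, and it is one line.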
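
To see which slots the window actually selects, here is the same arithmetic as a standalone sketch that returns indices instead of JSON objects. WindowIndices is an illustrative name, and the function assumes, as the text does, that slot 0 holds the context message.

```cpp
#include <algorithm>
#include <vector>

// Returns the indices of the stored messages that would be sent,
// assuming slot 0 holds the context message.
inline std::vector<int> WindowIndices(int MessageCount, int MaxHistory)
{
    std::vector<int> Indices;
    if (MessageCount <= 0)
        return Indices;

    // The context message is always sent and does not count against the window.
    Indices.push_back(0);

    // Clamp to 1 so the tail window can never include slot 0 a second time.
    const int Start = std::max(1, MessageCount - MaxHistory);
    for (int i = Start; i < MessageCount; ++i)
        Indices.push_back(i);

    return Indices;
}
```

With 10 stored messages and MaxHistory of 5, Start is 5, so slots 0 and 5 through 9 are sent: six messages in total. With only 3 stored messages the clamp takes over and all three go out.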

Parsing the response

The completion handler runs three failure checks (transport, JSON parse, content extract), all routed through the same OnComplete(text, false) channel:

void FQCE_GenericAIClient::HandleResponse(FHttpRequestPtr, FHttpResponsePtr Response,
    bool bWasSuccessful, TFunction<void(const FString&, bool)> OnComplete)
{
    if (!bWasSuccessful || !Response.IsValid())
    { OnComplete(TEXT("Failed to connect"), false); return; }

    TSharedPtr<FJsonObject> Json;
    TSharedRef<TJsonReader<>> Reader = TJsonReaderFactory<>::Create(Response->GetContentAsString());
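    if (!FJsonSerializer::Deserialize(Reader, Json) || !Json.IsValid())
    { OnComplete(TEXT("Failed to parse response"), false); return; }

    // Parse first, then call: C++ does not specify the order in which function
    // arguments are evaluated, so Content must be filled before it is inspected.
    FString Content, Error;
    const bool bParsed = ParseResponseContent(Json, Content, Error);
    OnComplete(bParsed ? Content : Error, /*bSuccess*/ bParsed && !Content.IsEmpty());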
}

The response shapes differ, so ParseResponseContent reads each with the bool-returning TryGet* accessors (no exceptions, out-param style):

// Claude: content[0].text
const TArray<TSharedPtr<FJsonValue>>* Content;
if (Json->TryGetArrayField(TEXT("content"), Content) && Content->Num() > 0)
{
    const TSharedPtr<FJsonObject>* Obj;
    if ((*Content)[0]->TryGetObject(Obj) && (*Obj)->TryGetStringField(TEXT("text"), OutContent))
        return true;
}

// OpenAI: check for an error object first, then choices[0].message.content
const TSharedPtr<FJsonObject>* ErrorObj;
if (Json->TryGetObjectField(TEXT("error"), ErrorObj)) { /* surface a friendly message */ }

Marshalling back to the game thread

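FHttpModule normally delivers completion delegates on the game thread, but that is a per-request policy rather than a guarantee, so the delegate can fire off the game thread, and Slate must only be touched on the game thread. So the UI handler hops back explicitly before adding anything to the chat: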

void QCE_AIContainer::HandleMessageResponse(const FString& Response, bool bSuccess) const
{
    AsyncTask(ENamedThreads::GameThread, [this, Response, bSuccess]()
    {
        MessageList->RemoveLoadingMessage();
        if (bSuccess) MessageList->AddAIResponse(Response);
        else          MessageList->ShowError(Response);
    });
}

One caution worth flagging honestly: the lambdas here capture this as a raw pointer. The factory clients live for the session, so binding to them is safe, but a Slate widget captured by raw this would dangle if it were destroyed while a request was in flight. The robust version captures a TWeakPtr to the widget and calls Pin() inside the lambda, doing nothing if the pin fails. For an editor tool with a session-long panel the risk is low, but it is the kind of thing to get right before shipping something long-lived.
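
The weak-capture pattern is easy to demonstrate outside the engine. The sketch below uses the standard library as a stand-in: std::weak_ptr and lock() play the roles of TWeakPtr and Pin(), and std::enable_shared_from_this plays the role of Slate's SharedThis(). FChatPanel and MakeResponseHandler are hypothetical names, not the plugin's classes.

```cpp
#include <functional>
#include <memory>
#include <string>
#include <vector>

// Stand-in for a Slate chat panel that receives responses asynchronously.
class FChatPanel : public std::enable_shared_from_this<FChatPanel>
{
public:
    std::vector<std::string> Shown;

    // Returns a callback that stays safe to invoke after the panel is destroyed.
    // Must be called on a panel that is owned by a std::shared_ptr.
    std::function<void(const std::string&)> MakeResponseHandler()
    {
        std::weak_ptr<FChatPanel> WeakSelf = shared_from_this();
        return [WeakSelf](const std::string& Response)
        {
            // lock() is the std equivalent of TWeakPtr::Pin(): it yields a
            // strong pointer only while the panel is still alive.
            if (std::shared_ptr<FChatPanel> Self = WeakSelf.lock())
                Self->Shown.push_back(Response);
            // Otherwise the panel went away mid-request; drop the response.
        };
    }
};
```

A late response then becomes a no-op instead of a dangling-pointer access, at the cost of one extra pointer check per callback.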

What to take away

  • You need no SDK: FHttpModule plus FJsonSerializer (add HTTP and Json to your module's dependencies in its Build.cs) are enough to call Claude or OpenAI directly.
  • Make one client serve both providers by holding the differences as data (endpoint, auth header, response shape); branch only where the APIs genuinely diverge.
  • The request is async: ProcessRequest() returns immediately, the callback carries your continuation, and you must marshal back to the game thread before touching Slate.
  • Bound conversation history with a simple tail window to keep token cost predictable.

The same client also powers inline code completion, where the context you send is the difference between a cheap, accurate suggestion and a wasteful one. The full series is Building an AI Code Editor Inside Unreal Engine; the finished plugin is AI Node Code Editor on FAB. If you want to write Unreal C++ with an AI rather than build the editor, see How to Use Claude or ChatGPT to Write C++ in Unreal Engine.

Frequently asked questions

Do you need a third-party HTTP or JSON library to call an LLM from Unreal?
No. Unreal ships FHttpModule for async HTTP and FJsonSerializer for building and parsing JSON. Add HTTP and Json to your module dependencies and you can call the Anthropic or OpenAI REST API directly, no external SDK required.

How do you support both Claude and ChatGPT without duplicating the client?
Capture the per-provider differences as data: endpoint, auth header name and prefix, and version header. One client class branches only where the providers genuinely differ (how the message list is built and how the response is parsed), so most of the code is shared.

What is the difference between the Claude and OpenAI request shapes?
Both accept model, messages and max_tokens. The divergence is the system prompt: OpenAI takes a message with role system, while the Anthropic Messages API expects system text in a top-level system field rather than in messages, and this integration instead carries the context as the first user message. The response shapes also differ: Claude returns content[0].text, OpenAI returns choices[0].message.content.

Why must the HTTP callback marshal back to the game thread?
FHttpModule completion delegates can fire off the game thread, and Slate widgets may only be touched on the game thread. Wrapping the UI update in AsyncTask(ENamedThreads::GameThread, ...) moves it back before adding the response to the chat, avoiding a crash.