What is function calling?

Function calling is a feature introduced by OpenAI that allows you to describe functions (tools) to a language model, and have the model intelligently output a JSON object containing arguments to call those functions.

How does it compare to Anthropic's tool_use?

While the core concept is identical, the payload schemas differ. OpenAI uses a 'tools' array in the request and returns 'tool_calls', whereas Anthropic embeds 'tool_use' blocks directly into the content array.

The tool_choice parameter allows developers to force the model to call a specific function, force it not to call any functions, or let it decide automatically (the default behavior).

Why is fast execution important for function calling?

Since the model halts generation until the function result is returned, slow API connections introduce severe latency. wmcp.sh solves this with sub-100ms execution times and intelligent short TTL caching.

GLOSSARY · /GLOSSARY/FUNCTION-CALLING

Function Calling (OpenAI)

When a chat application needs to execute code, query a database, or connect to Acme Corp's internal APIs, it uses function calling. But mapping APIs to OpenAI's strict schema by hand is tedious and error-prone. wmcp.sh dynamically compiles these schemas for you.

the gap

The complexity of parallel function execution

Function calling—introduced by OpenAI—allows developers to pass an array of available tools in the tools array of a chat completion request. The model then intelligently decides which function to invoke, providing the necessary JSON arguments.

However, handling the lifecycle of these calls is difficult. Models now support parallel tool calls, meaning the model might ask to execute three functions simultaneously. Your infrastructure must securely hold credentials, execute the network requests, and manage retries. wmcp.sh acts as an execution gateway that handles this automatically, utilizing an encrypted credentials vault for static keys to keep your environment secure. (Note: wmcp.sh is not affiliated with OpenAI, Anthropic, or any other mentioned orgs.)

the flow

Executing a function call with OpenAI

import OpenAI from "openai";

const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

// Fetch dynamic tool schemas from wmcp.sh rather than hardcoding
const toolSchemas = await fetch("https://api.wmcp.sh/v1/tools?provider=openai").then(r => r.json());

const response = await openai.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "Fetch the latest logs." }],
  tools: toolSchemas,
  tool_choice: "auto"
});

// Inspect the model's decision
const toolCalls = response.choices[0].message.tool_calls;
if (toolCalls) {
  for (const call of toolCalls) {
    console.log("OpenAI requested function:", call.function.name);
    // In production, wmcp.sh executes these in parallel
  }
}

capability

Managing Function Calls at Scale

Capability	Without wmcp.sh	With wmcp.sh
Parallel Execution	⚠️ Requires custom Promise.all() wiring.	✅ Handled concurrently at the edge.
Schema Generation	❌ Manually written JSON Schema types.	✅ Auto-generated from OpenAPI.
Performance	⚠️ Cloud provider routing overhead.	✅ Sub-100ms latency execution.
Caching	❌ Duplicate queries sent to your DB.	✅ Short TTL (~1s) deduplication cache.

FAQ

Common questions.

What is the difference between tools and functions?

In the OpenAI API, "functions" are a specific type of "tool". Currently, function calling is the primary tool type, so the terms are often used interchangeably, though the API expects a wrapper object of type 'function'.

What does tool_choice do?

It dictates behavior. 'none' disables function calling, 'auto' lets the model choose, and specifying a precise function name forces the model to generate arguments for that specific function.

How does this compare to Anthropic's implementation?

While OpenAI relies on the 'tool_calls' array attached to the message object, Anthropic uses 'tool_use' blocks embedded directly within the message content array. Both achieve the same result. wmcp.sh normalizes both formats seamlessly.

How do I prevent rate limits during parallel calls?

Uncached parallel function calls can quickly hit target API rate limits. wmcp.sh utilizes a short TTL caching layer (~1s) to collapse duplicate requests in the same reasoning loop.

Need this done for you?

Skip the wiring — we build, deploy, and monitor.

Custom adapter + hosted MCP at mcp.yourbrand.com + verified badge. Pricing: Starter $499 one-time, Managed Retainer $999/mo, or Enterprise $4,999+/mo.

See /managed → Submit (free)