💸 Your "Hello" to an LLM is not free
Interesting fact: every message you send to an LLM costs tokens.
And that includes the throwaway ones.
A simple:
- "Hello"
- "Hey"
- "How are you?"
…still gets processed, still generates a response, and still eats into a budget you usually do not even see.
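To make that concrete, here is a minimal sketch of a token estimate. Real tokenizers (such as tiktoken for OpenAI models) count exactly; this uses the common rule of thumb of roughly 4 characters per token, so the numbers are illustrative, not billing-accurate.

```python
def estimate_tokens(text: str) -> int:
    """Very rough token estimate: ~4 characters per token (rule of thumb)."""
    return max(1, len(text) // 4)

greeting = "Hello"
question = "Summarize the attached report in three bullet points."

print(estimate_tokens(greeting))   # small, but never zero
print(estimate_tokens(question))
```

Even the shortest greeting rounds up to at least one token, and that is before the model generates a reply, which is billed too.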
⚙️ Every message counts
That means the useless messages count too.
If your session budget is limited, then filler is not just harmless politeness.
It is wasted capacity.
The model does not care that your greeting was casual.
It still has to read it, reason over it, and include it in the conversation state.
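The "conversation state" part is easy to see if you sketch how chat-style APIs work: they are stateless, so each request resends the full message list. The schema below follows the common OpenAI-style `role`/`content` format as an assumption; the point is only that turn 1's greeting rides along in every later request.

```python
# Sketch: chat APIs resend the whole history each turn,
# so a greeting from turn 1 is reprocessed on every later turn.
history: list[dict] = []

def send(user_text: str) -> list[dict]:
    """Append the user message and return the full payload the API would see."""
    history.append({"role": "user", "content": user_text})
    payload = list(history)  # everything so far goes over the wire
    history.append({"role": "assistant", "content": "..."})  # stub reply
    return payload

send("Hello")                        # turn 1: just the greeting
payload = send("Now fix this bug")   # turn 2: the old "Hello" rides along
print(len(payload))                  # 3 messages, greeting included
```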
🧠 Engineers already think this way
People who work with LLMs at scale usually do the opposite of casual users.
They:
- remove filler
- front-load context
- structure prompts clearly
- spend tokens only where it matters
Because they know prompt space is a resource, not a chat room.
⚠️ Most people do the reverse
Most people:
- start with greetings
- paste messy text
- ask vague questions
- let long threads pile up
And then wonder why answers get worse, slower, or more limited.
Because chat models reprocess the full conversation history on every turn, each extra message makes every message after it more expensive.
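A quick back-of-the-envelope sketch shows why this compounds. If every turn resends the whole history, the total input the model reads grows roughly quadratically with the number of turns. The token counts below are made-up round numbers for illustration.

```python
def total_input_tokens(turn_sizes: list[int]) -> int:
    """Tokens the model reads across a session, assuming each turn
    resends all previous turns as context."""
    total = 0
    running = 0
    for size in turn_sizes:
        running += size
        total += running  # this turn's request = the entire history so far
    return total

# Five 100-token turns: the model reads 100 + 200 + ... + 500 tokens.
print(total_input_tokens([100] * 5))        # → 1500

# Prepend a 5-token "Hello": it is re-read on each of the next five turns.
print(total_input_tokens([5] + [100] * 5))  # → 1530
```

A 5-token greeting costs 30 tokens of reads here, because it is processed six times, not once, and the longer the thread runs, the more that multiplier grows.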
📉 The real power move
The people getting the most out of Claude or ChatGPT are not always the ones asking for bigger limits.
They are often the ones wasting the fewest tokens.
Less filler.
More signal.
🔥 So yes… that "Hello" can be expensive
When your session budget is only a handful of meaningful turns,
that greeting might be the most expensive "Hi" you ever type.
Better to start with the actual task.