Mar 4, 2025

Navigating embedding vectors


AI, feedback & the need for greater user control.

As of March 2025, we still lack meaningful control over AI-generated outputs. From a user experience point of view, most of the time this is acceptable. However, when using AI tools to help with complex information discovery or nuanced creative tasks, the prompting process quickly becomes convoluted, imprecise and frustrating. Technically, this shouldn’t need to be the case.

Every time we revise a prompt, a new cycle of input and output tokens is generated. This is an awkward way of working when you are homing in on a final output. The back-and-forth text prompting needed to direct AI tools is inefficient, quickly strays from naturally constructed phrases, and lets previous incorrect responses pollute the attention mechanism.

This lack of predictability currently prevents users from gaining an intuitive working knowledge of AI tools, which in turn limits the models’ capabilities.

What If?

What if we had customisable UI controls that would allow users to navigate towards a desired output without having to use imprecise language prompts?

Older electronic products had direct mechanical feedback between a user’s input and a corresponding action. This experience feels distant when using current AI tools. But does this need to be the case?

Dieter Rams. World Receiver T 1000 Radio, 1963. Brooklyn Museum.

Why Is This Better?

This isn’t just about convenience — it’s about creating a more natural way for users to collaborate with AI tools and harness their power. The most efficient way for users to solve problems is to learn by doing. The most natural way is by trial, error and refinement. Rewriting a prompt resets all the input token embeddings, which means that users lose any sense of control when working with AI tools.

A more sensible approach would be to allow users to move through the AI model space and let them navigate to a desired outcome.

Wireflow: Enhancing AI prompts with a control panel and concept vector sliders.

Erm, I Still Don’t Get It

To illustrate this concept more clearly, let’s use an analogy. Imagine a game where the multi-dimensional geometry of an AI model is represented by intergalactic space. Each time you prompt, a spaceship pops up somewhere in this intergalactic space. You have a destination in mind — say, a specific star system that you want to explore. At the moment, the only way to navigate towards your star system is to prompt. Each time you do so, the spaceship teleports to another somewhat random position. You are unsure whether your new prompt will land closer to or further from your destination. Your prompts balloon in length, and your uncertainty increases as each additional word has less impact on the spaceship’s position.

If, on the other hand, you had navigational controls, instead of blindly jumping about the universe, you could increase or decrease various values and more effectively learn which values move you towards your destination.

You might find that you need to re-prompt a couple of times first to start closer to your destination. But when you’re closing in, being able to navigate through the vector space with sliders is significantly more effective.

(“But what about prompt weightings? By adding + and − to words in a prompt it is possible to change their importance!” > This is a useful hack, but it isn’t intuitive or efficient. With successive, lengthy prompts users are still blindly guessing with new token embeddings.)

What’s Needed For A UI Control Panel?

UI controls would need to be inferred from each prompt. The input embeddings go through many cycles of attention processing, so controls would need to directly alter the prompt’s final input embedding vectors — prior to the output content generation process.

Proposed and existing data flow through an LLM attention head.

So, How Could This Work?

When a prompt is being processed, a copy of the final input vector embeddings would need to be stored prior to output generation. From these copied embeddings, it should be possible to infer the most relevant values to provide as controls. It should also be possible to allow users to input their own values.

If a user needs to fine-tune an output, they could adjust controls that would shift the token embeddings. These new embeddings would be fed directly into output generation, skipping the re-processing of the input prompt.
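
To make that two-step flow concrete, here is a minimal sketch, assuming a Hugging Face causal language model and a recent version of the transformers library whose generate method accepts inputs_embeds. It edits the token embeddings at the model’s input layer, which is the closest off-the-shelf stand-in for the ‘final input embeddings’ described above; the concept direction and slider value are illustrative placeholders for what a control panel would supply.

```python
# Sketch only: cache the prompt's token embeddings, nudge them along a
# hypothetical concept direction, and generate directly from the edited
# embeddings instead of re-processing a rewritten prompt.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

prompt = "Plan a three-day trip to Lisbon focused on architecture."
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    # Step 1: compute and cache the prompt's input embeddings.
    embeds = model.get_input_embeddings()(inputs["input_ids"])  # (1, seq_len, d_model)

    # Step 2: the "slider" shifts every token embedding along a concept direction.
    # The direction is random purely for illustration; in practice it would be
    # derived from labelled examples or user-supplied values.
    concept_direction = torch.randn(embeds.shape[-1])
    concept_direction = concept_direction / concept_direction.norm()
    slider_value = 0.5  # user-controlled strength

    steered = embeds + slider_value * concept_direction

    # Step 3: regenerate from the edited embeddings, skipping prompt re-processing.
    output_ids = model.generate(
        inputs_embeds=steered,
        attention_mask=inputs["attention_mask"],
        max_new_tokens=60,
    )

print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

In a real control panel the slider would map to a meaningful, named direction and the cached embeddings would persist between adjustments, so each nudge costs only a generation pass rather than a full prompt rewrite.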

While I’m at the edge of my knowledge of ML models, it seems that mathematically it might be possible to effect a change in the token embeddings by altering the Value ‘V’ in the equation below.

Attention(Q, K, V) = softmax(QKᵀ / √dₖ) V

This mathematical equation describes the attention head layer within a Large Language Model. Query (Q) relates to the token generation of the input prompt. Key (K) maps the input prompt to the model space. Value (V) is a weighting layer that intentionally guides output generation.
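
For readers who want to see the moving parts, here is a toy NumPy version of that equation. It shows that nudging the value vectors (or the embeddings they are projected from) changes the attention output without touching the prompt itself. All numbers are random; nothing here comes from a real model.

```python
# Toy illustration of Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V,
# and of how shifting V along a concept direction moves the output.
import numpy as np

rng = np.random.default_rng(0)
seq_len, d_k = 4, 8

Q = rng.normal(size=(seq_len, d_k))
K = rng.normal(size=(seq_len, d_k))
V = rng.normal(size=(seq_len, d_k))

def attention(Q, K, V):
    scores = Q @ K.T / np.sqrt(d_k)                      # query-key similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)       # row-wise softmax
    return weights @ V                                   # weighted mix of value vectors

baseline = attention(Q, K, V)

# A "slider" nudges the value vectors along a single concept direction.
concept_direction = rng.normal(size=d_k)
concept_direction /= np.linalg.norm(concept_direction)
steered = attention(Q, K, V + 0.5 * concept_direction)

print("change in attention output:", np.linalg.norm(steered - baseline))
```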

Where This Approach Works Best

Working With Near-Known & Unknown Information — When new information can shift a user’s initial intention. E.g., Travel Planning > If a user wrote an initial prompt for a personalised travel itinerary, they could then shift subjective parameters to tailor the plan without having to re-write long prompts.

Content Generation — The tasks that stand to gain the most are creative ones, where it’s beneficial for the ‘temperature’ parameter to be higher. E.g., when using image generation tools, users either have a conscious target in mind that they are trying to match, or they discover what ‘feels right’ as they use the generative tool. Endless prompting harms the creative process and is computationally expensive. Concept vector sliders should expand a user’s creative flow state rather than frustrate it.

Deep Research | Searching Within Complex Vertical Databases — Interrogating data with nuanced vector-based search would be useful for scientific experiments that involve large databases. E.g., for research studies attempting to map animal communication, it might be useful to explore the contextual differences in the way animals communicate: the same sound pattern might be produced but expressed differently depending on comfort and safety versus threat and danger. Navigating a database with UI sliders that control various embedding vectors and provide feedback analysis on search terms could be useful, as sketched below.
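
Here is the sketch referenced in the last point: a query embedding is shifted along a single concept direction (say, comfort versus threat) and the database is re-ranked by cosine similarity. The database, query and concept direction are random placeholders standing in for embeddings produced by a trained encoder.

```python
# Sketch of "navigating a database with sliders": move the query along a
# concept axis and re-rank stored embeddings by cosine similarity.
import numpy as np

rng = np.random.default_rng(1)
d = 64
database = rng.normal(size=(1000, d))            # embeddings of stored recordings
database /= np.linalg.norm(database, axis=1, keepdims=True)

query = rng.normal(size=d)                        # embedding of the search term
concept = rng.normal(size=d)                      # hypothetical "threat" direction
concept /= np.linalg.norm(concept)

def search(slider_value, top_k=5):
    """Shift the query along the concept axis, then rank by cosine similarity."""
    q = query + slider_value * concept
    q /= np.linalg.norm(q)
    scores = database @ q
    return np.argsort(scores)[::-1][:top_k]

print("slider = -1:", search(-1.0))
print("slider = +1:", search(+1.0))
```

Each slider movement simply re-runs the ranking with a different shift, so the user gets immediate feedback on how the concept axis reshapes the results.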

Generative AI: Two Example Use Cases

1. Writing | Feedback & Modulation Control

Before making style changes to text, it would be useful for writers to receive feedback. As I’m writing this article, for example, when I’m deep in a writing flow I’m unsure whether I’m keeping an acceptable level of complexity and tone across sections. Variance, of course, is OK, but feedback would be helpful.

Then, when making style changes, users need more precise control. Default commands, such as Apple Intelligence’s ‘Friendly’, ‘Professional’ and ‘Concise’, or Gemini’s ‘Rephrase’, ‘Shorten’ and ‘Elaborate’, offer little feedback or control. How ‘Friendly’ or ‘Professional’ is the text to begin with? And when applying the change, how much more ‘Friendly’, ‘Professional’, ‘Shorter’ or ‘Longer’ does the user want it to be? Perhaps there are also more nuanced stylistic changes that I’d like to explore.
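
As a rough illustration of what that baseline feedback could look like, the sketch below places a piece of text on a crude friendly-versus-formal axis built from a handful of anchor sentences, using a small sentence-embedding model. The anchor phrases and the model choice are assumptions made purely for illustration; this is not how Apple Intelligence or Gemini implement their style commands.

```python
# Sketch: report "how friendly is this text?" as a projection onto an axis
# defined by contrasting anchor sentences.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

friendly = model.encode(["Hey! Lovely to hear from you, hope all's well."])
formal = model.encode(["Dear Sir or Madam, I am writing to inform you."])
axis = (friendly - formal).flatten()
axis /= np.linalg.norm(axis)

def friendliness(text: str) -> float:
    """Project the text embedding onto the friendly-formal axis (higher = friendlier)."""
    v = model.encode([text]).flatten()
    return float(np.dot(v / np.linalg.norm(v), axis))

print(friendliness("Please find attached the requested documentation."))
print(friendliness("Thanks so much, this is brilliant! See you Friday."))
```

A control panel could surface a score like this next to each slider, so ‘make it friendlier’ becomes a measured nudge from a known starting point rather than a blind command.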

An initial mock-up of how a simple control panel could function within Google Docs’ existing UI.

So Wait, What’s New?

Feedback — Users can quickly review a text based on customisable values.

User Interface Controls — Following feedback, users can then make informed and confident changes along several nuanced concept vectors at once, without a multi-step prompt dialogue. Using these concept sliders, users can pinpoint a specific intention that might be difficult or inefficient to describe with words.

Easier Development, Deployment & Modulation of Personal Styles — A fully customisable control panel can help users create and deploy a personal style and then modulate it for a given context.

The impact of document analytics and vector sliders like this would be considerable. Instead of giving full agency to AI to re-write texts, using a copilot to quickly analyse and variably modulate text could help users be more intentional with their writing and improve their writing skills rather than losing them to AI.

2. Multi-Media Content Generation

Compared to text-based LLMs, text-to-media generation tools currently suffer from an even greater lack of traction between intention, prompt and output. This is because they combine two huge model spaces, one for analysing the text input and one for the output media, which have to be matched together.

As well as media-labelling issues and black holes within training data (e.g. there are hardly any images of wine glasses that are full to the brim), another significant problem is a UX one.

Users lack an intuition for how to prompt text-to-image models effectively. With vector sliders, users would have greater certainty about whether a desired outcome is even achievable in the model, rather than being left to wonder whether the prompt is at fault. By removing the uncertainty involved with prompts, users would increasingly enjoy working with generative AI tools and be more effective with fewer prompt attempts overall. Efficiencies in text prompting can only be beneficial from a business standpoint.

Mock-up of a text-to-image generator showing the usefulness of subjective concept vectors.

I’m Almost Lost Again, What’s New?

Two-Step Prompts | Text + Concept Vector Sliders — With a more straightforward initial prompt, users could now make further changes using subjective concept vectors. In the above example, ‘atmosphere’ is added to the image. There is feedback on how atmospheric the image is, which informs the user when changing this value (a rough scoring sketch follows below).

Control Panels Change the Final Input Embeddings — This is crucial. When users decide to make a change, they would now be able to carefully fine-tune an existing prompt without reshuffling all the vector embeddings.
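
The feedback half of this is already within reach. The sketch below scores a generated image against two text anchors with CLIP (via the Hugging Face transformers library), giving the kind of ‘how atmospheric is it right now?’ reading a UI could display beside the slider. The anchor phrases and file path are illustrative assumptions, and this only measures the image; steering the generator itself would still require access to its embeddings.

```python
# Sketch: use CLIP to score how "atmospheric" a generated image currently is,
# so a slider can show feedback before and after each adjustment.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.open("generated_image.png")  # placeholder path for a generated image
anchors = ["a moody, atmospheric scene", "a flat, plainly lit scene"]

inputs = processor(text=anchors, images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    outputs = model(**inputs)

# Probability mass CLIP assigns to the "atmospheric" anchor for this image.
probs = outputs.logits_per_image.softmax(dim=-1)
print(f"atmosphere score: {probs[0, 0].item():.2f}")
```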

It took over an hour of repeated prompts to Adobe Firefly to get the three images for the above mock-up. Every time I re-prompted Firefly, I felt as though I was playing roulette. I was never certain what any of Firefly’s controls or presets were doing. Perhaps it’s a skill issue, but even after finding an image to use as a firm compositional lock and as a style-transfer reference, I was frustrated by an inability to nudge the image in any meaningful, non-random way.

It definitely feels as though something is going wrong. These models are incredibly powerful, and they should be able to handle incremental changes and nuanced inference. There is clearly a lot of untapped potential in the combination of LLMs and diffusion models.

Doing More With Less. Why This Is Worth Pursuing.

Part of the problem with prompt engineering is that users have to communicate to an AI that has an unknown exposure to the world. Users don’t know what information they need to provide to an AI or how that information should be provided. To make matters worse, models frequently change, and in turn, their sensitivities to words and phrases change.

If users had greater control of the model space, this would ease some of these tensions. Users could write shorter prompts to establish a baseline, which they could then refine with concept vectors. A multi-step user interface means shorter, less perfect and more efficient prompts, with increased fine control of the output for the ‘last mile’ of accuracy.

A two-step process of prompting and then fine-tuning the final input embeddings should also be more computationally efficient. From a UX perspective it would be more satisfying, because this method is in sync with how we think and work — particularly when working through unknown problems and when needing Generative AI to perform at higher ‘temperatures’ (hallucinations) for creative work.

Notes

The ideas in this article can be seen as part of the wider evolving research and discussion surrounding the Large Concept Models being developed by Meta. Essentially, this is an LLM that is specifically organised around conceptually related terms, an approach that should make navigating concepts more predictable and reliable from a user experience perspective. Articles for further reading:
– Mehul Gupta’s ‘Meta Large Concept Models (LCM): End of LLMs?’
– Vishal Rajpjut’s ‘Forget LLMs, It’s Time For Large Concept Models (LCMs)’

I first encountered Concept Activation Vectors (CAVs) in 2020, while working alongside Nord Projects on a research project for GoogleAI. This project, which explored subjectivity, style and inference in images, won an Interaction Award (IxDA). The idea of identifying and working with subjective inference, which Nord Projects explored, has stayed with me ever since. It has influenced the central ideas of this piece and shaped my thinking on how similar concepts could be applied as user controls within LLM and GenAI models.

References

Attention In Transformers, step-by-step
Grant Sanderson (3Blue1Brown YouTube channel)
https://www.youtube.com/watch?v=eMlx5fFNoYc

Large Language Models II: Attention, Transformers and LLMs
Mitul Tiwari
https://www.linkedin.com/pulse/large-language-models-ii-attention-transformers-llms-mitul-tiwari-zg0uf/

Attention Is All You Need
Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin
https://arxiv.org/abs/1706.03762

What Is ChatGPT Doing … and Why Does It Work
Stephen Wolfram
https://writings.stephenwolfram.com/2023/02/what-is-chatgpt-doing-and-why-does-it-work/

King — Man + Woman is Queen; but why?
Piotr Migdał
https://p.migdal.pl/blog/2017/01/king-man-woman-queen-why

Don’t Use Cosine Similarity Carelessly
Piotr Migdał
https://p.migdal.pl/blog/2025/01/dont-use-cosine-similarity

Open sourcing the Embedding Projector: a tool for visualizing high dimensional data
Daniel Smilkov and the Big Picture group
https://research.google/blog/open-sourcing-the-embedding-projector-a-tool-for-visualizing-high-dimensional-data/

How AI ‘Understands’ Images (CLIP)
Mike Pound (Computerphile)
https://www.youtube.com/watch?v=KcSXcpluDe4

www.tomhatton.co.uk

