Nick's Path to AI enlightenment

An LLM is stochastic, so it will generate different code each time, which makes the code extremely difficult to maintain if you want to adjust it over time.
 

No, an LLM is deterministic (at least in theory), but this gets a bit complicated, and the main issues w.r.t. coding are somewhat separate from that.

With the temperature set to 0 an LLM should be deterministic (in theory it is), but in practice there can be small sources of randomness. It's less of an issue with a smaller LLM on a single device, but once you're running floating point operations in parallel across multiple GPUs, and indeed across multiple expert sub-networks in a MoE (mixture-of-experts) setup, you end up introducing a small amount of non-deterministic behaviour.
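To make the floating point bit concrete, here is a minimal sketch in plain Python (not anything GPU-specific, and `sum_in_order` is just a hypothetical helper for illustration). It shows that floating point addition is not associative, so if parallel hardware adds the same numbers up in a different order, the result can differ slightly:

```python
def sum_in_order(values):
    """Add the values strictly left to right, like one fixed reduction order."""
    total = 0.0
    for v in values:
        total += v
    return total


# Same three numbers, two different groupings.
left_first = (0.1 + 0.2) + 0.3
right_first = 0.1 + (0.2 + 0.3)
print(left_first == right_first)  # False

# Same three numbers, two different orders: the small value is lost in one of them.
print(sum_in_order([1e16, 1.0, -1e16]))  # 0.0
print(sum_in_order([1e16, -1e16, 1.0]))  # 1.0
```

If two candidate tokens have almost identical scores, a rounding difference of this kind could in principle be enough to flip which one comes out on top, even at temperature 0.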

But the issues with coding are more to do with things like: losing context when dealing with large code bases, having no knowledge of domain-specific libraries, and not (necessarily) being able to validate outputs unless the LLM has a runtime environment (some do now). It can also output needlessly verbose code when simpler solutions exist.

Those issues are much more of a problem if someone expects to simply prompt an LLM and get a one-shot solution. That's not to say you can't get such a solution (you could one-shot a Flappy Bird game, but then again the LLMs have seen multiple Flappy Bird clones). The better way to use them is as a "co-pilot", generating snippets of code while you, the coder, stay aware of the context, the other libraries etc.
 
An LLM in a generative capacity uses random noise.

An LLM without learning, i.e. one that is static, results in a distribution, as you’ve stated. As it doesn’t learn, the paths don’t change.

Sorry, I was thinking of agentic use of things like ChatGPT, a gen AI.

Surely the selling point is that gen AI is dynamic... and that's the reason for the painful fun.
Switching it off (temp zero) means you may as well make the code with an AI once, then use that.
 
To be clear it's not so much that it uses random noise (obvs aside from changing the temperature) but that in some scenarios random noise is going to creep in.

LLMs are not dynamic. You can fine-tune them (before deploying) to change them, but that's more tweaking the top layers.
 


I think you are mixing up a few concepts.

LLMs have a training phase when they learn, and you can fine-tune them, but they don't learn anything during inference, so what they have learned is static.

The selling point of genAI is certainly not to be stochastic for the most part, although there are some use cases, especially image generation etc., where that is useful. The most widely used way to run an LLM is likely with temperature at zero, because you get consistent results. The output will be dynamic based on the input prompt, but for most use cases the same prompt should always give identical output. As Dowie says, for technical reasons related to floating point arithmetic at scale this isn't guaranteed, and it is one of the biggest limitations of LLMs. Personally, this has blocked several projects from going to production.


Remember that an LLM simply produces a probability distribution over the next token. Sampling from that distribution is controlled by temperature, top-p, top-k and other factors depending on the model.
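As a rough illustration of how those knobs interact, here is a toy sampler in plain Python. It is only a sketch under my own assumptions: the function name `sample_next_token`, the made-up logits and treating temperature 0 as "pick the top token" are all hypothetical, and real inference stacks differ in the details.

```python
import math
import random


def sample_next_token(logits, temperature=1.0, top_k=None, top_p=None, rng=random):
    """Pick one token from a {token: logit} mapping (toy example)."""
    # Temperature 0 is treated here as greedy decoding: take the top-scoring token.
    if temperature == 0:
        return max(logits, key=logits.get)

    # Softmax over temperature-scaled logits (subtract the max for stability).
    scaled = {tok: score / temperature for tok, score in logits.items()}
    peak = max(scaled.values())
    weights = {tok: math.exp(s - peak) for tok, s in scaled.items()}
    total = sum(weights.values())
    ranked = sorted(((tok, w / total) for tok, w in weights.items()),
                    key=lambda pair: pair[1], reverse=True)

    # top-k: keep only the k most likely tokens.
    if top_k is not None:
        ranked = ranked[:top_k]

    # top-p: keep the smallest prefix whose cumulative probability reaches p.
    if top_p is not None:
        kept, cumulative = [], 0.0
        for tok, prob in ranked:
            kept.append((tok, prob))
            cumulative += prob
            if cumulative >= top_p:
                break
        ranked = kept

    tokens = [tok for tok, _ in ranked]
    probs = [prob for _, prob in ranked]
    # random.choices works with unnormalised weights, so no renormalising is needed.
    return rng.choices(tokens, weights=probs, k=1)[0]


# Made-up logits for the token after "The capital of France is".
logits = {"Paris": 5.0, "Lyon": 2.0, "London": 1.0, "banana": -3.0}
print(sample_next_token(logits, temperature=0))             # always "Paris"
print(sample_next_token(logits, temperature=1.0, top_k=2))  # "Paris" or "Lyon"
```

With temperature at 0 the call is a pure function of the logits, which is where the "same prompt, same output" expectation comes from.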

If a problem can be solved well using alternative, simpler ML models when the temperature is 0, then it can almost always be solved with high temperatures. And indeed the most useful models tend to operate on probability distributions anyway, given that statistics and distributions of data are essentially the heart of ML.



The likes of Copilot will most likely have temperature at 0. Temperature at zero just means the most likely next token is selected. If you want your LLM to, say, answer questions about the capital cities of countries, then you don't want it to respond with anything other than the most likely token.
 
I’ll recheck my notes when I have a chance.

I know about the temp, p & k hyperparameters and, for non-LLM, the explore/exploit ratio.

Noise being used in training input to desensitise (has a similar effect to downsizing).
Noise also being used in gen ai where the text input is also supplemented with random input to generate output.

I’m also aware of the LLM process (use of vector maps etc).

The “agentic” process of feedback isn’t static compared to old agents. I can see the LLM can be set up either way with 0 temp. True, but most of the hype is people pushing for both operating reinforcement and also starting to look at non-zero temp (seems a bad idea to me).

I’ll have a more in-depth look again next week as I want to write my own agents.
 
Can ChatGPT code?
Try Claude 3.5 Preview, it's miles ahead of ChatGPT 4.

I'm using it as an autopilot in Visual Studio Code. It seems to always give working code, and I've never had the same error twice.
If I say it's broken, it fixes it.
I just have to hit save then run the code. It's amazing so far.

ChatGPT made me want to punch the screen. I knew about 70% of the time its code wouldn't work, and then it would only fix that code maybe 50% of the time at best...

Claude 3.5 had me a model trained in about 10 minutes... with the simplest of instructions... I double checked it was taking into account capitals, non-capitals, all the special symbols etc. I didn't even need to remind the bot that you can't name images like /.png..
I never had to explain that we need to train the model with different font sizes... it just did it all properly of its own accord.

When I copy/pasted the confidence scores from the terminal... omg, it even remembers it's for reading text from game screenshots!

Yes, the OCR system is working! It's detecting characters with good confidence scores (many above 0.90). However, I notice a few areas we could improve:

1. Let's enhance the word grouping to better handle text in game screenshots:

### [ocr_front_pipeline.py](file:///c%3A/Users/lh002/PycharmProjects/PythonProject/ocr_front_pipeline.py)

[SNIPPED CODE]

These improvements will:
1. Better handle vertical alignment of text
2. Filter out single characters (likely false positives)
3. Add bounding box information
4. Improve confidence thresholds
5. Better group characters into actual words

Would you like to test these improvements with your screenshots?


I even had to remind ChatGPT that I was in a Windows environment almost every 5 minutes... it kept trying to use Unix-style paths with slashes that weren't compatible with Windows :/

it was trying to find files in like c:/dataset\synthetic_images\

which obviously isn't a valid dir... with mixed slashes / \
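For what it's worth, one way to sidestep the separator problem entirely is to let Python's `pathlib` build the path rather than gluing strings together. A minimal sketch, assuming the directory names from the example above (they are only illustrative, not my real layout):

```python
from pathlib import PureWindowsPath

# Hypothetical base directory, written with a forward slash on purpose.
base = PureWindowsPath("c:/dataset")

# The / operator joins path parts; pathlib picks the separator itself.
images_dir = base / "synthetic_images"
print(images_dir)  # c:\dataset\synthetic_images

# A string with mixed separators is normalised to the same path.
mixed = PureWindowsPath("c:/dataset\\synthetic_images\\")
print(mixed == images_dir)  # True
```

`PureWindowsPath` only manipulates the text of the path, so this runs on any OS. In real code on Windows, `pathlib.Path` behaves the same way and can also touch the filesystem.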


Claude 3.5 is flawless so far. Technically it's GitHub Copilot using the Claude 3.5 Sonnet preview.
 