ChatGPT - Seriously good potential (or just some Internet fun)

There are a number of gotchas with DeepSeek:

1. The T&Cs indicate that all output is their IPR.
2. Data that goes in stays in, and remains theirs for training etc.
3. It's been shown to generate code that has vulnerabilities (including recommending libraries whose vulnerabilities hackers have already exploited).
4. It's been shown to avoid CCP-sensitive issues, so there's a question mark over bias.
5. The front layers have been shown to overfit - this makes it great on specific benchmarks, but longer term it's not as performant.

So realistically nobody can use it for the company.

Don't conflate the model with the API offered by the company itself (in China). Obviously they have to comply with Chinese laws, but the model is open source: you don't need to use their API when it can be self-hosted, or hosted elsewhere, without giving them any of your data or being subjected to whatever additional censorship they may be obliged to add to their API. People absolutely can use it (and indeed are doing so).
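
To make the self-hosting point concrete, here's a minimal sketch using the Hugging Face transformers library. It assumes one of the distilled open checkpoints; the exact model name is illustrative, so substitute whichever release you actually pull down:

# Minimal sketch of running the open weights locally - no call to DeepSeek's own API.
# Assumes the transformers + accelerate libraries and an illustrative checkpoint name.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-7B"  # assumed/illustrative checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "Explain the difference between a model and an API in one paragraph."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))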
 
Even if DeepSeek turn out to be untrustworthy, or don't provide a great service and eventually disappear as a business, the fact that they've been able to rival ChatGPT on a shoestring budget in such a short span of time is incredible. And if we assume this isn't just a one-off, it opens the door to far more open access to AI technology.

I think governments will have to act quickly to regulate the development of AI models.

It's not a fact though - there's no evidence to support the claim. Like, what are the chances that some small company just revolutionised model training against the likes of OpenAI, Google, Meta, etc.? It's going to take a couple of months to really distill what's happening here.

If they did create the model on their own, at a fraction of the cost, on lesser hardware, then that's great news. But if they just copied someone's homework, refined it to target benchmarks, had an extra government cash injection, etc., then it's nothing new.
 

It depends on the model itself. If the pre-trained model can generate the problems, then they will still remain. For example, the LLM can generate output that has vulnerabilities because the training data is biased or has bad references embedded (i.e. the weights encode text patterns that output bad code).

If the open-source release requires you to train the model yourself then it's nothing more than an empty husk, and I agree, provided the T&Cs support that use. However, users are still assuming it's great, and the real-world results may be a different story. We shall see.

Not many users can take an open-source project, adopt it, then maintain and improve it. So the speed of advancement is something to consider when weighing the ROI.

The release was designed to offset the US and Trump; however, a large portion of it has been blown up by the US media.
 


The model weights are also available. I expect the whole model to be on AWS Bedrock within a few months, and probably a guide to deploy it on SageMaker within days.
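
If that happens, consuming it would presumably look like any other Bedrock model. A rough sketch with boto3 - the bedrock-runtime converse call is real, but the DeepSeek model ID below is entirely hypothetical, since nothing is actually listed yet:

# Hypothetical sketch of calling the model via AWS Bedrock, assuming it gets listed.
# The modelId is made up for illustration; the converse API itself exists in boto3.
import boto3

client = boto3.client("bedrock-runtime", region_name="us-east-1")

response = client.converse(
    modelId="deepseek.r1-v1:0",  # hypothetical model ID
    messages=[{"role": "user", "content": [{"text": "Summarise what an open-weights model is."}]}],
    inferenceConfig={"maxTokens": 512, "temperature": 0.6},
)
print(response["output"]["message"]["content"][0]["text"])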
 
I don't think the perceived threat to the market is this particular model; it's the clear evidence that highly functional AI can be developed on much smaller budgets and trained on much cheaper hardware that has spooked the financial markets. People are betting huge sums of money on AI giving them huge profits down the line, which just won't be there if companies are able to undercut development costs by multiple billions.
 
DeepSeek is trained using other LLMs, so without the billions spent by other companies, it wouldn't exist. It's an interesting solution for more targeted models, but it doesn't undermine what's already been built.
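
For what it's worth, "trained using other LLMs" usually means some form of distillation: a smaller student model is tuned to imitate a larger teacher, either on the teacher's generated text or, when the teacher's output distribution is available, with a loss like the sketch below. This is a generic textbook illustration, not DeepSeek's actual recipe:

# Generic knowledge-distillation loss (PyTorch): the student is pushed towards the
# teacher's softened output distribution. Purely illustrative of the idea above.
import torch
from torch.nn import functional as F

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    t = temperature
    student_log_probs = F.log_softmax(student_logits / t, dim=-1)
    teacher_probs = F.softmax(teacher_logits / t, dim=-1)
    # Scale by t^2 so gradient magnitudes stay comparable across temperatures.
    return F.kl_div(student_log_probs, teacher_probs, reduction="batchmean") * (t * t)

# Toy usage with random logits standing in for real model outputs.
student_logits = torch.randn(4, 32, 32_000)   # (batch, sequence, vocab)
teacher_logits = torch.randn(4, 32, 32_000)
print(distillation_loss(student_logits, teacher_logits))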
 

Nvidia stock price says otherwise? Nearly 600bn wiped
 
Yeah, because the stock market is driven purely by fundamentals. People always panic, and it causes a chain reaction.

Anybody thinking we suddenly don't need these cards is mental. If anything, this makes building LLMs more accessible, i.e. more demand for GPUs.
 
It's also probably a relevelling. ChatGPT burst onto the scene and everyone related to it got a massive share price boost. Competitors emerge, and that is probably a realignment with reality. I expect Nvidia to drop further over time, but to stay way above where they were before ChatGPT showed up.
 
I expect them to drop over time if other companies are able to produce comparable hardware and the necessary tooling to take advantage of it. There are some on the horizon but this feels like a speedbump that will correct itself.
 

AWS and Google are making their own hardware, so it's natural.
 
I didn't think we'd see new technology in 2025, re my post in the other thread.

This is a welcome addition to AI technology.
 

Yeah, I don't think people understand this - there's lots of hand-waving about censorship, but it hasn't really sunk in that this is an open model: it doesn't have to be hosted in China, it doesn't require a censorship API, and other third parties will host it (it's not just China vs do-it-yourself). There's lots of buzz surrounding AI, but plenty of people misunderstand it.
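
As a concrete example of the "third parties will host it" point, most hosts expose open models behind an OpenAI-compatible endpoint, so the client code barely changes. The base URL and model name below are placeholders for whichever provider (or your own self-hosted server) you use:

# Sketch of calling the open model through a third-party host instead of DeepSeek's API.
# base_url and model name are placeholders; the openai client itself is unchanged.
from openai import OpenAI

client = OpenAI(
    base_url="https://example-host.invalid/v1",  # placeholder third-party endpoint
    api_key="YOUR_PROVIDER_KEY",                 # that provider's key, not DeepSeek's
)

reply = client.chat.completions.create(
    model="deepseek-r1",  # whatever name the host gives the open weights
    messages=[{"role": "user", "content": "Where can an open-weights model be hosted?"}],
)
print(reply.choices[0].message.content)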
 