ChatGPT - Seriously good potential (or just some Internet fun)

kissenger · 6 Mar 2024 at 15:48

We are on the verge of a once in a century technological breakthrough, and we're going to ruin it by obsessing over 'diversity'. The future is going to belong to the countries able to create the best AI, and this Gemini crap is categorically detrimental to the entire West in that regard.

dowie · 6 Mar 2024 at 15:58

kissenger said:
We are on the verge of a once in a century technological breakthrough, and we're going to ruin it by obsessing over 'diversity'. The future is going to belong to the countries able to create the best AI, and this Gemini crap is categorically detrimental to the entire West in that regard.

One of the Google founders, Sergey, admitted that it was a mistake when asked about it at a recent hackathon. Not sure I can post the tweet/video here though, as the guy who asked him about it is wearing a costume featuring a picture of a naked female torso...

Ayahuasca · 6 Mar 2024 at 16:12

dowie said:
It pretty much does indicate that though, and there isn't a GPT4.5 turbo (yet); there's a turbo iteration of GPT4 and, like I said, those are public benchmarks. GPT4 turbo, for example, scores 92.5% on GSM8K, 54% on MATH and 73.17% on HumanEval, see here:

Gemini 1.5 Pro vs GPT-4 Turbo Benchmarks

The evolution of AI language models is revolutionizing how we interact with technology. Among the latest advancements are Google’s Gemini 1.5 Pro and OpenAI’s GPT-4 Turbo. This article delves into a detailed comparison, shedding light on their capabilities, architecture, and potential impact...

bito.ai

And if you look at Anthropic's results, GPT4 scores 92%, 52.9% and 67% respectively for each of those, so turbo isn't a huge improvement.

But then if you look at Claude 3, it scores 95%, 60.1% and 84.9% respectively for those tests.

No that doesn't follow, there have been plenty of other newer models released that didn't beat GPT4, beating GPT4 across a range of tasks is quite an achievement.
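For anyone who wants the gaps spelled out, here is a rough sketch that just subtracts the scores quoted above. The numbers are the ones given in the post, not independently checked, and the `SCORES` table and `gap` helper are made-up names for illustration only.

```python
# Benchmark scores (percent) exactly as quoted in the posts above.
# Not independently verified; treat them as the thread's figures.
SCORES = {
    "GPT-4": {"GSM8K": 92.0, "MATH": 52.9, "HumanEval": 67.0},
    "GPT-4 Turbo": {"GSM8K": 92.5, "MATH": 54.0, "HumanEval": 73.17},
    "Claude 3": {"GSM8K": 95.0, "MATH": 60.1, "HumanEval": 84.9},
}


def gap(model_a, model_b):
    """Percentage-point lead of model_a over model_b on each benchmark."""
    return {
        bench: round(SCORES[model_a][bench] - SCORES[model_b][bench], 2)
        for bench in SCORES[model_a]
    }


if __name__ == "__main__":
    for a, b in [("GPT-4 Turbo", "GPT-4"), ("Claude 3", "GPT-4 Turbo")]:
        print(f"{a} vs {b}: {gap(a, b)}")
```

On these figures the step from GPT4 to turbo is small on GSM8K and MATH, while the quoted Claude 3 lead over turbo is largest on HumanEval. These are percentage-point differences on three benchmarks only, so they say nothing about real-world usage.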

While that may be an indication, it remains to be seen in real world usage. We'll find out over the next few weeks just how much better it is in all categories.

Also, there haven't been many large-scale models released to the public since GPT4 Turbo. There have been some niche models that focus on specific tasks, but none as broad as GPT4 Turbo and Claude 3. Gemini 1.5 Pro isn't public yet.

There are already some rudimentary examples on Twitter and Reddit where it certainly doesn't look "better". There are similar discussions every time one of the big models is updated, but opinions tend to change about how good it really is when people start using it for real-world activities. They tend to get a bit lobotomised as people try to "break" them or point out the stereotypes they make, etc.

GPT4 is much more restrictive than it was at release, unless you use the playground or API and pay per token.

Anthropic's auto-detection system also seems to be broken when it comes to ToS violations, as many accounts have been banned in the last 24 hours for seemingly no reason - https://www.reddit.com/r/ClaudeAI/comments/1b76v1g/account_got_banned/ - https://www.reddit.com/r/ClaudeAI/s...56b8&iId=1ae0909e-9222-4ee3-a1a5-19bce4674c80

dowie · 6 Mar 2024 at 16:50

Ayahuasca said:
While that may be an indication, it remains to be seen in real world usage. We'll find out over the next few weeks just how much better it is in all categories.

Oh for sure, no objections here to seeing what happens in terms of real-world usage. I was just pointing out that those aren't contrived benchmarks, they're standard open ones that the other models have already been tested under and so that does actually give a pretty solid indication of capabilities. And also this isn't some small startup, Claude 2 is already well established as a good model beyond just passing benchmarks well, I don't think there is much to be overly skeptical about here re: Claude 3.

edit

Also on the positive side re: ad-hoc testing, there is some very cool stuff so far with this model, these are a few of the things I bookmarked in the past couple of days:

https://twitter.com/i/web/status/1764787911890014688

https://twitter.com/i/web/status/1765088860592394250

https://twitter.com/i/web/status/1764894816226386004

The last example is a bit overhyped by some overly worried AI safety person, but it's still pretty cool: the model knows it's being tested and spots the sentence.

Trusty · 13 Mar 2024 at 21:05

https://twitter.com/i/web/status/1767913661253984474

We're on the cusp of something crazy, next couple of decades are going to get funky

Minusorange · 15 Mar 2024 at 21:05

https://www.reddit.com/r/ChatGPT/comments/1bfa7s3/openai_cto_mira_murati_confirms_that_the_video/

In this interview I will hand the interviewee a hot potato and see how she handles it :cry:

Minusorange · 23 Mar 2024 at 23:51

Nvidia announces AI-powered health care 'agents' that outperform nurses — and cost $9 an hour

Nvidia has partnered with artificial intelligence health care company Hippocratic AI to develop an "agent" that outperforms nurses on phone calls with patients.

www.foxbusiness.com

High-powered chipmaker Nvidia has teamed up with artificial intelligence health care company Hippocratic AI to develop generative AI "agents" that not only outperform human nurses on video calls but cost a lot less per hour.

It's starting! I wonder how long it will be before we get AI nurses in the NHS, where if you want a human nurse you have to go private? Obviously some time off given these are only video call agents, but it's the beginning of AI taking over jobs.

Macky · 24 Mar 2024 at 08:11

Minusorange said:
Nvidia announces AI-powered health care 'agents' that outperform nurses — and cost $9 an hour

Nvidia has partnered with artificial intelligence health care company Hippocratic AI to develop an "agent" that outperforms nurses on phone calls with patients.

www.foxbusiness.com

It's starting! I wonder how long it will be before we get AI nurses in the NHS, where if you want a human nurse you have to go private? Obviously some time off given these are only video call agents, but it's the beginning of AI taking over jobs.

Given the AI ones outperform the humans, I wouldn't be against having an AI one. Obviously that doesn't account for the "human touch" or bedside manner etc. But from a purely technical perspective, I'm for it.

UberTiger · 13 May 2024 at 22:10

The stuff released today, which will all be free in the coming weeks, is mental.

Full Spring Update

For fun - Sarcasm with GPT-4o

ChatGPT as a live personal tutor - absolutely revolutionary for education IMO, bringing an infinitely patient personal tutor to absolutely anyone.

For accessibility - the ability to be a live version of the incredible Be My Eyes app, without the need for a human to pick up and be on the call

Live code interpretation and advice, with the ability to read your clipboard or screen using their new Mac App (Windows coming later in the year)

Realtime conversational Audio Translation, which seems quite fluid compared to the Google Translate app I've used in the past.

There are so many fields this could transformationally affect.

There's even more vids on their channel, the ones I posted are just the best from the ones I've watched so far

https://www.youtube.com/@OpenAI/videos

dowie · 13 May 2024 at 22:13

The low-latency response is incredible compared to chaining models together and getting the obvious few seconds' delay as the speech is transcribed, then sent to the LLM, then the response is sent back and converted to audio etc.

Meta/Facebook AI is supposed to be publishing their findings on their multimodal one in June, so possibly not too long left to get an open-source one too (if Zuck is going to carry on being so generous).

UberTiger · 13 May 2024 at 22:18

Minusorange said:
https://www.reddit.com/r/ChatGPT/comments/1bfa7s3/openai_cto_mira_murati_confirms_that_the_video/

In this interview I will hand the interviewee a hot potato and see how she handles it

This video is all I could think about when watching her for the first couple of minutes of the keynote today

Rockbox · 14 May 2024 at 01:03

They made it way too flirty. Combine this with the loneliness epidemic and the news of ai in dating apps flirting for you and "dating" other ai to check compatibility, and we're not too far off from the "Her" future. I do hope they're sticking to intelligence and reasoning, and not chasing CharacterAI's crown. Remember when Sam drew a website on a napkin and had GPT4 code a website from it? Haven't seen it since. Seems it can assist, but not replace coders. The first thing I ever coded was a python script to run chatgpt. It wrote the code, but kept bumping its head against the change in syntax for using the api key. I hope we don't end up with the smartest models kept behind lock and key by corps and the govt, while we're left with ai girlfriends to pacify us.

Emlyn_Dewar · 14 May 2024 at 02:59

Of course we will. Advanced models that will be completely unable to answer a large amount of questions, due to comical restrictions.

General consensus of OpenAI seems to be that they are a bunch of scumbags, and their company name is the opposite of what they do/want.

UberTiger · 14 May 2024 at 08:05

Rockbox said:
They made it way too flirty. Combine this with the loneliness epidemic and the news of ai in dating apps flirting for you and "dating" other ai to check compatibility, and we're not too far off from the "Her" future. I do hope they're sticking to intelligence and reasoning, and not chasing CharacterAI's crown. Remember when Sam drew a website on a napkin and had GPT4 code a website from it? Haven't seen it since. Seems it can assist, but not replace coders. The first thing I ever coded was a python script to run chatgpt. It wrote the code, but kept bumping its head against the change in syntax for using the api key. I hope we don't end up with the smartest models kept behind lock and key by corps and the govt, while we're left with ai girlfriends to pacify us.

I heard about Kupid AI on a podcast last week, apparently it’s easy for some people to spend 10k a month on services like that for custom content :eek:

jpaul · 14 May 2024 at 08:06

so with today's advances it sounds as though next time the bank rings I will ask the below, or similar

You’re watching television. Suddenly you realize there’s a wasp crawling on your arm.

not sure what Rishi is talking about though: "more will change in the next five years than in the last 30".
is Keir really going to mess the country up so much?

MadMossy · 14 May 2024 at 08:11

jpaul said:
so with today's advances it sounds as though next time the bank rings I will ask the below, or similar

You’re watching television. Suddenly you realize there’s a wasp crawling on your arm.

not sure what Rishi is talking about though: "more will change in the next five years than in the last 30".
is Keir really going to mess the country up so much?

He's been talking to people in the AI sector, and it's fairly safe to assume that thousands of jobs that can be easily automated will be gone in the next few years. The sort of jobs that will go will surprise people: it'll be white-collar jobs that see the largest impact, and they will go quickly. Anything that requires manual dexterity or needs complex decisions to be made on the fly will be the last to go.

jpaul · 14 May 2024 at 08:28

when is AI going to start its onslaught? not heard of many economic productivity gains attributed to it yet (medical diagnosis aside)

yes Rishi needs to generate some enthusiasm that the UK is competing in the AI domain though, to counter doomster Keir~~marvin~~ the paranoid android.

dlockers · 14 May 2024 at 08:32

UberTiger said:
I heard about Kupid AI on a podcast last week, apparently it’s easy for some people to spend 10k a month on services like that for custom content

My friend worked for King briefly. His job was to phone the top 10 Candy Crush spenders and enquire why they'd stopped playing (send them merch, take them to dinner, events etc). Between the 10 of them they were spanking over £100k/month on the game.

Ayahuasca · 14 May 2024 at 08:34

Thousands of jobs over the next few years would be a drop in the ocean, it would have to affect hundreds of thousands if not millions of jobs before people woke up.

I think it will end up being heavily restricted either by law or by company policy in certain countries, industries and sectors. I'm already aware of several companies that have banned their employees from using LLMs at work.

dlockers · 14 May 2024 at 08:48

Ayahuasca said:
I'm already aware of several companies that have banned their employees from using LLMs at work.

General public-facing LLMs, yes - but most companies with any level of common sense will be actively seeking how to give their employees access to LLMs.