Because they don't really search or index quality content (it's very expensive and hard to do) and their search implementation really sucks, they don't deliver any real improvement.
The process is like this:
- Take the user query and create 1-3 queries. For this process they use very stupid but fast and cheap models; because of that, sometimes they create very stupid search queries and, unlike a pro, they don't really know how to use search engines, like filtering, ranking, focusing...
- Combine these search results (it contains slop AI-generated summary pages, YouTube videos, maybe forums, maybe Wikipedia...).
- Use RAG with an LLM to find answers. LLMs will always try to find answers quickly, and instead of making a thinking loop in a long article they will use that slop page with a direct answer.
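The three steps above can be sketched roughly like this. Everything here is a stand-in, not any real product's code: `expand_query`, `fake_search`, `merge_results`, and `answer_with_rag` are hypothetical names, the "cheap model" is a plain string template, the "web" is a two-page in-memory list with made-up text, and the "LLM" just grabs the shortest page to mimic the direct-answer shortcut.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Page:
    url: str
    text: str


# Hypothetical in-memory "web"; a real system would call a search API here.
CORPUS = [
    Page("https://example.com/slop", "Quick answer: the seahorse emoji exists."),
    Page("https://example.org/long-article",
         "A long article about emoji proposals. After reviewing the Unicode "
         "list, there is no seahorse emoji in the official set."),
]


def expand_query(user_query: str, max_queries: int = 3) -> list[str]:
    """Stand-in for the small, cheap model that rewrites the user query."""
    variants = [user_query, f"{user_query} explained", f"what is {user_query}"]
    return variants[:max_queries]


def fake_search(query: str) -> list[Page]:
    """Stand-in search: return pages sharing at least one word with the query."""
    words = set(query.lower().split())
    return [p for p in CORPUS if words & set(p.text.lower().split())]


def merge_results(queries: list[str]) -> list[Page]:
    """Combine results from all queries, dropping duplicate URLs, keeping order."""
    seen: dict[str, Page] = {}
    for q in queries:
        for page in fake_search(q):
            seen.setdefault(page.url, page)
    return list(seen.values())


def answer_with_rag(user_query: str, pages: list[Page]) -> tuple[str, str]:
    """Stand-in for the LLM step: pick the shortest page, mimicking the
    'use the page with a direct answer' shortcut described above."""
    best = min(pages, key=lambda p: len(p.text))
    return best.text, best.url


queries = expand_query("seahorse emoji")
pages = merge_results(queries)
answer, citation = answer_with_rag("seahorse emoji", pages)
print(answer, "->", citation)
```

In this toy setup the short slop page wins over the long article that actually has the right answer, and it gets cited with full confidence.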
As you can see, there are many, many problems in this implementation:
- The biggest problem is citation: they cite confidently but it's wrong.
- They use low-quality data, like auto YouTube subtitles, improperly extracted tables and elements, content-farm sites, copycat sites, corporate blogs...
- Their search results are low quality.
- For the most important part (breaking down the user request) they use cheap, stupid models.
- They handle all data in the same context instead of using parallel requests (which would be very expensive).
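To make the last point concrete, here is a minimal sketch of the two approaches: stuffing every page into one context versus sending one request per page in parallel and merging afterwards. `call_llm` is a hypothetical stub (it only records the prompt size and returns a canned note), and the documents are made up, so this only illustrates why the parallel route means more calls and more cost.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical documents standing in for fetched search results.
DOCS = {
    "forum": "Thread discussing the topic with mixed opinions.",
    "wiki": "Encyclopedic overview with sources.",
    "blog": "Corporate blog post, mostly marketing.",
}

calls = []  # records the prompt size of every stubbed model call


def call_llm(prompt: str) -> str:
    """Stub for a model call; a real one would hit an API and cost money."""
    calls.append(len(prompt))
    return f"notes({len(prompt)} chars)"


def single_context(question: str) -> str:
    """One call: every document is concatenated into the same prompt."""
    prompt = question + "\n" + "\n".join(DOCS.values())
    return call_llm(prompt)


def parallel_requests(question: str) -> str:
    """One call per document, run concurrently, then one call to merge."""
    with ThreadPoolExecutor(max_workers=len(DOCS)) as pool:
        notes = list(pool.map(lambda d: call_llm(question + "\n" + d),
                              DOCS.values()))
    return call_llm(question + "\n" + "\n".join(notes))


single_context("What is X?")
calls_single = len(calls)
calls.clear()
parallel_requests("What is X?")
calls_parallel = len(calls)
print(calls_single, calls_parallel)
```

With three documents the single-context version makes one call and the parallel version makes four (three per-document calls plus the merge), which is the extra expense mentioned above.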
It's still strange to me: we always say "they have all the data, all the money, all the hardware..." but they still can't create a better AI search than random FOSS developers.
That's impossible, because most of my searches are literally as fast as me typing the query, and then I get the answer.
That's why I'm asking what you guys are searching for, because this has been a dramatic improvement for me.
Try searching for "seahorse emoji".
Andi said:
Adding that it doesn't exist in the official Unicode emoji set, but it does in unofficial packs, e.g. here
I find that anything older than 10 years that isn't media or pop culture just doesn't appear in searches. I can't find a way to exclude terms at all. I can't find a reliable way to add terms without wildly changing the results, when what I want is to dig into the results I already have to find what I'm actually looking for.
Maybe for simple queries, but if your task is like one of these, currently there is no AI that can beat a human/me:
"finding most popular communities in Lemmy"
"5 latest LLM models"
"trump's last 5 lies"
"any file finding"
"image finding"
"any tool or website suggestion"
"finding source of something"
"finding GitHub issues related to something"
"finding all news about something"
"finding a broken webpage"
"finding original content"
"finding illegal content :D"
Even when they "do" it, they only do it just well enough, and that's not enough for me.
Yeah. I'm referring to simple queries. That's the vast majority of my queries.
Were we supposed to read your mind for that one? You literally said someone else's experience was impossible before you backed it up to "of course I just meant simple queries."
Maybe reread the conversation, because you seem to be assuming a tone on my part that isn't there.