Because they don't really search or index quality content (it's very expensive and hard to do) and their search implementation really sucks, they never make any real improvement.
The process is like this:
- Take the user query and generate 1-3 search queries. For this step they use very dumb but fast and cheap models; because of that, the search queries are sometimes terrible, and unlike a pro user, the model doesn't really know how to use a search engine: filtering, ranking operators, narrowing the scope...
- Combine the search results (these contain sloppy AI-generated summary pages, YouTube videos, maybe forums, maybe Wikipedia...).
- Use RAG with an LLM to find answers. LLMs will always try to find an answer quickly, so instead of reasoning through a long article, they grab the slop page that has a direct answer.
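The three steps above can be sketched as a pipeline. This is just an illustration of the described flow, not anyone's actual implementation; every function name is made up, and the "models" and "search index" are stubbed so the example runs on its own.

```python
# Hypothetical sketch of the AI-search pipeline described above.
# All names and behaviors are stand-ins, not a real product's API.

def expand_query(user_query):
    # Step 1: a cheap, fast model turns the user query into 1-3
    # search queries. Stubbed here with trivial variations.
    return [user_query, user_query + " explained"][:3]

def search(query):
    # Step 2: each query hits the search index. In practice the results
    # mix summary pages, videos, forums, Wikipedia, etc.
    return [{"url": "https://example.com/?q=" + query,
             "text": "Stub result for: " + query}]

def answer_with_rag(user_query, documents):
    # Step 3: everything is stuffed into one context and an LLM is asked
    # to answer, citing whatever it believes it used.
    context = "\n".join(d["text"] for d in documents)
    return f"Answer to {user_query!r} based on {len(documents)} documents."

def ai_search(user_query):
    queries = expand_query(user_query)
    results = [doc for q in queries for doc in search(q)]
    return answer_with_rag(user_query, results)

print(ai_search("why is the sky blue"))
```

Note that the quality ceiling is set in `expand_query`: if the cheap model produces bad queries, nothing downstream can recover.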
As you can see, there are many, many problems in this implementation:
- The biggest problem is citation: they cite confidently, but the citations are often wrong.
- They use low-quality data, like auto YouTube subtitles, improperly extracted tables and elements, content-farm sites, copycat sites, corporate blogs...
- Their search results are low quality.
- For the most important part (breaking down the user request) they use cheap, stupid models.
- They handle all the data in one context instead of making parallel per-document requests (which would be very expensive).
It's still strange to me: we always say "they have all the data, all the money, all the hardware..." but they still can't create a better AI search than random FOSS developers.
I agree with almost everything you said; however, Kagi lets you choose the model that runs your search.
They’re always pulling from the same search index, and you’re right: the citation is just the model guessing how much it used some source. Nothing actually quantifies that.
Well, I used Kagi (for 2 months), but their AI implementation is not transparent, and it suffers from the same problems, though their index is better (it combines Google, Bing, Yandex, and Brave) and you can use better LLMs.
I switched to OpenRouter (I just pay for my tokens).