Agreed. I've been out of the AI game since circa 2019 and holy cow has it advanced drastically.
I have been doing some RAG pipeline work with Milvus and Hugging Face, but I can't find any real documented best practices, which leads me to believe that there just aren't any out there yet! And anyone who does know these things is not going to be very forthcoming about it, of course.
I'm also running into the tension between the probabilistic nature of LLMs and the deterministic information that RAG supplies (text chunks retrieved by vector similarity over their embeddings). This creates a "static" feel to the interaction that I don't want. I need to work out how to get the LLM to apply its probabilistic nature to the data from the RAG pipeline rather than using it verbatim. Fun times ahead!
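For what it's worth, here is a rough sketch of the direction I'm thinking of, in plain Python with no framework. It assumes the retriever has already handed back text chunks. `build_prompt`, the instruction wording and the values in `GENERATION_SETTINGS` are my own placeholders and guesses, not anything taken from Milvus or LangChain; the three setting names do match keyword arguments accepted by Hugging Face's `generate`. The idea is to frame the chunks as background notes and ask for a synthesis, then sample with a non-zero temperature so the wording varies from run to run.

```python
from typing import List


def build_prompt(question: str, chunks: List[str]) -> str:
    """Frame retrieved chunks as background notes rather than text to quote.

    Hypothetical helper: the instruction wording is an assumption to tune.
    """
    # Number each chunk so the model can tell the notes apart.
    notes = "\n".join(
        f"[{i}] {chunk.strip()}" for i, chunk in enumerate(chunks, start=1)
    )
    return (
        "Use the notes below as background facts only.\n"
        "Answer in your own words and do not copy the notes verbatim.\n"
        "If the notes do not cover the question, say so.\n\n"
        f"Notes:\n{notes}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


# Assumed starting values, not recommendations: sampling must be switched on
# for temperature and top_p to have any effect on the output.
GENERATION_SETTINGS = {"do_sample": True, "temperature": 0.8, "top_p": 0.9}
```

The prompt string goes through the tokenizer as usual and the settings get passed along to the generation call. Whether that is enough to lose the "static" feel is something I still need to test.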
LangChain is something that I'm getting to grips with, but it hasn't been a resounding success.
Strands looks quite similar to OpenAI's ChatKit. I'll give that a little explore as it may be an easy way to hook this whole thing together. I did try out OpenAI's agent builder, but I feel much more comfortable in VS Code than I do in a graphical editor!
I have something running locally just fine, but my TS and JS are too weak for a prod deployment. I'm going to get a web dev in for those pieces.