What you download is a ZIP archive, and bear in mind that this is a demo, i.e. a demonstration, not the full version of the tool. Be careful, because the compressed file takes up 35.1 GB of space. Once downloaded, simply unzip it to a directory on your storage device; when you navigate to the folder you will see a “Setup.exe”, which you need to run to install the application.
The installation process is very standard: you just have to keep clicking Next until it finishes. Give it time, though, because it takes a LONG time; in particular, there is a “build the Llama 13B INT4 engine” step that consumes all of the computer’s RAM and slows everything down considerably for about a minute. Be patient.
Once the process is complete, the wizard will offer to launch the application, although it will also have created a desktop shortcut for it. When you run it, a command-prompt window like the one below will appear, and you will see a message asking whether you want to allow Python to run on the PC (you must answer yes).
Once loading is complete, a (local) web page will open in your browser where you can configure everything we describe next, so let’s get started.
How this local AI works on your PC
Chat with RTX uses Retrieval-Augmented Generation (RAG), NVIDIA TensorRT-LLM software, and NVIDIA RTX acceleration to bring generative AI to your PC. Users can quickly and easily connect local PC files as a dataset to an open-source language model such as Mistral or Llama 2, allowing queries to be answered quickly.
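The core idea of RAG can be sketched in a few lines of Python. This is a toy illustration, not NVIDIA's code: a real pipeline like Chat with RTX ranks documents with vector embeddings rather than the naive keyword overlap used here, but the flow is the same: retrieve relevant local text, then prepend it to the question as context for the language model.

```python
# Toy sketch of retrieval-augmented generation (RAG).
# Assumption: a real system would use embeddings + a vector index;
# keyword overlap stands in for retrieval here.

def retrieve(question: str, documents: list[str], top_k: int = 1) -> list[str]:
    """Rank local documents by naive keyword overlap with the question."""
    q_words = set(question.lower().split())
    scored = sorted(documents,
                    key=lambda d: len(q_words & set(d.lower().split())),
                    reverse=True)
    return scored[:top_k]

def build_prompt(question: str, documents: list[str]) -> str:
    """Prepend the retrieved context so the model can ground its answer."""
    context = "\n".join(retrieve(question, documents))
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"

notes = [
    "Ruben recommended the restaurant Casa Lola last Tuesday.",
    "Meeting with the design team moved to Friday.",
]
prompt = build_prompt("What restaurant did Ruben recommend?", notes)
```

The prompt that reaches the model now contains the note about the restaurant, which is why the chatbot can answer questions about your own files.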
But instead of searching through notes or content saved on the PC hard drive, with NVIDIA’s Chat with RTX we can simply write questions. For example, you can ask things like “What restaurant did my friend Rubén recommend to me the other day?” and the AI will analyze the local files we point it to in order to find the answer. This means we must tell the application where to look for these files (.txt, .pdf, .doc/.docx and .xml), directly in the web browser page that opens and that we mentioned previously.
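To make the file-type list concrete, here is a small sketch of how an indexer might collect the formats the tool supports from the folder you select. The function name and logic are illustrative assumptions, not NVIDIA's actual API:

```python
# Sketch: gather the file types Chat with RTX supports
# (.txt, .pdf, .doc/.docx, .xml) from a chosen folder.
# Hypothetical helper, not part of NVIDIA's tool.

from pathlib import Path

SUPPORTED = {".txt", ".pdf", ".doc", ".docx", ".xml"}

def collect_dataset_files(folder: str) -> list[Path]:
    """Recursively list every supported file under the given folder."""
    return sorted(p for p in Path(folder).rglob("*")
                  if p.is_file() and p.suffix.lower() in SUPPORTED)
```

Anything outside those extensions (images, spreadsheets, etc.) is simply ignored, which is why it pays to gather your documents into one directory before pointing the app at it.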
If you change the path where the application should search for files, you will see a log of what is being processed start to appear in the command-prompt window that remains open. And be careful: processing any change takes a long time. Simply changing the path can take 2-3 minutes of processing (with an RTX 4080 OC), and during this time the GPU runs at 100%, significantly increasing power consumption and temperature.
It is in this local web interface that you interact with the AI, using the text box at the bottom, just to the left of the green Submit button. Note that, at the moment, you can only ask your questions in English.
From what we tested, this NVIDIA chatbot is currently too basic and rough around the edges. It answers your questions almost instantly, but it consumes a lot of system resources, and any change you want to make takes a long time to process. Also, once you open the app and tell it where to look for resources, it does not detect changes to that folder automatically: you have to either restart Chat with RTX or select the folder again.
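Since the app does not reindex a folder on its own, a quick way to tell when a re-selection is needed is to snapshot the folder's state and compare. This helper is a generic illustration we wrote for this article, not part of NVIDIA's tool:

```python
# Detect when a dataset folder has changed (and Chat with RTX would
# need the folder re-selected). Generic illustration, not NVIDIA code.

import hashlib
from pathlib import Path

def folder_fingerprint(folder: str) -> str:
    """Hash every file's path, size and mtime; the digest changes
    whenever a file is added, removed, or modified."""
    h = hashlib.sha256()
    for p in sorted(Path(folder).rglob("*")):
        if p.is_file():
            stat = p.stat()
            h.update(f"{p}:{stat.st_size}:{stat.st_mtime_ns}".encode())
    return h.hexdigest()
```

Compare the stored digest with a fresh one before asking a question; if they differ, reselect the folder so the new files get processed.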
The conclusion we draw after trying it is that… it’s not worth it, at least for now. It needs a lot of system resources, and you have to organize all your information into a single directory on the PC for the chatbot to work with, which not everyone has done beforehand. That means that just to try this tool, you will have to spend a lot of time, only for it to answer questions you already know.
Now, if you are a user who works with data you need at a glance, and you have it well organized and defined, then Chat with RTX could be a powerful tool for you: just by “chatting” with the bot, it can give you the answers you need without having to search through your files.