Standards, Interfaces, Models Search
Semantic search across a broad range of document types:
- most text based formats
- images
- 3D models
- interfaces
I wanted a way to find sketches and text that are inside images. I have a large number of paper scans of my college notes that I would like to be able to search and pull information from. I also have digital handwritten notes that I would like to pull from.
I pulled in a few of the free, open source LLMs that can describe what is in an image and pull out any text within it. I take that output and save it as a text string associated with the image.
Then I put together a simple TF-IDF (Term Frequency - Inverse Document Frequency) program that indexes all of the text. I can pass in standard text and PDF documents, and also pass in the text descriptions of the images, to be processed and scored for the document index.
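To make the indexing step concrete, here is a minimal sketch of TF-IDF scoring in plain Python. It is not the actual tool: the function names (`tokenize`, `build_index`, `search`), the file names, and the sample texts are all made up for illustration, and it assumes the text has already been extracted from the PDFs and images.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Build per-document TF-IDF weights from a {doc_id: text} mapping."""
    term_counts = {doc_id: Counter(tokenize(text)) for doc_id, text in docs.items()}
    # Document frequency: how many documents contain each term.
    doc_freq = Counter()
    for counts in term_counts.values():
        doc_freq.update(counts.keys())
    n_docs = len(docs)
    index = {}
    for doc_id, counts in term_counts.items():
        total = sum(counts.values()) or 1
        # Weight = term frequency in this document * log(N / document frequency).
        index[doc_id] = {
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        }
    return index


def search(index, query, top_k=5):
    """Rank documents by the summed TF-IDF weight of the query terms."""
    terms = tokenize(query)
    scores = {
        doc_id: sum(weights.get(term, 0.0) for term in terms)
        for doc_id, weights in index.items()
    }
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [(doc_id, score) for doc_id, score in ranked[:top_k] if score > 0]


# Hypothetical corpus: two text documents and one image description.
docs = {
    "notes/beam_bending.pdf": "Beam bending stress and deflection formulas",
    "scans/page_012.png": "Handwritten sketch of a truss with labeled forces",
    "notes/circuits.txt": "Resistor and capacitor circuits with stress testing",
}
index = build_index(docs)
results = search(index, "truss sketch")
print(results)
```

Because the image description is just another text string, the scanned page is ranked the same way as the PDF and the text file.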
Right now it is just a CLI tool. I was hoping I could use it as an augmentation tool for LLMs, pulling in real information and pictures from actual sources instead of relying entirely on a large model. Hopefully I can expand the outputs to be more useful for the LLM to use directly, as a kind of RAG (Retrieval Augmented Generation).
My end goal is to be able to run a smaller LLM on my laptop that can pull information from my own corpus of high quality technical documents, in a way that lets me dive into the source document for better context and learning.
As a bonus, this system just kind of works with scanned textbooks, so expanding the available information by ingesting high quality technical documents is relatively easy.
The major downside is that the image classification process takes about 30-60 seconds per image on my laptop. I have 1,000s-10,000s of images that need to be ingested into the TF-IDF index, so it would take days to compute on my laptop.
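As a rough check on "days", here is the arithmetic, taking 1,000 and 10,000 images as round illustrative counts at the low and high ends and assuming images are processed one at a time:

```python
SECONDS_PER_DAY = 24 * 60 * 60


def ingest_days(n_images, seconds_per_image):
    """Estimate sequential processing time in days."""
    return n_images * seconds_per_image / SECONDS_PER_DAY


best = ingest_days(1_000, 30)    # fewest images, fastest per-image time
worst = ingest_days(10_000, 60)  # most images, slowest per-image time
print(f"best case:  {best:.2f} days")
print(f"worst case: {worst:.2f} days")
```

That works out to roughly a third of a day in the best case and about a week in the worst case, with the laptop busy the whole time.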
I think the next step is to turn my desktop computer with an actual GPU into a server to process these and then pass back the final index once completed.
I do have it set up so that when documents or folders are added it does not need to index the whole folder again. It loops through checking for changed or added documents and then indexes only those, so once the base index is completed it will not need to go through all that work again.
Also, the image texts and descriptions get saved out for review, so there is something to read other than the image itself. Copy-pasting info out of the images is now possible, and re-indexing can go much faster by skipping the image description generation.
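One way this kind of change detection can work is sketched below. This is an assumption about the approach, not the tool's actual code: it hashes file contents and compares them against a JSON manifest from the previous run, and the names `find_changed`, `save_manifest`, and the manifest file are hypothetical.

```python
import hashlib
import json
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def find_changed(root, manifest_path):
    """Return (changed_files, new_manifest) for all files under root.

    A file counts as changed if it is new or its digest differs from
    the one recorded in the manifest from the previous run.
    """
    manifest_path = Path(manifest_path)
    old = json.loads(manifest_path.read_text()) if manifest_path.exists() else {}
    new = {}
    changed = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        # Skip the manifest itself in case it lives inside the root folder.
        if path.resolve() == manifest_path.resolve():
            continue
        digest = file_digest(path)
        new[str(path)] = digest
        if old.get(str(path)) != digest:
            changed.append(path)
    return changed, new


def save_manifest(manifest, manifest_path):
    """Write the manifest to disk so the next run can compare against it."""
    Path(manifest_path).write_text(json.dumps(manifest, indent=2))
```

A run would call `find_changed`, index only the returned files, then call `save_manifest`. Hashing reads every file, so comparing modification times instead would be faster, at the cost of missing edits that leave the timestamp unchanged.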
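Saving the descriptions out could look something like the sketch below, which writes each description to a text file next to its image and reuses it on later runs. The sidecar naming (`image.png.txt`) and the `describe_fn` callback are assumptions for illustration; `describe_fn` stands in for whatever model call produces the description.

```python
from pathlib import Path


def describe_with_cache(image_path, describe_fn):
    """Return the saved description for an image, generating it only if missing.

    describe_fn is a hypothetical callable that takes the image path and
    returns a text description (the slow 30-60 second step).
    """
    image_path = Path(image_path)
    # Sidecar file sits next to the image, e.g. page_012.png.txt
    sidecar = image_path.with_name(image_path.name + ".txt")
    if sidecar.exists():
        return sidecar.read_text(encoding="utf-8")
    description = describe_fn(image_path)
    sidecar.write_text(description, encoding="utf-8")
    return description
```

The sidecar files are plain text, so they can be opened for review or copy-pasting, and a full re-index only needs to read them.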
Now, this has all ignored the interfaces and 3D model search... that is because they have not been implemented yet. I think mechanical engineering is way behind where it should be in terms of computer aided design. Component selection is a pain in the ass, and making sure things fit together is so challenging. There is a lot of manual work that goes into looking for components that will fit together and that will do what you want. In all design-type engineering there are individual components, and those components have their specific ratings and interfaces: bolt patterns, locations to solder to, etc.
I should be able to choose a few components and a flow of how those parts should connect, and the computer should be able to fill in the gaps between those components: choose this off-the-shelf item to connect these things, choose this component as the best option for an in-between step that was not specified, or tell me there is nothing off the shelf that connects this thing to that thing but give me the way those two connect (the interfaces). That makes design so much faster.
Maybe you see where this is going. I have a pretty good idea of how to automate a few annoying design engineering tasks based on user constraints in the system; the tools (and availability of information) just don't exist yet. And this is for my own personal use, so it will be heavily specced toward what components are off-the-shelf and what information is readily available. I don't have time to design every part from scratch for the cost reduction benefits of high volume. I am here for bespoke contraptions. I have been trying to get LLMs to do this work all on their own, but they don't have the structured data and compilation-style feedback loop to keep them on the rails.
So... that's on the list...