
WALL-M

This application is not supported by InterSystems Corporation. Please be notified that you use it at your own risk.
A Platform for Retrieval Augmented Generation (RAG) for Question-Answering of E-Mails

What's new in this version

Initial Release

Motivation

With the rise of GenAI, we believe users should be able to access unstructured data in a much simpler fashion. Most people receive more emails than they can keep track of. For example, professionals in investment and trading firms must make quick decisions using as much information as possible. Similarly, senior employees in a startup who deal with many teams and disciplines might find it difficult to organize all the emails they receive. GenAI can help solve these common problems and make their lives easier and more organized. The possibility of hallucinations in GenAI models can be scary, and that's where RAG combined with hybrid search comes in to save the day. This is what inspired us to build WALL-M (Work Assistant LL-M).

Developer team

Lars Quaedvlieg
Arvind Menon
Alejandro Hernandez Cano
Somesh Mehra

Project Description

This project was completed for the HackUPC 2024 hackathon in Barcelona! We used the Vector Search capability of the InterSystems IRIS Data Platform to tackle question-answering with semantic search while trying to prevent model hallucinations.
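
To make the idea of semantic search concrete, here is a minimal sketch in plain Python. It is not the code used in WALL-M, where embeddings come from a commercial model and the similarity search runs inside IRIS. The three-dimensional vectors, the `EMAILS` list, and the function names are all made up for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, docs, k=2):
    """Return the k (score, text) pairs most similar to query_vec."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in docs]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Toy "embeddings" (hypothetical values); a real system would get
# these vectors from an embedding model.
EMAILS = [
    ("Q3 budget review moved to Friday", [0.9, 0.1, 0.0]),
    ("Team lunch on Thursday", [0.0, 0.2, 0.9]),
    ("Updated budget spreadsheet attached", [0.8, 0.3, 0.1]),
]

if __name__ == "__main__":
    # A query vector that points in the "budget" direction.
    for score, text in top_k([1.0, 0.2, 0.0], EMAILS):
        print(f"{score:.3f}  {text}")
```

In a RAG setup, the retrieved e-mails would be passed to the language model as context, so that answers are grounded in real messages rather than in the model's guesses.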

The repository contains the complete question-answering platform, which you can set up with the steps below. However, note that you currently need an OpenAI API key and an AI21 Labs API key to use the models. In the future, we hope this platform can be extended to support local LLMs instead of commercial solutions. Furthermore, we hope to integrate a direct connection to Outlook.

Check out our video demo here

WALL-M Setup

  1. Clone the repo

    git clone git@github.com:lars-quaedvlieg/WALL-M.git
    
  2. Change your directory to WALL-M

    cd WALL-M
    
  3. Install IRIS Community Edition in a container. This opens ports on your device for the IRIS database system:

    docker run -d --name iris-comm -p 1972:1972 -p 52773:52773 -e IRIS_PASSWORD=demo -e IRIS_USERNAME=demo intersystemsdc/iris-community:latest
    

    Note: After running the above command, you can access the System Management Portal via http://localhost:52773/csp/sys/UtilHome.csp. Please note you may need to configure your web server separately when using another product edition.

  4. Create a Python environment and activate it (with conda, venv, or however you wish). For example:

    conda:

    conda create --name wall-m python=3.10
    conda activate wall-m
    

    or

    venv (Windows):

    python -m venv wall-m
    .\wall-m\Scripts\Activate
    

    or

    venv (Unix):

    python -m venv wall-m
    source ./wall-m/bin/activate
    
  5. Install packages for all demos:

    pip install -r requirements.txt
    
  6. Make sure to obtain an OpenAI API Key and an AI21 Labs key. Then, create a .env file in this repo to store the keys as:

    OPENAI_API_KEY=xxxxxxxxx
    AI21_API_KEY=xxxxxxxxx
    
  7. The application in this repository is built with Taipy. To run it, navigate to the root folder of the repository and run:

    python src/core/main.py
    
  8. Once you have launched the platform, head to 127.0.0.1:5000 and select a data directory. This directory should contain JSON files with e-mail descriptions; we hope to replace this with direct authentication to Outlook in the future. The method for obtaining these JSON files is also in the codebase, with instructions below. Alternatively, you may use the example synthetic data in the data/emails folder. These files are used to create a database table in IRIS, which can then be queried using Retrieval Augmented Generation and Large Language Models.
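
Before launching, it can help to verify the pieces that the steps above depend on. The following is a small pre-flight sketch and is not part of the repository. It assumes that the .env file sits in the repository root, that the e-mail files use a .json extension, and that data/emails is the directory you plan to select. The function names are hypothetical.

```python
from pathlib import Path

# Key names as given in the setup steps.
REQUIRED_KEYS = ("OPENAI_API_KEY", "AI21_API_KEY")


def read_env_file(path):
    """Parse KEY=VALUE lines from a .env file, skipping blanks and comments."""
    values = {}
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def check_setup(env_path=".env", data_dir="data/emails"):
    """Return a list of human-readable problems; empty means all checks passed."""
    problems = []

    env_file = Path(env_path)
    if not env_file.is_file():
        problems.append(f"missing {env_path}")
    else:
        values = read_env_file(env_file)
        for key in REQUIRED_KEYS:
            if not values.get(key):
                problems.append(f"{key} is not set in {env_path}")

    data = Path(data_dir)
    if not data.is_dir():
        problems.append(f"data directory {data_dir} does not exist")
    elif not any(data.glob("*.json")):
        # Assumption: e-mail descriptions are stored as *.json files.
        problems.append(f"no JSON files found in {data_dir}")

    return problems


if __name__ == "__main__":
    for problem in check_setup():
        print("problem:", problem)
```

Run it from the repository root; no output means both keys are present and the data directory contains at least one JSON file.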

Scraping E-Mails

  1. To scrape your emails, make sure you are on a Windows machine. You can then install the required packages by running:

    pip install -r requirements_outlook.txt
    
  2. The e-mails are scraped from an Outlook account. For this, you need to be signed in to your Outlook account in the Windows Outlook application. Then run the following command to scrape e-mails:

    python src/outlook/scrape_emails.py --email [YOUR_EMAIL]
    

    This will add JSON files containing the e-mail descriptions to the data directory. These files can then be used to create a database table with IRIS.
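
To see what the scraper produced without assuming any particular schema, a short sketch like the one below lists the top-level keys of each JSON file in a directory. It is not part of the repository, the function name is hypothetical, and it assumes the files use a .json extension.

```python
import json
from pathlib import Path


def summarize_json_dir(data_dir):
    """Map each JSON file name to its sorted top-level keys.

    Files whose top-level value is not an object are mapped to the
    name of their Python type instead (for example "list").
    """
    summary = {}
    for path in sorted(Path(data_dir).glob("*.json")):
        with path.open(encoding="utf-8") as handle:
            content = json.load(handle)
        if isinstance(content, dict):
            summary[path.name] = sorted(content)
        else:
            summary[path.name] = type(content).__name__
    return summary


if __name__ == "__main__":
    # data/emails holds the synthetic examples mentioned in the setup steps.
    for name, keys in summarize_json_dir("data/emails").items():
        print(name, keys)
```

Comparing the output for your scraped files with the output for the synthetic examples is a quick way to confirm that both share the same structure.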

Using the IRIS Management Portal

  1. Navigate to http://localhost:52773/csp/sys/UtilHome.csp and log in with username: demo, password: demo (or whatever you configured)
  2. On the left navigation pane, click 'System Explorer'
  3. Click 'SQL' -> 'Go'
  4. Here, you can execute SQL queries. You can also view the tables by clicking the relevant table on the left, under 'Tables', and then clicking 'Open Table' (above the SQL query box)
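
The same kind of query can also be run from Python instead of the portal. The sketch below is not taken from the repository. It assumes the `sqlalchemy` and `sqlalchemy-iris` packages are installed, that the container from the setup steps is running with the demo credentials, and that the tables live in the `USER` namespace (an assumption; adjust it if yours differs).

```python
def iris_url(username="demo", password="demo", host="localhost",
             port=1972, namespace="USER"):
    """Build a SQLAlchemy URL for the IRIS dialect (namespace is an assumption)."""
    return f"iris://{username}:{password}@{host}:{port}/{namespace}"


# Lists user tables through the standard INFORMATION_SCHEMA views.
LIST_TABLES_SQL = (
    "SELECT TABLE_SCHEMA, TABLE_NAME FROM INFORMATION_SCHEMA.TABLES "
    "WHERE TABLE_TYPE = 'BASE TABLE'"
)


def list_tables(url):
    """Return (schema, table) tuples for all base tables visible at url."""
    # Imported here so the helpers above work without SQLAlchemy installed.
    from sqlalchemy import create_engine, text

    engine = create_engine(url)
    with engine.connect() as conn:
        return [tuple(row) for row in conn.execute(text(LIST_TABLES_SQL))]


if __name__ == "__main__":
    for schema, table in list_tables(iris_url()):
        print(f"{schema}.{table}")
```

Port 1972 is the one published by the docker command in the setup steps; the portal itself is served on 52773.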
Version: 1.0.0 (13 May, 2024)
Category: Frameworks
Works with: InterSystems Vector Search
First published: 12 May, 2024