website-analyzer

This application is not supported by InterSystems Corporation. Please be notified that you use it at your own risk.
An InterSystems IRIS NLP analyzer for websites. It extracts all website content with the app's crawler and makes it available for analysis in the Text Analytics Domain Explorer and in the BI User Portal.

What's new in this version

Support for NLP sentiment analysis (English), a new BI dashboard in the User Portal, and automatic multilanguage support (thanks deboe).

InterSystems IRIS NLP Website Analyzer

This is an InterSystems IRIS NLP website analyzer. It uses a crawler to extract all HTML content from a site, along with related content, and then uses IRIS NLP to analyze what was extracted.

What the app does

This application receives a URL, uses a crawler to extract all of the website's content, and analyzes it using NLP.
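To make the crawling step more concrete, here is a minimal Python sketch of a crawl bounded by the two settings the production exposes, Depth and TotalPages. It is not the app's Crawler4J code: the `crawl` function, the in-memory `SITE` link map, and the reading of Depth as "link levels away from the start page" are all assumptions made for illustration.

```python
from collections import deque

def crawl(start, links, depth, total_pages):
    """Breadth-first crawl limited by link depth and by page count.

    `links` is a hypothetical in-memory map of page -> outgoing links,
    standing in for real HTTP fetching and HTML parsing.
    """
    visited = []                 # pages "processed", in visit order
    seen = {start}               # pages already queued, to avoid repeats
    queue = deque([(start, 0)])  # (page, level below the start page)
    while queue and len(visited) < total_pages:
        page, level = queue.popleft()
        visited.append(page)
        if level >= depth:
            continue             # depth limit reached: do not follow links
        for nxt in links.get(page, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, level + 1))
    return visited

# Hypothetical site structure used only for this example.
SITE = {
    "/": ["/a", "/b"],
    "/a": ["/a/1", "/a/2"],
    "/b": ["/b/1"],
}

if __name__ == "__main__":
    print(crawl("/", SITE, depth=0, total_pages=5))  # start page only
    print(crawl("/", SITE, depth=1, total_pages=5))  # start page plus its direct links
```

Under this reading, Depth 0 with 5 pages processes only the start page, which is why it makes a fast first test.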

Website-Analyzer - IRIS NLP and Crawler4J in action!

[Figure: IRIS NLP and Crawler4J in action]

Website-Analyzer - IRIS BI in action!

[Figure: NLP metrics in the User Portal]

Prerequisites

Make sure you have Git and Docker Desktop installed.

Installation: Docker

Clone/git pull the repo into any local directory

$ git clone https://github.com/yurimarx/website-analyzer.git

Open the terminal in this directory and run:

$ docker-compose build

Run the IRIS container with your project:

$ docker-compose up -d

How to Run the Production

  1. Open the production

  2. Set Depth and TotalPages on the Crawler. Depth controls how far into subpages the crawler goes, and TotalPages is the maximum number of pages that will be processed. Tip: start with Depth 0 and 5 pages for a fast initial test.

  3. Start the production.

  4. Open Postman, or use a browser, and send a GET request to localhost:9980?Website=https://www.intersystems.com/. To analyze a different site, replace https://www.intersystems.com/ with that site's URL (e.g.: yoursite.com).

  5. Go to the NLP Domain Explorer

  6. Go to the BI User Portal

  7. Analyze the texts and enjoy!
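The GET request from step 4 can also be sent from a script instead of Postman or a browser. The following is a minimal sketch using only the Python standard library. The port 9980 and the `Website` parameter come from the steps above; the `http://` scheme, the `/` path, the 60-second timeout, and the helper names `build_analyzer_url` and `request_analysis` are assumptions for illustration.

```python
from urllib.parse import quote
from urllib.request import urlopen

def build_analyzer_url(website, host="localhost", port=9980):
    """Build the request URL for the running production.

    ":" and "/" are left unescaped so the result matches the form
    shown in step 4 (assumption: the service accepts it as written).
    """
    return f"http://{host}:{port}/?Website={quote(website, safe=':/')}"

def request_analysis(website, timeout=60.0):
    """Send the GET request and return the HTTP status code.

    Requires the container to be up and the production started.
    """
    with urlopen(build_analyzer_url(website), timeout=timeout) as resp:
        return resp.status

if __name__ == "__main__":
    # Only prints the URL; call request_analysis(...) once the production is running.
    print(build_analyzer_url("https://www.intersystems.com/"))
```

Crawling may take a while depending on Depth and TotalPages, so a longer timeout may be needed for larger sites.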

Version: 1.0.3 (24 Dec, 2020)
Category: Analytics
Works with: InterSystems IRIS
First published: 20 Dec, 2020