Project Domain / Category
Web Application + Information Retrieval
Abstract / Introduction
The rapid growth in the number of scientific articles is creating an information-overload problem for
researchers. As a result, both novice and expert researchers find it difficult to locate articles
relevant to their interests. There is therefore a need for an application that can recommend
similar articles to a researcher. To address this problem, we will develop a web-based scientific
article recommendation system that recommends articles matching a user's interests, based on the
Doc2Vec text modeling scheme and the cosine similarity measure.
1. Signup:
Create a Signup module. Users will be required to register themselves in the application. A
user is registered only after the admin approves the request.
2. Sign-in:
Create a Sign-in module. Only registered users will be able to use the application.
3. Manage Users:
The admin will be able to manage users, that is, approve users, remove users, and view
user data through the admin dashboard.
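The signup, approval, and sign-in rules above can be sketched as a small state model. This is only an illustration under stated assumptions: the `User` and `UserStore` names, the in-memory dictionary standing in for a database table, and the PBKDF2 password hashing are all hypothetical choices, not part of the project requirements.

```python
import hashlib
import hmac
import os
from dataclasses import dataclass


@dataclass
class User:
    username: str
    salt: bytes
    password_hash: bytes
    approved: bool = False  # new accounts wait for admin approval


class UserStore:
    """Hypothetical in-memory stand-in for the application's user table."""

    def __init__(self):
        self.users = {}

    @staticmethod
    def _hash(password, salt):
        # Salted PBKDF2 hash so plain-text passwords are never stored
        return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)

    def signup(self, username, password):
        """Create a pending account; it cannot sign in until approved."""
        if username in self.users:
            raise ValueError("username already taken")
        salt = os.urandom(16)
        self.users[username] = User(username, salt, self._hash(password, salt))

    def approve(self, username):
        """Admin action: mark a pending account as registered."""
        self.users[username].approved = True

    def remove(self, username):
        """Admin action: delete an account if it exists."""
        self.users.pop(username, None)

    def sign_in(self, username, password):
        """Return True only for approved users with the correct password."""
        user = self.users.get(username)
        if user is None or not user.approved:
            return False
        return hmac.compare_digest(user.password_hash, self._hash(password, user.salt))
```

In a real web application the same checks would sit behind the signup, sign-in, and admin dashboard views, with the store backed by the project database.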
4. Add Scientific Articles:
The admin will be able to add scientific article data, including each article's abstract and area,
to the database through the admin dashboard. Add at least 50 articles from different areas to the
database. You can use a CSV file to add data to the database from the following link.
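A minimal sketch of loading article data from a CSV file into a database is shown here. It assumes a CSV with `abstract` and `area` column headers and uses SQLite with a table named `articles`; the column names, table name, and the `load_articles` helper are assumptions made for illustration, and the actual CSV layout may differ.

```python
import csv
import io
import sqlite3


def load_articles(csv_file, conn):
    """Insert rows from a CSV file object into the articles table; return the row count."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS articles ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "abstract TEXT NOT NULL, "
        "area TEXT NOT NULL)"
    )
    reader = csv.DictReader(csv_file)
    rows = [
        (row["abstract"].strip(), row["area"].strip())
        for row in reader
        if row.get("abstract") and row.get("area")  # skip incomplete rows
    ]
    conn.executemany("INSERT INTO articles (abstract, area) VALUES (?, ?)", rows)
    conn.commit()
    return len(rows)


# Usage with a tiny in-memory CSV (made-up sample rows, for illustration only)
sample = io.StringIO(
    "abstract,area\n"
    "We study graph neural networks.,Machine Learning\n"
    "A survey of query optimization.,Databases\n"
)
conn = sqlite3.connect(":memory:")
inserted = load_articles(sample, conn)
areas = conn.execute("SELECT COUNT(DISTINCT area) FROM articles").fetchone()[0]
```

The returned count and the distinct-area query give a quick way to check the requirement of at least 50 articles from different areas once the real file is loaded.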
5. Pre-Process Data and Build Doc2Vec Model:
Pre-process the data to make it ready for model training. You can search for pre-processing steps, such as lowercasing and tokenization, that are commonly used with the Doc2Vec algorithm. You
are then required to build a Doc2Vec (Distributed Memory) model from the article abstracts
stored in the database and save the model.
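The pre-processing and training step could look roughly like the following sketch, which assumes the gensim library (version 4 or later) is installed. The `preprocess` and `train_doc2vec` helpers, the file name `doc2vec.model`, and the hyperparameter values are illustrative assumptions; only `dm=1`, which selects the Distributed Memory variant, follows directly from the requirement.

```python
import re


def preprocess(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def train_doc2vec(abstracts, model_path="doc2vec.model"):
    """Train a Distributed Memory Doc2Vec model on {article_id: abstract} and save it."""
    # gensim is imported here so preprocess() stays usable without it installed
    from gensim.models.doc2vec import Doc2Vec, TaggedDocument

    corpus = [
        TaggedDocument(words=preprocess(text), tags=[str(article_id)])
        for article_id, text in abstracts.items()
    ]
    # dm=1 selects Distributed Memory; the other values are illustrative guesses
    model = Doc2Vec(vector_size=50, window=5, min_count=1, dm=1, epochs=40)
    model.build_vocab(corpus)
    model.train(corpus, total_examples=model.corpus_count, epochs=model.epochs)
    model.save(model_path)
    return model
```

With only around 50 abstracts the learned vectors will be noisy, so values such as `vector_size` and `epochs` are worth tuning. Whatever pre-processing is applied at training time should also be applied to any abstract passed to the model later.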
6. Recommend Scientific Articles Using Cosine Similarity Measure:
Create a webpage that takes as input an article abstract that has not yet been added to the
system. When the user clicks Generate Recommendations, the application will infer a vector
for the user's abstract. It then computes the cosine similarity between this user vector and
the vectors of the articles already in the database, and displays the recommended articles in
descending order of similarity.
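The ranking step described above can be sketched in plain Python. Cosine similarity between vectors a and b is their dot product divided by the product of their lengths. The `cosine_similarity` and `recommend` helpers and the `top_n` parameter are hypothetical names introduced for this example.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def recommend(user_vector, article_vectors, top_n=5):
    """Rank {article_id: vector} by similarity to user_vector, highest first."""
    scored = [
        (article_id, cosine_similarity(user_vector, vector))
        for article_id, vector in article_vectors.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]
```

If the model is built with gensim, the user vector would typically come from `model.infer_vector(tokens)` applied to the pre-processed abstract, and the stored article vectors from `model.dv`. That pairing is an assumption about the chosen library rather than a requirement of this project.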