Web Mining/Information Retrieval
Abstract / Introduction
In the Internet era, the World Wide Web (WWW) has become a primary source of information
for people from all walks of life. Because Web data is vast and non-uniform, finding relevant
information is a challenging task for users. Finding experts who have the appropriate skills
and knowledge for a specific domain, academic program, or research field has become an
important academic activity. An Expert Locating System (ELS) helps students, organizations,
industries, and academic institutes find experts who match their needs.
The aim of this project is to develop a web application that helps students and
academic institutes find experts of their choice. The application will act as a search engine
over faculty-related information extracted from at least 200 academic institutes,
including universities, colleges, and other training institutes. The extracted information
includes faculty name, highest qualification, expertise area, expertise type (teaching,
research, coaching, etc.), experience (in years), and affiliation(s), all of which are stored
in the database. Students are required to develop their own Web crawler for extracting this
information. Students are also required to develop an attractive user interface for entering
search queries and displaying query results.
This project has the following basic modules:
1. Web Crawler: Web search engines work by storing information about many web pages, which they extract from the pages' HTML. The pages are retrieved by a Web crawler, an automated Web browser that follows every link on a site. The contents of each page are then analysed to determine how it should be indexed.
2. Front end for query processing and results: The front end presents a search bar to users. The query processor parses the request and executes the search, and the front end displays the results.
3. Ranking of experts: Experts matching the user query will be ranked in descending order of their total experience in the relevant domain (teaching, research, coaching, etc.). The system will display only the top 5 experts at a time.
4. Database: A database will store the faculty information extracted by the crawler. The administrator of the Web application will run the crawler and populate the database with updated information.
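The crawling step in module 1 can be sketched roughly as follows. This is a minimal illustration, not a required design: it is written in Python for brevity rather than in either listed toolchain, and the names LinkParser, extract_links, crawl, and the fetch callback are hypothetical. It assumes the crawler stays within one institute's site and that fetch(url) returns the page's HTML text, or None on failure.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag, urlparse


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def extract_links(html, base_url):
    """Return absolute, same-site links found in html, without #fragments."""
    parser = LinkParser()
    parser.feed(html)
    site = urlparse(base_url).netloc
    links = []
    for href in parser.hrefs:
        absolute, _fragment = urldefrag(urljoin(base_url, href))
        # Assumption: only links on the same host are followed.
        if urlparse(absolute).netloc == site and absolute not in links:
            links.append(absolute)
    return links


def crawl(start_url, fetch, max_pages=100):
    """Breadth-first crawl of one site.

    fetch is a hypothetical callback: fetch(url) returns HTML text or None.
    Returns a dict mapping each visited URL to its HTML.
    """
    seen = {start_url}
    queue = deque([start_url])
    pages = {}
    while queue and len(pages) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        pages[url] = html
        for link in extract_links(html, url):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return pages
```

Passing fetch in as a parameter keeps the traversal logic separate from network access, so the crawler can be tried against in-memory pages before it is pointed at real institute sites. Extracting the faculty fields from each page is a separate step that this sketch does not cover.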
Tools: 1. C#, .NET and SQL Server 2. PHP, MySQL and Dreamweaver
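The ranking rule in module 3 (descending order of experience, top 5 shown) can be sketched as below. This is an illustration in Python rather than in either listed toolchain. The Expert record and the rank_experts function are hypothetical names, and the matching rule (a case-insensitive substring match on expertise area, with an optional exact match on expertise type) is an assumption, since the brief does not specify how queries are matched.

```python
from dataclasses import dataclass


@dataclass
class Expert:
    """Hypothetical record holding a subset of the stored faculty fields."""
    name: str
    expertise_area: str
    expertise_type: str      # e.g. "Teaching", "Research", "Coaching"
    experience_years: float


def rank_experts(experts, area, expertise_type=None, top_n=5):
    """Return up to top_n matching experts, most experienced first."""
    area = area.strip().lower()
    matches = [
        e for e in experts
        # Assumed matching rule: substring match on the expertise area.
        if area in e.expertise_area.lower()
        and (expertise_type is None
             or e.expertise_type.lower() == expertise_type.lower())
    ]
    # Descending order of experience in years, as the module requires.
    matches.sort(key=lambda e: e.experience_years, reverse=True)
    return matches[:top_n]
```

A call such as rank_experts(all_experts, "data mining", expertise_type="Research") would return at most five records. In a deployed system the same ordering and limit would more likely be expressed in the database query itself.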