Abstract:
This B.Tech project focuses on the development of the IFHP portal, i.e. the Integrated Federated Health Portal, a Tavlab portal currently hosted at https://federatedhealthplatform.tavlab.iiitd.edu.in/. Our main task was to work on the community data page of the platform, which had many new requirements arising from the work done by other teams on the IFHP project. The portal is built on Node.js and uses embedded JavaScript, along with Python scripts on the backend (Flask) that run multiple computations on the data through data pipelines. Earlier, the portal held close to 2200 datasets from Zenodo and Figshare along with their descriptions, and offered a functionality to find their cosine and Jaccard scores. We were then given an updated Figshare dataset with close to 21000 files, and had to incorporate all the existing functionalities and modify the logic to match the new format of the score file and the description file. The main difference was the size of the file, roughly 4.5 GB this time, so more efficient and faster techniques had to be used to compute cosine scores on the new files. The portal also needed visual enhancements for the cosine score calculation: a loading screen, colour coding for the scores, appropriate text giving the status of the computation, and clearing of the score when new files are selected. Another task was to incorporate an easy-to-use search functionality over these files, since finding suitable files through a dropdown is infeasible for 21000 files; even with 2200 files, finding relevant ones was cumbersome. The requirement was to add the new search functionality while keeping the original dropdown in its original order. The search also had to be populated automatically according to the format of our file names, to make it easier for the user to understand.
After this, the requirement was to build a new feature that returns the top 10 most similar files, with their names and cosine scores, for the file currently selected in the dropdown, arranged in descending order of score and computed efficiently given the huge size of the cosine matrix. A Python script for this was written and well tested to implement the functionality for both dropdowns, and similar visual loading features were added for it. Finally, a few other visual enhancements were made to the search box, to scrolling in the dropdown, and to the display of the top 10 similar scores. In the last part of the semester, we incorporated two more new datasets into the portal, created a new mechanism for their selection and loading, and displayed the top 10 scores in a well-formatted table.
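
As an illustration of the top 10 lookup described above, the following is a minimal sketch and not the project's actual script. It assumes that one row of the cosine matrix (the scores of the selected file against all others) is already available as a numeric array; the function name top_k_similar and its parameters are hypothetical. It uses NumPy's argpartition so that only the k best entries are sorted rather than the whole row.

```python
import numpy as np

def top_k_similar(score_row, names, query_index, k=10):
    """Return up to k (name, score) pairs with the highest scores, best first.

    score_row   : cosine scores of the selected file against every file (assumed input)
    names       : file names, in the same order as score_row
    query_index : position of the selected file, which is excluded from the result
    """
    scores = np.asarray(score_row, dtype=float).copy()
    scores[query_index] = -np.inf      # never report the selected file itself
    k = min(k, len(scores) - 1)        # cannot return more files than exist
    if k <= 0:
        return []
    # argpartition places the k largest entries in the last k slots
    # in linear time, without sorting the full row.
    candidates = np.argpartition(scores, -k)[-k:]
    # Sort only those k candidates, in descending order of score.
    ordered = candidates[np.argsort(scores[candidates])[::-1]]
    return [(names[i], float(scores[i])) for i in ordered]

# Hypothetical usage with made-up names and scores:
example_names = ["file_a", "file_b", "file_c", "file_d"]
example_row = [1.0, 0.2, 0.9, 0.5]
print(top_k_similar(example_row, example_names, query_index=0, k=2))
```

For a matrix too large to hold in memory, the row could in principle be read from disk on demand instead of loading the whole file; how the portal stores and reads its score file is not specified here.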