<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
<title>Year-2013</title>
<link href="http://repository.iiitd.edu.in/xmlui/handle/123456789/78" rel="alternate"/>
<subtitle/>
<id>http://repository.iiitd.edu.in/xmlui/handle/123456789/78</id>
<updated>2026-04-10T21:16:26Z</updated>
<dc:date>2026-04-10T21:16:26Z</dc:date>
<entry>
<title>Label constrained shortest path estimation on large graphs</title>
<link href="http://repository.iiitd.edu.in/xmlui/handle/123456789/275" rel="alternate"/>
<author>
<name>Likhyani, Ankita</name>
</author>
<author>
<name>Bedathur, Srikanta (Advisor)</name>
</author>
<id>http://repository.iiitd.edu.in/xmlui/handle/123456789/275</id>
<updated>2021-12-13T08:54:20Z</updated>
<published>2015-06-18T08:39:29Z</published>
<summary type="text">Label constrained shortest path estimation on large graphs
Likhyani, Ankita; Bedathur, Srikanta (Advisor)
In applications arising in massive on-line social networks, biological networks, and knowledge graphs, it is often required to find the shortest path between two given nodes. Recent results have addressed the problem of efficiently computing either exact or good approximate shortest path distances. Some of these techniques also quickly return the path corresponding to the estimated shortest path distance.
Many real-world graphs are edge-labeled graphs, i.e., each edge has a label that denotes the relationship between the two vertices connected by the edge. However, none of the techniques for estimating shortest paths work very well when there are additional constraints on the labels associated with the edges that constitute the path.
In this work, we define the problem of retrieving the shortest path between two given nodes that also satisfies user-provided constraints on the set of edge labels involved in the path. We have developed the SkIt index structure, which supports a wide range of label constraints on paths and returns an accurate estimation of the shortest path that satisfies the constraints. We have conducted experiments over graphs such as social networks and knowledge graphs that contain millions of nodes/edges, and show that the SkIt index is fast, accurate in the estimated distance, and has a high recall for paths that satisfy the constraints.
</summary>
<dc:date>2015-06-18T08:39:29Z</dc:date>
</entry>
<entry>
<title>MIMANSA : process mining software repositories from student projects in an undergraduate software engineering course</title>
<link href="http://repository.iiitd.edu.in/xmlui/handle/123456789/115" rel="alternate"/>
<author>
<name>Mittal, Megha</name>
</author>
<author>
<name>Sureka, Ashish (Advisor)</name>
</author>
<id>http://repository.iiitd.edu.in/xmlui/handle/123456789/115</id>
<updated>2017-07-24T17:13:54Z</updated>
<published>2014-01-24T04:21:36Z</published>
<summary type="text">MIMANSA : process mining software repositories from student projects in an undergraduate software engineering course
Mittal, Megha; Sureka, Ashish (Advisor)
An undergraduate-level Software Engineering course generally consists of a team-based, semester-long project and emphasizes both technical and managerial skills. Software Engineering is a practice-oriented and applied discipline, and hence there is an emphasis on hands-on development, process, and usage of tools in addition to theory and basic concepts. We present an approach for mining the process data (process mining) from software repositories archiving data generated as a result of constructing software by student teams in an educational setting. We present an application of mining three software repositories: team wiki (used during requirements engineering), version control system (development and maintenance) and issue tracking system (corrective and adaptive maintenance) in the context of an undergraduate Software Engineering course. We propose visualizations, metrics and algorithms to provide an insight into practices and procedures followed during various phases of a software development life-cycle. The proposed visualizations and metrics (learning analytics) provide a multi-faceted view to the instructor, serving as a feedback tool on the development process and the quality of the students' work. We mine the event logs produced by software repositories and derive insights such as degree of individual contributions in a team, quality of commit messages, intensity and consistency of commit activities, bug fixing process trend and quality, component and developer entropy, process compliance and verification. We present our empirical analysis on a software repository dataset consisting of 19 teams of 5 members each and discuss challenges, limitations and recommendations.
</summary>
<dc:date>2014-01-24T04:21:36Z</dc:date>
</entry>
<entry>
<title>OCEAN: open-source collation of eGovernment data and networks</title>
<link href="http://repository.iiitd.edu.in/xmlui/handle/123456789/113" rel="alternate"/>
<author>
<name>Gupta, Srishti</name>
</author>
<author>
<name>Kumaraguru, Ponnurangam (Advisor)</name>
</author>
<id>http://repository.iiitd.edu.in/xmlui/handle/123456789/113</id>
<updated>2017-07-24T17:14:57Z</updated>
<published>2013-12-03T03:52:32Z</published>
<summary type="text">OCEAN: open-source collation of eGovernment data and networks
Gupta, Srishti; Kumaraguru, Ponnurangam (Advisor)
The awareness and sense of privacy have increased in the minds of people over the past few years. Earlier, people were not very restrictive in sharing their personal information, but now they are more cautious in sharing it with strangers, either in person or online. With such privacy expectations and attitudes, it is difficult to embrace the fact that a lot of information is publicly available on the web. Information portals in the form of the e-governance websites run by the Delhi Government in India provide access to such PII without any anonymization. Several databases, e.g., voter rolls, driving licence numbers, the MTNL phone directory, and PAN cards, serve as repositories of personal information of Delhi residents. This large amount of available personal information can be exploited due to the absence of a proper written law on privacy in India. PII can also be collected from various social networking sites like Facebook, Twitter, GooglePlus, etc., where users share some information about themselves. Since users themselves post this information, it may not be considered a privacy breach, but if the information is aggregated, it may reveal much more, resulting in a bigger threat. For example, data from social networks and open government databases can be combined to connect an online identity to a real-world identity. Even though the awareness about privacy has increased, the threats possible due to the availability of this large amount of personal data are still unknown. To bring such issues to public notice, we developed Open-source Collation of eGovernment data And Networks (OCEAN), a system where the user enters a little information (e.g., name) about a person and gets a large amount of personal information about him/her, such as name, age, address, date of birth, mother's name, father's name, voter ID, driving licence number, and PAN. On aggregation of information within the Voter ID database, OCEAN creates a family tree of the user, giving out the details of his/her family members as well. We also calculated a privacy score, which quantifies the risk associated with an individual in terms of how much PII of that person is revealed from open government data sources. 1,693 users had the highest privacy score, making them the most vulnerable to risks. Using OCEAN, we could collect 8,195,053 voter roll; 2,24,982 driving licence; 53,419 PAN card number; 1,557,715 Twitter; 3,377,102 Facebook; 29,393 Foursquare; 1,86,798 LinkedIn and 28,900 GooglePlus records. There exist several websites like Yasni, PeekYou, and Pipl which help in searching for a person on the Internet but are not focused on people living in Delhi. We performed a user evaluation of OCEAN in a survey study to evaluate its usability, effectiveness and impact, and showed that users like it and find it convenient to use in the real world. We received 661 total hits (657 unique visitors) from the day we released the system, January 21, 2013, until October 10, 2013. To the best of our knowledge, this is the first real-world deployed tool which provides personal information about residents of Delhi to everyone free of cost.
</summary>
<dc:date>2013-12-03T03:52:32Z</dc:date>
</entry>
<entry>
<title>Geographical visualization approach to perceive spatial scan statistics : an analysis of dengue fever outbreaks in Delhi</title>
<link href="http://repository.iiitd.edu.in/xmlui/handle/123456789/112" rel="alternate"/>
<author>
<name>Mala, Shuchi</name>
</author>
<author>
<name>Sengupta, Raja (Advisor)</name>
</author>
<id>http://repository.iiitd.edu.in/xmlui/handle/123456789/112</id>
<updated>2017-07-24T17:14:48Z</updated>
<published>2013-11-25T09:57:52Z</published>
<summary type="text">Geographical visualization approach to perceive spatial scan statistics : an analysis of dengue fever outbreaks in Delhi
Mala, Shuchi; Sengupta, Raja (Advisor)
In India, there is a strong need for a nation-wide disease surveillance system. As of now there are very few surveillance systems in India to detect disease outbreaks. IDSP (Integrated Disease Surveillance Project) was launched by the Government of India with the assistance of the World Bank to detect and respond to disease outbreaks quickly. Still, efforts are needed to strengthen the disease surveillance and response system for early detection of disease outbreaks. The strongest pillar of an accurate disease surveillance system is data related to cases and various risk factors. After data collection, the next important step is transformation of the collected data into meaningful information. Precise statistical methods are then required to analyse the information at hand. Disease outbreaks are detected using statistical analysis tools, but for effective disease control a visualization approach is required. Without appropriate visualization it is very difficult to interpret the results of analysis. In the work presented here, a statistical analysis is performed to detect space-time disease clusters, and then the developed visualization approach is used to visualize the disease outbreaks. SaTScan software is integrated with the visualization approach to detect the location of disease clusters and to test whether the detected clusters are statistically significant. Without the developed visualization approach, users would have to run SaTScan software for each disease per data source. Hence, the presented work provides an extremely efficient and accurate technique for early detection of disease outbreaks in the region covered by the surveillance system.
</summary>
<dc:date>2013-11-25T09:57:52Z</dc:date>
</entry>
</feed>
