<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:dc="http://purl.org/dc/elements/1.1/" version="2.0">
<channel>
<title>Year-2017</title>
<link>http://repository.iiitd.edu.in/xmlui/handle/123456789/506</link>
<description/>
<pubDate>Sat, 11 Apr 2026 12:49:12 GMT</pubDate>
<dc:date>2026-04-11T12:49:12Z</dc:date>
<item>
<title>Indexing and query processing in RDF quad-stores</title>
<link>http://repository.iiitd.edu.in/xmlui/handle/123456789/703</link>
<description>Indexing and query processing in RDF quad-stores
Leeka, Jyoti; Bedathur, Srikanta (Advisor)
RDF data management has received a lot of attention in the past decade owing to the widespread growth of the Semantic Web and Linked Open Data initiatives. RDF data is expressed in the form of triples (Subject - Predicate - Object), with SPARQL used for querying it. Many novel database systems, such as RDF-3X and TripleBit, which store RDF either in its native form or within traditional relational storage, have demonstrated their ability to scale to large volumes of RDF content. However, it is increasingly obvious from the knowledge representation applications of RDF that it is equally important to integrate additional information with RDF triples, such as source, time and place of occurrence, and uncertainty. Consider the RDF fact (BarackObama, isPresidentOf, UnitedStates). While this fact is useful for finding information regarding the president of the United States, it does not provide sufficient information for answering many challenging questions, such as: what is the temporal validity of this fact? where did this fact come from? &#13;
&#13;
Annotations like confidence, geolocation, time, etc. can be modeled in RDF through a technique called reification, which is also a W3C recommendation. Reification retains the triple nature of RDF and associates annotations using blank nodes. &#13;
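The reification pattern can be illustrated with a short sketch in plain Python. The fact, annotation names and the blank node `_:s1` are hypothetical; real systems store these triples in compressed indexes rather than Python lists.

```python
# W3C-style RDF reification: a statement node carries the subject,
# predicate and object of the original triple, plus its annotations.
def reify(subject, predicate, obj, statement_id, annotations):
    """Expand one annotated fact into plain RDF triples."""
    triples = [
        (statement_id, "rdf:type", "rdf:Statement"),
        (statement_id, "rdf:subject", subject),
        (statement_id, "rdf:predicate", predicate),
        (statement_id, "rdf:object", obj),
    ]
    # Each annotation (e.g. temporal validity, source) attaches to the
    # statement node, not to the original triple itself.
    triples += [(statement_id, p, v) for p, v in annotations.items()]
    return triples

facts = reify("BarackObama", "isPresidentOf", "UnitedStates", "_:s1",
              {"validFrom": "2009", "validUntil": "2017", "source": "Wikipedia"})
```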
&#13;
The focus of this thesis is on the database aspects of storing and querying RDF graphs carrying annotations such as confidence on their triples. We start by developing an RDF database, named RQ-RDF-3X, for efficiently querying such annotated RDF graphs over native RDF triples. Next, we observed that more than 62% of facts in real-world RDF datasets such as YAGO and DBpedia have numerical object values, which suggests the use of queries combining an ORDER-BY clause with the traditional graph pattern queries of SPARQL. State-of-the-art RDF processing systems such as Virtuoso and Jena handle such queries by first collecting the results and then sorting them in memory based on the user-specified function, which does not scale well. In order to efficiently retrieve the results of top-k queries, i.e., queries returning the top-k results ordered by a user-defined scoring function, we developed a top-k query processing database named Quark-X. In Quark-X we propose indexing and query processing techniques that make top-k querying efficient. &#13;
&#13;
Motivated by the importance of geo-spatial data in critical applications such as emergency response, transportation and agriculture, as well as its widespread use in knowledge bases such as YAGO, WikiData and LinkedGeoData, we developed STREAK, an RDF data management system designed to support a wide range of queries with spatial filters, including complex joins and top-k queries over spatially enriched databases. While developing STREAK we realized that, to make effective use of this rich data, it is crucial to efficiently evaluate queries that combine topological and spatial operators (e.g., overlap and distance) with the traditional graph pattern queries of SPARQL. While there have been research efforts towards efficient processing of spatial data in RDF/SPARQL, very little effort has gone into building systems that can handle both complex SPARQL queries and spatial filters. &#13;
&#13;
The novel contributions of each of these engines are described below. &#13;
&#13;
RQ-RDF-3X: RQ-RDF-3X presents extensions to triple-store style RDF storage engines to support reification and quads. In RQ-RDF-3X, we support triple annotations by assigning a unique identifier (R) to each (S, P, O) triple. The fundamental change required is thus to support an additional field (R) that holds the triple identifier. This additional field requires the query optimizer of the triple store to be extended to be aware of the unique characteristics of the triple identifier (R), and it calls for careful re-thinking of the indexing and query optimization approaches adopted by state-of-the-art triple stores. To achieve fast performance in RQ-RDF-3X we propose an efficient set of indices which allows RQ-RDF-3X to reduce query processing time by making use of merge joins. The set of indices is stored compactly using an efficient compression scheme. We demonstrate experimentally that RQ-RDF-3X achieves one to two orders of magnitude speed-up over both commercial and academic engines such as Virtuoso, RDF-3X and Jena-TDB on the real-world datasets YAGO and DBpedia. &#13;
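A minimal sketch of the quad layout and the merge-join idea, in plain Python on hypothetical toy data (the real engine keeps compressed sorted indexes over the permutations of S, P, O and R; this sketch only shows why sorted inputs let joins run as merges):

```python
from itertools import count

# Every (S, P, O) triple gets a unique identifier R, yielding a quad.
_ids = count(1)
quads = [(s, p, o, next(_ids)) for s, p, o in [
    ("obama", "bornIn", "honolulu"),
    ("obama", "presidentOf", "us"),
    ("honolulu", "locatedIn", "us"),
]]

def merge_join(left, right):
    """Merge join two lists sorted on their first component (the join key).
    Simplification: assumes distinct join keys on each side."""
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i][0] < right[j][0]:
            i += 1
        elif left[i][0] > right[j][0]:
            j += 1
        else:
            out.append((left[i], right[j]))
            i += 1
    return out

# Pattern: (?x bornIn ?c) . (?c locatedIn ?y), joined on ?c.
born = sorted((o, s, r) for s, p, o, r in quads if p == "bornIn")
located = sorted((s, o, r) for s, p, o, r in quads if p == "locatedIn")
joined = merge_join(born, located)
```

Because both inputs arrive pre-sorted from an index, the join is a single linear pass with no hashing or re-sorting, which is the property the proposed index set is designed to preserve.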
&#13;
Quark-X: Quark-X is an efficient top-k query processing framework for RDF quad stores. The contributions of Quark-X include novel in-memory synopsis indexes for predicates describing numerical objects. This is in the same spirit as building impact-layered indexes in information retrieval, but carefully redesigned for ranking in reified RDF. Additionally, Quark-X proposes a novel Rank-Hash Join (RHJ) algorithm designed to exploit the synopsis indexes by selectively performing range scans for facts containing numerical objects early on; this is crucial to the overall performance of SPARQL queries that involve a large number of joins. We show experimentally that Quark-X achieves one to two orders of magnitude speed-up over the baseline databases Virtuoso, Jena-TDB, SPARQLRANK and RDF-3X on the YAGO and DBpedia datasets. &#13;
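The benefit of rank-aware processing over sort-after-materialise can be sketched as follows. The data and the single-predicate scan are deliberately simplified stand-ins; the actual RHJ algorithm and synopsis structures are more involved.

```python
# Per-predicate "synopsis": (entity, numerical object) pairs kept sorted
# in descending score order, so a top-k scan can stop after k matches.
population = sorted([("tokyo", 37.4), ("delhi", 31.2), ("shanghai", 27.8),
                     ("saopaulo", 22.4), ("cairo", 21.3)],
                    key=lambda x: -x[1])
in_asia = {"tokyo", "delhi", "shanghai"}  # stand-in for a graph pattern

def top_k(k):
    """Emit the first k results in rank order, touching only a data prefix."""
    out = []
    for city, pop in population:       # already in descending score order
        if city in in_asia:            # evaluate the rest of the pattern
            out.append((city, pop))
            if len(out) == k:          # early termination
                break
    return out
```

A sort-after-materialise engine would evaluate the pattern over all facts and then sort; here the scan stops as soon as k results are ranked, which is the core saving the synopsis indexes enable.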
&#13;
STREAK: STREAK is an efficient engine for processing top-k SPARQL queries with spatial filters. Spatial filters are used to evaluate distance relationships between entities in SPARQL queries. STREAK introduces several novel features: a careful identifier encoding strategy for spatial and non-spatial entities that reduces storage cost and enables early pruning, a semantics-aware Quad-tree index that allows early termination, and a clever use of adaptive query processing with zero plan-switch cost. For our experimental evaluation, we focus on top-k distance join queries and demonstrate that STREAK outperforms popular spatial join algorithms as well as state-of-the-art commercial systems such as Virtuoso.
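STREAK's exact identifier encoding is not spelled out above; a common scheme for locality-preserving identifiers (shown here purely as an assumption, not as STREAK's actual strategy) is Z-order/Morton encoding, which interleaves the bits of quantised coordinates so that spatially close entities receive numerically close identifiers:

```python
def morton(x, y, bits=16):
    """Interleave the bits of two non-negative coordinates (Z-order curve)."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)        # x occupies even bit slots
        code |= ((y >> i) & 1) << (2 * i + 1)    # y occupies odd bit slots
    return code

# Points in the same small cell share a code prefix, so a spatial filter
# can often be answered with a contiguous identifier range scan.
```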
</description>
<pubDate>Fri, 01 Dec 2017 00:00:00 GMT</pubDate>
<guid isPermaLink="false">http://repository.iiitd.edu.in/xmlui/handle/123456789/703</guid>
<dc:date>2017-12-01T00:00:00Z</dc:date>
</item>
<item>
<title>Designing generic asymmetric key cryptosystem with message paddings</title>
<link>http://repository.iiitd.edu.in/xmlui/handle/123456789/619</link>
<description>Designing generic asymmetric key cryptosystem with message paddings
Bansal, Tarun Kumar; Chang, Donghoon (Advisor); Pieprzyk, Josef (Advisor); Sanadhya, Somitra Kumar (Advisor); Boyen, Xavier (Advisor)
RSA-OAEP has been used in the PKCS #1 2.0 standard for a long time. OAEP (optimal asymmetric encryption padding) provides security strength to RSA and other deterministic one-way asymmetric primitives (trapdoor one-way permutations). OAEP has been found useful for hybrid encryption, signcryption, hybrid signcryption, and also as a randomness-recovery scheme. Over time, several proposals modifying OAEP were published in the literature. These proposals give different OAEP versions which differ in efficiency, provable security, compatibility with a type of asymmetric one-way cryptosystem (deterministic or probabilistic), extensions of OAEP to other applications, etc. &#13;
Our work helps in understanding the development of the OAEP framework and its use. As part of our contribution, we describe a different kind of message padding which works as an alternative to OAEP-type schemes. This new message padding scheme is based on an iterated Sponge permutation structure. The Sponge structure comes from symmetric cryptography, where iterated permutations in the form of Sponge functions have proved highly effective at aligning security and efficiency. We call our scheme Sponge-based asymmetric encryption padding (SpAEP). Our scheme achieves semantic security under chosen ciphertext attack (IND-CCA) using any trapdoor one-way permutation in the ideal permutation model, for arbitrary-length messages. IND-CCA is considered the strongest standard security notion, whereas one-wayness is a much weaker one. We also propose a key encapsulation mechanism for hybrid encryption using SpAEP with any trapdoor one-way permutation. SpAEP utilizes the permutation model efficiently in the setting of public key encryption in a novel manner. &#13;
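The absorb/squeeze structure underlying a Sponge-based padding can be sketched as follows. The tiny state and the mixing function here are purely illustrative stand-ins: they are NOT cryptographically secure, and they are not SpAEP's actual permutation or parameters.

```python
RATE, WIDTH = 2, 4  # toy parameters: 2-byte rate, 2-byte capacity

def mix(state):
    """Stand-in mixing step (a real Sponge uses a fixed public permutation)."""
    s = list(state)
    for r in range(2):
        s = [(s[i] * 5 + s[(i + 1) % WIDTH] + i + r) & 0xFF
             for i in range(WIDTH)]
    return bytes(s)

def sponge(message, out_len):
    state = bytes(WIDTH)
    # 10*-style padding up to a whole number of rate-sized blocks.
    msg = message + b"\x01" + b"\x00" * (-(len(message) + 1) % RATE)
    for i in range(0, len(msg), RATE):           # absorb phase
        block = msg[i:i + RATE] + bytes(WIDTH - RATE)
        state = mix(bytes(a ^ b for a, b in zip(state, block)))
    out = b""
    while len(out) < out_len:                    # squeeze phase
        out += state[:RATE]
        state = mix(state)
    return out[:out_len]
```

Only the rate portion of the state is ever exposed or XORed with input; the hidden capacity is what a real Sponge's security arguments rest on.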
A primary limitation of OAEP-type schemes is their incompatibility with probabilistic asymmetric one-way secure cryptosystems (e.g., ElGamal). We study the reasons behind this limitation and are able to extend the scope of usage from deterministic (e.g., RSA) to probabilistic (e.g., ElGamal) functions, along with efficiency improvements in SpAEP. We denote the new modified Sponge-based padding as SpPad-Pe, where SpPad-Pe stands for Sponge-based Padding (SpPad) with an asymmetric one-way cryptosystem (Pe). &#13;
The concepts and techniques used as a base for constructing Sponge-based message padding also yield a strongly secure generic asymmetric encryption scheme built from a weakly secure asymmetric cryptosystem. Instead of using a specific Sponge-based construction, we introduce a more generic framework to build a CCA-secure PKE, called REAL, which stands for Real-time CCA-secure Encryption for Arbitrary Long Messages. An asymmetric one-way secure cryptosystem, a one-time secure symmetric encryption scheme and two hash functions are sufficient for this design. Compared to previous works, the proposed design provides a streaming option without compromising other valuable features. &#13;
We exploit the versatile nature of the Sponge construction in another area of cryptography known as signcryption. The aim of signcryption is to provide both confidentiality and authentication of messages more efficiently than performing encryption and signing independently. The "Commit-then-Sign&amp;Encrypt" (CtS&amp;E) composition method allows encryption and signing to be performed in parallel, which decreases the computation time needed to signcrypt a message. We put forward the application of Sponge-structure-based message padding as an alternative to the commitment scheme in constructing a signcryption scheme. We propose a provably secure signcryption scheme using weak asymmetric primitives such as trapdoor one-way encryption and a universally unforgeable signature. Using simple tricks, we also demonstrate how different combinations of probabilistic/deterministic encryption and signature schemes satisfying weaker security requirements can be utilized without compromising the security of the scheme. To the best of our knowledge, this is the first signcryption scheme based on the Sponge structure, and it offers maximum security using weak underlying asymmetric primitives along with the ability to handle long messages. &#13;
This thesis follows a step-by-step construction of efficient and secure cryptosystems, progressing from basic to complex structures. It emphasizes the importance of message pre-processing techniques and their usage by providing generic and efficient cryptosystems.
</description>
<pubDate>Sun, 01 Oct 2017 00:00:00 GMT</pubDate>
<guid isPermaLink="false">http://repository.iiitd.edu.in/xmlui/handle/123456789/619</guid>
<dc:date>2017-10-01T00:00:00Z</dc:date>
</item>
<item>
<title>Design and analysis of password-based authentication systems</title>
<link>http://repository.iiitd.edu.in/xmlui/handle/123456789/598</link>
<description>Design and analysis of password-based authentication systems
Mishra, Sweta; Chang, Donghoon (Advisor); Sanadhya, Somitra Kumar (Advisor)
Passwords have been the most widely deployed means of human-computer authentication since the early 1960s. The use of passwords, which are usually low in entropy, is delicate in cryptography because of the possibility of launching an offline dictionary attack, and it remains challenging to design a password-based cryptosystem that is secure against this attack. Password-based cryptosystems broadly cover two areas: 1) password-based authentication, e.g., password hashing schemes, and 2) password-based encryption, specifically as used in password-based authenticated key exchange (PAKE) protocols. This thesis is devoted to the secure design of a password hashing algorithm and the analysis of existing password-based authentication systems.&#13;
&#13;
The frequent reporting of password database leakage in the real world highlights the vulnerabilities in current password-based constructions. In order to alleviate these problems and to encourage strong password protection techniques, a Password Hashing Competition (PHC) was held from 2013 to 2015. Following the announced criteria, we propose a password hashing scheme, Rig, that fulfills all the required goals. We also present a cryptanalytic technique for password hashing. Further, we focus on the improvement of a password database breach detection technique and on the analysis of the Universal 2nd Factor protocol. This report also lists and summarizes the important results published in the field of password hashing in recent years, surveying the extent of research on password-based authentication schemes. Our significant results are listed below.&#13;
&#13;
1.	Following the design requirements for a secure password hashing scheme as mentioned at the PHC [16], we present our design Rig, which satisfies all the required criteria. It is a memory-hard algorithm and among the best performing in the cache-timing-attack-resistant category. As part of the results, we present the construction, explain the design rationale, and give a proof of its collision resistance. We also provide performance and security analyses.&#13;
&#13;
2.	In practice, most cryptographic designs are implemented inside a cryptographic module, as suggested by the National Institute of Standards and Technology (NIST) in the FIPS 140 standard. A cryptographic module has limited memory, which makes it challenging to implement a password hashing scheme (PHS) inside it. We provide a cryptographic-module-based approach for password hashing that helps enhance the security of the existing password-based authentication framework. We also discuss the feasibility of the approach for the submissions of the PHC.&#13;
&#13;
3.	The increasing threat of password leakage from compromised password hashes demands a resource-consuming algorithm to prevent precomputation of the password hashes. Password hashing designs which ensure that any reduction in memory leads to an exponential increase in runtime are called memory-hard designs. The Time-Memory Tradeoff (TMTO) technique is an effective cryptanalytic approach for such password hashing schemes (PHS); however, it is generally difficult to evaluate the "memory hardness" of a given PHS design. We present a simple technique to analyze the TMTO of any password hashing scheme that can be represented as a directed acyclic graph.&#13;
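The directed-acyclic-graph view can be made concrete with a toy example. The graph wiring below is hypothetical (it is not Rig or any published PHS): each node's label depends on its predecessor and on one earlier node, so discarding stored labels forces recomputation, which is exactly the tradeoff TMTO analysis measures.

```python
import hashlib

N = 20
dep = [i // 2 for i in range(N)]  # hypothetical back-edge: node i also needs node i//2

def h(*parts):
    return hashlib.sha256(b"|".join(parts)).digest()

def eval_full_memory(n):
    """Store every label: exactly n + 1 hash evaluations."""
    labels = [h(b"seed")]
    for i in range(1, n + 1):
        labels.append(h(labels[i - 1], labels[dep[i]]))
    return labels[n], n + 1

def eval_no_memory(n, counter):
    """Store nothing: every dependency is recomputed on demand."""
    counter[0] += 1
    if n == 0:
        return h(b"seed")
    return h(eval_no_memory(n - 1, counter), eval_no_memory(dep[n], counter))

calls = [0]
top_full, full_cost = eval_full_memory(N - 1)
top_none = eval_no_memory(N - 1, calls)
# Same final label either way, but dropping memory inflates the hash
# count: the time-memory tradeoff a memory-hard design tries to make punishing.
```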
&#13;
4.	Password database breaches are common among hackers; however, it is difficult to detect such breaches unless they are somehow disclosed by the attacker. A paper by Juels et al. provides a method for detecting password database breaches known as 'Honeywords'. Very little research has been reported in this direction. Realizing its importance, we analyse the limitations of existing honeyword generation techniques, propose a new attack model, and present new and practical honeyword generation techniques.&#13;
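The basic honeyword mechanism of Juels and Rivest can be sketched briefly; the passwords and decoys below are hypothetical, and a real deployment would store salted hashes rather than plaintext sweetwords.

```python
import secrets

def make_sweetwords(real_password, decoys):
    """Mix the real password with decoy 'honeywords'; only a separate
    honeychecker server learns which index is the real one."""
    words = decoys + [real_password]
    secrets.SystemRandom().shuffle(words)
    return words, words.index(real_password)

def login(words, honey_index, attempt):
    """Site-side check: a decoy match means the word list itself leaked."""
    if attempt not in words:
        return "reject"
    return "accept" if words.index(attempt) == honey_index else "breach-alert"
```

An attacker who steals the word list cannot tell which entry is real; guessing a decoy at login triggers the breach alert, which is the detection property the thesis's generation techniques aim to strengthen.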
&#13;
5.	A secure password hashing construction can prevent offline dictionary attacks, but it cannot provide resistance to common online attacks. Augmenting simple password-based authentication with a second factor has therefore become a recent trend. The U2F protocol, proposed by the Fast IDentity Online (FIDO) alliance in 2014, has been introduced as a strong augmentation that can prevent online attacks currently faced in practice. A thorough third-party analysis is required to verify the claims of the U2F developers. We therefore analyse the U2F protocol and show that it is not secure against side channel attacks. We then present a new variant of the U2F protocol with improved security guarantees.&#13;
&#13;
In terms of memory hardness and performance, the design Rig presented in this thesis is among the best-known algorithms for password hashing [15, 85, 140]. The other results presented are significant contributions beyond previously published work.
</description>
<pubDate>Sun, 01 Oct 2017 00:00:00 GMT</pubDate>
<guid isPermaLink="false">http://repository.iiitd.edu.in/xmlui/handle/123456789/598</guid>
<dc:date>2017-10-01T00:00:00Z</dc:date>
</item>
<item>
<title>Pushing boundaries of face recognition : adversary, heterogeneity, and scale</title>
<link>http://repository.iiitd.edu.in/xmlui/handle/123456789/595</link>
<description>Pushing boundaries of face recognition : adversary, heterogeneity, and scale
Dhamecha, Tejas Indulal; Singh, Richa (Advisor); Vatsa, Mayank (Advisor)
Due to the unconstrained nature of data capture and non-cooperative subjects, automatic face recognition is still a research challenge for application scenarios such as law enforcement. We observe that the challenges of face recognition are broadly rooted in two facets: (1) non-ideal and possibly adversarial face image samples, and (2) the large size and incremental/streaming availability of data. The first facet encompasses challenges such as intentional or unintentional obfuscation of identity, attempts at spoofing the system, user non-cooperation, and large intra-subject variations in heterogeneous face recognition. The second facet covers challenges arising in application scenarios such as repeat offender identification and surveillance, where the data is either large scale or available incrementally. Along with advancing face recognition research by addressing the challenges arising from both of these facets, this dissertation also contributes to pattern classification research by abstracting the research problems at the classifier level and proposing feature-independent solutions to some of them.&#13;
&#13;
The first contribution addresses the challenge of face obfuscation due to the use of disguise accessories. We collect and benchmark the IIIT In and Beyond Visible Spectrum Face Dataset (I2BVSD), pertaining to 75 subjects, which covers various types of disguises applied to different individuals; it has become one of the most widely used disguise face datasets in the research community. Since disguised facial regions can lead to erroneous identity prediction, a texture-based algorithm is designed to differentiate between biometric and non-biometric facial patches. The proposed approach is combined with a local face recognition algorithm to address the challenge of disguise variations, and is further enhanced with the use of thermal spectrum imaging. As the second contribution, the dissertation addresses heterogeneous face matching scenarios, such as matching a sketch against a mugshot dataset of digital photographs, cross-spectrum matching, and cross-resolution matching, which arise in a wide range of law enforcement settings. Heterogeneous Discriminant Analysis (HDA) is designed to encode multi-view heterogeneity in the classifier to obtain a projection space more suitable for matching. Further, to extend the proposed technique to nonlinear projections, a kernel HDA formulation is proposed. Focusing on applications such as the identification of repeat offenders, as the third contribution we develop an approach to efficiently update the face recognition engine with incremental training data. The proposed Incremental Semi-Supervised Discriminant Analysis (ISSDA) provides a mechanism to update the discriminatory projection directions efficiently, in terms of both accuracy and training time. The proposed approach capitalizes on offline unlabeled face image data, which is inexpensive to obtain and generally available in abundance. 
The fourth contribution of this dissertation focuses on designing a face recognition classifier that can be learned efficiently from very large batches of training data. The proposed approach, termed Subclass Reduced Set Support Vector Machine (SRS-SVM), utilizes the subclass structure of the training data to effectively estimate the candidate support vector set, which facilitates learning a nonlinear Support Vector Machine from large-scale face data in less computation time.
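The candidate-selection idea can be sketched on synthetic data. This is an illustrative approximation, not the published SRS-SVM algorithm: points of each subclass lying closest to an opposite-class subclass centroid are retained as likely support vectors, shrinking the set fed to SVM training.

```python
import numpy as np

rng = np.random.default_rng(0)
# Two classes with two hypothetical subclasses each (e.g. pose/illumination
# clusters), 50 two-dimensional points per subclass.
subclasses = {
    (+1, 0): rng.normal([0, 0], 0.3, (50, 2)),
    (+1, 1): rng.normal([0, 2], 0.3, (50, 2)),
    (-1, 0): rng.normal([3, 0], 0.3, (50, 2)),
    (-1, 1): rng.normal([3, 2], 0.3, (50, 2)),
}

def candidates(per_subclass=5):
    means = {k: v.mean(axis=0) for k, v in subclasses.items()}
    keep = []
    for (label, _), pts in subclasses.items():
        # Distance from each point to the nearest opposite-class centroid.
        opp = np.array([m for (l, _), m in means.items() if l != label])
        d = np.linalg.norm(pts[:, None, :] - opp[None, :, :], axis=2).min(axis=1)
        keep.append(pts[np.argsort(d)[:per_subclass]])  # boundary points
    return np.vstack(keep)

cand = candidates()
# 4 subclasses x 5 points = 20 candidates instead of 200 training points.
```

Training a nonlinear SVM only on such boundary candidates is what makes large-batch learning tractable in this setting.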
</description>
<pubDate>Sat, 01 Jul 2017 00:00:00 GMT</pubDate>
<guid isPermaLink="false">http://repository.iiitd.edu.in/xmlui/handle/123456789/595</guid>
<dc:date>2017-07-01T00:00:00Z</dc:date>
</item>
</channel>
</rss>
