Home > Research > Publications & Outputs > Min-Hashing for Probabilistic Frequent Subtree ...
View graph of relations

Min-Hashing for Probabilistic Frequent Subtree Feature Spaces.

Research output: Contribution in Book/Report/Proceedings - With ISBN/ISSNConference contribution/Paperpeer-review

Published
Close
Publication date21/09/2016
Host publicationMin-Hashing for Probabilistic Frequent Subtree Feature Spaces.
PublisherSpringer, Cham
Pages67-82
Number of pages15
Volume9956
Edition1
ISBN (electronic)9783319463070
ISBN (print)9783319463063
<mark>Original language</mark>Undefined/Unknown
Event19th International Conference, DS 2016 - Italy, Bari
Duration: 19/10/201621/10/2016

Conference

Conference19th International Conference, DS 2016
CityBari
Period19/10/1621/10/16

Publication series

NameLecture Notes in Computer Science
PublisherSpringer
Volume9956
ISSN (Print)0302-9743
ISSN (electronic)1611-3349

Conference

Conference19th International Conference, DS 2016
CityBari
Period19/10/1621/10/16

Abstract

We propose a fast algorithm for approximating graph similarities. For its advantageous semantic and algorithmic properties, we define the similarity between two graphs by the Jaccard-similarity of their images in a binary feature space spanned by the set of frequent subtrees generated for some training dataset. Since the feature space embedding is computationally intractable, we use a probabilistic subtree isomorphism operator based on a small sample of random spanning trees and approximate the Jaccard-similarity by min-hash sketches. The partial order on the feature set defined by subgraph isomorphism allows for a fast calculation of the min-hash sketch, without explicitly performing the feature space embedding. Experimental results on real-world graph datasets show that our technique results in a fast algorithm. Furthermore, the approximated similarities are well-suited for classification and retrieval tasks in large graph datasets.

Bibliographic note

DBLP's bibliographic metadata records provided through http://dblp.org/search/publ/api are distributed under a Creative Commons CC0 1.0 Universal Public Domain Dedication. Although the bibliographic metadata records are provided consistent with CC0 1.0 Dedication, the content described by the metadata records is not. Content may be subject to copyright, rights of privacy, rights of publicity and other restrictions.