Min-Hashing for Probabilistic Frequent Subtree Feature Spaces.

LANCASTER UNIVERSITY LEIPZIG

Text available via DOI:

https://doi.org/10.1007/978-3-319-46307-0_5
Final published version

Research output: Contribution in Book/Report/Proceedings - With ISBN/ISSN › Conference contribution/Paper › peer-review

Published

Pascal Welke
Tamás Horváth
Stefan Wrobel

Publication date	21/09/2016
Host publication	Min-Hashing for Probabilistic Frequent Subtree Feature Spaces.
Publisher	Springer, Cham
Pages	67-82
Number of pages	16
Volume	9956
Edition	1
ISBN (electronic)	9783319463070
ISBN (print)	9783319463063
Original language	Undefined/Unknown
Event	19th International Conference, DS 2016 - Bari, Italy; Duration: 19/10/2016 → 21/10/2016

Conference

Conference	19th International Conference, DS 2016
City	Bari
Period	19/10/16 → 21/10/16

Publication series

Name	Lecture Notes in Computer Science
Publisher	Springer
Volume	9956
ISSN (Print)	0302-9743
ISSN (electronic)	1611-3349

Abstract

We propose a fast algorithm for approximating graph similarities. For its advantageous semantic and algorithmic properties, we define the similarity between two graphs by the Jaccard-similarity of their images in a binary feature space spanned by the set of frequent subtrees generated for some training dataset. Since the feature space embedding is computationally intractable, we use a probabilistic subtree isomorphism operator based on a small sample of random spanning trees and approximate the Jaccard-similarity by min-hash sketches. The partial order on the feature set defined by subgraph isomorphism allows for a fast calculation of the min-hash sketch, without explicitly performing the feature space embedding. Experimental results on real-world graph datasets show that our technique results in a fast algorithm. Furthermore, the approximated similarities are well-suited for classification and retrieval tasks in large graph datasets.
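
The min-hash idea mentioned in the abstract can be illustrated with a minimal Python sketch. This is generic min-hashing over explicitly given feature sets, not the authors' algorithm: the paper's method avoids computing the feature space embedding by exploiting the partial order on the frequent subtrees, which is not reproduced here. The function names, the integer feature indices, and the use of random permutations are assumptions made for illustration only.

```python
import random


def jaccard(a, b):
    """Exact Jaccard similarity of two feature sets."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # convention assumed here: two empty sets are identical
    return len(a & b) / len(a | b)


def make_permutations(num_features, k, seed=0):
    """Draw k random permutations of the feature indices 0..num_features-1.

    perms[i][f] is the rank of feature f under the i-th permutation.
    """
    rng = random.Random(seed)
    perms = []
    for _ in range(k):
        ranks = list(range(num_features))
        rng.shuffle(ranks)
        perms.append(ranks)
    return perms


def minhash_sketch(features, perms):
    """Sketch = for each permutation, the smallest rank among the set's features.

    An empty feature set yields None in every position.
    """
    return [min((ranks[f] for f in features), default=None) for ranks in perms]


def estimate_jaccard(sketch_a, sketch_b):
    """Fraction of sketch positions that agree; estimates the Jaccard similarity."""
    matches = sum(1 for x, y in zip(sketch_a, sketch_b) if x == y)
    return matches / len(sketch_a)


# Hypothetical binary feature vectors, written as sets of feature indices.
graph_a = set(range(0, 60))
graph_b = set(range(30, 90))
perms = make_permutations(num_features=100, k=500)
estimate = estimate_jaccard(minhash_sketch(graph_a, perms),
                            minhash_sketch(graph_b, perms))
print(jaccard(graph_a, graph_b), estimate)
```

For a single random permutation, the probability that two sets share the same minimum-rank element equals their Jaccard similarity, so the fraction of agreeing sketch positions is an unbiased estimate whose variance shrinks as the number of permutations k grows.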

Bibliographic note

DBLP's bibliographic metadata records provided through http://dblp.org/search/publ/api are distributed under a Creative Commons CC0 1.0 Universal Public Domain Dedication. Although the bibliographic metadata records are provided consistent with CC0 1.0 Dedication, the content described by the metadata records is not. Content may be subject to copyright, rights of privacy, rights of publicity and other restrictions.
