Min-Hashing for Probabilistic Frequent Subtree Feature Spaces.

LANCASTER UNIVERSITY LEIPZIG

Text available via DOI:

https://doi.org/10.1007/978-3-319-46307-0_5
Final published version

Research output: Contribution in Book/Report/Proceedings - With ISBN/ISSN › Conference contribution/Paper › peer-review

Published

Standard

Min-Hashing for Probabilistic Frequent Subtree Feature Spaces. / Welke, Pascal; Horváth, Tamás; Wrobel, Stefan.
Min-Hashing for Probabilistic Frequent Subtree Feature Spaces.. Vol. 9956 1. ed. Springer, Cham, 2016. p. 67-82 (Lecture Notes in Computer Science; Vol. 9956).

Research output: Contribution in Book/Report/Proceedings - With ISBN/ISSN › Conference contribution/Paper › peer-review

Harvard

Welke, P, Horváth, T & Wrobel, S 2016, Min-Hashing for Probabilistic Frequent Subtree Feature Spaces. in Min-Hashing for Probabilistic Frequent Subtree Feature Spaces.. 1 edn, vol. 9956, Lecture Notes in Computer Science, vol. 9956, Springer, Cham, pp. 67-82, 19th International Conference, DS 2016, Bari, 19/10/16. https://doi.org/10.1007/978-3-319-46307-0_5

APA

Welke, P., Horváth, T., & Wrobel, S. (2016). Min-Hashing for Probabilistic Frequent Subtree Feature Spaces. In Min-Hashing for Probabilistic Frequent Subtree Feature Spaces. (1 ed., Vol. 9956, pp. 67-82). (Lecture Notes in Computer Science; Vol. 9956). Springer, Cham. https://doi.org/10.1007/978-3-319-46307-0_5

Vancouver

Welke P, Horváth T, Wrobel S. Min-Hashing for Probabilistic Frequent Subtree Feature Spaces. In Min-Hashing for Probabilistic Frequent Subtree Feature Spaces.. 1 ed. Vol. 9956. Springer, Cham. 2016. p. 67-82. (Lecture Notes in Computer Science). doi: 10.1007/978-3-319-46307-0_5

Author

Welke, Pascal ; Horváth, Tamás ; Wrobel, Stefan. / Min-Hashing for Probabilistic Frequent Subtree Feature Spaces. Min-Hashing for Probabilistic Frequent Subtree Feature Spaces.. Vol. 9956 1. ed. Springer, Cham, 2016. pp. 67-82 (Lecture Notes in Computer Science).

Bibtex

@inproceedings{a00d19797199437ebb1f6a629f75b928,

title = "Min-Hashing for Probabilistic Frequent Subtree Feature Spaces.",

abstract = "We propose a fast algorithm for approximating graph similarities. For its advantageous semantic and algorithmic properties, we define the similarity between two graphs by the Jaccard-similarity of their images in a binary feature space spanned by the set of frequent subtrees generated for some training dataset. Since the feature space embedding is computationally intractable, we use a probabilistic subtree isomorphism operator based on a small sample of random spanning trees and approximate the Jaccard-similarity by min-hash sketches. The partial order on the feature set defined by subgraph isomorphism allows for a fast calculation of the min-hash sketch, without explicitly performing the feature space embedding. Experimental results on real-world graph datasets show that our technique results in a fast algorithm. Furthermore, the approximated similarities are well-suited for classification and retrieval tasks in large graph datasets.",

author = "Pascal Welke and Tam{\'a}s Horv{\'a}th and Stefan Wrobel",

note = "DBLP's bibliographic metadata records provided through http://dblp.org/search/publ/api are distributed under a Creative Commons CC0 1.0 Universal Public Domain Dedication. Although the bibliographic metadata records are provided consistent with CC0 1.0 Dedication, the content described by the metadata records is not. Content may be subject to copyright, rights of privacy, rights of publicity and other restrictions.; 19th International Conference, DS 2016 ; Conference date: 19-10-2016 Through 21-10-2016",

year = "2016",

month = sep,

day = "21",

doi = "10.1007/978-3-319-46307-0_5",

language = "Undefined/Unknown",

isbn = "9783319463063",

volume = "9956",

series = "Lecture Notes in Computer Science",

publisher = "Springer, Cham",

pages = "67--82",

booktitle = "Min-Hashing for Probabilistic Frequent Subtree Feature Spaces.",

edition = "1",

}

RIS

TY - GEN

T1 - Min-Hashing for Probabilistic Frequent Subtree Feature Spaces.

AU - Welke, Pascal

AU - Horváth, Tamás

AU - Wrobel, Stefan

N1 - DBLP's bibliographic metadata records provided through http://dblp.org/search/publ/api are distributed under a Creative Commons CC0 1.0 Universal Public Domain Dedication. Although the bibliographic metadata records are provided consistent with CC0 1.0 Dedication, the content described by the metadata records is not. Content may be subject to copyright, rights of privacy, rights of publicity and other restrictions.

PY - 2016/9/21

Y1 - 2016/9/21

N2 - We propose a fast algorithm for approximating graph similarities. For its advantageous semantic and algorithmic properties, we define the similarity between two graphs by the Jaccard-similarity of their images in a binary feature space spanned by the set of frequent subtrees generated for some training dataset. Since the feature space embedding is computationally intractable, we use a probabilistic subtree isomorphism operator based on a small sample of random spanning trees and approximate the Jaccard-similarity by min-hash sketches. The partial order on the feature set defined by subgraph isomorphism allows for a fast calculation of the min-hash sketch, without explicitly performing the feature space embedding. Experimental results on real-world graph datasets show that our technique results in a fast algorithm. Furthermore, the approximated similarities are well-suited for classification and retrieval tasks in large graph datasets.

AB - We propose a fast algorithm for approximating graph similarities. For its advantageous semantic and algorithmic properties, we define the similarity between two graphs by the Jaccard-similarity of their images in a binary feature space spanned by the set of frequent subtrees generated for some training dataset. Since the feature space embedding is computationally intractable, we use a probabilistic subtree isomorphism operator based on a small sample of random spanning trees and approximate the Jaccard-similarity by min-hash sketches. The partial order on the feature set defined by subgraph isomorphism allows for a fast calculation of the min-hash sketch, without explicitly performing the feature space embedding. Experimental results on real-world graph datasets show that our technique results in a fast algorithm. Furthermore, the approximated similarities are well-suited for classification and retrieval tasks in large graph datasets.

U2 - 10.1007/978-3-319-46307-0_5

DO - 10.1007/978-3-319-46307-0_5

M3 - Conference contribution/Paper

SN - 9783319463063

VL - 9956

T3 - Lecture Notes in Computer Science

SP - 67

EP - 82

BT - Min-Hashing for Probabilistic Frequent Subtree Feature Spaces.

PB - Springer, Cham

T2 - 19th International Conference, DS 2016

Y2 - 19 October 2016 through 21 October 2016

ER -

Research

Links

Text available via DOI: