Kamer Kaya | ML on Large Graphs

Graphs are widely used to model interactions in real-life data such as social networks, citation networks, and web data. Machine learning (ML) tasks on graphs, such as link prediction, node classification, and anomaly detection, have recently become a popular area with applications in many domains. The raw connectivity information of a graph, represented as its adjacency matrix, does not lend itself easily to such ML tasks: learning valid correlations between graph elements works better on regular d-dimensional representations, and connectivity information has no such structure. This has led to growing interest in the graph embedding problem, which focuses on representing the vertices of a graph as d-dimensional vectors and thereby embedding the structure of the graph into a d-dimensional space.
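To make the embedding problem concrete, here is a minimal sketch in plain Python. It is not the GOSH implementation: the function name `embed`, its parameters, and the update rule (a logistic step with positive samples drawn from neighbors and negative samples drawn uniformly at random) are assumptions chosen for illustration. It only shows the input and output shape of the problem: an adjacency list goes in, one d-dimensional vector per vertex comes out.

```python
import math
import random


def embed(adj, d=8, epochs=200, lr=0.05, negatives=3, seed=0):
    """Return one d-dimensional vector per vertex of the adjacency list `adj`.

    Illustrative only: the update rule and defaults are assumptions,
    not the method used by any particular embedding tool.
    """
    rng = random.Random(seed)
    n = len(adj)
    emb = [[rng.uniform(-0.5, 0.5) for _ in range(d)] for _ in range(n)]

    def update(u, v, label):
        # One logistic step on the pair (u, v):
        # label 1 pulls the two vectors together, label 0 pushes them apart.
        dot = sum(a * b for a, b in zip(emb[u], emb[v]))
        score = 1.0 / (1.0 + math.exp(-dot))
        g = lr * (label - score)
        for k in range(d):
            eu, ev = emb[u][k], emb[v][k]
            emb[u][k] += g * ev
            emb[v][k] += g * eu

    for _ in range(epochs):
        for u in range(n):
            if not adj[u]:
                continue  # isolated vertex: nothing to learn from
            update(u, rng.choice(adj[u]), 1)  # positive sample: a neighbor
            for _ in range(negatives):
                w = rng.randrange(n)  # negative sample: a random vertex
                if w != u:
                    update(u, w, 0)
    return emb
```

For a graph with n vertices, `embed(adj, d=4)` returns n lists of 4 floats. Downstream tasks such as link prediction or node classification would then take these vectors, rather than the adjacency matrix, as input features.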

We have started working on increasing the performance of ML kernels on graphs by using high-performance computing and parallel algorithms on CPUs and GPUs. We have designed and developed GOSH, a graph embedder that can process large-scale graphs on a single GPU, faster than some state-of-the-art alternatives even when those run on multiple GPUs.

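GOSH relies on graph coarsening, in which groups of vertices are merged into super vertices to obtain a smaller graph. With coarsening, a single update on a super vertex moves the embeddings of all the vertices it represents, so a large update on the embedding is performed at the cost of a single one. Our parallel coarsening algorithm takes a simple yet effective approach to build a multi-level graph hierarchy, and GOSH computes an embedding at each level.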
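The multi-level idea can be sketched in a few lines of sequential Python. The merge rule below (visit vertices by decreasing degree and collapse each unmatched vertex together with its unmatched neighbors) is a simple stand-in and not necessarily the rule our parallel algorithm applies; the names `coarsen`, `build_hierarchy`, and `project` are hypothetical.

```python
def coarsen(adj):
    """Collapse each unmatched vertex and its unmatched neighbors into one super vertex.

    Returns (coarse_adj, mapping), where mapping[v] is the super vertex of v.
    The merge rule is an illustrative assumption.
    """
    n = len(adj)
    mapping = [-1] * n
    next_id = 0
    # Visit high-degree vertices first so that hubs absorb their neighborhoods.
    for u in sorted(range(n), key=lambda v: -len(adj[v])):
        if mapping[u] != -1:
            continue
        mapping[u] = next_id
        for v in adj[u]:
            if mapping[v] == -1:
                mapping[v] = next_id
        next_id += 1
    # Keep an edge between two super vertices if any of their members were adjacent.
    coarse = [set() for _ in range(next_id)]
    for u in range(n):
        for v in adj[u]:
            a, b = mapping[u], mapping[v]
            if a != b:
                coarse[a].add(b)
                coarse[b].add(a)
    return [sorted(s) for s in coarse], mapping


def build_hierarchy(adj, min_vertices=2):
    """Coarsen repeatedly; levels[0] is the original graph, levels[-1] the smallest."""
    levels, mappings = [adj], []
    while len(levels[-1]) > min_vertices:
        coarse, mapping = coarsen(levels[-1])
        if len(coarse) == len(levels[-1]):
            break  # no vertex was merged, e.g. the graph has no edges left
        levels.append(coarse)
        mappings.append(mapping)
    return levels, mappings


def project(coarse_emb, mapping):
    """Initialize a finer level: every vertex copies the vector of its super vertex."""
    return [list(coarse_emb[s]) for s in mapping]
```

A multi-level embedder built on this sketch would train on `levels[-1]` first, call `project` with the matching entry of `mappings` to initialize the next finer level, and continue training there until it reaches `levels[0]`. Because one coarse vector stands in for many original vertices, each update at a coarse level moves all of them together.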

We are currently working on related problems on temporal and knowledge graphs, and on different architectures such as Graphcore IPUs.

Related Publications:

GOSH: Embedding Big Graphs on Small Hardware. TA Akyildiz, AA Aljundi, K Kaya. 49th International Conference on Parallel Processing (ICPP), 1-11, 2020.
Understanding Coarsening for Embedding Large-Scale Graphs. TA Akyildiz, AA Aljundi, K Kaya. IEEE International Conference on Big Data (Big Data), 2937-2946, 2020.