[Audio] Hello, everyone. I am Zhi xiang Shen, a senior undergraduate student from the University of Electronic Science and Technology of China. Today, I will be presenting our paper accepted at NeurIPS 2024, titled 'Beyond Redundancy: Information-aware Unsupervised Multiplex Graph Structure Learning.' This research was conducted by Zhi xiang Shen, Shuo Wang, and Zhao Kang from the University of Electronic Science and Technology of China. You can find the link to our paper and the code via the QR codes shown on the slide..
[Audio] Let's start with an introduction to multiplex graphs. A multiplex graph is a special type of multi-relational heterogeneous graph with multiple layers spanning a common set of nodes. These graphs are particularly useful because they provide richer information and better modeling capabilities compared to single-layer graphs. Our focus is on Unsupervised Multiplex Graph Learning, which aims to learn node representations by leveraging diverse graph structures and features without manual labeling. This approach is crucial for applications where labeled data is scarce or unavailable. Unsupervised Multiplex Graph Learning has various applications, including biological network analysis, social network mining, and recommendation systems..
[Audio] However, there are overlooked aspects in UMGL. First, the reliability of the graph structure can be compromised by task-irrelevant information such as irrelevant, heterophilic, and missing edges. This noise can significantly degrade the performance of graph learning algorithms. Therefore, beyond graph-fixed methods need to be explored to adapt to real-world noisy data. Second, there is the issue of multiplex graph non-redundancy, where task-relevant information exists not only in shared information between graphs but also within the unique information of certain graphs. In multiplex graph data, shared task-relevant information can manifest as homophilic edges common to all graphs, while unique task-relevant information is reflected in homophilic edges that appear only in a specific graph. This means that simply focusing on shared information can lead to the loss of valuable insights..
[Audio] Based on the above observations, we first theoretically define Multiplex Graph Non-redundancy from the perspective of mutual information. Two graph structures are non-redundant with respect to label information if and only if the conditional mutual information is strictly greater than zero. We further conducted empirical analyses on real multiplex graph datasets. The related tables show the unique relevant edge ratio in each graph, which measures the graph-unique task-relevant information. It can be observed that non-redundancy is widespread in real data, highlighting the limitations of existing methods that only capture shared information..
[Audio] We define our research problem from the perspective of graph structure learning. The goal is to learn a fused graph from the original multiplex graph in an unsupervised manner, mitigating task-irrelevant noise while retaining sufficient task-relevant information. This is achieved by refining the graph structure to remove irrelevant edges and by ensuring that both shared and unique task-relevant information are preserved..
[Audio] We propose an Information-aware Unsupervised Multiplex Graph Fusion method. Our approach takes the original multiplex structures and node features as input without requiring label information, and outputs the learned structure and node representations. First, we remove irrelevant noise from each graph view. Meanwhile, we maximize both shared and unique task-relevant information for each graph. To capture unique task-relevant information in an unsupervised manner, we propose a learnable generative graph augmentation. Finally, the entire optimization process is performed using a differentiable lower/upper bound of the mutual information, ensuring an accurate estimation of mutual information..
[Audio] Our theoretical contributions include optimal graph augmentation and multiplex graph fusion. We define optimal graph augmentation from the perspective of mutual information and demonstrate that guidance from the optimal augmented graph can capture complete task-relevant information, providing a theoretical basis for unsupervised learning. Additionally, we prove that the fused graph contains more task-relevant information compared to the refined graph from a single view, ensuring the effectiveness of multiplex graph fusion..
[Audio] We evaluate the learned graph on node clustering and node classification tasks. Our method was tested on four real-world benchmark multiplex graph datasets and compared against several baselines. The results demonstrate that our approach not only outperforms existing unsupervised methods but also competes favorably with supervised approaches, highlighting its effectiveness..
[Audio] We visualized the graph structure and node representations. The Graph Visualization shows the original subgraph structures and the learned graph structure from the ACM dataset. It can be observed that the learned fused graph retains task-relevant information from different views and exhibits a high degree of homophily. The Node Correlation Visualization rearranges nodes by category and calculates the correlation of their representations. It reveals that nodes of the same category have higher representation correlations, demonstrating the effectiveness of our unsupervised method in capturing task-relevant information..
[Audio] In summary, our research revisits two critical aspects of Unsupervised Multiplex Graph Learning: graph structure reliability and multiplex graph non-redundancy. Following a data-centric research paradigm, we learn a fused graph from the original multiplex graph structures in an unsupervised manner, removing irrelevant noise while retaining task-relevant information. Theoretical analysis and extensive experiments demonstrate the effectiveness of our approach. Thanks for your listening..