[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [igraph] igraph-help Digest, Vol 117, Issue 3
From: |
Tamas Nepusz |
Subject: |
Re: [igraph] igraph-help Digest, Vol 117, Issue 3 |
Date: |
Tue, 5 Apr 2016 12:38:50 +0200 |
Intra-cluster and inter-cluster edge counts can be calculated much
easier as follows (and maybe there are solutions that are even better
- I'm not familiar with R):
library(plyr)
library(sparseMatrix)
pairs <- count(matrix(memb[get.edgelist(g)], ncol=2))
pairs <- sparseMatrix(i=pairs$x.1, j=pairs$x.2, x=pairs$freq)
pairs <- pairs + t(pairs)
Then the intra-cluster edge counts are given by diag(pairs)/2 (note
the division by two) and the inter-cluster edge counts between cluster
i and j are given by pairs[i, j]. Row and column sums belong to the
number of edges incident on a given cluster.
T.
On Mon, Apr 4, 2016 at 2:11 PM, MikeS <address@hidden> wrote:
> Hello,
>
> Tamas thanks you for reply.
>
> I have tried to write a script to calculate the conductance in genenal case.
> I have used the definition of conductance from the paper
> http://cs.stanford.edu/people/jure/pubs/comscore-icdm12.pdf
> My code is shown below. I have the error: "in `[.data.frame`(tmp,
> c("X1", "X2")) : undefined columns selected" on the line: "long <-
> ....."
> But code is working. I would like to fix this error and than to test
> the code on some dataset.
> Could someone please give remarks, comments to the code?
>
> library(igraph)
> g <- make_graph( ~ A-B-C-D-A, E-A:B:C:D,
> F-G-H-I-F, J-F:G:H:I,
> K-L-M-N-K, O-K:L:M:N,
> P-Q-R-S-P, T-P:Q:R:S,
> B-F, E-J, C-I, L-T, O-T, M-S,
> C-P, C-L, I-L, I-P)
>
> comm <- walktrap.community(g)
>
> mS <- vector() # the number of edges in S
> cS <- vector() # the number of edges on the boundary of S
> m <- vector()
> сond <-vector()
>
> for (s in 0: nrow(comm$merges)) {
> memb <- cutat(comm, steps=s)
> m <- c(m, modularity (g, memb, weights=NULL))
> g2<-make_clusters(g, memb)
>
> # intra-cluster edges
>
> mS <- sapply(unique(membership(g2)), function(i) {
> vs<- which(membership(g2)==i)
> subg1<-induced.subgraph(g, vs)
> ecount(subg1)
> })
>
> # inter-cluster edges
>
> dcs <- data.frame(combn(unique(membership(g2)), 2))
> cS <- sapply(dcs, function(x) {
> es<-E(g)[V(g)[membership(g2)==x[1]] %--% V(g)[membership(g2)==x[2]]]
> length(es)
> })
> tmp <- data.frame(t(dcs[1,]), t(dcs[2,]), cS)
> long <- cbind(tmp["cS"], stack(tmp[c("X1","X2")]), row.names = NULL)
> # Error in `[.data.frame`(tmp, c("X1", "X2")) : undefined columns selected
>
> cS <- with( long, tapply(cS, values, sum))
>
> # Conductance
> сond <- c(сond, min(cS/(2*mS + cS)))
> }
> par(mfrow=c(1:2))
> plot(0:(length(m)-1), m, col="blue",xlab="Steps",ylab="Modularity")
> plot(0:(length(сond)-1), сond, col="blue",xlab="Steps",ylab="Conductance")
>
> 2016-04-03 23:01 GMT+07:00 <address@hidden>:
>> Hi,
>>
>> There is no error in your implementation, although the way you define
>> conductance is not exactly the way it is usually defined in the graph
>> theory literature. (As far as I know, conductance is usually
>> calculated for a cut of a graph, i.e. a partitioning into two disjoint
>> sets, and the conductance of a graph is simply the minimum conductance
>> over all possible cuts). The way you defined conductance is simply the
>> ratio of the number of edges between clusters and the number of edges
>> within clusters. Now, before the first merge, obviously all the edges
>> are between clusters, so you divide a nonzero value with zero, hence
>> you get infinity. After having performed all the merges, obviously all
>> the edges are within clusters, so you divide zero with a nonzero
>> value, getting zero in the end.
>>
>> So, there's nothing wrong with your code, but the way you defined
>> conductance is not suitable for selecting an "optimal" number of
>> clusters based on its extrema.
>>
>> T.
>>
>>
>> On Sat, Apr 2, 2016 at 4:03 AM, MikeS <address@hidden> wrote:
>>> Tamas, thanks you for reply.
>>> My code does not have syntactical error now.
>>> But I concerned about the result, I think I have a logical error.
>>>
>>> A modularity curve has the maximum value 0.4583 inside the steps'
>>> range (on step with no.=18), but conductance curve has extremum 0 on
>>> the right boundary.
>>>
>>>> max(m)
>>> [1] 0.4583333 # index =18
>>>> max(con)
>>> [1] 0 # index = 20
>
> _______________________________________________
> igraph-help mailing list
> address@hidden
> https://lists.nongnu.org/mailman/listinfo/igraph-help