R-tip: Simple way to get top varying genes from an expression matrix
June 28th, 2010
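A simple bit of R code to identify the top n most variable genes in a multi-condition numeric matrix. It assumes data is a numeric matrix with one gene per row and one condition per column, and n is the number of genes to keep: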
#calculate the variance by row
v <- apply(data,1,var);
#now flag the rows whose variance is in the top n (ties at the cutoff can return fewer than n rows; you could also do this with a sort on the variance)
sub <- v > quantile(v, (nrow(data) - n)/nrow(data));
#create the sub-matrix
subset <- data[sub,];
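
The sort-based alternative mentioned in the comment above could look something like this. It is a minimal sketch, not the method used in the snippet: expr, top_idx and top_genes are hypothetical names, and the random matrix only stands in for real expression data.

```r
# Hypothetical example data: 100 genes (rows) x 6 conditions (columns)
set.seed(1)
expr <- matrix(rnorm(600), nrow = 100,
               dimnames = list(paste0("gene", 1:100), paste0("cond", 1:6)))
n <- 10

# Variance of each row (gene)
v <- apply(expr, 1, var)

# Row indices of the n largest variances, highest first
top_idx <- order(v, decreasing = TRUE)[seq_len(n)]

# drop = FALSE keeps the result a matrix even when n is 1
top_genes <- expr[top_idx, , drop = FALSE]
```

As long as n is no larger than the number of rows, this returns exactly n rows, ordered from most to least variable, whereas the quantile threshold keeps the original row order and can return fewer rows when variances are tied at the cutoff.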