Concept guide · 15 min read Intermediate

WGCNA: A Visual Guide to Gene Co-expression Networks

Find the groups of genes that move together across your samples — and what they tell you about biology.

TL;DR

The three things to know

WGCNA finds clusters (“modules”) of genes whose expression rises and falls together across samples.
Each module often represents a biological process; hub genes within a module are candidate drivers (statistically central, not necessarily causal).
Most powerful when you have 15–20+ samples with continuous or multi-condition variation — a timecourse, dose-response, or patient cohort.

01 · Intuition Foundational

What “co-expression” actually means

Imagine two genes, GENE-A and GENE-B. You have expression measurements for both across many samples — say, 40 patients, or 30 timepoints. If GENE-A tends to go up in the same samples where GENE-B goes up, and down where it goes down, the two genes are co-expressed. The correlation across samples (Pearson's r, usually) captures that pattern in a single number between −1 and 1.
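For readers who like to see the arithmetic, here is a minimal Python sketch of Pearson's r for a pair of genes. The expression values are invented for illustration, and `gene_c` is a hypothetical third gene added to show what a strong negative correlation looks like.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical log-expression values across six samples (illustrative only).
gene_a = [2.1, 3.4, 5.0, 4.2, 6.1, 7.3]
gene_b = [1.0, 1.9, 3.2, 2.8, 4.1, 4.9]  # rises and falls with gene_a
gene_c = [6.8, 5.9, 4.1, 4.9, 3.0, 2.2]  # moves the opposite way

r_ab = pearson(gene_a, gene_b)
r_ac = pearson(gene_a, gene_c)
print(f"r(A, B) = {r_ab:.2f}")  # close to +1: co-expressed
print(f"r(A, C) = {r_ac:.2f}")  # close to -1: anti-correlated
```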

Now scale that idea up. With 15,000 genes you can compute every pairwise correlation: about 112 million distinct gene pairs. That matrix is a network: every gene is a node, every correlation is the weight of an edge. Strong correlations suggest the two genes are regulated together. Weak correlations mean the two genes have little to say to each other.
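The sketch below counts the gene pairs and builds the full edge list for a toy matrix of 5 genes and 4 samples, loosely mirroring Figure 1. All expression values are made up, and the `pearson` helper is the textbook formula rather than any particular package's implementation.

```python
import math
from itertools import combinations

# Number of distinct gene pairs among 15,000 genes.
n_pairs = math.comb(15_000, 2)
print(f"{n_pairs:,} pairwise correlations")  # 112,492,500

def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical toy matrix: 5 genes measured in 4 samples.
expr = {
    "GENE-A": [1.0, 3.0, 2.0, 4.0],
    "GENE-B": [2.0, 2.1, 2.1, 2.0],  # nearly flat
    "GENE-C": [1.5, 3.4, 2.6, 4.3],  # follows GENE-A
    "GENE-D": [4.0, 2.0, 3.1, 1.2],  # moves the opposite way
    "GENE-E": [0.8, 2.9, 2.2, 3.8],  # follows GENE-A
}

# Every pair of genes becomes one weighted edge of the network.
edges = {
    (g1, g2): pearson(expr[g1], expr[g2])
    for g1, g2 in combinations(expr, 2)
}
for (g1, g2), r in edges.items():
    print(f"{g1} - {g2}: r = {r:+.2f}")
```

Five genes give 10 edges; the same bookkeeping at 15,000 genes is what produces the wall of numbers described above.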

The problem: a raw correlation network is dense, noisy, and useless to read. WGCNA — Weighted Gene Co-expression Network Analysis — is a set of transformations that turn that wall of numbers into something a biologist can actually interpret: a handful of modules (gene groups), one eigengene per module (a summary expression profile), and a ranked list of hub genes per module (the genes most representative of the module's pattern).

Figure 1 · From expression matrix to network n = 4 samples · 5 genes

Figure 1. The pattern in the matrix becomes the structure in the network. GENE-A, GENE-C, and GENE-E rise and fall together — they are co-expressed. GENE-B is flat (uninformative), GENE-D moves the opposite way.

02 · The “weighted” part

Why “weighted” — soft thresholding

The naive way to build a gene network is to pick a correlation cutoff (say r > 0.7) and call any pair above it “connected.” This is a hard threshold. It is also, unfortunately, fragile: a gene pair at r = 0.69 is treated identically to a pair at r = 0.01 (both unconnected), and a pair at r = 0.71 is treated identically to a pair at r = 0.99 (both fully connected). All the continuous information in the correlation is thrown away at the cutoff.

WGCNA's central trick is to keep the correlations continuous but reweight them. The absolute value of each correlation is raised to a power β (the “soft threshold,” usually between 6 and 12):

a_ij = |cor(i, j)|^β

Because β is large, weak correlations get crushed toward zero (0.3⁶ ≈ 0.0007) while strong correlations keep a substantial weight (0.9⁶ ≈ 0.53). A threefold difference in correlation becomes a roughly 700-fold difference in edge weight. The network ends up dominated by the strong, biologically credible edges, without ever drawing a hard line in the sand.
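A small Python sketch of the two approaches side by side, assuming the hard cutoff of 0.7 and β = 6 used as examples in the text. The function names are illustrative and not part of any package.

```python
def hard_threshold(r, cutoff=0.7):
    """Binary edge: 1 if the absolute correlation clears the cutoff, else 0."""
    return 1.0 if abs(r) > cutoff else 0.0

def soft_threshold(r, beta=6):
    """Weighted edge: a_ij = |cor(i, j)|^beta."""
    return abs(r) ** beta

# Compare how each rule treats a range of correlations.
for r in (0.01, 0.30, 0.69, 0.71, 0.90, 0.99):
    print(f"r = {r:.2f}  hard = {hard_threshold(r):.0f}  "
          f"soft = {soft_threshold(r):.4f}")
```

The hard rule jumps from 0 to 1 between r = 0.69 and r = 0.71, while the soft weights for those two pairs stay close to each other.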

How is β chosen?

Many real biological networks have a characteristic shape: a few genes have lots of connections (hubs), and most genes have only a handful. Mathematically, the distribution of node degrees approximately follows a power law; this is called scale-free topology. Random networks do not look like this. WGCNA scans candidate β values from 1 to ~30 and picks the smallest β at which the network's scale-free fit (R²) plateaus near 0.8–0.9. That is usually somewhere between 6 and 12 for typical RNA-seq data.
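The selection rule can be sketched as a simple lookup. This is a simplification: it uses a fixed R² cutoff as a stand-in for "where the plateau begins", the cutoff of 0.85 is an assumption within the 0.8–0.9 range above, and the fit values are invented to resemble the curve in Figure 2.

```python
def pick_beta(scale_free_fit, r2_cutoff=0.85):
    """Return the smallest beta whose scale-free fit R^2 reaches the cutoff.

    scale_free_fit maps each candidate beta to its R^2.
    Returns None if no candidate reaches the cutoff.
    """
    for beta in sorted(scale_free_fit):
        if scale_free_fit[beta] >= r2_cutoff:
            return beta
    return None

# Hypothetical fit table: steep climb, then a plateau.
fit = {
    1: 0.12, 2: 0.31, 3: 0.52, 4: 0.66, 5: 0.76, 6: 0.82,
    7: 0.87, 8: 0.88, 9: 0.89, 10: 0.89, 12: 0.90,
}

chosen = pick_beta(fit)
print("chosen beta:", chosen)
```

With these made-up values the rule returns 7. If no candidate reaches the cutoff the function returns `None`, which is a cue to inspect the curve by hand rather than trust an automatic pick.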

Figure 2 · Choosing the soft threshold β scale-free fit · mean connectivity

Figure 2. Left: scale-free fit (R²) plotted against candidate β values. The curve climbs steeply, then plateaus around β = 7. WGCNA picks the smallest β where the plateau begins. Right: as β rises, mean connectivity collapses — a useful sanity check that the network is not too dense.

Why this matters

If β is too small you keep noise; the network becomes random-looking and modules dissolve. If β is too large you lose real but moderate signal; the network fragments into tiny disconnected pieces. The plateau is the sweet spot — just strong enough to suppress noise without burning biology.

03 · Modules The output you read

Modules — what they actually represent

Once correlations are reweighted (the “adjacency matrix”), WGCNA does one more transformation. It computes the topological overlap matrix (TOM): a similarity measure that says “two genes are similar if they share many of the same neighbours in the network.” TOM is more robust than raw correlation because it considers second-order structure — not just whether A and B are connected, but whether A and B both connect to C, D, E, and F.
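To make "shared neighbours" concrete, here is a pure-Python sketch of the unsigned topological overlap formula from Zhang and Horvath (2005): TOM_ij = (l_ij + a_ij) / (min(k_i, k_j) + 1 − a_ij), where l_ij sums a_iu · a_uj over all other genes u and k_i is the total connectivity of gene i. The 4-gene adjacency matrix is invented, and real implementations are vectorised rather than looped.

```python
def topological_overlap(adj):
    """Unsigned TOM for a symmetric adjacency matrix with entries in [0, 1]."""
    n = len(adj)
    # Connectivity of each gene: sum of its adjacencies to all other genes.
    k = [sum(adj[i][u] for u in range(n) if u != i) for i in range(n)]
    tom = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Strength of the connections i and j share through third genes.
            shared = sum(adj[i][u] * adj[u][j]
                         for u in range(n) if u != i and u != j)
            value = (shared + adj[i][j]) / (min(k[i], k[j]) + 1.0 - adj[i][j])
            tom[i][j] = tom[j][i] = value
    return tom

# Hypothetical adjacency: genes 0-2 are tightly linked, gene 3 is peripheral.
adj = [
    [1.00, 0.80, 0.70, 0.05],
    [0.80, 1.00, 0.75, 0.05],
    [0.70, 0.75, 1.00, 0.10],
    [0.05, 0.05, 0.10, 1.00],
]

tom = topological_overlap(adj)
# Clustering works on the dissimilarity 1 - TOM.
dissimilarity = [[1.0 - value for value in row] for row in tom]
for row in tom:
    print("  ".join(f"{value:.3f}" for value in row))
```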

Hierarchical clustering on TOM-based dissimilarity produces a gene dendrogram. A dynamic tree-cut algorithm then carves the branches into modules. Each module is assigned a colour name (Turquoise, Blue, Brown, Yellow, Green, and so on). The colours are assigned by module size: Turquoise is the largest module, Blue the second largest, and so on. That ordering is only a convention; the colour itself has no biological meaning.

What does a module represent?

A module is a group of genes that share an expression pattern across your samples. Often (but not always) that shared pattern reflects:

A common biological pathway — e.g. mitochondrial respiration, antigen presentation, cell cycle.
A shared regulatory program — downstream targets of the same transcription factor or signalling axis.
A cell-type signature — in bulk RNA-seq from heterogeneous tissue, modules often track shifting cell-type proportions.
Sometimes, a technical artifact: batch, RNA integrity number (RIN), or sequencing depth (these are real and need to be ruled out).

Each module is summarised by its eigengene — the first principal component of the module's expression matrix. The eigengene is a single vector across samples that captures the module's overall up-and-down pattern. It is what you correlate with traits in the next step.
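As a rough illustration of "first principal component", the sketch below standardises each gene and then finds the dominant direction across samples by power iteration. Power iteration is a stand-in chosen to keep the example dependency-free; a real analysis would use an SVD routine. The three gene profiles are invented.

```python
import math

def standardize(profile):
    """Centre a gene's profile and scale it to unit standard deviation."""
    n = len(profile)
    mean = sum(profile) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in profile) / (n - 1))
    return [(v - mean) / sd for v in profile]

def module_eigengene(gene_profiles, n_iter=200):
    """First principal component across samples, as a unit-length vector."""
    z = [standardize(g) for g in gene_profiles]
    n_genes, n_samples = len(z), len(z[0])
    # Start from the average standardized profile so the sign of the
    # eigengene follows the module's average expression.
    v = [sum(g[s] for g in z) / n_genes for s in range(n_samples)]
    for _ in range(n_iter):
        # Project every gene onto v, then rebuild v from those projections.
        scores = [sum(g[s] * v[s] for s in range(n_samples)) for g in z]
        v = [sum(scores[i] * z[i][s] for i in range(n_genes))
             for s in range(n_samples)]
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
    return v

# Hypothetical module of three genes measured in five samples.
module_expr = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.1, 3.9, 6.2, 8.0, 9.8],
    [0.5, 1.4, 1.6, 2.4, 3.1],
]

eigengene = module_eigengene(module_expr)
print([round(x, 3) for x in eigengene])
```

The result is one number per sample, which is exactly the shape needed to correlate a module with sample-level traits.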

Figure 3 · Gene dendrogram with module colours dynamic tree cut

Figure 3. The dendrogram is the hierarchical clustering of genes by TOM distance. The coloured bar below is the result of the dynamic tree cut: each colour is one module. Notice the grey band on the right — those genes did not fit any module.

04 · Hub genes Easy to misinterpret

Hub genes — the network's spotlight

Inside any module, genes are not equal. Some sit on the periphery — loosely connected to a handful of others. A few sit at the centre: highly connected, tightly correlated with the module eigengene, and statistically the most representative members of the group. Those are the hub genes.

The standard metric is module membership (MM): the correlation between a gene's expression profile and its module eigengene. A gene with MM = 0.95 essentially is the module's pattern. A gene with MM = 0.55 is a peripheral member — it correlates somewhat, but not strongly. Sort by MM, take the top few, and you have your candidate hub list.
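The ranking step is short enough to sketch directly. The eigengene and the three gene profiles below are invented, and the gene names `HUB1`, `MID2`, and `EDGE3` are hypothetical labels, not real symbols.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical module eigengene across five samples.
eigengene = [-0.6, -0.3, 0.0, 0.2, 0.7]

# Hypothetical member genes of the same module.
module_genes = {
    "MID2": [2.0, 2.9, 2.2, 3.4, 3.1],
    "HUB1": [1.0, 1.6, 2.1, 2.5, 3.6],
    "EDGE3": [3.0, 2.2, 3.1, 2.4, 3.3],
}

# Module membership: correlation of each gene with the eigengene.
membership = {gene: pearson(profile, eigengene)
              for gene, profile in module_genes.items()}

# Candidate hubs are the genes with the highest membership.
ranked = sorted(membership.items(), key=lambda item: item[1], reverse=True)
for gene, mm in ranked:
    print(f"{gene}: MM = {mm:.2f}")
```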

Hub genes often turn out to be biologically central: master regulators, key effectors, or marker genes for the cell type or process the module represents. That is why people pay attention to them. But there is a major caveat — one of the most important pieces of context for any new WGCNA user.

Figure 4 · Hub gene in a module star-like topology

Figure 4. The hub is the gene whose expression profile most resembles the whole module's pattern. It is connected to almost every other module member with high weight. Satellites are still in the module — they just sit further from the centre.

Correlational, not causal

A hub gene is statistically central. It is not, by virtue of being a hub, a regulator or master switch. Plenty of hub genes are downstream effectors that simply respond strongly to the module's driver, or housekeeping genes that happen to correlate with the dominant signal. Hub status is a hypothesis-generating ranking — treat top hubs as candidates for knockdown, overexpression, or ChIP-seq validation, not as established regulators.

05 · Module-trait correlation The big payoff

Module-trait correlation — connecting to phenotype

This is the step that makes WGCNA more than just a clustering algorithm. You take each module eigengene (one vector per module, length = number of samples) and correlate it with every sample-level trait you have measured: disease status, tumour grade, drug dose, age, survival, treatment response. The result is a matrix — modules along the rows, traits along the columns, correlation strength in each cell. A module that correlates strongly with a trait is a candidate molecular signature of that trait.
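A minimal sketch of how that matrix is filled in, assuming traits have already been coded as numbers (for example grade 1 to 3, ER status 0 or 1). The eigengenes, traits, and sample count are all invented for illustration.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical eigengenes: one value per sample, six samples.
eigengenes = {
    "brown": [-0.5, -0.3, -0.1, 0.1, 0.3, 0.5],
    "turquoise": [0.4, -0.4, 0.4, -0.4, 0.4, -0.4],
}

# Hypothetical numerically coded traits for the same six samples.
traits = {
    "grade": [1, 1, 2, 2, 3, 3],
    "er_positive": [1, 0, 1, 0, 1, 0],
}

# Rows are modules, columns are traits, cells are correlations.
module_trait = {
    module: {trait: pearson(me, values) for trait, values in traits.items()}
    for module, me in eigengenes.items()
}

for module, row in module_trait.items():
    cells = "  ".join(f"{trait} = {r:+.2f}" for trait, r in row.items())
    print(f"{module}: {cells}")
```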

The classic worked example is a TCGA cancer cohort. Take TCGA-BRCA breast cancer with hundreds of patient tumours and clinical annotations — ER status, HER2 status, tumour grade, days to death. Run WGCNA on the expression matrix, then correlate every module eigengene with every trait. You usually end up with a heatmap like Figure 5.

Strong cells in that heatmap point to interpretable hypotheses: a brown module that is positively correlated with tumour grade and negatively correlated with survival is a candidate poor-prognosis signature. The hub genes inside that module are candidate prognostic markers. Pathway enrichment on the module gives you the biological theme — cell cycle, hypoxia response, EMT, etc.

What “strong” means

Module-trait correlations of |r| > 0.5 with a small p-value (corrected for multiple testing across modules and traits) are usually worth following up. |r| > 0.7 is striking. Anything below |r| ~ 0.3 is probably noise unless the sample size is very large.
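To see why the same |r| means different things at different sample sizes, the sketch below computes an approximate two-sided p-value using the Fisher z transformation and then applies a Bonferroni correction. Both choices are assumptions made for illustration: the text does not say which test or correction to use, and the 10 modules by 4 traits grid is hypothetical.

```python
import math

def correlation_p_value(r, n):
    """Approximate two-sided p-value for a correlation r from n samples.

    Uses the Fisher z transformation: atanh(r) * sqrt(n - 3) is roughly
    standard normal when the true correlation is zero.
    """
    z = math.atanh(r) * math.sqrt(n - 3)
    return math.erfc(abs(z) / math.sqrt(2.0))

def bonferroni(p, n_tests):
    """Bonferroni-adjusted p-value, capped at 1."""
    return min(1.0, p * n_tests)

n_tests = 10 * 4  # hypothetical: 10 modules x 4 traits

for r, n in [(0.3, 20), (0.3, 1000), (0.7, 20)]:
    p = correlation_p_value(r, n)
    print(f"r = {r}, n = {n}: p = {p:.2e}, "
          f"adjusted = {bonferroni(p, n_tests):.2e}")
```

In this sketch, r = 0.3 from 20 samples does not survive correction, the same r from 1,000 samples does, and r = 0.7 from 20 samples still clears the usual 0.05 level after adjustment.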

Figure 5 · Module-trait correlation heatmap TCGA-BRCA · n = 1,084

Figure 5. Module-trait heatmap from a hypothetical TCGA-BRCA WGCNA. Brown is strongly tied to high grade and poor survival — the hub genes of brown are the prognostic candidates worth following up. Turquoise tracks ER+ status (classic luminal signature). Grey correlates with nothing — as expected for unassigned genes.

Why this matters

This step is what turns “here are some clusters” into “here are the gene programs that track disease.” You can have hundreds of differentially expressed genes from a standard DE analysis and still not know which ones move together. Module-trait correlation organises those genes into testable hypotheses tied to your clinical or experimental variables.

06 · Fit for purpose

When WGCNA works (and when it doesn't)

WGCNA is a correlation-based method. Everything it produces depends on the quality and quantity of the correlations it can estimate from your data. That sets some real, practical limits on when it is useful.

Works well

Sample size ≥ 15 (the more, the better; 50+ is robust).
Samples span continuous or multi-condition variation — a timecourse, dose-response, patient cohort with trait variability.
The underlying biology has coordinated regulatory programs that should leave a co-expression footprint.
Confounders (batch, RIN, library prep date) have been corrected or modelled.

Does not work well

Sample size < 15 — correlations are too noisy; modules are unstable.
Simple binary comparison (treatment vs control, 3 vs 3) — use DE analysis instead.
Batch effects dominate — modules become batch-driven artifacts, not biology.
Very homogeneous samples (everyone has the same condition) — nothing for correlation to capture.

When in doubt: if you have a small, balanced two-group comparison, reach for DESeq2 / edgeR / limma first. Save WGCNA for cohorts where the variation across samples is itself part of the question.
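The sample-size limits can be illustrated with an approximate 95% confidence interval for a single correlation, again via the Fisher z transformation. The observed r of 0.7 is an arbitrary example, and the interval is a standard approximation, not something WGCNA itself reports.

```python
import math

def correlation_ci(r, n, z_crit=1.96):
    """Approximate 95% confidence interval for a Pearson correlation."""
    z = math.atanh(r)
    se = 1.0 / math.sqrt(n - 3)
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)

# Same observed correlation, different numbers of samples.
intervals = {n: correlation_ci(0.7, n) for n in (6, 15, 20, 50)}
for n, (low, high) in intervals.items():
    print(f"n = {n:>2}: r = 0.70, 95% CI [{low:+.2f}, {high:+.2f}]")
```

At six samples the interval for an observed r of 0.7 stretches below zero, which is the statistical reason a 3-vs-3 design cannot support a correlation network.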

07 · Common pitfalls

Six pitfalls people fall into

Most of the disappointing WGCNA results in the wild come from a small handful of recurring mistakes. They are easy to spot once you know what to look for.

Running with too few samples

N < 15 produces unreliable modules. Two runs on different subsets of the data will return different module structures. Without enough samples there is no stable correlation to cluster on.

Skipping batch correction

If batch dominates variation, your largest module will be a batch signature. Correct first with ComBat-seq, RUVSeq, or by regressing batch out as a covariate (for example with limma's removeBatchEffect), then run WGCNA on the corrected matrix.

Treating hub genes as regulators

Hubs are statistically central: they rank highest by correlation, which says nothing about causation. Many hubs are downstream effectors. Validate with perturbation (knockdown, overexpression) or chromatin data before claiming a regulator.

Reading meaning into module colours

The names — Turquoise, Brown, Pink — are arbitrary labels assigned by size. They mean nothing biologically. Always describe modules by their enriched pathways or hub genes, never just by colour.

Over-merging modules

The mergeCutHeight parameter collapses similar modules. Set it too high (e.g. 0.4) and you fuse genuinely distinct programs into one mega-module. The value of 0.25 used in the WGCNA tutorial is a reasonable starting point; if you must change it, inspect what gets merged.
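One way to build intuition for mergeCutHeight: module merging works on eigengene dissimilarity, 1 − correlation, so a cut height of 0.25 corresponds to eigengenes correlated above 0.75. The sketch below only compares module pairs one at a time, which is a simplification of the hierarchical clustering the R package performs, and the eigengene correlations are invented.

```python
def pairs_below_cut(eigengene_cor, cut_height):
    """Module pairs whose eigengene dissimilarity (1 - r) is under the cut."""
    return sorted(pair for pair, r in eigengene_cor.items()
                  if 1.0 - r < cut_height)

# Hypothetical correlations between module eigengenes.
eigengene_cor = {
    ("blue", "brown"): 0.82,
    ("blue", "yellow"): 0.66,
    ("brown", "yellow"): 0.40,
}

print("cut 0.25:", pairs_below_cut(eigengene_cor, 0.25))
print("cut 0.40:", pairs_below_cut(eigengene_cor, 0.40))
```

With these made-up numbers, raising the cut from 0.25 to 0.4 pulls the blue and yellow modules together even though their eigengenes correlate at only 0.66.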

Over-interpreting the “grey” module

Grey is the catch-all for genes that did not fit anywhere. It is often the largest bucket in the result. Do not run pathway enrichment on Grey expecting biology — it is residual, not a coordinated program.

08 · Decision tree

Should I run WGCNA?

A practical flowchart, derived from the “when it works” rules. Walk through the questions in order — if you cannot answer yes at any node, you probably want a different method.

Figure 6 · Decision flowchart start at top

Figure 6. A simple gate at each step. Two no's send you to DE analysis; one no on batch sends you back for correction first. Once you reach the green box, the three classic WGCNA outputs are pathway enrichment per module, module-trait heatmap, and hub gene candidates.
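The gates can also be written as a small function. This sketch assumes three questions asked in this order (sample size, variation across samples, batch handling), inferred from the caption and the "when it works" rules; the function name, argument names, and return strings are hypothetical.

```python
def recommend_method(n_samples, has_graded_variation, batch_handled):
    """Walk the three gates and return a suggested next step."""
    # Gate 1: enough samples to estimate correlations.
    if n_samples < 15:
        return "differential expression"
    # Gate 2: samples span continuous or multi-condition variation.
    if not has_graded_variation:
        return "differential expression"
    # Gate 3: batch and other confounders corrected or modelled.
    if not batch_handled:
        return "correct batch first"
    return "run WGCNA"

# A 3-vs-3 comparison fails the first gate.
print(recommend_method(6, False, True))
# A 40-patient cohort with uncorrected batch goes back for correction.
print(recommend_method(40, True, False))
# The same cohort after correction passes every gate.
print(recommend_method(40, True, True))
```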

09 · In TransXplorer

How TransXplorer handles WGCNA

The full WGCNA workflow is wired into the TransXplorer app — no R, no script-tuning, no figure assembly. Here is what runs automatically when you open the Co-expression Network panel:

Soft-threshold selection. TransXplorer sweeps β from 1 to 30 and picks the smallest value where the scale-free fit plateaus (R² ≥ 0.85). The full scale-free fit curve is shown so you can override the auto-pick if needed.
TOM construction and module detection. Topological overlap is computed, hierarchical clustering is run, and the dynamic tree-cut produces modules. Defaults follow the WGCNA tutorial: minModuleSize = 30, mergeCutHeight = 0.25.
Module-trait correlation panel. If you have uploaded a metadata sheet, every module eigengene is correlated against every annotated trait. The heatmap is interactive: click a cell to drill down to the underlying genes.
Hub gene table. For each module, genes are ranked by module membership (MM). The top hubs are exported to a sortable table with gene symbol, MM, intramodular connectivity, and a link to GeneCards / Open Targets.
Auto pathway enrichment per module. GO BP/MF/CC and KEGG enrichment runs on every module without an extra click — results are tabbed inside each module's drawer.
Export-ready outputs. Module assignment table (CSV), eigengene matrix, hub gene lists, and high-resolution PDF of every figure.

Run WGCNA on your dataset in TransXplorer

Upload your counts or pull a GSE accession directly — the network, modules, trait heatmap, hubs, and pathway enrichment land in your browser in minutes.

10 · Further reading

The papers worth reading

If you want to go deeper into the WGCNA algorithm, the soft-thresholding mathematics, or eigengene network theory, these are the canonical sources. The 2008 BMC Bioinformatics paper and the 2007 BMC Systems Biology paper are the most directly useful for practitioners.

2008

Langfelder P, Horvath S. WGCNA: an R package for weighted correlation network analysis. BMC Bioinformatics 9:559.
doi:10.1186/1471-2105-9-559

2005

Zhang B, Horvath S. A general framework for weighted gene co-expression network analysis. Statistical Applications in Genetics and Molecular Biology 4(1):17.
doi:10.2202/1544-6115.1128

2011

Horvath S. Weighted Network Analysis: Applications in Genomics and Systems Biology. Springer. (Textbook treatment, much more depth than the papers.)
doi:10.1007/978-1-4419-8819-5

2007

Langfelder P, Horvath S. Eigengene networks for studying the relationships between co-expression modules. BMC Systems Biology 1:54.
doi:10.1186/1752-0509-1-54

Common questions

How many samples do I really need?

Fifteen samples is the floor — below that, the correlations that drive every WGCNA result are too noisy to trust. Twenty or more is the practical recommendation for most cohorts. Fifty or more gives you stable modules that survive subsampling and replicate across cohorts.

Can I run WGCNA on a 3-vs-3 comparison?

Not effectively. With six samples you cannot estimate pairwise correlations with any precision — modules will be essentially random. Use a differential expression tool (DESeq2, edgeR, or limma) instead. Those methods are designed for the small-group setting and will give you a real, interpretable gene list.

How do I interpret the “Grey” module?

Grey is the bucket of genes that did not fit into any module, whether because their expression is flat, their pattern is unique, or they fall on the boundary between two clusters. It is frequently the largest single “module” in the result. Do not over-interpret Grey biologically; in particular, do not run pathway enrichment on it expecting a meaningful theme.

My modules don't enrich for any pathway. Bad?

Not necessarily. Modules sometimes reflect technical variation (batch, RIN, sequencing depth) or shifting cell-type proportions in bulk samples, rather than canonical pathways. Cross-check the eigengene against your batch variables and sample-composition estimates (e.g. CIBERSORT, xCell) before concluding the module is uninformative.

Can I compare modules across two studies?

Yes — WGCNA has a modulePreservation function for exactly this. It computes preservation statistics like Z_summary that tell you whether modules defined in one dataset hold up in another. TransXplorer does not currently expose this; for cross-study preservation, use the R WGCNA package directly.

What's next?

Three directions to keep going — related concepts, the foundational method, or the hands-on tutorial.

Understanding batch effects

Why uncorrected batch is the most common reason WGCNA modules look weird — and how to detect and fix it.

What is differential expression?

The other half of the RNA-seq toolkit — what DE actually tests, how it differs from WGCNA, and when each one belongs.

RNA-seq in 10 minutes

The hands-on getting-started tutorial — counts to DE table to enrichment to volcano plot, no code, in your browser.
