The complete open-source pipeline for cold-hardy heritage crops — from a leaf in the field to a marker on a breeding decision. We follow DNA through extraction, sequencing, alignment, and variant calling, into the CBF cold-response cascade that governs frost tolerance, and out the other side as a KASP assay you can run to select hardier lines. No jargon left unexplained.
Heritage varieties carry centuries of adaptation in their DNA. Sequencing turns that folk knowledge — "this rye always survives our winters" — into mapped, heritable, selectable alleles.
Cold tolerance is not a single gene you can point to. It is polygenic and quantitative: dozens of loci, each contributing a little, expressed differently depending on day length, acclimation period, and how fast the cold arrives. A landrace that has been grown in northern Wisconsin or Jämtland for a hundred winters is a living experiment that has already been run. The genome is the lab notebook — we just have to learn to read it.
The practical unit of measurement is LT50: the temperature at which 50% of plants in a population die after a controlled freeze. A winter wheat with an LT50 of −20 °C is good; rye landraces can push past −30 °C. The whole point of this guide is to connect that field-measured number to the specific stretches of DNA that produce it — so a breeder can select for hardiness in a seedling tray instead of waiting for a killing frost to do it for them.
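To make the LT50 definition concrete, here is a minimal sketch that estimates it from a freeze test by linear interpolation between the two test temperatures that bracket 50% survival. The function name and the survival figures are hypothetical, and published studies more often fit a logistic or probit curve; this is only the simplest version of the idea.

```python
def lt50_by_interpolation(temps_c, survival):
    """Estimate LT50 by linear interpolation between bracketing test temperatures.

    temps_c  -- freeze-test temperatures in degrees C
    survival -- fraction of plants surviving at each temperature (0..1)
    """
    # Sort from warmest to coldest so survival should fall along the list.
    points = sorted(zip(temps_c, survival), reverse=True)
    for (t_warm, s_warm), (t_cold, s_cold) in zip(points, points[1:]):
        if s_warm >= 0.5 >= s_cold and s_warm != s_cold:
            # Fraction of the way from the warm point to the cold point
            # at which survival crosses 50%.
            frac = (s_warm - 0.5) / (s_warm - s_cold)
            return t_warm + frac * (t_cold - t_warm)
    raise ValueError("survival never crosses 50% in the tested range")


# Hypothetical freeze-test data, for illustration only.
temps = [-10, -14, -18, -22, -26]
alive = [1.00, 0.90, 0.70, 0.30, 0.05]
print(f"Estimated LT50: {lt50_by_interpolation(temps, alive):.1f} °C")
```

If survival never drops through 50% within the tested range, the function raises an error rather than extrapolating, which is the cue to test colder temperatures.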
Every marker, reference, and method below is published under CC BY 4.0. Cold-hardiness alleles found in public landraces should stay in the commons — sequenced, documented, and unpatentable.
Every genomics project, from a $500 community effort to a national breeding program, walks the same path. The cost and resolution change; the steps do not.
Sections 03–04 cover the wet and dry lab in turn. Section 05 explains the biology you are actually hunting for. Sections 06–07 turn variants into breeding decisions.
Good data starts with clean, high-molecular-weight DNA. Everything downstream inherits the quality of this step.
For most cereals and fruit, a CTAB extraction from young leaf tissue is the durable, low-cost standard — reagents cost cents per sample and it tolerates the polysaccharides and phenolics that gum up column kits. Silica-column kits (Qiagen DNeasy and equivalents) are faster and cleaner if budget allows. Flash-freeze tissue in the field or dry it on silica gel; degraded DNA caps your read length and your options.
Depth is how many times each base is read. A ~5× skim is enough to genotype known sites by imputation; 15–30× is needed for confident de novo SNP calling, especially in the big, repetitive genomes of wheat (≈16 Gb) and rye (≈7.9 Gb).
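The arithmetic behind those depth figures is simple: total bases needed is genome size times depth. The sketch below uses the genome sizes quoted above; the 150 bp paired-end read length and the function names are assumptions for illustration, not a recommendation for any particular platform.

```python
def bases_needed(genome_size_bp, depth):
    """Total sequenced bases required to reach a given mean depth."""
    return genome_size_bp * depth


def read_pairs_needed(genome_size_bp, depth, read_length_bp=150):
    """Paired-end read pairs for that depth (two reads per pair)."""
    return bases_needed(genome_size_bp, depth) / (2 * read_length_bp)


# Genome sizes are from the text; 150 bp paired-end reads are an assumption.
for crop, size_bp, depth in [("wheat", 16e9, 5), ("rye", 7.9e9, 15)]:
    gb = bases_needed(size_bp, depth) / 1e9
    pairs_m = read_pairs_needed(size_bp, depth) / 1e6
    print(f"{crop}: {depth}x needs about {gb:.0f} Gb, "
          f"or {pairs_m:.0f} million read pairs")
```

This is mean depth only. Real coverage is uneven, so budget some margin above the computed figure.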
The dry lab is entirely free, open-source software. A capable laptop handles GBS; large genomes want a workstation or a few hundred core-hours on a cluster.
FastQC to inspect, fastp or Trimmomatic to trim adapters and low-quality tails.
BWA-MEM for short reads, minimap2 for long reads, against a reference genome. samtools sorts and indexes.
GATK HaplotypeCaller, or bcftools mpileup piped into bcftools call, produces a VCF of SNPs and indels.
SnpEff predicts which variants change a protein, land in a known cold-tolerance gene, or sit harmlessly in an intron.
You align against a published assembly. The major cold-crop references are all open: IWGSC RefSeq v2.1 (bread wheat), MorexV3 (barley), Lo7 / Weining (rye), GDDH13 (apple), and Prunus persica v2 (peach, the Rosaceae anchor). Pick the reference closest to your crop, then let variant calling tell you where your landrace differs from it.
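The short-read path through those tools can be sketched as a handful of shell commands. The Python helper below only builds the command strings so you can review them before running anything; the function name, sample name, and file names are hypothetical. It assumes paired-end reads, a reference already indexed with `bwa index`, and the bcftools route rather than GATK.

```python
import shlex


def pipeline_commands(sample, ref_fasta, r1, r2, threads=4):
    """Return the shell commands for trim -> align -> sort -> index -> call."""
    q = shlex.quote
    trim1, trim2 = f"{sample}.trim_R1.fq.gz", f"{sample}.trim_R2.fq.gz"
    bam = f"{sample}.sorted.bam"
    vcf = f"{sample}.vcf.gz"
    return [
        # Trim adapters and low-quality tails from both mates.
        f"fastp -i {q(r1)} -I {q(r2)} -o {q(trim1)} -O {q(trim2)}",
        # Align with BWA-MEM and sort the alignments in one pass.
        f"bwa mem -t {threads} {q(ref_fasta)} {q(trim1)} {q(trim2)}"
        f" | samtools sort -o {q(bam)} -",
        # -c writes a CSI index; BAI cannot address chromosomes longer
        # than 512 Mb, which wheat chromosomes exceed.
        f"samtools index -c {q(bam)}",
        # Pile up reads against the reference and call SNPs and indels.
        f"bcftools mpileup -f {q(ref_fasta)} {q(bam)}"
        f" | bcftools call -mv -Oz -o {q(vcf)}",
    ]


for cmd in pipeline_commands("landrace01", "ref.fa",
                             "landrace01_R1.fq.gz", "landrace01_R2.fq.gz"):
    print(cmd)
```

Printing the commands first, then pasting them into a shell or a workflow manager, keeps the first run easy to inspect.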
The deliverable of the dry lab is a VCF — a table of every position where your sample differs from the reference. The rest of the guide is about finding, in that table, the handful of differences that matter for cold.
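A VCF is plain tab-separated text, so a first look needs no special library. The following sketch pulls out variants that fall inside a region of interest. The coordinates, chromosome name, and records are invented for illustration, and a real project would more likely use bcftools or a dedicated parser on the compressed file.

```python
def variants_in_region(vcf_lines, chrom, start, end):
    """Return (chrom, pos, ref, alt) for records inside chrom:start-end."""
    hits = []
    for line in vcf_lines:
        # Lines starting with '#' are header lines, not variant records.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        record_chrom, pos, _id, ref, alt = fields[:5]
        if record_chrom == chrom and start <= int(pos) <= end:
            hits.append((record_chrom, int(pos), ref, alt))
    return hits


# Hypothetical records; positions are made up, not real CBF coordinates.
example_vcf = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr5A\t1000\t.\tA\tG\t60\tPASS\t.",
    "chr5A\t2500\t.\tC\tT\t55\tPASS\t.",
    "chr5B\t1200\t.\tG\tA\t48\tPASS\t.",
]

for variant in variants_in_region(example_vcf, "chr5A", 2000, 3000):
    print(variant)
```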
This is the heart of frost tolerance — and the most rewarding part of the genome to understand, because a few well-mapped loci explain a large share of the variation.
When temperatures drop, plants run a signalling cascade: ICE1 transcription factors switch on the CBF / DREB1 genes, which in turn activate dozens of COR (cold-regulated) genes — including the dehydrins that protect cells from freezing damage. The hardiness of a variety is largely set by how many CBF copies it carries and how strongly they fire.
In the Triticeae (wheat, barley, rye) the CBF genes sit in a tandem cluster at the Frost-resistance-2 (Fr-2) locus, and the vernalization gene VRN1 sits nearby at Fr-1 — which is why winter habit and frost tolerance so often travel together. Copy-number variation at the CBF cluster is the single most useful functional marker we have.
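One common way to estimate copy number from resequencing data is to compare read depth over the region of interest with depth across the rest of the genome. The sketch below shows that ratio in its simplest form. The function name and depth values are hypothetical, and it assumes an inbred line and reads that map uniquely, which is optimistic for tandem CBF paralogs; treat the result as a rough screen, not a validated CNV call.

```python
from statistics import median


def relative_copy_number(region_depths, background_depths):
    """Depth over a region divided by typical genome-wide depth.

    A value near 1 suggests the same copy number as the reference;
    a value near 3 suggests roughly three times as many copies.
    """
    background = median(background_depths)
    if background == 0:
        raise ValueError("background depth is zero; cannot normalise")
    # Medians are less sensitive than means to a few extreme windows.
    return median(region_depths) / background


# Hypothetical per-window mean depths, for illustration only.
cbf_windows = [58, 61, 60, 62]
genome_windows = [19, 20, 21, 20, 20]
print(f"Relative copy number: "
      f"{relative_copy_number(cbf_windows, genome_windows):.2f}")
```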
| Gene / cluster | Locus | Chr | Crop | Marker | Role |
|---|---|---|---|---|---|
| VRN1 | Fr-A1 | 5A | Wheat | co-loc | Vernalization gene; co-localizes with frost tolerance at Fr-1 |
| TaCBF cluster | Fr-A2 | 5A | Wheat | KASP · CNV | ~15 CBF genes; copy number is the functional marker (LT₅₀ −20 to −40 °C) |
| TaCBF-A14 / A15 | Fr-A2 | 5A | Wheat | SNP | Extra copies separate frost-tolerant from frost-sensitive lines |
| WCS120 | — | — | Wheat | dehydrin | ~50 kDa dehydrin; protein accumulation correlates with LT₅₀ |
| VRN-H1 | Fr-H1 | 5H | Barley | SSR · Bmac0096 | Vernalization / winter habit, linked to Fr-H2 |
| HvCBF2 / HvCBF4 | Fr-H2 | 5H | Barley | CNV | Major barley frost-tolerance cluster |
| ScCBF cluster | Fr-R2 | 5R | Rye | CBFIVa-2.2 | Underlies rye's exceptional winter hardiness |
| MdCBF1 / MdCBF2 | — | — | Apple | SNP | Cold acclimation & dormancy in Rosaceae fruit |
| PpCBF1 / PpCBF2 | — | — | Prunus | CBF | Stone-fruit cold response and chilling requirement |
Nine of 140 mapped markers. Browse the full set, filterable by crop, locus, and marker type → VÄXT Genomics database
Three approaches connect genotype to phenotype. Which you use depends on the material you have.
Genomic selection: train a statistical model (rrBLUP, GBLUP) on genotyped-and-phenotyped lines, then predict a genomic estimated breeding value for un-phenotyped seedlings. The state of the art for polygenic traits like cold tolerance.
QTL mapping asks "where is the gene?" GWAS asks "which alleles matter across many varieties?" Genomic selection skips the gene hunt and asks "how hardy will this seedling be?", which is the question a breeder actually needs answered.
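At its core, rrBLUP is ridge regression on marker genotypes: every marker gets a small, shrunken effect, and a seedling's predicted value is the sum of its marker effects. The sketch below shows that idea in plain Python on a toy data set. All names and numbers are hypothetical, and the shrinkage parameter `lam` is fixed by hand here, whereas real rrBLUP software estimates it from variance components.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


def ridge_marker_effects(genotypes, phenotypes, lam=1.0):
    """Estimate marker effects u from (Z'Z + lam*I) u = Z'y on centred data."""
    n, p = len(genotypes), len(genotypes[0])
    col_means = [sum(row[j] for row in genotypes) / n for j in range(p)]
    y_mean = sum(phenotypes) / n
    z = [[row[j] - col_means[j] for j in range(p)] for row in genotypes]
    y = [value - y_mean for value in phenotypes]
    ztz = [[sum(z[i][a] * z[i][b] for i in range(n)) + (lam if a == b else 0.0)
            for b in range(p)] for a in range(p)]
    zty = [sum(z[i][a] * y[i] for i in range(n)) for a in range(p)]
    return solve(ztz, zty), col_means, y_mean


def predict_gebv(genotype, effects, col_means, y_mean):
    """Predicted phenotype: training mean plus summed marker effects."""
    return y_mean + sum((g - m) * e
                        for g, m, e in zip(genotype, col_means, effects))


# Toy training set: allele counts (0/1/2) at three markers, LT50 in °C.
train_geno = [[0, 0, 2], [2, 0, 0], [2, 2, 0], [0, 2, 2], [2, 2, 2], [0, 0, 0]]
train_lt50 = [-14.0, -19.0, -22.0, -17.0, -23.0, -13.0]

effects, means, mean_lt50 = ridge_marker_effects(train_geno, train_lt50)
for seedling in ([2, 2, 0], [0, 0, 2]):
    predicted = predict_gebv(seedling, effects, means, mean_lt50)
    print(seedling, f"predicted LT50 {predicted:.1f} °C")
```

With thousands of markers and a few hundred lines, the same equations apply, but a dedicated package is the practical choice.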
The payoff: screen a seedling's DNA, predict its hardiness, and keep the winners — before they ever face a winter.
Once a SNP is validated, you convert it into a KASP assay (Kompetitive Allele-Specific PCR) — a cheap, robust fluorescence test that genotypes one marker across hundreds of seedlings on a single plate. KASP is the bridge from "we sequenced it" to "we select on it." Breeders use foreground selection (keep the favourable CBF allele) alongside background selection (recover the rest of the recurrent parent's genome) to move a hardiness allele into an elite line in a few generations.
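A KASP plate read gives two fluorescence signals per well, conventionally FAM and HEX, one for each allele. Genotypes fall into three clusters: high FAM, high HEX, or both. The sketch below calls genotypes from the angle of each point and then applies foreground selection. The thresholds, well data, and the choice of FAM as the favourable allele are all assumptions for illustration; plate-reader software clusters each plate rather than using fixed cut-offs.

```python
import math


def call_kasp(fam, hex_signal, min_signal=0.2, low_deg=30.0, high_deg=60.0):
    """Call a genotype from normalised FAM and HEX fluorescence."""
    # Wells with almost no signal (failed PCR, empty well) get no call.
    if math.hypot(fam, hex_signal) < min_signal:
        return "no call"
    # Angle from the FAM axis: near 0 is FAM-only, near 90 is HEX-only.
    angle = math.degrees(math.atan2(hex_signal, fam))
    if angle < low_deg:
        return "FAM/FAM"
    if angle > high_deg:
        return "HEX/HEX"
    return "FAM/HEX"


# Hypothetical plate readings: seedling id -> (FAM, HEX).
plate = {
    "seedling_01": (1.00, 0.10),
    "seedling_02": (0.80, 0.80),
    "seedling_03": (0.10, 1.00),
    "seedling_04": (0.05, 0.05),
}

calls = {name: call_kasp(fam, hex_signal)
         for name, (fam, hex_signal) in plate.items()}

# Foreground selection, assuming FAM reports the favourable allele.
keepers = [name for name, call in calls.items() if call == "FAM/FAM"]
print(calls)
print("Keep:", keepers)
```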
Outsource GBS and KASP to a service provider. Free open-source tools on a laptop. Enough to screen a landrace collection.
Thermocycler, gel rig, CTAB bench, shared access to a plate reader. Run your own marker screens.
qPCR / HRM, a dedicated KASP reader, library-prep bench. In-house genotyping at scale.
Illumina NextSeq/NovaSeq or ONT PromethION plus compute. Whole-genome resequencing and assembly.
You are not starting from zero. The Nordic and Baltic programs below run exactly this pipeline for cold-climate crops — and their germplasm and methods are largely public.
All 83 breeding programs and 40 trial sites are mapped → VÄXT Network
Published open under CC BY 4.0. Marker data is exported machine-readable for your own analysis.