Field Guide · Genomics · CC BY 4.0

The Horticultural Gene Sequencing Field Guide

The complete open-source pipeline for cold-hardy heritage crops — from a leaf in the field to a marker on a breeding decision. We follow DNA through extraction, sequencing, alignment, and variant calling, into the CBF cold-response cascade that governs frost tolerance, and out the other side as a KASP assay you can run to select hardier lines. No jargon left unexplained.

7 sections · 140 mapped markers · 5 crop genomes · $500+ entry cost
Contents
01 The case for sequencing heritage grain
02 The pipeline at a glance
03 Sample to sequence: the wet lab
04 Reads to variants: the dry lab
05 The CBF cold-response cascade
06 Mapping the trait: QTL, GWAS, GS
07 Marker-assisted selection in practice
01

The case for sequencing

Heritage varieties carry centuries of adaptation in their DNA. Sequencing turns that folk knowledge — "this rye always survives our winters" — into mapped, heritable, selectable alleles.

Cold tolerance is not a single gene you can point to. It is polygenic and quantitative: dozens of loci, each contributing a little, expressed differently depending on day length, acclimation period, and how fast the cold arrives. A landrace that has been grown in northern Wisconsin or Jämtland for a hundred winters is a living experiment that has already been run. The genome is the lab notebook — we just have to learn to read it.

The practical unit of measurement is LT50: the temperature at which 50% of plants in a population die after a controlled freeze. An LT50 of −20 °C marks a hardy winter wheat; rye landraces can push past −30 °C. The whole point of this guide is to connect that measured number to the specific stretches of DNA that produce it, so a breeder can select for hardiness in a seedling tray instead of waiting for a killing frost to do it for them.
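
As a rough illustration of how an LT50 can be read off a freeze test, the sketch below interpolates linearly between the two test temperatures that bracket 50% survival. The temperatures and survival fractions are hypothetical, and linear interpolation is a simplification chosen for brevity; a probit or logistic fit is the more usual choice when replicate data are available.

```python
def estimate_lt50(temps_c, survival_frac):
    """Interpolate the temperature at which survival crosses 50%.

    temps_c: test temperatures in degrees C
    survival_frac: fraction of plants surviving at each temperature (0..1)
    """
    # Sort from warmest to coldest so survival generally falls along the list.
    points = sorted(zip(temps_c, survival_frac), reverse=True)
    for (t_warm, s_warm), (t_cold, s_cold) in zip(points, points[1:]):
        if s_warm >= 0.5 >= s_cold and s_warm != s_cold:
            # Linear interpolation between the two bracketing temperatures.
            frac = (s_warm - 0.5) / (s_warm - s_cold)
            return t_warm + frac * (t_cold - t_warm)
    return None  # survival never crossed 50% within the tested range


# Hypothetical freeze-test results for one line (not real data).
temps = [-12, -16, -20, -24, -28]
survival = [1.00, 0.90, 0.60, 0.20, 0.00]

lt50 = estimate_lt50(temps, survival)
print(f"Estimated LT50: {lt50:.1f} °C")
```

The function returns None when survival never crosses 50%, which is a signal to extend the temperature series rather than to extrapolate.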

Why open

Every marker, reference, and method below is published under CC BY 4.0. Cold-hardiness alleles found in public landraces should stay in the commons — sequenced, documented, and unpatentable.


02

The pipeline at a glance

Every genomics project, from a $500 community effort to a national breeding program, walks the same path. The cost and resolution change; the steps do not.

| Stage | Step | Tools |
|---|---|---|
| Wet lab | Tissue → DNA | CTAB / kit |
| Wet lab | Library prep | GBS · WGS |
| Instrument | Sequencing | Illumina · ONT |
| Dry lab | Reads (FASTQ) | FastQC · fastp |
| Dry lab | Alignment | BWA · minimap2 |
| Dry lab | Variant calling | GATK · bcftools |
| Dry lab | Annotation | SnpEff |
| Decision | Selection | KASP · MAS |

Sections 03–04 cover the wet and dry lab in turn. Section 05 explains the biology you are actually hunting for. Sections 06–07 turn variants into breeding decisions.


03

Sample to sequence

Good data starts with clean, high-molecular-weight DNA. Everything downstream inherits the quality of this step.

Extraction

For most cereals and fruit, a CTAB extraction from young leaf tissue is the durable, low-cost standard — reagents cost cents per sample and it tolerates the polysaccharides and phenolics that gum up column kits. Silica-column kits (Qiagen DNeasy and equivalents) are faster and cleaner if budget allows. Flash-freeze tissue in the field or dry it on silica gel; degraded DNA caps your read length and your options.

Choosing a sequencing strategy

Depth, briefly

Depth is how many times, on average, each base is read. A ~5× skim is enough to genotype known sites by imputation; 15–30× is the usual target for confident de novo SNP calling, especially in the big, repetitive genomes of wheat (≈16 Gb) and rye (≈7.9 Gb).
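
The arithmetic behind depth is simple enough to sketch: mean depth is total sequenced bases divided by genome size. The example below uses the genome sizes quoted above; the 150 bp read length is an assumption, and the calculation applies to whole-genome sequencing only, since GBS samples just a fraction of the genome.

```python
import math


def mean_depth(n_reads, read_len_bp, genome_size_bp):
    """Average number of times each base is read."""
    return n_reads * read_len_bp / genome_size_bp


def reads_needed(target_depth, read_len_bp, genome_size_bp):
    """Number of reads required to reach a target mean depth."""
    return math.ceil(target_depth * genome_size_bp / read_len_bp)


WHEAT_BP = 16_000_000_000  # ~16 Gb, as quoted in the text
RYE_BP = 7_900_000_000     # ~7.9 Gb, as quoted in the text
READ_LEN = 150             # assumed short-read length

wheat_reads_15x = reads_needed(15, READ_LEN, WHEAT_BP)
rye_reads_5x = reads_needed(5, READ_LEN, RYE_BP)

print(f"Wheat at 15x: {wheat_reads_15x:,} reads")
print(f"Rye at 5x:    {rye_reads_5x:,} reads")
```

Under these assumptions, 15× of bread wheat comes to 1.6 billion reads, which is why skim sequencing plus imputation is attractive on a small budget.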


04

Reads to variants

The dry lab is entirely free, open-source software. A capable laptop handles GBS; large genomes want a workstation or a few hundred core-hours on a cluster.

The toolchain
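
One possible way to chain the tools named in the pipeline overview is sketched below. It only builds and prints command lines; it runs nothing. The sample and file names are hypothetical, samtools is an addition not named in the overview (it is assumed here for sorting and indexing), and BWA plus bcftools are picked from the alternatives listed there. Treat the exact options as a starting point to check against each tool's manual.

```python
import shlex


def build_commands(sample, ref, r1, r2, threads=4):
    """Return shell command strings for one paired-end sample. Nothing is executed."""
    q = shlex.quote
    t = str(threads)
    trim1, trim2 = f"{sample}.trim_1.fq.gz", f"{sample}.trim_2.fq.gz"
    bam, vcf = f"{sample}.sorted.bam", f"{sample}.vcf.gz"
    return [
        # 1. Adapter and quality trimming of both read files.
        shlex.join(["fastp", "-i", r1, "-I", r2, "-o", trim1, "-O", trim2]),
        # 2. Index the reference (needed once per reference, not per sample).
        shlex.join(["bwa", "index", ref]),
        # 3. Align, then coordinate-sort the alignments into a BAM file.
        f"bwa mem -t {t} {q(ref)} {q(trim1)} {q(trim2)} | samtools sort -@ {t} -o {q(bam)} -",
        # 4. CSI index (-c): BAI indexes cannot address chromosomes longer
        #    than 512 Mb, which wheat and rye chromosomes exceed.
        shlex.join(["samtools", "index", "-c", bam]),
        # 5. Pile up reads and call variant sites into a compressed VCF.
        f"bcftools mpileup -f {q(ref)} {q(bam)} | bcftools call -mv -Oz -o {q(vcf)}",
    ]


commands = build_commands(
    "landrace01", "reference.fa", "landrace01_R1.fq.gz", "landrace01_R2.fq.gz"
)
for cmd in commands:
    print(cmd)
```

Building the commands as data before running them makes it easy to review a dry run, log exactly what was executed, and repeat the same steps across a whole landrace collection.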

Reference genomes

You align against a published assembly. The major cold-crop references are all open: IWGSC RefSeq v2.1 (bread wheat), MorexV3 (barley), Lo7 / Weining (rye), GDDH13 (apple), and Prunus persica v2 (peach, the Rosaceae anchor). Pick the reference closest to your crop, then let variant calling tell you where your landrace differs from it.

Output

The deliverable of the dry lab is a VCF — a table of every position where your sample differs from the reference. The rest of the guide is about finding, in that table, the handful of differences that matter for cold.
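
To make the shape of that table concrete, here is a minimal sketch that reads a tiny single-sample VCF and separates passing SNPs from indels. The three records are invented for illustration, and the parser handles only the fixed columns and the GT field; for real files a dedicated tool such as bcftools is the safer choice.

```python
# A tiny, made-up single-sample VCF (tab-separated), not real data.
VCF_TEXT = """\
##fileformat=VCFv4.2
#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tlandrace01
chr5A\t1000\t.\tA\tG\t60\tPASS\tDP=18\tGT:DP\t1/1:18
chr5A\t2500\t.\tCT\tC\t45\tPASS\tDP=12\tGT:DP\t0/1:12
chr5A\t4200\t.\tG\tT\t8\tLowQual\tDP=3\tGT:DP\t0/1:3
"""


def parse_vcf(text):
    """Parse single-sample VCF text into a list of dicts (first sample only)."""
    records = []
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines, meta lines and the column header
        fields = line.split("\t")
        chrom, pos, _id, ref, alt, qual, flt, _info, fmt, sample = fields[:10]
        # FORMAT names the per-sample fields; pick out the genotype (GT).
        genotype = dict(zip(fmt.split(":"), sample.split(":")))["GT"]
        # Single-base REF and ALT alleles make a SNP; anything else is
        # treated as an indel in this sketch.
        is_snp = len(ref) == 1 and all(len(a) == 1 for a in alt.split(","))
        records.append({
            "chrom": chrom, "pos": int(pos), "ref": ref, "alt": alt,
            "qual": float(qual), "filter": flt, "gt": genotype,
            "type": "SNP" if is_snp else "indel",
        })
    return records


records = parse_vcf(VCF_TEXT)
passing = [r for r in records if r["filter"] == "PASS"]
for r in passing:
    print(r["chrom"], r["pos"], r["ref"], ">", r["alt"], r["gt"], r["type"])
```

A genotype of 1/1 means the sample is homozygous for the alternate allele, and 0/1 means it is heterozygous against the reference.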


05

The CBF cold-response cascade

This is the heart of frost tolerance — and the most rewarding part of the genome to understand, because a few well-mapped loci explain a large share of the variation.

When temperatures drop, plants run a signalling cascade: ICE1 transcription factors switch on the CBF / DREB1 genes, which in turn activate dozens of COR (cold-regulated) genes — including the dehydrins that protect cells from freezing damage. The hardiness of a variety is largely set by how many CBF copies it carries and how strongly they fire.

In the Triticeae (wheat, barley, rye) the CBF genes sit in a tandem cluster at the Frost-resistance-2 (Fr-2) locus, and the vernalization gene VRN1 sits nearby at Fr-1 — which is why winter habit and frost tolerance so often travel together. Copy-number variation at the CBF cluster is the single most useful functional marker we have.
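
One common way to get a first look at copy-number variation from resequencing data is to compare read depth over a gene with depth over single-copy regions. The sketch below shows only that ratio, with hypothetical depth values. It assumes reads map uniquely, which is optimistic inside a tandem cluster and in polyploid wheat, so treat the result as a screening signal to confirm with a targeted assay.

```python
from statistics import median


def relative_copy_number(region_depths, baseline_depths, ref_copies=1):
    """Estimate copies of a region relative to the reference assembly.

    region_depths: per-window read depth across the gene of interest
    baseline_depths: per-window read depth across presumed single-copy regions
    ref_copies: number of copies of the gene present in the reference itself
    """
    baseline = median(baseline_depths)
    if baseline == 0:
        raise ValueError("baseline depth is zero; cannot normalise")
    return ref_copies * median(region_depths) / baseline


# Hypothetical depths for one sample (not real data).
single_copy_windows = [14, 15, 16, 15, 15, 14, 16]
cbf_windows = [29, 31, 30, 30, 28]

cn = relative_copy_number(cbf_windows, single_copy_windows)
print(f"Estimated copies relative to reference: {cn:.1f}")
```

Medians are used instead of means so that a few windows with collapsed repeats or dropouts do not dominate the estimate.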

| Gene / cluster | Locus | Chr | Crop | Marker | Role |
|---|---|---|---|---|---|
| VRN1 | Fr-A1 | 5A | Wheat | co-loc | Vernalization gene; co-localizes with frost tolerance at Fr-1 |
| TaCBF cluster | Fr-A2 | 5A | Wheat | KASP · CNV | ~15 CBF genes; copy number is the functional marker (LT₅₀ −20 to −40 °C) |
| TaCBF-A14 / A15 | Fr-A2 | 5A | Wheat | SNP | Extra copies separate frost-tolerant from frost-sensitive lines |
| WCS120 | | | Wheat | dehydrin | 120 kDa dehydrin; protein accumulation correlates with LT₅₀ |
| VRN-H1 | Fr-H1 | 5H | Barley | SSR · Bmac0096 | Vernalization / winter habit, linked to Fr-H2 |
| HvCBF2 / HvCBF4 | Fr-H2 | 5H | Barley | CNV | Major barley frost-tolerance cluster |
| ScCBF cluster | Fr-R2 | 5R | Rye | CBFIVa-2.2 | Underlies rye's exceptional winter hardiness |
| MdCBF1 / MdCBF2 | | | Apple | SNP | Cold acclimation & dormancy in Rosaceae fruit |
| PpCBF1 / PpCBF2 | | | Prunus | CBF | Stone-fruit cold response and chilling requirement |

Nine of 140 mapped markers. Browse the full set, filterable by crop, locus, and marker type → VÄXT Genomics database


06

Mapping the trait

Three approaches connect genotype to phenotype. Which you use depends on the material you have.

In plain terms

QTL mapping asks "where is the gene?" GWAS asks "which alleles matter across many varieties?" Genomic selection skips the gene hunt and asks "how hardy will this seedling be?" — the question a breeder actually needs answered.
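
The GWAS question, "which alleles matter?", reduces at its simplest to testing each marker against the phenotype. The sketch below regresses LT50 on allele dosage (0, 1, or 2 copies of the alternate allele) one marker at a time and ranks markers by variance explained. The marker names, genotypes, and LT50 values are hypothetical. A real analysis would also correct for population structure and multiple testing, typically with a mixed model; none of that is shown here.

```python
def marker_effect(dosages, phenotypes):
    """Least-squares slope and r-squared of phenotype on allele dosage."""
    n = len(dosages)
    mean_x = sum(dosages) / n
    mean_y = sum(phenotypes) / n
    sxx = sum((x - mean_x) ** 2 for x in dosages)
    syy = sum((y - mean_y) ** 2 for y in phenotypes)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dosages, phenotypes))
    if sxx == 0 or syy == 0:
        return 0.0, 0.0  # monomorphic marker or constant phenotype
    return sxy / sxx, sxy ** 2 / (sxx * syy)


# Hypothetical LT50 values (degrees C) for six lines, and their dosages at two markers.
lt50_values = [-14, -16, -18, -20, -22, -24]
genotypes = {
    "marker_A": [0, 0, 1, 1, 2, 2],
    "marker_B": [0, 2, 1, 0, 2, 1],
}

results = {name: marker_effect(d, lt50_values) for name, d in genotypes.items()}
best = max(results, key=lambda name: results[name][1])

for name, (slope, r2) in results.items():
    print(f"{name}: {slope:+.1f} °C per allele copy, r2 = {r2:.2f}")
print("Strongest association:", best)
```

A negative slope means each extra copy of the alternate allele lowers LT50, which is the direction a breeder selecting for hardiness wants.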


07

Marker-assisted selection in practice

The payoff: screen a seedling's DNA, predict its hardiness, and keep the winners — before they ever face a winter.

Once a SNP is validated, you convert it into a KASP assay (Kompetitive Allele-Specific PCR) — a cheap, robust fluorescence test that genotypes one marker across hundreds of seedlings on a single plate. KASP is the bridge from "we sequenced it" to "we select on it." Breeders use foreground selection (keep the favourable CBF allele) alongside background selection (recover the rest of the recurrent parent's genome) to move a hardiness allele into an elite line in a few generations.
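
Foreground and background selection can be sketched as a filter followed by a ranking: keep seedlings that carry the favourable allele at the target marker, then prefer those that have recovered the most recurrent-parent genome. The seedling IDs, allele calls, and background marker strings below are hypothetical, and scoring each background marker as simply "R" (recurrent parent) or "D" (donor) is a simplifying assumption.

```python
def select_seedlings(calls, favourable="G", keep=2):
    """Pick carriers of the favourable allele with the highest background recovery.

    calls: seedling ID -> {"target": (allele, allele), "background": "RRDR..."}
    """
    def recovery(call):
        # Share of background markers matching the recurrent parent.
        background = call["background"]
        return sum(m == "R" for m in background) / len(background)

    # Foreground selection: keep seedlings carrying the favourable allele.
    carriers = {sid: c for sid, c in calls.items() if favourable in c["target"]}
    # Background selection: rank carriers by recurrent-parent recovery.
    ranked = sorted(carriers, key=lambda sid: recovery(carriers[sid]), reverse=True)
    return [(sid, recovery(carriers[sid])) for sid in ranked[:keep]]


# Hypothetical KASP calls for four backcross seedlings (not real data).
kasp_calls = {
    "S01": {"target": ("G", "A"), "background": "RRRDRRDR"},
    "S02": {"target": ("A", "A"), "background": "RRRRRRRR"},
    "S03": {"target": ("G", "A"), "background": "RRRRRRRD"},
    "S04": {"target": ("G", "A"), "background": "DDRDRDRD"},
}

selected = select_seedlings(kasp_calls)
for seedling_id, share in selected:
    print(f"{seedling_id}: {share:.0%} recurrent-parent background")
```

Seedling S02 has the cleanest background but is dropped because it lacks the favourable allele, which is the point of running foreground selection first.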

What it costs to start

~$500 · Community / field
Outsource GBS and KASP to a service provider. Free open-source tools on a laptop. Enough to screen a landrace collection.

~$5K · Small lab
Thermocycler, gel rig, CTAB bench, shared access to a plate reader. Run your own marker screens.

~$25K · Regional program
qPCR / HRM, a dedicated KASP reader, library-prep bench. In-house genotyping at scale.

$100K+ · Sequencing core
Illumina NextSeq/NovaSeq or ONT PromethION plus compute. Whole-genome resequencing and assembly.

Who's doing this in the cold North

You are not starting from zero. The Nordic and Baltic programs below run exactly this pipeline for cold-climate crops — and their germplasm and methods are largely public.

| Program | Location | Established | Focus |
|---|---|---|---|
| Graminor AS | Ridabu, Norway | 1990 | Winter hardiness, baking quality, forage |
| Boreal Plant Breeding | Jokioinen, Finland | 1918 | Northern adaptation, cereals & forage |
| NordGen | Alnarp, Sweden | | Nordic gene bank & conservation |
| SLU Balsgård | Kristianstad, Sweden | 1943 | Cold-hardy fruit & berry breeding |

All 83 breeding programs and 40 trial sites are mapped → VÄXT Network

Keep going

Use this guide. Cite it. Improve it.

Published open under CC BY 4.0. Marker data is exported machine-readable for your own analysis.
