Main Data File Setup

Your data file contains the candidates for selection and as much of their pedigree as you choose to handle. You need to include individuals that are not candidates but that are ancestors of candidates if you want to fully account for coancestry when managing genetic diversity. However, even with no pedigree, coancestry is accommodated as far as possible, which is equivalent to using classical concepts of effective population size. If you intend to use a Genomic Relationship Matrix (GRM) instead of (or in addition to) pedigree information, please see the section on GRM.
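As a rough illustration of which non-candidates belong in the file, the sketch below walks a pedigree upwards from the candidates and collects every known ancestor. The function name `ancestors_to_include` and the dictionary layout are assumptions made for this example; they are not part of MateSel.

```python
def ancestors_to_include(pedigree, candidates):
    """Return the Ids of all known ancestors of the candidates that are
    not candidates themselves.

    pedigree: dict mapping Id -> (sire_id, dam_id); an unknown parent is "".
    candidates: iterable of candidate Ids.
    """
    candidates = set(candidates)
    seen = set()
    stack = list(candidates)
    while stack:
        individual = stack.pop()
        # Individuals missing from the pedigree are treated as having unknown parents.
        for parent in pedigree.get(individual, ("", "")):
            if parent and parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen - candidates


# Hypothetical pedigree: two candidates sharing a sire.
pedigree = {
    "C1": ("S1", "D1"),
    "C2": ("S1", ""),
    "S1": ("GS", ""),
    "D1": ("", ""),
}
extra_rows = ancestors_to_include(pedigree, ["C1", "C2"])
```

Each Id in `extra_rows` would then get its own row in the data file as a non-candidate.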

MateSel only supports the CSV (Comma-Separated Values) format for the fields mentioned below. If you are looking to integrate MateSel with your organisation's database, we also support the JSON format. Please contact us for more information.

Below we have 2 Data File Templates for you to review. All fields are further explained below.

| Template | Description |
| --- | --- |
| MateSelClassicDemo.csv | Our classic MateSel dataset with 40 male candidates, 108 female candidates, 615 pedigree individuals, 6 traits, and 4 genetic markers. |
| MateSelBisexDemo.csv | Our bisexual plants MateSel dataset, which allows candidate max/min/must usage constraints to be specified per sex, e.g. a max use of 50 as a male and 25 as a female. |

| Field | Required | DataType | Description |
| --- | --- | --- | --- |
| Id | True | Text (20) | Unique identifier for the individual. Text fields are case sensitive, such that “bert” and “Bert” are recognised as different individuals. Must not contain spaces. |
| Sire | True | Text (20) | Father’s Id. Must not contain spaces. |
| Dam | True | Text (20) | Mother’s Id. Must not contain spaces. |
| Sex | True | Text (1) | Either “M” for male or “F” for female. This field is not required for non-candidates, i.e. individuals included for pedigree purposes only. This field should not be included for BiSexual datasets. |
| MaxUse | True | Integer | The maximum number of matings permitted for each candidate. A mating is a planned pregnancy; see here for more information. For example, MaxUse = 1 mating for natural mating females, MaxUse = 30 matings for natural mating bulls, and possibly MaxUse = 1,000 matings for AI bulls, or the number of semen doses left for a deceased bull. Enter 0 for non-candidates, i.e. individuals included for pedigree purposes only. Setting MaxUse values low with the intention of conserving genetic diversity is not recommended; let Optimal Contributions Selection (OCS) manage that through your chosen Balance Strategy. It is better to set MaxUse according to reasonable biological and/or logistical limits. |
| MinUse | False | Integer | Minimum number of matings if selected, i.e. if this individual is selected, MateSel will use it at least this many times in the mating list. Zero means “No Minimum”. |
| MustUse | False | Integer | The absolute minimum number of matings. This is generally zero, but may be set higher, for example where a breeder has a given number of doses of semen available for a favoured bull and insists that these should all be used. |
| MatingGroup | False | Text (20) | Needed if dividing individuals into mating groups. This field must contain the Id of the group that the individual belongs to. Entries for non-candidates can be left blank, as they are not used. Groups are different for each sex, even if they have the same label (e.g. “Group6” for males is treated differently from “Group6” for females), so you can use the same labels across sexes. Must not contain spaces. |
| Index | True | Numeric | The selection index value for each individual, which will typically be a multi-trait BLUP index value or other similar selection index value that is a measure of genetic merit across traits. It can simply be the EBV for the single trait of interest. Do not use scientific notation for numerical values. |
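To make the field list concrete, here is a minimal sketch that writes a tiny data file with Python's standard `csv` module. The individuals, their values, and the column order are invented for illustration, and a header row carrying the field names is assumed; the demo templates listed above remain the authoritative reference for layout.

```python
import csv
import io

FIELDS = ["Id", "Sire", "Dam", "Sex", "MaxUse", "MinUse", "MustUse", "MatingGroup", "Index"]

# Hypothetical rows: one pedigree-only founder and two candidates.
rows = [
    # Non-candidate: unknown parents left blank, MaxUse 0, missing Index given as "."
    {"Id": "GS1", "Sire": "", "Dam": "", "Sex": "M", "MaxUse": 0,
     "MinUse": 0, "MustUse": 0, "MatingGroup": "", "Index": "."},
    # Natural mating bull candidate
    {"Id": "Bull7", "Sire": "GS1", "Dam": "", "Sex": "M", "MaxUse": 30,
     "MinUse": 0, "MustUse": 0, "MatingGroup": "Group6", "Index": "112.4"},
    # Natural mating female candidate
    {"Id": "Cow21", "Sire": "GS1", "Dam": "", "Sex": "F", "MaxUse": 1,
     "MinUse": 0, "MustUse": 0, "MatingGroup": "Group6", "Index": "98.7"},
]


def build_csv(rows):
    """Serialise the rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = build_csv(rows)
print(csv_text)
```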

Genetic Marker Fields will begin with a “g_” in their name. Genetic Marker Probability Fields will begin with a “gp_” in their name. Please see the Genetic Markers Overview to learn more.

Any unrecognised field in the CSV file is treated as a trait (e.g. EBV). This means that fields such as Age, DateOfBirth, SemenDoseCost, and YearOfBirth are treated as traits, which is useful both as information and for setting targets. For example, you can place constraints on the YearOfBirth and DateOfBirth of selected individuals by using the Trait Management tool to manage trait distributions.
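The naming rules above can be sketched as a small column classifier. This is only an illustration of the described behaviour, not MateSel's own logic: the function name `classify_columns` is hypothetical, and exact, case-sensitive matching of field names and prefixes is an assumption.

```python
# Field names documented on this page, including the two lethal recessive fields.
# BiSexual datasets use a slightly different set, which is not covered here.
KNOWN_FIELDS = {
    "Id", "Sire", "Dam", "Sex", "MaxUse", "MinUse", "MustUse",
    "MatingGroup", "Index", "LethalA", "LethalG",
}


def classify_columns(header):
    """Group column names into known fields, markers, marker probabilities, and traits."""
    groups = {"known": [], "marker": [], "marker_probability": [], "trait": []}
    for name in header:
        if name in KNOWN_FIELDS:
            groups["known"].append(name)
        elif name.startswith("gp_"):
            groups["marker_probability"].append(name)
        elif name.startswith("g_"):
            groups["marker"].append(name)
        else:
            # Anything unrecognised is treated as a trait.
            groups["trait"].append(name)
    return groups


groups = classify_columns(["Id", "Index", "g_Polled", "gp_Polled", "YearOfBirth"])
```

Here `g_Polled` and `gp_Polled` are made-up marker names; `YearOfBirth` ends up in the trait group because it matches no known field or prefix.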

Validation of Trait Fields:

  • The values for trait information must be numeric only. Alpha-numeric values cannot be plotted on trait histograms so are not supported.
  • Do not use scientific notation for numerical values.
  • Unknown parents should be denoted as a blank value in the CSV, e.g. “,,,”
  • Missing numerical trait and index information should be given as a decimal point (.), as the value 0 will be taken as an observed value.
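The rules above can be checked before a file is uploaded. The following is a minimal sketch of such a pre-check, assuming each row has already been read into a dictionary of strings; the helper names `numeric_problem` and `validate_row` are hypothetical and do not reflect how MateSel itself validates files.

```python
import re

# Plain decimal numbers such as 12, -3.5, or .25 (no exponent part).
PLAIN = re.compile(r"[+-]?(\d+\.?\d*|\.\d+)")
# The same with an exponent, e.g. 1.2e3, which the rules above disallow.
SCIENTIFIC = re.compile(r"[+-]?(\d+\.?\d*|\.\d+)[eE][+-]?\d+")
# Text fields that must not contain spaces.
TEXT_FIELDS = ("Id", "Sire", "Dam", "MatingGroup")


def numeric_problem(value):
    """Return None if the value is acceptable, otherwise a short description."""
    if value == "." or PLAIN.fullmatch(value):
        return None  # "." marks a missing value
    if SCIENTIFIC.fullmatch(value):
        return "uses scientific notation"
    return "is not numeric (use '.' for a missing value)"


def validate_row(row, numeric_fields):
    """Collect rule violations for one row; numeric_fields lists Index and trait columns."""
    problems = []
    for field in TEXT_FIELDS:
        if " " in row.get(field, ""):
            problems.append(f"{field} contains a space")
    for field in numeric_fields:
        issue = numeric_problem(row.get(field, ""))
        if issue:
            problems.append(f"{field} {issue}")
    return problems


good = {"Id": "Bert", "Sire": "", "Dam": "", "Index": "12.5", "Weight": "."}
bad = {"Id": "Big Bert", "Sire": "", "Dam": "", "Index": "1.2e3", "Weight": "heavy"}
good_problems = validate_row(good, ["Index", "Weight"])
bad_problems = validate_row(bad, ["Index", "Weight"])
```

Blank Sire and Dam values pass because a blank denotes an unknown parent, whereas a blank numeric value is flagged because missing numbers should be written as a decimal point.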

We support slightly different fields when using BiSexual datasets. Please visit the Accommodating Bisexuality in Plants page for more information.

Supported Fields for handling multiple lethal recessives

MateSel supports two additional fields, “LethalA” and “LethalG”, for handling multiple lethal recessives. For more information, please see Handling Multiple Lethal Recessives.

If you need additional information on preparing your main data file for MateSel, please see the video guide below. Please note that this video showcases the MateSel Legacy version; however, the fundamentals are the same in the newer version of MateSel.