Main Data File Setup
Your data file contains the candidates for selection and as much of their pedigree as you choose to handle. To fully account for coancestry when managing genetic diversity, you need to include individuals that are not candidates but that are ancestors of candidates. However, even with no pedigree, coancestry is accommodated as far as possible, which is equivalent to using classical concepts of effective population size. If you intend to use a Genomic Relationship Matrix (GRM) instead of (or in addition to) pedigree information, please see the section on GRM.
File Formats Accepted
MateSel only supports the CSV (Comma-Separated Values) format for the fields mentioned below. If you are looking to integrate MateSel with your organisation's database, we also support the JSON format; please contact us for more information.
Data File Templates
Below are two data file templates for you to review. All fields are explained further below.
Template | Description |
---|---|
MateSelClassicDemo.csv | Our classic MateSel dataset with 40 male candidates, 108 female candidates, 615 pedigree individuals, 6 traits, and 4 genetic markers. |
MateSelBisexDemo.csv | Our bisexual plants MateSel dataset, which allows candidate max/min/must usage constraints to be specified per sex, e.g. a maximum use of 50 as a male and 25 as a female. |
Standard Fields
Field | Required | DataType | Description |
---|---|---|---|
Id | True | Text (20) | Unique identifier for the individual. Text fields are case sensitive, such that “bert” and “Bert” are recognised as different individuals. Must not contain spaces. |
Sire | True | Text (20) | Father’s Id. Must not contain spaces. |
Dam | True | Text (20) | Mother’s Id. Must not contain spaces. |
Sex | True | Text (1) | Either “M” for male or “F” for female. This field is not required for non-candidates, i.e. individuals included for pedigree purposes only. This field should not be included for BiSexual datasets. |
MaxUse | True | Integer | The maximum number of matings permitted for each candidate. A mating is a planned pregnancy; see here for more information. For example, MaxUse = 1 for natural-mating females, MaxUse = 30 for natural-mating bulls, and possibly MaxUse = 1,000 for AI bulls, or the number of semen doses left for a deceased bull. Enter 0 for non-candidates, i.e. individuals included for pedigree purposes only. Setting MaxUse values low with the intention of conserving genetic diversity is not recommended; let Optimal Contributions Selection (OCS) manage that through your chosen Balance Strategy. It is better to set MaxUse according to reasonable biological and/or logistical limits. |
MinUse | False | Integer | Minimum number of matings if selected. For example, if this individual is selected, MateSel will use it at least this many times in the mating list. Zero means ‘No Minimum’. |
MustUse | False | Integer | The absolute minimum number of matings. This is generally zero, but may be set higher, for example where a breeder has a given number of doses of semen available for a favoured bull, and insists that these should all be used. |
MatingGroup | False | Text (20) | Needed if dividing individuals into mating groups. This field must contain the Id of the group that represents group membership. Entries for non-candidates can be set to blank/empty – they are not used. Groups are different for each sex, even if they have the same label (e.g. “Group6” for males is treated differently from “Group6” for females), so you can use the same labels across sexes. Must not contain spaces. |
Index | True | Numeric | This should contain the selection index value for each individual, which will typically be a multi-trait BLUP index value or other similar selection index value that is a measure of genetic merit across traits. It can be simply the EBV for the single trait of interest. Do not use scientific notation for numerical values. |
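
The rules in the table above can be checked before a file is loaded into MateSel. The following is a minimal sketch only, not part of MateSel: the `check_row` helper, the sample ids, and the idea of treating `MaxUse > 0` as "this row is a candidate" are assumptions made for illustration, and the Sex check applies to non-BiSexual datasets only.

```python
import csv
import io

def check_row(row):
    """Return a list of problems found in one data-file row (a dict)."""
    problems = []
    # Text (20) fields must not contain spaces.
    for field in ("Id", "Sire", "Dam", "MatingGroup"):
        value = row.get(field) or ""
        if " " in value:
            problems.append(f"{field} contains a space")
        if len(value) > 20:
            problems.append(f"{field} is longer than 20 characters")
    if not row.get("Id"):
        problems.append("Id is empty")
    try:
        max_use = int(row.get("MaxUse") or "")
    except ValueError:
        problems.append("MaxUse is not an integer")
        return problems
    if max_use < 0:
        problems.append("MaxUse is negative")
    # Assumption: a row with MaxUse > 0 is a candidate and so needs a Sex.
    if max_use > 0 and row.get("Sex") not in ("M", "F"):
        problems.append("candidate Sex must be M or F")
    return problems

# Hypothetical sample data: one valid candidate, one Id with a space,
# and one pedigree-only individual (MaxUse 0, no Sex, missing Index).
sample = """Id,Sire,Dam,Sex,MaxUse,Index
Bert,,,M,30,12.5
Daisy 2,Bert,,F,1,10.1
Old1,,,,0,.
"""
rows = list(csv.DictReader(io.StringIO(sample)))
for row in rows:
    print(row["Id"], check_row(row))
```

A check like this only covers the constraints stated in the table; MateSel itself may apply further validation.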
Genetic Marker Fields
Genetic Marker Fields begin with “g_” in their name. Genetic Marker Probability Fields begin with “gp_” in their name. Please see the Genetic Markers Overview to learn more.
Trait Fields
Any unrecognised field in the CSV file is treated as a trait (e.g. EBV). This means that fields such as Age, DateOfBirth, SemenDoseCost, and YearOfBirth are treated as traits, which is useful both as information and for setting targets. For example, you can place constraints on the YearOfBirth and DateOfBirth of selected individuals by using the Trait Management tool to manage trait distributions.
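
To preview how the columns of a file would be grouped under the rule above, the header row can be classified by name. This is an illustrative sketch, not MateSel's own logic: the `classify_column` function and the category labels are hypothetical, exact case-sensitive matching of field names is an assumption, and the different field set used for BiSexual datasets is not covered.

```python
# Field names taken from the Standard Fields table and the lethal
# recessives section of this page.
STANDARD_FIELDS = {
    "Id", "Sire", "Dam", "Sex", "MaxUse",
    "MinUse", "MustUse", "MatingGroup", "Index",
}
LETHAL_FIELDS = {"LethalA", "LethalG"}

def classify_column(name):
    """Guess how a column header would be interpreted."""
    if name in STANDARD_FIELDS:
        return "standard"
    if name in LETHAL_FIELDS:
        return "lethal"
    if name.startswith("gp_"):
        return "marker_probability"
    if name.startswith("g_"):
        return "marker"
    # Anything unrecognised is treated as a trait.
    return "trait"

# Hypothetical header row for illustration.
header = ["Id", "Sire", "Dam", "Sex", "MaxUse", "Index",
          "g_Polled", "gp_Polled", "YearOfBirth"]
for column in header:
    print(column, "->", classify_column(column))
```

Running this over your own header is a quick way to spot a misspelt standard field, since it would show up as a trait.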
Validation of Trait Fields:
- The values for trait information must be numeric only. Alpha-numeric values cannot be plotted on trait histograms so are not supported.
- Do not use scientific notation for numerical values.
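
The two validation rules above can be approximated with a regular expression. This is a sketch under assumptions: the `is_valid_trait_value` helper is hypothetical, the exact number syntax MateSel accepts (for example a leading “+” sign) is not stated on this page, and a lone decimal point is accepted here because it is the documented marker for a missing value.

```python
import re

# Plain decimal numbers only: optional minus sign, digits, optional
# fractional part. Exponents such as "1.5e3" do not match.
PLAIN_NUMBER = re.compile(r"-?(\d+(\.\d*)?|\.\d+)")

def is_valid_trait_value(text):
    """True if text is a plain number or the missing-value marker '.'."""
    if text == ".":
        return True
    return PLAIN_NUMBER.fullmatch(text) is not None

for value in ["12", "-0.5", ".", "1.5e3", "A1", ""]:
    print(repr(value), is_valid_trait_value(value))
```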
Missing information
- Unknown parents should be denoted as a blank value in the CSV, e.g. “,,,”.
- Missing numerical trait and index information should be given as a decimal point (.), as the value 0 will be taken as an observed value.
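
When the data file is generated from another system, these conventions need to be applied at export time. The sketch below is one way to do that with the Python standard library; the `format_number` and `format_parent` helpers, the `Weight` trait column, and the sample records are all hypothetical. It also avoids scientific notation, which Python would otherwise produce for very small or very large floats.

```python
import csv
import io
from decimal import Decimal

def format_number(value):
    """Render a trait or index value; None or NaN becomes '.' (missing)."""
    if value is None or value != value:  # NaN is unequal to itself
        return "."
    # Decimal's "f" format gives fixed-point text, never "1e-07" style.
    return format(Decimal(str(value)), "f")

def format_parent(parent_id):
    """Unknown parents are written as a blank value."""
    return "" if parent_id is None else parent_id

# Hypothetical records: Old1 is a pedigree-only individual with unknown
# parents and no index; Bert is a candidate with a very small trait value.
records = [
    {"Id": "Old1", "Sire": None, "Dam": None, "Sex": "", "MaxUse": 0,
     "Index": None, "Weight": 0.0},
    {"Id": "Bert", "Sire": "Old1", "Dam": None, "Sex": "M", "MaxUse": 30,
     "Index": 12.5, "Weight": 1e-7},
]

buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")
writer.writerow(["Id", "Sire", "Dam", "Sex", "MaxUse", "Index", "Weight"])
for r in records:
    writer.writerow([
        r["Id"], format_parent(r["Sire"]), format_parent(r["Dam"]),
        r["Sex"], r["MaxUse"],
        format_number(r["Index"]), format_number(r["Weight"]),
    ])
csv_text = buffer.getvalue()
print(csv_text)
```

Note that Old1's observed `Weight` of zero is written as `0.0` while its missing `Index` is written as `.`, which keeps the two cases distinct.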
Supported Fields for BiSexual datasets
We support slightly different fields when using BiSexual datasets. Please visit the Accommodating Bisexuality in Plants page for more information.
Supported Fields for handling multiple lethal recessives
MateSel supports two additional fields, “LethalA” and “LethalG”, for handling multiple lethal recessives. For more information, please see Handling Multiple Lethal Recessives.
Video Guide
If you need additional information about preparing your main data file for MateSel, please see the video guide below. Please note that this video showcases the MateSel Legacy version; however, the fundamentals are the same in the newer version of MateSel.